
Sebastian Rosengren

Random Graph and Growth Models

Networks and processes that live on them are everywhere. We find them in our social structures, in our brain, and in the way a disease spreads amongst us. There is a need to understand these structures and processes. Paper I deals with a dynamic extension to the famous Erdős-Rényi graph. Paper II deals with a multi-type extension of the preferential attachment tree. Paper III is concerned with how the distribution over small degrees affects the size of the giant component in the configuration model. In Paper IV, we consider the frog model and a two-type extension of it, showing that the shape of the asymptotic set does not depend on the initial starting set(s) and particle configuration(s) there. Paper V is concerned with the predictability of the set of discovered sites generated by the first passage percolation model. We show that it is possible to predict the shape of this set using a neural network.

Sebastian Rosengren is motivated by using probability models to describe complexity. He hopes his research has helped explain a little of the complexity in the world.

ISBN 978-91-7911-256-1

Department of Mathematics

Doctoral Thesis in Mathematical Statistics at Stockholm University, Sweden 2020

Random Graph and Growth Models

Sebastian Rosengren

Academic dissertation for the degree of Doctor of Philosophy in Mathematical Statistics at Stockholm University, to be publicly defended on Friday 25 September 2020 at 13.00 in sal 15, hus 5, Kräftriket, Roslagsvägen 101.

Abstract

Random graphs are well-studied objects in probability theory, and have proven very useful in a range of applications — modeling social networks, epidemics, and structures on the Internet to name a few. However, most random graphs are static in the sense that the network structure does not change over time. Furthermore, standard models also tend to consist of single-type objects. This puts restrictions on possible applications. The first part of this thesis concerns random graphs, with a focus on dynamic and multi-type extensions of standard models. The second part of the thesis deals with random growth models. Random growth models are important objects in probability theory and, as the name suggests, model the random growth of some entity. Typical examples include infectious disease spread; how a liquid flows through a random medium; and tumor growth. The growth of these models, properly scaled by time, tends to be deterministic. The second theme of the thesis concerns the final shape of the growing entity for two standard random growth models.

In Paper I, we study a dynamic version of the famous Erdős-Rényi graph. The graph changes dynamically over time but still has the static Erdős-Rényi graph as its stationary distribution. In studying the dynamic graph we present two results. The first result concerns the time to stationarity, and the second concerns the time it takes for the graph to reach a certain number of edges. We also study the time until a large component emerges, as well as how it emerges. In Paper II, we introduce and study an extension of the preferential attachment tree. The standard version is already dynamic, but its vertices are only allowed to be of one type. We introduce a multi-type analog of the preferential attachment tree and study its asymptotic degree distributions as well as its asymptotic composition.
Paper III concerns the configuration model — a random graph neither dynamic nor multi-type — and we break with the first theme of the thesis since no extensions are made to the model. Instead, we argue that the size of the largest component in the model does not depend on the tail of the degree distribution, but rather on the distribution over small degrees. This is quantified in some detail. In Paper IV, we consider the frog model on Z^d and a two-type extension of it. For the one-type model, we show that the asymptotic shape does not depend on the initial set and the particle configuration there. For the two-type model, we show that the possibility for both types to coexist also does not depend on the initial sets and the particle configurations there. Paper V is concerned with the predictability of the set of discovered sites generated by the first passage percolation model. First passage percolation has the property that the set of discovered sites, scaled properly by time, converges to some deterministic set as time grows. Typically, not much is known about this set, and to get an impression of it simulations are needed. Using simulated data, we show that it is possible to use a neural network to adequately predict the shape, on this dataset, from some easily calculable properties of the passage times. The purpose of the paper is to give researchers a proof of concept of this method as well as a new tool for quickly getting an impression of the shape.

Stockholm 2020
http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-184028

ISBN 978-91-7911-256-1 (print)
ISBN 978-91-7911-257-8 (PDF)

Department of Mathematics

Stockholm University, 106 91 Stockholm

RANDOM GRAPH AND GROWTH MODELS

Sebastian Rosengren

Random Graph and Growth Models

Sebastian Rosengren

© Sebastian Rosengren, Stockholm University 2020

ISBN print 978-91-7911-256-1
ISBN PDF 978-91-7911-257-8

Printed in Sweden by Universitetsservice US-AB, Stockholm 2020

It was the best of times, it was the worst of times
— Charles Dickens

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I: A dynamic Erdős-Rényi graph model. Rosengren S., Trapman P. (2019). Markov Processes and Related Fields, 25(2):275-301.

II: A multi-type preferential attachment tree. Rosengren, S. (2018). Internet Mathematics, 1(1). DOI:10.24166/im.05.2018.

III: The tail does not determine the size of the giant. Deijfen, M., Rosengren, S., Trapman, P. (2018). Journal of Statistical Physics, 173:736–745. DOI:10.1007/s10955-018-2071-4.

IV: The initial set in the frog model is irrelevant. Deijfen, M., Rosengren, S. (2020). Electronic Communications in Probability, 25(50):1–7. DOI:10.1214/20-ECP329.

V: Predicting first passage percolation shapes using neural networks. Rosengren, S. (2020). arXiv:2006.14004.

Reprints were made with permission from the publishers.

Author’s contributions: S. Rosengren has taken an active part in developing the content of all papers. Paper I was based on the ideas of P. Trapman. The results in the first part of the paper were derived by S. Rosengren, and the second part was done jointly. The writing of the manuscript was done mainly by S. Rosengren. S. Rosengren is the sole author of Paper II, which is based on the ideas of M. Deijfen. Paper III was based on the ideas of M. Deijfen and P. Trapman. S. Rosengren derived all main results, and the simulations were done by S. Rosengren. The writing of the manuscript was mainly done by M. Deijfen. Paper IV was based on the ideas of M. Deijfen. The coupling argument, important for the main results, was provided by M. Deijfen, but the details were carried out by S. Rosengren. The writing of the manuscript was done jointly. Finally, S. Rosengren is the sole author of Paper V, which is based on his own ideas.

Acknowledgments

Here we are then, at the end of this journey. I know what you are thinking. A lazy person of average intelligence, now a PhD in Mathematical Statistics, how the hell did he pull this off? Turns out, you can get quite far with witty banter and wry observations. Joking aside, the answer is of course that I have had a lot of help, and I would like to thank all the people who made this thesis possible.

First and foremost, I would like to thank my supervisor Professor Mia Deijfen, for her continuous support and guidance, her wisdom, intelligence, and endless patience. Thank you for being so generous with your ideas and your time. I also thank my co-supervisor Pieter Trapman for all our fruitful discussions and collaborations, and for encouraging me to pursue a PhD.

I want to thank all my colleagues at the Department of Mathematics at Stockholm University for providing a fun and friendly work environment. To all the PhD students at the department — thank you for all the fun we have had. A special thank you to Felix, Måns, Hampus, and Erik for being very good friends and for making the quarantine quite enjoyable. Thank you Carl and Gabriel for our ventures outside academia.

Finally, I want to thank my family, for all their love and support. My father, for getting me interested in the more difficult things in my life — mathematics and running — and for teaching me that the best tastes are acquired. My mother, for taking such good care of me and my brother, and for always cheering us on.

Contents

List of Papers i

Acknowledgments iii

I Introduction 3

1 Random Graphs 5
  1.1 Motivation 5
    1.1.1 Graph Properties 6
  1.2 Random Graph Models 9
    1.2.1 Erdős-Rényi Graph 9
    1.2.2 Preferential Attachment Models 10
    1.2.3 The Configuration Model 11
  1.3 Dynamic Processes on Graphs 13

2 Random Growth Models 15
    2.1.1 First Passage Percolation 16
    2.1.2 The Frog Model on Z^d 18

3 Overview of Papers 21
  3.1 Paper I 21
  3.2 Paper II 22
  3.3 Paper III 23
  3.4 Paper IV 24
  3.5 Paper V 25

Sammanfattning 27

References 31

II Papers 33

Part I

Introduction

Chapter 1

Random Graphs

Networks and processes that live on them are everywhere. We find them in our social structures, in our brain, and in the way a disease spreads amongst us. There is a need to understand these structures and processes. What is the worst possible social structure for an epidemic? Who should we vaccinate? How long will the outbreak last? When will it peak? Can a rumor spread to most people in the network, and if so how fast? Is the network vulnerable to outside attacks? Random graph and growth models can shed light on questions like these. In this chapter we give a short overview of the field of random graphs, providing underlying motivation, theory, and some examples. Chapter 2 deals with random growth models.

1.1 Motivation¹

Graphs are mathematical objects used to represent some kind of network. They consist of vertices and edges. Vertices are meant to represent e.g. individuals, and edges are meant to represent some kind of relation; see Figure 1.1 for an illustration. Mathematically, a graph G is a collection of vertices V together with a collection of edges E ⊂ V × V, i.e. G = (V, E). For a given vertex, the number of incident edges is called the degree of the vertex, and the distribution over the degrees generated by choosing a vertex uniformly is called the degree distribution. Random graphs arise when the construction of the graph involves randomness.

Random graphs are interesting mathematical objects in their own right, and they are often the suitable choice when modeling networks that are inherently random, e.g. when the relationships between vertices can be regarded as random. However, maybe the most useful application of random graphs is to let them serve as models for complex networks. A complex network is a network with a complicated structure, e.g. the social structures of Facebook, LinkedIn, or just a large group of people. These networks are difficult to study. For example, if you want to know the size of the largest connected component of such a network you would need full access to the network, and even then the problem can be computationally infeasible — the time complexity of general search methods (for instance breadth-first search) is of order O(|V| + |E|) (which in turn can be of order O(n^2), where

¹This section is a modified replication of the introduction of [28].

Figure 1.1: Small graph representing actors connected and not connected to Kevin Bacon [2]. An edge represents that the two people have worked on a movie together.

n is the size of the network). One technique of dealing with complexity is to replace it with randomness, in our case replacing the complex network with a random graph. For instance, instead of knowing the full network structure we may have access to local properties, e.g. the average number of friends of a uniformly chosen individual. If we can find a random graph model with the same local properties, we could model the complex network as an outcome of that random graph, which in turn can be analyzed. This often allows us to make useful predictions about the complex network. Of course, predictions are only as reliable as the model fit, and there is a need for different models for different situations. For extensive treatments of the area of random graph models for complex networks, see e.g. [17], [27], [32]. To put the model assumptions of random graphs into context, we first list what is often found when observing such graphs in reality.

1.1.1 Graph Properties

In this section, we give an overview of some properties observed in real-world networks and their theoretical counterparts, as well as other properties that are often studied by researchers. In order to make the findings on real-world networks mathematically precise, we first have to introduce a graph sequence. Typically, when modeling networks, we are interested in properties of graphs that are very large or growing. With this in mind, it makes sense to consider a sequence {G_n}_{n≥1} of random graphs, with n denoting the size of the network.

Degree Distributions and Power Laws

Of paramount importance for random graphs is the degree distribution, since it affects many properties of a random graph. For a graph G_n it is the distribution over degrees for a uniformly selected vertex. To define its empirical counterpart, called the empirical degree distribution, let P_k^(n) be the fraction of vertices in G_n that have degree k. The difference between the distributions thus lies in the empirical part: the latter distribution requires data. Furthermore, it is usually assumed that the graph sequence {G_n}_{n≥1} is sparse, meaning that P_k^(n) → p_k as n → ∞ (in some sense, e.g. in probability), where {p_k}_{k≥0} is a probability distribution.

Empirical networks are observed networks, i.e. real-world networks. Of course, empirical networks range in structure, properties, and origin. However, quite surprisingly, they often share two properties. First, they tend to be scale-free, in that the degree distribution follows a power law. Secondly, they tend to be small-worlds, meaning that the distances between vertices are small (the concept is formally defined in the next section); see [32, Ch. 1] and the references therein. When developing models for empirical networks it is therefore good to have these two properties in mind. That a network is scale-free means that P_k^(n) follows a power law, that is,

P_k^(n) ≈ C · k^(−τ)

for large n. It is also possible to relax this relationship somewhat, and instead assume that

P_k^(n) ≈ l(k) k^(−τ),

where l(k) is a slowly varying function. For a graph sequence {G_n}_{n≥1} we can be more precise in our definition. We say that it exhibits a power-law degree distribution if p_k/(C · k^(−τ)) → 1 as k → ∞. The value of τ varies depending on the application, but often τ ∈ (2, 3). These values correspond to degree distributions with finite mean but infinite variance — meaning that, as the network grows, the average degree stays bounded but the variance of the degrees grows to infinity. The reason for the ubiquitousness of power laws is the Pareto principle, or the 80-20 rule — e.g. 20% of the vertices have 80% of the connections. Networks having a version of this property will have power-law degree distributions, and vice versa. We will return to power laws in Section 1.2.2. Although interesting and worth studying, many standard graph models are not suitable for real-world applications. One reason for this is that the limiting degree distribution of a vertex tends to have an exponentially decaying tail — i.e. the fraction of vertices with degree k decays exponentially in k. As noted, in many empirical networks the tail of the degree distribution decays more slowly. For instance, according to a Gallup poll ([1]) the mean number of friends of an average American is 8.6. However, all of us know people with, say, 30 or more friends (albeit maybe not mathematicians). A degree distribution with an exponentially decaying tail would, according to intuition, predict too few people with 30 or more friends.

Small-worlds

That a network is a small-world means that the distances between vertices tend to be small. Many social networks have been found to have this property. Consider, for instance, the famous example of six degrees of separation, where empirical evidence shows that two randomly selected people are six or fewer social connections away from each other (see [26]). A picture illustrating the small-world phenomenon for articles on Wikipedia can be found in Figure 1.2. Small-world networks are necessarily highly connected (see the definition in the next section) with a large component. Therefore, when studying theoretical models of such networks one is typically interested in parameter values for which the resulting graph is highly connected.

Figure 1.2: Shortest paths between the Wikipedia entries six degrees of separation and Stockholm University. An edge represents a hyperlink between the articles. Picture generated from [3].

We can make the notion of small-world precise by again considering a graph sequence {G_n}_{n≥1}. First let dist(u, v) be the minimal number of edges linking the vertices u and v, set equal to ∞ if no such path exists. In order to describe the typical distance in a graph G_n, let U_1 and U_2 be two uniformly selected vertices and set H_n = dist(U_1, U_2) — this quantity is referred to as the typical distance of the graph. There are other measures of distance on graphs that are useful, for instance the diameter diam(G) = max_{u,v∈G} dist(u, v | u, v connected). Both measures are useful, but the distribution of the typical distance captures information about all distances between vertex pairs. Now, we say that a graph sequence {G_n}_{n≥1} is a small-world if there exists a constant K such that lim_{n→∞} P(H_n ≤ K log n) = 1. This means that the typical distance is logarithmic in the size of the graph, reflecting that every step we explore from a vertex multiplies the number of accessible vertices.
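The quantities dist(u, v) and H_n are straightforward to compute with breadth-first search. Below is a minimal Python sketch; the adjacency list and the function names are my own illustrative choices, not notation from the thesis.

```python
import random
from collections import deque

def dist(adj, u, v):
    """BFS distance between u and v; float('inf') if no path exists."""
    if u == v:
        return 0
    seen = {u}
    queue = deque([(u, 0)])
    while queue:
        node, d = queue.popleft()
        for w in adj[node]:
            if w == v:
                return d + 1
            if w not in seen:
                seen.add(w)
                queue.append((w, d + 1))
    return float('inf')

def typical_distance(adj):
    """One sample of H_n = dist(U1, U2) for two uniform distinct vertices."""
    u, v = random.sample(list(adj), 2)
    return dist(adj, u, v)

# A small made-up example graph: path 0-1-2-3 plus the chord 0-2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(dist(adj, 0, 3))  # 2, via the chord 0-2-3
```

Since BFS runs in O(|V| + |E|) time, sampling a handful of vertex pairs gives a cheap estimate of the typical distance even for large graphs.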

Highly Connected Graphs and Giant Components

As mentioned, empirical networks are often highly connected. This means that a positive fraction of the vertices will be in a single connected component. For a graph G and vertex v, let C(v) = {u ∈ G : dist(u, v) < ∞} be the set containing all vertices accessible from v — called the connected component of v. To make the notion of highly connected precise, let C(i) denote the ith largest component in the graph. We say a graph sequence {G_n}_{n≥1} is highly connected if lim inf_{n→∞} |C(1)|/n > 0 (in some sense, e.g. convergence in probability), and call C(1) a giant component. Furthermore, if lim sup_{n→∞} |C(2)|/n = 0 we say that C(1) is the unique giant component — i.e. there exists one component which contains a positive fraction of the vertices, while all other components are of sub-linear size. A giant component is important for many properties of a network. For instance, a giant component can tell us something about the impact of an epidemic; the likely success of a vaccination scheme or an ad campaign; or whether a rumor can spread. Therefore, there has been considerable interest in the giant component of random graph models — how its size relates to model parameters, and for which parameter values it emerges.

Phase Transitions

Phase transitions are everywhere in the natural world. The term refers to the phenomenon whereby a small change in the input results in a drastic change of the output — e.g. water changes state from liquid to gas. It is a phenomenon inherent to most random graph models. For instance, there is often a phase transition regarding the giant component. Typically, as some model parameter varies (e.g. the edge probability), the graph goes from having only small components to the emergence of a giant component.

1.2 Random Graph Models

Having the properties and phenomena listed above in mind, we next introduce some of the standard random graph models in the field.

1.2.1 Erdős-Rényi Graph²

The Erdős-Rényi graph model(s) was first introduced in [18] and [20], and is a well-studied model of random graphs. It is defined as either: (i) consisting of n vertices and m edges, where the edges are assigned uniformly among the n(n−1)/2 vertex pairs — this graph model is denoted G(n, m); or (ii) consisting of n vertices where edges are assigned independently between vertex pairs with probability p — this graph model is denoted G(n, p); see [12] for more details and many properties of the model. In what follows we shall focus solely on the G(n, p) model. It is one of the simplest random graph models but still exhibits interesting behavior as n grows large. For instance, with p_n = λ/n, its degree distribution converges to a Poisson distribution with parameter λ. However, the most famous property of the model is that it exhibits a phase transition — there is a sharp parameter threshold where properties of the graph drastically change. Namely, for the number of vertices in the largest component, denoted |C_1(n, p)|, we have the following classic result.

²This section is a modified replication of Section 1.1 of [28].

Theorem 1. [32, Thm. 4.4, 4.5, and 4.8] Let p_n = λ/n where λ > 0.

(i) If λ < 1 then

    |C_1(n, p_n)| / log(n) → 1/(λ − 1 − log(λ)) in probability as n → ∞.

(ii) If λ > 1 then

    |C_1(n, p_n)| / n → ξ in probability as n → ∞,

where ξ is the survival probability of a branching process with offspring distribution X ∼ Po(λ).

The critical case λ = 1 is also of mathematical interest. The size of the largest component is then of order n^(2/3), see e.g. Theorem 5.1 of [32]. In Paper I we extend the Erdős-Rényi graph to a dynamic graph and see that, among other things, it too exhibits a phase transition.
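The supercritical case (ii) is easy to probe numerically. The sketch below, using only the Python standard library, samples G(n, p_n) with λ = 2, measures |C_1|/n by breadth-first search, and compares it with the survival probability ξ of a Po(λ) branching process, obtained by iterating the fixed-point equation ξ = 1 − e^(−λξ). The function names are my own; this is an illustration, not code from the thesis.

```python
import math
import random
from collections import deque

def largest_component_fraction(n, p, seed=0):
    """Sample G(n, p) and return |C_1|/n, found by BFS over components."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    seen = [False] * n
    best = 0
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        size, queue = 1, deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if not seen[w]:
                    seen[w] = True
                    size += 1
                    queue.append(w)
        best = max(best, size)
    return best / n

def survival_probability(lam, iters=200):
    """Iterate xi = 1 - exp(-lam*xi): the Po(lam) branching process
    survival probability for lam > 1."""
    xi = 1.0
    for _ in range(iters):
        xi = 1.0 - math.exp(-lam * xi)
    return xi

lam, n = 2.0, 2000
print(survival_probability(lam))               # about 0.797 for lam = 2
print(largest_component_fraction(n, lam / n))  # should be close to the above
```

For λ = 2 the theoretical value is ξ ≈ 0.797, and already at n = 2000 the simulated fraction is typically within a couple of percent of it.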

1.2.2 Preferential Attachment Models³

The preferential attachment model is a well-studied model of random network growth and was first introduced in [9]. In preferential attachment models, traditionally, new vertices arrive according to some process (often at integer times t = 1, 2, ...) and upon arrival attach to an existing vertex with probability proportional to that vertex's degree. Let deg_n(v_i) denote the degree of vertex v_i at time n. For the standard preferential attachment model, the probability that a new vertex v_{n+1} attaches to an existing vertex v_j at time n + 1 is given by

P(v_{n+1} → v_j) = deg_n(v_j) / Σ_{i=1}^n deg_n(v_i).

Therefore, vertices that have a high degree are more likely to attract new vertices — this is called the rich-get-richer effect. The above dynamic generates degree distributions following a power law. More formally, let P_k(t) be the fraction of vertices with degree k at time t. It was proven in [11] that

P_k(t) → 4/(k(k + 1)(k + 2)) ∝ k^(−3) in probability as t → ∞.

The model described above is sometimes called a preferential attachment tree, since each arriving vertex attaches to a single existing one and therefore generates a tree structure. The framework can be extended to include vertices arriving with multiple edges, see e.g. [32, Chapter 8]. Note that the graph is always fully connected, so questions about the largest component are irrelevant. In a recent paper [14], vertex death was incorporated into the model so that components may arise. However, the focus of that paper was not on the components; this was left as an open problem. The preferential attachment model is already dynamic, but the vertices are only allowed to be of one single type. In Paper II, we extend the preferential attachment tree by allowing vertices to be of different types, as well as regarding time as continuous. This allows for a more flexible modeling framework but does not change the fundamental properties that the preferential attachment dynamic gives rise to. In particular, the multi-type tree still has power-law degree distributions.

³This section is a modified replication of Section 1.2 of [28].
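As an aside, the one-type tree is simple to simulate: a classical trick is to keep a list with one entry per half-edge, so that a uniform draw from the list is automatically degree-proportional. A Python sketch, under the assumption that the tree starts from a single edge (the function name is mine):

```python
import random
from collections import Counter

def preferential_attachment_tree(n, seed=0):
    """Grow a preferential attachment tree on n vertices: vertex t attaches
    to an existing vertex chosen proportionally to its degree.
    Returns the list of final degrees."""
    rng = random.Random(seed)
    # One list entry per half-edge: a uniform draw is degree-proportional.
    endpoints = [0, 1]       # start from the single edge 0-1
    degree = [1, 1]
    for new in range(2, n):
        target = rng.choice(endpoints)
        endpoints += [new, target]
        degree.append(1)
        degree[target] += 1
    return degree

deg = preferential_attachment_tree(100_000)
counts = Counter(deg)
n = len(deg)
# Empirical fractions vs the limit 4/(k(k+1)(k+2)):
for k in (1, 2, 3):
    print(k, counts[k] / n, 4 / (k * (k + 1) * (k + 2)))
```

Already at n = 10^5 the empirical fractions sit close to the limiting values 2/3, 1/6, and 1/15 for k = 1, 2, 3.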

1.2.3 The Configuration Model

So far we have covered two random graph models with degree distributions exhibiting two kinds of tail behavior — the Erdős-Rényi graph with an exponentially decaying distribution (Poisson degree distribution), and the preferential attachment model with power-law degree distributions. This does not capture all reasonable modeling situations. For instance, a random network can have bounded degrees. In many situations, you might even know the exact degree distribution, for instance by polling people. The configuration model is one of the simplest and most well-known models for generating a random graph with a prescribed degree distribution. The model is well-studied, and many important questions have been answered in a precise manner. See [24] for results concerning the giant component; [33] for diameters and distances; and [32, Ch. 7], [31, Ch. 4] for detailed overviews of the model. Call d = (d_1, ..., d_n) a degree sequence if d_i ∈ Z_+, and assume that Σ_{i=1}^n d_i is even. We want to construct a graph where vertex i has degree d_i. If we restrict ourselves to simple graphs, i.e. graphs with no self-loops and at most one edge between two vertices, this is not always possible. However, if we allow self-loops and multiple edges between two vertices it can be done. We call such a graph a multigraph. To generate a random (multi-)graph of given size n with degree sequence d = (d_1, ..., d_n) we proceed as follows:

1. Assign d_i half-edges to vertex i.

2. Choose two half-edges uniformly at random, and connect them to form an edge. Continue until no half-edges remain.

The resulting model is called the configuration model, and is denoted CM_n(d). We can introduce a degree distribution by letting the d_i be drawn (i.i.d.) from a probability distribution D, with the convention that if Σ_{i=1}^n d_i is odd we add one half-edge to the nth vertex (asymptotically this does not matter) and proceed as above — the model is then denoted CM_n(D).
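The two construction steps can be sketched in a few lines of Python: a uniform shuffle of the half-edge list followed by pairing consecutive entries produces a uniformly random matching. The function name is my own, and no effort is made to avoid self-loops or multi-edges, since they are allowed in the multigraph:

```python
import random

def configuration_model(degrees, seed=0):
    """Pair half-edges uniformly at random; returns the edge list of the
    resulting multigraph (self-loops and multiple edges allowed)."""
    assert sum(degrees) % 2 == 0, "total degree must be even"
    rng = random.Random(seed)
    # Step 1: one half-edge entry per unit of degree.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    # Step 2: a uniform shuffle + consecutive pairing is a uniform matching.
    rng.shuffle(half_edges)
    return list(zip(half_edges[::2], half_edges[1::2]))

edges = configuration_model([3, 2, 2, 1])
print(edges)  # 4 edges; vertex 0 appears in exactly 3 endpoint slots
```

By construction, each vertex appears among the edge endpoints exactly as many times as its prescribed degree, whatever the random pairing turns out to be.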

A common variant of this model is the erased configuration model, where one starts with a configuration model, removes all self-loops, and merges all multiple edges. It can be shown that, as n → ∞, the degree distribution of the erased configuration model converges to the degree distribution in the configuration model. Hence, for large n we can use this method to generate simple graphs with a prescribed degree distribution. It follows that many asymptotic properties of the two models are the same, for instance the size of the giant component. The configuration model too exhibits a phase transition, determined by the value of the parameter ν = E(D(D − 1))/E(D). This may seem counter-intuitive, since the mean number of connections of a uniformly selected vertex is E(D) and we might expect this to be the important parameter (as is indeed the case in the Erdős-Rényi graph model). While it is true that the average degree of a vertex v is E(D), the same does not hold for the neighbors of v. They have, by construction of the graph model, on average E(D(D − 1))/E(D) connections. This is easiest understood by realizing that a vertex with degree k is k times more likely to be connected to a given vertex than a vertex with degree 1. This type of phenomenon is sometimes summarized as your friends have more friends than you, since on average we are more likely to be friends with more social people; the degree distribution of your friends is said to be size-biased, see [32, Ch. 1] for more details. Now, if ν = E(D(D − 1))/E(D) > 1 (with some added technical conditions) the configuration model will have a giant component, while if ν ≤ 1 the size of the largest component grows sub-linearly.
Using a two-stage branching process approximation it can be shown that the (proportional) size ξ of the largest component corresponds to the survival probability of a branching process where the ancestor has offspring distribution F = {p_i, i = 0, 1, ...} and all other individuals have the down-shifted size-biased offspring distribution F̄ = {p̄_i, i = 0, 1, ...}, where

p̄_i = (i + 1) p_{i+1} / μ,

with μ = E(D). Let z̄ be the extinction probability of the branching process with offspring distribution F̄, and let g be the probability generating function of F. It follows from standard branching process theory that z̄ is the smallest positive solution to the equation s = g'(s)/E(D). The extinction probability 1 − ξ of the above described two-stage branching process is then given by

1 − ξ = Σ_{k=0}^∞ z̄^k p_k = g(z̄).

From this equation, we see that the size of the giant will depend mostly on the distribution over small degrees, and in Paper III we study this in detail.
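As a concrete illustration, take D ∼ Po(λ), for which g(s) = e^(λ(s−1)) and E(D) = λ, so that z̄ solves s = e^(λ(s−1)) and 1 − ξ = g(z̄). A small numerical sketch (my own code; the fixed-point iteration started from 0 converges to the smallest positive solution):

```python
import math

def giant_fraction_poisson(lam, iters=200):
    """Giant component fraction xi for the configuration model with
    Po(lam) degrees: z solves z = g'(z)/E(D) = exp(lam*(z-1)),
    and 1 - xi = g(z)."""
    z = 0.0
    for _ in range(iters):
        z = math.exp(lam * (z - 1.0))   # fixed-point iteration from below
    g_z = math.exp(lam * (z - 1.0))     # g(z) for the Poisson pgf
    return 1.0 - g_z

print(giant_fraction_poisson(2.0))  # about 0.797 for lam = 2
```

For λ = 2 this reproduces ξ ≈ 0.797, the same value as for the Erdős-Rényi graph with λ = 2, which is consistent since ν = E(D(D − 1))/E(D) = λ for Poisson degrees.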

1.3 Dynamic Processes on Graphs

A useful application of many graph models is to let them serve as the structure on which a process lives. There are many examples that fall into this category: traffic and routing; rumor and information spreading; opinion formation and voting behavior. However, perhaps the most well-known application is the modeling of infectious disease spread (epidemics) and percolation. In this section we give a short introduction to epidemic modeling and percolation. The topic of epidemics will not be covered in any of the papers. A thorough treatment of the subject can be found in e.g. [16].

Epidemics

One of the simplest and most well-known stochastic models for infectious disease spread is the SIR (susceptible-infectious-recovered) model. Of course, the simplicity comes at a cost — it is not entirely realistic. Still, it is an interesting model, and can serve as a stepping stone to more realistic models. Furthermore, it is a good model to use when introducing concepts important to epidemic modeling. In the SIR model, the population consists of n individuals and does not grow over time. The social structure of the population is modeled with some (random) graph, and an infection can only take place along the edges of the graph. Each individual can be in one of three states: susceptible (S), i.e. can contract the disease; infectious (I), the individual can spread the disease; and recovered (R), the individual is immune to the disease and cannot spread it. Initially, there is one infected individual. An infected individual — or infective — stays infectious for a (random) period I, and these periods are assumed i.i.d. for different individuals. An infective makes contact with each neighbor in the underlying graph according to a Poisson process with rate λ_n. If a contact occurs between an infective and a susceptible individual an infection takes place, and the newly infected individual starts to spread the disease. After the infectious period I, the individual recovers, becomes immune to the disease, and cannot spread it further. Finally, all random elements in the process are assumed to be independent. In the standard SIR model, the social structure is modeled by a fully connected graph (each pair of individuals shares an edge) and the transmission rate is λ_n = λ/n. Several simplifications have thus been made: there is no change in the infective's behavior over time; the immunity lasts for the duration of the epidemic; there is no population growth. There are models addressing these restrictions, see e.g. [7], [16], [17] for an overview of the area.
Arguably, when modeling an epidemic the single most important quantity is the reproduction number R_0. The formal definition varies from model to model, but informally it is the average number of infections spread by an infective in the beginning of the epidemic. For the SIR model on a fully connected graph,

R_0 = λE(I),

since the average infectious period is E(I), during which the individual makes λ contacts per time unit (in the beginning, likely only with susceptible individuals). Knowing the reproduction number is extremely useful. It determines if a large outbreak can occur — i.e. whether a positive fraction of individuals become infected as n → ∞. There is a phase transition here as well: if R_0 ≤ 1 no large outbreak can occur, and if R_0 > 1 a large outbreak can occur. There are many more calculable properties of interest: the probability of a large outbreak; the size of the outbreak; how many need to be vaccinated to prevent a large outbreak; the duration of an outbreak.
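For the Markovian special case I ∼ Exp(γ), where R_0 = λ/γ, the model on the complete graph can be simulated with a simple Gillespie scheme. The sketch below is my own illustration with made-up parameter choices, not code from the thesis:

```python
import random

def sir_final_size(n, lam, gamma, seed=0):
    """Gillespie simulation of the Markovian SIR model on a complete graph
    with per-pair contact rate lam/n and Exp(gamma) infectious periods,
    so R0 = lam/gamma. Returns the fraction of ever-infected individuals."""
    rng = random.Random(seed)
    s, i, r = n - 1, 1, 0
    while i > 0:
        infection_rate = lam / n * s * i   # total rate of S-I transmission
        recovery_rate = gamma * i          # total rate of recoveries
        total = infection_rate + recovery_rate
        if rng.random() < infection_rate / total:
            s, i = s - 1, i + 1            # an infection event
        else:
            i, r = i - 1, r + 1            # a recovery event
    return (n - s) / n

# R0 = 2: a large outbreak occurs with positive probability;
# R0 = 0.5: the outbreak stays small.
print(sir_final_size(10_000, 2.0, 1.0))
print(sir_final_size(10_000, 0.5, 1.0))
```

Note that with R_0 = 2 a given run may still die out early (minor outbreak), which is exactly the branching process extinction discussed above; only the successful runs approach the deterministic final-size fraction.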

Percolation

Percolation models can be thought of as dynamic processes on graphs. Say we have a network that only works well if it is highly connected. This is the case with the Internet, for instance: it only works if everyone on it is connected — otherwise e.g. e-mail starts to fail. For such a network we may ask what its fault tolerance is, or whether it is vulnerable to an attack. If we remove each edge in the network with some probability 1 − p (keeping it with probability p), will it still be highly connected?

Vaguely speaking, if for some retention probability p the graph is still highly connected, or a large component exists, we say that it percolates. Many models also exhibit a phase transition — there exists a critical value pc such that if p > pc the model percolates and if p ≤ pc it does not. Researchers are often interested in whether the model percolates at the critical threshold (i.e. p = pc), a question that has proven deceptively difficult to answer. There is a rich history of percolation, particularly on the lattice Zd. For a thorough treatment of the subject see [21].

Many percolation models can be recast as epidemic models and vice versa. The Reed-Frost model is such an example. It is one of the simplest epidemic models on a graph, and describes the evolution of an epidemic in terms of generations, where each infective in generation t independently infects each of its susceptible neighbors with some probability p. The underlying graph is typically assumed to be sparse. The infectives (infected at time t − 1) are then removed from the population, and the newly infected (infected at time t) form the next generation of infectives. We see that this can be thought of as a type of percolation, in that we start with a graph and then iteratively remove those edges that are not used for transmission. This model has been extended in many ways.
For instance, if we add a random weight τe to each edge e, and let the transmission probability be a function of this weight we have introduced inhomogeneity into the transmission process. It is also possible to interpret these weights, not as probabilities, but as traversal times for the infection. This model type is called first passage percolation, and leads us into the subject of random growth models.
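The correspondence between the Reed-Frost epidemic and bond percolation can be made concrete: the set of ever-infected individuals has the same distribution as the percolation cluster of the initial infective when each edge is independently kept with probability p. A small sketch of both constructions (function names are ours):

```python
import random

def reed_frost(adj, p, source, rng):
    """Reed-Frost epidemic on a graph given as an adjacency list: each
    infective independently infects each susceptible neighbor with
    probability p, then recovers. Returns the set ever infected."""
    infected = {source}
    current = [source]
    while current:
        nxt = []
        for v in current:
            for w in adj[v]:
                if w not in infected and rng.random() < p:
                    infected.add(w)
                    nxt.append(w)
        current = nxt
    return infected

def percolation_cluster(adj, p, source, rng):
    """Bond percolation: keep each edge independently with probability
    p and return the cluster of `source`. Distributionally this equals
    the ever-infected set of the Reed-Frost epidemic above."""
    kept = {}
    for v in adj:
        for w in adj[v]:
            e = (min(v, w), max(v, w))
            if e not in kept:
                kept[e] = rng.random() < p
    cluster, stack = {source}, [source]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if kept[(min(v, w), max(v, w))] and w not in cluster:
                cluster.add(w)
                stack.append(w)
    return cluster
```

For p = 1 both return the connected component of the source, and for p = 0 both return the source alone; for intermediate p the two sets agree in distribution, though not realization by realization.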

Chapter 2

Random Growth Models

A random growth model can often be described as a dynamic process on a graph. However, random growth models are usually presented as a separate field of study, since the field differs in origin and history. Random growth models have roots in statistical physics, while dynamic processes on random graphs have a more diverse origin. Furthermore, random growth models have a longer history, going back to the 1950s, while the study of dynamic processes on random graphs started off in the early 2000s. Traditionally, random growth models have an infinite graph, such as Zd, as their underlying structure. However, the two fields share a lot of common ground, and many random growth models can be recast in terms of an epidemic on a graph.

First passage percolation is perhaps the most widely known random growth model. It was first introduced as a model for a liquid flow through a random medium, but can also be interpreted as a type of epidemic — namely as an SI-epidemic (Susceptible-Infectious), where an infection lasts forever and no immunity can be achieved. We introduce first passage percolation in some detail in Section 2.1.1. Another well-known example of a random growth model is the contact process. It is used as a model for population growth on some graph G, where an occupied site becomes vacant at a constant rate, while a vacant site becomes occupied at a rate proportional to the number of occupied neighboring sites. This can be interpreted as an SIS-epidemic (Susceptible-Infectious-Susceptible), where a person stays infected for some time, but gains no immunity, and instead becomes susceptible to the disease again. Closely connected to the contact process is the voter model. A voter is placed at each point of Zd, where the voter can have one of two opinions. The voter at site x changes opinion at rate c(x, η), where η is the current configuration of opinions of all voters. In the linear voting model the rate is chosen as

c(x, η) = Σ_{||y−x||≤N} 1{η(y) ≠ η(x)},

i.e. the rate is determined by the number of voters of a different opinion within range N. For a thorough treatment of the contact process, the voter model, and related processes see [25]. Next, we describe the two random growth models studied in Papers IV and V — first passage percolation and the frog model on Zd.

2.1.1 First Passage Percolation

First passage percolation is a well-known random growth model on Zd. It was first introduced in [22] as a model for how a liquid flows through a medium, and has found applications in physics, biology, computer science, and mathematics. A recent summary of its history is provided in [8]. On each edge e in Zd we place a non-negative random variable τe. The family of random variables {τe} is assumed to be i.i.d. In the original setting τe is interpreted as the time it takes the liquid to pass through the edge e. The process is started by introducing liquid at the origin, and then having it propagate through the medium. As mentioned, first passage percolation can also be interpreted as an SI-epidemic on Zd. Indeed, an individual can be susceptible or infectious, and once infected the individual stays infectious forever, spreading the disease to its nearest neighbors on Zd.

An important result on first passage percolation is that the set of discovered sites, scaled by time, converges to a deterministic shape. Knowing this limit shape can be useful since it tells us something about how the process will grow. A key concept is that of a path Γ, defined as a sequence of edges e1, e2, ... such that consecutive edges en and en+1 share a vertex. We define the passage time of a path Γ as

T(Γ) = Σ_{e∈Γ} τe,

i.e. the time it takes to traverse the path. Furthermore, we define the passage time between two points x, y ∈ Zd to be

T(x, y) = inf{T(r) : r is a path from x to y}.

Let

B(t) = {y ∈ Zd : T(0, y) ≤ t},

i.e. the set of vertices that can be reached from the origin by time t — referred to as the set of infected or discovered sites. For a wide range of passage times the model exhibits a shape theorem. Let

B̄(t) = {x + [−1/2, 1/2]d : x ∈ B(t)}

be the continuum version of B(t).

Theorem 2 (Cox and Durrett [13]). Assume that {τe} satisfies

(i) E(min{τe1, τe2, . . . , τe2d}^d) < ∞, where τe1, τe2, . . . , τe2d are i.i.d. copies of τe.

(ii) P(τe = 0) < pc(d), where pc(d) is the critical threshold for bond percolation on Zd.

Then, there exists a convex non-empty compact set B ⊂ Rd such that for each ε > 0,

P((1 − ε)B ⊂ B̄(t)/t ⊂ (1 + ε)B for all large t) = 1.

This is proven by using the sub-additive ergodic theorem to show that B(t) grows linearly in a given direction, and then using additional arguments to show linear growth in all directions simultaneously. Figure 2.1 gives an impression of B for gamma distributed passage times.

Figure 2.1: Simulation of B(t)/t with τe ∼ Γ(n = 10, λ = 1) for t = 250, 500, 750, 1000.
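Since T(0, y) is an infimum of path sums of non-negative weights, such simulations reduce to a single-source shortest-path computation, and Dijkstra's algorithm computes B(t) exactly on a finite box. A minimal sketch with Exp(1) passage times (the exponential choice, box size, and names are ours; the figure uses gamma times):

```python
import heapq
import random

def fpp_ball(t_max, size, rng):
    """First passage percolation on the box [-size, size]^2 of Z^2 with
    i.i.d. Exp(1) passage times. Dijkstra from the origin computes
    T(0, y) for every y, and B(t) = {y : T(0, y) <= t}."""
    weights = {}  # i.i.d. edge weights, drawn lazily on first use

    def tau(x, y):
        e = (min(x, y), max(x, y))
        if e not in weights:
            weights[e] = rng.expovariate(1.0)
        return weights[e]

    dist = {(0, 0): 0.0}
    heap = [(0.0, (0, 0))]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist.get(x, float("inf")) or d > t_max:
            continue  # stale heap entry, or already outside the ball
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            y = (x[0] + dx, x[1] + dy)
            if abs(y[0]) > size or abs(y[1]) > size:
                continue
            nd = d + tau(x, y)
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return {y for y, d in dist.items() if d <= t_max}
```

Plotting `fpp_ball(t, ...)` scaled by 1/t for growing t reproduces pictures like Figure 2.1.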

Unfortunately, not much is known about B. We know the shape is non-empty, convex, compact, and inherits all symmetries from Zd, see e.g. [4]. Currently, for non-degenerate passage times, the only way to get an impression of the shape is through simulations. In Paper V, we investigate if it is possible to use a neural network to adequately predict the shape of B using some easily calculable properties of the passage times.

There has also been considerable interest in the asymptotic shape from the statistical physics community, mainly focused on the fluctuations of B̄(t) around its mean. It is conjectured that first passage percolation belongs to the Kardar-Parisi-Zhang universality class, meaning that B̄(t) fluctuates in a special way, see [8] for more details. The model is famous for having many properties that are believed to be true but are very difficult to prove. Still, there are precise answers to many questions. For instance, there are important results about the properties of time-minimizing paths (so-called geodesics), see e.g. [8].

We also mention that first passage percolation has been studied on other structures than Zd. First passage percolation on a configuration model with power-law degree distributions and finite mean degree was studied in [10]. It turns out, in this case, that the topology of the graph and the topology induced by first passage percolation on it are quite different. Configuration models with power-law degree distributions (exponent τ ∈ (2, 3)) are ultra-small worlds, meaning that the typical distance between two uniformly selected connected vertices is of order log log(n). However, for first passage percolation, the number of edges on the minimal-weight path connecting two uniformly selected vertices, called the hopcount, is of order log(n). This implies that the most time-efficient path between two connected vertices tends to be much longer than the shortest path.

2.1.2 The Frog Model on Zd

The frog model is a random growth model, driven by particles (or frogs) moving according to discrete-time random walks on Zd. Each site x ∈ Zd is initially populated by an i.i.d. number η(x) of sleeping particles, and each particle has an independent random walk associated with it. It is also possible to use lazy random walks, i.e. random walks that stay put, independently in each time step, with some probability 1 − p. At time 0, the sleeping particles at the origin are activated (conditioned on the origin being populated, i.e. η(0) ≥ 1), and start to move according to their random walks. When a site is first visited by a particle, any sleeping particles there are activated and start to move according to their random walks.

Like first passage percolation, the frog model exhibits a shape theorem. Namely, the set of visited sites, properly scaled by time, converges to a deterministic set. Let ξn denote the set of discovered sites at time n, let ξ̄n = {x + (−1/2, 1/2]d, x ∈ ξn} be the continuum version of ξn, and let ν be the product measure on Zd generated by {η(x), x ∈ Zd}.

Theorem 3 ([6, Thm. 1.1]). For any dimension d ≥ 1 there is a nonempty convex set A = A(d, ν) ⊂ Rd such that, for any 0 < ε < 1 and conditioned on η(0) ≥ 1, almost surely

(1 − ε)A ⊂ ξ̄n/n ⊂ (1 + ε)A

for all n large enough.
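A minimal sketch of the one-type dynamics in pure Python; taking ν to be Poisson, the laziness parameter, and all names are our own choices, not the thesis's:

```python
import math
import random

def poisson(lam, rng):
    """Knuth's Poisson sampler (stdlib-only)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def frog_model(n_steps, mean_frogs, jump_p, rng):
    """Frog model on Z^2 driven by lazy simple random walks. Each site
    holds an i.i.d. Poisson(mean_frogs) number of sleeping frogs (the
    Poisson choice is an assumption of this sketch); the origin is
    conditioned to be occupied. Returns the set of visited sites."""
    dirs = ((1, 0), (-1, 0), (0, 1), (0, -1))
    origin_frogs = 0
    while origin_frogs == 0:              # condition on eta(0) >= 1
        origin_frogs = poisson(mean_frogs, rng)
    active = [(0, 0)] * origin_frogs
    visited = {(0, 0)}
    for _ in range(n_steps):
        moved = []
        for x, y in active:
            if rng.random() < jump_p:     # lazy walk: move w.p. jump_p
                dx, dy = rng.choice(dirs)
                x, y = x + dx, y + dy
            if (x, y) not in visited:     # first visit wakes the sleepers
                visited.add((x, y))
                moved.extend([(x, y)] * poisson(mean_frogs, rng))
            moved.append((x, y))
        active = moved
    return visited
```

Plotting the visited set for growing `n_steps`, scaled by `n_steps`, gives an impression of the limit set A.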

In Paper IV, we show that the limiting set does not depend on how the process is started. We can start the process by activating some (finite) set A in any way we want (we do not have to draw the starting particles from ν), and the limiting set will not be affected.

The model has also been studied with regard to transience and recurrence, i.e. whether a site is visited infinitely many times or not. It was shown in [30] that the frog model, with one frog at each site, is strongly recurrent. They also studied the frog model on other structures than the lattice, and there recurrence is not a given. In [23] the frog model on trees was studied with regard to the limiting set, and it was shown that the set of discovered sites grows linearly, similar to its behavior on Zd.

In [15], the frog model was extended to a two-type version. An activated particle can be of one of two types i = 1, 2, and a type i particle moves according to a lazy random walk with jump probability pi. Furthermore, if a type i particle discovers a site, all particles at the site become activated and are assigned type i. Ties can be resolved by some arbitrary mechanism, e.g. by a fair coin toss. At time 0, the process is initiated by activating the particles at the origin 0 and assigning them type 1, while particles at another site z are activated and assigned type 2, where we condition on η(0) ≥ 1 and η(z) ≥ 1. All other particles are sleeping and do not yet have a type assigned. Deijfen et al. [15] study, among other things, the possibility of co-existence, i.e. the probability that both types discover infinitely many sites. In Paper IV, we show that the possibility of co-existence also does not depend on how you start the processes — if co-existence is possible from some starting configuration, it is also possible from all others.


Chapter 3

Overview of Papers

Papers I, II, and III concern random graph models: the Erdős-Rényi graph, the preferential attachment tree, and the configuration model, respectively. Papers IV and V concern random growth models: the frog model and first passage percolation. Below we give an overview of the content of the papers.

3.1 Paper I¹

In Paper I we study a dynamic Erdős-Rényi graph model (first introduced in [5]), in which, independently for each vertex pair, edges appear and disappear according to a Markov on-off process. For α, β > 0 and n a positive integer, the dynamic Erdős-Rényi graph is a random graph evolving according to the following dynamics.

(i) The number of vertices is fixed and equal to n.

(ii) Independently for each vertex pair: if no edge is present, an edge is added after an Exp(β/(n − 1))-distributed time; if an edge is present, the edge is removed after an Exp(α)-distributed time.
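Since the vertex pairs evolve independently, each edge is a two-state Markov chain that, in stationarity, is present with probability (β/(n − 1))/(β/(n − 1) + α), and by independence the stationary graph is an Erdős-Rényi graph with this edge probability. A quick sanity check of the single-edge chain (a sketch; names are ours):

```python
import random

def edge_on_fraction(on_rate, off_rate, t_end, rng):
    """Long-run fraction of time a single edge is present, for the
    on-off chain: absent -> present at rate on_rate (= beta/(n-1)),
    present -> absent at rate off_rate (= alpha)."""
    t, on, time_on = 0.0, False, 0.0
    while t < t_end:
        rate = off_rate if on else on_rate
        hold = min(rng.expovariate(rate), t_end - t)
        if on:
            time_on += hold
        t += hold
        on = not on
    return time_on / t_end
```

The simulated fraction converges to on_rate/(on_rate + off_rate), as claimed; summing over the n(n − 1)/2 pairs, the expected number of edges in stationarity is roughly nβ/(2α) for large n.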

It is straightforward to show that the stationary distribution of the dynamic graph is the static Erdős-Rényi graph with edge probability p = β/(β + α(n − 1)). Our first result concerns the time it takes for the graph to reach stationarity. We show that the (fastest) time to stationarity is distributed as the maximum of n(n − 1)/2 independent exponentially distributed random variables with rate α + β/(n − 1). This implies that the graph enters stationarity roughly at time 2 log(n)/α. We also show that among all strong stationary times this is the fastest one.

The main result concerns the time it takes until the dynamic graph contains i = [cn] edges, where c is a positive constant. For large n the expected time to go from 0 to i = [cn] edges exhibits three different behaviors depending on the value of c, i.e. a phase transition occurs. For c < β/(2α) the graph reaches i edges after roughly a constant time; for c = β/(2α) the graph reaches i edges after a logarithmic time (in n); while for c > β/(2α) the graph reaches i edges after an exponentially large time (in n).

¹This section is a modified replication of Section 2.1 of [28].

Using the above results we also spend some time answering the question of how a large component emerges in this model. Is it from many edges being present, or through an unlikely configuration of few edges? In the critical case, β = α, it turns out that a large component tends to emerge through an unlikely configuration of few edges, meaning that a large component emerges with fewer edges present in the graph than would be needed in the static setting. The reason for only studying the critical case for this question was mathematical tractability. For the sub- and supercritical cases we provide an order expression for how long one has to wait for a component of a given size to emerge.

Open Problems

For the sub- and supercritical cases the question of how a large component emerges was left as an open problem. We believe the same methods used for the critical case can be applied to the other cases, but with added technical difficulties. There are of course many other properties of the model that can be studied: the time until the graph is fully connected, or, once a giant component emerges, how long it persists.

3.2 Paper II²

In this paper, we introduce a multi-type preferential attachment tree, and study it using general multi-type branching processes, following an extension of the analysis first introduced in [29]. The motivation for extending the preferential attachment tree to the multi-type case is mainly that it allows us to model trees exhibiting some level of homophily or heterophily. We derive a framework for studying the model in continuous time, where a type i vertex generates new type j vertices at rate wij(n1, n2, . . . , np), where nk is the number of type k vertices previously generated by the vertex, and the attachment rate wij is a function from Np to R. The reason for re-casting the model in continuous time is that it then fits into a general multi-type branching process framework, and can be analyzed with the powerful tools therein. By inspecting the model only at the birth times of vertices we recover a discrete model. For instance, the one-type case with attachment rate w11(n1) = γ11n1 + β11 reduces to the standard preferential attachment tree.

We derive results concerning the asymptotic degree distribution and composition (total proportion of vertices of a specific type) for general attachment rates. We then apply the theory to models with more specific attachment rates. In the case with linear preferential attachment — where type i vertices generate new type j vertices at rates depending on the vertex's total degree, i.e. wij(n1, n2, . . . , np) = γij(n1 + n2 + ··· + np) + βij, γij ≥ 0, βij ≥ 0 — the main result concerns the asymptotic degree distribution. We show that under

²This section is a modified replication of Section 2.2 of [28].

mild regularity conditions on the parameters {γij}, {βij} the asymptotic degree distribution of a vertex is a power-law distribution. The proportion of type i vertices with k children (of any type) satisfies

pi(k) ≈ C · k^(−(1 + α/(γi1 + ··· + γip))) if γi1 + ··· + γip > 0,

where α is the Malthusian parameter of the multi-type branching process associated with the graph, and C is a constant.
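For intuition, consider the one-type case with attachment rate w(k) = k + 1 (γ = β = 1). Here the Malthusian parameter is α = 2, so the predicted exponent is 1 + α/γ = 3; a classical computation gives p(k) = 4/((k + 1)(k + 2)(k + 3)), so in particular two thirds of the vertices are leaves. A simulation sketch (the weighted-list trick and all names are ours):

```python
import random
from collections import Counter

def pa_tree(n, rng):
    """One-type preferential attachment tree: each new vertex attaches
    to an existing vertex v with probability proportional to
    (#children of v) + 1, i.e. attachment rate w(k) = k + 1.
    Returns a Counter mapping vertex -> number of children."""
    children = Counter()
    # 'ends' holds one entry per unit of attachment weight, so a
    # uniform pick from it is a weight-proportional pick of a vertex
    ends = [0]                    # vertex 0 starts with weight 0 + 1
    for v in range(1, n):
        parent = rng.choice(ends)
        children[parent] += 1
        ends.append(parent)       # parent's weight grew by 1
        ends.append(v)            # new vertex enters with weight 1
    return children
```

Tabulating the empirical children distribution of a large tree and comparing it with 4/((k + 1)(k + 2)(k + 3)) illustrates the power-law behavior of the theorem in the simplest case.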

Open Problems

An obvious drawback of the model is its tree structure, which restricts potential applications. A natural next step would be to extend the tree to a graph. At first glance, it would seem that one has to analyze such a model without the powerful framework of multi-type branching processes. However, in a recent paper [19] it was shown that it is possible to collapse trees, generated by continuous-time branching processes, into multigraphs, thereby allowing for more realistic models while keeping some powerful tools for analysis. Following [14], an extension allowing for vertex death should also be possible to analyze within the multi-type framework. Such an extension would allow for studying questions regarding component structure, since the model no longer results in a tree.

3.3 Paper III

Paper III deals with the size of the giant component in the configuration model. We show that, for a large family of degree distributions, the size of the largest component is largely determined by the distribution over small degrees; that is, the tail behavior of the degree distribution is not essential. This is not surprising, since the size of the largest component is connected to the survival probability of an approximating branching process. The survival probability of a branching process with offspring distribution {pk} is given by 1 − s, where s ∈ [0, 1] is the smallest solution in [0, 1] to the equation

s = Σ_{k=0}^∞ pk s^k.

Since s ∈ [0, 1], we see that it is indeed the distribution over smaller degrees that matters most. Paper III, however, quantifies the importance of the small degrees in some detail. We fix the probabilities (p1, . . . , pL) of the first L degrees in the degree distribution, as well as the mean degree µ, and ask how large or small the largest component can be. Under two technical conditions on p1, . . . , pL we give optimal bounds on the size of the largest component. The lower bound is achieved by putting most of the remaining probability mass 1 − Σ_{k=0}^{L} pk on the integer L + 1 and a vanishingly small probability on an integer m (to keep µ fixed), and then letting m → ∞. The upper bound is achieved by the degree distribution which places all remaining probability mass on the two consecutive integers κlo = sup{k ∈ N : k ≤ E(D | D > L)} and κhi = inf{k ∈ N : k ≥ E(D | D > L)}. These bounds are valid under two technical conditions on p1, . . . , pL, but numerical investigation suggests that these conditions are mild. Furthermore, numerical examples show that the lower and upper bounds are close already for small values of L, supporting the claim that, if the distribution over small degrees is fixed, then the size of the giant component is not affected much by the tail of the degree distribution.
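Since the offspring generating function is increasing and convex on [0, 1], iterating s ↦ Σ_k pk s^k from s = 0 converges monotonically to its smallest root, giving a quick way to compute the survival probability. A small sketch (names are ours), here with Poisson offspring:

```python
import math

def survival_probability(pgf, iters=200):
    """Survival probability 1 - s of a Galton-Watson branching process,
    where s is the smallest root in [0, 1] of s = pgf(s); iterating
    s -> pgf(s) from 0 converges monotonically to that root."""
    s = 0.0
    for _ in range(iters):
        s = pgf(s)
    return 1.0 - s

# Poisson(mu) offspring has pgf(s) = exp(mu * (s - 1))
supercritical = survival_probability(lambda s: math.exp(2.0 * (s - 1.0)))  # about 0.797
subcritical = survival_probability(lambda s: math.exp(0.5 * (s - 1.0)))   # 0: dies out
```

Modifying the offspring distribution far out in the tail changes the value of the pgf on [0, 1] very little, which is the heuristic behind the insensitivity to the tail discussed above.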

Open Problems

It is quite unsatisfactory to add technical conditions that provide no insight just to make a proof work. We do believe that some condition is needed to ensure that the considered family of degree distributions does not simultaneously contain both sub- and supercritical degree distributions, but the condition could be more intuitive.

3.4 Paper IV

In Paper IV, we consider the frog model on Zd and the two-type extension of it introduced in [15]. Throughout the paper, we restrict ourselves to frog models driven by lazy random walks. For the one-type model we show that the asymptotic shape of the set of discovered sites does not depend on how the process is started. That is to say, if two independent frog models have different sets activated at the start, but the same particle distribution on all other sites, they will have the same asymptotic shape.

In the two-type model, we say that the types co-exist if both of them activate infinitely many particles. In [15], it is shown that co-existence is possible with starting configuration (0, 1) if and only if it is possible with starting configuration (0, z). Paper IV extends this result somewhat. We relax the starting conditions from single sites to bounded disjoint sets, and also allow the particle configurations on these sets to be arbitrary, i.e. the particles on these sets do not have to be drawn from the same distribution as the rest of the sites (see Section 2.1.2).

Open Problems

There are some natural possible extensions of the results. For instance, we believe that all the results will hold for models driven by non-lazy random walks. This will require different proofs, since the present arguments are based on couplings where we need to be able to force particles to stay put.

3.5 Paper V

Paper V concerns the predictability of the asymptotic shape generated by first passage percolation. As mentioned in Section 2.1.1, first passage percolation exhibits a shape theorem on Zd, i.e. the set of discovered/infected sites converges to a deterministic set if properly scaled by time. For a given passage time distribution, the only general way to get an adequate impression of the shape is to use simulations. This has the potential drawback of being time-inefficient, and in some cases it might be better to have a method that is slightly less accurate but faster. In Paper V, we investigate if this is possible. In the paper we simulate the shape B for a variety of passage times, and then fit a neural network to the data. We show that, for our simulated data, we can adequately predict the (simulated) shape from some easily calculable properties of the passage time distribution. The input from the distribution consists of its percentiles, mean, and standard deviation.

The reason for choosing a neural network approach is that, compared to traditional statistical methods, neural networks can be a good alternative under the following assumptions.

(i) Data is plentiful.

(ii) There is little a priori knowledge (or difficult to extract for modeling purposes) about the functional relationship between data and outcome.

(iii) The functional relationship is potentially non-linear.

All of the above hold for our problem: we can simulate data; not much is known, and what is known is difficult to use for modeling purposes; and by inspecting simulated shapes we know the relationship is non-linear. For the simulated data we used passage times from the normal, uniform, and beta distributions. Generalization works well for new passage times within these distribution families, with a slight decline in performance when tested on Pareto distributed passage times. Figure 3.1 illustrates how the neural network performs on training data. Note that, since we know the shape is symmetric, it is enough to predict it only above the line y = x in the first quadrant.
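For readers unfamiliar with the setup, the following toy sketch shows the type of regression involved: a one-hidden-layer network trained by gradient descent, mapping a feature vector (standing in for the percentiles, mean, and standard deviation) to a few outputs (standing in for the radius of the shape in a few directions). The architecture, sizes, and synthetic data are our own illustrative choices, not those of Paper V.

```python
import math
import random

class TinyMLP:
    """One-hidden-layer network with tanh activation, trained by plain
    stochastic gradient descent on a squared-error loss."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = [[rng.gauss(0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr):
        """One SGD step on a single (x, target) pair; returns the loss."""
        h, y = self.forward(x)
        dy = [2.0 * (yi - ti) for yi, ti in zip(y, target)]
        dh = [0.0] * len(h)
        for j, row in enumerate(self.w2):
            for i, hi in enumerate(h):
                dh[i] += dy[j] * row[i]    # backprop through w2 (pre-update)
                row[i] -= lr * dy[j] * hi
            self.b2[j] -= lr * dy[j]
        for i, row in enumerate(self.w1):
            g = dh[i] * (1.0 - h[i] ** 2)  # tanh derivative
            for k, xk in enumerate(x):
                row[k] -= lr * g * xk
            self.b1[i] -= lr * g
        return sum((yi - ti) ** 2 for yi, ti in zip(y, target))
```

In the paper, training data instead comes from simulated shapes for many passage time distributions, and the trained network offers a fast, slightly less accurate alternative to running a new simulation.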

Figure 3.1: Illustration of the neural network model predicting the shape for training data, for three representative distributions: (a) normal, (b) gamma, and (c) beta (Figure 2 in Paper V).

The results are promising, and since this area of discrete probability is rather untouched by modern methods our hope is that these results will stimulate further research in this area.

Open Problems

There are of course many open problems here. For instance, will the predictive ability of the model increase or decrease if we add more distributional families to the training data? Can other machine learning methods produce better results, or generate insights about the shape?

Sammanfattning

Random graphs are a well-studied area of probability theory, and have proven useful in a wide range of applications, such as the modeling of social networks, epidemics, and structures on the Internet. However, most random graph models are static in the sense that the structure does not change over time. They also tend to consist of objects of one and the same type. This limits possible applications. The first part of the thesis focuses on how random graph models can be generalized to dynamic situations, and to situations where the constituent objects may be of several different types. The second part of the thesis deals with stochastic growth models. Stochastic growth models are an important part of probability theory and, as the name suggests, describe the random growth of some object. Typical examples include how a liquid flows through a medium, tumor growth, and epidemics. Properly scaled by time, these models tend to grow in a more or less deterministic way. The second theme of the thesis concerns the asymptotic shape of this growth.

In Paper I we study a dynamic version of the well-known Erdős-Rényi graph. The graph changes dynamically over time but still has the static Erdős-Rényi graph as its stationary distribution. The first result describes the time it takes for the dynamic graph to reach stationarity, and the second result concerns the time it takes for the graph to reach a certain number of edges. We also study the time until a large component emerges.

In Paper II we introduce and study an extension of the preferential attachment tree. The standard model is already dynamic, but its vertices are only allowed to be of one type. We introduce a generalization of this model to a situation where vertices may be of different types, and study its asymptotic degree distribution and composition.

Paper III deals with the configuration model, a random graph that is static and consists of vertices of a single type. We do not generalize the model, thus breaking with the first theme of the thesis. Instead, we argue that the size of the largest component of the graph depends more on the degree distribution over small degrees than on the tail of the degree distribution. This claim is quantified in detail.

Paper IV deals with the frog model on Zd and a two-type version of it. In the one-type case, we show that the asymptotic shape does not depend on the initial starting set or the particle configuration there. For the two-type model, we show that the possibility of co-existence of the types likewise does not depend on the starting sets or the particle configurations there.

Paper V concerns the asymptotic shape of first passage percolation, and whether it can be predicted. First passage percolation has the property that the set of discovered sites, properly scaled by time, converges to a deterministic set. However, not much is known about this set, and the only way to get an impression of it is through simulations. We simulate data, train a neural network on the data, and show that it is possible to predict the asymptotic shape surprisingly well from a few simple attributes of the passage times. Our aim with the paper is to provide researchers with a proof of concept of this methodology, as well as a tool for quickly getting an impression of the asymptotic shape.

Bibliography

[1] Americans satisfied with number of friends, closeness of friendships. https://news.gallup.com/poll/10891/americans-satisfied-number-friends-closeness-friendships.aspx. Accessed: 2020-07-08.

[2] Bacon number. https://www.cs.dartmouth.edu/~scot/cs10/lab/lab5/lab5.html. Accessed: 2020-07-08.

[3] Six degrees of wikipedia. https://www.sixdegreesofwikipedia.com/. Accessed: 2020-07-08.

[4] S. E. Alm and M. Deijfen, First passage percolation on Z2: A simulation study, Journal of Statistical Physics, 161 (2015), pp. 657–678.

[5] M. Altmann, Susceptible-infected-removed epidemic models with dynamic partnerships, J. Math. Biol., 33 (1995), pp. 661–675.

[6] O. Alves, F. Machado, S. Popov, and K. Ravishankar, The shape theorem for the frog model with random initial configuration, Markov Process. Relat. Fields, 7 (2001).

[7] H. Andersson and T. Britton, Stochastic Epidemic Models and Their Statistical Analysis, Volume 151 of Lecture Notes in Statistics, Springer, 2000.

[8] A. Auffinger, M. Damron, and J. Hanson, 50 years of first passage percolation, 2015.

[9] A.-L. Barabási and R. Albert, Emergence of scaling in random networks, Science, 286 (1999), pp. 509–512.

[10] S. Bhamidi, R. van der Hofstad, and G. Hooghiemstra, First passage percolation on random graphs with finite mean degrees, Ann. Appl. Probab., 20 (2010), pp. 1907–1965.

[11] B. Bollobás, O. Riordan, J. Spencer, and G. Tusnády, The degree sequence of a scale-free random graph process, Random Structures and Algorithms, 18 (2001), pp. 279–290.

[12] B. Bollobás, Random Graphs, Cambridge University Press, Cambridge, 2001.

[13] J. T. Cox and R. Durrett, Some limit theorems for percolation processes with necessary and sufficient conditions, Ann. Probab., 9 (1981), pp. 583–603.

[14] M. Deijfen, Random networks with preferential growth and vertex death, Journal of Applied Probability, 47 (2010), pp. 1150–1163.

[15] M. Deijfen, T. Hirscher, and F. Lopes, Competing frogs on Zd, Electronic Journal of Probability, 24 (2019).

[16] O. Diekmann, H. Heesterbeek, and T. Britton, Mathematical Tools for Understanding Infectious Disease Dynamics, Princeton University Press, 2013.

[17] R. Durrett, Random Graph Dynamics (Cambridge Series in Statistical and Probabilistic Mathematics), Cambridge University Press, New York, NY, USA, 2006.

[18] P. Erdős and A. Rényi, On the evolution of random graphs, Publication of the Mathematical Institute of the Hungarian Academy of Sciences, (1960), pp. 17–61.

[19] A. Garavaglia and R. van der Hofstad, From trees to graphs: collapsing continuous-time branching processes, 2017.

[20] E. N. Gilbert, Random graphs, Ann. Math. Statist., 30 (1959), pp. 1141–1144.

[21] G. Grimmett, Percolation, Springer-Verlag Berlin Heidelberg, 2 ed., 1999.

[22] J. M. Hammersley and D. J. A. Welsh, First-Passage Percolation, Subadditive Processes, Stochastic Networks, and Generalized Renewal Theory, Springer Berlin Heidelberg, Berlin, Heidelberg, 1965, pp. 61–110.

[23] C. Hoffman, T. Johnson, and M. Junge, Infection spread for the frog model on trees, Electron. J. Probab., 24 (2019), 29 pp.

[24] S. Janson and M. J. Luczak, A new approach to the giant component problem, Random Structures & Algorithms, 34 (2009), pp. 197–216.

[25] T. M. Liggett, Interacting Particle Systems, Springer-Verlag Berlin Heidelberg, 2005.

[26] S. Milgram, The small world problem, Psychology Today, 2.1 (1967), pp. 60–67.

[27] M. Newman, A.-L. Barabasi, and D. J. Watts, The Structure and Dynamics of Networks (Princeton Studies in Complexity), Princeton University Press, Princeton, NJ, USA, 2006.

[28] S. Rosengren, Random graphs: Dynamic and multi-type extensions, Licentiate thesis, Stockholm University (2017), diva2:1141654.

[29] A. Rudas, B. Tóth, and B. Valkó, Random trees and general branching processes, Random Structures & Algorithms, 31 (2007), pp. 186–202.

[30] A. Telcs and N. C. Wormald, Branching and tree indexed random walks on fractals, Journal of Applied Probability, 36 (1999), pp. 999–1011.

[31] R. van der Hofstad, Random graphs and complex networks. Volume 2. https://www.win.tue.nl/~rhofstad/NotesRGCNII.pdf. Accessed: 2020-08-12.

[32] R. van der Hofstad, Random Graphs and Complex Networks. Volume I, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge, 2017.

[33] R. van der Hofstad, G. Hooghiemstra, and P. Van Mieghem, Distances in random graphs with finite variance degrees, Random Structures & Algorithms, 27 (2005), pp. 76–123.


Part II

Papers