Models of networks (synthetic networks or generative models)

Prof. Ralucca Gera, Applied Mathematics Dept. Naval Postgraduate School Monterey, California [email protected]

Excellence Through Knowledge

Learning Outcomes

• Identify network models and explain their structures;
• Contrast networks and synthetic models;
• Understand how to design new network models (based on the existing ones and on the collected data);
• Distinguish methodologies used in analyzing networks.

The three papers for each of the models

Synthetic models are used as reference/null models to compare against and build new complex networks

• “On Random Graphs I” by Paul Erdős and Alfréd Rényi in Publicationes Mathematicae (1959). Times cited: 3,517 (as of January 1, 2015)
• “Collective dynamics of ‘small-world’ networks” by Duncan Watts and Steve Strogatz in Nature (1998). Times cited: 24,535 (as of January 1, 2015)
• “Emergence of scaling in random networks” by Albert-László Barabási and Réka Albert in Science (1999). Times cited: 21,418 (as of January 1, 2015)

Why care?

• Epidemiology:
  – A virus propagates much faster in scale-free networks.
  – Vaccination of random nodes in scale-free networks does not work, but targeted vaccination is very effective.
• Create synthetic networks to be used as null models:
  – What effect does the degree distribution alone have on the behavior of the system? (answered by comparing to the configuration model)
• Create networks of different sizes:
  – Networks of particular sizes and structures can be generated quickly and cheaply, instead of collecting and cleaning data, which takes time.

Reference network: Regular Lattice

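The 1-dimensional lattice is the Harary graph H(n, r), or equivalently the circulant graph C(1, 2, …, r/2) on n vertices (for even r): start with an n-cycle, and make each vertex adjacent to the r/2 nearest vertices on its left and the r/2 nearest vertices on its right, so that every vertex has degree r.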
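A minimal sketch of this construction in NetworkX, assuming r is even and r < n. The helper name `ring_lattice` and the values n = 12, r = 4 are illustrative choices, not part of the slides; `nx.circulant_graph` is the library generator being relied on.

```python
import networkx as nx

def ring_lattice(n, r):
    """1-D lattice: join each vertex to its r/2 nearest vertices on each side."""
    if r % 2 != 0 or r >= n:
        raise ValueError("this sketch assumes r is even and r < n")
    # Offsets 1..r/2 connect vertex i to i±1, ..., i±r/2 (mod n).
    return nx.circulant_graph(n, list(range(1, r // 2 + 1)))

lattice = ring_lattice(n=12, r=4)
degrees = {d for _, d in lattice.degree()}
print(degrees)                    # the set of distinct degrees: only r
print(lattice.number_of_edges())  # n * r / 2 edges
```

Every vertex ends up with the same degree r, which is what makes the lattice a convenient fully ordered reference network.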

Source: http://mathworld.wolfram.com/CirculantGraph.html

a particular Circulant graph 𝐶(1, 2, …, r):

Source: http://mathworld.wolfram.com/CirculantGraph.html

• Higher-dimensional lattices are generalizations of these. An example of a 2-dimensional lattice is the hexagonal lattice of graphene, a single layer of carbon atoms with a honeycomb lattice structure.

Source: http://phys.org/news/2013-05-intriguing-state-previously-graphene-like-materials.html

Erdős-Rényi Random Graphs (1959)

Random graphs (Erdős-Rényi, 1959)

ER model: created at random with fixed parameters
• G(n, m): fix n (node count) and m (edge count)
• G(n, p): fix n and the probability p that an edge exists between any two vertices (m is not fixed)
  – The mean number of edges: $\langle m \rangle = p \binom{n}{2}$
  – The average degree: $\langle k \rangle = (n-1)\,p$
  – The probability of finding a node of degree $k$ is binomial: $P(k) = \binom{n-1}{k}\, p^{k} (1-p)^{n-1-k}$
• Constructing these graphs in Gephi requires a plug-in.
• NetworkX has more synthetic models and classes.

Creating G(n,m)

• To make a random network:
  – take n nodes,
  – place m unlabeled edges at random between the n vertices.
• Put the graph in a box, make another one and put it in the box, and another one…
• Pull one network at random out of the box: it will have a bell-shaped ("classic", roughly Normal) degree distribution, in which almost everyone has close to the average number of friends.

Creating G(n,m) – method 2

Method two, equivalent to the first:
• To make a random network:
  – take n nodes,
  – choose m pairs of nodes at random,
  – place an edge between the two nodes of each chosen pair.
• The average degree: $\langle k \rangle = \frac{1}{n}\sum_{i=1}^{n} k_i = \frac{2m}{n}$, where $k_i$ is often used to denote the degree of vertex $i$ in complex networks (enumerate the vertices 1, 2, …, n).
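A minimal sketch of method two, assuming the m pairs are distinct (so the result is a simple graph). The helper name `gnm_by_sampling`, the sizes and the seed are illustrative; `nx.gnm_random_graph` is the NetworkX generator for the same model.

```python
import itertools
import random

import networkx as nx

def gnm_by_sampling(n, m, seed=None):
    """Pick m distinct node pairs uniformly at random and join each pair."""
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(range(n), 2))
    G = nx.Graph()
    G.add_nodes_from(range(n))
    G.add_edges_from(rng.sample(all_pairs, m))
    return G

n, m = 50, 100
G = gnm_by_sampling(n, m, seed=1)
avg_degree = sum(d for _, d in G.degree()) / n
print(G.number_of_edges(), avg_degree)  # m edges, average degree 2m/n

# The library generator for the same model:
H = nx.gnm_random_graph(n, m, seed=1)
print(H.number_of_edges())
```

Listing all pairs is fine for small n; for large sparse graphs one would sample pairs without materializing the full list.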

Creating G(n,p)

• To create a random network:
  – take n nodes,
  – fix one probability p for the whole graph,
  – join each pair of nodes by an edge independently with probability p.
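The G(n,p) recipe can be sketched as one coin flip per pair of nodes. The helper name `gnp_by_coin_flips`, the sizes and the seed are illustrative assumptions. The comparison values are the standard G(n,p) expectations, p·n(n−1)/2 edges and (n−1)p average degree.

```python
import itertools
import random

import networkx as nx

def gnp_by_coin_flips(n, p, seed=None):
    """Join every pair of nodes independently with probability p."""
    rng = random.Random(seed)
    G = nx.Graph()
    G.add_nodes_from(range(n))
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            G.add_edge(u, v)
    return G

n, p = 400, 0.05
G = gnp_by_coin_flips(n, p, seed=7)

expected_edges = p * n * (n - 1) / 2   # mean number of edges
expected_degree = (n - 1) * p          # mean degree
observed_degree = 2 * G.number_of_edges() / n

print(G.number_of_edges(), expected_edges)
print(observed_degree, expected_degree)
```

Unlike G(n,m), the edge count here changes from run to run; only its expectation is fixed.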

Degree distribution for both G(n,m) and G(n,p)

Results about E-R graphs:

• Degree distribution: Binomial
• Average path is small compared to n: $\ell \approx \frac{\ln n}{\ln \langle k \rangle}$, where $\langle k \rangle$ is the average degree
  – Comparable to the average path of the observed networks
• Clustering coefficient $C$ is small: $C = p \approx \frac{\langle k \rangle}{n}$ (the probability that two neighbors of a node are connected is equal to the probability of any two random nodes being connected)
  – However, observed networks have high clustering.

Generating Erdős-Rényi ER(n,p)

• ER graphs are models of a network in which a specific set of parameters takes fixed values, but the construction of the network is otherwise random (illustrated in Gephi on the slide)
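The results about E-R graphs (short average path, low clustering) can be checked numerically. This is a rough sketch. The values of n, p and the seed are arbitrary, the path length is measured on the largest connected component, and ln n / ln⟨k⟩ is only the usual order-of-magnitude estimate, not an exact prediction.

```python
import math

import networkx as nx

n, p = 500, 0.03
G = nx.gnp_random_graph(n, p, seed=3)

# Path lengths are only defined inside a connected piece of the graph.
giant = G.subgraph(max(nx.connected_components(G), key=len))

avg_degree = 2 * G.number_of_edges() / n
avg_path = nx.average_shortest_path_length(giant)
clustering = nx.average_clustering(G)

print("average path:", avg_path, "estimate:", math.log(n) / math.log(avg_degree))
print("clustering:", clustering, "p:", p)
```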

Generating Erdős-Rényi ER(n,m)

Generating Erdős-Rényi random networks

Reference for Python: http://networkx.lanl.gov/reference/generated/networkx.generators.random_graphs.erdos_renyi_graph.html#networkx.generators.random_graphs.erdos_renyi_graph

The Random Geometric model

Random Geometric Model

• Again the connections are created at random, but based on proximity (as in ad hoc networks).
• Proximity is relevant: for each pair of nodes $u$, $v$, the edge $uv$ is created with probability $p$ if $d(u, v) \le r$, for a given fixed $r$.
• There is no perfect model for the world around us, not even for specific types of networks.
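A minimal sketch of the proximity rule, assuming nodes are placed uniformly at random in the unit square and distance is Euclidean. The helper name `random_geometric` and the symbols `r` (radius) and `p` (connection probability for nearby pairs) are illustrative; with p = 1 every pair within distance r is joined.

```python
import itertools
import math
import random

import networkx as nx

def random_geometric(n, r, p=1.0, seed=None):
    """Place n nodes at random; join pairs closer than r with probability p."""
    rng = random.Random(seed)
    pos = {i: (rng.random(), rng.random()) for i in range(n)}
    G = nx.Graph()
    G.add_nodes_from(pos)
    for u, v in itertools.combinations(range(n), 2):
        if math.dist(pos[u], pos[v]) <= r and rng.random() < p:
            G.add_edge(u, v)
    nx.set_node_attributes(G, pos, "pos")
    return G

G = random_geometric(n=100, r=0.2, seed=5)
pos = nx.get_node_attributes(G, "pos")
longest_edge = max(math.dist(pos[u], pos[v]) for u, v in G.edges())
print(G.number_of_edges(), longest_edge)  # no edge is longer than r
```

NetworkX also provides `nx.random_geometric_graph(n, radius)`, which joins every pair within the radius.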

An example of a random geometric graph

https://www.youtube.com/watch?v=NUisb1-INIE

Creating it in Python

https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.generators.geometric.random_geometric_graph.html#networkx.generators.geometric.random_geometric_graph

The Molloy-Reed Configuration model (1995)

The configuration model

• A model created from a degree sequence of choice (can be scale-free)

• More than the degree sequence may need to be controlled in order to create realistic models

The MR configuration model

• A random graph model created based on a degree sequence of choice: 4, 3, 2, 2, 2, 1, 1, 1

Step 1:

Step 2:

Or this step 2:
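One common way to realize a configuration model is stub matching: give node i as many half-edges ("stubs") as its degree, then pair the stubs at random. The sketch below assumes this is what the two steps depict. The helper name `configuration_model_by_stubs` and the seed are illustrative; the degree sequence is the one from the slide.

```python
import random

import networkx as nx

def configuration_model_by_stubs(degree_sequence, seed=None):
    """Random multigraph with the given degrees, built by pairing stubs."""
    if sum(degree_sequence) % 2 != 0:
        raise ValueError("the degrees must sum to an even number")
    rng = random.Random(seed)
    # Step 1: node i contributes degree_sequence[i] stubs.
    stubs = [node for node, k in enumerate(degree_sequence) for _ in range(k)]
    # Step 2: shuffle the stubs and join them two at a time.
    rng.shuffle(stubs)
    G = nx.MultiGraph()
    G.add_nodes_from(range(len(degree_sequence)))
    G.add_edges_from(zip(stubs[::2], stubs[1::2]))
    return G

sequence = [4, 3, 2, 2, 2, 1, 1, 1]
G = configuration_model_by_stubs(sequence, seed=2)
realized = [d for _, d in G.degree()]
print(realized)                    # same as the input sequence
print(nx.number_of_selfloops(G))   # loops can occur
```

Different shuffles give different graphs with the same degrees, which is why the slide shows two possible outcomes for step 2. The pairing may create loops and parallel edges, hence the multigraph.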

Mathematical properties

• Let $i$ and $j$ be two nodes.
• Expectation of $ij$ to be an edge:
  – Pick an edge out of the $m$ edges in $G$: the probability that the left end node is $i$ is $\frac{k_i}{2m}$ ($k_i$ is its degree), and the probability that the right end node is $j$ is $\frac{k_j}{2m}$, and so: $p_{ij} = \frac{k_i k_j}{2m}$ (used $2m$ since each edge is counted from each of its two ends)
• Expectation of a multi edge:
  – Given that $ij \in E(G)$, the probability that it will be an edge again is $\frac{(k_i - 1)(k_j - 1)}{2m}$, and so the probability of both happening is $\frac{k_i k_j}{2m} \cdot \frac{(k_i - 1)(k_j - 1)}{2m}$, which simplifies to: $\frac{k_i k_j (k_i - 1)(k_j - 1)}{(2m)^2}$

Mathematical properties (parallel edges)

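Average degree: $\langle k \rangle = \frac{1}{n}\sum_i k_i$, and the average of their squares: $\langle k^2 \rangle = \frac{1}{n}\sum_i k_i^2$. Then, summing the probability above over all pairs of nodes (with $2m = n\langle k \rangle$), the expected number of parallel edges is: $\frac{1}{2}\left[\frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle}\right]^2$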

http://tuvalu.santafe.edu/~aaronc/courses/5352/csci5352_2017_L4.pdf

Mathematical properties (loops)

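1. Recall that for parallel edges, $p_{ij} = \frac{k_i k_j}{2m}$. For a loop, both ends are at node $i$ and one stub of node $i$ has already been used, thus the expected number of loops at node $i$ is $p_{ii} = \frac{k_i (k_i - 1)}{4m}$.
2. Summing over all nodes, with $2m = n\langle k \rangle$, this simplifies to the expected number of loops being $\sum_i \frac{k_i (k_i - 1)}{4m} = \frac{\langle k^2 \rangle - \langle k \rangle}{2\langle k \rangle}$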

Conclusion: Since the quantities in the equation in 2. above stay constant as the network grows, the expected number of loops and parallel edges stays bounded while the number of edges increases, so only a small fraction of edges are loops or parallel edges.
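A rough numerical check of this conclusion, using `nx.configuration_model` on a sequence in which every node has degree 3 (an arbitrary choice that keeps ⟨k⟩ and ⟨k²⟩ fixed as n grows). The helper name, the sample count and the seeds are illustrative assumptions. For this sequence the loop estimate (⟨k²⟩ − ⟨k⟩)/(2⟨k⟩) equals 1.

```python
import networkx as nx

def loops_and_parallel(G):
    """Count self-loops and surplus parallel copies in a multigraph."""
    loops = nx.number_of_selfloops(G)
    # nx.Graph(G) keeps a single copy of every edge, so the difference
    # in edge counts is the number of surplus copies.
    parallel = G.number_of_edges() - nx.Graph(G).number_of_edges()
    return loops, parallel

degree, samples = 3, 40
expected_loops = (degree**2 - degree) / (2 * degree)  # (<k^2> - <k>) / (2<k>)

averages = {}
for n in (100, 1000):
    total_loops = total_parallel = 0
    for s in range(samples):
        G = nx.configuration_model([degree] * n, seed=s)
        loops, parallel = loops_and_parallel(G)
        total_loops += loops
        total_parallel += parallel
    m = n * degree // 2  # edges per sampled graph
    avg_loops = total_loops / samples
    bad_fraction = (total_loops + total_parallel) / (samples * m)
    averages[n] = (avg_loops, bad_fraction)
    print(n, avg_loops, expected_loops, bad_fraction)
```

The average number of loops stays near the same value for both sizes, while the fraction of edges that are loops or parallel copies shrinks as n grows.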

http://tuvalu.santafe.edu/~aaronc/courses/5352/csci5352_2017_L4.pdf

Generating it in Python

https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.generators.degree_seq.configuration_model.html

Part 2

Coding it in CoCalc

• Go to www.CoCalc.com and create an account using your NPS email
• Create your new folder to copy the code
• Open the “MA4404-2019” folder to copy its contents to your new folder.

Copy contents to NEW folder

Make a copy

• Choose “CreateSyntheticNetworks.ipynb”

• Notice projects, folders & files

Create ER networks

Watts-Strogatz Small World Graphs (1998)

Small world models

• Duncan Watts and Steven Strogatz small world model: a few random links in an otherwise structured graph make the network a small world: the average shortest path is short

• regular lattice (one type of structure): my friend’s friend is always my friend
• small world: mostly structured with a few random connections
• random graph: all connections happen at random

Source: Watts, D.J., Strogatz, S.H. (1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

Small worlds, between order and chaos

[Figure: three graphs ordered by increasing p (probability). Left: high clustering (.75), high average path. Right: low clustering, low average path. In between: small worlds.]

The graph on the left is ordered (probability p = 0), the graph in the middle is a "small world" graph (0 < p < 1), and the graph on the right is completely random (p = 1).

Source: http://www.bordalierinstitute.com/target1.html

Avg path and avg clustering

Variations of avg path and clustering as a function of the rewiring probability p
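A small sketch of how this variation could be reproduced with NetworkX. The values n = 200, k = 6, the list of p values and the seed are arbitrary illustrative choices. `nx.connected_watts_strogatz_graph` retries the rewiring until the graph is connected, so that the average path is defined.

```python
import networkx as nx

n, k = 200, 6  # n nodes, each joined to its k nearest ring neighbors
results = {}
for p in (0.0, 0.01, 0.1, 1.0):
    G = nx.connected_watts_strogatz_graph(n, k, p, seed=11)
    avg_path = nx.average_shortest_path_length(G)
    clustering = nx.average_clustering(G)
    results[p] = (avg_path, clustering)
    print(f"p={p:<5} avg path={avg_path:6.2f} clustering={clustering:.3f}")
```

The usual reading is that the average path drops quickly at small p while clustering stays high for longer, which is the small-world regime.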

https://pdfs.semanticscholar.org/8c4c/455de44fa99e73e79d6fddf008ca6ae0f9aa.pdf

Generating Watts-Strogatz WS (n, k, alpha)

Alpha is the rewiring probability

Generating Watts-Strogatz networks

.15 is the rewiring probability

http://networkx.lanl.gov/reference/generated/networkx.generators.random_graphs.watts_strogatz_graph.html#networkx.generators.random_graphs.watts_strogatz_graph

Barabási-Albert Scale free model (1999)

Network growth & resulting structure

• Random attachment: new node picks any existing node to attach to
• Preferential/fitness attachment: new node picks from existing nodes according to their degrees/fitness (high preference for high degree/fitness)
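The two growth rules can be compared with a short sketch in which every new node brings exactly one edge. The helper name `grow`, the size n = 2000 and the seed are illustrative assumptions. Preferential attachment is implemented by picking a random endpoint of a random existing edge, which selects a node with probability proportional to its degree.

```python
import random

import networkx as nx

def grow(n, preferential, seed=None):
    """Grow a network one node (and one edge) at a time."""
    rng = random.Random(seed)
    G = nx.Graph()
    G.add_edge(0, 1)
    endpoints = [0, 1]  # each node appears once per incident edge
    for new in range(2, n):
        if preferential:
            target = rng.choice(endpoints)  # probability proportional to degree
        else:
            target = rng.randrange(new)     # uniform over existing nodes
        G.add_edge(new, target)
        endpoints += [new, target]
    return G

n = 2000
random_growth = grow(n, preferential=False, seed=9)
preferential_growth = grow(n, preferential=True, seed=9)

max_random = max(d for _, d in random_growth.degree())
max_preferential = max(d for _, d in preferential_growth.degree())
print(max_random, max_preferential)  # largest degree under each rule
```

Both networks have the same number of nodes and edges; only the rule for choosing the target differs, and that is what produces hubs under preferential attachment.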

http://projects.si.umich.edu/netlearn/NetLogo4/RAndPrefAttachment.html

Scale-free

• Scale-free networks are a type of small world
• Whether static or evolutionary, they have
  – A power-law degree distribution: $p_k \sim k^{-\alpha}$
• Common ways to grow the network:
  – Preferential attachment based on degree (for Barabási-Albert type, the probability of attaching to node $i$ is $\Pi(k_i) = \frac{k_i}{\sum_j k_j}$, where $k_i$ is the degree of node $i$).
  – Preferential attachment based on fitness (preassigned values).

Power law networks

• Many real-world networks contain hubs: highly connected nodes
• Usually the distribution of edges is extremely skewed

[Figure: degree distribution. x-axis: degree (number of edges); y-axis: number of nodes of that degree. Annotations: many nodes with small degree; no “typical” degree node; fat tail: a few nodes with a very large degree.]

But is it really a power-law?

• A power-law will appear as a straight line on a log-log plot: let $p_k$ be the count of vertices of degree $k$. If $p_k = C k^{-\alpha}$, then $\ln p_k = -\alpha \ln k + c$.

[Figure: log of the number of nodes of that degree versus log of the degree]

• A deviation from a straight line could indicate a different distribution:
  – exponential
  – lognormal

Fitting distributions
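As a rough illustration of the straight-line test, the sketch below fits a least-squares line to ln p_k against ln k for a Barabási-Albert graph. The graph size, the seed and the plain unweighted fit are illustrative assumptions. A fit on the raw degree counts is only a crude diagnostic, since the sparse tail of the histogram is noisy.

```python
import math
from collections import Counter

import networkx as nx

G = nx.barabasi_albert_graph(5000, 2, seed=4)
counts = Counter(d for _, d in G.degree())  # p_k: number of vertices of degree k

xs = [math.log(k) for k in counts]
ys = [math.log(c) for c in counts.values()]

# Ordinary least squares for ln p_k = slope * ln k + intercept.
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = y_mean - slope * x_mean
alpha = -slope  # exponent estimate under the power-law assumption

print("slope:", slope, "intercept:", intercept, "alpha estimate:", alpha)
```

A clearly negative slope is consistent with a power law but does not rule out the exponential or lognormal alternatives listed above.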

Node (frame) and edge (inset) counts of European Airline Transportation Network's layers with distribution fitting.

http://faculty.nps.edu/rgera/ANGEL.html

European Airline Transportation Network's multilayer network: Degree histogram of the multiplexes with the log scale in the inset. Upper right: average shortest path, lower right: coefficient, per node

http://faculty.nps.edu/rgera/ANGEL.html

Scale Free networks

• One example was introduced by Albert-László Barabási and Réka Albert (BA model) as a degree-based preferential attachment:

  – Start with a small set of nodes and random edges
  – Attach new nodes one at a time;
    • each with the same fixed number $l$ of new edges, attaching to the existing nodes in the network, with preference for high degrees (once the high degrees appear)
  https://www.youtube.com/watch?v=5YdkhWB_uYQ
• Network growth (measured by node count).
• Not the only way to get scale-free networks!

Generating Barabási-Albert
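A minimal sketch with the NetworkX generator. The values n = 1000 and m = 3 and the seed are arbitrary. NetworkX calls the fixed number of new edges per node `m`, which plays the role of the fixed number of new edges per node in the description above.

```python
import networkx as nx

n, m = 1000, 3  # m: edges brought by each new node
G = nx.barabasi_albert_graph(n, m, seed=8)

degrees = sorted((d for _, d in G.degree()), reverse=True)
mean_degree = sum(degrees) / n

print(G.number_of_nodes(), G.number_of_edges())
print("largest degrees:", degrees[:5], "mean degree:", mean_degree)
```

The few largest degrees sit far above the mean, which is the hub structure expected from growth with preferential attachment.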

Generating Barabási-Albert networks

http://networkx.lanl.gov/reference/generated/networkx.generators.random_graphs.barabasi_albert_graph.html#networkx.generators.random_graphs.barabasi_albert_graph

Modified BA
• Many modifications of this model exist, based on:
  – Nodes “retiring” and losing their status/becoming outdated
  – Nodes disappearing (such as a website going down)
  – Links appearing or disappearing between the existing nodes (called internal links)
  – Fitness of nodes (modeling newcomers like Google)
• Most researchers still use the standard BA model when studying new phenomena and metrics.
  – It is a simple model (allows consistent research) that has growth and preferential attachment
  – One can add more conditions to this basic model, in order to mimic reality

A zoo of complex networks

Random, Small-World, Scale-Free

Scale Free networks:
1. High degree heterogeneity
2. Various levels of
3. Various levels of randomness

Man made, “large world”:

http://noduslabs.com/radar/types-networks-random-small-world-scale-free/

Main References

• Newman, “The structure and function of complex networks” (2003)
• Estrada, “The structure of complex networks” (2012)
• Barabási, “Network Science” (online: http://barabasi.com/networksciencebook/)
• References to the classes that exist in Python: http://networkx.lanl.gov/reference/generators.html

Back to coding in CoCalc
