Lecture 2: Structure (2)

How do the friendships we form connect us? Why are we all within a few clicks of each other on Facebook?

COMS 4995-2: Introduction to Social Networks Thursday September 15th

Outline

* Milgram’s “small world” experiment
* It’s a “combinatorial small world”
* It’s a “complex small world”
* It’s an “algorithmic small world”

So far we have seen

* Milgram: “Short chains connect us (Small world)”
  o After all, it might be a simple effect of randomness
  o Formally, in a random graph, if you take p(n) = ln(n)/n:
* The graph is sparse: as p(n) -> 0 when n grows large, everybody knows a vanishing fraction of the world
* But there is still a large connected component + small diameter!

* Q? Is that sufficient to explain our observation?
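The sparse-but-connected claim can be checked on a small simulated graph. The following is a minimal sketch, assuming an Erdős–Rényi graph built with the standard library only; the size n = 1000, the seed, and the helper names are illustrative choices, not part of the lecture.

```python
import math
import random
from collections import deque


def random_graph(n, p, rng):
    """G(n, p): each possible edge is present independently with probability p."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def largest_component(adj):
    """Set of nodes in the largest connected component."""
    seen, best = set(), set()
    for s in range(len(adj)):
        if s not in seen:
            comp = set(bfs_distances(adj, s))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best


rng = random.Random(0)   # fixed seed (assumption) so the run is repeatable
n = 1000                 # illustrative size
p = math.log(n) / n      # the regime quoted on the slide
adj = random_graph(n, p, rng)

mean_degree = sum(len(nbrs) for nbrs in adj) / n
giant = largest_component(adj)
# Longest BFS distance from 20 sampled sources: a lower bound on the diameter.
sources = rng.sample(sorted(giant), 20)
longest_path_seen = max(max(bfs_distances(adj, s).values()) for s in sources)

print(f"mean degree {mean_degree:.1f} out of {n - 1} possible neighbors")
print(f"largest component: {len(giant)} of {n} nodes")
print(f"longest shortest path seen: {longest_path_seen} hops (ln n = {math.log(n):.1f})")
```

Each node knows only a handful of the other nodes, yet almost all of them sit in one component and the sampled shortest paths stay within a few hops.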

Small world: where we are

* Remember this diagram
  o Now we can prove the small world property, assuming the graph is a sparse random graph.

[Figure 20.1: Social networks expand to reach many people in only a few steps. Layers shown: you, your friends, friends of your friends. (a) Pure exponential growth produces a small world. (b) Triadic closure reduces the growth rate.]

Textbook excerpt (20.2 Structure and Randomness, p. 613):

[...] people brings us to more than 100 · 100 · 100 = 1,000,000 people who in principle could be three steps away. In other words, the numbers are growing by powers of 100 with each step, bringing us to 100 million after four steps, and 10 billion after five steps.

There’s nothing mathematically wrong with this reasoning, but it’s not clear how much it tells us about real social networks. The difficulty already manifests itself with the second step, where we conclude that there may be more than 10,000 people within two steps of you. As we’ve seen, social networks abound in triangles (sets of three people who mutually know each other), and in particular, many of your 100 friends will know each other. As a result, when we think about the nodes you can reach by following edges from your friends, many of these edges go from one friend to another, not to the rest of the world, as illustrated schematically in Figure 20.1(b). The number 10,000 came from assuming that each of your 100 friends was linked to 100 new people; and without this, the number of friends you could reach in two steps could be much smaller.

So the effect of triadic closure in social networks works to limit the number of people you can reach by following short paths, as shown by the contrast between Figures 20.1(a) [...]
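The counting argument in the excerpt can be replayed in a few lines. This is a rough sketch, assuming a deliberately crude model: `closure_fraction` is a hypothetical parameter standing for the share of each person's friends who were already counted at an earlier step, and the value 0.5 is only an illustration.

```python
def people_at_each_step(friends, steps, closure_fraction=0.0):
    """Count of people first reached at step k, for k = 1..steps.

    closure_fraction is a hypothetical knob: the share of each person's
    friends who were already counted at an earlier step.
    """
    counts = [friends]                      # step 1: your own friends
    new_per_person = friends * (1.0 - closure_fraction)
    for _ in range(steps - 1):
        counts.append(round(counts[-1] * new_per_person))
    return counts


pure = people_at_each_step(100, 5)
with_closure = people_at_each_step(100, 5, closure_fraction=0.5)
print(pure)          # powers of 100, as in the excerpt
print(with_closure)  # the same walk when half of each friend list is already known
```

With no closure the counts are the powers of 100 quoted in the excerpt; once some friends overlap, growth is still exponential but with a smaller base.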
Outline

* Milgram’s “small world” experiment
* It’s a “combinatorial small world”
* It’s a “complex small world”
  o Granovetter’s observations, strong-weak ties
  o Triadic closure, Clustering Coefficient
* It’s an “algorithmic small world”

When do you need social networks?

* Looking for a job?
  o You need information, perhaps a recommendation
  o Topic studied for a very long time by sociologists
  o Society starts with the division of labor, creating interdependence between people
  o Who does what is not neutral; is it socially conditioned?
* 1970s interviews of people who had recently changed jobs:
  o Observation #1: jobs come from personal contacts
  o Observation #2: typically from a distant friendship

The Strength of Weak Ties, M. Granovetter, Am. J. Soc. (1973)

How to interpret Granovetter?

* Who are friends? Who are acquaintances?
  o Hypothesis: strong ties should follow the triadic closure property

[Figure 3.4 (nodes A, B, C, D, E, F, G, H, J, K): The A-B edge is a local bridge of span 4, since the removal of this edge would increase the distance between A and B to 4.]

Textbook excerpt (3.2 The Strength of Weak Ties, p. 51):

Bridges and Local Bridges. Let’s start by positing that information about good jobs is something that is relatively scarce; hearing about a promising job opportunity from someone suggests that they have access to a source of useful information that you don’t. Now consider this observation in the context of the simple social network drawn in Figure 3.3. The person labeled A has four friends in this picture, but one of her friendships is qualitatively different from the others: A’s links to C, D, and E connect her to a tightly-knit group of friends who all know each other, while the link to B seems to reach into a different part of the network. We could speculate, then, that the structural peculiarity of the link to B will translate into differences in the role it plays in A’s everyday life: while the tightly-knit group of nodes A, C, D, and E will all tend to be exposed to similar opinions and similar sources of information, A’s link to B offers her access to things she otherwise wouldn’t necessarily hear about.

To make precise the sense in which the A-B link is unusual, we introduce the following definition. We say that an edge joining two nodes A and B in a graph is a bridge if deleting the edge would cause A and B to lie in two different components. In other words, this edge is literally the only route between its endpoints, the nodes A and B.

Now, if our discussion in Chapter 2 about giant components and small-world properties taught us anything, it’s that bridges are presumably extremely rare in real social networks. You may have a friend from a very different background, and it may seem that your friendship is the only thing that bridges your world and his, but one expects in reality that there will [...]

Triadic closure

* A node satisfies the strong triadic closure property
  o if any two of her strong ties are themselves friends (linked by at least a weak tie)
* An edge e=(A,B) is a local bridge
  o if A and B have no friends in common
  o Intuitively, if you remove e, the distance between A and B increases to more than 2
* Theorem: if A has at least 2 strong ties and satisfies strong triadic closure, any local bridge involving A is a weak tie.
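The definitions above translate directly into set operations on an adjacency map. The sketch below assumes a hypothetical toy network (it is not one of the textbook figures) with every edge labeled "S" or "W"; all function names are illustrative.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy network (not a figure from the slides): edge -> "S" or "W".
TIES = {
    ("A", "C"): "S", ("A", "D"): "S", ("A", "E"): "S",
    ("C", "D"): "S", ("D", "E"): "S", ("C", "E"): "W",
    ("A", "B"): "W",
    ("B", "F"): "S", ("B", "G"): "S", ("F", "G"): "W",
    ("G", "H"): "W", ("H", "C"): "W",
}


def build_adjacency(ties):
    adj = {}
    for u, v in ties:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def strength(ties, u, v):
    """Label of the edge between u and v, whichever way it was stored."""
    return ties.get((u, v)) or ties.get((v, u))


def satisfies_strong_triadic_closure(ties, adj, node):
    """Every pair of strong-tie neighbors of node must be linked (S or W)."""
    strong = [v for v in adj[node] if strength(ties, node, v) == "S"]
    return all(w in adj[v] for v, w in combinations(strong, 2))


def is_local_bridge(adj, a, b):
    """An edge is a local bridge when its endpoints share no neighbor."""
    return not (adj[a] & adj[b])


def span(adj, a, b):
    """Distance between a and b once the edge (a, b) is deleted (None if cut off)."""
    dist = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if {u, v} == {a, b} or v in dist:
                continue
            dist[v] = dist[u] + 1
            queue.append(v)
    return dist.get(b)


def local_bridges_are_weak(ties, adj):
    """Check the theorem on every local bridge and both of its endpoints."""
    for (u, v), label in ties.items():
        if not is_local_bridge(adj, u, v):
            continue
        for x in (u, v):
            strong_count = sum(strength(ties, x, y) == "S" for y in adj[x])
            if strong_count >= 2 and satisfies_strong_triadic_closure(ties, adj, x):
                if label != "W":
                    return False
    return True


ADJ = build_adjacency(TIES)
LOCAL_BRIDGES = sorted(tuple(sorted(e)) for e in TIES if is_local_bridge(ADJ, *e))
print(LOCAL_BRIDGES, span(ADJ, "A", "B"), local_bridges_are_weak(TIES, ADJ))

# Relabeling the local bridge A-B as strong breaks closure at A, because
# A-B and A-C would both be strong while B and C are not linked.
STRONG_BRIDGE = {**TIES, ("A", "B"): "S"}
print(satisfies_strong_triadic_closure(STRONG_BRIDGE, ADJ, "A"))
```

The last two lines mirror the proof by contradiction: a strong local bridge at A would force an edge to the other endpoint from each of A's strong ties, which the local-bridge condition forbids.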

Examples:

[Figure 3.5: Each edge of the social network from Figure 3.4 is labeled here as either a strong tie (S) or a weak tie (W), to indicate the strength of the relationship. The labeling in the figure satisfies the Strong Triadic Closure Property at each node: if the node has strong ties to two neighbors, then these neighbors must have at least a weak tie between them.]

[Figure 3.6: If a node satisfies Strong Triadic Closure and is involved in at least two strong ties, then any local bridge it is involved in must be a weak tie. The figure illustrates the reason why: if the A-B edge is a strong tie, then there must also be an edge between B and C, meaning that the A-B edge cannot be a local bridge. Annotation in the figure: Strong Triadic Closure says the B-C edge must exist, but the definition of a local bridge says it cannot.]

Textbook excerpt (Chapter 3, Strong and Weak Ties, p. 52):

[...] be other, hard-to-discover, multi-step paths that also span these worlds. In other words, if we were to look at Figure 3.3 as it is embedded in a larger, ambient social network, we would likely see a picture that looks like Figure 3.4.

Here, the A-B edge isn’t the only path that connects its two endpoints; though they may not realize it, A and B are also connected by a longer path through F, G, and H. This kind of structure is arguably much more common than a bridge in real social networks, and we use the following definition to capture it. We say that an edge joining two nodes A and B in a graph is a local bridge if its endpoints A and B have no friends in common; in other words, if deleting the edge would increase the distance between A and B to a value strictly more than two. We say that the span of a local bridge is the distance its endpoints would be from each other if the edge were deleted [190, 407]. Thus, in Figure 3.4, the A-B edge is a local bridge with span four; we can also check that no other edge in this graph is a local bridge, since for every other edge in the graph, the endpoints would still be at distance two if the edge were deleted. Notice that the definition of a local bridge already makes an implicit connection with triadic closure, in that the two notions form conceptual opposites: an edge is a local bridge precisely when it does not form a side of any triangle in the graph.

Local bridges, especially those with reasonably large span, still play roughly the same [...]

Textbook excerpt (3.2 The Strength of Weak Ties, p. 55):

We’re going to justify this claim as a mathematical statement; that is, it will follow logically from the definitions we have so far, without our having to invoke any as-yet-unformalized intuitions about what social networks ought to look like. In this way, it’s a different kind of claim from our argument in Chapter 2 that the global friendship network likely contains a giant component. That was a thought experiment (albeit a very convincing one), requiring us to believe various empirical statements about the network of human friendships, empirical statements that could later be confirmed or refuted by collecting data on large social networks. Here, on the other hand, we’ve constructed a small number of specific mathematical definitions (particularly, local bridges and the Strong Triadic Closure Property) and we can now justify the claim directly from these.

The argument is actually very short, and it proceeds by contradiction. Take some network, and consider a node A that satisfies the Strong Triadic Closure Property and is involved in at least two strong ties. Now suppose A is involved in a local bridge (say, to a node B) that is a strong tie. We want to argue that this is impossible, and the crux of the argument is depicted in Figure 3.6. First, since A is involved in at least two strong ties, and the edge to B is only one of them, it must have a strong tie to some other node, which we’ll call C. Now let’s ask: is there an edge connecting B and C? Since the edge from A to B is a local bridge, A and B must have no friends in common, and so the B-C edge must not exist. But this contradicts Strong Triadic Closure, which says that since the A-B and [...]

Can we validate this hypothesis?

* Using a cellphone dataset: “who talks to whom?”
  o X-axis: edges sorted by number of minutes of communication
  o Y-axis: neighborhood overlap
* Stronger edges have more neighborhood overlap
* You can validate it too! (Ex. 6)
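The Y-axis quantity is easy to compute once the call graph is in memory. Here is a small sketch, assuming the usual definition of neighborhood overlap (shared neighbors divided by all neighbors of either endpoint, excluding the endpoints themselves); the call minutes below are invented toy data, not values from the cellphone study.

```python
# Hypothetical toy data: total minutes of calls on each edge.
MINUTES = {
    ("A", "B"): 5,
    ("B", "G"): 40, ("A", "D"): 45, ("B", "F"): 50, ("A", "C"): 60,
    ("F", "G"): 110, ("C", "D"): 120,
}


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def neighborhood_overlap(adj, a, b):
    """|N(a) & N(b)| / |(N(a) | N(b)) - {a, b}|, taken as 0 when the union is empty."""
    common = adj[a] & adj[b]
    either = (adj[a] | adj[b]) - {a, b}
    return len(common) / len(either) if either else 0.0


ADJ = build_adjacency(MINUTES)
# One (minutes, overlap) point per edge, sorted by tie strength like the X-axis.
POINTS = sorted((m, neighborhood_overlap(ADJ, a, b)) for (a, b), m in MINUTES.items())
for minutes, overlap in POINTS:
    print(f"{minutes:4d} min  overlap {overlap:.2f}")
```

An overlap of 0 is exactly the local-bridge condition, so in this toy data the weakest edge is also the local bridge between the two triangles.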

J. Onnela et al., “Structure and tie strengths in mobile communication networks,” Proceedings of the National Academy of Sciences, vol. 104, no. 18, p. 7332, 2007.

Social networks are not flat

* Do not underestimate the effect of inbreeding!
  o How many of your friends are friends with each other (strong ties)?
  o Which are friends with different people (weak ties)?
* Several pieces of evidence that these play different roles
  o Strong ties: cohesion, resist innovation/danger (e.g., reduce risk of teen suicide)
  o Weak ties: diversity, connect faraway groups (e.g., who to ask when looking for a job)

The Strength of Weak Ties, M. Granovetter, Am. J. Soc. (1973)

Suicide and Friendships Among American Adolescents. P. Bearman, J. Moody, Am. J. Public Health (2004)

Can we quantify the difference?

* A metric of strong ties
  o Clustering coefficient:
    $$C_u = \frac{|\{ v, w \in N(u) \mid (v, w) \in E \}|}{|N(u)| \cdot (|N(u)| - 1)}$$
  o Also written P[(v, w) ∈ E | (u, v) ∈ E and (u, w) ∈ E], where nodes v and w are chosen uniformly in V
* Structured networks: rings, grids, torus
  o Clustering remains constant in large graphs
  o As, indeed, in most empirical data sets
* Random graphs are radically different
  o Clustering coefficient goes to zero as n -> ∞
  o … all results so far are biased, as all ties are weak
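The contrast between structured and random graphs shows up even at modest sizes. Below is a minimal sketch, assuming a ring where each node links to its 2 nearest neighbors on each side and a random graph tuned to the same mean degree; the sizes, the seed, and the convention that nodes with fewer than two neighbors count as 0 are illustrative choices.

```python
import random


def ring_lattice(n, k):
    """Ring where every node is linked to its k nearest neighbors on each side."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj


def random_graph(n, p, rng):
    """G(n, p): each possible edge is present independently with probability p."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def local_clustering(adj, u):
    """Fraction of pairs of neighbors of u that are themselves linked."""
    nbrs = list(adj[u])
    d = len(nbrs)
    if d < 2:
        return 0.0  # convention chosen here for nodes with fewer than 2 neighbors
    links = sum(1 for i in range(d) for j in range(i + 1, d) if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (d * (d - 1))


def average_clustering(adj):
    return sum(local_clustering(adj, u) for u in adj) / len(adj)


rng = random.Random(0)
ring_small = average_clustering(ring_lattice(200, 2))
ring_large = average_clustering(ring_lattice(1000, 2))
random_large = average_clustering(random_graph(1000, 4 / 999, rng))  # same mean degree
print(ring_small, ring_large, random_large)
```

The ring keeps the same clustering at both sizes, while the random graph with the same mean degree has clustering close to its edge probability.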


Small-world model

* Main idea: social networks follow a structure with a random perturbation
* Formal construction:
  1. Connect nearby nodes in a regular lattice
  2. Rewire each edge uniformly with probability p (variant: add a new uniform edge with probability p)
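The two construction steps can be sketched as follows, assuming a ring as the regular lattice and a rewiring rule that keeps one endpoint and redraws the other uniformly among nodes not already linked to it. The function names and the parameters n = 100, k = 2 are illustrative, not taken from the paper.

```python
import random


def ring_lattice(n, k):
    """Step 1: link every node to its k nearest neighbors on each side of a ring."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj


def rewire(adj, p, rng):
    """Step 2: with probability p, replace edge (u, v) by (u, w) with w uniform."""
    n = len(adj)
    new = {u: set(nbrs) for u, nbrs in adj.items()}
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for u, v in edges:
        if rng.random() < p:
            # Avoid self-loops and duplicate edges when picking the new endpoint.
            candidates = [w for w in range(n) if w != u and w not in new[u]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            new[u].discard(v)
            new[v].discard(u)
            new[u].add(w)
            new[w].add(u)
    return new


def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2


lattice = ring_lattice(100, 2)
rng = random.Random(0)
for p in (0.0, 0.1, 1.0):
    graph = rewire(lattice, p, rng)
    print(p, edge_count(graph))
```

Rewiring moves edges without creating or destroying any, so the edge count stays at n·k for every p; the "add a new edge" variant would instead keep the lattice intact and grow the count.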

Collective dynamics of ‘small-world’ networks. D. Watts, S. Strogatz, Nature (1998)

Between order and randomness

As a function of p:
  − C(p): clustering coefficient
  − L(p): median length of the shortest path
  − Both normalized by their value at p=0
For small p (~ 0.01):
  − large clustering
  − small diameter
A few rewired edges suffice!
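The two normalized curves can be approximated on a small ring. This is a sketch under stated assumptions: a ring of 300 nodes with 3 neighbors on each side, a simple rewiring rule, and a single random draw per value of p, so the numbers are indicative only and are not the figures from the paper.

```python
import random
import statistics
from collections import deque


def ring_lattice(n, k):
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj


def rewire(adj, p, rng):
    """With probability p, replace edge (u, v) by (u, w) with w chosen uniformly."""
    n = len(adj)
    new = {u: set(nbrs) for u, nbrs in adj.items()}
    for u, v in [(u, v) for u in adj for v in adj[u] if u < v]:
        if rng.random() < p:
            candidates = [w for w in range(n) if w != u and w not in new[u]]
            if candidates:
                w = rng.choice(candidates)
                new[u].discard(v)
                new[v].discard(u)
                new[u].add(w)
                new[w].add(u)
    return new


def average_clustering(adj):
    total = 0.0
    for u, nbrs in adj.items():
        nb = list(nbrs)
        d = len(nb)
        if d >= 2:
            links = sum(1 for i in range(d) for j in range(i + 1, d) if nb[j] in adj[nb[i]])
            total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def median_path_length(adj):
    """Median hop distance over all ordered pairs of connected nodes."""
    lengths = []
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        lengths.extend(d for node, d in dist.items() if node != s)
    return statistics.median(lengths)


rng = random.Random(1)
base = ring_lattice(300, 3)
C0, L0 = average_clustering(base), median_path_length(base)
RATIOS = {}
for p in (0.0, 0.01, 0.05):
    g = rewire(base, p, rng)
    RATIOS[p] = (average_clustering(g) / C0, median_path_length(g) / L0)
    print(f"p={p}: C(p)/C(0)={RATIOS[p][0]:.2f}  L(p)/L(0)={RATIOS[p][1]:.2f}")
```

Clustering is lost only around the few rewired edges, whereas every long-range edge shortens paths for many pairs at once, which is why the two curves separate at small p.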

A new statistical mechanics

* Thm: ∀p>0, the augmented lattice has small diameter: there is a constant A such that P[D ≤ A ln(N)] → 1
  o Elementary proof if assuming regular augmentation
  o Otherwise, proof similar to the Uniform Random Graph

* More on “randomly perturbed structures”
  o Degree distribution
  o Assortative mixing

Outline

* Milgram’s “small world” experiment
* It’s a “combinatorial small world”
* It’s a “complex small world”
* It’s an “algorithmic small world”

Where are we so far?

Analogy with a cosmological principle
  − Are you ready to accept a cosmological theory that does not predict life?

In other words, let’s perform a simple sanity check

A thought experiment

1. Consider a randomly augmented lattice (N nodes)
2. Perform the “small world” Milgram experiment

Can you tell what will happen?
(a) The folder arrives in 6 hops
(b) The folder arrives in O(ln(N)) hops
(c) The folder never arrives
(d) I need more information

A thought experiment

(a) The folder arrives in 6 hops: NOT TRUE

* It actually does look like a naive answer
* More precisely:
  o By the previous result, shortest paths have length of the order of ln(N), which grows with N and so contradicts a fixed number of hops.

A thought experiment

(b) The folder arrives in O(ln(N)) hops: ACCORDING TO OUR PRINCIPLE, OUGHT TO BE TRUE BECAUSE IT WAS OBSERVED BY MILGRAM

* A sufficient condition for this to be true is:
  o Milgram’s procedure extracts shortest paths
* Answering this critical question boils down to an algorithmic problem

A thought experiment

(c) The folder never arrives: SEEMS UNLIKELY

unless the procedure is badly designed (cycles), or we model people dropping out, or the grid contains holes

A thought experiment

(d) I need more information
* In particular, how to model Milgram’s procedure
* “If you do not know the target, forward the folder to the friend or acquaintance who is most likely to know her.”

What is Greedy Routing?

* A mathematical model of what Milgram measured
  o Participants know where the target is located
  o They use grid information + shortcuts “incidentally”
  N.B.: Grid “dimensions” can describe geography or other sociological properties (occupation, language)
* Example: (see board)
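One way to make the model concrete is a ring (dimension 1) where each node may own one shortcut and always forwards to whichever contact is closest to the target. The sketch below is an illustration under those assumptions; the function names, the choice p = 1 and the seed are not from the lecture.

```python
import random


def ring_distance(u, v, n):
    """Number of lattice hops between u and v on a ring of n nodes."""
    d = abs(u - v) % n
    return min(d, n - d)


def augmented_ring(n, p, rng):
    """shortcut[u] is a uniformly chosen node with probability p, else None."""
    return [rng.randrange(n) if rng.random() < p else None for _ in range(n)]


def greedy_route(s, t, n, shortcut):
    """Forward to the known contact (ring neighbor or shortcut) closest to t."""
    path = [s]
    u = s
    while u != t:
        options = [(u - 1) % n, (u + 1) % n]
        if shortcut[u] is not None:
            options.append(shortcut[u])
        # A ring neighbor always gets one hop closer, so the walk terminates.
        u = min(options, key=lambda v: ring_distance(v, t, n))
        path.append(u)
    return path


rng = random.Random(2)
N = 10_000
shortcut = augmented_ring(N, p=1.0, rng=rng)
path = greedy_route(0, N // 2, N, shortcut)
print(f"greedy used {len(path) - 1} steps; the ring alone needs {N // 2}")
```

Each participant only uses its own shortcut and the position of the target, never a global map, which is what separates greedy routing from a shortest-path computation.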

26 1.2 Randomly augmented lattice One fair criticism of random graph model is that they lack structure representing local variations between the nodes. This can be immediately characterized by the so-called “clustering coe⇥- cient”, which is generally deﬁned as

v, w N(u) (v, w) E C = | { | } | . u N(u) N(u) 1 | | ⇤ | ⇥ | This may be related with the conditional probability of triadic closure

P [(v, w) E (u, v) E and (u, w) E ] | where nodes v and w are chosen uniformly in V . The randomly augmented lattice G(N, p) is deﬁned as follows: each node is initially connected with all its neighbor in the original lattice that contains N = m2 nodes. In addition, each node decides independently with probability p to draw a “shortcut” edge, which connects this node to another node chosen uniformly.

Proposition 2 For any p > 0, there exists a constant C such that the diameter D of the G(N, p) graph satisﬁes

D A ln(N) with high probability as N . ⇧ ⌃ 1.3 Algorithmic “Small world” analysis We focus here on the case of dimension k = 2 although the result easily gener- alize to any dimension k 1. Let us ﬁrst analyze asimple case, the randomly augmented lattice with dimension k = 1. How does greedy routing perform? Proposition 3 In a randomly augmented lattice of dimension k = 1 with N nodes, greedy routing uses at least ( (N)) steps. ⌅ Proofbb:yyLetthethe* usalgorithmDoes it extract the shortest path? algorithmconsiderdodoa targetnotnot impactimpactt, andthethetheoutcomeoutcomeneighborhoofoforandomrandomd subsetcchoiceshoices notnot currencurrentlytly used.used. o Not necessarily, this is why we need to analyze it! ⌥ LetLet XXIli=bbee thetheu nonoV dede uthatthatt thethel shortcutshortcutN .rorootedoted atat UUi ppoinointsts to.to. XXi formsforms * Case study: dim. k=1, target t, starting from s=ui | ⇥ | ⌅ ⇤ i i anan i.i.d.i.i.d. sequencesequence⇤ ofof uniformlyuniformly cchosenhosen pp⌃oinoints.ts. EacEachh ofof themthem 0 lieslies inin IIll withwith 1 2l This subsetprobabilitprobabilitcontainso yWe introduce interval: yt1and22ll⇤at⇤NNmost 22ll⌥.. N nodes. NN ⇥⇥ NN by the algorithm do not impact the outcome of random choices not currently Starting fromTheThean initialoprobabilitprobabilit The greedy routing constructs a path pointyysthatthat, ⌅greedy⌅ oneone⇥ routingofof thethe⇧ nnconstructﬁrstﬁrst elemenelemena pathtsts ofofsXX=iiused.Ulieslies0, Uinin1,IIUll 2isis thenthen uppupperer by following atbboundedoundedeach stepbbyy thetheeitherunionuniona lobbcalound:ound:edge in the grid or tha shortcut. (Ui)i 0 we denote the end-point of the i shortcuts as Let Xi be the node that the shortcut rooted at Ui points to. Xi forms are then random points in the grid. an i.i.d. sequence of uniformly chosen points. Each of them lies in Il with Since shortcuts are chosen independently of each other, we can applynlnlthe 1 ⌅ 2l probability N 2l N . 
principle of deferred decision,PP which statesXXii thatIIll previous randomPP [[XXii choicesIIll]] used.. ⇥ N {{ ⌅⌅ }}⇧⇧ ⇥⇥ ⌅⌅ ⇥⇥ ⇤⇤N ii=1=1,...,n,...,n ii=1=1,...,n,...,n TheNprobabilit y that⇤ one of the n ﬁrst elements of Xi lies in Il is then upper ⇤⇤ ⌥⌥ ⇥ ⌃ bounded by the union bound: ⇥ 11 ⇤ ⌃ Hence,Hence, forfor nn == ll == 2⇤NN,, thisthis ooccursccurs withwith probabilitprobabilityy atat mostmost 11//2.2. WWee 2 27 concludeconclude that,that, withwith probabilitprobabilit3 yy atat leastleast 11//22 thethe nn ﬁrstﬁrst shortcutsshortcuts encounencounteredtered nl P [ i=1,...,n Xi Il ] P [Xi Il ] . lie outside Il. In that case, if we assume that s itself is not in Il, the⇤greedy { ⇧ } ⇥ ⇧ ⇥ ⌅ lie outside Il. In that case, if we assume that s itself is not in Il, the greedy i=1,...,n N proprocedurecedure needsneeds eithereither ⇥ 1 ⌅ Hence, for n = l = 2 N, this occurs with probability at most 1/2. We toto makmakee nn stepssteps •• conclude that, with probability at least 1/2 the n ﬁrst shortcuts encountered lie outside I . In that case, if we assume that s itself is not in I , the greedy otherwise,otherwise, toto reacreachh bbeforeefore tt bbeforeefore nn steps,steps, itit requiresrequires toto reacreachhltt fromfrom thethe l •• procedure needs either bboundaryoundary ofof IIll usingusing ll lolocalcal edges.edges. ⇤ to make n steps InIn bbothoth cases,cases, ⇤NN stepssteps areare required.required. • ⇤⇤ otherwise, to reach before t before n steps, it requires to reach t from the Proposition 4 In a randomly augmented lattice of dimension• k 1 with N Proposition 4 In a randomly augmented lattice of dimensionboundaryk 1 withof Il Nusing l local edges. kk nonodes,des, grgreeeedydy rroutingouting usesuses atat leleastast ((NN kk+1+1)) steps.steps. In both cases, ⌅N steps are required. 
⇤ TheThe “small“small wworld”orld” momodeldel deﬁneddeﬁned bbyy KleinKleinbbergerg isis aa vvariationariation ofof thethe randomlyrandomly augmenaugmentedted latticelattice inintrotroducedduced bbyy WWattsatts andand Strogatz.Strogatz. AsAsPropoppopposedositionosed 4 In a randomly augmented lattice of dimension k 1 with N k nodes, greedy routing uses at least (N k+1 ) steps. 22 ExercicesExercices The “small world” model deﬁned by Kleinberg is a variation of the randomly augmented lattice introduced by Watts and Strogatz. As opposed

2 Exercices

44

4 by the algorithm do not impact the outcome of random choices not currently by the algorithmused. do not impact the outcome of random choices not currently by the algorithm do not impact the outcome of random choices not currently used. Let Xi be the node that the shortcut rooted at Ui points to. Xi forms used. Let Xani bei.i.d.the nosequencede thatoftheuniformlyshortcutchosenrootedpatoinUts.i pEacointsh ofto.themXi formslies in Il with Let Xi be the node that the shortcut rooted at Ui points to. Xi forms an i.i.d. probabilitsequence yof 1uniformly2l⇤N chosen2l . points. Each of them lies in Il with an i.i.d.N sequence⇥ofuniformlyN chosen points. Each of them lies in Il with 1 ⇤ 21l 2l probability NThe2probabilitl probabilitN yy .that2⌅l⇤None of the. n ﬁrst elements of Xi lies in Il is then upper ⇥ NN ⇥ N The probabilitbounded ybThethaty⌅theprobabilitoneunionof theybthatound:n⌅ ﬁrstone ofelementhe n tsﬁrstofelemenXi liests inof XIli isliesthenin Iluppis thener upper bounded by the unionboundedbound:by the union bound: nl P Xi Il P [Xi Il ] nl . P { ⌅Xi }Il⇧ ⇥ P [Xinl⌅ Il ] ⇥ ⇤N. P Xi=1i ,...,nIl { ⌅ }P⇧ [⇥iX=1i,...,nIl ] ⌅ . ⇥ ⇤N { ⇤⌅ i=1,...,n}⇧ ⇥ ⌥i=1⌅,...,n ⇥ ⇤ Analysis of Greedy routing i=1,...,n ⇤ i=1,...,n ⌥ N ⇥ ⇥ 1 ⌃ ⌃ Hence,⇤for n = l = ⇤N1 ⇤, ⌥this occurs with probability at most 1/2. We ⇥ Hence, for n = l2⌃= 2 N, this occurs with probability at most 1/2. We 1 ⇤ Hence,concludefor n conclude=that,l = that,2withNwith,probabilitthisprobabilitoccursy atywithatleastleastprobabilit1/12/2thetheynnatﬁrstﬁrstmostshortcutsshortcuts1/2. encounWencoune teredtered * CLAIM: If none of conclude that,X1,withX2X, .1 .probabilit,.X, 2X, .n . . , Xlieny are in outsideatlieleastoutsideI1l./2IInl.theInthatthatn case,ﬁrstcase,shortcutsififwwee assumeencounthatthattereds itselfs itselfis notis not { { } } X and we start from u, X ,in. . .I,l,Xthein Igreedyllie, the outside outsidegreedyprocedureIpro. 
Incedurethatneedsneedscase,eithereitherif we assume that s itself is not { 1 2 n } 0 l oin Then greedy routing needs at least min(Il, the greedy procedureto makneedse n stepseithern,l) steps to mak• e n steps • to make n stepsotherwise, to reach before t before n steps, it requires to reach t from the • otherwise,• to reach before t before n steps, it requires to reach t from the • boundary of Il using l local edges. otherwise,btooundaryreach bofeforeIl usingt beforel localn steps,edges.it requires to reach t from the • In both cases, ⇤N steps are required. boundary of Il using l local edges. ⇤ In both cases, ⇤N steps are required. ⇤ ⇤Proposition 4 In a randomly augmented lattice of dimension k 1 with N In both cases, N steps are required. k ⇤ Propositionnodes, gr4eedyInroutinga randomlyuses at augmenteleast (N kd+1lattic) steps.e of dimension k 1 with N k 28 Propositionnodes,4 grIneeTheadyrandomlyr“smalloutingworld”usesaugmentemoat delleastddeﬁnedlattic(Nbeky+1ofKlein)dimensionsteps.berg is a variationk 1 ofwiththe Nrandomly k nodes, greedy routingaugmenusested atlatticeleastintro(ducedN k+1 b)ysteps.Watts and Strogatz. As opposed The “small world” model deﬁned by Kleinberg is a variation of the randomly The “smallaugmenworld”ted latticemodelindeﬁnedtroducedby Kleinby Wbattsergandis a vStrogatz.ariation ofAstheopprandomlyosed augmented lattice2 introExercicesduced by Watts and Strogatz. As opposed 2 Exercices 2 Exercices

4

4 4 by the algorithm do not impact the outcome of random choices not currently by the algorithmused.do not impact the outcome of random choices not currently by the algorithm do not impact the outcome of random choices not currently used. Let Xi be the node that the shortcut rooted at Ui points to. Xi forms by theby thealgorithmalgorithmdo notdoimpactnot impacttheused.outcomethe outcomeof randomof randomchoices notchoicescurrennottly currently by the algorithmLetdoXnoti bimpacte anthei.i.d.thenooutcomede sequencethatof therandomofshortcutcuniformlyhoices notrootedcurrenchosentlyat Upioinpoints.tsEacto.h Xofi themformslies in Il with used. Let X be the node that the shortcut rooted at U points to. X forms used.used. an i.i.d. sequence of uniformly1i ⇤chosen p2loints. Each of them liesi in I with i Let Xi be the node thatprobabilitthe shortcutan i.i.d.y rosequenceoted2lat NUi ofpoinuniformlyts to.. Xi cformshosen points. Each of theml lies in I with LetLetXi bXei thebe nothedenothatde thethatshortcutthe shortcutroNoted roatotedUi p⇥atointsUiNto.poinXtsi formsto. Xi forms l an i.i.d. sequence of uniformly1 chosen poin2ts.l 1Each of them lies2l in Il with an ani.i.d.i.i.d.probabilitsequencesequenceof yuniformlyof uniformly2l⇤ThechosenprobabilitN probabilitchosenpoints.y p. oinEacy2ts.hl⇤thatofNEacthemoneh ofliesofthem. intheIlliesnwithﬁrstin Ilelemenwith ts of X lies in I is then upper 1 N2l NN ⌅ N i l probabilitprobabilityy 1 22l1l⇤⇤NN 2l .. 2l ⇥ ⇥ probabilitHow does greedy routing perform? NN y 2l⇤⇥NNNbounded. 
* By the union bound, the probability that one of the first $n$ shortcut endpoints $X_1, X_2, \ldots, X_n$ lies in $I_l$ is at most about $2nl/N$

* Fixing $n = l = \frac{1}{2}\sqrt{N}$, this event has probability ≤ 1/2 (up to a lower-order term)
  o So with probability ≥ 1/2, $X_1, X_2, \ldots, X_n$ are not in $I_l$

* On this event, assuming $s$ is not in $I_l$
  o Greedy routing needs at least $\frac{1}{2}\sqrt{N}$ steps: either it makes $n$ steps, or it reaches $t$ before $n$ steps, which requires $l$ local steps from the boundary of $I_l$

29
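The union bound used on this slide can be checked numerically. The following is a minimal sketch, assuming the one-dimensional lattice is a ring of N nodes with the target placed at node 0; the function names `hit_probability` and `union_bound` are hypothetical and only serve the illustration. It estimates the probability that at least one of n uniformly chosen shortcut endpoints falls within distance l of the target and compares it with the bound n(2l+1)/N.

```python
import math
import random


def hit_probability(N, n, l, trials=20000, seed=0):
    """Monte Carlo estimate of P[at least one of n uniform nodes lies in I_l].

    Assumption: nodes are 0..N-1 on a ring and the target t is node 0,
    so the distance from node x to t is min(x, N - x).
    """
    rng = random.Random(seed)
    radius = math.floor(l)
    hits = 0
    for _ in range(trials):
        for _ in range(n):
            x = rng.randrange(N)  # endpoint of one shortcut, uniform over all nodes
            if min(x, N - x) <= radius:
                hits += 1
                break  # one endpoint inside I_l is enough for the event
    return hits / trials


def union_bound(N, n, l):
    """Union bound n * |I_l| / N, with |I_l| = 2 * floor(l) + 1 on the ring."""
    return n * (2 * math.floor(l) + 1) / N


N = 10_000
l = 0.5 * math.sqrt(N)  # l = n = sqrt(N) / 2, as on the slide
n = int(l)
print("estimated probability:", hit_probability(N, n, l))
print("union bound:          ", union_bound(N, n, l))
```

The estimate should stay below the bound, since the union bound ignores the overlap between the events.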


Proof: Let us consider a target $t$, and the neighborhood subset

$$I_l = \{\, u \in V \;:\; |u - t| \le l \,\}.$$

This subset contains $t$ and at most $2l$ other nodes. Starting from an initial point $s$, greedy routing constructs a path $s = U_1, U_2, U_3, \ldots$ by following at each step either a local edge in the grid or a shortcut. $(U_i)_{i \ge 1}$ are then random points in the grid. Since shortcuts are chosen independently of each other, we can apply the principle of deferred decisions, which states that previous random choices used by the algorithm do not impact the outcome of random choices not yet used. Let $X_i$ be the node that the shortcut rooted at $U_i$ points to. $(X_i)_{i \ge 1}$ forms an i.i.d. sequence of uniformly chosen points. Each of them lies in $I_l$ with probability

$$\frac{|I_l|}{N} \le \frac{2l+1}{N} \approx \frac{2l}{N}.$$

The probability that one of the first $n$ elements of $(X_i)$ lies in $I_l$ is then upper bounded by the union bound:

$$P\Big[\bigcup_{i=1,\ldots,n} \{X_i \in I_l\}\Big] \;\le\; \sum_{i=1,\ldots,n} P[X_i \in I_l] \;\le\; \frac{n(2l+1)}{N} \approx \frac{2nl}{N}.$$

Hence, for $n = l = \frac{1}{2}\sqrt{N}$, this occurs with probability at most $\frac{1}{2} + \frac{1}{2\sqrt{N}}$, that is, about 1/2. We conclude that, with probability about 1/2 or more, the first $n$ shortcuts encountered $\{X_1, X_2, \ldots, X_n\}$ lie outside $I_l$. In that case, if we assume that $s$ itself is not in $I_l$, the greedy procedure needs either

* to make $n$ steps,
* or otherwise, to reach $t$ before $n$ steps, which requires reaching $t$ from the boundary of $I_l$ using $l$ local edges.

In both cases, $\frac{1}{2}\sqrt{N}$ steps are required. Hence the expected number of steps needed by greedy routing is lower bounded by a constant times the square root of $N$.

Proposition 4: In a randomly augmented lattice of dimension $k \ge 1$ containing $N$ nodes, greedy routing uses in expectation at least $\frac{1}{4} N^{\frac{1}{k+1}}$ steps.

The proof is a good practice exercise using the same argument. Note that it may a priori seem that the shortcut augmentation of the lattice connects points better as $k$ increases. But in a lattice of dimension $k$ with $N$ nodes, the distance in the lattice is of the order of $O(N^{1/k})$, so that the relative improvement obtained with shortcut augmentation actually becomes worse as $k$ increases.

The “small world” model defined by Kleinberg is a variation of the randomly augmented lattice introduced by Watts and Strogatz. As opposed …

A thought experiment

* Greedy routing needs on the order of $\sqrt{N}$ steps
  o A square root is not satisfying for a small world
  o $k > 1$? Not better
  o Even worse, the proof applies to any distributed algorithm!

* Our sanity check test has grandly failed!
  o “Small world” results explain that short paths exist
  o … finding them remains a daunting algorithmic task

30
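The square-root behaviour of greedy routing can be observed in a small simulation. This is only a sketch under stated assumptions: the lattice is taken to be a ring of N nodes, each node has one shortcut to a uniformly chosen node, and the shortcut of a node is drawn lazily when the node is first visited (the deferred-decision idea used in the proof). The names `ring_distance`, `greedy_route_steps` and `mean_steps` are hypothetical.

```python
import math
import random


def ring_distance(u, v, N):
    """Lattice distance between nodes u and v on a ring of N nodes."""
    d = abs(u - v) % N
    return min(d, N - d)


def greedy_route_steps(N, s, t, rng):
    """Number of steps greedy routing takes from s to t.

    At each node the walker compares its two ring neighbours and its
    shortcut endpoint and moves to whichever is closest to t.
    A local neighbour always reduces the distance by one, so no node is
    visited twice and drawing the shortcut on arrival is legitimate.
    """
    u = s
    steps = 0
    while u != t:
        shortcut = rng.randrange(N)  # uniform shortcut, drawn lazily
        candidates = [(u - 1) % N, (u + 1) % N, shortcut]
        u = min(candidates, key=lambda v: ring_distance(v, t, N))
        steps += 1
    return steps


def mean_steps(N, trials=200, seed=0):
    """Average number of greedy steps over random source/target pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        s, t = rng.randrange(N), rng.randrange(N)
        total += greedy_route_steps(N, s, t, rng)
    return total / trials


for N in (10_000, 40_000, 160_000):
    m = mean_steps(N)
    print(f"N={N:>7}  mean steps={m:8.1f}  mean/sqrt(N)={m / math.sqrt(N):.2f}")
```

If the lower bound is tight up to constants, the last column should stay roughly stable as N grows, while the mean number of steps roughly doubles each time N is multiplied by four.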


31 Outline

* Milgram’s “small world” experiment

* It’s a “combinatorial small world”

* It’s a “complex small world”

* It’s an “algorithmic small world”
  o Beyond uniform random augmentation

32