Chair for Network Architectures and Services Overview Department of Informatics TU München – Prof. Carle

1.2a) Basics ‰ „Unstructured“ / „Structured“ Peer-to-Peer Systems ‰ Early unstructured Peer-to-Peer networks and Security ƒ IN2194 ƒ ‰ Theory Chapter 1 ƒ Random Graphs Peer-to-Peer Systems ƒ Small World Theory 1.2a Unstructured Systems ƒ Scale-Free Graphs Dipl.-Inform. Heiko Niedermayer Christian Grothoff, PhD Prof. Dr.-Ing. Georg Carle

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 2

Unstructured / Structured Unstructured / Structured

Unstructured Network ‰ Does not self-organize into a predefined structure. ‰ Graph is created by random interactions.

Examples for structures Unstructured / Structured ‰ Full Mesh / Clique ƒ All nodes are connected with each other. ƒ n nodes Î degree = n-1 ƒ Diameter = 1 ‰ Ring ƒ Nodes organized in a ring ƒ Degree = 2 ƒ n nodes Î diameter = n/2

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 3 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 4 Unstructured networks Early Unstructured P2P Systems

Properties ‰ No structure has to be created and maintained whenever something changes in the network. ƒ Join • Completed once the node is registered at one other node (except for the need of this node to get to know more nodes….) ƒ Leave • No need to rework, but to locally remove the link Early Unstructured P2P ‰ Unless destination is known, there is no way to know where it is but to search all over the network. Systems v88 v0 ‰ Nodes store their own items. v11 Who has item 41. Which way? v39 v5 v74

Go to v11. v54 Which way? v51 v1

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 5 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 6

Napster Filesharing

Filesharing Napster ‰ and announce content ‰ A centralized Peer-to-Peer system ‰ Search for content ƒ Centralized P2P = management and indexing done by central servers data transfer ‰ content ‰ 1999 by Shawn Flemming (student at Northwestern University) Problems ‰ Finally shut down in 2001 as result of law suits. Napster ‰ Legal issues (see Napster) Æ Decentralization ‰ Approach Server ‰ How to find content? ƒ Central Server ƒ String queries • Manages index of files ƒ Peers •Substring • Register to server with their shared files ƒ Fuzzy queries • Query server for files Îlist of Peers with their hits ƒ Usually no exact queries for the query • Download from Peer Æ Thus, the task for the unstructured decentralized network is to search the network for hits. ƒ Peer-to-Peer • Only the data exchange between the Peers

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 7 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 8 Gnutella Gnutella

Gnutella 0.4 Basic primitives of Gnutella 0.4 ‰ Pure Peer-to-Peer approach ‰ Ping / pong: discover neighborhood ƒ No central entities like in Napster. ‰ Query / query hit: discover content ƒ Avoid single points of failure, any peer can be removed without loss of functionality. ‰ Push: download request sent to firewalled nodes ‰ Join ƒ Firewalls may only allow connections to be established from inside to the ƒ Via any node in the network Internet and not the other way around. • Taken from downloaded host list, peer cache, … ƒ The firewall and NAT aspects of Peer-to-Peer are discussed in a later • Receives a list of recently active peers from this node. section. ƒ Explore neighborhood with ping/pong messages. ƒ Establish connections until a quota is reached. Properties ‰ Limited flooding as routing principle ‰ Immense consumption due to flooding for the signalling and ƒ Flood message to neighbors unless TTL of message exceeded. unsuccessful search traffic! ƒ Store the source of these messages to be able to return the hit ƒ Gnutella 0.4 does not scale (~ overhead dominates the network). to the source (= previous node, not the original source of the ‰ Provides a weak form of anonymity as query is without source address request). and hits are returned hop-by-hop on the path.

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 9 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 10

Gnutella2 Theory

Gnutella2 ‰ Hybrid Peer-to-Peer approach ƒ Distinction between peers and super peers • Super peers form unstructured network data transfer • Client peers connect to some super peers ‰ Hubs (super peers) Hub ƒ Accept hundreds of leaves (client peers) Hub ƒ Many connections to other hubs Hub ƒ Query Hit Table Hub Theory • List of files provided by its leaves. ‰ Leaves (client peers) ƒ Each leaf connects to one or two hubs. ‰ Search ƒ Gather a list of hubs and iteratively ask them. ‰ Properties ƒ Less traffic overhead, scales better

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 11 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 12 Theory Theory: Random Graphs

Observation Randomly-created Graphs ‰ Graphs of unstructured networks are created by random and social ‰ Way to model the structure of these networks interactions. ‰ Necessary to understand the behaviour of these networks ƒ Randomness ƒ Social aspects (social network, entry points, uptime, …) Random Graphs / Uniform Random Graphs ƒ Content (interesting files, …) ‰ Graph G = (V,E) ƒ E is created randomly Questions ƒ n = |V|, m = |E| ‰ What is their form? ‰ Assumption ‰ Are they good? ƒ Nodes randomly connect to each other. ‰ We will also call them uniform random graphs to distinguish them from In the following we present some theoretic graph models that are used to other graphs that are also randomly-created, but where nodes are not approximate these graphs and their properties. all equal and strategies bias the link selection. ‰ Average distance in random graphs is most likely to be close to optimal for given n and m.

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 13 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 14

Erdös-Rényi model The Small-World Phenomenon

Uniform random graphs according to Erdös-Rényi ‰ We meet someone we know at a place where we do not expect model (1960) something like that to happen. Î What a small world ?!? ‰ Given: ƒ n nodes und probability p Degree distribution for n=50, p=0.4 An experiment by Stanley Milgram (1960s) ‰ Construction: ‰ Milgram sent mail to people in Nebraska. ƒ For any two nodes v1, v2 do with probability p: ‰ The mail should only be sent to people they personally know who connect(v1,v2) P(k) might know better how to reach to the targeted receiver. ‰ Resulting graph: ‰ The targeted receivers of the mails were people from Boston. 2

ƒ E[|E|] = p * n / 2 0.00 0.04 0.08 ‰ The result was that on average six hops were required and that the 0 1020304050 ƒ The node degree follows the binomial distribution median was below six. (approx. by Poisson distribution for large n). k ‰ Subsequently, this lead to the term ”‘Six degrees of separation”’ and ‰ Discussion: the conclusion that we live in ”‘small world”’. ƒ Too simple and uniform for a model of real networks.

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 15 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 16 Discussion of the Milgram experiment Graph measure: Characteristic path length (L)

‰ First of all, ”‘six degrees of separation”’ sounds more like a maximum, In the following, we introduce two scalar properties that can be used to but it is an average and the maximum, say the diameter of the graph, characterize graphs. may be significantly larger. ‰ Judith Kleinfeld [Klei02] looked into the experiments of Milgram in Characteristic path length (L) more detail. ‰ L corresponds to the average length of a shortest path in an undirected n n ƒ Most of Milgram’s messages did not find their receiver. In fact, the success graph 1 rate (chain completion rate) was below 20 %. L = avg d()i, j = ∑∑d(i, j) i, j∈V ,i≠ j ⎛n⎞ i=+11j=i ƒ The people that were selected were also biased in such a way that well-off ⎜ ⎟ higher-ranked people were preferred. Moreover, even six degrees may be ⎝2⎠ a strong barrier in reality, say a big world, that cannot be bridged in ‰ Recap of the definition of the diameter particular among different races and classes. ƒ A big world afterall….? D = max d()i, j i, j∈V ,i≠ j ‰ L and random graphs (e.g. constructed by Erdös-Rényi model) log n L ~ random log()m / n

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 17 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 18

Graph measure: Clustering coefficient () Graph measure: Clustering coefficient

cluster Cluster Clustering coefficient C ‰ engl. Traube, Bündel, Schwarm, Haufen ‰ Given graph G = (V,E) v ‰ In data analysis points with similar ‰ We define the neighborhood of a vertex v properties. Γv = {}u ∈V | u adjacent to v G(V,E) Clustering in networking ‰ Given U as subset of V, we define E(U) the edges outlier ‰ Here, a group of nodes that are all closely of the subgraph of V spanned with the nodes U. connected. ‰ Local clustering coefficient of node v ‰ An informal notion of a cluster is that #edges _ of _ subgraph _ G(Γv , E(Γv )) E(Γv ) nodes in a cluster are close to each other. Cv = = #all _ possible _ edges _ between _ nodes _ Γ ⎛deg ree(v)⎞ So, most neighbors of a node in a cluster are v ⎜ ⎟ ⎜ 2 ⎟ also close or even neighbors of each other. Graph with 2 clusters ⎝ ⎠ ‰ Clustering coefficient C of G Î „When my friends are also friends, we are a cluster.“ 1 1 E(Γv ) G(U,E(U)) C = ∑Cv = ∑ We will use this idea to define a measure n v∈V n v∈V ⎛deg ree(v)⎞ called clustering coefficient. ⎜ ⎟ ⎝ 2 ⎠ Rather unclustered

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 19 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 20 Examples – Clustering coefficient Calculation The Small-World Phenomenon in P2P Networking ⎛ 2 ⎞ 2 *1 ⎛ 4 ⎞ 4 * 3 ⎜ ⎟ = = 1 ⎜ ⎟ = = 6 ⎝ 2 ⎠ 2 *1 ⎝ 2 ⎠ 2 *1 ⎛ 3 ⎞ 3* 2 ⎛ 5 ⎞ 5 * 4 ⎜ ⎟ = = 3 ⎜ ⎟ = = 10 1 E(Γv ) ⎜ 2 ⎟ 2 *1 ⎜ 2 ⎟ 2 *1 ‰ The clustering C = ⎝ ⎠ ⎝ ⎠ Small-World Graph ∑ deg ree(v) coefficient n v∈V ⎛ ⎞ ⎜ ⎟ ‰ A Small-World graph is a graph with a characteristic path length close to that of ⎝ 2 ⎠ an equivalent uniform random graph ( L L ), but with a cluster coefficient 1/3 3/3 ≈ random The graph with 2 clusters much greater ( C >> C random ). ‰ As example we compute the local clustering coefficient of a rather central node 5/10 5/6 Small-World on the Internet and elsewhere 2/3 2/3 ƒ It has 5 neighbors . Size Avg. L L_random C C_random ƒ Their graph has 5 edges of 10 possible 3/3 1/1 5/6 degree edges. Internet graph (2002) 260.000 3.39 11.4 10.1 0.023 0.000014 ƒ Thus, its coefficient is 5/10 = 0,5. Skitter topology (***) ‰ The coefficent of the graph C = 0,759 1/6 Gnutella (2000) n/a n/a 3.86 3.19 0.045 0.0068 1/3 Snapshot (**) Film collaboration (*) 225000 61 3.65 2.99 0.79 0.00027 The rather unclustered graph 1/6 0/3 1/3 1/3 Power Grid (*) 4900 2.67 18.7 12.4 0.080 0.005 ‰ The example node has 4 neighbors that Neural network of 282 14 2.65 2.25 0.28 0.05 share only one edge. Its local clustering worm C.elegans (*) coefficient is 1/6 = 0,167. (*) Watts & Strogatz 1999 (**) Li et. al 2004, (***) Jin & Bestavros 2006 ‰ The coefficient of the graph C = 0,296 1/3 2/3 2/6

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 21 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 22

Small-World-Theory and real networks Zipf‘s Law and Scale-Free networks

google cnn you Real networks and Small-World networks content / popularity ‰ Real networks (WWW, Gnutella, etc.) often show Small- World properties local clusters ƒ Characteristic path length is small -α ƒ Clustering coefficient is high Zipf’s law: “The popularity of ith-most popular object is proportional to i , α: Zipf coefficient.” ….but…. unlike Small-World networks, they are ‰ Zipf-like popularity can be found for websites, words in natural languages, movies, … ‰ not symmetric ‰ the peers are way from being equally used. Node degrees in the example ‰ In fact, the popularity and the degree of nodes differs ‰ Google 18, CNN 6, you 1, other nodes 1-5 extremely . ‰ In Filesharing replace the websites with popular content. ƒ E.g. compare google.com, cnn.com and your webpage. ‰ Small-World theory does not explain and contain this variation. Æ Next model: Scale-Free networks

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 23 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 24 Scale-Free Networks Scale-Free Networks

Scale-Free networks / Power-Law networks ‰ The term scale-free relates to the fact that the degree distribution is independent of any scale (e.g. no size of the network in it). ‰ Power Law distribution of the node degree P(k) ~ k −γ typically with γ ≈ 3±1 P(k) Degree distribution for the Actor, WWW, and Power Grid networks taken from 2 10 Albert-Laszlo Barabasi and Reka Albert „Emerging of Scaling in Random Network“, Science 1999. 101 Properties 100 ‰ The Power Law distribution has extremely high variability. 0 1 2 3 10 10 10 10 k ‰ A consequence of the extreme variation of node degree, is an ‰ Other definitions for Scale-Free graphs can be found. existence of few high-degree nodes. Typically, they are called hubs. ‰ Scale-Free graphs are a likely outcome of random graph ƒ The hubs are hotspots. construction processes that contain some element with high ƒ Failure or leave of hubs is a problem for these networks („Archilles heel“) variability. ƒ Failure of non-hubs is considered less problematic. Unless a hub is hit (More on the topic: Li, Alerderson, Tanaka, Doyle, Willinger: „Towards a random failures hardly have an impact on the network, say on the average Theory of Scale-Free Graphs“, 2005) path length.

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 25 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 26

High variability / Power-Law distribution Scale-Free networks (Barabasi-Albert model)

In many fields of networking, there is an element of high variability. Scale-Free graphs according to Barabasi-Albert‘s „The rich get richer“ model (1999) ‰ e.g. network traffic, degree distribution, peer lifetime distribution,… ‰ Also called cumulative advantage. ‰ High variability („heavy-tail“) means variation coefficient >> 1 ‰ Given: n nodes n=6 ƒ The values vary more than their mean. Example ‰ 1 1 2 45 1 0 1 1023 3 1 2 4 0 1 0 11 … Start with m0 unconnected nodes, add random link for each node m0=4 „Bus stop paradox“ (time between buses with variation >> 1) ƒ Minimum degree of each node is 1. ‰ „Passenger is happy when she just misses a bus.“ ‰ For i = 1 to t do m=2, t=2 Passengers waiting ƒ Add node, connect node to m nodes, select nodes according to the following distribution („linear preferential attachment“) i=1 …just missed k the U-Bahn p(i) = i ‰ In most cases when the bus just left the bus stop (arrow), a bus will k j come within short time (black arrows). ∑ i=2 Available _ nodes j ‰ In most cases when a lot of people are waiting, the arrival of the next ‰ Result bus will still take a while. ƒ Graph with t*m+m0 edges

NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 27 NetworkPeer-to-Peer Security, Systems WS and2008/09, SecurSecSecurity,urity, Chapterity, SummerSS 2007,2009,9 2010,Chapter Chapter 01 1 28