Peer-to-Peer Systems

Winter semester 2014 Jun.-Prof. Dr.-Ing. Kalman Graffi Heinrich Heine University Düsseldorf Peer-to-Peer Systems

Unstructured P2P Overlay Networks – Unstructured Heterogeneous Overlays

This slide set is based on the lecture "Communication Networks 2" of Prof. Dr.-Ing. Ralf Steinmetz at TU Darmstadt

Unstructured Heterogeneous P2P Overlays

Unstructured P2P

Centralized P2P
1. All features of Peer-to-Peer included
2. Central entity is necessary to provide the service
3. Central entity is some kind of index/group database

Homogeneous P2P
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. → no central entities
Examples:
§ Gnutella 0.4

Heterogeneous P2P
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. → dynamic central entities
Examples:
§ Gnutella 0.6
§ Fasttrack
§ eDonkey

Structured P2P

DHT-Based
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. → No central entities
4. Connections in the overlay are "fixed"
Examples:
§ CAN

Heterogeneous P2P
1. All features of Peer-to-Peer included
2. Peers are organized in a hierarchical manner
3. Any terminal entity can be removed without loss of functionality
Examples:
• AH-Chord
• Globase.KOM

From R. Schollmeier and J. Eberspächer, TU München

HHU – Technology of Social Networks – JProf. Dr. Kalman Graffi – Peer-to-Peer Systems – http://tsn.hhu.de/teaching/lectures/2014ws/p2p.html

Principles – Hierarchical / Heterogeneous

Approach: to combine best of both worlds
§ Robustness by distributed indexing
§ Fast searches by server queries

Components
§ Supernodes
  • Mini servers / super peers
  • Used as servers for queries
    – To build a sub-network between supernodes
    – Queries distributed at sub-network between supernodes
§ "Normal" peers
  • Have only overlay connections to supernodes

++ Advantages
§ More robust than centralized solutions
§ Faster searches than in pure P2P systems

-- Disadvantages
§ Need of algorithms to choose reliable supernodes

Picture from R. Schollmeier and J. Eberspächer, TU München

Peer-to-Peer Filesharing History of P2P Filesharing Networks

© Fraunhofer-Gesellschaft 2012

Decentralized with Distributed Servers

For example: eDonkey

see e.g.
• http://www.overnet.org/
• http://www.emule-project.net/
• http://savannah.gnu.org/projects/mldonkey/

eDonkey file-sharing protocol
§ Most successful/used file-sharing protocol in
  • e.g. Germany & France in 2003 [see sandvine.org]
    – 52% of generated P2P file sharing traffic
    – only for 44% in Germany
§ Stopped by law
  • February 2006: largest server "Razorback 2.0" disconnected by Belgian police
    – http://www.heise.de/newsticker/eDonkey-Betreiber-wirft-endgueltig-das-Handtuch--/meldung/78093

The eDonkey Network - Principle

Distributed server(s)
§ Set up and run by power users
  → nearly impossible to shut down all servers
§ Exchange their server lists with other servers
  • using UDP as transport protocol
§ Manage file indices

Client application
§ Connects to one random server and stays connected
  • using a TCP connection
§ Searches are directed to the server
§ Clients can also extend their search
  • by sending UDP search messages to additional servers

eDonkey functionality

eDonkey hash can be used for several queries
§ eDonkey server
  • Search for peers
    – Servers block requests if too many requests are sent
§ → additional structured P2P overlay
  • Search for peers (including peers behind a firewall)
    – Very efficient (10 requests per second)
    – Finds more peers than found using servers
  • Ratings and comments for all Kad peers
    – Not used very widely
§ Directly from the peer (requests to a specific file)
  • Query for the filename
    – About 65 % of all peers answer with filename
  • Ratings and comments of the peer
  • Search for further peers

The eDonkey Network

(Figure: eDonkey network with supernodes. Labels: Search, Server List Exchange, Download, Extended Search. Legend: TCP, UDP.)

The eDonkey Network

Procedure
§ New servers send
  • their port + IP to other servers (UDP)
§ Servers send
  • server lists (other servers they know) to the clients
§ Server lists can also be downloaded on various websites

(Figure: eDonkey network. Labels: Search, Server List Exchange, Download, Extended Search. Legend: TCP, UDP, Supernode, Node.)

The eDonkey Network

Files are identified by
§ Unique MD4 file hashes
  • Message-Digest Algorithm 4, RFC 1186
  • 16 byte long
§ Are not identified by filenames

This helps in
• Resuming a download from a different source
• Downloading the same file from multiple sources at the same time
• Verification that the file has been correctly downloaded

eDonkey
● Filesharing network with most files
● Centralized P2P network with many eDonkey servers
● Additional DHT: Kad
● eDonkey hash is created directly from file content
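The idea of identifying files by content rather than by filename can be sketched as follows. This is only an illustration: the slides name MD4, but MD4 is frequently unavailable in current `hashlib` builds, so the sketch uses MD5 as a stand-in that also yields 16-byte digests. The function names and sample data are hypothetical.

```python
import hashlib


def file_id(content):
    """Return a 16-byte identifier computed only from the file content.

    Assumption: MD5 is used here as a stand-in for MD4.
    """
    return hashlib.md5(content).digest()


def verify_download(content, expected_id):
    """Check that downloaded bytes match the identifier that was requested."""
    return file_id(content) == expected_id


# Two peers share the same bytes under different names; a third file differs.
shared = {
    "holiday.avi": b"same bytes",
    "renamed_copy.avi": b"same bytes",
    "other.avi": b"different bytes",
}
ids = {name: file_id(data) for name, data in shared.items()}
```

Because the identifier ignores the filename, both copies of the same content can serve as sources for one download, and the final check detects a corrupted transfer.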

© Fraunhofer-Gesellschaft 2012

The eDonkey Network

→ The search consists of two steps

1. Full-text search to
  • Connected server (TCP) or
  • Extended search with UDP to other known servers.
  § Search results are the hashes of matching files

2. Query sources
  • Query servers for clients offering a file with a certain hash

Later
  • Download from these sources

Status: 1,229,568 users, 37,399,014 files (30.08.2012)
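The two-step search can be sketched with a toy index server. This is a minimal in-memory model under stated assumptions, not the eDonkey wire protocol: the class `IndexServer`, its method names, the short hash strings and the client addresses are all hypothetical.

```python
from collections import defaultdict


class IndexServer:
    """Toy model of a server that manages file indices for its clients."""

    def __init__(self):
        self.names = {}                   # file hash -> last published file name
        self.sources = defaultdict(set)   # file hash -> addresses of offering clients

    def publish(self, client, file_hash, file_name):
        self.names[file_hash] = file_name
        self.sources[file_hash].add(client)

    def search(self, keyword):
        """Step 1: full-text search; the result is a list of file hashes."""
        kw = keyword.lower()
        return sorted(h for h, name in self.names.items() if kw in name.lower())

    def query_sources(self, file_hash):
        """Step 2: ask which clients offer the file with this hash."""
        return sorted(self.sources.get(file_hash, ()))


server = IndexServer()
server.publish("10.0.0.1:4662", "a1", "Open Source Film.avi")
server.publish("10.0.0.2:4662", "a1", "open_source_film_copy.avi")
server.publish("10.0.0.3:4662", "b2", "lecture notes.pdf")

hits = server.search("film")
film_sources = server.query_sources(hits[0])
```

The download itself would then happen directly between the client and the returned sources.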

The eDonkey Network

→ The alternative search consists of two steps

0. Participate in the KAD network
1. Know the MD4 hash of the file
2. Query sources in KAD
  § Send lookup to node responsible for file hash
  § Query responsible node for clients offering the file

Later
  • Download from these sources

Status: 600k-2M users, 200M-600M files (30.08.2012)

Testing the Content in eDonkey Networks

Forensic test set
Consists of about 1500 files
§ Images, music, videos, documents and miscellaneous
§ Images: Fraunhofer, Windows 7, KDE, 4chan
§ Music: mainly three big music collections
§ Videos: YouTube, Open Source Films, P2P, ...
§ Documents: diverse PDFs, Fraunhofer, BitTorrent
§ Miscellaneous: Zips, executables, Malware, ...

Hits
● Hit rate: 385 / 1479 (26 %)

© Fraunhofer-Gesellschaft 2012

KaZaA: Decentralized File Sharing with Super Nodes

see
§ www.kazaa.com, .sourceforge.net, http://www.my-k-lite.com/

System
§ Developer: Fasttrack
§ Clients: KaZaA

Properties:
§ Most successful P2P network in USA in 2002/3

Architecture: neither completely central nor decentralized
§ Supernodes to reduce communication overhead

P2P system   #users        #files        terabytes   #downloads (from download.com)
Fasttrack    2.6 million   472 million   3550        4 million
eDonkey      230,000       13 million    650-2600    600,000
Gnutella     120,000       28 million    105         approx. 525,000

Numbers are from October 2002

Decentralized File Sharing with Super Nodes

Examples: KaZaA, Gnutella 0.6

Peers
§ Connected only to some super nodes
§ Send IP address and file names only to super peers

Super nodes (super peers):
§ Peers with high-performance network connections
§ Take the role of the central server and proxy for simple peers
§ Answer search messages for all peers (reduction of communication load)
§ One or more supernodes can be removed without problems

Additionally, the communication between nodes is encrypted

(Figure: peers and superpeers. Labels: Search, Service Delivery, Download.)

Example for KaZaA

Decentralized File Sharing with Complete Files

Queen Bee has
§ 100 MB file
§ 50 KB/s upload rate in total

Drone 1 receives
§ 25% of the file
§ at 12.5 KB/s rate

Drone 1 has
§ 50 KB/s upload rate
§ not utilized until it has the whole file

(Figure panels: At the beginning, Later. From www.wtata.com)

Issues with KaZaA / Gnutella 0.6

Keyword-based search
§ You do not know what you get
§ Pollution is a problem
  • Music companies flooded the network with false files
  • Chance to get a "good" file ~ 10%
  • Problem for "small" files

Full file download before uploading
§ Users go offline after the download has finished
§ Only few uploaders online
§ Problem for "large" files

Google Trends for KaZaA, Limewire, Torrent, Emule

http://www.google.com/trends?q=kazaa,+limewire,+torrent,+emule

Unstructured Hybrid Resource Sharing:

Offered Services
§ IP telephony features
§ File exchange

Features
§ KaZaA technology
§ Encrypted high media quality
§ Support for teleconferences
§ Multi-platform

Further Information
§ Very popular, low-cost IP telephony
§ SkypeOut extension to call regular phone numbers (not free)
§ Great business potential if combined with free Wi-Fi

Oct. 2011: bought by Microsoft
§ Super nodes are now servers

From www.skype.com

Skype

Network Architecture
§ formerly KaZaA based

(Figure: Skype network with regular nodes, super nodes and the Skype login server; message exchange at login.)

Exercise

Problem 2.2 - Super Hyper Hierarchical Networks

Let us imagine a hierarchical overlay with three hierarchy steps: there are normal peers, super peers, and hyper peers.

a) Number of hyper peers needed
§ Assume that a super peer cares for 100 normal peers, and a hyper peer is responsible for 100 super peers. How many hyper peers would we need in a network of 999 000 peers in total?

b) Querying in the hyper-super-overlay
§ Suggest a way how such a network could handle search queries. Which information should a super peer maintain? What should a hyper peer know? How would a search query be processed?
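One possible way to approach part a) is to count how many peers fit under a single hyper peer. The sketch below assumes one particular reading of the problem, namely that "999 000 peers in total" counts normal, super and hyper peers together; under a different reading (for example, 999 000 normal peers only) the numbers change. The function name is hypothetical.

```python
def hyper_peers_needed(total_peers, normal_per_super=100, super_per_hyper=100):
    """Number of hyper peers under the assumption that total_peers counts all roles."""
    # One full group: the hyper peer itself, its super peers, and their normal peers.
    group_size = 1 + super_per_hyper + super_per_hyper * normal_per_super
    # Integer ceiling division: a partially filled group still needs a hyper peer.
    return (total_peers + group_size - 1) // group_size


group = 1 + 100 + 100 * 100          # 10 101 peers per hyper peer
needed = hyper_peers_needed(999_000)
```

With 10 101 peers per complete group, 98 groups cover 989 898 peers and 99 groups cover 999 999, so this reading leads to 99 hyper peers.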

Peer-to-Peer Systems

Structured Homogeneous P2P Overlay Networks – Distributed Indexing and DHTs

This slide set is based on the lecture "Communication Networks 2" of Prof. Dr.-Ing. Ralf Steinmetz at TU Darmstadt

Structured P2P Overlays: Principles

Unstructured P2P

Centralized P2P
1. All features of Peer-to-Peer included
2. Central entity is necessary to provide the service
3. Central entity is some kind of index/group database
Examples:
§ Napster

Pure P2P
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. → no central entities
Examples:
§ Gnutella 0.4
§ Freenet

Hybrid P2P
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. → dynamic central entities
Examples:
§ Gnutella 0.6
§ Fasttrack
§ eDonkey

Structured P2P

DHT-Based
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. → No central entities
4. Connections in the overlay are "fixed"
Examples:
§ Chord
§ CAN
§ Kademlia

Hybrid P2P
1. All features of Peer-to-Peer included
2. Peers are organized in a hierarchical manner
3. Any terminal entity can be removed without loss of functionality
Examples:
• AH-Chord
• Globase.KOM

From R. Schollmeier and J. Eberspächer, TU München

Structured Overlay Networks: Interconnection Networks

Structured Overlay Networks
§ Give peers and objects (unique) identifiers
  • PeerIDs and ObjectIDs shall be from the SAME key set
  • Each peer is responsible for a specific range of ObjectIDs
§ Indexing (knowledge on location of resources) to be distributed
§ No search needed anymore (local indexing)
§ No server knowing all (global indexing) available

New challenge: to find peer(s) with specific ID in overlay
§ Lookup:
  • "Route" queries across the overlay to the peer with the specific ID
§ Once peer is found
  • Initiate direct communication
  • Upload / download resources

Functions in a Structured P2P Overlay (all)

IsMyKey(K) → true if node is responsible for key K
Route(K, M, hint) → send message M to node responsible for K
§ Hint: optional first hop
GetNodeHandle(K, hint) → get contact details of responsible node
Send(M, q) → send message M to node q

Schematic View on Distributed

Additional Functions in a Distributed Hash Table

Put(Data D, Key K) → copies data to node responsible for K
GetData(Key K) → gets data stored under the key K

Optional further functions:
§ Replication
§ Access Control

(Figure: DHT ring with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485 and a lookup of H("my data") = 3107; example peers 12.5.7.31, peer-to-peer.info, berkeley.edu, planet-lab.org, 89.11.20.15, 95.7.6.10, 86.8.10.18, 7.31.10.25.)
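The Put and GetData functions can be illustrated with a small single-process model that reuses the node IDs and the key H("my data") = 3107 from the figure. The sketch assumes a Chord-like rule in which a key is managed by the first node whose ID is greater than or equal to the key, wrapping around at the end of the ID space; the slides do not fix this rule, and the class `ToyDHT` is hypothetical. Routing between nodes is left out.

```python
import bisect


class ToyDHT:
    """All nodes live in one process; only the key-to-node mapping is modelled."""

    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)
        self.storage = {n: {} for n in self.node_ids}

    def responsible_node(self, key):
        # Assumption: first node ID >= key is responsible; wrap to the smallest ID.
        idx = bisect.bisect_left(self.node_ids, key)
        return self.node_ids[idx % len(self.node_ids)]

    def is_my_key(self, node_id, key):
        return self.responsible_node(key) == node_id

    def put(self, data, key):
        self.storage[self.responsible_node(key)][key] = data

    def get_data(self, key):
        return self.storage[self.responsible_node(key)].get(key)


dht = ToyDHT([611, 709, 1008, 1622, 2011, 2207, 2906, 3485])
dht.put("my data", 3107)
owner = dht.responsible_node(3107)
```

Under this assumed rule the key 3107 ends up at node 3485, and a key larger than every node ID wraps around to node 611.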

Look up in Structured P2P Systems

Principle
§ Location of the objects is found via
  • (1) Node A (provider) advertises object at responsible peer B
    » Advertisement is routed to B.
  • (2) Node C looking for object sends query
    » Query is routed to responsible node.
  • (3) Node B replies to C by sending contact information of A

(Figure: 1. Publish link at responsible peer (Node A to Node B). 2. "Routing" to / lookup of desired object (Node C to Node B). 3. P2P communication. Get link to object.)

Strategies for Data Retrieval: Distributed Indexing

Goal is scalable complexity for
§ Communication effort: O(log(N)) hops
§ Node state: O(log(N)) routing entries

(Figure: DHT ring with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485 and a lookup of H("my data") = 3107. Routing in O(log(N)) steps to the node storing the data. Nodes store O(log(N)) routing information to other nodes.)

The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle

Recall Hash Function & Hash Table

Hash function H(x)
§ maps
  • large input domain
§ onto
  • smaller target domain/range (most often subset of integer)
§ such that
  • we get few collisions
  • i.e.
    – it would be possible to uniquely identify most of these strings using this hash

Hash table
§ data structure that provides fast lookup
§ of a record indexed by a key
§ where
  • the domain of the key is too large for simple indexing;
  • as would occur if an array were used

Like arrays, hash tables can provide O(1) lookup
§ with respect to the number of records in the table.

And .. Question
§ IF H(x) ≠ H(y)
  • THEN (implies) x ≠ y ? (yes)
§ IF H(x) = H(y)
  • THEN (implies) x = y ? (no)

Recall Hash Tables & Hash Functions

Hash tables are a well-known data structure
§ A fixed-size array
§ Elements of array also called "hash buckets"

Properties
§ allow insertions
§ allow deletions
§ allow to find entry in constant (average) time

Hash functions
§ map keys onto elements in the array

Properties of good hash functions:
§ Fast to compute
§ Good distribution of keys into hash table
§ Example: SHA-1 algorithm
  • SHA = Secure Hash Algorithm

Hash Tables: An Example

Assume a network of N=10 nodes

Hash function:
§ hash(x) = x mod 10 (=N)

Example
§ Insert numbers 0, 1, 4, 9, 16, and 25

Properties
§ Easy to find if a given key is present in the table

Hash Table, an example

Keys  Values
0     0
1     1
2
3
4     4
5     25
6     16
7
8
9     9
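The example above can be reproduced in a few lines. This is a plain illustration of hash(x) = x mod 10 with one bucket per node; the list-of-lists layout and the helper `contains` are choices made for this sketch only.

```python
N = 10  # number of nodes, each owning one bucket


def h(x):
    """Hash function from the example: x mod N."""
    return x % N


# One bucket (list) per key 0..9.
table = [[] for _ in range(N)]
for value in (0, 1, 4, 9, 16, 25):
    table[h(value)].append(value)


def contains(value):
    """Only the single bucket h(value) has to be inspected."""
    return value in table[h(value)]
```

The resulting buckets match the table: 16 lands in bucket 6 and 25 in bucket 5, while buckets 2, 3, 7 and 8 stay empty.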

Hash Tables: An Example

Drawback of the example
§ Collisions are likely to happen
§ Time to search grows linearly with the number of peers
§ Inserting and removing a peer also scales linearly
§ Hash function must be adapted to the number of available peers, which is extremely time consuming

Distributed Hash Table DHT
§ Huge hash table (2^160 entries)
§ Assigns a contiguous input RANGE to each peer
  • (instead of individual numbers)

(Figure: hash table example with keys 0 to 9 and values 0, 1, 4, 25, 16, 9.)
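Why adapting the hash function to the number of peers is costly can be seen by counting how many keys change their bucket when one peer joins. The sketch assumes the same x mod N scheme as in the example and a change from N = 10 to N = 11; the helper `moved_keys` is hypothetical.

```python
def moved_keys(keys, old_n, new_n):
    """Keys whose bucket changes when the modulus goes from old_n to new_n."""
    return [k for k in keys if k % old_n != k % new_n]


slide_keys = [0, 1, 4, 9, 16, 25]
moved_small = moved_keys(slide_keys, 10, 11)

# The same experiment with the keys 0..999.
moved_large = moved_keys(range(1000), 10, 11)
```

For the six keys of the example, 16 and 25 have to move. For the keys 0 to 999, 900 of 1000 change their bucket, which is what assigning ranges of a large, fixed identifier space to peers is meant to avoid.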

Design Aspects for Distributed Hash Tables

1. Choice of an identifier space

2. Mapping of resources and peers to the identifier space

3. Management of the identifier space by the peers

4. Graph embedding (structure of the logical network)

5. Routing strategy

6. Maintenance strategy

Overlay Network: Design Decisions

Group of peers P
Group of resources R

Overlay maps resources R and peers P on identifier space I
§ F_R : R → I
§ F_P : P → I

Example I:
§ Chord: [0, 2^160[
§ Pastry: [0, 2^128[
§ CAN: multidimensional

(1) Choice of Identifier Space

Importance of Identifier Space:
§ Identifier space needed for addressing resources and peers
  • Often: ID(object) = hash(object content)
§ Identifier space should be large to support large systems
§ Identifier space independent from physical location of peer → mobility of peers
§ Clustering of resources due to closeness metric of identifier space
§ Message routing uses identifier space

(1) Choice of Identifier Space

→ Main addressing ID space

The identifier space must possess a closeness metric d:

  d : I × I → ℝ

Which MUST satisfy the following conditions:

  ∀x, y ∈ I : d(x, y) ≥ 0
  ∀x ∈ I : d(x, x) = 0
  ∀x, y ∈ I : d(x, y) = 0 → x = y

And SHOULD satisfy the following conditions:

  ∀x, y ∈ I : d(x, y) = d(y, x)
  ∀x, y, z ∈ I : d(x, z) ≤ d(x, y) + d(y, z)
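The difference between the MUST and the SHOULD conditions can be checked by brute force on a tiny identifier space. The sketch below assumes a 4-bit space and compares two common choices: an XOR distance (as used by Kademlia-style overlays) and a clockwise ring distance (as used by Chord-style overlays). Both function definitions are illustrative assumptions, not taken from the slides.

```python
from itertools import product

BITS = 4
SPACE = range(2 ** BITS)


def xor_distance(x, y):
    return x ^ y


def clockwise_distance(x, y):
    """Distance from x to y when walking the ring in increasing direction only."""
    return (y - x) % (2 ** BITS)


def check(d):
    """Exhaustively test the five conditions on the whole 4-bit space."""
    pairs = list(product(SPACE, repeat=2))
    return {
        "non_negative": all(d(x, y) >= 0 for x, y in pairs),
        "zero_on_self": all(d(x, x) == 0 for x in SPACE),
        "zero_implies_equal": all(x == y for x, y in pairs if d(x, y) == 0),
        "symmetric": all(d(x, y) == d(y, x) for x, y in pairs),
        "triangle": all(d(x, z) <= d(x, y) + d(y, z)
                        for x, y, z in product(SPACE, repeat=3)),
    }


xor_props = check(xor_distance)
ring_props = check(clockwise_distance)
```

In this small space the XOR distance passes all five checks, while the clockwise distance passes the three MUST conditions and the triangle inequality but is not symmetric.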

(2) Mapping to the Identifier Space

→ Assigning ID addresses to peers (→ Peer ID)

Possible design decisions:

Completeness: F_P may be complete or partial

Injectivity: the mapping F_P should be injective

  ∀p, q ∈ P : p ≠ q ⇒ F_P(p) ≠ F_P(q)

Dynamicity: F_P may be fixed or change dynamically over time

(3) Management of Identifier Space

Peers are responsible for resource identifiers

Identifier space I is managed by peers P:

Responsibility function: M : I → 2^P

Which associates
§ the identifier of a resource (i = F_R(r) ∈ I, r ∈ R)
§ with a set of peers managing the resource

Through M
§ each peer p is assigned responsibility
§ for a set of identifiers M^(-1)(p)

Locating a resource corresponds to finding a peer p in M(F_R(r))

Responsibility Function

Basic properties of M:

Completeness: ∀i ∈ I : ∃p ∈ P : p ∈ M(i)

Cardinality:
§ One or more peers are responsible for a given identifier
§ M is OFTEN induced by proximity: identifiers are associated with peers that are numerically closest

  p ∈ M(i) ⇒ d(F_P(p), i) = min_{q ∈ P} d(F_P(q), i)

Dynamicity:
§ the responsibility function typically changes as peers join and leave

Uniformity / non-uniformity of replication
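A responsibility function induced by proximity can be sketched directly from the minimum condition. The example assumes that peers are represented by their identifiers and that d is the absolute numerical difference; both are simplifications chosen for this sketch, and the peer IDs are made up.

```python
def responsible_peers(identifier, peer_ids, d=lambda a, b: abs(a - b)):
    """M(i): all peers whose ID is at minimal distance to the identifier."""
    best = min(d(p, identifier) for p in peer_ids)
    return {p for p in peer_ids if d(p, identifier) == best}


peers = {10, 20, 40}
m_14 = responsible_peers(14, peers)    # closest peer is 10
m_15 = responsible_peers(15, peers)    # tie between 10 and 20
m_100 = responsible_peers(100, peers)  # beyond the largest ID
```

Completeness holds as long as at least one peer exists, since the minimum is always attained. The tie at identifier 15 shows a case where more than one peer is responsible, and adding or removing a peer changes the result, which corresponds to the dynamicity property.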

(4) Graph Embedding

→ Creating a graph / network with the peers

Overlay can be modeled as a graph G = (P, E)

A neighborhood function N : P → 2^P defines the neighbor set N(p) of the peer p

Which means that p ∈ P maintains all connected nodes as neighbors: ∃q ∈ N(p) : (p, q) ∈ E

Typically: Here the overlays differ in the implementation
§ Which nodes to connect to
§ How to set up the routing table

(4) Graph Embedding – Desired properties

Desired properties of the graph are:

Graph diameter:
§ a small graph diameter should provide lower bounds for latencies during routing

Connectivity:
§ the graph should be connected at any time

Local connectivity:
§ a peer should be connected to a subset of its immediate neighbors

Long-range connectivity:
§ overlay connectivity should be structurally similar to small world graphs and should satisfy the condition:

  P[q ∈ N(p)] ≈ d(F_P(p), F_P(q))^(-d), i.e. 1 / d(F_P(p), F_P(q))^d

§ with the exponent d denoting the dimension of the identifier space
§ (Many "close" contacts, few "far away" contacts)

(5) Routing strategy

→ Routing in the network to the queried identifier

Routing is modeled as asynchronous message passing
§ which forwards a message m with identifier i to peer p: route(p, i, m)

A routing strategy is defined as a non-deterministic function: R : P × I → 2^P

Which selects, for
§ a given element i of I as destination ID
§ a given peer p
§ with neighborhood N(p)
§ the set of next peers R(p, i) ⊆ N(p)


Routing strategy: Greedy routing

Mostly used in structured overlays

For a routing step from peer p to peer q with q ∈ R(p, i) the following condition holds:

  d(F_P(q), i) ≤ d(F_P(p), i)

Which means that
§ the distance to the target
§ after one routing step is
§ less than or equal to the distance before
  • (Ideally: distance is halved in order to reach O(log(N)) routing steps)
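Greedy routing can be sketched on a small ring in which every identifier is occupied by a peer. The setup is an assumption made for illustration: a 4-bit ring, a clockwise distance, and neighbors at offsets 1, 2, 4 and 8 (a finger-table-like layout). Real overlays have sparse ID spaces and maintain their neighbor sets differently.

```python
M_BITS = 4
SIZE = 2 ** M_BITS  # 16 peers with IDs 0..15


def distance(x, i):
    """Clockwise distance from peer x to identifier i."""
    return (i - x) % SIZE


def neighbors(p):
    """Assumed neighbor set N(p): peers at offsets 1, 2, 4, 8."""
    return [(p + 2 ** k) % SIZE for k in range(M_BITS)]


def greedy_route(start, target):
    """Forward to the neighbor closest to the target until it is reached."""
    path = [start]
    current = start
    while current != target:
        current = min(neighbors(current), key=lambda q: distance(q, target))
        path.append(current)
    return path


path = greedy_route(0, 13)
max_hops = max(len(greedy_route(s, t)) - 1
               for s in range(SIZE) for t in range(SIZE))
```

In this setup each hop removes the largest power of two that still fits into the remaining distance, so the distance never increases and no route needs more than log2(16) = 4 hops.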

(6) Maintenance Strategy

→ Keeping the routing information up-to-date

Maintenance strategies can be classified into:
§ Proactive correction (e.g. using heartbeat messages)
§ Reactive mechanisms
  • Correction on use
  • Correction on failure
  • Correction on change

Goal for maintenance strategy:
§ Sufficient level of consistency
§ Minimize effort

Design Concepts for P2P Overlays

Key design concepts:
• choice of an identifier space
• mapping of resources and peers to the identifier space
• management of the identifier space by the peers
• graph embedding (structure of the logical network, selection of contacts)
• routing strategy
• maintenance strategy

Unstructured vs. structured
§ Structured: object IDs and peer IDs share the same ID space
  • Every object ID is assigned to one single peer
  • Lookup possible := routing to peer being responsible for desired object ID

Motivation Distributed Indexing

Communication overhead vs. state(s) of node
§ Communication overhead: i.e. no. of hops
§ State(s) of node: i.e. amount of routing entries stored in node, e.g. server

Flooding: communication overhead O(N), node state O(1)
§ Bottleneck:
  • Communication overhead
  • False negatives (many false reports)

Central server: communication overhead O(1), node state O(N)
§ Bottlenecks:
  • Memory, CPU, network
  • Availability

Scalable solution between both extremes? Communication overhead O(log N), node state O(log N)?

The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle

Motivation Distributed Indexing

Communication overhead vs. node state

Distributed Hash Table: communication overhead O(log N), node state O(log N)
§ Scalability: O(log N)
§ No false negatives
  • i.e. never answer "not found" if the object is there
§ More resistant against changes
  • Failures, attacks
  • Short-time users

Flooding: communication overhead O(N), node state O(1)
§ Bottleneck:
  • Communication overhead
  • False negatives

Central server: communication overhead O(1), node state O(N)
§ Bottlenecks:
  • Memory, CPU, network
  • Availability

The content of this slide has been adapted from "Peer-to-Peer Systems and Applications", ed. by Steinmetz, Wehrle

Fundamentals of Distributed Hash Tables

Challenges for designing DHTs:

1. Desired Characteristics
§ Flexibility
§ Reliability

2. Load balancing
§ Equal content "load" for all nodes
§ vs. content load proportional to node capacity
§ vs. content load proportional to content consumption

3. Permanent adaptation to faults, join, leave of nodes
§ Assignment of responsibilities to new nodes
§ Re-assignment and re-distribution of responsibilities in case of node failure or departure
