CS 417: Distributed Systems
16. Distributed Lookup
Paul Krzyzanowski, Rutgers University, Fall 2018

Distributed Lookup
• Look up (key, value)
• Cooperating set of nodes
• Ideally:
  – No central coordinator
  – Some nodes can be down

Approaches
1. Central coordinator
   – Napster
2. Flooding
   – Gnutella
3. Distributed hash tables
   – CAN, Chord, Amazon Dynamo, Tapestry, …

1. Central Coordinator
• Example: Napster
  – Central directory
  – Identifies content (names) and the servers that host it
  – lookup(name) → {list of servers}
  – Download from any of the available servers
    • Pick the best one by pinging and comparing response times
• Another example: GFS
  – Controlled environment compared to Napster
  – Content for a given key is broken into chunks
  – Master handles all queries … but not the data

1. Central Coordinator – Napster
• Pros
  – Super simple
  – Search is handled by a single server (master)
  – The directory server is a single point of control
    • Provides definitive answers to a query
• Cons
  – Master has to maintain state of all peers
  – Server gets all the queries
  – The directory server is a single point of control
    • No directory, no service!

2. Query Flooding
• Example: Gnutella distributed file sharing
• Well-known nodes act as anchors
  – Nodes with files inform an anchor about their existence
  – Nodes select other nodes as peers

2. Query Flooding
• Send a query to peers if a file is not present locally
  – Each request contains:
    • Query key
    • Unique request ID
    • Time to Live (TTL, maximum hop count)
• A peer either responds or routes the query to its neighbors
  – Repeat until TTL = 0 or the request ID has already been processed
  – If found, send a response (node address) to the requestor
  – Back propagation: the response hops back along the query path to reach the originator
  (a code sketch follows the flooding examples below)

Overlay network
An overlay network is a virtual network formed by peer connections
  – Any node might know about only a small set of machines
  – "Neighbors" may not be physically close to you
[Figure: overlay network of peers built on top of the underlying IP network]

Flooding Example: Overlay Network
[Figure: the overlay network used in the flooding example]

Flooding Example: Query Flood
[Figure: a query floods through the overlay; the TTL (Time to Live, hop count) is decremented at each hop until it reaches 0]

Flooding Example: Query Response
[Figure: nodes that find the content ("Found!") send results back to the originator via back propagation along the query path]

Flooding Example: Download
[Figure: the originator requests the download directly from a node that reported the content]
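Aside (not from the slides): the flooding protocol above can be summarized in a short sketch. This is illustrative only, under made-up assumptions: the Peer class, its fields (neighbors, files, seen), and in-process "message delivery" via method calls are hypothetical; a real Gnutella peer exchanges messages over the network.

    # Illustrative sketch of TTL-limited query flooding with back propagation.
    # Hypothetical Peer class; method calls stand in for network messages.
    import uuid

    class Peer:
        def __init__(self, name):
            self.name = name
            self.neighbors = []     # peers this node knows about (overlay edges)
            self.files = set()      # content stored locally
            self.seen = set()       # request IDs already processed

        def lookup(self, key, ttl=3):
            """Entry point: a user asks this peer for `key`."""
            return self.query(key, request_id=uuid.uuid4().hex, ttl=ttl)

        def query(self, key, request_id, ttl):
            # Drop duplicate requests so the flood terminates in cyclic overlays.
            if request_id in self.seen:
                return None
            self.seen.add(request_id)

            if key in self.files:       # found locally: answer with our address
                return self.name
            if ttl == 0:                # hop limit reached: give up on this path
                return None

            # Otherwise forward to every neighbor with a decremented TTL.
            # Returning through the call chain models back propagation:
            # the reply retraces the path the query took.
            for peer in self.neighbors:
                result = peer.query(key, request_id, ttl - 1)
                if result is not None:
                    return result
            return None

Note that a query nobody can answer still visits every reachable node within TTL hops, which is one of the problems discussed next.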
What's wrong with flooding?
• Some nodes are not always up, and some are slower than others
  – Gnutella & Kazaa dealt with this by classifying some nodes as special ("ultrapeers" in Gnutella, "supernodes" in Kazaa)
• Poor use of network resources
• Potentially high latency
  – Requests get forwarded from one machine to another
  – Back propagation (e.g., in Gnutella's design), where the replies go through the same chain of machines used in the query, increases latency even more

3. Distributed Hash Tables

Hash tables
• Remember hash functions & hash tables?
  – Linear search: O(N)
  – Tree: O(log N)
  – Hash table: O(1)

What's a hash function? (refresher)
• Hash function
  – A function that takes a variable-length input (e.g., a string) and generates a (usually smaller) fixed-length result (e.g., an integer)
  – Example: hash strings to the range 0–7:
    • hash("Newark") → 1
    • hash("Jersey City") → 6
    • hash("Paterson") → 2
• Hash table
  – Table of (key, value) tuples
  – Look up a key: the hash function maps keys to the range 0 … N-1 for a table of N elements
    i = hash(key); table[i] contains the item
  – No need to search through the table!

Considerations with hash tables (refresher)
• Picking a good hash function
  – We want a uniform distribution of all key values over the space 0 … N-1
• Collisions
  – Multiple keys may hash to the same value
    • hash("Paterson") → 2
    • hash("Edison") → 2
  – table[i] is a bucket (slot) for all such (key, value) sets
  – Within table[i], use a linked list or another layer of hashing
• Think about a hash table that grows or shrinks
  – If we add or remove buckets → we need to rehash keys and move items

Distributed Hash Tables (DHT)
Create a peer-to-peer version of a (key, value) data store

How we want it to work:
1. A peer (A) queries the data store with a key
2. The data store finds the peer (B) that has the value
3. That peer (B) returns the (key, value) pair to the querying peer (A)

Make it efficient!
  – A query should not generate a flood!
[Figure: peer A sends query(key) into a set of peers B–E; the peer holding the key returns the value to A]

Consistent hashing
• Conventional hashing
  – Practically all keys have to be remapped if the table size changes
• Consistent hashing
  – Most keys will hash to the same value as before
  – On average, only K/n keys need to be remapped (K = # of keys, n = # of buckets)
  (a short numeric check of this claim appears after the case-study questions below)
  – Example: splitting a bucket — slots {a, b, c, d, e} become {a, b, c1, c2, d, e}; only the keys in slot c get remapped

3. Distributed hashing
• Spread the hash table across multiple nodes
• Each node stores a portion of the key space
  lookup(key) → node ID that holds (key, value)
  lookup(node_ID, key) → value

Case study questions
  – How do we partition the data & do the lookup?
  – How do we keep the system decentralized?
  – How do we make the system scalable (lots of nodes with dynamic changes)?
  – How do we make it fault tolerant (replicated data)?
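Aside (not from the slides): the K/n claim above can be checked with a short experiment. The sketch below is illustrative only; the key names, bucket names, and bucket counts are made up. It places 10,000 keys first with conventional modulo hashing and then on a consistent-hash ring, and counts how many keys move when one bucket is added.

    # Illustrative comparison: conventional (mod N) hashing vs. consistent
    # hashing when one bucket is added. All names and counts are made up.
    import hashlib
    from bisect import bisect_left

    def h(s):                  # stable integer hash of a string
        return int(hashlib.sha1(s.encode()).hexdigest(), 16)

    keys = [f"key-{i}" for i in range(10_000)]

    # Conventional hashing: bucket = hash(key) mod N
    def mod_placement(n_buckets):
        return {k: h(k) % n_buckets for k in keys}

    # Consistent hashing: buckets sit at points on a ring; a key belongs to
    # the first bucket clockwise from hash(key).
    def ring_placement(bucket_names, ring_size=2**32):
        points = sorted((h(b) % ring_size, b) for b in bucket_names)
        def owner(k):
            i = bisect_left(points, (h(k) % ring_size,))
            return points[i % len(points)][1]
        return {k: owner(k) for k in keys}

    def moved(before, after):
        return sum(1 for k in keys if before[k] != after[k])

    print("mod N, 5 -> 6 buckets:", moved(mod_placement(5), mod_placement(6)))
    print("ring,  5 -> 6 buckets:",
          moved(ring_placement([f"b{i}" for i in range(5)]),
                ring_placement([f"b{i}" for i in range(6)])))

With a uniform hash, the mod-N run moves the large majority of the keys, while the ring run moves only roughly K/n of them — the property that the following case studies build on.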
Distributed Hashing Case Study: CAN (Content Addressable Network)

CAN design
• Create a logical grid
  – x-y in 2-D (but not limited to two dimensions)
• Separate hash function per dimension
  – hx(key), hy(key)
• A node
  – Is responsible for a range of values in each dimension (its zone)
  – Knows its neighboring nodes

CAN key→node mapping: 2 nodes
  x = hashx(key)
  y = hashy(key)
  if x < xmax/2, n1 has (key, value)
  if x ≥ xmax/2, n2 has (key, value)
• n2 is responsible for the zone x = (xmax/2 .. xmax), y = (0 .. ymax)
[Figure: the x-y grid split vertically at xmax/2 into the zones of n1 and n2]

CAN partitioning
• Any node can be split in two – either horizontally or vertically
[Figure: n1's zone split horizontally at ymax/2, creating node n0]

CAN key→node mapping: 3 nodes
  x = hashx(key)
  y = hashy(key)
  if x < xmax/2:
      if y < ymax/2, n0 has (key, value)
      else n1 has (key, value)
  if x ≥ xmax/2, n2 has (key, value)

CAN partitioning (continued)
• Any node can be split in two – either horizontally or vertically
• Associated data has to be moved to the new node based on hash(key)
• Neighbors need to be made aware of the new node
• A node knows only of its neighbors
[Figure: the grid after repeated splits, with zones for nodes n0 through n11]

CAN neighbors
• Neighbors are nodes that share adjacent zones in the overlay network
• n4 only needs to keep track of n5, n7, or n8 as its right neighbor
[Figure: n4's zone and the adjacent zones along its edges]

CAN routing
• lookup(key) on a node that does not own the value:
  – Compute hashx(key), hashy(key) and route the request to a neighboring node
  – Ideally: route to minimize the distance to the destination
[Figure: a query routed zone by zone toward the node that owns (hashx(key), hashy(key))]

CAN performance & fault tolerance
• Performance
  – For n nodes in d dimensions: # neighbors = 2d
  – Average route for 2 dimensions = O(√n) hops
• To handle failures
  – Share knowledge of neighbors' neighbors
  – One of the node's neighbors takes over the failed zone

Distributed Hashing Case Study: Chord

Chord & consistent hashing
• A key is hashed to an m-bit value: 0 … 2^m − 1
• A logical ring is constructed for the values 0 … 2^m − 1
• Nodes are placed on the ring at hash(IP address)
[Figure: a 16-position identifier ring (0–15); a node with hash(IP address) = 3 sits at position 3]

Key assignment
• Example: n = 16 (a 16-position ring); system with 4 nodes (so far)

Handling query requests
• Any peer can get a request (insert or query).
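Aside (not from the slides): a minimal sketch of Chord-style key assignment on the 16-position ring from the example above. The node IP addresses are hypothetical, m = 4 is chosen only to match the 16-position example, and SHA-1 truncated to m bits stands in for the hash function; a real deployment uses a much larger identifier space and additional routing state, which are omitted here.

    # Illustrative sketch of consistent hashing on a Chord-style ring.
    # Assumptions: m = 4 (16 positions, as in the example), made-up node IPs,
    # SHA-1 mod 2^m as the hash; node-ID collisions are ignored in this toy.
    import hashlib

    M = 4                      # identifier bits
    RING = 2 ** M              # 16 positions: 0 .. 15

    def chord_hash(s):
        return int(hashlib.sha1(s.encode()).hexdigest(), 16) % RING

    # Place nodes on the ring at hash(IP address).
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]   # hypothetical
    ring = sorted((chord_hash(ip), ip) for ip in nodes)

    def successor(key):
        """A key is stored on the first node at or after hash(key) on the ring."""
        kid = chord_hash(key)
        for node_id, ip in ring:            # ring is sorted by node ID
            if node_id >= kid:
                return node_id, ip
        return ring[0]                      # wrap around past the largest ID

    for k in ["Newark", "Paterson", "Edison"]:
        node_id, ip = successor(k)
        print(f"hash({k!r}) = {chord_hash(k):2d} -> node {node_id} ({ip})")

Each key lands on the successor of its hash, so when a node joins or leaves, only the keys between that node and its predecessor move — the consistent-hashing property discussed earlier.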