Consistent Hashing and Distributed Hash Tables

Marco Serafini

COMPSCI 590S Lecture 12

Motivation

Peer-to-Peer Networks
• Decentralized distributed systems
• No/minimal central authority
• Main functionality:
  • BitTorrent
• Problems
  • No central index: how to find a file?
  • No central master: how to know the participants?
  • Churn: nodes come and go all the time
• Q: Do these problems arise in Big Data systems?

Content Delivery Networks
• Cache services closer to the clients
• Initially: mainly static web content, images, etc.
• Lower latency due to proximity
• Data replication offloads source servers
• Example: Akamai
• Similar problems as P2P networks

Consistent Hashing
• Consistent Hashing
  • Assigns keys to values (files)
  • Membership information is distributed
  • Designed to balance load and deal with churn
• Distributed Hash Table (DHT)
  • Used to find the node responsible for a key quickly
  • Distributed index: each node keeps a subset of the index

Consistent Hashing

Consistent Hashing
• Nodes are assigned a key
• Values are assigned a key
• Each value is stored by the first successor node of its key

[Figure: example with nodes]

NOTE: In this example:
• n = 8
• Ids are uniformly distributed, but this is a simplification
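The assignment rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the `Ring` class, the `key_of` helper, the use of SHA-1, and the 32-bit key space are all assumptions made for the example.

```python
import bisect
import hashlib

M = 32  # assumed number of bits in the key space (illustrative)

def key_of(name: str) -> int:
    """Hash a node name or a value name to an M-bit key."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

class Ring:
    def __init__(self, node_names):
        # Sorted list of (key, node name) pairs, one per node.
        self.nodes = sorted((key_of(n), n) for n in node_names)

    def successor(self, key: int) -> str:
        """Return the first node whose key is >= key, wrapping around."""
        keys = [k for k, _ in self.nodes]
        i = bisect.bisect_left(keys, key)
        return self.nodes[i % len(self.nodes)][1]

    def node_for(self, value_name: str) -> str:
        """Node that stores the value: the first successor of its key."""
        return self.successor(key_of(value_name))

ring = Ring([f"node-{i}" for i in range(8)])
owner = ring.node_for("movie.mp4")
```

Both nodes and values go through the same hash function, so finding the storing node only requires a search over the sorted node keys.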

Membership Changes
• Node joins: take data from successor
• Node leaves: give data to successor
• Local changes, no global reconfiguration → good for churn
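To make the "local changes" point concrete, the following sketch compares value ownership before and after one node joins. The node and file names, the SHA-1 hash, and the 32-bit key space are assumptions chosen for illustration.

```python
import bisect
import hashlib

def key_of(name: str) -> int:
    """Hash a name to a 32-bit key (assumed key size)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % 2**32

def owner(node_names, value_name):
    """First successor node of the value's key, wrapping around."""
    nodes = sorted((key_of(n), n) for n in node_names)
    i = bisect.bisect_left([k for k, _ in nodes], key_of(value_name))
    return nodes[i % len(nodes)][1]

nodes = [f"node-{i}" for i in range(8)]
values = [f"file-{i}" for i in range(1000)]

before = {v: owner(nodes, v) for v in values}
after = {v: owner(nodes + ["node-new"], v) for v in values}

# Values whose owner changed because of the join.
moved = [v for v in values if before[v] != after[v]]
sources = {before[v] for v in moved}        # nodes that gave data away
destinations = {after[v] for v in moved}    # nodes that received data
```

Every moved value ends up on the new node, and all of them come from a single node, the new node's successor. No other node is affected.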

Theoretical Results

• Q: What do these results tell us?
• $\epsilon$ is arbitrarily small with $O(\log N)$ virtual nodes per physical node
• Virtual nodes: multiple keys associated with the same physical node
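The effect of virtual nodes on load balance can be checked with a small simulation. It assumes that $\epsilon$ measures how far a node's load can exceed the average load. The `imbalance` function, the naming scheme for virtual nodes, and the parameter values are hypothetical.

```python
import bisect
import hashlib
from collections import Counter

def key_of(name: str) -> int:
    """Hash a name to a 32-bit key (assumed key size)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % 2**32

def imbalance(num_nodes: int, vnodes_per_node: int, num_values: int = 20000) -> float:
    """Return max load divided by average load (1.0 means perfectly balanced)."""
    # Each physical node gets several keys, one per virtual node.
    ring = sorted(
        (key_of(f"node-{n}#vnode-{v}"), n)
        for n in range(num_nodes)
        for v in range(vnodes_per_node)
    )
    ring_keys = [k for k, _ in ring]
    load = Counter()
    for i in range(num_values):
        pos = bisect.bisect_left(ring_keys, key_of(f"value-{i}"))
        load[ring[pos % len(ring)][1]] += 1
    return max(load.values()) / (num_values / num_nodes)

single = imbalance(num_nodes=20, vnodes_per_node=1)
many = imbalance(num_nodes=20, vnodes_per_node=100)
```

With one key per node the most loaded node typically holds several times the average. With many virtual nodes per physical node the ratio moves much closer to 1.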

Finding Data with Consistent Hashing

Finger Tables
• Each node keeps a finger table
• Start of finger table element $i$: $(n + 2^{i-1}) \bmod 2^m$
• Each element defines an interval of size $2^{i-1}$ until the next element
• Each interval is associated with the first node in it
• Constants
  • Number of bits in key: $m$
  • Node identifier: $n$
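A minimal sketch of the finger-table construction follows. It assumes `n` is the identifier of the node that owns the table and `m` is the number of key bits. The function names and the 3-bit example ring are hypothetical.

```python
def finger_starts(n: int, m: int) -> list[int]:
    """Start of finger i for i = 1..m: (n + 2^(i-1)) mod 2^m."""
    return [(n + 2 ** (i - 1)) % (2 ** m) for i in range(1, m + 1)]

def finger_table(n: int, m: int, node_ids: list[int]) -> list[tuple[int, int]]:
    """Pair each finger start with the first node at or after it on the ring."""
    ring = sorted(node_ids)
    table = []
    for start in finger_starts(n, m):
        succ = next((x for x in ring if x >= start), ring[0])  # wrap around
        table.append((start, succ))
    return table

# Hypothetical 3-bit ring with nodes 0, 1, and 3.
table = finger_table(n=1, m=3, node_ids=[0, 1, 3])
```

For node 1 the finger starts are 2, 3, and 5, and the associated nodes are 3, 3, and 0, so `table` is `[(2, 3), (3, 3), (5, 0)]`.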

Properties
• More info on closer keys
  • Subsequent intervals double their size
• No info for successors of all keys
  • Only successors for intervals

Lookups for Key k
• Find node j immediately preceding k in local finger table
• Ask j for node immediately preceding k in remote finger table
• Iterate
  • Every step halves distance
  • Q: Why?
• After $\log N$ steps, proceed sequentially
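The lookup procedure can be simulated on a static ring. This is a sketch under stated assumptions: the node ids, the 6-bit key space, and all function names are hypothetical, and every finger table is computed from global knowledge rather than maintained by the nodes themselves.

```python
M = 6                      # assumed key size in bits (illustrative)
SIZE = 2 ** M
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # hypothetical node ids

def successor_of(key: int) -> int:
    """First node at or after key, wrapping around the ring."""
    return next((n for n in NODES if n >= key % SIZE), NODES[0])

def fingers(n: int) -> list[int]:
    """Finger nodes of n: successor of (n + 2^(i-1)) mod 2^m for i = 1..m."""
    return [successor_of((n + 2 ** (i - 1)) % SIZE) for i in range(1, M + 1)]

def between(x: int, a: int, b: int) -> bool:
    """True if x is in the clockwise-open interval (a, b)."""
    return 0 < (x - a) % SIZE < ((b - a) % SIZE or SIZE)

def closest_preceding(n: int, k: int) -> int:
    """Farthest finger of n that still precedes k."""
    for f in reversed(fingers(n)):
        if between(f, n, k):
            return f
    return n

def lookup(start: int, k: int) -> tuple[int, int]:
    """Return (node responsible for k, number of hops taken)."""
    n, hops = start, 0
    # Stop when k falls in (n, successor(n)]: the successor stores k.
    while not (between(k, n, fingers(n)[0]) or k == fingers(n)[0]):
        nxt = closest_preceding(n, k)
        if nxt == n:  # safety guard: no finger makes progress
            break
        n, hops = nxt, hops + 1
    return fingers(n)[0], hops
```

On this ring, `lookup(8, 54)` returns `(56, 2)`: node 8 forwards to node 42, node 42 forwards to node 51, and node 51 finds that key 54 belongs to its successor, node 56.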

Number of Lookups
With high probability, the number of nodes that must be contacted to find a successor in an $N$-node network is $O(\log N)$

Proof
• Each step from node n to node p halves the distance
• After $\log N$ steps, distance $\le 2^m/N$
• Interval contains 1 node on average, $O(\log N)$ w.h.p.
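The distance bound in the proof can be written out as a short derivation. It assumes the logarithm is base 2 and that the initial distance $d_0$ is at most the size of the key space. The symbol $d_t$ (distance to the key after $t$ steps) is introduced here for illustration.

```latex
d_0 \le 2^m, \qquad d_t \le \frac{d_{t-1}}{2}
\;\Longrightarrow\;
d_{\log N} \le \frac{2^m}{2^{\log N}} = \frac{2^m}{N}
```

With $N$ node ids spread uniformly over $2^m$ keys, an interval of length $2^m/N$ contains one node in expectation, which is why the remaining sequential steps are few.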

Dealing with Node Joins
• Single node joins
  • New node initializes finger table (with lookups)
  • Existing nodes update fingers
  • Transfer keys to the new successor
• In practice
  • Concurrent node joins
  • Failures
  • No strong guarantees: lookups can fail
  • The system tries to stabilize eventually (best effort)

Stabilization
• Having correct successor pointers is sufficient for correctness (with joins)
• When joining
  • Find successor and point to it
  • Get part of the successor’s keys
• Periodically
  • Replace successor’s predecessor with self if needed
  • Check random local fingers and see if they are still up to date by doing a lookup
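The join and periodic steps above can be sketched with in-memory objects. This is a simplified illustration, not the lecture's code: the `Node` class and its method names are assumptions, key transfer and finger checks are omitted, and lookups walk successor pointers only.

```python
class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor = self      # a lone node is its own successor
        self.predecessor = None

    @staticmethod
    def between(x: int, a: int, b: int) -> bool:
        """True if x is strictly between a and b clockwise (whole ring if a == b)."""
        if a == b:
            return x != a
        return a < x < b if a < b else (x > a or x < b)

    def find_successor(self, key: int) -> "Node":
        """Walk successor pointers until key falls in (node, successor]."""
        node = self
        while not (self.between(key, node.id, node.successor.id)
                   or key == node.successor.id):
            node = node.successor
        return node.successor

    def join(self, known: "Node") -> None:
        """When joining: find the successor and point to it."""
        self.successor = known.find_successor(self.id)

    def stabilize(self) -> None:
        """Periodic step: adopt a closer successor, then notify it."""
        p = self.successor.predecessor
        if p is not None and self.between(p.id, self.id, self.successor.id):
            self.successor = p
        self.successor.notify(self)

    def notify(self, candidate: "Node") -> None:
        """Replace the predecessor with candidate if it is closer."""
        if self.predecessor is None or self.between(
            candidate.id, self.predecessor.id, self.id
        ):
            self.predecessor = candidate

# Hypothetical ring: node 40 joins node 10, then node 25 joins.
a, b, c = Node(10), Node(40), Node(25)
b.join(a)
for _ in range(3):
    for node in (a, b):
        node.stabilize()
c.join(a)
for _ in range(3):
    for node in (a, b, c):
        node.stabilize()
```

After a few rounds the successor pointers form the ring 10 → 25 → 40 → 10, even though node 25 only set its own successor when it joined. The other pointers are repaired by the periodic steps.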

[Figure: nodes n, p, s]

• Each node keeps a list of r-hop successors
• If stabilize notices that the successor has failed, replace it
• Application can replicate to r-hop successors
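A minimal sketch of how a successor list could be used, assuming each node stores its next r successors in ring order. The helper names, the liveness check, and the node ids are hypothetical.

```python
def first_live_successor(successor_list, is_alive):
    """Return the first successor in the list that is still alive, or None."""
    for node in successor_list:
        if is_alive(node):
            return node
    return None

def replica_set(successor_list, r):
    """Nodes the application could replicate a key to: the next r successors."""
    return successor_list[:r]

# Hypothetical node with r = 3 successors; node 14 has failed.
successors = [14, 21, 32]
alive = {21, 32}
new_successor = first_live_successor(successors, lambda n: n in alive)
```

Here the failed successor 14 is skipped and node 21 becomes the new successor. If the application had replicated node 14's keys to the next successors, node 21 would already hold a copy.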
