CompSci 514: Computer Networks
Lecture 21-2: From BitTorrent to BitTyrant

Problem Statement

[Figure: Server]

• One-to-many content distribution
  – Millions of clients downloading from the same server

Evolving Solutions

• Observation: duplicate copies of data are sent
• Solutions
  – IP multicast
  – End system multicast
  – Content distribution networks, e.g. Akamai
  – P2P cooperative content distribution
    • BitTorrent etc.

IP multicast

• End systems join a multicast group
• Routers set up a multicast tree
• Packets are duplicated and forwarded to multiple next hops at routers
• Multicast pros and cons

End system multicast

• End systems, rather than routers, organize into a tree and forward and duplicate packets
• Pros and cons

Content distribution networks

• Akamai
  – Works well but is expensive and requires infrastructure support

Peer-to-Peer Cooperative Content Distribution

• Use the clients' uplink bandwidth

• New problem: incentives for cooperation, or how to motivate clients to upload

The Gnutella approach

• All nodes are true peers
• A peer is the publisher, the uploader, and the downloader
• No single point of failure
• Efficiency and scalability issue:
  – File searches span a large number of nodes, generating lots of traffic
• Integrity (i.e. content pollution) issue:
  – Anyone can claim to publish valid content
  – No guarantee of quality of objects
• Incentive issue:
  – No incentives for cooperation (free-riding in Gnutella)

Outline

• Problem of Content Distribution
• The BitTorrent approach
• BitTyrant

BitTorrent overview

[Figure: Tracker, Seeder, and Leechers A, B, and C, with labels 1, 2, 3]

• File is divided into chunks (e.g. 256 KB)
• SHA-1 hashes of all the pieces are included in the .torrent file for integrity checks
• A chunk is divided into sub-pieces to improve efficiency
• Seeders have all chunks of the file
• Leechers have some or no chunks of the file

BitTorrent overview
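The per-piece integrity check can be sketched in a few lines of Python. This is an illustration only, not any client's actual code: `piece_hashes`, `verify_piece`, and `PIECE_SIZE` are hypothetical names, and the 256 KB piece size is just the example value from the slide.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # example chunk size from the slide (an assumption, not a fixed constant)


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Split data into fixed-size pieces and return one SHA-1 digest per piece."""
    return [
        hashlib.sha1(data[offset:offset + piece_size]).digest()
        for offset in range(0, len(data), piece_size)
    ]


def verify_piece(index: int, piece: bytes, expected: list[bytes]) -> bool:
    """Check a downloaded piece against the digest published for that index."""
    return hashlib.sha1(piece).digest() == expected[index]


if __name__ == "__main__":
    content = bytes(range(256)) * 4096      # 1 MiB of sample data, i.e. 4 pieces
    published = piece_hashes(content)       # what the .torrent file would carry
    piece0 = content[:PIECE_SIZE]
    print(len(published), verify_piece(0, piece0, published))
```

A downloader that receives a corrupted piece would see `verify_piece` return `False` and could request that piece again from another peer.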

[Figure: Tracker, Seeder, and Leechers A, B, and C, with labels 1, 2, 3]

• File is divided into chunks (e.g. 256 KB)
• SHA-1 hashes of all the pieces are included in the .torrent file for integrity checks
• The .torrent file has the address of a tracker
• The tracker tracks all downloaders

Terminology

• Seeder: peer with the entire file
  – Original seed: the first seed
• Leecher: peer that is downloading the file
  – A fairer term might have been "downloader"
• Sub-piece: further subdivision of a piece
  – The unit for requests is a sub-piece
  – But a peer uploads only after assembling a complete piece

BitTorrent overview

[Figure: the Tracker's view of the swarm: Leecher A, Leecher B, Leecher C, Seeder]

• Clients (seeders or leechers) contact the tracker
• The tracker has a complete view of the swarm

BitTorrent overview

[Figure: the Tracker's view of the swarm and a client's partial view]

• The tracker sends a partial view to clients
• Clients connect to peers in their partial view

BitTorrent overview
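A toy model of the tracker's role may help here. The sketch below assumes the partial view is a uniformly random subset of the swarm; the slides do not say how the subset is chosen, and the `Tracker` class, `announce` method, and `view_size` default are hypothetical names and values for illustration.

```python
import random


class Tracker:
    """Toy tracker: keeps the complete swarm, hands each client a partial view."""

    def __init__(self, view_size: int = 50):
        self.view_size = view_size       # assumed cap on peers returned per request
        self.swarm: set[str] = set()     # complete view of the swarm

    def announce(self, peer: str) -> list[str]:
        """Register `peer` and return a random subset of the other known peers."""
        others = sorted(self.swarm - {peer})
        self.swarm.add(peer)
        return random.sample(others, min(self.view_size, len(others)))


if __name__ == "__main__":
    tracker = Tracker(view_size=2)
    for name in ["Seeder", "Leecher A", "Leecher B", "Leecher C"]:
        print(name, "->", tracker.announce(name))
```

Each client then opens connections only to the peers in the list it received, so no client needs the full membership of the swarm.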

[Figure: Tracker, Seeder, and Leechers A, B, and C, with labels 1, 2, 3]

• Every 10 sec, seeders sample their peers' download rates
• Seeders unchoke the 4-10 fastest interested downloaders

BitTorrent overview

[Figure: Tracker, Seeder, and Leechers A, B, and C, with labels 1, 2, 3]

• Nodes announce their available chunks to their peers
• Leechers request chunks from their peers (locally rarest-first)

BitTorrent overview

[Figure: Tracker, Seeder, and Leechers A, B, and C, with label 1]

• Rate-based tit-for-tat
  – Every 30 sec, leechers optimistically unchoke 1-2 peers
  – Optimistic unchoking helps discover other, faster peers and prompts them to reciprocate
  – It also bootstraps new peers that have no data to upload

BitTorrent overview

[Figure: Tracker, Seeder, and Leechers A, B, and C, with label 1]

• Rate-based tit-for-tat
  – Every 30 sec, leechers optimistically unchoke 1-2 peers
  – Every 10 sec, leechers sample their peers' upload rates

BitTorrent overview
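One way to picture a leecher's rate-based tit-for-tat round is the sketch below. It is a simplification under stated assumptions, not the reference client's code: the `Choker` class and its constants are hypothetical, the slot counts are single values picked from the ranges on the slides, and the 30-second optimistic period is modelled as every third 10-second round.

```python
import random

REGULAR_SLOTS = 4           # slides give 4-10; 4 is an assumed choice
OPTIMISTIC_SLOTS = 1        # slides give 1-2
ROUNDS_PER_OPTIMISTIC = 3   # 30 s optimistic period / 10 s sampling period


class Choker:
    """Decides, once per 10 s round, which peers a leecher unchokes."""

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.round_no = 0
        self.optimistic = []

    def run_round(self, upload_rates, interested):
        """Return the set of peers to unchoke for the next round.

        upload_rates: dict peer -> rate at which it uploaded to us last round
        interested:   set of peers that currently want data from us
        """
        # Regular slots go to the fastest interested uploaders (tit-for-tat).
        ranked = sorted(interested, key=lambda p: upload_rates.get(p, 0.0), reverse=True)
        regular = ranked[:REGULAR_SLOTS]
        # Keep the current optimistic pick only while it is still interested.
        self.optimistic = [p for p in self.optimistic if p in interested]
        # Every third round, pick a fresh optimistic peer among the rest.
        if self.round_no % ROUNDS_PER_OPTIMISTIC == 0:
            rest = ranked[REGULAR_SLOTS:]
            self.optimistic = self.rng.sample(rest, min(OPTIMISTIC_SLOTS, len(rest)))
        self.round_no += 1
        return set(regular) | set(self.optimistic)  # everyone else stays choked
```

A peer that uploads nothing can only be served through the optimistic slot, which is what lets newcomers bootstrap and what gives faster peers a chance to be discovered.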

[Figure: Tracker, Seeder, and Leechers A, B, and C, with labels 3, 2]

• Rate-based tit-for-tat
  – Every 30 sec, a client optimistically unchokes 1-2 peers
  – Every 10 sec, a leecher samples its peers' upload rates
  – The leecher unchokes the 4-10 fastest interested uploaders
  – The leecher chokes all other peers

Q: Why does this algorithm encourage cooperation?

Scheduling: Choosing pieces to request

• Rarest-first: look at all pieces at all peers, and request the piece that is owned by the fewest peers
  – Increases diversity in the pieces downloaded
    • Avoids the case where a node and each of its peers have exactly the same pieces; increases throughput
  – Increases the likelihood that all pieces are still available even if the original seed leaves before any one node has downloaded the entire file

Start time scheduling

• Random First Piece:
  – When a peer starts to download, request a random piece
    • So as to assemble the first complete piece quickly
    • Then participate in uploads
  – When the first complete piece is assembled, switch to rarest-first

Choosing pieces to request
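The two piece-selection policies, random first piece and then rarest-first, can be combined in one small function. This is a minimal sketch under assumptions: `next_piece` is a hypothetical helper, peers' holdings are modelled as plain sets of piece indices, and ties among equally rare pieces are broken at random.

```python
import random
from collections import Counter


def next_piece(have, peer_pieces, rng=random):
    """Pick the next piece index to request, or None if no peer has anything new.

    have:        set of piece indices we have already assembled
    peer_pieces: dict peer -> set of piece indices that peer has announced
    """
    # Count, for every piece we still need, how many peers own it.
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces - have)
    if not counts:
        return None
    if not have:
        # Random first piece: nothing assembled yet, so any available piece will do.
        return rng.choice(sorted(counts))
    # Rarest-first: request a piece owned by the fewest peers.
    rarest = min(counts.values())
    return rng.choice(sorted(i for i, c in counts.items() if c == rarest))


if __name__ == "__main__":
    peers = {"A": {0, 1, 2}, "B": {0, 1, 2}, "C": {2, 3}}
    print(next_piece(set(), peers))   # some piece in 0..3
    print(next_piece({0}, peers))     # piece 3, owned by one peer only
```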

• End-game mode:
  – When requests have been sent for all sub-pieces, (re)send requests to all peers
  – To speed up completion of the download
  – Cancel requests for sub-pieces that have already been downloaded

Outline

• Cooperative Content Distribution
• The BitTorrent approach
• BitTyrant

BitTyrant: a strategic BT client

• Question: can a strategic peer game BT to significantly improve its download performance for the same level of upload contribution?

• Conclusion: incentives do not build robustness. Strategic peers can gain significantly in performance.

Key ideas

• Observations:
  – A peer stays unchoked as long as it is among the fastest peer set
  – A high-capacity peer splits its upload rate equally among the peers in its active set
  – Altruism comes from unnecessary contributions
• Strategies:
  – Maximize per-connection download bandwidth
  – Maximize the number of reciprocating peers
  – Do not upload more than needed for reciprocation

BitTyrant strategy

[Figure: peer p with rates u_p and d_p]

• Rank peers by d_p/u_p
• Unchoke in decreasing order of peer ranking until upload capacity is saturated
• Assumption: data is always available; download is not limited by data scarcity

Challenges

• Determining u_p
  – Initialized with the expected equal split capacity obtained from measurement
  – Periodically update it
    • Increase multiplicatively if the peer does not reciprocate
    • Decrease if the peer does
• Estimating d_p
  – Measure from download rate
  – Estimate from the peer's announced block-available rate U: U/ActiveSize
• Sizing the neighborhood
  – Request as fast as possible

Figure 8: Left: The expected download performance of a client with 300 KB/s upload capacity for increasing active set size. Right: The performance-maximizing active set size for peers of varying rate. The strategic maximum is linear in upload capacity, while the reference implementation of BitTorrent suggests active size ∼ √rate. Although several hundred peers may be required to maximize throughput, most trackers return fewer than 100 peers per request.

...rocation probability to equal split rate.

Figure 8 is for a single strategic peer and suggests that strategic high capacity peers can benefit much more by manipulating their active set size. Our example peer with upload capacity 300 KB/s realizes a maximum download throughput of roughly 450 KB/s. However, increasing reciprocation probability via active set sizing is extremely sensitive: throughput falls off quickly after the maximum is reached. Further, it is unclear if active set sizing alone would be sufficient to maximize reciprocation in an environment with several strategic clients.

These challenges suggest that any a priori active set sizing function may not suffice to maximize download rate for strategic clients. Instead, they motivate the dynamic algorithm used in BitTyrant that adaptively modifies the size and membership of the active set and the upload bandwidth allocated to each peer (see Figure 9).

In both BitTorrent and BitTyrant, the set of peers that will receive data during the next TFT round is decided by the unchoke algorithm once every 10 seconds. BitTyrant differs from BitTorrent as it dynamically sizes its active set and varies the sending rate per connection. For each peer p, BitTyrant maintains estimates of the upload rate required for reciprocation, u_p, as well as the download throughput, d_p, received when p reciprocates. Peers are ranked by the ratio d_p/u_p and unchoked in order until the sum of u_p terms for unchoked peers exceeds the upload capacity of the BitTyrant peer.

The rationale underlying this unchoke algorithm is that the best peers are those that reciprocate most for the least number of bytes contributed to them, given accurate information regarding u_p and d_p. Implicit in the strategy are the following assumptions and characteristics:

• The strategy attempts to maximize the download rate for a given upload budget. The ranking strategy corresponds to the value-density heuristic for the knapsack problem. In practice, the download benefit (d_p) and upload cost (u_p) are not known a priori. The update operation dynamically estimates these rates and, in conjunction with the ranking strategy, optimizes download rate over time.
• BitTyrant is designed to tap into the latent altruism in most swarms by unchoking the most altruistic peers. However, it will continue to unchoke peers until it exhausts its upload capacity even if the marginal utility is sub-linear. This potentially opens BitTyrant itself to being cheated, a topic we return to later.
• The strategy can be easily generalized to handle concurrent downloads from multiple swarms. A client can optimize the aggregate download rate by ordering the d_p/u_p ratios of all connections across swarms, thereby

For each peer p, maintain estimates of expected download performance d_p and upload required for reciprocation u_p.

Initialize u_p and d_p assuming the bandwidth distribution in Figure 2.
  d_p is initially the expected equal split capacity of p.
  u_p is initially the rate just above the step in the reciprocation probability.

Each round, rank order peers by the ratio d_p/u_p and unchoke those of top rank until the upload capacity is reached.

  $\frac{d_0}{u_0}, \frac{d_1}{u_1}, \frac{d_2}{u_2}, \frac{d_3}{u_3}, \frac{d_4}{u_4}, \ldots$   choose $k$ such that $\sum_{i=0}^{k} u_i \le \mathrm{cap}$

At the end of each round, for each unchoked peer:
  If peer p does not unchoke us: u_p ← (1 + δ) u_p
  If peer p unchokes us: d_p ← observed rate.
  If peer p has unchoked us for the last r rounds: u_p ← (1 − γ) u_p

Figure 9: BitTyrant unchoke algorithm
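The unchoke algorithm of Figure 9 can be sketched in Python as follows. This is an illustrative reading of the figure, not the BitTyrant client's code: `PeerEstimate`, `choose_unchoked`, and `update` are hypothetical names, and the default values of `delta`, `gamma`, and `r` are placeholders because the excerpt above does not give them.

```python
from dataclasses import dataclass


@dataclass
class PeerEstimate:
    d: float         # d_p: expected download rate when p reciprocates
    u: float         # u_p: upload rate believed necessary for reciprocation (> 0)
    streak: int = 0  # consecutive rounds in which p has unchoked us


def choose_unchoked(peers, cap):
    """Rank peers by d_p/u_p and unchoke in order while the sum of u_p fits in cap."""
    ranked = sorted(peers, key=lambda name: peers[name].d / peers[name].u, reverse=True)
    chosen, used = [], 0.0
    for name in ranked:
        if used + peers[name].u > cap:
            break                      # next peer would exceed the upload budget
        chosen.append(name)
        used += peers[name].u
    return chosen


def update(est, unchoked_us, observed_rate=0.0, delta=0.2, gamma=0.1, r=3):
    """End-of-round update for one peer that we unchoked (placeholder constants)."""
    if not unchoked_us:
        est.u *= 1 + delta             # not reciprocated: offer more next time
        est.streak = 0
        return
    est.d = observed_rate              # reciprocated: record what we actually got
    est.streak += 1
    if est.streak >= r:
        est.u *= 1 - gamma             # reliably reciprocated: try offering less


if __name__ == "__main__":
    peers = {
        "a": PeerEstimate(d=100, u=10),
        "b": PeerEstimate(d=50, u=25),
        "c": PeerEstimate(d=90, u=30),
    }
    print(choose_unchoked(peers, cap=45))   # ranks a, c, b and stops before b
```

Ranking by d_p/u_p is the value-density heuristic the excerpt mentions: each peer is an item with benefit d_p and cost u_p, and the upload capacity is the knapsack size.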

USENIX Association NSDI ’07: 4th USENIX Symposium on Networked Systems Design & Implementation 7

BitTyrant Modelling

• Reciprocation probability:
  – Probability that a peer will select a client with a given equal-split rate as a downloader in the next unchoking round

• Active set: peers that upload to us
• Neighborhood: peers that we connect to

Notations

Expected Reciprocation Probability

Expected Download Rate

Summary

• BitTorrent
  – Swarm-based distribution
  – Tit-for-tat for incentives

• BitTyrant: a selfish client
  – Do not upload faster than necessary
  – Estimate the minimum upload rates needed to warrant unchoking