<<

Overview

• Web 15-441 • Protocol interactions 15-441 Computer Networking 15-641 • Caching • Cookies • Consistent hashing Lecture 16: Delivering Content • Peer-to-peer Web and Peer-Peer • CDN Peter Steenkiste • Video

Fall 2014 www.cs.cmu.edu/~prs/15-441-F14

2

Web history Internet Traffic History

• 1945: Vannevar Bush, “As we may think”, Atlantic 100000 Monthly, July, 1945. • Describes the idea of a distributed hypertext system. 10000 All Fixed 1000 • A “memex” that mimics the “web of trails” in our minds. Mobile • 1989: Tim Berners-Lee (CERN) writes internal proposal 100

to develop a distributed hypertext system 10

• Connects “a web of notes with links”. PByte/month 1 • Intended to help CERN physicists in large projects and manage information 0.1 • 1990: TBL writes graphical browser for Next machines 0.01

0.001 • 1992-1994: NCSA/Mosaic/Netscape browser release Year

3 4

1 Typical Workload (Web Pages) HTTP 0.9/1.0

• Multiple (typically small) objects per page • One request/response per TCP connection • File sizes •Lots of small objects • Simple to implement • Heavy-tailed means & TCP • Short transfers are very hard on TCP • Pareto distribution for tail • 3-way handshake • Multiple connection setups  three-way handshake • Lognormal for body of distribution • Lots of slow starts each time • Embedded references • Extra connection state • Several extra round trips added to transfer • Number of embedded objects also Pareto • Many slow starts – low throughput because of small -k Pr(X>x) = (x/xm) window • This plays havoc with performance. Why? • Never leave slow start for short transfers • Loss recovery is poor when windows are small • Solutions? • Lots of extra connections • Increases server state/processing

5 6

Single Transfer Example Improving Performance

Client Server • Multiple concurrent connections (Netscape) 0 RTT SYN • Benefits are mixed: more state, timeouts, … opens TCP SYN 1 RTT • Multiplex multiple transfers onto one TCP connection connection ACK Client sends HTTP request DAT (HTTP 1.1) ACK Server reads from for HTML DAT disk • Also allow pipelined transfers, i.e., multiple outstanding requests 2 RTT FIN ACK • How to identify requests/responses Client parses HTML FIN • Delimiter  Server must examine response for delimiter string Client opens TCP ACK connection SYN • Content-length and delimiter  Must know size of transfer in SYN advance 3 RTT ACK • Block-based transmission  send in multiple length delimited Client sends HTTP request DAT blocks for image Server reads from ACK disk • Store-and-forward  wait for entire response and then use 4 RTT content-length DAT Image begins to arrive • Solution  use existing methods and close connection otherwise

7 8

2 Persistent Connection Solution Other Problems

Server • Serialized transmission but first bytes may be most useful • May be better to get the 1st 1/4 of all images than one Client complete image (e.g., progressive JPEG) 0 RTT DAT • Can “packetize” transfer over TCP, e.g., range requests Client sends HTTP request ACK Server reads from for HTML DAT disk • Application specific solution to transport protocol 1 RTT ACK problems. :( Client parses HTML DAT • Could fix TCP so it works well with multiple Server reads from Client sends HTTP request simultaneous connections ACK disk for image DAT • More difficult to deploy

2 RTT • Persistent connections can significantly increase state on the server Image begins to arrive • Allow server to close HTTP session • Client can no longer send requests

9 10

Web Proxy Caches No Caching Example (1)

Assumptions • Average object size = 100,000 bits origin • User configures browser: Web origin • Avg. request rate from institution’s servers browser to origin servers = 15/sec accesses via cache server public • Delay from institutional router to • Browser sends all HTTP Proxy Internet requests to cache any origin server and back to router server = 2 sec • Object in cache: cache client returns object Consequences • Else cache requests object • Utilization on LAN = 15% 1.5 Mbps access link from origin server, then • Utilization on access link = 100% returns object to client • Total delay = Internet delay + access institutional network delay + LAN delay 10 Mbps LAN = 2 sec + minutes + milliseconds client origin server

11 12

3 No Caching Example (2) With Caching Example (3)

Possible solution Install cache • Increase bandwidth of access link origin origin • Suppose hit rate is .4 to, say, 10 Mbps servers servers Consequence • Often a costly upgrade public • 40% requests will be satisfied almost public Internet immediately (say 10 msec) Internet Consequences • 60% requests satisfied by origin server • Utilization on LAN = 15% • Utilization of access link reduced to 60%, • Utilization on access link = 15% resulting in negligible delays • Total delay = Internet delay + access 10 Mbps • Weighted average of delays 1.5 Mbps delay + LAN delay access link = .6*2 sec + .4*10msecs < 1.3 secs access link = 2 sec + msecs + msecs institutional institutional network network 10 Mbps LAN 10 Mbps LAN

institutional cache

13 14

HTTP Caching Problems

• Clients often cache documents • Fraction of HTTP objects that are cacheable is dropping • Challenge: update of documents • Why? • If-Modified-Since requests to check • Major exception? • HTTP 0.9/1.0 used just date • This problem will not go away • HTTP 1.1 has an opaque “entity tag” (could be a file signature, • Dynamic data  stock prices, scores, web cams etc.) as well • CGI scripts  results based on passed parameters • When/how often should the original be checked for • Other less obvious examples changes? • SSL  encrypted data is not cacheable • Check every time? • Most web clients don’t handle mixed pages well many generic objects transferred with SSL • Check each session? Day? Etc? • Cookies  results may be based on past data • Use Expires header • Hit metering  owner wants to measure # of hits for revenue, etc. • If no Expires, often use Last-Modified as estimate • What will be the end result?

15 16

4 Cookies: Keeping “state” Cookies: Keeping “State”

client Amazon server

Many major Web sites use Cookie file usual http request msg server cookies usual http response + creates ID Four components: Example: ebay: 8734 Set-cookie: 1678 1678 for user 1) Cookie header line in the • Susan accesses Internet HTTP response message always from the same PC Cookie file usual http request msg 2) Cookie header line in HTTP • She visits a specific e- amazon: 1678 cookie- request message cookie: 1678 commerce site for the first time ebay: 8734 specific 3) Cookie file kept on user’s • When initial HTTP requests usual http response msg action host and managed by user’s arrives at the site, the site one week later: browser creates a unique ID and creates 4) Back-end database at Web an entry in a backend database usual http request msg Cookie file cookie- site for that ID cookie: 1678 amazon: 1678 specific ebay: 8734 usual http response msg action

17 18

Overview Distributing Load across Servers

• Web • Given document XYZ, we need to choose a • Consistent hashing server to use • Peer-to-peer • E.g., in a data center • CDN • Suppose we use simple hashing: modulo of a • Video hash of the name of the document • Number servers from 1…n • Place document XYZ on server (XYZ mod n) • What happens when a servers fails? n  n-1 • Same if different people have different measures of n • Why might this be bad?

19 20

5 Consistent Hash: Goals Consistent Hash – Example

• Construction • “view” = subset of all hash buckets that are 0 • Assign each of C hash buckets to 14 candidate locations random points on mod 2n circle, where, hash key size = n. Bucket • Correspond to a real server 12 4 • Map object to random position on • Desired features unit interval • Load – all hash buckets have a similar number • Hash of object = closest bucket 8 of objects assigned to them • Monotone  addition of bucket does not cause • Smoothness – little impact on hash bucket movement between existing buckets contents when buckets are added/removed • Spread & Load  small set of buckets that lie • Spread – small set of hash buckets that may near object hold an object regardless of views • Balance  no bucket is responsible for large number of objects 21 22

Consistent Hashing: Ring Consistent Hashing Example

• Use consistent has to map both keys and nodes to an m-bit identifier Rule: A key is stored at its successor: node with next higher or equal ID in the same (metric) identifier space • For example, use SHA-1 hashes IP=“198.10.10.1” 0 K5 • Node identifier: SHA-1 hash of IP address IP=“198.10.10.1” SHA-1 ID=123 N123 K20

• Key identifier: SHA-1 hash of key Circular 7-bit SHA-1 ID=60 Key=“LetItBe” K101 ID space N32 • Also need “rule” for assigning keys to nodes • For example: “closest”, higher, lower, ..

N90 Key=“LetItBe” K60

24 23

6 Consistent Hashing Properties Finer Grain Load Balancing

• Load balance: all nodes receive roughly the same • Redirector knows all server IDs number of keys • It can also track approximate “load” for more • For N nodes and K keys, with high probability precise load balancing • Each node holds at most (1+)K/N keys • Provided that K is large compared to N • Need to define load and be able to track it • When server is added, it receives its initial work load from • To balance load: “neighbors” on the ring • W = Hash(URL, ip of s ) for all i • “Local” operation: no other servers are affected i i • Sort W from high to low • Similar property when a server is removed i • Find first server with low enough load • Benefits and drawbacks?

25 26

Consistent Hashing Overview Used in Many Contexts

• Distribute load across servers in a data center • Web • The redirector sits in data center • Consistent hashing • Peer-to-peer • Finding storage cluster for an object in a CDN uses centralized knowledge • Motivation • Architectures • Why? • • Can use consistent hashing in the cluster • Skype • Consistent hashing can also be used in a • CDN distributed setting • Video • P2P systems can use it find files

27 28

7 Scaling Problem P2P System

• Millions of clients  server and network meltdown

• Leverage the resources of client machines (peers) • Computation, storage, bandwidth

29 30

Why p2p? Common P2P Framework

• Harness lots of spare capacity • Common Primitives: • 1 Big Fast Server: 1Gbit/s, $10k/month++ • Join: how to I begin participating? • 2,000 cable modems: 1Gbit/s, $ ?? • Publish: how do I advertise my file? • 1M end-hosts: Uh, wow. • Search: how to I find a file? • Capacity grows with the number of users! • Fetch: how to I retrieve a file? • Build very large-scale, self-managing systems • Search tends to be the most challenging: • Same techniques useful for companies and p2p apps • Needles vs. Haystacks • E.g., Akamai’s 14,000+ nodes, Google’s 100,000+ nodes • Searching for top 40, or an obscure punk track from 1981 that • Many differences to consider nobody ever heard of? • Servers versus arbitrary nodes • Search expressiveness: Whole word? Regular • Hard state (backups!) versus soft state (caches) expressions? File names? Attributes? Whole-text? … • Security, fairness, freeloading, .. • E.g., p2p or p2p google?

31 32

8 Searching What is (was) out there?

Central Flood Super- Route N2 node N1 N3 flood Whole Gnutella Key=“title” Internet Value=MP3 data… ? File Client Publisher Lookup(“title”) Chunk DHTs N4 N6 N5 Based (bytes, eDonkey not 2000 chunks)

33 34

The Solution Space Napster: Centralized Database

• Centralized Database • Operational from 1999 to 2001 • Napster • Peaked at 1.5 million simultaneous users • Query Flooding • Gnutella • Intelligent Query Flooding • KaZaA • Join: on startup, client contacts central • Swarming server • BitTorrent • Publish: reports list of files to central server • Structured Overlay Routing • Search: query the server => return someone • Distributed Hash Tables that stores the requested file • Fetch: get the file directly from peer

35 36

9 Napster: Publish Napster: Search

123.2.0.18

insert(X, search(A) 123.2.21.23) --> ... Fetch 123.2.0.18

Publish Query Reply

I have X, Y, and Z! Where is file A? 123.2.21.23

37 38

Napster: Discussion Next Topic...

• Pros: • Centralized Database • Simple • Napster • Search scope is O(1) • Query Flooding • Gnutella • Controllable (pro or con?) • Intelligent Query Flooding • Cons: • KaZaA • Server maintains O(N) State • Swarming • Server does all processing • BitTorrent • Structured Overlay Routing • Single point of failure • Distributed Hash Tables

39 40

10 Gnutella: Flooding Gnutella: Search

• Released in 2000 and improved over time; still alive I have file A. • Many clients available, very popular I have file A. • Join: on startup, client contacts a few other nodes; these become its “neighbors” • Publish: no need Reply • Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender. • TTL limits propagation • Fetch: get the file directly from peer Query

Where is file A?

41 42

Gnutella: Discussion KaZaA: Query Flooding

• Pros: • First released in 2001 and still used today • Fully de-centralized • Also very popular • Search cost distributed • Processing @ each node permits powerful search • Join: on startup, client contacts a “supernode” ... may at semantics some point become one itself • Cons: • Publish: send list of files to supernode • Search: send query to supernode, supernodes flood query • Search scope is O(N) amongst themselves. • Search time is O(???) • Fetch: get the file directly from peer(s); can fetch • Nodes leave often, network unstable simultaneously from multiple peers • TTL-limited search works well for haystacks. • For scalability, does NOT search every node. • May have to re-issue query later

43 44

11 KaZaA: Network Design KaZaA: File Insert

“Super Nodes”

insert(X, 123.2.21.23) ...

Publish

I have X!

123.2.21.23

45 46

KaZaA: File Search KaZaA: Fetching

search(A) • More than one node may have requested file... --> • How to tell? 123.2.22.50 • Must be able to distinguish identical files • Not necessarily same filename • Same filename not necessarily same file... search(A) 123.2.22.50 • Use Hash of file --> • KaZaA uses UUHash: fast, but not secure Query Replies 123.2.0.18 • Alternatives: MD5, SHA-1 • How to fetch? Where is file A? • Get bytes [0..1000] from A, [1001...2000] from B • Alternative: Erasure Codes 123.2.0.18

47 48

12 KaZaA: Discussion Stability and Superpeers

• Pros: • Why superpeers? • Tries to take into account node heterogeneity: • Query consolidation • Bandwidth • Many connected nodes may have only a few files • Host Computational Resources • Propagating a query to a sub-node would take more b/w than • Host Availability (?) answering it yourself • Rumored to take into account network locality • Caching effect • Cons: • Requires network stability • Mechanisms easy to circumvent • Still no real guarantees on search scope or search time • Superpeer selection is time-based • Similar behavior to gnutella, but better. • How long you have been on is a good predictor of how long you will be around

49 50

The Solution Space Napster: History

• Centralized Database • 1999: Sean Fanning launches Napster • Napster • Query Flooding • Peaked at 1.5 million simultaneous users • Gnutella • Jul 2001: Napster shuts down • Intelligent Query Flooding • KaZaA • Swarming • BitTorrent • Structured Overlay Routing • Distributed Hash Tables

• More on Thursday …

51 66

13 Gnutella: History KaZaA: History

• In 2000, J. Frankel and T. Pepper from Nullsoft • In 2001, KaZaA created by Dutch company Kazaa released Gnutella BV • Soon many other clients: Bearshare, , • Single network called FastTrack used by other LimeWire, etc. clients as well: Morpheus, giFT, etc. • In 2001, many protocol enhancements including “ultrapeers” • Eventually protocol changed so other clients could no longer talk to it • Very popular file network today with >10 million users (number varies)

67 68

14