15-441 Computer Networking Lecture 16-17: Delivering Content

15-441 Computer Networking Lecture 16-17: Delivering Content

Overview • Web 15-441 • Protocol interactions 15-441 Computer Networking 15-641 • Caching • Cookies • Consistent hashing Lecture 16-17: Delivering Content • Peer-to-peer Part 1 • CDN Peter Steenkiste • Video Fall 2013 www.cs.cmu.edu/~prs/15-441-F13 2 Web history Web history (cont) • 1945: Vannevar Bush, “As we may think”, Atlantic • 1992 Monthly, July, 1945. • NCSA server released • 26 WWW servers worldwide • Describes the idea of a distributed hypertext system. • 1993 • A “memex” that mimics the “web of trails” in our minds. • Marc Andreessen releases first version of NCSA Mosaic Mosaic • 1989: Tim Berners-Lee (CERN) writes internal proposal version released for (Windows, Mac, Unix). to develop a distributed hypertext system • Web (port 80) traffic at 1% of NSFNET backbone traffic. • Connects “a web of notes with links”. • Over 200 WWW servers worldwide. • Intended to help CERN physicists in large projects share and • 1994 manage information • Andreessen and colleagues leave NCSA to form "Mosaic • 1990: Tim BL writes graphical browser for Next Communications Corp" (Netscape). machines. 3 4 1 Typical Workload (Web Pages) HTTP 0.9/1.0 • Multiple (typically small) objects per page • One request/response per TCP connection • File sizes •Lots of small objects • Simple to implement • Heavy-tailed means & TCP • Disadvantages • Pareto distribution for tail • 3-way handshake • Multiple connection setups three-way handshake each time • Lognormal for body of distribution • Lots of slow starts • Several extra round trips added to transfer • Embedded references • Extra connection state • Multiple slow starts • Number of embedded objects also Pareto -k Pr(X>x) = (x/xm) • This plays havoc with performance. Why? • Solutions? 5 6 Must Consider Network Layer - Single Transfer Example Example: TCP 1: TCP connections need to be set up Client Server 0 RTT SYN • “Three Way Handshake”: Client opens TCP SYN 1 RTT connection ACK Client sends HTTP request DAT Client Server ACK Server reads from for HTML DAT disk SYN (Synchronize) 2 RTT FIN ACK Client parses HTML FIN Client opens TCP ACK SYN/ACK (Synchronize + Acknowledgement) connection SYN SYN ACK 3 RTT ACK Client sends HTTP request DAT …Data… for image Server reads from ACK disk 4 RTT 2: TCP transfers start slowly and then ramp up the DAT bandwidth used – bandwidth is variable Image begins to arrive 7 Lecture 3: Applications 8 2 Performance Issues Netscape Solution • Short transfers are hard on TCP • Mosaic (original popular Web browser) fetched one object • Stuck in slow start at a time! • Loss recovery is poor when windows are small • Netscape uses multiple concurrent connections to improve • Lots of extra connections response time • Increases server state/processing • Different parts of Web page arrive independently • Servers also hang on to connection state after the • Can grab more of the network bandwidth than other connection is closed users • Why must server keep these? • Does not necessarily improve response time • Tends to be an order of magnitude greater than # of active connections, why? • TCP loss recovery ends up being timeout dominated because windows are small 9 10 Persistent Connection Solution Persistent Connection Solution • Multiplex multiple transfers onto one TCP Server connection Client • Also allow pipelined transfers 0 RTT DAT • How to identify requests/responses Client sends HTTP request ACK Server reads from for HTML DAT disk • Delimiter Server must examine response for delimiter string 1 RTT • Content-length and delimiter Must know size of transfer in ACK advance Client parses HTML DAT Server reads from Client sends HTTP request • Block-based transmission send in multiple length delimited ACK DAT disk blocks for image • Store-and-forward wait for entire response and then use 2 RTT content-length Image begins to arrive • Solution use existing methods and close connection otherwise 11 12 3 Other Problems Web Proxy Caches • Serialized transmission • Much of the useful information in first few bytes • User configures browser: Web origin • May be better to get the 1st 1/4 of all images than one complete accesses via cache server image (e.g., progressive JPEG) • Browser sends all HTTP Proxy • Can “packetize” transfer over TCP, e.g., range requests requests to cache server • Application specific solution to transport protocol • Object in cache: cache client problems. :( returns object • Could fix TCP so it works well with multiple • Else cache requests object simultaneous connections from origin server, then • More difficult to deploy returns object to client • Persistent connections can significantly increase state on the server client • Allow server to close HTTP session origin server • Client can no longer send requests 13 14 No Caching Example (1) No Caching Example (2) Assumptions Possible solution • Average object size = 100,000 bits origin • Increase bandwidth of access link origin • Avg. request rate from institution’s servers to, say, 10 Mbps servers browser to origin servers = 15/sec • Often a costly upgrade public public • Delay from institutional router to Internet Internet any origin server and back to router Consequences = 2 sec • Utilization on LAN = 15% Consequences • Utilization on access link = 15% • Utilization on LAN = 15% 1.5 Mbps • Total delay = Internet delay + access 10 Mbps access link • Utilization on access link = 100% delay + LAN delay access link • Total delay = Internet delay + access institutional = 2 sec + msecs + msecs institutional delay + LAN delay network network 10 Mbps LAN 10 Mbps LAN = 2 sec + minutes + milliseconds 15 16 4 With Caching Example (3) HTTP Caching Install cache origin • Clients often cache documents • Suppose hit rate is .4 servers • Challenge: update of documents Consequence • 40% requests will be satisfied almost public • If-Modified-Since requests to check immediately (say 10 msec) Internet • HTTP 0.9/1.0 used just date • 60% requests satisfied by origin server • HTTP 1.1 has an opaque “entity tag” (could be a file signature, • Utilization of access link reduced to 60%, etc.) as well resulting in negligible delays • Weighted average of delays 1.5 Mbps • When/how often should the original be checked for = .6*2 sec + .4*10msecs < 1.3 secs access link changes? institutional • Check every time? network 10 Mbps LAN • Check each session? Day? Etc? • Use Expires header • If no Expires, often use Last-Modified as estimate institutional cache 17 18 Example Cache Check Request Example Cache Check Response GET / HTTP/1.1 HTTP/1.1 304 Not Modified Accept: */* Date: Tue, 27 Mar 2001 03:50:51 GMT Accept-Language: en-us Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 Accept-Encoding: gzip, deflate OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24 If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT Connection: Keep-Alive If-None-Match: "7a11f-10ed-3a75ae4a" Keep-Alive: timeout=15, max=100 User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT ETag: "7a11f-10ed-3a75ae4a" 5.0) Host: www.intel-iris.net Connection: Keep-Alive 19 20 5 Problems Caching Proxies – Sources for Misses • Well over 50% of all HTTP objects are not cacheable • Capacity • Why? • How large a cache is necessary or equivalent to infinite • Major exception? • On disk vs. in memory typically on disk • This problem will not go away • Compulsory • Dynamic data stock prices, scores, web cams • First time access to document • CGI scripts results based on passed parameters • Non-cacheable documents • Other less obvious examples • CGI-scripts • SSL encrypted data is not cacheable • Personalized documents (cookies, etc) • Most web clients don’t handle mixed pages well many generic • Encrypted data (SSL) objects transferred with SSL • Consistency • Cookies results may be based on past data • Document has been updated/expired before reuse • Hit metering owner wants to measure # of hits for revenue, etc. • Conflict • What will be the end result? • No such misses 21 22 Cookies: Keeping “state” Cookies: Keeping “State” client Amazon server Many major Web sites use Cookie file usual http request msg server cookies usual http response + creates ID Four components: Example: ebay: 8734 Set-cookie: 1678 1678 for user 1) Cookie header line in the • Susan accesses Internet HTTP response message always from the same PC Cookie file usual http request msg 2) Cookie header line in HTTP • She visits a specific e- amazon: 1678 cookie- request message cookie: 1678 commerce site for the first time ebay: 8734 specific 3) Cookie file kept on user’s • When initial HTTP requests usual http response msg action host and managed by user’s arrives at the site, the site one week later: browser creates a unique ID and creates 4) Back-end database at Web an entry in a backend database usual http request msg Cookie file cookie- site for that ID cookie: 1678 amazon: 1678 specific ebay: 8734 usual http response msg action 23 24 6 Overview Distributing Load across Servers • Web • Given document XYZ, we need to choose a • Consistent hashing server to use • Peer-to-peer • Suppose we use simple hashing: modulo of a • CDN hash of the name of the document • Video • Number servers from 1…n • Place document XYZ on server (XYZ mod n) • What happens when a servers fails? n n-1 • Same if different people have different measures of n • Why might this be bad? 25 26 Consistent Hash: Goals Consistent Hash – Example • Construction • “view” = subset of all hash buckets that are visible 0 • Assign each of C hash buckets to 14 • Correspond to a real server random points on mod 2n circle, where, hash

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    24 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us