An Analysis of Internet Content Delivery Systems

S. Saroiu, K. P. Gummadi, R. J. Dunn, S. D. Gribble, and H. M. Levy
Department of Computer Science & Engineering, University of Washington

2008. 11. 27
Kyusik Kim
Wireless Networks Lab

Contents
  • Introduction
  • Overview of Content Delivery Systems
      • WWW, content delivery networks (CDNs), peer-to-peer systems (P2P)
  • Measurement Methodology
  • High-Level Data Characteristics
  • Detailed Content Delivery Characteristics
      • objects, clients, servers, scalability of P2P systems
  • The Potential Role of Caching in CDNs and P2P
  • Conclusion

Introduction (1)
  • Purpose
      • Examining content delivery traffic
          • HTTP web
          • Akamai content delivery network
          • P2P file-sharing systems: Gnutella, Kazaa
      • Providing a detailed characterization and comparison of content delivery systems
          • analyzing a nine-day trace (incoming and outgoing Internet traffic at the University of Washington)
          • over 500 million transactions and over 20 terabytes of HTTP data

Introduction (2)
  • Results
      • quantify the extent to which P2P traffic has overwhelmed web traffic as a leading consumer of Internet bandwidth
      • the differences in the characteristics of the objects being transferred
      • the impact of the two-way nature of P2P communication
      • the ways in which P2P systems are not scaling, despite their explicitly scalable design

Content Delivery Systems - WWW
  • The world-wide web (WWW)
      • Simple client-server architecture
      • Uses the HTTP protocol
      • Most web objects are small (5~10 KB)
      • The number of web objects is enormous and rapidly growing

Content Delivery Systems - CDNs
  • Content delivery networks (CDNs)
      • Dedicated collection of servers located strategically across the wide-area Internet
      • Content providers contract with commercial CDNs to host and distribute content
      • Content is replicated across the wide area, so it is highly available
      • Clients can access topologically nearby replicas with low latency
      • DNS redirection causes overhead

CDN example - Akamai
  [figure]

Content Delivery Systems - P2P
  • Peer-to-peer systems (P2P)
      • Peers collaborate to form a distributed system for exchanging content
      • Peers behave as both servers and clients
  • Architecture types of P2P systems
      • logically centralized architecture: Napster
      • fully distributed architecture: Gnutella, Freenet
      • hybrid architecture (some peers are elected as supernodes): Kazaa

P2P example
  [figure: distributed, centralized, and hybrid architectures]

Measurement Methodology
  • Passive network monitoring
      • collects traces of traffic between the University of Washington (UW) and the Internet
  • UW connects to its ISPs via two border routers
      • one is for outbound traffic, the other is for inbound traffic
      • both are fully connected to four switches
      • each switch has a monitoring port that sends copies of incoming and outgoing packets to the monitoring hosts
  • Traffic types
      • HTTP traffic: WWW, Akamai, Kazaa, Gnutella
      • non-HTTP TCP traffic: Kazaa and Gnutella search traffic

Classifying traffic types
  • Akamai: HTTP traffic on port 80, 8080, 443
  • WWW: HTTP traffic on port 80, 8080, 443
  • Gnutella: HTTP traffic on port 6346, 6347
      • including file transfer
      • excluding search and control traffic
  • Kazaa: HTTP traffic on port 1214
      • including file transfer
      • excluding search and control traffic
  • P2P: Gnutella + Kazaa
  • Non-HTTP TCP traffic: any other TCP traffic
      • NNTP, SMTP, HTTP traffic to other ports
      • traffic from other P2P systems
      • control and search traffic on Gnutella and Kazaa
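
The port-based classification above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the names `TrafficClass`, `classify`, and `is_p2p` are hypothetical, and the `served_by_akamai` flag is an assumption, since the slide lists the same ports for Akamai and WWW and does not say how the two are told apart.

```python
from enum import Enum


class TrafficClass(Enum):
    AKAMAI = "Akamai"
    WWW = "WWW"
    GNUTELLA = "Gnutella"
    KAZAA = "Kazaa"
    NON_HTTP_TCP = "Non-HTTP TCP"


# Port sets taken from the "Classifying traffic types" slide.
WEB_PORTS = {80, 8080, 443}
GNUTELLA_PORTS = {6346, 6347}
KAZAA_PORTS = {1214}


def classify(port: int, is_http: bool, served_by_akamai: bool = False) -> TrafficClass:
    """Map one TCP flow to a traffic class.

    `served_by_akamai` is an assumed input (for example, a lookup against a
    list of known Akamai servers); the slide does not specify this step.
    """
    if not is_http:
        # Search and control traffic, NNTP, SMTP, and so on.
        return TrafficClass.NON_HTTP_TCP
    if port in WEB_PORTS:
        return TrafficClass.AKAMAI if served_by_akamai else TrafficClass.WWW
    if port in GNUTELLA_PORTS:
        return TrafficClass.GNUTELLA
    if port in KAZAA_PORTS:
        return TrafficClass.KAZAA
    # HTTP traffic to any other port is counted as non-HTTP TCP traffic.
    return TrafficClass.NON_HTTP_TCP


def is_p2p(traffic_class: TrafficClass) -> bool:
    """P2P is defined on the slide as Gnutella + Kazaa."""
    return traffic_class in (TrafficClass.GNUTELLA, TrafficClass.KAZAA)
```

For example, `classify(1214, is_http=True)` yields `TrafficClass.KAZAA`, while a Kazaa search message (not HTTP) on the same port falls into `TrafficClass.NON_HTTP_TCP`.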

High-Level Data Characteristics

HTTP trace summary
  • Exporting 16.65 TB, importing 3.44 TB
  • UW is a net provider rather than a consumer of HTTP data
  • P2P systems account for a large percentage of the bytes exported and of the total bytes transferred

TCP Bandwidth
  • All systems show a typical diurnal cycle
  • Bandwidth consumption (share of TCP bytes)
      • Akamai: 0.2%
      • Gnutella: 6.04%
      • WWW: 14.3%
      • Kazaa: 36.99%
      • other TCP-based protocols: 43%

UW Client and Server TCP Bandwidth
  • Figure (a): Inbound data bandwidth (web and P2P downloads by UW clients)
      • WWW peaks in the middle of the day
      • Kazaa peaks late at night
  • Figure (b): Outbound data bandwidth (web and P2P uploads from UW servers)
      • Peak Kazaa bandwidth dominates WWW by a factor of 3
      • External Kazaa clients consume 7.6 times more bandwidth than UW Kazaa clients

Content Types Downloaded by UW Clients
  • GIF & JPEG images
      • 42% of requests, only 16.3% of the bytes transferred
  • AVI & MPEG
      • 0.41% of requests, 29.3% of the bytes transferred
  • Comparison with measurements from a 1999 study
      • HTML traffic: -43%, GIF & JPG traffic: -59%
      • AVI & MPG traffic: +400%, MP3 traffic: +300%

Summary
  • The balance of HTTP traffic has changed dramatically over the last several years
      • P2P traffic is overtaking WWW traffic as the largest contributor to HTTP bytes transferred
  • Although UW is a large publisher of web documents, P2P traffic makes the University an even larger exporter of data
  • The mixture of object types downloaded by UW clients has changed
      • video and audio account for a substantially larger fraction of traffic than three years ago

Detailed Content Delivery Characteristics

Objects
  • Object size: P2P (median: 4 MB) > WWW (median: 2 KB) & Akamai
  • Top bandwidth-consuming objects
      • Gnutella: a relatively large number of objects account for a large portion of the transferred bytes

Top 10 bandwidth consuming objects
  • WWW: the top 10 objects are a mix of extremely small objects
  • Akamai: 8 out of the top 10 objects are larger and unpopular
  • Kazaa: export objects are larger than import objects

Downloaded bytes by object type
  [figure]

Top UW bandwidth consuming clients
  • Figure (a): Top bandwidth-consuming UW clients
      • WWW: the top 200 clients (0.5%) account for 13% of WWW traffic
      • Kazaa: the top 200 clients (4%) account for 50% of Kazaa traffic
  • Figure (b)
      • Kazaa: 200 clients account for 20% of the total HTTP bytes downloaded (worst offender)

Clients - Request rates over time
  • Figure (a): WWW + Akamai request rates
      • inbound request rate peaks at 1100 requests per second
      • outbound request rate peaks under 200 requests per second
  • Figure (b): Kazaa request rates
      • request rate: two orders of magnitude lower than the web
      • median object size: three orders of magnitude higher than the web
      • as a result, Kazaa consumes more bandwidth

Client - Concurrent HTTP transactions
  • Despite the order-of-magnitude request-rate advantage of WWW over Kazaa, the number of simultaneously open Kazaa connections is about twice the number of simultaneously open WWW + Akamai connections
  • Tue 0:00
      • Kazaa generates only 23 requests per second
      • yet it has up to almost 1000 open requests at a time, due to its long transfers

Top UW-internal servers to external clients
  • Figure (a): Top bandwidth-consuming UW servers
      • Gnutella: the first 10 servers account for all of the bytes
      • WWW: steep curve; several major servers provide documents to the web
      • Kazaa: the top 334 servers account for 80% of the bytes
  • Figure (b)
      • WWW: 20 servers account for 20% of all HTTP bytes output
      • Kazaa: 170 servers account for 50% of all HTTP bytes output
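
Statements such as "the top 334 servers account for 80% of the bytes" are points on a cumulative byte-share curve. A minimal sketch of that computation follows; the function name `top_n_share` and the sample byte counts are hypothetical and are not taken from the trace.

```python
from __future__ import annotations


def top_n_share(bytes_by_host: dict[str, int], n: int) -> float:
    """Fraction of all bytes attributable to the n heaviest hosts."""
    total = sum(bytes_by_host.values())
    if total == 0:
        return 0.0
    # Rank hosts by bytes transferred, heaviest first, and keep the top n.
    heaviest = sorted(bytes_by_host.values(), reverse=True)[:n]
    return sum(heaviest) / total


# Hypothetical per-server byte counts, for illustration only.
sample = {"server-a": 500, "server-b": 300, "server-c": 150, "server-d": 50}
for n in range(1, len(sample) + 1):
    print(f"top {n}: {top_n_share(sample, n):.0%} of bytes")
```

Plotting this share against `n` for each system gives curves of the kind the slides describe: a steep curve means a few hosts carry most of the load.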

The UW-external servers to internal clients
  • Figure (a)
      • WWW: 938 external servers account for 50% of the bytes
      • Kazaa: 600 external servers account for 26% of the bytes
  • Figure (b)
      • Kazaa: the top 500 external Kazaa peers account for 10% of the bytes
      • WWW: the top 500 servers account for 22% of the bytes

Scalability of P2P Systems
  • Can P2P systems like Kazaa scale in environments such as UW?
      • Every peer in a P2P system consumes bandwidth in both directions
      • Each new P2P client becomes a server for the entire P2P structure
      • Kazaa objects are huge, so a small number of peers can consume an enormous amount of the total network bandwidth
      • The bandwidth cost of each P2P peer is 90 times that of a web client
  • It seems questionable whether any organization can support a service with these characteristics

Summary
  • Peer-to-peer now accounts for over three quarters of HTTP traffic
  • A small number of P2P users consume a disproportionately high fraction of bandwidth
  • While the P2P request rate is quite low, the transfers last a long time
  • While the design of P2P overlay structures focuses on spreading the workload for scalability, our measurements show that a small number of servers take the majority of the burden

The Potential Role of Caching in CDNs
  • Akamai requests achieve an 88% ideal hit rate and a 50% practical hit rate, noticeably higher than WWW requests (77% and 36%)
  • Our analysis shows that Akamai requests are more skewed towards the most popular documents than WWW requests are
  • We know that most bytes fetched from Akamai are from images and videos
  • This implies that much of Akamai's content is in fact static and could be cached
  • We would expect widely deployed proxy caches to significantly reduce the need for a separate content delivery network
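
To make the distinction between ideal and practical hit rates concrete, here is a rough trace-driven sketch. It rests on assumptions the slides do not spell out: "ideal" is read as an unbounded cache in which every object is cacheable, and "practical" as the same cache restricted to objects flagged cacheable. The `Request` type, the `hit_rates` function, and the sample trace are hypothetical.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class Request(NamedTuple):
    object_id: str
    size_bytes: int
    cacheable: bool = True  # assumed per-object flag, e.g. derived from HTTP headers


def hit_rates(trace: Iterable[Request], respect_cacheability: bool) -> tuple[float, float]:
    """Return (request hit rate, byte hit rate) for an unbounded, initially empty cache."""
    seen: set[str] = set()
    requests = hits = total_bytes = hit_bytes = 0
    for req in trace:
        requests += 1
        total_bytes += req.size_bytes
        eligible = req.cacheable or not respect_cacheability
        if eligible and req.object_id in seen:
            hits += 1
            hit_bytes += req.size_bytes
        elif eligible:
            seen.add(req.object_id)  # first reference is a compulsory miss
    if requests == 0:
        return (0.0, 0.0)
    byte_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return (hits / requests, byte_rate)


# Hypothetical trace: one cacheable image and one uncacheable page, each fetched twice.
trace = [
    Request("logo.gif", 10),
    Request("logo.gif", 10),
    Request("news.html", 30, cacheable=False),
    Request("news.html", 30, cacheable=False),
]
ideal = hit_rates(trace, respect_cacheability=False)
practical = hit_rates(trace, respect_cacheability=True)
```

Under these assumptions the gap between the two results comes entirely from uncacheable objects, which is one way to read the slide's observation that static content such as images and video makes Akamai traffic cache well.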

The Potential Role of Caching in P2P
  • The potential impact of caching in P2P systems may exceed the benefits seen in the web
  • Inbound cache byte hit rate = 35%, outbound cache byte hit rate = 85%
  • For outbound traffic, the hit rate increases with client population size (1,000 clients: 40%; 500,000 clients: 85%)
  • A reverse P2P cache saves the most bandwidth

Conclusion
  • P2P traffic now accounts for the majority of HTTP bytes transferred
  • P2P documents are three orders of magnitude larger than web objects, leading to a 1000-fold increase in transfer time
  • A small number of extremely large objects account for an enormous fraction of observed P2P traffic
  • A small number of clients and servers are responsible for the majority of the traffic we saw in the P2P systems
  • Each P2P client creates a significant bandwidth load in both directions
