
An Analysis of Content Delivery Systems

S. Saroiu, K. P. Gummadi, R. J. Dunn, S. D. Gribble, and H. M. Levy

Department of Computer Science & Engineering, University of Washington

2008. 11. 27

Kyusik Kim

Wireless Networks Lab.

Contents

 Introduction

 Overview of Content Delivery Systems

 WWW, content delivery networks (CDNs), peer-to-peer systems (P2P)

 Measurement Methodology

 High-Level Data Characteristics

 Detailed Content Delivery Characteristics

 Objects, clients, servers, scalability of P2P systems

 The Potential Role of Caching in CDNs and P2P

 Conclusion

Introduction (1)

 Purpose

 Examining content delivery traffic

 HTTP web

 Akamai

 P2P file-sharing systems

 Gnutella, Kazaa

 Providing a detailed characterization and comparison of content delivery systems

 analyzing a nine-day trace of incoming and outgoing Internet traffic at the University of Washington

 over 500 million transactions and over 20 terabytes of HTTP data

Introduction (2)

 Results quantify

 the extent to which p2p traffic has overwhelmed web traffic as a leading consumer of Internet bandwidth

 the differences in the characteristics of objects being transferred

 the impact of the two-way nature of p2p communication

 the ways in which p2p systems are not scaling, despite their explicitly scalable design

Content Delivery Systems - WWW

 The world-wide web (WWW)

 Simple client-server architecture

 Uses the HTTP protocol

 Most web objects are small (5~10KB)

 The number of web objects is enormous and rapidly growing

Content Delivery Systems - CDNs

 Content delivery networks (CDNs)

 Dedicated collection of servers located strategically across the wide-area Internet

 Content providers contract with commercial CDNs to host and distribute content

 Content is replicated across the wide area

 highly available

 Clients can access topologically nearby replicas with low latency

 DNS redirection causes overhead

CDN example - Akamai

Content Delivery Systems - P2P

 Peer-to-peer systems (P2P)

 Peers collaborate to form a distributed system

 exchanging content

 Peers behave as both servers and clients

 Architecture types of P2P systems

 logically centralized architecture

 fully distributed architecture

 Gnutella,

 hybrid architecture

 some peers are elected as supernodes

P2P example

[Figure: distributed, centralized, and hybrid P2P architectures]

Measurement Methodology

 Passive network monitoring

 collects traces of traffic between the University of Washington (UW) and the Internet

 UW connects to its ISPs via two border routers

 one for outbound traffic, the other for inbound traffic

 both are fully connected to four switches

 each switch has a monitoring port

 sending copies of incoming and outgoing packets to the monitoring hosts

 Traffic types

 HTTP traffic

 WWW, Akamai, Kazaa, Gnutella

 non-HTTP TCP traffic

 Kazaa and Gnutella search traffic

Classifying traffic types

Akamai: HTTP traffic on port 80, 8080, 443

WWW: HTTP traffic on port 80, 8080, 443

Gnutella: HTTP traffic on port 6346, 6347 (including file transfer; excluding search and control traffic)

Kazaa: HTTP traffic on port 1214 (including file transfer; excluding search and control traffic)

P2P: Gnutella + Kazaa

Non-HTTP TCP traffic: any other TCP traffic (NNTP, SMTP, HTTP traffic to other ports; traffic from other P2P systems; control and search traffic on Gnutella and Kazaa)
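The port-based rules in the table can be sketched as a small lookup. This is only an illustration of the table as listed on the slide, not the authors' tooling. The function name `classify` and the `served_by_akamai` flag are hypothetical: the slide lists the same ports for Akamai and WWW and does not say how the two are told apart, so the flag stands in for whatever check the study used.

```python
WEB_PORTS = {80, 8080, 443}
GNUTELLA_PORTS = {6346, 6347}
KAZAA_PORTS = {1214}


def classify(port: int, is_http: bool, served_by_akamai: bool = False) -> str:
    """Map one TCP transaction to a traffic class from the slide's table.

    `served_by_akamai` is an assumed input; the slide does not describe
    how Akamai traffic is separated from ordinary WWW traffic.
    """
    if not is_http:
        # NNTP, SMTP, other P2P systems, Gnutella/Kazaa search and control
        return "Non-HTTP TCP"
    if port in WEB_PORTS:
        return "Akamai" if served_by_akamai else "WWW"
    if port in GNUTELLA_PORTS:
        return "Gnutella"
    if port in KAZAA_PORTS:
        return "Kazaa"
    # The table groups HTTP traffic to other ports with non-HTTP TCP traffic
    return "Non-HTTP TCP"


def is_p2p(traffic_class: str) -> bool:
    """P2P is defined in the table as Gnutella + Kazaa."""
    return traffic_class in {"Gnutella", "Kazaa"}
```

For example, `classify(1214, True)` gives `"Kazaa"`, while HTTP on an unlisted port such as 8000 falls into the non-HTTP TCP bucket, as the table specifies.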

High-Level Data Characteristics

HTTP trace summary

 Exporting 16.65 TB, importing 3.44 TB

 UW is a net provider rather than consumer of HTTP data

 P2P systems account for a large percentage of the bytes exported and the total bytes transferred
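The net-provider claim follows directly from the two totals on this slide. A minimal arithmetic sketch, using only those two figures (the variable names are illustrative):

```python
# Totals from the HTTP trace summary slide, in terabytes
exported_tb = 16.65
imported_tb = 3.44

total_tb = exported_tb + imported_tb            # all HTTP bytes in the trace
export_to_import = exported_tb / imported_tb    # how many times more UW sends than receives
export_share = exported_tb / total_tb           # fraction of HTTP bytes that leave UW

print(f"total: {total_tb:.2f} TB")
print(f"export/import ratio: {export_to_import:.2f}")
print(f"export share: {export_share:.1%}")
```

UW exports about 4.8 times as many HTTP bytes as it imports, so roughly 83% of the HTTP bytes in the trace are outbound.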

TCP Bandwidth

 All systems show a typical diurnal cycle

 Bandwidth consumption

 Akamai - 0.2% of TCP bytes

 Gnutella - 6.04% of TCP bytes

 WWW - 14.3% of TCP bytes

 Kazaa - 36.99% of TCP bytes

 other TCP-based protocols - 43% of TCP bytes

UW Client and Server TCP Bandwidth

 Figure (a) - Inbound data bandwidth (web and P2P downloads by UW clients)

 WWW peaks in the middle of the day

 Kazaa peaks late at night

 Figure (b) - Outbound data bandwidth (web and P2P traffic from UW servers)

 Peak Kazaa bandwidth exceeds WWW by a factor of 3

 External Kazaa clients consume 7.6 times more bandwidth than UW Kazaa clients

Content Types Downloaded by UW Clients

 GIF & JPEG images

 42% of requests, only 16.3% of the bytes transferred

 AVI & MPEG

 0.41% of requests, 29.3% of the bytes transferred

 Comparison with measurements from a 1999 study

 HTML traffic: -43%, GIF & JPG traffic: -59%

 AVI & MPG traffic: +400%, MP3 traffic: +300%

Summary

 The balance of HTTP traffic has changed dramatically over the last several years

 P2P traffic has overtaken WWW traffic as the largest contributor to HTTP bytes transferred

 Although UW is a large publisher of web documents, P2P traffic makes the University an even larger exporter of data

 The mixture of object types downloaded by UW clients has changed, with video and audio accounting for a substantially larger fraction of traffic than three years ago

Detailed Content Delivery Characteristics

Objects

 Object size: P2P (median: 4 MB) > WWW (median: 2 KB) & Akamai

 Top bandwidth-consuming objects

 Gnutella

 relatively large number of objects account for a large portion of the transferred bytes

Top 10 bandwidth-consuming objects

 WWW - The top 10 objects are a mix of extremely small objects

 Akamai - 8 out of the top 10 objects are larger and unpopular

 Kazaa - Export objects are larger than import objects

Downloaded bytes by object type

Top UW bandwidth-consuming clients

 Figure (a) – Top Bandwidth Consuming UW Clients

 WWW - Top 200 clients (0.5%) account for 13% of WWW traffic

 Kazaa - Top 200 clients (4%) account for 50% of Kazaa traffic

 Figure (b)

 Kazaa: 200 clients account for 20% of the total HTTP bytes downloaded (worst offender)
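Statements such as "the top 200 clients account for 50% of Kazaa traffic" come from ranking clients by bytes and reading off a cumulative share. A minimal sketch of that calculation follows. The function name `top_k_share` and the sample byte counts are made up for illustration and are not taken from the trace.

```python
def top_k_share(bytes_per_client, k):
    """Fraction of all bytes attributable to the k heaviest clients."""
    total = sum(bytes_per_client)
    if total == 0:
        return 0.0
    # Rank clients from heaviest to lightest and keep the first k
    heaviest = sorted(bytes_per_client, reverse=True)[:k]
    return sum(heaviest) / total


# Illustrative numbers only: five clients with a skewed byte distribution
sample_bytes = [50, 30, 10, 5, 5]
print(top_k_share(sample_bytes, 2))
```

Plotting this share for increasing k gives the kind of cumulative curve shown in Figures (a) and (b).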

Clients - Request rates over time

 Figure (a) - WWW + Akamai request rates

 inbound request rate peaks at 1100 requests per second

 outbound request rate peaks under 200 requests per second

 Figure (b) - Kazaa request rates

 at a high level

 request rate: two orders of magnitude lower than the web

 median object size: three orders of magnitude higher than the web

 as a result, Kazaa consumes more bandwidth

Client - Concurrent HTTP transactions

 Despite the order of magnitude request-rate advantage of WWW over Kazaa

 the number of simultaneous open Kazaa connections is about twice the number of simultaneous open WWW + Akamai connections

 Tue 0:00

 Kazaa generates only 23 requests per second

 up to almost 1000 open requests at a time due to its long transfers
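The link between a low request rate and many open connections can be illustrated with Little's law (open requests = arrival rate x mean time in system). The sketch below applies it to the two Kazaa figures on this slide. Treating these snapshot values as steady-state averages is an assumption, so the result is a rough order-of-magnitude estimate rather than a measured transfer time. The function name is illustrative.

```python
def mean_transfer_seconds(open_requests: float, requests_per_second: float) -> float:
    """Little's law rearranged: W = L / lambda."""
    if requests_per_second <= 0:
        raise ValueError("request rate must be positive")
    return open_requests / requests_per_second


# Slide figures for Kazaa at Tue 0:00: about 1000 open requests, 23 requests/s
kazaa_estimate = mean_transfer_seconds(1000, 23)
print(f"implied mean time a Kazaa request stays open: {kazaa_estimate:.0f} s")
```

Under these assumptions a Kazaa request stays open for roughly 43 seconds on average, which is how a modest request rate can still hold close to 1000 connections open at once.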

Top UW-internal servers to external clients

 Figure (a) - Top bandwidth-consuming UW servers

 Gnutella: all of the bytes come from the first 10 servers

 WWW: steep curve; several major servers provide documents to the web

 Kazaa: 80% of the bytes come from the top 334 servers

 Figure (b)

 WWW: 20 servers supply 20% of all HTTP bytes output

 Kazaa: 170 servers supply 50% of all HTTP bytes output

The UW-external servers to internal clients

 Figure (a)

 WWW: 938 external servers supply 50% of the bytes

 Kazaa: 600 external servers supply 26% of the bytes

 Figure (b)

 Kazaa: top 500 external Kazaa peers supply 10% of the bytes

 WWW: top 500 servers supply 22% of the bytes

Scalability of P2P Systems

 Can P2P systems like Kazaa scale in environments such as UW?

 Every peer in a P2P system consumes bandwidth in both directions

 Each new P2P client added becomes a server for the entire P2P structure

 Kazaa objects are huge, so a small number of peers can consume an enormous amount of total network bandwidth

 The bandwidth cost of each P2P peer is 90 times that of a web client

 It seems questionable whether any organization can support a service with these characteristics

Summary

 Peer-to-peer traffic accounts for over three quarters of HTTP traffic

 A small number of P2P users consume a disproportionately high fraction of bandwidth

 While the P2P request rate is quite low, the transfers last a long time

 While the design of P2P overlay structures focuses on spreading the workload for scalability, our measurements show that a small number of servers take on the majority of the burden

The Potential Role of Caching in CDNs

 Akamai requests achieve an 88% ideal hit rate and a 50% practical hit rate, noticeably higher than WWW requests (77% and 36%)

 Our analysis shows that Akamai requests are more skewed towards the most popular documents than are WWW requests

 We know that most bytes fetched from Akamai are from images and video

 This implies that much of Akamai's content is in fact static and could be cached

 We would expect that widely deployed proxy caches would significantly reduce the need for a separate content delivery network

The Potential Role of Caching in P2P

 The potential impact of caching in P2P systems may exceed the benefits seen in the web

 Inbound cache byte hit rate = 35%, outbound cache byte hit rate = 85%

 Hit rate increases with client population size for outbound traffic (1,000 clients: 40%; 500,000 clients: 85%)

 A reverse P2P cache saves the most bandwidth
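A byte hit rate like the ones quoted above can be estimated by replaying a request trace through a simulated cache. The sketch below assumes an idealized cache: unlimited capacity, every object cacheable, and objects never changing. It is an illustration of the metric, not the authors' simulator. The trace format of `(object_id, size_in_bytes)` pairs is also an assumption.

```python
def ideal_byte_hit_rate(trace):
    """Byte hit rate of an infinite cache over (object_id, size_in_bytes) requests.

    The first request for an object is a miss that fills the cache;
    every later request for the same object is a hit.
    """
    seen = set()
    hit_bytes = 0
    total_bytes = 0
    for object_id, size in trace:
        total_bytes += size
        if object_id in seen:
            hit_bytes += size
        else:
            seen.add(object_id)
    return hit_bytes / total_bytes if total_bytes else 0.0


# Illustrative trace only: one large object requested three times, one small object once
sample_trace = [("a", 100), ("b", 50), ("a", 100), ("a", 100)]
print(ideal_byte_hit_rate(sample_trace))
```

Because every repeat request counts as a hit, a larger client population sharing the cache produces more repeats per object, which is consistent with the hit rate rising with population size.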

Conclusion

 P2P traffic now accounts for the majority of HTTP bytes transferred

 P2P documents are three orders of magnitude larger than web objects, leading to a 1000-fold increase in transfer time

 A small number of extremely large objects account for an enormous fraction of observed P2P traffic

 A small number of clients and servers are responsible for the majority of the traffic we saw in the P2P systems

 Each P2P client creates a significant bandwidth load in both directions

Q & A

Thank you!!!!!
