
arXiv:1605.07704v1 [cs.MM] 25 May 2016

Understanding the Smartrouter-based Peer CDN for Video Streaming

Ming Ma 1, Zhi Wang 2, Ke Su 1, and Lifeng Sun 1
1 Department of Computer Science and Technology, Tsinghua University
Tsinghua National Laboratory for Information Science and Technology
2 Graduate School at Shenzhen, Tsinghua University
{mm13@mails., wangzhi@sz., suk14@mails., sunlf@}tsinghua.edu.cn

Abstract: Recent years have witnessed a new video delivery paradigm: the smartrouter-based video delivery network, which is enabled by smartrouters deployed at users' homes, together with the conventional video servers deployed in the datacenters. Recently, ChinaCache, a large content delivery network (CDN) provider, and Youku, a video service provider using smartrouters to assist video delivery, announced their cooperation to create a new paradigm of content delivery based on householders' network resources [1]. This new paradigm is different from the conventional peer-to-peer (P2P) approach, because such dedicated smartrouters are inherently operated by the centralized video service providers in a coordinative manner. It is intriguing to study the strategies, performance and potential impact on the content delivery ecosystem of such peer CDN systems. In this paper, we study the Youku peer CDN, which has deployed over 300K smartrouter devices for its video streaming. In our measurement, 78K videos were investigated and 3TB traffic has been analyzed, over controlled routers and players. Our contributions are the following measurement insights. First, a global replication and caching strategy is essential for the peer CDN systems, and proactively scheduling replication and caching on a daily basis can guarantee their performance. Second, such peer CDN deployment itself can form an effective Quality of Service (QoS) monitoring sub-system, which can be used for fine-grained user request redirection. We also provide our analysis on the performance issues and potential improvements to the peer CDN systems.

I. INTRODUCTION

To meet the skyrocketing growth of bandwidth requirement for data-intensive video streaming, and reduce the monetary cost for renting expensive resources in conventional content delivery networks (CDNs), video service providers today are deploying their peer CDNs to make use of network and storage resources at individuals' homes for content delivery. Youku, one of the largest online video providers, has deployed over 300K smartrouters at its users' homes in less than one year [2], expecting to turn a large fraction of its users (250M) of Video-on-Demand (VoD) to such content delivery peer nodes.

In Fig. 1, we plot the content delivery paradigms including conventional CDN, peer-to-peer (P2P) and peer CDN. Compared to the conventional CDN approach, peer CDN employs network resources that are much closer to users, e.g., it can serve as much as 80% of the content requests by peer nodes in hundreds of meters; while compared to the conventional P2P paradigm where users individually cache and serve each other, the nodes in a smartrouter-based peer CDN are closely coordinated by centralized knowledge. For example, in a peer CDN the content provider proactively schedules particular nodes to cache contents, and redirects users to download content from particular smartrouters. Today, such peer CDN providers even start to change the traditional content delivery paradigm, collaborating to satisfy the ever increasing content requests from the edge network [1].

Fig. 1: The system architecture for CDN, P2P, and peer CDN. (Legend: original server, edge server, end user.)

As such a content delivery paradigm can fundamentally change the deployment of content services and applications in the Internet, and even the roles that individuals play, e.g., any individual can become not only a content publisher, but also a hosting service for the corresponding contents generated by herself, it is thus intriguing to investigate the details of the peer CDN paradigm, including its strategies, performance, and limitations.

In this paper, we conduct extensive measurement to study video streaming by the peer CDN of one of the largest video service providers, Youku, which has attracted over 300K users to deploy the smartrouters (namely Youku Router) with 8GB storage at their homes/offices to serve users. The challenges to study the system performance and key strategies are as follows: First, the Youku system is distributed in nature, and it is difficult to have global knowledge of the system using only limited-scale active experiments; Second, the system is a combination of both conventional CDN servers and peer nodes, and it is challenging to identify the target peering nodes we are interested in; Third, incentive mechanisms are deployed in the system, which leads to even more noise in our traffic analysis.

To address these challenges, we design active and passive measurement experiments to study not only its architecture, but also key strategies. On one hand, we deploy controlled clients in different networks (e.g., CERNET, Unicom, etc.) to interact with the peer CDN nodes, and study their behaviors; on the other hand, we deploy controlled smartrouters in different cities and ISPs to passively observe how these nodes are controlled by the whole system, and how they serve others. Our contributions are summarized as follows.
• Based on our measurement studies covering 78K videos, 126 conventional centralized CDN servers and 3M users, we present not only the architecture and protocols, but also the key strategies that affect the performance of a joint CDN and peer CDN system.
• We reveal that the global content replication, replacement and user redirection strategies are essential for the peer CDN paradigm, and proactively scheduling replication and caching on an hourly basis can guarantee the performance of the system. To name a few results: (1) Smartrouters use a frequent content update strategy to meet the user demand, e.g., the median lifespan of chunks cached in the smartrouters is 24.2 hours; (2) Using global knowledge, smartrouters are scheduled by a centralized peer selection mechanism, e.g., based on ISP or location.
• We examine the system performance and observe that: (1) The peer CDN can guarantee the system Quality of Service (QoS), both for the start-up delay and the download speed; (2) Youku peer CDN fully utilizes the smartrouter resources of the edge network, e.g., 80% of the content requests can be served with at least 70% of the data coming from smartrouters. We also propose some possible strategies to improve the system performance for future development.

The rest of the paper is organized as follows. In Sec. II, we introduce our measurement scheme. The video peer CDN architecture and system workflows are presented in Sec. III. The strategies of video placement and peer selection are analyzed in Sec. IV. We evaluate the system performance in Sec. V. We also present our discussion in Sec. VI. Finally we analyze related works in Sec. VII and conclude this paper in Sec. VIII.

II. MEASUREMENT METHODOLOGY

We conduct both passive and active measurement on the peer CDN nodes (i.e., Youku smartrouters) and clients in our study.

A. Measurement by Controlled Peer CDN Nodes

1) Testing Node Deployment: In order to sense the strategies used in Youku peer CDN, we deploy 5 Youku smartrouters in different locations with different ISPs in our measurement study, and observe the interactions between these smartrouters and other CDN nodes. As illustrated in Fig. 2, the 5 routers are denoted as follows:
• R1, R2, which are deployed in Beijing (ISP: CERNET), with public IP address and 80Mbps downlink/uplink bandwidth;
• R3, which is deployed in Beijing (ISP: CERNET), with private IP, based on Network Address Translator (NAT), and 80Mbps downlink/uplink bandwidth;
• R4, which is deployed in Beijing (ISP: China Unicom), with public IP and 4Mbps downlink bandwidth and 512Kbps uplink bandwidth;
• R5, which is deployed in Shenzhen (ISP: CERNET), with public IP address and 80Mbps downlink/uplink bandwidth.

Fig. 2: Deployment of testing peer CDN nodes. (In Beijing: R1, R2: CERNET, public IP; R3: CERNET, private IP; R4: Unicom, public IP. In Shenzhen: R5: CERNET, public IP.)

TABLE I: Hardware information of controlled testing nodes.
Parameter | Information
CPU | 580MHz
RAM | 128MB (DDR2)
OpenWrt | 2.6.36
WiFi Protocol/Channel | IEEE 802.11 b/g/n @ 2.4 GHz
Storage | 8GB TF cards
Price | 26.3 USD (December 2015)

TABLE II: Statistics in our measurement studies.
Time period | 9/15 – 10/30, 2015
Video number | 78,285
Total traffic | 3.1TB
TCP/UDP Packets | 4,438,510,988
Contacted servers | 126
Contacted IPs | 3,155,877
Distinct ASes | 4,015

2) Device Information: We use the ordinary version of Youku smartrouters as our testing nodes, and the hardware information is illustrated in Table I. In particular, each device provides a 580MHz CPU, and the internal storage is one 8GB TF card.

3) Measurement on the Testing Routers: There are two types of measurements on the devices:
• File System Monitoring: By performing root injection (footnote 1: http://openwrt.io/docs/youku/), we are able to log in to these devices via SSH, and monitor

the folders on the devices where video files (chunks and video files are used interchangeably in this paper) are stored to serve users. In Youku peer CDN, all contents are cached as video chunks with a unique content ID, and our experiments cover 78K different videos during the monitoring period (from September 15th to October 30th, 2015). In Sec. IV, we will present the details of the file monitoring results.
• Traffic Monitoring: We monitor the traffic patterns on these devices, using the conventional network utility tools including tcpdump, netstat, etc. Through this monitor, we explore the interaction protocols between the [...] users). Table II illustrates the statistics in our measurement experiments. For instance, the dataset contains 126 unique Youku peer CDN servers and 3M unique IPs.

B. Measurement by Controlled Testing Users

In our study, we actively run controlled testing users to join the system, and measure their interactions with the peer CDN nodes (including both our testing routers and others). To be specific, we act as ordinary users who request a number of videos, and monitor:
• How these requests are served by the system, which can reveal the service workflows.
• How smartrouters are allocated to cache the videos by analyzing the peer lists returned to the users, which can reveal the content deployment strategies used by Youku.

In the next section, we exhibit the system architecture of the Youku peer CDN inferred from our measurement.

III. SYSTEM ARCHITECTURE

Through our measurement study and protocol analysis, we analyze the workflows of the system and verify the functional role of each CDN node connected by our testing routers and testing users. In this section, we infer the architecture used by the Youku peer CDN system for VoD service.

A. Architecture and Workflows

Fig. 3 illustrates the general architecture and workflows of the Youku peer CDN system. There are three basic components in Youku peer CDN:
• Agents: There are three types of agents in the system, including (1) Peer routers: Dedicated smartrouters which can proactively cache and distribute contents; (2) Peer clients: Clients who install the Youku Accelerator and cache watched videos to serve others; (3) Ordinary users: Users who watch videos on Youku.
• Peer CDN Controller: This component decides the primary scheduling strategy, i.e., which videos should be replicated by which peer routers. There are mainly 3 types of servers as follows:
  – Config servers: Peer routers download configuration parameters from Config servers (step A). In our datasets, we trace the HTTP response in an XML file, e.g., one parameter indicates that the peer router will report its partners' information every 1800 seconds.
  – QoS monitor: Peer routers report statistics (step B) to the QoS monitor, including the information of their partner states and their operation problems, i.e., network congestion and software crash. Thus the peer CDN can have a global view of end-to-end QoS using the monitoring mechanism.
  – Scheduling servers: Scheduling servers schedule the content replication (step C) and user redirection (step 3) according to the information monitored. Firstly, smartrouters periodically receive replication tasks from the scheduling servers for downloading new contents to their local storage (step C); Secondly, ordinary users discover the candidate peer lists from the scheduling servers (step 3).
• CDN Infrastructure: The CDN infrastructure takes two responsibilities. The first is as a "back-up" for the peer routers and users. If peers become unstable, users could download contents from edge servers (step 1, 2); the second is to publish video contents, by pushing the latest content to the peer routers (step D, E).

Fig. 3: Architecture of the peer CDN system. (Components: peer CDN control plane with Config server, QoS monitor and Scheduling server; CDN with CDN edge servers and Load balance server; chunk swarm. Workflow steps are labelled A to F and 1 to 4.)

Based on the system components illustrated above, we can understand the cooperation between the Youku CDN infrastructure and the peer CDN control plane.

Table III illustrates the key HTTP requests issued by our testing routers or testing users. In the Youku peer CDN, the video ID in the CDN server, denoted as C_CDN (e.g., “030008080556377E2A51F503BAF2B1CBC532CF-BA95-0858-9BC4-31009B5D3563”), is different from the video ID of the same chunk in the peer routers, denoted as C_peer, which is a 40-byte hex hash value, e.g., “200000004F6078A3F2D2DEE2F60AF524989FAF5”. Thus, the complete video ID in the Youku peer CDN can be denoted as [C_CDN, C_peer]. With this video ID, an original user is able to request the same video content from both the CDN and the P2P network separately.

TABLE III: Key HTTP requests captured from our testing routers and users.
Step | HTTP message | Description
Step A | GET pcdnapi.youku.com/pcdn/sysconf/acc | Routers request a key for config file downloading.
Step A | GET pcdnapi.youku.com/update/config | Routers request the config file to set the operating parameters of the peer router.
Step B | POST pcdnstat.youku.com/iku/log/acc | Routers report their status to the control plane.
Step C | GET /acc/hotspot/cdnurl?rid=C_peer | Routers request to prefetch the chunk C_peer.
Step D, 1 | GET /player/getFlvPath/sid/C_CDN | Routers/Users request the Load balance server to obtain the candidate edge CDN server list.
Step E, 2 | GET /youku/sid/.../C_CDN | Routers/Users download the chunk from edge servers.
Step 3 | GET /getTaddr?f=C_peer | Agents get the candidate peer list.

Next, we present the workflows of video streaming as follows:
• Streaming protocol in Youku: The Youku video CDN adopts the unencrypted HTTP protocol in both signaling exchange and video streaming, while the peer CDN control plane adopts an encrypted TCP-based protocol for signaling between the scheduling servers and agents. It adopts its proprietary UDP-based and TCP-based P2P protocol for the agents to exchange their data.
• When the user starts to watch a video, firstly, she downloads the beginning part of the chunk from the edge servers (step 1, 2); at the same time, she queries the scheduling server for a peer list generated based on the user's location and ISP information (step 3). If suitable peers are detected, the user and the selected peer routers attempt to establish connections with each other and deliver chunks (step 4). Meanwhile, the connection with the edge servers is still held; if the selected peers become slow or unreliable, the CDN edge server can cover this difference. Then the user experience does not suffer from this instability.
• For the content deployment on the peer routers, each peer router periodically obtains the prefetching chunk list (step C), and then downloads the chunks one by one either from the CDN servers (step D, E) or from other peers (step F) based on the scheduling of the peer CDN controller.

B. Content Information

In the Youku peer CDN system, the contents cached in the peer routers are video chunks. Fig. 4 shows the basic information of contents cached in our testing routers, including the chunk duration and size, and these distributions for contents in different routers are consistent. We find that most chunks are 6 min long and the average chunk size is 17MB (hence one Youku smartrouter with the 8GB storage can cache about 430 chunks). This chunk-caching strategy makes the chunks of the same content spread across multiple routers, which achieves the load balancing between routers to serve users.

Fig. 4: Statistics of contents replicated by our testing routers. (a) The CDF of chunk duration (s). (b) The CDF of chunk size (MB).

The video bitrate is another important factor for VoD service. We use the tool ffmpeg to obtain the bitrates of the chunks cached in the smartrouters, and observe 3 types of video bitrates, i.e., 270Kbps (Standard Definition), 600Kbps (High Definition) and 1200Kbps (Super Definition). The higher quality videos are generally partitioned into shorter chunks, which can benefit the fine-grained streaming scheduling.

In the next section, we perform an in-depth analysis of the strategies deployed in the smartrouter-based peer CDN.

IV. STRATEGIES USED IN THE PEER CDN

In this section, we investigate the content deployment and peer selection strategies in the video peer CDN, which can successfully meet the user demands and improve the QoS of peer routers.

A. Traffic Pattern

First, we overview the working condition of our 5 peer routers. In Fig. 5, we depict the upload and download traffic of R1 over 7 days, and the other days and other routers exhibit a similar pattern. Our observations are as follows: (1) The volume of data uploaded by the router is larger than the volume of data downloaded, indicating that smartrouters serve as traffic "amplifiers"; (2) Downlink traffic shows a periodical manner (exactly like the traffic pattern in the conventional video services [3]), indicating that smartrouters can be scheduled to fetch content in a centralized manner.

Fig. 5: Traffic pattern of the peer router R1. (Uplink and downlink traffic in MB, OCT01 to OCT07.)

TABLE IV: The average amount of traffic delivered by testing routers in 1 day.
Router | Upload traffic (GB) | Download traffic (GB)
R1 | 19.90 | 6.10
R2 | 20.99 | 6.23
R3 | 3.21 | 6.05
R4 | 3.78 | 2.67
R5 | 22.03 | 6.84

Our 5 testing routers exhibit similar download and upload patterns. But on account of the different networks they are deployed in, their amount of download and upload data is different. Table IV shows the average amount of traffic delivered by the different routers. R1, R2 and R5 have comparable results, because they are in the same ISP and have the same access bandwidth. For R3, its upload capacity is restricted by the NAT traversal. As R4 is deployed in the worst network condition within these 5 routers, its upload/download traffic is at the minimum level. Note that our 5 smartrouters show a consistent caching strategy, thus we present the patterns of one router in this paper (unless otherwise specified).

B. Content Deployment

Next, we will study the concrete replication strategies used for the peer routers.

1) Content Replication and Replacement: We study the Youku replication strategy by analyzing how and when chunks are downloaded or deleted over time. Fig. 6(a) shows the change of the number of chunks cached in the testing routers R1, R3 and R4 over the week (the following days are similar). We observe that the number of chunks increases gradually at the beginning of the first 2 days, then remains nearly constant, i.e., although every testing router has a capacity of 8GB in its internal TF card, R1, R3 and R4 keep about 427, 370 and 250 chunks at most in their storage, respectively. Thus we conclude that the network condition is a crucial practical factor for Youku to decide the smartrouters' storage utilization ratio.

Fig. 6: The characteristics of chunk replication scheduling. (a) The change of the number of chunks cached in the routers. (b) The number of hourly downloaded/removed chunks. (c) The distribution of chunk lifespan (Median = 24.2 hours, Mean = 29.8 hours).

In Fig. 6(b), we plot the number of chunks downloaded to and removed from R1 in each hour during 7 days. The downloaded chunk number follows a daily pattern, and we can speculate that the Youku centralized controller periodically invokes the peer routers to fetch chunks in fine-grained timeslots (maybe hourly or shorter). Moreover, we count the number of chunks which are downloaded from the CDN edge servers. During this week, 22% of the chunks are downloaded from CDN servers. Thus most of the chunk deployments are finished by the peer routers themselves, which effectively alleviates the load of Youku servers for content deployment.

As for the chunk removal, it is also scheduled by the centralized control, instead of only using cache replacement algorithms (e.g., the Least Recently Used algorithm). For example, 55 chunks are deleted on Oct 1 when there is spare storage capacity.

Another important question about the router caching is the chunk update. Fig. 6(c) plots the distribution of chunk lifespan. The median (resp. mean) chunk lifespan is 24.2 hours (resp. 29.8 hours), i.e., most of the chunks cached in the peer router last about one day. There are plenty of fresh and popular contents published that original users want to watch, but the router storage is much smaller than that of the CDN edge server, which results in this frequent and timely update of the contents in a peer router. Furthermore, the update rate of chunks is similar to the change rate of top ranked videos measured in the VoD systems [3], [4].

In order to figure out whether Youku peer CDN makes a difference between the peer routers by their ISPs or regions when it conducts the content replication, we evaluate the chunk similarity between any two peer routers using the Jaccard similarity index [5], i.e., the size of the intersection of two chunk sets divided by the size of the union of the two chunk sets. In Fig. 7, we calculate the similarity index every 5 min during our measurement period, which is depicted by a box-and-whisker diagram. It shows the average similarity indexes of any two peer routers are between 4.2% and 9.8%, which remains at the same degree even if the peer routers are in different ISPs and locations, indicating the smartrouters are scheduled to cache the videos with the same opportunities.

Fig. 7: Chunk similarity between the 5 routers. (Jaccard similarity index (%) for each pairwise combination of the five peer routers, (R1,R2) to (R4,R5).)

There are two reasons that enforce the real-time replication. Firstly, increasingly hot videos are generated at anytime anywhere, such as social news and user-generated content. Secondly, most of the copyrighted television contents are enforced to release at a prime time every day. Thus content pre-deployment is tough work, even if peer CDN providers can predict which video would be popular.

Fig. 8: The geo-distribution of peers in the different ISPs, including China Unicom, China Telecom and China Mobile. (a) Unicom. (b) Telecom. (c) Mobile.

Fig. 9: The peer distribution of chunks in different locations. (Peer ratio per area: Beijing, Jiangxi, Fujian, Anhui, Jiangsu, Zhejiang, Guangdong, Shanghai, Shandong.)

Fig. 10(a): The ISP distribution of Top-100 IPs. (Ratio of IPs per ISP for R1 to R4: Mobile, CERNET, Railcom, Unicom, Telecom, Others.)

The peer CDN enables prompt global scheduling over millions of peer routers, including pushing newly published videos

to peer routers and dynamically replacing stale contents at these routers. Such prompt and global strategies are enablers for today's frequently changing user interests.

Fig. 10: The distribution of Top-100 IPs in uplink traffic. (b) The Area distribution of Top-100 IPs. (Ratio of IPs per area for R1 to R4: Beijing, Fujian, Guangdong, Hebei, Jiangsu, Liaoning, Neimeng, Shandong, Shanghai, Shanxi.)

2) Peer Router Management from a Global Perspective: A large number of peer routers need an efficient management system for the content deployment and service delivery. In order to figure out the management mechanism and obtain a "snapshot" of the Youku peer CDN from a global perspective, we apply a crawler to track both the peer geo-distribution and the content geo-distribution in the Youku peer CDN.

We sample 70 chunks cached in the testing routers, and design a crawler to collect their peer lists by using these chunk IDs, i.e., the step 3 in Table III. Considering that peers may join and leave the video swarm dynamically, we crawl the candidate peer lists of all the chunks for about 30 min (it does not trigger the DoS attack alert from Youku). We conduct the crawling from 3 dominating ISPs in Beijing and Shenzhen, including China Unicom, China Telecom and China Mobile. After combining the 6 sets, we find 96,870 unique peers in total.

From the results, we observe the peer list request is redirected to different scheduling servers based on where this request comes from. Requests from Unicom, Telecom and Mobile are redirected to the scheduling servers which are also located in Unicom, Telecom and Mobile, respectively, and above 90% of the collected peers are located in the same ISP as the scheduling servers. Then we confirm that Youku peer CDN manages the pool of peers in a distributed mechanism according to the ISP. We further map the IP addresses to the city-level location in the China mainland by querying the nali database (footnote 2: https://github.com/meteoral/Nali). Fig. 8 shows most of the Unicom peers are located in northeast China, while most of the Telecom and Mobile peers are located in southeast China. This distribution follows the common sense of ISP deployment in China.

To explore the content deployment condition with a global view, using the peer dataset crawled from Telecom, Fig. 9 plots a box-and-whisker diagram to show the peer geo-distribution of the 70 chunks. We observe the peer distributions of different chunks with different popularities are similar, e.g., most chunks aggregate in Jiangsu and Zhejiang in accordance with Fig. 8(b), which further verifies that the chunk copies are deployed in the pool of routers with a globally consistent mode.

C. Peer Selection and Download Scheduling

We further investigate the peer selection strategy adopted by Youku, i.e., how a peer router chooses the partners for chunk uploading and downloading. For this purpose, we focus on the daily Top-100 IPs, which communicate with our peer router, ranked by the amount of traffic which they contribute during the peak time (e.g., 8 PM – 11 PM) over the course of our measurement, and we analyze their distribution in terms of their ISPs and geographical locations.
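As a rough illustration of the Top-100 analysis described above, the following Python sketch ranks remote IPs by the bytes exchanged during a peak window and then computes the share of each ISP (or region) among the top N. The record layout, the function names and the IP-to-label lookup table are assumptions made for this example; they are not part of the measurement tooling used in this study.

```python
from collections import Counter, defaultdict


def top_n_ips(flows, n=100):
    """Return the n remote IPs with the largest total byte count.

    `flows` is an iterable of (ip, nbytes) records, e.g. one record per
    connection observed during the peak window (hypothetical layout).
    """
    totals = defaultdict(int)
    for ip, nbytes in flows:
        totals[ip] += nbytes
    # Sort by descending traffic; break ties by IP string for determinism.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [ip for ip, _ in ranked[:n]]


def share_by_label(ips, label_of):
    """Return the fraction of `ips` that falls under each label.

    `label_of` is a hypothetical IP -> label lookup (ISP or region);
    IPs missing from the lookup are grouped under "Others".
    """
    counts = Counter(label_of.get(ip, "Others") for ip in ips)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    # Toy records: (remote IP, bytes uploaded to it in the peak window).
    flows = [("10.0.0.1", 500), ("10.0.0.2", 300),
             ("10.0.0.1", 200), ("10.0.0.3", 100)]
    isp_of = {"10.0.0.1": "Mobile", "10.0.0.2": "Telecom"}
    top = top_n_ips(flows, n=2)
    print(top, share_by_label(top, isp_of))
```

In practice the byte counts would come from the captured traffic traces and the lookup from an IP geolocation database; both are left abstract here.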

We select the top 100 IPs by uplink traffic as representatives to analyze the peer selection. First, we plot the ISP distribution of all selected Top-100 IPs in Fig. 10(a). Although the R1 - R3 routers deployed in CERNET are in the same region and ISP, they can join different peer sets to transfer chunks, e.g., R1 and R3 select most peers/users from the ISP Mobile, whereas R2 uploads chunks mostly to peers/users in the ISP Telecom. Only education organizations and research institutes can access CERNET, which results in fewer routers being deployed in CERNET and no significant ISP barrier between CERNET and other ISPs. Thus our peer routers can communicate with peers from various other ISPs, based on redirection driven by global QoS monitoring. As for R4, it mainly uploads chunks to Unicom peers (89%) to avoid the ISP barrier problem.

The geographical distribution of these Top-100 IPs is presented in Fig. 10(b). We observe that the peers which download contents from R1 - R3 are spread over different locations in relatively uniform proportions, while R4 mainly uploads contents to peers located nearby, i.e., in Beijing and Hebei province.

Based on the analysis above, we summarize our observations as follows: such a peer CDN can form an effective QoS monitoring sub-system. As peer routers are scheduled by a centralized peer selection mechanism using global knowledge, e.g., based on ISP or location, they can be assigned to serve users effectively. Thus a large overall bandwidth can be achieved when peer routers are effectively matched to serve particular users.

V. THE SYSTEM PERFORMANCE

In this section, we examine the system performance of the Youku peer CDN. The primary goals for content providers in pushing content resources to the edge of the network are: (1) to improve the service quality experienced by users by shortening the distance between users and content resources; and (2) to alleviate the bandwidth occupancy costs of the CDN infrastructure by redirecting as many user requests as possible to peer routers. We evaluate the system performance against these two targets.

A. Performance Perceived by Smartrouters

First, we present the peer CDN performance using the traffic captured from the testing routers, based on QoS metrics such as latency and download speed. We choose R1 and R4 as representatives to demonstrate the results (the results of R1, R2, R3 and R5 are similar).

In this paper, we define the latency of a network connection as the duration from the first packet sent by the router to the next packet received by the router. As shown in Fig. 11(a), for R1 we observe an obvious gap between the P2P connections and the R1–Server connections: the average latency for downloading contents from servers is about 1 ms, while the average latency for P2P communications is about 100 ms. For R4, however, there is no significant difference between the latency of P2P connections and that of the R4–Server connections, indicating that the bottleneck is the downlink of R4 (4 Mbps), which results in the long delay.

Fig. 11(b) compares the download speed of: 1) single P2P connections, 2) Router–Server connections, and 3) parallel P2P connections downloading the same chunk, labelled as “peer CDN”. We observe that the download speed of a single P2P connection is lower, but the total peer CDN speed is quite high and can match the speed of downloading contents from CDN servers.

Fig. 11: The QoS performance of R1 and R4. (a) Latency (ms). (b) Download speed (KB/s).

From the analysis results of the smartrouters, we perceive that the peer video CDN should overcome the obstacles of impaired P2P connections, i.e., higher latency and lower download speed, to guarantee the QoS experienced by users.

B. Performance Perceived by Users

In this part, we study the performance experienced by the original users in the peer CDN.

Fig. 12 illustrates the general traffic pattern of a VoD session. The user first requests the video from CDN edge servers and watches the beginning part of the video, which overcomes the start-up delay of the traditional P2P system demonstrated in Fig. 11(a). Then she downloads the chunks from multiple candidate peers in parallel, which guarantees the total download speed experienced by users.

We evaluate the video delivery performance in Beijing under different network environments, including CERNET (80 Mbps), Unicom ADSL (4 Mbps), and Unicom 4G (8 Mbps). We choose 450 videos on the Youku website and perform a series of video sessions based on their popularity given by its website³, which follows a Zipf distribution. For each video session we record the average download speed and the data ratio coming from peers. Fig. 13(a) compares the chunk download speed when 1) peer routers transport more than 60% of the bytes of the chunk, and 2) all bytes are downloaded from CDN edge servers. We observe that the peer CDN has a stable download speed in our experiment. Even on Unicom ADSL and Unicom 4G, the peer CDN achieves a higher download speed than the edge servers.

³ http://index.youku.com/
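The user-side experiment runs video sessions in proportion to video popularity and records, per session, the average download speed and the share of bytes coming from peers. A minimal sketch of that bookkeeping is given below. It is only an illustration under stated assumptions: the Zipf exponent of 1.0, the random seed, and all function names are hypothetical, since the text only states that popularity follows a Zipf distribution and does not describe the measurement scripts.

```python
import random


def zipf_weights(n_videos, exponent=1.0):
    """Unnormalized Zipf weights: rank r gets weight 1 / r**exponent.

    The exponent is an assumption; the paper does not report it.
    """
    return [1.0 / (rank ** exponent) for rank in range(1, n_videos + 1)]


def sample_sessions(n_videos, n_sessions, exponent=1.0, seed=0):
    """Draw popularity ranks (1 = most popular) for a series of video sessions."""
    rng = random.Random(seed)
    ranks = list(range(1, n_videos + 1))
    weights = zipf_weights(n_videos, exponent)
    return rng.choices(ranks, weights=weights, k=n_sessions)


def peer_data_ratio(bytes_from_peers, bytes_from_servers):
    """Fraction of a session's bytes that were delivered by peer routers."""
    total = bytes_from_peers + bytes_from_servers
    return bytes_from_peers / total if total > 0 else 0.0


def average_download_speed_kbs(total_bytes, duration_s):
    """Average download speed of a session in KB/s."""
    return total_bytes / 1024.0 / duration_s


if __name__ == "__main__":
    # 450 videos as in the experiment; the number of sessions is illustrative.
    sessions = sample_sessions(n_videos=450, n_sessions=1000)
    print("sessions on the top-ranked video:", sessions.count(1))
    print("peer data ratio:", peer_data_ratio(700, 300))
```

With weights of this form, popular videos are requested far more often than unpopular ones, so the recorded sessions are dominated by the head of the catalogue while still touching the long tail.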

Fig. 12: The traffic pattern of a video session in the peer CDN (download speed in KB/s over time in s; series: from server, P2P: TCP, P2P: UDP).

Fig. 13: The video delivery QoS of peer CDN perceived by users. (a) Download speed from edge server vs. peer CDN (CDF of throughput in KB/s for CERNET, ADSL and 4G, each with edge server and peer CDN). (b) Peer CDN efficiency (data ratio delivered by peer routers vs. normalized rank of video sessions).

The peer CDN's download speed exceeds that of the edge servers because multiple peer routers accelerate the download in parallel (there are up to about 20 concurrent connections downloading a chunk in our experiments).

Then we evaluate the peer router delivery ratio and analyze which types of videos can be delivered by peers. Fig. 13(b) shows the data ratio delivered by peer routers in our datasets. We observe that 80% of the content requests can be served by the peer CDN with at least 70% of the bytes coming from peer routers. Most of the videos with a lower peer delivery ratio are out-of-date user generated content (UGC). It is worth noting that the out-of-date UGC “Gangnam Style”, which was once the most popular video on many video websites, still keeps a 66% data ratio from peers.

In summary, the delivery QoS of the Youku peer CDN is comparable to or better than that of the CDN system, and a substantial fraction of the content requests can be served by the peers.

VI. DISCUSSION

In this section, we discuss the limitation of the peer CDN and propose two potential principles which can be influential for peer CDN design.

Limitation: To serve users effectively, a massive amount of video is continuously replicated between peer routers, which consumes too much user-side bandwidth. In our measurement, it is difficult to distinguish which contents are uploaded to end users and which are uploaded to other peer routers. It is nevertheless worthwhile to verify this problem and compare the traffic consumption with the performance gain.

Potential improvement: 1) Global QoS monitoring based on peer routers: From the analysis of the traffic datasets, we observe that a large number of flows between peer routers are long-lived but lightly loaded connections. We speculate that these are probing flows that keep peer routers in contact with their partners. Since the CDN control plane can obtain end-to-end QoS information from the large-scale peer router swarm, this information would be valuable for network-wide perception and fine-grained user request redirection. 2) Regional content replication: In our experiment, videos are replicated in a coordinative manner. However, the content popularities of different regions are diverse⁴ and influenced by social networks [6], [7]. If we want peer routers to provide an efficient nearby service, balancing the regional content popularity against the global content popularity is important for the replication strategy design in peer CDN.

Based on our analysis, compared to the conventional CDN, which manages the QoS between edge servers and end users and updates the contents for a large region, tapping the internal potential of the peer CDN is a promising way to provide more efficient video service.

⁴ https://www.youtube.com/trendsmap

VII. RELATED WORK

A. Content Distribution Systems

Video content constitutes a dominant fraction of online entertainment traffic today. In current video service platforms, CDN and P2P are two representative techniques [8]. There are abundant measurement studies of content distribution systems, including both CDN-based VoD systems (such as YouTube [9], Netflix [10] and Hulu [11]) and P2P-based VoD systems (such as PPLive [12] and Joost [13]). From these measurements, both the academic and industrial communities are aware that CDNs have strong global controllability but weak scalability [14], while P2P offers strong flexibility but a weak QoS guarantee for video delivery [12].

B. Peer CDN System

Recent years have witnessed an ever-increasing amount of video traffic, e.g., Netflix and YouTube account for 55% of the downstream traffic with fixed access in North America by Dec 2015 [15], and Cisco predicts that over 3/4 of the world's mobile data traffic will be video traffic by 2020 [16]. Some previous works have examined the peer CDN, which combines the CDN and P2P systems and can bring great benefit to content delivery. So far there are two main solutions for designing a peer CDN for video streaming, i.e., the client-based strategy and the smartrouter-based strategy. Client-based methods let clients download contents from each other when they cache the same contents, e.g., specialized applications including NetSession [17], LiveSky [18], 3DTI [19] and PeerCDN [20], and some web browser plug-ins [21], [22]. None of these works involves content prefetch strategies.

Many researchers realize that abundant resources, such as set-top boxes [23], small cell base stations (SBS) [24], and Wi-Fi access points [25], can be well utilized to assist the content delivery and offload the traffic of the original server. These works provide theoretical content replication strategies. In this paper, we are interested in how such designs perform in the wild.

Smartrouter-based peer CDN systems have existed for almost two years. The works in [26], [27] study a router-based peer CDN system, Thunder, and are the closest to our paper. Their content scheduling strategy is very simple, i.e., pushing 80 TB of traffic per day based on the file popularity of the previous day. [28] is our previous work on the content replication strategies used by the Youku peer CDN, which does not involve a detailed system analysis of, e.g., the transport protocols, peer selection strategy and QoS performance.

To the best of our knowledge, we are the first to use measurement to study the architecture, system strategies and performance of a real-world smartrouter-based peer CDN for VoD.

VIII. CONCLUSION

The smartrouter-based peer CDN starts a new CDN ecosystem, which deploys content delivery infrastructure (smartrouters) at the edge user side and leverages backhaul network resources contributed by end users. In order to understand the strategies, performance, limitation and potential impact of such a content delivery system, in this paper we conduct a comprehensive measurement of a real smartrouter-based peer CDN platform, deployed by ChinaCache and Youku, which serves 200 million users of Youku. By passively and actively measuring Youku peer routers in different ISPs and locations, we provide insights which are important for peer CDN. First, the smartrouter-based peer CDN system adopts global replication and caching strategies. The cached contents are frequently updated on an hourly basis, in order to keep up with changes in content popularity. Second, such a peer CDN deployment can itself form an effective QoS monitoring sub-system, which can be used for fine-grained user request redirection. Third, such a peer CDN deployment can successfully guarantee the video delivery QoS, e.g., 80% of the content requests can be served by nearby peer nodes. Finally, we discuss the system limitations and propose two potential design schemes, i.e., global end-to-end network monitoring and regional content replication.

REFERENCES

[1] “http://www.prnewswire.com/news-releases/chinacache-and-youku-router----creating-a-new-cdn-ecosystem-300138554.html”.
[2] “http://tech.ifeng.com/a/20150817/41420262_0.shtml”.
[3] Hongliang Yu, Dongdong Zheng, Ben Y Zhao, and Weimin Zheng, “Understanding user behavior in large-scale video-on-demand systems,” in ACM SIGOPS Operating Systems Review. ACM, 2006.
[4] Zhenyu Li, Jiali Lin, Marc-Ismael Akodjenou, Gaogang Xie, Mohamed Ali Kaafar, Yun Jin, and Gang Peng, “Watching videos from everywhere: A study of the pptv mobile vod system,” IMC ’12, ACM.
[5] Paul Jaccard, Etude comparative de la distribution florale dans une portion des Alpes et du Jura, Impr. Corbaz, 1901.
[6] Zhi Wang, Lifeng Sun, Xiangwen Chen, Wenwu Zhu, Jiangchuan Liu, Minghua Chen, and Shiqiang Yang, “Propagation-based Social-aware Replication for Social Video Contents,” in ACM International Conference on Multimedia (Multimedia), 2012, pp. 29–38.
[7] Zhi Wang, Wenwu Zhu, Minghua Chen, Lifeng Sun, and Shiqiang Yang, “CPCDN: Content Delivery Powered by Context and User Intelligence,” IEEE Transactions on Multimedia, vol. 17, no. 1, pp. 92–103, January 2015.
[8] Baochun Li, Zhi Wang, Jiangchuan Liu, and Wenwu Zhu, “Two Decades of Internet Video Streaming: a Retrospective View,” ACM Transactions on Multimedia Computing, Communications and Applications, vol. 9, no. 1s, pp. 33–52, October 2013.
[9] Vijay Kumar Adhikari, Sourabh Jain, Yingying Chen, and Zhi-Li Zhang, “Vivisecting youtube: An active measurement study,” in INFOCOM, 2012 Proceedings IEEE.
[10] Vijay Kumar Adhikari, Yang Guo, Fang Hao, Matteo Varvello, Volker Hilt, Moritz Steiner, and Zhi-Li Zhang, “Unreeling netflix: Understanding and improving multi-cdn movie delivery,” in INFOCOM, 2012 Proceedings IEEE.
[11] Vijay Kumar Adhikari, Yang Guo, Fang Hao, Volker Hilt, and Zhi-Li Zhang, “A tale of three cdns: An active measurement study of hulu and its cdns,” in Computer Communications Workshops (INFOCOM WKSHPS), 2012 IEEE Conference on.
[12] Yan Huang, Tom ZJ Fu, Dah-Ming Chiu, John Lui, and Cheng Huang, “Challenges, design and analysis of a large-scale p2p-vod system,” in ACM SIGCOMM computer communication review, 2008.
[13] Jun Lei, Lei Shi, and Xiaoming Fu, “An experimental analysis of joost peer-to-peer vod service,” Peer-to-peer networking and applications, 2010.
[14] Xi Liu, Florin Dobrian, Henry Milner, Junchen Jiang, Vyas Sekar, Ion Stoica, and Hui Zhang, “A case for a coordinated internet video control plane,” SIGCOMM ’12, ACM.
[15] Sandvine, “Sandvine report: Global internet phenomena–2015 latin america and north america report,” Sandvine Incorporated ULC, 2015.
[16] Cisco, “Cisco visual networking index: Global mobile data traffic forecast update 2015-2020 white paper,” Cisco VNI Mobile, 2016.
[17] Mingchen Zhao, Paarijaat Aditya, Ang Chen, Yin Lin, Andreas Haeberlen, Peter Druschel, Bruce Maggs, Bill Wishon, and Miroslav Ponec, “Peer-assisted content distribution in akamai netsession,” IMC ’13, ACM.
[18] Hao Yin, Xuening Liu, Tongyu Zhan, Vyas Sekar, Feng Qiu, Chuang Lin, Hui Zhang, and Bo Li, “Design and deployment of a hybrid cdn-p2p system for live video streaming: Experiences with livesky,” MM ’09, ACM.
[19] Ahsan Arefin, Zixia Huang, Klara Nahrstedt, and Pooja Agarwal, “4d telecast: Towards large scale multi-site and multi-view dissemination of 3dti contents,” in Distributed Computing Systems (ICDCS), 2012 IEEE 32nd International Conference on. IEEE, 2012.
[20] Jie Wu, ZhiHui Lu, BiSheng Liu, and Shiyong Zhang, “Peercdn: A novel p2p network assisted streaming scheme,” in CIT 2008. IEEE.
[21] Manal El Dick, Esther Pacitti, and Bettina Kemme, “Flowercdn: A hybrid p2p overlay for efficient query processing in cdn,” EDBT ’09, ACM.
[22] Liang Zhang, Fangfei Zhou, Alan Mislove, and Ravi Sundaram, “Maygh: Building a cdn from client web browsers,” in Proceedings of the 8th ACM EuroSys, 2013.
[23] Wenjie Jiang, Stratis Ioannidis, Laurent Massoulié, and Fabio Picconi, “Orchestrating massively distributed cdns,” in CoNEXT ’12. ACM.
[24] Konstantinos Poularakis, George Iosifidis, and Leandros Tassiulas, “Approximation algorithms for mobile data caching in small cell networks,” ToC, 2014.
[25] Zhenhua Li, Christo Wilson, Tianyin Xu, Yao Liu, Zhen Lu, and Yinlong Wang, “Offline downloading in china: A comparative study,” IMC ’15, ACM.
[26] Liang Chen, Yipeng Zhou, Mi Jing, and Richard T. B. Ma, “Crystal: A novel crowdsourcing-based content distribution platform,” NOSSDAV ’15, ACM.
[27] Ge Zhang, Wei Liu, Xiaojun Hei, and Wenqing Cheng, “Unreeling xunlei kankan: understanding hybrid cdn-p2p video-on-demand streaming,” Multimedia, IEEE Transactions on, 2015.
[28] Ming Ma, Zhi Wang, Ke Su, and Lifeng Sun, “Understanding Content Placement Strategies in Smartrouter-based Peer Video CDN,” in NOSSDAV, 2016.