Master Thesis Electrical Engineering June 2013

Characterization of YouTube Streaming Traffic

Radha Ravattu Prudhvi Raj Balasetty

School of Computing
Blekinge Institute of Technology
371 79 Karlskrona

This thesis is submitted to the School of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering. The thesis is equivalent to 20 weeks of full time studies.

This Master Thesis is typeset using LATEX

Contact Information

Author 1: Prudhvi Raj Balasetty
Address: Karlskrona, Sweden
E-mail: balasettyprudhviraj@.com

Author 2: Radha Ravattu
Address: Karlskrona, Sweden
E-mail: [email protected]

University advisor:

Dr. Markus Fiedler, Prof. COM/BTH

School of Computing
Blekinge Institute of Technology
371 79 KARLSKRONA, SWEDEN
Internet: www.bth.se/com
Phone: +46 455 385000

This thesis is dedicated to our parents

Prudhvi Raj Balasetty Radha Ravattu

Abstract

Online digital videos have undergone a revolutionary evolution since social networking sites such as YouTube emerged. These websites make video access easy and only a click away. Ever-increasing Internet traffic and a very significant increase in the use of videos in social networking have led to the problem of network congestion. Consequently, it becomes essential to analyze the traffic flow and comprehend how it is being delivered from the server. If the flow of traffic is analyzed appropriately and the analysis methodology is understood properly, service providers can understand the reasons for network congestion and avoid them.

Given this context, a few research studies have shed light on the video delivery procedure of YouTube with emphasis on methodology. A few other works have examined the location of video storage and the strategy involved in sending the video packets, yet the packet delivery strategy for different types of videos (based on popularity) has not been explained in detail. The present research fills this gap. It explores the origin of the video and the exact strategy followed by YouTube in delivering popular and non-popular videos. This thesis also discusses the analysis methodology of packet delivery in detail.

Keywords: YouTube, Video Delivery, Burst Analysis.

Acknowledgements

First and foremost, we are deeply indebted to Prof. Markus Fiedler, our Professor and thesis supervisor, for his valuable guidance. His inspiring tutelage brought forth the best in us. We shall ever remain thankful to him for sharpening our insights, kindling creativity in us and molding us into what we are today. Big thanks to him from the bottom of our hearts. We record our sincere thanks to Prof. Patrik Arlos for his kind co-operation and encouragement. A word of thanks to Mr. Junaid Shaik for his valuable tips and suggestions. We record our thanks to JNTU, Hyderabad, and BTH, Karlskrona, Sweden for providing us a life-time opportunity to study under teachers of global repute and we thank all those who made our stay in Sweden most memorable. Finally we would like to thank our parents for all their support and love. Without them we would have never reached this stage.

Prudhvi Raj Balasetty Radha Ravattu

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Acronyms

Introduction
1 Introduction

Background
2 KEY CONCEPTS
    2.1 Video Streaming
    2.2 Quality of Service (QoS) and Quality of Experience (QoE)
    2.3 Mathematical Background
        2.3.1 Summation
        2.3.2 Mean
        2.3.3 Standard deviation (STDEV)
        2.3.4 Coefficient of Variation
        2.3.5 Cross-Correlation
    2.4 Methodology
    2.5 Related work

Video Delivery Methodology
3 Video Delivery Methodology
    3.1 Internet
    3.2 History of YouTube and Delivery Methodology
        3.2.1 Video Id Space
        3.2.2 DNS namespaces
        3.2.3 Physical server cache hierarchy
    3.3 Communication between client, YouTube and CDN
    3.4 User experience on videos
    3.5 Waiting intervals
    3.6 Problems in YouTube
    3.7 Burst Analysis
        3.7.1 Off time
        3.7.2 On time
        3.7.3 Burst

Design and Implementation
4 Design and Implementation
    4.1 Research Questions
    4.2 Research Methodology
    4.3 Design

Analysis
5 Analysis
    5.1 Analysis
        5.1.1 Case-1
        5.1.2 Case-2
        5.1.3 Case-3
        5.1.4 Case-4

Results and Discussions
6 Results
    6.1 Burst Durations
    6.2 Inter-Burst Times
    6.3 Burst Lengths
    6.4 Coefficient of variation of packets in a burst
    6.5 Server Selection Strategy

Conclusions and Outlook
7 Conclusion and Outlook
    7.1 Conclusions
    7.2 Outlook

Bibliography

List of Figures

3.2.1 Brief overview of communication between client and YouTube
3.4.1 Buffering of a video
3.7.1 Burst
4.3.1 Capturing traces using Wireshark
5.1.1 Comparison of CDFs between Popular and Uploaded Videos in Fixed Network
5.1.2 Comparison of CDFs between Popular and Uploaded Videos in Wireless Network
6.1.1 Burst Duration in Popular video and Uploaded Video
6.2.1 Inter Burst-Times in Popular video and Uploaded Video
6.3.1 Length of the Burst in Popular video and Uploaded Video
6.4.1 Number of packets per interval in different time scales

List of Tables

3.5.1 User perspective of watching videos
4.3.1 User perspective of watching videos
4.3.2 Videos Uploaded by Us
5.1.1 Comparison between Fixed and Wireless Networks of a popular video
5.1.2 Comparison between Fixed and Wireless of a Uploaded Video
5.1.3 Comparison between Popular and Uploaded Video in Fixed Network
5.1.4 Comparison between Popular and Uploaded Video in Wireless Network
6.4.1 Average of CoV for Popular and Uploaded Videos in Fixed LAN and Wireless Networks

Acronyms

BTH Blekinge Tekniska Högskola

CCDF Complementary Cumulative Distribution Function

CDF Cumulative Distribution Function

CDN Content Delivery Network

DNS Domain Name System

HTML Hyper Text Markup Language

HTTP Hyper Text Transfer Protocol

ID Identifier

IP Internet Protocol

LAN Local Area Network

NAT Network Address Translation

QoE Quality of Experience

QoS Quality of Service

STDEV Standard Deviation

TCP Transmission Control Protocol

UDP User Datagram Protocol

URL Uniform Resource Locator

WAN Wide Area Network

Chapter 1

Introduction

Our world finds itself in the midst of a communication technology explosion. Computers, the Internet and social networking have revolutionized access to information. Vast and varied information is available to an interested user at a mouse click. From sports and entertainment content to education and business topics, anything and everything is readily available and accessible on the Internet. One can access information from any corner of the world. The availability of images and videos, and the ability to upload, download, exchange and share them on the Internet, has greatly enhanced the value, quality and accessibility of information. As the Internet increasingly carried multimedia applications, the development of such applications also gained momentum. Dissemination of multimedia information has tremendously enhanced the utility and usage of the Internet. The wide variety of streaming videos available on the Internet matches the needs of a video-hungry generation. The clamor for video on demand has grown so much that it often results in network congestion. Ever-increasing Internet traffic poses an open challenge to video service providers like YouTube and Hulu: deliver user-friendly access to video streaming or risk losing users.

In order to overcome these challenges, quite a few research works have investigated video streaming, Internet and Wide Area Network (WAN) traffic. In the last few decades, analyzing web video traffic has become a major area of enquiry in the quest for determining, improving and optimizing the dynamic characteristics of traffic structures. The methodology of video delivery, which significantly determines traffic volumes and patterns, has attracted researchers' scrutiny. YouTube, being the largest video sharing site, has been a constant field of research. Ever since YouTube was acquired by Google, the video delivery infrastructure has been completely restructured. YouTube employed data centers in the U.S. With this powerful infrastructure and an exponentially increasing number of videos and users, in almost no time YouTube contributed a large portion of web traffic. In order to avoid congestion, YouTube employed a Content Delivery Network (CDN) [1].

In this thesis we investigate the video delivery methodology of YouTube. We examine the YouTube-internal policy for delivering the video packets to the end users.

Chapter 2

KEY CONCEPTS

2.1 Video Streaming

Presenting information as a continuous video stream is generally more effective than simple text or images. The emergence of completely video-based social networks such as YouTube and Hulu is a clear example of this. Online digital videos have become a very important aspect of Internet services and the most used means of communication for business, education, social and entertainment purposes. These real-time videos also offer users instantaneous viewing: they need not wait for hours to download their favourite video before watching it. Enormous growth in the deployment and usage of the Internet during recent years has led to a rapid increase in network traffic.

Quite a few video streaming services use User Datagram Protocol (UDP) as a transport layer protocol. UDP-based applications can adjust their data transfer rate themselves, as UDP has no congestion control and does not retransmit packets that are discarded in the network. However, UDP-based communications are often blocked by firewalls or Network Address Translation (NAT) devices. Considering this, most leading video service providers run their videos over Hyper Text Transfer Protocol (HTTP), which uses Transmission Control Protocol (TCP) [2]. The use of TCP also ensures undisturbed delivery of video content, as the protocol takes care of retransmitting corrupted packets [3].

An increase in traffic leads to network congestion and packet loss. In particular, outages in the network are quite frequent and result in longer waiting times. The consequences of network problems are experienced immediately in the form of freezes in the video. In practice, many users face volatile service performance, e.g. bad network conditions or congested media streaming servers that waste time through re-buffering. In addition, degradation can occur when the video is encoded, during transmission of the packets across the Internet Protocol (IP) network, and/or during decoding and playback. Video quality degradation becomes visible in various ways, such as jerkiness, freezes, gaps in playback, and impairments such as stalling or blurred video [4] [5].

2.2 Quality of Service (QoS) and Quality of Experience (QoE)

One of the fundamental factors that decide a user's satisfaction with a video stream is ensuring substantially high and constant levels of perceptual QoS. QoS can be defined as "the ability of a network to provide a service with an assured service level" [6]. The Internet offers only a best-effort service, and hence it becomes important to put a high effort into improving QoS levels, as they reflect the quality of the video stream [7] [8]. Many studies and proposals have been carried out to improve QoS levels as well as the performance of video streaming applications. QoS depends on various factors such as the packet loss rate, end-to-end transmission delay, video transmission rate and jitter.

The uncertainty of packet transmissions in real-time video streaming always poses a problem for both service providers and users. Hence it becomes difficult to ensure an appropriate QoE for the end user. The QoS and QoE concepts were introduced in IP networks to explain the satisfaction of the end user regarding quality. QoE can be defined as "the degree of delight or annoyance of the user of an application or service" [9]. In simple words, QoE describes the satisfaction of customers with the service quality offered by providers. The lower the QoE, the greater the dissatisfaction of customers. QoE reflects not only network performance parameters but also service quality parameters such as cost, accessibility, reliability and availability.

As stated in Section 2.1, video streaming is mostly done over TCP. When a packet is lost during transmission, TCP detects the loss and decreases the transfer rate. If this rate is less than the playback rate, playback will eventually stop and wait for new video data. Nothing new is displayed until the new packets arrive. Stalling, freezes and skips can severely degrade the user-perceived quality (QoE). The quality of the video and audio and the smoothness of playback also affect QoE. QoS as well as QoE have gained substantial attention, as they play a major role in describing the quality of the video and the user's satisfaction with the streaming.

2.3 Mathematical Background

In order to understand this thesis, it is essential to understand a few mathematical calculations and their usage. Readers already familiar with this kind of math can skip this section. As we are going to model a system based on data obtained by experimentation, it is important to select the important and useful sets of data that are hidden in the existing bulk of data. The following mathematical calculations help in understanding the obtained data.

2.3.1 Summation

The data arrives in the form of packets. It is necessary to check how many data packets are delivered in a particular burst (for detailed information about bursts, see Section 3.7).

$$\sum_{i=1}^{n} a_i = a_1 + a_2 + \dots + a_n$$

where $a_1, a_2, \dots, a_n$ are the numbers of packets in intervals $1, 2, \dots, n$ respectively and $n$ is the number of intervals.

2.3.2 Mean

As the data is obtained in the form of packets, it is essential to look at the number of packets arriving in each interval and at the average number of packets over all intervals within a burst.

$$\mu = \frac{1}{n}\sum_{i=1}^{n} a_i = \frac{a_1 + a_2 + \dots + a_n}{n}$$

where $a_1, a_2, \dots, a_n$ are the numbers of packets in intervals $1, 2, \dots, n$ respectively and $n$ is the number of intervals.

2.3.3 Standard deviation (STDEV)

The standard deviation measures how far the values deviate from their average. It is denoted by $\sigma$ and shows the spread of the associated values. If the standard deviation is small, the numbers of packets per interval in the burst are almost the same, i.e. the burst is not bumpy and has a smooth flow.

$$\sigma = \sqrt{\frac{(a_1-\mu)^2 + (a_2-\mu)^2 + \dots + (a_n-\mu)^2}{n-1}}$$

where $\mu = \frac{a_1 + a_2 + \dots + a_n}{n}$ is the average number of packets over all intervals and $a_1, a_2, \dots, a_n$ are the numbers of packets in intervals $1, 2, \dots, n$ respectively.

2.3.4 Coefficient of Variation

The coefficient of variation is defined as the normalized measure of dispersion of a distribution. It is the ratio of the STDEV to the mean.

$$C_\nu = \frac{\sigma}{\mu}$$

where $\sigma$ is the standard deviation of the number of packets in all intervals and $\mu$ is the average number of packets in all intervals. A rather low value (approximately 0 to 0.3) indicates smooth packet delivery within the corresponding burst.
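The quantities of Sections 2.3.1 to 2.3.4 can be illustrated with a minimal Python sketch. The function name `burst_statistics` and the packet counts are hypothetical and serve only as an illustration; they are not taken from the measured traces. The sketch assumes the sample standard deviation with the n-1 denominator, as in the formula above.

```python
import math


def burst_statistics(packets_per_interval):
    """Return (sum, mean, sample standard deviation, coefficient of variation)."""
    n = len(packets_per_interval)
    if n < 2:
        raise ValueError("at least two intervals are needed for the n-1 denominator")
    total = sum(packets_per_interval)              # Section 2.3.1
    mean = total / n                               # Section 2.3.2
    variance = sum((a - mean) ** 2 for a in packets_per_interval) / (n - 1)
    stdev = math.sqrt(variance)                    # Section 2.3.3
    cov = stdev / mean if mean else float("nan")   # Section 2.3.4
    return total, mean, stdev, cov


# Hypothetical packet counts for five intervals of one burst.
counts = [10, 12, 11, 9, 8]
total, mean, stdev, cov = burst_statistics(counts)
print(total, mean, round(stdev, 3), round(cov, 3))
```

For these illustrative counts the coefficient of variation is about 0.16, which would fall in the range described above as smooth packet delivery.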

2.3.5 Cross-Correlation

Cross-correlation is the standard method of estimating the degree to which two series of values are associated. We used this measure to find the degree to which the matched distribution is in accordance with the Cumulative Distribution Function (CDF) values. It is given by

$$\mathrm{Correl}(X,Y) = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \sum (y-\bar{y})^2}}$$

where $X$ denotes the CDF values, $Y$ the distribution values, $\bar{x}$ is the average of $X$ and $\bar{y}$ is the average of $Y$.
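As a rough illustration of this measure, the correlation coefficient can be computed as in the following Python sketch. The function name `correl` mirrors the notation above, and the two series are hypothetical values, not results from this thesis.

```python
import math


def correl(xs, ys):
    """Correlation coefficient between two equally long series of values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("both series need the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant series")
    return numerator / denominator


# Hypothetical empirical CDF values and values of a matched distribution.
empirical_cdf = [0.10, 0.35, 0.60, 0.85, 1.00]
matched_distribution = [0.12, 0.33, 0.62, 0.83, 0.99]
print(correl(empirical_cdf, matched_distribution))
```

A value close to 1 would indicate that the matched distribution follows the empirical CDF closely.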

2.4 Methodology

Research methodology is classified into two kinds of methods, quantitative and qualitative. Quantitative methods involve testing and experimentation, whereas qualitative methods involve surveys and case studies. In our thesis we followed the method of experimentation, which is a quantitative research method. This type of research is facilitated by a controlled environment during testing. The main steps involved in the experimentation procedure are:

• Selection of samples from a known population
• Allocation of samples to different experimental conditions
• Measuring a small number of variables

The utilization and execution of the above steps are demonstrated in the further sections of this report.

2.5 Related work

Video streaming has taken the Internet to the next level. As online digital videos play a key role in Internet usage, many studies and research works have addressed video quality from the user perspective [10]. With the rapid increase in Internet usage [11] [12], many problems related to network traffic congestion have appeared. Many authors have proposed and studied the location management congestion problem arising in different network scenarios. Previous works [13] [14] proposed different methods and algorithms for congestion control.

The growth of video sharing networks and sites has drastically increased Internet traffic. With this increased traffic, good-quality live streaming is in jeopardy. Several researchers developed schemes and methods with efficient approaches, such as increasing bandwidth utilization and maintaining a reliable, high-performance infrastructure, which can help in enhancing video quality over different networks [15] [16] [17] [18]. The multimedia methodology gained wider acceptance due to the flexibility and reliability it offers and plays a key role in video streaming [19]. The success of video streaming has opened new vistas in that field of knowledge and attracted research interest in the delivery architecture. Many authors proposed video delivery architectures for different scenarios [20] [21].

Delivery methodology has become an essential factor in analyzing video streaming. When one closely examines the video delivery technologies of today, it turns out to be a surprisingly fragmented landscape, even as IP becomes the common infrastructure [22]. There are different architectural approaches for enterprise and service provider networks; different technologies used for linear (live) and nonlinear (e.g., video on demand) content; and differences in the delivery of content in over-the-top environments (e.g., Hulu, Netflix, YouTube) versus managed environments, such as the IPTV services offered currently by many broadband service providers.

YouTube emerged as one of the most popular and effective video sharing websites. With its extensive number of videos and its effective serving strategy, it became a subject of consistent research. Many research works have tried to analyze the architecture and video delivery process of YouTube [23] [24] [25] [26].

A few research works give a brief description of the architecture, and their analysis of YouTube streaming provides means to examine the performance of video delivery [5]. Existing media providers like YouTube and Hulu deliver videos through progressive download [27]. For an in-depth analysis of the video packet delivery methodology, profound knowledge about trace collection, probing bursts, and the classification of off-times and on-times is essential. In [28] and [29], the authors gave a brief introduction to ON-OFF models. In general, ON-OFF models capture the essential phases of user communication in wireless networks [30] [28]. The most commonly used packet monitoring software tool for analyzing data traffic is Wireshark [31] [32].

The current state of the art investigates the architecture of YouTube's methodology with respect to its content delivery infrastructure [33]. The authors describe the strategy followed in YouTube's design and its distributed delivery infrastructure to match the geographical span of its users and meet varying user demands. The underlying methodology of YouTube in sending the appropriate video from the nearest CDN is explained in detail.

However, there is not much research on YouTube's methodology that analyzes the packet delivery rate and the streaming of video at the user end. To the best of our knowledge, the strategies employed for different types of user-demanded videos have not been considered.

Chapter 3

Video Delivery Methodology

3.1 Internet

The use of computers and the subsequent use of the Internet have comprehensively revolutionized the domain of communication. Modern computer technology coupled with the Internet has facilitated access to vast and varied information for a willing user and supports the daily tasks of the common man. The impact of computer and Internet services on human life is so great that a failure of these services throws normal life out of gear. Accessing, uploading, downloading, sharing and exchanging information are integral and vital components of Internet use. The usage of the Internet has become vastly popular in the contemporary world. The Internet continues to be the ideal technological platform for introducing online applications, and useful Internet-based applications gain acceptance and become popular within no time. Surveys conducted by Cisco suggest that Internet traffic has reached the zettabyte level [34]. The number of global online video users touched 1 million in 2010 and is expected to reach the 500 million mark in 2015 [33].

A wide variety of videos is easily available and accessible online. Different video types, from short videos with a duration of 1-15 minutes to full-length movies with a running time of 2 to 3 hours, as well as live shows and repeats of previously telecast programs, are readily available to users. Live streaming and online videos have become the most popular Internet services. Flash players are widely used among net users [34]. This shows that online video streaming has an important role to play and is here to stay. Making use of these services, one can upload and download many videos, irrespective of place and time. YouTube, with its tag line 'Broadcast yourself', is a widely used video sharing service provider and delivers videos through progressive download.


3.2 History of YouTube and Delivery Methodology

YouTube is the most prominent video-streaming portal, serving more than two billion videos on a daily basis [5]. The domain name for YouTube was registered on February 14, 2005, and YouTube was purchased by Google in November 2006 [35] [36]. To begin with, YouTube offered videos with 320×240 pixels. As it gained popularity, it started providing videos in different resolutions in order to keep bandwidth problems at bay. When a video is uploaded, YouTube generates the same video in different resolutions so that it can accommodate clients with different bandwidths as well as mobile applications, which need a low resolution. The YouTube video delivery consists of three basic parts:

• Video ID space
• Hierarchical cache server DNS namespaces
• Physical server cache hierarchy

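3.2.1 Video Id Space

Every YouTube video has a unique identifier; the set of all such identifiers is known as the video ID space. An identifier is eleven characters long and consists of upper- and lower-case letters, digits and the characters '-' and '_' (see, for example, the identifier t4H_Zoh7G5A in Section 3.3).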
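A video identifier can be extracted from a watch URL and checked against this format as in the following Python sketch. The helper name `extract_video_id` is hypothetical, and the accepted alphabet (upper- and lower-case letters, digits, '-' and '_') is an assumption based on identifiers such as the one in the example URL of Section 3.3.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed identifier format: exactly eleven characters from [A-Za-z0-9_-].
VIDEO_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(url):
    """Return the value of the 'v' query parameter if it looks like a video ID."""
    query = parse_qs(urlparse(url).query)
    candidates = query.get("v", [])
    if candidates and VIDEO_ID_PATTERN.fullmatch(candidates[0]):
        return candidates[0]
    return None


print(extract_video_id("http://www.youtube.com/watch?v=t4H_Zoh7G5A"))
```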

3.2.2 DNS namespaces

YouTube operates with the help of DNS namespaces. Each DNS namespace represents a collection of logical video servers with definite roles. Collectively, these DNS namespaces form a layered organization of logical video servers. The three DNS namespaces are:

• Iscache
• Tccache
• Cache

3.2.3 Physical server cache hierarchy

Prior research concluded that YouTube employs a 3-tier physical cache hierarchy with 38 primary locations, 8 secondary locations and 5 tertiary locations [25].

Figure 3.2.1: Brief overview of communication between client and YouTube

3.3 Communication between client, YouTube and CDN

When a client visits a video URL such as http://www.youtube.com/watch?v=t4H_Zoh7G5A, the server returns an HTML page with embedded URLs that point to the Flash video server responsible for serving that video. When the client clicks the play button of the selected video, an HTTP GET message is sent from the client to the server. From this message, the server learns which video the client is requesting by its unique video ID. After receiving the GET message, the server replies with an HTTP 303 response, whose Location header redirects the client to the video server from which the video is streamed. This way of redirecting requests introduces load balancing. Since the main server redirects the client to the relevant video servers, the main server would need complete knowledge of all the videos and their servers. YouTube, however, follows a different and better strategy, the use of a CDN. The CDN servers send the video content over TCP in a single message, namely an HTTP 200 OK message [34].

Given YouTube's tremendous growth and its ever-increasing demand and popularity, network traffic engineering is essential for sustainable development. Even after many strategies have been tried, clients often experience problems such as freezing and re-buffering. Hence, considering the user experience plays a key role in successful video delivery.
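The redirection step described in this section can be sketched in Python with an in-memory stand-in for the servers. This is only an illustration under stated assumptions: the host names `frontend.example` and `cache1.example`, the URL paths and the response table are hypothetical and do not reflect YouTube's actual servers or URLs.

```python
def fetch_video(url, responses, max_redirects=5):
    """Follow HTTP 303 redirects through an in-memory response table.

    `responses` maps a URL to a (status, headers, body) tuple.
    Returns the list of visited URLs and the body of the final 200 response.
    """
    hops = [url]
    for _ in range(max_redirects + 1):
        status, headers, body = responses[url]
        if status == 200:
            return hops, body
        if status == 303:
            url = headers["Location"]  # redirect target named by the server
            hops.append(url)
            continue
        raise RuntimeError(f"unexpected status {status} for {url}")
    raise RuntimeError("too many redirects")


# Hypothetical servers: a front end that redirects and a cache that serves the video.
START = "http://frontend.example/get_video?video_id=t4H_Zoh7G5A"
CACHE = "http://cache1.example/videoplayback?id=t4H_Zoh7G5A"
FAKE_RESPONSES = {
    START: (303, {"Location": CACHE}, b""),
    CACHE: (200, {}, b"video-bytes"),
}

hops, body = fetch_video(START, FAKE_RESPONSES)
print(hops, len(body))
```

In a real capture, the same sequence would show up as a GET request answered by a 303 response, followed by a second GET to the address given in the Location header.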

3.4 User experience on videos

The number of viewers accessing the Internet on the one hand and the number of videos provided on the Internet on the other hand are proliferating to such an extent that network congestion has become a regular phenomenon. Due to network congestion, viewers experience many problems, particularly outages in the network that necessitate longer waiting times. Network congestion leads to freezes in the video, volatile service performance, bad network conditions, congested media streaming and wasted time due to re-buffering. When the time taken to view a video or image exceeds the usual time, a busy and time-conscious user desists from using the service. Quick and prompt access to video is what the present-day viewer expects, and a service provider found wanting in this is likely to lose clients [37].

Figure 3.4.1: Buffering of a video

3.5 Waiting intervals

As stated above, the user gets bored after waiting beyond a certain point in time. Waiting intervals are fundamentally classified into three classes. The intervals of time and the corresponding user experience are stated below [28].

S.no   Time elapsed   Type of reply
1      0.1-0.2 sec    Instant reply
2      1-5 sec        Immediate reply
3      5-10 sec       Slow reply

Table 3.5.1: User perspective of watching videos

• If the reply comes within 0.1 to 0.2 seconds, it is considered an instant reply: the response is very quick and the user need not wait to watch the video.

• If the reply comes within 1 to 5 seconds, it can be regarded as an immediate reply. In this interval users notice the delay, but they are not interrupted and keep watching.

• If the reply comes within 5 to 10 seconds, the response is slow and users may lose their interest.
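The classification in Table 3.5.1 can be written as a small Python function. The function name `classify_reply` is hypothetical. Two assumptions are made that the table does not state: a reply time of exactly 5 seconds counts as an immediate reply, and times outside the listed ranges (for example between 0.2 and 1 second) are reported as unclassified.

```python
def classify_reply(seconds):
    """Map a reply time in seconds to the reply types of Table 3.5.1."""
    if 0.1 <= seconds <= 0.2:
        return "instant reply"
    if 1 <= seconds <= 5:
        return "immediate reply"   # assumption: exactly 5 s counts as immediate
    if 5 < seconds <= 10:
        return "slow reply"
    return "unclassified"          # assumption: ranges not covered by the table


for t in (0.15, 3, 7, 0.5):
    print(t, classify_reply(t))
```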

As the number of online users increases, the demand grows. When users watch a video, they expect smooth playback without any delays. If the video is blurred, or if it stops and buffers at any point, users get highly dissatisfied and may choose to move on without watching it. Hence, it is critical for service providers to keep an eye on this point and to come up with strategies that keep the user from getting bored.

3.6 Problems in YouTube

Many articles have focused on determining the factors affecting YouTube videos. Studies more relevant to our work investigated the YouTube architecture and network-related performance based either on NetFlow statistics [25] or on packet captures [38], [39], [40].

Extensive research has been carried out in the domain of video streaming. Initially UDP played a major role, but it does not guarantee packet delivery; congestion or packet loss may lead to visual artefacts, jerky motion or jumps in the stream, and degraded media quality. Hence TCP became widely popular for video transfer.

In the case of YouTube, delivery is done by progressive download using TCP as the transport protocol. TCP has a retransmission mechanism that takes care of corrupted or lost packets and effectively minimizes packet loss. In [8], the authors state that if the available bandwidth is lower than the video bit rate, video transmission becomes too slow, gradually emptying the playback buffer until an under-run occurs. If re-buffering happens, the user notices interrupted video playback, which is referred to as stalling.
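The under-run condition can be made concrete with a simplified Python sketch. It assumes a constant bandwidth and a constant video bit rate, which real traffic does not have; the function name `seconds_until_stall` and the numbers are hypothetical and only illustrate the reasoning.

```python
def seconds_until_stall(buffered_video_s, bandwidth_kbps, bitrate_kbps):
    """Time until the playback buffer under-runs, or None if it never does.

    buffered_video_s: seconds of video in the buffer when playback starts.
    """
    if bandwidth_kbps >= bitrate_kbps:
        return None  # the download keeps up with playback, no stall expected
    # Per second of playback, one second of video is consumed while only
    # bandwidth/bitrate seconds of video are downloaded.
    drain_rate = 1.0 - bandwidth_kbps / bitrate_kbps
    return buffered_video_s / drain_rate


# Hypothetical example: 10 s buffered, 400 kbit/s available, 500 kbit/s video.
print(seconds_until_stall(10, 400, 500))
```

With these illustrative numbers the buffer drains at 0.2 seconds of video per second of playback, so the stall would occur after roughly 50 seconds.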

QoE for YouTube has been studied with a crowdsourcing approach [12]. That paper showed that the primary factors affecting QoE in YouTube are the number of stalls and their duration.

Throughout the measurement campaign of [12], 1 349 users from 61 countries participated in the YouTube stalling test and rated the quality of 4 047 video transmissions suffering from stalling. A statistical analysis of the demographics of the users is given there. Unreliable users were identified and the data from the user studies was filtered accordingly. Hence the reliability (inter-rater and intra-rater) of the filtered data was improved significantly.

In that paper, the authors quantified the QoE of YouTube on the basis of the results of seven crowdsourcing campaigns. They showed that, for this application, QoE is primarily influenced by the frequency and duration of stalling events. The results indicate that users tolerate one stalling event per clip as long as the stalling event duration remains below 3 s. These findings, together with their analytical mapping functions that quantify the QoE impact of stalling, can be used as guidelines for service design and network dimensioning.

3.7 Burst Analysis

Video traffic in the Internet can be evaluated by analysing traces in depth. The traces are basically classified into three parts, which are explained in the following subsections:

• Off times
• On times
• Bursts

3.7.1 Off time

The off time is defined as the time during which there is no data transfer. In this thesis we consider an interval an off time if no data transfer takes place for at least two to three seconds. When we observe the traces, we see packets arriving at one instant, followed by some seconds without any packet transfer, then another bunch of data packets, and this process repeats for a few minutes. We consider an interval without packet delivery as an off time, but when the time scale is made too fine, such intervals tend to appear within a burst as well. Hence not all intervals without packet deliveries are off times; the time scale has to be taken into account. In our observations the off time varied from 17 to 25 seconds.
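The dependence on the time scale can be illustrated with a short Python sketch that counts packets per interval. The function name `packets_per_interval` and the timestamps are hypothetical; the example only shows that a finer scale produces empty intervals that are too short to count as off times.

```python
def packets_per_interval(timestamps, scale):
    """Count packets in consecutive intervals of `scale` seconds.

    The first interval starts at the earliest timestamp.
    """
    if not timestamps:
        return []
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) // scale) + 1
    counts = [0] * n_bins
    for t in timestamps:
        counts[int((t - start) // scale)] += 1
    return counts


# Hypothetical arrival times (seconds) of five packets within one burst.
arrivals = [0.0, 0.2, 0.4, 1.6, 1.8]
print(packets_per_interval(arrivals, 1.0))   # coarse scale
print(packets_per_interval(arrivals, 0.5))   # finer scale
```

At the 1 second scale both intervals contain packets, whereas at the 0.5 second scale two empty intervals appear between the packets, although the underlying gap of 1.2 seconds is shorter than the two to three seconds used above to identify an off time.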

3.7.2 On time

The on time is defined as the time during which packets arrive in response to a request. For example, when the client requests a video from the server, the server starts transferring data packets. The time during which the data is transferred is called the on time.

3.7.3 Burst

Areas of high intensity in the traces can be marked as bursts. Put simply, if the packets of a trace arrive in very fast succession, they form a burst. In our case we define a burst as the large amount of data that arrives after an off time. For instance, in the traces we collected for a video, a set of typically 1221 packets arrived after typically 28 seconds without data delivery. These typical values may vary by about 5. Detailed statistics are given in the following sections.

[Plot: Bursts. Number of packets per interval versus time (0.1-second scale).]

Figure 3.7.1: Burst

Chapter 4

Design and Implementation

4.1 Research Questions

1 Which are the most annoying QoE problems in YouTube video delivery?

2 Are there any differences in the video delivery process depending on the popularity and location of the videos?

3 What are the distributions and main parameters required to describe the video delivery methodology?

4.2 Research Methodology

Our research is analysis oriented and empirical. We observe the main differences in video delivery between popular and non-popular videos.

• A thorough literature review was done in order to understand and analyze the video delivery process and the problems in it.

• Traces were collected by playing various popular YouTube videos from different networks at different resolutions.

• Four videos were uploaded from four different countries and their traces were collected under the same conditions as above.

• Collection of traces was followed by identification of bursts in the traces and mathematical calculations.

• The origin of the video and the delivery process deployed for popular and non-popular videos were analysed.


• The arrival times of the video packets and their inter-packet times were calculated, and the bursts were identified based on the continuity in the flow of packets.

• After analysing the bursts and ON-OFF times, the strategy of the traffic flow was examined to identify whether the data flow was smooth or disturbed.

• From the identified ON-OFF times, burst patterns and their distributions over multiple time scales were investigated.

• The Cumulative Distribution Function (CDF) and Complementary Cumulative Distribution Function (CCDF) graphs were plotted for the bursts in order to find the type of distribution they follow.

4.3 Design

This section presents the methodology and the way in which the experiment was carried out. The experiment deals with two aspects. Firstly, it examines the strategy followed for popular videos that already exist on YouTube. Secondly, it examines the strategy followed when we upload videos to YouTube ourselves and view them.

System Requirements:

• A PC/Laptop

• Web browser with embedded Flash player

• Wireshark

For this experiment, a Sony Vaio (VPCSB26FG) laptop with a 2.4 GHz Core i5 processor running the 64-bit version of Windows 7 was used. Wireshark win64-1.8.3, the latest version at the time, was installed on the laptop.

Experimentation:

This thesis examines two different aspects of the experiment. The first aspect critically examines the delivery strategy that the service provider follows for popular videos. The second aspect deals with the change in that strategy when the videos are not popular. The underlying process is almost the same in both aspects; only the selection of videos differs. Both cases are discussed in detail in this section.

Aspect-1:

The place of origin, content and language of a video strongly influence its number of viewers; a video in a regional language, for example, may not have international followers. For the experiment, three sets of videos were selected based on their popularity and categorized as popular, moderately popular and least popular. The number of views and the place of origin of the video were the basic selection criteria, and two videos from each of the above categories were selected. The selection comprised two of the most popular English pop songs, two widely watched Indian songs, two popular and two least viewed South Indian regional songs, and one Indian song that was quite popular everywhere. All the URLs were saved in a text file. The URLs of all the songs and their descriptions are given in the following table.

S.no  Video URL                                     Description
1     http://www. .com/watch?v=t4H_Zoh7G5A          Very popular English pop song
2     http://www.youtube.com/watch?v=2up_Eq6r6Ko    Very popular English pop song
3     http://www.youtube.com/watch?v=eHZn85RsrCE    Semi popular song from India
4     http://www.youtube.com/watch?v=h7j17dx_rjw    Semi popular Hindi song from India
5     http://www.youtube.com/watch?v=1aVVwG6W59Y    Semi popular South Indian regional song
6     http://www.youtube.com/watch?v=Y1gKGTAVDNo    Semi popular Hindi song from India
7     http://www.youtube.com/watch?v=5Tyi_tNYOeg    Less popular South Indian regional video
8     http://www.youtube.com/watch?v=_GMuCjESLZ0    Less popular South Indian regional song
9     http://www.youtube.com/watch?v=N3bUt10-FIQ    Popular Indian song

Table 4.3.1: User perspective of watching videos

Initially the laptop was formatted. The latest version of Wireshark (win64-1.8.3), a web browser and Microsoft Office (2007-2010) were then installed. Google Chrome was chosen as the browser because of the number of clients using it [41]. Wireshark is a widely used network monitoring tool for both academic and industrial purposes [42].

Wireshark was kept running in the background. The videos were played one after the other in the Google Chrome web browser on a fixed network (download 11.15 Mbps, upload 10.54 Mbps) in Karlskrona, Sweden. The bandwidth of the Internet connection was checked using speedtest.net. The video quality was set to 360p and the video was played while Wireshark collected the traces of all packet transfers.

As we need only the video packets delivered from the YouTube server to our laptop, the filter (tcp.port == 80) && (ip.dst == 80.78.216.226) was set. The computer's IP address changes with the Internet connection, so it must be checked before setting the filter. As YouTube uses TCP to transfer the videos, the first condition keeps only TCP packets on port 80. The second condition keeps only the packets delivered to the laptop. Because the protocol is TCP, an acknowledgement is sent after every two video packets; the destination condition removes these acknowledgements from the trace.

The collected traces were saved in text format and imported into a Microsoft Excel sheet, where the mathematical calculations are easy to perform. As only the time, source and length fields are required, the remaining fields were removed. The inter-packet time, that is the difference between the arrival times of two consecutive packets, was calculated. The packets were then grouped at different time scales (1-second, 0.1-second and 0.01-second) and the number of packets that arrived in each interval was counted.
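The spreadsheet calculations described above can also be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the thesis: the function names and the sample arrival times are hypothetical, and the arrival times are assumed to be already extracted from the trace as seconds.

```python
from collections import Counter


def inter_packet_times(arrivals):
    """Time difference between each packet and the packet before it."""
    return [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]


def packets_per_interval(arrivals, scale):
    """Count packets in consecutive intervals of `scale` seconds.

    Interval k covers [k * scale, (k + 1) * scale); empty intervals are 0.
    Timestamps that sit exactly on a boundary can land in the neighbouring
    interval because of floating-point rounding.
    """
    if not arrivals:
        return []
    counts = Counter(int(t / scale) for t in arrivals)
    return [counts.get(k, 0) for k in range(max(counts) + 1)]


# Hypothetical arrival times in seconds (not taken from the thesis traces).
arrivals = [0.25, 0.5, 0.75, 1.5, 3.25]

gaps = inter_packet_times(arrivals)
per_second = packets_per_interval(arrivals, 1.0)
per_half_second = packets_per_interval(arrivals, 0.5)
print(gaps, per_second, per_half_second)
```

Calling `packets_per_interval` with 1, 0.1 and 0.01 as the scale would give the three time divisions mentioned in the text.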

After collecting the traces for each video, the caches of the Google Chrome browser and of the operating system were cleared. The experiment was then repeated in the same way at a lower resolution (240p), and the calculations were carried out.

The entire experiment was repeated on a wireless network. For this, the traces were collected for the same set of videos at the BTH Library, Karlskrona (download 12.11 Mbps, upload 15.45 Mbps). The traces were imported into Excel sheets to perform the mathematical calculations. To account for the traffic impact and busy-hour traffic flow, the complete experiment was carried out at four different times of the day.

Aspect-2:

The main motive of this research was to find the differences in the video delivery strategy for distinct videos based on their origin and popularity in YouTube. Hence, we uploaded four videos from four different countries, namely from Hyderabad (India), New Jersey (), Birmingham () and Karlskrona (Sweden), with local help. To ensure that the video length and type do not differ, we downloaded the same popular video from YouTube and re-uploaded it. Care was taken to ensure that these videos were not opened anywhere near the Sweden locations. To avoid any effect of cached search strings, the URL was pasted directly into the address bar, and the traces were then collected using Wireshark. The URLs of the uploaded videos are given below.

S.no  Video URL                                                       Location
1     http://youtu.be/AX1Orh9CW6k                                     India
2     http://www.youtube.com/watch?v=sStxsORRaOs&feature=youtu.be     U.S
3     http://www.youtube.com/watch?v=IJ-9dQAbCjY                      U.K
4     http://www.youtube.com/watch?v=isOPKOLMh9E                      Sweden

Table 4.3.2: Videos Uploaded by Us

The Wireshark experiment was repeated under the same conditions on the fixed and wireless networks. The data was imported into an Excel sheet and the mathematical calculations were performed.

Figure 4.3.1: Capturing traces using Wireshark

Removing Advertisements

Out of the thirteen videos used, three contained advertisements. As the advertisements also transfer video packets, it is essential to separate the original video packets from those of the ads. This requires checking the CAP files, which provide all the details about the video packets. Before the video starts, the CAP file contains the information get videoad, and after that video finishes there is a new request for the original video, which can be recognised from the description. The source of the ad packets also differs from that of the original video packets: the video packets of the ads come from the local server and can easily be separated.

As stated in section 2.4, the three major steps of the experimental method were deployed. A set of videos was selected from the wide range of videos on YouTube. After the selection, the traces of the videos were collected under various experimental conditions, namely fixed and wireless networks at two different resolutions. Finally, the ON times, OFF times and bursts of all the videos were analysed.

Chapter 5

Analysis

5.1 Analysis

Chapter 4 explained the collection of traces. The present chapter explains how the collected traces are analysed. As explained, the experiment has two aspects. The chapter provides an overview of the video delivery methodology of YouTube, in which the packet arrival times and the strategy followed in burst delivery were analysed thoroughly. The videos were categorised into popular and non-popular sections. The methodology of video delivery in YouTube is explained with a focus on the difference between delivering popular and non-popular videos.

Burst: Areas of high intensity in the traces can be marked as bursts. For a trace, we define a burst as a set of packets delivered in a very short span of time, that is, a continuous flow of packets. For this research, if there is no data flow for more than 0.1 second, we consider that the following packets do not belong to the same burst and start a new one. In the traces, however, the time difference between two bursts was usually more than twenty seconds, so the bursts can easily be found and sorted.
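The 0.1-second rule above can be illustrated with a short sketch. It assumes the arrival times are sorted and given in seconds; the function name and the sample values are hypothetical and do not come from the thesis traces.

```python
def split_into_bursts(arrivals, max_gap=0.1):
    """Group sorted arrival times into bursts.

    A new burst starts whenever the gap to the previous packet is larger
    than `max_gap` seconds (0.1 s in the definition above).
    Returns a list of (start, end, packet_count) tuples.
    """
    bursts = []
    for t in arrivals:
        if bursts and t - bursts[-1][1] <= max_gap:
            start, _, count = bursts[-1]
            bursts[-1] = (start, t, count + 1)   # extend the current burst
        else:
            bursts.append((t, t, 1))             # open a new burst
    return bursts


# Hypothetical arrival times in seconds.
arrivals = [1.0, 1.0625, 1.125, 21.5, 21.53125, 45.0]
bursts = split_into_bursts(arrivals)
print(bursts)
```

Each tuple gives the start time, end time and number of packets of one burst, which are the quantities reported in the tables of this chapter.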

Time Scale Selection: As mentioned earlier, we selected different time scales, but the statistics given below are for the 0.01-second scale. As the scale is made finer, the data packets in the bursts can be observed more clearly. Below the 0.01-second scale, however, many intervals without packet delivery appear in the midst of a burst, which leads to confusion. At coarser scales, the intervals of packet arrival are not clear. Hence we set the 0.01-second scale as the standard for our analysis.

As a part of the experiment, a comparison between two different sets of videos was made and is presented below. In the first case, the burst strategy for a popular video is presented. In the second case, the uploaded video is presented. In the third and fourth cases, the popular and non-popular (uploaded) videos are compared in fixed and wireless networks, so that the different strategies deployed in the two cases can be observed.

5.1.1 Case-1: Comparison of Delivery Process between Fixed and Wireless Networks for a Popular Video

Serial   Network    Burst timings (s)      Inter-burst time (s)   Packets
1        LAN        19.584 - 20.498        -                      2428
         Wireless   9.216 - 10.796         -                      2448
2        LAN        33.531 - 33.744        13.033                 1239
         Wireless   23.696 - 24.365        12.900                 1219
3        LAN        59.108 - 59.250        25.364                 1194
         Wireless   49.109 - 49.910        24.744                 1222
4        LAN        84.505 - 84.837        25.255                 1204
         Wireless   74.486 - 75.116        24.575                 1219
5        LAN        113.707 - 113.912      28.871                 1208
         Wireless   103.892 - 104.828      28.777                 1199
6        LAN        145.106 - 145.352      31.194                 1221
         Wireless   135.152 - 135.860      31.254                 1221
7        LAN        178.377 - 178.752      33.025                 1221
         Wireless   168.499 - 169.353      32.640                 1194
8        LAN        209.729 - 210.103      30.978                 1221
         Wireless   199.851 - 200.551      30.498                 1220
9        LAN        239.134 - 239.482      29.031                 1191
         Wireless   229.187 - 229.821      28.636                 1220
10       LAN        270.585 - 271.091      31.103                 1221
         Wireless   260.648 - 261.349      30.828                 1221
Avg      LAN                               27.539
         Wireless                          27.206

Table 5.1.1: Comparison between Fixed and Wireless Networks of a popular video

In Table 5.1.1, each burst has one row for the fixed network (LAN) and one row for the wireless network. The same layout is used for the next case. The burst timings column gives the starting and ending time of each burst. The next column shows the time between two bursts. It can be observed that the inter-burst time is almost the same for both network types. The initial burst is a combination of two bursts in both cases.
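As a check on how the inter-burst column relates to the burst timings, the following sketch recomputes the gaps for the fixed network from the start and end times in Table 5.1.1. It assumes that the inter-burst time is the start of a burst minus the end of the previous burst, which is consistent with the tabulated values; the function name is hypothetical.

```python
# (start, end) of each burst on the fixed network, in seconds (Table 5.1.1).
lan_bursts = [
    (19.584, 20.498), (33.531, 33.744), (59.108, 59.250), (84.505, 84.837),
    (113.707, 113.912), (145.106, 145.352), (178.377, 178.752),
    (209.729, 210.103), (239.134, 239.482), (270.585, 271.091),
]


def inter_burst_times(bursts):
    """Gap between the end of one burst and the start of the next one."""
    return [nxt[0] - cur[1] for cur, nxt in zip(bursts, bursts[1:])]


gaps = inter_burst_times(lan_bursts)
average_gap = sum(gaps) / len(gaps)
print([round(g, 3) for g in gaps])
print(round(average_gap, 3))
```

The recomputed gaps agree with the table to within 0.001 s (the tabulated values were presumably derived from unrounded timestamps), and the average is about 27.539 s, matching the LAN average in the table.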

5.1.2 Case-2: Delivery Process Between Fixed and Wireless for Uploaded Video

Serial   Network    Burst timings (s)      Inter-burst time (s)   Packets
1        LAN        24.084 - 27.230        -                      2435
         Wireless   28.106 - 30.831        -                      2447
2        LAN        28.332 - 29.780        1.104                  1221
         Wireless   31.871 - 33.343        1.099                  1221
3        LAN        53.921 - 55.371        24.141                 1221
         Wireless   57.860 - 59.055        24.517                 1229
4        LAN        81.771 - 83.216        26.400                 1221
         Wireless   85.293 - 86.753        26.238                 1220
5        LAN        110.841 - 112.630      27.626                 1221
         Wireless   115.392 - 116.804      28.639                 1220
6        LAN        145.272 - 146.735      32.641                 1215
         Wireless   148.575 - 150.528      31.771                 1221
7        LAN        179.404 - 181.136      32.669                 1201
         Wireless   182.778 - 184.436      32.250                 1207
8        LAN        210.120 - 211.587      28.984                 1222
         Wireless   213.459 - 214.975      29.024                 1222
9        LAN        244.326 - 245.751      32.739                 1205
         Wireless   247.590 - 249.052      32.615                 1221
10       LAN        278.173 - 279.641      32.422                 1209
         Wireless   281.516 - 283.009      32.464                 1221
Avg      LAN                               26.525
         Wireless                          26.513

Table 5.1.2: Comparison between Fixed and Wireless of an Uploaded Video

Table 5.1.2 presents the statistics of the video delivery of the uploaded video in fixed and wireless networks. The delivery in both networks is almost the same. The first two bursts arrive together as in Case-1, and the third burst arrives after a short gap. In fact, the first three bursts arrive at almost the same time, but as stated earlier we consider a burst to be a new one when the time between the arrival of two video packets exceeds 0.1 second. The inter-burst time is almost the same in both networks.

5.1.3 Case-3: Comparison between Popular and Uploaded Video in Fixed Network

Serial   Video      Burst duration (s)   Inter-burst time (s)   CoV      Correlation
1        Popular    0.914                -                      0.564    99.55
         Uploaded   3.145                -                      0.177    94.93
2        Popular    0.213                13.033                 0.410    98.62
         Uploaded   1.448                1.110                  0.079    94.47
3        Popular    0.143                25.364                 0.369    97.7
         Uploaded   1.451                24.141                 0.0624   93.26
4        Popular    0.331                25.255                 0.473    97.43
         Uploaded   1.444                26.400                 0.0962   95.30
5        Popular    0.205                28.871                 0.349    99.60
         Uploaded   1.790                27.626                 0.223    90.60
6        Popular    0.246                31.194                 0.362    98.78
         Uploaded   1.464                32.641                 0.240    97.91
7        Popular    0.375                33.025                 0.708    99.49
         Uploaded   1.730                32.669                 0.399    93.54
8        Popular    0.373                30.978                 0.779    98.93
         Uploaded   1.467                28.984                 0.161    93.78
9        Popular    0.348                29.031                 0.419    97.93
         Uploaded   1.425                32.739                 0.131    91.89
10       Popular    0.506                31.103                 0.598    98.42
         Uploaded   1.469                32.422                 0.195    91.59
Avg      Popular    0.365                27.539                 0.503    98.612
         Uploaded   1.683                26.525                 0.176    93.727

Table 5.1.3: Comparison between Popular and Uploaded Video in Fixed Network

Table 5.1.3 shows the video delivery statistics for the popular and uploaded (non-popular) videos in the fixed network. Each burst has one row for the popular video and one row for the uploaded video. The same layout is used for the following case. The burst duration column gives the time that a burst lasts. In the popular video the first two bursts come as a set. Not much difference was observed in the inter-burst time.

5.1.4 Case-4: Comparison between Popular and Uploaded Video in Wireless Network

Serial   Video      Burst duration (s)   Inter-burst time (s)   CoV      Correlation
1        Popular    1.579                -                      0.387    92.12
         Uploaded   2.725                -                      0.168    94.93
2        Popular    0.669                12.900                 0.202    90.85
         Uploaded   1.472                1.0399                 0.224    94.47
3        Popular    0.801                24.744                 0.409    88.22
         Uploaded   1.195                24.517                 0.458    93.26
4        Popular    0.630                24.576                 0.183    94.22
         Uploaded   1.460                26.238                 0.218    95.33
5        Popular    0.935                28.777                 0.268    91.97
         Uploaded   1.413                28.638                 0.262    92.61
6        Popular    0.708                31.254                 0.365    87.52
         Uploaded   1.953                31.771                 0.415    97.91
7        Popular    0.853                32.640                 0.247    90.67
         Uploaded   1.658                32.250                 0.326    93.54
8        Popular    0.670                30.498                 0.125    96.17
         Uploaded   1.516                29.024                 0.346    93.78
9        Popular    0.633                28.636                 0.218    95.82
         Uploaded   1.462                32.615                 0.255    92.89
10       Popular    0.702                30.827                 0.307    85.64
         Uploaded   1.493                32.464                 0.228    92.59
Avg      Popular    0.818                27.200                 0.271    91.32
         Uploaded   1.635                26.506                 0.290    94.13

Table 5.1.4: Comparison between Popular and Uploaded Video in Wireless Network

Table 5.1.4 presents the video delivery statistics for the popular and uploaded (non-popular) videos in the wireless network. In the popular video the first two bursts come as a set and the next burst follows after a gap. Not much difference is observed in the inter-burst time.

After analysing all the different scenarios thoroughly, we plotted the Cumulative Distribution Function (CDF) and Complementary Cumulative Distribution Function (CCDF) graphs for each burst individually.
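For readers who want to reproduce such curves, an empirical CDF and CCDF can be computed as in the sketch below. It is a generic illustration rather than the authors' procedure: the function names and the sample packet counts are hypothetical, and the input is assumed to be the number of packets per interval within one burst.

```python
def empirical_cdf(samples):
    """Return the distinct sorted values and the fraction of samples <= each."""
    ordered = sorted(samples)
    n = len(ordered)
    values, cdf = [], []
    for rank, x in enumerate(ordered, start=1):
        if values and values[-1] == x:
            cdf[-1] = rank / n      # repeated value: keep the highest rank
        else:
            values.append(x)
            cdf.append(rank / n)
    return values, cdf


def complementary_cdf(cdf):
    """CCDF is 1 - CDF, the fraction of samples strictly above each value."""
    return [1.0 - p for p in cdf]


# Hypothetical packets-per-interval counts for one burst.
counts = [10, 20, 20, 40]
values, cdf = empirical_cdf(counts)
ccdf = complementary_cdf(cdf)
print(values, cdf, ccdf)
```

Plotting `cdf` against `values` gives a curve of the kind compared in the figures of this section.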

[Plot: Comparison of CDF (LAN). CDF values versus number of packets for the Popular, India, US, UK and Sweden videos.]

Figure 5.1.1: Comparison of CDFs between Popular and Uploaded Videos in Fixed Network

[Plot: Comparison of CDF (wireless). CDF values versus number of packets for the Popular, India, US, UK and Sweden videos.]

Figure 5.1.2: Comparison of CDFs between Popular and Uploaded Videos in Wireless Network

The graphs in Figures 5.1.1 and 5.1.2 show the CDF of the popular video against those of the uploaded videos in both networks. All four uploaded videos have almost the same CDF, but the CDF of the popular video differs considerably from the uploaded ones in both scenarios. Now that the burst strategy has been analysed, it is important to identify the distribution it follows. The CDF plots were compared with various distributions, and the one that agreed best with the obtained graphs was the Normal distribution. Further research is needed to analyse the distribution more closely.

Chapter 6

Results

The main focus of the thesis was the video delivery strategy of YouTube. As discussed in Chapter 5, the burst analysis was carried out, and the important conclusions are presented in the form of the following graphs.

6.1 Burst Durations

[Plot: Duration of Burst, Popular vs Uploaded. Time in seconds versus burst number (1 to 10) for the popular video and the uploaded video.]

Figure 6.1.1: Burst Duration in Popular video and Uploaded Video

Fig 6.1.1 is the error bar graph of the burst duration for popular and uploaded videos. The graph shows the positive/negative error between the bursts.

• For both the popular and uploaded videos, the first two bursts arrive combined. Hence the burst duration for the first one is longer (highlighted in red).

• The burst delivery is very fast for the popular video, so the burst duration is very short and does not exceed 0.7 seconds. For the uploaded video, the burst duration was more than twice that of the popular one in the fixed network and varies between 1.5 and 2 seconds.

• For the wireless network, the burst duration was slightly longer than in the fixed network but followed the same strategy.

6.2 Inter-Burst Times

[Plot: Inter-burst Time, Uploaded vs Popular (fixed and wireless). Time in seconds versus intervals between bursts (1 to 9) for Popular (fixed), Popular (wireless), Uploaded (fixed) and Uploaded (wireless).]

Figure 6.2.1: Inter Burst-Times in Popular video and Uploaded Video

• We now know that the first two bursts arrive combined. The next (third) burst follows shortly. In popular videos it arrives after 12-13 seconds, about half of the other inter-burst times.

• For the uploaded videos the third burst arrives even sooner, after about one second.

• The inter-burst time was kept almost constant for popular and uploaded videos (except the first one) and averaged 27 seconds for both fixed and wireless networks.

6.3 Burst Lengths

Figure 6.3.1: Length of the Burst in Popular video and Uploaded Video

• A burst consists of 1221 video packets. Due to network problems (such as delays and latency) this number may vary by about 30.

• This burst length was maintained for all the bursts in both types of videos and in the different networks.

• As the first burst is a combination of two bursts, it has 2448 packets, and the graph goes down to 1221 for the remaining bursts.

The network problems arise from lost or uncaptured packets.

6.4 Coefficient of variation of packets in a burst

Serial number   Network    Popular   Uploaded
1               LAN        0.503     0.176
2               Wireless   0.271     0.290

Table 6.4.1: Average of CoV for Popular and Uploaded Videos in Fixed LAN and Wireless Networks.

• A low CoV indicates that the data flow in the burst was smooth and the traffic flow was regular. Hence a low CoV is desirable.

• In the fixed network the CoV is better for uploaded videos, whereas in the wireless network it is slightly better for popular videos (0.271 versus 0.290).

• The coefficient of variation for the popular videos was larger than that of the uploaded videos in the fixed network.

• In wireless networks, the coefficient of variation varied only slightly for the popular videos and was larger for the uploaded videos.

As we go up in the scaling, the CoV values decrease because the burst gets smoother, as shown in the figure below.

[Plot: three panels, T=0.01s, T=0.02s and T=0.04s. Number of packets versus intervals in time (T).]

Figure 6.4.1: Number of packets per interval in different time scales

Fig 6.4.1 shows the graphs of a burst at the 0.01-second, 0.02-second and 0.04-second scales (in clockwise order). In the first case the burst was rather uneven and bumpy, and the CoV was 0.260. In the second case the burst was more stable, and the CoV was 0.045. In the third case the burst was rather stable, the traffic seemed to be fairly uniform, and the CoV was 0.029. We can see that as the time scale increases, the burst gets flatter and the CoV gets better. However, in order to analyse the real bursts, we need to look at the scale at which the burst does not have a uniform flow.
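The effect of the time scale on the CoV can be illustrated with a small sketch. It assumes that the CoV is the standard deviation of the packets-per-interval counts divided by their mean; the thesis does not state whether the population or the sample standard deviation was used, so the population form is an assumption here, and the packet counts are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean."""
    return pstdev(counts) / mean(counts)


def aggregate(counts, factor):
    """Merge `factor` neighbouring intervals into one (e.g. 0.01 s -> 0.02 s)."""
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


# Hypothetical packets per 0.01-second interval inside one burst.
fine = [1, 5, 2, 4, 4, 4, 1, 3]

cov_001 = coefficient_of_variation(fine)                 # 0.01-second scale
cov_002 = coefficient_of_variation(aggregate(fine, 2))   # 0.02-second scale
cov_004 = coefficient_of_variation(aggregate(fine, 4))   # 0.04-second scale
print(cov_001, cov_002, cov_004)
```

For this sample the CoV falls as intervals are merged, which mirrors the trend described above; the decrease is a property of this data, not something aggregation guarantees for every trace.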

6.5 Server Selection Strategy

If the video is a popular one, it is served from the nearest CDN, which is why the source of the popular videos is in Sweden. For the non-popular (uploaded) videos, the origin is a US server. The reason is that, to reduce network traffic, the popular videos are replicated and placed in most of the CDNs, so the main server does not need to send the requested video to every client. It redirects the request to the nearby CDN and the video is delivered from there.

Chapter 7

Conclusion and Outlook

7.1 Conclusions

This thesis presents the results of our experiment performed to characterize YouTube video streaming. An experimental set-up was created to examine, using Wireshark, the packet transfer of selected videos from the server to the end user in different networks. The objective of the thesis was to analyze the video streaming traffic in YouTube. The main points in focus were:

• Most annoying QoE problems
After a thorough literature review, we found that the main problems were stalling and re-buffering. Stalling was the most pertinent and annoying problem.

• Video delivery methodology and its difference in serving distinct videos based on their popularity
This research examined the YouTube video delivery process and found that YouTube has many hotspots (cache memory) all over the world. The popular videos are served from those local hotspots. For the non-popular videos, the main cache server located in the U.S delivers the video. The results have shown the origin of popular and non-popular videos.

• The main parameters involved in analyzing the video delivery and the distribution being followed by it
The video delivery is analyzed by thoroughly examining the bursts. The main parameters involved in analyzing a burst are burst duration, inter-burst time and length of the burst. The burst duration was comparatively very short for popular videos (less than 0.7 seconds). The inter-burst time and the length of the burst were almost constant in both cases.


7.2 Outlook

Further research is needed to analyze the underlying distribution in depth and to model the YouTube video delivery methodology. Appropriate steps should be proposed in order to reduce stalling. The work should also be extended to predict network problems and reduce them. Other popular video streaming portals such as Dailymotion can be selected and their statistics compared with YouTube to explore their potential.

Bibliography

[1] V. K. Adhikari, S. Jain, Y. Chen, and Z.-L. Zhang, Reverse engineering the youtube video delivery cloud, HotMD11, 2011.

[2] H. Hisamatsu, G. Hasegawa, and M. Murata, Non bandwidth-intrusive video streaming over tcp, in Eighth International Conference on Infor- mation Technology: New Generations (ITNG), April, 2011, pp. 7883.

[3] T. Hoÿfeld, Modeling YouTube QoE based on crowdsourcing and lab- oratory user studies.

[4] T. Hoÿfeld, F. Liers, T. Volkert, and R. Schatz, FoG and Clouds: Op- timizing QoE for YouTube, in KuVS 5thGI/ITG KuVS Fachgespräch NG Service Delivery Platforms, Munich, , Oct. 2011.

[5] T. Hossfeld, M. Seufert, M. Hirth, T. Zinner, P. Tran-Gia, and R. Schatz, Quantication of YouTube QoE via crowdsourcing, in IEEE International Symposium on, Multimedia (ISM), Dec. 2011, pp. 494499.

[6] M. Fiedler, S. Chevul, L. Isaksson, P. Lindberg, and J. Karlsson, Generic communication requirements of its-related mobile services as basis for seamless communications, in Next Generation Internet Net- works, April, 2005, pp. 426433.

[7] M. Fiedler, T. Hoÿfeld, and P. Tran-Gia, A Generic Quantitative Rela- tionship between Quality of Experience and Quality of Service, IEEE Network Special Issue on Improving QoE for Network Services, Jun. 2010.

[8] S. Möller, S. Egger, and M. Fiedler, 5 working groups 5.1 QoE white paper and group work 1: Key aspects of experience perception and their subjective evaluation, Quality of Experience: From User Perception to Instrumental Metrics, p. 21, 2012.

[9] P. Le Callet, S. Möller, and A. Perkis, Qualinet white paper on deni- tions of quality of experience (2012).

37 BIBLIOGRAPHY 38

[10] A. Khan, L. Sun, J. Fajardo, I. Taboada, F. Liberal, and E. Ifeachor, Impact of end devices on subjective video quality assessment for qcif video sequences, in Third International Workshop on, Quality of Mul- timedia Experience (QoMEX), Sept. 2011, pp. 177182.

[11] Y. Zhang and M. Fujise, Location management congestion problem in wireless networks, IEEE Transactions on, Vehicular Technology, vol. 56, no. 2, pp. 942954, March, 2007.

[12] N. Ahmed, S. S. Kanhere, and S. Jha, The holes problem in wireless sensor networks: a survey, ACM SIGMOBILE Mobile Computing and Communications Review, vol. 9, no. 2, pp. 418, 2005.

[13] S. Hong and Y.-T. Song, Improving QoS of the internet during conges- tion through auction, in Software Engineering, Articial Intelligence, Networking and Parallel/Distributed Computing, Sixth International Conference on, First ACIS International Workshop on Self-Assembling Wireless Networks. SNPD/SAWN, May, 2005, pp. 314319.

[14] P. Zhu, W. Zeng, and C. Li, Joint design of source rate control and qos- aware congestion control for video streaming over the internet, IEEE Transactions on, Multimedia, vol. 9, no. 2, pp. 366376, Feb.

[15] C.-R. Lan, H.-H. Lu, C.-W. Yi, and C.-C. Tseng, A P2P HD live video streaming system, in International Conference on, Multimedia Tech- nology (ICMT), July, 2011, pp. 475478.

[16] B. Barekatain and M. bin Maarof, Network coding eciency in live video streaming over peer-to-peer mesh networks, in 7th International Conference on, Information Technology in Asia (CITA 11), July, 2011, pp. 17.

[17] E. Haghani, N. Ansari, S. Parekh, and D. Colin, Trac-aware video streaming in broadband wireless networks, in Wireless Communica- tions and Networking Conference (WCNC), IEEE, April, 2010, pp. 16.

[18] T. Ako, H. Nishiyama, N. Ansari, and N. Kato, A novel multichannel streaming scheme to reduce channel switching delay in application layer multicast, Systems Journal, IEEE, vol. 5, no. 4, pp. 545554, Dec. 2011.

[19] N. Bouten, S. Latre, W. Van de Meerssche, K. De Schepper, B. De Vleeschauwer, W. Van Leekwijck, and F. De Turck, An au- tonomic delivery framework for http adaptive streaming in multicast- enabled multimedia access networks, in Network Operations and Man- agement Symposium (NOMS), IEEE, April, 2012, pp. 12481253. BIBLIOGRAPHY 39

[20] N. Carapeto, N. Amram, B. Fu, L. Marchetti, M. Marchisio, B. Sayadi, and R. Nossenson, Architecture for adaptable QoE-centric mobile video delivery, in Future Network Mobile Summit (FutureNetw), July, 2012, pp. 18.

[21] W. Kumwilaisak, Y. Hou, Q. Zhang, W. Zhu, C.-C. Kuo, and Y.-Q. Zhang, A cross-layer quality-of-service mapping architecture for video delivery in wireless networks, IEEE Journal on, Selected Areas in Com- munications, vol. 21, no. 10, pp. 16851698, Dec. 2003.

[22] B. Davie, Keynote, in 10th IEEE International Symposium on, Net- work Computing and Applications (NCA). IEEE, 2011, pp. xvixvi.

[23] L. Plissonneau, E. Biersack, and P. Juluri, Analyzing the impact of YouTube delivery policies on user experience, in 24th International, Teletrac Congress (ITC 24), Sept. 2012, pp. 18.

[24] V. Adhikari, S. Jain, and Z.-L. Zhang, Where do You Tube? uncov- ering youtube server selection strategy, in Computer Communications and Networks (ICCCN), Proceedings of 20th International Conference on, 31 Aug. 2011, pp. 16.

[25] V. K. Adhikari, S. Jain, and Z.-L. Zhang, YouTube trac dynamics and its interplay with a tier-1 isp: an isp perspective, in Proceedings of the 10th annual conference on Internet measurement. ACM, 2010, pp. 431443.

[26] P. Gill, M. Arlitt, Z. Li, and A. Mahanti, YouTube trac characteriza- tion: a view from the edge, in Proceedings of the 7th ACM SIGCOMM conference on Internet measurement. ACM, 2007, pp. 1528.

[27] Z. Huang, C. Mei, L. Li, and T. Woo, Cloudstream: Delivering high- quality streaming videos through a cloud-based svc proxy, in Proceed- ings IEEE, INFOCOM, April, 2011, pp. 201205.

[28] J. Shaikh, M. Fiedler, P. Arlos, and D. Collange, "Modeling and analysis of web usage and experience based on link-level measurements," in 24th International Teletraffic Congress (ITC 24), Sept. 2012, pp. 1–8.

[29] W. Jiang and H. Schulzrinne, "Analysis of on-off patterns in VoIP and their effect on voice traffic aggregation," in Computer Communications and Networks, Proceedings of Ninth International Conference on, 2000, pp. 82–87.

[30] G. Terdik and T. Gyires, "Internet traffic modeling with Lévy flights," in Seventh International Conference on Networking, ICN. IEEE, 2008, pp. 468–473.

[31] F. Luo, L. Dong, and F. Jia, "Method and implementation of building ForCES protocol dissector based on Wireshark," in The 2nd IEEE International Conference on Information Management and Engineering (ICIME), April 2010, pp. 291–294.

[32] S. Wang, D. Xu, and S. Yan, "Analysis and application of Wireshark in TCP/IP protocol teaching," in 2010 International Conference on E-Health Networking, Digital Ecosystems and Technologies (EDT), vol. 2, April, pp. 269–272.

[33] CISCO, "Global Internet Traffic Projected to Quadruple by 2015."

[34] M. Zink, K. Suh, Y. Gu, and J. Kurose, "Watch global, cache local: YouTube network traffic at a campus network: measurements and implications," in Electronic Imaging 2008. International Society for Optics and Photonics, 2008, pp. 681 805–681 805.

[35] G. Lin, G. M. Michko, C. J. Bonk, A. J. Bonk, and Y.-T. Teng, "Survey research on motivational elements of YouTube: Age and education matter," in The American Educational Research Association (AERA) Annual Meeting, San Diego, CA, 2009.

[36] K. Hunt, "Copyright and YouTube: Pirate's playground or fair use forum," Mich. Telecomm. & Tech. L. Rev., vol. 14, p. 197, 2007.

[37] S. Egger, T. Hossfeld, R. Schatz, and M. Fiedler, "Waiting times in quality of experience for web based services," in 2012 Fourth International Workshop on Quality of Multimedia Experience (QoMEX), July, pp. 86–96.

[38] S. Alcock and R. Nelson, "Application flow control in YouTube video streams," ACM SIGCOMM Computer Communication Review, vol. 41, no. 2, pp. 24–30, 2011.

[39] L. Plissonneau and E. Biersack, "A longitudinal view of HTTP video streaming performance," in Proceedings of the 3rd Multimedia Systems Conference. ACM, 2012, pp. 203–214.

[40] R. Torres, A. Finamore, J. R. Kim, M. Mellia, M. M. Munafo, and S. Rao, "Dissecting video server selection strategies in the YouTube CDN," in 31st International Conference on Distributed Computing Systems (ICDCS). IEEE, 2011, pp. 248–257.

[41] (2012) Browser Statistics. [Online]. Available: http://www.w3schools.com/browsers/browsers_stats.asp

[42] G. Combs et al., "Wireshark," Web page: http://www.wireshark.org/, last modified 2007.