Universidade de Aveiro Departamento de Eletrónica, Telecomunicações e Informática, 2016

José Miguel Costa Silva

Content Distribution in OTT Wireless Networks
Distribuição de Conteúdos em Redes OTT sem Fios


Dissertation presented to the Universidade de Aveiro in fulfilment of the requirements for the degree of Master in Electronics and Telecommunications Engineering, carried out under the scientific supervision of Professora Doutora Susana Sargento, Associate Professor with Habilitation at the Department of Electronics, Telecommunications and Informatics of the Universidade de Aveiro, and the scientific co-supervision of Doutor Lucas Guardalben, Researcher at the Instituto de Telecomunicações de Aveiro.

o júri / the jury

presidente / president: Professor Doutor Atílio Manuel da Silva Gameiro, Associate Professor, Department of Electronics, Telecommunications and Informatics, Universidade de Aveiro

vogais / examiners committee: Professor Doutor Manuel Alberto Pereira Ricardo, Associate Professor, Universidade do Porto (External Examiner)

Professora Doutora Susana Isabel Barreto de Miranda Sargento, Associate Professor with Habilitation, Department of Electronics, Telecommunications and Informatics, Universidade de Aveiro (Supervisor)

agradecimentos / acknowledgments

First of all, I would like to thank my parents, who made it possible for me to study at university in the course I chose, and who supported me throughout. I would also like to thank my godmother for her support. Secondly, but no less importantly, I want to thank my girlfriend, Deolinda Moura, who put up with me all these years, helped me overcome some difficult moments, and always believed in me. I also want to thank all my friends, Carina, Raúl, Carolina, Tiago and André, for their help and support. I am grateful to Professora Doutora Susana Sargento and Doutor Lucas Guardalben for guiding me along this path. Finally, I thank the NAP research group, in particular Carlos, João Nogueira and Pedro, for all their support, the Instituto de Telecomunicações de Aveiro for the resources provided, and all the professors I had the opportunity to meet over all these years, who contributed to the completion of one more stage of my life. My sincere thanks to everyone!

“The best way to predict the future is to create it.” -Peter Drucker

Resumo

Nos países desenvolvidos, cada vez mais a Internet é considerada um bem essencial. A necessidade de estar “online”, partilhar e aceder a conteúdos são rotinas frequentes no dia-a-dia das pessoas, tornando assim a Internet num dos sistemas mais complexos em operação. A maioria das comunicações tradicionais (telefone, rádio e televisão) estão a ser remodeladas ou redefinidas pela Internet, dando origem a novos serviços, como o protocolo de Internet por voz (VoIP) e o protocolo de Internet de televisão (IPTV). Livros, jornais e outro tipo de publicações impressas estão também a adaptar-se à tecnologia web ou têm sido reformuladas para blogs e feeds. A massificação da Internet e o aumento constante das larguras de banda oferecidas aos consumidores criaram condições excelentes para serviços multimédia do tipo Over-The-Top (OTT). Serviços OTT referem-se à entrega de áudio, vídeo e outros via Internet sem usar o controlo dos operadores de rede. Apesar da entrega OTT apresentar uma proposta atrativa (e lucrativa, olhando para o rápido crescimento de serviços como o YouTube, Skype e Netflix, por exemplo), esta sofre de algumas limitações. É necessário manter níveis elevados de Qualidade-de-Experiência (QoE) para continuar a atrair clientes. Para isso é fundamental uma rede de distribuição de conteúdos capaz de se adaptar à rapidez com que os conteúdos são requeridos e rapidamente descartados e que consiga albergar todo o tráfego. Esta dissertação foca-se na distribuição de conteúdos OTT nas redes sem fios, por forma a endereçar a falta de trabalhos de investigação nesta área. É proposta uma solução que visa poder ser integrada pelos equipamentos de rede para, desta forma, estes serem capazes de prever que tipo de conteúdo os consumidores conectados (ou nas proximidades) possam vir a solicitar e colocá-lo em memória antes de ser pedido, melhorando a percepção com que os consumidores recebem o mesmo. Dada a falta de informação na literatura sobre gestão e controlo de proxy caches para sistemas embutidos, o primeiro passo foi testar e avaliar dois algoritmos de cache diferentes: Nginx e Squid. Os resultados mostram que existe um compromisso entre o desempenho de cache e velocidade no processamento dos pedidos, apresentando o Nginx um melhor desempenho mas piores tempos nas respostas aos pedidos. Foi também verificado que o tamanho da cache nem sempre determina um melhoramento significativo nos resultados. Às vezes, manter apenas o conteúdo mais popular em cache é suficiente. De seguida, foram propostos e testados dois algoritmos de previsão de conteúdos (prefetching) em cenários de mobilidade, dadas as características das redes sem fios, onde foi possível observar melhorias de desempenho muito significativas, demonstrando que existe a possibilidade de ser viável um investimento nesta área, embora isto implique um aumento na capacidade de processamento/consumo de energia dos equipamentos de rede.

Abstract

In developed countries, the Internet is increasingly considered an essential and integral part of people’s lives. The need to be “online” and to share and access content is a frequent routine in people’s daily lives, making the Internet one of the most complex systems in operation. Most traditional communications (telephone, radio and television) are being remodelled or redefined by the Internet, giving rise to new services such as Voice over Internet Protocol (VoIP) and Internet Protocol Television (IPTV). Books, newspapers and other types of printed publications are also adapting to web technology or have been redesigned as blogs and feeds. The massification of the Internet and the constant increase in the bandwidth offered to consumers have created excellent conditions for Over-The-Top (OTT) multimedia services. OTT services refer to the delivery of audio, video and other data over the Internet without the control of network operators. Although OTT delivery is an attractive proposition (and a profitable one, judging by fast-growing services such as YouTube, Skype and Netflix), it suffers from some limitations. It is necessary to maintain high levels of Quality-of-Experience (QoE) to continue to attract customers. This requires a content distribution network that can adapt to the speed with which contents are requested and quickly discarded, and that can accommodate all the traffic. This dissertation focuses on the distribution of OTT contents in wireless networks, in order to address the lack of research work in this area. A solution is proposed that can be integrated into the network equipment so that it can predict what content connected (or nearby) consumers may request and place it in memory before it is requested, improving the consumers’ perception of the service. Given the lack of information in the literature on the management and control of proxy caches for embedded systems, the first step was to test and evaluate two different cache solutions: Nginx and Squid. The results show that there is a trade-off between cache performance and speed in processing requests, with Nginx delivering better cache performance but worse response times. It was also found that the cache size does not always determine a significant improvement in the results; sometimes, keeping just the most popular content cached is enough. Afterwards, two content prefetching algorithms were proposed and tested in mobility scenarios, given the characteristics of wireless networks, where it was possible to observe very significant performance improvements, demonstrating that investment in this area may be viable, although it implies an increase in the processing capacity and power consumption of the network equipment.

Contents

Contents

List of Figures

List of Tables

Acronyms

1 Introduction
1.1 Motivation
1.2 Objectives and Contributions
1.3 Document Organization

2 State of the art
2.1 Introduction
2.2 Over-The-Top (OTT) Multimedia Networks
2.2.1 OTT Multimedia Services in Telecommunication Operators
2.3 Content Delivery Networks (CDNs)
2.3.1 The CDN Infrastructure
2.3.2 Content Distribution Architectures
2.3.2.1 Centralized Content Delivery
2.3.2.2 Proxy-Caching
2.3.2.3 Peer-to-Peer (P2P)
2.3.3 Content Delivery Network Interconnection (CDNI)
2.3.4 Mobile Content Delivery Networks (mCDNs)
2.3.5 CDNs and Multimedia Streaming
2.4 Multimedia Streaming Technologies
2.4.1 Traditional Streaming
2.4.2 Progressive Download
2.4.3 Adaptive Streaming Technologies
2.4.3.1 Adaptive Segmented HTTP-based delivery
2.5 Prefetching
2.6 Multimedia Caching
2.6.1 Caching Algorithms
2.6.2 Proxy-Caching Solutions
2.7 Chapter Considerations

3 Wireless Content Distribution
3.1 Introduction
3.2 Problem Statement
3.3 Proposed Solution
3.3.1 Proxy Caching Strategies for Embedded Systems
3.3.2 Prefetching and Mobile Consumers
3.4 Chapter Considerations

4 Implementation
4.1 Introduction
4.2 Architecture and Implementation Overview
4.2.1 Consumers
4.2.2 Nodes
4.2.2.1 Node Initialization
4.2.2.2 Prefetching
4.2.2.3 Neighbors Manager
4.3 Chapter Considerations

5 Integration and Evaluation
5.1 Introduction
5.2 Hardware and Description
5.2.1 Consumers and Origin Server
5.2.2 Single Board Computer with Wireless USB Adapter
5.2.3 Operating System - OpenWrt
5.3 Raspberry and Web Servers Configuration
5.3.1 Raspberry Configuration
5.3.2 Squid Configuration
5.3.3 Nginx Configuration
5.4 Evaluation
5.4.1 Performance Metrics
5.4.2 Proxy Cache Metrics
5.4.3 Support Scripting
5.4.3.1 Script CPU Usage and Load
5.4.3.2 Scripts to perform the test scenario
5.4.3.3 Scripts to generate final results
5.4.4 Scenario 1: Caching strategies for embedded systems
5.4.4.1 Approach by Number of Requests
5.4.4.2 Approach by Time
5.4.4.3 Approach by Cache Size
5.4.5 Scenario 2: Prefetching and Mobile consumers
5.5 Chapter Considerations

6 Conclusion and Future Work
6.1 Conclusions
6.2 Future Work

Bibliography

List of Figures

1.1 OTT Video Ecosystem - Example

2.1 Typical CDN infrastructure [1]
2.2 Centralized OTT delivery
2.3 Proxy Cache OTT delivery
2.4 P2P OTT delivery
2.5 CDNI use case 1
2.6 CDNI use case 2
2.7 Typical mCDN infrastructure
2.8 Traditional Streaming [2]
2.9 Progressive Download Architecture [3]
2.10 Example of how Progressive Download works [3]
2.11 Progressive Download features [3]
2.12 Segmented HTTP Adaptive Streaming [4]
2.13 IIS Smooth Streaming Media Workflow [5]
2.14 Client Manifest File - Example

3.1 Typical Wireless CDN infrastructure
3.2 Proposed Architecture Overview
3.3 Scenario used to test different cache approaches
3.4 Scenario used to test prefetching and consumers mobility

4.1 Architecture main blocks
4.2 Mobile Consumer Flow Chart
4.3 Methods Implemented in Mobile Consumers
4.4 One Node: Block Diagram and Interactions
4.5 Initialization Flow Chart
4.6 First Algorithm Flow Chart
4.7 Second Algorithm Flow Chart
4.8 ServerSocket Flow Chart
4.9 Packet fields
4.10 ClientSocket Flow Chart
4.11 Example of how Prefetching is sent through Neighbors Manager

5.1 Raspberry Pi 2 [6]
5.2 TP-LINK TL-WN722N USB Wireless Adapter [7]
5.3 User Interfaces
5.4 Nginx Configuration Example
5.5 CPU/Load Measure Flow Chart
5.6 MakeAll Scenario 1 Flow Chart
5.7 MakeAll Script Scenario 2 Flow Chart
5.8 Cache Performance (number of requests)
5.9 Request Time vs Qualities (number of requests)
5.10 Request Time vs Technologies (number of requests)
5.11 CPU Load and Usage (number of requests)
5.12 Cache Performance (time)
5.13 Request Time vs Qualities (time)
5.14 Request Time vs Technologies (time)
5.15 CPU Load and Usage (time)
5.16 Cache Performance (cache)
5.17 Request Time vs Qualities (cache)
5.18 Request Time vs Technologies (cache)
5.19 CPU Load and Usage (cache)
5.20 Cache Performance (prefetching)
5.21 Request Time vs Qualities (prefetching)
5.22 Request Time vs Technologies (prefetching)
5.23 CPU Load and Usage (prefetching)

List of Tables

2.1 FIFO cache Replacement Policy
2.2 LRU cache Replacement Policy
2.3 MPU cache Replacement Policy
2.4 Comparison of Five Proxy Cache Solutions - Main features

3.1 Available Qualities and Average Size per Chunk

4.1 Videos and their Probabilities to be Requested

5.1 Machines used to simulate consumers and the origin server

Acronyms

AP Access Point

ABS Adaptive Bitrate Streaming

CAPEX Capital Expenditures

CARP Cache Array Routing Protocol

CDN Content Delivery Network

CDNI Content Delivery Network Interconnection

CDNi-WG Content Delivery Networks interconnection - Work Group

CPU Central Processing Unit

CSV Comma-Separated Values

DASH Dynamic Adaptive Streaming over HTTP

DHCP Dynamic Host Configuration Protocol

EST-VoD Electronic Sell Through VoD

FIFO First-In-First-Out

FLV Flash Video

GDB GNU Debugger

GPL GNU General Public License

HDD Hard disk drive

HDS HTTP Dynamic Streaming

HLS HTTP Live Streaming

HTCP Hypertext Caching Protocol

HTTP Hypertext Transfer Protocol

HTTPS Hypertext Transfer Protocol Secure

ID Identity

IDE Integrated Development Environment

IIS Internet Information Services

IP Internet Protocol

IPTV Internet Protocol Television

ISP Internet Service Provider

IT Information Technology

LFU Least Frequently Used

LRU Least Recently Used

MB MegaByte

Mbps Megabits Per Second

mCDN Mobile Content Delivery Network

MPEG Moving Pictures Expert Group

MPU Most Popularly Used

NAT Network Address Translation

OPEX Operational Expenditures

OS Operating System

OTT Over-The-Top

P2P Peer-to-Peer

PC Personal Computer

PEVQ Perceptual Evaluation of Video Quality

POSIX Portable Operating System Interface

PVR Personal Video Recorder

QoE Quality-of-Experience

QoS Quality-of-Service

QT QuickTime

RAN Radio Access Network

RDT Real Data Transport

RTCP Real-Time Control Protocol

RTMP Real-Time Messaging Protocol

RTP Real-Time Transport Protocol

RTSP Real-Time Streaming Protocol

SBC Single Board Computer

SD Secure Digital

SSH Secure Shell

S-VoD Subscription VoD

TV Television

T-VoD Transaction VoD

TCP Transmission Control Protocol

UDP User Datagram Protocol

USB Universal Serial Bus

VoD Video-on-Demand

VoIP Voice over Internet Protocol

VCR Video Cassette Recorder

Wi-Fi Wireless Fidelity

WMV Windows Media Video

XML Extensible Markup Language

Chapter 1

Introduction

1.1 Motivation

In recent years, with increasing Internet access speeds and the proliferation of mobile devices, consumer habits have been changing. There is a clear increasing trend of non-linear Television (TV) video watching versus broadcast TV services. Regarding Video-on-Demand (VoD) consumption, most of the traditional movie rental stores have closed or have shifted their focus to online video delivery (e.g. Netflix). People tend to prefer services with multi-screen support that they can access anytime and anywhere. As a consequence, the number of OTT-based services, characterized by being transmitted through the network of an operator without control over the distribution, has been on the rise. Some examples of heavily used OTT services are shown in Figure 1.1.

Figure 1.1: OTT Video Ecosystem - Example.

This growth has raised several issues, both in terms of scalability, reliability and QoE. The network needs to adapt to the growing number of clients that use it everyday and request content. It is necessary to store the most craved contents near the consumers, in order to avoid overloading the entire network. Although the structure that exists today is being developed in this direction, the big problem is the consumer’s QoE. The consumer’s QoE of a service is directly related to the opinion that the user gets right after using it. A service that presents a large delay or continuous quality breaks in the visualization of a video stream is, therefore, evaluated with low QoE. Although, in theory,

the network is able to accommodate all consumers, in practice this implies a reduction in the QoE in proportion to the total number of clients to serve. According to a 2013 report by Conviva [8], 39.3% of video views experienced buffering, 4% of the views failed to start, and 63% of the views experienced low resolution [9]. There is a need to improve the content delivery time to the users, in order to improve their QoE. One way to do this is to predict in advance what content they may require, based on their recent interactions with the service, and store it in memory so that it can be delivered quickly when requested. This process is called prefetching. One of the most critical points in the whole delivery is the wireless layer where, on the one hand, there is stronger interference (which leads to a higher packet loss that impacts the video experience felt by the consumer) and, on the other hand, the resources are very limited, both at the processing level and at the memory level. However, this is the layer closest to the user, and thus it is the one where it is possible to obtain the best content delivery times if the contents are cached. Another challenge in wireless networks is consumer mobility. There is no point in predicting and caching content if it is never requested because the consumer has already moved. It is necessary to be able to send the content to other neighboring network nodes (e.g. Access Points (APs)) before the consumers arrive there, to make the network seamless to the users' movement. These APs must work together, thus forming a wireless content distribution network. Predicting content is not easy, but it is a challenge that certainly motivates Information Technology (IT) professionals.

1.2 Objectives and Contributions

The objective of this dissertation is to build a decentralized and scalable wireless content distribution network, which addresses the high consumption of content, the optimized placement of functionalities in mobile networks and the dynamic dissemination of content across the web. The main objectives are as follows:

• Evaluate different proxy cache solutions: for the management and control of scattered caches along the wireless network in the available APs.

• Propose, implement and test different prefetching algorithms: in order to predict the content that will be requested by the consumers, as well as its location, and in this way cache it before it is requested.

• Improve consumers' QoE: in order to provide the best possible experience at a given point in time (which will happen if the requested content is cached), the multimedia services need to be delivered in real or nearly real time, without freezes and with low buffering.

• Build a demonstrator in the laboratory: which allows testing different solutions for the management and control of scattered caches along the network in mobility scenarios and with the prefetching mechanisms.

The work on the prefetching algorithms for the wireless content distribution network will be published in a scientific paper.

1.3 Document Organization

This document is organized as follows:

• Chapter 1 contains the Introduction of the work.

• Chapter 2 presents the state of the art about Over-The-Top (OTT) Multimedia Networks, Content Delivery Networks (CDNs), Multimedia Streaming Technologies, Prefetching and Multimedia caching.

• Chapter 3 presents the problem behind OTT multimedia networks and the proposed solution to improve it, explaining the scenarios used.

• Chapter 4 presents the architecture and the implementation of the proposed solution.

• Chapter 5 presents the integration and the evaluation of the implemented solution.

• Chapter 6 presents the conclusion and the future work.

Chapter 2

State of the art

2.1 Introduction

In order to give the reader a better comprehension of this document, this chapter presents the fundamental concepts that support the developed work, and an analysis of related work in this area of study. The chapter is structured as follows:

• Section 2.2 introduces the concept of Over-The-Top (OTT) Multimedia Networks and other widely used concepts, such as Linear TV, Time-shift TV and Video-on-Demand (VoD).

• Section 2.3 introduces the concept of Content Delivery Networks (CDNs), explaining their main structure, functionalities and main architectures. As consumers currently require access to content everywhere, there is a need to introduce new concepts such as Content Delivery Network Interconnection (CDNI) and Mobile Content Delivery Networks (mCDNs). At the end, the issues of using CDNs to serve multimedia content are also addressed.

• Section 2.4 presents the evolution of the types of streaming, starting with traditional streaming, moving to the well-established progressive download and then to adaptive and scalable streaming, giving special importance to adaptive segmented HTTP-based delivery, since it will be used in the implementation of this MSc thesis.

• Section 2.5 explains the concept of prefetching, focusing on web prefetching algorithms such as prefetch by Popularity, Lifetime and Good Fetch. These algorithms only take the characteristics of the web objects into account as metrics. Thus, an evolutionary algorithm that also takes into account the consumers' behavior and the content itself is presented.

• Section 2.6 provides an overview of popular caching algorithms, namely First-In-First-Out (FIFO), Least Recently Used (LRU) and Least Frequently Used (LFU). It also presents an approach developed in our research group - Most Popularly Used (MPU). This section ends with a survey of proxy-caching solutions for embedded systems.

• Section 2.7 presents the chapter summary and considerations.

2.2 Over-The-Top (OTT) Multimedia Networks

Nowadays, most people have access to the Internet, increasingly at higher speeds. As a consequence, OTT multimedia networks have grown in recent years. In OTT multimedia networks, the content is delivered without the involvement of a multiple-system operator controlling the distribution of the content. Since the Internet Service Provider (ISP) networks are used by third-party services free of charge, this type of delivery is called Over-The-Top and is considered unmanaged delivery. The networks used to deliver Internet Protocol (IP) services can be classified into two main classes, managed/closed or unmanaged/open [10], depending on whether the operator controls the traffic or not. In a managed network, the ISP guarantees a very high Quality-of-Service (QoS) to subscribers (people who pay for it). This type of service is used in IPTV offerings such as AT&T's [11]. IPTV refers to the delivery of digital television and other audio and video services over broadband data networks using the same basic protocols that support the Internet. On the other hand, in an unmanaged network, the ISP does not guarantee QoS. All the content receives the same treatment (no resource reservation is made). Because of that, the consumption of video services over the best-effort Internet raises multiple issues, like QoE [12]. QoE comes from the users' expectations. In order to provide the best possible experience at a given point in time, the multimedia services need to be delivered in real or nearly real time, which is why the video delivery cannot freeze and must keep buffering low. Also, in live events, low end-to-end delay is required. In order to minimize or overcome these issues, the multimedia delivery infrastructure needs to be well understood, and it is thus the focus of the following sections.

2.2.1 OTT Multimedia Services in Telecommunication Operators

To provide strong competition to "pure-OTT" business models (YouTube, Netflix...), telecommunication operators are also moving towards OTT-based distribution systems, due to the fact that television visualization has become more asynchronous [10]. The linear channel schedule will be less important than the programs themselves. Because of that, telecommunication operators want to give users the choice of accessing the content and services they want, and pay for, on a wide range of client devices, instead of being constrained to a location (home) or a device (TV). Live television will still attract audiences, particularly for news, sports, events and national occasions. As a consequence of this shift to OTT services, it is important to understand what CDNs and streaming protocols are and how to improve QoE. The following sections will discuss this.

Telecommunication operators, to deliver their TV contents, may use:

• Linear TV, i.e. "regular TV broadcast" respecting a predetermined program lineup, which was for decades considered the traditional and most popular way of watching TV programs. This is still the dominant way of watching TV from national free-to-air TV services and major Pay-TV operators [13].

• Time-shift TV relates to the visualization of deferred TV content, i.e. linear-TV content that is recorded to be watched later, using one of the following services:

1. Pause TV, allowing users to pause the television program they are currently watching, from a few seconds up to several hours. Users can resume the TV broadcast when they want, continuing where they left off, skip a particular segment or eventually catch up to the linear broadcast [13].

2. Start-over TV gives the opportunity for the users to restart programs that have already started or finished from the beginning. The amount of time that it is possible to rewind varies from operator to operator, ranging from some minutes up to 24 h. The number of TV channels supporting this feature is also a decision of the operator [13].

3. Personal Video Recorder (PVR), where the recordings depend on the user action, i.e., they only occur if the user proactively schedules a TV program or a series to be recorded, or if he decides to start recording a program that is being watched. The behavior of the service is similar to the one of a Video Cassette Recorder (VCR), however with a larger storage capacity and nonlinear access. The user can start watching a recording whenever he wants, even if the program is still being recorded [13].

4. Catch-up TV is the most advanced time-shift service, relying on an automated process of "Live to VoD". With this service, TV operators offer recorded content of the previous hours up to 30 days, and the number of recorded TV channels varies from operator to operator. Using this type of service, users can catch up on TV programs that they have missed or that they explicitly decide to watch [13].

• Video-on-Demand (VoD) refers to services where users need to pay to watch a specific content, in one of the following ways:

1. Transaction VoD (T-VoD) is the most typical version of the service, where customers need to pay a given amount of money whenever they want to watch a content from the VoD catalog. The rental time depends on the operator, but is usually 24 or 48 h, during which they can watch it several times [13].

2. Electronic Sell Through VoD (EST-VoD) is a version of the VoD service involving the payment of a one-time fee to access the purchased content without restrictions, usually on a specific operator platform. This method of VoD is common in OTT providers like Apple iTunes and Amazon Instant Video; it is also offered by traditional Pay-TV operators like Verizon's FiOS TV [13].

3. Subscription VoD (S-VoD) corresponds to the business model also adopted by OTT providers like Netflix, where customers pay a monthly fee that allows them to watch whatever they want from the provider's catalog an unlimited number of times [13].

2.3 Content Delivery Networks (CDNs)

The impressive dissemination of the Internet worldwide has generated a significant shift in its usage, with respect to what it was originally conceived for. Initially, the Internet was mainly designed as a robust, fault-tolerant network to connect hosts for military and scientific

applications. Nowadays, one of the main usages of the Internet is content generation, sharing and access by millions of simultaneous users [14], making it one of the most complex systems in operation, both in the number of protocols supported and in sheer scale. As a consequence, CDNs arose to provide the end-to-end scalability necessary to support this growth. CDNs emerged in 1998 [15] and became a fundamental piece of the modern delivery infrastructure. They were developed to provide numerous benefits, such as a shared platform for multi-service content delivery, reduced transmission costs for cacheable content, improved QoE for end users and increased robustness of delivery, i.e., maximized bandwidth. For these reasons they are frequently used for large-scale content delivery. In this context, the content refers to the data being managed (video, audio, documents...) and its associated metadata. Some examples of CDN application technologies are found in Media Providers (YouTube, Netflix), Social Networks (Facebook, Twitter), Network Operators (Telefonica, Vodafone) and Service Providers (Amazon CloudFront, Akamai).

2.3.1 The CDN Infrastructure

CDNs are complex, with many distributed components collaborating to deliver content across different network nodes. Usually, they can be subdivided into three main functional blocks: delivery and management, request routing and performance measurement. Figure 2.1 represents the typical CDN infrastructure.

Figure 2.1: Typical CDN infrastructure [1].

Content Delivery and Management System

As can be seen, CDNs are composed of multiple servers, the CDN nodes, sometimes called replica or surrogate servers, that acquire data from the origin servers (using a private network), which contain all the data. These replica servers store copies of the origin servers' content so that they can serve content themselves, reducing the load on the origins. To deliver content to end users with QoS guarantees, CDN administrators must ensure that replica servers are strategically placed across the Web. Generally, the issue is to place M replica servers among N different sites (N > M) in a way that yields the lowest cost (widely known as the minimum K-median problem [16]). With an optimal number of replica servers, ISPs will benefit from reduced bandwidth consumption, and Web servers will improve performance by reducing latency for their clients. After properly placing the replica servers in a CDN, there comes another issue: what content should be replicated in the replica servers? This is commonly known as content outsourcing. Traditionally, three main categories of content outsourcing have been established [17]:

• Cooperative push-based: content replication based on prefetching with cooperation from replica servers. First, the content is prefetched (loaded in cache before it is requested) to replica servers, and then the replica servers cooperate in order to reduce the replication and update cost. In this scheme, the CDN maintains a mapping between content and replica server, and each request is directed to the closest replica server (that has the requested object), or otherwise, the request is directed to the origin server. This approach is traditionally not used on commercial networks given that proper content placement algorithms require knowledge about the Web clients and their demands, which is data that is not commonly available for CDN providers.

• Uncooperative pull-based: content replication similar to traditional caches, without prefetching and cooperation. In this approach, clients' requests are directed to their closest replica server. If the content is not in cache (cache miss), then the request is directed to another replica server or to the origin server. In other words, the replica servers, which serve as caches, pull content from the origin server when a cache miss occurs (see the sketch after this list). The problem in this approach is that CDNs do not always choose the optimal server from which to serve the content. However, in its simplicity lies the key to successful deployments on popular CDNs such as Akamai or Mirror Image.

• Cooperative pull-based: an evolution of the uncooperative pull-based where the replica servers cooperate with each other in the event of a cache miss. In this approach, the content is also not prefetched. Client requests are directed to their closest replica server but, in case of cache miss, the replica servers cooperate in order to find neighboring servers that can accommodate the request and avoid requests to the origin server. This approach typically draws concepts and algorithms from Peer-to-Peer (P2P) technologies.
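As a rough illustration of the uncooperative pull-based scheme, the following Python sketch shows a replica server that serves content from its local cache and pulls from the origin only on a cache miss. The origin URL, the unbounded dictionary cache and the function names are hypothetical simplifications; a real deployment adds eviction, expiry and request routing.

```python
import urllib.request

ORIGIN = "http://origin.example.com"  # hypothetical origin server

class ReplicaServer:
    """Minimal uncooperative pull-based replica: no prefetching, no cooperation."""

    def __init__(self):
        self.cache = {}  # content path -> bytes (real caches are bounded)

    def fetch_from_origin(self, path: str) -> bytes:
        # On a cache miss, the replica pulls the object from the origin server.
        with urllib.request.urlopen(ORIGIN + path) as resp:
            return resp.read()

    def serve(self, path: str) -> bytes:
        if path in self.cache:               # cache hit: serve locally,
            return self.cache[path]          # avoiding traffic to the origin
        data = self.fetch_from_origin(path)  # cache miss: pull from origin
        self.cache[path] = data              # keep it for future requests
        return data
```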

The issue of what content to place on the CDN node is not a trivial one. The two main limitations are dynamic and user-specific content, and the limited cache resources of edge/replica servers, which must be properly managed in order to obtain the best results (a higher cache hit ratio). Depending on how the CDN is devised, multiple protocols may be used in the interaction between the different replica servers, such as Cache Array Routing Protocol (CARP) [18] or

Hypertext Caching Protocol (HTCP) [19]. However, the CDN administrator usually implements its own communication or interaction protocols. Sharing cache contents among Web proxies significantly reduces traffic to the Internet [20].

Request Routing System

A request routing system is responsible for forwarding end users' requests to the best replica server able to serve them. This choice is determined through a set of algorithms specifically designed for the purpose. It means that, in the context of request routing, the best server is not necessarily the closest one in terms of physical distance [21]. The selection process should consider aspects like network proximity (in terms of hops), the client-perceived latency and the server load, for example. The request routing system also interacts with the content delivery and management system to keep the content stored in the CDN nodes up-to-date [22].
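As an illustration of such a selection policy, the hedged sketch below scores candidate replicas by a weighted combination of hop count, measured latency and current load, and picks the lowest score. The weights and the replica records are invented for the example; production request routers rely on much richer telemetry.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    hops: int          # network proximity to the client, in hops
    latency_ms: float  # client-perceived latency estimate
    load: float        # current server load: 0.0 (idle) to 1.0 (saturated)

def score(r: Replica) -> float:
    # Lower is better; the weights are illustrative only.
    return 1.0 * r.hops + 0.5 * r.latency_ms + 50.0 * r.load

def best_replica(replicas: list[Replica]) -> Replica:
    # The best server is not necessarily the physically closest one.
    return min(replicas, key=score)

replicas = [
    Replica("edge-a", hops=3, latency_ms=12.0, load=0.9),  # close but busy
    Replica("edge-b", hops=5, latency_ms=20.0, load=0.1),  # farther but idle
]
print(best_replica(replicas).name)  # -> edge-b, despite being farther away
```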

Performance Measurement

Measuring performance is essential. For a CDN, it means understanding the impact on the user side (QoE), the traffic billing due to inefficient use of bandwidth, and the required number of servers, which affect Capital Expenditures (CAPEX) and Operational Expenditures (OPEX). In order to get the best performance, CDNs are also designed to deliver specific types of content. As a consequence, Akamai HD [23] is more optimized for streaming than, for example, CloudFlare [24], which is more optimized for serving dynamic applications. In a few words, a single optimal and universal solution for CDNs does not exist.

2.3.2 Content Distribution Architectures

As previously said, CDNs provide a scalable OTT delivery network, in order to reduce transmission costs for cacheable content while increasing robustness and QoE for the consumers. With this in consideration, it becomes necessary to decentralize OTT delivery. Thus, this sub-section presents some details about content distribution architectures. First, the centralized approach is presented, followed by two of the most widely used approaches to improve scalability and reliability: Proxy-Caching and P2P.

2.3.2.1 Centralized Content Delivery

The centralized approach to OTT delivery is the simplest one. Clients are directly connected to the origin servers without any intermediary, as depicted in figure 2.2. When the client wants a specific content, a unicast stream is created directly between the origin server and the consumer device. The advantage of this approach is the lowest delivery delay when streaming live content. However, this approach introduces many disadvantages. Firstly, it may not be scalable. Each new connected consumer implies more bandwidth requirements, which is extremely expensive. Secondly, this approach does not scale properly with geographically distributed consumers. The farther away the consumers are from the origin server, the higher the access delay to the content, which is more problematic if the streaming session is using Transmission Control Protocol (TCP). Another problem is security: direct access to the origin servers can be problematic.

Figure 2.2: Centralized OTT delivery.

2.3.2.2 Proxy-Caching

The proxy-caching approach is an alternative to the centralized solutions and has the main objective of decentralizing OTT delivery, introducing security and scalability. This architecture is illustrated in figure 2.3.

Figure 2.3: Proxy Cache OTT delivery.

As can be seen, consumers only communicate directly with the proxy cache (the intermediary), which acquires the content from the origin server and caches it. The number of segments cached for each object is dynamically determined by the cache admission and replacement policies. The use of proxy caches brings several advantages when compared with the centralized approach. The first and most obvious is scalability. This approach can handle more users at the same time, because the proxy cache, in case of a cache hit, can deliver the content directly to the consumer, avoiding the need to ask the origin for it and congest it. In this case, it also improves the user QoE, since the proxy caches are closer to the user. Security is also improved, since the consumers cannot interact directly with the origin server. In terms of bandwidth, costs are reduced too, through savings in core and transit network traffic.

This approach has some disadvantages, like increased management and deployment complexity, and an increased end-to-end delay in case of a cache miss. However, the benefits of this approach outweigh its disadvantages.

2.3.2.3 Peer-to-Peer (P2P)

The P2P approach is another possibility to decentralize OTT delivery and also increase security and scalability. This architecture is illustrated in figure 2.4.

Figure 2.4: P2P OTT delivery.

In this approach, consumers and proxy caches provide resources as well as use them. Each node (peer) may communicate with others to locate and exchange missing parts (called chunks) of a certain content. The biggest advantage of using P2P is the origin bandwidth savings, since it uses the uplink capacity of the users' and proxy caches' connections. Although P2P uses the available upstream bandwidth better than the proxy-caching approach, P2P streaming has several problems which prevent it from being widely used in the OTT environment, such as the startup delay of a new streaming session (locating and acquiring data from peers takes longer than streaming directly from an origin or proxy cache) and additional delays with playback lag in live streaming (but this can be reduced using agiler [25]).
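As a toy illustration of the chunk-exchange idea just described, the sketch below has a peer ask its neighbors for the chunks it is missing. The Peer class, the chunk numbering and the method names are hypothetical; real P2P systems add piece-selection strategies, incentives and peer discovery.

```python
class Peer:
    """Hypothetical peer holding some chunks of a single content item."""

    def __init__(self, name: str, chunks: dict[int, bytes]):
        self.name = name
        self.chunks = dict(chunks)  # chunk index -> chunk data

    def missing(self, total: int) -> list[int]:
        return [i for i in range(total) if i not in self.chunks]

    def fetch_missing(self, neighbors: list["Peer"], total: int) -> None:
        # Ask neighbors for missing chunks instead of the origin server,
        # saving origin bandwidth (the main advantage of the P2P approach).
        for i in self.missing(total):
            for n in neighbors:
                if i in n.chunks:
                    self.chunks[i] = n.chunks[i]
                    break

a = Peer("A", {0: b"c0", 1: b"c1"})
b = Peer("B", {2: b"c2"})
b.fetch_missing([a], total=3)
print(sorted(b.chunks))  # -> [0, 1, 2]
```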

2.3.3 Content Delivery Network Interconnection (CDNI)

CDNs provide numerous benefits in large-scale content delivery (as mentioned in the previous subsections). As a result, it is desirable that a given item of content can be delivered to the consumer regardless of the consumer's location or attachment network. This creates a need for interconnecting standalone CDNs, so they can interoperate and collectively behave as a single delivery infrastructure [26, 27]. Typically, ISPs operate over multiple areas and use independent CDNs. If these individual CDNs were interconnected, the capacity of their services could be expanded without the CDNs themselves being extended. An example of CDNI between two different CDN providers located in two different countries is illustrated in figure 2.5.

Figure 2.5: CDNI use case 1.

As can be seen, CDNI enables CDN A to deliver content held within country A to consumers in country B (and vice versa) by forming a business alliance with CDN B. Both can benefit from the possibility of expanding their service without extra investment in their own networks. One of the advantages of CDNI for content service providers is that it enables them to increase their number of clients without forming business alliances with multiple CDN providers [28]. Another use case of CDNI is to support mobility [29] (figure 2.6). Telecommunication operators want to give users the choice of accessing the content and services they want, and pay for, on a wide range of client devices, instead of being constrained to a location (home) or device (TV). As a consequence, with CDNI, users can access the same content seen at home when outside the home, on their smartphones or tablets.

Figure 2.6: CDNI use case 2.

As described above, CDNI provides benefits to CDN service providers as well as to consumers, since providers can expand their service and the content is delivered to the end-users from a nearby replica server, improving QoE. However, since legacy CDNs were implemented using proprietary technology, there has not been any support for an open interface for connecting with other CDNs. Even though several CDNs could be interconnected, it is still difficult to

know the consumption of each user in order to charge for it. Fortunately, the Content Delivery Networks interconnection Work Group (CDNi-WG) is working to allow CDN collaboration under different administrations [30]. The documents can be accessed online at [31].

2.3.4 Mobile Content Delivery Networks (mCDNs)

In the last few years, with the advances in broadband wireless technology that provide high-speed access over a wide area, the proliferation of mobile devices, such as smartphones and tablets, has been rapid and is expected to increase significantly in the coming years. In fact, the increasing number of wireless devices accessing mobile networks worldwide is one of the primary contributors to global mobile traffic growth [32]. As a consequence, and because the constraints and impairments associated with wireless video delivery are significantly more severe than those associated with wireline delivery, there is the need to consider it as a potential performance bottleneck in the next generation delivery of high-bandwidth OTT multimedia content, which is the main focus of this MSc thesis. Three of the primary issues that must be considered for wireless video distribution are:

1. Mobile device CPU, screen and battery limitations: mobile devices are typically multi-use communication devices that have different processors, display sizes, resolutions and batteries. Usually, mobile devices with smaller form factors utilize processors with lower capability and screen sizes than tablets, for example, but the battery lasts longer. For these reasons, it is necessary that the video assets be transcoded into formats and a "gear set" that ensure both good quality and efficient use of resources for transmission and decoding. The battery consumption for video delivery to the client device must be minimized.

2. Wireless channel throughput impairments and constraints: the uncontrolled growth of Wi-Fi networks (probably the most popular technology to support personal wireless networks), particularly in densely populated areas where Wi-Fi networks must coexist, leads to interference problems. The impact of adjacent channel interference depends on the transmission power of the interferers, their spectrum transmit masks, the reception filters of the stations receiving the interfering signal, and the statistical properties of the interferers' utilization of the spectrum. This problem can dramatically affect the performance of wireless networks, particularly in live video delivery. Another aspect is the lower available bandwidth and the typically smaller displays on mobile devices: the "gear sets" used for wireless are very different from those for wireline. An example is a device with a 640x480 display that can work with video bitrates anywhere between 0.16 and 1.0 Megabits Per Second (Mbps), as opposed to the 2.5 Mbps required by a 1280x720 display.

3. Diverse wireless Radio Access Networks (RANs): wireless networks can vary between commercial GSM, 3G and LTE up to personal networks such as Wi-Fi and Bluetooth. Video delivery solutions should be RAN agnostic. Should the client side control the requested video bit rate, differences in RAN performance can be automatically accommodated as long as there are enough "gears" provided to span the required range [33].

In order to improve performance in wireless video delivery (with impact on the global optimization of OTT multimedia delivery), the concept of mobile Content Delivery Networks (mCDNs) arises, as illustrated in figure 2.7.

Figure 2.7: Typical mCDN infrastructure.

As can be seen, mCDNs are used to optimize content delivery in mobile networks (which have higher latency, higher packet loss and huge variation in download capacity) by adding storage to an AP, such as a wireless router, so that it can serve as a dispersed cache/replica server, providing the content (if it is cached) with low latency and high user experience, preventing network congestion. Although this approach presents improvements, the cache size is reduced compared to traditional proxy caches used in the core network. Thus, the ideal would be to cache only content that would be requested by consumers in the near future. Consequently, in later sections, this document will address prefetching and multimedia caching algorithms to better understand how the content in replica servers can be replaced.
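Because the cache in an AP is small, the replacement policy matters. As a rough illustration of one of the policies surveyed in Section 2.6, the sketch below implements a bounded Least Recently Used (LRU) cache of the kind an AP could hold; the capacity value and the chunk keys are hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self.entries:
            return None                # cache miss
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used

# Example: with capacity 2, inserting a third chunk evicts the oldest one.
cache = LRUCache(capacity=2)
cache.put("chunk-1", b"...")
cache.put("chunk-2", b"...")
cache.get("chunk-1")          # touching chunk-1 makes chunk-2 the LRU entry
cache.put("chunk-3", b"...")  # evicts chunk-2
```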

2.3.5 CDNs and Multimedia Streaming

As mentioned before, CDNs have been developed to serve content to Internet users, improving performance and providing scalable services. Although using CDNs brings many benefits, serving multimedia content presents some issues. To better understand this problem, let us take as an example the streaming of high-quality multimedia content. As can be expected, high quality implies a larger size (in MegaBytes (MBs)) than normal multimedia streaming. To keep this content in memory, a significant amount of free space is necessary on the servers' Hard disk drive (HDD). As a direct consequence, this reduces the number of items that a given server may hold and potentially limits the cache hit ratios. Another problem is the significant impact on the traffic volume of the network in the event of a cache miss, due to the content size; it may reduce the user QoE, since it will take more time to populate the replica server. Additionally, if the user does not require all the content or skips portions of a video, many resources are wasted. Live streaming services represent another challenge to CDNs. If the impact on traffic volume is significant, the delay (origin to consumer) is higher and the service may no longer be considered a "live" streaming service. In order to tackle these problems, several advanced multimedia streaming technologies have been developed, and they are presented in the next section.

2.4 Multimedia Streaming Technologies

Streaming is the process of transferring data through a channel to its destination, where data is decoded and consumed via client/device in real time. The difference between streaming

and non-streaming scenarios (also known as file downloading) is that, in the first case, the client media player can begin to play the data (such as a movie) before the entire file is transmitted, as opposed to non-streaming, where all data must be transferred before being played back. Streaming is never a property of the data that is being delivered, but an attribute of the distribution channel. In other words, this means that, theoretically, most media can be streamed [34]. With the recent advances in high-speed networks, and since full file transfer in download mode usually suffers from long and perhaps unacceptable transfer times, streaming media usage has increased significantly [35]. As a consequence, IPTV platforms and OTT services, such as Netflix, have grown in popularity over the last decade. Typically, video stream services use different types of media streaming protocols. Although they differ in implementation details, they can be classified into two main categories: Push-based and Pull-based protocols. In Push-based streaming protocols, after the establishment of the client-server connection, the server streams packets to the client until the client stops or interrupts the session. Therefore, in Push-based streaming, the server maintains a session state with the client, listening for session-state changes [36]. The most common session control protocol used in Push-based streaming is the Real-Time Streaming Protocol (RTSP), specified in RFC 2326 [37]. On the other hand, in Pull-based streaming protocols, the media client is the active entity that requests content from the media server. Consequently, the server remains idle or blocked, waiting for client requests. The Hypertext Transfer Protocol (HTTP) is a common protocol for Pull-based media delivery [36]. The type of content being transferred and the underlying network conditions usually determine the methods used for communication. To give an example, in "live" streaming the priority is on low latency, jitter and efficient transmission, where occasional losses might be tolerable; in contrast, in on-demand streaming, when the content does not exhibit any particular time constraints or time relevance, the quality is the most important factor. In the following subsections, the most important streaming protocols will be presented, starting with Traditional Streaming (Push-based), which was the first protocol implemented, proceeding to progressive download, which is one of the most widely used Pull-based media streaming methods, and ending with an analysis of adaptive streaming protocols (also Pull-based), namely segmented HTTP-based delivery, given its growing popularity and potential.

2.4.1 Traditional Streaming

RTSP (Real-Time Streaming Protocol) is a good example of traditional streaming; it may use the Real-Time Transport Protocol (RTP) (although the operation of RTSP does not depend on the transport mechanism used to deliver continuous media) and the Real-Time Control Protocol (RTCP) [37]. RTSP can be defined as a stateful protocol, which means that after the establishment of a client-server connection, the server keeps track of the client's session state. The client can communicate its state to the server by using commands such as Play (to start the streaming), Pause (to pause the streaming) or Teardown (to disconnect from the server and close the streaming session) [38, 39]. Traditional streaming using the RTP streaming protocol is illustrated in figure 2.8.

Figure 2.8: Traditional Streaming [2].

After a session between the client and server has been established, the server starts sending the media, using the RTP data channel either over User Datagram Protocol (UDP) or TCP, as a stable stream of small packets (the default RTSP packet size is 1452 bytes, which means that, in a video encoded at 1 megabit per second, each packet contains information of about 11 milliseconds of video). To maintain a stable session, RTSP data may be interleaved with RTP and RTCP packets in order to collect QoS data such as bytes sent, packet losses and jitter. In cases of non-critical packet losses, this protocol supports graceful degradation of the playback quality, and this is why it is used for live streaming. RTSP has several issues, such as scalability and complexity. The support of millions of devices requires managing millions of sessions and multiple ports. For this reason, along with advances in available network capacity and bandwidth, the use of RTSP has become outdated, although niche use-cases still exist, such as video-conferencing. Other examples of traditional streaming protocols include the Real-Time Messaging Protocol (RTMP) (belonging to Adobe Systems) and RTSP over the Real Data Transport (RDT) protocol (belonging to RealNetworks) [38].
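The 11 ms figure follows directly from the packet size and the bit rate:

$$\frac{1452~\text{bytes} \times 8~\text{bits/byte}}{10^{6}~\text{bits/s}} \approx 11.6~\text{ms of video per packet}$$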

2.4.2 Progressive Download

Progressive download is a pseudo-streaming method that is widely used nowadays, because it allows high scalability. This approach is a simple file download from an HTTP server. The term "progressive" arises from the fact that, as soon as the media player receives some data, the playback may begin, while the download is still in progress (typically to the Web browser cache). Therefore, progressive download is considered pseudo-streaming due to these particular characteristics [38]. The progressive download architecture is illustrated in figure 2.9, and an example, which helps to better understand how progressive download works, can be seen in figure 2.10. In this example, the dark blue bar shows how far the video has been viewed, and the light blue bar shows how much of the video has been loaded into the browser (which acts as a buffer/cache). Thus, once the buffer is filled with a few seconds of video, the video will begin to play as if it were in real time.

Figure 2.9: Progressive Download Architecture [3].

Figure 2.10: Example of how Progressive Download works [3].

This technique allows the use of several codecs (video formats), such as H.264 (MP4), Flash Video (FLV), QuickTime (QT) and Windows Media Video (WMV), and several bit rates (video quality) (figure 2.11), in order to optimize the stream depending on the user's device, which does not happen in traditional streaming. Progressive download ensures better quality than traditional streaming, because there is no packet loss at the client.

Figure 2.11: Progressive Download features [3].

The advantage of being supported via HTTP is what makes it highly scalable. File downloading through HTTP is stateless (if an HTTP client requests some data, the server responds by sending the data, without having to remember the client or its state) and, because of that, it can easily use proxy servers and distributed caches or CDNs. The consequence associated with this scalability is the loss of several features of RTSP, such as support for live streaming, adjustable streaming based on QoS metrics, or graceful degradation (missing packets will stop playback, waiting for the required data to be downloaded). However, progressive download is widely used and supported by most media players and platforms, including Adobe Flash, Silverlight and Windows Media Player [38].
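To make the pseudo-streaming idea concrete, the hedged sketch below downloads a file over plain HTTP and hands each chunk to a stub player as it arrives, so playback can begin before the transfer finishes. The URL and the play_chunk stub are hypothetical; a real player also fills a buffer of a few seconds before starting playback.

```python
import urllib.request

VIDEO_URL = "http://example.com/video.mp4"  # hypothetical media file

def play_chunk(chunk: bytes) -> None:
    # Stub standing in for a real decoder/renderer.
    print(f"decoding {len(chunk)} bytes while the download continues")

def progressive_download(url: str, chunk_size: int = 64 * 1024) -> None:
    # A plain stateless HTTP GET: the server just sends bytes, so proxies
    # and CDN caches can serve the same file without per-client state.
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            play_chunk(chunk)  # playback starts before the file is complete

progressive_download(VIDEO_URL)
```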

2.4.3 Adaptive Streaming Technologies

Adaptive streaming arises as a way to take advantage of the major benefits of each technology previously analyzed. If, on the one hand, RTSP contemplates some adaptation via the QoS feedback metrics sent through RTCP, on the other hand it presents scalability problems, as opposed to progressive download, which presents high scalability but no support for live streaming, no adjustable streaming based on QoS metrics and no graceful degradation. In scenarios of unreliable or varying network conditions, adaptation is a crucial feature of any streaming technology. The concept of adaptive video streaming is based on the idea of adapting the bandwidth required by the video stream to the throughput available on the network. The adaptation is performed by varying the quality of the streamed video and thus its bit rate, that is, the number of bits required to encode one second of playback [40]. There are several ways to provide adaptation, most typically in the encoding or distribution processes. Since segmented HTTP-based delivery uses adaptation in the distribution process, only this class will be discussed.

2.4.3.1 Adaptive Segmented HTTP-based delivery

Segmented HTTP-based delivery can be seen as an evolution of progressive download streaming, because it provides the same benefits without the disadvantages of not supporting adaptation or live streaming. The idea behind this method is to chop the media file into fragments (or "chunks"), usually 2 to 4 seconds long [38], and then encode each fragment at different qualities and/or resolutions. Thereafter, the choice of fragment quality or resolution is made on the client side based on several parameters, such as the available throughput and the user's device (battery level, available computing resources); e.g., the client can switch to a higher bit rate if bandwidth permits. This approach is illustrated in figure 2.12.

Figure 2.12: Segmented HTTP Adaptive Streaming [4].
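To make this client-side rate decision concrete, the following minimal sketch (in Python, the language used for the implementation in chapter 4) picks the highest bitrate that fits within a safety margin of the measured throughput. The bitrate ladder matches the one used later in this work (table 3.1); the margin value and the throughput estimate are illustrative assumptions, not the decision logic of any particular player.

AVAILABLE_BITRATES = [230_000, 331_000, 477_000, 688_000, 991_000,
                      1_427_000, 2_056_000, 2_962_000]  # bits per second

def choose_quality(measured_throughput_bps, margin=0.8):
    # Keep a safety margin so short-term throughput dips do not stall playback.
    budget = measured_throughput_bps * margin
    feasible = [b for b in AVAILABLE_BITRATES if b <= budget]
    return max(feasible) if feasible else AVAILABLE_BITRATES[0]

print(choose_quality(1_500_000))  # ~1.5 Mbit/s measured -> 991,000 bit/s track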

The client has full control over the downloaded data. Thus, it can mix different chunks of the same video, and the only impact will be on the quality or resolution of the media. The algorithmic process of deciding the optimal representation for each fragment, in order to optimize the streaming experience, is the major challenge in adaptive streaming systems. Problems such as estimating the dynamics of the available throughput, controlling the filling level of the local buffer/cache (in order to avoid underflows and, consequently, playback interruptions), maximizing the quality of the stream and minimizing the delay between the user's request

and the start of playback are not trivial [40]. Although this technology presents several challenges, it is growing in popularity due to its potential. Consequently, multiple implementations of segmented HTTP-based delivery have emerged, such as Apple HTTP Live Streaming (HLS), Microsoft Smooth Streaming, Adobe HTTP Dynamic Streaming (HDS) and Moving Pictures Expert Group (MPEG) Dynamic Adaptive Streaming over HTTP (DASH). Since the implementation of this MSc thesis uses Microsoft Smooth Streaming, it is presented in more detail below.

Microsoft Smooth Streaming

Smooth Streaming was developed by Microsoft as its response in the adaptive streaming area. It is based on the HTTP and MPEG-4 file format standards [41]. To create a Smooth Streaming presentation, an encoder is necessary (usually Microsoft Expression Encoder or another compatible solution) to encode the same source content at several quality levels, typically with each level in its own complete file. The content is then delivered using a Smooth Streaming-enabled Internet Information Services (IIS) origin server. Thereafter, a Smooth Streaming client is required; Microsoft provides a client implementation based on Silverlight [5]. The IIS Smooth Streaming media workflow is illustrated in figure 2.13.

Figure 2.13: IIS Smooth Streaming Media Workflow [5].

After the IIS origin server receives a request for media, it sends the client a manifest file describing the requested media and dynamically creates cacheable virtual fragments from the video files. The benefit of this virtual fragment approach is that the content owner can manage complete files rather than thousands of pre-segmented content files [5]. There are two different manifest files - client and server - both in Extensible Markup Language (XML) format. The client manifest is downloaded by the client and processed in order to initiate playback. This file reveals the internal structure of the adaptive content, such as the number of available tracks (encoded streams), resolution, duration and

how they are fragmented (number of chunks and the duration of each). With this information, the client may decide to first request the lowest quality chunks in order to evaluate the network conditions, and then decide whether to scale up the quality. An example of a client manifest file is shown in figure 2.14.

Figure 2.14: Client Manifest File - Example.
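Since the content of figure 2.14 is not reproduced in this text, the sketch below illustrates how a client could extract the available tracks and chunk count from a client manifest using only the Python standard library. It assumes the typical SmoothStreamingMedia/StreamIndex/QualityLevel layout of such manifests; the attribute values are invented for illustration and do not correspond to the figure.

import xml.etree.ElementTree as ET

manifest = """<SmoothStreamingMedia MajorVersion="2" MinorVersion="0" Duration="6537916667">
  <StreamIndex Type="video" Chunks="4" QualityLevels="2"
               Url="QualityLevels({bitrate})/Fragments(video={start time})">
    <QualityLevel Index="0" Bitrate="2962000" MaxWidth="1280" MaxHeight="720"/>
    <QualityLevel Index="1" Bitrate="230000" MaxWidth="320" MaxHeight="180"/>
    <c d="20000000"/><c d="20000000"/><c d="20000000"/><c d="20000000"/>
  </StreamIndex>
</SmoothStreamingMedia>"""

root = ET.fromstring(manifest)
for stream in root.iter("StreamIndex"):
    bitrates = [int(q.get("Bitrate")) for q in stream.iter("QualityLevel")]
    durations = [int(c.get("d")) for c in stream.iter("c")]  # 100-ns units
    print(stream.get("Type"), bitrates, len(durations), "chunks")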

The server manifest file is used by the streaming service (usually the IIS origin server). This file provides a macro description of the encoded content, such as the number of encoded streams (tracks), the track type (video, audio or text), the location of each track and further information (codec type, bit rate, resolution, etc.). The following section will present the importance of using a prefetching algorithm in CDNs, since the content on the Internet continues to grow while the size of the cache is fixed.

2.5 Prefetching

Prefetching is a generic term that refers to loading something into memory before it is explicitly requested. Thus, web prefetching is a technique which reduces

the user-perceived latency by predicting web objects and storing them in advance, hoping that the prefetched objects are likely to be accessed in the near future [42, 43, 44]. Therefore, a prefetching mechanism needs to be used in conjunction with a caching strategy. Prefetching strategies are diverse, and no single strategy has yet been proposed which provides optimal performance, since there will always be a compromise between the hit ratio and bandwidth [45]. Intuitively, to increase the hit ratio, it is necessary to prefetch those objects that are accessed most frequently; but, to minimize the bandwidth consumption, it is necessary to select those objects with longer update intervals [44]. To be effective, prefetching must be implemented in such a way that prefetches are timely, useful and introduce little overhead [45]. Downloading data that is never used is, of course, a waste of resources. Many studies have been carried out over the years. Initially, prefetching algorithms considered as metrics only the characteristics of the web objects, such as their access frequency (popularity), sizes and lifetimes (update frequency). This led to the proposal of several prefetching algorithms, such as prefetch by popularity, prefetch by lifetime and good fetch, to name a few.

Prefetch by Popularity: Markatos et al. [46] suggested a "Top Ten" criterion for prefetching web objects. Each server maintains access records of all objects it holds and periodically calculates a list of the 10 most popular objects, keeping them in memory [44, 47]. The problem with this approach is that it assumes that all users have the same preferences and, moreover, it does not keep track of a user's history of accesses during the current session [48]. A slight variant of the "Top Ten" approach is to prefetch the m most popular objects from the entire system. Since popular objects are the most likely to be required, this approach is expected to achieve the highest hit rate [44, 47].
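A minimal sketch of this popularity criterion, assuming a simple per-object access log (the names access_log and m are illustrative, not taken from [46]):

from collections import Counter

def most_popular(access_log, m=10):
    # Return the m most frequently accessed object identifiers.
    return [obj for obj, _ in Counter(access_log).most_common(m)]

# Example: object 'a' was requested three times, so it heads the prefetch list.
print(most_popular(['a', 'b', 'a', 'c', 'a', 'b'], m=2))  # ['a', 'b']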

Prefetch by Lifetime: The lifetime of an object is the interval between two consecutive modifications of the object. Since prefetching increases system resource requirements, such as server disk and network bandwidth, the latter will probably be the main limiting factor [49]. Thus, since the content is downloaded from the web server whenever the object is updated, in order to reduce bandwidth consumption it is natural to choose those objects that are less frequently updated. Prefetch by lifetime selects the m objects with the longest lifetime to replicate in the local cache and thus aims to minimize the extra bandwidth consumption [44, 47].

Good Fetch/Threshold: Venkataramani et al. [50] proposed a threshold algorithm that balances the access frequency and the update frequency (lifetime), and only fetches objects whose probability of being accessed before being updated exceeds a specified threshold. The intuition behind this criterion is that objects with relatively higher access frequencies and longer update intervals are more likely to be prefetched. Thus, this criterion provides a natural way to limit the bandwidth wasted by prefetching.
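A hedged sketch of this criterion follows. The closed form used below, 1 − (1 − p_i)^(a·l_i), with a the aggregate request rate, p_i the per-request access probability of object i and l_i its lifetime, follows our reading of the good-fetch criterion in [50] and should be treated as an approximation under independent arrivals.

def good_fetch(p_i, a, l_i):
    # Probability that the object is requested at least once during its
    # lifetime, assuming independent requests arriving at aggregate rate a.
    return 1.0 - (1.0 - p_i) ** (a * l_i)

def should_prefetch(p_i, a, l_i, threshold=0.5):
    return good_fetch(p_i, a, l_i) >= threshold

# A rarely-updated object (long lifetime) clears the bar even at low popularity:
print(should_prefetch(p_i=0.001, a=10.0, l_i=600.0))  # True (~0.998)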

Although these methods exhibit some efficacy, recent studies reveal that prefetching algorithms must take into account several factors, such as the session/consumer behavior and the content itself. Andersson et al. [51] investigated the potential of different prefetching and/or caching strategies for different user behaviors in a catch-up TV network (a concept explained in subsection 2.2.1). The objective was to reduce the zapping time, i.e. the time from the channel selection to the start of playout, or even to react quickly to fast forward or rewind, in order to improve the user's perceived QoE. In this study, consumers were divided into two

groups, depending on their behavior: zappers and loyals. Zappers are characterized by fast switching between programs, searching for the desired view, while loyals consume the first selection to the end before requesting another view. With this division, Andersson et al. [51] aimed to better interpret the behavior of the consumers, in order to improve the service and keep the customer satisfied. The main conclusions were that zappers are more prone to watch several different streams per session than loyals, although they have shorter sessions. It was also discovered that zapping-prone channels exist: a movie channel, for example, exhibited more zapping behavior than a mixed channel. Therefore, it is interesting to compare the channels' request patterns and not just the consumer behavior. An analysis regarding the different gains in predicting episodes of the same series was also carried out: loyals have a higher probability than zappers of going from episode X to episode X + 1, while zappers are more likely than loyals to request previous episodes. Thus, a good prefetching algorithm should be able to adapt to the profiles of both consumers and contents. In a similar approach, but without division into groups, Nogueira et al. [52] present a detailed analysis of the characteristics of users' viewings. The results show that Catch-up TV consumption exhibits very high levels of utilization throughout the day, especially on weekends and Mondays. While a higher utilization on the weekends is expected, since consumers tend to have more free time, the service utilization on Mondays is explained by users catching up on programs that they missed on the weekend. The superstar effect is also notorious: in a universe of 88,308 unique programs, the top 1,000 programs are responsible for approximately 50% of the total program requests. Results also show that most Catch-up TV playbacks occur shortly after the original content airing, and that users have a preference mostly for General, Kids, Movies and Series content, by virtue of it not being time dependent; Sport and News genres, in contrast, quickly become irrelevant after the first two days. All data concerning the habits of each user, their preferences and even the days of greater affluence to the service is helpful in the elaboration of an effective and efficient prefetching mechanism.

In [53], an algorithm is presented that takes into account information collected from the user session in real time. Bonito et al. [53] conceived this prediction system, which must be able to adapt itself to changes in a reasonable time. Thus, a set of "prediction machines" is defined, evolved and evaluated through an evolutionary algorithm in order to obtain better prediction performance using information gathered from user sessions. A user session is defined as a sequence of web requests from a given user. Each request is composed of an identifier - a "GET", "POST" or "HEAD" request to a web server coming from a host with a specified IP address over the HTTP protocol [RFC 2616]. The user is considered an anonymous entity, approximately identified by the combination of the source IP address and the client cookie; the same person connecting to the same site at a later time would be identified as a different user. A request from a user is only considered valid if the HTTP response code is 2xx (success). Thus, the evolutionary algorithm manages a series of user requests, learning from them and trying to find a pattern in order to predict the user's next action.

Prefetching techniques do not reduce the underlying network latency; they exploit the time when the network is idle to anticipate the user's next action, reducing the perceived response time if

the prediction is correct. The following section will present the concept of multimedia caching and the most widely used algorithms.

2.6 Multimedia Caching

Section 2.3 explains in detail how multimedia content is delivered in OTT networks using the CDN infrastructure. This infrastructure, as mentioned, is composed of multiple replica servers that acquire and cache data from origin servers, in order to bring the content closer to the consumer and avoid large network congestion. Multimedia caching is then the process of storing multimedia data (web content and video) in a cache [54]. The cache storage potential depends on several factors, such as user behavior, content popularity and the caching algorithm itself, since the cache size of replica servers is limited. Thus, a prefetching algorithm (see section 2.5) should be used along with the caching algorithm in order to help predict the user behavior and better decide whether the content is relevant to cache. Therefore, this section will present the only factor that has not yet been discussed: the caching algorithm. The next subsection provides an overview of popular caching algorithms, such as FIFO, LRU and LFU. It also presents an approach developed in our group, the MPU.

2.6.1 Caching Algorithms

In order to understand how some cache policies work, it is preferable to consider a simple example: the reference string (1,2,3,4,1,2,5,1,2,3,4,5) represents the order in which content is requested (different numbers represent different contents), with a cache size of 3 elements [55]. The simplest page-replacement algorithm is FIFO. A FIFO replacement algorithm associates with each page the time when that page was brought into memory. When a page must be replaced, the oldest page is chosen. Table 2.1 demonstrates the application of the reference string to a cache employing FIFO, and shows that 3 cache hits are achieved, along with 9 page faults.

Table 2.1: FIFO cache Replacement Policy

Iteration:  1    2    3    4    5    6    7    8    9    10   11   12
Request:    1    2    3    4    1    2    5    1    2    3    4    5
Result:     miss miss miss miss miss miss miss hit  hit  miss miss hit
Page 1:     1    1    1    4    4    4    5    5    5    5    5    5
Page 2:          2    2    2    1    1    1    1    1    3    3    3
Page 3:               3    3    3    2    2    2    2    2    4    4

This algorithm is easy to understand and program. However, its performance is not always good, since FIFO is well known for being vulnerable to Bélády's anomaly [56]. The LRU replacement algorithm uses the recent past as an approximation of the near future, removing the least recently accessed items as needed in order to have enough space to insert a new item. This approach does not suffer from Bélády's anomaly because it belongs to a class of page-replacement algorithms called stack algorithms [55]. Table 2.2 demonstrates the

application of the reference string to a cache employing LRU, and shows that 2 cache hits are achieved, along with 10 page faults.

Table 2.2: LRU cache Replacement Policy

Iteration:  1    2    3    4    5    6    7    8    9    10   11   12
Request:    1    2    3    4    1    2    5    1    2    3    4    5
Result:     miss miss miss miss miss miss miss hit  hit  miss miss miss
Page 1:     1    1    1    4    4    4    5    5    5    3    3    3
Page 2:          2    2    2    1    1    1    1    1    1    4    4
Page 3:               3    3    3    2    2    2    2    2    2    5

LFU is quite similar to LRU. The main difference is that, instead of removing the least recently accessed item, this approach stores the number of accesses to each item and removes the items that are least frequently used. The problem with this approach arises when a page is heavily used during an initial phase but then never used again: since it was heavily used, it has a large count and remains in memory even though it is no longer requested. LRU is therefore the most widely used strategy. In [57], the authors propose a new caching algorithm, MPU, tested in a Catch-up TV scenario. This approach leverages content demand knowledge to make cache replacement decisions based on "priority maps". These priority maps contain enough information to unequivocally identify Catch-up TV items and their expected number of requests at each point in the future. Thus, the MPU cache eviction policy favors items that have a greater expected priority, to the detriment of others with lower expected priorities. The results show that the MPU algorithm provides significantly better cache performance metrics, such as cache cost savings and hit ratio. Using the same reference string and assuming the priorities given by {[1 : 4], [2 : 4], [3 : 3], [4 : 1], [5 : 2]}, which map each item to its respective priority (higher priorities map to higher expectations that an item will be requested in the future), table 2.3 demonstrates the application of the MPU algorithm and shows that 4 cache hits are achieved, along with 8 page faults.

Table 2.3: MPU cache Replacement Policy

Iteration:  1    2    3    4    5    6    7    8    9    10   11   12
Request:    1    2    3    4    1    2    5    1    2    3    4    5
Result:     miss miss miss miss hit  hit  miss hit  hit  miss miss miss
Page 1:     1    1    1    1    1    1    1    1    1    1    1    1
Page 2:          2    2    2    2    2    2    2    2    2    2    2
Page 3:               3    4    4    4    5    5    5    3    4    5
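The three policies above can be reproduced with a short simulation over the same reference string and a 3-item cache. The sketch below matches the hit/miss counts of tables 2.1-2.3; for MPU, the victim is simply the cached item with the lowest priority, per the description above.

REQUESTS = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
PRIORITY = {1: 4, 2: 4, 3: 3, 4: 1, 5: 2}

def simulate(policy, size=3):
    cache, hits = [], 0          # list order encodes insertion/recency (oldest first)
    for item in REQUESTS:
        if item in cache:
            hits += 1
            if policy == "LRU":  # refresh recency on a hit
                cache.remove(item)
                cache.append(item)
            continue
        if len(cache) == size:   # miss with a full cache: choose a victim
            if policy == "MPU":
                victim = min(cache, key=PRIORITY.get)
            else:                # FIFO and LRU both evict cache[0]
                victim = cache[0]
            cache.remove(victim)
        cache.append(item)
    return hits

for policy in ("FIFO", "LRU", "MPU"):
    h = simulate(policy)
    print(policy, h, "hits,", len(REQUESTS) - h, "misses")
# FIFO 3 hits, 9 misses / LRU 2 hits, 10 misses / MPU 4 hits, 8 misses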

As the performance of caching algorithms is highly dependent on the request sequence, a good caching algorithm is one that knows which items need to be held in cache in order to maximize the cache hit ratio. The following subsection will present a survey of five proxy-caching solutions for embedded systems that are already integrated with OpenWrt (the OpenWrt system is explained in detail in chapter 5). This was the starting point of this dissertation, since this information is used as the basis for the implementation of the work in this MSc thesis.

2.6.2 Proxy-Caching Solutions

With the advances in mobile networks and the Internet itself, OTT video services are growing in popularity. As mentioned before, in OTT networks the video content is delivered over the Internet. The most common video delivery uses HTTP from conventional web servers, given the benefits of fast deployment and Network Address Translation (NAT) traversal [58]. In section 2.4.3.1, Adaptive Bitrate Streaming (ABS) was presented as a technique to improve HTTP-based video delivery by exploiting network dynamics, especially in wireless networks, since it is in these that network fluctuations and packet losses are larger compared to wired scenarios. Although this technique is helpful, it requires breaking the video into smaller segments, with a duration of a few seconds and with several quality levels, so that the client may select a quality level that suits its particular environment conditions. These characteristics have an immediate impact on the performance of caching algorithms, since the size of the cache remains unchanged although more space is required to store the content. Thus, mCDNs (see subsection 2.3.4) are being studied as a way to increase the available storage space in the network, reducing the backbone network latency. Therefore, proxy-caching solutions have grown in popularity. This concept allows a proxy to act as an intermediary between a user and a content provider, making it possible to cache the most important/frequent content, such as files, images and web pages, and to share those resources with more users. These solutions should perform well and require few processing resources, since they will have to be implemented on embedded systems. Popular proxy-caching solutions for embedded systems, such as OpenWrt, are Nginx, Squid, Polipo, Tinyproxy and Privoxy. Nginx [59] is a free, open-source and high-performance HTTP server with other functions as well, such as reverse proxying, cache capability and SPDY support, to name a few. It is known for its high performance, stability, simple configuration and low resource consumption. Squid [60] is frequently used in academic works. It has extensive access controls and makes a great server accelerator. It runs on most available operating systems, including Windows, and is licensed under the GNU General Public License (GPL). Some of its main characteristics are: forward proxy, transparent proxy, reverse proxy and cache capability, to name a few. Polipo [61] is a small and fast caching web proxy, although it does not offer a reverse proxy feature. Polipo is no longer maintained. Tinyproxy [62] is a light-weight HTTP/Hypertext Transfer Protocol Secure (HTTPS) proxy daemon for Portable Operating System Interface (POSIX) operating systems. It was designed to be fast and small, ideal for use cases where a full-featured HTTP proxy is required but the system resources for a larger proxy are unavailable. It supports the reverse proxy feature, but has no cache capability. Privoxy [63] is a non-caching web proxy with advanced filtering capabilities for enhancing privacy, modifying web page data and HTTP headers, controlling access, and removing advertising and other obnoxious Internet junk. Privoxy has a flexible configuration and can be customized to suit individual needs and tastes. It supports the reverse proxy feature. Table 2.4 provides a comparison of these five proxy-caching solutions, focusing only on the two most relevant supported features: reverse proxy and cache capability. The next chapter will return to this point, justifying this choice.

Table 2.4: Comparison of Five Proxy Cache Solutions - Main features

Supported Features   Nginx   Squid   Polipo   Tinyproxy   Privoxy
Reverse Proxy        Yes     Yes     No       Yes         Yes
Cache Capability     Yes     Yes     Yes      No          No

2.7 Chapter Considerations

This introductory chapter described several concepts that are important to better understand the work that has been developed, such as OTT multimedia networks, CDNs, multimedia streaming, prefetching and multimedia caching. First, the state of the art on OTT multimedia networks was presented, along with concepts widely used by content providers, such as time-shift TV and VoD. Then, the CDN concept and infrastructure were presented. It was a very detailed section, since it serves as the basis for the entire structure of OTT multimedia networks. Subsequently, the state of the art on multimedia streaming technologies was presented, with emphasis on adaptive segmented HTTP-based delivery, given its potential. Prefetching was another concept covered in detail, given its role in reducing user-perceived latency and its importance for the entire CDN delivery. Finally, the state of the art on multimedia caching was presented, providing an overview of popular caching algorithms and how they work. A survey of proxy-caching solutions for embedded systems was also presented, and this work was the starting point for the implementation of the work in this MSc thesis. These topics are the basis to understand the work developed; the following chapters contain several references to this chapter, due to the relevance of these subjects.

Chapter 3

Wireless Content Distribution

3.1 Introduction

After the description of the fundamental concepts addressed in this dissertation, this chapter presents the problem behind OTT multimedia networks. Then, a solution is proposed with the objective of improving the performance of the wireless infrastructure in the delivery of OTT multimedia content, namely the content delivery time, resulting in an improvement of the consumers' QoE. With this solution comes the description of the scenarios that will be evaluated in section 5.4. This chapter is organized as follows:

• Section 3.2 presents a brief description of the problem behind OTT multimedia networks.

• Section 3.3 presents the proposed solution with the testbed and scenarios.

• Section 3.4 presents the chapter summary and conclusions.

3.2 Problem Statement

Nowadays, all information is on the Internet, and the Internet is almost everywhere. In fact, in recent years, telecommunication operators have invested heavily in expanding and improving their infrastructures. As a consequence, more people are "online", sharing and accessing content simultaneously, making the Internet one of the most complex systems in operation. As previously mentioned (see section 2.3), CDNs arose to provide the end-to-end scalability and reliability necessary to support this growth, becoming a fundamental piece of the modern delivery infrastructure. They were developed to provide numerous benefits, such as a shared platform for multi-service content delivery, reduced transmission costs for cacheable content, improved QoE for end users and increased robustness of delivery, i.e. maximizing bandwidth. Although this is true in theory, in practice the QoE tends to degrade in proportion to the total number of clients to serve. Therefore, a CDN should consider the particularities of every content that will be cached, allowing a better overall performance which will ultimately benefit both the operators and the end users. However, content on the Internet is very dynamic, i.e. it has a short lifetime, which limits the potential benefits of caching.

Another major problem is the fact that multimedia content has increasingly higher quality and, to keep this content in memory, a significant amount of free space is necessary on the servers' disks. This reduces the number of items that a given server may hold and potentially limits the cache hit ratios. Thus, a cache miss will have a significant impact on the traffic volume of the network, because the content has a large size. This reduces the user QoE, since it will take more time to populate the replica server. Additionally, if the user does not require all the content, or skips portions of a video, many resources are wasted. In order to reduce the current and future problems in OTT multimedia networks, there is the need to implement a content distribution network close to the user. Several challenges arise when trying to bring content as close as possible to the consumers, such as the limited memory and processing power of the equipment that stores it. Since the number of wireless devices accessing mobile networks worldwide is expected to increase significantly in the coming years [32], wireless content delivery becomes an important piece of the multimedia delivery chain that must be improved, so that it does not become a bottleneck in the next-generation delivery of high-bandwidth OTT multimedia content. In the next section, a solution will be presented to address these issues.

3.3 Proposed Solution

Given the problems exposed previously and the potential QoE impact of the wireless delivery infrastructure on OTT multimedia services, this dissertation aims to deploy a decentralized and scalable wireless content distribution network that, based on a forecasting mechanism, tries to cache the content closer to the consumers, improving their QoE whenever it is able to cache the content that will be requested, and thus increasing the hit ratio, since the content will be delivered faster. Figure 3.1 shows the general picture of a wireless CDN. This wireless CDN is constituted by mobile consumers, such as mobile phones, cars, bicycles, drones and buses, to name a few, which access the network, via Wireless Fidelity (Wi-Fi) or cellular, searching for content, such as videos, music, news, or just for data exchange. The layer above, i.e. the traditional CDN, is the layer that supports the communication with the origin servers, which hold all the content desired by the mobile consumers. In order to achieve the proposed objectives, it is necessary to place caches near the consumers, namely in the APs themselves, and, for the management of these caches, to use proxy-caching solutions (see subsection 2.6.2). This whole approach is called mCDN (see subsection 2.3.4). Although this approach appears in the literature as a way to improve the performance of wireless networks, it needs to be extended, since the whole caching process must be coordinated: which content to cache and in which cache to store it. Therefore, it is necessary to implement a prefetching mechanism to use the available space in the best way, keeping in memory only the content that may be requested in the near future by some mobile consumer. Figure 3.2 presents the proposed architecture overview. This architecture has three main blocks: mobile consumers, proxy cache and origin server. Mobile consumers are responsible for generating network traffic by requesting chunks of a particular video from the proxy cache. In case the proxy cache has the requested content in

Figure 3.1: Typical Wireless CDN infrastructure.

memory, it immediately sends the content to the respective mobile consumer. In case the

proxy cache does not have the requested content, it will ask the origin server (which has all the content) and, before delivering it to the client, it will keep a copy of the content in memory. The proxy cache will always try to predict the content that connected mobile consumers will request, based on the content they are requesting, and sends this information to other neighboring APs, since the mobile consumers can move. These APs must work together, thus forming a wireless content distribution network. Whenever a request is made by a mobile consumer, the time it takes for the content to be delivered is measured, and this is the metric used to evaluate the consumers' QoE. The closer the content is to the consumer, the less time it takes to be delivered and, thus, the greater the QoE perceived by the consumer.

Figure 3.2: Proposed Architecture Overview.

Based on what was described, the work had two phases:

• First Phase: Proxy Caching Strategies for Embedded Systems - given the lack of proxy-caching solutions for embedded systems and of their analysis in the literature, this first phase aims to evaluate two proxy cache solutions, Nginx and Squid, for the management and control of scattered caches along the wireless network.

• Second Phase: Prefetching and Mobile Consumers - this second phase aims to test different prefetching algorithms, in order to predict the content that will be requested by the mobile consumers and thus cache it before it is requested, improving the consumers' request latency and consequently their QoE. Since the tests are aimed at the wireless layer, the mobility of consumers will be simulated.

The testbed and a more detailed description of these phases, used to validate the proposed solution, are presented in the following subsections. In the end, a demonstrator is built which allows testing the Nginx and Squid solutions for the management and control of scattered caches along the network, in mobility scenarios and with the prefetching mechanism.

3.3.1 Proxy Caching Strategies for Embedded Systems

As previously mentioned, this first phase has as its main objective the evaluation of different solutions (Nginx and Squid) for the management and control of scattered caches in a wireless network. These two strategies were chosen from among five popular ones running on OpenWrt: Nginx, Squid, Polipo, Tinyproxy and Privoxy (for more information about these solutions see subsection 2.6.2).

Table 3.1: Available Qualities and Average Size per Chunk

Available Qualities (Bitrate)   Average Chunk Size (Bytes)
230,000                         58,620
331,000                         82,529
477,000                         117,652
688,000                         169,255
991,000                         240,559
1,427,000                       347,243
2,056,000                       499,551
2,962,000                       718,157

These solutions were studied in order to understand which ones were able to implement a reverse proxy and to cache content, since these are the main characteristics required for the implementation of the proposed solution. Thus, table 2.4 was constructed, where it can be observed that only Nginx and Squid have both characteristics; therefore, these are the two solutions addressed. It would be possible, for example, to use Tinyproxy together with Squid and thus have both features, but the goal of this dissertation is to compare the solutions individually and under the same conditions. A reverse proxy provides an additional level of abstraction and control to ensure the smooth flow of network traffic between clients and servers. It does exactly the opposite of what a forward proxy does: while a forward proxy acts on behalf of clients (or requesting hosts), a reverse proxy acts on behalf of servers. Thus, a reverse proxy is a type of proxy server that typically sits behind the firewall in a private network and directs client requests to the appropriate backend server [64]. With the ability to cache, it is possible to store the content before serving it to the requester or end user, allowing those resources to be shared with more users and saving bandwidth. In order to test these two solutions for the management and control of scattered caches in a wireless network, the testbed illustrated in figure 3.3 was created. The origin server was placed in the same network as the Raspberry Pi, in order to guarantee the least possible impact of external factors (such as latency), thus making the comparison fairer. The hardware, operating system and main configurations used are described in Chapter 5. The idea is to simulate consumers watching random videos, using adaptive segmented HTTP-based delivery, namely Microsoft Smooth Streaming (see sub-subsection 2.4.3.1), and to evaluate the performance of each strategy considered. The video used is Elephants Dream, H.264 720p, available for testing in [65]. Table 3.1 shows the available qualities of the video and the respective average size of each chunk. In this scenario, the size of the cache is reduced to an acceptable level in order to quickly overload it, leading each caching strategy to perform more replacements. Some additional features are also simulated, such as the connection type of consumers (Wi-Fi or Mobile (3G/4G)), network congestion (reducing or increasing the quality of the requested "chunks" of the same video 1) and the popularity of the videos (giving different weights to the different videos), in order to make the tests more realistic. Three small variations of the test are carried out in order to improve the discussion of results.

1 Real traces from the Perceptual Evaluation of Video Quality (PEVQ) software used in the OPTICOM [66] company databases were used.

Figure 3.3: Scenario used to test different cache approaches.

The first aims to determine which of the solutions has the best behavior, given a specific number of successful requests to be made. The second aims to test each approach for a certain period of time and evaluate the performance over time. Finally, the last one aims to understand the behavior of each approach by varying the size of the cache, without forgetting that the cache size should always remain small. After testing these strategies for the management and control of the cache, a prefetching mechanism is needed in order to reduce the content delivery time to the consumers, improve their QoE and save upstream bandwidth whenever the requested content is in cache, resulting in a good hit ratio. Thereafter, the goal is to test in mobility scenarios. Thus, the next subsection will present this topic and the scenario deployed.

3.3.2 Prefetching and Mobile Consumers

The last step in the contribution to a solution that aims to reduce the current and future problems in OTT multimedia networks involves introducing a prefetching technique and testing it in mobility scenarios. Prefetching, as mentioned in section 2.5, aims to predict the future consumption of clients and populate the caches with this content before the consumer requests it.

Since adaptive segmented HTTP-based delivery is used, the prefetching technique involves trying to predict future chunks of a given video at a certain quality and populating the cache to reduce the misses (i.e., cases where the requested content is not in cache, resulting in a request to the origin server). Thus, two strategies were implemented:

• First Strategy: uses the chunk information that was requested and tries to use the same quality in the forecast of the next chunk, i.e. it assumes the same network conditions.

• Second Strategy: analyzes the behavior of each consumer and, based on the current quality and the last four requested, predicts which quality will be requested, using equation 3.1

X(t+1) = 0.5 × X(t) + 0.2 × X(t−1) + 0.15 × X(t−2) + 0.1 × X(t−3) + 0.05 × X(t−4) (3.1)

where X(t+1) is the predicted quality, X(t) is the current one and X(t−1)...X(t−4) are the last four qualities. It is considered that the current quality has an importance of 50% in the next forecast, and that the other four qualities lose importance successively, going down to 20%, 15%, 10% and finally only 5%. The sum of the weights is equal to 100%, so that the predicted quality lies between the minimum and the maximum of the qualities observed in the instants considered (X(t)...X(t−4)). A minimal sketch of this strategy is given below.
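The sketch assumes qualities renumbered as integer levels 1 (lowest) to 8 (highest), as done in chapter 4; rounding the weighted average to the nearest level is our assumption, since equation 3.1 yields a non-integer value.

WEIGHTS = [0.5, 0.2, 0.15, 0.1, 0.05]    # weights for X(t), X(t-1), ..., X(t-4)

def predict_quality(history):
    # history[0] is the current quality X(t); history[4] is X(t-4).
    assert len(history) == 5
    expected = sum(w * x for w, x in zip(WEIGHTS, history))
    return round(expected)               # snapping to a valid level is an assumption

print(predict_quality([6, 5, 5, 4, 4]))  # -> 5 (weighted average 5.35)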

Consumers always try to watch the video until the end before asking for another content; in other words, they are considered loyal. The main steps are:

1. Consumer asks for a chunk.

2. The proxy cache is responsible for serving the request to the consumer.

3. The next chunk of the video is calculated and its quality is predicted.

4. The proxy cache then caches the content of this prediction.

5. A form of communication between APs is implemented so that this information is sent to neighboring APs in order to cache this content in the neighborhood.

Thus, a good prefetching mechanism is one that can accurately predict the quality that will be requested by the consumer and, thereby, improve the consumer's QoE by reducing the request latency. Due to the characteristics of a wireless network, it is important to have a testbed with mobility features. Thus, the new testbed used is illustrated in figure 3.4. As can be observed, for this scenario three Raspberry Pis were used in order to simulate the displacement of consumers along a route. Since the consumers are simulated on the same machine/computer, all of them start at the same location and then all move to another. Firstly, the consumers join AP A, then move and join AP B, and finish at location/AP C. Understanding what content will be requested, and at which location, is a major challenge in wireless networks.

Figure 3.4: Scenario used to test prefetching and consumer mobility.

3.4 Chapter Considerations

This chapter presented some of the current challenges in delivering OTT multimedia content. In order to address these problems, a solution was proposed with the aim of improving the delivery of OTT contents in wireless networks. This solution is not only beneficial to consumers, who will benefit from a better QoE; operators also benefit since, on the one hand, their consumers will be more satisfied and, on the other hand, there is a better use of the network capabilities, i.e. bandwidth, requiring less investment in new infrastructures to accommodate more customers. The next chapter will describe how this solution is implemented.

Chapter 4

Implementation

4.1 Introduction

Since the problem and the strategies to solve it were identified at a high level in Chapter 3, there is the need to understand how the solution works from a practical perspective. Thus, this chapter will address this topic. First, the architecture will be presented, and then it will be explained how its blocks are implemented. This chapter is organized as follows:

• Section 4.2 presents the architecture and the implementation of its main blocks.

• Section 4.3 presents the chapter summary and considerations.

4.2 Architecture and Implementation Overview

The architecture has two main blocks: consumers, which simulate clients watching video streaming, and the nodes (APs), which implement the prefetching technique. These blocks and their main characteristics are illustrated in figure 4.1. The implementation of these blocks is developed using the Python 3 programming language. This choice is due to the fact that it is a relatively easy language to learn, runs everywhere, and is powerful, open, well documented and increasingly used (see reference [67] for more details about the language). PyCharm, which runs on any major operating system, is used as the Integrated Development Environment (IDE) [68]. The following subsections will present more details on the implementation of these main blocks, starting with the consumers.

4.2.1 Consumers

As can be seen in figure 4.1 (a), the consumers 1 are implemented with several features, such as different connections (Mobile (3G/4G) or Wi-Fi), mobility, an Identity (ID) and a random choice of video. A method called JobGenerator is implemented, whose objective is to generate "jobs" that will be requested by consumers. Each job is constituted by a particular video (one of ten available, each one with a different weight/popularity),

1 Each consumer is simulated in a different thread; e.g., 10 consumers are 10 different threads.

(a) Consumers (b) Nodes

Figure 4.1: Architecture main blocks.

a connection 2 (for each Wi-Fi connection created, the next one has necessarily to be Mobile, i.e. avoiding more Wi-Fi consumers than Mobile ones or vice versa) and an ID number (explained below). Each job is then placed in a job queue. Table 4.1 shows how the weights/popularities are divided among the available videos; a sketch of the job generator follows the table.

Table 4.1: Videos and their Probabilities to be Requested.

Video Number   Weight/Popularity (%)
1              77.52
2              10.74
3              4.99
4              2.77
5              1.70
6              1.05
7              0.63
8              0.36
9              0.18
10             0.06
Total          100
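A hedged sketch of the JobGenerator described above: each job pairs a video drawn with the weights of table 4.1, an alternating connection type and a job ID, and is placed in a shared queue. The exact field names are assumptions, not the actual implementation.

import itertools
import queue
import random

VIDEO_WEIGHTS = {1: 77.52, 2: 10.74, 3: 4.99, 4: 2.77, 5: 1.70,
                 6: 1.05, 7: 0.63, 8: 0.36, 9: 0.18, 10: 0.06}

def job_generator(job_queue, n_jobs):
    connections = itertools.cycle(["Wi-Fi", "Mobile"])  # keep both balanced
    for job_id in range(n_jobs):
        video = random.choices(list(VIDEO_WEIGHTS),
                               weights=VIDEO_WEIGHTS.values())[0]
        job_queue.put({"id": job_id, "video": video,
                       "connection": next(connections)})

jobs = queue.Queue()
job_generator(jobs, n_jobs=5)
while not jobs.empty():
    print(jobs.get())  # e.g. {'id': 0, 'video': 1, 'connection': 'Wi-Fi'}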

The flow chart presented in figure 4.2 exemplifies the life cycle of each mobile consumer. First, each mobile consumer connects to a specific AP which, in a real environment, would be the closest one. Since the consumers are simulated on the same machine/computer, all the connections and movements between APs are carried out simultaneously, although each consumer is treated individually. After the connection, each consumer "gets a job" [JobConsumer] from the queue. The ID number is used to create a profile (.txt file) of the user's consumption, recording the connection type, video and chunks seen, the delay and quality of each requested

2 Each simulated connection (Wi-Fi or Mobile) has, in turn, one of ten sets of real traces that come from the PEVQ software used by the OPTICOM [66] company databases.

Figure 4.2: Mobile Consumer Flow Chart.

chunk, and the elapsed time. Since the source IP is the same for all consumers, this ID number is added to the HTTP header (User-Agent) to distinguish clients, which helps when reading the cache log files. After the association with the AP and the acquisition of a job, a profile file for that consumer is created and the consumer is then "put to sleep" for a random time (between zero and one

minute) before it starts requesting [VideoRequestJob], in order to avoid unnecessary initial congestion. Thereafter, all the consumers move to another AP/position. This change has two issues. Firstly, before switching, it must be ensured that no consumer has a request in progress or is waiting for a response; secondly, the number of consumers and their current video positions must be known, in order to continue the jobs after changing to another AP. Thus, since each mobile consumer is implemented as a thread, it is necessary to use a semaphore to avoid executing the code that switches APs while there are still pending requests (a concurrency problem); a sketch of this guard is given below. When a consumer views the whole video, i.e. receives all the chunks of the video, it restarts the whole process of getting a job. The number of simultaneous consumers and the size of the job queue are configurable. When the time of the experiment is over, all the consumers are prevented from making requests and, when every consumer has received its last response, they disconnect. Figure 4.3 summarizes all the methods implemented in the mobile consumers.
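The concurrency guard mentioned above can be sketched as follows: consumer threads count their in-flight requests and the AP switch blocks until the counter drains. The thesis mentions a semaphore; the equivalent condition-variable formulation below is an illustrative choice, not the actual implementation, and do_request/connect_to_next_ap are hypothetical callables.

import threading

in_flight = 0                         # number of requests currently pending
cond = threading.Condition()

def request_chunk(do_request):
    # Wrap every consumer request so the AP switch can wait for it.
    global in_flight
    with cond:
        in_flight += 1
    try:
        do_request()                  # the actual HTTP request (hypothetical)
    finally:
        with cond:
            in_flight -= 1
            cond.notify_all()

def switch_ap(connect_to_next_ap):
    # Block until no request is pending, then move every consumer.
    with cond:
        cond.wait_for(lambda: in_flight == 0)
        connect_to_next_ap()          # hypothetical AP re-association step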

Figure 4.3: Methods Implemented in Mobile Consumers.

4.2.2 Nodes

The nodes/APs block (figure 4.1 (b)) also has several features, such as the cache manager (Nginx or Squid), prefetching and the neighbors manager. Figure 4.4 presents the block diagram and the interactions between these features. As can be seen, the bridge between the cache manager and the prefetching mechanism is made by a Python 3 module [watchDogLogs] that reads and parses the cache logs; predictions are then made [prefetching] in order to populate the cache, improving the user experience in the event of a cache hit, since the delivery latency will be lower. Thereafter, the same predictions are sent [clientSocket], in a UDP packet, to other neighboring nodes, which receive and process those requests [serverSocket] and populate their caches, since the mobile consumers can connect to them at any time. The following sub-subsections will present more details on the implementation of prefetching and of the communication between neighbors, but first it is important to understand how the node is initialized.

4.2.2.1 Node Initialization

The node initialization is presented in figure 4.5. Some initial tasks, such as the Raspberry Pi and web server configurations, are not represented, in order to facilitate reading (see section 5.3 for more details). The node begins by initializing a monitoring log file [watchDogLogs]: since each cache manager strategy has its own log files, there will be two different monitoring files, but only one is active at a time, depending on the type of test. The node also initializes a neighbors manager structure (client-server). The monitoring module runs in an infinite loop, waiting for specific consumer requests; consequently, it will not stop until the experiment time is over. If a request has a valid format (this will be explained in the following sub-subsection), it is used by the prefetching mechanism and then sent to the other closest nodes, using the neighbors manager structure.

Figure 4.4: One Node: Block Diagram and Interactions.

4.2.2.2 Prefetching

The prefetching mechanism is composed of two modules: watchDogLogs and prefetching. Whenever a log entry is written to the cache log file, it is parsed by the watchDogLogs module. This module aims to identify which content is requested, namely the video, its quality and the chunk, as well as which consumer asked for it. The log entry is considered valid when it displays the following format:

Figure 4.5: Initialization Flow Chart.

"GET http://{ip:port}/ElephantsDream{videoNumber}.ism/QualityLevels({qualityLevels})/Fragments(video={chunk}) HTTP/1.1" "Consumer {id}"

With this information, it is possible to estimate the next chunk of the video that will be requested by the consumer, although the quality of the video is always uncertain because, in a real scenario, it depends on the client's network conditions. Therefore, this is the great challenge of the prefetching algorithm: predicting the quality of the next chunk. As previously mentioned, two types of connections are simulated, Wi-Fi and Mobile (3G/4G), each with ten different sets of real traces (quality variations). Thus, in order to predict the quality of the next chunk, two different prefetching algorithms were developed:

• First Algorithm: this algorithm uses the current quality to predict the quality of the next chunk, if there is one. In other words, it assumes the same network conditions. Figure 4.6 shows its flow diagram. First, the consumer requests a chunk of video from the proxy cache, and the request is parsed by the watchDogLogs module. The next video chunk is then calculated and, if the current one is the last, no prefetching is done. If the next chunk exists, the same quality is used in the forecast. This quality will be cached and the information sent to neighboring nodes; this way they will also cache this request, since the consumer is in the vicinity and can connect to them at any time.

• Second Algorithm: this algorithm uses the current quality and the last four requested, per client, to estimate the quality of the next chunk, if it exists. The available qualities are presented in table 3.1; these are renumbered from 1 to 8, where quality 1 is the lowest, before being used in equation 3.1. Figure 4.7 shows the algorithm flow diagram. First, the consumer requests a chunk of video from the proxy cache, and the request is parsed by the watchDogLogs module. The next video chunk is then calculated and, if the current one is the last, no prefetching is done. If the next chunk exists, it is checked whether the consumer ID is already known. If it is not, it is added together with the requested quality to a clientsDataBaseInfo dictionary (i.e. clientID: [quality]). If it is, it is verified whether there are four previous qualities for this consumer, and the next quality is calculated with equation 3.1 only if they exist. The forecast is then made and sent to the neighbors. A combined parsing and prediction sketch is given below.

Figure 4.6: First Algorithm Flow Chart.
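A hedged sketch of the watchDogLogs/prefetching bridge for the second algorithm: a regular expression mirroring the log format shown earlier extracts the request fields, a per-client history keeps the last five quality levels, and equation 3.1 produces the forecast. The helper names (quality_index, a bitrate-to-level mapping, and send_to_neighbors) and the naive chunk+1 step are assumptions; in Smooth Streaming, fragments are actually addressed by start time.

import re
from collections import defaultdict, deque

LOG_RE = re.compile(
    r'GET http://[^/]+/ElephantsDream(?P<video>\d+)\.ism/'
    r'QualityLevels\((?P<quality>\d+)\)/Fragments\(video=(?P<chunk>\d+)\)'
    r' HTTP/1\.1.*Consumer (?P<id>\d+)')

WEIGHTS = [0.5, 0.2, 0.15, 0.1, 0.05]          # equation 3.1
clients_database_info = defaultdict(lambda: deque(maxlen=5))

def on_log_line(line, quality_index, send_to_neighbors):
    match = LOG_RE.search(line)
    if match is None:                          # e.g. a "prefetching Bot" entry
        return
    history = clients_database_info[match["id"]]
    history.appendleft(quality_index[int(match["quality"])])  # levels 1..8
    if len(history) == 5:                      # enough samples for equation 3.1
        level = round(sum(w * q for w, q in zip(WEIGHTS, history)))
        # The "+1" next-chunk step is a simplification (see note above).
        send_to_neighbors(int(match["video"]), int(match["chunk"]) + 1,
                          level, match["id"])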

The prefetching module is thus responsible for predicting the next quality and caching the content before it is requested. In order to distinguish consumer requests from prefetching ones, the HTTP header (User-Agent) is changed to "prefetching Bot". Thus, the log generated by the prefetching mechanism has the following form:

Figure 4.7: Second Algorithm Flow Chart.

"GET http://{ip:port}/ElephantsDream{videoNumber}.ism/QualityLevels({qualityLevels})/Fragments(video={chunk}) HTTP/1.1" "prefetching Bot"

which is considered invalid by the watchDogLogs module, so requests issued by the prefetching mechanism do not trigger new forecasts (avoiding prediction loops).

4.2.2.3 Neighbors Manager

The neighbors manager is composed of two modules: clientSocket and serverSocket. These modules are responsible for the communication between Raspberry Pis. In Python 3 it is possible to use four server classes: TCPServer, UDPServer, UnixStreamServer and UnixDatagramServer, but the last two are rarely used since, on the one hand, they are similar to the others and, on the other hand, they are not available on non-Unix platforms [69]. clientSocket and serverSocket are implemented using the UDPServer approach. The choice of UDP is due to the fact that it is a simple protocol, adequate for real-time data flows, where retransmissions increase network latency. Since the use of prefetching mechanisms already impacts the network traffic, avoiding retransmissions of forecasts that may be neither correct nor used is a way to reduce that traffic. The use of UDP is thus characterized by not guaranteeing reliability or ordering when sending packets. Since the server classes in Python process requests synchronously (i.e., each request must be completed before the next one is started) and all data is stored externally (in the file system, where access times for reading and writing are greater), it was necessary to create a separate thread to handle each request; synchronous handling is not suitable when each request takes a long time to complete, either because it returns a large amount of data, because the client is slow to process it, or because it requires a strong computation effort. Thus, one solution is to use the ThreadingMixIn mix-in class, which can be used to support asynchronous behavior. Creating a server [serverSocket] to handle many requests requires several steps. First, it is necessary to create a request handler that will process incoming requests. Second, it is necessary to instantiate one of the server classes (in this case ThreadingUDPServer), passing it the server's address (host IP and port) and the request handler class. Then, the serve_forever() method of the server object is called to process many requests. At the end, to close the socket, server_close() is called. The serverSocket flow chart is presented in figure 4.8. First, a UDP listening socket is created on port "8083" in each Raspberry Pi. Then, the first condition is tested and the thread continues its execution, waiting until it receives a new packet or the server_close() method is called. When a new packet is received, the thread reads the data sent by the neighbor's clientSocket and, if the data has a valid format (see figure 4.9), processes it. This processing consists of reading the fields of the received packet, creating an HTTP request with this information, attaching the "prefetching Bot" field to the User-Agent header (in order to distinguish requests from the prefetching mechanism from those of the clients), making the request and waiting for the response data. In other words, serverSocket receives an order from neighboring nodes to make a specific prefetching request, behaving as a client to the origin server; hence the need to use threads, so that it is not necessary to wait for all the data before attending other requests. If a new packet arrives, another thread processes it, regardless of whether another request is still being processed. After explaining the serverSocket module, which deals with the reception and processing of packets, it is necessary to explain the clientSocket module, which deals with the sending of information. Thus, the clientSocket flow chart is presented in figure 4.10.
First, the clientSocket module needs information to send. This information comes from the prefetching mechanism, previously presented and observed in figure 4.4. After having information to be sent to the neighboring nodes, a UDP packet is created with four fields, as presented in figure 4.9. Each field is composed of 4 bytes, storing an integer value which identifies the video, the desired chunk, the quality and the client ID, respectively. Thereafter, a UDP socket is created to send this information to the neighboring nodes (Raspberry Pis); it is possible to send a datagram to one server and immediately send another datagram with the same socket to a different server. Figure 4.11 shows an example of what happens when there is a need to send a prefetching request to neighboring nodes. In summary, the communication is carried out as follows (a sketch of both modules is given after the list):

Figure 4.8: ServerSocket Flow Chart.

1. There is a packet that needs to be sent: clientSocket creates the UDP packet (figure 4.9) and sends it to the neighboring APs, to their respective IPs and ports.

2. The packet is received by neighboring nodes: through the serverSocket.

3. If the prefetched content is not cached: it is requested directly to the origin server.

4. The origin server responds with the requested content.

Figure 4.9: Packet fields.

Figure 4.10: ClientSocket Flow Chart.
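Putting the two modules together, a minimal sketch of the neighbors manager: clientSocket packs the four 4-byte integer fields of figure 4.9 and sends them over UDP, while serverSocket is a ThreadingUDPServer on port 8083 that turns each packet back into an HTTP request tagged as "prefetching Bot". Network byte order, the placeholder addresses and the reuse of the URL template are assumptions consistent with the formats shown earlier, not the actual implementation.

import socket
import socketserver
import struct
import urllib.request

PACKET = struct.Struct("!4i")          # video, chunk, quality, client ID (fig. 4.9)

def send_forecast(neighbors, video, chunk, quality, client_id):
    payload = PACKET.pack(video, chunk, quality, client_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for host in neighbors:         # same socket, several destinations
            sock.sendto(payload, (host, 8083))

class ForecastHandler(socketserver.BaseRequestHandler):
    PROXY = "http://192.168.1.1:8080"  # placeholder local proxy/origin address

    def handle(self):
        data = self.request[0]         # for UDP: (datagram bytes, socket)
        if len(data) != PACKET.size:
            return                     # ignore malformed packets
        video, chunk, quality, _ = PACKET.unpack(data)
        url = (f"{self.PROXY}/ElephantsDream{video}.ism/"
               f"QualityLevels({quality})/Fragments(video={chunk})")
        req = urllib.request.Request(url,
                                     headers={"User-Agent": "prefetching Bot"})
        urllib.request.urlopen(req).read()  # fetching stores a copy in the cache

def serve():
    with socketserver.ThreadingUDPServer(("0.0.0.0", 8083), ForecastHandler) as srv:
        srv.serve_forever()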

4.3 Chapter Considerations

This chapter presented the architecture and implementation of the solution proposed in chapter 3. First, it presented the main architecture blocks: the consumers, which are simulated clients watching video streaming, and the nodes/APs, with the implemented prefetching mechanism. The biggest challenge of the implementation is undoubtedly trying to simulate all the conditions of a real environment. The next chapter presents how the integration between the hardware and the software is made, the main configurations needed for the testbed used, and the results obtained in the laboratory environment.

Figure 4.11: Example of how Prefetching is sent through the Neighbors Manager.

Chapter 5

Integration and Evaluation

5.1 Introduction

This chapter presents the integration of the proxy cache solutions with prefetching in a real testbed using Single Board Computers (SBCs) with the OpenWrt Operating System (OS). The experiments are performed using two Personal Computers (PCs), one to simulate consumers and the other to serve as origin server. The obtained results are presented and discussed. This chapter is organized as follows:

• Section 5.2 presents a brief description of the hardware and OS of the testbed.

• Section 5.3 presents the main configurations required.

• Section 5.4 presents the evaluation of each scenario and the discussion of results.

• Section 5.5 presents the chapter summary and conclusions.

5.2 Hardware and Operating System Description

This section describes the hardware of the consumers and origin server, the SBCs and the wireless adapter used to implement and test the solution proposed in Chapter 3, as well as the OS installed.

5.2.1 Consumers and Origin Server

Table 5.1 summarizes the main hardware characteristics of consumers and the origin server.

Table 5.1: Machines used to simulate consumers and the origin server

            Consumers        Origin Server
CPU         3 cores, 2 GHz   8 cores, 20 GHz
Memory      4 GB             8 GB
Location    Intranet of Instituto de Telecomunicações de Aveiro

5.2.2 Single Board Computer with Wireless USB Adapter

The Single Board Computers (SBCs) used in the whole evaluation phase are Raspberry Pi 2 Model B boards, illustrated in figure 5.1, developed in the United Kingdom by the Raspberry Pi Foundation to promote the teaching of basic computer science in schools and developing countries [70].

Figure 5.1: Raspberry Pi 2. [6]

The main features are:

• A 900MHz quad-core ARM Cortex-A7 CPU

• 1GB RAM

• 4 Universal Serial Bus (USB) ports

• 40 GPIO pins

• Full HDMI port

• Ethernet port

• Micro Secure Digital (SD) card slot

As can be seen, it does not have onboard Wi-Fi (only the Raspberry Pi 3 has) but supports a USB Wi-Fi dongle. Therefore, the TP-LINK TL-WN722N USB Wireless Adapter, illustrated in figure 5.2, was used; it reaches a maximum data rate of 150 Mbps.

Figure 5.2: TP-LINK TL-WN722N USB Wireless Adapter. [7]

5.2.3 Operating System - OpenWrt

The Operating System installed on the Raspberry Pi 2 is OpenWrt Wireless Freedom [71], an open-source GNU/Linux distribution for embedded devices, typically wireless routers. The development branch (trunk/master) used is Bleeding Edge, r49377. OpenWrt provides a fully writable file system with optional package management, giving developers the freedom to customize and configure the system to suit their own needs. OpenWrt is in constant evolution due to the many contributions from the OpenWrt community. In this class of devices, OpenWrt has long been established as the best firmware solution in terms of performance, stability, extensibility, robustness and design.

The four major components of the OpenWrt OS are: OpenWrt Buildroot, the bootloader, the Linux kernel and the userspace. OpenWrt Buildroot is a set of makefiles and patches that allows the creation of a cross-compilation toolchain and a root file system for embedded systems, together with the necessary libraries to compile the desired packages. The main features added to OpenWrt in this work are: web server/proxy (Nginx, Squid), the Python 3 language, LuCI (web graphical interface), the GNU Debugger (GDB), and the wireless driver for the TP-LINK TL-WN722N. The bootloader and the kernel are the basis of any OS. The bootloader is a piece of software that is executed every time the hardware device is powered up, and the Linux kernel is responsible for connecting application software to the hardware. Regarding the userspace, the user usually interacts with the OS via a terminal or, in some cases, via a graphical interface (e.g. the LuCI web page); both are illustrated in figure 5.3. OpenWrt can also run on other types of devices, such as smartphones, pocket computers (e.g. Ben NanoNote) and laptops.

5.3 Raspberry and Web Servers Configuration

This section presents the main configurations and the steps to install the different software components on the Raspberry Pis. Some configurations depend on the scenario and can be enabled or disabled. The most relevant and common settings are presented in this section.

5.3.1 Raspberry Configuration

OpenWrt Bleeding Edge r49377 is installed on all Raspberry Pis. Bleeding Edge belongs to the development branch (trunk/master), and to install OpenWrt on the Raspberry Pis it is necessary to build an image to be written to a micro SD card. Firstly, the OpenWrt Buildroot is installed on a laptop running a Linux OS. Then, Buildroot is configured to compile OpenWrt (and all selected packages) for the target system Broadcom BCM27xx (the Raspberry Pi's processor family). Then, the OpenWrt OS and the bootloader are compiled [72] and installed on one micro SD card, which is then cloned for each Raspberry Pi. After all this process, the basic network configurations are performed. The Ethernet port is configured to obtain an IP address automatically via a Dynamic Host Configuration Protocol (DHCP) server, to have access to the Internet and to allow communication between the Raspberry Pis. This communication is also used to extract relevant test data using Secure Shell (SSH). One wireless interface was also created to make the Raspberry Pi an AP.

Figure 5.3: User Interfaces. (a) Terminal; (b) LuCI web page.

5.3.2 Squid Configuration

The stable version of Squid installed on the Raspberry Pis is 3.5.12, although the most recent versions are 3.5.21 (stable) and 4.0.14 (trunk) [73]. After installation, it is necessary to configure Squid as a reverse proxy with cache capability in order to evaluate its performance (see subsection 5.4.4). In OpenWrt, all Squid configuration files are typically kept in the /etc/squid/ directory. The most important file to configure is squid.conf. A simple example of a Squid configuration, implemented to be able to evaluate the test scenarios (see Chapter 3), is written below. Comments are included to explain the purpose of each line.

# Ports configuration - example
http_port 192.168.234.1:13128 accel defaultsite=test_sites no-vhost
acl allcomputers src 192.168.234.0/24

# Network configuration - example
request_timeout 5 minutes
connect_timeout 5 minutes
quick_abort_min -1

# Cache configuration - example
cache_dir aufs /var/cache/squid 100 16 256

# Reverse Proxy main configuration - example
acl test_sites urlpath_regex /
cache_peer 193.136.93.153 parent 80 0 no-query originserver name=TestServer1
cache_peer_access TestServer1 allow test_sites
http_access allow test_sites
http_access allow allcomputers

# Turn on logs - example
debug_options ALL,1 28,9
request_header_access User-Agent allow all
logformat combined [%tl] "%rm %ru HTTP/%rv" %Hs "%<h" "%{User-Agent}>h" %Ss:%Sh
access_log stdio:/mnt/ext/log/squid/squid.logs combined
cache_log /mnt/ext/log/squid/squid1.logs
cache_store_log /mnt/ext/log/squid/squid2.logs
logfile_rotate 0
logfile_daemon /dev/null

The main points are:

• logformat combined: used to place a pattern in the cache logs. It was necessary to include the HTTP User-Agent field (see subsection 4.2.2.2).

• cache_dir: used to indicate where to cache content.

• cache_peer: used to indicate the origin server IP, for the reverse proxy.

For more information, the documentation is accessible at [74].

5.3.3 Nginx Configuration

Similarly to Squid, after installing Nginx, it is necessary to configure it as a reverse proxy with cache capability in order to evaluate its performance (see subsection 5.4.4). The stable version installed is 1.10.0, although the most recent versions are 1.10.1 (stable) and 1.11.4 (trunk/mainline) [75]. Typically, the nginx.conf file is located in /etc/nginx or /usr/local/etc/nginx, depending on the package maintainer and Linux distribution (OpenWrt uses the first path). The Nginx configuration file is easier to understand than Squid's. The most basic example of an Nginx configuration, implemented to be able to evaluate the test scenarios (see Chapter 3), is presented in figure 5.4 (a textual sketch is also given after the list below). Again, comments are provided to explain the purpose of each line. The main points are:

• log_format: used to place a pattern in the cache logs. It was necessary to include the HTTP User-Agent field (see subsection 4.2.2.2).

• proxy_cache_path: used to indicate where to cache content.

• proxy_pass: used to indicate the origin server IP, for the reverse proxy.

Figure 5.4: Nginx Configuration Example.
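Since the configuration in figure 5.4 is shown only as a figure, the following is a minimal sketch of an equivalent reverse-proxy-with-cache configuration (the paths, listening address, port and cache size are illustrative assumptions chosen to mirror the Squid example above, not the exact file used in the tests):

# Inside the http { } block of nginx.conf
# Cache configuration - where cached content is stored (path and 100 MB size are assumptions)
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=ott_cache:10m max_size=100m;

# Log pattern including the HTTP User-Agent field (see subsection 4.2.2.2)
log_format combined_ua '[$time_local] "$request" $status "$http_user_agent"';

server {
    listen 192.168.234.1:23128;                # address and port are assumptions
    access_log /mnt/ext/log/nginx/nginx.logs combined_ua;

    location / {
        proxy_pass http://193.136.93.153:80;   # origin server, mirroring the Squid example
        proxy_cache ott_cache;                 # enable caching for this location
    }
}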

For more information, the documentation is accessible at [59].

5.4 Evaluation

5.4.1 Performance Metrics

During a heavy simulation procedure it is important to monitor the performance of each SBC to avoid resource overloading. Thus, during all tests several parameters that allow validating the experiments are collected and evaluated. These are described in the following topics.

CPU Usage

Central Processing Unit (CPU) usage is a term used to describe how much the processor is working. A computer's CPU usage can vary depending on the type of tasks being performed by the processor, and is given as a percentage of CPU time over the total CPU capacity. When the CPU usage reaches 100%, there is no capacity left to run other programs. Thus, to avoid overloading each SBC, it is necessary to take this metric into consideration in order to verify whether the obtained results are valid or whether it is necessary to adapt the test to consume fewer resources. The CPU usage information of a system can be monitored in many ways. Since OpenWrt is a Linux environment, this information is available in the /proc/stat file. The CPU usage can be evaluated by subtracting the idle CPU time from the total CPU time and dividing the difference by the total CPU time, as shown in equation 5.1. The total and idle times are measured during a specific sampling period (the default value is 10 seconds).

CPU_{usage} = \frac{TOTAL_{time} - IDLE_{time}}{TOTAL_{time}} \times 100 \qquad (5.1)
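As a minimal sketch, equation 5.1 can be computed directly from /proc/stat (standard Linux layout; the default 10-second sampling period mentioned above is used):

import time

def read_cpu_times():
    # First line of /proc/stat: "cpu user nice system idle iowait irq softirq ..."
    with open("/proc/stat") as f:
        fields = [int(v) for v in f.readline().split()[1:]]
    return sum(fields), fields[3]  # total time, idle time (4th field)

def cpu_usage(period=10):
    # Equation 5.1 over one sampling period (default 10 seconds).
    total0, idle0 = read_cpu_times()
    time.sleep(period)
    total1, idle1 = read_cpu_times()
    total, idle = total1 - total0, idle1 - idle0
    return (total - idle) / total * 100

print("CPU usage: %.1f%%" % cpu_usage())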

Load Average

Another very useful metric in Linux is the load average. The load average measures the mean number of processes waiting to be served by the CPU. The instantaneous load is not very meaningful since it can vary quickly; thus, the load information usually refers to the last 1, 5 and 15 minutes [76]. Assuming a single-CPU system, the critical load point is 1. If the load average is 0.00, there are no processes waiting to be served by the CPU. If the load average is 1.00, the processor is exactly at full capacity. Above this value, the CPU is overloaded: if the load is 2.00, the CPU has twice as many processes as it can handle. In the case of multi-processor systems, the critical load point is given by the number of processor cores available. Since a Raspberry Pi 2 has four cores, the critical point here is 4. This metric is provided by the OS in the /proc/loadavg file, and can refer to a sample period of 1, 5 and 15 minutes. In the evaluated results, 1 minute is used.
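Reading this metric is straightforward; a minimal sketch:

# Read the 1-, 5- and 15-minute load averages from /proc/loadavg.
with open("/proc/loadavg") as f:
    load1, load5, load15 = (float(v) for v in f.read().split()[:3])

# On a quad-core Raspberry Pi 2 the critical load point is 4.
print(load1, load5, load15, "overloaded" if load1 > 4 else "ok")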

5.4.2 Proxy Cache Metrics

The two metrics that evaluate the performance of a proxy cache are the speed at which it processes a request and, at the cache level, how many hits and misses it has. In order to quickly understand the relationship between hits and misses, their ratios over the total number of requests are calculated. Thus, the hit ratio is given by equation 5.2 and the miss ratio by equation 5.3.

Ratio_{Hit} = \frac{Total_{Hit}}{Total_{Hit} + Total_{Miss}} \qquad (5.2)

Ratio_{Miss} = \frac{Total_{Miss}}{Total_{Hit} + Total_{Miss}} \qquad (5.3)
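As a minimal sketch, these ratios can be computed from the totals extracted from the proxy logs (the counts below are illustrative):

def cache_ratios(total_hit, total_miss):
    # Equations 5.2 and 5.3: hit and miss ratios over the total number of requests.
    total = total_hit + total_miss
    return total_hit / total, total_miss / total

hit_ratio, miss_ratio = cache_ratios(3500, 1500)  # illustrative counts
print("hit: %.1f%%  miss: %.1f%%" % (hit_ratio * 100, miss_ratio * 100))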

5.4.3 Support Scripting

In order to support the evaluation of the proposed scenarios, several scripts were developed. This set of scripts can be divided into three groups based on their purpose:

• Script for CPU usage and load average, used in all scenarios. This script measures the CPU consumption and the load average in order to verify whether the obtained results from the tests are valid, or whether it is necessary to adapt the test to consume fewer resources from the Raspberry Pis.

• Scripts to perform the test scenario, used to access and control the Raspberry Pis and to control the consumers' requests, in order to run specific tests and repeat them to obtain confidence intervals.

• Scripts to generate the final results, used to create data in Comma-Separated Values (CSV) format and then generate graphics using RStudio, an editor for the R language [77].

More detailed information about the scripts is presented below.

5.4.3.1 Script CPU Usage and Load

The flowchart in figure 5.5 represents the process of measuring the CPU consumption (/proc/stat) and the load average (/proc/loadavg), as well as the process of storing this information in the log files. The only input argument ($1) that needs to be passed is the measurement periodicity in seconds, i.e. the time between consecutive measurements.

Figure 5.5: CPU/Load Measure Flow Chart.
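The actual script is a shell script; the following is a condensed Python rendering of the same loop (the log file names are assumptions):

import sys
import time

def sample():
    # Raw CPU counters and load averages, as exposed by the Linux kernel.
    with open("/proc/stat") as f:
        cpu = f.readline().strip()
    with open("/proc/loadavg") as f:
        load = f.read().strip()
    return cpu, load

def main(period):
    # Every `period` seconds ($1 in the shell script), append one CPU sample
    # and one load sample to the log files, mirroring figure 5.5.
    while True:
        cpu, load = sample()
        with open("cpu.log", "a") as f:
            f.write(cpu + "\n")
        with open("load.log", "a") as f:
            f.write(load + "\n")
        time.sleep(period)

if __name__ == "__main__":
    main(float(sys.argv[1]))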

5.4.3.2 Scripts to perform the test scenario

In order to perform the evaluation phase of scenario 1 (caching strategies for embedded systems) and scenario 2 (prefetching and mobile consumers), two scripts, called MakeAll scripts, are developed, one for each scenario; they are responsible for managing the test scenario sequence and repetitions.

The flowchart illustrated in figure 5.6 represents the MakeAll script for scenario 1. Before running the script, it is possible to configure and choose the desired Raspberry Pi to run it, the number of iterations/rounds per experiment, the time per round or the maximum number of successful requests, for example, depending on the purpose of the experiment. First, the script begins by selecting one cache strategy (e.g. Squid) and its respective port (this port will be passed to the consumers script to make the requests to the corresponding cache technology). Then a folder with the name of the caching solution is created, and a new sub-folder is created to hold the results of each iteration. Then, a connection is established to the Raspberry Pi in order to restart the caching process. Before restarting, the old logs in the Raspberry Pi are deleted, if they exist, and then the script enters a waiting phase until the load average is close to zero, to guarantee the same starting conditions in all rounds. After that, the CPU Load/Usage script is launched in the Raspberry Pi and the consumers script (similar to figure 4.2, but without changing APs) is initialized to start the evaluation of the test scenario. The script then waits for the experiment time and, after this, the launched scripts are terminated and the logs in the Raspberry Pi are copied to the respective iteration number folder. If there are still other iterations to be made, the process is repeated from the creation of the iteration number folder. When all iterations are executed, the MakeAll script shows, per round, the total number of hits and misses, the average request time per quality and per technology, and the CPU load and usage, creating a log with this information. Then, the cache solution is changed (e.g. to Nginx) and the experiment is repeated again, starting from the creation of the solution folder. The experiment ends when the two solutions are evaluated.

The flowchart illustrated in figure 5.7 represents the MakeAll script for scenario 2. Before running the script, it is also possible to configure it with the number of iterations/rounds per experiment and the time per round. The difference to the script of scenario 1 is that here it is not possible to choose which Raspberry Pi runs the whole test, since there is a logical sequence (see figure 3.4), and it is necessary to specify whether the test will be run with Nginx or Squid (in order to be able to test the technologies separately). As in script 1, each of the proxy caching technologies is associated with a port that is later passed, in this case, to the MobileConsumers script (explained in figure 4.2), in order to make the requests to the corresponding proxy cache technology. It is then set whether the test will start with or without the prefetching mechanism, since this is the main difference between scenarios. After that, a folder is created with the name with/withoutPrefetching, depending on the type of test, and then a new sub-folder is created to hold the results of each iteration.

As in scenario 1, a connection is established to each of the Raspberry Pis used, in order to clear all the data from possible previous iterations, and then the script also goes into a waiting phase until the load average is close to zero, to ensure the same starting conditions for all the rounds. Thereafter, the ServerSocket and WatchDogLogs scripts are launched in each Raspberry Pi (for more information about each one see subsection 4.2.2), followed by the CPU Load/Usage script and then by the MobileConsumers script, in order to start the evaluation of the test scenario. The MakeAll script waits for the experiment time and then finishes the experiment by terminating all scripts called in the process. To finish the current iteration, the logs in each Raspberry Pi are copied to the respective iteration number folder. If there are still other iterations to be made, the process is repeated from the creation of the iteration number folder. When all iterations are executed, a log is also created, as in scenario 1, with general information about each round of experiments (with or without prefetching). Then, the test type is changed (i.e., withPrefetching) and the experiment is repeated again, from the creation of the folder with the name of the test type. The experiment ends when the proxy caching solution is evaluated with and without prefetching.

Figure 5.6: MakeAll Script Scenario 1 Flow Chart.

Figure 5.7: MakeAll Script Scenario 2 Flow Chart.

5.4.3.3 Scripts to generate final results

The last scripts created are the results analysis scripts. These are responsible for reading the final logs generated by each scenario and grouping the information into six .csv files (Comma-Separated Values (CSV)): cache performance, CPU load, CPU usage, request time vs. qualities and request time vs. technologies. This information is then loaded into RStudio, which analyses it, calculating averages and confidence intervals, and finally generates the graph plots.

5.4.4 Scenario 1: Caching strategies for embedded systems

As mentioned in subsection 3.3.1, three small variations of scenario 1 (figure 3.3) are tested, in order to enrich the discussion of results and obtain a better comparison between the two solutions considered: Nginx and Squid. Thus, the following sub-subsections present these three approaches: by number of requests, by time and by variation of the cache size. The origin server is placed in the same network as the Raspberry Pi (inside building 2 of Instituto de Telecomunicações de Aveiro), in order to guarantee the least possible impact from external factors (such as latency), thus making the comparison fairer. Both solutions use LRU as the default cache algorithm.

5.4.4.1 Approach by Number of Requests

The goal of this approach is to make a fixed number of successful requests and observe the behavior of each solution. As the tests are performed under approximately the same conditions, this approach allows seeing which of the solutions finishes first and what its performance is (in terms of hit rate).

Evaluation Procedure

To evaluate this approach, 5,000 successful requests are sent for each solution. The number of simulated consumers watching random videos is 20 and 20 rounds are made, in order to ensure good confidence in the results. A Raspberry Pi is used and configured with a cache size of 100 MB to quickly overload it, leading each strategy to its limit (performing more cache replacements).

(a) 5000 requests (b) Ratio 5000 requests

Figure 5.8: Cache Performance (number of requests).

Since the goal is to reach a specific number of requests, the duration of each experiment is variable.

Obtained Results

Figure 5.8 presents the performance of each cache solution, with a confidence interval of 95% (all the graphs presented share this characteristic, except the graphics that involve the CPU Load and Usage, which present confidence intervals of 90%, since they are more unstable and the results are more dispersed). In (a) it is possible to see the correspondence of these 5,000 requests in hits and misses, while in (b) it is possible to verify the corresponding hit/miss ratio to quickly conclude about the cache performance of each solution. The hit ratio is calculated by formula 5.2, and the miss ratio by formula 5.3. As can be seen, both solutions have a similar behavior in terms of cache performance, although Nginx, in this approach, seems to perform about 3% above Squid. The confidence intervals are, however, coincident, which makes a conclusive comparison impossible. Although both solutions use LRU as the default cache algorithm and the size of the cache is the same, each solution has its own peculiarities in storing content. It was observed that, during each heavy simulation procedure, a temporary growth of the cache by some MBs was usual in both solutions while the cache manager proceeded to remove old content from the cache. It is also important to note that, although the cache is small, one of the videos had a popularity of about 77.5% compared to the others, which causes many of the emulated clients to request this video more frequently (this explains the higher hit rate), reusing the small cache in the best way (thus saving bandwidth to the upstream server) and obtaining the content faster than the others. The consumption speed of a video is directly tied to obtaining all of its chunks; therefore, consumers who prefer more popular content will be able to make more requests. Figure 5.9 shows the average request time by quality for each solution. The highest video qualities present, as expected, higher request times when downloading the chunks. In this case, it is possible to observe that Squid presents shorter times in the delivery of the chunks demanded by the consumers, in all the available qualities. This may indicate that it could have processed more requests than Nginx in the same time (this topic will be covered below in sub-subsection 5.4.4.2).

Figure 5.9: Request Time vs Qualities (number of requests).

Figure 5.10: Request Time vs Technologies (number of requests).

The speed at which an algorithm interprets and processes the requests is the main factor influencing its performance. Figure 5.10 shows the average request time of all consumers using the two simulated technologies (Mobile/cellular and Wi-Fi). The fact that requests over Wi-Fi show a longer time than requests over Mobile comes from the fact that Wi-Fi consumers have been simulated to demand higher quality content than those using Mobile, due to the characteristics of the network itself. Looking at the previous figure (5.9), it is expected that Squid would again get better results. Figure 5.11 presents the CPU Load and Usage measurements. Regarding the CPU Load (a), the results show that the critical point, which in this case is 4 since each Raspberry Pi is a quad-core, was never exceeded. The CPU Usage (b) is usually under 35%, which is an acceptable value for this parameter. It is found that the Nginx solution uses more CPU than Squid, which may explain its better cache performance but slower responses. Squid is free and open-source, while Nginx appears more robust, with more features than Squid, and more professionally maintained, with a paid version supported by dedicated engineers. Since the goal of this approach is to reach a specific number of requests, Nginx takes about 5 more minutes than Squid to process, on average, the 5,000 requests. This is expected, since Squid has better times in delivering the chunks required by the consumers.

(a) CPU Load (b) CPU Usage

Figure 5.11: CPU Load and Usage (number of requests).

5.4.4.2 Approach by Time

The purpose of this approach is to analyze the behavior of each solution (Nginx and Squid) over 10 minutes of experiment. In other words, it will test the computational effort of each one for the management and control of the caches in a wireless network.

Evaluation Procedure

To evaluate this approach, the same conditions as the previous one are used: a Raspberry Pi with a cache size of 100 MB, 20 simulated consumers and 20 rounds, to ensure greater confidence in the results.

Obtained Results

Figure 5.12 presents the cache performance (hits and misses) of each solution. In (a) it is possible to see that Squid processes more requests than Nginx in the same time (approximately 1,500 more). However, in (b), where the hit and miss ratios are calculated, it is possible to verify that Nginx performs better (about 5% above Squid), as in the previous approach; this had already been verified in figure 5.8 (a) and (b). Figure 5.13 presents the average request time by quality for each solution. Again, it is possible to verify that Squid presents shorter times in the delivery of the chunks demanded by the consumers, in all the available qualities. This may mean that there is a trade-off between cache performance and request processing speed. Figure 5.14 presents the average request time of all consumers using the two simulated technologies (Mobile and Wi-Fi). As mentioned before, the fact that requests over Wi-Fi show a longer time than Mobile requests comes from the fact that Wi-Fi consumers have been simulated to demand higher quality content than those using Mobile, due to the characteristics of the network itself. Looking at figure 5.13, it is expected that Squid would again present better times. Figure 5.15 shows the CPU Load and Usage measurements of each solution during the test duration (10 minutes). All these parameters are within acceptable values, although Nginx presents larger values than Squid.

(a) 10 minutes (b) Ratio 10 minutes

Figure 5.12: Cache Performance (time).

Figure 5.13: Request Time vs Qualities (time).

Figure 5.14: Request Time vs Technologies (time).

(a) CPU Load (b) CPU Usage

Figure 5.15: CPU Load and Usage (time).

5.4.4.3 Approach by Cache Size

The objective of this approach is to verify the performance of each solution when the size of the cache is increased.

Evaluation Procedure

In order to evaluate this approach, 5,000 successful requests are sent for each solution. The number of simulated consumers watching random videos and the number of rounds are the same as in the previous approach: 20. The only difference is the size of the cache, which starts at 100 MB, is then increased to 200 MB and finally to 400 MB. Since the goal is to reach a specific number of requests, the duration of each experiment is variable.

Obtained Results

Figure 5.16 shows the performance of each cache solution. The results for 100 MB have been previously discussed in sub-subsection 5.4.4.1, where Nginx seemed to perform 3% above Squid. Increasing the cache to 200 MB, it was possible to observe that the only solution that improved its performance is Squid, which then presented a behavior similar to Nginx. Doubling the size of the cache again, it is found that the performance is maintained. This means that the cache size is not a factor that determines a significant improvement in the results. Sometimes keeping only the most important/popular content cached is enough to achieve good results. Figure 5.17 shows the average request time by quality for each solution. Again, the fact that the higher video qualities present higher request times is due to their heavier resolutions. It is not possible to conclude with certainty that the size of the cache is a factor that significantly reduces the request times, at least in this scenario, where the caches are of a reduced size and the origin is in the same network where the tests are performed. It can only be concluded that the maximum request time obtained by Nginx is lower for the quality 2,962,000 with a cache of 400 MB.

(a) 5,000 requests - 100 MB (b) Ratio 5,000 requests - 100 MB

(c) 5,000 requests - 200 MB (d) Ratio 5,000 requests - 200 MB

(e) 5,000 requests - 400 MB (f) Ratio 5,000 requests - 400 MB

Figure 5.16: Cache Performance (cache).

(a) 5,000 requests - 100 MB (b) 5,000 requests - 200 MB

(c) 5,000 requests - 400 MB

Figure 5.17: Request Time vs Qualities (cache).

(a) 5,000 requests - 100 MB (b) 5,000 requests - 200 MB

(c) 5,000 requests - 400 MB

Figure 5.18: Request Time vs Technologies (cache).

Figure 5.18 presents the average request time of all consumers using the two simulated technologies (Mobile and Wi-Fi). It is verified that the consumers simulated over Wi-Fi are those who bring a larger overhead to the tests performed, especially when Nginx is used as the cache solution, as expected, looking at the previous figure (5.17). As mentioned before, the fact that requests over Wi-Fi show a longer time than Mobile requests comes from the fact that Wi-Fi consumers have been simulated to demand higher quality content than those using Mobile, due to the characteristics of the network itself. Figure 5.19 shows the CPU Load and Usage measurements of each solution. It was possible to verify that, with the increase of the cache size, the processor usage of Squid became more intensive, even surpassing that of Nginx. However, the former managed to reach the target number of requests faster than the latter. The increase in processing is, in both cases, directly related to the increase in cache size: the cache manager has to manage a larger cache, which in practice leads to a loss of efficiency in finding the content that must be removed from the cache to make room for new content.

5.4.5 Scenario 2: Prefetching and Mobile consumers

As mentioned in subsection 3.3.2, due to the characteristics of wireless networks, it is necessary to have a scenario (see figure 3.4) that allows simulating mobility in the consumers. Thus, in this scenario, this will be one of the simulated characteristics.

(a) 5,000 requests - 100 MB (b) 5,000 requests - 100 MB

(c) 5,000 requests - 200 MB (d) 5,000 requests - 200 MB

(e) 5,000 requests - 400 MB (f) 5,000 requests - 400 MB

Figure 5.19: CPU Load and Usage (cache).

Then, a prefetching mechanism will be added to the proxy cache, in order to predict what the next request of the connected consumers will be and, thus, to put it in memory before it is truly requested, improving the QoE in case of a hit. In order to do that, two strategies were adopted. The first uses the information of the currently requested chunk and assumes the same quality in the forecast of the next chunk, while the second analyses the behavior of each consumer and, based on its current and last four requests, predicts which quality will be requested. The objective of this scenario is to compare the two prefetching algorithms developed for a mobility scenario and to verify their impact on consumer request times.
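A minimal sketch of the second strategy follows (the weight values and the bitrates in the example are illustrative assumptions; as noted in the results below, the weights may need readjustment; the first strategy simply reuses the quality of the current chunk):

def predict_next_quality(history, weights=(0.4, 0.25, 0.15, 0.1, 0.1)):
    # Second strategy: predict the quality of the next chunk from the current
    # and last four requested qualities. The most recent request weighs the
    # most; the weight values are illustrative assumptions.
    scores = {}
    for quality, weight in zip(reversed(history), weights):
        scores[quality] = scores.get(quality, 0.0) + weight
    return max(scores, key=scores.get)

# Example: a consumer oscillating between two bitrates (illustrative values).
history = [2056000, 2962000, 2962000, 2056000, 2962000]  # oldest to newest
print(predict_next_quality(history))  # -> 2962000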

Evaluation Procedure

To evaluate this scenario, three Raspberry Pis were used (see figure 3.4), configured with a cache size of 500 MB. The number of simultaneous consumers is 10, and 14 rounds are made, each one lasting 15 minutes (5 minutes in each Raspberry Pi). During the experiments with the implemented algorithms, it was verified that Squid aggregated the requests when writing to the log files. This may explain why Squid responds to the requests faster than Nginx (see subsection 5.4.4). Since the implemented prefetching mechanism needs to read the proxy cache logs as fast as possible to quickly cache content, the Squid results were much worse. Thus, in order to compare the algorithms under the same conditions, a time of 5 seconds was added between consumer requests in order to avoid aggregation.

Obtained Results

Figure 5.20 shows the performance of each cache solution, applying the two prediction algorithms developed. It is possible to verify that there are clear improvements in the hit ratio in both cases, although the first algorithm presents better results. This may mean that the weights assigned in the second algorithm should be readjusted. There is a clear increase in the number of requests served by Nginx with prefetching versus without prefetching (approximately 57%); however, Squid is able to serve more requests and even has a better performance in terms of hit ratio. Figure 5.21 shows the average request time by quality for each solution. It is possible to verify that, with prefetching, the response times to consumer requests are significantly reduced, thus improving the consumers' QoE and certainly helping attract more consumers to this type of OTT service. The difference between the algorithms is not very significant, showing that, even without always predicting the right quality, prefetching on average improves the results. It is possible to verify that the lower qualities sometimes present larger times; this is due to them not being frequently requested and, because of that, being less likely to be cached. These qualities are usually only used at the beginning of the streaming, while the client analyses the quality of the consumer's connection. Figure 5.22 shows the average request time of all consumers using the two simulated technologies (Mobile and Wi-Fi). The reason why Wi-Fi consumers present larger times than Mobile ones was explained in subsection 5.4.4. The important thing to note here is that both prefetching algorithms experienced smaller times compared to the case without prefetching and that, again, the difference between the two algorithms is not very significant. Figure 5.23 shows the CPU Load and Usage measurements for each solution, but only relative to the first Raspberry Pi (AP A in figure 3.4), since the behavior of the others is similar.

(a) 15 minutes - Nginx (b) Ratio 15 minutes - Nginx

(c) 15 minutes - Squid (d) Ratio 15 minutes - Squid

Figure 5.20: Cache Performance (prefetching).

(a) Nginx (b) Squid

Figure 5.21: Request Time vs Qualities (prefetching).

(a) Nginx (b) Squid

Figure 5.22: Request Time vs Technologies (prefetching).

The experiment time is 15 minutes in total, with the consumers spending 5 minutes on each of the three Raspberry Pis. As the graphics are relative to the first Raspberry Pi, it is possible to verify that the first 5 minutes present a heavier processing load than the remaining minutes of the experiment. It is verified that, without prefetching, soon after the consumers move to another AP, the CPU Load and Usage drop to very low levels. Regarding the cases with prefetching, although the consumers are no longer connected to the AP, they are in the neighborhood, so the AP still receives the same requests through the prefetching mechanism and caches the content. There is, however, a drop in load, because the AP only processes the requests and does not need to send them to neighboring APs.

5.5 Chapter Considerations

This chapter focused on explaining how the whole approach presented in Chapter 4 was integrated, and on evaluating the solution proposed in Chapter 3. With regard to the first phase (see subsection 3.3), Squid had a clear advantage in responding to consumer requests, but Nginx showed a slightly better cache performance (hit rate) at the expense of increased CPU Load and Usage. This increase became less evident with the increase in cache size. Regarding the second phase, both prefetching algorithms presented significant improvements, decreasing the times of the consumers' requests; however, between the algorithms the improvement is not significant. The first algorithm presents a better performance (hit ratio) than the second one, showing that the previous quality has more influence than the last set of qualities, since the network performance varies over time.

(a) CPU Load - Nginx (b) CPU Usage - Nginx

(c) CPU Load - Squid (d) CPU Usage - Squid

Figure 5.23: CPU Load and Usage (prefetching).

Chapter 6

Conclusion and Future Work

6.1 Conclusions

OTT multimedia services have grown in recent years and everything indicates that this trend will continue. In order to sustain this fast-paced growth, a wide range of challenges must be addressed, with respect to scalability, reliability and QoE concerns. This dissertation proposed to improve the delivery of OTT contents in wireless networks. This improvement means, on the one hand, providing a better quality of experience to the consumers and, on the other hand, reducing the bandwidth required to serve all the requests, relieving the origin servers of the content providers. In this work, two proxy cache technologies are studied: Nginx and Squid. Nginx showed a better cache performance (hit rate) than Squid, but the latter achieves a clear advantage in responding to consumer requests. Two prefetching algorithms are also developed in order to reduce the delivery times of consumer requests. A number of challenges appeared during the development of this work, namely delays in implementation and evaluation, but the proposed objectives were achieved with good results. A significant improvement in the response times to consumer requests using prefetching mechanisms has been demonstrated, which will lead to an improvement in the consumers' QoE. Investing in large infrastructures to store content is expensive and unattractive. Thus, the idea arises to take advantage of the nodes of mobile networks to store some content and thus increase the size available for caching. Major challenges are expected in the delivery of OTT multimedia content, due to the increasing number of users. This will inevitably lead to improvements in content prediction and in the cooperation among network nodes in the delivery of content.

6.2 Future Work

This dissertation presents some areas that can be developed and expanded in the future. Some of these topics can be summarized as follows:

• Real network testing and evaluation: although the tests presented promising results, most of the characteristics were simulated, such as the mobility of the consumers, the type of connection and network conditions. It would be interesting to be able to test in a real environment, where the origin server is not on the same network as the consumers.

• Test other types of technologies that allow for streaming adaptation: in this dissertation Microsoft Smooth Streaming was used, but there are many other adaptive streaming platforms that can be tested, such as Apple HTTP Live Streaming (HLS), Adobe HTTP Dynamic Streaming (HDS) and Moving Picture Experts Group (MPEG) Dynamic Adaptive Streaming over HTTP (DASH).

• Distributed cache solutions: in the performed tests, each AP had a maximum size that could be used for caching, with no sharing of content between neighboring APs. In other words, in case the content was not cached, it was requested directly from the origin server. One way to improve this would be to use solutions that allow the management of distributed caches in the network so that, when the content is not cached locally, it can be searched for in neighboring APs. One way to reduce the network overhead of the content search is to use Bloom filters, widely used for cache sharing [78] (see the minimal sketch after this list).

• Mobility prediction: in order to avoid overloading all the neighboring nodes in the network it would be interesting to predict where the consumer is moving and in this way send prefetching requests to only those nodes.

• Use "tcpdump" to capture packets: tcpdump is a common packet analyzer that is compatible with the OpenWrt OS. It would be useful to use this type of tool to understand what kind of traffic arrives at the AP and its contents, so that it can be used by the prefetching mechanism, instead of waiting for the cache management solution to write the request to the logs, which, as previously mentioned, is not always instantaneous.
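As a minimal sketch of the distributed-cache idea mentioned above, a Bloom filter lets an AP advertise its cached chunks compactly to its neighbors (the sizes, hash count and key format are illustrative assumptions):

import hashlib

class BloomFilter:
    # Minimal Bloom filter: a compact, probabilistic set with no false negatives.
    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, key):
        # Derive `hashes` bit positions from SHA-256 digests of the salted key.
        for i in range(self.hashes):
            digest = hashlib.sha256(("%d:%s" % (i, key)).encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits >> p & 1 for p in self._positions(key))

ads = BloomFilter()
ads.add("video3/chunk12/q2962000")
print(ads.might_contain("video3/chunk12/q2962000"))  # True
print(ads.might_contain("video9/chunk1/q688000"))    # False (with high probability)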

Bibliography

[1] 4cornermedia. Speed Up Your Site Using CDN. http://4cornermedia.com/speed-up-your-site-using-cdn/. [Online; accessed 26-July-2016].

[2] Mingfeiy. How does video streaming work? http://mingfeiy.com/traditional-streaming-video-streaming. [Online; accessed 4-October-2016].

[3] Mingfeiy. How does video streaming work? http://mingfeiy.com/progressive-download-video-streaming. [Online; accessed 4-October-2016].

[4] Bitmovin. MPEG-DASH in a Nutshell. https://bitmovin.com/mpeg-dash/. [Online; accessed 7-October-2016].

[5] Microsoft. Smooth Streaming. https://www.iis.net/downloads/microsoft/smooth-streaming. [Online; accessed 8-October-2016].

[6] Raspberry Pi Foundation. BCM2836 and Raspberry Pi 2. https://www.raspberrypi.org/blog/raspberry-pi-2-on-sale/. [Online; accessed 24-September-2016].

[7] TP-LINK. 150Mbps High Gain Wireless USB Adapter TL-WN722N. http://www.tp-link.com/en/products/details/cat-11_TL-WN722N.html. [Online; accessed 24-September-2016].

[8] Conviva. 2013 Conviva Viewer Experience Report. http://www.conviva.com/conviva-viewer-experience-report/vxr-2013/. [Online; accessed 7-December-2016].

[9] Hindawi. Caching Eliminates the Wireless Bottleneck in Video Aware Wireless Networks. https://www.hindawi.com/journals/aee/2014/261390/abs/. [Online; accessed 7-December-2016].

[10] William Cooper and Graham Lovelace. IPTV guide. 2006.

[11] AT&T. AT&T Entertainment. http://about.att.com/sites/entertainment. [Online; accessed 26-July-2016].

[12] Niels Bouten, Steven Latré, Wim Van de Meerssche, Bart De Vleeschauwer, Koen De Schepper, Werner Van Leekwijck, and Filip De Turck. A multicast-enabled delivery framework for QoE assurance of over-the-top services in multimedia access networks. Journal of Network and Systems Management, 21(4):677–706, 2013.

[13] Jorge Abreu, João Nogueira, Valdecir Becker, and Bernardo Cardoso. Survey of catch-up TV and other time-shift services: a comprehensive analysis and taxonomy of linear and nonlinear television. Telecommunication Systems, pages 1–18, 2016.

[14] Andrea Passarella. A survey on content-centric technologies for the current internet: CDN and P2P solutions. Computer Communications, 35(1):1–32, 2012.

[15] Athena Vakali and George Pallis. Content delivery networks: Status and trends. IEEE Internet Computing, 7(6):68–74, 2003.

[16] PB Mirchandani. The p-median problem and generalizations. discrete location theory (pb mirchandani and rl francis, eds.), Wiley, New York, 1990.

[17] George Pallis and Athena Vakali. Insight and perspectives for content delivery networks. Communications of the ACM, 49(1):101–106, 2006.

[18] Vinod Valloppillil and Keith W Ross. Cache array routing protocol v1. https://tools.ietf.org/html/draft-vinod-carp-v1-03, 1998. [Online; accessed 15-December-2016].

[19] Duane Wessels and Paul Vixie. Hyper text caching protocol (HTCP/0.0). https://tools.ietf.org/html/rfc2756, 2000. [Online; accessed 15-December-2016].

[20] Li Fan, Pei Cao, Jussara Almeida, and Andrei Z Broder. Summary cache: a scalable wide-area web cache sharing protocol. IEEE/ACM Transactions on Networking (TON), 8(3):281–293, 2000.

[21] Chung-Min Chen, Yibei Ling, Marcus Pang, Wai Chen, Shengwei Cai, Yoshihisa Suwa, and Onur Altintas. Scalable request routing with next-neighbor load sharing in multi-server environments. In 19th International Conference on Advanced Information Networking and Applications (AINA'05) Volume 1 (AINA papers), volume 1, pages 441–446. IEEE, 2005.

[22] Al-Mukaddim Khan Pathan and Rajkumar Buyya. A taxonomy and survey of content delivery networks. Grid Computing and Distributed Systems Laboratory, University of Melbourne, Technical Report, pages 1–44, 2007.

[23] Akamai: Web Performance; Media Delivery; Cloud Security; Cloud Networking; Network Operator. Choosing the right CDN. http://www.akamai.com. [Online; accessed 26-July-2016].

[24] Cloudflare. Give us five minutes and we'll supercharge your website. http://www.cloudflare.com. [Online; accessed 26-July-2016].

[25] Dongbo Huang, Jin Zhao, and Xin Wang. Agiler: A p2p live streaming system with low playback lag. In Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), 2010 6th International Conference on, pages 1–10. IEEE, 2010.

[26] Ben Niven-Jenkins, F Le Faucheur, and Nabil Bitar. Content distribution network interconnection (CDNI) problem statement. Technical report, 2012.

[27] Kent Leung and Yiu Lee. Content distribution network interconnection (cdni) require- ments. Technical report, 2014.

[28] P Eardley and K Ma. Use cases for content delivery network interconnection draft-ietf-cdni-use-cases-00. 2011.

[29] K Ma and G Watson. Use cases for content delivery network interconnection draft-ietf-cdni-use-cases-10. 2012.

[30] Moisés Rodrigues, Stenio Fernandes, Judith Kelner, and Djamel Sadok. An analytical view of multiple CDNs collaboration. In 2014 IEEE 28th International Conference on Advanced Information Networking and Applications, pages 25–32. IEEE, 2014.

[31] Content Delivery Networks Interconnection Work Group. https://datatracker.ietf.org/wg/cdni/documents/. [Online; accessed 6-August-2016].

[32] Cisco VNI. Cisco visual networking index: Global mobile data traffic forecast update, 2015–2020. http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/mobile-white-paper-c11-520862.html, 2016. [Online; accessed 8-August-2016].

[33] Dragan Boscovic, Faramak Vakil, Staniša Dautović, and Milenko Tošić. Pervasive wireless CDN for greening video streaming to mobile devices. In MIPRO, 2011 Proceedings of the 34th International Convention, pages 629–636. IEEE, 2011.

[34] Arun Kumar BR, Lokanatha C Reddy, and Prakash S Hiremath. Rtsp audio and video streaming for qos in wireless mobile devices. IJCSNS, 8(1):96, 2008.

[35] Dapeng Wu, Yiwei Thomas Hou, Wenwu Zhu, Ya-Qin Zhang, and Jon M Peha. Streaming video over the internet: approaches and directions. IEEE Transactions on Circuits and Systems for Video Technology, 11(3):282–300, 2001.

[36] Ali Begen, Tankut Akgul, and Mark Baugher. Watching video over the web: Part 1: Streaming protocols. IEEE Internet Computing, 15(2):54–63, 2011.

[37] Henning Schulzrinne. Real time streaming protocol (rtsp). RFC2326, 1998.

[38] Alex Zambelli. Iis smooth streaming technical overview. Microsoft Corporation, 3:40, 2009.

[39] Thomas Stockhammer. Dynamic adaptive streaming over HTTP–: standards and design principles. In Proceedings of the second annual ACM conference on Multimedia systems, pages 133–144. ACM, 2011.

[40] Konstantin Miller, Emanuele Quacchio, Gianluca Gennari, and Adam Wolisz. Adaptation algorithm for adaptive streaming over HTTP. In 2012 19th International Packet Video Workshop (PV), pages 173–178. IEEE, 2012.

[41] Microsoft. Smooth Streaming Transport Protocol. https://www.iis.net/learn/media/smooth-streaming/smooth-streaming-transport-protocol. [Online; accessed 8-October-2016].

[42] Neelam Duhan et al. Survey of recent web prefetching techniques. International Journal of Research in Computer and Communication Technology, pages 1465–1469, 2013.

[43] Sandhaya Gawade and Hitesh Gupta. Review of algorithms for web pre-fetching and caching. Int. J. Adv. Res. Comput. Commun. Eng, 1(2):62–65, 2012.

[44] Bin Wu and Ajay D Kshemkalyani. Objective-optimal algorithms for long-term web prefetching. IEEE Transactions on Computers, 55(1):2–17, 2006.

[45] Steven P Vanderwiel and David J Lilja. Data prefetch mechanisms. ACM Computing Surveys (CSUR), 32(2):174–199, 2000.

[46] E Markatos and C Chironaki. A top ten approach for prefetching the web. In Proceedings of the INET98 Internet Global Summit, 1998.

[47] Bin Wu and Ajay D Kshemkalyani. Objective-greedy algorithms for long-term web prefetching. In Network Computing and Applications, 2004.(NCA 2004). Proceedings. Third IEEE International Symposium on, pages 61–68. IEEE, 2004.

[48] Aruna Jain. Optimizing web server performance using data mining techniques. Editorial Advisory Board e, 17(2):222–231, 2005.

[49] Yingyin Jiang, Min-You Wu, and Wei Shu. Web prefetching: Costs, benefits and performance. In Proceedings of the 7th international workshop on web content caching and distribution (WCW2002). Boulder, Colorado. Citeseer, 2002.

[50] Arun Venkataramani, Praveen Yalagandula, Ravindranath Kokku, Sadia Sharif, and Mike Dahlin. The potential costs and benefits of long-term prefetching for content distribution. Computer Communications, 25(4):367–375, 2002.

[51] Jens A Andersson, Manxing Du, Huimin Zhang, Maria Kihl, Stefan Höst, Christina Lagerstedt, et al. User profiling for pre-fetching or caching in a catch-up TV network. In Broadband Multimedia Systems and Broadcasting (BMSB), 2016 IEEE International Symposium on, pages 1–4. IEEE, 2016.

[52] João Nogueira, Lucas Guardalben, Bernardo Cardoso, and Susana Sargento. Catch-up TV analytics: statistical characterization and consumption patterns identification on a production service. Multimedia Systems, pages 1–19, 2016.

[53] Dario Bonino, Fulvio Corno, and Giovanni Squillero. A real-time evolutionary algorithm for web prediction. In Web Intelligence, 2003. WI 2003. Proceedings. IEEE/WIC International Conference on, pages 139–145. IEEE, 2003.

[54] Henrik Abrahamsson and Mats Björkman. Caching for IPTV distribution with time-shift. In Computing, Networking and Communications (ICNC), 2013 International Conference on, pages 916–921. IEEE, 2013.

[55] Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne. Operating System Concepts. 8th edition, John Wiley & Sons, 2009.

[56] Laszlo A Belady, Robert A Nelson, and Gerald S Shedler. An anomaly in space-time characteristics of certain programs running in a paging machine. Communications of the ACM, 12(6):349–353, 1969.

[57] Joao Nogueira, Daniel Gonzalez, Lucas Guardalben, and Susana Sargento. Over-the-top catch-up tv content-aware caching. In Computers and Communication (ISCC), 2016 IEEE Symposium on, pages 1012–1017. IEEE, 2016.

[58] Kai Dong, Jun He, and Wei Song. QoE-aware adaptive bitrate video streaming over mobile networks with caching proxy. In Computing, Networking and Communications (ICNC), 2015 International Conference on, pages 737–741. IEEE, 2015.

[59] Nginx. Welcome to NGINX Wiki's documentation! https://www.nginx.com/resources/wiki/. [Online; accessed 8-November-2016].

[60] Squid. Squid: Optimising Web Delivery. http://www.squid-cache.org/. [Online; accessed 8-November-2016].

[61] Polipo. Polipo is no longer maintained. https://www.irif.fr/~jch/software/polipo/. [Online; accessed 8-November-2016].

[62] Tinyproxy. lightweight http(s) proxy daemon. https://tinyproxy.github.io/. [Online; accessed 8-November-2016].

[63] OpenWrt. Privoxy. https://wiki.openwrt.org/doc/howto/proxy.privoxy. [Online; accessed 8-November-2016].

[64] Nginx. What is a Reverse Proxy Server? https://www.nginx.com/resources/glossary/reverse-proxy-server/. [Online; accessed 8-December-2016].

[65] Microsoft. IIS Smooth Streaming HD Sample Content. https://www.microsoft.com/en-us/download/details.aspx?id=18199. [Online; accessed 26-November-2016].

[66] OPTICOM GmbH, Germany. The Source for Voice and Video Quality Testing. http://www.pevq.com/. [Online; accessed 16-November-2016].

[67] Python. Python is powerful... and fast; plays well with others; runs everywhere; is friendly & easy to learn; is Open. https://www.python.org/about/. [Online; accessed 21-October-2016].

[68] PyCharm. Python IDE for Professional Developers. https://www.jetbrains.com/pycharm/. [Online; accessed 21-October-2016].

[69] Python. socketserver - A framework for network servers. https://docs.python.org/3.5/library/socketserver.html#socketserver-udpserver-example. [Online; accessed 13-November-2016].

[70] Raspberry Pi. Raspberry pi. Raspberry Pi, 1:12, 2012.

[71] OpenWrt. OpenWrt Wireless Freedom. https://openwrt.org/. [Online; accessed 24-September-2016].

[72] OpenWrt. OpenWrt Wireless Freedom. https://wiki.openwrt.org/doc/howto/buildroot.exigence. [Online; accessed 25-September-2016].

[73] squid-cache.org. Squid Versions. http://www.squid-cache.org/Versions/. [Online; accessed 29-September-2016].

[74] squid-cache.org. Squid configuration directives. http://www.squid-cache.org/Doc/config/. [Online; accessed 28-September-2016].

[75] nginx.org. nginx: download. https://nginx.org/en/download.html. [Online; accessed 29-September-2016].

[76] How-To Geek. Understanding the Load Average on Linux and Other Unix-like Systems. http://www.howtogeek.com/194642/understanding-the-load-average-on-linux-and-other-unix-like-systems/. [Online; accessed 28-September-2016].

[77] RStudio. Take control of your R code. https://www.rstudio.com/products/rstudio/. [Online; accessed 29-September-2016].

[78] Andrei Broder and Michael Mitzenmacher. Network applications of bloom filters: A survey. Internet Mathematics, 1(4):485–509, 2004.
