DEGREE PROJECT IN TECHNOLOGY, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2020

On the impact and applicability of network edge computing to reduce network latencies of worldwide client applications

Stephan Horsthemke

KTH ROYAL INSTITUTE OF TECHNOLOGY
ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Author

Stephan Horsthemke Information and Communication Technology KTH Royal Institute of Technology

Host Company

Spotify AB Regeringsgatan 19 111 53 Stockholm, Sweden

Examiner

Vladimir Vlassov
KTH Royal Institute of Technology
Kistagången 16

Supervisors

Axel Liljencrantz
Spotify AB

Sina Sheikholeslami KTH Royal Institute of Technology

Abstract

This project evaluates the applicability of network edge computing to reduce the global latencies of client applications. It determines the extent of latency reduction that network edge computing can provide compared to common cloud architectures. Furthermore, this project examines whether Compute@Edge, an exemplary and modern edge computing service, enables the replacement of many latency-sensitive cloud systems through adequate versatility and a reasonable cost-benefit ratio.

Compute@Edge is a new, serverless edge computing platform by Fastly built on WebAssembly. A prototype that replicates a globally utilized server of Spotify was implemented on Compute@Edge. To compare the latencies of cloud and edge computing, an experiment captured the latencies of the prototype and the original system using a Spotify client that generated almost 26 million data points from all over the world. In addition to the experiment, the implementation of the prototype allows accurate insights into the possibilities of Compute@Edge and into whether WebAssembly is a promising approach for edge computing.

Successes of this work include data showing that network edge computing can reduce latencies significantly. It offers arguments to ramp up the usage of edge computing, WebAssembly and Compute@Edge for applications that require low latencies. The results of the experiment show that network edge computing is capable of reducing network latency by at least 38% compared to cloud computing. The lower latencies, combined with the versatility and feasibility of Compute@Edge, show that modern edge platforms enable much broader utilization for applications like Spotify.

Keywords

Edge Computing, WebAssembly, WebAssembly System Interface, Latency, Response Time

Sammanfattning

This project evaluates the applicability of network edge computing to reduce the global latencies of client applications. It determines the extent of latency reduction that network edge computing can provide compared to common cloud computing architectures. The project also examines whether Compute@Edge, an exemplary and modern edge computing service, enables the replacement of many latency-sensitive cloud systems with adequate versatility and a reasonable cost-benefit ratio.

Compute@Edge is a new, serverless edge computing platform by Fastly built on WebAssembly. A prototype replicating a globally utilized Spotify server was implemented on Compute@Edge. To compare the latencies of cloud and edge computing, an experiment captured the latencies of the prototype and the original system using a Spotify client that generated around 26 million data points from all over the world. In addition to the experiment, the prototype implementation gives accurate insights into the possibilities of Compute@Edge and into whether WebAssembly is a promising approach for edge computing. The contributions of this work include data showing that network edge computing can reduce latencies significantly, and arguments for increasing the usage of edge computing, WebAssembly and Compute@Edge for applications that require low latencies.

The results of the experiment show that network edge computing can reduce network latency by at least 38% compared to cloud computing. The lower latencies, combined with the versatility and feasibility of Compute@Edge, show that modern edge platforms enable much broader utilization for applications like Spotify.

Acknowledgements

I want to thank the numerous people who helped me through the sometimes very challenging months of writing this thesis! Thanks to Sina Sheikholeslami and Axel Liljencrantz for their patience and for sometimes asking “how are you” before “what did you achieve”!

A big thanks goes to the many Spotify squads who helped with parts of the project, and to the Fastly team!

Finally, I would like to extend my gratitude to friends and family who helped me deal with the tough phases throughout these months.

Acronyms

CDN Content Delivery Network
IoT Internet of Things
ISP Internet Service Provider
VCL Varnish Configuration Language
WASI WebAssembly System Interface
API Application Programming Interface
PoP Point of Presence
IaaS Infrastructure as a Service
CLI Command Line Interface
HTTP Hypertext Transfer Protocol
NIST National Institute of Standards and Technology
UX User Experience

Glossary

Apache Cassandra Apache Cassandra is an eventually consistent database [3].

ByteCodeAlliance The ByteCodeAlliance is a group of organizations (Red Hat, Mozilla, Intel and Fastly, among others) and projects dedicated to using WebAssembly and WASI for a more secure and efficient web [6].

CDN Content Delivery Networks (CDNs) strive to have their Points of Presence (PoPs) as close to users as possible to ensure low latencies [1]. Technically, CDNs serve replicated content at well-connected network edges [28]. Companies offering CDNs, like Akamai and Fastly, are often also called CDNs, although many provide far more services than just a CDN. Fastly therefore strives to be called an edge cloud platform.

Cloud BigTable Cloud BigTable is an eventually consistent database offered by Google Cloud [27].

cloud data center A centralized data center deep in the cloud, whose location is optimized for cost efficiency.

sandbox A safety mechanism to separate multiple applications from each other and from their hosting environment. For WebAssembly specifically, it means that code can only communicate out of its context by going through appropriate Application Programming Interfaces (APIs) [33].

Contents

1 Introduction
  1.1 Research Questions
  1.2 Thesis Contributions
  1.3 Goals
  1.4 Benefits, Ethics and Sustainability
  1.5 Outline

2 Theoretical Background
  2.1 Edge Computing
    2.1.1 Definition
    2.1.2 Comparison to Cloud computing
  2.2 WebAssembly
    2.2.1 Speed
    2.2.2 Safety
    2.2.3 Portability
  2.3 WebAssembly System Interface (WASI)
  2.4 Serverless Computing
  2.5 Response Time
    2.5.1 Latency
    2.5.2 Retention
    2.5.3 Response time limits

3 Technical Background
  3.1 Fastly
    3.1.1 Lucet
    3.1.2 Compute@Edge
  3.2 Spotify
    3.2.1 Web API
    3.2.2 Metadata
    3.2.3 Web-player
    3.2.4 Response time

4 Design and Implementation
  4.1 Prototype
    4.1.1 Method
    4.1.2 Implementation
  4.2 Experiment
    4.2.1 Passive Execution
    4.2.2 Fake Requests
    4.2.3 Metrics
    4.2.4 Feature toggling
  4.3 Visualization
    4.3.1 Monitoring
    4.3.2 BigQuery

5 Results
  5.1 Latency
    5.1.1 Compute@Edge
    5.1.2 Spotify Web API
    5.1.3 Comparison
  5.2 Versatility
    5.2.1 Computation
    5.2.2 Storage
  5.3 Feasibility
    5.3.1 Implementation
    5.3.2 Maintenance
    5.3.3 Risks

6 Conclusions
  6.1 Future Work
    6.1.1 Mobile Data
    6.1.2 Cloudflare workers
    6.1.3 Proxy cloud systems with C@E
    6.1.4 Critical Point Analysis

References

Chapter 1

Introduction

Latency is a neglected trait of global cloud applications, which edge computing attempts to improve. But how large an improvement is edge computing capable of, and are modern network edge solutions powerful enough to merit using them in addition to the cloud?

With tech companies striving to expand their markets globally and users demanding ever-increasing performance, developers require solutions to overcome the disadvantages of cloud computing. The proliferation of cloud computing extended the capabilities of smartphones significantly, as devices no longer need to do heavy computations themselves. However, services with latency-sensitive content, like streaming data and user interactions, face the boundary of distance to cloud data centers, which are usually not optimized for low latencies. Often the transmission to data centers is a substantial reason for high latencies and, ultimately, poor performance.

Therefore, latency-sensitive requests should be handled closer to the clients, at well-connected locations at the edge of the network. Computation at the edge, at well-connected PoPs, is what some Content Delivery Networks (CDNs), or rather edge platforms, offer nowadays. Edge data centers optimize latencies, and through the distribution and location of their data centers, edge providers offer lower worldwide latencies compared to cloud computing infrastructures.

It might be trivial that shorter distances lead to lower latencies, but it is often unclear precisely how much faster edge computing is. Other studies evaluating latencies of edge platforms usually lack an adequate number of data points. Furthermore, many studies are presumably imprecise, as test requests are often executed in close proximity to network hubs, which significantly distorts the results [1].

A quantitative study that executes test requests from real-world users of a worldwide application to an edge platform has not been found, but such a study is expected to contain essential insights into the extent of improvement edge computing provides. This project determines the extent of latency reduction network edge computing can provide compared to common cloud computing architectures. Based on the outcome, developers can decide for or against using edge computation to speed up their systems by weighing the benefits against the higher costs.

However, beyond the actual latency improvement, it is crucial to determine whether developers can implement systems at the edge at all, and whether doing so is feasible enough for the benefits to outweigh the costs. Therefore, this project also examines the versatility and feasibility of an exemplary and modern edge computing service.

The usability of the edge currently has limitations that edge computation technologies have to overcome. Current solutions mostly come with huge development efforts and an environment that limits possible use cases significantly. Some solutions are not flexible enough to implement the necessary logic for complex edge systems; other solutions are complex themselves, as they require developers to learn another language. Many have issues like long cold-start times, which again degrade the overall latency. The latency improvements of edge solutions often come with significantly higher development costs, which ultimately dictate the cost-benefit ratio.

It is only a matter of time until edge platforms offer solutions that diminish development costs and increase their possibilities. One promising bleeding-edge solution is Compute@Edge from Fastly, a serverless edge compute environment built on top of WebAssembly [22].

WebAssembly, which was initially targeted as an alternative to JavaScript and therefore at web browsers, is ultimately designed to run on any system. Features of WebAssembly like portability, performance, and safety make it a good fit for edge computing. Some tech companies, including Fastly, have started open-source projects using it outside browser environments by implementing the WebAssembly System Interface (WASI), a WebAssembly standard for system interfaces. Fastly initiated the development of Lucet [2], the compiler and runtime of Compute@Edge, and the tooling around it to implement their next-generation edge computation solution.


This project examines the possibilities and limitations of the new approach Compute@Edge represents by implementing a prototype on Compute@Edge that replicates a globally utilized server of Spotify. Furthermore, an experiment was executed to capture the latencies of the prototype and the original system using a Spotify client that generated almost 26 million data points from all over the world. This allows not only the comparison of global latencies between the current system and the prototype, but also a general comparison of edge computing and cloud computing.

1.1 Research Questions

This work addresses the following three questions with a focus on the first one.

1. How does running systems on an edge computing platform impact their worldwide network latency compared to cloud computing?

The project presents the outcome of an experiment that captures the worldwide latencies of two exemplary servers, one running on the edge and one in the cloud. The comparisons show the latency differences between an edge system and a cloud system, globally and regionally.

2. Is WebAssembly with WASI sufficiently versatile to implement necessary logic in order to replace latency-sensitive cloud systems?

The technology around WebAssembly is examined and practically tested to assess its possibilities and to identify its advantages and disadvantages.

3. Is the development of a latency-sensitive system on Compute@Edge worth the latency improvement?

The development effort can be estimated from the exemplary implementation effort of the prototype (4.1). Additionally, the tech stack behind Compute@Edge gives hints on how large the actual development effort is expected to be, also compared to other edge computing solutions.


1.2 Thesis Contributions

The outcome of this thesis presents detailed, quantitative insights into real-user network latencies for edge computing and cloud computing infrastructures. Generally, the results of the experiment support infrastructure decisions, especially regarding possible performance improvements and the decision whether to implement a specific system in the cloud or at the edge. This work is a reference for exploring the cost-benefit ratio of implementing systems on edge platforms. Furthermore, it shares insights into the global latency inequalities exposed by cloud computing and offers initial starting points to overcome them.

1.3 Goals

According to the stated Research Questions (1.1), the project has the following goals.

1. Implement a prototype at the network edge with Compute@Edge which handles an exemplary request like its cloud counterpart.

2. Compare the latencies of the prototype to the original cloud system.

3. Explore feasibility and versatility of edge computing with WebAssembly and Compute@Edge.

1.4 Benefits, Ethics and Sustainability

This project promotes the increased utilization of edge computing, which is expected to include some ethical benefits. In terms of sustainability, however, any improvement edge computing might seem to offer is arguably tiny or maybe even non-existent.

Ethically, the decentralization of data centers should lead to a fairer distribution of application performance. With the centralization of computation in recent years (2.1.2), big data centers serving major parts of the internet were built in the parts of the world where most of the traffic happens. This has led to a latency disadvantage for locations that are far away from cloud data centers. With edge computing the situation can be improved significantly: by using hundreds instead of a handful of data centers, the average worldwide latency is much lower, and the delays and functionality of applications could be less dependent on the geolocation of clients.

The author refuses to see a clear improvement in terms of sustainability. One could argue that by reducing the distance requests need to travel, the number and size of necessary infrastructure shrinks and energy usage is therefore reduced. However, there is much more to consider, like the way edge computing is used. It is imaginable that many clients will be implemented to issue parallel requests to both edge and cloud endpoints and take the fastest response. Hence, this could also lead to a rebound effect, increasing the number of requests and requiring higher energy consumption.

Additionally, edge computing currently leads to an enormous amount of infrastructure being placed in new, small data centers. A shift towards edge computing means that new hardware is going to be produced.

1.5 Outline

Chapter 2 provides the theoretical background of the project, e.g. explaining concepts like edge computing and what is meant by them throughout this project. After that, Chapter 3 covers the technical background, especially explaining Compute@Edge in detail. Chapter 4 describes the different technical implementations made throughout the project, with a focus on Compute@Edge. The results are found in Chapter 5, and Chapter 6 wraps up the work and gives an outlook on possible future work.

Chapter 2

Theoretical Background

Several key concepts enable the functionality and noteworthiness of Compute@Edge, the system around the prototype of this work. The general idea of Compute@Edge is to offer network edge computing (2.1) at Fastly's Points of Presence (PoPs). Compute@Edge uses WebAssembly (2.2), a portable low-level assembly language, and a system interface for WebAssembly (2.3) to offer a serverless computing environment (2.4) that runs applications at Fastly's 50+ data centers (3.1). The theoretical concepts are described in the following sections of this chapter. Additionally, Section 2.5 sheds light on response times in this context, including a subsection about latencies that explains which part of the network latency this project attempts to improve.

2.1 Edge Computing

Edge computing is a modern computing paradigm, which some coin as “the next big trend” [46]. Historically, the term edge computing has been used at least since Akamai started to use content customization for their cache. Interestingly, this original usage comes very close to what this thesis means by edge computing. However, the term is now used for a range of different kinds of computation which are only loosely related to its origin.

First, Section 2.1.1 presents the different things edge computation can mean nowadays, as well as a general definition. Then, Section 2.1.2 outlines the difference between edge computing and cloud computing. As the replicated system in this work runs on a common cloud architecture, the experiment compares cloud computing and edge computing with a focus on the latency difference. Comparing those paradigms as a whole is therefore a first step towards understanding the results.

2.1.1 Definition

Defining the concept of edge computing is complicated, as the term is used in different parts of the tech world with quite disparate meanings. All those approaches deal with physical proximity to reduce latency; the definitions mostly differ in what they define as the edge. Edge computing is a term in the Internet of Things (IoT), where it means, for example, shifting computation to IoT gateways, but also in telecommunication, where it refers to computation at the central office or base station [15].

The edge can basically be any device in between the client and the cloud data center, including the client itself. Therefore, a very general definition of edge computing is a model for using the computing resources of any device which is not part of the cloud data center [13]. Edge computing stands for solutions that facilitate data processing at, or in closer proximity to, the source of data generation than the closest cloud data center [45].

2.1.2 Comparison to Cloud computing

The physical proximity of computation is essential in edge computing but is neglected in cloud computing. According to the National Institute of Standards and Technology (NIST), “cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction” [25]. In recent years, the proliferation of this model has extended the capabilities of smartphones and other devices significantly. Devices do not need to do heavy computations themselves anymore and cloud computing simplified the development of various web services, transforming the internet to what it is today.

However, services with latency-sensitive content like streaming data and user interactions face the distance to cloud data centers for doing computation. As described in Section 2.5.1, the distance to data centers is often a significant contributor to high latencies.

Cloud computing, compared to edge computing, is a very centralized approach. Latency and physical proximity are not prioritized; instead, cloud computing usually exploits economies of scale to lower the costs of maintenance and operation [32]. Cloud data centers are mostly located at cost-efficient but not necessarily well-connected locations.

The transmission to data centers is often responsible for high network latencies. As systems mostly run on a small and limited number of data centers, the latency to the closest data center is often the major reason why globally distributed applications provide varying performance depending on the client's geolocation, Internet Service Provider (ISP) and connection to the closest data center in general. Computation needs to happen closer to clients to overcome the physical boundary of transferring data to distant data centers.

As cloud computing has these drawbacks, which are becoming more apparent with IoT and modern mobile applications, computation is projected to become increasingly decentralized [32]. This does not mean that cloud computing will be replaced in the future, but rather that edge computing can complement the cloud, handling latency-sensitive parts to speed up web applications of any kind.

The essential difference between edge computing and cloud computing is the difference between centralization and dispersion [32]. Cloud computing can reduce costs, simplify deployment and maintenance of services and has many other advantages. However, cloud data centers are often not in the best location for all types of computation. Latency-sensitive and globally distributed applications require computations in closer proximity to clients and cloud computing might not be the best strategy for those cases [43].

2.2 WebAssembly

WebAssembly is an answer to the pitfalls of JavaScript, “the only natively supported programming language on the Web” [21]. Primarily, WebAssembly is targeted as an alternative to JavaScript for the Web. However, the language is designed to run anywhere (2.3) and does not depend on the environment of browsers. It is a compilation target for programming languages: a portable, low-level assembly language which is supposed to offer great performance (2.2.1) in an isolated and safe manner (2.2.2), while being portable to any system with a WebAssembly runtime and compilable from many common programming languages (2.2.3). WebAssembly is the enabling technology of Compute@Edge (3.1.2), and therefore the features of WebAssembly directly relate to the capabilities of Compute@Edge.

2.2.1 Speed

In comparison to JavaScript, WebAssembly especially stands out in terms of performance. First of all, as it compiles to native machine code, it can use the full performance of the machine it is running on, whereas JavaScript is known to have inconsistent performance [21]. Additionally, WebAssembly binaries are far more compact and do not impose as much strain on networks when distributed; even compressed JavaScript code is not as compact as WebAssembly files.

2.2.2 Safety

As code on the Web is considered untrusted, safety is a crucial point for WebAssembly. WebAssembly does not assume a managed environment and is supposed to run on any system, including directly on operating systems (2.3). Therefore, the interface to the host system has to be well designed. In order to be safe, WebAssembly runs different executions in distinct sandboxes, and WebAssembly binaries need to be executed in a deterministic way. WebAssembly strives to support only “safe programs by eliminating dangerous features from its execution semantics” [33]. This ensures the development of safe applications but might restrict the functionality of high-level programming languages when compiled to WebAssembly. As WebAssembly was first used in browsers, which are trusted and come with safety measures like memory safety, safety poses more challenges for non-browser-based systems (2.3).

2.2.3 Portability

WebAssembly is designed to be capable of running anywhere with the same outcome. Thanks to this portability, it is imaginable that systems can easily be distributed on different cloud/edge platforms; it could prevent vendor lock-in and enable shifting existing systems to the edge.

“WebAssembly is created for a conceptual machine, which means it is supposed to run on any architecture” [39]. It is independent of architecture, operating system and platform, and runs on all those systems with the same behavior [21]. The concept implies that, as long as a WebAssembly runtime is supported, developers can assume that their compiled WebAssembly code will run identically on various systems.

Besides running anywhere, existing code bases are supposed to be easily shifted to compile to WebAssembly to ensure the portability of running systems. For browsers, that specifically means that WebAssembly plays well with JavaScript: it can call into and be called from JavaScript and can access browser functionality through already existing Application Programming Interfaces (APIs) [44].

2.3 WebAssembly System Interface (WASI)

WebAssembly runs on all major browser environments. However, there are use cases, like computing at the edge, where the browser engine would just be overhead. Running WebAssembly without the additional layer of a browser further increases its possibilities, and WASI, a standard for a modular system interface for WebAssembly, was defined for just that. Running WebAssembly inside browsers is different from running it directly on operating systems, because the browser provides functionality like preventing many kinds of malicious executions. Furthermore, different operating systems work very differently, and WebAssembly itself does not have a system interface to safely talk to the operating system. As WebAssembly is supposed to run on any operating system, it requires a system interface for a conceptual operating system [39], which eventually works for all of the physical operating systems. That is what WASI is supposed to be: it is intended to uphold all WebAssembly principles and make WebAssembly a ubiquitous code distribution format.

2.4 Serverless Computing

Serverless computing is a concept for services that offer computation of custom application code on managed infrastructure. The advantage of serverless computing is that developers do not need to actively control operational aspects of deployment and maintenance but can still deploy their individual application code [5]. A disadvantage is that the level of freedom is not as high as in Infrastructure as a Service (IaaS): for instance, there is a constraint on supported programming languages, and the application usually needs to be stateless. Serverless does not mean that no servers are used, which is a common misconception, but rather that developers do not need to actively control any servers.

2.5 Response Time

Response time for interactive applications can be “defined as the elapsed time between when an action of the user is captured by the system and when the result of this trigger can be perceived by the user” [8]. For interactive end-user applications, the response time is significant for a seemingly wait-free user experience; short response times represent an integral element of the performance of distributed applications. To analyze response time, it can be decomposed into multiple different latencies, described in the following Section 2.5.1, with a focus on network latency. After that, Section 2.5.2 clarifies the importance of low response times by showing how the performance and retention of applications are related. Finally, Section 2.5.3 presents recommended upper limits on response times for interactive actions in applications like Spotify, in order to prevent a noteworthy decrease in retention.

2.5.1 Latency

If a client action triggers a data center call with computation, the response time depends on different latencies between the client and the data center. According to Choy et al. [8], the different kinds of latencies can be separated into the following three parts.

T_response = T_client + T_network + T_server

T_client represents the handling of the action in the client, which consists of sending and receiving data and further computations.

T_server represents the time the data center needs to perform the computation for the respective request. T_server lasts from receiving the request to sending the response.

T_network is the time the whole network between the client and the server needs to send the request to the server and the response back to the client. Choy et al. [8] split the network latency into four more parts, depending on which kind of provider maintains the infrastructure. This helps to identify latency parts that application development can directly impact. The following describes network latency in more detail and emphasizes why the transit network latency is especially important in this work.
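As a minimal illustration, the decomposition can be written out with example values. All numbers below are invented for illustration; only the formula itself comes from Choy et al. [8].

```python
# Hypothetical breakdown of one request's response time into the three
# components defined above (values in milliseconds; numbers are invented
# purely for illustration).
t_client = 5.0    # T_client: handling in the client (send, receive, compute)
t_server = 20.0   # T_server: from receiving the request to sending the response
t_network = 98.0  # T_network: request to the server and response back

t_response = t_client + t_network + t_server  # 123.0 ms in this example
```

In this invented example the network term dominates, which matches the argument of this work: for globally distributed clients, T_network is the component an edge deployment can reduce, while T_client and T_server stay roughly the same.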

According to Choy et al. [8], the first two transmission latencies are introduced by the network infrastructure the client uses directly for the transmission to the first internet-connected router, and by the data transmission on the infrastructure of the ISP up to the next-hop transit network. Those latencies depend on the device of the client, its connection to the next router and the infrastructure of the ISP. They cannot be directly affected by the provider of the application.

Additionally, the data center latency is another latency portion with little to no improvement possibilities from the application development point of view, assuming that IaaS is used. The data center latency is the network delay behind the front-end servers of the data center responsible for the computation. As both computing paradigms compared here require data centers, this should be a comparable and also negligible delay.

Between the ISP latency and the data center latency, the transit network latency is the delay introduced by transmitting data from the first peering point to the gateway of the data center. Both the ISP and the provider of the data center are responsible for this transmission, but networks of third-party network providers are often used as well. This latency can be significantly affected by the choice and location of data centers and, hence, the chosen computing paradigm (2.1). Holistically, the network latency depends on many different devices like cables, switches and routers, which have individual delays of their own [48]. However, as the distances to data centers for worldwide applications are often in the order of thousands of kilometers, the most significant delay is caused by bridging the distance. Even with the best physical connection link, the lower bound on latency is the speed of light [32], and the transmission across thousands of kilometers should not be neglected.

12 CHAPTER 2. THEORETICAL BACKGROUND

A typical figure for calculating the latency of a fiber optic cable is about 4.9 microseconds per kilometer [7]. Data transmission over 1,000 kilometers therefore takes around 4.9 milliseconds, and with only a small number of cloud data centers (2.1.2), the distance can be 10,000 kilometers or more for disadvantaged locations. For example, assuming there were no data center in Australia, the distance from New Zealand to a data center in Asia or South America would be roughly 10,000 kilometers, which accounts for 49 milliseconds one way. In this scenario, the transit network latency alone can be assumed to account for around 100 ms round trip. A response time over 100 ms is often too much for an application to feel interactive and possibly decreases retention (2.5.2).
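The arithmetic above can be captured in a small helper (a sketch for illustration only; the 4.9 µs/km figure ignores routing detours, switching and queueing delays, so real latencies are higher):

```rust
// Back-of-envelope signal propagation delay in fiber, using the figure of
// roughly 4.9 microseconds per kilometer cited above.
const FIBER_DELAY_US_PER_KM: f64 = 4.9;

/// One-way propagation delay in milliseconds for a given fiber distance.
fn one_way_delay_ms(distance_km: f64) -> f64 {
    distance_km * FIBER_DELAY_US_PER_KM / 1000.0
}

/// Round-trip delay: the request and the response each cross the distance.
fn round_trip_delay_ms(distance_km: f64) -> f64 {
    2.0 * one_way_delay_ms(distance_km)
}
```

For 1,000 km this yields about 4.9 ms one way; for the 10,000 km New Zealand scenario above, about 49 ms one way and 98 ms round trip, before any processing delay.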

Percentile latency

The percentile is a statistical concept that finds many applications in distributed systems. For experiments with network latencies, it is often misleading to calculate the average or maximum of multiple data points, as the distribution of latencies can deviate a lot: there are often a few latencies that are much higher than the rest.

Practically, the xth percentile means in this context that x percent of the recorded latencies are below that value. If, for example, the 99th percentile is 300 ms, then 99% of all requests took less than 300 milliseconds.

Theoretically, there are multiple definitions for the exact calculation of a percentile, which can change the outcome significantly for small data sets [29].
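As a concrete illustration, the nearest-rank definition is one of these variants and can be implemented in a few lines of Rust (a sketch; the tooling used in this project may compute percentiles differently):

```rust
// Nearest-rank percentile: the p-th percentile is the smallest recorded
// latency such that at least p percent of all samples are less than or
// equal to it.
fn percentile_ms(latencies_ms: &mut [u64], p: f64) -> u64 {
    assert!(!latencies_ms.is_empty() && p > 0.0 && p <= 100.0);
    latencies_ms.sort_unstable();
    // nearest-rank: ceil(p / 100 * n), used as a 1-based index
    let rank = ((p / 100.0) * latencies_ms.len() as f64).ceil() as usize;
    latencies_ms[rank - 1]
}
```

With 100 samples of 1 ms to 100 ms, the 99th percentile under this definition is 99 ms; other definitions interpolate between neighboring samples and can give slightly different values.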

2.5.2 Retention

Retention is a metric that tracks whether users continue or stop using an application. It is widely used to measure the success of applications, as high retention indicates high engagement and adoption [36, 49].

To increase retention, it is crucial to understand why users engage little with an application. Studies show that one frequent reason for users to abandon applications is poor performance. The App Attention Index 2019 found that the expectations of users in terms of application performance are increasing and that 55% of users are frustrated by slow response or loading times. Furthermore, consumers abandon applications and actively switch to other services particularly in the case of an "unacceptable digital experience" [41]. This emphasizes the importance of staying above a certain standard of performance, including low response times, to reach high retention and prevent damage to brand and business.

As long as performance is at a reasonable level, decreased application latency does not necessarily guarantee higher retention [49]. To support decisions for or against higher performance, which often comes with higher expenses, it is crucial to quantify recommended response time boundaries so that improvements effectively lead to better retention.

2.5.3 Response time limits

As humans are acutely sensitive to delays [32], it is helpful to have latency boundaries for response times. Models like RAIL [24], which is based on UX research on how users perceive delays [30], define time boundaries for different user interactions with applications.

The base of all this is human perception: research shows that 100 ms is about the limit for users to feel that a system reacts instantly. Furthermore, the response time should stay below 1 second to ensure that users are not distracted by other thoughts while waiting for the response [30].

On top of those response time limits, the RAIL model defines goals and guidelines for what developers should aim for in terms of processing time. One important goal is to complete transitions after user interaction within 100 ms to provide seemingly instantaneous feedback. A guideline to keep this limit is to keep the actual execution time below 50 ms, leaving the remaining time for parallel executions.

Chapter 3

Technical Background

The technical background describes the services of Fastly and Spotify that this project uses. After an introduction to Fastly (3.1) and some background on its data centers, this chapter introduces Compute@Edge (3.1.2), the edge computation tool of Fastly, which uses Lucet (3.1.1) to compile and run systems at edge data centers (3.1). Furthermore, this chapter presents the system around track metadata at Spotify (3.2.2).

3.1 Fastly

Fastly is an American edge cloud platform provider that initially offered a CDN, which it gradually extended with new tools and offerings to expand the possibilities of the edge. It strives to enable developers to deliver a fast, secure and scalable digital experience [4]. The company was founded in 2011 by Artur Bergman and currently has more than 500 employees, with its headquarters in San Francisco, CA.

Aside from a global CDN, Fastly also offers image optimization, video and streaming, cloud security, load balancing and edge computing. In addition to Compute@Edge, the focus of this work, Fastly has offered edge computation with the Varnish Configuration Language since 2017. Compared to other CDNs like Akamai or Cloudflare, Fastly has a different approach to its PoPs (3.1).


Data centers

With PoPs in currently around 55 different locations [16], Fastly runs its network in few data centers compared to other CDN providers1. The low number of edge data centers comes with pros and cons. With a higher number of data centers, it is possible to be even closer to users on average; however, more data centers come with more maintenance effort and increase the complexity of rolling out changes. Fastly argues that with the network infrastructure of some years back, when CDNs arose, it made sense to have as many PoPs as Akamai had, because the time to reach the internet backbone was much higher than it is today. Today, however, strategically placed PoPs are only a few more milliseconds away, and it is much more important to have data centers big enough to increase cache hits and prevent requests back to the origin, which take a lot longer [47]. An analysis of many big CDNs shows that Fastly can indeed keep up with the global network latencies of other CDNs [1]. Additionally, the data centers of Fastly are very well-equipped, for instance exclusively with Solid State Drives, which are essential for short, constant cache retrieval times in big data centers. These bigger data centers are able to serve the cache in addition to all the other services Fastly offers, including Compute@Edge. It would probably be very complex for Akamai to implement a similar change as fast as Fastly can across all of its data centers.

3.1.1 Lucet

Lucet is an open-source WebAssembly runtime and compiler initiated by Fastly to run Compute@Edge [2, 14]. As it became an advanced implementation of WASI, it is useful not only for Compute@Edge but for several use cases and companies that generally require "low-latency, high-concurrency server-based applications of WebAssembly" [6]. Fastly has passed the lead of the project on to the Bytecode Alliance. Lucet compiles WebAssembly modules to native code using ahead-of-time compilation. Additionally, the Lucet runtime manages resources and traps runtime faults. One significant advantage is that thousands of different WebAssembly binaries can be executed in the same process, which leads to minimal overhead for initialization and cold starts.

1 For example, Akamai, the CDN with the most data centers, has more than 2200 PoPs, while Cloudflare has over 200.


3.1.2 Compute@Edge

Compute@Edge is a serverless edge computation tool from Fastly that uses the traits of WASI to simplify writing and running systems at its edge data centers (3.1). Compute@Edge uses Lucet to compile and run code and is configurable through the Command Line Interface (CLI) and the web app of Fastly.

This section first links Compute@Edge to theoretical concepts. After that, the current status of Compute@Edge (3.1.2) is outlined by showing how application developers work with the platform and by presenting its preliminary hardware limits (3.1.2).

Serverless Computing

Compute@Edge is a Serverless Computing platform (2.4) operating on a per-request basis. It strives to stand out through its low network latency. Unlike other Serverless Computing providers, the computation is done at edge data centers (3.1), which are optimized for network latency (2.1). A difference to most other serverless functions [5] is that the computation is invoked by a Hypertext Transfer Protocol (HTTP) request, which is the input of the executed main function, and the output is an HTTP response (3.1.2).

Current Status

Compute@Edge [11] is in an advanced stage of its private beta phase, giving many insights into future performance and use cases. This section shows the status of Compute@Edge at the time this project was implemented and shares prospects on how Compute@Edge might evolve in the future.

The programming languages available for Compute@Edge are tightly coupled to the ecosystem around WASI and Lucet, as they need to be compilable to WebAssembly. The first programming language for the beta phase of Compute@Edge is Rust, and there are plans to support more languages in the future [14]. Fastly and the Bytecode Alliance have announced efforts to improve the ecosystem around WebAssembly, which will increase the number and quality of supported programming languages over time.


Invocation. A Compute@Edge function is invoked by a request to the associated domain. The request is forwarded to the WebAssembly binary, which calls the main function with the request as an object. Fastly implemented a macro for Rust that creates an interface for the main function so that it receives the request as input and returns a response [42]. Listing 3.1 shows how the invocation of the main function can be implemented in Rust.

Listing 3.1: Implementation which shows how to use the Fastly macro to invoke the C@E system.

#[fastly::main]
fn main(mut req: Request<Body>) -> Result<Response<Body>, Error> {
    handle_request(req)
}

The deployment of a Compute@Edge system is simple with the Fastly CLI. Once everything, like the compilation target, is configured, it is possible to compile and deploy a WebAssembly binary with two commands: fastly compute build uses Lucet (3.1.1) under the hood to compile the Rust package to a WebAssembly binary, and fastly compute deploy uploads the WebAssembly binary, puts it in place and activates the new version of the system. All this takes only a few seconds until the new version is actively in place.

Logging on Compute@Edge is not trivial, as any data center might execute the application. Fastly offers a logging integration with real-time log streaming to various logging endpoints like Google Storage, S3 and Logentries [34]. Additionally, Fastly provides a Rust library that offers a simple way to send custom logs to those endpoints [23].

Limits and Pricing. Serverless Computing platforms usually enforce boundaries on single instances and have elaborate pricing to prevent misuse of the system. Compute@Edge has preliminary limits for each running instance: currently, 1 MB of stack, 30 MB of heap and a maximum runtime of 60 seconds are enforced [42]. Furthermore, Fastly plans to bill Compute@Edge systems on a per-request basis.


Disadvantages

The disadvantages of Compute@Edge are linked to the underlying technology, particularly WebAssembly and WASI. The following paragraphs focus on the most important ones:

Multithreading. Currently, some useful libraries do not compile to WebAssembly due to missing support for multithreading. Multithreading is therefore not possible with Compute@Edge, and it was unavailable in WebAssembly in general for a long time. It is on the roadmap of WebAssembly now, and some browsers already support it [31]. It should only be a matter of time until WASI supports multithreading, which will enable Compute@Edge to support it, too.

Libraries are not necessarily compilable to WebAssembly. Therefore, developers do not have the complete freedom of all existing libraries of a programming language. If libraries depend on modules that are not compilable to WebAssembly or lack WASI support, Compute@Edge cannot execute the code. For many missing components, like multithreading, proposals exist, so Rust and other languages are likely to be supported more thoroughly in the future. For now, it is helpful to compile the package right after adding a new library and before building on it in a Compute@Edge project: Lucet (3.1.1) complains about imported but unsupported libraries right away.

Storage is, and will probably always be, very different from the storage possibilities of cloud computing (2.1.2). Currently, Fastly offers caching with various tooling and extensions around it. Storing static data is fast and straightforward, and extensions like instant purging enable the caching of more dynamic content [12].

State conflicts that are only resolvable by intercommunication between data centers, however, prevent the implementation of all storage functionalities known from cloud computing within the low-latency approach of edge computing. Eventual consistency seems to be possible.

Fastly itself argues that only a few use cases, for example those requiring large state, should remain in the cloud [40].


3.2 Spotify

Spotify is a Swedish music streaming service that served around 320 million monthly active users in the third quarter of 2020 [38]. In this project, the public API (3.2.1) and the browser-based client (3.2.3) of Spotify are used to run the experiment; both are described in the respective subsections. Additionally, the system for metadata is described in more detail in Section 3.2.2, and Section 3.2.4 briefly introduces the history of the importance of response times at Spotify.

3.2.1 Web API

The Spotify Web API is the public API of Spotify [19], offering developers ways to build on top of Spotify for many use cases. The API is also used internally and is responsible for serving many client requests. With the Spotify Web API, anyone can access various data entities through the metadata endpoints (3.2.2). The system runs in three different data centers, located in America, Europe and Asia.

3.2.2 Metadata

The metadata service of Spotify provides information about content, i.e. the media used for the actual end-user entertainment experience, such as music or podcasts. It allows finding content metadata based on IDs. It also runs complex business logic to provide the correct metadata depending on, among other parameters, the specific kind of request and the time. There are many different metadata entities, which can also be received directly from the Spotify Web API [19] (3.2.1). The track metadata endpoint (3.2.2) is one exemplary metadata entity that works similarly to many others.

Track Metadata

One heavily utilized endpoint among the Spotify metadata endpoints is the one for track metadata. Track metadata is necessary whenever a client shows or plays a song. Apart from the location of the sound file, track metadata contains much more data, like the name of the song and information about the album, artist and available markets, to name a few [20]. Listing 3.2 shows a version of the track metadata of a song called Cosmic Waves, from an album of the same name. The metadata also shows the release date of the album and the markets the song is available in. Furthermore, the images object in album contains the link to the picture of the album, and href links to the actual sound file.

Listing 3.2: Exemplary JSON representation of track metadata. Three dots represent abbreviated objects.

{
  "metadata": {
    "album": {
      // ...
      "images": // ...
      "name": "Cosmic Waves",
      "release_date": "2020-06-01",
      "release_date_precision": "day",
    },
    "artists": // ...
    "available_markets": [
      "AD",
      "AE",
      "AL",
      // ...
    ],
    // ...
    "href": "https://api.spotify.com/v1/tracks/4MPtWNVS3ZpDWL33imUs79",
    "name": "Cosmic Waves",
    "type": "track",
  }
}

However, there is not just one version of track metadata for a track at a given time. The metadata is customized for specific use cases depending on the kind of client, user, time, geolocation and more. To illustrate this complexity, the following paragraphs explain some examples, which are a subset of all customizations of track metadata.

Market availability. Due to licensing constraints and other reasons, songs, artists and albums are not necessarily available in all markets of Spotify. The metadata of a track has a field that specifies the markets the song is available in. If possible, the Spotify Web API only returns metadata if it is available in the market of the user.

Language. The metadata of tracks, especially their names, often differs depending on the language of the user. Creators can specify names for different languages, and the client gets the correct version according to language or country.

Time availability. Tracks have release dates, and the metadata must not be distributed and shown to end users before the track is released. Therefore, the release date has to be compared to the current date before distribution.
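This comparison can be sketched in a few lines of Rust (a sketch only; the signature is an assumption and differs from the prototype's metadata_checker::is_released, which receives the parsed metadata):

```rust
// Minimal sketch of a release-date check. Since release dates use the
// ISO 8601 format "YYYY-MM-DD", date strings of equal precision can be
// compared lexicographically; the current date is passed in by the caller.
fn is_released(release_date: &str, today: &str) -> bool {
    release_date <= today
}
```

For example, a track with release date "2020-06-01" is released on "2020-11-01" but a track dated in the future is not, and its metadata must be withheld.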

3.2.3 Web-player

The web-player is the Spotify client for desktop web browsers. It is accessible through open.spotify.com and represents the central consumer platform of Spotify. With millions of users, some of the associated backend services need to handle thousands of requests per second.

3.2.4 Response time

One goal of Spotify has always been to make playing a chosen song feel immediate for users. Historically, the goal was to stay below 250 ms from pressing play to hearing the song [37]. According to User Experience (UX) research [30], the delay should be even lower, at 100 ms, to ensure that users hear songs without noticing a delay. Furthermore, the RAIL guidelines suggest that the execution time should not exceed 50 ms (2.5.3).

Chapter 4

Design and Implementation

This project consists of three main components. The prototype, which runs on Compute@Edge, is presented in Section 4.1. It is a new service that replicates the functionality of a system currently running on a cloud computing platform. Its implementation reveals insights into the development complexity and the capabilities of Compute@Edge. To show how Compute@Edge works from a development perspective, the section is supported by multiple code examples.

The prototype is utilized by the experiment, which gathers global network latency metrics from Spotify clients to compare the performance of the current cloud infrastructure to edge computing. The implementation of the experiment is briefly explained in Section 4.2. After that, Section 4.3 shows how the data from the experiment was visualized and explored to obtain the results presented in Chapter 5.

4.1 Prototype

This section first explains the selected method, answering why the track metadata request was chosen as the exemplary call (4.1.1). Then, Section 4.1.2 shows parts of the implementation and explains the essential modules.

4.1.1 Method

The approach of this work is to implement a vertical prototype in order to compare a logically similar system running on the edge to its current implementation. The prototype showcases a running system by implementing one working part of the original system. It handles an exemplary request that represents many other similarly latency-sensitive calls at Spotify and generally gives insights into the implementation of any other edge system, especially on Compute@Edge.

The prototype implements the endpoint for track metadata at the edge of Fastly. It is supposed to handle requests similarly to the current track metadata endpoint (3.2.2), but on edge data centers, and hence without the necessity of calls to cloud data centers.

The track metadata request is one widespread kind of request from Spotify clients. To play a song in the app, the metadata first needs to be retrieved in order to then get the soundtrack and further static data, like the picture of the track. This specific request is similar to all the other metadata requests in the ecosystem of Spotify. Additionally, it is highly latency-sensitive and thus a good example for metadata requests and many other requests.

Track requests in the infrastructure of Spotify are highly latency-sensitive, as metadata is necessary to play a song on Spotify, which is supposed to happen interruption-free (3.2.4). As it is not possible to cache the whole Spotify catalog, there will always be times when users hit the play button of a song without its metadata in the cache. Therefore, the latency of track requests must be short to reduce the time until the song starts to play.

The track metadata request not only accurately represents many similar metadata requests in the infrastructure of Spotify; in theory, it stands for any request for dynamic content that requires some computation and therefore cannot be covered by caching.

4.1.2 Implementation

The prototype is implemented in Rust in the Compute@Edge environment of Fastly (3.1.2). The following paragraphs explain parts of the implementation and give insights into the codebase.

The code in the following explanation is simplified to focus on the important parts. The full codebase can be found in Appendix A.


Setup

Initially, Compute@Edge has to be set up to compile, deploy and monitor code at the edge of Fastly [35]. After creating a new service, downloading the Fastly CLI and configuring it, the CLI generates an initial Rust package as a starting point with the command fastly compute init. That package is then ready to be compiled and deployed with the respective commands (3.1.2).

Handling Requests

The prototype needs to handle one specific path for track metadata requests, including a track id either in the path or the query. The chosen path is /metadata/v1/track.

The default code of the initial Rust package [18] already contains the main function, which takes the incoming request as input and handles it according to HTTP method and path. Therefore, only the chosen path had to be added in a pattern match that either handles a track request if the path is correct or returns an error. Listing 4.1 shows how this was implemented in Rust.

Listing 4.1: Excerpt of the function in the prototype which sorts requests by path.

match (req.method(), req.uri().path()) {
    (&Method::GET, path) if path.starts_with("/metadata/v1/track") => {
        let response = track_handler::handle(req);
        return response
    }
    _ => {
        Ok(Response::builder()
            .status(StatusCode::NOT_FOUND)
            .body(Body::try_from("The page you requested could not be found")?)?)
    }
}

If the incoming request to the domain of the service is a GET request with the correct path, the track_handler module handles it, as explained in the next section (4.1.2). Any other incoming request, with a different method or path, is answered with an error response. With this implementation, only requests like metadata.edgecompute.com/metadata/v1/track/ are handled, assuming metadata.edgecompute.com is the domain of the Compute@Edge service. If the track_handler receives the request, it handles it and returns the response, which is sent back to the requester.


Track Handler

The track_handler module contains all the functionality necessary to handle track metadata requests. It includes the function fn handle(), which invokes all other components in order to return the correct metadata.

Listing 4.2: Top function for track metadata handling of the prototype.

pub fn handle(track_id: &str, query: &str) -> Result<(String, String), error::TrackError> {

    // parse track id and query parameters
    let track_id = parser::track_id(String::from(track_id));
    let query_parameters = parser::query_parameters(query);

    // get metadata from the cache, or from the webapi if not cached
    let mut used_webapi = false;
    let mut metadata = match cache_handler::get_cache_for_track_id(&track_id) {
        Ok(cached) => cached,
        Err(_) => {
            used_webapi = true;
            spotify_api::get_metadata_for_trackid(&track_id)?
        }
    };

    metadata = parser::metadata(&metadata)?;

    // return an Error if the track's release date is in the future
    if !metadata_checker::is_released(&metadata)? {
        return Err(error::TrackError{error: error::Error::Custom(String::from("Track not released")), status_code: 404});
    }

    let market = parser::get_specific_query("market", query_parameters);
    metadata = metadata_checker::handle_market(metadata, &market)?;

    // get geolocation data of the IP
    let metrics = parser::get_metric_data(used_webapi)?;

    let body = parser::compose_body(metadata, metrics);

    Ok(body)
}

First, the request is parsed. This is done by a parsing module, which can parse the track id, the query and, at a later point, the metadata. The parsing module is described in more detail in Section 4.1.2.

With the parsed track id, the cache_handler module can retrieve the metadata either from the cache or from the Spotify Web API (3.2.1). Note that the track used for the experiment was cached, so the Spotify Web API was not used. Section 4.1.2 describes how the system requests the cache in Compute@Edge.


At this point, the metadata of the track has been retrieved, and the parsing module parses the metadata next (4.1.2).

After that, the metadata is checked and, if necessary, adjusted for the specific use case depending on further information from the query. In this case, the metadata_checker module verifies whether the track is already released, and handle_market() checks whether it is available in the requested market. The market check is described in Section 4.1.2.

If the release date of the track is in the future, an error with an explanation and an HTTP status code is constructed and returned. Error handling was a challenge in Compute@Edge and is further described in Section 4.1.2.
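The error type used above is defined in the full codebase (Appendix A); based on how it is used in the listings, a minimal sketch could look as follows (the exact definition and the response_parts() helper are assumptions for illustration):

```rust
// Sketch of the prototype's error module: a domain error enum plus a
// wrapper carrying the HTTP status code to return to the client.
#[derive(Debug)]
pub enum Error {
    Custom(String),
    NotCached(String),
}

#[derive(Debug)]
pub struct TrackError {
    pub error: Error,
    pub status_code: u16,
}

impl TrackError {
    // map the error to an HTTP status code and a human-readable body
    pub fn response_parts(&self) -> (u16, String) {
        let message = match &self.error {
            Error::Custom(msg) => msg.clone(),
            Error::NotCached(id) => format!("no cached metadata for track {}", id),
        };
        (self.status_code, message)
    }
}
```

A "Track not released" error, for instance, maps to status code 404 with the explanation as the body.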

After the checks, the parser module gathers some more metric data, which is solely significant for the experiment and not part of the actual set of requested functionalities (4.1.2).

The last step is the construction of the response, including the corresponding metadata and the geolocation data for the experiment. Thereafter, the response is returned (4.1.2).

Parse

The parse module has multiple helper functions to parse inputs and extract specific parts of them. The function responsible for parsing the track id from the request is a simple one, depicted in Listing 4.3.

Listing 4.3: Helper function of prototype which gets the track id from a given path.

pub fn track_id(path: String) -> String {
    let values: Vec<&str> = path.split('/').collect();
    return String::from(values[4]);
}

The function expects the path, which is "/metadata/v1/track/" followed by the track id in this case. It splits the string by "/" and takes the element at index four of the resulting vector, which is the track id. Getting query parameters is equally simple, and Fastly provides good examples that only have to be adjusted to the actual use case [42].
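Fastly's examples are not reproduced here, but this kind of query parsing can be sketched in plain Rust; the function name mirrors the prototype's parser::get_specific_query, while the exact signature is an assumption:

```rust
// Sketch of query-string parsing: find the value of a named parameter in
// a raw query string like "market=SE&limit=1".
fn get_specific_query(name: &str, query: &str) -> Option<String> {
    query
        .split('&')                                      // one "key=value" pair per segment
        .filter_map(|pair| pair.split_once('='))         // skip malformed segments
        .find(|(key, _)| *key == name)                   // first pair with the wanted key
        .map(|(_, value)| value.to_string())
}
```

Calling get_specific_query("market", "market=SE&limit=1") yields the market code "SE", which the market availability check (4.1.2) then compares against the metadata.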


Cache

There are two ways caching speeds up the prototype and ensures that only a few calls go back to the origin: first, some test metadata was cached manually with Fastly; second, the application caches the HTTP responses of backends.

To test the prototype without synchronous calls to cloud data centers, the test metadata used in the experiment is cached by Fastly. This implies that the Compute@Edge application can usually get the metadata from within the data center it is running in; only in the case of a cache miss is the latency affected significantly. Fastly announces in their documentation [42] that direct communication with the cache will become possible, improving cache handling within Compute@Edge. Currently, it has to be implemented like the communication with any HTTP server.

Listing 4.4: Function of the prototype which gets the cached metadata for a track id.

const CACHE_DOMAIN: &str = "https://metadata-at-edge.cachedomain.com/";
const CACHE_BACKEND_NAME: &str = "metadata-cache";

pub fn get_cache_for_track_id(track_id: &String) -> Result<String, TrackError> {
    let uri = format!("{}{}", CACHE_DOMAIN, track_id);
    let cache_backend = Backend::from_name(CACHE_BACKEND_NAME)?;
    let bereq = Request::get(uri).body(())?;

    let beresp = cache_backend.send(bereq)?;
    let body: String = beresp.into_body().into_string();

    if body.is_empty() {
        return Err(TrackError{error: Error::NotCached(track_id.to_string()), status_code: 404});
    }
    return Ok(body);
}

The function shown in Listing 4.4 builds the request with the correct URL to the metadata and sends a GET request to receive the metadata. It then parses the response body into a string and returns it. Additionally, it checks for an empty body before returning, as an empty body would mean the metadata could not be received from the cache.

However, the prototype can also get any track metadata by requesting it from the original system if it is not found in the cache. Getting the metadata directly from the Spotify Web API is implemented similarly to getting the cache, but with an additional call for authentication and authorization. Requesting the metadata from the original system might look pointless, as Compute@Edge would merely proxy the actual service. However, it could also be a simple example of reducing latency with the help of Compute@Edge. This is further discussed in the future work, Section 6.1.3.

Check

As the prototype does not aim to just proxy the Spotify Web API, but to reduce the calls that have to go back to it, it has to replicate the business logic (3.2.2) that currently runs in cloud data centers only. Some exemplary parts were implemented to show that Compute@Edge and WASI can implement this kind of logic.

The current version is capable of verifying whether the track is available in the specific market of a user and whether a track is already released (3.2.2). These checks are a subset of all the checks and customizations the current system needs to perform in order to send only available and correct metadata to clients. If a check turns out to be negative, the system does not return any metadata, to prevent disclosing data that, for example, is not released yet.

The check functions get the parsed metadata and return the customized version of the metadata, or an error if the verification turned out to be negative. Listing 4.5 showcases the implementation of the market availability check.

Listing 4.5: Function to customize track metadata according to a given market.

// Checks if market appears in "available_markets" of both album and track
// and adjusts the metadata accordingly.
pub fn handle_market(mut metadata: Value, market: &str) -> Result<Value, TrackError> {

    // Check if market is in "available_markets" of the track.
    if is_available_in_market(&metadata["available_markets"], market)? {
        // Delete "available_markets" in album and track.
        metadata["available_markets"].take();

        // Create "is_playable": true.
        metadata["is_playable"] = json!(true);
        return Ok(metadata);
    }
    Err(TrackError {
        error: Error::Custom(format!("track is not available in market: {:?}", market)),
        status_code: 404,
    })
}

fn is_available_in_market(available_markets: &Value, market: &str) -> Result<bool, TrackError> {
    let available_markets = match available_markets.as_array() {
        // No market restriction present: treat the track as available.
        None => return Ok(true),
        Some(markets) => markets,
    };
    for a in available_markets {
        if a.as_str().unwrap() == market {
            return Ok(true);
        }
    }
    Ok(false)
}

handle_market() receives the market of the user who sent the request in addition to the cached metadata. is_available_in_market() loops through the available markets that are part of the metadata and checks whether the market of the user is among them. If that is the case, the field for the available markets is replaced with a field simply stating that the song is playable, which is similar to what happens in the background of the Spotify Web API.
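Stripped of the JSON handling, the core of the market check in Listing 4.5 can be sketched with the standard library only. The slice-based signature below is a simplification for illustration, not the prototype's actual interface:

```rust
// Std-only sketch of the market-availability logic from Listing 4.5.
// The JSON layer is stripped away: available_markets is a plain slice,
// and None models metadata without an "available_markets" field.
fn is_available_in_market(available_markets: Option<&[&str]>, market: &str) -> bool {
    match available_markets {
        // No market restriction present: treat the track as available,
        // mirroring the Ok(true) early return in the listing.
        None => true,
        Some(markets) => markets.contains(&market),
    }
}

fn main() {
    let markets = ["AD", "AE", "AL", "DE"];
    assert!(is_available_in_market(Some(&markets), "DE"));
    assert!(!is_available_in_market(Some(&markets), "US"));
    assert!(is_available_in_market(None, "US"));
    println!("market checks passed");
}
```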

Geolocation Metrics

Finally, the prototype gathers some geolocation metrics used exclusively by the experiment to obtain client location data. However, this largely out-of-the-box functionality of Compute@Edge could help implement further metadata checks in the future. The geolocation metrics are added along with the metadata to the response body used by the metrics collector (4.2).

Response

The final step of the system is to construct an HTTP response containing important headers and, in the body, the metadata plus the geolocation metrics, assuming the system executed all previous steps without errors. Listing 4.6 shows the essential parts of the response body.

Listing 4.6: Exemplary JSON response of a track metadata request to the prototype. Three dots represent abbreviated objects.

{
  "metadata": {
    "album": {
      //...
      "name": "Cosmic Waves",
      "release_date": "2020-06-01",
      "release_date_precision": "day",
    },
    "artists": //...
    "available_markets": [
      "AD",
      "AE",
      "AL",
      //...
    ],
    //...
    "name": "Cosmic Waves",
    "type": "track",
  },
  "metrics": {
    "client_city": "gütersloh",
    "client_continent": "EU",
    "client_country": "germany",
    "client_country_code": "DE",
    "latitude": "51.92",
    "longitude": "8.37",
    "used_webapi": "false"
  }
}

Error Handling

In the case of errors, the HTTP response currently contains the error message with a 404 status code. However, since the implementation of the prototype, the error handling libraries have advanced, and the Rust crate of Fastly [17] now includes a cleaner way to handle errors during development and production.

4.2 Experiment

Another technical part of this project is the implementation of the components necessary to execute an experiment that uses the Spotify web-player (3.2.3) to obtain latency metrics. The goal is to get track metadata from both systems, the Spotify Web API (3.2.1) and the prototype (4.1), and record the respective latency metrics. To get those metrics, the web-player clients need to send requests to both systems and record the delay until the response arrives. As those additional requests are not a functional requirement of the web-player, they must be executed passively, with no, or at most minimal, degradation of the web-player itself. Therefore, the experiment had to be executed once the web-player is already interactive and has enough idle resources (4.2.1). Secondly, it has to send two fake requests (4.2.2), obtain the network latency metrics for those requests, and centrally collect those metrics for the evaluation (4.2.3). Additionally, those components are wrapped inside a feature toggle, enabling the rollout of the change to only a part of the users (4.2.4). The following sections describe the components in more detail.

4.2.1 Passive Execution

An integral part of the production change is to prevent significant degradation of the frontend service. The worst-case scenario is to break something in the web-player, making it unusable for Spotify users. Any additional delay in the client should be prevented. Furthermore, the client had to handle the request to the prototype with particular caution, as Compute@Edge was still in beta without any service level assurances.

First of all, the change was carefully located at an execution point at which the whole web-player is already interactive. This ensures that the vital features of the web-player, or essentially all features, have a higher priority than the experiment.

To prevent errors in old or unsupported browsers, the module checks the browser and its version and executes the experiment only if all the necessary functions are supported. Additionally, only the necessary computation was done in the frontend, to keep the computational overhead of the change minimal and to ensure that the performance of the web-player stays the same.

4.2.2 Fake Requests

The experiment module sends two similar fake HTTP requests, one to the prototype and one to the Spotify Web API, to obtain the latencies in the next step (4.2.3). As the prototype replicates the current functionality, the requests look identical apart from the domain. As the experiment captures worldwide latencies, different track ids are expected to yield similar latencies; therefore, the track id was hardcoded throughout the experiment.

The next step is to forward data and latency metrics from the responses to the metric system.


4.2.3 Metrics

To obtain exact network latency metrics without unrelated overhead such as queuing, this project uses the Navigation Timing API [26] to get precise timings of the requests.

Among other timings, the so-called ResourceTiming object provides requestStart and responseEnd, which allow calculating the latency:

request_latency = responseEnd − requestStart

The outcome, in combination with some corresponding tags, is sent to a central metric collector. The tags are mostly generated by the prototype (4.1.2) and simply forwarded to the metric system. The following tags are associated with each latency metric.

• client_city

• client_continent

• client_country

• client_country_code

• latitude

• longitude

• used_webapi

• track_id

4.2.4 Feature toggling

To test and gradually activate the experiment in clients, and to control the absolute number of clients, the experiment is wrapped inside a feature toggling functionality capable of remotely activating the experiment in clients. The system was primarily used to observe the change in production before rolling it out to a larger number of clients. Controlling the number of requests to Compute@Edge was especially important, as the early beta system had not been utilized at that scale before. Ultimately, the feature toggling could have been crucial to stop the experiment in the event of failures.


4.3 Visualization

Another technical part of the project includes handling and exploring the data of the experiment (4.2). During the experiment, Grafana and the monitoring of Compute@Edge were used to get real-time metrics for verifying the status and to get a first grasp of the data (4.3.1). Eventually, the metrics were dumped into a database on Google Cloud, and BigQuery was used to query the data and create the charts used for the evaluation (4.3.2).

4.3.1 Monitoring

For the first visualization of the metrics, Grafana and the integrated monitoring of Compute@Edge were used. Grafana was valuable during the experiment, as it verified that all parts of the system worked as expected. However, the time-based visualization of these monitoring tools is not ideal for the complete evaluation, as the point in time at which a request is executed is not important for gathering data on worldwide latency. Therefore, another visualization technique was necessary to present the data in a better way.

4.3.2 BigQuery

As the metrics ended up in a Google Cloud database, BigQuery could easily be used to process the data and create more insightful visualizations.

One focus of the visualization was grouping the latency data points by country to calculate different percentiles and further values. Listing 4.7 shows one exemplary query.

Listing 4.7: One exemplary SQL query to prepare the data for its visualization.

SELECT
  message.tags[ORDINAL(7)].value AS country,
  AVG(message.value) AS avg_latency,
  STDDEV(message.value) AS stddev,
  APPROX_QUANTILES(message.value, 100)[OFFSET(90)] AS perc_90,
  APPROX_QUANTILES(message.value, 100)[OFFSET(95)] AS perc_95,
  APPROX_QUANTILES(message.value, 100)[OFFSET(99)] AS perc_99,
  APPROX_QUANTILES(message.value, 100)[OFFSET(50)] AS median,
  COUNT(*) AS count
FROM
  `Metrics-Database`
WHERE
  message.component_id = "desktop-web-player"
  AND message.what = "latency-prototype"
GROUP BY
  message.tags[ORDINAL(7)].value

As the database contains more than just the metrics from this experiment, the query first filters for the specific component_id and kind of message. In this case, the query takes only the metrics from the Compute@Edge prototype by specifying message.what as latency-prototype. These data points are then grouped by their country, which is the seventh element of an array of tags set in the experiment (4.2.3). The latency values for each country were aggregated in different ways:

• average

• median(p50)

• p90

• p95

• p99

• stddev
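As a hedged sketch, not part of the actual pipeline (which used BigQuery), the aggregations above can be reproduced over a vector of latency samples as follows. Nearest-rank percentiles are used here, so results may differ slightly from APPROX_QUANTILES:

```rust
// Sketch of the aggregations listed above, computed over latency samples
// in milliseconds (nearest-rank percentile; an illustration only).
fn percentile(sorted: &[f64], p: usize) -> f64 {
    let idx = (p * (sorted.len() - 1)) / 100;
    sorted[idx]
}

fn main() {
    let mut samples = vec![40.0, 60.0, 80.0, 100.0, 120.0, 500.0, 1200.0, 3000.0];
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());

    let n = samples.len() as f64;
    let average = samples.iter().sum::<f64>() / n;
    // Population standard deviation (STDDEV in BigQuery uses the sample
    // variant; the difference does not matter for this sketch).
    let variance = samples.iter().map(|x| (x - average).powi(2)).sum::<f64>() / n;
    let stddev = variance.sqrt();

    println!("average: {:.2} ms", average);
    println!("median (p50): {:.2} ms", percentile(&samples, 50));
    println!("p90: {:.2} ms", percentile(&samples, 90));
    println!("p95: {:.2} ms", percentile(&samples, 95));
    println!("p99: {:.2} ms", percentile(&samples, 99));
    println!("stddev: {:.2} ms", stddev);
}
```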

As the tags contain more than just the country, the collected data also allows for other groupings, such as by city or by latitude and longitude pair. However, grouping by country was found to be the most insightful for this project.

The query outcome is imported into a Google Sheet, which offers the ability to color the different values depending on the aggregated latency value (5.1).

Chapter 5

Results

The main argument for shifting systems handling latency-sensitive requests from cloud data centers to the edge is to reduce the network latency to ultimately increase performance and retention of client applications.

Compute@Edge is a service that enables edge computation for many use cases. This section first shows that the worldwide network latency of Compute@Edge is significantly lower than that of a similar system running in three cloud data centers across the world. It then discusses whether Compute@Edge is sufficiently versatile (5.2) and feasible (5.3) to shift systems to the edge and thereby support decisions for Compute@Edge.

5.1 Latency

The experiment collected client network latencies for two different requests from all over the world. One of the requests was handled by the prototype, which runs on Compute@Edge and represents edge computation. The second request was handled by Spotify Web API, representing a cloud computing setup with three data centers in the world. In total, the experiment was executed almost 26 million times, which corresponds to the number of collected latency data points for each of those two requests.

The collected data shows that edge computation has a significantly lower latency on average as well as in different percentiles (2.5.1). Table 5.1.1 shows the various global latency aggregations, along with the absolute difference by which Compute@Edge was faster than the Spotify Web API and the corresponding percentage of improvement. Section 5.1.3 explains the processed values in detail.

                Compute@Edge (ms)   Spotify Web API (ms)   x_diff (ms)   x_impr (%)
average                217.45                365.99            148.54           41
median (p50)            76.14                152.98             76.84           50
p90                    324.37                586.74            262.37           45
p95                    700.46              1,201.07            500.61           42
p99                  2,781.62              4,541.02          1,759.41           39
stddev                 861.22              1,200.49            339.27           28

Table 5.1.1: Global latency comparison including absolute and relative improvements. x_diff and x_impr are defined in Section 5.1.3.

First, the following sections show charts that visualize the data from the experiment individually for Compute@Edge (5.1.1) and the Spotify Web API (5.1.2).

5.1.1 Compute@Edge

This section will show the latencies that the experiment recorded by sending requests to the prototype running on Compute@Edge.

With almost 26 million requests the prototype turned out to have the following global latency values:

• average: 217.45 ms

• median(p50): 76.14 ms

• p90: 324.37 ms

• p95: 700.46 ms

• p99: 2,781.62 ms

• stddev: 861.22 ms

It is interesting to note the quite significant difference between the median and the average latencies. 50% of all worldwide latencies are below 76.14 ms. However, the standard deviation and the higher percentiles show that a significant number of requests took multiple times longer than the median latency. 5% of the requests took more than 700 ms, and a notable percentage of requests even lasted multiple seconds. Some of the long latencies might be partially explained by the way people use the web-player (3.2.3), which runs on desktop computers only. Presumably, most users are connected to routers and have a stable connection. However, some percentage of the users probably use the mobile network or even tether the connection through another mobile device. In those cases, the connection of the client to the first internet-connected router is usually unstable and slower, and switching to closer data centers probably does not improve the latency, as those parts of the connection cannot be impacted by the location of the data center (cf. 2.5.1). Additionally, parts of the world do not have good connections ahead of the ISP, and therefore the network latency is expected to be higher in some countries even if data centers are close.

The following charts show the latencies grouped by the country of clients (4.3) to analyze these differences.

The charts were made by grouping the origins of the requests by country and calculating the different aggregations. Furthermore, the minimum number of data points for a country was set to 20, to filter out inaccurate results.
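The grouping and filtering step described above can be sketched as follows. This is an illustration only, assuming a simple list of (country, latency) samples rather than the project's actual BigQuery pipeline:

```rust
use std::collections::HashMap;

// Sketch (not the actual evaluation code) of the per-country grouping with
// the minimum sample threshold of 20 data points described above.
fn countries_with_enough_data(
    samples: &[(&str, f64)],
    min_points: usize,
) -> HashMap<String, Vec<f64>> {
    let mut grouped: HashMap<String, Vec<f64>> = HashMap::new();
    for (country, latency) in samples {
        grouped.entry((*country).to_string()).or_default().push(*latency);
    }
    // Drop countries below the threshold to filter out inaccurate results.
    grouped.retain(|_, latencies| latencies.len() >= min_points);
    grouped
}

fn main() {
    let mut samples = vec![("FI", 34.7); 25];
    samples.push(("XX", 900.0)); // only one data point: filtered out
    let grouped = countries_with_enough_data(&samples, 20);
    assert!(grouped.contains_key("FI"));
    assert!(!grouped.contains_key("XX"));
    println!("kept {} countries", grouped.len());
}
```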

Average

Figure 5.1.1: The average Latency to Compute@Edge in ms for each country.

The average Compute@Edge latencies per country in Chart 5.1.1 are displayed on a color scale from dark green (0 milliseconds) to dark red (1500 milliseconds). It is evident that green is the dominant color of this chart, with Finland having the lowest value, an average of 34.7 milliseconds, and only a few countries like Myanmar and French Guiana showing latencies above 1000 ms. There seem to be areas around north and central Africa as well as south-east Asia where countries have worse latencies than the rest of the world. This inequality partly corresponds to the world map of Fastly's PoPs, which are seemingly underrepresented in those areas [16].

99th Latency Percentile

Chart 5.1.2 shows a similar visualization but instead of the average it shows the 99th Percentile of each country.

Figure 5.1.2: The 99th Latency Percentile to Compute@Edge for each country.

It is important to note that the color scale for this chart goes up to 7000 milliseconds. As expected, the P99 values are considerably higher than the averages. The dominant color is yellow with a tendency towards green, corresponding to a P99 value of around 2500 ms; for example, Brazil has 2539 ms and Spain 2527 ms. Finland has the lowest P99 latency with 333 ms. China and Venezuela show noticeably high latencies of more than 7000 ms. Additionally, both countries have a rather high standard deviation despite having multiple thousands of data points. Those countries might therefore have a very varying quality of connections, presumably due to well-connected regions alongside rural areas with worse service and longer network distances.

Median

As average latencies can be misleading, especially in cases of a high standard deviation, Chart 5.1.3 shows the median, or 50th percentile.

Figure 5.1.3: The 50th Latency Percentile (Median) to Compute@Edge for each country.

The median of all collected data is much lower than the average. This phenomenon is also visible for the grouped countries: the charts show that the median of most countries is below the average. Finland again has the lowest median latency with 12.84 ms; New Zealand, for example, has a median of 36 ms. The median of China is significantly lower than its average, which corresponds to the high standard deviation. The highest median values are recorded in Myanmar and Laos with around 1000 ms. However, some neighbouring countries, like Thailand and Vietnam, have a median network latency below 100 ms, which suggests that poor infrastructure within those countries is the reason for the high latencies. The currently closest running Fastly PoPs are in Hong Kong, Chennai and Singapore. Those latencies might even be a reason why Fastly is currently constructing a closer PoP in Kolkata [16].


Appendix B shows further country charts with more percentiles and the standard deviation.

5.1.2 Spotify Web API

This section shows the latencies recorded by sending fake requests to the Spotify Web API (3.2.1).

With almost 26 million requests the Spotify Web API turned out to have the following global latency values:

• average: 365.99 ms

• median(p50): 152.98 ms

• p90: 586.74 ms

• p95: 1,201.07 ms

• p99: 4,541.02 ms

• stddev: 1,200.49 ms

First of all, the global latencies of the Spotify Web API are all higher than those of Compute@Edge (5.1.1). The following section (5.1.3) compares the outcomes in more detail.

The global median of the Spotify Web API is also less than half of its global average. Generally, the interrelations of the different global latencies are similar to those of the prototype, but higher.

The following sections will show the latencies grouped by countries on the world map.

Average

The dominant color of Chart 5.1.4 is not easy to distinguish. Some of the biggest countries in the world like Russia and Canada are green but it seems that many countries especially in south Asia and Africa have a tendency towards red.

Figure 5.1.4: The average Latency of clients to Spotify Web API for each country.

Interestingly, Finland is also the country with the lowest latency for the Spotify Web API, with an average of 164.12 ms. On the other side, there are more than ten dark red countries with an average latency above 1500 ms. Some unexpected countries with >1500 ms are Denmark and Norway, which should have enough data points to be accurate. The others are, e.g., Zimbabwe, Senegal and North Korea, which are expectedly underperforming and have only around 100 data points, probably leading to more inaccurate outcomes.

Norway, Denmark and Sweden (837 ms) are especially noticeable because they are close to the best-performing Finland and surrounded by countries with comparatively good latencies1.

99th Latency Percentile

Consistent with the global P99 latency of the Spotify Web API of about 4541 ms, the dominant color of the P99 country Chart 5.1.5 is much more red than green. That is expected, considering that the chosen upper limit of the P99 charts is 7000 ms.

It is noticeable that there are still some countries with quite low P99 latencies. Some, like Tanzania and Nepal, have to be considered with caution, as their number of data points is relatively low and the deviation from the actual P99 can be significant. However, with Finland and Poland, there are also countries with multiple tens of thousands of data points that should ensure an accurate outcome.

1The Infrastructure Team of Spotify started investigating the reason for degraded latencies in those countries right after the outcome was visible.

Figure 5.1.5: The 99th Latency Percentile to Spotify Web API for each country.

Other country charts with different percentiles, including median and the standard deviation per country, can be found in Appendix B.

5.1.3 Comparison

To further analyze the latencies and compare both systems, the same calculations and charts were made based on the latency differences. The latency differences are calculated by subtracting the respective latency of the prototype from the latency of the Spotify Web API. For example, the global average difference was calculated as follows:

x_diff = x_webapi − x_c@e = 365.99 − 217.45 = 148.54

Furthermore, the degree of improvement, i.e., the percentage by which Compute@Edge is better than the Spotify Web API, is calculated by dividing the difference by the Spotify Web API value:

x_impr = (x_diff / x_webapi) · 100 = (148.54 / 365.99) · 100 ≈ 41%
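As a small, hedged sanity check of the two formulas (not part of the original evaluation code), the calculation can be reproduced in Rust with the global averages from Table 5.1.1:

```rust
// Reproduces the difference and improvement metrics defined above,
// using the global average latencies from Table 5.1.1.
fn diff(webapi: f64, cae: f64) -> f64 {
    webapi - cae
}

fn improvement_percent(webapi: f64, cae: f64) -> f64 {
    diff(webapi, cae) / webapi * 100.0
}

fn main() {
    let x_webapi = 365.99; // global average, Spotify Web API (ms)
    let x_cae = 217.45;    // global average, Compute@Edge (ms)

    let x_diff = diff(x_webapi, x_cae);
    let x_impr = improvement_percent(x_webapi, x_cae);

    assert!((x_diff - 148.54).abs() < 1e-9);
    assert_eq!(x_impr.round(), 41.0);
    println!("x_diff = {:.2} ms, x_impr ≈ {:.0}%", x_diff, x_impr);
}
```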

Therefore, any positive latency value or percentage indicates that the prototype has a lower latency than the Spotify Web API, and vice versa.

The latencies to the edge computing infrastructure outperform those to the cloud computing infrastructure in all respects and by a significant margin (5.1.1).

In all global latency aggregations except the standard deviation, which is a somewhat different measure, the Compute@Edge latencies are more than 38% lower than those of the Spotify Web API. Furthermore, the median latency of Compute@Edge is only half that of the Spotify Web API. These numbers show that by implementing the track metadata request on Compute@Edge, a large part of the worldwide latency, as well as the number and length of long-lasting requests, can be decreased significantly.

The results also show that the higher latency percentiles of both systems are far above 100 ms, which means that response time limits are hard to achieve for a significant share of clients (2.5.3). The reason for some very long requests is presumably the infrastructure before and after the network transit, which is not directly impacted by the location of data centers. A client connection that needs 1100 ms to reach a cloud data center because it takes 1000 ms to reach the ISP is only insignificantly improved by edge data centers, as the transit over the distance to the cloud data center takes less than a tenth of the total time. The problem in this example is the connection to the ISP, which cannot directly be impacted by the location of data centers (2.5.1).

Country charts

The following charts show the differences on a per-country basis to get a more detailed idea of disparities in various regions.

First of all, the color scale of the chart ranges from red, for negative values indicating that the latencies to the Spotify Web API are better, to green, for positive values in favour of C@E. The dominant color of all those charts is green; hence, the prototype has lower latencies in most countries. In large parts of the world, like North America and Europe, the average latency improved by 50 - 100 ms (5.1.6). Countries with higher latencies to the Spotify Web API, like Denmark and Norway, as well as South America, show differences of 250 ms and more.

Only a few countries have negative values and, therefore, better latencies to the Spotify Web API. Most of them, like Madagascar and Sri Lanka, are smaller countries with a low number of data points.


Figure 5.1.6: The average Latency difference between C@E (3.1.2) and Spotify Web API (3.2.1).

Figure 5.1.7: The 99th Latency Percentile difference between C@E (3.1.2) and Spotify Web API (3.2.1).


China is a noticeable country, with a highly negative difference in the 99th Latency Percentile (5.1.7) but a slightly positive average difference and a median difference of 161 ms. This indicates that Compute@Edge outperformed the Spotify Web API in the fastest 50% of the requests; however, the number of very slow requests is higher for the prototype.

It was shown that Compute@Edge, and edge computing in general, can result in a significant worldwide improvement of client network latency (5.1). As other works describe that lower latencies can have a high impact on performance and increase the retention of applications (2.5.2), this outcome implies that edge computing can improve the performance of globally distributed systems and the retention of applications. However, whether the latency improvement is worth the additional costs is more complex to answer and probably has to be decided on a case-by-case basis.

The remaining questions are whether the environment of Compute@Edge is versatile enough to implement arbitrary services comparable to the possibilities in cloud computing (5.2), and whether the development time and costs of Compute@Edge are worth the improvement (5.3).

5.2 Versatility

Implementing systems that handle latency-sensitive requests at the edge only pays off if there are few synchronous sub-requests to the cloud data centers, as the distance the requests have to travel is not reduced otherwise. To be valuable, an edge platform should be capable of handling requests without the cloud backend, at least in most cases. To rely on edge computing for latency-sensitive calls, it would be ideal if implementations from cloud data centers were possible at the edge as well. However, the spread across so many smaller data centers (3.1) and the dimension of necessary replication pose challenges in terms of computation (5.2.1) and storage (5.2.2), which the next sections discuss.

5.2.1 Computation

The consequence of many distributed data centers is that each of them does not serve as many requests as central cloud data centers. Therefore, edge data centers are usually smaller, with less hardware. Additionally, they do not provide the same range of various hardware components for different computational tasks. Hence, edge data centers are not ideal for jobs requiring heavier or specialized computational loads. Presumably, the time for computation might be slightly higher in edge data centers, so for computationally heavy tasks it makes more sense to use a computationally faster cloud data center.

WebAssembly and Compute@Edge help tackle these hardware shortcomings by executing services with little overhead (2.2). Given the good performance of Compute@Edge, small and medium computational loads now seem feasible at the edge.

Synchronizing state between all data centers is complex and time-consuming, as the data centers have to communicate with each other. Waiting for all data centers to exchange state removes the latency gains of having a data center close to the client. Therefore, the computations, or rather the WebAssembly binaries of Compute@Edge, cannot hold state. Furthermore, state is one of the challenges for databases, which require a shared state.

5.2.2 Storage

Many cloud systems require data objects. That is why the edge needs to support various data storage types to support versatile requirements.

Edge platforms, especially those with a CDN as an origin, usually already offer caching at their edges. Getting various kinds of content from within edge data centers is possible and already supported by Fastly. However, there are boundaries to what extent content can, and probably will, be possible to store (3.1.2).

However, those boundaries seem to be strong consistency and truly dynamic content, which are unnecessary for many systems. If, or rather when, edge platforms succeed in implementing eventually consistent storage solutions comparable to Cloud Bigtable or Apache Cassandra, a majority of cloud systems should theoretically be able to run at the edge. In combination with Compute@Edge as an enabling technology for fast and straightforward edge computing, the edge will eventually be versatile enough for the bulk of latency-sensitive systems. What is left is a practical, eventually consistent storage solution to exhaust the versatility Compute@Edge seems to offer.


5.3 Feasibility

An important factor in the adoption of technical solutions is their feasibility. The latency gain of edge computing is shown in Section 5.1, and Section 5.2 argues that Compute@Edge can cover a wide range of use cases. The remaining part of the equation is to show that the development and maintenance of Compute@Edge systems are worth the latency improvement. The major domains of expenses are the initial implementation of the system (5.3.1) and its maintenance (5.3.2), which are evaluated in the following sections. After that, possible risks are discussed in Section 5.3.3.

5.3.1 Implementation

The implementation of edge computing systems is usually costly in terms of development time. Most edge computing solutions require developers to learn new languages and concepts, which is expensive, raises the barrier to adoption and slows down learning. Compute@Edge currently supports Rust, a common language for software infrastructure, so for now developers need to be capable of implementing systems in Rust. However, the number of supported languages is expected to increase: the ByteCodeAlliance and Fastly are already exploring and investing in the next languages, and it is only a matter of time until Compute@Edge supports more of them.

Therefore, it is expected that WebAssembly and WASI will soon be supported by several common languages, and many developers will not need to learn a new programming language to get started with Compute@Edge. Compute@Edge presumably only requires engineers to adapt to its tooling, CLI and web interface, in addition to using Fastly's libraries for the chosen programming language. Hence, Compute@Edge can be picked up quickly, and cloud systems written in common programming languages might even be compilable to WebAssembly with some changes. Therefore, implementing new and replicating existing cloud systems with Compute@Edge should be less costly than with current edge solutions.


5.3.2 Maintenance

The maintenance, which includes running the Compute@Edge service, should generally be compared to the costs of maintaining similar services on cloud platforms. Several causes could lead to increased costs for edge systems.

The payment plans for Compute@Edge are still preliminary, but the general idea is that customers pay per request. Paying per request instead of per computing hardware could lead to both higher and lower costs. It is likely that edge systems cost slightly more, as edge data centers are smaller and not cost-optimized (2.1).

Furthermore, maintenance requires engineers to monitor the system and ensure its correct execution. Developers have to be trained to get used to the monitoring system of the edge platform. Compared to cloud infrastructure, which usually has a common way to do monitoring and, e.g., on-call, the new platform requires developers to handle a second, different platform.

All in all the maintenance costs are likely to increase slightly compared to cloud computing.

5.3.3 Risks

A significant risk of adopting a very immature product like Compute@Edge is vendor lock-in, meaning that an implementation is vendor-specific. In this case, a system built for one edge platform could not run on a competing edge platform, or would be complex to move to another vendor.

However, with WebAssembly (2.2) being the basis of Compute@Edge and portability being a central feature of WebAssembly (2.2.3), the problem should be minimal as long as other edge platforms adopt WebAssembly as well. And even if other edge platforms do not adopt WebAssembly, the system still is implemented in a common programming language, which should reduce risks in terms of vendor lock-in. In comparison Varnish Configuration Language (VCL), which is the first approach of Fastly to edge computation, is a configuration language that is improbable to be used by any other edge platform.

Furthermore, the trend shows that WebAssembly is being adopted more and more and looks like a good fit for edge platforms, which already partly support it. However, current competitors of Fastly are using WebAssembly in a different way, mostly because they run WebAssembly without WASI [10].

It is conceivable that Compute@Edge turns out not to work as expected: Fastly could stop active development, or WebAssembly could turn out not to be the best solution for edge computation. Even in this case, it is still valuable to have a system implemented in a common programming language that could run similarly in cloud data centers or anywhere else.

However, Fastly currently has a roadmap for the years to come and is backed by the Bytecode Alliance, which includes multiple leading tech companies investing in WebAssembly. So the path towards a great ecosystem around WebAssembly and WASI looks promising.

Chapter 6

Conclusions

This work implemented an edge service that can be deployed to all Fastly data centers at once via Compute@Edge, with WebAssembly as the enabling technology. This new edge computing approach, which is designed to run code safely and fast, was primarily tested to gather latency insights and to compare it to a common cloud infrastructure. The outcome shows that edge computing can significantly improve the worldwide latency of client applications and that WebAssembly seems to be a promising technology for edge computation.

The results of the experiment show that network edge computing is capable of reducing network latency by at least 38% compared to cloud computing. Additionally, it is shown that location, distance to clients and the nature of data centers are essential to reducing the network latency of client applications.

The latencies to Fastly's edge computation platform were significantly lower than the latencies to three distributed cloud data centers, on average as well as in all percentiles. Furthermore, a detailed examination showed that latency improvements occurred everywhere globally, including in countries close to the cloud data centers. This shows the significance of edge data centers being close to network hubs. Especially in locations distant from the cloud data centers and in countries with presumably weaker network infrastructure, edge computation seems to tremendously outperform cloud computing. The experiment shows that adopting edge computing for latency-sensitive requests can lead to a fairer worldwide distribution of latencies.
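The per-country percentile comparisons behind these statements can be sketched with a nearest-rank percentile over latency samples. The samples below are made-up illustrations, not measurements from the experiment, and the nearest-rank definition is one common choice among several:

```rust
// Nearest-rank percentile over latency samples (milliseconds).
fn percentile(samples: &mut [u32], p: f64) -> u32 {
    samples.sort_unstable();
    // Nearest-rank definition: the value at rank ceil(p/100 * n), 1-indexed.
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    samples[rank.saturating_sub(1).min(samples.len() - 1)]
}

fn main() {
    // Illustrative latency samples only, not data from the experiment.
    let mut edge = vec![40, 55, 60, 70, 90, 120, 45, 50, 65, 80];
    let mut cloud = vec![90, 110, 130, 150, 200, 95, 105, 140, 160, 300];
    let (e50, c50) = (percentile(&mut edge, 50.0), percentile(&mut cloud, 50.0));
    // → median edge: 60 ms, median cloud: 130 ms
    println!("median edge: {} ms, median cloud: {} ms", e50, c50);
    println!("reduction: {:.0}%", 100.0 * (1.0 - e50 as f64 / c50 as f64));
}
```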

The latency improvements of edge computing enable better worldwide performance of applications currently degraded by backends running in cloud data centers. This performance can lead to higher user satisfaction and increase the retention of applications. Ultimately, the adoption of edge computation for latency-sensitive components can lift the overall performance of applications to stand out from competitors.

However, the adoption of edge computing platforms can be costly, especially in terms of development. Furthermore, the possibilities of the environment are limited, especially in terms of dynamic data. But this work showed that using WebAssembly and WASI for edge computation is a promising approach that seems to make edge computing powerful enough to replace many cloud services. WebAssembly can reduce the development costs for edge systems significantly by supporting common programming languages. It simplifies deployment, has strong security guarantees and is efficient. Additionally, its portability reduces risks like vendor lock-in. WebAssembly seems versatile enough to implement, at reasonable cost, many edge replacements or supporting modules for latency-sensitive cloud systems that currently slow down applications. However, edge computation cannot replace everything currently running in cloud data centers.

The tech stack around WebAssembly and especially WASI, such as Lucet and Compute@Edge, is still at a very early stage. Adopting it might be risky but very promising. All products using WebAssembly and WASI will presumably become more powerful through further contributions. With WebAssembly getting more and more attention, its tech stack is likely to grow more versatile over the upcoming years. But the early stage also carries the risk of declining attention to WebAssembly, which would result in fewer new features and a possible shutdown of services like Compute@Edge. However, that seems unlikely considering the companies that, next to Fastly, agreed on a roadmap for WebAssembly and WASI.

Compute@Edge is a promising implementation of WASI that already shows a lot of potential. Its usability and versatility will increase with more features and improvements to the underlying technology. For example, a growing number of supported programming languages will simplify adoption and lower development costs. Compute@Edge is heavily dependent on the success of WebAssembly. Still, the outlook seems very promising, as it is likely to lead to an edge computing solution that can eventually replace most cloud systems of global client applications.


6.1 Future Work

To further explore edge computing, WASI and Compute@Edge, there are various questions that could not be followed up within the scope of this work. The following sections introduce some ideas for analyzing those topics further.

6.1.1 Mobile Data

The client side of the experiment was implemented in the Spotify web-player, which has implications for the most frequent kind of internet connection of the clients. As the Spotify web-player is only distributed to non-mobile clients, it has to be assumed that only a small part of the requests used a mobile network. Mobile networks usually have longer latencies to the internet backbone, which cannot be affected directly by the kind of data center (2.5.1). For an even more accurate outcome, it would therefore be valuable to run an experiment including all kinds of devices, especially mobile devices, to get a holistic view of the latencies.

6.1.2 Cloudflare Workers

Cloudflare, a competitor of Fastly, also offers edge computing with WebAssembly, which looks similar at first [9]. However, Cloudflare currently runs WebAssembly without WASI, which requires running the WebAssembly binaries in a host environment. This especially has implications for the performance of the system. On the other hand, as WASI is currently immature, it could mean that the versatility of those systems is higher compared to Compute@Edge. It would be interesting to compare the systems, also with regard to the different use cases they are suitable for.

6.1.3 Proxy cloud systems with C@E

During the project, the prototype was implemented to first check for available metadata in the cache (4.1.2) and then fall back to the current production system, the Spotify Web API. A temporary idea was to use the caching of HTTP responses, which comes with Compute@Edge out of the box, to speed up current Spotify clients already in the course of the project. Compute@Edge could simply cache metadata responses from the Spotify Web API and implement only the necessary checks, such as release date and market availability, before sending those cached responses (4.1.2). The original system would not have to be replaced if Compute@Edge acts as an additional system to reach lower latencies. However, this only works for track metadata that is requested a lot and stays in the Compute@Edge cache long enough. It could also be tested to run the Compute@Edge system in parallel to the Spotify Web API and take the fastest response.
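The cache-first flow with a Web API fallback described here can be sketched with the backends replaced by plain closures, so only the control flow remains visible. The names and return values are illustrative, not the prototype's actual Fastly API calls:

```rust
// Hedged sketch of the cache-first flow: try the edge cache, fall back to
// the origin on a miss. Returns the body and whether the origin was used,
// mirroring the prototype's `used_webapi` flag.
fn get_metadata<C, O>(track_id: &str, cache: C, origin: O) -> (String, bool)
where
    C: Fn(&str) -> Option<String>,
    O: Fn(&str) -> String,
{
    match cache(track_id) {
        // Cache hit: serve directly from the edge, no origin round trip.
        Some(body) => (body, false),
        // Cache miss: fall back to the origin (the Spotify Web API in the prototype).
        None => (origin(track_id), true),
    }
}

fn main() {
    let cache = |id: &str| if id == "abc123" { Some(String::from("cached")) } else { None };
    let origin = |_: &str| String::from("from origin");
    println!("{:?}", get_metadata("abc123", &cache, &origin)); // ("cached", false)
    println!("{:?}", get_metadata("zzz", &cache, &origin));    // ("from origin", true)
}
```

The "race both and take the fastest response" variant would replace the sequential `match` with two concurrent requests, at the price of extra origin load on every request.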

6.1.4 Critical Point Analysis

Zuniga et al. [49] defined specific critical points for apps and categories of apps to quantify the effects of performance, and specifically latency, on retention. To tweak the recommended upper limits for response times (2.5), it would be meaningful for Spotify or any other app provider to conduct a similar study for the specific application to classify the importance of non-functional changes like the use of edge computing.

Bibliography

[1] Analysing Global CDN Performance — RIPE Labs. URL: https://labs.ripe.net/Members/emirb/analysing-global-cdn-performance (visited on 01/31/2020).

[2] Announcing Lucet: Fastly’s native WebAssembly compiler and runtime | Fastly. URL: https://www.fastly.com/blog/announcing-lucet-fastly-native-webassembly-compiler-runtime (visited on 10/27/2020).

[3] Apache Cassandra. URL: https://cassandra.apache.org/ (visited on 11/03/2020).

[4] Astasia Myers. “Fastly S-1 Analysis”. In: (2019). URL: https://medium.com/memory-leak/fastly-s-1-analysis-accelerating-the-edge-dfa9cd4c8bb7.

[5] Baldini, Ioana, Castro, Paul, et al. Serverless Computing: Current Trends and Open Problems. Tech. rep. arXiv: 1706.03178v1.

[6] Bytecode Alliance. URL: https://bytecodealliance.org/#what-is-the-bytecode-alliance (visited on 08/24/2020).

[7] Calculating Optical Fiber Latency. URL: https://www.m2optics.com/blog/bid/70587/calculating-optical-fiber-latency (visited on 07/09/2020).

[8] Choy, Sharon, Wong, Bernard, Simon, Gwendal, and Rosenberg, Catherine. “The Brewing Storm in Cloud Gaming: A Measurement Study on Cloud to End-User Latency”. In: NetGames 2012: The 11th ACM Annual Workshop on Network and Systems Support for Games (2012). DOI: 10.1109/NetGames.2012.6404024. URL: https://hal.archives-ouvertes.fr/hal-00786278.

[9] Cloudflare Workers®. URL: https://workers.cloudflare.com/ (visited on 11/05/2020).


[10] cloudflare/wrangler: wrangle your cloudflare workers. URL: https://github.com/cloudflare/wrangler (visited on 10/09/2020).

[11] Compute@Edge demo: See our new serverless compute environment at work | Fastly. URL: https://www.fastly.com/blog/edge-compute-demo (visited on 09/10/2020).

[12] Content and its delivery | Fastly Help Guides. URL: https://docs.fastly.com/en/guides/content-and-its-delivery (visited on 10/21/2020).

[13] ETSI - ETSI Blog - What is Edge? URL: https://www.etsi.org/newsroom/blogs/entry/what-is-edge (visited on 06/28/2020).

[14] Evaluating new languages for Compute@Edge | Fastly. URL: https://www.fastly.com/blog/evaluating-new-languages-for-edge-compute (visited on 09/10/2020).

[15] Far Edge vs. Near Edge in Edge Computing - Tech.in | 5G, SDN/NFV & MEC. URL: https://www.thetech.in/2019/06/far-edge-vs-near-edge-in-edge-computing.html (visited on 06/28/2020).

[16] Fastly network map | Fastly. URL: https://www.fastly.com/network-map (visited on 09/16/2020).

[17] fastly::error - Rust. URL: https://docs.rs/fastly/0.5.0/fastly/error/index.html (visited on 10/26/2020).

[18] fastly/fastly-template-rust-default: Default package template for Rust based Compute@Edge projects. URL: https://github.com/fastly/fastly-template-rust-default (visited on 09/21/2020).

[19] Features | Spotify for Developers. URL: https://developer.spotify.com/discover/#metadata (visited on 09/22/2020).

[20] Get a Track | Spotify for Developers. URL: https://developer.spotify.com/console/get-track/ (visited on 09/22/2020).

[21] Haas, Andreas, Rossberg, Andreas, Schuff, Derek L., Titzer, Ben L., Holman, Michael, Gohman, Dan, Wagner, Luke, Zakai, Alon, and Bastien, J. F. “Bringing the Web up to Speed with WebAssembly”. In: (2017). DOI: 10.1145/3062341.3062363. URL: http://dx.doi.org/10.1145/3062341.3062363.


[22] Join the beta: our new serverless compute environment gives you more power at the edge | Fastly. URL: https://www.fastly.com/blog/join-the-beta-new-serverless-compute-environment-at-the-edge (visited on 10/27/2020).

[23] log_fastly - Rust. URL: https://docs.rs/log-fastly/0.1.2/log_fastly/ (visited on 09/11/2020).

[24] Measure performance with the RAIL model. June 2020. URL: https://web.dev/rail/ (visited on 08/18/2020).

[25] Mell, Peter and Grance, Timothy. The NIST Definition of Cloud Computing: Recommendations of the National Institute of Standards and Technology. Tech. rep.

[26] Navigation Timing API - Web APIs | MDN. URL: https://developer.mozilla.org/en-US/docs/Web/API/Navigation_timing_API (visited on 09/17/2020).

[27] Overview of replication | Cloud Bigtable Documentation | Google Cloud. URL: https://cloud.google.com/bigtable/docs/replication-overview#consistency-model (visited on 11/03/2020).

[28] Peng, Gang. “CDN: Content Distribution Network”. In: (Nov. 2004). arXiv: cs/0411069. URL: http://arxiv.org/abs/cs/0411069.

[29] Percentiles. URL: https://cnx.org/contents/223y7Xzw@12/Percentiles.

[30] Response Time Limits: Article by Jakob Nielsen. URL: https://www.nngroup.com/articles/response-times-3-important-limits/ (visited on 08/04/2020).

[31] Roadmap - WebAssembly. URL: https://webassembly.org/roadmap/ (visited on 09/11/2020).

[32] Satyanarayanan, Mahadev. The Emergence of Edge Computing. Tech. rep.

[33] Security - WebAssembly. URL: https://webassembly.org/docs/security/ (visited on 06/21/2020).

[34] Setting up remote log streaming | Fastly Help Guides. URL: https://docs.fastly.com/en/guides/setting-up-remote-log-streaming (visited on 09/11/2020).


[35] Setup C@E. URL: https://developer.fastly.com/learning/compute (visited on 09/21/2020).

[36] Sigg, Stephan and Peltonen, Ella. Exploiting usage to predict instantaneous app popularity: Trend filters and retention rates. Tech. rep. 2017. URL: http://andrewchen.co/new-data-shows-why-losing-80-of-your-mobile-users-.

[37] Söderström, Gustav. A Brief History Of Spotify: Gustav Söderström - YouTube. Jan. 2019. URL: https://youtu.be/jTM7ZCKEUGM?t=635.

[38] Spotify MAUs worldwide 2019 | Statista. URL: https://www.statista.com/statistics/367739/spotify-global-mau/ (visited on 11/19/2020).

[39] Standardizing WASI: A system interface to run WebAssembly outside the web - Mozilla Hacks - the Web developer blog. URL: https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webassembly-system-interface/ (visited on 06/20/2020).

[40] State at the edge. URL: https://www.fastly.com/blog/state-at-the-edge.

[41] The App Attention Index 2019: The Era of the Digital Reflex | Application Performance Monitoring Blog | AppDynamics. URL: https://www.appdynamics.com/blog/news/app-attention-index-2019/ (visited on 07/21/2020).

[42] Using Compute@Edge | Fastly Developer Hub. URL: https://developer.fastly.com/learning/compute/using/ (visited on 09/16/2020).

[43] Varghese, Blesson, Wang, Nan, Barbhuiya, Sakil, Kilpatrick, Peter, and Nikolopoulos, Dimitrios S. Challenges and Opportunities in Edge Computing. Tech. rep. arXiv: 1609.01967v1. URL: http://www.telegraph.co.uk/technology/mobile-phones/11287659/Quarter-.

[44] WebAssembly. URL: https://webassembly.org/.

[45] What Edge Computing Means for Infrastructure and Operations Leaders - Smarter With Gartner. URL: https://www.gartner.com/smarterwithgartner/what-edge-computing-means-for-infrastructure-and-operations-leaders/ (visited on 06/26/2020).

[46] What is Edge Computing: The Network Edge Explained. URL: https://www.cloudwards.net/what-is-edge-computing/ (visited on 06/26/2020).


[47] Why having more POPs isn’t always better | Fastly. URL: https://www.fastly.com/blog/why-having-more-pops-isnt-always-better (visited on 09/16/2020).

[48] Zilberman, Noa, Grosvenor, Matthew, Popescu, Diana Andreea, Manihatty-Bojan, Neelakandan, Antichi, Gianni, Wójcik, Marcin, and Moore, Andrew W. Where Has My Time Gone? Tech. rep. URL: http://docs.libmemcached.org/bin/memaslap.html.

[49] Zuniga, Agustin, Flores, Huber, Lagerspetz, Eemil, Tarkoma, Sasu, Manner, Jukka, Hui, Pan, and Nurmi, Petteri. “Tortoise or Hare? Quantifying the Effects of Performance on Mobile App Retention”. In: (2019). DOI: 10.1145/3308558.3313428. URL: https://doi.org/10.1145/3308558.3313428.

Appendix - Contents

A Codebase of the prototype
  A.1 main.rs
  A.1.1 mod.rs
  A.1.2 parser.rs
  A.1.3 cache_handler.rs
  A.1.4 spotify_api.rs
  A.1.5 metadata_checker.rs

B Further Visualizations
  B.1 Prototype
  B.2 Spotify Web API
  B.3 Country Differences

Appendix A

Codebase of the prototype

This appendix shows the full Rust code of the prototype. Apart from secrets and the cache domain, this is the exact code that ran the final experiments with Compute@Edge (3.1.2). It was built and deployed with Fastly’s CLI.

A.1 main.rs

mod track_handler;
use fastly::http::{Method, StatusCode};
use fastly::{downstream_request, Body, Error, Request, Response, ResponseExt};
use log;
use log_fastly;
use std::convert::TryFrom;
use log_panics;

const VALID_METHODS: [Method; 2] = [Method::GET, Method::OPTIONS];

// Handle the downstream request from the client.
fn handle_request(req: Request<Body>) -> Result<Response<Body>, Error> {
    if !(VALID_METHODS.contains(req.method())) {
        log::warn!("Not supported HTTP Request method: {}", req.method());
        return Ok(Response::builder()
            .status(StatusCode::METHOD_NOT_ALLOWED)
            .body(Body::try_from("This method is not allowed")?)?);
    }

    // Pattern match on the request method and path.
    match (req.method(), req.uri().path()) {

        (&Method::OPTIONS, _) => Ok(Response::builder()
            .status(StatusCode::OK)
            .header("Access-Control-Allow-Origin", "*")
            .header("Access-Control-Allow-Methods", "GET, POST")
            .header("Access-Control-Allow-Headers",
                "accept, app-platform, authorization, content-Type,
                origin, retry-after, spotify-app-version")
            .body(Body::try_from("option")?)?
        ),

        (&Method::GET, "/health") => Ok(Response::builder()
            .status(StatusCode::OK)
            .body(Body::try_from("OK!")?)?),

        (&Method::GET, "/metadata") => Ok(Response::builder()
            .status(StatusCode::OK)
            .body(Body::try_from(format!("{:?}", req.headers()))?)?),

        (&Method::GET, path) if path.starts_with("/metadata/v1/track") => {
            let response = track_handler::handle(req.uri().path(),
                req.uri().query().unwrap_or("?"));

            return match response {
                Ok(r) => {
                    Ok(Response::builder()
                        .header("Access-Control-Allow-Origin", "*")
                        .header("Access-Control-Allow-Methods", "GET, POST")
                        .header("Timing-Allow-Origin", "https://open.spotify.com, *")
                        .header("used_webapi", r.1)
                        .header("content-type", "application/json")
                        .status(StatusCode::OK)
                        .body(Body::try_from(r.0)?)?)
                }
                Err(e) =>
                    Ok(Response::builder()
                        .status(e.status_code)
                        .body(Body::try_from(format!("Error in C@E: {:?}", e))?)?)
            }
        }

        // Catch all other requests and return a 404.
        _ => {
            log::warn!("Request path was not found: {}", req.uri().path());
            Ok(Response::builder()
                .status(StatusCode::NOT_FOUND)
                .body(Body::try_from("The page you requested could not be found")?)?)
        }
    }
}

fn setup_logging() {
    log_fastly::init_simple("GCS", log::LevelFilter::Info);
    log_panics::init();
    log::info!("Logger successfully started");
}

fn main() -> Result<(), Error> {
    setup_logging();
    let req = downstream_request();
    match handle_request(req) {
        Ok(resp) => resp.send_downstream(),
        Err(e) => {
            let mut resp = Response::new(e.to_string());
            *resp.status_mut() = StatusCode::INTERNAL_SERVER_ERROR;
            resp.send_downstream();
        }
    }
    Ok(())
}

A.1.1 mod.rs
track_handler/mod.rs

This is the module which handles all the different parts of a track metadata request. It uses many functions declared in the following code files.


mod spotify_api;
mod error;
mod cache_handler;
mod parser;
mod metadata_checker;
use log;

pub fn handle(track_id: &str, query: &str) -> Result<(String, String), error::TrackError> {

    let track_id = parser::track_id(String::from(track_id));
    let query_parameters = parser::query_parameters(query);
    log::info!("Track ID: {} and query parameters received: {:?}", track_id, query);

    // check if metadata is cached and get it
    let cache_metadata = cache_handler::get_cache_for_track_id(&track_id);

    let (metadata, used_webapi) = match cache_metadata {
        Ok(m) => (m, false),

        // if cache is empty, get it from webapi
        Err(error::TrackError{error: error::Error::NotCached(_), status_code: _})
            => (spotify_api::get_metadata_for_trackid(&track_id)?, true),

        Err(e) => {
            log::error!("Error fetching cached metadata: {:?}", e);
            return Err(e)
        },
    };

    let mut metadata = parser::metadata(&metadata)?;

    // return Error if the track's release date is in the future
    if !metadata_checker::is_released(&metadata)? {
        return Err(error::TrackError{error: error::Error::Custom(
            String::from("Track not released")), status_code: 404});
    }

    let market = parser::get_specific_query("market", query_parameters);
    // Handle the market only if specified, continue if no market in query
    if market.is_ok() {
        let market = market.unwrap();
        if market.trim() != "from_token" {
            metadata = metadata_checker::handle_market(metadata, &market)?;
        }
    }

    let metrics = parser::get_metric_data(used_webapi)?;

    let body = parser::compose_body(metadata, metrics);

    log::info!("Successfully got metadata (with webapi? {}) and checked it", used_webapi);
    Ok((format!("{}", body), used_webapi.to_string()))
}

A.1.2 parser.rs
track_handler/parser.rs


use super::error::{TrackError, Error};
use querystring;
use serde_json::value::Value;
use serde_json;
use serde_json::json;
use fastly::geo::{geo_lookup, Geo};

pub fn track_id(path: String) -> String {
    let values: Vec<&str> = path.split('/').collect();
    return String::from(values[4]);
}

pub fn query_parameters<'a>(query: &'a str) -> Vec<(&'a str, &'a str)> {
    let mut qs = querystring::querify(query);

    qs.sort_by(|a, b| a.0.cmp(b.0));

    return qs;
}

pub fn get_specific_query(query: &str, qs: Vec<(&str, &str)>) -> Result<String, TrackError> {
    for q in qs {
        if q.0 == query {
            return Ok(String::from(q.1));
        }
    }
    return Err(TrackError{error: Error::Custom(
        String::from(format!("necessary query parameter not found in request: {}", query))),
        status_code: 404});
}

pub fn metadata(metadata: &str) -> Result<Value, TrackError> {
    let v: Value = serde_json::from_str(metadata)?;
    return Ok(v);
}

pub fn get_metric_data(used_webapi: bool) -> Result<Value, TrackError> {

    let geo = get_geo_locations();

    let metrics: Value = json!({
        "client_country": geo.country_name(),
        "client_country_code": geo.country_code(),
        "client_city": geo.city(),
        "client_continent": geo.continent(),
        "longtitude": geo.longitude().to_string(),
        "latitude": geo.latitude().to_string(),
        "used_webapi": used_webapi.to_string(),
    });

    return Ok(metrics)
}

pub fn compose_body(metadata: Value, metrics: Value) -> Value {

    let body: Value = json!({
        "metrics": metrics,
        "metadata": metadata,
    });

    body
}

fn get_geo_locations() -> Geo {

    let client_ip = fastly::downstream_client_ip_addr().unwrap();
    let geo = geo_lookup(client_ip).unwrap();
    return geo;
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_track_id() {
        assert_eq!(
            track_id(String::from(
                "test.edgecompute.edge/metadata/4/track/abc123"
            )),
            String::from("abc123")
        );
    }

    #[test]
    fn test_parse_query_parameters() {
        assert_eq!(
            query_parameters("foo=bar&bla=blub"),
            vec![("bla", "blub"), ("foo", "bar")]
        );
    }

    #[test]
    fn test_get_specific_query() {
        let qs = query_parameters("foo=bar&bla=blub");
        assert_eq!(
            get_specific_query("bla", qs).unwrap(),
            "blub"
        )
    }
}

A.1.3 cache_handler.rs

track_handler/cache_handler.rs

use super::error::{TrackError, Error};
use fastly::backend::Backend;
use fastly::Request;

const CACHE_DOMAIN: &str = "https://metadata-at-edge.cachedomain.com/";
const CACHE_BACKEND_NAME: &str = "metadata-cache";

pub fn get_cache_for_track_id(track_id: &String) -> Result<String, TrackError> {
    let uri = format!("{}{}", CACHE_DOMAIN, track_id);

    let cache_backend = Backend::from_name(CACHE_BACKEND_NAME)?;

    // build request
    let bereq = Request::get(uri).body(())?;

    // send it to backend
    let beresp = cache_backend.send(bereq)?;
    let body: String = beresp.into_body().into_string();

    if body.is_empty() {
        return Err(TrackError{error: Error::NotCached(
            track_id.to_string()), status_code: 404});
    }

    return Ok(body);
}


A.1.4 spotify_api.rs
track_handler/spotify_api.rs

use super::error::{TrackError, Error};
use serde_json::value::Value;
use serde_json;
use fastly::backend::Backend;
use fastly::Request;
use fastly::http::StatusCode;

const WEBAPI_SECRET: &str = "secret=";
const WEBAPI_AUTH_URI: &str = "https://accounts.spotify.com/api/token";
const WEBAPI_AUTH_BACKEND_NAME: &str = "webapi-auth"; // https://accounts.spotify.com
const WEBAPI_BACKEND_NAME: &str = "webapi"; // https://api.spotify.com
const WEBAPI_TRACK_URI: &str = "https://api.spotify.com/v1/tracks/";

pub fn get_metadata_for_trackid(track_id: &String) -> Result<String, TrackError> {
    let access_token = get_access_token()?;
    return get_metadata(track_id, access_token);
}

// https://developer.spotify.com/documentation/general/guides/authorization-guide
fn get_access_token() -> Result<String, TrackError> {
    let auth_head = format!("Basic {}", WEBAPI_SECRET);
    let auth_backend = Backend::from_name(WEBAPI_AUTH_BACKEND_NAME)?;

    let bereq = Request::builder()
        .method("POST")
        .uri(WEBAPI_AUTH_URI)
        .header("Authorization", auth_head)
        .header("Content-Type", "application/x-www-form-urlencoded")
        .body("grant_type=client_credentials")?;

    let beresp = auth_backend.send(bereq)?;
    let body: String = beresp.into_body().into_string();

    let v: Value = serde_json::from_str(&body)?;

    let access_token = v["access_token"].as_str().ok_or(
        TrackError{error: Error::Custom(
            String::from("Access Token not found in response")), status_code: 404})?;

    return Ok(String::from(access_token));
}

fn get_metadata(track_id: &String, access_token: String) -> Result<String, TrackError> {

    let uri = format!("{}{}", WEBAPI_TRACK_URI, track_id);
    let auth_head = format!("Bearer {}", access_token);
    let api_backend = Backend::from_name(WEBAPI_BACKEND_NAME)?;

    let bereq = Request::builder()
        .method("GET")
        .uri(uri)
        .header("Authorization", auth_head)
        .body(())?;

    let beresp = api_backend.send(bereq)?;
    let status = beresp.status();
    let body: String = beresp.into_body().into_string();
    if body.is_empty() {
        return Err(TrackError{error: Error::Custom(
            format!("Track ID does not exist: {}", track_id)), status_code: 404});
    } else if status != StatusCode::OK {
        log::error!("Error in WebAPI response: {}", body);
        return Err(TrackError{error: Error::Custom(
            format!("Error in WebAPI response: {}", body)), status_code: status.as_u16()});
    }

    return Ok(body);
}

A.1.5 metadata_checker.rs
track_handler/metadata_checker.rs

use super::error::{TrackError, Error};

use serde_json::value::Value;
use serde_json;
use serde_json::json;
use chrono::{Date, NaiveDate, Utc};

pub fn is_released(metadata: &Value) -> Result<bool, TrackError> {

    let release_date_precision = metadata["album"]["release_date_precision"]
        .as_str().unwrap_or("day");
    let release_date = metadata["album"]["release_date"].as_str()
        .ok_or(TrackError{error: Error::Custom(
            String::from("Release date not found in metadata")), status_code: 404})?;

    match release_date_precision {
        "day" => is_released_day(release_date),
        _ => {
            log::error!("Unknown release date precision key in metadata,
                trying to handle with day: {:?}", release_date_precision);
            is_released_day(release_date)
        }
    }
}

fn is_released_day(release_date: &str) -> Result<bool, TrackError> {

    let release_date: NaiveDate = NaiveDate::parse_from_str(release_date, "%Y-%m-%d")?;
    let release_date: Date<Utc> = Date::from_utc(release_date, Utc);

    // TODO use timezone of Market and not of serving instance
    let today: Date<Utc> = Date::from_utc(chrono::offset::Local::today().naive_utc(), Utc);
    if today >= release_date {
        return Ok(true);
    }
    log::error!("Release date is in the future: {:?}", release_date);
    return Err(TrackError{error: Error::Custom(String::from("track is not released yet")),
        status_code: 404});
}

// Checks if market appears in "available_markets" of both album and track and adjusts metadata
pub fn handle_market<'a>(mut metadata: Value, market: &str) -> Result<Value, TrackError> {

    // check if market is in "available_markets" of track
    if is_available_in_market(&metadata["available_markets"], market)? &&
        is_available_in_market(&metadata["album"]["available_markets"], market)? {
        // delete available_markets in album and track
        metadata["available_markets"].take();
        metadata["album"]["available_markets"].take();

        // create "is_playable": true
        metadata["is_playable"] = json!("true");
        return Ok(metadata)
    }
    return Err(TrackError{error: Error::Custom(
        format!("track is not available in market: {:?}", market)), status_code: 404});
}

fn is_available_in_market(available_markets: &Value, market: &str) -> Result<bool, TrackError> {
    let available_markets = available_markets.as_array();
    if available_markets == None {
        return Ok(true);
    }
    let available_markets = available_markets.unwrap();
    for a in available_markets {
        if a.as_str().unwrap() == market {
            return Ok(true);
        }
    }
    Ok(false)
}

// Remember that unsuccessful tests result in really ugly output with
// $ cargo wasi test -- --nocapture
// https://bytecodealliance.github.io/cargo-wasi/testing.html
#[cfg(test)]
mod tests {
    use super::*;
    use serde_json::json;

    #[test]
    fn test_is_released_day_no_date() {
        let metadata: Value = json!({ "wrong": "data" });
        assert!(is_released(&metadata).is_err())
    }

    #[test]
    fn test_is_released_day_false() {
        let metadata: Value = json!({
            "album": {
                "release_date": "2100-09-26",
                "release_date_precision": "day"
            }
        });
        assert!(is_released(&metadata).is_err())
    }

    #[test]
    fn test_is_released_day() {
        let metadata: Value = json!({
            "album": {
                "release_date": "2000-09-26",
                "release_date_precision": "day"
            }
        });
        assert_eq!(is_released(&metadata).unwrap(), true);
    }

    #[test]
    fn test_is_available_in_market() {
        let metadata: Value = json!({
            "album": {
                "available_markets": ["AD", "AE", "AR", "BG"]
            },
            "available_markets": ["AD", "AE", "AR", "BG"]
        });
        let metadata = handle_market(metadata, "AD").unwrap();
        assert_eq!(metadata,
            json!({
                "album": {
                    "available_markets": null,
                },
                "available_markets": null,
                "is_playable": "true"
            })
        );
    }

    #[test]
    fn test_is_available_in_market_no_markets() {
        let metadata: Value = json!({
            "album": {}
        });
        let metadata = handle_market(metadata, "AD").unwrap();
        assert_eq!(metadata,
            json!({
                "album": {
                    "available_markets": null,
                },
                "available_markets": null,
                "is_playable": "true"
            })
        );
    }
}

Appendix B

Further Visualizations

B.1 Prototype

The following charts show further visualizations of the experiment with the prototype (4.1), with the values being the 90th percentile (B.1.1), the 95th percentile (B.1.2) and the standard deviation (B.1.3).

Figure B.1.1: The 90th Latency Percentile for each country.


Figure B.1.2: The 95th Latency Percentile for each country.

Figure B.1.3: The Standard Deviation for each country.


B.2 Spotify Web API

The following charts show further visualizations of the experiment with the Spotify Web API (5.1.2), with the values being the 50th percentile (B.2.1), the 90th percentile (B.2.2), the 95th percentile (B.2.3) and the standard deviation (B.2.4).

Figure B.2.1: The 50th Latency Percentile (median) for each country.

B.3 Country Differences

The following charts show further visualizations of the differences between both endpoints, with the values being the 50th percentile (B.3.1), the 90th percentile (B.3.2), the 95th percentile (B.3.3) and the standard deviation (B.3.4).


Figure B.2.2: The 90th Latency Percentile for each country.

Figure B.2.3: The 95th Latency Percentile for each country.


Figure B.2.4: The Standard Deviation for each country.

Figure B.3.1: The median Latency difference between C@E (3.1.2) and Spotify Web API (3.2.1).


Figure B.3.2: The 90th percentile Latency difference between C@E (3.1.2) and Spotify Web API (3.2.1).

Figure B.3.3: The 95th percentile Latency difference between C@E (3.1.2) and Spotify Web API (3.2.1).


Figure B.3.4: The Standard Deviation difference between C@E (3.1.2) and Spotify Web API (3.2.1).

TRITA-EECS-EX-2020:888

www.kth.se