SledgeEDF: Deadline-driven Serverless for the Edge

by Sean Patrick McBride

B.S. in German, May 2007, United States Military Academy
M.A.S. in Information Technology and Management, May 2013, Illinois Institute of Technology

A Thesis submitted to

The Faculty of The School of Engineering and Applied Science of The George Washington University in partial satisfaction of the requirements for the degree of Master of Science

January 8, 2021

Thesis directed by

Gabriel Parmer, Associate Professor of Computer Science

© Copyright 2021 by Sean Patrick McBride. All rights reserved.

Dedication

This thesis is dedicated to the many educators, mentors, and battle buddies that have helped me grow as a technologist, software engineer, and computer scientist. Sankaran Iyer, Susan Schwartz, Chris Okasaki, Christa Chewar, Ray Trygstad, Jeremy Hajek, Jeffrey Kimont, Robert Hendry, Carol Davids, Bill Slater, Bonnie Goins, David Gaertner, Andy Quintana, Patricia Schatz, Wayne Bucek, Pat Medo, Lih Wang, Tony Liu, Bill Seubert, Marty Horan, Fred Bader, Mitch Green, Bob Kaas, Richard Lewis, Gwen Dente, Ray Mullins, Frank DeGilio, Paul Novak, Bruce Hayden, Art Breslau, Chris Ganim, Mark Woehrer, Will Dory, Steve Payne, Walt Melo, Mark Davis, Omri Bernstein, Eliot Szwajkowski, Dani Young-Smith, Conrad Holloman, David Tillery, Garth Hershfield, Daniel Cox, Doug Fort, Jeff Hemminger, Josh Rutherford, Hiromi Suenaga, Kait Moreno, Howie Huang, Ben Bowman, Yuede Ji, Pradeep Kumar, Nahid Ghalaty, Roozbeh Haghnazar, Morris Lancaster, Gabe Parmer, Phani Kishoreg, and the unnamed others I’ve forgotten. I am mostly an ambulatory accumulation of the investments others have made in me over the years. I hope that you consider me a worthy investment, and I pledge to pay this forward!

Abstract

SledgeEDF: Deadline-driven Serverless for the Edge

Serverless Computing has gained mass popularity by offering lower cost, improved elasticity, and improved ease of use. Driven by the need for efficient low latency computation on resource-constrained infrastructure, it is also becoming a common execution model for edge computing. However, hyperscale cloud mitigations against the serverless cold start problem do not cleanly scale down to tiny 10-100kW edge sites, causing edge deployments of existing VM and container-based serverless runtimes to suffer poor tail latency (the slowest latency in the distribution, typically expressed as the 99th percentile). This is particularly acute considering that future edge computing workloads are expected to have latency requirements ranging from microseconds to seconds. SledgeEDF is the first runtime to apply the traditional real-time systems techniques of admissions control and deadline-driven scheduling to the serverless execution model. It extends previous research on aWsm, an ahead-of-time (AOT) WebAssembly compiler, and Sledge, a single-process WebAssembly-based serverless runtime designed for the edge, yielding a runtime that targets efficient execution of mixed-criticality edge workloads. Evaluations demonstrate that SledgeEDF prevents backpressure due to excessive client requests and eliminates head-of-line blocking, allowing latency-sensitive high-criticality requests to preempt executing tasks and complete within 10% of their optimal execution time. Taken together, SledgeEDF’s admissions controller and deadline-driven scheduler enable it to provide limited guarantees around latency deadlines defined by client service level objectives.

Table of Contents

Dedication
Abstract
1 Introduction
  1.1 Contributions
2 Background
  2.1 Serverless
  2.2 Edge Computing
  2.3 WebAssembly
3 Related Work
  3.1 Serverless Research
  3.2 Edge Computing Research
  3.3 WebAssembly Research
  3.4 Comparisons with SledgeEDF
4 Design
  4.1 Baseline Functionality
  4.2 Design Changes
5 Implementation
  5.1 Enabling Refactors
    5.1.1 Processor Aware
    5.1.2 Sandbox State Machine
    5.1.3 Polymorphic Scheduling Interface
    5.1.4 Module Specification Changes
    5.1.5 Removal of libuv
  5.2 EDF Scheduling
  5.3 Admissions Control
  5.4 SledgeEDF
6 Evaluation
  6.1 Deadline-based Scheduling
  6.2 Admissions Control
  6.3 System Overheads
7 Conclusion
  7.1 Future Work
Bibliography

Chapter 1: Introduction

Serverless Computing or Functions-as-a-Service (FaaS) is the first novel cloud-native service to have gained mass popularity, largely by offering lower cost, improved elasticity, and improved ease of use. Certain classes of startups are adopting the so-called Backend-as-a-Service (BaaS) pattern, whereby a company focuses on highly differentiated web or mobile apps and implements all backend logic using serverless functions orchestrating managed cloud services, allowing them to scale with minimal distributed systems or operations expertise. Inspired by the needs of IOT and smart devices, cloud services are migrating from hyperscale data centers to the edge of the network, resulting in a class of services called edge computing. Given the event-driven nature of certain classes of IOT devices, serverless is one of the more common cloud services to be deployed to the edge. However, while existing serverless services typically use container or VM-based deployment units in hyperscale cloud data centers, this approach is ill-suited to the edge due to the cold start problem, a situation where, in the worst case, all caches are cold, and a request must stall waiting for the deployment of an operating system and language runtime before being able to execute, resulting in unfavorable tail latency. Newer serverless runtimes have solved the cold start problem by using user-level sandboxing technologies, most commonly WebAssembly. Some research and open-source systems also address the question of how to efficiently perform stateful computation, such as big data or machine learning, using stateless functions. This thesis views these research questions as largely solved and examines an area as yet unexplored in the recent serverless research literature: scheduling. Generally, existing serverless runtimes use simple schedulers that provide weighted fairness and optimize for efficient throughput at the expense of quality of service guarantees around request latency. This likely makes sense for serverless running in a hyperscale data center, as the risk of exhausting hardware capacity is effectively nil and users are accustomed

to tolerating occasional 100ms stalls caused by the cold start problem in exchange for low cost. However, serverless running on the edge deals with different constraints. Small-scale edge infrastructure is unable to guarantee the illusion of infinite capacity, so edge serverless must be able to handle request demand outstripping available supply. Additionally, to be able to support hard and soft realtime systems that require low roundtrip latency, a runtime needs to be able to provide differentiated quality of service guarantees for different sorts of requests. A serverless request from an aerial quadcopter requiring 20ms roundtrip latency cannot indefinitely sit on a request queue because all processors are occupied executing long-running non-realtime tasks, such as data compression of footage from nearby video cameras that is subsequently uploaded to cloud object storage. This thesis presents SledgeEDF, a single-process WebAssembly-based serverless runtime designed to support differentiated quality of service guarantees in a resource-constrained edge environment. The theoretical basis for this work is derived from decades-old research into real-time scheduling and the deadline-driven execution of sporadic jobs [84]. However, after a thorough review of the existing literature, the author believes this to be the first application of these traditional techniques in the context of a serverless runtime.

1.1 Contributions

This thesis presents SledgeEDF, a single-node serverless runtime optimized for running mixed-criticality workloads at the edge. Contributions include:

• An Earliest Deadline First (EDF) scheduler designed in the context of a single-process serverless runtime.

• A lightweight Admissions Controller for a serverless runtime running on limited edge infrastructure that cannot sustain the illusion of infinite capacity.

• An evaluation demonstrating the efficacy of these components.

Chapter 2: Background

2.1 Serverless

The value and use case of SledgeEDF are largely an extension of the serverless paradigm. This section provides a brief overview of serverless by contextualizing it in the larger ecosystem of cloud services, providing a high-level overview, and discussing several common criticisms. In the fifteen years since Jeff Bezos' announcement that Amazon would provide "pay by the drink" web services [95], most corporate IT departments have begun to migrate their critical software systems from on-premise data centers to the cloud. However, to date, much of this cloud adoption has simply involved the "lifting and shifting" of virtual machines from on-premise VMWare vCenter clusters to managed Infrastructure as a Service (IaaS), such as AWS EC2. Adoption of higher-level Platform as a Service (PaaS) offerings that provided "cloud-native" elasticity, such as App Engine and the original Windows Azure service [25], was stymied by the difficulty of refactoring legacy software systems. Over time, this led Microsoft and Google to deemphasize PaaS and place greater emphasis on infrastructure-centric services [51, 63]. Recently, customers have expressed interest in adopting containers as a lighter-weight alternative to virtual machines, typically in conjunction with the microservices pattern [88]. This has resulted in the term "cloud-native" gaining an association with containers and the Kubernetes ecosystem [49]. Despite this lofty language, equating containerization with cloud native is likely an overstatement, as containers remain a modest evolutionary step from IaaS services with a generally non-threatening migration path [105]. In contrast to IaaS and container-based services, the serverless or Functions-as-a-Service (FaaS) pattern pioneered by AWS Lambda represents the first fundamentally innovative cloud service to have achieved broad popularity in the cloud era [76]. As of 2020, all major cloud providers offer a service of this sort [33, 57, 90, 111] and several viable open-source

implementations are also available [7, 45, 71]. In serverless offerings, computation is abstracted from traditional physical infrastructure, expressed as stateless functions, executed in an event-driven manner, and priced according to the product of memory capacity and execution time in seconds (GB-seconds) [115]. Given that this delivery model granularly scales compute in response to events triggered by customer usage, estimates of potential infrastructure cost savings are as high as 77.08% over virtual machines (VMs), compared with a 13.42% savings using containers [126]. By offering lower cost, improved elasticity, and ease of use that enables scaling without dedicated backend and distributed systems specialists, this class of offerings has become particularly attractive to startups. As such, serverless has grown in tandem with the rise of coding bootcamps as a source for developer talent [54], suggesting that FaaS may be reducing the need for backend engineering just as IaaS reduced the need for systems administrators and data center personnel. This is reflected by the overarching term that describes the pattern of using only FaaS and managed cloud services: Backend-as-a-Service (BaaS). A common criticism of serverless is that the dominance of proprietary APIs risks lock-in [34]. This fear has motivated researchers and open-source developers to create new open-source serverless runtimes, such as OpenLambda [64], OpenFaaS [45], and OpenWhisk [12], as well as higher-level abstractions, such as Serverless Framework [110], that wrap proprietary services in a portable vendor-agnostic API. As of late 2020, a popular internet reference lists a dizzying forty-nine current and defunct serverless frameworks [89]. A second criticism of serverless is the performance disadvantages of the stateless event-driven execution model. Being stateless means that serverless uses a "data shipping" architecture of copying data to and from a remote object store, typically incurring an additional 10ms overhead per fetch or store [24]. This can be a particularly thorny problem for algorithms with random access patterns that cannot be hidden via pre-fetching. Being event-driven means that execution only starts on demand, potentially incurring a startup cost on every request. For most hyperscale serverless offerings, the backing deployment unit is a

Linux container or lightweight VM, so this startup includes the function itself, a language runtime, and a minimal Linux environment [112]. The startup time of these components can be hidden in the common case via caching, snapshots, and keeping backing deployment units warm for longer periods of time, but these optimizations are flushed after periods of inactivity as short as forty minutes, leading to spiky tail latency [41, 85]. Several recent studies have profiled this start time to be roughly 25ms during a best-case warm start and 160+ms on a cold start [18]. Much of this performance problem is related to the overheads of the Linux container abstraction, causing some researchers to wonder if an alternative host OS or a unikernel approach might be better suited [79]. This line of thinking led to AWS' decision to rearchitect Lambda functions around a new lightweight virtualization platform called Firecracker [1].

2.2 Edge Computing

SledgeEDF is a serverless runtime designed with the constraints of edge infrastructure in mind. To motivate why a purpose-built runtime is necessary, this section discusses the classes of workloads that are poorly served by cloud services delivered via centralized hyperscale cloud data centers and introduces the nascent edge computing services emerging to address this market need. Though some researchers identify the event-driven nature of some classes of IOT devices as a good fit for serverless, adoption has lagged for workloads with one or more of the following attributes:

• Large data feeds (e.g., high-resolution video) that are cost-prohibitive to ship to the cloud [26].

• Realtime requirements intolerant of the high tail latency of occasional cold starts.

• Tight latency requirements that are exceeded by the roundtrip latency of reaching a hyperscale cloud data center. For example, 70% of the world’s Internet traffic runs through the "Data Center Alley" in Loudoun County, Virginia [99], to which roundtrip requests from the US West Coast take approximately 90ms [106, 108]. While this can be

ameliorated by better use of regions, multi-region applications still experience network latency of 30-40ms [19], but some realtime systems require roundtrip latencies as low as 10ms [132].

These issues motivate a new class of cloud services called edge computing, which executes cloud services in micro data centers at the network's edge in order to improve latency by reducing the geographic distance of round-trip requests. The original edge service was the content delivery network (CDN), traditionally run by specialty companies, such as Akamai, CloudFlare, and Fastly, but now also offered by the major cloud providers [120]. Reflecting the shifting perceptions of edge from a caching layer for static content to a general-purpose execution environment, sometimes called the Intelligent Edge, players in this space now often refer to themselves as edge cloud providers and describe CDNs as merely one service among many. For example, CloudFlare [124], Fastly [47], AWS [113], and Baidu [20] now offer serverless execution on their edge clouds. Even if the edge is increasingly intelligent, running on highly distributed, small-scale, and potentially heterogeneous infrastructure means edge services experience novel problems not seen in hyperscale cloud data centers. For example, the cloud is traditionally defined as presenting customers the illusion of infinite capacity [51], yet edge hardware has less capacity and is difficult to scale, so exhaustion is possible. How to manage this remains an open question. Additionally, resource constraints at the edge render certain performance optimizations uneconomical, such as keeping idle containers warm to reduce serverless cold starts. As such, the latency reductions of moving container-based serverless closer to latency-sensitive clients may be cancelled out by an exacerbated cold start problem. Possibly for this reason, Fastly and CloudFlare have created edge serverless offerings that replace containers or lightweight VMs [1] with WebAssembly, a lightweight user-level sandbox that eliminates much of the cold start problem [66, 119]. While most edge research seems to consider edge services in the context of IOT, smart city, and realtime applications, some venture capitalists envision a future where self-driving

cars, delivery drones, and other novel applications with tight latency requirements cause a disruptive shift towards decentralization that results in the end of the cloud [82]. A notable application that is currently leaning in this direction is the Xbox Game Pass, which replaces game consoles with a Netflix-style streaming service [28, 91]. Given that a competing game streaming company claims that sub-20ms latency is needed to offer a near-native experience [36], the new Xbox service is likely motivating the build-out of several new Azure edge services [19]. Along these same lines, the State of the Edge report is particularly bullish, likening Edge Computing to the Third Act of the Internet and predicting $700 billion in CAPEX leading to 102 thousand MW of edge compute by 2028 [43].

2.3 WebAssembly

In order to provide efficient serverless execution on constrained edge devices, SledgeEDF leverages WebAssembly, a lightweight standards-based sandboxing technology that is neither bound to the web nor an assembly language in the traditional sense. While WebAssembly can be quickly summarized without regard to the web platform, this section provides additional context around the history and motivations that led to the initial specification, active initiatives in the WebAssembly ecosystem, and directions that the technology is likely to go in the future. Given the potential that WebAssembly has to impact the POSIX substrate and core design primitives of systems software, the author considers the trajectory of WebAssembly to be essential knowledge for systems researchers. The Web has undergone a substantial transformation over the last decade, evolving from a distribution medium for hypertext into a full-featured application development platform. While such a move has been investigated and studied by researchers for years [68, 69, 103], the two primary drivers of this change were the iPhone, which left the web as the last option for cross-platform application development [102], and Lars Bak's work on V8, the first high-performance JIT compiler designed around the peculiarities of the JavaScript

specification [11]. These drivers motivated new single-page application frameworks [46], technologies that blurred the lines between web and native applications [50], and transpilers that enabled alternative languages to run in the browser via source-to-source compilation [74]. JavaScript, a language designed by Brendan Eich in ten days [114] with a reputation of good parts surrounded by bad [35], became the de facto "Assembly Language for the Web" [62]. Designed to be simple to use, asynchronous, event-driven, and single-threaded, even fast JavaScript was still insufficient for a subset of applications where low startup time, raw compute-intensive performance, or high concurrency was important. This led to several attempts to provide a mechanism for running untrusted native code without suffering the security problems associated with Microsoft's older ActiveX technology [42]. Drawing on the lessons of software fault isolation (SFI) research, such as PittSFIeld [86] and Xax [39], Google developed NaCl [133] and PNaCl [38], secure sandboxes for native code. In contrast, Mozilla followed the emerging pattern of transpilation and developed Emscripten, an LLVM-based compiler toolchain that could turn C or C++ into Asm.js, a JavaScript subset that ran at half the speed of native binaries by optimizing for the JIT compilers' fast path [136]. The success of Asm.js led browser vendors to collaborate on WebAssembly [44], a W3C standard compilation target for web browsers that takes cues from previous research into typed assembly languages [93, 94]. Compared to the Java Virtual Machine (JVM) or .NET Common Language Runtime (CLR), WebAssembly was conservatively specified to only provide the minimal subset of features required to support the execution of C, C++, or Rust code. However, given the history of security vulnerabilities in older plugin-based execution environments, such as ActiveX, Java Applets, Silverlight, and Adobe Flash, considerable attention was paid to making this specification a sound and secure sandbox for running untrusted and potentially adversarial code loaded from a network. WebAssembly is composed of the following high-level components:

• A virtual instruction set architecture (ISA) that operates on an implicit operand stack rather than using traditional loads and stores of registers.

• A handful of types, which only include i32, i64, f32, and f64. Higher-level types must be expressed using integral values.

• A 32-bit little-endian linear memory that can be dynamically resized between one 64KB page and 4GB. This acts as the program's working memory, storing things like the heap. WebAssembly can only access this abstract linear memory, providing a form of Software Fault Isolation sandboxing similar to the virtual address space provided by a host operating system.

• An external stack that exists outside of the linear memory and is protected by the host environment. This protects against several classes of security vulnerabilities.

• A series of tables containing typed entry points to functions and function pointers. WebAssembly can only invoke function pointers via a call_indirect instruction that takes a table index, providing a form of Control Flow Integrity.

• A module specification with import and export semantics that do not make assumptions about a host environment.

• A compact binary representation that optimizes for size.

• A text format with LISP-like S-expressions that optimizes for legibility [59].
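To make the sandboxing properties above concrete, the following C fragment is a minimal sketch of how a host embedding can confine a module's loads and stores to its linear memory using an explicit software fault isolation check. The structure and function names are illustrative assumptions rather than the code of any particular engine; Chapter 4 notes that Sledge elides this per-access check entirely by reserving a full 32-bit address space per sandbox.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor for a module's linear memory: a base pointer
 * plus the currently allocated size (a multiple of the 64KB page). */
struct linear_memory {
        uint8_t *base;
        uint32_t size;
};

/* Explicit SFI: every access is checked against the current size before
 * a host pointer is formed, so the module can never read or write
 * outside of its own linear memory. */
static uint32_t wasm_load_i32(struct linear_memory *mem, uint32_t offset)
{
        if ((uint64_t)offset + sizeof(uint32_t) > mem->size) {
                fprintf(stderr, "out-of-bounds access at offset %u\n", offset);
                exit(EXIT_FAILURE); /* a real engine would trap instead */
        }
        uint32_t value;
        memcpy(&value, mem->base + offset, sizeof(value)); /* host assumed little-endian */
        return value;
}

int main(void)
{
        struct linear_memory mem = { .size = 64 * 1024 };
        mem.base = calloc(mem.size, 1);

        mem.base[16] = 42;
        printf("load at offset 16 = %u\n", wasm_load_i32(&mem, 16));

        free(mem.base);
        return 0;
}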

Given Google’s decision to sunset its competing PNaCl standard and the WebAssembly specification reaching W3C consensus on February 28, 2017 [127], WebAssembly can be considered the industry’s first standards-based intermediate representation with substantial cross-vendor support. Because it delineates the core virtual ISA and export/import interface from the complexities of concrete host environments, such as the web browser [60], numer- ous systems programmers consider WebAssembly a general-purpose embeddable sandbox with strong isolation and low overhead. As a result, a major area of further specification has been the WebAssembly System Interface (WASI), which provides a means for WebAssem- bly code to interact with an operating system host environment [32]. The resulting design provides a core set of POSIX-like APIs that are modified to provide a capability-based security model similar to Capsicum [130]. This allows for a runtime to limit a WebAssembly

module to fine-grained permissions, and for that module to further restrict and delegate those permissions to third-party modules. The combination of WebAssembly and WASI has triggered a veritable Cambrian explosion of activity, including front-end languages [8], compilers [4, 5, 65, 122], interpreters, general runtimes [6, 9, 67, 109, 129], and adoption of WebAssembly in existing popular pieces of software, including Kubernetes [48], Envoy [30], and OpenFaaS [17]. Many new specifications are on the horizon, including garbage collection, direct access to the browser's Document Object Model (DOM), and various features to make it easier for heterogeneous languages to import and export higher-level types. A higher-level Nanoprocess Pattern has been proposed by the Bytecode Alliance, an organization focused on the application of the technology outside of the browser [31]. Zooming out, the meta-thread of ongoing research and industry efforts seems to be working towards positioning WebAssembly, WASI, and the Nanoprocess pattern as a cleaner, simpler, and more secure replacement for POSIX.

Chapter 3: Related Work

Given the paucity of research into scheduling of serverless runtimes on constrained edge infrastructure, this chapter broadly surveys ongoing research related to each of the system's key aspects: serverless, edge computing, and WebAssembly.

3.1 Serverless Research

The broad trend of recent serverless research is an attempt to shift serverless architectures from event-driven stateless execution based around containers and data shipping to event-driven stateful execution based around lightweight sandboxes, such as WebAssembly, adjacent to some sort of high-speed storage service. Papers fall into the following categories:

• Analysis of proprietary [85, 128] and open-source [83, 92, 115] serverless offerings.

• Vision Papers focused on serverless in general [76] or specific domains, such as machine learning [23], including gap analysis and suggestions for future research.

• Runtimes that are open-source clones of AWS Lambda [64, 87].

• Runtimes that attempt to solve the Cold Start Problem via simplified container namespaces [98], per-container process-based multitenancy [2], user-level sandboxing via WebAssembly [52, 117], user-level sandboxing via V8 contexts [137], improved snapshots [41], or unikernels [22].

• Attempts to solve the Data Shipping Problem via standalone storage systems [40, 78] or frameworks that provide both a unified storage engine and serverless runtime aimed at performant stateful programming in Python [24, 75, 121], JavaScript [137], Java [13], or WebAssembly [117].

3.2 Edge Computing Research

The broad thrust of recent Edge Computing research is an attempt to forecast the use cases and prototype the network and system architectures needed for the gradual rollout of edge infrastructure over the next decade. Papers fall into the following categories:

• Vision Papers on the use case and challenges of running services at the edge of the network [116, 125, 135].

• Performance Benchmarks of existing open-source serverless runtimes on resource-constrained edge infrastructure [101], including customizations to OpenFaaS [55] and OpenWhisk [15] to prototype edge serverless reference architectures. Generally, these systems exhibited surprisingly poor round-trip latency on the order of 100ms, demonstrating that existing Kubernetes and container-based serverless runtimes are ill-suited for resource-constrained edge infrastructure.

• Reference architectures and implementations of Edge Serverless Runtimes [29, 53], including for specific workloads, such as data analytics [97], and for certain industries, such as oil and gas [3, 70].

• Hybrid N-Tier Serverless reference architectures and implementations [26, 134] that offer opportunistic performance gains via the edge, but are ultimately backed by hyperscale cloud services. Building on this, the FogFlow framework dynamically migrates data and serverless functions between the cloud and edge based on latency and bandwidth targets [27].

• Multi-vendor Edge Serverless Markets, including a framework that matches clients with a best fit provider based on their scheduling requirements [10, 14] and a simulation of how different forms of real-time auctions impact pricing and placement on an edge network [16].

• Research that revisits the idea of cloudlets and Mobile Edge Computing (MEC) [108], which envisions each mobile device having a dedicated backend for offload that migrates

as the mobile device moves. In the context of serverless [15], opportunistic offload to edge serverless is considered. A particularly interesting example of this involves a mobile game that live migrates a WebAssembly module running compute-intensive graphics processing to the edge, resulting in improved frame rates [73].

3.3 WebAssembly Research

The broad trend of WebAssembly research is a push to refine the core specification and surrounding ecosystem of tools and investigate novel applications in non-web embeddings. Papers fall into these categories:

• Criticism against some of the strong claims of performance [72] and security [80] made by the original WebAssembly paper [60].

• Suggested extensions to the specification, including a relaxed memory model [131] and adding tables of memory segments in support of new hardware-based memory safety features, such as ARM Memory Tagging Extension [37].

• Tools to enable taint tracking [123], superoptimization [21], Valgrind-like dynamic analysis [81], and easier adoption of sandboxing of third-party dependencies in large brownfield C++ projects [96].

• Applications to novel domains, such as existing VMs [107], trusted execution environments such as Intel SGX [56], operating system kernels [118], or mobile software agents that can live migrate between phones and edge compute devices [73].

3.4 Comparisons with SledgeEDF

Existing edge serverless research has largely focused on reusing existing open-source container-based serverless runtimes, reporting poor tail latency due to the cold start problem. Other than previous work by George Washington University researchers [52, 53], the sole exception the author encountered was an edge serverless runtime that researchers at Georgia

Tech rapidly prototyped via the WebAssembly capabilities of Node.js, which came to similar conclusions regarding the unsuitability of container-based runtimes for the edge [61]. Other WebAssembly-based serverless runtimes are either commercial systems that cannot be cleanly replicated, such as the Fastly and CloudFlare offerings, or are not targeted towards the edge, such as Faasm [117], a particularly impressive general-purpose research system. Beyond being the sole open-source WebAssembly-based serverless runtime built with the edge in mind, SledgeEDF [52] differs from these multi-node distributed systems by optimizing for deployment as a single-process edge node. The closest analogous single-node serverless deployment system is Nuclio [71], an open-source serverless runtime which allows a function processor to be run in a standalone fashion similar to SledgeEDF, but which uses process forking within a container similar to SAND [2] in place of WebAssembly. Regarding the issue of scheduling, all existing serverless systems and proposed serverless reference architectures provide weighted fairness and optimize for efficient throughput, leading the author to believe that SledgeEDF is the first serverless runtime to consider service-level agreements (SLAs) and per-request latency requirements in scheduling decisions. After a thorough review, the only serverless paper the author encountered that modeled execution as deadline-driven tasks was a focused investigation of serverless for Smart Oilfields, but this system only used deadlines for cluster-level distribution decisions [70]. The author believes that this paper's ideas about cluster-level work distribution might be a useful complement if SledgeEDF expands into clustered execution in the future. Expanding out of the serverless domain, several papers focused on reducing scheduling tail latency include Shenango [100], which focused on core reallocation, ZygOS [104], which compares First Come First Serve (FCFS) and Priority Scheduling (PS) policies across different workload distributions, and GrandSLAm [77], which batches and reorders requests running through a distributed system composed of microservices to optimize end-to-end target latency.

Chapter 4: Design

Sledge is a serverless runtime designed around the requirements of resource-constrained edge systems. It leverages the WebAssembly security model to provide in-process isolation in order to execute untrusted modules from multiple tenants in a single process. This lighter-weight form of isolation eliminates the cold start problem, resulting in an 80-90% decrease in p100 latency (the slowest latency in the distribution) [61]. Other runtimes also use WebAssembly, but these runtimes still exhibit overheads of 28-150% over native code [72]. Sledge reduces this overhead to 13.4% [52] through the following techniques:

• Eliminating JIT compilation via aWsm, a compiler that transforms WebAssembly modules into LLVM bitcode, which is fed into the LLVM compiler to produce native ELF shared libraries. aWsm is written in approximately 4500 lines of Rust.

• Using a purpose-built runtime written in 8,000 lines of C in place of existing runtimes, such as Node.js.

• Using kernel-bypass and non-blocking POSIX APIs.

• Reducing function startup to allocation of a sandbox struct and WebAssembly linear memory by linking and loading the ELF shared libraries containing serverless functions at the launch of the runtime.

• Eliding bounds checks by allocating a full 32-bit address space per sandbox [60].
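As a complement to the explicit bounds check sketched in Chapter 2, the fragment below illustrates the reservation trick behind the last bullet: by reserving the entire 32-bit range up front and committing only the pages actually in use, any base-plus-32-bit-offset access stays inside the reservation, and out-of-bounds accesses fault on uncommitted pages. The sizes and mmap flags are assumptions for illustration, not aWsm or Sledge's exact implementation.

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Assumed sizes and flags; the actual allocation strategy may differ. */
#define WASM_PAGE (64ULL * 1024)
#define FOUR_GB   (1ULL << 32)

int main(void)
{
        /* Reserve, but do not commit, the entire 32-bit range. */
        void *reservation = mmap(NULL, FOUR_GB, PROT_NONE,
                                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (reservation == MAP_FAILED) { perror("mmap"); return EXIT_FAILURE; }

        /* Commit only the module's initial linear memory (one 64KB page). */
        if (mprotect(reservation, WASM_PAGE, PROT_READ | PROT_WRITE) != 0) {
                perror("mprotect");
                return EXIT_FAILURE;
        }

        unsigned char *linear_memory = reservation;
        linear_memory[0] = 42;  /* in-bounds: committed, readable, writable       */
        /* linear_memory[FOUR_GB - 1] would fault: reserved but still PROT_NONE. */

        printf("linear memory reserved at %p, first byte = %u\n",
               (void *)linear_memory, (unsigned)linear_memory[0]);

        munmap(reservation, FOUR_GB);
        return 0;
}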

The baseline designs of the aWsm compiler and Sledge edge serverless runtime have been covered in previous publications [52, 53], which investigated various design tradeoffs. For brevity, this section limits discussion of existing work to the optimal configuration determined by past experimental results. This serves as a baseline to contextualize the modifications and enhancements new to SledgeEDF.

4.1 Baseline Functionality

On program start, the main thread of the Sledge serverless runtime initializes global data structures and loads a static JSON file defining the attributes of one or more serverless functions, termed "modules" hereafter. These attributes include the path of the ELF shared library containing the function, the port to bind to, the argument count, the content type of the request, and the size of buffer to use for requests and responses. This information allows the runtime to link and load all modules, bind to the desired host ports, and save the resulting sockets to a global data structure. Once initialization is complete, the main thread launches a dedicated listener thread and a configurable number of worker threads and blocks for the remainder of runtime execution. Each thread is pinned to a processor core, such that the total number of threads cannot exceed the number of physical cores on the system. Hereafter, the listener and worker threads are referred to as listener and worker cores. At this point, the listener core monitors all incoming requests via the Linux epoll interface. When a request arrives, the listener core accepts the request and wraps the resulting client socket connection in a sandbox request structure, which is placed on a global request queue. Worker threads busy loop waiting to pull sandbox requests off the global request queue. Once one is pulled, a worker allocates a sandbox struct containing all state needed for the end-to-end execution of a serverless function, including the stack, linear memory, and tables that constitute the WebAssembly runtime environment. When allocation is complete, the worker begins execution of the sandbox via its main entry point. The runtime broadcasts POSIX signals to all workers at a discrete quantum, interrupting executing sandboxes and forcing them to make a scheduling decision. If a sandbox blocks and its runqueue is empty, a worker pulls another request off the global request queue, potentially resulting in multiple sandboxes on a worker’s local runqueue. When this occurs, the worker executes the sandboxes on the runqueue in a round robin manner. When a worker completes a sandbox, it sends a response to the client using the socket stored in the sandbox struct, deallocates

the linear memory, and adds the sandbox to a completion queue. Each worker periodically deallocates the completed sandboxes stored on the completion queue.
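The flow described above can be condensed into a compilable skeleton of a worker core's main loop under this baseline policy. The types and helper functions below are stand-ins with assumed names for the runtime's real data structures, so this is a sketch of the described behavior rather than Sledge's actual source.

#include <stdbool.h>
#include <stddef.h>

struct sandbox_request;  /* client socket connection plus module metadata     */
struct sandbox;          /* stack, linear memory, tables, and response socket */

/* Stand-ins for the operations described in this section. */
extern struct sandbox_request *global_request_queue_pop(void);
extern bool                    local_runqueue_is_empty(void);
extern struct sandbox         *local_runqueue_next_round_robin(void);
extern struct sandbox         *sandbox_allocate(struct sandbox_request *request);
extern void                    sandbox_execute(struct sandbox *s); /* runs until completion, blocking, or the next quantum */
extern void                    sandbox_send_response_and_retire(struct sandbox *s);
extern void                    completion_queue_free(void);

void worker_main(void)
{
        while (true) {
                if (local_runqueue_is_empty()) {
                        /* Busy-loop on the global request queue when idle. */
                        struct sandbox_request *request = global_request_queue_pop();
                        if (request == NULL) continue;

                        struct sandbox *s = sandbox_allocate(request);
                        sandbox_execute(s);
                        sandbox_send_response_and_retire(s);
                } else {
                        /* Multiple local sandboxes (e.g. after blocking) are
                         * executed round robin at each scheduling quantum. */
                        sandbox_execute(local_runqueue_next_round_robin());
                }

                /* Periodically deallocate sandboxes parked on the completion queue. */
                completion_queue_free();
        }
}

Section 4.2 revisits the two policy choices visible in this skeleton: the FIFO order of the global request queue and the round-robin order of the local runqueue.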

4.2 Design Changes

In the existing Sledge baseline, sandbox requests and sandboxes are unordered and executed according to a global first-in-first-out (FIFO) policy and a local round-robin policy. The global request queue is a lock-free data structure, and the runqueues are implemented with work-stealing data structures that allow workers to pull sandboxes from each other if the global request queue is empty. This design optimizes for efficient throughput. However, this approach may be insufficient for an edge serverless system. As previously mentioned in the background section, edge computing likely executes on smaller-scale infrastructure that is unable to maintain the illusion of infinite capacity, which raises two potential issues. First, excessive demand introduces backpressure on the serverless runtime, which can cause client requests to queue excessively, increasing end-to-end execution time. Second, different functions might have vastly different latency requirements, so long-running compute-bound serverless functions executing in FIFO order might block serverless functions with tight latency requirements: a phenomenon called head-of-line blocking. To address these challenges, SledgeEDF implements several ideas from traditional real-time software systems, including:

1. Implementing an admissions controller component to reject requests when the runtime is saturated, thereby preventing backpressure.

2. Modifying the runtime to implement Earliest Deadline First (EDF) behavior, thereby ensuring serverless functions with tight latency requirements cannot be blocked by long-running executions with a later deadline.

Chapter 5: Implementation

5.1 Enabling Refactors

To enable the implementation of admissions control and EDF-based scheduling, Sledge was refactored in several key ways.

5.1.1 Processor Aware

In order to convert between cycles and microseconds and measure instantaneous execution capacity, the Sledge runtime was modified to, at launch, read the speed of the first processor core in MHz from /proc/cpuinfo and store this information in a global variable. As currently implemented, this depends on two simplifying assumptions:

• Cores do not dynamically change their speed due to thermal or power management.

• Cores are homogeneous.

These simplifying assumptions are insufficient for certain classes of edge infrastructure, such as servers running ARM's heterogeneous big.LITTLE architecture, which differentiates between efficiency cores and performance cores [58]. Additionally, they impose the requirement that the host environment run its processors under the performance governor with all cores pinned to a consistent clock speed.
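The following self-contained sketch illustrates this conversion logic. The parsing of the cpu MHz field and the helper names are assumptions rather than Sledge's exact code, but the arithmetic the runtime relies on is simply cycles = microseconds × MHz.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative only: parse the first "cpu MHz" entry from /proc/cpuinfo,
 * as described above, and use it to convert between cycles and microseconds. */
static uint32_t processor_speed_MHz(void)
{
        FILE *file = fopen("/proc/cpuinfo", "r");
        if (file == NULL) return 0;

        char   line[256];
        double mhz = 0.0;
        while (fgets(line, sizeof(line), file) != NULL) {
                if (sscanf(line, "cpu MHz : %lf", &mhz) == 1) break;
        }
        fclose(file);
        return (uint32_t)mhz;
}

/* With the speed in MHz, cycles = microseconds * MHz and vice versa. */
static uint64_t us_to_cycles(uint64_t us, uint32_t mhz) { return us * mhz; }
static uint64_t cycles_to_us(uint64_t cycles, uint32_t mhz) { return mhz ? cycles / mhz : 0; }

int main(void)
{
        uint32_t mhz = processor_speed_MHz();
        printf("core 0 speed: %" PRIu32 " MHz\n", mhz);
        printf("20000us relative deadline = %" PRIu64 " cycles\n", us_to_cycles(20000, mhz));
        printf("1000000 cycles = %" PRIu64 " us\n", cycles_to_us(1000000, mhz));
        return 0;
}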

5.1.2 Sandbox State Machine

In order to properly model the execution time of a sandbox, the runtime has undergone large-scale refactors to model a sandbox as a state machine. While a major goal of this work was to serve as a debugging aid, it also enabled per-state accounting of sandbox execution time, which is relevant for admissions control. See Figure 5.1 for a visualization of the associated states and transitions.

Figure 5.1: Sandbox States
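The thesis does not enumerate the states by name outside of Figure 5.1, but the discussion in Sections 5.2 and 5.3 refers to allocation, runnable, running, preempted, blocked, and complete states. The sketch below is therefore an inferred illustration of the state machine and the per-state accounting it enables, not a copy of Sledge's definitions.

#include <stdint.h>

/* State names inferred from the states discussed in Sections 5.2 and 5.3;
 * the exact set and spelling in Sledge's source may differ. */
typedef enum {
        SANDBOX_ALLOCATED,  /* struct and linear memory allocated, not yet started */
        SANDBOX_RUNNABLE,   /* on a worker's runqueue, eligible to run             */
        SANDBOX_RUNNING,    /* currently executing on a worker core                */
        SANDBOX_PREEMPTED,  /* displaced by an earlier deadline; still runnable    */
        SANDBOX_BLOCKED,    /* waiting on I/O or another resource                  */
        SANDBOX_COMPLETE,   /* response sent; awaiting deallocation                */
        SANDBOX_ERROR       /* terminal failure state                              */
} sandbox_state_t;

/* Per-state accounting: recording the cycle count at each transition lets
 * the time spent in each state be accumulated; Section 5.3 consumes the
 * time spent in SANDBOX_RUNNING via the performance window. */
struct sandbox_timing {
        sandbox_state_t state;
        uint64_t        last_transition_cycle;
        uint64_t        cycles_in_state[SANDBOX_ERROR + 1];
};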

5.1.3 Polymorphic Scheduling Interface

In order to allow Sledge to run alternative scheduling policies, it was restructured around an abstract polymorphic scheduling interface, and the existing scheduling logic was refactored into a concrete implementation of this interface. This enabled the new EDF policy to be implemented as an alternate concrete implementation. The selection of scheduling policy is exposed at runtime via an environment variable, defaulting to the existing FIFO policy.
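One plausible shape for such an interface is a structure of function pointers with one concrete instance per policy, selected once at startup; the member names and the SLEDGE_SCHEDULER environment variable below are illustrative assumptions, not the runtime's actual identifiers.

#include <stdlib.h>
#include <string.h>

struct sandbox; /* opaque in this sketch */

/* One possible shape for the abstract scheduling interface. */
struct scheduler_policy {
        void            (*add)(struct sandbox *s);     /* make a sandbox runnable */
        struct sandbox *(*next)(void);                 /* choose what runs next   */
        void            (*remove)(struct sandbox *s);  /* completion or blocking  */
};

extern struct scheduler_policy scheduler_fifo; /* existing baseline policy */
extern struct scheduler_policy scheduler_edf;  /* new EDF policy           */

/* Selected once at startup from an environment variable, defaulting to
 * the existing FIFO behavior. */
struct scheduler_policy *scheduler_policy_select(void)
{
        const char *name = getenv("SLEDGE_SCHEDULER");
        if (name != NULL && strcmp(name, "EDF") == 0) return &scheduler_edf;
        return &scheduler_fifo;
}

Dispatching through a small table of function pointers keeps the scheduling hot path cheap while allowing the FIFO and EDF policies to coexist in a single binary.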

5.1.4 Module Specification Changes

Sledge loads serverless functions (termed "modules") via a JSON-based configuration file. Several fields have been added to this specification to enable EDF scheduling and admissions control:

• relative-deadline-us - A static deadline by which Sledge must complete a serverless function, expressed as microseconds relative to when Sledge receives the request.

• admissions-percentile - A client-provided integer between 50 and 99, expressing the target percentile of past executions that the admissions controller should use for decisions. This is useful for workloads that exhibit a long tail in execution times.

• expected-execution-us - A client-provided expected baseline for how long a typical request of this module would take to complete. This is assumed to be the same percentile as admissions-percentile and is used on first invocation before the runtime has had an opportunity to profile real-world execution time.
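For illustration, the sketch below shows how these fields might look once the JSON specification has been parsed into a module structure. The surrounding fields and the example values are hypothetical; only the three new keys come from the list above.

#include <stdint.h>

/* A simplified view of how the new module-specification fields might be
 * represented after parsing. Example module entry (hypothetical values):
 *
 *   {
 *     "name": "fibonacci",
 *     "path": "fibonacci_wasm.so",
 *     "port": 10010,
 *     "relative-deadline-us": 20000,
 *     "expected-execution-us": 600,
 *     "admissions-percentile": 70
 *   }
 */
struct module {
        char     name[64];
        char     path[256];
        uint16_t port;

        uint32_t relative_deadline_us;   /* SLO deadline relative to request arrival */
        uint64_t relative_deadline;      /* the same deadline converted to cycles    */
        uint32_t expected_execution_us;  /* client estimate used before profiling    */
        uint8_t  admissions_percentile;  /* 50-99: how conservatively to admit       */
};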

5.1.5 Removal of libuv

libuv is a popular C-based framework that provides asynchronous event-driven non-blocking I/O, either directly in C or via higher-level languages, such as Node.js. Because it uses an event-loop architecture, users of this framework effectively delegate scheduling decisions about the execution of callbacks to libuv. Given that there is no support for making libuv deadline-aware, using it in SledgeEDF would result in situations where a callback associated with a sandbox with a later deadline might run before a runnable sandbox with an earlier deadline. One of the major advantages of libuv is that it provides cross-platform support for Linux, Unix, Mac, and Windows, but since Sledge already uses native Linux features directly, this is irrelevant. As a result, the runtime has been refactored to replace libuv with native Linux alternatives, such as epoll. This preserves a higher degree of control over scheduling decisions, reducing edge cases where asynchronous code violates the principle of deadline-driven execution.
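For reference, the kind of native epoll loop that replaces libuv's event loop looks roughly like the sketch below. Error handling, HTTP parsing, and the hand-off to the global request queue are omitted or reduced to comments, and the port number is simply the example value used later in the evaluation.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
        /* Listen on a module's port, as the listener core does at startup. */
        int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in address = { .sin_family = AF_INET,
                                       .sin_port = htons(10010),
                                       .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(listen_fd, (struct sockaddr *)&address, sizeof(address));
        listen(listen_fd, SOMAXCONN);

        /* Register the listening socket with epoll and wait for activity. */
        int epoll_fd = epoll_create1(0);
        struct epoll_event event = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(epoll_fd, EPOLL_CTL_ADD, listen_fd, &event);

        for (;;) {
                struct epoll_event events[64];
                int ready = epoll_wait(epoll_fd, events, 64, -1);
                for (int i = 0; i < ready; i++) {
                        if (events[i].data.fd != listen_fd) continue;
                        int client_fd = accept(listen_fd, NULL, NULL);
                        /* ...wrap client_fd in a sandbox request, stamp its
                         * absolute deadline, and enqueue it for the workers... */
                        close(client_fd); /* placeholder in this sketch */
                }
        }
}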

5.2 EDF Scheduling

This section describes how the EDF scheduling policy differs from the baseline described in Section 4.1. At launch, Sledge loads the module specification JSON. In order to support EDF scheduling, the relative-deadline-us value is multiplied by the processor speed in MHz to calculate the relative deadline in cycles. This is saved to the module struct, and cycles are used internally throughout runtime execution in order to reduce overheads in performance-critical sections, such as context switches. Later, upon receipt of a client request, the listener core calculates the absolute deadline of a sandbox in cycles by using an assembly instruction, such as rdtsc, to capture a timestamp, and adding the relative deadline of the associated module. This is saved onto the sandbox request struct. The global request queue is implemented as a min-heap priority queue ordered by absolute deadline in cycles. In contrast to the existing FIFO structure, this requires a lock to protect invariants between when an element is inserted and when the heap structure property is restored. To ameliorate lock contention, the deadline of the element at the head of the request queue is saved to a member that can be safely read without taking the lock. This enables the worker cores to either inspect the earliest deadline without taking the lock or attempt to pull the request associated with this deadline by taking the lock. Under EDF, worker cores now implement their local runqueues via the same lock-based min-heap structure as described previously. However, given that this structure is local to a single core in this context, the worker uses an alternate set of APIs that bypass the lock. Throughout their loops, worker cores make scheduling decisions based on a comparison of the deadline of the currently executing sandbox against the head of the local runqueue and the head of the global request queue. If either of these values has an earlier deadline, the worker attempts preemption. In the case of the earlier sandbox being on the local runqueue (such as due to a blocked resource becoming runnable), the preemption is always successful, and the hitherto executing sandbox transitions to a preempted state (equivalent to runnable). If the earlier sandbox is on the global request queue, the worker attempts to pull the request. However, given that there may be many worker cores attempting this at roughly the same time, one of several situations might occur:

1. A worker successfully pulls the request with this deadline and preempts the current sandbox.

2. A worker is too slow to pull the request, and the earliest request deadline when the worker finally gets the lock is not earlier than that of the sandbox it is already executing. In this case, the worker exits and resumes the existing sandbox.

3. A worker is too slow to pull the request at the head of the queue, but the earliest deadline when the worker finally gets the lock is still earlier than that of the sandbox it is already executing. The worker pulls this request and preempts as in case 1.

In order to differentiate between cases 2 and 3, when a worker core makes a call that requires it to take a lock, it passes the absolute deadline of its currently executing sandbox. One major limitation of the current implementation is that it lacks a means to terminate or de-prioritize execution of sandboxes that miss their deadlines. In the worst case, this means a client workload caught in an infinite loop might cause permanent head-of-line blocking because its deadline is earlier than the current timestamp, so no new workloads can possibly preempt the executing workload. Possible solutions to this problem are discussed in Future Work.
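The decision logic described in this section can be condensed into a sketch of the check a worker performs at each scheduling point. The queue and preemption helpers are assumed names standing in for the runtime's APIs; the important detail is that the absolute deadline of the currently executing sandbox is passed into the locked pull so that cases 2 and 3 above can be distinguished.

#include <stddef.h>
#include <stdint.h>

struct sandbox { uint64_t absolute_deadline; /* in cycles */ };

/* Assumed helper names for the structures described above. The peek
 * functions return UINT64_MAX when their queue is empty. */
extern uint64_t        local_runqueue_peek_deadline(void);
extern struct sandbox *local_runqueue_pop(void);
extern uint64_t        global_request_queue_peek_deadline(void); /* lock-free read of the cached head deadline */
extern struct sandbox *global_request_queue_pop_if_earlier(uint64_t current_deadline); /* takes the lock; NULL is case 2 */
extern void            preempt_current_sandbox(struct sandbox *next); /* current sandbox -> preempted (runnable)   */

/* Invoked from the periodic scheduling signal: decide whether the
 * currently running sandbox keeps the core. */
void edf_schedule(struct sandbox *current)
{
        uint64_t current_deadline = current->absolute_deadline;

        /* Prefer a runnable local sandbox with an earlier deadline; this
         * preemption always succeeds. */
        if (local_runqueue_peek_deadline() < current_deadline) {
                preempt_current_sandbox(local_runqueue_pop());
                return;
        }

        /* Otherwise race the other workers for an earlier global request. */
        if (global_request_queue_peek_deadline() < current_deadline) {
                struct sandbox *pulled = global_request_queue_pop_if_earlier(current_deadline);
                if (pulled != NULL)
                        preempt_current_sandbox(pulled); /* cases 1 and 3 */
                /* pulled == NULL is case 2: resume the current sandbox. */
        }
}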

5.3 Admissions Control

The goal of the admissions controller is to ensure the runtime does not accept requests when the associated module’s relative deadline cannot be met. The three main pieces of global state used to accomplish this are as follows:

• admissions_control_capacity - the capacity of the system. This assumes that the runtime has successfully pinned the workers to cores and suffers from negligible OS-level contention. By default, the system examines the core count of the host environment, dedicates the first core as a listener core, and executes all other cores as workers. The aggregate capacity is reduced by a configurable percentage (20% on the evaluation system) to account for runtime overhead. The core count, starting core ID, and overhead percentage are tunable runtime parameters intended to be sized to specific deployments. In the case of the evaluation system, these values were hand-tuned by the author, and questions of fine-grained overhead accounting and capacity planning are deferred as future work.

• admissions_control_admitted - the total active work admitted into the system.

• A specialized per-module performance window that provides sorted lookup in order of execution time over a sliding window of the last N executions.

The module specification provides a relative deadline, an expected execution time, and the target latency percentile (between 50% and 99%) that the admissions control system should use when making admissions decisions. Tuning this percentile expresses how conservative the system should be with regard to scheduling. Selecting 50% means that the admissions controller uses the median latency of previous executions to estimate execution time. Conversely, selecting a higher value, such as 99%, means that the admissions controller uses the tail latency of previous executions to estimate execution time. When the runtime receives a request, the target percentile of the associated module is used to index into the performance window containing previous execution times. This yields the admissions controller's estimate for how long this request will take to complete. If a module has not yet been executed, its associated performance window is empty, and the client-provided expected-execution-us value is instead used as a fallback. In either case, the admissions controller divides this execution time by the relative deadline of the module, deriving the instantaneous fraction of a processor core needed to complete this request by the target deadline. For example, if a request has a deadline of ten seconds and is estimated to take one second to complete, the instantaneous fraction would be 0.1. If this instantaneous fraction plus admissions_control_admitted is less than admissions_control_capacity, the workload is accepted, and admissions_control_admitted is increased by this instantaneous fraction. The amount of this increase is saved to the sandbox request, and when the request completes, admissions_control_admitted is decreased by this amount. Otherwise, the request is rejected and the runtime sends the client an HTTP 503 response. It is the responsibility of the client to determine how to respond to this rejection, but attempting to fail over to an alternate edge node is likely the answer. This research question is out of scope for SledgeEDF, but several other researchers have focused on questions of failover and cluster-level work distribution [10, 14, 16]. In order to provide a degree of temporal locality in admissions control estimates, during a sandbox's transition to the complete state, the duration that the sandbox spent in the running state is added to the associated module's performance window. This ensures that estimates of execution time always reflect the last N executions.
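Putting the pieces of this section together, the following self-contained sketch walks through the admission decision. The performance-window helper is hypothetical, and the capacity constant simply mirrors the evaluation configuration described above (15 worker cores with a 20% overhead reserve); the runtime's actual arithmetic and representations may differ.

#include <stdint.h>
#include <stdio.h>

/* Global state described in this section, expressed as doubles for clarity. */
static double admissions_control_capacity;  /* worker cores minus overhead reserve */
static double admissions_control_admitted;  /* sum of currently admitted fractions */

struct module_spec {
        uint32_t relative_deadline_us;
        uint32_t expected_execution_us;
        uint8_t  admissions_percentile;
};

/* Hypothetical stand-in for the per-module performance window: returns the
 * execution-time estimate (us) at the module's target percentile, or 0 if
 * the module has never run. */
static uint64_t perf_window_estimate_us(const struct module_spec *module)
{
        (void)module;
        return 0; /* pretend the module is unprofiled, forcing the fallback */
}

/* Returns the admitted fraction (> 0) on acceptance or 0 on rejection. The
 * caller stores the fraction in the sandbox request and subtracts it from
 * admissions_control_admitted when the request completes. */
static double admissions_control_try_admit(const struct module_spec *module)
{
        uint64_t estimate_us = perf_window_estimate_us(module);
        if (estimate_us == 0) estimate_us = module->expected_execution_us;

        double fraction = (double)estimate_us / (double)module->relative_deadline_us;
        if (admissions_control_admitted + fraction >= admissions_control_capacity)
                return 0.0; /* the runtime answers such requests with HTTP 503 */

        admissions_control_admitted += fraction;
        return fraction;
}

int main(void)
{
        /* 16-core host: one listener core, 15 workers, 20% overhead reserve. */
        admissions_control_capacity = 15.0 * 0.8;

        /* A 10s deadline with a 1s expected execution yields a fraction of
         * 0.1, as in the worked example above. */
        struct module_spec module = { .relative_deadline_us  = 10000000,
                                      .expected_execution_us = 1000000,
                                      .admissions_percentile = 70 };

        for (int i = 0; i < 200; i++) {
                if (admissions_control_try_admit(&module) == 0.0) {
                        printf("request %d rejected: %.1f of %.1f cores already admitted\n",
                               i, admissions_control_admitted, admissions_control_capacity);
                        break;
                }
        }
        return 0;
}

Run as-is, the sketch admits requests until the summed fractions reach the reserved capacity and then begins rejecting, which is the behavior evaluated in Section 6.2.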

5.4 SledgeEDF

By combining a deadline-aware admissions controller that is capable of preventing ex- cessive backpressure with a deadline-driven scheduler that optimizes the order in which accepted work is executed, SledgeEDF provides differentiated qualities of service in a mixed-criticality environment. While there are several key limitations around handling run- ning sandboxes that miss their deadlines, the initial functionality of SledgeEDF is sufficient to study the costs and benefits of a deadline-driven serverless approach.

Chapter 6: Evaluation

The following evaluations were run on Ubuntu 18.04 executing on a Dell Precision 7820 workstation with a 16 core 2.1 GHz Intel Xeon Silver 4216 processor, 16 GB of memory, and a 10G NIC. During all experiments, the processor was configured to run in performance mode with all cores executing at a constant 2.0 GHz. The SledgeEDF runtime was compiled using clang / LLVM version 8 set to optimization level 3 with link-time optimization enabled. Requests were run on a local network configured with static IPs over a 10G Netgear XS708Ev2 switch, but the network did not achieve 10G due to the client machine only having a 2.5G NIC.

6.1 Deadline-based Scheduling

The first experiment seeks to compare the performance of mixed-criticality workloads run on the FIFO and EDF scheduling variants supported by the SledgeEDF runtime. Both variants are run with admissions control disabled in order to understand the impact of EDF scheduling and preemption independently from admissions control. The two workloads used to construct a mixed-criticality environment are as follows:

• On port 10010, fibonacci(10), a function that calculates the 10th Fibonacci number, a short-running task that executes in 163µs with a relative deadline of 20ms. This simulates a soft-realtime request from an IOT device.

• On port 10040, fibonacci(40), a function that calculates the 40th Fibonacci number, a long-running task that executes in 1160625µs with a relative deadline of 300s. This deadline is sized to guarantee that any request to port 10010 preempts any request to port 10040. This roughly simulates a long-running edge computing task, such as compression of footage from nearby video cameras that is subsequently uploaded to cloud object storage.

In order to capture a baseline, SledgeEDF executes each workload independently in sequence, indicated by fib10 and fib40 in the tables below. This shows how the system performs when saturated by only requests to either port 10010 or 10040. After this, clients issue requests to both of these ports, resulting in a mixed-criticality environment with a mix of fibonacci(10) and fibonacci(40) requests. In the tables below, these concurrent workloads are indicated by fib10-con and fib40-con. In this environment, comparing fibonacci(10) latency across the two scheduling variants demonstrates the value of EDF over FIFO, and comparing throughput provides an approximate idea of how the additional work of EDF impacts throughput.

Payload     p50      p90      p99      p100      Deadline
fib10       8.3      12.6     30.7     3160.9*   20
fib10-con   6653.9   6662.1   6662.5   6662.6    20
fib40       6994.1   7768.7   7773.3   7775.0    20000
fib40-con   7491.0   7769.9   7786.1   8884.7    20000

Table 6.1. Request Latency (ms) under FIFO

Payload     p50      p90      p99      p100      Deadline
fib10       8.2      10.9     32.1     3134.9*   20
fib10-con   9.0      13.5     37.4     3126.9*   20
fib40       6998.0   7768.0   7773.3   7774.0    20000
fib40-con   7656.6   6866.6   8639.7   9989.6    20000

Table 6.2. Request Latency (ms) under EDF

Payload     FIFO   EDF
fib10       882    874
fib10-con   20     871
fib40       29     29
fib40-con   28     27

Table 6.3. Request Throughput

The results flagged with asterisks indicate unexpected results. Looking at the results for the two tests involving the quick-running fibonacci(10) sandbox, a batch of approximately 13,000 requests exhibits gradually increasing latency below 50ms and a second batch of approximately 100 requests has latency just above 3100ms. Given this unusual distribution, it is likely that an interrupt is not being properly disabled in a critical section, causing a context switch to occur while a sandbox is holding a critical resource such as the global request queue lock. Given that this is the author's first experience working in a codebase with inline assembly and context switches, this seems likely. However, given that this pattern is consistent across the EDF and FIFO scheduling variants, the author considers this bug orthogonal to demonstrating the value of the EDF scheduling variant. Under FIFO, the relative deadline value is unused, and the system is focused on providing maximum throughput. Requests to fibonacci(10) and fibonacci(40) proceed through the system in arbitrary interleavings. Given the longer execution time of fibonacci(40), these workloads end up saturating the workers, resulting in head-of-line blocking that causes the throughput and latency of fibonacci(10) to converge towards that of fibonacci(40). As shown in Table 6.1, this results in a substantial slowdown of the short-running fibonacci(10) requests relative to when they are the only workload executing in the system. For example, at the 50th percentile, round-trip latency increases from 8.3ms to 6653.9ms. Under EDF with the relative deadlines mentioned above, fibonacci(10) requests always preempt fibonacci(40). This means that fibonacci(10) requests execute nearly immediately, and fibonacci(40) requests only execute when all fibonacci(10) requests have been handled. As shown in Table 6.2, preemption ensures that the 50th percentile of short-running fibonacci(10) requests execute in 9.0ms, which is very close to the 8.2ms round-trip latency measured when they are the only workload executing in the system. In Table 6.3, the throughput of fibonacci(40) appears roughly comparable across both scheduling variants in both single and mixed-criticality environments. This is also true for fibonacci(10) under EDF. However, the throughput of fibonacci(10) under FIFO dropped precipitously from 882 requests when executing as the sole workload to 20 requests when executing concurrently in a mixed-criticality environment, reflecting head-of-line blocking.

6.2 Admissions Control

Admissions control is concerned with preventing runtime saturation and request backpressure. This is orthogonal to deadline-based scheduling, which is focused on the prioritization of requests within the system. In order to demonstrate the value of admissions control, this experiment issues 1,000 fibonacci(40) requests in parallel as a single batch with timeouts disabled, demonstrating how backpressure can result in missed deadlines and how admissions control can prevent it. The fibonacci(40) module is configured to have a relative deadline of 20 seconds, and the admissions percentile is tuned to the 70th percentile in order to conservatively provide a buffer that ensures all deadlines are met.

Admissions Control   p50       p90       p99       p100      Deadline
Enabled              7.7906    14.4373   15.5672   16.6792   20
Disabled             29.9798   53.3220   58.8047   58.8880   20

Table 6.4. Latency(s) based on Admissions Control

Admissions Control   Accepted Requests   Rejected Requests   Success %
Enabled                    206                 794             100%
Disabled                  1000                   0             33.2%

Table 6.5. Rejections and Deadline Success based on Admissions Control

When admissions control is disabled, fibonacci(40) requests enqueue faster than the runtime can execute them, causing backpressure. As shown in Table 6.4, in the most extreme case the runtime completes a request in 58.9 seconds, well over the relative deadline of 20 seconds. In contrast, when admissions control is enabled, the listener core rejects 794 of the 1,000 requests, as shown in Table 6.5. This reduces backpressure such that the longest-executing sandbox completes in 16.7 seconds. The 3.3 seconds of slack reflects the relatively conservative configuration of the module's admissions percentile at the 70th percentile of past executions.
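The following C sketch illustrates one plausible formulation of such an admissions check on the listener core, assuming the controller tracks the aggregate utilization of admitted but unfinished requests and estimates each request's cost from a configured percentile (here the 70th) of the module's past execution times. The names, the utilization-based test, and the worker-core bound are illustrative assumptions, not SledgeEDF's actual bookkeeping.

/*
 * Sketch of a utilization-based admissions check (illustrative only).
 * estimate_us is assumed to be the module's execution time at the
 * configured admissions percentile; relative_deadline_us comes from the
 * module configuration (e.g. 20 seconds for fibonacci(40) above).
 */
#include <stdbool.h>
#include <stdint.h>

#define WORKER_CORES 4 /* hypothetical number of worker cores */

static double admitted_utilization; /* sum over admitted, unfinished requests */

static bool
admissions_try_admit(uint64_t estimate_us, uint64_t relative_deadline_us)
{
    double request_utilization = (double)estimate_us / (double)relative_deadline_us;

    /* Reject the request if admitting it would overcommit the workers;
     * the client receives an immediate rejection instead of a late reply. */
    if (admitted_utilization + request_utilization > (double)WORKER_CORES)
        return false;

    admitted_utilization += request_utilization;
    return true;
}

/* On sandbox completion, the listener would subtract the same
 * request_utilization back out of admitted_utilization. */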

6.3 System Overheads

To understand the performance overheads of enabling admissions control and EDF-based scheduling, the final experiment ran a mixed-criticality workload against the four possible combinations of these two features. Because the previous experiment already demonstrated the admissions controller's ability to guard deadlines, this experiment was instead rate limited such that approximately half of all fibonacci(10) requests were rejected when the admissions controller was enabled. The experiment ran for sixty seconds, and pending requests that had not returned by that cutoff were ignored.

Admissions Control   EDF fib(10)   EDF fib(40)   FIFO fib(10)   FIFO fib(40)
Disabled                 10791          15             40             22
Enabled                  11007          14             26             21

Table 6.6. Throughput by Scheduling Policy and Admissions Control

Given the vast difference in execution time between fibonacci(10) and fibonacci(40), it is not immediately clear to the author how to interpret the raw counts in Table 6.6. Based on a log of sandbox executions, fibonacci(40) executes in 1,160,625µs and fibonacci(10) in approximately 163µs, so one fibonacci(40) request equates to roughly 7,120 fibonacci(10) requests (1,160,625 / 163 ≈ 7,120). Using this ratio to scale all results into fibonacci(10)-equivalents yields the following.

Admissions Control   EDF fib(10)   EDF Overhead   FIFO fib(10)   FIFO Overhead
Disabled                117591         24.95%         156680          0%
Enabled                 110687         29.35%         149546          4.55%

Table 6.7. Scaled Throughput (fibonacci(10)-equivalents) by Scheduling Policy and Admissions Control
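As a cross-check on this scaling, the short C program below recomputes the fibonacci(10)-equivalent totals and overhead percentages of Table 6.7 from the raw counts in Table 6.6, using the measured ratio of roughly 7,120 fibonacci(10) executions per fibonacci(40) execution and treating FIFO with admissions control disabled as the baseline.

/* Recomputes Table 6.7 from Table 6.6 (worked arithmetic check). */
#include <stdio.h>

int main(void)
{
    const long ratio = 7120; /* ~1,160,625us / 163us */

    /* Raw throughput from Table 6.6: { fib(10), fib(40) } */
    const long edf_off[2]  = { 10791, 15 };
    const long edf_on[2]   = { 11007, 14 };
    const long fifo_off[2] = { 40, 22 };
    const long fifo_on[2]  = { 26, 21 };

    long scaled_edf_off  = edf_off[0]  + edf_off[1]  * ratio; /* 117591 */
    long scaled_edf_on   = edf_on[0]   + edf_on[1]   * ratio; /* 110687 */
    long scaled_fifo_off = fifo_off[0] + fifo_off[1] * ratio; /* 156680, baseline */
    long scaled_fifo_on  = fifo_on[0]  + fifo_on[1]  * ratio; /* 149546 */

    double base = (double)scaled_fifo_off;
    printf("EDF,  AC disabled: %ld (%.2f%% overhead)\n",
           scaled_edf_off, 100.0 * (1.0 - scaled_edf_off / base));
    printf("EDF,  AC enabled:  %ld (%.2f%% overhead)\n",
           scaled_edf_on, 100.0 * (1.0 - scaled_edf_on / base));
    printf("FIFO, AC disabled: %ld (baseline)\n", scaled_fifo_off);
    printf("FIFO, AC enabled:  %ld (%.2f%% overhead)\n",
           scaled_fifo_on, 100.0 * (1.0 - scaled_fifo_on / base));
    return 0;
}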

However, given that the EDF scheduler executed on the order of 10,000 sandboxes while the FIFO scheduler executed fewer than 100, it is unclear how much of the overhead shown in Table 6.7 is merely caused by additional context switches. Some of this overhead may also stem from the suspected bug mentioned in the first experiment (an interrupt not being properly disabled in a critical section); given the high number of context switches in this experiment, such a bug would likely have disproportionately affected the EDF results. As such, the author does not have a high degree of confidence in the measured performance overheads of deadline-based scheduling. Given the tight resource constraints of edge systems, this aspect of the system requires a more thorough study.

Chapter 7: Conclusion

As demonstrated by previous work around Sledge, purpose-built serverless runtimes designed with edge infrastructure in mind exhibit improved latency compared with general-purpose serverless alternatives. SledgeEDF extends this research direction by applying traditional scheduling techniques associated with real-time systems to a serverless runtime. By applying a deadline-driven scheduler, SledgeEDF reduced the latency of a serverless function with tight requirements to within 10% of its optimal latency while under significant load from mixed-criticality workloads. By applying an admissions controller with well-tuned module configurations, the runtime demonstrated the ability to meet 100% of deadlines. Given the importance of low-latency guarantees in future edge systems, these attributes demonstrate the utility of further research on scheduling techniques in edge serverless systems.

7.1 Future Work

Possible directions for future work include:

1. Improving support for dynamic core throttling and heterogeneous cores.

2. Providing mechanisms to deal with sandboxes that miss their deadlines, such as rejecting them immediately or shifting them to an alternate best-effort scheduler that only executes when the system is otherwise idle.

3. Supporting stateful serverless execution via an integrated K-V store.

4. Supporting pipelines of serverless functions.

5. Studying improvements to the listener core, a possible performance bottleneck.

6. Expanding this single-process design to deadline-driven clustering.

Bibliography

[1] Alexandru Agache, Marc Brooker, Alexandra Iordache, Anthony Liguori, Rolf Neugebauer, Phil Piwonka, and Diana-Maria Popa. Firecracker: Lightweight Virtualization for Serverless Applications. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20), pages 419–434, 2020.

[2] Istemi Ekin Akkus, Ruichuan Chen, Ivica Rimac, Manuel Stein, Klaus Satzke, Andre Beck, Paarijaat Aditya, and Volker Hilt. SAND: Towards High-Performance Serverless Computing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18), pages 923–935, 2018.

[3] Ammar Albayati, Nor Fadzilah Abdullah, Asma Abu-Samah, Ammar Hussein Mutlag, and Rosdiadee Nordin. A serverless advanced metering infrastructure based on fog-edge computing for a smart grid: A comparison study for energy sector in Iraq. Energies, 13(20):5460, 2020.

[4] Bytecode Alliance. Cranelift Code Generator. https://github.com/bytecodealliance/wasmtime/tree/main/cranelift. Accessed: 2020-11-15.

[5] Bytecode Alliance. Lucet, 2020. https://github.com/fastly/lucet. Accessed: 2020-11-15.

[6] Bytecode Alliance. WebAssembly Micro Runtime, 2020. https://github.com/bytecodealliance/wasm-micro-runtime. Accessed: 2020-11-15.

[7] Apache Foundation. Apache OpenWhisk: Open Source Serverless Cloud Platform. http://openwhisk.apache.org/. Accessed: 2020-11-15.

[8] appcypher. WebAssembly Languages, 2020. https://github.com/appcypher/awesome-wasm-langs. Accessed: 2020-11-15.

[9] appcypher. WebAssembly Runtimes, 2020. https://github.com/appcypher/awesome-wasm-runtimes. Accessed: 2020-11-15.

[10] Austin Aske and Xinghui Zhao. Supporting multi-provider serverless computing on the edge. In Proceedings of the 47th International Conference on Parallel Processing Companion, pages 1–6, 2018.

[11] Lars Bak. Implementing Language-Based Virtual Machines. In Proceedings of the 11th annual international conference on Aspect-oriented Software Development Companion, pages 7–8, 2012.

[12] Ioana Baldini, Paul Castro, Perry Cheng, Stephen Fink, Vatche Ishakian, Nick Mitchell, Vinod Muthusamy, Rodric Rabbah, and Philippe Suter. Cloud-Native, Event-Based Programming for Mobile Applications. In Proceedings of the International Conference on Mobile Software Engineering and Systems, pages 287–288, 2016.

[13] Daniel Barcelona-Pons, Marc Sánchez-Artigas, Gerard París, Pierre Sutra, and Pedro García-López. On the FaaS Track: Building Stateful Distributed Applications with Serverless Architectures. In Proceedings of the 20th International Middleware Conference (Middleware 19), pages 41–54. ACM/IFIP, 2019.

[14] Luciano Baresi and Danilo Filgueira Mendonça. Towards a serverless platform for edge computing. In 2019 IEEE International Conference on Fog Computing (ICFC), pages 1–10. IEEE, 2019.

[15] Luciano Baresi, Danilo Filgueira Mendonça, and Martin Garriga. Empowering Low-Latency Applications Through a Serverless Edge Computing Architecture. In European Conference on Service-Oriented and Cloud Computing, pages 196–210. Springer, 2017.

[16] David Bermbach, Setareh Maghsudi, Jonathan Hasenburg, and Tobias Pfandzelter. Towards auction-based function placement in serverless fog platforms. In 2020 IEEE International Conference on Fog Computing (ICFC), pages 25–31. IEEE, 2020.

[17] Ramiro Berrelleza. WebAssembly + OpenFaaS The Universal Runtime for Serverless Functions. https://github.com/rberrelleza/openfaas-plus-webassembly. Accessed: 2020-11-16.

[18] Sol Boucher, Anuj Kalia, David G Andersen, and Michael Kaminsky. Putting the "Micro" Back in Microservice. In 2018 USENIX Annual Technical Conference (USENIX ATC 18), pages 645–650, 2018.

[19] Mary Branscombe. Azure Edge Zones: Microsoft’s Plan to Dominate Edge Computing and 5G. https://www.datacenterknowledge.com/microsoft/azure-edge-zones-microsoft-s-plan-dominate-edge-computing-and-5g. Accessed: 2020-11-16.

[20] Mary Branscombe. Azure Edge Zones: Microsoft’s Plan to Dominate Edge Computing and 5G. https://www.datacenterknowledge.com/microsoft/azure-edge-zones-microsoft-s-plan-dominate-edge-computing-and-5g. Accessed: 2020-11-16.

[21] Javier Cabrera Arteaga, Shrinish Donde, Jian Gu, Orestis Floros, Lucas Satabin, Benoit Baudry, and Martin Monperrus. Superoptimization of WebAssembly Bytecode. In Conference Companion of the 4th International Conference on Art, Science, and Engineering of Programming, pages 36–40, 2020.

[22] James Cadden, Thomas Unger, Yara Awad, Han Dong, Orran Krieger, and Jonathan Appavoo. SEUSS: Skip Redundant Paths to Make Serverless Fast. In Proceedings of the Fifteenth European Conference on Computer Systems (EuroSys 20), pages 1–15, 2020.

[23] Joao Carreira, Pedro Fonseca, Alexey Tumanov, Andrew Zhang, and Randy Katz. A Case for Serverless Machine Learning. In Workshop on Systems for ML and Open Source Software at NeurIPS, volume 2018, 2018.

[24] Joao Carreira, Pedro Fonseca, Alexey Tumanov, Andrew Zhang, and Randy Katz. Cirrus: a Serverless Framework for End-to-end ML Workflows. In Proceedings of the ACM Symposium on Cloud Computing, pages 13–24, 2019.

[25] David Chappell. Introducing the Azure Services Platform. Technical report, David Chappell and Associates, 2009.

[26] Y. Chen, Q. Feng, and W. Shi. An Industrial Robot System Based on Edge Computing: An Early Experience. In USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18), Boston, MA, 2018.

[27] Bin Cheng, Jonathan Fuerst, Gurkan Solmaz, and Takuya Sanada. Fog function: Serverless fog computing for data intensive iot services. In 2019 IEEE International Conference on Services Computing (SCC), pages 28–35. IEEE, 2019.

[28] Sharon Choy, Bernard Wong, Gwendal Simon, and Catherine Rosenberg. The Brewing Storm in Cloud Gaming: A Measurement Study on Cloud to End-User Latency. In 2012 11th Annual Workshop on Network and Systems Support for Games (NetGames), pages 1–6. IEEE, 2012.

[29] Claudio Cicconetti, Marco Conti, and Andrea Passarella. An architectural framework for serverless edge computing: design and emulation tools. In 2018 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), pages 48–55. IEEE, 2018.

[30] Dan Ciruli. WebAssembly brings extensibility to network proxies. https://opensource.googleblog.com/2020/03/webassembly-brings-extensibility-to. Accessed: 2020-11-16.

[31] Lin Clark. Announcing the Bytecode Alliance: Building a secure by default, composable future for WebAssembly. https://bytecodealliance.org/articles/announcing-the-bytecode-alliance. Accessed: 2020-11-15.

[32] Lin Clark. Standardizing WASI: A System Interface to Run WebAssembly Outside the Web, 2019. https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webassembly-system-interface/. Accessed: 2020-11-15.

[33] IBM Corp. IBM Cloud Functions, 2019. https://cloud.ibm.com/functions. Accessed: 2020-11-15.

[34] Katie Costello. The CIO’s Guide to Serverless Computing. Technical report, Gartner. https://www.gartner.com/smarterwithgartner/the-cios-guide-to-serverless-computing/. Accessed: 2020-11-15.

[35] Douglas Crockford. JavaScript: The Good Parts. O’Reilly Media, Inc., 2008.

[36] Chris Dickson. The Technology Behind A Low Latency Cloud Gaming Service. https://blog.parsecgaming.com/description-of-parsec-technology-b2738dcc3842. Accessed: 2020-11-15.

[37] Craig Disselkoen, John Renner, Conrad Watt, Tal Garfinkel, Amit Levy, and Deian Stefan. Position Paper: Progressive Memory Safety for WebAssembly. In Proceedings of the 8th International Workshop on Hardware and Architectural Support for Security and Privacy, HASP ’19, 2019.

[38] Alan Donovan, Robert Muth, Brad Chen, and David Sehr. PNaCl: Portable Native Client Executables. Technical report, Google, LLC, 2010.

[39] John R Douceur, Jeremy Elson, Jon Howell, and Jacob R Lorch. Leveraging Legacy Code to Deploy Desktop Applications on the Web. In OSDI, volume 8, pages 339–354, 2008.

[40] Aleksandar Dragojević, Dushyanth Narayanan, Miguel Castro, and Orion Hodson. FaRM: Fast Remote Memory. In 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), pages 401–414, 2014.

[41] Dong Du, Tianyi Yu, Yubin Xia, Binyu Zang, Guanglu Yan, Chenggang Qin, Qixuan Wu, and Haibo Chen. Catalyzer: Sub-millisecond Startup for Serverless Computing with Initialization-less Booting. In Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 20), pages 467–481, 2020.

[42] Steve Dugan. Exposing the ActiveX security model. InfoWorld, 19(20):98, 1997.

[43] LF Edge. Open Glossary of Edge Computing, Version 2.0. Technical report, The Linux Foundation, 2020.

[44] Brendan Eich. From ASM.js to WebAssembly.

[45] Alex Ellis. OpenFaaS: Serverless Functions, Made Simple. https://openfaas.com. Accessed: 2020-11-15.

[46] Facebook. React – A JavaScript library for building user interfaces, 2020. https://reactjs.org/. Accessed: 2020-11-15.

[47] Fastly, Inc. Fastly Expands Serverless Capabilities With the Launch of Compute@Edge. https://www.fastly.com/press/press-releases/fastly-expands-serverless-capabilities-launch-compute-edge. Accessed: 2020-11-15.

[48] Matthew Fisher. Introducing Krustlet, the WebAssembly Kubelet. https://deislabs.io/posts/introducing-krustlet/. Accessed: 2020-11-16.

[49] Cloud Native Compute Foundation. CNCF Cloud Native Definition v1.0. https://github.com/cncf/toc/blob/master/DEFINITION.md. Accessed: 2020-11-15.

[50] OpenJS Foundation. Electron: Build cross-platform desktop apps with JavaScript, HTML, and CSS, 2020. https://www.electronjs.org/. Accessed: 2020-11-15.

[51] Armando Fox, Rean Griffith, Anthony Joseph, Randy Katz, Andrew Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, et al. Above the Clouds: A Berkeley View of Cloud Computing. Technical report, University of California, Berkeley.

[52] Phani Kishore Gadepalli, Sean McBride, Gregor Peach, Ludmila Cherkasova, Rob Aitken, and Gabriel Parmer. Sledge: a Serverless-first Light-weight Wasm Runtime for the Edge. In Proceedings of the 21st International Middleware Conference (Middleware 20). ACM/IFIP, 2020.

[53] Phani Kishore Gadepalli, Gregor Peach, Ludmila Cherkasova, Rob Aitken, and Gabriel Parmer. Challenges and Opportunities for Efficient Serverless Computing at the Edge. In 2019 38th Symposium on Reliable Distributed Systems (SRDS), pages 261–2615. IEEE, 2019.

[54] James Gallagher. State of the Coding Bootcamp Market Report 2020. https://careerkarma.com/blog/bootcamp-market-report-2020/. Accessed: 2020-11-15.

[55] Fabian Gand, Ilenia Fronza, Nabil El Ioini, Hamid R Barzegar, and Claus Pahl. Serverless Container Cluster Management for Lightweight Edge Clouds. In CLOSER, pages 302–311, 2020.

[56] David Goltzsche, Manuel Nieke, Thomas Knauth, and Rüdiger Kapitza. AccTEE: A WebAssembly-based Two-way Sandbox for Trusted Resource Accounting. In Proceedings of the 20th International Middleware Conference (Middleware 19), pages 123–135, 2019.

[57] Google, LLC. Cloud Functions, 2020. https://cloud.google.com/functions/. Accessed: 2020-11-15.

[58] Peter Greenhalgh. big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7. Technical report, ARM Limited, 2011.

[59] WebAssembly Community Group. WebAssembly Text Format, 2019. https://webassembly.org/docs/text-format/. Accessed: 2020-11-15.

[60] Andreas Haas, Andreas Rossberg, Derek L. Schuff, Ben L. Titzer, Michael Holman, Dan Gohman, Luke Wagner, Alon Zakai, and JF Bastien. Bringing the Web Up to Speed with WebAssembly. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’17, 2017.

[61] Adam Hall and Umakishore Ramachandran. An Execution Model for Serverless Functions at the Edge. In Proceedings of the International Conference on Internet of Things Design and Implementation, pages 225–236, 2019.

[62] Scott Hanselman. JavaScript is Assembly Language for the Web: Sematic Markup is Dead! Clean vs. Machine-coded HTML. https://www.hanselman.com/blog/javascript-is-assembly-language-for-the-web-sematic-markup-is-dead-clean-vs-machinecoded-html. Accessed: 2020-11-15.

[63] Joseph M Hellerstein, Jose Faleiro, Joseph E Gonzalez, Johann Schleier-Smith, Vikram Sreekanti, Alexey Tumanov, and Chenggang Wu. Serverless Computing: One Step Forward, Two Steps Back. In CIDR, 2019.

[64] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C Arpaci-Dusseau, and Remzi H Arpaci-Dusseau. Serverless Computation with OpenLambda. In 8th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 16), 2016.

[65] Pat Hickey. Announcing Lucet: Fastly’s native WebAssembly compiler and runtime, 2019. https://www.fastly.com/blog/announcing-lucet-fastly-native-webassembly-compiler-runtime. Accessed: 2020-11-15.

[66] Pat Hickey. How Fastly and the developer community are investing in the WebAssembly ecosystem. https://www.fastly.com/es/blog/how-fastly-and-developer-community-invest-in-webassembly-ecosystem. Accessed: 2020-11-16.

[67] Kevin Hoffman. Introducing Waxosuit: A secure, cloud-native exosuit for WebAssembly. https://medium.com/@KevinHoffman/introducing-waxosuit-6ad754b48ed9. Accessed: 2020-11-16.

[68] Jon Howell, Bryan Parno, and John R Douceur. How to Run POSIX Apps in a Minimal Picoprocess. In 2013 USENIX Annual Technical Conference (USENIX ATC 13), pages 321–332, 2013.

[69] Jon Howell, Bryan Parno, John R Douceur, Mike Dahlin, Vitaly Shmatikov, Balachander Krishnamurthy, Walter Willinger, Mike Paleczny, Daniel Peek, Paul Saab, et al. Embassies: Radically Refactoring the Web. In 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), pages 529–545, 2013.

[70] Razin Farhan Hussain, Mohsen Amini Salehi, and Omid Semiari. Serverless Edge Computing for Green Oil and Gas Industry. In 2019 IEEE Green Technologies Conference (GreenTech), pages 1–4. IEEE, 2019.

[71] Iguazio, Ltd. Nuclio: Automate the Data Science Pipeline with Serverless Functions. https://nuclio.io. Accessed: 2020-11-15.

[72] Abhinav Jangda, Bobby Powers, Emery D Berger, and Arjun Guha. Not So Fast: Analyzing the Performance of WebAssembly vs. Native Code. In 2019 USENIX Annual Technical Conference (USENIX ATC 19), pages 107–120, 2019.

[73] Hyuk-Jin Jeong, Chang Hyun Shin, Kwang Yong Shin, Hyeon-Jae Lee, and Soo-Mook Moon. Seamless Offloading of Web App Computations From Mobile Device to Edge Clouds via HTML5 Web Worker Migration. In Proceedings of the ACM Symposium on Cloud Computing, pages 38–49, 2019.

[74] Jeremy Ashkenas et al. List of languages that compile to JS. https://github.com/jashkenas/coffeescript/wiki/List-of-languages-that-compile-to-JS. Accessed: 2020-11-17.

[75] Eric Jonas, Qifan Pu, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. Occupy the Cloud: Distributed Computing for the 99%. In Proceedings of the 2017 Symposium on Cloud Computing, pages 445–451, 2017.

[76] Eric Jonas, Johann Schleier-Smith, Vikram Sreekanti, Chia-Che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar, Joao Carreira, Karl Krauth, Neeraja Yadwadkar, et al. Cloud Programming Simplified: A Berkeley View on Serverless Computing. Technical report, University of California, Berkeley.

[77] Ram Srivatsa Kannan, Lavanya Subramanian, Ashwin Raju, Jeongseob Ahn, Jason Mars, and Lingjia Tang. GrandSLAm: Guaranteeing SLAs for Jobs in Microservices Execution Frameworks. In Proceedings of the Fourteenth EuroSys Conference 2019, pages 1–16, 2019.

[78] Ana Klimovic, Yawen Wang, Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, and Christos Kozyrakis. Pocket: Elastic Ephemeral Storage for Serverless Analytics. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pages 427–444, 2018.

[79] Ricardo Koller and Dan Williams. Will Serverless End the Dominance of Linux in the Cloud? In Proceedings of the 16th Workshop on Hot Topics in Operating Systems, HotOS ’17, 2017.

[80] Daniel Lehmann, Johannes Kinder, and Michael Pradel. Everything Old is New Again: Binary Security of WebAssembly. In 29th USENIX Security Symposium (USENIX Security 20), pages 217–234, 2020.

[81] Daniel Lehmann and Michael Pradel. Wasabi: A Framework for Dynamically Analyzing WebAssembly. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 1045–1058, 2019.

[82] Peter Levine. The End of Cloud Computing, 2019. https://a16z.com/2019/11/15/the-end-of-cloud-computing-2/. Accessed: 2020-11-15.

[83] Junfeng Li, Sameer G. Kulkarni, K. K. Ramakrishnan, and Dan Li. Understanding Open Source Serverless Platforms: Design Considerations and Performance. In Proceedings of the 5th International Workshop on Serverless Computing, WOSC ’19, 2019.

[84] Fan Liu, Ajit Narayanan, and Quan Bai. Real-time systems. 2000.

[85] Wes Lloyd, Shruti Ramesh, Swetha Chinthalapati, Lan Ly, and Shrideep Pallickara. Serverless Computing: An Investigation of Factors Influencing Microservice Performance. In IEEE International Conference on Cloud Engineering (IC2E 18), 2018.

[86] Stephen McCamant and Greg Morrisett. Evaluating SFI for a CISC Architecture. In USENIX Security Symposium, 2006.

[87] Garrett McGrath and Paul R Brenner. Serverless computing: Design, implementation, and performance. In 2017 IEEE 37th International Conference on Distributed Computing Systems Workshops (ICDCSW), pages 405–410. IEEE, 2017.

[88] Dirk Merkel. Docker: Lightweight Linux Containers for Consistent Development and Deployment. Linux Journal, 2014(239):2, 2014.

[89] Juan Anibal Micheli et al. Awesome Serverless: Frameworks, 2020. https://github.com/anaibol/awesome-serverless#frameworks. Accessed: 2020-11-15.

[90] Microsoft Corporation. Azure Functions. https://azure.microsoft.com/en-us/services/functions/. Accessed: 2020-11-15.

[91] Microsoft Corporation. Xbox Game Pass cloud gaming, 2020. https://www.xbox.com/en-US/xbox-game-pass/cloud-gaming. Accessed: 2020-11-15.

[92] Sunil Mohanty, Gopika Premsankar, and Mario Di Francesco. An Evaluation of Open Source Serverless Computing Frameworks. In IEEE 10th International Conference on Cloud Computing Technology and Science, (CloudCom 18), 2018.

[93] Greg Morrisett, David Walker, Karl Crary, and Neal Glew. From System F to Typed Assembly Language. ACM Transactions on Programming Languages and Systems (TOPLAS), 21(3):527–568, 1999.

[94] John Gregory Morrisett, Karl Crary, Neal Glew, and David Walker. Stack-Based Typed Assembly Language. Journal of Functional Programming, 2003.

[95] James Murty. Programming Amazon Web Services: S3, EC2, SQS, FPS, and SimpleDB. O’Reilly Media, Inc., 2008.

[96] Shravan Narayan, Craig Disselkoen, Tal Garfinkel, Nathan Froyd, Eric Rahm, Sorin Lerner, Hovav Shacham, and Deian Stefan. Retrofitting Fine Grain Isolation in the Firefox Renderer. In 29th USENIX Security Symposium (USENIX Security 20), 2020.

[97] Stefan Nastic, Thomas Rausch, Ognjen Scekic, Schahram Dustdar, Marjan Gusev, Bojana Koteska, Magdalena Kostoska, Boro Jakimovski, Sasko Ristov, and Radu Prodan. A Serverless Real-Time Data Analytics Platform for Edge Computing. IEEE Internet Computing, 21(4):64–71, 2017.

[98] Edward Oakes, Leon Yang, Dennis Zhou, Kevin Houck, Tyler Harter, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. SOCK: Rapid task provisioning with serverless-optimized containers. In 2018 USENIX Annual Technical Conference (USENIX ATC 18), pages 57–70, 2018.

[99] Loudoun County Department of Economic Development. Loudoun Virginia Economic Development: Data Centers, 2020. https://biz.loudoun.gov/key-business-sectors/data-centers/. Accessed: 2020-11-15.

[100] Amy Ousterhout, Joshua Fried, Jonathan Behrens, Adam Belay, and Hari Balakrishnan. Shenango: Achieving High CPU Efficiency for Latency-sensitive Datacenter Workloads. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), pages 361–378, 2019.

[101] Andrei Palade, Aqeel Kazmi, and Siobhán Clarke. An Evaluation of Open Source Serverless Computing Frameworks Support at the Edge. In 2019 IEEE World Congress on Services (SERVICES), volume 2642, pages 206–211. IEEE, 2019.

[102] David Pogue. Ultimate iPhone FAQs List, Part 2. https://pogue.blogs.nytimes.com/2007/01/13/ultimate-iphone-faqs-list-part-2/. Accessed: 2020-11-17.

[103] Bobby Powers, John Vilk, and Emery D Berger. Browsix: Bridging the Gap Between UNIX and the Browser. ACM SIGPLAN Notices, 52(4):253–266, 2017.

[104] George Prekas, Marios Kogias, and Edouard Bugnion. ZygOS: Achieving Low Tail Latency for Microsecond-Scale Networked Tasks. In Proceedings of the 26th Symposium on Operating Systems Principles, 2017.

[105] Daniel Price and Andrew Tucker. Solaris Zones: Operating System Support for Consolidating Commercial Workloads. In LISA, volume 4, pages 241–254, 2004.

[106] John Rauser. O’Reilly Velocity: TCP and The Lower Bound of Web Performance, 2015. https://www.youtube.com/watch?v=C8orjQLacTo. Accessed: 2020-11-15.

[107] Salim S Salim, Andy Nisbet, and Mikel Luján. TruffleWasm: a WebAssembly interpreter on GraalVM. In Proceedings of the 16th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 88–100, 2020.

[108] Mahadev Satyanarayanan, Paramvir Bahl, Ramón Caceres, and Nigel Davies. The Case for VM-Based Cloudlets in Mobile Computing. IEEE Pervasive Computing, 8(4):14–23, 2009.

[109] Andrew Scheidecker et al. WAVM: WAVM is a WebAssembly virtual machine, designed for use in non-web applications, 2019. https://wavm.github.io/. Accessed: 2020-11-15.

[110] Serverless, Inc. Serverless Framework: zero-friction serverless development. https://www.serverless.com/. Accessed: 2020-11-15.

[111] Amazon Web Services. AWS Lambda, 2019. https://aws.amazon.com/lambda/. Accessed: 2020-11-15.

[112] Amazon Web Services. Security Overview of AWS Lambda, 2019. https://d1.awsstatic.com/whitepapers/Overview-AWS-Lambda-Security.pdf. Accessed: 2020-11-15.

[113] Amazon Web Services. Lambda@Edge, 2020. https://aws.amazon.com/lambda/edge/. Accessed: 2020-11-17.

[114] Charles Severance. JavaScript: Designing a Language in 10 Days. Computer, 45(2):7–8, 2012.

[115] Mohammad Shahrad, Jonathan Balkind, and David Wentzlaff. Architectural Implications of Function-as-a-Service Computing. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, pages 1063–1075, 2019.

[116] W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu. Edge Computing: Vision and Challenges. IEEE Internet of Things Journal, 3(5):637–646, Oct 2016.

[117] Simon Shillaker and Peter Pietzuch. Faasm: Lightweight Isolation for Efficient Stateful Serverless Computing. In 2020 USENIX Annual Technical Conference (USENIX ATC 20), pages 419–433, 2020.

[118] Lachlan Sneff et al. Nebulet on GitHub. https://github.com/nebulet/nebulet. Accessed: 2020-11-15.

[119] Michael Snoyman. Serverless Rust using WASM and Cloudflare, 2019. https://tech.fpcomplete.com/blog/serverless-rust-wasm-cloudflare. Accessed: 2020-11-15.

[120] Solarwinds Pingdom. Benchmarking CDNs: CloudFront, Cloudflare, Fastly, and Google Cloud. https://www.pingdom.com/blog/benchmarking-cdns-cloudfront-cloudflare-fastly-and-google-cloud/. Accessed: 2020-11-17.

[121] Vikram Sreekanti, Chenggang Wu, Xiayue Charles Lin, Jose M Faleiro, Joseph E Gonzalez, Joseph M Hellerstein, and Alexey Tumanov. Cloudburst: Stateful Functions-as-a-Service. In Proceedings of the VLDB Endowment, volume 12, pages 2438–2452. VLDB Endowment, 2019.

[122] GW Systems. aWsm, 2020. https://github.com/gwsystems/aWsm. Accessed: 2020-11-15.

[123] Aron Szanto, Timothy Tamm, and Artidoro Pagnoni. Taint Tracking for WebAssembly. arXiv preprint arXiv:1807.08349, 2018.

[124] Kenton Varda. WebAssembly on Cloudflare Workers. https://blog.cloudflare.com/webassembly-on-cloudflare-workers/. Accessed: 2020-11-15.

[125] Ricard Vilalta, Victor López, Alessio Giorgetti, Shuping Peng, Vittorio Orsini, Luis Velasco, Rene Serral-Gracia, Donal Morris, Silvia De Fina, Filippo Cugini, et al. TelcoFog: A Unified Flexible Fog and Cloud Computing Architecture for 5G Networks. IEEE Communications Magazine, 55(8):36–43, 2017.

[126] Mario Villamizar, Oscar Garces, Lina Ochoa, Harold Castro, Lorena Salamanca, Mauricio Verano, Rubby Casallas, Santiago Gil, Carlos Valencia, Angee Zambrano, et al. Infrastructure Cost Comparison of Running Web Applications in the Cloud Using AWS Lambda and Monolithic and Microservice Architectures. In 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 16), pages 179–182. IEEE, 2016.

[127] Luke Wagner. WebAssembly consensus and end of Browser Preview, 2017. https://lists.w3.org/Archives/Public/public-webassembly/2017Feb/0002.html. Accessed: 2020-11-15.

[128] Liang Wang, Mengyuan Li, Yinqian Zhang, Thomas Ristenpart, and Michael Swift. Peeking Behind the Curtains of Serverless Platforms. In 2018 USENIX Annual Technical Conference (USENIX ATC 18), pages 133–146, 2018.

[129] Wasmer, Inc. Wasmer: Run any code on any client with WebAssembly and Wasmer, 2020. https://wasmer.io/. Accessed: 2020-11-15.

[130] Robert N. M. Watson, Jonathan Anderson, Ben Laurie, and Kris Kennaway. Capsicum: Practical Capabilities for UNIX. In USENIX Security Symposium, volume 46, page 2, 2010.

[131] Conrad Watt, Andreas Rossberg, and Jean Pichon-Pharabod. Weakening WebAssembly. Proceedings of the ACM International Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA 19), pages 1–28, 2019.

[132] White Paper of Edge Computing Consortium. Technical report, Edge Computing Consortium, 2017. https://www.iotaustralia.org.au/wp-content/uploads/2017/01/White-Paper-of-Edge-Computing-Consortium.pdf. Accessed: 2020-11-15.

[133] Bennet Yee, David Sehr, Gregory Dardyk, J Bradley Chen, Robert Muth, Tavis Ormandy, Shiki Okasaka, Neha Narula, and Nicholas Fullagar. Native Client: A Sandbox for portable, untrusted x86 Native Code. In 30th IEEE Symposium on Security and Privacy (IEEE SP 09), pages 79–93. IEEE, 2009.

[134] Shanhe Yi, Zijiang Hao, Qingyang Zhang, Quan Zhang, Weisong Shi, and Qun Li. LAVEA: Latency-aware Video Analytics on Edge Computing. In Proceedings of the Second ACM/IEEE Symposium on Edge Computing, pages 1–13, 2017.

[135] Wei Yu, Fan Liang, Xiaofei He, William Grant Hatcher, Chao Lu, Jie Lin, and Xinyu Yang. A Survey on the Edge Computing for the Internet of Things. IEEE access, 6:6900–6919, 2017.

[136] Alon Zakai. Emscripten: an LLVM-to-JavaScript compiler. In Proceedings of the ACM International Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA 11), pages 301–312, 2011.

[137] Tian Zhang, Dong Xie, Feifei Li, and Ryan Stutsman. Narrowing the Gap Between Serverless and its State with Storage Functions. In Proceedings of the ACM Symposium on Cloud Computing (SoCC 19), pages 1–12. ACM, 2019.
