SONIC: Application-Aware Data Passing for Chained Serverless Applications
Ashraf Mahgoub (a), Karthick Shankar (b), Subrata Mitra (c), Ana Klimovic (d), Somali Chaterji (a), Saurabh Bagchi (a)
(a) Purdue University, (b) Carnegie Mellon University, (c) Adobe Research, (d) ETH Zurich

Abstract

Data analytics applications are increasingly leveraging serverless execution environments for their ease of use and pay-as-you-go billing. Increasingly, such applications are composed of multiple functions arranged in some workflow. However, the current approach of exchanging intermediate (ephemeral) data between functions through remote storage (such as S3) introduces significant performance overhead. We show that there are three alternative data-passing methods, which we call VM-Storage, Direct-Passing, and state-of-practice Remote-Storage. Crucially, we show that no single data-passing method prevails under all scenarios and the optimal choice depends on dynamic factors such as the size of input data, the size of intermediate data, the application's degree of parallelism, and network bandwidth. We propose SONIC, a data-passing manager that optimizes application performance and cost by transparently selecting the optimal data-passing method for each edge of a serverless workflow DAG and implementing communication-aware function placement. SONIC monitors application parameters and uses simple regression models to adapt its hybrid data passing accordingly. We integrate SONIC with OpenLambda and evaluate the system on Amazon EC2 with three analytics applications popular in the serverless environment. SONIC provides lower latency (raw performance) and higher performance/$ across diverse conditions, compared to four different baselines: SAND [UsenixATC-18], vanilla OpenLambda [HotCloud-16], OpenLambda integrated with Pocket [OSDI-18], and AWS Lambda (the state of practice).

1 Introduction

Serverless computing platforms provide on-demand scalability and fine-grained allocation of resources. In this computing model, the cloud provider runs the servers and manages all the administrative tasks (e.g., scaling, capacity planning), while users focus on their application logic. Due to its significant advantages, serverless computing is becoming increasingly popular for complex workflows such as data-processing pipelines [40, 56], machine-learning pipelines [14, 59], and video analytics [6, 27]. Accordingly, major cloud computing providers recently introduced serverless workflow services such as AWS Step Functions [5], Azure Durable Functions [48], and Google Cloud Composer [29], which provide easier design and orchestration for serverless workflow applications. Serverless workflows are composed of a sequence of execution stages, which can be represented as a directed acyclic graph (DAG) [21, 56]. DAG nodes correspond to serverless functions (or λs¹) and edges represent the flow of data between dependent λs (e.g., our video analytics application DAG is shown in Fig. 1).

¹For simplicity, we denote a serverless function as lambda or λ.

Exchanging intermediate data between serverless functions has been identified as a major challenge in serverless workflows [39, 41, 50]. The reason is that, by design, the IP addresses and port numbers of individual λs are not exposed to users, making direct point-to-point communication difficult. Moreover, serverless platforms provide no guarantees of overlap in time between the executions of the parent (sending) and child (receiving) functions. Hence, the state-of-practice technique for data passing between serverless functions is saving and loading data through remote storage (e.g., S3). Although passing intermediate data through remote storage has the benefit of a clear separation between compute and storage resources, it adds significant performance overhead, especially for data-intensive applications [50]. For example, Pu et al. [56] show that running the CloudSort benchmark with 100 TB of data on AWS Lambda with S3 remote storage can be up to 500× slower than running on a cluster of VMs. Our own experiment with a machine-learning pipeline that has a simple linear workflow shows that passing data through remote storage takes over 75% of the execution time (Fig. 2, fanout = 1). Previous approaches target reducing this overhead, either by replacing disk-based storage (e.g., S3) with memory-based storage (e.g., ElastiCache Redis), or by combining different storage media (e.g., DRAM, SSD, NVMe) to match application needs [15, 42, 56]. However, these approaches still require passing the data over the network multiple times, adding significant latency to the job's execution time. Moreover, in-memory storage services are much more expensive than disk-based storage (e.g., ElastiCache Redis costs 700× as much as S3 per unit of data volume [56]).

Figure 1: DAG overview (DAG definition provided by the user and our profiled DAG parameters) for the Video Analytics application. Split_Video (λ11): memory = 40 MB, exec-time = 2 s; Extract_Frame (λ21 ... λ2N): memory = 55 MB, exec-time = 0.3 s; Classify_Frame (λ31 ... λ3N): memory = 500 MB, exec-time = 0.7 s. Video input size = 15 MB, video chunk size = 1 MB, frame size = 0.1 MB, fan-out degree N = 19. Input is read from, and final output written to, long-term storage (S3).

Figure 2: Execution-time comparison with Remote-Storage, VM-Storage, and Direct-Passing for the LightGBM application with fanout = 1, 3, 12. The best data-passing method differs in every case.

In our work, we show that the cloud provider can implement communication-aware placement of lambdas by optimizing data exchange between chained functions in DAGs. For instance, the cloud provider can leverage data locality by scheduling the sending and receiving functions on the same VM, while preserving local disk state between the two invocations. This data-passing mechanism, which we refer to as VM-Storage, minimizes data-exchange latency but imposes constraints on where the lambdas can be scheduled. Alternately, the cloud provider can enable data exchange between functions on different VMs by directly copying intermediate data between the VMs that host the sending and receiving lambdas. This data-passing mechanism, which we refer to as Direct-Passing, requires copying intermediate data elements once, serving as a middle ground between VM-Storage (which requires no data copying) and Remote-Storage (which requires two copies of the data). Each data-passing mechanism provides a trade-off between latency, cost, scalability, and scheduling flexibility. Crucially, no single mechanism prevails across all serverless applications with different data dependencies. For example, while Direct-Passing does not impose strict scheduling constraints, scalability can become an issue when a large number of receiving functions copy data simultaneously, saturating the sending VM's outgoing network bandwidth. Hence, we need a hybrid, fine-grained approach that optimizes the data passing for every edge of the application DAG.

Our solution: We propose SONIC, a unified API and system for inter-lambda data exchange, which adopts a hybrid approach of the three data-passing methods (VM-Storage, Direct-Passing, and Remote-Storage) to optimize application performance and cost. SONIC decides the best data-passing method for every edge in the application DAG to minimize data-passing latency and cost. We show that this selection can be impacted by several parameters, such as the size of the input, the application's degree of parallelism, and VM network bandwidth. SONIC adapts its decision dynamically to changes in these parameters in an application-specific manner. The workflow of SONIC is shown in Fig. 3.

We note that a locally optimized data-passing decision for a given stage can become sub-optimal for the entire DAG. As an example, placing two functions together in the same VM and using VM-Storage will reduce the data-passing time between these two stages, but if the second function generates a large volume of intermediate data, then the cost of transporting that data to a different VM may be high. Therefore, SONIC develops a Viterbi-based algorithm to avoid greedy data-passing decisions and provide optimized latency and cost across the entire DAG. SONIC abstracts its hybrid data-passing selection and provides the user with a simple file-based API. With SONIC's API, users always read and write data as files to storage that appears local to the user. SONIC is designed to be integrated with a cluster resource manager (e.g., Protean [34]) that assigns VM requests to the physical hardware and optimizes provider-centric metrics such as resource utilization and load balancing (Fig. 4).

As our approach requires integration with a serverless platform (and commercial platforms do not allow for such implementation), we integrate SONIC with OpenLambda [36]. We compare SONIC's performance using three popular analytics applications against several baselines: AWS Lambda with S3 and ElastiCache-Redis (which can be taken to represent the state of the practice); SAND [2]; and OpenLambda with S3 and Pocket [42]. Our evaluation shows that SONIC outperforms all baselines. SONIC achieves 34% to 158% higher performance/$ (where performance is the inverse of latency) over OpenLambda+S3, 59% to 2.3× higher over OpenLambda+Pocket, and 1.9× to 5.6× higher over SAND, a serverless platform that leverages data locality to minimize execution time.

In summary, our contributions are as follows:
(1) We analyze the trade-offs of three different intermediate data-passing methods (Fig. 5) in serverless workflows and show that no single method prevails under all conditions in both latency and cost. This motivates the need for our hybrid and dynamic approach.
(2) We propose SONIC, which provides selection of optimized data-passing methods between any two serverless functions

Figure 3: Workflow of SONIC: Users provide a DAG definition and input data files for profiling. SONIC's online profiler executes the DAG and generates regression models that map the input size to the DAG's parameters.
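To make the per-edge trade-off concrete, the selection can be pictured as a small cost model: estimate each method's transfer latency from the intermediate data size, the fan-out degree, and the available bandwidth, then pick the cheapest feasible method. This is an illustrative sketch only; the function names, bandwidth figures, and linear models are our own assumptions, not SONIC's actual profiler or its measured parameters.

```python
# Hypothetical sketch of per-edge data-passing selection (not SONIC's
# real implementation). Bandwidths are illustrative assumptions.

def estimate_latencies(data_mb, fanout, vm_net_mbps=125.0, s3_mbps=60.0):
    """Return an estimated transfer latency (seconds) for each method."""
    return {
        # VM-Storage: sender and receivers share one VM, so no network
        # copy is needed, but scheduling is pinned to that VM.
        "VM-Storage": 0.0,
        # Direct-Passing: one network copy; all receivers share the
        # sending VM's outgoing bandwidth, so latency grows with fanout.
        "Direct-Passing": data_mb * fanout / vm_net_mbps,
        # Remote-Storage: two copies (upload to S3, then download), but
        # the storage service absorbs the fan-out on the download side.
        "Remote-Storage": 2 * data_mb / s3_mbps,
    }

def best_method(data_mb, fanout, vm_fits_all=True, **kw):
    est = estimate_latencies(data_mb, fanout, **kw)
    if not vm_fits_all:
        # VM-Storage is infeasible if the receivers cannot be co-located.
        del est["VM-Storage"]
    return min(est, key=est.get)
```

Even in this toy model, the winner flips with the parameters: co-location wins when it is feasible, Direct-Passing wins at low fan-out, and Remote-Storage wins once many receivers would saturate the sender's outgoing link, mirroring the paper's observation that no single method prevails.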
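The DAG-wide optimization can be illustrated with a Viterbi-style dynamic program over a chain of edges: rather than greedily picking the cheapest method per edge, track the best cumulative cost for each (edge, method) pair, including a penalty for unfavorable consecutive choices. The cost table and penalty below are invented for illustration; SONIC's actual algorithm operates on the full DAG with profiled parameters.

```python
# Illustrative Viterbi-style dynamic program over a linear chain of DAG
# edges (a simplification of SONIC's DAG-wide search). All numbers and
# the penalty function are made-up examples, not values from the paper.

METHODS = ["VM-Storage", "Direct-Passing", "Remote-Storage"]

def viterbi_passing_plan(edge_costs, pair_penalty):
    """edge_costs[i][m]: standalone cost of method m on edge i.
    pair_penalty(prev_m, m): extra cost when consecutive edges use
    (prev_m, m), e.g. forcing three stages onto one VM.
    Returns (total_cost, plan), one method per edge."""
    best = {m: (edge_costs[0][m], [m]) for m in METHODS}
    for i in range(1, len(edge_costs)):
        nxt = {}
        for m in METHODS:
            # Best cumulative (cost, plan) ending with method m on edge i.
            nxt[m] = min(
                (cost + pair_penalty(pm, m) + edge_costs[i][m], plan + [m])
                for pm, (cost, plan) in best.items()
            )
        best = nxt
    return min(best.values())
```

In the test below, a greedy pass would pick VM-Storage on both edges (cost 1 + 1) and then pay a penalty of 5 for stacking both stages on one VM (total 7); the dynamic program instead mixes methods for a total of 3, which is the kind of greedy trap the DAG-wide search avoids.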
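The file-based API can be pictured as follows: a λ's code reads inputs and writes outputs as if they were local files, and the data-passing layer behind the context decides whether the bytes actually travel via VM-Storage, Direct-Passing, or Remote-Storage. The context class and `split_video` function below are hypothetical stand-ins, not SONIC's real API.

```python
class LocalCtx:
    """Toy in-memory stand-in for the data-passing layer. In a real
    deployment, read/write would be backed by VM-local disk, a direct
    VM-to-VM copy, or remote storage, chosen per DAG edge."""
    def __init__(self, files=None):
        self.files = dict(files or {})

    def read(self, name):
        return self.files[name]

    def write(self, name, data):
        self.files[name] = data

def split_video(ctx, chunk_bytes=1_000_000):
    """First stage of the video-analytics DAG (Fig. 1): split the input
    video into ~1 MB chunks for the Extract_Frame fan-out stage."""
    data = ctx.read("input.mp4")  # looks like a plain local file read
    for off in range(0, len(data), chunk_bytes):
        ctx.write(f"chunk_{off // chunk_bytes}.mp4",
                  data[off:off + chunk_bytes])
```

The point of the abstraction is that `split_video` never names a VM, an IP address, or an S3 bucket; the same function code runs unchanged whichever data-passing method the manager selects for the edge.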