SONIC: Application-Aware Data Passing for Chained Serverless Applications
Ashraf Mahgoub (Purdue University), Karthick Shankar (Carnegie Mellon University), Subrata Mitra (Adobe Research), Ana Klimovic (ETH Zurich), Somali Chaterji (Purdue University), Saurabh Bagchi (Purdue University)

Abstract

Data analytics applications are increasingly leveraging serverless execution environments for their ease of use and pay-as-you-go billing. The structure of such applications is usually composed of multiple functions that are chained together to form a workflow. The current approach of exchanging intermediate (ephemeral) data between functions is through remote storage (such as S3), which introduces significant performance overhead. We compare three data-passing methods, which we call VM-Storage, Direct-Passing, and state-of-practice Remote-Storage. Crucially, we show that no single data-passing method prevails under all scenarios and the optimal choice depends on dynamic factors such as the size of input data, the size of intermediate data, the application's degree of parallelism, and network bandwidth. We propose SONIC, a data-passing manager that optimizes application performance and cost by transparently selecting the optimal data-passing method for each edge of a serverless workflow DAG and implementing communication-aware function placement. SONIC monitors application parameters and uses simple regression models to adapt its hybrid data passing accordingly. We integrate SONIC with OpenLambda and evaluate the system on Amazon EC2 with three analytics applications that are popular in serverless environments. SONIC provides lower latency (raw performance) and higher performance/$ across diverse conditions, compared to four baselines: SAND, vanilla OpenLambda, OpenLambda with Pocket, and AWS Lambda.

1 Introduction

Serverless computing platforms provide on-demand scalability and fine-grained resource allocation. In this computing model, cloud providers run the servers and manage all administrative tasks (e.g., scaling, capacity planning), while users focus on the application logic. Due to its elasticity and ease-of-use advantages, serverless computing is becoming increasingly popular for advanced workflows such as data processing pipelines [35,53], machine learning pipelines [11,55], and video analytics [5,22,64]. Major cloud providers recently introduced serverless workflow services such as AWS Step Functions [4], Azure Durable Functions [45], and Google Cloud Composer [24], which provide easier design and orchestration for serverless workflow applications. Serverless workflows are composed of a sequence of execution stages, which can be represented as a directed acyclic graph (DAG) [17,53]. DAG nodes correspond to serverless functions (or λs¹) and edges represent the flow of data between dependent λs (e.g., the DAG of our video analytics application is shown in Fig. 1).

¹For simplicity, we denote a serverless function as lambda or λ.

Exchanging intermediate data between serverless functions is a major challenge in serverless workflows [34,36,47]. By design, IP addresses and port numbers of individual λs are not exposed to users, making direct point-to-point communication difficult. Moreover, serverless platforms provide no guarantees for the overlap in time between the executions of the parent (sending) and child (receiving) functions. Hence, the state-of-practice technique for data passing between serverless functions is saving and loading data through remote storage (e.g., S3). Although passing intermediate data through remote storage has the benefit of cleanly separating compute and storage resources, it adds significant performance overheads, especially for data-intensive applications [47]. For example, Pu et al. [53] show that running the CloudSort benchmark with 100 TB of data on AWS Lambda with S3 remote storage can be up to 500× slower than running on a cluster of VMs. Our own experiment with a machine learning pipeline that has a simple linear workflow shows that passing data through remote storage takes over 75% of the computation time (Fig. 2, fanout = 1).
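To make the state-of-practice pattern concrete, the sketch below shows (in outline, not as code from the paper) how a parent and child lambda in a pipeline like Fig. 1 could exchange a chunk through S3 with boto3. The bucket name, keys, and handler logic are hypothetical; the point is that every DAG edge crosses the network twice, once on put and once on get.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "intermediate-data-bucket"  # hypothetical bucket name

def split_video(video_bytes: bytes, n: int = 3) -> list:
    """Stand-in for the real Split_Video logic: cut the input into n chunks."""
    step = max(1, len(video_bytes) // n)
    return [video_bytes[i:i + step] for i in range(0, len(video_bytes), step)]

def parent_handler(event, context):
    """Parent lambda: writes every intermediate chunk to S3 (first network copy)."""
    video = s3.get_object(Bucket=BUCKET, Key=event["video_key"])["Body"].read()
    chunks = split_video(video)
    for i, chunk in enumerate(chunks):
        s3.put_object(Bucket=BUCKET, Key=f"chunks/{i}", Body=chunk)
    return {"num_chunks": len(chunks)}

def child_handler(event, context):
    """Child lambda: reads its chunk back from S3 (second network copy).
    The intermediate data thus crosses the network twice per DAG edge."""
    chunk = s3.get_object(Bucket=BUCKET, Key=f"chunks/{event['chunk_id']}")["Body"].read()
    return {"chunk_bytes": len(chunk)}  # real code would extract/classify frames here
```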
Figure 1: DAG overview (DAG definition provided by the user and our profiled DAG parameters) for the Video Analytics application. [Figure: a DAG with fan-out degree N = 19: Start → Split_Video (λ1: memory = 40 MB, exec-time = 2 s; video_chunk_size = 1 MB) → Extract_Frame (λ2,1 … λ2,N: memory = 55 MB, exec-time = 0.3 s; frame_size = 0.1 MB) → Classify_Frame (λ3,1 … λ3,N: memory = 500 MB, exec-time = 0.7 s) → End, with intermediate data exchanged between stages and inputs/outputs kept in long-term storage (S3).]

Figure 2: Execution time comparison of Remote-Storage, VM-Storage, and Direct-Passing for the LightGBM application with fanout = 1, 3, 12. The best data passing method differs in every case.

Previous approaches reduce this overhead by implementing exchange operators optimized for object storage [47,52], replacing disk-based object storage (e.g., S3) with memory-based storage (e.g., ElastiCache Redis), or combining different storage media (e.g., DRAM, SSD, NVMe) to match application needs [12,37,53]. However, these approaches still require passing data over the network multiple times, adding latency. Moreover, in-memory storage services are much more expensive than disk-based storage (e.g., ElastiCache Redis costs 700× more than S3 per GB [53]).

We show how cloud providers can optimize data exchange between chained functions in a serverless DAG workflow with communication-aware placement of lambdas. For instance, the cloud provider can leverage data locality by scheduling the sending and receiving functions on the same VM, while preserving local disk state between the two invocations. This data passing mechanism, which we refer to as VM-Storage, minimizes data exchange latency but imposes constraints on where the lambdas can be scheduled. Alternatively, the cloud provider can enable data exchange between functions on different VMs by directly copying intermediate data between the VMs that host the sending and receiving lambdas. This data passing mechanism, which we refer to as Direct-Passing, requires one copy of the intermediate data, serving as a middle ground between VM-Storage (which requires no data copying) and Remote-Storage (which requires two copies of the data). Each data passing mechanism provides a trade-off between latency, cost, scalability, and scheduling flexibility. Crucially, we find that no single mechanism prevails across all serverless applications, which have different data dependencies. For example, while Direct-Passing does not impose strict scheduling constraints, its performance degrades when many receiving functions copy data simultaneously and saturate the sending VM's outgoing network bandwidth. Hence, we need a hybrid, fine-grained approach that optimizes data passing for every edge of an application DAG.

Our solution: We propose SONIC, a management layer for inter-lambda data exchange that adopts a hybrid approach spanning the three data passing methods (VM-Storage, Direct-Passing, and Remote-Storage) to optimize application performance and cost. SONIC exposes a unified API to application developers and selects the optimal data passing method for every edge in an application DAG to minimize data passing latency and cost. We show that this selection depends on parameters such as the size of the input, the application's degree of parallelism, and VM network bandwidth, and SONIC adapts its decision dynamically as these parameters change. Since locally optimizing data passing decisions at individual stages of a DAG can be globally sub-optimal, SONIC applies a Viterbi-based algorithm to optimize latency and cost across the entire DAG (a simplified sketch of this idea appears below). Fig. 3 shows the workflow of SONIC. SONIC abstracts its hybrid data passing selection and provides users with a simple file-based API, so users always read and write data as files to storage that appears local. SONIC is designed to be integrated with a cluster resource manager (e.g., Protean [29]) that assigns VM requests to physical hardware and optimizes provider-centric metrics such as resource utilization and load balancing (Fig. 4).

We integrate SONIC with the open-source OpenLambda [31] framework and compare performance to several baselines: AWS Lambda with S3 and ElastiCache Redis (which can be taken to represent the state of the practice); SAND [2]; and OpenLambda with S3 and Pocket [37].
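The section describes, but does not show, how this global optimization works. Purely as an illustration, here is a minimal Viterbi-style dynamic program over a linear chain of DAG edges, assuming three candidate methods per edge and toy latency/cost estimates; the LATENCY/COST tables, the co-location penalty, and the weight alpha are invented placeholders, not SONIC's actual model (which operates on full DAGs with profiled estimates).

```python
METHODS = ("VM_STORAGE", "DIRECT_PASSING", "REMOTE_STORAGE")

# Toy per-edge estimates; a real system would derive these from profiled
# data sizes, fanout, and network bandwidth for each specific edge.
LATENCY = {"VM_STORAGE": 0.1, "DIRECT_PASSING": 0.4, "REMOTE_STORAGE": 1.0}  # seconds
COST    = {"VM_STORAGE": 0.2, "DIRECT_PASSING": 0.3, "REMOTE_STORAGE": 0.1}  # dollars

def edge_score(method, prev_method, alpha=0.5):
    """Illustrative objective: weighted latency + cost, plus a penalty when
    consecutive edges both demand VM-Storage (co-locating a whole chain on
    one VM constrains scheduling)."""
    penalty = 0.3 if method == prev_method == "VM_STORAGE" else 0.0
    return alpha * LATENCY[method] + (1 - alpha) * COST[method] + penalty

def best_methods(num_edges):
    """Viterbi-style DP: the choice at edge i interacts with the choice at
    edge i-1, so greedy per-edge selection can be globally sub-optimal."""
    score = {m: edge_score(m, None) for m in METHODS}  # best score ending in m
    back = []                                          # back-pointers per edge
    for _ in range(1, num_edges):
        new_score, ptr = {}, {}
        for m in METHODS:
            prev = min(METHODS, key=lambda p: score[p] + edge_score(m, p))
            new_score[m], ptr[m] = score[prev] + edge_score(m, prev), prev
        score = new_score
        back.append(ptr)
    path = [min(METHODS, key=score.get)]               # best final method
    for ptr in reversed(back):                         # follow back-pointers
        path.append(ptr[path[-1]])
    return path[::-1]                                  # methods for edges 0..n-1

print(best_methods(4))
```

SONIC's actual optimizer must additionally handle fan-out/fan-in stages and couple the method choice with lambda placement, both of which this chain-shaped toy omits.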
Our evaluation shows that SONIC outperforms all baselines for a variety of analytics applications. SONIC achieves between 34% and 158% higher performance/$ (here, performance is the inverse of latency) over OpenLambda+S3, between 59% and 2.3× higher performance/$ over OpenLambda+Pocket, and between 1.9× and 5.6× higher performance/$ over SAND, a serverless platform that leverages data locality to minimize execution time.

In summary, our contributions are as follows:
(1) We analyze the trade-offs of three different intermediate data passing methods (Fig. 5) in serverless workflows and show that no single method prevails under all conditions in both latency and cost. This motivates the need for our hybrid and dynamic approach.
(2) We propose SONIC, which automatically selects the data passing method between any two serverless functions in a workflow to minimize latency and cost. SONIC dynamically adapts to application changes.
(3) We evaluate SONIC's sensitivity to serverless-specific challenges such as the cold-start problem (the set-up time for the application's environment when it is invoked for the first time), the lack of point-to-point communication, and the provider's lack of knowledge of the content of a lambda's input data. SONIC shows its benefit over all baselines for three common classes of serverless applications, with different input sizes and user-specified parameters.

Figure 3: Workflow of SONIC: Users provide a DAG definition and input data files for profiling. SONIC's online profiler executes the DAG and generates regression models that map the input size to the DAG's parameters. For every new input, the online manager uses the regression models to identify the best placement for every λ and the best data passing method for every pair of sending/receiving λs in the DAG.

Figure 4: SONIC's interaction with an existing Resource Manager in the system.
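Figure 3 mentions regression models that map input size to DAG parameters without giving their form. A plausible minimal sketch, assuming scikit-learn, simple linear models, and made-up profiled values, could look as follows; the parameter names fanout and intermediate_mb are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up profiled runs: input size (MB) -> observed DAG parameters.
input_mb = np.array([[10], [50], [100], [200]])
profiled = {
    "fanout":          np.array([2, 6, 12, 24]),   # degree of parallelism
    "intermediate_mb": np.array([4, 21, 40, 83]),  # intermediate data size
}

# One simple regression model per DAG parameter, fit offline by the profiler.
models = {name: LinearRegression().fit(input_mb, y) for name, y in profiled.items()}

def predict_dag_params(new_input_mb: float) -> dict:
    """For a new input, estimate the DAG parameters that feed the online
    manager's placement and data-passing decisions."""
    x = np.array([[new_input_mb]])
    return {name: float(model.predict(x)[0]) for name, model in models.items()}

print(predict_dag_params(150.0))  # e.g. {'fanout': ~17.9, 'intermediate_mb': ~61.9}
```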