A Thunk to Remember

A Thunk to Remember: make -j1000 (and other jobs) on functions-as-a-service infrastructure

Sadjad Fouladi, Dan Iter, Shuvo Chatterjee, Christos Kozyrakis, Matei Zaharia, Keith Winstein
Stanford University

Draft of Sep. 25, 2017 (under double-blind review; please do not redistribute)

Abstract

We present gg: a system for executing interdependent software workflows across thousands of short-lived “lambdas” that run in parallel on public cloud infrastructure. The system includes three major contributions: (a) an interchange format for representing “thunks” (programs and their complete data dependencies) that can be executed anywhere; (b) a system to automatically infer the dependency tree of a software build system and synthesize it as a directed acyclic graph of thunks, by replacing stages of the system with “models” that capture dependencies with fine granularity; and (c) an execution engine that resolves thunks recursively on public functions-as-a-service infrastructure with thousand-way parallelism.

We found that gg outperforms existing schemes for accelerating compilation (with large projects such as inkscape and LLVM, gg was 3.7× to 4× as fast as outsourcing compilation to a remote 64-core machine, and 1.2× to 2.2× as fast as running make -j64 locally on the 64-core machine itself) and that the thunk abstraction is applicable to a broad range of tasks.

1 Introduction

The scale and elasticity of cloud platforms give us an opportunity to dramatically rethink many computer applications. As cloud-computing platforms have developed, they have invariably become more fine-grained: for example, cloud vendors today offer hyper-elastic “serverless computing” platforms that can launch thousands of Linux containers within seconds [4, 21, 32], and all the major cloud vendors have switched to one-minute or ten-minute minimum billing increments for VMs [7, 18, 22]. Researchers have already taken advantage of these platforms to implement hyper-elastic versions of compute-intensive applications including video encoding [21] and MapReduce [32].

Making hyper-elastic platforms broadly available to more applications, however, will require general APIs and systems for accessing them, as opposed to the application-specific systems designed in prior work. To this end, we propose gg, a system for efficiently and easily capturing a parallelizable application consisting of a DAG of processes and executing it on serverless platforms. Unlike previous work, gg can be applied to arbitrary workflows of Unix-like processes, such as software builds, scientific workflows, and video-processing pipelines. We first evaluate gg on one of the most challenging and well-studied parallelization problems: software builds. In this setting, we show that gg can extract more parallelism out of builds than traditional parallel and distributed build tools such as make, distcc, and icecc [16, 19, 29], while inferring the dependencies between build steps automatically, and that it performs up to 3.9× better than outsourcing to a long-running 64-core build server.

These gains are due both to increased parallelism (gg can exploit thousand-way parallelism from serverless infrastructure, and can infer parallelism opportunities not found in the original Makefile) and fewer network round trips, because gg keeps track of intermediate dependencies in the cloud and does not need to fetch intermediate build products back to the local machine. In addition, we show that gg can be used to implement video processing and MapReduce workloads similar to those in ExCamera [21] and PyWren [32].

To achieve these results, gg is built on two key concepts. First, gg represents work graphs through an abstraction called thunks, which are self-contained units of computation specifying both the executable to run and its dependencies. Second, gg provides a novel mechanism for automatically capturing an accurate dependency graph from an existing application, called model substitution. We briefly outline these concepts in turn.

Thunks: gg’s thunks are self-contained units of computation that specify both an executable to run for a workflow step and its dependencies and environment. For example, in a build system, the thunk for compiling a single C source file will reference, as dependencies, the content hashes of the source file, the header files it includes, and the compiler binary itself. In a video processing job, a thunk might specify one chunk of the video and the encoder binary that will operate on it. Because thunks identify their complete functional footprint, gg can evaluate them in diverse environments, including a local sandbox or an AWS Lambda function. Each thunk is named canonically in terms of its computation task and dependencies (which can be other thunks), which allows gg to memoize and reuse thunk results.

Model substitution: Although applications can specify a thunk graph directly, gg also provides a novel and easy-to-use mechanism for capturing an accurate dependency graph from an existing multi-process application, called model substitution. In this mechanism, gg can substitute model programs for each CPU-intensive executable invoked by a high-level driver process for the application (e.g., make, cmake, ninja, or even a shell script), by simply placing them in the PATH. As the driver process runs, these models execute only the minimal computation required to determine each execution’s dependencies; for example, gg’s model for the C linker determines which libraries will be consulted to produce the output file, but does not actually link them. Each model program then outputs a thunk representing that computation and its inputs. We found that models are an effective way to automatically and correctly infer dependencies from large legacy applications; for example, by including models for just six common executables (gcc, g++, ld, ar, ranlib, and strip), gg can automatically capture the dependency graphs of many large open-source projects.

We demonstrate gg through several example applications. First, to showcase a complex application that is not supported by previous “serverless” systems, we used gg to implement a parallel build accelerator. Builds have traditionally been challenging to parallelize efficiently for two reasons. First, to leverage parallelism, the build tool needs an accurate and fine-grained description of the dependencies (e.g., from a Makefile): overspecifying dependencies will reduce parallelism, while underspecifying them will lead to incorrect results. Second, current parallel build-outsourcing tools, such as distcc and icecc [16, 29], require many round trips between a master server and its workers (e.g., to send back intermediate build products), and perform poorly on a higher-latency connection to a public cloud. In contrast, using model substitution, gg automatically discovers a fine-grained but accurate dependency graph for a build simply by running the existing build system (e.g., make). This sometimes finds more parallelism than the build system has itself. Then, gg’s execution engine allows it to upload all input files and submit the execution graph without repeated round trips, running with up to 1000-way parallelism.

1.1 Summary of results

gg on AWS Lambda outperforms existing parallel outsourced build systems running in the EC2 cloud, without requiring changes to the program’s build system.

[...]

By contrast, gg can execute the same build system in 1.25 minutes on AWS Lambda, with each stage billed with subsecond granularity.

Compiling LLVM (a toolkit for writing compilers) requires 86 minutes on a single core, 4 minutes using icecc to outsource to a 64-core VM in the same region, and 1.2 minutes with gg.

These gains come with a caveat: automatically inferring dependencies in the first place requires running the original build system (e.g., make) with the compiler and other programs replaced with models. The build system itself can become a bottleneck, especially if it involves recursive make and many dependencies and the client machine is not well-endowed with CPU resources. On a cold start, inferring the tree of dependencies for inkscape required 2.5 minutes on the client machine (the 4-core VM), and 2.75 minutes for LLVM.

Apart from software builds, we also use gg to implement a video processing workload similar to ExCamera [21] and a MapReduce engine similar to PyWren [32], to show that gg’s thunk abstraction is general enough to capture these applications.

In summary, we believe that the ability to divide common computational tasks into fine-grained tasks and execute them over hyper-elastic functions-as-a-service platforms like AWS Lambda, Google Cloud Functions, IBM OpenWhisk, and Azure Functions will become a new foundational use of these platforms. gg offers general and powerful mechanisms to achieve this for a common class of applications: multi-process Unix-like applications. gg is effective for a wide range of tasks ranging from “embarrassingly parallel” MapReduce to parallel builds with complex, irregular dependencies.

gg is open-source software; we have posted an anonymous version for review at https://github.com/gg-anon.

2 Related Work

gg is related to several different classes of systems.

Workflow systems. gg treats computations as a DAG of tasks, in a similar manner to Dryad [31], CIEL [39], Spark [50], and many scientific workflow systems [35]. However, gg differs from these systems in two important ways: its abstraction of a thunk to capture each task, and the manner in which gg can build up the thunk graph from an arbitrary (Unix-like) application using models.

gg’s thunk abstraction differs from tasks in typical workflow systems in two ways. First, a thunk in gg is
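
As a rough illustration of the thunk idea described in the introduction (naming a computation canonically by the content hashes of its executable and inputs, then memoizing its result), the following Python sketch may help. It is not gg's actual interchange format: the `Thunk` fields, the choice of SHA-256, the JSON serialization, and the `make_thunk` and `force` helpers are all assumptions made for this example.

```python
import hashlib
import json
from dataclasses import dataclass


def content_hash(data: bytes) -> str:
    """Name a blob by the SHA-256 of its contents (an assumed choice of hash)."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Thunk:
    """Hypothetical thunk record: a command plus hashes of everything it reads."""
    args: tuple        # command line, e.g. ("gcc", "-c", "hello.c")
    executable: str    # content hash of the program binary
    inputs: tuple      # sorted (filename, hash) pairs; a hash could also name another thunk

    def name(self) -> str:
        """Canonical name: hash of a deterministic serialization of the record."""
        payload = json.dumps(
            {"args": self.args, "executable": self.executable, "inputs": self.inputs},
            sort_keys=True,
        )
        return content_hash(payload.encode())


def make_thunk(args, executable_bytes, input_files):
    """Build a thunk from a command, the binary's bytes, and {filename: bytes}."""
    inputs = tuple(sorted((fn, content_hash(data)) for fn, data in input_files.items()))
    return Thunk(tuple(args), content_hash(executable_bytes), inputs)


_results = {}  # memo table: thunk name -> result (illustrative in-memory cache)


def force(thunk, run):
    """Evaluate a thunk once; later requests with the same name reuse the result."""
    key = thunk.name()
    if key not in _results:
        _results[key] = run(thunk)
    return _results[key]
```

In this sketch, two thunks built from the same command, compiler bytes, and file contents get the same name regardless of the order in which the input files are listed, so `force` runs the underlying work only once; changing a single byte of any input yields a different name and therefore a fresh evaluation.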
