CloudBuild: Microsoft's Distributed and Caching Build Service

Hamed Esfahani, Jonas Fietz, Qi Ke, Alexei Kolomiets, Erica Lan, Erik Mavrinac, Wolfram Schulte, Newton Sanches, Srikanth Kandula
Microsoft

ABSTRACT

Thousands of Microsoft engineers build and test hundreds of software products several times a day. It is essential that this continuous integration scales, guarantees short feedback cycles, and functions reliably with minimal human intervention. This paper describes CloudBuild, the build service infrastructure developed within Microsoft over the last few years. CloudBuild is responsible for all aspects of a continuous integration workflow, including builds, test and code analysis, as well as drops, package and symbol creation and storage. CloudBuild supports multiple build languages as long as they fulfill a coarse grained, file IO based contract. CloudBuild uses content based caching to run build-related tasks only when needed. Lastly, it builds on many machines in parallel. CloudBuild offers a reliable build service in the presence of unreliable components. It aims to rapidly onboard teams and hence has to support non-deterministic build tools and specification languages that under-declare dependencies. We will outline how we addressed these challenges and characterize the operations of CloudBuild. CloudBuild has on-boarded hundreds of codebases with only man-months of effort each. Some of these codebases are used by thousands of developers. The speed-ups of build and test range from 1.3× to 10×, and service availability is 99%.

Keywords

Build Systems, CloudBuild, Distributed Systems, Caching, Specification Languages, Operations

1. INTRODUCTION

Microsoft is rapidly adopting an agile methodology to enable a much faster delivery cadence of days to weeks. For instance, Office 365, Visual Studio Online and SQL Azure have release cycles ranging from 3 months to 3 weeks. On the extreme end, Bing's website ships several times a day.

Delivering rapidly requires that builds, which are at the core of the inner loop of any engineering system, are fast and reliable. This challenge has given us an opportunity to design a new in-house infrastructure for continuous integration known as CloudBuild. Its design was motivated by these needs:

● execute builds, tests, and related tasks as fast as possible,
● on-board as many product groups as effortlessly as possible,
● integrate into existing workflows,
● ensure high reliability of builds and their necessary infrastructure,
● reduce costs by avoiding separate build labs per organization,
● leverage Microsoft's resources in the cloud for scale and elasticity, and lastly
● consolidate disparate engineering efforts into one common service.

We found it difficult to simultaneously satisfy these requirements. Most build labs, for instance, have only a few dependencies on other systems; they connect to one version control system or build drop storage solution. To avoid separate build labs, CloudBuild supports the union of all lab dependencies and workflows. However, the consequent interoperability issues can lead to lower reliability. As another instance of conflicting requirements, product groups have optimized their builds (and build labs) over many years, often exploiting their particular product architecture and workflow to optimize for incremental and parallel builds. Since CloudBuild offers a generic cloud-based solution across the various product groups, our current speedup improvements are less dramatic (not 100× for example) relative to the highly optimized custom systems.

When architecting CloudBuild, we had to make a crucial decision. We found that existing build tools and specification languages were unsuitable for cached and parallel execution. In particular, the languages would under-specify dependencies and the tools produced non-deterministic outputs. Hence, while single machine (or single threaded) execution would run correctly, distributed execution and caching of task outputs could lead to unpredictable behavior or limited speedup. A clean solution would be to design a new build specification language, rewrite all build specifications in this language and force tools to respect additional constraints. Such efforts are underway [?, 27]. CloudBuild, however, is our quick solution. That is, here we show how to onboard existing specification languages (and tools) onto a cloud-based parallel and cached build system. The key advantage of this choice is the speed of onboarding. Builds and tests can be sped up and disparate labs can be consolidated into the cloud quickly. We also found CloudBuild to be able to cover a wide range of tools and specifications. The cost, however, is the rather heuristic nature of CloudBuild: our goal is not to function correctly in every possible setting (of specification and tool behavior). We mitigate this with customer education and hardening of CloudBuild to cover frequently recurring issues.

We believe that our design decision has been successful. CloudBuild is currently being used by more than 4,000 developers in Bing, Exchange, SQL, OneDrive, Azure and Office. It executes more than 20K builds per day on up to 10K machines spread over several data centers.

ICSE '16 Companion, May 14 - 22, 2016, Austin, TX, USA
© 2016 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ISBN 978-1-4503-4205-6/16/05 . . . $15.00
DOI: http://dx.doi.org/10.1145/2889160.2889222
Build and test speeds typically improved by 1.3 times to 10 times compared to earlier lab based systems, sustaining an availability of 99.9%. Reliability, defined as the fraction of requests that are served without a CloudBuild internal error, is better than 99%. CloudBuild connects with three different version control and four binary storage systems. REST and Azure's Service Bus [?3] are used for integration into multiple workflow systems.

While CloudBuild's service design follows existing designs for large scale data-parallel systems [?3, ?8, ?1], CloudBuild's build engine is unique in that it allows the execution of arbitrary build languages and tools, even in the presence of under-specified dependencies and non-deterministic tools.

Given a build request, CloudBuild evaluates build specifications, typically expressed in Make or MSBuild files, to infer project-to-project level dependencies. This means that CloudBuild's unit of scheduling is a coarse grained project. While not as fine grained as other systems like Google's Bazel [?], the approach allows greater reuse of pre-existing build specifications, without having to comprehend all the runtime semantics of the underlying build tools.

CloudBuild can determine whether to execute a given project or to reuse outputs captured during an earlier execution and stored in a system-wide cache. The determination is based on the evaluation of the pertinent source files and parent projects, which is summarized into a cache lookup key for the project. The approach to compute this key ensures that the build output is deterministic and consistent with other projects in the same build.

Finally, CloudBuild schedules work, i.e. the tasks for each project that needs to be built, in a location-aware manner over multiple worker machines. Locality here implies two things: freshness of the cache at the machine as well as freshness of its enlistment from source control. Without this locality, the build-preparation time to configure the machine with the right set of SDKs and sources before it can participate in the build would be substantial. CloudBuild uses standard DAG scheduling logic. The tasks themselves have heterogeneous resource demands. The dependencies between tasks are more arbitrary than in data parallel clusters (no shuffles for example). Furthermore, unlike the case of data-parallel jobs where outputs are much smaller than the input (e.g., queries in TPC-DS [16] or TPC-H [17]), the outputs of the median build job are quite large since many libraries and executables are generated and packaged. This requires CloudBuild to more carefully pipeline the fetching of inputs with the execution of tasks.

For a given task, CloudBuild creates a dedicated file system directory structure in which build tools are to execute. We call this a sandbox, and it addresses reliability issues due to under-specified dependencies and allows for multi-tenant builds. This is similar in intent to Unix's chroot jail [?3], though it is implemented at the application level. As we will explain later, cache and sandbox play together to guarantee the absence of "Frankenbuilds", i.e. builds where outputs from different build jobs can combine in inconsistent ways due to cache re-use.

Despite allowing so much flexibility for specifications, the resulting CloudBuild system is fast and reliable. At Microsoft, we have observed that 70 to 90% of build artifacts can be shared; this number varies across code bases. And CloudBuild's overhead for distributed builds is small: with enough builders, ?9% of builds take no more than 1.?0× the total time cost of the most expensive path in the directed task graph described in the build specifications.

CloudBuild's main contribution is thus to allow for reliable, fast builds in the presence of under-specified dependencies and non-deterministic tools. CloudBuild's engineering contribution is that it is easy to use, scales to varying demands and integrates well into Microsoft's engineering fabric.

In the rest of this paper, we describe the design of CloudBuild (§2–§3). Experiential anecdotes are offered throughout the design sections to clarify the various design choices. We then characterize some operational aspects of the CloudBuild service (§4.1) and share onboarding and live-site experiences (§4.2–§4.3). We conclude with a discussion of related work (§5) and lessons learnt.

2. PRIMER ON BUILD SERVICES

Today, build systems such as Make, Maven, Gradle and MSBuild execute on one machine, typically the developers' desktops or laptops. As the size of code bases grows, these tools can take several hours for a full build. Delta builds, that is, rebuilding after changes, are faster. However, continuous integration workflows often require packaging binaries and/or running exhaustive batteries of tests.

To speed up the build workflow, we note that there is substantial opportunity to parallelize. While dependencies do exist, they only form a partial order over the build tasks, and many build tasks are unordered with respect to each other. As a consequence, various efforts are under way to provide multi-threaded builds and to distribute the build tasks over many nodes.

Furthermore, there is an opportunity to reuse the outputs of previous builds, similar to delta builds. Build reuse arises fundamentally due to highly modularized software engineering practices. Teams of developers work on specific components and can reuse the latest binaries for the rest. Even across source control branches, there is a high likelihood of sharing due to the use of shared libraries for commonly used or important functionality.

In this context, CloudBuild offers a distributed and caching build service. We offer a quick summary of relevant background.

2.1 Source Control

Enlistments are (partial) materializations of source files from a version control system. CloudBuild supports different version control systems, including Source Depot [1?], Git [7] and TFVC [1?]. New changes registered in source control are a potential trigger of new builds for verification.

2.2 Build Specifications and Tests

Build specifications describe the items that are to be built, the dependencies between items and the actions to be taken per item. Example specifications include Makefiles and project files in MSBuild [9]. A codebase typically has multiple file-system directories with many specification files, typically one specification file per build target. A build target is a logical entity comprising a binary file and ancillary files (e.g., debugging symbols). Dependencies between build specifications are expressed by referencing the paths of other build specs, or the paths of files produced by the targets.

Build specifications also describe the details of how to build a particular target. This is done either explicitly or by invoking rules defined in other files. For example, the rule to compile a dynamically linked library from C++ would specify a set of steps using any number of tools (compilers, linkers, etc.) as well as command-line options. Distributions and SDKs often reference rule files that can be reused. However, teams or organizations can customize these rule files to satisfy technical or policy requirements. Further, many aspects of the common rules can be parameterized, extended, or changed from within a build spec. CloudBuild supports in principle any build language that declares which sources are read, which project-dependencies exist, and which output directories are used; special handling is

Codebase   Speed-up   Onboarding time   # Devs
B          1.6        3 mm              1200
C          7.3        3 mm              3?0
D          3.1        2 mm              100
E          3.8        2 mm              200

Table 1: Shows the ratio between the build times with and without CloudBuild. Original build times ranged from 1 to ? hours. The table also quantifies onboarding effort (mm denotes a man-month), and the number of committers to each codebase.

Tool       Rel. Freq.   Tool          Rel. Freq.
perl       0.11         nmake         0.03
mkdir      0.09         buildoutput   0.03
echo       0.08         copy          0.03
cmd        0.07         dfmgr         0.03
exportls   0.06         signtool      0.02
cl         0.0?         link          0.02
binplace   0.0?         getfilehash   0.01
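The heavy-tailed tool usage that Figure 1 plots can be reproduced from raw per-tool counts with a short sketch. The tool names and counts below are hypothetical, not Codebase A's data:

```python
def top_k_usage_share(counts, k):
    """Fraction of all tool invocations covered by the k most-used tools."""
    total = sum(counts.values())
    top = sorted(counts.values(), reverse=True)[:k]
    return sum(top) / total

# Hypothetical heavy-tailed counts: a few tools dominate, many are rare.
counts = {"perl": 110, "mkdir": 90, "echo": 80, "cmd": 70, "rare1": 2, "rare2": 1}
print(round(top_k_usage_share(counts, 4), 3))  # → 0.992
```

Sweeping k from 1 up to the number of distinct tools yields exactly the kind of cumulative-usage curve shown in Figure 1.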

Figure 1: CDF of tools by usage and the names of the most frequently used tools across all of Codebase A's build specifications.

given to the two major build languages currently in use at Microsoft, nmake [10] and MSBuild [9].

As described in the introduction, CloudBuild is also responsible for the execution of tests that follow compilation and linkage. The test execution is distributed in the same way as the build steps. CloudBuild supports tests written in VSTest [18] and its ecosystem of adapters for different languages and unit test frameworks, including nUnit and xUnit. CloudBuild also supports various code analysis frameworks.

2.3 Build Tools

Build specifications happen to use many kinds of tools. Figure 1 lists the top few tool names and their frequency in all of the build specifications of codebase A.¹ A total of 1388 different tools are used in codebase A. The distribution of usage is heavy tailed; 50 tools (3.6% of all unique tools used) contribute ?9% of the overall usage. However, several of the most frequently used tools are script invocators: perl and cmd for example. The latter is Windows' shell invocator, similar to sh or bash. There are also recursive calls such as those to nmake. Hence, this characterization of tool frequency under-estimates the diversity of tools. Yet, it suffices to show that a wide variety of tools are used by typical codebases.

As mentioned in the introduction, a key goal of CloudBuild is to onboard existing build specs with minimal additional effort. Hence, CloudBuild has to support all of these tools. As some of the code bases are quite old, the associated tools are equally old. But they are so widely used that rewriting them is not easy. Further, several tools depend on their environment in idiosyncratic and under-specified ways that make them a challenge to parallelize. Many of these tools are also not deterministic, so reinvoking the tools could create different outputs; a typical cause of non-determinism is that tools may have timestamps or randomly generated bits in their output.

¹ While we anonymize the names of codebases, we will relate various aspects of the same codebase by consistently referring to them with the same letter.

3. DESIGN OF CloudBuild

3.1 Design Principles

The primary goal of CloudBuild has been to reduce build times and increase build reliability for as wide a range of Microsoft teams as possible with as little investment as necessary. To achieve that, CloudBuild follows these principles:

● Commitment to compatibility: CloudBuild was designed to replace existing build pipelines, and as such should be substantially compatible with pre-existing build specifications, tools, and SDKs. Given the size, age, and heterogeneity of Microsoft codebases, approaches that require rewrites or major refactoring of build specifications were considered prohibitive by most teams.

● Minimal new constraints: The only prescriptions made and changes demanded by CloudBuild are those that would otherwise hinder parallelizing the build, or make caching ineffective. The system provides mechanisms to assist with onboarding legacy specifications and, once onboarded, to maintain those code bases within those constraints.

● Permissive execution environment: Safety under adversarial inputs is not a goal. That is, it is possible to write build specifications or tools that result in unpredictable build behavior. However, when used as intended, CloudBuild guarantees deterministic build output. Further, CloudBuild is a multi-tenant service and limits the impact on a customer from others that share the same caches or servers.

Table 1 provides data from experience to quantify the outcomes of these principles. Each row is a codebase with hundreds of developers that now runs on CloudBuild. Speed-up is very difficult to compute because the underlying build specifications have evolved substantially after onboarding. In every case, the new build specs are much more comprehensive than the historical ones that pre-date CloudBuild. Nevertheless, we compare the build time before onboarding onto CloudBuild vs. the latest build times. The table shows that the speed-ups are sizable. The table also shows that the time to onboard each codebase was a couple of man-months. It would not have been possible to onboard as rapidly, i.e., move to a distributed cached build execution, if CloudBuild did not adhere to the above principles.

3.2 Architecture Overview

Figure 2: An architecture diagram for CloudBuild

CloudBuild builds on top of Microsoft's cluster management system, Autopilot [33]. Autopilot (AP) manages the deployment of services as well as the monitoring of hardware and service health. AP applies automatic recovery techniques to deal

Codebase   LOC          Build targets   Targets needing        Specification files
                                        input an.  output an.  number   type            parse time
B          2.8 × 10^7   8263            287        21?         11300    mostly nmake    8.8s
C          ?.8 × 10^6   2713            178        89          33?9     mostly nmake    36.6s
D          −            2076            269        112         2267     mostly msbuild  3.7s

Table 2: Build targets and those that required explicit dependency annotations. Codebase B uses an automatic build migration tool that leads to a higher-than-typical number of annotations. The table also lists the number of specification files parsed by CloudBuild for dependency inference; this process takes only about one minute, even for some very large codebases.

with hardware and software faults, and aims to maintain high service availability during maintenance and software upgrades.

CloudBuild consists of a large number of worker machines managed by a small set of collaborating processes collectively referred to as the cluster manager. Once Autopilot deploys the CloudBuild runtime to a machine, the worker publishes its presence and state to a Zookeeper [?]-like service. This service maintains the list of available enlistments, machine allocation to ongoing builds, and machine readiness for new builds. The cluster manager can shrink or grow the number of workers or enlistments per codebase. Workers interact with source control (§2.1) and package managers [11] to download the necessary content into the enlistments.

The cluster manager accepts build-initiation requests. The input includes a build definition that lists configuration options such as: output type (e.g., "build for 64-bit architecture"); parallelization settings (e.g., the range of node counts to use); routing preferences (e.g., "prefer datacenters in US East region"). The input also includes per-request options (e.g., the code base revision to be built). Build jobs are logged and added to a build queue.

Build jobs remain in a pending state until they are matched up with a sufficient number of workers with matching enlistments. If many workers are available, the cluster manager selects those that performed similar work recently, under the assumption that their enlistments (and local caches) will be freshest. Chosen workers are asked to update their enlistments to the desired source and package revisions. Such preparation happens in parallel. Once enough workers are ready, the build job begins by spawning a build coordinator and multiple builder nodes.

3.3 Extracting Dependency Graph

The dependency graph of a build determines the order of execution. Existing build languages such as nmake [10] and MSBuild [9] allow dependencies to be under-specified. Consider the example in Figure 3. The syntax is inspired by make. Each target has a list of explicitly declared inputs and a corresponding set of commands. At first blush, the two targets appear unrelated. However, T2.dll uses y.dll and z.conf, which are produced as side-effects of executing T1.dll. Sequential builds, which the build spec was written for, will succeed. However, a parallel engine that builds T2.dll first will either fail or produce outputs using stale versions of the dependencies. A distributed engine that builds T2.dll on a different machine will behave similarly (i.e. either fail or use stale dependencies). Similarly, cached builds would not know to fetch the correct inputs.

Figure 3: Example illustrating hidden dependencies that break a parallel or cached build.

The fundamental problem is that widely-used build languages do not enumerate all of the inputs and the outputs of build targets. Neither do they explicitly call out the side-effects and resource dependencies of build commands.

A key component of CloudBuild is its ability to automatically infer hidden dependencies. It does so in a pre-processing step before the build. CloudBuild has a large set of rules to apply against the build specification ASTs that identify patterns for particular build languages and SDKs, and then emit the anticipated inputs and outputs. In some cases, invocations of external commands are also parsed. For example, when robocopy.exe [12] commands are used, CloudBuild parses the command line to infer the inputs and outputs. There are invocation parsers for many of the tools listed in Figure 1. These parsers reduce porting overhead; they also substantially reduce the need for the manual annotations that are described next.

When implicit dependencies cannot be inferred, users can add annotations. These are designed to blend into the build specification language in a way that is transparent, in case these specifications are executed outside CloudBuild. There are also rules to detect constructs that CloudBuild does not understand, such as invocations of external scripts; the system alerts the user, requesting that annotations be added to describe their inputs and outputs.

In some cases, build tools use runtime context to determine their inputs. Such context is unavailable in the pre-processing phase. Where possible, CloudBuild's parser over-approximates the list of inputs: it may include all files that can be referenced via all branches of the runtime logic; it may include whole directories, where the exact set of accessed files is known only at runtime.

The task dependency graph is constructed with one node per target. An edge is added between two nodes n1, n2 whenever there is overlap between n1's outputs and n2's inputs. Unresolved inputs, that is those not found in source control and not output by any target, are flagged as errors. Cycles in the dependency graph are also flagged as errors.

Architectural differences: The architecture is largely similar to other systems that schedule dependent sets of tasks such as Yarn [?8], Tez [3?] and Cosmos [?1]. A key difference thus far is the early choice of machines for builds (maintaining a map from machines to enlistments and caches and using locality). In contrast, data-parallel systems place tasks on any machine in the cluster. Since machines have to be prepared (pulling relevant sources from source control as well as SDKs and other packages needed to build), CloudBuild is more careful in picking machines for builds.
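The task-graph construction described in this section can be sketched as follows; the target and file names are hypothetical, and the data shapes are ours, not CloudBuild's:

```python
from collections import defaultdict

def extract_graph(targets, sources):
    """targets: name -> {"inputs": set, "outputs": set} (declared + inferred).
    Adds an edge (p, c) when c consumes a file that p produces; flags inputs
    that neither source control nor any target provides."""
    producer = {}
    for name, t in targets.items():
        for out in t["outputs"]:
            producer[out] = name
    edges, unresolved = set(), set()
    for name, t in targets.items():
        for inp in t["inputs"]:
            if inp in producer:
                edges.add((producer[inp], name))
            elif inp not in sources:
                unresolved.add((name, inp))  # flagged as an error
    return edges, unresolved

def has_cycle(nodes, edges):
    """Cycles in the dependency graph are also flagged as errors."""
    children = defaultdict(list)
    for a, b in edges:
        children[a].append(b)
    state = {}  # 1 = on the current DFS path, 2 = fully explored
    def dfs(n):
        state[n] = 1
        for c in children[n]:
            if state.get(c) == 1 or (state.get(c) is None and dfs(c)):
                return True
        state[n] = 2
        return False
    return any(state.get(n) is None and dfs(n) for n in nodes)

# Figure 3's hidden dependency, once inferred: T2 consumes side-effects of T1.
targets = {
    "T1": {"inputs": {"t1.cpp"}, "outputs": {"T1.dll", "y.dll", "z.conf"}},
    "T2": {"inputs": {"t2.cpp", "y.dll", "z.conf"}, "outputs": {"T2.dll"}},
}
edges, unresolved = extract_graph(targets, sources={"t1.cpp", "t2.cpp"})
print(edges, unresolved, has_cycle(targets, edges))  # → {('T1', 'T2')} set() False
```

Once the hidden y.dll and z.conf dependencies are inferred, the edge T1 → T2 forces the correct ordering that a sequential build got only by accident.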
Careful machine selection amortizes this preparation cost and improves the cache hit rate. Furthermore, data-parallel systems need not infer hidden dependencies (§3.3), since their tasks are perfectly specified by user queries, and they do not isolate the various tasks at runtime (§3.4). Differences in scheduling are described later (§3.6).

Besides the task nodes emitted for each target, CloudBuild may emit additional nodes for derived tasks, including: execution of tests and static analysis tools, aggregation of code coverage and analysis reports, and uploading of build outputs to durable cloud storage.

Table 2 offers some experience numbers from three large codebases that run on CloudBuild. The data is for a full build of each codebase. We see that parsing the specifications to infer dependencies takes only about one minute, even for large codebases. The parsing happens during a pre-processing step. We also see that the codebases have thousands of targets and about 1.2 times as many specification files. Recall that there can be many common or project-specific specifications that list build rules, in addition to those that refer to the targets. The table also shows the number of cases that require explicit annotation. Most of the work is handled by automatic inference. The remainder are fixed by annotations. Codebase B has a more-than-typical number of annotations because it uses an auto-annotation tool.

3.4 Sandboxing Build Tasks

CloudBuild executes each task in a sandbox. A sandbox is a directory with symbolic links for all of the task's inputs, where inputs include sources from the enlistment and the outputs of parent tasks. Additionally, the sandbox redefines a number of process environment variables which refer to particular locations in the enlistment. The redefinition steers the task to locations within the sandbox for inputs, outputs and scratch folder locations. Figure 4 illustrates this design.

Figure 4: Example illustrating sandboxes

Sandboxes are introduced for three main reasons: dependency verification, output collection, and isolation.

Output Collection: As also described above, build specifications do not list all of their outputs. To determine the outputs for a task at runtime, CloudBuild enumerates all directory entries in the sandbox's root directory. Any file entry that is not a symbolic link is considered part of the task's output, has its contents hashed, and is moved into a build job-wide location outside of the sandbox.

Isolation: Ideally, to avoid unpredictable behavior during parallel execution, a build task should not mutate its inputs, nor should two tasks produce outputs into the same file path. In practice, neither of these criteria was met in the build specifications that CloudBuild aimed to support. The sandboxing approach described above allows the system to detect such conflicting writes and attribute them to a particular task, even when multiple tasks are executed at the same time on the same machine. CloudBuild has multiple policies to deal with such cases: it can block any such conflicts; it can allow them, provided conflicting accesses write identical content (suboptimal, but deterministic); it can even allow writes with different content, when the resulting non-determinism is acceptable to the codebase owner.

Recycling: To minimize costly interactions with the file system, CloudBuild recycles sandbox directories. It retains the sandbox directories that have been created for previous build tasks as well as an in-memory representation of their file system state. When creating a new sandbox, CloudBuild uses the in-memory representation to locate the best fitting sandbox directory. Using that as a starting point, CloudBuild adds and removes symbolic links to reflect the dependencies of the incoming task.

Experience data: Figure 5 offers some data about sandboxes. It shows the number of symbolic links required and the time to set up, as CDFs over all sandboxes. Most sandboxes require between 10^2 and 10^4 symbolic links. Targets with many inputs tend to be those that produce deployment packages or layouts. We see that the setup time can often be much smaller than the number of symbolic links would suggest, due to our recycling of sandboxes. Most sandboxes are set up within a few seconds.

Figure 5: Quantifying sandbox size (number of symbolic links) and the time to set up sandboxes. The gap between the CDFs is due to our recycling of sandboxes. Sandboxes of medium size need a few thousand symbolic links and take about 0.3s to set up. Some targets (MSIs) have more than 10^5 inputs.

Alternate designs: Fundamentally, CloudBuild requires three properties from the file system environment in which the build tool executes. First, build tasks should be isolated. That is, a task should behave as if it is the only one running on a machine even though many tasks are running in parallel. Second, all of the inputs and only the inputs should be available to the build task. Third, it should be possible to rapidly set up and tear down such an environment. Operations like BSD's [?3] chroot jails would have been useful, but were not available in our operating system of choice. Moreover, the build specification already expected a root directory to be established through a process-specific environment variable; this allows CloudBuild to do "application-level virtualization" by setting up sandboxes.

Dependency Verification: As we saw above, build specifications can have dependencies that are not explicitly declared.
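A minimal, POSIX-only sketch of the sandbox mechanics described in this section; CloudBuild's real implementation also recycles link farms, redirects environment variables, and hashes outputs, all omitted here:

```python
import os
import tempfile

def create_sandbox(input_paths):
    """Populate a fresh sandbox directory with one symbolic link per input,
    so the task sees only its declared and inferred dependencies and a
    missed dependency fails fast as "file not found"."""
    root = tempfile.mkdtemp(prefix="sandbox-")
    for p in input_paths:
        os.symlink(os.path.abspath(p), os.path.join(root, os.path.basename(p)))
    return root

def collect_outputs(root):
    """Output collection: every non-symlink entry left in the sandbox root
    is treated as an output of the task."""
    return sorted(name for name in os.listdir(root)
                  if not os.path.islink(os.path.join(root, name)))

# Hypothetical task: one input file; the "tool" writes one output file.
src = tempfile.NamedTemporaryFile(suffix=".cpp", delete=False)
src.close()
box = create_sandbox([src.name])
with open(os.path.join(box, "out.dll"), "w") as f:
    f.write("binary")
print(collect_outputs(box))  # → ['out.dll']
```

Because the inputs are links and the outputs are regular files, a single directory enumeration separates what the task consumed from what it produced.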
_e infer- up sandboxes. Virtual machines are expensive to setup and tear- ences of dependencies made by CvÕn­BnTv­ may therefore be in- down especially since the virtual disk for each sandbox has diòer- complete (§k.k). Mistakes in dependencies will lead to false cache ent ûles (the inputs). We also considered using Drawbridge [}}], a ûngerprints and hence false cache hits. To solve this,CvÕn­BnTv­ lightweight process isolation container. While it allows for safe de- populates a task’s sandbox with only its explicit and inferred depen- touring of ûle system operations, the cost to on-board thousands dencies. _ereby,CvÕn­BnTv­ ensures that any missed dependen- of build tools into Drawbridge was considered prohibitive. Copy- cies will be runtime errors – typically “ûle not found”. Such errors on-write snapshots help reduce performance impact of moving data are surfaced to the user, the sandbox’s outputs are discarded, and the into and out of sandboxes. Tool compatibility constraints have thus child tasks are not executed. far limited our ability to use a ûle system that supports this feature; Since it is oen prohibitively expensive to create symbolic links this continues to be an area of further work. to every single ûle, CvÕn­BnTv­ may create symbolic links to the directories that contain the required ûles. CvÕn­BnTv­ uses De- 3.5 Distributed Cache of Build Outputs tours [k}] binary injection to observe ûle system APIs, and record To reuse the work of prior builds,CvÕn­BnTv­ uses a distributed the actual ûles accessed. Accesses beyond those predicted by the cache. Conceptually, the cache aims to map the entirety of the in- dependency parser are surfaced as errors. put values presented to a task execution to the output emitted by the task. If a future execution of the task is presented the same GCS no longer have any valid replicas due to network partitioning, inputs, it can be skipped and the cached outputs used instead. 
In concrete terms, the cache is implemented as a dictionary mapping CacheKeys, which encapsulate the values of task inputs, to output bags describing the names and contents of emitted task outputs. The following sections explore how the CacheKeys are computed, the architecture of the distributed cache, and our strategies to ensure that consistent content is retrieved from the cache, even in the presence of multiple concurrent builds and non-idempotent tools.

3.5.1 Cache Data Structures

Each output bag consists of an ID plus a set of (ContentHash_f, Path_f) tuples. The ID is a pseudo-random number assigned to an output bag at creation time to help with race conditions (see §4.3). Each tuple describes the content and path of a file emitted during the execution of the task described by the output bag.

Lookups into the cache are performed based on the inputs of a task. These inputs are encoded into a compact CacheKey as described inductively by the following equations. At a high level, FF denotes a file or task fingerprint, GF is a fingerprint of the global settings, and the CacheKey of a target depends on the fingerprints of its inputs.

    FF_s       = ContentHash_s ⊕ Hash(Path_s)         ∀ source files s
    FF_t       = CacheKey_t ⊕ ID_t                    ∀ tasks t
    CacheKey_t = GF ⊕ (⊕_{i ∈ inputs_t} FF_i)         ∀ tasks t

The first two equations show how to compute the fingerprint FF for each input source s and build task t, respectively. File content hashes can be retrieved from source control if available, or computed from file content. The file path is always used as part of a file's fingerprint, since path values are used as inputs to certain build tasks, such as those performing recursive structure-preserving directory copies. For targets, the fingerprint is derived from their CacheKey and ID. The third equation shows how to compute the CacheKey for a task based on the fingerprints of its inputs (both sources and parent tasks). GF encodes global settings configured in a build job, presumed to be inputs for every task.

3.5.2 Cache Service Design

CloudBuild's caching layer is implemented in an architecture similar to that of other distributed caching systems. The mapping from CacheKey to output bags is maintained by the Global Cache Service (GCS). GCS is persistent and consistent. It is built using Azure Tables. Each worker node runs a Local Cache Service (LCS) which maintains a partial copy of this mapping. Together, LCS and GCS also implement a distributed content addressable store (CAS), with LCS storing replicas of each piece of content available in the cache, and GCS serving as a cluster-wide tracker of such replicas. An LCS keeps local replicas of all content that is referenced or created by tasks on that node.

Reads from the cache are first made to the local LCS. If the CacheKey is present on the LCS, the corresponding CAS content is returned to the caller, together with the associated (ContentHash_f, Path_f) tuples. Otherwise, the LCS queries the GCS and, if present, fetches a copy of the contents of the output bag from one of the replicas to the local CAS.

Writes to the cache are handled in a write-through manner. That is, the LCS immediately pushes changes to the GCS and asks that some number of replicas be created. We discuss races in §4.3.

When in active use, popular cache items can have hundreds of replicas. Cache entries with no use are deleted after a period of time (usually a few days). At times, it is possible that items present in the GCS no longer have any valid replicas due to network partitioning, hardware failure, and a small class of maintenance activities. Such issues are detected during cache lookups, and result in the removal of the invalid cache entry.

Experience data: The load on the GCS is roughly O(10^4) reads and O(10^3) writes per second. Recall that CloudBuild uses Azure Tables to implement the GCS, thereby receiving a distributed table store with persistence and strong consistency.

3.6 Build Scheduling

Each build job is controlled by a build coordinator. It works by assigning individual tasks to builder nodes. For this, it tracks a list of unblocked tasks, and tries to assign these whenever resources on a builder are available.

The build scheduler uses several sets of information:
● The currently available, i.e., unblocked, tasks for which all dependencies are completed.
● An estimate of a task's resource profile based on task characteristics inferred during parsing.
● The resource profile and execution time of a task determined from historical build data.
● The longest path from each task to a leaf node.
● The longest path overall, also called the critical path.

The build scheduler uses the criticality of a task, defined as the longest sub-path from this task to any output of the DAG. This measure is used as the priority order in which tasks are scheduled.

The scheduler only considers actionable tasks in the DAG. Actionable tasks are defined as the subset of tasks that have not yet run, but whose dependencies are met because their parent tasks either finished or had matching CacheKeys. The actionable tasks are added to a queue of executable tasks, sorted by the above priority.

The actual assignment of tasks to a specific worker depends on input locality, the task's resource profile, and the availability of resources on builders. Per task – going by priority order – the coordinator iterates over the available builders, and through a combination of models tries to evaluate the fastest time to completion for this task.

Several factors influence the completion time. First, the setup phase needs to copy over inputs that are not already available on the machine. A builder that already contains many of the inputs will have a shorter setup time than one that needs to copy all of them over. Second, enough resources to run the task may not be available at the machine until some time in the future. The queuing delay is calculated based on the modeled execution times of tasks already running at the builder and those that are in the queue. Tasks are queued at the builder with the shortest expected completion time, i.e., the sum of both numbers.

CloudBuild schedules tasks eagerly to overlap the fetching of inputs with the execution of other tasks. Such eager scheduling is done carefully, however, so as not to make decisions far into the future.

Some classes of task execution errors are deemed retriable, and depending on the nature of the failure, a policy may be set in the task to seek retries on different builder nodes. In this case, the build coordinator excludes the previous execution site from the list of candidates for future placements of the task.

Effectively, CloudBuild uses critical path scheduling [26], but with added attention to multi-resource packing of build tasks and eager queuing of tasks so that fetching inputs can overlap the execution of other tasks. In particular, tasks have a large number of inputs, accessed in a rather unstructured manner (as opposed to a reduce task reading from every map task [8]). This forces CloudBuild to account more carefully for input locality by actually computing the time to fetch the inputs.
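The fingerprint equations of §3.5.1 can be rendered in code as follows. The concrete hash (SHA-256) and the realization of ⊕ as a hash over length-prefixed parts (rather than, say, XOR) are our assumptions for illustration; the paper does not specify them.

```python
import hashlib

def combine(*parts: bytes) -> bytes:
    """Stand-in for the paper's ⊕ combiner: hash length-prefixed parts."""
    d = hashlib.sha256()
    for p in parts:
        d.update(len(p).to_bytes(4, "big"))
        d.update(p)
    return d.digest()

def source_fingerprint(content: bytes, path: str) -> bytes:
    # FF_s = ContentHash_s ⊕ Hash(Path_s)
    return combine(hashlib.sha256(content).digest(),
                   hashlib.sha256(path.encode()).digest())

def cache_key(global_fp: bytes, input_fps: list) -> bytes:
    # CacheKey_t = GF ⊕ (⊕ over all inputs i of FF_i)
    return combine(global_fp, *sorted(input_fps))

def task_fingerprint(key: bytes, output_bag_id: bytes) -> bytes:
    # FF_t = CacheKey_t ⊕ ID_t: a parent's *specific* output bag
    # flows into its children's CacheKeys.
    return combine(key, output_bag_id)
```

Because a child's CacheKey depends on the parent's output-bag ID rather than only on the parent's inputs, regenerating a parent with a non-idempotent tool changes the child's key and forces a rebuild – the "Frankenbuild" avoidance discussed in §4.3.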

Figure 6: Arrival process of builds.    Figure 9: Targets in build job.

Figure 8: Cache hit rates for 50 builds of codebase B.
Figure 11: How long does a build take? Above, we show a CDF of the build times of codebase B.

The corresponding issue is simpler in data-parallel systems [3, 8, 21]. Much of our experiences with the build scheduler are documented in §4.1.

3.7 Managing Build Outputs

CloudBuild needs to support multiple mechanisms to facilitate access to build outputs. To offer backwards-compatible build output storage, CloudBuild can persist build outputs into SMB file shares [37]. This has been the primary mode of storage in the old Microsoft build labs that CloudBuild seeks to replace. CloudBuild can also save build outputs into the Artifact Repository, an internal content storage system designed on top of Azure Blob Storage [23] and integrated into Microsoft's Visual Studio Online [36] suite. This system allows for progressive differential data uploads during build execution; similarly, it offers customers progressive differential downloads, to reduce the end-to-end time to acquire all of the content produced by a build.

4. EXPERIENCES

We want to share some of our experiences and lessons learned from building and operating the service over the last few years.

4.1 Characterizing CloudBuild

CloudBuild runs in several datacenters and uses thousands of servers. Roughly twenty thousand builds are launched per day. Figure 6 depicts the arrival process of builds. Demand is mostly diurnal, with peaks at 11 AM and in the afternoon. During business days around 60% of the traffic is due to user requests; around 25% are continuous integration builds that are triggered by a code check-in; 10% are scheduled builds, configured to run at preset times.

Over 95% of the builds use the distributed cache, which is the default option. CloudBuild allows developers the option to not use the cache, which is useful for troubleshooting and onboarding new branches. Cache hit rates vary substantially based on the codebase being built. Figure 8 shows the cache hit rates for different builds of codebase B, which, as described in Table 2, is currently the largest codebase by size and execution volume. For 90% of the builds, at least 50% of the targets are cached, and for 50% of the builds, at least 80% of the targets are cached. The overall cache hit rate is 76%, which translates into significant savings. Caching helps both cluster throughput (builds per second) and build latency (time to finish a build).

To assess CloudBuild's performance, we compare build execution times with the critical path of the build dependency graph (Figure 10a). The distance in Figure 10a is on average about 20% of the total build time. Some of the difference is due to the copying of inputs and outputs; aspects that are not accounted for in our critical path calculation.

We also compare build execution times with an estimate of the total work in the build DAG (Figure 10b). This metric is the maximum, over all resources, of the sum of task durations times the demand for that resource, divided by the available capacity of that resource. If there were no other constraints, the total work would be a lower bound on the runtime. From Figure 10b, we see that the total work is between 10% and 20% of the actual running time.

A look at the timelapse of an example build of codebase B (Figure 7a) can explain why. The execution schedule shows several voids in which only a few tasks are running. This is because naively generated dependency graphs can have several "choke points" that limit parallelism. Choke points are instances when all of the unscheduled tasks depend on all of the previous tasks, allowing a single outlier to prolong the build [21]. Figure 7b displays a similar view for codebase C. This codebase has been running distributed builds for a number of years; over time its developers have optimized the build dependency graph to increase parallelism, thus reducing build times. Automatic identification of choke points in our codebases is an area of additional research.

Figure 11 shows that the maximum build time for codebase B is several times the median build time. The primary factor for this difference is caching. The range between min and max is about 10×. We believe that shaving off some of the fixed overheads would stretch this range further.

Table 3 shows the typical resource utilization of tasks. While on average tasks use relatively few resources, the standard deviation is high. From the maximum values, it can be seen that a one-size-fits-all approach to scheduling would not suffice. Multi-resource packing [28] could improve cluster throughput.

Table 4 takes a deeper look at the correlation between different kinds of task resources. It plots the covariance in usages of different resources. Overall, they do not correlate strongly, except for the data read and written by tasks. There is a weaker correlation between memory and data read or written. This is in contrast to tasks in data-parallel clusters, which for the most part process one row after another (joins are an exception). Many build tasks, such as compilation, linking, and tests, may retain more of their inputs (binaries and executables) in memory for the duration of the task.

(a) Codebase B (unoptimized)    (b) Codebase C (optimized over time)
Figure 7: Depicted above are two actual execution schedules for different builds. Each dark grey block represents a task, and the tasks are sorted by the machine on which they were executed. Figure 7a is an unoptimized build, while Figure 7b depicts a schedule for a different, optimized codebase.
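The total-work lower bound used in Figure 10b can be computed directly from per-task resource profiles; a sketch, with illustrative resource names and units:

```python
def total_work_lower_bound(tasks, capacity):
    """Max over resources of (sum over tasks of duration * demand) / capacity.
    tasks: [{"duration": seconds, "demand": {resource: amount}}, ...]
    capacity: {resource: cluster-wide available amount}."""
    bound = 0.0
    for resource, cap in capacity.items():
        area = sum(t["duration"] * t["demand"].get(resource, 0.0)
                   for t in tasks)
        bound = max(bound, area / cap)
    return bound
```

With no other constraints, no schedule can finish faster than this bound, since some resource must process its entire "area" of work at full capacity.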

[Scatter plots of real run-time (normalized) against the critical path (normalized) and against total area / processors (normalized), with the equality line and guide lines at x = 1.1y and x = 1.3y (left) and at x = 3y, x = 5y, and x = 10y (right).]

(a) Critical path    (b) Area
Figure 10: The figures above compare the runtimes of codebase B with two lower bounds: the length of the critical path, and the area. The figures show that codebase B is dominated by work on the critical path and that CloudBuild's scheduler is nearly optimal.

Overall, however, the lack of correlation means that there is opportunity to pack tasks with diverse demands, e.g., I/O-intensive jobs with CPU-intensive jobs.

Resource            Avg.   Stdev.   Max
Cores               0.2    0.23     5
Memory (MB)         10     93       3622
Data read (MB)      50     115      220
Data written (MB)   23     148      345
# Parents           9      3        38
# Children          9      67       202
Duration (s)        21     42       472

Table 3: Resource profiles of tasks.

        C        M        R        W        P       CH
C       –
M     0.145      –
R     0.178    0.49       –
W     0.103    0.390    0.87       –
P    -0.011    0.039    0.35     0.320      –
CH    0.014    0.074    0.071    0.067    0.016     –

C: Cores, M: Mem., R: Data read, W: Data written, P: Parents, CH: Children
Table 4: Correlation between pairs of resource demands of the various tasks.

4.2 People Experiences

Customer Education: While CloudBuild was designed to be highly compatible with old build specifications, it does impose additional constraints on what users specify. Because of sandboxing, the execution environment can be somewhat different from that of MSBuild and nmake. Hence, builds can experience unique failures in CloudBuild; even common failures may present themselves differently in a distributed environment. As large engineering populations onboard onto CloudBuild, we have had to fund substantial user education efforts, with both proactive documentation and troubleshooting consultancy. We have also implemented functionality to detect, triage, and explain failures, and to correlate them with pertinent documentation. Having said that, much work is still needed to identify systems that can automatically surface the most relevant content based on user-stated or implied needs. Another customer education direction was training to adapt build specifications to forms that can be more effectively parallelized. Figure 7 shows an example where the build on the right is more easily parallelizable than the one on the left (fewer void cycles, more work off the critical path, etc.).

Virtuous but spiraling demand: As codebases are onboarded and users see shorter build times, the demand for builds increases. Builds that would erstwhile be executed on desktops are now offloaded to CloudBuild. The lower cost also means that developers are much more inclined to submit "speculative builds" even where there is little confidence that the builds and tests will succeed. Developers get to wait less and check in more often, increasing the demand for pre- and post-checkin build jobs. We also saw the number of unit tests consistently increase. This is less of a problem, though, since such tests typically add little to the critical path of the build. CloudBuild continues to be substantially scaled out to accommodate all the dimensions of growth.

Evangelizing the best-of-breed: By becoming the most widely adopted build solution at Microsoft, our team has exposure to a large variety of codebases. This has enabled the team to identify best-of-breed tools, build specifications, patterns, and practices, and to evangelize them across different groups. We also funnel emerging requirements back to tool owners, and allow them to validate tool changes against their customer codebases on CloudBuild.

4.3 Technical Experiences

Races and Non-idempotence: It is possible that two concurrent build jobs observe a cache miss and race to execute the same task. Avoiding races requires serialization and hurts performance. Hence, rather than prevent such races, CloudBuild optimistically handles them after the fact. The issue with races is that build tools are not idempotent, i.e., running the same task twice can produce different outputs, for a variety of reasons. The output can contain environmental information such as a machine name or timestamp. Or, it could have some uninitialized strings with random values. To handle such races, CloudBuild only allows the first writer, deemed the winner, to update the cache. The others receive an indication that the task's output is already available, causing them to discard their local output and use the winner's output instead.

Customers expect provenance relationships between the outputs of a build job to hold just as if the build had been performed on a single machine without caching. For example, consider a build where a task t1 emits file raw\file.dll and a child task t2 emits compressed\file.dll.gz. A build output consumer can expect that unzipping file.dll.gz yields a file that is bitwise-identical to file.dll. Assume now that an earlier build job created cache entries for t1 and t2, but later t1's entry was deleted for some reason. If t1 is regenerated by a non-idempotent tool and t2's output is retrieved from the cache, file.dll and file.dll.gz will no longer match. To avoid this issue, CloudBuild includes the input's content bag identity when computing a target's CacheKey. In this example, t1's content bag identity is used to obtain t2's CacheKey, which leads to a cache miss and thus forces t2 to be re-built, avoiding "Frankenbuilds".

Build Job Cohorts: For large codebases, build durations and request arrival processes are such that there are often many concurrent jobs building the same codebase. Most developers tend to work close to the "head revision" of their codebase, so the majority of source code is common across concurrent jobs. The races described above are therefore very common. Later-starting jobs are likely to catch up, since they consume cached content created by the earlier-starting jobs. Once they catch up, they will pathologically race to execute further tasks (over common sources). This creates high demand for certain CacheKeys and causes load on the GCS to spike. It also wastes cluster resources in the work performed by the race loser. Federating schedulers so as to minimize races between different build jobs is an area for further exploration.

Non-Deterministic Build Outputs: Prior to CloudBuild, codebases were built using a single machine. While parallelism was possible, task execution order was mostly stable across build jobs, since the relatively naive scheduler had only the limited resources offered by a single machine. Conflicts between tasks can go unnoticed when task execution order is stable. For example, suppose a spec allows two tasks to output to the same file path; this would go unnoticed if the stable execution order ensured that the same task always won in single-machine builds. Since CloudBuild schedules over many machines and can receive variable amounts of cached outputs, the execution orders are no longer deterministic. This surfaced several hidden or latent bugs. The vast majority of onboarded codebases had to be patched to remove latent bugs. CloudBuild's consistency policies (see §3.5) helped detect some of these issues. Surprisingly, some codebase owners were happy to allow output non-determinism – in some cases because the affected output content was not used by the application at run-time; in others because any of the possible output values was acceptable.

5. RELATED WORK

Google's Bazel [5] system solves a similar problem – parallel and cached builds for large codebases. The key difference is that Bazel requires developers to write specifications and redesign their code layout. It offers a new build specification language and requires a specific directory structure. Both of these decisions can simplify dependency inference. In contrast, CloudBuild's primary goal is to quickly onboard teams with minimal changes to their existing build specs and tools. Further distinctions are hard to draw, since very little information is publicly available about the operational aspects of Bazel. The released version only targets single-machine builds. How they cache or schedule builds, the extent of support for arbitrary build tools, and their performance and time-to-onboard for new teams would all be very useful topics to compare in the future.

distcc [6] is another publicly available system. The unit of work is a preprocessed source code file that is sent over the network and compiled remotely. A pump mode allows preprocessing to also be executed on the distributed host. Compared to CloudBuild, it misses several steps, such as distributed linking or test execution. Linking is executed on the main host, and test execution is not part of its design. Further, distcc only supports gcc.

State-of-the-art build systems such as Maven [2], Ant [1], and SBT [13] are neither distributed nor cache build outputs; some offer parallelization at a coarse granularity [2]. The key ideas in CloudBuild improve upon ideas from prior work; our use of sandboxes is similar to [20], and our cache for build outputs is akin to [29].

Some prior work describes new build specification languages with formal properties [25, 30] and techniques to automatically migrate legacy specifications [27]. Other work reduces "debt" [38] or pares down build targets to only the outputs that are actually used [40]. CloudBuild is orthogonal to these efforts. We believe that such efforts are needed, and we are working towards these goals as well.

Yarn [8], Mesos [31], and other data-parallel computing infrastructures work with tasks that are already amenable to distributed execution, i.e., tasks with deterministic and idempotent behavior. These attributes also mean that they are easily runnable in parallel. Many of the challenges that CloudBuild solves are towards making build tools and specifications geared for single-threaded, single-machine execution work in a distributed setting.

The build scheduling problem (§3.6) is closest to scheduling jobs with dependencies (DAGs), as is common in data-parallel frameworks such as Tez [3], Hive [39], and SCOPE [24]. There are a few key differences, however. First, due to caching, the subset of the build DAG that needs to be executed varies dynamically. Second, unlike data-parallel jobs, where the data in flight reduces dramatically with each operation [19], build outputs are numerous and large in size. Hence, scheduling build DAGs needs more careful handling of dependencies.

6. FINAL REMARKS

CloudBuild was started to enable Bing's fast delivery cadence. It has since been rapidly adopted by many teams across Microsoft. The primary reason has been that instead of a few builds per day, developers could suddenly do many builds and thus run many more experiments. For instance, the number of builds for codebase B doubled within six months, despite the fact that the team did not change its development process. Changing engineering processes can improve agility further. For instance, a team that recently onboarded has 2000 targets and executes 10× more builds per day than they could two months ago using their old build-lab-based system.

Growing CloudBuild to support teams with diverse programming cultures (e.g., Bing and Office) was not without challenges. First, we struggled with the immense scale requirements. For instance, our load exceeded existing binary storage and file-signing solutions. While we could scale up or scale out some of the dependent systems, others had to be redesigned. We created a new content-addressable de-duplicated store in the cloud, which is currently used for drops, symbols, and packages. By exploiting CloudBuild's cache hit rates, we were able to reduce the ingress and egress volume for various storage systems by over 80%.

Another challenge was the requirement to integrate into the very different engineering workflows that existed across groups. In some cases, CloudBuild did not have proper abstractions for all integration points. When new requirements came up, we followed the open source model and allowed developers from other organizations to contribute. The resulting system slowly became fragile because of unintended feature interactions of the new dependencies. We took various steps to address this issue: workflow integration now follows a publish/subscribe model, i.e., integration is done externally. For simpler deployment, testing in production, and dependency isolation, we are currently redesigning CloudBuild to follow a more decoupled service design pattern. Finally, we have started to continuously apply Chaos Monkeys to improve the resilience and recoverability of our services.

In spite of the rapid growth of CloudBuild to a service used by thousands of engineers, CloudBuild continues to achieve its primary goal: improve the cycle time of the developer's inner loop – build and test – and do so reliably and quickly.

7. ACKNOWLEDGMENTS

The authors would like to thank current and past CloudBuild staff members and contributors: Stephan Adler, Andrew Arbogast, Jonathan Boles, Nick Carlson, Matthew Chu, Jacek Czerwonka, John Erickson, Steve Faiks, Maksim Goleta, Dmitry Goncharenko, Dedian Guo, Mario Guzzi, Jeffrey Hamblin, Cody Hanika, Kim Herzig, Zach Holmes, Jinjin Hong, Fei Huang, Rui Jiang, Paul Jones, Jeff Kluge, Ben Lieber, Eric Maino, Joshua May, Michael Michniewski, Paul Miller, Nick Pizzolato, Michael Pysson, Josh Rowe, Zack Runner, Vinod Sridharan, Stanislaw Szczepanowski, Carl Tanner, Suresh Thummalapenta, Umapathy Venkatachalam, Christopher Warrington, Rick Weber, Martha Wieczorek, Yancho Yanev, and Jet Zhao.

8. REFERENCES

[1] Apache Ant project. ant.apache.org.
[2] Apache Maven project. maven.apache.org.
[3] Apache Tez. http://tez.apache.org/.
[4] Apache ZooKeeper. zookeeper.apache.org.
[5] Bazel. http://bazel.io/.
[6] distcc. https://github.com/distcc/distcc.
[7] Git for Visual Studio. http://bit.ly/1vqcDFe.
[8] Hadoop YARN project. http://bit.ly/1iS8xvP.
[9] MSBuild. http://bit.ly/1FwkCEz.
[10] Nmake. http://bit.ly/1NgNzsE.
[11] NuGet. http://bit.ly/1OdeEJA.
[12] Robocopy. http://bit.ly/1OdeEJA.
[13] Scala build tool. www.scala-sbt.org.
[14] Source Depot. https://en.wikipedia.org/wiki/Perforce.
[15] Team Foundation Version Control. http://bit.ly/1ES8VLm.
[16] TPC-DS benchmark. http://bit.ly/1JCuDap.
[17] TPC-H benchmark. http://bit.ly/1KRKgl.
[18] VSTest. http://bit.ly/1L2nmPO.
[19] S. Agarwal et al. Re-optimizing data parallel computing. In NSDI, 2012.
[20] G. Ammons. Grexmk: Speeding up scripted builds. In Workshop on Dynamic Systems Analysis, 2006.
[21] G. Ananthanarayanan, S. Kandula, A. Greenberg, I. Stoica, Y. Lu, B. Saha, and E. Harris. Reining in the outliers in map-reduce clusters using Mantri. In USENIX OSDI, 2010.
[22] A. Baumann et al. Shielding applications from an untrusted cloud with Haven. In OSDI, 2014.
[23] B. Calder et al. Windows Azure Storage: A highly available cloud storage service with strong consistency. In SOSP, 2011.
[24] R. Chaiken et al. SCOPE: Easy and efficient parallel processing of massive datasets. In VLDB, 2008.
[25] M. Christakis, K. Leino, and W. Schulte. Formalizing and verifying a modern build language. In FM, Lecture Notes in Computer Science, 2014.
[26] E. G. Coffman and R. L. Graham. Optimal scheduling for two-processor systems. Acta Informatica, 1(3):200–213, 1972.
[27] M. Gligoric et al. Automated migration of build scripts using dynamic analysis and search-based refactoring. In OOPSLA, 2014.
[28] R. Grandl, G. Ananthanarayanan, S. Kandula, S. Rao, and A. Akella. Multi-resource packing for cluster schedulers. In SIGCOMM, 2014.
[29] A. Heydon, R. Levin, and Y. Yu. Caching function calls using precise dependencies. In PLDI, 2000.
[30] J. Hickey and A. Nogin. OMake: Designing a scalable build process. In FASE, 2006.
[31] B. Hindman et al. Mesos: A platform for fine-grained resource sharing in the data center. In NSDI, 2011.
[32] G. Hunt and D. Brubacher. Detours: Binary interception of Win32 functions. In USENIX Windows NT Symposium, 1999.
[33] M. Isard. Autopilot: Automatic data center management. OSR, 2007.
[34] M. K. McKusick, K. Bostic, M. J. Karels, and J. S. Quarterman. The Design and Implementation of the 4.4BSD Operating System. Pearson Education, 1996.
[35] Microsoft. Azure Service Bus. http://bit.ly/1LjqUIf.
[36] Microsoft. Introducing Visual Studio Online. http://bit.ly/1OvpNez.
[37] Microsoft. SMB Protocol and CIFS Protocol overview. http://bit.ly/1Crljd7.
[38] J. D. Morgenthaler et al. Searching for build debt: Experiences managing technical debt at Google. In Workshop on Managing Technical Debt, 2012.
[39] A. Thusoo et al. Hive – a warehousing solution over a map-reduce framework. In VLDB, 2009.
[40] M. Vakilian et al. Automated decomposition of build targets. In ICSE, 2015.
[41] J. Zhou et al. SCOPE: Parallel databases meet MapReduce. In VLDB, 2012.