The Gozer Workflow System

Jason Madden∗, Nicolas G. Grounds∗, Jay Sachs∗, and John K. Antonio†

∗RiskMetrics Group, 201 David L. Boren Blvd, Suite 300, Norman, OK, USA
†School of Computer Science, University of Oklahoma, Norman, OK, USA

Abstract

The Gozer workflow system is a production workflow authoring and execution platform that was developed at RiskMetrics Group. It provides a high-level language and supporting libraries for implementing local and distributed parallel processes. Gozer was developed with an emphasis on distributed processing environments in which workflows may execute for hours or even days. Key features of Gozer include: implicit parallelization that exploits both local and distributed parallel resources; survivability of system faults/shutdowns without losing state; automatic distributed process migration; and implicit resource management and control. The Gozer language is a dialect of Lisp, and the Gozer system is implemented on a service-oriented architecture.

1. Introduction

Gozer names both a platform for developing and executing complex distributed and parallel processes called workflows, as well as the high-level language used to encode workflow computations. A primary goal of the platform is to make it easy to take advantage of a distributed system.

The Gozer platform is built on top of the Java platform. It runs within the (proprietary) BlueBox environment, a distributed, message-passing cluster based on a service-oriented architecture. Service instances communicate by placing XML messages on a message queue (the Java Message Service), which distributes the messages to available nodes. Each service describes the operations it offers with an XML document called a WSDL. The Gozer platform exploits the asynchronous nature of the message queue and its ability to load-balance across multiple instances of a single service to achieve distributed concurrency and fault tolerance, i.e., survivability.
The underlying BlueBox platform provides monitoring and management features.

The Gozer language is a Lisp dialect that could be described as a "scripting language" due to its support for interactive development, rapid prototyping, and tight integration with existing Java libraries and BlueBox services. Its primary influence is Common Lisp, but it includes elements from other languages such as Clojure [1] and Groovy [2]. Although the Gozer language and BlueBox platform are currently mostly used for the processing of financial data, they can be considered general-purpose. The Gozer language is executed by a custom bytecode-based Gozer virtual machine and runtime layered on top of the JVM (Java Virtual Machine). This virtual machine (the GVM), described in Section 4.1, provides support for the futures used to implement the local thread-based parallelism of Section 2 and the continuations required by the distributed concurrency of Section 3. Lisp's simple evaluation model and the ease with which a runtime VM could be developed for it drove the decision to use Lisp.

Listing 1 shows example functions for computing the sum of squares in map/reduce style. The function loc-sum-squares performs the computation entirely locally, using Gozer's sequential loop construct for the map step (squaring the numbers) and a sequential application of addition for the reduce step. Section 2 describes how the function par-sum-squares allows for a degree of local parallelism in the map step. Finally, as detailed in Section 3.5, the function dist-sum-squares distributes the map step across available nodes and then performs the reduce step in a single local process, after all the squares have been computed. Notice the similarity among the three variants, which highlights the simplicity of expressing parallel/distributed computations using Gozer.

Listing 1. Sum-of-Squares Variants

    (defun loc-sum-squares (numbers)
      (apply #'+
             (loop for number in numbers
                   collect (* number number))))

    (defun par-sum-squares (numbers)
      (apply #'+
             (loop for number in numbers
                   collect (future (* number number)))))

    (defun dist-sum-squares (numbers)
      (apply #'+
             (for-each (number in numbers)
               (* number number))))

2. Local Parallelism

Local (shared-memory) parallel operations in Gozer are based upon threads and a thread pool, the native parallel primitives provided by the underlying Java platform. Although the Gozer programmer is free to use these primitives, higher-level operators based upon those from Multilisp [3] are provided by the Gozer language. These operators are all declarative in nature and are designed to allow the programmer to focus on opportunities for achieving concurrency rather than on the implementation details.

A key abstraction provided by Gozer for expressing local parallelism is the future. A future represents a computation that may not have completed yet, and it represents a promise to deliver the value of that computation when required, at a future point in time. Until a future's computation completes, the future is said to be undetermined, after which the future is determined. Any value that is not a future is always said to be determined. Futures are used to exploit opportunities for concurrency that exist between the computation of a value and its ultimate use. For example, when transforming a set by applying a function to all of its members, the earliest time that the transformed value of the first member could be used is after the transformation has completed on the last member of the set. This opportunity for concurrency can easily be expressed with a future, which in Gozer is declared with the future operator. The function par-sum-squares from Listing 1 illustrates this usage.

When a computation involves futures, the Gozer programmer generally does not need to take special precautions. Futures can freely be mixed with other values, passed to and returned from functions, stored in data structures, and so on. The GVM is responsible for managing the execution and determination of futures.

The GVM does not provide a guarantee about the sequence in which futures are determined. In the case of IO or other side effects, order of operations can be important. Gozer's touch and pcall operators allow the programmer to control sequencing in these cases. The touch operator causes the calling thread to await the determination of a particular value before proceeding, while pcall applies a function, but only after all its arguments are determined.
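To make these sequencing operators concrete, the following sketch (ours, not from the paper; expensive-computation and log-line are hypothetical helpers) uses touch to order two side-effecting log calls and pcall to delay an addition until its inputs exist:

    ;; A minimal sketch of sequencing with futures. Only future,
    ;; touch, and pcall are the Gozer operators described above;
    ;; expensive-computation and log-line are hypothetical.
    (defun ordered-report (a b)
      (let ((x (future (expensive-computation a)))
            (y (future (expensive-computation b))))
        ;; touch blocks this thread until x is determined, so the
        ;; first log line is guaranteed to precede the second.
        (log-line "x is ready:" (touch x))
        (log-line "y is ready:" (touch y))
        ;; pcall applies + only after both arguments are determined.
        (pcall #'+ x y)))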
3. Transparently Distributed Workflows

3.1. Overview

Shared-memory thread-based local parallelism has a number of disadvantages for the construction of large, complex, evolving processes. First, it doesn't scale well beyond a single physical machine. The potential for side effects makes it challenging to evolve a local process over time or to integrate code from multiple authors. Any robustness in the case of machine failure, such as the saving and resuming of intermediate states, must be programmed explicitly for each process. Finally, especially in the case of long-running processes, it may be desirable to "suspend" a process in order to allow a higher-priority process to use scarce resources such as memory; such suspension would similarly require explicit handling in the design of each process. Gozer's distributed workflows are designed to overcome these difficulties by allowing a single process to span multiple machines, by using a fork/join paradigm that prohibits side effects, by automatically creating and maintaining persistent checkpoints, and by using non-blocking, zero-resource-consuming, event-driven processing.

Gozer's distribution facilities, and in general its integration with the BlueBox platform, are provided in a separate module known as Vinz. Vinz offers a simplified set of abstractions to workflow authors, intended to make writing fully distributed, concurrent workflows as similar to writing local, sequential programs as possible. As with local parallelism, opportunities for distribution are written in a declarative fashion, and the details of implementation are provided by the platform.

Listing 1's function dist-sum-squares demonstrates distributed programming with Vinz. Choosing an appropriate level of concurrency, distributing work to available nodes, gathering results, and continuing the computation when all concurrent work is completed are all handled automatically by Vinz. Importantly, waiting for computations to complete is event driven and consumes no resources. Even though conceptually dist-sum-squares blocks awaiting the for-each results, no actual blocking occurs because the (distributed) process state is saved to persistent storage and is restored only after all necessary results are available. This type of distribution may be nested to an arbitrary depth, and the results of each step may be arbitrarily complex.
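As an illustration of such nesting (our sketch, not from the paper; portfolio-positions and value-position are hypothetical functions, while the for-each syntax follows Listing 1), an outer distributed loop may itself contain an inner distributed loop:

    ;; Nested distribution: one fiber per portfolio, and within
    ;; each of those fibers, one nested fiber per position.
    (defun value-portfolios (portfolios)
      (for-each (portfolio in portfolios)
        (apply #'+
               (for-each (position in (portfolio-positions portfolio))
                 (value-position position)))))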

A distributed workflow begins as a Gozer program. Vinz takes this program and makes it available for running on the nodes of the BlueBox cluster. This is done by wrapping the Gozer program up as a distinct BlueBox service. Operations are the only way to interact with a service in BlueBox and the only way instances of services can interact with each other, so this new service provides a standardized set of operations (see Table 1) that may be invoked on some available instance of the workflow service by placing the appropriate messages in the message queue.

Table 1. Vinz Service Operations

Operation       Description
Start           Asynchronously begin execution of a workflow, returning its id.
Run             Synchronously execute a workflow, returning its id.
Call            Synchronously execute a workflow, returning its last result.
Terminate       Management operation to asynchronously terminate any running workflow.
RunFiber        Begin execution of a portion of the workflow on this instance.
AwakeFiber      Resume a suspended parent fiber when a child fiber has completed.
ResumeFromCall  Resume a suspended fiber when a remote operation completes.
JoinProcess     Resume a suspended fiber when any arbitrary process has completed.

The GVM allows a Gozer program (specifically, a flow of control within the program) to request a continuation at any point by executing the yield or push-cc special forms. A continuation represents the completion of the same flow of control (compare to a future, which represents the completion of a different flow of control). Additionally, the yield form causes the GVM to return control to its own caller. Using the operations in Table 1 and GVM continuations requested at opportune moments, Vinz automatically distributes and migrates workflows.

Execution of a workflow is typically initiated by invoking the Start operation with a set of workflow-defined parameters. This causes the creation of a task, which uniquely identifies that particular running instance of the workflow. Every task contains one or more uniquely identified fibers (initially one). A fiber encapsulates a Gozer flow of control that may be advancing on only a single node at any given time. A task is somewhat analogous to an operating-system process, while a fiber is analogous to a thread within that process.

Once Start has created a task and fiber, it prepares the environment in which the main fiber will execute and, with the support of the GVM, saves its initial continuation (state) to persistent storage. The Start operation then issues an asynchronous invocation of the RunFiber operation with a parameter identifying the newly created fiber. Its job complete, Start now returns the task's ID to the caller, who can use it to monitor the progress of the task.

When the message queue delivers a RunFiber request to an instance of a Vinz workflow service, the fiber's continuation is loaded from persistent storage, and the GVM begins executing that continuation. The GVM continues to run until the program has completed, or until the next continuation is requested, at which point the fiber is halted and its state stored for later execution.

[Figure 1. Sample Workflow Lifetime: Start issues a RunFiber request; a for-each fans out into child fibers Loop-1, Loop-2, and Loop-3; AwakeFiber and ResumeFromCall messages resume the suspended parent.]

While running, a fiber can create and execute (via RunFiber) other fibers. These fibers are children of the first fiber. The for-each and parallel macros described in Section 3.5 create and manage fibers automatically. The fork-and-exec and join-process forms of Section 3.4 allow advanced programmers to create and wait for their own fibers.
3.2. Non-Blocking Service Requests

In practice, Vinz workflows largely consist of requests to other BlueBox services. In a traditional synchronous service invocation, the sender is blocked until its request has been delivered by the message queue, the service has completed processing, and the results are returned by the message queue. During this time, the sender is consuming resources (physical memory and a BlueBox request "slot") without making any progress; in essence, those resources are being wasted. An ordinary asynchronous request does no better unless the sender can find other work to do, a situation that is both rare and complex to manage.

Vinz workflows solve the problem of wasted resources by automatically executing a Gozer yield statement when a service request is made. The request message is sent asynchronously, and the message queue is instructed to deliver the response not to the sending instance (as with a traditional system) but instead to any workflow service instance by means of its ResumeFromCall operation. While the service request is being delivered and processed, the Gozer state is saved to persistent storage. Later, when the results are available, the Gozer state is restored and processing is resumed.

Overall, this allows many more tasks to be in progress at any one time. Wall-clock time, CPU resources, and memory that would otherwise have been wasted blocking can now be used by a different task to make progress. Because the message queue is constantly load balancing in-progress requests and making decisions based on priority, this also helps to improve interactive usage (interactive requests are less likely to be held up by individual long-running batch workflows). In addition, together with the entire state of the task being regularly stored to stable storage and the message queue providing buffering and re-delivery of messages in case of instance failure, this makes for a highly robust system, one in which the failure of any instance will result in only minimal delays as other instances automatically compensate.

Of course, there are some cases where the service operation is expected to be very fast, and therefore the relative overhead of the aforementioned task migration might be considerable. In these cases, the programmer can, statically or dynamically, choose to require Vinz to instead make a standard synchronous request. If a service request is attempted from a future's background processing thread, it is generally not possible for process migration to occur (what state should be persisted? what about other futures?), so Vinz detects this and automatically makes a standard synchronous request.

3.3. Deflink

Vinz takes advantage of the dynamic code-generation features of Gozer macros and the published service interfaces to make it easy for a workflow to interact with any service. A macro called deflink provides this functionality. This macro requests a service's interface in the form of an XML document, parses it, and then generates a set of functions to invoke each operation the service publishes, together with the appropriate placement of yield statements to make the requests non-blocking. The XML message structure is flattened into a set of parameters for the function, and the function is capable of coping with complex XML trees by using corresponding Gozer data structures.

Listing 2. Deflink

    (deflink SM :wsdl "urn:security-manager-service"
                :port "SecurityManager")
    ==>
    (defun SM-ListSessions-Method (&key FilterParams WithinRealm)
      "Returns a list of sessions visible to the ..."
      (let ((msg (create-message "SM-ListSessions")))
        (. msg (set "FilterParams" FilterParams))
        (. msg (set "WithinRealm" WithinRealm))
        (SM-ListSessions :message msg)))

    (defun SM-ListSessions (&key message)
      "Returns a list of sessions visible to the ..."
      (restart-case
          (let ((response
                 (cond
                   ((%is-fiber-thread)
                    (call-wsdl-operation-async
                     :soap-action "...:ListSessions"
                     :message message)
                    (yield))
                   (otherwise
                    (call-wsdl-operation
                     :soap-action "...:ListSessions"
                     :message message)))))
            (parse-wsdl-response response))
        (ignore () (log "Ignoring an exception"))
        (retry () (SM-ListSessions :message message))))
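A call to the generated high-level function might look like the following sketch (ours; the keyword parameters come from the generated signature in Listing 2, but the argument values are invented):

    ;; Hypothetical caller of the deflink-generated function.
    ;; Because this runs on a fiber, Vinz transparently yields
    ;; and later resumes it via ResumeFromCall.
    (let ((sessions (SM-ListSessions-Method
                     :FilterParams nil
                     :WithinRealm "example-realm")))
      (length sessions))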
As part of the source code for the workflow, each deflink is evaluated when the source code for the workflow is loaded. This ensures that the functions generated will be appropriate for the service version currently operating, which helps smooth over minor version incompatibilities. If for some reason an operation cannot be interacted with from a Gozer function, deflink instead generates a Gozer macro that signals an error. In this way, if and only if the workflow tried to invoke that operation, a compile-time error will occur and the workflow will not be loaded, thus avoiding runtime errors.

Listing 2 shows an example invocation of the deflink macro and some of the generated functions (edited for size). Notice that the documentation specified in the interface document is preserved for the Gozer programmer. The function SM-ListSessions-Method provides the high-level interface using named keyword function arguments. The function SM-ListSessions actually invokes the service, handling the case of background threads as well as using Gozer's condition system to provide optional restarts in the event of error (see Section 3.7 for more on error handling).

3.4. Forking Fibers

As discussed earlier, the fundamental unit of computation in a Vinz workflow is a fiber. A fiber can be running at most once on one node in the BlueBox cluster, and every workflow begins with a single fiber created by the Start operation. In order for workflow processing to take advantage of multiple nodes in the cluster, then, multiple fibers must be created.

Fiber creation is similar to the creation of processes in the Unix model. A parent fiber, including all its state, is first cloned with a fork call. The newly created fiber then continues its execution by executing a different code path than its parent. In practice this different code path is always a call to a user-supplied function, and so the primitive operation combines this fork of a new fiber and exec of a function into a single step. The newly created child fiber is scheduled for execution by placing a RunFiber request on the message queue.

Although this combination of the fork and exec operations into a single step is the model commonly used for threads that run within the same process, the initial cloning of the parent fiber means that the analogy to Unix processes is a better fit. Although the variables in the child fiber start out identical to those in the parent, changes either fiber makes will not be visible to its clone. This eliminates any burden of synchronization when mutating variables and values, thus simplifying the programming model both for the workflow author and for the Vinz implementation, and it drastically reduces the amount of distributed coordination that Vinz must perform.
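The clone semantics can be pictured with a small sketch (ours, not from the paper; the fork-and-exec form is described next, and exactly how it captures enclosing bindings is our assumption):

    ;; Each fiber sees its own copy of cloned state.
    (let ((counter 0))
      (fork-and-exec (lambda ()
                       ;; The child mutates its own cloned copy...
                       (setf counter 99)))
      ;; ...so the parent's copy is unaffected and is still 0 here.
      counter)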
Workflow authors have direct access to the ability to create new fibers through the fork-and-exec function, which returns to the parent fiber the id of the child fiber while executing a supplied function (often an anonymous lambda) in the child. A fiber can wait for any other fiber to terminate using the join-process function (analogous to the Unix wait function). A fiber that calls join-process ultimately invokes yield and so relinquishes its resources until such time as the requested fiber terminates. If called from a background processing thread, join-process only suspends that thread, leaving the rest of the fiber unaffected.

3.5. For-Each and Parallel

The fork-and-exec function, in combination with the join-process function, is enough to implement many useful distribution strategies. However, these functions are very low-level and as such may be error-prone. Vinz provides two macros, conceptually layered on top of fork-and-exec, that capture the most commonly used distribution patterns.

The most frequently used of these macros is for-each, which implements the map step of the map/reduce paradigm. Listing 1 provides an example of using for-each. This macro takes as input a sequence of values and, for each value in that sequence, executes the same body of statements. The results of these executions are collected and returned to the parent fiber. The parent fiber is "blocked" (as with yield) until all the executions are complete. Optionally, for-each may group the values into "chunks" which may then be handled in a locally-parallel fashion, for a combination of distributed and local concurrency. The for-each macro completely abstracts away the operations involved in setting up concurrent fibers and awaiting their completion.

Less frequently used is the parallel macro. This macro simply executes all the forms in its body in new fibers. The result of each form is collected and returned to the parent fiber (which is again blocked).

If for-each or parallel is used from a background thread, it cannot yield the fiber, for the same reasons that a non-blocking service request cannot. The solution in this case is to have the background thread fork a new fiber which in turn executes the for-each or parallel code. The background thread synchronously joins this fiber.
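A sketch of the parallel macro (ours; load-positions and load-market-data are hypothetical functions standing in for deflink-style service calls, and the shape of the collected result is our assumption):

    ;; Each body form runs in its own child fiber; the parent is
    ;; blocked, consuming no resources, until both complete.
    (parallel
      (load-positions)      ; child fiber 1
      (load-market-data))   ; child fiber 2
    ;; => the two results, collected and returned to the parent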
The for-each macro in particular may result in an arbitrarily large number of new fibers (a number equal to the number of values in the sequence) and their corresponding RunFiber requests on the message queue. In order for cluster resources to be shared among workflows in the desired way (not necessarily fairly), these macros introduce a configurable throttling mechanism. Called the spawn limit, this throttle prevents any individual macro invocation from resulting in more than the configured number of concurrently executing fibers at one time. The spawn limit may be dynamically adjusted by the workflow. Listing 3 shows a simplified example of what the macro in Listing 1 might expand to given the numbers from one to five, if the spawn limit were three. (It is simplified in particular because awaking parent fibers is not actually performed by a function call that the child fiber executes; for reliability, it is a property of the fiber itself. The fibers created by fork-and-exec do not notify their parent of termination, but the fibers created by these macros do.) The total number of yield forms will be equal to the number of child fibers created, but their distribution will differ depending on the spawn limit.

Listing 3. Vinz Spawn Limit

    (let ((parent-pid (get-process-id))
          (children (list))
          (func (lambda (number)
                  (* number number)
                  (awake parent-pid))))
      (append! children (fork-and-exec func :argument 1))
      (append! children (fork-and-exec func :argument 2))
      (append! children (fork-and-exec func :argument 3))
      (yield)
      (append! children (fork-and-exec func :argument 4))
      (yield)
      (append! children (fork-and-exec func :argument 5))
      (yield)
      (yield)
      (yield)
      (collect-child-results children))

Listing 3 shows that the parent fiber must wait for each child fiber to awaken it before continuing. The child fiber awakens the parent fiber by placing a message for AwakeFiber on the message queue. This is more efficient than having the parent fiber invoke join-process for each child fiber created, since the child fibers may finish in any order and in a for-each the order does not matter; all that matters is that the total number of yield operations match the total number of AwakeFiber messages. While this is simple and robust, it is also a fairly heavy-handed way of achieving parent/child communication, and it can introduce artificial bottlenecks and very bursty behaviour (discussed further in Section 5).

3.6. Task Variables

Because fibers are cloned copies of their parents, side effects like variable or value mutation are not visible between fibers. Child fibers created with the macros discussed in the previous section communicate to their parents by their return values in a very functional fashion, which is a good fit for Gozer's semi-functional nature. In some cases, though, a distributed algorithm can be greatly simplified if some mutable values can be globally shared between all fibers.

Vinz supports this mutable sharing through a concept known as task variables. A task variable is declared at the top level of the workflow program with the deftaskvar macro, similar to the basic Gozer defvar macro for defining global variables. All fibers within a task of that workflow may access and update that variable. Vinz guarantees that each fiber will see a self-consistent value for that variable and will always see the latest value for that variable. Stronger promises, such as access order or atomic read-modify-update sequences, are not provided.
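Because atomic read-modify-update is not guaranteed, a counter shared through a task variable can lose updates when fibers race, as in this sketch (ours; deftaskvar is described above and the ^...^ access syntax is described next):

    ;; A task variable used as a shared counter; not from the paper.
    (deftaskvar completed-count
      "How many items have been processed so far.")

    ;; In any fiber: this read-then-write sequence is not atomic,
    ;; so two fibers incrementing concurrently can lose an update.
    (setf ^completed-count^ (+ 1 ^completed-count^))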
Gozer global variables are conventionally given names that start and end with the * character ("earmuffs"). Similarly, Vinz task variables are given names that start and end with the ^ character, although this is a requirement, not just a convention. In order to provide the desired semantics for task variable access, Vinz needs to replace each read or write of the variable with a function call that performs the steps of checking for a stale local cache, reading the most recent value from the persistence store, taking out appropriate locks, etc. Vinz does this by hooking into the Gozer source parser (the reader) using a reader macro defined on the ^ character (see Listing 5). Each occurrence of a task variable such as ^exit-flag^ in the source file is read as if it were the form (%get-task-var '^exit-flag^). Listing 4 shows a task variable used to let any fiber in a distributed loop signal all the others to stop.

Listing 4. Using A Task Variable

    (deftaskvar exit-flag
      "A global flag. When this becomes true, stop.")

    (defun dist-sum-squares (numbers)
      (for-each (number in numbers)
        ;; don't do anything if the flag is set
        (unless ^exit-flag^
          (if (= -1 number)
              ;; tell everybody to stop working!
              (setf ^exit-flag^ t)
              (* number number)))))

Listing 5. Task Variable Reader Macro

    (set-macro-character #\^
      (lambda (the-stream c)
        (declare (ignore c))
        ;; ^foo^ -> (%get-task-var 'foo)
        (let ((var-name (read the-stream t nil t))
              (var-str (symbol-name var-name)))
          (unless (. var-str (endsWith "^"))
            (error "Task vars must be wrapped in ^"))
          `(%get-task-var ',var-name)))
      t)

3.7. Error Handling

An important part of a robust system is error handling or, more generally, condition handling. Gozer provides an implementation of the very general Common Lisp condition system, which goes above and beyond exception handling by not requiring the stack to unwind in order to handle conditions [4]. In turn, Vinz provides workflow authors with some convenient extensions built on Gozer's condition system.

The core of these extensions is the concept of a named handler, which is created by the macro defhandler and utilized by the with-handler macro (Listing 6). A handler associates a list of conditions (whether Java classes or XML QNames [5]) with an action (usually) provided by Vinz, making it possible to centralize condition-handling logic. Instead of repeating the list of conditions every time the programmer wants to take a certain action to handle a condition (in handler-bind forms spread throughout the program), the programmer can define a handler once and use it repeatedly in with-handler.

Listing 6. Vinz Error Handling

    (defhandler ignore-handler
      :java ("java.lang.Throwable")
      :action ignore)

    (defhandler retry-handler
      :java ("java.net.SocketException")
      :code ("{urn:service}Connect"
             "{urn:service}Transmit")
      :action retry
      :count 5)

    (with-handler ignore-handler
      (with-handler retry-handler
        (optional-socket-operation)))

Vinz provides four actions (an action is just a function, so the workflow author is free to define additional actions). Two actions, retry and ignore, invoke an active restart of the same name. The functions created by deflink bind these restarts, and the programmer can also bind them. The retry restart is intended to be used to deal with possibly transient errors, such as network connectivity failures, without the programmer being forced to write an explicit loop. Ignoring a condition can be used to allow "optional" operations, such as generating debugging data, to fail without impacting the workflow.
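Since an action is just a function, a workflow author could define a custom one, roughly as follows (a speculative sketch: the paper does not document the arguments an action receives, so the single condition parameter and the use of the condition system's invoke-restart are our assumptions):

    ;; A hypothetical user-defined action that logs and then
    ;; ignores any Java exception.
    (defhandler log-then-ignore-handler
      :java ("java.lang.Throwable")
      :action (lambda (condition)
                (log "Handled a condition; continuing")
                (invoke-restart 'ignore)))

    (with-handler log-then-ignore-handler
      (generate-debugging-data))   ; hypothetical optional operation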
The two remaining actions are break and terminate. These actions interact with Vinz fibers and tasks. The former action is named for the Java keyword of the same name. Intended to be used around a distributed for-each loop, the break action causes the currently executing fiber to immediately terminate cleanly and return nil to the parent fiber (other fibers are unaffected). In contrast, terminate terminates both the current fiber and the entire task with an error status. Any other fibers that are currently running or are queued by the message queue will notice in short order that the task has terminated and will also terminate in error.

Both local and distributed conditions can be handled with these extensions. When a function created by deflink invokes a service, the response from the service might be an error, conveniently expressed as an XML QName. The function arranges for this QName to be signaled as an error, thus integrating distributed error conditions into Vinz condition handling.

4. Implementation

4.1. Futures And Continuations

The futures and continuations described in this paper are a fundamental part of the Gozer language. They, and the rest of the runtime semantics of the Gozer language, are implemented by a bytecode interpreter called the Gozer Virtual Machine (GVM), which runs on top of the Java Virtual Machine (JVM).

Continuations are not supported by the JVM (there is no way to capture a call stack and re-enter it later). This implies that the JVM's operand stack and function calling operations could not be used. Instead, the GVM implements its own stack-oriented architecture, in many ways similar to the JVM's architecture. The stack consists of ordinary Java objects representing function calls together with arguments, local variables, etc. These objects are used to create the continuations requested by yield and push-cc. This is similar to the approach taken by Stackless Python [6]. Compilation to bytecode (as opposed to a tree-walking interpreter) was introduced as an optimization for Vinz persistence.

In contrast with continuations, the JVM does provide the concept of futures through its ExecutorService. The challenge for the GVM here was to make them transparent to the programmer, completely managing their execution and determination, while allowing Vinz to integrate them into its distributed workflows (it is problematic to migrate a fiber from one machine to another while some of its futures are still running on the first machine). To do this, the GVM adopts the rule that passing any future to a Java library or a BlueBox service will cause that future to be determined. In addition, when capturing a continuation, futures referenced from that continuation are determined (the continuation doesn't become available until all futures have completed). Finally, the BlueBox platform provides an ExecutorService that integrates with its native load balancing heuristics, and Vinz configures futures to be created using this implementation.

4.2. Workflow Distribution

The BlueBox service framework provided many of the tools required to build distributed workflows: a service-oriented architecture, a message queue with addressable services, a global process tracking service, etc. Only a few additional features were required.

One clear need was a way to persist a fiber's state and data so that one instance could write it and another instance could later read it and resume execution. The Java platform defines a built-in way to externalize in-memory objects, called serialization, and so building on this support was the obvious approach. Vinz thus writes a fiber's state and data using Java serialization, with many customizations for efficiency and to broaden what can be successfully serialized. A shared NFS filesystem provides all instances with read and write access to this data. To prevent this filesystem from becoming a single point of failure in the distributed system, very highly-available network attached storage (NAS) servers running on purpose-built enterprise hardware are utilized.

Much time was spent optimizing Vinz serialization for performance. A series of tests determined that compressing the serialized data before writing it to NFS was a net win, reducing IO costs considerably, even though the Java serialization format is computationally expensive to compress with standard deflate-based compression techniques. It was also discovered that plain deflate can be made to perform approximately 30% better than the more robust and space-efficient gzip format for this data. Analysis of serialized data resulted in the introduction of a custom serialization format that stores the most commonly serialized objects more efficiently. Even after all this, reconstituting a fiber from its persisted state is still relatively slow, and so a cache of recently seen fibers is maintained in memory on each instance. Because Vinz exercises no control over where a fiber will be asked to run (leaving that in the hands of the message queue), the cache is only somewhat effective. Empirical measurements show cache hit rates of about 18% and 66% for mutable and immutable data, respectively.
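The compress-before-write idea can be sketched with Java interop (ours, not Vinz's actual code: the JDK stream classes are standard, but the (new ...) constructor syntax is an assumption extrapolated from the interop style of Listing 2):

    ;; Serialize an object graph through a raw deflate stream;
    ;; DeflaterOutputStream writes plain deflate data, without
    ;; the gzip header the text above reports is unnecessary.
    (defun write-compressed-state (fiber-state path)
      (let ((oos (new java.io.ObjectOutputStream
                      (new java.util.zip.DeflaterOutputStream
                           (new java.io.FileOutputStream path)))))
        (. oos (writeObject fiber-state))
        ;; closing the object stream finishes the deflate stream
        ;; beneath it and closes the file
        (. oos (close))))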
Another obvious requirement was a way to prevent a single fiber from being run by different JVMs at the same time (for example, in the AwakeFiber case discussed above). Within a single JVM, the usual thread locks suffice for this, but across JVMs distributed locks are required. Since the persistence information was being shared using an NFS filesystem, the natural choice was to use file locks on the NFS files. This was simple and, thanks to the enterprise NFS servers in production use, effective, but it was completely opaque. Moreover, the need to be able to test in environments using different NFS servers required the development of code to cope with each implementation's unique quirks. To remedy these problems, a custom distributed lock implementation based on the Apache ZooKeeper distributed coordination system (http://hadoop.apache.org/zookeeper/) is being developed as a replacement for NFS file locks. This same locking and persistence infrastructure is used to implement task variables.

5. Future Work and Conclusion

The Gozer workflow system is in high-volume production usage at RiskMetrics Group. A typical 24-hour period will see around 10,000 new top-level tasks comprising about 45,000 individual fibers. Tasks during this period may run for as long as 12 hours or as little as 20 milliseconds, with the average being about a minute. If these 10,000 tasks were run back-to-back, they would require about 190 hours to complete. Based on this production usage and experience, a number of improvements in monitoring and management, development effort, and runtime efficiency have been identified.

One area for future work is to have Vinz automatically learn which requests and loops do or do not benefit from task migration, and to enable or disable it appropriately, instead of requiring the programmer to decide, and often guess. The for-each chunking function should also dynamically optimize chunk sizes based on the processing time of the body.

The cache effectiveness mentioned in the previous section is less than ideal. Further work is needed in this area, perhaps by devising a way to move the processing work to the last location of the data, as is done, for example, in the Swarm system presented at IEEE P2P '09 in Seattle (http://vimeo.com/6614042).

Task variables, although useful, have a very high synchronization overhead for mutation. Workflow authors have requested lighter-weight cross-process communication mechanisms.

Although the spawn limit solves an operational problem, its implementation is currently sub-optimal.
Consider the case where the spawn limit is absent or very high relative to the number of workflow service instances available, n, and suppose that each child fiber is going to execute in approximately the same amount of time. Initially, n child fibers will be executing concurrently. When they finish, n AwakeFiber messages will be placed on the message queue and delivered for execution. Since a fiber can be executing on at most one instance at a time, n - 1 of those AwakeFiber operations will be forced to wait while a single instance reads and updates the persistence information. Each AwakeFiber instance will proceed in turn, but for some period of time all n instances will be unavailable to process other activity such as other RunFiber requests (and because instances are often shared across services, even unrelated service operations may be blocked, something that Vinz seeks to avoid). To partially counteract this problem, AwakeFiber requests are specified to be low-priority, and a running AwakeFiber places a strict limit on how long it will wait for its turn to execute the fiber before giving up and placing itself back on the message queue for later delivery.

When the spawn limit is low (and especially if the spawn limit is low but the number of child fibers is high), the overhead of sending an AwakeFiber message for permission to spawn the next child seems high. It would be better if, as a child fiber died, it could simply spawn whatever sibling fiber is next without involving the parent. The difficulty here is synchronizing access to the parent so that it continues only when all of its children have ultimately completed. Further work is needed in this area.

Finally, the enterprise message queue is currently completely responsible for load balancing and prioritizing messages. Each task starts and runs to completion independently of any other tasks that may be operating, subject only to the capacity limits of the system. In effect, task scheduling is first-come, first-served, which has been shown to be suboptimal in the presence of deadlines [7]. Work is in progress to develop more efficient and proactive scheduling policies utilizing historical and current information about the state of the entire cluster (shared using ZooKeeper) and the tasks in process, based on the scheduling research presented in [8].

Although still evolving to meet upcoming needs and extend current capabilities, Gozer addresses key requirements associated with processing a vast number and variety of workflow computations submitted by, or on behalf of, numerous clients. The workflow requirements vary greatly in terms of computational complexity, overall duration, and terms defined by service-level agreements. Gozer's ability to easily exploit local and distributed resources through implicit parallelization, together with its high-level language approach to workflow authoring, has allowed the rapid development of dozens of production workflows.

References

[1] S. Halloway, Programming Clojure. Pragmatic Bookshelf, 2009.

[2] D. Koenig, A. Glover, P. King, G. Laforge, and J. Skeet, Groovy in Action, 1st ed. Manning, 2007.

[3] R. H. Halstead, Jr., "Multilisp: a language for concurrent symbolic computation," ACM Trans. Program. Lang. Syst., vol. 7, no. 4, pp. 501-538, 1985.

[4] P. Seibel, Practical Common Lisp. Berkeley, CA: Apress, 2005, ch. 19, "Beyond Exception Handling: Conditions and Restarts."

[5] T. Bray, D. Hollander, A. Layman, and R. Tobin, "Namespaces in XML 1.0," W3C (World Wide Web Consortium), Recommendation, August 2006. [Online]. Available: http://www.w3.org/TR/2006/REC-xml-names-20060816

[6] C. Tismer, "Continuations and stackless Python," in Proceedings of the Eighth International Python Conference, Arlington, Virginia, January 2000.

[7] H. K. Shrestha, N. Grounds, J. Madden, M. Martin, J. K. Antonio, J. Sachs, J. Zuech, and C. Sanchez, "Scheduling workflows on a cluster of memory managed multicore machines," in Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA '09), July 2009.

[8] N. G. Grounds, J. K. Antonio, and J. Muehring, "Cost-minimizing scheduling of workflows on a cloud of memory managed multicore machines," in Proceedings of the 1st International Conference on Cloud Computing (CloudCom 2009), Lecture Notes in Computer Science 5931, December 2009, pp. 435-450.
