GridRun: A Lightweight Packaging and Execution Environment for Compact, Multi-Architecture Binaries
John Shalf
Lawrence Berkeley National Laboratory
1 Cyclotron Road, Berkeley, California 94720
{[email protected]}

Tom Goodale
Louisiana State University
Baton Rouge, LA 70803
{[email protected]}

Abstract

GridRun offers a very simple set of tools for creating and executing multi-platform binary executables. These “fat-binaries” archive native machine code into compact packages that are typically a fraction of the size of the original binary images they store, enabling efficient staging of executables for heterogeneous parallel jobs. GridRun interoperates with existing distributed job launchers/managers like Condor and Globus GRAM to greatly simplify the logic required to launch native binary applications in distributed heterogeneous environments.

Introduction

Grid computing makes a vast number of heterogeneous computing systems available as a virtual computing resource. It is desirable to run native binary programs in order to use these resources efficiently. However, executing native programs in heterogeneous distributed environments typically requires careful staging of the native binary images, complex RSLs, or clever job-launcher scripting to select the appropriate executable to run on each hardware platform. Although there are many robust resource selection systems available as an integrated part of Grid schedulers, progress in deploying production metacomputing applications has been hampered by the non-uniformity and inherent complexity of the methods employed to manage native code for parallel heterogeneous environments.

Interpreted languages and virtual machines are often employed as an abstraction layer that hides architectural heterogeneity [2]. This includes scripting languages, byte-codes, and virtual machines of various forms. However, these solutions have a significant performance impact for compute-bound applications. Native binary programs still offer the most efficient execution environment for compute-bound applications and will continue to play a very important role in distributed applications for the foreseeable future. Therefore, we focus our attention squarely on the issue of simplifying the management and distribution of native executables as multiplatform binary packages (fat-binaries).

In order to support a seamless multiplatform execution environment in lieu of virtual machines, we extend the familiar concept of the “fat-binary” to apply to multi-operating-system environments. The fat-binary has been a very well-known and successful design pattern for smoothing major instruction-set-architecture transitions within a single operating system environment. A fat-binary file contains complete images of the binary executables for each CPU instruction set it supports. The operating system’s file-loader then selects and executes the appropriate binary image for the CPU architecture that is running the program. Fat-binaries have been used successfully for packaging Windows NT programs that could execute at native performance on both DEC Alpha and Intel x86 architectures. In a more widely known example, Apple Computer Inc. used fat-binary executables as an alternative to emulation during the transition from the 680x0 processors to the PowerPC architecture.

From a user’s standpoint, fat-binaries appear to be ordinary program files that execute at native speed on machines with radically different CPU architectures without any additional effort on the user’s part. However, the only available examples of fat-binary execution environments are systems that support different CPU architectures running the same operating system. We desire this same degree of elegance for binaries that work across multiple operating systems as well as different CPU architectures in order to support heterogeneous Grid environments. A Grid-oriented fat-binary execution architecture must support more robust selection criteria than the prior Microsoft and Apple examples. In addition, some level of compression must be employed to ensure that these “fat-binaries” do not get unmanageably large given the larger number of target platforms they must support. Finally, the file format must work well with existing job launchers for both parallel and distributed environments, including Condor, Globus GRAM, GridLab GRMS, and various implementations of MPI [4,7].

In this paper we describe a simple multiplatform execution environment called GridRun that simplifies the creation and management of fat-binary executables for heterogeneous collections of computing resources. GridRun reduces the size of file transfers required to stage the executables, simplifies storage and selection of the correct executable image, and even reduces the complexity of Globus RSLs and Condor submit files for these kinds of jobs. GridRun easily accommodates additional selection criteria that account for heterogeneity in operating systems, instruction sets, and even software libraries.

Related Work

The most typical method for managing binary executables in a heterogeneous environment is to manually stage the executables on each machine. This method is typically employed when launching parallel MPI-based metacomputing jobs that span multiple sites and computer architectures. Examples include the Gordon Bell winning runtime-optimizing transcontinental black-hole simulations performed by Dramlitsch et al. in 2001 [9]. MPICH-G2 was used to launch the job on 1500 processors spread across 4 heterogeneous supercomputer systems located on multiple continents (via DUROC), but the executable images had to be manually staged on each of those systems. DUROC was also used to provide a separate RSL for the job launch on each system. Fat-binaries would allow a single binary image to be staged across the heterogeneous resources as an integrated part of the job launching procedure, with considerably less variation in the subjob RSLs.

Some systems manage heterogeneous execution environments using a service model that treats pre-installed native binaries as resources. Platform-neutral RPC interfaces are used to abstract the differences between underlying computing platforms. Systems like NetSolve [6] are typical of an agent-based remote computing service model in which the software component stays resident on a server at a fixed location and is invoked via RPC as needed in distributed applications. The remote server paradigm ensures the revision consistency and availability of the software components. However, this very centralized approach provides a rather rigid infrastructure that is essentially out of the user’s control. The execution paradigm afforded by fat-binaries places control of software revisions firmly in the hands of the user.

VisPortal [11] and the GridLab Information Services [4] provide a looser remote service model in which distributed information services (GRIS/GIIS) or local “contact lists” act as central indices for software components that are pre-installed on various machines in their heterogeneous environment. When a job is launched on a particular host, the system queries the information service (e.g. GIIS) or the contact list to provide the correct location of the executable to the job launcher (e.g. by editing the RSL for a GRAM job launch). However, unlike NetSolve’s remote computing model, software components that are indexed in this manner are typically only loosely integrated with the information providers used by the MDS, so there is only a weak guarantee that the installed code revision matches the data presented by the information service. The process of pushing out new code revisions to a large collection of heterogeneous hosts in order to ensure revision control consistency can be tedious and is clearly not scalable.

There are examples of scalable systems using an application-level scheduling paradigm (AppLeS) [8], such as the Application Manager component of the GrADS framework [7] and Nimrod/G [9], where the binaries are moved to computing resources on a demand-driven basis. However, a fat-binary based system provides the same scaling efficiency and code revision consistency without the added complexity of an application-level scheduler or of indexing the codes via distributed information services.

Another method for moving native code in a heterogeneous environment is to incorporate sophisticated automation for rebuilding the application from source code as part of its launch procedure. Such systems provide stronger guarantees of code revision consistency. The Cactus Worm [3] exemplifies this kind of application. The Worm is an adaptive Grid application that dynamically discovers additional resources on the Grid at runtime and will migrate itself to “better” resources automatically in response to “contract violation” or other soft resource failures. The Worm’s nomadic capabilities depend on Cactus’ architecture-independent checkpointing mechanism and Cactus’ robust ability to automatically rebuild itself from source code on a wide variety of computing platforms. However, when the cost of rebuilding the application from source code is factored into the performance model employed by the Resource Selector component of the application, it can create a significant barrier to migration. Similar examples can be constructed from the various adaptive application scenarios supported by the GrADS software infrastructure [7]. Migrating the Worm’s executable image in fat-binary form would significantly simplify code
define custom, application-specific job selection