Gridrun: a Lightweight Packaging and Execution Environment for Compact, Multi-Architecture Binaries

John Shalf
Lawrence Berkeley National Laboratory
1 Cyclotron Road, Berkeley, California 94720
{[email protected]}

Tom Goodale
Louisiana State University
Baton Rouge, LA 70803
{[email protected]}

Abstract

GridRun offers a very simple set of tools for creating and executing multi-platform binary executables. These “fat-binaries” archive native machine code into compact packages that are typically a fraction of the size of the original binary images they store, enabling efficient staging of executables for heterogeneous parallel jobs. GridRun interoperates with existing distributed job launchers/managers like Condor and the Globus GRAM to greatly simplify the logic required to launch native binary applications in distributed heterogeneous environments.

Introduction

Grid computing makes a vast number of heterogeneous computing systems available as a virtual computing resource. It is desirable to run native binary programs in order to use these resources efficiently. However, executing native programs in heterogeneous distributed environments typically requires careful staging of the native binary images, complex RSLs, or clever job-launcher scripting to select the appropriate executable to run on each hardware platform. Although there are many robust resource selection systems available as an integrated part of Grid schedulers, progress in deploying production metacomputing applications has been hampered by the non-uniformity and inherent complexity of the methods employed to manage native code for parallel heterogeneous environments.

Interpreted languages and Virtual Machines are often employed as an abstraction layer that hides architectural heterogeneity [2]. This includes scripting languages, byte-codes, and virtual machines of various forms. However, these solutions have a significant performance impact for compute-bound applications. Native binary programs still offer the most efficient execution environment for compute-bound applications and will continue to play a very important role in distributed applications for the foreseeable future. Therefore, we focus our attention squarely on the issue of simplifying the management and distribution of native executables as multiplatform binary packages (fat-binaries).

In order to support a seamless multiplatform execution environment in lieu of virtual machines, we extend the familiar concept of the “fat-binary” to apply to multi-operating system environments. The fat-binary has been a very well-known and successful design pattern for smoothing major instruction-set-architecture transitions within a single Operating System environment. A fat-binary file contains complete images of the binary executables for each CPU instruction set it supports. The operating system’s file-loader then selects and executes the appropriate binary image for the CPU architecture that is running the program. Fat-binaries have been used successfully for packaging Windows NT programs that could execute at native performance on both DEC Alpha and Intel x86 architectures. In a more widely known example, Apple Computer Inc. used fat-binary executables as an alternative to emulation during the transition from the 680x0 processors to the PowerPC architecture.

From a user’s standpoint, the fat-binaries appear to be ordinary program files that execute at native speed on machines with radically different CPU architectures without any additional effort on their part. However, the only available examples of fat-binary execution environments are systems that support different CPU architectures that run the same operating system. We desire this same degree of elegance for binaries that work across multiple Operating Systems as well as different CPU architectures in order to support heterogeneous Grid environments. A Grid-oriented fat-binary execution architecture must support more robust selection criteria than the prior Microsoft and Apple examples. In addition, some level of compression must be employed to ensure that these “fat-binaries” do not get unmanageably large given the larger number of target platforms they must support. Finally, the file format must work well with existing job launchers for both parallel and distributed environments including Condor, Globus GRAM, GridLab GRMS, and various implementations of MPI [4,7].

In this paper we describe a simple multiplatform execution environment called GridRun that simplifies the creation and management of fat-binary executables for heterogeneous collections of computing resources. GridRun reduces the size of file transfers required to stage the executables, simplifies storage and selection of the correct executable image, and even reduces the complexity of Globus RSLs and Condor submit files for these kinds of jobs. GridRun easily accommodates additional selection criteria that account for heterogeneity in operating systems, instruction sets, and even software libraries.

Related Work

The most typical method for managing binary executables in a heterogeneous environment is to manually stage the executables on each machine. These methods are typically employed when launching parallel MPI-based metacomputing jobs that span multiple sites and computer architectures. Examples of this include the Gordon-Bell winning runtime optimizing transcontinental black-hole simulations performed by Dramlitsch et al. in 2001 [9]. MPICH-G2 was used (via DUROC) to launch the job on 1500 processors spread across 4 heterogeneous supercomputer systems located on multiple continents, but the executable images had to be manually staged on each of those respective systems. DUROC was also used to provide a separate RSL for the job-launch on each system. Fat-binaries would allow a single binary image to be staged across the heterogeneous resources as an integrated part of the job launching procedure with considerably less variation in the subjob RSLs.

Some systems manage heterogeneous execution environments using a service model that treats pre-installed native binaries as resources. Platform-neutral RPC interfaces are used to abstract the differences between underlying computing platforms. Systems like NetSolve [6] are typical of an agent-based remote computing service model where the software component stays resident on a server at a fixed location and is invoked via RPC as needed in distributed applications. The remote server paradigm ensures the revision consistency and availability of the software components. However, this very centralized approach provides a rather rigid infrastructure that is essentially out of the user’s control. The execution paradigm afforded by fat-binaries places control of software revisions firmly in the hands of the user.

VisPortal [11] and the GridLab Information Services [4] provide a looser remote service model where distributed information services (GRIS/GIIS) or local “contact lists” act as central indices for software components that are pre-installed on various machines in their heterogeneous environment. When a job is launched on a particular host, the system queries the information service (e.g. GIIS) or the contact list to provide the correct location of the executable to the job launcher (e.g. to edit the RSL for a GRAM job launch). However, unlike NetSolve’s remote computing model, software components that are indexed in this manner are typically only loosely integrated with the information providers used by the MDS, so there is only a weak guarantee that the installed code-revision matches the data presented by the information service. The process of pushing out new code revisions to a large collection of heterogeneous hosts in order to ensure revision control consistency can be tedious and is clearly not scalable.

There are examples of scalable systems using an application-level scheduling paradigm (AppLeS) [8], such as the Application Manager component of the GrADS framework [7] and Nimrod/G [9], where the binaries are moved to computing resources on a demand-driven basis. However, a fat-binary based system provides the same scaling efficiency and code revision consistency without the added complexity of an application-level scheduler or indexing the codes via distributed information services.

Another method for moving native code in a heterogeneous environment is to incorporate sophisticated automation for rebuilding the application from source code as part of its launch procedure. Such systems provide stronger guarantees of code revision consistency. The Cactus Worm [3] exemplifies this kind of application. The Worm is an adaptive Grid application that dynamically discovers additional resources on the Grid at runtime and will migrate itself to “better” resources automatically in response to “contract violation” or other soft resource failures. The Worm’s nomadic capabilities depend on Cactus’ architecture-independent checkpointing mechanism and Cactus’ robust ability to automatically rebuild itself from source code on a wide variety of computing platforms. However, when the cost of rebuilding the application from source code is factored into the performance model employed by the Resource Selector component of the application, it can create a significant barrier to migration. Similar examples can be constructed from the various adaptive application scenarios supported by the GrADS software infrastructure [7]. Migrating the Worm’s executable image in fat-binary form would significantly simplify code

define custom, application-specific job selection
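
The fat-binary idea outlined in the Introduction (one package holding a compressed native image per platform, with the right image selected at launch time) can be illustrated with a minimal Python sketch. This is not GridRun's actual file format or tool interface, which the text above does not specify. The function names `pack_fat_binary` and `select_image`, the (operating system, architecture) key, and the use of zlib compression are all assumptions made for illustration.

```python
import platform
import zlib


def pack_fat_binary(images):
    """Compress each native image, keyed by a hypothetical (os, arch) tuple."""
    return {key: zlib.compress(data, 9) for key, data in images.items()}


def select_image(package, os_name=None, arch=None):
    """Return the decompressed image for the requested platform.

    When no platform is given, fall back to the one this interpreter runs on.
    """
    key = (os_name or platform.system(), arch or platform.machine())
    if key not in package:
        raise LookupError(f"no image for platform {key}")
    return zlib.decompress(package[key])


# Placeholder byte strings stand in for real native executables.
images = {
    ("Linux", "x86_64"): b"linux-x86_64 image" + b"\x00" * 4096,
    ("Darwin", "arm64"): b"darwin-arm64 image" + b"\x00" * 4096,
}

package = pack_fat_binary(images)
linux_image = select_image(package, "Linux", "x86_64")

# Compare the packaged size with the combined size of the raw images.
raw_size = sum(len(data) for data in images.values())
packed_size = sum(len(blob) for blob in package.values())
```

Keying on an (operating system, architecture) pair rather than on the CPU type alone reflects the paper's point that a Grid-oriented fat-binary needs richer selection criteria than single-OS fat-binaries. A real package would presumably extend the key with further criteria such as required software libraries.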