Mos: an Architecture for Extreme-Scale Operating Systems

Total Page:16

File Type:pdf, Size:1020Kb

Mos: an Architecture for Extreme-Scale Operating Systems mOS: An Architecture for Extreme-Scale Operating Systems Robert W. Wisniewskiy Todd Ingletty Pardo Keppely Ravi Murtyy Rolf Rieseny Linux R , or more specifically, the Linux API, plays a key 1. INTRODUCTION role in HPC computing. Even for extreme-scale computing, As the system software community moves forward to ex- a known and familiar API is required for production ma- ascale computing and beyond, there is the oft debated ques- chines. However, an off-the-shelf Linux distribution faces tion of how revolutionary versus how evolutionary the soft- challenges at extreme scale. To date, two approaches have ware needs to be. Over the last half decade, researchers been used to address the challenges of providing an operat- have pushed in one direction or the other. We contend that ing system (OS) at extreme scale. In the Full-Weight Kernel both directions are valid and needed simultaneously. Throw- (FWK) approach, an OS, typically Linux, forms the starting ing out all current software environments and starting over point, and work is undertaken to remove features from the would be untenable from an application perspective. Yet, environment so that it will scale up across more cores and there are significant challenges getting to exascale and be- out across a large cluster. A Light-Weight Kernel (LWK) yond, so revolutionary approaches are needed. Thus, we approach often starts with a new kernel and work is under- need to simultaneously allow the evolutionary path, i.e., in taken to add functionality to provide a familiar API, typi- the OS context, a Linux API, to coexist with revolutionary cally Linux. Either approach however, results in an execu- models supportable by a nimble LWK. The focus of mOS tion environment that is not fully Linux compatible. is extreme-scale HPC. mOS, which simultaneously runs a mOS (multi Operating System) runs both an FWK (Linux), Linux and an LWK, supports the coexistence of evolutionary and an LWK, simultaneously as kernels on the same compute and revolutionary models. In the rest of the introduction, node. mOS thereby achieves the scalability and reliability we describe the following motivations for mOS: of LWKs, while providing the full Linux functionality of an • simultaneously support the existing Linux API with LWK FWK. Further, mOS works in concert with Operating Sys- performance, scalability, and reliability; tem Nodes (OSNs) to offload system calls, e.g., I/O, that are too invasive to run on the compute nodes at extreme- • resolve the tension between FWK and LWK approaches; scale. Beyond providing full Linux capability with LWK • nimbly incorporate new technologies such as hybrid mem- performance, other advantages of mOS include the ability to ory and specialized cores, as well as new models such as effectively manage different types of compute and memory fine-grained threading and asynchronicity; and resources, interface easily with proposed asynchronous and • provide a hierarchy for system call implementation. fine-grained runtimes, and nimbly manage new technologies. The architecture of mOS is depicted in Figure 1. We This paper is an architectural description of mOS. As a describe it in more specificity in Section 4, but include it prototype is not yet finished, the contributions of this work early so the following description has context. are a description of mOS's architecture, an exploration of Simultaneously support legacy and new paradigms the tradeoffs and value of this approach for the purposes Linux is crucial to a broad user base across many comput- listed above, and a detailed architecture description of each ing areas. As such, proposed modifications to Linux need to of the six components of mOS, including the tradeoffs we meet a high bar in terms of their applicability. An LWK has considered. The uptick of OS research work indicates that greater flexibility to target the specific needs of high-end many view this as an important area for getting to extreme HPC. mOS, by incorporating Linux and an LWK, leverages scale. Thus, most importantly, the goal of the paper is to the strengths of Linux and provides a test bed to explore generate discussion in this area at the workshop. new technologies allowing Linux to incorporate them when they have demonstrated broader-based appeal. For exam- yIntel Corporation ple, large pages were used in HPC LWKs a decade ago, and Linux R is the registered trademark of Linus Torvalds in the U.S. and other countries. over time, as the usage models were validated, Linux has started including large page support. Thus, in mOS, Linux Permission to make digital or hard copies of all or part of this work for personal or and the LWK, are not in competition but are symbiotic with classroom use is granted without fee provided that copies are not made or distributed each leveraging the strengths of the other. From a pragmatic for profit or commercial advantage and that copies bear this notice and the full citation point of view, vendors need to deliver solutions that work on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or for their existing customer base, but also need to innovate republish, to post on servers or to redistribute to lists, requires prior specific permission to maintain competitiveness. mOS provides the ability to and/or a fee. Request permissions from [email protected]. simultaneously support legacy and emerging paradigms. ROSS ’14, June 10, 2014, Munich, Germany Resolve tension in FWK and LWK approaches Copyright is held by the owner/author(s). Publication rights licensed to ACM. Historically, high-end operating systems have been ap- ACM 978-1-4503-2950-7/14/06 ...$15.00. proached either from an FWK or an LWK perspective. In http://dx.doi.org/10.1145/2612262.2612263 single thread performance, ones optimized for power effi- ciency, ones designed for special purposes but being used more generally, e.g., GPUs, programmable logic (FPGAs), and fixed-function application-specific hardware. Thus, an mOS-like approach will be increasingly important compared to either FWK or traditional LWK approaches, as mOS's architecture is suited to leverage special cores. It allows placing key services as close as possible to raw hardware, while at the same time keeping jittery and other system ser- vices isolated. Further, because mOS provides non-critical Figure 1: mOS architecture c Intel services by calling Linux, the mOS LWK is smaller than a conventional LWK; in turn, this makes it simpler to port and target mOS to novel hardware. Provide a hierarchy for system calls an FWK approach, an OS, typically Linux, forms the start- As machines increase in size { both node count across the ing point, and work is undertaken to remove features from machine, and core count per node { a scalable method for the environment so that it will scale across more cores and handling the proliferation is to move to hierarchical mecha- out across more nodes in a large cluster. An LWK approach nisms. This has been done for scalable job launch, and I/O. often starts with a new kernel and work is undertaken to Cray's Linux [3] and the IBM R CNK [9] use a hierarchy for add functionality to provide a familiar API, typically some- I/O. By design, mOS leverages a hierarchy of places to im- thing close to that of a general purpose operating system plement system calls. First, performance critical calls are such as Linux. Either approach's end point however, is serviced by the LWK on the compute cores running mOS. an environment that is not fully Linux compatible. The Second, calls that benefit from a short latency, require node- LWK approaches Linux, but some Linux functionality, ei- local information, or require a Linux kernel, are serviced on ther because of time constraints or intentionally, is not im- the Linux running on the compute node. Third, calls with plemented. An FWK approach, strips enough from Linux so higher-latency or those requiring resources not available on that it too, no longer supports generic Linux applications. the compute node, are off-loaded to an OS node. While we Thus, some vendors offer two solutions: a fully compatible describe three levels of hierarchy for simplicity, we envision cluster HPC Linux that performs well but not at extreme offload can be to more than just a single node. scale, and a configured patched offering for extreme scale, but that does not provide full Linux compatibility. As an example, the Cray R Linux Environment provides extreme This paper is an architectural description of mOS. As a scale and cluster compatibility capability [3]. mOS runs prototype is not fully implemented, the contributions of this both Linux and an LWK simultaneously. Therefore, Linux work are 1) a description of mOS's architecture, 2) an ex- can be more Linux, it does not need to be stripped down. ploration of the tradeoffs we considered and the value of the And the LWK can be more lightweight, it does not need to mOS approach as a way to satisfy the above listed motiva- include support for services that should be supported on the tions. The motivations and requirements are representative compute node, but would be better left out of an LWK, e.g., of the needs for OS kernels for high-end HPC systems, and Java, Python. Further, mOS allows Linux functionality to 3) a description of mOS's six components and the tradeoffs be achieved with minimal or no patching of Linux. The goal we considered. of upstreaming modifications can be achieved due to the nar- There was significant operating system research in the row interface needed to interact with Linux, and provides for 1990s, and then in the 2000s there was less emphasis on new better sustainability due to not needing to maintain Linux operating structure and more work on Linux enhancements.
Recommended publications
  • ENCE360: Operating Systems Course Outline
    ENCE360: Operating Systems Course Outline This course is an introduction to operating systems: Operating systems are a special type of software that sits between the hardware and other software applications. They function to manage various computer resources, and to provide a convenient interface to the users. This course emphasises system calls (which provide an interface between the operating system and applications) and examples of operating systems. Lecture Topic Reading Laboratory Topic Introduction to Operating Systems MOS: Ch 1 C Revision Processes and Threads MOS: Ch 2 Scheduling (processes/threads) Threads, Processes Inter-process Communicatioin (pipes/sockets) MOS: pgs 43-45, 733-734 Concurrency Pipes, Files, Signals Input/Output MOS: Ch 5 Files and Directories MOS: Ch.4 Sockets LAB TEST 1 MID SEMESTER BREAK Memory Management - Caches MOS: Ch 3, 7.8 labs on OS examples, including Memory Management - Virtual Memory MOS: Sect 3.3 Xv6 (simple teaching OS) Virtualisation Distributed Systems Operating System Examples include: Windows, Linux, Android, macOS/iOS, real-time operating systems Assignment due Staff for Operating Systems Course Supervisor & Lecturer Dr Richard Green [email protected] Lecturer: Prof Mark Claypool [email protected] Tutor Gordon Beintmann [email protected] Laboratories There are two Labs begin the first week of term scheduled lab streams. For lab times and locations, check www.canterbury.ac.nz/tt All labs will be held in the department Self-allocate your lab via labs in the Erskine https://mytimetable.canterbury.ac.nz building. /aplus/apstudent Each student should If you hit any snags, email attend one 2-hour [email protected] lab each week.
    [Show full text]
  • Certified Systems Matrix 12C Release 3 (12.3.2.0.0) E59961-07 July 2016
    Oracle® Enterprise Manager Ops Center Certified Systems Matrix 12c Release 3 (12.3.2.0.0) E59961-07 July 2016 This guide lists the certified systems for Oracle Enterprise Manager Ops Center. The following topics are covered in this document: · Base Operating Systems · Base Browsers · Base Databases · Base Oracle Clusterware for High Availability · Target Operating Systems · Target Servers · Target Non-Server Hardware · Target Virtualization · Target Engineered Systems · Supported Technology Base Operating Systems This section describes the supported operating systems for the Enterprise Controller and Proxy Controller. Enterprise Controller Operating Systems This table lists the supported operating systems for the Enterprise Controller. Table 1-1 Enterprise Controller Operating Systems Enterprise Controller Operating Systems Certification Platform Version Minimum Update Level and Comments Oracle Solaris NA NA Oracle Solaris SPARC 10 Embedded Database: 1/13 Customer-Managed Database: 9/10 through 1/13 1 Table 1-1 (Cont.) Enterprise Controller Operating Systems Certification Platform Version Minimum Update Level and Comments Oracle Solaris SPARC 11 Embedded Database: 11.1 SRU 14.5 through 11.3 Customer-Managed Database: 11.0 SRU 10 through 11.3 Oracle Solaris 11 Express is not supported. Oracle Solaris SPARC 10 Embedded Database: 1/13 Local Zone Customer-Managed Database: 9/10 through 1/13 Oracle Solaris SPARC 11 Embedded Database: 11.1 SRU 14.5 through Local Zone 11.3 Customer-Managed Database: 11.0 SRU 10 through 11.3 Oracle Solaris 11 Express is not supported. Oracle VM Server for 10 Embedded Database: 1/13 SPARC Customer-Managed Database: 9/10 through 1/13 Oracle VM Server for 11 Embedded Database: 11.1 SRU 14.5 through SPARC 11.3 Customer-Managed Database: 11.0 SRU 10 through 11.3 Oracle Solaris 11 Express is not supported.
    [Show full text]
  • Seminar HPC Trends Winter Term 2017/2018 New Operating System Concepts for High Performance Computing
    Seminar HPC Trends Winter Term 2017/2018 New Operating System Concepts for High Performance Computing Fabian Dreer Ludwig-Maximilians Universit¨atM¨unchen [email protected] January 2018 Abstract 1 The Impact of System Noise When using a traditional operating system kernel When running large-scale applications on clusters, in high performance computing applications, the the noise generated by the operating system can cache and interrupt system are under heavy load by greatly impact the overall performance. In order to e.g. system services for housekeeping tasks which is minimize overhead, new concepts for HPC OSs are also referred to as noise. The performance of the needed as a response to increasing complexity while application is notably reduced by this noise. still considering existing API compatibility. Even small delays from cache misses or interrupts can affect the overall performance of a large scale In this paper we study the design concepts of het- application. So called jitter even influences collec- erogeneous kernels using the example of mOS and tive communication regarding the synchronization, the approach of library operating systems by ex- which can either absorb or propagate the noise. ploring the architecture of Exokernel. We sum- Even though asynchronous communication has a marize architectural decisions, present a similar much higher probability to absorb the noise, it is project in each case, Interface for Heterogeneous not completely unaffected. Collective operations Kernels and Unikernels respectively, and show suffer the most from propagation of jitter especially benchmark results where possible. when implemented linearly. But it is hard to anal- yse noise and its propagation for collective oper- ations even for simple algorithms.
    [Show full text]
  • The Linux Device File-System
    The Linux Device File-System Richard Gooch EMC Corporation [email protected] Abstract 1 Introduction All Unix systems provide access to hardware via de- vice drivers. These drivers need to provide entry points for user-space applications and system tools to access the hardware. Following the \everything is a file” philosophy of Unix, these entry points are ex- posed in the file name-space, and are called \device The Device File-System (devfs) provides a power- special files” or \device nodes". ful new device management mechanism for Linux. Unlike other existing and proposed device manage- This paper discusses how these device nodes are cre- ment schemes, it is powerful, flexible, scalable and ated and managed in conventional Unix systems and efficient. the limitations this scheme imposes. An alternative mechanism is then presented. It is an alternative to conventional disc-based char- acter and block special devices. Kernel device drivers can register devices by name rather than de- vice numbers, and these device entries will appear in the file-system automatically. 1.1 Device numbers Devfs provides an immediate benefit to system ad- ministrators, as it implements a device naming scheme which is more convenient for large systems Conventional Unix systems have the concept of a (providing a topology-based name-space) and small \device number". Each instance of a driver and systems (via a device-class based name-space) alike. hardware component is assigned a unique device number. Within the kernel, this device number is Device driver authors can benefit from devfs by used to refer to the hardware and driver instance.
    [Show full text]
  • A Framework for Visual Modular Design of Educational Operating System
    A Framework for Visual Modular Design of Educational Operating System Naeem Al-Oudat Communications and Computer Engineering Department, Tafila Technical University, Jordan [email protected] Abstract— Operating systems are a vital part in most • Memory manager. Utilizing the RAM and its computing systems. However, learning basic concepts of extensions in an efficient way is the role of this operating systems are hard for normal students although they component. are necessary and important. State of the art in teaching • File system. The main job of this component is to operating systems depends on studying existing open source operating systems like Linux, hacking teaching operating abstract the way of dealing with data and storing it in systems like Xv6, or using simulators. Difficulties of learning a permanent media as a hard disk. still there in these methods, since they require a great deal of Designing an environment where learners can work and system programming techniques. In this paper, we propose a design the above basic components of an operating system novel direction in learning operating systems that is solely without getting into the complicated details is an urgent need dependent on visually building the operating system. By using in today’s university classes of software systems. This this method, we mitigated the complexity of going into system simplicity should not make the whole process as a programming details. The development platform consists of key simulation/emulation-like design. A good option would be building blocks that a user can drag and drop into a working using pre-programmed components. The learner can select panel to build his own operating system.
    [Show full text]
  • Cielo Computational Environment Usage Model
    Cielo Computational Environment Usage Model With Mappings to ACE Requirements for the General Availability User Environment Capabilities Release Version 1.1 Version 1.0: Bob Tomlinson, John Cerutti, Robert A. Ballance (Eds.) Version 1.1: Manuel Vigil, Jeffrey Johnson, Karen Haskell, Robert A. Ballance (Eds.) Prepared by the Alliance for Computing at Extreme Scale (ACES), a partnership of Los Alamos National Laboratory and Sandia National Laboratories. Approved for public release, unlimited dissemination LA-UR-12-24015 July 2012 Los Alamos National Laboratory Sandia National Laboratories Disclaimer Unless otherwise indicated, this information has been authored by an employee or employees of the Los Alamos National Security, LLC. (LANS), operator of the Los Alamos National Laboratory under Contract No. DE-AC52-06NA25396 with the U.S. Department of Energy. The U.S. Government has rights to use, reproduce, and distribute this information. The public may copy and use this information without charge, provided that this Notice and any statement of authorship are reproduced on all copies. Neither the Government nor LANS makes any warranty, express or implied, or assumes any liability or responsibility for the use of this information. Bob Tomlinson – Los Alamos National Laboratory John H. Cerutti – Los Alamos National Laboratory Robert A. Ballance – Sandia National Laboratories Karen H. Haskell – Sandia National Laboratories (Editors) Cray, LibSci, and PathScale are federally registered trademarks. Cray Apprentice2, Cray Apprentice2 Desktop, Cray C++ Compiling System, Cray Fortran Compiler, Cray Linux Environment, Cray SHMEM, Cray XE, Cray XE6, Cray XT, Cray XTm, Cray XT3, Cray XT4, Cray XT5, Cray XT5h, Cray XT5m, Cray XT6, Cray XT6m, CrayDoc, CrayPort, CRInform, Gemini, Libsci and UNICOS/lc are trademarks of Cray Inc.
    [Show full text]
  • Stepping Towards a Noiseless Linux Environment
    ROSS 2012 | June 29 2012 | Venice, Italy STEPPING TOWARDS A NOISELESS LINUX ENVIRONMENT Hakan Akkan*, Michael Lang¶, Lorie Liebrock* Presented by: Abhishek Kulkarni¶ * New Mexico Tech ¶ Ultrascale Systems Research Center New Mexico Consortium Los Alamos National Laboratory 29 June 2012 Stepping Towards A Noiseless Linux Environment 2 Motivation • HPC applications are unnecessarily interrupted by the OS far too often • OS noise (or jitter) includes interruptions that increase an application’s time to solution • Asymmetric CPU roles (OS cores vs Application cores) • Spatio-temporal partitioning of resources (Tessellation) • LWK and HPC Oses improve performance at scale 29 June 2012 Stepping Towards A Noiseless Linux Environment 3 OS noise exacerbates at scale • OS noise can cause a significant slowdown of the app • Delays the superstep since synchronization must wait for the slowest process: max(wi) Image: The Case of the Missing Supercomputer Performance, Petrini et. Al, 2003 29 June 2012 Stepping Towards A Noiseless Linux Environment 4 Noise co-scheduling • Co-scheduling the noise across the machine so all processes pay the price at the same time Image: The Case of the Missing Supercomputer Performance, Petrini et. Al, 2003 29 June 2012 Stepping Towards A Noiseless Linux Environment 5 Noise Resonance • Low frequency, Long duration noise • System services, daemons • Can be moved to separate cores • High frequency, Short duration noise • OS clock ticks • Not as easy to synchronize - usually much more frequent and shorter than the computation granularity of the application • Previous research • Tsafrir, Brightwell, Ferreira, Beckman, Hoefler • Indirect overhead is generally not acknowledged • Cache and TLB pollution • Other scalability issues: locking during ticks 29 June 2012 Stepping Towards A Noiseless Linux Environment 6 Some applications are memory and network bandwidth limited! *Sancho, et.
    [Show full text]
  • The Development of Unix
    The development of Unix ∗ By Kasper Edwards Departmnent of Technology and Social Sciences, Technical University of Denmark Building 303 East, room 150, 2800 Lyngby, Denmark. (email: [email protected]) ABSTRACT This paper tells the story of the development of the Unix time sharing system. The development at AT&T and the MULTICS roots are uncovered. The events are presented in chronological order from 1969 to 1995. The Berkeley Software Distribution (BSD) are presented as well as the Free Software Foundation and other. Note: This is a working paper. Short sections of text, no more than two paragraphs may be quoted without permission provided that full credit is given to the source. Copyright © 2000-2001 by Kasper Edwards, all rights reserved. Comments are welcome to [email protected]. ∗ I would like to thank Keld Jørn Simmonsen, Ass. Prof. Jørgen Lindgaard Pedersen of the Technical University of Denmark and Ass. Prof. Jørgen Steensgaard for helpful comments and suggestions on earlier drafts on this paper. I assume full responsibility for any remaining vulnerabilities. Page 1 of 31 1.1 Introduction This thesis about Linux, however Linux is called a Unix clone in the sense that it looks like, and are designed on the same principles as Unix. Both Unix and Linux are POSIX (Portable Operating System Interface) compliant (described in paragraph 3.29). In short POSIX describes the Unix user interface, i.e. commands and their syntax. Some Unix’es are certified POSIX compliant but no one have yet been willing to pay a third party company to test the POSIX compliance of Linux.
    [Show full text]
  • Considerations for the SDP Operating System
    DocuSign Envelope ID: E376CF60-053D-4FF0-8629-99235147B54B Considerations for the SDP Operating System Document Number .......SDP Memo 063 ………………………………………………………………… Document Type .. MEMO ……………………………………………………………………… ………… Revision . 01 ………………………………………………………………………… ………… ………… Author . N. Erdödy, R. O’Keefe ………………………………………………………………………… … Release Date ... .2018-08-31 …………………………………………………………………………… … Document Classification ... Unrestricted ………………………………………………………………… Status ... Draft ………………………………………………………………………………………… … Document No: 063 Unrestricted Revision: 01 Author: N. Erdödy ​ Release Date: 2018-08-31 Page 1 of 56 DocuSign Envelope ID: E376CF60-053D-4FF0-8629-99235147B54B Lead Author Designation Affiliation Nicolás Erdödy SDP Team Open Parallel Ltd. NZ SKA Alliance (NZA). Signature & Date: 10/21/2018 8:19:36 PM PDT With contributions and reviews Affiliation greatly appreciated from Dr. Richard O’Keefe SDP Team, NZA University of Otago - Open Parallel (NZA) Dr. Andrew Ensor Director, NZA AUT University (NZA) Piers Harding SDP Team, NZA Catalyst IT (NZA) Robert O’Brien Systems Engineer / Security Independent Anonymous Reviewer CEng (UK), CPEng (NZ) Manager, NZ Govt ORGANISATION DETAILS Name Science Data Processor Consortium Address Astrophysics Cavendish Laboratory JJ Thomson Avenue Cambridge CB3 0HE Website http://ska-sdp.org Email [email protected] Document No: 063 Unrestricted Revision: 01 Author: N. Erdödy ​ Release Date: 2018-08-31 Page 2 of 56 DocuSign Envelope ID: E376CF60-053D-4FF0-8629-99235147B54B 1. SDP Memo Disclaimer The SDP memos are designed to allow the quick recording of investigations and research done by members of the SDP. They are also designed to raise questions about parts of the SDP design or SDP process. The contents of a memo may be the opinion of the author, not the whole of the SDP. Acknowledgement: The authors wish to acknowledge the inputs, corrections and continuous support from the NZA Team Members Dr.
    [Show full text]
  • Secure Computation Systems for Confidential Data Analysis By
    Secure Computation Systems for Confidential Data Analysis by Rishabh Poddar A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley Committee in charge: Professor Raluca Ada Popa, Chair Professor Ion Stoica Professor Sylvia Ratnasamy Professor Deirdre Mulligan Fall 2020 Secure Computation Systems for Confidential Data Analysis Copyright 2020 by Rishabh Poddar 1 Abstract Secure Computation Systems for Confidential Data Analysis by Rishabh Poddar Doctor of Philosophy in Computer Science University of California, Berkeley Professor Raluca Ada Popa, Chair A large number of services today are built around processing data that is collected from or shared by customers. While such services are typically able to protect the data when it is in transit or in storage using standard encryption protocols, they are unable to extend this protection to the data when it is being processed, making it vulnerable to breaches. This not only threatens data confidentiality in existing services, it also prevents customers from availing such services altogether for sensitive workloads, in that they are unwilling / unable to share their data out of privacy concerns, regulatory hurdles, or business competition. Existing solutions to this problem are unable to meet the requirements of advanced data analysis applications. Systems that are efficient do not provide strong enough security guarantees, and approaches with stronger security are often not efficient. To address this problem, the work in this dissertation develops new systems and protocols for securely computing on encrypted data, that attempt to bridge the gap between security and efficiency.
    [Show full text]
  • CN-Linux (Compute Node Linux)
    CN-Linux (Compute Node Linux) Alex Nagelberg August 4, 2006 Contents Abstract 2 Introduction 2 Overview of the BlueGene/L Architecture 3 ZOID (Zepto OS I/O Daemon) 4 CN-Linux (Compute Node Linux) 4 Performance 5 Future Work 5 Acknowledgements 5 Performance Figures 6 References 8 1 Abstract BlueGene/L [1] (BG/L) is a supercomputer engineered by IBM researchers with space-saving and low power-consumption in mind. BG/L’s architecture utilizes many nodes containing few components, which interact with each other to perform a given task. The nodes are divided in to 3 parts: the compute nodes, which handle computation; the I/O nodes, which route all input/output system calls appropriately to/from compute nodes; and the service nodes, which are regular computer systems that manage hardware resources and perform job control. There is currently no simple solution for compiling a large project on the BG/L. IBM has written its own proprietary operating system to run a process on a compute node (known as BLRTS). Simply compiling GCC (GNU Compiler Collection) for the BLRTS OS returned errors with memory allocation. Popular compilers also usually require multiple processes (to invoke compilation to assembly, compilation of assembly, and linking) as well as shared libraries in which BLRTS does not allow. These issues could be resolved using a Linux-based operating system, such as ZeptoOS, [2] under development at Argonne. Linux is now functional on compute nodes using a hack to execute the kernel under BLRTS. Libraries were written to initialize and perform communication of system calls to the I/O nodes by porting hacked GLIBC BLRTS code.
    [Show full text]
  • Exascale Computing Study: Technology Challenges in Achieving Exascale Systems
    ExaScale Computing Study: Technology Challenges in Achieving Exascale Systems Peter Kogge, Editor & Study Lead Keren Bergman Shekhar Borkar Dan Campbell William Carlson William Dally Monty Denneau Paul Franzon William Harrod Kerry Hill Jon Hiller Sherman Karp Stephen Keckler Dean Klein Robert Lucas Mark Richards Al Scarpelli Steven Scott Allan Snavely Thomas Sterling R. Stanley Williams Katherine Yelick September 28, 2008 This work was sponsored by DARPA IPTO in the ExaScale Computing Study with Dr. William Harrod as Program Manager; AFRL contract number FA8650-07-C-7724. This report is published in the interest of scientific and technical information exchange and its publication does not constitute the Government’s approval or disapproval of its ideas or findings NOTICE Using Government drawings, specifications, or other data included in this document for any purpose other than Government procurement does not in any way obligate the U.S. Government. The fact that the Government formulated or supplied the drawings, specifications, or other data does not license the holder or any other person or corporation; or convey any rights or permission to manufacture, use, or sell any patented invention that may relate to them. APPROVED FOR PUBLIC RELEASE, DISTRIBUTION UNLIMITED. This page intentionally left blank. DISCLAIMER The following disclaimer was signed by all members of the Exascale Study Group (listed below): I agree that the material in this document reects the collective views, ideas, opinions and ¯ndings of the study participants only, and not those of any of the universities, corporations, or other institutions with which they are a±liated. Furthermore, the material in this document does not reect the o±cial views, ideas, opinions and/or ¯ndings of DARPA, the Department of Defense, or of the United States government.
    [Show full text]