Hierarchical Multithreading: Programming Model and System Software

Total Page:16

File Type:pdf, Size:1020Kb

Hierarchical Multithreading: Programming Model and System Software Hierarchical Multithreading: Programming Model and System Software Guang R. Gao1, Thomas Sterling2,3,RickStevens4, Mark Hereld4,WeirongZhu1 1 2 Department of Electrical and Computer Engineering Center for Advanced Computing Research University of Delaware California Institute of Technology {ggao,weirong}@capsl.udel.edu [email protected] 3 4 Department of Computer Science Mathematics and Computer Science Division Louisiana State University Argonne National Laboratory [email protected] {stevens,hereld}@mcs.anl.gov Abstract 1 Introduction With the rapid increase in both the scale and com- This paper addresses the underlying sources of per- plexity of scientific and engineering problems, the com- formance degradation (e.g. latency, overhead, and putational demands grow accordingly. Breakthrough- starvation) and the difficulties of programmer produc- quality scientific discoveries and optimal engineering tivity (e.g. explicit locality management and schedul- designs often rely on large scale simulations on High- ing, performance tuning, fragmented memory, and syn- End Computing (HEC) systems with performance re- chronous global barriers) to dramatically enhance the quirement reaching peta-flops and beyond. However, broad effectiveness of parallel processing for high end current HEC systems lack system software and tools computing. We are developing a hierarchical threaded optimized for advanced scientific and engineering work virtual machine (HTVM) that defines a dynamic, mul- of interest, and are extremely difficult to program and tithreaded execution model and programming model, to port applications to. Consequently, applications providing an architecture abstraction for HEC system rarely achieve an acceptable fraction of the peak ca- software and tools development. We are working on pability of the system. a prototype language, LITL-X (pronounced “little-X”) To radically improve this situation, the following for Latency Intrinsic-Tolerant Language, which pro- key features are expected to be supported in the fu- vides the application programmers with a powerful set ture HEC systems: (1) Architecture support for coarse- of semantic constructs to organize parallel computa- and/or fine-grain multithreading at enormous scale (up tions in a way that hides/manages latency and limits to millions of threads). (2) Architecture support for the effects of overhead. This is quite different from lo- runtime thread migration, and (3) Architecture sup- cality management, although the intent of both strate- port for large shared address space across nodes. These gies is to minimize the effect of latency on the efficiency features can be observed in the IBM Bluegene L [6] of computation. We will work on a dynamic compila- and Cyclops architectures [5], Processor-In-Memory- tion and runtime model to achieve efficient LITL-X based architectures [17], fine-grain multithreaded ar- program execution. Several adaptive optimizations will chitectures like HTMT [10] and CARE [14]. be studied. A methodology of incorporating domain- In this paper, we propose a hierarchical threaded vir- specific knowledge in program optimization will be stud- tual machine (HTVM) that defines a dynamic, multi- ied. Finally, we plan to implement our method in an threaded execution model, which provides an architec- experimental testbed for a HEC architecture and per- ture abstraction for HEC system software and tools de- form a qualitative and quantitative evaluation on se- velopment. A corresponding programming model will lected applications. efficiently exploit the ability of the execution model by users. We will perform research on programming model 1-4244-0054-6/06/$20.00 ©2006 IEEE and language issues, continuous compilation and run- namic adaptiveness of the HEC system. Fig. 1 shows time software that are critical to enable the dynamic our overall system software architecture. At the core is adaptation of the HEC system. We propose a method a Hierarchical Threaded Virtual Machine(HTVM) ex- to enable domain-expert knowledge input and exploita- ecution model that features dynamic multi-level mul- tion, and runtime performance monitoring mechanism tithreaded execution. HTVM includes three compo- to support the above continuous compilation. Finally, nents: a thread model, a memory model and a syn- we report the current status of the implementation, chronization model. This design focuses on adaptiv- performance analysis, and evaluation of the proposed ity features, as will be discussed in detail in Section 3. methods under an experimental HEC system software The functionalities of HTVM will be supported and ex- testbed. plored through the HTVM parallel programming lan- guage (called LITL-X), compiler and runtime software. 2 An Overview of the Hierarchical The compiler has two parts: a static part and a dy- Threaded Virtual Machine Model namic part. As shown in Fig. 1, the dynamic compiler is responsible for the adaptation of loop parallelism, dynamic load, locality and latency. Since the dynamic This section gives our overall vision on the HEC compiler closely interacts with the runtime system and system software/tools. A major challenge is to accom- will be called during the execution of the HEC applica- modate dynamic adaptivity in the design due to the tions, its functionality extends smoothly to the runtime complex and dynamic nature of ultra-large scale HEC system as well, as indicated by the boxes that span applications and machines. Under a real HEC pro- across the dynamic compiler and the runtime system. gram execution scenario, millions of threads at various To take advantage of the adaptivity features of levels in the thread hierarchy may be generated and HTVM more effectively, a domain-experts knowledge executed at different time and places in the machine. base is provided. Domain-specific knowledge is ex- Each thread should be mapped to a desirable physi- pressed as scripts, which give specific annotations to cal thread unit when resources become available and the source of the HEC applications to guide the com- dependences are resolved. We identify four classes of pilation process of the static compiler. To assist adap- adaptivity critical to the performance of the system: tivity, a system of structured hints guides the dynamic • Loop parallelism adaptation. Scientific appli- compiler for selection and completion of the partial cations tend to have computation-intensive kernels schedules generated by the static compiler, and for se- consisting of loop nests. Exploitable parallelism in lection of runtime algorithms, based on the dynamic a loop nest, and the grain size of the parallelism, facts such as memory access patterns found by a run- are runtime dependent on the machine resource time performance monitor during the execution of the availability and data locality, which change more HEC applications. The flow of the mapping process of drastically in a highly threaded environment with an HEC application under our proposed research soft- deep memory hierarchy. ware and tools is indicated by the big shaded arrows • Dynamic load adaptation. The computation in Fig. 1. The components of the software infrastruc- load may become unbalanced and a large number ture have been annotated by the corresponding sec- of threads may need to migrate to balance the load tion numbers. The HTVM compilation and execution of the machine. process is an iterative process with the assistance of a feedback process as shown in the figure. • Locality adaptation. Data objects may need to migrate, and copies be generated and moved in the memory hierarchy to achieve high locality, 3 Hierarchical Multithreading: Pro- while copy consistency needs to be preserved. gramming Model and System Soft- • Latency adaptation. The deep memory hier- ware archy usually found in an HEC machine makes the memory access latencies vary more drastically 3.1 A Hierarchical Threaded Virtual Ma- during the execution, depending on the locality chine Model of references, the number of concurrent accesses, and the available memory bandwidth. The system needs dynamically adapt to such variations. One of our primary objectives is to define the hi- erarchical threaded virtual machine (HTVM). We first A main task of our research is to study the key sys- outline our research in the HTVM execution model, tem software technologies that support the above dy- which consists of a thread model, a memory model and Figure 1. An Overview of the Proposed HEC Software/Tools a synchronization model. We then outline research is- • Small-Grain Threads (SGTs) under sues and tasks for HTVM programming model. HTVM: Small-grain threads are another feature of certain HEC architectures interested in this proposal. These threads normally expect 3.1.1 HTVM Execution Model to perform a much smaller computation task, building some state but with substantial less A novel aspect of our HTVM model is to provide a “weight”. Therefore, cost of their invocation smooth and integrated abstraction that directly rep- and management is much lower when comparing resents these thread levels and provides an integrated with large-grain threads. An example of SGTs is thread hierarchy. We will target future HEC architec- the threaded function calls under CILK [9] and tures - they provide rich hardware support for a hier- EARTH[19], parcels under HTMT [10] and Cas- archy of threads at different grain levels, as discussed cade [4], and asynchronous calls being considered earlier. under PERCS [1]. Intuitively, the following levels of threads are to be • defined under HTVM. Tiny-Grain Threads
Recommended publications
  • A Case for High Performance Computing with Virtual Machines
    A Case for High Performance Computing with Virtual Machines Wei Huangy Jiuxing Liuz Bulent Abaliz Dhabaleswar K. Panday y Computer Science and Engineering z IBM T. J. Watson Research Center The Ohio State University 19 Skyline Drive Columbus, OH 43210 Hawthorne, NY 10532 fhuanwei, [email protected] fjl, [email protected] ABSTRACT in the 1960s [9], but are experiencing a resurgence in both Virtual machine (VM) technologies are experiencing a resur- industry and research communities. A VM environment pro- gence in both industry and research communities. VMs of- vides virtualized hardware interfaces to VMs through a Vir- fer many desirable features such as security, ease of man- tual Machine Monitor (VMM) (also called hypervisor). VM agement, OS customization, performance isolation, check- technologies allow running different guest VMs in a phys- pointing, and migration, which can be very beneficial to ical box, with each guest VM possibly running a different the performance and the manageability of high performance guest operating system. They can also provide secure and computing (HPC) applications. However, very few HPC ap- portable environments to meet the demanding requirements plications are currently running in a virtualized environment of computing resources in modern computing systems. due to the performance overhead of virtualization. Further, Recently, network interconnects such as InfiniBand [16], using VMs for HPC also introduces additional challenges Myrinet [24] and Quadrics [31] are emerging, which provide such as management and distribution of OS images. very low latency (less than 5 µs) and very high bandwidth In this paper we present a case for HPC with virtual ma- (multiple Gbps).
    [Show full text]
  • A Light-Weight Virtual Machine Monitor for Blue Gene/P
    A Light-Weight Virtual Machine Monitor for Blue Gene/P Jan Stoessx{1 Udo Steinbergz1 Volkmar Uhlig{1 Jonathan Appavooy1 Amos Waterlandj1 Jens Kehnex xKarlsruhe Institute of Technology zTechnische Universität Dresden {HStreaming LLC jHarvard School of Engineering and Applied Sciences yBoston University ABSTRACT debugging tools. CNK also supports I/O only via function- In this paper, we present a light-weight, micro{kernel-based shipping to I/O nodes. virtual machine monitor (VMM) for the Blue Gene/P Su- CNK's lightweight kernel model is a good choice for the percomputer. Our VMM comprises a small µ-kernel with current set of BG/P HPC applications, providing low oper- virtualization capabilities and, atop, a user-level VMM com- ating system (OS) noise and focusing on performance, scal- ponent that manages virtual BG/P cores, memory, and in- ability, and extensibility. However, today's HPC application terconnects; we also support running native applications space is beginning to scale out towards Exascale systems of directly atop the µ-kernel. Our design goal is to enable truly global dimensions, spanning companies, institutions, compatibility to standard OSes such as Linux on BG/P via and even countries. The restricted support for standardized virtualization, but to also keep the amount of kernel func- application interfaces of light-weight kernels in general and tionality small enough to facilitate shortening the path to CNK in particular renders porting the sprawling diversity applications and lowering OS noise. of scalable applications to supercomputers more and more a Our prototype implementation successfully virtualizes a bottleneck in the development path of HPC applications.
    [Show full text]
  • Opportunities for Leveraging OS Virtualization in High-End Supercomputing
    Opportunities for Leveraging OS Virtualization in High-End Supercomputing Kevin T. Pedretti Patrick G. Bridges Sandia National Laboratories∗ University of New Mexico Albuquerque, NM Albuquerque, NM [email protected] [email protected] ABSTRACT formance, making modest virtualization performance over- This paper examines potential motivations for incorporating heads viable. In these cases,the increased flexibility that vir- virtualization support in the system software stacks of high- tualization provides can be used to support a wider range end capability supercomputers. We advocate that this will of applications, to enable exascale co-design research and increase the flexibility of these platforms significantly and development, and provide new capabilities that are not pos- enable new capabilities that are not possible with current sible with the fixed software stacks that high-end capability fixed software stacks. Our results indicate that compute, supercomputers use today. virtual memory, and I/O virtualization overheads are low and can be further mitigated by utilizing well-known tech- The remainder of this paper is organized as follows. Sec- niques such as large paging and VMM bypass. Furthermore, tion 2 discusses previous work dealing with the use of virtu- since the addition of virtualization support does not affect alization in HPC. Section 3 discusses several potential areas the performance of applications using the traditional native where platform virtualization could be useful in high-end environment, there is essentially no disadvantage to its ad- supercomputing. Section 4 presents single node native vs. dition. virtual performance results on a modern Intel platform that show that compute, virtual memory, and I/O virtualization 1.
    [Show full text]
  • System Trends and Their Impact on Future Microprocessor Design
    IBM Research System Trends and their Impact on Future Microprocessor Design Tilak Agerwala Vice President, Systems IBM Research System Trends and their Impact | MICRO 35 | Tilak Agerwala © 2002 IBM Corporation IBM Research Agenda System and application trends Impact on architecture and microarchitecture The Memory Wall Cellular architectures and IBM's Blue Gene Summary System Trends and their Impact | MICRO 35 | Tilak Agerwala © 2002 IBM Corporation IBM Research TRENDSTRENDS Microprocessors < 10 GHz in systems 64-256 Way SMP 65-45nm, Copper, Highest performance SOI Best MP Scalability 1-2 GHz Leading edge process technology 4-8 Way SMP RAS, virtualization ~100nm technology 10+ of GHz 4-8 Way SMP SMP/Large 65-45nm, Copper, SOI Systems Highest Frequency Cost and power Low GHz sensitive Leading edge process Uniprocessor technology Desktop ~100nm technology 2-4 GHz, Uniproc, Component-based and Game ~100nm, Copper, SOI Consoles Lowest Power / Lowest Multi MHz cost designs SoC capable Embedded Uniprocessor ASIC / Foundry ~100-200nm technologies Systems technology System Trends and their Impact | MICRO 35 | Tilak Agerwala © 2002 IBM Corporation IBM Research TRENDSTRENDS Large system application trends Traditional commercial applications Databases, transaction processing, business apps like payroll etc. The internet has driven the growth of new commercial applications New life sciences applications are commercial and high-growth Drug discovery and genetic engineering research needs huge amounts of compute power (e.g. protein folding simulations)
    [Show full text]
  • A Virtual Machine Environment for Real Time Systems Laboratories
    AC 2007-904: A VIRTUAL MACHINE ENVIRONMENT FOR REAL-TIME SYSTEMS LABORATORIES Mukul Shirvaikar, University of Texas-Tyler MUKUL SHIRVAIKAR received the Ph.D. degree in Electrical and Computer Engineering from the University of Tennessee in 1993. He is currently an Associate Professor of Electrical Engineering at the University of Texas at Tyler. He has also held positions at Texas Instruments and the University of West Florida. His research interests include real-time imaging, embedded systems, pattern recognition, and dual-core processor architectures. At the University of Texas he has started a new real-time systems lab using dual-core processor technology. He is also the principal investigator for the “Back-To-Basics” project aimed at engineering student retention. Nikhil Satyala, University of Texas-Tyler NIKHIL SATYALA received the Bachelors degree in Electronics and Communication Engineering from the Jawaharlal Nehru Technological University (JNTU), India in 2004. He is currently pursuing his Masters degree at the University of Texas at Tyler, while working as a research assistant. His research interests include embedded systems, dual-core processor architectures and microprocessors. Page 12.152.1 Page © American Society for Engineering Education, 2007 A Virtual Machine Environment for Real Time Systems Laboratories Abstract The goal of this project was to build a superior environment for a real time system laboratory that would allow users to run Windows and Linux embedded application development tools concurrently on a single computer. These requirements were dictated by real-time system applications which are increasingly being implemented on asymmetric dual-core processors running different operating systems. A real time systems laboratory curriculum based on dual- core architectures has been presented in this forum in the past.2 It was designed for a senior elective course in real time systems at the University of Texas at Tyler that combines lectures along with an integrated lab.
    [Show full text]
  • An Overview of the Blue Gene/L System Software Organization
    An Overview of the Blue Gene/L System Software Organization George Almasi´ , Ralph Bellofatto , Jose´ Brunheroto , Calin˘ Cas¸caval , Jose´ G. ¡ Castanos˜ , Luis Ceze , Paul Crumley , C. Christopher Erway , Joseph Gagliano , Derek Lieber , Xavier Martorell , Jose´ E. Moreira , Alda Sanomiya , and Karin ¡ Strauss ¢ IBM Thomas J. Watson Research Center Yorktown Heights, NY 10598-0218 £ gheorghe,ralphbel,brunhe,cascaval,castanos,pgc,erway, jgaglia,lieber,xavim,jmoreira,sanomiya ¤ @us.ibm.com ¥ Department of Computer Science University of Illinois at Urbana-Champaign Urabana, IL 61801 £ luisceze,kstrauss ¤ @uiuc.edu Abstract. The Blue Gene/L supercomputer will use system-on-a-chip integra- tion and a highly scalable cellular architecture. With 65,536 compute nodes, Blue Gene/L represents a new level of complexity for parallel system software, with specific challenges in the areas of scalability, maintenance and usability. In this paper we present our vision of a software architecture that faces up to these challenges, and the simulation framework that we have used for our experiments. 1 Introduction In November 2001 IBM announced a partnership with Lawrence Livermore National Laboratory to build the Blue Gene/L (BG/L) supercomputer, a 65,536-node machine de- signed around embedded PowerPC processors. Through the use of system-on-a-chip in- tegration [10], coupled with a highly scalable cellular architecture, Blue Gene/L will de- liver 180 or 360 Teraflops of peak computing power, depending on the utilization mode. Blue Gene/L represents a new level of scalability for parallel systems. Whereas existing large scale systems range in size from hundreds (ASCI White [2], Earth Simulator [4]) to a few thousands (Cplant [3], ASCI Red [1]) of compute nodes, Blue Gene/L makes a jump of almost two orders of magnitude.
    [Show full text]
  • GT-Virtual-Machines
    College of Design Virtual Machine Setup for CP 6581 Fall 2020 1. You will have access to multiple College of Design virtual machines (VMs). CP 6581 will require VMs with ArcGIS installed, and for class sessions we will start the semester using the K2GPU VM. You should open each VM available to you and check to see if ArcGIS is installed. 2. Here are a few general notes on VMs. a. Because you will be a new user when you first start a VM, you may be prompted to set up Microsoft Explorer, Microsoft Edge, Adobe Creative Suite, or other software. You can close these windows and set up specific software later, if you wish. b. Different VMs allow different degrees of customization. To customize right-click on an empty Desktop area and choose “Personalize”. Change your background to a solid color that’s a distinctive color so you can easily distinguish your virtual desktop from your local computer’s desktop. Some VMs will remember your desktop color, other VMs must be re-set at each login, other VMs won’t allow you to change the background color but may allow you to change a highlight color from the “Colors” item on the left menu just below “Background”. c. Your desktop will look exactly the same as physical computer’s desktop except for the small black rectangle at the middle of the very top of the VM screen. Click on the rectangle and a pop-down horizontal menu of VM options will appear. Here are two useful options. i. Preferences then File Access will allow you to grant VM access to your local computer’s drives and files.
    [Show full text]
  • Cellular Wave Computers and CNN Technology – a Soc Architecture with Xk Processors and Sensor Arrays*
    Cellular Wave Computers and CNN Technology – a SoC architecture with xK processors and sensor arrays* Tamás ROSKA1 Fellow IEEE 1. Introduction and the main theses 1.1 Scenario: Architectural lessons from the trends in manufacturing billion component devices when crossing the threshold of 100 nm feature size Preliminary proposition: The nature of fabrication technology, the nature and type of data to be processed, and the nature and type of events to be detected or „computed” will determine the architecture, the elementary instructions, and the type of algorithms needed, hence also the complexity of the solution. In view of this proposition, let us list a few key features of the electronic technology of Today and its consequences.. (i) Convergence of CMOS, NANO and OPTICAL technologies towards a cellular architecture with short and sparse wires CMOS chips: * Processors: K or M transistors on an M or G transistor die =>K processors /chip * Wires: at 180 nm or below, gate delay is smaller than wire delay NANO processors and sensors: * Mainly 2D organization of cells integrating processing and sensing * Interactions mainly with the neighbours OPTICAL devices: * parallel processing * optical correlators * VCSELs and programable SLMs, Hence the architecture should be characterized by * 2 D layers (or a layered 3D) * Cellular architecture with * mainly local and /or regular sparse wireing leading via Î a Cellular Nonlinear Network (CNN) Dynamics 1 The Faculty of Information Technology and the Jedlik Laboratories of the Pázmány University, Budapest and the Computer and Automation Institute of the Hungarian Academy of Sciences, Budapest, Hungary ([email protected], www.itk.ppke.hu) * Research supported by the Office of Naval Research, Human Frontiers of Science Program, EU Future and Emergent Technologies Program, the Hungarian Academy of Sciences, and the Jedlik Laboratories of the Pázmány University, Budapest 0-7803-9254-X/05/$20.00 ©2005 IEEE.
    [Show full text]
  • Performance Modelling and Optimization of Memory Access on Cellular Computer Architecture Cyclops64
    Performance modelling and optimization of memory access on cellular computer architecture Cyclops64 Yanwei Niu, Ziang Hu, Kenneth Barner and Guang R. Gao Department of ECE, University of Delaware, Newark, DE, 19711, USA {niu, hu, barner, ggao}@ee.udel.edu Abstract. This paper focuses on the Cyclops64 computer architecture and presents an analytical model and performance simulation results for the preloading and loop unrolling approaches to optimize the performance of SVD (Singular Value Decomposition) benchmark. A performance model for dissecting the total execu- tion cycles is presented. The data preloading using “memcpy” or hand optimized “inline” assembly code, and the loop unrolling approach are implemented and compared with each other in terms of the total number of memory access cycles. The key idea is to preload data from offchip to onchip memory and store the data back after the computation. These approaches can reduce the total memory access cycles and can thus improve the benchmark performance significantly. 1 Introduction The design concept of computer architecture over the last two decades has been mainly on the exploitation of the instruction level parallelism, such as pipelining,VLIW or superscalar architecture. For the next generation of computer architecture, hardware threading multiprocessor is becoming more and more popular. One approach of hard- ware multithreading is called CMP (Chip MultiProcessor) approach, which proposes a single chip design that uses a collection of independent processors with less resource sharing. An example of CMP architecture design is Cyclops64 [1–5], a new architec- ture for high performance parallel computers being developed at the IBM T. J. Watson Research Center and University of Delaware.
    [Show full text]
  • Virtualizationoverview
    VMWAREW H WHITEI T E PPAPERA P E R Virtualization Overview 1 VMWARE WHITE PAPER Table of Contents Introduction .............................................................................................................................................. 3 Virtualization in a Nutshell ................................................................................................................... 3 Virtualization Approaches .................................................................................................................... 4 Virtualization for Server Consolidation and Containment ........................................................... 7 How Virtualization Complements New-Generation Hardware .................................................. 8 Para-virtualization ................................................................................................................................... 8 VMware’s Virtualization Portfolio ........................................................................................................ 9 Glossary ..................................................................................................................................................... 10 2 VMWARE WHITE PAPER Virtualization Overview Introduction Virtualization in a Nutshell Among the leading business challenges confronting CIOs and Simply put, virtualization is an idea whose time has come. IT managers today are: cost-effective utilization of IT infrastruc- The term virtualization broadly describes the separation
    [Show full text]
  • Virtual Machine Benchmarking Kim-Thomas M¨Oller Diploma Thesis
    Universitat¨ Karlsruhe (TH) Institut fur¨ Betriebs- und Dialogsysteme Lehrstuhl Systemarchitektur Virtual Machine Benchmarking Kim-Thomas Moller¨ Diploma Thesis Advisors: Prof. Dr. Frank Bellosa Joshua LeVasseur 17. April 2007 I hereby declare that this thesis is the result of my own work, and that all informa- tion sources and literature used are indicated in the thesis. I also certify that the work in this thesis has not previously been submitted as part of requirements for a degree. Hiermit erklare¨ ich, die vorliegende Arbeit selbstandig¨ und nur unter Benutzung der angegebenen Literatur und Hilfsmittel angefertigt zu haben. Alle Stellen, die wortlich¨ oder sinngemaߨ aus veroffentlichten¨ und nicht veroffentlichten¨ Schriften entnommen wurden, sind als solche kenntlich gemacht. Die Arbeit hat in gleicher oder ahnlicher¨ Form keiner anderen Prufungsbeh¨ orde¨ vorgelegen. Karlsruhe, den 17. April 2007 Kim-Thomas Moller¨ Abstract The resurgence of system virtualization has provoked diverse virtualization tech- niques targeting different application workloads and requirements. However, a methodology to compare the performance of virtualization techniques at fine gran- ularity has not yet been introduced. VMbench is a novel benchmarking suite that focusses on virtual machine environments. By applying the pre-virtualization ap- proach for hypervisor interoperability, VMbench achieves hypervisor-neutral in- strumentation of virtual machines at the instruction level. Measurements of dif- ferent virtual machine configurations demonstrate how VMbench helps rate and predict virtual machine performance. Kurzfassung Das wiedererwachte Interesse an der Systemvirtualisierung hat verschiedenartige Virtualisierungstechniken fur¨ unterschiedliche Anwendungslasten und Anforde- rungen hervorgebracht. Jedoch wurde bislang noch keine Methodik eingefuhrt,¨ um Virtualisierungstechniken mit hoher Granularitat¨ zu vergleichen. VMbench ist eine neuartige Benchmarking-Suite fur¨ Virtuelle-Maschinen-Umgebungen.
    [Show full text]
  • Distributed Virtual Machines: a System Architecture for Network Computing
    Distributed Virtual Machines: A System Architecture for Network Computing Emin Gün Sirer, Robert Grimm, Arthur J. Gregory, Nathan Anderson, Brian N. Bershad {egs,rgrimm,artjg,nra,bershad}@cs.washington.edu http://kimera.cs.washington.edu Dept. of Computer Science & Engineering University of Washington Seattle, WA 98195-2350 Abstract Modern virtual machines, such as Java and Inferno, are emerging as network computing platforms. While these virtual machines provide higher-level abstractions and more sophisticated services than their predecessors from twenty years ago, their architecture has essentially remained unchanged. State of the art virtual machines are still monolithic, that is, they are comprised of closely-coupled service components, which are thus replicated over all computers in an organization. This crude replication of services forms one of the weakest points in today’s networked systems, as it creates widely acknowledged and well-publicized problems of security, manageability and performance. We have designed and implemented a new system architecture for network computing based on distributed virtual machines. In our system, virtual machine services that perform rule checking and code transformation are factored out of clients and are located in enterprise- wide network servers. The services operate by intercepting application code and modifying it on the fly to provide additional service functionality. This architecture reduces client resource demands and the size of the trusted computing base, establishes physical isolation between virtual machine services and creates a single point of administration. We demonstrate that such a distributed virtual machine architecture can provide substantially better integrity and manageability than a monolithic architecture, scales well with increasing numbers of clients, and does not entail high overhead.
    [Show full text]