Virtual Machines Uses for Virtual Machines
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
US 2002/0066086 A1 Linden (43) Pub
US 2002O066086A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2002/0066086 A1 Linden (43) Pub. Date: May 30, 2002 (54) DYNAMIC RECOMPLER (52) U.S. Cl. ............................................ 717/145; 717/151 (76) Inventor: Randal N. Linden, Los Angeles, CA (57) ABSTRACT (US) A method for dynamic recompilation of Source Software Correspondence Address: instructions for execution by a target processor, which LU & LIU LLP considers not only the Specific Source instructions, but also 811 WEST SEVENTH STREET, SUITE 1100 the intent and purpose of the instructions, to translate and LOS ANGELES, CA 90017 (US) optimize a set of equivalent code for the target processor. The dynamic recompiler determines what the Source opera (21) Appl. No.: 09/755,542 tion code is trying to accomplish and the optimum way of Filed: Jan. 5, 2001 doing it at the target processor, in an “interpolative' and (22) context Sensitive fashion. The Source instructions are pro Related U.S. Application Data cessed in blocks of varying Sizes by the dynamic recompiler, which considers the instructions that come before and after (63) Non-provisional of provisional application No. a current instruction to determine the most efficient approach 60/175,008, filed on Jan. 7, 2000. out of Several available approaches for encoding the opera tion code for the target processor to perform the equivalent Publication Classification tasks Specified by the Source instructions. The dynamic compiler comprises a decoding Stage, an optimization Stage (51) Int. Cl. ................................................. G06F 9/45 and an encoding Stage. Patent Application Publication May 30, 2002 Sheet 1 of 3 US 2002/0066086 A1 Patent Application Publication May 30, 2002 Sheet 2 of 3 US 2002/0066086 A1 - \O, 2 Patent Application Publication May 30, 2002 Sheet 3 of 3 US 2002/0066086 A1 fo, 3 US 2002/0066086 A1 May 30, 2002 DYNAMIC RECOMPLER more target instructions to be executed in response to a given Source instruction. -
A Dynamically Recompiling ARM Emulator POWERED
POWERED TACARM A Dynamically Recompiling ARM Emulator 3rd year project report 2000-2001 University of Warwick Written by David Sharp Supervised by Graham Martin 2 Abstract Dynamically recompiling from one machine code to another can be used to emulate modern microprocessors at realistic speeds. This report is a discussion of the techniques used in implementing a dynamically recompiling emulator of an ARM processor for use in an emulation of a complete computer system. Keywords Emulation, Dynamic Recompilation, Binary Translation, Just-In-Time compilation, Virtual Machine, Microprocessor Simulation, Intermediate Code, ARM. 3 Contents 1. Introduction 7 1.1. What is emulation? 7 1.2. Applications of emulation 7 1.3. Processor emulation techniques 9 1.4. This Project 13 2. Analysis 16 2.1. Introduction to the ARM 16 2.2. Identifying the problem 18 2.3. Getting started 20 2.4. The System Design 21 3. Disassembler 23 3.1. Purpose 23 3.2. ARM decoding 23 3.3. Design 24 4. Interpreter 26 4.1. Purpose 26 4.2. The problem with JIT 26 4.3. The HotSpotTM alternative 26 4.4. Quantifying the JIT problem 27 4.5. Faster decoding 28 4.6. Interfaces 29 4.7. The emulation loop 30 4.8. Implementation 31 4.9. Debugging 32 4.10. Compatibility 33 5. Recompilation 35 5.1. Overview 35 5.2. Methods of generating native code 35 5.3. The use of intermediate code 36 6. Armlets – An Intermediate Code 38 6.1. Purpose 38 6.2. The ‘explicit-implicit problem’ 38 6.3. Options 39 6.4. -
A Bare Machine Sensor Application for an ARM Processor
A Bare Machine Sensor Application for an ARM Processor Alexander Peter, Ramesh K. Karne and Alexander L. Wijesinha Department of Computer & Information Sciences Towson, MD 21252 [email protected], (rkarne, awijesinha)@towson.edu Abstract- Sensor devices that monitor environmental changes in In order to build a bare machine temperature sensor device temperature, sound, light, vibration and pressure usually run application, it requires several components as shown in Fig. 1. applications that require the support of a small operating system, A brief functional description of this application is as follows. lean kernel, or an embedded system. This paper presents a The application is loaded from the SD Card interface (Label 9) methodology for developing sensor device applications that can into memory using U-Boot. A temperature sensor (Label 1) is be directly run on the bare hardware without any need for middleware. Such bare sensor device applications only depend plugged into the ADB to monitor the current room on the underlying processor architecture, enabling them to run temperature with a sensing granularity of 900 microseconds. on a variety of devices. The methodology is used for developing, As the temperature changes, its value (Label 8) and a graphic designing, and implementing a temperature sensor application image (Label 7) is displayed on the LCD screen (Label 2). An that runs on an ARM processor. The same methodology can be output console is used to debug the inner working of the ADB used to build other bare machine sensor device applications for and temperature (Label 3). An LED light (Label 4) provides a ARM processors, and is easily extended to different processor visual indication of temperature sensor operation. -
Operating Systems Privileged Instructions
November 2, 1999 Operating Systems • Operating systems manage processes and resources • Processes are executing instances of programs may be the same or different programs process 1 process 2 process 3 code code code data data data SV SV SV state vector: information necessary to start, stop, and restart the process • State vector: registers, program counter, memory mgmt registers, etc. user-level processes assembly & high-level languages operating system, “kernel” machine language & system calls bare machine machine language Copyright 1997 Computer Science 217: Operating System Page 187 November 2, 1999 Privileged Instructions • Machines have two kinds of instructions 1. “normal” instructions, e.g., add, sub, etc. 2. “privileged” instructions, e.g., initiate I/O switch state vectors or contexts load/save from protected memory etc. • Operating systems hide privileged instructions and provide virtual instructions to access and manipulate virtual resources, e.g., I/O to and from disc files • Virtual instructions are system calls • Operating systems interpret virtual instructions Copyright 1997 Computer Science 217: Operating System Page 188 November 2, 1999 Processor Modes • Machine level typically has 2 modes, e.g., “user” mode and “kernel” mode • User mode processor executes “normal” instructions in the user’s program upon encountering a “privileged” instruction, processor switches to kernel mode, and the operating system performs a service • Kernel mode processor executes both normal and privileged instructions • User-to-kernel switch -
A Brief History of Just-In-Time Compilation
A Brief History of Just-In-Time JOHN AYCOCK University of Calgary Software systems have been using “just-in-time” compilation (JIT) techniques since the 1960s. Broadly, JIT compilation includes any translation performed dynamically, after a program has started execution. We examine the motivation behind JIT compilation and constraints imposed on JIT compilation systems, and present a classification scheme for such systems. This classification emerges as we survey forty years of JIT work, from 1960–2000. Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors; K.2 [History of Computing]: Software General Terms: Languages, Performance Additional Key Words and Phrases: Just-in-time compilation, dynamic compilation 1. INTRODUCTION into a form that is executable on a target platform. Those who cannot remember the past are con- What is translated? The scope and na- demned to repeat it. ture of programming languages that re- George Santayana, 1863–1952 [Bartlett 1992] quire translation into executable form covers a wide spectrum. Traditional pro- This oft-quoted line is all too applicable gramming languages like Ada, C, and in computer science. Ideas are generated, Java are included, as well as little lan- explored, set aside—only to be reinvented guages [Bentley 1988] such as regular years later. Such is the case with what expressions. is now called “just-in-time” (JIT) or dy- Traditionally, there are two approaches namic compilation, which refers to trans- to translation: compilation and interpreta- lation that occurs after a program begins tion. Compilation translates one language execution. into another—C to assembly language, for Strictly speaking, JIT compilation sys- example—with the implication that the tems (“JIT systems” for short) are com- translated form will be more amenable pletely unnecessary. -
Program Dynamic Analysis Overview
4/14/16 Program Dynamic Analysis Overview • Dynamic Analysis • JVM & Java Bytecode [2] • A Java bytecode engineering library: ASM [1] 2 1 4/14/16 What is dynamic analysis? [3] • The investigation of the properties of a running software system over one or more executions 3 Has anyone done dynamic analysis? [3] • Loggers • Debuggers • Profilers • … 4 2 4/14/16 Why dynamic analysis? [3] • Gap between run-time structure and code structure in OO programs Trying to understand one [structure] from the other is like trying to understand the dynamism of living ecosystems from the static taxonomy of plants and animals, and vice-versa. -- Erich Gamma et al., Design Patterns 5 Why dynamic analysis? • Collect runtime execution information – Resource usage, execution profiles • Program comprehension – Find bugs in applications, identify hotspots • Program transformation – Optimize or obfuscate programs – Insert debugging or monitoring code – Modify program behaviors on the fly 6 3 4/14/16 How to do dynamic analysis? • Instrumentation – Modify code or runtime to monitor specific components in a system and collect data – Instrumentation approaches • Source code modification • Byte code modification • VM modification • Data analysis 7 A Running Example • Method call instrumentation – Given a program’s source code, how do you modify the code to record which method is called by main() in what order? public class Test { public static void main(String[] args) { if (args.length == 0) return; if (args.length % 2 == 0) printEven(); else printOdd(); } public -
MIPS: the Virtual Machine
2 MIPS: The Virtual Machine Objectives After this lab you will know: • what are synthetic instructions • how synthetic instructions expand to sequences of native instructions • how MIPS addresses the memory Note: You can use PCSpim using the menu items instead of individual console commands. Familiarize yourself with PCSpim. Introduction MIPS is a ‘typical’ RISC architecture: it has a simple and regular instruction set, only one memory addressing mode (base plus displacement) and instructions are of fixed size (32 bit). Surprisingly, it is not very simple to write assembly programs for MIPS (and for any RISC machine for that matter). The reason is that programming is not supposed to be done in assembly language. The instruction sets of RISC machines have been designed such that: • they are good targets for compilers (thus simplifying the job of compiler writers) • they allow a very efficient implementation (pipelining) Pipelining means that several instructions are present in the CPU at any time, in various stages of execution. Because of this situation it is possible that some sequences of instructions just don’t work the way they are listed on paper. For instance Ex 1: lw $t0, 0($gp) # $t0 <- M[$gp + 0] add $t2, $t2, $t0 # $t2 <- $t2 + $t0 loads a word from memory in register $t0 and then adds it, in the next instruction, to register $t2. The prob- lem here is that, by the time the add instruction is ready to use register $t0, the value of $t0 has not changed yet. This is not to say that the load instruction (lw) has not completed. -
Address Translation
CS 152 Computer Architecture and Engineering Lecture 8 - Address Translation John Wawrzynek Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~johnw http://inst.eecs.berkeley.edu/~cs152 9/27/2016 CS152, Fall 2016 CS152 Administrivia § Lab 2 due Friday § PS 2 due Tuesday § Quiz 2 next Thursday! 9/27/2016 CS152, Fall 2016 2 Last time in Lecture 7 § 3 C’s of cache misses – Compulsory, Capacity, Conflict § Write policies – Write back, write-through, write-allocate, no write allocate § Multi-level cache hierarchies reduce miss penalty – 3 levels common in modern systems (some have 4!) – Can change design tradeoffs of L1 cache if known to have L2 § Prefetching: retrieve memory data before CPU request – Prefetching can waste bandwidth and cause cache pollution – Software vs hardware prefetching § Software memory hierarchy optimizations – Loop interchange, loop fusion, cache tiling 9/27/2016 CS152, Fall 2016 3 Bare Machine Physical Physical Address Inst. Address Data Decode PC Cache D E + M Cache W Physical Memory Controller Physical Address Address Physical Address Main Memory (DRAM) § In a bare machine, the only kind of address is a physical address 9/27/2016 CS152, Fall 2016 4 Absolute Addresses EDSAC, early 50’s § Only one program ran at a time, with unrestricted access to entire machine (RAM + I/O devices) § Addresses in a program depended upon where the program was to be loaded in memory § But it was more convenient for programmers to write location-independent subroutines How -
Transparent Dynamic Optimization: the Design and Implementation of Dynamo
Transparent Dynamic Optimization: The Design and Implementation of Dynamo Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia HP Laboratories Cambridge HPL-1999-78 June, 1999 E-mail: [email protected] dynamic Dynamic optimization refers to the runtime optimization optimization, of a native program binary. This report describes the compiler, design and implementation of Dynamo, a prototype trace selection, dynamic optimizer that is capable of optimizing a native binary translation program binary at runtime. Dynamo is a realistic implementation, not a simulation, that is written entirely in user-level software, and runs on a PA-RISC machine under the HPUX operating system. Dynamo does not depend on any special programming language, compiler, operating system or hardware support. Contrary to intuition, we demonstrate that it is possible to use a piece of software to improve the performance of a native, statically optimized program binary, while it is executing. Dynamo not only speeds up real application programs, its performance improvement is often quite significant. For example, the performance of many +O2 optimized SPECint95 binaries running under Dynamo is comparable to the performance of their +O4 optimized version running without Dynamo. Internal Accession Date Only Ó Copyright Hewlett-Packard Company 1999 Contents 1 INTRODUCTION ........................................................................................... 7 2 RELATED WORK ......................................................................................... 9 3 OVERVIEW -
Infrastructures and Compilation Strategies for the Performance of Computing Systems Erven Rohou
Infrastructures and Compilation Strategies for the Performance of Computing Systems Erven Rohou To cite this version: Erven Rohou. Infrastructures and Compilation Strategies for the Performance of Computing Systems. Other [cs.OH]. Universit´ede Rennes 1, 2015. <tel-01237164> HAL Id: tel-01237164 https://hal.inria.fr/tel-01237164 Submitted on 2 Dec 2015 HAL is a multi-disciplinary open access L'archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destin´eeau d´ep^otet `ala diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publi´esou non, lished or not. The documents may come from ´emanant des ´etablissements d'enseignement et de teaching and research institutions in France or recherche fran¸caisou ´etrangers,des laboratoires abroad, or from public or private research centers. publics ou priv´es. Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License HABILITATION À DIRIGER DES RECHERCHES présentée devant L’Université de Rennes 1 Spécialité : informatique par Erven Rohou Infrastructures and Compilation Strategies for the Performance of Computing Systems soutenue le 2 novembre 2015 devant le jury composé de : Mme Corinne Ancourt Rapporteur Prof. Stefano Crespi Reghizzi Rapporteur Prof. Frédéric Pétrot Rapporteur Prof. Cédric Bastoul Examinateur Prof. Olivier Sentieys Président M. André Seznec Examinateur Contents 1 Introduction 3 1.1 Evolution of Hardware . .3 1.1.1 Traditional Evolution . .4 1.1.2 Multicore Era . .4 1.1.3 Amdahl’s Law . .5 1.1.4 Foreseeable Future Architectures . .6 1.1.5 Extremely Complex Micro-architectures . .6 1.2 Evolution of Software Ecosystem . -
Costing JIT Traces
Costing JIT Traces John Magnus Morton, Patrick Maier, and Philip Trinder University of Glasgow [email protected], {Patrick.Maier, Phil.Trinder}@glasgow.ac.uk Abstract. Tracing JIT compilation generates units of compilation that are easy to analyse and are known to execute frequently. The AJITPar project aims to investigate whether the information in JIT traces can be used to make better scheduling decisions or perform code transformations to adapt the code for a specific parallel architecture. To achieve this goal, a cost model must be developed to estimate the execution time of an individual trace. This paper presents the design and implementation of a system for ex- tracting JIT trace information from the Pycket JIT compiler. We define three increasingly parametric cost models for Pycket traces. We perform a search of the cost model parameter space using genetic algorithms to identify the best weightings for those parameters. We test the accuracy of these cost models for predicting the cost of individual traces on a set of loop-based micro-benchmarks. We also compare the accuracy of the cost models for predicting whole program execution time over the Py- cket benchmark suite. Our results show that the weighted cost model using the weightings found from the genetic algorithm search has the best accuracy. 1 Introduction Modern hardware is increasingly multicore, and increasingly, software is required to exhibit decent parallel performance in order to match the hardware’s poten- tial. Writing performant parallel code is non-trivial for a fixed architecture, yet it is much harder if the target architecture is not known in advance, or if the code is meant to be portable across a range of architectures. -
Microkernels Meet Recursive Virtual Machines
Microkernels Meet Recursive Virtual Machines Bryan Ford Mike Hibler Jay Lepreau Patrick Tullmann Godmar Back Stephen Clawson Department of Computer Science, University of Utah Salt Lake City, UT 84112 [email protected] http://www.cs.utah.edu/projects/flux/ Abstract “vertically” by implementing OS functionality in stackable virtual machine monitors, each of which exports a virtual This paper describes a novel approach to providingmod- machine interface compatible with the machine interface ular and extensible operating system functionality and en- on which it runs. Traditionally, virtual machines have been capsulated environments based on a synthesis of micro- implemented on and export existing hardware architectures kernel and virtual machine concepts. We have developed so they can support “naive” operating systems (see Fig- a software-based virtualizable architecture called Fluke ure 1). For example, the most well-known virtual machine that allows recursive virtual machines (virtual machines system, VM/370 [28, 29], provides virtual memory and se- running on other virtual machines) to be implemented ef- curity between multiple concurrent virtual machines, all ficiently by a microkernel running on generic hardware. exporting the IBM S/370 hardware architecture. Further- A complete virtual machine interface is provided at each more, special virtualizable hardware architectures [22, 35] level; efficiency derives from needing to implement only have been proposed, whose design goal is to allow virtual new functionality at each level. This infrastructure allows machines to be stacked much more efficiently. common OS functionality, such as process management, demand paging, fault tolerance, and debugging support, to This paper presents a new approach to OS extensibil- be provided by cleanly modularized, independent, stack- ity which combines both microkernel and virtual machine able virtual machine monitors, implemented as user pro- concepts in one system.