Vmkit: a Substrate for Managed Runtime Environments
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The LLVM Instruction Set and Compilation Strategy
The LLVM Instruction Set and Compilation Strategy Chris Lattner Vikram Adve University of Illinois at Urbana-Champaign lattner,vadve ¡ @cs.uiuc.edu Abstract This document introduces the LLVM compiler infrastructure and instruction set, a simple approach that enables sophisticated code transformations at link time, runtime, and in the field. It is a pragmatic approach to compilation, interfering with programmers and tools as little as possible, while still retaining extensive high-level information from source-level compilers for later stages of an application’s lifetime. We describe the LLVM instruction set, the design of the LLVM system, and some of its key components. 1 Introduction Modern programming languages and software practices aim to support more reliable, flexible, and powerful software applications, increase programmer productivity, and provide higher level semantic information to the compiler. Un- fortunately, traditional approaches to compilation either fail to extract sufficient performance from the program (by not using interprocedural analysis or profile information) or interfere with the build process substantially (by requiring build scripts to be modified for either profiling or interprocedural optimization). Furthermore, they do not support optimization either at runtime or after an application has been installed at an end-user’s site, when the most relevant information about actual usage patterns would be available. The LLVM Compilation Strategy is designed to enable effective multi-stage optimization (at compile-time, link-time, runtime, and offline) and more effective profile-driven optimization, and to do so without changes to the traditional build process or programmer intervention. LLVM (Low Level Virtual Machine) is a compilation strategy that uses a low-level virtual instruction set with rich type information as a common code representation for all phases of compilation. -
Domain-Specific Programming Systems
Lecture 22: Domain-Specific Programming Systems Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2020 Slide acknowledgments: Pat Hanrahan, Zach Devito (Stanford University) Jonathan Ragan-Kelley (MIT, Berkeley) Course themes: Designing computer systems that scale (running faster given more resources) Designing computer systems that are efficient (running faster under constraints on resources) Techniques discussed: Exploiting parallelism in applications Exploiting locality in applications Leveraging hardware specialization (earlier lecture) CMU 15-418/618, Spring 2020 Claim: most software uses modern hardware resources inefficiently ▪ Consider a piece of sequential C code - Call the performance of this code our “baseline performance” ▪ Well-written sequential C code: ~ 5-10x faster ▪ Assembly language program: maybe another small constant factor faster ▪ Java, Python, PHP, etc. ?? Credit: Pat Hanrahan CMU 15-418/618, Spring 2020 Code performance: relative to C (single core) GCC -O3 (no manual vector optimizations) 51 40/57/53 47 44/114x 40 = NBody 35 = Mandlebrot = Tree Alloc/Delloc 30 = Power method (compute eigenvalue) 25 20 15 10 5 Slowdown (Compared to C++) Slowdown (Compared no data no 0 data no Java Scala C# Haskell Go Javascript Lua PHP Python 3 Ruby (Mono) (V8) (JRuby) Data from: The Computer Language Benchmarks Game: CMU 15-418/618, http://shootout.alioth.debian.org Spring 2020 Even good C code is inefficient Recall Assignment 1’s Mandelbrot program Consider execution on a high-end laptop: quad-core, Intel Core i7, AVX instructions... Single core, with AVX vector instructions: 5.8x speedup over C implementation Multi-core + hyper-threading + AVX instructions: 21.7x speedup Conclusion: basic C implementation compiled with -O3 leaves a lot of performance on the table CMU 15-418/618, Spring 2020 Making efficient use of modern machines is challenging (proof by assignments 2, 3, and 4) In our assignments, you only programmed homogeneous parallel computers. -
Toward Harnessing High-Level Language Virtual Machines for Further Speeding up Weak Mutation Testing
2012 IEEE Fifth International Conference on Software Testing, Verification and Validation Toward Harnessing High-level Language Virtual Machines for Further Speeding up Weak Mutation Testing Vinicius H. S. Durelli Jeff Offutt Marcio E. Delamaro Computer Systems Department Software Engineering Computer Systems Department Universidade de Sao˜ Paulo George Mason University Universidade de Sao˜ Paulo Sao˜ Carlos, SP, Brazil Fairfax, VA, USA Sao˜ Carlos, SP, Brazil [email protected] [email protected] [email protected] Abstract—High-level language virtual machines (HLL VMs) have tried to exploit the control that HLL VMs exert over run- are now widely used to implement high-level programming ning programs to facilitate and speedup software engineering languages. To a certain extent, their widespread adoption is due activities. Thus, this research suggests that software testing to the software engineering benefits provided by these managed execution environments, for example, garbage collection (GC) activities can benefit from HLL VMs support. and cross-platform portability. Although HLL VMs are widely Test tools are usually built on top of HLL VMs. However, used, most research has concentrated on high-end optimizations they often end up tampering with the emergent computation. such as dynamic compilation and advanced GC techniques. Few Using features within the HLL VMs can avoid such problems. efforts have focused on introducing features that automate or fa- Moreover, embedding testing tools within HLL VMs can cilitate certain software engineering activities, including software testing. This paper suggests that HLL VMs provide a reasonable significantly speedup computationally expensive techniques basis for building an integrated software testing environment. As such as mutation testing [6]. -
Sun Glassfish Enterprise Server V3 Scripting Framework Guide
Sun GlassFish Enterprise Server v3 Scripting Framework Guide Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. Part No: 820–7697–11 December 2009 Copyright 2009 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved. Sun Microsystems, Inc. has intellectual property rights relating to technology embodied in the product that is described in this document. In particular, and without limitation, these intellectual property rights may include one or more U.S. patents or pending patent applications in the U.S. and in other countries. U.S. Government Rights – Commercial software. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements. This distribution may include materials developed by third parties. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd. Sun, Sun Microsystems, the Sun logo, the Solaris logo, the Java Coffee Cup logo, docs.sun.com, Enterprise JavaBeans, EJB, GlassFish, J2EE, J2SE, Java Naming and Directory Interface, JavaBeans, Javadoc, JDBC, JDK, JavaScript, JavaServer, JavaServer Pages, JMX, JRE, JSP,JVM, MySQL, NetBeans, OpenSolaris, SunSolve, Sun GlassFish, ZFS, Java, and Solaris are trademarks or registered trademarks of Sun Microsystems, Inc. or its subsidiaries in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. -
What Is LLVM? and a Status Update
What is LLVM? And a Status Update. Approved for public release Hal Finkel Leadership Computing Facility Argonne National Laboratory Clang, LLVM, etc. ✔ LLVM is a liberally-licensed(*) infrastructure for creating compilers, other toolchain components, and JIT compilation engines. ✔ Clang is a modern C++ frontend for LLVM ✔ LLVM and Clang will play significant roles in exascale computing systems! (*) Now under the Apache 2 license with the LLVM Exception LLVM/Clang is both a research platform and a production-quality compiler. 2 A role in exascale? Current/Future HPC vendors are already involved (plus many others)... Apple + Google Intel (Many millions invested annually) + many others (Qualcomm, Sony, Microsoft, Facebook, Ericcson, etc.) ARM LLVM IBM Cray NVIDIA (and PGI) Academia, Labs, etc. AMD 3 What is LLVM: LLVM is a multi-architecture infrastructure for constructing compilers and other toolchain components. LLVM is not a “low-level virtual machine”! LLVM IR Architecture-independent simplification Architecture-aware optimization (e.g. vectorization) Assembly printing, binary generation, or JIT execution Backends (Type legalization, instruction selection, register allocation, etc.) 4 What is Clang: LLVM IR Clang is a C++ frontend for LLVM... Code generation Parsing and C++ Source semantic analysis (C++14, C11, etc.) Static analysis ● For basic compilation, Clang works just like gcc – using clang instead of gcc, or clang++ instead of g++, in your makefile will likely “just work.” ● Clang has a scalable LTO, check out: https://clang.llvm.org/docs/ThinLTO.html 5 The core LLVM compiler-infrastructure components are one of the subprojects in the LLVM project. These components are also referred to as “LLVM.” 6 What About Flang? ● Started as a collaboration between DOE and NVIDIA/PGI. -
(Cont'd) Current Trends
Scripting vs Systems Programming Languages (cont’d) Designed for gluing Designed for building Does application implement complex algorithms and data applications : flexibility applications : efficiency structures? Does application process large data sets (>10,000 items)? Interpreted Compiled Are application functions well-defined, fixed? Dynamic variable creation Variable declaration If yes, consider a system programming language. Data and code integrated : Data and code separated : meta-programming cannot create/run code on Is the main task to connect components, legacy apps? supported the fly Does the application manipulate a variety of things? Does the application have a GUI? Dynamic typing (typeless) Static typing Are the application's functions evolving rapidly? Examples: PERL, Tcl, Examples: PL/1, Ada, Must the application be extensible? Python, Ruby, Scheme, Java, C/C++, C#, etc Does the application do a lot of string manipulation? Visual Basic, etc If yes, consider a scripting language. cs480 (Prasad) LSysVsScipt 1 cs480 (Prasad) LSysVsScipt 2 Current Trends Jython (for convenient access to Java APIs) Hybrid Languages : Scripting + Systems Programming I:\tkprasad\cs480>jython – Recent JVM-based Scripting Languages Jython 2.1 on java1.4.1_02 (JIT: null) Type "copyright", "credits" or "license" for more information. »Jython : Python dialect >>> import javax.swing as swing >>> win = swing.JFrame("Welcome to Jython") »Clojure : LISP dialect >>> win.size = (200, 200) »Scala : OOP +Functional Hybrid >>> win.show() -
Amazon Codeguru Profiler
Amazon CodeGuru Profiler User Guide Amazon CodeGuru Profiler User Guide Amazon CodeGuru Profiler: User Guide Copyright © Amazon Web Services, Inc. and/or its affiliates. All rights reserved. Amazon's trademarks and trade dress may not be used in connection with any product or service that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may or may not be affiliated with, connected to, or sponsored by Amazon. Amazon CodeGuru Profiler User Guide Table of Contents What is Amazon CodeGuru Profiler? ..................................................................................................... 1 What can I do with CodeGuru Profiler? ......................................................................................... 1 What languages are supported by CodeGuru Profiler? ..................................................................... 1 How do I get started with CodeGuru Profiler? ................................................................................ 1 Setting up ......................................................................................................................................... 3 Set up in the Lambda console ..................................................................................................... 3 Step 1: Sign up for AWS .................................................................................................... -
CUDA Flux: a Lightweight Instruction Profiler for CUDA Applications
CUDA Flux: A Lightweight Instruction Profiler for CUDA Applications Lorenz Braun Holger Froning¨ Institute of Computer Engineering Institute of Computer Engineering Heidelberg University Heidelberg University Heidelberg, Germany Heidelberg, Germany [email protected] [email protected] Abstract—GPUs are powerful, massively parallel processors, hardware, it lacks verbosity when it comes to analyzing the which require a vast amount of thread parallelism to keep their different types of instructions executed. This problem is aggra- thousands of execution units busy, and to tolerate latency when vated by the recently frequent introduction of new instructions, accessing its high-throughput memory system. Understanding the behavior of massively threaded GPU programs can be for instance floating point operations with reduced precision difficult, even though recent GPUs provide an abundance of or special function operations including Tensor Cores for ma- hardware performance counters, which collect statistics about chine learning. Similarly, information on the use of vectorized certain events. Profiling tools that assist the user in such analysis operations is often lost. Characterizing workloads on PTX for their GPUs, like NVIDIA’s nvprof and cupti, are state-of- level is desirable as PTX allows us to determine the exact types the-art. However, instrumentation based on reading hardware performance counters can be slow, in particular when the number of the instructions a kernel executes. On the contrary, nvprof of metrics is large. Furthermore, the results can be inaccurate as metrics are based on a lower-level representation called SASS. instructions are grouped to match the available set of hardware Still, even if there was an exact mapping, some information on counters. -
A Hardware Abstraction Layer in Java
A Hardware Abstraction Layer in Java MARTIN SCHOEBERL Vienna University of Technology, Austria STEPHAN KORSHOLM Aalborg University, Denmark TOMAS KALIBERA Purdue University, USA and ANDERS P. RAVN Aalborg University, Denmark Embedded systems use specialized hardware devices to interact with their environment, and since they have to be dependable, it is attractive to use a modern, type-safe programming language like Java to develop programs for them. Standard Java, as a platform independent language, delegates access to devices, direct memory access, and interrupt handling to some underlying operating system or kernel, but in the embedded systems domain resources are scarce and a Java virtual machine (JVM) without an underlying middleware is an attractive architecture. The contribution of this paper is a proposal for Java packages with hardware objects and interrupt handlers that interface to such a JVM. We provide implementations of the proposal directly in hardware, as extensions of standard interpreters, and finally with an operating system middleware. The latter solution is mainly seen as a migration path allowing Java programs to coexist with legacy system components. An important aspect of the proposal is that it is compatible with the Real-Time Specification for Java (RTSJ). Categories and Subject Descriptors: D.4.7 [Operating Systems]: Organization and Design—Real-time sys- tems and embedded systems; D.3.3 [Programming Languages]: Language Classifications—Object-oriented languages; D.3.3 [Programming Languages]: Language Constructs and Features—Input/output General Terms: Languages, Design, Implementation Additional Key Words and Phrases: Device driver, embedded system, Java, Java virtual machine 1. INTRODUCTION When developing software for an embedded system, for instance an instrument, it is nec- essary to control specialized hardware devices, for instance a heating element or an inter- ferometer mirror. -
Techniques for Real-System Characterization of Java Virtual Machine Energy and Power Behavior Gilberto Contreras Margaret Martonosi
Techniques for Real-System Characterization of Java Virtual Machine Energy and Power Behavior Gilberto Contreras Margaret Martonosi Department of Electrical Engineering Princeton University 1 Why Study Power in Java Systems? The Java platform has been adopted in a wide variety of devices Java servers demand performance, embedded devices require low-power Performance is important, power/energy/thermal issues are equally important How do we study and characterize these requirements in a multi-layer platform? 2 Power/Performance Design Issues Java Application Java Virtual Machine Operating System Hardware 3 Power/Performance Design Issues Java Application Garbage Class Runtime Execution Collection LoaderJava VirtualCompiler MachineEngine Operating System Hardware How do the various software layers affect power/performance characteristics of hardware? Where should time be invested when designing power and/or thermally aware Java virtual Machines? 4 Outline Approaches for Energy/Performance Characterization of Java virtual machines Methodology Breaking the JVM into sub-components Hardware-based power/performance characterization of JVM sub-components Results Jikes & Kaffe on Pentium M Kaffe on Intel XScale Conclusions 5 Power & Performance Analysis of Java Simulation Approach √ Flexible: easy to model non-existent hardware x Simulators may lack comprehensiveness and accuracy x Thermal studies require tens of seconds granularity Accurate simulators are too slow Hardware Approach √ Able to capture full-system characteristics -
Eagle: Tcl Implementation in C
Eagle: Tcl Implementation in C# Joe Mistachkin <[email protected]> 1. Abstract Eagle [1], Extensible Adaptable Generalized Logic Engine, is an implementation of the Tcl [2] scripting language for the Microsoft Common Language Runtime (CLR) [3]. It is designed to be a universal scripting solution for any CLR based language, and is written completely in C# [4]. Su- perficially, it is similar to Jacl [5], but it was written from scratch based on the design and imple- mentation of Tcl 8.4 [6]. It provides most of the functionality of the Tcl 8.4 interpreter while bor- rowing selected features from Tcl 8.5 [7] and the upcoming Tcl 8.6 [8] in addition to adding en- tirely new features. This paper explains how Eagle adds value to both Tcl/Tk and CLR-based applications and how it differs from other “dynamic languages” hosted by the CLR and its cousin, the Microsoft Dy- namic Language Runtime (DLR) [9]. It then describes how to use, integrate with, and extend Ea- gle effectively. It also covers some important implementation details and the overall design phi- losophy behind them. 2. Introduction This paper presents Eagle, which is an open-source [10] implementation of Tcl for the Microsoft CLR written entirely in C#. The goal of this project was to create a dynamic scripting language that could be used to automate any host application running on the CLR. 3. Rationale and Motivation Tcl makes it relatively easy to script applications written in C [11] and/or C++ [12] and so can also script applications written in many other languages (e.g. -
Implementing Continuation Based Language in LLVM and Clang
Implementing Continuation based language in LLVM and Clang Kaito TOKUMORI Shinji KONO University of the Ryukyus University of the Ryukyus Email: [email protected] Email: [email protected] Abstract—The programming paradigm which use data seg- 1 __code f(Allocate allocate){ 2 allocate.size = 0; ments and code segments are proposed. This paradigm uses 3 goto g(allocate); Continuation based C (CbC), which a slight modified C language. 4 } 5 Code segments are units of calculation and Data segments are 6 // data segment definition sets of typed data. We use these segments as units of computation 7 // (generated automatically) 8 union Data { and meta computation. In this paper we show the implementation 9 struct Allocate { of CbC on LLVM and Clang 3.7. 10 long size; 11 } allocate; 12 }; I. A PRACTICAL CONTINUATION BASED LANGUAGE TABLE I The proposed units of programming are named code seg- CBCEXAMPLE ments and data segments. Code segments are units of calcu- lation which have no state. Data segments are sets of typed data. Code segments are connected to data segments by a meta data segment called a context. After the execution of a code have input data segments and output data segments. Data segment and its context, the next code segment (continuation) segments have two kind of dependency with code segments. is executed. First, Code segments access the contents of data segments Continuation based C (CbC) [1], hereafter referred to as using field names. So data segments should have the named CbC, is a slight modified C which supports code segments.