YETI: A Gradually Extensible Trace Interpreter. By Mathew Zaleski, a thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy, University of Toronto, 2008.

Total Pages: 16

File Type: PDF, Size: 1020 KB

YETI: A GRADUALLY EXTENSIBLE TRACE INTERPRETER

by Mathew Zaleski

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy, Graduate Department of Computer Science, University of Toronto. Copyright © 2008 by Mathew Zaleski.

Library and Archives Canada, Published Heritage Branch, 395 Wellington Street, Ottawa ON K1A 0N4, Canada. ISBN: 978-0-494-57946-6.

NOTICE: The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats. The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission. In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis. While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.

Abstract

YETI: a graduallY Extensible Trace Interpreter
Mathew Zaleski
Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto
2008

The implementation of new programming languages benefits from interpretation because it is simple, flexible and portable. The only downside is speed of execution, as there remains a large performance gap between even efficient interpreters and systems that include a just-in-time (JIT) compiler. Augmenting an interpreter with a JIT, however, is not a small task. Today, Java JITs are typically method-based. To compile whole methods, the JIT must re-implement much functionality already provided by the interpreter, leading to a "big bang" development effort before the JIT can be deployed. Adding a JIT to an interpreter would be easier if we could more gradually shift from dispatching virtual instruction bodies implemented for the interpreter to running instructions compiled into native code by the JIT. We show that virtual instructions implemented as lightweight callable routines can form the basis for a very efficient interpreter. Our new technique, interpreted traces, identifies hot paths, or traces, as a virtual program is interpreted. By exploiting the way traces predict branch destinations, our technique markedly reduces the branch mispredictions caused by dispatch. Interpreted traces are a high-performance technique, running about 25% faster than direct threading. We show that interpreted traces are a good starting point for a trace-based JIT. We extend our interpreter so that traces may contain a mixture of compiled code for some virtual instructions and calls to virtual instruction bodies for others. By compiling about 50 integer and object virtual instructions to machine code we improve performance by about 30% over interpreted traces, running about twice as fast as the direct-threaded system with which we started.
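The abstract's central idea, virtual instruction bodies packaged as lightweight callable routines, can be illustrated with a short sketch. The C program below is hypothetical and is not code from the thesis: it dispatches a toy stack-machine program by calling each body through a function-pointer table (direct call threading), which is the closest portable approximation of subroutine threading, where a native call instruction per virtual instruction would be generated at load time.

    /* Minimal, hypothetical sketch (not from the thesis): virtual
     * instruction bodies written as ordinary C functions, dispatched by
     * calling through a table of function pointers.  True subroutine
     * threading additionally emits a native "call body" instruction per
     * virtual instruction at load time; this portable approximation
     * (direct call threading) only illustrates the callable-body idea. */
    #include <stdio.h>

    enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

    static int stack[64], *sp = stack;   /* toy operand stack       */
    static const int *vpc;               /* virtual program counter */
    static int running = 1;

    /* Each body is a lightweight callable routine. */
    static void op_push(void)  { *sp++ = *vpc++; }           /* operand follows opcode */
    static void op_add(void)   { sp[-2] += sp[-1]; --sp; }
    static void op_print(void) { printf("%d\n", sp[-1]); }
    static void op_halt(void)  { running = 0; }

    static void (*const bodies[])(void) = { op_push, op_add, op_print, op_halt };

    int main(void) {
        /* Virtual program: push 2; push 3; add; print; halt. */
        static const int program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
        vpc = program;
        while (running) {
            int opcode = *vpc++;
            bodies[opcode]();            /* dispatch = one indirect call */
        }
        return 0;
    }

Interpreted traces, as described in the abstract, build on bodies of this callable form: once a hot path is identified, its dispatch is arranged so that the hardware's branch and return-address predictors see the destinations the trace itself predicts, which is where the reduction in mispredictions comes from.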
Acknowledgements

My supervisor, Angela Demke Brown, uncomplainingly drew the short straw when she was handed me, a middle-aged electrical engineer turned businessman, as her first doctoral student. Overcoming many obstacles she has taught me the subtle craft of systems research. I thank my advisory committee of Michael Stumm, David Wortman and Tarek Abdelrahman for their input and guidance on our research. Special thanks are also due to Kevin Stoodley of IBM, without whose insight this research would not have been possible.

My family must at some level be to blame for my decision to interrupt (even jeopardize?) a reasonably successful and entirely enjoyable career to spend the last several years retraining as a researcher. Leonard, my father, would have been pleased had I written this dissertation twenty years ago. If I had, he would have been alive to read it. My mother, Irma, has written several books on religion while I completed my studies. Her love of knowledge, and the process of writing it down, has been impetus for my less ethereal work. My father-in-law, Harry Eastman, advised me throughout the process of returning to school. I doubt I would have had the nerve to carry it through without his support. I wish he were still with us to pull his academic hood out of mothballs and see me over the threshold one more time. My wife, Harriet Eastman, has been a paragon of support and patience while I enjoyed the thrills and chills of the PhD program. Without her love and encouragement I would have given up long ago. Our children, Sam and Jacob, while occasionally benefiting from my flexible hours, were more often called upon to be good sports when the flexibility simply meant I worked all the time.

Contents

1 Introduction
  1.1 Challenges of Method-based JIT Compilation
  1.2 Challenges of Efficient Interpretation
  1.3 What We Need
  1.4 Overview of Our Solution
  1.5 Thesis Statement
  1.6 Contributions
  1.7 Outline of Thesis
2 Background
  2.1 High Level Language Virtual Machine
    2.1.1 Overview of a Virtual Program
    2.1.2 Interpretation
    2.1.3 Early Just in Time Compilers
  2.2 Challenges to HLL VM Performance
    2.2.1 Polymorphism and the Implications of Object-oriented Programming
    2.2.2 Late binding
  2.3 Early Dynamic Optimization
    2.3.1 Manual Dynamic Optimization
    2.3.2 Application specific dynamic compilation
    2.3.3 Dynamic Compilation of Manually Identified Static Regions
  2.4 Dynamic Object-oriented optimization
    2.4.1 Finding the destination of a polymorphic callsite
    2.4.2 Smalltalk and Self
    2.4.3 Java JIT as Dynamic Optimizer
    2.4.4 JIT Compiling Partial Methods
  2.5 Traces
  2.6 Hotpath
  2.7 Branch Prediction and General Purpose Computers
    2.7.1 Dynamic Hardware Branch Prediction
  2.8 Chapter Summary
3 Dispatch Techniques
  3.1 Switch Dispatch
  3.2 Direct Call Threading
  3.3 Direct Threading
  3.4 The Context Problem
  3.5 Subroutine Threading
  3.6 Optimizing Dispatch
    3.6.1 Superinstructions
    3.6.2 Selective Inlining
    3.6.3 Replication
  3.7 Chapter Summary
4 Design and Implementation of Efficient Interpretation
  4.1 Understanding Branches
  4.2 Handling Linear Dispatch
  4.3 Handling Virtual Branches
  4.4 Handling Virtual Call and Return
  4.5 Chapter Summary
5 Evaluation of Context Threading
  5.1 Experimental Set-up
    5.1.1 Virtual Machines and Benchmarks
    5.1.2 Performance and Pipeline Hazard Measurements
  5.2 Interpreting the data
    5.2.1 Effect on Pipeline Branch Hazards
    5.2.2 Performance
  5.3 Inlining
  5.4 Limitations of Context Threading
    5.4.1 Heavyweight Virtual Instruction Bodies
    5.4.2 Context Threading and Profiling
    5.4.3 Development using SableVM
  5.5 Chapter Summary
6 Design and Implementation of YETI
  6.1 Structure and Overview of Yeti
  6.2 Region Selection
    6.2.1 Initiating Region Discovery
    6.2.2 Linear Block Detection
    6.2.3 Trace Selection
  6.3 Trace Exit Runtime
    6.3.1 Trace Linking
  6.4 Generating code for traces
    6.4.1 Interpreted Traces
    6.4.2 JIT Compiled Traces
    6.4.3 Trace Optimization
  6.5 Other implementation details
  6.6 Chapter Summary
7 Evaluation of Yeti
  7.1 Experimental Set-up
  7.2 Effect of region shape on dispatch
  7.3 Effect of region shape on performance
  7.4 Early Pentium Results
  7.5 Identification of Stall Cycles
    7.5.1 Identifying Causes of Stall Cycles
    7.5.2 Stall Cycle results
    7.5.3 Trends
  7.6 Chapter Summary
8 Conclusions and Future Work
  8.1 Conclusions and Lessons Learned
  8.2 Future work
    8.2.1 Virtual instruction bodies as nested functions
    8.2.2 Extension to Runtime Typed Languages
    8.2.3 New shapes of region body
    8.2.4 Vision for new language implementation
  8.3 Summary
Recommended publications
  • Java in Embedded Linux Systems
    Java in Embedded Linux Systems. Thomas Petazzoni / Michael Opdenacker, Free Electrons, http://free-electrons.com/. Created with OpenOffice.org 2.x. Sep 15, 2009.
    Rights to copy: Attribution-ShareAlike 2.5. © Copyright 2004-2008, Free Electrons ([email protected]). You are free to copy, distribute, display, and perform the work, to make derivative works, and to make commercial use of the work, under the following conditions. Attribution: you must give the original author credit. Share Alike: if you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one. For any reuse or distribution, you must make clear to others the license terms of this work. Any of these conditions can be waived if you get permission from the copyright holder. Your fair use and other rights are in no way affected by the above. License text: http://creativecommons.org/licenses/by-sa/2.5/legalcode. Document sources, updates and translations: http://free-electrons.com/articles/java. Corrections, suggestions, contributions and translations are welcome!
    Best viewed with... This document is best viewed with a recent PDF reader or with OpenOffice.org itself! Take advantage of internal
  • Effective Inline-Threaded Interpretation of Java Bytecode
    Effective Inline-Threaded Interpretation of Java Bytecode Using Preparation Sequences. Etienne Gagnon (Sable Research Group, Université du Québec à Montréal, [email protected]) and Laurie Hendren (McGill University, Montreal, Canada, [email protected]).
    Abstract. Inline-threaded interpretation is a recent technique that improves performance by eliminating dispatch overhead within basic blocks for interpreters written in C [11]. The dynamic class loading, lazy class initialization, and multi-threading features of Java reduce the effectiveness of a straight-forward implementation of this technique within Java interpreters. In this paper, we introduce preparation sequences, a new technique that solves the particular challenge of effectively inline-threading Java. We have implemented our technique in the SableVM Java virtual machine, and our experimental results show that, using our technique, inline-threaded interpretation of Java, on a set of benchmarks, achieves a speedup ranging from 1.20 to 2.41 over switch-based interpretation, and a speedup ranging from 1.15 to 2.14 over direct-threaded interpretation.
    1 Introduction. One of the main advantages of interpreters written in high-level languages is their simplicity and portability, when compared to static and dynamic compiler-based systems. One of their main drawbacks is poor performance, due to a high cost for dispatching interpreted instructions. In [11], Piumarta and Riccardi introduced a technique called inlined-threading which reduces this overhead by dynamically inlining instruction sequences within basic blocks, leaving a single instruction dispatch at the end of each sequence. To our knowledge, inlined-threading has not been applied to Java interpreters before.
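    The dispatch overhead that inline-threading attacks is easiest to see in a direct-threaded interpreter, where every virtual instruction body ends in its own indirect jump. The C sketch below is hypothetical (it is not code from this paper or from SableVM) and relies on the GCC/Clang labels-as-values extension; inline-threading goes further by copying the bodies of a basic block into one buffer so that only the last instruction of the block pays for a dispatch.

        /* Hypothetical sketch of direct-threaded dispatch (GNU C
         * computed goto): each body ends in its own indirect branch,
         * and that per-instruction dispatch is the overhead that
         * inline-threading removes within basic blocks. */
        #include <stdint.h>
        #include <stdio.h>

        int main(void) {
            enum { PUSH, ADD, PRINT, HALT };
            /* One entry per virtual instruction: the address of its body. */
            void *labels[] = { &&do_push, &&do_add, &&do_print, &&do_halt };

            /* "Loaded" program: opcodes already replaced by body addresses;
               immediate operands are stored inline in the instruction stream. */
            void *program[] = { labels[PUSH], (void *)(intptr_t)2,
                                labels[PUSH], (void *)(intptr_t)3,
                                labels[ADD], labels[PRINT], labels[HALT] };
            void **vpc = program;
            intptr_t stack[16], *sp = stack;

            goto **vpc++;                        /* first dispatch */

        do_push:  *sp++ = (intptr_t)*vpc++;  goto **vpc++;  /* body, then its own dispatch */
        do_add:   sp[-2] += sp[-1]; --sp;    goto **vpc++;
        do_print: printf("%ld\n", (long)sp[-1]); goto **vpc++;
        do_halt:  return 0;
        }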
  • Android Lecture 1: Introduction to Android / Android Studio
    Android, Lecture 1: Introduction to Android / Android Studio. Damien MASSON, [email protected], http://www.esiee.fr/~massond. 21 February 2017.
    References: https://developer.android.com (essential!) and https://openclassrooms.com/courses/creez-des-applications-pour-android/, a fairly complete and reasonably up-to-date tutorial in French.
    What is Android? A small American company, Android Incorporated, founded in 2003 and bought by Google in 2005; the OS launched in 2007. In 2015, Android was the most widely used mobile operating system in the world (>80%).
    What is Android? Five distinct layers: (1) the Linux kernel with its drivers; (2) software libraries such as WebKit/Blink, OpenGL ES, SQLite or FreeType; (3) a runtime environment and libraries for running programs written for the Java platform; (4) a framework, i.e. an application development kit.
    Android and the Java platform. Up to version 4.4, Android includes a virtual machine named Dalvik. Dalvik bytecode differs from that of Oracle's Java virtual machine (JVM), so the build process for an application is different: Java code (.java) -> Java bytecode (.class/.jar) -> Dalvik bytecode (.dex) -> interpreted. Android's standard library as a whole resembles J2SE (Java Standard Edition) of the Java platform; the main difference is that the AWT and Swing GUI libraries are replaced by Android's own libraries.
    Android Runtime (ART). Starting with version 5.0 (2014), the ART (Android RunTime) execution environment replaces the Dalvik virtual machine.
  • Compiler Analyses for Improved Return Value Prediction
    Compiler Analyses for Improved Return Value Prediction. Christopher J.F. Pickett and Clark Verbrugge ({cpicke,clump}@sable.mcgill.ca), School of Computer Science, McGill University, Montréal, Québec, Canada H3A 2A7.
    Overview: Introduction and Related Work; Contributions; Framework; Parameter Dependence Analysis; Return Value Use Analysis; Conclusions; Future Work.
    Introduction and Related Work. Speculative method-level parallelism (SMLP) allows for dynamic parallelisation of single-threaded programs: speculative threads are forked at callsites; suitable for Java virtual machines. Perfect return value prediction can double the performance of SMLP (Hu et al., 2003). We implemented accurate return value prediction in SableVM, our group's JVM (Pickett et al., 2004). Current goals: reduce memory requirements; achieve higher accuracy.
    Speculative Method-Level Parallelism:

        // execute foo non-speculatively
        r = foo(a, b, c);
        // execute past return point
        // speculatively in parallel with foo()
        if (r > 10) { ... } else { ... }

    Impact of Return Value Prediction:

        RVP strategy    return value    SMLP speedup
        none            arbitrary       1.52
        best            predicted       1.92
        perfect         correct         2.76

    26% speedup over no RVP with Hu's best predictor; 82% speedup over no RVP with perfect prediction. Improved hybrid accuracy is highly desirable. (S. Hu, R. Bhargava, and L. K. John. The role of return value prediction in exploiting speculative method-level parallelism. Journal of Instruction-Level Parallelism, 5:1-21, Nov. 2003.)
    Return Value Prediction in SableVM. Implemented all of Hu et al.'s predictors in SableVM; introduced a new memoization predictor into the hybrid. (C.J.F. Pickett and C. Verbrugge. Return value prediction in a Java virtual machine.)
  • GNU Classpath
    GNU Classpath: Core Classes for a Diversity of Java Virtual Machines. February 2004. Sascha Brawer ([email protected], http://www.dandelis.ch/people/brawer/) and Mark Wielaard ([email protected], http://www.klomp.org/mark/).
    Overview: motivation; anatomy of a Java-like system; documentation, quality assurance, releases; demo; details (current state, VMs using GNU Classpath); how you can help.
    Motivation. Currently, most free software is written in C. The Good: close to hardware; fast execution, even with bad compilers; large code base, many libraries; ubiquitous. The Bad: easy to make bugs; lacks "modern" language concepts ("modern" = 1980'ies); difficult to learn; hard to write portable code. The Ugly: libraries not well integrated.
    Java is a good foundation for many projects. Type-safety avoids many typical bugs, crashes and security problems, and offers more opportunities for optimizing compilers. It is modular, object-oriented, concurrent and dynamic, making it easier to write structured, portable code. It has a very rich library, reasonably well designed (mostly), and a large developer base ("COBOL of the 1990'ies"). There exist lots of other good (even better?) languages: Oberon, Eiffel, O'Caml, Haskell, Erlang, TOM, Cedar, ... But: they do not have many free applications/libraries.
    But: Java does not solve every problem. It is over-hyped as a solution to everything, and it has a bad reputation for wasting resources (CPU, RAM): it is easy to write inefficient programs, early implementations were very slow, and type-safety and memory management have their cost, though this is often over-estimated: array bounds checking (no buffer overflows) costs ~2% execution time and avoids ~50% of security-related bugs. Its syntax is similar to C++, so it is not very easy to learn.
    Java is a compromise: not necessarily ideal, but reasonable. Compare "Why GNU Will Be Compatible with UNIX": "Unix is not my ideal system, but it is not too bad."
  • Guideline Document and Peripheral Input/Output Libraries for Developments
    Keeping Emulation Environments Portable, FP7-ICT-231954. Guideline document and peripheral input/output libraries for developments. Deliverable number: D 5.1. Nature: Other. Dissemination level: PU. Delivery date: due M8 (September 2009), actual M8 (September 2009). Status: Final. Workpackage number: WP5. Lead beneficiary: UPHEC. Authors: Antonio Ciuffreda (UPHEC), David Anderson (UPHEC), Janet Delve (UPHEC), Getaneh Agegn Alemu (UPHEC), Dan Pinchbeck (UPHEC), Bram Lohman (TSSP), David Michel (TSSP), Bart Kiers (KB), Vincent Joguin (JOG).
    Document history. Revisions: 0.1, 3 September 2009, Bram Lohman: Sections 3.2 and 3.3. 0.2, 5 September 2009, Vincent Joguin: Section 3.3. 0.3, 11 September 2009, David Michel: Sections 1.2, 2.1, 2.2, 2.3, 3.1.1, 3.1.2, 3.1.4, 3.2, 3.3, 4.2.1 and 4.2.7. 0.4, 15 September 2009, Vincent Joguin: Executive Summary, Abbreviations, Sections 3.1.1 and 3.1.4. 0.5, 15 September 2009, David Anderson: Executive Summary, Abbreviations, Sections 1, 2, 3 and 4. Reviews: 23/09/2009, Jean Marc Saffroy, approved with changes; 29/09/2009, Jeffrey van der Hoeven, approved with changes. Signature/approval: approved by (signature, date); accepted at the European Commission by (signature, date).
    Executive Summary. This report presents the initial integration strategy of the KEEP Emulation Access Platform (EAP) components, with a focus on the methodologies used to integrate the Core Emulation Framework (CEF) with the Olonys Virtual Machine (VM), the Front-end Emulation Framework (FEF) and external web services.
  • Speculative Parallelization of Object-Oriented Applications
    UNIVERSIDADE DE LISBOA, INSTITUTO SUPERIOR TÉCNICO. Speculative Parallelization of Object-Oriented Applications. Ivo Filipe Silva Daniel Anjo. Supervisor: Doctor João Manuel Pinheiro Cachopo. Thesis approved in public session to obtain the PhD Degree in Information Systems and Computer Engineering. Jury final classification: Pass with Merit. Jury chairperson: Chairman of the IST Scientific Board. Members of the committee: Doctor Thomas Richard Gross, Full Professor, ETH Zürich, Switzerland; Doctor Bruno Miguel Brás Cabral, Professor Auxiliar da Faculdade de Ciências e Tecnologia, Universidade de Coimbra; Doctor Luís Manuel Antunes Veiga, Professor Auxiliar do Instituto Superior Técnico, Universidade de Lisboa; Doctor João Manuel Pinheiro Cachopo, Professor Auxiliar do Instituto Superior Técnico, Universidade de Lisboa. Funding institution: Fundação para a Ciência e a Tecnologia. 2015.
    Resumo. Despite the ubiquity of multicore processors, and even though new applications are being developed to take advantage of their parallel processing capabilities, many of the most common applications are still sequential. As a consequence, many processor designers, while actively advertising and recommending the use of parallel programming, still devote much of their resources to obtaining better processing speeds for sequential code.
  • Return Value Prediction in a Java Virtual Machine
    Return Value Prediction in a Java Virtual Machine. Christopher J.F. Pickett and Clark Verbrugge ({cpicke,clump}@sable.mcgill.ca), School of Computer Science, McGill University, Montréal, Québec, Canada H3A 2A7.
    Abstract. We present the design and implementation of return value prediction in SableVM, a Java Virtual Machine. We give detailed results for the full SPEC JVM98 benchmark suite, and compare our results with previous, more limited data. At the performance limit of existing last value, stride, 2-delta stride, parameter stride, and context (FCM) sub-predictors in a hybrid, we achieve an average accuracy of 72%. We describe and characterize a new table-based memoization predictor that complements these predictors nicely, yielding an increased average hybrid accuracy of 81%. VM level information about data widths provides a 35% reduction in space, and dynamic allocation and expansion of per-callsite hashtables allows for highly accurate prediction with an average per-benchmark requirement of 119 MB for the context predictor and 43 MB for the memoization predictor. As far as we know, this is the first implementation of non-trace-based return value prediction within a JVM.
    Figure 1: (a) Sequential execution of Java bytecode. The target method of an INVOKE<X> instruction executes before the instructions following the return point. (b) Speculative execution of Java bytecode under SMLP. Upon reaching a method callsite, the non-speculative parent thread T1 forks a speculative child thread T2. If the method is non-void, a predicted return value is pushed on T2's Java stack. T2 then continues past the return point in parallel with the execution of the method body, buffering main memory accesses.
    1 Introduction
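    As a rough illustration of the kind of sub-predictors such a hybrid combines, the C sketch below implements hypothetical last-value and stride predictors for a single callsite. It is not code from SableVM or from this paper; a real implementation would keep a table of entries indexed by callsite and combine many more predictors, including the memoization predictor described above.

        /* Hypothetical sketch of two classic return value sub-predictors:
         * "last value" predicts the previous return value again, "stride"
         * predicts last value plus the last observed delta. */
        #include <stdio.h>

        typedef struct {
            long last;      /* last observed return value      */
            long stride;    /* last observed delta             */
            int  seen;      /* number of values observed so far */
        } rvp_entry;

        static long rvp_predict_last(const rvp_entry *e)   { return e->last; }
        static long rvp_predict_stride(const rvp_entry *e) { return e->last + e->stride; }

        /* Called with the actual return value once the real call completes. */
        static void rvp_update(rvp_entry *e, long actual) {
            if (e->seen > 0)
                e->stride = actual - e->last;
            e->last = actual;
            e->seen++;
        }

        int main(void) {
            rvp_entry e = {0, 0, 0};
            long returns[] = {10, 20, 30, 40};   /* e.g. a counter-like callee */
            for (int i = 0; i < 4; i++) {
                printf("predict last=%ld stride=%ld actual=%ld\n",
                       rvp_predict_last(&e), rvp_predict_stride(&e), returns[i]);
                rvp_update(&e, returns[i]);
            }
            return 0;
        }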
  • Port of the Java Virtual Machine Kaffe to DROPS by Using L4env
    Port of the Java Virtual Machine Kaffe to DROPS by using L4Env. Alexander Böttcher ([email protected]). Technische Universität Dresden, Faculty of Computer Science, Operating System Group. July 2004.
    Contents:
    1 Motivation
    2 Fundamentals
      2.1 Dresden Real-Time Operating System
        2.1.1 Microkernel
        2.1.2 L4Env - an environment on top of the L4 kernel
      2.2 Java and Java Virtual Machines
        2.2.1 Architecture
        2.2.2 Available Java Virtual Machines
        2.2.3 Kaffe
      2.3 Decision for Kaffe
    3 Design
      3.1 Concepts
      3.2 Threads
        3.2.1 Thread mappings
        3.2.2 Adaptation of the L4Env thread library to Kaffe
        3.2.3 Garbage Collector
        3.2.4 Priorities and Scheduling
      3.3 Critical sections
      3.4 Java exception handling
        3.4.1 Null object and null pointer handling
        3.4.2 Arithmetic exception handling
      3.5 Memory and files
      3.6 Java native interface (JNI) and shared native libraries
    4 Implementation
      4.1 Kaffes configuration and build system
      4.2 Thread data structure management
      4.3 Delaying of newly created threads
      4.4 Suspending and resuming threads
        4.4.1 Stopping all threads
        4.4.2 Resuming all threads
      4.5 Semaphore problems with timing constraints
        4.5.1 Deadlocks caused by semaphore down timed
        4.5.2 Incorrect admissions to enter a semaphore
  • Relative Factors in Performance Analysis of Java Virtual Machines
    Relative Factors in Performance Analysis of Java Virtual Machines. Dayong Gu and Clark Verbrugge (School of Computer Science, McGill University, Montréal, Québec, Canada, {dgu1, clump}@cs.mcgill.ca) and Etienne M. Gagnon (Département d'informatique, Université du Québec à Montréal, Montréal, Québec, Canada, [email protected]).
    Abstract. Many new Java runtime optimizations report relatively small, single-digit performance improvements. On modern virtual and actual hardware, however, the performance impact of an optimization can be influenced by a variety of factors in the underlying systems. Using a case study of a new garbage collection optimization in two different Java virtual machines, we show the relative effects of issues that must be taken into consideration when claiming an improvement. We examine the specific and overall performance changes due to our optimization and show how unintended side-effects can contribute to, and distort the final assessment. Our experience shows that VM and hardware concerns can generate variances of up to 9.5% in whole program execution time.
    perturbing performance measurements. This resulting lack of guidance often means that optimization researchers do not always give a sufficiently wide or deep consideration to potential confounding factors from the underlying systems. In this paper we focus on two main concerns. First, an exposition of various external, and benchmark specific factors that can influence a VM or source-level optimization, and second a quantitative evaluation and comparison of the different influences. Using an in-depth case study of a simple garbage collection optimization we are able to show that a combination of instruction cache changes due to trivial code modifications, and subtle, consequent data layout and usage differences can induce almost a 10% whole program
  • Bootstrappable Builds
    Bootstrappable builds. Ricardo Wurmus, September 18, 2017.
    Recipe for yoghurt: add yoghurt to milk. "Yoghurt software" undermines software freedom. Options: ignore it and use existing binaries to build new binaries; retrace history and save old tools from bit rot; write new alternative language implementations.
    Java: the JDK is written in Java; GCJ needs ECJ (Java) and bundles a pre-compiled GNU Classpath :-(
    JDK bootstrap: 1. Jikes (C++) -> SableVM (Java + C). 2. old Ant (Java) -> old ECJ (Java). 3. ECJ -> GNU Classpath 0.99 -> JamVM. 4. JamVM -> GNU Classpath devel (for Java ~1.6) -> JamVM'. 5. OpenJDK via IcedTea 1.x -> 2.x -> 3.x.
    Java libraries: a cultural problem: download/bundle pre-built packages; only leaf nodes are built from source; dependency cycles are common; cannot use modern build tools; needs lots of patience.
    Haskell: all versions of GHC are written in Haskell; you need GHC (n-1) to build GHC n; history is littered with defunct Haskell systems: nhc98, Yale Haskell, Hugs. Failed: use Hugs to build nhc98 to build GHC (see https://elephly.net/posts/2017-01-09-bootstrapping-haskell-part-1.html). Next: use the Hugs interpreter to build a patched GHC directly. Future: revive and extend Yale Haskell? The elephant in the room.
    GCC: needs a C++ compiler since version 4.7. Is there a path from reasonably auditable source to GCC?
    Stage0 and Mes. Stage0 (https://git.savannah.nongnu.org/git/stage0.git): hex.0, a self-hosting hex assembler that we consider to be source (< 300 bytes); M0, a macro assembler written in .0; M1, a macro assembler written in
  • Resource Accounting and Reservation in Java Virtual Machine
    Resource accounting and reservation in Java Virtual Machine. Inti Gonzalez-Herrera, Johann Bourcier, Olivier Barais. January 24, 2013.
    1 The Java platform. The Java programming language was originally designed for developing small applications for embedded devices, but that was a long time ago. Today, Java applications run on many platforms ranging from smartphones to enterprise servers. Modern pervasive middleware is typically implemented using Java because of its safety, flexibility, and mature development environment. However, the Java virtual machine specification has not had a major revision since 1999 [50]. It was designed to execute only a single application per instance, and thus it does not provide resource accounting or per-application resource reservations in the sense in which most middlewares consider an application. Often, current pervasive middlewares are thus unable to reserve resources for critical applications, which may cause these applications to crash or hang when there are not enough resources. Moreover, contemporary middlewares are unable to provide resource accounting, which makes it impossible to apply adaptation policies to optimize resource use. Several research efforts have addressed these important issues. As a result of these efforts, some Java Specification Requests (JSRs) have emerged. We consider that there are seven JSRs related to monitoring and to resource accounting and reservation: three for the Java Management eXtension API (JSRs 3, 160, 255), two for Metric Instrumentation (JSRs 138, 174) and two for resource-consumption management (JSRs 121, 284). The Java Management eXtension API only addresses monitoring and management: it does not define specific resource accounting or reservation strategies. JSRs 138 and 174 define monitors for the Java Virtual Machine.