Improving Interpreter Performance with Superinstructions


QuickInterp - Improving interpreter performance with superinstructions

Master thesis
Lukas Miedema
June 11, 2020

Exam committee: prof. dr. M. Huisman, dr.ir. T. van Dijk, dr.ir. A.B.J. Kokkeler
Cover design: Gerben Miedema

Abstract

The performance of Java Virtual Machine (JVM) bytecode interpreters can be severely limited by the (1) inability to perform optimizations over multiple instructions, and (2) the excessive level of branching with the interpreter loop as a result of having to do at least one jump per bytecode instruction. With the QuickInterp VM we mitigate these performance limitations within JVM interpreter design by means of superinstructions. Our interpreter is improved by supporting not just regular bytecode instructions, but also extra instructions which perform the work of a sequence of bytecode instructions (superinstructions). Bytecode is, at class load time, preprocessed where sequences of bytecode instructions are replaced by such equivalent superinstructions, requiring no alterations to or compatibility loss with existing bytecode. The interpreter source code is generated automatically based on a profile of the running application. New instruction handlers are generated by concatenating the instruction handlers of existing instructions, removing the need for manual construction of the superinstruction handlers.

Which sequences of instructions to convert into a superinstruction, and how to perform the most effective substitution of superinstructions into a loaded program, are questions which we answer in this thesis. Earlier work has shown that finding the optimal superinstruction set is NP-hard [15]. As such, we implement an iterative optimization algorithm to find the optimal superinstruction set. Furthermore, we design and test various runtime substitution algorithms. Our new shortest-path based runtime substitution algorithm uses shortest-path based pathfinding through the input program to find the combination of superinstructions that lowers the amount of required instruction dispatches as much as possible. Furthermore, we enhance the shortest-path algorithm by doing substitution based on equivalence, extending the impact each individual superinstruction can make at runtime. We compare our new runtime substitution algorithms against a reimplementation of a substitution algorithm from earlier work [4].

Our methods show that the superinstruction optimization is still valid in 2020, boasting a 45.6% performance improvement against baseline in a small arithmetic benchmark, and showing a 33.0% improvement against baseline in a larger Spring Boot-based web application benchmark. Our iterative superinstruction set construction algorithm manages to find near-optimal solutions for the NP-hard problem of constructing the superinstruction set. However, we also show how more advanced superinstruction placement algorithms do not offer the same return on investment. Given enough superinstructions, each of the tested substitution algorithms is capable of achieving similar performance improvements.

The code and benchmarks are available at https://github.com/LukasMiedema/QuickInterp.

Table of contents

Abstract
Table of contents
List of figures
List of listings
1 Introduction
  1.1 What is a VM
  1.2 What is an interpreter
    1.2.1 Anatomy of an interpreter
  1.3 What is a JIT Compiler
  1.4 Superinstructions
  1.5 Research goals
    1.5.1 Motivation
    1.5.2 Research question
    1.5.3 Goals and method
  1.6 Research contributions
  1.7 Overview
2 Background
  2.1 The JVM
    2.1.1 Introduction
    2.1.2 Bytecode structure
    2.1.3 Stack-based architecture
    2.1.4 Verifier, typesafety and abstract interpreters
    2.1.5 Dynamic class loading
  2.2 Modern interpreter dispatching
    2.2.1 Superscalar execution and branch prediction
    2.2.2 The interpreter
    2.2.3 Token-threaded interpreters
    2.2.4 Threaded-code interpreter
  2.3 Conclusion
3 Related work
  3.1 Introduction
  3.2 Superinstructions
    3.2.1 Workflow
    3.2.2 Superoperators
    3.2.3 vmgen
    3.2.4 Tiger
    3.2.5 Conclusion
  3.3 Other interpreter optimizations / research
    3.3.1 Static replication
    3.3.2 Instruction specialization
  3.4 Conclusion
4 Design of QuickInterp
  4.1 Introduction
    4.1.1 Design goals
    4.1.2 Overview
  4.2 Architecture and workflow overview
    4.2.1 From profile to runtime
  4.3 QuickInterp application profile
    4.3.1 Introduction
    4.3.2 Data in the profile
  4.4 QuickInterp compile time
    4.4.1 Introduction
    4.4.2 Instruction selection
    4.4.3 Static evaluation
    4.4.4 Superinstruction set construction
    4.4.5 Superinstruction generation evaluation loop
    4.4.6 Handling superinstruction operands
  4.5 QuickInterp runtime
    4.5.1 Introduction
    4.5.2 Basic runtime superinstruction placement
    4.5.3 Instruction placement using shortest path
  4.6 Equivalent superinstructions
    4.6.1 Superinstruction equivalence
    4.6.2 Discovering data dependencies
    4.6.3 Data Dependency Graph Construction
    4.6.4 Using superinstruction equivalency
  4.7 Conclusion
5 Implementing QuickInterp
  5.1 Introduction
  5.2 Implementation goals and non-goals
  5.3 QuickInterp on OpenJDK Zero
    5.3.1 Why OpenJDK Zero
    5.3.2 OpenJDK Zero class-loading pipeline
    5.3.3 OpenJDK Zero Interpreter
    5.3.4 Code stretching
    5.3.5 Superinstruction placement in Java
    5.3.6 Conclusion
  5.4 Profiling in practice
    5.4.1 Specialized Instructions
    5.4.2 The profile on disk
    5.4.3 Conclusion
  5.5 Constructing the superinstruction set
    5.5.1 Interpreter Generator tool implementation
    5.5.2 Loading the profile
    5.5.3 Optimizing the instruction set
    5.5.4 Conclusion
  5.6 Superinstruction placement
    5.6.1 Overview
    5.6.2 Algorithm implementations
    5.6.3 Conclusion
  5.7 Generating the QuickInterp interpreter
    5.7.1 Instruction primitives
    5.7.2 Instruction code as macros
    5.7.3 Code generation
    5.7.4 Superinstruction class cache
    5.7.5 Loss of the garbage collection and class verifier
    5.7.6 Wrapping up
  5.8.
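
The shortest-path substitution idea described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the function name `place_superinstructions`, the `+`-joined naming of superinstructions, and the example instruction set are all hypothetical. The sketch also assumes a straight-line instruction sequence with no jump targets and a uniform cost of one dispatch per (super)instruction. Under those assumptions, each program position is a node, each instruction or matching superinstruction is an edge to a later position, and the cheapest path from start to end gives the fewest dispatches.

```python
from typing import List, Sequence, Tuple


def place_superinstructions(
    program: Sequence[str],
    superinstructions: Sequence[Tuple[str, ...]],
) -> List[str]:
    """Cover `program` with as few dispatches as possible.

    Hypothetical sketch: positions 0..n are graph nodes, every edge costs
    one dispatch, and the graph is acyclic, so a backward dynamic program
    computes the shortest path from position 0 to position n.
    """
    n = len(program)
    best = [0] * (n + 1)    # best[i]: fewest dispatches needed for program[i:]
    choice = [1] * (n + 1)  # choice[i]: instructions consumed at position i

    for i in range(n - 1, -1, -1):
        # Fallback edge: execute the single original instruction.
        best[i] = 1 + best[i + 1]
        choice[i] = 1
        # Superinstruction edges: any candidate matching at position i.
        for seq in superinstructions:
            k = len(seq)
            if k > 1 and tuple(program[i:i + k]) == tuple(seq):
                if 1 + best[i + k] < best[i]:
                    best[i] = 1 + best[i + k]
                    choice[i] = k

    # Walk the chosen edges forward to emit the rewritten program.
    rewritten: List[str] = []
    i = 0
    while i < n:
        k = choice[i]
        rewritten.append("+".join(program[i:i + k]))
        i += k
    return rewritten


if __name__ == "__main__":
    code = ["iload_1", "iload_2", "iadd", "istore_3"]
    supers = [("iload_1", "iload_2"), ("iload_2", "iadd", "istore_3")]
    print(place_superinstructions(code, supers))
```

In this example a placement that greedily takes the first match (`iload_1+iload_2`) would leave `iadd` and `istore_3` as separate dispatches, three in total, whereas the path-based choice keeps `iload_1` on its own so the longer superinstruction can cover the rest, two in total. A real placement pass would additionally have to respect branch targets and operand handling, which this sketch leaves out.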