2.5 Python Virtual Machines

Shannon, Mark (2011) The construction of high-performance virtual machines for dynamic languages. PhD thesis. http://theses.gla.ac.uk/2975/ Copyright and moral rights for this thesis are retained by the author A copy can be downloaded for personal non-commercial research or study, without prior permission or charge This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the Author The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the Author When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given Glasgow Theses Service http://theses.gla.ac.uk/ [email protected] The Construction of High-Performance Virtual Machines for Dynamic Languages Mark Shannon, MSc. Submitted for the Degree of Doctor of Philosophy School of Computing Science College of Science and Engineering University of Glasgow November 2011 2 Abstract Dynamic languages, such as Python and Ruby, have become more widely used over the past decade. Despite this, the standard virtual machines for these languages have disappointing performance. These virtual machines are slow, not because methods for achieving better performance are unknown, but because their implementation is hard. What makes the implementation of high-performance virtual machines difficult is not that they are large pieces of software, but that there are fundamental and complex interdependencies between their components. In order to work together correctly, the interpreter, just-in-time compiler, garbage collector and library must all conform to the same precise low-level protocols. In this dissertation I describe a method for constructing virtual machines for dynamic languages, and explain how to design a virtual machine toolkit by building it around an abstract machine. The design and implementation of such a toolkit, the Glasgow Virtual Machine Toolkit, is described. The Glasgow Virtual Machine Toolkit automatically generates a just-in-time compiler, integrates precise garbage collection into the virtual machine, and automatically manages the complex interdependencies between all the virtual machine components. Two different virtual machines have been constructed using the GVMT. One is a minimal implementation of Scheme; which was implemented in under three weeks to demonstrate that toolkits like the GVMT can enable the easy construction of virtual machines. The second, the HotPy VM for Python, is a high- performance virtual machine; it demonstrates that a virtual machine built with a toolkit can be fast and that the use of a toolkit does not overly constrain the high-level design. Evaluation shows that HotPy outperforms the standard Python interpreter, CPython, by a large margin, and has performance on a par with PyPy, the fastest Python VM currently available. Contents 1 Introduction 11 1.1 Virtual Machines .......................... 12 1.2 Dynamic Languages ........................ 12 1.3 The Problem ............................. 13 1.4 Thesis ................................ 14 1.5 Contributions ............................ 14 1.6 Outline ............................... 15 2 Virtual Machines 17 2.1 A Little History ........................... 17 2.2 Interpreters ............................. 19 2.3 Garbage Collection ......................... 24 2.4 Optimisation for Dynamic Languages ............... 31 2.5 Python Virtual Machines ...................... 33 2.6 Other Interpreted Languages and their VMs ............ 36 2.7 Self-Interpreters ........................... 43 2.8 Multi-Threading and Dynamic Languages ............. 43 2.9 Conclusion ............................. 44 3 Abstract Machine Based Toolkits 45 3.1 Introduction ............................. 45 3.2 The Essential Features of a Virtual Machine ............ 47 3.3 An Abstract Machine for Virtual Machines ............. 48 3.4 Optimisation in VMs for Dynamic Languages ........... 52 3.5 When to use the abstract machine approach? ............ 54 3.6 Alternative Approaches to Building VMs ............. 55 3.7 Related Work ............................ 56 3.8 Conclusions ............................. 57 4 The Glasgow Virtual Machine Toolkit 59 4.1 Overview .............................. 59 4.2 The Abstract Machine ........................ 61 4.3 Front-End Tools ........................... 66 4.4 Back-End Tools ........................... 71 4.5 Translating GVMT Abstract Machine Code to Real Machine Code 73 2 4.6 Memory Management in the GVMT ................ 78 4.7 Locks ................................ 86 4.8 Concurrency and Garbage Collection ................ 88 4.9 Comparison of PyPy and GVMT .................. 90 4.10 The GVMT Scheme Example Implementation ........... 91 4.11 Conclusions ............................. 92 5 HotPy, ANew VM for Python 95 5.1 Introduction ............................. 95 5.2 The HotPy VMModel ....................... 96 5.3 Design of the HotPy VM ...................... 99 5.4 Tracing and Traces ......................... 104 5.5 Optimisation of Traces ....................... 108 5.6 Specialisation ............................ 111 5.7 Deferred Object Creation ...................... 113 5.8 Further Optimisations ........................ 119 5.9 De-Optimisation .......................... 120 5.10 An Example ............................. 120 5.11 Deviations from the Design of CPython .............. 125 5.12 Dictionaries ............................. 126 5.13 Related Work ............................ 130 5.14 Conclusion ............................. 131 6 Results and Evaluation 133 6.1 Introduction ............................. 133 6.2 Utility of the GVMT and Toolkits in General ........... 133 6.3 Performance of the GVMT Scheme VM .............. 134 6.4 Comparison of Unladen Swallow, the PyPy VM, and HotPy ... 135 6.5 Aspects of Virtual Machine Performance .............. 143 6.6 Memory Usage ........................... 146 6.7 Effect of Garbage Collection .................... 149 6.8 Potential for Further Optimisation ................. 150 6.9 Conclusions ............................. 152 7 Conclusions 155 7.1 Review of the Thesis ........................ 155 7.2 Significant Results ......................... 156 7.3 Dissertation Summary ....................... 156 7.4 Future Work ............................. 158 7.5 In Closing .............................. 160 A The GVMT Abstract Machine Instruction Set 162 BThe GVMT Abstract Machine Language Grammar 197 C Python Attribute Lookup Semantics 200 3 C.1 Definitions .............................. 200 C.2 Lookup Algorithm ......................... 201 D Surrogate Functions 203 D.1 The __new__ method for tuple ................... 203 D.2 The __call__ method for type .................... 204 D.3 The Binary Operator ........................ 204 EThe HotPy Virtual Machine Bytecodes 205 E.1 Base Instructions .......................... 205 E.2 Instructions Required for Tracing .................. 213 E.3 Specialised Instructions ....................... 216 E.4 D.O.C. Instructions ......................... 221 E.5 Super Instructions .......................... 222 F Results 224 Bibliography 228 4 List of Tables 2.1 Main Python Implementations ................... 34 2.2 Main Ruby Implementations .................... 39 4.1 GVMTTypes ............................ 64 6.1 Unoptimised Interpreters. Short Benchmarks ........... 139 6.2 Unoptimised Interpreters. Medium Benchmarks .......... 139 6.3 Full VM. Short Benchmarks .................... 140 6.4 Full VM. Medium Benchmarks ................... 140 6.5 Full VM. Long Benchmarks .................... 141 6.6 Optimised Interpreters. Short Benchmarks ............. 141 6.7 Optimised Interpreters. Medium Benchmarks ........... 141 6.8 Interpreter vs. Compiler. Short Benchmarks ............ 142 6.9 Interpreter vs. Compiler. Medium Benchmarks .......... 142 6.10 Interpreter vs. Compiler. Long Benchmarks ............ 142 6.11 HotPy(C) Performance Permutations. Speeds Relative to CPython 144 6.12 HotPy(Py) Performance Permutations. Speeds Relative to CPython 144 6.13 Speed Up Due to Adding Specialiser; HotPy(C)........... 144 6.14 Speed Up Due to Adding D.O.C.; HotPy(C)............. 145 6.15 Speed Up Due to Adding Compiler; HotPy(C)........... 145 6.16 Speed Up Due to Adding Specialiser; HotPy(Py).......... 145 6.17 Speed Up Due to Adding D.O.C.; HotPy(Py)............ 145 6.18 Speed Up Due to Adding Compiler; HotPy(Py)........... 146 6.19 Memory Usage ........................... 147 6.20 CPython GC percentages ...................... 149 6.21 Theoretical CPython Speedups ................... 149 F.1 Timings (in seconds); short benchmarks .............. 225 F.2 Timings (in seconds); medium benchmarks ............ 226 F.3 Timings (in seconds); long benchmarks .............. 227 5 List of Figures 2.1 Token Threaded Code – Stack-based program for A+B ...... 20 2.2 Switch Threaded Code – Stack-based program for A+B ...... 20 2.3 Switch Threaded Implementation in C ............... 21 2.4 Direct Threaded Code – Stack-based program for A+B ...... 21 2.5 Indirect Threaded Code – Stack-based program for A+B ..... 21 2.6 Subroutine Threaded Code – Stack-based program for A+B .... 22 2.7 Call Threaded Code – Stack-based program for A+B ....... 22 2.8 Memory Cycle ........................... 24 2.9 Uncollectable cycle for Reference Counting ............ 27 2.10 Self VM Tag Formats ........................ 29 3.1 Generalised Bytecode Optimiser

2.5 Python Virtual Machines

Learning to Program in Perl

Hacker Public Radio

Ironpython in Action

A Practical Solution for Scripting Language Compilers

2016 8Th International Conference on Cyber Conflict: Cyber Power

A Java Virtual Machine Extended to Run Parrot Bytecode

Threat Modeling and Circumvention of Internet Censorship by David Fifield

LAMP and the REST Architecture Step by Step Analysis of Best Practice

Graceful Language Extensions and Interfaces

Proseminar Python - Python Bindings

Efficient Use of Python on the Clusters

CIS 192: Lecture 12 Deploying Apps and Concurrency