Tracing the Meta-Level: PyPy’s Tracing JIT

Carl Friedrich Bolz (University of Düsseldorf, STUPS Group, Germany)
Antonio Cuni (University of Genova, DISI, Italy)
Maciej Fijalkowski (merlinux GmbH)
Armin Rigo

ABSTRACT

We attempt to apply the technique of tracing JIT compilers in the context of the PyPy project, i.e., to programs that are interpreters for some dynamic languages, including Python. Tracing JIT compilers can greatly speed up programs that spend most of their time in loops in which they take similar code paths. However, applying an unmodified tracing JIT to a program that is itself a bytecode interpreter results in very limited or no speedup. In this paper we show how to guide tracing JIT compilers to greatly improve the speed of bytecode interpreters. One crucial point is to unroll the bytecode dispatch loop, based on two hints provided by the implementer of the bytecode interpreter. We evaluate our technique by applying it to two PyPy interpreters: one is a small example, and the other one is the full Python interpreter.

1. INTRODUCTION

Dynamic languages have seen a steady rise in popularity in recent years. JavaScript is increasingly being used to implement full-scale applications which run within a browser, whereas other dynamic languages (such as Ruby, Perl, Python, PHP) are used for the server side of many web sites, as well as in areas unrelated to the web.

One of the often-cited drawbacks of dynamic languages is the performance penalty they impose. Typically they are slower than statically typed languages. Even though there has been a lot of research into improving the performance of dynamic languages (in the SELF project, to name just one example [18]), those techniques are not as widely used as one would expect. Many dynamic language implementations use completely straightforward bytecode interpreters without any advanced implementation techniques like just-in-time compilation. There are a number of reasons for this. Most of them boil down to the inherent complexities of using compilation: interpreters are simple to implement, understand, extend and port, whereas writing a just-in-time compiler is an error-prone task that is made even harder by the dynamic features of a language.

A recent approach to getting better performance for dynamic languages is that of tracing JIT compilers [16, 7]. Writing a tracing JIT compiler is relatively simple. It can be added to an existing interpreter for a language; the interpreter takes over some of the functionality of the compiler, and the machine code generation part can be simplified.

The PyPy project is trying to find approaches to generally ease the implementation of dynamic languages. It started as a Python implementation in Python, but has now extended its goals to be generally useful for implementing other dynamic languages as well. The general approach is to implement an interpreter for the language in a subset of Python. This subset is chosen in such a way that programs in it can be compiled into various target environments, such as C/Posix, the CLI or the JVM. The PyPy project is described in more detail in Section 2.

In this paper we discuss ongoing work in the PyPy project to improve the performance of interpreters written with the help of the PyPy toolchain. The approach is that of a tracing JIT compiler. Contrary to the tracing JITs for dynamic languages that currently exist, PyPy's tracing JIT operates "one level down", i.e., it traces the execution of the interpreter, as opposed to the execution of the user program. The fact that the program the tracing JIT compiles is in our case always an interpreter brings its own set of problems. We describe tracing JITs and their application to interpreters in Section 3. By this approach we hope to arrive at a JIT compiler that can be applied to a variety of dynamic languages, given an appropriate interpreter for each of them. The process is not completely automatic but needs a small number of hints from the interpreter author to help the tracing JIT. The details of how the process integrates into the rest of PyPy will be explained in Section 4. This work is not finished, but has already produced some promising results, which we will discuss in Section 5.

The contributions of this paper are:

• Applying a tracing JIT compiler to an interpreter.

• Finding techniques for improving the generated code.

2. THE PYPY PROJECT

The PyPy project [21, 5] (http://codespeak.net/pypy) is an environment where flexible implementations of dynamic languages can be written. To implement a dynamic language with PyPy, an interpreter for that language has to be written in RPython [1]. RPython ("Restricted Python") is a subset of Python chosen in such a way that type inference can be performed on it. The language interpreter can then be translated with the help of PyPy into various target environments, such as C/Posix, the CLI and the JVM. This is done by a component of PyPy called the translation toolchain.

By writing VMs in a high-level language, we keep the implementation of the language free of low-level details such as memory management strategy, threading model or object layout. These features are automatically added during the translation process. The process starts by performing control flow graph construction and type inference, followed by a series of steps, each step transforming the intermediate representation of the program produced by the previous one until we get the final executable. The first transformation step makes details of the Python object model explicit in the intermediate representation, later steps introducing garbage collection and other low-level details. As we will see later, this internal representation of the program is also used as an input for the tracing JIT.

3. TRACING JIT COMPILERS

Tracing optimizations were initially explored by the Dynamo project [2] to dynamically optimize machine code at runtime. Its techniques were then successfully used to implement a JIT compiler for a Java VM [16, 15]. Subsequently these tracing JITs were discovered to be a relatively simple way to implement JIT compilers for dynamic languages [7]. The technique is now being used by Mozilla's TraceMonkey JavaScript VM [14] and has been tried for Adobe's Tamarin ActionScript VM [8].

Tracing JITs are built on the following basic assumptions:

• programs spend most of their runtime in loops

• several iterations of the same loop are likely to take similar code paths

The basic approach of a tracing JIT is to only generate machine code for the hot code paths of commonly executed loops and to interpret the rest of the program. The code for those common loops however is highly optimized, including aggressive inlining.

Typically tracing VMs go through various phases when they execute a program:

• Interpretation/profiling

• Tracing

• Code generation

• Execution of the generated code

At first, when the program starts, everything is interpreted. The interpreter does a small amount of lightweight profiling to establish which loops are run most frequently. This lightweight profiling is usually done by having a counter on each backward jump instruction that counts how often this particular backward jump is executed. Since loops need a backward jump somewhere, this method looks for loops in the user program.

When a hot loop is identified, the interpreter enters a special mode, called tracing mode. During tracing, the interpreter records a history of all the operations it executes. It traces until it has recorded the execution of one iteration of the hot loop. To decide when this is the case, the trace is repeatedly checked during tracing as to whether the interpreter is at a position in the program where it had been earlier.

The history recorded by the tracer is called a trace: it is a sequential list of operations, together with their actual operands and results. Such a trace can be used to generate efficient machine code. This generated machine code is immediately executable, and can be used in the next iteration of the loop.

Being sequential, the trace represents only one of the many possible paths through the code. To ensure correctness, the trace contains a guard at every possible point where the path could have followed another direction, for example at conditions and indirect or virtual calls. When generating the machine code, every guard is turned into a quick check to guarantee that the path we are executing is still valid. If a guard fails, we immediately quit the machine code and continue the execution by falling back to interpretation. (There are more complex mechanisms in place to still produce extra code for the cases of guard failures [15], but they are independent of the issues discussed in this paper.)

It is important to understand how the tracer recognizes that the trace it recorded so far corresponds to a loop. This happens when the position key is the same as at an earlier point. The position key describes the position of the execution of the program, i.e., it usually contains things like the function currently being executed and the program counter position of the tracing interpreter. The tracing interpreter does not need to check all the time whether the position key already occurred earlier, but only at instructions that are able to change the position key to an earlier value, e.g., a backward branch instruction. Note that this is already the second place where backward branches are treated specially: during interpretation they are the place where the profiling is performed and where tracing is started or already existing assembler code executed; during tracing they are the place where the check for a closed loop is performed.

Let's look at a small example. Take the (slightly contrived) RPython code in Figure 1. The tracer interprets these functions in a bytecode format that is an encoding of the intermediate representation of PyPy's translation toolchain after type inference has been performed. When the profiler discovers that the while loop in strange_sum is executed often, the tracing JIT will start to trace the execution of that loop. The trace would look as in the lower half of Figure 1. The operations in this sequence are operations of the above-mentioned intermediate representation (e.g., the generic modulo and equality operations in the function above have been recognized to always take integers as arguments and are thus rendered as int_mod and int_eq). The trace contains all the operations that were executed in SSA form [12] and ends with a jump to its beginning, forming an endless loop that can only be left via a guard failure. The call to f is inlined into the trace. The trace contains only the hot else case of the if test in f, while the other branch is implemented via a guard failure. This trace can then be converted into machine code and executed.

    def f(a, b):
        if b % 46 == 41:
            return a - b
        else:
            return a + b

    def strange_sum(n):
        result = 0
        while n >= 0:
            result = f(result, n)
            n -= 1
        return result

    # corresponding trace:
    loop_header(result0, n0)
    i0 = int_mod(n0, Const(46))
    i1 = int_eq(i0, Const(41))
    guard_false(i1)
    result1 = int_add(result0, n0)
    n1 = int_sub(n0, Const(1))
    i2 = int_ge(n1, Const(0))
    guard_true(i2)
    jump(result1, n1)

Figure 1: A simple Python function and the recorded trace.

3.1 Applying a Tracing JIT to an Interpreter

The tracing JIT of the PyPy project is atypical in that it is not applied to the user program, but to the interpreter running the user program. In this section we will explore what problems this brings, and suggest how to solve them (at least partially). This means that there are two interpreters involved, and we need appropriate terminology to distinguish between them. On the one hand, there is the interpreter that the tracing JIT uses to perform tracing. This we will call the tracing interpreter. On the other hand, there is the interpreter that runs the user's programs, which we will call the language interpreter. In the following, we will assume that the language interpreter is bytecode-based. The program that the language interpreter executes we will call the user program (from the point of view of a VM author, the "user" is a programmer using the VM).

Similarly, we need to distinguish loops at two different levels: interpreter loops are loops inside the language interpreter. On the other hand, user loops are loops in the user program.

A tracing JIT compiler finds the hot loops of the program it is compiling. In our case, this program is the language interpreter. The most important hot interpreter loop is the bytecode dispatch loop (for many simple interpreters it is also the only hot loop). Tracing one iteration of this loop means that the recorded trace corresponds to execution of one opcode. This means that the assumption made by the tracing JIT – that several iterations of a hot loop take the same or similar code paths – is wrong in this case. It is very unlikely that the same particular opcode is executed many times in a row.

An example is given in Figure 2. It shows the code of a very simple bytecode interpreter with 256 registers and an accumulator. The bytecode argument is a string of bytes; all registers and the accumulator are integers. (The chain of if, elif, ... instructions checking the various opcodes is turned into a switch statement by one of PyPy's optimizations; Python itself does not have a switch statement.) A program for this interpreter that computes the square of the accumulator is shown in Figure 3.

    def interpret(bytecode, a):
        regs = [0] * 256
        pc = 0
        while True:
            opcode = ord(bytecode[pc])
            pc += 1
            if opcode == JUMP_IF_A:
                target = ord(bytecode[pc])
                pc += 1
                if a:
                    pc = target
            elif opcode == MOV_A_R:
                n = ord(bytecode[pc])
                pc += 1
                regs[n] = a
            elif opcode == MOV_R_A:
                n = ord(bytecode[pc])
                pc += 1
                a = regs[n]
            elif opcode == ADD_R_TO_A:
                n = ord(bytecode[pc])
                pc += 1
                a += regs[n]
            elif opcode == DECR_A:
                a -= 1
            elif opcode == RETURN_A:
                return a

Figure 2: A very simple bytecode interpreter with registers and an accumulator.

    MOV_A_R    0 # i = a
    MOV_A_R    1 # copy of 'a'

    # 4:
    MOV_R_A    0 # i--
    DECR_A
    MOV_A_R    0

    MOV_R_A    2 # res += a
    ADD_R_TO_A 1
    MOV_A_R    2

    MOV_R_A    0 # if i!=0: goto 4
    JUMP_IF_A  4

    MOV_R_A    2 # return res
    RETURN_A

Figure 3: Example bytecode: Compute the square of the accumulator.
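For readers who want to experiment, the following is a minimal, directly runnable sketch of the interpreter of Figure 2 together with the program of Figure 3 assembled by hand. The opcode values are those implied by the guard constants in the traces shown later (Figures 4 and 6); the value chosen for RETURN_A and the small driver at the end are our own additions for illustration and not part of PyPy.

    # Opcode values as implied by the guard_value constants in Figures 4 and 6;
    # RETURN_A's value is an arbitrary choice for this sketch.
    MOV_A_R, MOV_R_A, JUMP_IF_A, ADD_R_TO_A, DECR_A, RETURN_A = 1, 2, 3, 5, 7, 8

    def interpret(bytecode, a):
        # Same interpreter as in Figure 2, only written more compactly.
        regs = [0] * 256
        pc = 0
        while True:
            opcode = ord(bytecode[pc]); pc += 1
            if opcode == JUMP_IF_A:
                target = ord(bytecode[pc]); pc += 1
                if a:
                    pc = target
            elif opcode == MOV_A_R:
                n = ord(bytecode[pc]); pc += 1
                regs[n] = a
            elif opcode == MOV_R_A:
                n = ord(bytecode[pc]); pc += 1
                a = regs[n]
            elif opcode == ADD_R_TO_A:
                n = ord(bytecode[pc]); pc += 1
                a += regs[n]
            elif opcode == DECR_A:
                a -= 1
            elif opcode == RETURN_A:
                return a

    # The program of Figure 3, encoded as a string; offset 4 is the loop head.
    SQUARE = "".join(chr(c) for c in [
        MOV_A_R, 0, MOV_A_R, 1,
        MOV_R_A, 0, DECR_A, MOV_A_R, 0,
        MOV_R_A, 2, ADD_R_TO_A, 1, MOV_A_R, 2,
        MOV_R_A, 0, JUMP_IF_A, 4,
        MOV_R_A, 2, RETURN_A])

    if __name__ == "__main__":
        print(interpret(SQUARE, 12))   # prints 144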

If the tracing interpreter traces the execution of the DECR_A opcode (whose integer value is 7), the trace would look as in Figure 4. Because of the guard on opcode0, the code compiled from this trace will be useful only when executing a long series of DECR_A opcodes. For all the other operations the guard will fail, which will mean that performance is probably not improved at all.

    loop_start(a0, regs0, bytecode0, pc0)
    opcode0 = strgetitem(bytecode0, pc0)
    pc1 = int_add(pc0, Const(1))
    guard_value(opcode0, Const(7))
    a1 = int_sub(a0, Const(1))
    jump(a1, regs0, bytecode0, pc1)

Figure 4: Trace when executing the DECR_A opcode.

To improve this situation, the tracing JIT could trace the execution of several opcodes, thus effectively unrolling the bytecode dispatch loop. Ideally, the bytecode dispatch loop should be unrolled exactly so much that the unrolled version corresponds to a user loop. User loops occur when the program counter of the language interpreter has the same value several times. This program counter is typically stored in one or several variables in the language interpreter, for example the bytecode object of the currently executed function of the user program and the position of the current bytecode within that. In the example above, the program counter is represented by the bytecode and pc variables.

Since the tracing JIT cannot know which parts of the language interpreter are the program counter, the author of the language interpreter needs to mark the relevant variables of the language interpreter with the help of a hint. The tracing interpreter will then effectively add the values of these variables to the position key. This means that the loop will only be considered to be closed if these variables that are making up the program counter at the language interpreter level are the same a second time. Loops found in this way are, by definition, user loops.

The program counter of the language interpreter can only be the same a second time after an instruction in the user program sets it to an earlier value. This happens only at backward jumps in the language interpreter. That means that the tracing interpreter needs to check for a closed loop only when it encounters a backward jump in the language interpreter. Again the tracing JIT cannot know which part of the language interpreter implements backward jumps, so the author of the language interpreter needs to indicate this with the help of a hint.

The language interpreter uses a similar technique to detect hot user loops: the profiling is done at the backward branches of the user program, using one counter per seen program counter of the language interpreter.

The condition for reusing already existing machine code also needs to be adapted to this new situation. In a classical tracing JIT there is at most one piece of assembler code per loop of the jitted program, which in our case is the language interpreter. When applying the tracing JIT to the language interpreter as described so far, all pieces of assembler code correspond to the bytecode dispatch loop of the language interpreter. However, they correspond to different paths through the loop and different ways to unroll it. To ascertain which of them to use when trying to enter assembler code again, the program counter of the language interpreter needs to be checked. If it corresponds to the position key of one of the pieces of assembler code, then this assembler code can be executed. This check again only needs to be performed at the backward branches of the language interpreter.

Let's look at how hints would need to be applied to the example interpreter from Figure 2. Figure 5 shows the relevant parts of the interpreter with hints applied. One needs to instantiate JitDriver by listing all the variables of the bytecode loop. The variables are classified into two groups, "green" variables and "red" variables. The green variables are those that the tracing JIT should consider to be part of the program counter of the language interpreter. In the case of the example, the pc variable is obviously part of the program counter; however, the bytecode variable is also counted as green, since the pc variable is meaningless without the knowledge of which bytecode string is currently being interpreted. All other variables are red (the fact that red variables need to be listed explicitly too is an implementation detail).

    tlrjitdriver = JitDriver(greens = ['pc', 'bytecode'],
                             reds = ['a', 'regs'])

    def interpret(bytecode, a):
        regs = [0] * 256
        pc = 0
        while True:
            tlrjitdriver.jit_merge_point(
                bytecode=bytecode, pc=pc,
                a=a, regs=regs)
            opcode = ord(bytecode[pc])
            pc += 1
            if opcode == JUMP_IF_A:
                target = ord(bytecode[pc])
                pc += 1
                if a:
                    if target < pc:
                        tlrjitdriver.can_enter_jit(
                            bytecode=bytecode, pc=target,
                            a=a, regs=regs)
                    pc = target
            elif opcode == MOV_A_R:
                ... # rest unmodified

Figure 5: Simple bytecode interpreter with hints applied.

In addition to the classification of the variables, there are two methods of JitDriver that need to be called. Both of them receive as arguments the current values of the variables listed in the definition of the driver. The first one is jit_merge_point, which needs to be put at the beginning of the body of the bytecode dispatch loop. The other, more interesting one, is can_enter_jit. This method needs to be called at the end of any instruction that can set the program counter of the language interpreter to an earlier value. For the example this is only the JUMP_IF_A instruction, and only if it is actually a backward jump. Here is where the language interpreter performs profiling to decide when to start tracing. It is also the place where the tracing JIT checks whether a loop is closed. This is considered to be the case when the values of the "green" variables are the same as at an earlier call to the can_enter_jit method.

For the small example the hints look like a lot of work. However, the number of hints is essentially constant no matter how large the interpreter is, which makes the extra work negligible for larger interpreters.

When executing the square function of Figure 3, the profiling will identify the loop in the square function to be hot, and start tracing. It traces the execution of the interpreter running the loop of the square function for one iteration, thus unrolling the interpreter loop of the example interpreter eight times. The resulting trace can be seen in Figure 6.
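The profiling and loop-closing behaviour at can_enter_jit can be pictured with a small model. The class below is not PyPy's JitDriver but a toy stand-in, written under the assumption that the position key is simply the tuple of green values ('pc' and 'bytecode') and that a loop counts as hot after a fixed threshold of backward jumps:

    class ToyJitDriver:
        """Toy model of the profiling and loop-closing logic at can_enter_jit."""

        def __init__(self, threshold=40):
            self.threshold = threshold
            self.counters = {}        # position key -> number of backward jumps seen
            self.tracing_from = None  # position key the current trace started from

        def can_enter_jit(self, **values):
            # The position key is the tuple of green values.
            key = (values['pc'], values['bytecode'])
            if self.tracing_from is not None:
                if key == self.tracing_from:
                    # The green values repeat: the trace corresponds to one user
                    # loop iteration and could now be turned into machine code.
                    self.tracing_from = None
                return
            self.counters[key] = self.counters.get(key, 0) + 1
            if self.counters[key] >= self.threshold:
                self.tracing_from = key   # the user loop is hot: start tracing here

    # Hypothetical usage, mirroring Figure 5:
    # tlrjitdriver = ToyJitDriver()
    # ... tlrjitdriver.can_enter_jit(bytecode=bytecode, pc=target, a=a, regs=regs)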
3.2 Improving the Result

The critical problem of tracing the execution of just one opcode has been solved: the loop now corresponds exactly to the loop in the square function. However, the resulting trace is not optimized enough. Most of its operations are not actually doing any computation that is part of the square function. Instead, they manipulate the data structures of the language interpreter. While this is to be expected, given that the tracing interpreter looks at the execution of the language interpreter, it would still be an improvement if some of these operations could be removed.

    loop_start(a0, regs0, bytecode0, pc0)
    # MOV_R_A 0
    opcode0 = strgetitem(bytecode0, pc0)
    pc1 = int_add(pc0, Const(1))
    guard_value(opcode0, Const(2))
    n1 = strgetitem(bytecode0, pc1)
    pc2 = int_add(pc1, Const(1))
    a1 = call(Const(<* fn list_getitem>), regs0, n1)
    # DECR_A
    opcode1 = strgetitem(bytecode0, pc2)
    pc3 = int_add(pc2, Const(1))
    guard_value(opcode1, Const(7))
    a2 = int_sub(a1, Const(1))
    # MOV_A_R 0
    opcode2 = strgetitem(bytecode0, pc3)
    pc4 = int_add(pc3, Const(1))
    guard_value(opcode2, Const(1))
    n2 = strgetitem(bytecode0, pc4)
    pc5 = int_add(pc4, Const(1))
    call(Const(<* fn list_setitem>), regs0, n2, a2)
    # MOV_R_A 2
    opcode3 = strgetitem(bytecode0, pc5)
    pc6 = int_add(pc5, Const(1))
    guard_value(opcode3, Const(2))
    n3 = strgetitem(bytecode0, pc6)
    pc7 = int_add(pc6, Const(1))
    a3 = call(Const(<* fn list_getitem>), regs0, n3)
    # ADD_R_TO_A 1
    opcode4 = strgetitem(bytecode0, pc7)
    pc8 = int_add(pc7, Const(1))
    guard_value(opcode4, Const(5))
    n4 = strgetitem(bytecode0, pc8)
    pc9 = int_add(pc8, Const(1))
    i0 = call(Const(<* fn list_getitem>), regs0, n4)
    a4 = int_add(a3, i0)
    # MOV_A_R 2
    opcode5 = strgetitem(bytecode0, pc9)
    pc10 = int_add(pc9, Const(1))
    guard_value(opcode5, Const(1))
    n5 = strgetitem(bytecode0, pc10)
    pc11 = int_add(pc10, Const(1))
    call(Const(<* fn list_setitem>), regs0, n5, a4)
    # MOV_R_A 0
    opcode6 = strgetitem(bytecode0, pc11)
    pc12 = int_add(pc11, Const(1))
    guard_value(opcode6, Const(2))
    n6 = strgetitem(bytecode0, pc12)
    pc13 = int_add(pc12, Const(1))
    a5 = call(Const(<* fn list_getitem>), regs0, n6)
    # JUMP_IF_A 4
    opcode7 = strgetitem(bytecode0, pc13)
    pc14 = int_add(pc13, Const(1))
    guard_value(opcode7, Const(3))
    target0 = strgetitem(bytecode0, pc14)
    pc15 = int_add(pc14, Const(1))
    i1 = int_is_true(a5)
    guard_true(i1)
    jump(a5, regs0, bytecode0, target0)

Figure 6: Trace when executing the square function of Figure 3, with the corresponding opcodes as comments.

The simple insight on how to improve the situation is that most of the operations in the trace are actually concerned with manipulating the bytecode string and the program counter. Those are stored in variables that are "green" (i.e., they are part of the position key). This means that the tracer checks that those variables have some fixed value at the beginning of the loop (they may well change over the course of the loop, though). In the example of Figure 6 the check would be that at the beginning of the trace the pc variable is 4 and the bytecode variable is the bytecode string corresponding to the square function. Therefore it is possible to constant-fold computations on them away, as long as the operations are side-effect free. Since strings are immutable in RPython, it is possible to constant-fold the strgetitem operations. The int_add operations are additions of the green variable pc and a constant number, so they can be folded away as well.

With this optimization enabled, the trace looks as in Figure 7. Now much of the language interpreter is actually gone from the trace and what is left corresponds very closely to the loop of the square function. The only vestige of the language interpreter is the fact that the register list is still used to store the state of the computation. This could be removed by some other optimization, but is maybe not really all that bad anyway (in fact we have an experimental optimization that does exactly that, but it is not yet finished). Once we get this optimized trace, we can pass it to the JIT backend, which generates the corresponding machine code.

    loop_start(a0, regs0)
    # MOV_R_A 0
    a1 = call(Const(<* fn list_getitem>), regs0, Const(0))
    # DECR_A
    a2 = int_sub(a1, Const(1))
    # MOV_A_R 0
    call(Const(<* fn list_setitem>), regs0, Const(0), a2)
    # MOV_R_A 2
    a3 = call(Const(<* fn list_getitem>), regs0, Const(2))
    # ADD_R_TO_A 1
    i0 = call(Const(<* fn list_getitem>), regs0, Const(1))
    a4 = int_add(a3, i0)
    # MOV_A_R 2
    call(Const(<* fn list_setitem>), regs0, Const(2), a4)
    # MOV_R_A 0
    a5 = call(Const(<* fn list_getitem>), regs0, Const(0))
    # JUMP_IF_A 4
    i1 = int_is_true(a5)
    guard_true(i1)
    jump(a5, regs0)

Figure 7: Trace when executing the square function of Figure 3, with the corresponding opcodes as comments. The constant-folding of operations on green variables is enabled.
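To make the constant folding of operations on green variables more concrete, here is a toy folding pass. It is not PyPy's actual optimizer; it assumes a trace represented as (result, opname, args) tuples, an environment giving the known values of the green variables at the loop header, and it only handles the two operation kinds discussed above:

    def fold_green_ops(trace, green_values):
        """Toy constant folding of operations whose arguments are all known.

        trace:        list of (result_name, opname, args) tuples
        green_values: dict mapping names (e.g. 'pc0', 'bytecode0') to known values
        Returns the residual trace; folded results are remembered internally."""
        known = dict(green_values)
        residual = []
        for result, opname, args in trace:
            # Replace argument names by their values where known.
            concrete = [known.get(a, a) for a in args]
            if opname == "int_add" and all(isinstance(a, int) for a in concrete):
                known[result] = concrete[0] + concrete[1]      # fold pc updates away
            elif opname == "strgetitem" and isinstance(concrete[0], str) \
                    and isinstance(concrete[1], int):
                known[result] = ord(concrete[0][concrete[1]])  # string is immutable
            else:
                residual.append((result, opname, concrete))
        return residual

    # Example: the first three operations of Figure 6, with pc0 = 4 and the
    # bytecode string known, fold down to nothing but the guard.
    ops = [("opcode0", "strgetitem", ["bytecode0", "pc0"]),
           ("pc1", "int_add", ["pc0", 1]),
           ("g0", "guard_value", ["opcode0", 2])]
    print(fold_green_ops(ops, {"pc0": 4, "bytecode0": chr(1) * 4 + chr(2)}))

A real optimizer would additionally evaluate the remaining guard at trace-compile time, since after folding both of its arguments are constants.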
4. IMPLEMENTATION ISSUES

In this section we will describe some of the practical issues when implementing the scheme described in the last section in PyPy. In particular we will describe some of the problems of integrating the various parts with each other.

The first integration problem is how to not integrate the tracing JIT at all. It is possible to choose, when the language interpreter is translated to C, whether the JIT should be built in or not. If the JIT is not enabled, all the hints that are possibly in the interpreter source are just ignored by the translation process.

If the JIT is enabled, things are more interesting. At the moment the JIT can only be enabled when translating the interpreter to C, but we hope to lift that restriction in the future. A classical tracing JIT will interpret the program it is running until a hot loop is identified, at which point tracing and ultimately assembler generation starts. The tracing JIT in PyPy is operating on the language interpreter, which is written in RPython. But RPython programs are statically translatable to C anyway. This means that interpreting the language interpreter before a hot loop is found is clearly not desirable, since the overhead of this double interpretation would be significantly too big to be practical.

What is done instead is that the language interpreter keeps running as a C program, until a hot loop in the user program is found. To identify loops, the C version of the language interpreter is generated in such a way that at the place that corresponds to the can_enter_jit hint, profiling is performed using the program counter of the language interpreter. Apart from this bit of profiling, the language interpreter behaves in just the same way as without a JIT.

When a hot user loop is identified, tracing is started. The tracing interpreter is invoked to start tracing the language interpreter that is running the user program. Of course the tracing interpreter cannot actually trace the execution of the C representation of the language interpreter. Instead it takes the state of the execution of the language interpreter and starts tracing using a bytecode representation of the language interpreter. That means there are two "versions" of the language interpreter embedded in the final executable of the VM: on the one hand it is there as executable machine code, on the other hand as bytecode for the tracing interpreter. It also means that tracing is costly as it incurs a double interpretation overhead.

From then on things proceed as described in Section 3. The tracing interpreter tries to find a loop in the user program; if it finds one it will produce machine code for that loop and this machine code will be immediately executed. The machine code is executed until a guard fails. Then the execution should fall back to normal interpretation by the language interpreter. This falling back is possibly a complex process, since the guard failure can have occurred arbitrarily deep in a helper function of the language interpreter, which would make it hard to rebuild the state of the language interpreter and let it run from that point (e.g., this would involve building a potentially deep C stack). Instead the falling back is achieved by a special fallback interpreter which runs the language interpreter and the user program from the point of the guard failure. The fallback interpreter is essentially a variant of the tracing interpreter that does not keep a trace. The fallback interpreter runs until execution reaches a safe point where it is easy to let the C version of the language interpreter resume its operation (such a safe point corresponds to a jit_merge_point, which is in fact the only reason for that hint). This means that the fallback interpreter executes at most one bytecode operation of the language interpreter and then falls back to the C version of the language interpreter. After this, the whole process of profiling may start again.

Machine code production is done via a well-defined interface to an assembler backend. This allows easy porting of the tracing JIT to various architectures (including, we hope, to virtual machines such as the JVM where our backend could generate JVM bytecode at runtime). At the moment the only implemented backend is a 32-bit Intel x86 backend.

5. EVALUATION

In this section we evaluate the work done so far by looking at some benchmarks. Since the work is not finished, these can only be preliminary. Benchmarking was done on an otherwise idle machine with a 1.4 GHz Pentium M processor and 1 GB RAM, using Linux 2.6.27. All benchmarks were run 50 times, each in a newly started process. The first run was ignored. The final numbers were reached by computing the average of all other runs; the confidence intervals were computed using a 95% confidence level. All times include running the tracer and producing machine code.
The first round of benchmarks (Figure 8) are timings of the example interpreter given in Figure 2 computing the square of 10000000 using the bytecode of Figure 3 (the result overflows, but for smaller numbers the running time is not long enough to sensibly measure it). The results for various configurations are as follows:

Benchmark 1: The interpreter translated to C without including a JIT compiler.

Benchmark 2: The tracing JIT is enabled, but no interpreter-specific hints are applied. This corresponds to the trace in Figure 4. The threshold when to consider a loop to be hot is 40 iterations. As expected, this is not faster than the previous number. It is even quite a bit slower, probably due to the overheads, as well as non-optimal generated machine code.

Benchmark 3: The tracing JIT is enabled and hints as in Figure 5 are applied. This means that the interpreter loop is unrolled so that it corresponds to the loop in the square function. Constant folding of green variables is disabled, therefore the resulting machine code corresponds to the trace in Figure 6. This alone brings an improvement over the previous case, but is still slower than pure interpretation.

Benchmark 4: Same as before, but with constant folding enabled. This corresponds to the trace in Figure 7. This speeds up the square function considerably, making it nearly three times faster than the pure interpreter.

Benchmark 5: Same as before, but with the threshold set so high that the tracer is never invoked. In this way the overhead of the profiling is measured. For this interpreter it seems to be rather large, with about 20% slowdown due to profiling. This is because the interpreter is small and the opcodes simple. For larger interpreters (e.g., PyPy's Python interpreter) the overhead will likely be less significant.

                                        Time (ms)       Speedup
    1  Compiled to C, no JIT            442.7 ±  3.4    1.00
    2  Normal trace compilation        1518.7 ±  7.2    0.29
    3  Unrolling of interp. loop        737.6 ±  7.9    0.60
    4  JIT, full optimizations          156.2 ±  3.8    2.83
    5  Profile overhead                 515.0 ±  7.2    0.86

Figure 8: Benchmark results of the example interpreter computing the square of 10000000.

To test the technique on a more realistic example, we did some preliminary benchmarks with PyPy's Python interpreter. The function we benchmarked as well as the results can be seen in Figure 9. While the function may seem a bit arbitrary, executing it is still non-trivial, as a normal Python interpreter needs to dynamically dispatch nearly all of the involved operations, like indexing into the tuple, addition and comparison of i. We benchmarked PyPy's Python interpreter with the JIT disabled, with the JIT enabled, and CPython 2.5.2, the reference implementation of Python (http://python.org). In addition we benchmarked CPython using Psyco 1.6 [20], a specializing JIT compiler for Python.

The results show that the tracing JIT speeds up the execution of this Python function significantly, even outperforming CPython. To achieve this, the tracer traces through the whole Python dispatching machinery, automatically inlining the relevant fast paths. However, the manually tuned Psyco still performs a lot better than our prototype (although it is interesting to note that Psyco improves the speed of CPython by only a factor of 3.29 in this example, while our tracing JIT improves PyPy by a factor of 6.54).

    def f(a):
        t = (1, 2, 3)
        i = 0
        while i < a:
            t = (t[1], t[2], t[0])
            i += t[0]
        return i

                                         Time (s)       Speedup
    1  PyPy compiled to C, no JIT        23.44 ± 0.07    1.00
    2  PyPy compiled to C, with JIT       3.58 ± 0.05    6.54
    3  CPython 2.5.2                      4.96 ± 0.05    4.73
    4  CPython 2.5.2 + Psyco 1.6          1.51 ± 0.05   15.57

Figure 9: Benchmarked function and results for the Python interpreter running f(10000000).
6. RELATED WORK

Applying a trace-based optimizer to an interpreter and adding hints to help the tracer produce better results has been tried before in the context of the DynamoRIO project [24], which has been a great inspiration for our work. They achieve the same unrolling of the interpreter loop so that the unrolled version corresponds to the loops in the user program. However the approach is greatly hindered by the fact that they trace on the machine code level and thus have no high-level information available about the interpreter. This makes it necessary to add quite a large number of hints, because at the assembler level it is not really visible anymore that, e.g., a bytecode string is immutable. Also more advanced optimizations like allocation removal would not be possible with that approach.

The standard approach for automatically producing a compiler for a language, given an interpreter for it, is that of partial evaluation [13, 19]. Conceptually there are some similarities to our work. In partial evaluation some arguments of the interpreter function are known (static) while the rest are unknown (dynamic). This separation of arguments is related to our separation of variables into those that should be part of the position key and the rest. In partial evaluation all parts of the interpreter that rely only on static arguments can be constant-folded so that only operations on the dynamic arguments remain.

Classical partial evaluation has failed to be useful for dynamic languages for much the same reasons why ahead-of-time compilers cannot compile them to efficient code. If the partial evaluator knows only the program it simply does not have enough information to produce good code. Therefore some work has been done to do partial evaluation at runtime. One of the earliest works on runtime specialisation is Tempo for C [10, 9]. However, it is essentially a normal partial evaluator "packaged as a library"; decisions about what can be specialised, and how, are pre-determined. Another work in this direction is DyC [17], another runtime specializer for C. Both of these projects have a problem similar to that of DynamoRIO: targeting the C language makes higher-level specialisation difficult.

There have been some attempts to do dynamic partial evaluation, which is partial evaluation that defers partial evaluation completely to runtime to make partial evaluation more useful for dynamic languages. This concept was introduced by Sullivan [23], who implemented it for a small dynamic language based on lambda calculus. It is also related to Psyco [20], a specializing JIT compiler for Python. There is some work by one of the authors to implement a dynamic partial evaluator for Prolog [3]. There are also experiments within the PyPy project to use dynamic partial evaluation for automatically generating JIT compilers out of interpreters [22, 11]. So far those have not been as successful as we would like and it seems likely that they will be supplanted with the work on tracing JITs described here.

7. CONCLUSION AND NEXT STEPS

We have shown techniques for improving the results when applying a tracing JIT to an interpreter. Our first benchmarks indicate that these techniques work really well on small interpreters, and first experiments with PyPy's Python interpreter make it appear likely that they can be scaled up to realistic examples.

A lot of work remains. We are working on two main optimizations at the moment. Those are:

Allocation Removal: A key optimization for making the approach produce good code for more complex dynamic languages is to perform escape analysis on the loop operations after tracing has been performed. In this way all objects that are allocated during the loop and do not actually escape the loop do not need to be allocated on the heap at all but can be exploded into their respective fields. This is very helpful for dynamic languages where primitive types are often boxed, as the repeated allocation of intermediate results is very costly.
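As a rough illustration of what such an escape analysis has to decide (and not a description of PyPy's actual implementation), the sketch below inspects a trace given as (result, opname, args) tuples and collects the allocations whose objects never leave the loop; the operation names used here are invented for the example:

    def find_removable_allocations(trace):
        """Toy escape analysis over a trace of (result, opname, args) tuples.

        Returns the allocations whose objects never escape the loop and could
        therefore be exploded into their fields.  The operation names ('new',
        'setfield', 'getfield', 'call', 'jump') are illustrative only."""
        allocated = {res for res, op, args in trace if op == "new"}
        escaping = set()
        for res, op, args in trace:
            if op == "setfield":
                # Storing an allocated object *into* another object makes it escape;
                # writing into the object itself (args[0]) does not.
                escaping.update(a for a in args[1:] if a in allocated)
            elif op == "getfield":
                pass                       # reading a field does not cause an escape
            else:
                # calls, guards and the closing jump make their arguments escape
                escaping.update(a for a in args if a in allocated)
        return allocated - escaping

    # A boxed intermediate result that is written and read only inside the loop:
    trace = [("b0", "new", []),
             ("",   "setfield", ["b0", "intval", "i7"]),
             ("i8", "getfield", ["b0", "intval"]),
             ("",   "jump", ["i8"])]
    print(find_removable_allocations(trace))   # prints {'b0'}

A real optimizer would then drop the removable allocations and forward the stored field values directly to the corresponding reads.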
Optimizing Frame Objects: One problem with the removal of allocations is that many dynamic languages are so reflective that they allow the introspection of the frame object that the interpreter uses to store local variables (e.g., Smalltalk, Python). This means that intermediate results always escape because they are stored into the frame object, rendering the allocation removal optimization ineffective. To remedy this problem we make it possible to update the frame object lazily, only when it is actually accessed from outside of the code generated by the JIT.

Furthermore both tracing and leaving machine code are very slow due to a double interpretation overhead, and we might need techniques for improving those.

Eventually we will need to apply the JIT to the various interpreters that are written in RPython to evaluate how widely applicable the described techniques are. Possible targets for such an evaluation would be the SPy-VM, a Smalltalk implementation [4]; a Prolog interpreter; PyGirl, a Gameboy emulator [6]; and also not immediately obvious ones, like Python's regular expression engine.

If these experiments are successful we hope that we can reach a point where it becomes unnecessary to write a language-specific JIT compiler and instead possible to just apply a couple of hints to the interpreter to get reasonably good performance with relatively little effort.

8. REFERENCES

[1] D. Ancona, M. Ancona, A. Cuni, and N. D. Matsakis. RPython: a step towards reconciling dynamically and statically typed OO languages. In Proceedings of the 2007 Symposium on Dynamic Languages, pages 53–64, Montreal, Quebec, Canada, 2007. ACM.
[2] V. Bala, E. Duesterwald, and S. Banerjia. Dynamo: a transparent dynamic optimization system. ACM SIGPLAN Notices, 35(5):1–12, 2000.
[3] C. F. Bolz. Automatic JIT Compiler Generation with Runtime Partial Evaluation. Master thesis, Heinrich-Heine-Universität Düsseldorf, 2008.
[4] C. F. Bolz, A. Kuhn, A. Lienhard, N. Matsakis, O. Nierstrasz, L. Renggli, A. Rigo, and T. Verwaest. Back to the Future in One Week — Implementing a Smalltalk VM in PyPy, pages 123–139. 2008.
[5] C. F. Bolz and A. Rigo. How to not write a virtual machine. In Proceedings of the 3rd Workshop on Dynamic Languages and Applications (DYLA 2007), 2007.
[6] C. Bruni and T. Verwaest. PyGirl: generating whole-system VMs from high-level prototypes using PyPy. In Tools, accepted for publication, 2009.
[7] M. Chang, M. Bebenita, A. Yermolovich, A. Gal, and M. Franz. Efficient just-in-time execution of dynamically typed languages via code specialization using precise runtime type inference. Technical Report ICS-TR-07-10, Donald Bren School of Information and Computer Science, University of California, Irvine, 2007.
[8] M. Chang, E. Smith, R. Reitmaier, M. Bebenita, A. Gal, C. Wimmer, B. Eich, and M. Franz. Tracing for web 3.0: Trace compilation for the next generation web applications. In Proceedings of the 2009 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 71–80, Washington, DC, USA, 2009. ACM.
[9] C. Consel, L. Hornof, F. Noël, J. Noyé, and N. Volanschi. A uniform approach for compile-time and run-time specialization. Dagstuhl Seminar on Partial Evaluation, pages 54–72, 1996.
[10] C. Consel and F. Noël. A general approach for run-time specialization and its application to C. In Proceedings of the 23rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 145–156, St. Petersburg Beach, Florida, United States, 1996. ACM.
[11] A. Cuni, D. Ancona, and A. Rigo. Faster than C#: Efficient implementation of dynamic languages on .NET. Submitted to ICOOOLPS'09.
[12] R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems, 13(4):451–490, Oct. 1991.
[13] Y. Futamura. Partial evaluation of computation process – an approach to a compiler-compiler. Higher-Order and Symbolic Computation, 12(4):381–391, 1999.
[14] A. Gal, B. Eich, M. Shaver, D. Anderson, B. Kaplan, G. Hoare, D. Mandelin, B. Zbarsky, J. Orendorff, M. Bebenita, M. Chang, M. Franz, E. Smith, R. Reitmaier, and M. Haghighat. Trace-based just-in-time type specialization for dynamic languages. In PLDI, 2009.
[15] A. Gal and M. Franz. Incremental dynamic code generation with trace trees. Technical Report ICS-TR-06-16, Donald Bren School of Information and Computer Science, University of California, Irvine, Nov. 2006.
[16] A. Gal, C. W. Probst, and M. Franz. HotpathVM: an effective JIT compiler for resource-constrained devices. In Proceedings of the 2nd International Conference on Virtual Execution Environments, pages 144–153, Ottawa, Ontario, Canada, 2006. ACM.
[17] B. Grant, M. Mock, M. Philipose, C. Chambers, and S. J. Eggers. DyC: an expressive annotation-directed dynamic compiler for C. Theoretical Computer Science, 248(1–2):147–199, 2000.
[18] U. Hölzle. Adaptive optimization for SELF: reconciling high performance with exploratory programming. Technical report, Stanford University, 1994.
[19] N. D. Jones, C. K. Gomard, and P. Sestoft. Partial Evaluation and Automatic Program Generation. Prentice-Hall, Inc., 1993.
[20] A. Rigo. Representation-based just-in-time specialization and the Psyco prototype for Python. In Proceedings of the 2004 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation, pages 15–26, Verona, Italy, 2004. ACM.
[21] A. Rigo and S. Pedroni. PyPy's approach to virtual machine construction. In Companion to the 21st ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 944–953, Portland, Oregon, USA, 2006. ACM.
[22] A. Rigo and S. Pedroni. JIT compiler architecture. Technical Report D08.2, PyPy, May 2007.
[23] G. T. Sullivan. Dynamic partial evaluation. In Proceedings of the Second Symposium on Programs as Data Objects, pages 238–256. Springer-Verlag, 2001.
[24] G. T. Sullivan, D. L. Bruening, I. Baron, T. Garnett, and S. Amarasinghe. Dynamic native optimization of interpreters. In Proceedings of the 2003 Workshop on Interpreters, Virtual Machines and Emulators, pages 50–57, San Diego, California, 2003. ACM.