The Jalapeño Dynamic Optimizing Compiler for Java™

Michael G. Burke, Jong-Deok Choi, Stephen Fink, David Grove, Michael Hind, Vivek Sarkar, Mauricio J. Serrano, V. C. Sreedhar, Harini Srinivasan, John Whaley
IBM Thomas J. Watson Research Center
P.O. Box 704, Yorktown Heights, NY 10598

Abstract

The Jalapeño Dynamic Optimizing Compiler is a key component of the Jalapeño Virtual Machine, a new Java¹ Virtual Machine (JVM) designed to support efficient and scalable execution of Java applications on SMP server machines. This paper describes the design of the Jalapeño Optimizing Compiler, and the implementation results that we have obtained thus far. To the best of our knowledge, this is the first dynamic optimizing compiler for Java that is being used in a JVM with a compile-only approach to program execution.

1 Introduction

This paper describes the Jalapeño Optimizing Compiler, a key component of the Jalapeño Virtual Machine, a new JVM being built at IBM Research. A distinguishing feature of the Jalapeño JVM is that it takes a compile-only approach to program execution. Instead of providing both an interpreter and a JIT compiler as in other JVMs, bytecodes are always translated to machine code before they are executed. Jalapeño has three different compilers to provide such translation: an optimizing compiler for computationally intensive methods (which is the subject of this paper), a "quick" compiler that performs a low level of code optimization (primarily register allocation), and a "baseline" compiler that mimics the stack machine of the JVM specification document [29]. The compile-only approach makes it easier to mix execution of unoptimized and optimized compiled methods in the Jalapeño JVM, compared to mixing interpreted execution and JIT-compiled execution as in other JVMs.

A primary goal of the Jalapeño JVM is to deliver high performance and scalability of Java applications on SMP server machines. Some previous high performance implementations of Java (e.g., [9, 25, 22, 18]) have relied on static compilation, and have therefore disallowed certain features such as dynamic class loading. In contrast, the Jalapeño JVM supports all features of Java [29], and the Jalapeño Optimizing Compiler² is a fully integrated dynamic compiler in the Jalapeño JVM.

The Jalapeño project was initiated in December 1997 at the IBM T. J. Watson Research Center and is still work-in-progress. This paper describes the design of the Jalapeño Optimizing Compiler and the implementation results that we have obtained thus far. To the best of our knowledge, this is the first dynamic optimizing compiler for Java that is being used in a JVM with a compile-only approach to program execution.

The rest of the paper is organized as follows. Section 2 provides the context for this work by describing key features of the Jalapeño Virtual Machine. Section 3 outlines the high-level structure of the Jalapeño Optimizing Compiler and how it is invoked within the Jalapeño Virtual Machine. Section 4 describes the intermediate representation (IR) used in the Jalapeño Optimizing Compiler. Sections 5 and 6 describe the "front-end" and "back-end" respectively of the Jalapeño Optimizing Compiler; the front-end describes a (mostly) single-pass translation of Java bytecodes to an optimized high-level IR (HIR), and the back-end describes how HIR is lowered and translated into optimized machine code accompanied by exception tables and GC stack-maps. Section 7 summarizes our framework for efficient flow-insensitive optimizations for single-assignment variables. Section 8 describes our framework for inlining method calls. Section 9 presents performance results obtained from the current implementation of the Jalapeño Optimizing Compiler as of March 1999. Section 10 describes two interprocedural optimizations that are in progress as extensions to the current implementation: interprocedural optimization of register saves and restores, and interprocedural escape analysis. Finally, Section 11 discusses related work and Section 12 contains our conclusions.

¹ Trademark or registered trademark of Sun Microsystems, Inc.
² Though dynamic compilation is the default mode for the Jalapeño Optimizing Compiler, the same infrastructure can be used to support a hybrid of static and dynamic compilation, as discussed in Section 2.

2 The Jalapeño Virtual Machine

The subsystems of the Jalapeño JVM include a dynamic class loader, dynamic linker, object allocator, garbage collector, thread scheduler, profiler (on-line measurements system), three dynamic compilers, and support for other run-time features, such as exception handling and type testing. Among the three dynamic compilers, the baseline compiler was implemented first. It is used to validate the other compilers, for debugging, and as the default compiler until the quick compiler is fully functional. The class loader supports dynamic linking via backpatching for classes that were loaded after compilation.

Memory management in the Jalapeño JVM consists of an object allocator and a garbage collector. The Jalapeño JVM supports type-accurate garbage collection. With a view to future experimentation to determine which garbage collection algorithm will be best suited for SMP execution of multithreaded Java programs, the Jalapeño JVM contains a variety of type-accurate garbage collectors (generational and non-generational, copying and non-copying) [24].

In the Jalapeño JVM, each object has a two-word header: a pointer to a type information block, and a status word for hashing, locking, and garbage collection. Since threads in Java are objects, the Jalapeño JVM creates a distinct object for each Java thread. One of the fields of this thread object holds a reference to the thread's stack, which contains a contiguous sequence of variable-size stack frames, one per method invocation. These stack frames are chained together by "dynamic links".

Another distinguishing feature of the Jalapeño JVM is that all its subsystems (including the compilers, run-time routines, and garbage collector) are implemented in Java and run alongside the Java application. Although it is written in Java, the Jalapeño JVM is self-bootstrapping; i.e., it does not need to run on top of another JVM. One of the many advantages of a pure Java implementation is that we can dynamically self-optimize the Jalapeño JVM.

3 Structure of the Jalapeño Optimizing Compiler

The Jalapeño Optimizing Compiler is adaptive and dynamic. It is invoked on an automatically selected set of methods while an application is running. The goal of the Jalapeño Optimizing Compiler is to generate the best possible code for the selected methods for a given compile-time budget. In addition, its optimizations must deliver significant performance improvements while correctly preserving Java semantics with respect to exceptions, garbage collection, and threads. Reducing the cost of synchronization and other thread primitives is especially important for achieving scalable performance on SMP servers. Finally, it should be possible to retarget the Jalapeño Optimizing Compiler to a variety of hardware platforms. Building a dynamic optimizing compiler that achieves all these goals is a major challenge.

[Figure 1: Context for Jalapeño Optimizing Compiler. Diagram labels: Bytecode; Baseline/Quick Compiler; Bytecode Translation; Unoptimized Code; Instrumented Code; Executable Code; Optimized Code; Online Measurements; Adaptive Optimization System; Optimizing Compiler; Context Sensitive Profile Information; Optimization Plan; Controller.]

Figure 1 shows the overall design for how the Jalapeño Optimizing Compiler is used in the Jalapeño Virtual Machine. The Optimizing Compiler is the key component of Jalapeño's Adaptive Optimization System, which also includes an On-Line Measurements (OLM) subsystem and a Controller subsystem. The OLM and Controller subsystems are currently under development. The OLM system is designed to monitor the performance of individual methods in the application by using software sampling and profiling techniques combined with a collection of hardware performance monitor information, and to maintain context-sensitive profile information for method calls in a Calling Context Graph (CCG) similar to the Calling Context Tree introduced in [3]. The Controller subsystem will be invoked when the OLM subsystem detects that a certain performance threshold is reached. The controller uses the CCG and its associated profiling information to build an "optimization plan" that describes which methods the optimizing compiler should compile and with what optimization levels. The OLM subsystem will continue monitoring individual methods, in- [...]

[Figure 2. Diagram labels: Class Files (Bytecode); Adaptive Optimization System; Opt-compiled Code; Boot-image Writer; Boot Image with ...]

[...] binary code in a "boot image". Similarly, the optimizing compiler could also compile selected methods from a user application and store them in a custom boot image tailored to the application. When doing so, the optimizing compiler would essentially function as a static compiler as shown in Figure 2.

When the Jalapeño Optimizing Compiler functions as a pure dynamic compiler, it must generate the best possible code for a given compile-time budget. The compile-time budget is less important when the Jalapeño Optimizing Compiler functions as a static compiler or as a static bytecode-to-bytecode optimizer³.

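The two-word object header described in Section 2 shares one status word among hashing, locking, and garbage collection. The paper does not give the field widths, so the sketch below assumes a 32-bit word split into 20 hash bits, 10 lock bits, and 2 GC bits purely for illustration. The class `StatusWordSketch` and all of its members are hypothetical names, and the first header word (the pointer to the type information block) is not modeled.

```java
// Hypothetical bit layout; the paper does not specify field widths or order.
final class StatusWordSketch {
    static final int GC_BITS = 2;    // assumed: garbage-collection state bits
    static final int LOCK_BITS = 10; // assumed: lock state bits
    static final int HASH_BITS = 20; // assumed: identity hash bits

    static final int GC_SHIFT = 0;
    static final int LOCK_SHIFT = GC_SHIFT + GC_BITS;
    static final int HASH_SHIFT = LOCK_SHIFT + LOCK_BITS;

    // Low-order mask with the given number of one bits.
    static int mask(int bits) {
        return (1 << bits) - 1;
    }

    // Pack the three fields into one 32-bit status word.
    static int pack(int hash, int lock, int gc) {
        return ((hash & mask(HASH_BITS)) << HASH_SHIFT)
             | ((lock & mask(LOCK_BITS)) << LOCK_SHIFT)
             | ((gc & mask(GC_BITS)) << GC_SHIFT);
    }

    static int hash(int status) { return (status >>> HASH_SHIFT) & mask(HASH_BITS); }
    static int lock(int status) { return (status >>> LOCK_SHIFT) & mask(LOCK_BITS); }
    static int gc(int status)   { return (status >>> GC_SHIFT) & mask(GC_BITS); }

    // Return a new status word with only the lock field replaced.
    static int withLock(int status, int lock) {
        int cleared = status & ~(mask(LOCK_BITS) << LOCK_SHIFT);
        return cleared | ((lock & mask(LOCK_BITS)) << LOCK_SHIFT);
    }

    public static void main(String[] args) {
        int status = pack(0xABCDE, 5, 3);
        int relocked = withLock(status, 7);
        System.out.println(hash(relocked) + " " + lock(relocked) + " " + gc(relocked));
    }
}
```

Packing the fields into one word keeps the per-object overhead at two words, at the cost of masking and shifting on every access. A real implementation would also need atomic updates of the lock field, which this sketch leaves out.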