A Study of Devirtualization Techniques for a Java™ Just-In-Time Compiler

Kazuaki Ishizaki, Motohiro Kawahito, Toshiaki Yasue, Hideaki Komatsu, Toshio Nakatani
IBM Research, Tokyo Research Laboratory
1623-14, Shimotsuruma, Yamato-shi, Kanagawa-ken, 242-8502, Japan
[email protected]

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.
OOPSLA '00, 10/00 Minneapolis, MN, USA
© 2000 ACM ISBN 1-58113-200-x/00/0010...$5.00

ABSTRACT

Many devirtualization techniques have been proposed to reduce the runtime overhead of dynamic method calls for various object-oriented languages; however, most of them are less effective for Java or cannot be applied to it in a straightforward manner. This is partly because Java is a statically-typed language, so transforming a dynamic call into a static one does not yield a tangible performance gain (owing to the low overhead of accessing the method table) unless the call is inlined, and partly because the dynamic class loading feature of Java prohibits whole program analysis and optimizations from being applied.

We propose a new technique called direct devirtualization with the code patching mechanism. For a given dynamic call site, our compiler first determines whether the call can be devirtualized, by analyzing the current class hierarchy. When the call is devirtualizable and the target method is suitably sized, the compiler generates the inlined code of the method, together with the backup code that makes the dynamic call. Only the inlined code is actually executed until our assumption about the devirtualization becomes invalidated, at which time the compiler performs code patching so that the backup code is executed subsequently. Since the new technique prevents some code motions across the merge point between the inlined code and the backup code, we have furthermore implemented recently-known analysis techniques, such as type analysis and preexistence analysis, which allow the backup code to be completely eliminated. We made various experiments using 16 real programs to understand the effectiveness and characteristics of the devirtualization techniques in our Java Just-In-Time (JIT) compiler. In summary, we reduced the number of dynamic calls by 8.9% to 97.3% (40.2% on average), and we improved the execution performance by -1% to 133% (with a geometric mean of 16%).

1. Introduction

Many devirtualization techniques [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] have been proposed to reduce the overhead of dynamic method calls for various object-oriented languages. In general, a guard test is generated to test the class of the receiver (called class test) [1, 2, 3] or the method (called method test) [11] to ensure that it is valid to make a direct call to the corresponding target method. We call this approach guarded devirtualization. In dynamically-typed object-oriented languages such as Self [4], guarded devirtualization is extremely effective because of the high overhead of a dynamic method call. In statically-typed object-oriented languages like Java [12], guarded devirtualization is less effective because the overhead of a dynamic method call is low: the call can be translated into a few load operations followed by an indirect jump operation. In order to boost the performance of a dynamic method call, the method call must be inlined as much as possible after the guard test is eliminated.

To devirtualize a dynamic method call without generating a guard test (we call this approach direct devirtualization), whole program analysis and optimizations [5, 6, 7, 8, 9] have been proposed in the context of static compilers. However, they are based on the closed-world assumption, in which no dynamic class loading is allowed. Therefore, these techniques are not directly applicable to Java. Dynamic recompilation can be used to make direct devirtualization possible under the non-closed-world assumption, but it involves a complicated mechanism called on-stack replacement [13].

We propose a new technique called direct devirtualization with the code patching mechanism (the code patching mechanism or code patch in short) [10]. For a given dynamic call site, our compiler first determines whether the call can be devirtualized, by analyzing the current class hierarchy. When the call is devirtualizable and the target method is suitably sized, the compiler generates the inlined code of the method, together with the backup code (also called backup path) that makes the dynamic call. Only the inlined code is actually executed until our assumption about the devirtualization becomes invalidated, at which time the compiler performs code patching so that the backup code is executed subsequently. Since the new technique prevents some code motions across the merge point between the inlined code and the backup code, we have furthermore implemented recently-known analysis techniques, such as type analysis [14, 15, 16, 17] and preexistence analysis [11], which allow the backup code to be completely eliminated.

We made various experiments using 16 real programs to understand the effectiveness and characteristics of the devirtualization techniques in our Java Just-In-Time (JIT) compiler. In summary, we reduced the number of dynamic calls by 8.9% to 97.3% (40.2% on average), and we improved the execution performance by -1% to 133% (with a geometric mean of 16%).

This paper makes the following contributions:

• A new devirtualization technique called direct devirtualization with the code patching mechanism, which is much simpler to implement and has less overhead to execute than a recompilation approach with on-stack replacement. We eliminate the backup path by type analysis and preexistence analysis. We also optimize the inlined code by inserting compensation code in the backup path if it is not eliminated.

• Evaluation of various devirtualization techniques implemented in our Java JIT compiler, including direct devirtualizations (the code patching mechanism, type analysis, and preexistence) and guarded devirtualizations (class test and method test).

The rest of the paper is structured as follows. Section 2 discusses related work. Section 3 describes the devirtualization techniques that are implemented in our Java JIT compiler. Section 4 gives experimental results with statistics and performance results on a set of real programs. Finally, Section 5 outlines our conclusions.

2. Related Work

Devirtualization techniques are important for improving performance in object-oriented languages. Therefore, many devirtualization techniques have been proposed.

An inline cache technique was developed to speed up dynamic method calls. An inline cache records the class of the last receiver object at the call site, and jumps directly to the method for that class. A stub validates that the dynamic type of the receiver matches the expected type. If this test fails, a normal method lookup makes a dynamic method call and stores the class of the current receiver in the call site cache. Hölzle extended the technique to a polymorphic case of inline caches [18]. Type prediction [2, 3] and method test [11] have also been proposed. Type prediction and method test predict the type of a frequently-called object at compile time. A polymorphic inline cache, type prediction, and method test introduce new runtime tests, since these techniques take the cache approach with memory references. According to the results of simple experiments [19], type prediction without inlining, even with 100% accuracy of the predictions, cannot outperform direct devirtualization of dynamic method calls [...]

[...] 17] attempts to tighten the static type constraints on the receiver expressions. It increases the opportunities for direct devirtualization to determine whether a call site has a single implementation. It can also directly devirtualize a dynamic method call without a backup path.

Several languages, such as C++, Dylan, and Java, have a linguistic mechanism that allows users to declare a class sealed, so that no new class can be subclassed from it. However, sealed methods are not common in the Java Core libraries such as java.util.Vector. Most of the methods in this class are not sealed in Java 2.

The Self system performs extensive inlining of dynamic method calls [13], whose correctness is ensured by the on-stack replacement mechanism. In the compiled code, there are deoptimization points at which the original state of the method's variables can be recovered from the optimized state. When a compilation assumption is violated by dynamic class loading, the Self system recovers the original state at a deoptimization point and recompiles the method. Such a system introduces several concerns. In the Self implementation, the compiler produces numerous data structures called scope descriptors for deoptimization. Deoptimization points also introduce inefficiency into the generated code by storing extra data in memory to recover the original context. The compiler cannot reorder instructions over a deoptimization point. It is also difficult to replace methods on stacks in a multi-threaded runtime environment. The Java HotSpot compiler [21] adopts a recompilation approach using on-stack replacement. We did not explore this approach because of the complexity of its implementation. Preexistence analysis [11] is an approach to prevent on-stack replacement by determining whether direct devirtualization can be performed based on the analysis of the receiver expressions. We adopted it to increase the opportunity for compiler optimizations by eliminating backup paths.

3.
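To make the notion of guarded devirtualization with a class test more concrete, the following is a minimal source-level sketch in Java. It is an illustration only, not the code our JIT compiler emits: the classes Shape, Square, and Rect and the methods areaVirtual and areaClassTest are hypothetical names introduced for this example, and a real compiler performs the transformation on its intermediate representation or machine code rather than on source text.

```java
// Hypothetical class hierarchy, introduced only for this illustration.
abstract class Shape {
    abstract double area();
}

final class Square extends Shape {
    final double side;
    Square(double side) { this.side = side; }
    @Override double area() { return side * side; }
}

final class Rect extends Shape {
    final double w, h;
    Rect(double w, double h) { this.w = w; this.h = h; }
    @Override double area() { return w * h; }
}

final class DevirtSketch {
    // The original call site: a dynamic (virtual) method call.
    static double areaVirtual(Shape s) {
        return s.area();
    }

    // Hand-written analogue of guarded devirtualization with a class test.
    // The predicted receiver class (Square) is an assumption of this sketch.
    static double areaClassTest(Shape s) {
        if (s.getClass() == Square.class) {   // guard: class test on the receiver
            Square q = (Square) s;
            return q.side * q.side;           // inlined body of Square.area()
        }
        return s.area();                      // guard failed: dynamic call
    }

    public static void main(String[] args) {
        System.out.println(areaClassTest(new Square(3.0)));    // prints 9.0
        System.out.println(areaClassTest(new Rect(2.0, 5.0))); // prints 10.0
    }
}
```

Direct devirtualization would correspond to dropping the `if` test and keeping only the inlined body for as long as the class hierarchy assumption holds. Switching to the dynamic call afterwards relies on patching the generated machine code, which has no direct source-level equivalent and is therefore not shown.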
