Opencj: a Research Java Static Compiler Based on Open64
Total Page:16
File Type:pdf, Size:1020Kb
Opencj: A research Java static compiler based on Open64 Keqiao Yang, Zhemin Yang, Zhiwei Cao, Zeng Huang, Di Wang, Min Yang, Binyu Zang Parallel Processing Institute, Fudan University, Shanghai, China {kqyang,yangzhemin,zwcao,hz,wang di,m yang, byzang}@fudan.edu.cn Abstract bile phones due to its productivity and safety. Many op- As Java becomes more pervasive in the programming land- timization approaches have been applied to improve Java scape even in HPC applications, it is very important to pro- runtime performance. In many applications, Java’s perfor- vide optimizing compilers and more efficient runtime sys- mance is similar to other programming languages (includ- tems. To this end, we try to leverage the synergy between ing C++/Fortran)[16]. There are two ways to run Java static and dynamic optimization to exploit more optimiza- programs: running bytecode within a Java virtual machine tion chances and improve Java runtime performance espe- (JVM) or running its executable code directly. cially for server applications. This paper presents our first JVM can either interpret the Java bytecode or compile achievement of implementing a Java static compiler Opencj the bytecode into native code of the target machine. Be- which can perform fully optimization for Java applications. cause interpretation incurs a high runtime overhead, JVMs Opencj takes Java source files or class files as inputs and usually rely on compilation. The most popular compilation generates machine dependent executable code for Linux/IA32. approach is to perform dynamic or just-in-time (JIT) com- It is developed based on Open64 with some optimizations pilation, where translation from bytecode to native code is implemented for Java. Efficient support for exception han- performed just before a method is executed. To reduce the dling and virtual method call resolution fulfills the demands overhead of runtime compilation, JVM usually operates in which are imposed by the dynamic features of the Java pro- a hybrid mode, which means that bytecode of a method gramming language. Due to the same optimizer in Opencj is initially executed by an interpreter until the JVM de- and Open64, the performance gap between Java and C/C++ termines its suitability for further optimization. Examples of such systems include IBM JVM[34], Sun Hotspot com- programs can be evaluated. The evaluation of the scientific TM SciMark 2.0 benchmark suite shows they have a similar peak piler[21], Hotspot for Java 6[17], and Jalapeno˜ adaptive performance between its Java and C versions. The evalua- optimization system[2]. Nevertheless, JVMs using even the tion also illustrates that the performance of Opencj is better most sophisticated dynamic compilation strategies in their than GCJ for SPECjvm98 benchmark suite. server compiler still suffer from two serious problems, Large memory footprint and Startup overhead of JVMs. However, Categories and Subject Descriptors D.3.4 [Programming static compilation achieves greatly reduced cost of startup, Languages]: Processors-Compilers, Optimization, Code gen- reduced usage of memory, automatic sharing of code by eration the OS between applications, and easier linking with native General Terms Algorithms, Languages, Performance code. This paper introduces a static Java compiler Opencj Keywords Java, Java Static Compiler, Java Exception Han- which is developed based on Open64 compiler. Open64 is dling, Bounds Check Elimination, Synchronization Elimina- an open-source static compiler for C/C++ and Fortran with tion, Java Devirtualization excellent optimizer. Opencj benefits from Open64’s back- 1. Introduction end, especially IPA which enables us to generate advanced optimized Java native code and evaluate the performance The Java programming language enjoys widespread popu- gap between Java and C/C++. larity on different platforms ranging from servers to mo- Due to the lack of precise runtime information, it is hard to predict runtime behavior for static compilation. Although offline profiling technique has been adopted by some static Permission to make digital or hard copies of all or part of this work for personal or compilers including Open64, it may not be accurate or rep- classroom use is granted without fee provided that copies are not made or distributed resentative. As a result, compilers are forced to make con- for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute servative assumptions to preserve correctness and to avoid to lists, requires prior specific permission and/or a fee. performance degradation. So, we argue that static and dy- CGO 2009 Open64 Workshop March 22, Washington. Copyright c 2009 ACM [Open64 Workshop]. $5.00 namic optimizations are not distinct and competitive. And we try to find a way to integrate the benefits of static and nest optimizer(LNO), global optimizer(WOPT), and code dynamic optimizations to improve runtime performance. We generator (CG). These modules interact via a common tree- take DRLVM, a JVM of Apache Harmony[13], as a platform based intermediate representations, called WHIRL(Winning to reveal how to leverage static optimization benefits. Hierarchical Intermediate Representation Language). The In section 7, we picture a Java compilation framework WHIRL has five levels to facilitate the implementation of mixing the static and dynamic optimization techniques to different analysis and optimization phases. They are classi- further reduce runtime overhead. In this framework, a dy- fied as Very High, High, Mid, Low, and Very Low levels, re- namic compilation module is introduced to collect runtime spectively. And each optimization is implemented on a spe- information and apply dynamic optimizations guided by the cific level of WHIRL. For example, IPA and LNO are ap- profiling results. Besides, a complete Java runtime environ- plied to High level WHIRL while WOPT operates on Mid ment based on Harmony will be integrated into Opencj to level. further accelerate Java runtime speed. The C/C++, Java frontends are based on GNU technol- This paper makes the following contributions: ogy. The Fortran90/95 frontend is the SGI Pro64 (Cray) For- tran frontend. This paper presents the detail of Java frontend • Design and implementation of a Java static compiler in Open64 in section 3. Each frontend produces a Very High Opencj: we introduce the infrastructure of Opencj that level WHIRL for the input program units, stored as a so- compiles Java source files or Class files into optimized called .B file. This representation is available for the subse- executable code. In particular, we focus on the Java ex- quent phases. The driver of Open64 controls the execution of ception handling and some optimizations in Opencj. the compiler, deciding what modules to load and what com- • We compare the runtime performance of Java applica- pilation plan to use. The Driver is responsible for invoking tions between running in JVM and running executable the frontends, the stand-alone procedure inliner, IPA, back- code. Meanwhile we give an evaluation of the perfor- end modules and the linker. Figure 1 is the compiler execut- mance gap between Java and C in scientific applications. ing pattern. • We evaluate the performance of Opencj at Linux/IA32 comparing to GCJ 4.2.0, Harmony DRLVM and Sun hotspot of JDK 1.6. • We give a big picture of how to combine static optimiza- 3. Frontend migration tion and Java runtime technique to improve Java runtime 3.1 Java frontend performance. The C/C++ frontend of Open64 is inherited from GCC, so The rest of the paper is organized as follows. Section 2 gives the frontend of GCJ compiler which is a Java static compiler an overview of Opencj compiler. Section 3 presents the fron- of GCC is chosen to develop the Java frontend of Open64 tend migration for Open64. Section 4 describes the optimiza- when designing Opencj. In figure 1, the frontend has two tions which are designed and implemented for Java applica- modules, gspin which outputs AST of GCC as language tions. Section5 gives the experimental evaluation of Opencj. independent spin file and wgen which takes spin file as Section 6 pursues related works. Section 7 highlights the fu- input and converts it into WHIRL. The objective of gspin ture work of our research and, finally, section 8 concludes module is to keep the independency and persistency of wgen the paper. module in the compiler version updating. Spin file is equal to AST file except a few modifications in AST to remove the language dependent tree nodes, such as renaming the tree 2. Overview of Opencj node type. The main components of Opencj include the Java frontend We migrate the frontend of GCJ 4.2.0 version into Opencj. migrated from GCJ[12] and the Optimizer of Open64[25]. After GCC 4.0 version, it redefined the AST tree into The frontend is used to read Java source files or class files GENERIC and GIMPLE[20] tree for performing optimiza- and transform them into WHIRL[14] of Open64. And then tion at the tree level. The purpose of GENERIC is simply Open64 Optimizer performs optimization on the WHIRL to provide a language-independent way of representing an and generates machine dependent executable code for IA32 entire function in trees. GIMPLE is a simplified subset of or Itanium. GENERIC for use in optimization. We used to try to gener- Open64 is open sourced by Silicon Graphics Inc. from ate spin file at GIMPLE tree instead of GENERIC to benefit its SGI Pro64 compiler targeting MIPS and Itanium proces- the GCJ optimization. The evaluation of SciMark 2.0 Java sors. Open64 is a well-written, modularized, robust, state-of- benchmark shows that it caused about 30% performance the-art compiler with support for C/C++ and Fortran 77/90. degradation since some high level structure has been trans- The major modules of Open64 are the multiple language formed into low level, e.g.