Computer Science Technical Report Compiling Java to SUIF
Total Page:16
File Type:pdf, Size:1020Kb
Computer Science Technical Report Compiling Java to SUIF: Incorporating Support for Object-Oriented Languages Sumith Mathew, Eric Dahlman and Sandeep Gupta Computer Science Department Colorado State University, Fort Collins, CO 80523 g fmathew, dahlman, gupta @cs.colostate.edu July 1997 Technical Report CS-97-114 Computer Science Department Colorado State University Fort Collins, CO 80523-1873 Phone: (970) 491-5792 Fax: (970) 491-2466 WWW: http://www.cs.colostate.edu Compiling Java to SUIF: Incorp orating Supp ort for Ob ject-Oriented Languages Sumith Mathew, Eric Dahlman and Sandeep Gupta Computer Science Department Colorado State University,Fort Collins, CO 80523 fmathew, dahlman, [email protected] Abstract A primary ob jective in the SUIF compiler design has b een to develop an infrastructure for research in a variety of compiler topics including optimizations on ob ject-oriented languages. However, the task of optimizing ob ject-oriented languages requires that high- level ob ject constructs b e visible in SUIF. Java is a statically-typ ed, ob ject-oriented and interpreted language that has the same requirements of high-p erformance as pro cedural languages. Wehave develop ed a Java front-end to SUIF, j2s, for the purp ose of carrying out optimizations. This pap er discusses the compilation issues in translating Java to SUIF and draws up on those exp eriences in prop osing the solutions to incorp orating ob ject-oriented supp ort in SUIF as well as the extensions that need to be made to supp ort exceptions and threads. 1 Intro duction Ob ject-oriented languages with their features of inheritance and abstraction equip programmers with the to ols to write reusable programs quickly and easily. However, the use of these features results in co de that is quite di erent in structure from pro cedural co de. Ob ject-oriented programs tend to have smaller metho d sizes, data-dep endent control ow due to dynamic dispatch and a large numb er of p otential aliases [ZCG94]. Heavy use of the feature of dynamic dispatch and the fact that metho ds tend to be small in size imp ose p erformance p enalties when the programs are compiled using traditional intrapro cedural techniques [ASU86]. + Optimization techniques [Dea96, DDG 96, Ple96, Cha92]have b een prop osed whichhave a direct b ene t of eliminating the overhead of dynamic dispatchby statically determining the receiver of a metho d call. Moreover greater indirect b ene ts are obtained by enabling inlining and hence further intrapro cedural optimizations. However, in order to evaluate the relative costs and b ene ts of these optimization techniques, it is necessary to have a common framework on which these optimizations are p erformed. Also, the framework needs to b e language-indep endent so that programs of di erent ob ject-oriented languages can get comparable treatment. An intermediate representation can act as the interface b etween di erent language front-ends and the optimizing back-end. SUIF [Gro94] is a compiler infrastructure that allows easy exp erimentation and incremental opti- mization through multiple passes. The feature of extensibility [Smi96] makes it p ossible for SUIF to supp ort many user-de ned constructs. Also, SUIF has built in supp ort for interpro cedural op- timizations [Pan96] and parallelization. We prop ose to implement an optimizer for ob ject-oriented languages on the SUIF representation of the program. This pro ject is supp orted byFaculty Grant 168597 from Colorado State University 1 A necessary requirement of the intermediate representation is that it must allow the di erent front- ends to describ e the structure of the source program in such a way that optimizations may be p erformed. Ob ject-oriented languages use constructs like metho d calls, eld accesses, ob ject instan- tiations etc., which are not found in pro cedural languages. To p erform static binding of dynamically- dispatched calls, it is necessary that these constructs b e easily identi able in the intermediate rep- resentation. However, the current version of SUIF 1.x do es not have any front-ends for ob ject- oriented languages and do es not supp ort ob ject-oriented constructs. Therefore, a ma jor endeavor of this pro ject has b een to identify the extensions, if any, that need to b e made to SUIF to supp ort these features. This pap er deals with the issues in compiling Java, a statically-typ ed and ob ject-oriented language, to SUIF. As Javaisaninterpreted and dynamic language wehave had to make some changes to the compilation mo del. Java programs are compiled to byteco de b efore b eing translated to the SUIF representation. We draw up on our exp eriences in building the translator, j2s,toshowhow SUIF can b e used to supp ort ob ject-oriented features. We also prop ose solutions for supp orting exceptions and threads in SUIF. The rest of this pap er is organized as follows. Section 2 deals with the issues in translating byteco de to SUIF in detail. It describ es the instruction format, data-typ es and ob ject-oriented constructs that are supp orted by the Java Virtual Machine. In translating Java to SUIF wehave had to deal with the issue of supp orting ob ject-oriented features in SUIF. Section 3 outlines the features that can b e supp orted in the existing version of SUIF as also the extensions that need to b e made. The issues of exception handling and thread libraries are also discussed b ecause they are present in most ob ject-oriented languages. The conclusions and future work are describ ed in section 4. 2 Translating Byteco de to SUIF Translating byteco de to SUIF involves handling the di erent data typ es and byteco de instructions without any alteration in the semantics of the program. The byteco de is to b e nally compiled to native co de. Currently, the compilation mo del requires that all the byteco de les be available at compile time as it do es not supp ort dynamic linking. The currentversion of the translator compiles eachbyteco de .class le into a SUIF le. There- fore, programs with multiple .class les will have a corresp onding numb er of SUIF les. Eachof set entry per fileset. In a later pass the multiple file set entries the SUIF les have one file are linked and placed in a single fileset for the purp oses of interpro cedural analysis. 2.1 Compilation Mo del The Java to SUIF translator, j2s, is one step in the compilation framework for Java programs which is outlined in Figure 1 . Java source co de is rst compiled to byteco de b efore it is translated into SUIF. The reasons for cho osing byteco de as the source language are many. As byteco des do not have to b e parsed, unlike source co de, a simple reader is sucient to read in the information from a byteco de le. Further, byteco des provide a suciently high level of representation of the constructs used in the language. This is imp ortant for the purp oses of optimization where all the ob ject-oriented constructs from the source program need to b e retained. We are currently considering three alternatives for the execution strategy. The approaches can b e classi ed as those interfacing with the Java Virtual Machine Interface and those that take a stand alone approach. In either case all the comp onentbyteco de les are required to b e available at compile time. Dynamic loading is not currently supp orted. There are twoways of interfacing compiled co de with Sun's Java Virtual Machine - using the Native Metho d Interface or the JIT Just-In-Time 2 Java Source JavaC Bytecode j2s SUIF Optimizer C Code Native Method/ Fat-Class File Figure 1: Compilation Framework. API [Yel96]. The stand alone approachwould require a run-time environment that duplicates many of the services that are provided by the Virtual Machine. 2.2 Converting Stack-Based Instructions to Three Address Co de AJava Virtual Machine instruction [LY97] consists of a one-byte op co de sp ecifying the op eration followed by zero or more op erands supplying the arguments. The byteco de instructions are stack- based instructions. The translation pro cedure converts it to three-address co de SUIF expression tree. The inner lo op of the translation pro cedure is given in Figure 2. do { read the opcode read the arguments from stack and bytecode array create appropriate SUIF objects for arguments create appropriate SUIF instruction object if (no more instructions in expression tree) emit instruction else place address of instruction on SUIF stack } while (more instructions available) Figure 2: Converting Stack-Based Instructions to Tree Expressions Each expression tree is lab eled with the number of the instruction in the Java byteco de using annotations to the mrk instructions. These are used in determining the targets of jump instructions. 3 The byteco de instructions for jumps represent the target as an o set within the given metho d. We do not currently supp ort jumps for exception handling using the catch and finally clauses. Patching the right SUIF instruction to jump to requires two passes. The rst pass annotates the branch instructions with the numb er of the Javabyteco de instruction. The second pass will patch in the destination op erand of the SUIF jump instructions. long a,b; 1: mrk 0 ldc2_w #4 <Long 2> ["line': 0 "ClassA.bc"] a = 2; 3 lstore_1 2: str main.lv1 = e1 4 lload_1 3: e1: ldc t:g5 (i.64) 2 b = a * 3; 5 ldc2_w #6 <Long 3> 4: mrk 8 mul ["line": 8 "ClassA.bc"] 9 lstore_3 5: str main.lv2 = e1 10 return 6: e1: mul t:g5 (i.64) e2, e3 7: e2: lod t:g5 (i.64) main.lv1 8: e3: ldc t:g5 (i.64) 3 9: mrk ["line": 10 "ClassA.bc"] 10: ret <nullop> Java Source Java Bytecode SUIF representation Figure 3: Example of a Simple Expression Tree Figure 3 lists an example output from the translation pro cedure that converts stack-based co de to an expression tree.