
Proceedings of the Third Conference on Object-Oriented Technologies and Systems (COOTS '97)

Toba: Java For Applications
A Way Ahead of Time (WAT) Compiler

Todd A. Proebsting, Gregg Townsend, Patrick Bridges, John H. Hartman, Tim Newsham, Scott A. Watterson
The University of Arizona

Address: Department of Computer Science, University of Arizona, Tucson, AZ 85721; Email: ftodd, gmt, bridges, jhh, newsham, [email protected].

Abstract

Toba is a system for generating efficient standalone Java applications. Toba includes a Java-bytecode-to-C compiler, a garbage collector, a threads package, and Java API support. Toba-compiled Java applications execute 1.5-4.2 times faster than interpreted and Just-In-Time compiled applications.

1 Introduction

Java [GYT96] is an object-oriented language designed by Sun Microsystems that supports mobile code, i.e., executable code that runs on a variety of platforms. Although the language is interesting in its own right, Java's popularity stems from its promise of "write once, run anywhere." Mobile code proponents envision a future of location-independent code moving about the Internet and running on any platform.

Java's mobility is achieved by compiling its object classes into a distribution format called a class file. A class file contains information about the Java class, including bytecodes, an architecturally-neutral representation of the instructions associated with the class's methods. A class file can execute on any computer supporting the Java Virtual Machine (JVM). Java's code mobility, therefore, depends on both architecture-neutral class files and the implicit assumption that the JVM is supported on every client machine.

Most JVM implementations execute bytecodes via interpretation or Just-In-Time (JIT) compilation, which compiles the bytecodes into machine code at run time. Thus, Java's mobility comes at a price, exacted by the cost of interpreting or JIT-compiling the bytecodes every time the program is executed. These systems incur modest to severe performance penalties compared to more traditional systems that compile source code directly to machine code once. For example, a compiled C program runs 1.5-2.2 times faster than the equivalent JIT-compiled Java program, and 2.6-4.2 times faster than an interpreted Java program.

These performance penalties are especially bothersome in non-mobile applications that are run many times without change. To combat these inherent performance penalties we have developed a Java system that pre-compiles Java class files into machine code. Our system, Toba,¹ first translates Java class files into C code, then compiles the C into machine code. The resulting object files are linked with the Toba run-time system to create traditional executable files. To distinguish our technique from JIT compilation, we have (somewhat facetiously) coined the phrase Way-Ahead-of-Time (WAT) compiler to describe Toba. Toba compiles Java programs into machine code during program development, eliminating the need for interpretation or JIT compilation of bytecodes. Although we forfeit Java's architecture-neutral distribution, Toba-generated executables are 1.5-4.4 times faster than alternative JVM implementations.

¹ Lake Toba is a prominent feature on Sumatra, the island just west of Java.

Toba has several advantages over interpretation or JIT-compilation. First, because Toba runs way-ahead-of-time, rather than just-in-time, the resulting machine code can be more heavily optimized to yield more efficient executables. Second, because Toba creates a C-equivalent to the Java program, the standard C debugging and profiling tools can operate on Toba-generated executables. Third, because Toba executables include all class files used by the application, there is no possibility of an application suddenly ceasing to execute because of a change in available class files. For these reasons we believe that WAT-compilation is valuable for the development and distribution of efficient Java programs.

Toba consists of many components: a bytecode-to-C translator, a garbage collector, a threads package, a run-time library, and native routines implementing the Java API. Toba is a surprisingly small system: the translator is only 5000 lines of Java; the garbage collector is a modestly-altered version of the Boehm-Demers-Weiser conservative collector [BW88]; the threads package is built on top of Solaris threads; the run-time library is only 6500 lines of C; and the API routines are simply translations of Sun's API class files. Except for dynamic linking, Toba provides a complete Java execution environment.

2 The Java Virtual Machine

The Java Virtual Machine (JVM) defines a stack-based virtual machine that executes Java class files [LY97]. Each Java class compiles into a separate class file containing information describing the class's inheritance, fields, methods, etc., as well as nearly all of the compile-time type information. The Java bytecodes form the JVM's instruction set, and combine simple arithmetic and control-flow operators with operators specific to the Java language's object model. Powerful object-level instructions include those to access static and instance variables, and those to invoke static, virtual, nonvirtual and interface functions. The JVM also includes an exception mechanism for handling abnormal conditions that arise during execution.

The JVM also provides facilities for managing objects and concurrency. The JVM implements a garbage-collected object allocation model, with facilities for initializing and finalizing objects. Concurrency is provided through a thread abstraction. Threads are pre-emptive and scheduled according to priority. A monitor facility provides mutual exclusion on critical sections as well as thread scheduling through wait/notify primitives. Monitors are recursive, allowing a single thread to acquire the same monitor lock multiple times without deadlocking.

3 Toba's Run-Time Data Structures

Java's rich object model requires run-time data structures to describe each object's type and methods. We developed our data structures with both performance and simplicity in mind. They differ in many respects from those of Sun's implementation of Java. For instance, Sun's implementation requires that all object references go through a handle, which represents an extra level of indirection, an added inefficiency, and an extra complication. Toba accesses objects directly. The differences are invisible to Java programmers but important to authors of native methods.

3.1 Naming

Toba attempts to preserve Java names in the C it produces, although this isn't always possible. Java names may draw from thousands of different Unicode characters whereas C names are limited to just 63 ASCII characters. Furthermore, some legal Java names such as enum and setjmp have special meaning in C. When a Java name cannot be used directly as a C name, Toba discards non-C characters, adds a hash-code suffix, and additionally adds a prefix character if the resulting name begins with a digit or other illegal character.

Java method names always require hash-code suffixes. Toba translates each Java method into a C function, and these functions share a global namespace. Because Java methods may be overloaded among and within classes, a hash-code suffix is added to distinguish the methods. The suffix encodes the class name, the method name, and the method signature.

3.2 Data Layout

Java includes eight primitive types: byte, short, int, long, boolean, char, float, and double. Each translates into a primitive C type. (Note that Java's "char" type represents a 16-bit Unicode value.)

All other Java types are reference types that subclass the root class, java.lang.Object. All reference types are translated into a C pointer type. Each reference points to an object instance, and all instances of a particular class contain a class-pointer to a common class structure. Java has two different kinds of objects: array objects and ordinary objects. The Toba structure for ordinary objects appears in Figure 1. An ordinary object's class descriptor includes the instance size and a flag that indicates it is not an array. The Toba structure for array objects appears in Figure 2. An array's class descriptor includes the element size and its flag indicates that it represents an array. Array instances contain both a length field and a vector of elements.

Each per-class run-time structure has three parts: general information that is needed for all classes (e.g., superclass information), a method table that contains pointers to virtual functions, and a table of class variables. Figure 3 summarizes run-time class-level information common to all classes.

The method table is simply a vector of function pointers and unique method identifiers. The method identifiers are used when invoking interface functions, which must be found at run-time. The structure of the method table is typical of statically-bound object-oriented languages like Oberon-2 [MW91] and C++ [Str86]. Method tables include inherited methods as well as functions defined by the class itself.

Figure 1: Ordinary Object Structure (diagram labels: Object Pointer; Object Instance, holding Instance Variables; Class Struct, holding Array Bit = 0 and Instance Size)

Figure 2, array object structure (diagram labels: Object Pointer; Array Instance, holding Length; Class Struct, holding Array Bit = 1 and Element Size = 1,2,4,8)
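The renaming rule of Section 3.1 can be illustrated with a short sketch. It is written in Python for brevity, not in the Java of the actual translator, and several details are assumptions rather than Toba's behavior: the CRC-32 stand-in hash, the five-hex-digit suffix, the `x` prefix character, the partial `C_RESERVED` set, and the function names are all hypothetical.

```python
import re
import zlib

# Partial, illustrative set of names with special meaning in C (assumption).
C_RESERVED = {"enum", "setjmp", "int", "char", "struct", "union", "goto"}
MAX_C_NAME = 63   # C name length limit quoted in Section 3.1
SUFFIX_LEN = 6    # "_" plus five hex digits (assumed width)


def hash_suffix(text):
    """Stand-in hash suffix; the hash Toba really uses is not specified here."""
    return "_" + format(zlib.crc32(text.encode("utf-8")) & 0xFFFFF, "05x")


def clean(java_name):
    """Discard non-C characters and add a prefix if the result starts badly."""
    kept = re.sub(r"[^A-Za-z0-9_]", "", java_name)
    kept = kept[:MAX_C_NAME - SUFFIX_LEN - 1]   # leave room for prefix and suffix
    if not kept or kept[0].isdigit():
        kept = "x" + kept                        # assumed prefix character
    return kept


def is_plain_c_name(name):
    """True if the Java name can be used unchanged as a C name."""
    return (re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name) is not None
            and name not in C_RESERVED
            and len(name) <= MAX_C_NAME)


def mangle_name(java_name):
    """Keep usable names as they are; otherwise clean and add a hash suffix."""
    if is_plain_c_name(java_name):
        return java_name
    return clean(java_name) + hash_suffix(java_name)


def mangle_method(class_name, method_name, signature):
    """Methods always get a suffix derived from class, name, and signature."""
    key = f"{class_name}.{method_name}{signature}"
    return clean(method_name) + hash_suffix(key)
```

Under these assumptions, `mangle_name("count")` is returned unchanged, `mangle_name("enum")` gains a suffix, and two overloads such as `mangle_method("A", "f", "(I)V")` and `mangle_method("A", "f", "(J)V")` map to different C function names because the signature feeds the hash.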
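The class structure and method table described in Section 3.2 can be modeled in a few lines. The following Python sketch is only a model of the idea, not Toba's C layout: the names `ClassStruct`, `Instance`, `make_class`, `invoke_virtual`, and `invoke_interface` are hypothetical, and the linear search by method identifier is an assumed lookup strategy, since the text says only that interface functions must be found at run time.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ClassStruct:
    name: str
    superclass: Optional["ClassStruct"]
    is_array: bool   # the "array bit" of Figures 1 and 2
    size: int        # instance size, or element size for arrays
    # Method table: a vector of (method identifier, function) pairs.
    methods: List[Tuple[int, Callable]] = field(default_factory=list)


@dataclass
class Instance:
    cls: ClassStruct                          # class pointer shared by all instances
    fields: dict = field(default_factory=dict)


def make_class(name, superclass, size, defined):
    """Start from the inherited table; overrides reuse the inherited slot."""
    table = list(superclass.methods) if superclass else []
    for method_id, fn in defined:
        for slot, (existing_id, _) in enumerate(table):
            if existing_id == method_id:
                table[slot] = (method_id, fn)
                break
        else:
            table.append((method_id, fn))
    return ClassStruct(name, superclass, False, size, table)


def invoke_virtual(obj, slot, *args):
    """Virtual call: the slot index is known statically, so index directly."""
    _, fn = obj.cls.methods[slot]
    return fn(obj, *args)


def invoke_interface(obj, method_id, *args):
    """Interface call: search the table for the method identifier at run time."""
    for existing_id, fn in obj.cls.methods:
        if existing_id == method_id:
            return fn(obj, *args)
    raise LookupError(f"{obj.cls.name} has no method with id {method_id}")
```

In this model a subclass table begins as a copy of its superclass table, which is one way to satisfy the statement that method tables include inherited methods; a virtual call then costs one index operation, while an interface call pays for a search.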