Kawa - Compiling Dynamic Languages to the Java VM
Per Bothner Cygnus Solutions 1325 Chesapeake Terrace Sunnyvale CA 94089, USA
Abstract: in a project in conjunction with Java. A language im- plemented on top of Java gives programmers many of Many are interested in Java for its portable bytecodes the extra-linguistic benefits of Java, including libraries, and extensive libraries, but prefer a different language, portable bytecodes, web applets, and the existing efforts especially for scripting. People have implemented other to improve Java implementations and tools. languages using an interpreter (which is slow), or by translating into Java source (with poor responsiveness The Kawa toolkit supports compiling and running vari- for eval). Kawa uses an interpreter only for “simple” ous languages on the Java Virtual Machine. Currently, expressions; all non-trivial expressions (such as function Scheme is fully supported (except for a few difficult fea- definitions) are compiled into Java bytecodes, which are tures discussed later). An implementation of ECMA- emitted into an in-memory byte array. This can be saved Script is coming along, but at the time of writing it is for later, or quickly loaded using the Java ClassLoader. not usable.
Kawa is intended to be a framework that supports mul- Scheme [R RS] is a simple yet powerful language. It tiple source languages. Currently, it only supports is a non-pure functional language (i.e. it has first-class Scheme, which is a lexically-scoped language in the Lisp functions, lexical scoping, non-lazy evaluation, and side family. The Kawa dialect of Scheme implements almost effects). It has dynamic typing, and usually has an inter- all of the current Scheme standard (R RS), with a num- active read-evaluate-print interface. The dynamic nature ber of extensions, and is written in a efficient object- of Scheme (run-time typing, immediate expression eval- oriented style. It includes the full “numeric tower”, with uation) may be a better match for the dynamic Java envi- complex numbers, exact infinite-precision rational arith- ronment (interpreted bytecodes, dynamic loading) than metic, and units. A number of extensions provide ac- Java is! cess to Java primitives, and some Java methods provide ECMAScript is the name of the dialect of JavaScript convenient access to Scheme. Since all Java objects are defined by ECMA standard 262 [ECMAScript], which Scheme values and vice versa, this makes for a very pow- standardizes JavaScript’s core language only, with no in- erful hybrid Java/Scheme environment. put/output or browser/document interface. It defines a An implementation of ECMAScript (the standardized very dynamic object-based language based on prototype “core” of JavaScript) is under construction. Other lan- inheritance, rather than classes. A number of new or pro- guages, including Emacs Lisp, are also being considered. posed Web standards are based on ECMAScript.
Kawa home page: http://www.cygnus.com/ Information and source code will be available from bothner/kawa.html. http://www.cygnus.com/ bothner/kawa.html. (Note that Kawa is also the name of an unrelated com- mercial Java development environment.) 1. Introduction While Java is a decent programming language, the rea- 2. History son for the “Java explosion” is largely due to the Java Virtual Machine (JVM), which allows programs to be Starting in 1995 Cygnus (on behalf of the Free Soft- distributed easily and efficiently in the form of portable ware Foundation) developed Guile, an implementation bytecodes, which can run on a wide variety of architec- of Scheme suitable as a general embedding and exten- tures and in web browsers. These advantages are largely sion language. Guile was based on Aubrey Jaffar’s SCM independent of the Java language, which is why there interpreter; the various Guile enhancements were ini- have been a number of efforts to run other languages on tially done by Tom Lord. In 1995 we got a major contract the JVM, even though the JVM is very clearly designed to enhance Guile, and with our client we added more and optimized for Java. Many are especially interested features, including threads (primarily done by Anthony in more high-level, dynamic “scripting” languages to use Green), and internationalization. The contract called for a byte-code compiler for Guile, Java” dialects. and it looked like doing a good job on this would be a This only gives you the single a single (Java VM) layer major project. One option we considered was compil- of interpretation. On the other hand, most of the ef- ing Scheme into Java bytecodes and executing them by forts that people are making into improving Java per- a Java engine. The disadvantage would be that such a formance will benefit your implementation, since you Scheme system would not co-exist with Guile (on the use standard Java bytecodes. other hand, we had run into various technical and non- technical problems with Guile that led us to conclude The biggest problem with that approach is that it is an that Guile would after all not be strategic to Cygnus). inherently batch process, and has poor responsiveness. The advantage of a Java solution was leveraging off the Consider a read-eval-print-loop, that is the ability for tools and development being done in the Java “space”, a user to type in an expression, and have it be imme- plus that Java was more likely to be strategic long-term. diately read, evaluated, and the result printed. If eval- uating an expression requires converting it to a Java The customer agreed to using Java, and I started active program, writing it to a disk file, invoking a separate development June 1996. As a base, I used the Kawa java compiler, and then loading the resulting class file Scheme interpreter written by R. Alexander Milowski. into the running environment, then response time will He needed an object-oriented Scheme interpreter to im- be inherently poor. This hurts “exploratory program- plement DSSSL [DSSSL], a Scheme-like environment ing”, that is the ability to define and update functions for expressing style, formatting, and other processing on the fly. of SGML [SGML] documents. DSSSL is an subset of “pure” Scheme with some extensions. Kawa 0.2 was a A lesser disadvantage is that Java source code is not simple interpreter which was far from complete. It pro- quite as expressive as Java bytecodes. While byte- vided a useful starting point, but almost all of the original codes are very close to Java source, there are some code has by now been re-written. useful features not available in the Java language, such as goto. Debugging information is also an issue. Kawa 1.0 was released to our customer and “the Net” September 1996. Development has continued since