
Compiling Standard ML to Java Bytecodes

Nick Benton, Andrew Kennedy, George Russell

In Proceedings of the 3rd ACM SIGPLAN Conference on Functional Programming, September, Baltimore

The following copyright notice is required by the ACM (see http://www.acm.org/pubs/copyright_policy.html):

Copyright by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., by fax or at permissions@acm.org.

Compiling Standard ML to Java Bytecodes

Nick Benton, Andrew Kennedy, George Russell

Persimmon IT, Inc.

Cambridge, U.K.

{nick,andrew,george}@persimmon.co.uk

Abstract

MLJ compiles SML into verifier-compliant Java bytecodes. Its features include type-checked interlanguage working extensions which allow ML and Java code to call each other, automatic recompilation management, compact compiled code and runtime performance which, using a just-in-time compiling Java virtual machine, usually exceeds that of existing specialised bytecode interpreters for ML. Notable features of the compiler itself include whole-program optimisation based on rewriting, compilation of polymorphism by specialisation, a novel monadic intermediate language which expresses effect information in the type system and some interesting data representation choices.

Introduction

The success of Sun Microsystems' Java language means that virtual machines executing Java's secure, multi-threaded, garbage-collected bytecode and supported by a capable collection of standard library classes are now not just available for a wide range of architectures and operating systems, but are actually installed on most modern machines. The idea of compiling a functional language such as ML into Java bytecodes is thus very appealing: as well as the obvious attraction of being able to run the same compiled code on any machine with a JVM, the potential benefits of interlanguage working between Java and ML are considerable.

Many existing compilers for functional languages have the ability to call external functions written in another language (usually C). Unfortunately, differences in memory models and type systems make most of these foreign function interfaces awkward to use, limited in functionality and even type-unsafe. Consequently, although there are, for example, good functional graphics libraries which call X, the typical functional programmer probably doesn't bother to use a C language interface to call everyday library functions to, say, calculate an MD checksum, manipulate a GIF file or access a database. She thus either does more work than should be necessary, or gives up and uses another language. This is surely a major factor holding back the wider adoption of functional languages in real world applications, and the situation is getting worse as more software has to operate in a complex environment, interacting with components written in a variety of languages, possibly wrapped up in a distributed component architecture such as CORBA, DCOM or JavaBeans.

Because the semantic gap between Java and ML is smaller than that between C and ML, and because Java uses a simple scheme for dynamic linking, MLJ is able to make interlanguage working safe and straightforward. MLJ code can not only call external Java methods, but can also manipulate Java objects and declare Java classes with methods implemented in ML, which can be called from Java. Thus the MLJ programmer can not only write applets in ML, but also has instant access to standard libraries for 2D and 3D graphics, GUIs, database access, sockets and networking, concurrency, CORBA connectivity, security, servlets, sound and so on, as well as to a large and rapidly-growing collection of third-party code.

The interesting question is whether ML can be compiled into Java bytecodes which are efficient enough to be useful. Java itself is often criticised for being too slow, especially running code which does significant amounts of allocation, and its bytecodes were certainly not designed with compilation of other languages in mind. There is little opportunity for low-level backend trickery, since we have no control over the garbage collector, heap layout, etc., and the requirement that compiled classes pass the Java verifier places strict type constraints on the code we generate. Furthermore, current Java virtual machines not only store activation records on a fixed-size stack, but also fail to optimise tail calls. Thus the initial prospects for generating acceptably efficient Java bytecodes from a functional language did not look good (our first, very simple-minded, lambda-calculus to Java translator plus an early JVM ran the nfib benchmark [...] times slower than Moscow ML) and it was clear that a practical ML to Java bytecode compiler would have to do fairly extensive optimisations. MLJ is still a work-in-progress, and there is scope for significant improvement in both compilation speed and the generated code (in particular, the current version still only optimises simple tail calls), but it is already quite usable on source programs of several thousand lines and produces code which, with a good modern JVM, usually outperforms the popular Moscow ML bytecode interpreter.

To appear in the 3rd ACM SIGPLAN Conference on Functional Programming, September, Baltimore.



Overview

Compiler phases

MLJ is intended for writing compact, self-contained applications, applets and software components and does not have the usual SML interactive read-eval-print top level. Instead, it operates much more like a traditional batch compiler. The structures which make up a project are separately parsed, typechecked, translated into our typed Monadic Intermediate Language (MIL, see the MIL section) and simplified. These separately compiled MIL terms are then linked together into one large MIL term which is extensively transformed before being translated into our low-level Basic Block Code (BBC, described in a later section). Finally, the backend turns the BBC into a collection of compiled Java class files, which by default are placed in a single zip archive.

This whole-program approach to compilation is unusual, though not unique. It increases recompilation times considerably, but does allow us easily to produce much faster and (just as importantly for applets) smaller code. We monomorphise the whole program and can perform transformations such as inlining, dead-code elimination, arity-raising and known-function call optimisation with no regard for module boundaries, so there is no runtime cost associated with use of the module system.

Appendix A contains an example of JVM code generated by MLJ.

Compilation environment

MLJ can be run entirely in batch mode or as an interactive recompilation environment. Top-level structures and signatures are stored one-per-file, as in Moscow ML (MLJ doesn't implement functors). All compilation is driven by the need to produce one or more named Java class files: for an application this is usually just the class containing the main method, whilst for an applet it is usually a subclass of java.awt.Applet. Once these root classes and their exported names have been specified, the make command compiles, links and optimises the required structures using an automatic dependency analysis. There is a smart recompilation manager, similar to SML/NJ's CM, which ensures that only the necessary structures are recompiled when a file is changed, though the post-link optimisation phase is always performed on the whole program. Typically, the recompile time (relinking all the translated MIL structures from memory, optimising and generating code) is around two thirds of the total initial compile time (which also includes parsing, typechecking and translation into MIL).

During compilation, the compiler not only typechecks the ML code but, if any external Java classes are mentioned, also reads their compiled representations (typically from within the standard classes.zip file) to typecheck and resolve the references.

The Language

MLJ compiles the functor-free subset of SML (including substructures, the new where construct, etc.) plus our non-standard extensions for interlanguage working with Java. A large subset of the new standard Basis Library has been implemented.

The interlanguage features bring Java types and values into ML, whilst enforcing a separation between the two (though a Java type and an ML type may well end up the same in the compiled code). External Java types may be referred to in ML simply by using their Java names in quotation marks. Thus

    "java.awt.Graphics" * int

is the type of pairs whose first component is a Java object representing a graphics context and whose second component is an ML integer. An important subtlety is that whilst in Java all pointer (reference) types implicitly include the special value null, we chose to make their quoted Java type names in ML refer only to the non-null values. Where a value of Java reference type t may be passed from Java to ML code (as the result of an external field access or method call, or as a parameter to a method implemented in ML), that value is given the ML type t option, with the value NONE corresponding to Java null and values of the form SOME(v) corresponding to non-null values. Similarly, we guarantee that any ML value of type t option will be represented by null or an element of the underlying Java class. This complication allows the type system to catch statically what would otherwise be dynamic NullPointerExceptions, and also gives the compiler more freedom in choosing and sharing data representations (see the section on data representation).

The built-in structure Java includes ML synonyms for common Java types and coercions between many equivalent ML and Java types, for example Java.toString, which converts a Java String into an ML string. With one exception (fromWord), these all generate no actual bytecodes and are included just to separate the two type systems securely in the source.

MLJ code can perform Java operations, such as field access, method invocation and object creation, by using a collection of new keywords, all of which start with an underscore (which fits well with the existing lexical structure of SML), and we also use quotation marks for Java field and method names. Thus, for example,

    let type colour = "java.awt.Color"
        val grey = valOf(_getfield colour "gray")
    in Java.toInt(_invoke "getRed" (grey))
    end

makes colour be an ML synonym for the Java class Color and then gets the static field called gray from that class. MLJ reads the compiled Java class file, which states that the field holds an instance of the class Color, and so infers the type "java.awt.Color" option for the _getfield construct, because it's an external value which might be null. We then invoke the virtual method getRed on the returned colour value to get the Java int value of its red component, which we convert into an ML int. The valOf is required to remove the option from the type of the returned value (it raises an exception if its argument is NONE), because MLJ does not allow method invocation on possibly null Java values.
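The null-versus-option discipline can be mimicked in plain Java. The following is a minimal sketch, not MLJ's implementation: the class Colour and the helpers fromJava and valOf are hypothetical names chosen for illustration (Colour stands in for an external class such as java.awt.Color), and java.util.Optional plays the role of ML's option type.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

// Hypothetical stand-in for an external Java class such as java.awt.Color.
final class Colour {
    static final Colour GRAY = new Colour(128, 128, 128); // a static field of class type
    static Colour missing = null;                          // a field that happens to hold null
    private final int red, green, blue;
    Colour(int red, int green, int blue) { this.red = red; this.green = green; this.blue = blue; }
    int getRed() { return red; }
}

final class NullAsOption {
    // A value crossing from Java is given an option type:
    // null becomes NONE (empty), a non-null reference v becomes SOME(v).
    static <T> Optional<T> fromJava(T possiblyNull) {
        return Optional.ofNullable(possiblyNull);
    }

    // Analogue of ML's valOf: strips the option, failing if the value is NONE.
    static <T> T valOf(Optional<T> opt) {
        return opt.orElseThrow(() -> new NoSuchElementException("valOf applied to NONE"));
    }

    // Mirrors the colour example: read a static field, strip the option, invoke a method.
    static int redOfGray() {
        Colour grey = valOf(fromJava(Colour.GRAY));
        return grey.getRed();
    }

    public static void main(String[] args) {
        System.out.println(redOfGray());                          // 128
        System.out.println(fromJava(Colour.missing).isPresent()); // false
    }
}
```

Unlike this sketch, MLJ performs the check in the type system and, as the text notes, its coercions generate no bytecodes; the Optional wrapper here exists only to make the typing rule visible in Java.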

Footnote: "Whole-program compiler" sounds a bit naive, so we prefer to think of it as a "post-link optimiser", which has a much more sophisticated ring.

Footnote: Actually, version [...] departs from the Definition in four places. The only significant one is that arithmetic operations do not raise the Overflow exception.

The new constructs, such as _getfield and _invoke, have essentially the same static semantics as their equivalents in

the Java language. At a virtual method invocation, for example, the compiler searches up the class hierarchy for a method matching the given name and argument types, applying the same rules for finding the most specific method by possible coercions on the arguments as Java does. ML polymorphism isn't allowed to confuse things further, as all uses of these new constructs must be implicitly or explicitly monomorphic. This makes it easy to convert existing Java programs or code fragments into MLJ, but is not intended to make MLJ a fully-fledged object-oriented extension of SML: in particular, inheritance does not induce any subtyping relation on ML types.

MLJ structures can also declare new Java classes, with fields and methods having a mixture of ML and Java types, and methods implemented in ML. In fact, all programs must contain at least one such class for the Java runtime system to call into, otherwise no ML code could ever get called. The extensions for declaring classes can express most of what can be expressed in the Java language, including all the access modifiers (public, private, etc.), though there are some natural restrictions, such as that static fields of ML type or non-option Java reference type must have initialisers (since there are no default values of those types) and overloading on ML types is forbidden (as two ML types may end up represented as the same Java type). Class definitions in structures are all anonymous: there are ML type names bound to them but no Java class names, so a class that is referenced purely by ML code will end up with an internal name in the compiled program, and so not be directly accessible from external Java code. However, as we've already mentioned, at least one top-level class has to be exported with a given Java name. Since exported classes can be accessed from Java, there are some (not entirely trivial) further restrictions concerning the types and access modifiers of their fields and methods, which are necessary to ensure that anything which is visible (either directly or by inheritance) to the external Java world is of Java primitive or optioned Java class type.

Footnote: Unless it uses Java's introspection capabilities, which we consider to be cheating.

Just to give the flavour of how Java classes may be generated from MLJ, the figure below shows the definition of an ML button class similar to one which might form part of a functional GUI toolkit. MLButton subclasses the standard Java Button class and has an instance variable which is of a higher-order ML datatype Behave. When the button is pressed, its action method is called, which causes its behaviour function to be called (presumably with some side effects), returning a new behaviour which is stored in the instance variable ready for the next click. The constructor (for which our syntax is particularly baroque) is called with an ML string and the initial behaviour. It starts by calling the superclass initialiser with a Java String and then initialises the instance variable with the supplied behaviour.

    datatype Behave = B of unit -> Behave
    _classtype MLButton _extends "java.awt.Button"
    {
      _private _field "behaviour" : Behave
      _public _method "action" (e : "java.awt.Event" option, o : Java.Object option) : Java.boolean =
        let val B b = _getfield "behaviour" _this
        in _putfield "behaviour" (_this, b ()); Java.fromBool true
        end
      _constructor (name : string, b : Behave)
        {_super(Java.fromString name); "behaviour" = b}
    }

Figure: Generating a Java class from MLJ

MIL

The Monadic Intermediate Language (MIL) is the heart of the compiler. MIL is a typed language inspired by Moggi's computational lambda calculus, which makes a distinction between computations and values in its type system. It has impredicative System-F-style polymorphism and refines the computation type constructor to include effect information. MIL also includes some slightly lower level features, so that most of the optimisations and representation choices we wish to make can be expressed as MIL-to-MIL transformations. These include not-quite-first-class sequence types (used for multiple-argument/multiple-result functions and flat datatype constructors), Java types (used not just for interlanguage working but also to express representations for ML types) and three kinds of function: local, global and closure.

Typed intermediate languages are now widely accepted to be A Good Thing. Types aid analysis and optimisation, provide a more secure basis for correct transformations, and are required in type-directed transformations such as compilation of polymorphic equality. They also help catch compiler bugs. In our case it seems especially natural, as the Java bytecode which we eventually produce is itself typed.

The use of computational types in the intermediate language is more unusual, though similar systems have recently been proposed, and the use of a monadic intermediate language to express strictness-based optimisations has also been proposed. The separation of computations and values gives MIL a pleasant equational theory (full $\beta$ and $\eta$ rules plus commuting conversions), which makes correct rewriting simpler. Order of evaluation is made explicit (much as it is in CPS-based intermediate representations), and our refinement of computation types into different monads gives a unified framework for effect analysis and associated transformations.
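To make the idea of effect-refined computation types a little more concrete, here is a minimal Java sketch. It is an illustration under stated assumptions rather than MLJ's data structures: the names Effect, CompType, isSubtypeOf and isDiscardable are hypothetical, result types are left out, and the four effect names are the ones listed in the figure of MIL types. The discardability check corresponds to the side condition of the dead-let rewrite given in the section on simplification.

```java
import java.util.EnumSet;
import java.util.Set;

// Effect names as listed in the figure of MIL types.
enum Effect { THROWS, READS, WRITES, ALLOCS }

// A computation type annotated with a set of effects; result types are omitted in this sketch.
final class CompType {
    final Set<Effect> effects;
    CompType(Set<Effect> effects) { this.effects = effects; }

    // Subtyping induced by the subset relation on effects:
    // a computation with fewer effects may be used where more are permitted.
    boolean isSubtypeOf(CompType other) {
        return other.effects.containsAll(this.effects);
    }

    // Side condition of the dead-let rewrite: an unused binding may be dropped
    // only if the effects of the bound computation are contained in {allocs, reads}.
    boolean isDiscardable() {
        return EnumSet.of(Effect.ALLOCS, Effect.READS).containsAll(effects);
    }
}

final class EffectDemo {
    public static void main(String[] args) {
        CompType pure = new CompType(EnumSet.noneOf(Effect.class));
        CompType alloc = new CompType(EnumSet.of(Effect.ALLOCS));
        CompType mayThrow = new CompType(EnumSet.of(Effect.ALLOCS, Effect.THROWS));
        System.out.println(pure.isSubtypeOf(alloc));      // true
        System.out.println(mayThrow.isSubtypeOf(alloc));  // false
        System.out.println(alloc.isDiscardable());        // true
        System.out.println(mayThrow.isDiscardable());     // false
    }
}
```

Representing the annotation as a set makes both the subtyping test and the rewrite side conditions simple subset checks, which is one plausible reading of why a single effect-indexed type constructor can serve as a unified framework.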

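Types and terms

Types in MIL are divided into value types (ranged over by $\tau$) and computation types (ranged over by $\gamma$), as shown in the figure "Types in MIL". Value types include base types (int, char, etc.), products, function types with multiple arguments and results, sum types with multiple-argument summands, reference types, polymorphic types and recursive types. Computation types have the form $T_\varepsilon(\vec{\tau})$, indicating a computation with result types $\vec{\tau}$ and effect $\varepsilon$. The subset relation on effects induces a subtyping relation on types in an unsurprising way. Note that because we are only interested in compiling a call-by-value language, we have restricted function types to be from values to computations.

$$
\begin{array}{rcll}
\tau & ::= & \mathsf{int} \mid \mathsf{char} \mid \cdots & \text{base types} \\
& \mid & \tau_1 \times \cdots \times \tau_n & \text{product} \\
& \mid & \vec{\tau}_1 + \cdots + \vec{\tau}_n & \text{sum} \\
& \mid & \vec{\tau} \to \gamma & \text{function type} \\
& \mid & \tau\ \mathsf{ref} & \text{reference type} \\
& \mid & \mathsf{exn} & \text{exception} \\
& \mid & \forall t.\,\tau & \text{polymorphic type} \\
& \mid & \mu t.\,\tau & \text{recursive type} \\
\vec{\tau} & ::= & \langle \tau_1, \ldots, \tau_n \rangle & \text{vector of types} \\
\gamma & ::= & T_\varepsilon(\vec{\tau}) & \text{computation type} \\
\varepsilon & \subseteq & \{\mathsf{throws}, \mathsf{reads}, \mathsf{writes}, \mathsf{allocs}\} & \text{effects}
\end{array}
$$

Figure: Types in MIL

Terms are also divided into values (ranged over by $V$) and computations (ranged over by $M$), as shown in the figure "Terms in MIL". All evaluation happens in let or try. The Moggi let construct

$$\mathsf{let}\ \vec{x} \Leftarrow M_1\ \mathsf{in}\ M_2$$

evaluates the computation term $M_1$ and binds the result to $\vec{x}$ in the scope of the computation term $M_2$. The construct

$$\mathsf{try}\ \vec{x} \Leftarrow M_1\ \mathsf{handle}\ y \Rightarrow M_2\ \mathsf{in}\ M_3$$

generalises this by providing a handler $M_2$ in which the exception raised by $M_1$ is bound to a variable $y$ and the continuation $M_3$ is not evaluated. It is interesting to note that this construct cannot be defined using let and the more conventional $M_1\ \mathsf{handle}\ y \Rightarrow M_2$ without recourse to a value of sum type. The construct $\mathsf{val}\ \vec{V}$ allows a value or multiple values to be treated as a trivial computation.

$$
\begin{array}{rcll}
V & ::= & x,\ y,\ f & \text{value variables} \\
& \mid & c & \text{constant of base type} \\
& \mid & (V_1, \ldots, V_n) & \text{tuple} \\
& \mid & \mathsf{in}_i\,\vec{V} & \text{injection into sum} \\
& \mid & \mathsf{in}_E\,\vec{V} & \text{injection into exception} \\
& \mid & \Lambda t.\,V & \text{type abstraction} \\
& \mid & V\,\tau & \text{type application} \\
& \mid & \mathsf{fold}\ V & \text{recursive type introduction} \\
& \mid & \mathsf{unfold}\ V & \text{recursive type elimination} \\
& \mid & \pi_i\,V & \text{projection} \\
M & ::= & \mathsf{val}\ \vec{V} & \text{trivial computation} \\
& \mid & \mathsf{let}\ \vec{x} \Leftarrow M_1\ \mathsf{in}\ M_2 & \text{evaluation} \\
& \mid & V\ \vec{V} & \text{function application} \\
& \mid & \mathsf{ref}\ V & \text{reference creation} \\
& \mid & {!V} & \text{dereferencing} \\
& \mid & V := V & \text{assignment} \\
& \mid & \mathsf{raise}\ V & \text{throw an exception} \\
& \mid & \mathsf{try}\ \vec{x} \Leftarrow M_1\ \mathsf{handle}\ y \Rightarrow M_2\ \mathsf{in}\ M_3 & \text{evaluate and catch} \\
& \mid & \mathsf{case}\ V\ \mathsf{of}\ \mathsf{in}_1\,\vec{x}_1 \Rightarrow M_1 \mid \cdots \mid \mathsf{in}_n\,\vec{x}_n \Rightarrow M_n & \text{case on sum} \\
& \mid & \mathsf{case}\ V\ \mathsf{of}\ \mathsf{in}_{E_1}\,\vec{x}_1 \Rightarrow M_1 \mid \cdots \mid \mathsf{in}_{E_n}\,\vec{x}_n \Rightarrow M_n \mid \_ \Rightarrow M & \text{case on exception} \\
& \mid & \mathsf{case}\ V\ \mathsf{of}\ c_1 \Rightarrow M_1 \mid \cdots \mid c_n \Rightarrow M_n \mid \_ \Rightarrow M & \text{case on base type} \\
& \mid & \mathsf{let}\ \vec{t}.\ f_1\,\vec{x}_1 = M_1\ \cdots\ f_n\,\vec{x}_n = M_n\ \mathsf{in}\ M & \text{recursive function declaration}
\end{array}
$$

Figure: Terms in MIL

Footnote: Some simplifications have been made for this presentation. In particular, the implementation has [...] over multiple types; also, lower-level features that capture Java representations are omitted.

Translation from SML

The initial translation from typed SML to MIL is essentially Moggi's call-by-value translation. For example, a source application $exp_1\ exp_2$ translates to

$$\mathsf{let}\ x \Leftarrow M_1\ \mathsf{in}\ \mathsf{let}\ y \Leftarrow M_2\ \mathsf{in}\ x\ y$$

where $M_1$ and $M_2$ are the translations of $exp_1$ and $exp_2$. The translation also expands out certain features of the source language that are not present in the target: patterns are compiled into flat case constructs, records are translated as ordinary tuples with fields sorted by label, and even the structure constructs of the SML module language are compiled into ordinary tuples of values. The latter relies on the impredicative polymorphism in MIL. An alternative would be to stratify the intermediate language in a similar way to the stratification of SML into core and module levels.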
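The application case of the call-by-value translation can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the classes Exp, Var, App and CbvTranslator are hypothetical, the source language is cut down to variables and applications, MIL terms are produced as plain text with `<=` standing for the binding arrow, and fresh names x0, x1, ... are assumed not to clash with source variables.

```java
// Source expressions: just variables and applications, enough for the example.
abstract class Exp {}

final class Var extends Exp {
    final String name;
    Var(String name) { this.name = name; }
}

final class App extends Exp {
    final Exp fun, arg;
    App(Exp fun, Exp arg) { this.fun = fun; this.arg = arg; }
}

final class CbvTranslator {
    private int next = 0;

    // Generates x0, x1, ...; assumed not to occur in the source program.
    private String fresh() { return "x" + (next++); }

    // Returns the MIL computation for e, printed as text.
    String translate(Exp e) {
        if (e instanceof Var) {
            // A variable is a value, so it becomes a trivial computation.
            return "val " + ((Var) e).name;
        }
        App app = (App) e;
        String m1 = translate(app.fun);
        String m2 = translate(app.arg);
        String f = fresh();
        String a = fresh();
        // Evaluate the function, then the argument, then apply: the order is explicit.
        return "let " + f + " <= (" + m1 + ") in let " + a + " <= (" + m2 + ") in " + f + " " + a;
    }

    public static void main(String[] args) {
        Exp e = new App(new Var("f"), new Var("v"));
        // Prints: let x0 <= (val f) in let x1 <= (val v) in x0 x1
        System.out.println(new CbvTranslator().translate(e));
    }
}
```

The output for a nested application contains a let in binding position, which is the kind of term that commuting conversions such as let-let-cc would later flatten.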

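Uses of polymorphic equality are also compiled away, using a simple dictionary-passing style. One might expect that equality would be an ideal candidate for exploiting Java's virtual methods, overriding the equals method in classes representing ML types. But that would prevent us sharing representations between ML types with different equality functions, so doesn't seem worth doing.

Transformations

Most of the compiler's time is spent applying a set of MIL-to-MIL transformations to the term representing the whole program. Some of these are considered to be general-purpose simplification steps and are applied at several stages of the compilation, whilst others perform specific transformations such as arity-raising functions.

Most of the transformations are obviously meaning-preserving rewrites, but the tricky part is deciding when they should be applied. At the moment most of these decisions are taken on the basis of some simple rules, rather than as the result of any sophisticated analysis. Some of these rules are type-directed, whereas others involve simple properties of terms, such as the size of a subterm or number of occurrences of a variable. The effect inference is currently rather naive, particularly with regard to recursive functions and datatypes, so there are a very small number of places in the basis where we have annotated computations as pure so that they may be dead-coded in programs in which they are not referenced.

Footnote: Not that we're claiming to have justified them formally with respect to a semantics for the full language.

Simplification

The most basic of the transformations are essentially just the pure $\beta$ and $\eta$ reductions and the commuting conversions one obtains from the proof theory of the computational lambda calculus, adapted so that large or allocating terms are not duplicated (see the figure "Some proof-theoretic rewrites"). The reductions are genuine simplifications, whilst the commuting conversions are reorganisations which tend to expose further reductions. MLJ applies the commuting conversions exhaustively, to obtain a CC-normal form from which code generation is particularly straightforward.

case-beta:
$$\mathsf{case}\ \mathsf{in}_i\,\vec{V}\ \mathsf{of}\ \mathsf{in}_1\,\vec{x}_1 \Rightarrow M_1 \mid \cdots \mid \mathsf{in}_n\,\vec{x}_n \Rightarrow M_n \;\longrightarrow\; \mathsf{let}\ \vec{x}_i \Leftarrow \mathsf{val}\ \vec{V}\ \mathsf{in}\ M_i$$

let-eta:
$$\mathsf{let}\ \vec{x} \Leftarrow M\ \mathsf{in}\ \mathsf{val}\ \vec{x} \;\longrightarrow\; M$$

let-let-cc:
$$\mathsf{let}\ \vec{x}_1 \Leftarrow (\mathsf{let}\ \vec{x}_2 \Leftarrow M_2\ \mathsf{in}\ M_1)\ \mathsf{in}\ M_3 \;\longrightarrow\; \mathsf{let}\ \vec{x}_2 \Leftarrow M_2\ \mathsf{in}\ \mathsf{let}\ \vec{x}_1 \Leftarrow M_1\ \mathsf{in}\ M_3$$

let-case-cc:
$$\mathsf{let}\ \vec{y} \Leftarrow (\mathsf{case}\ V\ \mathsf{of}\ \mathsf{in}_1\,\vec{x}_1 \Rightarrow M_1 \mid \cdots \mid \mathsf{in}_n\,\vec{x}_n \Rightarrow M_n)\ \mathsf{in}\ M$$
$$\longrightarrow\; \mathsf{let}\ f(\vec{y}) = M\ \mathsf{in}\ \mathsf{case}\ V\ \mathsf{of}\ \mathsf{in}_1\,\vec{x}_1 \Rightarrow (\mathsf{let}\ \vec{y} \Leftarrow M_1\ \mathsf{in}\ f(\vec{y})) \mid \cdots \mid \mathsf{in}_n\,\vec{x}_n \Rightarrow (\mathsf{let}\ \vec{y} \Leftarrow M_n\ \mathsf{in}\ f(\vec{y}))$$

Figure: Some proof-theoretic rewrites

Doing this sort of heavy rewriting on a large term can be expensive, particularly when it is done functionally, as the heap turnover is then very high. Our current simplifier uses a quasi-one-pass algorithm similar to that described by Appel and Jim: it maintains an environment of variable bindings, a census to count variable occurrences, and a stack of evaluation contexts to perform commuting conversions efficiently. This algorithm is several times faster than our first version, but simplification still ends up being the most expensive phase because it is repeated at several stages during compilation: the total time spent in the simplifier is typically around half the recompile time.

The validity of certain rewrites depends on effect information in the types. Some of these are shown in the figure "Some rewrites dependent on effect information".

dead-let:
$$\mathsf{let}\ \vec{x} \Leftarrow M_1\ \mathsf{in}\ M_2 \;\longrightarrow\; M_2 \quad \text{if } \vec{x} \text{ not free in } M_2 \text{ and } M_1 : T_\varepsilon(\vec{\tau}) \text{ for } \varepsilon \subseteq \{\mathsf{allocs}, \mathsf{reads}\}$$

dead-try:
$$\mathsf{try}\ \vec{x} \Leftarrow M_1\ \mathsf{handle}\ y \Rightarrow M_2\ \mathsf{in}\ M_3 \;\longrightarrow\; \mathsf{let}\ \vec{x} \Leftarrow M_1\ \mathsf{in}\ M_3 \quad \text{if } M_1 : T_\varepsilon(\vec{\tau}) \text{ where } \mathsf{throws} \notin \varepsilon$$

hoist-let:
$$\mathsf{let}\ \vec{y} \Leftarrow M\ \mathsf{in}\ \mathsf{case}\ V\ \mathsf{of}\ \mathsf{in}_1\,\vec{x}_1 \Rightarrow M_1 \mid \cdots \mid \mathsf{in}_n\,\vec{x}_n \Rightarrow M_n$$
$$\longrightarrow\; \mathsf{case}\ V\ \mathsf{of}\ \mathsf{in}_1\,\vec{x}_1 \Rightarrow M_1 \mid \cdots \mid \mathsf{in}_i\,\vec{x}_i \Rightarrow (\mathsf{let}\ \vec{y} \Leftarrow M\ \mathsf{in}\ M_i) \mid \cdots \mid \mathsf{in}_n\,\vec{x}_n \Rightarrow M_n$$
$$\text{if } \vec{y} \text{ free only in } M_i \text{ and } M : T_\varepsilon(\vec{\tau}) \text{ for } \varepsilon \subseteq \{\mathsf{allocs}, \mathsf{reads}\}$$

Figure: Some rewrites dependent on effect information

Polymorphism

Most implementations of SML compile parametric polymorphism by boxing, that is, by ensuring that values of type $t$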

inside a value of type $\forall t.\tau$ are represented uniformly by a pointer (they are boxed). In Java, the natural way to box objects is by a free cast to Object, and the natural way to box primitive types is to create a heap-allocated wrapper object and cast that to Object. Unboxing then involves using Java's checkcast bytecode to cast back down again and, in the case of primitive types, extracting a field. Done naively, this kind of boxing can be extremely inefficient, and there are a number of papers which address the question of how to place the coercions to reduce the cost of boxing. For an early version of our compiler, we implemented a recent and moderately sophisticated graph-based algorithm for coercion placement.

Whilst the graph-based algorithm worked fairly well, we have more recently taken a more radical and straightforward approach: removing polymorphism entirely. This is possible firstly because we have the whole program to work with, and secondly because it is a property of Standard ML that the finite number of types at which a polymorphic function is used can be determined statically. In languages with polymorphic recursion, such as recent versions of Haskell, this property does not hold: the types at which a function is used may not be known until runtime.

Each polymorphic function is specialised to produce a separate version for each type instance at which it is used. In the worst case, this can produce an exponential blowup in code size, but our experience is that this does not actually happen in practice. Three reasons can be cited for this. First, we specialise not with respect to source types but with respect to the final Java representations, for which much sharing of types occurs. For example, suppose that the filter function is used on lists containing elements of two different datatypes, both of which are represented by the universal sum class discussed in the section on sums. Then only one version of filter is required. Second, boxing and unboxing coercions introduce a certain amount of code blowup of their own, and this is avoided. Third, polymorphic functions tend to be small, so the cost of duplicating them is often not great. Indeed, not only are many polymorphic functions inlined away prior to monomorphising, but after monomorphising it is often the case that a particular instance of a function is used only once and is consequently inlined and subject to further simplification.

Footnote: This can itself be regarded as a kind of boxing, since we're using uniform shared representations for certain ML types, but note that we never box primitive types.

Arity raising

Heap allocation is expensive. We try to avoid creating tuples and closures by replacing curried functions and functions accepting tuples with functions taking multiple arguments, and to flatten sums-of-products datatypes by using multiple-argument constructors. We also remove the type unit (nullary product) entirely. These transformations should ideally be driven by information about how values are used, but we find that fairly simple-minded application of type isomorphisms such as the following

$$\langle \tau_1, \tau_2 \times \tau_3 \rangle \to \gamma \;\cong\; \langle \tau_1, \tau_2, \tau_3 \rangle \to \gamma$$
$$\langle \tau_1 \rangle \to T_{\emptyset}\langle \langle \tau_2 \rangle \to \gamma \rangle \;\cong\; \langle \tau_1, \tau_2 \rangle \to \gamma$$
$$T_\varepsilon\langle \mathsf{unit} \rangle \;\cong\; T_\varepsilon\langle\rangle$$

combined with the other rewrites, produces a significant improvement in most programs. For example, the Boyer-Moore benchmark runs in [...] seconds with all optimisation enabled but takes [...] seconds when tuple-argument arity raising is turned off. Worse, it crashes with a stack overflow when decurrying is also disabled, because MLJ is then unable to use a goto instruction in place of a method call.

Observe that decurrying depends upon effect information in the types (including termination), for otherwise it would be unsound.

Data representation

ML base types

Most ML base types have close Java equivalents, so ML ints are represented by Java ints and ML strings are represented as Java Strings. There are a couple of small differences in the semantics of these types or their operations which have led us to diverge from the ML definition: integer arithmetic does not raise Overflow and, for us, the Char and String structures are the same as WideChar and WideString, since Java is based on Unicode.

Products

Each distinct product type $\tau_1 \times \cdots \times \tau_n$ is represented by a different class $P_i$ whose fields have types $\tau_1, \ldots, \tau_n$ (see the class hierarchy in the figure "Java classes representing MIL types"). Because we monomorphise, representations can be shared by sorting the fields by type (for example, using the same class for the type int * string as for string * int), but the current version of the compiler doesn't do this yet.

[Class hierarchy diagram. Class names appearing in it: Object at the root; P_1 ... P_n, Exception and G; F, S and E; and, at the bottom level, F_1 ... F_m, S_1 ... S_s and the subclasses of E.]

Figure: Java classes representing MIL types

Sums

The natural object-oriented view of a sum type $\vec{\tau}_1 + \cdots + \vec{\tau}_n$ is to represent the $n$ summands as $n$ subclasses of a single class, and then to use method lookup in place of case. For example, one would represent lists by a class List with subclasses Nil, with no fields, and Cons, with a field for the head and a field for the tail. The length function would compile to a method whose implementation in Nil simply returns zero, and whose implementation in Cons returns the successor of the result of invoking the method on the tail.

Whilst this technique is elegant, it is not necessarily efficient. Typical functional code would generate a large number of small methods; moreover, it is necessary to pass in free variables of the bodies of case constructs as arguments to the methods.

If the JVM had a classcase bytecode then it would be possible to use this instead of method invocation. Unfortunately it does not: one must check the classes one at a time using instanceof and then cast down using checkcast. A variation on this idea is to store an integer tag in the superclass, and to implement case constructs as a switch on the tag followed by a cast down to the appropriate subclass. We take this a stage further, using a single superclass $S$ for all sum types, with a subclass $S_i$ for each type of summand. This reduces the number of classes required for summands; moreover, because only a single class is used to represent most sum types occurring in the source program, further sharing of representations is obtained in types constructed from sum types.

This universal sum scheme is used in general, but for two special cases we use more efficient representations, as described below.

[...] of ML exceptions. In contrast to sums, we do not use the same class for exception constructors whose argument types are the same. ML's handle construct then fits better with the JVM try-catch construct, where the class of the exception is used to determine a block of code to execute. For generative exception declarations that appear inside functions, we generate a fresh integer count from a global variable and store this in a field in the exception constructor object. This field is tested against in exception handlers and case constructs.

Exceptions also give a nice anecdotal example of the sort of low-level Java-specific tweaking that we've found to be necessary in addition to high-level optimisations. In an early version of the compiler, we noticed that certain programs which used exceptions as a control-flow mechanism ran hundreds of times more slowly than we would have expected. The problem was tracked down to a feature of Java: whenever an exception is created, a complete stack trace is computed and stored in the object. The solution was simply to override the fillInStackTrace method in the ML exception class so that no stack trace is stored.
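The stack-trace fix is easy to reproduce in ordinary Java. The sketch below is an illustration, not MLJ's generated code: the class names MLException, NotFound and ExceptionDemo are hypothetical, and RuntimeException is used as the base class only to keep the example free of throws clauses.

```java
// Hypothetical base class for ML exceptions; the name MLException is an assumption.
class MLException extends RuntimeException {
    // Throwable's constructor calls fillInStackTrace, which walks the stack.
    // Returning 'this' without calling super skips that work entirely.
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this;
    }
}

// One class per exception constructor, so that a JVM catch clause can select on the class.
final class NotFound extends MLException {}

final class ExceptionDemo {
    // Uses an exception purely for control flow, in the style of ML's 'handle'.
    static int findIndex(int[] xs, int key) {
        try {
            for (int i = 0; i < xs.length; i++) {
                if (xs[i] == key) return i;
            }
            throw new NotFound();
        } catch (NotFound e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(findIndex(new int[] {3, 1, 4}, 4));       // 2
        System.out.println(findIndex(new int[] {3, 1, 4}, 9));       // -1
        System.out.println(new NotFound().getStackTrace().length);   // 0: no trace was recorded
    }
}
```

The trade-off is that such exceptions carry no debugging information, which is acceptable when they serve only as a control-flow mechanism.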

- Enumeration types: for datatypes whose constructors are all nullary, we use the primitive integer type.

- 1+τ types: In Java, variables of reference type contain either a valid reference to an object or array, or the value null. We make use of this for types of the form ⟨⟩ + τ, where τ can itself be represented by a non-null class or array type. For example, the SML type (int*int) option is implemented using the same class as would be used for int*int, with NONE represented by null. The type int list is implemented by a single product class with fields for the head (an integer) and tail (the class itself), with nil represented by null.

  If τ is primitive then a product class is used to first wrap the primitive value. What if τ is itself of the form ⟨⟩ + τ′? Then we create an additional dummy value of appropriate class type to represent the extra value in the type. For example, a value SOME x with the SML type int list option is represented in the same way as would be x of type int list, but NONE is an extra value created and stored in a global variable when the class is initialised.

Reference types

In general a reference type τ ref is simply represented by a unary product class. For references created at top-level (i.e. not inside functions) that are not used in a first-class way (assigned to and dereferenced, but not compared for equality or passed around) we use static fields in a distinguished class G, in other words, global variables. In the future we hope to perform escape analysis in order to use local variables for reference types where possible.

Exceptions

In SML the type exn has a special status in that it is extensible. Exception declarations create fresh, distinguishable exception constructors; in the operational semantics this is formalised by the creation of fresh names. However, for exceptions declared outside of functions there will be a fixed, finite set of names that can be determined at compile-time. We exploit this by representing each such exception constructor by a separate class E_i that subclasses E, the class

Functions

As mentioned earlier, functions in MIL are divided into locals, globals and closures. These are used as follows.

- Functions that only ever appear in tail application positions sharing a single continuation can be compiled inline as basic blocks. Function application is compiled as a goto bytecode. (Incidentally, this is one good reason for compiling to JVM bytecodes rather than Java source: the goto instruction is not available in Java.) For example:

      let
        val f = fn x => somecode
      in
        if z then f(z) else f(w, z)
      end

  The body of f is simply compiled as a block of code and the two calls to f are compiled as jumps.

  It is sometimes even possible to transform a non-tail function application into a tail one. Consider the following fragment of ML:

      let
        val f = fn x => somecode
        val y = if z then f(z) else f(w, z)
      in
        somemorecode
      end

  Assuming that f does not appear in the expression somemorecode, then this continuation can be moved into the definition of f and the calls to f implemented by jumps.

- Other functions appearing only in application positions are compiled as static Java methods in a distinguished class G. Function application is implemented by the invokestatic bytecode, unless it is a recursive tail call to itself, in which case goto is used.

- The remaining functions are used in a higher-order way and so must be compiled as closures. There are a number of ways that this can be achieved. The most obvious is to generate for each function type an abstract

  class with an abstract app method and then to subclass this for each closure of that type, storing the free variables as instance variables in the object. This is rather wasteful of classes, using one per function type and closure appearing in the program.

  We currently use a different and at first sight rather alarming scheme. Instead of a single app method, we use different method names for different function types. There is a single superclass F of all functions, with dummy methods for each possible app method. Then closures with the same types of free variables but different function types share subclasses of F. It is even possible for the actual closure objects to be shared if their free variables are the same but app methods are different. For example:

      fun f (x:string) =
      let
        val f1 = fn (y:int) => (x, y)
        val f2 = fn (z:string) => (x, z)
      in
      end

  If closures are required for f1 and f2 then they can share the same closure class, as their free variable types are the same but function types are different. Moreover, the values of their free variables are the same, so the same object can be used for both, saving an allocation.

A simple flow analysis is used to decide how each function should be compiled. A more sophisticated flow analysis would not only allow us to identify more known functions but would also refine our type-based partitioning of application methods, allowing more sharing of classes between closures.

BBC

The Basic Block Code (BBC) is a static single-assignment representation of the operations available in the Java virtual machine, which abstracts away from certain instruction selection details, including the distinction between stack and local variables. Normal form MIL (with all commuting conversions applied) is translated into BBC (which also includes some information about effects and which object fields are mutable) and the backend then orders and selects instructions to turn that into real bytecode. Currently the backend constructs a dependency DAG from the BBC and then works from the top down, using the stack where possible but mostly storing intermediate results in local variables where they are used more than once or where ordering constraints make their immediate use on the stack impossible. After that there is a pass in which local variable numbers are reassigned so that the number of copies between basic blocks is minimised, combined with a standard register-colouring phase in which we try to minimise the total number of local variables used.

This scheme produces code which is respectable but far from optimal. Data passed between basic blocks is never left on the stack, but is instead passed via local variables. Within basic blocks themselves, the JVM's stack is only used in a fairly simple-minded way. This causes too many local variables to be used and leads to code being somewhat larger than we would like. Heavy optimisation of the backend is probably not justified from the point of view of execution speed (given that the bytecodes will usually be recompiled by a JIT-compiling virtual machine which does its own independent mapping of stack locations and local variables to registers and memory), but we are also keen to reduce the size of the bytecodes. We are currently developing an improved backend which will make much more intelligent use of the stack.

Current Status and Performance

MLJ currently comprises about lines of SML (written using SML/NJ version ), plus the Basis library code. It is freely available over the web at http://research.persimmon.co.uk/mlj as an SML/NJ heap image for Solaris, Win32, Linux and Digital Unix with the Basis code compiled in.

Although there is scope for further improvement, MLJ is already useful in real applications. Internal projects at Persimmon using MLJ include:

- Writing functional SGML/XML stylesheets which can be downloaded into web browsers or run on servers. This involves a lot of interlanguage working, including with Javascript (using Netscape's and Microsoft's browser-specific Java classes) and with third-party Java XML parsers.

- Implementing a graphical functional language for filtering and classifying events from web servers. This involves interworking with a third-party graph editor written in Java.

We have a number of nice demonstrations, including Paulson's Hal theorem prover for first-order logic (compiled with some third-party Java terminal code to produce an applet), several functional programs with graphical user interfaces and some which access an Oracle database via Java's JDBC API. The largest program which we have successfully compiled is around lines: a compiler for ASN.1 producing C.

Compile times

The compile times for a range of standard SML benchmark programs are shown in Table 1. All timings were taken on a MHz Pentium Pro with MB of RAM running Windows NT. We have compared a recent (July) internal version of MLJ (e) with Moscow ML and SML/NJ.

Table 1: Compile times (seconds) for MLJ, Moscow ML and SML/NJ on the benchmarks Nfib, Quicksort, Life, Knuth-Bendix, Mandelbrot, Boyer-Moore and FFT.

Code size

Table 2 lists the sizes of compiled code produced by the three compilers. To obtain a roughly fair comparison, each excludes the run-time system. For MLJ, the total size of the class files is given, so this excludes the Java interpreter required to run it. For Moscow ML, the size of the bytecode file alone is given, so this excludes the camlrunm interpreter required to run it. For SML/NJ, the size of the Windows heap image produced by exportFn is given, so this excludes the run.x86-win32 runtime system required to run it.

Table 2: Code size comparisons (kilobytes) for MLJ, Moscow ML and SML/NJ on the benchmarks Nfib, Quicksort, Life, Knuth-Bendix, Mandelbrot, Boyer-Moore and FFT.

Run times

Some preliminary benchmark times are shown in Table 3. All timings were performed on the same machine as the compilation benchmarks and we again compare MLJ (e) with Moscow ML and SML/NJ. The run times do not include start-up time for the run-time system.

Four different Java implementations were used to run the code compiled by MLJ:

- java NT: the latest version (beta) of Sun's Java Development Kit running under Windows NT, with Symantec's JIT (x) enabled.

- jview NT: the latest version of Microsoft's JIT compiler (build) running under Windows NT.

- kaffe Linux: the latest version of Tim Wilkinson's Kaffe JVM with JIT enabled, running under RedHat Linux.

- java Linux: Steve Byrne's port of Sun's interpreting JVM running under RedHat Linux.

To illustrate the effect that initial heap size can have on performance, Sun's JVMs were tested twice: firstly with the default initial heap of MB and secondly with the heap starting at MB.

Interpretation of the results

As usual, the details of these small-program benchmark figures should be treated with some scepticism, but it's possible to make some broad generalisations.

The first thing to note is that MLJ compile times are very high: between and times slower than Moscow ML and between and times slower than SML/NJ, though it's worth reiterating that the recompile times (which are the important numbers for software development) are typically a third less than the total compile times given here. But it's hardly surprising that extensive functional rewriting of the whole program turns out to be a costly compilation technique: if SML/NJ is given a whole program as a single file then its compile times are often higher than MLJ's. Our intermediate language certainly uses more space than a more traditional untyped lambda calculus would, firstly because we are carrying types around and secondly because the computational lambda calculus translations are inherently more verbose. This slows compilation by increasing heap turnover. The current parser also contributes to long compile times, as it uses parser combinators rather than being table-driven.

Secondly, Java Virtual Machines vary widely in performance. A good JIT compiler produces significant speedups, but the current state of the art is that the fastest JITs also have bugs. Microsoft's Win32 JIT is generally quite fast but has a fundamental bug that sometimes causes operations to be unsoundly reordered. Luckily, we have been able to identify the problem sufficiently precisely to add a compiler option to produce slightly less efficient code which avoids the bug. The current version of Symantec's JIT is sometimes very good, but has a number of serious bugs which often prevent it from running our code. These problems indicate a pragmatic (though, we hope, temporary) drawback of clever compilation of other languages to Java bytecodes: most Java compilers produce fairly naive, stylised bytecodes, whereas MLJ produces rather more contorted bytecodes which, whilst perfectly legal according to the JVM specification, could not have been produced by any Java compiler. This tends to uncover bugs which have not been found by JVM implementers who have only tested against the output of existing Java compilers.

In general, MLJ code run with a JIT compiler tends to have particularly good performance (even better than SML/NJ) on heavily numeric benchmarks such as Nfib, Mandelbrot and FFT. This is unsurprising, as our monomorphisation should allow such code to be easily translated by a JIT into much the same code as would be generated by a naive C compiler. More typical functional code which does a lot of heap allocation (e.g. Quicksort, Boyer-Moore and Life) tends to run rather more slowly, showing that storage management in JVMs is still fairly poor; we suspect that the fact that increasing the initial heap size makes a significant difference to Quicksort, Knuth-Bendix and Boyer-Moore (but not to Life) when running on the Sun/Symantec JVMs indicates inefficient heap expansion rather than just slow garbage collection per se. The comparison with SML/NJ on the Life benchmark is particularly interesting: taking the benchmark as originally written, our whole-program optimisation allows us to specialise representations (including uses of polymorphic equality) and run up to times faster than SML/NJ. However, when the program is constrained with a minimal signature, SML/NJ can also specialise and runs nearly times faster than the best MLJ can manage. The particularly poor performance of Microsoft's JIT (and unimpressive performance of the others) on the Knuth-Bendix benchmark seems to be due to the fact that, as well as doing a good deal of allocation, it makes heavy use of exceptions.

The code produced by MLJ is impressively compact, but not quite as small as that produced by Moscow ML. Although Moscow has the advantage of a bytecode specifically designed for ML, we expect to be able to narrow the gap in future. We have already mentioned ongoing improvements to our backend, but just as significant is the non-trivial space overhead associated with the fairly large number of distinct Java classes produced by MLJ. For example, the Knuth-Bendix benchmark produces a total of classes and over

K of the total size of K is taken up by the product, sum, exception and F classes, each of which contains essentially no real code. There is certainly scope for improving our representation choices to decrease code size still further.

Table 3: Run times (seconds). Columns: MLJ under each Java implementation (java NT, jview NT, kaffe Linux and java Linux, with Sun's JVMs at both initial heap sizes), Moscow ML and SML/NJ. Rows, with the footnote marks that survive: Nfib; Quicksort † ‡ ‡; Life; Knuth-Bendix †; Mandelbrot ‡ ‡; Boyer-Moore ‡; FFT.
† requires compilation with MLJ's microsoftbug switch set to avoid a bug in the Microsoft JIT
‡ program crashed due to a bug in the Symantec JIT; timing is with JIT disabled
SML/NJ gave incorrect results
timing improves to seconds when top-level structure is constrained by minimal signature

Conclusions and Further Work

MLJ is a very useful tool: a compiler for a popular functional language which produces compact, highly portable code with reasonable performance and which has unusually powerful and straightforward access to a large collection of foreign libraries and components. But the reasonable performance has only been achieved at the price of high compile times and a limitation on the size of programs which may reasonably be compiled. Our decision to do whole-program optimisation is certainly controversial, so it seems worth trying to give some explicit justification:

- Most importantly, because of the relative inefficiency of current JVMs and the difficulty of mapping ML to Java bytecodes, it was simply the only way to achieve what we considered to be adequate performance.

- The limitation on program size is just not a problem for many real applications. The number of SML programs which are more than, say, lines long is actually rather small, as one can do an awful lot in that much SML. (We have had it suggested to us that a compiler which cannot compile itself is useless, but this is clearly nonsense.)

- Trends towards component architectures, interlanguage development, dynamic linking and distributed systems mean that the large monolithic application is becoming less common. Of course, component boundaries reintroduce the problems of separate compilation in a worse form, but that's all the more reason to compile each component as well as possible.

- Completely separate compilation at the granularity of modules introduced for software engineering purposes is an anachronism for which high-level languages can pay dearly (as the earlier discussion of SML/NJ's performance on the Life benchmark indicates). More realistically, we have ourselves doubled the speed of parts of MLJ simply by manually demodularising the code.

- There is a whole range of approaches between completely naive whole-program compilation and simple-minded separate compilation, and whilst the optimum lies somewhere in the middle, the two extremes are much the easiest for the compiler writer. As MLJ develops (caching more information about each module and sacrificing some rewrites to be able to handle larger programs) and other compilers add ever more complex intermodule optimisations, we expect them to come much closer together.

It is interesting to compare MLJ with Wadler and Odersky's Pizza. We have started with a standard functional language and added extensions to support interlanguage working with Java, whereas Pizza starts with Java and adds some functional features such as pattern matching and parameterised types. We are aware of two other attempts to compile SML into Java, one by Bertelsen (based on Moscow ML) and the other by Walton at Edinburgh. Both of these use uniform representations and do not perform anything like the same level of optimisation as MLJ. Wakeling has also compiled Haskell into Java bytecode, with disappointing results (large code and performance considerably slower than the Hugs interpreter).

The other main attempt to compile a mainly functional language into Java bytecode is Bothner's Kawa compiler for Scheme. Kawa has an interactive top-level loop and compiles bytecodes dynamically, but has to use uniform representations. Some informal tests indicate that Kawa typically runs an order of magnitude more slowly than MLJ and tends to run out of memory or stack space much earlier than MLJ.

We are currently developing concurrency extensions for MLJ which are built on top of Java's built-in threads and, looking further into the future, are thinking about the possibility of taking advantage of Java's remote method invocation infrastructure to develop distributed and mobile applications in ML.

We have reason to believe that JVMs which do tail call elimination may appear soon, which would remove one of the other significant limitations of MLJ: although we already compile most simple loops into jumps, the lack of more general tail call optimisation does make some programs run out of stack space on reasonable-sized inputs. If tail call optimisation is not done for us, then we will reluctantly have to consider selective use of two techniques: placing some functions in the same method, and the "tiny-interpreter" technique used in the Glasgow Haskell compiler. Naive use of these techniques would cause a significant worsening of code size and speed, so we would base our decisions on a more sophisticated flow analysis, which would also allow us to improve most of the other transformations. This would appear to point to even longer compilation times, but we hope that this can be avoided by improving the representation of our intermediate language. The compiler currently spends a vast amount of time and memory performing trivial rewrites on the MIL term. Many of these rewrites are commuting conversions, which would simply disappear if a suitable graph representation were used instead. If we were also to rewrite destructively then we should be able to obtain further compiler speedups. A further minor improvement which we need to make is to ensure that MLJ never generates methods which exceed the JVM's 64K byte limit. This has so far only happened to us on one unusual program, and we do not anticipate any great difficulty in modifying the code generator to prevent it happening.

Our use of a monadic intermediate language is particularly novel. Whilst we would not claim that this allows us to perform any optimisations which could not be achieved by more ad hoc methods, we have found it to be a powerful and elegant framework for structuring the compiler. In general, we would strongly advocate a principled use of type-theoretic and semantic ideas in compiler implementation.

One of the good things about compiling to Java bytecodes is that there is a large ongoing effort to develop better, faster JVMs, which we can take advantage of for free. Early information from Sun about their next generation of JVMs indicates that allocation and collection will be improved significantly, possibly by a factor of . That could make MLJ's run-time performance competitive with native compilers even if we made no further improvements ourselves.

References

A. W. Appel. Compiling with Continuations. Cambridge University Press.

A. W. Appel and T. Jim. Shrinking lambda expressions in linear time. Journal of Functional Programming, September.

K. Arnold and J. Gosling. The Java Programming Language. Addison-Wesley, second edition.

P. N. Benton. Strictness Analysis of Lazy Functional Programs. PhD thesis, University of Cambridge Computer Laboratory, August. Technical Report.

P. N. Benton, G. M. Bierman, and V. C. V. de Paiva. Computational types from a logical perspective. Journal of Functional Programming. To appear.

P. Bertelsen. Compiling SML to Java bytecode. Master's thesis, Dept. of Information Technology, Technical Univ. of Denmark, January.

M. Blume. CM: A compilation manager for SML/NJ. Technical report. Part of SML/NJ documentation.

P. Bothner. Kawa: compiling dynamic languages to the Java VM. In USENIX Conference, June. Compiler and paper available from http://www.cygnus.com/~bothner/kawa.html

F. Henglein and J. Jørgensen. Formally optimal boxing. In ACM Symposium on Principles of Programming Languages.

S. L. Peyton Jones. Implementing lazy functional languages on stock hardware: the spineless tagless G-machine. Journal of Functional Programming, April.

S. L. Peyton Jones, J. Launchbury, M. Shields, and A. Tolmach. Bridging the gulf: a common intermediate language for ML and Haskell. In ACM Symposium on Principles of Programming Languages, January.

J. Jørgensen. A calculus for boxing analysis of polymorphically typed languages. Technical Report, DIKU, University of Copenhagen, May.

X. Leroy. Unboxed objects and polymorphic typing. In Annual ACM Symposium on Principles of Programming Languages.

R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML (Revised). MIT Press, Cambridge, Mass.

E. Moggi. Notions of computation and monads. Information and Computation.

C. Mossin. Flow analysis of typed higher-order programs. Technical Report, DIKU, University of Copenhagen.

Martin Odersky and Philip Wadler. Pizza into Java: Translating theory into practice. In ACM Symposium on Principles of Programming Languages, January.

L. C. Paulson. ML for the Working Programmer. Cambridge University Press, second edition.

Z. Shao. An overview of the FLINT/ML compiler. In ACM SIGPLAN International Conference on Functional Programming, June.

Z. Shao. Typed cross-module compilation. Technical Report YALEU/DCS/TR, Department of Computer Science, Yale University, July.

O. Shivers. Control-flow Analysis of Higher-Order Languages. PhD thesis, Carnegie Mellon University, May. CMU-CS.

D. Tarditi, G. Morrisett, P. Cheng, C. Stone, R. Harper, and P. Lee. TIL: A type-directed optimizing compiler for ML. In ACM SIGPLAN Conference on Programming Language Design and Implementation, Philadelphia, PA, May.

A. Tolmach. Optimizing ML using a hierarchy of monadic types. In Workshop on Types in Compilation, March.

P. Wadler. The marriage of effects and monads. In 3rd ACM SIGPLAN Conference on Functional Programming, September. This volume.

D. Wakeling. VSD: A Haskell to Java virtual machine code compiler. In International Workshop on Implementation of Functional Languages, September.

S. Weeks. A whole-program optimizing compiler for Standard ML. Technical report, NEC Research Institute, November. Available from http://www.neci.nj.nec.com/homepages/sweeks/smlc

A Sample output

The code below implements the quicksort algorithm for integer lists.

    fun quick xs =
    let
      fun quicker (xs, ys) =
        case xs of
          [] => ys
        | [x] => x::ys
        | a::bs =>
          let
            fun partition (left, right, []) =
                  quicker (left, a::quicker (right, ys))
              | partition (left, right, x::xs) =
                  if x <= a
                  then partition (x::left, right, xs)
                  else partition (left, x::right, xs)
          in
            partition ([], [], bs)
          end
    in
      quicker (xs, [])
    end

The internal function quicker compiles to the following static method:

    Method Ra b(Ra, Ra)
      goto
      new <Class Ra>
      dup
      iload
      aload
      invokespecial <Method Ra(int,Ra)>
      astore
      goto
      new <Class Ra>
      dup
      iload
      aload
      aload
      invokestatic <Method Ra b(Ra,Ra)>
      invokespecial <Method Ra(int,Ra)>
      astore
      aload
      ifnull
      aload
      getfield <Field Ra b>
      dup
      astore
      ifnull
      aload
      getfield <Field int a>
      istore
      aconst_null
      astore
      aconst_null
      astore
      aload
      ifnull
      aload
      getfield <Field int a>
      dup
      istore
      iload
      aload
      getfield <Field Ra b>
      astore
      if_icmple
      new <Class Ra>
      dup
      iload
      aload
      invokespecial <Method Ra(int,Ra)>
      astore
      goto
      new <Class Ra>
      dup
      aload
      getfield <Field int a>
      aload
      invokespecial <Method Ra(int,Ra)>
      areturn
      aload
      areturn

This program illustrates some of the code transformations performed by MLJ. Integer lists have been represented by the class Ra, with nil represented by null and x::xs represented by an instance of Ra with x stored in field a and xs in field b. Notice how the calls to partition and the first call to quicker have been implemented by goto bytecodes. The tuples have been removed: the triple passed to partition has vanished and the function quicker, expecting a pair, has been transformed into a method with two arguments.
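For readers who find Java source easier to follow than bytecode, the shape of the compiled quicker method can be approximated as below. This is a hand-written sketch, not output from MLJ. Only the class name Ra and its fields a and b come from the appendix. The method names, the helper functions fromArray and show, and the `<=` comparison are assumptions made for illustration. Java source has no goto, so the jumps are modelled with loops.

```java
// Illustrative sketch: only Ra and its fields a and b come from the listing;
// every other name here is assumed.
final class Ra {
    final int a;   // head of the list
    final Ra b;    // tail of the list; null stands for nil
    Ra(int a, Ra b) { this.a = a; this.b = b; }
}

class QuickDemo {
    // quicker (xs, ys): sort xs and prepend the result to ys.
    // The pair argument has become two separate parameters.
    static Ra quicker(Ra xs, Ra ys) {
        while (true) {                                   // loop head = jump target for the tail call
            if (xs == null) return ys;                   // [] => ys
            if (xs.b == null) return new Ra(xs.a, ys);   // [x] => x::ys
            int a = xs.a;
            Ra left = null, right = null;
            for (Ra bs = xs.b; bs != null; bs = bs.b) {  // partition, inlined as a loop
                if (bs.a <= a) left = new Ra(bs.a, left);
                else right = new Ra(bs.a, right);
            }
            ys = new Ra(a, quicker(right, ys));          // inner call: a genuine method call
            xs = left;                                   // outer call: jump back to the loop head
        }
    }

    static Ra quick(Ra xs) { return quicker(xs, null); }

    // Helper (assumed): build a list from an array, last element first.
    static Ra fromArray(int... values) {
        Ra list = null;
        for (int i = values.length - 1; i >= 0; i--) list = new Ra(values[i], list);
        return list;
    }

    // Helper (assumed): render a list as space-separated integers.
    static String show(Ra xs) {
        StringBuilder sb = new StringBuilder();
        for (Ra p = xs; p != null; p = p.b) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(p.a);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(show(quick(fromArray(3, 1, 4, 1, 5, 9, 2, 6))));
    }
}
```

The partition triple never exists as an object: left, right and the remaining input are just local variables, which mirrors the removal of tuples noted in the appendix.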