
EBG: A Lazy Functional Programming Language

Implemented on the Java Virtual Machine

Tony Clark

February

Abstract

The Java programming language offers a number of features including portability, graphics and networking. Java implements the object-oriented execution model in terms of classes, objects with state, message passing and inclusion polymorphism. This work aims to provide a mixed-paradigm environment which offers the advantages of both object-oriented and functional programming. The functional paradigm is supported by a new language called EBG which compiles to the Java VM. The resulting environment can support applications which use both object-oriented and functional programming as appropriate.

Introduction

The programming language Java has become very popular by combining a number of features including portability, object-oriented programming, WWW compatibility, networking, graphics and a growing collection of libraries. The language itself is reasonably small and offers a particular model of programming language execution based on classes, objects, message passing and inclusion polymorphism (Cardelli & Wegner).

Although the benefits of using the language are large, most notably its portability and ease of library construction, programmers are forced to use a particular style of programming even when it does not suit all parts of the application. For example, operations over polymorphic lists are not readily supported by the object-oriented model, since inclusion polymorphism is often incompatible with parametric polymorphism; Java uses type casts to recover the type of a list element. Another example occurs when programming in terms of lists whose elements are data items of loosely related data types: Java requires the use of type tests to determine the actual type of a data item.

Fortunately, the portability of Java arises from its use of a Virtual Machine (VM). This is a standard interface for executable code, defined in terms of a collection of machine instructions. In principle, to take advantage of Java features it is not necessary to program in Java. So long as a program can be translated into Java VM instructions, it can offer Java-like advantages.

This paper describes research which aims to produce a mixed programming environment offering Java-like advantages. The environment provides a new language called EBG in addition to Java. EBG is a lazy, higher-order functional programming language with a Hindley-Milner type system, modules, separate compilation, algebraic types, pattern matching and an interface to Java based on the object-oriented model of program execution.

The resulting environment allows applications to be implemented as a mixture of functional and object-oriented programming, with the aim being to allow control and data to pass semi-freely between the languages.

The essential feature of the implementation is to translate a functional program into an equivalent Java program using a one-to-one correspondence between functions and classes. Each execution of a function definition produces a new closure; correspondingly, the Java program instantiates the appropriate class, producing an object. Since the Java VM does not directly support lexical scoping and nested classes (class closures), a process termed class lifting is performed on the Java program.

A new binary format is used to contain the result of transforming and compiling an EBG program. The default Java class loader is extended to recognise both the extended and basic formats, allowing EBG and Java binary files to be loaded into the same machine. Finally, the Java reflective language features are exploited to allow EBG and Java programs to interact.

This paper is structured as follows. The next section provides example EBG program code and shows how the interface to Java programs is used. The section on compiling EBG describes how EBG code is translated to Java by defining interpreters for subsets of both languages and sketching a proof of consistency for the translation. The two subsets are referred to here as the EBG calculus and the Java calculus respectively.

The section on scope and nested classes describes how class lifting is performed, which transforms a Java program containing nested classes into one in which classes occur only at the top level. The section on implementation issues describes how the EBG code is translated to Java VM code via an intermediate EBG VM language, the extensions to the class loader, and the inter-language communication mechanisms. Finally, the closing section analyses the work, compares it with related work and outlines future plans.

A basic knowledge of Java, object-oriented programming and functional programming is assumed. The reader is directed to Garside & Mariani, Venners, Meyer, Bird & Wadler, and Field & Harrison for introductory material.

Example EBG Programs

Sieve of Eratosthenes

The following two figures show a simple example of a mixed-language application. The first is an EBG package called Sieve which implements a lazily generated list of prime numbers using a process called the Sieve of Eratosthenes (see Henderson for more details). The packages list and command provide definitions for list and Java interface operators respectively.

    import list, command

    integersFrom n = n : integersFrom (n + 1)

    sieve (n:ns) = n : sieve (remIf (\x. divisible x n) ns)

    primes = sieve (integersFrom 2)

    main =
      new "TestSieve" produces (Obj o)
      send o "printPrimes" []

Figure: Example EBG Code for Sieve of Eratosthenes

The package contains a collection of definitions. integersFrom is a function which generates an infinite list of numbers in sequence starting with n. sieve is a function which is applied to a list of numbers and removes those numbers which are multiples of numbers occurring earlier in the list. primes is a list of all prime numbers starting from 2.
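For readers more familiar with Java, the lazy sieve can be approximated in plain Java. The following is only a sketch under stated assumptions: the class LazyList and its methods are hypothetical names introduced for illustration and are not part of EBG or of its Java interface; the list is assumed to be infinite, so no empty case is modelled; and the tail is delayed with java.util.function.Supplier rather than with the thunks described later in this paper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Supplier;

/** Hypothetical lazy, infinite list of ints: the tail is a suspended computation. */
final class LazyList {
    final int head;
    private Supplier<LazyList> tailThunk; // cleared once the tail has been forced
    private LazyList tail;                // cached tail

    LazyList(int head, Supplier<LazyList> tailThunk) {
        this.head = head;
        this.tailThunk = tailThunk;
    }

    /** Forces the tail at most once and caches the result. */
    LazyList tail() {
        if (tailThunk != null) {
            tail = tailThunk.get();
            tailThunk = null;
        }
        return tail;
    }

    /** n, n + 1, n + 2, ... */
    static LazyList integersFrom(int n) {
        return new LazyList(n, () -> integersFrom(n + 1));
    }

    /** Drops every element satisfying p; plays the role of remIf. */
    static LazyList removeIf(IntPredicate p, LazyList xs) {
        LazyList cur = xs;
        while (p.test(cur.head)) {
            cur = cur.tail();
        }
        final LazyList kept = cur;
        return new LazyList(kept.head, () -> removeIf(p, kept.tail()));
    }

    /** Keeps the head and lazily removes its multiples from the rest. */
    static LazyList sieve(LazyList xs) {
        int n = xs.head;
        return new LazyList(n, () -> sieve(removeIf(x -> x % n == 0, xs.tail())));
    }

    /** Copies the first count elements into an ordinary list. */
    static List<Integer> take(int count, LazyList xs) {
        List<Integer> out = new ArrayList<>();
        LazyList cur = xs;
        for (int i = 0; i < count; i++) {
            out.add(cur.head);
            cur = cur.tail();
        }
        return out;
    }
}
```

With these definitions, LazyList.take(5, LazyList.sieve(LazyList.integersFrom(2))) yields the first five primes. Only as much of the infinite list is built as the consumer demands, which is the behaviour the EBG definition of primes relies on.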

The function main is an example of how imperative features are encoded in EBG. The command new takes a Java class name as an argument and instantiates the class. The infix operator produces evaluates its left-hand operand and supplies the value to its right-hand operand. The command send is applied to an object, a method name and a list of arguments. The result is equivalent to the following Java statement:

    o.printPrimes();

An EBG package roughly corresponds to a Java class where all of the top-level definitions are declared static. Any of the top-level symbols in an EBG package can be referenced by a Java program using the EBG package name as though it were a Java class name, for example Sieve.primes.

The figure captioned "Example Java Code Calling EBG Code" shows the source code for a Java class TestSieve which uses the EBG package Sieve. In addition, TestSieve uses a collection of static methods provided by JavaInterface which allow EBG values to be manipulated: isList, isCons, head and tail.

EBG and Java source code are compiled using the EBG compiler ebgc and the Java compiler javac respectively to produce Java VM object code. Using a simple extension of the default Java class loader, in addition to the package java.lang.reflect, both EBG and Java object code can be mixed in a running Java machine.

    public class TestSieve extends JavaInterface {
      public void printPrimes() {
        printNums(Sieve.primes);
      }
      public static void printNums(Value nums) {
        if(isList(nums))
          if(isCons(nums)) {
            System.out.println(head(nums));
            printNums(tail(nums));
          }
      }
    }

Figure: Example Java Code Calling EBG Code

Execution of the system starts by loading the EBG Sieve package and starting to execute the commands in main. The first command creates an instance of the class TestSieve by dynamically loading the appropriate class file and instantiating the resulting class. The Java reflective interface is used to perform meta-level operations such as send, which invokes a named method of an object. In this case, when printPrimes is invoked, control passes from EBG code to Java code.
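The reflective steps described above (instantiate a class by name, then invoke a method by name) can be sketched with java.lang.reflect. This is an illustration rather than the EBG implementation: ReflectiveSend, newInstance, send and the stand-in class Greeter are hypothetical names, only zero-argument public methods are handled, and reflective failures are simply wrapped in an unchecked exception.

```java
import java.lang.reflect.Method;

/** Hypothetical sketch of `new` and `send` on top of Java reflection. */
final class ReflectiveSend {

    /** Stand-in for a user class such as TestSieve. */
    public static final class Greeter {
        public String greet() {
            return "hello";
        }
    }

    /** Loads the named class and instantiates it through its public no-argument constructor. */
    static Object newInstance(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }

    /** Looks up a public zero-argument method by name and invokes it on the target. */
    static Object send(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot send " + methodName, e);
        }
    }
}
```

A call such as ReflectiveSend.send(ReflectiveSend.newInstance(ReflectiveSend.Greeter.class.getName()), "greet") plays the role of the new and send commands in main: the class and method are chosen by name at run time, and control passes into ordinary compiled Java code.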

The method printPrimes uses the EBG package as a class with a static attribute primes and calls printNums, passing a lazily generated infinite sequence of prime numbers. The method printNums uses the methods isCons, head and tail to print out all of the elements of the list. The control flow of the program is shown in the figure captioned "The control flow in Sieve of Eratosthenes".

Environments

The evaluation of expressions in both the EBG calculus and the Java calculus uses environments to associate keys with values. In particular, free variables in an expression are bound in the current environment, and the Java calculus uses an environment to model the heap. The figure captioned "Environment Structures" shows the definition of an EBG package Env which implements environments. env is a parametric type with three data constructors. Type variables in EBG are sequences of characters. env is parameterised with respect to the type of the keys and the type of the values.

An environment is either Empty, an association Bind k v between a key k and a value v, or the composition of two environments Pair e1 e2. Environment lookup is performed by

    lookup key env default

which returns the rightmost value associated with key in env, or default if the environment does not contain the key. Environments may contain more than one entry for a key, and shadowing occurs on the right. The function mapEnv is used to apply a function to all values in an environment.
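The lookup rule (rightmost binding wins, default when the key is absent) can be illustrated in Java. The sketch below is not the Env package itself: the class Env and its factory methods are hypothetical, values are assumed to be non-null, and absence is tracked with java.util.Optional instead of comparing the result against the default value.

```java
import java.util.Objects;
import java.util.Optional;

/** Hypothetical Java counterpart of the three environment constructors. */
abstract class Env<K, V> {

    /** Returns the rightmost value bound to key, if any. */
    abstract Optional<V> find(K key);

    /** Returns the rightmost value bound to key, or dflt when the key is unbound. */
    V lookup(K key, V dflt) {
        return find(key).orElse(dflt);
    }

    /** Empty: no bindings. */
    static <K, V> Env<K, V> empty() {
        return new Env<K, V>() {
            @Override
            Optional<V> find(K key) {
                return Optional.empty();
            }
        };
    }

    /** Bind k v: a single association. */
    static <K, V> Env<K, V> bind(K k, V v) {
        return new Env<K, V>() {
            @Override
            Optional<V> find(K key) {
                return Objects.equals(k, key) ? Optional.of(v) : Optional.empty();
            }
        };
    }

    /** Pair left right: bindings in right shadow those in left. */
    static <K, V> Env<K, V> pair(Env<K, V> left, Env<K, V> right) {
        return new Env<K, V>() {
            @Override
            Optional<V> find(K key) {
                Optional<V> r = right.find(key);
                return r.isPresent() ? r : left.find(key);
            }
        };
    }
}
```

For example, in Env.pair(Env.bind("x", 1), Env.bind("x", 2)) a lookup of "x" finds the binding on the right, while a lookup of an unbound key returns the supplied default.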

    EBG                               Java

    main is invoked
    new TestSieve             -->     instantiate TestSieve
                                      initialise instance
    send o printPrimes        -->     reference Sieve.primes
                                      call printNums
                                      print 2
    remove if divisible by 2          tail(nums)
                              -->     print 3
    remove if divisible by 3          tail(nums)
                              -->     print 5
    remove if divisible by 5          tail(nums)

Figure: The control flow in Sieve of Eratosthenes

Compiling EBG

EBG is implemented by a translation to Java. The key issues of the translation are function representation and function application. This section describes these issues by defining two toy languages and analysing the translation between them.

The first language, the EBG calculus, is a sublanguage of EBG providing integers, single-argument functions, variables and function application. Its operational semantics is defined by an interpreter ebgEval written in EBG. The second language, the Java calculus, is a sublanguage of Java providing nested class definitions and simple methods. Its operational semantics is defined by an interpreter javaEval written in EBG.

Compilation of EBG is modelled using a translation trans from EBG-calculus programs to Java-calculus programs. The translation is shown to be consistent, i.e. to preserve the meaning of programs, by defining a second translation, also written trans, from Java-calculus values to EBG-calculus values such that the following diagram commutes (Sabry & Wadler):

                       ebgEval
    EBG calculus  ---------------->  ebgVal
         |                              ^
         | trans                        | trans
         v                              |
    Java calculus ---------------->  javaVal
                       javaEval

This section is structured as follows. It first defines the EBG calculus and its operational semantics, then defines the Java calculus and its operational semantics, and finally defines the translation trans and sketches the proof of consistency.

    type env a b =
        Empty
      | Bind a b
      | Pair (env a b) (env a b)

    lookup :: a -> env a b -> b -> b
    lookup key Empty default = default
    lookup key (Bind key1 value) default =
      case key = key1 of
        True -> value
        False -> default
      end
    lookup key (Pair e1 e2) default =
      let value = lookup key e2 default
      in
        case value = default of
          True -> lookup key e1 default
          False -> value
        end

    mapEnv :: (b -> c) -> env a b -> env a c
    mapEnv fun Empty = Empty
    mapEnv fun (Bind key value) =
      Bind key (fun value)
    mapEnv fun (Pair e1 e2) =
      Pair (mapEnv fun e1) (mapEnv fun e2)

Figure: Environment Structures

A Calculus

EBG is a lazy functional programming language; therefore the operational semantics of the EBG calculus is based on a normal order reduction scheme (Hankin, Plotkin). The abstract syntax of the calculus is defined as the type ebg in the figure captioned "Definition of ebgEval". The operational semantics is defined as a function ebgEval which is applied to an expression and an environment associating variable names with thunks. Evaluation of an expression produces an integer, a closure or an error. Note that well-typed expressions will not produce an error value. The same figure defines a type ebgVal for the results of program evaluation.

    type ebg =
        EBGInt int
      | EBGVar string
      | Lambda string ebg
      | Apply ebg ebg

    type ebgVal =
        EBGIntVal int
      | Closure string (env string ebgVal) ebg
      | Thunk (env string ebgVal) ebg
      | EBGError

    ebgEval :: ebg -> env string ebgVal -> ebgVal
    ebgEval (EBGInt n) env = EBGIntVal n
    ebgEval (Lambda arg body) env =
      Closure arg env body
    ebgEval (Apply e1 e2) env =
      let Closure arg env1 body = ebgEval e1 env in
      let newEnv = Pair env1 (Bind arg (Thunk env e2))
      in ebgEval body newEnv
    ebgEval (EBGVar s) env =
      let Thunk env1 body = lookup s env EBGError
      in ebgEval body env1

Figure: Definition of ebgEval

EBG uses normal order reduction, which means that expressions are only evaluated if it is necessary to produce the final program outcome. This strategy is implemented by passing unevaluated expressions as function arguments. If the value of the argument is ever required to construct the result of the function, then the expression is forced.

Delayed evaluation of function arguments is implemented by constructing a thunk. A thunk associates a program expression with the current environment so that it can be evaluated at some later date. The current environment contains values for all of the free variables in the delayed expression.

As an example of normal order evaluation, consider the following expressions, where n is an integer literal:

    W = Lambda "x" (Apply (EBGVar "x") (EBGVar "x"))
    M = Apply (Lambda "x" (EBGInt n)) (Apply W W)

An eager strategy fully evaluates the argument to a function before applying it. If M is evaluated eagerly, the application of W to itself will not terminate. However, a normal order strategy will only evaluate an argument expression if it is required in the body of the function. In this case:

    ebgEval M Empty = EBGIntVal n
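The behaviour of ebgEval on W and M can be reproduced with a small normal-order interpreter written in Java. This is a sketch under assumptions, not the EBG implementation: all class names inside EbgSketch are hypothetical, the empty environment is represented by null, errors are reported with exceptions instead of an error value, and the integer literal used in the test expression (7) is chosen arbitrarily.

```java
/** Hypothetical normal-order interpreter for integers, variables, lambdas and applications. */
final class EbgSketch {
    // Abstract syntax.
    interface Expr {}
    static final class Int implements Expr { final int n; Int(int n) { this.n = n; } }
    static final class Var implements Expr { final String name; Var(String name) { this.name = name; } }
    static final class Lambda implements Expr {
        final String arg; final Expr body;
        Lambda(String arg, Expr body) { this.arg = arg; this.body = body; }
    }
    static final class Apply implements Expr {
        final Expr fun; final Expr arg;
        Apply(Expr fun, Expr arg) { this.fun = fun; this.arg = arg; }
    }

    // Values produced by evaluation.
    interface Val {}
    static final class IntVal implements Val { final int n; IntVal(int n) { this.n = n; } }
    static final class Closure implements Val {
        final String arg; final Env env; final Expr body;
        Closure(String arg, Env env, Expr body) { this.arg = arg; this.env = env; this.body = body; }
    }

    /** A delayed expression together with the environment it was created in. */
    static final class Thunk {
        final Env env; final Expr body;
        Thunk(Env env, Expr body) { this.env = env; this.body = body; }
    }

    /** Immutable linked environment; the newest binding is found first, null means empty. */
    static final class Env {
        final String name; final Thunk thunk; final Env next;
        Env(String name, Thunk thunk, Env next) { this.name = name; this.thunk = thunk; this.next = next; }

        static Thunk lookup(Env env, String name) {
            for (Env e = env; e != null; e = e.next) {
                if (e.name.equals(name)) return e.thunk;
            }
            throw new IllegalStateException("unbound variable: " + name);
        }
    }

    /** Evaluates e in env; arguments are passed unevaluated as thunks. */
    static Val eval(Expr e, Env env) {
        if (e instanceof Int) {
            return new IntVal(((Int) e).n);
        }
        if (e instanceof Lambda) {
            Lambda l = (Lambda) e;
            return new Closure(l.arg, env, l.body);
        }
        if (e instanceof Var) {
            // Forcing: evaluate the delayed expression in its own environment.
            Thunk t = Env.lookup(env, ((Var) e).name);
            return eval(t.body, t.env);
        }
        Apply a = (Apply) e;
        Val f = eval(a.fun, env);
        if (!(f instanceof Closure)) {
            throw new IllegalStateException("application of a non-function");
        }
        Closure c = (Closure) f;
        // The argument expression is not evaluated here, only wrapped in a thunk.
        return eval(c.body, new Env(c.arg, new Thunk(env, a.arg), c.env));
    }
}
```

Evaluating the Java encoding of M with EbgSketch.eval(m, null) returns an IntVal without ever evaluating the self-application of W, because the thunk bound to x is never forced. Note that this sketch re-evaluates a thunk each time its variable is used; caching is a separate concern.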

A Java Calculus

In order to show how EBG is implemented in Java, we show how expressions of the EBG calculus are implemented in the Java calculus, which is a sublanguage of Java containing just the required language features. In particular, the required features include:

• Anonymous and nested classes. Closures and thunks are implemented as objects. Java allows classes to be nested and implements static scoping rules, which correspond to nested functions and thunks in EBG-calculus expressions. The syntax for instantiating anonymous Java classes (Flanagan) is

    new classname() { classbody }

  which defines a subclass of classname and immediately instantiates it.

• Class instantiation. Each execution of a function or application requires a new closure and thunk respectively. Java represents closures and thunks as instances of classes.

• Message passing. Closure objects provide a method apply which is used to apply the closure to an argument. Thunk objects provide a method force which forces the thunk when its value is required.

• Object attributes. Lazy evaluation requires that expressions are evaluated at most once. A thunk has a field cache which is used to cache the value of its delayed expression when it is forced.

• Self reference. To implement lazy evaluation, a thunk checks whether it has forced its delayed expression. If not, it sends itself a message to force and then cache the result.
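The features listed above can be seen working together in a few lines of Java. The sketch below is illustrative only: the wrapper class ThunkSketch, the evaluation counter, and the closures DOUBLE and CONST_ZERO are hypothetical names, and IntVal is given an int field purely so that the example has something to compute.

```java
/** Hypothetical demonstration of closures, thunks, caching and self reference. */
final class ThunkSketch {
    public abstract static class Value {}

    public static final class IntVal extends Value {
        public final int n;
        public IntVal(int n) { this.n = n; }
    }

    public abstract static class Thunk {
        private Value cache = null;

        /** Forces the delayed expression at most once and caches the result. */
        public final Value force() {
            if (cache == null) {
                cache = value(); // self reference: the thunk sends itself `value`
            }
            return cache;
        }

        /** The delayed expression, supplied by an anonymous subclass. */
        protected abstract Value value();
    }

    public abstract static class Closure extends Value {
        public abstract Value apply(Thunk argument);
    }

    /** Counts how many times the delayed expression below is really evaluated. */
    public static int evaluations = 0;

    /** Doubles its argument by forcing it twice. */
    public static final Closure DOUBLE = new Closure() {
        @Override
        public Value apply(Thunk x) {
            int a = ((IntVal) x.force()).n;
            int b = ((IntVal) x.force()).n; // second force hits the cache
            return new IntVal(a + b);
        }
    };

    /** Ignores its argument, so the thunk is never forced. */
    public static final Closure CONST_ZERO = new Closure() {
        @Override
        public Value apply(Thunk x) {
            return new IntVal(0);
        }
    };

    /** A thunk whose evaluation is observable through the counter. */
    public static Thunk expensive() {
        return new Thunk() {
            @Override
            protected Value value() {
                evaluations++;
                return new IntVal(21);
            }
        };
    }
}
```

Applying DOUBLE to expensive() forces the thunk twice but evaluates the delayed expression once, and applying CONST_ZERO never evaluates it at all. These are the two properties (evaluate at most once, and only on demand) that the cache field and the force method are there to provide.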

Java Syntax and Values

The figure captioned "Java Syntax" defines the type java, which is the abstract syntax of the Java calculus. A program is an environment of class definitions, one of which must define a method called main with a single argument. Execution of a program starts by calling the method main and evaluating its body with respect to the environment of top-level class definitions. The values produced by evaluating programs are defined by javaVal in the figure captioned "Java Values". The values are classes, objects, integers, the null value, boolean values and an error value.

A class definition contains variable references, and since definitions may be nested, a class captures the current context when it is created. The current context is an environment associating all variables freely referenced in the method bodies of the class with their current values.

Consider the class Thunk defined in the figure captioned "EBG value classes". This is a typical abstract class, since it defines a method force which calls a method value whose implementation is left to a subclass of Thunk. The definition of Thunk is represented as an abstract syntax data value in EBG as follows:

    type java =
        Seq java java                 sequenced commands
      | JavaInt int                   integer expression
      | JavaVar string                variable reference
      | NullClassDef                  the ultimate superclass
      | ClassDef                      a class definition
          java                        the superclass
          (list string)               the attributes
          (env string methodDef)      the methods
      | New java                      instantiation expression
      | Send java string java         method invocation, 1 arg
      | Send java string              method invocation, 0 args
      | This                          self reference
      | If java java java             conditional command
      | Set string java               variable update
      | Eql java java                 equality test

    type methodDef =
        MethodDef string java         method, 1 arg
      | MethodDef java                method, 0 args

Figure: Java Syntax

    ClassDef
      NullClassDef
      ["cache"]
      (Bind "force"
        (MethodDef
          (If (Eql (JavaVar "cache") (JavaVar "null"))
              (Seq (Set "cache" (Send This "value"))
                   (JavaVar "cache"))
              (JavaVar "cache"))))

The same definition may be evaluated more than once, causing different contexts to be associated with the same class. Consider the class Closure defined in the figure captioned "EBG value classes". Each subclass of Closure must define a method called apply with a single argument. Nested classes are possible; for example, the following corresponds to the curried function M = λx.λy. x y:

    M = new Closure() {
          Value apply(Thunk x) {
            new Closure() {
              Value apply(Thunk y) {
                x.force().apply(y)
              }
            }
          }
        }

M contains the definition of two anonymous subclasses of Closure. The outermost class is instantiated, producing a Java object o1 which represents M. The outermost class contains no free variable references; however, the innermost class contains a free reference to the variable x, which is an argument of apply in the outermost class.

    type javaVal =
        NullClass
      | Class
          (env string int)
          javaVal
          (list string)
          (env string methodDef)
      | JavaObjVal
          (env string method)
      | JavaIntVal int
      | Null
      | JavaTrue
      | JavaFalse
      | JavaError string

    type method =
        Method
          string
          (env string int)
          javaVal
          java
      | Method (env string int) javaVal java
      | NoMethod

Figure: Java Values

Each time o1 is sent an apply message a new class is defined. In each case the class is associated with a different value for x. The following shows the class which is created as a result of o1.apply(t):

    C = Class (Bind "x" t) Closure []
          (Bind "apply"
            (MethodDef "y"
              (Send
                (Send (JavaVar "x") "force")
                "apply"
                (JavaVar "y"))))

Notice that all classes in the Java calculus are associated with an environment, in this case Bind "x" t, which contains the values of variables which are freely referenced in the body of the class. For this reason we say that the Java calculus supports class closures.

Java Instantiation

Objects are environments associating method names with methods. A method has four components: an argument name, a captured context, an object and a body. The context is an environment containing associations for all the freely referenced variables in the body of the method. The context is constructed when a class is instantiated by extending the class context with associations between the attribute names and their storage locations.

Each method contains an object which is used as the value of the pseudo-variable this. All methods in an object have the same object, which is a cyclic reference to the object itself. Consider the object which is created when M from the previous section is evaluated. The object produced is referred to as o1 in the following Java value:

    o1 = JavaObjVal
           (Bind "apply"
             (Method "x" Empty o1
               (New (ClassDef Closure []
                      (Bind "apply"
                        (MethodDef "y"
                          (Send
                            (Send (JavaVar "x") "force")
                            "apply"
                            (JavaVar "y"))))))))

If the object o1 is sent an apply message with an argument t, then the result is the class C shown in the previous section. If C is instantiated, the result is the following object o2, which captures the current context containing the value for x:

    o2 = JavaObjVal
           (Bind "apply"
             (Method "y" (Bind "x" t) o2
               (Send
                 (Send (JavaVar "x") "force")
                 "apply"
                 (JavaVar "y"))))

Class instantiation is performed by an EBG function instantiate expecting three arguments: the class to instantiate, a memory location used as the start of attribute storage, and an object to be used as the value of this. Instantiation produces three values: the new instance, an environment associating attribute names with storage locations, and the memory block used by the attributes. The value of this is found by a fixed point (Cook, Clark). If the value of instantiating the class c with respect to memory location l is o, a and h, then instantiate satisfies the following equation:

    (o, a, h) = instantiate c l o

The figure captioned "Definition of instantiate" shows the definition of the function instantiate. The process instantiates the superclass first and then merges the instance of the superclass with the extension attributes and methods to produce an instance of the subclass.

Message Passing

Object-oriented program execution is performed using message passing, which involves the lookup and invocation of an object's method. Message passing is performed using the function sendMessage expecting four arguments:

    sendMessage message object value heap

where message is the name of the message, object is the target of the message, value is the value to be sent and heap is the current memory structure. Messages are synchronous, and the result of sending a message is a pair (value, heap) containing a data value and an updated memory.

    instantiate :: javaVal -> int -> javaVal ->
                   (env string method, env string int, env int javaVal)
    instantiate NullClass memoryLocation this = (Empty, Empty, Empty)
    instantiate (Class env super atts meths) memoryLocation this =
      let (o1, a1, h1) = instantiate super memoryLocation this in
      let (a2, h2) = allocateAtts atts (memoryLocation + usedMemory h1) in
      let o2 = mapEnv (methodDefToMethod (Pair env (Pair a1 a2)) this) meths
      in (Pair o1 o2, Pair a1 a2, Pair h1 h2)

Figure: Definition of instantiate

    sendMessage ::
      string ->
      env string method ->
      javaVal ->
      env int javaVal -> (javaVal, env int javaVal)
    sendMessage message object value heap =
      case lookup message object NoMethod of
        Method name env this body ->
          let address = nextFreeMemoryLocation heap in
          let heap1 = Pair heap (Bind address value)
              env1 = Pair env (Bind name address)
          in javaEval body env1 heap1 this
        NoMethod -> (JavaError message, heap)
      end

Figure: Message passing in Java

The figure captioned "Message passing in Java" shows the definition of message passing in the Java calculus. The target is an environment and should associate the message name with a method. The method contains an argument name, an environment, an object and a program expression. The environment associates freely referenced variables in the body of the method with values. The environment is extended with the method argument and is used as the context for evaluating the method body.

Java Evaluation

Evaluation of a Java-calculus program prog is performed by

    javaEval prog env heap this

where env associates free variables in prog with memory addresses, heap associates memory addresses with Java values, and this is the object whose method is currently being performed. Memory addresses are modelled as integers starting from a fixed initial address. The interpreter is shown in the figure captioned "Definition of javaEval". It is defined by case analysis on the structure of the program. The interpreter threads the heap through the program execution and produces a pair (value, heap) where value is a Java value and, since the evaluation of prog can produce side effects, heap is an updated heap.

Translation of Terms to Java

EBG is implemented by translating it to Java. EBG closures are translated to instances of Closure, EBG thunks are translated to instances of Thunk, and EBG integers are translated to Java integers. This section defines the syntax translation from EBG programs to Java programs and provides an overview of how values can be translated from one language to the other. These translations are then used to sketch the proof of consistency for the syntactic translation.

EBG values are integers or closures. The environments in closures associate variable names with thunks. EBG values are represented in Java as instances of the classes defined in the figure captioned "EBG value classes". The class Value is the superclass of all EBG values. The classes IntVal, Closure and Thunk define Java representations of EBG integers, closures and thunks respectively.

The classes Closure and Thunk are abstract. EBG closures and thunks are defined as instances of subclasses of these classes. Subclasses of Closure must define a method called apply, which is activated when the closure is applied. Subclasses of Thunk must define a method called value, which is activated when the thunk is forced for the first time. Once it is forced, an instance of Thunk uses the variable cache to retain the value.

javaEval

java

env string int

env int javaVal

javaVal javaValenv int javaVal

javaEval Seq j j env heap this

javaEval j env nd javaEval j env heap this this

javaEval JavaInt n env heap this JavaIntVal nheap

javaEval JavaVar s env heap this

lookup lookup s env heap JavaError heapheap

javaEval NullClassDef env heap this NullClassheap

javaEval ClassDef super atts meths env heap this

let classheap javaEval super env heap this

in Class env class atts methsheap

javaEval New j env heap this

let classheap javaEval j env heap this in

letrec oaheap instantiate class loc heap JavaObjVal o

in JavaObjVal oPair heap heap

javaEval Send exp message arg env heap this

let JavaObjVal oheap javaEval exp env heap this in

let vheap javaEval arg env heap this

in sendMessage message o heap

javaEval If exp exp exp env heap this

case javaEval exp env heap this of

JavaTrueheap javaEval exp env heap this

JavaFalseheap javaEval exp env heap this

end

javaEval Set varName exp env heap this

let valueheap javaEval exp env heap this

in valuePair heap Bind lookup varName value

javaEval Eql exp exp env heap this

let valueheap javaEval exp env heap this in

let valueheap javaEval exp env heap this

in case value value of

True JavaTrueheap

False JavaFalseheap

end

javaEval This env heap this thisheap

Figure: Definition of javaEval

    abstract class Value { }

    class IntVal extends Value { }

    abstract class Closure extends Value {
      abstract Value apply(Thunk argument);
    }

    abstract class Thunk {
      private Value cache = null;
      public Value force() {
        if(cache == null)
          cache = value();
        return cache;
      }
      public abstract Value value();
    }

Figure: EBG value classes

    trans :: ebg -> java
    trans (EBGInt n) = JavaInt n
    trans (EBGVar s) = Send (JavaVar s) "force"
    trans (Lambda arg body) =
      New (ClassDef
            (JavaVar "Closure")
            []
            (Bind "apply" (MethodDef arg (trans body))))
    trans (Apply e1 e2) =
      Send (trans e1) "apply"
        (New (ClassDef
               (JavaVar "Thunk")
               []
               (Bind "value" (MethodDef (trans e2)))))

Figure: Definition of trans

Translation of EBG programs to Java programs is defined in the figure captioned "Definition of trans". The translations of functions and function applications instantiate anonymous subclasses of Closure and Thunk respectively. Function application is implemented using the method apply, and thunks are forced using the method force.

Consider an EBG program m evaluated by ebgEval with respect to an environment of thunks e, producing an EBG value v. Given a translation trans from environments of EBG thunks to environments of Java objects, m and e can be translated and evaluated using javaEval to produce a Java value w and a heap h. Given a translation trans from Java values and heaps to EBG values, we must show that

    ebgEval(m, e) = (trans ∘ javaEval ∘ trans)(m, e)

The proof is sketched as follows. EBG thunks are translated to produce instances of the appropriate subclass of Thunk. Instances of Thunk and Closure are translated, relative to a heap, to EBG thunks and closures respectively. The proof of consistency proceeds by induction on the structure of the EBG program m and the environment e.

• If m is an integer then the proof follows by the definition of the interpreters and translations.

• If m is a variable then the proof follows by assuming that it holds for the body of the thunk bound to the variable in e and its environment.

• If m is a function then the proof follows by assuming, by induction, that it holds for the body of the function and the environment e.

• If m is an application Apply n1 n2 then we assume that the theorem holds for n1 and n2 with respect to e, and also holds for the body of the resulting closure with respect to the extended closure environment.

Scope and Nested Classes

EBG is implemented in Java using nested anonymous classes for both closures and thunks. Both Java and EBG use lexical scoping rules for variable reference. Nested classes and lexical scoping rules are supported in the Java calculus by class closures.

Although Java provides nested anonymous classes, it does not implement class closures. In order to support lexical scoping it performs class lifting, a process similar to lambda lifting (Field & Harrison) which translates all class definitions to the top level of the program. This section describes how the EBG value classes are modified to take class lifting into account.

Class lifting is a Java program transformation whereby all classes are moved to the top level. Lexical scoping is implemented by allocating space for variables in heap-allocated activation frames. Consider the following function:

    M1 = λx.(λy. y x)(λz. x z)

The result of lambda lifting M1 is as follows:

    M1 = λx. M2 x (M3 x)
    M2 = λx.λy. y x
    M3 = λx.λz. x z

The process of lifting produces an equivalent program in which all nested functions have been moved to the top level and extra parameters are added for their freely referenced variables.

Class lifting has the same effect as lambda lifting, except that nested classes are moved to the top level and variables are referenced via heap-allocated frames. The figure captioned "An example of class lifting" shows the result of translating M1 to Java and then performing class lifting. Note that the code in the figure has been simplified by omitting the creation of thunks. The complete translation is described in another section.

    class M1 extends Closure {
      Value apply(Thunk x) {
        frame = new Frame(x, frame);
        return new M2(frame).apply(new M3(frame));
      }
    }

    class M2 extends Closure {
      Value apply(Thunk y) {
        return y.force().apply(frame.local(0));
      }
    }

    class M3 extends Closure {
      Value apply(Thunk z) {
        return frame.local(0).force().apply(z);
      }
    }

Figure: An example of class lifting

Class lifting is performed using the following algorithm. Let P be a Java program resulting from trans. P is a collection of class definitions indexed by their names. If P contains no nested classes then stop. Otherwise a definition d contains a nested class definition c. Depending on whether c is a subclass of Closure or Thunk, it may reference a single bound variable v of d. Let d' be d with c replaced by

    new k(new Frame(v, frame))

where k is a new class name. Let c' be c with all references to v replaced by

    frame.local(0)

If c references v then all other expressions of the form frame.local(n) are replaced with frame.local(n + 1). The class d is replaced with d' in P and c' is added. This process is repeated until it terminates with no nested class definitions.

    abstract class Closure extends Value {
      protected Frame frame;
      public Closure(Frame frame) {
        this.frame = frame;
      }
      public abstract Value apply(Thunk t);
    }

    abstract class Thunk {
      protected Frame frame;
      private Value cache = null;
      public Thunk(Frame frame) {
        this.frame = frame;
      }
      public Value force() {
        if(cache == null)
          cache = value();
        return cache;
      }
      public abstract Value value();
    }

Figure: Value class using frames

The EBG value classes initially defined in the figure captioned "EBG value classes" are extended to support class lifting. Both Closure and Thunk are extended with an attribute frame whose value is supplied when an instance is created. The extended classes are shown in the figure captioned "Value class using frames". Frame implements a linked list of values. The method local is used to index the list elements, counting from the initial element of the frame. New frames extend existing frames by adding a new element at the start of the list.
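A frame of this kind is easy to model directly. The sketch below is an assumption-laden illustration rather than the EBG run-time class: the wrapper FrameSketch and the lifted classes Outer and Inner are hypothetical, indexing is assumed to start at 0 with the newest element first, and plain integers stand in for thunks so that the example stays short. It lifts the nested function λx.λy. x - y into two top-level classes.

```java
/** Hypothetical model of heap-allocated frames and two lifted classes. */
final class FrameSketch {

    /** Immutable linked list of values; the newest element is at index 0 (assumed). */
    static final class Frame {
        private final Object value;
        private final Frame parent;

        Frame(Object value, Frame parent) {
            this.value = value;
            this.parent = parent;
        }

        /** Returns the n-th element, counting from the newest. */
        Object local(int n) {
            if (n < 0) {
                throw new IndexOutOfBoundsException("negative index: " + n);
            }
            Frame f = this;
            for (int i = 0; i < n; i++) {
                if (f.parent == null) {
                    throw new IndexOutOfBoundsException("no local at index " + n);
                }
                f = f.parent;
            }
            return f.value;
        }
    }

    /** Lifted outer function: stores x in a new frame and hands it to Inner. */
    static final class Outer {
        private final Frame frame;

        Outer(Frame frame) {
            this.frame = frame;
        }

        Inner apply(int x) {
            return new Inner(new Frame(x, frame));
        }
    }

    /** Lifted inner function: reads the free variable x from its frame. */
    static final class Inner {
        private final Frame frame;

        Inner(Frame frame) {
            this.frame = frame;
        }

        int apply(int y) {
            int x = (Integer) frame.local(0);
            return x - y;
        }
    }
}
```

Here new FrameSketch.Outer(null).apply(10).apply(3) computes 10 - 3. Neither class is nested inside the other; the free variable x reaches Inner only through the frame passed to its constructor, which is the effect class lifting is meant to achieve.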

Implementation Issues

The semantics of EBG programs and their implementation in Java is defined by the consistent translation trans given in the section on compiling EBG. EBG is implemented by translating programs directly to Java VM code without generating any intermediate Java source code. The machine loader can freely mix Java and EBG object code, and the reflective features of the Java machine permit Java and EBG code to interact. This section describes the implementation issues relating to the EBG environment.

    EBG package source          Java class source
            |                           |
          ebgc                        javac
            |                           |
    ebg package file format     java class file format
             \                         /
                        ebg
                         |
                    Java machine

Figure: Mixing EBG and Java program code

The Class Loader

Java programs are executed by starting a Java machine and loading Java ob ject

les using a class loader A class loader running on the machine is an ob ject of

type ClassLoader which is resp onsible for reading ob ject les and linking Java

VM co de into the current running Java machine

EBG denes a subclass of ClassLoader called ebg which understands the

format of b oth Java and EBG ob ject les The pro cess of loading b oth EBG

and Java into a running machine is shown in gure

Compilation of a Java source file using javac produces an object file containing
a binary representation in the class file format. There are entries in the
binary file for all class components, including fields, methods and static entries.

Compilation of an EBG source file using ebgc produces a file containing
a binary representation in a package file format. The package file contains
class file format entries for all the Java classes resulting from class lifting. In
addition, there is a distinguished class in each package which contains static
fields for each top-level package definition. The value of each field is of type
Thunk, and both EBG and Java programs may reference any top-level EBG
package definitions as static class fields. An EBG object package is an instance
of the class Package:

public class Package implements Serializable {
  public Vector importNames;
  public Hashtable classes;
  // Package methods
}

where importNames is a vector of imported package names and classes is a
collection of associations between class names and arrays of bytes in Java class
file format.

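public class ebg extends ClassLoader {

  private Hashtable classBytes = new Hashtable();      // class name -> class file bytes
  private Hashtable loadedClasses = new Hashtable();   // class name -> defined Class
  private Vector importedPackages = new Vector();      // imported but not yet loaded

  // Read the package file fName and record the bytes of every class in it.
  // Exception handling is omitted.
  private void getClassBytes(String fName) {
    Package p = (Package) objStream(fName).readObject();
    Enumeration classNames = p.classNames();
    while (classNames.hasMoreElements()) {
      String cName = (String) classNames.nextElement();
      classBytes.put(cName, p.classes.get(cName));
    }
    addElements(p.importNames, importedPackages);
  }

  // Define cName on demand, reading its package first if necessary.
  public Class loadClass(String cName) {
    Class c;
    if (!loadedClasses.containsKey(cName)) {
      if (classBytes.containsKey(cName)) {
        byte[] bytes = (byte[]) classBytes.get(cName);
        c = defineClass(cName, bytes, 0, bytes.length);
      }
      else if (importedPackages.contains(cName)) {
        getClassBytes(cName);
        importedPackages.removeElement(cName);
        c = loadClass(cName);
      }
      else c = loadJavaClass(cName);
      loadedClasses.put(cName, c);
    }
    else c = (Class) loadedClasses.get(cName);
    return c;
  }
}

Figure: The ebg Class Loader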

A Java class is defined by a class loader by supplying the method defineClass
with the name of the class and an array of bytes in class file format. The figure
"The ebg Class Loader" shows the implementation of the EBG package loader ebg.

The EBG package loader uses three tables. The table loadedClasses records
when a class is loaded and defined; once loaded and defined, a class must not be
redefined. The table classBytes holds the class file format byte codes of classes
when EBG packages are loaded. The classes contained in an EBG package are
defined on demand. Finally, the table importedPackages holds the names of
packages which are imported but not yet loaded.

Once compiled, an EBG package is loaded using the extended class loader
ebg. A package is loaded using the method loadClass, which returns the Java
class containing the EBG top-level definitions as static fields. The method
loadClass uses the package loader tables to cache classes. Once a package is
loaded, subsequent calls to loadClass will not need to reload the package for
different component classes.
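The lookup order across the three tables can be traced with a small simulation. This is only a sketch under stated assumptions: PackageSketch, LoaderTablesSketch, importPackage and the in-memory store are hypothetical, and a string stands in for each defined class so that the example runs without real class files. A real loader would call defineClass at the marked point.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a compiled EBG package file.
final class PackageSketch {
    final List<String> importNames = new ArrayList<>();
    final Map<String, byte[]> classes = new HashMap<>();
}

final class LoaderTablesSketch {
    private final Map<String, PackageSketch> store;                 // stands in for the file system
    private final Map<String, byte[]> classBytes = new HashMap<>();
    private final Map<String, String> loadedClasses = new HashMap<>();
    private final Set<String> importedPackages = new HashSet<>();
    final List<String> log = new ArrayList<>();                     // records each package read

    LoaderTablesSketch(Map<String, PackageSketch> store) { this.store = store; }

    // Register a package name as imported but not yet loaded.
    void importPackage(String name) { importedPackages.add(name); }

    private void getClassBytes(String packageName) {
        PackageSketch p = store.get(packageName);
        classBytes.putAll(p.classes);
        importedPackages.addAll(p.importNames);
        log.add("read package " + packageName);
    }

    // Returns a description of the class; a real loader would return a Class.
    String loadClass(String cName) {
        String c = loadedClasses.get(cName);
        if (c != null) return c;                                    // already defined: never redefine
        if (classBytes.containsKey(cName)) {
            // A real loader would call defineClass here.
            c = "defined " + cName + " (" + classBytes.get(cName).length + " bytes)";
        } else if (importedPackages.contains(cName)) {
            getClassBytes(cName);                                   // read the whole package once
            importedPackages.remove(cName);
            return loadClass(cName);                                // retry with the bytes available
        } else {
            c = "delegated " + cName;                               // an ordinary Java class
        }
        loadedClasses.put(cName, c);
        return c;
    }
}

class LoaderTablesDemo {
    public static void main(String[] args) {
        PackageSketch sieve = new PackageSketch();
        sieve.classes.put("Sieve", new byte[3]);
        sieve.classes.put("Sieve$1", new byte[5]);
        Map<String, PackageSketch> store = new HashMap<>();
        store.put("Sieve", sieve);
        LoaderTablesSketch loader = new LoaderTablesSketch(store);
        loader.importPackage("Sieve");
        System.out.println(loader.loadClass("Sieve"));
        System.out.println(loader.loadClass("Sieve$1"));
        System.out.println(loader.log);
    }
}
```

The package is read once, on the first request for its distinguished class. The second component class is then defined from the cached bytes without touching the package again.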

Producing Java VM Code

EBG programs are compiled to Java VM code via an intermediate EBG VM
language. The intermediate language allows the low-level implementation to be
changed without affecting the upper levels of the compilation process.

This section gives an overview of the EBG VM and the compilation process.
In order to show the key features of the compilation, three toy languages are
used. EBG is modelled using a toy language whose semantics is defined in an
earlier section. EBG is compiled using an EBG function compile to produce EBG
VM instructions, implemented as an EBG data type ebgInstr. Translation to
Java VM code and class lifting are performed using an EBG function trans. Given
the semantics of the Java VM, javaVMEval, the following diagram commutes:

    (EBG program) ------ ebgEval ------> ebgVal
          |                                ^
       compile                    trans o javaVMEval
          v                                |
      ebgInstr  -------- trans -------> javaInstr

The EBG VM uses a stack that contains function activation frames. Each frame
contains a code pointer to the current VM instruction, a pointer to the previous
stack frame and the address of the current local variable frame. The machine
instructions are defined as the type ebgInstr in the figure "EBG Compilation".
Compilation of an EBG program produces a sequence of EBG VM instructions.
The compiler is defined in the same figure. A program is compiled as follows:

    compile prog vars globals

where prog is an EBG program, vars is a list of variable names which occur
freely in prog, and globals is an environment associating top-level variable
names with the name of their defining package.

The Java VM is stack based. Each stack frame contains an object which is
currently handling a message, a collection of locals, a pointer to the current VM
instruction and a pointer to the previous stack frame. The object is always the
value of local 0 and provides a collection of field values. In addition, the machine
also contains a collection of classes which may be instantiated and whose static
fields can be referenced.

type ebgInstr =
    PushInt int
  | Local int
  | Global string string
  | PushLambda (list ebgInstr)
  | App
  | Force
  | Delay (list ebgInstr)

compile ::
  ebg ->
  list string ->
  env string string -> list ebgInstr

compile (EBGInt n) vars globals =
  [PushInt n]

compile (EBGVar s) vars globals =
  case lookup s globals of
    ""      -> [Local (pos s vars), Force]
    package -> [Global package s, Force]
  end

compile (Lambda arg body) vars globals =
  [PushLambda (compile body (arg:vars) globals)]

compile (Apply exp1 exp2) vars globals =
  let
    instrs1 = compile exp1 vars globals
    instrs2 = [Delay (compile exp2 vars globals)]
    instrs3 = [App]
  in instrs1 ++ instrs2 ++ instrs3

Figure: EBG Compilation
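The compilation scheme can also be sketched in Java. The classes Exp, EInt, EVar, ELam, EApp and CompileSketch are hypothetical names, not part of EBG. Instructions are rendered as strings, with nested code shown in brackets, and a variable is treated as local when it has no entry in globals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Java encoding of the toy expression syntax.
abstract class Exp {}
final class EInt extends Exp { final int n; EInt(int n) { this.n = n; } }
final class EVar extends Exp { final String s; EVar(String s) { this.s = s; } }
final class ELam extends Exp {
    final String arg; final Exp body;
    ELam(String arg, Exp body) { this.arg = arg; this.body = body; }
}
final class EApp extends Exp {
    final Exp fun; final Exp arg;
    EApp(Exp fun, Exp arg) { this.fun = fun; this.arg = arg; }
}

final class CompileSketch {
    // Instructions are rendered as strings; nested code appears in brackets.
    static List<String> compile(Exp e, List<String> vars, Map<String, String> globals) {
        List<String> out = new ArrayList<String>();
        if (e instanceof EInt) {
            out.add("PushInt " + ((EInt) e).n);
        } else if (e instanceof EVar) {
            String s = ((EVar) e).s;
            String pkg = globals.get(s);
            if (pkg == null) {
                out.add("Local " + vars.indexOf(s));   // position in the local frame
            } else {
                out.add("Global " + pkg + " " + s);    // static field of the package class
            }
            out.add("Force");                          // variables are bound to thunks
        } else if (e instanceof ELam) {
            ELam l = (ELam) e;
            List<String> inner = new ArrayList<String>();
            inner.add(l.arg);                          // the new argument goes at the front
            inner.addAll(vars);
            out.add("PushLambda " + compile(l.body, inner, globals));
        } else if (e instanceof EApp) {
            EApp a = (EApp) e;
            out.addAll(compile(a.fun, vars, globals)); // operator is evaluated first
            out.add("Delay " + compile(a.arg, vars, globals));  // operand is delayed
            out.add("App");
        } else {
            throw new IllegalArgumentException("unknown expression");
        }
        return out;
    }

    public static void main(String[] args) {
        Exp identity = new ELam("x", new EVar("x"));
        Exp program = new EApp(identity, new EInt(3));
        System.out.println(compile(program, new ArrayList<String>(),
                                   new HashMap<String, String>()));
    }
}
```

The operand of an application is wrapped in Delay and so never runs during compilation of the call itself; it becomes a thunk which is evaluated only when a Force reaches it.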

When compiled code executes on the Java VM, the value of local 0 is always an
instance of a subclass of Closure or Thunk. The value of local 1 is always the
current local frame.

The figure "Java VM Instructions" shows an EBG type javaInstr whose values
represent the Java machine instructions used to implement the toy language. The
instructions are briefly explained as follows:

type javaInstr =
    VMNew int
  | Aload_0
  | Aload_1
  | Astore_1
  | Bipush int
  | GetStatic string string string
  | Return
  | InvokeVirtual string
  | GetField string
  | Dup
  | InvokeSpecial string

type VMClass =
    VMClosure int (list javaInstr)
  | VMThunk int (list javaInstr)

Figure: Java VM Instructions

VMNew n            instantiate the class named n
Aload_0            push the current object onto the stack
Aload_1            push the current local frame onto the stack
Astore_1           set the current local frame from the head of the stack
Bipush n           push the integer n onto the stack
Return             return the value at the top of the stack from the current method call
InvokeVirtual m    call the method m, where the target is on the stack below the arguments
GetField f         push the value of field f
Dup                duplicate the head of the stack
InvokeSpecial m    initialise the object at the head of the stack

Translation of EBG VM instructions and class lifting is performed by the EBG
function trans, defined in the figure "Translation of EBG VM to Java VM". A
translation is

    trans instr classes

where instr is an EBG VM instruction and classes is a list of subclasses
of both Closure and Thunk. The elements of classes are produced by class
lifting and are represented as values of type VMClass, defined in the figure
"Java VM Instructions". The names of these classes are modelled as integers in
the translation. Translation produces a pair

trans ::
  ebgInstr ->
  list VMClass -> (list javaInstr, list VMClass)

trans (PushInt n) classes = ([Bipush n], classes)

trans (Local n) classes =
  ([Aload_1,
    Bipush n,
    InvokeVirtual "local(I)LValue;"],
   classes)

trans (Global package name) classes =
  ([GetStatic package name "LThunk;"], classes)

trans (PushLambda instrs) classes =
  letrec
    name = length classes
    c = VMClosure name (is ++ [Return])
    (is, classes') = maptrans instrs (c:classes)
  in ([VMNew name,
       Dup,
       Aload_1,
       InvokeSpecial "<init>(LFrame;)V"],
      classes')

trans App classes =
  ([InvokeVirtual "apply(LThunk;)LValue;"], classes)

trans Force classes =
  ([InvokeVirtual "force()LValue;"], classes)

trans (Delay instrs) classes =
  letrec
    name = length classes
    g = [GetField "frame", Astore_1]
    t = VMThunk name (g ++ is ++ [Return])
    (is, classes') = maptrans instrs (t:classes)
  in ([VMNew name,
       Dup,
       Aload_1,
       InvokeSpecial "<init>(LFrame;)V"],
      classes')

Figure: Translation of EBG VM to Java VM

PushLambda
  PushLambda
    Local
    Force
    Delay
      Local
      Force
    App
  Delay
    PushLambda
      Local
      Force
      Delay
        Local
        Force
      App
  App

Figure: EBG VM instructions for M1

    (instrs, classes)

where instrs is a list of Java VM instructions and classes is an extended
list of subclasses. The figure "Translation of EBG VM to Java VM" shows that the
translation process macro-expands the EBG VM instructions and lifts classes each
time a PushLambda or a Delay instruction is encountered.
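The macro-expansion and lifting steps can be sketched in Java as follows. The names EbgInstr and TransSketch are hypothetical, the Global instruction is left out for brevity, and the use of local variable 1 for the current frame (Aload_1, Astore_1) is an assumption of the sketch. Lifted classes are numbered in order of creation, which need not match the numbering an actual implementation would produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical Java encoding of an EBG VM instruction.
final class EbgInstr {
    final String op;            // PushInt, Local, PushLambda, App, Force or Delay
    final int n;                // operand of PushInt and Local
    final List<EbgInstr> body;  // nested code of PushLambda and Delay

    private EbgInstr(String op, int n, List<EbgInstr> body) {
        this.op = op; this.n = n; this.body = body;
    }
    static EbgInstr simple(String op) { return new EbgInstr(op, 0, null); }
    static EbgInstr withInt(String op, int n) { return new EbgInstr(op, n, null); }
    static EbgInstr nested(String op, EbgInstr... body) {
        return new EbgInstr(op, 0, Arrays.asList(body));
    }
}

final class TransSketch {
    // Lifted classes in creation order; a class's name is its index here.
    final List<String> classes = new ArrayList<String>();

    List<String> transAll(List<EbgInstr> instrs) {
        List<String> out = new ArrayList<String>();
        for (EbgInstr i : instrs) out.addAll(trans(i));
        return out;
    }

    List<String> trans(EbgInstr i) {
        switch (i.op) {
            case "PushInt":
                return Arrays.asList("Bipush " + i.n);
            case "Local":
                return Arrays.asList("Aload_1", "Bipush " + i.n,
                                     "InvokeVirtual local(I)LValue;");
            case "App":
                return Arrays.asList("InvokeVirtual apply(LThunk;)LValue;");
            case "Force":
                return Arrays.asList("InvokeVirtual force()LValue;");
            case "PushLambda":
                return lift("VMClosure", new ArrayList<String>(), i.body);
            case "Delay":
                // A thunk first installs its saved frame as the current frame.
                return lift("VMThunk",
                    new ArrayList<String>(Arrays.asList("GetField frame", "Astore_1")),
                    i.body);
            default:
                throw new IllegalArgumentException("unknown instruction " + i.op);
        }
    }

    // Create a new class for the nested code and return code that instantiates it.
    private List<String> lift(String kind, List<String> code, List<EbgInstr> body) {
        int name = classes.size();
        classes.add(null);                  // reserve the name before translating the body
        code.addAll(transAll(body));
        code.add("Return");
        classes.set(name, kind + " " + name + " " + code);
        return Arrays.asList("VMNew " + name, "Dup", "Aload_1",
                             "InvokeSpecial <init>(LFrame;)V");
    }

    public static void main(String[] args) {
        TransSketch t = new TransSketch();
        List<String> top = t.transAll(Arrays.asList(
            EbgInstr.nested("PushLambda",
                EbgInstr.withInt("Local", 0), EbgInstr.simple("Force")),
            EbgInstr.nested("Delay", EbgInstr.withInt("PushInt", 3)),
            EbgInstr.simple("App")));
        System.out.println(top);
        for (String c : t.classes) System.out.println(c);
    }
}
```

Reserving the class name before translating the body plays the role of the recursive letrec binding: nested lambdas and delays receive later names, while the enclosing code only ever refers to a class by its number.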

Consider the expression M1, which is defined in an earlier section. The figure
"EBG VM instructions for M1" shows the result of representing M1 as a value of
type ebg and then using compile to produce EBG VM instructions.

The figure "Java VM instructions for M1" shows the classes produced by translating
the EBG VM instructions to Java classes using trans. The subclasses of Closure
correspond to the functions M1, M2 and M3. The subclasses of Thunk are used to
delay the evaluation of function arguments.

VMNew, Dup, Aload_1,
InvokeSpecial "<init>(LFrame;)V"

VMThunk
  GetField "frame", Astore_1,
  Aload_1, Bipush,
  InvokeVirtual "local(I)LValue;",
  InvokeVirtual "force()LValue;",
  Return

VMClosure
  Aload_1,
  Bipush,
  InvokeVirtual "local(I)LValue;",
  InvokeVirtual "force()LValue;",
  VMNew,
  Dup, Aload_1,
  InvokeSpecial "<init>(LFrame;)V",
  InvokeVirtual "apply(LThunk;)LValue;",
  Return

VMThunk
  GetField "frame", Astore_1,
  VMNew,
  Dup, Aload_1,
  InvokeSpecial "<init>(LFrame;)V",
  Return

VMThunk
  GetField "frame", Astore_1,
  Aload_1, Bipush,
  InvokeVirtual "local(I)LValue;",
  InvokeVirtual "force()LValue;",
  Return

VMClosure
  Aload_1, Bipush,
  InvokeVirtual "local(I)LValue;",
  InvokeVirtual "force()LValue;",
  VMNew,
  Dup, Aload_1,
  InvokeSpecial "<init>(LFrame;)V",
  InvokeVirtual "apply(LThunk;)LValue;",
  Return

VMClosure
  VMNew,
  Dup, Aload_1,
  InvokeSpecial "<init>(LFrame;)V",
  VMNew,
  Dup, Aload_1,
  InvokeSpecial "<init>(LFrame;)V",
  InvokeVirtual "apply(LThunk;)LValue;",
  Return

Figure: Java VM instructions for M1

Interlanguage Communication

The EBG environment allows communication between EBG and Java code
within the same Java machine. Communication occurs through the Java library
java.lang.reflect, which allows Java programs to inspect and manipulate
themselves during program execution.

EBG packages are implemented as Java classes where the top-level definitions
are encoded as static fields of type Thunk. When ebg loads the first EBG package
it searches for the value of the field main and forces its value:

    Field mainField = mainClass.getField("main");
    Thunk mainThunk = (Thunk) mainField.get(null);
    Class thunkClass = (Class) loadedClasses.get("Thunk");
    Method force = thunkClass.getMethod("force");
    EBGsystem(force.invoke(mainThunk));

where mainClass is the class produced by loadClass, mainThunk is the value
of main in mainClass, and force is the method which forces thunk objects. The
Java method EBGsystem is supplied with the result of forcing mainThunk.
EBGsystem is responsible for supplying the value of main with a sequence of
Java VM responses to the sequence of requests which are generated. The model
of EBG execution is shown below:

    list response --> [ EBG program ] --> list command
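The reflective lookup and forcing of main described above can be tried out with a small self-contained sketch. ReflThunk, SievePackageSketch and ForceMainSketch are hypothetical stand-ins: the first for the EBG Thunk class, the second for the distinguished class of a loaded package.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical stand-in for the EBG Thunk class.
abstract class ReflThunk {
    private Object cache;
    public Object force() {
        if (cache == null) cache = value();
        return cache;
    }
    public abstract Object value();
}

// Hypothetical stand-in for the distinguished class of a package:
// each top-level definition is a public static field holding a thunk.
class SievePackageSketch {
    public static ReflThunk main = new ReflThunk() {
        public Object value() { return "started"; }
    };
}

final class ForceMainSketch {
    // Find the static field main by name, then force it through reflection.
    static Object forceMain(Class<?> mainClass) {
        try {
            Field mainField = mainClass.getField("main");
            Object mainThunk = mainField.get(null);          // null target: static field
            Method force = ReflThunk.class.getMethod("force");
            return force.invoke(mainThunk);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("package has no usable main", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(forceMain(SievePackageSketch.class));
    }
}
```

Nothing in forceMain names the package class statically, so the same code serves any package that the loader returns.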

The commands produced by the definition of main in the package Sieve, defined
in an earlier figure, are new and send. The new command is handled by creating a
new instance of the class name and adding it to the list of responses:

    Class namedClass = Class.forName(name);
    addResponse(namedClass.newInstance());

The send command is handled by finding the appropriate method called name,
invoking the method with respect to the supplied object and argVals, and then
adding the return value to the list of responses:

    Class objClass = object.getClass();
    Method m = objClass.getMethod(name, argTypes);
    Object[] args = new Object[] { argVals };
    addResponse(m.invoke(object, args));
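A runnable sketch of the two command handlers is given below. CommandSketch, handleNew and handleSend are hypothetical names, and the responses list stands in for the response stream. The sketch uses getDeclaredConstructor().newInstance() because Class.newInstance is deprecated in recent Java releases.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical dispatcher for the new and send commands.
final class CommandSketch {
    final List<Object> responses = new ArrayList<Object>();

    // new: instantiate the named class and record the instance as a response.
    void handleNew(String name) {
        try {
            Class<?> namedClass = Class.forName(name);
            responses.add(namedClass.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + name, e);
        }
    }

    // send: look the method up by name and argument types, invoke it on the
    // target object and record the return value as a response.
    void handleSend(Object object, String name, Class<?>[] argTypes, Object[] argVals) {
        try {
            Method m = object.getClass().getMethod(name, argTypes);
            responses.add(m.invoke(object, argVals));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot send " + name, e);
        }
    }

    public static void main(String[] args) {
        CommandSketch cs = new CommandSketch();
        cs.handleNew("java.util.ArrayList");
        Object list = cs.responses.get(0);
        cs.handleSend(list, "add", new Class<?>[] { Object.class }, new Object[] { "ebg" });
        cs.handleSend(list, "size", new Class<?>[0], new Object[0]);
        System.out.println(cs.responses.get(2));
    }
}
```

Each command yields exactly one response, so a program consuming the response list can match responses to the commands it issued by position.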

Conclusion

This work aims to provide a mixed-paradigm programming environment which
offers the advantages of functional programming (definition by cases, parametric
polymorphism, lazy evaluation, higher-order functions, algebraic types) and
the advantages of Java programming (object-oriented execution, inclusion
polymorphism, portability, graphics, networking, multiprocessing).

To achieve this aim, a new programming language called EBG has been
designed and constructed. EBG offers many of the features of a modern
functional programming language, compiles to the Java VM language and provides
primitive features which allow the two languages to interact.

This paper has described the implementation of EBG in terms of the toy languages
Java, ebgInstr and javaInstr. These are sublanguages of the corresponding
components of the real implementation whose features express the
essential implementation characteristics.

In addition to those described in this paper, EBG has a collection of standard
functional programming features, including pattern matching in definitions
and case expressions (Peyton Jones), type checking and type inference
(Cardelli), and named modules consisting of collections of type and value
definitions which can be exported by the defining module and imported by other
modules.

EBG functions have any number of arguments. The mechanism for maintaining
local variables via instances of Frame is generalised to linked lists of
heap-allocated local frames, where each frame has a number of entries corresponding
to the function arguments.

EBG provides local variable binding using case, let and letrec expressions.
In each case the compiler generates code which extends the current local frame
with the appropriate number of values.

Compilation of EBG is very simple-minded. This has the benefit that the
interface between the two languages is clean; for example, closures and thunks
can be passed freely between EBG and Java because they are implemented as
Java objects.

In principle, closure-like and thunk-like objects can be created by Java as
instances of subclasses of Closure and Thunk, then passed to EBG programs.
This interface provides scope for experimenting with new types of function; for
example, functions can be created which connect to other Java machines over a
network and which produce a stream of results.

The disadvantage of simple-minded compilation is slow execution speed for
EBG programs. In addition, the Java VM code which is produced does not make
efficient use of the Java VM stack, for example by passing function arguments
via a stack frame rather than as part of instances of Frame.

EBG currently exists as a prototype implementation written in Java. The
compiler uses the Java compiler compiler javacc, which automatically generates
a portion of the Java source code. EBG has been used to write a number of EBG
libraries, some tutorial examples and the code in this paper.

The next phase of EBG work will address its compilation and the expansion
of EBG VM instructions to Java VM instructions. In addition, functional
programming research has produced a number of techniques for analysing and
transforming programs in order to increase their speed and decrease their space
usage. These techniques include strictness analysis (Peyton Jones), the
STG machine (Peyton Jones) and deforestation (Wadler).

EBG is novel since it is a lazy functional programming language which compiles
to the Java VM. Haskell evaluates lazily but does not compile to the Java
VM. MLJ, developed by Persimmon IT, is a compiler for Standard ML which
produces Java bytecodes. Standard ML is a higher-order functional programming
language with an eager evaluation strategy.

Kawa (Bothner a, b) is an implementation of the Lisp derivative
Scheme which compiles to the Java VM. Although Scheme employs an eager
evaluation strategy, the translation of Kawa directly to the Java VM uses
mechanisms similar to those of EBG. For example, Kawa implements Scheme procedures
as instances of subclasses of a Java abstract class Procedure, which defines a
collection of apply methods.

Pizza (Odersky & Wadler) and, more recently, GJ (Bracha et al.)
are extensions of the Java language which aim to address the problem of
parametric types. In the case of Pizza, Java is extended with parametric types such
as list of anything, which are incompatible with existing Java types such as
list of Object. GJ aims to extend Pizza so that both of these types have the
same representation. Our approach differs in that we have provided parametric
types in EBG, which is a different language from Java but can be executed on
the same machine. The lazy evaluation mechanism of EBG is not addressed by
either Pizza or GJ.

Future plans for EBG include increasing the sophistication of its compilation
and making the Java graphics, networking and multiprocessing facilities
available within a functional programming language.

References

Bird, R. & Wadler, P. Introduction to Functional Programming. Prentice Hall
Series in Computer Science.

Bothner, P. (a) Kawa: Compiling Dynamic Languages to the Java VM.
Presented at the Usenix Conference in New Orleans.

Bothner, P. (b) Kawa: Compiling Scheme to Java. Presented at the
Lisp Users Conference in Berkeley, CA.

Bracha, G., Odersky, M., Stoutamire, D. & Wadler, P. Making the
future safe for the past: Adding Genericity to the Java Programming
Language. In Proceedings of the Annual ACM SIGPLAN Conference
on Object-Oriented Programming, Systems, Languages and Applications
(OOPSLA).

Cardelli, L. Basic Polymorphic Type Checking. Science of Computer
Programming.

Cardelli, L. & Wegner, P. On understanding types, data abstraction
and polymorphism. ACM Computing Surveys.

Clark, A. N. A Layered Object-Oriented Programming Language.
GEC Journal of Research.

Clark, A. N. Semantic Primitives for Object-Oriented Programming
Languages. PhD Thesis, Queen Mary and Westfield College, University of
London.

Cook, W. A Denotational Semantics of Inheritance. PhD Thesis,
Brown University.

Field, A. J. & Harrison, P. G. Functional Programming.
Addison-Wesley Publishing Company.

Flanagan, D. Java in a Nutshell, Second Edition. O'Reilly.

Garside, R. & Mariani, J. Java: First Contact. Course Technology.

Hankin, C. Lambda Calculi: a Guide for Computer Scientists.
Clarendon Press, Oxford University Press.

Henderson, P. Functional Programming: Application and Implementation.
Prentice-Hall International.

Meyer, B. Object-Oriented Software Construction. Prentice Hall
International Series in Computer Science.

Odersky, M. & Wadler, P. Pizza into Java: Translating theory into
practice. Symposium on Principles of Programming Languages.

Peyton Jones, S. L. The Implementation of Functional Programming
Languages. Prentice-Hall International Series in Computer Science.

Peyton Jones, S. L. Implementing lazy functional languages on stock
hardware: the Spineless Tagless G-machine. Journal of Functional
Programming.

Plotkin, G. Call-by-name, call-by-value and the λ-calculus.
Theoretical Computer Science.

Sabry, A. & Wadler, P. A Reflection on Call-by-Value. ACM
Transactions on Programming Languages and Systems.

Venners, B. Inside the Java Virtual Machine. McGraw-Hill.

Wadler, P. Deforestation: Transforming programs to eliminate trees.
Theoretical Computer Science.