<<

Unboxed values as rst class citizens in a nonstrict functional

language

Simon L Peyton Jones and John Launchbury

Department of Computing Science University of Glasgow G QQ

fsimonp j jlgdcsglasgowacuk

July

Abstract

The co de compiled from a nonstrict functional program usually manipulates heap

allo cated boxed numbers Compilers for such languages often go to considerable trouble

to optimise op erations on b oxed numbers into simpler op erations on their unboxed forms

These optimisations are usually handled in an ad hoc manner in the co de generator

b ecause earlier phases of the compiler have no way to talk ab out unboxed values

We present a new approach which makes unboxed values into nearly rstclass

citizens The language including its is extended to handle unboxed values

The optimisation of b oxing and unboxing op erations can now b e reinterpreted as a of

correctnesspre se rving program transformations Indeed the particular transformations

required are ones which a compiler would want to implement anyway The compiler

b ecomes b oth simpler and more mo dular

Two other b enets accrue Firstly the results of strictness analysis can b e exploited

within the same uniform transformational framework Secondly new algebraic data types

with unboxed comp onents can b e declared Values of these types can b e manipulated

much more eciently than the corresp onding b oxed versions

Both a static and a dynamic semantics are given for the augmented language The

denotational dynamic semantics is notable for its use of unpointed domains

Introduction

Most compilers have a phase during which they attempt to optimise the program by applying

correctnesspreserving transformations to it Constant folding is a particular example of this

sort of transformation pro cedure inlining is another

Functional languages are esp ecially amenable to such transformation b ecause of their simple

semantics Nonstrict functional languages are nicest of all b ecause reduction sometimes

called unfolding in a transformational context is always valid in a strict language it is only

valid if the b o dy of the b eing unfolded is guaranteed to evaluate the argument



This pap er app ears in the Pro ceedings of the Conference on Languages and

Computer Architecture Cambridge Sept

Several researchers have b egun to express more and more of the work of the compiler in

the form of correctnesspreserving transformations see Section Such an approach has

obvious advantages Firstly each transformation can b e proved to b e correct indep endent of

the others When more of the compiler takes the form of such transformations it is easier

to prove the compiler correct Secondly each transformation exp oses opp ortunities for other

transformations The more that is done within a transformation phase the more chance there

is for such b enecial interactions

There is an imp ortant class of optimisations for nonstrict languages which has so far b een

b eyond the scop e of program transformation These all relate to the treatment of socalled

unboxed values which we introduce in the next section Instead this family of optimisations

is generally implemented in an ad hoc manner in the co de generator

Following some preliminaries Sections and this pap er makes four main contributions

Most imp ortant we show how the class of unboxedvalue optimisations can formulated

as correctnesspreserving transformations Section Unboxed values are made rst

class citizens distinguished from their b oxed counterparts by the type system These

transformations do not generate much b etter co de than current compilers do our in

tention is only to present the optimisations in a new and elegant way

We show how to apply the same idea to express and exploit the results of strictness

analysis in a uniform way Section

We show how the approach can b e generalised to other algebraic data types Section

Declaring and using such unboxed data types gives a substantial p erformance improve

ment

We provide a formal underpinning for the approach by giving b oth a static semantics

and a denotational dynamic semantics for the language extended with unboxed values

Sections and The dynamic semantics has the desirable prop erty that the

semantic equations are unchanged from those for the language without unboxed values

despite the extra strictness of the extended language This eect is achieved by using

unpointed domains

We conclude by discussing some languagedesign issues reviewing related work and mention

ing some areas for further work

The problem

Consider the following function denition

double x x x

In a nonstrict language x may b e unevaluated when double is called in which case a p ointer

to a heapallo cated closure or susp ension for x is passed to double When double needs

xs value in this case right away it evaluates the closure As a side eect of this evaluation

the closure of x is overwritten with its value Any further attempts to evaluate x will succeed

immediately returning its value This way of ensuring that unevaluated closures are evaluated

at most once is called

Unfortunately lazy evaluation tends to make arithmetic horribly inecient if some care is not

taken In particular in a nave implementation numbers are always represented by a p ointer

to a heapallo cated ob ject which is either an unevaluated closure or is a b ox containing

the numbers actual value which has now overwritten the closure This means that a simple

arithmetic op eration which would take a single machine instruction in a strict language

requires quite a long sequence of instructions the two op erands are fetched from their b oxes

the op eration p erformed a new b ox allo cated to contain the result and the result placed in

it

The pattern representing the value itself on which the builtin machine instructions op er

ate is called an unboxed value Unboxed values come in a variety of shap es and sizes bit

integers bit integers single and doubleprecision oating p oint numbers and so on are

all unboxed values A p ointer to a heapallo cated b ox containing an unboxed value is called

a boxed value Clearly it is vastly more ecient to manipulate unboxed values than b oxed

ones

Quite a lot can b e done For example consider again the denition of double given ab ove

A nave compiler would compile co de for double which would evaluate x extract its unboxed

value then evaluate x again and extract the value again then add the two and b ox the result

A slightly cleverer compiler would realise that it already had xs value in hand and refrain

from the second evaluation As another example consider the following denition

f x y z x y z

A nave compiler might generate co de to b ox the value of yz only to unbox it again right

away A cleverer compiler can elide these unnecessary op erations

As a nal example consider the following call to double

double pq

The straightforward approach is to build a closure for pq and pass it to double which will

evaluate it But double is clearly going to evaluate its argument so it is a waste to allo cate

the closure only for double to evaluate it and discard it for the garbage collector to recover

later It would b e much b etter for the caller to evaluate p and q add their values and pass

pq to double in unboxed form

The Core language

We b egin with a few preliminaries to set the scene for our new prop osals

Our compilation route involves the following steps

The source language is Haskell Hudak et al a stronglytyped nonstrict

purelyfunctional language Haskells main innovative feature is its supp ort for sys

tematic overloading but we do not discuss this asp ect at all here

Haskell is compiled to the Core language All Haskells syntactic sugar has b een

compiled out type checking p erformed and overloading resolved Patternmatching

has b een compiled into case expressions each of which p erforms only a single level of

matching Boxing and unboxing are explicit as describ ed b elow

Program analyses and transformations are applied to the Core language In particular

the b oxingunboxing optimisations are carried out here

The Core language is translated to the STG language the abstract machine co de for the

Spineless Tagless Gmachine our evaluation mo del The Spineless Tagless Gmachine

was initially presented in Peyton Jones Salkild but the latter has b een com

pletely rewritten as a companion to this pap er Peyton Jones

The STG language is translated to Abstract C This is just an internal

which can b e simply printed out as C co de and thence compiled to native co de with

standard C compilers Abstract C can also serve serve as an input to a co de generator

thereby generating native co de directly but we have not yet implemented this route

In this pap er we only concern ourselves with the Core language which is introduced in the next

section However since the Core language do es not have explicit algebraic type declarations

we b orrow Haskells syntax for this purp ose when required

The syntax of the Core language

The abstract syntax of the Core language is given in Figure The binds constituting a prog

should dene main which is taken to b e the value of the program

The concrete syntax we use is conventional parentheses are used to disambiguate application

asso ciates to the left and binds more tightly than any other op erator the b o dy of a lambda

abstraction extends as far to the right as p ossible the usual inx arithmetic op erators are

p ermitted the usual syntax for lists is allowed with inx constructor and empty list

and where the layout makes the meaning clear we allow ourselves to omit semicolons

b etween bindings and case alternatives

Notice that the bindings in letrec expressions are all simple that is the left hand side of

the binding is always just a variable Function bindings are expressed by binding a variable to

a lambda abstraction However we p ermit ourselves the small lib erty in the concrete syntax

of writing the arguments of function bindings to the left of the sign

Similarly the patterns in case expressions are all simple nested has b een

compiled to nested case expressions

Program prog binds

Bindings binds bind ::: bind n

n

bind var expr

Expression expr expr expr Application

j var expr Lambda abstraction

j case expr of alts Case expression

j let bind in expr Lo cal denition

j letrec binds in expr Lo cal recursion

j con Constructor

j var Variable

j literal

Literal values literal integer

j oat

Alternatives alts calt ::: calt default expr n

n

j lalt ::: lalt var expr n

n

Constructor alt calt con var ::: var expr n

n

Literal alt lalt literal expr n

Figure Syntax of the Core language

Here is an example program to illustrate these p oints

fac n case n of

n n factorial n

main fac

A semantics for the Core language

In this section we present a denotational semantics for welltyped Core language programs

We do so with a little more care than usual b ecause we want it to form a basis for the

denotational semantics of unboxed types later We need to do two things

Give a model which denes a domain D  for each type  in the language

Give a valuation function E e which for every expression e of type  gives its value

in the domain D 

The mo del

Beginning with the mo del we give the syntax of types 



j Int

j Float

j  

j  : : :  n

n n

Here is a type variable and ranges over type constructors of arity n These type con

n

structors arise from algebraic data type declarations made by the programmer For example

the declaration

data Tree a Leaf a Branch Tree a Tree a

introduces a Tree with arity Lists b o oleans pairs and other types which

usually come built in are all regarded as examples of such algebraic data types

We dene the domain corresp onding to each type  inductively thus

D Monotype Typevar Dom Dom

D  

D Int  fThe set of xedprecision integersg

?

D Float  fThe set of xedprecision oatingp oint numbersg

?

D    D   D  

D  : : :   D   ::: D  

n n n n

The environment  maps type variables to domains where the type  has no free variables

we write simply D  Notice that the arrow on the left hand side of the fourth equation is

part of the syntax of types whereas on the right hand side it stands for the function space

constructor or functor for domains In just the same way the stands for a domain

n

constructor so we must obviously say just how is dened for any given algebraic data

n

type declaration The general form of an algebraic data type declaration is as follows

data : : : c  : : :  j ::: j c  : : : 

t a n n na

n

where t n > and a Corresp onding to this declaration we give the following

i

functor denition

d ::: d s ::: s

t n ?

where  d ; : : : ; d

n n

s D   ::: D   i n

i i ia

i

The domain constructions are categorical pro duct and sum and lifting they

?

are dened in Figure This equation is more usually given using separated sum instead of

categorical sum and omitting the lifting but the result is the same in either case as is easily

veried The reason we choose this formulation is that it extends smo othly when we add

unboxed types

Lifting A fg flift a j a Ag

?

Categorical sum A B fh ; a i j a Ag fh ; b i j b B g

Categorical pro duct A B fha ; b i j a A; b B g

Figure Denitions of domain constructions

This construction gives rise to a mutually recursive set of functor denitions These can

b e solved in the usual way to dene a domain D  for each type  with no free vari

ables See Smyth Plotkin for a categorical account or Schmidt for a more

elementorientated treatment The latter is particularly useful as it discusses unp ointed ie

b ottomless domains an asp ect imp ortant later on

The semantic equations

That completes the description of the domains involved so it remains only to give the deni

tions of the valuation functions Figure which gives these denitions contains no surprises

The valuation function P gives the meaning of programs E give the meaning of expres

sions B gives the meaning of groups of denitions and K which is not further dened

give the meaning of literal constants

The valuation function E takes an expression and an environment and returns a value We

use Env for the domain of environments and Val for the domain of values dening them

like this

S

Env var D 





S

D  Val



The environment maps a variable of type  to a value in the domain D  and the domain

of values is the union of all the D  In these two equations  ranges only over types with

no free type variables

We need a notation for writing values in the domains corresp onding to algebraic data types

Since only binary sum and pro duct were dened such values should formally consist of nested

pairs but these are rather tiresome to write Instead we will p ermit ourselves the lib erty of

writing such values in the attened form

hc ;  ; : : : ;  i

n

where c is a constructor of arity n and  are values from the appropriate argument domains

i

The initial environment  contains bindings for all the builtin functions For example it

init

P program Val

P prog E letrec prog in main 

init

E expr Env Val

E k  K k

E x   x

E e e E e  E e 

E x e  :E e  fx g

E let x e in b  E b  fx E e g

0 0

E letrec binds in e  E e  x  :B binds  

E c   : : : :  :hc ;  ; : : : ;  i

a a

E case e of c x ::: x e ::: c x ::: x e default e 

a n n na n d

n

case E e  of

hc ;  ; : : : ;  i E e  fx  ;:::; x  g

a a a

:::

hc ;  ; : : : ;  i E e  fx  ;:::; x  g

n n na n n n na na

n n n

else E e 

d

end

E case e of k e ::: k e x e 

n n d

case E e  of

lift k E e 

:::

lift k E e 

n n

lift  E e  fx g

d

end

B binds Env Env

B x e ::: x e  fx E e ; : : : ; x E e g

n n n n

Figure Denotational semantics of the Core language

contains the following binding for the addition function

x :y :case x of

0

lift x case y of

0 0 0

lift y lift x y

end

end

This denition also illustrates the semantic case construct which we use to discriminate

among elements of a domain

The main idea exp osing unboxed types to transformation

We are now ready to present the key idea of the pap er The big problem with earlier ap

proaches to the b oxing issue is this there is no way to talk about unboxed values in the Core

language An immediate consequence is that evaluation of numbers is implicit For example

in the denition

f x y y x

x and y must b e evaluated b efore they can b e subtracted but this fact is implicit as is the

order in which x and y are evaluated

This implicit evaluation is in contrast with the situation for algebraic data types For example

the length function might b e dened like this

length xs case xs of

x xs length xs

Here the evaluation of xs is completely explicit In general it is precisely case expressions

and nothing else which p erform evaluation of data structures

This suggests an obvious improvement p erhaps the evaluation of numbers can b e done by

case expressions as well Supp ose we wrote the denition of f like this

f x y case y of

Int y case x of

Int x case y x of

t Int t

The idea is that the data type Int of xedprecision integers is no longer primitive Instead

we can imagine Int b eing declared in Haskell like this

data Int Int Int

That is the Int type is an algebraic data type like any other with a single constructor

Int Here and elsewhere we will use the same name for the type and its constructor The

Int constructor has one comp onent of type Int which is the primitive type of unboxed

xedprecision integers A new primitive op erator subtracts values of type Int

The outermost case expression in our new formulation of f serves to evaluate y and extract

its unboxed comp onent y The next case expression p erforms a similar function for x giving

x Next the dierence b etween y and x is computed and b ound to t and nally the

function returns a value of type Int obtained by applying the Int constructor to t

At rst sight the innermost case expression is rather curious Why not just write the

following instead

Int y x

The reason we chose to use case for this purp ose is to make explicit that the subtraction is

p erformed b efore the result of the function is returned

In general we will use a sign to identify unboxed types and to identify variables whose type

is unboxed This only serves to make the presentation clearer the characters are treated

by the compiler in the same way as any other alphab etic as part of a name

We note in passing that the transformation from the rst form of f to the second can b e

carried out in a systematic way merely by giving the following denition to the subtraction

op erator which was previously considered primitive

p q case p of

Int p case q of

Int q case p q of

t Int t

Now the passage from one version of f to the other is just a matter of reduction unfolding

the application of to its two arguments

Now that Int has b een expressed in terms of a more primitive type its constructors the

integers and so on must b e regarded as short for Int Int and so on where

and are the unboxed constants for zero and one Similarly pattern matching against

integers b ecomes a twostage pro cess The function

f e

f e

f n en

is shorthand for

f Int e

f Int e

f n en

The latter will get translated by the patternmatching compiler to

f n case n of Int n case n of

e

e

default en

This has a straightforward op erational interpretation The outer case evaluates n and ex

tracts its unboxed contents n The inner case scrutinises n and selects the appropriate

alternative

Now that we can express programs involving unboxed values we can demonstrate how the

optimisations mentioned in Section can b e reinterpreted as program transformations

Avoiding rep eated evaluation

Consider the expression

xx

A nave implementation would evaluate and unbox x twice Let us see what the expression

lo oks like when we unfold the application of just as we unfolded in the previous section

It b ecomes this

case x of

Int x case x of

Int x case x x of

t Int t

Now it is a simple observation that the inner case is scrutinising the same value as the outer

case In general the following transformation holds

case e of ::: c x ::: x ::: case e of ::: c y ::: y body ::: :::

n n

case e of ::: c x ::: x ::: body x =y ::: x =y ::: :::

n n n

There is a side condition we assume that every binding site binds a distinct variable so

that the two o ccurrences of the expression e denote the same value and so that body do es

not rebind the x Applying this transformation to the expression we are studying we get

i

case x of

Int x case x x of

t Int t

which expresses precisely the optimisation we were seeking

Eliding redundant b oxing op erations

As a second example consider the expression

x y z

As mentioned earlier we want to avoid wrapping a b ox around the value of yz b ecause

it will immediately b e unwrapped by the enclosing addition As b efore let us see what the

expression lo oks like when the applications of and are unfolded It b ecomes this

case x of

Int x case case y of

Int y case z of

Int z case y z of

t Int t

of

Int w case x w of t Int t

This expression do es not lo ok very promising but it yields to the following wellknown trans

formation

case case e of p e ::: p e of alts

n n

case e of p case e of alts ::: p case e of alts

n n

The same uniquebindin g side condition is necessary here to o to ensure that the meaning of

alts is not changed by b eing moved into the scop e of the patterns p Where the expression

i

scrutinised by a case is itself a case expression the cases can b e interchanged as shown

We call this the caseofcase transformation for obvious reasons

In general there is a danger of duplicating co de b ecause if the inner case has multiple

alternatives alts will b e duplicated Happily in the expression we are transforming the

inner case has just one alternative We can apply the transformation three times to give

case x of

Int x case y of

Int y case z of

Int z case y z of

t case Int t of

Int w case x w of

t Int t

Now another generallyuseful transformation applies

case c x ::: x of ::: c y ::: y e :::

n n

e x =y ::: x =y

n n

That is a case expression which scrutinises a constructor applied to some variables can b e

replaced by the appropriate alternative after suitable renaming In general a constructor

which is applied to arbitrary expressions can b e transformed to one applied to variables by

introducing some letbindings Using this transformation on the inner case expression

gives

case x of

Int x case y of

Int y case z of

Int z case y z of

t case x t of

t Int t

This nal form refrains from b oxing t and then taking it apart just as we hop ed Instead

the result of the multiplication is used directly in the addition

Strictness analysis and unboxed calls

Much eort has b een devoted in the literature to strictness analysis which detects whether

or not a function is strict Hankin Abramsky For a sequential implementation the

signicance is that strict arguments can b e evaluated b efore the call In particular strict

numeric arguments can b e passed unboxed In the following section we show how the results

of strictness analysis can b e exploited within our transformational framework

Exploiting the results of strictness analysis

As with other asp ects of b oxing and unboxing the exploitation of strictness analysis is usually

left to the hapless co de generator Let us see how it can b e expressed in our extended language

Consider the factorial function with an accumulating parameter which in Haskell might

lo ok like this

afac a a

afac a n afac na n

Translated into the Core language it would take the following form

afac a n case n of

Int n case n of

a

n afac na nInt

Integer constants have b een replaced by the Int constructor applied to the corresp onding

unboxed constant and patternmatching have b een translated into case expressions

Supp ose that a strictness analyser informs us that afac is strict in b oth its arguments Then

without as yet doing anything further to its b o dy we can transform it into the following

pair of denitions

afac a n case a of Int a case n of Int n afac a n

afac a n let n Int n

a Int a

in

case n of

Int n case n of

a

n afac na nInt

The wrapp er afac evaluates the arguments and passes them unboxed to the worker

afac Recall that the annotations are merely present as cues to the human reader

the compiler treats as part of a name like any other letter The latter consists of a

letexpression whose bindings reconstruct the original arguments and whose b o dy is the

unmo died b o dy of the original afac

Now we can go to work on the b o dy of afac We unfold the denitions of and afac

itself and apply the transformations describ ed earlier One further related transformation is

needed

let x c x ::: x in ::: case x of ::: c y ::: y body ::: :::

n n

let x c x ::: x in ::: body x =y ::: x =y :::

n n n

This transformation is applied to the auxiliary let bindings for a and n after which no uses

of a and n remain so the let bindings for them can b e dropp ed A few moments work should

convince you that the result is this

afac a n case n of

Int a

n case n a of

a case n of

n afac a n

Bingo afac is just what we hop ed for a strict constantspace ecient factorial function

Even the recursive call is made directly to afac rather than going via afac Meanwhile

afac acts as an imp edancematcher to provide a b oxed to afac

Now much of the b enet of strictness comes from the fact that often the arguments are

already partly or completely evaluated b efore the call For example consider the following

call to afac

if x then afac x else

Here x is already evaluated b efore we reach the call to afac All we need to do to take

advantage of this is to always unfold calls to afac though not afac This unfolding

exp oses the two evaluations which afac do es The transformations already discussed can

then eliminate b oth of them leaving a direct call to afac

To summarise the results of strictness analysis can b e uniformly incorp orated into the trans

formation pro cess by applying the following pro cedure to each strict function

Split the function into two a wrapper function which evaluates the strict arguments and

passes them to the work function The work function uses letbindings to reconstitute

the original arguments but is otherwise identical to the original function

Unfold all applications of the wrapp er function wherever p ossible

Optimise using the transformations describ ed earlier

Strictness over nonnumeric types

In fact this is not quite the whole story Firstly a function can b e strict in an argument

whose type is a list It is far from clear what the wrapp er function should do in this case

Nor is it clear what the wrapp er should do in the case of a function strict in a functional

argument

The most obvious cases where something useful can b e done are these when the strict ar

gument is of a data type which has just one constructor Then the wrapp er can evaluate

the ob ject extract its comp onents and pass them to the work function Actually it is

only necessary to pass the free variables of the work functions b o dy to the work function

Comp onents which are not used can b e discarded by the wrapp er

To take an example where the constructor has more than one comp onent consider the function

f t case t of xy

It is clearly strict in its argument which is a pair Therefore we split it into two thus

f t case t of xy f x y

f x y let t xy in

Now a call to f in which an explicit pair is given thus

f ee

will b enet from unfolding the wrapp er for f and simplifying giving

f e e

The example stands revealed as the standard currying transformation another example of an

ad hoc transformation app earing as a sp ecial case

Unboxing results

The second addition to the strictness story can b e seen by attempting the same transformation

on the nonaccumulating factorial function Here is the original denition

fac n case n of

Int n case n of

Int

n n fac nInt

Again b ecause fac is strict we may split it into a wrapp er and a worker thus

fac n case n of Int n fac n

fac n case n of

Int

n case n of

n case fac n of

Int m case n m of

r Int r

The p oint is that fac has type Int Int not Int Int Hence fac contains co de for

taking apart the b oxed integer returned by the recursive call to fac the next to innermost

case do es this

It lo oks as if a new b oxed Int is constructed in the heap for each recursive call and this will

indeed b e the case for many implementations

Can we avoid this problem Observing that the result of a function call is always required or

else the call would not have b een made we can mo dify the way in which the split into wrapp er

and work functions is made If the result of the original function is a singleconstructor type

as in this case we can split like this

fac n case n of Int n case fac n of t Int t

fac n let n Int n

in

case original body of fac of

Int r r

Now fac has type Int Int and transformation turns it into

fac n case n of

1

It happ ens not to b e the case for the Spineless Tagless Gmachine which returns even apparently b oxed

integers among other things in a register The reason for this is that the only reason a b oxed integer is ever

evaluated is to unbox it Even so returning an unboxed value is still slightly more ecient than returning a

b oxed one

n case n of

n case fac n of m n m

To conclude the p oint of all this is not so much that any one alternative is obviously much

b etter than any other but that rather the new expressiveness oered by a language featuring

unboxed types allows us to discuss and explore a wide range of options within a single uniform

framework

Generalising unboxed types

The presentation of Section gave a denition for the Int type as an algebraic data type

based on a primitive unboxed type Int This is rather suggestive can we generalise the

idea to all algebraic data types so that Int is not sp ecial but rather a particular case of a

general feature

It turns out that we can indeed do so in two distinct ways

We can allow any constructor and not just Int to take arguments of unboxed type

We can allow any algebraic data type and not just Int to b e unboxed

Constructors with unboxed comp onents

Consider the following conventional denition of a data type for complex numbers and a

corresp onding addition function

data Cpx Cpx Int Int Real and imaginary parts

addCpx Cpx r i Cpx r i Cpx rr ii

But since all constructors are nonstrict what addCpx will do is to build a Cpx b ox containing

p ointers to an unevaluated closure for rr and another for ii Now this may b e exactly

what the programmer wanted it allows values such as Cpx for example but the

implementation cost is heavy compared with a strict language which would simply build a

Cpx b ox containing the values of rr and ii

The discussion of Section suggests the following alternative

data UCpx UCpx Int Int

addUCpx UCpx i r UCpx i r

case i i of

t case r r of

t UCpx t t

Now the comp onents of the UCpx constructor are unboxed integers and are therefore computed

b efore the constructor is built

Arbitrary unboxed algebraic data types

An enumeration type is an algebraic data type whose constructors all have zero arity For

example

data Boolean False True

data Colour Red Green White Blue

Ob jects of type Boolean and Colour are b oxed yet it makes sense to think of an unboxed

b o olean or colour represented as one of a suitable set of bitpatterns This observation

provokes the question can we allow the programmer to declare new unboxed enumeration

types p erhaps like this

data unboxed Colour Red Green White Blue

data unboxed Boolean False True

It would certainly b e more ecient to represent a value of type Colour or Boolean than

to represent one of type Boolean or Colour As usual the annotations are present only

to clarify the presentation it is the unboxed keyword which indicates that the enumeration

should b e unboxed

The idea can b e generalised further by allowing any algebraic data type to b e declared

unboxed For example

data unboxed UPair a b UPair a b

data unboxed UCpx UCpx Int Int

data unboxed Maybe a Just a Nothing

Each of these declarations has a natural op erational reading The type of unboxed pairs

UPair is represented by a pair of p ointers These p ointers are actually carried around to

gether rather then placing them in a heapallo cated b ox and carrying around a p ointer to

this b ox Similarly the UCpx type is represented by a pair of unboxed integers

The Maybe type is a little dierent b ecause it has more than one constructor It can b e

represented by a bit to distinguish one constructor from the other together with enough words

to contain the comp onents of any of the constructors one in this case The implementation

would b e trickier here and the eciency gains might b e less clear cut Since the sole purp ose

of these unboxed types is to improve eciency it is not clear whether it is worth implementing

them in full generality

Recursive unboxed types are even less plausible For example

data unboxed UTree a ULeaf a UBranch UTree a UTree a

Here the size of an unboxed tree would b e variable and it is hard to imagine how it could

ever b e implemented eciently Accordingly we imp ose the rule that there must b e at least

one b oxed type involved in any recursive lo op of types see Section

The semantics of unboxed types

So far we have motivated the introduction of unboxed types by showing a number of ways in

which they can b e useful It is now time to give the idea some formal foundation

We do so in two steps

We discuss the mo dications necessary to the type system that is the static semantics

Section

We mo dify the dynamic semantics of the Core language to take account of unboxed

values and discuss the safety of program transformations in the presence of unboxed

values Section

Before we b egin this sequence we discuss an imp ortant caveat We are trying to make

unboxed values into rstclass citizens but it turns out that we must make three signicant

restrictions on their use

The rst two of these restrictions certainly make writing programs involving unboxed types

less convenient But remember that we are talking here ab out the Core language rather

than ab out the source language in which the programmer writes There are several steps we

can take to reduce the impact of these restrictions on the programmer by inserting implicit

evaluations and co ercions but it is b etter to deal with one issue at a time We therefore

concentrate now on giving a semantics for the restricted language while in Section we

discuss the languagedesign question

Restriction loss of p olymorphism

Unboxed ob jects are dangerous b easts if the garbage collector should ever treat one as a

p ointer to a heap ob ject the entire system might crash Furthermore the size of an unboxed

ob ject may vary with its type An Int ob ject may b e the same size as a p ointer but a

LongInt ob ject would b e larger as would a bit oating p oint number and an unboxed

pair

One approach would b e to add tag to distinguish b oxed and unboxed ob jects and give

size and layout information for unboxed ob jects This would b e intolerably slow For example

the simple function

head xxs x

would have to test the tag on the value x to nd out how many to move into the result

lo cations This approach would largely obviate one of the ma jor b enets of strong typing

namely the p erformance advantage of working with untagged data App el

Accordingly we assume that no tag bits distinguish b oxed from unboxed ob jects relying

solely on the type system to keep them apart It follows immediately that

Restriction Polymorphic functions cannot manipulate unboxed values

Restriction explicit evaluation

Whenever an unboxed value is stored in a or passed to a function it must

rst b e evaluated Whilst this could b e left implicit we prefer to make it explicit The main

reason for doing so is that it allows the dynamic semantics to b e much more straightforward

and ensures that most existing program transformations remain valid Section

There is a simple syntactic criterion which tells if an expression of unboxed type is in head

normal form HNF or not an expression of unboxed type is in HNF if it is

a literal constant

an application of an unboxed constructor or

a variable

An unboxed constructor is a constructor of an unboxed algebraic data type see Sec

tion The only surprise here is that a variable is an HNF this follows from the fact that

unboxed variables are b ound to the bitpattern corresp onding to the value so then cannot

b e b ottom Since functions are all b oxed no partial application is an HNF of unboxed type

Now we can formalise our restriction

Restriction An expression of unboxed type which appears as the argument

of an application or as the righthand side of a binding must be in HNF

As an example of this restriction the expression

f x y

is illegal b ecause x y is not an HNF It must instead b e written

case x y of t f t

The evaluation has thereby b een made explicit

Restriction p ermits recursive denitions of unboxed values which at rst lo ok unreasonable

They do not seem to b e very useful however so we ignore them until Section

Restriction no recursive unboxed data types

For the reasons discussed ab ove in Section and to ensure the domain equations of Section

have a least solution we prohibit recursive denitions of unboxed data types

Polytype  :  j 

Monotype   j 

Boxed type  Type variable

j  

j  : : :  Parameterised b oxed data type

n

Unboxed type  Int

j Float

j  : : :  Parameterised unboxed data type

n

Figure Syntax of types

Restriction There must be at least one boxed type involved in any recursive

loop of types

This is rather similar to the usual rule that type synonyms must not b e recursive

The static semantics of unboxed types

Since all three restrictions are expressed in terms of types it follows that the type system

must b e mo died to embo dy them Restriction is a simple syntactic check but Restrictions

and are more interesting

Types

The syntax for types is given in Figure A monotype  can b e b oxed or unboxed

A b oxed type  is a type variable a function type or a type constructor parameterised

only over further b oxed types Each type constructor corresp onds to a b oxed algebraic

data type declaration

Functions are interesting The argument and result of a function type can b e any type b oxed

or otherwise but the function itself is regarded as boxed b ecause it is represented by a p ointer

to a closure

An unboxed type  is either one of the builtin unboxed types such as Int or Float or an

unboxed type constructor parameterised over b oxed types Each such type constructor

corresp onds to an unboxed algebraic data type declaration

The only surprise here is that data types cannot b e parameterised over unboxed types For

example do es not the type List Int which would b e written Int in Haskell make

p erfect sense The diculty is that the garbage collector cannot tell that this particular list

contains unboxed ob jects which should not b e treated as p ointers and furthermore standard

list cells might well b e unable to accomo date say a doubleprecision oating p ointer number

Another way to think of it is this the list constructor is p olymorphic and hence should

not b e applied to unboxed values Restriction

The only constructors which can have unboxed comp onents are those which have b een explic

itly declared as such such as UCpx Notice that the type UCpx is a b oxed type despite having

unboxed comp onents The restriction is that a p olymorphic type cannot b e parameterised

with unboxed comp onents

Typing

We move on to consider how the type rules can b e adjusted to accomo date the new con

straints Fortunately it is rather easy Most type systems have a rule lo oking like this for

typical examples of complete typing rules see Damas Milner or Hanco cks chapter

in Peyton Jones

A e : 

SPEC

A e   =

This rule is used to instantiate p olymorphic types by substituting an arbitrary monotype 

for the type variable in  All that we need to do to implement Restriction is to ensure

that p olymorphic types are never instantiated with unboxed types thus

A e : 

SPEC

A e   =

That is only b oxed types  should b e substituted for Restriction is also easy to embo dy

We need two rules for application instead of one

A e   A e  

A h  A e 

AP AP

A e e  A e h 

The AP rule applies to functions taking b oxed arguments while the AP rule insists that

an unboxed argument must b e an HNF h The rule for let and letrec need to altered in a

similar way

And that is all No further changes are required to the type system

Type inference

Next we consider what changes need to b e made to a type inference system to accomo date

the new rules and the accompanying two kinds of types

SPEC is usually implemented in a type inference engine by substituting a fresh type variable

for the universally quantied variable This is subsequently sp ecialised as much as necessary

by unication To implement the new SPEC rule we therefore need to mo dify the unication

pro cess the unier should fail to unify an unboxed type with a polymorphic type variable For

example consider the expression

id

where id is the p olymorphic identity function with type : and is an unboxed

integer of type Int The type of id is instantiated to where is a fresh type variable

Then an attempt will b e made to unify with Int and at this p oint a type error will b e

rep orted

This is not quite all we have to do Consider the function denition

uInc x x

Function denitions are usually typed by adding the assumption x to the type environment

where is a fresh type variable and then typing the b o dy This yields a substitution which

can b e applied to to give the argument type for f In this case we do not want the type

checker to complain at the unication of with Int rather we want type inference to succeed

in attributing to uInc the type Int Int

What is required is two forms of type variables uncommitted ones which can unify with

anything and b oxed ones which cannot unify with unboxed types Function arguments

are given uncommitted type variables while b oxed type variables are used to instantiate

p olymorphic types

The same trick can b e used to implement the two forms of AP rule If the argument of the

function is not simple variable we unify the type of the argument with a b oxed type variable

In eect this tags the argument type which might at this stage b e only an uncommitted

type variable so that it can only subsequently unify with a b oxed type as required

This completes our sketch of the mo dest changes required to a type inference system to

accomo date unboxed types It is a little surprising that the type system do es not need two

kinds of type variables only the type inferencer do es It would b e p ossible to introduce two

kinds of type variables into the type system as well and it might make it easier to prove

soundness and completeness results if this were done

The dynamic semantics of unboxed types

Next we consider what changes need to b e made to the dynamic semantics of the Core

language to accomo date unboxed types As b efore we need to give a domain mo del for each

type and we need to give the semantic equations

The mo del

The rst question is what is the domain D Int corresp onding to the type of unboxed

integers Our rst attempts to answer this question suggested that it should b e the usual

domain of integers including but that meant that D Int the domain of b oxed integers

had to have an extra b ottom This double lifting gives rise to all sorts of complications over

which we draw a kindly veil

A much b etter solution is to use unpointed domains that is domains which lack a b ottom

element A go o d introduction is given by Schmidt So the domain of unboxed integers

is dened thus

D Int fThe xedprecision integersg

The domain has no b ottom element and the ordering relation is just the identity The lack

of a b ottom element for unboxed domains corresp onds nicely to our intuition a variable of

unboxed type can never b e b ound to b ottom b ecause the whole computation will diverge

b efore the binding takes place

The dierence b etween the domains corresp onding to Int and Int is that the latter is not

lifted Now it is p ossible to see why we made the outermost lifting explicit in our mo del

for algebraic data types in Section it makes it easy to generalise the mo del to unboxed

data types merely by omitting the lifting Sp ecically for an unboxed algebraic data type

declaration of form

data unboxed : : : c  : : :  j ::: j c  : : : 

t a n n na

n

we derive the following functor denition

d ::: d s ::: s

t n

where  d ; : : : ; d

t t

s D   ::: D   i n

i i ia

i

The only dierence from the b oxed case is the absence of the outermost lifting

The inductive denition for D from Section go es through largely unchanged

D  

D Int  fThe set of xedprecision integersg

D Float  fThe set of xedprecision oatingp oint numbersg

D    D   D  

D    D   D  

?

D  : : :   D   ::: D  

n n n n

D  : : :   D   ::: D  

n n n n

The main dierence comes in the function space construction A function returning an un

b oxed value may fail to terminate so its result type must b e lifted

The resulting domain equations are wellfounded provided Restriction is observed namely

that every recursive lo op of type declarations includes at least one b oxed type

2

This decision is not as ad hoc as it may app ear The restricted form of function application p ermitted by

Restriction directly implements Kleisli comp osition over the lifting monad where an arrow from A to B is

a function from A to B Moggi

?

The semantic equations

The ma jor dierence in the semantics of programs involving unboxed values is that unboxed

arguments are evaluated b efore calling the function to which they are passed At rst it lo oks

as though this will require a signicant adjustment to the semantics but this is the p oint at

which Restriction comes into its own Because Restriction has already made all evaluation

explicit no changes whatsoever are required to the semantic equations of Figure

Whilst the equations themselves do not change the denition of the domains Val and Env

need to b e adjusted in a somewhat subtle way

First the valuation of an expression of unboxed type may fail to terminate and hence as

with function types we need to lift the result domain

D  D  Val

?

 

Do es the same need to b e done for Env No it do es not b ecause we claim that a variable

of unboxed type can never be bound to bottom To justify this claim consider all the places

where a variable of unboxed type can b e b ound

The application of a lambda abstraction Here the argument can only b e a variable

Restriction so if the claim is true of the argument it will also b e true of the variable

b ound by the abstraction

The evaluation of a case alternative Here variables are b ound to the comp onents of a

constructor But if any of these comp onents is of unboxed type then the corresp onding

argument at the call of the constructor will b e a variable and hence cannot b e b ottom

The evaluation of a case alternative Here a variable is b ound to the value which the

case evaluates But if this value is b ottom then the case diverges and so the binding

never takes place

This all corresp onds directly to our intuition The runtime environment will contain p ointers

to p erhaps as yet unevaluated b oxed values and some unboxed values The latter cannot

b e b ottom So we retain the same denition for Env

var D  Env





Transforming programs involving unboxed types

The whole thrust of this pap er has b een to exp ose programs involving unboxed values to opti

mising transformations cf Section It is legitimate to ask whether all the transformations

we know so well still apply in the new setting This is not a trivial question For example

supp ose that we did not imp ose Restriction so that unboxed arguments could b e nontrivial

expressions Then reduction is no longer valid For example consider the expression

x f

where f has type Int Int and f happ ens to diverge on argument Then b ecause

unboxed arguments must b e evaluated b efore a call this expression has value But a

simple reduction transforms this expression to and hence changes the meaning

The most imp ortant reason for Restriction is to prevent this sort of problem All the

usual program transformations remain valid in the new setting We can justify this claim by

observing that al l the existing semantic equations remain unchanged so if a transformation

previously preserved correctness then it wil l do so in the extended language as wel l This

sounds obvious but many of our earlier attempts at a dynamic semantics required dierent

semantic rules and so the correctness of transformations was a matter for sp eculation

There is a small caveat We need to check that the result of a program transformation still

ob eys Restrictions and that is they are still welltyped in the sense of Section

Language design issues

The sole reason for introducing unboxed values in the rst place is to improve eciency There

are no programs which one can write using unboxed values which cannot b e written equally

easily without It is therefore far from obvious whether unboxed values should b e exp osed

to the programmer at all it would b e quite p ossible to use them solely as a convenient

notation inside the compiler helping to supp ort the compilationbytransformation paradigm

as discussed in Sections and

Even so there is a strong case for making unboxed values directly available to the programmer

includin g the generalisations prop osed in Section as we now discuss A p ossible compromise

would b e to make such language extensions available only to the systems programmer for

example the p erson who writes the complexnumber arithmetic package

Consider the UCpx data type introduced in Section In principle it is p ossible that a

very clever program analyser could lo ok at a program written in terms of Cpx and addCpx

and gure out that it would not change the meaning of the program to rewrite it in terms

of UCpx and addUCpx In practice this is far b eyond what current compiler technology can

do b ecause data can b e placed in a data structure in one part of the program and used

somewhere else entirely In any case such an analysis would b e imp ossible in the presence of

separate compilation

In short the eciency improvements arising from generalised unboxed algebraic data types

cannot realistically b e obtained without involving the programmer yet these p erformance

improvements can b e substantial

The other trouble with making unboxed values part of the source language is that Restrictions

and are very tiresome The remaining parts of this section consider how they may b e

alleviated

Automatic evaluation

It is a simple matter to lift Restriction in the source language by transforming the program

to ob ey the restriction after type inference is complete For example whenever a function

is applied to a nonHNF argument of unboxed type the application is enclosed in a case

expression which evaluates the argument and binds it to a variable which is then passed to

the function

Similarly whenever a letexpression binds a nonHNF value of unboxed type it is replaced

with the corresp onding case expression

This transformation cannot b e applied to letrec expressions b ecause there is no corresp ond

ing case expression For the same reason it cannot b e applied to the toplevel bindings of a

program

Automatic co ercion

It is also p ossible to lift Restriction which prohibits unboxed values from b eing passed

to p olymorphic functions at least where there is a b oxed form of the same data type by

co ercing to and from the b oxed form

For example a value of type Int may b e passed to a p olymorphic function by rst b eing

co erced type Int by applying the Int constructor to it Similarly a value of type Int can b e

co erced into one of type Int by evaluating it and extracting its value This is exactly the

approach taken by Leroy

For example the expression

foldr

would after the co ercions have b een introduced lo ok like this

case foldr x y case x of Int x

case y of Int y

case x y of t Int t

Int

Int Int

of

Int t t

The unboxed constants have b een b oxed to make them compatible with the p olymorphic

functions foldr and the list constructor the corresp onding co ercions have b een done to the

function passed to foldr and the result of the foldr is then co erced back to Int

It is debatable whether all of this extra stu should get introduced by the compiler p erhaps

without the programmer realising that it is taking place After all the whole purp ose of the

exercise is to improve p erformance so hidden p erformance losses are bad news

The other problem is that in general it is p ossible to insert co ercions in more than one place

and still obtain a correct program Satish Thatte gives a go o d presentation of the issues

Thatte

Related work

There is a long tradition of compilation by transformation starting with Steeles Rabbit

compiler Steele which pioneered continuationpassing style CPS CPS allows many

representation decisions to b e exp osed and expressed in the language b eing transformed

rather than b eing hidden in a blackbox co de generator The Orbit compiler Kranz

for Scheme is built on these ideas to make a pro ductionquality optimising compiler

App el and MacQueens Standard ML compiler is a further development in this line with even

greater mo dularity esp ecially near the back end App el Jim Kelseys thesis entitled

Compilation by program transformation is also CPSbased but he handles imp erative

languages as well Kelsey

Fradet and LeMetayer describ e another transformationbased compiler based on CPS Fradet

Metayer Fradet Metayer Their approach is unusual in that their target

co de is a continuationbased combinator language which has a direct reading either as a

functional program or as a sequence of abstract machine instructions The trouble is that they

are thereby forced to make an early commitment to some lowlevel representation decisions

such as stack layout

All of this work relates to compilationbytransformation of languages with strict semantics

CPS is wonderful for such languages b ecause it makes explicit the order of evaluation which

is implied by the semantics For nonstrict language where the order of evaluation is demand

driven CPS is not nearly so useful A case expression is the nearest we get to it evaluate

this expression and then continue in one of these ways dep ending on the result you get

The caseofcase transformation and its cousins caseofconstructor caseofvariable in a

scop e where the variable is b ound to a constructor are not new They were the key transfor

mations in the deforestation algorithm given by Wadler whose purp ose is to eliminate

intermediate data structures in functional programs Indeed the work describ ed here could

b e seen as an exploration of the eects of exp osing unboxed types to deforestation the inter

mediate data structures in this case b eing the b oxes around values

None of these works directly addresses the main theme of this pap er namely the treatment

of unboxed values in a nonstrict language The Clean compiler from Nijmegen Brus et al

do es supp ort unboxed values at the program level but the details of what it do es and

how it works are not published

Petersons pap er whose title Untagged data in tagged environments is sup ercially similar

to this pap er addresses a dierent though related problem Peterson Supp ose that

one to ok the ideas of this pap er but dropp ed Restriction which restricts p olymorphism

using tagging to identify unboxed data Peterson then prop oses an analysis to discover regions

of the program in which the tags need not b e attached to the unboxed values which is of

course much more ecient than manipulating the tagged representation It is not clear how

well his analysis would work in a nonstrict language where the order of evaluation is much

harder to predict

Leroys work has already b een mentioned Leroy It deals with a strict language and

concentrates mainly on the co ercion issues discussed in Section It nicely complements

this pap er

Further work

Overloading the builtin op erators

In Section we introduced the following example

data Cpx Cpx Int Int

addCpx Cpx r i Cpx r i Cpx rr ii

Supp ose that one had written the former denition of Cpx and addCpx and then decided to

make the comp onents of Cpx unboxed It would b e nice if one could just replace Int with

Int in the denition of the Cpx data type but make no change to the addCpx function thus

data Cpx Cpx Int Int

addCpx Cpx r i Cpx r i Cpx rr ii

We are assuming that Restriction is lifted as discussed ab ove As things stand addCpx

and every other function which manipulates the comp onents of a Cpx constructor needs to

b e changed to use use unboxed op erations instead of b oxed ones

addCpx Cpx r i Cpx r i Cpx r r i i

This is a nuisance and it would b e nice if the compiler could gure such things out for itself

At rst it seems that Haskells overloading mechanism might solve the problem but this is

defeated by Restriction Hence we cannot declare an instance of the class Num for unboxed

integers Int

Foreignlanguage interfacing

One of the reasons it is quite tricky to call written in another language eg C

from a nonstrict functional program is b ecause other language generally manipulate unboxed

values Once we can manipulate unboxed values in the functional language it is likely to b e

come easier to build a direct interface to other imp erative languages We have not investigated

this yet

Unboxed functions

Do es it make any sense to talk of unboxed functions For example a C program can pass

co de addresses around and then call them

In general a function is represented by a co de p ointer and an environment An unboxed

function could p erhaps b e a co de p ointer together with a p ointer to an environment But

then not much has b een gained compared with making the co de p ointer just one more eld

in the environment

The time that there would b e a substantial b enet would b e for functions which had no free

variables that is no environment Such functions can indeed b e represented by just a co de

address just like C

Recursive denitions of unboxed values

Recursive denitions of unboxed values are p ermitted by Restriction provided their right

hand sides are HNFs and they even make sense For example given the type denitions

data unboxed Two Two One Int

data One One Two

the following recursive denition makes sense attributing to x the type Two

x Two One x

The way to understand such a denition is by naming each sub expression and then in the

denition of each b oxed variable replacing each unboxed variable by its denition This leaves

a set of recursive denitions of b oxed values and some nonrecursive denitions of unboxed

values In this example the result is

x Two y

y One Two y

Only the denition of y is recursive We can always do this op eration b ecause of Restriction

provided there is no lo op of the form

p q

q p

This rather unsatisfactory caveat is one reason we left this section under Further work In

the light of the transformation given ab ove another approach would b e to rule out recursive

denitions of unboxed values

Acknowledgements

When we were enmeshed in various rather complicated attempts at a dynamic semantics for

unboxed values John Hughes had the insight that we should try using unp ointed domains

Kei Davis Cordelia Hall Will Partain John Peterson Ryszard Kubiak and Phil Wadler all

made lots of very useful comments which help ed to improve the presentation Our grateful

thanks to them

This work was done with the supp ort of the SERC GRASP pro ject and the ESPRIT Seman

tique Basic Research Action

Bibliography

AW App el Runtime tags arent necessary CSTR Department of Computer

Science Princeton University

AW App el T Jim Jan Continuationpassing closurepassing style in Pro c ACM

Conference on Principles of Programming Languages ACM

TH Brus MCJD van Eckelen MO van Leer MJ Plasmeijer Sept Clean a language

for functional graph rewriting in Functional programming languages and computer

architecture Portland G Kahn ed LNCS Springer Verlag

LMM Damas R Milner Principal type schemes for functional programs in POPL

P Fradet D Le Metayer Compilation of functional languages by program transfor

mation IRISA Campus de Beaulieu Rennes

P Fradet D Le Metayer June Compilation of lambdacalculus into functional ma

chine co de INRIA

in Abstract Interpretation of Declarative Languages C Hankin S Abramsky eds

Ellis Horwoo d Chichester

P Hudak PL Wadler Arvind B Boutel J Fairbairn J Fasel K Hammond J Hughes T Johns

son R Kieburtz RS Nikhil SL Peyton Jones M Reeve D Wise J Young April

Rep ort on the functional Haskell Department of Computing

Science Glasgow University

R Kelsey May Compilation by program transformation YALEUDCSRR PhD

thesis Department of Yale University

DA Kranz May ORBIT an optimising compiler for Scheme PhD thesis Department

of Computer Science Yale University

X Leroy April Ecient data representations in p olymorphic languages INRIA Re

search Rep ort Ro cquencourt

E Moggi June Computational lambda calculus and monads in Logic in Computer

Science California IEEE

J Peterson Sept Untagged data in tagged environments choosing optimal represen

tations at compile time in Functional Programming Languages and Computer Ar

chitecture London Addison Wesley

SL Peyton Jones The implementation of functional programming languages Prentice

Hall

SL Peyton Jones Feb The Spineless Tagless Gmachine a second attempt Depart

ment of Computing Science University of Glasgow

SL Peyton Jones Jon Salkild Sept The Spineless Tagless Gmachine in Functional

Programming Languages and Computer Architecture D MacQueen ed Addison

Wesley

DA Schmidt Denotational semantics a metho dology for language development Allyn

and Bacon

MB Smyth GD Plotkin Nov The categorytheoretic solution of recursive domain

equations SIAM Journal of Computing

GL Steele Rabbit a compiler for Scheme AITR MIT Lab for Computer Sci

ence

SR Thatte Co ercive type equivalence Department of Maths and Computer Science

Clarkson University NY

P Wadler Deforestation transforming programs to eliminate trees Theoretical Com

puter Science