A mo dular fullylazy lambda lifter in Haskell

Simon L Peyton Jones

Department of Computing Science University of Glasgow G QQ

David Lester

Department of Computer Science University of Manchester M PL

June

Abstract

An imp ortant step in many compilers for functional languages is lambda lifting In

his thesis Hughes showed that by doing lambda lifting in a particular way a useful

prop erty called ful l laziness can b e preserved Hughes Full laziness has b een seen

as intertwined with lambda lifting ever since

We show that on the contrary full laziness can b e regarded as a completely separate

pro cess to lambda lifting thus making it easy to use dierent lambda lifters following a

fulllaziness transformation or to use the fulllaziness transformation in compilers which

do not require lambda lifting

On the way we present the complete co de for our mo dular fullylazy lambda lifter

written in the Haskell language

This paper appears in Software Practice and Experience May

Introduction

Lambda lifting and full laziness are part of the folk lore of functional programming yet the

few published descriptions of fullylazy lambda lifting have b een either informal or hard to

understand with the notable exception of Bird Bird Our treatment diers from

earlier work in the following ways

The main technical contribution is to show how full laziness can b e separated from

lambda lifting so that the two transformations can b e carried out indep endently This

makes each transformation easier to understand and improves the mo dularity of the

compiler

Our treatment deals with a language including let and letrecexpressions Not only is

this essential for ecient compilation in the later stages of most compilers but we also

show that eliminating letrec expressions by translating them into lambda abstrac

tions loses full laziness To our knowledge this has not previously b een realised

We show how to decomp ose each of the transformations further into a comp osition of

simple steps each of which is very easy to program thus further improving mo dularity

We have developed an elegant use of parameterised data types which allows the type

system to help express the purp ose of each pass

We present the complete source co de for our solution and we do cument some of the exp e

rience we gained in developing it indeed the pap er can b e read as an exercise in functional

programming

The co de is presented in the functional programming language Haskell and the source text

for the pap er is an executable literate Haskell program Lines of co de are distinguished by a

leading sign the rest of the text b eing commentary

We introduce all the notation and background that is required to understand the pap er The

rst three sections introduce the language to b e compiled the Haskell language and the main

data types to b e used Following these preliminaries we develop a nonfullylazy lambda lifter

and then rene it into a fully lazy one The pap er concludes with some optimisations to the

fullylazy lambda lifter

The language

We b egin by dening a small language L on which our compiler will op erate The abstract

syntax of the language is given by the following pro ductions

expression name

j constant Literals and builtin functions

j expression expression Application

j let defns in expression Nonrecursive denitions

j letrec defns in expression Recursive denitions

j  name expression Lambda abstraction

defns def ::: def n >

n

def name expression

The concrete syntax is conventional parentheses are used to disambiguate application as

so ciates to the left and binds more tightly than any other op erator the b o dy of a lambda

abstraction extends as far to the right as p ossible the usual inx op erators are p ermitted

and denitions are delimited using layout Notice that the bindings in letrec expressions are

all simple that is the left hand side of the binding is always just a variable Functions are

dened by binding variables to lambda abstractions Here is an example to illustrate these

p oints

letrec

fac n if n n factorial n

in

fac

An imp ortant feature of this language is that any pure highlevel functional programming

language can b e translated into it without loss of expressiveness or eciency This pro cess is

describ ed in some detail by Peyton Jones Peyton Jones

Notice that letrecexpressions are retained despite the fact that they are resp onsible for

many of the subtleties in the rest of the pap er They can certainly b e transformed into

applications of lambda abstractions and the Y combinator but doing so straightforwardly

may result in a serious loss of eciency Peyton Jones Chapter For example

eliminating a letexpression introduces a new lambda abstraction which will under the com

pilation scheme describ ed in this pap er b e lambdalifted and thereby decomp ose execution

into smaller steps Similarly a naive treatment of mutual lo cal recursion using Y involves

much packing and unpacking of tuples which can easily b e avoided if the letrecexpression

is handled explicitly Finally later on we show that laziness can b e lost if letexpressions are

not handled sp ecially

Haskell

Haskell is a recentlydesigned common nonstrict pure functional language Hudak et al

For the purp oses of this pap er it is fairly similar to SML or Miranda except in its

treatment of abstract data types and mo dules We make little use of Haskells ma jor technical

innovation namely systematic overloading using type classes

Haskell mo dules b egin with a declaration naming the mo dule and imp orting any mo dules it

requires

module LambdaLift where

import Utilities

Here the mo dule we will b e dening is called LambdaLift and it imp orts an auxiliary mo dule

called Utilities whose interface is given in the App endix

A data type for compilation

Every compiler has a data type which represents the abstract syntax tree of the program

b eing compiled The denition of this data type has a substantial impact on the clarity and

eciency of the compiler so it merits careful thought

First failed attempt

As a rst attempt the language describ ed in the previous section can translated directly into

the following Haskell algebraic data type declaration

1

Miranda is a trade mark of Research Software Ltd

2

Notice that data constructors such as EVar and ELam and type constructors such as Expresssion and

Name must b oth b egin with capital letter in Haskell

data Expression

EConst Constant

EVar Name

EAp Expression Expression

ELam Name Expression

ELet IsRec Definition Expression

We choose to represent names simply by their character string using a type synonym decla

ration

type Name Char

Constants may b e numbers b o oleans the names of global functions and so on

data Constant CNum Integer CBool Bool CFun Name

The denition of the Constant type can b e changed without aecting any of the rest of this

pap er As an example of the Expression type the Lexpression a would b e represented

by the Haskell expression

EAp EAp EConst CFun EVar a EConst CNum

Letexpressions can usually b e treated in the same manner as letrecexpressions so the two

are given a common constructor and distinguished by a ag of type IsRec It is convenient

to use a b o olean for this ag

type IsRec Bool

recursive True

nonRecursive False

Each denition in the denition list of an ELet construction is just a pair

type Definition Name Expression

Second failed attempt

It do es not take long to discover that this is an insuciently exible data type Many

compiler passes add information to the abstract syntax tree and we need a systematic way to

represent this information Examples of analyses which generate this sort of information are

freevariable analysis binding level analysis type inference strictness analysis and sharing

analysis

The most obvious way to add such information is to add a new constructor for annotations

to the Expr data type thus

data Expression

EVar Name

EAnnot Annotation Expression

together with an auxiliary data type for annotations which can b e extended as required

3

The type Set a is a standard abstract data type for sets whose interface is given in the App endix but

data Annotation FreeVars Set Name

Level Integer

This allows annotations to app ear freely throughout the syntax tree which app ears admirably

exible In practice it suers from two ma jor disadvantages

It is easy enough to add annotation information in the form just describ ed but writing

a compiler pass which uses information placed there by a previous pass is downright

awkward Supp ose for example that a pass wishes to use the freevariable information

left at every no de of the tree by a previous pass Presumably this information is attached

immediately ab ove every no de but the data type would p ermit several annotation no des

to app ear ab ove each no de and worse still none or more than one might have a free

variable annotation

Even if the programmer is prepared to certify that there is exactly one annotation no de

ab ove every tree no de and that it is a freevariable annotation the implementation will

still p erform patternmatching to check these assertions when extracting the annotation

Both of these problems namely the requirement for uncheckable programmer assertions

and some implementation overhead are directly attributable to the fact that every

annotated tree has the rather uninformative type Expression which says nothing ab out

which annotations are present

The second ma jor problem is that further exp erimentation reveals that two distinct

forms of annotation are required The rst annotates expressions as ab ove but the

second annotates the binding o ccurrences of variables that is the o ccurrences on the

lefthand sides of denitions and the b ound variable in a lambda abstraction We

will call these o ccurrences binders An example of the need for the latter comes in

type inference where it the compiler infers a type for each binder as well as for each

subexpression

It is p ossible to use the expressionannotation to annotate binders but it is clumsy and

inconvenient to do so

A happy ending

We will address the second problem rst since it has an easy solution All we need do is

parameterise the Expression type with resp ect to the type of binders thus

data Expr binder

EConst Constant

EVar Name

EAp Expr binder Expr binder

ELam binder Expr binder

ELet IsRec Defn binder Expr binder

type Defn binder binder Expr binder

whose implementation is not further dened

binder is a type variable which in Haskell must b egin with a lowercase letter We can easily

recover a denition of our original Expression type in which a binder is represented by a

Name using a type synonym declaration thus

type Expression Expr Name

Alternatively a data type in which binders are names annotated with a type can b e dened

thus

type TypedExpression Expr Name TypeExpr

where TypeExpr is a data type representing type expressions

Returning to annotations on expressions we can reuse the same technique by parameteris

ing the data type of expressions with rep ect to the annotation type We want to have an

annotation on every no de of the tree so one p ossibility would b e to add an extra eld to

every constructor with the annotation information This is inconvenient if for example you

simply want to extract the freevariable information at the top of a given expression without

p erforming case analysis on the ro ot no de This leads to the following idea each level of the

tree is a pair whose rst comp onent is the annotation and whose second comp onent is the

abstract syntax tree no de Here are the corresp onding Haskell data type denitions

type AnnExpr binder annot annot AnnExpr binder annot

data AnnExpr binder annot

AConst Constant

AVar Name

AAp AnnExpr binder annot AnnExpr binder annot

ALam binder AnnExpr binder annot

ALet IsRec AnnDefn binder annot AnnExpr binder annot

type AnnDefn binder annot binder AnnExpr binder annot

Notice the way that the mutual recursion b etween AnnExpr and AnnExpr ensures that every

no de in the tree carries an annotation The sort of annotations carried by an expression are

now manifested in the type of the expression For example an expression annotated with free

variables has type AnnExpr Name Set Name

It is a real annoyance that AnnExpr and Expr have to dene two essentially identical sets

of constructors There app ears to b e no way around this within the HindleyMil ner type

system It would b e p ossible to abandon the Expr type altogether b ecause the Expr a is

nearly isomorphic to AnnExpr a but there are two reasons why we chose not to do this

Firstly the two types are not quite isomorphic b ecause the latter distinguishes

from while the former do es not Secondly and more seriously it is very tiresome to write

all the s when building and patternmatching on trees of type AnnExpr a

Finally we dene two useful functions bindersOf and rhssOf righthandsides of which

each take the list of denitions in an ELet and pick out the list of variables b ound by the

letrec and list of righthand sides to which they are b ound resp ectively

bindersOf binderrhs binder

bindersOf defns name name rhs defns

rhssOf binderrhs rhs

rhssOf defns rhs namerhs defns

This denition illustrates the use of a type signature to express the type of the function to

b e dened and of a list comprehension in the right hand side of bindersOf which may b e

read the list of all names where the pair namerhs is drawn from the list defns Both

of these are now conventional features of functional programming languages

This completes our development of the central data type The discussion has revealed some of

the strengths and a weakness of the algebraic data types provided by all mo dern functional

programming languages

Lambda lifting

Any implementation of a lexicallyscop ed programming language has to cop e with the fact that

a function may have free variables Unless these are removed in some way an environment

based implementation has to manipulate linked environment frames and a reductionbased

system is made signicantly more complex by the need to p erform renaming during substi

tution A p opular way of avoiding these problems esp ecially in graph reduction implementa

tions is to eliminate all free variables from function denitions by means of a transformation

known as lambda lifting Lambda lifting is a term coined by Johnsson Johnsson

but the transformation was indep endently developed by Hughes Hughes A tutorial

treatment is given by Peyton Jones Peyton Jones

The lambdalifting issue is not restricted to functional languages For example Pascal allows

a function to b e declared lo cally within another function and the inner function may have

free variables b ound by the outer scop e On the other hand the C language do es not p ermit

such lo cal denitions In the absence of side eects it is simple to make a lo cal function

denition into a global one all we need do is add the free variables as extra parameters and

add these parameters to every call This is exactly what lambda lifting do es

In a functionallanguage context lambda lifting transforms an expression into a set of super

combinator denitions each of which denes a function of zero or more arguments and whose

b o dy contains no embedded lambda abstractions In this pap er we will use the convention

that the value of the set of denitions is the value of the sup ercombinator main this artice

avoids the need to sp eak of a set of sup ercombinator denitions together with an expression

to b e evaluated

To take a simple example consider the program

let

f x let g y xx y in g g

in

f

Here x is free in the abstraction y xx y The free variable can b e removed by dening

a new function g which takes x as an extra parameter but whose b o dy is the oending

abstraction and redening g in terms of g giving the following set of sup ercombinator

denitions

g x y xx y

f x let g g x in g g

main f

To stress the fact that this program is a set of sup ercombinator denitions we p ermit

ourselves to write the arguments of the sup ercombinator on the left of the sign

Matters are no more complicated when recursion is involved Supp ose that g was recursive

thus

let

f x letrec g y cons xy g y in g

in

f

Now x and g are b oth free in the yabstraction so the lambda lifter will pro duce the following

set of sup ercombinators

g g x y cons xy g y

f x letrec g g g x in g

main f

Notice that the denition of g is still recursive but the lambda lifter has eliminated the lo cal

lambda abstraction The program is now directly implementable by most compiler back ends

This is not the only way to handle lo cal recursive function denitions The main alternative is

describ ed by Johnsson Johnsson who generates directlyrecursive sup ercombinators

from lo callyrecursive function denitions in our example g would b e directly recursive

rather than calling its parameter g thus

g x y cons xy g x y

f x g x

main f

The lambda lifter he describ es is much more complex than the one we develop here For

Johnsson it is worth the extra work b ecause the back end of his compiler can pro duce b etter

co de from directlyrecursive sup ercombinators but it is not clear that this applies universally

to all implementations At all events we will stick to the simple lambda lifter here since our

main concern is the interaction with full laziness

Implementing a simple lambda lifter

We are now ready to develop a simple lambda lifter It will take an expression and deliver a

list of sup ercombinator denitions so this is its type

lambdaLift Expression SCDefn

Each sup ercombinator denition consists of the name of the sup ercombinator the list of its

arguments and its b o dy

type SCDefn Name Name Expression

It should b e the case that there are no ELam constructors anywhere in the third comp onent

of the triple Unfortunately there is no way to express and hence enforce this constraint

in the type except by declaring yet another new variant of Expr lacking such a constructor

This is really another shortcoming of the type system there is no means of expressing this

sort of subtyping relationship

The lambda lifter works in three passes

First we annotate every no de in the expression with its free variables This is used by

the following pass to decide which extra parameters to add to a lambda abstraction

The freeVars function has type

freeVars Expression AnnExpr Name Set Name

Second the function abstract abstracts the free variables from each lambda abstrac

tion replacing the lambda abstraction with the application of the new abstraction now

a sup ercombinator to the free variables For example the lambda abstraction

x yx yz

would b e transformed to

y z x yx yz y z

abstract has the type signature

abstract AnnExpr Name Set Name Expression

Notice from the type signature that abstract removes the free variable information

which is no longer required

Finally collectSCs gives a unique name to each sup ercombinator collects all the

sup ercombinator denitions into a single list and introduces the main sup ercombinator

denition

collectSCs Expression SCDefn

The compiler itself is the comp osition of these three functions

lambdaLift collectSCs abstract freeVars

It would of course b e p ossible to do all the work in a single pass but the mo dularity provided

by separating them has a number of advantages each individual pass is easier to understand

the passes may b e reusable for example we reuse freeVars b elow and mo dularity makes

it easier to change the algorithm somewhat

4

Inx denotes function comp osition

As an example of the nal p oint the Haskell compiler under development at Glasgow will

b e able to generate b etter co de by omitting the collectSCs pass b ecause more is then

known ab out the context in which the sup ercombinator is applied For example consider the

expression which might b e pro duced by the abstract pass

let f vx vx v

in ff

Here abstract has removed v as a free variable from the xabstraction Rather than com

piling the sup ercombinator indep endently of its context our compiler constructs a closure for

f whose co de accesses v directly from the closure and x from the stack The calls to f thus

do not have to move v onto the stack The more free variables there are the more b enecial

this b ecomes Nor do the calls to f b ecome less ecient b ecause the denition is a lo cal one

the compiler can see the binding for f and can jump directly to its co de

Free variables

We b egin by giving the co de for the freevariable pass not b ecause it is particularly subtle

but b ecause it serves to illustrate Haskell language notation

freeVars EConst k setEmpty AConst k

freeVars EVar v setSingleton v AVar v

freeVars EAp e e

setUnion freeVarsOf e freeVarsOf e AAp e e

where

e freeVars e

e freeVars e

freeVars ELam args body

setDifference freeVarsOf body setFromList args ALam args body

where

body freeVars body

freeVars ELet isRec defns body

setUnion defnsFree bodyFree ALet isRec zip binders rhss body

where

binders bindersOf defns

binderSet setFromList binders

rhss map freeVars rhssOf defns

freeInRhss setUnionList map freeVarsOf rhss

defnsFree isRec setDifference freeInRhss binderSet

not isRec freeInRhss

body freeVars body

bodyFree setDifference freeVarsOf body binderSet

freeVarsOf AnnExpr Name Set Name Set Name

freeVarsOf freevars expr freevars

In the denition of defnsFree the b o olean condition b etween the vertical bar and the equals

sign is a guard which serves to select the appropriate righthand side The function zip in

the denition of defns is a standard function which takes two lists and returns a list consist

ing of pairs of corresp onding elements of the argument lists The set op erations setUnion

setDifference and so on are dened in the utilities mo dule whose interface is given in the

App endix Otherwise the co de should b e selfexplanatory

Generating sup ercombinators

The next pass merely replaces each lambda abstraction which is now annotated with its free

variables with a new abstraction the sup ercombinator applied to its free variables The

full denition is given in the App endix and the only interesting equation is that dealing with

lambda abstractions

abstract free ALam args body

foldl EAp sc map EVar fvList

where

fvList setToList free

sc ELam fvList args abstract body

The function foldl is a standard function given a dyadic function a value b and a list

xs x ; :::; x foldl b xs computes ::: b x x ::: x Notice the way

n n

that the freevariable information is discarded by the pass since it is no longer required

We also observe that abstract treats the two expressions ELam args ELam args body

and ELam argsargs body dierently In the former case the two abstractions will

b e treated separately generating two sup ercombinators while in the latter only one sup er

combinator is pro duced It is clearly advantageous to merge directlynested ELams b efore

p erforming lambda lifting This is equivalent to the  abstraction optimisation noted by

Hughes Hughes

Collecting sup ercombinators

Finally we have to name the sup ercombinators and collect them together To generate new

names the main function has to carry around a name supply in particular it must take

the name supply as an argument and return a depleted supply as a result In addition it

must return the collection of sup ercombinators it has found and the transformed expression

Because of these auxiliary arguments and results we dene a function collectSCse which

do es all the work with collectSCs b eing dened in terms of it

collectSCse NameSupply Expression

NameSupply Bag SCDefn Expression

collectSCs e main e bagToList scs

where

scs e collectSCse initialNameSupply e

The name supply is represented by an abstract data type NameSupply whose interface is given

in the App endix but in which we take no further interest The collection of sup ercombinators

is a bag represented by the abstract data type of Bag which has similar op erations dened

on it as those for Set

The in the last line of collectSCs is a wildcard which signals the fact that we are not

interested in the depleted name supply resulting from transforming the whole program The

co de is now easy to write

collectSCse ns EConst k ns bagEmpty EConst k

collectSCse ns EVar v ns bagEmpty EVar v

collectSCse ns EAp e e

ns bagUnion scs scs EAp e e

where

ns scs e collectSCse ns e

ns scs e collectSCse ns e

In the case of lambda abstractions we replace the abstraction by a name and add a sup er

combinator to the result

collectSCse ns ELam args body

ns bagInsert name args body bodySCs EConst CFun name

where

ns bodySCs body collectSCse ns body

ns name newName ns SC

A common paradigm o ccurs in the case for letrec

collectSCse ns ELet isRec defns body

ns scs ELet isRec defns body

where

ns bodySCs body collectSCse ns body

ns scs defns mapAccuml collectSCsd ns bodySCs defns

collectSCsd ns scs name rhs

ns bagUnion scs scs name rhs

where

ns scs rhs collectSCse ns rhs

When pro cessing a list of denitions we need to generate a new list of denitions threading

the name supply through each This is done by a new higherorder function mapAccuml

which b ehaves like a combination of map and foldl it applies a function to each element of a

list passing an accumulating parameter from left to right and returning a nal value of this

accumulator together with the new list mapAccuml is dened in the App endix

This completes the denition of the simple lambda lifter We now turn our attention to full

laziness

Separating full laziness from lambda lifting

Previous accounts of full laziness have invariably linked it to lambda lifting by describing

fullylazy lambda lifting which turns out to b e rather a complex pro cess Hughes gives an

algorithm but it is extremely subtle and do es not handle letrec expressions Hughes

On the other hand Peyton Jones do es cover letrec expressions but the description is only

informal and no algorithm is given Peyton Jones

In this section we show how full laziness and lambda lifting can b e cleanly separated This

is done by means of a transformation involving letexpressions Lest it b e supp osed that we

have simplied things in one way only by complicating them in another we also show that

p erforming fullylazy lambda lifting without letrec expressions risks an unexp ected loss of

laziness Furthermore much more ecient co de can b e generated for letrec expressions in

later phases of most compilers than for their equivalent lambda expressions

A review of full laziness

We b egin by briey reviewing the concept of full laziness Consider again the example given

earlier

let

f x let g y xx y in g g

in

f

The simple lambda lifter generates the program

g x y xx y

f x let g g x in g g

main f

In the b o dy of f there are two calls to g and hence to g But g x is not a reducible

expression so xx will b e computed twice But x is xed in the b o dy of f so some work is

b eing duplicated It would b e b etter to share the calculation of xx b etween the two calls to

g This can b e achieved as follows instead of making x a parameter to g we make xx

into a parameter like this

g p y p y

f x let g g xx in g g

we omit the denition of main from now on since it do es not change So a fullylazy

lambda lifter will make each maximal free subexpresssion rather than each free variable of

a lambda abstraction into an argument of the corresp onding sup ercombinator A maximal free

expression or MFE of a lambda abstraction is an expression which contains no o ccurrences

of the variable b ound by the abstraction and is not a subexpression of a larger expression

with this prop erty

Full laziness corresp onds precisely to moving a lo opinvariant expression outside the lo op so

that it is computed just once at the b eginning rather than once for each lo op iteration

How imp ortant is full laziness for real programs No serious studies have yet b een made

of this question though we plan to do so However recent work by Holst suggests that the

imp ortance of full laziness may b e greater than might at rst b e supp osed Holst

He shows how to p erform a transformation which automatically enhances the eect of full

laziness to the p oint where the optimisations obtained compare favourably with those gained

by partial evaluation Jones Sestoft Sndergaard though with much less eort

Fulllazy lambda lifting in the presence of letrecs

Writing a fullylazy lambda lifter as outlined in the previous section is somewhat subtle

Our language which includes letrec expressions app ears to make this worse by introducing

a new language construct For example supp ose the denition of g in our running example

was slightly more complex thus

g y let z xx

in let p zz

in p y

Now the subexpression xx is a MFE of the yabstraction but subexpression zz is not

since z is b ound inside the yabstraction Yet it is clear that p dep ends only on x alb eit

indirectly and so we should ensure that zz should only b e computed once

Do es a fullylazy lambda lifter sp ot this if letexpressions are co ded as lambda applications

No it do es not The denition of g would b ecome

g y z p py zz xx

Now xx is free as b efore but zz is not In other words if the compiler does not treat

letrecexpressions specially it may lose ful l laziness which the programmer might reasonably

expect to be preserved

Fortunately there is a straightforward way to handle letrecexpressions as describ ed by

Peyton Jones namely to oat each letrec denition outward until it is outside any lambda

abstraction in which it is free Peyton Jones Chapter For example all we need

do is transform the denition of g to the following

g let z xx

in let p zz

in y p y

Now xx and zz will each b e computed only once Notice that this prop erty should hold for

any implementation of the language not merely for one based on lambda lifting and graph

reduction This is a clue that full laziness and lambda lifting are not as closely related as at

rst app ears a topic to which we will return in the next section

Meanwhile how can we decide how far out to oat a denition It is most easily done by

using lexical level numbers or de Bruijn numbers There are three steps

First assign to each lambdab ound variable a level number which says how many

lambdas enclose it Thus in our example x would b e assigned level number and y

level number

Now assign a level number to each letrecb ound variable outermost rst which is

the maximum of the level numbers of its free variables or zero if there are none In our

example b oth p and z would b e assigned level number Some care needs to b e taken

to handle letrecs correctly

Finally oat each denition whose binder has level n say outward until it is outside

the lambda abstraction whose binder has level n but still inside the leveln ab

straction There is some freedom in this step ab out exactly where b etween the two the

denition should b e placed

Each mutuallyrecursive set of denitions dened in a letrec should b e oated out together

b ecause they dep end on each other and must remain in a single letrec If in fact the

denitions are not mutually recursive despite app earing in the same letrec this p olicy might

lose laziness by retaining in an inner scop e a denition which could otherwise b e oated

further outwards The standard solution is to p erform dependency analysis on the denitions

in each letrec expression to break each group of denitions into its minimal subgroups We

take no further interest in this optimisation which is discussed in Chapter of Peyton Jones

Peyton Jones

Finally a renaming pass should b e carried out b efore the oating op eration so that there

is no risk that the bindings will b e altered by the movement of the letrec denitions For

example the expression

y let y xx in y

is obviously not equivalent to

let y xx in yy

All that is required is to give every binder a unique name to eliminate the name clash

Full laziness without lambda lifting

At rst it app ears that the requirement to oat letrecs outward in order to preserve full

laziness merely further complicates the already subtle fully lazy lambda lifting algorithm

suggested by Hughes However a simple transformation allows al l the full laziness to b e

achieved by letrec oating while lambda lifting is p erformed by the original simple lambda

lifter

The transformation is this before oating letrec denitions replace each MFE e with the

expression let v e in v This transformation b oth gives a name to the MFE and makes it

accessible to the letrec oating transformation which can now oat out the new denitions

Ordinary lambda lifting can then b e p erformed For example consider the original denition

of g

let

f x let g y xx y

in g g

in

f

The sub expression xx is an MFE so it is replaced by a trivial letexpression

let

f x let g y let v xx in v y

in g g

in

f

Now the letexpression is oated outward

let

f x let g let v xx in y v y

in g g

in

f

Finally ordinary lambda lifting will discover that v is free in the yexpression and the

resulting program b ecomes

g v y v y

f x let g let v xx in g v

in g g

main f

A few p oints should b e noted here Firstly the original denition of a maximal free expression

was relative to a particular lambda abstraction The new algorithm we have just developed

transforms certain expressions into trivial letexpresssions Which expressions are so trans

formed Just the ones which are MFEs of any enclosing lambda abstraction For example

in the expression

y z y xx z

two MFEs are identied xx since it is an MFE of the yabstraction and y xx

since it is an MFE of the zabstraction After introducing the trivial letbindings the

expression b ecomes

y z let v y let v xx in v in v z

Secondly the newlyintroduced variable v must either b e unique or the expression must b e

uniquely renamed after the MFEidentication pass

Thirdly in the nal form of the program v is only referenced once so it would b e sensible

to replace the reference by the righthandside of the denition and eliminate the denition

yielding exactly the program we obtained using Hughess algorithm This is a straightforward

transformation and we will not discuss it further here except to note that this prop erty will

hold for all letdenitions which are oated out past a lambda In any case many compiler

back ends will generate the same co de regardless of whether or not the transformation is

p erformed

A fully lazy lambda lifter

Now we are ready to dene the fully lazy lambda lifter It can b e decomp osed into the

following stages

First we must make sure that all ELam constructors bind only a single variable b ecause

the fullylazy lambda lifter must treat each lambda individuall y It would b e p ossible

to enco de this in later phases of the algorithm by dealing with a list of arguments but

it turns out that we can express an imp ortant optimisation by altering this pass alone

separateLams Expression Expression

First we annotate all binders and expressions with level numbers which we represent

by natural numbers starting with zero

type Level Int

addLevels Expression AnnExpr Name Level Level

Next we identify all MFEs by replacing them with trivial letexpressions Level numbers

are no longer required on every subexpression only on binders

identifyMFEs AnnExpr Name Level Level Expr Name Level

A renaming pass makes all binders unique so that oating do es not cause namecapture

errors This must b e done after identifyMFEs which introduces new bindings

rename Expr Name a Expr Name a

Now the letrec denitions can b e oated outwards The level numbers are not required

any further

float Expr Name Level Expression

Finally ordinary lambda lifting can b e carried out using lambdaLift from the section

dealing with simple lambda lifting

The fully lazy lambda lifter is just the comp osition of these passes

fullyLazyLift lambdaLift float rename

identifyMFEs addLevels separateLams

Separating the lambdas

The rst pass which separates lambdas so that each ELam only binds a single argument is

completely straightforward The only interesting equation is that which handles abstractions

which we give here

separateLams ELam args body foldr mkLam separateLams body args

where

mkLam arg body ELam arg body

The other equations are given in the App endix

Adding level numbers

There are a couple of complications concerning annotating an expression with levelnumbers

At rst it lo oks as though it is sucient to write a function which returns an expresssion

annotated with level numbers then for an application for example one simply takes the

maximum of the levels of the two subexpressions Unfortunately this approach loses to o

much information b ecause there is no way of mapping the levelnumber of the body of a

lambda abstraction to the level number of the abstraction itself The easiest solution is rst

to annotate the expression with its free variables and then use a mapping freeSetToLevel

from variables to levelnumbers to convert the freevariable annotations to level numbers

freeSetToLevel Set Name Assn Name Level Level

freeSetToLevel freevars env

maximum map assLookup env setToList freevars

If there are no free variables return level zero

The second complication concerns letrec expressions What is the correct level number to

attribute to the newlyintroduced variables The right thing to do is to take the maximum

of the levels of the free variables of all the righthand sides without the recursive variables or

equivalently map the recursive variables to level zero when taking this maximum This level

should b e attributed to each of the new variables Letexpressions are much simpler just

attribute to each new variable the level number of its righthand side

Now we are ready to dene addLevels It is the comp osition of two passes the rst of which

annotates the expression with its free variables while the second uses this information to

generate levelnumber annotations

addLevels freeToLevel freeVars

We have dened the freeVars function already so it remains to dene freeToLevel The

main function will need to carry around the current level and a mapping from variables to

level numbers so as usual we dene freeToLevel in terms of freeToLevele which do es all

the work

freeToLevele Level Level of context

Assn Name Level Level of in names

AnnExpr Name Set Name Input expression

AnnExpr Name Level Level Result expression

freeToLevel e freeToLevele e

We represent the nametolevel mapping as an asso ciation list with type Assn Name Level

The interface of asso ciation lists is given in the App endix but notice that it is not abstract

It is so convenient to use all the standard functions on lists and notation for lists rather than

to invent their analogues for asso ciations that we have compromised the abstraction

For constants variables and applications it is simpler and more ecient to ignore the free

variable information and calculate the level number directly

freeToLevele level env AConst k AConst k

freeToLevele level env AVar v assLookup env v AVar v

freeToLevele level env AAp e e

max levelOf e levelOf e AAp e e

where

e freeToLevele level env e

e freeToLevele level env e

This cannot b e done for lambda abstractions so we compute the level number of the abstrac

tion using freeSetToLevel We also assign a level number to each variable in the argument

list At present we exp ect there to b e only one such variable but we will allow there to b e

several and assign them all the same level number This works correctly now and turns out

to b e just what is needed to supp ort a useful optimisation later

freeToLevele level env free ALam args body

freeSetToLevel free env ALam args body

where

body freeToLevele level args env body

args zip args repeat level

Letrecexpressions follow the scheme outlined at the b eginning of this section

freeToLevele level env free ALet isRec defns body

levelOf body ALet isRec defns body

where

binders bindersOf defns

freeRhsVars setUnionList free free rhssOf defns

maxRhsLevel freeSetToLevel freeRhsVars

name name binders env

defns map freeToLeveld defns

body freeToLevele level bindersOf defns env body

freeToLeveld name rhs name levelOf rhs rhs

where rhs freeToLevele level envRhs rhs

envRhs isRec namemaxRhsLevel name binders env

not isRec env

Notice that the level of the whole letrec expression is that of the b o dy This is valid provided

the b o dy refers to all the binders directly or indirectly If any denition is unused we might

assign a level number to the letrec which would cause it to b e oated outside the scop e of

some variable mentioned in the unused denition This is easily xed but it is simpler to

assume that the expression contains no redundant denitions

Finally the auxillary function levelOf extracts the level from an expression

levelOf AnnExpr a Level Level

levelOf level e level

5

The dep endency analysis phase referred to earlier could eliminate such denitions

Identifying MFEs

It is simple to identify MFEs by comparing the level number of an expression with the level

of its context This requires an auxiliary parameter to give the level of the context

identifyMFEse Level AnnExpr Name Level Level Expr Name Level

identifyMFEs e identifyMFEse e

Once an MFE e has b een identied our strategy is to wrap it in a trivial letexpression of the

form let v e in v but not all MFEs deserve sp ecial treatment in this way For example

it would b e a waste of time to wrap such a letexpression around an MFE consisting of a

single variable or constant Other examples are given b elow in the section on redundant full

laziness We enco de this knowledge of which MFEs deserve sp ecial treatment in a function

notMFECandidate

notMFECandidate AConst k True

notMFECandidate AVar v True

notMFECandidate False For now everything else is a candidate

identifyMFEse works by comparing the level number of the expression with that of its

context If they are the same or for some other reason the expression is not a candidate for

sp ecial treatment the expression is left unchanged except that identifyMFEse is used to

apply identifyMFEse to its sub expressions otherwise we use transformMFE to p erform the

appropriate transformation

identifyMFEse cxt level e

if level cxt notMFECandidate e

then e

else transformMFE level e

where

e identifyMFEse level e

transformMFE level e ELet nonRecursive vlevel e EVar v

identifyMFEse applies identifyMFEse to the comp onents of the expression Its denition

is straightforward and is given in the App endix

Renaming and oating

The renaming pass is entirely straightforward involving plumbing a name supply in a similar

manner to collectSCs The nal pass which oats letrec expressions out to the appropriate

level is also fairly easy

Complete denitions for rename and float are given in the App endix

Avoiding redundant full laziness

Full laziness do es not come for free It has two main negative eects

Multiple lambda abstractions such as xyE turn into one sup ercombinator under

the simple scheme but two under the fully lazy scheme Two reductions instead of one

are therefore required to apply it to two arguments which may well b e more exp ensive

Lifting out MFEs removes sub expressions from their context and thereby reduces op

p ortunities for a compiler to p erform optimisations Such optimisations might b e par

tially restored by an interprocedural analysis which gured out the contexts again but

it is b etter still to avoid creating the problem

These p oints are elab orated by Fairbairn Fairbairn and Goldb erg Goldb erg

Furthermore they p oint out that often no b enet arises from lifting out every MFE from

every lambda abstraction In particular

If no partial applications of a multiple abstraction can b e shared then nothing is gained

by oating MFEs out to between the nested abstractions

Very little is gained by lifting out an MFE that is not a reducible expression No work

is shared thereby though there may b e some saving in storage b ecause the closure

need only b e constructed once This is more than outweighed by the loss of compiler

optimisations caused by removing the expression from its context

Lifting out an MFE which is a constant expression ie level thereby adding an extra

parameter to pass in is inecient It would b e b etter to make the constant expression

into a sup ercombinator

These observations suggest some improvements to the fully lazy lambda lifter and they turn

out to b e quite easy to incorp orate

If a multiple abstraction is not separated into separate ELam constructors by the

separateLam pass then all the variables b ound by it will b e given the same level

number It follows that no MFE will b e identied which is free in the inner abstraction

but not the outer one This ensures that no MFEs will b e oated out to b etween two

abstractions represented by a single ELam constructor

All that is required is to mo dify the separateLams pass to keep in a single ELam con

structor each multiple abstraction of which partial applications cannot b e shared This

sharing information is not trivial to deduce but at least we have an elegant way to use

its results by mo difying only a small part of our algorithm

This is one reason why we chose to allow ELam constructors to take a list of binders

identifyMFEs uses a predicate notMFECandidate to decide whether to identify a partic

ular sub expression as an MFE This provides a convenient place to add extra conditions

to exclude from consideration expressions which are not redexes This condition to o

is undecidable in general but a go o d approximation can b e made in many cases for

example is obviously not a redex

When identifying MFEs it is easy to sp ot those that are constant expressions b ecause

their level numbers are zero We can rather neatly ensure that it is turned into a

sup ercombinator by the subsequent lambdalifting pass by making it into a lambda

abstraction with an empty argument list This was the other reason why we decided

to make the ELam constructor take a list of arguments The mo dication aects only

identifyMFEse thus

identifyMFEse cxt level e

if level cxt notMFECandidate e then

e

else if level then

ELet nonRecursive vlevel e EVar v

else

ELam e

where

e identifyMFEse level e

Retrosp ective and comparison with other work

It is interesting to compare our approach with Birds very nice pap er Bird which

addresses a similar problem Birds ob jective is to give a formal development of an ecient

fully lazy lambda lifter by successive transformation of an initial sp ecication The resulting

algorithm is rather complex and would b e hard to write down directly thus fully justifying

the eort of a formal development

In contrast we have expressed our algorithm as a comp osition of a number of very simple

phases each of which can readily b e sp ecied and written down directly The resulting

program has a constantfactor ineciency b ecause it makes many traversals of the expression

This is easily removed by folding together successive passes into a single function eliminating

the intermediate data structure Unlike Birds transformations this is a straightforward

pro cess

Our approach has the ma jor advantage that is is modular This allows changes to b e made

more easily For example it would b e a simple matter to replace the lambda lifter with one

which p erformed lambdalifting in the way suggested by Johnsson Johnsson whereas

doing the same for Birds algorithm would b e a ma jor exercise Similarly it proved rather

easy to mo dify the algorithm to b e more selective ab out where full laziness is introduced

The main disadvantage of our approach is that we are unable to take advantage of one

optimisation suggested by Hughes namely ordering the parameters to a sup ercombinator to

reduce the number of MFEs The reason for this is that the optimisation absolutely requires

that lambda lifting b e entwined with the pro cess of MFE identication while we have carefully

separated these activities Happily for us the larger MFEs created by this optimisation are

always partial applications which should probably not b e identied as MFEs b ecause no

work is shared thereby Even so matters might not have fallen out so fortuitously and our

separation of concerns has certainly made some sorts of transformation rather dicult

Acknowledgements

We owe our thanks to John Robson and Kevin Hammond and two anonymous referees for

their useful comments on a draft of this pap er

References

RS Bird A formal development of an ecient sup ercombinator compiler Science of

Computer Programming

Jon Fairbairn Feb Removing redundant laziness from sup ercombinators in Pro c

Asp enas workshop on implementation of functional languages

Benjamin F Goldb erg April Multipro cessor execution of functional programs

YALEUDCSRR Dept of Computer Science Yale University

CK Holst Improving full laziness in Pro ceedings of the Glasgow Workshop on

Functional Programming Ullap o ol SL Peyton Jones G Hutton CK Holst eds

Workshops in Computing Springer Verlag

P Hudak SL Peyton Jones PL Wadler Arvind B Boutel J Fairbairn J Fasel M Guzman

K Hammond J Hughes T Johnsson R Kieburtz RS Nikhil W Partain J Peterson

May Rep ort on the functional programming language Haskell Version

SIGPLAN Notices

RJM Hughes July The design and implementation of programming languages PhD

thesis Programming Research Group Oxford

Thomas Johnsson Lambda lifting transforming programs to recursive equations

in Pro c IFIP Conference on Functional Programming and Computer Architecture

Jouannaud ed LNCS Springer Verlag

Thomas Johnsson Compiling lazy functional languages PhD thesis PMG Chalmers

University Goteb org Sweden

ND Jones P Sestoft H Sndergaard Mix a selfapplicable partial evaluator for

exp eriments in compiler generation Lisp and Symbolic Computation

SL Peyton Jones The Implementation of Functional Programming Languages Prentice

Hall

A App endix Omitted co de

This app endix contains the denitions of functions omitted from the main pap er

A abstract

We b egin with the function abstract

abstract AConst k EConst k

abstract AVar v EVar v

abstract AAp e e EAp abstract e abstract e

abstract free ALam args body

foldl EAp sc map EVar fvList

where

fvList setToList free

sc ELam fvList args abstract body

abstract ALet isRec defns body

ELet isRec name abstract body name body defns abstract body

A separateLams

Next comes the co de for separateLams

separateLams EConst k EConst k

separateLams EVar v EVar v

separateLams EAp e e EAp separateLams e separateLams e

separateLams ELam args body foldr mkLam separateLams body args

where

mkLam arg body ELam arg body

separateLams ELet isRec defns body

ELet isRec name separateLams rhs namerhs defns

separateLams body

A identifyMFEse

identifyMFEse applies identifyMFEse to the comp onents of the expresion

identifyMFEse Level AnnExpr Name Level Level Expr Name Level

identifyMFEse level AConst k EConst k

identifyMFEse level AVar v EVar v

identifyMFEse level AAp e e

EAp identifyMFEse level e identifyMFEse level e

When it encounters a binder it changes the context level number carried down as its rst

argument

identifyMFEse level ALam args body

ELam args identifyMFEse argLevel body

where

argLevel head args

identifyMFEse level ALet isRec defns body

ELet isRec defns body

where

body identifyMFEse level body

defns namerhsLevelidentifyMFEse rhsLevel rhs

namerhsLevelrhs defns

A rename

The function rename gives unique names to the variables in an expression We need an

auxiliary function renamee to do all the work

rename e e where e renamee initialNameSupply e

renamee Assn Name Name NameSupply Expr Namea

NameSupply Expr Namea

renamee env ns EConst k ns EConst k

renamee env ns EVar v ns EVar assLookup env v

renamee env ns EAp e e

ns EAp e e

where

ns e renamee env ns e

ns e renamee env ns e

renamee env ns ELam args body

ns ELam args body

where

ns args mapAccuml newBinder ns args

ns body renamee assocBinders args args env ns body

renamee env ns ELet isRec defns body

ns ELet isRec zip binders rhss body

where

ns body renamee env ns body

binders bindersOf defns

ns binders mapAccuml newBinder ns binders

env assocBinders binders binders env

ns rhss mapAccuml renamee rhsEnv ns rhssOf defns

rhsEnv isRec env

not isRec env

newBinder is just like newName but works over Namea binders

newBinder ns name info

ns name info where ns name newName ns name

assocBinders builds an asso cation list b etween the names inside two lists of Namea

binders

assocBinders Namea Namea Assn Name Name

assocBinders binders binders zip map fst binders map fst binders

A float

The nal pass oats letrec expressions out to the appropriate level The main function has

to return an expression together with a list of denitions which should b e oated outside the

expression

floate Expr Name Level FloatedDefns Expression

The toplevel function float uses floate to do the main work but if any denitions are

oated out to the top level float had b etter install them at this level thus

float e install floatedDefns e where floatedDefns e floate e

There are many p ossible representations for the FloatedDefns type and we will choose a

simple one by representing the denitions b eing oated as a list each element of which

represents a group of denitions identied by its level and together with its IsRec ag

type FloatedDefns Level IsRec Defn Name

Since the denitions in the list may dep end on one another we add the following constraint a

denition group may dep end only on denition groups app earing earlier in the FloatedDefns

list

It is now p ossible to dene install which wraps an expression in a nested set of letrecs

containing the sp ecied denitions

install FloatedDefns Expression Expression

install defnGroups e

foldr installGroup e defnGroups

where

installGroup level isRec defns e ELet isRec defns e

We can now pro ceed to a denition of floate The cases for variables constants and

applications are straightforward

floate EConst k EConst k

floate EVar v EVar v

floate EAp e e fd fd EAp e e

where

fd e floate e

fd e floate e

How far out should a denition b e oated There is more than one p ossible choice but here

we choose to install a denition just inside the innermost lambda which binds one its free

variables

floate ELam args body

outerLevelDefns ELam args install thisLevelDefns body

where

args arg arglevel args

thisLevel head args Extract level of abstraction

floatedDefns body floate body

thisLevelDefns filter groupIsThisLevel floatedDefns

outerLevelDefns filter notgroupIsThisLevel floatedDefns

groupIsThisLevel levelisRecdefns level thisLevel

The case for a letrec expression adds its denition group to those oated out from its b o dy

and from its righthand sides The latter must come rst since the new denition group may

dep end on them

floate ELet isRec defns body

rhsFloatDefns thisGroup bodyFloatDefns body

where

bodyFloatDefns body floate body

rhsFloatDefns defns mapAccuml floatdefn defns

thisGroup thisLevel isRec defns

thisLevel head bindersOf defns

floatdefn floatedDefns namelevel rhs

rhsFloatDefns floatedDefns name rhs

where

rhsFloatDefns rhs floate rhs

A mapAccuml

Finally here is the full denition of utility function mapAccuml

mapAccuml b a b c Function of element of input list

and accumulator returning new

accumulator and element of result list

b Initial accumulator

a Input list

b c Final accumulator and result list

mapAccuml f b b

mapAccuml f b xxs b xxs where b x f b x

b xs mapAccuml f b xs

6

Recall that all variables b ound by a single ELam construct are given the same level

B App endix Interface to Utilities mo dule

This is the text of the interface for the Utilities mo dule This interface is imp orted by

the line import Utilities in the LambdaLift mo dule The interface can b e deduced by the

compiler from the text of the Utilities mo dule indeed the interface b elow is exactly that

pro duced by the compiler mo dulo some reordering and comments

The rst line introduces and names the interface

interface Utilities where

Next we have declarations for the Set type Notice that the data type Set is given with an

abbreviated data declaration which omits the constructors This makes the Set type ab

stract since the imp orting mo dule can only build sets and take them apart using the functions

provided in the interface

data Set a

setDifference Ord a Set a Set a Set a

setIntersect Ord a Set a Set a Set a

setUnion Ord a Set a Set a Set a

setUnionList Ord a Set a Set a

setToList Set a a

setSingleton a Set a

setEmpty Set a

setFromList Ord a a Set a

Bags asso ciation lists and the name supply follow similarly Notice that asso ciation lists are

declared with a type synonym so they are not abstract

data Bag a

bagUnion Bag a Bag a Bag a

bagInsert a Bag a Bag a

bagToList Bag a a

bagFromList a Bag a

bagSingleton a Bag a

bagEmpty Bag a

type Assn a b a b

assLookup Eq a a b a b

data NameSupply

initialNameSupply NameSupply

newName NameSupply Char NameSupply Char