<<

Title Introduction to

Lecturer Mike Gordon

(httpwwwclcamacukus ers mjcg )

Class Computer Science Trip os Part I IGeneral Diploma

Term Lent term

First Lecture Friday January at am

Lo cation Heyco ck Lecture Ro om

Duration Twelve lectures M W F

Preface

This course aims to teach b oth the theory and practice of functional programming

The theory consists of the calculus and the practice will b e illustrated using the

Standard ML

The eld of Functional Programming splits into those who prefer lazy languages

like Haskell and those who prefer strict languages like ML The practical parts

of this course almost exclusively emphasise the latter but the material on the

calculus underlies b oth approaches

The chapters on the calculus have b een largely condensed from Part II of the

b o ok

MJC Gordon Programming Language Theory and its Implementa

tion Prentice Hall International Series in Computer Science cur

rently out of print

The intro duction to ML in Chapter started life as part of

Gordon MJC Milner AJRG and Wadsworth CP Edinburgh

LCF a mechanized logic of computation Springer Lecture Notes in

Computer Science SpringerVerlag

The ML parts of this were up dated substantially in the technical rep ort

G Cousineau M Gordon G Huet R Milner L Paulson and

C Wadsworth The ML handb o ok INRIA

I translated the intro duction of this rep ort into Standard ML and added some new

material to get Chapter The case studies were written by me at great sp eed

and so are b ound to contain numerous mistakes They aim to show how MLbased

functional programming can b e used in practice

The following p eople have contributed in various ways to the material cited ab ove

or to these notes Graham Birtwistle Shiu Kai Chin Avra Cohn Jan van Eijck

Mike Fourman Elsa Gunter Peter Hanco ck Martin Hyland Tom Melham Allan

C Milne Nicholas Ouruso David Shepherd and Roger Stokes i

ii Preface

Contents

Preface i

Intro duction to the calculus

Syntax and semantics of the calculus

Notational conventions

Free and b ound variables

Conversion rules

conversion

conversion

conversion

Generalized conversions

Equality of expressions

The relation

Extensionality

Substitution

Representing Things in the calculus

Truthvalues and the conditional

Pairs and tuples

Numb ers

Denition by recursion

Functions with several arguments

Mutual recursion

Representing the recursive functions

The primitive recursive functions

The recursive functions

The partial recursive functions

Extending the calculus

Theorems ab out the calculus

Callbyvalue and Y

Combinators

Combinator reduction

Functional completeness

Reduction machines

Improved translation to combinators

More combinators

Currys algorithm

Turners algorithm iii

iv Contents

A Quick Overview of ML

Interacting with ML

Expressions

Declarations

Comments

Functions

Typ e abbreviations

Op erators

Lists

Strings

Records

Polymorphism

fnexpressions

Conditionals

Recursion

Equality typ es

Pattern matching

The case construct

Exceptions

Datatyp e declarations

Abstract typ es

Typ e constructors

References and assignment

Iteration

Programming in the large

Case study parsing

Lexical analysis

Simple sp ecial cases of parsing

Applicative expressions

Precedence parsing of inxes

A general topdown precedence parser

Case study the calculus

A calculus parser

Implementing substitution

The SECD machine

Bibliography

Chapter

Intro duction to the calculus

The calculus or lamb dacalculus is a theory of functions that was originally

develop ed by the logician Alonzo Church as a foundation for mathematics This

work was done in the s several years b efore digital computers were invented A

little earlier in the s Moses Schonnkel develop ed another theory of functions

based on what are now called combinators In the s Haskell Curry redis

covered and extended Schonnkels theory and showed that it was equivalent to

the calculus Ab out this time Kleene showed that the calculus was a universal

computing system it was one of the rst such systems to b e rigorously analysed

In the s John McCarthy was inspired by the calculus to invent the program

ming language LISP In the early s Peter Landin showed how the meaning of

imp erative programming languages could b e sp ecied by translating them into the

calculus He also invented an inuential prototyp e programming language called

ISWIM This intro duced the main notations of functional programming and

inuenced the design of b oth functional and imp erative languages Building on this

work Christopher Strachey laid the foundations for the imp ortant area of denota

tional semantics Technical questions concerning Stracheys work inspired

the mathematical logician Dana Scott to invent the theory of domains which is now

one of the most imp ortant parts of theoretical computer science During the s

Peter Henderson and Jim Morris to ok up Landins work and wrote a numb er of

inuential pap ers arguing that functional programming had imp ortant advantages

for software engineering At ab out the same time David Turner prop osed

that Schonnkel and Currys combinators could b e used as the machine co de of

computers for executing functional programming languages Such computers could

exploit mathematical prop erties of the calculus for the parallel evaluation of pro

grams During the s several research groups to ok up Hendersons and Turners

ideas and started working on making functional programming practical by designing

sp ecial architectures to supp ort it some of them with many pro cessors

We thus see that an obscure branch of mathematical logic underlies imp ortant

developments in programming language theory such as

i The study of fundamental questions of computation

ii The design of programming languages

iii The semantics of programming languages

iv The architecture of computers

Syntax and semantics of the calculus

The calculus is a notation for dening functions The expressions of the notation

are called expressions and each such expression denotes a function It will b e

seen later how functions can b e used to represent a wide variety of data and data

structures including numb ers pairs lists etc For example it will b e demonstrated

Chapter Intro duction to the calculus

how an arbitrary pair of numb ers x y can b e represented as a expression As

a notational convention mnemonic names are assigned in b old or underlined to

particular expressions for example is the expression dened in Section

which is used to represent the numb er one

There are just three kinds of expressions

i Variables x y z etc The functions denoted by variables are determined

by what the variables are b ound to in the environment Binding is done by

abstractions see b elow We use V V V etc for arbitrary variables

ii Function applications or combinations if E and E are expressions

then so is E E it denotes the result of applying the function denoted by

E to the function denoted by E E is called the rator from op erator

n denotes and E is called the rand from op erand For example if m

a function representing the pair of numb ers m and n see Section and

sum denotes the addition function calculus see Section then the

n denotes mn application summ

iii Abstractions if V is a variable and E is a expression then V E is an

abstraction with bound variable V and body E Such an abstraction denotes

the function that takes an argument a and returns as result the function

denoted by E in an environment in which the b ound variable V denotes a

More sp ecically the abstraction V E denotes a function which takes an

0 0

argument E and transforms it into the thing denoted by E E V the result

0

of substituting E for V in E see Section For example x sumx

denotes the addone function

Using BNF the syntax of expressions is just

expression variable

j expression expression

j variable expression

If V ranges over the syntax class variable and E E E etc range over

the syntax class expression then the BNF simplies to

E j V E j E E

V

z

z

variables

abstractions

applications

combinations

The description of the meaning of expressions just given ab ove is vague and

intuitive It to ok ab out years for logicians Dana Scott in fact to make it

rigorous in a useful way We shall not b e going into details of this

Example x x denotes the identity function x x E E 2

Example x f f x denotes the function which when applied to E yields

0

f f x E x ie f f E This is the function which when applied to E

0 0

yields f E E f ie E E Thus

x f f x E f f E

and

0 0

f f E E E E

2

1

Note that sum is a expression whereas is a mathematical symb ol in the metalanguage

ie English that we are using for talking ab out the calculus

Notational conventions

Exercise

Describ e the function denoted by x y y 2

Example Section describ es how numb ers can b e represented by expressions

Assume that this has b een done and that are expressions which rep

resent resp ectively Assume also that add is a expression denoting a

function satisfying

n mn add m

Then x add x is a expression denoting the function that transforms

to n and x y add xy is a expression denoting the func n

to the function which when applied to n yields mn tion that transforms m

namely y add my 2

The relationship b etween the function sum in ii at the b eginning of this section

page and the function add in the previous example is explained in Section

Notational conventions

The following conventions help minimize the numb er of brackets one has to write

Function application asso ciates to the left ie E E E means

n

E E E For example

n

E E means E E

E E E means E E E

E E E E means E E E E

V E E E means V E E E Thus the scop e of V

n n

extends as far to the right as p ossible

V V E means V V E For example

n n

x y E means x y E

x y z E means x y z E

x y z w E means x y z w E

Example x y add y x means x y add y x 2

Free and b ound variables

An o ccurrence of a variable V in a expression is free if it is not within the scop e

of a V otherwise it is bound For example

x y xy x y

free free

b ound b ound

Chapter Intro duction to the calculus

Conversion rules

In Chapter it is explained how expressions can b e used to represent data ob jects

like numb ers strings etc For example an arithmetic expression like

can b e represented as a expression and its value can also b e represented as a

expression The pro cess of simplifying to will b e represented by a

pro cess called conversion or reduction The rules of conversion describ ed b elow

are very general yet when they are applied to expressions representing arithmetic

expressions they simulate arithmetical evaluation

There are three kinds of conversion called conversion conversion and

conversion the original motivation for these names is not clear In stating the

0

conversion rules the notation E E V is used to mean the result of substituting

0

E for each free o ccurrence of V in E The substitution is called valid if and only

0 0

if no free variable in E b ecomes b ound in E E V Substitution is describ ed in

more detail in Section

The rules of conversion

conversion

Any abstraction of the form V E can b e converted to

0 0 0

V E V V provided the substitution of V for V in E is

valid

conversion

Any application of the form V E E can b e converted to

E E V provided the substitution of E for V in E is valid

conversion

Any abstraction of the form V E V in which V has no free

o ccurrence in E can b e reduced to E

The following notation will b e used

E E means E converts to E

E E means E converts to E

E E means E converts to E

In Section b elow this notation is extended

The most imp ortant kind of conversion is conversion it is the one that can b e

used to simulate arbitrary evaluation mechanisms conversion is to do with the

technical manipulation of b ound variables and conversion expresses the fact that

two functions that always give the same results on the same arguments are equal see

Section The next three subsections give further explanation and examples of

the three kinds of conversion note that conversion and reduction are used b elow

as synonyms

Conversion rules

conversion

A expression necessarily an abstraction to which reduction can b e applied is

called an redex The term redex abbreviates reducible expression The rule

of conversion just says that b ound variables can b e renamed provided no name

clashes o ccur

Examples

x x y y

x f x y f y

It is not the case that

x y add x y y y add y y

b ecause the substitution y add x y y x is not valid since the y that replaces

x b ecomes b ound 2

conversion

A expression necessarily an application to which reduction can b e applied is

called a redex The rule of conversion is like the evaluation of a function call

in a programming language the b o dy E of the function V E is evaluated in an

environment in which the formal parameter V is b ound to the actual parameter

E

Examples

x f x E f E

x y add x y y add y

y add y add

It is not the case that

x y add x y square y y add square y y

b ecause the substition y add x y square y x is not valid since y is free in

square y but b ecomes b ound after substitution for x in y add x y 2

It takes some practice to parse expressions according to the conventions of Sec

tion so as to identify the redexes For example consider the application

x y add x y

Putting in brackets according to the conventions expands this to

x y add x y

which has the form

x E

where

E y add x y

x E is a redex and could b e reduced to E x

Chapter Intro duction to the calculus

 conversion

A expression necessarily an abstraction to which reduction can b e applied is

called an redex The rule of conversion expresses the prop erty that two functions

are equal if they give the same results when applied to the same arguments This

prop erty is called extensionality and is discussed further in Section For example

conversion ensures that x sin x and sin denote the same function More

0

generally V E V denotes the function which when applied to an argument E

0 0 0

returns E V E V If V do es not o ccur free in E then E V E V E E

0

Thus V E V and E b oth yield the same result namely E E when applied to the

same arguments and hence they denote the same function

Examples

x add x add

y add x y add x

It is not the case that

x add x x add x

b ecause x is free in add x 2

Generalized conversions

The denitions of and can b e generalized as follows

E E if E can b e got from E by converting any subterm

E E if E can b e got from E by converting any subterm

E E if E can b e got from E by converting any subterm

Examples

x y add x y y add y

y add y add

2

The rst of these is a conversion in the generalized sense b ecause y add y

is obtained from x y add x y which is not itself a redex by reducing

the sub expression x y add x y We will sometimes write a sequence of

conversions like the two ab ove as

y add y add x y add x y

Exercise

Which of the three reductions b elow are generalized conversions ie reductions

of sub expressions and which are conversions in the sense dened on page 2

Equality of expressions

i x x

ii y y x x y y

iii y y x x x x

In reductions ii and iii in the exercise ab ove one starts with the same

expression but reduce redexes in dierent orders

An imp ortant prop erty of reductions is that no matter in which order one do es

them one always ends up with equivalent results If there are several disjoint

redexes in an expression one can reduce them in parallel Note however that some

reduction sequences may never terminate This is discussed further in connection

with the normalization theorem of Chapter It is a current hot research topic in

fthgeneration computing to design pro cessors which exploit parallel evaluation

to sp eed up the execution of functional programs

Equality of expressions

The three conversion rules preserve the meaning of expressions ie if E can

b e converted to E then E and E denote the same function This prop erty of

conversion should b e intuitively clear It is p ossible to give a mathematical denition

of the function denoted by a expression and then to prove that this function is

unchanged by or conversion Doing this is surprisingly dicult and is

b eyond the scop e of this b o ok

We will simply dene two expressions to b e equal if they can b e transformed into

each other by a sequence of forwards or backwards conversions It is imp ortant

to b e clear ab out the dierence b etween equality and identity Two expressions

are identical if they consist of exactly the same sequence of characters they are

equal if one can b e converted to the other For example x x is equal to y y

but not identical to it The following notation is used

E E means E and E are identical

E E means E and E are equal

Equality is dened in terms of identity and conversion and

as follows

Equality of expressions

0 0 0

If E and E are expressions then E E if E E or there exist expressions

E E E such that

n

E E

0

E E

n

For each i either

a E E or E E or E E or

i i i i i i

b E E or E E or E E

i i i i i i

Chapter Intro duction to the calculus

Examples

x x

x x y y

x y add x y add

2

From the denition of it follows that

i For any E it is the case that E E equality is reexive

0 0

ii If E E then E E equality is symmetric

0 0 00 00

iii If E E and E E then E E equality is transitive

If a relation is reexive symmetric and transitive then it is called an equivalence

relation Thus is an equivalence relation

0 0

and E are two Another imp ortant prop erty of is that if E E and if E

expressions that only dier in that where one contains E the other contains E

0 0

then E E This prop erty is called Leibnitzs law It holds b ecause the same

0

to sequence of reduction for getting from E to E can b e used for getting from E

0

For example if E E then by Leibnitzs law V E V E E

It is essential for the substitutions in the and reductions to b e valid The va

lidity requirement disallows for example x y x b eing reduced to y y y

since y b ecomes b ound after substitution for x in y x If this invalid substitution

were p ermitted then it would follow by the denition of that

x y x y y y

But then since

x y x y

and

y y y y y

one would b e forced to conclude that More generally by replacing and

by any two expressions it could b e shown that any two expressions are equal

Exercise

Find an example which shows that if substitutions in reductions are allowed to

b e invalid then it follows that any two expressions are equal 2

Example If V V V are all distinct and none of them o ccur free in any of

n

E E E then

n

V V V E E E E

n n

V V V E E E E

n n

V V E E V E E

n n

V V E E V E E

n n

E E V E V E V

n n 2

The relation

Exercise

In the last example where was the assumption used that V V V are all

n

distinct and that none of them o ccur free in any of E E E 2

n

Exercise

Find an example to show that if V V then even if V is not free in E it is not

necessarily the case that

V V E E E E E V E V

2

Exercise

Find an example to show that if V V but V o ccurs free in E then it is not

necessarily the case that

V V E E E E E V E V

2

The ! relation

In the previous section E E was dened to mean that E could b e obtained

from E by a sequence of forwards or backwards conversions A sp ecial case of

this is when E is got from E using only forwards conversions This is written

E E

Denition of

0 0 0

If E and E are expressions then E E if E E or there exist expressions

E E E such that

n

E E

0

E E

n

For each i either E E or E E or E E

i i i i i i

Notice that the denition of is just like the denition of on page except

that part b of is missing

Exercise

0 0 0

Find E E such that E E but it is not the case that E E 2

Exercise

very hard Show that if E E then there exists E such that E E

and E E This prop erty is called the ChurchRosser theorem Some of its

consequences are discussed in Chapter 2

Chapter Intro duction to the calculus

Extensionality

Supp ose V do es not o ccur free in E or E and

E V E V

Then by Leibnitzs law see page

V E V V E V

so by reduction applied to b oth sides

E E

It is often convenient to prove that two expressions are equal using this prop erty

ie to prove E E by proving E V E V for some V not o ccuring free in E

or E We will refer to such pro ofs as b eing by extensionality

Exercise

Show that

f g x f x g x x y x x y x x x

2

Substitution

0

At the b eginning of Section E E V was dened to mean the result of substi

0

tuting E for each free o ccurrence of V in E The substitution was said to b e valid

0 0

if no free variable in E b ecame b ound in E E V In the denitions of and

conversion it was stipulated that the substitutions involved must b e valid Thus

for example it was only the case that

V E E E E V

as long as the substitution E E V was valid

0

It is very convenient to extend the meaning of E E V so that we dont have

to worry ab out validity This is achieved by the denition b elow which has the

0

prop erty that for al l expressions E E and E and al l variables V and V

0 0

V E E E E V and V E V E V V

0

To ensure this prop erty holds E E V is dened recursively on the structure of

E as follows

Substitution

0

E E E V

0

V E

0 0 0

V where V V V

0 0

E E E E V E E V

V E V E

0 0 0 0

V E where V V and V E E V

0 0

V is not free in E

0 0 00 00 0 0

V E where V V and V E V V E V

0 0 00

V is free in E where V is a variable

0

not free in E or E

0

This particular denition of E E V is based on but not identical to the one in

App endix C of

To illustrate how this works consider y y xy x Since y is free in y the last

case of the table ab ove applies Since z do es not o ccur in y x or y

y y xy x z y xz y y x z z xy x z z y

00

In the last line of the table ab ove the particular choice of V is not sp ecied Any

0

variable not o ccurring in E or E will do

A go o d discussion of substitution can b e found in the b o ok by Hindley and Seldin

where various technical prop erties are stated and proved The following exercise

is taken from that b o ok

Exercise

Use the table ab ove to work out

i y x x xy y xx

ii y z x z y z y x

2

0

It is straightforward but rather tedious to prove from the denition of E E V

just given that indeed

0 0

V E E E E V and V E V E V V

0

for all expressions E E and E and all variables V and V

In Chapter it will b e shown how the theory of combinators can b e used to decom

p ose the complexities of substitution into simpler op erations Instead of combinators

it is p ossible to use the socalled nameless terms of De Bruijn De Bruijns idea

is that variables can b e thought of as p ointers to the s that bind them Instead of

lab elling s with names ie b ound variables and then p ointing to them via these

names one can p oint to the appropriate by giving the numb er of levels upwards

needed to reach it For example x y x y would b e represented by As a

Chapter Intro duction to the calculus

more complicated example consider the expression b elow in which we indicate the

numb er of levels separating a variable from the that binds it

z

z

x y x y y x y y

z z

In De Bruijns notation this is

A free variable in an expression is represented by a numb er bigger than the depth of

s ab ove it dierent free variables b eing assigned dierent numb ers For example

x y y x z x y w

would b e represented by

Since there are only two s ab ove the o ccurrence of this numb er must denote a

free variable similarly there is only one ab ove the second o ccurrence of and the

o ccurrence of so these to o must b e free variables Note that could not b e used

to represent w since this had already b een used to represent the free y we thus

chose the rst available numb er bigger than was already in use representing z

Care must b e taken to assign big enough numb ers to free variables For example

the rst o ccurrence of z in x z y z could b e represented by but the second

o ccurrence requires since they are the same variable we must use

Example With De Bruijns scheme x x y x y y would b e represented by

2

Exercise

What expression is represented by 2

Exercise

Describ e an algorithm for computating the De Bruijn representation of the expres

0 0

sion E E V from the representations of E and E 2

Chapter

Representing Things in the

calculus

The calculus app ears at rst sight to b e a very primitive language However

it can b e used to represent most of the ob jects and structures needed for mo dern

programming The idea is to co de these ob jects and structures in such a way that

they have the required prop erties For example to represent the truth values true

and false and the Bo olean function not expressions true false and not are

devised with the prop erties that

not true false

not false true

To represent the Bo olean function and a expression and is devised such

that

and true true true

and true false false

and false true false

and false false false

and to represent or an expression or such that

or true true true

or true false true

or false true true

or false false false

The expressions used to represent things may app ear completely unmotivated at

rst However the denitions are chosen so that they work together in unison

We will write

LET expression

to intro duce as a new notation Usually will just b e a name such as true

or and Such names are written in b old face or underlined to distinguish them

from variables Thus for example true is a variable but true is the expression

x y x see Section b elow and is a numb er but is the expression

f x f f x see Section

Sometimes will b e a more complicated form like the conditional notation E

E j E

Truthvalues and the conditional

This section denes expressions true false not and E E j E with the

following prop erties

Chapter Representing Things in the calculus

not true false

not false true

true E j E E

false E j E E

The expressions true and false represent the truthvalues true and false not

represents the negation function and E E j E represents the conditional if

E then E else E

There are innitely many dierent ways of representing the truthvalues and nega

tion that work the ones used here are traditional and have b een develop ed over the

years by logicians

LET true x y x

LET false x y y

LET not t t false true

It is easy to use the rules of conversion to show that these denitions have the

desired prop erties For example

not true t t false true true denition of not

true false true conversion

x y x false true denition of true

y false true conversion

false conversion

Similarly not false true

Conditional expressions E E j E can b e dened as follows

LET E E j E E E E

This means that for any expressions E E and E E E j E stands for

E E E

The conditional notation b ehaves as it should

true E j E true E E

x y x E E

E

and

false E j E false E E

x y y E E

E

Pairs and tuples

Exercise

Let and b e the expression x y x y j false Show that

and true true true

and true false false

and false true false

and false false false

2

Exercise

Devise a expression or such that

or true true true

or true false true

or false true true

or false false false

2

Pairs and tuples

The following abbreviations represent pairs and ntuples in the calculus

LET fst p p true

LET snd p p false

LET E E f f E E

E E is a expression representing an ordered pair whose rst comp onent

ie E is accessed with the function fst and whose second comp onent ie E

is accessed with snd The following calculation shows how the various denitions

coop erate together to give the right results

fst E E p p true E E

E E true

f f E E true

true E E

x y x E E

E

Exercise

Show that sndE E E

2

A pair is a datastructure with two comp onents The generalization to n comp onents

is called an ntuple and is easily dened in terms of pairs

LET E E E E E E E

n n n

Chapter Representing Things in the calculus

E E is an ntuple with components E E and length n Pairs are

n n

tuples The abbreviations dened next provide a way of extracting the comp onents

of ntuples

n

LET E fst E

n

LET E fstsnd E

n

LET E i fstsndsnd snd E if i n

z

i snd s

n

E LET E n sndsnd snd

z

n snds

It is easy to see that these denitions work for example

n n

E E E E E

n

fst E E

E

n n

E E E E E

n

fst snd E E

fst E

E

n

In general E E E i E for all i such that i n

n i

Convention

n

We will usually just write E i instead of E i when it is clear from the context

what n should b e For example

E E i E where i n

n i

Numb ers

There are many ways to represent numb ers by expressions each with their own

advantages and disadvantages The goal is to dene for each numb er n a

that represents it We also want to dene expressions to represent the expression n

primitive arithmetical op erations For example we will need expressions suc pre

add and iszero representing the successor function n n the predecessor

function n n addition and a test for zero resp ectively These expressions

will represent the numb ers correctly if they have the following prop erties

Numb ers

suc n n for all numb ers n

n for all numb ers n pre n

add m n mn for all numb ers m and n

true iszero

false iszero suc n

The representation of numb ers describ ed here is the original one due to Church In

n

order to explain this it is convenient to dene f x to mean n applications of f to

x For example

f x f f f f f x

By convention f x is dened to mean x More generally

0 0

LET E E E

0 n 0

E LET E E E E E

z

n E s

n 0 n 0 n 0

Note that E EE E E E E E we will use the fact later

Example

f x f f f f x f f x f f x

2

Using the notation just intro duced we can now dene Churchs numerals Notice

how the denition of the expression n b elow enco des a unary representation of n

LET f x x

LET f x f x

f x f f x LET

n

f x f x LET n

The representations of suc add and iszero are now magically pulled out of a hat

The b est way to see how they work is to think of them as op erating on unary

representations of numb ers The exercises that follow should help

LET suc n f x n f f x

LET add m n f x m f n f x

LET iszero n n x false true

Chapter Representing Things in the calculus

Exercise

Show

i suc

ii suc

iii iszero true

false iv iszero

v add

vi add

2

Exercise

Show for all numb ers m and n

n i suc n

ii iszero suc n false

n n iii add

iv add m m

v add m n m n

2

The predecesor function is harder to dene than the other primitive functions

n

The idea is that the predecessor of n is dened by using f x f x ie n to

obtain a function that applies f only n times The trick is to throw away the

n

rst application of f in f To achieve this we rst dene a function prefn that

op erates on pairs and has the prop erty that

i prefn f true x false x

ii prefn f false x false f x

From this it follows that

n n

iii prefn f false x false f x

n n

iv prefn f true x false f x if n

Thus n applications of prefn to true x result in n applications of f to x With

this idea the denition of the predecessor function pre is straightforward Before

giving it here is the denition of prefn

LET prefn f p false fst p snd p j f snd p

Exercise

Show prefn f b x false b x j f x and hence

Numb ers

i prefn f true x false x

ii prefn f false x false f x

n n

iii prefn f false x false f x

n n

iv prefn f true x false f x if n

2

The predecessor function pre can now b e dened

LET pre n f x snd n prefn f true x

It follows that if n

pre n f x snd n prefn f true x denition of pre

n

snd prefn f true x denition of n

n

sndfalse f x by v ab ove

n

f x

hence by extensionality Section on page

n

pre n f x f x

denition of n n

Exercise

Using the results of the previous exercise or otherwise show that

i pre suc n n

ii pre

2

The numeral system in the next exercise is the one used in and has some advan

tages over Churchs eg the predecessor function is easier to dene

Exercise

b

xx LET

b b

LET false

b b

false LET

d

false nb LET n

d

d d

Devise expressions suc iszero pre such that for all n

d

d

i suc nb n

Chapter Representing Things in the calculus

d

b

ii iszero true

d

d

iii iszero suc nb false

d d

nb iv pre suc nb

2

Denition by recursion

To represent the multiplication function in the calculus we would like to dene a

expression mult say such that

mult m n add n add n add n

z

m adds

This would b e achieved if mult could b e dened to satisfy the equation

mult m n iszero m j add n mult pre m n

If this held then for example

mult iszero j add mult pre

by the equation

mult add

by prop erties of iszero the conditional and pre

iszero j add mult pre add

by the equation

add mult add

by prop erties of iszero the conditional and pre

add add iszero j add mult pre

by the equation

add add

by prop erties of iszero and the conditional

The equation ab ove suggests that mult b e dened by

mult m n iszero m pre m n j add n mult

NB

Unfortunately this cannot b e used to dene mult b ecause as indicated by the

arrow mult must already b e dened for the expression to the right of the equals

to make sense

Fortunately there is a technique for constructing expressions that satisfy arbitrary

equations When this technique is applied to the equation ab ove it gives the desired

denition of mult First dene a expression Y that for any expression E has

the following o dd prop erty

Y E E Y E

This says that Y E is unchanged when the function E is applied to it In general

0 0 0

if E E E then E is called a xed point of E A expression Fix with the

prop erty that Fix E E Fix E for any E is called a xedpoint operator There

are known to b e innitely many dierent xedp oint op erators Y is the most

famous one and its denition is

Denition by recursion

LET Y f x f x x x f x x

It is straightforward to show that Y is indeed a xedp oint op erator

Y E f x f x x x f x x E denition of Y

x E x x x E x x conversion

E x E x x x E x x conversion

E Y E the line b efore last

This calculation shows that every expression E has a xed p oint

Armed with Y we can now return to the problem of solving the equation for mult

Supp ose multfn is dened by

LET multfn f j add n f m n iszero m pre m n

and then dene mult by

LET mult Y multfn

Then

mult m n Y multfn m n denition of mult

multfn Y multfn m n xedp oint prop erty of Y

multfn mult m n denition of mult

j add n f pre m n mult m n f m n iszero m

denition of multfn

j add n mult pre m n conversion iszero m

An equation of the form f x x E is called recursive if f o ccurs free in E

n

Y provides a general way of solving such equations Start with an equation of the

form

f x x f

n

g g

where f is some expression containing f To obtain an f so that this

g g

equation holds dene

LET f Y f x x f

n

g g

The fact that the equation is satised can b e shown as follows

f x x Y f x x f x x denition of f

n n n

g g

f x x f Y f x x f x x

n n n

g g g g

xedp oint prop erty

f x x f f x x denition of f

n n

g g

f conversion

g g

Exercise

Construct a expression eq such that

eq m n iszero m iszero n j

iszero n false j eq pre m pre n 2

Chapter Representing Things in the calculus

Exercise

Show that if Y is dened by

LET Y Y y f f y f

then Y is a xedp oint op erator ie for any E

Y E E Y E

2

The xedp oint op erator in the next exercise is due to Turing Barendregt page

Exercise

Show that x y y x x y x y y x x y is a xedp oint op erator 2

The next exercise also comes from Barendregts b o ok where it is attributed to Klop

Exercise

Show that Y is a xedp oint op erator where

LET $ abcdef g hij k l mnopq stuv w xy z r

r thisisaf ixedpointcombinator

LET Y $$$$$$$$$$$$$$$$$$$$$$$$$$

2

Exercise

b

Is it the case that Y f f Y f If so prove it if not nd a expression Y

b b

such that Y f f Y f 2

In the pure calculus as dened on page expressions could only b e applied to

a single argument however this argument could b e a tuple see page Thus one

can write

E E E

n

which actually abbreviates

E E E E E

n n

For example E E E abbreviates E f f E E

Functions with several arguments

In conventional mathematical usage the application of an nargument function f

to arguments x x would b e written as f x x There are two ways of

n n

representing such applications in the calculus

i as f x x or

n

ii as the application of f to an ntuple x x

n

Functions with several arguments

In case i f exp ects its arguments one at a time and is said to b e curried after a

logician called Curry the idea of currying was actually invented by Schonnkel

The functions and or and add dened earlier were all curried One advantage of

curried functions is that they can b e partially applied for example add is the

result of partially applying add to and denotes the function n n

Although it is often convenient to represent nargument functions as curried it is

also useful to b e able to represent them as in case ii ab ove by expressions

exp ecting a single ntuple argument For example instead of representing and

by expressions add and mult such that

n mn add m

mult m n mn

it might b e more convenient to represent them by functions sum and pro d say

such that

sum m n mn

pro d m n mn

This is nearer to conventional mathematical usage and has applications that will

b e encountered later One might say that sum and pro d are uncurried versions of

add and mult resp ectively

Dene

LET curry f x x f x x

LET uncurry f p f fst p snd p

then dening

sum uncurry add

pro d uncurry mult

results in sum and pro d having the desired prop erties for example

n uncurry add m n sum m

f p f fst p snd padd m n

n snd m n add fst m

add m n

mn

Exercise

Show that for any E

curry uncurry E E

uncurry curry E E

hence show that

add curry sum

mult curry pro d

2

We can dene nary functions for currying and uncurrying For n dene

Chapter Representing Things in the calculus

LET curry f x x f x x

n n

n

n n

LET uncurry f p f p p n

n

If E represents a function exp ecting an ntuple argument then curry E represents

n

the curried function which takes its arguments one at a time If E represents a

curried function of n arguments then uncurry E represents the uncurried version

n

which exp ects a single ntuple as argument

Exercise

Show that

i curry uncurry E E

n n

ii uncurry curry E E

n n

2

Exercise

n n

built out of curry and uncurry such that and E Devise expressions E

n n

2 and uncurry E curry E

n n

The following notation provides a convenient way to write expressions which

exp ect tuples as arguments

Generalized abstractions

LET V V E uncurry V V E

n n

n

Example x y mult x y abbreviates

uncurry x y mult x y f p f p p x y mult x y

f p f fst p snd p x y mult x y

p mult fst psnd p

Thus

x y mult x y E E p mult fst p snd p E E

mult fstE E sndE E

mult E E

2

This example illustrates the rule of generalized conversion in the b ox b elow This

rule can b e derived from ordinary conversion and the denitions of tuples and

generalized abstractions The idea is that a tuple of arguments is passed to each

argument p osition in the b o dy of the generalized abstraction then each individual

argument can b e extracted from the tuple without aecting the others

Mutual recursion

Generalized conversion

V V E E E E E E V V

n n n n

where E E E V V is the simultaneous substitution of E E

n n n

for V V resp ectively and none of these variables o ccur free in any of

n

E E

n

It is convenient to extend the notation V V V E describ ed on page so

n

that each V can either b e an identier or a tuple of identiers The meaning of

i

V V V E is still V V V E but now if V is a tuple of

n n i

identiers then the expression is a generalized abstraction

Example f x y f x y means f x y f x y which in turn means

f uncurry x y f x y which equals f p f fst p snd p 2

Exercise

Show that if the only free variables in E are x x and f then if

n

f Y f x x E

n

then

f x x E ff

n

2

Exercise

Dene a expression div with the prop erty that

div m n q r

where q and r are the quotient and remainder of dividing n into m 2

Mutual recursion

To solve a set of mutually recursive equations like

f F f f

n

f F f f

n

f F f f

n n n

we simply dene for i n

f Y f f F f f F f f i

i n n n n

This works b ecause if

f Y f f F f f F f f

n n n n

then f f i and hence

i

f f f F f f F f f f

n n n n

F f f n F f f n

n

F f f F f f since f i f

n n n i

Hence

f F f f

i i n

Chapter Representing Things in the calculus

Representing the recursive functions

The recursive functions form an imp ortant class of numerical functions Shortly

after Church invented the calculus Kleene proved that every recursive function

could b e represented in it This provided evidence for Churchs thesis the hyp othe

sis that any intuitively computable function could b e represented in the calculus

It has b een shown that many other mo dels of compution dene the same class of

functions that can b e dened in the calculus

In this section it is describ ed what it means for an arithmetic function to b e repre

sented in the calculus Two classes of functions the primitive recursive functions

and the recursive functions are dened and it is shown that all the functions in

these classes can b e represented in the calculus

In Section it was explained how a numb er n is represented by the expression

n A expression f is said to represent a mathematical function f if for all numb ers

x x

n

x x y if f x x y f

n n

The primitive recursive functions

A function is called primitive recursive if it can b e constructed from and the

i

functions S and U dened b elow by a nite sequence of applications of the

n

op erations of substitution and primitive recursion also dened b elow

i

where n and i are numb ers The successor function S and pro jection functions U

n

are dened by

i S x x

i

ii U x x x x

n i

n

Substitution

Supp ose g is a function of r arguments and h h are r functions each of n

r

arguments We say f is dened from g and h h by substitution if

r

f x x g h x x h x x

n n r n

Primitive recursion

Supp ose g is a function of n arguments and h is a function of n arguments

We say f is dened from g and h by primitive recursion if

f x x g x x

n n

f S x x x hf x x x x x x

n n n

g is called the base function and h is called the step function It can proved that

for any base and step function there always exists a unique function dened from

them by primitive recursion This result is called the primitive recursion theorem

pro ofs of it can b e found in textb o oks on mathematical logic

Example The addition function sum is primitive recursive b ecause

sum x x

sum S x x S sum x x

2

Representing the recursive functions

It is now shown that every primitive recursive function can b e represented by

expressions

n

It is obvious that the expressions suc p p i represent the initial functions

i

S and U resp ectively

n

Supp ose function g of r variables is represented by g and functions h i r

i

of n variables are represented by h Then if a function f of n variables is dened

i

by substitution by

f x x g h x x h x x

n n r n

then f is represented by f where

f x x gh x x h x x

n n r n

Supp ose function f of n variables is dened inductively from a base function g of

n variables and an inductive step function h of n variables Then

f x x g x x

n n

f S x x x hf x x x x x x

n n n

Thus if g represents g and h represents h then f will represent f if

f x x x

n

iszero x gx x j

n

hf pre x x x pre x x x

n n

Using the xedp oint trick an f can b e constructed to satisfy this equation by

dening f to b e

Y f x x x

n

iszero x gx x j

n

hf pre x x x pre x x x

n n

Thus any primitive recursive function can b e represented by a expression

The recursive functions

A function is called recursive if it can b e constructed from the successor function

and the pro jection functions see page by a sequence of substitutions primitive

recursions and minimizations

Minimization

Supp ose g is a function of n arguments We say f is dened from g by minimization

if

f x x x the smallest y such that g y x x x

n n

The notation MINf is used to denote the minimization of f Functions dened

by minimization may b e undened for some arguments For example if one is the

function that always returns ie onex for every x then MINone is only

dened for arguments with value This is obvious b ecause if f x MIN one x

then

f x the smallest y such that one y x

and clearly this is only dened if x Thus

if x

MIN one x

undened otherwise

Chapter Representing Things in the calculus

To show that any recursive function can b e represented in the calculus it is neces

sary to show how to represent the minimization of an arbitrary function Supp ose

g represents a function g of n variables and f is dened by

f MIN g

Then if a expression min can b e devised such that min x f x x represents

n

the least numb er y greater than x such that

f y x x x

n

then g will represent g where

f x x x g x x x min

n n

min will clearly have the desired prop erty if

min x f x x x

n

eq f x x x x x j min suc x f x x x

n n

n is equal to true if m n and false otherwise a suitable denition where eq m

of eq o ccurs on page Thus min can simply b e dened to b e

Y m

x f x x x

n

eq f x x x x x j m suc x f x x x

n n

Thus any recursive function can b e represented by a expression

Higherorder primitive recursion

There are functions which are recursive but not primitive recursive Here is a version

of Ackermanns function dened by

n n

m m

m n m m n

However if one allows functions as arguments then many more recursive functions

can b e dened by a primitive recursion For example if the higherorder function

rec is dened by primitive recursion as follows

rec x x x

rec S x x x x rec x x x

then can b e dened by

m n rec m S f x rec x f f n

where x x denotes the function that maps x to x Notice that the third

argument of rec viz x must b e a function In the denition of we also to ok

x to b e a function viz S

1

Note that x x is an expression of the calculus whereas x 7! x is a notation of informal mathematics

Extending the calculus

Exercise

Show that the denition of in terms of rec works ie that with dened as

ab ove

n n

m m

m n m m n

2

A function which takes another function as an argument or returns another function

as a result is called higherorder The example shows that higherorder primitive

recursion is more p owerful than ordinary primitive recursion The use of op erators

like rec is one of the things that makes functional programming very p owerful

The partial recursive functions

A partial function is one that is not dened for all arguments For example the

function MINone describ ed ab ove is partial Another example is the division

function since division by is not dened Functions that are dened for all

arguments are called total

A partial function is called partial recursive if it can b e constructed from the suc

cessor function and the pro jection functions by a sequence of substitutions primitive

recursions and minimizations Thus the recursive functions are just the partial re

cursive functions which happ en to b e total It can b e shown that every partial

recursive function f can b e represented by a expression f in the sense that

x x y if f x x y i f

n n

ii If f x x is undened then f x x has no normal form

n n

Note that despite ii ab ove it is not in general correct to regard expressions with

no normal form as b eing undened

Exercise

Write down the expression that represents MINf where f x for all x 2

Extending the calculus

Although it is p ossible to represent dataob jects and datastructures with

expressions it is often inecient to do so For example most computers have

hardware for arithmetic and it is reasonable to use this rather than conversion

to compute with numb ers A mathematically clean way of interfacing computation

rules to the calculus is via so called rules

The idea is to add a set of new constants and then to sp ecify rules called a rules

for reducing applications involving these constants For example one might add

numerals and as new constants together with the rule

m n mn

E E means E results by applying a rule to some sub expression of E

When adding such constants and rules to the calculus one must b e careful not to

destroy its nice prop erties eg the ChurchRosser theorem see page

2

The kind of primitive recursion dened in Section is rstorder primitive recursion

Chapter Representing Things in the calculus

It can b e shown that rules are safe if they have the form

c c c e

n

where c c are constants and e is either a constant or a closed abstraction

n

such expressions are sometimes called values

For example one might add as constants Suc Pre IsZero with

the rules

Suc

n n

Pre

n n

IsZero true

IsZero false

n

Here represents the numb er n Suc Pre IsZero are new constants not dened

n

expressions like suc pre iszero and true and false are the previously dened

expressions which are b oth closed abstractions

Theorems ab out the calculus

If E E then E can b e thought of as having b een got from E by evaluation

If there are no or redexes in E then it can b e thought of as fully evaluated

A expression is said to b e in normal form if it contains no or redexes ie if

the only conversion rule that can b e applied is conversion Thus a expression

in normal form is fully evaluated

Examples

i The representations of numb ers are all in normal form

ii x x is not in normal form

2

Supp ose an expression E is evaluated in two dierent ways by applying two dier

ent sequences of reductions until two normal forms E and E are obtained The

ChurchRosser theorem stated b elow shows that E and E will b e the same except

for having p ossibly dierent names of b ound variables

Because the results of reductions do not dep end on the order in which they are done

separate redexes can b e evaluated in parallel Various research pro jects are currently

trying to exploit this fact by designing multipro cessor architectures for evaluating

expressions It is to o early to tell how successful this work will b e There is

a p ossibility that the communication overhead of distributing redexes to dierent

pro cessors and then collecting together the results will cancel out the theoretical

advantages of the approach Let us hop e this p essimistic p ossibility can b e avoided

It is a remarkable fact that the ChurchRosser theorem an obscure mathematical

result dating from b efore computers were invented may underpin the design of the

next generation of computing systems

Here is the statement of the ChurchRosser theorem It is an example of something

that is intuitively obvious but very hard to prove Many prop erties of the calculus

share this prop erty

Theorems ab out the calculus

The ChurchRosser theorem

If E E then there exists an E such that E E and E E

It is now p ossible to see why the ChuchRosser theorem shows that expressions

can b e evaluated in any order Supp ose an expression E is evaluated in two

dierent ways by applying two dierent sequences of reductions until two normal

forms E and E are obtained Since E and E are obtained from E by sequences

of conversions it follows by the denition of that E E and E E and hence

0

E E By the ChurchRosser theorem there exists an expression E say such

0 0

that E E and E E Now if E and E are in normal form then the

only redexes they can contain are redexes and so the only way that E and E

0

can b e reduced to E is by changing the names of b ound variables Thus E and

E must b e the same up to renaming of b ound variables ie conversion

Another application of the ChurchRosser theorem is to show that if m n then

n Supp ose m n the expressions representing m and n are not equal ie m

n by the ChurchRosser theorem m E and n E for some E But but m

it is obvious from the denitions of m and n namely

m

m f x f x

n

n f x f x

that no such E can exist The only conversions that are applicable to m and n are

conversions and these cannot change the numb er of function applications in an

contains m applications and n contains n applications expression m

0 0

A expression E has a normal form if E E for some E in normal form The

following corollary relates expressions in normal form to those that have a normal

form it summarizes some of the statements made ab ove

Corollary to the ChurchRosser theorem

0 0

i If E has a normal form then E E for some E in normal form

0 0

ii If E has a normal form and E E then E has a normal form

0 0 0

iii If E E and E and E are b oth in normal form then E and E are

identical up to conversion

Proof

0 0

i If E has a normal form then E E for some E in normal form By the

00 00 0 00

ChurchRosser theorem there exists E such that E E and E E

0

As E is in normal form the only redexes it can have are redexes so the

0 00 00

reduction E E must consist of a sequence of conversions Thus E

0

must b e identical to E except for some renaming of b ound variables it must

0

thus b e in normal form as E is

0 00

ii Supp ose E has a normal form and E E As E has a normal form E E

00 0 00

where E is in normal form Hence E E by the transitivity of see

0

page and so E has a normal form

Chapter Representing Things in the calculus

iii This was proved ab ove

2

Exercise

For each of the following expressions either nd its normal form or show that it

has no normal form

i add

ii add

iii x x x x x

iv x x x x x x

v Y

vi Y y y

j f pre x vii Y f x iszero x

2

Notice that a expression E might have a normal form even if there exists an

innite sequence E E E For example x Y f has a normal

even though form

n

x Y f x f Y f x f Y f

The normalization theorem stated b elow tells us that such blind alleys can always

b e avoided by reducing the leftmost or redex where by leftmost is meant the

redex whose b eginning is as far to the left as p ossible

Another imp ortant p oint to note is that E may not have a normal form even though

E E do es have one For example Y has no normal form but Y x It

is a common mistake to think of expressions without a normal form as denoting

undened functions Y has no normal form but it denotes a p erfectly well dened

function Analysis b eyond the scop e of this b o ok see Wadsworths pap er

shows that a expression denotes an undened function if and only if it cannot b e

converted to an expression in head normal form where E is in head normal form if

it has the form

V V V E E

m n

where V V and V are variables and E E are expressions V can

m n

either b e equal to V for some i or it can b e distinct from all of them It follows

i

that the xedp oint op erator Y is not undened b ecause it can b e converted to

f f x f x x x f x x

which is in head normal form

It can b e shown that an expression E has a head normal form if and only if there

exist expressions E E such that E E E has a normal form This

n n

supp orts the interpretation of expressions without head normal forms as denoting

undened functions E b eing undened means that E E E never terminates

n

for any E E Full details on head normal forms and their relation to

n

denedness can b e found in Barendregts b o ok

3

The mathematical characterization of the function denoted by Y can b e found in Stoys b o ok

Callbyvalue and Y

The normalization theorem

If E has a normal form then rep eatedly reducing the leftmost or redex

p ossibly after an conversion to avoid invalid substitutions will terminate

with an expression in normal form

The remark ab out conversion in the statement of the theorem is to cover cases

like

0 0

x y x y y y y y

0 0

where y x y y x y has b een converted so as to avoid the invalid substi

tution y x y y x y y y

A sequence of reductions in which the leftmost redex is always reduced is called a

normal order reduction sequence

0

The normalization theorem says that if E has a normal form ie for some E

0

in normal form E E then it can b e found by normal order reduction This

however is not usually the most ecient way to nd it For example normal

order reduction requires

x x x E

g g g

to b e reduced to

E E

g g g

0

If E is not in normal form then it would b e more ecient to rst reduce E to E

0

say where E is in normal form and then to reduce

0

x x x E

g g g

to

0 0

E E

g g g

thereby avoiding having to reduce E twice

Note however that this callbyvalue scheme is disastrous in cases like

x x x x x x x

It is a dicult problem to nd an optimal algorithm for cho osing the next redex to

reduce For recent work in this area see Levys pap er

Because normal order reduction app ears so inecient some programming languages

based on the calculus eg LISP have used call by value even though it do esnt

always terminate Actually call by value has other advantages b esides eciency

esp ecially when the language is impure ie has constructs with side eects eg as

signments On the other hand recent research suggests that mayb e normal order

evaluation is not as inecient as was originally thought if one uses cunning im

plementation tricks like graph reduction see page Whether functional pro

gramming languages should use normal order or call by value is still a controversial

issue

Callbyvalue and Y

Recall Y

LET Y f x f x x x f x x

Chapter Representing Things in the calculus

Unfortunately Y do esnt work with callbyvalue b ecause applicative order causes

it to go into a lo op

Y f f Y f

f f Y f

f f f Y f

To get around this dene

LET Y f x f y x x y x f y x x y

Note that Y is Y with x x converted to y x x y Y do esnt go es into a

lo op with callbyvalue

Y f f y Y f y

Callbyvalue do esnt evaluate s hence the lo oping is avoided

Chapter

Combinators

Combinators provide an alternative theory of functions to the calculus They

were originally intro duced by logicians as a way of studying the pro cess of substitu

tion More recently Turner has argued that combinators provide a go o d machine

co de into which functional programs can b e compiled Several exp erimental

computers have b een built based on Turners ideas see eg and the results

are promising How these machines work is explained in Section Combinators

also provide a go o d intermediate co de for conventional machines several of the b est

compilers for functional languages are based on them eg

There are two equivalent ways of formulating the theory of combinators

i within the calculus or

ii as a completely separate theory

The approach here is to adopt i as it is slightly simpler but ii was how it was

done originally It will b e shown that any expression is equal to an expression

built from variables and two particular expressions K and S using only function

application This is done by mimicking abstractions using combinations of K and

S It will b e demonstrated how reductions can b e simulated by simpler op era

tions involving K and S It is these simpler op erations that combinator machines

implement directly in hardware The denitions of K and S are

LET K x y x

LET S f g x f x g x

From these denitions it is clear by reduction that for all E E and E

K E E E

S E E E E E E E

Any expression built by application ie combination from K and S is called a

combinator K and S are the primitive combinators

In BNF combinators have the following syntax

combinator K j S j combinator combinator

A combinatory expression is an expression built from K S and zero or more vari

ables Thus a combinator is a combinatory expression not containing variables In

1

The twovolume treatise Combinatory Logic is the denitive reference but the more

recent textb o oks are b etter places to start

Chapter Combinators

BNF the syntax of combinatory expressions is

combinatory expression

K j S

j variable

j combinatory expression combinatory expression

Exercise

Dene I by

LET I x x

Show that I S K K 2

The identity function I dened in the last exercise is often taken as a primitive

combinator but as the exercise shows this is not necessary as it can b e dened from

K and S

Combinator reduction

0 0

If E and E are combinatory expressions then the notation E E is used if

c

0 0

E E or if E can b e got from E by a sequence of rewritings of the form

i K E E E

c

ii S E E E E E E E

c

iii I E E

c

Note that the reduction I E E is derivable from i and ii

c

Example

S K K x K x K x by ii

c

x by i

c

2

This example shows that for any E I E E

c

Any sequence of combinatory reductions ie reductions via can b e expanded

c

into a sequence of conversions This is clear b ecause K E E and S E E E

reduce to E and E E E E resp ectively by sequences of conversions

Functional completeness

A surprising fact is that any expression can b e translated to an equivalent combi

natory expression This result is called the functional completeness of combinators

and is the basis for compilers for functional languages to the machine co de of com

binator machines

The rst step is to dene for an arbitrary variable V and combinatory expression



E another combinatory expression V E that simulates V E in the sense that



V E V E This provides a way of using K and S to simulate adding V to

an expression

Functional completeness

If V is a variable and E is a combinatory expression then the combinatory expres



sion V E is dened inductively on the structure of E as follows



i V V I

 0 0 0

ii V V K V if V V



iii V C K C if C is a combinator

  

iv V E E S V E V E



Note that V E is a combinatory expression not containing V

Example If f and x are variables and f x then

  

x f x S x f x x

S K f I

2



The following theorem shows that V E simulates abstraction



Theorem V E V E

Proof

 

We show that V E V E It then follows immediately that V V E V



V E and hence by reduction that V E V E



The pro of that V E V E is by mathematical induction on the size of E

The argument go es as follows

i If E V then



V E V I V x x V V E

0 0

ii If E V where V V then

0 0 0



V E V K V V x y x V V V E

iii If E C where C is a combinator then



V E V K C x y x C V C E

iv If E E E then we can assume by induction that



V E V E



V E V E

and hence

 

V E V V E E V

 

S V E V E V

 

f g x f x g x V E V E V

 

V E V V E V

E E by induction assumption

E 2

Chapter Combinators

The notation



V V V E

n

is used to mean

  

V V V E

n

Now dene the translation of an arbitrary expression E to a combinatory expres

sion E

C

i V V

C

ii E E E E

C C C



iii V E V E

C C

Theorem For every expression E we have E E

C

Proof

The pro of is by induction on the size of E

i If E V then E V V

C C

ii If E E E we can assume by induction that

E E

C

E E

C

hence

E E E E E E E E

C C C C

0

iii If E V E then we can assume by induction that

0 0

E E

C

hence

0

E V E

C C

 0

V E by translation rules

C

 0

V E by induction assumption

0

V E by previous theorem

E

2

This theorem shows that any expression is equal to a expression built up from

K and S and variables by application ie the class of expressions E dened by

the BNF

E V j K j S j E E

is equivalent to the full calculus

A collection of n combinators C C is called an nelement basis Barendregt

n

Chapter if every expression E is equal to an expression built from C s and

i

variables by function applications The theorem ab ove shows that K and S form a

element basis The exercise b elow from Section of Barendregt shows that

there exists a element basis

Reduction machines

Exercise

Find a combinator X say such that any expression is equal to an expression built

from X and variables by application Hint Let hE E E i p p E E E and

consider hK S Ki hK S Ki hK S Ki and hK S Ki hhK S Ki hK S Kii 2

Examples

   

f x f x x f x f x x

  

f S x f x x x

  

f S Kf S x x x x



f S Kf S I I

 

S f S Kf f S I I

 

S S f S f K f K S I I

 

S S K S S f K f f K S I I

S S K S S K K I K S I I

Y f x f x x x f x x

C C



f x f x x x f x x

C



f x f x x x f x x

C C

  

f x f x x x f x x

C C

  

f x f x x x f x x

   

S f x f x x f x f x x

SSSKSS KKIK SI IS SKSSKKI KS II

2

Reduction machines

Until David Turner published his pap er combinators were regarded as a mathe

matical curiosity In his pap er Turner argued that translating functional languages

ie languages based on the calculus to combinators and then reducing the result

ing expressions using the rewrites given on page is a practical way of implement

ing these languages

Turners idea is to represent combinatory expressions by trees For example

S f x K y z would b e represented by

A

A

A

A

m

z

A A

A A

A A

A A

m m m

y

A

S K

A

A

A

m m

x f

Chapter Combinators

Such trees are represented as p ointer structures in memory Sp ecial hardware or

rmware can then b e implemented to transform such trees according to the rules

of combinator reduction dening

c

For example the tree ab ove could b e transformed to

A A

A A

A A

A A

m m

z z

A A

A A

A A

A A

m m m m

y

x

f

K

using the transformation

A

A

A

A

A A A S

S

A A A

A A A

A A A

A S S S S S

S S S S S

A

A

A

m

S

S

S

which corresp onds to the reduction S E E E E E E E

c

Exercise

What tree transformation corresp onds to K E E E How would this trans

c

formation change the tree ab ove 2

Notice that the tree transformation for S just given duplicates a subtree This

wastes space a b etter transformation would b e to generate one subtree with two

p ointers to it ie

A

A

A

A

A S

S

A

A

A

"

A S S S

S S S

A

 !

A

A

m

S S

S

S S

This generates a graph rather than a tree For further details of such graph reduc

tions see Turners pap er

It is clear from the theorem ab ove that a valid way of reducing expressions is

Improved translation to combinators

i Translating to combinators ie E E

C

ii Applying the rewrites

K E E E

c

S E E E E E E E

c

until no more rewriting is p ossible

An interesting question is whether this pro cess will fully evaluate expressions If

some expression E is translated to combinators then reduced using is the

c

resulting expression as fully evaluated as the result of reducing E directly or is

it only partially evaluated Surprisingly there do esnt seem to b e anything in the

literature on this imp ortant question However combinator machines have b een

built and they app ear to work

It is well known that if E E in the calculus then it is not necessarily the

case that E E For example take

C C

c

E y z y x y

E y y

Exercise

With E and E as ab ove show that E E in the calculus but it is not the

case that E E 2

C C

c

A combinatory expression is dened to b e in combinatory normal form if it contains

no sub expressions of the form K E E or S E E E Then the normalization

theorem holds for combinatory expressions ie always reducing the leftmost com

binatory redex will nd a combinatory normal form if it exists

Note that if E is in combinatory normal form then it do es not necessarily follow

that it is a expression in normal form

Example S K is in combinatory normal form but it contains a redex namely

f g x f x g x x y x

2

Exercise

Construct a combinatory expression E which is in combinatory normal form but

has no normal form 2

Improved translation to combinators

The examples on page show that simple expressions can translate to quite

complex combinatory expressions via the rules on page

To make the co de executed by reduction machines more compact various opti

mizations have b een devised

Examples

2

The most relevant pap er I could nd is one by Hindley This compares reduction

with combinatory reduction but not in a way that is prima facie relevant to the termination of

combinator machines

Chapter Combinators

i Let E b e a combinatory expression and x a variable not o ccurring in E Then

S K E I x K E x I x E x

c c

hence S KE I x E x b ecause E E implies E E so by

c

extensionality Section see on page

S K E I E

ii Let E E b e combinatory expressions and x a variable not o ccurring in either

of them Then

S K E K E x K E x K E x E E

c c

Thus

S K E K E x E E

Now

K E E x E E

c

hence K E E x E E Thus

S K E K E x E E K E E x

It follows by extensionality that

S K E K E K E E

2

Since S K E I E for any E whenever a combinatory expression of the form

S K E I is generated it can b e p eephole optimized to just E Similarly whenever

an expression of the form S K E K E is generated it can b e optimized to

K E E

Example On page it was shown that

 

f x f x x S S K S S K K I K S I I

Using the optimization S K E I E this simplies to

 

f x f x x S S K S K K S I I

2

More combinators

It is easier to recognize the applicability of the optimization S K E I E if I has

not b een expanded to S K K ie if I is taken as a primitive combinator Various

other combinators are also useful in the same way for example B and C dened

by

LET B f g x f g x

LET C f g x f x g

Currys algorithm

These have the following reduction rules

B E E E E E E

c

C E E E E E E

c

Exercise

Show that with B C dened as ab ove

S K E E B E E

S E K E C E E

where E E are any two combinatory expressions 2

Using B and C one can further optimize the translation of expressions to com

binators by replacing expressions of the form S K E E and S E K E by

B E E and C E E

Currys algorithm

Combining the various optimizations describ ed in the previous section leads to

Currys algorithm for translating expressions to combinatory expressions This

algorithm consists in using the denition of E given on page but whenever

C

an expression of the form S E E is generated one tries to apply the following

rewrite rules

S K E K E K E E

S K E I E

S K E E B E E

S E K E C E E

If more than one rule is applicable the earlier one is used For example

S K E K E is translated to K E E not to B E K E

Exercise

Show that using Currys algorithm Y is translated to the combinator

S C B S I I C B S I I

2

Exercise

Show that

S S K S S K K I K S I I C B S I I 2

Chapter Combinators

Turners algorithm

In a second pap er Turner prop osed that Currys algorithm b e extended to use

0

another new primitive combinator called S This is dened by

0

LET S c f g x c f x g x

and has the reduction rule

0

S C E E E C E E E E

c

where C E E E are arbitrary combinatory expressions The reason why C is

0

used is that S has the prop erty that if C is a combinator ie contains no variables

then for any E and E

0

  

x C E E S C x E x E

This can b e shown using extensionality Clearly x is a variable not o ccurring in

0

  

x C E E or S C x E x E exercise why so it is sucient to

show

0

  

x C E E x S C x E x E x



From the denition of x it easily follows that

  

x C E E S S K C x E x E

hence

  

x C E E x S S K C x E x E x

 

S K C x E x x E x

 

K C x x E x x E x

 

C x E x x E x

0

   

But S C x E x E x C x E x x E x also and so

0

  

x C E E x S C x E x E x

Exercise

Where in the argument ab ove did we use the assumption that C is a combinator

2

0

Turners combinator S is useful when translating expressions of the form

V V V E E it will b e seen shortly why it is convenient to numb er the

n

b ound variables in descending order To see this following Turner temp orarily

dene

0 

E to mean V E

00  

E to mean V V E

000   

E to mean V V V E

Recall that

  

V V V E E V V V E E

n C n

C

The next exercise shows that

  

V V V E E

n

gets very complicated as n increases

Turners algorithm

Exercise

Show that

 0 0

i x E E S E E

  00 00

ii x x E E S B S E E

   000 000

iii x x x E E S B S B B S E E

   

iv x x x x E E

0000 0000

S B S B B S B B B S E E

2

  

The size of V V V E E is prop ortional to the square of n Using

n

0

S the size can b e made to grow linearly with n

   0 0

x x E E x S E E

0

 0  0

S S x E x E

0

00 00

S S E E

0

00     00

E x x x E E x S S E

0 0

 00  00

S S S x E x E

0 0

000 000

E S S S E

0 0

     000 000

x x x x E E x S S S E E

0 0 0

 000  000

S S S S x E x E

0 0 0

0000 0000

S S S S E E

Just as B and C were intro duced to simplify combinatory expressions of the form

0 0

S K E E and S E K E resp ectively Turner also devised B and C with an

0

analogous role for S The prop erties required are

0 0

S C K E E B C E E

0 0

S C E K E C C E E

where C is any combinator and E E are arbitrary combinatory expressions

0 0

This is achieved if B and C are dened by

0

LET B c f g x c f g x

0

LET C c f g x c f x g

0 0

Clearly B and C will have the prop erty that for arbitrary expressions C E

E and E

0

B C E E E C E E E

c

0

C C E E E C E E E

c

Exercise

Show that for arbitrary expressions E E and E

0 0

i S E K E E B E E E

Chapter Combinators

0 0

ii S E E K E C E E E

0

iii S B E E E S E E E

0

iv B E E E B E E E

0

v C B E E E C E E E

2

Turners algorithm for translating expressions to combinatory expressions is de

scrib ed by him as follows

Use the algorithm of Curry but whenever a term b eginning in S B or

C is formed use one of the following transformations if it is p ossible to

do so

0

S B K A B S K A B

0

B K A B B K A B

0

C B K A B C K A B

Here A and B stand for arbitrary terms as usual and K is any term com

p osed entirely of constants The correctness of the new algorithm can

b e inferred from the correctness of the Curry algorithm by demonstrat

ing that in each of the ab ove transformations the left and righthand

sides are extensionally equal In each case this follows directly from the

denitions of the combinators involved

Since Turners pioneering pap ers app eared many p eople have worked on improving

the basic idea For example John Hughes has devised a scheme for dynamically

generating an optimal set of primitive combinators called supercombinators for

each program The idea is that the compiler will generate combinatory expres

sions built out of the sup ercombinators for the program b eing compiled It will

also dynamically pro duce micro co de to implement the reduction rules for these

sup ercombinators The result is that each program runs on a reduction machine

tailored sp ecially for it Most current highp erformance implementations of func

tional languages use sup ercombinators Another avenue of research is to use

combinators based on the De Bruijn notation briey describ ed on page The

Categorical Abstract Machine uses this approach

Chapter

A Quick Overview of ML

There are two widely use descendents of the original ML Standard ML and Caml

These notes describ e the former Several implementations of Standard ML exist

These all supp ort the same core language but dier in extensions error message

details etc ATTs public domain Standard ML of New Jersey SMLNJ and

the commercial system PolyML are used for research applications in the Cambridge

Computer Lab oratory The ML implementation on Thor for teaching is Edinburgh

ML from the University of Edinburgh with enhancements due to Arthur Norman

of Cambridge The dierent outputs pro duced by SMLNJ and Edinburgh ML

will b e sometimes shown but the examples that follow are presented in the system

neutral style of Paulsons b o ok which is closer to Edinburgh ML than SMLNJ

Interacting with ML

ML is an interactive language A common way to run it is inside a shell window

from emacs The programs are then tested by cutting and pasting from the text

window to the shell window

The two main things one do es in ML are evaluate expressions and p erform decla

rations

What follows is a session in which simple uses of various ML constructs are illus

trated To make the session easier to follow it is split into a sequence of b oxed

subsessions

Expressions

The toplevel ML prompt is As ML reads a phrase it prompts with until

a complete expression or declaration is found Neither the initial prompt nor the

intermediate prompt will normally b e shown here except in sessions which are

included to illustrate the b ehaviour of particular ML implementations eg the next

two b oxes

SMLNJ is called sml on Computer Lab machines The following seesion shows

it b eing run and the expression b eing evaluated

1

Readers interested in Caml should consult the Web page httppauillacinriafr ca ml

Caml is a lightweight language b etter suited than Standard ML for use on small machines All the

constructs describ ed in this course are in Caml though the syntactic details dier slightly from

Standard ML

2

This overview has evolved from the description of the original ML in Section of

MJC Gordon AJRG Milner and CP Wadsworth Edinburgh LCF A Mechanized

Logic of Computation Lecture Notes in Computer Science SpringerVerlag

3

PolyML was originally develop ed at the Cambridge Computer Lab oratory and then licenced

rst to Imp erial Software Technology and then to Abstract Hardware Limited It has an integrated

p ersistant storage system database and is less memory hungry than Standard ML of New Jersey

Chapter A Quick Overview of ML

1

woodcock sml

Standard ML of New Jersey Version February

val it unit

val it int

it

val it int

After SMLNJ starts up it prints a message followed by val it unit this

will b e explained later It then prompts for user input with the user then input

followed by a carriage return ML then resp onded with val it int a

new line and then prompted again This output shows that evaluates to the

value of typ e int

The user then input it followed by a carriage return and the system resp onded

with val it int again In general to evaluate an expression e one inputs

e followed by a semicolon and then a carriage return the system then prints es

value and typ e in the format shown The value of the last expression evaluated at

top level is rememb ered in the identier it This is shown explicitly in the output

from SMLNJ but not in the output from Edinburgh ML shown in the following

b ox which after Edinburgh ML has b een run has the same input as the preceding

one

2

hammerthorcamacuk groupclteachacnmlunixcm l

FAM groupclteachacnmlunixfam started on Jan

version of Jan

Image file groupclteachacnmlunixc mle xp

written on Jan by FAM version

Loading Generic Heapresexingrelocating by efffff bytes

Edinburgh ML for DOSWinsUnix C Edinburgh University A C Norman

int

it

int

Unless explicity indicated otherwise the b oxed sessions that follow use the format

illustrated by

3

val it int

it

val it int

Prompts are not shown system output is indicated by and the values of

expressions are shown explicitly b ound to it Sometimes part of the output will b e

omitted eg the typ e

Declarations

The declaration val xe evaluates e and binds the resulting value to x

Comments

4

val x

val x int

itx

val it false bool

Notice that declarations do not aect it

Inputting e at top level is actually treated as inputting the declaration let it e

The ML system b oth SMLNJ and Edinburgh ML initially binds it to a sp ecial

value which is the only value of the oneelement typ e unit

To bind the variables x x simultaneously to the values of the expressions

n

e e one can p erform

n

either the declaration val x e and x e and x e

n n

or val x x x e e e

n n

These two declarations are equivalent

5

val y and zx

val y int

val z int

val xy yx

val x int

val y int

A declaration d can b e made lo cal to the evaluation of an expression e by evaluating

the expression let d in e end

6

let val x in xy end

val it int

x

val it int

Comments

Comments start with and end with They nest like parentheses can extend

over many lines and can b e inserted wherever spaces are allowed

7

tr comments cant go in the middle of names ue

Error unbound variable or constructor tr

Error unbound variable or constructor ue

this comment is ignored

val it true bool

Inside this comment another one is nested

Functions

To dene a function f with formal parameter x and b o dy e one p erforms the

declaration fun f x e To apply the function f to an actual parameter e one

evaluates the expression f e

Chapter A Quick Overview of ML

8

fun f x x

val f fn int int

f

val it int

Functions are printed as fn in SMLNJ and Fn in Edinburgh ML since a function

as such is not printable After fn or Fn is printed the typ e of the function is also

printed Functions are printed as fn in these notes

Applying a function to an argument of the wrong typ e results in a typ echecking

error The particular error message dep ends on the ML system used In SMLNJ

9

f true

stdin Error operator and operand dont agree tycon mismatch

operator domain int

operand bool

in expression

f true

In Edinburgh ML

10

f true

Type clash in f true

Looking for a int

I have found a bool

Application binds more tightly than anything else in the language thus for ex

ample f means f not f Functions of several arguments can b e

dened

11

fun add xint yint xy

val add fn int int int

add

val it int

val f add

val f fn int int

f

val it int

Application asso ciates to the left so add means add In the expression

add the function add is applied to the resulting value is the function of typ e

int int which adds to its argument Thus add takes its arguments one at a

time

Without the explicit typing of the formal parameters ML cannot tell whether the

is addition of integers or reals The symb ol is overloaded If the extra typ e

information is omitted an error results In SMLNJ

12

fun add x y xy

stdin Error overloaded variable cannot be resolved

In Edinburgh ML

13

fun add x y xy

Type checking error in syntactic context unknown

Unresolvable overloaded identifier

Definition cannot be found for the type a a a

Typ e abbreviations

This kind of typ echecking error is relatively rare Much more common are errors

resulting from applying functions to arguments of the wrong typ e

The function add could alternatively have b een dened to take a single argument of

the pro duct typ e int int

14

fun addxyint xy

val add fn int int int

add

val it int

let val z in add z end

val it int

add

stdin Error operator and operand dont agree tycon mismatch

operator domain int int

operand int

in expression

add

The error message shown here is the one generated by SMLNJ Notice that this

time the result of the function has had its typ e given explicitly In general it

is sucient to explicitly typ e any sub expression as long as this disambiguates all

overloaded op erators

As well as taking structured arguments eg functions may also return struc

tured results

15

fun sumdiffxintyint xyxy

val sumdiff fn int int int int

sumdiff

val it int int

Typ e abbreviations

Typ es can b e given names

16

type intpair int int

type intpair defined

fun addpair xyintpair xy

val addpair fn intpair int

val it int int

intpair

val it intpair

addpair

val it int

The new name is simply an abbreviation intpair and intint are completely equiv

alent

Chapter A Quick Overview of ML

Op erators

addition and are builtin inx op erators Users can dene their own inxes

using infix for left asso ciative op erators and infixr for right asso ciative ones

17

infix op

infixr op

infix op

infixr op

This merely tells the parser to parse e op e as ope e and e op e as

1 2 1 2 1 2

ope e

1 2

18

fun xint op yint x y

val op fn int int int

op

val it int

fun xint op yint x y

val op fn int int int

op

val it int

An inx of precedence n can b e created by using infix n instead of just infix and

infixr n instead of just infixr If the n is omitted a default precedence of is

assumed

The ML parser can b e told to ignore the inx status of an o ccurrence of an identier

by preceding the o ccurrence with op

19

op

Error nonfix identifier required

op op

val it fn int int int

The inx status of an op erator can b e p ermanently removed using the directive

nonfix

20

val it int

nonfix

nonfix

Error operator is not a function

operator int

in expression

overloaded

Removing the inx status of builtin op erators is not recommended Lets restore

it b efore chaos results is leftasso ciative with precedence

21

infix

infix

Lists

Lists

If e e all have typ e ty then the ML expression e e has typ e ty list

n n

The standard functions on lists are hd head tl tail null which tests whether a

list is emptyie is equal to and the inxed op erators cons and app end

or concatenation

22

val m

val m int list

hd m tl m

val it int int list

null m null

val it falsetrue bool bool

m

val it int list

val it int list

true

stdin Error operator and operand dont agree tycon mismatch

operator domain bool bool list

operand bool int list

in expression

true nil

All the memb ers of a list must have the same typ e the error message shown is from

SMLNJ

Strings

A sequence of characters enclosed b etween quotes is a string

23

this is a string

val it this is a string string

val it string

The empty string is A string can b e explo ded into a list of singlecharacter

strings with the function explode The inverse of this is implode which concatenates

a list of singlecharacter strings into a single string

24

explode

val it fn string string list

explode this is a string

val it

this is a string

string list

implode it

val it this is a string string

Chapter A Quick Overview of ML

Records

Records are datastructures with named comp onents They can b e contrasted with

tuples whose comp onents are determined by p osition

A record with elds x x whose values are v v is created by evaluating

n n

the expression x v x v

n n

25

val MikeData

userid mjcg sex male married true children

val MikeData childrenmarriedtruesex male use rid mjcg

childrenint marriedbool sexstring useridstring

The typ e of x v x v is x x where is the typ e of v

n n n n i i

The order in which record comp onents are named do es not matter

26

val MikeData

sex male userid mjcg children married true

val MikeData childrenmarriedtruesex male us erid mjc g

childrenint marriedbool sexstring useridstring

MikeData MikeData

val it true bool

The comp onent named x of a record can b e extracted using the sp ecial op erator

x

27

children MikeData

val it int

Functions which access record comp onents need to b e explicitly told the typ e of the

record they are accessing since there may b e several typ es of records around with

the same eld names

28

fun Sex p sex p

Error unresolved flex record in let pattern

type persondata useridstring childrenint marriedbool sexstring

type persondata childrenint marriedbool sexstring useridstring

fun Sexppersondata sex p

val Sex fn persondata string

A tuple v v is equivalent to the record v nv ie tuples in

1 n

n

ML are sp ecial cases of records

29

Hello true

val it Hellotrue string bool int

it

val it true bool

Polymorphism

The list pro cessing functions hd tl etc can b e used on all typ es of lists

fnexpressions

30

hd

val it int

hd truefalsetrue

val it true bool

hd

val it int int

Thus hd has several typ es ab ove it is used with typ es int list int

bool list bool and int int list int int In fact if ty is any

typ e then hd has the typ e ty list ty Functions like hd with many typ es

are called polymorphic and ML uses typ e variables a b c etc to represent their

typ es

31

hd

val it fn a list a

The ML function map takes a function f with argument typ e a and result typ e

b and a list l of elements of typ e a and returns the list obtained by applying

f to each element of l which is a list of elements of typ e b

32

map

val map fn a b a list b list

fun add xint x

val add fn int int

map add

val it int list

map can b e used at any instance of its typ e ab ove b oth a and b were instantiated

to int b elow a is instantiated to int list and b to bool Notice that the

instance need not b e sp ecied it is determined by the typ e checker

33

map null

val it falsetruefalsetrue bool list

A useful builtin op erator is function comp osition o

34

op o

val it fn a b c a c b

fun add n n

and add n n

val add fn int int

val add fn int int

add o add

val it int

fnexpressions

The expression fn x e evaluates to a function with formal parameter x and

with b o dy e Thus the declaration fun f x e is equivalent to val f fn x e

Similarly fun f xy z e is equivalent to val f fn xy fn z e In the

theory of functions the symb ol is used instead of fn expressions like fn x e are

sometimes called expressions b ecause they corresp ond to calulus abstractions

xe see Chapter

Chapter A Quick Overview of ML

35

fn x x

val it fn int int

it

val it int

The higher order function map applies a function to each element of a list in turn

and returns the list of results

36

map fn x xx

val it int list

val doubleup map fn x xx

val doubleup fn a list list a list list

doubleup

val it int list list

doubleup

val it a list list

Conditionals

ML has conditionals with syntax if e then e else e with the exp ected meaning

1 2

The truthvalues are true and false b oth of typ e bool

37

if true then else

val it int

if then else

val it int

e orelse e abbreviates if e then true else e and e andalso e abbreviates

1 2 1 2 1 2

if e then e else false

1 2

Recursion

The following denes the factorial function

38

fun fact n if n then else nfactn

val fact fn int int

fact

val it int

Notice that the compiler automatically detects recursive calls In earlier versions of

ML recursion had to b e explicitly indicated

Consider

39

fun f n int n

val f fn int int

fun f n if n then else nfn

val f fn int int

f

val it int

Equality typ es

Here f results in the evaluation of f In earlier versions of ML the rst

f would have b een used so that f would have evaluated to hence the

expression f would have evaluated to

An alternative style of dening functions in Standard ML that avoids enforced

recursion uses val and fn

40

fun f n int n

val f fn int int

val f fn n if n then else nfn

val f fn int int

f

val it int

Here the o ccurrence of f in nfn is interpreted as the previous version of f

The keyword rec after val can b e used to force a recursive interpretation

41

fun f n int n

val f fn int int

val rec f fn n if n then else nfn

val f fn int int

f

val it int

With val rec the o ccurrence of f in nfn is interpreted recursively

Equality typ es

Simple concrete values like integers b o oleans and strings are easy to test for

equality Values of simple datatyp es like pairs and records whose comp onents

have concrete typ es are also easy to test for equality For example v v is equal

1 2

0 0

0 0

There is thus a large class of typ es and v v to v v if and only if v v

1 2

whoses values can b e tested for equality However in general it is undecidable

to test the equality of functions It is thus not p ossible to overload to work

prop erly on all typ es In old versions of ML was interpreted on functions by

testing the equality of the addresses in memory of the datastructure representing

the functions If such a test yielded true then the functions were certainly equal

but many mathematically ie extensionally equal functions were dierent using

this interpretation of

In Standard ML those typ es whose values can b e tested for equality are called

equality typ es and are treated sp ecially Sp ecial typ e variables that are con

strained only to range over equality typ es are provided These have the form

whereas ordinary typ e variables have the form The builtin function has

typ e a a bool Starting from this the ML typ echecker can infer typ es

containing equality typ e variables

42

fun Eq x y x y

val Eq fn a a bool

fun EqualHd l l hd l hd l

val EqualHd fn a list a list bool

Trying to instantiate an equality typ e variable to a functional typ e results in an

error In SMLNJ

Chapter A Quick Overview of ML

43

hd hd

Error operator and operand dont agree equality type required

operator domain Z Z

operand Y list Y X list X

in expression

hdhd

EqualHd hd hd

Error operator and operand dont agree tycon mismatch

operator domain Z Z

operand Y list Y list bool

in expression

overloaded EqualHd

The use of equality typ es in Standard ML is considered controversial some p eople

think they are to o messy for the b enet they provide It is p ossible that future

versions of ML will drop equality typ es

Pattern matching

Functions can b e dened by pattern matching For example here is another deni

tion of the factorial function

44

fun fact

fact n n factn

val fact fn int int

Here is the Fib onacci function

45

fun fib

fib

fib n fibn fibn

val fib fn int int

Supp ose function f is dened by

46

fun f p e

1 1

f p e

2 2

f p e

n n

An expression f e is evaluated by successively matching the value of e with the

patterns p p p in that order until a match is found say with p Then the

n i

value of f e is the value of e During the match variables in the patterns may b e

i

b ound to comp onents of es value and then the variables have these values during

the evaluation of e For example evaluation fib causes to b e matched with

i

then b oth of which fail and then with n which succeeds binding n to The

result is then the value of fib fib which after some recursive calls

evaluates to

47

fib

val it int

Patterns can b e quite elab orate and are typically comp osed with constructors see

Section b elow

The patterns in a function denition need not b e exhaustive In SMLNJ

Pattern matching

48

fun foo

stdin Warning match nonexhaustive

val foo fn int int

In Edinburgh ML

49

fun foo

Warning Patterns in Match not exhaustive

val foo Fn int int

If a function is dened with a nonexhaustive match and then applied to an argu

ment whose value do esnt match any pattern a sp ecial kind of runtime error called

an exception results see Section

In SMLNJ

50

foo

val it int

foo

uncaught Match exception stdin

In Edinburgh ML

51

foo

Exception raised at top level

Warning optimisations enabled

some functions may be missing from the trace

Exception Match raised

Messages warning that a match is nonexhaustive will sometimes b e omitted from

the output shown here

The builtin listpro cessing functions hd and tl can b e dened by

52

fun hdxl x

Warning match nonexhaustive

val hd fn a list a

fun tlxl l

Warning match nonexhaustive

val tl fn a list a list

These denitions give exactly the same results as the builtin functions except on the

empty list where they dier in the exceptions raised exceptions are describ ed

in Section

if x is a variable and p a pattern then the pattern x as p is a pattern that matches

the same things as p but has the additional eect that when a match succeeds the

value matched is b ound to x Consider the function RemoveDuplicates

The wildcard matches anything

53

fun null true

null false

val null fn a list bool

Chapter A Quick Overview of ML

54

fun RemoveDuplicates

RemoveDuplicatesx x

RemoveDuplicatesxxl

if xx then RemoveDuplicatesxl

else xRemoveDuplicatesxl

val RemoveDuplicates fn a list a list

RemoveDuplicates

val it int list

The rep etition and extra list conses of xl can b e avoided as follows

55

fun RemoveDuplicates

RemoveDuplicatesl as x l

RemoveDuplicatesxl as x

if xx then RemoveDuplicates l else xRemoveDuplicates l

Incidently note that alas duplicate variables are not allowed in patterns

56

fun RemoveDuplicates

RemoveDuplicatesl as x l

RemoveDuplicatesxl as x RemoveDuplicates l

RemoveDuplicatesxl xRemoveDuplicates l

Error duplicate variable in patterns x

Anonymous functions fn expressions can b e dened by pattern matching using

the syntax fn p e p e

n n

57

fn none one two many

val it fn a list string

it ittrue it it

val it noneonetwomany string string string string

Patterns can b e constructed out of records with as a wildcard

58

fun IsMalesexmaleperson data true

IsMale false

val IsMale fn persondata bool

IsMale MikeData

val it true bool

An alternative denition is

59

fun IsMalesexxpersondata x male

A more compact form of this is allowed

60

fun IsMalesexpersondata sex male

The eld name sex doubles as a variable Think of a pattern    v   as abbre

viating   vv  

The case construct

The case construct p ermits one to compute by cases on an expression of a datatyp e

The expression case e of p e p e is an equivalent form for the

n n

application fn p e p e e

n n

Exceptions

Exceptions

Some standard functions raise exceptions at runtime on certain arguments When

this happ ens a sp ecial kind of value called an exception packet is propagated which

identies the cause of the exception These packets have names which usually reect

the function that raised the exception they may also contain values

61

hdtl

uncaught exception Hd

div

uncaught exception Div

div

uncaught exception Div

Exceptions must b e declared using the keyword exception they have typ e exn

Exceptions can b e explicitly raised by evaluating an expression of the form raise e

where e evaluates to an exception value Exceptions are printed slightly dierently

in SMLNJ and Edinburgh ML In SMLNJ

62

exception Exexception Ex

exception Ex

exception Ex

ExEx

val it ExEx exn list

raise hd it

uncaught exception Ex

In Edinburgh ML

63

exception Exexception Ex

type exn

con Ex exn

type exn

con Ex exn

ExEx

exn list

raise hd it

Exception raised at top level

Warning optimisations enabled

some functions may be missing from the trace

Exception Ex raised

An exception packet constructor called name and which constructs packets con

taining values of typ e ty is declared by exception name of ty

64

exception Ex of string

exception Ex

Ex

val it fn string exn

raise Ex foo

uncaught exception Ex

Chapter A Quick Overview of ML

The typ e exn is a datatyp e see Section b elow whose constructors are the

exceptions It is the only datatyp e that can b e dynamically extended All other

datatyp es have to have all their constructors declared at the time when the datatyp e

is declared

Because exn is a datatyp e exceptions can b e used in patterns like other constructors

This is useful for handling exceptions

An exception can b e trapp ed and its contents extracted using an exception han

dler An imp ortant sp ecial case is unconditional trapping of all exceptions The

value of the expression e handle e is that of e unless e raises an exception

in which case it is the value of e

65

hd handle

val it int

hd handle

val it int

hdtl handle

val it int

div handle

val it int

The function half dened b elow only succeeds ie do esnt raise an exception on

nonzero even numb ers on it raises Zero and on o dd numb ers it raises Odd

66

exception Zero exception Odd

exception Zero

exception Odd

fun half n

if n then raise Zero

else let

val m n div

in

if nm then m else raise Odd

end

val half fn int int

Some examples of using half

67

half

val it int

half

uncaught exception Zero

half

uncaught exception Odd

half handle

val it int

Failures may b e trapp ed selectively by matching the exception packet this is done

by replacing the wildcard by a pattern For example if e raises E x then the

value of e handle E x e E x e is the value of e if E x equals E x

n n i i

otherwise the handleexpression raises E x

Datatyp e declarations

68

half handle Zero

val it int

half handle Zero

uncaught exception Odd

half handle Zero Odd

val it int

half handle Zero Odd

val it int

Instead of having the two exceptions Zero and Odd one could have a single kind of

exception containing a string

69

exception Half of string

exception Half

fun half n

if n then raise Half Zero

else let

val m n div

in

if nm then m else raise Half Odd

end

val half fn int int

A disadvantage of this approach is that the kind of exception is not printed when

the exceptions are uncaught

70

half

uncaught exception Half

half

uncaught exception Half

half handle Half Zero Half Odd

val it int

half handle Half Zero Half Odd

val it int

Alternatively one can match the contents of the exception packet to a variable s

say and then branch on the value matched to s

71

half handle Half s if sZero then else

val it int

half handle Half s if sZero then else

val it int

Datatyp e declarations

New typ es rather than mere abbreviations can also b e dened Datatyp es are

typ es dened by a set of constructors which can b e used to create ob jects of that

typ e and also in patterns to decomp ose ob jects of that typ e For example to

dene a typ e card one could use the construct datatype

Chapter A Quick Overview of ML

72

datatype card king queen jack other of int

datatype card

con jack card

con king card

con other int card

con queen card

Such a declaration declares king queen jack and other as constructors and gives

them values The value of a ary constructor such as king is the constant value

king The value of a constructor such as other is a constructor function that given

an integer value n pro duces othern

73

king

val it king card

other

val it other card

To dene functions that take their argument from a concrete typ e fnexpressions

of the form fn p e p e can b e used Such an expression denotes

n n

a function that given a value v selects the rst pattern that matches v say p binds

i

the variables of p to the corresp onding comp onents of the value and then evaluates

i

the expression e For example the values of the dierent cards can b e dened in

i

the following way

74

val value fn king

queen

jack

other n n

val value fn card int

Alternatively and p erhaps more lucidly this could b e dened using a fun declara

tion

75

fun value king

value queen

value jack

value other n n

val value fn card int

The notion of datatyp e is very basic and could enable us to build MLs elementary

typ es from scratch For example the b o oleans could b e dened simply by

76

datatype bool true false

datatype bool

con false bool

con true bool

and the p ositive integers by

77

datatype int zero suc of int

datatype int

con suc int int

con zero int

In a similar way LISP Sexpressions could b e dened by

Abstract typ es

78

datatype sexp litatom of string

numatom of int

cons of sexp sexp

datatype sexp

con cons sexp sexp sexp

con litatom string sexp

con numatom int sexp

fun car consxy x and cdr consxy y

Warning match nonexhaustive

val car fn sexp sexp

Warning match nonexhaustive

val cdr fn sexp sexp

val a litatom Foo and a numatom

val a litatom Foo sexp

val a numatom sexp

carconsaa

val it litatom Foo sexp

cdrconsaa

val it numatom sexp

Notice the warning from the compiler that the patterns in the denitions of car and

cdr are not exhaustive these funtions are only partially sp ecied namely only

on lists built with cons ie nonatoms

79

car litatom foo

uncaught exception Match

Abstract typ es

New typ es can also b e dened by abstraction For example a typ e time could b e

dened as follows

80

exception BadTime

exception BadTime

abstype time time of int int

with

fun maketimehrsmins if hrs orelse hrs orelse

mins orelse mins

then raise BadTime

else timehrsmins

and hourstimett t

and minutestimett t

end

type time

val maketime fn int int time

val hours fn time int

val minutes fn time int

This denes an abstract typ e time and three primitive functions maketime hours

and minutes

In general an abstract typ e declaration has the form abstype d with b end where d

is a datatyp e sp ecication and b is a binding ie the kind of phrase that can follow

val Such a declaration intro duces a new typ e ty say as sp ecied by the datatyp e

declaration d However the constructors declared on ty by d are only available

within b The only bindings that result from executing the abstype declaration are

those sp ecied in b

Chapter A Quick Overview of ML

Thus an abstract typ e declaration simultaneously declares a new typ e together

with primitive functions for the typ e the representation datatyp e is not accessible

outside the withpart of the declaration

81

val t maketime

val t time

hours t minutes t

val it int int

Notice that values of an abstract typ e are printed as since their representation is

hidden from the user

Typ e constructors

Both list and are examples of typ e constructors list has one argument hence

a list whereas has two hence a b Typ e constructors may have various

predened op erations asso ciated with them for example list has null hd tl

etc Because of pattern matching it is not necessary to have any predened

op erations for One can dene for example fst and snd by

82

fun fstxy x and sndxy y

val fst fn a b a

val snd fn a b b

val p

val p int int

fst p

val it int

snd p

val it int

A typ e constructor set that represents sets by lists without rep etitions can b e

dened in the following way

83

abstype a set set of a list

with

val emptyset set

fun isemptyset s null s

fun member set false

memberx setyz xy orelse memberx set z

fun addx set setx

addx setyz if xy

then setyz

else let val set l addx set z in

setyl

end

end

val emptyset a list

val isempty fn a set bool

val member fn a a set bool

val add fn a a set a set

Note that the op eration add ensures that no rep etitions of elements o ccur in the list

representing the set Here is an example using these sets

References and assignment

84

val s addaddaddemptyset

val s int set

members

val it true bool

members

val it false bool

References and assignment

References are b oxes that can contain values The contents of such b oxes can

b e changed using the assignment op erator The typ e ty ref is p ossessed by

references containing values of typ e ty

References are created using the ref op erator This takes a value of typ e ty to a

value of typ e ty ref The expression xe changes the contents of the reference

that is the value of x to b e the value of e The value of this assignment expression

is the dummy value this is the unique value of the oneelement typ e unit

Assignments are executed for a side eect not for their value

The contents of a reference can b e extracted using the op erator error message

b elow from SMLNJ

85

x

stdin Error operator and operand dont agree tycon mismatch

operator domain Z ref Z

operand int int

in expression

x

val x ref and y ref

val x ref int ref

val y ref int ref

x

val it ref int ref

x

val it unit

x

val it ref int ref

x

val it int

References should only b e resorted to in exceptional circumstances as exp erience

shows that their use increases the probability of errors

Iteration

Here is an iterative denition of fact using two lo cal references count and result

4

There are some horrible subtleties asso ciated with the typ es of references which are ignored

here The treatment of references in ML is currently in a state of ux

Chapter A Quick Overview of ML

86

fun fact n

let val count ref n and result ref

in while count

do result count result

count count

result

end

val fact fn int int

fact

val it int

The semicolon denotes sequencing When an expression e e is evaluated each

n

e is evaluated in turn and the value of the entire expression is the value of e

i n

Evaluating while e do c consists in evaluating e and if the result is true c is eval

uated for its sideeect and then the whole pro cess rep eats If e evaluates to false

then the evaluation of while e do c terminates with value

Programming in the large

Sophisticated features for structuring collections of declarations programming in

the large are provided in Standard ML but not in earlier versions of ML These

are designed to supp ort the use of ML for large scale system building They account

for much of the complexity of the language

Standard ML of New Jersey is implemented in itself and makes extensive use of

these features Edinburgh ML do es not implement them

The concepts of structures signatures and functors which provide the structuring

constructs for programming in the large are not covered in this course hence their

absence from Edinburgh ML on Thor will not b e a problem

Chapter

Case study parsing

The lexical analysis and parsing programs describ ed here are intended to illustrate

functional programming metho ds and ML rather than parsing theory The style of

parsing presented is quite reasonable for small lightweight ad ho c parsers but would

b e inappropriate for large applications which should b e handled using heavyweight

parser generators like YACC

Lexical analysis

Lexical analysis converts sequences of characters into sequences of tokens also called

words or lexemes

For us a token will b e either a numb er sequence of digits an identier a sequence

of letters or digits starting with a letter or a sp ecial symb ol such as

or Sp ecial symb ols are sp ecied by a table see b elow

A numb er is a sequence of digits The ML inx op erator is overloaded and can

b e applied to strings If x and y are singlecharacter strings then x y just tests

whether the ASCI I co de of x is less then or equal to that of y Thus a single

character string representing a digit can b e characterised by the predicate IsDigit

87

fun IsDigit x x andalso x

val IsDigit fn string bool

A letter can similarly b e characterised by making use of the fact that the ASCI I

co des of all lower case letters are adjacent and also the co des of all upp er case latters

are adjacent

88

fun IsLetter x

a x andalso x z orelse A x andalso x Z

val IsLetter fn string bool

Token are separated by separators which will b e taken to b e spaces newlines and

tabs hence

89

fun IsSeparator x x orelse x n orelse x t

val IsSeparator fn string bool

Single characters that are not digits letters or separators will b e assumed to b e

sp ecial symb ols Multicharacter sp ecial symb ols eg are considered later

The input is assumed to b e supplied as a list of singlecharater strings Lexical

analysis consists on converting such a list to a list of tokens

Supp ose the input just consists of numb ers separated by separators A function

Tokenise that did lexical analysis for just this case would need to rep eatedly remove

1

There is an ML based version of YACC The parser for SMLNJ uses this

Chapter Case study parsing

digits until a nondigit eg a separator was reached and then implo de the removed

characters into a string representing a token and add that to the list of tokens

The function GetNumber takes a list l say of singlecharacter strings and returns a

pair consisting of i a string representing a numb er consisting of all the digits in

l up to the rst nondigit and ii the remainder of l after these digits have b een

removed It is convenient to dene GetNum using an auxiliary function GetNumAux

that has an extra argument buf for accumulating a reversed list of characters

making up the numb er

90

fun GetNumAux buf imploderev buf

GetNumAux buf l as xl

if IsDigit x then GetNumAux xbuf l

else imploderev bufl

val GetNumAux fn string list string list string string list

GetNumAux abc

val it cba string string list

Then GetNum is simply dened by

91

val GetNum GetNumAux

val GetNum fn string list string string list

GetNum

val it string string list

The denition of GetNumAux could have b een lo calised to GetNum using

local in end

Notice that if the list argument of GetNum do esnt start with a numb er then the

empty token implode will b e returned

92

GetNum a

val it a string string list

This problem will go away when we improve the co de later on

Identiers can b e lexically analysed by similar programs

93

fun GetIdentAux buf imploderev buf

GetIdentAux buf l as xl

if IsLetter x orelse IsDigit x

then GetIdentAux xbuf l

else imploderev bufl

val GetIdentAux fn string list string list string string list

GetIdentAux abc efg

val it cbaefg string string list

An identier must start with a letter so GetIdent is dened by

94

exception GetIdentErr

exception GetIdentErr

fun GetIdent xl

if IsLetter x then GetIdentAux x l else raise GetIdentErr

val GetIdent fn string list string string list

The lexical analysis of numb ers and identiers can b e streamlined and unied by

dening a single general function GetTail that takes a predicate as an argument

Lexical analysis

and then uses this to test whether to keep accumulating characters in buf or to

terminate Then GetNumAux corresp onds to GetTail IsDigit and GetIdentAux to

GetTail fn x IsLetter x orelse IsDigit x

The denition of GetTail is similar to that of GetNumAux and GetIdentAux

95

fun GetTail p buf imploderev buf

GetTail p buf l as xl

if p x then GetTail p xbuf l else imploderev bufl

val GetTail fn

stringbool string list string list string string list

Using GetTail a function to get the next token is easy to dene

96

fun GetNextToken x x

GetNextToken xl

if IsLetter x

then GetTail fn x IsLetter x orelse IsDigit x x l

else if IsDigit x

then GetTail IsDigit x l

else xl

val GetNextToken fn string list string string list

To lexically analyse a list of characters GetNextToken is rep eatedly called and sep

arators are discarded

97

fun Tokenise

Tokenise l as xl

if IsSeparator x

then Tokenise l

else let val tl GetNextToken l

in tTokenise l end

val Tokenise fn string list string list

Tokenise explode abcde a

val it abcde a string list

Tokenise do es not handle multicharacter sp ecial symb ols These will b e sp ecied by

a table represented as a list of pairs that shows which characters can follow each

initial segment of each sp ecial symb ol such a table represents a statetransition

function for an automaton For example supp ose the sp ecial symb ols are

and then the table would b e

98

This is not fully general b ecause if is a sp ecial symb ol then the representation

ab ove forces to b e also A fully general treatment of sp ecial symb ols is left as an

exercise

Some utility functions are needed Mem tests whether an element o ccurs in a list

99

fun Mem x false

Mem x xl xx orelse Mem x l

val Mem fn a a list bool

Mem

val it true bool

Mem

val it false bool

Chapter Case study parsing

Notice the equality typ es

Get lo oks up the list of p ossible successors of a given string in a sp ecialsymb ol table

100

fun Get x

Get x xlrest if xx then l else Get x rest

val Get fn a a b list list b list

Get

val it string list

Get

val it string list

The function GetSymbol takes a sp ecialsymb ol table and a token and then extends

the token by removing characters from the input until the table says that no further

extension is p ossible

101

fun GetSymbol spectab tok tok

GetSymbol spectab tok l as xl

if Mem x Get tok spectab then GetSymbol spectab tokx l

else tokl

val GetSymbol fn

string string list list

string string list string string list

The function GetNextToken can b e enhanced to handle sp ecial symb ols It needs to

take a sp ecialsymb ol table as an argument

102

fun GetNextToken spectab x x

GetNextToken spectab xl as xl

if IsLetter x

then GetTail fn x IsLetter x orelse IsDigit x x l

else if IsDigit x

then GetTail IsDigit x l

else if Mem x Get x spectab

then GetSymbol spectab implodexx l

else xl

val GetNextToken fn

string string list list string list string string list

Now Tokenise can b e enhanced to use the new GetNextToken

103

fun Tokenise spectab

Tokenise spectab l as xl

if IsSeparator x

then Tokenise spectab l

else let val tl GetNextToken spectab l

in tTokenise spectab l end

val GetNextToken fn

string string list list string list string string list

Here is a particular table

104

val SpecTab

val SpecTab

string string list list

Tokenise SpecTab explode ab c dffgg

val it abcd f f g g string list

Simple sp ecial cases of parsing

In the next section the lexical analyset Lex will b e used

105

val Lex Tokenise SpecTab o explode

val Lex fn string string list

Lex ab c dffgg

val it abcd f f g g string list

Simple sp ecial cases of parsing

Before giving a complete parser some sp ecial cases are considered

Applicative expressions

Examples of applicative expressions are x f x f x y ff x fg xh x etc

Parse trees for such expression can b e represented by the recursive datatyp e tree

106

datatype tree Atom of string Comb of tree tree

datatype tree

con Atom string tree

con Comb tree tree tree

Right asso ciative without brackets

Supp ose for the moment i the input is supplied as a list of atoms ii brackets are

ignored and ii application is taken to b e right asso ciative Then a simple parser

is

107

fun Parse next Atom next

Parse nextrest CombAtom next Parse rest

Warning match nonexhaustive

val Parse fn string list tree

Parsef x y z

val it Comb Atom fComb Atom xComb Atom yAtom z tree

Left asso ciative without brackets

The usual convention is for application to b e left asso ciative A parser for this is

only slightly more complex

To parse f x y z the following intermediate parsings need to b e done

f is parsed to Atom f

f x is parsed to CombAtom f Atom x

f x y is parsed to CombCombAtom f Atom x Atom y

Intermediate parse trees will b e passed forward via a variable t of an auxiliary

function Parser

Chapter Case study parsing

108

fun Parser t t

Parser t nextrest Parser Combt Atom next rest

val Parser fn tree string list tree

fun Parse next Atom next

Parse nextrest Parser Atom next rest

Warning match nonexhaustive

val Parse fn string list tree

Parsef x y z

val it Comb Comb Comb Atom fAtom xAtom yAtom z tree

Right asso ciative with brackets

Brackets will now b e considered To parse e the parser must b e called

recursively inside the brackets to parse e and then the presence of the clos

ing bracket must b e checked Such a recursive call needs to return the parse

tree for e and the rest of the input list Thus the typ e of Parse changes to

string list tree string list

If the parser encounters an unexp ected closing bracket then it returns the parse tree

so far and the rest of the input For example xyz should parse to

CombAtom x Atom y z

However there may b e no parse tree so far and to handle this case it is convenient

to add an empty tree Nil to the typ e tree

109

datatype tree Nil Atom of string Comb of tree tree

datatype tree

con Atom string tree

con Comb tree tree tree

con Nil tree

Right asso ciative function application is considered rst A rst attempt is

110

exception MissingClosingBracket

exception MissingClosingBracket

fun Parse Nil

Parse rest as Nilrest

Parse rest

case Parse rest

of t rest let val trest Parse rest

in Combtt rest end

raise MissingClosingBracket

Parse nextrest let val trest Parse rest

in CombAtom nexttrest end

val Parse fn string list tree string list

This do esnt quite work

111

Parse x

val it Comb Atom xNil tree string list

Parse xyz

val it Comb Atom xComb Atom yComb Atom zNil

Parse xyz

val it

Comb Atom xComb Atom yNilz tree string list

The empty parse tree Nil returned when Parse exits needs to b e removed This is

easily done by replacing Comb by MkComb where

Simple sp ecial cases of parsing

112

fun MkCombtNil t

MkComb p Comb p

Then Parse is redened

113

fun Parse Nil

Parse rest as Nilrest

Parse rest

case Parse rest

of t rest let val trest Parse rest

in MkCombtt rest end

raise MissingClosingBracket

Parse nextrest let val trest Parse rest

in MkCombAtom nexttrest end

val Parse fn string list tree string list

Parse now works on wellformed expressions

114

Parse x

val it Atom x tree string list

Parse xyz

val it

Comb Atom xComb Atom yAtom z tree string list

Parse xyz

val it Comb Atom xAtom yz tree string list

However the empty parse tree Nil can still b e generated but only in pathological

situations

115

Parse

val it Nil tree string list

Parse a

val it Comb NilAtom a tree string list

Parse

val it Nil tree string list

Parse x

val it Nilx tree string list

This might b e acceptable but probably it is b etter to distinguish the rst three

examples from the last In the mo died version of Parse that follows parses to

the empty atom Atom where is the empty string

Here is the revised denition of Parse

116

fun Parse Nil

Parse rest let val trest Parse rest

in MkCombAtom trest end

Parse rest as Nilrest

Parse rest

case Parse rest

of t rest let val trest Parse rest

in MkCombtt rest end

raise MissingClosingBracket

Parse nextrest let val trest Parse rest

in MkCombAtom nexttrest end

val Parse fn string list tree string list

The pathological examples now b ecome

Chapter Case study parsing

117

Parse

val it Atom tree string list

Parse a

val it Comb Atom Atom a tree string list

Parse

val it Comb Atom Atom tree string list

The second fourth and fth clauses of this latest denition of Parse contain some

rep etition This can b e mitigated by dening an auxiliary function for building a

combination BuildComb parse t inp builds a combination whose op erator is t and

whose op erand is got by calling the supplied parser function parse on the supplied

input inp The combination and the remainder of the input are returned

118

fun BuildComb parse t inp

let val t rest parse inp

in MkCombtt rest end

val BuildComb fn a tree b tree a tree b

fun Parse Nil

Parse rest BuildComb Parse Atom rest

Parse rest as Nilrest

Parse rest

case Parse rest

of t rest BuildComb Parse t rest

raise MissingClosingBracket

Parse nextrest BuildComb Parse Atom next rest

Notice how a mutual recursion is set up b etween Parse and BuildComb via the func

tion argument parse This technique will b e exploited again

The messy fourth clause of Parse could b e simplied with another auxiliary function

that called a supplied parser and then checked for a given input symb ol

119

fun CheckSym parse sym exn inp

case Parse inp

of t nextrest if nextsym then BuildComb parse t rest

else raise exn

raise exn

val CheckSym fn

string list tree a

string exn string list tree a

fun Parse Nil

Parse rest BuildComb Parse Atom rest

Parse rest as Nilrest

Parse rest CheckSym Parse MissingClosingBracket rest

Parse nextrest BuildComb Parse Atom next rest

Whether or not this is an improvement is a matter of taste

Left asso ciative with brackets

The technique used ab ove passing intermediate parse trees as arguments to an

auxiliary function will b e used to parse leftasso ciating applicative expressions

with brackets Here is a rst attempt

Simple sp ecial cases of parsing

120

fun Parser t t

Parser t rest Parser CombAtom t rest

Parser t inp as t inp

Parser t next CombtAtom next

Parser t rest

case Parser Nil rest

of t rest Parser Combtt rest

raise MissingClosingBracket

Parser t nextrest Parser CombtAtom next rest

val Parse Parser Nil

This has a familiar problem with Nil

121

Parse x

val it Comb NilAtom x tree string list

Parse xyzw

val it

Comb

Comb

Comb NilAtom x

Comb Comb NilAtom yAtom zAtom w

tree string list

The solution is to use MkComb instead of Comb where

122

fun MkCombNilt t

MkComb p Comb p

fun Parser t t

Parser t rest Parser CombAtom t rest

Parser t inp as t inp

Parser t next MkCombtAtom next

Parser t rest

case Parser Nil rest

of t rest Parser MkCombtt rest

raise MissingClosingBracket

Parser t nextrest Parser MkCombtAtom next rest

val Parse Parser Nil

Then

123

Parse x

val it Atom x tree string list

Parse xyzw

val it Comb Comb Atom xComb Atom yAtom zAtom w

tree string list

Precedence parsing of inxes

The parsing of expressions like x y z will now b e considered Parse trees are

represented by

124

datatype tree Nil

Atom of string

BinOp of string tree tree

Binary op erators are assumed to have a precedence given by a table represented as

a list of pairs

Chapter Case study parsing

125

val BinopTable

The function Lookup gets the precedence of an op erator from such a table

126

fun Lookup sntab x if xs then n else Lookup tab x

Warning match nonexhaustive

val Lookup fn a b list a b

Note that the use of forces an equality typ e A string is an op erator if it is assigned

a precedence by the precedence table

127

fun InTable x false

InTable sntab x xs orelse InTable tab x

val InTable fn a b list a bool

The parser function Parser that follows is quite tricky but uses many of the princi

ples that have already b een illustrated The main new ingredient are precedences

Assume ie the ASCI I version of has higher precedence than Consider the

parsing of xyz versus the parsing of xyz

For xyz the parser must pro ceed by

A rst building BinOp Atom x Atom y

A then building Atom z

A nally building Binop BinOp Atom x Atom y Atom z

For xyz the parser must pro ceed by

B rst building Atom x

B then building BinOp Atom y Atom z

A nally building Binop Atom x BinOp Atom y Atom z

In b oth cases the left hand argument to the op erator must b e held whilst the right

hand argument is parsed This will b e done by giving Parser an extra parameter

the parsetree of the alreadyparsed argument or Nil This is the same idea

already used to parse left asso ciative applicative expressions in that case the extra

parameter holding the alreadyparsed rator The name of the extra parameter will

b e t

Precedences are used to decide b etween AA and BB During the parsing

there is a current precedence of the parse For example if the data in BinopTable

ab ove is used then when parsing the second argument of the precedence will b e

and when parsing the second argument of the precedence will b e

If the current precedence is m and a binary op erator op say is encountered whose

precedence is n then

A if mn the expression just parsed is the second argument of the op erator that

pro ceeds it case AA

B if not mn then the expression just parsed is the rst argument of the already

encountered op erator op case BB

Simple sp ecial cases of parsing

The expression just parsed will b e the parse tree b ound to the parse tree parameter

t of Parser So in case A ab ove this parameter should b e returned immediately

In case B the parser should b e called recursively to get the parse tree t say of

the second argument of op and then BinOpoptt returned

With this explanation I hop e the ML co de to achieve this is comprehensible

128

fun Parser tab m t t

Parser tab m t inp as nextrest

if InTable tab next

then let val n Lookup tab next

in if mint n

then t inp

else let val trest Parser tab n Nil rest

in Parser tab m BinOpnextt t rest end

end

else Parser tab m Atom next rest

val Parser fn

string int list

int tree string list tree string list

val Parse Parser BinopTable Nil

val Parse fn string list tree string list

Here are some examples

129

Parse xyz

val it BinOp BinOp Atom xAtom yAtom z

Parse xyz

val it BinOp Atom xBinOp Atom yAtom z

Parse xyz

val it BinOp Atom xBinOp Atom yAtom z

tree string list

The last of these examples shows that binary op erators are parsed as right asso

ciative This is b ecause is used to compare the current precedence with that of

an encountered op erator Since mm is always false the eect is as though the the

op erator on the left is not of higher precedence than the one on the right hence

right asso ciativity If is changed to then op erators will parse as left asso ciative

The general case where some op erators are left asso ciative and some right asso cia

tive can b e handled by giving op erators b oth left and right precedences This is

considered in Section

A prop erty of the ab ove parser is that if next is not a known op erator ie is not

in tab then the last parse tree parsed viz t is thrown away

130

Parse xyz

val it Atom z tree string list

Instead of doing this juxtap osed expressions without any intervening binary op

erators can b e interpreted as function applications First parse trees have to b e

up dated to p ermit this

131

datatype tree Nil

Atom of string

Comb of tree tree

BinOp of string tree tree

Then the last line of Parser is mo died

Chapter Case study parsing

132

fun Parser tab m t t

Parser tab m t inp as nextrest

if InTable tab next

then let val n Lookup tab next

in if mint n

then t inp

else let val trest Parser tab n Nil rest

in Parser tab m BinOpnextt t rest end

end

else Parser tab m Combt Atom next rest

val Parser fn

string int list

int tree string list tree string list

val Parse Parser BinopTable Nil

val Parse fn string list tree string list

This almost works

133

Parse xyz

val it Comb Comb Comb NilAtom xAtom yAtom z

The usual trick of using MkComb instead of Comb is needed

134

fun MkCombNilt t

MkComb p Comb p

fun Parser tab m t t

Parser tab m t inp as nextrest

if InTable tab next

then let val n Lookup tab next

in if mint n

then t inp

else let val trest Parser tab n Nil rest

in Parser tab m BinOpnextt t rest end

end

else Parser tab m MkCombt Atom next rest

val Parse Parser BinopTable Nil

This works

135

Parse xyz

val it Comb Comb Atom xAtom yAtom z

Parse xyz

val it BinOp Comb Atom xAtom yAtom z

tree string list

Notice that function application binds tighter than binary op erators which is nor

mally what is wanted though achieved here in a rather ad ho c and accidental

manner Two things that Parse do esnt handle are unary op erators and brackets

The datatyp e of parse trees needs to b e adjusted to handle unary op erators

136

datatype tree Nil

Atom of string

Comb of tree tree

Unop of string tree

BinOp of string tree tree

fun MkCombNilt t

MkComb p Comb p

Simple sp ecial cases of parsing

Unary op erators need a precedence If has higher precedence than then xy

parses as BinOp Unop Atom x Atom y If has lower precedence

than then xy parses as Unop BinOp Atom x Atom y

The precendences of unary op erators will b e held in a table

137

val UnopTable

Parsing is now straightforward b oth the unary and binory op erator tables need to

b e passed to Parser and an extra clauses is added to test for unary op erators

138

fun Parser tab as utabbtab m t t

Parser tab as utabbtab m t inp as nextrest

if InTable utab next

then let val n Lookup utab next

in let val trest Parser tab n Nil rest

in Parser tab m MkCombt Unopnextt rest end

end

else if InTable btab next

then let val n Lookup btab next

in if mint n

then t inp

else let val trest Parser tab n Nil rest

in Parser tab m BinOpnextt t rest end

end

else Parser tab m MkCombt Atom next rest

val Parse Parser UnopTableBinopTable Nil

Here are some examples

139

Parse x

val it Atom x tree string list

Parse x

val it Unop Atom x tree string list

Parse xy

val it BinOp Unop Atom xAtom y

Parse xy

val it Unop BinOp Atom xAtom y

Brackets can b e handled in the same way as they were for left asso ciated applica

tive expressions

Chapter Case study parsing

140

fun Parser tab as utabbtab m t t

Parser tab m t rest Parser tab m CombAtom t rest

Parser tab m t inp as t inp

Parser tab m t rest

case Parser tab Nil rest

of t rest Parser tab m MkCombtt rest

raise MissingClosingBracket

Parser tab as utabbtab m t inp as nextrest

if InTable utab next

then let val n Lookup utab next

in let val trest Parser tab n Nil rest

in Parser tab m MkCombt Unopnextt rest end

end

else if InTable btab next

then let val n Lookup btab next

in if mint n

then t inp

else let val trest Parser tab n Nil rest

in Parser tab m BinOpnextt t rest end

end

else Parser tab m MkCombt Atom next rest

val Parse Parser UnopTableBinopTable Nil

Here are some more examples

141

fun P s Parseexplode s

val P fn string tree string list

P xy

val it Unop BinOp Atom xAtom y

P xy

val it BinOp Unop Atom xAtom y

P xyzw

val it

Comb BinOp Atom xAtom yComb Atom zAtom w

A general topdown precedence parser

The parser just given works by lo oking at the next item b eing input and then invokes

some action which dep ends on the item to parse the rest of the input A more

general scheme is to asso ciate actions with items and then to have a simple parsing

lo op that consists in rep eatedly reading an item and then executing the asso ciated

action

A rather general datatyp e of parse trees is the following

142

datatype tree Nil

Atom of string

Comb of tree tree

Node of string tree list

Unary op erator expressions will parse to trees of the form Nodenamear g and

binary op erator expressions to trees of the form Nodenamear g ar g

1 2

The familiar MkComb hack will b e needed The empty parse tree Nil should never

arise as the right comp onent of a combination since left asso ciative application will

b e adopted so an exception will b e raised if it do es

2

The parser describ ed here is lo osely based on Vaughan Pratts CGOL system MIT

A general topdown precedence parser

143

exception NilRightArg

fun MkCombNilt t

MkCombtNil raise NilRightArg

MkCombt t Combtt

The action asso ciated with an item may involve recursive calls to the parser To

handle this the techique describ ed earlier of passing a parse function as an argument

can b e used see BuildComb and CheckSym describ ed ab ove The typ e of parse

functions is given by the the following typ e abbreviation

144

type parser int tree string list tree string list

Selected input items will have precedences and actions asso ciated with them Prece

dences are integers Intuitively actions are represented by a functions of typ e

parser However since an action might need to recursively invoke the whole parser

it should b e passed a parsing function In general an action must b e represented

by a function of typ e parserparser A symb ol table asso ciates precedences and

actions to strings

145

type symtab string int parser parser

The main parsing function is now very simple since all the detail has b een hivedo

into the symb ol table

146

fun Parser symtab mint t t

Parser symtab m t inp as nextrest

let val nparsefn symtab next

in if mn then tinp

else parsefn Parser symtab m t inp

end

The parse stops on the empty string If the input isnt empty then the next item

is lo oked up in the symb ol table Left asso ciation will b e taken as the default so if

the current precedence equals or exceeds the precedence of the next item then the

parse stops and the last item parsed t is returned with the rest of the input If

the current precedence is less than the precedence of the next item then the parse

action asso ciated with the next item in the symb ol table is executed The parse

function Parser symtab is passed to the parse action so that it can if necessary

invoke the whole parse recursively

The denition of Parser intuitively has typ e symtabparser However the actual

typ e assigned by ML is more general

a

int

int b a list b a list

int b a list b a list

int b a list b a list

To constrain the typ es so that typ echecking yields the intuitive typ e requires some

contortions The following do es it

Chapter Case study parsing

147

fun Parser tabsymtab parser

fn m

fn t

fn t

inp as nextrest

let val nparsefn tab next

in if mn then tinp

else parsefn Parser tab m t inp

end

val Parser fn symtab parser

Standard ML seems rather worse than its predecessors in the exibility it allows for

writing typ e constraints

Notice that every input item is supp osed to have an entry in the symb ol table The

kind of items that might b e encountered include atoms unary op erators binary

op erators brackets b oth op ening and closing and keywords asso ciated with other

kinds of constructs eg if then else while

Generic functions to construct appropriate symb ol table entries for these will now

b e describ ed

The parser is initially invoked with a sp ecic symb ol table precedence and t set

to Nil

The action asso ciated with an atom a say is just to return Atom a Since the atom

may b e the argument of some preceding function whose parse tree will b e b ound

to t the parse tree that is actually returned by the parse action of an atom is

MkCombtAtom a

148

fun ParseAtom parse p t nextrest

parse p MkCombtAtom next rest

The action asso ciated with an op ening bracket is to recursively call the parser check

that there is a matching closing bracket remove it and then continue the parse

The function ParseBracket b elow is the action invoked by an op ening bracket It

takes as a parameter the closing bracket it should check for The parse tree t is

combined using MkComb with the parse tree t of the stu parsed inside of the

brackets If t is Nil then the denition of MkComb ensures that t b ecomes the new

lastthingparsed b ound to t in the rest of the parse However if t is not Nil then

what is b eing parsed must have the form e e where t is the parse tree of e so

1 2 1

a combination is generated namely the parse tree of e applied to the parse tree

1

of e

2

149

exception MissingClosingBracket

fun ParseBracket close parse p t rest

let val t nextrest parse Nil rest

in if closenext then parse p MkCombtt rest

else raise MissingClosingBracket

end

One can ensure that the parsing initiated by an op ening bracket will terminate at

a closing bracket by giving the closing bracket a suciently low precedence in the

symb ol table eg Closing brackets should always terminate the current parse

so it is an error to try to execute the parse acton asso ciated with them in the symb ol

table the typ e of the symb ol table is such that all items have some action in the

case of closing brackets this should never actually b e invoked

150

exception TerminatorParseErr

fun Terminator parse raise TerminatorParseErr

A general topdown precedence parser

The next function provides a rather general way of sp ecifying parser actions The

idea is that to parse a given kind of construct the parser is called recursively to get

each constituent and then a no de containing the resulting constituent parse trees is

returned Each recursive invokation of the parser might require some lo cal checking

for keywords etc For example to parse if e then e else e the parser is called

1 2

to get the parse tree of the e then the presence of then is checked then must b e

a terminator and it is removed then the parser is invoked to get the parse tree

for e then the presence of else is checked else must b e a terminator and it it

1

removed then the parser is invoked again to get the parse tree of e and nally a

2

no de like NodeCOND ttt is returned

The function ParseSeq b elow takes a constructor function mktree for building a

no de invokes the parser a numb er of times and then builds a parse tree by applying

mktree to the resulting constituent parse trees

Each invokation of the parser can b e wrapp ed around with some extra checking

activity This is sp ecied by providing a list of functions of typ e parserparser

applying such a function to a parser pro duces a new parser with the checking added

on The simplest case of this is no checking which is sp ecied by the identity

function

151

fun Id x x

ChkAft is used to mo dify a parser to check that a given keyword o ccurs after the

parser is invoked If pparser is a parser function then ChkAft s p is a parser

function that rst invokes p then checks for s and deletes it if found and raises an

exception otherwise

152

exception ChkAftErr

fun ChkAft s parse m t inp

case parse m t inp

of t srest if ss then trest else raise ChkAftErr

The function ParseSeq b elow takes a parse tree constructor function mktree of typ e

tree tree list tree for building a no de The rst parse tree is the one passed

as a parameter t to the parser and the list of parse trees are the constituents that

have just b een parsed

Constructors for building parse trees of unary op erator expressions and binary op

erator expressions are MkUnop and MkBinop resp ectively

Supp ose u is a unary op erator and consider e u e this should parse to

1 2

Combe Unopu e where e and e are the parse trees of e and e resp ec

1 2 1 2 1 2

tively The parse tree constructor for unary op erators is thus

153

fun MkUnop unop ttl MkCombtNodeunoptl

Supp ose b is a binary op erator and consider e b e this should parse to

1 2

Binopb e e where e and e are the parse trees of e and e resp ectively

1 2 1 2 1 2

The parse tree constructor for binary op erators is thus

154

fun MkBinop bnop ttl Nodebnopttl

The function ParseSeq also takes as a parameter a list of parser transformations

eg Id or ChkAft f and returns a parser that recursively invokes the parser once

for each parser transformation and then builds a parse tree using mktree applied

Chapter Case study parsing

to the resulting constituent parse trees For example the parsing of conditionals is

sp ecied by

ChkAft then ChkAft else Id

ParseSeq uses an auxiliary function ParseSeqAux that iterates down the list of sup

plied parse tree transformers invoking them in turn The denitions are short but

admittedly cryptic To try to improve their readability typ e constraints have b een

added to constrain excess p olymophism Without the constraints ParseSeq gets the

incomprehensibly general typ e

tree a list b

int

int b c list d

int tree c list a c list list

int b c list d int tree c list d

with the constraints the typ e is

tree tree list tree

int parser parser list parser parser

Unfortunately as with Parser it is necesary to go to some contortions to achieve

this typ e constraint Instead of writing

ParseSeq mktree m fl parse n t rest

It is necessary to write

fun ParseSeq mktree m fl parse

fn n fn t fn rest

and then add the typ e constraints shown b elow

155

fun ParseSeqAux m fparserparser parseparser n inp

let val t rest f parse m Nil inp

in t rest end

ParseSeqAux m ffl parserparserlist parse n inp

let val t rest f parse Nil inp

in let val lrest ParseSeqAux m fl parse rest

in tl rest end

end

fun ParseSeq mktree m flparserparserlist parseparser parser

fn n fn t fn rest

let val lrest ParseSeqAux m fl parse n rest

in parse n mktreetl rest end

A symb ol table is a function of typ e string int parser parser Here is

an example

156

fun SymTab ParseSeq MkBinop MULT Id

SymTab ParseSeq MkBinop ADD Id

SymTab ParseSeq MkUnop MINUS Id

SymTab if ParseSeq MkUnop COND ChkAft then

ChkAft else

Id

SymTab ParseBracket

SymTab Terminator

SymTab then Terminator

SymTab else Terminator

SymTab x ParseAtom

A general topdown precedence parser

Notice that the left precedence of is which is the same as its right precedence

However the left precedence of is which is greater than its right precedence

The eect of this is to make left asso ciative and right asso ciative In gen

eral if the left precedence is less than or equal to the right precedence then left

asso ciativity results otherwise right asso ciativity results

The complete parser P dened b elow uses the lexical analyser Lex and the symb ol

table ab ove Assume the co de for Lex as describ ed in Section is in the le

Lexml

157

use Lexml

val P Parser SymTab Nil o Lex

val P fn string tree string list

P f if x then y z else y z

val it

Comb

Atom f

Node

COND

Atom xNode ADDAtom yAtom z

Node MULTAtom yAtom z

Chapter Case study parsing

Chapter

Case study the calculus

It is assumed that integers and unary and binary op erations over integers are

primitive The typ e atom packages these up into a single datatyp e Both unary

op erator atoms Op and binary op erator atoms Op have a name and a semantics

158

datatype atom Num of int

Op of string intint

Op of string intintint

The application of an atomic op eration to a value is dened by the function ConApply

see b elow The application of a binary op erator b to m results in a unary op erator

named mb exp ecting the other argument

To convert the argument m to a string that can b e concatenated with the name of the

op erator a function to convert a numb er to a string giving its decimal representation

is dened

159

fun StringOfNum

StringOfNum

StringOfNum

StringOfNum

StringOfNum

StringOfNum

StringOfNum

StringOfNum

StringOfNum

StringOfNum

StringOfNum n

StringOfNumn div StringOfNumn mod

StringOfNum

val it string

Now ConApply can b e dened

160

fun ConApplyOpf Num m Numf m

ConApplyOpxf Num m OpStringOfNum mx fn n fmn

val ConApply fn atom atom atom

ConApplyOpop Num

val it Op fn atom

ConApplyit Num

val it Num atom

expressions are represented by the datatyp e lam

161

datatype lam Var of string

Con of atom

App of lam lam

Abs of string lam

Chapter Case study the calculus

A calculus parser

It is convenient to have a calculus parser Assume the co de of the parser describ ed

in Section is in the le Parserml

162

use Parserml

A sutable symb ol table for the calculus is

163

fun LamSymTab ParseSeq MkBinop MULT Id

LamSymTab ParseSeq MkBinop ADD Id

LamSymTab Terminator

LamSymTab ParseSeqMkUnop Abs ChkAft Id

LamSymTab ParseBracket

LamSymTab Terminator

LamSymTab x ParseAtom

Note that is our ASCI I representation of This is actually just a single

backslash the rst one is the escap e character needed to include the second one in

the string

The following function lexically analyses and then parses a string recall that Parser

returns a parse tree and the remaining input

164

fun ParseLam s

let val t Parser LamSymTab Nil Lex s

in t end

Warning binding not exhaustive

val ParseLam fn string tree

ParseLam xx

val it

Comb Node AbsAtom xNode ADDAtom xAtom

Atom

tree

The output of ParseLam is an element of the general purse parse tree typ e tree

dened on page This is easily converted to typ e lam A function for testing

whether a string represents a numb er ie is a string of digits is needed

165

fun IsNumber s

let fun TestDigList true

TestDigList xl IsDigit x andalso TestDigList l

in TestDigListexplode s

end

val IsNumber fn string bool

If a string represents a numb er then the following provides a way of converting it

to a numb er ie value of typ e int

A calculus parser

166

fun DigitVal

DigitVal

DigitVal

DigitVal

DigitVal

DigitVal

DigitVal

DigitVal

DigitVal

DigitVal

Warning match nonexhaustive

val DigitVal fn string int

fun NumOfString s

let fun ListVal

ListVal xl DigitVal x ListVal l

in ListValrevexplode s

end

val NumOfString fn string int

NumOfString

val it int

Armed with this stringtonumb er converter it is routine to convert values of typ e

tree to values of typ e lam The fourth clause of the denition of Convert b elow is

a little hack to make x x xn e parse as xx xn e This hack

makes use of the fact that sequences of variables parse as leftasso ciated applications

167

ParseLam x y z w

val it Node AbsComb Comb Atom xAtom yAtom zAtom w

168

fun Convert Atom x

if IsNumber x then ConNumNumOfString x else Var x

Convert Combab

AppConvert a Convert b

Convert NodeAbsAtom x a

AbsxConvert a

Convert NodeAbsComba Atom x a

ConvertNodeAbsa NodeAbsAtom xa

Convert NodeADDab

AppAppConOpop Convert a Convert b

Warning match nonexhaustive

val Convert fn tree lam

The function PL for Parse Lamb da expression parses a string and then converts

it to a value of typ e lam

169

val PL Convert o ParseLam

val PL fn string lam

PL xy

val it App App Con Op fnVar xVar y lam

PL xxy y

val it

App Abs xApp App Con Op fnVar xVar y

Var y

lam

Here is the xedp oint op erator Y see Section

Chapter Case study the calculus

170

PL f x fzx x f x fzx x f

val it

Abs

f

App

Abs xAbs fAbs zApp App Var xVar xVar f

Abs xAbs fAbs zApp App Var xVar xVar f

lam

An unparser or prettyprinter will b e useful for viewing elements of typ e lam

The one that follows UPL is rather crude for example it do es not attempt to

format expressions across lines though it do es at leat avoid putting brackets around

variables

The name UPL stands for UnParse Lamb da expression and BUPL for Bracket and

UnParse Lamb da expression

171

fun UPL Var x x

UPL ConNum n StringOfNum n

UPL ConOpx x

UPL ConOpx x

UPL AppConOpxe x BUPL e

UPL AppAppConOpxee BUPL e x BUPL e

UPL Appee UPL e BUPL e

UPL Absxe x UPL e

and BUPLVar x x

BUPLConNum n StringOfNum n

BUPL e UPL e

Implementing substitution

Recall the denition of substitution on page

0

E E E V

0

V E

0 0 0

V where V V V

0 0

E E E E V E E V

V E V E

0 0 0 0

V E where V V and V E E V

0 0

V is not free in E

0 0 00 00 0 0

V E where V V and V E V V E V

0 0 00

V is free in E where V is a variable

0

not free in E or E

This is easily implemented in ML Some auxiliary settheoretic functions on lists

are needed some of which have b een met b efore First a test for memb ership

172

fun Mem x false

Mem x xs xx orelse Mem x s

val Mem fn a a list bool

Implementing substitution

Note that the union of two lists dened b elow do es not intro duce duplicates

173

fun Union l l

Union xl l

if Mem x l then Union l l else xUnion l l

val Union fn a list a list a list

Union

val it int list

Subtract l l removes all memb ers of l from l ie is l minus l

174

fun Subtract l

Subtract xl l

if Mem x l then Subtract l l else xSubtract l l

val Subtract fn a list a list a list

Subtract

val it int list

Using Mem Union and Subtract the function Frees to compute a list of the free

variables in a expression is easily dened

175

fun Frees Var x x

Frees Con c

Frees Appee Union Frees e Frees e

Frees Absxe Subtract Frees e x

val Frees fn lam string list

PL xxy

val it Abs xApp App Con Op fnVar xVar y lam

Frees it

val it y string list

Substitution needs to rename variables to avoid capture This will b e done by

priming them

176

fun Prime x x

Prime x

val it x string

Variant xl x primes x sucient numb er of times so that the result do es not o ccur

in the list xl

177

fun Variant xl x

if Mem x xl then Variant xl Prime x else x

val Variant fn string list string string

Variant xyzyw y

val it y string

Now at last substitution can b e dened Subst E E V computes EEV according

to the table ab ove

Chapter Case study the calculus

178

fun Subst e as Var x e x if xx then e else e

Subst e as Con c e x e

Subst Appe e e x AppSubst e e x Subst e e x

Subst e as Absxe e x

if xx then e

else if Mem x Frees e

then let val x Variant Frees e Frees e x

in Absx SubstSubst e Var x x e x

end

else Absx Subst e e x

val Subst fn lam lam string lam

Here are some examples

179

Subst PLxxy x PL x

val it

App Abs xApp App Con Op fnVar xVar y

Con Num

lam

UPL it

val it x xy string

UPLSubst PLxxy PLx y

val it x xx string

A function EvalN can now b e dened to do normal order reduction sometimes called

callbyname Note that the evaluation do es not go inside b o dies so do es

not compute normal forms

180

fun EvalN e as Var e

EvalN e as Con e

EvalN Absxe Absx e

EvalN AppCon a Con a ConConApplyaa

EvalN Appee

case EvalN e

of Absxe EvalNSubst e e x

e as Con a case EvalN e

of Con a ConConApplyaa

e Appee

e Appe EvalN e

val EvalN fn lam lam

Here is a typical example that only terminates with normal order evaluation

181

EvalN PLx x x x x x x

val it Con Num lam

val true

x y zw

The SECD machine

Callbyvalue evaluation can b e programmed with the function EvalV

The SECD machine

182

fun EvalV e as Var e

EvalV e as Con e

EvalV e as Abs e

EvalV Appee

let val e EvalV e

in

case EvalV e

of Absxe EvalVSubst e e x

e as Con a case e

of Con a ConConApplyaa

Appee

e Appee

end

The SECD machine is a classical virtual machine for reducing expressions using

callbyvalue It was develop ed in the s by Peter Landin and has b een analysed

by Gordon Plotkin Various more recent practical vitual machines for ML are

descendents of the SECD machine

The name SECD comes from Stack Environment Control and Dump which are

the four comp onents of the machine state

The stack of an SECD machine holds a sequence represented by a list of atoms and

closures The environment provides values of variables A closure is an abstraction

paired with an environment The mutually recursive datatyp es of item and env

represent items and environments resp ectively Environments are represented by

asso ciation lists of variables represented by strings and and items

183

datatype item Atomic of atom

Closure of lam env

and env EmptyEnv

Env of string item env

The function Lookup lo oks up the value of a variable in an environment and raises

an exception is the variable do esnt have a variable

184

exception LookupErr

fun LookupsEmptyEnv raise LookupErr

LookupsEnvsienv if ss then i else Lookupsenv

The control of an SECD is a sequence represented by a list of instructions which

are either the sp ecial op eration Ap or a expression

185

datatype instruction Ap Exp of lam

A datatyp e state that represents SECD machine states can now b e dened

186

type stack item list

and control instruction list

datatype state NullState

State of stack env control state

The transitions of the SECD machine are given by the function Step

Chapter Case study the calculus

187

fun StepStatevS E StateSECD

StatevS E C D

StepStateS E ExpVar xC D

StateLookupxES E C D

StepStateS E ExpCon vC D

StateAtomic vS E C D

StepStateS E ExpAbsxeC D

StateClosureAbsxeES E C D

StepStateClosureAbsxeE v S E ApC D

State EnvxvE Exp e StateSECD

StepStateAtomic vAtomic vS E ApC D

StateAtomicConApplyvv S E C D

StepStateS E ExpAppeeC D

StateS E Exp eExp eApC D

The function Run iterates Step until a nal state is reached and then returns a list

of all the intermediate states

188

fun Runstate as StateEmptyEnvNullStat e state

Run state stateRunStep state

val Run fn state state list

The function Eval takes a expression and evaluates it using the SECD machine

189

fun Eval e

let fun EvalAuxStatevEmptyEnv Null Stat e v

EvalAux state EvalAuxStep state

in EvalAuxStateEmptyEnvExp eNullState end

val Eval fn lam item

Load loads a lamb daexpression into an SECD state ready for running

190

fun Load e StateEmptyEnvExp eNullState

val Load fn lam state

SECDRun parses a string loads the resulting expression into an SECD state and

then runs the result SECDEval is similar but it just Eval s the result

191

fun SECDRun s RunLoad s

val SECDRun fn lam state list

fun SECDEval s EvalPL s

val SECDEval fn string item

SECDEval xy xy

val it Atomic Num item

Bibliography

Augustsson L A compiler for lazy ML in Pro ceedings of the ACM Symp o

sium on LISP and Functional Programming Austin pp

Barendregt HP The Lamb da Calculus revised edition Studies in Logic

NorthHolland Amsterdam

Barron DW and Strachey C Programming in Fox L ed Advances in

Programming and Nonnumerical Computation Chapter Pergamon Press

Bird R and Wadler P An Intro duction to Functional Programming Prentice

Hall

Boyer RS and Mo ore J S A Computational Logic Academic Press

De Bruijn NG Lamb da calculus notation with nameless dummies a to ol for

automatic formula manipulation Indag Math pp

Burge W Recursive Programming Techniques AddisonWesley

Clarke TJW et al SKIM the S K I Reduction Machine in Pro ceedings

of the ACM LISP Conference pp

Curry HB and Feys R Combinatory Logic Vol I North Holland Amster

dam

Curry HB Hindley JR and Seldin JP Combinatory Logic Vol II Studies

in Logic North Holland Amsterdam

Fairbairn J and Wray SC Co de generation techniques for functional lan

guages in Pro ceedings of the ACM Conference on LISP and Functional

Programming Cambridge Mass pp

Gordon MJC On the p ower of list iteration The Computer Journal

No

Gordon MJCThe Denotational Description of Programming Languages

SpringerVerlag

Gordon MJC Programming Language Theory and its Implementation

Prentice Hall International Series in Computer Science out of print

Gordon MJC Milner AJRG and Wadsworth CP Edinburgh LCF a

mechanized logic of computation Springer Lecture Notes in Computer Science

SpringerVerlag

Henderson P Functional Programming Application and Implementation

Prentice Hall

Bibliography

Henderson P and Morris JM A lazy evaluator in Pro ceedings of The Third

Symp osium on the Principles of Programming Languages Atlanta Georgia

pp

Hindley JR Combinatory reductions and lamb dareductions compared

Zeitschrift fur Mathematische Logik und Grundlagen der Mathematik pp

Hindley JR and Seldin JP Intro duction to Combinators and Calculus

London Mathematical So ciety Student Texts Cambridge University Press

Hughes RJM Sup er combinators a new implementation metho d for ap

plicative languages in Pro ceedings of the ACM Symp osium on LISP and

Functional Programming Pittsburgh

Kleene SC denability and recursiveness Duke Math J pp

Krishnamurthy EV and Vickers BP Compact numeral representation with

combinators The Journal of Symb olic Logic No pp June

A

Lamp ort L L T X A Do cument Preparation System AddisonWesley

E

Landin PJ The next programming languages Comm Asso c Comput

Mach pp

Levy JJ Optimal reductions in the lamb da calculus in Hindley JR and

Seldin JP eds To HB Curry Essays on Combinatory Logic Lamb da

Calculus and Formalism Academic Press New York and London

Mauny M and Suarez A Implementing functional languages in the categor

ical abstract machine in Pro ceedings of the ACM Conference on LISP

and Functional Programming pp Cambridge Mass

Milner AJRG A prop osal for Standard ML in Pro ceedings of the ACM

Symp osium on LISP and Functional Programming Austin

Morris JH Lamb da Calculus Mo dels of Programming Languages PhD Dis

sertation MIT

Peyton Jones SL The Implementation of Functional Programming Lan

guages Prentice Hall

Plotkin GD Callbyname callbyvalue and the calculus Theoretical

Computer Science pp

Schonnkel M Ub er die Bausteine der mathematischen Logik Math An

nalen pp Translation printed as On the building blo cks of

mathematical logic in van Heijeno ort J ed From Frege to Godel Harvard

University Press

Scott DS Mo dels for various typ e free calculi in Supp es P et al eds

Logic Metho dology and Philosophy of ScienceIV Studies in Logic North

Holland Amsterdam

Stoy JE Denotational Semantics The ScottStrachey Approach to Program

ming Language Theory MIT Press

Bibliography

Turner DA A new implementation technique for applicative languages

Software Practice and Exp erience pp

Turner DA Another algorithm for bracket abstraction The Journal of Sym

b olic Logic No pp June

Turner DA Functional programs as executable sp ecications in Hoare

CAR and Shepherdson JC eds Mathematical Logic and Programming

Languages Prentice Hall

Wadsworth CP The relation b etween computational and denotational prop

erties for Scotts D mo dels of the lamb dacalculus SIAM Journal of Com

1

puting pp

Wadsworth CP Some unusual calculus numeral systems in Hindley JR

and Seldin JP eds To HB Curry Essays on Combinatory Logic Lamb da

Calculus and Formalism Academic Press New York and London