<<

Intro duction to Standard ML

Rob ert Harp er

Scho ol of Computer Science

Carnegie Mellon University

Pittsburgh PA

Copyright Rob ert Harp er

All rights reserved

1

With exercises by Kevin Mitchell Edinburgh University Edinburgh UK

Contents

Intro duction

The Core Language

Interacting with ML

Basic expressions values and typ es

Unit

Bo oleans

Integers

Strings

Real Numb ers

Tuples

Lists

Records

Identiers bindings and declarations

Patterns

Dening functions

Polymorphism and Overloading

Dening typ es

Exceptions

Imp erative features

The Mo dules System

Overview

Structures and Signatures

Abstractions

Functors

The mo dules system in practice iii

iv CONTENTS

InputOutput

A Answers

Acknowledgements

Several of the examples were cribb ed from Luca Cardellis intro duction to his

dialect of ML from Robin Milners core language denition from Dave

MacQueens mo dules pap er and from Ab elson and Sussmans b o ok

Joachim Parrow Don Sannella and David Walker made many helpful sug

gestions vii

Chapter

Intro duction

These notes are an intro duction to the Standard ML

Here are some of the highlights of Standard ML

ML is a language Functions are rstclass

data ob jects they may b e passed as arguments returned as results

and stored in variables The principal control mechanism in ML is

recursive function application

ML is an interactive language Every phrase read is analyzed compiled

and executed and the value of the phrase is rep orted together with its

typ e

ML is strongly typed Every legal expression has a typ e which is deter

mined automatically by the Strong typing guarantees that

no program can incur a typ e error at run time a common source of

bugs

ML has a polymorphic typ e system Each legal phrase has a uniquely

determined most general typing that determines the set of contexts in

which that phrase may b e legally used

ML supp orts abstract types Abstract typ es are a useful mechanism for

program mo dularization New typ es together with a set of functions on

ob jects of that typ e may b e dened The details of the implementation

are hidden from the user of the typ e achieving a degree of isolation

that is crucial to program maintenance

CHAPTER INTRODUCTION

ML is statical ly scoped ML resolves identier references at compile

time leading to more mo dular and more ecient programs

ML has a typ esafe exception mechanism Exceptions are a useful

means of handling unusual or deviant conditions arising at runtime

ML has a modules facility to supp ort the incremental construction of

large programs An ML program is constructed as an interdep endent

collection of structures which are glued together using functors Sepa

rate compilation is supp orted through the ability to exp ort and imp ort

functors

Standard ML is the newest memb er of a family of languages tracing its

origins to the ML language develop ed at Edinburgh by Mike Gordon Robin

Milner and Chris Wadsworth in the midseventies Since then numerous

dialects and implementations have arisen b oth at Edinburgh and elsewhere

Standard ML is a synthesis of many of the ideas that were explored in the

variant languages notably Luca Cardellis dialect and in the functional

language HOPE develop ed by Ro d Burstall Dave MacQueen and Don San

nella The most recent addition to the language is the mo dules system

develop ed by Dave MacQueen

These notes are intended as an informal intro duction to the language and

its use and should not b e regarded as a denitive description of Standard

ML They have evolved over a numb er of years and are in need of revision

b oth to reect changes in the language and the exp erience gained with it

since its inception Comments and suggestions from readers are welcome

The denition of Standard ML is available from MIT Press A less

formal but in many ways obsolete account is available as an Edinburgh

University technical rep ort The reader is encouraged to consult the

denition for precise details ab out the language

Chapter

The Core Language

Interacting with ML

Most implementations of ML are interactive with the form of interac

tion b eing the readevalprint dialogue familiar to LISP users in which

an expression is entered ML analyzes compiles and executes it and the

1

result is printed on the terminal

Here is a sample interaction

int

ML prompts with and precedes its output with The user entered

the phrase ML evaluated this expression and printed the value

of the phrase together with its typ e int

Various sorts of errors can arise during an interaction with ML Most

of these fall into three categories syntax errors typ e errors and runtime

faults You are probably familiar with syntax errors and runtime errors from

your exp erience with other programming languages Here is an example of

what happ ens when you enter a syntactically incorrect phrase

let x in x end

Parse error Was expecting in in let x

1

The details of the interaction with the ML top level vary from one implementation to

another but the overall feel is similar in all systems known to the author These notes

were prepared using the Edinburgh compiler circa

CHAPTER THE CORE LANGUAGE

Runtime errors such as dividing by zero are a form of exception ab out

which we shall have more to say b elow For now we simply illustrate the

sort of output that you can exp ect from a runtime error

div

Failure Div

The notion of a typ e error is somewhat more unusual We shall have

more to say ab out typ es and typ e errors later For now it suces to remark

that a typ e error arises from the improp er use of a value such as trying to

add to true

true

Type clash in true

Looking for a int

I have found a bool

One particularly irksome form of error that ML cannot diagnose is the

innite lo op If you susp ect that your program is lo oping then it is often p os

sible to break the lo op by typing the interrupt character typically Control

C ML will resp ond with a message indicating that the exception interrupt

has b een raised and return to the top level Some implementations have a

debugging facility that may b e helpful in diagnosing the problem

Other forms of errors do arise but they are relatively less common and

are often dicult to explain outside of context If you do get an error message

that you cannot understand then try to nd someone with more exp erience

with ML to help you

The details of the user interface vary from one implementation to another

particularly with regard to output format and error messages The examples

that follow are based on the Edinburgh ML compiler you should have no

diculty interpreting the output and relating it to that of other

Basic expressions values and typ es

We b egin our intro duction to ML by intro ducing a set of basic typ es In ML

a typ e is a collection of values For example the integers form a typ e as do

the character strings and the b o oleans Given any two typ es and the

set of ordered pairs of values with the left comp onent a value of typ e and

BASIC EXPRESSIONS VALUES AND TYPES

the right a value of typ e is a typ e More signicantly the set of functions

mapping one typ e to another form a typ e In addition to these and other

basic typ es ML allows for userdened typ es We shall return to this p oint

later

Expressions in ML denote values in the same way that numerals denote

numb ers The typ e of an expression is determined by a set of rules that

guarantee that if the expression has a value then the value of the expression

is a value of the typ e assigned to the expression got that For example

every numeral has typ e int since the value of a numeral is an integer We

shall illustrate the typing system of ML by example

Unit

The typ e unit consists of a single value written sometimes pronounced

unit as well This typ e is used whenever an expression has no interesting

value or when a function is to have no arguments

Bo oleans

The typ e bool consists of the values true and false The ordinary b o olean

negation is available as not the b o olean functions andalso and orelse are

also provided as primitive

The conditional expression if e then e else e is also considered

1 2

here b ecause its rst argument e must b e a b o olean Note that the else

clause is not optional The reason is that this if is a conditional expres

sion rather than a conditional command such as in Pascal If the else

clause were omitted and the test were false then the expression would have

no value Note to o that b oth the then expression and the else expression

must have the same typ e The expression

if true then true else

is type incorrect or il ltyped since the typ e of the then clause is bool

whereas the typ e of the else clause is unit

not true

false bool

false andalso true

CHAPTER THE CORE LANGUAGE

false bool

false orelse true

true bool

if false then false else true

true bool

if true then false else true

false bool

Integers

The typ e int is the set of p ositive and negative integers Integers are

written in the usual way except that negative integers are written with the

character rather than a minus sign

int

int

div

int

mod

int

The usual arithmetic op erators div and mod are available with

div and mod b eing integer division and remainder resp ectively The usual

relational op erators and are provided as well They each

take two expressions of typ e int and return a b o olean according to whether

or not the relation holds

false bool

div

true bool

if mod then else

int

Notice that the relational op erators when applied to two integers evaluate

to either true or false and therefore have typ e bool

BASIC EXPRESSIONS VALUES AND TYPES

Strings

The typ e string consists of the set of nite sequences of characters Strings

are written in the conventional fashion as characters b etween double quotes

The double quote itself is written n

Fish knuckles

Fish knuckles string

string

Sp ecial characters may also app ear in strings but we shall have no need of

them Consult the ML language denition for the details of how to build

such strings

The function size returns the length in characters of a string and the

2

function is an inx app end function

Rhinocerous Party

Rhinocerous Party

size Walrus whistle

int

Real Numb ers

The typ e of oating p oint numb ers is known in ML as real Real numb ers

are written in more or less the usual fashion for programming languages an

integer followed by a decimal p oint followed by one or more digits followed

by the exp onent marker E followed by another integer The exp onent part

is optional provided that the decimal p oint is present and the decimal part

is optional provided that the exp onent part is present These conventions

are needed to distinguish integer constants from real constants ML do es not

supp ort any form of typ e inclusion so an integer must b e explicitly co erced

to a real

real

2

By inx we mean a function of two arguments that is written b etween its arguments just as addition is normally written

CHAPTER THE CORE LANGUAGE

E

real

E

real

The usual complement of basic functions on the reals are provided The

arithmetic functions and may b e applied to real numb ers though

one may not mix and match a real can only b e added to a real and not to

an integer The relational op erators and so on are also dened for

the reals in the usual way Neither div nor mod are dened for the reals but

the function denotes ordinary realvalued division In addition there are

functions such as sin sqrt and exp for the usual mathematical functions

The function real takes an integer to the corresp onding real numb er and

floor truncates a real to the greatest integer less than it

real

real

true bool

floor

real

floor

real

cos

real

cos

Type clash in cos

Looking for a real

I have found a int

This completes the set of atomic typ es in ML We now move on to the

compound types those that are built up from other typ es

Tuples

The typ e where and are typ es is the typ e of ordered pairs whose

rst comp onent has typ e and whose second comp onent has typ e Ordered

pairs are written e e where e and e are expressions Actually theres

1 2 1 2

BASIC EXPRESSIONS VALUES AND TYPES

no need to restrict ourselves to pairs we can build ordered ntuples where

n by writing n commaseparated expressions b etween parentheses

true

true bool unit

false Blubber

falseBlubber int bool int string

if then Yes else No mod

No string int

Equality b etween tuples is componentwise equality two ntuples are

equal if each of their corresp onding comp onents are equal It is a typ e error

to try to compare two tuples of dierent typ es it makes no sense to ask

whether say true is equal to abc since their corresp onding

comp onents are of dierent typ es

mod not false true

true bool

abc div abc

false bool

true abc

Type clash in trueabc

Looking for a boolint

I have found a stringunit

Lists

The typ e list consists of nite sequences or lists of values of typ e For

instance the typ e int list consists of lists of integers and the typ e bool

list list consists of lists of lists of b o oleans There are two notations for

lists the basic one and a convenient abbreviation The rst is based on the

following characterization of lists a list is either empty or it consists of

a value of typ e followed by a list This characterization is reected in the

following notation for lists the empty list is written nil and a nonempty

list is written el where e is an expression of some typ e and l is some

list The op erator is pronounced cons after the LISP listforming

function by that name

If you think ab out this denition for a while youll see that every non

empty list can b e written in this form

CHAPTER THE CORE LANGUAGE

e e e nil

1 2 n

where each e is an expression of some typ e and n This accords with

i

the intuitive meaning of a list of values of a given typ e The role of nil is to

serve as the terminator for a list every list has the form illustrated ab ove

This metho d of dening a typ e is called a recursive typ e denition Such

denitions characteristically have one or more base cases or starting p oints

and one or more recursion cases For lists the base case is the empty list

nil and the recursion case is cons which takes a list and some other value

and yields another list Recursivelydened typ es o ccupy a central p osition

in functional programming b ecause the organization of a functional program

is determined by the structure of the data ob jects on which it computes

Here are some examples of using nil and to build lists

nil

a list

nil

int list

nil nil nil

int list list

This is it

Thisisit string list

Notice that ML prints lists in a compressed format as a commaseparated

list of the elements of the list b etween square brackets This format is con

venient for input as well and you may use it freely But always keep in mind

that it is an abbreviation the nil and format is the primary one

The typ e of nil see the example in Section is p eculiar b ecause it

involves a type variable printed as a and pronounced alpha The reason

for this is that there is nothing ab out an empty list that makes it a list of

integers or a list of b o oleans or any typ e of list at all It would b e silly to

require that there b e a distinct constant denoting the empty list for each typ e

of list and so ML treats nil as a polymorphic ob ject one that can inhabit

a variety of structurallyrelated typ es The constant nil is considered to b e

an int list or a bool list or a int list list according to the context

Note however that nil inhabits only list typ es This is expressed by assigning

the typ e a list to nil where a is a variable ranging over the collection

of typ es An instance of a typ e involving a typ e variable called a polytype

BASIC EXPRESSIONS VALUES AND TYPES

for short is obtained by replacing all o ccurrences of a given typ e variable by

some typ e p erhaps another p olytyp e For example a list has int list

and int b list as instances A typ e that do es not involve any typ e

variables is called a monotype

Equality on lists is itembyitem two lists are equal if they have the same

length and corresp onding comp onents are equal As with tuples it makes

no sense to compare two lists with dierent typ es of elements and so any

attempt to do so is considered a typ e error

nil

true bool

div div

false bool

Records

The last comp ound typ e that we shall consider in this section is the record

type Records are quite similar to Pascal records and to C structures and to

similar features in other programming languages A record consists of a nite

set of lab elled elds each with a value of any typ e as with tuples dierent

elds may have dierent typ es Record values are written by giving a set of

equations of the form l e where l is a lab el and e is an expression enclosed

in curly braces The equation l e sets the value of the eld lab elled l to the

value of e The typ e of such a value is a set of pairs of the form l t where

l is a lab el and is a typ e also enclosed in curly braces The order of the

equations and typings is completely immaterial comp onents of a record are

identied by their lab el rather than their p osition Equality is comp onent

wise two records are equal if their corresp onding elds determined by lab el

are equal

nameFoousedtrue

nameFoo usedtrue namestring usedbool

nameFoousedtrue usednot falsenameFoo

true bool

nameBarusedtrue nameFoousedtrue

false bool

Tuples are sp ecial cases of records The tuple typ e is actually

shorthand for the record typ e f g with two elds lab eled

CHAPTER THE CORE LANGUAGE

and Thus the expressions and fg have precisely the

same meaning

This completes our intro duction to the basic expressions values and

typ es in ML It is imp ortant to note the regularity in the ways of forming

values of the various typ es For each typ e there are basic expression forms

for denoting values of that typ e For the atomic typ es these expressions

are the constants of that typ e For example the constants of typ e int are

the numerals and the constants of typ e string are the character strings

enclosed in double quotes For the comp ound typ es values are built using

value constructors or just constructors whose job is to build a memb er of

a comp ound typ e out of the comp onent values For example the pairing

constructor written takes two values and builds a memb er of a tuple

typ e Similarly nil and are constructors that build memb ers of the list

typ e as do the square brackets The record syntax can also b e viewed as a

syntactically elab orate constructor for record typ es This view of data as

b eing built up from constants by constructors is one of the fundamental prin

ciples underlying ML and will play a crucial role in much of the development

b elow

There is one more very imp ortant typ e in ML the function typ e Before

we get to the function typ e it is convenient to take a detour through the

declaration forms of ML and some of the basic forms of expressions With

that under our b elt we can more easily discuss functions and their typ es

Identiers bindings and declarations

In this section we intro duce declarations the means of intro ducing identiers

in ML All identiers must b e declared b efore they are used the names of

the builtin functions such as and size are predeclared by the compiler

Identiers may b e used in several dierent ways in ML and so there is a

declaration form for each such use In this section we will concern ourselves

with value identiers or variables A variable is intro duced by binding it to

a value as follows

val x

val x int

val s Abc def

val s Abcdef string

IDENTIFIERS BINDINGS AND DECLARATIONS

val pair x s

val pair Abcdef int string

The phrase val x is called a value binding To evaluate a value

binding ML evaluates the righthand side of the equation and sets the value

of the variable on the lefthand side to this value In the ab ove example x

is b ound to an integer Thereafter the identier x always stands for the

integer as can b e seen from the third line ab ove the value of x s

is obtained from the values of x and s

Notice that the output from ML is slightly dierent than in our examples

ab ove in that it prints x b efore the value The reason for that is that

whenever an identier is declared ML prints its denition the form of the

denition dep ends on the sort of identier for now we have only variables

for which the denition is the value of the variable An expression e typ ed

at the top level in resp onse to MLs prompt is evaluated and the value

of e is printed along with its typ e ML implicitly binds this value to the

identier it so that it can b e conveniently referred to in the next toplevel

phrase

It is imp ortant to emphasize the distinction b etween MLs notion of a

variable and that of most other programming languages MLs variables

are more like the const declarations than var declarations of Pascal in

particular binding is not assignment When an identier is declared by a

value binding a new identier is created it has nothing whatever to

do with any previously declared identier of the same name Furthermore

once an identier is b ound to a value there is no way to change that value

its value is whatever we have b ound to it when it was declared If you are

unfamiliar with functional programming then this will seem rather o dd at

least until we discuss some sample programs and show how this is used

Since identiers may b e reb ound some convention ab out which binding

to use must b e provided Consider the following sequence of bindings

val x

val x int

val y x

val y int

val x true

val x true bool

val z x

CHAPTER THE CORE LANGUAGE

val z true bool

The second binding for x hides the previous binding and do es not aect

the value of y Whenever an identier is used in an expression it refers to

the closest textually enclosing value binding for that identier Thus the

o ccurrence of x in the righthand side of the value binding for z refers to

the second binding of x and hence has value true not This rule is no

dierent than that used in other blo ckstructured languages but it is worth

emphasizing that it is the same

Multiple identiers may b e b ound simultaneously using the keyword

and as a separator

val x

val x int

val x true and y x

val x true bool

val y int

Notice that y receives the value not true Multiple value bindings joined

by and are evaluated in parallel rst all of the righthand sides are evalu

ated then the resulting values are all b ound to their corresp onding lefthand

sides

In order to facilitate the following explanation we need to intro duce some

terminology We said that the role of a declaration is to dene an identier

for use in a program There are several ways in which an identier can b e

used one of which is as a variable To declare an identier for a particular

use one uses the binding form asso ciated with that use For instance to

declare an identier as a variable one uses a value binding which binds

a value to the variable and establishes its typ e Other binding forms will

b e intro duced later on In general the role of a declaration is to build an

environment which keeps track of the meaning of the identiers that have

b een declared For instance after the value bindings ab ove are pro cessed

the environment records the fact that the value of x is true and that the

value of y is Evaluation of expressions is p erformed with resp ect to this

environment so that the value of the expression x can b e determined to b e

true

Just as expressions can b e combined to form other expressions by using

functions like addition and pairing so to o can declarations b e combined with

IDENTIFIERS BINDINGS AND DECLARATIONS

other declarations The result of a comp ound declaration is an environment

determined from the environments pro duced by the comp onent declarations

The rst combining form for declarations is one that weve already seen the

3

semicolon for sequential composition of environments

val x val x true and y x

val x int

val x true bool

val y int

When two declarations are combined with semicolon ML rst evaluates the

lefthand declaration pro ducing an environment E and then evaluates the

0

righthand declaration with resp ect to E pro ducing environment E The

second declaration may hide the identiers declared in the rst as indicated

ab ove

It is also useful to b e able to have local declarations whose role is to

assist in the construction of some other declarations This is accomplished

as follows

local

val x

in

val u xx xx

val v x x div

end

val u int

val v int

The binding for x is lo cal to the bindings for u and v in the sense that x is

available during the evaluation of the bindings for u and v but not thereafter

This is reected in the result of the declaration only u and v are declared

It is also p ossible to lo calize a declaration to an expression using let

let

val x

in

3

The semicolon is syntactically optional two sequential bindings are considered to b e

separated by a semicolon

CHAPTER THE CORE LANGUAGE

xx x

end

int

The declaration of x is lo cal to the expression o ccurring after the in and is

not visible from the outside The b o dy of the let is evaluated with resp ect

to the environment built by the declaration o ccurring b efore the in In

this example the declaration binds x to the value With resp ect to this

environment the value of xxx is and this is the value of the whole

expression

Exercise What is the result printed by the ML system in response to

the fol lowing declarations Assume that there are no initial bindings for x

y or z

val x and y x

val x local val x in val y x end val z x

let val x in let val x and y x in x y end end

Patterns

You may have noticed that there is no means of obtaining say the rst com

p onent of a tuple given only the expressions dened so far Comp ound values

are decomp osed via Values of comp ound typ es are them

selves comp ound built up from their comp onent values by the use of value

constructors It is natural to use this structure to guide the decomp osition

of comp ound values into their comp onent parts

Supp ose that x has typ e intbool Then x must b e some pair with the

left comp onent an integer and the right comp onent a b o olean We can obtain

the value of the left and right comp onents using the following generalization

of a value binding

val x true

val x true intbool

val left right x

val left int

val right true bool

PATTERNS

The lefthand side of the second value binding is a pattern which is built up

from variables and constants using value constructors That is a pattern is

just an expression p ossibly involving variables The dierence is that the

variables in a pattern are not references to previouslyb ound variables but

rather variables that are ab out to b e b ound by patternmatching In the

ab ove example left and right are two new value identiers that b ecome

b ound by the value binding The pattern matching pro cess pro ceeds by

traversing the value of x in parallel with the pattern matching corresp onding

comp onents A variable matches any value and that value is b ound to that

identier Otherwise ie when the pattern is a constant the pattern and

the value must b e identical In the ab ove example since x is an ordered pair

the pattern match succeeds by assigning the left comp onent of x to left

and the right comp onent to right

Notice that the simplest case of a pattern is a variable This is the form

of value binding that we intro duced in the previous section

It do es not make sense to pattern match say an integer against an or

dered pair nor a list against a record Any such attempt results in a typ e

error at compile time However it is also p ossible for pattern matching to

fail at run time

val xfalse

val x false boolint

val falsew x

val w int

val truew x

Failure match

Notice that in the second and third value bindings the pattern has a con

stant in the left comp onent of the pair Only a pair with this value as left

comp onent can match this pattern successfully In the case of the second

binding x in fact has false as left comp onent and therefore the match suc

ceeds binding to w But in the third binding the match fails b ecause

true do es not match false The message Failure match indicates that

a runtime matching failure has o ccurred

Pattern matching may b e p erformed against values of any of the typ es

that we have intro duced so far For example we can get at the comp onents

of a three element list as follows

CHAPTER THE CORE LANGUAGE

val l Lo and behold

val l Loandbehold string list

val xxx l

val x Lo string

val x and string

val x behold string

This works ne as long as we know the length of l in advance But

what if l can b e any nonempty list Clearly we cannot hop e to write a

single pattern to bind all of the comp onents of l but we can decomp ose l

in accordance with the inductive denition of a list as follows

val l Lo and behold

val l Loandbehold string list

val hdtl l

val hd Lo string

val tl andbehold string list

Here hd is b ound to the rst element of the list l called the head of l and

tl is b ound to the list resulting from deleting the rst element called the

tail of the list The typ e of hd is string and the typ e of tl is string list

The reason is that constructs lists out of a comp onent the left argument

and another list

Exercise What would happen if we wrote val hdtl l instead

of the above Hint expand the abbreviated notation into its true form then

match the result against l

Supp ose that all we are interested in is the head of a list and are not

interested in its tail Then it is inconvenient to have to make up a name

for the tail only to b e ignored In order to accommo date this dont care

case ML has a wildcard pattern that matches any value whatso ever without

creating a binding

val l Lo and behold

val l Loandbehold string list

val hd l

val hd Lo string

PATTERNS

Pattern matching may also b e p erformed against records and as you

may have guessed it is done on the basis of lab elled elds An example will

illustrate record pattern matching

val nameFoo usedtrue

val r nameFoousedtrue namestringusedbool

val usedu namen r

val n Foo string

val u true bool

It is sometimes convenient to b e able to match against a partial record

pattern This can b e done using the record wildcard as the following example

illustrates

val usedu r

val u true bool

There is an imp ortant restriction on the use of record wildcards it must b e

p ossible to determine at compile time the typ e of the entire record pattern

ie all the elds and their typ es must b e inferrable from the context of the

match

Since singleeld selection is such a common op eration ML provides a

shorthand notation for it the name eld of r may b e designated by the appli

cation name r Actually name is b ound to the function fn fnameng

n which selects the name eld from a record and thus it must b e p os

sible to determine from context the entire record typ e whenever a selection

function is used In particular fn x name x will b e rejected since the

full record typ e of x is not xed by the context of o ccurrence You will recall

that ntuples are sp ecial forms of records whose lab els are natural numb ers i

such that i n The ith comp onent of a tuple may therefore b e selected

using the function i

Patterns need not b e at in the following sense

val x foo true

val x footrue stringboolint

val lllrr x

val ll foo string

val lr true bool

val r int

CHAPTER THE CORE LANGUAGE

Sometimes it is desirable to bind intermediate pattern variables For

instance we may want to bind the pair lllr to an identier l so that

we can refer to it easily This is accomplished by using a layered pattern A

layered pattern is built by attaching a pattern to a variable within another

pattern as follows

val x foo true

val x footrue stringboolint

val l as lllr r x

val l footrue stringbool

val ll foo string

val lr true bool

val r int

Pattern matching pro ceeds as b efore binding l and r to the left and right

comp onents of x but in addition the binding of l is further matched against

the pattern lllr binding ll and lr to the left and right comp onents of

l The results are printed as usual

Before you get to o carried away with pattern matching you should real

ize that there is one signicant limitation patterns must b e linear a given

pattern variable may o ccur only once in a pattern This precludes the p ossi

bility of writing a pattern xx which matches only symmetric pairs those

for which the left and right comp onents have the same value This restriction

causes no diculties in practice but it is worth p ointing out that there are

limitations

Exercise Bind the variable x to the value by constructing patterns

to match against the fol lowing expressions

For example given the expression truehello the required pattern

is x

a b ctrue

DEFINING FUNCTIONS

Dening functions

So far we have b een using some of the predened functions of ML such

as the arithmetic functions and the relational op erations In this section we

intro duce function bindings the means by which functions are dened in ML

We b egin with some general p oints ab out functions in ML Functions are

used by applying them to an argument Syntactically this is indicated by

writing two expressions next to one another as in size abc to invoke the

function size with argument abc All functions take a single argument

multiple arguments are passed by using tuples So if for example there

were a function append which takes two lists as arguments and returns a

list then an application of append would have the form appendll it

has single argument which is an ordered pair ll There is a sp ecial

syntax for some functions usually just the builtins that take a pair as

argument called inx application in which the function is placed b etween

the two arguments For example the expression e e really means apply

the function to the pair ee It is p ossible for userdened functions

to b e inx but we shall not go into that here

Function application can take a syntactically more complex form in ML

than in many common programming languages The reason is that in most

of the common languages functions can b e designated only by an identifer

and so function application always has the form f e e where f is an

1 n

identier ML has no such restriction Functions are p erfectly go o d values

and so may b e designated by arbitrarily complex expressions Therefore the

0

general form of an application is e e which is evaluated by rst evaluating e

0

obtaining some function f then evaluating e obtaining some value v and

applying f to v In the simple case that e is an identier such as size then

the evaluation of e is quite simple simply retrieve the value of size which

had b etter b e a function But in general e can b e quite complex and require

any amount of computation b efore returning a function as value Notice

that this rule for evaluation of function application uses the cal lbyvalue

parameter passing mechanism since the argument to a function is evaluated

b efore the function is applied

0

How can we guarantee that in an application e e e will in fact evaluate

to a function and not say a b o olean The answer of course is in the

typ e of e Functions are values and all values in ML are divided up into

typ es A is a comp ound typ e that has functions as memb ers

CHAPTER THE CORE LANGUAGE

A function typ e has the form pronounced to where and

are typ es An expression of this typ e has as value a function that whenever

it is applied to a value of typ e returns a value of typ e provided that

it terminates unfortunately there is no practical means of ensuring that all

functions terminate for all arguments The typ e is called the domain type

0

of the function and is called its range type An application e e is legal

0

only if e has typ e and e has typ e that is only if the typ e of the

argument matches the domain typ e of the function The typ e of the whole

expression is then which follows from the denition of the typ e

For example

size

size fn string int

not

not fn bool bool

not

Type clash in not

Looking for a bool

I have found a int

The typ e of size indicates that it takes a string as argument and returns

an integer just as we might exp ect Similarly not is a function that takes a

b o olean and returns a b o olean Functions have no visible structure and so

print as fn The application of not to fails b ecause the domain typ e of

not is bool whereas the typ e of is int

Since functions are values we can bind them to identiers using the value

binding mechanism intro duced in the last section For example

val len size

val len fn string int

len abc

int

The identier size is b ound to some internallydened function with typ e

stringint The value binding ab ove retrieves the value of size some

function and binds it to the identier len The application len abc is

pro cessed by evaluating len to obtain some function evaluating abc to

obtain a string itself and applying that function to that string The result

DEFINING FUNCTIONS

is b ecause the function b ound to size in ML returns the length of a string

in characters

Functions are complex ob jects but they are not built up from other

ob jects in the same way that ordered pairs are built from their comp onents

Therefore their structure is not available to the programmer and pattern

matching may not b e p erformed on functions Furthermore it is not p ossible

to test the equality of two functions due to a strong theoretical result which

says that this cannot b e done even in principle Of all the typ es we have

intro duced so far every one except the function typ e has an equality dened

on values of that typ e Any typ e for which we may test equality of values

of that typ e is said to admit equality No function typ e admits equality

and every atomic typ e admits equality What ab out the other comp ound

typ es Recall that equality of ordered pairs is dened comp onentwise

two ordered pairs are equal i their left comp onents are equal and their right

comp onents are equal Thus the typ e admits equality i b oth and

admit equality The same pattern of reasoning is used to determine whether

an arbitrary typ e admits equality The roughandready rule is that if the

values of a typ e involve functions then it probably do esnt admit equality

this rule can b e deceptive so once you get more familiar with ML you are

encouraged to lo ok at the ocial denition in the ML rep ort

With these preliminaries out of the way we can now go on to consider

userdened functions The syntax is quite similar to that used in other

languages Here are some examples

fun twice x x

val twice fn intint

twice

int

fun fact x if x then else xfactx

val fact fn intint

fact

int

fun plusxyintxy

val plus fn intintint

plus

int

Functions are dened using function bindings that are intro duced by the

CHAPTER THE CORE LANGUAGE

keyword fun The function name is followed by its parameter which is a

pattern In the rst two examples the parameter is a simple pattern con

sisting of a single identier in the third example the pattern is a pair whose

left comp onent is x and right comp onent is y When a userdened function

is applied the value of the argument is matched against the parameter of the

function in exactly the same way as for value bindings and the b o dy of the

function is evaluated in the resulting environment For example in the case

of twice the argument which must b e an integer since the typ e of twice

is intint is b ound to x and the b o dy of twice x is evaluated yielding

the value For plus the pattern matching is slightly more complex since

the argument is a pair but it is no dierent from the value bindings of the

previous section the value of the argument is matched against the pattern

xy obtaining bindings for x and y The b o dy is then evaluated in this

environment and the result is determined by the same evaluation rules The

int in the denition of plus is called a type constraint its purp ose here

is to disambiguate b etween integer addition and real addition We shall have

more to say ab out this and related issues later on

Exercise Dene the functions circumference and area to compute

these properties of a circle given its radius

Exercise Dene a function to compute the absolute value of a real

number

The denition of the function fact illustrates an imp ortant p oint ab out

function denitions in ML functions dened by fun are recursive in the sense

that the o ccurrence of fact in the righthand side of the denition of fact

refers to the very function b eing dened as opp osed to some other binding

for fact which may happ en to b e in the environment Thus fact calls

itself in the pro cess of evaluating its b o dy Notice that on each recursive

call the argument gets smaller provided that it was greater than zero to

b egin with and therefore fact will eventually terminate Nonterminating

denitions are certainly p ossible and are the bane of the ML novice For a

trivial example consider the function

fun fxfx

val f fn ab

Any call to f will lo op forever calling itself over and over

DEFINING FUNCTIONS

Exercise An alternative syntax for conditional statements might be

dened by

fun newifABC if A then B else C

Explain what goes wrong if the denition for fact is altered to use this new

denition

Now we can go on to dene some interesting functions and illustrate

how real programs are written in ML Recursion is the key to functional

programming so if youre not very comfortable with it youre advised to go

slowly and practice evaluating recursive functions like fact by hand

So far we have dened functions with patterns consisting only of a single

variable or an ordered pair of variables Consider what happ ens if we at

tempt to dene a function on lists say is nil which determines whether or

not its argument is the empty list The list typ es have two value constructors

nil and A function dened on lists must work regardless of whether the

list is empty or not and so must b e dened by cases one case for nil and

one case for Here is the denition of is nil

fun isnil nil true

isnil false

isnil fn a list bool

isnil nil

true bool

isnil

false bool

The denition of is nil reects the structure of lists it is dened by cases

one for nil and one for ht separated from one another by a vertical bar

In general if a function is dened on a typ e with more than one value

constructor then that function must have one case for each constructor

This guarantees that the function can accept an arbitrary value of that typ e

without failure Functions dened in this way are called clausal function

denitions b ecause they contain one clause for each form of value of the

argument typ e

Of course clausal denitions are appropriate for recursivelydened func

tions as well Supp ose that we wish to dene a function append that given

two lists returns the list obtained by tacking the second onto the end of the

rst Here is a denition of such a function

CHAPTER THE CORE LANGUAGE

fun appendnill l

appendhdtll hd appendtll

val append fn a list a list a list

There are two cases to consider one for the empty list and one for a non

empty list in accordance with the inductive structure of lists It is trivial to

app end a list l to the empty list the result is just l For nonempty lists

we can app end l to hdtl by consing hd onto the result of app ending l to

tl

Exercise Evaluate the expression append by hand to con

vince yourself that this denition of append is correct

Exercise What function does the fol lowing denition compute

fun r rht appendrth

The typ e of append is a p olytyp e that is it is a typ e that involves the

typ e variable a The reason is that append obviously works no matter what

the typ e of the elements of the list are the typ e variable a stands for the

typ e of the elements of the list and the typ e of append ensures that b oth

lists to b e app ended have the same typ e of elements which is the typ e of the

elements of the resulting list This is an example of a polymorphic function

it can b e applied to a variety of lists each with a dierent element typ e

Here are some examples of the use of append

append

int list

append

int list

appendBowlofsoup

Bowl of soup string list

Notice that we used append for ob jects of typ e int list and of typ e string

list

In general ML assigns the most general typ e that it can to an expression

By most general we mean that the typ e reects only the commitments

that are made by the internal structure of the expression For example in

the denition of the function append the rst argument is used as the target

DEFINING FUNCTIONS

of a pattern match against nil and forcing it to b e of some list typ e

The typ e of the second argument must b e a list of the same typ e since it

is p otentially consd with an element of the rst list These two constraints

imply that the result is a list of the same typ e as the two arguments and

hence append has typ e a list a list a list

Returning to the example ab ove of a function fx dened to b e fx we

see that the typ e is ab b ecause aside from b eing a function the b o dy of

f makes no commitment to the typ e of x and hence it is assigned the typ e

a standing for any typ e at all The result typ e is similarly uncommitted

and so is taken to b e b an arbitrary typ e You should convince yourself

that no typ e error can arise from any use of f even though it has the very

general typ e ab

Function bindings are just another form of declaration analogous to the

value bindings of the previous section in fact function bindings are just a

sp ecial form of value binding Thus we now have two metho ds for building

declarations value bindings and function bindings This implies that a func

tion may b e dened anywhere that a value may b e declared in particular

lo cal function denitions are p ossible Here is the denition of an ecient

list reversal function

fun reverse l

let fun revnily y

revhdtly revtlhdy

in

revlnil

end

val reverse fn a list a list

The function rev is a lo cal function binding that may b e used only within

the let Notice that rev is dened by recursion on its rst argument and

reverse simply calls rev and hence do es not need to decomp ose its argu

ment l

Functions are not restricted to using parameters and lo cal variables

they may freely refer to variables that are available when the function is

dened Consider the following denition

fun pairwithxl

let fun p y xy

CHAPTER THE CORE LANGUAGE

in map p l

end

val pairwith fn a b list ab list

val l

val l int list

pairwithal

aaa string int list

The lo cal function p has a nonlo cal reference to the identier x the pa

rameter of the function pairwith The same rule applies here as with other

nonlo cal references the nearest enclosing binding is used This is exactly

the same rule that is used in other blo ck structured languages such as Pascal

but diers from the one used in most implementations of LISP

Exercise A perfect number is one that is equal to the sum of al l

its factors including but not including itself For example is a perfect

number because Dene the predicate isperfect to test for

perfect numbers

It was emphasized ab ove that in ML functions are values they have the

same rights and privileges as any other value In particular this means that

functions may b e passed as arguments to other functions and applications

may evaluate to functions Functions that use functions in either of these

ways are called higher order functions The origin of this terminology is

somewhat obscure but the idea is essentially that functions are often taken

to b e more complex data items than say integers which are called rst

order ob jects The distinction is not absolute and we shall not have need

to make much of it though you should b e aware of roughly what is meant

by the term

First consider the case of a function returning a function as result Sup

p ose that f is such a function What must its typ e lo ok like Lets supp ose

that it takes a single argument of typ e Then if it is to return a function

as result say a function of typ e then the typ e of f must b e

This reects the fact that f takes an ob ject of typ e and returns a function

whose typ e is The result of any such application of f may itself b e

applied to a value of typ e resulting in a value of typ e Such a succes

sive application is written fee or just f e e this is not the same

as fee Rememb er that ee is a single ob ject consisting of an

DEFINING FUNCTIONS

ordered pair of values Writing f e e means apply f to e obtaining

a function then apply that function to e This is why we went to such

trouble ab ove to explain function application in terms of obtaining a function

value and applying it to the value of the argument functions can b e denoted

by expressions other than identiers

Here are some examples to help clarify this

fun times xint yint xy

val times fn intintint

val twice times

val twice fn int int

twice

int

times

int

The function times is dened to b e a function that when given an integer

4

returns a function which when given an integer returns an integer The

identifer twice is b ound to times Since is an ob ject of typ e int the

result of applying times to is an ob ject of typ e intint as can b e seen

by insp ecting the typ e of times Since twice is a function it may b e applied

to an argument to obtain a value in this case twice returns of course

Finally times is successively applied to then the result is applied to

yielding This last application might have b een parenthesized to times

for clarity

It is also p ossible for functions to take other functions as arguments

Such functions are often called functionals or operators but once again we

shall not concern ourselves terribly much with this terminology The classical

example of such a function is the map function which works as follows map

takes a function and a list as arguments and returns the list resulting from

applying the function to each element of the list in turn Obviously the

function must have domain typ e the same as the typ e of the elements of the

list but its range typ e is arbitrary Here is a denition for map

fun map f nil nil

map f hdtl fhd map f tl

val map fn ab a list b list

4

The need for int on x and y will b e explained in Section b elow

CHAPTER THE CORE LANGUAGE

Notice how the typ e of map reects the correlation b etween the typ e of the

list elements and the domain typ e of the function and b etween the range

typ e of the function and the result typ e

Here are some examples of using map

val l

val l int list

map twice l

int list

fun listify x x

val listify fn a a list

map listify l

int list list

Exercise Dene a function powerset that given a set represented as

a list wil l return the set of al l its subsets

Combining the ability to take functions as values and to return functions

as results we now dene the comp osition function It takes two functions as

argument and returns their comp osition

fun composefgx fgx

val compose fn ab ca cb

val fourtimes composetwicetwice

val fourtimes fn intint

fourtimes

int

Lets walk through this carefully The function compose takes a pair of

functions as argument and returns a function this function when applied

to x returns fgx Since the result is fgx the typ e of x must b e

the domain typ e of g since f is applied to the result of gx the domain

typ e of f must b e the range typ e of g Hence we get the typ e printed

ab ove The function fourtimes is obtained by applying compose to the pair

twicetwice of functions The result is a function that when applied to

x returns twicetwicex in this case x is so the result is

Now that youve gained some familiarity with ML you may feel that it

is a bit p eculiar that declarations and function values are intermixed So far

DEFINING FUNCTIONS

there is no primitive expression form for functions the only way to designate

a function is to use a fun binding to bind it to an identier and then to refer

to it by name But why should we insist that al l functions have names

There is a go o d reason for naming functions in certain circumstances as we

shall see b elow but it also makes sense to have anonymous functions or

lambdas the latter terminology comes from LISP and the calculus

Here are some examples of the use of function constants and their rela

tionship to clausal function denitions

fun listify x x

val listify fn aa list

val listify fn xx

listify fn aa list

listify

int list

listify

int list

fn xx

int list

val l

val l int list

mapfn xxl

int list list

We b egin by giving the denition of a very simple function called listify

that makes a single element list out of its argument The function listify

is exactly equivalent except that it makes use of a function constant The

expression fn xx evaluates to a function that when given an ob ject

x returns x just as listify do es In fact we can apply this function

directly to the argument obtaining In the last example we pass

the function denoted by fn xx to map dened ab ove and obtain the

same result as we did from map listify l

Just as the fun binding provides a way of dening a function by pat

tern matching so may anonymous functions use patternmatching in their

denitions For example

fn nil nil hdtl tl

int list

CHAPTER THE CORE LANGUAGE

fn nil nil hdtl tl

nil int list

The clauses that make up the denition of the are col

lectively called a match

The very anonymity of anonymous functions prevents us from writing

down an anonymous function that calls itself recursively This is the reason

why functions are so closely tied up with declarations in ML the purp ose of

the fun binding is to arrange that a function have a name for itself while it

is b eing dened

Exercise Consider the problem of deciding how many dierent ways

there are of changing into and pence coins Suppose

that we impose some order on the types of coins Then it is clear that the

fol lowing relation holds

Numb er of ways to change amount a using n kinds of coins

Numb er of ways to change amount a using all but the rst kind of coin

Numb er of ways to change amount ad using all n kinds of coins

where d is the denomination of the rst kind of coin

This relation can be transformed into a recursive function if we specify the

degenerate cases that terminate the recursion If a we wil l count this as

one way to make change If a or n then there is no way to make

change This leads to the fol lowing recursive denition to count the number

of ways of changing a given amount of money

fun firstdenom

firstdenom

firstdenom

firstdenom

firstdenom

firstdenom

fun cc

cc

ccamount kinds

if amount then

POLYMORPHISM AND OVERLOADING

else

ccamountfirstdenom kinds kinds

ccamount kinds

fun countchange amount ccamount

Alter this example so that it accepts a list of denominations of coins to

be used for making change

Exercise The solution given above is a terrible way to count change

because it does so much redundant computation Can you design a better

algorithm for computing the result this is hard and you might like to skip

this exercise on rst reading

Exercise The Towers of Hanoi Suppose you are given three rods

and n disks of dierent sizes The disks can be stacked up on the rods thereby

forming towers Let the n disks initial ly be placed on rod A in order of de

creasing size The task is to move the n disks from rod A to rod C such

that they are ordered in the original way This has to be achieved under the

constraints that

In each step exactly one disk is moved from one rod to another rod

A disk may never be placed on top of a smal ler disk

Rod B may be used as an auxiliary store

Dene a function to perform this task

Polymorphism and Overloading

There is a subtle but imp ortant distinction that must b e made in order

for you to have a prop er grasp of p olymorphic typing in ML Recall that we

dened a p olytyp e as a typ e that involved a typ e variable those that do not

are called monotyp es In the last section we dened a p olymorphic function

as one that works for a large class of typ es in a uniform way The key idea is

that if a function do esnt care ab out the typ e of a value or comp onent of

a value then it works regardless of what that value is and therefore works

CHAPTER THE CORE LANGUAGE

for a wide class of typ es For example the typ e of append was seen to b e

a list a list a list reecting the fact that append do es not

care what the comp onent values of the list are only that the two arguments

are b oth lists having elements of the same typ e The typ e of a p olymorphic

function is always a p olytyp e and the collection of typ es for which it is

dened is the innite collection determined by the instances of the p olytyp e

For example append works for int lists and bool lists and intbool

lists and so on ad innitum Note that p olymorphism is not limited to

functions the empty list nil is a list of every typ e and thus has typ e a

list

This phenomenon is to b e contrasted with another notion known as over

loading Overloading is a much more ad hoc notion than p olymorphism b e

cause it is more closely tied up with notation than it is with the structure of

a functions denition A ne example of overloading is the addition func

tion Recall that we write to denote the sum of two integers and

and that we also write for the addition of the two real numb ers

and This may seem like the same phenomenon as the app ending

of two integer lists and the app ending of two real lists but the similarity is

only apparent the same app end function is used to app end lists of any typ e

but the algorithm for addition of integers is dierent from that for addition

for real numbers If you are familiar with typical machine representations

of integers and oating p oint numb ers this p oint is fairly obvious Thus

the single symb ol is used to denote two dierent functions and not a sin

gle p olymorphic function The choice of which function to use in any given

instance is determined by the typ e of the arguments

This explains why it is not p ossible to write fun plusxyxy in ML

the compiler must know the typ es of x and y in order to determine which

addition function to use and therefore is unable to accept this denition The

way around this problem is to explicitly sp ecify the typ e of the argument to

plus by writing fun plusxintyintxy so that the compiler knows

that integer addition is intended It it an interesting fact that in the absence

of overloaded identiers such as it is never necessary to include explicit

5

typ e information But in order to supp ort overloading and to allow you to

explicitly write down the intended typ e of an expression as a doublechecking

measure ML allows you to qualify a phrase with a typ e expression Here are

5

Except o ccasionally when using partial patterns as in fun f fxg x

POLYMORPHISM AND OVERLOADING

some examples

fun plusxy xy

Unresolvable overloaded identifier

fun plusxintyint xy

val plus fn intintint

bool

Type clash in bool

Looking for a bool

I have found a int

plustrue intintint bool

fn true intintint bool

fun idxa x

val id fn a a

Note that one can write p olytyp es just as they are printed by ML typ e

variables are identiers preceded by a single quote

Equality is an interesting inb etween case It is not a p olymorphic

function in the same sense that append is yet unlike it is dened for

arguments of nearly every typ e As discussed ab ove not every typ e admits

equality but for every typ e that do es admit equality there is a function

that tests whether or not two values of that typ e are equal returning true

or false as the case may b e Now since ML can tell whether or not a

given typ e admits equality it provides a means of using equality in a quasi

p olymorphic way The trick is to intro duce a new kind of typ e variable

written a which may b e instantiated to any typ e that admits equality an

equality typ e for short The ML typ e checker then keeps track of whether

a typ e is required to admit equality and reects this in the inferred typ e of

a function by using these new typ e variables For example

fun member x nil false

member x ht if xh then true else memberxt

val member fn a a list bool

The o ccurrences of a in the typ e of member limit the use of member to those

typ es that admit equality

CHAPTER THE CORE LANGUAGE

Dening typ es

The typ e system of ML is extensible Three forms of typ e bindings are

available each serving to intro duce an identier as a typ e constructor

The simplest form of typ e binding is the transparent type binding or type

abbreviation A typ e constructor is dened p erhaps with parameters as

an abbreviation for a presumably complex typ e expression There is no

semantic signicance to such a binding all uses of the typ e constructor

are equivalent to the dening typ e

type intpair int int

type intpair int int

fun fxintpair let val lrx in l end

val f fn intpair int

f

int

type a pair a a

type a pair a a

type boolpair bool pair

type boolpair bool pair

Notice that there is no dierence b etween intint and intpair b ecause

intpair is dened to b e equal to intint The only reason to qualify x with

intpair in the denition of f is so that its typ e prints as intpairint

The typ e system of ML may b e extended by dening new comp ound typ es

using a datatype binding A data typ e is sp ecied by giving it a name and

p erhaps some typ e parameters and a set of value constructors for building

ob jects of that typ e Here is a simple example of a datatype declaration

datatype color Red Blue Yellow

type color

con Red color

con Blue color

con Yellow color

Red

Red color

This declaration declares the identier color to b e a new data typ e with

DEFINING TYPES

6

constructors Red Blue and Yellow This example is reminiscent of the

enumeration typ e of Pascal

Notice that ML prints type color without any equation attached to

reect the fact that color is a new data typ e It is not equal to any other typ e

previously declared and therefore no equation is appropriate In addition

to dening a new typ e the datatype declaration ab ove also denes three

new value constructors These constructors are printed with the keyword

con rather than val in order to emphasize that they are constructors and

may therefore b e used to build up patterns for clausal function denitions

Thus a datatype declaration is a relatively complex construct in ML it

simultaneously creates a new typ e constructor and denes a set of value

constructors for that typ e

The idea of a data typ e is p ervasive in ML For example the builtin typ e

bool can b e thought of as having b een predeclared by the compiler as

datatype bool true false

type bool

con true bool

con false bool

Functions may b e dened over a userdened data typ e by pattern match

ing just as for the primitive typ es The value constructors for that data typ e

determine the overall form of the function denition just as nil and are

used to build up patterns for functions dened over lists For example

fun favorite Red true

favorite Blue false

favorite Yellow false

val favorite fn colorbool

val color Red

val color Red color

favorite color

true bool

This example also illustrates the use of the same identier in two dierent

ways The identier color is used as the name of the typ e dened ab ove

and as a variable b ound to Red This mixing is always harmless though

6

Nullary constructors those with no arguments are sometimes called constants

CHAPTER THE CORE LANGUAGE

p erhaps confusing since the compiler can always tell from context whether

the typ e name or the variable name is intended

Not all userdened value constructors need b e nullary

datatype money nomoney coin of int note of int

check of stringint

type money

con nomoney money

con coin intmoney

con note intmoney

con check stringintmoney

fun amountnomoney

amountcoinpence pence

amountnotepounds pounds

amountcheckbankpence pence

val amount fn moneyint

The typ e money has four constructors one a constant and three with ar

guments The function amount is dened by patternmatching using these

constructors and returns the amount in p ence represented by an ob ject of

typ e money

What ab out equality for userdened data typ es Recall the denition

of equality of lists two lists are equal i either they are b oth nil or they

are of the form ht and ht with h equal to h and t equal to t In

general two values of a given data typ e are equal i they are built the same

way ie they have the same constructor at the outside and corresp onding

comp onents are equal As a consequence of this denition of equality for data

typ es we say that a userdened data typ e admits equality i each of the

domain typ es of each of the value constructors admits equality Continuing

with the money example we see that the typ e money admits equality b ecause

b oth int and string do

nomoney nomoney

true bool

nomoney coin

false bool

coin coin

true bool

DEFINING TYPES

checkTSB checkClydesdale

true bool

Data typ es may b e recursive For example supp ose that we wish to dene

a typ e of binary trees A binary tree is either a leaf or it is a no de with two

binary trees as children The denition of this typ e in ML is as follows

datatype btree empty leaf node of btree btree

type btree

con empty btree

con leaf btree

con node btreebtreebtree

fun countleaves empty

countleaves leaf

countleaves nodetreetree

countleavestreecountleavestree

val countleaves fn btreeint

Notice how the denition parallels the informal description of a binary tree

The function countleaves is dened recursively on btrees returning the

numb er of leaves in that tree

There is an imp ortant pattern to b e observed here functions on recursive

lydened data values are dened recursively We have seen this pattern

b efore in the case of functions such as append which is dened over lists

7

The builtin typ e list can b e considered to have b een dened as follows

datatype a list nil of a a list

type a list

con nil a list

con a a list a list

This example illustrates the use of a parametric data typ e declaration the

typ e list takes another typ e as argument dening the typ e of the memb ers

of the list This typ e is represented using a typ e variable a in this case as

argument to the typ e constructor list We use the phrase typ e constructor

b ecause list builds a typ e from other typ es much as value constructors build

values from other values

7

This example do es not account for the fact that is an inx op erator but we will

neglect that for now

CHAPTER THE CORE LANGUAGE

Here is another example of a recursivelydened parametric data typ e

datatype a tree empty leaf of a

node of a tree a tree

type a tree

con empty a tree

con leaf aa tree

con node a treea treea tree

fun frontier empty

frontier leafx x

frontier nodett

appendfrontiertfrontiert

val frontier fn a tree a list

val tree nodeleafanodeleafbleafc

val tree nodeleafanodeleafbleafc

string tree

frontier tree

abc string list

The function frontier takes a tree as argument and returns a list consisting

of the values attached to the leaves of the tree

Exercise Design a function samefrontierxy which returns true

if the same elements occur in the same order regard less of the internal struc

ture of x and y and returns false otherwise A correct but unsatisfactory

denition is

fun samefrontierxy frontier x frontier y

This is a dicult exercise the problem being to avoid attening a huge tree

when it is frontier unequal to the one with which it is being compared

ML also provides a mechanism for dening abstract types using an abstype

8

binding An abstract typ e is a data typ e with a set of functions dened on

it The data typ e itself is called the implementation type of the abstract typ e

and the functions are called its interface The typ e dened by an abstyp e

binding is abstract b ecause the constructors of the implementation typ e are

8

Abstract typ es in this form are for the most part sup erseded by the mo dules system

describ ed in the next chapter

DEFINING TYPES

hidden from any program that uses the typ e called a client only the inter

face is available Since programs written to use the typ e cannot tell what the

implementation typ e is they are restricted to using the functions provided

by the interface of the typ e Therefore the implementation can b e changed

at will without aecting the programs that use it This is an imp ortant

mechanism for structuring programs so as to prevent interference b etween

comp onents

Here is an example of an abstract typ e declaration

abstype color blend of intintint

with val white blend

and red blend

and blue blend

and yellow blend

fun mixpartsint blendrby

partsint blendrby

if parts orelse parts then white

else let val tppartsparts

and rp partsrpartsr div tp

and bp partsbpartsb div tp

and yp partsypartsy div tp

in blendrpbpyp

end

end

type color

val white color

val red color

val blue color

val yellow color

val mix fn intcolorintcolorcolor

val green mix yellow blue

val green color

val black mix red mix blue yellow

val black color

There are several things to note ab out this declaration First of all the

typ e equation o ccurring right after abstype is a data typ e declaration ex

actly the same syntax applies as the ab ove example may suggest Following

CHAPTER THE CORE LANGUAGE

the denition of the implementation typ e is the interface declaration b e

tween with and end Examining MLs output for this declaration we see

that ML rep orts type color without an equation reecting the fact that it

is a new typ e unequal to any others Furthermore note that no construc

tors are declared as a result of the abstype declaration unlike the case of

data typ e denitions This prevents the client from building an ob ject of

typ e color by any means other than using one of the values provided by the

interface of the typ e These two facts guarantee that the client is insulated

from the implementation details of the abstract typ e and therefore allows

for a greater degree of separation b etween client and implementor Among

other things this allows for more exibility in program maintenance as the

implementation of color is free to b e changed without aecting the client

Note however that the functions dened within the with clause do have ac

cess to the implementation typ e and its constructors for otherwise the typ e

would b e quite useless

Note that the insulation of the client from the implementation of the

abstract typ e prevents the client from dening functions over that typ e by

pattern matching It also means that abstract typ es do not admit equality

If an abstract typ e is to supp ort an equality test then the implementor must

dene an equality function for it

Thus there are three ways to dene typ e constructors in ML Transparent

typ e bindings are used to abbreviate complex typ e expressions primarily for

the sake of readability rather than to intro duce a new typ e Data typ e bind

ings are used to extend the typ e system of ML A data typ e is sp ecied by

declaring a new typ e constructor and providing a set of value constructors for

that typ e Data typ e denitions are appropriate for sp ecifying data that is

describ ed structurally such as a tree for then it is natural that the under

lying structure b e visible to the client For data structures that are dened

b ehaviorally such as a stack or a priority queue an abstract typ e denition

is appropriate the structural realization is not part of the denition of the

typ e only the functions that realize the dened b ehavior are relevant to the

client

Exercise An abstract type set might be implemented by

abstype a set set of a list

with val emptyset a set

EXCEPTIONS

fun singleton e a a set

fun unions a set s a set a set

fun membere a s a set bool

membere set ht e h

orelse membere set t

fun intersections a set s a set a set

end

Complete the denition of this abstract type

Exercise Modify your solution so that the elements of the set are

stored in an ordered list Hint One approach would be to pass the order

ing relation as an additional parameter to each function Alternatively the

ordering relation could be supplied to those functions that create a set from

scratch and embedded in the representation of a set The union function

could then access the ordering relation from the representation of one of its

arguments and propagate it to the union set We wil l return to this problem

later when a more elegant mechanism for performing this parameterization

wil l be discussed

Exceptions

Supp ose that we wish to dene a function head that returns the head of a list

The head of a nonempty list is easy to obtain by patternmatching but what

ab out the head of nil Clearly something must b e done to ensure that head

is dened on nil but it is not clear what to do Returning some default value

is undesirable b oth b ecause it is not at all evident what value this might b e

and furthermore it limits the usability of the function if headnil were

dened to b e say nil then head would apply only to lists of lists

In order to handle cases like this ML has an exception mechanism The

purp ose of the exception mechanism is to provide the means for a function to

give up in a graceful and typ esafe way whenever it is unable or unwilling

to return a value in a certain situation The graceful way to write head is as

follows

exception Head

exception Head

CHAPTER THE CORE LANGUAGE

fun headnil raise Head

headxl x

val head fn a lista

head

int

head nil

Failure Head

The rst line is an exception binding that declares head to b e an exception

The function head is dened in the usual way by patternmatching on the

constructors of the list typ e In the case of a nonempty list the value of

head is simply the rst element But for nil the function head is unable to

return a value and instead raises an exception The eect of this is seen in

the examples following the declaration of head applying head to nil causes

the message Failure Head to b e printed indicating that the expression

headnil caused the exception Head to b e raised Recall that attempts to

divide by zero result in a similar message the internallydened function div

raises the exception Div if the divisor is

With exception and raise we can dene functions that ag undesirable

conditions by raising an exception But to b e complete there ought to b e a

way of doing something ab out an error and indeed there is such a mechanism

in ML called an exception hand ler or simply a hand ler We illustrate its use

by a simple example

fun head l headl handle Head

val head fn int listint

head

int

headnil

int

0

The expression e handle exn e is evaluated as follows rst evaluate e

if it returns a value v then the value of the whole expression is v if it raises

0

the exception exn then return the value of e if it raises any other exception

0

then raise that exception Notice that the typ e of e and the typ e of e must

b e the same otherwise the entire expression would have a dierent typ e

dep ending on whether or not the lefthand expression raised an exception

This explains why the typ e of head is int listint even though l do es

EXCEPTIONS

not app ear to b e constrained to b e an integer list Continuing the ab ove

example head applies head to l if it returns a value then that is the value

of head if it raises exception Head then head returns

Since a given expression may p otentially raise one of several dierent

exceptions several exceptions can b e handled by a single handler as follows

exception Odd

exception Odd

fun foo n if n mod then

raise Odd

else

div n

val foo fn intint

fun bar m foom handle Odd

Div

val bar fn intint

foo

Failure Div

bar

int

foo

Failure Odd

bar

int

foo

int

bar

int

The function foo may fail in one of two ways by dividing by zero causing

the exception Div to b e raised or by having an o dd argument raising the

exception Odd The function bar is dened so as to handle either of these

contingencies if foom raises the exception Odd then barm returns if

it raises Div it returns otherwise it returns the value of foom

Notice that the syntax of a multipleexception handler is quite like the

syntax used for a patternmatching denition of a lamb da In fact one

can think of an exception handler as an anonymous function whose domain

typ e is exn the typ e of exceptions and whose range typ e is the typ e of the

CHAPTER THE CORE LANGUAGE

expression app earing to the left of handle From the p oint of view of typ e

checking exceptions are nothing more than constructors for the typ e exn

just as nil and cons are constructors for typ es of the form a list

It follows that exceptions can carry values simply by declaring them to

take an argument of the appropriate typ e The attached value of an exception

can b e used by the handler of the exception An example will illustrate the

p oint

exception oddlist of int list and oddstring of string

exception oddlist of int list

exception oddstring of string

handle oddlistnil

oddlistht

oddstring

oddstrings sizes

The exception declaration intro duces two exceptions oddlist which takes

a list of integers as argument and oddstring which takes a string The

handler p erforms a case analysis b oth on the exception and on its argument

just as we might dened a function by pattern matching against a data typ e

What happ ens if the elided expression in the previous example raises an

exception other than oddstring or odd list Here the similarity to functions

ends For in the case of functions if the match is not exhaustive and the

function is applied to an argument that fails to match any pattern then the

exception Match is raised But in the case of exception handlers the excep

tion is reraised in the hop e that an outer handler will catch the exception

For example

exception Theirs and Mine

exception Theirs

exception Mine

fun fx if x then raise Mine else raise Theirs

val f fn int a

f handle Mine

int

f handle Mine

Failure Theirs

f handle Mine handle Theirs

EXCEPTIONS

int

Since exceptions are really values of typ e exn the argument to a raise

expression need not b e simply an identier For example the function f

ab ove might have b een dened by

fun fx raise if x then Mine else Theirs

val f fn int a

Furthermore the wildcard pattern matches any exception whatso ever so

that we may dene a handler that handles all p ossible exceptions simply b e

including a default case as in

handle

An exception binding is a form of declaration and so may have limited

scop e The handler for an exception must lie within the scop e of its dec

laration regardless of the name This can sometimes lead to p eculiar error

messages For example

exception Exc

exception Exc

let exception Exc in raise Exc end handle Exc

Failure Exc

Despite app earances the outer handler cannot handle the exception raised

by the raise expression in the b o dy of the let for the inner Exc is a distinct

exception that cannot b e caught outside of the scop e of its declaration other

than by a wildcard handler

Exercise Explain what is wrong with the fol lowing two programs

exception exn bool

fun f x

let exception exn int

in if x then raise exn with x else x

end

f handle exn with true false

CHAPTER THE CORE LANGUAGE

fun f x

let exception exn

in if p x then a x

else if q x then fb x handle exn c x

else raise exn with d x

end

f v

Exercise Write a program to place n queens on an n n chess board

so that they do not threaten each other

Exercise Modify your program so that it returns al l solutions to the

problem

Imp erative features

ML supp orts references and assignments References are a typ esafe form

of p ointer to the heap Assignment provides a way to change the ob ject

to which the p ointer refers The typ e ref is the typ e of references to

9

values of typ e The function refaa ref allo cates space in the

heap for the value passed as argument and returns a reference to that lo

cation The function a refa is the contents of function returning

the contents of the lo cation given by the reference value and the function

a refaunit is the assignment function

val x ref

val x ref int ref

x

int

x

unit

x

int

9

At present  must b e a monotyp e though it is exp ected that one of several prop osed

metho ds of handling p olymorphic references will so on b e adopted

IMPERATIVE FEATURES

All reference typ es admit equality Ob jects of typ e ref are heap ad

dresses and two such ob jects are equal i they are identical Note that this

implies that they have the same contents but the converse do esnt hold we

can have two unequal references to the same value

val x ref

val x ref int ref

val y ref

val y ref int ref

xy

false bool

x y

true bool

This corresp onds in a language like Pascal to having two dierent variables

with the same value assigned to them they are distinct variables even though

they have the same value at the moment For those of you familiar with

LISP the equality of references in ML corresp onds to LISPs eq function

rather than to equal

Along with references comes the usual imp erative language constructs

such as sequential comp osition and iterative execution of statements In ML

statements are expressions of typ e unit expressing the idea that they are

evaluated for their side eects to the store rather than their value The

inx op erator implements sequencing and the construct while e do

e provides iteration

Exercise The fol lowing abstract type may be used to create an innite

stream of values

abstype a stream stream of unit a a stream

with fun nextstream f f

val mkstream stream

end

Given a stream s next s returns the rst value in the stream and a stream

that produces the rest of the values This is il lustrated by the fol lowing ex ample

CHAPTER THE CORE LANGUAGE

fun natural n mkstreamfn n naturaln

val natural fn int int stream

val s natural

val s int stream

val firstrest next s

val first int

val rest int stream

val next next rest

val next int

Write a function that returns the innite list of prime numbers in the form

of a stream

Exercise The implementation of the stream abstract type given above

can be very inecient if the elements of the stream are examined more than

once This is because the next function computes the next element of the

stream each time it is cal led This is wasteful for an applicative stream such

as the prime numbers example as the value returned wil l always be the

same Modify the abstract type so that this ineciency is removed by using

references

Exercise Modify your stream abstract type so that streams can be

nite or innite with a predicate endofstream to test whether the stream has

nished

Chapter

The Mo dules System

Overview

The ability to decomp ose a large program into a collection of relatively inde

p endent mo dules with welldened interfaces is essential to the task of build

ing and maintaining large programs The ML mo dules sytem supplements

the core language with constructs to facilitate building and maintaining large

programs

Many mo dern programming languages provide for some form of mo dular

decomp osition of programs into relatively indep endent parts Exactly what

constitutes a program unit and how they are related is by no means estab

lished in the literature and consequently there is no standard terminology

Program comp onents are variously called among other things mo dules

packages and clusters in ML we use the term structure short for

environment structure This choice of terminology is telling MLs con

ception of a program unit is that it is a reied environment Recall that the

environment is the rep ository of the meanings of the identiers that have

b een declared in a program For example after the declaration val x

the environment records the fact that x has value which is of typ e int

Now the fundamental notion underlying program mo dularization is that the

aim is to partition the environment into chunks that can b e manipulated

relatively indep endently of one another The reason for saying relatively is

that if two mo dules constitute a program then there must b e some form of

interaction b etween them and there must b e some means of expressing and

CHAPTER THE MODULES SYSTEM

managing this interaction This is the problem of sharing

Exactly what sorts of op erations one is able to p erform with a program

unit and how sharing is managed are the characteristic features of any

mo dularization system At the very least one wants a mo dules facility to

allow for separate compilation of program units some means of assembling

the units into a complete program and some form of insulation b etween the

units so as to avoid inadvertent dep endency on accidental prop erties of

a unit such as the details of its implementation Managing the interaction

b etween insulation abstraction and sharing is the key issue that determines

the form of solution to the other problems p osed by the desire for mo dular

program development

Just as the typ e of an identier mediates its use in a program so struc

tures have a form of typ e called a signature that describ es the structure to

the rest of the world In the literature the typ e of a program unit is called an

interface or package description MLs terminology is suggested by the

analogy b etween an environment structure and an algebraic structure the

latters typ e b eing an algebraic signature Just as typ es are a summary

of the compiletime prop erties of an expression so a signature is a summary

of the information that is known ab out a structure at compile time However

in contrast to the core language explicitly ascribing a signature to a struc

ture eects b oth the compiletime and runtime prop erties of that structure

by dening a limited view of that structure

A functor is a function that takes structures to structures The idea

is that if a structure S dep ends on another structure T only to the extent

sp ecied in T s signature then S may b e isolated from T s implementation

details by dening a function that given any structure with T s signature

returns the structure S with that structure plugged in In the literature

this facility is called a parameterized mo dule or a generic package In ML

we cho ose the term functor b oth b ecause it is suggestive of its functional

character and also b ecause it accords with the mathematical terminology

surrounding structures and signatures mentioned ab ove The declaration of

a functor corresp onds to building S in isolation and the application of that

functor to a structure corresp onds to linking together the parts of a program

to form a coherent whole Functors are also the basis for an elegant form of

information hiding called an abstraction For most purp oses abstractions

are a replacement for abstract typ es

We b egin our intro duction to the mo dules facility by lo oking at structures

STRUCTURES AND SIGNATURES

and signatures

Structures and Signatures

A structure is essentially an environment turned into a manipulable ob ject

The basic form of expression denoting a structure is called an encapsulated

declaration consisting of a declaration bracketed by the keywords struct

and end Here is a simple example of an encapsulated declaration

struct

type t int

val x

fun fx if x then else xfx

end

The value of this encapsulated declaration is a structure in which the typ e

identier t is b ound to int and the value identiers x and f are b ound to

and the factorial function resp ectively Although we shall regard a struc

ture as a kind of value the kind denoted by an encapsulated declaration

it do es not have the same status as ordinary values In particular one may

not simply enter an encapsulated declaration at top level the way that one

might enter an arithmetic expression However they may b e b ound to iden

tiers using structure bindings a form of declaration that may app ear only

at top level or within an encapsulated declaration For the time b eing we

will restrict our attention to structure bindings at the top level and defer

discussion of structure bindings within structures until later Thus we may

bind the ab ove structure to an identier as follows

structure S

struct

type t int

val x

fun fx if x then else xfx

end

structure S

struct

type t int

CHAPTER THE MODULES SYSTEM

val f fn int int

val x int

end

1

Notice that the result of evaluating the structure binding is an environment

Consequently ML prints the environment resulting from the declaration b e

tween struct and end almost as though it were typ ed directly at top level

Of course a structure is an indep endent environment in that the declaration

within an encapsulated declaration do es not eect the top level environment

So for example neither t nor f are available at top level after the ab ove dec

laration

However they may b e accessed by reaching into the structure b ound to

S using a qualied name A qualied name consists of a structure path and

a simple identier separated by a dot For the present a structure path is

simply a single structure identier later on we will need to generalize paths

to a sequence of structure identiers We may refer to the comp onents of the

structure S using qualied names as follows

x

Type checking error in x

Unbound value identifier x

Sx

int

SfSx

int

Sx St

St

The expression Sx is a qualied name that refers to the value identier x in

the structure S Its value as you might exp ect is Similarly Sf designates

the function f dened in the structure S the factorial function When it is

applied to Sx that is to it returns Reference to the identiers dened

by S is not limited to values the last example illustrates the use of the typ e

identier St dened in S to b e int

If you are writing a bit of co de that refers to several comp onents of a

single structure it can get quite tedious to continually use qualied names

1

For technical reasons some implementations of ML rearrange the environment b efore

printing

STRUCTURES AND SIGNATURES

To alleviate this problem ML provides a declaration form that op ens up a

structure and incorp orates its bindings into the lo cal environment so that

they can b e referred to directly

let open S in fx end

int

open S

val x int

val f fn intint

type t int

In the rst example we lo cally op en structure S within a let expression so

that we can write fx instead of the more verb ose SfSx In the second

example we op en S at the top level thereby adding its bindings to the top

level environment as can b e seen by the result of the expression

It is often helpful to think of a structure as a kind of value b oth b ecause

it reects the idea of treating environments as ob jects and also b ecause it

suggests the sorts of op erations that one might p erform on them Just as

every value in the core language has a typ e so structures have typ es as well

namely signatures Signatures describ e structures in much the same way

that typ es describ e ordinary values in that they serve as a description of the

computational role of the value by determining the sorts of ways in which

it can b e used This is necessarily vague and signatures are not just a new

form of typ e but nonetheless this analogy should help you to see whats

going on

If we examine the output of ML on the ab ove examples we notice a

certain inconsistency b etween the rep ort for structure bindings and the rep ort

for value bindings at least as long as we push the structures as values

analogy whereas for value bindings ML rep orts b oth the value and the

typ e for structure bindings only a form of value is printed Lets consider

what would happ en if ML were to adhere to the val binding convention for

structure bindings

structure S

struct

val x

val b x end

CHAPTER THE MODULES SYSTEM

structure S

struct

val x

val b true

end

sig

val x int

val b bool

end

In this fanciful example the typ e information for the variables app ears in the

signature whereas the value app ears in the structure This accords with our

intuitive idea of a signature as a description of a value the structure One

can see that the val binding format is rather awkward for fat ob jects like

structures so the actual ML system prints an amalgamation of the structure

and its signature in resp onse to a structure binding

The expression bracketed by sig and end in the ab ove example is called a

signature the b o dy of which is called a specication A sp ecication is similar

to a declaration except that it merely describ es an identier by assigning it

a typ e rather than giving it a value and implicitly a typ e For the present

we consider only val sp ecications adding the other forms as we go along

In the ab ove example x is sp ecied to have typ e int and b typ e bool

Signature expressions are not limited to the output of the ML compiler

They play a crucial role in the use of the mo dules system particularly in

functor declarations and therefore are often typ ed directly by the user Sig

natures may b e b ound to signature identiers using signature bindings in

much the same way that typ es may b e b ound to typ e identiers using typ e

bindings Signature bindings are intro duced with the keyword signature

and may only app ear at top level

signature SIG

sig

val x int

val b bool

end

signature SIG sig

STRUCTURES AND SIGNATURES

val x int

val b bool

end

The output from a signature binding is not very enlightening and so Ill omit

it from future examples

The primary signicance of signatures lies in signature matching A struc

ture matches a signature if roughly the structure satises the sp ecication

in the signature Since sp ecications are similar to typ es the idea is simi

lar to typ e checking in the core language though the details are a bit more

complex One use of signatures is to attach them to structure identiers in

structure bindings as a form of correctness check in which we sp ecify that

the structure b eing b ound must match the given signature

structure S SIG

struct

val x

val b x

end

structure S

struct

val x int

val b false bool

end

The notation SIG on the structure binding indicates that the encapsulated

declaration on the right of the equation must match the signature SIG

Since ML accepted the ab ove declaration it must b e that the structure

do es indeed match the given signature Why is that the case The given

structure matches SIG b ecause

Sx is b ound to which is of typ e int as required by SIG

and

Sb is b ound to false which is of typ e bool

In short if a variable x is assigned a typ e in a signature then the corre

sp onding expression b ound to x in the structure must have typ e

The signature may require less than the structure presents For example

CHAPTER THE MODULES SYSTEM

structure S SIG

struct

val x

val b false

val s Garbage

end

structure S

struct

val x int

val b false bool

end

Here the structure b ound to S denes variables x b and s while the signature

SIG only requires x and b Not only is the typ e of s immaterial to the

signature matching but it is also removed from the structure by the pro cess

of signature matching The idea is that SIG denes a view of the structure

consisting only of x and b Other signatures may b e used to obtain other

views of the same structure as in the following example

structure S

struct

val x

val b false

val s String

end

structure S

struct

val x int

val b false bool

val s String string

end

signature SIG

sig

val x int

val b bool

end

and SIG sig

STRUCTURES AND SIGNATURES

val b bool

val s string

end

structure S SIG S and S SIG s

structure S

struct

val x int

val b false bool

end

structure S

struct

val b false bool

val s String string

end

Exercise A signature for structures that possess an ordering can be

written as

signature ORD

sig

type t

val le t t bool

end

Create structures for ordered integers and realstring pairs to match this

signature

If a value in a structure has p olymorphic typ e then it satises a sp eci

cation only if the p olymorphic typ e has the sp ecied typ e as an instance So

for example if x is b ound in some structure to nil which as typ e a list

then x satises the sp ecications int list and bool list list for exam

ple as should b e obvious by now But what happ ens if the sp ecication typ e

is p olymorphic Lets supp ose that an identier f is sp ecied to have typ e

a lista list In order to satisfy this sp ecication a structure must

bind a value to f that can take an arbitrary list to another list of that typ e

Thus it is not go o d enough that f b e of typ e say int listint list for

the sp ecication requires that f work for bool list as well The general

principle is that the value in the structure must b e at least as general as that

CHAPTER THE MODULES SYSTEM

in the sp ecication So if f is b ound in a structure to the identity function

which has typ e aa then it satises the sp ecication ab ove The reason

is that it takes a value of any typ e and returns a value of that typ e so a

fortiori it can take a list of any typ e and return a list of that typ e Heres

an example to summarize

signature SIG

sig

val n a list

val l int list

val f a list a list

end

structure S SIG

struct

val n nil a list

val l nil a list

fun fx x a a

end

Exercise What is wrong with the fol lowing declaration

structure S SIG

struct

val n

val l nil

fun fx x

end

Exception bindings within structures are sub ject to the same restriction

as for exception bindings in the core language they must have monotyp es

Exception sp ecications prescrib e the typ e of the exception only and the

rules for signature matching are the same as for variables except that the

complications related to p olymorphic typ es do not arise

structure S

struct

exception Barf

exception Crap Barf

STRUCTURES AND SIGNATURES

fun fx if x then raise Barf

else if x then raise Crap

else

end

structure S

struct

exception Barf

exception Crap

val f fn intint

end

Sf

Failure Barf

Sf

int

Typ e declarations and sp ecications raise more interesting questions

First lets consider transparent typ e bindings in structures such as in the

rst example of this section in which t is b ound to int What might the sig

nature of such a structure b e Lets consider an example under the imaginary

structureprinting regime that we considered ab ove

structure S

struct

type t int

val x

fun fx if x then else xfx

end

structure S

struct

type t int

val f fn

val x

end

sig

type t

val f intint

val x int

CHAPTER THE MODULES SYSTEM

end

The sp ecication of identier t in the structure b ound to S is just type t

indicating that its value is a typ e namely int

When a typ e identier takes an argument the sp ecication is written in

the obvious way

structure S

struct

type a t a int

val x true

end

structure S

struct

type a t a int

val x true

end

sig

type a t

val x bool int

end

Notice the form of the sp ecication for t

Both of the ab ove sp ecication forms are acceptable in signature expres

sions But what happ ens to signature matching Consider the following

example

signature SIG

sig

type a t

val x int bool

end

structure S SIG

struct

type a t a bool

val x true end

STRUCTURES AND SIGNATURES

structure S

struct

type a t a bool

val x true int bool

end

The structure b ound to S matches SIG b ecause St is a unary one argument

typ e constructor as sp ecied in SIG

If a signature sp ecies a typ e constructor then that typ e constructor may

b e used in the remainder of the sp ecication Heres an example

signature SIG

sig

type a t

val x int t

end

This signature sp ecies the class of structures that dene a unary typ e con

structor t and a variable of typ e int t for that typ e constructor t

Now lets return to the structure S ab ove and consider whether or not

it matches this signature SIG According to the informal reading of SIG just

given S ought to match SIG More precisely S matches SIG b ecause

St is a unary typ e constructor as required

The typ e of Sx is intbool Now int t is equal to intbool

by denition of St and therefore Sx satises the sp eci

cation int t

It is imp ortant to realize that during signature matching all of the typ e

identiers in the signature are taken to refer to the corresp onding identiers

in the structure so that the sp ecication int t is taken to mean int St

Exercise Which signatures match the fol lowing structure

structure S

struct

type a t a int

val x true end

CHAPTER THE MODULES SYSTEM

As a metho dological p oint it is usually wise to adhere to the signature

closure rule which states that the free identiers of a signature are to b e

limited to signature identiers and builtin functions like and the so

called pervasives

Exercise Given

structure A struct datatype a D d of a end

which of the fol lowing are valid signatures for

structure B

struct

type t int AD

fun fAdx Adx

end

sig type t val f int AD int AD end

sig type t val f t int AD end

sig type t val f t t end

Data typ e declarations in structures present no great diculties Consider

the following example

signature SIG

sig

type a List

val Append a List a List a List

end

structure S SIG

struct

datatype a List Nil Cons of a a List

fun AppendxNil x

AppendxConsht ConshAppendxt

end

structure S

struct

type a List

val Append fn a List a List a List end

STRUCTURES AND SIGNATURES

As an exercise convince yourself that S matches SIG by arguing along the

same lines as weve done for the other examples considered so far

In the ab ove example the signature SIG ascrib ed to S has no entries for

the constructors of the data typ e List There are two ways to sp ecify the

constructors in SIG One is to treat them just like ordinary values as the

following example illustrates

signature SIG

sig

type a List

val Nil a List

val Cons a a List a List

val Append a List a List a List

end

structure S SIG

struct

datatype a List Nil Cons of a a List

fun AppendxNil x

AppendxConsht ConshAppendxt

end

structure S

struct

type a List

val Nil a List

val Cons a a List a List

val Append fn a List b List a List

end

Notice that a List is no longer a data typ e and that Nil and Cons are

simply variables not value constructors

The other p ossibility is to sp ecify the constructors as constructors so that

the structure of a typ e is visible The way to do this is with the data typ e

sp ecication which is syntactically identical to the data typ e declaration

Heres an example

signature SIG

sig

datatype a List Nil Cons of a a List

CHAPTER THE MODULES SYSTEM

val Append a List a List a List

end

structure T SIG S

structure T

struct

type a List

con Nil a List

con Cons a a List a List

val Append fn a List a List a List

end

The utility of this approach to sp ecifying constructors will b e explained b elow

when we intro duce functors

Abstract typ e declarations in structures do not present any new issues for

signature matching as they merely serve to declare a typ e and some identiers

asso ciated with it Abstract typ e sp ecications do not arise b ecause as we

shall see b elow we have another means of treating typ es abstractly and so

there is no need of such a sp ecication

Exercise Dene an implementation of stacks using signatures and

structures

In practice structures are typically built up from one another according

to some pattern determined by the application If a structure S is built from

another structure T then S is said to depend on T MacQueen classies

dep endency in two ways First the dep endence of S on T may b e essen

tial or inessential Essential dep endence arises when S may only b e used in

conjunction with T the relationship b etween the two is so close that they

may not b e usefully separated All other forms of dep endence are inessen

tial Second the dep endence of S on T may b e either explicit or implicit

S explicitly dep ends on T if the signature of S can only b e expressed by

reference to the signature of T otherwise the dep endence is implicit Note

that explicit dep endence is always essential

The simplest case of inessential dep endence o ccurs when S imp orts a value

from T as in the following example

structure T struct

STRUCTURES AND SIGNATURES

val x

end

structure T

struct

val x int

end

structure S

struct

val y Tx

end

structure S

struct

val y int

end

It is clear that S can b e used indep endently of T even though S was dened

by reference to T This form of dep endence is sometimes called dependence

by construction

Essential dep endence is much more imp ortant One form of essential

dep endence o ccurs when T declares an exception that can b e raised by a

function in S For example

structure T

struct

exception Barf

fun foox if x then raise Barf else div x

end

structure T

struct

exception Barf

val foo fn intint

end

structure S

struct

fun gx Tfoox

end

Since Sg raises the exception Barf the use of S is limited to contexts in

which T is available for otherwise one cannot handle the exception Therefore

CHAPTER THE MODULES SYSTEM

S dep ends essentially on T and ought to b e packaged together with it Note

however that the dep endence is implicit for the signature of S printed by

ML do es not make reference to T

Essential and explicit dep endence o ccurs when S overtly uses a data typ e

dened in T as in

structure T

struct

datatype a List Nil Cons of a a List

fun lenNil

lenConsht lent

end

structure T

struct

type a List

con Nil a List

con Cons a a List a List

val len fn a List int

end

structure S

struct

val len Tlen

end

structure S

struct

val len fn a TList int

end

Notice that the signature of S makes reference to the structure T reecting

the fact that len may only b e applied to values of a typ e dened in T

Note that the signature closure rule precludes the p ossibility of ascribing

a nontrivial signature to S in the ab ove example for a signature expression

may not contain free references to structure identiers such as T This may

seem like an arbitrary restriction but in fact it serves to call attention to the

fact that S and T are closely related and should b e packaged together as a

unit This kind of packaging can b e achieved by making T b e a substructure

of S by including the declaration of T within the encapsulated declaration of

S as follows

STRUCTURES AND SIGNATURES

structure S

struct

structure T

struct

datatype a List Nil Cons of a a List

fun lenNil

lenConsht lent

end

val len Tlen

end

structure S

struct

structure T

struct

type a List

con Nil a List

con Cons a a List a List

val len fn a List int

end

val len fn a TList int

end

In this way one may form a hierarchical arrangement of interdep endent struc

tures and may thereby package together a related set of structures as a unit

Substructures require the denition of a structure path to b e generalized

to an arbitrary dotseparated sequence of structure identiers each a com

p onent of the previous For example ST is a structure path and STlen

is a qualied name that selects the function len in the structure T in the

structure S

By making T b e a substructure of S we can express the signature of S

within the language by using substructure sp ecications and qualied names

as in the following example

signature SIGT

sig

datatype a List Nil Cons of a a List

val len a List int end

CHAPTER THE MODULES SYSTEM

signature SIGS

sig

structure T SIGT

val len a TList int

end

Notice the structure sp ecication in SIGS which asserts that the substructure

T is to match signature SIGT Note also that the sp ecication of len in SIGS

mentions TList which is lo cal to SIGS by virtue of the fact that T is a

substructure of S

Exercise Dene a structure Exp that implements a datatype of ex

pressions with associated operation It should satisfy the signature

signature EXP

sig

datatype id Id of string

datatype exp Var of id

App of id exp list

end

Dene another signature SUBST and structure Subst that implements sub

stitutions for these expressions ie dene a type subst in terms of a list

of identierexpression pairs and a substitute function which given a sub

stitution and an expression returns the expression resulting from applying

substitution

Abstractions

We noted ab ove that the pro cess of signature matching cuts down struc

tures so that they have only the comp onents present in the signature The

ascription of a signature to a structure provides a view of that structure

so that signature matching provides a limited form of information hiding by

restricting access to only those comp onents that app ear in the signature

One reason to make such restrictions is that it can b e helpful in program

maintenance to precisely dene the interface of each program mo dule Simi

lar concerns are addressed by abstract typ es in the core language one reason

ABSTRACTIONS

to use an abstract typ e is to ensure that all uses of that typ e are indep en

dent of the details of the implementation Signature matching can provide

some of the facilities of abstract typ es since with it one can throw away

the constructors of a data typ e thereby hiding the representation But this

turns out to b e a sp ecial case of a more general information hiding construct

in ML called an abstraction

The fundamental idea is that we would like in certain circumstances to

limit the view of a structure to b eing exactly what is sp ecied in the signature

The following example illustrates the p oint

signature SIG

sig

type t

val x t t

end

structure S SIG

struct

type t int

val x fn x x

end

structure S

struct

type t int

val x fn t t

end

Sx

int

Sx St

int St

Note that St is int even though SIG makes no mention of this fact

The purp ose of an abstraction is to suppress all information ab out the

structure other than what explicitly app ears in the signature

abstraction S SIG

struct

type t int

val x fn x x

CHAPTER THE MODULES SYSTEM

end

abstraction S SIG

Sx

int

Sx St

Type error in Sx St

Looking for a int

I have found a St

The eect of the abstraction declaration is to limit all information ab out S

to what is sp ecied in SIG

There is a close connection b etween abstractions and abstract typ es Con

sider the following abstract typ e

abstype a set set of a list

with

val emptyset set

fun unionsetlsetl setll

end

type a set

val emptyset a set

val union fn a set a set a set

emptyset

a set

This declaration denes a typ e a set with op erations empty set and union

The constructor set for sets is hidden in order to ensure that the typ e is

abstract ie that no client can dep end on the representation details

In general an abstype declaration denes a typ e and a collection of

op erations on it while hiding the implementation typ e Abstractions provide

another way of accomplishing the same thing as the following example shows

signature SET

sig

type a set

val emptyset a set

val union a set a set a set end

ABSTRACTIONS

abstraction Set SET

struct

datatype a set set of a list

val emptyset set

fun unionsetlsetl setll

end

abstraction Set SET

Setset

Undefined variable Setset

Semptyset

a Sset

Exercise Dene an abstraction for complex numbers using the signa

ture

signature COMPLEX

sig

type complex

exception divide unit

val rectangular real real imag real complex

val plus complex complex complex

val minus complex complex complex

val times complex complex complex

val divide complex complex complex

val eq complex complex bool

val realpart complex real

val imagpart complex real

end

Hint Given two complex numbers z a ib and z c id the fol lowing

1 2

hold

z z a c ib d

1 2

z z a c ib d

1 2

z z ac bd iad bc

1 2

ac bd ibc ad

z z

1 2

2 2

c d

CHAPTER THE MODULES SYSTEM

Abstractions are more exible than abstract typ es in one sense and a

bit less exible in another The exibility comes from the fact that the

abstraction neednt t the data typ e with op erations mold imp osed by

abstract typ es For example no typ e need b e declared at all or if so it

neednt b e a data typ e Abstract typ es are marginally more exible in that

they are ordinary declaration forms and may therefore app ear anywhere

that a declaration may app ear whereas abstractions are sub ject to the same

limitations as structure bindings they may only app ear at top level or within

an encapsulated declaration This limitation do es not app ear to b e unduly

2

restrictive as it is customary to dene all typ es at top level anyway

Functors

ML programs are hierarchical arrangements of interrelated structures Func

tors which are functions on structures are used to manage the dynamics of

program development in ML Functors play the role of a linking loader in

many programming languages they are the means by which a program is

assembled from its comp onent parts

Functors are dened using functor bindings which may only o ccur at

top level The syntax of a functor binding is similar to the clausal form of

function denition in the core language Here is an example

signature SIG

sig

type t

val eq t t bool

end

functor F P SIG SIG

struct

type t Pt Pt

fun eqxyuv Peqxu andalso Peqyv

end

functor F P SIG SIG

2

It is advisable to avoid abstypes in ML b ecause they are b eing phased out in favor of abstractions

FUNCTORS

The signature SIG sp ecies a typ e t with a binary relation eq The functor F

denes a function that given any structure matching signature SIG returns

another structure which is required to match SIG as well Of course the

result signature may in general dier from the parameter signature

Functors are applied to structures to yield structures

structure S SIG

struct

type t int

val eq ttbool op

end

structure S

struct

type t int

val eq fn ttbool

end

structure SS SIG FS

structure SS

struct

type t int int

val eq fn t t bool

end

Here we have created a structure S that matches signature SIG The functor F

when applied to structure S builds another structure of the same signature

but with t b eing the typ e of pairs of integers and the equality function

dened on these pairs Notice how SS is built as a function of S by F

Functors enjoy a degree of p olymorphism that stems from the fact that

signature matching is dened to allow the structure to have more information

than is required by the signature which is then thrown away as discussed

ab ove For example

structure T SIG

struct

type t string int

val eq t t bool op

fun fxtxx end

CHAPTER THE MODULES SYSTEM

structure T

struct

type t string int

val eq fn t t bool

end

structure TT SIG FT

structure TT

struct

type t stringintstringint

val eq t t bool

end

Although functors are limited to a single argument this is not a serious

limitation for several structures can b e packaged into one as substructures

and then passed to a functor In practice this is not much of an inconvenience

for it is usually the case that if one wants to pass several structures to a

functor then they are so closely related as to b e packaged together anyway

Functors are sub ject to a closure restriction similar to that for signatures

they may not have any free references to values typ es or exceptions in the

environment except for p ervasive system primitives The functor b o dy

may freely refer to the parameters and their comp onents using qualied

names to lo callydeclared identiers and to previouslydeclared functors

and signatures

Though it is p erhaps the most common case the b o dy of a functor need

not b e an encapsulated declaration qualied names and functor applications

are p erfectly acceptable but functors are not recursive Here are some

examples

functor G P SIG SIG FFP

functor G P SIG SIG

functor I P SIG SIG P

functor I P SIG SIG

It is worth noting that the functor I is not the identity function for if S is

a structure matching SIG but with more comp onents than are mentioned in

SIG then the result of the application FS will b e the cutdown view of S

and not S itself For example

THE MODULES SYSTEM IN PRACTICE

structure S

struct

type t int

val eq op

fun fx x

end

structure S

struct

type t int

val eq fn int int bool

val f fn a a

end

structure S I S

structure S

struct

type t int

val eq fn t t bool

end

Notice that the comp onent f of S is missing from the result of applying I to

S

Exercise Convert your implementation of sets using an ordered list

representation into a form where the equality and ordering functions are pro

vided as arguments to a set functor

This completes our intro duction to the fundamental mechanisms of the

ML mo dules system There is one very imp ortant idea still to b e discussed

the sharing sp ecication We defer considering sharing sp ecications until

we have illustrated the use of functors in programming

The mo dules system in practice

In this section we illustrate the use of the mo dules system in program de

velopment We shall consider in outline the development of a parser that

translates an input stream into an abstract syntax tree and records some in

formation ab out the symb ols encountered into a symb ol table The program

CHAPTER THE MODULES SYSTEM

is divided into four units one for the parser one for the abstract syntax tree

management routines one for the symb ol table and one to manage symb ols

Here are the signatures of these four units

signature SYMBOL

sig

type symbol

val mksymbol string symbol

val eqsymbol symbol symbol bool

end

signature ABSTSYNTAX

sig

structure Symbol SYMBOL

type term

val idname term Symbolsymbol

end

signature SYMBOLTABLE

sig

structure Symbol SYMBOL

type entry

type table

val mktable unit table

val lookup Symbolsymbol table entry

end

signature PARSER

sig

structure AbstSyntax ABSTSYNTAX

structure SymbolTable SYMBOLTABLE

val symtable SymbolTabletable

val parse string AbstSyntaxterm

end

Of course these signatures are abbreviated and idealized but it is hop ed that

they are suciently plausible to b e convincing and informative Please note

the hierarchical arrangement of these structures Since the parser mo dule

uses b oth the abstract syntax mo dule and the symb ol table mo dule in an

essential way it must include them as substructures Similarly b oth the

THE MODULES SYSTEM IN PRACTICE

abstract syntax mo dule and the symb ol table mo dule include the symb ol

mo dule as substructures

Now lets consider how we might build a parser in this conguration For

getting ab out the algorithms and representations we might think of writing

down a collection of structures such as the following

structure Symbol SYMBOL

struct

datatype symbol symbol of string

fun mksymbols symbol s

fun eqsymbol sym sym

end

structure AbstSyntax ABSTSYNTAX

struct

structure Symbol SYMBOL Symbol

datatype term

fun idname term

end

structure SymbolTable SYMBOLTABLE

struct

structure Symbol SYMBOL Symbol

type entry

type table

fun mktable

fun lookupsymtable

end

structure Parser PARSER

struct

structure AbstSyntax ABSTSYNTAX AbstSyntax

structure SymbolTable SYMBOLTABLE SymbolTable

val symtable SymbolTablemktable

fun parsestr

SymbolTablelookupAbstSyntaxidnamet symtable

end

Note that in the last line of Parser we apply SymbolTablelookup to the

result of an application of AbstSyntaxidname This is typ e correct only by

CHAPTER THE MODULES SYSTEM

virtue of the fact that AbstSyntax and SymbolTable include the same struc

ture Symbol Were there to b e two structures matching signature SYMBOL

one b ound into SymbolTable and the other b ound into AbstSyntax then this

line of co de would not typ e check Keep this fact in mind in what follows

Now this organization of our compiler seems to b e OK at least so far

as the static structure of the system is concerned But if you imagine that

there are umpteen other structures around each with a few thousand lines of

co de then one can easily imagine that this approach would b ecome somewhat

unwieldy Supp ose that there is a bug in the symb ol table co de which we

x and now we would like to rebuild the system with the new symb ol table

mo dule installed This requires us to recompile the ab ove set of structure

expressions along with all the others that are aected as a consequence in

order to rebuild the system Clearly some form of separate compilation and

linking facility is needed What we are aiming at is to b e able to recompile

any one mo dule in isolation from the others and then relink the compiled

forms into the desired static conguration Of course this idea is not new

the p oint is to see how its done in ML

The key is never to write down a structure explicitly but rather to orga

nize the system as a set of functors each taking its dep endents as arguments

and taking no arguments if it has no dep endents Then to link the sys

tem one merely applies the functors so as to construct the appropriate static

conguration For our example the functors will lo ok like this

functor SymbolFun SYMBOL

struct

datatype symbol symbol of string

fun mksymbols symbol s

fun eqsymbol sym sym

end

functor AbstSyntaxFun Symbol SYMBOL ABSTSYNTAX

struct

structure Symbol SYMBOL Symbol

datatype term

fun idname term

end

functor SymbolTableFun Symbol SYMBOL SYMBOLTABLE struct

THE MODULES SYSTEM IN PRACTICE

structure Symbol SYMBOL Symbol

type entry

type table

fun mktable

fun lookupsymtable

end

signature PARSERPIECES

sig

structure SymbolTable SYMBOLTABLE

structure AbstSyntax ABSTSYNTAX

end

functor ParserFun Pieces PARSERPIECES PARSER

struct

structure AbstSyntax ABSTSYNTAX PiecesAbstSyntax

structure SymbolTable SYMBOLTABLE PiecesSymbolTable

val symtable SymbolTablemktable

fun parsestr

SymbolTablelookupAbstSyntaxidnamet symtable

end

The signature PARSER PIECES is the signature of the two comp onents on

which the parser dep ends the symb ol table and the abstract syntax The

functor ParserFun dep ends on such a pair in order to construct a parser The

functor SymbolFun takes no arguments since it has no dep endent structures

in our setup

The system is built up from these functors by the following sequence of

declarations You should b e able to convince yourself that they result in the

same static conguration that we dened ab ove

structure Symbol SYMBOL SymbolFun

structure Pieces PARSERPIECES

struct

structure SymbolTable SYMBOLTABLE SymbolTableFun Symbol

structure AbstSyntax ABSTSYNTAX AbstSyntaxFun Symbol

end

structure Parser PARSER ParserFun Pieces

We have glossed over a problem with ParserFun however Recall that we

said that the function parse dened in Parser is typ e correct only by virtue

CHAPTER THE MODULES SYSTEM

of the fact that SymbolTable and AbstSyntax have the same substructure

Symbol and hence the same typ e of symb ols Now in ParserFun the function

parse knows only the signatures of these two structures and not that they

are implemented in a compatible way Therefore the compiler is forced to

reject ParserFun and our p olicy of using functors to supp ort mo dularity

app ears to b e in trouble

There is a way around this called the sharing specication The idea is

to attach a set of equations to the signature PARSER PIECES that guarantees

that only a compatible pair of symb ol table and abstract syntax structures

can b e passed to ParserFun Here is a revised denition of PARSER PIECES

that expresses the requisite sharing information

signature PARSERPIECES

sig

structure SymbolTable SYMBOLTABLE

structure AbstSyntax ABSTSYNTAX

sharing SymbolTableSymbol AbstSyntaxSymbol

end

The sharing clause ensures that only compatible pairs of symb ol table and

abstract syntax mo dules may b e packaged together as PARSER PIECES where

compatible means having the same Symbol mo dule Using this revised

signature the declaration of ParserFun is now legal and can b e used to

construct the conguration of structures that we describ ed ab ove

There are in general two forms of sharing sp ecication one for typ es and

one for structures In the ab ove example we used a structure sharing sp eci

cation to insist that two comp onents of the parameters b e equal structures

Two structures are equal if and only if they result from the same evalua

tion of the same struct expression or functor application For example the

following attempt to construct an argument for ParserFun fails b ecause the

sharing sp ecication is not satised

structure Pieces PARSERPIECES

struct

structure SymbolTable SymbolTableFun SymbolFun

structure AbstSyntax AbstSyntaxFun SymbolFun end

THE MODULES SYSTEM IN PRACTICE

The problem is that each application of SymbolFun yields a distinct structure

and therefore SymbolTable and AbstSyntax fail the compatibility check

The second form of sharing sp ecication is b etween typ es For example

the following version of PARSER PIECES might suce if the only imp ortant

p oint is that the typ e of symb ols b e the same in b oth the symb ol table

mo dule and the abstract syntax mo dule

signature PARSERPIECES

sig

structure SymbolTable SYMBOLTABLE

structure AbstSyntax ABSTSYNTAX

sharing SymbolTableSymbolsymbol AbstSyntaxSymbolsymbol

end

Typ e equality is similar to structure equality in that two data typ es are equal

if and only if they result from the same evaluation of the same declaration

So for example if we have two syntactically identical data typ e declarations

the typ es they dene are distinct

Returning to our motivating example supp ose that we wish to x a bug

in the symb ol manipulation routines How then is our program to b e re

constructed to reect the change First we x the bug in SymbolFun and

reevaluate the functor binding for SymbolFun Then we rep eat the ab ove

sequence of functor applications in order to rebuild the system with the new

symb ol routines The other functors neednt b e recompiled only reapplied

Chapter

InputOutput

ML provides a small collection of inputoutput primitives for p erforming sim

ple character IO to les and terminals The fundamental notion in the ML

IO system is the character stream a nite or innite sequence of characters

There are two typ es of stream instream for input streams and outstream

for output streams An input stream receives its characters from a producer

typically a terminal or disk le and an output stream sends its characters

to a consumer also often a terminal or disk le A stream is initialized by

connecting it to a pro ducer or consumer Input streams may or may not

have a denite end but in the case that they do ML provides primitives for

detecting this condition

The fundamental IO primitives are packaged into a structure BasicIO

with signature BASICIO dened as follows

signature BASICIO sig

Types and exceptions

type instream

type outstream

exception iofailure string

Standard input and output streams

val stdin instream

val stdout outstream

Stream creation

val openin string instream

val openout string outstream

Operations on input streams

val input instream int string

val lookahead instream string

val closein instream unit

val endofstream instream bool

Operations on output streams

val output outstream string unit

val closeout outstream unit

end

BasicIO is implicitly opend by the ML system so these identiers may b e

used without a qualied name

The typ e instream is the typ e of input streams and the typ e outstream

is the typ e of output streams The exception io failure is used to represent

all of the errors that may arise in the course of p erforming IO The value

asso ciated with this exception is a string representing the typ e of failure

typically some form of error message

in and the outstream std out are automatically con The instream std

1

nected to the users terminal

The open in and open out primitives are used to asso ciate a disk le

with a stream The expression open ins creates a new instream whose

pro ducer is the le named s and returns that stream as value If the le

named by s do es not exist the exception io failure is raised with value

Cannot open s Similarly open outs creates a new outstream whose

consumer is the le s and returns that stream

The input primitive is used to read characters from a stream Evaluation

of inputsn causes the removal of n characters from the input stream s

If fewer than n characters are currently available then the ML system will

2

wait until they b ecome available from the pro ducer asso ciated with s If the

1

Under UNIX they are actually connected to the ML pro cesss standard input and

standard output les which may or may not b e a terminal

2

The exact denition of available is implementationdep endent For instance op er

ating systems typically buer terminal input on a linebyline basis so that no characters

CHAPTER INPUTOUTPUT

end of stream is reached while pro cessing an input fewer than n characters

may b e returned In particular input from a closed stream returns the null

string The function lookaheads returns the next character on instream

s without removing it from the stream Input streams are terminated by the

close in op eration It is not ordinarily necessary to close input streams but

in certain cases it is desirable to do so due to host system limitations The

end of an input stream is detected by end of stream a derived form that is

dened as follows

val endofstreams lookaheads

Characters are written to an outstream with the output primitive The

string argument consists of the characters to b e written to the given outstream

out is used to terminate an output stream Any further The function close

attempts to output to a closed stream cause io failure to b e raised with

value Output stream is closed

In addition to the basic set of IO primitives dened ab ove ML also

provides a few extended op erations One is called input line of typ e

instreamstring which reads an entire line from the given input stream

A line is dened to b e a sequence of characters terminated by a newline char

acter n Another is the function use of typ e string listunit which

takes a list of le names which are to b e loaded into the ML system as

though they had b een typ ed at top level This primitive is very useful for

interacting with the host system particularly for large programs

Exercise Modify your towers of hanoi program so that it prints out

the sequence of moves

Exercise Write a function to print out your solutions to the queens

problem in the form of a chess board

are available until an entire line has b een typ ed

Bibli ography

Harold Ab elson and Gerald Sussman Structure and Interpretation of

Computer Programs The MIT Press

Ro d Burstall David MacQueen and Donald Sannella HOPE An Ex

p erimental Applicative Language Edinburgh University Internal Rep ort

CSR

Luca Cardelli ML under UNIX ATT Bell Lab oratories

Michael Gordon Robin Milner and Christopher Wadsworth Edinburgh

LCF SpringerVerlag Lecture Notes in Computer Science vol

Rob ert Harp er David MacQueen and Robin Milner Standard ML Ed

inburgh University Internal Rep ort ECSLFCS March

David MacQueen Mo dules for Standard ML in

Robin Milner Mads Tofte and Rob ert Harp er The Denition of Stan

dard ML MIT Press

App endix A

Answers

Answer

Unbound value identifier x

val x int

val y int

val z int

int

Answer

The computer would match hdtlnil against

Eatthewalnutnil The lists are of dierent length

so the pattern matching would fail

Answer

bx

x or x

x

Answer

local val pi

in fun circumference r pi r

fun area r pi r r

end

Answer

fun abs x if x then x else x

Answer

To evaluate factn the system must evaluate newifnfactn

The arguments to this function must b e evaluated b efore the call

to the function This involves evaluating factn even when

n The function will therefore lo op

Answer

This is an inecient denition of a function to reverse the order

of the elements in a list

Answer

fun isperfect n

let fun addfactors

addfactorsm

if n mod m

then m addfactorsm else addfactorsm

in n orelse addfactorsn div n end

Answer

fun cons h t ht

fun powerset

powersetht

let val pst powerset t in map cons h pst pst end

Answer

fun cc

cc

ccamount kinds as ht

if amount then

else ccamounthkinds ccamount t

fun countchange coins amount ccamount coins

APPENDIX A ANSWERS

Answer

fun nthl l nthnht nthnt

fun countchange coins sum

let fun initialtable

initialtable ht initialtable t

fun countamounttable

let fun countusingl l

countusinghtht

let val t as c

countusingtt

val diff amount h

val cnt c if diff then

else if diff then

else hdnthhh

in cntht

end

in if amount sum then hdhd table

else countamountcountusingcoinstable

end

in count initialtable coins end

Answer

local

fun movediskfrom to from to

fun transferfrom to spare movediskfrom to

transferfrom to spare n

transferfrom spare to n

movediskfrom to

transferspare to from n

in

fun towerofhanoin transferABCn end

An alternative solution that explicitly mo dels the disks and checks for illegal

moves could b e written as follows

local

fun inclmn if mn then else minclmn

fun movediskffhfl t spare

ffl tfh spare

movediskffhfl ttl as thtt spare

if fh int th then error Illegal move

else ffl tfhtl spare

fun transferfrom to spare movediskfrom to spare

transferfrom to spare n

let val fst transferfrom spare to n

val fts movediskf t s

val stf transfers t f n

in fts end

in

fun towerofhanoin

transferAinclnBCn

end

Answer

fun samefrontieremptyempty true

samefrontierleaf x leaf y x y

samefrontiernodeemptyt nodeemptyt

samefrontiertt

samefrontiernodeleaf xt nodeleaf yt

x y andalso samefrontiertt

samefrontiert as node t as node

samefrontieradjust t adjust t

samefrontier false

and adjustx as nodeempty x

adjustx as nodeleaf x

adjustnodenodettt adjustnodetnodett

APPENDIX A ANSWERS

An alternative solution using exceptions section is given b elow

fun samefrontiertreetree

let exception samefringe unit

fun checkelempty empty restt restt

checkelleaf x leaf y restt

if x y then restt else raise samefringe

checkelel nodelr restt

checkelel l rrestt

checkel raise samefringe

fun check raise samefringe

checkempty tree

checkelempty hd tree tl tree

checkl as leafel tree

checkell hd tree tl tree

checknodett tree

checkt checkt tree

in nullchecktreetree handle samefringe false

end

Answer

abstype a set set of a list

with val emptyset set

fun singleton e set e

fun unionset l set l setll

fun membere set false

membere set ht

e h orelse membere set t

fun intersectionset s set

intersectionsetht s

let val tset as set tl intersectionset t s

in if memberhs then sethtl else tset end

end

Answer

abstype a set set of a list

eq a a bool

lt a a bool

with fun emptyset ops set ops

fun singletone ops sete ops

fun membere set leqlt

let fun find false

find ht

if eqe h then true

else if lte h then false

else findt

in find l end

fun unionsetlops as eqlt setl

let fun mergel l

mergel l

mergel as ht l as ht

if eqhh then hmergett

else if lthh then hmergetl

else hmergelt

in setmergellops end

fun intersectsetlops as eqlt setl

let fun interl

interl

interl as ht l as ht

if eqhh then hintertt

else if lthh then intertl

else interlt

in setinterllops end

end

Answer

The exception b ound to the outer exn is distinct from that

b ound to the inner exn thus the exception raised by f

APPENDIX A ANSWERS

with excepted value could only b e handled by a han

dler within the scop e of the inner exception declaration it

will not b e handled by the handler in the program which

exp ects a b o olean value So this exception will b e rep orted

at top level This would apply even if the outer exception

declaration were also of typ e int the two exceptions b ound

to exn would still b e distinct

If pv is false but qv is true the recursive call will evalu

ate fbv Then if b oth pbv and qbv are false this

evaluation will raise an exn exception with excepted value

dbv But this packet will not b e handled since the ex

ception of the packet is that which is b ound to exn by the

inner not outer evaluation of the exception declaration

Answer

fun threatxinty xy

x x

orelse y y

orelse xy xy

orelse xy xy

fun conflictpos false

conflictpos ht threatposh orelse conflictpost

exception conflict

fun addqueeninplace

let fun tryqueenj

if conflictij place then raise conflict

else if in then ijplace

else addqueeninijplace

handle conflict

if j n then raise conflict else tryqueenj

in tryqueen end

fun queensn addqueen n

Answer

exception conflict int int list list

fun addqueeninplaceplaces

let fun tryqueenj places

if conflictij place

then raise conflict with places

else if in

then raise conflict with ijplaceplaces

else addqueeninijplaceplaces

handle conflict with newplaces

if j n then raise conflict with newplaces

else tryqueenj newplaces

in tryqueenplaces end

fun allqueensn

addqueenn handle conflict with places places

Answer

val primes

let fun nextprimenl

let fun checkn n

checknht

if n mod h then checknl

else checknt

in checknl end

fun primstream nl

mkstreamfn let val n nextprimenl

in n primstreamnnl end

in primstream end

Answer

abstype a stream stream of unit a a stream ref

with fun nextstream f

let val res f in f fn res res end

fun mkstream f streamref f end

APPENDIX A ANSWERS

An alternative solution that is more verb ose but p erhaps clearer is given

b elow

abstype a stream stream of a streamelmt ref

and a streamelmt uneval of unit a a stream

eval of a a stream

with fun nextstreamr as refunevalf

let val res f in r eval res res end

nextstreamrefevalr r

fun mkstream f streamrefuneval f

end

Answer

abstype a stream stream of unit a a stream ref

with local exception endofstream in

fun nextstream f

let val res f

in f fn res res

end

fun mkstream f

streamref f

fun emptystream

streamreffn raise endofstream

fun endofstreams

next s false handle endofstream true

end

end

Answer

structure INTORD ORD

struct

type t int

val le int int bool op

end

structure RSORD ORD struct

type t real string

fun lerreal sstring rs

r r orelse r r andalso s s

end

Answer

The signature requires the typ e of n to b e an a list ie if a

structure T matches SIG then trueTn should b e legitimate

This cannot b e the case if we were allowed to supply a value for

n with a more sp ecic typ e such as int list Therefore the

declaration is disallowed

Answer

sig type a t val x bool int end

and

sig type a t val x bool t end

Answer

Only sig type t val f t t end satises the signature

closure rule the others contain free references to the structure

A

Answer

signature STACK

sig

datatype a stack nilstack push of a a stack

exception pop unit and top unit

val empty a stack bool

and pop a stack a stack

and top a stack a

end

structure Stack STACK struct

APPENDIX A ANSWERS

datatype a stack nilstack push of a a stack

exception pop unit and top unit

fun emptynilstack true empty false

fun poppushs s pop raise pop

fun toppushx x top raise top

end

Answer

structure Exp EXP

struct

datatype id Id of string

datatype exp Var of id

App of id exp list

end

signature SUBST

sig

structure E EXP

type subst

val subst Eid Eexp list subst

val lookup Eid subst Eexp

val substitute subst Eexp Eexp

end

structure Subst SUBST

struct

structure E Exp

type subst Eid Eexp list

fun substx x

fun lookupid EVar id

lookupid idel

if id id then e else lookupidl

fun substitute s EVar id lookupids

substitute s EAppidargs

EAppid map substitute s args

end

Answer

abstraction Rect COMPLEX

struct

datatype complex rect of real real

exception divide unit

fun rectangular real imag rectreal imag

fun plusrectab rectcd rectacbd

fun minusrectab rectcd rectacbd

fun timesrectab rectcd rectac bd ad bc

fun dividerectab rectcd

let val cd cc dd

in if cd

then raise divide

else rectac bdcd bc adcd end

fun eqrectab rectcd ac andalso bd

fun realpartrecta a

fun imagpartrectb b

end

Answer

signature ORD

sig

type elem

val eq elem elem bool

val le elem elem bool

end

signature SET

sig

type set

structure O ORD

val emptyset set

val singleton Oelem set

val member Oelem set bool

val union set set set

val intersect set set set end

APPENDIX A ANSWERS

functor SetO ORD SET

struct

datatype set set of Oelem list

structure O O

val emptyset set

fun singleton e set e

fun membere set l

let fun find false

find ht

if Oeqe h then true

else if Olte h then false

else findt

in find l end

fun unionset l set l

let fun mergel l

mergel l

mergel as ht l as ht

if Oeqhh then hmergett

else if Olthh then hmergetl

else hmergelt

in setmergell end

fun intersectset l set l

let fun interl

interl

interl as ht l as ht

if Oeqhh then hintertt

else if Olthh then intertl

else interlt

in setinterll end

end

Answer

local

fun inclmn if mn then else minclmn

fun movediskffhfl ttl spare

if notnull tl andalso fh int hd tl

then error Illegal move

else

outputstdout

Move makestring fh

from f to t n

ffl tfhtl spare

fun transferfrom to spare movediskfrom to spare

transferfrom to spare n

let val fst transferfrom spare to n

val fts movediskf t s

val stf transfers t f n

in fts end

in

fun towerofhanoin

transferAinclnBCn

end

Answer

fun printboardplacens

let fun presentpos intint false

presentpos ht posh orelse presentpost

fun printcolumnij

if j n then

else

outputsif presentij place

then Q else

printcolumnij

fun printrowi

if i n then

else printcolumni

outputsn

printrowi

APPENDIX A ANSWERS

in printrow outputsn end