A Simpli ed AccountofPolymorphic References
Rob ert Harp er
June, 1993
CMU-CS-93-169
Scho ol of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
Abstract
A pro of of the soundness of Tofte's imp erativetyp e discipline with resp ect to a structured op erational
semantics is given. The presentation is based on a semantic formalism that combines the b ene ts of the
approaches considered byWrightandFelleisen, and byTofte, leading to a particularly simple pro of of
soundness of Tofte's typ e discipline.
This researchwas sp onsored by the Defense Advanced Research Pro jects Agency, CSTO, under the title \The Fox
Pro ject: Advanced Development of Systems Software", ARPA Order No. 8313, issued by ESD/AVS under Contract
No. F19628-91-C-0168.
The views and conclusions contained in this do cument are those of the author and should not b e interpreted as
representing ocial p olicies, either expressed or implied, of the Defense Advanced Research Pro jects Agency or the
U.S. Government.
Keywords: lamb da calculus, typ e theory, functional programming, references and assignment
1 Intro duction
The extension of Damas and Milner's p olymorphic typ e system for pure functional programs [2] to accomo-
date mutable cells has proved to b e problematic. The nave extension of the pure language with op erations
to allo cate a cell, and to retrieve and mo dify its contents is unsound [11]. The problem has received consid-
erable attention, notably by Damas [3], Tofte [10,11], and LeroyandWeiss [7]. Tofte's solution is based on
a greatest xed p oint construction to de ne the semantic typing relation [11] (see also [8]). This metho d has
b een subsequently used by Leroy and Weiss [7]andTalpin and Jouvelot [9]. It was subsequently noted by
Wright and Felleisen [13] that the pro of of soundness can b e substantially simpli ed if the argument is made
by induction on the length of an execution sequence, rather than on the structure of the typing derivation.
Using this metho d they establish the soundness of a restriction of the language to require that let-b ound
expressions b e values. In this note we present an alternative pro of of the soundness of Tofte's imp erativetyp e
discipline using a semantic framework that is intermediate b etween that of WrightandFelleisen and that of
Tofte. The formalism considered admits a very simple and intuitively app ealing pro of of the soundness of
Tofte's typ e discipline, and may b e of some use in subsequent studies of this and related problems.
2 A Language with Mutable Data Structures
The syntax of our illustrative language is given by the following grammar:
expressions e ::= x j l j unit j ref e j e := e j ! e j x:e j e e j let x be e in e
1 2 1 2 1 2
values v ::= x j l j unit j x:e
The meta-variable x ranges over a countably in nite set of variables, and the meta-variable l ranges over
a countably in nite set of locations. In the ab ove grammar unit is a constant, ref and ! are one-argument
primitive op erations, and :=isatwo-argument primitive op eration. Capture-avoiding substitution of a value
v for a free variable x in an expression e is written [v=x]e.
The syntax of typeexpressions is given by the following grammar:
monotypes ::= t j unit j ref j !
1 2
polytypes ::= j8t:
The meta-variable t ranges over a countably in nite set of type variables. The symbol unit is a distinguished
base typ e, and typ es of the form ref stand for the typ e of references to values of typ e . The set FTV( )
of typ e variables o ccurring freely in a p olytyp e is de ned as usual, as is the op eration of capture-avoiding
substitution of a monotyp e for free o ccurrences of a typ e variable t in a p olytyp e , written [=t] .
A variable typing is a function mapping a nite set of variables to p olytyp es. The meta-variable ranges
over variable typings. The p olytyp e assigned to a variable x in a variable typing is (x), and the variable
0
typing [x: ] is de ned so that the variable x is assigned the p olytyp e ,anda variable x 6= x is assigned
0
the p olytyp e (x ). The set of typ e variables o ccuring freely in a variable typing , written FTV ( ), is
S
de ned to b e FTV ( (x)). A location typing is a function mapping a nite set of lo cations to
x2dom( )
monotyp es. The meta-variable ranges over lo cation typings. Notational conventions similar to those for
variable typings are used for lo cation typings.
Polymorphic typ e assignment is de ned by a set of rules for deriving judgements of the form ; ` e : ,
with the intended meaning that the expression e has typ e under the assumption that the lo cations in
e have the monotyp es ascrib ed by , and the free variables in e have the p olytyp es ascrib ed by . The
rules of inference are given in Table 1. These rules make use of two auxiliary notions. The polymorphic
0
instance relation is de ned to hold i is a p olytyp e of the form 8t : ...:8t : and is a monotyp e
1 n
0
of the form [ ; ...; =t ; ...;t ] ,where , ..., are monotyp es. This relation is extended to p olytyp es
1 n 1 n 1 n
0 0
by de ning i whenever . The polymorphic generalization of a monotyp e relative
to a lo cation typing and variable typing , Close ( ), is the p olytyp e 8t : ...:8t : , where FTV ( ) n
; 1 n
(FTV () [ FTV( )) = f t ; ...;t g. As a notational convenience, we sometimes write ` e : for ; ;` e :
1 n
and Close ( ) for Close ( ).
;;
The following lemma summarizes some imp ortant prop erties of the typ e system: 1
; ` x : ( (x) ) (var)
; ` l : ref ((l )= ) (loc)
; ` unit : unit (triv)
; ` e :
(ref)
; ` ref e : ref
; ` e : ref ; ` e :
1 2
(assign)
; ` e := e : unit
1 2
; ` e : ref
(retrieve)
; ` ! e :
; [x: ] ` e :
1 2
(x 62 dom( )) (abs)
; ` x:e : !
1 2
; ` e : ! ; ` e :
1 2 2 2
(app)
; ` e e :
1 2
; ` e : ; [x: Close ( )] ` e :
1 1 ; 1 2 2
(x 62 dom( )) (let)
; ` let x be e in e :
1 2 2
Table 1: Polymorphic Typ e Assignment 2
` v ) v; (val)
0
` e ) v;
0
(l 62 dom( )) (alloc)
0
` ref e ) l; [l :=v ]
0
` e ) l;
(contents)
0 0
` ! e ) (l );
` e ) l; ` e ) v;
1 1 1 2 2
(update)
` e := e ) unit; [l :=v ]
1 2 2
0 0 0
` e ) x:e ; ` e ) v ; ` [v =x]e ) v;
1 1 1 2 2 2 2 2
1 1
(apply)
0
` e e ) v;
1 2
` e ) v ; ` [v =x]e ) v ;
1 1 1 1 1 2 2 2
(bind)
` let x be e in e ) v ;
1 2 2 2
Table 2: Op erational Semantics for References
Lemma 2.1
1. (Weakening) Suppose that ; ` e : .Ifl 62 dom(), then [l : ]; ` e : ,andifx 62 dom( ), then
; [x: ] ` e : .
0 0 0 0
2. (Substitution) If ; ` v : and ; [x: ] ` e : , and if Close ( ) , then ; ` [v=x]e :
;
0 0
3. (Specialization) If ; ` e : and Close ( ) , the ; ` e : .
;
The pro ofs are routine inductions on the structure of typing derivations. Substitution is stated only for values,
in recognition of the fact that in a call-by-value language only values are ever substituted for variables during
evaluation.
3 Semantics and Soundness
A memory is a partial function mapping a nite set of lo cations to values. The contents ofalocation
l 2 dom()isthevalue (l ), and we write [l :=v ] for the memory which assigns to lo cation l the value v
0 0
and to a lo cation l 6= l the value (l ). Notice that the result may either b e an update of (if l 2 dom())
or an extension of (if l 62 dom()).
The op erational semantics of the language is de ned by a collection of rules for deriving judgements of
0
the form ` e ) v; ,withtheintended meaning that the closed expression e,whenevaluated in memory
0
, results in value v and memory . The rules of the semantics are given in Table 2.
The typing relation is extended to memories and lo cation typings by de ning : to hold i dom()=
dom(), and for every l 2 dom(), ` l : (l ). Notice that the typing relation is de ned so that (l )
may mention lo cations whose typ e is de ned by . (Compare Tofte's account [11].) For example, supp ose
that is the memory sending lo cation l to x:x + 1, and lo cation l to y :(! l ) y + 1, and supp ose that
0 1 0
is the lo cation typing assigning the typ e int !int to b oth l and l .Theveri cation that : requires
0 1
checking that ` y :(! l ) y +1 : int!int,which requires determining the typ e assigned to lo cation l by .
0 0
0
As p ointed out byTofte [11], the memory which assigns (l ) to b oth l and l can arise as a result of
1 0 1
0
an assignment statement. Toverify that : requires checking that ` (l ):(l ), which itself relies
0 0
on (l )! Tofte employs a \greatest xed p oint" construction to account for this p ossibility, but no such
0
machinery is needed here. This is the principal advantage of our formalism. (A similar advantage accrues
to WrightandFelleisen's approach [13]andwas suggested to us by them.)
Wenow turn to the question of soundness of the typ e system. 3
0 0 0 0 0
Conjecture 3.1 If ` e ) v; ,and ` e : , with : , then there exists such that , : ,
0
and ` v : .
The intention is to capture the preservation of typing under evaluation, taking account of the fact that
evaluation may allo cate storage, and hence intro duce \new" lo cations that are not governed by the initial
0
lo cation typing .Thus the lo cation typing is to b e constructed as a function of the evaluation of e,as
will b ecome apparent in the sequel.
0
A pro of by induction on the structure of the derivation of ` e ) v; go es through for all cases but bind.
0
For example, consider the expression ref e.Wehave ` ref e ) l; [l :=v ]by alloc, ` ref e : ref by ref,
0
and : . It follows from the de nition of alloc that ` e ) v; , and from the de nition of ref
0 0 0 0
that ` e : .Soby induction there is a lo cation typing such that : and ` v : .To
00 0
complete the pro of we need only check that the lo cation typing = [l := ] satis es the conditions that
0 00 00
[l :=v ]: and that ` l : ref , b oth of which follow from the assumptions and Lemma 2.1(1). The
other cases follow a similar pattern, with the exception of rule bind.To see where the pro of breaks down,
0
let us consider the obvious attempt to carry it through. Our assumption is that ` let x be e in e ) v;
1 2
by bind, ` let x be e in e : by let,and : . It follows that ` e ) v ; for some value v and
1 2 2 1 1 1 1
0
some memory , and that ` [v =x]e ) v; .We also havethat ` e : for some monotyp e ,
1 1 1 2 1 1 1
and that ; x: Close ( ) ` e : for some monotyp e . By induction there is a lo cation typing
1 2 2 2 1
such that : and ` v : .To complete the pro of it suces to show that ` [v =x]e : .This
1 1 1 1 1 1 1 2 2
would follow from the typing assumptions governing v and e by an application of Lemma 2.1(2), provided
1 2
that we could show that Close ( ) Close ( ). But this holds i FTV ( ) FTV (), whichdoesnot
1 1 1
1
necessarily obtain. For example, if e = ref (x:x) and has the form (t!t) ref ,where t do es not o ccur in
1 1
, then Close ( ) generalizes t, whereas Close ( ) do es not. (This observation is due to Tofte, who also
1 1
1
go es on to provide a counterexample to the theorem [11].)
The simplest approach to recovering soundness is to preclude p olymorphic generalization on the typ e of
a let-b ound expression unless that expression is a value. Under this restriction the pro of go es through, for
0 0 0 0
we can readily see that if ` v ) v ; , then v = v and = , and that if : and : ,with ,
1 1
then = . Consequently, Close ( ) = Close ( ) in the ab ove pro of sketch, and this is sucientto
1 1 1
1
0
complete the pro of. Following Tofte [11], we deem an expression e non-expansive i ` e ) v; implies
0
= . By restricting the bind rule so that e is non-expansive, we ensure that = , which suces for
1 1
the pro of. Unfortunately in anyinteresting language this condition is recursively undecidable, and hence
some conservative approximation must b e used. Tofte cho oses the simple and memorable condition that e
1
b e a (syntactic) value.
The requirement that p olymorphic let's bind values is rather restrictive. Following ideas of MacQueen
(unpublished) and Damas [3], Tofte intro duced a mo di cation to the typ e system that admits a more
exible use of p olymorphism, without sacri cing soundness. Tofte's idea is to employ a marking of typ e
variables so as to maintain the invariantthatifa typ e variable can o ccur in the typ e of a lo cation in the
store, then generalization on that typ e variable is suppressed. The set of typ e variables is divided into two
countably in nite disjoint subsets, the imperative and the applicative typ e variables. A monotyp e is called
imperative i all typ e variables o ccurring within it are imp erative. The typingruleforref is constrained
so that the typ e of e in rule ref is required to b e imp erative. Polymorphic generalization must preserve
the imp erative/applicative distinction, and p olymorphic instantiation is de ned so that an imp erativetyp e
variable may only b e instantiated to an imp erative monotyp e. In addition a restricted form of generalization,
written AppClose ( ), is de ned similarly to Close ( ), with the exception that only applicativetyp e
;
;
variables are generalized in the result; any imp erativetyp e variables remain free.
With the machinery of applicative and imp erativetyp es in hand, Tofte replaces the bind rule with the
following two rules:
; ` v : ; [x: Close ( )] ` e :
1 1 ; 1 2 2
(x 62 dom( )) (bind-val)
; ` let x be v in e :
1 2 2
; ` e : ; [x: AppClose ( )] ` e :
1 1 1 2 2
;
(x 62 dom( )) (bind-ord)
; ` let x be e in e :
1 2 2 4
Thus if the let-b ound expression is a value, it may b e used p olymorphically without restriction; otherwise
only the applicativetyp e variables may b e generalized.
The idea b ehind these mo di cations is to maintain a conservative approximation to the set of typ e
variables that may o ccur in the typ e of a value stored in memory. Thisisachieved by ensuring that if a typ e
variable o ccurs freely in the memory, then it is imp erative. The converse cannot, of course, b e e ectively
maintained since the lo cation typing in the soundness theorem is computed as a function of the evaluation
trace. Wesay that a lo cation typing is imp erativei thetyp e assigned to every lo cation is imp erative.
0 0 0
Theorem 3.2 If ` e ) v; , and ` e : , with : and imperative, then thereexists such that
0 0 0 0
is imperative, , : ,and ` v : .
0
The pro of pro ceeds by induction on the structure of the derivation of ` e ) v; . Consider the evaluation
rule alloc. The restriction on rule ref ensures that if ref e : ref ,then is imp erative. Consequently,the
00 0 0
lo cation typing = [x : ] is imp erative since, by supp osition, is imp erative, and, by induction, is
imp erative. The signi cance of maintaining the imp erativeinvariant on lo cation typings b ecomes apparent
in the case of the bind rules. The rule bind-val is handled as sketched ab ove: since v is a value, it is
1
non-expansive, consequently = , which suces for the pro of. The rule bind-ord is handled by observing
1
that regardless of whether is a prop er extension of or not, wemust have Close ( ) AppClose ( ),
1 1 1
1
for if a typ e variable t o ccurs freely in but not in ,itmust b e (by induction hyp othesis) imp erative,
1
and hence is not generalized in AppClose ( )(by de nition of AppClose ). This is sucient to complete
1
the pro of.
4 Conclusion
Wehave presented a simpli ed pro of of the soundness of Tofte's typ e discipline for combining p olymorphism
and mutable references in ML. The main contribution is the elimination of the need for the maximal xed
p oint argumentusedbyTofte [11]. The metho ds considered here have b een subsequently employed by
Greiner to establish the soundness of the \weak p olymorphism" typ e discipline implemented in the Standard
ML of New Jersey compiler [1]. Our approachwas in uenced by the work of Wright and Felleisen [13]who
pioneered the use of reduction semantics to prove soundness of typ e assignment systems.
Several imp ortant studies of the problem of combining p olymorphic typ e inference and computational
e ects (including mutable references) have b een conducted in recentyears. The interested reader is referred
to the work of Gi ord, Jouvelot and Talpin [6, 9], LeroyandWeiss [7], Wright[12], Hoang, Mitchell, and
Viswanathan [5], and Greiner [4] for further details and references.
The author is grateful to Matthias Felleisen, Andrew Wright, and John Greiner for their comments and
suggestions.
References
[1] Andrew W. App el and David B. MacQueen. Standard ML of New Jersey. In J. Maluszynski and
M. Wirsing, editors, Third Int'l Symp. on Prog. Lang. Implementation and Logic Programming, pages
1{13, New York, August 1991. Springer-Verlag.
[2] Luis Damas and Robin Milner. Principal typ e schemes for functional programs. In Ninth ACM Sym-
posium on Principles of Programming Languages, pages 207{212, 1982.
[3] Luis Manuel Martins Damas. Type Assignment in Programming Languages. PhD thesis, Edinburgh
University, 1985.
[4] John Greiner. Standard ml weak p olymorphism can b e sound. Technical Rep ort CMU{CS{93{160,
Scho ol of Computer Science, Carnegie Mellon University, Pittsburgh, PA, May1993.
[5] My Hoang, John Mitchell, and Ramesh Viswanathan. Standard ML-NJ weak p olymorphism and im-
p erative constructs. In Eighth Symposium on Logic in Computer Science,1993. 5
[6] Pierre Jouvelot and David Gi ord. Algebraic reconstruction of typ es and e ects. In Eighteenth ACM
Symposium on Principles of Programming Languages, pages 303{310, 1991.
[7] Xavier Leroy and Pierre Weis. Polymorphic typ e inference and assignment. In Eighteenth ACM Sym-
posium on Principles of Programming Languages, pages 291{302, Orlando, FL, January 1991. ACM
SIGACT/SIGPLAN.
[8] Robin Milner and Mads Tofte. Co-induction in relational semantics. Technical Rep ort ECS-LFCS-88-
65, Lab oratory for the Foundations of Computer Science, Edinburgh University,Edinburgh, Octob er
1988.
[9] Jean-Pierre Talpin and Pierre Jouvelot. The typ e and e ect discipline. In Seventh Symposium on Logic
in Computer Science, pages 162{173, 1992.
[10] Mads Tofte. Operational Semantics and Polymorphic Type Inference. PhD thesis, Edinburgh University,
1988. Available as Edinburgh University Lab oratory for Foundations of Computer Science Technical
Rep ort ECS{LFCS{88{54.
[11] Mads Tofte. Typ e inference for p olymorphic references. Information and Computation, 89:1{34, Novem-
b er 1990.
[12] Andrew Wright. Typing references by e ect inference. In Proceedings of the European Symposium on
Programming, 1992.
[13] Andrew K. Wright and Matthias Felleisen. A syntactic approachtotyp e soundness. Technical Rep ort
TR91{160, Department of Computer Science, Rice University, July 1991. To app ear, Information and
Computation. 6