A Simpli ed AccountofPolymorphic References

Rob ert Harp er

June, 1993

CMU-CS-93-169

Scho ol of Computer Science

Carnegie Mellon University

Pittsburgh, PA 15213

Abstract

A pro of of the soundness of Tofte's imp erativetyp e discipline with resp ect to a structured op erational

semantics is given. The presentation is based on a semantic formalism that combines the b ene ts of the

approaches considered byWrightandFelleisen, and byTofte, leading to a particularly simple pro of of

soundness of Tofte's typ e discipline.

This researchwas sp onsored by the Defense Advanced Research Pro jects Agency, CSTO, under the title \The Fox

Pro ject: Advanced Development of Systems Software", ARPA Order No. 8313, issued by ESD/AVS under Contract

No. F19628-91-C-0168.

The views and conclusions contained in this do cument are those of the author and should not b e interpreted as

representing ocial p olicies, either expressed or implied, of the Defense Advanced Research Pro jects Agency or the

U.S. Government.

Keywords: lamb da calculus, typ e theory, , references and assignment

1 Intro duction

The extension of Damas and Milner's p olymorphic typ e system for pure functional programs [2] to accomo-

date mutable cells has proved to b e problematic. The nave extension of the pure language with op erations

to allo cate a cell, and to retrieve and mo dify its contents is unsound [11]. The problem has received consid-

erable attention, notably by Damas [3], Tofte [10,11], and LeroyandWeiss [7]. Tofte's solution is based on

a greatest xed p oint construction to de ne the semantic typing relation [11] (see also [8]). This metho d has

b een subsequently used by Leroy and Weiss [7]andTalpin and Jouvelot [9]. It was subsequently noted by

Wright and Felleisen [13] that the pro of of soundness can b e substantially simpli ed if the argument is made

by induction on the length of an execution sequence, rather than on the structure of the typing derivation.

Using this metho d they establish the soundness of a restriction of the language to require that let-b ound

expressions b e values. In this note we present an alternative pro of of the soundness of Tofte's imp erativetyp e

discipline using a semantic framework that is intermediate b etween that of WrightandFelleisen and that of

Tofte. The formalism considered admits a very simple and intuitively app ealing pro of of the soundness of

Tofte's typ e discipline, and may b e of some use in subsequent studies of this and related problems.

2 A Language with Mutable Data Structures

The syntax of our illustrative language is given by the following grammar:

expressions e ::= x j l j unit j ref e j e := e j ! e j x:e j e e j let x be e in e

1 2 1 2 1 2

values v ::= x j l j unit j x:e

The meta-variable x ranges over a countably in nite set of variables, and the meta-variable l ranges over

a countably in nite set of locations. In the ab ove grammar unit is a constant, ref and ! are one-argument

primitive op erations, and :=isatwo-argument primitive op eration. Capture-avoiding substitution of a value

v for a free variable x in an expression e is written [v=x]e.

The syntax of typeexpressions is given by the following grammar:

monotypes  ::= t j unit j  ref j  !

1 2

polytypes  ::=  j8t:

The meta-variable t ranges over a countably in nite set of type variables. The symbol unit is a distinguished

base typ e, and typ es of the form  ref stand for the typ e of references to values of typ e  . The set FTV( )

of typ e variables o ccurring freely in a p olytyp e  is de ned as usual, as is the op eration of capture-avoiding

substitution of a monotyp e  for free o ccurrences of a typ e variable t in a p olytyp e  , written [=t] .

A variable typing is a function mapping a nite set of variables to p olytyp es. The meta-variable ranges

over variable typings. The p olytyp e assigned to a variable x in a variable typing is (x), and the variable

0

typing [x: ] is de ned so that the variable x is assigned the p olytyp e  ,anda variable x 6= x is assigned

0

the p olytyp e (x ). The set of typ e variables o ccuring freely in a variable typing , written FTV ( ), is

S

de ned to b e FTV ( (x)). A location typing is a function mapping a nite set of lo cations to

x2dom( )

monotyp es. The meta-variable  ranges over lo cation typings. Notational conventions similar to those for

variable typings are used for lo cation typings.

Polymorphic typ e assignment is de ned by a set of rules for deriving judgements of the form ; ` e :  ,

with the intended meaning that the expression e has typ e  under the assumption that the lo cations in

e have the monotyp es ascrib ed by , and the free variables in e have the p olytyp es ascrib ed by . The

rules of inference are given in Table 1. These rules make use of two auxiliary notions. The polymorphic

0

instance relation    is de ned to hold i  is a p olytyp e of the form 8t : ...:8t : and  is a monotyp e

1 n

0

of the form [ ; ...; =t ; ...;t ] ,where  , ...,  are monotyp es. This relation is extended to p olytyp es

1 n 1 n 1 n

0 0

by de ning    i    whenever    . The polymorphic generalization of a monotyp e  relative

to a lo cation typing  and variable typing , Close ( ), is the p olytyp e 8t : ...:8t : , where FTV ( ) n

; 1 n

(FTV () [ FTV( )) = f t ; ...;t g. As a notational convenience, we sometimes write  ` e :  for ; ;` e : 

1 n

and Close ( ) for Close ( ).



;;

The following lemma summarizes some imp ortant prop erties of the typ e system: 1

; ` x :  ( (x)   ) (var)

; ` l :  ref ((l )= ) (loc)

; ` unit : unit (triv)

; ` e : 

(ref)

; ` ref e :  ref

; ` e :  ref ; ` e : 

1 2

(assign)

; ` e := e : unit

1 2

; ` e :  ref

(retrieve)

; ` ! e : 

; [x: ] ` e : 

1 2

(x 62 dom( )) (abs)

; ` x:e :  !

1 2

; ` e :  ! ; ` e : 

1 2 2 2

(app)

; ` e e : 

1 2

; ` e :  ; [x: Close ( )] ` e : 

1 1 ; 1 2 2

(x 62 dom( )) (let)

; ` let x be e in e : 

1 2 2

Table 1: Polymorphic Typ e Assignment 2

 ` v ) v;  (val)

0

 ` e ) v; 

0

(l 62 dom( )) (alloc)

0

 ` ref e ) l;  [l :=v ]

0

 ` e ) l; 

(contents)

0 0

 ` ! e )  (l );

 ` e ) l;   ` e ) v; 

1 1 1 2 2

(update)

 ` e := e ) unit; [l :=v ]

1 2 2

0 0 0

 ` e ) x:e ;  ` e ) v ;  ` [v =x]e ) v; 

1 1 1 2 2 2 2 2

1 1

(apply)

0

 ` e e ) v; 

1 2

 ` e ) v ;  ` [v =x]e ) v ;

1 1 1 1 1 2 2 2

(bind)

 ` let x be e in e ) v ;

1 2 2 2

Table 2: Op erational Semantics for References

Lemma 2.1

1. (Weakening) Suppose that ; ` e :  .Ifl 62 dom(), then [l : ]; ` e :  ,andifx 62 dom( ), then

; [x: ] ` e :  .

0 0 0 0

2. (Substitution) If ; ` v :  and ; [x: ] ` e :  , and if Close ( )   , then ; ` [v=x]e : 

;

0 0

3. (Specialization) If ; ` e :  and Close ( )   , the ; ` e :  .

;

The pro ofs are routine inductions on the structure of typing derivations. Substitution is stated only for values,

in recognition of the fact that in a call-by-value language only values are ever substituted for variables during

evaluation.

3 Semantics and Soundness

A memory  is a partial function mapping a nite set of lo cations to values. The contents ofalocation

l 2 dom()isthevalue (l ), and we write [l :=v ] for the memory which assigns to lo cation l the value v

0 0

and to a lo cation l 6= l the value (l ). Notice that the result may either b e an update of  (if l 2 dom())

or an extension of  (if l 62 dom()).

The op erational semantics of the language is de ned by a collection of rules for deriving judgements of

0

the form  ` e ) v;  ,withtheintended meaning that the closed expression e,whenevaluated in memory

0

, results in value v and memory  . The rules of the semantics are given in Table 2.

The typing relation is extended to memories and lo cation typings by de ning  :  to hold i dom()=

dom(), and for every l 2 dom(),  ` l : (l ). Notice that the typing relation is de ned so that (l )

may mention lo cations whose typ e is de ned by . (Compare Tofte's account [11].) For example, supp ose

that  is the memory sending lo cation l to x:x + 1, and lo cation l to y :(! l ) y + 1, and supp ose that

0 1 0

 is the lo cation typing assigning the typ e int !int to b oth l and l .Theveri cation that  :  requires

0 1

checking that  ` y :(! l ) y +1 : int!int,which requires determining the typ e assigned to lo cation l by .

0 0

0

As p ointed out byTofte [11], the memory  which assigns (l ) to b oth l and l can arise as a result of

1 0 1

0

an assignment statement. Toverify that  :  requires checking that  ` (l ):(l ), which itself relies

0 0

on (l )! Tofte employs a \greatest xed p oint" construction to account for this p ossibility, but no such

0

machinery is needed here. This is the principal advantage of our formalism. (A similar advantage accrues

to WrightandFelleisen's approach [13]andwas suggested to us by them.)

Wenow turn to the question of soundness of the typ e system. 3

0 0 0 0 0

Conjecture 3.1 If  ` e ) v;  ,and  ` e :  , with  : , then there exists  such that    ,  :  ,

0

and  ` v :  .

The intention is to capture the preservation of typing under evaluation, taking account of the fact that

evaluation may allo cate storage, and hence intro duce \new" lo cations that are not governed by the initial

0

lo cation typing .Thus the lo cation typing  is to b e constructed as a function of the evaluation of e,as

will b ecome apparent in the sequel.

0

A pro of by induction on the structure of the derivation of  ` e ) v;  go es through for all cases but bind.

0

For example, consider the expression ref e.Wehave  ` ref e ) l;  [l :=v ]by alloc,  ` ref e :  ref by ref,

0

and  : . It follows from the de nition of alloc that  ` e ) v;  , and from the de nition of ref

0 0 0 0

that  ` e :  .Soby induction there is a lo cation typing    such that  :  and  ` v :  .To

00 0

complete the pro of we need only check that the lo cation typing  =  [l := ] satis es the conditions that

0 00 00

 [l :=v ]: and that  ` l :  ref , b oth of which follow from the assumptions and Lemma 2.1(1). The

other cases follow a similar pattern, with the exception of rule bind.To see where the pro of breaks down,

0

let us consider the obvious attempt to carry it through. Our assumption is that  ` let x be e in e ) v; 

1 2

by bind,  ` let x be e in e :  by let,and : . It follows that  ` e ) v ; for some value v and

1 2 2 1 1 1 1

0

some memory  , and that  ` [v =x]e ) v;  .We also havethat ` e :  for some monotyp e  ,

1 1 1 2 1 1 1

and that ; x: Close ( ) ` e :  for some monotyp e  . By induction there is a lo cation typing   

 1 2 2 2 1

such that  :  and  ` v :  .To complete the pro of it suces to show that  ` [v =x]e :  .This

1 1 1 1 1 1 1 2 2

would follow from the typing assumptions governing v and e by an application of Lemma 2.1(2), provided

1 2

that we could show that Close ( )  Close ( ). But this holds i FTV ( )  FTV (), whichdoesnot

 1  1 1

1

necessarily obtain. For example, if e = ref (x:x) and  has the form (t!t) ref ,where t do es not o ccur in

1 1

, then Close ( ) generalizes t, whereas Close ( ) do es not. (This observation is due to Tofte, who also

 1  1

1

go es on to provide a counterexample to the theorem [11].)

The simplest approach to recovering soundness is to preclude p olymorphic generalization on the typ e of

a let-b ound expression unless that expression is a value. Under this restriction the pro of go es through, for

0 0 0 0

we can readily see that if  ` v ) v ; , then v = v and  = , and that if  :  and  :  ,with   ,

1 1

then  = . Consequently, Close ( ) = Close ( ) in the ab ove pro of sketch, and this is sucientto

1  1  1

1

0

complete the pro of. Following Tofte [11], we deem an expression e non-expansive i  ` e ) v;  implies

0

 = . By restricting the bind rule so that e is non-expansive, we ensure that  = , which suces for

1 1

the pro of. Unfortunately in anyinteresting language this condition is recursively undecidable, and hence

some conservative approximation must b e used. Tofte cho oses the simple and memorable condition that e

1

b e a (syntactic) value.

The requirement that p olymorphic let's bind values is rather restrictive. Following ideas of MacQueen

(unpublished) and Damas [3], Tofte intro duced a mo di cation to the typ e system that admits a more

exible use of p olymorphism, without sacri cing soundness. Tofte's idea is to employ a marking of typ e

variables so as to maintain the invariantthatifa typ e variable can o ccur in the typ e of a lo cation in the

store, then generalization on that typ e variable is suppressed. The set of typ e variables is divided into two

countably in nite disjoint subsets, the imperative and the applicative typ e variables. A monotyp e is called

imperative i all typ e variables o ccurring within it are imp erative. The typingruleforref is constrained

so that the typ e  of e in rule ref is required to b e imp erative. Polymorphic generalization must preserve

the imp erative/applicative distinction, and p olymorphic instantiation is de ned so that an imp erativetyp e

variable may only b e instantiated to an imp erative monotyp e. In addition a restricted form of generalization,

written AppClose ( ), is de ned similarly to Close ( ), with the exception that only applicativetyp e

;

;

variables are generalized in the result; any imp erativetyp e variables remain free.

With the machinery of applicative and imp erativetyp es in hand, Tofte replaces the bind rule with the

following two rules:

; ` v :  ; [x: Close ( )] ` e : 

1 1 ; 1 2 2

(x 62 dom( )) (bind-val)

; ` let x be v in e : 

1 2 2

; ` e :  ; [x: AppClose ( )] ` e : 

1 1 1 2 2

;

(x 62 dom( )) (bind-ord)

; ` let x be e in e : 

1 2 2 4

Thus if the let-b ound expression is a value, it may b e used p olymorphically without restriction; otherwise

only the applicativetyp e variables may b e generalized.

The idea b ehind these mo di cations is to maintain a conservative approximation to the set of typ e

variables that may o ccur in the typ e of a value stored in memory. Thisisachieved by ensuring that if a typ e

variable o ccurs freely in the memory, then it is imp erative. The converse cannot, of course, b e e ectively

maintained since the lo cation typing in the soundness theorem is computed as a function of the evaluation

trace. Wesay that a lo cation typing is imp erativei thetyp e assigned to every lo cation is imp erative.

0 0 0

Theorem 3.2 If  ` e ) v;  , and  ` e :  , with  :  and  imperative, then thereexists  such that 

0 0 0 0

is imperative,    ,  :  ,and  ` v :  .

0

The pro of pro ceeds by induction on the structure of the derivation of  ` e ) v;  . Consider the evaluation

rule alloc. The restriction on rule ref ensures that if ref e :  ref ,then  is imp erative. Consequently,the

00 0 0

lo cation typing  =  [x :  ] is imp erative since, by supp osition,  is imp erative, and, by induction,  is

imp erative. The signi cance of maintaining the imp erativeinvariant on lo cation typings b ecomes apparent

in the case of the bind rules. The rule bind-val is handled as sketched ab ove: since v is a value, it is

1

non-expansive, consequently  = , which suces for the pro of. The rule bind-ord is handled by observing

1

that regardless of whether  is a prop er extension of  or not, wemust have Close ( )  AppClose ( ),

1  1 1

1 

for if a typ e variable t o ccurs freely in  but not in ,itmust b e (by induction hyp othesis) imp erative,

1

and hence is not generalized in AppClose ( )(by de nition of AppClose ). This is sucient to complete

1



the pro of.

4 Conclusion

Wehave presented a simpli ed pro of of the soundness of Tofte's typ e discipline for combining p olymorphism

and mutable references in ML. The main contribution is the elimination of the need for the maximal xed

p oint argumentusedbyTofte [11]. The metho ds considered here have b een subsequently employed by

Greiner to establish the soundness of the \weak p olymorphism" typ e discipline implemented in the Standard

ML of New Jersey compiler [1]. Our approachwas in uenced by the work of Wright and Felleisen [13]who

pioneered the use of reduction semantics to prove soundness of typ e assignment systems.

Several imp ortant studies of the problem of combining p olymorphic typ e inference and computational

e ects (including mutable references) have b een conducted in recentyears. The interested reader is referred

to the work of Gi ord, Jouvelot and Talpin [6, 9], LeroyandWeiss [7], Wright[12], Hoang, Mitchell, and

Viswanathan [5], and Greiner [4] for further details and references.

The author is grateful to Matthias Felleisen, Andrew Wright, and John Greiner for their comments and

suggestions.

References

[1] Andrew W. App el and David B. MacQueen. Standard ML of New Jersey. In J. Maluszynski and

M. Wirsing, editors, Third Int'l Symp. on Prog. Lang. Implementation and Logic Programming, pages

1{13, New York, August 1991. Springer-Verlag.

[2] Luis Damas and . Principal typ e schemes for functional programs. In Ninth ACM Sym-

posium on Principles of Programming Languages, pages 207{212, 1982.

[3] Luis Manuel Martins Damas. Type Assignment in Programming Languages. PhD thesis, Edinburgh

University, 1985.

[4] John Greiner. Standard ml weak p olymorphism can b e sound. Technical Rep ort CMU{CS{93{160,

Scho ol of Computer Science, Carnegie Mellon University, Pittsburgh, PA, May1993.

[5] My Hoang, John Mitchell, and Ramesh Viswanathan. Standard ML-NJ weak p olymorphism and im-

p erative constructs. In Eighth Symposium on Logic in Computer Science,1993. 5

[6] Pierre Jouvelot and David Gi ord. Algebraic reconstruction of typ es and e ects. In Eighteenth ACM

Symposium on Principles of Programming Languages, pages 303{310, 1991.

[7] Xavier Leroy and Pierre Weis. Polymorphic typ e inference and assignment. In Eighteenth ACM Sym-

posium on Principles of Programming Languages, pages 291{302, Orlando, FL, January 1991. ACM

SIGACT/SIGPLAN.

[8] Robin Milner and Mads Tofte. Co-induction in relational semantics. Technical Rep ort ECS-LFCS-88-

65, Lab oratory for the Foundations of Computer Science, Edinburgh University,Edinburgh, Octob er

1988.

[9] Jean-Pierre Talpin and Pierre Jouvelot. The typ e and e ect discipline. In Seventh Symposium on Logic

in Computer Science, pages 162{173, 1992.

[10] Mads Tofte. Operational Semantics and Polymorphic Type Inference. PhD thesis, Edinburgh University,

1988. Available as Edinburgh University Lab oratory for Foundations of Computer Science Technical

Rep ort ECS{LFCS{88{54.

[11] Mads Tofte. Typ e inference for p olymorphic references. Information and Computation, 89:1{34, Novem-

b er 1990.

[12] Andrew Wright. Typing references by e ect inference. In Proceedings of the European Symposium on

Programming, 1992.

[13] Andrew K. Wright and Matthias Felleisen. A syntactic approachtotyp e soundness. Technical Rep ort

TR91{160, Department of Computer Science, Rice University, July 1991. To app ear, Information and

Computation. 6