Rank 2 type systems and recursive definitions
Technical Memorandum MIT/LCS/TM-531
Trevor Jim
Laboratory for Computer Science
Massachusetts Institute of Technology
August 1995; revised November 1995
Abstract
We demonstrate an equivalence between the rank 2 fragments of the
polymorphic lambda calculus (System F) and the intersection type
discipline: exactly the same terms are typable in each system. An
immediate consequence is that typability in the rank 2 intersection
system is DEXPTIME-complete. We introduce a rank 2 system combining
intersections and polymorphism, and prove that it types exactly the
same terms as the other rank 2 systems. The combined system suggests
a new rule for typing recursive definitions. The result is a rank 2
type system with decidable type inference that can type some
interesting examples of polymorphic recursion. Finally, we discuss some
applications of the type system in data representation optimizations
such as unboxing and overloading.
Keywords: Rank 2 types, intersection types, polymorphic recursion,
boxing/unboxing, overloading.
1 Introduction
In the past decade, Milner's type inference algorithm for ML has become
phenomenally successful. As the basis of popular programming languages
like Standard ML and Haskell, Milner's algorithm is the preferred method
of type inference among language implementors. And in the theoretical
community, the literature on type inference is dominated by extensions of
ML's let-polymorphism.

545 Technology Square, Cambridge, MA 02139, [email protected]. Supported
by NSF grants CCR-9113196 and CCR-9417382, and ONR Contract N00014-92-J-1310.
In this paper we examine some alternatives to ML that have attracted
surprisingly little attention: the systems of rank 2 types introduced by
Leivant [21]. These systems are slightly more powerful than ML (strictly
more terms can be assigned types), and the increased power comes for free:
the complexity of typability is identical. But the unique feature of the
rank 2 systems that justifies further study is that, in sharp contrast to
other extensions of ML, they abandon let-polymorphism.
We use the expression $\lambda x.xx$ to illustrate the limitations of
let-polymorphism. It is well known that this expression cannot be typed in
ML: the only way for ML to type the self-application $xx$ is by assigning a
polymorphic type to $x$, and ML does not allow abstraction over variables
with polymorphic type. In ML, the only mechanism for introducing variables
of polymorphic type is the let-expression:
    $\mathbf{let}\ x = \lambda y.y$
    $\mathbf{in}\ xx$
This let-expression binds $x$ to the identity function $\lambda y.y$, which
has the polymorphic type $\forall t.t \to t$ in ML. By ML's
let-polymorphism, $x$ is assigned the type $\forall t.t \to t$, which is
sufficient to type $xx$.
The problem with this is that we cannot typecheck the uses of $x$ (the
application $xx$) separately from its definition (the function
$\lambda y.y$). So ML must be extended with a module language in order to
support programming in the large, where it is impractical to require every
polymorphic definition to appear in the same source file as every use.
In contrast, $\lambda x.xx$ is typable in all of the rank 2 systems we
consider. Here are two rank 2 typings:
    $\lambda x.xx : (\forall t.t \to t) \to (\forall s.s \to s)$;
    $\lambda x.xx : ((t \to t) \wedge ((t \to t) \to (t \to t))) \to (t \to t)$.
The first typing says that $\lambda x.xx$ is a function that, when given an
argument with type $t \to t$ for any type $t$, produces a result with type
$s \to s$, for any $s$. The identity function is an appropriate argument.
The second typing says that $\lambda x.xx$ is a function that, when given an
argument having both the types $t \to t$ and $(t \to t) \to (t \to t)$,
produces a result of type $t \to t$. Once again, the identity $\lambda y.y$
is an appropriate argument.
The rank 2 systems we consider are subsystems of two widely studied
type systems, System F and the system of intersection types. System F,
introduced independently by Girard [7] and by Reynolds [28], predates ML
and can type many more terms. A recent result of Wells [34], however, shows
that typability in the system is undecidable, putting type inference out of
reach.
The system of intersection types, introduced independently by Coppo
and Dezani [5] and by Sallé [29], can type even more terms than System F:
it types all and only the strongly normalizing terms (see footnote 1). The
equivalence of typability and strong normalization implies that type
inference, just as with System F, is unattainable.
With the goal of type inference in mind, we seek decidable restrictions of
these type systems. Restrictions based on the rank of types were suggested
by Leivant [21]. The rank of a type can be easily determined by examining it
in tree form. A type is of rank $k$ if no path from the root of the type to
a type constructor of interest (either type intersection '$\wedge$' or type
quantification '$\forall$') passes to the left of $k$ arrows. The types
shown in Figure 1 are rank 2 types, because no path from root to $\wedge$ or
$\forall$ passes to the left of two arrows. But the types shown in Figure 2
go beyond rank 2 (they are rank 3 types). The types given above for
$\lambda x.xx$ are rank 2 types.
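
The path-counting definition of rank can be made concrete with a small sketch. It assumes a tuple encoding of types and a helper name, `least_rank`, that are ours for illustration only. It also assumes the reading under which simple types have rank 0 and a constructor reached after $m$ left turns at arrows forces rank at least $m + 1$.

```python
# Assumed encoding (not from the paper): types are nested tuples
#   ("var", name), ("arrow", dom, cod), ("forall", name, body), ("and", left, right)

def least_rank(ty, lefts=0):
    """Least k such that no path to an 'and'/'forall' node passes to the left of k arrows.

    `lefts` counts how many arrows the current path has already passed to the left of.
    """
    tag = ty[0]
    if tag == "var":
        return 0
    if tag == "arrow":
        # Going into the domain is one more left turn; the codomain is not.
        return max(least_rank(ty[1], lefts + 1), least_rank(ty[2], lefts))
    if tag == "forall":
        return max(lefts + 1, least_rank(ty[2], lefts))
    if tag == "and":
        return max(lefts + 1, least_rank(ty[1], lefts), least_rank(ty[2], lefts))
    raise ValueError(f"unknown type constructor: {tag!r}")

t, s = ("var", "t"), ("var", "s")
tt = ("arrow", t, t)
ident = ("forall", "t", tt)

# The two typings given above for the self-application term.
typing1 = ("arrow", ident, ("forall", "s", ("arrow", s, s)))
typing2 = ("arrow", ("and", tt, ("arrow", tt, tt)), tt)
# A type with a quantifier to the left of two arrows.
beyond = ("arrow", ("arrow", ident, s), s)

print(least_rank(typing1), least_rank(typing2), least_rank(beyond))
```

Under this reading both typings of the self-application term come out as rank 2, while the last type needs rank 3.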
Figure 1: Examples of rank 2 types
Figure 2: Types that go beyond rank 2

Ranks 0 and 1 of Leivant's systems are equivalent to the simply typed
lambda calculus, which can type fewer terms than ML. But starting with
rank 2, the systems can type more terms than ML.

Footnote 1: Without the type constant $\omega$.
Rank 2 of System F, which we call $\Lambda_2$, has received the most study.
McCracken [23] proposed a type inference algorithm for $\Lambda_2$ based on
Leivant's ideas. This algorithm is incorrect. Kfoury and Tiuryn [12] show
that the complexity of typability in $\Lambda_2$ is identical to that of
ML. Kfoury and Wells [16, 17] give a correct type inference algorithm, and
show that ranks 3 and higher in System F are undecidable.
Leivant's original paper is almost the only work on rank 2 of the
intersection type discipline, which we call $I_2$. Leivant sketched a type
inference algorithm for $I_2$, but the algorithm was not formalized and
proved correct until recently [33]. Leivant also conjectured the
undecidability of ranks 3 and higher in the intersection system; to our
knowledge the details of his proof idea have never been verified.
$I_2$ has a significant advantage over $\Lambda_2$: it has principal
typings. This means that for any term $M$, if $M$ is typable in $I_2$, then
there is an $I_2$ typing judgment
    $A \vdash M : \tau$
that represents all of the possible typing judgments for $M$. Other typings
for $M$ can be obtained from the principal typing by simple operations
(substitution and subsumption).
Contributions of the paper
Since $I_2$ has principal typings, and $\Lambda_2$ does not, we believe
$I_2$ deserves more study. The first contribution of this paper is to
develop some of the basic properties of $I_2$. We establish the following
equivalence (see footnote 2): a term is typable in $I_2$ if and only if it
is typable in $\Lambda_2$. An immediate corollary is that typability in
$I_2$ is DEXPTIME-complete, identical to typability in $\Lambda_2$ and ML.
We also consider some variants of $I_2$, and show they are all equivalent
in terms of typability.
The second contribution of this paper is to introduce a new type system,
$P_2$, that combines rank 2 intersection types and top-level quantification
of type variables, as in ML. $P_2$ has principal typings, so it clearly
improves on $\Lambda_2$. Its advantage over $I_2$ is more subtle. The
addition of quantifiers makes types more expressive: the quantifiers
identify generic type variables, that is, type variables which can safely
be instantiated with any type. In particular, this suggests an interesting
type inference algorithm for recursive definitions.
A recursive definition is written in the form $(\mu x\,M)$, and is meant to
denote a program $x$ such that
    $x = M$,
where $M$ may contain some uses of $x$. The standard rule for typing
recursive definitions looks like
    $\dfrac{A \cup \{x : \tau\} \vdash M : \tau}{A \vdash (\mu x\,M) : \tau}$
Most type inference algorithms restrict the type $\tau$ in this rule to be
a simple type. The rule of polymorphic recursion relaxes this restriction
by allowing $\tau$ to be an ML type scheme. This gives a useful increase in
typing power: it can type some natural programs that cannot be typed by the
simple recursion rule. However, polymorphic recursion makes type inference
undecidable [14].
We suggest another way of typing recursive definitions:
    $\dfrac{A \cup \{x : \sigma\} \vdash M : \tau}{A \vdash (\mu x\,M) : \tau}$  where $\tau \le \sigma$.
The rule says that as long as the type $\tau$ of $M$ is more general than
the assumption $\sigma$ on $x$ needed to type $M$, we can deduce $\tau$ as
the type of the recursive definition.
We extend $P_2$ to type recursive definitions in this way. The resulting
system can type many (but not all) of the examples that seem to require
polymorphic recursion. Moreover, the system has principal types and
decidable type inference.

Footnote 2: The equivalence between the rank 2 fragments of System F and
the intersection type discipline has been shown independently by
Yokouchi [35].
Organization of the paper
In §2, we introduce $I_2^s$, a syntax-directed version of $I_2$, and
$\Lambda_2^s$, a syntax-directed version of $\Lambda_2$. The main result is
that a term is typable in one system if and only if it is typable in the
other. An immediate corollary is that typability in $I_2^s$ is
DEXPTIME-complete, the same complexity as in ML and $\Lambda_2^s$. In §3,
we present the type inference algorithm for $I_2^s$. In §4, we discuss some
other definitions of rank 2 intersection type systems, and show their
equivalence with $I_2$. In §5, we define $P_2$, show that it has principal
typings, and give a type inference algorithm. In §6, we discuss various
ways of typing recursive definitions, and we propose an extension of $P_2$
that can type many examples of polymorphic recursion. We discuss
applications of $P_2$ to compilation in §7, and we summarize our results
in §8.
2 Rank 2 type systems
2.1 Preliminaries
We will be defining a number of type systems; here we develop machinery
that will be useful in all of them.
We use $x, y, \ldots$ to range over a countable set of variables, and
$t, s$ to range over a countable set, $\mathit{Tv}$, of type variables. The
terms and types of the systems will vary, but in all cases we use
$\sigma, \tau, \ldots$ to range over types, and $M$, $N$, ... to range over
terms.
The terms of the pure lambda calculus are defined by the following
grammar:
    $M ::= x \mid (M_1 M_2) \mid (\lambda x\,M).$
Unless stated otherwise, terms are considered syntactically equal modulo
renaming of bound variables. We adopt the usual conventions that allow us
to omit parentheses: application associates to the left, and the scope of
an abstraction '$\lambda$' extends to the right as far as possible. We
write $\lambda x_1 \cdots x_n.M$ for
$(\lambda x_1 \cdots (\lambda x_n\,M) \cdots)$.
The types of our systems will all be subsets of the types with
quantification and intersection:
    $\sigma ::= t \mid \sigma_1 \to \sigma_2 \mid \forall t.\sigma \mid \sigma_1 \wedge \sigma_2.$
By convention, '$\to$' associates to the right, so that, e.g.,
$t \to (t \to t)$ may be written more compactly as $t \to t \to t$, and
'$\wedge$' binds more tightly than '$\to$', e.g., $\sigma \wedge \tau \to t$
means $(\sigma \wedge \tau) \to t$. The scope of a quantifier '$\forall$'
extends as far to the right as possible. We write $\forall\vec{t}.\sigma$
for the type
    $\forall t_1.\forall t_2.\cdots\forall t_n.\sigma,$
where $\vec{t} = t_1, t_2, \ldots, t_n$ and $n \ge 0$.
The set of simple types, $T_0$, is defined by the following inductive
equation:
    $T_0 = \{\, t \mid t \text{ is a type variable} \,\} \cup \{\, \sigma \to \tau \mid \sigma, \tau \in T_0 \,\}.$
A type environment is a finite set $\{x_1 : \sigma_1, \ldots, x_n : \sigma_n\}$
of (variable, type) pairs, where the variables $x_1, \ldots, x_n$ are
distinct. We use $A$ to range over type environments. We write $A(x)$ for
the type paired with $x$ in $A$, $\mathrm{dom}(A)$ for the set
$\{x \mid \exists\sigma.\ x : \sigma \in A\}$, and $A_x$ for the type
environment $A$ with any pair for the variable $x$ removed. We write
$A_1 \cup A_2$ for the union of two type environments; by convention we
assume that $\mathrm{dom}(A_1)$ and $\mathrm{dom}(A_2)$ are disjoint. For
any set $T$ of types, we say $A$ is a $T$ type environment if
$A(x) \in T$ for all $x \in \mathrm{dom}(A)$.
The notion of free type variable is defined as usual. We write
$\mathrm{FTV}(\sigma)$ for the free type variables of a type $\sigma$, and
$\mathrm{FTV}(A)$ for the free type variables of all types appearing in
$A$. We write $\mathrm{Gen}(A, \sigma)$ for the $\forall$-closure of
$\sigma$ by the type variables $\mathrm{FTV}(\sigma) - \mathrm{FTV}(A)$.
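
The two operations just introduced can be sketched as follows. The tuple encoding of types, the use of a dictionary for type environments, and the function names are assumptions made for this illustration. The order in which `gen` nests the quantifiers is an arbitrary choice, since adjacent quantifiers may be reordered.

```python
# Assumed encoding (not from the paper): types are nested tuples
#   ("var", name), ("arrow", dom, cod), ("forall", name, body), ("and", left, right)
# and a type environment is a dict from term variables to types.

def ftv(ty):
    """Free type variables of a type."""
    tag = ty[0]
    if tag == "var":
        return {ty[1]}
    if tag == "forall":
        return ftv(ty[2]) - {ty[1]}
    return ftv(ty[1]) | ftv(ty[2])  # "arrow" or "and"

def ftv_env(env):
    """Free type variables of all types appearing in an environment."""
    out = set()
    for ty in env.values():
        out |= ftv(ty)
    return out

def gen(env, ty):
    """Quantify the variables in FTV(ty) - FTV(env); alphabetically first ends up outermost."""
    for name in sorted(ftv(ty) - ftv_env(env), reverse=True):
        ty = ("forall", name, ty)
    return ty

t, s = ("var", "t"), ("var", "s")
# s is free in the environment, so only t is generalized.
print(gen({"x": s}, ("arrow", t, s)))
```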
A judgment is a relation between type environments, terms, and types,
written $A \vdash M : \sigma$. A term $M$ is typable if
$A \vdash M : \sigma$ for some $A$ and $\sigma$.
A pair $\langle A, \sigma \rangle$ of a type environment and a type is
called simply a pair. Two pairs $\langle A_1, \sigma_1 \rangle$ and
$\langle A_2, \sigma_2 \rangle$ are disjoint if their free type variables
are disjoint. An acceptable pair of a term $M$ in a type system is a pair
$\langle A, \sigma \rangle$ such that the judgment $A \vdash M : \sigma$
holds in the type system. We write $\mathrm{AP}_T(M)$ for the set of
acceptable pairs of $M$ in a type system $T$.
A substitution is a mapping from type variables to simple types which is
the identity on all but a finite number of type variables. We use
$S, R, Q, U$ to range over substitutions. The domain and range of a
substitution $S$ are defined
    $\mathrm{dom}(S) = \{t \mid St \ne t\};$
    $\mathrm{rng}(S) = \bigcup_{t \in \mathrm{dom}(S)} \mathrm{FTV}(St).$
If $\mathrm{dom}(S) = \{t_1, t_2, \ldots, t_n\}$ and $St_i = \tau_i$ for
all $i$, then $S$ can be written in the form
$\{t_1 := \tau_1, \ldots, t_n := \tau_n\}$.
The application of substitutions is extended to types, type environments,
and pairs in the usual way. The composition of substitutions is denoted
by juxtaposition, so that $SRt = (SR)t = S(Rt)$. We say $S_1$ and $S_2$ are
disjoint if $\mathrm{dom}(S_1)$ and $\mathrm{dom}(S_2)$ are disjoint sets.
If $S_1$ and $S_2$ are disjoint, then the substitution $S_1 \cup S_2$ is
defined as follows:
    $(S_1 \cup S_2)t = \begin{cases} S_1 t & \text{if } t \in \mathrm{dom}(S_1); \\ S_2 t & \text{if } t \in \mathrm{dom}(S_2); \\ t & \text{otherwise.} \end{cases}$
Note that we have made a severe restriction on substitutions: they map
type variables only to simple types, and not types in general.
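
As an illustration of these definitions, the following sketch represents a substitution as a finite dictionary from type variable names to simple types. The encoding and the names `apply_subst`, `compose`, and `disjoint_union` are ours and are only assumptions of this example.

```python
# Assumed encoding (not from the paper): simple types are nested tuples
#   ("var", name) or ("arrow", dom, cod),
# and a substitution is a dict {type variable name: simple type}.

def apply_subst(S, ty):
    """Apply S to a simple type; variables outside S are left unchanged."""
    if ty[0] == "var":
        return S.get(ty[1], ty)
    return ("arrow", apply_subst(S, ty[1]), apply_subst(S, ty[2]))

def compose(S, R):
    """The composition SR, defined by (SR)t = S(Rt)."""
    out = {t: apply_subst(S, ty) for t, ty in R.items()}
    for t, ty in S.items():
        out.setdefault(t, ty)  # for t outside dom(R), Rt = t and so (SR)t = St
    # Drop identity bindings so that the keys are exactly dom(SR).
    return {t: ty for t, ty in out.items() if ty != ("var", t)}

def disjoint_union(S1, S2):
    """S1 ∪ S2, defined only when the domains are disjoint."""
    if set(S1) & set(S2):
        raise ValueError("substitutions are not disjoint")
    return {**S1, **S2}

t, s, u = ("var", "t"), ("var", "s"), ("var", "u")
S = {"t": ("arrow", s, s)}
R = {"u": t}
print(compose(S, R))
```

Keeping only non-identity bindings means the dictionary keys coincide with the domain of the substitution as defined above.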
2.2 The rank 2 intersection type system
There are many different formulations of intersection type systems; see van
Bakel [33] for a survey. We will present a very restricted intersection
type system here, the system of rank 2 intersection types. Our system is a
slight generalization of van Bakel's version (see §4.1).
The terms of the intersection type system are just the terms of the
lambda calculus. The sets $T_1$ and $T_2$ are defined to be the smallest
sets satisfying the following equations:
    $T_1 = T_0 \cup \{\, \sigma \wedge \tau \mid \sigma, \tau \in T_1 \,\};$
    $T_2 = T_0 \cup \{\, \sigma \to \tau \mid \sigma \in T_1,\ \tau \in T_2 \,\}.$
The set $T_1$ of rank 1 types consists of finite, nonempty intersections of
simple types. $T_2$ is the set of rank 2 intersection types: these are
types possibly containing intersections, but only to the left of a single
arrow. Note that $T_0 = T_1 \cap T_2$, and for $i \in \{0, 1, 2\}$, if
$\sigma \in T_i$, then $S\sigma \in T_i$.
In order to simplify subsequent definitions, we adopt the following
syntactic convention: we consider '$\wedge$' to be an associative,
commutative, and idempotent operator, so that any $T_1$ type may be
considered a finite, nonempty set of simple types, written in the form
$\bigwedge_{i \in I} \sigma_i$, where each $\sigma_i \in T_0$.
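Definition 1 For $i \in \{1, 2\}$, we define the relation $\le_i$ as the
least partial order on $T_i$ closed under the following rules:
(i) If $\{\tau_j \mid j \in J\} \subseteq \{\sigma_i \mid i \in I\}$, then
    $\bigwedge_{i \in I} \sigma_i \le_1 \bigwedge_{j \in J} \tau_j$.
(ii) If $\tau_1 \le_1 \sigma_1$ and $\sigma_2 \le_2 \tau_2$, then
    $\sigma_1 \to \sigma_2 \le_2 \tau_1 \to \tau_2$.
The first rule says that $\le_1$ expresses the natural ordering on
intersection types, and the second rule says that $\le_2$ obeys the usual
antimonotonic ordering on function types, restricted to rank 2.

    (var)  $A \cup \{x : \bigwedge_{i \in I} \sigma_i\} \vdash x : \sigma_{i_0}$   (where $i_0 \in I$)

    (abs)  $\dfrac{A_x \cup \{x : \sigma\} \vdash M : \tau}{A \vdash (\lambda x\,M) : \sigma \to \tau}$

    (app)  $\dfrac{A \vdash M : (\bigwedge_{i \in I} \sigma_i) \to \tau, \quad (\forall i \in I)\ A \vdash N : \sigma_i}{A \vdash (MN) : \tau}$

Figure 3: Typing rules of $I_2^s$. Types in type environments are in $T_1$,
and derived types are in $T_2$.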
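Some useful properties of the orderings $\le_1$ and $\le_2$ are summarized
in the following lemma.
Lemma 2
(i) If $\sigma \in T_0$ and $\tau \in T_1$, then $\sigma \le_1 \tau$ iff
    $\sigma = \tau$.
(ii) If $\sigma \in T_2$ and $\tau \in T_0$, then $\sigma \le_2 \tau$ iff
    $\sigma = \tau$.
(iii) For $i \in \{1, 2\}$, if $\sigma \le_i \tau$, then
    $S\sigma \le_i S\tau$.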
Judgments in our rank 2 system are defined inductively by the rules of
Figure 3. We write $I_2^s \triangleright A \vdash M : \tau$ if the judgment
$A \vdash M : \tau$ follows by these rules, with types appearing in type
environments restricted to $T_1$, and derived types restricted to $T_2$.
The superscript 's' in $I_2^s$ indicates that the system is
syntax-directed, in contrast with a later variant (see §4).
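If $A_1$ and $A_2$ are $T_1$ type environments, we define $A_1 + A_2$, a
$T_1$ type environment, as follows: for each
$x \in \mathrm{dom}(A_1) \cup \mathrm{dom}(A_2)$,
    $(A_1 + A_2)(x) = \begin{cases} A_1(x) & \text{if } x \notin \mathrm{dom}(A_2); \\ A_2(x) & \text{if } x \notin \mathrm{dom}(A_1); \\ A_1(x) \wedge A_2(x) & \text{otherwise.} \end{cases}$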
Lemma 3 (Weakening) If $I_2^s \triangleright A \vdash M : \tau$, then
$I_2^s \triangleright A + A' \vdash M : \tau$ for any $T_1$ type
environment $A'$.
Proof: An easy induction on typing derivations. $\Box$
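Lemma 4 (Substitutivity) If $I_2^s \triangleright A \vdash M : \tau$, then
$I_2^s \triangleright SA \vdash M : S\tau$ for any substitution $S$.
Proof: By induction on the structure of $M$.
(i) If $M = x$, then $A(x) = \bigwedge_{i \in I} \sigma_i$ and
    $\tau = \sigma_{i_0}$ for some $i_0 \in I$. Then
    $(SA)(x) = \bigwedge_{i \in I} S\sigma_i$,
    $I_2^s \triangleright SA \vdash x : S\sigma_{i_0}$, and
    $S\tau = S\sigma_{i_0}$.
(ii) If $M = (\lambda x\,N)$ then $\tau$ must be of the form
    $\tau_1 \to \tau_2$, and
    $I_2^s \triangleright A_x \cup \{x : \tau_1\} \vdash N : \tau_2$. Then by
    induction,
    $I_2^s \triangleright S(A_x \cup \{x : \tau_1\}) \vdash N : S\tau_2$, so
    by rule (abs),
    $I_2^s \triangleright S(A_x) \vdash (\lambda x\,N) : S\tau_1 \to S\tau_2$,
    or $I_2^s \triangleright S(A_x) \vdash (\lambda x\,N) : S(\tau_1 \to \tau_2)$.
    Then by weakening,
    $I_2^s \triangleright SA \vdash (\lambda x\,N) : S(\tau_1 \to \tau_2)$.
(iii) If $M = (M_1 M_2)$, then for some
    $\bigwedge_{i \in I} \sigma_i \in T_1$ we have
    $I_2^s \triangleright A \vdash M_1 : (\bigwedge_{i \in I} \sigma_i) \to \tau$
    and $I_2^s \triangleright A \vdash M_2 : \sigma_i$ for all $i \in I$. By
    induction we have
    $I_2^s \triangleright SA \vdash M_1 : (\bigwedge_{i \in I} S\sigma_i) \to S\tau$
    and $I_2^s \triangleright SA \vdash M_2 : S\sigma_i$ for all $i \in I$,
    and by rule (app), we have
    $I_2^s \triangleright SA \vdash (M_1 M_2) : S\tau$, as desired.
$\Box$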
2.3 System F
The terms of System F are exactly the terms of the lambda calculus. The
types of System F are defined by the following grammar:
    $\sigma ::= t \mid \sigma_1 \to \sigma_2 \mid \forall t.\sigma.$
We consider System F types to be syntactically equal modulo renaming of
bound type variables, reordering of adjacent quantifiers, and elimination
of unnecessary quantifiers.
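The types of System F can be organized into a hierarchy as follows.
First, define $R(0) = T_0$. Then for $n \ge 0$, the set $R(n + 1)$ is
defined to be the least set satisfying
    $R(n + 1) = R(n) \cup \{\, \sigma \to \tau \mid \sigma \in R(n),\ \tau \in R(n + 1) \,\} \cup \{\, \forall t.\sigma \mid \sigma \in R(n + 1) \,\}.$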
It will be useful to restrict types so that quantifiers do not appear to
the immediate right of arrows. Therefore we define the sets
    $S = S' \cup \{\, \forall t.\sigma \mid \sigma \in S \,\};$
    $S' = T_0 \cup \{\, \sigma \to \tau \mid \sigma \in S,\ \tau \in S' \,\}.$
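    (var)  $A \cup \{x : \sigma\} \vdash x : \tau$   (where $\sigma \succeq \tau$)

    (abs)  $\dfrac{A_x \cup \{x : \tau_1\} \vdash M : \tau_2}{A \vdash (\lambda x\,M) : \tau_1 \to \tau_2}$

    (app)  $\dfrac{A \vdash M : (\forall\vec{t}.\tau_1) \to \tau_2, \quad A \vdash N : \tau_1}{A \vdash (MN) : \tau_2}$   (each $t_i \notin \mathrm{FTV}(A)$)

Figure 4: Typing rules of $\Lambda_2^s$. Types in type environments are in
$S(1)$, and derived types are in $S'(2)$.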
We write $S(n)$ for $S \cap R(n)$ and $S'(n)$ for $S' \cap R(n)$. Note that
the $S(1)$ types are exactly the ML type schemes.
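Definition 5 Suppose $\sigma = \forall t_1 \cdots t_n.\sigma' \in S(1)$, and
$\sigma', \tau \in T_0$. We say $\tau$ is an instance of $\sigma$, written
$\sigma \succeq \tau$, if and only if for some
$\tau_1, \ldots, \tau_n \in T_0$, we have
$\tau = \{t_1 := \tau_1, \ldots, t_n := \tau_n\}\sigma'$. We write
$\sigma \succeq \forall s_1 \cdots s_m.\tau$ if and only if
$s_1, \ldots, s_m$ are not free in $\sigma$ and $\sigma \succeq \tau$.
Note that the sense of '$\succeq$' is opposite to that of our other
subtyping relations; for example, both "$\sigma \le_2 \tau$" and
"$\sigma \succeq \tau$" may be read, "$\sigma$ is more general than
$\tau$." We make an exception in the case of '$\succeq$' to be consistent
with its use in ML [24].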
We now define $\Lambda_2^s$, our version of the rank 2 fragment of
System F. The superscript 's' in $\Lambda_2^s$ indicates that the system is
syntax-directed. See Kfoury and Tiuryn [12] for a definition of
$\Lambda_2$, the non-syntax-directed version.
The judgments of the system are defined by the rules of Figure 4. We
write $\Lambda_2^s \triangleright A \vdash M : \sigma$ if
$A \vdash M : \sigma$ is derivable from these rules, where types in type
environments are restricted to $S(1)$, and derived types are restricted
to $S'(2)$.