Behavioral Subtyping Using Invariants and Constraints
a
Barbara H. Liskov Jeannette M. Wing
July 1999
CMU-CS-99-156
a
MIT Lab. for Computer Science, 545 Technology Square, Cambridge, MA 02139
Scho ol of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
The work describ ed in this pap er is based on a Novemb er 1994 ACM TOPLAS pap er, \A Behavioral
Notion of Subtyping, by the same authors. This pap er's version has b een submitted to the volume
Formal Methods For DistributedProcessing: An Object-Oriented Approach, edited byHoward Bowman
and John Derrick.
Abstract
We presentaway of de ning the subtyp e relation that ensures that subtyp e ob jects preserve b ehavioral
prop erties of their sup ertyp es. The subtyp e relation is based on the sp eci cations of the sub- and sup ertyp es.
Our approach handles mutable typ es and allows subtyp es to have more metho ds than their sup ertyp es.
Dealing with mutable typ es and subtyp es that extend their sup ertyp es has surprising consequences on how
to sp ecify and reason ab out ob jects. In our approach, we discard the standard data typ e induction rule,
we prohibit the use of an analogous \history" rule, and we make up for b oth losses by adding explicit
predicates|invariants and constraints{to our typ e sp eci cations. We also discuss the rami cations of our
approach of subtyping the design of typ e families.
This researchwas supp orted for Liskov in part by the Advanced Research Pro jects Agency of the Department
of Defense, monitored by the Oce of Naval Research under contract N00014-91-J-4136 and in part by the National
Science Foundation under Grant CCR-8822158; for Wing, by the Avionics Lab, Wright Research and Development
Center, Aeronautical Systems Division AFSC, U. S. Air Force, Wright-Patterson AFB, OH 45433-6543 under
Contract F33615-90-C-1465, ARPA Order No. 7597.
Keywords: subtyp e, ob ject-oriented design, abstraction function, mutable typ es, invariants, constraints, sp eci cations
1 Intro duction
What do es it mean for one typ e to b e a subtyp e of another? We argue that this is a semantic question
having to do with the b ehavior of the ob jects of the twotyp es: the ob jects of the subtyp e oughttobehave
the same as those of the sup ertyp e as far as anyone or any program using sup ertyp e ob jects can tell.
For example, in strongly typ ed ob ject-oriented languages such as Simula 67[DMN70], C++[Str86],
+
Mo dula-3[Nel91], and Trellis/Owl[SCB 86], subtyp es are used to broaden the assignment statement. An
assignment
x: T := E
is legal provided the typ e of expression E is a subtyp e of the declared typ e T of variable x. Once the
assignment has o ccurred, x will b e used according to its \apparent" typ e T, with the exp ectation that if the
program p erforms correctly when the actual typ e of x's ob ject is T, it will also work correctly if the actual
typ e of the ob ject denoted by x is a subtyp e of T.
Clearly subtyp es must provide the exp ected metho ds with compatible signatures. This consideration has
+ +
led to the formulation of the contra/covariance rules[BHJ 87, SCB 86, Car88]. However, these rules are
not strong enough to ensure that the program containing the ab ove assignment will work correctly for any
subtyp e of T, since all they do is ensure that no typ e errors will o ccur. It is well known that typ e checking,
while very useful, captures only a small part of what it means for a program to b e correct; the same is
true for the contra/covariance rules. For example, stacks and queues might b oth haveaput metho d to add
an element and a get metho d to remove one. According to the contravariance rule, either could b e a legal
subtyp e of the other. However, a program written in the exp ectation that x is a stack is unlikely to work
correctly if x actually denotes a queue, and vice versa.
What is needed is a stronger requirement that constrains the b ehavior of subtyp es: prop erties that can
b e proved using the sp eci cation of an ob ject's presumed typ e should hold even though the ob ject is actually
a memb er of a subtyp e of that typ e:
SubtypeRequirement: Let x b e a prop erty provable ab out ob jects x of typ e T. Then y
should b e true for ob jects y of typ e S where S is a subtyp e of T.
Atyp e's sp eci cation determines what prop erties we can prove ab out ob jects.
We are interested only in safety prop erties \nothing bad happ ens". First, prop erties of an ob ject's
b ehavior in a particular program must b e preserved: to ensure that a program continues to work as exp ected,
calls of metho ds made in the program that assume the ob ject b elongs to a sup ertyp e must have the same
b ehavior when the ob ject actually b elongs to a subtyp e. In addition, however, prop erties indep endent
of particular programs must b e preserved b ecause these are imp ortant when indep endent programs share
ob jects. We fo cus on two kinds of such prop erties: invariants, which are prop erties true of all states, and
history properties, which are prop erties true of all sequences of states. We formulate invariants as predicates
over single states and history prop erties, over pairs of states. For example, an invariant prop erty of a bag is
that its size is always less than its b ound; a history prop erty is that the bag's b ound do es not change. We
do not address other kinds of safety prop erties of computations, e.g., the existence of an ob ject in a state,
the numb er of ob jects in a state, or the relationship b etween ob jects in a state, since these do not have
to do with the meanings of typ es. We also do not address liveness prop erties \something go o d eventually
happ ens", e.g., the size of a bag will eventually reach the b ound.
This chapter provides a general, yet easy to use, de nition of the subtyp e relation that satis es the
Subtyp e Requirement. Our approach handles mutable typ es and allows subtyp es to have more metho ds
than their sup ertyp es. Dealing with mutable typ es and subtyp es that extend their sup ertyp es has surprising
consequences on how to sp ecify and reason ab out ob jects. In our approach, we discard the standard data
typ e induction rule, we prohibit the use of an analogous \history" rule, and we make up for b oth losses by
adding explicit predicates to our typ e sp eci cations. Our sp eci cations are formal, which means that they
have a precise mathematical meaning that serves as a rm foundation for reasoning. Our sp eci cations can
also b e used informally as describ ed in [LG85].
Our de nition applies in a very general distributed environment in which p ossibly concurrent users share
mutable ob jects. Our approach is also constructive: One can prove whether a subtyp e relation holds by
proving a small numb er of simple lemmas based on the sp eci cations of the twotyp es. 1
The chapter also explores the rami cations of the subtyp e relation and shows howinteresting typ e families
can b e de ned. For example, arrays are not a subtyp e of sequences b ecause the user of a sequence exp ects
it not to change over time and 32-bit integers are not a subtyp e of 64-bit integers b ecause a user of 64-bit
integers would exp ect certain metho d calls to succeed that will fail when applied to 32-bit integers. However,
typ e families can b e de ned that group such related typ es together and thus allow generic routines to b e
written that work for all family memb ers. Our approach makes it particularly easy to de ne typ e families:
it emphasizes the prop erties that all family memb ers must preserve, and it do es not require the intro duction
of unnecessary metho ds i.e., metho ds that the sup ertyp e would not naturally have.
The chapter is organized as follows. Section 2 discusses in more detail what we require of our subtyp e
relation and provides the motivation for our approach. We describ e our mo del of computation in Section 3
and present our sp eci cation metho d in Section 4. We give a formal de nition of subtyping in Section 5; we
discuss its rami cations on designing typ e hierarchies in Section 6. We describ e related work in Section 7
and summarize our contributions in Section 8.
2 Motivation
To motivate the basic idea b ehind our notion of subtyping, let's lo ok at an example. Consider a b ounded
bag typ e that provides a put metho d that inserts elements into a bag and a get metho d that removes an
arbitrary element from a bag. Put has a pre-condition that checks to see that adding an element will not
grow the bag b eyond its b ound; get has a pre-condition that checks to see that the bag is non-empty.
Consider also a b ounded stacktyp e that has, in addition to push and pop metho ds, a swap top metho d
that takes an integer, i, and mo di es the stackby replacing its top with i. Stack's push and pop metho ds
have pre-conditions similar to bag's put and get, and swap top has a pre-condition requiring that the stack
is non-empty.
Intuitively, stack is a subtyp e of bag b ecause b oth kinds of collections b ehave similarly. The main
di erence is that the get metho d for bags do es not sp ecify precisely what element is removed; the pop
metho d for stack is more constrained, but what it do es is one of the p ermitted b ehaviors for bag's get
metho d. Let's ignore swap top for the moment.
Supp ose wewant to show stack is a subtyp e of bag. We need to relate the values of stacks to those of
bags. This can b e done by means of an abstraction function, like that used for proving the correctness of
implementations [Hoa72]. A given stackvalue maps to a bag value where we abstract from the insertion
order on the elements.
We also need to relate stack's metho ds to bag's. Clearly there is a corresp ondence b etween stack's push
metho d and bag's put and similarly for the pop and get metho ds even though the names of the corresp onding
metho ds do not match. The pre- and p ost-conditions of corresp onding metho ds will need to relate in some
precise to b e de ned way. In showing this relationship we need to app eal to the abstraction function so
that we can reason ab out stackvalues in terms of their corresp onding bag values.
Finally, what ab out swap top? Most other de nitions of the subtyp e relation have ignored such \extra"
metho ds, and it is p erfectly adequate do so when programs are considered in isolation and there is no aliasing.
In such a constrained situation, a program that uses an ob ject that is apparently a bag but is actually a stack
will never call the extra metho ds, and therefore their b ehavior is irrelevant. However, we cannot ignore extra
metho ds in the presence of aliasing, and also in a general computational environment that allows sharing
of mutable ob jects bymultiple users or pro cesses. In particular, we need to pay attention to extra mutator
metho ds like swap top that mo dify their ob ject.
Consider rst the case of aliasing. The problem here is that within a program an ob ject is accessible by
more than one name, so that mo di cations using one of the names are visible when the ob ject is accessed
using the other name. For example, supp ose is a subtyp e of and that variables
x:
y:
b oth denote the same ob ject whichmust, of course, b elong to or one of its subtyp es. When the ob ject
is accessed through x, only metho ds can b e called. However, when it is used through y, metho ds can b e
called and if these metho ds are mutators, their e ects will b e visible later when the ob ject is accessed via 2
x. To reason ab out the use of variable x using the sp eci cation of its typ e ,we need to imp ose additional
constraints on the subtyp e relation.
Now consider the case of an environment of shared mutable ob jects, such as is provided by ob ject-oriented
databases e.g., Thor [Lis92] and Gemstone [MS90]. In such systems, there is a universe containing shared,
mutable ob jects and a way of naming those ob jects. In general, lifetimes of ob jects may b e longer than
the programs that create and access them i.e., ob jects might b e p ersistent and users or programs may
access ob jects concurrently and/or ap erio dically for varying lengths of time. Of course there is a need for
some form of concurrency control in suchanenvironment. We assume such a mechanism is in place, and
consider a computation to b e made up out of atomic units i.e., transactions that exclude one another. The
transactions of di erent computations can b e interleaved and thus one computation is able to observe the
mo di cations made by another.
If there were subtyping in suchanenvironment the following situation might o ccur. A user installs a
directory ob ject that maps string names to bags. Later, a second user enters a stackinto the directory under
some string name; such a binding is analogous to assigning a subtyp e ob ject to a variable of the sup ertyp e.
After this, b oth users o ccasionally access the stack ob ject. The second user knows it is a stack and accesses
it using stack metho ds. The question is: What do es the rst user need to know in order for his or her
programs to make sense?
We think it ought to b e sucient for a user to know only ab out the \apparent" typ e of the ob ject; the
subtyp e ought to preserveany prop erties that can b e proved ab out the sup ertyp e. In particular, the rst
user ought to b e able to reason ab out his or her use of the stack ob ject using invariant and history prop erties
of bag.
Our approachachieves this goal by adding information to typ e sp eci cations. To handle invariants, we
add an invariant clause; to handle history prop erties, a constraint clause. Showing that is a subtyp e of
requires showing that under the abstraction function 's invariant implies 's invariant and 's constraint
implies 's constraint.
For example, for the bag and stack example, the twoinvariants are identical: b oth state that the size of
the bag stack is less than or equal to its b ound. Similarly, the two constraints are identical: b oth state that
the b ound of the bag or stack do es not change. Showing that stack's invariant and constraint resp ectively
imply bag's invariant and constraint is trivial. The extra metho d swap top is p ermitted b ecause even though
it changes the stack's contents, it preserves stack's invariant and constraint.
In Section 5 we present and discuss our subtyp e de nition. First, however, we de ne our mo del of
computation, and then discuss sp eci cations, since these de ne the ob jects, values, and metho ds that will
b e related by the subtyp e relation.
3 Mo del of Computation
We assume a set of all p otentially existing ob jects, Obj, partitioned into disjointtyp ed sets. Each ob ject
has a unique identity.Atype de nes a set of values for an ob ject and a set of methods that provide the only
means to manipulate that ob ject. E ectively Obj is a set of unique identi ers for all ob jects that can contain
values.
Ob jects can b e created and manipulated in the course of program execution. A state de nes a value for
each existing ob ject. It is a pair of mappings, an environment and a store.Anenvironment maps program
variables to ob jects; a store maps ob jects to values.
State = Env Store
Env = Var ! Obj
Store = Obj ! Val
Givenavariable, x, and a state, , with an environment, :e, and store, :s,we use the notation x to denote
the value of x in state ; i.e., x = :s:ex. When we refer to the domain of a state, dom, we mean
more precisely the domain of the store in that state.
We mo del a typ e as a triple, hO; V ; M i, where O Obj is a set of ob jects, V Val is a set of values,
and M is a set of metho ds. Each metho d for an ob ject is a producer,an observer,oramutator. Pro ducers
of an ob ject of typ e return new ob jects of typ e ; observers return results of other typ es; mutators mo dify 3
ob jects of typ e . An ob ject is immutable if its value cannot change and otherwise it is mutable;atyp e is
immutable if its ob jects are and otherwise it is mutable. Clearly a typ e can b e mutable only if some of its
metho ds are mutators. We allow mixed methods where a pro ducer or an observer can also b e a mutator.
We also allow metho ds to signal exceptions; we assume termination exceptions, i.e., each metho d call either
terminates normally or in one of a numb er of named exception conditions. To b e consistent with ob ject-
oriented language notation, we write x.ma to denote the call of metho d m on ob ject x with the sequence
of arguments a.
Ob jects come into existence and get their initial values through creators. These are often called con-
structors in the literature. Unlike other kinds of metho ds, creators do not b elong to particular ob jects, but
rather are indep endent op erations.
A computation, i.e., program execution, is a sequence of alternating states and transitions starting in
some initial state, :
0
Tr ::: Tr
0 1 1 n 1 n n
Each transition, Tr , of a computation sequence is a partial function on states; we assume the execution of
i
each transition is atomic. A history is the subsequence of states of a computation; we use and to range
over states in any computation, c, where precedes in c. The value of an ob ject can change only through
the invo cation of a mutator; in addition the environment can change through assignment and the domain of
the store can change through the invo cation of a creator or pro ducer.
Ob jects are never destroyed:
8 1 i n : dom dom .
i 1 i
4 Sp eci cations
4.1 Typ e Sp eci cations
Atyp e sp eci cation includes the following information:
The typ e's name;
A description of the typ e's value space;
A de nition of the typ e's invariant and history prop erties;
For each of the typ e's metho ds:
{ Its name;
{ Its signature including signaled exceptions;
{ Its b ehavior in terms of pre-conditions and p ost-conditions.
Note that the creators are missing. Omitting creators allows subtyp es to provide di erent creators than
their sup ertyp es. In addition, omitting creators makes it easy for a typ e to havemultiple implementations,
allows new creators to b e added later, and re ects common usage: for example, Javainterfaces and virtual
typ es provide no way for users to create ob jects of the typ e. We showhow to sp ecify creators in Section 4.2.
In our work we use formal sp eci cations in the two-tiered style of Larch [GHW85]. The rst tier de nes
sorts, which are used to de ne the value spaces of ob jects. In the second tier, Larch interfaces are used to
de ne typ es.
For example, Figure 1 gives a sp eci cation for a bag typ e whose ob jects have metho ds put, get, card,
and equal. The uses clause de nes the value space for the typ e by identifying a sort. The clause in the
gure indicates that values of ob jects of typ e bag are denotable by terms of sort B intro duced in the BBag
sp eci cation; a value of this sort is a pair, hel ems; boundi, where elems is a mathematical multiset of integers
and bound is a natural numb er. The notation fgstands for the emptymultiset, [ is a commutative
op eration on multisets that do es not discard duplicates, 2 is the memb ership op eration, and j x j is a 4
cardinality op eration that returns the total numb er of elements in the multiset x. These op erations as well
as equality = and inequality6= are all de ned in BBag.
The invariant clause contains a single-state predicate that de nes the typ e's invariant prop erties. The
constraint clause contains a two-state predicate that de nes the typ e's history prop erties. We will discuss
these clauses in more detail in subsequent sections.
bag = typ e
uses BBag bag for B
for all b: bag
invariant j b :el ems j b :bound
constraint b :bound = b :bound
put = pro c i: int
requires j b :el ems j
pr e pr e
mo di es b
ensures b :el ems = b :el ems [fig^b :bound = b :bound
post pr e post pr e
get = pro c returns int
requires b :el ems 6= fg
pr e
mo di es b
ensures b :el ems = b :el ems fr esul tg^r esul t 2 b :el ems ^
post pr e pr e
b :bound = b :bound
post pr e
card = pro c returns int
ensures r esul t = j b :el ems j
pr e
equal = pro c a: bag returns b o ol
ensures r esul t =a = b
end bag
Figure 1: A Typ e Sp eci cation for Bags
The b o dy of a typ e sp eci cation provides a sp eci cation for each metho d. Since a metho d's sp eci cation
needs to refer to the metho d's ob ject, weintro duce a name for that ob ject in the for all line. We use result
to name a metho d's result parameter. In the requires and ensures clauses x stands for an ob ject, x for
pr e
1
its value in the initial state, and x for its value in the nal state. Distinguishing b etween initial and
post
nal values is necessary only for mutable typ es, so we suppress the subscripts for parameters of immutable
typ es likeintegers. We need to distinguish b etween an ob ject, x, and its value, x or x , b ecause
pr e post
we sometimes need to refer to the ob ject itself, e.g., in the equal metho d, which determines whether two
mutable bags are the same ob ject.
A metho d m's pre-condition, denoted m.pre, is the predicate that app ears in its requires clause; e.g.,
put's pre-condition checks to see that adding an element will not enlarge the bag b eyond its b ound. If the
clause is missing, the pre-condition is trivially \true."
A metho d m's post-condition, denoted m.post, is the conjunction of the predicates given by its mo di es
and ensures clauses. A mo di es x ;:::;x clause is shorthand for the predicate:
1 n
8 x 2 dompr e fx ;:::;x g :x = x
1 n pr e post
1
Note that pre and post are implicitly universally quanti ed variables over states. Also, more formally, x stands for
pr e
pre.spre.ex; x stands for post.spost.ex.
post 5
whichsays only ob jects listed maychange in value. A mo di es clause is a strong statement ab out all
ob jects not explicitly listed, i.e., their values may not change; if there is no mo di es clause then nothing
maychange. For example, card's p ost-condition says that it returns the size of the bag and no ob jects
including the bag change, and put's p ost-condition says that the bag's value changes by the addition of its
integer argument, and no other ob jects change.
Metho ds may terminate normally or exceptionally; the exceptions are listed in a signals clause in the
metho d's header. For example, instead of the get metho d we mighthave had
0
get = pro c returns int signals empty
mo di es b
ensures if b :el ems = fg then signal empty
pr e
else b :el ems = b :el ems fr esul tg^
post pr e
r esul t 2 b :el ems ^ b :bound = b :bound
pr e post pr e
4.2 Sp ecifying Creators
Ob jects are created and initialized through creators. Figure 2 shows sp eci cations for three di erent creators
for bags. The rst creator creates a new empty bag whose b ound is its integer argument. The second and
third creators x the bag's b ound to b e 100. The third creator uses its integer argument to create a singleton
bag. The assertion newx stands for the predicate:
x 2 dompost dompr e
Recall that ob jects are never destroyed so that dompr e dompost.
bag create= pro c n: int returns bag
requires n 0
ensures newresult ^ r esul t = hfg;ni
post
bag create smal l = pro c returns bag
ensures newresult ^ r esul t = hfg; 100i
post
bag create single = pro c i: int returns bag
ensures newresult ^ r esul t = hfig; 100i
post
Figure 2: Creator Sp eci cations for Bags
4.3 Typ e Sp eci cations Need Explicit Invariants
By not including creators in typ e sp eci cations and by allowing subtyp es to extend sup ertyp es with mutators
weloseapowerful reasoning to ol: data typ e induction. Data typ e induction is used to provetyp e invariants.
The base case of the rule requires that each creator of the typ e establish the invariant; the inductive case
requires that each metho d in particular eachmutator preserve the invariant. Without the creators, wehave
no base case. Without knowing all mutators of typ e as added by 's subtyp es, wehave an incomplete
inductive case. With no data typ e induction rule, we cannot provetyp e invariants!
To comp ensate for the lack of a data typ e induction rule, we state the invariant explicitly in the typ e
sp eci cation through an invariant clause; if the invariant is trivial i.e., identical to \true", the clause can
b e omitted. The invariant de nes the legal values of its typ e .For example, we include
invariant j b :el ems j b :bound
in the typ e sp eci cation of Figure 1 to state that the size of a b ounded bag never exceeds its b ound. The
predicate x app earing in an invariant clause for typ e stands for the predicate: For all computations,
c, and all states in c, 6
8x : :x2dom x
Any additional invariant prop ertymust follow from the conjunction of the typ e's invariant and invariants
that hold for the entire value space. For example, we could show that the size of a bag is nonnegative b ecause
this is true for all mathematical multiset values.
As part of sp ecifying a typ e and its creators wemust show that the invariant holds for all ob jects of the
typ e. All creators for a typ e must establish 's invariant, I :
For each creator for typ e , show for all x : that I [r esul t =x ].
post
where P [a=b] stands for predicate P with every o ccurrence of b replaced by a. Similarly, each pro ducer must
establish the invariant on its newly-created ob ject. In addition, eachmutator of the typ e must preserve the
invariant.To prove this, we assume eachmutator is called on an ob ject of typ e with a legal value one
that satis es the invariant, and show that anyvalue of a ob ject it mo di es is legal:
For eachmutator m of , for all x : assume I [x =x ] and show I [x =x ].
pr e post
For example, wewould need to show that the three creators for bag establish the invariant, and that
put and get preserve the invariant for bag. We can ignore card and equal b ecause they are observers.
Informally the invariant holds b ecause each creator guarantees that the size is no larger than the b ound;
put's pre-condition checks that there is enough ro om in the bag for another element; and get either decreases
the size of the bag or leaves it the same.
The loss of data typ e induction means that additional invariants cannot b e proved. Therefore the sp eci er
must b e careful to de ne an invariant that is strong enough that all desired invariants follow from it.
4.4 Typ e Sp eci cations Need Explicit Constraints
We are interested in the history prop erties of ob jects in addition to their invariant prop erties. We can
formulate history prop erties as predicates over state pairs, and prove them using the history rule:
History Rule:For each of the i mutators m of , for all x : :
m :pr e ^ m :post [x =x ;x =x ]
i i pr e post
x ;x
We cannot use this history rule directly,however. It is incomplete since subtyp es may de ne additional
mutators. If we use it without considering the extra mutators, it is easy to prove prop erties that do not hold
for subtyp e ob jects!
To comp ensate for the lack of the history rule, we state history prop erties explicitly in the typ e sp eci-
2
cation through a constraint clause ; if the constraint is trivial, the clause can b e omitted. For example,
the constraint
constraint b :bound = b :bound
in the sp eci cation of bag declares that a bag's b ound never changes. As another example, consider a fat set
ob ject that has an insert but no delete metho d; fat sets only grow in size. The constraint for fat set would
be:
constraint 8 i : int:i2s i2s
The predicate x ;x app earing in a constraint clause for typ e stands for the predicate: For all
computations, c, and all states and in c such that precedes ,
8x : :x2dom x ;x
2
The use of the term \constraint" is b orrowed from the Ina Jo sp eci cation language [SH92], which also includes constraints
in sp eci cations. 7
Note that we do not require that b e the immediate successor of in c.
Just as we had to prove that metho ds preserve the invariant, wemust show that they satisfy the constraint.
This is done by using the history rule for eachmutator.
The loss of the history rule is analogous to the loss of a data typ e induction rule. A practical consequence
of not having a history rule is that the sp eci er must make the constraint strong enough so that all desired
history prop erties follow from it.
5 The Meaning of Subtyp e
5.1 Sp ecifying Subtyp es
To state that a typ e is a subtyp e of some other typ e, we simply app end a subtyp e clause to its sp eci cation.
We allowmultiple sup ertyp es; there would b e a separate subtyp e clause for each. An example is given in
Figure 3.
A subtyp e's value space may b e di erent from its sup ertyp e's. For example, in the gure the sort, S,
for b ounded stackvalues is de ned in BStack as a pair, hitems; l imiti, where items is a sequence of integers
and limit is a natural numb er. The invariant indicates that the length of the stack's sequence comp onentis
less than or equal to its limit. The constraint indicates that the stack's limit do es not change. In the pre-
and p ost-conditions, [ ] stands for the empty sequence, jj is concatenation, last picks o the last elementof
a sequence, and al lButLast returns a new sequence with all but the last element of its argument.
Under the subtyp e clause we de ne an abstraction function, A, that relates stackvalues to bag values
by relying on the helping function, mk el ems, that maps sequences to multisets in the obvious manner. We
will revisit this abstraction function in Section 5.3. The subtyp e clause also lets sp eci ers relate subtyp e
metho ds to those of the sup ertyp e. The subtyp e must provide all metho ds of its sup ertyp e; we refer to these
3
as the inherited metho ds. Inherited metho ds can b e renamed, e.g., push for put; all other metho ds of the
sup ertyp e are inherited without renaming, e.g., equal. In addition to the inherited metho ds, the subtyp e
may also have some extra metho ds, e.g., swap top. Stack's equal metho d must take a bag as an argument
to satisfy the contravariance requirement. We discuss this issue further in Section 6.1.
5.2 De nition of Subtyp e
The formal de nition of the subtyp e relation, ,isgiven in Figure 4. It relates twotyp es, and , eachof
whose sp eci cations resp ectively preserves its invariant, I and I , and satis es its constraint, C and C .
In the rules, since x is an ob ject of typ e , its value x or x isamember of S and therefore cannot b e
pr e post
used directly in the predicates ab out ob jects which are in terms of values in T . The abstraction function
A is used to translate these values so that the predicates ab out ob jects make sense. A may b e partial,
need not b e onto, but can b e many-to-one. We require that an abstraction function b e de ned for all legal
values of the subtyp e although it need not b e de ned for values that do not satisfy the subtyp e invariant.
Moreover, it must map legal values of the subtyp e to legal values of the sup ertyp e.
The rst clause addresses the need to relate inherited metho ds of the subtyp e. Our formulation is similar
to America's [Ame90]. The rst two signature rules are the standard contra/covariance rules. The exception
rule says that m may not signal more than m , since a caller of a metho d on a sup ertyp e ob ject should not
exp ect to handle an unknown exception. The pre- and p ost-condition rules are the intuitive counterparts to
the contravariant and covariant rules for signatures. The pre-condition rule ensures the subtyp e's metho d
can b e called at least in any state required by the sup ertyp e. The p ost-condition rule says that the subtyp e
metho d's p ost-condition can b e stronger than the sup ertyp e metho d's p ost-condition; hence, any prop erty
that can b e proved based on the sup ertyp e metho d's p ost-condition also follows from the subtyp e's metho d's
p ost-condition.
The second clause addresses preserving program-indep endent prop erties. The invariant rule and the
assumption that the typ e sp eci cation preserves the invariant suces to argue that invariant prop erties of a
sup ertyp e are preserved by the subtyp e. The argument for the preservation of subtyp e's history prop erties
3
We do not mean that the subtyp e inherits the co de of these metho ds but simply that it provides metho ds with the same
b ehavior as de ned b elow as the corresp onding sup ertyp e metho ds. 8
stack=typ e
uses BStack stack for S
for all s: stack
invariant l eng ths :items s :l imit
constraint s :l imit = s :l imit
push = pro c i: int
requires l eng ths :items
pr e pr e
mo di es s
ensures s :items = s :items jj [ i ] ^ s :l imit = s :l imit
post pr e post pr e
pop = pro c returns int
requires s :items 6=[ ]
pr e
mo di es s
ensures r esul t = l asts :items ^ s :items = al l B utLasts :items ^
pr e post pr e
s :l imit = s :l imit
post pr e
swap top = pro c i: int
requires s :items 6=[ ]
pr e
mo di es s
ensures s :items = al l B utLasts :items jj [ i ] ^ s :l imit = s :l imit
post pr e post pr e
height = pro c returns int
ensures r esul t = l eng ths :items
pr e
equal = pro c t: bag returns b o ol
ensures r esul t =s=t
subtyp e of bag push for put, pop for get, height for card
8st : S:Ast=hmk el emsst:items; st:l im iti
where mk el ems : Seq ! M
8i : I nt; sq : Seq
mk el ems[ ] = fg
mk el emssq jj [ i ] = mk el emssq [fig
end stack
Figure 3: StackTyp e 9
Definition of the subtype relation, : = hO ;S;Mi is a subtype of = hO ;T;Ni if
there exists an abstraction function, A : S ! T , and a renaming map, R : M ! N , such that:
1. Subtyp e metho ds preserve the sup ertyp e metho ds' b ehavior. If m of is the corresp onding
renamed metho d m of , the following rules must hold:
Signature rule.
{ Contravarianceofarguments. m and m have the same numb er of arguments. If
the list of argumenttyp es of m is and that of m is , then 8i: .
i i i i
{ Covarianceofresult. Either b oth m and m have a result or neither has. If there
is a result, let m 's result typ e b e and m 's b e . Then .
{ Exception rule. The exceptions signaled by m are contained in the set of exceptions
signaled by m .
Methods rule. For all x : :
{ Pre-condition rule. m :pr e[Ax =x ] m :pr e:
pr e pr e
{ Post-condition rule. m :post m :post[Ax =x ;Ax =x ]
pr e pr e post post
2. Subtyp es preserve sup ertyp e prop erties. For all computations, c, and all states and in c
such that precedes , for all x : :
Invariant Rule. Subtyp e invariants ensure sup ertyp e invariants.
I I [Ax =x ]
Constraint Rule. Subtyp e constraints ensure sup ertyp e constraints.
C C [Ax =x ;Ax =x ]
Figure 4: De nition of the Subtyp e Relation
is completely analogous, using the constraint rule and the assumption that the typ e sp eci cation satis es its
constraint.
We do not include the invariant in the metho ds or constraint rule directly. For example, the pre-
condition rule could have b een
m :pr e[Ax =x ] ^ I [Ax =x ] m :pr e
pr e pr e pr e pr e
We omit adding the invariant b ecause if it is needed in doing a pro of it can always b e assumed, since it is
known to b e true for all ob jects of its typ e.
Note that in the various rules we require x : ,yet x app ears in predicates concerning ob jects as well.
This makes sense b ecause .
5.3 Applying the De nition of Subtyping as a Checklist
Pro ofs of the subtyp e relation are usually obvious and can b e done by insp ection. Typically, the only interest-
ing part is the de nition of the abstraction function; the other parts of the pro of are usually straightforward.
However, this section go es through the steps of an informal pro of just to show what kind of reasoning is
involved. Formal versions of these informal pro ofs are given in [LW92].
Let's revisit the stack and bag example using our de nition as a checklist. Here
= hO ;S;fpush; pop; sw ap top; heig ht; eq ual gi
stack
= hO ;B; fput; g et; car d; eq ual gi
bag
Recall that we represent a b ounded bag's value as a pair, hel ems; boundi, ofamultiset of integers and a xed
b ound, and a b ounded stack's value as a pair, hitems; l imiti, of a sequence of integers and a xed b ound. It
can easily b e shown that each sp eci cation preserves its invariant and satis es its constraint. 10
We use the abstraction function and the renaming map given in the sp eci cation for stack in Figure 3.
The abstraction function states that for all st : S
Ast= hmk el emsst:items ; st:l imi ti
where the helping function, mk el ems : Seq ! M , maps sequences to multisets such that for all sq : Seq; i :
Int:
mk el ems[ ] = fg
mk el emssq jj [ i ] = mk el emssq [fig
A is partial; it is de ned only for sequence{natural numb ers pairs, hitems; l imiti, where limit is greater than
or equal to the size of items.
The renaming map R is
Rpush = put
Rpop = get
Rheight = card
Requal = equal
Checking the signature and exception rules is easy and could b e done by the compiler.
Next, we show the corresp ondences b etween push and put,between pop and get, etc. Let's lo ok at the pre-
and p ost-condition rules for just one metho d, push. Informally, the pre-condition rule for put/push requires
4
that we show :
j As :el ems j pr e pr e l eng ths :items pr e pr e Intuitively, the pre-condition rule holds b ecause the length of stack is the same as the size of the corresp onding bag and the limit of the stack is the same as the b ound for the bag. Here is an informal pro of with slightly more detail: 1. A maps the stack's sequence comp onent to the bag's multiset by putting all elements of the sequence into the multiset. Therefore the length of the sequence s :items is equal to the size of the multiset pr e As :el ems. pr e 2. Also, A maps the limit of the stack to the b ound of the bag so that s :l imit = As :bound. pr e pr e 3. From put's pre-condition we know j As :el ems j pr e pr e 4. push's pre-condition holds by substituting equals for equals. Note the role of the abstraction function in this pro of. It allows us to relate stack and bag values, and therefore we can relate predicates ab out bag values to those ab out stackvalues and vice versa. Also, note howwe dep end on A b eing a function in step 4 where we use the substitutivity prop erty of equality. The p ost-condition rule requires that we show push's p ost-condition implies put's. We can deal with the mo di es and ensures parts separately. The mo di es part holds b ecause the same ob ject is mentioned in b oth sp eci cations. The ensures part follows from the de nition of the abstraction function. The invariant rule requires that we show that the invariant on stacks: l eng ths :items s :l imit implies that on bags: j As :el ems jAs :bound 4 Note that we are reasoning in terms of the values of the ob ject, s, and that b and s refer to the same ob ject b app ears in the bag sp eci cation. 11 We can show this by a simple pro of of induction on the length of the sequence of a b ounded stack. The constraint rule requires that we show that the constraint on stacks: s :l imit = s :l imit implies that on bags: As :bound = As :bound This is true b ecause the length of the sequence comp onent of a stack is the same as the size of the multiset comp onent of its bag counterpart. top;itistaken care of just like all the other Note that we do not havetosayanything sp eci c for swap metho ds when we show that the sp eci cation of stack satis es its invariant and constraint. 6 Typ e Hierarchies The requirementwe imp ose on subtyp es is very strong and raises a concern that it might rule out many useful subtyp e relations. To address this concern welooked at a numb er of examples. We found that our technique captures what p eople want from a hierarchy mechanism, but we also discovered some surprises. The examples led us to classify subtyp e relationships into two broad categories. In the rst category, the subtyp e extends the sup ertyp e by providing additional metho ds and p ossibly additional \state." In the second, the subtyp e is more constrained than the sup ertyp e. We discuss these relationships b elow. In practice, manytyp e families will exhibit b oth kinds of relationships. 6.1 Extension Subtyp es A subtyp e extends its sup ertyp e if its ob jects have extra metho ds in addition to those of the sup ertyp e. Abstraction functions for extension subtyp es are onto, i.e., the range of the abstraction function is the set of all legal values of the sup ertyp e. The subtyp e might simply have more metho ds; in this case the abstraction function is one-to-one. Or its ob jects might also have more \state," i.e., they might record information that is not present in ob jects of the sup ertyp e; in this case the abstraction function is many-to-one. As an example of the one-to-one case, consider a typ e intset for set of integers with metho ds to insert and delete elements, to select elements, and to provide the size of the set. A subtyp e, intset2, mighthave more metho ds, e.g., union, is empty. Here there is no extra state, just extra metho ds. Supp ose intset's invariant and constraints are b oth trivial; intset2's would b e as well. Thus, proving that intset2 preserves intset's invariant and constraint is trivial. It is easy to discover when a prop osed subtyp e really is not one. For example, the fat set typ e discussed earlier has an insert metho d but no delete metho d. Intset is not a subtyp e of fat set b ecause fat sets only grow while intsets grow and shrink; intset do es not preservevarious history prop erties of fat set, in particular, the constraint that once some integer is in the fat set, it remains in the fat set. The attempt to show that the intset constraint which is trivial implies that of fat set would fail. As a simple example of a many-to-one case, consider immutable pairs and triples Figure 5. Pairs have metho ds that fetch the rst and second elements; triples have these metho ds plus an additional one to fetch the third element. Triple is a subtyp e of pair and so is semi-mutable triple with metho ds to fetch the rst, second, and third elements and to replace the third element b ecause replacing the third element do es not a ect the rst or second element. This example shows that it is p ossible to haveamutable subtyp e of an immutable sup ertyp e, provided the mutations are invisible to users of the sup ertyp e. Mutations of a subtyp e that would b e visible through the metho ds of an immutable sup ertyp e are ruled out. For example, an immutable sequence, whose elements can b e fetched but not stored, is not a sup ertyp e of mutable array, which provides a store metho d in addition to the sequence metho ds. For sequences we can prove elements do not change; this is not true for arrays. The attempt to construct the subtyp e relation will fail b ecause the constraint for sequences do es not follow from that for arrays. Many examples of extension subtyp es are found in the literature. One common example concerns p ersons, employees, and students Figure 6. A p erson ob ject has metho ds that rep ort its prop erties such as its name, age, and p ossibly its relationship to other p ersons e.g., its parents or children. Student and employee are 12 immutable pair immutable triple semi-mutable triple Figure 5: Pairs and Triples subtyp es of p erson; in each case they have additional prop erties, e.g., a studentidnumb er, an employee employer and salary. In addition, typ e student employee is a subtyp e of b oth student and employee and also p erson, since the subtyp e relation is transitive. In this example, the subtyp e ob jects have more state than those of the sup ertyp e as well as more metho ds. person student employee student_employee Figure 6: Person, Student, and Employee Another example from the database literature concerns di erent kinds of ships [HM81]. The sup ertyp e is generic ships with metho ds to determine such things as who is the captain and where the ship is registered. Subtyp es contain more sp ecialized ships such as tankers and freighters. There can b e quite an elab orate hierarchy e.g., tankers are a sp ecial kind of freighter. Windows are another well-known example [HO87]; subtyp es include b ordered windows, colored windows, and scrollable windows. Common examples of subtyp e relationships are allowed by our de nition provided the equal metho d and other similar metho ds are de ned prop erly in the subtyp e. Supp ose sup ertyp e provides an equal metho d and consider a particular call x.equaly. The diculty arises when x and y actually b elong to , a subtyp e of . If ob jects of the subtyp e have additional state, x and y may di er when considered as subtyp e ob jects but ought to b e considered equal when considered as sup ertyp e ob jects. For example, consider immutable triples x = h0; 0; 0i and y = h0; 0; 1i. Supp ose the sp eci cation of the equal metho d for pairs says: equal = pro c q: pair returns b o ol ensures r esul t =p:f ir st = q:f irst ^ p:second = q :second We are using p to refer to the metho d's ob ject. However, wewould exp ect two triples to b e equal only if their rst, second, and third comp onents were equal. If a program using triples had just observed that x and y di er in their third element, wewould exp ect x.equaly to return \false," but if the program were using them as pairs, and had just observed that their rst and second elements were equal, it would b e wrong for the equal metho d to return false. The way to resolve this dilemma is to havetwo equal metho ds in triple: pair equal = pro c p: pair returns b o ol ensures r esul t =p:f ir st = q:f irst ^ p:second = q :second 13 triple equal = pro c p: triple returns b o ol ensures r esul t =p:f ir st = q:f irst ^ p:second = q :second ^ p:thir d = q :thir d One of them pair equal simulates the equal metho d for pair; the other triple equal is a metho d just on triples. In some ob ject-oriented languages, suchasJava, the additional equal metho ds are obtained byoverloading. The problem is not limited to equality metho ds. It also a ects metho ds that \exp ose" the abstract state of ob jects, e.g., an unparse metho d that returns a string representation of the abstract state of its ob ject. x.unparse ought to return a representation of a pair if called in a context in which x is considered to b e a pair, but it ought to return a representation of a triple in a context in which x is known to b e a triple or some subtyp e of triple. The need for several equality metho ds seems natural for realistic examples. For example, asking whether e1 and e2 are the same p erson is di erent from asking if they are the same employee. In the case of a p erson holding two jobs, the answer might b e true for the question ab out p erson but false for the question ab out employee. 6.2 Constrained Subtyp es The second kind of subtyp e relation o ccurs when the subtyp e is more constrained than the sup ertyp e. In this case, the sup ertyp e sp eci cation is written in a way that allows variation in b ehavior among its subtyp es. Subtyp es constrain the sup ertyp e by reducing the variability. The abstraction function is usually into rather than onto. The subtyp e may extend those sup ertyp e ob jects that it simulates by providing additional metho ds and/or state. Since constrained subtyp es reduce variation, it is crucial when de ning this kind of typ e hierarchyto think carefully ab out what variability is p ermitted for the subtyp es. The variability will show upinthe sup ertyp e sp eci cations in twoways: in the invariant and constraint, and also in the sp eci cations of the individual metho ds. In b oth cases the sup ertyp e de nitions will b e nondeterministic in those places where di erent subtyp es are exp ected to provide di erent b ehavior. Avery simple example concerns elephants. Elephants come in many colors realistically grey and white, but we will also allow blue ones. However all albino elephants are white and all royal elephants are blue. Figure 7 shows the elephant hierarchy. The set of legal values for regular elephants includes all elephants whose color is grey or blue or white: invariant e :col or = w hite _ e :col or = grey _ e :col or = bl ue The set of legal values for royal elephants is a subset of those for regular elephants: invariant e :col or = bl ue and hence the abstraction function is into. The situation for albino elephants is similar. Furthermore, the elephant metho d that returns the color if there is such a metho d can return grey or blue or white, i.e., it is nondeterministic; the subtyp es restrict the nondeterminism for this metho d by de ning it to return a sp ecifc color. This simple example has led others to de ne a subtyping relation that requires non-monotonic reasoning [Lip92], but we b elieve it is b etter to use variability in the sup ertyp e sp eci cation and straightforward reasoning metho ds. However, the example shows that a sp eci er of a typ e family has to anticipate subtyp es and capture the variation among them in the sp eci cation of the sup ertyp e. The bag typ e discussed in Section 4.1 has two kinds of variability. First, as discussed earlier, the sp eci- cation of get is nondeterministic b ecause it do es not constrain which element of the bag is removed. This nondeterminism allows stack to b e a subtyp e of bag: the sp eci cation of pop constrains the nondetermin- ism. We could also de ne a queue that is a subtyp e of bag; its dequeue metho d would also constrain the nondeterminism of get but in a way di erent from pop. In addition, the actual value of the b ound for bags is not de ned; it can b e any natural numb er, thus allowing subtyp es to have di erent b ounds. This variability shows up in the sp eci cation of put, where we 14 elephant royal albino Figure 7: Elephant Hierarchy do not say what sp eci c b ound value causes the call to fail. Therefore, a user of put must b e prepared for a failure. Of course the user could deduce that a particular call will succeed, based on a previous sequence of metho d calls and the constraint that the b ound of a bag do es not change. A subtyp e of bag might limit the b ound to a xed value, or to a smaller range. Several subtyp es of bag are shown in Figure 8; mediumbags havevarious b ounds, so that this typ e mighthave its own subtyp es, e.g., bag 150. bag largebag mediumbag smallbag 32 (bound = 2 ) (100 <= bound <= 1000) (bound = 20) bag_150 (bound = 150) Figure 8: A Typ e Family for Bags The bag hierarchymay seem counterintuitive, since we might exp ect that bags with smaller b ounds should b e subtyp es of bags with larger b ounds. For example, we might exp ect smallbag to b e a subtyp e of largebag. However, the sp eci cations for the twotyp es are incompatible: the b ound of every largebag is 32 2 , which is clearly not true for smallbags. Furthermore, this di erence is observable via the metho ds: It is legal to call the put metho d on a largebag whose size is greater than or equal to 20, but the call is not legal for a smallbag. Therefore the pre-condition rule is not satis ed. Although the bag typ e can have subtyp es with di erent b ounds, it cannot have subtyp es where the b ounds of the bags can change dynamically.Ifwewantedatyp e family that included b oth bag and such dynamic bags, wewould need to de ne a sup ertyp e in which the b ound is allowed, but not required, to vary. Figure 9 shows the new typ e hierarchy. Dynamic bags have a b ound that tracks the size: each time an element is added or removed from a dynamic bag, the b ound changes to match the new size. Flexible bags have an additional mutator, change bound: change bound = pro c n: int requires n jb :el emsj pr e mo di es b ensures b :el ems = b :el ems ^ b :bound = n post pr e post Notice that other typ es in the family need not haveachange bound metho d. This example illustrates the di erentways that subtyp es reduce variability. All varying bag subtyp es bag's put metho d is non-deterministic, reduce variability in the sp eci cation for the put metho d; varying 15 varying_bag I: size <= bound C: true flexible_bag dynamic_bag bag I: size <= bound I: size = bound I: size <= bound C: true C: true C: bound stays the same [...as in Fig. 8...] Figure 9: Another Typ e Family for Bags since it might add the element and change the b ound if the size is the same as the b ound, or it might not. Bag and exible bag reduce this variabilityby not adding the element, whereas dynamic bag do es add the element. In addition, bag reduces variabilityby restricting the constraint: the trivial constraint for varying bag can b e thought of as stating \either a bag's b ound maychange or it stays the same;" the constraint for bag reduces this variabilityby making a choice \the bag's b ound stays the same" and users can then rely on this prop erty for bags and its subtyp es. Dynamic bag reduces variabilityby restricting varying bag's invariant so that it no longer allows the size to b e less than the b ound. Finally, exible bag reduces variability b ecause of the extra mutator, change bound; all its subtyp es must allow explicit re-setting of the b ound. Another example is a family of integer counters shown in Figure 10. When a counter is advanced, we only know that its value gets bigger, so that the constraint is simply constraint c c The doubler and multiplier subtyp es have stronger constraints. For example, a multiplier's value always increases bya multiple, so that its constraint is: constraint 9 n : int : [ n> 0^c =nc ] For a family like this, we mightcho ose to havean advance metho d for counter so that each of its subtyp es is constrained to have this metho d or we might not. If wedoprovide an advance metho d, its sp eci cation will have to b e nondeterministic i.e., it merely states the the size of the counter grows to allow the subtyp es to provide the de nitions that are appropriate for them. In the case of the bag family illustrated in Figure 8, all typ es in the hierarchy might b e \real" in the sense that they have ob jects. However, sometimes sup ertyp es are virtual; they de ne the prop erties all subtyp es bag of Figure 9 might b e suchatyp e. have in common but have no ob jects of their own. Varying Virtual typ es are useful in manytyp e hierarchies. For example, wewould use them to construct a hierarchy for integers. Smaller integers cannot b e a subtyp e of larger integers b ecause of observable di erences in b ehavior; for example, an over ow exception that would o ccur when adding two 32-bit integers would not o ccur if they were 64-bit integers. Also, larger integers cannot b e a subtyp e of smaller ones b ecause exceptions do not o ccur when exp ected. However, we clearly would likeintegers of di erent sizes to b e related. This is accomplished by designing a virtual sup ertyp e that includes them. Such a hierarchyis shown in Figure 11, where integer is a virtual typ e whose invariant simply says that the size of an integer is greater than zero. Integer typ es with di erent sizes are subtyp es of integer. In addition, small integer typ es are subtyp es of regular int, another virtual typ e; the invariant in the sp eci cation for regular int states that the size of an integer is either 16 bits or 32 bits. An integer family mighthave a structure like this, or it might b e atter byhaving all integer typ es b e direct subtyp es of integer. 16 counter (value never decreases) incrementer doubler multiplier (value never decreases) (value doubles) (value multiplies) Figure 10: Typ e Family for Counters integer 64-bit-int regular_int 32-bit-int 16-bit-int Figure 11: Integer Family 7 Related Work Some research on de ning subtyp e relations is concerned with capturing constraints on metho d signatures via + + the contra/covariance rules, such as those used in languages likeTrellis/Owl [SCB 86], Emerald[BHJ 87], Quest [Car88], Ei el [Mey88], POOL [Ame90], and to a limited extent Mo dula-3 [Nel91]. Our rules place constraints not just on the signatures of an ob ject's metho ds, but also on their b ehavior. Our work is most similar to that of America [Ame91], who has prop osed rules for determining based on typ e sp eci cations whether one typ e is a subtyp e of another. Meyer [Mey88] also uses pre- and p ost- condition rules similar to America's and ours. Cusack's [Cus91] approach of relating typ e sp eci cations de nes subtyping in terms of strengthening state invariants. However, none of these authors considers neither the problems intro duced by extra mutators nor the preservation of history prop erties. Therefore, they allow certain subtyp e relations that we forbid e.g., intset could b e a subtyp e of fat set in these approaches. Our use of constraints in place of the history rule is one of two techniques discussed in [LW94]. That pap er prop oses a second technique in which there is no constraint; instead, extra metho ds are not allowed to intro duce new b ehavior. It requires that the b ehavior of each extra mutator b e \explained" in terms of existing b ehavior, through existing metho ds. We b elieve the use of constraints is simpler and easier to reason ab out than this \explanation" approach. The emphasis on semantics of abstract typ es is a prominent feature of the work by Leavens. In his Ph.D. thesis Leavens [Lea89] de nes typ es in terms of algebras and subtyping in terms of a simulation relation between them. His simulation relations are a more general form of our abstraction functions. Leavens considered only immutabletyp es. Dhara [Dha92, DL92, LD92] extends Leavens' thesis work to deal with mutable typ es, but rules out the cases where extra metho ds cause problems, e.g., aliasing. Because of their restrictions they allow some subtyp e relations to hold where we do not. For example, they allowmutable pairs to b e a subtyp e of immutable pairs whereas we do not. Others haveworked on the sp eci cation of typ es and subtyp es. For example, manyhave prop osed Z as + the basis of sp eci cations of ob ject typ es[CL91, DD90, CDD 89]; Goguen and Meseguer[GM87 ] use FOOPS; Leavens and his colleagues use Larch[Lea91, LW90, DL92]. Though several of these researchers separate the sp eci cation of an ob ject's creators from its other metho ds, none has identi ed the problem p osed by the 17 missing creators, and thus none has provided an explicit solution to this problem. 8 Summary We de ned a new notion of the subtyp e relation based on the semantic prop erties of the subtyp e and sup ertyp e. An ob ject's typ e determines b oth a set of legal values and an interface with its environment through calls on its metho ds. Thus, we are interested in preserving prop erties ab out sup ertyp e values and metho ds when designing a subtyp e. We require that a subtyp e preserve the b ehavior of the sup ertyp e metho ds and also all invariant and history prop erties of its sup ertyp e. We are particularly interested in an ob ject's observable b ehavior state changes, thus motivating our fo cus on history prop erties and on mutable typ es and mutators. We also presented a way to sp ecify the semantic prop erties of typ es formally. One reason wechose to base our approach on Larch is that Larch allows formal pro ofs to b e done entirely in terms of sp eci cations. In fact, once the theorems corresp onding to our subtyping rules are formally stated in Larch, their pro ofs are almost completely mechanical|a matter of symb ol manipulation|and could b e done with the assistance of the Larch Prover[GG89, ZW97]. In developing our de nition, wewere motivated primarily by pragmatics. Our intention is to capture the intuition programmers apply when designing typ e hierarchies in ob ject-oriented languages. However, intuition in the absence of precision can often go astray or lead to confusion. This is why it has b een unclear how to organize certain typ e hierarchies suchasintegers. Our de nition sheds light on such hierarchies and helps in uncovering new designs. It also supp orts the kind of reasoning that is needed to ensure that programs that work correctly using the sup ertyp e continue to work correctly with the subtyp e. Programmers have found our approach relatively easy to apply and use it primarily in an informal way. The essence of a subtyp e relationship is expressed in the mappings. These mappings can b e de ned informally, in much the same way that abstraction functions and representation invariants are given as comments in a program that implements an abstract typ e. The pro ofs can also b e done informally, in the style given in Section 5.3; they are usually straightforward and can b e done by insp ection. We also showed that our approach is useful by lo oking at a numb er of examples. This led us to identify two kinds of subtyp es: ones that extend the sup ertyp e, and ones that constrain it. In the former case, the sup ertyp e can b e de ned without a great deal of thought ab out the subtyp es, but in the latter case, this is not p ossible; instead the sup ertyp e sp eci cation must b e done carefully so that it allows all of the intended subtyp es. In particular the sp eci cation of the sup ertyp e must contain sucient nondeterminism in the invariant, constraint, and metho d sp eci cations. Our analysis raises two issues ab out typ e hierarchy that have b een ignored previously by b oth the formal metho ds and ob ject-oriented communities. First, subtyp es can have more metho ds, sp eci cally more mutators, than their sup ertyp es. Second, subtyp es need to have di erent creators than sup ertyp es. These issues forced us to revisit pro of rules normally asso ciated with typ e sp eci cations: the data typ e induction rule and the history rule. We decided to preclude the use of these rules, and to have explicit invariants and constraints to replace them. Although it is p ossible to de ne a subtyp e relation that avoids explicit invariants and constraints, doing so is awkward and often requires invention of sup er uous sup ertyp e metho ds and creators. We prefer to use explicit invariants and constraints b ecause this allows a more direct wayof capturing the designer's intent. References [Ame90] America, P. A parallel ob ject-oriented language with inheritance and subtyping. SIGPLAN, 2510:161{168, Octob er 1990. [Ame91] America, P. Designing an ob ject-oriented programming language with b ehavioural subtyping. In de Bakker, J. W., de Ro ever, W. P., and Rozenb erg, G., editors, Foundations of Object-Oriented Languages, REX School/Workshop, Noordwijkerhout, The Netherlands, May/June 1990,volume 489 of LNCS, pages 60{90. Springer-Verlag, NY, 1991. 18 + [BHJ 87] Black, A. P., Hutchinson, N., Jul, E., Levy, H. M., and Carter, L. Distribution and abstract typ es in Emerald. IEEE TSE, 131:65{76, January 1987. [Car88] Cardelli, L. A semantics of multiple inheritance. Information and Computation, 76:138{164, 1988. + [CDD 89] Carrington, D., Duke, D., Duke, R., King, P., Rose, G., and Smith, P. Ob ject-Z: An ob ject ori- ented extension to Z. In FORTE89, International ConferenceonFormal Description Techniques, Decemb er 1989. [CL91] Cusack, E. and Lai, M. Ob ject-oriented sp eci cation in LOTOS and Z, or my cat really is ob ject- oriented! In de Bakker, J. W., de Ro ever, W. P., and Rozenb erg, G., editors, Foundations of Object OrientedLanguages, pages 179{202. Springer Verlag, June 1991. LNCS 489. [Cus91] Cusack, E. Inheritance in ob ject oriented Z. In Proceedings of ECOOP '91. Springer-Verlag, 1991. [DD90] Duke, D. and Duke, R. A history mo del for classes in ob ject-Z. In Proceedings of VDM '90: VDM and Z. Springer-Verlag, 1990. [Dha92] Dhara, K. K. Subtyping among mutable typ es in ob ject-oriented programming languages. Mas- ter's thesis, Iowa State University, Ames, Iowa, 1992. Master's Thesis. [DL92] Dhara, K. K. and Leavens, G. T. Subtyping for mutable typ es in ob ject-oriented programming languages. Technical Rep ort 92-36, Department of Computer Science, Iowa State University, Ames, Iowa, Novemb er 1992. [DMN70] Dahl, O.-J., Myrhaug, B., and Nygaard, K. SIMULA common base language. Technical Re- p ort 22, Norwegian Computing Center, Oslo, Norway, 1970. [GG89] Garland, S. and Guttag, J. An overview of LP, the Larch Prover. In Proceedings of the Third International ConferenceonRewriting Techniques and Applications, pages 137{151, Chap el Hill, NC, April 1989. Lecture Notes in Computer Science 355. [GHW85] Guttag, J. V., Horning, J. J., and Wing, J. M. The Larch family of sp eci cation languages. IEEE Software, 25:24{36, Septemb er 1985. [GM87] Goguen, J. A. and Meseguer, J. Unifying functional, ob ject-oriented and relational programming with logical semantics. In Shriver, B. and Wegner, P., editors, Research Directions in Object OrientedProgramming. MIT Press, 1987. [HM81] Hammer, M. and McLeo d, D. A semantic database mo del. ACM Trans. Database Systems, 63:351{386, 1981. [HO87] Halb ert, D. C. and O'Brien, P. D. Using typ es and inheritance in ob ject-oriented programming. IEEE Software, 45:71{79, Septemb er 1987. [Hoa72] Hoare, C. Pro of of correctness of data representations. Acta Informatica, 11:271{281, 1972. [LD92] Leavens, G. T. and Dhara, K. K. A foundation for the mo del theory of abstract data typ es with mutation and aliasing preliminary version. Technical Rep ort 92-35, Department of Computer Science, Iowa State University, Ames, Iowa, Novemb er 1992. [Lea89] Leavens, G. Verifying ob ject-oriented prograsm that use subtyp es. Technical Rep ort 439, MIT Lab oratory for Computer Science, February 1989. Ph.D. thesis. [Lea91] Leavens, G. T. Mo dular sp eci cation and veri cation of ob ject-oriented programs. IEEE Soft- ware, 84:72{80, July 1991. [LG85] Liskov, B. and Guttag, J. Abstraction and Speci cation in Program Design. MIT Press, 1985. 19 [Lip92] Lip eck, U. Semantics and usage of defaults in sp eci cations. In Foundations of Information Systems Speci cation and Design, March 1992. Dagstuhl Seminar 9212 Rep ort 35. [Lis92] Liskov, B. Preliminary design of the Thor ob ject-oriented database system. In Proc. of the SoftwareTechnology Conference.DARPA, April 1992. Also Programming Metho dology Group Memo 74, MIT Lab oratory for Computer Science, Cambridge, MA, March 1992. [LW90] Leavens, G. T. and Weihl, W. E. Reasoning ab out ob ject-oriented programs that use subtyp es. In ECOOP/OOPSLA '90 Proceedings, 1990. [LW92] Liskov, B. and Wing, J. Family values: A semantic notion of subtyping. Technical Rep ort 562, MIT Lab. for Computer Science, 1992. Also available as CMU-CS-92-220. [LW94] Liskov, B. and Wing, J. A b ehavioral notion of subtyping. ACM Trans. on Prog. Lang. and Systems, pages 1811{1841, Novemb er 1994. [Mey88] Meyer, B. Object-oriented Software Construction. Prentice Hall, New York, 1988. [MS90] Maier, D. and Stein, J. Development and implementation of an ob ject-oriented DBMS. In Zdonik, S. and Maier, D., editors, Readings in Object-Oriented Database Systems, pages 167{185. Morgan Kaufmann, 1990. [Nel91] Nelson, G. Systems Programming with Modula-3. Prentice Hall, 1991. + [SCB 86] Scha ert, C., Co op er, T., Bullis, B., Kilian, M., and Wilp olt, C. An intro duction to Trellis/Owl. In Proceedings of OOPSLA '86, pages 9{16, Septemb er 1986. [SH92] Scheid, J. and Holtsb erg, S. Ina Jo sp eci cation language reference manual. Technical Rep ort TM-6021/001/06, Paramax Systems Corp oration, A Unisys Company, June 1992. [Str86] Stroustrup, B. The C++ Programming Language. Addison-Wesley, 1986. [ZW97] Zaremski, A. and Wing, J. Sp eci cation matching of software comp onents. ACM Trans. on Software Engineering and Methodology, 64:333{369, Octob er 1997. 20