Behavioral Subtyping Using Invariants and Constraints

Barbara H. Liskov^a    Jeannette M. Wing

July 1999

CMU-CS-99-156

^a MIT Lab. for Computer Science, 545 Technology Square, Cambridge, MA 02139

School of Computer Science

Carnegie Mellon University

Pittsburgh, PA 15213

The work described in this paper is based on a November 1994 ACM TOPLAS paper, "A Behavioral
Notion of Subtyping," by the same authors. This paper's version has been submitted to the volume
Formal Methods for Distributed Processing: An Object-Oriented Approach, edited by Howard Bowman
and John Derrick.

Abstract

We present a way of defining the subtype relation that ensures that subtype objects preserve behavioral
properties of their supertypes. The subtype relation is based on the specifications of the sub- and supertypes.
Our approach handles mutable types and allows subtypes to have more methods than their supertypes.
Dealing with mutable types and subtypes that extend their supertypes has surprising consequences on how
to specify and reason about objects. In our approach, we discard the standard data type induction rule,
we prohibit the use of an analogous "history" rule, and we make up for both losses by adding explicit
predicates (invariants and constraints) to our type specifications. We also discuss the ramifications of our
approach of subtyping on the design of type families.

This research was supported for Liskov in part by the Advanced Research Projects Agency of the Department
of Defense, monitored by the Office of Naval Research under contract N00014-91-J-4136 and in part by the National
Science Foundation under Grant CCR-8822158; for Wing, by the Avionics Lab, Wright Research and Development
Center, Aeronautical Systems Division (AFSC), U. S. Air Force, Wright-Patterson AFB, OH 45433-6543 under
Contract F33615-90-C-1465, ARPA Order No. 7597.

Keywords: subtype, object-oriented design, abstraction function, mutable types, invariants, constraints, specifications

1 Introduction

What does it mean for one type to be a subtype of another? We argue that this is a semantic question
having to do with the behavior of the objects of the two types: the objects of the subtype ought to behave
the same as those of the supertype as far as anyone or any program using supertype objects can tell.

For example, in strongly typed object-oriented languages such as Simula 67 [DMN70], C++ [Str86],
Modula-3 [Nel91], and Trellis/Owl [SCB+86], subtypes are used to broaden the assignment statement. An
assignment

    x: T := E

is legal provided the type of expression E is a subtype of the declared type T of variable x. Once the
assignment has occurred, x will be used according to its "apparent" type T, with the expectation that if the
program performs correctly when the actual type of x's object is T, it will also work correctly if the actual
type of the object denoted by x is a subtype of T.

Clearly subtypes must provide the expected methods with compatible signatures. This consideration has
led to the formulation of the contra/covariance rules [BHJ+87, SCB+86, Car88]. However, these rules are
not strong enough to ensure that the program containing the above assignment will work correctly for any
subtype of T, since all they do is ensure that no type errors will occur. It is well known that type checking,
while very useful, captures only a small part of what it means for a program to be correct; the same is
true for the contra/covariance rules. For example, stacks and queues might both have a put method to add
an element and a get method to remove one. According to the contravariance rule, either could be a legal
subtype of the other. However, a program written in the expectation that x is a stack is unlikely to work
correctly if x actually denotes a queue, and vice versa.
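The stack and queue point can be made concrete with a small sketch. The class names `Stack` and `Queue` and the client function `reverse` are illustrative assumptions, not part of the paper; the two classes have identical method signatures, yet a client written with a stack in mind produces a different answer when handed a queue.

```python
class Stack:
    """put/get with last-in, first-out behavior."""
    def __init__(self):
        self._items = []

    def put(self, x):
        self._items.append(x)

    def get(self):
        return self._items.pop()       # removes the most recently added element


class Queue:
    """Same signatures as Stack, but first-in, first-out behavior."""
    def __init__(self):
        self._items = []

    def put(self, x):
        self._items.append(x)

    def get(self):
        return self._items.pop(0)      # removes the oldest element


def reverse(values, container):
    """Client code that silently relies on LIFO order."""
    for v in values:
        container.put(v)
    return [container.get() for _ in values]


reversed_with_stack = reverse([1, 2, 3], Stack())   # [3, 2, 1]
reversed_with_queue = reverse([1, 2, 3], Queue())   # [1, 2, 3]
```

No type error occurs in either call; only the behavior differs, which is exactly what signature rules cannot detect.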

What is needed is a stronger requirement that constrains the behavior of subtypes: properties that can
be proved using the specification of an object's presumed type should hold even though the object is actually
a member of a subtype of that type:

    Subtype Requirement: Let $\phi(x)$ be a property provable about objects x of type T. Then $\phi(y)$
    should be true for objects y of type S where S is a subtype of T.

A type's specification determines what properties we can prove about objects.

We are interested only in safety properties ("nothing bad happens"). First, properties of an object's
behavior in a particular program must be preserved: to ensure that a program continues to work as expected,
calls of methods made in the program that assume the object belongs to a supertype must have the same
behavior when the object actually belongs to a subtype. In addition, however, properties independent
of particular programs must be preserved because these are important when independent programs share
objects. We focus on two kinds of such properties: invariants, which are properties true of all states, and
history properties, which are properties true of all sequences of states. We formulate invariants as predicates
over single states and history properties as predicates over pairs of states. For example, an invariant property
of a bag is that its size never exceeds its bound; a history property is that the bag's bound does not change. We
do not address other kinds of safety properties of computations, e.g., the existence of an object in a state,
the number of objects in a state, or the relationship between objects in a state, since these do not have
to do with the meanings of types. We also do not address liveness properties ("something good eventually
happens"), e.g., the size of a bag will eventually reach the bound.

This chapter provides a general, yet easy to use, definition of the subtype relation that satisfies the
Subtype Requirement. Our approach handles mutable types and allows subtypes to have more methods
than their supertypes. Dealing with mutable types and subtypes that extend their supertypes has surprising
consequences on how to specify and reason about objects. In our approach, we discard the standard data
type induction rule, we prohibit the use of an analogous "history" rule, and we make up for both losses by
adding explicit predicates to our type specifications. Our specifications are formal, which means that they
have a precise mathematical meaning that serves as a firm foundation for reasoning. Our specifications can
also be used informally as described in [LG85].

Our definition applies in a very general distributed environment in which possibly concurrent users share
mutable objects. Our approach is also constructive: One can prove whether a subtype relation holds by
proving a small number of simple lemmas based on the specifications of the two types.

The chapter also explores the ramifications of the subtype relation and shows how interesting type families
can be defined. For example, arrays are not a subtype of sequences (because the user of a sequence expects
it not to change over time) and 32-bit integers are not a subtype of 64-bit integers (because a user of 64-bit
integers would expect certain method calls to succeed that will fail when applied to 32-bit integers). However,
type families can be defined that group such related types together and thus allow generic routines to be
written that work for all family members. Our approach makes it particularly easy to define type families:
it emphasizes the properties that all family members must preserve, and it does not require the introduction
of unnecessary methods (i.e., methods that the supertype would not naturally have).

The chapter is organized as follows. Section 2 discusses in more detail what we require of our subtype
relation and provides the motivation for our approach. We describe our model of computation in Section 3
and present our specification method in Section 4. We give a formal definition of subtyping in Section 5; we
discuss its ramifications on designing type hierarchies in Section 6. We describe related work in Section 7
and summarize our contributions in Section 8.

2 Motivation

To motivate the basic idea behind our notion of subtyping, let's look at an example. Consider a bounded
bag type that provides a put method that inserts elements into a bag and a get method that removes an
arbitrary element from a bag. Put has a pre-condition that checks to see that adding an element will not
grow the bag beyond its bound; get has a pre-condition that checks to see that the bag is non-empty.

Consider also a bounded stack type that has, in addition to push and pop methods, a swap_top method
that takes an integer, i, and modifies the stack by replacing its top with i. Stack's push and pop methods
have pre-conditions similar to bag's put and get, and swap_top has a pre-condition requiring that the stack
is non-empty.

Intuitively, stack is a subtype of bag because both kinds of collections behave similarly. The main
difference is that the get method for bags does not specify precisely what element is removed; the pop
method for stack is more constrained, but what it does is one of the permitted behaviors for bag's get
method. Let's ignore swap_top for the moment.

Suppose we want to show stack is a subtype of bag. We need to relate the values of stacks to those of
bags. This can be done by means of an abstraction function, like that used for proving the correctness of
implementations [Hoa72]. A given stack value maps to a bag value where we abstract from the insertion
order on the elements.

We also need to relate stack's methods to bag's. Clearly there is a correspondence between stack's push
method and bag's put and similarly for the pop and get methods (even though the names of the corresponding
methods do not match). The pre- and post-conditions of corresponding methods will need to relate in some
precise (to be defined) way. In showing this relationship we need to appeal to the abstraction function so
that we can reason about stack values in terms of their corresponding bag values.

Finally, what about swap_top? Most other definitions of the subtype relation have ignored such "extra"
methods, and it is perfectly adequate to do so when programs are considered in isolation and there is no aliasing.
In such a constrained situation, a program that uses an object that is apparently a bag but is actually a stack
will never call the extra methods, and therefore their behavior is irrelevant. However, we cannot ignore extra
methods in the presence of aliasing, and also in a general computational environment that allows sharing
of mutable objects by multiple users (or processes). In particular, we need to pay attention to extra mutator
methods like swap_top that modify their object.

Consider first the case of aliasing. The problem here is that within a program an object is accessible by
more than one name, so that modifications using one of the names are visible when the object is accessed
using the other name. For example, suppose $\sigma$ is a subtype of $\tau$ and that variables

    x: $\tau$
    y: $\sigma$

both denote the same object (which must, of course, belong to $\sigma$ or one of its subtypes). When the object
is accessed through x, only $\tau$ methods can be called. However, when it is used through y, $\sigma$ methods can be
called and if these methods are mutators, their effects will be visible later when the object is accessed via
x. To reason about the use of variable x using the specification of its type $\tau$, we need to impose additional
constraints on the subtype relation.
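The aliasing scenario can be sketched as follows. The classes `BoundedBag` and `BoundedStack` are hypothetical stand-ins for $\tau$ and $\sigma$, and Python inheritance is used here only for brevity (the paper's subtype relation is about specifications, not code reuse). The extra mutator `swap_top` is called through `y`, and its effect is observed through `x`.

```python
class BoundedBag:
    """Stand-in for the supertype tau: put, get, card."""
    def __init__(self, bound):
        self._items = []
        self._bound = bound

    def put(self, i):
        assert len(self._items) < self._bound   # pre-condition of put
        self._items.append(i)

    def get(self):
        assert self._items                      # pre-condition of get
        return self._items.pop()

    def card(self):
        return len(self._items)


class BoundedStack(BoundedBag):
    """Stand-in for the subtype sigma: adds the extra mutator swap_top."""
    def swap_top(self, i):
        assert self._items                      # pre-condition of swap_top
        self._items[-1] = i


y = BoundedStack(bound=3)   # y: sigma
x = y                       # x: tau, an alias for the same object

x.put(7)
size_before = x.card()
y.swap_top(8)               # mutation through the subtype's extra method
size_after = x.card()
got = x.get()               # observed through x: 8, not the 7 that x put
```

A user of `x` cannot conclude "I put 7, so get returns 7", but the size and the bound behave as bag's specification promises, which is the kind of property the subtype must preserve.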

Now consider the case of an environment of shared mutable objects, such as is provided by object-oriented
databases (e.g., Thor [Lis92] and Gemstone [MS90]). In such systems, there is a universe containing shared,
mutable objects and a way of naming those objects. In general, lifetimes of objects may be longer than
the programs that create and access them (i.e., objects might be persistent) and users (or programs) may
access objects concurrently and/or aperiodically for varying lengths of time. Of course there is a need for
some form of concurrency control in such an environment. We assume such a mechanism is in place, and
consider a computation to be made up out of atomic units (i.e., transactions) that exclude one another. The
transactions of different computations can be interleaved and thus one computation is able to observe the
modifications made by another.

If there were subtyping in such an environment the following situation might occur. A user installs a
directory object that maps string names to bags. Later, a second user enters a stack into the directory under
some string name; such a binding is analogous to assigning a subtype object to a variable of the supertype.
After this, both users occasionally access the stack object. The second user knows it is a stack and accesses
it using stack methods. The question is: What does the first user need to know in order for his or her
programs to make sense?

We think it ought to be sufficient for a user to know only about the "apparent" type of the object; the
subtype ought to preserve any properties that can be proved about the supertype. In particular, the first
user ought to be able to reason about his or her use of the stack object using invariant and history properties
of bag.

Our approach achieves this goal by adding information to type specifications. To handle invariants, we
add an invariant clause; to handle history properties, a constraint clause. Showing that $\sigma$ is a subtype of $\tau$
requires showing that (under the abstraction function) $\sigma$'s invariant implies $\tau$'s invariant and $\sigma$'s constraint
implies $\tau$'s constraint.

For example, for the bag and stack example, the two invariants are identical: both state that the size of
the bag (stack) is less than or equal to its bound. Similarly, the two constraints are identical: both state that
the bound of the bag (or stack) does not change. Showing that stack's invariant and constraint respectively
imply bag's invariant and constraint is trivial. The extra method swap_top is permitted because even though
it changes the stack's contents, it preserves stack's invariant and constraint.

In Section 5 we present and discuss our subtype definition. First, however, we define our model of
computation, and then discuss specifications, since these define the objects, values, and methods that will
be related by the subtype relation.

3 Model of Computation

We assume a universe of all potentially existing objects, Obj, partitioned into disjoint typed sets. Each object
has a unique identity. A type defines a set of values for an object and a set of methods that provide the only
means to manipulate that object. Effectively Obj is a set of unique identifiers for all objects that can contain
values.

Objects can be created and manipulated in the course of program execution. A state defines a value for
each existing object. It is a pair of mappings, an environment and a store. An environment maps program
variables to objects; a store maps objects to values.

    $State = Env \times Store$
    $Env = Var \to Obj$
    $Store = Obj \to Val$

Given a variable, x, and a state, $\rho$, with an environment, $\rho.e$, and store, $\rho.s$, we use the notation $x_\rho$ to denote
the value of x in state $\rho$; i.e., $x_\rho = \rho.s(\rho.e(x))$. When we refer to the domain of a state, $dom(\rho)$, we mean
more precisely the domain of the store in that state.
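As a minimal sketch of this state model, assuming Python dictionaries for the two mappings: the names `State`, `value_of`, and `dom`, and the object identifiers `"obj1"` and `"obj2"`, are illustrative choices rather than notation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """A state is a pair (environment, store)."""
    env: dict     # Env   = Var -> Obj
    store: dict   # Store = Obj -> Val


def value_of(x, rho):
    """x_rho = rho.s(rho.e(x)): look up the object, then its value."""
    return rho.store[rho.env[x]]


def dom(rho):
    """dom(rho): the domain of the store in state rho."""
    return set(rho.store)


# Two variables aliasing one object, plus a second object no variable names.
rho = State(
    env={"x": "obj1", "y": "obj1"},
    store={"obj1": ((), 3), "obj2": ((5,), 3)},
)

x_value = value_of("x", rho)
y_value = value_of("y", rho)
```

Because the environment maps variables to object identities rather than to values, aliasing falls out naturally: `x` and `y` above necessarily have the same value in every state where both name `obj1`.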

We model a type as a triple, $\langle O, V, M \rangle$, where $O \subseteq Obj$ is a set of objects, $V \subseteq Val$ is a set of values,
and M is a set of methods. Each method for an object is a producer, an observer, or a mutator. Producers
of an object of type $\tau$ return new objects of type $\tau$; observers return results of other types; mutators modify
objects of type $\tau$. An object is immutable if its value cannot change and otherwise it is mutable; a type is
immutable if its objects are and otherwise it is mutable. Clearly a type can be mutable only if some of its
methods are mutators. We allow mixed methods where a producer or an observer can also be a mutator.
We also allow methods to signal exceptions; we assume termination exceptions, i.e., each method call either
terminates normally or in one of a number of named exception conditions. To be consistent with object-oriented
language notation, we write x.m(a) to denote the call of method m on object x with the sequence
of arguments a.

Objects come into existence and get their initial values through creators. These are often called constructors
in the literature. Unlike other kinds of methods, creators do not belong to particular objects, but
rather are independent operations.

A computation, i.e., program execution, is a sequence of alternating states and transitions starting in
some initial state, $\rho_0$:

    $\rho_0 \; Tr_1 \; \rho_1 \; \ldots \; \rho_{n-1} \; Tr_n \; \rho_n$

Each transition, $Tr_i$, of a computation sequence is a partial function on states; we assume the execution of
each transition is atomic. A history is the subsequence of states of a computation; we use $\rho$ and $\psi$ to range
over states in any computation, c, where $\rho$ precedes $\psi$ in c. The value of an object can change only through
the invocation of a mutator; in addition the environment can change through assignment and the domain of
the store can change through the invocation of a creator or producer.

Objects are never destroyed:

    $\forall\, 1 \le i \le n \;.\; dom(\rho_{i-1}) \subseteq dom(\rho_i)$.
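The "never destroyed" condition is easy to check on a concrete history. The sketch below assumes a history is represented as a list of stores (dictionaries from object identifiers to values); the function name `never_destroys` and the sample data are hypothetical.

```python
def never_destroys(history):
    """Check dom(rho_{i-1}) is a subset of dom(rho_i) for each consecutive pair.

    `history` is a list of stores, one per state, each a dict Obj -> Val.
    """
    return all(set(prev) <= set(cur) for prev, cur in zip(history, history[1:]))


# A mutator changes o1's value, then a creator adds o2: domains only grow.
growing = [{"o1": 0}, {"o1": 1}, {"o1": 1, "o2": 0}]

# o2 disappears between the two states, which the model forbids.
shrinking = [{"o1": 0, "o2": 0}, {"o1": 0}]

growing_ok = never_destroys(growing)
shrinking_ok = never_destroys(shrinking)
```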

4 Specifications

4.1 Type Specifications

A type specification includes the following information:

• The type's name;
• A description of the type's value space;
• A definition of the type's invariant and history properties;
• For each of the type's methods:
    – Its name;
    – Its signature (including signaled exceptions);
    – Its behavior in terms of pre-conditions and post-conditions.

Note that the creators are missing. Omitting creators allows subtypes to provide different creators than
their supertypes. In addition, omitting creators makes it easy for a type to have multiple implementations,
allows new creators to be added later, and reflects common usage: for example, Java interfaces and virtual
types provide no way for users to create objects of the type. We show how to specify creators in Section 4.2.

In our work we use formal specifications in the two-tiered style of Larch [GHW85]. The first tier defines
sorts, which are used to define the value spaces of objects. In the second tier, Larch interfaces are used to
define types.

For example, Figure 1 gives a specification for a bag type whose objects have methods put, get, card,
and equal. The uses clause defines the value space for the type by identifying a sort. The clause in the
figure indicates that values of objects of type bag are denotable by terms of sort B introduced in the BBag
specification; a value of this sort is a pair, $\langle elems, bound \rangle$, where elems is a mathematical multiset of integers
and bound is a natural number. The notation $\{\}$ stands for the empty multiset, $\cup$ is a commutative
operation on multisets that does not discard duplicates, $\in$ is the membership operation, and $|x|$ is a
cardinality operation that returns the total number of elements in the multiset x. These operations as well
as equality ($=$) and inequality ($\neq$) are all defined in BBag.

The invariant clause contains a single-state predicate that defines the type's invariant properties. The
constraint clause contains a two-state predicate that defines the type's history properties. We will discuss
these clauses in more detail in subsequent sections.

bag = type
    uses BBag (bag for B)
    for all b: bag

    invariant $|b_\rho.elems| \le b_\rho.bound$
    constraint $b_\rho.bound = b_\psi.bound$

    put = proc (i: int)
        requires $|b_{pre}.elems| < b_{pre}.bound$
        modifies b
        ensures $b_{post}.elems = b_{pre}.elems \cup \{i\} \wedge b_{post}.bound = b_{pre}.bound$

    get = proc () returns (int)
        requires $b_{pre}.elems \neq \{\}$
        modifies b
        ensures $b_{post}.elems = b_{pre}.elems - \{result\} \wedge result \in b_{pre}.elems \wedge b_{post}.bound = b_{pre}.bound$

    card = proc () returns (int)
        ensures $result = |b_{pre}.elems|$

    equal = proc (a: bag) returns (bool)
        ensures $result = (a = b)$
end bag

Figure 1: A Type Specification for Bags
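The clauses of Figure 1 can be read as executable predicates over bag values. The sketch below is one possible encoding, assuming a bag value is a Python pair `(elems, bound)` with `elems` a `collections.Counter` acting as the multiset; the function names are illustrative and the Larch semantics is only approximated.

```python
from collections import Counter


def size(b):
    """|b.elems|: total number of elements, counting duplicates."""
    return sum(b[0].values())


def invariant(b):
    """|b.elems| <= b.bound"""
    return size(b) <= b[1]


def constraint(b_rho, b_psi):
    """b_rho.bound = b_psi.bound"""
    return b_rho[1] == b_psi[1]


def put_pre(b_pre, i):
    """requires |b_pre.elems| < b_pre.bound"""
    return size(b_pre) < b_pre[1]


def put_post(b_pre, i, b_post):
    """ensures b_post.elems = b_pre.elems U {i} and the bound is unchanged"""
    return b_post[0] == b_pre[0] + Counter([i]) and b_post[1] == b_pre[1]


def get_pre(b_pre):
    """requires b_pre.elems != {}"""
    return size(b_pre) > 0


def get_post(b_pre, b_post, result):
    """ensures result was in the bag, is removed once, and the bound is unchanged"""
    return (b_pre[0][result] > 0
            and b_post[0] == b_pre[0] - Counter([result])
            and b_post[1] == b_pre[1])


before = (Counter([1, 2]), 3)
after_put = (Counter([1, 2, 2]), 3)      # a value allowed by put(2)
after_get = (Counter([1, 2]), 3)         # a value allowed by get() returning 2
```

Note that `get_post` accepts any `result` that was in the bag, which reflects the nondeterminism the specification leaves to implementations.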

The body of a type specification provides a specification for each method. Since a method's specification
needs to refer to the method's object, we introduce a name for that object in the for all line. We use result
to name a method's result parameter. In the requires and ensures clauses x stands for an object, $x_{pre}$ for
its value in the initial state, and $x_{post}$ for its value in the final state.^1 Distinguishing between initial and
final values is necessary only for mutable types, so we suppress the subscripts for parameters of immutable
types (like integers). We need to distinguish between an object, x, and its value, $x_{pre}$ or $x_{post}$, because
we sometimes need to refer to the object itself, e.g., in the equal method, which determines whether two
(mutable) bags are the same object.

A method m's pre-condition, denoted m.pre, is the predicate that appears in its requires clause; e.g.,
put's pre-condition checks to see that adding an element will not enlarge the bag beyond its bound. If the
clause is missing, the pre-condition is trivially "true."

A method m's post-condition, denoted m.post, is the conjunction of the predicates given by its modifies
and ensures clauses. A modifies $x_1, \ldots, x_n$ clause is shorthand for the predicate:

    $\forall x \in (dom(pre) - \{x_1, \ldots, x_n\}) \;.\; x_{pre} = x_{post}$

which says only objects listed may change in value. A modifies clause is a strong statement about all
objects not explicitly listed, i.e., their values may not change; if there is no modifies clause then nothing
may change. For example, card's post-condition says that it returns the size of the bag and no objects
(including the bag) change, and put's post-condition says that the bag's value changes by the addition of its
integer argument, and no other objects change.

^1 Note that pre and post are implicitly universally quantified variables over states. Also, more formally, $x_{pre}$ stands for
$pre.s(pre.e(x))$; $x_{post}$ stands for $post.s(post.e(x))$.
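The modifies predicate can be checked directly on a pair of stores. In this sketch a store is assumed to be a dictionary from object identifiers to values, and `modifies_ok` is a hypothetical helper name; the object identifiers `"b"` and `"c"` are made up for the example.

```python
def modifies_ok(pre_store, post_store, modified):
    """True if every object in dom(pre) that is NOT listed keeps its value.

    `modified` is the set of objects named in the modifies clause; an empty
    set models a method with no modifies clause (nothing may change).
    """
    return all(
        obj in post_store and post_store[obj] == pre_store[obj]
        for obj in pre_store
        if obj not in modified
    )


pre = {"b": ((1,), 3), "c": ((), 5)}
post_put = {"b": ((1, 2), 3), "c": ((), 5)}    # put(2) on b, declared "modifies b"
post_bad = {"b": ((1, 2), 3), "c": ((9,), 5)}  # some other object c changed as well

put_respects_clause = modifies_ok(pre, post_put, {"b"})
bad_respects_clause = modifies_ok(pre, post_bad, {"b"})
card_respects_clause = modifies_ok(pre, pre, set())
```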

Methods may terminate normally or exceptionally; the exceptions are listed in a signals clause in the
method's header. For example, instead of the get method we might have had

    get' = proc () returns (int) signals (empty)
        modifies b
        ensures if $b_{pre}.elems = \{\}$ then signal empty
                else $b_{post}.elems = b_{pre}.elems - \{result\} \wedge result \in b_{pre}.elems \wedge b_{post}.bound = b_{pre}.bound$

4.2 Specifying Creators

Objects are created and initialized through creators. Figure 2 shows specifications for three different creators
for bags. The first creator creates a new empty bag whose bound is its integer argument. The second and
third creators fix the bag's bound to be 100. The third creator uses its integer argument to create a singleton
bag. The assertion new(x) stands for the predicate:

    $x \in dom(post) - dom(pre)$

Recall that objects are never destroyed so that $dom(pre) \subseteq dom(post)$.

bag_create = proc (n: int) returns (bag)
    requires $n \ge 0$
    ensures $new(result) \wedge result_{post} = \langle \{\}, n \rangle$

bag_create_small = proc () returns (bag)
    ensures $new(result) \wedge result_{post} = \langle \{\}, 100 \rangle$

bag_create_single = proc (i: int) returns (bag)
    ensures $new(result) \wedge result_{post} = \langle \{i\}, 100 \rangle$

Figure 2: Creator Specifications for Bags

4.3 Type Specifications Need Explicit Invariants

By not including creators in type specifications and by allowing subtypes to extend supertypes with mutators
we lose a powerful reasoning tool: data type induction. Data type induction is used to prove type invariants.
The base case of the rule requires that each creator of the type establish the invariant; the inductive case
requires that each method (in particular each mutator) preserve the invariant. Without the creators, we have
no base case. Without knowing all mutators of type $\tau$ (as added by $\tau$'s subtypes), we have an incomplete
inductive case. With no data type induction rule, we cannot prove type invariants!

To compensate for the lack of a data type induction rule, we state the invariant explicitly in the type
specification through an invariant clause; if the invariant is trivial (i.e., identical to "true"), the clause can
be omitted. The invariant defines the legal values of its type $\tau$. For example, we include

    invariant $|b_\rho.elems| \le b_\rho.bound$

in the type specification of Figure 1 to state that the size of a bounded bag never exceeds its bound. The
predicate $\phi(x_\rho)$ appearing in an invariant clause for type $\tau$ stands for the predicate: For all computations,
c, and all states $\rho$ in c,

    $\forall x: \tau \;.\; x \in dom(\rho) \Rightarrow \phi(x_\rho)$

Any additional invariant property must follow from the conjunction of the type's invariant and invariants
that hold for the entire value space. For example, we could show that the size of a bag is nonnegative because
this is true for all mathematical multiset values.

As part of specifying a type and its creators we must show that the invariant holds for all objects of the
type. All creators for a type $\tau$ must establish $\tau$'s invariant, $I_\tau$:

    For each creator for type $\tau$, show for all $x: \tau$ that $I_\tau[result_{post}/x_\rho]$.

where $P[a/b]$ stands for predicate P with every occurrence of b replaced by a. Similarly, each producer must
establish the invariant on its newly-created object. In addition, each mutator of the type must preserve the
invariant. To prove this, we assume each mutator is called on an object of type $\tau$ with a legal value (one
that satisfies the invariant), and show that any value of a $\tau$ object it modifies is legal:

    For each mutator m of $\tau$, for all $x: \tau$ assume $I_\tau[x_{pre}/x_\rho]$ and show $I_\tau[x_{post}/x_\rho]$.

For example, we would need to show that the three creators for bag establish the invariant, and that
put and get preserve the invariant for bag. (We can ignore card and equal because they are observers.)
Informally the invariant holds because each creator guarantees that the size is no larger than the bound;
put's pre-condition checks that there is enough room in the bag for another element; and get either decreases
the size of the bag or leaves it the same.

The loss of data type induction means that additional invariants cannot be proved. Therefore the specifier
must be careful to define an invariant that is strong enough that all desired invariants follow from it.
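The two proof obligations of this section (creators establish the invariant, mutators preserve it) can be approximated by exhaustive checking over a small value space. This is only a bounded test, not a proof; the encoding of bag values as `(Counter, bound)` pairs, the tiny universe `(0, 1)`, and all function names below are assumptions made for the sketch.

```python
from collections import Counter
from itertools import product


def inv(b):
    """I_bag: |b.elems| <= b.bound, for a value b = (elems, bound)."""
    elems, bound = b
    return sum(elems.values()) <= bound


# Creators of Figure 2, modeled as functions returning the new object's value.
def bag_create(n):
    assert n >= 0                        # requires n >= 0
    return (Counter(), n)

def bag_create_small():
    return (Counter(), 100)

def bag_create_single(i):
    return (Counter([i]), 100)


# Mutators of Figure 1, modeled as functions from pre-value to post-value.
def put(b, i):
    elems, bound = b
    assert sum(elems.values()) < bound   # requires
    return (elems + Counter([i]), bound)

def get(b):
    elems, bound = b
    assert sum(elems.values()) > 0       # requires
    result = min(elems)                  # any member is a permitted choice
    return (elems - Counter([result]), bound), result


def legal_values(max_bound=3, universe=(0, 1)):
    """Enumerate every value over a tiny universe that satisfies inv."""
    for bound in range(max_bound + 1):
        for counts in product(range(bound + 1), repeat=len(universe)):
            # Unary + drops zero counts so only real members remain as keys.
            b = (+Counter(dict(zip(universe, counts))), bound)
            if inv(b):
                yield b


def check_invariant_obligations(universe=(0, 1)):
    # Base case: each creator establishes the invariant.
    creators_ok = (
        all(inv(bag_create(n)) for n in range(4))
        and inv(bag_create_small())
        and all(inv(bag_create_single(i)) for i in universe)
    )
    # Inductive case: each mutator, called on a legal value, yields a legal value.
    mutators_ok = True
    for b in legal_values(universe=universe):
        size = sum(b[0].values())
        if size < b[1]:
            mutators_ok = mutators_ok and all(inv(put(b, i)) for i in universe)
        if size > 0:
            mutators_ok = mutators_ok and inv(get(b)[0])
    return creators_ok and mutators_ok


obligations_hold = check_invariant_obligations()
```

A subtype that adds a mutator would have to be added to the inductive loop as well, which is why the check cannot be completed from the supertype's specification alone.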

4.4 Type Specifications Need Explicit Constraints

We are interested in the history properties of objects in addition to their invariant properties. We can
formulate history properties as predicates over state pairs, and prove them using the history rule:

    History Rule: For each of the i mutators m of $\tau$, for all $x: \tau$:

        $\dfrac{m_i.pre \wedge m_i.post \Rightarrow \phi[x_{pre}/x_\rho,\; x_{post}/x_\psi]}{\phi(x_\rho, x_\psi)}$

We cannot use this history rule directly, however. It is incomplete since subtypes may define additional
mutators. If we use it without considering the extra mutators, it is easy to prove properties that do not hold
for subtype objects!

To compensate for the lack of the history rule, we state history properties explicitly in the type specification
through a constraint clause^2; if the constraint is trivial, the clause can be omitted. For example,
the constraint

    constraint $b_\rho.bound = b_\psi.bound$

in the specification of bag declares that a bag's bound never changes. As another example, consider a fat_set
object that has an insert but no delete method; fat_sets only grow in size. The constraint for fat_set would
be:

    constraint $\forall i: int \;.\; i \in s_\rho \Rightarrow i \in s_\psi$

The predicate $\phi(x_\rho, x_\psi)$ appearing in a constraint clause for type $\tau$ stands for the predicate: For all
computations, c, and all states $\rho$ and $\psi$ in c such that $\rho$ precedes $\psi$,

    $\forall x: \tau \;.\; x \in dom(\rho) \Rightarrow \phi(x_\rho, x_\psi)$

^2 The use of the term "constraint" is borrowed from the Ina Jo specification language [SH92], which also includes constraints
in specifications.

Note that we do not require that $\psi$ be the immediate successor of $\rho$ in c.

Just as we had to prove that methods preserve the invariant, we must show that they satisfy the constraint.
This is done by using the history rule for each mutator.

The loss of the history rule is analogous to the loss of a data type induction rule. A practical consequence
of not having a history rule is that the specifier must make the constraint strong enough so that all desired
history properties follow from it.
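A constraint is a two-state predicate that must hold for every pair of states in which the first precedes the second, not just for adjacent states. The sketch below checks the fat_set constraint over a recorded history of one object's values; the representation of a history as a list of `frozenset` values and the helper names are assumptions made for illustration.

```python
from itertools import combinations


def fat_set_constraint(s_rho, s_psi):
    """For all i: i in s_rho implies i in s_psi (the set only grows)."""
    return s_rho <= s_psi


def satisfies_constraint(history, constraint):
    """Check the constraint for every pair (rho, psi) with rho before psi.

    `history` lists the successive values of one object; combinations()
    yields pairs in list order, so the first component always precedes
    the second.
    """
    return all(constraint(rho, psi) for rho, psi in combinations(history, 2))


inserts_only = [frozenset(), frozenset({1}), frozenset({1, 2})]
with_delete = [frozenset({1, 2}), frozenset({1})]   # an element was removed

inserts_only_ok = satisfies_constraint(inserts_only, fat_set_constraint)
with_delete_ok = satisfies_constraint(with_delete, fat_set_constraint)
```

A subtype of fat_set that added a delete method would produce histories like `with_delete`, so the explicit constraint rules it out even though the supertype's own mutators never violate it.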

5 The Meaning of Subtype

5.1 Specifying Subtypes

To state that a type is a subtype of some other type, we simply append a subtype clause to its specification.
We allow multiple supertypes; there would be a separate subtype clause for each. An example is given in
Figure 3.

A subtype's value space may be different from its supertype's. For example, in the figure the sort, S,
for bounded stack values is defined in BStack as a pair, $\langle items, limit \rangle$, where items is a sequence of integers
and limit is a natural number. The invariant indicates that the length of the stack's sequence component is
less than or equal to its limit. The constraint indicates that the stack's limit does not change. In the pre-
and post-conditions, $[\,]$ stands for the empty sequence, $||$ is concatenation, last picks off the last element of
a sequence, and allButLast returns a new sequence with all but the last element of its argument.

Under the subtype clause we define an abstraction function, A, that relates stack values to bag values
by relying on the helping function, mk_elems, that maps sequences to multisets in the obvious manner. (We
will revisit this abstraction function in Section 5.3.) The subtype clause also lets specifiers relate subtype
methods to those of the supertype. The subtype must provide all methods of its supertype; we refer to these
as the inherited methods.^3 Inherited methods can be renamed, e.g., push for put; all other methods of the
supertype are inherited without renaming, e.g., equal. In addition to the inherited methods, the subtype
may also have some extra methods, e.g., swap_top. (Stack's equal method must take a bag as an argument
to satisfy the contravariance requirement. We discuss this issue further in Section 6.1.)

5.2 Definition of Subtype

The formal definition of the subtype relation, $\prec$, is given in Figure 4. It relates two types, $\sigma$ and $\tau$, each of
whose specifications respectively preserves its invariant, $I_\sigma$ and $I_\tau$, and satisfies its constraint, $C_\sigma$ and $C_\tau$.
In the rules, since x is an object of type $\sigma$, its value ($x_{pre}$ or $x_{post}$) is a member of S and therefore cannot be
used directly in the predicates about $\tau$ objects (which are in terms of values in T). The abstraction function
A is used to translate these values so that the predicates about $\tau$ objects make sense. A may be partial,
need not be onto, but can be many-to-one. We require that an abstraction function be defined for all legal
values of the subtype (although it need not be defined for values that do not satisfy the subtype invariant).
Moreover, it must map legal values of the subtype to legal values of the supertype.

The first clause addresses the need to relate inherited methods of the subtype. Our formulation is similar
to America's [Ame90]. The first two signature rules are the standard contra/covariance rules. The exception
rule says that $m_\sigma$ may not signal more than $m_\tau$, since a caller of a method on a supertype object should not
expect to handle an unknown exception. The pre- and post-condition rules are the intuitive counterparts to
the contravariant and covariant rules for signatures. The pre-condition rule ensures the subtype's method
can be called at least in any state required by the supertype. The post-condition rule says that the subtype
method's post-condition can be stronger than the supertype method's post-condition; hence, any property
that can be proved based on the supertype method's post-condition also follows from the subtype's method's
post-condition.

The second clause addresses preserving program-independent properties. The invariant rule and the assumption that the type specification preserves the invariant suffice to argue that invariant properties of a supertype are preserved by the subtype. The argument for the preservation of the subtype's history properties is completely analogous, using the constraint rule and the assumption that the type specification satisfies its constraint.

Footnote 3. We do not mean that the subtype inherits the code of these methods but simply that it provides methods with the same behavior (as defined below) as the corresponding supertype methods.

stack = type
  uses BStack (stack for S)
  for all s: stack
  invariant $length(s_\rho.items) \le s_\rho.limit$
  constraint $s_\rho.limit = s_\psi.limit$

  push = proc (i: int)
    requires $length(s_{pre}.items) < s_{pre}.limit$
    modifies s
    ensures $s_{post}.items = s_{pre}.items \,\|\, [i] \land s_{post}.limit = s_{pre}.limit$

  pop = proc () returns (int)
    requires $s_{pre}.items \ne [\,]$
    modifies s
    ensures $result = last(s_{pre}.items) \land s_{post}.items = allButLast(s_{pre}.items) \land s_{post}.limit = s_{pre}.limit$

  swap_top = proc (i: int)
    requires $s_{pre}.items \ne [\,]$
    modifies s
    ensures $s_{post}.items = allButLast(s_{pre}.items) \,\|\, [i] \land s_{post}.limit = s_{pre}.limit$

  height = proc () returns (int)
    ensures $result = length(s_{pre}.items)$

  equal = proc (t: bag) returns (bool)
    ensures $result = (s = t)$

  subtype of bag (push for put, pop for get, height for card)
    $\forall st : S \,.\, A(st) = \langle mk\_elems(st.items), st.limit \rangle$
    where $mk\_elems : Seq \to M$
    $\forall i : Int, sq : Seq$
      $mk\_elems([\,]) = \{\}$
      $mk\_elems(sq \,\|\, [i]) = mk\_elems(sq) \cup \{i\}$
end stack

Figure 3: Stack Type

Definition of the subtype relation, $\preceq$: $\sigma = \langle O_\sigma, S, M \rangle$ is a subtype of $\tau = \langle O_\tau, T, N \rangle$ if there exists an abstraction function, $A : S \to T$, and a renaming map, $R : M \to N$, such that:

1. Subtype methods preserve the supertype methods' behavior. If $m_\tau$ of $\tau$ is the corresponding renamed method $m_\sigma$ of $\sigma$, the following rules must hold:

   - Signature rule.
     - Contravariance of arguments. $m_\tau$ and $m_\sigma$ have the same number of arguments. If the list of argument types of $m_\tau$ is $\alpha_i$ and that of $m_\sigma$ is $\beta_i$, then $\forall i \,.\, \alpha_i \preceq \beta_i$.
     - Covariance of result. Either both $m_\tau$ and $m_\sigma$ have a result or neither has. If there is a result, let $m_\tau$'s result type be $\gamma$ and $m_\sigma$'s be $\delta$. Then $\delta \preceq \gamma$.
     - Exception rule. The exceptions signaled by $m_\sigma$ are contained in the set of exceptions signaled by $m_\tau$.
   - Methods rule. For all $x : \sigma$:
     - Pre-condition rule. $m_\tau.pre[A(x_{pre})/x_{pre}] \Rightarrow m_\sigma.pre$
     - Post-condition rule. $m_\sigma.post \Rightarrow m_\tau.post[A(x_{pre})/x_{pre}, A(x_{post})/x_{post}]$

2. Subtypes preserve supertype properties. For all computations, $c$, and all states $\rho$ and $\psi$ in $c$ such that $\rho$ precedes $\psi$, for all $x : \sigma$:

   - Invariant Rule. Subtype invariants ensure supertype invariants.
     $I_\sigma \Rightarrow I_\tau[A(x_\rho)/x_\rho]$
   - Constraint Rule. Subtype constraints ensure supertype constraints.
     $C_\sigma \Rightarrow C_\tau[A(x_\rho)/x_\rho, A(x_\psi)/x_\psi]$

Figure 4: Definition of the Subtype Relation

We do not include the invariant in the methods (or constraint) rule directly. For example, the pre-condition rule could have been

$(m_\tau.pre[A(x_{pre})/x_{pre}] \land I_\tau[A(x_{pre})/x_{pre}]) \Rightarrow m_\sigma.pre$

We omit adding the invariant because if it is needed in doing a proof it can always be assumed, since it is known to be true for all objects of its type.

Note that in the various rules we require $x : \sigma$, yet $x$ appears in predicates concerning $\tau$ objects as well. This makes sense because $\sigma \preceq \tau$.

5.3 Applying the Definition of Subtyping as a Checklist

Proofs of the subtype relation are usually obvious and can be done by inspection. Typically, the only interesting part is the definition of the abstraction function; the other parts of the proof are usually straightforward. However, this section goes through the steps of an informal proof just to show what kind of reasoning is involved. Formal versions of these informal proofs are given in [LW92].

Let's revisit the stack and bag example using our definition as a checklist. Here

$\sigma = \langle O_{stack}, S, \{push, pop, swap\_top, height, equal\} \rangle$

$\tau = \langle O_{bag}, B, \{put, get, card, equal\} \rangle$

Recall that we represent a bounded bag's value as a pair, $\langle elems, bound \rangle$, of a multiset of integers and a fixed bound, and a bounded stack's value as a pair, $\langle items, limit \rangle$, of a sequence of integers and a fixed bound. It can easily be shown that each specification preserves its invariant and satisfies its constraint.

We use the abstraction function and the renaming map given in the specification for stack in Figure 3. The abstraction function states that for all $st : S$

$A(st) = \langle mk\_elems(st.items), st.limit \rangle$

where the helping function, $mk\_elems : Seq \to M$, maps sequences to multisets such that for all $sq : Seq$, $i : Int$:

$mk\_elems([\,]) = \{\}$

$mk\_elems(sq \,\|\, [i]) = mk\_elems(sq) \cup \{i\}$

$A$ is partial; it is defined only for sequence-natural number pairs, $\langle items, limit \rangle$, where limit is greater than or equal to the size of items.

The renaming map $R$ is

$R(push) = put$
$R(pop) = get$
$R(height) = card$
$R(equal) = equal$

Checking the signature and exception rules is easy and could be done by the compiler.
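As a rough illustration only, the abstraction function and renaming map above can be written down as executable definitions. The Python below is a sketch under assumptions: the names `StackValue`, `BagValue`, and `abstraction` are hypothetical, a tuple stands in for the sequence sort, and `collections.Counter` stands in for the multiset sort. It is not part of the Larch specification.

```python
from collections import Counter
from typing import NamedTuple, Tuple


class StackValue(NamedTuple):
    items: Tuple[int, ...]  # sequence component
    limit: int              # fixed bound on the length of items


class BagValue(NamedTuple):
    elems: Counter          # multiset component
    bound: int


def mk_elems(sq):
    """mk_elems([]) = {}  and  mk_elems(sq || [i]) = mk_elems(sq) U {i}."""
    if not sq:
        return Counter()
    return mk_elems(sq[:-1]) + Counter([sq[-1]])


def abstraction(st: StackValue) -> BagValue:
    """A(st) = <mk_elems(st.items), st.limit>, defined only for legal stack values."""
    if len(st.items) > st.limit:
        raise ValueError("A is undefined for values that violate the stack invariant")
    return BagValue(mk_elems(st.items), st.limit)


# Renaming map R from stack methods to bag methods.
# swap_top is an extra method, so it has no image under R.
R = {"push": "put", "pop": "get", "height": "card", "equal": "equal"}
```

Two stacks holding the same elements in a different order map to the same bag value, which is the many-to-one behavior the definition permits.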

Next, we show the correspondences between push and put, between pop and get, etc. Let's look at the pre- and post-condition rules for just one method, push. Informally, the pre-condition rule for put/push requires that we show (see footnote 4):

$|A(s_{pre}).elems| < A(s_{pre}).bound \Rightarrow length(s_{pre}.items) < s_{pre}.limit$

Intuitively, the pre-condition rule holds because the length of a stack is the same as the size of the corresponding bag and the limit of the stack is the same as the bound for the bag. Here is an informal proof with slightly more detail:

1. $A$ maps the stack's sequence component to the bag's multiset by putting all elements of the sequence into the multiset. Therefore the length of the sequence $s_{pre}.items$ is equal to the size of the multiset $A(s_{pre}).elems$.

2. Also, $A$ maps the limit of the stack to the bound of the bag so that $s_{pre}.limit = A(s_{pre}).bound$.

3. From put's pre-condition we know $|A(s_{pre}).elems| < A(s_{pre}).bound$.

4. push's pre-condition holds by substituting equals for equals.

Note the role of the abstraction function in this proof. It allows us to relate stack and bag values, and therefore we can relate predicates about bag values to those about stack values and vice versa. Also, note how we depend on $A$ being a function in step 4 where we use the substitutivity property of equality.

The post-condition rule requires that we show push's post-condition implies put's. We can deal with the modifies and ensures parts separately. The modifies part holds because the same object is mentioned in both specifications. The ensures part follows from the definition of the abstraction function.
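The pre- and post-condition rules for push and put can also be checked by brute force over a small finite set of stack values. The sketch below is a sanity check, not a proof. It assumes that put requires $|b_{pre}.elems| < b_{pre}.bound$ (as in step 3 above) and that put ensures the element is added with the bound unchanged (an assumption about the bag specification of Section 4.1, which is not reproduced here). All function names are hypothetical.

```python
from collections import Counter
from itertools import product

VALUES = (0, 1)  # tiny element domain, chosen only to keep the search finite


def A(stack):
    """Abstraction function: (items, limit) -> (multiset, bound)."""
    items, limit = stack
    return (Counter(items), limit)


def put_pre(bag):
    elems, bound = bag
    return sum(elems.values()) < bound


def put_post(pre, i, post):
    # Assumed ensures clause of put: element added, bound unchanged.
    return post[0] == pre[0] + Counter([i]) and post[1] == pre[1]


def push_pre(stack):
    items, limit = stack
    return len(items) < limit


def push_post(pre, i, post):
    return post[0] == pre[0] + (i,) and post[1] == pre[1]


def legal_stacks(max_limit=3):
    """All stack values over VALUES that satisfy length(items) <= limit."""
    return [(items, limit)
            for limit in range(max_limit + 1)
            for n in range(limit + 1)
            for items in product(VALUES, repeat=n)]


def pre_condition_rule(stacks):
    # put.pre[A(s_pre)/s_pre]  =>  push.pre
    return all(push_pre(s) for s in stacks if put_pre(A(s)))


def post_condition_rule(stacks):
    # push.post  =>  put.post[A(s_pre)/s_pre, A(s_post)/s_post]
    return all(put_post(A(pre), i, A(post))
               for pre in stacks for post in stacks for i in VALUES
               if push_post(pre, i, post))
```

Calling `pre_condition_rule(legal_stacks())` and `post_condition_rule(legal_stacks())` exercises both implications on every pair of sampled states. Passing such a check does not replace the informal argument, but a failure would point to a concrete counterexample.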

The invariant rule requires that we show that the invariant on stacks:

$length(s_\rho.items) \le s_\rho.limit$

implies that on bags:

$|A(s_\rho).elems| \le A(s_\rho).bound$

We can show this by a simple proof of induction on the length of the sequence of a bounded stack.

Footnote 4. Note that we are reasoning in terms of the values of the object, s, and that b and s refer to the same object (b appears in the bag specification).

The constraint rule requires that we show that the constraint on stacks:

$s_\rho.limit = s_\psi.limit$

implies that on bags:

$A(s_\rho).bound = A(s_\psi).bound$

This is true because $A$ maps the limit of a stack, unchanged, to the bound of its bag counterpart.
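The invariant and constraint rules can be exercised in the same brute-force style. The following sketch assumes, as the text states, that the stack invariant is length(items) <= limit, that the bag invariant is |elems| <= bound, and that both constraints say the bound (or limit) never changes. The helper names are hypothetical.

```python
from collections import Counter
from itertools import product


def A(stack):
    """Abstraction function: (items, limit) -> (multiset, bound)."""
    items, limit = stack
    return (Counter(items), limit)


def stack_invariant(stack):
    items, limit = stack
    return len(items) <= limit


def bag_invariant(bag):
    elems, bound = bag
    return sum(elems.values()) <= bound


def stack_constraint(rho, psi):
    return rho[1] == psi[1]   # limit is the same in both states


def bag_constraint(rho, psi):
    return rho[1] == psi[1]   # bound is the same in both states


def rules_hold(max_size=3):
    """Return (invariant rule holds, constraint rule holds) on a finite sample."""
    candidates = [(items, limit)
                  for limit in range(max_size + 1)
                  for n in range(max_size + 1)
                  for items in product((0, 1), repeat=n)]
    # A need only be defined on legal values, so restrict to those.
    legal = [v for v in candidates if stack_invariant(v)]
    invariant_rule = all(bag_invariant(A(v)) for v in legal)
    constraint_rule = all(bag_constraint(A(rho), A(psi))
                          for rho in legal for psi in legal
                          if stack_constraint(rho, psi))
    return invariant_rule, constraint_rule
```

The sample deliberately includes candidate values that violate the stack invariant, then filters them out, mirroring the fact that the abstraction function is partial.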

Note that we do not have to say anything specific for swap_top; it is taken care of just like all the other methods when we show that the specification of stack satisfies its invariant and constraint.

6 Type Hierarchies

The requirement we impose on subtypes is very strong and raises a concern that it might rule out many useful subtype relations. To address this concern we looked at a number of examples. We found that our technique captures what people want from a hierarchy mechanism, but we also discovered some surprises.

The examples led us to classify subtype relationships into two broad categories. In the first category, the subtype extends the supertype by providing additional methods and possibly additional "state." In the second, the subtype is more constrained than the supertype. We discuss these relationships below. In practice, many type families will exhibit both kinds of relationships.

6.1 Extension Subtypes

A subtype extends its supertype if its objects have extra methods in addition to those of the supertype. Abstraction functions for extension subtypes are onto, i.e., the range of the abstraction function is the set of all legal values of the supertype. The subtype might simply have more methods; in this case the abstraction function is one-to-one. Or its objects might also have more "state," i.e., they might record information that is not present in objects of the supertype; in this case the abstraction function is many-to-one.

As an example of the one-to-one case, consider a type intset (for set of integers) with methods to insert and delete elements, to select elements, and to provide the size of the set. A subtype, intset2, might have more methods, e.g., union, is_empty. Here there is no extra state, just extra methods. Suppose intset's invariant and constraints are both trivial; intset2's would be as well. Thus, proving that intset2 preserves intset's invariant and constraint is trivial.

It is easy to discover when a proposed subtype really is not one. For example, the fat set type discussed earlier has an insert method but no delete method. Intset is not a subtype of fat set because fat sets only grow while intsets grow and shrink; intset does not preserve various history properties of fat set, in particular, the constraint that once some integer is in the fat set, it remains in the fat set. The attempt to show that the intset constraint (which is trivial) implies that of fat set would fail.
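The failure of the constraint rule for intset and fat set can be made concrete with a single computation. The sketch below is illustrative only: it models abstract values as frozensets, takes the identity as the abstraction function, and uses hypothetical names (`IntSet`, `fat_set_constraint`, `find_counterexample`).

```python
def fat_set_constraint(rho: frozenset, psi: frozenset) -> bool:
    """History property of fat set: whatever is in the set at rho is still there at psi."""
    return rho <= psi


def intset_constraint(rho: frozenset, psi: frozenset) -> bool:
    """intset's constraint is trivial."""
    return True


class IntSet:
    """A minimal mutable integer set with both insert and delete."""

    def __init__(self):
        self._elems = set()

    def insert(self, i):
        self._elems.add(i)

    def delete(self, i):
        self._elems.discard(i)

    def value(self) -> frozenset:
        return frozenset(self._elems)


def find_counterexample():
    """Build two states rho, psi (rho before psi) that break the constraint rule."""
    s = IntSet()
    s.insert(7)
    rho = s.value()
    s.delete(7)          # the extra mutator that fat set does not have
    psi = s.value()
    if intset_constraint(rho, psi) and not fat_set_constraint(rho, psi):
        return rho, psi
    return None
```

The pair of states returned by `find_counterexample()` satisfies the subtype candidate's constraint but not the supertype's, so the implication required by the constraint rule does not hold.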

As a simple example of a many-to-one case, consider immutable pairs and triples (Figure 5). Pairs have methods that fetch the first and second elements; triples have these methods plus an additional one to fetch the third element. Triple is a subtype of pair and so is semi-mutable triple (with methods to fetch the first, second, and third elements and to replace the third element) because replacing the third element does not affect the first or second element. This example shows that it is possible to have a mutable subtype of an immutable supertype, provided the mutations are invisible to users of the supertype.

Mutations of a subtype that would be visible through the methods of an immutable supertype are ruled out. For example, an immutable sequence, whose elements can be fetched but not stored, is not a supertype of mutable array, which provides a store method in addition to the sequence methods. For sequences we can prove elements do not change; this is not true for arrays. The attempt to construct the subtype relation will fail because the constraint for sequences does not follow from that for arrays.

immutable pair
  immutable triple
  semi-mutable triple

Figure 5: Pairs and Triples

Many examples of extension subtypes are found in the literature. One common example concerns persons, employees, and students (Figure 6). A person object has methods that report its properties such as its name, age, and possibly its relationship to other persons (e.g., its parents or children). Student and employee are subtypes of person; in each case they have additional properties, e.g., a student id number, an employee employer and salary. In addition, type student_employee is a subtype of both student and employee (and also person, since the subtype relation is transitive). In this example, the subtype objects have more state than those of the supertype as well as more methods.

person
  student, employee
    student_employee (a subtype of both student and employee)

Figure 6: Person, Student, and Employee

Another example from the database literature concerns different kinds of ships [HM81]. The supertype is generic ships with methods to determine such things as who is the captain and where the ship is registered. Subtypes contain more specialized ships such as tankers and freighters. There can be quite an elaborate hierarchy (e.g., tankers are a special kind of freighter). Windows are another well-known example [HO87]; subtypes include bordered windows, colored windows, and scrollable windows.

Common examples of subtype relationships are allowed by our definition provided the equal method (and other similar methods) are defined properly in the subtype. Suppose supertype $\tau$ provides an equal method and consider a particular call x.equal(y). The difficulty arises when x and y actually belong to $\sigma$, a subtype of $\tau$. If objects of the subtype have additional state, x and y may differ when considered as subtype objects but ought to be considered equal when considered as supertype objects.

For example, consider immutable triples $x = \langle 0, 0, 0 \rangle$ and $y = \langle 0, 0, 1 \rangle$. Suppose the specification of the equal method for pairs says:

equal = proc (q: pair) returns (bool)
  ensures $result = (p.first = q.first \land p.second = q.second)$

(We are using p to refer to the method's object.) However, we would expect two triples to be equal only if their first, second, and third components were equal. If a program using triples had just observed that x and y differ in their third element, we would expect x.equal(y) to return "false," but if the program were using them as pairs, and had just observed that their first and second elements were equal, it would be wrong for the equal method to return false.

The way to resolve this dilemma is to have two equal methods in triple:

pair_equal = proc (q: pair) returns (bool)
  ensures $result = (p.first = q.first \land p.second = q.second)$

triple_equal = proc (q: triple) returns (bool)
  ensures $result = (p.first = q.first \land p.second = q.second \land p.third = q.third)$

One of them (pair_equal) simulates the equal method for pair; the other (triple_equal) is a method just on triples. In some object-oriented languages, such as Java, the additional equal methods are obtained by overloading.
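The two-method resolution can be sketched in a few lines of Python. This is an illustration under assumptions: the class names are hypothetical, the objects are treated as immutable by convention only, and since Python has no overloading the two methods simply get distinct names, with `pair_equal` bound to the inherited `equal`.

```python
class Pair:
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def equal(self, q):
        """result = (p.first = q.first and p.second = q.second)."""
        return self.first == q.first and self.second == q.second


class Triple(Pair):
    def __init__(self, first, second, third):
        super().__init__(first, second)
        self.third = third

    # pair_equal simulates pair's equal: it ignores the third component.
    pair_equal = Pair.equal

    def triple_equal(self, q):
        """A method just on triples: all three components must agree."""
        return self.pair_equal(q) and self.third == q.third


x = Triple(0, 0, 0)
y = Triple(0, 0, 1)
```

With these definitions, code that views `x` and `y` as pairs calls `x.equal(y)` and sees them as equal, while code that knows they are triples can call `x.triple_equal(y)` and observe the difference in the third element.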

The problem is not limited to equality methods. It also affects methods that "expose" the abstract state of objects, e.g., an unparse method that returns a string representation of the abstract state of its object. x.unparse() ought to return a representation of a pair if called in a context in which x is considered to be a pair, but it ought to return a representation of a triple in a context in which x is known to be a triple (or some subtype of triple).

The need for several equality methods seems natural for realistic examples. For example, asking whether e1 and e2 are the same person is different from asking if they are the same employee. In the case of a person holding two jobs, the answer might be true for the question about person but false for the question about employee.

6.2 Constrained Subtypes

The second kind of subtype relation occurs when the subtype is more constrained than the supertype. In this case, the supertype specification is written in a way that allows variation in behavior among its subtypes. Subtypes constrain the supertype by reducing the variability. The abstraction function is usually into rather than onto. The subtype may extend those supertype objects that it simulates by providing additional methods and/or state.

Since constrained subtypes reduce variation, it is crucial when defining this kind of type hierarchy to think carefully about what variability is permitted for the subtypes. The variability will show up in the supertype specifications in two ways: in the invariant and constraint, and also in the specifications of the individual methods. In both cases the supertype definitions will be nondeterministic in those places where different subtypes are expected to provide different behavior.

A very simple example concerns elephants. Elephants come in many colors (realistically grey and white, but we will also allow blue ones). However all albino elephants are white and all royal elephants are blue. Figure 7 shows the elephant hierarchy. The set of legal values for regular elephants includes all elephants whose color is grey or blue or white:

invariant $e_\rho.color = white \lor e_\rho.color = grey \lor e_\rho.color = blue$

The set of legal values for royal elephants is a subset of those for regular elephants:

invariant $e_\rho.color = blue$

and hence the abstraction function is into. The situation for albino elephants is similar. Furthermore, the elephant method that returns the color (if there is such a method) can return grey or blue or white, i.e., it is nondeterministic; the subtypes restrict the nondeterminism for this method by defining it to return a specific color.
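A small sketch may help show how a subtype narrows the supertype's invariant. The Python below is an illustration with hypothetical class and method names. It takes the identity as the abstraction function, so the invariant rule reduces to "subtype invariant implies supertype invariant."

```python
ELEPHANT_COLORS = frozenset({"white", "grey", "blue"})


class Elephant:
    def __init__(self, color):
        self._color = color
        if not self.invariant():
            raise ValueError(f"illegal value for {type(self).__name__}: {color!r}")

    def invariant(self):
        # Supertype invariant: color is white, grey, or blue.
        return self._color in ELEPHANT_COLORS

    def color(self):
        # Supertype specification: the result may be any of the three colors.
        return self._color


class RoyalElephant(Elephant):
    def __init__(self):
        super().__init__("blue")

    def invariant(self):
        # Stronger invariant: only blue is legal.
        return self._color == "blue"


class AlbinoElephant(Elephant):
    def __init__(self):
        super().__init__("white")

    def invariant(self):
        return self._color == "white"


def invariant_rule_holds(e):
    """I_sub => I_super for the value of e, with the identity as abstraction function."""
    return (not e.invariant()) or Elephant.invariant(e)
```

A caller written against `Elephant` must be prepared for any of the three colors from `color()`, so it continues to work when handed a `RoyalElephant`, whose `color()` always returns one of the permitted results.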

This simple example has led others to define a subtyping relation that requires non-monotonic reasoning [Lip92], but we believe it is better to use variability in the supertype specification and straightforward reasoning methods. However, the example shows that a specifier of a type family has to anticipate subtypes and capture the variation among them in the specification of the supertype.

The bag type discussed in Section 4.1 has two kinds of variability. First, as discussed earlier, the specification of get is nondeterministic because it does not constrain which element of the bag is removed. This nondeterminism allows stack to be a subtype of bag: the specification of pop constrains the nondeterminism. We could also define a queue that is a subtype of bag; its dequeue method would also constrain the nondeterminism of get but in a way different from pop.

elephant
  royal
  albino

Figure 7: Elephant Hierarchy

In addition, the actual value of the bound for bags is not defined; it can be any natural number, thus allowing subtypes to have different bounds. This variability shows up in the specification of put, where we do not say what specific bound value causes the call to fail. Therefore, a user of put must be prepared for a failure. (Of course the user could deduce that a particular call will succeed, based on a previous sequence of method calls and the constraint that the bound of a bag does not change.) A subtype of bag might limit the bound to a fixed value, or to a smaller range. Several subtypes of bag are shown in Figure 8; mediumbags have various bounds, so that this type might have its own subtypes, e.g., bag_150.

bag
  largebag (bound = $2^{32}$)
  mediumbag (100 <= bound <= 1000)
    bag_150 (bound = 150)
  smallbag (bound = 20)

Figure 8: A Type Family for Bags

The bag hierarchy may seem counterintuitive, since we might expect that bags with smaller bounds should be subtypes of bags with larger bounds. For example, we might expect smallbag to be a subtype of largebag. However, the specifications for the two types are incompatible: the bound of every largebag is $2^{32}$, which is clearly not true for smallbags. Furthermore, this difference is observable via the methods: It is legal to call the put method on a largebag whose size is greater than or equal to 20, but the call is not legal for a smallbag. Therefore the pre-condition rule is not satisfied.
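The failing pre-condition rule can be located mechanically. The sketch below assumes that put requires the current size to be strictly less than the bound (the form used in the proof of Section 5.3) and that a candidate abstraction function would leave the elements unchanged. The constant and function names are hypothetical.

```python
LARGE_BOUND = 2 ** 32
SMALL_BOUND = 20


def largebag_put_pre(size):
    """Assumed requires clause of put on a largebag: size < 2^32."""
    return size < LARGE_BOUND


def smallbag_put_pre(size):
    """Assumed requires clause of put on a smallbag: size < 20."""
    return size < SMALL_BOUND


def first_violation(super_pre, sub_pre, sizes):
    """Return the first size where super.pre holds but sub.pre does not, else None.

    The pre-condition rule demands super.pre => sub.pre for every legal
    subtype value, so any returned size is a counterexample.
    """
    for n in sizes:
        if super_pre(n) and not sub_pre(n):
            return n
    return None


# Legal smallbag sizes are 0..20 inclusive (invariant: size <= bound).
violation = first_violation(largebag_put_pre, smallbag_put_pre,
                            range(SMALL_BOUND + 1))
```

The search stops at a full smallbag: put would be legal on the corresponding largebag value but is not legal on the smallbag, which is the observable difference described above.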

Although the bag type can have subtypes with different bounds, it cannot have subtypes where the bounds of the bags can change dynamically. If we wanted a type family that included both bag and such dynamic bags, we would need to define a supertype in which the bound is allowed, but not required, to vary. Figure 9 shows the new type hierarchy. Dynamic bags have a bound that tracks the size: each time an element is added or removed from a dynamic bag, the bound changes to match the new size. Flexible bags have an additional mutator, change_bound:

change_bound = proc (n: int)
  requires $n \ge |b_{pre}.elems|$
  modifies b
  ensures $b_{post}.elems = b_{pre}.elems \land b_{post}.bound = n$

Notice that other types in the family need not have a change_bound method.

This example illustrates the different ways that subtypes reduce variability. All varying_bag subtypes reduce variability in the specification for the put method; varying_bag's put method is non-deterministic, since it might add the element (and change the bound) if the size is the same as the bound, or it might not. Bag and flexible_bag reduce this variability by not adding the element, whereas dynamic_bag does add the element. In addition, bag reduces variability by restricting the constraint: the trivial constraint for varying_bag can be thought of as stating "either a bag's bound may change or it stays the same;" the constraint for bag reduces this variability by making a choice ("the bag's bound stays the same") and users can then rely on this property for bags and its subtypes. Dynamic_bag reduces variability by restricting varying_bag's invariant so that it no longer allows the size to be less than the bound. Finally, flexible_bag reduces variability because of the extra mutator, change_bound; all its subtypes must allow explicit re-setting of the bound.

varying_bag (I: size <= bound; C: true)
  flexible_bag (I: size <= bound; C: true)
  dynamic_bag (I: size = bound; C: true)
  bag (I: size <= bound; C: bound stays the same)
    [...as in Fig. 8...]

Figure 9: Another Type Family for Bags

Another example is a family of integer counters shown in Figure 10. When a counter is advanced, we only know that its value gets bigger, so that the constraint is simply

constraint $c_\rho \le c_\psi$

The doubler and multiplier subtypes have stronger constraints. For example, a multiplier's value always increases by a multiple, so that its constraint is:

constraint $\exists n : int \,.\, [\, n > 0 \land c_\psi = n \cdot c_\rho \,]$

For a family like this, we might choose to have an advance method for counter (so that each of its subtypes is constrained to have this method) or we might not. If we do provide an advance method, its specification will have to be nondeterministic (i.e., it merely states that the size of the counter grows) to allow the subtypes to provide the definitions that are appropriate for them.
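The constraint rule for counter and multiplier can be checked on a finite range of values. The sketch below is illustrative, uses hypothetical function names, and takes the identity as the abstraction function. It also assumes that counter values are non-negative, which the text here does not state. Without that assumption the implication can fail, since a positive multiple of a negative value is not larger.

```python
def counter_constraint(rho: int, psi: int) -> bool:
    """c_rho <= c_psi: the value never decreases."""
    return rho <= psi


def multiplier_constraint(rho: int, psi: int) -> bool:
    """There exists n > 0 with c_psi = n * c_rho."""
    if rho == 0:
        return psi == 0            # n * 0 is 0 for every n
    return psi % rho == 0 and psi // rho > 0


def constraint_rule_holds(values) -> bool:
    """Check C_multiplier => C_counter on every pair of sampled values."""
    values = list(values)
    return all(counter_constraint(rho, psi)
               for rho in values for psi in values
               if multiplier_constraint(rho, psi))
```

On non-negative samples such as `range(0, 30)` the stronger multiplier constraint implies the counter constraint, which is what the constraint rule requires of a subtype.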

In the case of the bag family illustrated in Figure 8, all types in the hierarchy might be "real" in the sense that they have objects. However, sometimes supertypes are virtual; they define the properties all subtypes have in common but have no objects of their own. Varying_bag of Figure 9 might be such a type.

Virtual types are useful in many type hierarchies. For example, we would use them to construct a hierarchy for integers. Smaller integers cannot be a subtype of larger integers because of observable differences in behavior; for example, an overflow exception that would occur when adding two 32-bit integers would not occur if they were 64-bit integers. Also, larger integers cannot be a subtype of smaller ones because exceptions do not occur when expected. However, we clearly would like integers of different sizes to be related. This is accomplished by designing a virtual supertype that includes them. Such a hierarchy is shown in Figure 11, where integer is a virtual type whose invariant simply says that the size of an integer is greater than zero. Integer types with different sizes are subtypes of integer. In addition, small integer types are subtypes of regular_int, another virtual type; the invariant in the specification for regular_int states that the size of an integer is either 16 or 32 bits. An integer family might have a structure like this, or it might be flatter by having all integer types be direct subtypes of integer.

counter (value never decreases)
  incrementer (value never decreases)
  doubler (value doubles)
  multiplier (value multiplies)

Figure 10: Type Family for Counters

integer
  64-bit-int
  regular_int
    32-bit-int
    16-bit-int

Figure 11: Integer Family

7 Related Work

Some research on defining subtype relations is concerned with capturing constraints on method signatures via the contra/covariance rules, such as those used in languages like Trellis/Owl [SCB+86], Emerald [BHJ+87], Quest [Car88], Eiffel [Mey88], POOL [Ame90], and to a limited extent Modula-3 [Nel91]. Our rules place constraints not just on the signatures of an object's methods, but also on their behavior.

Our work is most similar to that of America [Ame91], who has proposed rules for determining (based on type specifications) whether one type is a subtype of another. Meyer [Mey88] also uses pre- and post-condition rules similar to America's and ours. Cusack's [Cus91] approach of relating type specifications defines subtyping in terms of strengthening state invariants. However, none of these authors considers either the problems introduced by extra mutators or the preservation of history properties. Therefore, they allow certain subtype relations that we forbid (e.g., intset could be a subtype of fat set in these approaches).

Our use of constraints in place of the history rule is one of two techniques discussed in [LW94]. That paper proposes a second technique in which there is no constraint; instead, extra methods are not allowed to introduce new behavior. It requires that the behavior of each extra mutator be "explained" in terms of existing behavior, through existing methods. We believe the use of constraints is simpler and easier to reason about than this "explanation" approach.

The emphasis on semantics of abstract types is a prominent feature of the work by Leavens. In his Ph.D. thesis Leavens [Lea89] defines types in terms of algebras and subtyping in terms of a simulation relation between them. His simulation relations are a more general form of our abstraction functions. Leavens considered only immutable types. Dhara [Dha92, DL92, LD92] extends Leavens' thesis work to deal with mutable types, but rules out the cases where extra methods cause problems, e.g., aliasing. Because of their restrictions they allow some subtype relations to hold where we do not. For example, they allow mutable pairs to be a subtype of immutable pairs whereas we do not.

Others have worked on the specification of types and subtypes. For example, many have proposed Z as the basis of specifications of object types [CL91, DD90, CDD+89]; Goguen and Meseguer [GM87] use FOOPS; Leavens and his colleagues use Larch [Lea91, LW90, DL92]. Though several of these researchers separate the specification of an object's creators from its other methods, none has identified the problem posed by the missing creators, and thus none has provided an explicit solution to this problem.

8 Summary

We defined a new notion of the subtype relation based on the semantic properties of the subtype and supertype. An object's type determines both a set of legal values and an interface with its environment (through calls on its methods). Thus, we are interested in preserving properties about supertype values and methods when designing a subtype. We require that a subtype preserve the behavior of the supertype methods and also all invariant and history properties of its supertype. We are particularly interested in an object's observable behavior (state changes), thus motivating our focus on history properties and on mutable types and mutators.

We also presented a way to specify the semantic properties of types formally. One reason we chose to base our approach on Larch is that Larch allows formal proofs to be done entirely in terms of specifications. In fact, once the theorems corresponding to our subtyping rules are formally stated in Larch, their proofs are almost completely mechanical (a matter of symbol manipulation) and could be done with the assistance of the Larch Prover [GG89, ZW97].

In developing our definition, we were motivated primarily by pragmatics. Our intention is to capture the intuition programmers apply when designing type hierarchies in object-oriented languages. However, intuition in the absence of precision can often go astray or lead to confusion. This is why it has been unclear how to organize certain type hierarchies such as integers. Our definition sheds light on such hierarchies and helps in uncovering new designs. It also supports the kind of reasoning that is needed to ensure that programs that work correctly using the supertype continue to work correctly with the subtype.

Programmers have found our approach relatively easy to apply and use it primarily in an informal way. The essence of a subtype relationship is expressed in the mappings. These mappings can be defined informally, in much the same way that abstraction functions and representation invariants are given as comments in a program that implements an abstract type. The proofs can also be done informally, in the style given in Section 5.3; they are usually straightforward and can be done by inspection.

We also showed that our approach is useful by looking at a number of examples. This led us to identify two kinds of subtypes: ones that extend the supertype, and ones that constrain it. In the former case, the supertype can be defined without a great deal of thought about the subtypes, but in the latter case, this is not possible; instead the supertype specification must be done carefully so that it allows all of the intended subtypes. In particular the specification of the supertype must contain sufficient nondeterminism in the invariant, constraint, and method specifications.

Our analysis raises two issues about type hierarchy that have been ignored previously by both the formal methods and object-oriented communities. First, subtypes can have more methods, specifically more mutators, than their supertypes. Second, subtypes need to have different creators than supertypes. These issues forced us to revisit proof rules normally associated with type specifications: the data type induction rule and the history rule. We decided to preclude the use of these rules, and to have explicit invariants and constraints to replace them. Although it is possible to define a subtype relation that avoids explicit invariants and constraints, doing so is awkward and often requires invention of superfluous supertype methods and creators. We prefer to use explicit invariants and constraints because this allows a more direct way of capturing the designer's intent.

References

[Ame90] America, P. A parallel object-oriented language with inheritance and subtyping. SIGPLAN, 25(10):161-168, October 1990.

[Ame91] America, P. Designing an object-oriented programming language with behavioural subtyping. In de Bakker, J. W., de Roever, W. P., and Rozenberg, G., editors, Foundations of Object-Oriented Languages, REX School/Workshop, Noordwijkerhout, The Netherlands, May/June 1990, volume 489 of LNCS, pages 60-90. Springer-Verlag, NY, 1991.

+

[BHJ 87] Black, A. P., Hutchinson, N., Jul, E., Levy, H. M., and Carter, L. Distribution and abstract

typ es in Emerald. IEEE TSE, 131:65{76, January 1987.

[Car88] Cardelli, L. A semantics of multiple inheritance. Information and Computation, 76:138{164,

1988.

+

[CDD 89] Carrington, D., Duke, D., Duke, R., King, P., Rose, G., and Smith, P. Ob ject-Z: An ob ject ori-

ented extension to Z. In FORTE89, International ConferenceonFormal Description Techniques,

Decemb er 1989.

[CL91] Cusack, E. and Lai, M. Ob ject-oriented sp eci cation in LOTOS and Z, or my cat really is ob ject-

oriented! In de Bakker, J. W., de Ro ever, W. P., and Rozenb erg, G., editors, Foundations of

Object OrientedLanguages, pages 179{202. Springer Verlag, June 1991. LNCS 489.

[Cus91] Cusack, E. Inheritance in object oriented Z. In Proceedings of ECOOP '91. Springer-Verlag, 1991.

[DD90] Duke, D. and Duke, R. A history model for classes in Object-Z. In Proceedings of VDM '90: VDM and Z. Springer-Verlag, 1990.

[Dha92] Dhara, K. K. Subtyping among mutable types in object-oriented programming languages. Master's thesis, Iowa State University, Ames, Iowa, 1992.

[DL92] Dhara, K. K. and Leavens, G. T. Subtyping for mutable types in object-oriented programming languages. Technical Report 92-36, Department of Computer Science, Iowa State University, Ames, Iowa, November 1992.

[DMN70] Dahl, O.-J., Myrhaug, B., and Nygaard, K. SIMULA common base language. Technical Report 22, Norwegian Computing Center, Oslo, Norway, 1970.

[GG89] Garland, S. and Guttag, J. An overview of LP, the Larch Prover. In Proceedings of the Third International Conference on Rewriting Techniques and Applications, pages 137-151, Chapel Hill, NC, April 1989. Lecture Notes in Computer Science 355.

[GHW85] Guttag, J. V., Horning, J. J., and Wing, J. M. The Larch family of specification languages. IEEE Software, 2(5):24-36, September 1985.

[GM87] Goguen, J. A. and Meseguer, J. Unifying functional, object-oriented and relational programming with logical semantics. In Shriver, B. and Wegner, P., editors, Research Directions in Object Oriented Programming. MIT Press, 1987.

[HM81] Hammer, M. and McLeod, D. A semantic database model. ACM Trans. Database Systems, 6(3):351-386, 1981.

[HO87] Halbert, D. C. and O'Brien, P. D. Using types and inheritance in object-oriented programming. IEEE Software, 4(5):71-79, September 1987.

[Hoa72] Hoare, C. Proof of correctness of data representations. Acta Informatica, 1(1):271-281, 1972.

[LD92] Leavens, G. T. and Dhara, K. K. A foundation for the model theory of abstract data types with mutation and aliasing (preliminary version). Technical Report 92-35, Department of Computer Science, Iowa State University, Ames, Iowa, November 1992.

[Lea89] Leavens, G. Verifying object-oriented programs that use subtypes. Technical Report 439, MIT Laboratory for Computer Science, February 1989. Ph.D. thesis.

[Lea91] Leavens, G. T. Modular specification and verification of object-oriented programs. IEEE Software, 8(4):72-80, July 1991.

[LG85] Liskov, B. and Guttag, J. Abstraction and Specification in Program Design. MIT Press, 1985.

[Lip92] Lipeck, U. Semantics and usage of defaults in specifications. In Foundations of Information Systems Specification and Design, March 1992. Dagstuhl Seminar 9212 Report 35.

[Lis92] Liskov, B. Preliminary design of the Thor object-oriented database system. In Proc. of the Software Technology Conference. DARPA, April 1992. Also Programming Methodology Group Memo 74, MIT Laboratory for Computer Science, Cambridge, MA, March 1992.

[LW90] Leavens, G. T. and Weihl, W. E. Reasoning about object-oriented programs that use subtypes. In ECOOP/OOPSLA '90 Proceedings, 1990.

[LW92] Liskov, B. and Wing, J. Family values: A semantic notion of subtyping. Technical Report 562, MIT Lab. for Computer Science, 1992. Also available as CMU-CS-92-220.

[LW94] Liskov, B. and Wing, J. A behavioral notion of subtyping. ACM Trans. on Prog. Lang. and Systems, pages 1811-1841, November 1994.

[Mey88] Meyer, B. Object-oriented Software Construction. Prentice Hall, New York, 1988.

[MS90] Maier, D. and Stein, J. Development and implementation of an object-oriented DBMS. In Zdonik, S. and Maier, D., editors, Readings in Object-Oriented Database Systems, pages 167-185. Morgan Kaufmann, 1990.

[Nel91] Nelson, G. Systems Programming with Modula-3. Prentice Hall, 1991.

[SCB+86] Schaffert, C., Cooper, T., Bullis, B., Kilian, M., and Wilpolt, C. An introduction to Trellis/Owl. In Proceedings of OOPSLA '86, pages 9-16, September 1986.

[SH92] Scheid, J. and Holtsberg, S. Ina Jo specification language reference manual. Technical Report TM-6021/001/06, Paramax Systems Corporation, A Unisys Company, June 1992.

[Str86] Stroustrup, B. The C++ Programming Language. Addison-Wesley, 1986.

[ZW97] Zaremski, A. and Wing, J. Specification matching of software components. ACM Trans. on Software Engineering and Methodology, 6(4):333-369, October 1997.