CS 411 Lecture 21: Introduction to Polymorphism
24 Oct 2003    Lecturer: Greg Morrisett

There are 3 basic kinds of polymorphism:

• Subtyping (or inclusion) polymorphism.
• Parametric polymorphism.
• Ad hoc polymorphism.

Subtyping arises when we can pass an object of type τ1 to an operation (including a function or method) that expects a τ2 and we don’t get a type-error. Most modern languages, including Java and C#, have some notion of subtyping as the basic form of polymorphism.

Parametric polymorphism is a generalization of abstract data types, and languages such as ML and Haskell are notable for their inclusion of it. Parametric polymorphism arises naturally for functions, such as the identity or list reversal, where we manipulate objects in a way that's oblivious to the type. Unlike subtyping, it's possible to express input-output relationships in type signatures, which gives the caller more reasoning power. But subtype polymorphism tends to work better (or at least more easily) for dealing with heterogeneous data structures.
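For concreteness, here's a small sketch in OCaml (standing in for the lecture's ML): the identity and list-reversal functions never inspect their elements, so they get polymorphic types, and those types already give the caller reasoning power. (The names id and rev are just illustrative.)

    (* These functions manipulate values obliviously to their type. *)
    let id (x : 'a) : 'a = x

    let rec rev (xs : 'a list) : 'a list =
      match xs with
      | [] -> []
      | x :: rest -> rev rest @ [x]

    (* The type 'a list -> 'a list promises that every element of the
       output comes from the input: an input-output relationship. *)
    let _ = rev [1; 2; 3]          (* : int list *)
    let _ = rev ["a"; "b"]         (* : string list *)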

Note, however, that you can simulate subtyping in a pretty straightforward way with parametric polymorphism, but it requires passing around coercion functions. You can also simulate parametric polymorphism in a language like Java, but it requires run-time type tests. So neither situation is ideal, and next-generation languages, including Java 1.5 and C#, include a notion of “generics” which generalizes both subtyping and parametric polymorphism.

Ad hoc polymorphism is everything else. Overloading is a good example. Something like “+” appears to work on both integers and reals, but note that we get different code for the different operations. Furthermore, “+” is only defined on some types and there's no way to abstract over those types. Languages such as C++ provide a lot of support for ad hoc polymorphism through templates, but anyone who has used template libraries can tell you that this was not well thought through. Language researchers, particularly in the Haskell camp, have been working on better ways to support the power of ad hoc polymorphism (sometimes called polytypism) and this, coupled with notions of reflection, is a hot research topic. This is mostly because we don't have a clean handle on these concepts the same way we do subtyping and parametric polymorphism. The funny thing is that it's only in the last 15 years that we've gotten a good handle on those two!
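To see the “different code” point, note that OCaml simply refuses to overload: integer and real addition are distinct operators. A language with ad hoc polymorphism (SML, C++) lets “+” denote both, silently choosing between them; a tiny sketch of what the overloading hides:

    (* OCaml exposes the two different operations that an overloaded
       "+" would silently pick between. *)
    let a = 1 + 2          (* int addition *)
    let b = 1.0 +. 2.0     (* float addition: a distinct operator,
                              compiled to distinct code *)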

Subtyping

This arises naturally if we think about the idea of “types as sets”, for then “subtypes are subsets”. More abstractly, the principle behind subtyping is that, if you can use a value of type τ1 anywhere that you expect to have a value of type τ2, then you ought to be able to treat τ1 as a subtype of τ2.

For example, given record types:

    type point2d = {x:int, y:int}
    type point3d = {x:int, y:int, z:int}

then we ought to be able to pass in a point3d to any function expecting a point2d, because all that function can do is select out the x or y components, and those operations are defined on point3d values as well as point2d values.

If we think of point2d and point3d as interfaces (à la Java or C#) then it's natural to think that anything that implements the point3d interface also implements the point2d interface.
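We can make this concrete with OCaml's object types, which have structural width subtyping of exactly this kind, though the coercion must be written explicitly with :> (a sketch, with norm2 as an illustrative function, not part of the lecture):

    type point2d = < x : int; y : int >
    type point3d = < x : int; y : int; z : int >

    (* norm2 can only select the x and y components. *)
    let norm2 (p : point2d) : int = p#x * p#x + p#y * p#y

    let p3 : point3d = object method x = 3 method y = 4 method z = 5 end

    (* A point3d works anywhere a point2d is expected. *)
    let n = norm2 (p3 :> point2d)    (* = 25 *)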

We can extend our little functional language with subtyping by adding one new rule called subsumption:

    Γ ⊢ e : τ1        τ1 ≤ τ2
    ─────────────────────────
           Γ ⊢ e : τ2

The rule says that if e has type τ1 and τ1 is a subtype of τ2, then e also has type τ2.

Of course, now we need to define what it means for one type to be a subtype of another. Remember, the principle to follow is that τ1 ≤ τ2 means that for any operation op you can perform on a τ2 value, you should be able to perform op on a τ1 value.

All subtyping frameworks yield a preorder, so they include rules for reflexivity and transitivity:

    ─────
    τ ≤ τ

    τ1 ≤ τ2        τ2 ≤ τ3
    ───────────────────────
           τ1 ≤ τ3

For something like (immutable) records, the subtyping rule might look like this:

    τ1 ≤ τ1′   . . .   τn ≤ τn′
    ──────────────────────────────────────────────────────────────────
    {l1 : τ1, . . . , ln : τn, ln+1 : τn+1} ≤ {l1 : τ1′, . . . , ln : τn′}

This rule says that a record type R1 is a subtype of a record type R2 if, for every l : τ′ in R2, there exists a label l with some type τ in R1, and furthermore, τ ≤ τ′. This rule is sound because the only operation we can apply to an R2 value is to extract a component using the #l operation. If #l is defined in R1, then the operation will succeed. And intuitively, if that yields a value of type τ, then we can use subsumption with τ ≤ τ′ to treat the result as a τ′.
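The premises τ ≤ τ′ also permit depth subtyping on the (immutable) components themselves. OCaml's coercions handle this too; a self-contained sketch with illustrative names (has2d, has3d are not from the lecture):

    type point2d = < x : int; y : int >
    type point3d = < x : int; y : int; z : int >
    type has2d = < pt : point2d >
    type has3d = < pt : point3d >

    let v : has3d =
      object
        method pt = object method x = 1 method y = 2 method z = 3 end
      end

    (* Sound because point3d ≤ point2d in the pt component. *)
    let w = (v :> has2d)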

What about sums? Consider the datatypes:

    datatype t1 = C1 of int
    datatype t2 = C1 of int | C2 of bool
    datatype t3 = C1 of int | C2 of bool | C3 of real

The only operation on sums is to do a case on them. If I have a case that can handle all of the constructors of t3, then it's safe to pass in a t2 or a t1 value, because the set of constructors in those types is a subset of the set used in the definition of t3. So, for sums, the subtyping rule is the opposite: the subtype must have fewer labels.

    τ1′ ≤ τ1   . . .   τn′ ≤ τn
    ──────────────────────────────────────────────────────────────────────
    [l1 of τ1′ | ... | ln of τn′] ≤ [l1 of τ1 | ... | ln of τn | ln+1 of τn+1]

Notice also that we assume τ1′ ≤ τ1 here, instead of the other way around as in the record rule. So, once again, sums are pleasingly dual to products.
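OCaml's polymorphic variants exhibit exactly this sum subtyping, again via an explicit coercion. A sketch, with float standing in for real and handle as an illustrative case function:

    type t1 = [ `C1 of int ]
    type t3 = [ `C1 of int | `C2 of bool | `C3 of float ]

    (* A case that handles all of t3's constructors... *)
    let handle (v : t3) : string =
      match v with
      | `C1 i -> string_of_int i
      | `C2 b -> string_of_bool b
      | `C3 r -> string_of_float r

    let v1 : t1 = `C1 42

    (* ...can safely be given a t1 value: t1's constructors are a
       subset of t3's. *)
    let s = handle (v1 :> t3)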

What about functions? The only operation on a function is to apply it to some argument. So, suppose I have a function:

    f : ({x : int, y : int} → {x : int, y : int}) → int

Suppose that I have another function:

g : {x : int} → {x : int, y : int, z : int}

Does it make sense to pass g to f? That is, what can go wrong if we do f(g)?

Well, f might try to apply g to some argument. It thinks that g takes in {x : int, y : int} arguments. So, it might pass an {x : int, y : int} value to g. Now g thinks that it’s being given an argument of type {x : int}, so the only thing it can do is apply #x to the argument. That seems to be okay because conceptually, we can use subtyping to treat the {x : int, y : int} value as if it’s a {x : int} value.

What about the result? f thinks that g returns {x : int, y : int} so it might do #x or #y on the result. But g returns a {x : int, y : int, z : int} value. Again, this is okay because the operations that f might do on g’s result are okay. Or conceptually, we can apply subtyping to coerce g’s result to {x : int, y : int} if you like.

So it appears as if g’s type is a subtype of the argument type of f. That is, we have:

({x : int} → {x : int, y : int, z : int}) ≤ ({x : int, y : int} → {x : int, y : int})
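We can replay the f/g story with OCaml objects; the coercion typechecks precisely because this subtyping holds. (The bodies of f and g below are made up for illustration.)

    type point2d = < x : int; y : int >
    type point3d = < x : int; y : int; z : int >

    (* f applies its argument to a point2d and projects the result. *)
    let f (h : point2d -> point2d) : int =
      (h (object method x = 1 method y = 2 end))#x

    (* g : {x:int} -> point3d, as in the text. *)
    let g : < x : int > -> point3d =
      fun p -> object method x = p#x method y = 0 method z = 0 end

    (* ({x:int} -> point3d) ≤ (point2d -> point2d), so f(g) is fine. *)
    let n = f (g :> point2d -> point2d)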

In general, the subtyping rule for functions is:

    τ1′ ≤ τ1        τ2 ≤ τ2′
    ─────────────────────────
    (τ1 → τ2) ≤ (τ1′ → τ2′)

So note that the argument types are contra-variant while the result types are co-variant. There are a lot of ways to see why contra-variance in the argument type is necessary. The example above demonstrates this, but so does drawing a Venn diagram of sets, subsets, and mappings from one set to another. Or from a logical standpoint, we can think of (τ1 → τ2) as similar to (¬τ1 ∨ τ2) — the negation on the argument suggests that we’ll have to flip the subtyping around.

Subtypes as Subsets vs. Coercions

Earlier, I said that we can think of subtypes as “subsets” and this is the general principle that most languages follow. However, it requires that we have a uniform representation for values that lie in related types. For instance, in Java, all Objects are one word so that we can pass them around, stick them in data structures, etc. But that means that double values are not Objects. A boxed double value (e.g., Double) is a subtype of Object, and we can coerce a double to a Double. So, in some sense, double is a “coerceable-subtype” of Object.

In fact, we can think of all of the subtyping rules as being witnessed by some coercion function. In particular, if we think of τ1 ≤ τ2 as a function C : τ1 → τ2, then the subsumption rule becomes:

    Γ ⊢ e : τ1        C : τ1 → τ2
    ─────────────────────────────
           Γ ⊢ C e : τ2

The other coercions can be calculated as follows:

    ─────────────────    (reflexivity is the identity)
    λx : τ. x : τ → τ

    C1 : τ1 → τ2        C2 : τ2 → τ3
    ────────────────────────────────    (transitivity is composition)
           C2 ◦ C1 : τ1 → τ3

Then record subtyping is witnessed by this coercion:

    C1 : τ1 → τ1′   . . .   Cn : τn → τn′
    ────────────────────────────────────────────────────────────────────────
    λr : {l1 : τ1, . . . , ln : τn, ln+1 : τn+1}. {l1 = C1(#l1 r), . . . , ln = Cn(#ln r)}

Coercion-based subtyping is simpler from the perspective of the language designer and implementor, and it doesn't require us to have uniform representations for objects. But coercions aren't so attractive for the programmer. First, they cost time. Second, they don't work for mutable objects, since they fundamentally operate by making a copy of the object. (You can get around this by adding a level of indirection; that's essentially what Java interfaces do.)
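The mutability problem is easy to see once a coercion is written out by hand. A sketch with ordinary OCaml records, which have no built-in subtyping, so the coercion must copy (all names are illustrative):

    type point2d = { mutable x : int; mutable y : int }
    type point3d = { mutable x3 : int; mutable y3 : int; mutable z3 : int }

    (* The witness to point3d ≤ point2d builds a fresh record. *)
    let coerce (p : point3d) : point2d = { x = p.x3; y = p.y3 }

    let p = { x3 = 1; y3 = 2; z3 = 3 }
    let q = coerce p       (* a copy, not an alias *)
    let () = q.x <- 42     (* updates the copy only... *)
    let _ = p.x3           (* ...p.x3 is still 1: the update is lost *)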

Another issue with coercions is that you must have some notion of coherence. This means that, no matter how we prove that the code type-checks, its meaning doesn’t change. In particular, if I have a proof P1 that ` e : τ and another proof P2 that ` e : τ, whether I use P1 to insert the coercions or P2 shouldn’t matter. I should get the same value out at the end of the day.

Note that the coercions of double to Double in Java are not coherent. The problem is that if d is a double, then d == d does not return the same thing as (Double)d == (Double)d. Similar problems arise in C. In general, (implicit) coercions turn out to be a good idea only in pure, functional languages. In imperative languages, they're almost always a disaster (witness the problems of arithmetic in C/C++).
