
CS 6110 S18 Lecture 23

1 Introduction

In this lecture, we attempt to extend the typed λ-calculus to support objects. In particular, we explore the concept of subtyping in detail, which is one of the key features of object-oriented (OO) languages.

Subtyping was first introduced in Simula, the first object-oriented language. Its inventors, Ole-Johan Dahl and Kristen Nygaard, later won the Turing Award for their contributions to object-oriented programming. Simula introduced a number of innovative features that have become mainstays of modern OO languages, including objects, subtyping, and inheritance.

The concept of subtyping is closely tied to inheritance and polymorphism and offers a formal way of studying them. It is best illustrated by means of an example. Figure 1 shows an example of a hierarchy that describes a subtype relationship between different types.

[Figure 1: A Subtype Hierarchy. Levels from top to bottom: Person; Student, Staff; Grad, Undergrad; TA, RA.]

In this case, the types Student and Staff are both subtypes of Person. Alternatively, one can say that Person is a supertype of Student and Staff. Similarly, TA is a subtype of the Student and Person types, and so on. The subtype relationship is normally a preorder (reflexive and transitive) on types.

A subtype relationship can also be viewed in terms of subsets. If σ is a subtype of τ, then all elements of type σ are automatically elements of type τ.

The ≤ symbol is typically used to denote the subtype relationship. Thus, Staff ≤ Person, RA ≤ Student, etc. Sometimes the symbol <: is used, but we will stick with ≤.

2 Basic Subtyping Rules

Formally, we write σ ≤ τ to indicate that σ is a subtype of τ. Semantically, this is equivalent to saying ⟦σ⟧ ⊆ ⟦τ⟧. The informal interpretation of this subtype relation is that anything of type σ can be used in a context that expects something of type τ. This is known as the subsumption rule:

$$\frac{\Gamma \vdash e : \sigma \qquad \sigma \le \tau}{\Gamma \vdash e : \tau}$$

Two further general rules are:

$$\frac{}{\tau \le \tau} \qquad\qquad \frac{\sigma \le \tau \qquad \tau \le \rho}{\sigma \le \rho}$$

which say that ≤ is reflexive and transitive, respectively, and is thus a preorder. In many cases, antisymmetry holds as well, making the subtyping relation a partial order, but this is not always true.
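As a rough illustration of subsumption and transitivity, the following Java sketch mirrors part of the hierarchy in Figure 1. The class names, the `describe` method, and the `greet` helper are hypothetical and chosen only for this example; Java's nominal `extends` relation stands in for ≤ here.

```java
// Hypothetical classes mirroring part of the hierarchy in Figure 1.
class Person {
    final String name;
    Person(String name) { this.name = name; }
    String describe() { return "Person " + name; }
}

class Student extends Person {           // Student <= Person
    Student(String name) { super(name); }
}

class TA extends Student {               // TA <= Student, hence TA <= Person by transitivity
    TA(String name) { super(name); }
}

class SubsumptionDemo {
    // A context that expects something of type Person.
    static String greet(Person p) {
        return "Hello, " + p.describe();
    }

    public static void main(String[] args) {
        TA ta = new TA("Ada");
        Person p = ta;                   // subsumption: ta : TA and TA <= Person
        System.out.println(greet(ta));   // a TA is accepted where a Person is expected
        System.out.println(greet(p));
    }
}
```

Both calls to `greet` type-check because the compiler implicitly applies subsumption at the argument position.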

The subtyping rules governing the types 1 and 0 are interesting:

• 1 (unit): Every type is a subtype of 1, that is, τ ≤ 1 for all types τ; thus 1 is the top type. If a context expects something of type 1, then it can accept a value of any type. In Java, this roughly corresponds to the type Object.

• 0 (void): Every type is a supertype of 0, i.e., 0 ≤ τ for all types τ; thus 0 is the bottom type. A value of type 0 can be accepted by any context in lieu of a value of any other type. In Java, this is the type of null.
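The Java analogy can be sketched as follows. This is only an approximation: `Object` is the top of Java's reference types (primitives reach it through boxing), and the `show` helper is a hypothetical name introduced for the example.

```java
class TopBottomDemo {
    // A context expecting the top type accepts a value of any reference type.
    static String show(Object o) {
        return String.valueOf(o);
    }

    public static void main(String[] args) {
        System.out.println(show("text"));   // String <= Object
        System.out.println(show(42));       // int is boxed to Integer, and Integer <= Object

        // null is accepted wherever any reference type is expected.
        String s = null;
        Integer i = null;
        System.out.println(show(s));
        System.out.println(show(i));
    }
}
```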

2.1 Products and Sums

The subtyping rules for product and sum types are quite intuitive:

$$\frac{\sigma \le \sigma' \qquad \tau \le \tau'}{\sigma * \tau \le \sigma' * \tau'} \qquad\qquad \frac{\sigma \le \sigma' \qquad \tau \le \tau'}{\sigma + \tau \le \sigma' + \tau'}$$

These rules say that the product and sum type constructors are monotone with respect to the subtype relation. When the subtyping relationship goes in the same direction in the premise as in the conclusion, it is called a covariant subtyping relationship.
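As a worked instance, take the facts TA ≤ Student and Staff ≤ Person from the hierarchy in the introduction. Instantiating the two rules with σ = TA, σ′ = Student, τ = Staff, τ′ = Person gives:

```latex
\[
\frac{\mathsf{TA} \le \mathsf{Student} \qquad \mathsf{Staff} \le \mathsf{Person}}
     {\mathsf{TA} * \mathsf{Staff} \le \mathsf{Student} * \mathsf{Person}}
\qquad\qquad
\frac{\mathsf{TA} \le \mathsf{Student} \qquad \mathsf{Staff} \le \mathsf{Person}}
     {\mathsf{TA} + \mathsf{Staff} \le \mathsf{Student} + \mathsf{Person}}
\]
```

So a pair of a TA and a staff member can be used wherever a pair of a student and a person is expected, and likewise for the tagged union.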

2.2 Records

Recall our extensions to the grammar of e and τ for adding support for record types:

$$e ::= \cdots \mid \{\ell_1 = e_1, \ldots, \ell_n = e_n\} \mid e.x$$

$$\tau ::= \cdots \mid \{\ell_1 : \tau_1, \ldots, \ell_n : \tau_n\}$$

where n ≥ 1. The $\ell_i$ are called field identifiers and are assumed to be distinct. Also, the order in which they are listed in $\{\ell_1 = e_1, \ldots, \ell_n = e_n\}$ and $\{\ell_1 : \tau_1, \ldots, \ell_n : \tau_n\}$ does not matter. We also had the following rule added to the small-step semantics:

$$\{\ell_1 = v_1, \ldots, \ell_n = v_n\}.\ell_i \to v_i$$

as well as the following typing rules:

$$\frac{\Gamma \vdash e_i : \tau_i,\ 1 \le i \le n}{\Gamma \vdash \{\ell_1 = e_1, \ldots, \ell_n = e_n\} : \{\ell_1 : \tau_1, \ldots, \ell_n : \tau_n\}} \qquad\qquad \frac{\Gamma \vdash e : \{\ldots, x : \tau, \ldots\}}{\Gamma \vdash e.x : \tau}$$

There are two subtyping rules for records:

• Depth subtyping: this defines the subtyping relation between two records that have the same number of fields.

$$\frac{\sigma_i \le \tau_i,\ 1 \le i \le n}{\{\ell_1 : \sigma_1, \ldots, \ell_n : \sigma_n\} \le \{\ell_1 : \tau_1, \ldots, \ell_n : \tau_n\}}$$

• Width subtyping: this defines the subtyping relation between two records that have different numbers of fields.

$$\frac{m \le n}{\{\ell_1 : \tau_1, \ldots, \ell_n : \tau_n\} \le \{\ell_1 : \tau_1, \ldots, \ell_m : \tau_m\}} \tag{1}$$

where the ≤ in the premise is integer comparison. Observe that in this case, the subtype has more components than the supertype. This is analogous to the relationship between a subclass and a superclass in which the subclass has all the components of the superclass, which it inherits from the superclass, but perhaps has some components that the superclass does not have.

The depth and width subtyping rules for records can be combined into a single rule:

$$\frac{m \le n \qquad \sigma_i \le \tau_i,\ 1 \le i \le m}{\{\ell_1 : \sigma_1, \ldots, \ell_n : \sigma_n\} \le \{\ell_1 : \tau_1, \ldots, \ell_m : \tau_m\}} \tag{2}$$
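Java has no structural record types, so the following is only an analogy for the combined rule, using the subclass comparison made above. The hypothetical interface `Owned` plays the role of {owner : Person}, and the hypothetical interface `Thesis` plays the role of {owner : Student, title : String}. The extra method corresponds to width subtyping, and the covariant return type of `owner` corresponds to depth subtyping.

```java
class Person {
    final String name;
    Person(String name) { this.name = name; }
}

class Student extends Person {           // Student <= Person
    Student(String name) { super(name); }
}

// Plays the role of the record type {owner : Person}.
interface Owned {
    Person owner();
}

// Plays the role of the record type {owner : Student, title : String}.
interface Thesis extends Owned {
    @Override
    Student owner();     // depth: the owner component is refined to a subtype
    String title();      // width: an extra component
}

class RecordSubtypingDemo {
    // A context written against the supertype; it only projects `owner`.
    static String ownerName(Owned o) {
        return o.owner().name;
    }

    // Builds a value of the subtype.
    static Thesis thesis(String student, String title) {
        Student s = new Student(student);
        return new Thesis() {
            @Override public Student owner() { return s; }
            @Override public String title() { return title; }
        };
    }

    public static void main(String[] args) {
        Thesis t = thesis("Ada", "Subtyping");
        System.out.println(ownerName(t));   // a Thesis is used where an Owned is expected
    }
}
```

The context `ownerName` never looks at `title`, which is why the extra component is harmless, and it only needs a `Person` from `owner`, which is why returning a `Student` is safe.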

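Mathematically, record types can be viewed as product types whose components are indexed by the field labels ℓ. The operators .ℓ are the corresponding projections. Thus if ∆ is a partial function with finite domain dom ∆ associating a type ∆(ℓ) with each element ℓ in its domain, we might write

$$\prod_{\ell \in \operatorname{dom} \Delta} \Delta(\ell)$$

for the record type, or just $\prod \Delta$ for short. An expression of this type would be a tuple indexed by dom ∆:

$$(e_\ell \mid \ell \in \operatorname{dom} \Delta).$$

In this view, the typing rules would take the form

$$\frac{\Gamma \vdash e_\ell : \Delta(\ell),\ \ell \in \operatorname{dom} \Delta}{\Gamma \vdash (e_\ell \mid \ell \in \operatorname{dom} \Delta) : \prod \Delta} \qquad\qquad \frac{\Gamma \vdash e : \prod \Delta}{\Gamma \vdash e.\ell : \Delta(\ell)}$$

and the subtyping rule (2) would take the form

$$\frac{\operatorname{dom} \Delta_1 \subseteq \operatorname{dom} \Delta_2 \qquad \Delta_2(\ell) \le \Delta_1(\ell),\ \ell \in \operatorname{dom} \Delta_1}{\prod \Delta_2 \le \prod \Delta_1}$$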

2.3 Variants

The analogous extension for sum types is known as variant records or just variants. These are ∆-indexed coproducts:

$$\sum_{\ell \in \operatorname{dom} \Delta} \Delta(\ell) = \sum \Delta.$$

In OCaml, one would write

    type t = I of int | S of string | P of int * string

to declare the type of a variant record whose elements consist of the (tagged) union of three sets. Here dom ∆ = {I, S, P} with ∆(I) = int, ∆(S) = string, and ∆(P) = int * string.

The depth subtyping rule for variants is the same as that for records (replacing the records with variants). However, the width subtyping rule is different: instead of m ≤ n in (1) and (2), we should take n ≤ m. To see why, suppose we used the width subtyping rule (1) in the same form as for records. Intuitively, σ ≤ τ means that anything of type σ can be used in a context expecting something of type τ. Suppose we now had a match statement that did case analysis on something of type τ. Our subtyping relation says that we can pass something of type σ to this match statement and it will still work. However, since τ has fewer components than σ and the match statement was originally written for an object of type τ, there will be values of σ for which no corresponding case exists. Thus, for variants, the direction of the ≤ symbol in the premise needs to be reversed. In other words, for variants, the subtype should have fewer components than the supertype. This leads to the combined rule

$$\frac{\operatorname{dom} \Delta_1 \subseteq \operatorname{dom} \Delta_2 \qquad \Delta_1(\ell) \le \Delta_2(\ell),\ \ell \in \operatorname{dom} \Delta_1}{\sum \Delta_1 \le \sum \Delta_2}$$
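As a worked instance, let ∆₂ be the ∆ of the OCaml type t above, with dom ∆₂ = {I, S, P}, and let ∆₁ be its restriction to {I, S}. The name ∆₁ and the choice of restriction are assumptions made for this example only. The premises hold by set inclusion and reflexivity:

```latex
\[
\frac{\{I, S\} \subseteq \{I, S, P\} \qquad \mathsf{int} \le \mathsf{int} \qquad \mathsf{string} \le \mathsf{string}}
     {\sum \Delta_1 \le \sum \Delta_2}
\]
```

A match statement written for ∑∆₂ has cases for I, S, and P, so it handles every value of ∑∆₁. The converse fails: a match written for ∑∆₁ has no case for a value tagged P.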

3 Function Subtyping

Based on the subtyping rules we have encountered so far, our first impulse might be to write down something of the following form to describe the subtyping relation for functions:

$$\frac{\sigma \le \sigma' \qquad \tau \le \tau'}{\sigma \to \tau \le \sigma' \to \tau'}$$

However, this is incorrect. To see why, consider the following code fragment:

    let f : σ → τ = g in
    let f′ : σ′ → τ′ = g′ in
    let t : σ′ = v in
    f′(t)

Suppose σ ≤ σ′ and τ ≤ τ ′. By the rule above, we would have σ → τ ≤ σ′ → τ ′, thus we should be able to substitute f for f ′ and the resulting program should be type correct, provided the original one was. But it is not: if σ′ is a strict supertype of σ and the value v is of type σ′ but not of type σ, then f will crash, since it expects an input of type σ and it is not getting one.

Actually, the incorrect typing rule given above was implemented in the language Eiffel, and runtime type checking had to be added later to make the language type safe.

The correct subtyping rule for functions is:

$$\frac{\sigma' \le \sigma \qquad \tau \le \tau'}{\sigma \to \tau \le \sigma' \to \tau'}$$

Note that the ordering on the domain is reversed in the premise. Succinctly stated, a function f of type σ → τ can safely take an input of any subtype σ′ of σ, and produces a value that can be taken to be of any supertype τ′ of τ; thus we are free to regard f as a function of type σ′ → τ′.

In all the type constructors we have seen so far, the subtyping relation was preserved. With the → constructor, however, the subtyping relation on the domain is reversed. We say that the constructor → is contravariant in the domain and covariant in the codomain.
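The function rule can be sketched in Java, with the caveat that Java's generic types are invariant by default, so the variance has to be requested with wildcards at the use site. In this sketch σ = Person, τ = Student, σ′ = Student, and τ′ = Person. The classes, the `ENROLL` function, and `applyToStudent` are hypothetical names introduced for illustration.

```java
import java.util.function.Function;

class Person {
    final String name;
    Person(String name) { this.name = name; }
}

class Student extends Person {           // Student <= Person
    Student(String name) { super(name); }
}

class FunctionSubtypingDemo {
    // f : Person -> Student
    static final Function<Person, Student> ENROLL = p -> new Student(p.name);

    // A context that expects a function of type Student -> Person.
    // "? super Student" is the contravariant domain, "? extends Person" the covariant codomain.
    static String applyToStudent(Function<? super Student, ? extends Person> g, Student s) {
        Person result = g.apply(s);
        return result.name;
    }

    public static void main(String[] args) {
        // Person -> Student is accepted where Student -> Person is expected.
        System.out.println(applyToStudent(ENROLL, new Student("Ada")));
    }
}
```

The call is safe for the reason given above: `ENROLL` accepts any `Person`, so it certainly accepts a `Student`, and the `Student` it returns can be treated as a `Person`.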

4 References

Recall the extensions to the grammar of e and τ for adding support for references:

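$$e ::= \cdots \mid \mathsf{ref}\ e \mid {!e} \mid e_1 := e_2 \qquad\qquad \tau ::= \cdots \mid \tau\ \mathsf{ref}$$

The typing rules are

$$\frac{\Gamma \vdash e : \tau}{\Gamma \vdash \mathsf{ref}\ e : \tau\ \mathsf{ref}} \qquad\qquad \frac{\Gamma \vdash e : \tau\ \mathsf{ref}}{\Gamma \vdash {!e} : \tau} \qquad\qquad \frac{\Gamma \vdash e_1 : \tau\ \mathsf{ref} \qquad \Gamma \vdash e_2 : \tau}{\Gamma \vdash e_1 := e_2 : 1}$$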

As for subtyping, once again our first impulse might be to write down something like the following:

$$\frac{\sigma \le \tau}{\sigma\ \mathsf{ref} \le \tau\ \mathsf{ref}}$$

However, again this would be incorrect. The problem is that to be safe, a ref appearing on the left-hand side of an assignment operator should be contravariant. Consider the following example:

    let x : Square ref = ref square in
    let y : Shape ref = x in
    (y := circle; (!x).side)

Even though this code type-checks with the given subtyping rule for reference types, it is not type-correct, since in the last line x does not refer to a square anymore. This problem actually exists in Java when using arrays, as the designers incorrectly used the rule given above. Consequently, a runtime check is necessary.

    public class Test {
        public static void main(String[] args) {
            B[] b = new B[1];
            A[] a = b;          // accepted: Java treats B[] as a subtype of A[]
            a[0] = new A();     // compiles, but fails the runtime store check
        }
    }

    class A {}
    class B extends A {}

    Exception in thread "main" java.lang.ArrayStoreException: A
        at Test.main(Test.java:5)

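In order to overcome this problem we must use the correct rule below:

$$\frac{\sigma \le \tau \qquad \tau \le \sigma}{\sigma\ \mathsf{ref} \le \tau\ \mathsf{ref}}$$

The subtyping rule for references is thus both covariant and contravariant; in other words, reference types are invariant.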
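One way to see where the two premises come from is to separate reading from writing. The following Java sketch uses a hypothetical mutable cell `Ref<T>` standing in for τ ref; the classes `Shape`, `Square`, and `Circle` and the helper methods are likewise assumptions for illustration. Java generics are invariant, so `Ref<Square>` is not a subtype of `Ref<Shape>`, but wildcards recover a covariant read-only view and a contravariant write-only view.

```java
class Shape { }

class Square extends Shape {             // Square <= Shape
    final int side;
    Square(int side) { this.side = side; }
}

class Circle extends Shape { }           // Circle <= Shape

// Hypothetical mutable cell standing in for "tau ref".
class Ref<T> {
    private T contents;
    Ref(T initial) { contents = initial; }
    T get() { return contents; }               // corresponds to !e
    void set(T value) { contents = value; }    // corresponds to e1 := e2
}

class RefVarianceDemo {
    // Reading only: safe to be covariant in the contents type.
    static Shape read(Ref<? extends Shape> r) {
        return r.get();
    }

    // Writing only: safe to be contravariant in the contents type.
    static void writeSquare(Ref<? super Square> r, Square s) {
        r.set(s);
    }

    public static void main(String[] args) {
        Ref<Square> x = new Ref<>(new Square(2));
        // Ref<Shape> y = x;   // rejected by the compiler: Ref<Square> is not a subtype of Ref<Shape>

        Shape s = read(x);                       // reading a Square as a Shape is fine
        Ref<Shape> shapes = new Ref<>(new Circle());
        writeSquare(shapes, new Square(3));      // writing a Square into a Shape cell is fine
        writeSquare(x, new Square(5));
        System.out.println(x.get().side);        // x still holds a Square
    }
}
```

A cell that supports both operations needs both directions at once, which is why the only safe full subtyping between reference types requires σ ≤ τ and τ ≤ σ.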
