CS 6110 S18 Lecture 23 Subtyping 1 Introduction 2 Basic Subtyping Rules

CS 6110 S18 Lecture 23 Subtyping 1 Introduction In this lecture, we attempt to extend the typed λ-calculus to support objects. In particular, we explore the concept of subtyping in detail, which is one of the key features of object-oriented (OO) languages. Subtyping was first introduced in Simula, the first object-oriented programming language. Its inventors Ole-Johan Dahl and Kristen Nygaard later went on to win the Turing award for their contribution to the field of object-oriented programming. Simula introduced a number of innovative features that have become the mainstay of modern OO languages including objects, subtyping and inheritance. The concept of subtyping is closely tied to inheritance and polymorphism and offers a formal way of studying them. It is best illustrated by means of an example. This is an example of a hierarchy that describes a Person Student Staff Grad Undergrad TA RA Figure 1: A Subtype Hierarchy subtype relationship between different types. In this case, the types Student and Staff are both subtypes of Person. Alternatively, one can say that Person is a supertype of Student and Staff. Similarly, TA is a subtype of the Student and Person types, and so on. The subtype relationship is normally a preorder (reflexive and transitive) on types. A subtype relationship can also be viewed in terms of subsets. If σ is a subtype of τ, then all elements of type σ are automatically elements of type τ. The ≤ symbol is typically used to denote the subtype relationship. Thus, Staff ≤ Person, RA ≤ Student, etc. Sometimes the symbol <: is used, but we will stick with ≤. 2 Basic Subtyping Rules Formally, we write σ ≤ τ to indicate that σ is a subtype of τ. In denotational semantics, this is equivalent to saying Jσ K ⊆ Jτ K. The informal interpretation of this subtype relation is that anything of type σ can be used in a context that expects something of type τ. This is known as the subsumption rule: Γ ` e : σ σ ≤ τ Γ ` e : τ Two further general rules are: σ ≤ τ τ ≤ ρ τ ≤ τ σ ≤ ρ 1 which say that ≤ is reflexive and transitive, respectively, thus a preorder. In many cases, antisymmetry holds as well, making the subtyping relation a partial order, but this is not always true. The subtyping rules governing the types 1 and 0 are interesting: • 1 (unit): Every type is a subtype of 1, that is, τ ≤ 1 for all types τ, thus 1 is the top type. If a context expects something of type 1, then it can accept any type. In Java, this is equivalent to the type Object. • 0 (void): Every type is a supertype of 0, i.e., 0 ≤ τ for all types τ, thus 0 is the bottom type. The type 0 can be accepted by any context in lieu of any other type. In Java, this is the type of null. 2.1 Products and Sums The subtyping rules for product and sum types are quite intuitive: σ ≤ σ0 τ ≤ τ 0 σ ≤ σ0 τ ≤ τ 0 σ ∗ τ ≤ σ0 ∗ τ 0 σ + τ ≤ σ0 + τ 0 These rules say that the product and sum type constructors are monotone with respect to the subtype relation. When the subtyping relationship goes in the same direction in the premise as in the conclusion, it is called a covariant subtyping relationship. 2.2 Records Recall our extensions to the grammar of e and τ for adding support for record types: e ::= · · · j f`1 = e1; : : : ; `n = eng j e:x τ ::= · · · j f`1 : τ1; : : : ; `n : τng where n ≥ 1. The ì are called field identifiers and are assumed to be distinct. Also, the order in which they are listed in f`1 = e1; : : : ; `n = eng and f`1 : τ1; : : : ; `n : τng does not matter. We also had the following rule added to the small-step semantics: f`1 = v1; : : : ; `n = vng:ì ! vi as well as the following typing rules: Γ ` ei : τi; 1 ≤ i ≤ n Γ ` e : f: : : ; x : τ; : : :g Γ ` f`1 = e1; : : : ; `n = eng : f`1 : τ1; : : : ; `n : τng Γ ` e:x : τ There are two subtyping rules for records: • Depth subtyping: this defines the subtyping relation between two records that have the same number of fields. σi ≤ τi; 1 ≤ i ≤ n f`1 : σ1; : : : ; `n : σng ≤ f`1 : τ1; : : : ; `n : τng • Width subtyping: this defines the subtyping relation between two records that have different number of fields. m ≤ n (1) f`1 : τ1; : : : ; `n : τng ≤ f`1 : τ1; : : : ; `m : τmg 2 where the ≤ in the premise is integer comparison. Observe that in this case, the subtype has more components than the supertype. This is analogous to the relationship between a subclass and a superclass in which the subclass has all the components of the superclass, which it inherits from the superclass, but perhaps has some components that the superclass does not have. The depth and width subtyping rules for records can be combined into a single rule: m ≤ n σ ≤ τ ; 1 ≤ i ≤ m i i (2) f`1 : σ1; : : : ; `n : σng ≤ f`1 : τ1; : : : ; `m : τmg Mathematically, record types can be viewed as product types whose components are indexed by the field labels `. The operators :` are the corresponding projections. Thus if ∆ is a partial function with finite domain dom ∆ associating a type ∆(`) with each element ` in its domain, we might write Y ∆(`) `2dom ∆ Q for the record type, or just ∆ for short. An expression of this type would be a tuple indexed by dom ∆: (e` j ` 2 dom ∆): In this view, the typing rules would take the form Q ` 2 ` Γ e` : ∆(`); ` domQ∆ Γ e : ∆ Γ ` (e` j ` 2 dom ∆) : ∆ Γ ` e:` : ∆(`) and the subtyping rule (2) would take the form ⊆ ≤ 2 dom ∆1 dom ∆2 Q ∆2(Q`) ∆1(`); ` dom ∆1 ∆2 ≤ ∆1 2.3 Variants The analogous extension for sum types is known as variant records or just variants. These are ∆-indexed coproducts: X X ∆(`) = ∆: `2dom ∆ In OCaml, one would write type t = I of int | S of string | P of int * string to declare the type of a variant record whose elements consist of the (tagged) union of three sets. Here dom ∆ = fI; S; Pg with ∆(I) = int, ∆(S) = string, and ∆(P) = int * string. The depth subtyping rule for variants is the same as that for records (replacing the records with variants). However, the width subtyping rule is different: instead of m ≤ n in (1) and (2), we should take n ≤ m. Suppose we used the width subtyping rule (1), the same form as for records. Intuitively, if σ ≤ τ, then this implies that anything of type σ can be used in a context expecting something of type τ. Suppose we now had a match statement that did pattern matching on something of type τ. Our subtyping relation says that we can pass in something of type σ to this match statement and it will still work. However, since τ has fewer components than σ and the match statement was originally written for an object of type τ, there will be 3 values of σ for which no corresponding case exists. Thus, for variants, the direction of the ≤ symbol in the premise needs to be reversed. In other words, for variants, the subtype should have fewer components than the supertype. This leads to the combined rule ⊆ ≤ 2 dom ∆1 dom ∆2P ∆1(P`) ∆2(`); ` dom ∆1 ∆1 ≤ ∆2 3 Function Subtyping Based on the subtyping rules we have encountered so far, our first impulse might be to write down something of the following form to describe the subtyping relation for functions: σ ≤ σ0 τ ≤ τ 0 σ ! τ ≤ σ0 ! τ 0 However, this is incorrect. To see why, consider the following code fragment: let f : σ ! τ = g in let f 0 : σ0 ! τ 0 = g0 in let t : σ0 = v in f 0(t) Suppose σ ≤ σ0 and τ ≤ τ 0. By the rule above, we would have σ ! τ ≤ σ0 ! τ 0, thus we should be able to substitute f for f 0 and the resulting program should be type correct, provided the original one was. But it is not: if σ0 is a strict supertype of σ and the value v is of type σ0 but not of type σ, then f will crash, since it expects an input of type σ and it is not getting one. Actually, the incorrect typing rule given above was implemented in the language Eiffel, and runtime type checking had to be added later to make the language type safe. The correct subtyping rule for functions is: σ0 ≤ σ τ ≤ τ 0 σ ! τ ≤ σ0 ! τ 0 Note that the ordering on the domain is reversed in the premise. Succinctly stated, a function f of type σ ! τ can safely take any input of any subtype σ0 of σ, and produces a value that can be taken to be of any supertype τ 0 of τ; thus we are free to regard f as a function of type σ0 ! τ 0. In all the type constructors we had seen so far, the subtyping relation was preserved. With the function space constructor, however, the subtyping relation on the domain is reversed. We say that the function type constructor ! is contravariant in the domain and covariant in the codomain. 4 References Recall the extensions to the grammar of e and τ for adding support for references: e ::= · · · j ref e j !e j e1 := e2 τ ::= · · · j τ ref The typing rules are Γ ` e : τ Γ ` e : τ ref Γ ` e1 : τ ref Γ ` e2 : τ ` ` Γ ref e : τ ref Γ !e : τ Γ ` e1 := e2 : 1 4 As for subtyping, once again our first impulse might be to write down something like the following: σ ≤ τ σ ref ≤ τ ref However, again this would be incorrect.

CS 6110 S18 Lecture 23 Subtyping 1 Introduction 2 Basic Subtyping Rules

An Extended Theory of Primitive Objects: First Order System Luigi Liquori

On Model Subtyping ClÉment Guy, Benoit Combemale, Steven Derrien, James Steel, Jean-Marc JÉzÉquel

Cyclone: a Type-Safe Dialect of C∗

REFPERSYS High-Level Goals and Design Ideas*

Spindle Documentation Release 2.0.0

Union Types for Semistructured Data

Dotty Phantom Types (Technical Report)

Lecture Notes on Types for Part II of the Computer Science Tripos

Open C++ Tutorial∗

Subtyping Recursive Types

Let's Get Functional

Cross-Platform Language Design