<<

Type Checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance

Eric Allen (Oracle Labs)
Justin Hilburn (Oracle Labs)
Scott Kilpatrick (University of Texas at Austin)
Victor Luchangco (Oracle Labs)
Sukyoung Ryu (KAIST)
David Chase (Oracle Labs)
Guy L. Steele Jr. (Oracle Labs)

Abstract

In previous work, we presented rules for defining overloaded functions that ensure type safety under symmetric multiple dispatch in an object-oriented language with multiple inheritance, and we showed how to check these rules without requiring the entire type hierarchy to be known, thus supporting modularity and extensibility. In this work, we extend these rules to a language that supports parametric polymorphism on both classes and functions.

In a multiple-inheritance language in which any type may be extended by types in other modules, some overloaded functions that might seem valid are correctly rejected by our rules. We explain how these functions can be permitted in a language that additionally supports an exclusion relation among types, allowing programmers to declare “nominal exclusions” and also implicitly imposing exclusion among different instances of each polymorphic type. We give rules for computing the exclusion relation, deriving many type exclusions from declared and implicit ones.

We also show how to check our rules for ensuring the safety of overloaded functions. In particular, we reduce the problem of handling parametric polymorphism to one of determining subtyping relationships among universal and existential types. Our system has been implemented as part of the open-source Fortress compiler.

Categories and Subject Descriptors D.3.3 [Programming Languages]: Language Constructs and Features (classes and objects, inheritance, modules, packages, polymorphism)

General Terms Languages

Keywords object-oriented programming, multiple dispatch, symmetric dispatch, multiple inheritance, overloading, modularity, methods, multimethods, static types, run-time types, ilks, components, separate compilation, Fortress, meet rule

1. Introduction

A key feature of object-oriented languages is dynamic dispatch: there may be multiple definitions of a function (or method) with the same name (we say the function is overloaded), and a call to a function of that name is resolved at run time based on the “run-time types” (we use the term ilks) of the arguments, using the most specific definition that is applicable to arguments having those particular ilks. With single dispatch, a particular argument is designated as the receiver, and the call is resolved only with respect to that argument. With multiple dispatch, the ilks of all arguments to a call are used to resolve the call. Symmetric multiple dispatch is a special case of multiple dispatch in which all arguments are considered equally when resolving a call.

Multiple dispatch provides a level of expressivity that closely models standard mathematical notation. In particular, mathematical operators such as + and ≤ and ∪ and especially · and × have different definitions depending on the “types” (or even the number) of their operands; in a language with multiple dispatch, it is natural to define these operators as overloaded functions. Similarly, many operations on collections such as append and zip have different definitions depending on the ilks of two or more arguments.

In an object-oriented language with symmetric multiple dispatch, some restrictions must be placed on overloaded function definitions to guarantee type safety. For example, consider the following overloaded function definitions:

    f(a: Object, b: Z): Z = 1
    f(a: Z, b: Object): Z = 2

To which of these definitions ought we dispatch when f is called with two arguments of ilk Z? (We assume that Z is a subtype of Object, written Z <: Object.)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
OOPSLA ’11, October 22–27, 2011, Portland, Oregon, USA.
Copyright 2011 ACM 978-1-4503-0940-0/11/10...$10.00

Castagna et al. [4] address this problem in the context of a type system without parametric polymorphism or multiple inheritance by requiring every pair of overloaded function definitions to satisfy the following properties: (i) whenever the domain type¹ of one is a subtype of the domain type of the other, the return type of the first must also be a subtype of the return type of the second; and (ii) whenever the domain types of the two definitions have a common lower bound (i.e., a common nontrivial² subtype), there is a unique definition for the same function whose domain type is the greatest lower bound of the domain types of the two definitions. Thus, to satisfy the latter property for the example above, the programmer must provide a third definition, such as:

    f(a: Z, b: Z): Z = 3

¹ The “domain type” of a function definition is the type of its parameter. Hereafter we consider every function to have a single parameter; the appearance of multiple parameters denotes a single tuple parameter.

We call this latter property the Meet Rule because it is equivalent to requiring that the definitions for each overloaded function form a meet semilattice partially ordered by the subtype relation on their domain types, which we call the more specific than relation.³ The Meet Rule guarantees that there are no ambiguous function calls at run time.

We call the first property above the Return Type Rule (or Subtype Rule). It ensures type preservation when a function call is resolved at run time (based on the ilks of the argument values) to a different (and more specific) definition than the most specific one that could be determined at compile time (based on the types of the argument expressions).

In this paper, we give new Meet and Return Type Rules that ensure safe overloaded functions in a language that supports symmetric multiple dispatch, multiple inheritance, and parametric polymorphism for both types and functions (i.e., generic types and generic functions), as does the Fortress language we are developing [1]. We prove that these rules guarantee type safety. This extends previous work [2] in which we gave analogous rules, and proved the analogous result, for a core of Fortress that does not support generics.

To handle parametric polymorphism, it is helpful to have an intuitive interpretation for generic types and functions. One way to think about a generic type such as List⟦T⟧ (a list with elements of type T; type parameter lists in Fortress are delimited by white square brackets) is that it represents an infinite set of ground types: List⟦Object⟧ (lists of objects), List⟦String⟧ (lists of strings), List⟦Z⟧ (lists of integers), and so on. An actual type checker must have rules for working with uninstantiated (non-ground) generic types, but for many purposes this model of “an infinite set of ground types” is adequate for explanatory purposes. Not so, however, for generic functions.

For some time during the development of Fortress, we considered an interpretation of generic functions analogous to the one above for generic types; that is, the generic function definition:⁴

    tail⟦X⟧(x: List⟦X⟧): List⟦X⟧ = e

should be understood as if it denoted an infinite set of monomorphic definitions:

    tail(x: List⟦Object⟧): List⟦Object⟧ = e
    tail(x: List⟦String⟧): List⟦String⟧ = e
    tail(x: List⟦Z⟧): List⟦Z⟧ = e
    ...

The intuition was that for any specific function call, the usual rule for dispatch would then choose the appropriate most specific definition for this (infinitely) overloaded function.

Although that intuition worked well enough for a single polymorphic function definition, it failed utterly when we considered multiple function definitions. For example, a programmer might want to provide definitions for specific monomorphic special cases, as in:

    tail⟦X⟧(x: List⟦X⟧): List⟦X⟧ = e1
    tail(x: List⟦Z⟧): List⟦Z⟧ = e3

If the interpretation above is taken seriously, this would be equivalent to:

    tail(x: List⟦Object⟧): List⟦Object⟧ = e1
    tail(x: List⟦String⟧): List⟦String⟧ = e1
    tail(x: List⟦Z⟧): List⟦Z⟧ = e1
    ...
    tail(x: List⟦Z⟧): List⟦Z⟧ = e3

which is ambiguous for calls in which the argument is of type List⟦Z⟧.

It gets worse if the programmer wishes to handle an infinite set of cases specially. It would seem natural to write:

    tail⟦X⟧(x: List⟦X⟧): List⟦X⟧ = e1
    tail⟦X <: Number⟧(x: List⟦X⟧): List⟦X⟧ = e2

to handle specially all cases where X is a subtype of Number. But the model would regard this as an overloaded function with an infinite number of ambiguities.

It does not suffice to “break ties” by choosing the instantiation of the more specific generic definition. Consider the following overloaded definitions:

    quux⟦X⟧(x: X): Z = 1
    quux(x: Z): Z = 2
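To make the “infinite set of monomorphic definitions” reading of these two quux definitions concrete, the following is a minimal Python sketch, not Fortress code and not the authors' implementation. The four-type single-inheritance hierarchy, the names PARENT, DEFS, and dispatch, and the tie-breaking policy (prefer the definition that was written monomorphically) are all assumptions made only for illustration.

```python
# Hypothetical hierarchy: each type maps to its single parent (N <: Z <: Object).
PARENT = {"Object": None, "Z": "Object", "N": "Z", "String": "Object"}

def is_subtype(s, t):
    """Reflexive, transitive nominal subtyping along the parent chain."""
    while s is not None:
        if s == t:
            return True
        s = PARENT[s]
    return False

# "Infinite set" reading, truncated to the ground types above:
# the generic quux becomes one monomorphic definition per type (result 1),
# and the monomorphic quux(x: Z) is kept as written (result 2).
DEFS = [(t, 1, "generic") for t in PARENT] + [("Z", 2, "mono")]

def dispatch(ilk):
    """Pick the most specific applicable definition, breaking ties toward 'mono'."""
    applicable = [d for d in DEFS if is_subtype(ilk, d[0])]
    # Most specific: domain is a subtype of every applicable domain.
    best = [d for d in applicable
            if all(is_subtype(d[0], e[0]) for e in applicable)]
    best.sort(key=lambda d: d[2] != "mono")  # tie-break (assumed policy)
    return best[0][1]

print(dispatch("Z"), dispatch("N"), dispatch("String"))
```

Under these assumptions the tie at Z is broken in favor of the monomorphic definition, but an argument of ilk N selects the instance of the generic definition at N, because that instance is strictly the most specific one in the expanded set.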
Intuitively, we might expect that the call quux(x) evaluates to 2 whenever the ilk of x is a subtype of Z, and to 1 otherwise. However, under the “infinite set of monomorphic definitions” interpretation, the call quux(x) when x has type N <: Z would evaluate to 1 because the most specific monomorphic definition would be the instantiation of the generic definition with N.

² A type is a nontrivial subtype of another type if it is not the trivial “bottom” type defined in the next section.
³ Despite its name, this relation, like the subtype relation, is reflexive: two function definitions with the same domain type are each more specific than the other. In that case, we say the definitions are equally specific.
⁴ The first pair of white square brackets delimits the type parameter declarations, but the other pairs of white brackets provide the type arguments to the generic type List.

It is not even always obvious which function definition is the most specific one applicable to a particular call in the presence of overloaded generic functions: the overloaded definitions might have not only distinct argument types, but also distinct type parameters (even different numbers of type parameters), so the type values of these parameters make sense only in distinct type environments. For example, consider the following overloaded function definitions:

    foo⟦X <: Object⟧(x: X, y: Object): Z = 1
    foo⟦Y <: Number⟧(x: Number, y: Y): Z = 2

The type parameter of the first definition denotes the type of the first argument, and the type parameter of the second definition denotes the type of the second argument; they bear no relation to each other. How should we compare such function definitions to determine which is the best to dispatch to? How can we ensure that there even is a best one in all cases?

Under the “infinite set of monomorphic definitions” interpretation, these definitions would be equivalent to:

    foo(x: Object, y: Object): Z = 1
    foo(x: Number, y: Object): Z = 1
    foo(x: Z, y: Object): Z = 1
    ...
    foo(x: Number, y: Number): Z = 2
    foo(x: Number, y: Z): Z = 2
    ...

When foo is called on two arguments of type Z, both foo(x: Z, y: Object) and foo(x: Number, y: Z) are applicable (assuming Z <: Number <: Object). Neither is more specific than the other, and moreover no definition of foo(x: Z, y: Z) has been supplied to satisfy the Meet Rule, so this overloading is ambiguous.

We propose to avoid such ambiguities by adopting an alternate model for generic functions, similar to one proposed by Bourdoncle and Merz [3], in which each function definition is regarded not as an infinite set of definitions, but rather as a single definition whose domain type is existentially quantified over its type parameters. (A monomorphic definition is then regarded as a degenerate generic definition.) In this model, overloaded function definitions are (partially) ordered by the subtype relation on existential types. Adapting dispatch and the Meet Rule to use this new partial order is straightforward. Adapting the Return Type Rule is somewhat more complicated, but checking it reduces to checking subtyping relationships between universal types. Adopting this model has made overloaded generic functions in Fortress both tractable and effective. In particular, the overloading of foo just shown is permitted and is not ambiguous, because under this interpretation the second definition is more specific than the first.

In providing rules to ensure that any valid set of overloaded function definitions guarantees that there is always a unique function to call at run time, we strive to be maximally permissive: A set of overloaded definitions should be disallowed only if it permits ambiguity that cannot be resolved at run time. Unfortunately, this goal is in tension with another requirement, to support modularity and extensibility. In particular, we assume the program will be composed of several modules, and that types defined in one module may be extended by types defined in other modules. We want to be able to check the rules separately for each module, and not have to recheck a module when some other module extends its types.

The difficulty is due to multiple inheritance: Because the type hierarchy defined by a module may be extended by types in other modules, two types may have a common nontrivial subtype even if no type declared in this module extends them both. Thus, for any pair of overloaded function definitions with incomparable domain types (i.e., neither definition is more specific than the other), the Meet Rule requires some other definition to resolve the potential ambiguity. Because explicit intersection types cannot be expressed in Fortress, it is not always possible to provide such a function definition.

Consider, for example, the following overloaded function definitions:

    print(s: String): ()
    print(i: Z): ()

Although this overloading may seem intuitively to be valid, in a multiple-inheritance type system that allows any type to be extended by some other module, one could define a type StringAndInteger that extends both String and Z. In that case, a call to print with an argument of type StringAndInteger would be ambiguous. Thus, this overloading must be rejected by our overloading rules.

To address this problem, Fortress enables programmers to declare “nominal exclusion”, restricting how type constructors may be extended, and uses this to derive an exclusion relation on types. Types related by exclusion must not have any nontrivial subtype in common. Many languages enforce and exploit exclusion implicitly. For example, single inheritance ensures that incomparable types exclude each other. If the domains of two overloaded function definitions exclude each other, then these definitions can never both be applicable to the same call, so no ambiguity can arise between them. In the example above, if String and Z exclude each other,⁵ then the overloaded definitions of print above are valid.

⁵ Indeed, String and Z are declared to exclude each other in the Fortress standard library.

We already exploited exclusion in our prior work on Fortress without generics, but the constructs Fortress provides for explicitly declaring exclusion are insufficient for allowing some intuitively appealing overloaded functions involving generic types. In particular, we could not guarantee type safety when a type extends multiple instantiations of a generic type. Implicitly forbidding such extension (a property we call multiple instantiation exclusion) allows these intuitively appealing overloaded functions.

2. Preliminaries

In this section, we describe the standard parts of the type system we consider in this paper, and establish terminology and notation for entities in this system. To minimize the syntactic overhead, we avoid introducing a new language and instead give a straightforward formalization of the type system. Novel parts of the type system, including the rules for type checking overloaded function declarations, are described in later sections.

2.1 Types

Following Kennedy and Pierce [8], we define a world of types ranged over by metavariables $S$, $T$, $U$, $V$, and $W$. Types are of five forms: type variables (ranged over by metavariables $X$, $Y$, and $Z$); constructed types (ranged over by metavariables $K$, $L$, $M$ and $N$), consisting of the special constructed type Any and type constructor applications, written $C\llbracket\overline{T}\rrbracket$, where $C$ is a type constructor and $\overline{T}$ is a list of types; structural types, consisting of arrow types and tuple types; compound types, consisting of intersection types and union types; and the special type Bottom, which represents the uninhabited type (i.e., no value belongs to Bottom). The abstract syntax of types is defined as follows (where $\overline{A}$ indicates a possibly empty comma-separated sequence of syntactic elements $A$):

$$
\begin{array}{rcll}
T & ::= & X & \\
  & \mid & \mathsf{Any} & \\
  & \mid & C\llbracket\overline{T}\rrbracket & \text{type constructor application} \\
  & \mid & T \to T & \text{arrow type} \\
  & \mid & (\overline{T}) & \text{tuple type} \\
  & \mid & T \cap T & \\
  & \mid & T \cup T & \\
  & \mid & \mathsf{Bottom} &
\end{array}
$$

A type may have multiple syntactic forms.⁶ In particular, a tuple type of length one is synonymous with its element type, and a tuple type with any Bottom element is synonymous with Bottom. In addition, any types that are provably equivalent as defined below are also synonymous.

⁶ We abuse terminology by not distinguishing type terms and types.

As in Fortress, compound types (intersection and union types) and Bottom are not first-class: these forms of types cannot be written in a program; rather, they are used by the type analyzer during type checking. For example, type variables may have multiple bounds, so that any valid instantiation of such a variable must be a subtype of the intersection of its bounds.

Type checking is done in the context of a class table $\mathcal{T}$, which is a set of type constructor declarations (at most one declaration for each type constructor) of the following form:

$$C\llbracket\overline{X <: \{\overline{M}\}}\rrbracket <: \{\overline{N}\}$$

where the only type variables that appear in $\overline{M}$ and $\overline{N}$ are those in $\overline{X}$. This declares the type constructor $C$, and each $X_i$ is a type parameter of $C$ with bounds $\overline{M_i}$. As usual for languages with nominal subtyping, we allow recursive and mutually recursive references in $\mathcal{T}$ (i.e., a type constructor can be mentioned in the bounds and supertypes of its own and other type constructors’ declarations). We say that $C$ extends a type constructor $D$ if $N_i = D\llbracket\overline{T}\rrbracket$ for some $N_i$ and $\overline{T}$. A class table is well-formed if every type that appears in it is well-formed, as defined below, and the extends relation over type constructors is acyclic.

The type constructor declaration above specifies that the constructed type $C\llbracket\overline{U}\rrbracket$ (i) is well-formed (with respect to $\mathcal{T}$) if and only if $|\overline{U}| = |\overline{X}|$ and $U_i <: [\overline{U}/\overline{X}]M_{ij}$ for $1 \le i \le |\overline{U}|$ and $1 \le j \le |\overline{M_i}|$ (where $<:$ is the subtyping relation defined below, and $[\overline{U}/\overline{X}]M_{ij}$ is $M_{ij}$ with $U_k$ substituted for each occurrence of $X_k$ in $M_{ij}$ for $1 \le k \le |\overline{U}|$); and (ii) is a subtype of $[\overline{U}/\overline{X}]N_l$ for $1 \le l \le |\overline{N}|$. The class table induces a (nominal) subtyping relation $<:$ over the constructed types by taking the reflexive and transitive closure of the subtyping relation derived from the declarations in the class table. In addition, every type is a subtype of Any and a supertype of Bottom.

We say that a type $T$ (of any form) is well-formed with respect to $\mathcal{T}$, and write $T \in \mathcal{T}$, if every constructed type occurring in $T$ is well-formed with respect to $\mathcal{T}$. Typically, the class table is fixed and implicit, and we assume it is well-formed and often omit explicit reference to it.

Given the type constructor declaration above, we denote the set of explicitly declared supertypes of the constructed type $C\llbracket\overline{T}\rrbracket$ by:

$$C\llbracket\overline{T}\rrbracket.\mathit{extends} = \{\overline{[\overline{T}/\overline{X}]N}\}$$

and the set of ancestors of $C\llbracket\overline{T}\rrbracket$ (defined recursively) by:

$$\mathit{ancestors}(C\llbracket\overline{T}\rrbracket) = \{C\llbracket\overline{T}\rrbracket\} \cup \bigcup_{M \in C\llbracket\overline{T}\rrbracket.\mathit{extends}} \mathit{ancestors}(M).$$

To reduce clutter, nullary applications are written without brackets; for example, $C\llbracket\rrbracket$ is written $C$. We also elide the braces delimiting a singleton list of either bounds of a type parameter or supertypes of a class in a type constructor declaration.

We extend the subtyping relation to structural and compound types in the usual way: Arrow types are contravariant in their domain types and covariant in their return types. One tuple type is a subtype of another if and only if they have the same number of elements, and each element of the first is a subtype of the corresponding element of the other. An intersection type is the most general type that is a subtype of each of its element types, and a union type is the most specific type that is a supertype of each of its element types.

To extend the subtyping relation to type variables, we require a type environment, which maps type variables to bounds:

$$\Delta = \overline{X <: \{\overline{M}\}}$$

In the context of $\Delta$, each type variable $X_i$ is a subtype of each of its bounds $M_{ij}$. Note that the type variables $X_i$ may appear within the bounds $M_{ij}$. We write $\Delta \vdash S <: T$ to indicate the judgment that $S$ is a subtype of $T$ in the context of $\Delta$. When $\Delta$ is empty, we write this judgment simply as $S <: T$. And we say that the types $S$ and $T$ are equivalent, written $S \equiv T$, when $S <: T$ and $T <: S$.

Henceforth, given a type environment, we consider only types whose (free) type variables are bound in the type environment. Because our type language does not involve any type variable binding (type variables are bound only by generic type constructor or function declarations), the set of free type variables of $T$, written $FV(T)$, is defined as the set of all type variables syntactically occurring within $T$.

2.2 Extensibility

To enable modular type checking and compilation, we do not assume that the class table is complete; there might be declarations yet unknown. Specifically, we cannot infer that two constructed types have no common constructed subtype from the lack of any such type in the class table. However, we do assume that each declaration is complete, so that all the supertypes of a constructed type are known.

A class table $\mathcal{T}'$ is an extension of $\mathcal{T}$ (written $\mathcal{T}' \supseteq \mathcal{T}$) if every declaration in $\mathcal{T}$ is also in $\mathcal{T}'$. From this, it follows that for any well-formed extension $\mathcal{T}'$ of a well-formed class table $\mathcal{T}$, any type that is well-formed with respect to $\mathcal{T}$ is well-formed with respect to $\mathcal{T}'$ and the subtyping relation on $\mathcal{T}'$ agrees with that of $\mathcal{T}$. That is, $T \in \mathcal{T}$ implies $T \in \mathcal{T}'$, and $T <: U$ in $\mathcal{T}$ implies $T <: U$ in $\mathcal{T}'$.

2.3 Values and Ilks

Types are intended to describe the values that might be produced by an expression or passed into a function. In Fortress, for example, there are three kinds of values: objects, functions, and tuples; every object belongs to at least one constructed type, every function belongs to at least one arrow type, and every tuple belongs to at least one tuple type. We say that two types $T$ and $U$ have the same extent if every value $v$ belongs to $T$ if and only if $v$ belongs to $U$. No value belongs to Bottom.

We place a requirement on values and on the type system that describes them: Although a value may belong to more than one type, every value $v$ belongs to a unique type $\mathit{ilk}(v)$ (the ilk of the value) that is representable in the type system⁷ and has the property that for every type $T$, if $v$ belongs to $T$ then $\mathit{ilk}(v) <: T$. (This notion of ilk corresponds to what is sometimes called the “class” or “run-time type” of the value.⁸)

⁷ The type system presented here satisfies this requirement simply by providing intersection types.
⁸ We prefer the term “ilk” to “run-time type” because the notion (and usefulness) of the most specific type to which a value belongs is not confined to run time. We prefer it to the term “class,” which is used in The Java Language Specification [7], because not every language uses the term “class” or requires that every value belong to a class. For those who like acronyms, we offer the mnemonic retronyms “implementation-level kind” and “intrinsically least kind.”

The implementation significance of ilks is that it is possible to select the dynamically most specific applicable function from an overload set using only the ilks of the argument values; no other information about the arguments is needed.

In a safe type system, if an expression is determined by the type system to have type $T$, then every value computed by the expression at run time will belong to type $T$; moreover, whenever a function whose ilk is $U \to V$ is applied to an argument value, then the argument value must belong to type $U$.

2.4 Declarations

A function declaration (for a class table) consists of a name, a sequence of type parameter declarations (enclosed in white square brackets), a type indicating the domain of the function, and a type indicating the codomain of the function (i.e., the return type). A type parameter declaration consists of a type parameter name and its bounds. For example, in the following function declaration:

    f⟦X <: M, Y <: N⟧(List⟦X⟧, Tree⟦Y⟧): Map⟦X, Y⟧

the name of the function is f, the type parameter declarations are X <: M and Y <: N, the domain type is the tuple type (List⟦X⟧, Tree⟦Y⟧), and the return type is Map⟦X, Y⟧. We abbreviate a function declaration as $f\llbracket\Delta\rrbracket\, S : T$ when we do not want to emphasize the bounds. To reduce clutter, we omit the white square brackets of a declaration when the sequence of type parameter declarations is empty, and elide braces around singleton lists of bounds.

A function declaration $d = f\llbracket\overline{X <: \{\overline{N}\}}\rrbracket\, S : T$ may be instantiated with type arguments $\overline{W}$ if $|\overline{W}| = |\overline{X}|$ and $W_i <: [\overline{W}/\overline{X}]N_{ij}$ for all $i$ and $j$; we call $[\overline{W}/\overline{X}]\, f\, S : T$ the instantiation of $d$ with $\overline{W}$. When we do not care about $\overline{W}$, we just say that $f\, U : V$ is an instance of $d$ (and it is understood that $U = [\overline{W}/\overline{X}]S$ and $V = [\overline{W}/\overline{X}]T$ for some $\overline{W}$). We use the metavariable $\mathcal{D}$ to range over finite collections of sets of function declarations and $\mathcal{D}_f$ for the subset of $\mathcal{D}$ that contains all declarations of name $f$.

An instance $f\, U : V$ of a declaration $d$ is applicable to a type $T$ if and only if $T <: U$. A function declaration is applicable to a type if and only if at least one of its instances is. For any type $T$, the set $\mathcal{D}_f(T)$ contains precisely those declarations in $\mathcal{D}_f$ that are applicable to $T$.
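As an illustration of the definitions in this section, the following Python sketch computes ancestors, nominal and tuple subtyping, and the set of applicable declarations for a small class table. It is a simplified model under stated assumptions, not the Fortress implementation: only nullary type constructors and monomorphic declarations are modeled, Any, Bottom, arrow types, and compound types are omitted, and the names EXTENDS, D_foo, and applicable are hypothetical.

```python
# Hypothetical class table: constructor name -> explicitly declared supertypes.
EXTENDS = {
    "Object": [],
    "Number": ["Object"],
    "Z": ["Number"],
    "String": ["Object"],
}

def ancestors(c):
    """ancestors(C) = {C} united with ancestors(M) for each M in C.extends."""
    result = {c}
    for m in EXTENDS[c]:
        result |= ancestors(m)
    return result

def is_subtype(s, t):
    """S <: T for nullary constructed types (str) and tuple types (tuple)."""
    if isinstance(s, tuple) and isinstance(t, tuple):
        # Tuples: same length, element-wise subtyping.
        return len(s) == len(t) and all(is_subtype(a, b) for a, b in zip(s, t))
    if isinstance(s, tuple) or isinstance(t, tuple):
        return False
    return t in ancestors(s)

# Two monomorphic instances of foo from the introduction: (domain, return type).
D_foo = [(("Z", "Object"), "Z"), (("Number", "Z"), "Z")]

def applicable(decls, arg_type):
    """Declarations whose domain is a supertype of arg_type."""
    return [d for d in decls if is_subtype(arg_type, d[0])]

print(applicable(D_foo, ("Z", "Z")))       # both declarations apply
print(applicable(D_foo, ("Number", "Z")))  # only the second applies
```

Because ancestors walks only declared supertypes, adding new constructors to EXTENDS never changes the result for existing ones, which mirrors the assumption that each declaration is complete even though the class table may be extended.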
3. Overloading Rules and Resolution

In this section, we define the “meaning” of overloaded generic functions; that is, we define how a call to such a function is dispatched, and we give rules for overloaded declarations that ensure that our dispatch procedure is well-defined, as Castagna et al. do for overloaded monomorphic functions [4]. The basic idea is simple: For any set of overloaded function declarations, we define a partial order on the declarations (we call this order the specificity relation) and dispatch any call to the most specific declaration applicable to the call, based on the ilks of the arguments. The rules for valid overloading ensure that the most specific declaration is well-defined (i.e., unique) for any call (assuming that some declaration is applicable to the arguments), and that the return type of a declaration is a subtype of the return type of any less specific declaration. The latter property is necessary for type preservation for dynamic dispatch: a more specific declaration may be applicable to the ilks of the arguments than the most specific declaration applicable to the static types of the argument expressions, so we must ensure that the return type of this more specific declaration is a subtype of the return type used to type check the program (i.e., at compile time).

Specifically, we define three rules:⁹ the No Duplicates Rule ensures that no two declarations are equally specific; the Meet Rule ensures that the set of overloaded declarations form a meet semilattice under the specificity relation; and the Return Type Rule ensures type preservation for dynamic dispatch. We prove that any set of overloaded function declarations satisfying these three properties is safe, even if the class table is extended (Theorem 1).

⁹ The meet rule of Castagna et al. requires the existence and uniqueness of the meet. We split these into two rules.

3.1 Specificity of Generic Function Declarations

For monomorphic function declarations, the specificity relation is just subtyping on their domain types. However, the domain type of a generic function declaration may include type parameters of the declaration, and type parameters of distinct declarations bear no particular relation to each other. Furthermore, the subtyping relation between their domain types may depend on the instantiation of their type parameters, as illustrated by foo and quux in the introduction.

Instead of using subtyping, we adopt the following intuitive notion of specificity: One declaration is more specific than another if the second is applicable to every argument that the first is applicable to. That is, for any $d_1, d_2 \in \mathcal{D}_f$, $d_1$ is more specific than $d_2$ (written $d_1 \preceq d_2$) if $d_1 \in \mathcal{D}_f(T)$ implies $d_2 \in \mathcal{D}_f(T)$ for every well-formed type $T$. Neatly, this turns out to be equivalent to subtyping over domain types where the domain type of a generic function declaration is interpreted as an existential type [3]; we use that formulation to mechanically check the overloading rules (see Section 6).

This definition of specificity introduces a problem for type inference: If $d_2$ is the most specific declaration applicable to the static types of the argument expressions, and $d_1 \preceq d_2$ is the most specific declaration applicable to the ilks of the arguments, then the type parameter instantiations derived by static type inference are relevant to $d_2$, but not to $d_1$. Because the call is dispatched to $d_1$, we require type parameters for $d_1$ to be inferred dynamically. Showing how to do so is beyond the scope of this paper.

3.2 Overloading Rules

Given a class table $\mathcal{T}$, a set $\mathcal{D}$ of generic function declarations for $\mathcal{T}$, and a function name $f$, the set $\mathcal{D}_f$ is valid (or is a valid overloading) if it satisfies the following three rules:

No Duplicates Rule: For every $d_1, d_2 \in \mathcal{D}_f$, if $d_1 \preceq d_2$ and $d_2 \preceq d_1$ then $d_1 = d_2$.

Meet Rule: For every $d_1, d_2 \in \mathcal{D}_f$, there exists a declaration $d_0 \in \mathcal{D}_f$ (possibly $d_1$ or $d_2$) such that $d_0 \preceq d_1$ and $d_0 \preceq d_2$ and $d_0$ is applicable to any type $T \in \mathcal{T}$ to which both $d_1$ and $d_2$ are applicable.

Return Type Rule: For every $d_1, d_2 \in \mathcal{D}_f$ with $d_1 \preceq d_2$, and every type $T \not\equiv \mathsf{Bottom}$ such that $d_1 \in \mathcal{D}_f(T)$, if an instance $f\, S_2 : T_2$ of $d_2$ is applicable to $T$, then there is an instance $f\, S_1 : T_1$ of $d_1$ that is applicable to $T$ with $T_1 <: T_2$.

The No Duplicates Rule forbids distinct declarations from being equally specific (i.e., each more specific than the other). The Meet Rule requires every pair of declarations to have a disambiguating declaration, which is more specific than both and applicable whenever both are applicable. (If one of the pair is more specific than the other, then it is the disambiguating declaration.)

The Return Type Rule guarantees that whenever the type checker might have used an instance of a declaration $d_2$ to check a program, and then a more specific declaration $d_1$ is selected by dynamic dispatch, then there is some instance of $d_1$ that is applicable to the argument and whose return type is a subtype of the return type of the instance of $d_2$ the type checker used, which is necessary for type preservation, as discussed above.

Since Bottom is well-formed, and tuple types with different numbers of arguments have no common subtype other than Bottom, the Meet Rule requires that an overloaded function with declarations that take different numbers of arguments have a declaration applicable only to Bottom. Such a declaration would never be applied (because no value belongs to Bottom), and it cannot be written in Fortress (because Bottom is not first-class). To avoid this technicality, we implicitly augment every set $\mathcal{D}_f$ with a declaration $f\, \mathsf{Bottom} : \mathsf{Bottom}$. This declaration is strictly more specific than any declaration that a programmer can write, and its return type is a subtype of every type, so it trivially satisfies all three rules when checked with any other declaration in $\mathcal{D}_f$.

This technicality raises the following question: Must the Meet Rule hold for every $T \in \mathcal{T}$? Could we not, for example, have excluded Bottom from consideration, as in the Return Type Rule, and avoided the technicality? If so, for which types is it necessary that the Meet Rule hold? The answer is, we must check the Meet Rule for a type $T \in \mathcal{T}$ to which both $d_1$ and $d_2$ are applicable if there could be a value of type $T$ such that for any type $T' \in \mathcal{T}$ to which the value belongs, $T <: T'$. In other words, we must check it for all “leaf” types. Thus, if we did not require extensibility, we could check the Meet Rule only for those types that are ilks of values. However, because we require extensibility, and we support multiple inheritance, we use intersection types instead.

3.3 Properties of Overloaded Functions

With the rules for valid overloading laid out, we now describe some useful properties of valid overloaded sets and of the rules themselves.

Lemma 1 If $d_1$ and $d_2$ are declarations in $\mathcal{D}_f$ such that $d_1 \preceq d_2$ and $d_2 \not\preceq d_1$, then the pair $(d_1, d_2)$ satisfies the No Duplicates Rule and the Meet Rule.

Proof: The No Duplicates Rule is vacuously satisfied, and the Meet Rule is satisfied with $d_0 = d_1$ since $d_1 \preceq d_2$ implies that $d_1$ is applicable to a type $T$ if and only if both $d_1$ and $d_2$ are applicable to $T$. □

Lemma 2 For every type $T \in \mathcal{T}$, if $\mathcal{D}_f$ is a valid set with respect to $\mathcal{T}$ then so is $\mathcal{D}_f(T)$.

Proof: The No Duplicates Rule and Return Type Rule are straightforward applications of the respective rules on $\mathcal{D}_f$. Let $d_1, d_2$ be declarations in $\mathcal{D}_f(T)$ and let $d_0 \in \mathcal{D}_f$ be their disambiguating declaration guaranteed by the Meet Rule on $\mathcal{D}_f$. Then $d_0$ is applicable to exactly those types $U$ to which $d_1$ and $d_2$ are both applicable. Since $d_1$ and $d_2$ are by definition both applicable to $T$, $d_0$ must also be applicable to $T$, and hence $d_0 \in \mathcal{D}_f(T)$. Therefore the Meet Rule on $\mathcal{D}_f(T)$ is satisfied. □

… transitive, and antisymmetry is a direct corollary of the No Duplicates Rule.

Second, we must show that (i) for every $d_1, d_2 \in \mathcal{D}_f$ there exists a $d_0 \in \mathcal{D}_f$ such that $d_0 \preceq d_1$ and $d_0 \preceq d_2$ and (ii) if there exists a $d_0' \in \mathcal{D}_f$ such that $d_0' \preceq d_1$ and $d_0' \preceq d_2$ then $d_0' \preceq d_0$.

Let $d_1$ and $d_2$ be declarations in $\mathcal{D}_f$. By the Meet Rule, there exists a declaration $d_0 \in \mathcal{D}_f$ that is applicable to a type $T \in \mathcal{T}$ if and only if both $d_1$ and $d_2$ are too. Since for every $T$ to which $d_0$ is applicable we have that $d_1$ and $d_2$ are also applicable to it, we know that $d_0 \preceq d_1$ and $d_0 \preceq d_2$.

Now let $d_0' \in \mathcal{D}_f$ be more specific than both $d_1$ and $d_2$. Then for every type $T \in \mathcal{T}$ such that $d_0'$ is applicable to $T$, $d_1$ and $d_2$ are also applicable to $T$; thus $d_0$ is applicable to $T$ and $d_0' \preceq d_0$. □

The No Duplicates Rule and the Meet Rule each correspond to a defining property of meet semilattices (antisymmetry and the existence of meets, respectively), while the Return Type Rule guarantees that this interpretation is consistent with the semantics of multiple dynamic dispatch.

Lemma 4 A valid set of overloaded function declarations $\mathcal{D}_f(T)$ has a unique most specific declaration.

Proof sketch: The set $\mathcal{D}_f(T)$ forms a meet semilattice by the previous lemma and moreover it is clearly finite. By straightforward induction a finite meet semilattice has a least element, so there exists a unique declaration in $\mathcal{D}_f(T)$ that is more specific than all others. □

3.4 Overloading Resolution Safety

We now prove the main theorem of this paper, that a valid set of overloaded generic function declarations is safe even if the class table is extended. Before proving the theorem we establish two lemmas. First, we show that if a set of declarations is valid for a given class table, then it is valid for any (well-formed) extension of that class table. Second, we show that if a set of overloaded declarations is valid, then there is always a single best choice of declaration to which to dispatch any (legal) call to that function (i.e., the unique most specific declaration applicable to the arguments).
Lemma 5 (Extensibility) If Df is valid with respect to the To further characterize valid sets of overloaded defini- class table T , then Df is valid with respect to any extension tions and the more specific relation , we interpret them T 0 of T . as meet semilattices. A partially ordered set (A, v) forms a meet semilattice if, for every pair of elements a, b ∈ A, Proof sketch: In Section 6 we will show that checking the their greatest lower bound, or meet, is also in A. validity of Df can be reduced to examining subtype rela- tionships between existential and universal types, which are Lemma 3 A valid set of overloaded function declarations constructed solely from types appearing in Df and hence T . forms a meet semilattice with the more specific relation. Extension of the class table preserves subtype relationships between types in T and hence preserves validity of Df .  Proof: Suppose Df is a valid set of overloaded function declarations with respect to class table T . First, (Df , ) Lemma 6 (Unambiguity) If Df is valid with respect to the forms a partially ordered set: clearly  is reflexive and class table T , then for every type T ∈ T such that Df (T ) is nonempty, there is a unique most specific declaration in Section 4.1, but for now, we simply note that they do not Df (T ). help with the overloaded tail function. Specifically, even with these exclusion mechanisms, we cannot define List so Proof: Let T ∈ T be a type such that Df (T ) is nonempty. that the definitions for tail satisfy the Return Type Rule. This set is valid by Lemma 2 and thus contains a unique most To see this, consider the following overloaded definitions specific declaration by Lemma 4.  (from Section 1): tail T `x: List T ´: List T Theorem 1 (Overloading Safety) Suppose Df is valid with `J K ´J K J K 0 tail x: List Z : List Z respect to the class table T . 
Then for any type S ∈ T ⊇ T , J K J K if Df (S) is nonempty then there exists a unique most specific and the following type constructor declaration: declaration dS ∈ Df (S). Furthermore for any declaration ˘ ¯ BadList <: List Z , List String d ∈ Df (S) and instance f T :U of d, there exists an instance J K J K f V :W of dS that is applicable to S such that W <: U. Both definitions of tail are applicable to the type BadList, and the monomorphic one is more specific. Two instances of Proof: The Extensibility Lemma lets us consider only the the generic definition are applicable to this type: case when T 0 = T . Now the Unambiguity Lemma entails ` ´ that such a d exists, and the Return Type Rule entails the tail List Z : List Z S ` J K ´J K rest. tail List String : List String  J K J K The Return Type Rule requires that the return type of each 4. Exclusion of these instances be a supertype of the return type of the Although the rules in Section 3 allow programmers to write monomorphic definition (the monomorphic definition is its valid sets of overloaded generic function declarations, they only one instance); that is, it requires List Z <: List Z sometimes reject overloaded definitions that might seem to and List Z <: List String . The latter is clearlyJ K false.J K be valid. For example, given the type system as we have A similarJ K issue arisesJ whenK trying to satisfy the Meet Rule described it thus far, the overloaded tail function from the by providing a disambiguating definition for incomparable introduction would be rejected by the overloading rules. definitions, as in the following example (where Z <: R ): These are not false negatives: multiple inheritance can in- ` ´ minimum X <: R, Y <: Z p: Pair X,Y : R troduce ambiguities by extending two incomparable types, J K` J K´ minimum X <: Z, Y <: R p: Pair X,Y : R as discussed in the introduction. 
Because we allow class ta- J K` J K´ minimum X <: Z, Y <: Z p: Pair X,Y : Z bles to be extended by unknown modules, we cannot gener- J K J K ally infer that two types have no common nontrivial subtype The first definition is applicable to exactly those arguments from the lack of any such declared type. Therefore, the Meet of type Pair X,Y for some X <: R and Y <: Z ; the Rule requires the programmer to provide a disambiguating second is applicableJ K to exactly those arguments of type definition for any pair of overloaded definitions whose do- Pair X,Y for some X <: Z and Y <: R . So we might main types are incomparable. thinkJ thatK both definitions are applicable to exactly those This problem is not new with parametric polymorphism, arguments of type Pair X,Y for some X <: Z and as the print example in the introduction shows. To ad- Y <: Z , which is exactlyJ whenK the third definition is ap- dress the problem, Fortress defines an exclusion relation ♦ plicable. However, this is not true! We could, for example, over types such that two types that exclude each other have have the following type constructor declaration: no common nontrivial subtypes; that is, if T U then ˘ ¯ ♦ BadPair <: Pair R, Z , Pair Z, R T ∩ U is synonymous with Bottom. Thus, overloaded defi- J K J K nitions whose domain types exclude each other trivially sat- The first two definitions of minimum are both applicable to isfy the Meet Rule: there are no types (other than Bottom) BadPair but the third is not. to which both definitions are applicable. Exclusion allows We might say that the problem with the above examples us to describe explicitly what is typically implicit in single- is not with the definitions of tail and minimum , but with inheritance class . 
the definitions of BadList and BadPair; we must reject the In our previous work on Fortress without generics [2], idea that a value may belong to List Z or List String — we provided a special rule—the Exclusion Rule—to exploit or to Pair Z, R or Pair R, Z —butJ K not to both.J Indeed,K this information. However, the Exclusion Rule can also be Fortress imposesJ K a rule thatJ forbidsK multiple instantiation viewed, as we do in this paper, as a special case of the inheritance [8], in which a type (other than Bottom) is a Meet Rule, where there are no nontrivial types to which both subtype of distinct applications of a type constructor.10 We definitions are applicable. 10 Fortress provides three mechanisms to explicitly declare This definition suffices for the type system described in this paper, in which all type parameters are invariant. It is straightforward, but beyond exclusion [1]: an object declaration, a comprises clause the scope of this paper, to extend this definition to systems that support and an excludes clause. We describe these precisely in covariant and contravariant type parameters. call this rule multiple instantiation exclusion and adopt it Multiple instantiation exclusion further restricts generic here. types: Distinct instantiations of a generic type (i.e., distinct Multiple instantiation exclusion is easy to enforce stati- applications of a type constructor) have no common subtype cally, and experience suggests that it is not onerous in prac- other than Bottom. tice: it is already required in Java, for example [7]. Also, To define well-formedness for class tables with exclusion Kennedy and Pierce have shown that in systems that enforce (including multiple instantiation exclusion), we define an multiple instantiation exclusion (along with some technical exclusion relation ♦ over well-formed types: S ♦ T asserts restrictions), nominal subtyping is decidable [8].11 that S and T have no common subtypes other than Bottom. 
For constructed types C T and D U , C T ♦ D U if 4.1 Well-Formed Class Tables with Exclusion J K J K J K J K To incorporate exclusion into our type system, we first • D U ∈ C T .excludes; J K J K augment type constructor declarations with two (optional) • for all L ∈ C T .comprises, D U is not a subtype of L; J K J K clauses—the excludes and comprises clauses—and add • C T is declared by an object declaration and C T is a new kind of type constructor declaration—the object notJ aK subtype of D U ; J K declaration. We then change the definition of well-formed J K class tables to reflect the new features, and to enforce multi- • D U ♦ C T by any of the conditions above; J K J K ple instantiation exclusion. • C = D and T 6≡ U; or The syntax for a type constructor declaration with the • M ♦ N for some M :> C T and N :> D U . optional excludes and comprises clauses, each of which J K J K specifies a list of types, is: We augment our notion of a well-formed class table to require that the subtyping and exclusion relations it induces     C X <: {M} <:{N} excludes {L} comprises {K} “respect” each other. That is, for all well-formed constructed J K types M and N, if M ♦ N then no well-formed constructed This declaration asserts that for well-formed type C T , type is a subtype of both M and N. A well-formed extension J K the only common subtype of C T and [T/X]Li for any Li to a class table T must preserve this property. in L is Bottom, and any strictJ subtypeK of C T must also Except for the imposition of multiple instantation exclu- J K be a subtype of [T/X]Ki for some Ki in K. sion, these changes generalize the standard type system de- Omitting the excludes clause is equivalent to having scribed in Section 2: A class table that does not use any of excludes {} ; omitting the comprises clause is equiva- the new features is well-formed in this augmented system lent to having comprises { Any } .12 exactly when it is well-formed in the standard system. 
On We define the sets of instantiations of types in excludes the other hand, multiple instantiation exclusion restricts the and comprises clauses analogously to C T .extends. That set of well-formed class tables: a table that is well-formed is, for an application C T of the declarationJ K above, we when multiple instantiation inheritance is permitted might have: J K not be well-formed under multiple instantiation exclusion. We extend the exclusion relation to structural and com- C T .excludes = {[T/X]L} pound types as follows: Every arrow type excludes every J K non-arrow type other than Any. Every non-singleton tu- C T .comprises = {[T/X]K} ple type excludes every non-tuple type other than Any. (A J K singleton tuple type is synonymous with its element type, A class table may also include object declarations, and so excludes exactly those types excluded by its element which have the following syntax: type.) Non-singleton tuple type (V ) excludes non-singleton tuple type (W ) if either |V | 6= |W | or V excludes W for object D X <: {M} <:{N} i i J K some i. An intersection type excludes any type excluded by This declaration is convenient for defining “leaf types”: it any of its constituent types; a union type excludes any type asserts that D T has no subtypes other than itself and excluded by all of its constituent types.Bottom excludes ev- Bottom. AlthoughJ K the declaration has no excludes or ery type (including itself—it is the only type that excludes it- comprises clause, this condition implies that D T ex- self), and Any does not exclude any type other than Bottom. cludes any type other than its supertypes (and thereforeJ K it is (We define the exclusion relation formally in Section 7.) as if it had a clause comprises {} ).
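As an illustration only, the exclusion conditions for constructed types can be sketched for a small, non-generic class table. The sketch below is written in Python rather than Fortress, and everything in it (the `Decl` record, the `TABLE` contents, the function names) is a hypothetical stand-in rather than part of the paper's formal system. It covers only the excludes clause, object declarations, symmetry, and propagation through supertypes; the comprises clause, multiple instantiation exclusion, and structural types are deliberately left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Decl:
    """One (non-generic) type constructor declaration; names are hypothetical."""
    extends: frozenset = frozenset()   # declared supertypes
    excludes: frozenset = frozenset()  # types listed in an excludes clause
    is_object: bool = False            # declared with an object declaration

# A toy class table, assumed well-formed; not taken from the paper.
TABLE = {
    "Any": Decl(),
    "Number": Decl(extends=frozenset({"Any"})),
    "String": Decl(extends=frozenset({"Any"}), excludes=frozenset({"Number"})),
    "Printable": Decl(extends=frozenset({"Any"})),
    "Integer": Decl(extends=frozenset({"Number"})),
    "Zero": Decl(extends=frozenset({"Integer"}), is_object=True),
    "One": Decl(extends=frozenset({"Integer"}), is_object=True),
}

def supertypes(name):
    """All supertypes of `name`, including `name` itself."""
    seen, stack = set(), [name]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(TABLE[t].extends)
    return seen

def is_subtype(s, t):
    return t in supertypes(s)

def declared(c, d):
    """Exclusion of d by c from c's own declaration: excludes clause or object."""
    decl = TABLE[c]
    return d in decl.excludes or (decl.is_object and not is_subtype(c, d))

def excludes(c, d):
    """c excludes d if some supertype M of c and some supertype N of d exclude
    each other by declaration, in either direction (symmetry)."""
    return any(declared(m, n) or declared(n, m)
               for m in supertypes(c) for n in supertypes(d))
```

For example, under this toy table `excludes("Integer", "String")` holds because `String` declares that it excludes `Number`, a supertype of `Integer`, while `excludes("Integer", "Printable")` does not hold, so a later module could still declare a common subtype of the two.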

¹¹ They also show that forbidding contravariant type parameters results in decidable nominal subtyping, so subtyping in our type system is decidable in any case.

¹² To catch likely programming errors, Fortress requires that every Ki in a comprises clause for C⟦T⟧ be a subtype of C⟦T⟧, but allowing Any to appear in a comprises clause simplifies our presentation here.

5. Examples

We now consider several sets of overloaded generic function declarations, and argue informally why they are (or are not, in one case) permitted by the rules in Section 3, paying particular attention to where multiple instantiation exclusion is required. We give a formal system and algorithm for performing these checks in Section 6.

First, consider the function foo from Section 1:

    foo⟦X <: Object⟧(x: X, y: Object): ℤ = 1
    foo⟦Y <: Number⟧(x: Number, y: Y): ℤ = 2

This overloading is valid: The second definition is strictly more specific than the first because the first definition is applicable to a pair of arguments exactly if the type of each is a subtype of Object, whereas the second is applicable to a pair of arguments exactly if the type of each is a subtype of Number. Thus, these definitions satisfy the No Duplicates Rule and the Meet Rule by Lemma 1. And they satisfy the Return Type Rule because the return type of both definitions is always ℤ.

The overloaded definitions for tail are also valid:

    tail⟦X⟧(x: List⟦X⟧): List⟦X⟧ = e1
    tail⟦X <: Number⟧(x: List⟦X⟧): List⟦X⟧ = e2
    tail(x: List⟦ℤ⟧): List⟦ℤ⟧ = e3

The first definition is applicable to any argument of type List⟦T⟧ for any well-formed type T, the second is applicable to an argument of type List⟦T⟧ when T <: Number, and the third is applicable to an argument of type List⟦ℤ⟧. Thus, each definition is strictly more specific than the preceding one, so the No Duplicates Rule and Meet Rule are satisfied for each pair of definitions by Lemma 1.

To see that the Return Type Rule is satisfied by the first two definitions, consider any type W ≢ Bottom to which the second definition is applicable (so W <: List⟦T⟧ for some T <: Number) and any instantiation of the first definition with type U that is applicable to W (so W <: List⟦U⟧). Then W <: List⟦T⟧ ∩ List⟦U⟧. By multiple instantiation exclusion, List⟦T⟧ ∩ List⟦U⟧ ≡ Bottom unless T ≡ U. Since W ≢ Bottom, we have T ≡ U, so W <: List⟦U⟧ with U <: Number. Thus, the instantiation of the second definition with U has return type List⟦U⟧, which is also the return type of the instantiation of the first definition under consideration. (In Section 7.8 we describe how to incorporate this sort of reasoning about validity into our algorithmic checking of the overloading rules.)

The Return Type Rule is also satisfied by the third definition and either of the first two because the third definition is applicable only to arguments of type List⟦ℤ⟧, and because of multiple instantiation exclusion, the only instantiation of either the first or second definition that is applicable to such an argument is its instantiation with ℤ. That instantiation has return type List⟦ℤ⟧, which is also the return type of the third definition.

The minimum example from Section 4 is also valid under multiple instantiation exclusion, which is necessary in this case to satisfy the Meet Rule rather than the Return Type Rule:

    minimum⟦X <: ℝ, Y <: ℤ⟧(p: Pair⟦X,Y⟧): ℝ
    minimum⟦X <: ℤ, Y <: ℝ⟧(p: Pair⟦X,Y⟧): ℝ
    minimum⟦X <: ℤ, Y <: ℤ⟧(p: Pair⟦X,Y⟧): ℤ

Any argument to which the first two definitions are both applicable must be of type Pair⟦X1,Y1⟧ ∩ Pair⟦X2,Y2⟧ for some X1 <: ℝ, Y1 <: ℤ, X2 <: ℤ, and Y2 <: ℝ. Multiple instantiation exclusion implies that X1 = X2 and Y1 = Y2, so the argument must be of type Pair⟦X1,Y1⟧, where X1 <: ℝ ∩ ℤ = ℤ and Y1 <: ℤ ∩ ℝ = ℤ, so the third definition is applicable to it. And since the third definition is more specific than the first two, it satisfies the requirement of the Meet Rule.

The following set of overloaded declarations is also valid (given the declaration ArrayList⟦X⟧ <: List⟦X⟧):

    bar⟦X⟧ ArrayList⟦X⟧ : ℤ
    bar⟦Y <: ℤ⟧ List⟦Y⟧ : ℤ
    bar⟦Z <: ℤ⟧ ArrayList⟦Z⟧ : ℤ

The first two declarations are incomparable: the first is applicable to ArrayList⟦T⟧ for any type T, the second to List⟦U⟧ for U <: ℤ. Thus, both declarations are applicable to any argument of type ArrayList⟦T⟧ ∩ List⟦U⟧ for any T and U <: ℤ. Since ArrayList⟦T⟧ <: List⟦T⟧, this type is a subtype of List⟦T⟧ ∩ List⟦U⟧, which, because of multiple instantiation exclusion, is Bottom unless T ≡ U, in which case ArrayList⟦T⟧ ∩ List⟦U⟧ ≡ ArrayList⟦U⟧. This is exactly the type to which the third declaration is applicable, so the Meet Rule is satisfied.

Note that this example is similar to the previous one with minimum, except that rather than each of the two definitions being more restrictive on a type parameter, one uses a more specific type constructor.

Finally, we consider three examples that do not involve generic types, beginning with the following declarations:

    baz⟦X⟧(x: X): X
    baz(x: ℤ): ℤ

This pair is not valid: it does not satisfy the Return Type Rule. Consider, for example, an argument of type ℕ <: ℤ. The second declaration, which is more specific than the first, and the instantiation of the first declaration with ℕ are both applicable to this argument, but the return type ℤ of the second declaration is not a subtype of the return type ℕ of the instance of the first declaration. This rejection by the Return Type Rule is not gratuitous: baz may be called with an argument of type ℕ in a context that expects an ℕ in return.

We can fix this example by making the second declaration generic:

    baz⟦X⟧(x: X): X
    baz⟦X <: ℤ⟧(x: X): X

This pair is valid: the second declaration is strictly more specific than the first, so the No Duplicates and Meet Rules are satisfied. To see that the Return Type Rule is satisfied, consider any type W ≢ Bottom to which the second declaration is applicable (so W <: ℤ) and any instantiation of the first with some type T that is applicable to W (so W <: T). Then the instantiation of the second declaration with W has return type W, which is a subtype of the return type T of the instance of the first declaration under consideration.

6. Overloading Rules Checking

In this section, we describe how to mechanically check the overloading rules from Section 3. The key insight is that the more specific relation on overloaded function declarations corresponds to the subtyping relation on the domain types of the declarations, where the domain types of generic function declarations are existential types [3]. Thus, the problem of determining whether one declaration is more specific than another reduces to the problem of determining whether one existential type is a subtype of another.

We then formulate the overloading rules as subtyping checks on existential and universal types (universal types arise in the reformulation of the Return Type Rule), and give an algorithm to perform these subtyping checks. The algorithm we describe is sound but not complete: it rejects some sets of overloaded functions that are valid by the overloading rules in Section 3, but it accepts many of them, including all of the valid examples in Section 5.

Existential Subtyping: ∆ ⊢ δ ≲ δ and ∆ ⊢ δ ≤ δ

    ∆′ = ∆, X <: {M}      X ∩ (FV(U) ∪ FV(N)) = ∅
    ∆′ ⊢ T <: [V/Y]U      ∀i. ∆′ ⊢ Vi <: [V/Y]{Ni}
    ------------------------------------------------
    ∆ ⊢ ∃⟦X <: {M}⟧T ≲ ∃⟦Y <: {N}⟧U

    ∆ ⊢ δ −→≡ δr      ∆ ⊢ δr ≲ δ′
    --------------------------------
    ∆ ⊢ δ ≤ δ′

Universal Subtyping: ∆ ⊢ σ ≲ σ and ∆ ⊢ σ ≤ σ

    ∆′ = ∆, Y <: {N}      Y ∩ (FV(T) ∪ FV(M)) = ∅
    ∆′ ⊢ [V/X]T <: U      ∀i. ∆′ ⊢ Vi <: [V/X]{Mi}
    ------------------------------------------------
    ∆ ⊢ ∀⟦X <: {M}⟧T ≲ ∀⟦Y <: {N}⟧U

    ∆ ⊢ σ′ −→≡ σ′r      ∆ ⊢ σ ≲ σ′r
    ----------------------------------
    ∆ ⊢ σ ≤ σ′

Figure 1. Subtyping of universal and existential types. Note that alpha-renaming of type variables may be necessary to apply these rules.
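To make the reduction from specificity to domain subtyping concrete in the simplest possible setting, the following minimal sketch handles only monomorphic, single-argument declarations over a closed class table, where "more specific" is plain subtyping on domain types. It is written in Python purely for illustration; the class table, the `Decl` record, and all function names are hypothetical and not part of the paper's system. It does not model type parameters, existential types, the Meet Rule, or extension of the class table.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical class table: each type maps to its declared supertypes.
EXTENDS = {
    "Object": set(),
    "Number": {"Object"},
    "Integer": {"Number"},
    "String": {"Object"},
    "Printable": {"Object"},
    "PrintableInteger": {"Integer", "Printable"},  # multiple inheritance
}

def is_subtype(s, t):
    """Reflexive, transitive closure of the declared extends relation."""
    return s == t or any(is_subtype(p, t) for p in EXTENDS[s])

@dataclass(frozen=True)
class Decl:
    dom: str  # domain (parameter) type
    ret: str  # return type

def applicable(d, ilk):
    return is_subtype(ilk, d.dom)

def more_specific(d1, d2):
    # Monomorphic case: d1 is at least as specific as d2 iff dom(d1) <: dom(d2).
    return is_subtype(d1.dom, d2.dom)

def no_duplicates(decls):
    """No two distinct declarations are equally specific."""
    return all(not (more_specific(a, b) and more_specific(b, a))
               for a, b in combinations(decls, 2))

def return_types_ok(decls):
    """Whenever a is more specific than b, ret(a) must be a subtype of ret(b)."""
    return all(is_subtype(a.ret, b.ret)
               for a in decls for b in decls if more_specific(a, b))

def dispatch(decls, ilk):
    """Return the unique most specific applicable declaration, or None."""
    cands = [d for d in decls if applicable(d, ilk)]
    best = [d for d in cands if all(more_specific(d, e) for e in cands)]
    return best[0] if len(best) == 1 else None

f = [Decl("Object", "Object"), Decl("Number", "Number"), Decl("Integer", "Integer")]
g = [Decl("Integer", "String"), Decl("Printable", "String")]
```

With these definitions, `dispatch(f, "Integer")` selects the declaration on `Integer`, whereas `dispatch(g, "PrintableInteger")` returns `None`: both declarations in `g` apply and neither is more specific, which is the kind of ambiguity that a disambiguating declaration (or an exclusion between the two domain types) is meant to rule out.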

6.1 Existential and Universal Types

Given a generic function declaration d = f⟦∆⟧S : T, its domain type, written dom(d), is the existential type ∃⟦∆⟧S. An existential type binds type parameter declarations over a type, but these type parameters cannot be instantiated; instead, the existential type represents some hidden type instantiation and the corresponding instantiated type. We write ∃⟦X <: {N}⟧T to bind each type variable Xi with bounds {Ni} over the type T, and we use the metavariable δ to range over existential types.

The arrow type of declaration d above, written arrow(d), is the universal arrow type ∀⟦∆⟧S → T. We use this to formulate the Return Type Rule. A universal type binds type parameter declarations over some type and can be instantiated by any types fitting the type parameters' bounds. We write ∀⟦X <: {N}⟧T to bind each type variable Xi with bounds {Ni} over the type T, and we use the metavariable σ to range over universal types.

Note that universal and existential types are not actually types in our system.

6.2 Universal and Existential Subtyping

We define subtyping judgments for universal and existential types, which we use in checking the overloading rules. We actually define inner and outer subtyping judgments on universals and existentials; the former correspond to a relatively standard interpretation of each (which resembles those defined in [3]); the latter incorporate quantifier reduction, defined in Section 7.8.

The inner subtyping judgment on existentials ∆ ⊢ δ1 ≲ δ2, defined in Figure 1, states that δ1 is a subtype of δ2 in the environment ∆ if the constituent type of δ1 is a subtype of an instance of δ2 in the environment obtained by conjoining ∆ and the bounds of δ1.

In the outer subtyping judgment ∆ ⊢ δ ≤ δ′, we first perform existential reduction to produce δr (denoted ∆ ⊢ δ −→≡ δr). Then we check whether ∆ ⊢ δr ≲ δ′. We provide (and explain) the formal definition of existential reduction in Section 7.8, but for now note that it has the following properties:

1. ∆ ⊢ δr ≲ δ
2. ∆ ⊢ δ ≲ δ′ implies ∆ ⊢ δr ≲ δ′r
3. (δr)r = δr
4. (∃⟦⟧T)r = ∃⟦⟧T

The first three properties show that ≤ is a preorder and that ≲ implies ≤. Adding the fourth property lets us show that any ground instance T of δ with T ≠ Bottom is an instance of δr.

We use existential reduction because merely extending the subtyping relation for ordinary types with exclusion is not enough to let us check the overloading rules. For example, to check that the first two declarations of Dbar from Section 5 satisfy the Meet Rule, we must be able to deduce that the existential

    ∃⟦X <: Any, Y <: ℤ⟧(ArrayList⟦X⟧ ∩ List⟦Y⟧)

and the existential

    ∃⟦W <: ℤ⟧ArrayList⟦W⟧

describe the same set of ground instances of types.

The rules for universals are dual to those for existentials. The inner subtyping judgment on universals ∆ ⊢ σ1 ≲ σ2, defined in Figure 1, states that σ1 is a subtype of σ2 in the environment ∆ if an instance of σ1 is a subtype of the constituent type of σ2 in the environment obtained by conjoining ∆ and the bounds of σ2. In the outer universal subtyping judgment ∆ ⊢ σ ≤ σ′, we first perform universal reduction to produce σ′r (denoted ∆ ⊢ σ′ −→≡ σ′r) and then check whether ∆ ⊢ σ ≲ σ′r. Again, we provide the formal definition of universal reduction in Section 7.8, noting that it has the following properties:

1. ∆ ⊢ σ ≲ σr
2. ∆ ⊢ σ ≲ σ′ implies ∆ ⊢ σr ≲ σ′r
3. (σr)r = σr
4. (∀⟦⟧S → T)r = ∀⟦⟧S → T

Again the first three properties show that ≤ is a preorder and that ≲ implies ≤. Adding the fourth property lets us show that any ground instance S → T of σ with S ≠ Bottom is an instance of σr.

We need universal reduction for the same reason we need existential reduction: to check the overloading rules. For example, to show that the first two declarations in Dtail from Section 5 satisfy the Return Type Rule, we use universal reduction to show that

    ∀⟦X <: Number, Y <: Any⟧(List⟦X⟧ ∩ List⟦Y⟧) → List⟦Y⟧

and

    ∀⟦W <: Number⟧List⟦W⟧ → List⟦W⟧

have all the same nontrivial instances.

6.3 Mechanically Checking the Rules

With our interpretations of applicability and specificity into existential subtyping, we now describe the process of checking the validity of a set of overloaded declarations Df according to the rules in Section 3.

We can check the No Duplicates Rule by verifying that for every pair of distinct function declarations d1, d2 ∈ Df either d1 ⋠ d2 or d2 ⋠ d1.

The Meet Rule requires that every pair of declarations d1, d2 ∈ Df has a meet in Df. Because the more specific relation on function declarations corresponds to the subtyping relation on the (existential) domain types, we just need to find a declaration d0 ∈ Df whose domain type dom(d0) is equivalent to the meet (under ≤) of the existential types dom(d1) and dom(d2). Figure 2 shows how to compute the meet of two existential types.

Existential Meet: δ1 ∧ δ2

    ∃⟦X <: {M}⟧T ∧ ∃⟦Y <: {N}⟧U  =def  ∃⟦X <: {M}, Y <: {N}⟧(T ∩ U)
    where X ∩ Y = X ∩ (FV(U) ∪ FV(N)) = Y ∩ (FV(T) ∪ FV(M)) = ∅

Figure 2. The computed meet of existential types.

Lemma 7 δ1 ∧ δ2 (as defined in Figure 2) is the meet of δ1 and δ2 under ≤.

Proof: First we show that δ1 ∧ δ2 is the meet of δ1 and δ2 under ≲. That δ1 ∧ δ2 ≲ δ1 and δ1 ∧ δ2 ≲ δ2 is obvious. For any δ0, if U and V are instantiations that prove δ0 ≲ δ1 and δ0 ≲ δ2, respectively, then we can use the instantiation U, V to prove that δ0 ≤ δ1 ∧ δ2.

Now we show that the meet under ≲ is also the meet under ≤. Suppose that δ0 ≤ δ1, δ0 ≤ δ2, and ∆ ⊢ δ0 −→≡ δ0′. A little work lets us deduce that δ0′ ≲ δ1 ∧ δ2 and hence δ0 ≤ δ1 ∧ δ2. The fact that δ1 ∧ δ2 ≤ δ1 and δ1 ∧ δ2 ≤ δ2 follows from the fact that ≲ implies ≤. □

We can check the Return Type Rule using the subtype relation on universal types.

Theorem 2 Let d1 = f⟦∆1⟧S1 : T1 and d2 = f⟦∆2⟧S2 : T2 be declarations in Df with d1 ⪯ d2. They satisfy the Return Type Rule if arrow(d1) is a subtype of the arrow type σ∧ = ∀⟦∆1, ∆2⟧(S1 ∩ S2) → T2.

Proof: Let d∧ = f⟦∆1, ∆2⟧S1 ∩ S2 : T2, so arrow(d∧) = σ∧. Note that d∧ and d1 are equally specific and that d∧ and d2 satisfy the Return Type Rule. Because arrow(d1) ≤ σ∧, for every instance U → V of σ∧ with U ≢ Bottom, we can find an instance U1 → V1 of arrow(d1) with U <: U1 and V1 <: V. Thus, the pair d1 and d2 satisfy the Return Type Rule because the pair d∧, d2 does. □

7. Constraint-Based Judgments

Up to this point the precise definitions of subtyping and exclusion between types (and quantifier reduction) have remained unspecified. In this section we describe a small language of type constraints and we define subtyping and exclusion with respect to constraints. Finally, with constraint-based subtyping and exclusion defined, we explain in more detail the notion of quantifier reduction used in the ≤ judgments (and thus in our rule-checking).

7.1 Inference Variables

Until now we have only considered types whose free variables are bound in an explicit type environment. To gather constraints, however, we must check subtype and exclusion relationships between types with unbound inference variables. Intuitively, we have no control over the constraints on a bound type variable (which are fixed by the associated type environment), but we may introduce constraints on an inference variable. While the syntax of type variables is uniform, we conventionally distinguish them by using the metavariables X and Y for bound type variables and I and J for inference type variables.

7.2 Judgment Forms

In Figure 3, we list the judgments for generating constraints. A judgment of the form ∆ ⊢ S ∗ T ⇐ C states that under the assumptions ∆, the constraint C on the inference variables implies the proposition S ∗ T, where ∗ ranges over <:, ≮:, ♦, ♦̸, ≡, and ≢. If S and T contain no inference variables, the judgment behaves like an unconditional judgment (i.e., it only produces the constraints true or false).

Similarly, the judgment of the form

    ∆ ⊢ S ∗ T ⇒ C

states that under the assumptions ∆, if the proposition S ∗ T holds, then C must be true of the inference variables. In particular, when C holds of the inference variables, S ∗ T does not have to hold for every valid instantiation of the bound type variables. (Note that we only make use of this judgment where ∗ is ≢.)

An important point about both kinds of judgments is that the types S and T should be considered inputs and the constraint C should be considered an output.

Primitive Judgments: ∆ ⊢ T ∗ T ⇐ C

    ∆ ⊢ S <: T ⇐ C        ∆ ⊢ S ♦ T ⇐ C
    ∆ ⊢ S ≮: T ⇐ C        ∆ ⊢ S ♦̸ T ⇐ C

Derived Judgments: ∆ ⊢ T ∗ T ⇐ C

    ∆ ⊢ S <: T ⇐ C      ∆ ⊢ T <: S ⇐ C′
    --------------------------------------
    ∆ ⊢ S ≡ T ⇐ C ∧ C′

    ∆ ⊢ S ≮: T ⇐ C      ∆ ⊢ T ≮: S ⇐ C′
    --------------------------------------
    ∆ ⊢ S ≢ T ⇐ C ∨ C′

Derived Judgments: ∆ ⊢ T ∗ T ⇒ C

    ∆ ⊢ S ≡ T ⇐ C
    ------------------
    ∆ ⊢ S ≢ T ⇒ ¬C

Figure 3. Constraint-based judgment forms.

7.3 Constraint Forms

Our grammar for type constraints is defined in Figure 4. A primitive constraint is either positive or negative. We define positive primitive constraints: S <: T specifies that S is a subtype of T, and S ♦ T specifies that S must exclude T. Similarly, we define negative primitive constraints: S ≮: T and S ♦̸ T with the obvious interpretations. A conjunction constraint C1 ∧ C2 is satisfied exactly when both C1 and C2 are satisfied, and a disjunction constraint C1 ∨ C2 is satisfied exactly when one or both of C1 and C2 are satisfied. The constraint false is never satisfied, and the constraint true is always satisfied. The equivalence constraint S ≡ T is derived as S <: T ∧ T <: S.

Following Smith and Cartwright [16], we normalize all constraint formulas into disjunctive normal form and simplify away obvious contradictions and redundancies. We further make use of some auxiliary meta-level definitions, defined in Figure 4. The negation ¬C of a constraint C has a standard de Morgan interpretation. Each type environment ∆ = X <: {M} naturally describes a constraint on the variables X, which we denote toConstraint(∆). This conversion has a partial inverse toBounds(C) that is defined whenever C can be written as a conjunction of constraints of the form X <: M.¹³

¹³ If C has multiple conjuncts of this form for a single X, then the resulting environment contains multiple bounds for X using the {M} notation.

Constraint Grammar

    C ::= S <: T
        | S ♦ T
        | S ≮: T
        | S ♦̸ T
        | C ∧ C
        | C ∨ C
        | false
        | true

Constraint Utilities

    ¬C = C
    toConstraint(∆) = C
    toBounds(C) = ∆
    ∆ ⊢ unify(C) = φ, C

Figure 4. Constraints. Note that unify and toBounds are partial functions.

7.4 Subtyping

Figure 5 presents the full definition of our constraint-based subtyping algorithm as inference rules for the judgment ∆ ⊢ T <: T ⇐ C. Note that our algorithm requires that these rules be processed in the given order.

Smith and Cartwright similarly define a sound and complete algorithm for generating constraints from the Java subtyping relation [16]. We essentially preserve their semantics with two notable differences. First, our definition includes additional rules for tuple types to account for the fact that

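Logical rules

\[ \Delta \vdash \mathit{Bottom} <: T \Leftarrow \mathit{true} \qquad \Delta \vdash M <: \mathit{Bottom} \Leftarrow \mathit{false} \qquad \Delta \vdash S \to T <: \mathit{Bottom} \Leftarrow \mathit{false} \]
\[ \Delta \vdash T <: \mathit{Any} \Leftarrow \mathit{true} \qquad \Delta \vdash \mathit{Any} <: S \to T \Leftarrow \mathit{false} \]
\[ \frac{\Delta \vdash T <: U \Leftarrow C \qquad \Delta \vdash T <: V \Leftarrow C'}{\Delta \vdash T <: U \cap V \Leftarrow C \wedge C'} \qquad \frac{\Delta \vdash S <: U \Leftarrow C \qquad \Delta \vdash T <: U \Leftarrow C' \qquad \Delta \vdash S \diamond T \Leftarrow C''}{\Delta \vdash S \cap T <: U \Leftarrow C \vee C' \vee C''} \]
\[ \frac{\Delta \vdash T <: U \Leftarrow C \qquad \Delta \vdash T <: V \Leftarrow C'}{\Delta \vdash T <: U \cup V \Leftarrow C \vee C'} \qquad \frac{\Delta \vdash S <: U \Leftarrow C \qquad \Delta \vdash T <: U \Leftarrow C'}{\Delta \vdash S \cup T <: U \Leftarrow C \wedge C'} \]

Inference Variables

\[ \frac{I \notin \Delta}{\Delta \vdash I <: I \Leftarrow \mathit{true}} \qquad \frac{I \notin \Delta}{\Delta \vdash I <: T \Leftarrow I <: T} \qquad \frac{I \notin \Delta}{\Delta \vdash S <: I \Leftarrow S <: I} \]

Bound Variables

\[ \Delta \vdash X <: X \Leftarrow \mathit{true} \qquad \frac{\Delta \vdash \Delta(X) <: T \Leftarrow C}{\Delta \vdash X <: T \Leftarrow C} \qquad \frac{\Delta \vdash S <: \mathit{Bottom} \Leftarrow C}{\Delta \vdash S <: X \Leftarrow C} \]

Structural rules

\[ \frac{|\overline{S}| = |\overline{T}| \qquad \forall i.\; \Delta \vdash S_i <: T_i \Leftarrow C_i \qquad \forall i.\; \Delta \vdash S_i <: \mathit{Bottom} \Leftarrow D_i}{\Delta \vdash (\overline{S}) <: (\overline{T}) \Leftarrow (\bigwedge C_i) \vee (\bigvee D_i)} \]
\[ \frac{|\overline{S}| \neq 1 \qquad \forall i.\; \Delta \vdash S_i <: \mathit{Bottom} \Leftarrow C_i}{\Delta \vdash (\overline{S}) <: T \Leftarrow \bigvee C_i} \qquad \frac{\Delta \vdash U <: S \Leftarrow C \qquad \Delta \vdash T <: V \Leftarrow C'}{\Delta \vdash S \to T <: U \to V \Leftarrow C \wedge C'} \]
\[ \frac{|\overline{U}| \neq 1}{\Delta \vdash S \to T <: (\overline{U}) \Leftarrow \mathit{false}} \qquad \frac{|\overline{T}| \neq 1}{\Delta \vdash M <: (\overline{T}) \Leftarrow \mathit{false}} \]
\[ \Delta \vdash S \to T <: M \Leftarrow \mathit{false} \qquad \Delta \vdash M <: S \to T \Leftarrow \mathit{false} \]

Constructed types

\[ \frac{C \neq D \qquad \forall M \in C[\![\overline{S}]\!].\mathit{extends}.\;\; \Delta \vdash M <: D[\![\overline{T}]\!] \Leftarrow C_M}{\Delta \vdash C[\![\overline{S}]\!] <: D[\![\overline{T}]\!] \Leftarrow \bigvee C_M} \qquad \frac{\forall i.\; \Delta \vdash S_i \equiv T_i \Leftarrow C_i}{\Delta \vdash C[\![\overline{S}]\!] <: C[\![\overline{T}]\!] \Leftarrow \bigwedge C_i} \]

Figure 5. Algorithm for generating subtyping constraints. Apply the first rule that matches.

any tuple is equivalent to $\mathit{Bottom}$ if any of its element types is equivalent to $\mathit{Bottom}$.

\[ \frac{|\overline{S}| = |\overline{T}| \qquad \forall i.\; \Delta \vdash S_i <: T_i \Leftarrow C_i \qquad \forall i.\; \Delta \vdash S_i <: \mathit{Bottom} \Leftarrow C'_i}{\Delta \vdash (\overline{S}) <: (\overline{T}) \Leftarrow (\bigwedge C_i) \vee (\bigvee C'_i)} \]
\[ \frac{|\overline{S}| \neq 1 \qquad \forall i.\; \Delta \vdash S_i <: \mathit{Bottom} \Leftarrow C_i}{\Delta \vdash (\overline{S}) <: T \Leftarrow \bigvee C_i} \]

Second, our definition includes an additional rule for intersection types to account for exclusion, since the intersection of excluding types is equivalent to $\mathit{Bottom}$. This rule makes our exclusion and subtyping rules mutually dependent.

\[ \frac{\Delta \vdash S <: U \Leftarrow C \qquad \Delta \vdash T <: U \Leftarrow C' \qquad \Delta \vdash S \diamond T \Leftarrow C''}{\Delta \vdash S \cap T <: U \Leftarrow C \vee C' \vee C''} \]

We formally define the subtyping judgment from Section 2 as a trivial application of constraint-based subtyping with the following rule:

\[ \frac{\Delta \vdash S <: T \Leftarrow \mathit{true}}{\Delta \vdash S <: T} \]

7.5 Exclusion

Figure 6 presents our definition of constraint-based exclusion as inference rules for the judgment $\Delta \vdash T \diamond T \Leftarrow C$. As with subtyping, our algorithm requires that these rules be processed in order. Additionally, if no rule matches, then the l.h.s. and r.h.s. types should be swapped and the rules tried again.

To make these rules algorithmic, we break the exclusion relation on constructed types into four subrelations $\diamond_x$, $\diamond_c$, $\diamond_o$, and $\diamond_m$. The first three relations are further decomposed into the asymmetric relations $\triangleright_x$, $\triangleright_c$, and $\triangleright_o$.

1. $C[\![\overline{S}]\!] \triangleright_x D[\![\overline{T}]\!]$ determines whether $D[\![\overline{T}]\!]$ has a supertype $N$ such that $N$ appears in the excludes clause of an ancestor of $C[\![\overline{S}]\!]$.
2. $C[\![\overline{S}]\!] \triangleright_c D[\![\overline{T}]\!]$ determines whether $D[\![\overline{T}]\!]$ excludes every type in the (nontrivial) comprises clause of $C[\![\overline{S}]\!]$.
3. $C[\![\overline{S}]\!] \triangleright_o D[\![\overline{T}]\!]$ determines whether $C[\![\overline{S}]\!]$ is an object and $D[\![\overline{T}]\!]$ is not a supertype of $C[\![\overline{S}]\!]$.
4. $C[\![\overline{S}]\!] \diamond_m D[\![\overline{T}]\!]$ determines whether there is a pair of types $(M, N)$ such that $M$ is an ancestor of $C[\![\overline{S}]\!]$, $N$ is an ancestor of $D[\![\overline{T}]\!]$, and $M$ and $N$ are distinct applications of the same type constructor.

As with subtyping, we formally define the exclusion judgment described in Section 4 as a trivial application of constraint-based exclusion with the following rule:

\[ \frac{\Delta \vdash S \diamond T \Leftarrow \mathit{true}}{\Delta \vdash S \diamond T} \]

Exclusion: $\Delta \vdash T \diamond T \Leftarrow C$

Logical rules

\[ \Delta \vdash \mathit{Bottom} \diamond T \Leftarrow \mathit{true} \qquad \frac{\Delta \vdash T <: \mathit{Bottom} \Leftarrow C}{\Delta \vdash \mathit{Any} \diamond T \Leftarrow C} \]
\[ \frac{\Delta \vdash S \diamond U \Leftarrow C \qquad \Delta \vdash T \diamond U \Leftarrow C' \qquad \Delta \vdash S \cap T <: \mathit{Bottom} \Leftarrow C''}{\Delta \vdash S \cap T \diamond U \Leftarrow C \vee C' \vee C''} \qquad \frac{\Delta \vdash S \diamond U \Leftarrow C \qquad \Delta \vdash T \diamond U \Leftarrow C'}{\Delta \vdash S \cup T \diamond U \Leftarrow C \wedge C'} \]

Inference Variables

\[ \frac{I \notin \Delta}{\Delta \vdash I \diamond I \Leftarrow \mathit{false}} \qquad \frac{I \notin \Delta}{\Delta \vdash I \diamond T \Leftarrow I \diamond T} \]

Bound Variables

\[ \frac{\Delta \vdash \Delta(X) \diamond T \Leftarrow C}{\Delta \vdash X \diamond T \Leftarrow C} \]

Structural rules

\[ \frac{|\overline{S}| = |\overline{T}| \qquad \forall i.\; \Delta \vdash S_i \diamond T_i \Leftarrow C_i}{\Delta \vdash (\overline{S}) \diamond (\overline{T}) \Leftarrow \bigvee C_i} \qquad \frac{|\overline{S}| \neq |\overline{T}|}{\Delta \vdash (\overline{S}) \diamond (\overline{T}) \Leftarrow \mathit{true}} \]
\[ \frac{M \neq \mathit{Any} \qquad |\overline{T}| \neq 1}{\Delta \vdash M \diamond (\overline{T}) \Leftarrow \mathit{true}} \qquad \frac{|\overline{T}| \neq 1}{\Delta \vdash S \to R \diamond (\overline{T}) \Leftarrow \mathit{true}} \]
\[ \Delta \vdash S \to T \diamond U \to V \Leftarrow \mathit{false} \qquad \frac{M \neq \mathit{Any}}{\Delta \vdash M \diamond T \to U \Leftarrow \mathit{true}} \]

Constructed types

\[ \frac{\Delta \vdash C[\![\overline{S}]\!] \diamond_x D[\![\overline{T}]\!] \Leftarrow C_e \qquad \Delta \vdash C[\![\overline{S}]\!] \diamond_c D[\![\overline{T}]\!] \Leftarrow C_c \qquad \Delta \vdash C[\![\overline{S}]\!] \diamond_o D[\![\overline{T}]\!] \Leftarrow C_o \qquad \Delta \vdash C[\![\overline{S}]\!] \diamond_m D[\![\overline{T}]\!] \Leftarrow C_p}{\Delta \vdash C[\![\overline{S}]\!] \diamond D[\![\overline{T}]\!] \Leftarrow C_e \vee C_c \vee C_o \vee C_p} \]
\[ \frac{\Delta \vdash C[\![\overline{S}]\!] \triangleright_\ast D[\![\overline{T}]\!] \Leftarrow C \qquad \Delta \vdash D[\![\overline{T}]\!] \triangleright_\ast C[\![\overline{S}]\!] \Leftarrow C'}{\Delta \vdash C[\![\overline{S}]\!] \diamond_\ast D[\![\overline{T}]\!] \Leftarrow C \vee C'} \quad \text{where } \ast \in \{x, c, o\} \]
\[ \Delta \vdash C[\![\overline{S}]\!] \triangleright_\ast C[\![\overline{T}]\!] \Leftarrow \mathit{false} \quad \text{where } \ast \in \{x, c, o\} \]
\[ \frac{C \neq D \qquad A = \mathit{ancestors}(C[\![\overline{S}]\!]) \qquad \forall N \in \bigcup_{M \in A} M.\mathit{excludes}.\;\; \Delta \vdash D[\![\overline{T}]\!] <: N \Leftarrow C_N}{\Delta \vdash C[\![\overline{S}]\!] \triangleright_x D[\![\overline{T}]\!] \Leftarrow \bigvee C_N} \]
\[ \frac{C \neq D \qquad \forall M \in C[\![\overline{S}]\!].\mathit{comprises}.\;\; \Delta \vdash M \diamond D[\![\overline{T}]\!] \Leftarrow C_M}{\Delta \vdash C[\![\overline{S}]\!] \triangleright_c D[\![\overline{T}]\!] \Leftarrow \bigwedge C_M} \qquad \frac{C \neq D \qquad C \text{ does not have a comprises clause}}{\Delta \vdash C[\![\overline{S}]\!] \triangleright_c D[\![\overline{T}]\!] \Leftarrow \mathit{false}} \]
\[ \frac{C \neq D \qquad \mathit{object}\ C \qquad \Delta \vdash C[\![\overline{S}]\!] \not<: D[\![\overline{T}]\!] \Leftarrow C}{\Delta \vdash C[\![\overline{S}]\!] \triangleright_o D[\![\overline{T}]\!] \Leftarrow C} \qquad \frac{C \neq D \qquad \neg(\mathit{object}\ C)}{\Delta \vdash C[\![\overline{S}]\!] \triangleright_o D[\![\overline{T}]\!] \Leftarrow \mathit{false}} \]
\[ \frac{\forall M \in \mathit{ancestors}(C[\![\overline{S}]\!]).\;\; \forall N \in \mathit{ancestors}(D[\![\overline{T}]\!]).\;\; \Delta \vdash M \mathrel{\dot{\diamond}_m} N \Leftarrow C_{M,N}}{\Delta \vdash C[\![\overline{S}]\!] \diamond_m D[\![\overline{T}]\!] \Leftarrow \bigvee C_{M,N}} \]
\[ \frac{\forall i.\; \Delta \vdash S_i \not\equiv T_i \Leftarrow C_i}{\Delta \vdash C[\![\overline{S}]\!] \mathrel{\dot{\diamond}_m} C[\![\overline{T}]\!] \Leftarrow \bigvee C_i} \qquad \frac{C \neq D}{\Delta \vdash C[\![\overline{S}]\!] \mathrel{\dot{\diamond}_m} D[\![\overline{T}]\!] \Leftarrow \mathit{false}} \]

Figure 6. Algorithm for generating exclusion constraints. Each rule is symmetric; apply the first one that matches.

7.6 Negative Judgments and Negation

In the rules for constraint-based exclusion (Figure 6), we use the negative judgments $\Delta \vdash T \not<: T \Leftarrow C$ and $\Delta \vdash T \not\diamond T \Leftarrow C$ to determine constraints under which the negations hold. Instead of defining negative judgments explicitly, we describe how to derive them from their positive counterparts according to de Morgan's laws.

For the negative subtyping judgment $\Delta \vdash T \not<: T \Leftarrow C$, the rules for bound variables are given below:

\[ \frac{X \in \Delta}{\Delta \vdash X \not<: T \Leftarrow \mathit{false}} \qquad \frac{\Delta \vdash S \not<: \Delta(X) \Leftarrow C}{\Delta \vdash S \not<: X \Leftarrow C} \]

The other rules for $\Delta \vdash T \not<: T \Leftarrow C$ are obtained as follows: for each rule of $\Delta \vdash T <: T \Leftarrow C$ that is not in the section marked "bound variables," make a new rule for $\Delta \vdash T \not<: T \Leftarrow C$ by replacing each occurrence of a relation symbol $\ast$ with its negation, and by swapping each $\wedge$ with $\vee$ and $\mathit{true}$ with $\mathit{false}$, and vice versa. For example, the rule for intersection types on the r.h.s.

\[ \frac{\Delta \vdash T <: U \Leftarrow C \qquad \Delta \vdash T <: V \Leftarrow C'}{\Delta \vdash T <: U \cap V \Leftarrow C \wedge C'} \]

becomes the following rule in the negative subtyping judgment:

\[ \frac{\Delta \vdash T \not<: U \Leftarrow C \qquad \Delta \vdash T \not<: V \Leftarrow C'}{\Delta \vdash T \not<: U \cap V \Leftarrow C \vee C'} \]

The rules for the negative exclusion judgment $\Delta \vdash T \not\diamond T \Leftarrow C$ are derived from those of $\Delta \vdash T \diamond T \Leftarrow C$ according to the process above. The rule for bound variables is

\[ \frac{X \in \Delta}{\Delta \vdash X \not\diamond T \Leftarrow \mathit{false}} \]

The negative subtyping judgment should not be confused with the derived contrapositive judgment $\Delta \vdash T \not\equiv T \Rightarrow C$ given in Figure 3, for the two judgments handle bound type variables very differently. Intuitively, the negative assertion $\Delta \vdash S \not\equiv T \Leftarrow C$ computes the constraint $C$ that satisfies the inequivalence for an arbitrary instantiation of the type variables bound in $\Delta$, whereas the contrapositive assertion $\Delta \vdash S \not\equiv T \Rightarrow C$ computes the constraint $C$ that holds for any instantiation of $\Delta$ such that the inequivalence is true. The following derivable assertions further illustrate this distinction, for $\Delta = X <: \mathit{Any},\, Y <: \mathit{Any}$:

\[ \Delta \vdash \mathit{Pair}[\![X, I]\!] \cap \mathit{Pair}[\![Y, J]\!] \not\equiv \mathit{Bottom} \Leftarrow \mathit{false} \]
\[ \Delta \vdash \mathit{Pair}[\![X, I]\!] \cap \mathit{Pair}[\![Y, J]\!] \not\equiv \mathit{Bottom} \Rightarrow I \equiv J \]

7.7 Unification

Suppose that $C$ is a conjunction of type equivalences. A unifier of $C$ is a substitution $\phi$ of types for inference type variables such that $\phi(C) = \mathit{true}$. We say that a unifier $\phi$ is more general than a unifier $\psi$ if there exists a substitution $\tau$ such that $\tau \circ \phi = \psi$. In other words, $\phi$ is more general than $\psi$ if $\psi$ factors through $\phi$.

We can extend the notion of a unifier to an arbitrary conjunction $C$ in the case that $C$ can be expressed as a conjunction $C' \wedge C''$ where $C'$ consists entirely of equivalences and $C''$ contains no type equivalences. Then we define a unifier of $C$ to be a unifier of $C'$. Finally, we can extend the notion of unifier to a constraint $C$ in disjunctive normal form to be a unifier of any disjunct of $C$.

The (partial) function $\mathit{unify}$ in Figure 4 takes a constraint $C$ and produces a most general unifier $\phi$ if one exists. This is always the case if $C$ consists of a single conjunct. $\mathit{unify}$ additionally produces the substituted leftover part, $\phi(C'')$.

7.8 Quantifier Reduction

In the evaluation of valid overloadings from Section 5, intensional type analysis was required in order to reason about certain examples. Since this reasoning justified the validity of these overloaded functions, we incorporate it into the present formal system as well.

Whenever two different domain types should be applicable to the same argument type $W$ (in order to validate the Meet Rule or Return Type Rule), an existentially quantified intersection type naturally arises as the necessary supertype of $W$. Intersection types $S \cap T$ in our type system naturally fall into two distinct cases: either $S \not\diamond T$, or $S \diamond T$, in which case the intersection has the same extent as $\mathit{Bottom}$. In the second case, the intersection is trivial and $W$, as a subtype of the intersection, must also be trivial. Moreover, because the argument type $W$ to which both declarations must be applicable is necessarily equivalent to $\mathit{Bottom}$, the Meet Rule and Return Type Rule are both trivially satisfied by the presence of the implicit overloading on $\mathit{Bottom}$. In this manner, case analysis on whether an existentially quantified (intersection) type is $\mathit{Bottom}$ facilitates the checking of our rules.

Naïvely one might expect this case analysis on $S \cap T$ to simply check whether $S \diamond T$. However, as is the case when checking generic function declarations, the types $S$ and $T$ might have free type variables, whose uncertainty often precludes a definitive statement about $S \diamond T$. (For example, $C[\![X]\!] \diamond C[\![Y]\!]$ holds only if $X \not\equiv Y$.) Our solution is to reason backwards: under the assumption that the intersection is nontrivial (that the types do not exclude), gather the necessary constraints on type parameters. (For example, $C[\![X]\!] \cap C[\![Y]\!] \not\equiv \mathit{Bottom}$ yields the constraint $X \equiv Y$.) These constraints are then reduced, resulting in an instantiation (and potentially tighter bounds on type parameters) that necessarily follows from our assumption of nontriviality.¹⁴

We call the general pattern of simplifying an existentially quantified (intersection) type existential reduction, given by the judgment $\Delta \vdash \delta \xrightarrow{\equiv} \delta$ in Figure 7. The first rule for existential reduction performs the constraint-based case analysis described above, while the second merely relates the existential to itself if the premises of the first rule do not hold. We thus explain the first rule in more detail.

The first premise determines the constraints $C$ that must be true under the hypothesis that $T \not\equiv \mathit{Bottom}$ (i.e., that this type is nontrivial). Note that the type variables from $\Delta$ are bound, while any type variables from the existential itself, $\Delta'$, become inference type variables mentioned in $C$. The second premise binds $C'$ to exactly the inference type variables and bounds denoted by the existential's type parameters; these are the constraints that must hold for $T$ to still make sense. In the third premise, if $\mathit{unify}$ succeeds, it produces a substitution $\phi$ for any inference type variables from $\Delta'$ constrained by equalities. Because $\phi$ is a most general unifier, it has the property that any other valid substitution $\psi$ of the variables of $\Delta'$ with $\psi(T) \not\equiv \mathit{Bottom}$ must be equal to $\tau \circ \phi$, for some other substitution $\tau$. Moreover, if $\mathit{unify}$ succeeds, it produces a set of leftover constraints $C''$ that are not unifiable equalities (but have still been simplified). If it is possible to express $C''$ as some type environment $\Delta''$, then we use this as the new type parameters over the simplified type $\phi(T)$.

Similarly we call the general pattern of simplifying a universally quantified arrow type universal reduction, given by the judgment $\Delta \vdash \sigma_1 \xrightarrow{\equiv} \sigma_2$. The first premise reduces the domain type $\mathit{dom}(\sigma) = \exists[\![\Delta']\!]\,S$, resulting in a new existential type $\delta = \exists[\![\Delta'']\!]\,S'$ and a substitution $\phi$ mapping type variables from $\Delta'$ to types with variables in $\Delta''$. We then construct a new arrow with domain $\delta$ and range $\phi(T)$.

As an example, in order to check that the first two declarations of $D_{\mathit{bar}}$ from Section 5 satisfy the Meet Rule, we must reduce the existential

\[ \exists[\![X <: \mathit{Any},\, Y <: \mathbb{Z}]\!]\,(\mathit{ArrayList}[\![X]\!] \cap \mathit{List}[\![Y]\!]). \]

Thus we must find the constraint $C$ such that

\[ \vdash \mathit{ArrayList}[\![X]\!] \cap \mathit{List}[\![Y]\!] \not\equiv \mathit{Bottom} \Rightarrow C \]

can be derived, noting that $X$ and $Y$ are actually (unbound) type inference variables here. In this instance $C = X \equiv Y$ due to multiple instantiation exclusion. Then we convert the bounds on the existential's type parameters into the constraint $C'$ on $X$ and $Y$ as inference variables: $\mathit{toConstraint}(X <: \mathit{Any},\, Y <: \mathbb{Z}) = X <: \mathit{Any} \wedge Y <: \mathbb{Z}$. Unifying the constraint

\[ C \wedge C' = X \equiv Y \wedge X <: \mathit{Any} \wedge Y <: \mathbb{Z} \]

yields the type substitution $\phi = [W/X, W/Y]$ (for some fresh variable $W$) and the simplified leftover constraint $C'' =$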

$W <: \mathbb{Z}$. Since $C''$ has the form of a type environment, $\mathit{toBounds}(C'') = W <: \mathbb{Z}$, we finally reduce this existential to $\exists[\![W <: \mathbb{Z}]\!]\,(\mathit{ArrayList}[\![W]\!] \cap \mathit{List}[\![W]\!])$. However, due to the class table declaration of $\mathit{ArrayList}[\![W]\!]$ this existential type will be indistinguishable (by $\lesssim$) from the simpler $\exists[\![W <: \mathbb{Z}]\!]\,\mathit{ArrayList}[\![W]\!]$.

When checking that the first two declarations of $D_{\mathit{tail}}$ from Section 5 satisfy the Return Type Rule, we use universal reduction to prove

\[ \vdash \forall[\![X <: \mathit{Any}]\!]\,\mathit{List}[\![X]\!] \to \mathit{List}[\![X]\!] \;\le\; \forall[\![X <: \mathit{Any},\, Y \]

¹⁴ A similar sort of case analysis and constraint solving arises for pattern matching with generalized algebraic data types (GADTs) [15]: GADTs resemble our existential types and pattern matching resembles our function application.

Existential Reduction: $\Delta \vdash \delta \xrightarrow{\equiv} \delta,\, \phi$

\[ \frac{\Delta \vdash T \not\equiv \mathit{Bottom} \Rightarrow C \qquad \mathit{toConstraint}(\Delta') = C' \qquad \Delta \vdash \mathit{unify}(C \wedge C') = \phi,\, C'' \qquad \mathit{toBounds}(C'') = \Delta''}{\Delta \vdash \exists[\![\Delta']\!]\,T \xrightarrow{\equiv} \exists[\![\Delta'']\!]\,\phi(T),\; \phi} \]
\[ \frac{\text{otherwise}}{\Delta \vdash \exists[\![\Delta']\!]\,T \xrightarrow{\equiv} \exists[\![\Delta']\!]\,T,\; [/]} \]

Universal Reduction: $\Delta \vdash \sigma \xrightarrow{\equiv} \sigma,\, \phi$

\[ \frac{\Delta \vdash \exists[\![\Delta']\!]\,S \xrightarrow{\equiv} \exists[\![\Delta'']\!]\,S',\; \phi}{\Delta \vdash \forall[\![\Delta']\!]\,S \to T \xrightarrow{\equiv} \forall[\![\Delta'']\!]\,S' \to \phi(T),\; \phi} \]

Figure 7. Quantifier reduction judgments.

is well-formed if it satisfies the well-formedness conditions of a whole program, as described in previous sections.

A compound module combines multiple modules, possibly renaming members (i.e., classes and functions) of its constituents. More precisely, a compound module is a collection of filters, where a filter consists of a module and a complete mapping from names of members of the module to names. The name of a member that is not renamed is simply mapped to itself.

The semantics of a compound module is the semantics of the simple module that results from recursively flattening the compound module as follows:

• Flattening a simple module simply yields the same module.
• Flattening a compound module $C$ consisting of filters (module/mapping pairs) $(c_1, m_1), \ldots, (c_N, m_N)$ yields a simple module whose class table and collection of function declarations are the unions of the class tables and collections of function declarations of $s_1, \ldots, s_N$, where $s_i$ is the simple module resulting from first flattening $c_i$ and then renaming all members of the resulting simple module according to the mapping $m_i$.

A compound module is well-formed if its flattened version is well-formed. This requirement implies that the type hierarchies in each constituent component are consistent with the type hierarchy in the flattened version.

We can now use this model of modularity to see that we
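The recursive flattening of compound modules described in the two bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `SimpleModule`, `CompoundModule`, `rename`, and `flatten` are hypothetical, a function declaration is modelled as a pair of a name and a tuple of parameter type names, and a renaming is a dictionary in which any unmentioned name maps to itself. The well-formedness check on the flattened result is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleModule:
    """A flat module: class names plus function declarations.

    Each function declaration is modelled as (name, parameter type names);
    this representation is an assumption made for illustration only.
    """
    classes: frozenset
    functions: frozenset


@dataclass
class CompoundModule:
    """A collection of filters, i.e. (module, renaming) pairs."""
    filters: list


def rename(module, mapping):
    """Rename the members of a simple module.

    `mapping` is a dict; any name it does not mention maps to itself,
    which keeps the mapping complete as the text requires.
    """
    def r(name):
        return mapping.get(name, name)

    return SimpleModule(
        classes=frozenset(r(c) for c in module.classes),
        functions=frozenset(
            (r(f), tuple(r(p) for p in params)) for f, params in module.functions
        ),
    )


def flatten(module):
    """Recursively flatten a simple or compound module into a SimpleModule."""
    if isinstance(module, SimpleModule):
        return module  # a simple module flattens to itself
    classes, functions = set(), set()
    for constituent, mapping in module.filters:
        s = rename(flatten(constituent), mapping)  # flatten first, then rename
        classes |= s.classes      # union of class tables
        functions |= s.functions  # union of function declarations
    return SimpleModule(frozenset(classes), frozenset(functions))


if __name__ == "__main__":
    lists = SimpleModule(frozenset({"List"}), frozenset({("tail", ("List",))}))
    arrays = SimpleModule(frozenset({"List"}), frozenset({("tail", ("List",))}))
    combined = CompoundModule([(lists, {}), (arrays, {"List": "ArrayList"})])
    print(flatten(combined))
```

In this sketch the second filter renames `List` to `ArrayList` before the union is taken, so the flattened module contains both classes and one `tail` declaration for each.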

9. Related Work

9.1 Overloading and Dynamic Dispatch

Castagna et al. proposed rules for defining overloaded functions to ensure type safety [4]. They assumed knowledge of the entire type hierarchy (to determine whether two types have a common subtype), and the type hierarchy was assumed to be a meet semilattice (to ensure that any two types have a greatest lower bound).

Millstein and Chambers devised the language Dubious to study how to modularly ensure safety for overloaded functions with symmetric multiple dynamic dispatch (multimethods) in a type system supporting multiple inheritance [11, 12]. With Clifton and Leavens, they developed MultiJava [5], an extension of Java with Dubious' semantics for multimethods. Lee and Chambers presented F(EML) [9], a language with classes, symmetric multiple dispatch, and parameterized modules. In previous work [2], we built on the work of Millstein and Chambers to give modular rules for a core of the Fortress language [1], which supports multiple inheritance and does not require that types have expressible meets (i.e., the types that can be expressed in the language need not form a meet semilattice), but defines an exclusion

9.2 Type classes

Wadler and Blott [17] introduced the type class as a means to specify and implement overloaded functions such as equality and arithmetic operators in Haskell. Other authors have translated type classes to languages besides Haskell [6, 14, 18]. Type classes encapsulate overloaded function declarations, with separate instances that define the behavior of those functions (called class methods) for any particular type schema. Parametric polymorphism is then augmented to express type class constraints, providing a way to quantify a type variable (and thus a function definition) over all types that instantiate the type class.

In systems with type classes, overloaded functions must be contained in some type class, and their signatures must vary in exactly the same structural position. This uniformity is necessary for an overloaded function call to admit a principal type; with a principal type for some function call's context, the type checker can determine the constraints under which a correct overloaded definition will be found. Because of this requirement, type classes are ill-suited for fixed, ad hoc sets of overloaded functions like:¹⁵

    println(): () = println("")
    println(s: String): () = ...

or functions lacking uniform variance in the domain and range like:¹⁶

    bar(x: ℤ): Boolean = (x = 0)
    bar(x: Boolean): ℤ = if x then 1 else 2 end
    bar(x: String): String = x

With type classes one can write overloaded functions with identical domain types. Such behavior is consistent with the static, type-based dispatch of Haskell, but it would lead to irreconcilable ambiguity in the dynamic, value-based dispatch of our system.

A broader interpretation of Wadler and Blott's work [17] sees type classes as program abstractions that quotient the space of ad-hoc polymorphism into the much smaller space of class methods. Indeed, Wadler and Blott's title suggests that the unrestricted space of ad-hoc polymorphism should be tamed, whereas our work embraces the possible expressivity achieved from mixing ad-hoc and parametric polymorphism by specifying the requisites for determinism and type safety.

¹⁵ Suggested by an anonymous reviewer of a previous version of this paper.

10. Conclusion and Discussion

We have shown how to statically ensure safety of overloaded, polymorphic functions while imposing relatively minimal restrictions, solely on function definition sites. We provide rules on definitions that can be checked modularly, irrespective of call sites, and we show how to mechanically verify that a program satisfies these rules. The type analysis required for implementing these checks involves subtyping on universal and existential types, which adds complexity not required for similar checks on monomorphic functions. We have defined an object-oriented language to explain our system of static checks, and we have implemented them as part of the open-source Fortress compiler [1].

Further, we show that in order to check many "natural" overloaded functions with our system in the context of a generic, object-oriented language with multiple inheritance, richer type relations must be available to programmers: the subtyping relation prevalent among such languages does not afford enough type analysis alone. We have therefore introduced an explicit, nominal exclusion relation to check safety of more interesting overloaded functions.

Variance annotations have proven to be a convenient and expressive addition to languages based on nominal subtyping [3, 8, 13]. They add complexity to polymorphic exclusion checking, so we leave them to future work.

Acknowledgments

This work is supported in part by the Engineering Research Center of Excellence Program of Korea Ministry of Education, Science and Technology (MEST) / National Research Foundation of Korea (NRF) (Grant 2011-0000974).

References

[1] Eric Allen, David Chase, Joe Hallett, Victor Luchangco, Jan-Willem Maessen, Sukyoung Ryu, Guy L. Steele Jr., and Sam Tobin-Hochstadt. The Fortress Language Specification Version 1.0, March 2008.
[2] Eric Allen, J. J. Hallett, Victor Luchangco, Sukyoung Ryu, and Guy L. Steele Jr. Modular multiple dispatch with multiple inheritance. In SAC '07, 2007.
[3] François Bourdoncle and Stephan Merz. Type checking higher-order polymorphic multi-methods. In POPL '97, 1997.
[4] Giuseppe Castagna, Giorgio Ghelli, and Giuseppe Longo. A calculus for overloaded functions with subtyping. Information and Computation, 117(1):115–135, 1995.
[5] Curtis Clifton, Todd Millstein, Gary T. Leavens, and Craig Chambers. MultiJava: Design rationale, compiler implementation, and applications. ACM TOPLAS, 28(3), 2006.
[6] Derek Dreyer, Robert Harper, and Manuel M. T. Chakravarty. Modular type classes. In POPL '07, 2007.
[7] James Gosling, Bill Joy, Guy Steele, and Gilad Bracha. The Java Language Specification, Third Edition. Addison-Wesley Longman, Amsterdam, 3 edition, June 2005.
[8] Andrew J. Kennedy and Benjamin C. Pierce. On decidability of nominal subtyping with variance, September 2006. FOOL-WOOD '07.
[9] Keunwoo Lee and Craig Chambers. Parameterized modules for classes and extensible functions. In ECOOP '06, 2006.
[10] Vassily Litvinov. Constraint-based polymorphism in Cecil: Towards a practical and static type system. In OOPSLA '98, 1998.
[11] Todd Millstein and Craig Chambers. Modular statically typed multimethods. Information and Computation, 175(1):76–118, 2002.
[12] Todd David Millstein. Reconciling software extensibility with modular program reasoning. PhD thesis, University of Washington, 2003.
[13] Martin Odersky. The Scala Language Specification, Version 2.7. EPFL Lausanne, Switzerland, 2009.
[14] Jeremy G. Siek. A language for generic programming. PhD thesis, Indiana University, Indianapolis, IN, USA, 2005.
[15] Vincent Simonet and François Pottier. A constraint-based approach to guarded algebraic data types. ACM Trans. Program. Lang. Syst., 29(1):1, 2007.
[16] Daniel Smith and Robert Cartwright. Java type inference is broken: can we fix it? In OOPSLA '08, 2008.
[17] P. Wadler and S. Blott. How to make ad-hoc polymorphism less ad hoc. In POPL '89, 1989.
[18] Stefan Wehr, Ralf Lämmel, and Peter Thiemann. JavaGI: Generalized interfaces for Java. In ECOOP 2007, 2007.

¹⁶ With the multi-parameter type class extension, one could define functions such as these. A reference to the method bar, however, would require an explicit type annotation like :: Int -> Bool to apply to an Int.
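For contrast with the type-class encoding mentioned in footnote 16, the three `bar` overloads from Section 9.2 can be resolved on the run-time type (the ilk) of the argument, so no annotation is needed at the call site. The sketch below uses Python's standard `functools.singledispatch` as a stand-in. It is not the Fortress mechanism: it dispatches on a single argument only, and Python's `bool` is a subclass of `int`, whereas the paper's Boolean and ℤ are unrelated types.

```python
from functools import singledispatch


@singledispatch
def bar(x):
    # Fallback: no overload is applicable to the argument's run-time type.
    raise TypeError(f"no applicable overload of bar for {type(x).__name__}")


@bar.register
def _(x: int) -> bool:
    # bar(x: Z): Boolean = (x = 0)
    return x == 0


@bar.register
def _(x: bool) -> int:
    # bar(x: Boolean): Z = if x then 1 else 2 end
    # In Python, bool is a subclass of int; singledispatch picks this more
    # specific overload for True/False.
    return 1 if x else 2


@bar.register
def _(x: str) -> str:
    # bar(x: String): String = x
    return x


if __name__ == "__main__":
    for arg in (0, 7, True, False, "hello"):
        print(repr(arg), "->", repr(bar(arg)))
```

Each call selects the most specific registered overload for the argument's class, and the result types differ per overload, which is the non-uniform domain and range that the type-class setting handles poorly.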