<<

CZ: Without Diamonds

Donna Malayeri Carnegie Mellon University [email protected]

fortunately however, problems arise when integrating modular mul- Abstract timethods even with restricted forms of multiple inheritance, such Multiple inheritance has long been plagued with the “diamond” in- as traits or Java multiple inheritance. Previous work either heritance problem, leading to solutions that restrict expressiveness, disallows multiple inheritance across module boundaries, or bur- such as and traits. Instead, we address the diamond problem dens programmers by requiring that they always provide (possibly directly, considering two important difficulties it causes: ensuring a numerous) disambiguating methods. correct semantics for object initializers, and typechecking multiple To solve these problems, we take a different approach: while dispatch in a modular fashion. We show that previous solutions to permitting multiple inheritance, we disallow inheritance diamonds these problems are either unsatisfactory or cumbersome, and sug- entirely. So that there is no loss of expressiveness, we divide the gest a novel approach: supporting multiple inheritance but forbid- notion of inheritance into two concepts: an inheritance dependency ding diamond inheritance. Expressiveness is retained through two (expressed using a requires clause, an extension of a Scala con- features: a “requires” construct that provides a form of subtyping struct [32]) and actual inheritance. Through examples, we illustrate without inheritance (inspired by Scala [33]), and a dynamically- that programs that require diamond inheritance can be translated dispatched “super” call similar to that found in traits. Through ex- to a hierarchy that uses a combination of requires and multiple amples, we illustrate that inheritance diamonds can be eliminated inheritance, without the presence of diamonds. As a result, our lan- via a combination of “requires” and ordinary inheritance. We pro- guage, CZ—for cubic zirconia—retains the expressiveness of dia- vide a sound formal model for our language and demonstrate its mond inheritance. modularity and expressiveness. We argue that a hierarchy with multiple inheritance is conceptu- ally two or more separate hierarchies. These hierarchies represent 1. Introduction different “dimensions” of the class that is multiply inherited. We express dependencies between these dimensions using requires, Single inheritance, mixins [11, 21, 20], and traits [17, 33] each have and give an extended example of its use in Sect. 5. disadvantages: single inheritance restricts expressiveness, mixins Our solution has two advantages: fields and multiple inheritance must be linearly applied, and traits do not allow state. Multiple (including initializers) can gracefully co-exist, and multiple dis- inheritance solves these problems, but poses challenges itself. patch and multiple inheritance can be combined. To achieve the There are two well-known problems with multiple inheritance: latter, we make an incremental extension to existing techniques for (a) a class can inherit multiple features with the same name, and modular typechecking of multiple dispatch.1 (b) a class can have more than one path to a given ancestor (i.e., An additional feature of our language is a dynamically- the “diamond problem”, also known as “fork-join” inheritance) dispatched super call, modelled after trait super calls [17]. When [34, 36]. The first, the conflicting-features problem, can be solved a call is made to A.super. f () on an object with dynamic type by allowing renaming (e.g., Eiffel [27]) or by linearizing the class , the call proceeds to f defined within D’s immediate superclass hierarchy [37, 36]. However, there is still no satisfactory solution along the A path (i.e., E where D extends E and E  A). (The to the diamond problem. path must always be specified because D can have more than one The diamond problem arises when a class C inherits an ancestor parent.) With dynamically-dispatched super calls and requires, A through more than one path. This is particularly problematic our language attains the expressiveness of traits while still allow- when A has fields—should C inherit multiple copies of the fields ing classes to inherit state. or just one? Virtual inheritance in C++ is designed as one solution We have formalized our system as an extension of Feather- for C to inherit only one copy of A’s fields [19]. But with only one weight Java (FJ) [24] (Sect. 7) and have proved it sound [25]. copy of A’s fields, object initializers are a problem: if C transitively Contributions. calls A’s initializer, how can we ensure that it is called only once? • The design of a novel multiple inheritance scheme that solves Existing solutions either restrict the form of constructor definitions, (1) the object initialization problem and (2) the modular type- or ignore some constructor calls. checking of external methods, by forbidding diamond inheri- There is another consequence of the diamond problem: it causes tance (Sect. 6). multiple inheritance to interact poorly with modular typechecking • Generalization of the requires construct and integration with of multiple dispatch. Multiple dispatch is a very powerful language dynamically-dispatched super calls (Sect. 6). mechanism that provides direct support for extensibility and soft- • Examples that illustrate how a diamond inheritance scheme can ware evolution [13, 15]; for these reasons, it has been adopted by be converted to one without diamonds (Sections 4 and 5). designers of new programming languages, such as Fortress [2]. Un- • A formalization of the language (Sect. 7) and proof of type safety. Copyright is held by the author/owner(s). 1 FOOL ’09 24 January, Savannah, Georgia, USA. Without loss of generality, our formal system includes external methods ACM . (also known as open classes) rather than full multimethods; see Sect. 2. Stream « constructor » method Stream.seek { Stream(int) /∗∗ default implementation: do nothing ∗/ void Stream.seek(long pos) { }

InputStream OutputStream /∗∗ seek if pos <= eofPos ∗/ « constructor » « constructor » void InputStream.seek(long pos) { ... } Stream.super(1024); InputStream(int) OutputStream(int) Stream.super(2048); /∗∗ if pos > eofPos, fill with zeros ∗/ void OutputStream.seek(long pos) { ... } InputOutputStream } In the context of our diamond hierarchy, this method definition Figure 1. An inheritance diamond. Italicized class names indicate is ambiguous—what if seek() is called on an object of type abstract classes. InputOutputStream? Unfortunately, it is difficult to perform a modular check to determine this fact. When typechecking the defi- nition of seek(), we cannot search for a potential subclass of both InputStream and OutputStream, as this analysis would not be 2. The Problem modular. And, when typechecking InputOutputStream, we can- not search for external methods defined on both of its superclasses, To start with, the diamond problem raises a question: should class as that check would not be modular, either. We provide a detailed C with a repeated ancestor A have two copies of A’s instance description of the conditions for modularity in Sect. 7.1. variables or just one—i.e., should inheritance be “ inheritance” It is important to note that this problem is not confined to or “graph inheritance” [12]? As the former may be modelled using multiple (implementation) inheritance—it arises in any scenario composition, the latter is the desirable semantics; it is supported in where an object can have multiple dynamic types (or tags) on which languages such as Scala, Eiffel, and C++ (the last through virtual dispatch is performed. For instance, the problem appears if dispatch inheritance) [32, 27, 19]. is permitted on Java interfaces, as in JPred [22]. Next, diamond inheritance leads to (at least) two major prob- lems that have not been adequately solved: (1) determining how and when the superclass constructor/initializer should be called 3. Previous Solutions [37, 36], and (2) how to ensure non-ambiguity of multimethods in a modular fashion [30, 22, 3]. The first problem arises when the Object initialization. Languages that attempt to solve the object graph inheritance semantics is chosen, while the second appears initialization problem include Eiffel [27], C++ [19], Scala [32] and with either tree or graph semantics. with stateful traits [8]. In Eiffel, even though (by default) only one instance of the re- Object initialization. To illustrate the first problem, consider peatedly inherited class is included (e.g., Stream), when construct- Figure 1, which shows a class hierarchy containing a diamond. Sup- ing an InputOutputStream, the Stream constructor is called pose that the Stream superclass has a constructor taking an integer, twice. This has the advantage of simplicity, but unfortunately it to set the size of a buffer. InputStream and OutputStream call does not provide the proper semantics; Stream’s constructor may this constructor with different values (1024 and 2048, respectively). perform a stateful operation (e.g., allocating a buffer), and this op- But, when creating an InputOutputStream, with which value eration would occur twice. should the Stream constructor be called? Moreover, InputStream In C++, if virtual inheritance is used (so that there is only and OutputStream could even call different constructors with dif- one copy of Stream), the constructor problem is solved as fol- fering parameter types, making the situation even more uncertain. lows: the calls to the Stream constructor from InputStream and OutputStream are ignored, and InputOutputStream must call Modular multiple dispatch. The second problem regards mul- the Stream constructor explicitly.4 Though the Stream construc- tiple dispatch, which has been argued to be more natural and ex- tor is called only once, this awkward design has the problem that pressive than single dispatch [13, 15, 14]. However, typechecking constructor calls are ignored. The semantics of InputStream may multiple dispatch in a modular fashion becomes very difficult in require that a particular Stream constructor be called, but the lan- the presence of multiple inheritance—precisely because of the dia- guage semantics would ignore this dependency. mond problem. Scala provides a different solution: trait constructors may To simplify the discussion in this paper, we focus on external not take arguments. (Scala traits are abstract classes that may methods (also known as open classes), which are essentially mul- contain state and may be multiply inherited.) This ensures that timethods that dispatch on the first argument only (corresponding InputStream and OutputStream call the same super-trait con- to the receiver of an ordinary method). Multimethods with asym- structor, causing no ambiguity for InputOutputStream. Though metric dispatching semantics (i.e., the order of arguments affects this design is simple and elegant, it restricts expressiveness. dispatch) can be translated to external methods in a straightforward Smalltalk with stateful traits [8] does not contain constructors, manner.2 External methods essentially allow programmers to add but by convention, objects are initialized using an initialize methods to a class outside of its definition. message. Unfortunately, this results in the same semantics as Eiffel; To see why diamond inheritance causes problems, consider the here, the Stream constructor would be called twice [7]. following definition of the external method3 seek: Finally, we note that although (stateless) traits and mixins do not suffer from the object initialization problem, they are less expres- sive than multiple inheritance. In particular, non-private accessors 2 An asymmetric, or encapsulated multimethod dispatching on classes in a trait negatively impact information hiding, and introducing new A1,..., An can be translated to external methods defined on each Ai, where “state” in a trait (through accessors) results in client classes having each method calls the method in class A , with the actual code defined i+1 to implement these accessors [8]. On the other hand, though mixins in the method on An. Symmetric multiple dispatch cannot be encoded us- ing external methods; this semantics adds a few orthogonal typechecking issues. 4 Since there is no default Stream constructor, this call cannot be automat- 3 We have defined this method externally for illustrative purposes. ically generated. do contain state, they must be linearly applied and mixins cannot ing that subclasses of InputStream also inherit from Stream. In inherit from one another [11, 5]. If the latter were allowed, this other words, InputStream requires the presence of Stream in would be essentially equivalent to Scala traits, which do have the the extends clause of concrete subclasses, but it need not extend object initialization problem. Stream itself. If we make InputStream an abstract class (making it serve only as a unit of reuse), it can be safely treated as a subtype Modular multiple dispatch. There are two main solutions to the of Stream. However, any concrete subclasses of InputStream problem of modular typechecking of multiple dispatch (or external (generators of instances), must also inherit from Stream. Accord- methods) in the presence of multiple inheritance. The first solution ingly, InputOutputStream must inherit from Stream directly. is simply to restrict expressiveness and disallow multiple inheri- We have reified this notion of an inheritance dependency using tance across module boundaries; this is the approach taken by the the requires keyword, a generalized form of a similar construct “System M” variant of Dubious [30]. in Scala [33, 32].6 JPred [22] and Fortress [3] take a different approach. The When a class C requires a class B: diamond problem arises in these languages due to multiple interface inheritance and multiple trait inheritance, respectively. • C is abstract, and In these languages, the typechecker ensures that external methods • C is a subtype of B (but not a subclass), and are unambiguous by requiring that the programmer always specify a method for the case that an object is a subtype of two or more • Subclasses of C must either require B themselves (making incomparable interfaces (or traits). In our streams example, the them abstract) or extend B (allowing them to be concrete). programmer would have to provide a method like the following (in In essence, C requires B is a contract for C’s concrete subclasses JPred syntax): to extend B. void f(Stream s) when s@InputStream && s@OutputStream The revised stream hierarchy is displayed in Fig. 2. In the orig- inal hierarchy, InputStream served as both generator of instances (In Fortress, the method would be specified using intersection and a unit of reuse. In the revised hierarchy, we divide the class types.) Note that in both languages this method would have to in two—one for each role. The class ConcreteInputStream is be defined for every subset of incomparable types (that con- the generator of instances, and the abstract class InputStream is tains at least 2 members), regardless of whether a type like the unit of reuse. Accordingly, InputStream requires Stream, InputOutputStream is ever defined. Even if two types will never and ConcreteInputStream extends both InputStream and have a common subtype, the programmer must specify a disam- Stream. The concrete class InputOutputStream extends each of 5 biguating method, one that perhaps throws an exception. Thus, Stream, InputStream, and OutputStream, creating a subtyping the problem with this approach is that the programmer is required diamond, but not a subclassing diamond, as requires does not to write numerous additional methods—exponential in the number create a subclass relationship. of incomparable types—some of which may never be called. JPred alleviates the problem somewhat by providing syntax to specify that a particular branch should be preferred in the case of an ambi- Stream guity, but it may not always be possible for programmers to know requires requires in advance which method to mark as preferred. InputStream OutputStream Neither JPred interfaces nor Fortress traits may contain state and thus the languages do not provide a solution to the object ini- tialization problem; neither does Dubious, since it does not contain ConcreteInputStream InputOutputStream ConcreteOutputStream constructors.

requires 4. An Overview of CZ EncryptedStream CZ’s design is based on the intuition that there are relationships between classes that are not captured by inheritance, and that if EncryptedInputStream EncryptedOutputStream class hierarchies could express richer interconnections, inheritance diamonds need not exist. Suppose the concrete class C extends Figure 2. The stream hierarchy of Fig. 1, translated to CZ, with A. As noted by Schärli et al., it is beneficial to recognize that C an encryption extension in gray. Italicized class names indicate serves two roles: (1) it is a generator of instances, and (2) it is a unit abstract classes, solid lines indicate extends, and dashed lines of reuse (through subclassing) [35]. In the first role, inheritance is indicate requires. necessary—it is the implementation strategy. In the second role, however, it is possible to transform the class hierarchy to one The code for InputStream will be essentially the same as where an inheritance dependency between C and A is stated and before, except for the call to its super constructor (explained further where subclasses of C inherit from both C and A. This notion below). Because InputStream is a subtype of Stream, it may use of inheritance dependency is of key importance in CZ, because all the fields and methods of Stream, without having to define them while multiple inheritance is permitted, inheritance diamonds are itself. forbidden. Programmers may add another dimension of stream be- Consider the inheritance diamond of Fig. 1. To translate this havior through additional abstract classes, for instance hierarchy to CZ, InputStream’s relationship with Stream would EncryptedStream. EncryptedStream is a type of stream, be changed from inheritance to an inheritance dependency, requir- but it need not extend Stream, merely require it. Concrete subclasses, such as EncryptedInputStream must inherit from 5 In Fortress, the programmer may specify that two traits are disjoint, mean- Stream, which is achieved by extending ConcreteInputStream. ing that there will never be a subtype of both. To allow modular typecheck- ing, this disjoint specification must appear on one of the two trait definitions, 6 In Scala, requires is used to specify the type of a method’s receiver (i.e., which means that one must have knowledge of the other; consequently this it is a selftype), and does not create a subtype relationship. As far as the is not a extensible solution. Scala team is aware, our proposed use of requires is novel [39]. (It would also be possible to extend Stream and InputStream class ASTNode { directly.) abstract ASTNode eval(); The requires relationship can also be viewed as declaring a } semantic “”—if B requires A, then B is effectively stating class Plus extends ASTNode { that it is an extension of A that can be “mixed-in” to clients. ASTNode eval() { ... } For example, EncryptedStream is enhancing Stream by adding String toString() { return "+"; } encryption. Because the relationship is explicitly stated, it allows B } to be substitutable for A. ... Using requires is preferable to using extends because the class DebugNode requires ASTNode { two classes are more loosely coupled. This allows programmers ASTNode eval() { to express inheritance relationships that would otherwise require print(this.toString()); diamond inheritance, without those associated problems. return ASTNode.super.eval(); // dynamic super call } Object initialization. Because there are no inheritance dia- } monds, the object initialization problem is trivially solved. Note class DebugPlus extends DebugNode, Plus { that if class C requires A, it need not (and should not) call ASTNode eval() { A’s constructor, since C does not inherit from A. In our exam- return DebugNode.super.eval(); // ordinary super call ple, InputStream does not call the Stream constructor, while } ConcreteInputStream calls the constructors of its superclasses, } InputStream and Stream. Thus, a subtyping diamond does not cause problems for object initialization. Figure 4. Implementing a mixin-like debug class using This may seem similar to the C++ solution; after all, in both dynamically-bound super calls, and performing external dis- designs, InputOutputStream calls the Stream constructor. How- patch on the ASTNode hierarchy. ever, the CZ design is preferable for two reasons: a) there are no constructor calls to non-direct superclasses, and, more impor- tantly, b) no constructor calls are ignored. In the C++ solution, checked for ambiguities. Subtyping diamonds do not cause prob- InputStream may expect a particular Stream constructor to be lems, as external is based on subclassing. called; as a result, it may not be properly initialized when this call Fragments of CZ. Note that it would be possible to omit multi- is ignored. methods from the language and use the CZ design (as is) for only Modular multiple dispatch. A similar principle solves the prob- the object initialization problem. That is, our solution can be used lem of modular multiple dispatch. In CZ, an external method may to solve either the object initialization problem, the modular multi- only override a method in a superclass, not a required class. (This method problem, or both. restriction does not apply to internal methods, as this scenario does Using “requires”. Introducing two kinds of class relationships not cause problems for modular typechecking.) So, the definitions raises the question: when should programmers use requires, of InputStream.seek and OutputStream.seek do not override rather than extends? A rule of thumb is that requires should Stream.seek—such a definition would essentially create two un- be used when a class is an extension of another class and is itself a related overloads of a method named seek. unit of reuse. If necessary, a concrete class extending the required Let us suppose for a moment that all classes in Fig. 2 have been class (such as ConcreteInputStream) could also be defined to defined, except InputOutputStream. Accordingly, we would re- allow object creation. Note that this concrete class definition would write the seek methods as follows: be trivial, likely containing only a constructor. On the other hand, when a class hierarchy contains multiple disjoint alternatives (such // unrelated methods that share the same name void InputStream.doSeek(long pos) { ... } as in the AST example in the next section), extends should be void OutputStream.doSeek(long pos) { ... } used; the no-diamond property is also a semantic property of the class hierarchy in question. method Stream.seek { Subtyping and subclassing. Since requires provides subtyp- /∗∗ default implementation: do nothing ∗/ ing without subclassing, our design may seem to bear similar- void Stream.seek(long pos) { ... } ity to other work that has also separated these two concepts (e.g. void ConcreteInputStream.seek(long pos) { [23, 38, 14]). There is an important difference, however, regarding InputStream.super.doSeek(); information hiding. In a language that separates subclassing and } subtyping, an interface type (used in a non-Java sense) must neces- void ConcreteOutputStream.seek(int pos) { OutputStream.super.doSeek(); sarily contain only “public” members; otherwise an arbitrary class } would be able to access another class’s private or protected state. } For this reason, the requires relationship establishes a stronger relationship than simply subtyping; for example, a member in (Though these definitions are slightly more verbose than before, Stream may be declared “protected” and may then be accessed by syntactic sugar could be provided.) InputStream. This does not violate information hiding, however, Note that the typechecker does not require that a disambiguat- as we are guaranteed that concrete subclasses of InputStream will ing method be provided for “InputStream && OutputStream”, extend Stream, fulfilling the intent that only extenders have access unlike JPred and Fortress. If a programmer later defines to protected state. For a more detailed discussion, please see our InputOutputStream, but does not re-define seek, the default im- companion technical report [25]. plementation of Stream.seek will be inherited. An external or in- ternal method for InputOutputStream can then be implemented, 5. Example: Abstract Syntax Trees perhaps one that calls OutputStream.doSeek(). Here, it is of key importance that subclass diamonds are disal- Consider a simple class hierarchy for manipulating abstract syntax lowed; because they cannot occur, external methods can be easily trees, such as the one in Fig. 3. The original hierarchy is the one ASTNode DebugNode # parent : ASTNode # location : SourceRef requires + eval() : ASTNode + eval() : ASTNode DebugNode.super.eval()

Num Plus Var requires DebugVar DebugPlus DebugNum + eval() : ASTNode + eval() : ASTNode + eval() : ASTNode + eval() : ASTNode + eval() : ASTNode + eval() : ASTNode + getLeft() : ASTNode + getRight : ASTNode EnhancedDebugVar # varDef : SourceRef + eval() : ASTNode

Figure 3. The AST node example in CZ. Abstract classes and abstract methods are set in italic.

<> IASTNode + getParent() : IASTNode + setParent(IASTNode) + eval() : ASTNode

DebugNode ASTNode # location : SourceRef return parent : ASTNode + getParent() : IASTNode getWrapped().getParent() + getParent() : IASTNode + setParent(IASTNode) + setParent(IASTNode) + eval() : IASTNode print(this.toString()); # getWrapped() : IASTNode return getWrapped().eval();

Num Var Plus DebugPlus EnhancedDebugVar DebugNum + eval() : ASTNode + eval() : ASTNode + getLeft() : IASTNode - wrapped : IPlus - wrapped : IVar - wrapped : INum + getRight() : IASTNode + getLeft() : IASTNode # getWrapped() : IVar # getWrapped() : INum + eval() : IASTNode + getRight() : IASTNode # getWrapped() : IPlus

<> <> <> return wrapped.getLeft() INum IVar IPlus + getLeft() : IASTNode + getRight() : IASTNode

Figure 5. The example of Fig. 3 expressed in a Java-like language, resulting in a proliferation of interfaces and boilerplate code. The visibility modifiers ‘+’, ‘-’ and ‘#’ indicate public, private and protected, respectively. Dashed lines represent extends; solid lines represent implements. on the left, which consists of ASTNode, Num, Var, and Plus. An ally has a source location field, DebugNode.location. Debugging ASTNode contains a reference pointing to its parent node, as indi- support, on the right side of the figure, is essentially a new dimen- cated in the figure. Each of the concrete subclasses of ASTNode im- sion of AST nodes, which we express using requires. (For the plement its own version of the abstract ASTNode.eval() method. moment, suppose that EnhancedDebugVar inherits directly from Suppose that after we have defined these classes, we wish to DebugNode and ignore DebugVar. We will come back to this when add a new method that operates over the AST. For instance, we comparing to mixins.) Now, classes like DebugPlus can multiply may want to check that variables are declared before they are used inherit from ASTNode and DebugNode without creating a subclass- (assuming a variable declaration statement). Since CZ supports ing diamond. In particular, DebugPlus does not inherit two copies external methods, a method defCheck() could be added externally of the parent field, because DebugNode does not inherit from (i.e, as follows: is not a subclass of) ASTNode. Thus, the no-diamond property al- lows fields and multiple inheritance to co-exist gracefully. method ASTNode.defCheck { // external method In this example, each of these classes has a method eval() void ASTNode.defCheck() { ... } void Var.defCheck() { ... } which evaluates that node of the AST, as in the code in void Plus.defCheck() { ... } Fig. 4. Suppose we intend DebugNode to act as a generic wrap- void Num.defCheck() { ... } per class for each of the subclasses of ASTNode. This can } be implemented by using a dynamically resolved super call of the form ASTNode.super.eval() after performing the debug- (We could also use the Visitor pattern, but the need for this must specific functionality (in this case, printing the node’s string rep- be anticipated, and the double dispatch code it requires is tedious resentation). The prefix ASTNode.super means “find the parent and error-prone [15].)Note that the programmer would only have class of the dynamic class of this along the ASTNode path.” At to define cases for Num, Var and Plus; she need not specify what runtime, when eval() is called on an instance of DebugPlus, method should be called when an object has a combination of these the chain of calls proceeds as follows: DebugPlus.eval() → types—such a situation cannot occur (as there are no diamonds). DebugNode.eval() → Plus.eval(). If the dynamically re- Now, suppose we wish to add debugging support to our AST, after the original hierarchy is defined. Each node now addition- solved super call behaved as an ordinary super call, it would fail, because DebugNode has no superclass. 6. CZ Design Each of the DebugNode subclasses implements its own eval() In this section, we give informal details of the typechecking rules in method that calls DebugNode.eval() with an ordinary super CZ, and provide an intuition as to why typechecking is modular. In call. (This could be omitted if the language linearized method Sect. 7 we formalize CZ and provide a detailed argument showing overriding based on the order of inheritance declarations, such as in its modularity. Scala traits.) Dynamic super calls are a generalization of ordinary super calls, when the qualifier class is a required class. 6.1 Multiple Inheritance The properties of classes and internal methods in CZ are the fol- Discussion. The examples illustrate that subtyping allows substi- lowing: tutability; subclassing, in addition to providing inheritance, defines semantic alternatives that may not overlap (such as Num, Var and C1. If a class C extends D1 and D2 then there must not exist some Plus in the example above). Because they do not overlap, we can E, other than Object, such that both D1 and D2 are subclasses of safely perform an unambiguous case analysis on them—that is, ex- E (the no-diamond property). ternal dispatch. In other words, external dispatch in our system is C2. Each method name has a unique point of introduction. That is, analogous to case-analyzing datatypes in functional programming. in the calculus, two classes only share a method name if it exists in Using Single Inheritance. This example would be more difficult a common superclass or common required class. to express in a language with single inheritance. One straightfor- C3. If a class C extends D1 and D2 and method m is defined on ward design in a Java-like language is presented in Fig. 5. Multiple both D1 and D2 (internally or externally), then C must also define inheritance is simulated using interfaces for subtyping, and compo- m. sition for dispatch. For instance, calls to DebugPlus.getLeft() are delegated to the wrapped IPlus object. The template method We have already described the reason for the no-diamond prop- C1 design pattern is used by DebugNode to implement eval() (sub- erty, condition . We make a special case for the class Object— classes override getWrapped()). the root of the inheritance hierarchy, since every class automatically Note the addition of 4 new interfaces and the boilerplate code extends it. (Otherwise, a class could never extend two unrelated needed to implement getters, setters and delegation. The problem classes—the existence of Object would create a diamond.) Note would be even worse if another dimension of behavior were to be that this does not result in the object initialization problem, because added. Furthermore, the design has the problem that the getters and Object has only a no-argument constructor. Also, this condition setters have to be public, since they are defined in an interface. For does not preclude a class from inheriting from two concrete classes instance, the “parent” field in ASTNode is effectively fully visible, if this does not form a diamond. C2 adversely affecting information hiding. Additionally, one would Condition is imposed so that one kind of ambiguity can be have to implement the visitor design pattern (not shown) to allow checked locally. It prevents a name clash if two methods (inter- external traversal of the AST. nal or external) in unrelated classes A and B coincidentally have the same name and a third class inherits from both A and B.7 The Using Mixins. This example would be difficult to express us- condition can be easily implemented in the compiler by appending ing mixins. Aside from the limitation that a total ordering must the class name to a method name at the point in which it is intro- be specified during mixin composition [17], other issues arise. duced (i.e., method m first introduced in class A becomes m_A). Suppose that DebugVar.eval() prints the variable name, while For example, if the classes Circle and Cowboy both have a method EnhancedDebugVar.eval() prints the variable name and the lo- draw, in the calculus the methods would be named draw_Circle cation in the source program where it is defined. Since DebugVar and draw_Cowboy. Of course, an implementation of the language is intended to be reused, it requires rather than extends Var. would have to provide a syntactic way for disambiguating meth- To translate this example, both DebugNode and DebugVar must ods that accidentally have the same name; this could be achieved become mixins (which we will prefix with M). Since mixins can- through rename directives (e.g., Eiffel [27]) or by using qualified not inherit from one another, MDebugVar would not be able to names (e.g., C# interfaces and C++). express an explicit relationship with MDebugNode, but would in- Note that if C requires B and it defines an internal method m, stead have to declare location and eval() as required mem- then C.m overrides B.m and is considered part of the same method bers (e.g., Jam, Strongtalk [4, 5]). This, in turn, leads to two family (and therefore has the same name). problems. First, MDebugVar cannot be treated as a subtype of Condition C3 ensures that diamond subtyping does not lead MDebugNode, greatly reducing expressiveness. Second (and more to problems. If two classes D1 and D2 have a common method significantly), supposing that external methods were to be inte- m, then m must be contained in some common required class (as grated with mixins, it would be impossible to write an external otherwise the two m’s would have different qualified names). Since method for MDebugNode and override it for MDebugVar—there m is contained in two superclasses of C, this is an ambiguity that is no relationship between the two mixins. That is, we may wish must be resolved by C. to write methods MDebugNode. f and MDebugVar. f (its override), and have EnhancedDebugVar inherit this latter definition. Instead, 6.2 External Methods the definition of f must be pushed down to EnhancedDebugVar, CZ includes external methods; methods can be added to a class which creates problems for code reuse. In particular, suppose that outside of its definition. Such methods may be overridden by method f and EnhancedDebugVar are independent extensions that other methods, either internal or external. Typechecking an exter- have no knowledge of each other. In a mixin world, external method nal method has two components: exhaustiveness checking (i.e., the definitions cannot be truly modular extensions. provided cases provide full coverage of the dispatch hierarchy) and The heart of the problem is that mixins are defined in isolation— ambiguity checking (i.e., when executing a given method call, only though they can be composed, they cannot be subclasses (or even one method is applicable). As previously mentioned, asymmetric subtypes) of one another. Our solution could be viewed as similar to mixins, with the addition of subtyping and design intent (through 7 Incidentally, this is not the convention used in Java interfaces, but is that requires) and (no-diamond) multiple inheritance. of C#. multimethods can be encoded with external methods; accordingly, 6.3 Dynamically-Dispatched Super Calls the same typechecking issues apply to both. In CZ’s formal system, exhaustiveness of external methods is As illustrated in Sect. 5, CZ includes dynamically-dispatched ensured because there are no abstract methods; if the language super calls. When A requires B (i.e., A is acting as a mixin included abstract methods, external method definitions would not extension of B), then within A, a call of the form B.super is dy- be be permitted to be abstract. This, and as the restrictions below, namically resolved, similar to super calls in traits. Other super calls are enforced in Millstein and Chambers’s “System M” variant of (i.e., those where the qualifier is a parent class) have the same se- the Dubious calculus [30], and in later extensions such as MultiJava mantics as that of Java. [15, 16] and EML [29]. Of these, only System M includes multiple 6.4 Discussion inheritance. To allow modular ambiguity checking, CZ methods must obey Extensions. It would be possible to combine our solution with the following rules: existing techniques for dealing with the object initialization and modular multiple dispatch problems. A programmer could specify E1. All external method definitions of a method m must appear in that a class C, whose constructor takes no arguments, may be the the method block where the method family m is introduced (using root of a diamond hierarchy. Then, we would use the Scala solution the method declaration). for ensuring that C’s constructor is called only once. To solve the E2. When an external method family m is introduced, it must multiple dispatch problem, if C is the owner of a method family declare an owner class C: this specifies that the method family is m, the typechecker would ensure that m contained disambiguating rooted at C. C must be a proper subtype of Object, the root of the definitions for the case of a diamond—the JPred and Fortress solu- inheritance hierarchy. An external method definition m for class D tions. is valid only if D is a subclass of C. We could also generalize dynamically-dispatched super calls so that they are chained, as in Scala [33]. In Scala, a super call E3. An external method must not override an internal one (though in a trait is dispatched to the next type in the linearization. In an internal method may override an external one). this way, traits can call sibling methods, which is a very powerful While all three conditions are the same as those in System M, composition mechanism. that language did not allow multiple inheritance across module Encapsulation and the diamond problem. As noted by Snyder, boundaries. In CZ we can remove this restriction by ensuring that there are two possible ways to view inheritance: as an internal diamond inheritance does not occur—condition C1. (Note that in design decision chosen for convenience, or as a public declaration CZ, each class and each top-level method declaration is in its own that a subclass is specializing its superclass, thereby adhering to its module.) semantics [37]. Condition E1 is necessary because otherwise there could be two Though Snyder believes that it can be useful to use inheritance external method definitions m defined for the same class C, leading without it being part of the external interface of a class, we ar- to an ambiguity. gue that the second definition of inheritance is more appropriate. Condition E2 ensures that diamonds with Object at the top In fact, if inheritance is being used merely out of convenience (e.g., (permitted by condition C1) do not cause an ambiguity. Concretely, Vector extending Stack in the Java standard library), then it is consider the following method definition: very likely that composition is a more appropriate design [9]. For method Object.g { // illegal CZ definition−owner cannot be Object similar reasons, we do not believe a language should allow inher- void Stream.g() { ... } itance without subtyping—e.g., C++ private inheritance—as this void Foo.g() { ... } can always be implemented using a helper class whose visibility is } restricted using the language’s module system. class Bar extends Stream, Foo { ... } // problem! two versions of g()! Nevertheless, if one takes the view that inheritance choices should not be visible to subclasses, a form of the diamond prob- If this were valid code, there would exist a method definition g() lem can arise in CZ. In particular, suppose class D extends B and for each of Stream and Foo. In this case, Bar would inherit two C, C extends A, and B extends Object—a valid hierarchy (recall equally valid definitions of g(). For typechecking to be modular, that condition C1 makes a special exception for diamonds involv- when checking Bar, we should not have to check all definitions of ing Object). Now suppose that B is changed to extend A, and the external methods, including g(). Note that not specifying an owner maintainer of B is unaware that class D exists. Now A, B and C class has the same effect as using Object as an owner. typecheck, but D does not. Thus, the use of inheritance can invali- Additionally, condition E2 ensures that diamond subtyping (as date subclasses, which violates Snyder’s view of encapsulation. opposed to subclassing) does not result in a class inheriting the This situation highlights the fact that, in general, requires same external method through more than one path. If overriding should be favored over extends if a class is intended to be reused. were permitted based on subtyping, the problem described with This principle is in accordance with the design of classes in Sather diamond inheritance (Sect. 2) would re-appear. [38], traits in Scala and Fortress [32, 2, 3], and the advice that “non- The owner class is also important for implementing condition leaf” classes in C++ be abstract [28]. In Sather, for example, only C2 from the previous section—the owner class name must be part abstract classes may have descendants; concrete classes form the of the method name used by the internal language. (A related leaves of the inheritance hierarchy [38]. issue, defining two external methods with the same name m, can be resolved by using a naming convention for compilation units.) Condition E3 is required to avoid a situation where an external 7. Formal System method m is defined on class C and C also defines an internal In this section, we describe the formalization of CZ, which is based 8 method m, causing an ambiguity. on Featherweight Java (FJ) [24]. We use the same conventions as FJ; D is shorthand for the (possibly empty) list D1,..., Dn, which 8 This restriction unfortunately prevents a class C from declaring an abstract method m for the purpose of allowing clients to “plug in” different versions is a solution to this, however: one can use structural subtyping instead of of m (in different namespaces) into the context where m is called. There abstract methods to define an interface for C.m [26]. may be indexed by Di. We use the same metavariables as FJ, with (which will always be an external method, due to the @internalDef the addition that m and n range over internal and external method premise) has the same type as the current method. names, respectively; M and N range over internal and external The T-CLASS rule (Fig. 10) checks class definitions. Premise method declarations, respectively. (4) ensures that requires is propagated down each level of the The grammar of CZ is presented in Fig. 6. Modifications to inheritance hierarchy; the extending class must either extend or FJ are highlighted. Class declarations may extend or require a require its parents’ required classes. Premise (5) ensures that list of classes. There is also a new type of declaration: top-level requires is copied at each level of the hierarchy. Premise (6) methods. The declaration method C.m{N} introduces an external specifies that a subclassing diamond cannot occur, except for the method family with the owner class C. (Owner classes were de- case of Object. Finally, premises (7) and (8) enforce condition scribed in Sect. 6.2.) The syntax requires that each external method C3, ensuring that subtyping diamonds do not cause problems. The be defined within the method block; this effectively enforces con- external(m, Di) judgement returns the first superclass of Di for dition E1 of Sect. 6.2. which an external method m is defined. Note that at most one of Aside from virtual super calls, and the removal of casts (they are C’s superclasses can have an external method m, as otherwise a orthogonal to our goals), CZ expression forms are identical to those diamond would occur. of FJ. For simplicity, we have not modeled ordinary super calls in The rule T-TOP-METHOD requires that the method owner not be our calculus, as this has been considered by others (e.g., [21, 31]) Object, as required by condition E2. and is orthogonal to the issues we are considering. Therefore, the Dynamic Semantics. The evaluation rules are presented in class qualifier of a super call must be a required class. Fig. 12. Most of the rules are similar to FJ, with the notable ex- We have added a new subtype judgement (Fig. 7), denoted by ception of E-SUPER-INVK. This rule uses the auxiliary judgement ‘<:’, which handles the requires relationship. Subclassing (‘’) super(C, D), which finds the immediate superclass of the class C implies subtyping, and if class A requires B then A <: B, but along the path D. Then, mbody is called on the result of the super A  B. In CZ, the requires relation is not transitive; subclasses call. must either require or extend the required class. This is enforced by The mbody judgement (Fig. 11) mirrors mtype with two differ- the typechecking rules. ences: there is no requires rule, and there must be a unique super- The auxiliary judgements for typechecking and evaluation ap- class that has a particular method body. The type safety theorems pear after the typechecking and evaluation rules, in Fig. 11. We show that there is a correspondence between these two judgements, will describe each of these when describing the rules that use them. based on the class and method typechecking rules. The remaining new rules are straightforward congruence rules.

Static Semantics. The rules for typechecking expressions are in 7.1 Modularity Fig. 8. The rule for method invocations, T-INVK, is the same as that Here, we describe the conditions under which a class-based system in FJ. However, the auxiliary judgement it uses, mtype, is different. with external methods is modular when there is no explicit module The CZ judgement mtype (Fig. 11) has two additional rules as system. We argue informally that typechecking in CZ is modular compared to FJ: one for external method definitions, and one for based on the structure of the typechecking rules. methods received from a required class. The judgement first looks Conditions for modular typechecking. for the method m in the class itself, if it is not there, then it looks for an external method family with that name. If neither of those 1. Checking a class definition C with methods M should only re- two cases applies, superclasses are recursively searched; otherwise quire examining: (a) signatures of methods transitively overrid- den in , (b) signatures of methods transitively overridden by required classes are searched. In the last rule, @mtype(m, Dk) is M C’s inherited methods, (c) class declarations of C’s supertypes, shorthand for @B → B. mtype(m, Dk) = B → B. Note that in our formalism, as in FJ, all definitions (including and (d) signatures of methods called by M. external methods) have global scope. In a real implementation, 2. Checking the definition of a particular external method C.m there would be specific import statements for external methods that should only require examining: (a) the declarations of C and its would control their visibility. supertypes, (b) the signature of the external method that C.m The rule T-SUPER-INVK checks the virtual super call described overrides, and (c) the signatures of methods that C.m calls. In in Sect. 6. Essentially, for a call of the form this.B.super.m(e), particular, the typechecker may not search for subclasses of C. where this : C0, instead of looking up mtype(m, C0), we look up We show that typechecking in CZ obeys these rules. We must mtype(m, B), where B is the a required class of C0. consider all direct or indirect uses of mtype, because that judgement The rule T-NEW has one additional premise as compared to examines both internal and external methods. This includes uses FJ: the requires clause must be empty. This ensures that the of override, which calls mtype. We must also examine uses of class is concrete and can be instantiated, which in turn ensures the the auxiliary judgement external. Uses of internalDef(m, C) are soundness of the subtyping relation induced by requires. permitted, as this judgement finds the signature of m in C or its Rules for typechecking methods are displayed in Fig. 9. The supertypes. This is permitted under modularity condition 2(a). rule T-METHOD checks internal methods, and uses the override Note that since our system does not have explicit modules, auxiliary judgement, which is the same as that of FJ. In this rule, external and internal method names must contain the “module” we check that method m is a valid override of the same method in they are defined in. This convention can be used to simulate module all superclasses and required classes. import statements. Typechecking external methods is a bit more involved than In the rule for checking a class definition, T-CLASS, all premises checking internal ones. The first three premises of the rule but (3) and (8) are modular by inspection. Premise (3) uses the T-EXT-METHOD are the same of those in T-METHOD. Premise (4) T-METHOD rule, which we will consider shortly, and premise (8) ensures that there is no internal definition of m in C, enforcing checks external methods overridden by some Di. We observe that condition E3 of Sect. 6.2. Premise (5), C  B, ensures that the if there is a method m is defined or inherited by Di, we are per- class C on which the external method is defined is a subclass mitted to examine signatures of methods (internal or external) that of the method family’s owner class B, as required by condition it overrides. The external judgement searches for such an external E2. Finally, premise (6) ensures that the method being overridden method. As mentioned, in a complete implementation of CZ, exter- Declarations L ::= class C extends C requires C { C f ; K M } | method C.m { N }

Constructors K ::= C(C f ) { this. f = f ; } Methods M ::= C m(C x) { return e; }

External Methods N ::= CC.m(C x){ return e; }

Expressions e ::= x | e. f | e.m(e) | e.C.super.m(e) | new C(e)

Figure 6. CZ grammar

Subclassing C  D

C  DD  E CT(C) = class C extends D1,..., Dn ··· { ... }

C  C C  E C  Di

Subtyping C <: D

C  D C <: DD <: E CT(C) = class C extends D requires E1,..., En { ... }

C <: D C <: E C <: Ei

Figure 7. Subclassing () and subtyping (<:) judgement

Γ ` e : C Γ ` e0 : C0 fields(C0) = C f (T-VAR) (T-FIELD) Γ ` x : Γ(x) Γ ` e0. fi : Ci

Γ ` e0 : C0 mtype(m, C0) = D → C Γ ` e : C C <: D (T-INVK) Γ ` e0.m(e) : C

Γ ` e0 : C0 class C0 extends D0 requires B, E mtype(m, B) = D → C Γ ` e : C C <: D (T-SUPER-INVK) Γ ` e0.super.m(e) : C

fields(C) = D f Γ ` e : C C <: D class C requires • (T-NEW) Γ ` new C(e) : C Figure 8. Expression typing nal methods would be explicitly imported; this judgement would Checking external methods is also modular. In the rule only examine those methods imported by supertypes. T-EXT-METHOD, premise (1) checks the method body as with The T-METHOD rule has three premises for which we need to T-METHOD; the same reasoning as above applies here. We demonstrate modularity: typechecking e0 (premise 1) and the two should therefore consider premise (6). We observe that over- override checks (premises 4 and 5). However, note that when type- ride(m, D, ··· ) will consider internal or external methods in su- checking e0, uses of any external methods may indeed use T-INVK perclasses of C. Since we have @internalDef, internal methods are or T-SUPER-INVK (both of which use mtype), but this is effectively ruled out. Accordingly, override must be considering the overrid- a case of client-side typechecking, rather than implementation-side den method of the same method we are already checking, due to the typechecking. In other words, this is an instance of case 1(c). syntactic restriction that all cases of an external method are defined The two override checks are not problematic, either. In each together (condition E1). Therefore, all the checks are modular; no case, mtype searches for an internal or external method m in the other external methods are being examined. superclasses and required classes of the class C. It does not examine all external methods or examine subclasses of C. Note that since 7.2 Type Safety an internal method m must include a module name, if it overrides an external method m it is implicitly “importing” the external We prove type safety using the standard progress and preservation method’s module. Therefore, typechecking class definitions in CZ theorems, with a slightly stronger progress theorem than that of FJ, is modular. due to the omission of casts. Note that in our system, type safety implies that method calls are always unambiguous, as the mbody M ok in C – — ˜ x : C, this : C ` e0 : E0 E0 <: C0 class C extends D requires E ™ š override(m, D , C → C0) override(m, E , C → C0) (T-METHOD) C0 m(C x){ return e0; } ok in C

N ok in C.m – — ˜ x : C, this : C ` e0 : E0 E0 <: C0 class C extends D requires E ™ 0 0 š › @C . internalDef(m, C) = C C  B override(m, D, C → C0) (T-EXT-METHOD) C0 C.m(C x){ return e0; } ok in B.m

Figure 9. Method typing

Declaration Typing

L ok

– — i∈1..n fields( Di ) = Di gi (i ∈ 1..n) K = C(Di gi, C f ){ this.gi = gi ; this. f = f } ˜ ™ 0 0 0 M ok in C class Di requires E , implies ∃k. Dk  E or ∃k. Ek  E (i ∈ 1..n) š 00 00 00 class Ei requires E , implies ∃k. Dk  E or ∃k. Ek  E (i ∈ 1..n) › 0 0 0 0 ∀i, j ∈ 1..n. i 6= j, implies @D . Di  D and Dj  D (D 6= Object) œ ∀m. ∃i, j. i 6= j. internalDef(m, Di) 6= internalDef(m, Dj), implies m ∈ M  ∀m. ∃i, j. i 6= j. internalDef(m, Di) and external(m, Di) = A and Dj  A, implies m ∈ M (T-CLASS) class C extends D1,..., Dn requires E1,..., En { C f ; K M } ok

C 6= Object N ok in C.m (T-TOP-METHOD) method C.m { N } ok

Figure 10. Class and external method typing

judgement requires that there be a unique applicable method. We Lemma 7.3. If C  D1 and C  D2 and Γ ` new C(e) : C and refer the reader to our companion technical report [25] for the proof m ∈/ C and mbody(m, D1) = x1.e1 and mbody(m, D2) = x2.e2 of type safety; we give a brief outline here. then either D1  D2 or D2  D1. Theorem 7.1 (Preservation). If Γ ` e : C and e 7−→ e0, then Lemma 7.4. If D :: mtype(m, C) and D does not contain the rule 0 0 0 Γ ` e : C for some C <: C. MTYPE4, then mbody(m, C) = x.e, for some x and e. The proof of preservation is relatively straightforward and is similar Lemma 7.2 is needed because it is the rule MTYPE4 that could to the proof of FJ. We make use of an auxiliary lemma (not shown) result in mbody not being defined—it is the only rule that has no that proves that mtype returns a unique value. mbody counterpart. Lemma 7.3 is used to show that mbody has a unique value for a superclass of the concrete class C. We make Theorem 7.2 (Progress). If · ` e : C then either e is a value or 0 0 use of this lemma in the inductive step of the Lemma 7.4, as it there is an e with e 7−→ e . is straightforward to show that mbody is defined, but additional The proof of progress is slightly more complex. The proof requires reasoning is needed to show that its value is unique. the following lemma: With these lemmas, the rest of the proof of progress is straight- forward. Lemma 7.1. If mtype(m, C) = D → D and Γ ` new C(e) : C then mbody(m, C) = x.e0 for some x and e0. 8. Related work However, unlike in FJ, we cannot prove this lemma by induction Here we describe related work that was not previously discussed in on the derivation of mtype, since for the inductive step, we do not Sect. 3. have a derivation Γ ` new Dk(e) : Dk. Instead, we make use of As mentioned in Sect. 3, traits [17, 2] cause problems for in- three auxilliary lemmas: formation hiding—they essentially make it impossible to have pri- Lemma 7.2. If D :: mtype(m, D) = B → B and C <: D and vate or protected “state” that is not accessible by objects that reuse Γ ` new C(e) : C, then there exist D0 and D0 such that C  D0 the trait, as such “state” can only be implemented using accessors. and D0 :: mtype(m, D0) = B → B does not contain the rule Stateful traits [8] also do not help in this regard, as they have been designed for maximal code reuse, rather than information hiding. MTYPE4. In this design, state is hidden by default, but clients can “unhide” fields(C) = C f class C extends D requires E { C f ; K M }

fields(Di) = Bi gi (i ∈ 1..n)

fields(Object) = • fields(C) = Bi gi , C f mtype(m, C) = D → D (MTYPE2) (MTYPE1) class C ··· { C f ; K M } class C ··· { C f ; K M } m ∈/ MCT(m) = method D.m { N } B m(B x) { return e} ∈ M BC.m(B x) { return e} ∈ N mtype(m, C) = B → B mtype(m, C) = B → B

(MTYPE3) (MTYPE4) class C extends D requires E { C f ; K M } m ∈/ M class C extends D requires E { C f ; K M } m ∈/ M CT(m) = method F.m { N } implies C.m ∈/ N CT(m) = method F.m { N } implies C.m ∈/ N ∃k mtype(m D ) = B → B . , k ∀k. @mtype(m, Dk) ∃k. mtype(m, Ek) = B → B mtype(m, C) = B → B mtype(m, C) = B → B

internalDef(m, C) = D

class C extends D requires E { C f ; K M } 0 0 class C ··· { C f ; K M } class C extends D ··· { C f ; K M } m ∈/ M ∀k. @D . internalDef(m, Dk) = D 0 0 B m(B x) { return e} ∈ M m ∈/ M ∃k. internalDef(m, Dk) = Dk ∃k. internalDef(m, Ek) = Ek 0 0 internalDef(m, C) = C internalDef(m, C) = Dk internalDef(m, C) = Ek

external(m, C) = A

class C extends D requires E class C extends D requires E CT(m) = method F.m { N } CT(m) = method F.m { N } CT(m) = method F.m { N } C.m ∈/ N C.m ∈/ N ∀k. @A. external(m, Dk) = A 0 0 C.m ∈ N ∃k. external(m, Dk) = F ∃k. external(m, Ek) = F external(m, C) = F external(m, C) = F0 external(m, C) = F0 mbody(m, C) = x.e class C ··· { C f ; K M } class C ··· { C f ; K M } m ∈/ MCT(m) = method D.m { N } B m(B x) { return e} ∈ M BC.m(B x) { return e} ∈ N mbody(m, C) = x.e mbody(m, C) = x.e

class C extends D requires E { C f ; K M } m ∈/ M

CT(m) = method F.m { N } implies C.m ∈/ N ∃ unique k . mbody(m, Dk ) = x.e mbody(m, C) = x.e

override(m, D, C → C0) super(C, D) = E

mtype(m, D) = D → D implies 0 class C extends EE  D C = D and C0 = D0 super(C, D) = E override(m, D, C → C0) Figure 11. CZ typechecking and evaluation auxiliary judgements e 7−→ e0 fields(C) = τ f mbody(m, C) = x.e0 (E-INVK) (new C(e)). fi 7−→ ei (new C(e)).m(d) 7−→ [d/x, (new C(e))/this] e0

super(C, D) = E mbody(m, E) = x.e0 (E-SUPER-INVK) (new C(e)).D.super.m(d) 7−→ [d/x, (new C(e))/this] e0 Congruence rules 0 0 0 e0 7−→ e0 e0 7−→ e0 e0 7−→ e0 0 0 0 e0. f 7−→ e0. f e0.m(e) 7−→ e0.m(e) e0.C.super.m(e) 7−→ e0.C.super.m(e)

0 0 ei 7−→ ei ei 7−→ ei e 7−→ e0 i i e0.C.super.m(..., ei,... ) 7−→ new C(..., ei,... ) 7−→ 0 0 0 e0.m(..., ei,... ) 7−→ e0.m(..., ei,... ) e0.C.super.m(..., ei,... ) new C(..., ei,... )

Figure 12. Evaluation rules it, and may have to resort to merging variables that are inherited from multiple traits. While this provides a great deal of flexibility 9. Conclusions and Future Work for trait clients, this comes at the cost of information hiding. Also, We have presented a language that solves two major problems as previously mentioned, this design does not address the problem caused by inheritance diamonds: object initialization and external of a correct semantics for object initialization in the presence of method dispatch. We have also shown how programs written with diamonds. traditional multiple inheritance can be converted to programs in As mentioned in Sect. 3, JPred [22] and Fortress [3] perform our language. We note that though diamonds can still cause en- modular multimethod typechecking by requiring that programmers capsulation problems (depending on the definition of encapsula- provide disambiguating methods, some of which may never be tion), this problem can be ameliorated by preferring requires over called. Neither language solves the problem of multiple inheritance extends. with state. We have implemented a CZ typechecker as an extension to On the other hand, we observe that the JPred and Fortress Java (using JastAdd [18]) and plan to investigate how our language dispatch semantics may be more expressive than that of CZ. In CZ, could be used to express existing programs. in the class hierarchy Fig. 2, the abstract class InputStream may not override a Stream method externally (though it may override Acknowledgements it internally), because it is not a subclass of Stream. In contrast, if this hierarchy were expressed in e.g. JPred (using interfaces in The author thanks Jonathan Aldrich for helpful discussions and place of abstract classes), a predicate method defined on Stream Neelakatan Krishnaswami, William Lovas, Steven Matuszek, Gi- could be overridden by either InputStream or OutputStream. lad Bracha, and the anonymous reviewers for feedback on an ear- Note, however, that programmers can achieve a similar effect in lier version of this paper. This research was supported in part by the CZ by having concrete classes call helper methods (which can be U.S. Department of Defense, Army Research Office grant number defined externally) in the abstract classes. DAAD19-02-1-0389 entitled “Perpetually Available and Secure In- Cecil [13, 14] also provides both multiple inheritance and multi- formation Systems,” and Jonathan Aldrich’s NSF CAREER award method dispatch, but it does not include constructors (and therefore CCF-0546550. provides ordinary dispatch semantics for methods acting as con- structors), and it performs whole-program typechecking of multi- References methods. Like JPred, the language Half & Half [6] provides multimethod [1] . Agrawal, L. DeMichiel, and B. Lindsay. Static type checking of dispatch on Java interfaces. In this language, if there exist external multi-methods. In OOPSLA, pages 113–128, 1991. method implementations on two incomparable interfaces A and B, [2] E. Allen, D. Chase, J. Hallett, V. Luchangco, J. Maessen, S. Ryu, the visibility of one of the two interfaces must be module-private. G. Steele, Jr., and S. Tobin-Hochstadt. The Fortress Language Like System M, this effectively disallows multiple (interface) in- Specification, Version 1.0. Available at http://research.sun. heritance across module boundaries. Half & Half does not consider com/projects/plrg/Publications/fortress.1.0.pdf, 2008. the problem of multiple inheritance with state. [3] E. Allen, J. J. Hallett, V. Luchangco, S. Ryu, and G. L. Steele Jr. It is possible to modify the semantics of multimethod dispatch Modular multiple dispatch with multiple inheritance. In SAC ’07, so that by definition ambiguities do not arise in the presence of pages 1117–1121. ACM, 2007. multiple inheritance. A language may linearize the class hierarchy [4] D. Ancona, G. Lagorio, and E. Zucca. Jam - designing a Java [1] or choose the appropriate method based on their textual ordering extension with mixins. ACM Trans. Program. Lang. Syst., 25(5):641– [10]. However, such semantics can be fragile and confusing for 712, 2003. programmers. [5] L. Bak, G. Bracha, S. Grarup, R. Griesemer, D. Griswold, and U. Hölzle. Mixins in Strongtalk. In ECOOP 2002 Workshop on Inheritance, 2002. [6] G. Baumgartner, M. Jansche, and K. Läufer. Half & Half: Multiple dispatch and retroactive abstraction for Java. Technical Report OSU- CISRC-5/01-TR08, Dept. of Computer and Information Science, The [32] M. Odersky. The Scala language specification. Available at Ohio State University, March 2002. http://www.scala-lang.org/docu/files/ScalaReference. [7] A. Bergel. Personal communication, October 2008. pdf, 2007. [8] A. Bergel, S. Ducasse, O. Nierstrasz, and R. Wuyts. Stateful traits [33] M. Odersky and M. Zenger. Scalable Component Abstractions. In OOPSLA ’05 and their formalization. Computer Languages, Systems & Structures, , 2005. 34(2-3):83–108, 2008. [34] M. Sakkinen. Disciplined inheritance. In ECOOP, pages 39–56, 1989. [9] J. Bloch. Effective Java: Guide. Addison- Wesley, 2001. [35] N. Schärli, S. Ducasse, O. Nierstrasz, and A.P. Black. Traits: ECOOP ’03 [10] J. Boyland and G. Castagna. Parasitic methods: An implementation Composable Units of Behaviour. In . Springer, 2003. of multi-methods for Java. In OOPSLA, pages 66–76, 1997. [36] G. Singh. Single versus multiple inheritance in object oriented SIGPLAN OOPS Mess. [11] G. Bracha and W. Cook. Mixin-based inheritance. In ECOOP ’90, programming. , 5(1):34–43, 1994. 1990. [37] A. Snyder. Encapsulation and inheritance in object-oriented OOPSLA [12] B. Carré and J. Geib. The point of view notion for multiple programming languages. In , pages 38–45, 1986. inheritance. In OOPSLA/ECOOP ’90, pages 312–321. ACM, 1990. [38] C. Szyperski, S. Omohundro, and S. Murer. Engineering a [13] C. Chambers. Object-oriented multi-methods in Cecil. In ECOOP programming language: The type and class system of Sather. Programming Languages and System ’92, 1992. In Jürg Gutknecht, editor, Architectures, volume 782 of Lecture Notes in Computer Science. [14] C. Chambers and the Cecil Group. The Cecil language: specifi- Springer, 1993. cation and rationale, Version 3.2. Available at http://www.cs. [39] G. Washburn. Personal communication, December 2008. washington.edu/research/projects/cecil/, 2004. [15] C. Clifton, G. T. Leavens, C. Chambers, and T. Millstein. MultiJava: modular open classes and symmetric multiple dispatch for Java. In OOPSLA ’00, pages 130–145, 2000. [16] C. Clifton, T. Millstein, G. T. Leavens, and C. Chambers. MultiJava: Design rationale, compiler implementation, and applications. ACM Trans. Program. Lang. Syst., 28(3):517–575, 2006. [17] S. Ducasse, O. Nierstrasz, N. Schärli, R. Wuyts, and A.P. Black. Traits: A mechanism for fine-grained reuse. ACM Trans. Program. Lang. Syst., 28(2):331–388, 2006. [18] Torbjörn Ekman and Görel Hedin. JastAdd. http://www.jastadd. org, 2008. [19] M. Ellis and B. Stroustrup. The Annotated C++ Reference Manual. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1990. [20] M. Flatt, R.B. Findler, and M. Felleisen. Scheme with classes, mixins, and traits. In APLAS, pages 270–289. Springer, 2006. [21] M. Flatt, S. Krishnamurthi, and M. Felleisen. Classes and mixins. In POPL ’98, 1998. [22] C. Frost and T. Millstein. Modularly typesafe interface dispatch in JPred. In FOOL/WOOD’06, January 2006. [23] N. C. Hutchinson. EMERALD: An object-based language for distributed programming. PhD thesis, University of Washington, Seattle, WA, USA, 1987. [24] A. Igarashi, B. Pierce, and P. Wadler. Featherwieght Java: a Minimal Core Calculus for Java and GJ. In OOPSLA ’99, November 1999. [25] D. Malayeri and J. Aldrich. CZ: Multiple inheritance without diamonds. Technical Report CMU-CS-08-169, School of Computer Science, Carnegie Mellon University, Dec. 2008. [26] D. Malayeri and J. Aldrich. Integrating nominal and structural subtyping. In ECOOP 2008, July 2008. [27] B. Meyer. Object-Oriented Software Construction, 2nd Edition. Prentice-Hall, 1997. [28] S. Meyers. Effective C++: 50 specific ways to improve your programs and designs. Addison Wesley Longman Publishing Co., Inc. Redwood City, CA, USA, 1992. [29] T. Millstein, C. Bleckner, and C. Chambers. Modular typechecking for hierarchically extensible datatypes and functions. ACM Trans. Program. Lang. Syst., 26(5):836–889, 2004. [30] T. Millstein and C. Chambers. Modular statically typed multimethods. Inf. Comput., 175(1):76–118, 2002. [31] N. Nystrom, S. Chong, and A. Myers. Scalable extensibility via nested inheritance. In OOPSLA ’04, pages 99–115, 2004.