What Shall We Tell the Children (About Inheritance)?
Andrew P. Black Portland State University [email protected]
Abstract Thus wrote Cook and Palsberg, in their 1989 OOPSLA pa- per “A Denotational Semantics of Inheritance and its Cor- Since the groundbreaking work of Kamin, Reddy, and par- rectness” [7]. That paper goes on to tell a story about the ticularly Cook in the late 1980s, there has been broad agree- meaning of inheritance that has become broadly accepted as ment that the meaning of inheritance in object-oriented pro- the standard semantics. Working independently, Kamin [12] gramming languages can be best explained using generator and Reddy [15] had previously published their own seman- functions and their fixpoints. Consequently, it is a little sur- tics, which are broadly similar. (Kamin’s is less composi- prising to realise that no current mainstream programming tional because it involves taking fix points over the whole language actually explains inheritance to its users in this program, rather than individual class generators.) way. Instead, most languages make up a “story” that pur- This paper does not contain any new results. It’s goal is ports to explain inheritance, but that on closer inspection to explain to you, the reader, as clearly as possible what you contains serious flaws. It is as if, being asked to explain the probably already know: how inheritance works. We look at facts of life to our children, we are so embarrassed by the operational and denotational definitions of inheritance, and truth that we make up a story about storks, knowing even notice where they differ. We consider the complexity that is as we do so that it defies the laws not only of biology but unexpectedly introduced by an apparently simple concept: also of physics. This paper explores both the truth and the immutable objects. Finally, we consider the question in the fictions about how objects are brought into the world. My title: what shall we tell the children about inheritance. Is the hope is that future programming languages can tell the chil- typical object-oriented programmer ready to hear the truth dren, if not the whole truth, then at least a partial truth that about the facts of life? Or should we continue to make up is consistent with the laws of mathematics. stories full of fictions and half-truths? I am interested in these issues because I am involved in Categories and Subject Descriptors D.3.3 [Programming designing a new programming language called Grace [5], Languages]: classes and objects; inheritance which aims to be an “as simple as possible” language for teaching object-oriented programming to computer scien- General Terms Languages, Design tists. Because the target audience comprises novice program- mers, we want to introduce no more concepts than necessary, Keywords inheritance semantics, object initialisation, im- but because they will, we hope, become computer scientists, mutable objects we want those concepts to conform to the laws of logic and mathematics, and not be things that will have to be “un- learned” later. My point of view is that it’s OK for novices 1. Introduction computer scientists not to be told the whole truth; it’s not OK Inheritance is one of the central concepts in object- to tell them outright falsehoods. I agree with the late John oriented programming. Despite its importance, there Reynolds in believing that for this we should be convicted seems to be a lack of consensus on the proper way to of moral turpitude and have our tenure revoked [16]. describe inheritance. 2. How Objects Work Many languages don’t allow objects to exist unless they Permission to make digital or hard copies of all or part of this work for personal or have been generated by a class. This “classes first” approach classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation constrains the teaching sequence, and makes it harder to on the first page. To copy otherwise, to republish, to post on servers or to redistribute introduce objects to students. to lists, requires prior specific permission and/or a fee. MASPEGHI 2013 Montpellier, France Grace follows a few other languages, starting with Emer- Copyright © 2013 ACM 978-1-4503-2046-7/13/07. . . $15.00 ald [4], and including Reddy’s ObjectTalk calculus [15], OCaml [13] and Scala [19], in providing stand-alone ob- Square brackets are used to build a new environment by jects that are independent of any class. A Grace object is adding a list of bindings to an existing environment; ρ0 is very simple: it is a mapping from method names (seman- the empty environment. Methods are functions that take a tically, elements of an arbitrary index set) to method bod- list of arguments ~a, and return the meaning of the expression ies. A method body is a “parameterised behaviour”, that is, in the method body. (This will usually be a function from a parameterised expression whose evaluation can have ef- a store to a pair consisting of a store and a value, but for fects. The act of “sending a message” to an object, which in our purposes, it doesn’t really matter.) Here, the method Grace we call “requesting a method” of the object, involves body meaning is determined in an environment in which the parameterising the method using the arguments provided in method’s parameters are bound to its arguments, and self is the method request, and evaluating the method body. bound to the parameter to the generator function. Here is a point object in Grace syntax. It’s fine to have two explanations of something, so long as def point34= object { they correspond. As Cook and Palsberg [7] showed, the op- method x{3} erational and denotational definitions do mostly correspond, method y{4} but notice one place in which they differ. The operational method magnitude{ definition binds self when a method is requested, whereas the ( self.x^2 + self.y^2 ).sqrt} denotational definition binds self when the object is created. method isSmaller(p) { Specifically, this means that there is no way that the denota- self.magnitude
1 method checker(number){ 6.4 Grace 2 object { Two of our goals for Grace are supporting immutable ob- 3 def oneTwoList= aList.generatedby{ jects, and making object construction much simpler than in 4 lst −> aList.with(1,2) ++ lst} 5 def evenCache= self.computeIsEven Smalltalk or Java. Immutable objects — not just as a usage 6 method computeIsEven is confidential{ pattern (variable objects that happen not to change) but as 7 oneTwoList.at(number) == 2} a concept recognised by the implementation — are impor- 8 method isEven{evenCache} tant both to support a functional style of object-oriented pro- 9 print "This isa silly example" gramming, and to enable efficient implementation in dis- 10 manager.register(self) tributed memory computers. Another compelling reason to 11 } 12 } 1 static final variables are initialised using a static initialiser. In this example, the method checker() answers an object 7. Inheriting Initialisation (defined on lines 2–11) whose only public2 method, isEven, allows clients to ask if number is even. The confidential Inheritance and initialisation interact in sometimes-surprising method computeIsEven performs an expensive calculation: it ways. As Gil and Shragai [9] discuss, the very feature determines if number is even by using it to index an infinite that gives inheritance its power — being able to override list 1,2,1,2,... and checking if the extracted value is 2. inherited methods — means that inherited code can never be sure what a self-request will do. The initial design of This example shows that the value of a constant such as Grace avoided these problems by inheriting from already evenCache may be defined in terms of a method request, and initialised objects. Let’s look at the canonical example: that the requested method may refer to other constants. What def colourPointFactory= object { is the operational semantics of such an object constructor? In method x(xcoord)y(ycoord)colour(c){ particular, what object does self denote on line 5? It should object { be the object under construction, which therefore needs a inherits pointFactory.x(xcoord)y(ycoord) computeIsEven method, which in turn needs the definition of method colour{c} oneTwoList. Note that Grace also allows statements at the } top-level of the object, such as on lines 9 and 10; this code } will be executed for its effect during object construction. } Allowing such code is convenient, but doesn’t actually add The original semantics was to copy the methods and any extra power: arbitrary effectful code can in any case be fields of the object specified in the inherits clause (the super- part of an expression used to initialise a field. object) into the object under construction (the sub-object). These considerations led us to a “two-phase” semantics This semantics has the property that the super-object was for object construction. In the first phase, the skeleton of the fully created before the sub-object even began to exist, so object is built: space is allocated for constants, variables, and there is no possibility of overriding methods that may be a dictionary of methods. The methods are closures, which involved in its initialisation — essentially Gil and Shragai’s may bind names defined in the object constructor (such as monomorphism restriction. This is both good and bad: it oneTwoList), or in the surrounding scope (such as number). preserves safety, but makes it harder for the sub-object to It’s OK for the methods to reference as-yet-undefined con- change the behaviour of the super object in ways that may stants: all will be well so long as the constants are defined be- be both safe and useful. fore the method is requested. At the end of this phase, self has The “object copy” semantics had two effects that sur- come into existence, but its fields have not yet been defined. prised us. The first is that any object is now copyable, even if In the second phase, the code inside the object constructor it has no copy method: object {inherits x} returns a copy of (but outside of the method bodies) is executed: in our ex- x. It also means that if the creation code of the super-object ample, this includes the expressions that give values to the is immodest [9], for example, by exposing a reference to self fields oneTwoList and evenCache, and the print and register to some other object, then that reference will be to the super- requests. We would like to say that both phases happen at object and not to the sub-object. object creation time, and indeed, from the point of view of Because of these issues, we changed the semantics of a client, the object does spring into existence atomically. inherits to eliminate the implicit copy and instead require the However, if cross-examined in a court of law, we would programmer to supply a fresh object in the inherits clause. have to be very careful. If the definition of oneTwoList were This fresh object is then mutated into the sub-object. This moved after the definition of evenCache, then the request of changed the symptom of immodesty: now, if the super-object computeIsEven would access an uninitialised constant, which is immodest, other objects can observe the mutation. could not happen if constant declaration were truly atomic. Similarly, if manager.register(self) were moved to line 3, then To try to conceal the mutation, we delayed the execution manager would be able to request isEven, which might fail if of code that might expose self as far as possible. We did this evenCache were not yet defined. by extending the “2-phase initialisation” rule, introduced in Section 6.4, to cover inheritance too. Informally, the idea is Evaluating this design honestly, we are forced to admit that, in the first phase, the “shell” of the sub-object, includ- that it is excessively complex. Perhaps more importantly, it ing its fields and methods, is constructed first, and is given a does not meet our goal: under certain circumstances, objects new object identifier id. Then the shell of each of the super creation can be visibly non-atomic, and the timing of exter- objects up the inheritance chain is concatenated onto the first nal requests can determine whether or not they fail by ac- shell; the whole composite object keeps the identifier id. In cessing an uninitialised variable. the second phase, all of the code involved in object construc- tion is executed, starting with the ultimate super-object and 2 In Grace, methods are public by default. The method computeIsEven is explicitly annotated as confidential, which means that it can be requested working down to the sub-object. During this process, self is only of self. always id, which means both that any references exposed externally are to the composite object, and that code used to Here, the factory method newGame declares a local variable calculate initial values can be overridden by sub-objects. size, and then returns a new object, whose methods close The disadvantages of this proposal include all of the dis- over size. The effect is exactly the same as if size were advantages of the original 2-phase construction rule. The declared inside the object. rule is baroque; it’s hard to imagine explaining this to a class- If the effect is the same, how does moving the variables room full of novice programmers without having to apolo- outside of the object help? Because the object’s initialisation gise. Mutation is still visible: although the shell of the object expressions are also moved out of the object, so the object is complete before self is exposed, not all of the fields are constructor is guaranteed to be free of side-effects. Let’s re- initialised, and other objects can consequently see constants write the example from Section 6.4 in this style: change from uninitialised to initialised. These disadvantages method checker(number){ are compounded by the loss of referential transparency: an def result= object { object expression in an inherits clause has a meaning differ- method computeIsEven is confidential{ ent from the same expression in other contexts. Moreover, oneTwoList.at(number) == 2} method isEven{evenCache} the “freshness” requirement means that in practice the super- } object expression is almost certainly a request on a factory def oneTwoList= aList.generatedby{ method, which effectively means that classes are required lst −> aList.with(1,2) ++ lst} for inheritance. def evenCache= result.computeIsEven print "This isa silly example" 8. What shall we tell the children? manager.register(result) result Is there a simple and consistent story for inheritance and ob- } ject initialisation? I believe that there is, in the presence of Notice that we have moved all of the defs, and all of the what may seem to be draconian language restrictions: elim- top-level statements, out of the object constructor and into inate from object constructors both initialised declarations the factory method checker. The methods in the object result and top-level statements! These restrictions mean that no still have access to evenCache and oneTwoList. This code user-written code is executed during object construction, self has similar “2-phase” semantics to the original version — cannot escape, and creation code cannot be incorrectly over- first build the object exporting the method isEven, and then ridden, thus observing the “hardhat” restrictions [9]. Its ac- initialise the constants oneTwoList and evenCache. However, tually fine to allow variable declarations without initialisers this semantics is now explicit in the ordering of the exe- inside an object constructor, but this adds no power, so for cutable code in the factory method. simplicity lets imagine that we eliminate all declarations. In addition to forcing the evaluation of initialisation ex- It may seem that in imposing these restrictions, we have pressions to be completed before an object is constructed, severely impaired our programming language, but that is ac- these restrictions have at least three negative consequences. tually is not so. This is because Grace supports nesting and First, it seems likely that having declaration and use of lexical scope. Indeed, definitions and variables declared in- variables close together aids in code comprehension, so a side an object constructor are not really instance constants method body using temporaries and parameters is easier to and instance variables in the conventional sense. They are understand than a method body using instance variables, and just local variables of the constructor body; they will con- a method body using instance variables is easier to under- tinue to exist after the body has completed execution if and stand than a method body using variables from the surround- only if they are referenced by the returned object, that is, if ing scope. Second, methods that exist solely for the purpose they are captured by the body of one or more of the object’s of initialising constants must be moved outside of the object methods. being constructed, and into the enclosing object, where they To see this, lets suppose that two methods in an object can be accessed from the initialisation expression. This may representing a game wish to share a variable; for the sake of be considered good or bad, depending on your point of view. However, a method that is both used to initialise a constant example, we will call them setBoardSize and boardSize. All that is required is for this variable to be declared in some and is exposed to clients poses a problem: it must either be scope surrounding the methods: duplicated (clearly bad), or the client version (inside the ob- ject) must be implemented by requesting the version in the method newGame{ enclosing object. This is not as bad, but certainly adds more var size complexity than the version with a single method. object { method setBoardSize(n){ size :=n} The final negative consequence arrises because, in Grace, method boardSize{ size} everything is an object: there is an implicit object construc- } tor surrounding all “top level” code. If we disallow exe- } cutable code in object constructors, where do we put it? We could postulate an implicit “main method” that surrounds None of these solutions seems “obviously right” for a the whole program, but this seems like a hack. Shouldn’t the teaching language. Is it possible to satisfy all three re- principal unit of composition in an object-oriented language quirements without stepping outside the mainstream? Can be the object? MASPEGHIans help us to find a better solution, or to choose It is worth nothing that OCaml effectively implements between the alternatives described in this paper? these same “draconian restrictions”, not by banning exe- cutable code from inside an object, but by effectively evalu- Acknowledgments ating it (and thus determining the values of an object’s fields) before the object is constructed, as described in Section 6.3. The author thanks Kim Bruce, James Noble, and David Thus, OCaml does not allow initialisation expressions to ref- Ungar for fruitful discussions that have helped to shape erence self. This overcomes the first objection to the restric- this article. The anonymous referees also provided valuable tion (loss of locality), but not the others. guidance. Notice that in this restricted version of Grace, it is still possible for methods, invoked after the object has been con- structed, to execute arbitrary initialisation code, expose self, References and perform side effects. This is just what happens with the [1] M. Abadi and L. Cardelli. A Theory of Objects. Springer, initialize method in Smalltalk, OCaml “initializers”, and Java 1996. “constructors”. The innovation in Grace — as in OCaml — is [2] K. Beck. Smalltalk Best Practice Patterns. Prentice Hall, that this is no longer the only way of giving values to instance Upper Saddle River, NJ, 1997. ISBN 0-13-476904-X. variables: in the common case where a variable’s value can [3] A. P. Black. Object-oriented programming: some history, and be determined before the object containing it is created, the challenges for the next fifty years. Information and Computa- variable can already have its value at object creation time. tion, In press, 2013. As one of the anonymous referees pointed out, while this fa- [4] A. P. Black, N. Hutchinson, E. Jul, and H. Levy. Object cility is necessary for immutable objects, it is also beneficial structure in the Emerald system. In Proceedings First ACM for mutable objects. Conference on Object-Oriented Programming Systems, Lan- guages and Applications, pages 78–86, Portland, Oregon, Oc- 9. Conclusion tober 1986. ACM Press. [5] A. P. Black, K. B. Bruce, M. Homer, and J. Noble. Grace: Defining the exact semantics of inheritance in Grace has the absence of (inessential) difficulty. In Onward! ’12: Pro- proved to be surprisingly difficult. We started out with a ceedings 12th SIGPLAN Symp. on New Ideas in Programming number of requirements, all of which seemed reasonable, and Reflections on Software, pages 85–98, New York, NY, given the goals of Grace, and which were not in obvious 2012. ACM. URL http://doi.acm.org/10.1145/2384592. conflict with each other: 2384601. R1: it should be possible to create immutable objects (in [6] A. Borning. Classes versus prototypes in object-oriented languages. In FJCC, pages 36–40. IEEE Computer Society, addition to mutable objects); 1986. ISBN 0-8186-0743-2. R2: it should be possible to inherit from objects (in addition [7] W. Cook and J. Palsberg. A denotational semantics of in- to classes); and heritance and its correctness. In Conference on Object- R3: the semantics of object inheritance should be very sim- oriented programming systems, languages and applications, ilar to those of class inheritance. pages 433–443, New Orleans, LA USA, 1989. ACM Press. [8] T. V. Cutsem and M. S. Miller. Proxies: design principles Our original “object copy” semantics achieved R1 and for robust object-oriented intercession apis. In W. D. Clinger, R2, but not R3. In eliminating the copy, and adopting instead editor, DLS, pages 59–72. ACM, 2010. ISBN 978-1-4503- the “fresh object” semantics, we achieved R3, but at the ex- 0405-4. pense of R1 and R2. Eliminating executable code from ob- [9] J. Y. Gil and T. Shragai. Are we ready for a safer construction ject constructors enables us to achieve R1, R2, and R3. How- environment? In Proceedings of the 23rd European Confer- ever, it too has some disadvantages. If we simply forbid code ence on Object-Oriented Programming — ECOOP 2009, vol- in the constructor to mention self or other local variables — ume 5653 of LNCS, pages 495–519, Berlin, Heidelberg, 2009. essentially OCaml’s solution — then we are imposing a re- Springer-Verlag. striction that is absent from “mainstream” object-oriented [10] A. Goldberg and D. Robson. Smalltalk-80: The Language and languages — a possible negative for our intended audience. its Implementation. Addison-Wesley, 1983. Forcing the state variables of the object out of the object con- [11] J. Gosling, B. Joy, G. Steele, G. Bracha, and A. Buckley. structor and into the enclosing scope avoids that restriction, The Java® Language Specification. Oracle Corp, Java SE but imposes what may be seen as an even more significant 7 Edition edition, Feb 2013. URL http://docs.oracle.com/ one; in any case, it is also a departure from the mainstream. javase/specs/jls/se7/html/index.html. [12] S. N. Kamin. Inheritance in SMALLTALK-80: a denotational ming, pages 289–297, 1988. URL http://doi.acm.org/10. definition. In J. Ferrante and P. Mager, editors, ACM Symp. on 1145/62678.62721. Principles of Programming Languages, pages 80–87. ACM [16] J. C. Reynolds. Types, abstraction and parametric polymor- Press, Jan 1988. phism. In IFIP Congress, pages 513–523, 1983. [13] X. Leroy, D. Doligez, A. Frisch, J. Garrigue, D. Rémy, and [17] G. L. Steele, Jr. Rabbit: A compiler for scheme. Technical J. Vouillon. The OCaml system: documentation and user’s Report 474, Massachusetts Institute of Technology, AI Lab, manual. INRIA, release 4.00 edition, July 2012. URL http:// Cambridge, MA, USA, 1978. caml.inria.fr/pub/docs/manual-ocaml-4.00/index.html. [18] A. Taivalsaari. Delegation versus concatenation, or cloning [14] O. L. Madsen, B. Møller-Pedersen, and K. Nygaard. Object- is inheritance too. SIGPLAN OOPS Mess., 6:20–49, July Oriented Programming in the BETA Programming Language. 1995. ISSN 1055-6400. URL http://doi.acm.org/10.1145/ ACM Press/Addison-Wesley, 1993. 219260.219264. [15] U. S. Reddy. Objects as closures: Abstract semantics of [19] D. Wampler and A. Payne. Programming Scala: Scalability = object-oriented languages. In LISP and Functional Program- Functional Programming + Objects. O’Reilly, 2009.