A Type System for Smalltalk

A Type System for Smalltalk Justin 0. Graver University of Florida Ralph E. Johnson University of Illinois at Urbana-Champaign Abstract subclassing with subtyping [Sus81, B182, SCB’86, Str86, Mey88]. In our type system, types are based This paper describes a type system for Smalltalk that on classes (i.e. each class defines a type or a fam- is type-safe, that allows most Smalltalk programs to ily of types) but subclassing has no relationship to be type-checked, and that can be used as the basis of subtyping. This is because Smalltalk classes inherit an optimizing compiler. implementation and not specification. Our type system uses discriminated union types and signature types to describe inclusion polymor- 1 Introduction phism and describes functional polymorphism (some- times called parametric polymorphism, see [DT88]) There has been a lot of interest recently in type using bounded universal quantification. It has the systems for object-oriented programming languages following features: [CW85, DT88]. Since Smalltalk was one of the ear- liest object-oriented languages, it is not surprising a automatic case analysis of union types, that there have been several attempts to provide a type system for it [SuaBl, BI82]. Unfortunately, l effective treatment of side-effects by basing the none of the attempts have been completely successful definition of the subtype relation on equality of [Joh86]. In particular, none of the proposed type sys- parameterized types, and tems are both type-safe and capable of type-checking l type-safety, i.e. a variable’s value always con- most common Smalltalk programs. Smalltalk vio- forms to its type. lates many of the assumptions on which most object- oriented type systems are based, and a successful Because the type system uses class information, it can type system for Smalltalk is necessarily different from be used for optimization. It has been implemented as those for other languages. part of the TS (Typed Smalltalk) optimizing compiler We have designed a new type system for Smalltalk. [JGZ88]. The biggest difference between our type system and others is that most type systems for object-oriented programming languages equate classes with types and 2 Background Authors’ address, telephone, and e-mail: Smalltalk [GRSS] is. a pure object-oriented program- Department of Computer and Information Sciences, E301 CSE, Gainesville, FL 32611, (904) 392-1507, graverOcis.ti.edu ming language in that everything, from integers to Department of Computer Science, 1304 W. Springfield Ave., text windows to the execution state, is an object. In Urbana, IL 61801, (217) 244-0093, [email protected] particular, classes are objects. Since everything is an object, the only operation needed in Smalltalk is message sending. Smalltalk uses run-time type-checking. Every message send is dynamically bound to an implementation depending on the class of the receiver. This unified view of the universe makes Smalltalk a Permission to copy without fee all or part of this material is granted compact yet powerful language. provided that the copies are not made or distributed for direct Smalltalk is extremely extensible. Most operations commercialadvantage, the ACM copyright notice and the title of the are built out of a small set of powerful primitives. publication and its date appear, and notice is given that copying is by permission of the Asociation for Computing Machinery. To copy othcr- wise, or to republish, requires a fee and/or specific permission. 0 1990 ACM 089791-3434/9O/OOOi/O136 $1.50 136 For example, control structures are all implemented in terms of blocks, which are the Smalltalk equivalent class Change: of closures. Smalltalk’s equivalent to if-then-else is values the ifTtue:ifFalse: message, which is sent to a boolean TAtray with: self class with: self parameters object with two block arguments. If the receiver is an parameters instance of class True then the “true block” is evalu- self subclassResponsibility ated. Similarly, if the receiver is an instance of class Fake then the “false block” is evaluated. Looping is class ClassRelatedChange: implemented in a similar way using recursion. These parameters primitives can be used to implement case statements, TclassName generators, exceptions, and coroutines[Deu81]. Smalltalk has been used mostly for prototyping and class MethodChange: exploratory development. It is ideal for these pur- parameters poses because Smalltalk programs are easy to change fAtray with: className with: selector and reuse, and Smalltalk comes with a powerful programming environment and large library of generally class OtherChange: useful components. However, it has not been used parameters as much for production programming. This is partly Tself text because it is hard to use for multi-person projects, partly because it is hard to deliver application programs without the large programming environment, Figure 1: Change and its subclasses. and partly because it is not as efficient as languages like C. In spite of these problems, the development and maintenance of Smalltalk programs is so easy that more and more companies are developing applications with it. class is needed) [JF88]. Although conscientious prd- A type system can help solve Smalltalk’s problems. grammers remove these improprieties from finished Type information makes programs easier to under- programs, we wanted a type system that would al- stand and can be used to automatically check inter- low them, since they are often important intermedi- face consistency, which makes Smalltalk better suited ate steps in the development process. Thus, a static for multiperson projects. However, our main motiva- type system for Smalltalk must be as flexible as the tion for type-checking Smalltalk is to provide infor- traditional dynamic type-checking. mation needed by an optimizing compiler. Smalltalk In an untyped object-oriented language like methods (procedures) tend to be very small, mak- Smalltalk, classes inherit only the implementation ing aggressive inline substitution necessary to achieve of their superclasses. In contrast, most type sys- good performance. Type information is required to tems for object-oriented programming languages re- bind methods at compile-time. Although some type quire classes to inherit the specification of their super- information can be acquired by dataflow analysis of classes [BI82, Car84, CW85, SCB*86, Str86, Mey88], individual methods [CUL89], explicit type declara- not just the implementation [Sua81, Joh86]. A class tions produce better results and also make programs specification can be an explicit signature [BHJL86], more reliable and easier to understand. From this the implicit signature implied by the name of a class point of view, the type system has been quite suc- [SCB*86], or a signature combined with method pre- cessful, because the TS compiler can make Smalltalk and post-conditions, class invariants, etc. [Mey88]. programs run nearly as fast as C programs. Inheriting specification means that the type of meth- It is important that a type system for Smalltalk not ods in subclasses must be subtypes of their specifica- hurt Smalltalk’s good qualities. In particular, it must tions in superclasses. not hinder the rapid-prototyping style often used by Since Smalltalk is untyped, only implementation is Smalltalk programmers. Exploratory programmers inherited. This makes retrofitting a type system dif- try to build programs as quickly as possible, mak- ficult. Since specification inheritance is a logical or- ing localized changes instead of global reorganization. ganization, many parts of the Smalltalk inheritance They often use explicit run-time checks for particu- hierarchy conform to specification inheritance, but lar classes (evidence that code is in the wrong class) there is nothing in the language that requires or en- and create new subclasses to reuse code rather than forces this. Thus, it is common to find classes in to organize abstraction (evidence that a new abstract the Smalltalk- class library that inherit implemen- 137 tation and ignore specification. 3 Types Dictionary is a good example of a class that inher- The abstract and concrete syntax for type expressions its implementation but not specification; it is a sub- is shown in Figure 2 (abstract syntax on the left and class of Set. A Dictionary is a keyed lookup table concrete syntax on the right). In the concrete syn- that is implemented as a hash table of <key, value> tax grammar terminals are underlined, (something)* pairs. Dictionary inherits the hash table implemen- represents zero or more repetitions of the something, tation from Set, but applications using Sets would and (something)+ represents one or more repetitions behave quite differently if given Dictionaries instead. of the something. Each of the type forms will be de- scribed in detail in the following sections. Abstract classes, a commonly used Smalltalk pro- We use the abstract syntax in inference rules and gramming technique, provide other examples where the concrete syntax in examples. Type expressions classes inherit implementation but not specification. are denoted by a, t, and u. Type variables are de- Consider the method definitions shown in Figure 1. noted by p (abstract syntax) or by P (concrete_syn- These (and other) classes are used to track changes tax). Lists (tuples) of types are denoted by t = made to the system. The abstract class Change de- <ir,tz,***r t, >. The empty list is denoted by <> fines the values method in terms of the parameters and t et’is the list < t, 21, t2,. , t, >. Similar nota- method. The implementation of the values method is tion is used for lists of type variables. C denotes a inherited by the subclasses of Change. The result of class name and m denotes a message selector. sending the values message depends on the result of Strictly speaking, a type variable is not a type; it sending the parameters message, which is different for is a place holder (or representative) for a type. We each class in Figure 1.

Load more