Traits: Experience with a Language Feature
Total Page:16
File Type:pdf, Size:1020Kb
7UDLWV([SHULHQFHZLWKD/DQJXDJH)HDWXUH (PHUVRQ50XUSK\+LOO $QGUHZ3%ODFN 7KH(YHUJUHHQ6WDWH&ROOHJH 2*,6FKRRORI6FLHQFH1(QJLQHHULQJ$ (YHUJUHHQ3DUNZD\1: 2UHJRQ+HDOWKDQG6FLHQFH8QLYHUVLW\ 2O\PSLD$:$ 1::DONHU5G PXUHPH#HYHUJUHHQHGX %HDYHUWRQ$25 EODFN#FVHRJLHGX ABSTRACT the desired semantics of that method changes, or if a bug is This paper reports our experiences using traits, collections of found, the programmer must track down and fix every copy. By pure methods designed to promote reuse and understandability reusing a method, behavior can be defined and maintained in in object-oriented programs. Traits had previously been used to one place. refactor the Smalltalk collection hierarchy, but only by the crea- tors of traits themselves. This experience report represents the In object-oriented programming, inheritance is the normal way first independent test of these language features. Murphy-Hill of reusing methods—classes inherit methods from other classes. implemented a substantial multi-class data structure called ropes Single inheritance is the most basic and most widespread type of that makes significant use of traits. We found that traits im- inheritance. It allows methods to be shared among classes in an proved understandability and reduced the number of methods elegant and efficient way, but does not always allow for maxi- that needed to be written by 46%. mum reuse. Consider a small example. In Squeak [7], a dialect of Smalltalk, Categories and Subject Descriptors the class &ROOHFWLRQ is the superclass of all the classes that $UUD\ +HDS D.2.3 [Programming Languages]: Coding Tools and Tech- implement collection data structures, including , , 6HW niques - object-oriented programming and . The property of being empty is common to many ob- jects—it simply requires that the object have a size method, and D.3.3 [Programming Languages]: Language Constructs and that the method returns zero. Since all collections have a size, it Features – classes and objects, inheritance makes sense that the method isEmpty should be defined for all collection classes. Because Squeak uses single inheritance, by General Terms defining isEmpty in the class &ROOHFWLRQ, all subclasses of &RO OHFWLRQ Design, Languages inherit the isEmpty method. However, a problem arises with classes outside the collection Keywords hierarchy. Take the class &5'LFWLRQDU\, a part of the “Genie” Traits, Ropes, Reuse, Smalltalk, Inheritance handwriting recognition system in Squeak. This class is not a subclass of &ROOHFWLRQ, but it does have a size and an isEmpty 1. INTRODUCTION method. The method isEmpty was written in exactly the same &ROOHFWLRQ &5'LFWLRQDU\ Reusability is a very important property of object-oriented pro- way twice; once in and once in . grams. With no reuse, all the methods that a class requires would need to be defined in the class itself. When two or more This may seem like a small problem—one repeated method is classes define the same method, code is duplicated. In addition not a lot of repeated code. But this problem is compounded by to being wasteful by taking up memory, duplicated code lowers several factors. First, the method isEmpty occurs in 24 classes, programmer productivity in two ways. Initially, the programmer often defined in a similar manner. Second, a number of similar must take the time to make a copy of a certain method. Later, if methods that rely exclusively on isEmpty are common, includ- ing notEmpty and ifEmpty:. Third, it is quite common to find The research described in this experience paper was conducted at the duplicated methods in unrelated classes in the class hierarchy. OGI School of Science & Engineering at OHSU by an undergraduate For example, the classes 5HFWDQJOH and 5HFWDQJOH0RUSK, student from The Evergreen State College. It was funded by NSF award have much common code that they cannot share because they CCR 0098323. are unrelated: their only common superclass is 2EMHFW [3]. How can we reuse code more effectively? Copyright is held by the author/owner(s). OOPSLA’04, Oct. 24–28, 2004, Vancouver, British Columbia, Canada. ACM 1-58113-833-4/04/0010. One workaround is simply to tolerate the missing methods, rather than dupliciating them. For example, the user of a &5'LFWLRQDU\ could be required to convert it into a 'LFWLRQDU\, 275 &5'LFWLRQDU\ could be required to convert it into a 'LFWLRQDU\, this manner. Using traits, we can define behavior in one place which is a subclass of &ROOHFWLRQ, and then send the isEmpty and have that behavior utilized in multiple places. In addition, message to the 'LFWLRQDU\, but this would be grossly inefficient the protocols of the classes that use traits tend to become more and exceedingly verbose. Another workaround would be to push uniform. For example, while 20 Squeak classes define isEmpty, the method isEmpty up the hierarchy into the 2EMHFW class, but only 2 also define notEmpty. However, if the methods had been isEmpty is inappropriate in subclasses of 2EMHFW that do not reused from the 7(PSW\QHVV trait, both methods would be have a size method. Multiple inheritance is another solution, but defined in all the classes. Readers wishing to learn more about it presents its own set of problems, such as the possibility of traits and the comparison between traits and other language conflicting variable and method names. No alternative to single features are referred to previous publications [3,5]. inheritance has gained widespread acceptance [8]. Small examples are acceptable illustrations of the possibilities 2. TRAITS of traits, but little is known about how useful traits are in a real- istic programming situation. Traits were tested by refactoring Traits are a new solution to the problem of code duplication the entire collections hierarchy [3], but the refactoring was done across unrelated classes [5]. A trait is a reusable building block by two of the creators of traits themselves—Andrew Black and from which programmers construct classes. Traits are collec- Nathanael Schärli. Emerson Murphy-Hill, a programmer famil- tions of pure methods, much like classes, except that they have iar with object-oriented programming but unfamiliar with traits, no superclass and no state. Traits have a set of methods that was assigned to evaluate traits in an independent manner, under define behavior, called the “provides set”, and another set of the gentle direction of Andrew Black. Murphy-Hill spent a methods that have yet to be defined, called the “requires set.” summer at OGI School of Science and Engineering with a Re- search Experience for Undergraduates grant from the National Suppose that we want to create a trait to solve the isEmpty prob- Science Foundation. At the beginning of the project, Murphy- lem. We define the method isEmpty in terms of size, but we Hill was unfamiliar with both Smalltalk and traits. While Black cannot yet define size, because it is often dependent on imple- assisted him in learning Smalltalk, Black did not provide guid- mentation. Furthermore, the methods notEmpty, ifEmpty:, and ance on how to use traits. Murphy-Hill learned about traits from ifNotEmpty: can be defined in terms of isEmpty. So isEmpty, some of the available papers [3,5,6], and utilized this knowledge notEmpty, ifEmpty:, and ifNotEmpty: are in the provides set and to write a significant, multi-class data structure called ropes. size is in the requires set. We now have a reusable building The entire design and implementation process took about one block that we will call 7(PSW\QHVV (see Figure 1). month, but certainly could have progressed faster if the pro- grammer had experience with Smalltalk prior to this experience. 7(PSW\QHVV can be incorporated into the &ROOHFWLRQ class, into the &5'LFWLRQDU\ class, and into any other class that de- fines size. To incorporate a trait into a class, the code in this 3. ROPES class has to be refactored so that old, repeated methods are re- Ropes are an alternative to strings originally developed by moved. Then, a trait should be made which defines these meth- Boehm, Atkinson, and Plass at Xerox PARC [4]. Strings are ods and the trait itself is used in the class. Each class that almost always implemented as fixed length arrays of characters, duplicates behavior defined by the trait should be refactored in which are not ideal for some of the most common string opera- tions. In particular, concatenation and substring selection are 7(PSW\QHVV common operations that strings perform slowly in traditional Provides Requires implementations because they require that large portions of the array be copied. Ropes make both concatenation and substring LV(PSW\ VL]H selection faster by introducing alternative data structures. Al- ↑ self size = 0 VHOIUHTXLUHPHQW though these data structures require ropes to be immutable, immutability can be considered an advantage in that it greatly QRW(PSW\ simplifies sharing of ropes between cooperating processes. ↑ self isEmpty not LI(PSW\D%ORFN 4. THE EXPERIMENT ↑ self isEmpty ifTrue: To see the effect of traits on the development process, we im- >D%ORFNYDOXH@ plemented ropes twice: once with traits, and once without. We based both versions on a Java implementation of ropes by An- LI1RW(PSW\D%ORFN drew Black, which utilized Aspects [2]. Black chose ropes as an ↑ self notEmpty ifTrue: application of traits for Murphy-Hill and let him develop ropes >D%ORFNYDOXH@ as he saw fit. )LJXUH7KHWUDLW7(PSW\QHVVZLWKWKHSURYLGHG PHWKRGVLQWKHOHIWFROXPQDQGWKHUHTXLUHG PHWKRGVLQWKHULJKWFROXPQ 276 Collection SequenceableCollection Rope ArrayedCollection String Concat Empty Flat Substring )LJXUH7KHSDUWLDO&ROOHFWLRQKLHUDUFK\#VKRZLQJWKH6WULQJFODVVDQGWKH5RSHFODVV 4.1. First Implementation: Ropes without Traits The first issue was where to put a Rope class so that it could To implement ropes, Murphy-Hill wrote four subclasses of reuse as much behavior from String as possible. To gain the Rope: Empty, Flat, Substring, and Concat, each with a different greatest amount of reuse, the Rope class should be a subclass of underlying data representation.