Type-Check Elimination: Two Reengineering Patterns

Stephane´ Ducasse, Robb Nebbe, Tamar Richner

Software Composition Group, Universitat¨ Bern g fducasse,nebbe,richner @iam.unibe.ch

http://www.iam.unibe.ch/scg/

Abstract A reengineering pattern describes how to go from an existing legacy solution to a new refactored solution. In this paper we discuss the role of reengineering patterns and contrast them with and antipatterns. We then highlight the structure of a reengineering pattern and present two simple, related patterns for type-check elimination.

1 Reengineering Patterns

When important legacy can no longer gracefully evolve to meet changing requirements it is often reengineered. Reengineering patterns codify and record knowledge about modifying legacy software: they help in diagnosing problems and identifying weaknesses which hinder further development of the system and aid in finding solutions which are more appropriate to the new requirements. Reengineering patterns differ from Design Patterns [GHJV95] in their emphasis on the pro- cess of moving from an existing legacy solution to a new refactored solution. Whereas a design pattern presents a solution for a recurring design problem, a reengineering pattern presents a refactored solution for a recurring legacy solution which is no longer appropriate, and describes how to move from the legacy solution to the refactored solution. The mark of a good reengineer- ing pattern is (a) the clarity with which it exposes the advantages, the cost and the consequences of the target solution with respect to the existing solution, and not how elegant the target solution is, (b) the description of the change process: how to get from one state of the system to another. We also contrast reengineering patterns with AntiPatterns [BMMM98]. Antipatterns, as exposed by Brown et al., are presented as “bad” solutions to design and management issues in software projects. Many of the problems discussed are managerial concerns that are outside the direct control of developers. Moreover, the emphasis in antipatterns is on prevention: how to avoid making the mistakes which lead to the antipatterns. Consequently, antipatterns may be of interest when starting a project or during development but are no longer helpful when we are confronted with a legacy system. In contrast, in reengineering we prefer to withhold judgement and use the term “legacy solution” or “legacy pattern” for a solution which at the time, and under the contraints given, seemed appropriate. In reengineering it is too late for prevention, and reengineering patterns therefore concentrate on the cure: how to detect problems and move to more appropriate solutions. In dealing with legacy systems it is neither cost effective nor prudent to refactor the code without a clear goal. The task of determining which parts of the legacy software are significantly obstructing further development is, however, a difficult one. Reengineering patterns try to guide developers by discussing issues such as the applicability of the pattern, the cost of transformation and any particular implementation problems. Note, however, that the question of whether a pattern should be applied cannot be wholly addressed in this discussion and remains ultimately a question of judgment. In the context of a project developing a methodology for reengineering object-oriented legacy systems to frameworks, we are working on a pattern language for reengineering. The patterns presented in this paper address typical legacy solutions found in object-oriented code. We expect, however, that some reengineering patterns will describe more overall strategies for dealing with legacy systems, and thus be of a less technical nature than the patterns presented here. We continue with a brief overview of the format used for writing reengineering patterns. This is followed by a pair of sample patterns dealing with the problem of missing polymorphism.

2 Form of a reengineering pattern

Pattern Name. We use a short sentence with a verb that emphasizes the kind of reengineering transformation.

Intent. A description of the process, together with the result and why it is desirable. Applicability. When is the pattern applicable? When is it not applicable? This section includes a list of symptoms, a list of reengineering goals and a list of related patterns. Symptoms are those experienced when reusing, maintaining or changing the system. Reengineering goals present the qualities improved through the application of this pattern.

Motivation. This section presents an example: it must acquaint the reader with a concrete ex- ample so the reader can better understand the more abstract presentation of the problem which follows in the structure and process sections. The example clearly describes the structure of the existing legacy system, the structure of the reengineered system, and the relation between the two. The state of the system before and after the application of the pattern are described.

Structure. It describes the structure of the system before and after reengineering. Each structure section is similar to the structure section in the Gang of Four pattern book. The partici- pants and their collaborations are identified. Consequences discuss the advantages and disadvantages of the target structure in comparison to the initial structure. Process. The process section is subdivided into three sections: the detection, the recipe and the difficulties. It is well-known that the code often tells the reengineer where the problem is and that the reengineering process must be goal driven to avoid reengineering code that is not obstructing further development. However, there are cases where it may be interesting to automatically detect where different patterns can be applied. The detection section describes methods and tools to guide and detect when the code is indeed suffering from the serious problems and that the process described can help to alleviate this problem. The recipe states how to perform the reengineering operation and its possible variants. The difficulties section discusses situations where the reengineering operation is infeasible or its application is compromised by other existing problems.

Discussion. In this section cost and benefit tradeoffs of applying the pattern are discussed. The legacy solution is commented to show why such a solution was good at the period of time but insufficient or unadapted to the current problem. What is the cost of detecting this problem? What is the magnitude of the problem? What is the benefit gained by applying the pattern? This discussion should aid an engineer in deciding (once he knows the pattern is applicable to the code) whether or not it is, in this specific case, worth applying the pattern.

Language Specific Issues. This section lists what must be specifically resolved for each lan- guage. What makes it more difficult? More easy?

3 Type Check Elimination

The introduction of polymorphism is an important and frequent reengineering operation in object- oriented legacy systems. By replacing hand coded polymorphism with the support built into the language both simplifies the software and makes it more flexible. Even in the presence of poly- morphism it is our experience that developers continue to implement functionality that would be best handled through polymorphism by other means. Here we present two patterns: Type Check Elimination within a Provider Hierarchy and Type Check Elimination in Clients. The essential distinction between these two patterns is the Location of the type check: in Type Check Elimination within a Provider Hierarchy the type check is encapsulated within the provider class while in Type Check Elimination in Clients the type check is performed in client classes and tightly couples the clients to the provider hierarchy. Note that when one class depends on another we call this class a client class and the class it depends on is a provider class. This is a general terminology and is not specific to these patterns. Each reengineering pattern is self contained as we have found that depending on our goals when reengineering a software system we may be interested in one and not the other. For exam- ple, if we wish to extract a subsystem then Type Check Elimination in Clients is very impor- tant. On the other hand if we wish to add functionality then Type Check Elimination within a Provider Hierarchy is often more relevant. Type Check Elimination in a Provider Hierarchy

Intent Transform a single provider class being used to implement what are conceptually a set of related types into a hierarchy of classes. Decision structures, such as case statements or if-then-elses, over type information are replaced by polymorphism. This results in increased modularity and facilitates the extension of functionality through the addition of new subclasses.

Applicability

Symptoms.

 Methods contain large decision structures over an instance variable of the provider class to which they belong.

 Extending the functionality of the provider class requires modifying many methods.

 Many clients depend on a single provider class.

Reengineering Goals.

 Improve modularity.

 Simplify extension of provider functionality.

Related Reengineering Patterns. A closely related pattern is Type Check Elimination in Clients where the case statements over types are in the client code as opposed to the provider code. The essential distinction is if the decision structure is over an instance variable of the class (this pattern) or another class (see Type Check Elimination in Clients ). The pattern is also related to the object oriented heuristics : “Explicit case analysis on the value of an attribute is often an error” [Rie96].

Motivation Case statements are sometimes used to simulate polymorphic dispatch. This often seems to be the

result of the absence of polymorphism in an earlier version of the language (Ada’83 ! Ada’95 ++ or C ! C ). Another possibility is that programmer don’t fully master the use of polymorphism and as a result do not always recognize when it is applicable. In any language that supports polymorphism it is preferable to exploit the language support rather than simulate it. In the presence of polymorphism the process of dispatching is part of the language. In con- trast, with case statements or other large decision structures the simulated dispatch must coded and maintained by hand. Accordingly, changing or extending the functionality are more difficult because they often affect many places in the source code. It also results in long methods with fragmented logic that are hard to understand. Programmers often fall back to the language they are most familiar with – in the Variable Kent Beck shows an example of such a situation related to Lisp programmers [Bec97]. Thus, they may continue to implement solutions which do not exploit polymorphism even when polymorphism is available. This could occur especially when programmers extend an existing design by programming around its flaws, rather than reengineering it.

Initial Situation. Our example, taken in a simplified form from one of the case studies, consists of a message class that wraps two different kinds of messages (TEXT and ACTION) that must be serialized to be sent across a network connection as shown in the code and the figure 1.

Message Client1 send(channel Channel) Client2 set_value(action Integer) set_value(text String) receive(channel Channel)

Figure 1: Initial relation and structure of clients and providers.

A single provider class implements what is conceptually a set of related types. One attribute of the class functions as surrogate type information and is used in a decision structure to handle different variations of functionality required.

class Message f public: Message(); set value(char* text); set value(int action); void send(Channel c); void receive(Channel c); ... private: void* data;

int type ; g // from Message::send const int TEXT = 1; const int ACTION = 2;

switch (type ) f case TEXT: ...

case ACTION: ... g; Final Situation. The case statements have been replaced by polymorphism and the original class has been transformed into a hierarchy comprised of an abstract superclass and concrete subclasses. Clients must then be adapted to create the appropriate concrete subclass. Initially there may be a large number of dependencies on this class, making modification expensive in terms of compilation time, and increasing the effort required to test the class. The target structure improves all of these problems with the only cost being the effort required to refactor the provider class and to adapt the clients to the new hierarchy.

Client1 Message Client2 send(channel Channel) receive(channel Channel)

Text_Message Action_Message send(channel Channel) send(channel Channel) receive(channel Channel) receive(channel Channel) Text_Message(text String) Action_Message(action Integer)

Figure 2: Final relation and structure of clients and providers. class Message f public: virtual void send(Channel c) = 0; virtual void receive(Channel c) = 0; ...

g;

class Text Message: public Message f public: Text Message(char* text); void send(Channel c); void receive(Channel c); private: char* text; ...

g;

class Action Message: public Message f public: Action Message(int action); void send(Channel c); void receive(Channel c); private: int action; ...

g;

Structure

Participants.

 A single provider (Message) class that is transformed into a hierarchy of classes (Mes- sage, Text Message and Action Message)

 Asetofclient classes

Collaborations. The single provider class will be transformed into a hierarchy, thereby in- creasing modularity and facilitating extension of functionality. Initially, the clients are all dependent on a single provider class. This class encompasses several variants of functionality and thus encapsulates all the collaboration that would normally be handled by polymorphism. This results in long methods typically containing case statements or other large decision structures. The situation is improved by refactoring the single provider class into a hierarchy of classes: an abstract superclass and a concrete subclass for each variant. Each of the new subclasses is simpler than the initial class and these are relatively independent of each other.

Consequences. The functionality of the hierarchy can be extended adding a new subclass with- out modifying the superclass. The increased modularity also impacts the clients who are now likely to be dependent on separate subclasses in the provider hierarchy.

Process

Detection. If the automatic detection of the pattern application is considered the following criteria should be take into consideration. A class having many long methods is a good candidate for further analysis. A line of code per method metric may help to narrow the search. If these methods contain case statements or complex decision structures all based on the same attribute then the attribute is probably serving as surrogate type information. In C++, where it is a good practice to define a class per file, the frequency of case statements in the same file can be also used as a first hint to narrow the search for this pattern.

Example: detection of case statements in C++. Knowing if the pattern should be applied re- quires the detection of case statements. Regular-expression based tools like emacs, grep, agrep help in the localization of case statements based on explicit construct like C++’s switch or Ada case. For example, grep ’switch’ ‘find . -name ”*.cxx” -print‘ enumerates all the files with extension .cxx contained in a directory tree that contains switch. The grep facilities for grep are extended in agrep so it is possible to ask for finer queries. For example, the expression agrep ’switch;type’ -e ‘find . -name ”*.cxx” -print‘ extracts all the files containing lines having switch and type. However, even for a language like C++ that provides an explicit case statement construct, detecting case statements based on explicit ifthenelse structures is necessary. The tools above are not well suited for such a task, since their detection capabilities are restricted to one line at a time. One possible solution is to use perl scripts - a perl script which searches the methods in C++ files and lists the occurrences of case statements can be found in the appendix.

Recipe. 1. Determine the number of conceptual types currently implemented by the class by inspect- ing the case statements. An enumeration type or set of constants will probably document this as well. 2. Implement the new provider hierarchy. You will need an abstract superclass and at least one derived concrete class for every variant. 3. Determine if all of the methods need to be declared in the superclass or if some belong only in a subclass.

4. Update the clients of the original class to depend on either the abstract superclass or on one of its concrete subclasses.

Difficulties.

 If the case statements are not all over the same set of functionality variants this is a sign that it might be necessary to have a more complex hierarchy including several intermediate abstract classes, or that some of the state of the provider should be factored out into a separate hierarchy.

 If a client depends on both the superclass and some of the subclasses then you may need to refactor the client class or apply the Type Check Elimination in Clients pattern because this is an indication that the provider does not support the correct interface.

Discussion

About the legacy solution. The legacy solution is a good solution when the language does not support polymorphism. The variants represent the subclasses in a object-oriented languages. The functionalities can be shared between the variants and the polymorphism can be simulated. Note also that the type checks occur only in the implementation of the provider and there is very little complexity distributed acrossed the clients. The major drawback is that the provide class quickly becomes very large as the number of variants increases and the particularities of each variant must be taken into account. The complexity is reflected in the methods that must distinguish the different variants and their logic becomes fragmented and difficult to follow. Adding new variants often requires making extensive modifications to the provider class.

About Detection. During the detection phase one can find other uses of case statements. For example, case statements are also used to implement objects with states [Bec94, ABW98]. In such a case the dispatch is not done on object type but on a certain state as illustrated in the State pattern [GHJV95, ABW98]. Moreover, the [GHJV95, ABW98] is also based on the elimination of case statement over object state. In his thesis W. Opdyke [Opd92] discusses the automization of code refactoring. His “Refac- toring To Specialize”, in which he proposed to use class invariants as a criteria to simplify con- ditionals, is similar to this pattern.

About the refactored solution. Applying the pattern may lead to a number of new classes basically transforming what is often a single class into a hierarchy of classes. However, these classes already existed conceptually in the legacy solution and we are trading one very complex class for simpler but more numerous classes. The logic of each class is then much cleaner since you do not need to filter out the noise associated with the other variants and the typing can be tightened so that unwanted variants can be excluded by the type system rather than through a precondition check.

Language Specific Issues.

C++. Detection: in C polymorphism can be emulated either by using function pointers or through union types and enum’s. C++ programmers are likely to use a single class with a void pointer and then cast this pointer to the appropriate type inside a switch statement. This allows them to uses classes which are nominally object-oriented as opposed to unions which they have probably been told to avoid. The use of constants is typically favored over the use of enum’s. Difficulties: If void pointers have been used in conjunction with type casts then you should check to see if the classes mentioned in the type casts should be integrated into the new provider hierarchy.

ADA. Detection: because Ada83 did not support polymorphism (or subprogram access types) discriminated record types are the prefered solution. Typically an enumeration type provides the set of variants and the conversion to polymorphism is straightforward in Ada95.

SMALLTALK. In the detection of the case statements over types is hard because few type manipulations are provided. Basically, methods isMemberOf: and isKindOf: are available. anObject isMemberOf: aClass returns true if anObject is an instance of the class aClass, anObject isKindOf: aClass returns true if aClass is the class or a superclass of anObject. Detecting these method calls is not sufficient, however, since class membership can also be tested with self class = anotherClass, or with property tests throughout the hierarchy using methods like isSymbol, isString, isSequenceable, isInteger. Type Check Elimination in Clients

Intent Transform client classes that depend on type tests (usually in conjunction with case statements) into clients that rely on polymorphism. The process involves factoring out the functionality dis- tributed across the clients and placing it in the provider hierarchy. This results in lower coupling between the clients and the providers (class hierarchy).

Applicability

Symptoms.

 Large decision structures in the client over the type of (or equivalent information about) an instance of the provider, either passed as an argument to the client, an instance variable of the client, or a global variable.

 Adding a new subclass of the provider superclass requires modifications to clients of the provider hierarchy because functionality is distributed over these clients.

Reengineering Goals.

 Localize functionality distributed across clients in the provider hierarchy.

 Improve usability of provider hierarchy.

 Lower coupling between clients and the provider hierarchy.

Related Reengineering Patterns. A closely related reengineering pattern is Type Check Elim- ination within a Provider Hierarchy, where the case statements over types are in the provider code as opposed to the client code. The essential distinction is if the decision structure is over the type or an attribute functioning as a type of: (a) an instance of another class (this pattern) or (b) an instance of the class to which the method belongs (see Type Check Elimination within a Provider Hierarchy ). The pattern is also related to the object oriented heuristic : “Explicit case analysis on the type of an object is usually an error. The designer should use polymorphism in most of these cases” [Rie96].

Motivation The fact that the clients depend on provider type tests is a well known symptom for a lack of polymorphism. This leads to unnecessary dependencies between the classes and it makes it harder to understand the program because the interfaces are not uniform. Furthermore, adding a new subclass requires all clients to be adapted. Initial Situation. The following code illustrates poor use of object-oriented concepts as shown by Fig. 3. The function makeCalls takes a vector of Telephone’s (which can be of different types) as a parameter and makes a call for each of the telephones. The case statement switches on an explicit type-flag returned by phoneType(). In each branch of the case, the programmer calls the phoneType specific methods identified by the type-tag to make a call.

Client Providers

TelephoneBox Telephone makeCalls()

switch(p->phoneType()) case .... case ...

POTSPhone ISDNPhone tourneManivelle() initializeLine() call() connect()

Figure 3: Initial relation and structure of clients and providers.

void makeCalls(Telephone * phoneArray[]) f

for (Telephone *p = phoneArray; p; p++) f

switch(p->phoneType()) f

case TELEPHONE::POTS: f POTSPhone * potsp = (POTSPhone *) p; potsp->tourneManivelle();

potsp->call(); break; g

case TELEPHONE::ISDN: f ISDNPhone * isdnp = (ISDNPhone *) p; isdnp->initializeLine();

isdnp->connect(); break; g

case TELEPHONE::OPERATORS: f OperatorPhone * opp = (OperatorPhone *) p; opp->operatormode(on);

opp->call(); break; g case TELEPHONE::OTHERS: default:

error(....); ggg

Final Situation. After applying the pattern the corresponding ringPhones() will look as fol- lows and the structure as shown by the Fig. 4.

void makeCalls(Telephones *phoneArray[]) f

for(Telephone *p = phoneArray; p; p++) p->makeCall(); g

Note that the client code, which represents distributed functionality, has been greatly simpli- fied. Furthermore, this functionality has been localized within the Telephone class hierarchy, thus making it more complete and uniform with respect to the clients needs.

Client Providers

TelephoneBox Telephone makeCalls() makeCall()

... makeCall()

POTSPhone ISDNPhone makeCall() makeCall()

Figure 4: Final relation and structure of clients and providers.

Structure

Participants.

 provider classes (Telephone and its subclasses)

– organized into a hierarchy.

 the clients (TelephoneBox) of the provider class hierarchy.

Collaborations. The collaborations will change between all clients and the providers as well as the collaboration within the provider hierarchy. Initially, the clients collaborate directly with the provider superclass and its subclasses by virtue of type tests or a case statement over the types of the subclasses. After reengineering the only direct collaboration between the clients and the providers is through the superclass. Interaction specific to a subclass is handled indirectly through polymorphism. Within the provider hierarchy the superclass interface must be extended to accurately reflect the needs of the clients. This will involve the addition of new methods and the possible refac- torization of the existing methods in the superclass. Furthermore, the collaborations between the provider superclass and its subclasses may also evolve, i.e. it must be determined whether the new/refactored methods are abstract or concrete. Consequences. Relying on polymorphism localizes the protocol for interacting with the provider classes within the superclass. The collaborations are easier to understand since the interface ac- tually required by the clients is now documented explicitly in the provider superclass. It also simplifies the addition of subclasses since their responsibilities are defined in a single place and not distributed across the clients of the hierarchy.

Process

Detection. The technique described in the pattern Type Check Elimination within a Provider Hierarchy to detect case statements is applicable for this pattern. Whereas in the pattern Type Check Elimination within a Provider Hierarchy, the switches are located in the same class, hence in one file for a language like C++, in this pattern the case statements occur in several classes which can be spread over different files.

Recipe. The process consists of two major steps. The first is to encapsulate all the responsibil- ities that are specific to the provider classes within the provider hierarchy. The second is to make sure that these responsibilities are correctly distributed within the hierarchy.

1. Determine the set of clients to which the pattern will be applied.

2. Define a new abstract method in the provider superclass and concrete methods implement- ing this method in each of the subclasses based on the source code contained within each branch of the case statement.

3. Refactor the interface of the provider superclass to accurately reflect the protocol used by the clients. This involves not only adding and possibly changing the methods included but determining how they work together with the subclasses to provide the required behav- ior. This includes determining whether methods are abstract or concrete in the provider superclass.

4. For each client, rewrite the method containing the case statement so that it uses only the interface of the provider superclass.

Difficulties.

1. The set of clients may all employ the same protocol; in this case the pattern needs to be applied only once. However, if the clients use substantially different protocols then they can be divided into different kinds and the pattern must be applied once for each kind of client.

2. If the case statement does not cover all the subclasses of the provider superclass a new abstract class may need to be added and the client rewritten to depend on this new class. For example, if it is an error to invoke the client method with some subclasses as opposed to just doing nothing then the type system should be used to exclude such cases. This reduces the provider hierarchy to the one starting at the new abstract class.

3. Refactoring the interface will affect all clients of the provider classes and must not be undertaken without examining the full consequences of such an action.

4. Nested case statements indicate that multiple patterns must be applied. This pattern may need to be applied recursively in which case it is easiest to apply the pattern to the out- ermost case statement first. The provider classes then become the client classes for the next application of the pattern. Another possibility is when the inner case statement is also within the provider class but some of the state of the provider classes should be factored out into a separate hierarchy.

Discussion

About the legacy solution. Explicit type check are sometimes necessary. This is the case when the programmers are working at the frontier between object-oriented and non-object-oriented applications [Rie96]. We observed the frequent use of explicit type checks to simulate method dispatching. For example writing in C++ does not save one from dealing the way events are handled in the X window system. X inevitably looses type information about events by placing them in the event queue from where an application receives them as events and must explicitly check the type to determine how to handle them. Java deals with problem by changing the approach to event handling. Widgets register for events and specify what they want done with the events. This short circuits the loss of type information and eliminates the type checks. Of course this would require reimplementing X or wrapping X in a better design thus encapsulating the type checks in the wrapper. Type checks are made necessary when type information is lost. Often type loss is an artifact of the design and fixing it may be relatively straight forward as in this pattern and the Type Check Elimination within a Provider Hierarchy pattern while other cases may require an extensive redesign such as with X. In such cases the advantages of “doing it right” or at least making it look like it was done right with a wrapper must be weighed against the costs.

About detection. During the detection phase one can find other uses of case statements. For example, case statements are also used to implement objects with states [Bec94, ABW98]. In such a case the dispatch is not done on object type but on a certain state as illustrated in the State pattern [GHJV95, ABW98]. Moreover, the Strategy pattern [GHJV95, ABW98] is also based on the elimination of case statement over object state.

About evolution. If the application currently reengineered has been in the past distributed as a library and is used now by other programmers, the Deprecation pattern [ste98] can be applied to deal with the interface conflicts between the old version and the new ones. About the refactored solution. Applying the pattern will lead to changes in the interface and may introduce new classes in the provider hierarchy. However, the interface as well as the classes already existed conceptually in the legacy solution. The mismatch between the abstraction pre- sented and the abstraction required created complexity that was distributed acrossed the clients. In the refactored solution the complexity is localized within the provider hierarchy instead of being distributed acrossed the clients of the hierarchy. This makes the client code much easier to understand and the role of the provider classes is much better documented by their interfaces.

Language Specific Issues.

C++. In C++ virtual methods can only be used for classes that are related by an inheritance rela- tionship. The polymorphic method has to be declared in the superclass with the keyword virtual to indicate that calls to this methods are dispatched at runtime. These methods must be redefined in the subclasses. Type information is encoded often using some enum type. A data member of a class having such an enum type and a method to retrieve these tags are usually a hint that polymorphism could be used (although there are cases in which polymorphic mechanism cannot substitute the manual type discrimination).

ADA. Detecting type tests falls into two cases. If the hierarchy is implemented as a single discriminated record then you will find case statements over the discriminant. If the hierarchy is implemented with tagged types then you cannot write a case statement over the types (they are not discrete); instead an if-then-else structure will be used. If a discriminated record has been used to implement the hierarchy it must first be transformed by applying the Type Check Elimination within a Provider Hierarchy pattern.

SMALLTALK. In SMALLTALK the detection of the case statements over types is hard because few type manipulations are provided. Basically, methods isMemberOf: and isKindOf: are available. anObject isMemberOf: aClass returns true if anObject is an instance of the class aClass, anObject isKindOf: aClass returns true if aClass is the class or a superclass of anObject. Detecting these method calls is not sufficient, however, since class membership can also be tested with self class = anotherClass, or with property tests throughout the hierarchy using methods like isSymbol, isString, isSequenceable, isInteger.

JAVA. Look for the use of the operator instanceof. Note that this operator returns true if the object on its left-hand side is an instance of the class or implements the interface specified on its right-hand side. As classes are not real objects in JAVA a programmer cannot compare directly like in SMALLTALK two references, but he could compare name of the classes, so you may look for getClass() and getName() combined with string comparison. References

[ABW98] S. R. Alpert, Kyle Brown, and B. Woolf. Design Patterns in Smalltalk. to appear, 1998.

[Bec94] Kent Beck. Death to case statements. Smalltalk Report, pages 8–9, jan 1994.

[Bec97] Kent Beck. Smalltalk Best Practice. Prentice-Hall, 1997. ISBN: 0-13-476904-X, 225 pages.

[BMMM98] William J. Brown, Raphael C. Malveau, Hays W. “Skip” McCormickIII, and Thomas J. Mowbray. AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis. Wiley and Sons, 1998. ISBN: 0-471-19713-0.

[GHJV95] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns. Addison Wesley, Reading, MA, 1995.

[Opd92] William F. Opdyke. Refactoring Object-Oriented Frameworks. PhD thesis, University of Illinois, 1992.

[Rie96] Arthur J. Riel. Object Oriented Design Heuristics. Addison-Wesley, 1996. ISBN: 0-201- 63385-X.

[ste98] Systems reengineering patterns. 1998. To appear in proceedings of ACM-SIGSOFT Foun- dations of 6.

Acknowledgements. This work has been funded by the Swiss Government under Project no. NFS-2000-46947.96 and BBW-96.0015 as well as by the European Union under the ESPRIT programme Project no. 21975. Many thanks to the people that have commented on earlier drafts of this document: Eduardo Casais and Oscar Nierstrasz. Also to our shepherd at EuroPLOP’98, Don Roberts. And finally, to Patrick Steyaert, who helped us with the development of the appropriate pattern form.

A Detecting Case Statements.

This perl script searches the methods in C++ files and lists the occurences of statements matching the

following expression: elseXif where X can be replaced by f, //... or some white space including carriage return.

#!/opt/local/bin/perl $/ = ’::’; # new record delim., greps an entire method approx. $elseIfPattern = ’else[\s\n]*{?[\s\n]*if’; $linecount = 1; while (<>) { s/(\/\/.*)//g; # remove C++ style comments $lc = (split /\n/) - 1; # count lines in the record

if(/$elseIfPattern/) { # count # of lines until first occurence of "else if" $temp = join("",$‘,$&); $l = $linecount + split(/\n/,$temp) - 1; # count the occurences of else-if pairs, # flag the positions for an eventual printout $swc = s/(else)([\s\n]*{?[\s\n]*if)/$1\n\t\t*** HERE ***$2/g; printf "\n%s: Statement with %2d else-if’s, first at: %d", $ARGV, $swc, $l; } $linecount += $lc; if(eof) { close ARGV; # explicit close $linecount = 0; # to reset linenumber for next file print "\n"; # as well as the record number $. } }