Using Categories for the Development of Generative Software

Pedram Mir Seyed Nazari and Bernhard Rumpe Software Engineering, RWTH Aachen University, Aachen, Germany

Keywords: Model-driven Development, Code Generators, Software Categories.

Abstract: In model-driven development (MDD) software emerges by systematically transforming abstract models to concrete . Ideally, performing those transformations is to a large extent the task of code generators. One approach for developing a new code generator is to write a reference implementation and separate it into handwritten and generatable code. Typically, the generator developer manually performs this separation — a process that is often -consuming, labor-intensive, difficult to maintain and may produce more code than necessary. Software categories provide a way for separating code into designated parts with defined dependencies, for example, “Business Logic” code that may not directly use “Technical” code. This paper presents an approach that uses the concept of software categories to semi-automatically determine candidates for generated code. The main idea is to iteratively derive the categories for uncategorized code from the dependencies of categorized code. The candidates for generated or handwritten code finally are code parts belonging to specific (previously defined) categories. This approach helps the generator developer in finding candidates for generated code more easily and systematically than searching by hand and is a step towards tool-supported development of generative software.

1 INTRODUCTION model, the generator developer creates the reference implementation. Next, it has to be determined which Models are at the center of the model-driven devel- code parts need to be or can be generated and which opment (MDD) approach. They abstract from tech- ones should remain handwritten. Finally, the transfor- nical details, facilitating a more problem-oriented de- mations are defined to transform the reference model velopment of software. In contrast to conventional to the aforementioned generated code. general-purpose languages (GPL, such as or C), Often, the third step, i.e., ’separation of hand- the language of models is limited to concepts of a written and generated code’ is not explicitly men- specific domain, namely, a domain-specific language tioned in the literature. This separation is implicit (DSL). To obtain an exectuable software application, part of the last step, i.e., ’creation of transformations’, code generators systematically transform the abstract since the transformations are only created for code models to instances of a GPL (e.g., classes of Java). that ought to be generated. However, the separation However, code generators are software themselves of handwritten and generated code ought to be distin- and need to be developed as well. There are different guished as a step on its own, since it is not always development processes for code generators. One that obvious which classes need to be generated. is often suggested (e.g., (Kelly and Tolvanen, 2008) In general, every class can be generated, espe- and (Schindler, 2012)) is shown in Fig. 1. cially when using template-based generators. In an The approach includes four steps. First, a refer- extreme case, a class can be fully copied into a tem- ence model is created, which ultimately serves as in- plate containing only static template code (and, thus, put for the generator. Depending on this reference is independent of the input model). This is not de- sired, following the guideline that only as much code Creation of Creation of Separation of Creation of should be generated as necessary (Stahl et al., 2006), Reference Model Reference Impl. HW and Gen Code Transformations (Kelly and Tolvanen, 2008), (Fowler, 2010). Opti- result result result result mally, most code is put into the domain framework Handwritten Reference Reference Code Templates Model Impl. Templates (or domain platform), increasing the understandabil- Generated Templates Code ity and maintainability of the software. The gener- Figure 1: Typical development steps of a code generator. ated code then only configurates the domain frame-

498 Mir Seyed Nazari P. and Rumpe B.. Using Software Categories for the Development of Generative Software. DOI: 10.5220/0005328204980503 In Proceedings of the 3rd International Conference on Model-Driven Engineering and (MODELSWARD-2015), 498-503 ISBN: 978-989-758-083-3 c 2015 SCITEPRESS (Science and Technology Publications, Lda.) UsingSoftwareCategoriesfortheDevelopmentofGenerativeSoftware

work for specific purposes (Rumpe, 2012). CardGameGUISwing refines One important criterion for a code generator to be reasonable is the existence of similar code parts, ei- CardGameGUI SheepsHead ther in the same software product or in different prod- ucts (e.g., software product lines). Typically, genera- Swing CardGame tion candidates are similar code parts that are also re- lated to the domain. For example, in a domain about 0 cars, the classes Wheel and Brake would be more likely generation candidates than the domain indepen- Figure 2: Software categories for virtual SheepsHead (Siedersleben, 2004) (shortened). dent and thus unchanged class . This, of course, is the case, since the information for the generated categories, such as persistence, gui and application. code is obtained by the input model which, in turn, Therefore, (Siedersleben, 2004) suggests using soft- is an instance of a DSL that by definition describes ware categories for finding appropriate components. elements of a specific domain. Of course, the logi- In the following this idea is demonstrated by an ex- cal relation to the domain is not a necessary criterion, ample.1 because if the DSL is not expressive enough, the gen- erated code is additionally integrated with handwrit- Suppose that a software system for the card game ten code. Nevertheless, the generated code often has Sheepshead should be developed. The following cat- some bearing on the domain. egories then could be created (see Fig. 2): In most cases, the generator developer manually • 0 (Zero): contains only global software that is separates handwritten code from generated code. This well-tested, e.g., java.lang and java.util of process can be time-consuming, labor-intensive and the JDK. may impede maintenance. Furthermore, when using • CardGame a domain framework, this separation is insufficient, : contains fundamental knowledge since the handwritten code needs to be separated into about card games in general. Hence, it can be used handwritten code for a specific project and handwrit- for different card games. ten code concerning the whole domain. This sepa- • SheepsHead: Contains rules for the the ration also impacts the maintenance of the software Sheepshead game, e.g., whether a card can (Stahl et al., 2006). To address this problem, software be drawn. categories, as presented in (Siedersleben, 2004), are • CardGameGUI: determines the design of the card suited. game, independent of the used library, e.g., that The aim of this paper is to show how soft- the cards should be in the middle of the screen. ware categories can be exploited to categorize semi- automatically classes and interfaces of an object- • CardGameGUISwing: extends Swing by illustra- oriented software system. The resulting categoriza- tion facilities for cards. tion can be used for determining candidates for gen- • Swing: contains fundamental knowledge about erated code, supporting the developer performing this Java Swing. separation task. This paper is structured as follows: Sec. 2 intro- An arrow in Fig. 2 represents a refinement rela- duces software categories and the used terminology. tion between two categories. Classes that are in a cat- In Sec. 3, these software categories are adjusted for egory C1 that refines another category C2 may use generative software. Sec. 4 presents the allowed de- classes of this category C2. The other way around pendencies derived by the previously defined software is not allowed. Every category - directly or indi- categories. The general categorization approach is ex- rectly - refines the category 0 (arrows in Fig. 2). plained in Sec. 5 and exemplified in Sec. 6. Sec. 7 Hence, software in 0 can be used in every cate- outlines further possible dependencies. Finally, Sec. gory without any problems. CardGame is refined 8 concludes the paper. by SheepsHead and CardGameGUI which means that code in these categories can also use code in CardGame. Note that a communication between CardGameGUI and SheepsHead is not allowed di- 2 SOFTWARE CATEGORIES rectly, but rather by using CardGame or 0 interfaces. Since the category CardGameGUISwing refines both Software systems, especially larger ones, consist of a number of components that interact with each other. 1The example is taken from (Siedersleben, 2004) and re- The components usually belong to different kinds of duced to only the aspects required to explain our approach.

499 MODELSWARD2015-3rdInternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment

AT DT the sake of readability, we do not explicitly mention interfaces, albeit what applies to classes applies to in- A T D T terfaces as well. 0 DG (a) (b) 0‘ 3 CATEGORIES FOR Figure 3: Software categories (a) in general (Siedersleben, GENERATIVE SOFTWARE 2004) and (b) adjusted for generative software. While (Siedersleben, 2004) aims for finding compo- CardGameGUI and Swing, it is a mixed form of these nents from the defined software categories, the goal two categories. of this paper is to determine whether a specific class Now, having these categories, appropriate com- should be generated or not by analyzing its depen- ponents can be found. For example, a compo- dencies to other classes. nent SheepsHeadRules in the category SheepsHead, To illustrate this, consider the following example. CardGameInfo and VirtualPlayer in CardGame, When having a class Book and a class Jupiter, which CardGameInfoPresentation for CardGameGUI. of these classes are generation candidates? Of course, Considering this example, it can be seen that be- it depends on the domain. If the domain is about plan- side the 0 category, three other categories can be iden- ets, probably Jupiter is a candidate. In a carrier tified that exist in most software systems (see Fig. 3a): media domain, Book would be a candidate. So, we • Application (A): containing only application can say, that a generation candidate somehow relates software, i.e., CardGame, SheepsHead and to the domain. But this condition is not enough. In CardGameGUI a library domain where different books exist, Book • Technical (T): containing only technical software, would rather be general for the whole domain and e.g. Java Swing classes.2 should probably not be generated at all. Hence, addi- tionally to the domain affiliation, a generation candi- • Combination of A and T (AT): e.g. date is not general for the whole domain. Technically CardGameGUI-Swing because it refines both an speaking, the class or interface should depend on a 3 A (CardGameGUI) and a T (Swing) category. specific model (or model element). Consequently, a (Siedersleben, 2004) summarizes the characteris- change in the model can imply the change of the gen- tics and rules for the software categories as follows: erated class. Usually, classes that are global for the the categories are partially ordered, i.e., every cate- whole domain are not affected by changes in a model. gory can refine one or more categories. The emerging We adjusted the category model in Fig. 3a to bet- category graph is acyclic. The category 0 (Zero) is ter fit in with the domain. Fig. 3b shows the modified the root category, containing global software. A cat- category model. egory C is pure, if there is only one path from C to The category A from Fig. 3a is renamed to D (Do- 0. Otherwise, the category is impure. In Fig. 3a only main), to emphasize the domain. Consequently, the the category AT is impure, because it refines the two mixed form AT (Application and Technical) becomes categories A and T. All other categories are pure. DT (Domain and Technical). Category T remains un- changed. The new category DG (domain global) indi- Terminology cates software that is global for the whole domain and helps to differentiate from D-classes that are specific CookBook We call a class that has the category C a C-class. Fol- to the domain (a particular book, e.g., ). DG lowing from the category graph in Fig. 2 there are: Because of the introduction of , the character- 0 AT-classes, A-classes, T-classes and 0-classes. For istic of the category changes somewhat. It contains only global software that is well-tested and indepen- 2Note that Swing classes are global (belonging to the dent of the domain, e.g., java.lang and java.util JDK) and well-tested; hence meet the criteria of the cat- of the JDK. To highlight the difference to the initial 0 egory 0. But –as usually the user-interface should be definition, 0’ is used. exchangeable– Swing classes are not necessarily global in a With the above objective in mind and upon search- specific software system. ing for generation candidates, in particular, classes of 3 Representation R In (Siedersleben, 2004) also the ( ) cat- D D DT egory is presented. This category contains only software for the category are interesting, i.e., itself and , transforming A category software to T and vice versa. It is a refining the category of both D and T (see Fig. 3b). kind of cleaner version of AT. To demonstrate our approach, The matrix in Fig. 4a underscores which software the R category can be neglected. category results if two categories are combined. A

500 UsingSoftwareCategoriesfortheDevelopmentofGenerativeSoftware

+ DT D T DG 0‘ → DT D T DG 0‘ ric to the previously described allowed dependencies DT DT DT DT DT DT DT VVVVV of a category. D D DT D D D U V U VV T TTT T UU VVV Dependencies in Java DG DG DG DG UUU VV (a)0‘ 0‘ (b) 0‘ UUUU V Up to now, we included the term dependency, but we Figure 4: (a) Addition of software categories (b) Allowed did not define it so far. This is mainly because what dependencies between categories. a dependency ultimately is, depends on the (target) usage of 0’ has no effect, e.g., D + 0’ = D. The same programming language. Java, for example, provides is true for DG, as we defined it to be like 0’ (global for different kinds of dependencies between classes and the whole domain). Hence, D + DG = D, e.g., if the interfaces. The following shows one possible classi- A B D-class CookBook extends the DG-class Book it still fication, where the class depends on the class and I remains a D-class. Only the combination of D and T the interface , respectively: leads to an (impure) mixed form, concretely DT. Any • Inheritance: class A extends B DT DT DT DT combination with results in , i.e., * + = . • Implementation: A implements I • Import: import B 4 DEPENDENCY RULES FOR • Instantiation: new B() CATEGORIES • ExceptionThrowing: throws B • Usage: field access (e.g., b.fieldOfB), method A total of four categories (plus the mixed form DT) call (e.g., b.methodOfB(), declaration (e.g., B have been suggested for a general classification of b), use as method parameter (e.g., void meth(B code in generative software (Fig. 3b). Classes of b)), etc. a particular category are only allowed to depend on classes of the same category and classes that are on These are dependencies in Java that are mostly the same path to 0’. Consequently, based on these manifested in keywords (e.g., extends and throws), categories, the table in Fig. 4b can be derived auto- and hence, hold for any Java software project. How- matically. ever, not all of these dependencies are always desired. The table can be read in two ways: line-by-line It is important to determine first of all what a depen- or column-by-column. The former shows the allowed dency ultimately is. For example, an unused import, dependencies of a category, whereas the latter shows i.e., a class that imports another class without using it, the categories that may depend on a category. The is not necessarily a dependency. first row in Fig. 4b shows that a DT-class may depend on classes of any of the categories. A D-class can only depend on D-, DG- and 0’-classes4 (Fig. 4b, sec- 5 CATEGORIZATION ond row). A D-class must not depend on a DT-class. APPROACH Only the other direction is allowed. Analogous to D- classes, a T- class may only depend on T-, DG- and The suggested approach for the categorization of the 0’-classes. A class from category DG cannot depend source code is demonstrated in Fig. 5. Three inputs on any of the categories but DG and 0’; otherwise are needed for the categorization: the source code to it would contradict the definition of DG being global be categorized (from which a dependency graph is de- for the whole domain. For example, in the library do- rived), the category graph (such as in Fig. 3b) and an main, the (abstract) class Book (DG) would not know initial categorization of some of the classes and inter- anything about the single books (such as CookBook, faces (usually done by hand). Using these inputs, a D) or MDDBook (D)). Of course, 0’-classes can only categorization tool analyzes the dependencies of the communicate among each other. For instance, classes uncategorized classes and interfaces to the already in the java.lang package (0’) do not have any de- categorized ones. With the information obtained from pendencies to a class of any of the other categories. the category graph some of the uncategorized classes As mentioned before, the columns in Fig. 4b show and interfaces can be categorized automatically. For those categories that can depend on a specific cate- example, if a class C depends on a D- and a T-class gory. It can be seen that this is somehow antisymmet- and the category graph in Fig. 3b is given, the cate- 4Note that a D-class that depends on a T-class is rather gory of class C is definitively DT, because only this a DT-class. category refines both D and T.

501 MODELSWARD2015-3rdInternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment

performed (DT) CookBookPanel (DT) CD Source Code automatically

Category Graph Categorization Categ. i+1 CookBookReader CookBook (D) (D/DT) Categorization i AbstractPanel (T) categorization [changed] changed? Reader (?) Book (DG)

Manual [not changed] 1..* 1..* [yes] Categorization Author (DG) JPanel (0’) further categ. [yes] categorize needed? Categ. i+1‘ manually? Figure 7: Categorized classes after the first iteration. [no] [no]

Final Categorization six are not. The category is in parentheses beside the Figure 5: Overview of the categorization approach. class name. Uncategorized classes are marked with a question mark (?). Let us assume that the four cate- (?) CookBookPanel (?) CD gorized classes already exist and are categorized (e.g., CookBookReader manually by an expert) and the six other classes are CookBook (D) (?) AbstractPanel (T) newly created. This situation can arise, for instance, when software evolves. In the following, the catego- Reader (?) Book (DG) rization process is illustrated. 1..* 1..* CookBookPanel Author (?) JPanel (0’) The class communicates with both a D-class (CookBook) and a T-class Figure 6: Initially categorized classes. (AbstractPanel). Following Fig. 4b, only a DT-class may communicate with a D as well as with In some cases the order of the categorization pro- a T class (marked by a check mark in the D and cess matters. For example, if a class A only depends T column). Thus, CookBookPanel is definitively a on a class B (and no categorized class depends on DT-class. Moreover, any other class depending on A), A will not be categorized until B is categorized. CookBookPanel (represented by the three dots), is To prevent that the order has an effect on the final also a DT-class. In the column DT in Fig. 4b there is categorization, the categorization is performed itera- only a check mark for DT. tively. The output of iteration i serves as input for Next, CookBookReader depends on the D-class the next iteration i+1. This is repeated until a fix- CookBook and the not yet categorized class Reader. point is reached, that means, no further classes and If Reader is a DT- or T-class, CookBookReader will interfaces could be categorized. These iteration steps be definitive a DT-class, for it would depend on a D- can be conducted fully automatically. If there are still class and either a DT- or T-class. With regard to Fig. uncategorized classes left, some of them can be cate- 4b, this only fits for DT-classes. If Reader is of any gorized by hand (Sec. 6 illustrates this case by an ex- of the other categories, CookBookReader will be a D- ample). This updated categorization, again can serve class. However, when trying to categorize Reader, as input. The process can be repeated until the whole we encounter a problem. Reader only depends on source code is categorized or no further categoriza- Book, a DG-class. According to Fig. 4b this can ap- tion is needed. Finally, classes and interfaces with a ply to any category except 0’. So, in this iteration, specific categorization serve as candidates for code to Reader cannot be categorized automatically. Conse- be generated. Here, this applies to the categories D quently, the exact categorization of CookBookReader and DT. The user now can decide which of these can- cannot be determined. didates will become generated code. Analogous to the class Reader, the class Author only depends on the DG-class Book. So, except 0’, it can be of any category. Unlike the previous case, 6 EXAMPLE Book also has a dependency to Author, which means that Author is either DG or 0’. We have already ex- Now, with the help of the allowed dependencies de- cluded 0’; hence, only DG remains as a possible cat- fined in Sec. 4, given some classes, the category of egory for Author. Fig. 7 shows the extended catego- each of the classes can be derived semi-automatically, rization after this iteration. following the approach presented in the previous sec- Two classes could not be categorized exactly af- tion. ter the first iteration: CookBookReader and Reader. Consider the case in Fig. 6. The figure depicts Recalling that our goal is to find generation candi- overall ten classes, whereby four are pre-categorized dates, we are above all interested in classes of the (CookBook, AbstractPanel, Book and JPanel) and category D. So, the approximate categorization of

502 UsingSoftwareCategoriesfortheDevelopmentofGenerativeSoftware

CookBookReader (D or DT) is sufficient, because and can also lead to a different candidate list. How- both D and DT are of the category D. In contrast, ever, the procedure as described in Sec. 5 and Sec. 6 Reader is still completely uncategorized which ham- remains unchanged. pers the categorization of classes depending on it. There are two options to categorize Reader in the next iteration: either manually by the expert or au- 8 CONCLUSION tomatically by adding new classes and dependencies Reader limiting the possible categories of . Code generators are crucial to MDD, transforming Note that the order of the categorization of abstract models to executable source code. The gener- CookBookPanel and the classes depending on it ated source code often depends on handwritten code, (marked by “. . . ”) is important for the first iteration. e.g., code from the domain framework. When a code The “. . . ” classes could not be categorized if they generator is developed or evolved, the generator de- CookBookPanel were considered before . However, veloper manually decides which classes need to be the order has no impact on the final result, because generated and which remain handwritten. This task CookBookPanel after the first iteration is surely cate- can be time-consuming, labor-intensive and may gen- gorized, and thus, the “. . . ” classes can be categorized erate more code than is necessary, hampering the in the next iteration. maintenance of the software. Finally, three candidates (plus the “. . . ” classes) This paper has introduced an approach that can aid for generated code are identified: CookBook (D), the generator developer in finding candidates for gen- CookBookReader (D/DT) and CookBookPanel (DT). erated code. First, a software category graph is de- All of these classes belong to the category D di- fined. From this graph the allowed dependencies be- rectly or indirectly (i.e., DT), and hence, are some- tween the corresponding classes (and interfaces) are how related to the domain. Having these candidates, derived automatically. After an initial categorization the generator developer has to decide which of these of some classes, further classes can be categorized classes in the end need to be generated and which automatically, by analyzing their dependencies. This remain handwritten. Of course, this decision is re- procedure is conducted iteratively until all classes are stricted above all by the information content of the categorized or no more categorization is needed. Fi- input model. The generator developer must be aware nally, generation candidates are all classes belonging of this restriction. to the domain categories.

7 FURTHER DEPENDENCIES REFERENCES

Up to now, only the technical dependencies of the Fowler, M. (2010). Domain Specific Languages. Addison- code are considered for finding generation candidate Wesley Professional. classes (see Sec. 4). There can be further dependen- Kelly, S. and Tolvanen, J.-P. (2008). Domain-Specific Mod- cies, such as naming dependencies. If, for example, eling: Enabling Full Code Generation. Wiley. the CookBookPanel in Fig. 6 had no association to Rumpe, B. (2012). Agile Modellierung mit UML. CookBook, then, it would only depend on the T-class Xpert.press. Springer Berlin, 2nd edition edition. AbstractPanel and be a T-class. Schindler, M. (2012). Eine Werkzeuginfrastruktur zur Ag- But, CookBookPanel contains the name of ilen Entwicklung mit der UML/P. Aachener Infor- matik Berichte, Software Engineering. Shaker Verlag. CookBook as prefix in its class name. Considering Moderne Software-Architektur: CookBookPanel Siedersleben, J. (2004). this naming dependency, has also a Umsichtig planen, robust bauen mit Quasar. dependency to the D-class CookBook. Consequently, Dpunkt.Verlag GmbH. CookBookPanel is a DT-class and a generation candi- Stahl, T., Voelter, M., and Czarnecki, K. (2006). Model- date. Note that from the architecture’s point of view a Driven Software Development: Technology, Engineer- (technical) dependency between CookBookPanel and ing, . John Wiley & Sons. CookBook might be forbidden. Hence, deriving the dependency rules from the architecture (and not from software category graph) would limit the kinds of pos- sible dependencies. In sum, what a dependency finally is, depends on the software system and its conventions. This affects the emerging dependency graph of the source code

503