
Build Automation And Optimization for Models of Embedded Real-time Systems

by

Kanchan Nair

A thesis submitted to the School of Computing in conformity with the requirements for the degree of Master of Science

Queen’s University
Kingston, Ontario, Canada
May 2018

Copyright © Kanchan Nair, 2018

Abstract

Model Driven Development (MDD) enables users to construct complex systems by leveraging models, often in some graphical notation, and pre-built application components. With MDD, users create a model using a language such as the Unified Modeling Language (UML) or UML for Real-Time (UML-RT). Certain tools supporting these languages allow building the code from the created models. Build systems transform source code, libraries, and data files into deliverables, such as deployment-ready executable files. Build tools play a critical role in the process by automating the compiling and packaging process. In a large MDD project, building accounts for a significant amount of time and effort, and optimization of the build process is still an actively researched topic. MDD tools rely heavily on the existing build tools, so there are possibilities of inheriting limitations and issues from the existing tools. This calls for making the existing build process more efficient. Hence, we propose a methodology to optimize the build process at the model level by comparing any two model versions, determining the impact of the change on a model, and generating a patch for the changed and impacted model elements. Finally, we present a prototype tool that serves to optimize the build process by generating a patch for the changed and impacted model elements. Also, using the prototype tool, we measure the time taken to compute the impacted elements.

Acknowledgments

I would like to extend my sincerest gratitude to my professor and supervisor, Dr. Juergen Dingel, for providing me with this opportunity to work under him. It has been a wonderful learning experience under his guidance, with insightful comments that propelled me beyond my capabilities and drove this thesis in the right direction. It would have been utterly impossible without his calm and his patience, and any extra adjective I attribute to him would still belittle the amount of care he has shown. I also thank my husband Arvind for having stayed by my side, supporting me and providing the mental strength that I needed, especially whenever times got stressful. I would also like to extend my appreciation to Mojtaba Bagherzadeh, with whom I worked closely on my project. He pushed me to work in a way to succeed. I would like to extend special thanks to Karim Jahed for his support towards the completion of my research by sharing his technical knowledge throughout. I would also like to extend my appreciation to my lab mates Sudharshan, Harshith, Reza, Michael and David for their constant support and insightful feedback during the group meetings. Last but not least, I thank my family, comprising my parents, my in-laws, relatives, and friends, for their constant support throughout my graduate life at Queen’s. After all, I am the sum total of all the love and care that has nurtured and made me who I am. Thank you all!!

Contents

Abstract

Acknowledgments

Contents

List of Tables

List of Figures

Chapter 1: Introduction
1.1 Motivation
1.2 Overview of the Proposed Approach
1.3 Organization of Thesis

Chapter 2: Background
2.1 Model and Model Driven Development
2.1.1 Goals of MDD
2.1.2 Model Transformation
2.1.3 Build Process and Tools
2.2 UML-RT
2.3 MDD Open Source Tools
2.3.1 Eclipse
2.3.2 Papyrus
2.3.3 JGit
2.3.4 Eclipse Modeling Framework (EMF)
2.3.5 Epsilon

Chapter 3: Related Work
3.1 Related Work
3.1.1 Model Versioning
3.1.2 Impact Analysis
3.1.3 Build

Chapter 4: Automating the Build Process
4.1 Project Overview
4.2 Configuration
4.2.1 Model Versioning
4.2.2 Model Comparison
4.3 Impact Analysis on Model
4.3.1 Querying on UML Model
4.3.2 Relationships between Model Elements
4.3.3 Notation for Relationships
4.3.4 Algorithmic Approach
4.4 Code Generation
4.5 Patch Generation and Build Process

Chapter 5: Evaluation
5.1 Assumptions
5.2 Types of Changes
5.3 Description of sample models
5.4 Results and Metrics
5.4.1 Validation
5.4.2 Experimental System Configuration
5.4.3 Evaluating Impact Analysis Time
5.4.4 Optimization Benefits

Chapter 6: Summary and Conclusions
6.1 Summary
6.2 Limitations and Future Work
6.3 Conclusion

List of Tables

5.1 Impact Analysis on Model For Overall Changes
5.2 Optimization benefit for select changes

List of Figures

1.1 CDD v/s MDD
1.2 Proposed Approach Overview

2.1 Model Transformation
2.2 Comparison of Build Tools
2.3 A basic capsule implementation
2.4 Capsule’s State Machine in UML-RT
2.5 UML-RT Capsule and class notation on class diagrams
2.6 UML-RT protocol description
2.7 Screenshot of Papyrus-RT tool
2.8 The code generation process used by the Papyrus-RT code generator
2.9 EMF Model unifying XML, UML and Java
2.10 Model Differencing Phases
2.11 EMF Compare
2.12 Epsilon Module Structure
2.13 Epsilon Profiling Summary View

4.1 Overview of Our Approach
4.2 Example of conflict in changes made on the same model
4.3 Illustration of how our tool accesses the Git repository to retrieve two model versions
4.4 Model Comparison approach using EMF Compare
4.5 Impact Analysis Process
4.6 Relationships for evaluating Impact Analysis
4.7 Example of use relationship between capsule, port and protocol
4.8 Example of communication relationship between capsules
4.9 Example of containment relationship between capsules
4.10 Example model for types of relationships
4.11 Overview of Code Generation
4.12 Code Generation implementation
4.13 Patch Generation implementation
4.14 Build Process implementation

5.1 Ping Pong Model
5.2 Car Door Central Lock Model
5.3 Parcel Router Model
5.4 Refined - Parcel Router Model
5.5 Rover Model
5.6 Testing window with integrated plug-in
5.7 Summary of time calculation in Epsilon profiling window
5.8 Target for time calculation in Epsilon profiling window
5.9 Changing model elements using Papyrus-RT
5.10 Newly Added model element in Papyrus-RT
5.11 Pushing model changes to the Git repository
5.12 Code Generation of Impacted elements


Glossary

CDD Code Driven Development.

EMF Eclipse Modeling Framework.

EOL Epsilon Object Language.

MDD Model Driven Development.

MOF Meta-Object Facility.

OCL Object Constraint Language.

UML Unified Modeling Language.

UML-RT Unified Modeling Language for Real Time.

VCS Version Control System.

Chapter 1

Introduction

In a conventional sense, Model Driven Development (MDD) and Code Driven Development (CDD) are regarded as being quite separate from each other. In CDD, developers work directly on code before the build process occurs. MDD focuses on models rather than on programs [27]. Figure 1.1 illustrates these two approaches. MDD attempts to address the complexities of large system development by allowing the developers to design the system via models from which the code is automatically or manually generated before the final build process takes place [32].

1.1 Motivation

In a CDD environment, for every release or sub-release, a different version of the code is maintained and saved. When there are bug fixes, the developers generate patches and merge them with the new release. Patch creation in CDD is implemented by using different tools and technologies, including Version Control System (VCS) tools such as Git and SVN [16, 56], build automation tools, e.g., Make [1] and Maven [8], compiler and build tools for the programming language at hand, e.g., GCC [69], and packaging tools, e.g., dpkg [59].

Figure 1.1: CDD v/s MDD

Model Driven Development (MDD) is intended for many large-scale domain-specific applications. Ideally, MDD tools and approaches should provide a similar functionality to code-driven development for patch generation. However, existing approaches do not satisfy this requirement due to several open problems:

• Model management capabilities such as versioning, comparing, merging and related activities are not supported well by existing model transformation tools [44]. Model versioning provides the set of features that allows developers to maintain the history of the evolution of models. Approaches for providing a model-specific versioning system without leveraging existing VCS tools have been proposed and include EMFStore [34], ModelCVS [41], and AMOR [40]. However, all of them focus on specific parts of the versioning process and do not offer a complete solution for model versioning [46, 49].

• Existing tools for model comparison often perform the calculation based only on the changed elements and their explicit relationships. Extra work is required to specify language- and application-specific relationships and to perform impact analysis based on the changed elements.

• Build tools play a critical role in the compiling and packaging process of software and related patches. A useful MDD tool should exploit a range of different compilers and build utilities [27]. Unfortunately, existing modeling tools provide no proper support for, or integration with, build tools. Existing tools often generate some predefined build script for normal cases, and it is not easy to configure them for special cases such as building only part of the generated code or importing an external library.

• Incremental code generation, which is essential for patch generation and an efficient build system, is supported only by a few existing tools, and in a tool-dependent way [58].

In the absence of support for repositories, version control, and an optimized, automated build process, MDD involving large and frequently changing models can be slowed down significantly. A possible workaround for patch generation at the model level is to generate the code corresponding to each model and then apply the patch generation mechanism on the code base. This workaround requires that extra source code and executables be kept for different versions of the model, which increases the administrative workload and the required disk space. Also, it may tempt developers to change the available generated code rather than the model, which could lead to synchronization issues because changes to the code may not be reflected in the model.

In this work, we focus on optimizing and automating the build process in the context of Model-driven Development (MDD) through model versioning and a patch generation mechanism. This work will allow the generation of a patch corresponding to differences between two versions of models rather than source code.

1.2 Overview of the Proposed Approach

Industrial MDD stores models and code in repositories to allow distributed development. All development activities apply to the models in collaborative mode and the executable code is generated from the models. However, existing MDD tools lack a proper versioning system. Our approach provides greater flexibility for accessing versions of a model, rather than code, directly from a repository such as Git. As illustrated in Figure 1.2, the process begins with ‘model versioning’, i.e., loading model versions from the repository. It is followed by model comparison. For this comparison between model versions, we have made use of Eclipse Modeling Framework Compare (EMF Compare). Since generating complete code incurs high costs in terms of build time, we need a strategy similar to incremental code generation, where code is generated only for the changed and impacted elements. Thus, we developed an algorithm for analyzing the impact of the changes made in the new model version. In order to extract the details of the various dependencies of the model from the UML file and implement the impact analysis, we have made use of a JavaScript- and Object Constraint Language (OCL)-based language known as the Epsilon Object Language (EOL), which provides features such as sequencing, variables, loops, branching and querying functionalities [5].

Figure 1.2: Proposed Approach Overview

To prevent the code generation of all the model elements, we generate the code only for the impacted model elements. Then, using the generated code, we create a patch which is a collection of compiled files only for the impacted model elements.

1.3 Organization of Thesis

Our thesis is organized in the following manner:

• Chapter 2 discusses the background of the tools, technologies and methodologies that were used in our research work. We discuss model-driven development (MDD), the Unified Modeling Language for Real Time (UML-RT) and MDD open source tools.

• Chapter 3 discusses other works related to our thesis.

• Chapter 4 begins by describing our approach for the project. Then we present the configuration which outlines model versioning and model comparison. This is followed by the detailed description of our approach to perform impact analysis, code generation, and patch generation.

• Chapter 5 discusses various UML-RT models and evaluates the impact analysis time for each of them. This is followed by discussing the optimization benefit for the changes made on these models.

• Chapter 6 concludes and summarizes our work and also provides some suggestions for possible future work.

Chapter 2

Background

This chapter covers some background regarding MDD and some of the languages, frameworks, and tools that are associated with it and relevant to our work. In particular, we discuss Eclipse, Papyrus-RT, JGit, EMF Compare, the Epsilon Object Language and Epsilon profiling.

2.1 Model and Model Driven Development

In simple terms, a model is defined as an abstraction over a software product [20]. A good model possesses these five characteristics [27]:

• Abstraction to hide irrelevant details.

• Understandability in the form of intuitive notations.

• Accuracy in providing a true-to-life representation of the system.

• Predictiveness to correctly estimate the modeled system’s non-obvious proper- ties through formal analysis.

• Inexpensive when compared to building the actual system.

Model-driven development, then, refers to a development approach and practice in which abstract models are constructed and are converted into actual implementations. Thus, MDD can provide a better, more high-level vocabulary for expressing the structure and behaviour of the system to be developed. In [61], the authors provide the definition for MDD as: “Model-driven development is simply the notion that we can construct a model of a system that we can then transform into the real thing”. Historically, although models have helped manage complex problems in software engineering, they have often played a secondary role [27]. As pointed out by Beydeda et al. [23], “These model designs exists on whiteboards, in business presentation slides, or in the designers’ and developers’ head”. However, in an MDD paradigm, models are not mere documentation, but are regarded as equivalent to code, with the code implementation most often automated.

With advancement in software development, efforts to attain higher levels of abstraction to simplify development and increase productivity have constantly been pursued. In the early 1990’s, Computer Aided Software Engineering (CASE) was a precursor to MDD. Code-driven development (CDD), on the other hand, was done using fourth generation languages (4GLs) [65]. Balasubramanian et al. compare MDD to third-generation languages, defining it as: “Model-driven development is an emerging paradigm that improves the software development lifecycle, particularly for large software systems, by providing a higher level of abstraction for system design than is possible with third-generation programming languages” [22]. Thereafter, model-driven development receded into the background whilst object oriented languages gained popularity. But now, the boundaries between the model and the code implementations are being blurred. Atkinson emphasizes the same when he explains [19], “Today’s object-oriented languages let programmers tackle problems of a complexity they never dreamed of in the early days of programming. Model-driven development is a natural continuation of this trend. Instead of requiring developers to spell out every detail of a system’s implementation using a programming language, it lets them model what functionality is needed and what overall architecture the system should have”.

In CDD, the constant development of new programming languages forces companies to update code bases and train developers. But in MDD, where the models take precedence over the code, the perspective shifts towards developing and maintaining system models. Thus, the domain experts can keep maintaining the same system regardless of the changes in the development methodologies. Sendall and Kozaczynski define MDD in terms of productivity. They mention in [63], “The model-driven approach can increase development productivity and quality by describing important aspects of a solution with human-friendly abstractions and by generating common application fragments with templates”. The simplification of the design process through the direct construction and analysis of formal models and through automatic code generation can enhance the productivity of a developer, as it shields the underlying complexities related to the platform. But for this enhanced level of abstraction, model-driven development has to be enclosed and tied to specific domains [65]. It is difficult to have a general-purpose modeling language that is applicable to every domain. Domain-Specific Modeling helps to peel the layers of abstraction further than is possible through current programming languages. A Domain-Specific Model (DSM) uses high-level specifications of domain specific problems to provide tangible solutions directly.

This narrower emphasis of MDD on specific domains allows for a more targeted and better automation [45]. Take, for instance, the mechanical engineering field, where Computer Aided Design (CAD) data that is provided to a milling machine can create a physical prototype or a working piece automatically [65]. MDD has proven most useful in specific domains such as aerospace, automotive and telecommunications.

2.1.1 Goals of MDD

The main aim of model-driven development is to increase the level of abstraction. But it is much more than that. The following are the goals of a model-driven development process [65]:

• Managing complexity: Using a reliable problem-oriented modeling language, the system’s complexity can be managed by expressing the system at more abstract levels.

• Development speed: Automation generates the executable code from formal models.

• Enhancing software quality: Well defined model architecture facilitates the de- velopment of systems with integrity and quality.

• Expert Knowledge: It allows for a more direct and explicit expression of the knowledge and domain expertise.

2.1.2 Model Transformation

“A model transformation is an operation that takes a set of models as input, executes a set of rules over the model(s) elements and produces a set of models as output” [24].

It can be defined as the automated process of transforming and creating models [14]. The transformation specification can be in the form of text [36] or graphics [10] and be used across similar domains. Model transformation is helpful for model-driven engineering. It reduces the otherwise needed efforts to work simultaneously on multiple interrelated models and ensures that any updates are correctly and completely carried out. Model transformation can facilitate model synchronization, reverse engineering, view generation and the application of patterns [63]. The main purpose of using model transformation is to simplify the manipulation of models and to reduce errors using a set of transformation rules and application strategies which are written in a specifically designed model transformation language [62].

2.1.3 Build Process and Tools

The build process refers to the conversion of the source code to an executable. “Build automation” is the process of automating the compilation of code, its conversion to a binary format, packaging of the binary, and running the automated tests. Currently, there are various examples of build tools available. We have provided a brief description of a few popular ones.

• Make Make is the oldest build automation tool which is still widely used in UNIX and UNIX-like operating systems. It came into existence in 1976 and is a part of the GNU operating system and software collection. Make is a tool which automates the build process by specifying certain rules for certain files and applying said rules to all the files that fit the criteria [1].

Figure 2.1: Model Transformation

• ANT (Another Neat Tool) Apache’s ANT is a Java library and a command line tool which began as a project in 2000 in Apache. ANT was designed by James Duncan Davidson in 1999 as an enhancement over the ‘Make’ build tool for UNIX. It is quite similar to ‘Make’ but is implemented using Java to achieve extensibility and platform-independence [57]. A collection of XML files specifies an ANT build system.

• Maven Maven is a software project management and comprehension tool. It was created to standardize the build process [57]. Based on the concept of a Project Object Model (POM), Maven can manage a project’s build, generation of reports and documentation from a central piece of information [8].

Figure 2.2: Comparison of Build Tools [57]

Figure 2.2 provides a comparison between Make, ANT and Maven build tools.

2.2 UML-RT

The Unified Modeling Language (UML) was too general to be a good modeling language for real-time systems. To accommodate this specialized need for RTE systems, UML-RT or ‘UML for Real Time’ was proposed [60]. ROOM (Real-Time Object Oriented Modeling), an architecture description language, had been quite popular in the telecommunication industry. The concepts from ROOM were taken and formed the basis of a UML profile in the MDD tool Rational Rose Real-Time (RoseRT) [70]. UML-RT extends UML and introduces new concepts to facilitate the design and development of real-time complex system architectures and improve the management of concurrent system behaviors [21]. UML-RT has the following key concepts:

• Capsule Capsules are the key feature of UML-RT. A capsule is an ‘active class’. An ‘active class’ is referred to as a class whose instances possess autonomous behaviour, which is defined by a hierarchical state machine [60]. Each capsule operates as per a state diagram which generates actions through ports and responds to events [42].

Capsules have a very precise interface. The capsule contains an internal structure known as properties typed by other capsules [60]. The capsules contain one or more ports, any two of which can be linked by a connector. The connector is an association between capsules. It is analogous to a hardware connection through which communication happens between capsules by the use of signals. These signals arrive and depart a capsule at points called ‘ports’. The ports are associated with their corresponding protocols to send signals or messages. Using these ports, capsules can be interlinked. UML-RT uses ‘Run-To-Completion’ semantics for every capsule [21]. This means that the processing of a message is always completed before the next message is processed.

Figure 2.3 shows a basic implementation of a capsule called ‘Top’ having two child capsules named ‘Pinger’ and ‘Ponger’ each having a port with a connector linking them.

• State Machine A capsule’s functionality is specified by its state machine. A state machine contains transitions, states, guards, and other components that are used to describe the capsule’s behaviour in UML-RT [21]. A state machine has states which represent the relevant situations during the lifetime of the object.

Figure 2.3: A basic capsule implementation

Figure 2.4: Capsule’s State Machine in UML-RT [9] 2.2. UML-RT 17

Figure 2.5: UML-RT Capsule and class notation on class diagrams [9]

State machines may be hierarchical, i.e., a single state can contain an entire sub-state machine describing the object’s behaviour. As shown in Figure 2.4, the hierarchical state machine is used to represent the capsule’s behaviour. Messages on the capsule’s port trigger transitions between states.

• Passive Class Instances of passive classes can be used as properties or parameters in a UML-RT capsule [9]. A state machine can describe the behaviour of a passive class. Figure 2.5 shows a passive class named ‘IdProvider’ connected with a capsule using a dotted arrow indicating that ‘IdProvider’ is used by the capsule.

• Port A port is represented as the point of interaction between a capsule and its environment through signals [42]. Ports are denoted by a rectangle shaped symbol. There are different kinds of ports such as:

– A ‘relay port’ serves as the outer boundary and passes on the signals between a capsule in the environment and a subcapsule.

– An ‘external port’ is a port which resides on the boundary but is not connected to an internal part. The capsule’s state machine can send and receive messages to and from the outside through these external ports.

– An ‘internal port’ does not reside on the boundary and is used by the capsule’s state machine to communicate within the capsule [60].

• Protocol A protocol defines a set of messages that may be sent or received between capsules by the port and processed by its state machine [42]. The set of messages can be of three types: input, output and input/output [9].

The simplest of the protocols is the ‘Binary protocol’, which involves two ports having two kinds of roles: base or conjugated. The base role needs to be specified while the conjugate is derived from the base role by inverting the incoming and outgoing signal sets [42]. Figure 2.6 displays a class diagram of a UML-RT protocol having two arrows showing incoming signals on the left and two arrows indicating outgoing signals on the right.

• Connector Connectors act as an abstraction of a channel to pass messages between ports. Connectors in UML-RT are always binary, i.e., they always connect exactly two ports. The possible interactions that can occur across a connector are defined by the protocol that types the connector [42].

• Transition A transition defines a relationship between a source state and a destination state [6]. The processing of events is said to be run-to-completion.

A transition has the following parts:

Figure 2.6: UML-RT protocol description [9]

– Trigger: It defines which events from which interfaces will cause transitions to be taken. Multiple triggers are possible in a transition where an event satisfying any one of the triggers will cause the transition to be taken.

– Guard Condition: A boolean expression associated with a trigger which must evaluate to true for the transition to be triggered. If no guard condition is specified, the default condition is assumed to be ‘True’.

– Actions: The actions in a behavior are where an object performs work such as reading and updating attributes, calling operations, and creating or destroying other objects [6].

2.3 MDD Open Source Tools

2.3.1 Eclipse

Eclipse IDE

Eclipse is an open source and platform-centric Integrated Development Environment containing various development tools integrated within [31]. It contains an extensible plug-in system for customizing the environment. Eclipse is one of the most widely used Java IDEs and even includes tools for model-based development [66].

Eclipse Plug-in Development

In Eclipse, a fundamental notion is a plug-in. The tools in the Eclipse platform are extended using plug-ins. A complex tool may have several plug-ins integrated into it where each plug-in is responsible for a specific functionality and could be extended further by other plug-ins. The purpose of plug-ins is to provide the users with the resources necessary to accomplish a certain task. The platform runtime engine is tasked with finding and executing the plug-in [66].

2.3.2 Papyrus

Papyrus is an open source, graphical editing tool for UML2 in the Eclipse environment. It has a rich user interface and supports the latest Object Management Group (OMG) UML standard. It is used as a tool in the Eclipse IDE or as a standalone application and provides code generation capabilities.

Papyrus also provides extensive support for UML profiles and is considered to be a powerful tool with customization capabilities similar to DSML-like (Domain Specific Modeling Language) meta-tools [35].

Papyrus-RT

Papyrus-RT is an open-source real-time version of Papyrus. It is implemented on top of Papyrus and provides an implementation of the UML-RT modeling language [60]. It provides support for MDD using a UML-RT development environment together with editors, a code generator for C++, a supporting runtime system and model-compare (diff/merge) capabilities [64, 60]. Figure 2.7 shows the Papyrus-RT tool in Eclipse. Section 1 in the figure is the project explorer window to create Papyrus-RT projects. Section 2 displays the editor view and shows the project diagram of the selected project. Section 3 in the figure shows the palette window where UML-RT model elements can be created. Section 4 is the model explorer window for navigating across all the model elements. Section 5 is the properties window which displays the properties of the selected model elements either from the model explorer view or from the editor view.

Papyrus-RT Code Generation

Code generation as an automated process has met with limited success [27]. Papyrus-RT provides code generation by generating executable C++ code from the model. In this process, both the structural and behavioral elements of the model are included. The output of the code generator is a C/C++ Development Tooling (CDT) project. CDT is a collection of tools provided to support C/C++ development.

Figure 2.7: Screenshot of Papyrus-RT tool

As depicted in Figure 2.8, in the initial step the code generator transforms the UML-RT model to xtUMLrt. xtUMLrt is an intermediate representation that intends to simplify UML and allow the future extension of the code generation to the xtUML language and possibly others. After the process of translation, all the elements other than the state machines are translated to instances of the C++ meta-model [60]. In the final phase, the actual C++ source files and the CDT project are generated. The C++ meta-model separates the generator from issues such as formatting, file regeneration avoidance, etc. During the process of translation from xtUMLrt to the C++ meta-model, for each type of element generated in the C++ model, the generator gathers a set of dependent elements. The C++ program in textual notation is then generated from the model representing it and written out to a file. As a result, the code generator is not affected by changes in the concrete syntax of the C++ program.

Figure 2.8: The code generation process used by the Papyrus-RT code generator [60]

The final generated code contains all the necessary header files, source path, and other required files. The code generator is designed to support incremental generation, which means that if the code has been generated previously, the code generator will be invoked on only the changed elements and will generate code only for the changed elements and their dependent elements [60]. We will use this capability later for our patch generation.

2.3.3 JGit

Git is a distributed source and version control system used to efficiently track software projects. The Java implementation of Git is known as ‘JGit’. JGit is a set of libraries which helps the user to keep copies of every revision of their projects [2]. Git also has an Eclipse integration, developed in Java, known as EGit. EGit uses JGit as its Git implementation.
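As a minimal illustration of what JGit offers, the sketch below clones a repository and walks its commit history programmatically. It is only a sketch: the repository URL and the target directory are placeholders and not part of any project described here.

```java
import java.io.File;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.revwalk.RevCommit;

public class JGitCloneExample {

    public static void main(String[] args) throws Exception {
        // Placeholder URL and directory; substitute a real repository to run this.
        String remoteUrl = "https://example.org/models/sample-model.git";
        File workDir = new File("sample-model-clone");

        // Clone the remote repository into a local working directory.
        try (Git git = Git.cloneRepository()
                .setURI(remoteUrl)
                .setDirectory(workDir)
                .call()) {
            // Walk the commit history; each commit id can later be used to
            // retrieve the model version stored at that revision.
            for (RevCommit commit : git.log().call()) {
                System.out.println(commit.getName() + " " + commit.getShortMessage());
            }
        }
    }
}
```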

2.3.4 Eclipse Modeling Framework (EMF)

EMF is a robust Eclipse framework and code generation facility which provides support for the development of Java applications that use and manipulate complex object models. EMF was developed to provide a practical solution to balance the needs of a mainstream Java programmer, an XML schema specialist or a modeling expert. As depicted in Figure 2.9, EMF unifies three important technologies, namely Java, XML and UML. Thus, object models can be defined using a UML modeling tool or an XML Schema or even by specifying simple annotations on Java interfaces [29].

Figure 2.9: EMF Model unifying XML, UML and Java

EMF Model Differencing

Model differencing is usually used to compare two models. Model comparison is possible whether the contents are in a persistent storage medium, such as XML files or relational databases, or exist as runtime objects. As shown in Figure 2.10, ‘model differencing’ involves a number of steps such as calculation, representation of the calculated results in some form, followed by visualization of the representation in a human-readable format [28].

Figure 2.10: Model Differencing Phases [28]

EMF Compare

EMF Compare is an Eclipse project aimed at providing model comparison and merge capabilities [28]. The ability to discover the differences between model versions becomes important as a model development and management practice, for the sake of system design, quality requirements or to track the evolution of the model during its lifetime. Therefore, ‘EMF Compare’ was initiated at Eclipse Summit Europe in 2006 to satisfy the need for a model comparison engine. The EMF Compare component is pluggable and highly extensible. Based on their needs, users can customize each section of the entire comparison process [25]. Figure 2.11 shows the two forms in which the compare result can be visualized. In our research, we used the EMF Compare libraries to compare the user-specified models. The result of the comparison is then used as input for an impact analysis.

(a) EMF Compare based on Containment Features

(b) EMF Compare based on Diagrams

Figure 2.11: EMF Compare [4]
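As a rough illustration of how such a comparison can be obtained programmatically, the sketch below loads two versions of a UML resource and asks EMF Compare for their differences. The file paths are hypothetical, and the setup assumes that the UML2 and EMF Compare libraries are available on the classpath; it is not the configuration used by our prototype.

```java
import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.compare.Comparison;
import org.eclipse.emf.compare.Diff;
import org.eclipse.emf.compare.EMFCompare;
import org.eclipse.emf.compare.scope.DefaultComparisonScope;
import org.eclipse.emf.compare.scope.IComparisonScope;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.emf.ecore.resource.impl.ResourceSetImpl;
import org.eclipse.uml2.uml.resources.util.UMLResourcesUtil;

public class CompareExample {

    public static void main(String[] args) {
        // Two resource sets, one per model version, so the versions stay isolated.
        ResourceSet leftSet = new ResourceSetImpl();
        ResourceSet rightSet = new ResourceSetImpl();
        UMLResourcesUtil.init(leftSet);   // register UML packages and resource factories
        UMLResourcesUtil.init(rightSet);

        // Hypothetical locations of the two extracted .uml files.
        Resource left = leftSet.getResource(URI.createFileURI("v/model.uml"), true);
        Resource right = rightSet.getResource(URI.createFileURI("vprime/model.uml"), true);

        // Compare the two resources and print each detected difference.
        IComparisonScope scope = new DefaultComparisonScope(left, right, null);
        Comparison comparison = EMFCompare.builder().build().compare(scope);
        for (Diff diff : comparison.getDifferences()) {
            System.out.println(diff.getKind() + ": " + diff);
        }
    }
}
```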

2.3.5 Epsilon

Epsilon [48] stands for “Extensible Platform of Integrated Languages for mOdel maNagement”. It is a platform comprising many interoperable languages for model management tasks, such as model-to-model transformation, model validation, model comparison, code generation, model merging, migration and refactoring. Epsilon has strong support for EMF [44]. The Epsilon languages can be used to manage models of various metamodels and technologies. Epsilon provides multiple languages for specific tasks such as [48]:

• Epsilon Object Language (EOL): Is a model management scripting language that mixes capabilities of a programming and querying language.

• Epsilon Comparison Language (ECL): Is a rule-based language to identify matching elements between homogeneous and heterogeneous models.

• Epsilon Pattern Language (EPL): Is a pattern matching language to be used as input for model transformations.

• Epsilon Wizard Language (EWL): Is a task-specific language, tailored to support defining and executing update transformations on models of diverse metamodels.

• Epsilon Transformation Language (ETL): Is a rule-based model-to-model transformation language that allows the use of multiple inputs and outputs.

• Epsilon Validation Language (EVL): Is a model validation language for evaluating inter-model constraints and supporting dependencies.

• Epsilon Generation Language (EGL): Is a template based code generation language which supports text generation from models and the merging of models.

• Epsilon Merging Language (EML): Is a rule-based language for merging different models and meta-models together.

• Epsilon Flock: Is a model migration language meant to update a model as a consequence of any meta-model changes.

Figure 2.12: Epsilon Module Structure [48]

Epsilon Object Language

Epsilon Object Language (EOL) is an imperative programming language for creating, querying and modifying EMF models [14]. EOL is a combination of the Object Constraint Language (OCL) and JavaScript. This combination gives it programming-language features from JavaScript, such as variable management and looping constructs, along with the features of an enriched querying language present in OCL, such as query chains and syntactic checking. The main goal of EOL is to provide a reusable set of common model management capabilities, on top of which task-specific languages can be implemented [48]. EOL programs are arranged in modules. Figure 2.12 shows an EOL module structure. Each module defines a body and some operations [48]; a small sketch of executing such a module from Java follows the list below.

• Body: A block of statements that are evaluated when the module is executed.

• Operation: Defines the type of objects on which it is applicable (context), a set of parameters etc.
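Assuming the Epsilon EOL engine and its EMF driver are on the classpath, a module of this shape (a body plus one operation) can be parsed and executed from Java as sketched below. The query itself and the model name ‘Model’ are illustrative only and not taken from our tool.

```java
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.epsilon.emc.emf.InMemoryEmfModel;
import org.eclipse.epsilon.eol.EolModule;

public class EolModuleExample {

    // A tiny EOL module: the body prints a description of every UML Class
    // (capsules appear as stereotyped Classes), using a helper operation.
    private static final String EOL_SOURCE =
        "for (c in Class.all) {\n" +
        "  c.describe().println();\n" +
        "}\n" +
        "\n" +
        "operation Class describe() : String {\n" +
        "  return self.name + ' owns ' + self.ownedAttribute.size() + ' attribute(s)';\n" +
        "}\n";

    public static void run(Resource umlResource) throws Exception {
        EolModule module = new EolModule();
        module.parse(EOL_SOURCE);
        // Expose the already-loaded UML resource to the script under the name 'Model'.
        InMemoryEmfModel model = new InMemoryEmfModel("Model", umlResource);
        module.getContext().getModelRepository().addModel(model);
        try {
            module.execute();
        } finally {
            module.getContext().getModelRepository().dispose();
        }
    }
}
```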

Figure 2.13: Epsilon Profiling Summary View [47]

Epsilon Profiling

In general, profiling is the process of measuring the performance metrics of the code. The ‘Epsilon Profiler’ provides integrated support for profiling the Epsilon code [47]. It has a GUI that displays sections of code that are more commonly executed. Moreover, it allows for detection of variations in execution time. This enables engineers to inspect the execution times and identify potential issues. Figure 2.13 shows the summary view where the two targets have been executed 1 and 1973 time(s) and each took 1292 ms to complete the execution.

Chapter 3

Related Work

3.1 Related Work

Since our work integrates model comparison, impact analysis, code generation and patch formation, the efforts of previous researchers in these areas have helped us all along during our research. Several researchers have worked on individual parts of what we combine in our work.

3.1.1 Model Versioning

Our work is related to the model versioning work done by Kehrer et al. [68], where two models are compared to each other by using a version control system. Using the rule-based approach presented in this paper, low-level differences are computed, which removes the need for specific transformers depending on the model type. An early approach by Alanen et al. discussed the operation-based difference between MOF models and their union in the context of a version control system [13]. They presented three different algorithms: for model difference calculation, for merging a model with its model differences, and for computing the union of models.

Their work is specific to MOF and not to UML. Altmanninger et al. [15] presented their version control system, called “SMoVer”, to detect conflicts between model versions; it is based on an EMF-based modeling language. It is a state-based approach in which two model versions are compared using the version control system and the conflicts are determined. To support EMF-based models, we have used Git for storage and retrieval and EMF Compare to perform the comparison.

3.1.2 Impact Analysis

Impact analysis is a big field in software engineering, and different approaches to compute different kinds of impact in different kinds of artifacts have been proposed. Zhang et al. [37] conducted such a case study on an industrial project to help understand which set of dependency criteria were applicable in real-world practice compared to the theoretical research. Based on the existing literature and this case study, they made improvements and proposed a new dependency model. The requirement dependency is classified as ‘intrinsic dependency’, which considers the business impact and evolution, and ‘additional dependency’, which takes the business value and cost into account. In terms of the cost aspect of ‘additional dependency’, it is reported that more than 85 percent [52] of software system cost goes towards maintenance and operation. A work by Goknil et al. [17] tries to tackle this aspect of how the cost of a change is usually higher than intended. It follows the theory that the logical change requirements at the outset do not take the impact of those requirement changes on the existing architecture into account. The paper also focuses on improving change impact analysis by formally defining textual requirements and mapping them to the changes and the impact they have on the metamodel. In this paper, formal semantics are defined to classify requirements. Different requirement-specific relationships, such as ‘Requirement relation’, ‘Requires relation’, ‘Contain relation’, etc. are defined. In our work, we focus more on element-specific relationships and how they impact the elements across the compared versions. Yet, defining the relationship set, regardless of the kind of relationship, be it requirement-specific or element-specific, for formally determining the impact of changes is a common feature between their work and ours.

Until now, some researchers have studied impact analysis on models and provided structured solutions for it. Briand et al. [53] verified the UML diagrams in UML models and analyzed the impact of changing one model element on other model elements. In this work, they provided a definition of ‘Impact Analysis’ by identifying the major outcomes of a change. They also discussed the different problems that arise while changing the model elements and defined probable solutions to the problem by making better decisions well in advance. To do this, they presented a prototype tool which caters to their impact analysis methodology. Similar to this paper, we too have defined the relationships between model elements. However, our work makes use of UML-RT instead of UML models. Their work defines categories based on changes to properties and attributes. It lays down the formal notations and a conceptual model for the ‘impact analysis’ and how elements can be impacted based on their distances. Our work, on the other hand, makes a broad generalization for impact analysis based on the hierarchy and specific location of a model element with respect to another, say a containment relationship. In another work, Briand et al. [54] presented the above mentioned model-based approach for UML models and demonstrated it by using the example of the ‘cruise control system’.

Another work by Kuryazov et al. [33] has introduced an operation-based approach to represent model differences. With the help of the model differences, they have introduced a ‘Delta operation language’ (DOL) for representing the differences. Their work is metamodel independent and can be applied to UML activity diagrams to compute the resultant ‘delta’. Our work similarly creates a delta patch as a result of code generation. Kung et al. [51] have discussed the classes that may be impacted by a change in a given class and the impact of object-oriented characteristics such as polymorphism, encapsulation and inheritance on the analysis. Bohner and Arnold [18] conducted impact analysis to evaluate the impact of the changes on system models and code. In our work, we outline an algorithmic approach for impact analysis involving elements such as capsules and protocols in a UML-RT model.

3.1.3 Build

A survey conducted by Gary et al. between 2001-02 concluded that the build times in software projects were inconvenient, ‘overly high’ and the tools insufficient. For example, in their study conducted for a project at Lawrence Livermore National Laboratory, the build times ranged between 1 and 2 hours [43]. Our work discusses some problems that occur in a model-driven development (MDD) environment when many developers constantly make several changes on the same model.

Chapter 4

Automating the Build Process

One of the key problems with the existing UML-based MDD process and tools such as Papyrus-RT is that they do not enable a collaborative mode of development and provide poor support for tracking the evolution of the model across its versions. Also, there is no patch generation mechanism in place. An online repository such as Git can help achieve a collaborative mode of development and ensure an efficient versioning capability. The evolution of the model can be captured by a patch by identifying specifically which model elements were affected by the changes made. This patch, which optimizes the build, is the result of a series of steps in our approach. In this chapter, the approach for the ‘Build Process’ is explained in detail. A brief overview of the build process is provided in Section 4.1. The configuration required for our work is described in Section 4.2. Section 4.3 explains the relationship analysis between the model elements and the algorithms used to perform the impact analysis on the model. The code generation is explained in Section 4.4. Section 4.5 provides the overview of patch generation and build process in detail.

4.1 Project Overview

This section gives an overview of our work. The main goal of this research is to optimize the build process of MDD. Analyzing the impact of the changes between two versions of a model is the key to avoid unnecessary code generation and compilation. One way to implement the impact analysis is through a process similar to ‘slicing’. For programming languages, slicing refers to a form of static analysis (i.e., it is an analysis performed prior to run-time) that removes parts of the program that can be shown to not impact a slicing criterion (such as the value of a variable at a certain place in the program) [38]. In our setting, we will compute the changed and impacted elements of the model so that we can leave out the elements that can be shown to not be influenced by the change and thus not requiring regeneration or recompilation. We define changed and impacted elements as follows:

• changed model elements: Refers to the set of model elements obtained after performing the model comparison using EMF Compare.

• impacted model elements: Refers to the set of model elements impacted by the changed model elements. Impact tries to capture the need for regeneration or recompilation and will be computed using different kinds of relationships between model elements such as containment. The analysis is implemented using the Epsilon Object Language (EOL).

The next process after the impact analysis is the code generation process. This means that the code regeneration applies specifically to the changed and the impacted model elements and not to all model elements in the model version. We achieve ‘slicing’ for our models through the combined use of the existing processes and technologies of Git, EMF Compare, Epsilon, and code generation. For this, we created an Eclipse plug-in project, which imports or defines all needed code. The models that we are using to evaluate our work have been developed using an Eclipse plug-in known as ‘Papyrus-RT’. We have used ‘Papyrus-RT code generation’ to generate the code for the models.

Figure 4.1: Overview of Our Approach

As illustrated in Figure 4.1, the process begins when the user has two different versions V and V′ of the same model. The model versions are stored in an online Git repository, as opposed to conventional local storage, and the user can access the model versions through the commit-ids used in Git. Then, we traverse the model versions to extract their UML files. These UML files are compared using EMF Compare to get the changed model elements between the model versions. As discussed in Section 2.3.4, EMF Compare takes the UML files as input and calculates the difference between them based on the attribute and reference changes. The resultant difference is a set of changed model elements between the model versions. Once we have the set of ‘changed model elements’, we use it to perform the ‘Impact analysis’ which gives us the set of ‘impacted model elements’. The code generator is customized to take only the impacted model elements as the input and to output the code for them. Moreover, the code generator creates a makefile as its output. Afterward, the compilation is performed using the makefile, which contains the instructions to create a ‘patch’. A patch is a self-contained file containing all the differences between two versions of files [50]. In our project, a patch is the collection of compiled files of the impacted model elements. Figure 4.1 shows that the patch is the final result of our process. This patch will help in making decisions for the release process and in checking the historical changes between the model versions.

4.2 Configuration

In this section, a detailed description of the configurations done for the project is provided. The configuration comprises two related sub-sections:

• Model Versioning

• Model Comparison

4.2.1 Model Versioning

Model versioning is a set of features that allows developers to maintain the history of the evolution of models [46]. When different people in a team wish to collaborate on a CDD project, issues related to code synchronization could arise. Traditional software projects tackle this problem by using version control systems (VCS) such as CVS or Git to store and merge their code online. This enables them to work independently and to integrate their versions of the code back into their main project repository. Even for MDD, different developers working on the same model should have the ability to synchronize their model versions. Also, in many existing MDD tools, whenever an incremental code generation is performed, developers need to have the source code present in their local workspace. Having to store the source code versions for each model version is an overhead. In our project, we address both of these issues. We provide the model developers with the facility to perform the incremental code generation by taking any two versions of the same model without the need for having their source code in place. For that, it is not required that the models be stored locally; instead, they are kept in a Git repository so that they are easily accessible from anywhere and at any time.

Figure 4.2: Example of conflict in changes made on the same model

Example

Consider an example shown in Figure 4.2 wherein the team members Jack and Jill are simultaneously working on different versions of the same model V. Jack changes the model to V′, which involves a minor referential change from Author to Book. At the same time, Jill changes it to V′′, combining the model elements Author and Book. This is an example of a conflict that can occur when people are working in a distributed environment. Model versioning using an online Git repository can help identify conflicts, manage conflicts and handle merges effectively to yield the correct version V*. Therefore, if both Jack and Jill were working simultaneously on a Git-based repository and then pushed their versions back to the repository through their branches, all the model versions along with their latest source code would be available in the Git repository. Hence, conflicts like the one described earlier could be avoided.

Figure 4.3: Illustration of how our tool accesses the Git repository to retrieve two model versions

Model Versioning Approach

In our work, we make use of Git to extract the UML file of model versions stored in the online repository. In Figure 4.3, the Git repository contains different model versions V and V′. The user can clone the remote repository and use a local repository to traverse through the model. Once the ‘repository path’ and the ‘commit-ids’ of V and V′ are provided by the user, our prototype tool utilizes the Java Git (JGit) libraries to traverse through the stored model versions in the Git repository. Once the model versions are located, their corresponding UML files are extracted. These extracted UML files are then used for model comparison.
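The sketch below outlines this retrieval step under the assumptions just described: given a local clone, a commit-id and the path of the .uml file inside the repository, it returns the file content as it existed in that model version. The class name, file path and commit-id strings are placeholders rather than values from our prototype.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectLoader;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.treewalk.TreeWalk;

public class ModelVersionReader {

    /** Returns the content of a repository file (e.g. a .uml file) at the given commit. */
    public static String readFileAtCommit(File repoDir, String commitId, String filePath)
            throws Exception {
        try (Git git = Git.open(repoDir)) {
            Repository repo = git.getRepository();
            ObjectId commitObjectId = repo.resolve(commitId);
            try (RevWalk revWalk = new RevWalk(repo)) {
                RevCommit commit = revWalk.parseCommit(commitObjectId);
                // Locate the file in the commit's tree and read its blob.
                try (TreeWalk treeWalk = TreeWalk.forPath(repo, filePath, commit.getTree())) {
                    if (treeWalk == null) {
                        throw new IllegalArgumentException(filePath + " not found at " + commitId);
                    }
                    ObjectLoader loader = repo.open(treeWalk.getObjectId(0));
                    return new String(loader.getBytes(), StandardCharsets.UTF_8);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        File clone = new File("sample-model-clone");           // placeholder local clone
        String v  = readFileAtCommit(clone, "<commit-id of V>",  "model/Sample.uml");
        String v2 = readFileAtCommit(clone, "<commit-id of V'>", "model/Sample.uml");
        System.out.println("V: " + v.length() + " chars, V': " + v2.length() + " chars");
    }
}
```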

4.2.2 Model Comparison

Comparison of text files is a fairly established process based on string comparison. However, model comparison requires a different approach due to the rich structure and semantics of models. Model comparison refers to the process of identifying similarities or differences between various model elements. The primary job of model comparison in model-centric version control is to detect model version changes by computing the mappings and the differences between model versions [55]. To perform the model comparison manually is tedious, time-consuming and prone to errors [55]. Thus, we have automated the model comparison step by using the Eclipse Modeling Framework Compare (EMF Compare) which is provided as a part of the Eclipse Modeling Framework (EMF). EMF’s metamodel (Ecore) is an implementation of EMOF, a subset of the Meta Object Facility (MOF) which is the basis for many modeling languages such as UML [12]. We use EMF Compare for change detection between two different versions V and V′ of the same model. EMF Compare uses a similarity-based approach and uses custom matching algorithms to detect the differences. The comparison result between V and V′ is saved and queried in an XMI format known as differences [3]. All the comparison information such as matches, sub-matches and so on is contained in the ‘Comparison’. ‘Comparison’ is the root node of the differences schema.

Types of Differences:

The model elements are defined by their properties and attributes. Also, attributes can belong to the model elements or properties. There are three types of differences in EMF Compare [14, 3]. They are:

• ReferenceChange (RC): Refers to the change detected when a reference value is changed. These values can either be added, deleted or moved.

• AttributeChange (AC): Refers to the change which is identical to the ‘ReferenceChange’ except that it applies to attributes rather than references.

• ResourceAttachmentChange (RAC): Refers to the change detected when a root of the compared resource is modified.

Model Comparison Approach

While comparing the two model versions, developers can get overwhelmed with too many details, and analyzing the differences could become tedious. EMF Compare provides the means to find the closest container of the changed elements. In our work, for any change in a lower-level model element, we recursively search until we find its closest higher-level capsule or its closest higher-level protocol. EMF Compare provides the comparison mechanism between two model versions V and V′ via the ‘getDifferences’ method, which collects all the differences between the models in a ‘difference’ set. We iterate through the changed model elements in the ‘difference’ set and find the higher-level container for each of them. We extract the ‘type’ (e.g. Class, Package, etc.) and ‘id’ of the higher-level container. In a model UML file, the capsules are of ‘type’ Class and protocols are of ‘type’ Package. The ‘type’ and ‘id’ of these higher-level containers are the output of the model comparison process which will be used for the ‘Impact Analysis’ process.

Figure 4.4: Model Comparison approach using EMF Compare

Figure 4.4 shows EMF Compare’s comparison window for one of the sample models we used. To get the same results in the code, we have made use of the EMF Compare libraries. As shown in Figure 4.4, a new message has been added to the ‘detection’ protocol. In our approach, we get the difference as ‘in message()’. In this example, the ‘detection’ protocol is the closest higher-level container which will be added to the set of changed model elements.
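A simplified sketch of this container lookup is given below. It assumes a Comparison obtained as illustrated in Section 2.3.4 and approximates the ‘closest higher-level capsule or protocol’ by the nearest enclosing UML Class or Package; the actual prototype additionally records the elements’ ids and handles further UML-RT specifics.

```java
import java.util.LinkedHashSet;
import java.util.Set;

import org.eclipse.emf.compare.Comparison;
import org.eclipse.emf.compare.Diff;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.uml2.uml.Package;

public class ClosestContainerCollector {

    /** Collects, for every difference, the nearest enclosing UML Class or Package. */
    public static Set<EObject> changedContainers(Comparison comparison) {
        Set<EObject> containers = new LinkedHashSet<>();
        for (Diff diff : comparison.getDifferences()) {
            // The match links the differing elements of both versions; prefer the new
            // (right) side and fall back to the old (left) side for deletions.
            EObject element = diff.getMatch().getRight();
            if (element == null) {
                element = diff.getMatch().getLeft();
            }
            EObject container = closestClassOrPackage(element);
            if (container != null) {
                containers.add(container);
            }
        }
        return containers;
    }

    private static EObject closestClassOrPackage(EObject element) {
        EObject current = element;
        // Walk up the containment hierarchy until a Class (capsule) or Package
        // (protocol container) is reached, or the root is passed.
        while (current != null
                && !(current instanceof org.eclipse.uml2.uml.Class)
                && !(current instanceof Package)) {
            current = current.eContainer();
        }
        return current;
    }
}
```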

Figure 4.5: Impact Analysis Process

4.3 Impact Analysis on Model

Impact analysis is defined as the process of detecting the possible consequences of a change or of evaluating what needs to be modified to accomplish a change [26, 53]. It helps us estimate the risks associated with the changes. In order to calculate the impact on model elements, the relationships and mappings between the model elements need to be identified. The steps for finding the impacted model elements are described below.

4.3.1 Querying the UML Model

In order to find the relationships between model elements, the UML model must be queried. Epsilon takes UML models that conform to the UML Ecore metamodel as input. We have used Epsilon's EOL (as discussed in Section 2.3.5) because it facilitates the formulation of complex queries. EOL is a metamodel-independent language and can manage models represented using various technologies such as MOF, EMF, and XML. Figure 4.5 displays the impact analysis process.

Figure 4.6: Relationships for evaluating Impact Analysis

For querying the model, EOL requires the following inputs:

• The UML file of the specific model version V′ in Git after model comparison.

• A set of changed model elements computed by the model comparison.

• The ‘model name’ of that specific model version (V′).

The output of the model comparison is a list of capsules and protocols. As discussed in Section 2.3, capsules are the basic building block of a UML-RT model, and protocols define the communication between capsules [7]. Therefore, our work primarily focuses on these two types of model elements. Using the EOL-based queries, the UML file of model V′ is traversed to find the impacted elements based on the relationships defined in Section 4.3.2.
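The thesis formulates these queries in EOL. Purely as an illustration of the kind of traversal involved, the following Java sketch loads the .uml file of version V′ with the UML2 API and collects the capsules; the stereotype name 'UMLRealTime::Capsule' and the assumption that the UML Model is the first root of the resource are ours, not taken from the thesis.

import java.util.ArrayList;
import java.util.List;
import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.emf.ecore.resource.impl.ResourceSetImpl;
import org.eclipse.uml2.uml.Class;
import org.eclipse.uml2.uml.Element;
import org.eclipse.uml2.uml.Model;
import org.eclipse.uml2.uml.resources.util.UMLResourcesUtil;

public final class ModelQuerySketch {

    private ModelQuerySketch() {
    }

    public static List<Class> collectCapsules(String umlPath) {
        ResourceSet rs = new ResourceSetImpl();
        UMLResourcesUtil.init(rs);                        // register UML support for standalone use
        Resource res = rs.getResource(URI.createFileURI(umlPath), true);
        Model model = (Model) res.getContents().get(0);   // assumes the Model is the first root

        List<Class> capsules = new ArrayList<>();
        for (Element e : model.allOwnedElements()) {
            // Assumption: Papyrus-RT marks capsules with the UMLRealTime::Capsule stereotype.
            if (e instanceof Class
                    && ((Class) e).getAppliedStereotype("UMLRealTime::Capsule") != null) {
                capsules.add((Class) e);
            }
        }
        return capsules;
    }
}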

4.3.2 Relationships between Model Elements

The impact a model element can have on other model elements depends on its relationships with those elements. Thus, it is essential to identify these relationships to analyze the overall impact of the changes made. The relationships are:

• Use relationship

• Communication relationship

• Containment relationship

With the defined relationships, the model elements can be directly or indirectly (through transitive closure) impacted by the change. The ‘Use relationship’ and the ‘Communication relationship’ only check for directly impacted model elements, while the ‘Containment relationship’ may propagate the impact of both of these to all parent elements. Figure 4.6 shows how the changed model elements are the input for the impact analysis, where the relationships are determined and the impacted model elements are evaluated.

Use Relationship

Figure 4.7: Example of use relationship between capsule, port and protocol

Inside a model, the messages sent and received are defined in a protocol, and each capsule may or may not use a specific protocol. The ‘Use relationship’ refers to the case where a capsule contains a port typed by a protocol that it uses to communicate with other capsules. Figure 4.7 shows a capsule ‘CapsuleA’ connected to a port, and the port is typed by a protocol ‘ProtocolP’. We can conclude that any change in a protocol can affect the capsules using that protocol in the UML-RT model version. Our approach is to traverse all the capsules and determine whether the changed protocol is associated with any of them.
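A possible realization of the 'uses' check on top of the UML2 API is sketched below; it is our own approximation, assuming that a capsule's ports are its owned ports and that a port's type is contained in the protocol's package.

import org.eclipse.emf.ecore.util.EcoreUtil;
import org.eclipse.uml2.uml.Class;
import org.eclipse.uml2.uml.Port;
import org.eclipse.uml2.uml.Type;

// A capsule is taken to use a protocol if one of its ports is typed by an
// element belonging to that protocol's package.
public final class UseRelationship {

    private UseRelationship() {
    }

    public static boolean uses(Class capsule, org.eclipse.uml2.uml.Package protocol) {
        for (Port port : capsule.getOwnedPorts()) {
            Type type = port.getType();
            // Assumption: the port's type (e.g. the protocol collaboration) is
            // contained, directly or transitively, in the protocol package.
            if (type != null && EcoreUtil.isAncestor(protocol, type)) {
                return true;
            }
        }
        return false;
    }
}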

Communication Relationship

Figure 4.8: Example of communication relationship between capsules

This relationship refers to the scenario where two capsules are connected to each other through ports typed by the same protocol. In this relationship, if two capsules are connected via the same protocol, then any change in one capsule will impact the other capsule. Figure 4.8 shows the communication between two capsules through ports.
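A simple, deliberately over-approximating sketch of the 'connected' check is shown below (our own illustration): it treats two capsules as connected whenever they own ports typed by the same protocol element, rather than following the actual connectors between the ports.

import org.eclipse.uml2.uml.Class;
import org.eclipse.uml2.uml.Port;

// Two capsules are treated as connected if they own ports typed by the same
// protocol element; a stricter check would follow the connectors between ports.
public final class CommunicationRelationship {

    private CommunicationRelationship() {
    }

    public static boolean connected(Class c1, Class c2) {
        for (Port p1 : c1.getOwnedPorts()) {
            for (Port p2 : c2.getOwnedPorts()) {
                if (p1.getType() != null && p1.getType().equals(p2.getType())) {
                    return true;
                }
            }
        }
        return false;
    }
}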

Containment Relationship

Figure 4.9: Example of containment relationship between capsules

Inside a model, a single capsule may enclose multiple capsules, and this nesting can continue recursively. Figure 4.9 shows an example of a ‘Containment relationship’, which refers to the scenario where a parent capsule contains one or more child capsules. With such a relationship, a change in any child capsule can, in turn, affect its parent. Our task is to traverse the UML-RT model and derive the containment relationships among the capsules in order to determine whether there is any change and, if there is, ensure that all parent capsules containing, possibly transitively, a changed capsule are considered impacted.
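The 'contains' check can be sketched as follows (again our own approximation), assuming that capsule parts are represented as owned attributes typed by the contained capsule.

import org.eclipse.uml2.uml.Class;
import org.eclipse.uml2.uml.Property;

// A capsule 'parent' contains a capsule 'child' if one of the parent's parts
// (owned attributes) is typed by 'child'.
public final class ContainmentRelationship {

    private ContainmentRelationship() {
    }

    public static boolean contains(Class parent, Class child) {
        for (Property part : parent.getOwnedAttributes()) {
            if (child.equals(part.getType())) {
                return true;
            }
        }
        return false;
    }
}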

Larger sample model with all relationship types

Figure 4.10 shows an example model with all three relationship types. The figure shows the capsule diagram for the ‘Parcel_Router’ model, which we have taken as one of the sample models for evaluation purposes. This model contains different model elements and the relationship types discussed in Section 4.3.2. In Figure 4.10, all capsules reside under the ‘Parcel_Router’ capsule and are connected to each other through ports and connectors. Ports define the type of messages exchanged between capsules. In this model, the ‘enter’ port is typed with a protocol named ‘transmission’. The ‘transmission’ protocol is also used in the ‘Gen’, ‘Stage’ and ‘Bin’ capsules.

Figure 4.10: Example model for types of relationships

• Use relationship: According to this relationship, if in model version V′ any changes are made to the ‘transmission’ protocol, then all the capsules associated with this protocol will be impacted. In Figure 4.10, since the ‘Gen’, ‘Stage’ and ‘Bin’ capsules use the port ‘enter’ typed by the protocol ‘transmission’, these capsules are also considered impacted.

• Communication relationship: Considering the same example, if there is a change in the capsule ‘Gen’, which uses the ‘transmission’ protocol, all the other capsules using the same protocol will be impacted by the change. So, ‘Stage’ and ‘Bin’ will be impacted by the change in the ‘Gen’ capsule.

• Containment relationship: In Figure 4.10, all the capsules are contained inside the ‘Parcel_Router’ capsule. Therefore, a change in the ‘Stage’ capsule will impact the ‘Parcel_Router’ capsule because there is a containment relationship between them.

4.3.3 Notation for Relationships

Let V and V′ be two versions of the same model. Let c′, c′′, c₁′ and c₂′ be capsules and let p′ be a protocol. The binary relation ‘uses’ defines the ‘Use relationship’, ‘connected’ defines the ‘Communication relationship’, and ‘contains’ defines the ‘Containment relationship’.

1. c′ ∈ changed(V, V′) =⇒ c′ ∈ impacted(V, V′)

2. p′ ∈ changed(V, V′) =⇒ p′ ∈ impacted(V, V′)

3. (c′, p′) ∈ uses ∧ p′ ∈ changed(V, V′) =⇒ c′ ∈ impacted(V, V′)

4. (c₁′, c₂′) ∈ connected ∧ c₁′ ∈ changed(V, V′) =⇒ c₂′ ∈ impacted(V, V′)

5. (c′′, c′) ∈ contains ∧ c′ ∈ impacted(V, V′) =⇒ c′′ ∈ impacted(V, V′)

Above, Rules 1 and 2 capture the fact that changed capsules and protocols are considered impacted. Rule 3 captures the impact on a capsule c′ via a changed protocol that is used by c′. Rule 4 ensures that a capsule c₂′ connected to a changed capsule c₁′ is considered impacted. Rule 5 expresses that parents of impacted children are considered impacted as well.
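Read together, the rules can be summarized in a single definition. The following LaTeX rendering is our own paraphrase (not taken verbatim from the thesis); it makes explicit that impacted(V, V′) is read as the smallest set closed under the rules, and that only the containment rule is applied transitively:

\begin{align*}
impacted(V,V') \;=\; \text{the least set } X \text{ such that:}\quad
  & changed(V,V') \subseteq X,\\
  & \{\, c' \mid \exists\, p' \in changed(V,V') : (c', p') \in uses \,\} \subseteq X,\\
  & \{\, c_2' \mid \exists\, c_1' \in changed(V,V') : (c_1', c_2') \in connected \,\} \subseteq X,\\
  & \{\, c'' \mid \exists\, c' \in X : (c'', c') \in contains \,\} \subseteq X.
\end{align*}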

4.3.4 Algorithmic Approach

Assuming that impact is defined via the three relationships above, the complexity of computing the impacted elements in a model is as follows:

• We loop over all the model elements to find the dependencies between them. Our first step is to find the root element of the model. To search for protocols or capsules, we need to iterate over all the model elements.

• Likewise, to query the internal attributes or properties of capsules or protocols, we need to loop over these.

Algorithm 1 shows how the impact analysis proceeds based on the input values obtained from the model comparison with EMF Compare. The input CPS is the set of changed protocols, CCS is the set of changed capsules, and CS is the set of all capsules in the model. As discussed in Section 4.3.3, all elements of CPS and CCS are considered impacted, so they initialize the set of impacted protocols (IPS) and the set of impacted capsules (ICS), respectively. We first check whether IPS or ICS is non-empty. If there is no protocol change, we skip the ‘Use relationship’ check; if there are both protocol-related and capsule-related changes, we check all three relationships. The output of this function is the set of all impacted model elements, denoted IMS.

In Algorithm 2, we check the ‘Use relationship’. We iterate over each capsule c in CS and each protocol p in IPS and check whether c uses p. If it does, we consider c impacted and add it to ICS. The resulting set ICS is then carried forward as the input for the communication check.

In Algorithm 3, we check the ‘Communication relationship’. We iterate over each capsule c in CS and each capsule c′ in ICS and determine whether c is connected to c′. If it is, we consider c impacted and add it to ICS.

Algorithm 4 checks the ‘Containment relationship’. We again iterate over each capsule c in CS and each capsule c′ in ICS and determine whether c contains c′; if it does, c is added to ICS. This process continues until there are no higher-level containers left to add for any of the impacted capsules. Only the model elements present in ICS after the ‘Containment relationship’ check are sent on for code generation.

Algorithm 1: Impact Analysis
Input : CPS - set of changed protocols, CCS - set of changed capsules,
        CS - set of all capsules
Output: IMS - set of all impacted model elements
function ImpactAnalysis(CPS, CCS, CS):
    IPS := CPS    // IPS - set of impacted protocols
    ICS := CCS    // ICS - set of impacted capsules
    if (IPS = null) and (ICS ≠ null) then
        ICS := Communication(ICS, CS)
        ICS := Containment(ICS, CS)
    else
        ICS := Use(IPS, CS)
        ICS := Communication(ICS, CS)
        ICS := Containment(ICS, CS)
    IMS := IPS + ICS
end function

Algorithm 2: Impact Analysis based on Use Relationship
Input : IPS, CS
Output: ICS
function Use(IPS, CS):
    forall c ∈ CS do
        forall p ∈ IPS do
            if uses(c, p) then
                ICS := ICS + c
            end
        end
    end
    return ICS
end function

Algorithm 3: Impact Analysis based on Communication Relationship
Input : ICS, CS
Output: ICS
function Communication(ICS, CS):
    /* ICS is the set of impacted capsules computed using Algorithm 2 */
    forall c ∈ CS do
        forall c′ ∈ ICS do
            if connected(c, c′) then
                ICS := ICS + c
            end
        end
    end
    return ICS
end function

Algorithm 4: Impact Analysis based on Containment Relationship
Input : ICS, CS
Output: ICS
function Containment(ICS, CS):
    /* ICS is the set of impacted capsules computed using Algorithms 2 and 3 */
    repeat
        prevICS := ICS
        /* compute all impacted capsules ‘one step away’ and add them to ICS */
        forall c ∈ CS do
            forall c′ ∈ ICS do
                if contains(c, c′) then
                    ICS := ICS + c
                end
            end
        end
        /* if this iteration found more impacted capsules, continue; otherwise we are done */
    until prevICS = ICS
    return ICS
end function
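To make the control flow of Algorithms 1–4 concrete, the following plain-Java sketch mirrors them over opaque element handles. It is our own illustration rather than the thesis's implementation (which uses EOL); the relationship checks uses, connected, and contains are assumed to be supplied externally, for example by the UML2-based queries sketched in Section 4.3.2.

import java.util.HashSet;
import java.util.Set;

public abstract class ImpactAnalysisSketch<E> {

    // Relationship checks assumed to be provided by a concrete subclass.
    protected abstract boolean uses(E capsule, E protocol);
    protected abstract boolean connected(E capsule, E impactedCapsule);
    protected abstract boolean contains(E capsule, E impactedCapsule);

    /** Algorithm 1: combine the three relationship checks. */
    public Set<E> impactAnalysis(Set<E> changedProtocols, Set<E> changedCapsules, Set<E> allCapsules) {
        Set<E> ips = new HashSet<>(changedProtocols);   // impacted protocols
        Set<E> ics = new HashSet<>(changedCapsules);    // impacted capsules
        if (!ips.isEmpty()) {
            ics.addAll(use(ips, allCapsules));          // skip when no protocol changed
        }
        ics.addAll(communication(ics, allCapsules));
        ics.addAll(containment(ics, allCapsules));
        Set<E> ims = new HashSet<>(ips);                // all impacted elements
        ims.addAll(ics);
        return ims;
    }

    /** Algorithm 2: capsules using a changed protocol are impacted. */
    private Set<E> use(Set<E> ips, Set<E> allCapsules) {
        Set<E> result = new HashSet<>();
        for (E c : allCapsules)
            for (E p : ips)
                if (uses(c, p)) result.add(c);
        return result;
    }

    /** Algorithm 3: capsules connected to an impacted capsule are impacted. */
    private Set<E> communication(Set<E> ics, Set<E> allCapsules) {
        Set<E> result = new HashSet<>(ics);
        for (E c : allCapsules)
            for (E impacted : ics)
                if (connected(c, impacted)) result.add(c);
        return result;
    }

    /** Algorithm 4: propagate impact to containing capsules until a fixed point. */
    private Set<E> containment(Set<E> ics, Set<E> allCapsules) {
        Set<E> result = new HashSet<>(ics);
        boolean grew = true;
        while (grew) {
            grew = false;
            for (E c : allCapsules)
                for (E impacted : new HashSet<>(result))
                    if (contains(c, impacted) && result.add(c)) grew = true;
        }
        return result;
    }
}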

The impacted model elements will be used as the input for the code generation process (as discussed in Section 4.4). Figure 4.11 illustrates this process.

Figure 4.11: Overview of Code Generation

4.4 Code Generation

Code generation refers to the process of generating code from models. Papyrus-RT provides a code generation facility, as discussed in Section 2.3.2. For automatic code generation, we have made use of a standalone Papyrus-RT code generation application, referred to as the ‘Papyrus-RT code generator’. This code generator is customizable and is parametrized to take a list of model elements; these input model elements are the output of the impact analysis process. As displayed in Figure 4.12, the output of the Papyrus-RT code generator is a C/C++ Development Tooling (CDT) project. In our work, the output CDT project contains the header files and the C++ source files of the impacted model elements, but only the header files of the non-impacted model elements. The Papyrus-RT code generator generates a separate compilation unit (.cc/.hh pair) for each class, capsule, and protocol; it does not generate separate compilation units for the elements they contain, such as attributes, operations, parts, or ports. In our work, the code generator only takes capsules and protocols as input and generates the code for them. In addition to the code, a makefile is also generated that contains the instructions necessary to create a patch for the impacted model elements.

Figure 4.12: Code Generation implementation

4.5 Patch Generation and Build Process

As described in Section 4.4, the generated code comprises a makefile, header files, and the C++ source files. The makefile contains instructions for ‘make’ to perform different operations such as compiling, linking, and building the generated code [1]. To compile the generated code, we have created a target named ‘patch’ in the makefile. The rule associated with this target is executed by the ‘make patch’ command, which is responsible for generating a patch. This generated patch can be used to update the executable to reflect the changes (and their impact) without having to regenerate, recompile, and rebuild the code for the entire changed model.

Figure 4.13: Patch Generation implementation

In our work, a patch (denoted ∆) refers to the set of compiled object (.o) files for the impacted model elements only. Figure 4.13 shows the basic patch generation process. Let V′ be the model version that is compared to another model version V. Under our assumption that V has already been built at least once and its compiled files are in place, the generated patch ∆ can update the existing compiled files of V to produce the complete compiled code for the new model version V′. This can be expressed as:

Compiled Code(V′) = Compiled Code(V) + ∆

For testing the generated patch, we have written a script that combines the patch with the compiled files of version V. Then, as shown in Figure 4.14, we use the ‘make’ command to create the final build executable. The ‘make’ command links all the object files together to produce a binary file which can be executed.

Figure 4.14: Build Process implementation

Chapter 5

Evaluation

This chapter describes our evaluation approach. The evaluation mainly deals with validating the functioning of our prototype tool by measuring the time taken to complete the ‘Impact Analysis’ process and determining how many of the total model elements are impacted. We have performed tests using two versions of each of several sample models. Section 5.1 identifies a few basic assumptions that need to be satisfied for our approach to work. Section 5.2 describes the different types of changes that we have made to the models to validate the results. Section 5.3 describes the different models that have been created in Papyrus-RT for evaluation purposes; these sample models help us validate our approach and our prototype tool. Section 5.4 describes the results of the evaluation performed on the different sample models, including the time taken by our impact analysis approach and the results of optimizing the build based on the generated patch.

5.1 Assumptions

We have used different models for our evaluation. Our basic assumptions while evaluating the models are:

• The models have already been built at least once and we have the necessary source files of the model in place.

• The developer has access to the Git repository which contains the necessary UML files for the model versions.

• The model versions being compared have at least a few changes between them, i.e., V ≠ V′.

• Model version V′ is the result of the changes we have made manually for the purposes of carrying out our evaluation.

5.2 Types of Changes

In order to perform impact analysis on different models, we have broadly categorised the changes to a model into two types, described below. The differences between any two model versions are computed using EMF Compare, so the granularity and scale of the changes we make to a sample model to obtain a new version determine how the results vary. EMF Compare reports differences in terms of these broad categories of reference and attribute changes, and we have therefore evaluated the measured impact-computation time for both categories of changes.

• Reference Change - This change corresponds to changes made on the capsule diagram. It includes the addition, deletion or moving of any element in the model.

• Attribute Change - This change corresponds to changes made to any attributes of capsules or protocols in a model.

Figure 5.1: Ping Pong Model

5.3 Description of sample models

In order to test the automation of the build process at the model level, we used sample UML-RT models of various complexities and sizes. These models were created in Eclipse using Papyrus-RT. The primary measure of our evaluation is the time taken by our ‘Impact Analysis’ approach. We used UML-RT models of different sizes, ranging from models with few elements to models with many elements, to assess the efficiency of our approach. The UML files of the different models differ in attributes, elements, and model size. The models used for the evaluation of our impact analysis algorithm are as follows.

1. PingPong

PingPong is an example of a very simple UML-RT model. It is an extremely small model which is available as a sample model in the Papyrus-RT distribution. The model has a single ‘Top’ capsule containing two capsules: ‘Pinger’ and ‘Ponger’. These two capsules communicate with each other through a protocol called ‘PingPongProtocol’.

Both the Pinger and the Ponger have their own state machine. In the ‘Running’ state, Pinger sends an ‘onPing’ signal to the Ponger capsule to declare that it is in the ‘Running’ state. Upon receiving an ‘onPong’ signal from the Ponger capsule, a self transition in the ‘Running’ state of the Pinger state machine is triggered. The Ponger capsule also has its own state machine named ‘Running’ which in turn waits to receive an ‘onPing’ signal from the Pinger capsule.

2. Car Door Lock

It is a control system where one ‘CentralLock’ capsule is responsible for locking and unlocking the four car doors. The ‘CentralLock’ capsule controls the locking status of the ‘Lock’ capsule located inside each ‘Door’ capsule. The protocol named ‘Locking’ sends the message ‘Lock’ or ‘Unlock’ to the car doors.

Figure 5.2 shows that the model has the ‘Top’ capsule named ‘Car’, which consists of a ‘CentralLock’ capsule and four ‘Door’ capsules communicating with each other.

This model is slightly bigger than the ‘PingPong’ model.

Figure 5.2: Car Door Central Lock Model

3. Parcel Router

The Parcel Router simulates the process of sorting pieces of luggage at airports [39, 67]. This mechanism sorts parcels into bins according to the destination labels stuck on the parcels. The parcels are routed to the bins via a sequence of chutes. A parcel is scanned to determine its destination bin and is then routed to that bin. The parcel travels through ‘Stage’ capsules before reaching the destination bin. Each stage consists of a ‘Chute’ capsule which represents the belt, a ‘Switcher’ capsule which represents the switches, and a ‘Sensor’ capsule. A parcel enters one end of the stage and emerges at one of the two openings at the other end based on the routing decision of the ‘Switcher’. The ‘Switcher’ has to be set ahead of time so that the parcel can enter and be routed to the correct chute. The parcel travels through various stages, and at each ‘Stage’ capsule the intended direction has to be determined so that in the end the parcel reaches the required destination bin. A simulation of the parcel routing system is constructed with 4 destination bins, which are represented by 4 ‘Bin’ capsules.

Figure 5.3: Parcel Router Model

Figure 5.4: Refined - Parcel Router Model

This is a medium-sized model with 8 different classes and 3 protocols with 5 different types of transmission messages. The model has a ‘Parcel_Router’ capsule consisting of 8 different capsule parts which communicate with each other.

For our evaluation, we use 3 parcel router models of increasing complexity; the increase in complexity is due to a higher number of elements.

• Simplified parcel router - This ignores the concept of ‘jams’ that may occur in this model context. Hence, there are fewer transitions and complexities for it to handle.

• Parcel router - One feature of this model is its support for ‘jam control’. This means that any new parcel will not be generated or transferred from one chute to another before the previous jam clears out [21].

• Refined parcel router - This parcel router model is the most advanced of the three. It also accounts for parcels lost due to certain delay settings and fixes the problem of incorrect routing that may occur.

4. Rover

The Rover system models an autonomous vehicle that comprises three wheels driven by two engines. Built using a Raspberry Pi and running a Linux operating system, it can move forward, move backward, and rotate. It is equipped with a range of sensors to identify barriers and to gather environment data such as temperature. The model specifies the following rover behaviour: first, the rover drives forward; once a barrier is detected, the rover takes a 90-degree turn and begins to move forward again [11]. The complete rover behaviour was modeled in UML-RT with capsules that are connected to other capsules, ports that connect them, protocols that define their communication, and state machines that model capsule behaviour.

Figure 5.5: Rover Model

The rover architecture, from bottom to top, comprises 5 layers.

• Hardware Layer: The underlying Raspberry Pi with its GPIO pins and sensors makes up this layer.

• File System Layer: The Linux OS contains the file system which forms this layer.

• GPIO Class: Contains methods to get and set the pin directions, values etc.

• Rover Library: Contains UML-RT capsules representing parts of the rover that are necessary for specifying the behaviour of the rover such as sensors and motors.

• Application: Specifies the behaviour of the rover as described above.

The complexity of this model is higher than that of the other models. Figure 5.5 shows the layered architecture of the rover.

5.4 Results and Metrics

5.4.1 Validation

We validated the correctness of the implementation of impact analysis through manual inspection of the impact analysis results on the sample models used for evaluation.

5.4.2 Experimental System Configuration

Our evaluation was conducted using a single system, and all the configurations, whether hardware or software, were identical throughout the experiments. Our computer had a 2.7 GHz Intel Core i7 CPU with 16GB of LPDDR3 memory.

5.4.3 Evaluating Impact Analysis Time

We have calculated the impact analysis time using the profiling feature of Epsilon. As discussed in Section 2.3.5, profiling provides performance metrics. The time reported is only for the ‘Impact Analysis’ process and does not include the model loading time. For each of the six models used for evaluation, Table 5.1 indicates their size and complexity by showing the number of capsules, state machines, ports, protocols, and transitions they contain. Also, for each sample model, we have made a reference change and an attribute change; the measured impact analysis time is the average over these changes. As per our evaluation, the overall time required to perform the impact analysis on the sample models varies between 115 and 858 milliseconds. We can see the impact analysis time increasing with the complexity of the model. Informally, the most important factor influencing the analysis time is how many model elements satisfy the relationships relevant for the impact analysis, i.e., the relationships described in Section 4.3.2.

                                      Complexity            Time
Models                        C   Sm    P   Pr    T         [ms]
Ping Pong                     3    2    4    1    4          115
Car Door Central Lock         4    2    4    1   18          160
Simplified Parcel Router      8    5   23    3   14          221
Parcel Router                 8    5   23    3   25          398
Rover                         6    3   15    3   21          668
Refined - Parcel Router       9    8   49   11   34          858

C: Capsule, Sm: State machine, P: Port, Pr: Protocol, T: Transition

Table 5.1: Impact Analysis on Model For Overall Changes

Figure 5.6 shows a snapshot of the testing window for all five models, where we have made changes on the models and have used the prototype tool as a plug-in to run our application. Figures 5.7 and 5.8 show the profiled time of the impact analysis for one of the sample models.

Figure 5.6: Testing window with integrated plug-in

Figure 5.7: Summary of time calculation in Epsilon profiling window

Figure 5.8: Target for time calculation in Epsilon profiling window

5.4.4 Optimization Benefits

The result of the ‘Impact Analysis’ process, which we discussed in Section 5.4.3, provides us with a list of the model elements that are impacted by the changes between the model versions. The code generator then generates code only for these impacted elements. This facility of incremental code generation in the prototype tool, combined with the ability to compare any two versions of a model in a distributed environment, means that the build optimization is achieved by generating a patch only for the impacted model elements rather than for all model elements.

The results in Table 5.2 provide a comparative view of the total model elements needed for a complete code generation and the impacted model elements for the change made using our approach. In the scope of our evaluation, ‘Total model elements’ refers to the total number of capsules and protocols in a sample model, and ‘Impacted model elements’ is the sum of the impacted capsules and protocols for the given type of change we made to each model. The results in Table 5.2 are indicative of the optimization benefits for the changes made in each of the sample models. In the Ping Pong model, we find that for a particular reference change, code did not need to be generated for two of the four capsules, since only two capsules were impacted by this change. In the Car Door Central Lock model, we changed an attribute of a protocol; this protocol is used by all capsules, so all the model elements in the model are impacted and the complete code is regenerated for this attribute change. Similarly, for the other models we performed a reference or attribute change and obtained the corresponding optimization benefit. Table 5.2 also provides the comparison between the ‘Total model elements build time’ and the ‘Impacted model elements build time’.

Models                      Type of     TME   IME   TMEB        IMEB
                            Change                  Time (ms)   Time (ms)
Ping Pong                   Reference      4     2    3,277       2,448
Car Door Central Lock       Attribute      5     5    5,182       5,965
Simplified Parcel Router    Attribute     11     3    7,134       4,410
Parcel Router               Reference     11     5    7,613       5,159
Rover                       Reference      9     4    9,544       6,515
Refined Parcel Router       Attribute     19     6   11,025       8,293

TME: Total model elements, IME: Impacted model elements,
TMEB: Total model elements build, IMEB: Impacted model elements build

Table 5.2: Optimization benefit for select changes

These build times are defined as follows:

Total Model Elements Build Time (TMEB) = time(generate code for entire model) + time(compile all generated code) + time(build)

Impacted Model Elements Build Time (IMEB) = time(impact analysis) + time(generate code for impacted elements) + time(compile code for impacted elements) + time(patch generation) + time(build)

TMEB therefore includes the time taken for code generation, compilation, and build for all model elements, whereas IMEB includes the time taken for impact analysis, code generation, compilation, patch generation, and build for the impacted model elements only. As shown in the table, we found that the IMEB time is less than the TMEB time for five of the six sample models. In the ‘Car Door Central Lock’ model, however, the IMEB time is greater than the TMEB time due to the overhead of performing impact analysis and patch generation. We also found that, compared with building all model elements, our approach makes the build process faster even though we need to perform the impact analysis. Thus, the benefit outweighs the extra time we spend on impact analysis.

For small models such as the sample models used in our evaluation, the time saved by not having to regenerate and recompile elements that are not impacted by a change is often outweighed by the time required for the impact analysis. However, for large, industrial models such as the ones used at Ericsson, which had approximately 8000 sequence diagrams and several hundred capsules [30], the time saving using our approach can be significant.

The screenshots below show our approach using one of the sample models. Figures 5.9 and 5.10 show the window where we change the ‘Parcel_Router’ model by adding a new message to one of the protocols. We used the Eclipse Git service to push the changed model to the Git repository; Figure 5.11 shows this process. Finally, Figure 5.12 shows the window where we provide the commit ids of the two model versions and the repository path of the model, followed by the console output of the code generation process for the impacted model elements. After the code generation process, a CDT project is generated for the impacted model elements. Finally, using the ‘make patch’ command, the object files are generated for the impacted model elements, which collectively are referred to as a ‘patch’. The patch is then used to obtain the executable.

Figure 5.9: Changing model elements using Papyrus-RT

Figure 5.10: Newly Added model element in Papyrus-RT

Figure 5.11: Pushing model changes to the Git repository

Figure 5.12: Code Generation of Impacted elements

Chapter 6

Summary and Conclusions

6.1 Summary

The main goal of this research is to automate and optimize the build process at the model level for UML-RT models. We achieve this by integrating libraries available in the Eclipse environment. Our work eliminates the reliance on source code and allows developers to access model versions directly from an online repository. We have proposed and implemented a patch generation process which facilitates building the code only for the impacted model elements rather than for all model elements. The key highlights of our approach are:

• First, two model versions are compared using EMF Compare to determine the changes.

• Then, an impact analysis is performed to determine which model elements have been impacted by the change.

• From the results of the impact analysis, the code is generated and a patch is computed that captures the differences between the model versions.

• We have conducted a first preliminary evaluation of our prototype implementing the impact analysis and patch generation.

• The evaluation suggests that impact analysis and patch generation work correctly and can bring significant benefits at an acceptable cost.

6.2 Limitations and Future Work

A wide range of future work is possible. For example, a core aspect of our research was to perform the impact analysis between any two model versions, resulting in a patch. This could be extended in the future so that a model version and the set of successively generated patches could be leveraged to support regression testing. In this context, the impact analysis could be used to determine which tests need to be re-run after a change. Another key area of focus for future research could be performing the impact analysis based on passive classes. Currently, our work only supports capsules and protocols as inputs for the impact analysis process; this could be extended to other model elements, such as passive classes, as input.

6.3 Conclusion

Our work uses repositories supporting distributed development, model comparison, impact analysis, code generation, and patch generation, and ties them together to optimize the build process of code generated from models. We have worked with the Papyrus-RT tool and UML-RT models, and our approach is also applicable to UML models. Our implementation allows the comparison of any two model versions directly at the model level rather than at the code level. Also, the model files reside in a repository rather than in the local workspace. Our approach identifies situations in which the regeneration and rebuilding of parts of the model are unnecessary and can safely be avoided.

Our work also evaluates the time taken to perform the impact analysis using several models. The impact analysis process is important for incremental code generation because it ensures that code is generated only for the impacted model elements. The incremental code generation forms the basis for the ‘patch generation’. The generated patch is the collection of compiled files of the code generated for the impacted model elements. The patch helps optimize the process of generating the final build for the model version. Consequently, our work could provide significant optimization benefits by regenerating and rebuilding only the code associated with the impacted model elements, especially in real-world, large UML-based MDD tools.

Bibliography

[1] GNU Make. https://www.gnu.org/software/make/. [Accessed March, 2018].

[2] Eclipse - The Eclipse Foundation open source community website. https://www.eclipse.org/jgit/. [Accessed March, 2018].

[3] EMF Compare. https://www.eclipse.org/emf/compare/documentation/latest/developer/developer-guide.html. [Accessed March, 2018].

[4] EMF Compare - Compare and Merge Your EMF Models. https://www.eclipse.org/emf/compare/overview.html. [Accessed March, 2018].

[5] Epsilon Object Language. https://www.eclipse.org/epsilon/doc/eol/. [Accessed February, 2018].

[6] Modeling Language Guide, Rational Rose Realtime - ibm.com. ftp://ftp.software.ibm.com/software/rational/docs/v2003/win_solutions/rational_rosert/rosert_modeling_language.pdf. [Accessed April, 2018].

[7] Modeling Real-Time Applications in RSARTE - ibm.com. https://www.ibm.com/developerworks/community/wikis/form/anonymous/api/wiki/b7da455c-5c51-4706-91c9-dcca9923c303/page/35e0361a-c781-4ff2-a4cc-63a224a0b5c3/attachment/6c315754-c65a-4692-9f41-23fccae5c184/media/RSARTEConcepts.pdf. [Accessed April, 2018].

[8] Welcome to Apache Maven. http://maven.apache.org/. [Accessed March, 2018].

[9] Eclipse Papyrus for Real Time (Papyrus-RT). https://projects.eclipse.org/projects/modeling.papyrus-rt, 2017. [Accessed April, 2018].

[10] Aditya Agrawal, Gabor Karsai, and Ákos Lédeczi. An end-to-end domain-driven software development framework. In Companion of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 8–15. ACM, 2003.

[11] Reza Ahmadi, Nicolas Hili, Leo Jweda, Nondini Das, Suchita Ganesan, and Juergen Dingel. Run-time Monitoring of a Rover: MDE Research with Open Source Software and Low-cost Hardware. In Open Source for Model-Driven Engineering (OSS4MDE’16), 2016.

[12] R. Akers et al. A Model independent measurement of quark and gluon jet properties and differences. Z. Phys., C68:179–202, 1995.

[13] Marcus Alanen and Ivan Porres. Difference and Union of Models. In Perdita Stevens, Jon Whittle, and Grady Booch, editors, «UML» 2003 - The Unified Modeling Language. Modeling Languages and Applications, pages 2–17, Berlin, Heidelberg, 2003. Springer Berlin Heidelberg.

[14] Taghreed Altamimi and Dorina C. Petriu. Incremental Change Propagation from UML Software Models to LQN Performance Models. In Proceedings of the 27th Annual International Conference on Computer Science and Software Engineering, CASCON ’17, pages 120–131, Riverton, NJ, USA, 2017. IBM Corp.

[15] Kerstin Altmanninger. Models in Conflict – Towards a Semantically Enhanced Version Control System for Models. In Holger Giese, editor, Models in Software Engineering, pages 293–304, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.

[16] Subversion Apache. Subversion. https://subversion.apache.org/, 2018. Accessed April, 2018.

[17] Goknil Arda, Kurtev Ivan, den Berg Klaas van, and Spijkerman Wietze. Change impact analysis for requirements: A metamodeling approach. Information and Software Technology, 56(8):950 – 972, 2014.

[18] Robert S. Arnold. Software Change Impact Analysis. IEEE Computer Society Press, Los Alamitos, CA, USA, 1996.

[19] Colin Atkinson and Thomas Kuhne. Model-driven development: a metamodeling foundation. IEEE software, 20(5):36–41, 2003.

[20] Hailpern B. and Tarr P. Model-driven development: The good, the bad, and the ugly. IBM Systems Journal, 45(3):451–461, 2006.

[21] Mojtaba Bagherzadeh, Nicolas Hili, and Juergen Dingel. Model-level, Platform-independent Debugging in the Context of the Model-driven Development of Real-time Systems. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2017, pages 419–430, New York, NY, USA, 2017. ACM.

[22] K. Balasubramanian, A. Gokhale, G. Karsai, J. Sztipanovits, and S. Neema. Developing applications using model-driven design environments. Computer, 39(2):33–40, Feb 2006.

[23] Sami Beydeda and Volker Gruhn. Model-Driven Software Development. Springer- Verlag New York, Inc., Secaucus, NJ, USA, 2005.

[24] Jean Bézivin, Salim Bouzitouna, Marcos Didonet Del Fabro, Marie-Pierre Gervais, Frédéric Jouault, Dimitrios S. Kolovos, Ivan Kurtev, and Richard F Paige. A canonical scheme for model composition. In European Conference on Model Driven Architecture-Foundations and Applications, pages 346–360. Springer, 2006.

[25] Jean Bézivin, Antonio Vallecillo-Moreno, Jesús García-Molina, Gustavo Rossi, Douglas C Schmidt White, Andrey Nechypurenko, Egon Wuchner, Ed Merks, Nathalie Moreno-Vergara Beigbeder, Vicente Pelechano-Ferragud, et al. Monograph: Model-Driven Software Development. 2008.

[26] Shawn A Bohner and Robert S Arnold. An introduction to software change impact analysis. Software change impact analysis, pages 1–26, 1996.

[27] Selic Bran. The pragmatics of model-driven development. IEEE Software, 20(5):19–25, Sept 2003.

[28] Cédric Brun and Alfonso Pierantonio. Model differences in the eclipse modeling framework. UPGRADE, The European Journal for the Informatics Professional, 9(2):29–34, 2008.

[29] Frank Budinsky, David Steinberg, Ed Merks, Raymond Ellersick, and Timothy Grose. Eclipse Modeling Framework. 2003.

[30] Håkan Burden, Rogardt Heldal, and Jon Whittle. Comparing and contrasting model-driven engineering at three large companies. In Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM ’14, pages 14:1–14:10, New York, NY, USA, 2014. ACM.

[31] Zhixiong Chen and Delia Marx. Experiences with Eclipse IDE in Programming Courses. J. Comput. Sci. Coll., 21(2):104–112, December 2005.

[32] Michelle L. Crane and Juergen Dingel. UML vs. classical vs. rhapsody statecharts: not all models are created equal. Software & Systems Modeling, 6(4):415–435, Dec 2007.

[33] Kuryazov D. and Winter A. Representing Model Differences by Delta Operations. In 2014 IEEE 18th International Enterprise Distributed Object Computing Conference Workshops and Demonstrations, pages 211–220, Sept 2014.

[34] Foundation, Eclipse. EMFStore. https://www.eclipse.org/emfstore/index.html, 2018. Accessed April, 2018.

[35] Sébastien Gérard, Cédric Dumoulin, Patrick Tessier, and Bran Selic. Papyrus: A UML2 tool for domain-specific language modeling. In Model-Based Engineering of Embedded Real-Time Systems, pages 361–368. Springer, 2010.

[36] Jeff Gray, Ted Bapty, Sandeep Neema, Douglas C Schmidt, Aniruddha Gokhale, and Balachandran Natarajan. An approach for supporting aspect-oriented domain modeling. In International Conference on Generative Programming and Component Engineering, pages 151–168. Springer, 2003.

[37] Zhang He, Li Juan, Zhu Liming, Jeffery Ross, Liu Yan, Wang Qing, and Li Mingshu. Investigating dependencies in software requirements for change propagation analysis. Information and Software Technology, 56(1):40–53, 2014. Special sections on International Conference on Global Software Engineering – August 2011 and Evaluation and Assessment in Software Engineering – April 2012.

[38] Lallchandani J. T. and Mall R. A Dynamic Slicing Technique for UML Architectural Models. IEEE Transactions on Software Engineering, 37(6):737–771, Nov 2011.

[39] Magee Jeff and Kramer Jeff. Concurrency: State Models & Java Programs. John Wiley & Sons, Inc., New York, NY, USA, 1999.

[40] Kepler Johannes. AMOR. http://www.modelversioning.org/, 2018. Accessed April, 2018.

[41] Kepler Johannes. ModelCVS. http://www.modelcvs.org/, 2018. Accessed April, 2018.

[42] Benghazi Akhlaki K., Capel Tuñón M.I., Holgado Terriza J.A., and Mendoza Morales L.E. A methodological approach to the formal specification of real-time systems by transformation of UML-RT design models. Science of Computer Programming, 65(1):41–56, 2007. Special Issue on: Increasing Adequacy and Reliability of EIS.

[43] Gary K. Kumfert and Thomas Epperly. Software in the DOE: The Hidden Overhead of "The Build". Technical Report UCRL-ID-147343, 01 2002.

[44] Nafiseh Kahani, Mojtaba Bagherzadeh, James R. Cordy, Juergen Dingel, and Daniel Varró. Survey and classification of model transformation tools. Software & Systems Modeling, Mar 2018.

[45] Steven Kelly and Juha-Pekka Tolvanen. Domain-specific modeling: enabling full code generation. John Wiley & Sons, 2008.

[46] Altmanninger Kerstin, Seidl Martina, and Wimmer Manuel. A survey on model versioning approaches. International Journal of Web Information Systems, 5(3):271–304, 2009.

[47] Dimitrios S. Kolovos. An Overview of the Epsilon Profiling Tools, July 2007. http://www.eclipse.org/gmt/epsilon/doc/EpsilonProfilingTools.pdf.

[48] Dimitrios S. Kolovos, Louis Rose, Richard Paige, and A Garcia-Dominguez. The Epsilon Book. Eclipse, 2010.

[49] Dimitrios S. Kolovos, Louis M Rose, Nicholas Matragkas, Richard F Paige, Esther Guerra, Jesús Sánchez Cuadrado, Juan De Lara, István Ráth, Dániel Varró, Massimo Tisi, et al. A research roadmap towards achieving scalability in model driven engineering. In Proceedings of the Workshop on Scalability in Model Driven Engineering, page 2. ACM, 2013.

[50] Patrick Konemann. Model-independent differences. In Comparison and Versioning of Software Models, 2009. CVSM’09. ICSE Workshop on, pages 37–42. IEEE, 2009.

[51] David Chenho Kung, Jerry Gao, Pei Hsia, F Wen, Yasufumi Toyoshima, and Cris Chen. Change Impact Identification in Object Oriented Software Maintenance. In ICSM, volume 94, pages 202–211, 1994.

[52] Erlikh L. Leveraging legacy system dollars for e-business. IT Professional, 2(3):17–23, May 2000.

[53] Briand L. C., Labiche Y., and O’Sullivan L. Impact analysis and change management of UML models. In International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings., pages 256–265, Sept 2003.

[54] Briand L.C., Y. Labiche, L. O’Sullivan, and M.M. Sówka. Automated impact analysis of UML models. Journal of Systems and Software, 79(3):339 – 352, 2006.

[55] Yuehua Lin, Jing Zhang, and Jeff Gray. Model comparison: A key challenge for transformation testing and version control in model driven software development. In OOPSLA Workshop on Best Practices for Model-Driven Software Development, volume 108, page 6, 2004.

[56] Torvalds Linus. Git. https://git-scm.com/, 2018. Accessed April, 2018.

[57] Shane Mcintosh, Bram Adams, and Ahmed E. Hassan. The Evolution of Java Build Systems. Empirical Softw. Engg., 17(4-5):578–608, August 2012.

[58] Babajide Ogunyomi, Louis M. Rose, and Dimitrios S. Kolovos. Incremental execution of model-to-text transformations using property access traces. Software & Systems Modeling, Mar 2018.

[59] Debian Organization. Debian Package Management System. https://wiki.debian.org/DebianPackageManagement, 2018. Accessed April, 2018.

[60] Ernesto Posse. PapyrusRT: modelling and code generation. In Workshop on Open Source for Model Driven Engineering (OSS4MDE’15), 2015.

[61] Mellor S. J., Clark A. N., and Futagami T. Model-driven development - Guest editor’s introduction. IEEE Software, 20(5):14–18, Sept 2003.

[62] Gehan MK Selim, Shige Wang, James R Cordy, and Juergen Dingel. Model transformations for migrating legacy models: an industrial case study. In European Conference on Modelling Foundations and Applications, pages 90–101. Springer, 2012.

[63] Shane Sendall and Wojtek Kozaczynski. Model transformation: The heart and soul of model-driven software development. IEEE software, 20(5):42–45, 2003.

[64] Redding Simon. Papyrus-RT. https://www.eclipse.org/papyrus-rt/. [Accessed March, 2018].

[65] Thomas Stahl, Markus Voelter, and Krzysztof Czarnecki. Model-Driven Software Development: Technology, Engineering, Management. John Wiley & Sons, 2006.

[66] Dave Steinberg, Frank Budinsky, Ed Merks, and Marcelo Paternostro. EMF: Eclipse Modeling Framework. Pearson Education, 2008.

[67] William Swartout and Robert Balzer. On the Inevitable Intertwining of Specifi- cation and Implementation. Commun. ACM, 25(7):438–440, July 1982.

[68] Kehrer T., Kelter U., and Taentzer G. A rule-based approach to the semantic lifting of model differences in the context of model versioning. In 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011), pages 163–172, Nov 2011.

[69] GCC Team. GCC, the GNU Compiler Collection. https://gcc.gnu.org/, 2018. Accessed April, 2018.

[70] Gu Z. and Shin K. G. Synthesis of Real-Time Implementation from UML-RT Models. In 2nd RTAS Workshop on Model-Driven Embedded Systems (MoDES ’04), 2004.