Engineering the dynamic semantics of domain specific languages

Citation for published version (APA): Tikhonova, U. (2017). Engineering the dynamic semantics of domain specific languages. Technische Universiteit Eindhoven.

Document status and date: Published: 21/11/2017

Document Version: Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)


Engineering the Dynamic Semantics of Domain Specific Languages

THESIS

submitted to obtain the degree of doctor at Eindhoven University of Technology, on the authority of the rector magnificus, prof.dr.ir. F.P.T. Baaijens, to be defended in public before a committee appointed by the Doctorate Board on Tuesday 21 November 2017 at 16:00

by

Ulyana Tikhonova

born in Leningrad, Russia

This thesis has been approved by the promotors, and the composition of the doctoral committee is as follows:

chair: prof.dr. M.A. Peletier
promotor: prof.dr. M.G.J. van den Brand
copromotors: dr.ir. T.A.C. Willemse
dr.ir. R.R.H. Schiffelers
members: prof.dr. J.J. Vinju
prof.dr. M. Butler (University of Southampton)
prof.dr. B. Combemale (University of Toulouse)
prof.dr.ir. M. Aksit (University of Twente)

The research or design described in this thesis has been carried out in accordance with the TU/e Code of Scientific Conduct.

Engineering the Dynamic Semantics of Domain Specific Languages

Ulyana Tikhonova

Promotor: prof.dr. M.G.J. van den Brand (Eindhoven University of Technology)
Copromotoren: dr.ir. T.A.C. Willemse (Eindhoven University of Technology)
dr.ir. R.R.H. Schiffelers (ASML)

Additional members of the core committee:
prof.dr. J.J. Vinju (Eindhoven University of Technology)
prof.dr. M. Butler (University of Southampton)
prof.dr. B. Combemale (University of Toulouse)
prof.dr.ir. M. Aksit (University of Twente)

The work in this thesis has been carried out under the auspices of the research school IPA (Institute for Programming research and Algorithmics).
IPA dissertation series 2017-10

The work in this thesis has been carried out as part of the COREF project (Common Reference Framework for Executable Domain Specific Languages) with ASML as the industrial partner. The COREF project (PNU 10C19) is part of the Point-One University-Industry Interaction program.

A catalogue record is available from the Eindhoven University of Technology Library
ISBN: 978-90-386-4381-6

Cover design: Karina Vasileva (Melanistic Kitsune)
Printed by: Ipskamp Printing, Enschede, The Netherlands

© U. Tikhonova, 2017.

Acknowledgements

It is many miles and many years that have brought me here.

Tenzing Norgay, Tiger of the Snows

The journey towards this PhD thesis was long and was influenced by a lot of people. First of all, I thank Mark van den Brand, my promotor and supervisor, for inviting me to participate in the COREF project and for providing me with his full support through these years. The topic of my PhD project perfectly fitted the object of my curiosity. Moreover, Mark gave me the freedom to choose directions of the research and to pursue my ideas. Mark, thank you for your trust and for all the honest discussions we had!

Any freedom implies responsibility, and pursuing your own ideas can be a very lonely and painful experience. I thank my daily supervisor, Tim Willemse, for guiding me through these consequences of freedom and for ensuring the best outcome out of this experience, both for my work and for me. I always could burst into his office with another crazy idea, eureka moment, or a desperate problem. Tim, thank you for making time for me, for your thorough discussions and reviews, and for your attention to every detail! Your contribution made this work what it is.

My PhD study was conducted as part of the COREF project, in collaboration between TU/e and ASML. I thank everybody actively participating in the COREF project or related to it: Tom Verhoeff for discovering with me the odds and evens of formal specifications; Suzana Andova for being my daily supervisor and a friend in the beginning of the project; Wilbert Alberts, Ramon Schiffelers, Rogier Wester, Istvan Nagy for helping and guiding on the industry side; Marc Hamilton for sharing his expertise, insights, and ideas about the world of industrial DSLs.

My special thanks go to my COREF buddy, Maarten Manders. I could always rely on his company and back-up when working at ASML, preparing for conferences and schools, and discovering local life and cuisine in the various locations we visited.
From the beginning of my PhD project I was inspired by the Event-B community, by the welcoming atmosphere of their workshops and by how keen they are to share knowledge and provide support. I would like to especially mention Michael Butler, Colin Snook, Thai Son Hoang, and Lukas Ladenberger.

During my PhD research, I was fortunate to team up and collaborate with some of my colleagues and students. I am grateful to Rimco Boudewijns for carrying out his master project with me. His work nicely complemented my research and provided an important support for presentations and demonstrations of Constelle. I thank Anton Wijs for experimenting with Constelle, despite all its clumsiness, and providing me with interesting insights into my own work. I am very grateful to Alexander Serebrenik for providing his help and advice on various questions in all these years, and especially for guiding me through the topics of empirical methods in engineering. Alexander, your help with setting up an empirical study greatly contributed to my research and this thesis.

I thank the members of my defense committee: Ramon Schiffelers, Michael Butler, Benoit Combemale, Mehmet Aksit, and Jurgen Vinju. It is an honor for me to have my work reviewed and assessed by you! I also express my gratitude to Mark Peletier for chairing the defense ceremony.

One of the joys of being a PhD student is to be in the company of other PhD students. I was happy to share my office hours, breaks, lunches, and after work drinks with Yanja Dajsuren, Marcel van Amstel, Luc Engelen, Zvezdan Protic, Arjan van der Meer, Bogdan Vasilescu, Aminah Zawedde, John Businge, Maarten Manders, Neda Noroozi, Ana-Maria Sutii, Dan Zhang, Luna Luo, Sjoerd Cranen, Maciek Gazda, Sarmen Keshishzadeh, Mahmood Talebi, Onder Babur, Sander de Putter, Josh Mengerink, Weslley Torres, Felipe Ebert, Thomas Neele, and others.
I am especially grateful for all the fun times we had with fellow PhDs during various schools that I attended: numerous IPA spring and fall days, GTTSE, SFM/MDE, and, of course, Marktoberdorf. As a nice distraction from my research, I enjoyed setting up and running the M&CS PhD Council, together with Jaron Sanders, Serban Badila, Sarah Gaaf, Jorn van der Pol, Arthur van Goethem, and Alok Lele.

The journey to this thesis started some years before I came to Eindhoven, when my Russian supervisor Fedor A. Novikov introduced me to the field of domain specific language engineering. Fedor Alexandrovich, thank you for the tales about programmers, for the ability to see in depth and in breadth, and for being unlike everyone else. I am grateful to my senior colleagues from the Institute of Applied Astronomy who inspired me with their romantic vision of research work, its curiosity and beauty. Vladimir Ilyich Skripnichenko, Galina Anatolyevna Netsvetaeva, Mikhail Leonidovich Sveshnikov, thank you for your romanticism, for your love of your work, and for the stories about constellations.

Finally, I thank my friends both in the Netherlands and in St. Petersburg, for not caring much about my PhD and all this research; Gar Oome for consulting me about mechanics of LEGO cars; my closest friends Yanja and Sibrecht for being my paranymphs. I thank my wise husband Geert and my beautiful daughter Vasja, for the special joys and challenges they always have for me and for the place I call my home.

I thank my parents, who always support me, whatever I do and wherever I go. To them I devote this book. Dear Mom and Dad, thank you for your support! I know that whatever I do and wherever I am, you are with me and will support me. This means a great deal to me. Also, this book is about and for engineers, such as I have seen you to be since my childhood.

Ulyana Tikhonova
Zaozerye, July 2017.

To my parents, Elena and Nykolai

Table of Contents

Acknowledgements

Table of Contents

List of Acronyms

1 Introduction
1.1 Scope and Definitions
1.2 Problem Statement
1.3 Research Questions
1.4 Research Methods
1.5 Outline and Origin of Chapters

2 Defining the Dynamic Semantics of a DSL: a Case Study
2.1 A Definition of the Dynamic Semantics of a DSL
2.2 The LACE DSL
2.3 Use Cases and User Roles of LACE Specification
2.4 Event-B and Rodin
2.5 Specification of LACE Dynamic Semantics in Event-B
2.6 Visualization of LACE Specifications
2.7 User Study
2.8 Related Work
2.9 Conclusions

3 The Grand Vision
3.1 The Technology Space
3.2 Defining the Structure of a DSL
3.3 Defining the Dynamic Semantics of a DSL
3.4 Conclusions

4 Specification Templates and Constelle for Defining the Dynamic Semantics of DSLs
4.1 LEGO Allegory
4.2 Approach
4.3 Reusable Specification Templates
4.4 Constelle Language
4.5 Constraint Templates
4.6 Related Work
4.7 Conclusions

5 Designing and Describing Model Transformations
5.1 Introduction and Motivation
5.2 Notation for Describing Model Transformations
5.3 Design of Model Transformations
5.4 Validation
5.5 Related Work
5.6 Conclusions

6 Mapping Constelle to Event-B
6.1 Overview
6.2 Model transformations from Constelle to Event-B
6.3 Substitution
6.4 Composition
6.5 Gluing Guards
6.6 Proof Obligations
6.7 Conclusions

7 Implementation and Pragmatics of Constelle
7.1 Architecture
7.2 Definition Process
7.3 Constelle Workbench
7.4 Conclusions and Future Work

8 Validation of Constelle
8.1 Empirical Methods
8.2 Setting up a Validation Study
8.3 Conducting the Validation Study
8.4 Conclusions

9 Conclusions
9.1 Contributions
9.2 Future Work

Bibliography

A Event-B Specification Templates

B Event-B Specification of Robotic Arm Parallel

C Questionnaires
C.1 Baseline Questionnaire
C.2 Logbook Questionnaire
C.3 Final Questionnaire

Summary

Curriculum Vitae

IPA Dissertation Series

Index

List of Acronyms

AOP Aspect Oriented Programming
ATL Atlas Transformation Language
COREF Common Reference Framework
DAG Directed Acyclic Graph
DSL Domain Specific Language
EMF Eclipse Modeling Framework
GPL General Purpose Language
GQM Goal Question Metric
LAC Logical Action Component
LACE Logical Action Component Environment
LOC Lines Of Code
MDE Model Driven Engineering
MOF Meta-Object Facility
MT Model Transformation
OCL Object Constraint Language
OMG Object Management Group
PO Proof Obligation
QVT Query/View/Transformation
SF Semantic Feature
SS Subsystem
SSA Subsystem Action
UML Unified

Chapter 1

Introduction

"Look, old boy," said the machine, "if I could do everything starting with ‘n’ in every possible language, I’d be a Machine That Could Do Everything in the Whole Alphabet, since any item you care to mention undoubtedly starts with ‘n’ in one foreign language or another. It’s not that easy. I can’t go beyond what you programmed. So no sodium."

Stanislaw Lem, How the World Was Saved

In this introductory chapter, we set the scope of our research, define the basic terms, and identify the gaps that we aim to address in our work. We outline our exploration path by formulating a series of research questions and describe how these research questions are addressed in the chapters of this thesis. Moreover, to set up a common ground for our reader, we identify the principles that we follow in our research by explicitly choosing a philosophical stance.

1.1 Scope and Definitions

Today software controls many aspects of our lives: from flying our airplanes (autopilots) and driving our cars, to setting the temperature in our houses and running our washing machines; from connecting us with our family and friends, to performing our money transactions and handling our personal information. All types of software (realizing these and many other functionalities) are created by people known as software developers. The work of a software developer consists in mapping a concrete problem (such as navigating a car) to a solution (a piece of software, known as a program) that solves this problem. To capture such a solution, software developers use programming languages. A programming language provides an intermediate layer between a software developer and an existing hardware and/or software platform (where the program is executed). To facilitate the work of a software developer, we can move this intermediate layer closer to the developer, i.e. we can reduce the distance between a problem domain (determined by the application of a program) and a solution domain (determined by the programming language). Programming languages that aim at such an improvement of the problem-to-solution mapping are known as domain-specific languages (DSLs) (in contrast to general purpose languages, GPLs). The idea of moving an intermediate layer closer to the developer is illustrated in Figure 1.1.

[Figure 1.1 shows a software developer, two alternative intermediate layers (DSL and GPL), and the execution platform. The DSL layer is closer to the developer, and the translation of the DSL bridges a wider semantic gap to the execution platform.]

Figure 1.1: Domain specific language (DSL) versus general purpose language (GPL)

A domain-specific language (DSL) is a computer (programming) language specialized for a specific (application) domain. The idea of using DSLs for software development and/or software configuration is not new, and DSLs have been known and applied in various forms (such as subroutine libraries, frameworks, and dedicated languages) for a long time. Recently, DSLs became a central concept of Model Driven Engineering (MDE) [86]. In the context of MDE, a DSL determines a class of models that can be constructed in the domain; model-to-model transformations and code generators assign meaning to such models by automatically translating them into programs and/or other artifacts that are useful in software development, such as documentation, visualizations, and formal specifications.

As a programming language, a DSL is usually composed of the following three components: abstract syntax, concrete syntax, and semantics. The abstract syntax of a DSL introduces the constructs of the DSL and the relationships between them. The concrete syntax of a DSL defines a notation for these constructs and relationships. Such a notation is used by a software developer to create (write down) and to read (understand) DSL models (programs developed using the DSL). For example, a DSL can have a textual notation, in which case its models appear as pieces of text. Or a DSL can have a graphical notation, in which case its models appear as diagrams. The semantics of a DSL assigns a meaning to a DSL model. For example, the meaning of a DSL model can be the value of an arithmetic expression, if we consider a DSL for expressing such arithmetic expressions. In this case, the semantics of the DSL is defined as an algorithm for computing such a value for an arbitrary DSL model.

The dynamic semantics of a DSL maps each DSL model (program) to the corresponding execution behavior. In this case, a meaning is a sequence (or multiple interleaving sequences) of execution steps, which can dynamically include or exclude certain steps as a response to an input stimulus and which might result in some side effects (such as a change of the state of some component). An example of an execution behavior is the sequence of steps performed by a washing machine (although we might never think about our washing machine in this way): the input stimulus is the choice of a washing program, and the change of state is the clothes getting washed by the machine.

A DSL can be implemented in the form of an embedded (internal) DSL or in the form of an external DSL [29]. An embedded DSL is integrated into a host language (usually a GPL), relying on its existing constructs, notation, and semantics and extending it with domain-specific constructs, notation, and semantics. An external DSL is an independent (programming) language, which introduces its own constructs, notation, and semantics (from scratch).

Programming languages (both DSLs and GPLs) are meant for creating software, but they are themselves software too. To be able to use a programming language, we first need to construct its components (syntax and semantics) in the form of software. The languages that are meant for describing programming languages are known as meta-languages. If such a meta-language is implemented by the corresponding software, then we can execute a description of a programming language, for example, in order to execute programs written in this programming language (this approach is known as an interpreter). People who create new languages (by describing them in meta-languages and/or constructing them in the form of software) are known as language developers.

As shown in Figure 1.1, a DSL brings the intermediate layer (a solution domain) closer to a software developer. As a consequence, the gap between the DSL and the execution platform increases. This gap is bridged in the semantics of the DSL and should be defined as such by the language developer. Thus, we facilitate the work of a software developer at the cost of increasing the workload of the language developer (who constructs the semantics of the DSL). In this thesis we aim to facilitate the work of a language developer who constructs a DSL. In particular, we focus on the dynamic semantics of external DSLs in the context of MDE and develop a meta-language for defining the dynamic semantics of a DSL. In the next section we describe the motivation of this work and give an overview of the problems that we aim to solve.
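The three components of a DSL described in this section can be illustrated with the arithmetic-expression example mentioned above. The following is only a minimal sketch in Python: the names `Num`, `Add`, `Mul`, `show`, `evaluate`, and `trace` are hypothetical and are not taken from any language discussed in this thesis. The data classes play the role of the abstract syntax, `show` gives one possible textual concrete syntax, `evaluate` gives a semantics that maps a model to a value, and `trace` hints at a dynamic semantics by recording execution steps instead of only returning the result.

```python
from dataclasses import dataclass
from typing import List, Union


# Abstract syntax (hypothetical toy DSL): constructs and their relationships.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add, Mul]


def show(e: Expr) -> str:
    """Concrete syntax: one possible textual notation for a model."""
    if isinstance(e, Num):
        return str(e.value)
    op = "+" if isinstance(e, Add) else "*"
    return f"({show(e.left)} {op} {show(e.right)})"


def evaluate(e: Expr) -> int:
    """Semantics: the meaning of a model is the value it denotes."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    return evaluate(e.left) * evaluate(e.right)


def trace(e: Expr, steps: List[str]) -> int:
    """Dynamic semantics (sketch): record the execution steps as well."""
    if isinstance(e, Num):
        steps.append(f"push {e.value}")
        return e.value
    left = trace(e.left, steps)
    right = trace(e.right, steps)
    if isinstance(e, Add):
        steps.append("add")
        return left + right
    steps.append("mul")
    return left * right


# Usage: the model for 2 + 3 * 4.
model = Add(Num(2), Mul(Num(3), Num(4)))
steps: List[str] = []
print(show(model), "=", evaluate(model))
print(trace(model, steps), steps)
```

In this sketch the same model has two meanings: a single value, and a sequence of steps on an assumed stack-like executor. The second is closer in spirit to the execution behavior that a dynamic semantics describes, although real DSLs typically also involve input stimuli and state changes that this toy example omits.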

1.2 Problem Statement

In the context of MDE, the development of a DSL usually includes its design via meta-modeling and its implementation via code generation and/or model transformation. A DSL metamodel defines an abstract syntax of the DSL by capturing language constructs, their compositional hierarchy, classification, and cross references between them. A model transformation implements a translation from the DSL metamodel to the input language of a target execution platform, and thus captures the dynamic semantics of the DSL. As stated in Section 1.1, from a semantics point of view, the gap bridged by this translation can be quite wide. Such a translation addresses both the problem and the solution domains and does it in terms of both high-level concepts of the DSL and low-level concepts of the execution platform. The complexity of the DSL translation, which in the context of MDE is practically (hard)coded in model transformations and code generation, poses challenges for learning, debugging, understanding, maintaining, and updating the DSL.

To manage the complexity of the DSL translation, we want to have an explicit definition of the dynamic semantics of a DSL. There exist a number of approaches for defining the dynamic semantics of general purpose languages (GPLs), such as denotational and algebraic semantics [108, 73], action semantics [70], and structural operational semantics (SOS) [78]. These approaches allow for formalizing the dynamic semantics of a programming language and for performing various types of analysis of such a formal definition. However, there is no practically applicable tool support for this work, which means that the formalization and analysis are done on paper.

Compared to GPLs, DSLs are smaller languages: a DSL covers a smaller set of problems and, as a consequence, has a smaller audience of practitioners (those who use the DSL) and/or a smaller group of developers (those who design and implement the DSL). Next to the known advantages of using such small languages [25], the main disadvantages are determined by the costs of developing and learning a DSL. An explicit definition of the dynamic semantics of a DSL can help to mitigate these disadvantages by providing various practical outcomes of having a formal specification of the dynamic semantics of a DSL.

Recent approaches focus on defining the dynamic semantics of DSLs and on providing the corresponding tool support. The existing meta-languages for defining the dynamic semantics of DSLs include the Kermeta Language [44], xMOF (eXecutable MOF) [67], the K semantic framework [82], and DynSem [106]. These meta-languages use a technique known as the operational approach, in which the dynamic semantics of a DSL is defined in terms of the DSL itself. For this, the DSL metamodel is extended with additional constructs that are necessary for describing a state of a DSL model during execution. The tool support includes an interpreter that allows for navigating through execution states of a DSL model, according to a given definition of the dynamic semantics of the DSL. Thus, the main practical outcome of having such a definition of the dynamic semantics of a DSL is a reference interpreter that can be consulted on how a DSL program should behave.

In this thesis we strive towards having more practical outcomes of having a formal specification of the dynamic semantics of a DSL. In [53] Kosar et al. review (perform a systematic mapping study of) 390 existing studies on DSLs in order to understand the DSL research field and to identify research trends and open issues. According to this work, the number of DSLs which had formal descriptions of their (dynamic) semantics was low. Moreover, there was hardly any discussion on validation and maintenance of DSLs in the reviewed studies. This work shows a gap in the research field of DSLs: a formal description of the dynamic semantics of a DSL that can be used for a wide variety of practical purposes (such as validation and maintenance of the DSL).

For example, in practice a DSL implementation can include a number of DSL translations, targeting different execution platforms with the purpose of achieving diverse technological goals. In this case, one translation generates C/C++ or Java source code for execution of DSL programs; another translation targets various formalisms for verification and formal analysis of DSL programs; and a third translation constructs diagrams visualizing DSL programs (as it is done in [102]). Generally speaking, there is no guarantee that different translations implement the DSL dynamic semantics in a coherent way. For instance, there is no confidence that a verified DSL program is coherent with the corresponding executed source code; or that a visualization of a DSL program captures the source code generated from this program and can be used as a reference, for example, when testing the execution of the source code. The desire to have different translations implemented in a consistent way, so that they support and complement each other, poses a maintenance problem.

To manage the complexity of a DSL translation, to mitigate the costs of developing and learning a new DSL, to provide a common ground for different DSL translations, and in this way to facilitate their consistency, we want to have an explicit definition of the dynamic semantics of a DSL that can be used for a wide variety of practical purposes and in a wide variety of DSL translations. Therefore, we go for a technique known as the translational approach, in which the dynamic semantics of a DSL is defined in terms of another language or formalism (whose semantics is already defined). The known advantage of this approach is the possibility to apply various tools available for the target formalism. However, as Combemale et al. describe in their survey in [19], the translational approach is also known for the complexity of a definition of the dynamic semantics, which is captured in the translation of the DSL to a target formalism. For example, in [18] Cleenewerck and Kurtev describe the complexity of a translational semantics of a DSL caused by structural mismatches between the DSL and the target language/formalism. Such complexity can hinder the usability of a resulting definition of the dynamic semantics of a DSL. We need a meta-language for defining the dynamic semantics of a DSL that is not only supported by an infrastructure for interpreting (using) such a definition of the dynamic semantics, but also allows for a usable (clear) definition of the dynamic semantics of a DSL.

Thus, the central research question addressed in this thesis is the following:

RQ: How to define the dynamic semantics of a DSL in a usable and useful way?

In the next section, we refine (decompose) this central research question into a series of more specific questions.
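The difference between the operational and the translational approach discussed in this section can be made concrete with a small sketch. It is an illustration under stated assumptions only: the toy command language (`inc`, `dec`, `double`), the `ExecState` class, and all function names are hypothetical, and Python source text stands in for the target language or formalism. In the operational part, the model is extended with a runtime state and stepped through by an interpreter. In the translational part, the model is mapped to target code, whose meaning is then given by the target language.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical toy DSL: a model is a list of commands.
Model = List[str]


@dataclass(frozen=True)
class ExecState:
    """Runtime extension of the model: next command index and counter value."""
    pc: int
    value: int


def step(model: Model, state: ExecState) -> ExecState:
    """Operational approach: one execution step defined on the model itself."""
    cmd = model[state.pc]
    if cmd == "inc":
        new_value = state.value + 1
    elif cmd == "dec":
        new_value = state.value - 1
    elif cmd == "double":
        new_value = state.value * 2
    else:
        raise ValueError(f"unknown command: {cmd}")
    return ExecState(state.pc + 1, new_value)


def interpret(model: Model) -> List[ExecState]:
    """Reference interpreter: returns every execution state that is visited."""
    states = [ExecState(0, 0)]
    while states[-1].pc < len(model):
        states.append(step(model, states[-1]))
    return states


def translate(model: Model) -> str:
    """Translational approach: map the model to target code (Python text here)."""
    target_ops: Dict[str, str] = {
        "inc": "value += 1",
        "dec": "value -= 1",
        "double": "value *= 2",
    }
    return "\n".join(["value = 0"] + [target_ops[cmd] for cmd in model])


def run_translated(source: str) -> int:
    """Execute the generated code, relying on the target language's semantics."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["value"]


# Usage: both routes should agree on the final counter value.
model = ["inc", "inc", "double", "dec"]
final_operational = interpret(model)[-1].value
final_translational = run_translated(translate(model))
print(final_operational, final_translational)
```

Comparing the two results is a very small instance of the coherence concern raised above: nothing in the translation itself guarantees agreement with the reference behavior, so the agreement has to be checked or established separately.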

1.3 Research Questions

To define the dynamic semantics of a DSL, we first need to identify what we want to define, i.e. what we want to capture in such a definition. In search for the key components and variability points of a definition of the dynamic semantics of a DSL, we formulated the following research question.

RQ1: What constitutes a definition of the dynamic semantics of a DSL?

The constitution of a definition of the dynamic semantics of a DSL (i.e. what do we want to define) correlates with the purpose of such a definition (i.e. how do we want to use the definition). To capture such a purpose explicitly and to ensure that our solution fits the stated purpose, we address the following research question.

RQ2: What are the requirements for a definition of the dynamic semantics of a DSL?

After identifying what constitutes a definition of the dynamic semantics of a DSL and stating the purpose of such a definition, we are ready to search for the means that allow for constructing a definition with the formulated characteristics. In other words, we address the following research question.

RQ3: What are the constructs for defining the dynamic semantics of a DSL?

As a solution for research question RQ3, we designed a domain-specific language called Constelle. Constelle is a meta-language for defining the dynamic semantics of DSLs. To be able to interpret and use Constelle, one needs to know the meaning of a Constelle model, i.e. the meaning of a definition of the dynamic semantics of a DSL. We address the following research question from two points of view. From a practical point of view, we aim to know how to implement Constelle (in a model transformation). From a theoretical point of view, we aim to give a formal description of the semantics of Constelle.

RQ4: What is the semantics of the language for defining the dynamic semantics of a DSL?

While defining the semantics of Constelle, we faced the challenge of designing a translation of a high-level meta-language (Constelle) into an existing (reference) formalism. This translation is rather complicated and includes a lot of mappings. The search for a guideline on how to do this work led us to the following research question.

RQ5: How to design and describe the semantics of the language for defining the dynamic semantics of a DSL?

As described in Section 1.2, one of the problems we aim to solve in this thesis is the practical usage of a definition of the dynamic semantics of a DSL. The (methodological) knowledge on how to use a (programming or meta) language, and in what context to use it, is usually captured in yet another component of a language description: the pragmatics of the language. Therefore, to ensure that Constelle addresses the aforementioned problem of using a definition of the dynamic semantics of a DSL, we formulate the following research question.

RQ6: What is the pragmatics of the language for defining the dynamic semantics of a DSL?

Constelle is our solution to the central research question of this thesis (RQ, stated at the end of Section 1.2). Thus, Constelle is meant for giving definitions of certain content and for a certain purpose. To ensure that Constelle actually achieves these goals, i.e. that our solution is a valid solution, we conduct its evaluation. In the process of designing such an evaluation, we formulated the following research question.

RQ7: How can we evaluate the language (and the method) for defining the dynamic semantics of a DSL?

1.4 Research Methods

In our thesis we address the following two objects of research: a definition of the dynamic semantics of a DSL and a meta-language (a method) for giving such a definition. In this section, we formulate the research principles that we follow in relation to these two research objects. For this, we explicitly choose philosophical stances that we adopt in our work. A philosophical stance determines how we conduct our research in order to come up with an answer to a research question and how we justify whether the gained knowledge is valid [26]. Our research is conducted in the collaboration with an industrial partner, who has experi- ence with developing and applying DSLs. To maximize the relevance of our research results to industry and to address the practical challenges of developing and applying DSLs discussed in Section 1.2, we aim at having a practically useful and usable definition of the dynamic seman- tics of a DSL. Therefore, we follow the principles of a pragmatist stance when searching for an answer for our research questions RQ1 and RQ2: what constitutes a definition of the dynamic semantics of a DSL and what are the requirements for it? Pragmatism adopts an ‘engineering approach to research’ and judges knowledge (i.e. an answer to our research questions) by how useful it is for solving practical problems [26]. As a consequence, pragmatism implies certain relativism of the obtained knowledge: what is useful for one person might be not useful for an- other person. This means that we need to clearly identify our users, who will use a definition of the dynamic semantics of a DSL, and purposes of such use. The second object of our research, a meta-language for defining the dynamic semantics of a DSL is addressed in the research questions RQ3, RQ4, RQ5, and RQ6. 
As described earlier in Section 1.2, one of the obvious approaches to describing the dynamic semantics of a DSL would be to use an existing method and/or formalism for defining the dynamic semantics of programming languages (GPLs). However, as shown in [53], formal specifications of the dynamic semantics of DSLs are very rare in practice. To understand the causes of this gap and to propose a possible solution, we adopt the philosophical stance of critical theory [26]. Critical theory actively seeks to challenge existing perceptions about software practice (in our case, about defining the dynamic semantics of a DSL) and judges scientific knowledge by its ‘ability to free people from restrictive systems of thought’. Moreover, this philosophical stance encourages choosing what research to undertake based on whom it helps, which correlates with the pragmatic approach to our first object of research (a definition of the dynamic semantics of a DSL). Finally, we follow the critical theory stance when designing the evaluation of the proposed meta-language (the topic addressed in our research question RQ7).

These two philosophical stances, critical theory and pragmatism, direct our research throughout the whole thesis. In particular, they influenced the decomposition of the main research question into the series of more specific research questions (as presented in Section 1.3) and the way we search for their answers (see Section 1.5). Moreover, to stress the close-to-practice nature of our work, we consider our research questions (also) as design questions. That is, we aim at designing a method (a meta-language and/or framework and/or process) that approximates a solution to the problems described in Section 1.2, rather than resolving them in a strict sense.

1.5 Outline and Origin of Chapters

In this section, we give an outline of the structure of the remainder of this thesis. In particular, we describe chapters that constitute this thesis and how these chapters map to the formulated research questions. Moreover, for each chapter that is based on an earlier publication, we indicate the origin of the chapter.

Chapter 2: Defining the Dynamic Semantics of a DSL: a Case Study

In this chapter, we address research questions RQ1 and RQ2. We investigate the notions of the dynamic semantics of a DSL and its definition by conducting a case study. For this, we define the dynamic semantics of the LACE DSL using the Event-B formalism and QVTo model transformations. Following our choice of the pragmatism stance, we identify user roles and discover how they can benefit from having an explicit formal definition of the dynamic semantics of the DSL, and we indicate limitations of the proposed approach and challenges emerging when applying it. Based on our observations from this case study and on the results of the user study conducted as part of this work, we formulate requirements for a definition of the dynamic semantics of a DSL and for a meta-language for giving such a definition (research question RQ2). This chapter is based on the following publications (for which the author was a main contributor).

[98] U. Tikhonova, M. Manders, M.G.J. van den Brand, S. Andova, and T. Verhoeff. Applying Model Transformation and Event-B for Specifying an Industrial DSL. Proceedings of the 10th International Workshop on Model Driven Engineering, Verification and Validation, 2013.

[97] U. Tikhonova, M. Manders, and R. Boudewijns. Visualization of Formal Specifications for Understanding and Debugging an Industrial DSL. Proceedings of the Third Workshop on Human Oriented , 2016.

Chapter 3: Grand Vision

In this chapter, we ‘zoom out’ from the LACE case study and describe our vision on the DSL-based development approach. This vision is based on the knowledge gained during the LACE case study and helps us to introduce our ideas on how to address the central research question of this thesis. Using the broad picture outlined in this chapter, we indicate the scope of our research and give an overview of how we address the challenges described in Section 1.2.

Chapter 4: Specification Templates and Constelle for Defining the Dynamic Semantics of DSLs

In this chapter, we address research question RQ3. We introduce reusable specification templates and the Constelle meta-language and elaborate their design. For this, we use a small example DSL, demonstrate how we define its dynamic semantics using Constelle and specification templates, and specify the corresponding metamodels of Constelle and specification templates. This chapter is based on (a part of) the following publication (which won the 2017 Best paper award of the journal Software and Systems Modeling and was presented at the ACM/IEEE 20th International Conference on Model Driven Engineering Languages and Systems, MODELS 2017).

[96] U. Tikhonova. Reusable Specification Templates for Defining Dynamic Semantics of DSLs. Software and Systems Modeling (SoSyM), 2017.

Chapter 5: Designing and Describing Model Transformations

In this chapter, we address research question RQ5. Aiming at formulating the semantics of Constelle in the next chapter, in this chapter we introduce a notation for describing and designing such a semantics. In particular, we specify how we use the mathematical notation of set theory and functions in order to describe a QVTo model transformation. Moreover, we use this notation to formulate two design principles for developing QVTo transformations: structural decomposition and chaining model transformations. This chapter is based on the following publications.

[99] U. Tikhonova and T. Willemse. Designing and Describing QVTo Model Transformations. Proceedings of the 10th International Joint Conference on Software Technologies, 2015.

[100] U. Tikhonova and T. Willemse. Documenting and Designing QVTo Model Transformations Through Mathematics. Selected Papers of the 10th International Joint Conference on Software Technologies, 2015.

Chapter 6: Mapping Constelle to Event-B

In this chapter, we address research question RQ4. We define the semantics of the Constelle language by mapping it to the Event-B formalism. In other words, the meaning of a Constelle model is computed in the form of an Event-B specification. The definition of the semantic mapping of Constelle to Event-B corresponds to the actual implementation of Constelle, the Constelle-to-Event-B model transformation. Moreover, based on this definition, we derive certain theoretical results for a resulting (computed) Event-B specification. This chapter is based on (a part of) the following publication (the same paper that formed the basis of Chapter 4).

[96] U. Tikhonova. Reusable Specification Templates for Defining Dynamic Semantics of DSLs. Software and Systems Modeling (SoSyM), 2017.

Chapter 7: Implementation and Pragmatics of Constelle

In this chapter, we address research question RQ6. To elaborate on the pragmatics of our approach, we formulate a process of defining the dynamic semantics of a DSL using Constelle. We describe our implementation of Constelle, the Constelle workbench, and show how its components support different steps of the formulated process and fulfill the requirements listed in Chapter 2.

Chapter 8: Validation of Constelle

In this chapter, we address research question RQ7. In order to evaluate whether Constelle is a valid solution to our central research question, we follow the critical theory stance and choose the empirical method that most closely reflects its philosophy: action research. The main goal of our action research is to gain new insights from the application of Constelle to another DSL by another language developer. Based on the gained insights, we indicate both the strengths and the limitations of Constelle and propose directions for future work.

Chapter 9: Conclusions

This final chapter concludes this thesis. In this chapter, we summarize the main contributions of our work and revisit our research questions. Moreover, we give an overview of the most interesting discussion points and outline directions for future research.

Chapter 2

Defining the Dynamic Semantics of a DSL: a Case Study

Real work is Brussels lace, the main thing in it is what holds the pattern up: air, punctures, truancy.

Osip Mandelstam, The Fourth Prose

In this chapter we introduce and investigate the notion of the dynamic semantics of a DSL from a pragmatic point of view: what constitutes its definition, what are the challenges and outcomes of the definition process? In particular, we strive to answer the following (research or design) questions.

• what are the benefits of constructing and/or having a formal specification of the dynamic semantics of a DSL?
• how can these benefits be achieved for different roles of users in the development process?
• what are the requirements for a definition formalism (a formalism for defining the dynamic semantics of a DSL) that allows for achieving practical benefits of constructing/having a specification of the dynamic semantics of a DSL?

For this, we first introduce the notion of the dynamic semantics of a DSL and discuss the basic criteria for a definition formalism (Section 2.1). In our investigation of the dynamic semantics of a DSL from a pragmatic point of view we take a bottom-up approach: we define the dynamic semantics of an existing DSL (LACE, introduced in Section 2.2) and generalize our observations and experience in order to answer the questions listed above. The investigation is based on the viewpoints of two user roles of the (software) development process: a DSL developer and a DSL user. We describe these user roles and how they can use a definition of the dynamic semantics of the DSL (i.e. the use cases) in Section 2.3.

As the existing formalisms for specifying the semantics of general purpose languages (GPLs) (such as Action Semantics [70] and Structural Operational Semantics [78]) do not have practically applicable tool support, in order to realize the identified use cases, we choose a different specification formalism for our case study: Event-B (Section 2.4). As the chosen formalism is not specifically designed for defining the dynamic semantics of DSLs (or GPLs), we face certain challenges when applying it to the specification of our DSL. Section 2.5 describes how we overcome these challenges and align the specification formalism (Event-B) with the DSL development environment.
In Section 2.6 we implement one of the previously identified use cases by providing a domain specific visualization for our Event-B specifications of LACE. The user-friendly visualization allows for performing a user study, described in Section 2.7. The feedback provided by the users during this study and lessons learned during the specification of LACE in Event-B form the set of requirements for a definition formalism listed in Section 2.9. Thus, this chapter serves as a main source of motivation for the rest of the thesis.

2.1 A Definition of the Dynamic Semantics of a DSL

In our work we focus on domain-specific programming (or executable) languages. In other words, we restrict the scope of our research to DSLs that can be used for programming, i.e. for defining programs that can be executed. A DSL defines a set of executable programs, and the dynamic semantics of the DSL determines the behavior of such programs (i.e. the way in which each DSL program executes). A definition of the dynamic semantics of a DSL consists of the following two components:

• a semantic domain, providing terms to define the dynamic semantics;
• a semantic mapping, mapping the DSL (metamodel or abstract syntax) to the semantic domain.
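The two components can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy DSL (`Increment`, `Reset`), the choice of state transformers over named counters as the semantic domain, and the function names `map_step` and `map_program`. It is not the LACE semantics, only an instance of "semantic domain plus semantic mapping".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Semantic domain (assumed for illustration): state transformers over a
# dictionary of named integer counters.
State = Dict[str, int]
Transformer = Callable[[State], State]


# Abstract syntax of a hypothetical toy DSL: a program is a list of steps.
@dataclass(frozen=True)
class Increment:
    counter: str


@dataclass(frozen=True)
class Reset:
    counter: str


def map_step(step: object) -> Transformer:
    """Semantic mapping of a single DSL construct to the semantic domain."""
    if isinstance(step, Increment):
        return lambda s: {**s, step.counter: s.get(step.counter, 0) + 1}
    if isinstance(step, Reset):
        return lambda s: {**s, step.counter: 0}
    raise TypeError(f"unknown DSL construct: {step!r}")


def map_program(program: List[object]) -> Transformer:
    """Semantic mapping of a program: sequential composition of its steps."""
    transformers = [map_step(step) for step in program]

    def run(state: State) -> State:
        for transform in transformers:
            state = transform(state)
        return state

    return run


program = [Increment("a"), Increment("a"), Reset("b"), Increment("b")]
behavior = map_program(program)   # the meaning of the program
result = behavior({"b": 7})       # executing that meaning on an initial state
```

Here `map_program` plays the role of the semantic mapping, and the `Transformer` functions are terms of the semantic domain. Because both are executable, a DSL program can be translated and run automatically.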

Figure 2.1: T-diagram (source language A, target language B, implementation language C)

To depict a definition of the dynamic semantics of a DSL, we employ the notation of T-diagrams (or Tombstone diagrams) [5]. T-diagrams (Figure 2.1) are used in compiler theory to represent that a translation from a source language A (left ‘wing’ of T) to a target language B (right ‘wing’ of T) is realized in the implementation language C (the ‘basement’ of T). As a compiler is a special case of a semantic mapping, we generalize this notation to illustrate definitions of dynamic semantics: the DSL being defined is depicted as the left ‘wing’ of T; the semantic domain is depicted as the right ‘wing’ of T; and the language or formalism employed for defining the semantic mapping is represented by the ‘basement’ of T.

A definition of the dynamic semantics of a DSL should be [108]:

• precise, so that the definition can be interpreted unambiguously,
• intelligible, so that the definition can be interpreted by humans,
• executable, so that the definition can be interpreted by tools.

Whether a definition is intelligible depends on the background of a particular reader. In practice, an intelligible definition can in most cases be achieved through a precise and executable definition; thus the three criteria can be reduced to two. Precision of a definition is achieved by employing a formalism based on a solid mathematical theory. Executability of a precise definition is achieved by employing tools that implement this theory. Note that when employing such tools we have to rely on how precisely these tools conform to the theory. When applying the criteria of a precise and executable definition to the two components of a semantics definition we get the following practical outcomes:

• A precise semantic mapping and semantic domain allow for reasoning and analysis of the DSL semantics in terms of the formalism of the semantic domain.
• An executable semantic domain allows for the execution of DSL programs in terms of the semantic domain.
• An executable semantic mapping allows for automatic translation of DSL programs into models expressed in terms of the semantic domain.

In current practice usually at least one of the three outcomes listed above is not attainable. For example, both the semantic domain and the semantic mapping are executable but not precise (Figure 2.2(a) illustrates a common implementation of a DSL; the triangle on the bottom denotes that the ‘basement’ formalism is executable); or both the semantic domain and the semantic mapping are precise, but the semantic mapping is not executable (Figure 2.2(b) illustrates the work presented in [94]); or both the semantic domain and the semantic mapping are executable, but the semantic mapping is not precise (Figure 2.2(c) illustrates the work presented in [102]).

Figure 2.2: T-diagrams of some existing dynamic semantics definitions, panels (a), (b), and (c)

To achieve an unambiguous understanding of a DSL and to enhance the DSL development with formal analysis and tool support, we would like the definitions of both its semantic mapping and its semantic domain to be precise and executable. To achieve this (in our case study and in the rest of the thesis), we employ a formalism that has both a solid theory and tool support (see Section 2.4). In practice this means that our definition of the dynamic semantics of a DSL exists next to an implementation of the DSL that is developed using imprecise tools (such as model transformation and/or code generation languages) and is actually used for the development of DSL programs. In the next section we describe the DSL used for our case study and the software architecture that implements this DSL.

2.2 The LACE DSL

Our case study was performed at ASML,1 the world-leading manufacturer of lithography systems (called waferscanners) for the semiconductor industry. Waferscanners are complex machines which participate in the production pipeline of integrated circuits or chips. LACE (Logical Action Component Environment) is a DSL, used within ASML for describing logical action components and for using such descriptions to (automatically) generate software that controls a waferscanner by invoking hardware drivers in a synchronized and effective way. The key concepts of LACE and the supporting software architecture were introduced about twelve years ago. Seven years ago the LACE DSL was developed to wrap the manual configuration of the source code into a graphical notation and to allow for the automatic generation of source code configurations. LACE has a graphical notation based on UML activity diagrams and provides an environment for defining (and editing) LACE programs (models).

The main purpose of LACE is the coordination of machine parts of a waferscanner, i.e. physical subsystems: actuators, projectors, sensors, etc. A LACE program (model) consists of one or more logical actions, each of which defines how a set of subsystems operate in collaboration with each other in order to perform a required (logical) function of the machine. An example of a logical action is shown in Figure 2.3. Note that although the graphical notation of LACE is based on UML activity diagrams, the dynamic semantics of LACE does not follow the semantics of UML activity diagrams.

1 www.asml.com, http://en.wikipedia.org/wiki/ASML_Holding

Figure 2.3: An example of a logical action for taking a snapshot (logical action take_a_snapshot with the swimlanes Laser, Sensor, Handler, and Projector)

A logical action describes desired behavior in terms of an abstract representation of subsystem functions, called subsystem actions. In Figure 2.3 the subsystems participating in the logical action are represented as swimlanes (columns). Subsystem actions are represented as rounded rectangles. Each subsystem action belongs to the corresponding subsystem (column). Subsystem actions that are combined into a so-called scan (visualized as a dashed rounded rectangle in Figure 2.3) are executed in unison, that is, the subsystem actions within such a group start and end at the same time. Thus, the Sensor subsystem starts and stops executing the GrabAFrame action at the same moments as the Laser subsystem starts and stops executing the ProduceLight action.2

Subsystem actions within a logical action can be executed sequentially or concurrently (visualized as thick arrows and fork and join nodes). For example, the subsystem actions AdjustFramePosition and PositionObject are independent actions that can be executed in any order or in parallel, but the GrabAFrame action can be performed only after both these actions are finished. Finally, subsystem actions may require and produce data. The dataflow in a logical action is depicted by means of thin arrows, connected to input and output pins. For example, in Figure 2.3 the GrabAFrame action produces data, which is saved in the snapshot output parameter.

The high-level description of the machine subsystems’ behavior, given in logical actions, is translated into the invocations of hardware drivers and a synchronization driver in such a way that the resulting execution matches the behavior specified in the logical actions. An overview of the software architecture that realizes the LACE DSL is depicted in Figure 2.4.

2 In the rest of this thesis, we use this font convention when referring to elements of a LACE program.
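The ordering constraint in the example above (GrabAFrame only after both AdjustFramePosition and PositionObject) can be sketched as a small data structure. This is a hedged illustration in Python, not part of LACE: the class `LogicalAction`, its field `predecessors`, and the method `requestable` are hypothetical names, and only the three actions whose ordering is stated in the text are modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class LogicalAction:
    """Hypothetical model of the partial order inside a logical action."""
    name: str
    # subsystem action -> subsystem actions that must finish before it starts
    predecessors: Dict[str, Set[str]] = field(default_factory=dict)

    def requestable(self, finished: Set[str]) -> Set[str]:
        """Actions that are not finished yet and whose predecessors all are."""
        return {
            action
            for action, required in self.predecessors.items()
            if action not in finished and required <= finished
        }


take_a_snapshot = LogicalAction(
    name="take_a_snapshot",
    predecessors={
        "AdjustFramePosition": set(),   # independent of PositionObject
        "PositionObject": set(),        # independent of AdjustFramePosition
        "GrabAFrame": {"AdjustFramePosition", "PositionObject"},
    },
)

first = take_a_snapshot.requestable(set())
after_one = take_a_snapshot.requestable({"AdjustFramePosition"})
after_both = take_a_snapshot.requestable({"AdjustFramePosition", "PositionObject"})
```

The two independent actions are requestable from the start, in any order or in parallel. GrabAFrame becomes requestable only once both have finished.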

Figure 2.4: Software architecture for execution of logical actions

The architecture consists of three layers: controlling software, logical action components (LACs), and a subsystems layer. A LAC provides an interface for requesting the execution of logical actions by controlling software. A LAC translates requested logical actions into subsystem actions and requests the execution of these subsystem actions from the corresponding subsystems. The subsystems are responsible for the actual execution of subsystem actions. They store requested subsystem actions in buffers and execute them in order of arrival. The subsystems run independently from each other and from LACs. Moreover, the buffered execution of subsystem actions allows for the parallel (overlapped) execution of multiple logical actions. That is, a subsystem action from a next logical action can be executed while a subsystem action from the previous logical action is still in the queue (unless this situation violates the execution order within the logical actions). For example, two instances of the logical action take_a_snapshot (defined in Figure 2.3) overlap while being executed, if the subsystem action MoveParking of the first instance is still in the queue of the Handler (waiting to be executed) and the subsystem action AdjustFrame of the second instance is already being executed by Sensor. Such an approach is intended to improve the throughput of a machine.

The dynamic semantics of LACE determines the independent execution of subsystem actions on different subsystems and, at the same time, takes care of the proper order of their executions and their synchronization.

In order to modularize (and thus, to improve the understandability of) the description of the dynamic semantics of LACE, we introduce semantic features (SFs). A semantic feature is a module of the dynamic semantics description (informal or formal) which can be considered separately from and/or as an extension to other semantic features. Next to the modularity introduced by the DSL syntax (DSL constructs, such as a logical action, or a subsystem action) and the modularity introduced by the DSL architecture (software components, such as LAC and subsystems), semantic features modularize the dynamic semantics, i.e. the (many-to-many) mapping of the DSL constructs to the DSL software components. For LACE we distinguish the following semantic features.

Core SF The order of the execution of subsystem actions on each subsystem corresponds to (is a subset of) the order of requests of logical actions that have been translated into these subsystem actions. This is achieved in the LAC layer by processing logical actions one by one: the next logical action can be requested only after the previous one has been completely translated into subsystem action requests.

Scan SF Subsystem actions, combined together into a scan, are executed simultaneously. This means that if a subsystem needs to execute its subsystem action, which is a part of a scan, then this subsystem has to wait for the other subsystems, involved in this scan, to execute the scan together. The corresponding behavior is realized in the subsystems layer.

Order SF A partial order on the subsystem actions of a logical action determines parallel and sequential execution of the subsystem actions. This feature is implemented in the LACs layer: a subsystem action cannot be requested before the execution of the subsystem actions that precede it has finished.

Data SF Subsystem actions in a logical action description may require and produce data. This implies that a subsystem action that requires some data cannot be requested before the execution of the subsystem action that produces this data. The dataflow is taken care of by the LACs layer.

In our specification of the dynamic semantics of LACE we remain at a high level of abstraction and do not describe the details of the interaction protocol and the synchronization mechanism.
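The buffered execution in the subsystems layer and the Scan SF can be sketched together at the same level of abstraction. The Python below is an illustrative assumption, not the ASML implementation: the function `startable`, the `Request` tuple layout, and the scan identifier `scan#1` are hypothetical, and the LACs layer (Core, Order, and Data SFs) is not modeled, so requests are simply appended in the order in which a LAC is assumed to issue them.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Set, Tuple

# A request is (action name, logical action instance, scan id or None).
Request = Tuple[str, int, Optional[str]]


def startable(queues: Dict[str, Deque[Request]],
              scans: Dict[str, Set[str]]) -> List[str]:
    """Subsystems whose head-of-queue request may start executing now."""
    ready = []
    for subsystem, queue in queues.items():
        if not queue:
            continue
        scan = queue[0][2]
        # A scan action starts only when every subsystem participating in
        # the scan has that same scan at the head of its own queue.
        if scan is None or all(
            queues[peer] and queues[peer][0][2] == scan for peer in scans[scan]
        ):
            ready.append(subsystem)
    return sorted(ready)


queues: Dict[str, Deque[Request]] = {
    "Laser": deque(), "Sensor": deque(), "Handler": deque(),
}
scans = {"scan#1": {"Laser", "Sensor"}}

# Laser receives its part of the scan first and has to wait for Sensor.
queues["Laser"].append(("ProduceLight", 1, "scan#1"))
laser_alone = startable(queues, scans)

# Once Sensor has its part at the head of its queue, both start together.
queues["Sensor"].append(("GrabAFrame", 1, "scan#1"))
scan_together = startable(queues, scans)
queues["Laser"].popleft()
queues["Sensor"].popleft()

# Overlap of two instances: MoveParking of instance 1 is still queued on
# Handler while Sensor already executes AdjustFrame of instance 2.
queues["Handler"].append(("MoveParking", 1, None))
queues["Sensor"].append(("AdjustFrame", 2, None))
executed_by_sensor = queues["Sensor"].popleft()
still_queued_on_handler = list(queues["Handler"])
```

Each subsystem consumes its own queue in order of arrival and independently of the others. The only coupling in this sketch is the scan check, which makes a subsystem wait until all participants of the scan are ready.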

2.3 Use Cases and User Roles of LACE Specification

A high-level description of logical actions in a LACE program is translated into the invocations of hardware drivers in such a way that the resulting execution matches the behavior defined in the LACE program. The semantic gap between LACE and driver functions is wide, and thus, the translation is hard to develop, maintain, understand, and use. We aim at constructing a formal specification of the dynamic semantics of LACE in order to enhance understandability, maintainability and usage of the DSL translation. This can be achieved through applying different kinds of analysis to the formal specification of LACE. In relation to the DSL-based development process, we can identify two different roles of users: a DSL developer and a DSL end-user. Figure 2.5 depicts how these two user roles can benefit from a formal specification of LACE.

Figure 2.5: Use cases and user roles of LACE specification

A DSL developer designs and develops the DSL by constructing its metamodel and implementing its dynamic semantics through various translations (for example, through code generation). A formal specification of the DSL dynamic semantics can be used by the DSL developer to verify that the design of the DSL is consistent (non-contradictory), feasible, and complete (the left top bubble in Figure 2.5). Moreover, the DSL developer can validate the dynamic semantics of the DSL by executing various (sample) programs and ensuring that the observed behavior corresponds to his/her expectations (the left lower bubble in Figure 2.5).

A DSL user specifies DSL programs as instances of the DSL metamodel and executes the source code generated from these programs. A DSL user usually does not know how the DSL is implemented (i.e. how the translation works) and employs high-level user manuals to learn the DSL constructs and their (approximate) semantics. A formal specification of the DSL dynamic semantics can be used by a DSL user to simulate the execution of his/her programs (the right bottom bubble in Figure 2.5). In this way, a DSL user can obtain a better understanding of the dynamic semantics of the DSL and improve his/her programs according to this obtained knowledge. Moreover, a (more advanced) DSL user can analyze the behavior of his/her programs, for example by assessing their throughput or checking various (temporal) properties (the right top bubble in Figure 2.5).

The described use cases and user roles determine the choice of a specification formalism for defining the dynamic semantics of LACE. There exist a number of approaches for defining the dynamic semantics of general purpose languages (GPLs), such as denotational and algebraic semantics [108, 73], action semantics [70], and structural operational semantics (SOS) [78]. However, these formalisms do not have practically applicable tool support. At the same time, there exist quite a number of tools that support formalisms for specifying the behavior of hardware and software systems, such as Z, B, Event-B, and Abstract State Machines (ASM). These formalisms are not designed for specifying DSL semantics, but allow for specifying and analyzing behavior in general. Aiming for the practical benefits of having a formal definition of the DSL dynamic semantics (i.e. for realizing the use cases listed above), we have chosen to perform our case study with the Event-B formalism and the Rodin platform (Figure 2.5 in the middle). The next section briefly introduces this formalism and describes how the Rodin tool set implements the use cases.

2.4 Event-B and Rodin

Event-B is an evolution of the B method, both introduced by Abrial [1, 2]. Event-B employs set theory and first-order logic for specifying software and/or hardware behavior. An Event-B specification consists of contexts and machines. A context describes the static part of a system: sets, constants, and axioms. A machine uses the context to specify the behavior of a system via a state-based formalism. Variables of the machine define the state space. Events, which change the values of these variables, define transitions between the states. An event consists of guards and actions, and can have parameters. An event can occur only when its guards are true, and as a result of the event its actions are executed (in parallel). Parameters represent existentially quantified variables local to the event and are used in its guards and actions to represent input or auxiliary data. The properties of the system are specified as invariants, which should hold for all reachable states. Moreover, one can specify theorems based on the global axioms or invariants of the specification or on the local guards of an event. The Rodin platform [3] offers various tools (plug-ins) for Event-B:

• editors of Event-B specifications, that implement the familiar notation of set theory and predicates in Unicode, supporting the what-you-see-is-what-you-get (WYSIWYG) principle;
• syntactical analysis of Event-B specifications, that includes type checking of various formulas of set theory and first-order logic;
• automatic generation of proof obligations, that help to ensure that an Event-B specification is (semantically) consistent (for example, that invariants are preserved and predicates are well-defined);
• interactive and automatic provers, that facilitate discharging of proof obligations and/or theorems of the specification;
• model checking, that can be used to verify if a specified behavior respects certain properties;
• animation (i.e. execution of a specification), that allows for debugging of Event-B specifications;
• a graphical editor for constructing domain-specific visualizations of Event-B specifications.

Thus, Event-B and Rodin in principle allow for realizing the use cases described in Section 2.3 (Figure 2.5 in the center). Syntactical analysis of Event-B specifications and discharging proof obligations ensure that an Event-B specification is correct-by-construction. This contributes to the verification of the DSL design (that it is consistent, feasible, and complete). Model checking allows for verification of more complicated properties of the dynamic semantics of the DSL. Animation of Event-B specifications can be used for validation of the DSL design by the DSL developers and for simulation of DSL programs by DSL users. Moreover, a domain specific visualization can make the animation of Event-B specifications comprehensible for DSL users, who are not familiar with the formal notation of Event-B.

Another benefit of Event-B is that it has an active community of users and developers with annual (informal) workshops, where users’ experiences, newly developed tools and techniques, and functionality requests are presented, discussed, and picked up.3 We introduce the syntax and the semantics of Event-B in the next section, as we apply it to the specification of the LACE DSL.
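As an informal illustration of the notions introduced in this section (variables, guarded events, invariants, and exhaustive exploration of the reachable states), the following Python sketch mimics a very small machine. It is an analogy under stated assumptions, not Event-B notation and not the Rodin API: the constants `TOTAL` and `CAPACITY`, the variables `to_request`, `queued`, and `done`, and the events `request` and `execute` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]

TOTAL = 3      # hypothetical constant: number of actions to be requested
CAPACITY = 2   # hypothetical constant: size of a subsystem buffer


@dataclass(frozen=True)
class Event:
    name: str
    guard: Callable[[State], bool]
    action: Callable[[State], State]   # new values of the changed variables

    def fire(self, state: State) -> State:
        # All assignments read the pre-state and are applied at once,
        # mirroring the parallel execution of the actions of an event.
        return {**state, **self.action(state)}


EVENTS = [
    Event("request",
          guard=lambda s: s["to_request"] > 0 and s["queued"] < CAPACITY,
          action=lambda s: {"to_request": s["to_request"] - 1,
                            "queued": s["queued"] + 1}),
    Event("execute",
          guard=lambda s: s["queued"] > 0,
          action=lambda s: {"queued": s["queued"] - 1,
                            "done": s["done"] + 1}),
]


def invariant(s: State) -> bool:
    """Property that should hold in every reachable state."""
    return (0 <= s["queued"] <= CAPACITY
            and s["to_request"] + s["queued"] + s["done"] == TOTAL)


def explore(initial: State) -> Tuple[List[State], List[State]]:
    """Breadth-first exploration of all states reachable via enabled events."""
    def freeze(s: State) -> Tuple[Tuple[str, int], ...]:
        return tuple(sorted(s.items()))

    seen = {freeze(initial)}
    frontier = deque([initial])
    reachable: List[State] = []
    violations: List[State] = []
    while frontier:
        state = frontier.popleft()
        reachable.append(state)
        if not invariant(state):
            violations.append(state)
        for event in EVENTS:
            if event.guard(state):          # an event occurs only if enabled
                successor = event.fire(state)
                if freeze(successor) not in seen:
                    seen.add(freeze(successor))
                    frontier.append(successor)
    return reachable, violations


INITIAL = {"to_request": TOTAL, "queued": 0, "done": 0}
reachable, violations = explore(INITIAL)
```

Firing one enabled event at a time corresponds loosely to animation, while `explore` corresponds loosely to model checking of an invariant over a finite state space. Proof obligations, by contrast, establish invariant preservation without enumerating states, which this sketch does not attempt.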

2.5 Specification of LACE Dynamic Semantics in Event-B

The dynamic semantics of LACE is specified in Event-B as a set of Event-B contexts and machines. Event-B contexts represent data structures, which are used in Event-B machines for specifying behavior. This explicit separation of data structure from behavior is very useful for constructing Event-B specifications for the different (meta-)levels of LACE.

In the MDE context a DSL resides at two abstraction levels: the DSL metamodel and DSL models (programs). The DSL is designed and implemented on the metamodel level, and it is used via instantiating DSL programs on the model level. On the metamodel level, a generic Event-B specification of the DSL can be created and analyzed once (for example, by discharging proof obligations generated for the specification). On the model level, Event-B specifications of DSL programs need to be constructed many times, for each concrete DSL program (for example, to model check its state space or simulate its execution). In other words, the different use cases applicable on these two levels (i.e. the different kinds of analysis of the corresponding Event-B specifications) require different levels of detail in the specified LACE data structures. Therefore, we specify data structures from different levels in different Event-B contexts: a metamodel context and a model context (see Figure 2.6). In Section 2.5.1 we describe the details of this technique.

3 www.event-b.org, http://wiki.event-b.org/index.php/Main_Page
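The split between a metamodel context and a model context can be pictured with a loose Python analogy. The sketch below rests on assumptions made only for illustration: the class `Context`, the axiom checked by `axioms_hold`, and the function `initial_queues` are hypothetical and do not reproduce the Event-B contexts of LACE. The ownership of the three actions follows the examples given in Section 2.2.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Context:
    """Names declared once at the metamodel level; values come from a model."""
    subsystems: Set[str]
    actions: Set[str]
    owner: Dict[str, str]   # subsystem action -> subsystem executing it


def axioms_hold(ctx: Context) -> bool:
    """Metamodel-level axiom (assumed): owner is a total function from
    subsystem actions to subsystems."""
    return (set(ctx.owner) == ctx.actions
            and set(ctx.owner.values()) <= ctx.subsystems)


def initial_queues(ctx: Context) -> Dict[str, List[str]]:
    """Generic behavior, written only against the metamodel-level names."""
    return {subsystem: [] for subsystem in sorted(ctx.subsystems)}


# Model-level context: concrete values for one (partial) LACE program.
snapshot_context = Context(
    subsystems={"Laser", "Sensor", "Handler"},
    actions={"ProduceLight", "GrabAFrame", "MoveParking"},
    owner={"ProduceLight": "Laser",
           "GrabAFrame": "Sensor",
           "MoveParking": "Handler"},
)

# A context that breaks the axiom: an action owned by an unknown subsystem.
broken_context = Context(
    subsystems={"Laser"},
    actions={"ProduceLight"},
    owner={"ProduceLight": "Projector"},
)

queues = initial_queues(snapshot_context)
```

The generic part (`axioms_hold`, `initial_queues`) is written and analyzed once, while a new `Context` value is produced for every program. This corresponds to the reason given above for keeping the two levels in separate contexts.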

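[Figure 2.6 (diagram). Columns: DSL / EMF, Event-B, Rodin. The LACE metamodel is transformed (M2M) into the metamodel context, which is seen by the conceptual machines; these are the input for discharging POs. A LACE model (program), which instantiates the metamodel, is transformed (M2M) into a model context, which is seen by the composite machine; the composite machine is the input for model checking and simulation.]

Figure 2.6: Instantiation and composition of Event-B specifications of LACE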

The specification of the dynamic semantics of LACE in Event-B is not trivial: the constructs of LACE do not map one-to-one to the constructs of Event-B. Therefore, to make the specification clear and comprehensible, we strive towards a modular design of the specification. For this, we identify two dimensions of modularity that are present in the dynamic semantics of LACE: architectural modularity and semantic modularity. The first dimension is formed by the components of two layers of the LACE software architecture: subsystems and LACs. The second dimension consists of the semantic features listed in Section 2.2. We specify these modules in Event-B in separate conceptual machines using the metamodel context of LACE. Using one of the (de)composition techniques of Event-B, we compose conceptual machines together into a composite machine that specifies the behavior of a concrete LACE program (see Figure 2.6). In Section 2.5.2 we describe the details of this technique.

To automate the construction of model contexts and of composite machines for an arbitrary LACE program, we developed a model transformation from LACE to Event-B (M2M in Figure 2.6). The LACE-to-Event-B model transformation implements both the modularity of the dynamic semantics of LACE and the two (meta-)levels of its specification, as described further in Sections 2.5.1 and 2.5.2. The technical details and the results are given in Section 2.5.3. The work presented in this section is published in [98].

2.5.1 Specification of LACE Structure in Event-B

2.5.1.1 Event-B contexts

To specify the dynamic semantics of LACE in Event-B, we first need to specify the structure of LACE in Event-B. This is done in Event-B contexts. As we mentioned above, the structure of LACE (as of any other DSL in MDE) takes two different forms: the metamodel and a model. The metamodel of LACE describes the constructs of the language and how these constructs can be combined. A model instantiates the language constructs for a concrete application of LACE. In Event-B we specify both forms of the LACE structure: in a metamodel context and in a model context.

The metamodel context captures the structure specified in the LACE metamodel. For example, Figure 2.7(a) shows the metamodel context for the two semantic features of LACE: Core SF and Order SF. The DSL constructs are introduced via Event-B sets (the ‘SETS’ section): SSAction for subsystem actions, LogicalAction for logical actions, and Occurrence to model occurrences of subsystem actions within a logical action.⁴ The relationships between the DSL constructs are introduced via Event-B constants (the ‘CONSTANTS’ section): LALabelDef, LAOrderDef, and SS1. Event-B axioms specify the structure (types) of these relationships and define additional properties for them (i.e. static semantics). For example, LALabelDef is specified in axm1 in Figure 2.7(a): LALabelDef associates each logical action (LogicalAction) with a subset of occurrences of subsystem actions (modeled as a partial function Occurrence ⇸ SSAction). The relation LAOrderDef introduces a partial order on these subsets of occurrences (separately for each logical action): axm3 associates each LogicalAction with a relation on Occurrences; axm4 ensures that this relation is in the scope of the LALabelDef of this logical action; axm5 forbids cycles in this relation. The constant SS1 is used to distinguish subsystem actions of a separate subsystem: axm2 defines it as a subset (⊆) of all available subsystem actions SSAction. As on the level of the LACE metamodel we do not know which particular subsystems participate in a concrete logical action (specified on the model level), we use SS1 as a placeholder, which is instantiated for each concrete subsystem of a LACE model in the corresponding model context.

(a) Metamodel context

    CONTEXT lace_metamodel_order_context
    SETS
      SSAction, LogicalAction, Occurrence
    CONSTANTS
      LALabelDef, LAOrderDef, SS1
    AXIOMS
      axm1 : LALabelDef ∈ LogicalAction → (Occurrence ⇸ SSAction)
      axm2 : SS1 ⊆ SSAction
      axm3 : LAOrderDef ∈ LogicalAction → (Occurrence ↔ Occurrence)
      axm4 : ∀la · la ∈ LogicalAction ⇒
               (dom(LAOrderDef(la)) ⊆ LALabelDef(la) ∧
                ran(LAOrderDef(la)) ⊆ LALabelDef(la))
      axm5 : ∀la, s · la ∈ LogicalAction ∧ LAOrderDef(la) ≠ ∅ ∧
               s ⊆ dom(LAOrderDef(la)) ∧ s ≠ ∅ ⇒
               s ⊈ LAOrderDef(la)[s]
    END

(b) Model context

    CONTEXT takeasnapshot_model_order_context
    SETS
      SSAction, LogicalAction, Occurrence
    CONSTANTS
      LALabelDef, LAOrderDef,
      laTAS, Laser, Sensor, Handler, Projector,
      ssaPL, ssaAF, ssaGAF, ssaPO, ssaMP, ssaSG,
      o1, o2, o3, o4, o5, o6
    AXIOMS
      axm1 : partition(SSAction, {ssaPL}, {ssaAF}, {ssaGAF}, {ssaPO}, {ssaMP}, {ssaSG})
      axm2 : partition(Laser, {ssaPL})
      axm3 : partition(Sensor, {ssaAF}, {ssaGAF})
      axm4 : partition(Handler, {ssaPO}, {ssaMP})
      axm5 : partition(Projector, {ssaSG})
      axm6 : partition(LogicalAction, {laTAS})
      axm7 : partition(Occurrence, {o1}, {o2}, {o3}, {o4}, {o5}, {o6})
      axm8 : partition(LALabelDef, {laTAS ↦ {o1 ↦ ssaPL, o2 ↦ ssaAF, o3 ↦ ssaGAF,
               o4 ↦ ssaPO, o5 ↦ ssaMP, o6 ↦ ssaSG}})
      axm9 : partition(LAOrderDef, {laTAS ↦ {o1 ↦ o2, o1 ↦ o4, o3 ↦ o2, o3 ↦ o4,
               o6 ↦ o1, o6 ↦ o3, o5 ↦ o6}})
    END

Figure 2.7: Contexts of the specification of Core SF with Order SF

In a model context, the sets and constants introduced in the metamodel context are assigned concrete values. These values represent (correspond to) the concrete LACE model (program). For example, Figure 2.7(b) shows the model context that instantiates the metamodel context for the LACE program depicted in Figure 2.3. For this, the LACE-to-Event-B model transformation generates extra constants to represent concrete values (i.e. objects) appearing in the LACE program in Figure 2.3:

• the constant laTAS represents the logical action take_a_snapshot;
• the constants Laser, Sensor, Handler, and Projector represent the subsystems Laser, Sensor, Handler, and Projector;
• the constants ssaPL, ssaAF, ssaGAF, etc. represent the subsystem actions ProduceLight, AdjustFrame, GrabAFrame, etc.;
• the constants o1, o2, o3, etc. represent occurrences of the subsystem actions in the logical action.

⁴ In the rest of this thesis, we use this font convention when referring to elements of an Event-B specification.

The axioms of the model context (Figure 2.7(b)) assign values to the constructs and structural relations introduced in the metamodel context. For example, axm1 defines SSAction as the set consisting of the elements ssaPL, ssaAF, ssaGAF, ssaPO, ssaMP, and ssaSG (i.e. the constants representing subsystem actions). Axioms axm2-5 split this set into the subsets representing the subsystems Laser, Sensor, Handler, and Projector.

As depicted in Figure 2.6, a conceptual machine uses the metamodel context to specify the dynamic semantics of LACE in terms of the metamodel. Based on the structural properties specified in the axioms of the metamodel context, the Rodin tool set generates proof obligations for the conceptual machines. By discharging these proof obligations, a DSL developer can ensure that the Event-B specification of LACE is consistent and complete. In this way, the dynamic semantics of LACE is specified and analyzed on the metamodel level. The metamodel context and the conceptual machines for a specific DSL are constructed manually and only once.

When the Event-B sets and relations instantiated in the model context are used instead of the abstract sets and relations of the metamodel context, the Event-B machine (a composite machine in Figure 2.6) specifies the behavior of a concrete LACE program. Such a specification can be model checked and animated, allowing for the analysis of a particular LACE program. Model contexts are generated automatically from LACE programs by the LACE-to-Event-B model transformation.
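As an illustration only (this is not part of Rodin or of the LACE-to-Event-B transformation), the role of the partition axioms in the model context can be mimicked on finite Python sets. The helper name is_partition and the Python encoding of the constants are assumptions made for this sketch; the values are copied from Figure 2.7(b).

```python
def is_partition(whole, *parts):
    """Mimic Event-B partition(S, p1, ..., pn) on finite Python sets:
    the parts cover S and are pairwise disjoint."""
    union = set().union(*parts)
    pairwise_disjoint = sum(len(p) for p in parts) == len(union)
    return union == set(whole) and pairwise_disjoint


# Values copied from the model context in Figure 2.7(b).
SS_ACTION = {"ssaPL", "ssaAF", "ssaGAF", "ssaPO", "ssaMP", "ssaSG"}
SUBSYSTEMS = {
    "Laser": {"ssaPL"},
    "Sensor": {"ssaAF", "ssaGAF"},
    "Handler": {"ssaPO", "ssaMP"},
    "Projector": {"ssaSG"},
}

# axm3: partition(Sensor, {ssaAF}, {ssaGAF})
sensor_ok = is_partition(SUBSYSTEMS["Sensor"], {"ssaAF"}, {"ssaGAF"})

# Not an axiom of the figure, but a consequence of axm1-axm5 for these values:
# the four subsystems split SSAction into disjoint subsets.
subsystems_split_actions = is_partition(SS_ACTION, *SUBSYSTEMS.values())
```

In Event-B, a partition over singleton sets additionally states that the listed constants are pairwise distinct; in this Python encoding distinct strings are distinct by construction, so only the covering and disjointness parts are actually checked.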

2.5.1.2 Generic instantiation

In the context of MDE, the MOF standard defines the conformance relationship between a model and a metamodel. When one creates a LACE model, this model conforms to the LACE metamodel, i.e. it instantiates its syntax constructs and obeys the rules of its static semantics. In Event-B we can connect the model and the metamodel contexts by a relationship that parallels the conformance relationship between a model and the metamodel, using generic instantiation [4]. Compared to the MOF conformance relationship, Event-B generic instantiation ensures the structural conformance (of a model to the metamodel) for the purpose of reusing the Event-B specification and its discharged proof obligations.

According to the generic instantiation technique, if all structural properties that are defined in the metamodel context can be derived for the structure that is instantiated in the model context, then the discharged proof obligations for the Event-B machine can be extended straightforwardly from the metamodel level to the model level [4]. In other words, if we show that the sets and constants of the model context conform to the axioms of the metamodel context (for example, that axm4 of the metamodel holds for the LALabelDef and LAOrderDef as specified in axm8 and axm9 of the model context), then we can reuse the proof obligations, discharged for the conceptual machines on the metamodel level, for the composite machine on the model level (Figure 2.6).

In [88] and [10] it is proposed to use theorem proving to show that the metamodel properties can be derived for the model context. However, due to the large sizes of our model contexts (generated from LACE programs), the automatic provers of Rodin fail to discharge such (instantiation) theorems. For example, if a LACE model consists of three logical actions, each of which consists of six to ten subsystem actions, then the structural properties defined in the axioms in Figure 2.7(a) cannot be proved automatically for the corresponding model context using the Rodin provers. On the other hand, we do not expect an average DSL user to prove these theorems using the Rodin interactive provers, as this requires knowledge of propositional calculus and an understanding of proof strategies. Therefore, instead of theorem proving, we employ evaluation of the predicates that capture structural properties in the ProB animator integrated in Rodin [60]. Thus, we check the conformance relationship between a model context and the metamodel context in Event-B for each concrete LACE program (in a semi-automatic way).

Such a check might seem superfluous, as each LACE model already conforms to the LACE metamodel. However, the metamodel context for LACE and the LACE-to-Event-B model transformation are constructed manually. Checking that a model context generated by the transformation instantiates the metamodel context can reveal potential errors (such as inconsistencies or gaps) in these manually constructed artifacts. Therefore, such a check is useful.
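To make the idea of evaluating structural predicates on concrete values more tangible, the following Python sketch evaluates axm4 and axm5 of Figure 2.7(a) on the values of axm8 and axm9 of Figure 2.7(b). It is only an analogy to what a predicate evaluator does and does not use ProB. The function names are hypothetical, and axm4 is read here as a constraint on the occurrences labelled by LALabelDef(la), which is an assumption about the intended typing.

```python
from itertools import combinations

# Values of axm8 and axm9 in Figure 2.7(b).
LA_LABEL_DEF = {"laTAS": {"o1": "ssaPL", "o2": "ssaAF", "o3": "ssaGAF",
                          "o4": "ssaPO", "o5": "ssaMP", "o6": "ssaSG"}}
LA_ORDER_DEF = {"laTAS": {("o1", "o2"), ("o1", "o4"), ("o3", "o2"), ("o3", "o4"),
                          ("o6", "o1"), ("o6", "o3"), ("o5", "o6")}}


def image(relation, subset):
    """Relational image r[s]."""
    return {b for a, b in relation if a in subset}


def axm4_holds(label_def, order_def):
    """Each order relation only mentions occurrences of its logical action."""
    for la, relation in order_def.items():
        labelled = set(label_def[la])  # assumption: occurrences labelled for la
        for a, b in relation:
            if a not in labelled or b not in labelled:
                return False
    return True


def axm5_holds(order_def):
    """Literal reading of axm5: no non-empty s within dom(r) has s ⊆ r[s]."""
    for relation in order_def.values():
        domain = sorted({a for a, _ in relation})
        for size in range(1, len(domain) + 1):
            for candidate in combinations(domain, size):
                s = set(candidate)
                if s <= image(relation, s):
                    return False
    return True
```

The literal enumeration of subsets in axm5_holds is exponential in the size of the domain; it is chosen here because it mirrors the axiom term by term, not because it is an efficient cycle check.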

2.5.2 Composition of LACE Behavior in Event-B

As we mentioned before, we employ a modular approach in order to facilitate development and understandability of the Event-B specification of the LACE dynamic semantics. For this, we identify two types of modules (dimensions of modularity) in the LACE dynamic semantics: architectural components (such as LACs and subsystems) and semantic features (such as Core SF and Scan SF). These two different types of modularity require different techniques for composing conceptual machines (that specify separate modules) into the composite Event-B machine (that specifies the dynamic semantics of LACE): viz. the shared event composition technique [89] and weaving of Event-B code.

2.5.2.1 Event-B machines

As mentioned earlier, each module of the Event-B specification of the LACE dynamic semantics is represented in a separate Event-B machine. For example, the core semantics (Core SF) of a logical action component (LAC) and of a subsystem (SS) is specified in the machines depicted in Figures 2.8(a) and 2.8(b), respectively. Each of these machines uses (‘SEES’) the same Event-B context lace_metamodel_core_context, which is a subset of the metamodel context discussed in Section 2.5.1. Below we explain the details of the Event-B syntax using the example of the LAC machine.

LAC is responsible for processing a requested logical action by translating it into requests of the corresponding subsystem actions. In the machine lace_LAC_core the variable curr_la represents the logical action that is currently being processed, and the variable curr_job stores the (occurrences of) subsystem actions that still need to be requested (from the corresponding subsystems). The invariants specify the corresponding types of the variables: inv1 defines curr_job as a subset of subsystem action occurrences; inv2 defines curr_la as a logical action. In the special event Initialisation (standard for Event-B), we assign default values to the variables according to their types: curr_job is empty (act1); curr_la is any logical action from the set LogicalAction (act2).

(a) Event-B machine specifying the behavior of LAC

    MACHINE lace_LAC_core
    SEES lace_metamodel_core_context
    VARIABLES
      curr_job, curr_la
    INVARIANTS
      inv1 : curr_job ∈ ℙ(Occurrence)
      inv2 : curr_la ∈ LogicalAction
    EVENTS
      Initialisation
        begin
          act1 : curr_job := ∅
          act2 : curr_la :∈ LogicalAction
        end
      Event request_la ≙
        any la
        where
          grd1 : la ∈ LogicalAction
          grd2 : curr_job = ∅
        then
          act3 : curr_job := dom(LALabelDef(la))
          act4 : curr_la := la
        end
      Event request_ssa ≙
        any occurrence, ssaction
        where
          grd3 : curr_job ≠ ∅
          grd4 : occurrence ∈ curr_job
          grd5 : occurrence ↦ ssaction ∈ LALabelDef(curr_la)
        then
          act5 : curr_job := curr_job \ {occurrence}
        end
    END

(b) Event-B machine specifying the behavior of SS

    MACHINE lace_SS1_core
    SEES lace_metamodel_core_context
    VARIABLES
      ss1_buffer
    INVARIANTS
      inv1 : ss1_buffer ∈ ℕ ⇸ SS1
      inv2 : finite(ss1_buffer)
    EVENTS
      Initialisation
        begin
          act1 : ss1_buffer := ∅
        end
      Event request_ssa ≙
        any ssaction, ss1_index
        where
          grd1 : ssaction ∈ SS1
          grd2 : ss1_index ∈ ℕ
          grd3 : ss1_buffer ≠ ∅ ⇒ (∀i · i ∈ dom(ss1_buffer) ⇒ ss1_index > i)
        then
          act2 : ss1_buffer := ss1_buffer ∪ {ss1_index ↦ ssaction}
        end
      Event execute_ssa ≙
        any ssaction, ss1_index
        where
          grd4 : ss1_buffer ≠ ∅
          grd5 : ss1_index ↦ ssaction ∈ ss1_buffer
          grd6 : ∀i · i ∈ dom(ss1_buffer) ⇒ ss1_index ≤ i
        then
          act3 : ss1_buffer := ss1_buffer \ {ss1_index ↦ ssaction}
        end
    END

Figure 2.8: Event-B machines specifying the core dynamic semantics of LAC and SS1

The (major part of the) dynamic semantics of Core SF for LAC is defined in the events of the machine: request_la and request_ssa. The request_la event is used for requesting a logical action. The logical action being requested is represented by the (local) parameter la (‘any’ section). The guards (‘where’) check the type of the local parameter (grd1) and ensure that the previous logical action has been processed (i.e. that curr_job is empty, grd2). As a result of this event (‘then’), curr_job is assigned the subset of occurrences that constitute the logical action la according to the structure defined in LALabelDef (act3), and curr_la is updated to the requested logical action (act4). The request_ssa event empties curr_job by removing subsystem action occurrences from it one by one (act5).

Figure 2.8(b) shows the Event-B machine specifying a subsystem (SS). Each subsystem stores the requested subsystem actions in its buffer (in the request_ssa event) and executes them one by one in first-in-first-out (FIFO) order (in the execute_ssa event). To respect the FIFO order of requested subsystem actions, the variable ss1_buffer is modeled as a partial function from natural numbers to subsystem actions of this subsystem (ℕ ⇸ SS1). When a new subsystem action is requested, it is added to the buffer coupled with an index that is bigger than any other index in the buffer (grd3 in the request_ssa event). Correspondingly, in the execute_ssa event, the subsystem action with the lowest index is executed (grd6).
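The guarded events of Figure 2.8 can be read operationally. The following Python sketch is a rough, deterministic analogy (class and method names are hypothetical): each method returns False when the guards of the corresponding event do not hold, and otherwise performs the actions. Unlike Event-B, where the parameters of execute_ssa are chosen nondeterministically subject to the guards, the sketch directly picks the only pair that satisfies grd5 and grd6.

```python
class LacCore:
    """Sketch of lace_LAC_core (Figure 2.8(a))."""

    def __init__(self, label_def):
        self.label_def = label_def              # LALabelDef: la -> {occurrence: ssaction}
        self.curr_job = set()                   # act1
        self.curr_la = next(iter(label_def))    # act2: some logical action

    def request_la(self, la):
        if la not in self.label_def or self.curr_job:    # grd1, grd2
            return False
        self.curr_job = set(self.label_def[la])           # act3
        self.curr_la = la                                  # act4
        return True

    def request_ssa(self, occurrence, ssaction):
        labels = self.label_def[self.curr_la]
        # grd3-grd5: the occurrence is pending and is labelled with ssaction
        if occurrence not in self.curr_job or labels.get(occurrence) != ssaction:
            return False
        self.curr_job.discard(occurrence)                  # act5
        return True


class SubsystemCore:
    """Sketch of lace_SS1_core (Figure 2.8(b))."""

    def __init__(self, actions):
        self.actions = set(actions)    # SS1
        self.buffer = {}               # ss1_buffer: index -> ssaction

    def request_ssa(self, ssaction, index):
        if ssaction not in self.actions or index < 0:      # grd1, grd2
            return False
        if self.buffer and index <= max(self.buffer):       # grd3
            return False
        self.buffer[index] = ssaction                        # act2
        return True

    def execute_ssa(self):
        if not self.buffer:                                  # grd4
            return None
        index = min(self.buffer)                             # grd5, grd6
        return self.buffer.pop(index)                        # act3
```

Because indices only have to grow, gaps between them are allowed; the FIFO order follows from always removing the entry with the lowest index.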

2.5.2.2 Architectural modularity

The LAC and SS components described above interact with each other using the request_ssa event: LAC sends the request for the execution of a subsystem action to SS. To model this interaction, we use the Event-B shared event composition technique [89]. This technique allows for constructing an Event-B machine from constituent machines (modules) by synchronizing (weaving/composing together) their events, with no common variables allowed in the constituent machines. Thus, in a composite machine lace_LACxSS_core the request_ssa events of the LAC and SS machines are composed together (synchronized) into a new composite event request_ssa. Figure 2.9 shows the corresponding fragment of the composite machine. The (non-overlapping, independent) variables of the constituent machines are merged in the composite machine, the invariants are conjoined, and the other (non-synchronized) events are copied to the composite machine without any change.

    MACHINE lace_LACxSS1_core
    SEES lace_metamodel_core_context
    VARIABLES
      curr_job, curr_la, ss1_buffer
    INVARIANTS
      lac_inv1 : curr_job ∈ ℙ(Occurrence)
      lac_inv2 : curr_la ∈ LogicalAction
      ss1_inv1 : ss1_buffer ∈ ℕ ⇸ SS1
      ss1_inv2 : finite(ss1_buffer)
    EVENTS
      ...
      Event request_ssa ≙
        any occurrence, ssaction, ss1_index
        where
          lac_grd1 : curr_job ≠ ∅
          lac_grd2 : occurrence ∈ curr_job
          lac_grd3 : occurrence ↦ ssaction ∈ LALabelDef(curr_la)
          ss1_grd1 : ssaction ∈ SS1
          ss1_grd2 : ss1_index ∈ ℕ
          ss1_grd3 : ss1_buffer ≠ ∅ ⇒ (∀i · i ∈ dom(ss1_buffer) ⇒ ss1_index > i)
        then
          lac_act1 : curr_job := curr_job \ {occurrence}
          ss1_act1 : ss1_buffer := ss1_buffer ∪ {ss1_index ↦ ssaction}
        end
      ...

Figure 2.9: Event-B machine composed of LAC and SS1

The composite event request_ssa in Figure 2.9 uses the conjunction of the guards of the constituent events (request_ssa in Figures 2.8(a) and 2.8(b)). The actions of the constituent events are combined (and are executed in parallel). The parameters of the constituent events are joined with respect to the shared parameter ssaction, which is used for the exchange of data (synchronization) between the interacting components. Thus, the ssaction requested by LAC (in the request_ssa event in Figure 2.8(a)) is the same as the ssaction accepted by SS for execution (in the request_ssa event in Figure 2.8(b)).

In this way, we use the Event-B shared event composition technique to specify the architectural modularity of LACE. The composition of the LAC and SS machines is implemented using the model transformation LACE-to-Event-B (the M2M arrow from conceptual machines to composite machines in Figure 2.6). Note that, while each subsystem can be specified using a single Event-B machine (such as the one depicted in Figure 2.8(b)), on the model level this machine is instantiated for each concrete subsystem that participates in a logical action. This is necessary in order to model the execution of LACE programs on a real-life waferstepper, where each LAC interacts with multiple subsystems in parallel. As a result, the generated composite machine can be composed of one LAC machine and multiple SS machines. We use the ss1_ prefix to highlight the specification elements that are instantiated (i.e. renamed) on the model level.
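The composition rule used for Figure 2.9 (conjoin the guards, run the actions in parallel on disjoint variables, share the parameter ssaction) can be sketched in Python as follows. This is a simplified analogy, not the shared event composition plug-in of Rodin: the Event type, compose, and fire are hypothetical names, and the context constants are reduced to a two-occurrence example with SS1 playing the role of the Laser subsystem.

```python
from typing import Callable, NamedTuple

# Context constants (values borrowed from Figure 2.7(b); SS1 stands for Laser here).
LA_LABEL_DEF = {"laTAS": {"o1": "ssaPL", "o2": "ssaAF"}}
SS1 = {"ssaPL"}


class Event(NamedTuple):
    guard: Callable[[dict, dict], bool]    # (state, params) -> enabled?
    action: Callable[[dict, dict], dict]   # (state, params) -> variable updates


lac_request_ssa = Event(
    guard=lambda s, p: p["occurrence"] in s["curr_job"]
    and LA_LABEL_DEF[s["curr_la"]].get(p["occurrence"]) == p["ssaction"],
    action=lambda s, p: {"curr_job": s["curr_job"] - {p["occurrence"]}},
)

ss1_request_ssa = Event(
    guard=lambda s, p: p["ssaction"] in SS1
    and p["ss1_index"] >= 0
    and all(p["ss1_index"] > i for i in s["ss1_buffer"]),
    action=lambda s, p: {"ss1_buffer": {**s["ss1_buffer"], p["ss1_index"]: p["ssaction"]}},
)


def compose(*events):
    """Composite event: conjunction of guards, union of (parallel) actions."""
    def guard(state, params):
        return all(e.guard(state, params) for e in events)

    def action(state, params):
        updates = {}
        for e in events:
            updates.update(e.action(state, params))  # every action reads the old state
        return updates

    return Event(guard, action)


def fire(event, state, params):
    """Apply the event if it is enabled; report whether it fired."""
    if not event.guard(state, params):
        return False
    state.update(event.action(state, params))
    return True
```

Since the constituent machines share no variables, the update dictionaries never overlap, so the order in which the constituent actions are collected does not matter.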

2.5.2.3 Semantic modularity

Compared to the architectural components, semantic features do not have clearly defined boundaries and interaction interfaces. Instead, we specify Scan SF, Order SF, and Data SF (introduced in Section 2.2) as independent or gradual extensions of Core SF, as schematically depicted in Figure 2.10(a). The specifications of different SFs can use different data types; can introduce different variables, invariants, guards, and actions; and can even change (adapt) invariants, guards, and actions. This results in a tangled composition of SFs, as presented in Figure 2.10(b).

[Figure 2.10 (diagram). (a) Configurations of machines: Event-B machines for Core SF and for its extensions with Scan SF, Order SF, and Data SF, including their combinations. (b) Weaving semantic features: the LACE-to-EventB transformation weaves the LACE SFs (core, scan, order, data) into Event-B code.]

Figure 2.10: Composition of LACE semantic features in Event-B

For example, Figure 2.11 shows the (simplified) specification of Order SF for LAC. Here, the variable wait_exe stores the subsystem actions (i.e. their occurrences) that should precede other subsystem actions. This variable is initialized when a new logical action is requested (act6 in the request_la event). A subsystem action is removed from wait_exe as soon as it is executed (the execute_ssa event in Figure 2.11). The order of the execution of subsystem actions is achieved through the following specification. A subsystem action cannot be requested (for execution by the corresponding SS) if it depends on (follows) a subsystem action that is still present in the wait_exe subset (grd5 of the request_ssa event). Thus, Order SF adds new elements to the specification of Core SF.

Compared to Order SF, Scan SF changes the type of the parameters in one of the events of Core SF. In particular, in the specification of Scan SF depicted in Figure 2.12(a), LAC requests not single subsystem actions, but sets of those. The sets of subsystem actions with more than one action represent scans. The composition of Scan SF and Order SF is depicted in Figure 2.12(b). The change of the parameter type in Scan SF (compared to Core SF) requires a different way of using this parameter in the specification of Order SF (grd5 in Figure 2.12(b)).

This semantic modularity currently cannot be implemented via the composition techniques of Event-B (without changing/adapting the Event-B specifications of SFs). Instead we weave semantic features together by configuring the generation of Event-B source code in the LACE-to-Event-B model transformation (the self-referential M2M arrow in Figure 2.6).

    MACHINE lace_LAC_order
    SEES lace_metamodel_order_context
    VARIABLES
      curr_job, curr_la, wait_exe
    INVARIANTS
      inv1 : curr_job ∈ ℙ(Occurrence)
      inv2 : curr_la ∈ LogicalAction
      inv3 : wait_exe ∈ ℙ(Occurrence)
    EVENTS
      Initialisation
        begin
          act1 : curr_job := ∅
          act2 : curr_la :∈ LogicalAction
          act3 : wait_exe := ∅
        end
      Event request_la ≙
        any la
        where
          grd1 : la ∈ LogicalAction
          grd2 : curr_job = ∅
        then
          act4 : curr_job := dom(LALabelDef(la))
          act5 : curr_la := la
          act6 : wait_exe := ran(LAOrderDef(la))
        end
      Event request_ssa ≙
        any occurrence, ssaction
        where
          grd3 : occurrence ∈ curr_job
          grd4 : occurrence ↦ ssaction ∈ LALabelDef(curr_la)
          grd5 : LAOrderDef(curr_la)[{occurrence}] ∩ wait_exe = ∅
        then
          act7 : curr_job := curr_job \ {occurrence}
        end
      Event execute_ssa ≙
        any ssaction, occurrence
        where
          grd6 : occurrence ↦ ssaction ∈ LALabelDef(curr_la)
        then
          act8 : wait_exe := wait_exe \ {occurrence}
        end
    END

Figure 2.11: Event-B machine specifying Order SF for LAC
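The effect of wait_exe and grd5 in Figure 2.11 can be traced on the order relation of Figure 2.7(b) with a small Python sketch. The function execution_waves is hypothetical and simplifies the machine in one respect, which is an assumption of the sketch: every requested occurrence is executed immediately, whereas in the Event-B specification requests and executions can interleave more freely.

```python
# Order relation of the logical action laTAS (axm9 in Figure 2.7(b)).
ORDER = {("o1", "o2"), ("o1", "o4"), ("o3", "o2"), ("o3", "o4"),
         ("o6", "o1"), ("o6", "o3"), ("o5", "o6")}
OCCURRENCES = {"o1", "o2", "o3", "o4", "o5", "o6"}


def image(relation, subset):
    """Relational image r[s]."""
    return {b for a, b in relation if a in subset}


def execution_waves(occurrences, relation):
    """Group occurrences into waves of simultaneously requestable occurrences."""
    curr_job = set(occurrences)                  # act4
    wait_exe = {b for _, b in relation}          # act6: ran(LAOrderDef(la))
    waves = []
    while curr_job:
        # grd5: nothing in the image of the occurrence is still waiting
        enabled = sorted(o for o in curr_job
                         if not (image(relation, {o}) & wait_exe))
        if not enabled:
            raise ValueError("no occurrence is enabled (for example, a cyclic order)")
        curr_job -= set(enabled)                 # act7 for every enabled occurrence
        wait_exe -= set(enabled)                 # act8: executed immediately (assumption)
        waves.append(enabled)
    return waves
```

For the take_a_snapshot values, the occurrences o2 and o4 are requestable first, because their images under the order relation are empty.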

2.5.2.4 Intersection of two dimensions of modularity

The intersection of the two types of modularity results in eight conceptual machines and four composition schemes: for each semantic feature we specify a conceptual machine for each software module (LAC and SS) and a scheme of the interaction of LAC with SS. An interaction scheme determines how the LAC and SS machines are composed together. Two examples of interaction schemes are depicted in Figure 2.13.

(a) Event-B machine specifying Scan SF for a LAC

    MACHINE lace_LAC_scan
      ...
      Event request_ssa ≙
        any scan, ssactions
        where
          grd1 : curr_job ≠ ∅
          grd2 : scan ⊆ curr_job
          grd3 : scan ∈ LAScansDef(curr_la)
          grd4 : ssactions = LALabelDef(curr_la)[scan]
        then
          act1 : curr_job := curr_job \ scan
        end
      ...

(b) Event-B machine composing Scan SF and Order SF for a LAC

    MACHINE lace_LAC_order_scan
      ...
      Event request_ssa ≙
        any scan, ssactions
        where
          grd1 : curr_job ≠ ∅
          grd2 : scan ⊆ curr_job
          grd3 : scan ∈ LAScansDef(curr_la)
          grd4 : ssactions = LALabelDef(curr_la)[scan]
          grd5 : LAOrderDef(curr_la)[scan] ∩ wait_exe = ∅
        then
          act1 : curr_job := curr_job \ scan
        end
      ...

Figure 2.12: Fragments of Event-B machines for Scan SF and for Order SF
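The guards of Figure 2.12(b) show how grd5 of Order SF has to be adapted once the event parameter becomes a set: the relational image is taken of the whole scan instead of a singleton. A minimal Python sketch of this guard follows. The function name is hypothetical, and the scan {o1, o3} used in the usage note below is invented for illustration, since the values of LAScansDef are not given in this section.

```python
def image(relation, subset):
    """Relational image r[s]."""
    return {b for a, b in relation if a in subset}


def request_scan_enabled(scan, curr_job, scans_def, order, wait_exe):
    """Guards grd1, grd2, grd3, and grd5 of request_ssa in Figure 2.12(b).

    scans_def stands for LAScansDef(curr_la), encoded as a set of frozensets.
    grd4 only binds ssactions to LALabelDef(curr_la)[scan], so it is not a
    restriction on scan and is omitted here.
    """
    scan = frozenset(scan)
    return (
        bool(curr_job)                              # grd1
        and scan <= curr_job                        # grd2
        and scan in scans_def                       # grd3
        and not (image(order, scan) & wait_exe)     # grd5
    )
```

With the order relation of Figure 2.7(b), a hypothetical scan {o1, o3} would be blocked as long as o2 or o4 is still in wait_exe, and would become requestable once both have been executed.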

[Figure 2.13 (diagram). (a) Core + Scan SFs: a LAC machine (events request_la, request_ssa) and the subsystem machines SsB, SsC, and SsD (events request_ssa, execute_ssa each), with the composite events request_SsB, request_SsC, request_scanCD, and execute_scanCD. (b) Core + Order SFs: a LAC machine (events request_la, request_ssa, notify_ssa_execution) and the subsystem machines SsB and SsC (events request_ssa, execute_ssa each), with the composite events request_SsB, request_SsC, execute_SsB, and execute_SsC.]

Figure 2.13: Architectural composition of Event-B machines for different configurations of semantic features

In Figure 2.13, events of the Event-B machines are represented as interaction interfaces (depicted by lollipops). For example, in Figure 2.13(a) the LAC machine has two events, request_la and request_ssa, for requesting a logical action and for requesting a subsystem action, respectively. Machines SsB, SsC, and SsD are instantiations of the conceptual machine SS. Thus, all of them have two events: request_ssa and execute_ssa. Different styles of dashed lines depict the interaction between machines via composition of events. The names of the resulting composite events (interactions) are typeset slanted. For example, in Figure 2.13(a) the event request_scanCD is a composition of the request_ssa event of the LAC machine and the request_ssa events of the SsC and SsD machines.

The interaction schemes for the LACE semantic features are specified in the LACE-to-Event-B transformation. For example, the interaction scheme for Scan SF that generates the network depicted in Figure 2.13(a) consists of the following rules:

1. for each subsystem SS required to execute subsystem actions in the logical actions of the LAC, compose the LAC.request_ssa event with the SS.request_ssa event;
2. for each combination of the subsystems SSs required to execute scans in the logical actions of the LAC, compose together the LAC.request_ssa event and the SS.request_ssa events of all the subsystems from SSs (see request_scanCD in Figure 2.13(a));
3. for the same combinations of the subsystems SSs, compose their SS.execute_ssa events together (see execute_scanCD in Figure 2.13(a)).

An Event-B machine that specifies the LACE semantics as a whole is composed of LAC and SS machines that include Event-B code for all four semantic features, according to the compositional schemes of all four semantic features. The two dimensions of modularity presented above simplify the creation of Event-B components and their validation through animation, improve the automatic discharging of proof obligations, and ease the maintenance of the LACE-to-Event-B model transformation.
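The three rules of the interaction scheme for Scan SF can be sketched as a function that lists, for each composite event, the constituent events to be synchronized. This is a Python illustration of the rules only; the actual transformation is written in Operational QVT, and the function name, the input encoding, and the generated event names (such as request_scan_SsC_SsD) are hypothetical.

```python
def scan_sf_compositions(single_subsystems, scan_combinations):
    """List (composite event name, constituent events) pairs for Scan SF.

    single_subsystems: subsystems that execute individual subsystem actions.
    scan_combinations: tuples of subsystems that execute a scan together.
    """
    compositions = []
    for ss in single_subsystems:
        # Rule 1: LAC.request_ssa with SS.request_ssa
        compositions.append(
            (f"request_{ss}", ["LAC.request_ssa", f"{ss}.request_ssa"]))
    for combo in scan_combinations:
        label = "scan_" + "_".join(combo)
        # Rule 2: LAC.request_ssa with the request_ssa events of all subsystems in the scan
        compositions.append(
            (f"request_{label}",
             ["LAC.request_ssa"] + [f"{ss}.request_ssa" for ss in combo]))
        # Rule 3: the execute_ssa events of the same subsystems
        compositions.append(
            (f"execute_{label}", [f"{ss}.execute_ssa" for ss in combo]))
    return compositions
```

For instance, calling scan_sf_compositions(["SsB", "SsC"], [("SsC", "SsD")]) yields four composite events, which correspond to request_SsB, request_SsC, request_scanCD, and execute_scanCD in Figure 2.13(a).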

2.5.3 Implementation and Results

The LACE-to-Event-B transformations described in Sections 2.5.1 and 2.5.2 were implemented using the Operational QVT (Query/View/Transformation) language [74] in the Eclipse environment. The input for the transformation is provided directly by the LACE implementation software, which employs model transformation and code generation techniques in the Borland Together environment, and therefore is compatible with EMF (Eclipse Modeling Framework). As a target metamodel for the transformation we use the Event-B Ecore implementation provided by the EMF framework for Event-B [92]. The LACE-to-Event-B transformation is designed in a modular way, which follows the logic of the instantiation and composition techniques as described in Sections 2.5.1 and 2.5.2.

Table 2.1: Characteristics of the LACE-to-Event-B transformation

                                        Semantic features
Event-B components          Size        core    scan    order   data    core+scan+order+data
Metamodel context           constants   3       4       5       11      –
                            axioms      5       8       9       22      –
Model context               constants   20      21      23      37      –
                            axioms      7       8       10      17      –
LAC machine                 events      3       3       4       4       4
                            POs         21      23      26      28      34
                            LOC         40      41      88      87      105
SS machine                  events      3       3       3       3       3
                            POs         7       11      7       7       11
                            LOC         40      53      45      45      65
Composition of LAC and      events      10      30      10      10      30
SS machines                 POs         70      386     82      89      491
                            LOC         220     778     261     258     1184

Table 2.1 shows the representative characteristics of the transformation: the sizes of the metamodel contexts vs. model contexts and the sizes of the conceptual machines (the LAC and SS machines for the Core SF, Scan SF, Order SF, and Data SF semantic features) vs. composite machines (bottom row).⁵ The automatically generated Event-B components are shaded. As an input for the automatic generation (model transformation) we used the LACE program depicted in Figure 2.3. All proof obligations (POs) of the LAC and SS machines are discharged by invocation of the automatic provers in Rodin. The proof obligations of the composite machines (bottom row) can be left undischarged, as these are inherited proof obligations of the LAC and SS machines (according to the shared event composition approach [89]). The Event-B machine that specifies the behavior of the LACE program is located in the bottom right cell of the table. One can observe that this machine is much larger and has many more proof obligations than the conceptual machines of which it is composed.

2.6 Visualization of LACE Specifications

In this section, we study one of the use cases of having a formal specification of the dynamic semantics of a DSL: understanding the execution (i.e. the dynamic semantics) of DSL programs through the animation of their Event-B specifications. This use case is mainly meant for DSL users (DSL engineers, or domain experts), who apply the DSL to solve various tasks and problems, while having a restricted knowledge of the DSL dynamic semantics (obtained from a user guide or a tutorial). The work presented in this section and in Section 2.7.1 was (in part) implemented by Rimco Boudewijns within his Master project [14] and is published in [97].

2.6.1 Designing LACE Visualizations In order to understand how LACE programs are executed, to learn the LACE semantics, and even to debug the programs, we simulate LACE programs using their Event-B specifications. The simulation of a LACE program is achieved through the execution of the Event-B specification (generated for this program) in an animation tool, for example in the ProB animator. However, DSL engineers usually do not know the formal notation of Event-B employed by the animation tools. Moreover, they are hardly familiar with the semantics of the DSL as it is specified in Event- B, and thus, have problems connecting Event-B specifications generated from their programs to their original programs. For example, Figure 2.14(c) shows a fragment of the screen shot of how the Event-B specifi- cation, which is generated from the LACE program depicted in Figure 2.14(a), is executed in the ProB animator. This Event-B specification is composed of the four instances of the same concep- tual machine representing a subsystem (for Laser, Sensor, Handler, and Projector) and of one instance of the LAC machine for Order SF. The events of these instantiated specifications and their combinations can be seen on the left side of Figure 2.14(c) (see the tab ‘Events’). The variables of the instantiated specifications and their values can be seen on the right side of Fig- ure 2.14(c) (see the tab ‘State’). It is hard to trace these events and variables back to the original LACE program without understanding the conceptual machines and how they are instantiated and composed. Consequently, while such a simulation might be useful for a designer who creates Event-B specifications of the DSL; one cannot expect engineers, who program using the DSL notation, to be able to use such a simulation. Therefore, we create a domain-specific visualization for Event- B specifications of LACE. 
For this we employ BMotion Studio [55], which provides a graphical editor for creating such visualizations and uses Event-B notation for specifying various details, such as predicate and value expressions.⁶ As BMotion Studio is integrated into ProB, a visualization runs together with (or on top of) the ProB animator.

Figure 2.14: An example of the LACE DSL program and its visualized animation in ProB
(a) Original DSL program (diagram labels: take_a_snapshot; Laser, Sensor, Handler, Projector; objectPos; framePos, AdjustFrame, PositionObject; ProduceLight, GrabAFrame, snapshot; Switch Glass MoveParking)
(b) Visualized animation in ProB: (1) subsystem buffers, (2) highlighted actions, (3) buttons for executing events
(c) Animation in ProB (running underneath the visualization)

⁵ The size of an Event-B context is represented by the number of its constants and axioms. The size of an Event-B machine is represented by the number of its events, proof obligations (POs), and lines of code (LOC).
⁶ In this work we use the first version of BMotion Studio. Currently the new version, BMotionWeb, is available.

A visualization provides a graphical user interface (GUI) for animating an Event-B specification. For creating such a GUI, BMotion Studio offers a palette of various graphical elements (shapes, connectors, text fields, buttons, etc.) and two concepts responsible for the animation: controls and observers. Controls are used to execute events of the specification. Observers are used to visualize the current state (i.e. current values of the variables). Below we discuss how such a GUI can be designed for a DSL.

The main goal of our DSL visualization is to help a DSL user to grasp how a DSL program is executed. For this, we start from what a user already knows: the DSL concrete syntax. We add to it what a user aims to discover: the DSL dynamic semantics. In other words, we base our visualization on the graphical syntax of LACE, and we project the concepts of the LACE dynamic semantics on it. In general, such concepts represent an execution state and operations that can change this execution state. However, it is not obvious how such concepts can be visualized on top of the graphical syntax of LACE.

For textual programming languages, such as C or Java, an established way to visualize an execution state on top of the program text is to highlight the code line that is being executed. Moreover, additional views, such as call stack, memory, variables, etc. allow for inspecting other aspects of the execution state. The execution of a program is done step by step using various play-buttons. This way of visualizing a program execution directly links to the concepts that are usually used in descriptions of the dynamic semantics of such languages: execution of a program line-by-line, function calls, memory management, etc.
However, graphical languages and methods of visualizing their execution are not so established. An example of a graphical language is considered in [9]. In this work the execution of a UML activity diagram is visualized by projecting its dynamic semantics on the diagram. As the dynamic semantics of UML activity diagrams is described using the concept of token (adopted from Petri nets), the visualization consists of tokens (filled black circles) moving between activity nodes. Thus, the design principle is the same as for textual programming languages: adding the visualization of (the run-time state of) the dynamic semantics of a language to the concrete syntax of the language.

The dynamic semantics of LACE is defined as a set of Event-B conceptual machines representing architectural components and semantic features of LACE (as described in Section 2.5). Therefore, we visualize these modules and project them on the LACE concrete syntax. By focusing on the visualization of separate modules rather than on the visualization of the dynamic semantics as a whole, we design our visualization in a modular way. Below we demonstrate this approach on the examples of two modules (introduced in Section 2.5): the subsystem (SS) component and Order SF.

The SS component realizes the buffered execution of subsystem actions on a subsystem. An action is added to the buffer as a result of executing a LACE program, and removed from the buffer when this action is actually executed by the subsystem. We visualize a buffer for each subsystem participating in a LACE program as a column showing all elements of the buffer (see (1) in Figure 2.14(b)). We position the buffers above the corresponding subsystems and let them ‘grow’ downwards: a new element is added to the bottom of the column, and all elements are shifted up when an old element is removed.
In the screen shot in Figure 2.14(b), the buffers of Laser and Projector are empty, the buffer of Sensor has one element, and the buffer of Handler has two elements.

Order SF represents (models) a mutual dependency between subsystem actions (imposed by the control flow of a LACE program). Such a dependency influences the throughput of a program and, combined with a slow execution on a subsystem, can cause an (unforeseen) execution delay. We do not model time in our specification of the dynamic semantics of LACE, thus we cannot detect such situations automatically (for example, using model checking). To facilitate (manual) detection of such situations, we visualize the partial order in the interaction with the subsystem buffers (where removing an action from a buffer can model such a slow execution). For this visualization we use color highlighting ((2) in Figure 2.14(b)): red rectangles highlight subsystem actions that are blocked from the execution, and green rectangles highlight subsystem actions that are being executed (i.e. are situated at the front/top of the corresponding buffers).

To be able to execute the underlying Event-B specification of a LACE program we use buttons integrated into its visualization ((3) in Figure 2.14(b)). Such buttons get enabled and disabled based on the current state of the execution, letting a DSL user study the DSL semantics. Each button is attached to a graphical element of the LACE program in such a way that the user can intuitively associate the enabled or disabled behavior with the corresponding construct of LACE. For example, each subsystem in Figure 2.14(b) has two corresponding buttons attached to its graphical representation (column): "Request SS action" (for adding an action to the buffer) and "Execute next SS action" (for removing an action from the buffer).
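The buffer and highlighting behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Event-B specification or the BMotion Studio observers themselves: the class `SubsystemBuffer`, the function `highlight`, the assignment of actions to subsystems, and the single dependency between the two actions are all hypothetical, and "blocked" is interpreted here as "some preceding action has not been executed yet".

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class SubsystemBuffer:
    """FIFO buffer of requested subsystem actions (front = next to execute)."""
    name: str
    queue: deque = field(default_factory=deque)

    def request(self, action: str) -> None:
        # "Request SS action": a new element is added at the bottom of the column.
        self.queue.append(action)

    def can_execute(self) -> bool:
        # "Execute next SS action" is only enabled for a non-empty buffer.
        return bool(self.queue)

    def execute_next(self) -> str:
        # The oldest element leaves the buffer; the remaining ones shift up.
        return self.queue.popleft()


def highlight(action: str,
              subsystem_of: Dict[str, str],
              buffers: Dict[str, SubsystemBuffer],
              executed: Set[str],
              predecessors: Dict[str, Set[str]]) -> Optional[str]:
    """Return 'green', 'red', or None for one subsystem action (assumed rules)."""
    queue = buffers[subsystem_of[action]].queue
    if queue and queue[0] == action:
        return "green"  # at the front of its buffer: being executed
    pending = [p for p in predecessors.get(action, ()) if p not in executed]
    if action not in executed and pending:
        return "red"    # blocked by an action that has not been executed yet
    return None


# Illustrative data only: subsystem assignment and dependency are assumptions.
SUBSYSTEM_OF = {"PositionObject": "Handler", "GrabAFrame": "Sensor"}
PREDECESSORS = {"GrabAFrame": {"PositionObject"}}


def demo():
    buffers = {name: SubsystemBuffer(name) for name in ("Handler", "Sensor")}
    executed: Set[str] = set()
    buffers["Handler"].request("PositionObject")
    before = {a: highlight(a, SUBSYSTEM_OF, buffers, executed, PREDECESSORS)
              for a in SUBSYSTEM_OF}
    executed.add(buffers["Handler"].execute_next())
    after = {a: highlight(a, SUBSYSTEM_OF, buffers, executed, PREDECESSORS)
             for a in SUBSYSTEM_OF}
    return before, after
```

In this sketch, calling `demo()` first colors the queued action green and the dependent action red; once the queued action has been executed, neither is highlighted.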

2.6.2 Generating LACE Visualizations

The resulting visualization mimics an (arbitrary) LACE program, and thus needs to be constructed for each concrete LACE program. To make such a visualization design feasible, we automate the construction of LACE visualizations using a model-to-model transformation from LACE to BMotion Studio. The transformation automatically derives graphical elements and their layout from an original DSL program and generates the corresponding controls (buttons), observers (shapes, labels, and their attributes), and the layout of a visualization. For example, the BMotion Studio visualization depicted in Figure 2.14(b) is automatically generated from the LACE program depicted in Figure 2.14(a). Moreover, the LACE-to-BMS transformation connects elements of the visualization with elements of the underlying Event-B specification (also generated from the LACE program, as described in Section 2.5).

A BMotion Studio visualization runs on top of an Event-B specification. This means that controls and observers of the visualization query variables and events of the underlying Event-B specification. As the visualization design is determined (shaped) by the concrete syntax rather than by the dynamic semantics of the DSL, querying data from the Event-B specification requires rather non-trivial predicate and value expressions. For example, to realize highlighting of subsystem actions in green and red (as depicted in Figure 2.14(b)), we need to specify two observers for each subsystem action of a LACE diagram. Figure 2.15 shows the BMotion Studio wizard configuring such observers: an observer sets the background color to red or green when the corresponding predicate is true. The predicates check the mutual dependency between subsystem actions and the execution state of the corresponding subsystem buffer. Writing such predicates manually for all subsystem actions of a LACE program is quite tedious.
Buttons (controls) do not map one-to-one on the events either. As described above, to make the semantics of buttons intuitively clear to DSL users, we attach them to the graphical elements of a LACE diagram. For example, in Figure 2.14(b) there are eight buttons attached to the four subsystem columns, and thus representing the behavior of each of the subsystems. However, according to the interaction scheme of Scan SF, such a behavior of a subsystem can be executed in multiple events and in combination with the same behavior of another subsystem. For example, in Figure 2.14(c) "request_ssa_ss1" appears four times in different combinations with other subsystems ("ss2" and "ss3"). This means that a subsystem button should be able to trigger different events depending on the current execution state, and moreover, some of these events should be triggered by a combination of multiple subsystem buttons. In other words, we need to implement a many-to-many relation between buttons and events.

Figure 2.15: Screen shot of the BMotion Studio wizard for configuring an observer

The LACE-to-BMS transformation overcomes these technical challenges by specifying the mapping between the concrete syntax (shaping the visualization) and the dynamic semantics (driving the visualization) on the level of the LACE metamodel. Thus, all predicate and value expressions and many-to-many relations between elements of the visualization and elements of the Event-B specification are generated automatically for each concrete LACE program.

The scheme of the LACE-to-BMS transformation is depicted in Figure 2.16. Within the LACE development environment, a graphical LACE program is parsed into the corresponding LACE model. This model is used as an input for the LACE-to-Event-B transformation described in Section 2.5 (the right part of Figure 2.16). Besides a LACE model, the LACE-to-Event-B transformation takes as an input a set of LACE conceptual machines. As an output, this transformation generates an Event-B specification of the LACE program (bottom right corner of Figure 2.16) and the corresponding mapping information. The latter is in fact the log of a transformation execution and captures the links between the elements of the resulting Event-B specification and of the original LACE model. The mapping information is necessary for connecting LACE visualizations with the underlying Event-B specifications.
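The many-to-many relation between buttons and events discussed above can be illustrated with a small Python sketch. It is not the generated BMotion Studio configuration: the table `EVENT_BUTTONS`, the button names, and the composite event names other than the prefix "request_ssa_ss1" are hypothetical, and the set of currently enabled events is assumed to be supplied by the animator.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Set

# Hypothetical table: each (composite) event is triggered by a set of buttons
# pressed together; one button takes part in several events.
EVENT_BUTTONS: Dict[str, FrozenSet[str]] = {
    "request_ssa_ss1": frozenset({"request_ss1"}),
    "request_ssa_ss1_ss2": frozenset({"request_ss1", "request_ss2"}),
    "request_ssa_ss1_ss3": frozenset({"request_ss1", "request_ss3"}),
    "request_ssa_ss1_ss2_ss3": frozenset(
        {"request_ss1", "request_ss2", "request_ss3"}),
    "execute_ssa_ss1": frozenset({"execute_ss1"}),
}


def enabled_buttons(enabled_events: Iterable[str]) -> Set[str]:
    """A button is enabled if it takes part in at least one enabled event."""
    buttons: Set[str] = set()
    for event in enabled_events:
        buttons |= EVENT_BUTTONS[event]
    return buttons


def event_for(pressed: Iterable[str],
              enabled_events: Iterable[str]) -> Optional[str]:
    """Pick the enabled event whose button set equals the pressed buttons."""
    wanted = frozenset(pressed)
    matches = [e for e in enabled_events if EVENT_BUTTONS[e] == wanted]
    return matches[0] if len(matches) == 1 else None
```

With the enabled events `["request_ssa_ss1_ss2", "execute_ssa_ss1"]`, the button `request_ss1` is enabled but triggers an event only together with `request_ss2`; in a different execution state the same button could trigger a different event.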

Figure 2.16: Transformations from LACE to Event-B and BMotion Studio
(Diagram elements: LACE program, LACE syntax, LACE model, LACE semantics, LACE conceptual machine, intermediate visualization model, mapping information, BMotion Studio visualization, Event-B specification. Labeled arrows: parse, LACE-to-Event-B, LACE-to-BMS (1), (2), uses. Legend: transformation, dependency.)

A LACE visualization is generated from the original graphical LACE program (the left part of Figure 2.16), as the essential model abstracts from (thus, leaves out) the notational details of the LACE diagram. The LACE-to-BMS transformation uses an intermediate visualization model, which splits the transformation into two separate steps. The first step captures the graphical design of the visualization, i.e. the graphical elements and their layout. The second step takes care of the mapping between Event-B events and variables and BMotion Studio observers and controls. For example, in the second step of the transformation concrete elements of the Event-B specification are substituted into predicates of the observers and controls. In this way we modularize the transformation and decouple the DSL from the visualization platform, allowing for potential reuse of our approach for employing other visualization platforms.
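The two-step structure described above can be sketched as follows. This is a minimal illustration, assuming a much simpler intermediate model than the real one: the class `VisualElement`, the functions `step1_layout` and `step2_bind`, the predicate template `card($buffer) > 0`, and all Event-B element names in the mapping are hypothetical placeholders, not the actual LACE-to-BMS transformation.

```python
from dataclasses import dataclass
from string import Template
from typing import Dict, List


@dataclass
class VisualElement:
    """Element of a (hypothetical) intermediate visualization model."""
    element_id: str
    kind: str       # "observer" or "control"
    x: int
    y: int
    template: str   # predicate or event name with $placeholders


def step1_layout(subsystems: List[str]) -> List[VisualElement]:
    """Step 1: derive graphical elements and their layout from the program."""
    elements = []
    for column, ss in enumerate(subsystems):
        x = 100 * column
        # Observer template: shows the buffer column when the buffer is non-empty.
        elements.append(
            VisualElement(f"{ss}_buffer", "observer", x, 0, "card($buffer) > 0"))
        # Control template: the button that requests an action on this subsystem.
        elements.append(
            VisualElement(f"{ss}_request", "control", x, 200, "$request_event"))
    return elements


def step2_bind(elements: List[VisualElement],
               mapping: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Step 2: substitute concrete Event-B names from the mapping information."""
    return {el.element_id: Template(el.template).substitute(mapping[el.element_id])
            for el in elements}


# Hypothetical mapping information (element id -> Event-B names).
MAPPING = {
    "Laser_buffer": {"buffer": "ss1_buffer"},
    "Laser_request": {"request_event": "request_ssa_ss1"},
    "Sensor_buffer": {"buffer": "ss2_buffer"},
    "Sensor_request": {"request_event": "request_ssa_ss2"},
}

bound = step2_bind(step1_layout(["Laser", "Sensor"]), MAPPING)
```

Because only `step2_bind` touches Event-B names, the first step could in principle be reused with a different visualization platform, which is the decoupling the text describes.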

2.7 User Study

Earlier in Section 2.3 we introduced the user roles and the (potential) use cases of constructing and analyzing an Event-B specification of the dynamic semantics of LACE. Below we elaborate on the (actual) experience of these two user roles with our Event-B specification of LACE. In Section 2.7.1 we describe the user study that we performed with end-users of LACE. In Section 2.7.2 we discuss an interesting insight that we got from one of the developers of LACE.

2.7.1 Interviews with LACE End-Users

To validate the visualization design and to discover use cases for its application, we performed a user study among LACE end-users (ASML engineers who develop software using LACE). The user study consisted of interviews (with one engineer) and of brainstorming sessions (with a few engineers at once). During such an interview or a brainstorming session, we would demonstrate our LACE visualization by animating five existing LACE programs and then collect the feedback using the questionnaire form and personal communication. Ten ASML engineers participated in the user study. Note that as LACE is a programming language specialized to a specific domain, the community of LACE users is not big. We estimate that the response rate for our user study was 50%. The participants had different levels of expertise in LACE and were from various application and/or system subdomains.

During the user study we were aiming to achieve the following goals:
• to validate our rationale for designing a DSL visualization;
• to assess how well the level of details is balanced;
• to discover opportunities for applying the visualization in the development process.

Using the GQM (Goal, Question, Metric) method [93], we refine these goals to a more operational level as a set of questions which we need to answer in order to draw conclusions from the results of the user study. Such questions allow for interpreting the data collected during the study. In Table 2.2 these study questions are shown in the second column next to the goals that they refine. The questions that were asked to the LACE end-users correspond to the Metric category of the GQM method. We show (examples of) these end-user questions in the third column of Table 2.2. The corresponding feedback of the users is presented in the rightmost column.
Most of the questions in our questionnaire form were closed questions with a grade in the interval [0..4] to indicate the certainty and/or relevance of an answer. In Table 2.2 we indicate the mean value of the answers given by the respondents to the corresponding question. The detailed description of the questionnaire and the collected answers can be found in [14].

Table 2.2: The structure and the results of the user study

| Goals | Questions (subgoals) | Metrics (questions asked to the end-users) | Outcome/feedback |
|---|---|---|---|
| Validate the visualization design | Design is intuitive and understandable | Do you understand the visualization of the LACE model? | yes (3.3 out of 4) |
| | Observers represent the execution state | Is the visualization intuitive to find the desired information? | yes (2.9 out of 4) |
| | Semantics of controls (buttons) is clear | Is the visualization intuitive to execute the desired actions? | yes (2.9 out of 4) |
| Assess the level of the details | Choice of the semantics concepts | What part of LACE should be visualized? | two more concepts were proposed |
| | Overview of an animated diagram | Are you happy with the level of details? | rather no than yes (1.8 out of 4) |
| | Insights provided by the visualization | Is highlighting of blocked and queued actions helpful? | yes (3 out of 4) |
| | | Do queues help to see which actions need to be executed? | yes (3 out of 4) |
| | Insights missing in the visualization | Would you find a log of all processed SS actions helpful? | rather yes than no (2.6 out of 4) |
| | | Would you find the visualization of the data values helpful? | indifferent (2.1 out of 4) |
| Discover opportunities for applying the visualization | There is a lack of support for understanding the LACE semantics | Did you have problems learning LACE? | four people: yes |
| | | Do you still have problems understanding some LACE models? | mostly not, one person: yes |
| | The visualization can close this gap | Would this visualization help understanding those models? | yes (2.9 out of 4) |
| | | How much time would this kind of visualization save you? | 15-20% |
| | What are potential use case scenarios? | How would you apply the visualization in practice? | Prototyping and validation, impact analysis, replay of execution logs and predefined sequences |

Based on the results of the user study presented in Table 2.2 and on other feedback of the LACE users, we draw the following conclusions.
• The visualization design is in general acceptable. Using the feedback provided by the LACE users, it is possible to make the visualization design more intuitive and user-friendly. However, such fine-tuning is beyond the scope of our work.
• The approach would benefit from the possibility to configure the level of details of the LACE visualization. For example, a LACE user might want to specify which elements of a LACE program (and/or which semantic features of the dynamic semantics of LACE) should appear in the visualization. This can be achieved through an explicit configuration of the LACE-to-BMS model transformation.
• The LACE users believe that they can use the visualization for understanding and testing behavior of their LACE programs; for replaying real-life system executions; and for checking changes after LACE gets updated (impact analysis).

Moreover, according to the LACE end-users, a crucial requirement of the proposed approach is that the DSL formal specification should be consistent with the actual implementation of the DSL. Without this consistency DSL end-users cannot benefit from the visualized animation of DSL specifications. In the next section we describe an example of such consistency.

2.7.2 The True Face of LACE

LACE is implemented as a chain of model transformations and code generations, targeting an execution platform that wraps the lower level components (such as hardware drivers, buses, etc.). In Figure 2.4 we denoted this execution platform as Synchronization driver. The interface provided by this execution platform is relatively limited. As a result, the translation of LACE to the execution platform (i.e. the implementation of LACE) does not follow the optimal or most sensible design. It turns out that this results in differences between our formal specification of the dynamic semantics of LACE and the actual implementation of LACE.

For example, Scan SF is implemented in the same way as we specify it in Event-B. That is, each logical action consists of intermediate constructs (so-called physical actions), which can consist of one or multiple subsystem actions. This corresponds to our way of specifying scans as subsets of SSAction constituting a LogicalAction. However, Order SF is implemented in a different way. In our Event-B specification, we define Order SF as a partial order on the occurrences of subsystem actions of a logical action (see Figure 2.7(a)). A next subsystem action cannot be requested by LAC if it (its occurrence) precedes a subsystem action that has not been executed yet (see Figure 2.11). This design is not possible to implement using the interface provided by the target execution platform of LACE. Therefore, we modified the specification of Order SF so that it adheres to its actual implementation in LACE: see Figures 2.17 and 2.18.

In LACE, Order SF is implemented using a combination of the following two mechanisms.
• A (linear) order of the (occurrences of) subsystem actions within a logical action determines in which order these subsystem actions should be requested for the execution by the LAC component. The corresponding relation ocrRequestOrder is introduced in the metamodel context in Figure 2.17(a) and is instantiated for the example LACE program in Figure 2.17(b). The guard grd7 of the event request_ssa in Figure 2.18 ensures that the requested subsystem action (ssaction) has the smallest numerical order (ocr_order) among all the subsystem actions that still need to be requested (curr_job).

CONTEXT lace_metamodel_order_context
EXTENDS lace_metamodel_core_context
CONSTANTS ocrRequestOrder, ocrWaitForResult
AXIOMS
    axm3 : ocrRequestOrder ∈ LogicalAction → (Occurrence ⇸ ℕ)
    axm4 : ocrWaitForResult ∈ LogicalAction → ℙ(Occurrence)
END

(a) Metamodel context

CONTEXT takeasnapshot_model_order_context
EXTENDS takeasnapshot_model_core_context
CONSTANTS ocrRequestOrder, ocrWaitForResult
AXIOMS
    axm1 : partition(ocrRequestOrder, {laTAS ↦ {o1 ↦ 4, o2 ↦ 1, o3 ↦ 3, o4 ↦ 2, o5 ↦ 6, o6 ↦ 5}})
    axm2 : partition(ocrWaitForResult, {laTAS ↦ {o1, o4, o6}})
END

(b) Model context

Figure 2.17: Contexts of the Event-B specification for the implementation of Order SF

• A subset of the (occurrences of) subsystem actions within a logical action determines which subsystem actions block processing of the logical action. In other words, no other subsystem action can be requested until these subsystem actions have been executed. We model such subsets as the relation ocrWaitForResult (see Figure 2.17(a) and the example in Figure 2.17(b)). The corresponding behavior is specified in the event request_ssa in Figure 2.18 using the wait_exe variable: if the subsystem action being requested belongs to the blocking subset of the logical action (ocrWaitForResult(curr_la)), then it is included into the wait_exe variable (act7). No other subsystem action can be requested until the wait_exe variable is empty (grd5). The emptying of the latter happens in the event execute_ssa (act8).

The two Event-B specifications of Order SF presented in Figure 2.11 and in Figure 2.18 are just two possible solutions out of many. Moreover, we do not discuss the fact that the latter specification might fail to realize the desired behavior (for example, if PositionObject is executed earlier than AdjustFrame, then the scan ProduceLight/GrabAFrame can be already requested, which might result in the violation of the execution order). The focus here is on the question which solution (or design) out of many possible solutions (designs) should be specified. According to the LACE developer and LACE end-users whom we have interviewed, the specification should follow the actual implementation of the DSL. Then DSL developers can discover problems in their implementation, and DSL users can become aware of potential problems and develop their DSL programs correspondingly.

MACHINE lace_LAC_order
SEES lace_metamodel_core_context, lace_metamodel_order_context
VARIABLES curr_job, curr_la, wait_exe
INVARIANTS
    inv1 : curr_job ∈ ℙ(Occurrence)
    inv2 : curr_la ∈ LogicalAction
    inv3 : wait_exe ∈ ℙ(Occurrence)
EVENTS
Initialisation
    begin
        act1 : curr_job := ∅
        act2 : curr_la :∈ LogicalAction
        act3 : wait_exe := ∅
    end
Event request_la ≙
    any la
    where
        grd1 : la ∈ LogicalAction
        grd2 : curr_job = ∅
    then
        act4 : curr_job := dom(LALabelDef(la))
        act5 : curr_la := la
    end
Event request_ssa ≙
    any occurrence, ssaction, ocr_order
    where
        grd3 : occurrence ∈ curr_job
        grd4 : occurrence ↦ ssaction ∈ LALabelDef(curr_la)
        grd5 : wait_exe = ∅
        grd6 : occurrence ↦ ocr_order ∈ ocrRequestOrder(curr_la)
        grd7 : ∀oc · oc ∈ curr_job ∧ oc ∈ dom(ocrRequestOrder(curr_la)) ⇒ ocrRequestOrder(curr_la)(oc) ≥ ocr_order
    then
        act6 : curr_job := curr_job \ {occurrence}
        act7 : wait_exe := ocrWaitForResult(curr_la) ∩ {occurrence}
    end
Event execute_ssa ≙
    any ssaction, occurrence
    where
        grd8 : occurrence ↦ ssaction ∈ LALabelDef(curr_la)
    then
        act8 : wait_exe := wait_exe \ {occurrence}
    end
END

Figure 2.18: Event-B machine specifying the implementation of Order SF for LAC
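To make the interplay of the guards grd5 and grd7 with the actions act6 to act8 more tangible, the machine of Figure 2.18 can be mimicked in Python using the constants of Figure 2.17(b). This is a sketch under assumptions, not a translation produced by any tool: the class `LacOrder` and the helper `run` are hypothetical, only one logical action is modeled, its occurrences are assumed to be exactly o1 to o6, and the subsystem buffers are left out (a blocking occurrence is executed immediately after it has been requested).

```python
# Constants of Figure 2.17(b) for the single logical action laTAS.
OCR_REQUEST_ORDER = {"o1": 4, "o2": 1, "o3": 3, "o4": 2, "o5": 6, "o6": 5}
OCR_WAIT_FOR_RESULT = {"o1", "o4", "o6"}


class LacOrder:
    """Hand-written mimic of the events request_la, request_ssa, execute_ssa."""

    def __init__(self):
        self.curr_job = set()   # act1
        self.wait_exe = set()   # act3

    def request_la(self):
        if self.curr_job:                       # grd2: no job in progress
            raise ValueError("a logical action is still being processed")
        # act4, assuming dom(LALabelDef(la)) = {o1, ..., o6}
        self.curr_job = set(OCR_REQUEST_ORDER)

    def requestable(self):
        """Occurrence for which request_ssa is enabled, or None."""
        if self.wait_exe or not self.curr_job:  # grd5, grd3
            return None
        # grd7: smallest order among the occurrences still to be requested
        return min(self.curr_job, key=OCR_REQUEST_ORDER.__getitem__)

    def request_ssa(self, occurrence):
        if occurrence != self.requestable():
            raise ValueError(f"request_ssa is not enabled for {occurrence}")
        self.curr_job.discard(occurrence)                    # act6
        self.wait_exe = OCR_WAIT_FOR_RESULT & {occurrence}   # act7

    def execute_ssa(self, occurrence):
        # grd8 does not restrict the occurrence any further in this sketch.
        self.wait_exe.discard(occurrence)                    # act8


def run():
    """Request everything, executing a blocking occurrence as soon as needed."""
    machine, trace = LacOrder(), []
    machine.request_la()
    while machine.curr_job or machine.wait_exe:
        if machine.wait_exe:
            occ = next(iter(machine.wait_exe))
            machine.execute_ssa(occ)
            trace.append(("execute", occ))
        else:
            occ = machine.requestable()
            machine.request_ssa(occ)
            trace.append(("request", occ))
    return trace
```

In this sketch the requests follow the linear order given by ocrRequestOrder (o2, o4, o3, o1, o6, o5), and after each of o4, o1, and o6 no further request is possible until that occurrence has been executed.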

2.8 Related Work

2.8.1 Modular Specification of the Dynamic Semantics of a DSL

There exist a number of studies in which Event-B is applied to specify the dynamic semantics of a DSL. Aït-Sadoune and Aït-Ameur employ Event-B and Rodin for proving properties and animation of BPEL processes [6]. Hoang et al. use Event-B and Rodin to automate analysis of Shadow models [37]. In both studies, DSL program descriptions are translated into Event-B specifications. The translations are implemented in the Java programming language. These works do not use generic instantiation and composition techniques, but apply refinement of Event-B machines [4] to implement modularity of the programs. Based on our experience, refinement restricts semantics definitions and can be rather complicated for automatic proving. Moreover, we use model transformations to implement the generation of Event-B specifications, which increases the abstraction level of the LACE-to-Event-B translation.

In [105] van Gool et al. describe a DSL that is a prototype of LACE. The authors define the dynamic semantics of this DSL through the model transformation from a PIM (platform-independent model) to a PSM (platform-specific model). The PIM consists of the LACE(-like) constructs, such as abstract behaviors (i.e. logical actions), concurrent and sequential execution of resource behaviors (i.e. subsystem actions), passive behavior (we did not include this construct into our specification), and assigning values to variables (i.e. data flow). The PSM is represented by the scheduling component, which executes abstract behaviors (i.e. logical actions) in an optimal manner by determining what and when resources (i.e. subsystems) are needed for the execution. The PSM that is used by van Gool et al. is of higher abstraction level than the target execution platform that we considered in our specification of LACE. The model transformation from the PIM to the PSM is described in [105] schematically as a set of transformation rules, one for each DSL construct. Thus, their definition of the dynamic semantics follows the semantic modularity of the DSL. In contrast to our specification of LACE semantic features, their semantic modules (defined in separate transformation rules) do not influence each other, i.e. can be clearly separated from each other. This might be due to the higher abstraction level of their target execution platform.

2.8.2 Domain-Specific Visualization of Formal Specifications

There exist various approaches for visualizing a formal specification and its execution (animation). It can be a visualization of a state space, or wrapping a formalism into the (standard) graphical notation of UML (such as [91]). In our work we focus on a domain-specific visualization, that is, on a graphical representation tailored for a specific (engineering or application) domain. A number of case studies have demonstrated that a domain-specific visualization of a formal specification can be very useful for creating, validating, and applying the specification by humans, especially by domain experts. For example, in [35] Hansen et al. state that a graphical representation of their Event-B specification of a landing gear system was crucial for its development and validation. In [65] Mathijssen and Pretorius use a visualization of their mCRL2 specification of an automated parking garage to discover and fix a number of bugs in the specification. They conclude that compared to the standard simulation of mCRL2, the visualization projected on the top of this simulation is more intuitively clear and easy to understand, and thus helps to identify issues that may not have been noted otherwise.

In [95] Stappers specifies the behavior of an industrial wafer handler using mCRL2, obtains a trace to a deadlock state using the mCRL2 toolset, and visualizes this trace using a CAD (Computer Aided Design) model of the wafer handler in a kinematic 3D visualizer. In other words, the visualization animates the 3D virtual model of the physical system by moving its parts along the predefined motion paths. As a result, engineers of the wafer handler can identify the problem that leads to the deadlock state and find a proper solution. Stappers presents his approach from a general point of view, describing the components that are required to realize such a visualization and their architecture.
For our work we draw inspiration from his motivation on how system development can benefit from formal specifications and their visualization, and from his overview of how various technological fields connect and interact with each other in order to achieve these benefits.

While most of the domain-specific visualizations are implemented in an ad-hoc way (for example, a traffic light system presented in [66] is visualized in a prototype simulator developed in Java specifically for this case study), recent developments facilitate the creation of visualizations using dedicated graphical editors integrated with a formalism toolset. For instance, in BMotion Studio [55] one can create an interactive domain-specific visualization for a (single) B or Event-B specification. BMotion Studio has been successfully applied to a number of case studies (see for example [35] and [56]). In our work we lift this successful tool support to the level of DSLs, automating the creation of BMotion Studio visualizations for multiple (or for a family of) Event-B specifications.

To date, there is a lack of tool support for understanding and debugging an executable DSL on the level of its domain rather than on the level of its target execution platform (such as generated C or Java code). One of the early studies that recognize and address this problem is presented in [104]. In this work Olivier et al. describe TIDE, a framework that allows for instantiating a set of generic debugging facilities (such as stepping through the program and setting breakpoints) for an arbitrary DSL based on its formal specification in ASF+SDF. More specialized behavior, such as language-specific visualizations, requires extending the implementation of TIDE in Java. In [17] Chis et al. propose the Moldable Debugger framework for developing domain-specific debuggers.
The Moldable Debugger allows for configuring a domain-specific debugging view as a collection of graphical widgets, such as stack, code editor, and object inspector. In order to visualize the execution of a DSL program in such widgets, a DSL developer needs to specify so-called debugging predicates (capturing an execution state) and debugging operations (controlling the execution of a program). To realize this approach, the Moldable Debugger builds on top of, and extends, an existing IDE (integrated development environment). In our work we build on top of formal methods, making use of the DSL formal specification. Moreover, we discuss how to design the visualization of a DSL program (i.e. domain-specific graphical widgets) based on an explicit definition of the DSL dynamic semantics.

In [9] Bandener et al. visualize behavior of a graphical language in the form of the animated concrete syntax using the Dynamic Meta Modeling (DMM) technique. In DMM, a so-called runtime metamodel enhances the language metamodel with concepts that express an execution state. The dynamic semantics of the language is specified as a set of graph transformation rules for deriving instances of the runtime metamodel. When applied to a program (i.e. instance of the language metamodel), these rules iteratively generate a state space representing the behavior of the given program. Each of these states is an instance of the runtime metamodel. Bandener et al. enhance the language’s concrete syntax with the graphical representation of the runtime metamodel. As a result, a graphical representation (i.e. a diagram) can be generated for each state of the state space. Their front-end tool allows for choosing a path in such a visualized state space. In our work we also strive for the effect of the animated concrete syntax of the DSL.

2.9 Conclusions

In this chapter we described our case study on defining the dynamic semantics of the LACE DSL. In our case study, we investigated the definition of the dynamic semantics of LACE from the viewpoints of two user roles: a DSL developer and a DSL user. Both user roles can benefit from constructing and/or having a formal specification of the dynamic semantics of LACE, but in different ways (i.e. in different use cases). DSL developers benefit from the explicit specification of their design of the DSL and from the possibility to analyze its consistency and check various properties. DSL users benefit from the animation of DSL programs wrapped into a domain-specific visualization. Moreover, the two user roles work with different (meta-)levels of the DSL: the DSL metamodel and a DSL model. Thus, the corresponding use cases are applied to two different types of specifications: a specification of the dynamic semantics of the DSL on the metamodel level, and a specification of the behavior of a DSL program on the model level. Construction, analysis, and maintenance of these two types of specifications are realized using different technologies (such as the modular specification of LACE in conceptual machines and the model transformation that generates composite machines from LACE models). Our experience of applying Event-B to the specification of LACE and developing the LACE-to-Event-B model transformation leads to the following observations. The abstraction level of a state-machine-based formalism (such as Event-B) is rather low for specifying the dynamic semantics of a DSL (even at the high abstraction level that captures only essential details): writing the specification code and ensuring its coherency can be tedious work that requires precision in many small details and copy-pasting of recurring common solutions.
The process and the result of the formal specification of the dynamic semantics of a DSL can be improved using a modular approach. We identified two dimensions of modularity in our DSL: architectural and semantic. However, in our solution the composition of the specification modules is hard-coded into the DSL-specific (LACE-to-Event-B) model transformation. Hard-coding the semantic mapping of LACE into the model transformation limits the flexibility of the formalization and complicates the maintenance and understandability of the definition of the dynamic semantics of LACE. Moreover, the model transformation maps various small details of the metamodel of LACE to the Event-B specification of LACE. This complicates the reuse of our implementation for defining the dynamic semantics of new versions (evolution) of LACE. In relation to the notion of the dynamic semantics of a DSL as introduced in Section 2.1, the specification of LACE in Event-B consists of the following semantic domain(s) and semantic mapping(s). Conceptual machines of the LACE specification introduce an intermediate semantic domain, which consists of architectural modules and semantic features. The specification of these machines in Event-B represents a semantic mapping from the intermediate semantic domain to Event-B (i.e. the back-end formalism). The LACE-to-Event-B model transformation represents a semantic mapping from LACE to the intermediate semantic domain. Thus, our definition of the dynamic semantics of LACE is realized through a two-step semantic mapping. In this way we can effectively deal with the low abstraction level of the back-end formalism. The nature of the intermediate semantic domain that we used in our definition of the dynamic semantics of LACE is determined by the ‘domain specificity’ of the DSL, in particular, by one of its two roles (as explained below).
On the one hand, a DSL captures domain knowledge, which supports its reuse via domain notions and notation and raises the abstraction level of solving problems in a particular domain. In other words, the DSL realizes a so-called vertical domain [51]. On the other hand, the implementation of the DSL (such as its translation to source code, or via interpretation) captures software solutions (algorithms, architecture, and techniques) that are commonly used in the domain, which supports their reuse and, thus, raises the efficiency of the software development process. In this way, the DSL realizes a so-called horizontal domain [51]. We observed that when specifying the dynamic semantics of LACE, we were identifying and (re)using such horizontal concepts of LACE: partial order and linear order, multi-layered architecture, request-notify mechanism, FIFO.

Based on the conclusions described above, we formulate the following requirements for a formalism and a suite of tools for defining the dynamic semantics of a DSL.

Req-1 A definition of the dynamic semantics of a DSL should allow for the implementation of various use cases, such as verification and validation of the DSL design, and simulation of DSL programs.

Req-2 A definition of the dynamic semantics of a DSL should describe how DSL programs are executed on a (real) machine (according to the actual implementation of the DSL), rather than how DSL programs are expected to behave (according to the user guide or other informal description of the DSL).

Req-3 The definition formalism should provide DSL-independent means for specifying the dynamic semantics of a DSL on the metamodel level (in terms of the DSL metamodel).

Req-4 The definition formalism should allow for the introduction and invocation of an additional (new) semantic domain, such as an intermediate semantic domain that uses concepts of a higher abstraction level (compared to the existing semantic domain). For example, it should be possible to use concepts of the DSL horizontal domain.

Req-5 The definition formalism should support different types of modularity of the dynamic semantics of a DSL, such as architectural modules and semantic features.

Req-6 The definition formalism should support a clear description of the DSL design, that is, the interaction between and/or composition of separate modules of the dynamic semantics should be described in a clear and explicit way.

Req-7 Formal specifications of DSL programs should be generated automatically on the model level. The automatic generator should be DSL-independent and configured according to the definition of the dynamic semantics of a DSL that has been constructed on the metamodel level.

Note that these requirements are orthogonal to the research (design) questions that determine and guide our study. The list of requirements formulated above is in itself an answer to one of the research questions (formulated in the introduction of this chapter). The list of explicitly identified requirements gives a pragmatic focus to the other research questions that we consider further in this work.

Chapter 3

The Grand Vision

"We create a general theory of the slaying of electrodragons, of which the lunar dragon will be a special case, its solution trivial." "Well, create such a theory!" said the King. "To do this I must first create various experimental dragons." "Certainly not! No thank you!" exclaimed the King. "A dragon wants to deprive me of my throne, just think what might happen if you produced a swarm of them!"

Stanislaw Lem, The Tale of Machine that Fought a Dragon

In this chapter we outline our vision on how our research fits into the perspective of the DSL-based development approach, its challenges and potentials. First we describe the technology space of the DSL-based development approach by aligning the formalisms, tools, and platforms that support the development and usage of DSLs (Section 3.1). Then we use the terms of this technology space to introduce a Common Reference Framework (COREF), a framework for defining, developing, and using DSLs. The aim of this framework is not to introduce new formalisms and tools, but rather to bridge existing formalisms and tools. Based on the high-level discussion of our experience of the LACE case study, we introduce the idea of structural and semantic templates and describe our vision on how such templates can be used for achieving the goals of COREF (Sections 3.2 and 3.3). On the one hand, structural and semantic templates can be used for defining a DSL on a higher abstraction level, supporting reuse of the DSL definitions. On the other hand, structural and semantic templates can bridge the formalisms and tools (of the technology space), facilitating the correspondence between them. Our research on defining the dynamic semantics of DSLs is performed within COREF and realizes a part of the proposed generic framework. In particular, in the rest of the thesis, we elaborate on the idea of semantic templates and how they can be applied to a definition of the dynamic semantics of DSLs.

3.1 The Technology Space

Figure 3.1 depicts an (approximate) technology space common for the DSL-based development approach. We organize this technology space along the following two dimensions: MOF meta-levels (vertical axis) and different technologies/platforms (horizontal axis). MOF consists of four layers: the top layer M3 provides a meta-metamodel (or meta-language) for defining DSLs; on the M2 layer DSLs are defined and developed; on the M1 layer DSLs are used, i.e. DSL programs are specified; on the M0 layer DSL programs are executed. In Figure 3.1 we do not depict the conformance relationships between the models of different MOF layers in order to not overly complicate the picture. However, we imply that each model conforms to the model from the same column from the layer above. For example, a DSL metamodel conforms to the Meta-metamodel (Ecore); a DSL model conforms to the DSL metamodel; etc.

[Figure 3.1 (diagram not reproduced). Columns: Textual/graphical notation; EMF (MDE); C++, Java, etc. (GPL source code); Event-B (formal methods); BMotion Studio (debugging GUI). Rows: M3 (Meta-metamodel (Ecore), Event-B metamodel); M2 (DSL concrete syntax, DSL metamodel, C++/Java, DSL formal specification); M1 (DSL program, DSL model, model implementation, model specification, model visualization); M0 (program execution, model animation). Arrow labels: uses, parse, generate code, M2M. Legend: reference; translation (DSL-specific); translation, automated via the corresponding reference.]

Figure 3.1: Technology space for the DSL based development approach

We consider the technologies/platforms that support the development and usage of DSLs. For example, EMF (Eclipse Modeling Framework) realizes the principles of the MDE approach by providing the means for defining a DSL (i.e. defining the DSL metamodel as an instance of the Ecore meta-metamodel) and for specifying DSL models as instances of the DSL metamodel. A DSL model cannot be executed by means of the EMF tool set (in Figure 3.1 the corresponding square is empty). The common approach is to generate GPL source code from a DSL model (model implementation in Figure 3.1) and run this code using the corresponding compiler and/or interpreter (program execution in Figure 3.1). The code generation is usually developed for each separate DSL (i.e. it is DSL-specific). The same scheme applies to our specification of the LACE DSL in Event-B, described in Chapter 2. The model transformation (M2M in Figure 3.1) from a DSL model to the corresponding model specification has been developed particularly for the LACE DSL and, as it is, cannot be applied to another DSL. The LACE-to-Event-B transformation composes a LACE model specification out of the conceptual machines of the DSL formal specification, and in this way instantiates the latter. On the M0 layer such a model specification can be animated using the Rodin tool set. A model visualization is generated from the graphical representation of the DSL model (DSL program in the left-most column) and runs on top of (wraps) the animation. Note that, formally speaking, the Event-B column in Figure 3.1 should take three cells instead of four: the Event-B metamodel is an instance of Ecore, i.e. an M2-model. Moreover, in the context of the Event-B language, the DSL formal specification and a model specification are both Event-B specifications (i.e. M1-models). We organize this column differently in order to align our Event-B specifications with the DSL meta-levels.
In particular, in this way we stress the fact that LACE conceptual machines (forming the DSL formal specification for LACE) are constructed on the metamodel level (i.e. in terms of the LACE metamodel), and LACE composite machines (model specifications) are generated on the model level for concrete LACE models. In Figure 3.1 we depict the DSL-specific translations as white (hollow) arrows. In contrast, hatched (dashed) arrows show the DSL-independent translations, such as parsing a DSL program into a DSL model. A DSL parser is generated automatically based on the definition of the DSL concrete syntax (the DSL grammar) and the definition of how this concrete syntax uses (maps to) the DSL metamodel (the DSL abstract syntax), i.e. how elements of the concrete syntax correspond to the elements of the DSL metamodel. The mapping between the DSL concrete syntax and the DSL metamodel is described by means of a corresponding meta-language (not shown in Figure 3.1), which is the same for (generic throughout) different DSLs. Examples of such meta-languages are Xtext grammars for defining textual syntaxes¹ and Sirius editors for creating graphical syntaxes². Thus, in parsing, a DSL-independent meta-language (from M3) allows for configuring various DSL-specific translations (in terms of M2), which are then applied to DSL models (on M1). This approach has two important benefits. First, this approach (together with the supporting meta-language) is reusable for many DSLs. Second, the mapping between a DSL concrete syntax and a DSL metamodel is described explicitly (whereas in DSL-specific translations it is hard-coded in a translation). The various translations discussed above (code generation, parsing, model transformation) realize the bridges between the technological platforms. We observe that most of the bridges are defined in a DSL-specific manner, i.e. for each particular DSL.
The goal of the Common Reference Framework (COREF) is to generalize such bridges by making them reusable for many DSLs, trying to achieve the same generic solution as there exists for parsing. Below we discuss how this goal can be achieved by giving an explicit high-level definition of a DSL, in particular the high-level definition of its two components: the structure and the dynamic semantics.
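The contrast between a hard-coded translation and a translation configured by an explicit mapping can be illustrated with a small Python sketch. It is not taken from Xtext or Sirius and does not reflect their APIs: the toy line-based syntax, the function name `parse`, and the mapping `toy_mapping` are assumptions introduced only for this illustration.

```python
def parse(program_text, syntax_mapping):
    """Generic (DSL-independent) parser for a toy syntax 'keyword name'.

    The DSL-specific part is confined to syntax_mapping, which states
    explicitly how concrete-syntax keywords map to metamodel classes.
    """
    model = []
    for line in program_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip empty lines
        keyword, _, name = line.partition(" ")
        if keyword not in syntax_mapping:
            raise ValueError(f"unknown keyword: {keyword!r}")
        model.append({"class": syntax_mapping[keyword], "name": name.strip()})
    return model


# Hypothetical mapping for a toy DSL; a different DSL would supply
# a different mapping and reuse parse() unchanged.
toy_mapping = {"action": "LogicalAction", "node": "DataNode"}
toy_model = parse("action Move\nnode Position\n", toy_mapping)
```

In this sketch the routine `parse` plays the role of the reusable translation, while `toy_mapping` plays the role of the explicit, DSL-specific configuration.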

¹ http://www.eclipse.org/Xtext/
² http://www.eclipse.org/sirius/

3.2 Defining the Structure of a DSL

The structure of a DSL captures language constructs, their compositional hierarchy, their classification (or taxonomy), and cross-references between them. The structure of a DSL is a central point of the DSL design and development. Therefore, its specification appears in some form in all platforms of the technology space depicted in Figure 3.1. Following the experience of our case study, we discuss here the following two platforms: EMF and Event-B. In our case study we bridged these two platforms via the LACE-to-Event-B model transformation that generates Event-B contexts from a concrete LACE model. The input of this transformation is the LACE structure captured through Ecore constructs. The output of the transformation is the LACE structure captured through set theory and first-order predicates. The mapping realized by the LACE-to-Event-B transformation is quite fine-grained and very LACE-specific: it maps small details from an input model to small details of an output specification according to the way the LACE structure is specified (modeled) in Ecore and in Event-B. Reimplementing such a transformation for another DSL is tedious work that requires expertise in both formalisms (Ecore and Event-B) and in the DSL under construction. The aim of COREF is to facilitate this work by generalizing the LACE-to-Event-B transformation and bridging the Ecore and Event-B formalisms on a higher abstraction level (in a DSL-independent way). When specifying LACE in Event-B, we noticed that we first describe the structure of LACE using coarse-grained (informal) statements, such as ‘a logical action consists of subsystem action occurrences’, ‘occurrences are partially ordered’, etc. As a second step, we translate each of these statements into a piece of Event-B code and/or into a fragment of the Ecore model. Moreover, in this translation we establish and reapply certain Ecore and Event-B constructions that capture typical phrases reappearing in the coarse-grained statements. For example, after we have specified ‘a logical action consists of subsystem action occurrences’, we reuse the same Ecore and Event-B constructions for specifying the ‘consists of’ phrase appearing in the statement ‘a logical action consists of data nodes’. In case the Ecore model (the DSL metamodel) already exists, we do not need to translate the coarse-grained statements into the Ecore model, but we rather need to decipher the existing Ecore model into a set of such statements (i.e. to perform reverse engineering). The observation that we reuse the same Ecore and Event-B constructions gave us the idea of structural templates.
Such structural templates can represent constructs (re)appearing in a coarse-grained description of the DSL structure. Typical examples of structural templates are ‘A consists of B’, ‘partial order on A’, ‘A relates to B’. Each of these structural templates has a corresponding representation (implementation) in Event-B and in Ecore (see Figure 3.2, the top layer), and thus, bridges these two platforms. Figure 3.2 depicts our vision on how structural templates can be used for bridging Ecore and Event-B.

[Figure 3.2 (diagram not reproduced). Columns: Textual/graphical notation; EMF (MDE); COREF; Event-B (formal methods). Top layer: metamodel templates, structural templates, Event-B templates. M2: DSL concrete syntax, DSL metamodel, DSL structural definition, DSL formal specification. M1: DSL program, DSL model, structure instantiation, model specification. Arrow labels: uses, parse. Legend: reference; translation, automated via the corresponding reference.]

Figure 3.2: Structural templates for bridging EMF and Event-B
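As a rough illustration of this idea (not the COREF implementation), a structural template can be thought of as one parameterized statement with two renderings, one per platform. The Python sketch below is hypothetical throughout: the class name `ConsistsOf`, its method names, and the emitted Ecore-like and Event-B-like text are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistsOf:
    """Hypothetical structural template 'A consists of B'."""
    whole: str
    part: str

    def to_ecore(self) -> str:
        # Assumed rendering: a containment reference from the whole to its parts.
        return (f"class {self.whole} {{ "
                f"contains {self.part}[0..*] {self.part.lower()}s }}")

    def to_event_b(self) -> str:
        # Assumed rendering: two carrier sets and a total function that
        # assigns every part to exactly one whole.
        return (f"SETS {self.whole} {self.part}\n"
                f"CONSTANTS owner_of_{self.part}\n"
                f"AXIOMS owner_of_{self.part} : {self.part} --> {self.whole}")


# A DSL structural definition is then a list of invoked templates;
# the two statements are the LACE examples quoted in the text.
structure = [
    ConsistsOf("LogicalAction", "SubsystemActionOccurrence"),
    ConsistsOf("LogicalAction", "DataNode"),
]

ecore_fragments = [t.to_ecore() for t in structure]
event_b_fragments = [t.to_event_b() for t in structure]
```

Both outputs are derived from the same invocation, so the ‘consists of’ phrase is captured once and rendered consistently on both platforms.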

As each structural template is implemented both in Ecore and Event-B, a DSL structural definition composed of such structural templates can be used as a source for the automatic generation of the DSL metamodel in Ecore and the DSL formal specification in Event-B (M2 layer in Figure 3.2). That is, we do not create a DSL metamodel in Ecore, but we generate it automatically from the DSL structural definition using the library of metamodel templates (the top layer in Figure 3.2). The corresponding DSL formal specification is generated using the library of Event-B templates. On the M1 level, the translation from EMF to COREF can be used to trace how the DSL structure is instantiated in a particular DSL model. The resulting structure instantiation can be used for constructing the corresponding model specification in Event-B. Thus, the DSL structural definition configures the translation from DSL models to the corresponding Event-B specifications. The feasibility of the approach depicted in Figure 3.2 and its applicability to various DSLs are not self-evident and require further investigation. Within the COREF project, the first step of the study on structural templates was done by Zhang within her Master project [109]. Zhang investigated how a structural template can be implemented in Ecore and woven into a DSL metamodel. Further, the following research questions should be addressed: what is the nature of structural templates; how can structural templates be parameterized, applied (invoked), and woven together in a DSL structural definition; and how can these mechanisms (structural templates separately and their composition together) be translated into Ecore and Event-B. However, in our work we focus on the dynamic semantics of DSLs (see the next section) and discuss the structure of DSLs as a necessary context of our study. Therefore, we consider the listed questions out of scope of this thesis.
In our work, as an approximation of the proposed concept of structural templates, we employ UML-B [91]. UML-B wraps Event-B code into the graphical notation of UML class diagrams (and UML state machine diagrams, which we do not consider here). For this, UML-B maps (translates) the standard constructs of UML class diagrams to the corresponding Event-B (or B) constructions. As Ecore is a subset of UML class diagrams, we use this translation to bridge Ecore and Event-B. We consider the basic constructs of Ecore (a class, an attribute of a class, an association between classes, etc.) to be structural templates and use UML-B to translate them to the corresponding Event-B constructions. The translation of such structural templates to Ecore is trivial. On the one hand, this implementation has certain restrictions. For example, it does not allow capturing various properties of the DSL structure in the DSL structural definition: we need to specify them separately in Ecore (for example, using OCL) and in Event-B (for example, using axioms). On the other hand, we can use (a subset of) an existing DSL metamodel as an input for the UML-B translation to Event-B.

3.3 Defining the Dynamic Semantics of a DSL

As described in Chapter 2, the dynamic semantics of the LACE DSL is defined by the set of conceptual machines and the LACE-to-Event-B model transformation that instantiates and composes these conceptual machines for each concrete LACE model (program). Note that the examples of such conceptual machines, presented in Section 2.5, use known mechanisms of programming and/or reuse mechanisms from each other. For example, the Event-B machine specifying the behavior of a subsystem component implements the FIFO (first-in-first-out) mechanism (see Figure 2.8(b)). The Core SF of the LAC component (Figure 2.8(a)) and the ‘aligned-with-the-implementation’ version of the Order SF of the LAC component (Figure 2.18) use the same mechanism for processing a (sub)set of (requested) elements one by one, blocking any upcoming request until the current request is processed: compare the usage of the variables curr_job and wait_exe in Figure 2.18. Thus, we observe that major parts of our Event-B machines specifying LACE are not LACE-specific and in principle can be used for defining other DSLs as well. However, the LACE-to-Event-B model transformation, which instantiates and composes these machines, is LACE-specific. Therefore, to be able to reuse our Event-B specifications of LACE for defining the dynamic semantics of other DSLs (or an evolution of LACE), we need to generalize both components of the specification of the dynamic semantics of LACE: conceptual machines and the LACE-to-Event-B model transformation. Such a generalization is achieved by extracting the LACE-specific information (from the conceptual machines and the transformation) and lifting this information until we find a suitable abstraction for expressing (part of) the dynamic semantics of a DSL in a DSL-independent way. The result of the generalization of LACE conceptual machines is a library of semantic templates.
In particular, the two example mechanisms given above are generalized as the Queue specification template and the Request specification template. Each of the semantic templates has a corresponding representation (implementation) in Event-B (see Figure 3.3, the top layer). The result of the generalization of the LACE-to-Event-B model transformation is a meta-language that allows for describing a DSL semantics definition in terms of semantic templates and for automatic generation of the corresponding DSL formal specification in Event-B (M2 layer in Figure 3.3). These two generalizations form the core of our research work. The rest of Figure 3.3 depicts our vision on the potential usage of semantic templates. Figure 3.4 connects our vision of a definition of the dynamic semantics of a DSL (from Figure 3.3) with our vision of a definition of the structure of the DSL (presented in Figure 3.2).

[Figure 3.3 (diagram not reproduced). Columns: C++, Java, etc. (source code); COREF; Event-B (formal methods); BMotion Studio (debugging GUI). Top layer: source code templates, semantic templates, Event-B templates, visualization templates. M2: DSL semantics definition, DSL formal specification. M1: model implementation, semantics application, model specification, model visualization. M0: program execution, execution trace, model animation. Legend: reference; translation, automated via the corresponding reference.]

Figure 3.3: Semantic templates and their application
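To give an intuition for the two mechanisms that the Queue and Request specification templates generalize, the following Python sketch mimics them as small state machines with guarded operations. It is only an analogue written for this illustration: the actual templates are Event-B specifications, and the class names, method names, and guard checks below are assumptions, not the thesis's definitions.

```python
from collections import deque


class QueueTemplate:
    """FIFO mechanism: elements leave in the order in which they arrived."""

    def __init__(self):
        self.items = deque()

    def enqueue(self, element):
        self.items.append(element)

    def can_dequeue(self):
        # Guard of the dequeue operation: the queue must be non-empty.
        return len(self.items) > 0

    def dequeue(self):
        if not self.can_dequeue():
            raise RuntimeError("guard violated: queue is empty")
        return self.items.popleft()


class RequestTemplate:
    """Processes the elements of a request one by one; while a request
    is in progress, any new request is blocked."""

    def __init__(self):
        self.current = None   # name of the request in progress, if any
        self.pending = []     # elements of that request still to process

    def can_request(self):
        # Guard of the request operation: no request may be in progress.
        return self.current is None

    def request(self, name, elements):
        if not self.can_request():
            raise RuntimeError("guard violated: a request is in progress")
        self.pending = list(elements)
        self.current = name if self.pending else None

    def process_one(self):
        if not self.pending:
            raise RuntimeError("guard violated: nothing to process")
        element = self.pending.pop(0)
        if not self.pending:
            self.current = None  # request finished; new requests are allowed
        return element


q = QueueTemplate()
q.enqueue("a")
q.enqueue("b")
first = q.dequeue()

r = RequestTemplate()
r.request("job1", ["x", "y"])
blocked = not r.can_request()
r.process_one()
r.process_one()
free_again = r.can_request()
```

Neither class mentions any LACE concept, which is the point of the generalization: the DSL-specific information enters only through the arguments with which a template is invoked.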

In the LACE case study, the Event-B conceptual machines specify the dynamic semantics of LACE and get instantiated and composed (invoked and woven together) into a composite machine for each concrete LACE model. In the same way, in Figure 3.3 the Event-B DSL formal specification is instantiated into an Event-B model specification for each concrete DSL model on the M1 layer. In contrast with the LACE case study, this instantiation is realized by a DSL-independent translation. First, a DSL model is translated into the corresponding structure instantiation (see Figure 3.2). Then, according to this structure instantiation, we construct a semantics application (using the connections between the definition of the dynamic semantics and the structure of the DSL, as depicted in Figure 3.4). Such a semantics application is still composed of semantic templates, but invokes them using a more detailed (concrete) DSL structure (M1-model vs. M2-model).

Figure 3.4: Connections between semantics definition and structural definition

On the M1 layer three different platforms can be (potentially) bridged via the invoked semantic templates: source code, formal methods, and domain-specific visualization (see Figure 3.3). For example, to each semantic template of our library we can assign a corresponding source code template that implements it. A translation from a semantics application to a model implementation would compose these implementations of the semantic templates and, thus, construct the corresponding source code of the DSL model. In order to be able to use source code as a target platform, we need to answer the following questions: how to compose fragments of source code according to a DSL semantics definition; and, more particularly, does the order in which code fragments are composed influence the behavior of the resulting program. Bridging an actual implementation of a DSL (in GPL source code) and a formal specification of the dynamic semantics of the DSL on the M0 layer is investigated within COREF by Manders [63]. In his work, Manders uses execution traces to investigate and compare the behavior of DSL models and their implementations in different target platforms. Potentially, such execution traces can use semantic templates to decipher the low-level information of execution logs (from the program execution or from the model animation) into the higher-level concepts specified in the DSL semantics definition. This will allow for high-level domain-specific debugging of DSL programs (in contrast to low-level GPL-based debugging of the source code generated from a DSL model). An example of another platform is BMotion Studio, which we used in our case study to create a domain-specific visualization of the Event-B specifications of LACE.
As this visualization is designed following the same modules of the LACE specification that we aim to generalize into semantic templates, we can potentially identify and collect visualization templates (rightmost column in Figure 3.3). If each of the semantic templates in our library is coupled with a corresponding visualization template, then the construction of a domain-specific visualization for an arbitrary DSL can be supported with a (higher-level) palette consisting of the visualization templates instantiated for (and based on) the DSL semantics definition. In this way, a library of semantic templates can serve as a common semantic domain, which is used as a (common) source of semantic mappings targeting various execution platforms. As a result, the consistency between different DSL translations is handled within these semantic mappings and can be reused for different DSLs. However, the feasibility of these ideas requires further research.

3.4 Conclusions

In this chapter we elaborated on the lessons learned during the LACE case study and described our vision on what a generic framework for defining DSLs can look like. We discussed our observations and generalized the results obtained for the LACE DSL, and proposed an approach (a framework) that will support reuse of these results for other DSLs. The proposed approach is based on the idea of structural and semantic templates. In the remainder of the thesis we realize this idea by designing and developing the concept of specification templates. In particular, in the next chapter we design a language that allows for the introduction and application of specification templates, i.e. realizes the following two relations depicted in Figure 3.3: semantic templates implemented as Event-B templates (the arrow in the top layer), and a DSL semantics definition given in terms of semantic templates (the arrow between the M2 layer and the top layer). Further, we develop the semantics of this language by specifying the semantic mapping from an arbitrary DSL semantics definition to the corresponding DSL formal specification in Event-B (the dashed transformation arrow in the M2 layer in Figure 3.3). Finally, we investigate the applicability of the proposed idea and its realization through a validation study on defining the dynamic semantics of a DSL.

Chapter 4

Specification Templates and Constelle for Defining the Dynamic Semantics of DSLs

The constellations are totally imaginary things that poets, farmers and astronomers have made up over the past 6000 years. The real purpose for the constellations is to help us tell which stars are which, nothing more. On a really dark night, you can see about 1000 to 1500 stars. Trying to tell which is which is hard. The constellations help by breaking up the sky into more manageable bits. They are used as mnemonics, or memory aids.

What Are Constellations? - on the Internet

Following the requirements formulated in Chapter 2, in this chapter we design and develop the Constelle language. Constelle implements the idea of reusable specification templates, introduced in Chapter 3. Constelle introduces an intermediate semantic domain for defining the dynamic semantics of a DSL. The first step of the semantic mapping maps DSL constructs to specification templates. The second step of the semantic mapping implements specification templates using the back-end formalism. From the technological point of view, Constelle wraps a back-end formalism (such as Event-B) into specification templates and uses them as building blocks for defining the dynamic semantics of a DSL. In relation to the LACE case study (described in Chapter 2), specification templates generalize the conceptual machines that we used for the LACE specification; and Constelle results from lifting up and abstracting the LACE-specific information extracted from the LACE-to-Event-B transformation. In Chapter 2, based on our LACE case study, we suggested the hypothesis that in a definition of the dynamic semantics of a DSL an intermediate semantic domain is determined by the ‘domain specificity’ of the DSL. In particular, an intermediate semantic domain consists of common software (design) solutions that reappear in the implementation of the DSL and form its horizontal domain. According to this hypothesis, the dynamic semantics of a DSL is defined by a (semantic) mapping of DSL vertical concepts (i.e. its language constructs) to DSL horizontal concepts. Using Constelle, we aim to support this hypothesis. For this, we use specification templates to capture software (design) solutions that reappear in the implementation of different DSLs (i.e. the DSL horizontal domain). Constelle allows for mapping constructs of the DSL (i.e. the DSL vertical domain) to such specification templates.
As a result, Constelle supports the reuse of (formal specifications of) common software (design) solutions throughout the family of DSLs that apply the same implementation techniques. Moreover, Constelle facilitates a clear definition of the dynamic semantics of a DSL in terms of its horizontal concepts. Throughout the chapter we use a LEGO allegory to explain our ideas and design decisions. The LEGO allegory is a metaphor for the dynamic semantics of a DSL defined following the ac- tual implementation of the DSL (rather than according to a set of requirements), see Section 4.1. In Section 4.2 we define the notion of reusable specification templates and discuss how they are invoked and composed together. In Section 4.3 we develop a metamodel of specification templates. In Section 4.4 we describe the Constelle language and its metamodel. Section 4.5 introduces the technique of reusable constraint templates that we use in Constelle to ‘glue’ the specification templates invoked in a definition of the dynamic semantics of a DSL.

4.1 LEGO Allegory

4.1.1 Dynamic semantics of a DSL: requirements vs. solutions

In analogy with software, a DSL can be explored, designed, and described in the form of two different artifacts:

• requirements that define the expected (or intended) behavior of DSL programs;
• a solution that defines the actual implementation (or software architecture) of the DSL.

Ideally, each possible solution refines the (predefined) requirements (in the general meaning of the refinement relation as the implication of properties). To check whether this relation actually holds, one can apply manual and automatic techniques: validation and verification, respectively. For performing automatic verification of this refinement relation, one needs to have a formal specification of the requirements. Manual validation involves a human who can interpret the informal description of requirements. In this case a misinterpretation is possible, and there exist a number of studies that aim to mitigate this drawback, see for example [62].

In practice, requirements are specified formally if this is required by a standard of the development process, for example, for critical or life-threatening systems, such as railway signaling (the CENELEC standards) or automotive systems (the ISO 26262 standard). Classical approaches of algebraic and denotational semantics allow for formally specifying the dynamic semantics of a programming language in the form of requirements rather than in the form of a solution. An operational semantics gives more insight into how a program is executed, but still abstracts from implementation strategies and machine architectures [73]. Using these formal techniques requires scientific expertise and, thus, is not expected from an average software engineer. At the same time, the costs of employing scientific expertise might not be justified if a DSL is used in a non-critical domain. Thus, the usual situation in which the dynamic semantics of a DSL is not specified formally might simply reflect the common software development practice of not having a formal specification of requirements.

In our approach we specify the dynamic semantics of a DSL as a solution rather than requirements. To stress the difference between a specification of requirements and a specification of a solution, we illustrate our approach using a tangible and domain-independent analogy: the LEGO Technic construction kit. LEGO Technic allows for constructing models of moving mechanisms using LEGO pieces (such as gears, pins, axles, pneumatic systems, motors, etc.) and principles of mechanical engineering to assemble them. An example of such a LEGO Technic model is presented in Figure 4.1.¹ When considering requirements of the system modeled by this LEGO construction, one can think about the following requirement: the car should make a smooth left (or right) turn when the car is moving and the steering wheel is turned left (or right). The construction presented in Figure 4.1 is a model of a real-life solution: it abstracts from some implementation details and captures the key elements of the solution, which guarantee the realization (refinement) of the requirement stated above. The key elements of the pictured solution are: a bevel-gear steering system, two differentials that enable the car to turn smoothly, etc.

(a) Side view (b) Bottom view

Figure 4.1: Example of a LEGO Technic model

In the same way as a LEGO model captures the principal mechanical solution, a definition of a DSL implementation captures the DSL dynamic semantics as a principal design solution. If such a definition is precise and executable (as defined in Section 2.1), then we can use it in the following use cases:

• to prototype the DSL implementation (in LEGO: let’s construct a Jeep-like car);
• to validate the prototyped DSL implementation against (informal) requirements by executing the definition (in LEGO: does the car drive? does the steering mechanism work as expected?);
• to check consistency of the prototyped DSL by analyzing the definition (in LEGO: how are two differentials combined in a four-wheel drive without blocking the car from moving?);
• to verify the prototyped DSL against formalized requirements (or other high-level properties that the DSL should fulfill) by analyzing the definition (in LEGO: does a combination of gears meshed together support the speed and/or friction ratio as specified in the requirements?);
• to simulate and debug DSL programs by executing the definition (in LEGO: debug why the gears get loose after 10 minutes of usage).

Note that for all these use cases it is essential that the definition of the DSL dynamic semantics (that is used for analysis and execution) is consistent with and reflects the actual DSL implementation. Without this consistency we cannot extend the results of analyzing and executing the definition to the actual DSL implementation and its DSL programs.

¹ Pictures of LEGO models used in the thesis are taken from the web-sites brickshelf.com and sariel.pl and from the book [52].

4.1.2 Reusable specification templates

When defining a DSL dynamic semantics as a design solution rather than requirements, we target a semantic domain that is sufficiently rich to model the programming language (or environment, or system, or platform) in which DSL programs are executed. Thus, this semantic domain represents concepts that are commonly used in software development practice (rather than in a particular DSL). In our LEGO allegory this means that the same semantic domain of plastic LEGO pieces (which model real-life details: gears, axles, pneumatic systems, etc.) is used to construct mechanisms (i.e. DSLs) from various domains, from cars and trucks to robotic arms. This is different from the definition of the dynamic semantics of a DSL (or mechanism) in the form of requirements: there we target semantic domains that are (substantially) different from each other. For example, for cars we would model concepts of speed, acceleration, and caster angle; and for robotic arms we would model concepts of gripping, spinning, and positioning.

On the one hand, this kind of semantic domain is low level, which can result in a relatively complex semantic mapping that bridges the semantic gap between a DSL and the semantic domain. On the other hand, the commonality of the semantic domain makes it possible to reuse common, reappearing solutions. We will explain this observation using the LEGO allegory.

When constructing LEGO mechanisms, one does not need to reinvent the wheel. There is a set of custom mechanical solutions (built of LEGO pieces), patterns, and principles that one can reuse for constructing linkages, transmissions, suspensions, pneumatic devices, etc. For example, a collection of such patterns is provided in the book by P. Kmieć [52]. Note that these mechanical solutions are not restricted to LEGO constructions, but are rather distilled and explained in terms of LEGO (Figure 4.2(b)). The same idea can be applied to defining a DSL dynamic semantics via the introduction of reusable specification templates.

   

(a) Differential pattern (b) Function of the differential pattern (c) Application of the differential pattern

Figure 4.2: Example of a LEGO pattern

4.2 Approach

4.2.1 Reusable specification templates

Specification templates are introduced for the reuse of common (successful) design solutions, which appear when constructing (defining or prototyping) DSL implementations (solutions). In our work we concentrate on reusable specification templates for defining the dynamic semantics of DSLs. However, similar ideas can be applied to the definition of other aspects of DSLs: abstract syntax (metamodel), concrete syntax (grammar), or static semantics (type system). A brief overview of existing research on this possibility is given in Section 4.6.

1  template <typename Type>
2  Type max(Type a, Type b) {
3    return a > b ? a : b;
4  }
5  ...
6  int z = max<int>(x, y);
Listing 4.1: Example of a C++ template and its invocation

Reusable specification templates realize the well-known approach of generic programming, first defined by Musser and Stepanov in [72]. In generic programming, many concrete implementations of the same algorithm are captured in the form of a generic algorithm via abstracting from the concrete data types appearing in the algorithm. Such an abstraction is expressed as requirements on the parameters of a generic algorithm. For example, Listing 4.1 (lines 1–4) depicts a generic algorithm implemented as a C++ function template for calculating the larger of two objects, a and b. The only parameter of this generic algorithm is the data type Type. From the source code, we can infer that the requirements on this parameter are as follows: this type should support copying of an object value and the operator greater-than (>).² Line 6 of Listing 4.1 demonstrates how this template can be (re)used: for this, the type parameter is specified (int) and the resulting specialized template is invoked as a usual C++ function (in the example, we calculate it for the variables x and y).

With regard to our LEGO example depicted in Figure 4.2(a), we can consider the differential pattern as a generic mechanical template parameterized by the gears used in its construction. The major requirement on these parameters can be that the gears should fit together. After specializing this template with concrete gears, we can invoke it by connecting with other parts of the vehicle via input (green) and output (red) axles (in the example depicted in Figure 4.2(c), the output axles are connected to the wheels, and the input axles are connected to the engine).

Similar to a source code template, a specification template is a piece of specification code parameterized for reuse via abstracting from the concrete data types and/or data. Unlike a template written in a programming language, a specification template is written in a formalism that is based on a solid mathematical theory.
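Returning to Listing 4.1, the requirements on Type that are only implicit in the source code can also be stated explicitly in the code itself. The following is a minimal sketch of this idea, not part of the original listing: the trait has_greater and the function name max_of are hypothetical names chosen for illustration (max_of avoids a clash with std::max), and compile-time assertions stand in for a dedicated language construct such as an interface.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait (an assumption for this sketch): true if `a > b`
// compiles for two values of type T.
template <typename T, typename = void>
struct has_greater : std::false_type {};

template <typename T>
struct has_greater<T, decltype(void(std::declval<const T&>() >
                                    std::declval<const T&>()))>
    : std::true_type {};

// Same algorithm as Listing 4.1, with the requirements on the
// template parameter written down as compile-time checks.
template <typename Type>
Type max_of(Type a, Type b) {
    static_assert(std::is_copy_constructible<Type>::value,
                  "Type must support copying of an object value");
    static_assert(has_greater<Type>::value,
                  "Type must support the operator >");
    return a > b ? a : b;
}
```

Specializing max_of with a type that lacks the operator > then fails with the stated message instead of an error deep inside the function body, which mirrors the idea of making requirements a visible part of the template.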
From this comparison, we identify the following key features that characterize reusable specification templates.

• A specification template is a specification where some specification elements are considered as template parameters and thus can be substituted by other elements of the same nature. (In our LEGO example depicted in Figure 4.2(a), the driver gear mounted on the green axle is considered as a template parameter and, thus, can be replaced by another gear.)
• Requirements on template parameters are specified explicitly as a part of the specification template. (In Figure 4.2(a), the driver gear should be a bevel gear, which means that it can be meshed with other gears in a perpendicular manner.)
• A specification template can be reused together with the results of its verification, such as a proof of the specification consistency and/or a proof of some properties holding for this specification. (The main property of the differential pattern is depicted in Figure 4.2(b): it balances the drive between output axles, allowing for a smooth turn [52]; this property holds for any specialization of the template that fulfills the requirements imposed on the template parameters.)
• After specializing a specification template, it can be invoked as a building block for constructing another (composite) specification. The verification results of the specification template hold after composing the template with other parts of the composite specification. (In Figure 4.2(c), both applications of the differential preserve the "smooth turn" property.)

² Note that in some programming languages it is possible to specify such requirements explicitly, for example, as an interface that should be realized by the type parameter.

While specification templates can be a part (i.e. a language construct) of a specification formalism, we decide not to develop such a formalism from scratch, but rather to base our solution on an existing formalism. The existing formalism provides its tool support (verification and validation tools), thus fulfilling the requirement Req-1 formulated in Chapter 2. Our framework, in turn, wraps the formal methods and allows for composing a design solution from specification templates, according to the requirement Req-3. In order to support all features of specification templates listed above, the ingredients of such a framework should fulfill the following requirements:

• The specification formalism provides the possibility to parameterize specifications. Such parametrization allows for specialization of a specification into another one with the possibility to reuse its verification results.
• The specification formalism provides the possibility to compose specifications in such a way that the constituent specifications retain their verification results after being composed together.
• The front-end provides a language that allows for describing a design by configuring the parametrization and composition of reusable specification templates. Moreover, the front-end supports feedback from the formal methods tools to the language level.

The first two requirements are fulfilled by the Event-B formalism and the techniques of generic instantiation (implementing the first requirement), shared event composition, and refinement (both implementing the second requirement). The third requirement is addressed, in part, by the contribution of this chapter: the Constelle language.

4.2.2 Specification templates for composing DSL semantics

Following the requirements Req-4 and Req-5 formulated in Chapter 2, we aim to use specification templates as the semantic domain, and Constelle as the language for defining a semantic mapping (from a DSL to the semantic domain). In the classical approaches of denotational semantics or operational semantics [73], a semantic mapping is defined as a set of so-called semantic functions, each of which gives a meaning to a separate construct (statement) of the language being defined. A meaning of a language construct determines how this construct changes the state of the program being executed. In other words, the semantics of each of the language constructs is defined separately and independently from other language constructs in terms of changes to the program state. This style of a semantic mapping is known as a compositional semantics [73] and can be characterized as a one-to-many relation (from DSL constructs to state changes).

However, our decision to define a DSL dynamic semantics as a solution (or implementation, as described in Section 4.1) leads to the following situation. In a solution, multiple DSL constructs (statements) can be implemented via the same specification template invocation; and one DSL construct can be implemented by multiple specification template invocations. In this case, the meaning of a DSL construct determines how multiple template invocations change the state of the program being executed. For example, in our LEGO allegory (Figure 4.2(c)), two different ‘DSL constructs’, driving and turning, are implemented by the following common set of mechanical templates: the differential pattern (invoked twice) and the drive-train. The state of the car is changed via interaction of these template invocations. This style of a semantic mapping can be characterized as a many-to-many relation (from invocations of specification templates to state changes).

As a consequence, when defining a semantic mapping we may face the problem of scattering and tangling: the definition of a DSL construct is scattered over multiple invocations of specification templates, and an invocation of a specification template participates in definitions of multiple DSL constructs (and, thus, tangles them). Note that whether such a decomposition of the dynamic semantics (into template invocations) is always feasible is a question that we address in Chapter 9 as future work. Here we focus on the design of the proposed approach.

The problem of scattering and tangling of software code is considered and managed by Aspect Oriented Programming (AOP) [49]. The AOP technique allows for the modularization of aspects that cross-cut a system’s basic functionality. Examples of such aspects are synchronization, memory management, localization, logging, etc. For example, in the C++ code depicted in Listing 4.2 the basic functionality of updating a point’s color and changing its coordinates is cross-cut by the aspect of notifying the display about the point’s new state (lines 3 and 9) and by the aspect of logging (lines 4 and 10). According to the AOP paradigm, these aspects can be extracted into separate (explicit) modules and then woven into the basic functionality.

 1  void Point::moveBy(int dx, int dy) {
 2    x = x + dx; y = y + dy;
 3    display.update();
 4    log(MOVE_BY, this, dx, dy);
 5  }
 6
 7  void Point::setColor(int c) {
 8    color = c;
 9    display.update();
10    log(SET_COLOR, this, c);
11  }
Listing 4.2: Example of a C++ code with different aspects
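The extraction and weaving of the two aspects of Listing 4.2 can be illustrated by hand in plain C++. The sketch below is an assumption-laden illustration rather than an AOP implementation: C++ has no built-in aspect weaving, so the weave function, the trace vector, and the names moveByWoven and setColorWoven are hypothetical, and a dedicated AOP tool would derive the weaving from declarations instead of explicit wrapper calls.

```cpp
#include <string>
#include <vector>

// Records what each aspect did, so that the weaving can be observed.
std::vector<std::string> trace;

// Basic functionality only: no display or logging code inside.
struct Point {
    int x = 0, y = 0, color = 0;
    void moveBy(int dx, int dy) { x += dx; y += dy; }
    void setColor(int c) { color = c; }
};

// The two aspects, extracted into separate functions (hypothetical names).
void displayAspect() { trace.push_back("display.update"); }
void loggingAspect(const std::string& tag) { trace.push_back("log " + tag); }

// Manual weaver: runs the basic operation, then each aspect,
// in the same order as the statements of Listing 4.2.
template <typename Operation>
void weave(Operation basic, const std::string& logTag) {
    basic();
    displayAspect();
    loggingAspect(logTag);
}

// Woven counterparts of Point::moveBy and Point::setColor.
void moveByWoven(Point& p, int dx, int dy) {
    weave([&] { p.moveBy(dx, dy); }, "MOVE_BY");
}

void setColorWoven(Point& p, int c) {
    weave([&] { p.setColor(c); }, "SET_COLOR");
}
```

In this form the display and logging code each appear once, and every woven operation is obtained by combining a piece of basic functionality with the same two aspects.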
Following the requirement Req-6 formulated in Chapter 2, we use the AOP approach to express how a DSL dynamic semantics is composed of specification templates. We consider (specialized) specification templates as aspects and weave (i.e. compose) them together to form the functionality (i.e. behavior) of DSL constructs. In other words, the dynamic semantics of a DSL is defined by composing aspects, each of which is a specialized specification template. The major difference between our approach and the classical AOP is that we define the DSL semantic mapping only in terms of aspects, i.e. the basic functionality is a composition of aspects.

As discussed earlier, such a definition can be characterized as a many-to-many relation. A natural way to depict such a definition is to use the matrix notation: the basic functionality (state changes) is represented in rows, the aspects (invocations of specification templates) are represented in columns, and the intersections of these represent the composition (semantics of DSL constructs). For example, Table 4.1 represents the composition of aspects for the code in Listing 4.2.

Table 4.1: Mapping of aspects on the basic functionality

Point       Display    Logging
moveBy      update     MOVE_BY
setColor    update     SET_COLOR

Compared to the classical AOP realized in programming languages, aspects in our approach are formal specifications. Thus, the compatibility of aspects composed together can be analyzed using tools that support the specification formalism. For example, two differential patterns can be composed together into a 4×4 vehicle’s drive-train, as shown in Figure 4.2(c). To ensure the compatibility of these two template invocations, the front and rear differentials must be oriented in opposite directions so that the front and rear wheels rotate in the same direction. This can be checked by executing the LEGO model, i.e. by executing (animating) the formal specification of the dynamic semantics.

4.3 Reusable Specification Templates

In this section we develop a (meta)model of reusable specification templates. As a carrier formalism to express specification templates we use Event-B. However, we strive towards a mechanism that is formalism-independent, that is, we wish to be able to use another back-end formalism. The metamodel of specification templates allows for configuring the parametrization of Event-B code for further (re)use in a definition of the dynamic semantics of a DSL.

4.3.1 The Event-B formalism

[Figure 4.3: class diagram. A Machine refines other Machines, sees Contexts, and contains variables (Variable), invariants (Invariant), and events (Event). A Context extends other Contexts and contains sets (CarrierSet), constants (Constant), and axioms (Axiom). An Event refines other Events and contains parameters (Parameter), guards (Guard), and actions (Action). All these classes derive from EventBNamedCommentedElement.]

Figure 4.3: Metamodel of an Event-B specification

As we explained in Chapter 2, Event-B employs set theory and first-order logic for specifying software and/or hardware behavior. The metamodel of an Event-B specification is depicted in Figure 4.3. This is an essential subset of the Event-B metamodel provided by the EMF framework for Event-B, one of the Rodin plug-ins [92]. Recall that an Event-B specification consists of Contexts and Machines.³ A context describes the static part of a system using sets, constants, and axioms. A machine uses (sees) the context to specify behavior of a system via a state-based formalism. Variables of the machine define the state space. Events, which change values of these variables, define transitions between the states. An event consists of guards and actions, and can have parameters. The properties of the system are specified in invariants of the machine.

CONTEXT template_queue_context
SETS
  ElementType
END
(a) Event-B context for the Queue specification template

CONTEXT template_request_context
SETS
  ElementType
END
(b) Event-B context for the Request specification template

MACHINE template_queue_machine
SEES template_queue_context
VARIABLES
  queue
INVARIANTS
  inv1 : queue ∈ ℕ ⇸ ElementType
EVENTS
  Initialisation
    begin
      act1 : queue := ∅
    end
  Event enqueue ≙
    any element, index
    where
      grd1 : element ∈ ElementType
      grd2 : index ∈ ℕ
      grd3 : queue ≠ ∅ ⇒ (∀i · i ∈ dom(queue) ⇒ index > i)
    then
      act2 : queue := queue ∪ {index ↦ element}
    end
  Event dequeue ≙
    any element, index
    where
      grd4 : index ↦ element ∈ queue
      grd5 : ∀i · i ∈ dom(queue) ⇒ index ≤ i
    then
      act3 : queue := queue \ {index ↦ element}
    end
END
(c) Event-B machine for the Queue specification template

MACHINE template_request_machine
SEES template_request_context
VARIABLES
  request_body
INVARIANTS
  inv1 : request_body ∈ ℙ(ElementType)
EVENTS
  Initialisation
    begin
      act1 : request_body := ∅
    end
  Event request ≙
    any elements
    where
      grd1 : elements ∈ ℙ(ElementType)
      grd2 : request_body = ∅
    then
      act2 : request_body := elements
    end
  Event process ≙
    any element
    where
      grd3 : element ∈ request_body
    then
      act3 : request_body := request_body \ {element}
    end
END
(d) Event-B machine for the Request specification template

Figure 4.4: Event-B code of two specification templates: Queue and Request

³ In the rest of this thesis, we use this font convention when referring to elements of a metamodel.

Figure 4.4 shows the Event-B contexts and machines of two specification templates that we use as examples in this chapter: Queue and Request. In the Queue specification (Figure 4.4(c)), the collection of elements is modeled as a partial function queue from natural numbers to ElementType (see invariant inv1), where ElementType is a set of all possible elements that can be stored in the queue (see the Event-B context in Figure 4.4(a)). The possible operations on the collection are specified as the events enqueue and dequeue. In the enqueue event, a new pair index ↦ element is added to the collection (see the action act2) if the index is bigger than any other index used in the queue (see the guard grd3). In the dequeue event, a pair index ↦ element is removed from the collection (see the action act3) if the index is smaller than any other index used in the queue (see the guard grd5). In this way, the First-In-First-Out (FIFO) property of the data structure is realized.
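For readers more familiar with program code than with Event-B, the behavior of the Queue machine can be approximated by the following C++ sketch. It is an informal analogy under stated assumptions, not a translation used in the thesis: the class name QueueMachine is hypothetical, an event whose guards do not hold is modeled by returning false, and the index is passed explicitly to enqueue only to make the guard grd3 visible.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Informal analogue of the Queue machine of Figure 4.4(c).
template <typename ElementType>
class QueueMachine {
public:
    // Event enqueue. Guard grd3: if the queue is not empty, the new
    // index must be greater than every index already in use.
    bool enqueue(const ElementType& element, unsigned index) {
        if (!queue_.empty() && index <= queue_.rbegin()->first)
            return false;                 // guard fails: event not enabled
        queue_.emplace(index, element);   // act2: add index |-> element
        return true;
    }

    // Event dequeue. Guards grd4 and grd5 select the pair with the
    // smallest index; std::map keeps its keys sorted.
    bool dequeue(ElementType& element) {
        if (queue_.empty())
            return false;                 // no pair satisfies grd4
        auto first = queue_.begin();
        element = first->second;
        queue_.erase(first);              // act3: remove the pair
        return true;
    }

    std::size_t size() const { return queue_.size(); }

private:
    // inv1: a partial function from natural numbers to ElementType.
    std::map<unsigned, ElementType> queue_;
};
```

Because enqueue only accepts strictly increasing indices and dequeue always removes the smallest one, elements leave the sketch in the order in which they entered it, which is the FIFO property described above.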

4.3.2 Metamodel of a specification template

To consider Event-B code as a specification template, we need (1) a mechanism to parameterize it as a generic template and (2) a mechanism to invoke this template when defining the dynamic semantics of a DSL. To keep the approach universal, we separate these mechanisms from a concrete carrier formalism (in our case, from Event-B). For this, we introduce the concept of Template Interface, which supports the mechanisms of parametrization and invocation independently from the concrete specification formalism. A Specification Template connects a template interface and the specification code that implements this interface. In the metamodel depicted in Figure 4.5, concepts related to the template interface are shown on the left; concepts of the formalism (Event-B) are shown on the right; and concepts of the specification template that connect these two parts are shown in the middle. The right part of Figure 4.5 uses a subset of the Event-B metamodel depicted in Figure 4.3.

[Figure 4.5: class diagram in three parts. Left (template interface): TemplateInterface, specialized into StructuralInterface, which contains StaticParameters (Constant, Type, Relation), and SemanticInterface, which uses a StructuralInterface and contains Operations with DynamicParameters as their signature; all of these are InterfaceElements. Middle (specification template): SpecificationTemplate, specialized into StructuralTemplate and SemanticTemplate, each of which implements the corresponding interface and contains SpecificationElements; a SpecificationElement is either a PublicElement, which implements an InterfaceElement, or a PrivateElement. Right (Event-B): Context, Machine, and EventBNamedCommentedElement, referenced by the templates and their elements.]

Figure 4.5: Metamodel of a specification template

To realize the listed mechanisms, we distinguish two possible template interfaces: Structural Interface and Semantic Interface. To realize the parameterization mechanism, a structural interface defines a collection of template parameters that generalize the template and that can be substituted by concrete data when specializing the template. As these parameters do not change their values during the execution of a composed system, we name them Static Parameters. In our LEGO example depicted in Figure 4.2(a), the static parameters are the sizes of the gears used in the differential: after the sizes are chosen and the corresponding gears are assembled into the mechanism, they are not changed any more (during driving). Choosing different sizes of the gears allows for configuring the so-called axle ratio, which determines how the torque multiplication and top speed are changed (decreased or increased).

The Queue specification template depicted in Figure 4.4(c) is generic with respect to the type of elements stored in the queue. Thus, the corresponding structural interface includes one static parameter: ElementType (see Listing 4.3, structural interface template_basic).⁴ In the metamodel depicted in Figure 4.5, we distinguish three possible static parameters: Constant, Type, and Relation. For example, ElementType is an instance of the meta-class Type. If we consider a finite queue with a fixed capacity, then we can parameterize this capacity using an instance of the meta-class Constant. We note that the completeness of such a classification with respect to various specification formalisms and metamodeling languages requires further investigation. Therefore, we leave a possibility to extend our metamodel by adding new kinds of static parameters.
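The distinction between Type and Constant static parameters has a loose counterpart in C++ templates, where both kinds of parameters are fixed at specialization time and cannot change during execution. The sketch below illustrates the finite queue with a fixed capacity mentioned above; it is an analogy only, and the class name BoundedQueue and the parameter name Capacity are assumptions introduced for this example.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical bounded variant of a queue template.
// ElementType corresponds to a Type static parameter,
// Capacity to a Constant static parameter.
template <typename ElementType, std::size_t Capacity>
class BoundedQueue {
public:
    // Adds an element unless the fixed capacity is already reached.
    bool enqueue(const ElementType& element) {
        if (elements_.size() >= Capacity)
            return false;  // additional guard: the queue is full
        elements_.push_back(element);
        return true;
    }

    // Removes the oldest element, if there is one.
    bool dequeue(ElementType& element) {
        if (elements_.empty())
            return false;
        element = elements_.front();
        elements_.pop_front();
        return true;
    }

private:
    std::deque<ElementType> elements_;
};
```

A specialization such as BoundedQueue<int, 2> fixes both static parameters once, whereas the arguments passed to enqueue and dequeue vary from call to call.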

structural interface template_basic {
  types ElementType
}

semantic interface template_queue uses template_basic {
  operation enqueue (element)
  operation dequeue (element)
}

semantic interface template_request uses template_basic {
  operation request (elements)
  operation process (element)
}
Listing 4.3: Structural and semantic interfaces of the specification templates Queue and Request

To realize the invocation mechanism, a semantic interface defines a set of signatures via which one can invoke the behavior that is specified in the template. In the metamodel depicted in Figure 4.5, a behavior signature is modeled as an Operation. For connecting the behavior with other sub-behaviors of the system, an operation uses DynamicParameters. Dynamic parameters allow for transferring data to and from the template behavior. In our LEGO example depicted in Figure 4.2(a), the red and green axles are dynamic parameters: they connect the differential with other parts of a system and transfer the rotation to (the green axle) and from (the red axles) the differential. The behavior specified in the Queue specification template can be invoked via the operations enqueue and dequeue. The data that is transferred into and from these operations is an element that should be added to or has been removed from the queue. Thus, the corresponding semantic interface consists of two operations, enqueue and dequeue, with element as their dynamic parameter (see Listing 4.3, semantic interface template_queue).

Not all elements of the specification template should appear in the template interface. Some elements are encapsulated in order to hide details of the template design. For example, the index parameters of the events enqueue and dequeue, which are used for determining the position of the element being added/removed, are specific to the way the queue is modeled (a partial function from natural numbers to ElementType). Therefore, we encapsulate index and do not add it to the dynamic parameters of the semantic interface. The same applies to the Initialisation event of the Event-B specification.

⁴ In the rest of this thesis, we use this font convention when referring to elements of a listing.

Finally, a specification template as such lists the elements of the Event-B code that constitute the template, and defines which of these elements can be substituted and/or invoked when applying the template. For this, a SpecificationTemplate stores a collection of SpecificationElements, each of which references an EventBNamedCommentedElement (see the metamodel depicted in Figure 4.5). An EventBNamedCommentedElement can be an Event-B variable, an event, a parameter, etc. (according to the Event-B metamodel in Figure 4.3). All these specification elements constitute the template and, therefore, are explicitly referenced in it.

A specification element can be either public (PublicElement) or private (PrivateElement), see Figure 4.5. A public element links an element of the template interface (InterfaceElement) with an element of the Event-B code that implements this interface element. When the specification template is applied, this Event-B element is substituted by another element of the same kind according to the specialization and invocation of the interface element. For example, the enqueue operation of the semantic interface template_queue (Listing 4.3) is linked to the event enqueue of the Event-B machine template_queue_machine (Figure 4.4(c)); and the ElementType type of the structural interface template_basic is linked to the set ElementType of the Event-B context template_queue_context. The elements of the Event-B specification that do not appear in the template interface are referenced through PrivateElements. For example, the index parameters of the events enqueue and dequeue are not included in the template interface and, thus, are referenced through the corresponding objects of the class PrivateElement.

As an Event-B specification is organized as a context for the static part and a machine for the dynamic part, it is natural to split a specification template into a StructuralTemplate linking a structural interface and an Event-B context, and a SemanticTemplate linking a semantic interface and an Event-B machine. Theoretically, the right part of the metamodel depicted in Figure 4.5 can be replaced by concepts of another specification formalism, with a necessary adjustment of the middle part (practically, this possibility requires further investigation). In the next section we describe how a DSL can be defined in terms of structural and semantic interfaces. Using only interfaces for a DSL definition (i.e. only the left part of Figure 4.5) allows for a potential replacement of Event-B by another specification formalism without changing the DSL definition.
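The linking of interface elements to Event-B elements through public and private specification elements can be pictured as a small data structure. The sketch below is a strongly simplified, hypothetical rendering of the middle part of Figure 4.5: elements are identified by plain strings, the dotted names such as template_queue.enqueue are a notation assumed for this example, and the split into structural and semantic templates is omitted.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for a SpecificationElement of Figure 4.5.
struct SpecificationElement {
    std::string eventBElement;     // name of the referenced Event-B element
    std::string interfaceElement;  // implemented interface element;
                                   // empty for a private element
    bool isPublic() const { return !interfaceElement.empty(); }
};

// Simplified stand-in for a SpecificationTemplate.
struct SpecificationTemplate {
    std::string name;
    std::vector<SpecificationElement> elements;

    // Event-B elements that may be substituted or invoked
    // when the template is applied.
    std::vector<std::string> publicElements() const {
        std::vector<std::string> result;
        for (const auto& e : elements)
            if (e.isPublic()) result.push_back(e.eventBElement);
        return result;
    }
};

// Part of the Queue template as described in the text: ElementType,
// enqueue, and dequeue are public; index and Initialisation are private.
SpecificationTemplate makeQueueTemplate() {
    return {"Queue",
            {{"ElementType", "template_basic.ElementType"},
             {"enqueue", "template_queue.enqueue"},
             {"dequeue", "template_queue.dequeue"},
             {"index", ""},
             {"Initialisation", ""}}};
}
```

A definition written only against the interface names on the right-hand side of each pair never mentions the Event-B names on the left, which is what makes a later replacement of the carrier formalism conceivable.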

4.4 Constelle Language

As introduced in Section 4.2.1, the contribution of this chapter is the Constelle language, which allows for defining the dynamic semantics of a DSL using specification templates. Constelle applies the ideas of generic programming and aspect-oriented programming described in Section 4.2.2. Namely, in Constelle the dynamic semantics of a DSL is defined as a composition of aspects, each of which is a specification template specialized by substituting its (static) parameters with the DSL constructs. The semantics of Constelle maps such a definition to the corresponding (Event-B) specification of the dynamic semantics of the DSL by substituting and composing the specification templates. Such substitutions and compositions raise certain proof obligations in the resulting specification. We discuss the Constelle-to-Event-B mapping and how the corresponding proof obligations can be identified and discharged in Chapter 6.

To realize our approach, we need a library of specification templates that we can use in a definition. This library can exist beforehand or can be created and extended during the process of designing the DSL. The purpose of such a library is to collect and store the knowledge and expertise of designing and developing a DSL (for a concrete domain or for general, broad usage).

4.4.1 Design of the Constelle language

We explain and design the Constelle language through the following example: we consider a DSL for controlling an industrial robot and define (a subset of) its dynamic semantics in Constelle using a library of two specification templates, Queue and Request. The industrial robot and an example of a DSL program for controlling it are depicted in Figure 4.6. The specification templates Queue and Request were introduced in Section 4.3.1.

     1  task Demo {
     2    arm action (turnright, 60);
     3    arm action (movedown, 10);
     4    hand action (grab);
     5    arm action (moveup, 10);
     6    hand action (rotateleft, 180);
     7    arm action (turnleft, 60);
     8    arm action (movedown, 10);
     9    hand action (release);
    10  }

(a) Robotic arm (b) A DSL program

Figure 4.6: An industrial robot and a DSL program that controls it

The industrial robot consists of two major mechanical parts: a hand, responsible for manipulating objects, and an arm, responsible for moving the hand to a certain position. Our example DSL allows for programming tasks for such a robot using a set of actions that can be performed by these parts: the actions turn left, turn right, move up, and move down are performed by the arm; the actions grab, release, rotate left, and rotate right are performed by the hand. The DSL program in Figure 4.6(b) specifies the task Demo that should be executed by the robotic arm. While some actions in a task should be performed in a certain order, other actions can be performed in parallel, as the arm and the hand can operate independently. For example, in lines 5-7 of Figure 4.6(b), the actions move up and turn left of the arm can be performed in parallel with the action rotate left of the hand. However, the action release of the hand should be performed only after the action move down of the arm.

The dynamic semantics of the example DSL realizes the parallel execution of mutually independent actions of the robot parts, and the sequential execution of mutually dependent actions. In Constelle we define these two types of execution in two separate semantic modules. First we define the parallel (independent) execution of actions in the semantic module Robotic Arm Parallel using the specification templates Queue and Request. Then we define the sequential execution of actions in the semantic module Robotic Arm Sequential using the module Robotic Arm Parallel and other specification templates.⁵ In other words, in Constelle the dynamic semantics of a DSL is split into separate semantic modules, each of which encapsulates a behavioral aspect and/or certain design decision(s) and hides it from the rest of the semantics definition. Each of these semantic modules is split into

⁵ In the rest of this thesis, we use this font convention when referring to elements of a Constelle definition.

Table 4.2: Semantic module Robotic Arm Parallel

Robotic Arm Parallel | driver1 : Queue | driver2 : Queue | distributor : Request
taskStm              |                 |                 | request
  • task             |                 |                 |   • elements
armActionStm         | enqueue         |                 | process
  • action           |   • element     |                 |   • element
handActionStm        |                 | enqueue         | process
  • action           |                 |   • element     |   • element
executeArm           | dequeue         |                 |
  • action           |   • element     |                 |
executeHand          |                 | dequeue         |
  • action           |                 |   • element     |
Actions              |                 |                 | ElementType
ArmActions           | ElementType     |                 |
HandActions          |                 | ElementType     |

smaller semantic modules, and so on until we arrive at the library of specification templates, which have the corresponding (Event-B) implementations. Thus, a definition is structured as a directed acyclic graph (DAG) of semantic modules, where the edges represent the relation 'composed of' and the graph sinks represent the semantic interfaces of the specification templates from the library. Such a design allows for a scalable, modular, and formalism-independent definition of the dynamic semantics of a DSL. To make the definition uniform with respect to the 'composed of' relation, we consider the other nodes of the graph as semantic interfaces, too. That is, each semantic module is a semantic interface, composed of other semantic interfaces.

Table 4.2 introduces the semantic interface of the semantic module Robotic Arm Parallel, and shows how this semantic module is composed of the semantic interfaces of the templates Queue and Request. The semantic interface of Robotic Arm Parallel is shown in the leftmost column of the table. The other columns show the aspects of which Robotic Arm Parallel is composed: the aspects driver1 and driver2 both invoke the Queue template, and the aspect distributor invokes the Request template. Each column contains elements of the corresponding semantic interface. The rows of the table show different elements of these interfaces: operations (non-shaded rows) and their parameters (shaded rows, marked here with bullets). The intersections of the rows and the columns show the mapping of the elements of the semantic module to the elements of the constituent semantic interfaces. The bottom part of the table shows the mapping of the structural interfaces used in these semantic interfaces. Below we explain this Constelle definition in detail.
As we stated in the introduction of this chapter, the main idea behind our approach is to define the dynamic semantics of a DSL as a two-step semantic mapping: first, from the DSL constructs to an intermediate semantic domain; second, from the intermediate semantic domain to the target execution platform. The choice of such an intermediate semantic domain is not arbitrary: it is formed by the typical (design) solutions that are used for handling the target execution platform (in other words, by concepts of the horizontal domain of the DSL). For example, a robotic arm is typically controlled via the drivers of its parts. In our semantics definition, we represent such drivers as queues to emphasize that the drivers have buffers for storing actions that should be executed. The third aspect of Robotic Arm Parallel is a distributor, responsible for assigning actions to the drivers.

A Constelle table, such as Table 4.2, represents a mapping from the DSL (vertical) concepts, depicted in the leftmost column, to the intermediate semantic domain (i.e. the DSL horizontal concepts), depicted in the other columns. For the example DSL we distinguish the following DSL concepts: the task statement, two types of action statements, and two types of action executions (by the arm and by the hand). In other words, we separate the concept of an action statement in a DSL program from the concept of the resulting execution of the action by the robotic arm. These (vertical) concepts appear in the semantic interface of Robotic Arm Parallel (leftmost column in Table 4.2) as the operations taskStm, armActionStm, handActionStm, executeArm, and executeHand.

The operation taskStm is the starting point of an execution and is responsible for initializing a task. We define this operation as the request operation of the Request template.
The elements parameter of the request operation (the rightmost column in Table 4.2) corresponds to the task that is requested for execution (parameter task in the leftmost column). According to the Event-B specification of the Request template depicted in Figure 4.4(d), this means that the task is saved in the internal variable request_body, and that a new task can be requested only after the current task is processed (see grd2 in Figure 4.4(d)).

After initializing a task, we process it action by action (or statement by statement) using the process operation of the Request template. Each action is assigned for execution to the hand or to the arm, depending on the type of the action. Thus, the operations armActionStm and handActionStm are composed of the enqueue operation of the corresponding driver and the process operation of the distributor. Moreover, the action that is processed in the distributor aspect is the same action that is queued in a driver aspect. This is depicted by putting the parameters element of enqueue and process in the same row as the parameter action of armActionStm or handActionStm.

An execution of an action corresponds to the dequeuing of this action. Therefore, the operations executeArm and executeHand are defined as the dequeue operations of the aspects driver1 and driver2, respectively.

Finally, we specialize the static parameters of the invoked specification templates. For this, we use the following constructs of the example DSL. The set Actions contains all the predefined actions of the DSL. As actions can be performed either by the arm or by the hand, the set Actions is partitioned by the sets ArmActions and HandActions:

    Actions = ArmActions ∪ HandActions,    ArmActions ∩ HandActions = ∅

The substitution of the static parameters is depicted in the bottom rows of Table 4.2. Namely, the Actions type substitutes the ElementType of the distributor aspect. This means that the request_body of the Request template invoked in this aspect becomes a subset of Actions. Moreover, the task parameter of the taskStm operation is a subset of Actions too:

    task ∈ ℙ(Actions)

The ArmActions type substitutes the ElementType of the driver1 aspect. The HandActions type substitutes the ElementType of the driver2 aspect. This means that only the actions of the corresponding type are stored in the queues: ArmActions are stored in driver1:Queue, and HandActions are stored in driver2:Queue (see guard grd1 in Figure 4.4(c)).

In the resulting semantics of the example DSL, actions of the arm and the hand are executed independently of each other, in the order of their arrival in the corresponding queue. Moreover, according to the way actions are processed, the order of actions within a task does not matter. However, the order of requesting tasks does matter, as a new task cannot be initialized until all the actions of the current task are sent to the queues.⁶

As described earlier, the definition of the dynamic semantics of the example DSL consists of two semantic modules, Robotic Arm Parallel and Robotic Arm Sequential. Robotic Arm Sequential extends Robotic Arm Parallel with the specification template Partial Order. The corresponding structure of the definition of the dynamic semantics (i.e. the DAG of the semantic modules) is depicted in Figure 4.7.
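The behaviour of Robotic Arm Parallel described above can be made concrete with a small executable model. The Python sketch below is an illustration only: the classes stand in for the Queue and Request templates, whose actual Event-B guards (Figure 4.4) are only approximated, and all method names (`can_enqueue`, `task_stm`, and so on) are hypothetical. The guards of both constituent operations are checked before any state changes, to mimic the atomicity of a composed event.

```python
from collections import deque

# DSL constructs (action names taken from the example DSL).
ARM_ACTIONS = frozenset({"turnleft", "turnright", "moveup", "movedown"})
HAND_ACTIONS = frozenset({"grab", "release", "rotateleft", "rotateright"})
ACTIONS = ARM_ACTIONS | HAND_ACTIONS
assert ARM_ACTIONS.isdisjoint(HAND_ACTIONS)  # the two sets partition ACTIONS


class Queue:
    """Stand-in for the Queue template, specialized with an element type."""

    def __init__(self, element_type):
        self.element_type = element_type
        self.items = deque()

    def can_enqueue(self, element):
        return element in self.element_type

    def enqueue(self, element):
        self.items.append(element)

    def dequeue(self):
        if not self.items:
            raise RuntimeError("dequeue is not enabled: queue is empty")
        return self.items.popleft()


class Request:
    """Stand-in for the Request template."""

    def __init__(self, element_type):
        self.element_type = element_type
        self.request_body = set()

    def can_request(self, elements):
        # A new request is accepted only once the current one is fully processed.
        return not self.request_body and set(elements) <= self.element_type

    def request(self, elements):
        self.request_body = set(elements)

    def can_process(self, element):
        return element in self.request_body

    def process(self, element):
        self.request_body.remove(element)


class RoboticArmParallel:
    """Composition corresponding to the columns of Table 4.2."""

    def __init__(self):
        self.driver1 = Queue(ARM_ACTIONS)    # ElementType := ArmActions
        self.driver2 = Queue(HAND_ACTIONS)   # ElementType := HandActions
        self.distributor = Request(ACTIONS)  # ElementType := Actions

    def task_stm(self, task):
        if not self.distributor.can_request(task):
            raise RuntimeError("taskStm is not enabled")
        self.distributor.request(task)

    def _action_stm(self, driver, action):
        # The shared parameter `action` is passed to both constituent operations.
        if not (driver.can_enqueue(action) and self.distributor.can_process(action)):
            raise RuntimeError("action statement is not enabled")
        self.distributor.process(action)
        driver.enqueue(action)

    def arm_action_stm(self, action):
        self._action_stm(self.driver1, action)

    def hand_action_stm(self, action):
        self._action_stm(self.driver2, action)

    def execute_arm(self):
        return self.driver1.dequeue()

    def execute_hand(self):
        return self.driver2.dequeue()


robot = RoboticArmParallel()
robot.task_stm({"moveup", "grab", "turnleft"})
robot.arm_action_stm("moveup")
robot.hand_action_stm("grab")
robot.arm_action_stm("turnleft")
# The hand may execute before the arm; each queue keeps its own arrival order.
executed = [robot.execute_hand(), robot.execute_arm(), robot.execute_arm()]
```

Because a task is a set of actions, this model cannot express that one action must wait for another; that restriction is what the Robotic Arm Sequential module adds.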

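[Figure: Robotic Arm Sequential is composed of Robotic Arm Parallel and the library template Partial Order; Robotic Arm Parallel is composed of the library templates Queue and Request.]

Figure 4.7: DAG of the semantic modules of the example DSL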
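The structure in Figure 4.7 can be checked mechanically. The following sketch, which assumes a plain dictionary encoding of the 'composed of' relation (an assumption of this example, not a Constelle notation), uses Python's standard `graphlib` module to confirm that the definition is acyclic and that its sinks are exactly the library templates.

```python
from graphlib import TopologicalSorter

# Library of specification templates (the graph sinks).
LIBRARY = {"Queue", "Request", "Partial Order"}

# 'Composed of' relation of the example DSL: module -> constituent interfaces.
COMPOSED_OF = {
    "Robotic Arm Sequential": {"Robotic Arm Parallel", "Partial Order"},
    "Robotic Arm Parallel": {"Queue", "Request"},
}


def sinks(graph):
    """Nodes that are not composed of anything else."""
    nodes = set(graph) | {n for parts in graph.values() for n in parts}
    return {n for n in nodes if not graph.get(n)}


# static_order() raises graphlib.CycleError if the relation is cyclic and
# otherwise lists every constituent before the modules composed of it.
build_order = list(TopologicalSorter(COMPOSED_OF).static_order())
```

Such an order is also a plausible order in which to generate the resulting specification: templates first, then the modules that invoke them.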

Table 4.3 defines the semantic module Robotic Arm Sequential. In this module we use the same names of operations and parameters (the leftmost column in Table 4.3) as we used in the semantic module Robotic Arm Parallel. The semantic module Robotic Arm Sequential is composed of the semantic module Robotic Arm Parallel (the second column of the table) and the template Partial Order (the rightmost column). The latter imposes a partial order on the actions forming a task.⁷ This aspect restricts processing of actions to the maximal elements of the order and removes the executed actions from the order. For the sake of brevity, we do not discuss the details of this table here.
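To give an intuition for the Partial Order aspect, the sketch below models its three operations in Python. It is not the Event-B template from Appendix A: the representation of the order as a set of pairs, the reading of a pair (low, high) as "low is below high", and the method names are all assumptions made for this illustration. Under this reading, an action that must run first is placed above the actions that depend on it.

```python
class PartialOrder:
    """Illustrative stand-in for the Partial Order template."""

    def __init__(self):
        self.poset = set()
        self.order = set()  # strict order as pairs (low, high); assumed encoding

    def new_partial_order(self, poset, order):
        poset, order = set(poset), set(order)
        # Only well-formedness of the pairs is checked here; transitivity and
        # antisymmetry are assumed to be guaranteed by the caller.
        if any(low not in poset or high not in poset or low == high
               for low, high in order):
            raise ValueError("order must relate distinct elements of the poset")
        self.poset, self.order = poset, order

    def is_maximal(self, element):
        """Guard used by GetMaximalElement: nothing is above the element."""
        return element in self.poset and all(low != element for low, _ in self.order)

    def remove_element(self, element):
        if element not in self.poset:
            raise KeyError(element)
        self.poset.discard(element)
        self.order = {(low, high) for low, high in self.order
                      if element not in (low, high)}


po = PartialOrder()
po.new_partial_order(
    poset={"movedown", "release", "rotateleft"},
    order={("release", "movedown")},  # release has to wait for movedown
)
maximal_before = sorted(e for e in po.poset if po.is_maximal(e))
po.remove_element("movedown")         # movedown has been executed
maximal_after = sorted(e for e in po.poset if po.is_maximal(e))
```

Initially only movedown and rotateleft can be processed; release becomes maximal once movedown has been removed, which corresponds to mapping executeArm to RemoveElement in Table 4.3.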

4.4.2 Metamodel of the Constelle language

In this section we develop the metamodel that allows for a Constelle definition such as the one described above. The resulting metamodel of the Constelle language is depicted in Figure 4.8: the part depicted on the left duplicates the (sub-)metamodel of a template interface from Figure 4.5; and the part depicted on the right shows concepts related to the definition of DSL semantic modules via composition of semantic interfaces.

In the metamodel depicted in Figure 4.8, a DSL SemanticDefinition consists of a collection of SemanticModules, each of which is an instance of the SemanticInterface class (see the generalization relation). A semantic module is composed of aspects, each of which is a TemplateInvocation that invokes a SemanticInterface. An invoked semantic interface can be another semantic module or it can be implemented as an Event-B specification template. In this way we realize the graph structure of the semantics definition, i.e. the DAG of semantic modules introduced in Section 4.4.1.

To implement the ideas of generic programming and aspect-oriented programming, the Constelle language realizes the following two mechanisms: (1) a mechanism to specialize a specification template and (2) a mechanism to invoke the specialized template as an aspect in the composition with other specialized templates. For both mechanisms we use substitution of the interface elements of constituent semantic interfaces with the interface elements of the composite

⁶ The complete Event-B specification of the semantic module Robotic Arm Parallel can be found in Appendix B.
⁷ The Event-B specification of the Partial Order template can be found in Appendix A.

Table 4.3: Semantic module Robotic Arm Sequential

Robotic Arm Sequential | core : Robotic Arm Parallel | sequential : Partial Order
taskStm                | taskStm                     | NewPartialOrder
  • task               |   • task                    |   • poset
  • order              |                             |   • order
armActionStm           | armActionStm                | GetMaximalElement
  • action             |   • action                  |   • maximal
handActionStm          | handActionStm               | GetMaximalElement
  • action             |   • action                  |   • maximal
executeArm             | executeArm                  | RemoveElement
  • action             |   • action                  |   • element
executeHand            | executeHand                 | RemoveElement
  • action             |   • action                  |   • element
Actions                | Actions                     | PosetElement
ArmActions             | ArmActions                  |
HandActions            | HandActions                 |

[Figure: class diagram. Template interface part (left): TemplateInterface, StructuralInterface, SemanticInterface, StaticParameter, Constant, Type, Relation, Operation, DynamicParameter, InterfaceElement. Semantic module part (right): SemanticDefinition, SemanticModule, TemplateInvocation, InterfaceElementSubstitution. Association ends shown: semanticModules [1..*], uses [1..1], _interface [0..*], owner [1..1], signature [0..*], aspects [0..*], invokes [1..1], elementsSubstitution [0..*], formal [1..1], actual [1..1].]

Figure 4.8: Metamodel of the Constelle language

semantic interface. This means that in a Constelle table, an interface element from the leftmost column substitutes all interface elements situated in the same row in other columns.

When substituting an interface element of a constituent semantic interface with an interface element of the composite semantic interface, we specialize the template that implements the constituent semantic interface with the elements of the DSL definition (i.e. of the semantic module being composed). For example, in Table 4.2 the parameter task of the operation taskStm substitutes the parameter elements of the constituent operation request, i.e. specializes the code of the Request specification template.

When substituting interface elements of several constituent semantic interfaces with the same interface element of the composite semantic interface, we link the constituent semantic interfaces via this common (shared) interface element. In other words, the semantic interfaces are composed through substitution of the linking elements with one shared element that fixes this link. The concept of sharing an interface element is illustrated in our LEGO allegory (Figure 4.2(c)) in the following way: the differential pattern is fitted into the whole mechanism by sharing its axles with other parts of the system. In Table 4.2, the parameter action of the operation armActionStm substitutes the parameters element of the constituent operations enqueue and process, i.e. links the interfaces of the templates Queue and Request. In the DSL definition this means that the action that is processed in the Request template is the same action that is queued in the Queue template.
In the metamodel depicted in Figure 4.8, a TemplateInvocation consists of InterfaceElementSubstitutions, each of which substitutes a formal interface element of the invoked (constituent) semantic interface with an actual interface element of the semantic module (i.e. the composite semantic interface). This applies to all interface elements introduced earlier: static parameters (that are used in the semantic interfaces), operations, and dynamic parameters. By substituting static parameters we specialize the templates with the DSL constructs and synchronize them with respect to the data types. By substituting operations we invoke these templates and weave them together in an aspect-oriented style (as discussed in Section 4.2.2). By substituting dynamic parameters we realize the transfer of data between the templates. As mentioned earlier, all these substitutions raise certain proof obligations in the resulting formal specification. In Chapter 6 we discuss how such proof obligations can be identified and discharged.
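The role of InterfaceElementSubstitution can be sketched in a few lines of Python. The classes below mirror the names in Figure 4.8, but the encoding is an assumption of this example: interface elements are plain strings, parameters are qualified with their operation using a dot (a hypothetical convention), and the helper `links` is not part of Constelle. The data encodes the armActionStm row of Table 4.2.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InterfaceElementSubstitution:
    formal: str  # element of the invoked (constituent) semantic interface
    actual: str  # element of the semantic module (composite interface)


@dataclass(frozen=True)
class TemplateInvocation:
    aspect: str
    invokes: str
    elements_substitution: Tuple[InterfaceElementSubstitution, ...]


def links(invocations):
    """Actual elements shared by several aspects, with the formals they replace."""
    by_actual = defaultdict(list)
    for inv in invocations:
        for sub in inv.elements_substitution:
            by_actual[sub.actual].append((inv.aspect, sub.formal))
    return {actual: uses for actual, uses in by_actual.items()
            if len({aspect for aspect, _ in uses}) > 1}


S = InterfaceElementSubstitution
driver1 = TemplateInvocation("driver1", "Queue", (
    S("enqueue", "armActionStm"),                 # operation: invocation
    S("enqueue.element", "armActionStm.action"),  # dynamic parameter: data transfer
    S("ElementType", "ArmActions"),               # static parameter: specialization
))
distributor = TemplateInvocation("distributor", "Request", (
    S("process", "armActionStm"),
    S("process.element", "armActionStm.action"),
    S("ElementType", "Actions"),
))

shared = links([driver1, distributor])
```

Here `shared` contains the operation armActionStm and its parameter action, which are exactly the elements that weave the two aspects together; the static parameters ArmActions and Actions each specialize a single aspect and therefore do not appear.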

4.5 Constraint Templates

In the previous section, we discussed the composition (weaving) mechanism of Constelle, which uses substitution of interface elements to link elements of the semantic interfaces (aspects) being composed. In particular, such linking of dynamic parameters realizes the transfer of data between the templates. For example, in the operation armActionStm in Table 4.2, an action is transferred (requested) from the distributor aspect to the driver1 aspect. However, such a mechanism is very restrictive: it can link only dynamic parameters of the same type (otherwise the substitution leads to type errors, which can be detected in the resulting Event-B specification using the Rodin tools). To allow for linking dynamic parameters that do not share the same type, we introduce the construct of gluing guards into Constelle and implement it using constraint templates.

4.5.1 Gluing Guards

To demonstrate the purpose of gluing guards, we use Constelle to define a subset of the dynamic semantics of the LACE DSL. The dynamic semantics of LACE was specified using Event-B during our case study described in Chapter 2. Here we reverse-engineer the Event-B specification of the LACE Core SF (core semantic feature) given in Section 2.5.2 and define it in a Constelle table. In Section 2.5.2 the dynamic semantics of the LACE Core SF is defined by

Table 4.4: Semantic module for LACE Core SF

LACE Core SF   | LAC: Request | curr_la: Query | SS1: Queue  | Gluing Guards
request_la     | request      | assign         |             | curr_job = dom(LALabelDef(la))
  • la         |              |   • value      |             |
  • curr_job   |   • elements |                |             |
request_ssa    | process      | query          | enqueue     | occurrence ↦ ssaction ∈ LALabelDef(curr_la)
  • occurrence |   • element  |                |             |
  • ssaction   |              |                |   • element |
  • curr_la    |              |   • value      |             |
execute_ssa    |              |                | dequeue     |
  • ssaction   |              |                |   • element |
SS1            |              |                | ElementType |
Occurrence     | ElementType  |                |             |
LogicalAction  |              | VariableType   |             |

two conceptual machines, LAC and SS, and the description of how these machines interact with each other. However, in our explanation of gluing guards we do not consider the LAC and SS components separately (in separate semantic modules), but rather show how they contribute to the dynamic semantics of the LACE Core SF in one semantic module.

Table 4.4 shows the semantic module for LACE Core SF. We distinguish three aspects that constitute this semantic module: Request, Queue, and Query. The Request and Queue specification templates were introduced in Section 4.3.1. The Query specification template allows for the introduction of a variable in a Constelle definition. For this, Query wraps an Event-B variable and provides two operations (i.e. methods): one for assigning a new value to the variable (assign(value)) and one for querying the current value of the variable (query(value)). The corresponding Event-B specification is rather trivial and can be found in Appendix A.

The semantic module LACE Core SF invokes the listed specification templates to model the dynamic semantics of LACE according to its specification given in Section 2.5.2. SS1: Queue represents the buffered execution of subsystem actions by a subsystem. LAC: Request models the processing of a requested logical action by the LAC component. The aspect curr_la: Query is used to save the value of the currently requested logical action.

In Table 4.4, the operation request_la is used for requesting a new logical action. This operation is composed of the request operation of the aspect LAC: Request and the assign operation of the aspect curr_la: Query. The request operation starts processing the elements of the logical action, stored in the parameter curr_job. The assign operation updates the value of curr_la with the requested logical action la.
Here the aspects LAC: Request and curr_la: Query should be connected (linked) in the following way. The curr_job that is processed in the LAC: Request aspect consists of the subsystem action occurrences that belong to the logical action la that is stored in the curr_la: Query aspect. This fact is captured in the predicate appearing in the rightmost column of Table 4.4: curr_job = dom(LALabelDef(la)). We use this gluing guard to connect the dynamic parameters of the request_la operation, as these parameters have different types: curr_job is a set of subsystem action occurrences (see the static type Occurrence that substitutes ElementType for the aspect LAC: Request in Table 4.4); and la is a logical action (LogicalAction substitutes VariableType for the aspect curr_la: Query).

To connect these dynamic parameters, we use the LALabelDef relation specified in the structure (metamodel) of LACE. According to the specification of the LACE metamodel given in Section 2.5.1, LALabelDef associates each logical action with a subset of occurrences of subsystem actions:

    LALabelDef ∈ LogicalAction → (Occurrence ⇸ SSAction)

The same mechanism of gluing guards is employed for the request_ssa operation in Table 4.4. This operation processes (process) one by one the subsystem action occurrences (occurrence) of the current logical action (curr_la) and requests execution of the corresponding subsystem action (ssaction) from the subsystem (enqueue). Thus, to connect the dynamic parameters of this operation we use the following gluing guard for the request_ssa operation: occurrence ↦ ssaction ∈ LALabelDef(curr_la) (the rightmost column of Table 4.4).

4.5.2 Metamodel of a Constraint Template

To introduce gluing guards in the Constelle language, we follow the same methodology as we applied to the definition of the dynamic semantics of a DSL. Namely, we avoid constructing a new formalism for capturing gluing guards; instead, we build on top of the back-end formalism (Event-B) and introduce the notion of constraint templates. A constraint template is a predicate parameterized in such a way that it can be instantiated using concrete data structures and variables and, thus, (re)used for a concrete domain (in a Constelle definition).

Following the same style that we used to design specification templates (in Section 4.3) and their invocation in Constelle (Section 4.4), we distinguish a ConstraintInterface, a ConstraintInvocation, and a ConstraintTemplate. ConstraintInterface represents an abstract constraint and defines its signature (i.e. a set of template parameters). ConstraintInvocation invokes a constraint interface by specifying (substituting) its template parameters. ConstraintTemplate assigns an implementation (i.e. an Event-B predicate) to a constraint interface. For example, the gluing guard of the request_ssa operation of LACE Core SF (shown in Table 4.4) is implemented by the following constructs:

• constraint template: x ↦ y ∈ f(z);
• constraint interface: (x, y, f, z);
• constraint invocation: occurrence ↦ ssaction ∈ LALabelDef(curr_la), where occurrence substitutes x, ssaction substitutes y, LALabelDef substitutes f, and curr_la substitutes z.

Note that here the constraint invocation uses two types of (template) parameters (supported in Constelle): dynamic parameters (such as occurrence, ssaction and curr_la) and static parameters (such as LALabelDef). Correspondingly, a constraint interface should declare x, y, and z as dynamic parameters and f as a static parameter.

We capture this observation in the metamodel of a constraint interface, constraint invocation, and constraint template, shown in two separate parts in Figures 4.9 and 4.10. Figure 4.9 depicts the metamodel of the Constelle language extended with the ConstraintInvocation class. Figure 4.10 depicts the metamodel of a constraint template put into the context of a specification template (a structural template, in particular). These two metamodels are related via the common ConstraintInterface part (on the left in both figures). Below we discuss the details.

A ConstraintInterface defines a signature of a constraint template by specifying a set of its (template) parameters. As we noted above, these template parameters can include both static and dynamic parameters. To declare those, the ConstraintInterface class uses the corresponding associations: staticInterface and dynamicInterface (see Figure 4.9). Note that in order to facilitate

[Figure: class diagram extending Figure 4.8 with the classes ConstraintInterface (association ends staticInterface [0..*] and dynamicInterface [0..*]) and ConstraintInvocation (association ends gluingGuards [0..*], invokes [1..1], restricts [1..1], elementsSubstitution [0..*]).]

Figure 4.9: Metamodel of the Constelle language with gluing guards

[Figure: class diagram relating the template interface (TemplateInterface, StructuralInterface, ConstraintInterface, StaticParameter, Constant, Type, Relation, DynamicParameter, InterfaceElement) to its Event-B implementation (StructuralTemplate, ConstraintTemplate, SpecificationTemplate, SpecificationElement, PublicElement, PrivateElement, Context, EventBNamedCommentedDerivedPredicateElement, EventBNamedCommentedElement). Association ends shown: eventbcontext [1..1], implements [1..1], uses [1..1], _interface [0..*], eventbpredicate [1..1], staticInterface [0..*], dynamicInterface [0..*], elements [0..*], eventbElement [1..1].]

Figure 4.10: Metamodel of a constraint template implemented in Event-B

reuse of structural interfaces, we model the staticInterface as a non-containment relation, implying that a ConstraintInterface references StaticParameters from an existing StructuralInterface.

As introduced earlier in Section 4.5.1, a gluing guard connects (links) dynamic parameters of an operation in a Constelle semantic module. Correspondingly, in the metamodel in Figure 4.9 a gluing guard is modeled by a ConstraintInvocation, which belongs to a SemanticModule. Each ConstraintInvocation is associated with (restricts) an Operation of this SemanticModule. A ConstraintInvocation invokes a ConstraintInterface by substituting its dynamic parameters with the dynamic parameters of the restricted operation and its static parameters with the DSL constructs (declared in the structural interface of the semantic module). The corresponding substitutions are stored in the elementsSubstitution list.

Finally, in the metamodel in Figure 4.10 a ConstraintTemplate implements a ConstraintInterface by connecting it to the corresponding EventBNamedCommentedDerivedPredicateElement (introduced in the Event-B metamodel in Figure 4.3). The dynamic and static parameters of the constraint interface are associated with the corresponding EventBNamedCommentedElements through the (public) SpecificationElements of the constraint template.
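The interplay of constraint interface, template, and invocation can be illustrated with the gluing guards of Table 4.4. The Python sketch below is an approximation under stated assumptions: a Python function stands in for the Event-B predicate, the invocation refers to the template directly instead of going through the interface, LALabelDef is encoded as a dictionary from logical actions to sets of (occurrence, subsystem action) pairs, and all sample values (`la1`, `occ1`, `ssa_a`, ...) are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ConstraintInterface:
    dynamic_interface: Tuple[str, ...]
    static_interface: Tuple[str, ...]


@dataclass(frozen=True)
class ConstraintTemplate:
    implements: ConstraintInterface
    predicate: Callable[..., bool]  # stands in for the Event-B predicate


@dataclass(frozen=True)
class ConstraintInvocation:
    invokes: ConstraintTemplate
    restricts: str                                      # restricted operation
    elements_substitution: Tuple[Tuple[str, str], ...]  # (formal, actual) pairs

    def holds(self, env):
        iface = self.invokes.implements
        formals = set(iface.dynamic_interface) | set(iface.static_interface)
        actual_of = dict(self.elements_substitution)
        if set(actual_of) != formals:
            raise ValueError("each template parameter must be substituted")
        args = {formal: env[actual] for formal, actual in actual_of.items()}
        return self.invokes.predicate(**args)


# Constraint template  x ↦ y ∈ f(z)  with interface (x, y, f, z).
maplet_in_image = ConstraintTemplate(
    ConstraintInterface(dynamic_interface=("x", "y", "z"), static_interface=("f",)),
    predicate=lambda x, y, f, z: (x, y) in f[z],
)
# Constraint template  x = dom(f(z)),  matching the guard of request_la.
equals_domain = ConstraintTemplate(
    ConstraintInterface(dynamic_interface=("x", "z"), static_interface=("f",)),
    predicate=lambda x, f, z: x == {a for a, _ in f[z]},
)

# Hypothetical LACE data: one logical action with two occurrences.
la_label_def = {"la1": {("occ1", "ssa_a"), ("occ2", "ssa_b")}}

guard_request_ssa = ConstraintInvocation(
    maplet_in_image, "request_ssa",
    (("x", "occurrence"), ("y", "ssaction"), ("f", "LALabelDef"), ("z", "curr_la")),
)
guard_request_la = ConstraintInvocation(
    equals_domain, "request_la",
    (("x", "curr_job"), ("f", "LALabelDef"), ("z", "la")),
)

env = {"LALabelDef": la_label_def, "curr_la": "la1", "la": "la1",
       "occurrence": "occ1", "ssaction": "ssa_a", "curr_job": {"occ1", "occ2"}}
ssa_enabled = guard_request_ssa.holds(env)
la_enabled = guard_request_la.holds(env)
```

Both guards hold for this environment; changing ssaction to a subsystem action that is not paired with occ1 disables request_ssa, which is how the guard ties the differently typed parameters together.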

4.6 Related Work

The concepts of templates and/or patterns have been applied to various components of DSLs in order to facilitate reuse of their design. The existing work includes studies on metamodel templates (such as [11] and [84]); composition and reuse of concrete syntax (both for textual [69] and graphical notations [77]); and reuse of definitions of the dynamic semantics. For specifying reusable fragments (building blocks) of dynamic semantics and/or weaving/composing them together, some of the studies use informal (or semi-formal) notations: transformation languages (such as Epsilon Object Language, EOL, in [69]), UML activity diagrams (in [85]), and UML state and sequence diagrams (in [50]). In [18] Cleenewerck and Kurtev investigate the problem of scattering and tangling of DSL concerns (modules) in a translational semantics. They discover that this problem is related to the two major challenges of model-to-model transformations, global-to-local and local-to-global transformations, and conclude that the semantics of a DSL should be defined using a specially designed meta-language, rather than using existing general purpose transformation languages.

There exist a number of formal notations that allow for modular definition of the dynamic semantics of general purpose programming languages (GPLs) using (existing or to-be-established) libraries of reusable modules. For example, TinkerType [61], Modular SOS (MSOS) [71], DynSem [106], and the K framework [82] achieve an AOP-like modularity for term reduction (rewriting or inference) rules. The latter two formal notations allow for automatic generation of an AST (abstract syntax tree) interpreter, which can be used as a reference implementation of the programming language or for formal analysis of the specified dynamic semantics. In [19] Combemale et al. advocate a thorough method of defining the dynamic semantics of a DSL, where two types of the dynamic semantics should be explicitly specified: a reference semantics (well defined requirements) and a translational semantics (for enabling various use cases through the tools of a target semantic domain). Moreover, these two types of the dynamic semantics should describe the same behaviors, which, for example, can be ensured through a bisimulation proof.

In our approach we aim for a precise and executable definition of the dynamic semantics of a DSL and employ the translational technique. As a target semantic domain we use a formalism that has a solid theory and extensive tool support, but is not specifically designed for defining the dynamic semantics of GPLs or DSLs. To manage the resulting complexity of a semantic mapping, in this chapter we propose and design Constelle, a meta-language that allows for defining the dynamic semantics of a DSL using specification templates and uses the principles of AOP to express how invoked specification templates interact with each other. In Section 4.6.1 we discuss in detail the existing work that uses various formal methods for defining reusable (intermediate) building blocks for composing the dynamic semantics of DSLs. In relation to the concept of specification templates, in Section 4.6.2 we look into existing techniques for reusing formal specifications of software systems.

Note that here we do not consider reuse of a DSL and its formal analysis via embedding this DSL into another DSL, as is done by Ratiu et al. in [81]. In their work, the reuse of the semantic mapping of a DSL to a verification formalism is achieved through the clear separation of the DSL concepts from its environment, rather than through the composition of the embedded DSL (sub-language) with the hosting DSL (which realizes its environment).

4.6.1 Reusable building blocks for specifying dynamic semantics of DSLs

In [22] Dagand et al. propose Filet-o-Fish (FoF) as a semantic language for composing a DSL out of semantically-rich building blocks. Technically, FoF is a ‘safe abstraction of C embedded in a functional language’ (in their case, Haskell). For this, Haskell functions wrap various string concatenations that can generate fragments of C code. As a result, FoF abstracts from the details of the C syntax and provides building blocks for specifying the dynamic semantics of a DSL. Such building blocks are invoked and composed together in the form of (higher-order) functions using the standard combinators of Haskell (such as folding for traversing an abstract syntax tree, AST). Thus, in FoF the dynamic semantics of a DSL is defined as a Haskell program using available code generators. On the one hand, the FoF-to-C compiler generates the corresponding C code from such a definition. On the other hand, various techniques and tools for Haskell allow for validation (for example, random testing) and verification (i.e. proofs of correctness) of the definition of the DSL dynamic semantics at the level of this definition. Unfortunately, the authors do not discuss the nature of their building blocks. Therefore, it is not clear whether we can use FoF to introduce and to invoke the building blocks proposed in our approach: solutions commonly used in the implementation of DSLs.

In [16] Chen et al. propose semantic units as an intermediate common language for defining dynamic semantics of DS(M)Ls. Semantic units capture the formal operational semantics for a set of basic models of computation. These can be either basic behavioral categories, such as Finite State Machine (FSM), Timed Automaton (TA), and Hybrid Automaton (HA); or basic component interaction categories, such as Synchronous Data Flow (SDF), Communicating Sequential Process (CSP) and Process Networks (PN).
The semantic units are specified using the Abstract State Machines (ASM) formalism. The dynamic semantics of a DSL is defined as a model transformation between the metamodel (abstract syntax) of the DSL and the metamodel that captures the syntax of the ASM Abstract of a selected semantic unit. The authors call such a technique semantic anchoring. Compared to our semantic templates, semantic units are general purpose computation models, rather than specific software solutions forming the horizontal domain of a DSL or a family of DSLs.

In [15] Chen et al. develop a method for the composition of semantic units. In the same way as in our approach, the dynamic semantics of a DSL is built hierarchically as a composition of primary semantic units and newly derived semantic units (composed of the primary ones). To specify such a composition the authors use the composition mechanisms of the ASM formalism, such as invocations of primary ASM specifications and adding new (ad-hoc) constraints. As a result, the interaction of constituent semantic units and the mapping between their data structures are tangled over the ASM code. In our approach we overcome this issue by using the table notation for composing specification templates when defining the dynamic semantics of a DSL.

Mannadiar and Vangheluwe in their position paper [64] elaborate on the work by Chen et al. and describe the idea of a semantic template: a combination of a metamodel template (i.e. a parametrized metamodel fragment) with its semantic anchoring (i.e. its model transformation to the ASM formalism). The authors propose to define a DSL as a combination of such semantic templates, thus automatically constructing the DSL metamodel and the dynamic semantics specification. However, there is no follow-up work and/or proof of concept for the proposed approach.

In [90] Simko extends the approach of semantic anchoring to denotational specification of the dynamic semantics of CPS (Cyber-Physical Systems) modeling languages. The author identifies the following semantic units, typical for the CPS domain: differential algebraic equations, difference equations, and finite state machines. Compared to the FORMULA formalism used in that work, Event-B does not allow for expressing differential algebraic and difference equations. Our approach is based on providing an operational specification of the dynamic semantics of DSLs. In particular, we specify a DSL dynamic semantics as a solution rather than requirements (as discussed in Section 4.1.1).

In [76, 75] Pedro et al. propose a compositional and incremental approach for prototyping DS(M)Ls, where they focus on reuse of metamodel fragments (which they call domain concepts) together with the model transformations that capture the dynamic semantics of these fragments. A domain concept is a brick that represents a basic idea that can appear in one or several DS(M)Ls. A domain concept is defined as a metamodel, a set of the metamodel elements that can be parameterized (i.e. replaced by effective parameters), and a model transformation to a formal executable language (in their case, to Concurrent Object Oriented Petri Nets, CO-OPN) that captures the dynamic semantics of this domain concept. The definition of a DS(M)L consists of the metamodel composition and the transformation composition. The metamodel composition is an iterative replacement of formal parameters of some domain concept with elements (i.e. classes, attributes, etc.) of another metamodel. In this way, various domain concepts can be composed with each other or with specific constructs of the DS(M)L being defined. The corresponding replacement of parameters (i.e. instantiation) takes place in the model transformation of the original domain concept.
Moreover, the instantiated transformation is extended with additional transformation rules that capture a more precise and more specific semantics of the DS(M)L. As a result, the transformation language (in their case, ATL) serves as the main formalism for specifying the dynamic semantics of a DSL. Moreover, unlike Constelle, this approach does not provide any formal theory behind the instantiation and composition of dynamic semantics of constituent building blocks (in their terms, domain concepts). We discuss this aspect in detail in Chapter 6, where we present the semantics of the Constelle language.

In [23] Degueule proposes a meta-language for modular and reusable development of DSLs, Melange. The motivation for Melange is drawn from an idea similar to our idea of reusing common solutions appearing in the implementation of DSLs: ‘many DSLs, despite being targeted at specialized domains, share common paradigms and constructs’. Melange is built on top of the K3 language, a meta-language for operational semantics definition. In K3 an operational semantics is defined as a set of aspects, each of which adds execution information (a current state and executable methods) to one of the constructs of the DSL abstract syntax. Melange allows for composing such definitions of the operational semantics by ensuring structural interoperability of the DSLs being composed. For example, if two DSLs have different metamodels, Melange allows for merging their sets of K3 aspects by introducing structural interfaces that wrap and abstract away these different metamodels. Such structural interfaces of Melange are of similar purpose and nature as structural templates of Constelle. Unlike Melange, Constelle allows for direct composition (weaving) of operations of aspects (semantic modules). Moreover, the table notation of Constelle captures the many-to-many relation between DSL constructs and aspects of the dynamic semantics in an explicit way.

4.6.2 Composition and reuse of formal specifications

The idea of applying the principles of AOP to the formal specification of software systems appeared shortly after the introduction of AOP. In [48] Kellomaki and Mikkonen not only propose the gradual introduction of aspects of collective behavior in a specification of a reactive distributed system, but also describe how such aspects can be stored as generic templates, allowing for reuse of both design and verification effort.

Kellomaki and Mikkonen use the DisCo specification language and in their later studies introduce the Oscid specification language [47], an experimental variant of DisCo. An aspect is defined as a superposition step, which refines an existing specification by introducing new state variables, invariants, and actions. Compared to Event-B, the superposition mechanism resembles shared-event composition, rather than the Event-B refinement. In particular, superposition preserves safety properties by construction. To be able to archive and reuse such a superposition step, the authors turn it into a template by introducing template parameters and specifying what behavior these parameters should realize. In Constelle only static parameters (of a structural interface) can be used for instantiating a template. In contrast, Kellomaki and Mikkonen include actions into their template parameters and specify these actions. As a result, an instantiated template (an aspect) can be imposed (applied) only if the original specification realizes certain behavior.

Using templates of superposition steps, one can design a distributed system by adding new aspects to it one by one, forming a specification branch. In [46] Kellomaki extends this approach with the possibility to merge (compose) specification branches together. In contrast to the table notation of Constelle, both DisCo and Oscid use the ’...’ symbol as a weaver notation: to indicate where the old code appears in the new specification.
In [36] the Oscid specification language is applied for specifying and instantiating two (OOP) design patterns: Observer and Memento. Unfortunately, there is no follow-up work.

In [38] Hoang et al. propose a concept of the Event-B pattern, which is similar to the superposition step of Kellomaki and Mikkonen. An Event-B pattern is a (generic and/or reusable) refinement step that introduces new details to the abstract machine of the pattern. An application of the pattern (i.e. of the instantiated refinement step) requires syntactic matching of the abstract machine of the pattern with the Event-B machine under construction. The authors identify the following patterns of communication protocols: Single Message Communication, Request/Confirm/Reject, and Asynchronous Multiple Message Communication (with or without Repetition). Compared to the approach of Kellomaki and Mikkonen (and to Constelle), an Event-B pattern does not have an explicit description of template parameters, as Hoang et al. use purely the Event-B notation. The syntactic matching of the pattern is semiautomated. This makes it hard to reuse patterns in other formalisms, and to capture the design of a specification (i.e. of a system under specification) in terms of pattern applications (as is done in Constelle tables).

The ancestor of the Event-B formalism, the B method, is used in [13] to specify (OOP) design patterns and to realize different reuse mechanisms for them. In particular, instantiation of a design pattern is implemented in B by the inclusion of the machine specifying the design pattern and by redefining (in essence, renaming) its variables. Composition of multiple design patterns is achieved through the invocation of the operations of different patterns in a new (composite) operation and/or linking or merging the variables of the patterns. Extension of a design pattern is realized using the B refinement mechanism.
This study shows that, in principle, the design-pattern-based approach can be realized in formal methods through proper code conventions, in the same way as is done in software development for general purpose programming languages. Practice shows that this approach requires discipline and a good understanding of the chosen formalism.

4.7 Conclusions

In this chapter, we developed and demonstrated a new method for defining the dynamic semantics of DSLs. The key point of our method is an intermediate semantic domain that splits the semantic mapping from a DSL to an execution platform (or a specification formalism) into two steps. As the intermediate semantic domain we use software design solutions that are typically used in the DSL implementation, i.e. concepts that form the horizontal domain of the DSL. Thus, we define the dynamic semantics of a DSL as a mapping from the language constructs (forming the vertical domain of the DSL) to the horizontal concepts. In this way, we do not propose (yet another) intermediate language, universal for defining dynamic semantics of all possible DSLs; rather, we propose an intermediate step in the definition of the dynamic semantics of a DSL and support this step with the corresponding expressive means.

To capture the mapping from the DSL constructs to the intermediate semantic domain, we use the notation of a table: the DSL vertical domain is represented in the table rows, the DSL horizontal domain is represented in the table columns, and the mapping is represented in their intersections. The second step of the semantic mapping, from the intermediate semantic domain to the specification formalism, is realized through specification templates.

We implemented this method in the form of the Constelle language and employed the Event-B formalism as a carrier for specifying behavior. Constelle applies ideas of generic programming and aspect oriented programming to the world of formal methods and provides a front-end that wraps the Event-B formalism. In this way, Constelle fulfills the following requirements formulated in Chapter 2: Req-3, Req-4, Req-5, and Req-6 (as described in Section 4.2).

As a next step, we define the semantics of Constelle in the form of a semantic mapping of Constelle to Event-B.
The semantic mapping from Constelle to Event-B assigns a corresponding Event-B specification to a definition of the dynamic semantics of a DSL. In order to follow our general requirement that both a semantic mapping and a semantic domain should be precise and executable (formulated in Section 2.1), we design the Constelle-to-Event-B semantic mapping using Event-B techniques that are theoretically solid: generic instantiation, shared-event composition, and refinement. To make the Constelle-to-Event-B semantic mapping executable, we implement it as a (QVTo) model-to-model transformation.

In order to design and describe the Constelle-to-Event-B mapping (model transformation), we introduce a method for describing model transformations and formulate a number of design principles for developing model transformations. These are demonstrated in the next chapter (Chapter 5). The results of applying these techniques to the Constelle-to-Event-B mapping can be found in Chapter 6. The implementation, pragmatics, and evaluation of the proposed method are addressed in Chapters 7 and 8. The questions for future work are discussed in Chapter 9.

Chapter 5

Designing and Describing Model Transformations

The basic principle of mathematics: reduce an expression by two, increase the explanation by eight.

The students humor

The main contribution of our work is the Constelle language (described in Chapter 4). We implement Constelle as a model transformation to the back-end formalism (Event-B). The role of this model transformation is twofold. On the one hand, it allows for automatic generation of Event-B specifications based on a definition of the dynamic semantics of a DSL in Constelle (according to the requirement Req-7 formulated in Chapter 2). On the other hand, this model transformation formulates and captures the semantics of Constelle itself. Therefore, we aim to design and describe the Constelle-to-Event-B model transformation in a clear, maintainable, and intelligible way (and thus apply the requirement Req-2 to Constelle itself).

A problem that we face when describing the semantics of Constelle is the lack of common notations for describing (documenting) model transformations and the lack of common pragmatics explaining how to use (design) model transformations. To overcome this challenge, in this chapter we propose to use the mathematical notation of set theory and functions to give an informal description of a model transformation. Moreover, we use this notation to formulate two design principles of developing QVTo transformations: structural decomposition and chaining model transformations. In the next chapter we apply this methodology to design and describe the Constelle-to-Event-B semantic mapping. The work presented in this chapter is published in [99] and [100].

5.1 Introduction and Motivation

Model transformations are the key technology of Model Driven Engineering (MDE). Model transformations make models meaningful and exploitable and, thus, allow for (software) development using models as first-class artifacts. Employing model transformations as the key development technology poses challenges typical for software development, such as design, documentation, and maintenance. To address these challenges one needs to be able to describe model transformations in an unambiguous and clear way.

Nowadays, there exist a number of languages specifically devoted to implementing model transformations, such as ATL (Atlas Transformation Language)¹, the QVT (Query/View/Transformation)² family of languages, ETL (Epsilon Transformation Language)³, etc. These languages can be viewed as DSLs for model transformations. Consequently, a notation for designing and documenting programs written in such languages should be specifically tailored to describing model transformations, more or less disqualifying general purpose notations such as UML. In current practice there exists no specific notation for describing model transformations (except those provided by the model transformation languages themselves). The common approach for documenting and/or explaining model transformations is to use concrete examples of their inputs and the corresponding outputs. Clearly, this approach provides an incomplete picture of the transformation and fails to provide an overview of the transformation design, which is essential for describing complex model transformations bridging big semantic gaps and/or having complex organizational structures.

In our work we use QVT-Operational (QVTo), which was introduced as part of the MOF (Meta Object Facility) standard [74]. QVTo allows for imperative implementations of model transformations using various tool support, such as plug-ins of the Eclipse Modeling Project.
The language comprises both high-level concepts specific for model transformation development, such as constructs of the Object Constraint Language (OCL), traceability, and inheritance of subroutines; and low-level concepts of imperative languages, such as loops, conditional statements, and explicit invocation of subroutines. Consequently, using QVTo requires strong expertise in programming language technologies. However, when interviewing QVTo practitioners we discovered that they are mostly engineers who are experts in their own domains but typically have little to no computer science training and lack the required language expertise. Moreover, gaining the required expertise in QVTo is hampered by the fact that the language is poorly documented and especially lacks documentation on its pragmatics.

In this chapter, we propose to adopt functions and set theory as a notation-independent and informal approach for documenting, explaining, and designing QVTo model transformations. Typically such mathematical concepts are familiar to most engineers. In what follows, we demonstrate how this notation can be aligned with the QVTo concepts, and thus can be used for explaining QVTo concepts and for documenting model transformations (Section 5.2). Moreover, using this notation we formulate two common design principles of developing model transformations and show how the corresponding organization structure and information flow can be described (Section 5.3). In Section 5.4 we discuss and assess the proposed approach based on interviews with QVTo practitioners. Related work is discussed in Section 5.5. Conclusions and directions for future work are given in Section 5.6.

5.2 Notation for Describing Model Transformations

A model transformation takes as an input one or more source models and generates as an output one or more target models. An algorithm for such generation is defined in terms of source and target metamodels. In particular, the smallest units of a model transformation (such as mappings in QVTo, or rules in ATL and ETL) are defined in terms of classes and associations specified in the source and target metamodels. In order to provide a description of a model transformation using the mathematical notation of set theory, we need to specify how we refer to the metamodel concepts in this notation.

¹ http://www.eclipse.org/atl/
² http://www.omg.org/spec/QVT/
³ http://www.eclipse.org/epsilon/doc/etl/

5.2.1 Describing Metamodel Concepts

In the context of MDE, a metamodel is usually defined using the MOF standard. This standard can be seen as a subset of UML class diagrams. There have been a number of studies on formalizing UML class diagrams, see e.g. [91]. A common approach is to view a class as a set of objects that instantiate this class, and the associations between classes as relations on the corresponding sets of objects. Following this approach, we represent classes as sets and objects of classes as elements of sets. For instance, the example metamodel depicted in Figure 5.1 introduces the sets StateMachine, State, Transition, CompositeState, etc. In model transformations, associations are used as references to access and assign the associated objects, rather than elements of relations on the sets of objects. Therefore, we do not use the notation of relations, but stick to the notation common in object-oriented languages (including QVTo), and refer to the associated classes using association names and dot notation; for instance s.states where $s \in \mathit{StateMachine}$.

[Figure: class diagram with the classes StateMachine, CompositeState, State, Transition, Behavior, Constraint, and Trigger; the attributes name : EString, /isSequential : EBoolean = true, and /isOrthogonal : EBoolean = false; and the association ends submachine [1..*], initial [1..1], final [0..*], states [1..*], transitions [0..*], out [0..*], in [0..*], source [1..1], target [1..1], effect [0..*], guard [0..1], and trigger [0..*].]

Figure 5.1: Metamodel of the UML state machine

Associations between classes are the key metamodel elements that determine the structure of the resulting models. From the point of view of a model structure, we may treat an object as a tuple of its referenced objects and its attributes. This permits a different view on classes: a class can be seen as a relation on the classes it is associated with. For example, the class StateMachine is determined by a relation on the classes with which it is associated through the associations states, transitions, initial, and final:

\[
\mathit{StateMachine} \subseteq \mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \times \mathit{State} \times \mathcal{P}(\mathit{State})
\tag{5.1}
\]

Note that here we use the powerset $\mathcal{P}(S)$ of a set $S$ to indicate that the multiplicity of an association end has an upper bound greater than one. We will use the inclusion relationship (5.1) as a rewriting rule when constructing transformations in Section 5.3.1.
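
The set-based reading of Formula (5.1) can be illustrated with a minimal Python sketch. Python is used here only for illustration and is not part of the approach. The class and field names follow Figure 5.1, while the helper `as_tuple` and the choice of `frozenset` as a stand-in for the powerset are assumptions of this sketch. Attributes such as `name` of StateMachine are omitted, since Formula (5.1) covers associations only.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class State:
    name: str


@dataclass(frozen=True)
class Transition:
    source: State
    target: State


@dataclass(frozen=True)
class StateMachine:
    # One field per association named in Formula (5.1).
    # A frozenset models an association end with upper bound > 1, i.e. P(...).
    states: FrozenSet[State]
    transitions: FrozenSet[Transition]
    initial: State
    final: FrozenSet[State]


def as_tuple(m: StateMachine) -> Tuple[
    FrozenSet[State], FrozenSet[Transition], State, FrozenSet[State]
]:
    """Hypothetical helper: view an object as an element of
    P(State) x P(Transition) x State x P(State)."""
    return (m.states, m.transitions, m.initial, m.final)


s1, s2 = State("s1"), State("s2")
machine = StateMachine(
    states=frozenset({s1, s2}),
    transitions=frozenset({Transition(s1, s2)}),
    initial=s1,
    final=frozenset({s2}),
)
```

The dot notation used in the text (for instance s.states) corresponds to plain field access, here `machine.states`.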

5.2.2 Describing Model Transformations

In general, a model transformation defines a relation between a set of source models and a set of target models [21]. A QVTo mapping is more restrictive, realizing a function that maps one or more source model objects into one or more target model objects. Thus, we can denote a QVTo mapping as a function from a set representing the source class to a set representing the target class.

     1  mapping Transition::CloneTransition() : Transition {...}
     2
     3  mapping Transition::MultiplyTransByState(in state : State)
     4      : Transition {...}
     5
     6  mapping Tuple(s1 : State, s2 : State)::MultiplyTwoStates()
     7      : State {...}
     8
     9  mapping Set(StateMachine)::MultiplySTMs()
    10      : StateMachine {...}

Listing 5.1: Mapping declarations of the example QVTo transformation

     1  mapping Transition::CloneTransition() : Transition {
     2      result.trigger := self.trigger -> map CloneTrigger();
     3      result.guard := self.guard.map CloneConstraint();
     4      result.effect := self.effect -> map CloneBehavior();
     5      result.source := self.source.map CloneState();
     6      result.target := self.target.map CloneState();
     7  }

Listing 5.2: QVTo code of the CloneTransition mapping

For example, the simple mapping depicted in line 1 of Listing 5.1 can be viewed as a function with the following signature:

\[
\mathit{CloneTransition} : \mathit{Transition} \to \mathit{Transition}
\tag{5.2}
\]

Mappings with multiple inputs and/or outputs can be treated in the same way, using Cartesian products of sets to indicate these inputs and/or outputs; e.g. the QVTo mappings of lines 3 and 6 in Listing 5.1 can be represented as follows:

\[
\mathit{MultiplyTransByState} : \mathit{Transition} \times \mathit{State} \to \mathit{Transition}
\tag{5.3}
\]

\[
\mathit{MultiplyTwoStates} : \mathit{State} \times \mathit{State} \to \mathit{State}
\tag{5.4}
\]

Note that in QVTo multiple inputs and outputs for a mapping can be realized in two different ways: using in/out parameters of the mapping (lines 3-4 in Listing 5.1) or using tuples as source/target types (lines 6-7 in Listing 5.1). This syntactic difference influences the results of the traceability and resolving mechanisms of QVTo, as only the object before the symbol ’::’ is traced and resolved.

QVTo also allows a collection of objects to be used as an input and/or output of a mapping (line 9 in Listing 5.1). We treat an instance of a collection defined on a class as a set of objects of this class, meaning that we treat the collection class as a powerset ($\mathcal{P}$) of the set that represents the class of the collection elements. Strictly speaking, a collection can be realized as a list or a bag, and thus cannot be formally specified as a set. We use this notation as an approximation in order to describe a transformation conceptually:

\[
\mathit{MultiplySTMs} : \mathcal{P}(\mathit{StateMachine}) \to \mathit{StateMachine}
\tag{5.5}
\]

Function signatures allow us to concisely capture the purpose of a mapping in terms of source and target metamodels, and in line with the QVTo code. However, they do not suffice to describe what a mapping actually does. For this, we need to describe how elements of target objects are calculated from the elements of source objects. We use formulas of the following form (which we explain below) to achieve this:

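\[
\begin{aligned}
\mathit{CloneTransition}(\mathit{self}):\quad
& \mathit{trigger} = \bigcup_{t \in \mathit{self}.\mathit{trigger}} \{\mathit{CloneTrigger}(t)\}, \\
& \mathit{guard} = \mathit{CloneConstraint}(\mathit{self}.\mathit{guard}), \\
& \mathit{effect} = \bigcup_{e \in \mathit{self}.\mathit{effect}} \{\mathit{CloneBehavior}(e)\}, \\
& \mathit{source} = \mathit{CloneState}(\mathit{self}.\mathit{source}), \\
& \mathit{target} = \mathit{CloneState}(\mathit{self}.\mathit{target})
\end{aligned}
\tag{5.6}
\]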

Formula 5.6 describes the QVTo mapping depicted in Listing 5.2. It shows how each element of the target object is calculated by the invocation of other mappings on elements of the source object. In QVTo, the invocation of a mapping on a collection is denoted by an arrow (lines 2 and 4 in Listing 5.2), while invocation on a single object uses dot notation (lines 3, 5, and 6 in Listing 5.2). In Formula 5.6, we indicate the invocation of a mapping by the application of the corresponding function, such as CloneConstraint(self.guard). A calculation that is performed on a collection is denoted by a quantified union over elements of the collection (lines 1 and 3 of the formula). Such a formula, moreover, specifies the order of invocation of the mappings, which might be essential due to the operational nature of QVTo. According to the execution semantics of QVTo implemented in the Eclipse project, the result of a mapping invocation is the corresponding target object, which is newly constructed or resolved depending on whether this mapping has already been invoked on this source object.
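
To make the reading of Formula 5.6 concrete, the following Python sketch mimics the CloneTransition mapping. It is an illustration under stated assumptions, not QVTo and not the Eclipse implementation: the `traced` helper and the `_trace` dictionary are hypothetical stand-ins for the traceability mechanism described above, the attribute names of Trigger, Constraint, and Behavior are invented, and a missing guard is modeled as `None`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Hashable, Optional, Tuple


# eq=False keeps identity-based equality and hashing, so that
# "the same source object" means the same Python object.
@dataclass(eq=False, frozen=True)
class State:
    name: str


@dataclass(eq=False, frozen=True)
class Trigger:
    name: str  # hypothetical attribute


@dataclass(eq=False, frozen=True)
class Constraint:
    expr: str  # hypothetical attribute


@dataclass(eq=False, frozen=True)
class Behavior:
    name: str  # hypothetical attribute


@dataclass(eq=False, frozen=True)
class Transition:
    trigger: FrozenSet[Trigger]
    guard: Optional[Constraint]
    effect: FrozenSet[Behavior]
    source: State
    target: State


# Hypothetical trace: (mapping name, source object) -> target object.
_trace: Dict[Tuple[str, Hashable], object] = {}


def traced(mapping_name: str, source: Hashable, build: Callable[[], object]):
    """Construct the target on the first invocation, resolve it afterwards."""
    key = (mapping_name, source)
    if key not in _trace:
        _trace[key] = build()
    return _trace[key]


def clone_state(s: State) -> State:
    return traced("CloneState", s, lambda: State(s.name))


def clone_trigger(t: Trigger) -> Trigger:
    return traced("CloneTrigger", t, lambda: Trigger(t.name))


def clone_constraint(c: Constraint) -> Constraint:
    return traced("CloneConstraint", c, lambda: Constraint(c.expr))


def clone_behavior(b: Behavior) -> Behavior:
    return traced("CloneBehavior", b, lambda: Behavior(b.name))


def clone_transition(self: Transition) -> Transition:
    # One keyword argument per line of Formula 5.6; the set comprehensions
    # play the role of the quantified unions.
    return traced("CloneTransition", self, lambda: Transition(
        trigger=frozenset(clone_trigger(t) for t in self.trigger),
        guard=clone_constraint(self.guard) if self.guard is not None else None,
        effect=frozenset(clone_behavior(e) for e in self.effect),
        source=clone_state(self.source),
        target=clone_state(self.target),
    ))


s1, s2 = State("s1"), State("s2")
t = Transition(frozenset({Trigger("a")}), None, frozenset(), s1, s2)
loop = Transition(frozenset(), None, frozenset(), s1, s1)
c = clone_transition(t)
c_loop = clone_transition(loop)
```

In this sketch the second request for the clone of `s1` returns the object built by the first one, so `c_loop.source`, `c_loop.target`, and `c.source` are the same object. This is the behavior the text attributes to resolving.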

5.3 Design of Model Transformations

Designing model transformations is a challenging part of the model transformation development process. When designing a model transformation one must answer questions such as ‘what are the mappings that constitute a model transformation?’ and ‘how are these mappings related to each other, i.e. invoked by each other?’. In practice, there exist a number of design principles that developers can follow when creating their model transformations [31, 58]. In this section we demonstrate how the presented mathematical notation can facilitate application of some of these design principles.

5.3.1 Structural Decomposition of Model Transformations

Model transformations construct target models from source models. Such models typically consist of objects connected with each other by associations. Therefore, mappings that constitute a model transformation should construct both target objects and the associations between them. In QVTo each mapping constructs a new object via assigning values to its properties, i.e. by constructing objects associated with this object. This observation is captured by the following principle for designing model transformations:

    mappings should be related to (invoked by) each other in the same way as the objects that they construct are associated with (composed of) each other.

In other words, the structure of a model transformation follows the structure defined by the target metamodel. In [31], this was identified as one of the best practices for understandability and maintainability of transformations. To apply this principle in our design process we take the following steps:

1. We identify the inputs and outputs of the transformation and define their structures.
2. We establish the correspondence between the elements in input and output structures.
3. We capture these correspondences in signatures of the constituent mappings.
4. We decompose the transformation into constituent mappings.

In what follows, we illustrate and explain these steps through the following example: we show how these steps help in designing a mapping that takes two state machines and constructs a state machine representing their product. The source and target metamodel of this transformation is depicted in Figure 5.1. The function signature of the mapping is as follows:

\[
\mathit{MultiplyTwoSTMs} : \mathit{StateMachine} \times \mathit{StateMachine} \to \mathit{StateMachine}
\tag{5.7}
\]

The task of writing such a transformation may not seem challenging, taking into account that we know the definition of a product of two state machines: a product machine simultaneously simulates the behavior of the two state machines of which it is constructed. Figure 5.2 shows an example of applying such a transformation to two simple state machines. However, it might not be obvious how to implement the MultiplyTwoSTMs transformation in QVTo code, as we need to decide how to compute a product machine according to its definition, i.e. we need to construct an imperative algorithm based on the declarative definition.

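[Figure: two panels. (a) Two state machines: one with the states s1, s2, s3 and transitions labeled a and b, and one with the states s4, s5, s6 and transitions labeled c and d. (b) Product of the state machines: the states (s1,s4), (s2,s4), (s3,s4), (s1,s5), (s2,s5), (s3,s5), (s1,s6), (s2,s6), (s3,s6), connected by transitions labeled a, b, c, and d.]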

Figure 5.2: An example of applying the MultiplyTwoSTMs model transformation

5.3.1.1 Identifying the structure of the inputs and outputs.

As a first step, we establish what kind of objects we have for the transformation. For this we ‘rewrite’ each class in the initial function signature with the classes that are associated with this one. The rule we use to rewrite is based on the inclusion relation captured in Formula (5.1). According to this formula, an object of the class StateMachine can be viewed as a tuple of objects referenced through the associations states, transitions, initial, and final, i.e. as an object that belongs to $\mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \times \mathit{State} \times \mathcal{P}(\mathit{State})$. Thus, as a result of rewriting signature (5.7), we obtain the following:

\[
\begin{aligned}
&(\mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \times \mathit{State} \times \mathcal{P}(\mathit{State})) \;\times \\
&(\mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \times \mathit{State} \times \mathcal{P}(\mathit{State})) \\
&\quad \to \mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \times \mathit{State} \times \mathcal{P}(\mathit{State})
\end{aligned}
\tag{5.8}
\]

5.3.1.2 Connecting elements in input to output structures. Compared to signature (5.7), signature (5.8) is much richer, but it is still very ‘monolithic’. We next strive towards breaking down the structure in the formula, allowing for a modular solution. To this end, we redistribute the inputs on the left of the arrow and the outputs on the right of the arrow according to our understanding of which input elements are required to construct which output elements. For example, we know that the states of the resulting state machine are constructed from the states of the input machines (see Figure 5.2). We capture this knowledge by grouping the input collections of States and the output collection of States in a separate sub-signature in Formula (5.9).

\[
\begin{aligned}
&(\mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{State}) \to \mathcal{P}(\mathit{State})) \;\times \\
&(\mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \times \mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \to \mathcal{P}(\mathit{Transition})) \;\times \\
&(\mathit{State} \times \mathit{State} \to \mathit{State}) \;\times \\
&(\mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{State}) \to \mathcal{P}(\mathit{State}))
\end{aligned}
\tag{5.9}
\]

We do the same for the output transitions, the initial state, and the final states. The resulting initial state is composed of the input initial states, see the third sub-signature in Formula (5.9). The same holds for the collection of final states, see the fourth sub-signature in Formula (5.9). The collection of output transitions is computed based on the input transitions and the input states, thus we duplicate the collections of the input states for the second sub-signature (see the second sub-signature in Formula (5.9), on the left of the arrow).

Note that in the process of structural decomposition (described here), we do not use the inner structure of the input and output elements. For example, we do not refer to the source and target states of a transition, but we rather use the collection of all states of a state machine as it appears in the structure of the state machine (see metamodel in Figure 5.1).
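The tuple view of a state machine and the regrouping of inputs described in the two steps above can be sketched in Python as follows. This is only an illustration, not part of the original QVTo design: the names `StateMachine` (as a Python dataclass), `regroup_inputs`, the dictionary keys, and the toy machines are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateMachine:
    # Tuple view of a state machine: P(State) x P(Transition) x State x P(State)
    states: frozenset
    transitions: frozenset  # hypothetical encoding: (source, label, target) triples
    initial: str
    final: frozenset


def regroup_inputs(m1, m2):
    """Group the components of two machines by the output they feed.

    Each entry mirrors the left-hand side of one sub-signature: only the
    top-level collections are used, never the inner structure of a transition.
    """
    return {
        "states": (m1.states, m2.states),
        "transitions": (m1.states, m1.transitions, m2.states, m2.transitions),
        "initial": (m1.initial, m2.initial),
        "final": (m1.final, m2.final),
    }


# Toy inputs (assumed for illustration only).
m1 = StateMachine(frozenset({"s1", "s2"}), frozenset({("s1", "a", "s2")}),
                  "s1", frozenset({"s2"}))
m2 = StateMachine(frozenset({"s4", "s5"}), frozenset({("s4", "c", "s5")}),
                  "s4", frozenset({"s5"}))
groups = regroup_inputs(m1, m2)
```

Each group can then be handed to a separate constituent mapping, which is what makes the decomposition modular.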

5.3.1.3 Deriving signatures of the constituent mappings. Each sub-signature, underlying the signature obtained in the previous step, captures a correspondence between source and target objects. We decompose the mapping MultiplyTwoSTMs into invocations of the constituent mappings according to these correspondences. This means that, as a first approximation, each sub-signature of signature (5.9) corresponds to an invocation of a constituent mapping. To derive signatures of the constituent mappings, we consider each sub-signature separately. The first sub-signature of Formula (5.9) is

\[
\mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{State}) \to \mathcal{P}(\mathit{State}) \tag{5.10}
\]

In QVTo a $\mathcal{P}$-symbol that appears evenly on both sides of an arrow corresponds to the invocation of a mapping on an input collection and assigning the result of this mapping to an output collection. If the mapping operates on elements of the collections rather than on the collections, then we can reduce the corresponding collection symbols $\mathcal{P}$ that appear evenly on both sides of an arrow. Mathematically, such a reduction of $\mathcal{P}$-symbols corresponds to an inverse of function lifting. After applying this technique to signature (5.10), we derive the following underlying signature:

\[
\mathit{MultiplyTwoStates} : \mathit{State} \times \mathit{State} \to \mathit{State} \tag{5.11}
\]

The resulting mapping MultiplyTwoStates constructs a state of the target state machine as a pair of states from the two source machines (see Figure 5.2). The initial and final states are constructed in the same way. Thus, the signature of MultiplyTwoStates can be derived from the first, third, and fourth sub-signatures of Formula (5.9).

Using our knowledge of what the transformation should do, we can further refine the sub-signature from the second line of Formula (5.9). Each output transition is a copy of an input transition duplicated as many times as its source and target states have been duplicated to construct the output states (see Figure 5.2). As each state of an input machine is paired with all states of the other input machine, we conclude that each transition of an input machine is paired with all states of the other input machine. This knowledge is captured by regrouping the corresponding collections in the second line of Formula (5.9):

\[
(\mathcal{P}(\mathit{Transition}) \times \mathcal{P}(\mathit{State})) \times (\mathcal{P}(\mathit{Transition}) \times \mathcal{P}(\mathit{State})) \to \mathcal{P}(\mathit{Transition}) \tag{5.12}
\]

The described pairing of transitions with states is applied symmetrically to both input machines. The target collection of transitions is constructed as the union of the two resulting collections. Therefore, we reduce signature (5.12) to the following underlying signature:

\[
\mathcal{P}(\mathit{Transition}) \times \mathcal{P}(\mathit{State}) \to \mathcal{P}(\mathit{Transition}) \tag{5.13}
\]

Observing that we can once more use a reverse function lifting, we finally arrive at the following signature for a constituent mapping:

\[
\mathit{MultiplyTransByState} : \mathit{Transition} \times \mathit{State} \to \mathit{Transition} \tag{5.14}
\]

5.3.1.4 Describing the implementation of a mapping. Function signature (5.9) helps to identify signatures of the mappings that compose the MultiplyTwoSTMs mapping, but it does not capture how this composition is implemented. Therefore, as a final step of the design procedure, we describe the implementation of the MultiplyTwoSTMs mapping in a formula using the notation we introduced in Section 5.2.2:

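\[
\begin{aligned}
&\mathit{MultiplyTwoSTMs}(m_1, m_2): \\
&\quad \mathit{states} = \bigcup_{s_1 \in m_1.\mathit{states}} \; \bigcup_{s_2 \in m_2.\mathit{states}} \{\mathit{MultState}(s_1, s_2)\}, \\
&\quad \mathit{transitions} = \bigcup_{t \in m_1.\mathit{transitions}} \; \bigcup_{s \in m_2.\mathit{states}} \{\mathit{MultTrans}(t, s)\} \;\cup \\
&\quad \phantom{\mathit{transitions} = {}} \bigcup_{t \in m_2.\mathit{transitions}} \; \bigcup_{s \in m_1.\mathit{states}} \{\mathit{MultTrans}(t, s)\}, \\
&\quad \mathit{initial} = \mathit{MultState}(m_1.\mathit{initial}, m_2.\mathit{initial}), \\
&\quad \mathit{final} = \bigcup_{f_1 \in m_1.\mathit{final}} \; \bigcup_{f_2 \in m_2.\mathit{final}} \{\mathit{MultState}(f_1, f_2)\}
\end{aligned}
\tag{5.15}
\]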

    mapping Tuple(stm1: StateMachine, stm2: StateMachine)
        ::MultiplyTwoSTMs() : StateMachine
    {
        -- states: every pair (s1, s2) of input states
        result.states := stm1.states->collect(s1 |
            stm2.states->collect(s2 |
                Tuple{st1=s1, st2=s2}.map MultiplyTwoStates()));

        -- transitions: pair the transitions of each machine
        -- with the states of the other machine
        result.transitions := stm1.transitions->collect(t |
            stm2.states->collect(s |
                t.map MultiplyTransByState(s)))
            ->union(
                stm2.transitions->collect(t |
                    stm1.states->collect(s |
                        t.map MultiplyTransByState(s))));

        result.initial := Tuple{st1=stm1.initial, st2=stm2.initial}.
            map MultiplyTwoStates();
        result.final := stm1.final->collect(f1 |
            stm2.final->collect(f2 |
                Tuple{st1=f1, st2=f2}.map MultiplyTwoStates()));
    }

    mapping Tuple(st1: State, st2: State)::MultiplyTwoStates()
        : State
    {...}

    mapping Transition::MultiplyTransByState(in state: State)
        : Transition
    {...}

Listing 5.3: QVTo code of the MultiplyTwoSTMs transformation

This formula captures our knowledge of the transformation functionality discussed above using invocations of the constituent mappings MultState (short for MultiplyTwoStates, signature (5.11)) and MultTrans (short for MultiplyTransByState, signature (5.14)). For example, the target collection of transitions is constructed as the union of the two collections resulting from pairing (multiplying) the transitions of one state machine with the states of the other state machine (lines 3-4 in Formula (5.15)). After we have come up with the design of the mapping and captured it in a formula, we write the QVTo code according to this formula. Listing 5.3 corresponds to Formula (5.15).

In this section, we derived the structured design of a mapping by gradual refinement of our knowledge about the mapping implementation. The same procedure can be applied recursively for designing the constituent mappings MultiplyTwoStates and MultiplyTransByState. The resulting process of the structural decomposition of a model transformation resembles the method of syntax-directed translation, in which a compiler implementation is driven by the grammar of its input language [5].
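To make the shape of Formula (5.15) concrete outside of QVTo, the following is a minimal Python sketch of the same computation. It is an illustration under stated assumptions rather than the implementation used in this work: the function names `mult_state`, `mult_trans`, and `multiply_two_stms` are made up for the sketch, a product state is modelled as an unordered pair (which assumes that the two machines have disjoint state names), and the transitions, initial states, and final states of the toy machines are assumed.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Transition:
    source: object
    label: str
    target: object


@dataclass(frozen=True)
class StateMachine:
    states: frozenset
    transitions: frozenset
    initial: object
    final: frozenset


def mult_state(s1, s2):
    # Assumption: a product state is an unordered pair of input states.
    return frozenset({s1, s2})


def mult_trans(t, s):
    # Copy transition t, pairing its source and target with state s.
    return Transition(mult_state(t.source, s), t.label, mult_state(t.target, s))


def multiply_two_stms(m1, m2):
    # Element-level functions are lifted to collections via Cartesian products.
    return StateMachine(
        states=frozenset(mult_state(a, b) for a, b in product(m1.states, m2.states)),
        transitions=frozenset(
            mult_trans(t, s) for t, s in product(m1.transitions, m2.states)
        ) | frozenset(
            mult_trans(t, s) for t, s in product(m2.transitions, m1.states)
        ),
        initial=mult_state(m1.initial, m2.initial),
        final=frozenset(mult_state(a, b) for a, b in product(m1.final, m2.final)),
    )


# Toy machines (assumed): s1 -a-> s2 -b-> s3 and s4 -c-> s5 -d-> s6.
m1 = StateMachine(
    frozenset({"s1", "s2", "s3"}),
    frozenset({Transition("s1", "a", "s2"), Transition("s2", "b", "s3")}),
    "s1",
    frozenset({"s3"}),
)
m2 = StateMachine(
    frozenset({"s4", "s5", "s6"}),
    frozenset({Transition("s4", "c", "s5"), Transition("s5", "d", "s6")}),
    "s4",
    frozenset({"s6"}),
)
p = multiply_two_stms(m1, m2)
```

For these inputs the product has one state per pair of input states, and each input transition is copied once per state of the other machine. Using unordered pairs keeps `mult_trans` at the signature Transition × State → Transition without an extra argument that says which machine the transition came from; an ordered-pair encoding would need such an argument.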

5.3.2 Chaining Model Transformations

In the previous section we discussed how a model transformation can be decomposed into constituent mappings by following the structure of the target model(s) and matching it with the corresponding elements of the source model(s). However, this principle is difficult to apply if source and target models have very different structures; for example, if mappings that match source and target elements are scattered over the metamodel structures, or if there are mutual dependencies between constructed objects. This type of situation is described in the literature as structure clash [42], or semantic gap [103]. Such a design difficulty is usually solved using a chain of model transformations. One or more intermediate structures (or metamodels) are introduced to split a model transformation into two or more steps (links of the chain). In this way, a structure clash is managed in a chain of separate model transformations, each of which has a minimal clash.

Using our notation this design principle can be explained as follows. Consider a model transformation $f : A \to B$, where the gap between structures $A$ and $B$ is too wide to manage in a one-step transformation. Therefore, we split this model transformation by introducing an intermediate structure $C$ that mediates between $A$ and $B$. We develop two model transformations $f_1 : A \to C$ and $f_2 : C \to B$. Then we connect those into a chain using function composition: $f(m) = f_2(f_1(m))$. A function composition can be denoted in a more ‘chain-like’ style using the ‘$\circ$’ operator, allowing us to write $f(m) = (f_1 \circ f_2)(m)$; note that in this notation the functions are read from left to right, so $f_1$ is applied first. We prefer to use the latter notation, as it is more readable when we have more than two steps in a transformation chain.

Function composition cannot always be applied immediately. For instance, if the result of a function application is of a form that is different from the form that is accepted by the next function, then the function composition fails. An indispensable tool in many such cases is currying. Currying basically entails the translation of a function of the form $f : A \times B \to C$ to a function of the form $g : A \to (B \to C)$, by setting $g(a)(b) = f(a, b)$.

Below, we illustrate through an example transformation how the concepts of function composition and currying can be used to design a transformation chain. The example transformation flattens a UML state machine (specified by the metamodel in Figure 5.1) by removing its composite states and replacing them with equivalent sets of simple states and transitions. The mapping that performs such replacement can be described by the following signature:

\[
\mathit{SimplifyComposite} : \mathit{CompositeState} \to \mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \tag{5.16}
\]

According to our metamodel in Figure 5.1, each composite state can be either sequential or orthogonal. A sequential composite state contains only one submachine, while an orthogonal (or parallel) composite state contains more than one submachine. Intuitively, when transforming a sequential state, we can simply substitute its submachine into the parent machine by ‘erasing’ borders of the submachine and redirecting and adding the necessary transitions. When transforming an orthogonal state, we need to consider its parallel nature and express it in terms of simple states. In other words, we first need to transform parallel submachines into one sequential submachine. After this we can apply the same transformation as for a sequential state. This is a simple example of a transformation chain that constitutes the mapping (5.16).

The transformation chain will therefore consist of two mappings: Orthogonal2Sequential and SubstituteMachine. Figure 5.3 shows an example of applying such a chain of model transformations to an orthogonal composite state. The first mapping transforms a collection of parallel state machines (such as an example depicted in Figure 5.3(a)) into one state machine (such as an example depicted in Figure 5.3(b)). Thus, we describe this mapping as follows:

\[
\mathit{Orthogonal2Sequential} : \mathcal{P}(\mathit{StateMachine}) \to \mathit{StateMachine} \tag{5.17}
\]

[Figure 5.3 image omitted. Panels: (a) Orthogonal composite state; (b) Sequential composite state; (c) Substituted composite state.]

Figure 5.3: An example of applying the chain of transformations SimplifyComposite

The second mapping substitutes a nested state machine (Figure 5.3(b)) into the parent state machine (Figure 5.3(c)). When substituting a nested machine from a composite state we need to update two sets of transitions: those that have the composite state as a target, and those that have the composite state as a source. The former should be redirected to the initial substate of the composite state; the latter should be exiting from all substates of the composite state (see Figure 5.3(c)). Therefore, we consider these sets of transitions as an input of the mapping. As an output, the mapping thus creates a set of new states and a set of new and updated transitions:

\[
\mathit{SubstituteMachine} : \mathcal{P}(\mathit{Transition}) \times \mathcal{P}(\mathit{Transition}) \times \mathit{StateMachine} \to \mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \tag{5.18}
\]

Observe that the output of the Orthogonal2Sequential mapping does not directly match the input of the SubstituteMachine mapping. Therefore, we cannot connect these two mappings using a function composition. To overcome this problem, we apply currying: we rewrite Function (5.18), which takes a tuple of inputs, as a sequence of functions, each of which takes a single input. In other words, we replace the Cartesian products on the left of the arrow with a sequence of arrows to obtain the following:

\[
\mathit{SubstituteMachine} : \mathcal{P}(\mathit{Transition}) \to \mathcal{P}(\mathit{Transition}) \to \mathit{StateMachine} \to \mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \tag{5.19}
\]

When calculating such a function, we calculate each of the functions in the sequence separately, and each separate calculation yields a new function. For example, if we calculate Function (5.18) for a tuple $(x, y, z)$, we start from the first function in the sequence (5.19) and apply it to the first argument of the tuple. As a result we get a new function:

\[
\mathit{SubstituteMachine}(x) : \mathcal{P}(\mathit{Transition}) \to \mathit{StateMachine} \to \mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \tag{5.20}
\]

Then we do the same for function (5.20) and the second argument $y$ of the remaining tuple $(y, z)$. As a result we get a new function:

\[
\mathit{SubstituteMachine}(x, y) : \mathit{StateMachine} \to \mathcal{P}(\mathit{State}) \times \mathcal{P}(\mathit{Transition}) \tag{5.21}
\]

Note that the input of function (5.21) matches the output of function (5.17). Therefore, we can apply function composition to these two functions. The resulting definition of the SimplifyComposite mapping connects the mapping SubstituteMachine (with two fixed inputs) and the mapping Orthogonal2Sequential into a transformation chain:

\[
\mathit{SimplifyComposite}(\mathit{state}) = \bigl(\mathit{Orthogonal2Sequential}(\mathit{state}.\mathit{submachine}) \circ \mathit{SubstituteMachine}(\mathit{state}.\mathit{in}, \mathit{state}.\mathit{out})\bigr) \tag{5.22}
\]

Formula (5.22) demonstrates how the notation of function composition can enhance the description of a model transformation chain and the flow of information through the separate steps of this chain. The corresponding QVTo code is depicted in Listing 5.4.

    helper CompositeState::SimplifyComposite()
        : states: Set(State), transitions: Set(Transition)
    {
        return self.submachine->map Orthogonal2Sequential().
            map SubstituteMachine(self._in, self._out);
    }

Listing 5.4: QVTo code of the SimplifyComposite transformation
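The interplay of currying and left-to-right composition in this chain can also be sketched in Python. The sketch below is illustrative only: the helpers `curry3` and `chain`, the encoding of transitions as `(source, label, target)` triples, the pairwise `product_of_two` used to realize Orthogonal2Sequential, and the toy regions are assumptions made for this example and do not come from the QVTo implementation.

```python
from functools import reduce
from typing import NamedTuple


class Machine(NamedTuple):
    states: frozenset
    transitions: frozenset  # assumed encoding: (source, label, target) triples
    initial: object


def curry3(f):
    """Turn f(a, b, c) into g(a)(b)(c)."""
    return lambda a: lambda b: lambda c: f(a, b, c)


def chain(*steps):
    """Left-to-right composition: chain(f1, f2)(m) == f2(f1(m))."""
    def run(m):
        for step in steps:
            m = step(m)
        return m
    return run


def product_of_two(m1, m2):
    states = frozenset((a, b) for a in m1.states for b in m2.states)
    transitions = frozenset(
        ((src, b), lbl, (tgt, b)) for (src, lbl, tgt) in m1.transitions for b in m2.states
    ) | frozenset(
        ((a, src), lbl, (a, tgt)) for (src, lbl, tgt) in m2.transitions for a in m1.states
    )
    return Machine(states, transitions, (m1.initial, m2.initial))


def orthogonal2sequential(machines):
    # Collection of machines -> one machine; expects a non-empty sequence.
    return reduce(product_of_two, machines)


def substitute_machine(incoming, outgoing, machine):
    # Incoming transitions are redirected to the initial substate;
    # outgoing transitions exit from every substate.
    redirected = frozenset((src, lbl, machine.initial) for (src, lbl, _) in incoming)
    exiting = frozenset((s, lbl, tgt) for (_, lbl, tgt) in outgoing for s in machine.states)
    return machine.states, machine.transitions | redirected | exiting


def simplify_composite(submachines, incoming, outgoing):
    # Fix the first two inputs by currying, then compose with the first step.
    second_step = curry3(substitute_machine)(incoming)(outgoing)
    return chain(orthogonal2sequential, second_step)(submachines)


# Toy composite state "C" with two parallel regions (assumed for illustration).
r1 = Machine(frozenset({"s1", "s2"}), frozenset({("s1", "a", "s2")}), "s1")
r2 = Machine(frozenset({"s4", "s5"}), frozenset({("s4", "c", "s5")}), "s4")
states, transitions = simplify_composite(
    [r1, r2], incoming={("s7", "e", "C")}, outgoing={("C", "f", "s9")}
)
```

After the two transition sets are fixed, `second_step` takes a single machine, so its input type lines up with the output of `orthogonal2sequential`; this is the condition under which the two steps can be chained.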

5.3.3 Combining Structural Decomposition and Transformation Chains

In the previous sections we demonstrated how the mathematical notation of set theory and functions can accompany the design process of a model transformation. In particular, we use regrouping of function signatures to represent the structural decomposition of a model transformation along the structure of the input and/or output metamodels, and function composition and currying to capture the information flow in chains of model transformations. The purpose of these two design principles is to guide the application of the modular approach, i.e. to split the model transformation into modules. In practice, these design principles are often applied simultaneously, resulting in a mixture of transformation modules of two different kinds (sub-signatures and steps of a transformation chain). To manage the combination of these two types of modularity (i.e. to organize the application of the two design principles), we propose the following iterative process of developing a model transformation.

1. We choose a structure along which to decompose our model transformation. This structure is most probably the input metamodel or the output metamodel of the transformation. Our experience is that this choice depends on the granularity of the metamodels: we choose the most ‘dense’ structure, i.e. the structure whose content is modeled (captured) in smaller constructs (granules), such as classes and associations between them.
2. We identify the delta(s) between the chosen structure and the input/output metamodel. Each delta corresponds to a (meaningful or structural) difference between the input and output metamodels. For each of the identified deltas we aim to develop a (separate) model transformation.
3. We split (or compose) our model transformation into a chain of transformations (if possible and necessary). For this, we identify the flow of the information between the input and the output metamodels and connect links of the chain according to this flow (so that the output of one link serves as an input for the next link).

Note that the described process is purely empirical and should be considered as a recommendation on a possible approach, rather than an attempt to generalize the development process of all possible model transformations.

Table 5.1: Feedback received from the QVTo practitioners

    Situation               Expertise   Expertise in   Mathematical notation      Formulated
                            in QVTo     mathematics    for documenting QVTo       design principles
    ----------------------  ----------  -------------  -------------------------  -----------------
    Long-term development   +           +              +/–                        ++
                                        –              –                          –
    Prototyping             +           +              –                          –
    Developing a DSL        –           +              +/–                        ++

5.4 Validation

In this chapter we proposed and demonstrated an approach of using the mathematical notation of set theory for describing QVTo mappings, their signatures and implementations, for deriving the organizational structure of mappings, and for describing the information flow in chains of model transformations. While this approach has been applied to implement the Constelle-to-Event-B model transformation (as described in the next chapter), further investigation and experiments are needed to validate that the approach is feasible, useful, and consistent in a more general scope (i.e. can be applied to other model transformations). To estimate the potential of this research direction, we performed an early evaluation of the proposed approach by consulting with QVTo practitioners through interviews.

5.4.1 Interviews

We conducted four interview sessions with seven developers from three different affiliations and different engineering domains. During an interview we first presented our approach (as done in Sections 5.2 and 5.3), and then collected the feedback using a questionnaire form consisting of 25 open questions. The goal of the questionnaire was to (1) assess an interviewee’s experience of developing and maintaining QVTo transformations, (2) gauge his/her understanding of the employed mathematical notation, and (3) get the interviewee’s feedback on the usability and usefulness of the proposed approach. An overview of the feedback collected during the interviews is presented in Table 5.1.

We classify the collected feedback along the following situations in which the interviewees used QVTo (leftmost column in Table 5.1):

1. developing large-scale software systems for further industrial usage in a team of software developers that also continues to maintain and extend the QVTo source code;
2. prototyping software architectures for further delegation of found solutions to software developers, without need for documentation and long-term maintenance of the QVTo code;
3. developing a DSL for conducting research in the field of electrical engineering by an engineer without prior expertise in QVTo.

In the first situation (second row in Table 5.1), the developers find the maintenance and documentation of their model transformations especially important. However, they are satisfied with the QVTo code documenting itself and use certain code conventions to improve readability of their model transformations. This corresponds to the high expertise level in QVTo of these interviewees. In this situation, the developers would not employ the proposed notation for documenting their model transformations, but rather for the formal specification of (the most tricky parts of) their model transformations (‘+/–’ in Table 5.1). This feedback is partially determined by the fact that some of these developers do not have a background in mathematics, and thus cannot understand the notation of set theory describing QVTo model transformations. As for the formulated design principles, the interviewees recognize them and already apply them in their development process. However, the interviewees that lack the required mathematical background cannot follow the formulas that accompany the application of these design principles in our approach.

In the second situation (third row in Table 5.1), the architect does not see the need for a notation for describing his transformations and, moreover, presumes that the proposed design guidelines might restrict his experiments. Thus, we conclude that our proposed approach may not be suited for situations such as this one.

In the third situation (bottom row in Table 5.1), the (non-software) engineer finds it especially important to discover the pragmatics of QVTo and to have guidelines on how to design a model transformation. Such an engineer starts the development of a DSL without having expertise in QVTo and obtains this expertise during the development process, usually by trial and error. The interviewed engineer believes that the decomposition principle and the corresponding design process can be very useful and finds it helpful to have them formulated explicitly (‘++’ in the bottom right cell in Table 5.1). This third situation corresponds to our own experience of using QVTo for designing and implementing the Constelle-to-Event-B model transformation.

5.4.2 Constelle-to-Event-B

According to the taxonomy proposed in [68], the Constelle-to-Event-B model transformation can be classified as an exogenous vertical semantical transformation with multiple source and target models; and it can be characterized as a fully automated and complex transformation (about 40 mappings and one thousand lines of code). This means that the transformation is objectively sophisticated, which corresponds to our subjective experience. The approach proposed in Section 5.3 helped us in streamlining the development of this transformation. Applying the proposed approach, we found that:

• the described design principles can assist in the creative process of designing model transformations;
• designing a model transformation in the form of formulas prior to coding it in QVTo improves readability and understandability of the resulting code;
• describing a model transformation using the proposed notation facilitates explaining its meaning to peers.

5.4.3 Results and Future Work

From our experience and from the feedback of the QVTo practitioners, we conclude that the proposed notation is usable and useful for research-oriented projects which involve the design of nontrivial model transformations and their further description and explanation (for example, in the form of a publication). As the group of interviewed QVTo practitioners was rather small (seven developers), conducting additional interviews is an important step for future validation. Moreover, for the successful application of the proposed notation the following questions require further investigation.

• Does the proposed notation restrict the resulting design of a model transformation?
• What are the limits of the proposed notation, i.e. can all essential situations and QVTo constructs be naturally described using mathematical concepts? For instance, there is no mathematical operator that naturally matches the QVTo concept of inheritance of mappings.
• How scalable is the proposed approach, i.e. how much effort does it require in general to apply the proposed notation for describing and designing a complete complex transformation?
• Is the proposed approach language independent, i.e. can it be used for developing a model transformation using another model transformation language, for example ATL?
• Do the formulated design principles improve efficiency of the resulting model transformation and enhance its reuse (through its modules)?

5.5 Related Work

The broad problem of supporting the whole life-cycle of the model transformation development with a proper description notation is raised by Guerra et al. in [33]. In their study they propose a family of different visual languages for various phases of the development process: requirements, analysis (testing), architectural design, mappings overview and their detailed design. Although the authors state that the resulting diagrams should guide the construction of the software artifacts, there is no discussion in the paper on how to design model transformations. In contrast, in our approach we use a single mathematical notation for describing three of the listed phases: architectural design (transformation chains), mappings overview (function signatures) and their detailed design (formulas). The mathematical notation of function signatures applied in our paper is close to the notation of rewrite rules of the ASF+SDF [12, 24] and Stratego [107] languages.

The earlier works on a notation for describing and designing model transformations, such as [28] and [79], propose graphical representations that are based on UML class diagrams. The major disadvantage of such approaches is the difficulty to describe the organizational structure of a transformation and the information flow through this structure from source to target models. This challenge is managed in the visual notation of the MOLA transformation language [45], that combines ‘structured flowcharts’ for describing a transformation algorithm with (model) patterns for defining input and output of a transformation. Though MOLA aims for describing model transformations in a ‘natural and easy readable way’, it introduces a number of specific visual means, which might require certain experience from a user.

The existing studies that specify model transformations using mathematical formalisms, such as [40], mostly aim for formal analysis of transformations rather than for designing and maintaining them. An exception to this is [58, 80] by Lano and Rahimi, who define a systematic development process for model transformations with the focus on their formal specification and further verification. They propose to describe a model transformation as a set of abstract constraints on the relation between source and target models that is realized by this transformation. For this, they identify and classify common patterns for specifying constraints on model transformations [57]. The authors also give guidelines on how the (implementation) design of a model transformation can be derived based on the pattern that has been applied for its specification, using one of the proposed implementation patterns, or implementation strategies. The structural decomposition approach (the design principle that we discuss in Section 5.3.1) is similar to the recursive pattern in [58]. In particular, our decomposition method can be used to determine the structural dependency ordering of the target model, which is required when applying this implementation pattern. Chaining of model transformations (our second design principle, discussed in Section 5.3.2) is similar to the auxiliary metamodels pattern in [58].

The principle of structuring a program (or in our case, a transformation) according to the structure of its input and/or output goes back to the method proposed by M. Jackson in the 1970s and known as Jackson Structured Programming (JSP) [41, 42]. This method was developed in the domain of data processing systems, where an input data stream is processed and an output data stream is produced as a result. JSP gives guidelines on how to decompose, i.e. to design, a program that performs such processing. According to JSP, program structure should be dictated by the structure of its input and output data streams. The JSP method uses diagrams to represent structures of input and output streams, to merge them and to derive a corresponding program structure. We apply ideas similar to those of the JSP method to structure model transformations along the structure of their input and output metamodels (Section 5.3.1).

Several aspects of our approach are indirectly supported by the results of the exploratory study performed by Gerpheide et al. [31] for constructing a QVTo quality model. In this work they formalize a quality model for QVTo based on interviews with four QVTo experts. According to their results, understandability and maintainability are the most ubiquitous quality goals for QVTo experts. Moreover, among the best practices that address these goals are (1) the preference for a declarative style of programming over an imperative style (for example, avoiding loop-statements); and (2) structuring a transformation along the hierarchy of either the input or output metamodel. According to our experience, both these best practices can result from using the notation and the design principles described in this chapter. This indicates that the proposed approach can contribute to improving the quality of QVTo code with respect to understandability and maintainability.

5.6 Conclusions

According to Kurtev [54], the current QVT standard lacks a formal basis, which causes risks when using model transformations designed and implemented in QVTo. These risks are amplified by the current lack of documentation on the QVT execution semantics and pragmatics.

In this chapter we showed how the mathematical notation of set theory and functions can be used to explain QVTo concepts, to facilitate and organize the design process of model transformations, and to document model transformations. The resulting formulas give an overview of the organizational structure and the information flow of a transformation in an unambiguous and concise way.

We applied the proposed notation and the formulated design principles to design, develop, and refactor two model transformations: the state machine model transformation used as an example in this chapter and the Constelle-to-Event-B model transformation (described in the next chapter). Moreover, we used the notation and design principles for teaching QVTo to students.

The approach described in this chapter requires further investigation and experiments. In Section 5.4.3 we highlight directions for future work. The quality of model transformations in general is an important concern in the context of MDE and has been addressed in a number of studies (see for example the PhD dissertation by van Amstel [101]). In this respect, we believe that the most interesting research questions are (1) whether the described approach improves the quality of a resulting model transformation and (2) whether the proposed notation can be used with different transformation languages (such as Xtend, ETL and ATL).

Chapter 6

Mapping Constelle to Event-B

The best language for writing informal specifications is the language of ordinary math, which consists of precise prose combined with mathematical notation. The math needed for most specifications is quite simple: predicate logic and elementary set theory. This math should be as natural to a programmer as numbers are to an accountant.

Leslie Lamport, Who Builds a House without Drawing Blueprints?

In this chapter, we define the semantics of the Constelle language by mapping it to the Event-B formalism. On the one hand, in this way we answer the research question: what is the semantics of a definition of the dynamic semantics of a DSL? On the other hand, when specifying the semantics of Constelle, we follow a pragmatic approach, guided by the requirements formulated in Chapter 2. In particular, we refine the stated research question into the following (design or research) (sub)questions.

• How to achieve both a precise and executable semantic mapping from Constelle to Event-B?
• How do the constructs and mechanisms of Constelle map to the constructs and mechanisms of Event-B?
• What is generated from a Constelle model that defines the dynamic semantics of a DSL?
• What is the added value of having a precise semantic mapping from Constelle to Event-B?

The first of these questions is highlighted in Section 6.1, which gives an overview of the semantic mapping from Constelle to Event-B from a pragmatic point of view. The next two questions are answered in the rest of the chapter (Sections 6.2, 6.3, 6.4, and 6.5) through the description of the Constelle-to-Event-B model transformation. For this description we rely on the mathematical notation of set theory and functions introduced in Chapter 5. The last of the listed questions is addressed in Section 6.6, where we identify proof obligations for the Event-B specifications generated from a Constelle model (i.e. a definition of the dynamic semantics of a DSL given in Constelle).

6.1 Overview

In the Constelle language the dynamic semantics of a DSL is defined using semantic interfaces of the specification templates. Such templates are collected in a library, which facilitates the reuse of design solutions. As a carrier of such design solutions, i.e. as an implementation formalism for our specification templates, we use Event-B. This means that Constelle realizes a semantic mapping of the DSL to the semantic domain of Event-B (according to the definitions given in Chapter 2). In Figure 6.1 this process is represented as the leftmost T-diagram: the semantic mapping of the DSL to Event-B is realized in Constelle.

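[Three T-diagrams. Left: DSL to Event-B, realized in Constelle. Middle: Constelle to Event-B, realized in Event-B∗ and implemented in QVTo. Right: DSL to Event-B, realized in Event-B.]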

Figure 6.1: T-diagrams of the Constelle semantics definition

In Constelle the dynamic semantics of a DSL is defined as a composition of specialized specification templates. The engine of the Constelle language realizes code substitution in the templates and composes the resulting specifications into the Event-B specification(s) of the dynamic semantics of the DSL. The semantics of Constelle defines such a translation from a composition of specialized specification templates to the corresponding Event-B code. In Figure 6.1 the definition of the Constelle semantics is represented as the middle T-diagram: the semantic mapping of Constelle to Event-B.

To benefit from a semantics definition in practice and to implement the use cases discussed in Chapter 2, we aim to have the semantic mapping of Constelle to Event-B both precise and executable. Precision of the semantic mapping from Constelle to Event-B is achieved through the use of three Event-B techniques: generic instantiation, shared event composition and refinement – depicted as Event-B∗ in Figure 6.1. These techniques have a solid theory [4] and make it possible to reuse proof obligations discharged for the specification templates in a specification of the DSL dynamic semantics generated from its Constelle definition. The detailed explanations of these techniques and how they allow for the reuse of proof obligations are given further in this chapter.

According to our definition of executability given in Chapter 2, an executable semantic mapping can be interpreted (i.e. executed) by tools. Executability of the semantic mapping from Constelle to Event-B is achieved through its implementation using the MDE technique of model-to-model transformation. Namely, we implement the Event-B∗ techniques in a Constelle-to-Event-B transformation using the QVTo model transformation language [74]. In Figure 6.1 this implementation is represented as the triangle on the bottom of the middle T-diagram.
The semantic mapping from a DSL to Event-B realized in Constelle (leftmost T-diagram in Figure 6.1), combined with the semantic mapping from Constelle to Event-B, results in (generates) a semantic mapping from the DSL to Event-B realized in Event-B. This mapping is represented as the rightmost T-diagram in Figure 6.1. In the resulting definition of the dynamic semantics of a DSL, both the semantic mapping and the semantic domain are precise and executable, thus fulfilling our basic requirement stated in Chapter 2.

6.2 Model transformations from Constelle to Event-B

The definition of the Constelle semantic mapping follows its actual implementation, the Constelle-to-Event-B transformation coded in QVTo. Thus, by defining the semantic mapping we also describe how the implemented QVTo model transformation works. In this description we use the mathematical notation of set theory and functions, aligned with the concepts and notation of the QVTo language – as introduced in Chapter 5. In particular, we define the Constelle-to-Event-B semantic mapping as a set of functions, capturing QVTo model transformations, and defined in terms of the metamodels introduced in Chapter 4.

Figure 6.2 reproduces and combines all the metamodels introduced in Chapter 4. The Constelle metamodel is depicted on the left, with the dedicated Constelle constructs depicted on the shaded background and the basic constructs related to the notion of template interface depicted on the white background. The metamodel of a specification template is depicted in the middle on the shaded background. The Event-B metamodel is shown on the right. The relations that connect these four metamodels are highlighted using labels that are typeset in bold (such as implements that links elements of specification templates with the corresponding elements of template interfaces).

According to the description method introduced in the previous chapter, each class depicted in Figure 6.2 is viewed as a set of objects that instantiate this class, and as such is used in the definitions of the QVTo functions that are given further in this chapter. Note that as we use set theory and functions, we do not discuss some trivial details, such as formal applicability of functions or additional constraints that should be fulfilled. This is done for the sake of brevity. However, the actual Constelle-to-Event-B transformation performs all necessary checks. In Section 6.6 we discuss proof obligations that result from (or correspond to) the Constelle-to-Event-B functions.
The model transformation Constelle-to-Event-B transforms (maps, translates) a Constelle definition of the DSL dynamic semantics to a corresponding Event-B specification of this DSL dynamic semantics. For this, the transformation consumes a Constelle model (instance of the Constelle metamodel) and a library of the specification templates invoked in this Constelle model. As in Constelle the dynamic semantics of a DSL is defined through a DAG of semantic modules (see Chapter 4), the transformation produces an Event-B specification that consists of multiple Event-B machines: an Event-B machine for each semantic module of the definition. Each of these resulting Event-B machines is wrapped into a specification template. This is done for the sake of uniformity (of semantic modules and semantic templates) and for the possibility to construct new specification templates from existing ones using Constelle.

Consequently, we describe the Constelle-to-Event-B transformation by the following function:

$$
\textit{Constelle-to-Event-B} : \mathcal{P}(\mathit{SpecificationTemplate}) \to \mathit{SemanticDefinition} \to \mathcal{P}(\mathit{SemanticTemplate}) \tag{6.1}
$$

Here the transformation applies a library (a set) of SpecificationTemplates to a SemanticDefinition of a DSL; and generates as an output a collection of SemanticTemplates that implement all semantic modules of the input semantic definition. All constructs used in the formulas of this chapter correspond to the similarly-named classes depicted in Figure 6.2, such as SemanticTemplate and SemanticDefinition. Figure 6.2 should be consulted in order to understand which classes participate in a transformation and how these classes are related to each other.
Figure 6.2: Metamodels of Constelle, specification templates, and Event-B combined together

As described in Chapter 4, each semantic module of a Constelle model substitutes static parameters of the invoked specification templates with the static parameters representing the DSL constructs – such as the types Actions, ArmActions, and HandActions in the DSL used as an example in Chapter 4. According to the Constelle metamodel depicted in Figure 6.2, such StaticParameters are contained in a StructuralInterface, used by a SemanticModule (as a subclass of a SemanticInterface). For the sake of simplicity, we assume that all semantic modules of a Constelle model use the same structural interface, which introduces all necessary DSL constructs (types, relations, and constants). In Figure 6.2 we depict such an interface through the reference dslStructure from SemanticDefinition to StructuralInterface.

To ensure that the resulting Event-B machines specify the DSL semantics in terms of these concepts, we assume that this structural interface is implemented in an Event-B context beforehand. In practice such an Event-B context can be automatically generated from the DSL metamodel, for example using the UML-B approach [91]. Like all other Event-B code in Constelle, such an Event-B context is wrapped in the corresponding structural template. We treat such a structural template as a global constant (or environment) of the Constelle-to-Event-B transformation, and, therefore, do not include it in the function definitions discussed in this section.

Transformation (6.1) can be implemented using the following (sub-)transformation that considers a single SemanticModule of a semantic definition and results in a single SemanticTemplate correspondingly:

$$
\mathit{ExpandDefinition} : \mathcal{P}(\mathit{SemanticTemplate}) \to \mathit{SemanticModule} \to \mathit{SemanticTemplate} \tag{6.2}
$$

To transform all semantic modules of a semantic definition using their dependencies on (invocations of) each other, we apply transformation (6.2) to the nodes of the DAG of semantic modules starting from the sinks towards the sources. In this way we ensure that when a semantic module is to be transformed, all its aspects (invocations of SemanticInterfaces, see Figure 6.2) have the corresponding implementations in the form of semantic templates.

The transformation ExpandDefinition realizes the key function of the Constelle semantic mapping. It implements the two mechanisms of the specification templates approach: substitution of parameters in a generic template, and invocation (composition) of specialized templates. These mechanisms are implemented as two (separate) steps: Substitute and Compose – which are connected into a chain of model transformations. The third step of the transformations chain, GlueOperations, adds the gluing guards to the composition of the specialized templates.

$$
\begin{aligned}
&\mathit{ExpandDefinition}(\mathit{lib})(\mathit{module}) = \\
&\quad \mathit{Compose}(\mathit{module})\big(\{\, \mathit{Substitute}(a, t) \mid a \in \mathit{module.aspects} \wedge t \in \mathit{lib} \wedge t.\mathit{implements} = a.\mathit{invokes} \,\}\big) \\
&\quad \circ\ \mathit{GlueOperations}(\mathit{module.gluingGuards},\ \mathit{lib} \cap \mathit{ConstraintTemplate})
\end{aligned}
\tag{6.3}
$$

We discuss the details of the function application of Substitute, Compose, and GlueOperations (i.e. arguments of the mappings appearing in formula (6.3)) and their signatures in the following sections.

As discussed earlier in Section 6.1, we define the semantic mapping from Constelle to Event-B in terms of the Event-B∗ techniques. As specification templates wrap Event-B code, the transformations Substitute, Compose, and GlueOperations build on top of (or wrap) the Event-B∗ techniques, which manipulate the back-end Event-B code. Thus, we give the definition of the functions Substitute, Compose, and GlueOperations in terms of the functions of the Event-B∗ techniques (formulas (6.5), (6.14), and (6.23); see Sections 6.3, 6.4, and 6.5).
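The order in which transformation (6.2) visits the DAG of semantic modules can be sketched in Python. This is a minimal illustration, not the QVTo implementation: the names `expand_all` and `toy_expand` and the module `Workcell` are hypothetical, and the sketch assumes that the template generated for a module is stored in the library under the module's own name. The standard-library `graphlib.TopologicalSorter` yields dependencies first, which corresponds to going from the sinks towards the sources.

```python
from graphlib import TopologicalSorter

def expand_all(library, modules, expand_definition):
    """Apply expand_definition to every module, sinks of the DAG first.

    library: dict mapping an interface name to its template
    modules: dict mapping a module name to the interface names its aspects invoke
    expand_definition: callable (library, name, invoked) -> template
    """
    # Only dependencies on other modules constrain the order;
    # interfaces already implemented in the library do not.
    graph = {name: {i for i in invoked if i in modules}
             for name, invoked in modules.items()}
    lib = dict(library)
    order = []
    for name in TopologicalSorter(graph).static_order():
        missing = [i for i in modules[name] if i not in lib]
        if missing:
            raise KeyError(f"{name}: no template implements {missing}")
        # Assumption: the result is registered under the module's own name.
        lib[name] = expand_definition(lib, name, modules[name])
        order.append(name)
    return lib, order

def toy_expand(lib, name, invoked):
    # Stand-in for formula (6.3): only records which templates were combined.
    return f"{name}[" + ", ".join(lib[i] for i in invoked) + "]"

library = {"Queue": "Queue", "Request": "Request"}
modules = {
    "Workcell": ["RoboticArmParallel"],          # hypothetical module
    "RoboticArmParallel": ["Request", "Queue"],
}
result, order = expand_all(library, modules, toy_expand)
print(order)
print(result["Workcell"])
```

A cyclic set of invocations makes `static_order()` raise `graphlib.CycleError`, which matches the requirement that the semantic modules form a DAG.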

Figure 6.3 provides an overview of the mappings discussed in this chapter and how they are related to each other. Each of the mappings in Figure 6.3 has a reference to its signature and/or to its formula (for example, Substitute (6.4, 6.5)). The functions of the Event-B∗ techniques are highlighted in green. The organizational structure of the Constelle-to-Event-B transformation depicted in Figure 6.3 shows how composite mappings are decomposed into transformation chains of constituent mappings. Such a decomposition is depicted using the ‘◦’ symbol.

Figure 6.3: Structure of the Constelle-to-Event-B model transformation

Note that every decomposition of a mapping into a chain of mappings (i.e. every step to the right along the structure depicted in Figure 6.3) is accompanied by a step towards using a more fine-grained structure (i.e. a top-down step along the class hierarchy of the metamodel depicted in Figure 6.2). In other words, constituent mappings use more details of the Constelle structure than composite mappings. Therefore, we do not describe signatures of constituent mappings at the same place where a composite mapping applies (invokes) these constituent mappings (i.e. in the subsection that describes the composite mapping), but rather explain each of the constituent mappings in a subsection dedicated to it.
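The two devices used throughout these formulas, chaining of mappings and partial application such as ExpandDefinition(lib)(module), can be sketched in Python. This is an illustration only: `chain`, `curry`, `qualify`, and `targets` are hypothetical helpers, and the sketch assumes that a chain is read from left to right, with the output of one mapping fed into the next.

```python
from functools import reduce

def chain(*steps):
    """Forward composition: chain(f, g, h)(x) == h(g(f(x)))."""
    return lambda x: reduce(lambda acc, step: step(acc), steps, x)

def curry(f):
    """Translate f : A x B -> C into g : A -> (B -> C) with g(a)(b) == f(a, b)."""
    return lambda a: lambda b: f(a, b)

# Hypothetical constituent mappings, used only to exercise the helpers.
def qualify(namespace, names):
    return {name: f"{namespace}_{name}" for name in names}

def targets(renaming):
    return sorted(renaming.values())

# Fixing the first argument yields a one-argument mapping that fits in a chain.
composite = chain(curry(qualify)("driver1"), targets)
print(composite(["queue", "index"]))  # ['driver1_index', 'driver1_queue']
```

Fixing the configuration argument first leaves a function whose single input is exactly the output of the previous step, which is what makes the steps composable.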

6.3 Substitution

Substitution of parameters in an invoked semantic template is the first step of the transformation ExpandDefinition. It takes a SemanticTemplate from the library and performs its TemplateInvocation by transforming it into a new SemanticTemplate. The resulting SemanticTemplate implements the semantic interface of the semantic module being composed (compared to the input SemanticTemplate that implements a generic SemanticInterface from the library). The Substitute transformation can be described by the following function:

$$
\mathit{Substitute} : \mathit{TemplateInvocation} \to \mathit{SemanticTemplate} \to \mathit{SemanticTemplate} \tag{6.4}
$$

Correspondingly, the transformation ExpandDefinition (6.3) applies Substitute (6.4) to all aspects of its input semantic module: $\mathit{Substitute}(a, t)$ for $a \in \mathit{module.aspects}$ (see the second line of formula (6.3)). For this, ExpandDefinition finds a corresponding template in the library, i.e. the template that implements the semantic interface invoked in the aspect: $t \in \mathit{lib} \wedge t.\mathit{implements} = a.\mathit{invokes}$.

The Substitute transformation realizes four objectives:

• duplication of the specification elements (in particular, template Operations) that are invoked (i.e. substituted) several times;
• specialization of the template parameters with the DSL constructs using InterfaceElementSubstitutions of StaticParameters;
• preparation for further composition of the template by assigning new identifiers to its Operations and DynamicParameters – according to InterfaceElementSubstitutions of these elements with elements of the target (composite) specification;
• encapsulation of the specification elements that do not implement any interface elements (PrivateElements) through extending their identifiers with a proper namespace identifier (to avoid possible name conflicts in the target specification) – using as the namespace the aspect that invokes this template.

These objectives are realized in the four consecutive steps connected into a chain of transformations:

$$
\begin{aligned}
&\mathit{Substitute}(\mathit{inv}, \mathit{tmpl}) = \\
&\quad \mathit{Normalize}(\mathit{inv.elementsSubstitution},\ \mathit{tmpl.elements} \cup \mathit{tmpl.uses.elements}) \\
&\quad \circ\ \mathit{ComposeRenaming}(\mathit{inv.name}) \\
&\quad \circ\ \mathit{GenericInstantiation}(\mathit{tmpl.eventbmachine}) \\
&\quad \circ\ \mathit{ReconstructInstantiatedTemplate}(\mathit{inv.elementsSubstitution}, \mathit{tmpl})
\end{aligned}
\tag{6.5}
$$

The first step normalizes the invocation of the specification template by ensuring that each Operation of the semantic module substitutes the corresponding unique Operation of the specification template. For example, in the Constelle definition of the semantic module Robotic Arm Parallel (see Table 4.2 on page 64), the operation process of the Request template is substituted by two operations of the semantic module: armActionStm and handActionStm. This means that the Event-B code and the specification elements wrapping this code into the operation process should be duplicated in order to implement these two separate operations of the semantic module.

The other three steps of the Substitute mapping realize the actual substitution of the template parameters (i.e. static and dynamic interface elements) by renaming Event-B elements in the back-end specification code. For this, ComposeRenaming configures the renaming scheme. The mapping GenericInstantiation performs generic instantiation of the Event-B machine according to this renaming scheme. The mapping ReconstructInstantiatedTemplate is an auxiliary transformation that wraps the resulting Event-B machine into the corresponding semantic template. We discuss these steps in the next four subsections.
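The duplication performed by the first step can be sketched in Python on the Request example. This is a simplified illustration under several assumptions: substitutions are modelled as (actual, formal) name pairs, template operations as a dictionary from operation name to a placeholder body, the first use of an operation keeps the original while every further use receives a numbered clone, and dynamic parameters are left out. The function name `normalize` and the placeholder bodies are hypothetical.

```python
def normalize(substitutions, elements, static_params=frozenset()):
    """Make the substitution relation injective by cloning reused elements.

    substitutions: list of (actual, formal) name pairs
    elements: dict mapping a template operation name to its body
    static_params: formal names that are never duplicated
    """
    new_subst, new_elems = [], dict(elements)
    uses = {}
    for actual, formal in substitutions:
        if formal in static_params:
            new_subst.append((actual, formal))
            continue
        count = uses[formal] = uses.get(formal, 0) + 1
        if count == 1:
            # First use keeps the original element.
            new_subst.append((actual, formal))
        else:
            # Later uses get a clone with the same body.
            # Assumption: the numbered name does not clash with an existing one.
            clone = f"{formal}{count}"
            new_elems[clone] = elements[formal]
            new_subst.append((actual, clone))
    return new_subst, new_elems

subst = [
    ("taskStm", "request"),
    ("armActionStm", "process"),
    ("handActionStm", "process"),
    ("Actions", "Elements"),
]
elements = {"request": "<body of request>", "process": "<body of process>"}

new_subst, new_elems = normalize(subst, elements, static_params={"Elements"})
print(new_subst)
print(sorted(new_elems))
```

After normalization every formal element is substituted at most once, so the later renaming step can treat the substitution as a function from template elements to target names.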

6.3.1 Normalization of an invocation

As discussed above, the Normalize transformation updates an invocation of a specification template in such a way that each of the operations of the semantic module invokes a unique (separate) operation of the specification template. This means that each InterfaceElementSubstitution of the input TemplateInvocation should substitute exactly one SpecificationElement of the input SemanticTemplate; or that the substitution relation (realized by the set of InterfaceElementSubstitutions, see Figure 6.2) should be injective. The corresponding update of the set of InterfaceElementSubstitutions and the set of SpecificationElements is captured in the following signature of the Normalize transformation:

$$
\begin{aligned}
\mathit{Normalize} :\ & \mathcal{P}(\mathit{InterfaceElementSubstitution}) \times \mathcal{P}(\mathit{SpecificationElement}) \to \\
& \mathcal{P}(\mathit{InterfaceElementSubstitution}) \times \mathcal{P}(\mathit{SpecificationElement})
\end{aligned}
\tag{6.6}
$$

For each SpecificationElement that is invoked (substituted) several times, the Normalize transformation creates its copy (clone) and updates the references from the corresponding InterfaceElementSubstitution. For example, in the Constelle definition of the example DSL presented in Chapter 4 (see Table 4.2 on page 64), the template invocation distributor: Request (the rightmost column in Table 4.2) has the following set of InterfaceElementSubstitutions (here each InterfaceElementSubstitution is presented in the form of an ordered pair of its actual and formal InterfaceElements, i.e. an element that substitutes and an element that should be substituted):

$$
\left\{
\begin{array}{ll}
(\mathit{taskStm} \mapsto \mathit{request}), & (\mathit{task} \mapsto \mathit{elements}), \\
(\mathit{armActionStm} \mapsto \mathit{process}), & (\mathit{action} \mapsto \mathit{element}), \\
(\mathit{handActionStm} \mapsto \mathit{process}), & (\mathit{action} \mapsto \mathit{element}), \\
(\mathit{Actions} \mapsto \mathit{Elements}) &
\end{array}
\right\}
$$

Here the operation process is invoked (substituted) two times: by armActionStm and handActionStm. The same holds for the dynamic parameter of this operation: element is substituted by two different parameters action. To ensure that in the resulting Event-B specification both armActionStm and handActionStm are implemented properly and include the functionality of the process operation, we need to duplicate (clone) the process operation and the Event-B code that implements it. Thus, a new set of InterfaceElementSubstitutions (i.e. substitution relation) should realize an injective relation:

$$
\left\{
\begin{array}{ll}
(\mathit{taskStm} \mapsto \mathit{request}), & (\mathit{task} \mapsto \mathit{elements}), \\
(\mathit{armActionStm} \mapsto \mathit{process}), & (\mathit{action} \mapsto \mathit{element}), \\
(\mathit{handActionStm} \mapsto \mathit{process2}), & (\mathit{action} \mapsto \mathit{element}), \\
(\mathit{Actions} \mapsto \mathit{Elements}) &
\end{array}
\right\}
$$

Moreover, the new (normalized) Event-B machine should contain two corresponding events: process and process2. However, the bodies (code) of these events should be the same (i.e. clones of each other), including their parameters element. The Normalize mapping that realizes such a transformation is captured in the following formula:

$$
\begin{aligned}
&\mathit{Normalize}(\mathit{substset}, \mathit{specset}) = \\
&\quad \big(\, \mathit{specset} \cup \{\, c \mid c \mapsto s \in \mathit{clones} \,\}, \\
&\quad\ \ \mathit{substset} \setminus \{\, s \mid c \mapsto s \in \mathit{clones} \,\} \cup \{\, c.\mathit{implements} \mapsto s.\mathit{actual} \mid c \mapsto s \in \mathit{clones} \,\} \,\big) \\
&\text{where } \mathit{clones} = \{\, \mathit{Clone}(\mathit{el}, s_1) \mapsto s_1 \mid \\
&\quad \mathit{el} \in \mathit{specset} \cap \mathit{PublicElement} \wedge \mathit{el}.\mathit{implements} \notin \mathit{StaticParameter} \\
&\quad \wedge\ s_1 \in \mathit{substset} \wedge s_1.\mathit{formal} = \mathit{el}.\mathit{implements} \\
&\quad \wedge\ \exists s_2 \in \mathit{substset} : s_1 \neq s_2 \wedge s_1.\mathit{formal} = s_2.\mathit{formal} \,\}
\end{aligned}
\tag{6.7}
$$

Here for each PublicElement of the input set of specification elements ($\mathit{el} \in \mathit{specset} \cap \mathit{PublicElement}$, line 5 of the formula) we check if it is substituted in several InterfaceElementSubstitutions ($s_1$ and $s_2$, last two lines of the formula). For example, $s_1$ can be $(\mathit{armActionStm} \mapsto \mathit{process})$ and $s_2$ can be $(\mathit{handActionStm} \mapsto \mathit{process})$. For such $s_1$ and $s_2$ we duplicate (clone) the specification element ($\mathit{Clone}(\mathit{el}, s_1)$) and collect the cloned element and the original InterfaceElementSubstitution ($s_1$) paired together in an intermediate variable ‘clones’. The condition $\mathit{el}.\mathit{implements} \notin \mathit{StaticParameter}$ ensures that we consider (duplicate) only Operations and DynamicParameters of the specification template (so we do not consider pairs like $(\mathit{Actions} \mapsto \mathit{Elements})$ in the example above).

The intermediate variable ‘clones’ is used to update the input sets of InterfaceElementSubstitutions (substset) and of SpecificationElements (specset) in the following way. The cloned specification elements ($c$) are added to the set ‘specset’ (second line of the formula). In the set ‘substset’ (third line of the formula), the InterfaceElementSubstitutions that substitute the same SpecificationElements are replaced (we remove the set $\{\, s \mid c \mapsto s \in \mathit{clones} \,\}$) with the InterfaceElementSubstitutions that substitute the corresponding clones instead (we add the set $\{\, c.\mathit{implements} \mapsto s.\mathit{actual} \mid c \mapsto s \in \mathit{clones} \,\}$).

The Clone mapping (which is used in formula (6.7)) clones (i.e. constructs a copy of) a specification element for an input InterfaceElementSubstitution. The resulting clone includes copies of the Event-B and interface counterparts of the input SpecificationElement (i.e. objects referenced through the eventbElement and implements associations of the SpecificationElement are also cloned). The Clone mapping is described in the following signature:

$$
\mathit{Clone} : \mathit{SpecificationElement} \times \mathit{InterfaceElementSubstitution} \to \mathit{SpecificationElement} \tag{6.8}
$$

6.3.2 Configuration of renaming

The configuration of a renaming scheme is generated in a transformation ComposeRenaming. This transformation applies a set of InterfaceElementSubstitutions to a set of SpecificationElements (of the invoked semantic template). A namespace extension (String) is used for the encapsulation of the PrivateElements (i.e. SpecificationElements that are not referenced by InterfaceElementSubstitutions). The result of the transformation, $\mathit{EventBNamedCommentedElement} \mapsto \mathit{String}$, is a partial function of renamings for the Event-B elements of the specification. The transformation is described by the following function:

$$
\begin{aligned}
\mathit{ComposeRenaming} :\ & \mathit{String} \to \mathcal{P}(\mathit{InterfaceElementSubstitution}) \times \mathcal{P}(\mathit{SpecificationElement}) \to \\
& (\mathit{EventBNamedCommentedElement} \mapsto \mathit{String})
\end{aligned}
\tag{6.9}
$$

The transformation ComposeRenaming prepares the substitution of public specification elements and encapsulation of private specification elements. For this, the transformation is implemented in the following way:

$$
\begin{aligned}
&\mathit{ComposeRenaming}(\mathit{nmspc})(\mathit{substset}, \mathit{specset}) = \\
&\quad \{\, s.\mathit{eventbElement} \mapsto x.\mathit{actual.name} \mid \\
&\qquad s \in \mathit{specset} \cap \mathit{PublicElement} \wedge x \in \mathit{substset} \wedge x.\mathit{formal} = s.\mathit{implements} \,\} \\
&\quad \cup\ \{\, s.\mathit{eventbElement} \mapsto \mathit{nmspc} + s.\mathit{eventbElement.name} \mid \\
&\qquad s \in \mathit{specset} \cap \mathit{PrivateElement} \,\}
\end{aligned}
\tag{6.10}
$$

In other words, the ComposeRenaming transformation considers public and private specification elements ($s \in \mathit{specset}$) separately. For the former, it finds an interface element substitution ($x$) that substitutes the interface element implemented by this specification element ($x.\mathit{formal} = s.\mathit{implements}$); and uses the name of the target interface element for renaming ($x.\mathit{actual.name}$).

For the private specification elements, the transformation simply extends their names with the namespace: $s.\mathit{eventbElement} \mapsto \mathit{nmspc} + s.\mathit{eventbElement.name}$.

For example, when we apply this function to the template invocation driver1: Queue of the semantic module Robotic Arm Parallel defined in Table 4.2 (on page 64), we use the InterfaceElementSubstitutions listed in the table rows to substitute (i.e. rename) elements from the second column with elements from the leftmost column. As a namespace extension we use the aspect name, i.e. ‘driver1’. The input set of specification elements is taken from the specification template Queue. As a result we get the renaming function depicted in Figure 6.4(a).
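Formula (6.10) can be sketched in Python as a dictionary comprehension over public and private elements. This is a minimal illustration with several assumptions: elements are represented by their Event-B names only, the interface elements of the Queue template are assumed to carry the same names as the Event-B elements that implement them, and the namespace is joined with an underscore so that the result matches the names ‘driver1_queue’ and ‘driver1_index’ used in the example. The function name `compose_renaming` is hypothetical.

```python
def compose_renaming(namespace, substitutions, public, private):
    """Build the renaming for the Event-B elements of a template.

    namespace: name of the invoking aspect
    substitutions: iterable of (actual, formal) interface element names
    public: dict mapping an Event-B element name to the interface element it implements
    private: iterable of Event-B element names without an interface element
    """
    renaming = {}
    # Public elements take the name of the element that substitutes them.
    for eventb_name, implemented in public.items():
        for actual, formal in substitutions:
            if formal == implemented:
                renaming[eventb_name] = actual
    # Private elements are encapsulated by prefixing the namespace.
    # Assumption: an underscore separates namespace and original name.
    for eventb_name in private:
        renaming[eventb_name] = f"{namespace}_{eventb_name}"
    return renaming

substitutions = [
    ("ArmActions", "ElementType"),
    ("armActionStm", "enqueue"),
    ("executeArm", "dequeue"),
]
# Assumption: interface elements share the names of their Event-B counterparts.
public = {"ElementType": "ElementType", "enqueue": "enqueue", "dequeue": "dequeue"}
private = ["queue", "index"]

renaming = compose_renaming("driver1", substitutions, public, private)
print(renaming)
```

A public element without a matching substitution receives no entry, which is why the result is a partial rather than a total function on the Event-B elements.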

(a) Renaming function

    ElementType ↦ ‘ArmActions’,
    queue ↦ ‘driver1_queue’,
    enqueue ↦ ‘armActionStm’,
    (enq)element ↦ ‘action’,
    (enq)index ↦ ‘driver1_index’,
    dequeue ↦ ‘executeArm’,
    (deq)element ↦ ‘action’,
    (deq)index ↦ ‘driver1_index’

(b) Fragment of the instantiated Event-B machine

    MACHINE driver1_queue_machine
    SEES dsl_context
    VARIABLES
      driver1_queue
    INVARIANTS
      inv1 : driver1_queue ∈ ℕ ⇸ ArmActions
    EVENTS
      Initialisation
        begin
          act1 : driver1_queue := ∅
        end
      Event armActionStm ≙
        any action, driver1_index
        where
          grd1 : action ∈ ArmActions
          grd2 : driver1_index ∈ ℕ
          grd3 : driver1_queue ≠ ∅ ⇒ (∀i · i ∈ dom(driver1_queue) ⇒ driver1_index > i)
        then
          act2 : driver1_queue := driver1_queue ∪ {driver1_index ↦ action}
        end
      Event executeArm ≙
        ...

(c) Event-B context that implements structural interface of the example DSL

    CONTEXT dsl_context
    SETS
      Actions, ArmActions, HandActions
    AXIOMS
      axm1 : partition(Actions, ArmActions, HandActions)
    END

Figure 6.4: Substitution of the Queue template for the arm aspect

6.3.3 Generic instantiation

As described in Chapter 2, generic instantiation was introduced by Abrial et al. [4] and developed in detail by Silva and Butler [88]. Generic instantiation realizes reuse of an Event-B specification by considering it as a generic model and instantiating it into a more specific model. For this, the context of an Event-B specification C(s, c) is considered as its parameterization. The sets s and the constants c introduced in the context play the role of parameters of the generic model; and the axioms A(s, c) capture their properties, i.e. requirements on the parameters. An instantiation of such a specification uses an Event-B context D(ds, dc) with more specific sets ds and constants dc, featuring more specific properties captured in the axioms DA(ds, dc). An instantiation of the generic Event-B machine is performed by replacing its parameters s and c with the instance elements ds and dc. Moreover, the variables, events, and parameters of the generic machine can be renamed in the instantiated machine.

In practice, generic instantiation is implemented by syntactically replacing generic elements with instance elements (sets, constants, variables, events, and their parameters), or renaming generic elements into instance elements. For example, Figure 6.4(b) shows an Event-B machine constructed as a result of generic instantiation of the Queue (template) machine (depicted in Figure 4.4(c) on page 59). For this generic instantiation we used the renaming configuration depicted in Figure 6.4(a).

According to [88], the resulting instantiated machine is correct by construction if the following conditions hold.

• The requirements on the parameters of the generic specification hold for its instantiation. This means that the properties A(ds, dc) of the generic specification can be derived from the properties DA(ds, dc) of the specific (instantiated) specification. In practice, this is done by stating the generic properties A(ds, dc) as theorems, and thus triggering the generation of the corresponding proof obligations. In our example, the context of the Queue template (Figure 4.4(a)) does not contain axioms. Therefore, no generic properties need to be proved.
• Each set (in the generic specification) must be replaced by a set or by a valid type expression (in the instantiated specification); and each constant must be replaced by a constant. In our approach we ensure that this requirement is met by providing a proper implementation of the structural interface used in the Constelle definition (an Event-B context wrapped in a structural template, as discussed earlier in this section). An example of such a context for Robotic Arm Parallel is depicted in Figure 6.4(c).

To apply (call) generic instantiation in our definition of the semantics of Constelle, we represent it by the following function:

$$
\mathit{GenericInstantiation} : \mathit{Machine} \to (\mathit{EventBNamedCommentedElement} \mapsto \mathit{String}) \to \mathit{Machine} \tag{6.11}
$$

Here a Machine (for example, the Queue machine from Figure 4.4(c) on page 59) is instantiated using a partial function of renaming Event-B elements $\mathit{EventBNamedCommentedElement} \mapsto \mathit{String}$ (for example, the renaming function from Figure 6.4(a)), and as a result a new (instantiated) Machine is generated (such as the machine depicted in Figure 6.4(b)).
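The syntactic nature of generic instantiation can be illustrated by a small Python sketch that renames identifiers in the text of a machine. This is not the Rodin plug-in and makes several assumptions: the machine is a list of text lines, the generic event shown here is a hypothetical fragment written in ASCII Event-B syntax (`:` for membership, `\/` for union, `|->` for a maplet), and the renaming is applied globally rather than per event. The names `rename_text` and `generic_instantiation` are illustrative.

```python
import re

# An identifier is a letter or underscore followed by word characters.
IDENT = re.compile(r"[A-Za-z_]\w*")

def rename_text(text, renaming):
    """Replace whole identifiers in one pass, so results are not renamed again."""
    return IDENT.sub(lambda m: renaming.get(m.group(0), m.group(0)), text)

def generic_instantiation(machine, renaming):
    """machine: list of text lines; returns the instantiated lines."""
    return [rename_text(line, renaming) for line in machine]

# Hypothetical fragment of a generic queue machine.
generic = [
    "event enqueue",
    "  any element index",
    "  where grd1: element : ElementType",
    "  then act1: queue := queue \\/ {index |-> element}",
]
renaming = {
    "ElementType": "ArmActions",
    "queue": "driver1_queue",
    "enqueue": "armActionStm",
    "element": "action",
    "index": "driver1_index",
}

instance = generic_instantiation(generic, renaming)
print("\n".join(instance))
```

Matching whole identifiers keeps `enqueue` from being touched by the entry for `queue`, and the single pass means that a target name such as `action` is never renamed a second time.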

6.3.4 Substitution: chain of transformations

As specified earlier in formula (6.5) (and repeated here for convenience of reading), the Substitute mapping is defined as a function composition of the transformations Normalize (6.6), ComposeRenaming (6.9), GenericInstantiation (6.11), and an auxiliary transformation ReconstructInstantiatedTemplate (which is discussed further):

$$
\begin{aligned}
&\mathit{Substitute}(\mathit{inv}, \mathit{tmpl}) = \\
&\quad \mathit{Normalize}(\mathit{inv.elementsSubstitution},\ \mathit{tmpl.elements} \cup \mathit{tmpl.uses.elements}) \\
&\quad \circ\ \mathit{ComposeRenaming}(\mathit{inv.name}) \\
&\quad \circ\ \mathit{GenericInstantiation}(\mathit{tmpl.eventbmachine}) \\
&\quad \circ\ \mathit{ReconstructInstantiatedTemplate}(\mathit{inv.elementsSubstitution}, \mathit{tmpl})
\end{aligned}
\tag{6.5 revisited}
$$

Here the Normalize transformation takes as an input InterfaceElementSubstitutions of the template invocation (inv.elementsSubstitution) and SpecificationElements of both semantic template (tmpl.elements) and of its static template (tmpl.uses.elements). The resulting normalized sets of InterfaceElementSubstitutions and SpecificationElements are used directly (◦) as an input for ComposeRenaming. As a namespace extension we use the name of the template invocation (i.e. the name of the aspect in the semantic module): inv.name. The resulting renaming is applied directly (◦) to instantiate the Event-B machine of the template (tmpl.eventbmachine).

The output of GenericInstantiation is translated into the output of Substitute using the auxiliary transformation ReconstructInstantiatedTemplate described by the following function:

$$
\mathit{ReconstructInstantiatedTemplate} : \mathbb{P}(\mathit{InterfaceElementSubstitution}) \times \mathit{SemanticTemplate} \to \mathit{Machine} \to \mathit{SemanticTemplate}
\tag{6.12}
$$

The transformation ReconstructInstantiatedTemplate (6.12) generates a SemanticTemplate that wraps the newly generated (instantiated) Event-B Machine. This step is necessary because generic instantiation is an Event-B∗ technique implemented by one of the Rodin plug-ins. To be able to build on top of this Rodin plug-in, we need to translate the output of GenericInstantiation into the constructs of Constelle. For this, ReconstructInstantiatedTemplate (6.12) takes the original (invoked) SemanticTemplate and traces back its specification elements to the Event-B elements of the machine through the set of InterfaceElementSubstitutions that have been applied to the original template (inv.elementsSubstitution in the Substitute mapping (6.5)).

To match all inputs and outputs of the functions being composed (transformations being chained) in function composition (6.5), we use currying according to the technique described in Chapter 5. Currying realizes the translation of a function of the form f : A × B → C to a function of the form g : A → (B → C), by setting g(a)(b) = f(a, b):

$$
\begin{aligned}
\mathit{Normalize} :{} & \mathbb{P}(\mathit{InterfaceElementSubstitution}) \times \mathbb{P}(\mathit{SpecificationElement}) \\
& \to \mathbb{P}(\mathit{InterfaceElementSubstitution}) \times \mathbb{P}(\mathit{SpecificationElement}) \\
\mathit{ComposeRenaming}(\mathit{nmspc}) :{} & \mathbb{P}(\mathit{InterfaceElementSubstitution}) \times \mathbb{P}(\mathit{SpecificationElement}) \\
& \to (\mathit{EventBNamedCommentedElement} \rightharpoonup \mathit{String}) \\
\mathit{GenericInstantiation}(\mathit{machine}) :{} & (\mathit{EventBNamedCommentedElement} \rightharpoonup \mathit{String}) \to \mathit{Machine} \\
\mathit{ReconstructInstantiatedTemplate}(\mathit{substset}, \mathit{tmpl}) :{} & \mathit{Machine} \to \mathit{SemanticTemplate}
\end{aligned}
$$

6.4 Composition

Composition of the substituted semantic templates is the second step of the transformation ExpandDefinition. According to the design of Constelle, introduced in Chapter 4 and realized in the metamodel depicted in Figure 6.2, composition of semantic templates is defined through sharing interface elements of the (composite) semantic module between (constituent) semantic templates (or template instances). In other words, the SemanticModule has a SemanticInterface (see Figure 6.2); the aspects of this module invoke SemanticTemplates, each of which implements this SemanticInterface. Therefore, the Compose transformation takes as an input the (shared) SemanticInterface and the set of (constituent) SemanticTemplates and constructs a new (composite) SemanticTemplate:

$$
\mathit{Compose} : \mathit{SemanticInterface} \to \mathbb{P}(\mathit{SemanticTemplate}) \to \mathit{SemanticTemplate}
\tag{6.13}
$$

The configuration in which all SemanticTemplates implement the same SemanticInterface results from the previous step of the transformation, the Substitute mapping, where specification templates from the library are substituted with the elements of the SemanticInterface of the input semantic module. This is captured in formula (6.3), which describes the transformation ExpandDefinition (repeated here for the reader's convenience):

$$
\begin{aligned}
\mathit{ExpandDefinition}(\mathit{lib})(\mathit{module}) ={} & \mathit{Compose}(\mathit{module}) \\
& \bigl(\{\, \mathit{Substitute}(a, t) \mid a \in \mathit{module.aspects} \land t \in \mathit{lib} \land \mathit{t.implements} = \mathit{a.invokes} \,\}\bigr) \\
& \circ\ \mathit{GlueOperations}(\mathit{module.gluingGuards},\ \mathit{lib} \cap \mathit{ConstraintTemplate})
\end{aligned}
\tag{6.3 revisited}
$$

ExpandDefinition applies Compose to its input semantic module (as it is a semantic interface itself, see Figure 6.2) and to the semantic templates resulting from the Substitute mapping applied to the aspects of the module (the first and second lines of the formula). The composition of semantic templates builds on top of an Event-B∗ technique, shared event composition (see Figure 6.1). This means that an Event-B machine of the resulting semantic template is composed of the Event-B machines of the input semantic templates using shared event composition:

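$$
\begin{aligned}
\mathit{Compose}(\mathit{module})(\mathit{tmplts}) ={} & \mathit{SharedEventComposition}(\{\, \mathit{t.eventbmachine} \mid t \in \mathit{tmplts} \,\})(\mathit{config}) \\
& \circ\ \mathit{ReconstructComposedTemplate}(\mathit{module}) \\
\text{where } \mathit{config} ={} & \Bigl\{ \bigl( \{\, \mathit{x.eventbElement} \mid \mathit{x.implements} = \mathit{op} \land x \in \mathit{t.elements} \land t \in \mathit{tmplts} \,\}, \\
& \quad \bigl\{ \{\, \mathit{x.eventbElement} \mid \mathit{x.implements} = \mathit{dp} \land x \in \mathit{t.elements} \land t \in \mathit{tmplts} \,\} \bigm| \mathit{dp} \in \mathit{op.signature} \bigr\} \bigr) \\
& \bigm| \mathit{op} \in \mathit{module.interface} \Bigr\}
\end{aligned}
\tag{6.14}
$$

In the next two subsections we present SharedEventComposition and discuss how it is configured (intermediate variable ‘config’) and how the resulting (composite) Event-B machine is wrapped into a specification (semantic) template (using the ReconstructComposedTemplate mapping).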
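The derivation of config in (6.14) can be sketched in Python as follows. The encoding is an assumption made for illustration: an interface is a dictionary from operation names to their dynamic parameters, a template is a list of (Event-B element, implemented interface element) pairs, and dynamic parameters carry qualified names such as `armActionStm.action` so that they are unique across operations.

```python
def derive_config(interface, templates):
    """Sketch of 'config' in (6.14).

    interface: dict mapping operation -> list of its dynamic parameters (op.signature)
    templates: list of templates, each a list of (eventb_element, implements) pairs
    """
    config = []
    for op, signature in interface.items():
        # Events implementing the same operation are synchronized.
        events = frozenset(x for t in templates for (x, impl) in t if impl == op)
        # For each dynamic parameter, the Event-B parameters implementing it are shared.
        params = frozenset(
            frozenset(x for t in templates for (x, impl) in t if impl == dp)
            for dp in signature
        )
        config.append((events, params))
    return config


# Hypothetical fragment of the Robotic Arm Parallel module.
interface = {"taskStm": [], "armActionStm": ["armActionStm.action"]}
driver1 = [
    ("driver1_queue/armActionStm", "armActionStm"),
    ("driver1_queue/action", "armActionStm.action"),
]
distributor = [
    ("distributor_request/taskStm", "taskStm"),
    ("distributor_request/armActionStm", "armActionStm"),
    ("distributor_request/action", "armActionStm.action"),
]
config = derive_config(interface, [driver1, distributor])
```

Each entry of the result is a pair of a set of synchronized events and a set of sets of shared parameters, which is the shape of configuration that shared event composition expects.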

6.4.1 Shared event composition

As we explained earlier in Chapter 2, shared event (de)composition was introduced as one of the mechanisms to support modularity of Event-B specifications [89]. Shared event (de)composition allows for decomposing an Event-B specification into a collection of independent sub-components that interact with each other. Each sub-component is specified in a separate constituent Event-B machine, which does not share its state (i.e. its variables) with the constituent machines of other sub-components. The sub-components interact with each other by sharing (synchronizing) events of the corresponding constituent machines. The synchronized events can exchange data via shared parameters. This mechanism is similar to the exchange of messages between synchronized input and output channels in Communicating Sequential Processes (CSP) [39].

Figure 6.5 shows an example of a machine composed for the semantic module Robotic Arm Parallel according to its definition given in Table 4.2 on page 64. This machine is composed of three machines: two instances of template_queue_machine (introduced in Figure 4.4(c)) for the aspects driver1 and driver2, and one instance of template_request_machine (introduced in Figure 4.4(d)) for the aspect distributor.

MACHINE robotic_arm_parallel
SEES dsl_context
VARIABLES driver1_queue, driver2_queue, distributor_request_body
INVARIANTS
    driver1_inv1 : driver1_queue ∈ ℕ ⇸ ArmActions
    driver2_inv1 : driver2_queue ∈ ℕ ⇸ HandActions
    distributor_inv1 : distributor_request_body ∈ ℙ(Actions)
EVENTS
    ...
    Event taskStm ≙
        any task
        where
            distributor_grd1 : task ∈ ℙ(Actions)
            distributor_grd2 : distributor_request_body = ∅
        then
            distributor_act2 : distributor_request_body := task
        end
    Event armActionStm ≙
        any action, driver1_index
        where
            driver1_grd1 : action ∈ ArmActions
            driver1_grd2 : driver1_index ∈ ℕ
            driver1_grd3 : driver1_queue ≠ ∅ ⇒ (∀i · i ∈ dom(driver1_queue) ⇒ driver1_index > i)
            distributor_grd3 : action ∈ distributor_request_body
        then
            driver1_act2 : driver1_queue := driver1_queue ∪ {driver1_index ↦ action}
            distributor_act3 : distributor_request_body := distributor_request_body \ {action}
        end
    ...

Figure 6.5: Fragment of the composed Event-B machine

To apply shared event composition in our definition of the Constelle semantics, we represent it by the following function:

$$
\mathit{SharedEventComposition} : \mathbb{P}(\mathit{Machine}) \to \mathbb{P}(\mathbb{P}(\mathit{Event}) \times \mathbb{P}(\mathbb{P}(\mathit{Parameter}))) \to \mathit{Machine}
\tag{6.15}
$$

Here a set of constituent Machines is composed into a new (composite) Machine using a configuration of type P(P(Event) × P(P(Parameter))). A configuration is formed as a set of composite events of the resulting machine. Each of these composite events is given by a pair consisting of a set of synchronized events P(Event) coming from different constituent machines and a set of sets of parameters P(P(Parameter)) shared by these events. Thus, an element of type P(Parameter) represents overlapping parameters of constituent events that form a single (shared) parameter in the resulting composite event.

For example, the machine depicted in Figure 6.5 is composed of the machines driver1_queue, driver2_queue, and distributor_request using the following configuration (here we use prefixes of the form ‘machine_name/’ to show from which constituent machine each event or parameter is taken):

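$$
\begin{aligned}
\bigl\{\ & (\{\mathit{distributor\_request/taskStm}\},\ \varnothing), \\
& (\{\mathit{driver1\_queue/armActionStm},\ \mathit{distributor\_request/armActionStm}\}, \\
& \quad \{\{\mathit{driver1\_queue/action},\ \mathit{distributor\_request/action}\}\}), \\
& (\{\mathit{driver2\_queue/handActionStm},\ \mathit{distributor\_request/handActionStm}\}, \\
& \quad \{\{\mathit{driver2\_queue/action},\ \mathit{distributor\_request/action}\}\}), \\
& \ldots\ \bigr\}
\end{aligned}
$$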

The resulting composite machine is constructed in the following way. The composite machine sees the contexts of all constituent specifications. In the example all constituent machines see the same context, dsl_context, that implements the structural interface of the Constelle definition. The list of variables of the composite machine is a concatenation of the (non-overlapping) lists of variables of the constituent machines: driver1_queue, driver2_queue, and distributor_request_body. Possible overlaps of variables, i.e. name conflicts (for example, the variable queue appears in two instances of the same template), are avoided via namespace extension (such as driver1_queue) performed during generic instantiation of the constituent machines (see Section 6.3). The invariants of the composite machine are a conjunction of the invariants of the constituent machines. These transformations are captured by the following formula:

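$$
\begin{aligned}
& \mathit{SharedEventComposition}(\mathit{machines}, \mathit{configuration}): \\
& \quad \mathit{sees} = \{\, \mathit{m.sees} \mid m \in \mathit{machines} \,\} \\
& \quad \mathit{variables} = \{\, \mathit{m.variables} \mid m \in \mathit{machines} \,\} \\
& \quad \mathit{invariants} = \{\, \mathit{m.invariants} \mid m \in \mathit{machines} \,\} \\
& \quad \mathit{events} = \{\, \mathit{ComposeEvent}(\mathit{ce}, \mathit{cp}) \mid \mathit{ce} \mapsto \mathit{cp} \in \mathit{configuration} \,\} \\
& \qquad \cup\ \{\, e \in \mathit{m.events} \mid m \in \mathit{machines} \land (\lnot\exists\, \mathit{ce} \mapsto \mathit{cp} \in \mathit{configuration} : e \in \mathit{ce}) \,\}
\end{aligned}
\tag{6.16}
$$

Each set of synchronized events is composed into one composite event (such as armActionStm in Figure 6.5) using the mapping ComposeEvent: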

$$
\mathit{ComposeEvent} : \mathbb{P}(\mathit{Event}) \times \mathbb{P}(\mathbb{P}(\mathit{Parameter})) \to \mathit{Event}
\tag{6.17}
$$

This mapping conjoins the guards of the constituent events and concatenates the actions of the constituent events. The parameters of a composite event are a union of the parameters of the constituent events with respect to shared (overlapping) parameters, such as action in armActionStm (p ∈ s | s ∈ sharedset in the following formula):

$$
\begin{aligned}
& \mathit{ComposeEvent}(\mathit{events}, \mathit{sharedset}): \\
& \quad \mathit{guards} = \{\, \mathit{e.guards} \mid e \in \mathit{events} \,\} \\
& \quad \mathit{actions} = \{\, \mathit{e.actions} \mid e \in \mathit{events} \,\} \\
& \quad \mathit{parameters} = \{\, p \in s \mid s \in \mathit{sharedset} \land (\forall x \in s \cdot \mathit{p.name} = \mathit{x.name}) \,\} \\
& \qquad \cup\ \{\, p \in \mathit{e.parameters} \mid e \in \mathit{events} \land (\lnot\exists\, s \in \mathit{sharedset} \cdot p \in s) \,\}
\end{aligned}
\tag{6.18}
$$

Parameters that are not shared (i.e. do not overlap), such as driver1_index in armActionStm, are copied to the composite event unmodified (see the last line of Formula (6.18)). The same applies to the events of the composite machine that are not synchronized (i.e. do not interact), such as taskStm in Figure 6.5: they are copied from the constituent machines without modifications (see the last line of Formula (6.16)).

According to [89], the resulting composed machine is correct by construction (provided that all proof obligations of the constituent machines are discharged). In other words, the results of discharging proof obligations (ensuring consistency, feasibility, and well-definedness) of the constituent machines can be directly extended to the composite machine (as we do not consider here a more complicated case of combining (de)composition and refinement). However, this important theoretical outcome does not exclude a possible incompatibility of constituent machines. For example, a shared parameter might have different (and incomparable) types in different constituent machines. Such a situation is possible, as constituent machines are generated as instantiations of template machines according to a Constelle definition. Thus, the corresponding (compatibility) checks should be performed for a Constelle definition. In our approach we delegate such checking to the Event-B tool support, Rodin. It identifies such incompatibilities as syntactical and type errors in the resulting composed machine.
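To make formulas (6.16) and (6.18) more tangible, here is a minimal Python sketch of the events and variables parts of shared event composition. Everything in it is an assumption made for illustration: events are records with guards and actions kept as plain strings, a parameter is identified by an (owner machine, name) pair, shared parameters are merged by name, and the composite event takes the common name of the synchronized events. It is not the implementation of the Rodin plug-in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    owner: str                       # constituent machine the event comes from
    name: str
    params: frozenset = frozenset()
    guards: frozenset = frozenset()
    actions: frozenset = frozenset()


def compose_event(events, sharedset):
    """Sketch of (6.18). sharedset is a set of sets of (owner, parameter name) pairs."""
    guards = frozenset().union(*(e.guards for e in events))
    actions = frozenset().union(*(e.actions for e in events))
    shared = {p for s in sharedset for p in s}
    merged = {n for s in sharedset for (_, n) in s if all(m == n for (_, m) in s)}
    private = {n for e in events for n in e.params if (e.owner, n) not in shared}
    name = min(e.name for e in events)  # assumption: synchronized events share a name
    return Event("composite", name, frozenset(merged | private), guards, actions)


def shared_event_composition(machines, configuration):
    """Sketch of the variables and events parts of (6.16).

    machines: dict mapping machine name -> (set of variables, set of Events)
    configuration: list of (set of Events, sharedset) pairs
    """
    variables = frozenset().union(*(vs for vs, _ in machines.values()))
    synchronized = {e for ce, _ in configuration for e in ce}
    composed = {compose_event(ce, cp) for ce, cp in configuration}
    copied = {e for _, evs in machines.values() for e in evs if e not in synchronized}
    return variables, composed | copied


arm_d = Event("driver1_queue", "armActionStm",
              frozenset({"action", "driver1_index"}),
              frozenset({"action : ArmActions", "driver1_index : NAT"}),
              frozenset({"driver1_queue := driver1_queue \\/ {driver1_index |-> action}"}))
arm_r = Event("distributor_request", "armActionStm",
              frozenset({"action"}),
              frozenset({"action : distributor_request_body"}),
              frozenset({"distributor_request_body := distributor_request_body \\ {action}"}))
task = Event("distributor_request", "taskStm", frozenset({"task"}))

machines = {
    "driver1_queue": (frozenset({"driver1_queue"}), {arm_d}),
    "distributor_request": (frozenset({"distributor_request_body"}), {arm_r, task}),
}
config = [(
    frozenset({arm_d, arm_r}),
    frozenset({frozenset({("driver1_queue", "action"), ("distributor_request", "action")})}),
)]
variables, events = shared_event_composition(machines, config)
```

In this example the two armActionStm events collapse into one composite event whose single action parameter stands for both constituent parameters, while driver1_index and the unsynchronized taskStm event are carried over unchanged.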

6.4.2 Composition of semantic templates

As specified earlier in formula (6.14), the Compose transformation uses (applies) shared event composition in the following way:

$$
\begin{aligned}
\mathit{Compose}(\mathit{module})(\mathit{tmplts}) ={} & \mathit{SharedEventComposition}(\{\, \mathit{t.eventbmachine} \mid t \in \mathit{tmplts} \,\})(\mathit{config}) \\
& \circ\ \mathit{ReconstructComposedTemplate}(\mathit{module}) \\
\text{where } \mathit{config} ={} & \Bigl\{ \bigl( \{\, \mathit{x.eventbElement} \mid \mathit{x.implements} = \mathit{op} \land x \in \mathit{t.elements} \land t \in \mathit{tmplts} \,\}, \\
& \quad \bigl\{ \{\, \mathit{x.eventbElement} \mid \mathit{x.implements} = \mathit{dp} \land x \in \mathit{t.elements} \land t \in \mathit{tmplts} \,\} \bigm| \mathit{dp} \in \mathit{op.signature} \bigr\} \bigr) \\
& \bigm| \mathit{op} \in \mathit{module.interface} \Bigr\}
\end{aligned}
\tag{6.14 revisited}
$$

Here the resulting machine is composed of the machines of the input semantic templates (t.eventbmachine | t ∈ tmplts). The configuration of the composition (config) is derived from the interface of the semantic module (module), which is implemented (i.e. shared) by all semantic templates (tmplts). Namely, for each operation of the interface (op ∈ module.interface) we select from different templates (t ∈ tmplts) specification elements (x ∈ t.elements) that implement this operation (x.implements = op). Event-B elements of these specification elements (x.eventbElement) determine which events should be synchronized (i.e. composed). The configuration of sharing Event-B parameters is derived in the same way for each dynamic parameter of the operation ({x.eventbElement | x.implements = dp} | dp ∈ op.signature).

The resulting composite Event-B machine is wrapped into a semantic template using an auxiliary transformation ReconstructComposedTemplate:

$$
\mathit{ReconstructComposedTemplate} : \mathit{SemanticInterface} \to \mathit{Machine} \to \mathit{SemanticTemplate}
\tag{6.19}
$$

To wrap all Event-B elements of the composite Machine, this transformation generates specification elements of the resulting SemanticTemplate based on the SemanticInterface that this machine implements. In particular (see Formula 6.20), all variables of the input machine are wrapped into PrivateElements of the output SemanticTemplate. The special Initialisation event is wrapped into a PrivateElement too. All other events (e ∈ machine.events) of the machine are matched with the corresponding operations of the SemanticInterface (op ∈ si.interface ∧ op.name = e.name) and connected to them via PublicElements (x ∈ PublicElement). Each parameter of a machine’s event that appears in the interface (i.e. is matched with a dynamic parameter dp ∈ op.signature) is wrapped into a PublicElement. The Event-B parameters that do not have matching dynamic parameters are wrapped into PrivateElements (x ∈ PrivateElement in the last lines of formula (6.20)).

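$$
\begin{aligned}
& \mathit{ReconstructComposedTemplate}(\mathit{si})(\mathit{machine}): \\
& \quad \mathit{uses} \in \{\, x \in \mathit{StructuralTemplate} \mid \mathit{x.implements} = \mathit{dslStructure} \land \mathit{x.eventbcontext} \in \mathit{machine.sees} \,\}, \\
& \quad \mathit{elements} = \{\, x \in \mathit{PrivateElement} \mid \mathit{x.eventbElement} \in \mathit{machine.variables} \,\} \\
& \qquad \cup\ \{\, x \in \mathit{PrivateElement} \mid \mathit{x.eventbElement} \in \mathit{machine.events} \land \mathit{x.eventbElement.name} = \texttt{"Initialization"} \,\} \\
& \qquad \cup\ \{\, x \in \mathit{PublicElement} \mid \mathit{op} \in \mathit{si.interface} \land e \in \mathit{machine.events} \land \mathit{op.name} = \mathit{e.name} \\
& \qquad\qquad \land\ \mathit{x.eventbElement} = e \land \mathit{x.implements} = \mathit{op} \,\} \\
& \qquad \cup\ \{\, x \in \mathit{PublicElement} \mid \mathit{op} \in \mathit{si.interface} \land e \in \mathit{machine.events} \land \mathit{op.name} = \mathit{e.name} \\
& \qquad\qquad \land\ p \in \mathit{e.parameters} \land \mathit{dp} \in \mathit{op.signature} \land \mathit{p.name} = \mathit{dp.name} \\
& \qquad\qquad \land\ \mathit{x.implements} = \mathit{dp} \land \mathit{x.eventbElement} = p \,\} \\
& \qquad \cup\ \{\, x \in \mathit{PrivateElement} \mid \mathit{op} \in \mathit{si.interface} \land e \in \mathit{machine.events} \land \mathit{op.name} = \mathit{e.name} \\
& \qquad\qquad \land\ p \in \mathit{e.parameters} \land (\lnot\exists\, \mathit{dp} \in \mathit{op.signature} \cdot \mathit{p.name} = \mathit{dp.name}) \\
& \qquad\qquad \land\ \mathit{x.eventbElement} = p \,\}
\end{aligned}
\tag{6.20}
$$

The first line of formula (6.20) makes an explicit use of the (implicit) global constant (or environment) that we discussed earlier in Section 6.2: the structural template that wraps an Event-B specification of the DSL constructs used in the Constelle model (i.e. an Event-B context that introduces the corresponding sets, relations, and axioms). This structural template implements the structural interface dslStructure (see Figure 6.2), which is used by all semantic modules of the Constelle model for substituting static parameters in the invoked templates. To ensure consistency of the resulting Event-B specification, the corresponding Event-B context should be used by the generated composite machine (x.eventbcontext ∈ machine.sees).

6.5 Gluing Guards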

The final step of the transformation ExpandDefinition (6.2) adds gluing guards (i.e. instantiated constraint templates) to the specification of the semantic module using the transformation GlueOperations (third line of formula (6.3) reappearing here):

$$
\begin{aligned}
\mathit{ExpandDefinition}(\mathit{lib})(\mathit{module}) ={} & \mathit{Compose}(\mathit{module}) \\
& \bigl(\{\, \mathit{Substitute}(a, t) \mid a \in \mathit{module.aspects} \land t \in \mathit{lib} \land \mathit{t.implements} = \mathit{a.invokes} \,\}\bigr) \\
& \circ\ \mathit{GlueOperations}(\mathit{module.gluingGuards},\ \mathit{lib} \cap \mathit{ConstraintTemplate})
\end{aligned}
\tag{6.3 revisited}
$$

GlueOperations takes a set of gluing guards (which are realized in Constelle as ConstraintInvocations) and a set of ConstraintTemplates (that provide concrete predicates, see Figure 6.2) and applies them to a SemanticTemplate:

$$
\mathit{GlueOperations} : \mathbb{P}(\mathit{ConstraintInvocation}) \times \mathbb{P}(\mathit{ConstraintTemplate}) \to \mathit{SemanticTemplate} \to \mathit{SemanticTemplate}
\tag{6.21}
$$

The resulting SemanticTemplate has its aspects glued together by extra predicates that link dynamic parameters of (within) their operations. For this, GlueOperations adds extra Event-B guards to the events of the Event-B machine of the SemanticTemplate. To represent such a modification of the underlying Event-B machine in a precise manner, we use the Event-B refinement technique.

6.5.1 Refinement

Refinement is an Event-B language construct that establishes a relation between two Event-B machines, where a concrete machine refines an abstract machine. Refinement allows for the gradual introduction of details into an Event-B specification by modifying or adding new variables and events [4]. A concrete machine can have a set of variables, invariants, events, guards, and actions that is completely different from that of the abstract machine; or the concrete machine can add new variables, invariants, events, guards, and actions to the abstract machine; or the concrete machine can partially modify the elements of the abstract machine. Regardless of the particular type of modification, the concrete machine should obey (conform to) the behavior specified in the abstract machine. In other words, both the abstract and concrete machines should specify the same behavior (or system), but at different levels of detail. This is ensured whenever the corresponding proof obligations generated for all refinement relations are discharged.

To define the GlueOperations transformation, we use a small subset of the refinement capabilities. In particular, we use refinement only to add new guards to the events of a machine:

$$
\mathit{Refinement}_{\mathit{guards}} : \mathit{Machine} \to \mathbb{P}(\mathit{Event} \times \mathbb{P}(\mathit{Guard})) \to \mathit{Machine}
\tag{6.22}
$$

Here the configuration P(Event × P(Guard)) indicates which guards (P(Guard)) should be added to which events (Event × ...). No other modifications of the input machine are possible within $\mathit{Refinement}_{\mathit{guards}}$.

Figure 6.6 shows an example of $\mathit{Refinement}_{\mathit{guards}}$ applied to the Event-B machines generated for (and from) the semantic module LACE_Core_SF, defined in the previous chapter in Table 4.4. The LACE_Core_SF_interm machine depicted in Figure 6.6(a) is generated using SharedEventComposition (6.16) from the machines instantiated for the following three aspects of the LACE_Core_SF semantic module: LAC: Request, curr_la: Query, and SS1: Queue. The prefixes of the Event-B elements in Figure 6.6 determine their origin: for example, lac_request_body is the variable of the Request template instantiated for the LAC aspect.

The Event-B machine depicted in Figure 6.6(b) adds gluing guards to the composite machine depicted in Figure 6.6(a), such as glue_grd1 in the event request_la. This modification is performed within refinement of the machine LACE_Core_SF_interm by the machine LACE_Core_SF (see the ‘refines’ sections).

(a) Machine composed of the constituent aspects

MACHINE LACE_Core_SF_interm
SEES lace_core_context
VARIABLES lac_request_body, curr_la_variable, ss1_queue
INVARIANTS
    lac_inv1 : lac_request_body ∈ ℙ(Occurrence)
    curr_la_inv1 : curr_la_variable ∈ LogicalActions
    ss1_inv1 : ss1_queue ∈ ℕ ⇸ (SS1)
EVENTS
    ...
    Event request_la ≙
        any la, curr_job
        where
            lac_grd1 : curr_job ∈ ℙ(Occurrence)
            lac_grd2 : lac_request_body = ∅
            curr_la_grd1 : la ∈ LogicalActions
        then
            lac_act1 : lac_request_body := curr_job
            curr_la_act1 : curr_la_variable := la
        end
    ...

(b) Machine with the gluing guard

MACHINE LACE_Core_SF
REFINES LACE_Core_SF_interm
SEES lace_core_context
VARIABLES lac_request_body, curr_la_variable, ss1_queue
INVARIANTS
    lac_inv1 : lac_request_body ∈ ℙ(Occurrence)
    curr_la_inv1 : curr_la_variable ∈ LogicalActions
    ss1_inv1 : ss1_queue ∈ ℕ ⇸ (SS1)
EVENTS
    ...
    Event request_la ≙
        refines request_la
        any la, curr_job
        where
            lac_grd1 : curr_job ∈ ℙ(Occurrence)
            lac_grd2 : lac_request_body = ∅
            curr_la_grd1 : la ∈ LogicalActions
            glue_grd1 : curr_job = dom(LALabelDef(la))
        then
            lac_act1 : lac_request_body := curr_job
            curr_la_act1 : curr_la_variable := la
        end
    ...

Figure 6.6: Fragment of the LACE_Core_SF machine
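The restricted form of refinement used here can be sketched in Python as a function that only ever adds guards. The encoding is an assumption made for illustration: a machine is reduced to a dictionary from event names to labelled guards (written as ASCII strings), and the configuration maps event names to the guards to be added.

```python
def refinement_guards(machine, config):
    """Return a copy of `machine` with extra guards; nothing else changes.

    machine: dict mapping event name -> dict of guard label -> predicate
    config:  dict mapping event name -> dict of new guard label -> predicate
    """
    unknown = set(config) - set(machine)
    if unknown:
        raise KeyError(f"events not in the abstract machine: {sorted(unknown)}")
    refined = {}
    for event, guards in machine.items():
        extra = config.get(event, {})
        clash = set(extra) & set(guards)
        if clash:
            raise ValueError(f"guard labels already used in {event}: {sorted(clash)}")
        refined[event] = {**guards, **extra}
    return refined


# The request_la event of Figure 6.6(a), reduced to its guards.
interm = {
    "request_la": {
        "lac_grd1": "curr_job : POW(Occurrence)",
        "lac_grd2": "lac_request_body = {}",
        "curr_la_grd1": "la : LogicalActions",
    }
}
glue = {"request_la": {"glue_grd1": "curr_job = dom(LALabelDef(la))"}}
refined = refinement_guards(interm, glue)
```

Since the guards of every concrete event contain all guards of the corresponding abstract event, the concrete guard (a conjunction with additional conjuncts) is at least as strong as the abstract one in this sketch.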

6.5.2 Gluing operations

The transformation GlueOperations uses the mapping $\mathit{Refinement}_{\mathit{guards}}$ to add gluing guards to the events of the machine representing (wrapped into) the semantic template that we construct for our semantic module. However, the gluing guards that are added to the machine are Event-B predicates (EventBNamedCommentedDerivedPredicateElements, see Figure 6.2), while the gluing guards that appear in the Constelle definition of the semantic module are ConstraintInvocations. To translate ConstraintInvocations into the proper Event-B predicates, we need to unfold each ConstraintInvocation, that is: to find a ConstraintTemplate that implements the invoked ConstraintInterface and to substitute its InterfaceElements according to the InterfaceElementSubstitutions of the ConstraintInvocation. In other words, we unfold ConstraintInvocations for gluing guards in the same way as we unfold TemplateInvocations for aspects.

The following formula describes the transformation GlueOperations, including the details of the invocation of ConstraintTemplates.

$$
\begin{aligned}
& \mathit{GlueOperations}(\mathit{constrset}, \mathit{lib})(\mathit{tmplt}): \\
& \quad \mathit{uses} = \mathit{tmplt.uses}, \\
& \quad \mathit{elements} = \mathit{tmplt.elements}, \\
& \quad \mathit{implements} = \mathit{tmplt.implements}, \\
& \quad \mathit{eventbmachine} = \mathit{Refinement}_{\mathit{guards}}(\mathit{tmplt.eventbmachine}) \\
& \qquad \bigl(\{\, \mathit{el.eventbelement} \mapsto \mathit{grds}(\mathit{el}) \mid \mathit{el} \in \mathit{tmplt.elements} \land \mathit{el.implements} \in \mathit{Operation} \,\}\bigr) \\
& \text{where } \mathit{grds}(\mathit{el}) = \{\, \mathit{SubstituteConstraint}(\mathit{ci})(\mathit{ct}) \mid \mathit{ci} \in \mathit{constrset} \land \mathit{ct} \in \mathit{lib} \\
& \qquad \land\ \mathit{ci.restricts} = \mathit{el.implements} \land \mathit{ci.invokes} = \mathit{ct.implements} \,\}
\end{aligned}
\tag{6.23}
$$

Here only the Event-B machine of the input semantic template (tmplt.eventbmachine) is updated using the $\mathit{Refinement}_{\mathit{guards}}$ mapping. To construct the configuration of $\mathit{Refinement}_{\mathit{guards}}$, we go through all specification elements that implement an operation of the semantic module: {... | el ∈ tmplt.elements ∧ el.implements ∈ Operation}. Each of these elements references an event of the underlying Event-B machine, and this event is taken as a part of the refinement configuration (el.eventbelement ↦ ...).

For the operation referenced by the specification element (el.implements) we select ConstraintInvocations from the input set of gluing guards (constrset): ci ∈ constrset ∧ ci.restricts = el.implements. For each of these ConstraintInvocations (ci) we find in the input library of templates (lib) the corresponding ConstraintTemplate: ct ∈ lib ∧ ci.invokes = ct.implements. The Event-B predicate for the refinement configuration is constructed using an auxiliary mapping SubstituteConstraint: {SubstituteConstraint(ci)(ct) | ...}. This mapping substitutes InterfaceElements in the ConstraintTemplate (ct) according to its ConstraintInvocation (ci) and generates the corresponding Event-B predicate:

$$
\mathit{SubstituteConstraint} : \mathit{ConstraintInvocation} \to \mathit{ConstraintTemplate} \to \mathit{EventBNamedCommentedDerivedPredicateElement}
\tag{6.24}
$$

The substitution of InterfaceElements in a ConstraintTemplate is performed in (almost) the same way as it is done for the SemanticTemplates as described in Section 6.3. In particular, we first compose the renaming scheme using the mapping ComposeRenaming (6.9), and then use its output to instantiate the Event-B predicate in the mapping InstantiateEventBPredicate:

$$
\begin{aligned}
\mathit{SubstituteConstraint}(\mathit{inv}, \mathit{tmplt}) ={} & \mathit{ComposeRenaming}(\texttt{""})(\mathit{inv.elementsSubstitution},\ \mathit{tmplt.elements} \cup \mathit{tmplt.uses.elements}) \\
& \circ\ \mathit{InstantiateEventBPredicate}(\mathit{tmplt.eventbpredicate})
\end{aligned}
\tag{6.25}
$$

The mapping InstantiateEventBPredicate simply replaces the occurrences of the Event-B elements in the predicate with the new names. The transformation is described by the following signature:

$$
\begin{aligned}
\mathit{InstantiateEventBPredicate} :{} & \mathit{EventBNamedCommentedDerivedPredicateElement} \\
& \to (\mathit{EventBNamedCommentedElement} \rightharpoonup \mathit{String}) \\
& \to \mathit{EventBNamedCommentedDerivedPredicateElement}
\end{aligned}
\tag{6.26}
$$

Note that when composing the renaming scheme in the SubstituteConstraint mapping, we use an empty namespace (ComposeRenaming("")). This is possible because a ConstraintTemplate does not have PrivateElements, i.e. it does not encapsulate any Event-B element (parameter, set, or constant) appearing in the template predicate, but rather explicitly lists all of them in its ConstraintInterface.
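The selection of gluing guards for one operation (grds(el) in (6.23)) and the instantiation of their predicates can be sketched in Python as follows. The dictionary encoding, the interface name `JobMatchesAction`, and the template predicate with its element names `job`, `act`, and `labelDef` are hypothetical; only the resulting guard is taken from Figure 6.6(b). The renaming is passed in directly, which corresponds to the empty namespace: no prefix is added to any name.

```python
import re


def instantiate_predicate(predicate, renaming):
    # Replace whole identifiers in one pass; unlisted names are kept as they are.
    return re.sub(r"[A-Za-z_]\w*", lambda m: renaming.get(m.group(0), m.group(0)), predicate)


def grds(operation, constrset, lib):
    """Sketch of grds(el) from (6.23) for the operation implemented by el.

    constrset: list of dicts with keys 'restricts', 'invokes', 'substitution'
    lib:       dict mapping constraint interface name -> template predicate
    """
    return {
        instantiate_predicate(lib[ci["invokes"]], ci["substitution"])
        for ci in constrset
        if ci["restricts"] == operation and ci["invokes"] in lib
    }


lib = {"JobMatchesAction": "job = dom(labelDef(act))"}  # hypothetical constraint template
constrset = [{
    "restricts": "request_la",
    "invokes": "JobMatchesAction",
    "substitution": {"job": "curr_job", "act": "la", "labelDef": "LALabelDef"},
}]
guards = grds("request_la", constrset, lib)
```

Matching whole identifiers keeps the short name `act` from being rewritten inside longer names, and the single pass prevents one substitution from being renamed again by another.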

6.6 Proof Obligations

In this section we discuss proof obligations (POs) of the Event-B specification generated from a Constelle model by the Constelle-to-Event-B model transformation. Proof obligations determine what should be proven for an Event-B specification in order to ensure that the specified (system) design is consistent, feasible, and complete. Discharging (i.e. proving) such proof obligations realizes the semantic analysis of an Event-B specification in a static manner (i.e. it does not require an execution of the specification). Proof obligations are generated for each Event-B specification according to the set of proof obligation rules, which constitute a part of the mathematical model (theory) of Event-B (i.e. the dynamic semantics of Event-B) [2].

One of the key features of a specification template (as defined in Section 4.2) is the possibility to reuse its verification results. That is, if we have discharged proof obligations for an Event-B specification template, we do not need to discharge them again for the Event-B specification that is constructed using this specification template. To implement this feature of a specification template in Constelle, we base our definition of the Constelle-to-Event-B semantic mapping on the Event-B∗ techniques that include a mathematical reasoning (theorems) of how proof obligations can be reused. In this section we provide the details of this approach. For this we identify which proof obligations can be ignored and which proof obligations should be discharged for the Event-B specification generated by the Constelle-to-Event-B model transformation. Note that we do not construct such proof obligations from scratch, but we rather select them from the (existing) set of POs generated by the Rodin tools. Moreover, we identify subsets of POs (POs that can be ignored and POs that should be discharged) in terms of a Constelle model; that is, we map elements (i.e. identifiers) of the input Constelle model onto (labels/identifiers of) POs of the output Event-B machine.

6.6.1 Event-B proof obligation rules

As mentioned above, the mathematical model of Event-B defines a set of PO rules [2]. Using these rules, one of the Rodin tools (PO generator) generates POs for each concrete Event-B machine (and context) [43, 34]. These POs are then sent to the Rodin provers for automatic or interactive proving. Table 6.1 gives an overview of different types of Event-B PO rules. Each PO of an Event-B specification is identified (labeled) by its type (an abbreviation in the left-most column of the table) and by the (tuple of) Event-B constructs for which this PO has been constructed (the PO domain in the right-most column of the table). For example, a PO labeled as armActionStm/driver1_inv1/INV is a PO that ensures that the invariant driver1_inv1 is preserved by the event armActionStm (for the Event-B machine robotic_arm_parallel depicted in Figure 6.5).

Table 6.1: Overview of different types of proof obligations for an Event-B machine

Type  Description                                          Domain
----  ---------------------------------------------------  --------------------------------------------------
INV   Each invariant is preserved by each event            Event × Invariant
WD    An expression is well defined                        Axiom ∪ Invariant ∪ Event × Guard ∪ Event × Action
THM   A theorem is provable                                Axiom ∪ Invariant ∪ Event × Guard
FIS   A non-deterministic action is feasible               Event × Action
====  ===================================================  ==================================================
GRD   Guards of a concrete event are stronger than the     Event × Guard
      guards of the abstract event
SIM   Actions of an abstract event are simulated in the    Event × Action
      concrete event
EQL   The values of the abstract and concrete variables    Event × Variable
      with the same names are equal

The INV PO ensures that each invariant is preserved by each event, that is, the invariant still holds after the event is executed. This should be proven based on (derived from) the axioms and theorems of the Event-B context, the invariants and theorems of the Event-B machine, the guards of the event, and the before-after predicate composed out of the assignment expressions of the event’s actions. This type of PO is generated (defined) for each pair of the machine’s events and invariants: Event × Invariant.

A WD PO ensures that all Event-B expressions are well defined, that is, that the mathematical language of set theory and first-order logic (employed by Event-B) is used properly. For example, if we use a function application f(x) and f is a partial function, then f should be defined for x, that is, we need to prove that x ∈ dom(f). A potential division by zero is also checked by a WD PO. The WD POs are generated for (potentially ill-defined) elements of an Event-B specification that include an expression: axioms, invariants, guards, and actions.

The POs shown in the bottom part of Table 6.1 (under the double line) are related to the refinement of Event-B machines. Note that in this table we omit PO kinds that are not relevant for our work, as we do not use (and do not discuss in this thesis) the corresponding Event-B constructs. For example, we do not use variants in our Event-B machines and, therefore, we do not consider termination POs, such as: VAR (each convergent event decreases the proposed variant), or VWD (a variant is well defined). When applying refinement, we do not use witnesses and merged events and, thus, do not list POs determined by these Event-B constructs, such as: WWD (a witness is well defined), or MRG (the guard of a concrete event merging two abstract events is stronger than the disjunction of the guards of the abstract events).

6.6.2 Constelle proof obligation rules

In our library of reusable specification templates each template captures a common software (design) solution in the corresponding Event-B specification. As a successful and reusable piece of code, such a specification should be consistent and well defined, that is, all POs of the corresponding Event-B machine (and the used context) should be discharged. To avoid re-proving these POs, we identify what happens with them through GenericInstantiation (6.11), SharedEventComposition (6.15), and $\mathit{Refinement}_{\mathit{guards}}$ (6.22), which are applied to the Event-B machines of the specification templates invoked in a Constelle semantic module.

To capture the results of this process for each of the transformations ExpandDefinition (6.3), Substitute (6.5), Compose (6.14), and GlueOperations (6.23) we extend their functions of the form $f : X \to Y$ with the corresponding functions $PO_f : X \to \mathbb{P}(\mathit{PO})$. Such an (extension) function indicates which POs (i.e. a subset of the generated PO set) should be discharged by the user for the Event-B machine generated for (from) a Constelle semantic module. In other words, we indicate the relevant POs and can ignore all other POs generated for this Event-B machine. To indicate a PO, we use a tuple consisting of the PO type (left-most column in Table 6.1) and Event-B constructs of the corresponding PO domain (right-most column in Table 6.1). For example, the tuple (INV, armActionStm, driver1_inv1) indicates the INV PO generated for the invariant driver1_inv1 and the event armActionStm.
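The correspondence between PO labels and PO tuples, and the selection of the POs that remain to be discharged, can be sketched in Python as follows. The label layout (domain constructs followed by the PO type, separated by slashes) follows the example armActionStm/driver1_inv1/INV given above; the `PO` record, the function names, and the second and third labels are assumptions made for illustration.

```python
from typing import NamedTuple, Tuple


class PO(NamedTuple):
    kind: str                 # PO type, e.g. "INV"
    domain: Tuple[str, ...]   # Event-B constructs the PO was generated for


def from_label(label):
    # "armActionStm/driver1_inv1/INV" -> PO("INV", ("armActionStm", "driver1_inv1"))
    *domain, kind = label.split("/")
    return PO(kind, tuple(domain))


def to_discharge(generated_labels, relevant):
    # Keep only the generated POs that a Constelle PO rule marks as relevant.
    return [label for label in generated_labels if from_label(label) in relevant]


generated = [
    "armActionStm/driver1_inv1/INV",
    "request_la/glue_grd1/WD",  # hypothetical label
    "thm1/THM",                 # hypothetical label
]
relevant = {PO("THM", ("thm1",))}
remaining = to_discharge(generated, relevant)
```

Working on the labels of already generated POs mirrors the approach described above: the relevant POs are selected from the existing set rather than constructed from scratch.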

6.6.2.1 Substitution

As mentioned earlier in Section 6.3.3, an Event-B machine constructed from another machine using GenericInstantiation is correct by construction. In other words, all POs of a generic machine can be reused in the instantiated machine. This theoretical result is proved by Silva in his PhD dissertation [87], see Section 3.3.4. However, the reuse of POs is possible only if the following condition holds: the structural properties of the generic sets and constants substituted as the result of the generic instantiation should hold for concrete sets and constants substituting the generic ones. Such structural properties are defined in the axioms of the context used by the generic machine. To ensure the described condition, the transformation GenericInstantiation (6.11) introduces new POs: theorems formed on the basis of generic axioms by substituting generic sets and constants with concrete sets and constants. The following formula identifies such POs in the context of the mapping Substitute (6.5), which applies (invokes) generic instantiation.

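$$
\mathit{PO}_{\mathit{Substitute}}(inv)(tmpl) = \{\, (\mathrm{THM},\ \mathit{ComposeRenaming}(inv.name)(inv.elementsSubstitution,\ tmpl.uses.elements) \circ \mathit{InstantiateEventBPredicate}(a)) \mid a \in tmpl.eventbmachine.sees.axioms \,\} \tag{6.27}
$$

where $\mathit{PO}_{\mathit{Substitute}} : \mathit{TemplateInvocation} \to \mathit{SemanticTemplate} \to \mathcal{P}(\mathit{PO})$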

Here POSubstitute takes the same input as the mapping Substitute and constructs a set of POs that should be proved in order to reuse POs of the SemanticTemplate that is being substituted in the Substitute mapping. These are the THM POs (theorems), constructed by instantiating (InstantiateEventBPredicate) axioms of the Event-B context which is used by the machine of the semantic template (a ∈ tmpl.eventbmachine.sees.axioms). The instantiation follows the same renaming scheme as the instantiation of the semantic template (ComposeRenaming(inv.name)(inv.elementsSubstitution, tmpl.uses.elements); compare formulas (6.5) and (6.27)).
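The construction of the THM POs can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the actual QVTo implementation: the renaming scheme in compose_renaming (substituted elements take their actual name, all other elements are prefixed with the invocation name) is assumed for illustration and may differ from the definition in formula (6.5); the axioms and the constant capacity are hypothetical; predicates are treated as plain strings.

```python
import re
from typing import Dict, Iterable, Set, Tuple


def compose_renaming(inv_name: str, substitution: Dict[str, str],
                     elements: Iterable[str]) -> Dict[str, str]:
    """ASSUMED scheme: substituted elements take their actual name,
    all other template elements are prefixed with the invocation name."""
    return {e: substitution.get(e, f"{inv_name}_{e}") for e in elements}


def instantiate_predicate(axiom: str, renaming: Dict[str, str]) -> str:
    """Replace whole identifiers in the predicate text according to the renaming."""
    return re.sub(r"[A-Za-z_]\w*",
                  lambda m: renaming.get(m.group(0), m.group(0)), axiom)


def po_substitute(inv_name: str, substitution: Dict[str, str],
                  elements: Iterable[str],
                  axioms: Iterable[str]) -> Set[Tuple[str, str]]:
    """One THM PO per axiom of the context seen by the template machine."""
    renaming = compose_renaming(inv_name, substitution, elements)
    return {("THM", instantiate_predicate(a, renaming)) for a in axioms}


# Hypothetical invocation "driver1" of a template with two generic elements.
pos = po_substitute(
    inv_name="driver1",
    substitution={"ElementType": "Actions"},
    elements=["ElementType", "capacity"],
    axioms=["capacity > 0", "finite(ElementType)"],
)
```

Each generic axiom yields exactly one theorem, so the number of additional POs grows with the size of the template context rather than with the size of the template machine.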

6.6.2.2 Composition

A composite Event-B machine constructed from constituent machines using SharedEventComposition is correct by construction. According to the proofs by Silva in his PhD dissertation [87] (see Section 2.3.3), POs of the kinds INV, FIS, and WD of constituent machines can be directly reused in the composite machine. The reasoning for the reuse of THM POs is analogous to the reasoning behind the reuse of WD POs: if no new invariants are introduced in the composite machine, then all theorems in invariants and guards are proved (individually) for each constituent machine. However, refinement POs should be re-proved for a composite machine.

The reuse of POs in a composite machine does not require any additional POs. Therefore, considering that an (intermediate) composite machine generated during the Constelle-to-Event-B transformation does not refine any other Event-B machine, we define the corresponding mapping (or Constelle PO rule) POCompose by the following formula:

POCompose(si)(tmplts) = ∅ (6.28)

where POCompose : SemanticInterface → P(SemanticTemplate) → P(PO)

6.6.2.3 Gluing guards

The transformation GlueOperations (6.23) applies the mapping Refinementguards, which does not change anything in the underlying Event-B machine, but only adds a refinement relationship and new guards to its events. Thus, we deduce that the set of POs of the resulting machine consists of the POs of the original machine, the POs related to the refinement relationship, and the POs defined (generated) for the added guards. In other words, the POs of the intermediate (i.e. composite) machine can be reused in the modified (final) machine.

The new (additional) POs introduced by GlueOperations can be considered as two separate groups: POs for the refinement and POs for gluing guards. We consider three kinds of refinement POs: GRD, SIM, and EQL (other types of refinement POs do not apply due to our way of using refinement). Each of these POs can be proved easily because of our way of using refinement. For example, the GRD PO ensures that the concrete guards in a concrete event are stronger than the abstract guards in an abstract event. This is required for the abstract event to be enabled if the concrete event is enabled. In the mathematical model of Event-B this PO rule is described using the following form: $A \land I_{abstract} \land I_{concrete} \land G_{concrete} \land W \vdash G_{abstract}$, which means that the guards of an abstract event ($G_{abstract}$) should be derived ($\vdash$) from the axioms ($A$) specified in the context used (seen) by the concrete machine, the invariants of the abstract machine ($I_{abstract}$), the invariants of the concrete machine ($I_{concrete}$), the guards of the concrete event ($G_{concrete}$), and the witnesses for parameters ($W$, not used in our transformation). As Refinementguards only adds new guards, we conclude that $G_{abstract} \subseteq G_{concrete}$ and thus the GRD PO rule is proved. The same reasoning applies to the SIM PO rule: $A \land I_{abstract} \land I_{concrete} \land G_{concrete} \land W \land BA_{concrete} \vdash BA_{abstract}$, where $BA$ stands for the before-after predicate of the event's actions (which are copied unchanged in Refinementguards).
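The GRD argument can be spelled out as a short derivation. It is a sketch under one assumption: the guards that Refinementguards adds to an event are collected in a set written $G_{glue}$ (a name introduced here only for illustration), and a set of guards is read as the conjunction of its members.

$$
\begin{aligned}
& G_{concrete} = G_{abstract} \cup G_{glue} \\
\Rightarrow\quad & \bigwedge G_{concrete} \;\equiv\; \bigwedge G_{abstract} \,\land\, \bigwedge G_{glue} \\
\Rightarrow\quad & A \land I_{abstract} \land I_{concrete} \land \bigwedge G_{concrete} \land W \;\vdash\; \bigwedge G_{abstract}
\end{aligned}
$$

The last step is plain conjunction elimination: every abstract guard already occurs among the hypotheses. The SIM case follows the same pattern, because the actions are copied unchanged and therefore $BA_{concrete}$ and $BA_{abstract}$ coincide, so the goal again appears among the hypotheses.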
The EQL PO considers a situation when the abstract and the concrete machine use the same variable and the concrete machine updates this variable in an action that does not exist in the abstract machine. Clearly, this situation does not apply to the mapping Refinementguards, as all actions are copied unchanged from the abstract machine to the concrete machine.

The POs for gluing guards can be identified by choosing PO kinds from Table 6.1 that are related to Event-B Guards. These are WD and THM POs. As we do not introduce gluing guards as theorems, the only type of POs that is left is WD. The following formula specifies how these POs are identified in the context of the mapping GlueOperations, which substitutes ConstraintTemplates into gluing guards and adds them to the events of the underlying machine.

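$$
\mathit{PO}_{\mathit{GlueOperations}}(constrset, lib)(tmplt) = \{\, (\mathrm{WD},\ el.eventbelement,\ g) \mid el \in tmplt.elements \land el.implements \in \mathit{Operation} \land g \in \mathit{grds}(el) \,\} \tag{6.29}
$$

where $\mathit{PO}_{\mathit{GlueOperations}} : \mathcal{P}(\mathit{ConstraintInvocation}) \times \mathcal{P}(\mathit{ConstraintTemplate}) \to \mathit{SemanticTemplate} \to \mathcal{P}(\mathit{PO})$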

According to this formula, the output WD POs are identified for the Operations of the semantic module paired with the ConstraintInvocations that restrict them. The semantic module is implemented by the input argument 'tmplt'; thus, its elements el ∈ tmplt.elements link the operations of the semantic module with the corresponding events of the underlying Event-B machine: el.implements ∈ Operation. Therefore, we use el.eventbelement as the second element of a PO tuple. Each ConstraintInvocation is unfolded and substituted into an Event-B guard using the mapping SubstituteConstraint (third element of the PO tuple). The scheme of its application here (ci, ct) is the same as in the definition of GlueOperations given in (6.23). For example, the Event-B machine depicted in Figure 6.6(b) is generated for (from) the semantic module LACE_Core_SF defined in Table 4.4 on page 69. This semantic module uses a gluing guard to restrict its operation request_la:

curr_job = dom(LALabelDef(la))

Let’s assume that in the Constelle model this gluing guard is realized by the ConstraintInvocation x. Then we select the Event-B guard generated from this ConstraintInvocation by applying the mapping SubstituteConstraint to x:

y = SubstituteConstraint(x)(t), t ∈ lib ∧ x.invokes = t.implements

As a result, out of all guards of the event request_la (depicted in Figure 6.6(b)) we select y = glue_grd1. After indicating this Event-B guard, we can indicate the PO that should be discharged for the original gluing guard: (WD, request_la, glue_grd1). This PO checks if the expression curr_job = dom(LALabelDef(la)) is well defined.
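The selection of WD POs for gluing guards can be sketched in Python. The sketch assumes a simplified data model: the class TemplateElement and its fields are hypothetical stand-ins for the elements of a semantic template, and the list glue_guards plays the role of the guards produced by SubstituteConstraint. The second element in the example is assumed to be a variable and is included only to show that non-operations are skipped.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class TemplateElement:
    eventb_element: str         # name of the underlying Event-B event or variable
    implements_operation: bool  # True if it implements an operation of the module
    glue_guards: List[str] = field(default_factory=list)  # labels of gluing guards


def po_glue_operations(elements: List[TemplateElement]) -> Set[Tuple[str, str, str]]:
    """One WD PO per gluing guard of every element that implements an operation."""
    return {("WD", el.eventb_element, g)
            for el in elements if el.implements_operation
            for g in el.glue_guards}


elements = [
    TemplateElement("request_la", True, ["glue_grd1"]),
    TemplateElement("curr_job", False),  # assumed to be a variable, not an operation
]
pos = po_glue_operations(elements)
```

For the example in the text, the result contains the single tuple (WD, request_la, glue_grd1).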

6.6.2.4 Proof obligations for a Constelle semantic module

All the POs identified above for the intermediate steps of the Constelle-to-Event-B transformation constitute the set of POs that should be discharged for a semantic template (i.e. Event-B machine) generated by this transformation (i.e. ExpandDefinition). In particular, the POSubstitute POs are constructed (as theorems in auxiliary Event-B contexts) for all aspects of an input semantic module, and the POGlueOperations POs are identified for all gluing guards of the semantic module. The corresponding definition is captured in the following formula.

$$
\mathit{PO}_{\mathit{ExpandDefinition}}(lib)(module) = \{\, po \in \mathit{PO}_{\mathit{Substitute}}(a, t) \mid a \in module.aspects \land t \in lib \land t.implements = a.invokes \,\} \;\cup\; \mathit{PO}_{\mathit{GlueOperations}}(module.gluingGuards,\ lib \cap \mathit{ConstraintTemplate},\ impl) \tag{6.30}
$$

where $impl \in \mathit{SemanticTemplate} \land impl.implements = module$;
$\mathit{PO}_{\mathit{ExpandDefinition}} : \mathcal{P}(\mathit{SemanticTemplate}) \to \mathit{SemanticModule} \to \mathcal{P}(\mathit{PO})$

Here, the auxiliary variable 'impl' represents an (intermediate) implementation of the semantic module (module) constructed during the transformation ExpandDefinition.
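The way formula (6.30) combines the two kinds of POs can be sketched in Python. This is an illustrative sketch only: aspects and templates are modelled as plain dictionaries, toy_po_substitute is a hypothetical stand-in for POSubstitute, and the gluing-guard PO in the example is invented for the demonstration.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

PO = Tuple[str, ...]


def po_expand_definition(aspects: List[Dict], library: List[Dict],
                         po_substitute: Callable[[Dict, Dict], Set[PO]],
                         glue_pos: Iterable[PO]) -> Set[PO]:
    """Union of the substitution POs of all aspects and the gluing-guard POs."""
    result: Set[PO] = set(glue_pos)
    for aspect in aspects:
        for template in library:
            if template["implements"] == aspect["invokes"]:
                result |= po_substitute(aspect, template)
    return result


def toy_po_substitute(aspect: Dict, template: Dict) -> Set[PO]:
    """Hypothetical stand-in: one THM PO per axiom of the invoked template."""
    return {("THM", f"{aspect['name']}: {axiom}") for axiom in template["axioms"]}


library = [
    {"implements": "Queue", "axioms": ["axm1"]},
    {"implements": "Request", "axioms": []},
]
aspects = [
    {"name": "driver1", "invokes": "Queue"},
    {"name": "driver2", "invokes": "Queue"},
    {"name": "distributor", "invokes": "Request"},
]
glue = {("WD", "taskStm", "glue_grd1")}  # hypothetical gluing-guard PO
relevant = po_expand_definition(aspects, library, toy_po_substitute, glue)
```

Note that the composition step contributes nothing to the result, in line with formula (6.28).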

6.7 Conclusions

In this chapter we gave a precise and executable definition of the semantic mapping of Constelle to Event-B. This definition allows for achieving the following three results.

1. The precise definition documents and explains the semantic mapping in a clear way, which is crucial for understanding and maintenance (evolution) of the semantics of Constelle.
2. The precise definition provides a transparent description of how Constelle applies the Event-B∗ techniques, which allows for optimizing the verification of the resulting Event-B machine by reusing proof obligations that are already discharged for the specification templates invoked in the Constelle model.
3. The executability of the definition is achieved by implementing the Constelle-to-Event-B model transformation according to the given definition, which is possible because of aligning the mathematical notation with QVTo constructs described in the previous chapter.

The Constelle-to-Event-B model transformation (semantic mapping) generates an Event-B machine for each semantic module of a definition of the dynamic semantics of a DSL. This machine can be analyzed and animated using the Rodin tool set. In this way, the Constelle-to-Event-B transformation fulfills the requirement Req-1 formulated in Chapter 2: a definition of the dynamic semantics of a DSL should allow for the implementation of various use cases, such as verification and validation of the DSL design, and simulation of DSL programs.

The generated Event-B machine is wrapped into a specification template, which allows for constructing new specification templates from existing ones, and thus fulfills the requirement Req-4: the definition formalism should allow for the introduction and invocation (by using its constructs in a semantic mapping) of an additional (new) semantic domain.

The Constelle-to-Event-B model transformation generates Event-B machines wrapped into specification templates for all possible Constelle models (allowed by the Constelle metamodel). In other words, the transformation fulfills the requirement Req-7: the automatic generator should be DSL-independent and configured according to the definition of the dynamic semantics of a DSL that has been constructed on the metamodel level.

In the next chapter we provide the implementation details of the Constelle-to-Event-B model transformation and discuss how it can fit into the process of designing and developing a DSL. Moreover, we set up a controlled experiment to validate the language, process, and tool support provided by Constelle.

Chapter 7

Implementation and Pragmatics of Constelle

Deep Blue: Bishop to knight 4. Gore: Not all missions can be solved with chess, Deep Blue. Someday you’ll understand that.

Futurama, Season 2, Episode 20

In this chapter, we present the implementation of the proposed approach of defining the dynamic semantics of a DSL using specification templates and Constelle. In particular, we address the following (design and research) questions.

• How to organize the implementation of Constelle so that it supports the use cases that we proposed in Chapter 4?
• How to use Constelle for defining the dynamic semantics of a DSL? What process should a DSL developer follow in order to define the dynamic semantics of his/her DSL using Constelle?

The first of the listed questions is addressed in Section 7.1, where we describe the architecture of the (prototype) implementation of Constelle. Based on this description, we formulate an iterative process of defining the dynamic semantics of a DSL using specification templates and Constelle (Section 7.2). To explain the process in detail and to demonstrate how the use cases proposed in Chapter 4 can be realized using our implementation, in Section 7.3 we present the tool set that supports our approach.

7.1 Architecture

We implement Constelle following its definition presented in Chapters 4 and 6. In order to combine the formats (i.e. bridge the platforms) of the Event-B formalism and of an arbitrary DSL, we use MDE techniques. In particular, we fit all the components of our implementation together using the Ecore format of the Eclipse Modeling Framework (EMF)¹. To reference Event-B specifications we use the (Ecore) metamodel of the Event-B formalism provided by the EMF framework for Event-B, the Rodin plug-in². A DSL is represented by the Ecore metamodel that captures the domain model and/or the abstract syntax of the DSL. The Ecore metamodels of Constelle and of a specification template are constructed according to their definitions given in Chapter 4. The semantic mapping of Constelle to Event-B is implemented in the form of the QVTo³ model transformation that uses these Ecore metamodels as the types of its input and output. An overview of our implementation is presented in Figure 7.1.

¹ www.eclipse.org/modeling/emf

[Figure 7.1 (diagram). Components shown: the Constelle workbench; the Constelle DSL (Constelle.ecore); the Constelle-to-Event-B model transformation; the Event-B formalism (eventb.ecore); specification templates (library.constellecore, impl.templateslibrary) and Event-B templates (library.eventb); the DSL metamodel (dsl.ecore), its semantics definition (dsl.constelle), and the Event-B specification (dsl.eventb); a DSL model (model.dsl) and its Event-B specification (model.eventb); the Rodin platform. Legend: composed of, instance of, uses, execution of Constelle-to-Event-B.]

Figure 7.1: The architecture of the Constelle implementation

The key components of our implementation (Figure 7.1, on the top) are the Constelle DSL, which comprises the metamodels of a specification template and of Constelle (defined in Chapter 4), and the Constelle-to-Event-B model transformation, which follows the definition given in Chapter 6. The library (in the middle) includes both semantic interfaces (library.constellecore) and specification templates (impl.templateslibrary), which implement these semantic interfaces in the form of Event-B code (library.eventb on the right). The dynamic semantics of a DSL is defined in Constelle as a composition of the semantic interfaces (dsl.constelle), which are specialized using the DSL constructs introduced by the DSL metamodel (dsl.ecore on the left). The execution of the Constelle-to-Event-B model transformation uses this definition to automatically generate the corresponding Event-B specification (dsl.eventb).

² wiki.event-b.org/index.php/EMF_framework_for_Event-B
³ projects.eclipse.org/projects/modeling.mmt.qvt-oml

The bottom part of Figure 7.1 is related to the possibility to extend the current Constelle-to-Event-B model transformation, so that we can generate Event-B specifications not only for the DSL metamodel, but also for (from) DSL models. For example, in our definition of the LACE DSL (see Table 4.4 on page 69) we use the placeholder SS1 to represent an arbitrary subsystem, to define the dynamic semantics of LACE on the metamodel level. However, on the model level, a LACE program can include multiple subsystems (see for example Figure 2.3 on page 14). Thus, an Event-B specification for a LACE program should be composed out of multiple instances of the SS1 aspect: a separate instance for each of the (concrete) subsystems participating in the LACE program. We can achieve this by updating the Substitute and Compose steps of the Constelle-to-Event-B model transformation. We foresee no (technical) difficulties in implementing such an extension and we leave it for future work.

Another extension point of our architecture, which follows from our vision presented in Chapter 3, is the possibility to have multiple back-end formalisms and/or platforms. In Figure 7.1 we depict this as multiple components representing the Rodin platform (on the right of Figure 7.1). As described in Chapter 3, such a target formalism and/or platform can be C-code or BMotion Studio visualization.

7.2 Definition Process

Following our general idea to not only give an explicit definition of the dynamic semantics of a DSL, but also to bridge various technological platforms (as introduced in Chapter 3), Constelle uses (bridges) different technologies. In particular, our implementation of Constelle includes artifacts of very different natures: multiple Ecore metamodels, Event-B specifications, and QVTo model transformations. Therefore, in order to explain the role of each of these artifacts and to relate them with each other, we formulate the process of defining the dynamic semantics of a DSL using specification templates and Constelle. This process is defined using the notation of UML activity diagrams, depicted in Figure 7.2.

In Figure 7.2 the activities that use and update the structure of a DSL are depicted in the right column (swimlane). In a Constelle model the structure of a DSL (the DSL metamodel or abstract syntax) is represented by the structural template (Event-B context and the corresponding structural interface) that captures the DSL constructs and relationships between them. We need to prepare such a structural template before we start defining the dynamic semantics (step (0) in Figure 7.2). However, we might need to add extra constructs (types, constants, or relations) to this structure during the definition process (step (5.b)). For example, if we decide to restrict the capacity of the buffers that are used in the example DSL of Robotic Arm to n = 5 (which means that we cannot store more than 5 actions in a driver's buffer), then we need to introduce the corresponding constant n and add it to the (global) structural template of the DSL, so that we can use it in our Constelle model. Note that we do not consider a more sophisticated situation of the DSL evolution, when the metamodel and/or the dynamic semantics of the DSL change, for example, as a result of receiving feedback from applying the DSL in practice.
The activities related to the specification of the dynamic semantics of the DSL are shown in the left column of Figure 7.2. We start from gathering (obtaining) the information about the dynamic semantics of the DSL (step (1)). This preparatory work can include brainstorming (in case we create a DSL from scratch), examining existing documentation and implementation (in case we define the dynamic semantics of an existing DSL), interviewing existing or potential end-users of the DSL, etc. The collected information will be formalized in the definition of the dynamic semantics of the DSL.

As a second step, we identify semantic modules, i.e. decompose the dynamic semantics of the DSL. For example, in our LACE case study (described in Chapter 2) we identified the following semantic modules: semantic features (Core SF, Scan SF, Order SF, and Data SF) and architecture components (SS and LAC). Then we consider each of the semantic modules in parallel with others: in Figure 7.2 the corresponding steps are visualized in the expansion region (dashed rounded rectangle) with the keyword «parallel».

[Figure 7.2 (UML activity diagram). Left swimlane, specification of the dynamic semantics of the DSL: 1. Gather information about the dynamic semantics of the DSL; 2. Identify semantic modules (decompose the dynamic semantics); «parallel» expansion region over the semantic modules with the decisions [the module can be decomposed] and [the module is a template in the library], otherwise 3. Construct an Event-B specification and 4. Derive semantic and structural interfaces, resulting in semantic templates; 5.a. Compose the dynamic semantics out of the specified semantic modules; decision [Structural interfaces of all aspects can be substituted]; 6. Generate Event-B specification of the dynamic semantics; 7. Validate (run/analyze) the generated Event-B specification; outcomes [the definition can/should be improved/extended] and [the definition is satisfactory]. Right swimlane, specification of the DSL structure: 0. Prepare the structural interface and template that capture constructs of the DSL (metamodel); 5.b. Add static parameters to the DSL structure.]

Figure 7.2: The process of defining the dynamic semantics of a DSL using Constelle

If a semantic module can be decomposed into smaller semantic modules, then we return to the decomposition step. If we can find in the library a specification template that realizes this semantic module, then we can proceed with other semantic modules or to the next step of the definition process. Otherwise, each semantic module is specified in Event-B (step (3)) and wrapped into a specification template using the corresponding structural and semantic interfaces (step (4)). Note that during step (3) we might realize that we can decompose the semantic module and, therefore, return to the preceding decision node: the process is not a strict set of rules, but rather a recommendation on how to use Constelle.

After we have constructed (or found in the library) Event-B specifications and interfaces for all semantic modules identified earlier, we can compose them together in a Constelle table (step (5.a)). At this stage we might need to add missing static parameters to the structural template of the DSL (step (5.b)). From a Constelle model (visualized as a table) we automatically generate the corresponding Event-B specification (step (6)) and use it to validate the defined (formalized) dynamic semantics of the DSL (step (7)). The feedback collected as the result of such a validation can lead us back to step (1), i.e. trigger another iteration of defining the dynamic semantics of the DSL. Note that the steps (1)-(7) can also be performed not for the whole dynamic semantics of the DSL, but for one of its semantic modules.
In other words, we first decompose the dynamic semantics into coarse-grained modules, and then consider them in separate iterations and decompose them into more fine-grained semantic modules (i.e. follow the steps (1)-(7) for each of the coarse-grained modules). This facilitates scaling the approach. For example, in our LACE case study (described in Chapter 2), we consider each of the LACE semantic features in a separate iteration, decomposing its specification into architectural components. Thus, we first specify SS and LAC for Core SF, then SS and LAC for Scan SF, and so on.
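The decision logic of steps (2)-(4) can be sketched as a small recursive Python function. It is only a sketch of the recommendation described above, not part of the Constelle workbench: the function name, the module naming scheme, and the library content are hypothetical, and the decomposition is given as an explicit dictionary.

```python
from typing import Dict, List, Set


def modules_to_specify(module: str, library: Set[str],
                       decomposition: Dict[str, List[str]]) -> List[str]:
    """Return the modules that still need a hand-written Event-B specification
    and interfaces (steps 3 and 4), descending into sub-modules (step 2)."""
    if module in library:        # a template realizing the module already exists
        return []
    if module in decomposition:  # the module can be decomposed further
        todo: List[str] = []
        for sub in decomposition[module]:
            todo += modules_to_specify(sub, library, decomposition)
        return todo
    return [module]              # specify in Event-B and wrap into a template


# Hypothetical decomposition of LACE into semantic features and components.
decomposition = {
    "LACE": ["Core SF", "Scan SF"],
    "Core SF": ["Core SF.SS", "Core SF.LAC"],
    "Scan SF": ["Scan SF.SS", "Scan SF.LAC"],
}
library = {"Scan SF.SS"}  # assumed to be available as a template
todo = modules_to_specify("LACE", library, decomposition)
```

In practice the decomposition is discovered during the process rather than known in advance, which is why the text describes the process as a recommendation and not as a strict algorithm.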

7.3 Constelle Workbench

The process described in the previous section uses various tools: those that already exist (such as Rodin and EMF) and those that we have developed specifically for Constelle (such as the Constelle-to-Event-B model transformation). The set of tools that we developed for Constelle is referred to as the Constelle workbench in Figure 7.1. The Constelle workbench supports the (automatic) creation of specification templates on the basis of Event-B specifications, editing of a Constelle definition in the form of the table notation, and automatic generation of the corresponding Event-B specifications. To implement this tool set, we employed the following MDE techniques provided in the Eclipse platform: Ecore metamodeling tools (EMF), the QVTo model transformation language, the Xtext text editor framework⁴, and the Sirius graphical editor generator⁵.

In this section we introduce the Constelle tools and explain how they contribute to the process of defining the dynamic semantics of a DSL described in the previous section. In particular, step (0) is discussed in Section 7.3.1. Steps (1)-(2) are informal and do not have any tool support. The tools that support steps (3)-(4) are described in Section 7.3.2. In Section 7.3.3 we describe the Constelle editor for step (5) of the definition process. Section 7.3.4 gives an overview of steps (6)-(7).

⁴ www.eclipse.org/Xtext
⁵ www.eclipse.org/sirius

7.3.1 Structure definition

We assume that when we start defining the dynamic semantics of a DSL, the structure of the DSL is already defined (in the form of an Ecore metamodel). In practice, however, the DSL metamodel might evolve along with the definition of the dynamic semantics. We believe that this is a natural process and see no restrictions on using an evolving DSL metamodel. On the contrary, we strive to build the Constelle workbench in such a way that it does not depend on a final version of the DSL metamodel, but rather facilitates working with a changing metamodel. For example, we provide the Ecore-to-Structure model transformation that constructs a structural interface from an Ecore metamodel automatically. Every time the DSL metamodel is updated, we run the Ecore-to-Structure transformation to generate the corresponding structural interface, so that we can use this new version in the definition of the dynamic semantics in Constelle.

Figure 7.3 demonstrates an example of an Ecore metamodel and the corresponding structural interface generated for (from) it. The Ecore metamodel of the Robotic Arm DSL (an example introduced in Chapter 4) is shown in the EMF tree editor (tab 'roboticarm.ecore' on the left). The structural interface generated from this metamodel is shown in the tree editor in the tab 'roboticarm_struct.constelle' on the right. The Ecore-to-Structure transformation that generates such a structural interface from an Ecore metamodel is quite straightforward:

• classes and data types are transformed into types (such as Task and Actions in Figure 7.3),
• references and attributes of classes are transformed into relations (such as program and parameters in Figure 7.3),
• enumeration literals are transformed into constants (such as turnright and turnleft in Figure 7.3).

The mapping realized by the Ecore-to-Structure transformation is saved in a set of elements Metamodel Element Substitution (highlighted in the tab 'roboticarm_struct.constelle' on the right of Figure 7.3). Each Metamodel Element Substitution connects (links) an element of the structural interface with the corresponding element of the Ecore metamodel. For example, the tab 'Properties' on the bottom of Figure 7.3 shows the link generated for the structural element Actions: the property Actual references the Ecore class Actions and the property Formal references the Constelle type (i.e. static parameter) Actions.
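The three mapping rules and the recorded substitutions can be sketched in Python. This is a simplified illustration and not the QVTo transformation itself: the class Classifier is a hypothetical stand-in for Ecore classifiers, the shape of the toy metamodel (which class owns which feature, and the enumeration name Direction) is assumed, and only the element names Task, Actions, program, parameters, turnright, and turnleft come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Classifier:
    """Stand-in for an Ecore class, data type, or enumeration."""
    name: str
    features: List[str] = field(default_factory=list)  # references and attributes
    literals: List[str] = field(default_factory=list)  # enumeration literals


def ecore_to_structure(classifiers: List[Classifier]) -> Dict[str, list]:
    result: Dict[str, list] = {
        "types": [], "relations": [], "constants": [], "substitutions": [],
    }

    def link(category: str, formal: str, actual: str) -> None:
        # Record the interface element and its (formal, actual) substitution.
        result[category].append(formal)
        result["substitutions"].append((formal, actual))

    for c in classifiers:
        link("types", c.name, c.name)                # classifier -> type
        for f in c.features:
            link("relations", f, f"{c.name}.{f}")    # feature -> relation
        for lit in c.literals:
            link("constants", lit, f"{c.name}.{lit}")  # literal -> constant
    return result


# Toy metamodel; its shape is assumed for illustration.
metamodel = [
    Classifier("Task", features=["program", "parameters"]),
    Classifier("Actions"),
    Classifier("Direction", literals=["turnright", "turnleft"]),
]
structure = ecore_to_structure(metamodel)
```

Keeping the (formal, actual) pairs next to the generated elements mirrors the role of the Metamodel Element Substitution elements: the structural interface can be regenerated whenever the metamodel changes, while the links back to the metamodel are preserved.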

7.3.2 Specification templates

In order to construct a specification template, we first construct an Event-B specification and then wrap it into the corresponding template interface. This applies to all types of specification templates: structural, semantic, and constraint templates. For example, Figure 7.4 shows (a fragment of) the Event-B specification of the Queue template (described in detail in Chapter 4). This specification is constructed using the Event-B editor provided by the Rodin platform (its screen shot is depicted in Figure 7.4). The Rodin tool set generates proof obligations (POs) for this specification (depicted on the left in Figure 7.4) and runs automatic provers to discharge them. All proof obligations in Figure 7.4 are highlighted in green, which means that they all have been successfully discharged.

After specifying our template in Event-B, we continue with step (4) and define an interface for the template. For this we use the textual editor that we have developed within the Constelle workbench. Figure 7.5 depicts a screen shot of a set of different template interfaces defined in our textual editor.

Figure 7.3: Screen shot of the Ecore metamodel of the example DSL

In order to connect the Event-B specification and the (semantic, or structural, or constraint) interface through a corresponding specification template, we execute the Event-B-to-Template model transformation. This transformation matches Event-B elements with interface elements by their names and generates intermediate elements of the specification template to connect them (i.e. to wrap the Event-B code into the interface). For example, the operation enqueue of the semantic interface template_queue from Figure 7.5 is matched with (connected to) the event enqueue of the machine template_queue_machine depicted in Figure 7.4.
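The name-based matching performed when wrapping Event-B code into an interface can be sketched in Python. The sketch makes simplifying assumptions: elements are represented by their names only, the function wrap_by_name is hypothetical, and the operation dequeue is invented to show what happens to an interface element without a counterpart.

```python
from typing import Dict, List, Tuple


def wrap_by_name(interface_elements: List[str],
                 eventb_elements: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Link each interface element to the Event-B element with the same name;
    also report the interface elements that have no counterpart."""
    available = set(eventb_elements)
    links = {e: e for e in interface_elements if e in available}
    unmatched = [e for e in interface_elements if e not in available]
    return links, unmatched


links, unmatched = wrap_by_name(
    ["enqueue", "dequeue"],         # operations of the interface (dequeue is assumed)
    ["INITIALISATION", "enqueue"],  # events of the machine
)
```

Reporting unmatched elements is a design choice of this sketch; the text does not state how the Event-B-to-Template transformation treats interface elements that have no Event-B counterpart.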

7.3.3 Constelle editor

The constructed specification templates are composed together to form the definition of the dynamic semantics of a DSL. For this activity we provide a Constelle table editor. This editor allows for constructing a semantic module using the table notation that we defined earlier in Chapter 4. In such a table, the interface elements of the semantic module (its operations, dynamic and static parameters appearing in the leftmost column of the table) are mapped to the interface elements of the invoked semantic templates (appearing in other columns of the table). For example, Figure 7.6 shows the semantic module Robotic Arm Parallel in the table editor of the Constelle language. Compare this screen shot with Table 4.2 on page 64.

Figure 7.4: Screen shot of the Event-B specification of the Queue template

Figure 7.5: Screen shot of library interfaces in the textual editor

Figure 7.6: Screen shot of the Constelle table editor

Gluing guards can be added to a semantic module using the wizard depicted in Figure 7.7. The layout of this dialog follows the structure of the class ConstraintInvocation (from the Constelle metamodel, see Figure 6.2): the fields Name, Restricts, and Invokes represent the attributes and associations of the class (i.e. name, restricts, and invokes). Other fields are updated dynamically according to the operation and the constraint template chosen in the fields Restricts and Invokes. For example, x, y, TypeB, TypeA, and functionAB correspond to the static parameters of the constraint template template_IsFunctionOf chosen in the field Invokes.

Figure 7.7: Screen shot of the wizard for constructing a gluing guard

7.3.4 Generating and validating Event-B specifications

After defining semantic modules in the Constelle table editor, we generate an Event-B specification for each of the semantic modules using the Constelle-to-Event-B model transformation (described in detail in Chapter 6). We use the generated Event-B specifications and Rodin to implement the use cases that we identified in Chapter 4 and explained in terms of the LEGO allegory (see page 53):

• to prototype the DSL implementation in Constelle by generating the corresponding Event-B specifications (using the Constelle-to-Event-B model transformation);
• to check applicability and compatibility of invoked specification templates by ensuring syntactic and logical correctness of the generated Event-B specifications (the logical correctness of an Event-B specification is defined in the form of proof obligations);
• to validate the prototyped DSL implementation against (informal) requirements by executing the generated Event-B specification (for example, using the ProB animator);
• to verify that the prototyped DSL fulfills certain properties by analyzing the generated Event-B specification using (automatic) provers and/or model checkers of Rodin;
• to simulate DSL programs by executing the generated Event-B specifications (and possibly enhance such a simulation with a domain-specific visualization using BMotion Studio).

Thus, the Constelle DSL combined with the tool support of the Event-B formalism, Rodin, implements the use cases of defining the dynamic semantics of a DSL according to the actual implementation of the DSL (i.e. the principal design solution vs. the requirements of the DSL). For example, Figure 7.8 shows the ProB animation of the Event-B machine generated for the semantic module Robotic Arm Parallel.

Figure 7.8: Screen shot of the ProB animation of the generated Event-B machine

Note that the second of the listed use cases implies checking consistency of the prototyped DSL through two different techniques: ensuring syntactic correctness of the generated Event-B specifications and discharging proof obligations. For the latter, we do not need to discharge all proof obligations generated by Rodin tools, but only the subset that can be identified using the technique described in Chapter 6 (see Formula (6.30)). However, our prototype implementation of Constelle does not yet support (automatic) filtering (identification) of proof obligations that should be discharged for the generated Event-B specifications. Such functionality is described in Section 7.4 as a part of future work. Besides discharging proof obligations, we use the Event-B syntactic analyzer provided by Rodin to identify potential type mismatches in our Constelle models. For example, let’s assume that we want to modify the semantic module Robotic Arm Parallel in the following way (see Figure 7.9(a)). We introduce a new construct ActionStatement representing an occurrence of an action in a task. Thus, in this new version, the task of the operation taskStm consists of ActionStatements, rather than of simple Actions. Correspondingly, we change the invocation of the Request template (the distributor aspect/column): the static parameter ElementType is substituted by the type ActionStatement (rather than by the type Actions as in the previous version). Now we want to check that this modified definition is still consistent. Figure 7.9(b) shows a fragment of the Event-B machine generated from this modified semantic module. We see that there is a syntactic error in the event handActionStm: ‘Types Actions and ActionStatement do not match’ (all three appearances of the parameter action are highlighted by the same error). This means that the two template invocations driver2 and distributor of the modified semantic module are incompatible. These template invocations overlap in (i.e. share) the dynamic parameter action of the operation handActionStm (that is transformed into the event handActionStm, depicted in Figure 7.9(b)). According to the Event-B specification of the template Queue (invoked in driver2), action is of the type HandActions (substituting the static parameter ElementType of driver2 in Figure 7.9(a)). According to the Event-B specification of the template Request (invoked in distributor), action is of the type ActionStatement (see the Event-B machine for the Request template on page 59: request_body ∈ ℙ(ElementType)). Thus, two template invocations assign two different types to the shared parameter action. The syntactic analyzer of Rodin helps to identify such inconsistencies in a Constelle model.

(a) Constelle table of the modified Robotic Arm Parallel

(b) A fragment of the generated Event-B specification: the syntactic error is highlighted

Figure 7.9: Screen shot of the ProB animation of the generated Event-B machine

Note that in the use cases listed above, Constelle plays a role only in prototyping the DSL implementation and creating the corresponding specification of the dynamic semantics. All types of analysis are carried out by the back-end formalism: from the syntactic analysis of the generated Event-B code to model checking. As a result, to benefit from having a definition of the DSL dynamic semantics, one needs to be able to work with the back-end formalism.
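The type conflict on the shared parameter action can be illustrated with a small sketch. This is not part of the Constelle workbench or of Rodin; the dictionary-based model of template invocations and the function name find_type_conflicts are assumptions made only for illustration. The sketch merely shows the kind of check that the Rodin syntactic analyzer performs implicitly: each dynamic parameter shared by several template invocations must receive the same type from all of them.

```python
from collections import defaultdict

# Hypothetical, simplified model (an assumption, not the Constelle metamodel):
# each template invocation maps the dynamic parameters it uses to the type
# that the invoked template assigns to them.
invocations = {
    "driver2": {"action": "HandActions"},          # Queue template
    "distributor": {"action": "ActionStatement"},  # Request template
}


def find_type_conflicts(invocations):
    """Return {parameter: {type: [invocation names]}} for every parameter
    that is given more than one type by the invocations sharing it."""
    types_by_param = defaultdict(lambda: defaultdict(list))
    for invocation_name, params in invocations.items():
        for param, type_name in params.items():
            types_by_param[param][type_name].append(invocation_name)
    return {
        param: dict(types)
        for param, types in types_by_param.items()
        if len(types) > 1
    }


conflicts = find_type_conflicts(invocations)
for param, types in conflicts.items():
    print(f"parameter '{param}' is typed inconsistently: {types}")
```

In this sketch the parameter action is reported because driver2 and distributor assign different types to it, which mirrors the error highlighted in Figure 7.9(b).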

7.4 Conclusions and Future Work

In this chapter we described the Constelle workbench, which implements our design presented in Chapters 4 and 6 and supports the process of defining the dynamic semantics of a DSL using specification templates. The architecture of the Constelle workbench allows for its extension according to our long-term vision outlined in Chapter 3. In the next chapter we use the Constelle workbench in order to perform a validation study. The Constelle workbench is a proof of concept rather than a self-contained workbench that supports the complete process of developing and maintaining the dynamic semantics of a DSL. In particular, we can identify the following functionality that is missing to make our implementation mature enough for practical usage.

• Filtering proof obligations that should be discharged for a generated Event-B specification, according to the identification scheme defined in Chapter 6.

• An inverse mapping from compilation and/or analysis results provided by Rodin tools (such as syntactic errors and undischarged proof obligations) to the concepts and constructs used in a Constelle definition. Such a mapping will facilitate interpreting the feedback provided by Rodin tools in terms of the definition of the dynamic semantics of a DSL and, thus, support a DSL developer who does not know the Event-B formalism and the specifics of its tool set.

Nevertheless, as we will demonstrate in the next chapter, the Constelle workbench can be successfully used for defining the dynamic semantics of a DSL, prototyping a (new) implementation of the DSL, and getting extra insights into the resulting design of the DSL.

Chapter 8

Validation of Constelle

There is nothing magical about specification. It will not eliminate all errors. Thinking does not guarantee that you will not make mistakes. But not thinking guarantees that you will.

Leslie Lamport, Who Builds a House without Drawing Blueprints?

In this chapter, we validate the proposed approach for defining the dynamic semantics of a DSL using specification templates and Constelle. In particular, we address the following (validation) question: is the proposed approach usable and useful? For answering this question, we first design a validation study, i.e. answer the following (design) question: how can we find out whether the proposed approach is usable and useful? (Section 8.2). Then we apply this design of a validation study for conducting a particular experiment and analyzing its results (Section 8.3). The definition process and the Constelle workbench (both described in Chapter 7) result from an iterative application of our ideas (formulated in Chapter 4 using the LEGO allegory) to three different DSLs. In particular, we can identify the following three milestones of designing Constelle and the corresponding definition process.

• We used the LACE case study (presented in Chapter 2) as our main source of inspiration and motivation. In particular, the design of Constelle is the result of generalizing the LACE-to-Event-B model transformation that constructs Event-B specifications of the dynamic semantics of LACE.

• Manders applied Constelle (the first raw version of its implementation) to define the dynamic semantics of the Shot puzzle (the DSL that he uses as a running example in his PhD dissertation [63]). This case study revealed a number of programming and design mistakes in the metamodel and implementation of Constelle, which we have corrected since then. Moreover, this case study identified a number of restrictions imposed by Constelle, which we discuss in Chapter 9.

• In the validation experiment (presented further in this chapter), the dynamic semantics of SLCO (Simple Language of Communicating Objects) [27] was (partly) defined using Constelle by an external researcher. This case study resulted in formulating a clear process of defining the dynamic semantics of a DSL using Constelle (described in Chapter 7) and provided a number of interesting insights (which we discuss further in Section 8.3).

The first two of the listed milestones were performed in an ad-hoc manner, i.e. we did not plan them thoroughly in advance, but rather improvised in response to the available and non-structured feedback. In order to get an objective outcome and to mitigate the potential bias, in the third experiment we explicitly chose and followed an empirical research method. As a result, the design of the validation study (presented in this chapter) can be applied (reused) in future experiments on defining the dynamic semantics of a DSL using Constelle. An overview of the empirical methods that we use in our validation study is given in Section 8.1.

8.1 Empirical Methods

In empirical research, one gains knowledge by means of observation or experience (compared to theoretical research, where one develops theories to support observed phenomena). In the context of software engineering, most empirical methods study the human activities of the software development process. Many of these methods are drawn from disciplines that study human behavior and, as such, ‘have known flaws and can only provide limited, qualified evidence about the phenomena being studied’ [26]. The key focus of empirical methods is to manage informality of an empirical study, mitigate possible flaws and misinterpretations resulting from the human bias, and to enhance clarity and transparency of conclusions drawn from the study. In our validation study we use the following empirical methods.

Qualitative study explores a phenomenon rather than measures it (in contrast to a quantitative study). In particular, a qualitative study examines the ‘why’ and ‘how’ of decision making, and thus, provides insights into the underlying reasons, opinions, and motivations. In contrast, a quantitative study uses statistical data to prove hypotheses formulated as a result of a qualitative study.

Action research combines conducting research on the process (of solving a problem) with actively participating in improving (or changing) this process [26]. In comparison, the well-known empirical method of a case study focuses on studying the effects of a change (i.e. of applying our method) and has an observational (rather than influential) nature [83].

GQM (Goal/Question/Metric) method has a long history of usage in the context of improving and provides a structured approach to identify which metrics should be measured [93]. GQM defines a study model on three levels (i.e. in three consecutive steps): goals, questions, and metrics.

Concurrent triangulation strategy implies that different methods are used concurrently to support, corroborate, and cross-validate findings of the study [26].

8.2 Setting up a Validation Study

To evaluate our method of defining the dynamic semantics of a DSL, we perform a validation study. The validation study has a corresponding object and subject: a DSL, whose dynamic semantics is defined, and a DSL developer, who uses Constelle to define the dynamic semantics of the DSL. Based on the nature of our method, we can distinguish the following characteristics of such a study.

• To be representative, the study should use a DSL with a non-trivial dynamic semantics (to increase the likelihood that the dynamic semantics is composed of more than one semantic module).

• A subject has to know and/or learn the DSL, the Event-B formalism, and Constelle. Thus, such a study is a resource-intensive task for the subject.

• As our method is novel and possibly not yet optimal, we aim to learn new insights and improve our method, rather than to measure its impact and compare it with existing methods and practices.

Taking into account these characteristics, we choose to perform a qualitative study. Applying a qualitative study to Constelle allows for using a small data set (i.e. number of subjects and objects) and for validating our method by comparing the insights gained during the study with our prior assumptions, motivation, and vision. The possibility to use a small data set makes the validation study feasible, as we need to find a non-trivial DSL and invite a DSL developer to perform a resource-intensive task for us.

In our validation study, a DSL developer uses Constelle to define the dynamic semantics of a DSL. This means that the DSL developer needs to learn how to use Constelle. To make the process of learning and adopting Constelle more efficient, and at the same time to learn from this process, we choose to perform action research. Note that we choose action research rather than a case study, as our method is not yet at the stage where we want to analyze how it affects design, development, and usage of a DSL. For the study of our method, action research means that we teach the DSL developer how to use Constelle and Event-B, help the DSL developer in applying our method to the concrete DSL throughout the whole study, and even update our implementation and our definition process on-the-fly using the feedback obtained from the DSL developer.
However, at the same time we strive to validate our ideas by learning new insights from the DSL developer, from difficulties that they experience, and from their way of applying our method. According to [26], the biggest challenge of performing an action research is that it is not supported with an established evaluation framework. Such an evaluation framework ensures that the conducted research is rigorous and reduces the potential researcher bias. For this, an evaluation framework helps researchers to position their experiment within the field of study and to consider carefully the nature and assumptions underlying their work [20]. To establish an evaluation framework for our validation study, we draw inspiration from the GQM method and organize the structure of our study following its steps. As GQM is meant for designing a quantitative study but we conduct a qualitative study, we adapt the last step of this method: instead of deriving which metrics should be measured, we define how we collect data during the qualitative study (Section 8.2.3). Overall, we follow steps of GQM to identify aspects that we should look at (activities, decisions, artifacts, etc.) and the corresponding details that we should not miss (to check our assumptions and reduce our biases). In the following three subsections we present the three steps of GQM: goals, questions, and metrics. In Section 8.3 we use the design captured in these three steps to analyze the collected data and to derive conclusions from the validation study.

8.2.1 Goal Definition

The GQM method starts top-down with the definition of an explicit measurement goal (or in our case, a goal of the qualitative study). In order to clarify what constitutes a study goal and to structure its definition, GQM provides a template for a goal definition. Table 8.1 captures the application of such a template to our study. In particular, this table captures the purpose (what objects we analyze and why), the perspective (what we focus on and from whose point of view), and the context characteristics (what are the specifics of the process under study).

Table 8.1: GQM goal definition for the validation study

Analyze (i.e. the objects under study):
• the definition process (as defined in Chapter 7),
• Constelle (and the supporting tool set),
• the hypothesis of reusable specification templates.

For the purpose of:
• checking the applicability of the proposed method,
• improving Constelle and the definition process,
• assessing usefulness of the proposed method.

With respect to (i.e. the study focuses on):
• a complete and clear definition of the dynamic semantics,
• modularity of the definition of the dynamic semantics,
• reusability (i.e. genericity) of semantic modules.

From the viewpoint of:
a DSL designer (developer)

In the context of (i.e. the environment in which the study takes place):
• defining the dynamic semantics of an existing DSL
• by a person, who did not participate in the development of this DSL in the first place.

In principle, in our validation study we aim to answer the three following questions: does our method work? how can we improve our method? and is our method useful? For this, we look into the ideas that form our method (reusable specification templates and their composition for defining the dynamic semantics of a DSL). Based on the identified goals (the objects under study in the top row of the table), we define the following criteria of failure and/or success of our validation study.

Criteria of failure of the validation study

• the definition process is not applicable and it is not clear how to improve it,

• Constelle is not applicable and it is not clear how to improve it,

• no candidates for reusable specification templates were found,

• Constelle and Event-B are not applicable for expressing reusable specification templates proposed (identified) by the DSL developer.

Criteria of success of the validation study

• the definition process is applicable and/or can be improved,

• Constelle is expressive enough to support the definition process,

• a number of improvements and insights are identified,

• a number of candidates for reusable specification templates were found.

8.2.2 Questions

According to GQM, each goal is refined into a set of questions that break down the studied issue into its main components. Questions do not only provide an intermediate step between abstract goals and concrete techniques for collecting data, but also support interpretation of the data (gathered during the study) when we draw conclusions about the goals of the study (i.e. when we decide whether the goals are reached). We formulate the following questions for our study.

1. What kind of information about the DSL is necessary to define its dynamic semantics?

2. What are the steps that one follows when defining the dynamic semantics of the DSL?

3. How does the decomposition of the dynamic semantics happen?

4. What are the guidelines for recognizing a semantic module?

5. What changes in the definition of the dynamic semantics of the DSL between the iterations of the definition process?

6. How to compose the dynamic semantics of the DSL out of semantic modules through their semantic interfaces?

7. Are there reusable specification templates emerging as a result of the definition?

8. What can the DSL developer learn from defining the dynamic semantics of the DSL?

9. How can one use the resulting definition of the dynamic semantics?

10. What is the scope of Constelle, what are its expressive power and limitations?

11. What can we change in Constelle to improve the design of Constelle (the Constelle language)?

12. How does the connection with the back-end formalism (Event-B) influence the definition process and the resulting definition of the dynamic semantics of the DSL?

Following the qualitative nature of our study, we formulate most of these questions as open questions (examining ‘what’ and ‘how’ of the subject’s experience). Figure 8.1 shows the mind map (or concept map) of how the goals of our study are related to (refined into) the formulated questions. First (see Figure 8.1 on the left), we instantiate the study goals into a set of concepts using combinations of the objects, purposes, and focuses of the study (stated in Table 8.1): such as ‘applicability of the definition process’ (combination of a purpose and an object) and ‘modularity of the definition’ (a focus). Then we derive study questions from these concepts using our knowledge of the method under study. For example, the definition process is analyzed through studying its steps. Modularity of the definition is analyzed through studying the (de)composition of the dynamic semantics. Improvement of Constelle (Figure 8.1 on the bottom) addresses its weak sides in the context of its strong and bad sides addressed in the applicability of Constelle (‘the good, the bad, and the ugly’).

Some of the formulated questions have hypothetical answers: such as ‘if not possible to decompose’ (Figure 8.1, on the right). Such hypotheses stimulate thinking about potential deviations of the results (obtained during the study) from our expectations and capture knowledge (and assumptions) that we currently have about our method [93]. Stating hypotheses can help to not miss some important details (data) that should be collected (observed) during the study. We relate the emerging hypotheses to other study questions from the list or formulate auxiliary questions in order to address a hypothesis.

Figure 8.1: Mind map of refining the goals into the study questions

8.2.3 Questionnaires

After refining the goals into the list of questions, we need to define how to collect all necessary information in order to answer these questions in a satisfactory way. In a quantitative study, information is collected in the form of metrics. According to GQM, metrics are designed as a refinement of questions (listed in Section 8.2.2) into a quantitative process and/or a set of measurements. As we perform a qualitative study, we refine our questions into a set of questionnaires that allow for collecting empirical data during our study (such as experience and feedback of the DSL developer).

When formulating questionnaires, we map our study questions onto the context and experience of the DSL developer (our study subject). For example, knowing that our DSL developer will be using the Constelle workbench to define the dynamic semantics of their DSL, we can relate to the tools, formalisms, and artifacts that they will be using: Constelle, Event-B, Ecore, DSL metamodel. To capture the experience (emotions) of the DSL developer we use notions such as: problem, challenge, hard, happy, trick, recommendation. Moreover, the relation between the study questions and the questionnaires is many-to-many: the same study question can be addressed in multiple places in multiple questionnaires, and the same question of a questionnaire can address multiple study questions. The reason for this is that we need to translate our vision of the validation study into the set of questions for the DSL developer in such a way that we do not influence (direct) answers of the DSL developer and the questions are clear to the DSL developer. An example of such a relation between the study questions and the questionnaires is depicted in Figure 8.2. Here in brackets we refer to the index numbers of the questions (see Appendices C.2 and C.3 for the questionnaires). For the sake of understandability, we use different styles of lines.
Next to the study questions listed above in Section 8.2.2, we identify factors that can influence the results of our study (i.e. determine the utility of the knowledge gained in the study). For example, for applying our method to define the dynamic semantics of a DSL, the DSL developer needs to learn and/or know the DSL, the Event-B formalism, and Constelle. Thus, an important factor for our study is the knowledge (and skills) that the DSL developer possesses before the study, such as their experience with formal methods and with developing and designing DSLs. Such a starting set of existing knowledge and skills is known as a baseline of a study. Following GQM, we include these influencing factors into the questionnaires (i.e. assess them within the same study, among the questions that aim to address the main goals of the study). Our validation study includes the following three questionnaires.

1. A baseline questionnaire is filled in at the beginning of the study and is meant for identifying the input (the baseline) of the study.

2. A logbook questionnaire is filled in regularly during the study and is meant for logging the progress of the definition process. In particular, we ask the DSL developer to trace their progress using a version control system and to fill in the logbook questionnaire when committing (checking in) a new version of their files (i.e. artifacts, such as Event-B specifications, the Constelle model, etc.).

3. A final questionnaire is filled in at the end of the study and is meant for capturing the major impression and experience of the DSL developer.

Figure 8.2: Mind map of refining (some of) the study questions into the questionnaires

The questionnaires can be found in Appendix C. Note that in our questionnaires we use both open questions to the DSL developer, such as How many iterations did you make when specifying the dynamic semantics of the DSL? and questions accompanied with a list of possible answers, such as This semantic module appeared as a result of learning from: (a) the DSL metamodel; (b) the example models; (c) the DSL (concrete) syntax definition; ... We include possible answers (i.e. apply a style of a survey) when we assume that a DSL developer might perform certain steps (activities) naturally, without thinking, and thus, not realizing that these steps are important for us to observe. For example, when defining the dynamic semantics of a DSL, the DSL developer might be studying various sources of information about the DSL (its metamodel, example models, etc.). However, this process can be so natural to the DSL developer, that he/she would not recall all these sources of information, unless asked explicitly (as we do in the latter question given as an example above).

To support the data collected via the questionnaires with an objective input (as the questionnaires capture a subjective view of the DSL developer), we apply the concurrent triangulation strategy (described in Section 8.1). As an extra source of information for such a triangulation, we use a version control system for the artifacts created by the DSL developer during the definition process. In particular, we set up a repository where the DSL developer stores all artifacts created during the validation study. As a result we can access and examine created artifacts, and moreover, keep track of the evolution of these artifacts.

8.3 Conducting the Validation Study

In this section, we describe (a concrete case of) the validation study that we conducted and analyze it according to the design presented in the previous section.

8.3.1 Methodology and Collected Data

As we discussed earlier, in our validation study we perform action research. This empirical method requires close cooperation between a researcher and a participant of the study (a DSL developer). Together with the resource-intensive task for the DSL developer to manage (understand and/or use) three different languages (the DSL, the Event-B formalism, and Constelle), this specificity of the chosen empirical method results in high organizational costs of the validation study. We mitigate the high organizational costs by conducting the study with one of our colleagues (a researcher) who has the expertise in formal methods. Thus, it is easier for us to organize collaboration, we have a common understanding of the motivation for conducting such a study, and it is easier for the DSL developer to learn another formal method (Event-B) and to relate it to his experience with other formal methods. For example, our DSL developer “can read, write, and apply analysis tools to specifications in UML, process algebra, timed automata, and petri nets”1. This setting determines the baseline of our study.

1 In the rest of this chapter, we use the double quotation marks when quoting answers given by our DSL developer.

During the validation study, we had five sessions with the DSL developer (lasting for about an hour each) in order to:

• teach (by demonstration) how to use Event-B, Rodin, and the Constelle workbench,

• solve appearing (mostly technical) issues, and

• give advice on how to apply Constelle in various (concrete) situations.

All these sessions were documented in our notebook. The final, sixth session lasted for 1.5 hours and was used for filling in the final questionnaire. In this way we could capture additional remarks by our DSL developer and discuss some of the emerging results in more detail. In total, during the definition process the DSL developer made 15 commits to the (Git2) repository and filled in one baseline, one final, and 7 logbook questionnaires. Moreover, we received 13 e-mails from our DSL developer requesting various types of help (mostly with technical problems).

8.3.2 Analyzing Results of the Validation Study

To interpret (analyze) the collected data, we use the study questions formulated in the second step of the GQM method (see Section 8.2.2). In particular, we use our mapping between study questions and questionnaires (discussed in Section 8.2.3) to perform a (reverse) mapping of the answers given by our DSL developer onto our study questions. In this way we answer the study questions, i.e. analyze results of the study. Moreover, following the concurrent triangulation strategy (described in Section 8.2.3), we relate the answers given by the DSL developer to the Git repository. For this, we manually read the source code of the artifacts constructed by the DSL developer (Constelle tables, Event-B specifications, structural and semantic interfaces) and study how these artifacts evolve during the definition process. For the latter, we use the EMF model compare plug-in.3 In what follows (Sections 8.3.2.1–8.3.2.12), we answer our study questions one-by-one using (and referring to) the data collected during the study (i.e. to the filled in questionnaires and the Git repository).
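The reverse mapping described above can be sketched as follows. This is only an illustration under stated assumptions: the questionnaire item identifiers (such as logbook.2), the concrete links to study questions, and the placeholder answers are all hypothetical and do not reproduce the actual mapping of Figure 8.2 or the questionnaires of Appendix C. The sketch shows how a many-to-many relation between questionnaire items and study questions lets one group collected answers per study question.

```python
from collections import defaultdict

# Hypothetical many-to-many relation: questionnaire item -> study questions
# (numbered as in Section 8.2.2) that the item helps to answer.
item_to_study_questions = {
    "logbook.2": [3, 4],
    "final.1": [2, 5],
    "final.3": [3],
}

# Placeholder answers; the real answers are free text given by the DSL developer.
answers = {
    "logbook.2": "<answer to logbook.2>",
    "final.1": "<answer to final.1>",
    "final.3": "<answer to final.3>",
}


def answers_by_study_question(mapping, answers):
    """Group the collected answers by the study questions they address."""
    grouped = defaultdict(list)
    for item, text in answers.items():
        for study_question in mapping.get(item, []):
            grouped[study_question].append((item, text))
    return dict(grouped)


grouped = answers_by_study_question(item_to_study_questions, answers)
for study_question in sorted(grouped):
    print(study_question, grouped[study_question])
```

An answer linked to several study questions appears under each of them, which corresponds to the observation that one question of a questionnaire can address multiple study questions.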

8.3.2.1 What kind of information about the DSL is necessary for defining its dynamic semantics? (study question 1)

During the validation study, Constelle was used to define the dynamic semantics of the SLCO DSL (Simple Language of Communicating Objects) [102]. Our DSL developer did not participate in the development of SLCO in the first place. His main goal during our validation study was to “rethink the design of SLCO”, as he wanted to “change how the DSL is implemented”. As a source of information and a reference implementation (or design) of the dynamic semantics of SLCO our DSL developer was using the definition of SLCO in ASF+SDF given in [7]. According to the filled logbook questionnaires, the most common sources of acquiring information about the dynamic semantics of the DSL were “the DSL metamodel” and “the example models” (mentioned in 6 out of 7 log entries). The metamodel of SLCO and the informal description of its dynamic semantics were provided in [27].

2 https://git-scm.com/

3 www.eclipse.org/emf/compare/

8.3.2.2 What are the steps that one follows when defining the dynamic semantics of the DSL? (study question 2)

As a result of the validation study, our (recommended) definition process was reconsidered. In particular, the definition process described in Chapter 7 reflects the actual way in which our DSL developer was defining SLCO in Constelle. For example, he was considering multiple semantic modules in parallel, first specifying them in Event-B and only then connecting them together in a Constelle table. This differs from our initial proposal to define semantic modules in the form of their interfaces, then compose them in a Constelle model, and only then to implement them in the Event-B specifications. Note though, that this outcome might be determined by the background of our DSL developer (i.e. formal methods). Thus, we formulate our definition process rather as a recommendation, one of the possible ways of applying Constelle.

8.3.2.3 How does the decomposition of the dynamic semantics happen? (study question 3)

The final Constelle model of SLCO includes four semantic modules: state machine, sm data, channel, and sequential control flow. The semantic module state machine defines the structure of a state machine (states and transitions) and updates its current state according to the taken transitions. The semantic module sm data is responsible for executing statements associated with the taken transitions and for assigning values to the local variables of the state machines (as a result of such an execution). The semantic module channel defines communication channels between state machines and takes care of delivering messages through these channels. Finally, the semantic module sequential control flow ensures that statements of the taken transition are executed in sequential order (for example, guards of a transition, realized in SLCO in the form of blocking statements, should be executed first in order to enable the transition, i.e. before the transition can be taken).

Moreover, the DSL developer considers a subset of certain gluing guards as another module (which, strictly speaking, is not a semantic module in terms of Constelle). This set of gluing guards, appearing under the common name object, is used to link elements of the state machine module with elements of the channel module, and with elements of the sm data module. From this we conclude that the notion of the semantic module should be revisited: for example, renamed and/or generalized so that it can capture a set of (logically connected) gluing guards.

We discuss the criteria used for the decomposition of the dynamic semantics into semantic modules in the next subsection (see Section 8.3.2.4), together with the guidelines for recognizing a separate semantic module. According to the feedback received from our DSL developer, decomposing the dynamic semantics of the DSL into semantic modules is the most difficult part of the whole definition process.
To design such a decomposition, one needs at the same time to come up with a set of constituent modules and to design how these constituent modules are connected. Gluing guards play an essential role in these connections. However, these Constelle constructs are the hardest to understand.

8.3.2.4 What are the guidelines for recognizing a semantic module? (study question 4) Five out of 7 logbook questionnaires were devoted to the introduction of a new semantic module (i.e. one logbook per module). According to these logbook questionnaires, the major criteria that were used to identify (introduce) a semantic module were: 144 Validation of Constelle

• independence of the semantics when being implemented by a DSL developer (mentioned in three out of five log entries),
• independence of the semantics specified for a subset of the DSL metamodel (mentioned in two out of five log entries),
• independence of the semantics when being used by a DSL practitioner (mentioned in two out of five log entries),
• hiding a (difficult) design decision (mentioned in one log entry and related to the sequential control flow semantic module).

Thus, separate semantic modules are implemented and/or used independently from each other. This confirms our prior assumption of and motivation for having different types of modularity of a definition of the dynamic semantics of a DSL (such as architectural modules and semantic features in the LACE case study, see Chapter 2).

In relation to the sizes of the semantic modules of SLCO, there are two or three operations in their interfaces (each operation with only one dynamic parameter). Event-B specifications that implement these semantic modules are quite compact (for example, three out of the four specifications can fit separately on one screen). As neither of these (rather small) sizes was mentioned as one of the guidelines for identifying a semantic module, we conclude that our DSL developer did not consider sizes of his Event-B specifications important.

8.3.2.5 What changes in the definition of the dynamic semantics of the DSL between the iterations of the definition process? (study question 5)

When filling in the final questionnaire, our DSL developer identified three iterations (or milestones) in the definition process that he followed: (1) defining the basic concepts of a state machine; (2) adding data and its processing to the state machine; (3) introducing communication between state machines. However, as the time for the validation study was limited, the definition of the dynamic semantics of SLCO was not finished. Our DSL developer foresees that two to three more iterations are required to achieve his goal of redesigning SLCO. For example, one more iteration is necessary for extending a channel with a buffer of messages.

Figure 8.3 captures the evolution of defining the dynamic semantics of SLCO in Constelle, derived from the Git repository. This graph shows semantic modules (depicted as blue circles), operations of the semantic interface (depicted as orange diamonds), and gluing guards (depicted as green squares) along the versions of the Constelle model (the horizontal axis of the graph). In Figure 8.3 we group the versions of the Constelle model into iterations as they were identified by our DSL developer. In what follows we describe some details of this evolution derived from a closer inspection of the Constelle model and its successive versions.

In the first iteration two semantic modules, state machine and sm data, define the basic structure of the state machine (including statements on transitions and local variables) and its basic behavior. This basic behavior is projected on (represented by) two operations of the semantic interface (two orange diamonds at version 1 in Figure 8.3): to check whether a transition is enabled and to fire (take) an enabled transition. The second iteration adds the semantic module sequential control flow.
Along with the introduction of this semantic module, two new operations appear (see Figure 8.3). The newly added components (the semantic module sequential control flow and two new operations) define the sequential execution of statements on a transition. Figure 8.4 shows details of this change (from right to left: step from version 1 to version 2). From this figure we see that adding new operations results in a more fine-grained semantic interface of the dynamic semantics. In other words, we can consider this change as a split of operations: both the meaning and the dynamic parameters of the ‘old’ operations are divided among ‘new’ operations. This more fine-grained interface results in more connections between the interface elements and semantic modules (template invocations), see Interface Element Substitutions in Figure 8.4.

[Figure 8.3: Evolution of the Constelle model of the SLCO dynamic semantics. Horizontal axis: versions 1 to 7 of the Constelle model, grouped into iteration 1, iteration 2, and iteration 3. Legend: semantic module (template invocation), whose pins illustrate the number of interface element substitutions; operation of the semantic interface, whose pins illustrate the number of dynamic parameters; gluing guard.]

[Figure 8.4: Constelle model of SLCO: version 1 on the right and version 2 on the left]

Note that an interesting design choice was made during the second iteration. In version 4 versus version 3 (see Figure 8.3), there are fewer gluing guards (green squares on the top of the column) but more interface element substitutions (pins of blue circles). Both Constelle constructs, gluing guards and interface element substitutions, can be used to connect (or relate) separate template invocations (or semantic modules) to compose the dynamic semantics of a DSL. Here the DSL developer was choosing a more suitable way of connecting three semantic modules. In the later versions (see versions 6 and 7) both interface element substitutions and gluing guards are used extensively. Thus, we do not conclude that one of the constructs is preferable to the other.

In the third iteration the semantic module channel is introduced (new blue circle in version 5), then the set of gluing guards object is added (new green squares in version 6), and finally both these concepts are connected to the existing model (more pins and more squares in version 7). Moreover, as a projection of these changes to the semantic interface, a new operation is added (new orange diamond in version 7).

An important observation that we make from the definition process of SLCO is the way in which new semantic modules appear. None of the semantic modules appears as a result of splitting a semantic module into (smaller) semantic modules. In other words, we can characterize the definition process as an iterative composition (adding functionalities one-by-one to the basic functionality) rather than iterative decomposition (splitting one monolithic functionality into smaller functionalities) of the dynamic semantics.

8.3.2.6 How to compose the dynamic semantics of the DSL out of semantic modules through their semantic interfaces? (study question 6)

The semantic interface of the Constelle definition of SLCO has five operations (depicted as orange diamonds in Figure 8.3). As we see from the evolution of the Constelle model, these five operations were not introduced at once, but were added along with the semantic modules (aspects) of the SLCO definition. As we discussed above (in relation to Figure 8.4), operations of the semantic interface are decomposed into new operations that represent more fine-grained steps of the execution.

This observation helps us to relate constructs of Constelle to the existing techniques of defining the dynamic semantics of GPLs. In particular, we can compare the observed decomposition of the semantic interface into more fine-grained operations with a conversion from the style of natural (or big-step) semantics [73] towards the style of structural operational (or small-step) semantics [73]. Big-step semantics describes results of the overall execution of a (DSL) program. Small-step semantics describes the individual steps of the computations. In the definition process of SLCO depicted in Figure 8.3, the introduction of more semantic modules corresponds to a more detailed description of the steps of the computations.
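The contrast between the two styles can be illustrated with a minimal sketch. It uses a toy expression language with integer literals and addition, which is an assumption made purely for illustration and is unrelated to SLCO or Constelle; the names Num, Add, eval_big, step, and trace are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]


def eval_big(e: Expr) -> int:
    """Big-step style: relate an expression directly to its final value."""
    if isinstance(e, Num):
        return e.value
    return eval_big(e.left) + eval_big(e.right)


def step(e: Expr) -> Expr:
    """Small-step style: perform exactly one computation step."""
    if isinstance(e, Num):
        raise ValueError("a value cannot take a step")
    if isinstance(e.left, Add):
        return Add(step(e.left), e.right)   # reduce the left operand first
    if isinstance(e.right, Add):
        return Add(e.left, step(e.right))   # then the right operand
    return Num(e.left.value + e.right.value)


def trace(e: Expr) -> list:
    """Collect every intermediate state until a value is reached."""
    states = [e]
    while not isinstance(states[-1], Num):
        states.append(step(states[-1]))
    return states


program = Add(Add(Num(1), Num(2)), Num(3))
```

Here eval_big exposes only the overall result, whereas trace exposes each intermediate state, which loosely mirrors how a more fine-grained semantic interface exposes more steps of the execution.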

8.3.2.7 Are there reusable specification templates emerging as a result of the definition? (study question 7)

Our DSL developer recognizes the semantic modules state machine and channel as generic software design solutions that can be applied in a scope broader than the dynamic semantics of SLCO. Moreover, he foresees that he can reuse state machine (without channel) in one of his (other) running projects. Note that the remark that our DSL developer can reuse “state machine without channel” confirms the common knowledge that the possibility to separate modules from each other facilitates their reuse.

The semantic module sequential control flow captures a technique that is not specific to SLCO, but is specific to applying Event-B and Constelle. In particular, there is no way to restrict (impose a sequential order on) the control flow of operations in a Constelle model and/or events in an Event-B specification. As a workaround for this limitation, we proposed that our DSL developer introduce the semantic module sequential control flow. We consider sequential control flow a specification trick that is reusable in other specifications.

As such, all three of these semantic modules can be considered reusable specification templates and can be included in our library. However, none of these specification templates has been reused within the definition of SLCO. Moreover, the DSL developer did not use our specification templates presented in Chapter 4 (Queue, Request, and Partial Order). Therefore, more experiments are necessary to evaluate (assess and measure) how reusable all these specification templates are.
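The idea behind such a workaround can be sketched in Python. This is a hedged illustration, not the actual sequential control flow module of the SLCO definition: events are modeled as guard/action pairs, any enabled event may fire, and a counter variable (here called pc, a hypothetical name) is added to the guards to force a sequential order. All names are assumptions.

```python
import random


def make_events(use_pc: bool):
    """Build three events as (name, guard, action) triples over a state dict."""
    def guard(i):
        # Without pc, a statement is enabled as long as it has not run yet.
        # With pc, statement i is additionally enabled only when pc == i.
        return lambda s: i not in s["done"] and (not use_pc or s["pc"] == i)

    def action(i):
        def apply(s):
            s["done"].append(i)
            s["pc"] += 1
        return apply

    return [(f"stmt{i}", guard(i), action(i)) for i in range(3)]


def run(events, seed):
    """Repeatedly fire an arbitrary enabled event until none is enabled."""
    rng = random.Random(seed)
    state = {"pc": 0, "done": []}
    while True:
        enabled = [e for e in events if e[1](state)]
        if not enabled:
            return state["done"]
        rng.choice(enabled)[2](state)
```

With use_pc=False the three statements may execute in any order; with use_pc=True the extra guard leaves exactly one event enabled at a time, so the order is always 0, 1, 2.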

8.3.2.8 What the DSL developer can learn from defining the dynamic semantics of the DSL? (study question 8)

The dynamic semantics of SLCO has been formalized before (see the PhD dissertation of Engelen [27]). Therefore, its definition does not reveal any gaps and/or inconsistencies (in contrast with our hypothesis of possible outcomes of the definition process). However, the existing formal specification of SLCO is “hard to read”. Defining the dynamic semantics of SLCO in Constelle helped our DSL developer to understand the existing formalization and to gain extra insights in the DSL. For example, the DSL developer was “forced to think” about the concept of atomic statements and as a result gained understanding of an effect of transitions on the global state of a system (the most complex and “most dynamic” part of the DSL, according to our DSL developer).

All Event-B specifications of the resulting specification of SLCO have their proof obligations (POs) discharged. According to our DSL developer, the feedback that he got from the tools (such as Rodin automatic provers) when discharging proof obligations was helpful for constructing the Event-B specifications.

8.3.2.9 How can one use the resulting definition of the dynamic semantics? (study question 9)

From the point of view of defining the dynamic semantics of SLCO in Constelle (i.e. completeness of the definition), our DSL developer was able to express what he intended to express within the time period of our validation study and up to the created version. He has not finished the definition of his design though and was aiming to continue later. Moreover, he had not yet performed any type of analysis of the resulting definition.

From the point of view of the clarity and intelligibility of the definition, our DSL developer believes that he would be able to use the resulting definition as documentation for his own reference, but not for communicating with other interested participants. In other words, the definition of SLCO in Constelle cannot be used per se, without any additional documentation and/or explanation (both of SLCO and Constelle).

8.3.2.10 What is the scope of Constelle, what are its expressiveness power and limitations? (study question 10)

In general, our DSL developer was satisfied with the expressive means of Constelle. He especially found the table notation of Constelle very useful: “the Constelle table gives a good overview of the DSL specification”. Apart from finding it hard to learn Constelle and to apply a modular approach to the specification (as described above in Section 8.3.2.3), he encountered the following two limitations, with which he was not satisfied.

The first limitation is in the scope of the Event-B formalism: it is not possible to specify functions for returning a value in Event-B. As a workaround, one needs to use guards of an event to calculate the value of a parameter. As a result, the formalism “forces to specify a DSL concept in a different way, it is restrictive”.

The second limitation is determined by Constelle. As described in Section 8.3.2.7, there is no way to restrict the control flow in a Constelle model or in an Event-B specification. The semantic module sequential control flow was used to overcome this limitation in the definition of the dynamic semantics of SLCO. When discussing the semantic module sequential control flow with our DSL developer, we suggested that in the actual implementation of SLCO there should be a corresponding (software) construction or component that implements sequential control flow (for example, a program counter). Our DSL developer agreed, but considered sequential control flow “auxiliary on this conceptual level of defining the DSL”. He would prefer not to include this semantic module in the definition of SLCO if it were possible to use other means of Constelle instead.

However, we do not consider the latter limitation to be a flaw in Constelle. As discussed in Chapters 2 and 4, we deliberately chose to define the dynamic semantics of a DSL according to its actual implementation, capturing all design decisions explicitly. These include a design decision on how to implement the sequential control flow. Our DSL developer would prefer to have an implicit description of the sequential control flow, which contradicts the described principle of Constelle. Note though that this contradiction might be determined by the background of our DSL developer (formal methods). Therefore, we conclude that more studies (or experiments) are necessary to validate the described principle and/or evaluate its usefulness in practice.
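The first limitation discussed above (computing a value through the guards of an event rather than through a function) can be illustrated with a small Python sketch. It is not Event-B and makes several assumptions: an event is fired by searching a finite range of candidate parameter values for one that satisfies all guards, and the example guards pin the parameter m to the largest element of a set. The names fire, guards, and record are hypothetical.

```python
def fire(state, candidates, guards, action):
    """Fire an event with the first candidate parameter satisfying all guards."""
    for value in candidates:
        if all(guard(state, value) for guard in guards):
            action(state, value)
            return True
    return False  # no parameter value satisfies the guards: event is disabled


# Instead of a function max(values), the guards characterize the result m:
guards = [
    lambda s, m: m in s["values"],                   # m is an element of the set
    lambda s, m: all(x <= m for x in s["values"]),   # no element exceeds m
]


def record(s, m):
    """Action of the event: store the parameter value in the state."""
    s["largest"] = m


state = {"values": {3, 9, 4}, "largest": None}
fired = fire(state, range(16), guards, record)
```

The value is never returned; it is determined declaratively by the guards, which is the change of specification style that the DSL developer found restrictive.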

8.3.2.11 What can we change in Constelle to improve the design of Constelle? (study question 11)

An important insight that we gained from the validation study and managed to improve (on-the-fly) concerns the Constelle construct of different template parameters and their relations with each other. In particular, in our previous version of Constelle, each DynamicParameter of an Operation had an explicit association with its type, which could be one of the StaticParameters (all from the same scope determined by the SemanticInterface and its StructuralInterface). In practice this meant that our DSL developer could choose (or even should have defined) a type for each dynamic parameter in his semantic interfaces (for example, as it is done in Figure 8.5(a)).

(a) Old design

    semantic interface template_queue
    uses template_basic
    {
      operation enqueue (element: ElementType)
      operation dequeue (element: ElementType)
    }

(b) Fragment of the Event-B code

    MACHINE template_queue_machine
    SEES template_queue_context
    VARIABLES
      queue
    INVARIANTS
      inv1 : queue ∈ ℕ ⇸ ElementType
    EVENTS
    Initialisation
      begin
        act1 : queue := ∅
      end
    Event enqueue ≙
      any element, index
      where
        grd1 : element ∈ ElementType
        grd2 : index ∈ ℕ
        grd3 : queue ≠ ∅ ⇒ (∀i · i ∈ dom(queue) ⇒ index > i)
      then
        act2 : queue := queue ∪ {index ↦ element}
      end
    Event dequeue ≙
      any element, index
      where
        grd4 : index ↦ element ∈ queue
        grd5 : ∀i · i ∈ dom(queue) ⇒ index ≤ i
      then
        act3 : queue := queue \ {index ↦ element}
      end
    END

(c) New design

    semantic interface template_queue uses template_basic {
      operation enqueue (element)
      operation dequeue (element)
    }

Figure 8.5: An example of the change of the Constelle design

However, a semantic interface is implemented in the corresponding Event-B specification, where such a dynamic parameter is implemented as an Event-B parameter (of an event). The type of this parameter is specified in the Event-B specification (see Figure 8.5(b)).
This Event-B type can correspond to one of the static parameters of the StructuralInterface, but still the association between the parameter and its type is already specified in the Event-B code. Thus, the semantic interface wrapping this Event-B code should not redefine such an association (in a redundant or even conflicting way). After realizing this, we removed the association between a DynamicParameter and a StaticParameter from the Constelle metamodel and updated the user interfaces of the Constelle workbench (the table editor and the textual syntax for defining semantic interfaces, see Figure 8.5(c)). This realization came to us as a result of observing our DSL developer struggling to understand how to use the Constelle table editor, in particular, how to substitute both a dynamic parameter and its type in a consistent way.

Another possible improvement in the design of Constelle is related to the encapsulation of a current state (i.e. Event-B variables) of a semantic module (invoked specification template). In Constelle, it is not possible to access the current state of a semantic module. However, the possibility to examine the current state of a semantic module and to restrict its value using (an invocation of) a constraint template would improve the experience of a DSL developer. Moreover, such a change would fit in our Constelle-to-Event-B semantic mapping that is based on the Event-B∗ techniques of generic instantiation, shared event composition, and refinement. We consider such an improvement in the design of Constelle as future work.
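To make the behavior of the queue template in Figure 8.5(b) more tangible, the following Python sketch simulates its two events. It is based on our reading of the figure and rests on assumptions: the variable queue is modeled as a dict from natural-number indices to elements, the class and method names are hypothetical, and the guard grd1 (membership of element in ElementType) is not checked because ElementType is a static parameter of the template.

```python
class QueueMachine:
    """Toy simulation of the queue template; not generated from Event-B."""

    def __init__(self):
        self.queue = {}  # Initialisation: queue := ∅

    def enqueue_enabled(self, element, index):
        # grd2: index is a natural number
        # grd3: index is greater than every index already in the queue
        return (isinstance(index, int) and index >= 0
                and all(index > i for i in self.queue))

    def enqueue(self, element, index):
        if not self.enqueue_enabled(element, index):
            raise ValueError("enqueue is not enabled for these parameters")
        self.queue[index] = element  # act2: add the pair index ↦ element

    def dequeue_enabled(self, element, index):
        # grd4: the pair index ↦ element is in the queue
        # grd5: index is the smallest index in the queue
        return (index in self.queue and self.queue[index] == element
                and all(index <= i for i in self.queue))

    def dequeue(self, element, index):
        if not self.dequeue_enabled(element, index):
            raise ValueError("dequeue is not enabled for these parameters")
        del self.queue[index]  # act3: remove the pair index ↦ element
```

Note that only element appears in the semantic interface of Figure 8.5(c); in this sketch index is passed explicitly merely to mirror the two event parameters, and the guards decide which values are admissible.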

8.3.2.12 How does the connection with the back-end formalism (Event-B) influence the definition process and the resulting definition of the dynamic semantics of the DSL? (study question 12)

When answering the final questionnaire, our DSL developer recommended that we improve the connection of semantic and structural interfaces with the back-end Event-B code by making it clearer. As mentioned above, gluing guards of Constelle were especially hard to grasp.

From the artifacts created by our DSL developer when defining the dynamic semantics of SLCO in Constelle (and stored in the repository), we observe that he did not use private specification elements. In other words, all elements of his Event-B specifications (in particular, parameters of the events) appear in the semantic interfaces of the specification templates. On the one hand, such a lack of encapsulation might be caused by the complexity of the tool provided for connecting semantic (and structural) interfaces to the back-end Event-B code. On the other hand, all operations and the corresponding events of the Event-B specifications have only one (dynamic) parameter. Thus, the encapsulation was not necessary.

In the context of the Event-B∗ techniques used in the Constelle-to-Event-B model transformation, the DSL developer foresees that he could apply other decomposition mechanisms of Event-B, for example, shared variable decomposition [4]. He also believes that Constelle can be used with other back-end formalisms, for example, with mCRL2 (the formalism that is based on Algebra of Communicating Processes) [32].

8.3.3 Concluding on the Study Goals

Based on the described findings, we conclude that the validation study was successful and most of the goals of the study were achieved (see the list of criteria in Section 8.2.1):

• the definition process was applicable for defining the dynamic semantics of SLCO;
• Constelle was expressive enough for defining the dynamic semantics of SLCO;
• the usefulness of applying Constelle was confirmed;
• a number of improvements for Constelle were indicated and partly realized;
• a number of candidates for reusable specification templates were defined.

At the same time, the focuses of the study (see third line in Table 8.1) were not fully covered:

• The definition of the dynamic semantics of SLCO was not complete (i.e. not finished due to the time restrictions) and was not judged to be clear by our DSL developer (i.e. one cannot use it as documentation per se).
• Applying the “modular specification” approach required quite some effort from our DSL developer. He suggested that a more insightful tutorial and documentation are necessary. However, no specific recommendations have been derived.

• None of the candidates for reusable specification templates has actually been reused. This might be due to the (still too) small size of the constructed definition of the dynamic semantics of SLCO.

The validation study was not without flaws. Due to the time restrictions, the definition of the dynamic semantics of SLCO was not finished. Moreover, the first steps of the definition process were not tracked in the version control system (i.e. the Git repository). Therefore, the overview of the evolution of defining the dynamic semantics of SLCO (presented in Section 8.3.2.5) is not complete. Furthermore, there were a number of technical problems with our tool support, the Constelle workbench. From these observations we conclude that our choice of action research was justified: we could update the definition process and tooling on-the-fly and we could advise on certain design decisions (such as the introduction of sequential control flow).

In the context of action research, the most important outcomes of our study are the changes that happened with a subject (our DSL developer), an object (Constelle), and a technology provider (us) as a result of the study. We identify the following changes resulting from our action research.

• As a result of our validation study, the DSL developer experienced a “shorter feedback loop” when specifying the dynamic semantics of a DSL. That is, he could get feedback (from tools) and update his specification correspondingly at an earlier stage than he was used to (traditionally, a formal specification is constructed on paper and then encoded into a formalism that has tool support). We consider such an experience as agile formal methods.
• As a result of the preparation for our validation study, the pragmatics of Constelle was considered and the definition process was formulated. As a result of conducting the study, the design of Constelle was improved and the definition process was reconsidered. In other words, our method and Constelle matured. Moreover, more specification templates were identified.
• As a result of our validation study, we learned that one of the main focuses of our research (i.e. this thesis), the modularity of a definition of the dynamic semantics of a DSL, might be one of the main challenges of applying our method. Thus, the design of a decomposition should be addressed with more care (in the design and/or in the pragmatics of Constelle).

8.3.4 Threats to Validity

In relation to the generality of the obtained results, action research usually concerns threats to the utility of the knowledge gained [26]. In other words, the assessment of the validity of action research focuses on the practical outcome of the study, rather than on the repeatability of the study. From this point of view, the specifics of the profile of our DSL developer do not violate the validity of the results of our validation study. For example, our DSL developer has a strong background (a PhD degree) in formal methods, which we usually do not expect from an average DSL developer. In terms of repeatability, this is a threat to the validity of our results, as we might not get the same results if we repeat our study with a DSL developer without a strong background in formal methods. However, from the point of view of action research, the strong background in formal methods of our DSL developer helped us to get certain insights, which we otherwise would not get (such as comprehensive usage of Event-B and the modular approach to formal specification). Thus, we consider the obtained results valid, as they are useful.

8.3.5 Future Work

Our study triggered a number of new questions and revealed a number of gaps in our knowledge about the experience of a DSL developer and the definition process that they follow. In this regard, we identify the following questions that can be addressed in future (both qualitative and quantitative) studies and/or related to existing empirical studies in software engineering:

• Is a modular approach a common challenge in software engineering and, more specifically, in MDE? Or is the decomposition of the dynamic semantics of a DSL difficult (only) in the context of our method?
• Does the definition process (such as depicted in Figure 8.3) correspond to any evolutionary process known for MDE/software in general (for example, to Lehman’s laws of software evolution [59])?
• How does our method improve the resulting design of a DSL (and its dynamic semantics)?
• How can we assess the quality of the decomposition/modularization (of the dynamic semantics) enhanced by Constelle?
• How does Constelle perform compared to existing (baseline) methods for defining the dynamic semantics of DSLs (or GPLs)?
• How reusable are the specification templates? I.e. how often are they reused in other definitions of the dynamic semantics of DSLs?
• What should be the methodology for introducing (teaching and applying) Constelle in practice?

8.4 Conclusions

The validation study revealed a number of challenges of applying Constelle and a number of benefits resulting from applying Constelle. Some of them correlate with our previous experience, for example, with the insights gained from defining the Shot puzzle in Constelle by Manders (mentioned in the introduction of this chapter). On the whole, the validation study answered our initial question: for the specific case of SLCO, the proposed approach is usable (applicable) and in general is useful for defining the dynamic semantics of a DSL in an iterative fashion. Moreover, we gained an important insight about applying our method in practice: a modular approach might require a change in the mindset of a DSL developer, from thinking about the dynamic semantics of a DSL as a whole to thinking about the dynamic semantics of the DSL as a composition of reusable building blocks.

Chapter 9

Conclusions

In Section 9.1 we give an overview of how our work contributes to the research questions formulated in the introduction. In Section 9.2 we propose directions for future work.

9.1 Contributions

In this thesis we focused on one of the components of a DSL design, its dynamic semantics, and on practical advantages of having its explicit formal definition. In particular, we addressed the following central research question.

RQ: How to define the dynamic semantics of a DSL in a usable and useful way?

This central research question is decomposed into seven more specific questions, addressed in the chapters of this thesis. The first of these questions deals with the content and structure of a definition of the dynamic semantics of a DSL, i.e. explores what type of content should be captured in such a definition and in which form:

RQ1: What constitutes a definition of the dynamic semantics of a DSL?

The second question is closely related to the first one, as it deals with the purpose of a definition of the dynamic semantics of a DSL, i.e. explores how we aim to use such a definition:

RQ2: What are the requirements for a definition of the dynamic semantics of a DSL?

We addressed these two research questions in Chapter 2. We took a bottom-up approach and defined the dynamic semantics of an existing DSL, called LACE. In this LACE case study we followed the principles of the pragmatist stance. For this, we first identified our user roles, i.e. who can work with (make use of) a definition of the dynamic semantics of a DSL. Then we formulated our use cases, i.e. how our users can make use of a definition of the dynamic semantics of a DSL. Aiming at realizing these use cases, we chose a specification formalism that has extensive tool support: Event-B and Rodin. We implemented the semantic mapping from LACE to Event-B in the form of a QVTo model transformation, which automatically generates an Event-B specification for an arbitrary LACE model (program). Enhancing such a definition of the dynamic semantics of LACE with the domain-specific (LACE-like) visualization of the resulting Event-B specifications allowed for demonstrating it to LACE engineers (who cannot use Event-B and Rodin per se) and for conducting the user study. As a result, we got hands-on feedback from both of our user roles.

Based on the feedback of our user roles and on our own experience and insights gained during the case study, we came to conclusions on what should be captured in a definition of the dynamic semantics (i.e. answered research question RQ1) and formulated a list of requirements both for such a definition and for a meta-language that allows for such a definition (i.e. answered research question RQ2). Namely, we identified that a definition of the dynamic semantics of a DSL is usable when it allows for the implementation of various use cases, such as validation and verification of the DSL design, and simulation of DSL programs (models). We identified that a definition of the dynamic semantics of a DSL is useful when it corresponds to (captures) how DSL programs are actually executed (rather than how DSL programs are expected to behave). The meta-language for constructing such a definition should allow for a DSL-independent modular definition of the dynamic semantics of a DSL given on the level of the DSL metamodel, with the possibility to automatically generate artifacts (specifications) on the model level. The design of such a meta-language for defining the dynamic semantics of a DSL is addressed in the third research question of this thesis:

RQ3: What are the constructs for defining the dynamic semantics of a DSL?

During our LACE case study we observed that we reuse pieces of our Event-B specifications. Moreover, such pieces, when considered separately, correspond to (software) design solutions that appear in the implementation of LACE. From this observation we got the idea of reusable specification templates that capture typical (software) design solutions. We discuss the broad picture originating from this idea in Chapter 3. In Chapter 4 we use reusable specification templates as constructs of our meta-language for defining the dynamic semantics of a DSL, Constelle. In this way, we do not propose (yet another) meta-language, universal for defining the dynamic semantics of all possible DSLs; rather, we propose an intermediate semantic domain that splits the semantic mapping from a DSL to an execution platform (or a specification formalism) into two steps, and we support this intermediate semantic domain with the corresponding expressive means (Constelle).

The design of Constelle is based on the ideas of two approaches well known in software engineering: generic programming and aspect-oriented programming (AOP). We use principles of generic programming to introduce the notion of specification templates and template parameters. We use principles of AOP to express how the dynamic semantics of a DSL is composed of specification templates. In particular, in Constelle the dynamic semantics of a DSL is defined as a composition of aspects, each of which is a specification template specialized by substituting its template (i.e. static) parameters with the constructs of the DSL (defined in its metamodel).

To capture the specialization of specification templates and their composition, we introduce a table notation. A Constelle table naturally represents the mapping from the DSL to our intermediate semantic domain, specification templates: notions and constructs of the DSL (forming the DSL vertical domain) are represented in the table rows, common software solutions (forming the DSL horizontal domain) are represented in the table columns, and the mapping is represented in their intersections. Finally, in order to relate invoked (specialized and composed) specification templates in a more sophisticated manner, we introduce gluing guards. We design gluing guards in the same way as specification templates: as constraint templates.

Such a (sophisticated) design of our meta-language (Constelle) requires a clear explanation of its meaning. The fourth research question of our thesis deals with the definition of such a meaning:

RQ4: What is the semantics of the language for defining the dynamic semantics of a DSL?

We specify the semantics of Constelle in Chapter 6. For this, we describe the (semantic) mapping of Constelle to Event-B. This semantic mapping is implemented in the Constelle-to-Event-B model transformation, which (automatically) transforms a (Constelle) definition of the dynamic semantics of a DSL into the corresponding Event-B specification of this dynamic semantics. The semantic mapping from Constelle to Event-B builds on top of (wraps) three Event-B techniques: generic instantiation, shared event composition, and refinement. These techniques have a solid theory and make it possible to reuse proof obligations discharged for the specification templates in a specification of the DSL dynamic semantics generated from its Constelle definition. As a result, we can identify which proof obligations should be discharged and which proof obligations can be ignored (reused from the invoked specification templates) in the resulting (generated) Event-B specification of the dynamic semantics. When designing and describing the semantic mapping from Constelle to Event-B we faced the following question:

RQ5: How to design and describe the semantics of the language for defining the dynamic semantics of a DSL?

We wanted to have a description of the semantic mapping that is clear (to us and our peers) and that corresponds to (i.e. documents) the actual implementation of this semantic mapping, the QVTo model transformation. Moreover, we wanted to use such a description while designing and developing our model transformation. Inspired by the ‘mapping nature’ of model-to-model transformations, we adopted the notation of functions (function signatures) and set theory in order to (informally) describe and design QVTo model transformations. In Chapter 5 we describe how we align concepts and constructs of QVTo with the mathematical concepts of set theory and functions. Building on the latter, we formulate two design principles for developing QVTo model transformations: structural decomposition and chaining model transformations.

Our implementation of Constelle includes (Ecore) metamodels of Constelle and specification templates, an editor for the library of specification templates, an editor for Constelle tables, and the Constelle-to-Event-B model transformation. Moreover, we use Rodin for constructing Event-B specifications, discharging their proof obligations, and simulating (debugging) Event-B specifications. Such a combination of various techniques, tools, and formalisms raised the question of how to use them when defining the dynamic semantics of a DSL. Thus, we faced the following research question:

RQ6: What is the pragmatics of the language for defining the dynamic semantics of a DSL?

We address this question in Chapter 7 by explicitly formulating a process of defining the dynamic semantics of a DSL using Constelle and specification templates. We relate all the listed tools and techniques to the steps of this definition process and demonstrate how these tools and techniques can contribute to the iterative design of the dynamic semantics of a DSL.

In the final chapter of this thesis we strive to explore whether our ideas (of reusable specification templates), our meta-language (Constelle), and our method (the definition process) make sense. For this, we need to design a validation study, i.e. answer the following research question:

RQ7: How can we evaluate the language (and the method) for defining the dynamic semantics of a DSL?

We design a validation study using a number of techniques of empirical methods: action research, the GQM method, and a triangulation strategy. This design can be used for conducting validation studies on Constelle in the future. In Chapter 8 we describe a concrete instance of such a validation study and analyze the insights gained from it. In particular, we observe the process of defining the dynamic semantics of the SLCO DSL in Constelle by an external DSL developer. Analyzing our observations allowed for improving the design of Constelle, for highlighting both limitations and benefits of applying Constelle, and for learning that the key characteristic of our method, a modular approach for specifying the dynamic semantics, might require a change in the mindset of a DSL developer: from thinking about the dynamic semantics of a DSL as a whole to thinking about the dynamic semantics of the DSL as a composition of reusable building blocks.

This outlines our answer to the central research question of this thesis, revisited in the beginning of this section. The contribution of our work can be characterized by the combination of the following features. In order to realize the use cases of having a definition of the dynamic semantics of a DSL, we adopted the translational approach and targeted a formalism with extensive tool support. Our meta-language, Constelle, mitigates the known complexity of the translational approach by adopting ideas of aspect-oriented programming and by using the table notation to express the interaction of aspects. Moreover, Constelle introduces an intermediate semantic domain in the DSL translation: specification templates that capture design solutions reappearing in the implementation of different DSLs. The well-defined semantics of Constelle allows for the reuse of specification templates together with the proof obligations that have already been discharged for them. Finally, our work is complemented by the design of a validation study that allows for investigating and evaluating the proposed method.

9.2 Future Work

We drew the inspiration and motivation for our research from the LACE case study presented in Chapter 2. However, we have not applied Constelle to (re)define the dynamic semantics of LACE. An interesting direction for future work would be to give such a definition of the dynamic semantics of LACE using Constelle, to explore how it differs from our experience described in Chapter 2, and to see how often we actually reuse the specification templates identified for LACE (such as Queue and Request).

In Chapter 3 we outline our vision on how the notion of reusable specification templates can fit into the perspective of the DSL-based development approach: from defining the structure of a DSL using structural templates to generating source code of a DSL model using semantic templates. Constelle realizes a (core) part of this vision. A logical direction for future work would be to extend Constelle (its design and implementation) in order to realize other parts of the described vision. For example, we could extend the Constelle metamodel with constructs that represent a semantics application, i.e. an instance of a definition of the dynamic semantics of a DSL instantiated for a concrete DSL model. Using such an extension, we could extend the Constelle-to-Event-B transformation in order to generate an Event-B specification for each concrete DSL model. This would allow for fulfilling the requirement Req-7 formulated in Chapter 2: formal specifications of DSL programs should be generated automatically on the model level.

Our main solution, the approach of reusable specification templates for defining the dynamic semantics of a DSL, is developed in Chapter 4. This approach triggers a number of important research questions that we leave for future work. First, we need to investigate whether a decomposition of the dynamic semantics of a DSL into invocations of reusable specification templates is always feasible.
Note that this question can be related both to the scope (what kind of DSLs are decomposable into reusable specification templates?) and to the pragmatics (how to decompose the dynamic semantics of a DSL?) of our method.

Second, we need to build the library of reusable specification templates. Potential candidates for specification templates can be found using various methods and sources of empirical evidence. First of all, these can be well-known software design patterns and architecture styles [30], such as the Observer and Blackboard patterns, Multi-layered and Peer-to-Peer architectures, etc. Another source of established software development practice is peer-reviewed source code libraries, such as the Boost libraries of C++ code¹. For example, such components as Boost Flyweight and Boost Graph Library can be investigated as candidates for specification templates. Following our idea that specification templates capture a DSL horizontal domain, candidates for specification templates can be derived from constructs of horizontal DSLs. For example, the Reo language [8] uses a common set of primitive communication channels, typical for concurrent applications: synchronous, lossy, buffered, etc. These can be added to our library of specification templates.

Constelle is designed (in Chapter 4), developed, and specified (in Chapter 6) using Event-B as a back-end formalism. However, in our design we strive to provide a possibility for future replacement of Event-B by another specification formalism. This would be beneficial, for example, for conducting different types of analysis of the dynamic semantics of a DSL. Thus, a valuable direction for future research would be to connect Constelle with, and map it to, another specification formalism, such as mCRL2 or Uppaal.
In relation to the pragmatics of Constelle and to the definition process (presented in Chapter 7 and further studied in Chapter 8), it would be interesting to investigate how Constelle contributes not only to making design decisions about the dynamic semantics of a DSL, but also to documenting such decisions. In other words, it would be interesting to see if we could extend Constelle with proper expressive means and incorporate the corresponding steps into the definition process in order to allow for capturing what has changed in a definition of the dynamic semantics of a DSL and why it has changed.

In Chapter 7 we highlight a number of challenges of a more technical nature, such as: filtering proof obligations that should be discharged for a generated Event-B specification, according to the scheme defined in Chapter 6; and development of an inverse mapping from compilation and/or analysis results provided by Rodin tools to the concepts and constructs used in a Constelle definition.

Finally, the validation study presented in Chapter 8 is a first step of validating and evaluating our method. More validation studies and further evaluation are necessary. Next to the questions listed in Chapter 8, we consider the following research questions as the most important for future research: how to identify specification templates and how reusable these specification templates are across various application domains; what the scope of Constelle is, i.e. what kind of DSLs can be defined using Constelle and specification templates; and how scalable the proposed approach is, i.e. whether the benefits of applying it to a real (industrial-size) DSL are worth the effort.

¹ www.boost.org

Bibliography

[1] Jean-Raymond Abrial. The B-book: Assigning Programs to Meanings. Cambridge University Press, 1996.

[2] Jean-Raymond Abrial. Modeling in Event-B: system and software engineering. Cambridge University Press, 2010.

[3] Jean-Raymond Abrial, Michael Butler, Stefan Hallerstede, Thai Son Hoang, Farhad Mehta, and Laurent Voisin. Rodin: An Open Toolset for Modelling and Reasoning in Event-B. International Journal on Software Tools for Technology Transfer (STTT), 12(6):447–466, 2010. URL: http://eprints.soton.ac.uk/271058/.

[4] Jean-Raymond Abrial and Stefan Hallerstede. Refinement, Decomposition, and Instantiation of Discrete Models: Application to Event-B. Fundam. Inform., 77(1-2):1–28, 2007.

[5] Alfred Aho, Ravi Sethi, and Jeffrey Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986.

[6] Idir Ait-Sadoune and Yamine Ait-Ameur. Stepwise Design of BPEL Web Services Compositions: An Event-B Refinement Based Approach. In Roger Lee, Olga Ormandjieva, Alain Abran, and Constantinos Constantinides, editors, Software Engineering Research, Management and Applications 2010, pages 51–68. Springer Berlin / Heidelberg, 2010. URL: http://dx.doi.org/10.1007/978-3-642-13273-5_4.

[7] Suzana Andova, Mark van den Brand, and Luc Engelen. Prototyping the semantics of a DSL using ASF+SDF: link to formal verification of DSL models. In Francisco Durán and Vlad Rusu, editors, Proceedings Second International Workshop on Algebraic Methods in Model-based Software Engineering, AMMSE 2011, Zurich, Switzerland, 30th June 2011, volume 56 of EPTCS, pages 65–79, 2011.

[8] Farhad Arbab. Proper protocol. In Erika Ábrahám, Marcello Bonsangue, and Broch Einar Johnsen, editors, Theory and Practice of Formal Methods: Essays Dedicated to Frank de Boer on the Occasion of His 60th Birthday, pages 65–87. Springer, 2016.

[9] Nils Bandener, Christian Soltenborn, and Gregor Engels. Extending DMM Behavior Specifications for Visual Execution and Debugging. In Brian A. Malloy, Steffen Staab, and Mark van den Brand, editors, Software Language Engineering (SLE), pages 357–376, 2010.

[10] David A. Basin, Andreas Fürst, Thai Son Hoang, Kunihiko Miyazaki, and Naoto Sato. Abstract Data Types in Event-B – An Application of Generic Instantiation. In Workshop on the experience of and advances in developing dependable systems in Event-B, CoRR, pages 5–16, 2012.

[11] Henning Berg and Birger Møller-Pedersen. Type-Safe Symmetric Composition of Metamodels Using Templates. System Analysis and Modeling: Theory and Practice, pages 160–178, 2013.

[12] J. A. Bergstra, J. Heering, and P. Klint, editors. Algebraic Specification. ACM Press/Addison-Wesley, 1989.

[13] Sandrine Blazy, Frédéric Gervais, and Régine Laleau. Reuse of Specification Patterns with the B Method. In ZB 2003: Formal Specification and Development in Z and B, volume 2651 of Lecture Notes in Computer Science, pages 40–57. Springer Berlin Heidelberg, 2003.

[14] Rimco C. Boudewijns. Graphical simulation of the execution of DSL models. Master’s thesis, Eindhoven University of Technology, 2013.

[15] Kai Chen, Joseph Porter, Janos Sztipanovits, and Sandeep Neema. Compositional Specification Of Behavioral Semantics For Domain-Specific Modeling Languages. Int. J. Semantic Computing, 3:31–56, 2009.

[16] Kai Chen, Janos Sztipanovits, Sherif Abdelwahed, and Ethan Jackson. Semantic Anchoring with Model Transformations. European Conference on Model Driven Architecture - Foundations and Applications, pages 115–129, 2005.

[17] Andrei Chiş, Tudor Gîrba, and Oscar Nierstrasz. The Moldable Debugger: A Framework for Developing Domain-Specific Debuggers. In Software Language Engineering (SLE), pages 102–121, 2014.

[18] Thomas Cleenewerck and Ivan Kurtev. Separation of concerns in translational semantics for dsls in model engineering. In Yookun Cho, Roger L. Wainwright, Hisham Haddad, Sung Y. Shin, and Yong Wan Koo, editors, Proceedings of the 2007 ACM Symposium on Applied Computing (SAC), Seoul, Korea, March 11-15, 2007, pages 985–992. ACM, 2007.

[19] Benoît Combemale, Xavier Crégut, Pierre-Loïc Garoche, and Xavier Thirioux. Essay on semantics definition in MDE - an instrumented approach for model verification. JSW, 4(9):943–958, 2009.

[20] Julie Wolfram Cox. Action Research, pages 371–388. SAGE, 2012.

[21] K. Czarnecki and S. Helsen. Feature-based Survey of Model Transformation Approaches. IBM Syst. J., 45(3):621–645, July 2006.

[22] Pierre-Evariste Dagand, Andrew Baumann, and Timothy Roscoe. Filet-o-Fish: Practical and Dependable Domain-Specific Languages for OS Development. In Proceedings of the 5th Workshop on Programming Languages and Operating Systems, PLOS ’09, pages 5:1–5:5. ACM, October 2009.

[23] Thomas Degueule. Composition and Interoperability for External Domain-Specific Language Engineering. (Composition et interopérabilité pour l’ingénierie des langages dédiés externes). PhD thesis, University of Rennes 1, France, 2016.

[24] A. van Deursen, J. Heering, and P. Klint, editors. Language Prototyping: An Algebraic Specification Approach, volume 5 of AMAST Series in Computing. World Scientific, 1996.

[25] Arie Van Deursen, Paul Klint, and Joost Visser. Domain-specific languages: An annotated bibliography. ACM SIGPLAN NOTICES, 35:26–36, 2000.

[26] Steve Easterbrook, Janice Singer, Margaret-Anne Storey, and Daniela Damian. Selecting Empirical Methods for Software Engineering Research. Guide to Advanced Empirical Software Engineering, pages 285–311, 2008.

[27] Luc Engelen. From Napkin Sketches to Reliable Software. PhD thesis, Eindhoven University of Technology, 2012.

[28] Anne Etien, Cedric Dumoulin, and Emanuel Renaux. Towards a Unified Notation to Represent Model Transformation. Research Report 6187, INRIA, May 2007.

[29] Martin Fowler. Domain-Specific Languages. Addison-Wesley Signature Series, 2010.

[30] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley, 1994.

[31] Christine M. Gerpheide, Ramon R. H. Schiffelers, and Alexander Serebrenik. A Bottom-Up Quality Model for QVTo. In QUATIC, pages 85–94. IEEE, 2014.

[32] Jan Friso Groote and Mohammad Reza Mousavi. Modeling and Analysis of Communicating Systems. MIT Press, 2014.

[33] Esther Guerra, Juan de Lara, Dimitrios Kolovos, Richard Paige, and Osmar dos Santos. Engineering model transformations with transML. Software and Systems Modeling, 12(3):555–577, 2013.

[34] Stefan Hallerstede. The Event-B Proof Obligation Generator.

[35] Dominik Hansen, Lukas Ladenberger, Harald Wiegard, Jens Bendisposto, and Michael Leuschel. Validation of the ABZ Landing Gear System Using ProB. In ABZ 2014: The Landing Gear Case Study, pages 66–79, 2014.

[36] Joni Helin, Pertti Kellomäki, and Tommi Mikkonen. Patterns of Collective Behavior in Ocsid. In Toufik Taibi, editor, Design Pattern Formalization Techniques, pages 73–93, Hershey, PA, USA, 2007. IGI Global.

[37] T. Hoang, A. McIver, L. Meinicke, C. Morgan, A. Sloane, and E. Susatyo. Abstractions of non-interference security: probabilistic versus possibilistic. Formal Aspects of Computing, pages 1–26, 2012.

[38] Thai Son Hoang, Andreas Fürst, and Jean-Raymond Abrial. Event-B patterns and their tool support. Software and System Modeling, 12(2):229–244, 2013.

[39] C. A. R. Hoare. Communicating Sequential Processes. Prentice-Hall, 1985.

[40] Akram Idani, Yves Ledru, and Adil Anwar. A Rigorous Reasoning about Model Transformations Using the B Method. In BPMDS, pages 426–440, 2013.

[41] Michael Jackson. Designing and Coding Program Structures. In Henry P Stevenson, editor, Proceedings of a Codasyl Programming Language Committee Symposium on Structured Programming in COBOL - Future and Present, pages 22–53, 1975.

[42] Michael Jackson. JSP in Perspective. In Manfred Broy and Ernst Denert, editors, Software Pioneers, pages 480–493. Springer-Verlag New York, Inc., 2002.

[43] Michael Jastram. Rodin User’s Handbook. 2012.

[44] Jean-Marc Jézéquel, Benoît Combemale, Olivier Barais, Martin Monperrus, and François Fouquet. Mashup of metalanguages and its implementation in the kermeta language workbench. Software and Systems Modeling, 14:905–920, May 2015.

[45] Audris Kalnins, Janis Barzdins, and Edgars Celms. Model Transformation Language MOLA. In MDAFA, pages 62–76, 2004.

[46] P. Kellomaki. Composing distributed systems from reusable aspects of behavior. In Distributed Computing Systems Workshops, IEEE Press, pages 481–486, 2002.

[47] Pertti Kellomäki. A Formal Basis for Aspect-Oriented Specification with Superposition. In The FOAL workshop on Foundations of Aspect-Oriented Languages, pages 27–32, 2002.

[48] Pertti Kellomäki and Tommi Mikkonen. Design Templates for Collective Behavior. In ECOOP 14th European Conference on Object-Oriented Programming, pages 277–295, 2000.

[49] Gregor Kiczales, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Videira Lopes, Jean-Marc Loingtier, and John Irwin. Aspect-Oriented Programming. In ECOOP, pages 220–242, 1997.

[50] Jörg Kienzle, Wisam Al Abed, and Jacques Klein. Aspect-oriented Multi-view Modeling. In Proceedings of the 8th ACM International Conference on Aspect-oriented Software Development, AOSD ’09, pages 87–98, 2009.

[51] Anneke Kleppe. Software Language Engineering: Creating Domain-Specific Languages Using Metamodels. Addison-Wesley, 2008.

[52] Pawel Kmieć. The Unofficial LEGO Technic Builder’s Guide. No Starch Press, 2013.

[53] Tomaz Kosar, Sudev Bohra, and Marjan Mernik. Domain-specific languages: A systematic mapping study. Information & Software Technology, 71:77–91, 2016.

[54] Ivan Kurtev. State of the Art of QVT: A Model Transformation Language Standard. In Andy Schürr, Manfred Nagl, and Albert Zündorf, editors, AGTIVE, Lecture Notes in Computer Science, pages 377–393, 2008.

[55] Lukas Ladenberger, Jens Bendisposto, and Michael Leuschel. Visualising Event-B Models with B-Motion Studio. In Formal Methods for Industrial Critical Systems, FMICS 2009, pages 202–204, 2009.

[56] Lukas Ladenberger, Ivaylo Dobrikov, and Michael Leuschel. An Approach for Creating Domain Specific Visualisations of CSP Models. In Software Engineering and Formal Methods - SEFM 2014 Collocated Workshops: HOFM, SAFOME, OpenCert, MoKMaSD, WS-FMDS, Grenoble, France, September 1-2, 2014, Revised Selected Papers, pages 20–35, 2014.

[57] Kevin Lano. Model transformation design pattern catalogue. http://www.dcs.kcl.ac.uk/staff/kcl/mtdp, Date of access August 2015.

[58] Kevin Lano and Shekoufeh Kolahdouz Rahimi. Model-Transformation Design Patterns. IEEE Trans. Software Eng., 40(12):1224–1259, 2014.

[59] Meir M. Lehman. On understanding laws, evolution, and conservation in the large-program life cycle. Journal of Systems and Software, 1:213–221, 1980.

[60] Michael Leuschel and Michael J. Butler. ProB: A Model Checker for B. In International Symposium of Formal Methods (FME), Pisa, Italy, pages 855–874, 2003.

[61] Michael Y. Levin and Benjamin C. Pierce. Tinkertype: a language for playing with formal systems. J. Funct. Program., 13(2):295–316, 2003.

[62] Yaping Luo, Mark van den Brand, Luc Engelen, John M. Favaro, Martijn Klabbers, and Giovanni Sartori. Extracting models from ISO 26262 for reusable safety assurance. In Safe and Secure Software Reuse - 13th International Conference on Software Reuse, ICSR 2013, Pisa, Italy, June 18-20. Proceedings, pages 192–207, 2013.

[63] Maarten Manders. Understanding Execution. PhD thesis, Eindhoven University of Technology.

[64] Raphael Mannadiar and Hans Vangheluwe. Domain-specific Engineering of Domain-specific Languages. In Proceedings of the 10th Workshop on Domain-Specific Modeling, DSM ’10, pages 11:1–11:6. ACM, 2010.

[65] Aad Mathijssen and A. Johannes Pretorius. Verified Design of an Automated Parking Garage. In Formal Methods: Applications and Technology, 11th International Workshop FMICS, Revised Selected Papers, pages 165–180, 2006.

[66] Sjouke Mauw, Wouter T. Wiersma, and Tim A. C. Willemse. Language-driven system design. International Journal of Software Engineering and Knowledge Engineering, 14(6):625–663, 2004.

[67] Tanja Mayerhofer. Defining Executable Modeling Languages with fUML. PhD thesis, Vienna University of Technology, 2014.

[68] Tom Mens and Pieter Van Gorp. A Taxonomy of Model Transformation. Electr. Notes Theor. Comput. Sci., 152:125–142, 2006.

[69] Bart Meyers, Antonio Cicchetti, Esther Guerra, and Juan de Lara. Composing textual modelling languages in practice. In Proceedings of the 6th International Workshop on Multi-Paradigm Modeling, MPM@MoDELS 2012, Innsbruck, Austria, October 1-5, 2012, pages 31–36, 2012.

[70] Peter Mosses. Theory and practice of action semantics. In Wojciech Penczek and Andrzej Szalas, editors, Mathematical Foundations of Computer Science 1996, volume 1113 of Lecture Notes in Computer Science, pages 37–61. Springer Berlin / Heidelberg, 1996.

[71] Peter D. Mosses. Modular structural operational semantics. J. Log. Algebr. Program., 60-61:195–228, 2004.

[72] David Musser and Alexander A. Stepanov. Generic Programming. In Symbolic and algebraic computation: ISSAC 88, pages 13–25. Springer, 1988.

[73] Hanne Riis Nielson and Flemming Nielson. Semantics with Applications: A Formal Introduction. Wiley, 1992.

[74] OMG. Meta Object Facility (MOF) 2.0 Query/View/Transformation Specification, February 2015. Version 1.2.

[75] Luís Pedro. A Systematic Language Engineering Approach for Prototyping Domain Specific Modelling Languages. Phd dissertation, University of Geneva, January 2009.

[76] Luís Pedro, Vasco Amaral, and Didier Buchs. Foundations for a Domain Specific Modeling Language Prototyping Environment: A compositional approach. In Proceedings of the 8th OOPSLA ACM-SIGPLAN Workshop on Domain-Specific Modeling (DSM), 2008.

[77] Luis Pedro, Matteo Risoldi, Didier Buchs, Bruno Barroca, and Vasco Amaral. Composing Visual Syntax for Domain Specific Languages. In Human-Computer Interaction. Novel Interaction Methods and Techniques, 13th International Conference, HCI International 2009, San Diego, CA, USA, July 19-24, 2009, Proceedings, Part II, pages 889–898, 2009.

[78] Gordon D. Plotkin. A Structural Approach to Operational Semantics. The Journal of Logic and Algebraic Programming, 60-61:17–139, 2004.

[79] Lukman Ab. Rahim and Sharifah Bahiyah Rahayu Syed Mansoor. Proposed Design Notation for Model Transformation. In ASWEC, pages 589–598. IEEE Computer Society, 2008.

[80] Shekoufeh Kolahdouz Rahimi and Kevin Lano. A Model-Based Development Approach for Model Transformations. In Fundamentals of Software Engineering - 4th IPM International Conference, FSEN 2011, Tehran, Iran, April 20-22, 2011, Revised Selected Papers, pages 48–63, 2011.

[81] Daniel Ratiu, Markus Voelter, Zaur Molotnikov, and Bernhard Schaetz. Implementing Modular Domain Specific Languages and Analyses. In Proceedings of the Workshop on Model-Driven Engineering, Verification and Validation, pages 35–40, 2012.

[82] Grigore Rosu and Traian-Florin Serbanuta. An overview of the K semantic framework. J. Log. Algebr. Program., 79(6):397–434, 2010.

[83] Per Runeson and Martin Höst. Guidelines for conducting and reporting case study research in software engineering. Empirical Softw. Engg., 14(2):131–164, April 2009.

[84] Christian Schäfer, Thomas Kuhn, and Mario Trapp. A Pattern-based Approach to DSL Development. In Proceedings of the Compilation of the Co-located Workshops on DSM’11, TMC’11, AGERE!’11, AOOPES’11, NEAT’11, & VMIL’11, SPLASH ’11 Workshops, pages 39–46. ACM, 2011.

[85] Markus Scheidgen and Joachim Fischer. Human Comprehensible and Machine Processable Specifications of Operational Semantics. In Model Driven Architecture - Foundations and Applications, volume 4530 of Lecture Notes in Computer Science, pages 157–171. Springer Berlin Heidelberg, 2007.

[86] Douglas C. Schmidt. Guest editor’s introduction: Model-driven engineering. IEEE Computer, 39(2):25–31, 2006.

[87] Renato Silva. Supporting Development of Event-B Models. PhD thesis, University of Southampton, UK, 2012.

[88] Renato Silva and Michael Butler. Supporting Reuse of Event-B Developments through Generic Instantiation. In Karin Breitman and Ana Cavalcanti, editors, 11th International Conference on Formal Engineering Methods, ICFEM, volume 5885 of Lecture Notes in Computer Science, pages 466–484. Springer, 2009.

[89] Renato Silva and Michael Butler. Shared Event Composition/Decomposition in Event-B. In Bernhard K. Aichernig, Frank S. de Boer, and Marcello M. Bonsangue, editors, Formal Methods for Components and Objects (FMCO), pages 122–141. Springer, 2010.

[90] Gabor Simko. Formal Semantic Specification of Domain-Specific Modeling Languages for Cyber-Physical Systems. Phd dissertation, Vanderbilt University, Nashville, Tennessee, August 2014. Chapter 6: Reusable Semantic Units for Formalizing the Denotational Semantics of CPS Modeling Languages, pages 59–75.

[91] Colin Snook and Michael Butler. UML-B: Formal Modeling and Design Aided by UML. ACM Trans. Softw. Eng. Methodol., 15(1):92–122, 2006.

[92] Colin Snook, Fabian Fritz, and Alexei Iliasov. An EMF framework for Event-B. In Workshop on Tool Building in Formal Methods – ABZ Conference, 2010.

[93] Rini Van Solingen and Egon Berghout. Goal/Question/Metric Method: A Practical Guide for Quality Improvement of Software Development. McGraw-Hill, 1999.

[94] Frank P. M. Stappers, Sven Weber, Michel A. Reniers, Suzana Andova, and Istvan Nagy. Formalizing a Domain Specific Language Using SOS: An Industrial Case Study. In Software Language Engineering - 4th International Conference, SLE 2011, Braga, Portugal, July 3-4, 2011, Revised Selected Papers, pages 223–242, 2011.

[95] Frank Petrus Maria Stappers. Bridging Formal Models: An Engineering Perspective. PhD Dissertation, Eindhoven University of Technology, November 2012. Chapter 6: Disseminating Verification Results, pages 109–125.

[96] Ulyana Tikhonova. Reusable Specification Templates for Defining Dynamic Semantics of DSLs. Software and Systems Modeling (SoSyM), 2017.

[97] Ulyana Tikhonova, Maarten Manders, and Rimco Boudewijns. Visualization of Formal Specifications for Understanding and Debugging an Industrial DSL. In Human Oriented Formal Methods (HOFM), 2016.

[98] Ulyana Tikhonova, Maarten Manders, Mark van den Brand, Suzana Andova, and Tom Verhoeff. Applying Model Transformation and Event-B for Specifying an Industrial DSL. In MoDeVVa@MoDELS, pages 41–50, 2013.

[99] Ulyana Tikhonova and Tim Willemse. Designing and Describing QVTo Model Transformations. In ICSOFT-EA 2015 - Proceedings of the 10th International Conference on Software Engineering and Applications, Colmar, Alsace, France, 20-22 July, 2015, pages 401–406.

[100] Ulyana Tikhonova and Tim A. C. Willemse. Documenting and Designing QVTo Model Transformations Through Mathematics. In Software Technologies - 10th International Joint Conference, ICSOFT 2015, Colmar, France, July 20-22, 2015, Revised Selected Papers, pages 349–364, 2015.

[101] Marcel van Amstel. Assessing and Improving the Quality of Model Transformations. PhD thesis, Eindhoven University of Technology, 2012.

[102] Marcel van Amstel, Mark van den Brand, and Luc Engelen. An exercise in iterative domain-specific language design. In Proceedings of the Joint ERCIM Workshop on Software Evolution (EVOL) and International Workshop on Principles of Software Evolution (IWPSE), IWPSE-EVOL ’10, pages 48–57, 2010.

[103] Marcel van Amstel, Mark G. J. van den Brand, Zvezdan Protic, and Tom Verhoeff. Transforming Process Algebra Models into UML State Machines: Bridging a Semantic Gap? In Antonio Vallecillo, Jeff Gray, and Alfonso Pierantonio, editors, ICMT, volume 5063 of Lecture Notes in Computer Science, pages 61–75. Springer, 2008.

[104] Mark van den Brand, B. Cornelissen, Pieter A. Olivier, and Jurgen J. Vinju. TIDE: A generic debugging framework - tool demonstration. Electr. Notes Theor. Comput. Sci., 141(4):161–165, 2005.

[105] Louis van Gool, Teade Punter, Marc Hamilton, and Remco van Engelen. Compositional MDA. In Model Driven Engineering Languages and Systems, 9th International Conference (MoDELS), pages 126–139, 2006.

[106] Vlad A. Vergu, Pierre Neron, and Eelco Visser. DynSem: A DSL for Dynamic Semantics Specification. In 26th International Conference on Rewriting Techniques and Applications, RTA, pages 365–378, 2015.

[107] Eelco Visser. Program transformation with stratego/xt: Rules, strategies, tools, and systems in stratego/xt 0.9. In Domain-Specific Program Generation, International Seminar, pages 216–238, 2003.

[108] David A. Watt and Thomas Muffy. Programming Language Syntax and Semantics. Prentice Hall International Series in Computer Science, 1991.

[109] Jia Zhang. Pattern specification and application in meta-models in Ecore. Master’s thesis, Eindhoven University of Technology, 2016.

Appendix A

Event-B Specification Templates

CONTEXT template_variable_context
SETS
    VariableType
END

(a) Event-B context for the Variable specification template

MACHINE template_variable_machine
SEES template_variable_context
VARIABLES
    variable
INVARIANTS
    inv1 : variable ∈ VariableType
EVENTS
    Initialisation
        begin
            act1 : variable :∈ VariableType
        end
    Event assign ≙
        any value
        where
            grd1 : value ∈ VariableType
        then
            act1 : variable := value
        end
    Event query ≙
        any value
        where
            grd1 : value = variable
        then
            skip
        end
END

(b) Event-B machine for the Variable specification template

Figure A.1: Specification template Query/Variable

CONTEXT template_queue_context
SETS
    ElementType
END

(a) Event-B context for the Queue specification template

MACHINE template_queue_machine
SEES template_queue_context
VARIABLES
    queue
INVARIANTS
    inv1 : queue ∈ ℕ ⇸ ElementType
EVENTS
    Initialisation
        begin
            act1 : queue := ∅
        end
    Event enqueue ≙
        any element, index
        where
            grd1 : element ∈ ElementType
            grd2 : index ∈ ℕ
            grd3 : queue ≠ ∅ ⇒ (∀i · i ∈ dom(queue) ⇒ index > i)
        then
            act2 : queue := queue ∪ {index ↦ element}
        end
    Event dequeue ≙
        any element, index
        where
            grd4 : index ↦ element ∈ queue
            grd5 : ∀i · i ∈ dom(queue) ⇒ index ≤ i
        then
            act3 : queue := queue \ {index ↦ element}
        end
END

(b) Event-B machine for the Queue specification template

Figure A.2: Specification template Queue
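The guards and actions of the Queue template in Figure A.2 can be mirrored in a few lines of Python. This is only an illustrative sketch and not part of the Event-B tooling: the class name `QueueTemplate` and its method names are assumptions introduced here, the partial function `queue` is modelled as a dictionary, and the typing guard grd1 (membership in the carrier set ElementType) is not checked.

```python
class QueueTemplate:
    """Illustrative mirror of the Queue template: a partial function index -> element."""

    def __init__(self):
        # Initialisation, act1: queue := ∅
        self.queue = {}

    def enqueue_enabled(self, index):
        # grd2: index is a natural number
        is_natural = isinstance(index, int) and index >= 0
        # grd3: if the queue is non-empty, index exceeds every index in use
        return is_natural and all(index > i for i in self.queue)

    def enqueue(self, element, index):
        if not self.enqueue_enabled(index):
            raise ValueError("enqueue guard violated")
        # act2: queue := queue ∪ {index ↦ element}
        self.queue[index] = element

    def dequeue_enabled(self, element, index):
        # grd4: index ↦ element ∈ queue
        in_queue = index in self.queue and self.queue[index] == element
        # grd5: index is the smallest index in dom(queue)
        return in_queue and all(index <= i for i in self.queue)

    def dequeue(self, element, index):
        if not self.dequeue_enabled(element, index):
            raise ValueError("dequeue guard violated")
        # act3: queue := queue \ {index ↦ element}
        del self.queue[index]


q = QueueTemplate()
q.enqueue("a", 3)
q.enqueue("b", 7)
q.dequeue("a", 3)  # only the element with the smallest index may leave
```

The sketch makes the first-in-first-out discipline visible: enqueue only accepts an index larger than all used indices, and dequeue only accepts the pair with the smallest index.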

CONTEXT template_request_context
SETS
    ElementType
END

(a) Event-B context for the Request specification template

MACHINE template_request_machine
SEES template_request_context
VARIABLES
    request_body
INVARIANTS
    inv1 : request_body ∈ ℙ(ElementType)
EVENTS
    Initialisation
        begin
            act1 : request_body := ∅
        end
    Event request ≙
        any elements
        where
            grd1 : elements ∈ ℙ(ElementType)
            grd2 : request_body = ∅
        then
            act2 : request_body := elements
        end
    Event process ≙
        any element
        where
            grd3 : element ∈ request_body
        then
            act3 : request_body := request_body \ {element}
        end
END

(b) Event-B machine for the Request specification template

Figure A.3: Specification template Request
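As a reading aid, the Request template of Figure A.3 can be sketched in Python as follows. The class name `RequestTemplate` and the method names are assumptions made for this illustration; the carrier set ElementType is not modelled, so grd1 is reduced to converting the argument to a set.

```python
class RequestTemplate:
    """Illustrative mirror of the Request template: a set of pending elements."""

    def __init__(self):
        # Initialisation, act1: request_body := ∅
        self.request_body = set()

    def request(self, elements):
        # grd2: a new request is accepted only when the previous one is fully processed
        if self.request_body:
            raise ValueError("request guard violated: request_body is not empty")
        # act2: request_body := elements
        self.request_body = set(elements)

    def process(self, element):
        # grd3: element ∈ request_body
        if element not in self.request_body:
            raise ValueError("process guard violated: element is not pending")
        # act3: request_body := request_body \ {element}
        self.request_body = self.request_body - {element}


r = RequestTemplate()
r.request({"x", "y"})
r.process("x")  # elements may be processed in any order
```

Because process picks any pending element, the template leaves the processing order unspecified, which is what distinguishes it from the Queue template.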

CONTEXT template_partialorder_context
SETS
    PosetElement
END

(a) Event-B context for the Partial Order specification template

MACHINE template_partialorder_machine
SEES template_partialorder_context
VARIABLES
    posetBody, posetOrder
INVARIANTS
    inv1 : posetBody ⊆ PosetElement
    inv2 : posetOrder ∈ posetBody ↔ posetBody
    inv3 : ∀x, y · x ↦ y ∈ posetOrder ⇒ x ≠ y
    inv4 : ∀x, y · x ↦ y ∈ posetOrder ⇒ y ↦ x ∉ posetOrder
    inv5 : ∀a, b, c · a ↦ b ∈ posetOrder ∧ b ↦ c ∈ posetOrder ⇒ a ↦ c ∈ posetOrder
EVENTS
    Initialisation
        begin
            act1 : posetBody := ∅
            act2 : posetOrder := ∅
        end
    Event NewPartialOrder ≙
        any poset, order
        where
            grd1 : poset ⊆ PosetElement
            grd2 : order ∈ poset ↔ poset
            grd3 : ∀x, y · x ↦ y ∈ order ⇒ x ≠ y
            grd4 : ∀x, y · x ↦ y ∈ order ⇒ y ↦ x ∉ order
            grd5 : ∀a, b, c · a ↦ b ∈ order ∧ b ↦ c ∈ order ⇒ a ↦ c ∈ order
        then
            act1 : posetBody := poset
            act2 : posetOrder := order
        end
    Event GetMaximalElement ≙
        any maximal
        where
            grd1 : maximal ∈ posetBody
            grd2 : ∀x · x ∈ posetBody ∧ x ≠ maximal ⇒ maximal ↦ x ∉ posetOrder
        then
            skip
        end
    Event RemoveElement ≙
        any element, elementRelations
        where
            grd1 : element ∈ PosetElement
            grd2 : elementRelations = {x, y · x ↦ y ∈ posetOrder ∧ (x = element ∨ y = element) | x ↦ y}
        then
            act1 : posetBody := posetBody \ {element}
            act2 : posetOrder := posetOrder \ elementRelations
        end
END

(b) Event-B machine for the Partial Order specification template

Figure A.3: Specification template Partial Order
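The invariants and events of the Partial Order template can be read as plain set computations. The sketch below is an assumption-laden illustration in Python: the function names are introduced here, the relation posetOrder is represented as a set of pairs, and the carrier set PosetElement is not modelled.

```python
def is_strict_partial_order(body, order):
    """Check inv2 to inv5 for a candidate pair (posetBody, posetOrder)."""
    within_body = all(x in body and y in body for x, y in order)   # inv2
    irreflexive = all(x != y for x, y in order)                    # inv3
    asymmetric = all((y, x) not in order for x, y in order)        # inv4
    transitive = all((a, d) in order                               # inv5
                     for a, b in order for c, d in order if b == c)
    return within_body and irreflexive and asymmetric and transitive


def maximal_elements(body, order):
    """All values of 'maximal' that satisfy the guards of GetMaximalElement."""
    return {m for m in body
            if all((m, x) not in order for x in body if x != m)}


def remove_element(body, order, element):
    """Actions of RemoveElement: drop the element and every pair that mentions it."""
    element_relations = {(x, y) for x, y in order if x == element or y == element}
    return body - {element}, order - element_relations


body = {1, 2, 3}
order = {(1, 2), (2, 3), (1, 3)}
new_body, new_order = remove_element(body, order, 2)
```

In this example, `maximal_elements(body, order)` contains only 3, since 3 is the only element with no successor in the order.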

Appendix B

Event-B Specification of Robotic Arm Parallel

CONTEXT roboticarm_structure_context
SETS
    Actions
CONSTANTS
    HandActions
    ArmActions
    TURN_LEFT
    TURN_RIGHT
    MOVE_UP
    MOVE_DOWN
    GRAB
    RELEASE
    ROTATE_LEFT
    ROTATE_RIGHT
AXIOMS
    axm1 : partition(Actions, ArmActions, HandActions)
    axm2 : partition(ArmActions, {TURN_LEFT}, {MOVE_UP}, {TURN_RIGHT}, {MOVE_DOWN})
    axm3 : partition(HandActions, {GRAB}, {RELEASE}, {ROTATE_LEFT}, {ROTATE_RIGHT})
END

Figure B.1: Event-B context for the Robotic Arm DSL

MACHINE RoboticArmParallel_machine
SEES roboticarm_structure_context
VARIABLES
    driver1_queue, driver2_queue, distributor_request_body
INVARIANTS
    distributor_inv1 : distributor_request_body ∈ ℙ(Actions)
    driver2_inv1 : driver2_queue ∈ ℕ ⇸ HandActions
    driver1_inv1 : driver1_queue ∈ ℕ ⇸ ArmActions
EVENTS
    Initialisation
        begin
            driver1_act1 : driver1_queue := ∅
            driver2_act1 : driver2_queue := ∅
            distributor_act1 : distributor_request_body := ∅
        end
    Event taskStm ≙
        any task
        where
            distributor_grd1 : task ∈ ℙ(Actions)
            distributor_grd2 : distributor_request_body = ∅
        then
            distributor_act1 : distributor_request_body := task
        end
    Event handActionStm ≙
        any driver2_index, action
        where
            distributor_grd1 : action ∈ distributor_request_body
            driver2_grd1 : action ∈ HandActions
            driver2_grd2 : driver2_index ∈ ℕ
            driver2_grd3 : driver2_queue ≠ ∅ ⇒ (∀i · i ∈ dom(driver2_queue) ⇒ driver2_index > i)
            driver2_grd4 : {driver2_index ↦ action} ∈ ℕ ⇸ HandActions
            driver2_grd5 : driver2_index ∉ dom(driver2_queue)
        then
            distributor_act1 : distributor_request_body := distributor_request_body \ {action}
            driver2_act1 : driver2_queue := driver2_queue ∪ {driver2_index ↦ action}
        end
    Event armActionStm ≙
        any driver1_index, action
        where
            distributor_grd1 : action ∈ distributor_request_body
            driver1_grd1 : action ∈ ArmActions
            driver1_grd2 : driver1_index ∈ ℕ
            driver1_grd3 : driver1_queue ≠ ∅ ⇒ (∀i · i ∈ dom(driver1_queue) ⇒ driver1_index > i)
            driver1_grd4 : {driver1_index ↦ action} ∈ ℕ ⇸ ArmActions
            driver1_grd5 : driver1_index ∉ dom(driver1_queue)
        then
            distributor_act1 : distributor_request_body := distributor_request_body \ {action}
            driver1_act1 : driver1_queue := driver1_queue ∪ {driver1_index ↦ action}
        end
    Event executeArm ≙
        any driver1_index, action
        where
            driver1_grd1 : driver1_index ↦ action ∈ driver1_queue
            driver1_grd2 : ∀i · i ∈ dom(driver1_queue) ⇒ driver1_index ≤ i

(a) part 1


        then
            driver1_act1 : driver1_queue := driver1_queue \ {driver1_index ↦ action}
        end
    Event executeHand ≙
        any driver2_index, action
        where
            driver2_grd1 : driver2_index ↦ action ∈ driver2_queue
            driver2_grd2 : ∀i · i ∈ dom(driver2_queue) ⇒ driver2_index ≤ i
        then
            driver2_act1 : driver2_queue := driver2_queue \ {driver2_index ↦ action}
        end
END

(b) part 2

Figure B.1: Event-B machine for the semantic module Robotic Arm Parallel
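The machine above weaves one instance of the Request template (the distributor) with two instances of the Queue template (driver1 for arm actions, driver2 for hand actions). The following Python sketch illustrates how the merged guards and actions of the composed events behave. It is an assumption-based illustration, not output of the Constelle tools: the class and method names are introduced here, and actions are represented as strings.

```python
ARM_ACTIONS = {"TURN_LEFT", "MOVE_UP", "TURN_RIGHT", "MOVE_DOWN"}
HAND_ACTIONS = {"GRAB", "RELEASE", "ROTATE_LEFT", "ROTATE_RIGHT"}


class RoboticArmParallel:
    """Illustrative mirror of the composed machine: one request body, two queues."""

    def __init__(self):
        self.distributor_request_body = set()
        self.driver1_queue = {}  # index -> arm action
        self.driver2_queue = {}  # index -> hand action

    def task_stm(self, task):
        task = set(task)
        # distributor_grd1 and distributor_grd2
        if not task <= ARM_ACTIONS | HAND_ACTIONS or self.distributor_request_body:
            raise ValueError("taskStm guard violated")
        self.distributor_request_body = task

    def _dispatch(self, queue, allowed, index, action):
        # Guards of the Request 'process' event merged with the Queue 'enqueue' event
        enabled = (action in self.distributor_request_body
                   and action in allowed
                   and isinstance(index, int) and index >= 0
                   and all(index > i for i in queue))
        if not enabled:
            raise ValueError("dispatch guard violated")
        self.distributor_request_body = self.distributor_request_body - {action}
        queue[index] = action

    def arm_action_stm(self, index, action):
        self._dispatch(self.driver1_queue, ARM_ACTIONS, index, action)

    def hand_action_stm(self, index, action):
        self._dispatch(self.driver2_queue, HAND_ACTIONS, index, action)

    @staticmethod
    def _execute(queue, index, action):
        # Guards of the Queue 'dequeue' event: the pair is queued and has the smallest index
        if queue.get(index) != action or any(index > i for i in queue):
            raise ValueError("execute guard violated")
        del queue[index]

    def execute_arm(self, index, action):
        self._execute(self.driver1_queue, index, action)

    def execute_hand(self, index, action):
        self._execute(self.driver2_queue, index, action)


arm = RoboticArmParallel()
arm.task_stm({"GRAB", "MOVE_UP"})
arm.arm_action_stm(0, "MOVE_UP")
arm.hand_action_stm(0, "GRAB")
arm.execute_arm(0, "MOVE_UP")
```

Each composed event fires only when the guards of all woven template events hold at once, which is why `_dispatch` checks the distributor and the driver queue together before changing either.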

Appendix C

Questionnaires

C.1 Baseline Questionnaire

Questions [Choices for answers]

1.a Have you been developing domain specific languages (DSLs) before? [yes/no]
1.b How many DSLs have you implemented and/or participated in the implementation of? [open question]

2. For how many DSLs have you defined a formal (mathematically rigorous) specification of
2.a • DSL (abstract) syntax? [open question]
2.b • DSL (execution/behavioral) semantics? [open question]

3. Which specification formalisms do you know for specifying software? And to which extent do you know these formalisms?
3.a • You can read specifications in: [open question]
3.b • You can write specifications in: [open question]
3.c • You can apply analysis tools to a specification in: [open question]

4. To which extent do you know the Event-B formalism?
4.a • You can read Event-B specifications [yes/no]
4.b • You can write Event-B specifications [yes/no]
4.c • You can simulate (animate/execute) an Event-B specification [yes/no]
4.d • You can model check an Event-B specification [yes/no]
4.e • You can interpret proof obligations generated for an Event-B specification [yes/no]
4.f • You can use interactive prover to discharge proof obligations that were not discharged automatically by the Rodin tools [yes/no]

5. How well do you know and/or understand the SLCO DSL (that you are going to specify)?
5.a • You have clear understanding of the application domain of the DSL (who, when, and how is going to use this DSL) [to the extent of 100% (yes) – 0% (no)]
5.b • You have clear understanding of the DSL (what are the DSL concepts and what are the relations between them) [for 100% – 0% of the DSL concepts]
5.c • You have clear understanding of the DSL graphical syntax (how DSL concepts are represented in the graphical notation) [for 100% – 0% of the DSL concepts]
5.d • You have clear understanding of the DSL textual syntax (how DSL concepts are represented in text) [for 100% – 0% of the DSL concepts]
5.e • You have clear understanding of the intended behavior of the DSL concepts (how they are applied in practice) [for 100% – 0% of the DSL concepts]
5.f • You have clear understanding of the DSL implementation design (how the intended behavior of the DSL concepts is implemented) [for 100% – 0% of the DSL concepts]
5.g • How many example DSL programs have you considered (looked into or written)? [open question]

Table C.1: Baseline Questionnaire

C.2 Logbook Questionnaire

C.2.1 Add new semantic module

1. Criteria that were used to identify this semantic module:
   (a) independence of the semantics when being used by a DSL practitioner
   (b) independence of the semantics when being implemented by a DSL developer
   (c) independence of the semantics specified for a subset of the DSL concepts (subset of the DSL metamodel)
   (d) hiding a (difficult) design decision
   (e) other, please specify:

2. This semantic module appeared as a result of learning from:
   (a) from the DSL metamodel
   (b) from the example models
   (c) from the DSL (concrete) syntax definition
   (d) from the transformations of the DSL to other languages
   (e) from comparing with other DSLs
   (f) from a talk with the DSL author
   (g) from validation of the Constelle definition using
   (h) from inconsistencies found out when defining the DSL in Constelle
   (i) other, please specify:

3. There were alternative design solutions that were discarded because:
   (a) No alternative solutions were considered
   (b) The alternative solution was not possible to implement in Constelle
   (c) The alternative solution was not correct according to the knowledge obtained about the DSL
   (d) The alternative solution was not considered optimal/efficient/elegant
   (e) Other

C.2.2 Change semantic module

1. This semantic module was:
   (a) decomposed into other semantic modules
   (b) replaced by another semantic module(s)
   (c) merged with another semantic module(s)
   (d) identified as a specification template and specified in Event-B
   (e) other

2. The changes were made as a result of learning from:
   (a) from the DSL metamodel
   (b) from the example models
   (c) from the DSL (concrete) syntax definition
   (d) from the transformations of the DSL to other languages
   (e) from comparing with other DSLs
   (f) from a talk with the DSL author
   (g) from validation of the Constelle definition using
   (h) from inconsistencies found out when defining the DSL in Constelle
   (i) other

3. There were alternative design solutions that were discarded because:
   (a) No alternative solutions were considered
   (b) The alternative solution was not possible to implement in Constelle
   (c) The alternative solution was not correct according to the knowledge learnt about the DSL
   (d) The alternative solution was not considered optimal/efficient/elegant
   (e) Other

C.2.3 Apply workaround for

1. The problem/challenge that is solved has:
   (a) technical nature
   (b) conceptual nature

2. The problem/challenge is:

3. The workaround was found as a result of:
   (a) experience (try and fail)
   (b) investigation of the source code of the Constelle tools
   (c) reading documentation of the tool
   (d) consultation provided by Ulyana

C.3 Final Questionnaire

Questions [Choices for answers]

0.a. How would you classify the dynamic semantics of the DSL, that you have been specifying in Constelle
   Choices:
   • Operational (Mapping DSL concepts to operations)
   • Denotational (Mapping DSL concepts to concepts)
   • Axiomatic (axioms and theorems)
0.b. Do you know how this dynamic semantics is implemented? (as a translator or an interpreter) [yes/no]

0.c. Is your specification coherent with the actual implementation of the DSL dynamic semantics?
   Choices:
   • Yes, completely coherent;
   • The specification abstracts away certain details;
   • Certain details are specified differently than actually implemented;
   • The specification specifies rather abstract properties than the actual implementation
   • The specification is not coherent with the actual implementation

1.a. How many iterations did you make when specifying the dynamic semantics of the DSL? [open question]
1.b. If more than two, can you name these iterations? (i.e. what determined each of these iterations) [open question]
1.c. Would be there more iterations if you continued specifying the DSL dynamic semantics? How many? [open question]

2. Assess the following activities on how well you could perform them when specifying the DSL semantics:
2.a. • expressing the DSL semantics in the provided formalisms (Event-B, constellecore, and Constelle)
   Choices:
   • Naturally, without problems;
   • Somewhat convenient;
   • Possible, requires some learning;
   • Possible, but not convenient/natural;
   • Doable, but very hard;
   • Impossible at all
2.b. • determining the borders of the semantic modules (i.e. decomposing the definition) [same choices as 2.a]
2.c. • connecting the semantic modules together into a whole system [same choices as 2.a]
2.d. • using tools [same choices as 2.a]
2.e. • other: Please specify [same choices as 2.a]

3.a. How close your Constelle specification of the DSL dynamic semantics to what you had in your mind before you started to specify the DSL semantics? [10 (exactly the same) - 0 (completely different)]

3.b. How close your Constelle specification of the DSL dynamic semantics to what you have in your mind now, after you finished specifying the dynamic semantics of the DSL? [10 (exactly the same) - 0 (completely different)]
3.c. If there is a difference between these two mental pictures (before and after specifying the DSL), then what did cause this change? [open question]

4. List the DSL concepts, with the specification of which you are not happy:
4.a. • you failed to specify these at all [list the concepts]
4.b. • you had to come up with a specification trick [list the concepts]
4.c. • you had to introduce undesirable auxiliary/intermediate concepts [list the concepts]
4.d. • the specification of these concepts does not correspond to the actual implementation [list the concepts]
4.e. • other: Please specify [list the concepts]
4.f. List the DSL concepts, which were hard to specify due to the limitations of Event-B [list the concepts]
4.g. List the DSL concepts, which were hard to specify due to the limitations of constellecore (semantic interfaces: operations with parameters) [list the concepts]
4.h. List the DSL concepts, which were hard to specify due to the limitations of Constelle (composing and specializing semantic interfaces in a table) [list the concepts]

5. List the DSL concepts, with the specification of which you are satisfied:
5.a. • the specification corresponds to how you originally wanted to specify the concept [list the concepts]
5.b. • the specification omits some details, but in general it is good enough [list the concepts]
5.c. • the specification adds some auxiliary details, but in general it is good enough [list the concepts]
5.d. • the specification uses a trick, but this trick corresponds to the actual implementation [list the concepts]
5.e. • other: Please specify [list the concepts]
5.f. In case your specification uses some tricks and/or auxiliary elements, on which level (i.e. in which formalism: Event-B, constellecore, or Constelle) did you add them? [open question]

6. Which of the Event-B specifications, that you have created, capture software engineering techniques that are more generic (have a broader scope) than the dynamic semantics of the DSL that you have specified?
6.a. • Techniques specific for formal specification (for example, Event-B tricks) [list the specifications]
6.b. • Known design patterns, algorithms, protocols, and architecture styles [list the specifications]
6.c. • Language engineering technology [list the specifications]
6.d. • Other: Please specify [list the specifications]
6.e. Which of these specifications you will be able to (re)use in your future work? [list the specifications]
6.f. Why did you use these software engineering techniques in your specification? [open question]
6.g. Do you think the amount of reusable/generic Event-B specifications in your Constelle specification could be increased by:
   • Increasing the amount of details (i.e. lowering the level of abstraction) [yes/no]
   • Making the decomposition more fine grained (introducing more semantic modules in your specification) [yes/no]

7.a. Was it useful to specify the dynamic semantics of the DSL in Constelle? [5 (very useful) - 0 (not useful at all)]
7.b. Have you found any gaps and/or inconsistencies that you were not aware of before? [open question]
7.c. Did you manage to do any type of analysis of the resulting specification? [open question]
7.d. Will you be able to use this Constelle specification as reference documentation for (learning and/or understanding) the DSL?
   • For yourself [yes/no]
   • For other interested participants [yes/no]
7.e. In case of using this Constelle specification as documentation, will be there any extra documentation necessary?
   • To explain the Constelle specification [yes/no]
   • Independent from the Constelle specification (but the same content-wise) [yes/no]

7.f. What else was the Constelle specification useful for? [open question]

8. What would you do differently in the definition process proposed for specifying the dynamic semantics of DSLs? (0. Preparation; 1.a. Obtain information about the DSL; 1.b. Recognize semantic modules (decompose); 1.c. Specify their semantic interfaces; 1.d. Implement them as Constelle tables; 1.e. Classify the constituent aspects as semantic modules or as specification templates; 2. Write constituent Event-B specifications; 3. Validate the resulting specification)
8.a. • Change order of steps [open question]
8.b. • Omit some steps [open question]
8.c. • Add extra steps [open question]

9. What recommendations do you have? [open question]
9.a. • for the semantic and/or structural interfaces [open question]
9.b. • for the Constelle language [open question]
9.c. • for the back-end formalism (not restricted to Event-B) [open question]
9.d. • for the tool support [open question]
9.e. • other [open question]

10.a. Does this questionnaire adequately capture your feedback? [yes/no]
10.b. Do you miss some questions? If yes, please specify: [open question]

Table C.2: Final Questionnaire

Summary

Engineering the Dynamic Semantics of Domain Specific Languages

Domain Specific Languages (DSLs) are a central concept of Model Driven Engineering (MDE). They are considered to be very effective in software development and are being widely adopted by industry. A DSL is a (small) computer language specialized for a specific (application) domain. In the context of MDE, a DSL is usually implemented as a translation to the input language of a target execution platform, such as C/C++ or Java code. As a consequence, the semantics of the DSL is (hard)coded in model transformations and code generation. This situation poses challenges when designing, learning, and evolving the DSL. In this work we investigated how an explicit definition of the dynamic semantics of a DSL can facilitate design and development of the DSL, and support understanding and debugging of DSL programs.

In order to gain insight into the real-life experience of working on and with DSLs, we performed a case study at ASML, a producer of complex lithography machines for the semiconductor industry. In our case study we defined the dynamic semantics of the LACE DSL (Logical Action Component Environment), which is used for generating fragments of the source code that controls lithography machines. Aiming for practical benefits of having a formal definition of the DSL dynamic semantics, we employed a formalism that has extensive tool support: the Event-B formalism. The Rodin platform offers a wide range of functionality that can be applied to an Event-B specification of the DSL: editing, automatic generation of proof obligations, automatic and interactive proving, animation, model checking, etc. To be able to apply these tools to a DSL specification, we used Event-B as a back-end formalism for defining the dynamic semantics of the DSL and developed a model-to-model transformation from the DSL to Event-B. To engage DSL engineers who are not familiar with the notation of Event-B, we created a domain-specific visualization of the Event-B specifications of the DSL. The visualization mimics the original graphical notation of the DSL and runs on top of the animation of an Event-B specification. Using this visualization, we investigated the needs and the perception of DSL engineers by means of a user study.

Based on the lessons learned during the case study and on the results of the user study, we formulated the use cases for a definition of the dynamic semantics of a DSL and identified the corresponding requirements. For example, we observed that although the available tools facilitate design and usage of the DSL, the semantic gap between the DSL and Event-B is quite wide, and the definition of the dynamic semantics is kept (coded) in the DSL-to-Event-B translation. Thus, the abstraction level of Event-B is not sufficient for defining the dynamic semantics of DSLs. An example of an important requirement, indicated by the DSL engineers, is that the DSL specification should be (kept) consistent with the actual implementation of the DSL (which evolves over time).

To bridge the wide semantic gap between a DSL and a specification formalism and to facilitate consistency between the specification and the implementation of a DSL, we introduce an intermediate semantic domain that splits the semantic mapping from a DSL to an execution platform (or a specification formalism) into two steps. As such an intermediate semantic domain we use software design solutions that are typically used in the DSL implementation, i.e. concepts that form the horizontal domain of the DSL. Thus, we propose to define the dynamic semantics of a DSL as a mapping from the language constructs (forming the vertical domain of the DSL) to the horizontal concepts. We realized the proposed idea in the form of the Constelle language and reusable specification templates. Specification templates realize the generic for (thorough mathematical-based) formal specifications. They capture software design solutions of the DSL horizontal domain in the form of generic (Event-B) specifications that can be specialized for a concrete domain. Constelle allows for defining the DSL dynamic semantics as a composition of such specification templates and, in this way, implements the two-step translation of the DSL to the back-end formalism. For invoking and weaving templates, Constelle applies ideas of aspect oriented programming and uses the notation of a table: the DSL vertical domain is represented in the table rows, the DSL horizontal domain is represented in the table columns, and the mapping from the vertical to the horizontal domain is represented in the table intersections.

The semantics of the Constelle language is implemented as a model-to-model transformation from Constelle to Event-B. We describe this transformation informally through the mathematical notation of set theory and functions. While designing the Constelle-to-Event-B transformation, we used this notation to formulate and to apply two design principles of developing (QVTo) model transformations: structural decomposition and chaining model transformations.

To evaluate our approach, we designed and conducted a validation study on defining the dynamic semantics of a DSL using Constelle. The validation study was designed as action research, following the steps of the GQM (Goal, Question, Metric) method. The validation study revealed both strengths and limitations of applying Constelle. Moreover, based on the insights gained during the study, we formulated a number of interesting directions for future work.

Curriculum Vitae

Ulyana Tikhonova was born on October 7th, 1985 in Leningrad (now St. Petersburg), Russia. From 2002 till 2008 she studied Applied Mathematics and Informatics at St. Petersburg State Polytechnic University, combining her studies with part-time work as a Software Developer. She did her Bachelor and Master projects in the field of Software Language Engineering and graduated in 2008 with honors. In July 2008 she received the Europe Anita Borg Memorial Scholarship. From 2008 till 2010 Ulyana worked as a Software Developer and a Junior Researcher at the Institute of Applied Astronomy of the Russian Academy of Sciences. At the same time she was working on her PhD project and teaching at St. Petersburg State Polytechnic University. In 2010 she was awarded the Russian Federation President Scholarship for studying abroad and did a 10-month internship at the SET (Software Engineering and Technology) group at Eindhoven University of Technology. In December 2011 Ulyana started her PhD research at Eindhoven University of Technology in collaboration with ASML within the Common Reference Framework (COREF) project. The results of this research are presented in this dissertation. In July 2016 Ulyana received the STW take-off grant for the project titled “Harnessing formal methods for practical use in industry”.

IPA Dissertation Series

Titles in the IPA Dissertation Series since 2014

J. van den Bos. Gathering Evidence: Model-Driven Software Engineering in Automated Digital Forensics. Faculty of Science, UvA. 2014-01
D. Hadziosmanovic. The Process Matters: Cyber Security in Industrial Control Systems. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2014-02
A.J.P. Jeckmans. Cryptographically-Enhanced Privacy for Recommender Systems. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2014-03
C.-P. Bezemer. Performance Optimization of Multi-Tenant Software Systems. Faculty of Electrical Engineering, Mathematics, and Computer Science, TUD. 2014-04
T.M. Ngo. Qualitative and Quantitative Information Flow Analysis for Multi-threaded Programs. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2014-05
A.W. Laarman. Scalable Multi-Core Model Checking. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2014-06
J. Winter. Coalgebraic Characterizations of Automata-Theoretic Classes. Faculty of Science, Mathematics and Computer Science, RU. 2014-07
W. Meulemans. Similarity Measures and Algorithms for Cartographic Schematization. Faculty of Mathematics and Computer Science, TU/e. 2014-08
A.F.E. Belinfante. JTorX: Exploring Model-Based Testing. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2014-09
A.P. van der Meer. Domain Specific Languages and their Type Systems. Faculty of Mathematics and Computer Science, TU/e. 2014-10
B.N. Vasilescu. Social Aspects of Collaboration in Online Software Communities. Faculty of Mathematics and Computer Science, TU/e. 2014-11
F.D. Aarts. Tomte: Bridging the Gap between Active Learning and Real-World Systems. Faculty of Science, Mathematics and Computer Science, RU. 2014-12
N. Noroozi. Improving Input-Output Conformance Testing Theories. Faculty of Mathematics and Computer Science, TU/e. 2014-13
M. Helvensteijn. Abstract Delta Modeling: Software Product Lines and Beyond. Faculty of Mathematics and Natural Sciences, UL. 2014-14
P. Vullers. Efficient Implementations of Attribute-based Credentials on Smart Cards. Faculty of Science, Mathematics and Computer Science, RU. 2014-15

F.W. Takes. Algorithms for Analyzing and Mining Real-World Graphs. Faculty of Mathematics and Natural Sciences, UL. 2014-16
M.P. Schraagen. Aspects of Record Linkage. Faculty of Mathematics and Natural Sciences, UL. 2014-17
G. Alpár. Attribute-Based Identity Management: Bridging the Cryptographic Design of ABCs with the Real World. Faculty of Science, Mathematics and Computer Science, RU. 2015-01
A.J. van der Ploeg. Efficient Abstractions for Visualization and Interaction. Faculty of Science, UvA. 2015-02
R.J.M. Theunissen. Supervisory Control in Health Care Systems. Faculty of Mechanical Engineering, TU/e. 2015-03
T.V. Bui. A Software Architecture for Body Area Sensor Networks: Flexibility and Trustworthiness. Faculty of Mathematics and Computer Science, TU/e. 2015-04
A. Guzzi. Supporting Developers’ Teamwork from within the IDE. Faculty of Electrical Engineering, Mathematics, and Computer Science, TUD. 2015-05
T. Espinha. Web Service Growing Pains: Understanding Services and Their Clients. Faculty of Electrical Engineering, Mathematics, and Computer Science, TUD. 2015-06
S. Dietzel. Resilient In-network Aggregation for Vehicular Networks. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2015-07
E. Costante. Privacy throughout the Data Cycle. Faculty of Mathematics and Computer Science, TU/e. 2015-08
S. Cranen. Getting the point – Obtaining and understanding fixpoints in model checking. Faculty of Mathematics and Computer Science, TU/e. 2015-09
R. Verdult. The (in)security of proprietary cryptography. Faculty of Science, Mathematics and Computer Science, RU. 2015-10
J.E.J. de Ruiter. Lessons learned in the analysis of the EMV and TLS security protocols. Faculty of Science, Mathematics and Computer Science, RU. 2015-11
Y. Dajsuren. On the Design of an Architecture Framework and Quality Evaluation for Automotive Software Systems. Faculty of Mathematics and Computer Science, TU/e. 2015-12
J. Bransen. On the Incremental Evaluation of Higher-Order Attribute Grammars. Faculty of Science, UU. 2015-13
S. Picek. Applications of Evolutionary Computation to Cryptology. Faculty of Science, Mathematics and Computer Science, RU. 2015-14
C. Chen. Automated Fault Localization for Service-Oriented Software Systems. Faculty of Electrical Engineering, Mathematics, and Computer Science, TUD. 2015-15
S. te Brinke. Developing Energy-Aware Software. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2015-16
R.W.J. Kersten. Software Analysis Methods for Resource-Sensitive Systems. Faculty of Science, Mathematics and Computer Science, RU. 2015-17
J.C. Rot. Enhanced coinduction. Faculty of Mathematics and Natural Sciences, UL. 2015-18
M. Stolikj. Building Blocks for the Internet of Things. Faculty of Mathematics and Computer Science, TU/e. 2015-19
D. Gebler. Robust SOS Specifications of Probabilistic Processes. Faculty of Sciences, Department of Computer Science, VUA. 2015-20
M. Zaharieva-Stojanovski. Closer to Reliable Software: Verifying functional behaviour of concurrent programs. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2015-21

R.J. Krebbers. The C standard formalized in Coq. Faculty of Science, Mathematics and Computer Science, RU. 2015-22
R. van Vliet. DNA Expressions – A Formal Notation for DNA. Faculty of Mathematics and Natural Sciences, UL. 2015-23
S.-S.T.Q. Jongmans. Automata-Theoretic Protocol Programming. Faculty of Mathematics and Natural Sciences, UL. 2016-01
S.J.C. Joosten. Verification of Interconnects. Faculty of Mathematics and Computer Science, TU/e. 2016-02
M.W. Gazda. Fixpoint Logic, Games, and Relations of Consequence. Faculty of Mathematics and Computer Science, TU/e. 2016-03
S. Keshishzadeh. Formal Analysis and Verification of Embedded Systems for Healthcare. Faculty of Mathematics and Computer Science, TU/e. 2016-04
P.M. Heck. Quality of Just-in-Time Requirements: Just-Enough and Just-in-Time. Faculty of Electrical Engineering, Mathematics, and Computer Science, TUD. 2016-05
Y. Luo. From Conceptual Models to Safety Assurance – Applying Model-Based Techniques to Support Safety Assurance. Faculty of Mathematics and Computer Science, TU/e. 2016-06
B. Ege. Physical Security Analysis of Embedded Devices. Faculty of Science, Mathematics and Computer Science, RU. 2016-07
A.I. van Goethem. Algorithms for Curved Schematization. Faculty of Mathematics and Computer Science, TU/e. 2016-08
T. van Dijk. Sylvan: Multi-core Decision Diagrams. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2016-09
I. David. Run-time resource management for component-based systems. Faculty of Mathematics and Computer Science, TU/e. 2016-10
A.C. van Hulst. Control Synthesis using Modal Logic and Partial Bisimilarity – A Treatise Supported by Computer Verified Proofs. Faculty of Mechanical Engineering, TU/e. 2016-11
A. Zawedde. Modeling the Dynamics of Requirements Process Improvement. Faculty of Mathematics and Computer Science, TU/e. 2016-12
F.M.J. van den Broek. Mobile Communication Security. Faculty of Science, Mathematics and Computer Science, RU. 2016-13
J.N. van Rijn. Massively Collaborative Machine Learning. Faculty of Mathematics and Natural Sciences, UL. 2016-14
M.J. Steindorfer. Efficient Immutable Collections. Faculty of Science, UvA. 2017-01
W. Ahmad. Green Computing: Efficient Energy Management of Multiprocessor Streaming Applications via Model Checking. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2017-02
D. Guck. Reliable Systems – Fault tree analysis via Markov reward automata. Faculty of Electrical Engineering, Mathematics & Computer Science, UT. 2017-03
H.L. Salunkhe. Modeling and Buffer Analysis of Real-time Streaming Radio Applications Scheduled on Heterogeneous Multiprocessors. Faculty of Mathematics and Computer Science, TU/e. 2017-04
A. Krasnova. Smart invaders of private matters: Privacy of communication on the Internet and in the Internet of Things (IoT). Faculty of Science, Mathematics and Computer Science, RU. 2017-05
A.D. Mehrabi. Data Structures for Analyzing Geometric Data. Faculty of Mathematics and Computer Science, TU/e. 2017-06
D. Landman. Reverse Engineering Source Code: Empirical Studies of Limitations and Opportunities. Faculty of Science, UvA. 2017-07
W. Lueks. Security and Privacy via Cryptography – Having your cake and eating it too. Faculty of Science, Mathematics and Computer Science, RU. 2017-08

A.M. Şutîi. Modularity and Reuse of Domain-Specific Languages: an exploration with MetaMod. Faculty of Mathematics and Computer Science, TU/e. 2017-09
U. Tikhonova. Engineering the Dynamic Semantics of Domain Specific Languages. Faculty of Mathematics and Computer Science, TU/e. 2017-10

Index

abstract syntax, 2
action research, 134
baseline, 139
concrete syntax, 2
concurrent triangulation strategy, 134
Constelle workbench, 125
critical theory, 6
definition formalism, 11
definition process, 123
domain specific language, 2
DSL developer, 17
DSL user, 17
dynamic semantics, 2
embedded DSL, 2
executable definition, 12
executable DSL, 12
external DSL, 2
GQM method, 134
horizontal domain, 41
philosophical stance, 6
pragmatism, 6
precise definition, 12
qualitative study, 134
semantic domain, 12
semantic feature, 15
semantic mapping, 12
specification template, 55
T-diagram, 12
technology space, 44
template parameter, 55
vertical domain, 41