Type Inference in Flexible Model-Driven Engineering
ATHANASIOS ZOLOTAS
DOCTOR OF ENGINEERING
UNIVERSITY OF YORK
COMPUTER SCIENCE
September 2016

Abstract

Model-driven Engineering (MDE) is an approach to software development that promises increased productivity and product quality. Domain models that conform to metamodels, both of which are the core artefacts in MDE approaches, are manipulated to perform different development processes using specific MDE tools. However, domain experts, who have detailed domain knowledge, typically lack the technical expertise to transfer this knowledge using MDE tools. Flexible or bottom-up Model-driven Engineering is an emerging approach to domain and systems modelling that tackles this challenge by promoting the use of simple drawing tools to increase the involvement of domain experts in MDE processes. In this approach, no metamodel is created upfront but instead the process starts with the definition of example models that will be used to infer a draft metamodel. When complete knowledge of the domain is acquired, a final metamodel is devised and a transition to traditional MDE approaches is possible. However, the lack of a metamodel that encodes the semantics of conforming models and of tools that impose these semantics bears some drawbacks, among others that of having models with nodes that are unintentionally left untyped. In this thesis we propose the use of approaches that use algorithms from three different research areas, that of classification algorithms, constraint programming and graph similarity to help with the type inference of such untyped nodes. We perform an evaluation of the proposed approaches in a number of randomly generated example models from 10 different domains with results suggesting that the approaches could be used for type inference both in an automatic or a semi-automatic style.
For my parents Despoina and Michalis

Contents

Abstract
Dedication
Table of Contents
List of Figures
List of Tables
Listings
List of Algorithms
Acknowledgements
Declaration

1. Introduction
  1.1. Motivation and Background
    1.1.1. Bottom-up MDE
  1.2. Hypothesis and Objectives
    1.2.1. Thesis Objectives
  1.3. Research Contributions
  1.4. Thesis Structure

2. Literature Review
  2.1. Model-Driven Engineering
    2.1.1. MDE Principles and Tools
    2.1.2. Strengths and Weaknesses of MDE
  2.2. Bottom-up MDE
    2.2.1. Muddles
    2.2.2. metaBUP
    2.2.3. Flexisketch
    2.2.4. Other
  2.3. Partial Models
  2.4. Metamodel and Type Inference
    2.4.1. MetaBUP
    2.4.2. Flexisketch
    2.4.3. MLCBD
    2.4.4. Process Development Environment (PDE)
    2.4.5. Metamodel Recovery System (MARS)
  2.5. Summary and Critique of Flexible MDE approaches
  2.6. Classification Algorithms
    2.6.1. Classification and Regression Trees (CART)
    2.6.2. Random Forests (RF)
    2.6.3. Support Vector Machines (SVM)
    2.6.4. Artificial Neural Networks (ANN)
  2.7. Constraint Logic Programming
    2.7.1. Logic Programming Tools & Distributions
    2.7.2. Combining MDE with Logic Programming
  2.8. Graph Similarity
    2.8.1. Similarity Flooding
    2.8.2. Using Similarity Measurements in MDE
  2.9. Chapter Summary

3. Type Inference using Classification Algorithms
  3.1. Introduction
  3.2. Type Inference
  3.3. Feature Signatures
    3.3.1. Features Based on the Semantics
    3.3.2. Features Based on Concrete Syntax
    3.3.3. Extending Muddles
  3.4. Training and Classification
  3.5. Experimental Evaluation
    3.5.1. Experiment for Features Based on Semantics
    3.5.2. Results and Discussion
    3.5.3. Experiment for Concrete Syntax Features
    3.5.4. Results and Discussion
  3.6. Limitations
  3.7. Chapter Summary

4. Type Inference using Constraint Programming
  4.1. Introduction
  4.2. Type Inference
  4.3. The Constraint Satisfaction Problem
    4.3.1. CSP Formalisation
    4.3.2. Model and Metamodel to CSP Transformation
  4.4. Experimental Evaluation
    4.4.1. Experiment
    4.4.2. Results & Discussion
  4.5. Limitations
  4.6. Chapter Summary

5. Type Inference using Graph Similarity
  5.1. Introduction
  5.2. Type Inference Using String Similarity
  5.3. Graph Configuration
    5.3.1. Flattened Configuration
  5.4. Similarity Flooding
  5.5. Experimental Evaluation
    5.5.1. Experiment
    5.5.2. Results and Discussion
  5.6. Limitations
  5.7. Chapter Summary

6. Conclusions
  6.1. Thesis Contributions
  6.2. Future Work
  6.3. Closing Remarks

Appendices
  A. Metamodels

Bibliography

List of Figures

1.1. Stages of a typical flexible MDE approach.
1.2. Overview of the research project.
2.1. The relationships between a model with its metamodel and the domain it represents (adapted from [1]).
2.2. The four layers of metamodelling infrastructures (adapted from [2] and [3]).
2.3. An example of a metamodel.
2.4. An example of a model that conforms to the metamodel of Figure 2.3.
2.5. An example of a model-to-model transformation between instances of two different metamodels.
2.6. An example of a model-to-text transformation.
2.7. The architecture of the Epsilon suite.
2.8. An overview of the Muddles approach (based on Fig. 1 from [4]).
2.9. An example model diagram in yEd representing a zoo configuration. Shapes and colours are not bound to types but can be used by domain experts for the better presentation of the example models.
2.10. The Muddle metamodel.
2.11. An overview of the metaBUP approach (from [5]).
2.12. An example visual fragment (from [5]).
2.13. The Flexisketch approach’s three basic phases (from [6]).
2.14. The Flexisketch Android application.
2.15. String representation of sketches in Coyette et al.’s approach (from [7]).
2.16. Concept metamodel of PDE-based languages (adapted from [8]).
2.17. MARS metamodel inference approach (adapted from [9]).
2.18. An example of a decision tree in CART (from [10]).
2.19. A map for colouring.
2.20. An overview of the similarity flooding approach.
2.21. An example metamodel (adapted from [11]).
2.22. The directed graph of the metamodel shown in Figure 2.21 using the minimal configuration (adapted from [11]).
2.23. An overview of Grammel et al.’s approach [12] to trace link generation (adapted from [12]).
2.24. An example of how a metamodel is translated to an E-Graph in Grammel et al.’s [12] model matching approach.
3.1. An overview of the proposed approach to type inference using classification algorithms.
3.2. An example model of a zoo configuration.
3.3. Colours and shapes are used to define semantics on graphical models.
3.4. The muddles extension for type inference using concrete syntax properties.
3.5. Example decision tree for the features based on semantics. F1 represents the number of attributes of a node and F2 the number of unique incoming references.
3.6. The experimentation process using the features based on semantics.
3.7. Accuracy for different sampling rates and number of trees (“Normal”, Random Forests).
3.8. Accuracy for different sampling rates and number of trees (“Sparse”, Random Forests).
3.9. A metamodel from which instances of “Children” nodes may never be instantiated if the random model generator forces optional composition relationships to be instantiated less frequently (“Sparse” scenario).
3.10. Variables importance of features based on semantics. F1 represents the number of attributes, F2 and F3 represent the number of unique incoming and outgoing references respectively and F4 and F5 the number of unique children and parents respectively.
3.11. The concrete features experimentation process. ..
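
The captions of Figures 3.5 and 3.10 name five semantic features per node (F1 to F5). As a rough illustration only, the Python sketch below computes such a feature signature for the nodes of a small, made-up zoo model and guesses the type of an untyped node. Several things here are assumptions rather than the thesis's implementation: the `Node` and `Edge` classes, the zoo data and its type names are hypothetical, "unique" is read as "distinct edge labels", and a 1-nearest-neighbour lookup stands in for the trained classifiers (CART, Random Forests) that the thesis evaluates.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    attributes: dict = field(default_factory=dict)
    type: Optional[str] = None  # None marks a node left untyped


@dataclass(frozen=True)
class Edge:
    source: str
    label: str
    target: str
    containment: bool = False  # True for parent-to-child (composition) edges


def signature(node, edges):
    """Return (F1, F2, F3, F4, F5) for one node.

    Assumption: "unique" is interpreted as the number of distinct edge labels.
    """
    incoming = {e.label for e in edges if e.target == node.name and not e.containment}
    outgoing = {e.label for e in edges if e.source == node.name and not e.containment}
    children = {e.label for e in edges if e.source == node.name and e.containment}
    parents = {e.label for e in edges if e.target == node.name and e.containment}
    return (len(node.attributes), len(incoming), len(outgoing),
            len(children), len(parents))


def infer_type(untyped, typed_nodes, edges):
    """Pick the type of the typed node with the closest signature.

    This 1-nearest-neighbour rule is a stand-in, not the thesis's classifier.
    """
    target = signature(untyped, edges)

    def distance(candidate):
        return sum((a - b) ** 2 for a, b in zip(signature(candidate, edges), target))

    return min(typed_nodes, key=distance).type


# Hypothetical example model: a zoo with one untyped node ("lion").
zoo = Node("zoo", {"name": "City Zoo"}, type="Zoo")
tiger = Node("tiger", {"name": "Tigger", "age": 4}, type="Animal")
keeper = Node("keeper", {"name": "Sam"}, type="Keeper")
lion = Node("lion", {"name": "Leo", "age": 7})  # type unintentionally omitted

edges = [
    Edge("zoo", "animals", "tiger", containment=True),
    Edge("zoo", "animals", "lion", containment=True),
    Edge("zoo", "staff", "keeper", containment=True),
    Edge("keeper", "feeds", "tiger"),
    Edge("keeper", "feeds", "lion"),
]

print(signature(lion, edges))
print(infer_type(lion, [zoo, tiger, keeper], edges))
```

In this toy model the untyped node shares its signature with the only typed animal, so the lookup is trivial; with realistic models several types can share a signature, which is one reason a trained classifier over many example nodes is the more plausible choice.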