ABSTRACT

INTELLIGENT SIMULINK MODELING ASSISTANCE VIA MODEL CLONES AND MACHINE LEARNING

by Bhisma Adhikari

Source code auto-completion has been pursued extensively by the research community and has benefited software engineering to a great extent. In contrast, the analogous research in Model Driven Engineering (MDE), that is, model auto-completion, has not been explored as extensively. As a result, there is insufficient MDE tooling that allows for automatic completion of models. This thesis aims to fill that gap by developing a prototype modeling assistant for Simulink using machine learning algorithms and Simone as the underlying model clone detector. The modeling assistant provides two types of suggestions: (1) Simulink block-level suggestions, and (2) Simulink (sub)system-level suggestions, to help modelers auto-complete or extend their Simulink models. It uses machine learning algorithms to produce block-level suggestions and model clone detection to produce (sub)system-level suggestions. Simulink and Simone are chosen over other tools due to their greater maturity and wider adoption. Although this prototype modeling assistant is developed specifically for Simulink, the results and knowledge gained from this research are extendable to any modeling environment for which a text-based model clone detector is available.

A Thesis

Submitted to the Faculty of Miami University in partial fulfillment of the requirements for the degree of Master of Science

by Bhisma Adhikari

Miami University
Oxford, Ohio
2021

Advisor: Dr. Matthew Stephan
Reader: Dr. Eric J. Rapos
Reader: Dr. Christopher Vendome

©2021 Bhisma Adhikari

This Thesis titled

INTELLIGENT SIMULINK MODELING ASSISTANCE VIA MODEL CLONES AND MACHINE LEARNING

by

Bhisma Adhikari

has been approved for publication by

The College of Engineering and Computing

and

The Department of Computer Science & Software Engineering

Dr. Matthew Stephan

Dr. Eric J. Rapos

Dr. Christopher Vendome

Table of Contents

List of Tables
List of Figures
Acknowledgements
Statement of Originality

1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Overview

2 Background & Related Work
  2.1 Model-Driven Engineering
    2.1.1 MDE Tools
    2.1.2 Simulink
    2.1.3 Simulink App
    2.1.4 App Designer
  2.2 Software Clones
    2.2.1 Code Clones
    2.2.2 Code Clone Types
    2.2.3 Model Clones
    2.2.4 Model Clone Types
    2.2.5 Drawbacks of Software Clones
    2.2.6 Advantages of Software Clones
    2.2.7 Simone
  2.3 Static Software Analysis
    2.3.1 Software Clone Detection
    2.3.2 Source Code Suggestion
    2.3.3 Model Suggestion
  2.4 Machine Learning
    2.4.1 Recommender Systems
    2.4.2 Association Rule Mining
    2.4.3 Ensemble Learning
  2.5 Related Work
    2.5.1 Source Code Assistance/Completion
    2.5.2 Model Assistance/Completion
    2.5.3 Reference Framework for Intelligent Modeling Assistance (RFIMA)

3 Overview
  3.1 SimIMA Components
  3.2 SimIMA Architecture
  3.3 SimIMA User Interface

4 SimGestion: Simulink Block-Level Suggestions
  4.1 Block Prediction Models
    4.1.1 ARM Model
    4.1.2 Freq Model
    4.1.3 Ensemble Model
  4.2 Performance Optimization of Block-Prediction Models
  4.3 User Interface
    4.3.1 SimGestion Suggestion Panel
    4.3.2 SimGestion Configuration Wizard
  4.4 Running Example: ExampleSUD

5 SimXample: Simulink Complete Model Examples
  5.1 Phase 1: Clone Detection
    5.1.1 Phase 1 Implementation
  5.2 Phase 2: Subsystem Recommendation
    5.2.1 Phase 2 Implementation
  5.3 Phase 3: Candidate Selection
    5.3.1 Phase 3 Implementation
  5.4 Phase 4: Application to User Model
    5.4.1 Phase 4 Implementation
  5.5 Running Example Continued: ExampleSUD

6 Evaluation
  6.1 Data
  6.2 SimGestion Evaluation
    6.2.1 Data Preparation
    6.2.2 Determining Block Prediction Model Parameters
    6.2.3 Evaluation of Ensemble Block Suggestions
    6.2.4 Results
    6.2.5 Discussion
  6.3 SimXample Evaluation
    6.3.1 Data Preparation
    6.3.2 Evaluation of Visualization of Inferred Suggestions
    6.3.3 Evaluation of the Replacement
    6.3.4 Results
    6.3.5 Discussion
  6.4 Qualitative Evaluation via IMA Evaluation Grid

7 Conclusion
  7.1 Threats to Validity
    7.1.1 External Threats
    7.1.2 Internal Threats
  7.2 Future Work
  7.3 Summary

A Source Code
  A.1 Class Definitions
  A.2 Function Definitions

References

List of Tables

4.1 Example ARM Matrix
6.1 SimGestion Suggestion Accuracy
6.2 Evaluation Tests for the 5 Insertion Cases
6.3 Inferred Suggestion Accuracy
6.4 Summary of Qualitative Assessment

List of Figures

2.1 A Simulink model that adds two integers and outputs the result
2.2 A Simulink app that computes and plots the mortgage based on different input values. This app (Mortgage.mlapp) is included with the official distribution of MATLAB software.
2.3 The design view of the Simulink app shown in Figure 2.2. This app (Mortgage.mlapp) is included with the official distribution of MATLAB software.
2.4 The code view of the Simulink app shown in Figure 2.2
2.5 An example of Type I (exact) model clones in Simulink
2.6 An example of Type II (renamed) model clones in Simulink
2.7 An example of Type III (near-miss) model clones in Simulink
2.8 An example of Type IV (semantic) model clones in Simulink
2.9 Model suggestion as envisioned in SimIMA
3.1 SimIMA Architecture Overview
3.2 SimIMA UI: Querying SimIMA using customized tools menu
3.3 SimIMA UI: Querying SimIMA using customized context menu
4.1 SimGestion suggestion panel
4.2 SimIMA Block-Level Suggestion Configuration Wizard
4.3 Initial composition of ExampleSUD
4.4 Initial Suggestions for First Insertion into ExampleSUD
4.5 Interim view of ExampleSUD after four applications of SimGestion
4.6 Interim view of ExampleSUD after seven applications of SimGestion
5.1 SimXample Process Overview
5.2 SimXample user interface
5.3 Decision Process for Inserting SFS into SUD
5.4 Engineer-guided replacement notification – prompt to proceed with replacement
5.5 Engineer-guided replacement notification – prompt to adjust connections manually
5.6 ExampleSUD after inserting the suggestion from SimXample
6.1 Accuracy of Suggestion Based on Number of Suggestions Shown

Acknowledgements

I am very thankful to my thesis advisor, Dr. Matthew Stephan of the Department of Computer Science and Software Engineering at Miami University, for providing me with this exciting opportunity to work on this project. The proposed project on SimVMA was originally his idea [1].

I am equally grateful to Dr. Eric Rapos who, together with Dr. Stephan, provided me with valuable ideas, insights, and feedback in the course of this thesis. This work would not have been possible without their guidance. I would also like to acknowledge Dr. Rapos as the first reader of this thesis. It was a great honor for me to work with Dr. Stephan and Dr. Rapos, whose optimism and enthusiasm for this project inspired me and kept me motivated throughout the process.

Similarly, I would like to extend my gratitude to Dr. Christopher Vendome, who as a committee member provided me with productive feedback, especially during the proposal-writing phase.

I am equally thankful to the MUSTANG research group at Miami University. The continuous support and feedback from MUSTANG members kept me motivated throughout my Master’s studies.

Last but not least, I am very grateful to my family and friends for their continuous love and support. Thank you, everyone!

This material is based upon work supported by the National Science Foundation under Grant No. 1849632.

Statement of Originality

These statements certify that, to the best of my knowledge, the content of this thesis is my own work. This thesis has not been submitted for any other degree. I certify that the intellectual content of this thesis is the product of my original work and that all the assistance received in preparing this thesis and sources have been acknowledged in this thesis.

Much of the original content of this thesis has been submitted to a theme issue of the Software and Systems Modeling (SoSyM) journal and is currently under review [2].

Chapter 1 Introduction

This chapter introduces this thesis work on Simulink Intelligent Modeling Assistant via Model Clones and Machine Learning (SimIMA). Specifically, it elaborates the motivation behind this research and the contribution this research makes to Model Driven Engineering (MDE) and Software Engineering as a whole. Some of the contents in this chapter have been submitted to the SoSyM Theme Issue on AI-enhanced model-driven engineering [2].

While research in source code auto-completion has advanced extensively, the analogous research in Model Driven Engineering (MDE) is still in its infancy. Consequently, MDE lacks sufficient tooling that supports model auto-completion. This thesis aims to fill that gap by developing algorithms and approaches to create an Intelligent Modeling Assistant (IMA). Specifically, it develops a prototype modeling assistant for the Simulink modeling environment called Simulink Intelligent Modeling Assistant (SimIMA). This modeling assistant helps engineers auto-complete their incomplete Simulink models as well as refactor their completed models. SimIMA provides two types of suggestions: Simulink block-level suggestions and Simulink (sub)system-level suggestions. Engineers can choose to insert such suggestions into their Model Under Development (MUD) directly from the SimIMA UI, which is integrated into the native Simulink modeling environment.

SimIMA uses machine learning algorithms to produce block-level suggestions. Similarly, it uses model clone detection techniques to produce (sub)system-level suggestions. Although this IMA is developed for the Simulink modeling environment, the methodology employed in this research, including machine learning and model clone detection, is generalizable to other modeling environments for which a text-based model clone detector is available. SimIMA derives its suggestions from a set of Simulink model repositories. These repositories, as well as SimIMA parameters such as the number of suggestions and the minimum similarity level of (sub)system-level suggestions, are configurable by the user to better serve their specific modeling needs.

1.1 Motivation

MDE continues to see notable adoption in industry [3], educational contexts [4], and research [5]. As with any evolving and maturing field, MDE has certain challenges that it must overcome and milestones it must achieve to flourish and realize its full potential. One such challenge, identified at the “2017 Grand Challenges in Modelling workshop” and its subsequent publication [5], is the need for cognizance-based data-driven approaches to assist engineers while they are developing their modeling artifacts [6]. Such approaches have seen much success in traditional source code software engineering contexts, for example, those that provide coding assistance [7,8,9]. The same does not hold nearly as true for MDE. One promising approach to address this open MDE challenge is an intelligent virtual modeling assistant capable of providing data-based examples and/or guidance to engineers as they create their software models [6]. This will bring tangible and measurable benefits to MDE approaches. Mussbacher et al. [10] further discuss the need and landscape for intelligent modeling assistance, while also providing a reference framework, or guidelines, to those building Intelligent Modeling Assistants (IMAs).

SimIMA is a realization of Stephan’s vision of an IMA presented in a New Ideas and Emerging Results paper [11]. In that paper, Stephan presented general high-level ideas for realizing an IMA that employs model clone detection/analysis, inference, and machine learning. These ideas included two forms of cognification and software modeling assistance. The first was suggesting and presenting to engineers entire models that are similar to the incomplete models they are currently developing, for guidance or direct insertion. These suggested models come from analysis of a knowledge base consisting of software models from configurable sources and are visualized to engineers in their native interface. The second was step-wise suggestions for engineers to consider during development, allowing them to visualize, analyze, and apply data-driven suggestions.

1.2 Contributions

This thesis report describes the efforts in researching and developing SimIMA. Specifically, this thesis answers the following research questions:

RQ1 - Can SimIMA provide engineers step-wise model element suggestions that are based on analysis of a configurable model set?

RQ2 - Can SimIMA provide engineers developing software models with similar model examples for insertion/inspiration based on analysis of a configurable model set?

To answer these research questions, this thesis investigates and develops a research proof of concept in the form of an intelligent modeling assistant for the Simulink modeling environment. While SimIMA aims to define a proof of concept in a language-agnostic, generalizable fashion, Simulink was chosen as the initial target language due to its popularity, especially in the emerging areas of cyber-physical, embedded [12], and most recently, machine learning¹ systems. Additionally, there are many open and growing repositories with Simulink models, including Matlab Central², SourceForge, and other corpora [13]. This thesis also considers and references the IMA guidelines, RF-IMA, proposed by Mussbacher et al. [10] where appropriate.

¹ https://www.mathworks.com/help/stats/machine-learning-in-matlab.html
² https://www.mathworks.com/matlabcentral/

SimIMA is composed of two forms of modeling assistance, corresponding to the two research questions presented earlier, and enables engineers to request assistance in their native environments while developing their models. The first form of modeling assistance, corresponding to the first research question and termed SimGestion, provides engineers with multiple suggestions for which Simulink block they should add next to their model based on a combination of techniques from machine learning, including classification based on association rule mining (ARM) and frequency matching. This thesis evaluates SimIMA through empirical experiments using a curated [13] and independently validated [14] data set. The second form of modeling assistance, corresponding to the second research question and termed SimXample, allows engineers to 1) visualize similar models based on clone analysis and data inference with engineer-customizable repositories and examples, and 2) either select one of these similar model examples for direct insertion or display it adjacently for inspiration and guidance as they create and edit their models. This thesis makes the following contributions:

1. A model assistance approach that employs machine learning techniques on configurable model sets to give engineers step-wise guidance in the development of their systems.

2. A model assistance approach that uses model clone detection on configurable model sets to provide suggestions to engineers for inspiration or insertion.

3. A realization and demonstration of both of these approaches in a complete Simulink Intelligent Modeling Assistant, SimIMA, solution available to engineers for use in their native Simulink environments.

4. An evaluation of these approaches using a curated and established corpus of Simulink models that has been independently validated.

5. All of SimIMA data and artifacts posted on a public and persistent repository, allowing for reproduction and replication.³

1.3 Overview

The remainder of this document is organized as follows. Chapter 2 presents background information and related research works pertinent to SimIMA. Chapter 3 discusses the components, architecture, and user interface of SimIMA at a high level. Chapters 4 and 5 present the research approaches for SimGestion and SimXample respectively, including the design decisions, research hurdles, and user interface decisions for each. Chapter 6 presents the evaluation of each aspect of SimIMA. Finally, Chapter 7 concludes this thesis by summarizing this research and identifying threats to validity as well as some possibilities to extend/enhance this work on SimIMA.

³ SimIMA: https://doi.org/10.5281/zenodo.5123568, SimIMA evaluation: https://doi.org/10.5281/zenodo.5123564

Chapter 2 Background & Related Work

This chapter presents background information and related work for this thesis project, so that the reader can follow the SimIMA research concepts presented in later chapters with ease. SimIMA is closely related to four distinct sub-disciplines of Computer Science/Software Engineering. These sub-disciplines are as follows:

1. Model Driven Engineering

2. Software Clones

3. Static Software Analysis

4. Machine Learning

2.1 Model-Driven Engineering

Model-Driven Engineering (MDE) is a software development approach employing high level formal abstractions as first class artifacts [15]. In MDE, these high level formal abstractions of the software are called models and they drive the overall software development process.

Traditionally, software engineering makes limited use of models for software development. Models, in traditional software engineering, are used more as a diagram rather than as a software artifact [4]. The use of models in traditional (non-model based) software engineering is often limited to facilitating communication between different stakeholders. For example, UML diagrams [16] are used to facilitate communication among software developers, and between software companies and clients.

In contrast, models in Model-Driven Engineering are not just diagrams. They are software artifacts. In MDE, such models are used by engineers during all phases of the software engineering life cycle to model both the structural (how the software system is structured) as well as the behavioral (how the software system operates) aspects of the software project. Representing systems as models rather than traditional source code allows engineers to develop large and high quality applications by expressing them using formalisms closer in abstraction to the problem domain. In an ideal MDE environment, the models are the only artifacts created, updated, shared, and analyzed. These models must have formal syntax and semantics, and have sufficient support facilities ensuring they are easier to work with than the low level code they represent. This includes automatic code generation from models and verification techniques demonstrating the models correctly represent the system [17].

MDE has become popular in both academia and industry [3, 18]. Hutchinson et al. [3] have discussed the use of MDE in car, printer, and telecom industries.

2.1.1 MDE Tools

Just as traditional software engineering relies on Integrated Development Environments (IDEs) for developing software, Model-Driven Engineering relies on MDE tools. Eclipse Modeling Framework (EMF) [19] and Papyrus-RT [20] are two widely used MDE tools based on the open-source Eclipse project [21]. There are also several proprietary MDE tools, among which Simulink [22] is the most popular, used widely to model automotive, aerospace, and embedded systems [12].

2.1.2 Simulink

Simulink is an MDE tool developed and maintained by MathWorks [23]. It is simulation software that allows users to create simulation models of their software and/or hardware systems. Simulink is tightly coupled with the Matlab [24] programming environment, which is also a product of MathWorks. Thus, models created in Simulink can be manipulated either from Simulink’s graphical editor or via Matlab scripts.

While Eclipse-based modeling tools primarily use UML diagrams to model the software, Simulink uses a different standard. Simulink models are data-flow models, and consist of three levels of granularity: whole models, systems, and blocks. Models contain systems, and systems contain other (sub)systems and blocks. Thus, blocks are the smallest entities in Simulink models. Blocks are connected to each other by lines which carry a signal to/from a block. A block is usually associated with a specific mathematical function, which is performed on the incoming signal to produce a resulting signal. However, there are some blocks, for example output blocks, that do not perform mathematical operations. The output from one block is often fed as an input to some other block(s), and so on until the final output of the (sub)system is reached. Similarly, the outputs from one or more sub-systems may be fed as input to other sub-systems. The final output from the last sub-system in the chain gives the output of the entire Simulink model. Simulink saves its model files as either mdl or slx files, both of which are proprietary to MathWorks.
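The containment hierarchy described above (systems containing blocks and further subsystems) can be pictured with a small data structure. The following is only an illustrative sketch: the `Block` and `System` types and the `countBlocks` helper are hypothetical names chosen for this example and do not correspond to any Simulink or SimIMA API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a Simulink block; only its type name is kept.
struct Block {
    std::string type;  // e.g. "Constant", "Sum", "Outport"
};

// Hypothetical stand-in for a Simulink (sub)system: it owns blocks and
// may contain further subsystems, mirroring the hierarchy in the text.
struct System {
    std::string name;
    std::vector<Block> blocks;
    std::vector<System> subsystems;
};

// Count all blocks in a system, descending into nested subsystems.
std::size_t countBlocks(const System& system) {
    std::size_t total = system.blocks.size();
    for (const System& sub : system.subsystems) {
        total += countBlocks(sub);
    }
    return total;
}
```

A top-level system with two blocks and one subsystem holding two more blocks would yield a count of four.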

A Simple Simulink Model

To help the reader understand more concretely how systems are modeled in Simulink, this section presents a simple Simulink model and one possible equivalent implementation in the C++ programming language. The Simulink model, as illustrated in Figure 2.1, represents a simple adder that adds two integers and outputs the sum. Listing 2.1 shows C++ code equivalent to this Simulink model. Although Simulink provides a feature to generate C++ code from the model itself, the C++ code presented in Listing 2.1 is not generated using Simulink, for two reasons. Firstly, the code generated by Simulink would be much longer, organized in multiple files and spanning multiple pages, even for this simple model, and thus not suitable to include in this document. Secondly, the purpose of including the C++ code here is only to illustrate the code-equivalence of a Simulink model, rather than to illustrate the code-generation feature of Simulink.

Figure 2.1: A Simulink model that adds two integers and outputs the result

    #include <iostream>
    using namespace std;

    // Mirrors the adder model: two constant inputs, a sum, and an output.
    void model() {
        int i1 = 23;
        int i2 = 54;
        int sum = i1 + i2;
        cout << "sum : " << sum << endl;
    }

    int main() {
        model();
    }

Listing 2.1: Equivalent C++ code of Simulink model in Figure 2.1

2.1.3 Simulink App

A Simulink application, called an app, is a self-contained Matlab program that provides a simple point-and-click interface to the underlying Matlab code. The Matlab code may be linked with a Simulink model as well. Apps contain interactive control components such as menus, trees, buttons, and sliders that execute specific instructions when users interact with them. Apps can also contain tables and plots for data visualization or exploration. These components are laid out in a hierarchical manner following the approach of Object Oriented Programming. Figure 2.2 shows a sample UI of a Simulink app that computes and plots the mortgage based on different input values. This app is included with the official distribution of MATLAB software. In the R2019b version of Matlab, the app is located at MATLAB/Examples/R2019b/matlab/MortgageCalculatorExample/Mortgage.mlapp.

Figure 2.2: A Simulink app that computes and plots the mortgage based on different input values. This app (Mortgage.mlapp) is included with the official distribution of MATLAB software.

Simulink apps can be created in two ways:

1. Programmatically: Simulink apps can be developed programmatically using Matlab functions. This is a tedious approach as it requires the developer to do a lot of coding.

2. Interactively: Alternatively, Simulink apps can be developed interactively, using the App Designer development environment. Section 2.1.4 discusses App Designer in detail.

2.1.4 App Designer

App Designer [25] is a graphical application development environment introduced by MathWorks to help developers create Simulink apps interactively. It is a successor to GUIDE [26], which is also a graphical application development environment, but offers fewer features compared to App Designer [27]. MathWorks recommends using App Designer over GUIDE to create Simulink apps for two reasons:

1. App Designer offers more features than GUIDE.

2. GUIDE will be removed in future releases of Matlab.

Developing Simulink apps interactively using App Designer (or GUIDE) is inspired by Model Driven Software Engineering (MDSE), an approach where software systems are modeled in a modeling environment, rather than coded in a programming environment [4]. The executable code is then generated, either completely or partially, from the models so created.

App Designer supports partial automatic code generation. Developers add interactive control components such as menus, trees, buttons, and sliders to their app from the available components library using drag-and-drop. The executable code is then generated automatically by App Designer. Developers are still free to edit specific code segments in the generated code in order to add/edit callback functions for various UI components such as buttons and check-boxes. Figures 2.3 and 2.4 show the design view and the code view, respectively, of the standard Simulink app shown in Figure 2.2. The section of code with a gray background in Figure 2.4 is generated automatically by App Designer from the model shown in the design view in Figure 2.3. This section of code is not editable by the app developer. In contrast, the section of code with a white background is editable by the app developer to implement some specific functionality. This app is included with the official distribution of MATLAB software. In the R2019b version of Matlab, the app is located at MATLAB/Examples/R2019b/matlab/MortgageCalculatorExample/Mortgage.mlapp.

Figure 2.3: The design view of the Simulink app shown in Figure 2.2. This app (Mortgage.mlapp) is included with the official distribution of MATLAB software.

Figure 2.4: The code view of the Simulink app shown in Figure 2.2

2.2 Software Clones

This section will introduce two analogous concepts in software engineering – code clones and model clones, which are collectively referred to as software clones.

2.2.1 Code Clones

In software engineering literature, there is no single generic definition of code clones on which all researchers agree [28]. Baxter et al. [29] define code clones as segments of code that are similar according to some definition of similarity. However, researchers differ in their definitions of “similarity”. Roy et al. [30] have done a comprehensive survey of the literature on code clones. They have identified the following two metrics that can be used to define similarity between two or more code fragments:

1. Textual similarity: Two code fragments are said to be textually similar if they contain similar text. Such code fragments “look” alike. For example, a code fragment copied from some code base and pasted somewhere else (with or without some modifications) is textually similar to the original code fragment.

2. Functional similarity: Two code fragments are said to be functionally similar if they perform the same function/operation, but differ in implementation (and thus in text). For example, two code fragments, one computing the factorial of a number using recursion and the other computing it using a loop, are functionally similar [30].

2.2.2 Code Clone Types

Three types (Type I, Type II, and Type III) [31, 32, 33] of textual similarity-based code clones and one type of functional similarity-based clone (Type IV) [34, 35, 36, 37] are distinguished in the literature [30]. Examples of code clones presented below are taken from [30].

Type I Clones: Exact Clones

Type I clones are those code fragments which are textually identical to each other when whitespace, layout, and comments are ignored. These clones are also called exact clones. For example,

    FRAGMENT 1                    | FRAGMENT 2
                                  |
    if (a >= b) {                 | if (a>=b) {
        c = d + b; // Comment1    |     // Comment1
        d = d + 1;}               |     c=d+b;
    else                          |     d=d+1;}
        c = d - a; // Comment2    | else // Comment2
                                  |     c=d-a;

Listing 2.2: Type I code clones. Source: Roy et al. [30]
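The definition above suggests a simple way to test two fragments for a Type I relationship: discard comments and whitespace, then compare what remains. The sketch below is a deliberately naive illustration of that idea, not the technique used by any particular clone detector. The function names are hypothetical, only `//` comments are handled, and removing all whitespace is a simplification that would also merge tokens such as `int x` into `intx`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Remove '//' line comments and all whitespace from a code fragment.
// Simplification: block comments and '//' inside string literals are
// not handled.
std::string normalizeTypeI(const std::string& code) {
    std::string out;
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (code[i] == '/' && i + 1 < code.size() && code[i + 1] == '/') {
            // Skip everything up to the end of the current line.
            while (i < code.size() && code[i] != '\n') ++i;
            continue;
        }
        if (!std::isspace(static_cast<unsigned char>(code[i]))) {
            out += code[i];
        }
    }
    return out;
}

// Two fragments are treated as exact (Type I) clones when their
// normalized forms are identical.
bool isExactClone(const std::string& a, const std::string& b) {
    return normalizeTypeI(a) == normalizeTypeI(b);
}
```

Under this sketch, two fragments that differ only in spacing, line breaks, and comment placement compare as equal, while a change to any operator or identifier makes them differ.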

Type II Clones: Renamed Clones

Type II clones are those code fragments which are textually identical to each other except for differences in user-defined identifiers, types, layout, and comments. User-defined identifiers include names of variables, constants, classes, methods, and so on. There is no variation in the reserved words and statement structures between the two fragments. If a programmer copy-pastes a code fragment and changes only the names of the variables used, the resulting clones are Type II clones. For this reason, Type II clones are also called “renamed” clones. If the renaming of the user-defined identifiers is done in a consistent manner, the resulting clones are called “parameterized” clones. Thus, parameterized clones are a special case of the more general renamed clones. An example of Type II clones is presented below:

    FRAGMENT 1                    | FRAGMENT 2
                                  |
    if (a >= b) {                 | if (m >= n)
        c = d + b; // Comment1    | { // Comment1
        d = d + 1;}               |     y = x + n;
    else                          |     x = x + 5; //Comment3
        c = d - a; // Comment2    | }
                                  | else
                                  |     y = x - m; //Comment2

Listing 2.3: Type II code clones. Source: Roy et al. [30]
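The distinction between renamed and parameterized clones can be made concrete with a token-level comparison. The following sketch is an illustration under stated assumptions rather than a description of an existing tool: `tokenize`, `isIdentifier`, `isNumber`, and `isRenamedClone` are hypothetical helpers, the keyword list is a tiny hard-coded subset, multi-character operators are split into single characters, and numeric literals are allowed to differ. With `requireConsistent` set, the check additionally demands a one-to-one identifier mapping, which corresponds to the parameterized case.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Split code into word tokens (identifiers, keywords, numbers) and
// single punctuation characters; whitespace and '//' comments are dropped.
std::vector<std::string> tokenize(const std::string& code) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < code.size()) {
        const unsigned char c = static_cast<unsigned char>(code[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (c == '/' && i + 1 < code.size() && code[i + 1] == '/') {
            while (i < code.size() && code[i] != '\n') ++i;
            continue;
        }
        if (std::isalnum(c) || c == '_') {
            const std::size_t start = i;
            while (i < code.size() &&
                   (std::isalnum(static_cast<unsigned char>(code[i])) || code[i] == '_')) {
                ++i;
            }
            tokens.push_back(code.substr(start, i - start));
            continue;
        }
        tokens.push_back(std::string(1, code[i]));
        ++i;
    }
    return tokens;
}

// A word token that is not one of a few reserved words (illustrative subset).
bool isIdentifier(const std::string& t) {
    static const std::set<std::string> keywords = {"if", "else", "for", "while", "return", "int"};
    const unsigned char c = static_cast<unsigned char>(t[0]);
    return (std::isalpha(c) || c == '_') && keywords.count(t) == 0;
}

bool isNumber(const std::string& t) {
    return std::isdigit(static_cast<unsigned char>(t[0])) != 0;
}

// Renamed (Type II) check; with requireConsistent, identifiers must map
// one-to-one between the fragments (parameterized clones).
bool isRenamedClone(const std::string& a, const std::string& b, bool requireConsistent) {
    const std::vector<std::string> ta = tokenize(a), tb = tokenize(b);
    if (ta.size() != tb.size()) return false;
    std::map<std::string, std::string> forward, backward;
    for (std::size_t i = 0; i < ta.size(); ++i) {
        if (isIdentifier(ta[i]) && isIdentifier(tb[i])) {
            if (!requireConsistent) continue;
            // emplace keeps an existing mapping, so a mismatch reveals inconsistency.
            const auto f = forward.emplace(ta[i], tb[i]);
            const auto r = backward.emplace(tb[i], ta[i]);
            if (f.first->second != tb[i] || r.first->second != ta[i]) return false;
        } else if (isNumber(ta[i]) && isNumber(tb[i])) {
            continue;  // assumption: literal values may differ
        } else if (ta[i] != tb[i]) {
            return false;  // keywords and punctuation must match exactly
        }
    }
    return true;
}
```

Keeping both a forward and a backward map is what distinguishes a consistent renaming from an arbitrary one: mapping both `d` and `b` to `x` passes the renamed check but fails the parameterized check.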

Type III Clones: Near-Miss Clones

Type III clones refer to those clone fragments which are further modified (from Type II clones) by changing, adding, and/or deleting one or more statements. While Type I and Type II clones are functionally identical, Type III clones may not be, due to the difference introduced by the addition, deletion, or updating of statement(s). Since Type III clones still look very similar to each other (at least textually), they are also termed “near-miss” clones. An example of Type III clones is as follows:

    FRAGMENT 1                    | FRAGMENT 2
                                  |
    if (a >= b) {                 | if (a >= b) {
        c = d + b; // Comment1    |     c = d + b; // Comment1
        d = d + 1;}               |     c++; // statement added
    else                          |     d = d + 1; }
        c = d - a; // Comment2    | else
                                  |     c = d - a; // Comment2

Listing 2.4: Type III code clones. Source: Roy et al. [30]
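Because near-miss clones differ by a few added, removed, or changed statements, they are commonly detected with a similarity measure and a threshold rather than an equality test. The sketch below illustrates one such measure, a line-level longest common subsequence (LCS) ratio. It is an assumption-laden example: the function names and the default threshold of 0.7 are hypothetical choices for illustration and are not taken from this thesis or from any specific tool.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a fragment into lines, strip '//' comments and whitespace,
// and drop lines that become empty.
std::vector<std::string> normalizedLines(const std::string& code) {
    std::vector<std::string> lines;
    std::istringstream in(code);
    std::string line;
    while (std::getline(in, line)) {
        line = line.substr(0, line.find("//"));  // npos keeps the whole line
        std::string compact;
        for (unsigned char c : line) {
            if (!std::isspace(c)) compact += static_cast<char>(c);
        }
        if (!compact.empty()) lines.push_back(compact);
    }
    return lines;
}

// Similarity in [0, 1]: twice the LCS length over the total line count.
double lineSimilarity(const std::string& a, const std::string& b) {
    const std::vector<std::string> x = normalizedLines(a), y = normalizedLines(b);
    if (x.empty() && y.empty()) return 1.0;
    std::vector<std::vector<int>> lcs(x.size() + 1, std::vector<int>(y.size() + 1, 0));
    for (std::size_t i = 1; i <= x.size(); ++i) {
        for (std::size_t j = 1; j <= y.size(); ++j) {
            lcs[i][j] = (x[i - 1] == y[j - 1])
                            ? lcs[i - 1][j - 1] + 1
                            : std::max(lcs[i - 1][j], lcs[i][j - 1]);
        }
    }
    return 2.0 * lcs[x.size()][y.size()] / static_cast<double>(x.size() + y.size());
}

// Hypothetical near-miss test: similar enough, per an assumed threshold.
bool isNearMissClone(const std::string& a, const std::string& b, double threshold = 0.7) {
    return lineSimilarity(a, b) >= threshold;
}
```

For two fragments of five and six normalized lines that share five lines in order, the ratio is 10/11, which clears the assumed threshold even though the fragments are not identical.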

Type IV Clones: Semantic Clones

Type IV clones are those code fragments which are functionally similar. Therefore, unlike the previous three clone types, Type IV clones may not be a result of copy-pasting (and modifying) code fragments. These clones behave similarly from a semantic point of view. Therefore, they are also known as “semantic” clones. They may be syntactically very dissimilar, however. For example, two code fragments, one computing the factorial of a number using recursion and the other computing it using a loop, qualify as semantic clones [30]. This is shown in the code example below:

    FRAGMENT 1         | FRAGMENT 2
                       |
    int i, j=1;        | int factorial(int n) {
    for (i=1; i
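To make the notion of semantic clones concrete, the following self-contained sketch shows a loop-based and a recursive factorial side by side. It is written for this illustration and is not a reproduction of the listing from Roy et al. [30]; the function names and the use of unsigned 64-bit arithmetic are choices made here.

```cpp
// Loop-based factorial: accumulates the product iteratively.
unsigned long long factorialLoop(unsigned int n) {
    unsigned long long result = 1;
    for (unsigned int i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

// Recursive factorial: same input/output behavior, different structure.
unsigned long long factorialRecursive(unsigned int n) {
    return n <= 1 ? 1ULL : n * factorialRecursive(n - 1);
}
```

The two functions share almost no text, so a purely textual comparison would not relate them, yet they return the same value for every input whose factorial fits in 64 bits (n up to 20).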

2.2.3 Model Clones Model clones are graphical counterparts of code clones. Models differ from source codes in that they are typically represented visually, as box-and arrow diagrams, rather than as texts. [38]. Model clones are defined as similar sub-graphs of these diagrams. The similarity score between two graphical models may be computed using either a graph-based approach or a text-based approach [38].

2.2.4 Model Clone Types

For SimIMA, the scope of graphical models, and hence model clones, is limited to Simulink models only. This is because the goal of this project is to create a model recommendation system and to demonstrate the viability of such an idea by implementing a prototype model recommendation system for Simulink. Alalfi et al. [38] have categorized Simulink model clones (at the subsystem level) into three categories: Type I (exact) clones, Type II (renamed) clones, and Type III (near-miss) clones. According to this categorization, there is no model clone counterpart to Type IV, i.e., semantic, code clones. However, Al-batran [39] has discussed the concept of semantic clones in Simulink models. According to their definition, two or more Simulink subsystems that perform the same or similar function but differ in implementation can be regarded as semantic clones. Thus, like code clones, there are four different types of model clones, which are discussed in the sections that follow.

Type I Clones: Exact Clones

Type I model clones are identical model fragments except for differences in visual presentation, layout, and formatting [38]. These clones are also called “exact” clones. For example, the two Simulink models shown in Figure 2.5 are exact clones of each other.

Figure 2.5: An example of Type I (exact) model clones in Simulink

Type II Clones: Renamed Clones

Type II model clones are structurally identical model fragments except for differences in labels, values, types, visual representation, layout, and formatting [38]. These clones are also called “renamed” clones. Figure 2.6 shows two Simulink models which are renamed clones of each other.

Figure 2.6: An example of Type II (renamed) model clones in Simulink

Type III Clones: Near-Miss Clones

Type III model clones have further modifications compared to Type II model clones. These modifications include changes in position or connection with respect to other model fragments and “small” additions or removals of blocks and/or lines [38]. These model clones are also called “near-miss” clones. Figure 2.7 shows two Simulink models which are near-miss clones of each other.

Figure 2.7: An example of Type III (near-miss) model clones in Simulink

Type IV Clones: Semantic Clones

Type IV model clones are two model fragments which may look different but perform a similar function. These clones are functionally, i.e., semantically, similar, although they may look considerably different structurally. Therefore, these types of clones can also be called semantic clones. For example, two different Simulink implementations of the logic x ∧ y, as shown in Figure 2.8, are semantic model clones of each other. Note that, unlike the previous three types of model clones, this family of model clones is not discussed by Alalfi et al. [38].

Figure 2.8: An example of Type IV (semantic) model clones in Simulink

2.2.5 Drawbacks of Software Clones

One of the ideals of software engineering is to reduce code redundancy and code duplication within a software project [40]. The existence of a significant number of software clones (either code clones, or model clones, or both) within a software project suggests that the software project suffers from code duplication. The following problems are caused by cloned code in a software project:

1. Having code clones increases the probability of bug propagation. If a code fragment is cloned from some source that has some bug, there is a high probability that the bug gets copied along with the useful code [41, 42].

2. Having code clones increases the probability of introducing a new bug into the system. In many cases, only the structure of the duplicated code fragment is reused. For example, only the interfaces are cloned while the implementation details are to be filled by the developers. This process may introduce errors in the implemented code [43, 29].

3. Having code clones is an indication of bad project design. Prevalence of software clones indicates a compromise in the fundamental principles of good project design such as proper abstraction and inheritance, code reuse, and modularity [44, 45, 46].

4. Having code clones in a project leads to increased maintenance cost. When a bug is found in some code fragment, all other code fragments which are clones of that fragment also need to be investigated for the existence of the bug. Thus, code clones impact the overall maintainability of the software project [46, 44].

2.2.6 Advantages of Software Clones

Software clones are not always bad. The following are the advantages of having cloned code in software projects:

1. The use of cloned code can sometimes increase a program’s readability and maintainability. For example, in an Object Oriented Programming (OOP) based software project, developers may want to define separate classes, which are often clones of one another, for each item to be modeled so that each class can be understood and updated independently. Refactoring such code-bases to remove code clones might introduce abstractions that are complex, unintuitive, and hard to maintain over time as the software evolves [47].

2. The use of code clones often saves development time. By borrowing pre-written, and possibly pre-tested, code fragments into their projects, software developers can complete their projects in a short time. This is particularly helpful for novice developers, who can benefit by cloning code written by more experienced programmers [48].

2.2.7 Simone

Simone is a Simulink model clone detector [38]. It is adapted from NiCad [49], a text-based code clone detector. Simone detects Type I, II, and III model clones by analyzing the underlying textual representations of Simulink models. It filters, normalizes, and sorts these textual representations before performing text-based model clone detection, which has advantages over graph-based detection techniques [38, 50]. Simone additionally clusters model clones into model clone classes. According to the evaluations conducted by Stephan et al. [51], Simone is the most adept Simulink model clone detector for detecting Type III clones.
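The filter, normalize, sort, and compare pipeline described above can be sketched in Python as follows. This is only an illustration of the general idea: the one-line-per-block text format, the `canonical_lines` and `similarity` helpers, and the use of `difflib.SequenceMatcher` as the comparison step are assumptions made for this sketch and are not Simone's actual representation or algorithm.

```python
import re
from difflib import SequenceMatcher


def canonical_lines(block_lines):
    """Filter layout attributes, normalize block names, and sort the lines."""
    out = []
    for line in block_lines:
        # Filtering: drop the (hypothetical) Position attribute, pure layout.
        line = re.sub(r';?\s*Position=\[[^\]]*\]', '', line)
        # Normalizing: replace every block name with the same placeholder.
        line = re.sub(r'Name="[^"]*"', 'Name="x"', line)
        out.append(line.strip())
    # Sorting: make the result independent of the order of blocks in the file.
    return sorted(out)


def similarity(system_a, system_b):
    """Line-based similarity in [0, 1] between two textual (sub)systems."""
    a, b = canonical_lines(system_a), canonical_lines(system_b)
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical one-line-per-block descriptions of three small subsystems.
sys_a = ['BlockType=Gain; Name="Gain1"; Position=[10,20,40,50]',
         'BlockType=Sum; Name="Add"; Position=[60,20,90,50]']
sys_b = ['BlockType=Sum; Name="Total"; Position=[5,5,35,35]',
         'BlockType=Gain; Name="K"; Position=[50,5,80,35]']
sys_c = sys_a + ['BlockType=Scope; Name="View"; Position=[100,20,130,50]']

renamed_score = similarity(sys_a, sys_b)    # differs only in names and layout
near_miss_score = similarity(sys_a, sys_c)  # one extra block
```

In this sketch a renamed pair scores 1.0 after normalization, while a pair that differs by one added block scores below 1.0; a detector would then report the pair as a near-miss clone if the score clears a chosen threshold.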

2.3 Static Software Analysis

Static analysis of software refers to analyzing computer software without actually executing it. Thus, static analyses are performed before the program is run, i.e., sometime between coding (or modeling) and unit testing. This is different from dynamic analysis of software, where programs are analyzed by executing them [52], for example during unit testing. The scope of SimIMA regarding software analysis includes one specific research area known as model clone detection (discussed in Section 2.3.1). In particular, SimIMA makes use of several background works [49, 38, 1] carried out on model clone detection in Simulink models.

2.3.1 Software Clone Detection

Software clone detection is the process of finding model and/or code clones in a software project or a collection of software projects. It is an active area of research in static software analysis [53].

As discussed in Section 2.2.5, the maintenance of software systems containing a significant amount of cloned models and/or code is a difficult problem for software engineers. Mayrand et al. [46] have estimated that typical industrial source code contains 5% to 20% duplicated code. This means that a software project containing one million lines of code will have 50,000 to 200,000 lines of cloned code, which is too much to discover and then inspect manually. Therefore, software engineers and researchers are interested in developing algorithms and approaches to discover clones in such (large) systems in an efficient manner [53].

Depending upon whether the clone detection is done at the code level or at the model level, software clone detection can be of two types:

1. Code clone detection: There has been a large body of research on code clone detection. Roy et al. [54] have done a detailed literature review of such works. They have also categorized all code clone detection research approaches into five different categories: textual, lexical, syntactic, semantic, and hybrid.

2. Model clone detection: Unlike code clone detection, there has been limited research on model clone detection. Research in model clone detection can be categorized into two broad categories: textual (for example, [38]) and graph-based (for example, [55]). Model clone detection is inherently more challenging than code clone detection due to the graphical nature of models [56].

The most important application of software clone detection is to assist in better software design and software maintenance. By detecting clones in a software project, software engineers are able to refactor their code and/or models so as to reduce the amount of duplicated code and/or models. This also helps in finding and fixing similar bugs across all clones when such bugs are discovered in a member of the clone class.

Software clone detection can also be applied to generate source-code and model suggestions, discussed in Sections 2.3.2 and 2.3.3 respectively. The second module of SimIMA, SimXample, produces model suggestions for incomplete Simulink models using model clone detection on Simulink model repositories.

2.3.2 Source Code Suggestion

Source code suggestion is a feature offered by many modern IDEs to help developers write programs faster by suggesting possible ways to complete their code as they type it. Developers use this feature extensively, up to several times per minute [57]. This feature is also referred to as “source code (auto)-completion”, “source code recommendation”, and “software engineering recommendation system” [58].

There are a number of reasons behind the growing popularity of the source code suggestion feature of IDEs. Bruch et al. [59] suggest the following advantages of source code suggestion:

1. Save development time: Source code suggestion saves development time for developers. With this feature, they no longer have to type potentially long names of properties, methods, import statements, and so on every time. Rather they can just press the code-completion keyboard shortcuts (which are just one or two key presses) as desired suggestions pop up.

2. Reduce coding errors: Source code suggestion helps developers avoid writing erroneous code by suggesting (and possibly requiring) that they use only those code-completion possibilities that are compatible with the current context. For example, given a variable of type java.util.List, the source code suggestion system would suggest the members of this type but none of some other class, for example java.lang.String. The code recommendation system would raise an error message in case the developer invokes an incompatible method.

3. Provide quick and accessible documentation: Source code suggestion helps developers narrow down the search space for potential methods they might use in a particular context. When developers are presented with a select few possibilities for completing their incomplete code, all they have to do is browse the proposed suggestions and choose the appropriate one. Thus, by reducing the search space for potential method calls and providing documentation links to such methods, code recommendation provides quicker and more accessible documentation for the developer.

4. Encourage code readability: Source code suggestion encourages developers to use longer, more descriptive names for variables and methods, resulting in more readable code. In the absence of a source code suggestion feature, developers tend to use shorter names for user-defined identifiers so that they have to type less. With code auto-completion systems at their disposal, they can afford to use longer, more descriptive identifiers with the same amount of typing.

2.3.3 Model Suggestion

Model suggestion is the graphical analogue of source-code suggestion. It serves the same purpose in Model-Driven Engineering as source-code suggestion does in traditional (code-based) software engineering. Thus, the idea behind model suggestion is to suggest auto-completion options for incomplete models as modelers/developers work on their software models in a graphical IDE such as Simulink. At present, model suggestion is not as extensively explored as source-code suggestion [1].

Figure 2.9 shows an illustration of model suggestion in which three potential model completion options pop up as a modeler works on their incomplete Simulink model, so that they may simply choose a suggested completion option if that is what they want their complete model to look like. Note that this is not an actual screenshot of a real Simulink workspace, as Simulink does not currently support such a feature (model completion). It is a conceptual representation of what model suggestion might look like if implemented in Simulink. SimIMA aims to accomplish this goal.

Figure 2.9: Model suggestion as envisioned in SimIMA

2.4 Machine Learning

Mitchell et al. [60] define machine learning as “the study of computer algorithms that improve automatically through experience and by the use of data”. This approach is used to build what are known as machine learning models. Machine learning models are first trained on what is known as the training data, from which they “learn” to solve the particular task they are designed to do. Thus, machine learning models are not explicitly programmed to do a certain task [61]. Rather, they derive their problem-solving ability from the data fed to them during the training phase. This is in contrast to conventional problem-solving approaches, which employ some precise step-by-step procedure programmed explicitly to solve the problem. Machine learning algorithms are used to solve tasks such as Natural Language Processing (NLP) and Computer Vision (CV), where conventional algorithms are relatively infeasible.

2.4.1 Recommender Systems

Recommender systems, or recommendation systems, are software tools which assist people in making choices when they do not have sufficient personal experience of the alternatives [62]. For example, such systems are used in market-basket analysis, where the task of the recommender system is to suggest more products to a customer based on (1) what they have already added to their cart, and (2) past buying patterns of all customers. Other applications of recommender systems include recommending new movies to users based on their personal preferences, suggesting new connections to users in social-networking sites, and producing code-completion suggestions in IDEs. Recommender systems typically use one or more machine learning techniques to produce such suggestions. They can be described as predictive models which make one or more predictions (recommendations) along with a confidence value associated with each of the predictions.

2.4.2 Association Rule Mining

Association rule mining (ARM) is a data-mining technique, and using it is an established way of building a recommender system [63]. Association rule mining involves devising rules of association/correspondence that satisfy some minimum confidence [64]. Simply put, it involves finding out, when some things are present (for example, antecedents X and Y), what else is also often present (for example, consequent Z). In the market-basket analysis problem, assuming a customer has already put bread and butter in their cart, these two items would be the antecedents of the association rule, while other items the customer might be interested in buying, such as honey, would be the consequent of the rule. The confidence is a number between 0 and 1 that represents the fraction of the time that the rule holds true in the rule derivation process.
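The confidence of the bread-and-butter rule can be computed directly from a set of past carts. The Python sketch below is a minimal illustration; the `confidence` function and the five example carts are invented for this sketch and are not data from the thesis.

```python
def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing the antecedent that also contain
    the consequent; 0.0 if no transaction contains the antecedent."""
    antecedent = set(antecedent)
    with_antecedent = [t for t in transactions if antecedent <= t]
    if not with_antecedent:
        return 0.0
    with_both = [t for t in with_antecedent if consequent in t]
    return len(with_both) / len(with_antecedent)


# Hypothetical purchase history: each set is one customer's cart.
carts = [
    {"bread", "butter", "honey"},
    {"bread", "butter"},
    {"bread", "butter", "honey", "milk"},
    {"bread", "milk"},
    {"butter", "honey"},
]

# Rule {bread, butter} --> honey: three carts contain both antecedents,
# and two of those also contain honey.
honey_confidence = confidence(carts, {"bread", "butter"}, "honey")
```

With these carts the rule holds in two of the three carts that contain both bread and butter, so its confidence is 2/3.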

2.4.3 Ensemble Learning

Ensemble methods in machine learning are those that construct multiple machine learning models and combine them to produce one optimal predictive model [65, 66]. This is often done by weighting the individual predictive models and by testing the ensemble model. There are many examples of predictive models being employed by analysts [67]. Ensemble models often improve prediction and performance by reducing the risk of settling in a local minimum, by avoiding overfitting, and by discovering, through the combination of multiple predictive models, a better hypothesis than is possible with a single predictive model.
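One simple way to combine weighted predictive models is a weighted average of the confidence each model assigns to each candidate. The Python sketch below illustrates that idea only; the `combine` function, the model names, the weights, and the scores are all hypothetical, and weighted averaging is one of several possible combination schemes.

```python
def combine(predictions_by_model, weights):
    """Weighted average of per-model confidences, ranked high to low.

    predictions_by_model: model name -> {candidate: confidence}
    weights: model name -> non-negative weight
    A candidate that a model does not predict contributes 0 for that model.
    """
    total = sum(weights.values())
    scores = {}
    for model, predictions in predictions_by_model.items():
        for candidate, conf in predictions.items():
            share = weights[model] * conf / total
            scores[candidate] = scores.get(candidate, 0.0) + share
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Two hypothetical predictors that disagree on the best candidate.
predictions = {
    "model_a": {"Gain": 0.8, "Sum": 0.4},
    "model_b": {"Sum": 0.9, "Scope": 0.5},
}
ranked = combine(predictions, {"model_a": 0.5, "model_b": 0.5})
ranking = [candidate for candidate, _ in ranked]
```

Here neither model alone ranks the candidates the way the ensemble does: "Sum" moves to the top because both models support it.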

2.5 Related Work

SimIMA aims to answer the two research questions presented in Chapter 1. In doing so, it aims to develop a prototype model completion tool. Research in two fields, source code completion and model completion, is particularly relevant to this thesis work. Research on model completion is directly related to SimIMA. Similarly, work on source code completion is used to draw inspiration and collect ideas, as the research in SimIMA is, in fact, a graphical analogue of source code suggestion.

2.5.1 Source Code Assistance/Completion

There have been a number of research works on source code assistance. Among them, the one most relevant to SimIMA is the intelligent Code Completion System (CCS) developed by Bruch et al. [59]. This tool is based on three different kinds of information: frequency, association rules, and matching neighbors. Their frequency-based CCS suggests the most frequently occurring APIs in the example codebases as the most probable predictions. In their second approach, they apply the machine learning technique called association rule mining [64] in order to predict possible method calls given the state of a variable. In their third approach, they modify the k-nearest-neighbors (KNN) algorithm [68] and use it to make predictions for code completion. They call this approach the Best Matching Neighbor (BMN) algorithm. Prior to Bruch et al.’s work, code completion systems did not make use of the context of a variable, and hence would make a lot of irrelevant suggestions in addition to the relevant ones [59].

Proksch et al. [8] extend the work of Bruch et al. on BMN by using Bayesian networks [69] instead of best matching neighbors. Their code completion tool makes better predictions than BMN in cases where method calls are not involved. Raychev et al. employ Natural Language Processing (NLP) techniques to mine large code-bases and make code completion suggestions [58]. Similarly, Asaduzzaman et al. [7] propose a context-sensitive code completion (CSCC) system that indexes method calls in training code-bases based on their context. CSCC compares those contexts with the current context of the program in order to make method call completion suggestions [7]. One of the most widely used open-source code completion systems is that of the Eclipse IDE, which makes code completion recommendations based on API usage statistics [1].

2.5.2 Model Assistance/Completion

Compared to source code completion, there has been relatively little research on model completion [70, 71]. To address this situation, Dyck et al. designed a “model recommender framework” that suggests how UML model recommenders should appear [70]. Segura et al. [72] developed an Eclipse-based modeling assistant, which can be used to facilitate building both models and meta-models. Their tool queries multiple data sources, possibly in multiple formats such as XML, RDF, and database schemas, to make suggestions to the user. Sen et al. [73] developed a methodology to create model editors that support automatic completion of incomplete models built in some domain-specific modeling language. Similarly, Mazanek et al. [74] worked on auto-completing diagrams by generating suggestions based on the graph grammar that describes the diagram under development. In the same way, Steimann and Ulke [75] employed constraint solving and modeling language analysis to make recommendations for model modifications. Pati et al. introduced proactive modeling, whose goal is also to facilitate automated modeling by predicting and executing valid model transformations [76].

Kuschke et al. [71] developed a modeling assistant for structural UML models. Their tool makes model completion suggestions based on the user’s activity pattern, which is analyzed and matched against a pre-defined set of modeling activities defined within the tool. The goal of their research aligns closely with what is proposed in SimIMA. However, the methodology differs, as SimIMA makes model recommendations based on the structural similarity of models rather than on user activity.

Another work of notable relevance to SimIMA is a feature introduced in the R2018a version of Simulink that allows modelers to refactor their completed Simulink models [1]. Thus, unlike SimIMA, this feature is intended to facilitate refactoring of completed Simulink models rather than to recommend model completion suggestions to the engineer as they develop the model itself. In addition, the model clone detector employed for this purpose is proprietary, as is the tool itself. Nevertheless, this feature serves as an inspiration for the research on SimIMA.

2.5.3 Reference Framework for Intelligent Modeling Assistance (RFIMA)

Mussbacher et al. [77] have proposed a conceptual Reference Framework for Intelligent Modeling Assistance, known as RFIMA. This framework describes, at a high level, the relationship and interaction among the Socio-Technical Modeling System (STMS), which includes the modeler, the Intelligent Modeling Assistant (IMA), and the External Sources.

RFIMA also defines the following nine high-level properties that intelligent modeling assistants are supposed to have:

1. Quality of IMA regarding models

2. Autonomy

3. Relevance

4. Confidence

5. Trust

6. Explainability

7. Quality degree

8. Timeliness

9. Quality of IMA regarding external sources

For each of these properties, RFIMA also defines different levels. These levels are numbered from 1 to N, with 1 being the lowest, and N the highest.

Chapter 3 Overview

This chapter provides an overview of the architecture, main components, and user interface of SimIMA. At the highest level, SimIMA is composed of two main modules: SimGestion and SimXample. SimGestion provides block-level suggestions for completing models, addressing the first research question. SimXample provides whole-system examples to the user for inspiration and/or insertion, addressing the second research question. Some of the contents of this chapter have been submitted to the SoSyM Theme Issue on AI-enhanced model-driven engineering [2].

3.1 SimIMA Components

There are four components of SimIMA consistent with the RFIMA framework [77]: the assistant, the data acquisition/production layer, the context shadow, and the optional adaption.

1. The assistant for SimIMA is realized as a Simulink application that provides graphical recommendations to engineers directly in their Simulink development environment and also contextualizes the information by showing where potential suggestions can be applied by SimIMA.

2. In SimIMA, the data acquisition/production layer, which provides access to and allows configuration of different sources, is facilitated through repository configuration within different Simulink menus that are customized using App Designer. These menus connect to external applications that perform the data acquisition and suggestion generation.

3. SimIMA’s context shadow, which captures the context and current activity of the engineer, is the Simulink environment itself whereby SimIMA captures context information. This includes the engineer’s current system under development (SUD), the most recently clicked block (MRCB) by the engineer, information about the MRCB, and more.

4. While optional, SimIMA also has an adaption component, which allows engineer feedback to change both data and context. SimXample allows the engineer to regenerate their data after seeing the suggested whole subsystems by fine-tuning both the parameters of the search and the data. While SimGestion does not incorporate feedback at this point, doing so can be considered future work.

Figure 3.1: SimIMA Architecture Overview

3.2 SimIMA Architecture

Figure 3.1 presents SimIMA’s architecture. This architecture involves a combination of components internal and external to Matlab. For SimXample, engineers can request model design assistance through the Simulink custom menus. SimIMA uses App Designer to present suggestions to the engineers and Matlab script files to connect, configure, and execute Simone, which is external to Matlab. SimGestion suggests blocks within the actual model development interfaces. The machine learning algorithms used by SimGestion are encoded within Matlab scripts. In both cases, the process begins with a work-in-progress Model Under Development (MUD) built by an engineer who requests assistance. In the case of Simulink, where models contain one or more (sub)systems, the MUD contains a (Sub)System Under Development (SUD), potentially along with other (sub)systems. While this thesis discusses the research and approach using some Simulink terms and concepts, the approaches and ideas are intended to be transferable to other modeling languages unless specified otherwise.

SimIMA comes with six default repositories corresponding to the following domains: Automotive, Avionics, Electronics, Energy, Robotics, and Miscellaneous. The Simulink models in these repositories are taken from the Chowdhury et al. [13] corpus. This is discussed in more detail in Chapter 6. Engineers can select any combination of these default repositories. In addition, they are also able to add custom model repositories through the SimIMA interface. This feature allows them to link to their own in-house model projects, domain exemplars, and other repositories as they desire, and corresponds to the recommendation from Dyck et al. to allow model recommenders to query multiple sources [70]. A possible future extension to this feature is to link SimIMA to online model repositories, such as Matlab Central, the Model Clone Portal (MoCoP) [78], MDEForge [79], and others, including those identified by Chowdhury et al. [13].

3.3 SimIMA User Interface

As part of this thesis work, SimIMA is integrated into the Simulink UI, so that engineers can query SimIMA for suggestions seamlessly without having to leave their modeling environment. To achieve this functionality, SimIMA customizes Simulink’s UI and introduces custom menus. Specifically, SimIMA introduces the following three custom menus:

1. SimIMA: Suggest Blocks: Engineers can click this menu to query SimIMA for Simulink block-level suggestions.

2. SimIMA: Suggest Complete Systems: Engineers can click this menu to query SimIMA for Simulink system-level suggestions.

3. SimIMA: Configure Block Suggestions: Engineers can click this menu to change the configuration of Simulink block-level suggestions.

For the convenience of users, SimIMA makes these menus available in two ways: first via the custom “Tools” menu, and second via the context menu, that is, the mouse right-click action. Figures 3.2 and 3.3 illustrate this. User interfaces specific to SimGestion and SimXample are described in Chapters 4 and 5 respectively.

Figure 3.2: SimIMA UI: Querying SimIMA using customized tools menu

Figure 3.3: SimIMA UI: Querying SimIMA using customized context menu

Chapter 4 SimGestion: Simulink Block-Level Suggestions

This chapter introduces the first module of SimIMA, called SimGestion. SimGestion is so named because it provides block-level suggestions for completing Simulink models. This module addresses the first research question presented in Chapter 1: Can SimIMA provide engineers stepwise model element suggestions that are based on analysis of a configurable model set? This chapter describes the SimIMA research in developing SimGestion, including a discussion of design considerations. It also presents the resulting Simulink implementation and user interface, and illustrates it through a running example. Some of the contents of this chapter have been submitted to the SoSyM Theme Issue on AI-enhanced model-driven engineering [2].

4.1 Block Prediction Models

SimGestion uses machine learning techniques to produce Simulink block-level suggestions. In particular, it uses Association Rule Mining, Frequency Matching, and/or their ensemble to derive the suggestions. The next sections describe these different block-prediction models and their implementation details.

4.1.1 ARM Model

This block-prediction model is based on the machine learning algorithm known as Association Rule Mining. Generating Simulink block-level suggestions from a repository of Simulink models (the training dataset) can be framed as an Association Rule Mining problem, which the ARM model leverages. During the model-training phase, the ARM model extracts Simulink (sub)system-level information from the training-set models. In doing so, it keeps a record of the presence or absence of every Simulink block type in each of the subsystems present in the training model set. At Simulink model-development time, when queried for suggestions by the engineer, the ARM model “mines” through the extracted information using Association Rule Mining techniques to predict potential Simulink block types which are not yet present in the SUD.

ARM Model Implementation

SimIMA’s first step in the ARM model is to create a two-dimensional matrix where columns represent Simulink block types and rows represent (sub)systems. Within the matrix, each system row entry indicates whether that specific block type is present in that system. Consider the following illustrative example of three systems in the training data and the corresponding matrix in Table 4.1. A ‘1’ in a cell indicates the presence of a block of that type in that respective system. To gather this information, the ARM model considers each repository configured by the engineer. For each repository, it iterates through each model and its respective subsystems, noting which block types are present and which are not. When the ARM model encounters a block type not yet in the matrix, it simply adds a new column to the matrix and updates the entries for that new column accordingly. This allows the ARM model to establish ARM antecedents-and-consequent rules/relationships. The example matrix corresponds to the following systems containing the listed blocks:

    S1 : [“Sum”, “Gain”]
    S2 : [“Inport”, “Sum”, “Scope”]
    S3 : [“Gain”, “Outport”, “Sine”]

Table 4.1: Example ARM Matrix

          Gain  Inport  Outport  Scope  Sine  Sum
    S1    1     0       0        0      0     1
    S2    0     1       0        1      0     1
    S3    1     0       1        0      1     0

To produce suggestions using the ARM model, SimGestion first captures the context of the assistance request, specifically the SUD. Two pieces of information from the modeling context are particularly relevant for the ARM model: first, which block types are already present in the SUD, and second, the block type of the Most Recently Clicked Block (MRCB). SimGestion then passes the captured context information to the ARM model. The ARM model considers all block types present in the SUD, including that of the MRCB, to be the antecedents. Empirical experiments evaluating the prediction accuracy of the ARM model using 5-fold cross validation showed that loosening the percentage of required antecedents yielded better accuracy. The experiments further revealed that requiring half or more (0.5) of the antecedent blocks to be present, with the MRCB always present, provided the best results. With this observation in mind, the ARM model traverses the ARM matrix and calculates the confidence for any block type that is present as a consequent to the antecedents. Here, confidence is measured as the number of rows that contain both the antecedents and the consequent divided by the number of rows that contain the antecedents. When that confidence is non-zero, the ARM model considers the block type a potential suggestion. After the ARM model has computed the confidence value for each suggestion, it sorts the suggestions by confidence for presentation purposes. Listings 4.1 and 4.2 present the algorithms, in the form of pseudocode, for model training and suggestion prediction respectively for the ARM model. The full source code for the ARM model can be seen in Listing A.2 in Appendix A.

FUNCTION trainArmModel(armModel, repositories) --> armModel
    Create ARM table (2D matrix) such that the columns correspond to block types and rows correspond to (sub)systems.
    For each Simulink model (M) in the repositories:
        For each (sub)system (S) in M:
            For each block-type (B) in S:
                If B is not present in ARM table, add a column in ARM table for B and set all previous row values for this column to 0.
            Add a row for S in ARM table.
    armModel.table = ARM table
    Return armModel

Listing 4.1: Pseudocode for the ARM model’s training algorithm
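As an illustration only, the training step of Listing 4.1 can be sketched in Python on the three example systems S1 to S3. The function name `train_arm_table` and the dictionary-based input are assumptions made for this sketch; they are not taken from the SimIMA source code.

```python
def train_arm_table(systems):
    """Build a binary presence matrix: one row per (sub)system, one column per block type.

    `systems` maps a system name to the list of block types it contains
    (a simplified stand-in for iterating over Simulink models).
    Returns (columns, rows), where rows maps system name -> list of 0/1 flags.
    """
    columns = []  # block types, in order of first appearance
    rows = {}
    for name, block_types in systems.items():
        for block_type in block_types:
            if block_type not in columns:
                columns.append(block_type)
                # Back-fill the new column with 0 for every earlier row.
                for flags in rows.values():
                    flags.append(0)
        present = set(block_types)
        rows[name] = [1 if c in present else 0 for c in columns]
    return columns, rows


systems = {
    "S1": ["Sum", "Gain"],
    "S2": ["Inport", "Sum", "Scope"],
    "S3": ["Gain", "Outport", "Sine"],
}
columns, rows = train_arm_table(systems)
```

In this sketch the columns appear in order of discovery rather than alphabetically as in Table 4.1; the presence information is the same.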

FUNCTION predictArmModel(context) --> suggestions
    blockTypesInSudExceptMrcbBlockType = set_difference(context.blockTypesInSud, context.mrcbBlockType)
    n = length(blockTypesInSudExceptMrcbBlockType)
    r = floor(n * ANT_REDUCTION)
    combinations = all combinations of n blockTypesInSudExceptMrcbBlockType taken r at a time
    suggs = new collection
    For each combination (c) in combinations:
        Add MRCB block-type to c
        Set c as the antecedent i.e. X of ARM rule X --> Y.
        For each block-type present in ARM table but not in context.blockTypesInSud:
            Set block-type as consequent i.e. Y of ARM rule X --> Y.
            Compute the confidence of ARM rule X --> Y as follows:
                confidence = (nRows with both X and Y = 1) / (nRows with X = 1)
            If confidence > 0, add a suggestion with block-type Y to suggs.
        If suggs is not empty, break out of the outer for loop.
    Sort suggs by confidence -- high to low.
    Return suggs.

Listing 4.2: Pseudocode for ARM model’s prediction algorithm
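The prediction step can likewise be sketched in Python. This is a minimal sketch under stated assumptions: each training row is represented as a set of block types instead of a 0/1 matrix row, the function name `predict_arm` is hypothetical, and antecedents that match no training row are skipped to avoid a division by zero (a guard the pseudocode does not spell out).

```python
from itertools import combinations
from math import floor

ANT_REDUCTION = 0.5  # fraction of non-MRCB antecedents required, per the text


def predict_arm(rows, blocks_in_sud, mrcb_type, ant_reduction=ANT_REDUCTION):
    """rows: list of sets, each holding the block types of one training (sub)system."""
    others = sorted(set(blocks_in_sud) - {mrcb_type})
    r = floor(len(others) * ant_reduction)
    candidates = sorted(set().union(*rows) - set(blocks_in_sud))
    suggestions = {}
    for combo in combinations(others, r):
        antecedent = set(combo) | {mrcb_type}  # the MRCB is always required
        with_x = [row for row in rows if antecedent <= row]
        if not with_x:
            continue  # assumption: skip antecedents that never occur
        for y in candidates:
            confidence = sum(1 for row in with_x if y in row) / len(with_x)
            if confidence > 0:
                suggestions[y] = confidence
        if suggestions:
            break  # as in Listing 4.2: stop at the first combination that yields suggestions
    return sorted(suggestions.items(), key=lambda kv: kv[1], reverse=True)


rows = [{"Sum", "Gain"}, {"Inport", "Sum", "Scope"}, {"Gain", "Outport", "Sine"}]
result = predict_arm(rows, blocks_in_sud=["Inport", "Sum"], mrcb_type="Sum")
```

With the SUD containing an Inport and a Sum, and the Sum as the MRCB, the antecedent reduces to {Sum}. Two training rows contain Sum; Gain and Scope each occur in one of them, so both receive a confidence of 1/2.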

4.1.2 Freq Model

This block prediction model is based on the frequency of different Simulink block types present in the training model set that are relevant to the modeling context of the SUD. In order to produce suggestions, this model looks up the count of different block types in the training model set that are connected to the MRCB. The Freq Model then suggests those block types which occur most frequently as top suggestions. In addition to the MRCB, this model also considers the block-types connected to the MRCB’s neighboring blocks. Since the suggestions are based on two different sources (the MRCB and its neighbors), the Freq Model internally combines these suggestions into one final list of suggestions using an empirically derived weighted model. Section 4.1.2 describes this process in detail. This approach of employing frequency and context data was inspired by related work focused on source code that considers method frequency [59].

Freq Model Implementation

The first step in the implementation of the Freq Model involves creating a nested-map data structure that summarizes the training data. Listing 4.3 illustrates a contrived instance of that data structure consisting of only two block types. Looking first at the ‘Sum’ component on line 2, the ‘count’ represents the total number of Sum blocks in the training set. The ‘src’ component holds the total number of blocks connected to the src port of ‘Sum’ blocks in the training set. This tally is further broken down by block type in the ‘details’ component. The ‘dst’ component contains the analogous information for the dst port of all the Sum blocks in the training set. The ‘both’ component contains the summation of ‘src’ and ‘dst’. This pattern continues on line 31, with the ‘Gain’ component representing all Gain blocks in the training set. There is one such entry for each type of block SimIMA discovers during its repository (data) analysis. This phase can also be called the training phase of the Freq model.

 1  {
 2    'Sum': {
 3      'count': 304,
 4      'src':{
 5        'count': 120,
 6        'details': {
 7          'Sum': 55,
 8          'Gain': 21,
 9          'Inport': 44,
10        }
11      },
12      'dst':{
13        'count': 167,
14        'details': {
15          'Sum': 35,
16          'Gain': 13,
17          'Inport': 32,
18          'Outport': 87,
19        }
20      },
21      'both':{
22        'count': 287,
23        'details': {
24          'Sum': 90,
25          'Gain': 34,
26          'Inport': 76,
27          'Outport': 87,
28        }
29      },
30    },
31    'Gain': {
32      'count': 434,
33      'src':{
34        'count': 42,
35        'details': {
36          'Sum': 15,
37          'Gain': 23,
38          'Inport': 4,
39        }
40      },
41      'dst':{
42        'count': 107,
43        'details': {
44          'Sum': 35,
45          'Gain': 13,
46          'Inport': 32,
47          'Outport': 27,
48        }
49      },
50      'both':{
51        'count': 149,
52        'details': {
53          'Sum': 50,
54          'Gain': 36,
55          'Inport': 36,
56          'Outport': 27,
57        }
58      },
59
60    },
61  }

Listing 4.3: Example of Frequency Data

Listing 4.4 presents the algorithm, in the form of pseudocode, for model training for the Freq model. This algorithm first initializes the “data” attribute of the Freq model to an empty map (line 2). Then, for each Simulink model in the training set repository, this algorithm extracts block frequency and connection information for all Simulink block-types present in the model. With this information it updates the “data” attribute as it “processes” each Simulink model in the repository.

 1  FUNCTION trainFreqModel(freqModel, repositories) --> freqModel
 2      Create an empty nested-map data structure: data = {}
 3      For each Simulink model (M) in the repositories:
 4          For each Simulink Block (B) in the top-level system of M:
 5              processBlock(B, data)
 6      freqModel.data = data
 7      Return freqModel
 8
 9
10  FUNCTION processBlock(B, data)
11      If B is a subsystem:
12          For each Simulink Block (BSS) in B:
13              processBlock(BSS, data)
14      Else:
15          If data.B does not exist:
16              data.B = {}
17              data.B.count = 0
18              data.B.src = {}
19              data.B.src.count = 0
20              data.B.src.details = {}
21              data.B.dst = {}
22              data.B.dst.count = 0
23              data.B.dst.details = {}
24              data.B.both = {}
25              data.B.both.count = 0
26              data.B.both.details = {}
27          data.B.count++
28          For each block connected to the src port of B:
29              Get its block-type (BS)
30              data.B.src.count++
31              If data.B.src.details.BS does not exist:
32                  data.B.src.details.BS = 0
33              data.B.src.details.BS++
34          For each block connected to the dst port of B:
35              Get its block-type (BD)
36              data.B.dst.count++
37              If data.B.dst.details.BD does not exist:
38                  data.B.dst.details.BD = 0
39              data.B.dst.details.BD++
40          For each block connected to either src or dst port of B:
41              Get its block-type (BB)
42              data.B.both.count++
43              If data.B.both.details.BB does not exist:
44                  data.B.both.details.BB = 0
45              data.B.both.details.BB++

Listing 4.4: Pseudocode for the Freq model’s training algorithm

When the engineer requests suggestions, SimIMA captures the MUD and the MRCB’s block type, T. SimIMA then derives a classifier using two different calculations. Firstly, it finds all blocks that are connected to T-type blocks as destination blocks. It then calculates a confidence value for each potential suggestion based on how often that suggestion was found as a destination block. Secondly, it also considers context in the form of neighbours, that is, all blocks that are neighbours (sources or destinations) of T-type blocks. SimIMA once again calculates the confidence for each suggestion from the frequency of that suggestion among all neighbours of T-type blocks. Deriving a total block-prediction model for the Freq model involved an empirical experiment to weight the suggestions based on the MRCB and its neighbors.
Specifically, the results of the experiment showed that weighting destination alone at 0.9 and neighbours at 0.1 yielded the best accuracy.

Listing 4.5 presents the algorithm, in the form of pseudocode, for block prediction by the Freq Model. The function predictFreqModel first calls two internal functions, predictByMrcbBlockType and predictByMrcbNeighborsBlockTypes (lines 8 and 9), which return suggestions based on the MRCB and its neighbors, respectively. Then, predictFreqModel calls the function mergeSuggs with these suggestions as input (line 10). It then sorts the suggestions by their confidence values (line 11) and returns them. The full source code for the Freq model is in Listing A.8 in Appendix A.

 1  // NOTE:
 2  // - The 'context' of MUD is captured by the caller of the function predictFreqModel().
 3  // - Context.mrcbBlockType is the Simulink block-type of the most recently clicked block in Simulink workspace.
 4  // - Context.mrcbNeighborsMap contains (key,value) pairs such that key is block-type and value is frequency of that block-type connected to the MRCB
 5
 6
 7  FUNCTION predictFreqModel(context) --> suggestions
 8      suggsMrcb = predictByMrcbBlockType(context.mrcbBlockType)
 9      suggsNeighbors = predictByMrcbNeighborsBlockTypes(context.mrcbNeighborsMap)
10      suggs = mergeSuggs(suggsMrcb, suggsNeighbors, 0.9, 0.1)
11      Sort suggs by confidence -- high to low
12      Return suggs
13
14
15  FUNCTION predictByMrcbBlockType(mrcbBlockType) --> suggestions
16      suggs = new collection
17      If data.mrcbBlockType does not exist:
18          return suggs
19      For each block-type (B) in data.mrcbBlockType.dst.details:
20          Create suggestion S
21          S.blockType = B
22          S.confidence = data.mrcbBlockType.dst.details.B / data.mrcbBlockType.dst.count
23          Add S to suggs
24      Sort suggs by confidence -- high to low
25      Return suggs
26
27
28  FUNCTION predictByMrcbNeighborsBlockTypes(mrcbNeighborsMap) --> suggestions
29      suggs = new collection
30      nNeighbors = number of neighbors of the MRCB
31      For each (neighborBlockType, count) in mrcbNeighborsMap:
32          If data.neighborBlockType does not exist:
33              continue
34          For each block-type (B) in data.neighborBlockType.both.details:
35              nNeighborsThis = number of neighbors of the MRCB of block-type B
36              Create suggestion S
37              S.blockType = B
38              S.confidence = data.neighborBlockType.both.details.B / data.neighborBlockType.dst.count
39              S.confidence = S.confidence * nNeighborsThis / nNeighbors
40              Add S to suggs
41      Sort suggs by confidence -- high to low
42      Return suggs
43
44
45  FUNCTION mergeSuggs(suggsMrcb, suggsNeighbors, weightMrcb, weightNeighbors) --> suggestions
46      suggs = new collection
47      For each suggestion SM in suggsMrcb:
48          If there does not exist a suggestion S in suggs such that S.blockType == SM.blockType:
49              Create suggestion S
50              S.blockType = SM.blockType
51              S.confidence = 0
52              Add S to suggs
53          S.confidence += SM.confidence * weightMrcb
54      For each suggestion SN in suggsNeighbors:
55          If there does not exist a suggestion S in suggs such that S.blockType == SN.blockType:
56              Create suggestion S
57              S.blockType = SN.blockType
58              S.confidence = 0
59              Add S to suggs
60          S.confidence += SN.confidence * weightNeighbors
61      Sort suggs by confidence -- high to low
62      Return suggs

Listing 4.5: Pseudocode for the Freq model’s prediction algorithm
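To make the look-up concrete, the following Python sketch mirrors predictByMrcbBlockType using only the ‘dst’ portion of the ‘Sum’ entry from Listing 4.3. The names `FREQ_DATA` and `predict_by_mrcb_block_type` are assumptions chosen for this sketch, not identifiers from the SimIMA source code.

```python
# Trimmed copy of the 'Sum' entry from Listing 4.3 (only the 'dst' part is needed here).
FREQ_DATA = {
    "Sum": {
        "dst": {
            "count": 167,
            "details": {"Sum": 35, "Gain": 13, "Inport": 32, "Outport": 87},
        },
    },
}


def predict_by_mrcb_block_type(data, mrcb_type):
    """Return (block type, confidence) pairs for blocks seen at the dst port of mrcb_type."""
    entry = data.get(mrcb_type)
    if entry is None:
        return []  # block type never seen during training
    dst = entry["dst"]
    suggestions = [(block_type, n / dst["count"]) for block_type, n in dst["details"].items()]
    return sorted(suggestions, key=lambda s: s[1], reverse=True)


suggestions = predict_by_mrcb_block_type(FREQ_DATA, "Sum")
```

For a Sum MRCB, the sketch ranks Outport first (87 of 167 destination connections), followed by Sum, Inport, and Gain. The look-up touches a single entry of the nested map, so its cost depends on the number of distinct block types in that entry and not on the number of training models.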

4.1.3 Ensemble Model

The Ensemble model is the third prediction model employed by SimGestion to recommend block-level suggestions. Unlike the previous two models, this model does not produce suggestions on its own. It rather “combines” suggestions produced by the ARM model and the Freq model to give a final list of suggestions. The suggestions so produced by the Ensemble model are found to be more accurate than those produced by the individual ARM and Freq models.

Ensemble Model Implementation

The main task involved in the Ensemble model is the merging of two suggestion groups, obtained from the ARM model and the Freq model, into one final suggestion group. The Ensemble model accomplishes this task using empirically derived weights to calculate a “total” confidence. This process is consistent with other ensemble methods that combine classifiers by “taking a (weighted) vote of their predictions” [65]. Listing 4.6 illustrates this algorithm in the form of pseudocode. The key aspects here are the use of weights for calculating confidence on lines 9 and 16 for the ARM model and the Freq model, respectively. Confidence for a suggestion starts at zero and increases only when the block type is suggested by the ARM model or the Freq model, scaled by that model’s weight. Chapter 6 describes the empirical derivation of these weights. The derivation sets their values to 0.3 for the ARM model and 0.7 for the Freq model.

 1  FUNCTION mergeSuggsEnsembleModel(suggsArm, suggsFreq, weightArm, weightFreq) --> suggestions
 2      suggs = new collection
 3      For each suggestion SA in suggsArm:
 4          If there does not exist a suggestion S in suggs such that S.blockType == SA.blockType:
 5              Create suggestion S
 6              S.blockType = SA.blockType
 7              S.confidence = 0
 8              Add S to suggs
 9          S.confidence += SA.confidence * weightArm
10      For each suggestion SF in suggsFreq:
11          If there does not exist a suggestion S in suggs such that S.blockType == SF.blockType:
12              Create suggestion S
13              S.blockType = SF.blockType
14              S.confidence = 0
15              Add S to suggs
16          S.confidence += SF.confidence * weightFreq
17      Sort suggs by confidence -- high to low
18      Return suggs

Listing 4.6: Pseudocode for Ensemble model creation
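The weighted vote in Listing 4.6 can be sketched compactly in Python with a dictionary that accumulates weighted confidences per block type. The function name `merge_weighted` and the input confidences below are hypothetical; only the weights 0.3 and 0.7 come from the text.

```python
def merge_weighted(groups):
    """groups: iterable of (suggestions, weight); suggestions maps block type -> confidence."""
    total = {}
    for suggestions, weight in groups:
        for block_type, confidence in suggestions.items():
            # Confidence starts at zero and grows by each model's weighted vote.
            total[block_type] = total.get(block_type, 0.0) + confidence * weight
    return sorted(total.items(), key=lambda kv: kv[1], reverse=True)


arm = {"Gain": 0.5, "Scope": 0.5}      # hypothetical ARM confidences
freq = {"Outport": 0.5, "Gain": 0.25}  # hypothetical Freq confidences
merged = merge_weighted([(arm, 0.3), (freq, 0.7)])
```

Here Gain is suggested by both models and receives 0.5 × 0.3 + 0.25 × 0.7 = 0.325, which places it between Outport (0.35) and Scope (0.15). The same accumulation pattern applies to the Freq model’s internal merge in Listing 4.5 with weights 0.9 and 0.1.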

4.2 Performance Optimization of Block-Prediction Models

In general, training these prediction models involves loading the training-set Simulink models from the default and custom repositories in Simulink and extracting the information needed by SimGestion. Experiments show that it takes less than a second per Simulink model to train the two prediction models, although this varies with the size of the models. To speed up the process, SimGestion employs a number of optimizations.

Firstly, SimGestion undergoes “preliminary training” on the six default repositories. This involves training both the ARM model and the Freq model on all possible combinations, that is, 2^6 = 64 combinations of the default repositories. This preliminary training is conducted only once (at tool-development time). With this optimization, SimGestion does not need to recompute (retrain) the prediction models every time the user changes their choice of default repositories.
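The count of 64 follows from each of the six repositories being either selected or not. A short Python sketch enumerates the selections; apart from “Miscellaneous”, the repository names are placeholders invented for this sketch, and the stored string stands in for a pretrained model.

```python
from itertools import combinations

# Placeholder names: only "Miscellaneous" is named in the text.
DEFAULT_REPOS = ["Miscellaneous", "RepoB", "RepoC", "RepoD", "RepoE", "RepoF"]


def all_repo_subsets(repos):
    """Every subset of the repositories, including the empty selection."""
    return [frozenset(c) for r in range(len(repos) + 1) for c in combinations(repos, r)]


subsets = all_repo_subsets(DEFAULT_REPOS)
# One pretrained model per selection, keyed by the selection itself.
pretrained = {subset: f"model-for-{len(subset)}-repos" for subset in subsets}
```

Keying the pretrained models by the selected set makes the later look-up independent of the order in which the user ticks the repositories.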

Secondly, SimGestion employs a cache-like approach. For both the ARM model and the Freq model, it maintains a hash-keyed store of the table/matrix and the frequency data, respectively. When using SimIMA to train the ARM or the Freq models, it first computes the hash value of the Simulink model file and checks whether the data for that file is available in the corresponding cache. If so, it uses that information. Otherwise, it extracts the necessary information for the prediction models by loading the Simulink model, and also updates the cache for future use.
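A minimal Python sketch of such a content-hash cache follows. The choice of SHA-256, the class name `ExtractionCache`, and the toy extraction function are all assumptions made for illustration; the text does not state which hash function or storage format SimIMA uses.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Hash of the model file's contents (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class ExtractionCache:
    def __init__(self, extract):
        self._extract = extract  # expensive step: file bytes -> extracted information
        self._store = {}
        self.misses = 0

    def get(self, data: bytes):
        key = file_digest(data)
        if key not in self._store:
            # Cache miss: run the expensive extraction once and remember the result.
            self.misses += 1
            self._store[key] = self._extract(data)
        return self._store[key]


# Toy stand-in for loading a Simulink model and listing its block types.
cache = ExtractionCache(lambda raw: sorted(set(raw.decode().split())))
first = cache.get(b"Sum Gain Sum")
second = cache.get(b"Sum Gain Sum")
```

Because the key is derived from the file contents, an unchanged model is never re-extracted, while any edit to the file produces a new key and triggers a fresh extraction.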

Lastly, it performs prediction model merging within the ARM model and within the Freq model by considering the default and custom repositories separately and merging them afterwards. This occurs each time the engineer makes a change in their selection of repositories. SimGestion begins by loading the appropriate preliminary trained prediction models of the default repositories from the first optimization. It then compares the Simulink model files in the new custom repositories, and if it finds differences, it updates the prediction models. It then performs prediction-model merging of the default repositories model and the custom repositories model. For the Freq model, this involves updating all the appropriate data (nested map data structure). For the ARM model, it concatenates the two matrices into one.
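Both merges can be sketched in Python. This is a sketch under stated assumptions: the Freq data is assumed to have the nested shape of Listing 4.3, the ARM matrices are assumed to be lists of 0/1 rows with a separate column list, and the function names `merge_freq` and `concat_arm` are hypothetical. Column alignment in `concat_arm` is an assumption about how two matrices with different block types would be reconciled.

```python
def merge_freq(a, b):
    """Sum two nested count maps of the shape shown in Listing 4.3."""
    merged = {}
    for source in (a, b):
        for key, value in source.items():
            if isinstance(value, dict):
                merged[key] = merge_freq(merged.get(key, {}), value)
            else:
                merged[key] = merged.get(key, 0) + value
    return merged


def concat_arm(cols_a, rows_a, cols_b, rows_b):
    """Stack two presence matrices, padding block types missing from one side with 0."""
    columns = list(cols_a) + [c for c in cols_b if c not in cols_a]

    def realign(cols, rows):
        index = {c: i for i, c in enumerate(cols)}
        return [[row[index[c]] if c in index else 0 for c in columns] for row in rows]

    return columns, realign(cols_a, rows_a) + realign(cols_b, rows_b)


default_freq = {"Sum": {"count": 2, "dst": {"count": 1, "details": {"Gain": 1}}}}
custom_freq = {
    "Sum": {"count": 3, "dst": {"count": 2, "details": {"Gain": 1, "Scope": 1}}},
    "Gain": {"count": 1},
}
freq = merge_freq(default_freq, custom_freq)
columns, table = concat_arm(["Sum", "Gain"], [[1, 1]], ["Gain", "Scope"], [[1, 1], [0, 1]])
```

Since both structures hold plain counts or presence flags, merging is additive and does not require reloading any Simulink model.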

4.3 User Interface

The SimGestion UI can be broken down into two parts: the suggestion panel and the configuration wizard.

4.3.1 SimGestion Suggestion Panel

SimGestion presents the block-level suggestions to the engineer in a suggestion “panel”. The suggestion panel is a rectangular element overlaid onto the SUD. It always appears adjacent to the MRCB for users’ convenience. This panel contains the suggested block-types, their prediction confidences, and instructions for inserting them into the SUD (or ignoring them). By default, the panel shows six block-level suggestions. However, this value is user-configurable using the SimGestion configuration wizard (see Section 4.3.2). Figure 4.1 presents a screenshot of the SimGestion suggestion panel.

Figure 4.1: SimGestion suggestion panel

4.3.2 SimGestion Configuration Wizard

To allow for engineer customization of data and learning parameters, SimGestion uses MATLAB App Designer to implement a SimIMA Block-Level Suggestion Configuration wizard. Figure 4.2 illustrates this wizard. SimIMA includes the call to the wizard in the same model context menu as all other SimIMA commands. Within the wizard, engineers are able to adjust the number of suggestions shown by SimIMA, choose which of the default repositories to include, add their own custom repositories, and adjust the performance of the recommendation.

The performance slider allows the user to choose between speed and accuracy with three fixed positions. The evaluation experiments, described in Chapter 6, show that the ARM model produces more accurate suggestions than the Freq model, and that the Ensemble model produces even more accurate suggestions than the ARM model. The suggestion retrieval time for these three models is in the reverse order, with the Freq model being the fastest and the Ensemble model the slowest. The Freq model is fast because it produces suggestions through a simple look-up of its nested-map data, an operation which is independent of the training data size, that is, O(1). In contrast, the ARM model is slower because it scans through each of the rows of the ARM table to produce suggestions. The number of such rows grows with the training data size, and so does the ARM model’s suggestion-retrieval time. The Ensemble model is slower still because it needs to get suggestions from both of the other models and then combine them into a final list of suggestions. Prioritizing speed uses only the Freq model. Prioritizing accuracy uses the Ensemble model. The balanced option applies only the ARM model for a mix of speed and accuracy. Any time an engineer changes the repository selection, it triggers a regeneration of the classifiers as needed.

Figure 4.2: SimIMA Block-Level Suggestion Configuration Wizard

4.4 Running Example: ExampleSUD

In order to best illustrate the application of SimIMA to systems under development, this thesis uses a running example. The running example, ExampleSUD, begins as a relatively new system undergoing development. This section describes the process of developing ExampleSUD first using SimGestion. Section 5.5 continues to build on the same example using SimXample once it becomes more advanced. This is not necessarily meant to showcase a standard series of events, but rather to illustrate the two assistance approaches in one consistent example to assist the reader’s understanding. This example employs the default repository named “Miscellaneous” that comes with SimIMA to facilitate reproduction and replicability.

Figure 4.3: Initial composition of ExampleSUD

The initial composition of ExampleSUD, as seen in Figure 4.3, contains a single inport block, which is a typical starting point for most systems. The running example starts by querying SimGestion on this ExampleSUD. This provides the engineer with the top six (by default) suggested blocks that they could connect. Figure 4.4 illustrates the in-model suggestion view for this initial query. Beside each block, SimGestion also shows its confidence in that particular suggestion. For this running example, the engineer opts to select the top-ranked non-terminal block to continue building the model in a step-wise fashion, with the most recently inserted block acting as the MRCB for the next suggestion. Following this process for four insertions produces the interim model in Figure 4.5. Continuing to apply a total of seven suggestions produces the interim model shown in Figure 4.6. SimGestion can be applied continuously in this manner for as long as the engineer wants to keep inserting suggested blocks. Using a different MRCB will yield different suggestion results.

Figure 4.4: Initial Suggestions for First Insertion into ExampleSUD

Figure 4.5: Interim view of ExampleSUD after four applications of SimGestion

Figure 4.6: Interim view of ExampleSUD after seven applications of SimGestion

Chapter 5 SimXample: Simulink Complete Model Examples

This chapter introduces the second module of SimIMA, called SimXample. SimXample is so named because it provides complete system-level examples or suggestions for completing Simulink models. This module addresses the second research question presented in Chapter 1: Can SimIMA provide engineers developing software models with similar model examples for insertion/inspiration based on analysis of a configurable model set? Some of the contents of this chapter have been submitted to the SoSyM Theme Issue on AI-enhanced model-driven engineering [2].

This chapter presents an overview of the SimXample process followed by a presentation of its phases, including the research hurdles and resulting process implementations. While the implementations are Simulink specific, the general process is intended to be applicable to any language that has type-3 model clone detection capabilities. The SimXample process, illustrated in Figure 5.1, infers and visualizes model suggestions to engineers through a four-phase process: model clone detection, subsystem recommendation, candidate selection, and application to the engineer’s model. At the highest level, it begins with a MUD and ends with that model being updated/replaced, depending on the engineer’s preference. More specifically, the input to this process is any MUD that the engineer is currently developing. When an engineer is unsure of how to complete an SUD within the MUD, they can use SimXample to find examples of other models that have a similar initial structure. This allows them to comprehend how other modelers have completed similar systems, and optionally insert them into their implementations. Additionally, a secondary SimXample use case arises when the engineer has just completed creating a subsystem, or is evaluating an existing subsystem, and wants to compare their subsystem visually to other similar design alternatives, potentially for optimization or verification. This chapter now describes each phase, first language-agnostically, followed by how this thesis realizes it for SimXample specifically. A later section illustrates the SimXample process with a running example.

Figure 5.1: SimXample Process Overview

5.1 Phase 1: Clone Detection

The input to this first phase is the SUD. Its output is model clone detection data that are able to be interpreted by the modeling assistant in such a way that the assistant can provide recommendations to the engineer. The essence of this phase is the inclusion of model clone detection directly into the targeted environment. That is, a model clone detector capable of detecting Type 3 model clones must be “hooked” into the engineers’ interface. This includes being able to receive sufficient SUD data such that the model clone detector can perform its analysis. Additionally, the engineer must be able to specify the knowledge base, in the form of repositories and sources, that the intelligent modeling assistant should have the model clone detector consider when inferring example model clones. Once the clone detector is configured visually in this manner by the engineer, the modeling assistant must be able to conduct the model clone analysis and inference without requiring the engineer to leave their interface as per established responsive requirements when implementing model recommenders [70].

5.1.1 Phase 1 Implementation

SimXample implements the user interface for this phase using App Designer. SimXample is available as a custom application via the menu of any Simulink model or subsystem. An engineer interested in a suggestion from SimXample can select the menu item from the menu of their SUD, using either the “Tools” menu or the right-click context menu, and be presented with the phase 1 SimXample interface, or the “landing page”, as illustrated in Figure 5.2. As illustrated on the left of Figure 5.2, the engineer is prompted to select the knowledge base repositories that SimXample will search and consider for clone analysis along with the SUD. Based on the repositories selected by the engineer, SimXample invokes Simone’s cross-clone functionality. It also configures Simone’s clone similarity thresholds and other settings.

Figure 5.2: SimXample user interface

Based on existing work evaluating Simone’s ideal clone detection settings [51], SimXample sets Simone to use a 30% difference (70% similar) threshold by default, which finds model clones that are 30% different or less according to its algorithm. This difference threshold has been previously established and evaluated to find models that are related and useful clones [51]. Engineers can change this difference threshold according to their preference via the radio button showcased in Figure 5.2 by selecting Very similar (> 90%), Somewhat similar (> 80%), or Less similar (> 70%), which are termed as such for user friendliness. This adheres to the recommendation of Dyck et al. pertaining to “allowing multiple recommender strategies” [70]. The cross-clone function performs model clone detection on the SUD against/across the complete set of models in all selected repositories. That is, it looks for clones of the SUD within the selected repositories, but not among the repositories themselves.
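The relationship between the user-facing similarity labels and the difference threshold passed to the clone detector is a simple complement. The Python sketch below shows it; the labels and percentages come from the text, while the names `SIMILARITY_FLOOR_PERCENT` and `difference_threshold` are hypothetical and do not reflect how SimXample actually passes settings to Simone.

```python
# Labels and similarity floors are taken from the text; the identifiers are illustrative.
SIMILARITY_FLOOR_PERCENT = {
    "Very similar": 90,
    "Somewhat similar": 80,
    "Less similar": 70,  # default
}


def difference_threshold(label="Less similar"):
    """Convert a UI label into a maximum-difference percentage."""
    return 100 - SIMILARITY_FLOOR_PERCENT[label]


default_threshold = difference_threshold()
strict_threshold = difference_threshold("Very similar")
```

Working in whole percentages avoids floating-point rounding when the complement is taken.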

SimXample conducts the clone detection process in the background. Running clone detection on only the SUD rather than the complete model allows SimXample to focus on a local context rather than being obscured by model content not relevant to the current design context due to the nested clone problem [38]. This problem is an interesting research challenge this thesis had to address. Specifically, when multiple clones are present in the same Simulink model such that one of the clones (the inner clone) is contained within some other clone (the outer clone), Simone reports only the outermost model clone that meets its threshold and not any inner model clones. This is an issue because, under standard circumstances, Simone takes as input two Simulink models and detects model clones at the system level. If SimXample were to pass in the MUD containing the SUD, Simone may identify system clones at a higher level of the hierarchy than the SUD within the MUD. Thus, to avoid the nested clone problem, SimXample does not treat the actual MUD as input. Instead, it creates a temporary MUD that contains the SUD as the top-level system, while maintaining all the subsystems and elements contained within/below the SUD. It then runs cross-clone detection with that temporary MUD. Once SimXample includes, configures, and executes the model clone detector, it must interpret and process the results to provide recommendations to the engineers, which is the focus of the next phase.

5.2 Phase 2: Subsystem Recommendation

This phase takes as input the tailored model clone detection results from the prior phase. Its output is a list of specific (sub)system recommendations. The model clone detection results must be interpreted by a modeling assistant in such a way that only the most similar/related model clones are inferred and visualized to the engineer, since those most likely mirror their current intent. Depending on the nature of the model clone detection report and results, this involves various forms of interpretation and processing. After completing that interpretation and processing, the modeling assistant must present these results in a responsive, non-blocking manner [80] that is natural and organic to the engineer’s processes.

5.2.1 Phase 2 Implementation

Given the fairly extensive model clone detection report produced by Simone, SimXample must interpret and visualize only the most related and similar type 3 model clones to the engineer for consideration.

Listing 5.1: Sample Simone Report

There are a number of considerations in deriving suggestions, including querying, ranking, and filtering [70]. The initial implementation of SimXample ranks suggestions based on their similarity to the SUD according to Simone’s algorithm. To facilitate SimXample’s suggestions of the most similar model systems, this process implements filtering and sorting of the model clone detection results. The report from Simone’s cross cloning feature is an XML report containing pairs of clones, with each pair containing the SUD as one of its elements, and supplemental data about the clone pairs. Listing 5.1 provides an excerpt of such a report. SimXample parses this report to create an internal representation within Simulink for further processing.

In parsing the report, SimXample gains an internal representation that consists of a list of Clone objects, each with

• nlines: the number of source lines contained in the clone

• similarity: the percentage of similar lines between the clone pair

• source1: a Source object representing the SUD

• source2: a Source object representing the potential suggestion system

The full source code for the Clone object is in Listing A.6 in Appendix A.

SimXample sorts and filters this clone list to provide optimal complete-system suggestions. First, SimXample sorts the list in decreasing order of similarity. It then filters that sorted list down to the maximum number of suggestions, which is an engineer-configurable value, set by default to 10. SimXample’s filtering algorithm removes any clone pairs that have 100% similarity, as identical clones are not useful suggestions. It then removes any suggestions beyond the maximum suggestion limit. With the default limit, the resulting list contains at most 10 clone pairs, which form the suggestions for completing the SUD.
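The sorting and filtering steps can be sketched in Python as follows. The `Clone` dataclass below mirrors the fields listed for the Clone object, but it is a simplified stand-in: the Source objects are replaced by plain strings, and the function name `select_suggestions` is an assumption for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    nlines: int
    similarity: int  # percentage of similar lines between the clone pair
    source1: str     # stands in for the Source object of the SUD
    source2: str     # stands in for the Source object of the candidate system


def select_suggestions(clones, max_suggestions=10):
    """Sort by similarity (high to low), drop identical clones, keep the top entries."""
    ranked = sorted(clones, key=lambda c: c.similarity, reverse=True)
    useful = [c for c in ranked if c.similarity < 100]
    return useful[:max_suggestions]


clones = [
    Clone(40, 85, "sud", "a"),
    Clone(40, 100, "sud", "b"),  # identical clone, filtered out
    Clone(38, 92, "sud", "c"),
    Clone(50, 71, "sud", "d"),
]
top_two = select_suggestions(clones, max_suggestions=2)
```

Removing the 100% matches before truncating ensures that identical clones do not take up slots in the final list.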

For each suggestion, SimXample creates an internal model file and an image file for use in suggestion processing and visualization. It stores a complete Suggestion object, which has an internal representation consisting of

• similarity: the similarity to the SUD

• source: a Source object representing the potential suggestion system, corresponding to source2 in a Clone object

• mdlFile: the file path to the MDL file containing the subsystem

• imgFile: the file path to the image file representing the system for displaying in the next phase

• rank: the numerical ranking of the suggestion

The full source code for the Suggestion object is in Listing A.10 in Appendix A.

SimXample passes the list of Suggestion objects to the third phase for visualization to, and eventual selection by, the engineer.

5.3 Phase 3: Candidate Selection

This phase takes as input the inferred system recommendations, which it visualizes for the engineer to browse. It ends with output representing the engineer’s choice of which system they want to load for inspiration or direct application into their model. This ability to directly insert and apply recommendations from a modeling assistant is consistent with the guidelines from Dyck et al. [70]. SimXample includes a customization feature that allows engineers to alter/update the suggestions by selecting alternative repositories, adding engineer-defined repositories, or adjusting the similarity threshold. They can then “apply” their customization through the corresponding button on the landing page. This causes the whole process to begin anew with the new parameters. The SimXample interface is interactive, and any changes to these parameters are reflected in the displayed results responsively and in real time.

5.3.1 Phase 3 Implementation

As illustrated on the right of Figure 5.2, SimXample presents the engineer with the two most similar and related suggestions at first. SimXample represents and visualizes the inferred systems to the engineer directly in the SimXample interface within MATLAB by loading the model containing the systems and capturing an image representation at the appropriate hierarchy level of the respective inferred system suggestions. The interface shows two options side by side, each with its similarity value relative to the SUD. The engineer has the ability to browse through pairs of suggestions, sorted by similarity ranking, using navigational buttons. SimXample displays the 10 most similar suggestions. A significant feature of this SimXample tool interface is the ability to select the desired suggested system directly in the interface by clicking on its representative image. This loads the model for further processing and prepares it for the next phase. This chosen (sub)system, which is referred to herein as the System From Suggestion (SFS), has already been loaded into Simulink’s memory by SimXample during the previous phase, allowing SimXample to utilize it in the next phase efficiently and responsively.

5.4 Phase 4: Application to User Model

The input to this final phase is the engineer-selected system they wish to load for inspiration and/or direct insertion into the modeling environment. There must be enough (meta)data about the system such that a modeling assistant can insert the system into the engineer’s environment as correctly as possible, given the context. This includes any connections that are being replaced, or those that are implicit. The output for this phase and the process as a whole is an updated interface and model consistent with the engineer’s expectations based on their selected suggestion.

5.4.1 Phase 4 Implementation

With the SFS selected by the engineer via the SimXample interface, the final step is the application of the SFS to the SUD. Rather than the simplified approach of merely replacing the entire SUD with the contents of the SFS as is, it is necessary to consider the impact of the replacement on the MUD as a whole. This introduces potential MUD variation that may occur due to potential mismatches in the block-port signatures of the SUD. Since SimXample uses Type 3 model clones as its data, it is possible that the SFS has a different number of ports, such as inports, outports, or a combination of both, than the SUD. Thus, a direct replacement may cause inconsistencies. There are multiple possible approaches that SimXample considers to perform a replacement. These approaches can be categorized as follows:

1. direct copy of SFS into SUD

2. automatic merging of SFS into SUD

3. intelligent copy of SFS into SUD with engineer intervention

The first option can lead to unconnected ports, lost information, or model-build errors if done naively by a model assistant. For these reasons, this option is not a viable solution on its own. The second option involves merging the two systems automatically to maintain the signature of the SUD while applying the logic and design of the SFS. This avoids the issues with the first option. However, automatic efficient model merging is an open research problem, one that has been demonstrated not fit for software engineering needs [81] nor able to scale [82] beyond polynomial time. The third option is one that this implementation devises as a custom hybrid heuristic whereby SimXample copies the SFS into the SUD (replacing contents), detects any changes to the block signature, and reports these to the engineer for their inspection and intervention. This intelligent and user-assisted copy method of SFS application replaces all elements of the SUD with those from the candidate subsystem only after performing a block signature comparison and indicating any differences in the number of inports or outports to the engineer and asking for their intervention.

SimXample realizes the intelligent assisted copy through the application of a heuristic approach that examines the location of the SUD within the MUD as well as the SFS within its respective model file. Figure 5.3 illustrates this heuristic decision process. The assisted copy method identifies five potential actions, placed into four scenarios, based on the locations and signatures of the SUD and SFS:

1. Replace entire model file: If both the SUD and SFS are top level systems, SimXample replaces the entire MUD with the model containing the SFS

2. Copy all contents of SFS into MUD: If the SUD is a top level system but the SFS is contained within a subsystem, SimXample places the contents of the SFS at the top level of the MUD without containment

3. Replace SUD with SFS and connect lines: If the SUD is contained within a subsystem and connected to the surrounding MUD, then the connections become relevant. In this situation, SimXample replaces the SUD with the SFS and connects all ports if and only if there are exactly 1 inport and 1 outport in both the SUD and SFS. Note: if the SFS is not contained within a subsystem, SimXample must first wrap/place it within one before applying this option or the next

4. Engineer Guided Replacement: Similar to the previous option, if the SUD is contained, the replacement of the SUD with the SFS is possible. However, if the block signatures do not have exactly 1 inport and 1 outport, this necessitates engineer notification and intervention, which SimXample treats in one of two ways:

(a) Abort Replacement: if the comparison of the SUD and SFS reveals that either the number of inports do not match or the number of outports do not match, SimXample asks the engineer if they wish to proceed with replacement given the mismatch (Figure 5.4). If they opt not to, SimXample aborts the replacement

(b) Replace SUD with SFS and ask engineer to adjust signal connections: if the comparison yields a match for both inports and outports, or if the engineer opts to continue merging, SimXample can make the substitution directly and prompts the engineer to inspect/adjust the connections (Figure 5.5)
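The scenarios above can be summarized as a small decision function. The following Python sketch is illustrative only: SimXample itself operates within Matlab, and the names `SystemInfo`, `Action`, `choose_action`, and `engineer_proceeds` are hypothetical, introduced here to mirror the listed scenarios rather than to reproduce the actual implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Action(Enum):
    REPLACE_MODEL_FILE = "1: replace entire model file"
    COPY_CONTENTS_INTO_MUD = "2: copy all contents of SFS into MUD"
    REPLACE_AND_CONNECT = "3: replace SUD with SFS and connect lines"
    ABORT = "4a: abort replacement"
    REPLACE_AND_PROMPT = "4b: replace SUD with SFS, engineer adjusts connections"


@dataclass(frozen=True)
class SystemInfo:
    """Hypothetical summary of a (sub)system; not a SimXample data structure."""
    is_top_level: bool
    inports: int
    outports: int


def choose_action(sud: SystemInfo, sfs: SystemInfo,
                  engineer_proceeds: Callable[[], bool] = lambda: True) -> Action:
    """Pick one of the five actions from the locations and port signatures."""
    if sud.is_top_level:
        # Scenarios 1 and 2: there are no surrounding connections to preserve.
        if sfs.is_top_level:
            return Action.REPLACE_MODEL_FILE
        return Action.COPY_CONTENTS_INTO_MUD

    # The SUD is contained, so its connections to the MUD matter.
    # (A top-level SFS would first be wrapped in a subsystem.)
    single_in_single_out = (1, 1)
    if ((sud.inports, sud.outports) == single_in_single_out
            and (sfs.inports, sfs.outports) == single_in_single_out):
        return Action.REPLACE_AND_CONNECT

    # Scenario 4: the engineer is only asked to confirm when port counts differ.
    signature_mismatch = (sud.inports != sfs.inports
                          or sud.outports != sfs.outports)
    if signature_mismatch and not engineer_proceeds():
        return Action.ABORT
    return Action.REPLACE_AND_PROMPT
```

For example, `choose_action(SystemInfo(False, 2, 1), SystemInfo(False, 3, 1), lambda: False)` yields `Action.ABORT`, which corresponds to the engineer declining the prompt shown in Figure 5.4.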

A successful insertion of the SFS into the SUD constitutes a complete application of SimXample. SimXample makes changes primarily within its current modeling context, allowing the engineer to observe any updates on the screen without navigating through the model. The only potential changes made outside of the local context relate to connections to the SUD at a higher level of the model hierarchy, which SimXample addresses through its user prompts. Following this final phase, the engineer can continue the development of their system, including using SimXample again for any in-progress or recently completed systems.

Figure 5.3: Decision Process for Inserting SFS into SUD

Figure 5.4: Engineer-guided replacement notification – prompt to proceed with replacement

Figure 5.5: Engineer-guided replacement notification – prompt to adjust connections manually

5.5 Running Example Continued: ExampleSUD

ExampleSUD is in a mostly complete state after successive applications of SimGestion, as demonstrated in Figure 4.6. The engineer now applies SimXample to find similar complete systems as suggestions for insertion into the MUD. Querying SimXample with the ExampleSUD in its current configuration leads to the suggestions shown in Figure 5.2. In this case, the engineer opts to select Suggestion 2 for insertion. After the insertion, ExampleSUD is complete, as illustrated in Figure 5.6.

Figure 5.6: ExampleSUD after inserting the suggestion from SimXample

Chapter 6 Evaluation

This chapter presents the evaluation of SimIMA. The goal of this thesis is to demonstrate the feasibility of the two forms of modelling assistance: step-wise model element suggestions and complete model examples, both based on analysis of a configurable model set. To that end, this thesis develops a prototype modeling assistant for the Simulink modeling environment, called SimIMA, that internally consists of two modules: SimGestion and SimXample. SimGestion provides engineers with Simulink block-level suggestions and tries to answer the first research question. Similarly, SimXample provides engineers with Simulink (sub)system-level suggestions and tries to answer the second research question. This thesis evaluates SimGestion and SimXample independently, focusing on the evaluation of the approaches/algorithms. This thesis also makes the resources used for its evaluation, including the datasets, results, automation scripts, and instructions for the reproduction of evaluation experiments, available publicly.¹ This evaluation also uses existing work on evaluating Intelligent Modelling Assistants (IMAs) to conduct a qualitative evaluation. At this time, user evaluations and studies are out of scope, and can be considered for future work. To be meaningful, such endeavours will require multiple users from different domains. This chapter begins with a discussion of the data used throughout the evaluations. Some of the contents in this chapter have been submitted to the SoSyM Theme Issue on AI-enhanced model-driven engineering [2].

6.1 Data

The evaluation experiments for SimIMA leveraged a large publicly available and curated corpus of Simulink models [13]. Additionally, Boll et al. independently evaluated and validated this set for empirical research [14]. Through investigation and consultation with industry partners, they note that the models are suitable for empirical research. Many of the models are “mature” and large enough for analysis purposes. They are diverse enough that they facilitate good replication opportunities as well. Some concerns regarding the set are that some projects are no longer under development and code generation is not well represented. Neither of these are concerns for the purpose of evaluation of SimIMA, however. This is because SimIMA algorithms do not require the repository models to be actively maintained. Being valid Simulink model files is a sufficient condition for their inclusion in the evaluation experiments. Also, whether or not the models support code generation is irrelevant because SimIMA is not concerned with the code-generation aspect of Simulink models.

¹ SimIMA evaluation: https://doi.org/10.5281/zenodo.5123564

There are a total of 946 model files in this set, including both Simulink MDL (573) and SLX (373) formats. The evaluation process first sorted these models into similar sets based on their domains to provide the most valuable suggestions. It then cross-referenced the domains indicated explicitly by the corpus curators with an independent analysis of model and documentation content. This produced six non-mutually-exclusive collections: automotive, avionics, electronics, energy, robotics, and miscellaneous. They can all also be found either directly on the corpus’ website or through a link on that same website. While a variety of models could have been chosen for evaluation, this evaluation process uses these models from the corpus for two reasons: to facilitate independent verification and replication of the experiments due to their public availability, and the level of variety and realism in the models as established by the independent evaluation.

6.2 SimGestion Evaluation

The evaluation of SimGestion uses the established method for measuring prediction accuracy and error classification, k-fold cross validation [83, 84]. This involves splitting data into k complementary subsets, using some for learning/training and the others for testing/evaluating performance [83], and repeating the process so that each of the k sets is used once as the test set. In the case of the corpus used for this evaluation, given the size of the data sets, 5-fold cross validation was the most appropriate. SimGestion evaluation combines this cross validation with mutation testing [85] to determine the accuracy of the suggestions produced by SimGestion algorithms, that is, the block-prediction models.

6.2.1 Data Preparation

The corpus contains a total of 946 Simulink model files. The data-preparation process discovered that 16 of these models from the corpus were invalid, and removed them immediately. A further 35 of the model files were not fully navigable, that is, Simulink had errors when trying to fetch “params” for some blocks, so these models were also removed. Similarly, 362 of these model files were found to be immutable and thus unusable for evaluating SimGestion. This was either because they had no suitable block to delete at all, or because mutating them would result in a (sub)system with no blocks available to be set as the MRCB. Following the removal of these models, the final model set for SimGestion consisted of 533 unique Simulink models for the evaluation experiments. Following that, the data-preparation process went through an independent analysis of model and documentation content and cross-referencing with the corpus data. Based on such analysis, these 533 Simulink models were categorized into six categories: automotive, avionics, electronics, energy, robotics, and miscellaneous, containing 63, 70, 320, 126, 157, and 69 Simulink models, respectively. A few randomly selected models were removed from some of these categories to make the number a multiple of 5 as needed for 5-fold cross validation. This gave the final counts of 60 automotive, 70 avionics, 320 electronics, 125 energy, 155 robotics, and 65 miscellaneous. Finally, the data-preparation process split each of these categories’ model files randomly (using a seed for reproducibility) into 5 folds for evaluation using 5-fold cross validation.
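The trimming and seeded splitting described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the evaluation scripts themselves: the function names `make_folds` and `train_test_splits`, the seed value, and the file names in the usage note are all hypothetical.

```python
import random


def make_folds(model_files, k=5, seed=0):
    """Shuffle with a fixed seed, trim to a multiple of k, and split into k folds."""
    rng = random.Random(seed)           # a private generator keeps the split reproducible
    shuffled = sorted(model_files)      # sort first so the input order does not matter
    rng.shuffle(shuffled)
    usable = len(shuffled) - len(shuffled) % k
    kept = shuffled[:usable]            # dropping the shuffled tail removes random models
    return [kept[i::k] for i in range(k)]


def train_test_splits(folds):
    """Yield (train, test) pairs so that each fold is the test set exactly once."""
    for i, test in enumerate(folds):
        train = [m for j, fold in enumerate(folds) if j != i for m in fold]
        yield train, test
```

With 63 hypothetical automotive file names, `make_folds(names)` returns five folds of 12 models each, matching the trimmed count of 60 reported above.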

6.2.2 Determining Block Prediction Model Parameters

SimGestion has the following parameters:

• The ARM model has one parameter: ant reduction, which is the amount of antecedent reduction (percentage of antecedents required when assessing with a consequent)

• The Freq model has two parameters: the weight of the MRCB destination block types (weight mrcb) and the weight of the neighbours (weight neighbors). These must add up to one.

• The Ensemble model has two parameters: the weight given to the ARM model (weight arm) and the weight given to the Freq model (weight freq). These must add up to one.

The parameter-tuning process employed grid search [86], the standard method for optimizing parameters, to find the best values for these parameters. For the Freq and Ensemble models, the process treated their two parameters as one by leveraging the fact that the second parameter is simply one minus the first. For example, in the Freq model, the weight of the neighbours is simply one minus the MRCB weight, that is, weight neighbors = 1 − weight mrcb. To find the best parameters in each instance, this process searches through each value of the parameter space, [0,1], using a step size of 0.1, and observes the impact on prediction accuracy using 5-fold cross validation. This parameter-tuning process produced the following configuration parameter values:

• ARM: 0.5 ant reduction

• FREQ: 0.9 weight mrcb, 0.1 weight neighbors

• Ensemble: 0.3 weight arm, 0.7 weight freq
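The one-dimensional grid search over a pair of complementary weights can be sketched as below. This is a hedged illustration: `grid_search` and `evaluate` are hypothetical names, and in the actual process `evaluate` would correspond to the 5-fold cross-validated prediction accuracy, for which a toy scoring function stands in here.

```python
def grid_search(evaluate, steps=10):
    """Try weight pairs (w, 1 - w) for w = 0.0, 0.1, ..., 1.0 and keep the best one.

    `evaluate(first, second)` is assumed to return a score where higher is better.
    """
    best_pair, best_score = None, float("-inf")
    for i in range(steps + 1):
        first = i / steps                # integer stepping avoids accumulated float error
        second = (steps - i) / steps     # complementary weight, so the pair sums to one
        score = evaluate(first, second)
        if score > best_score:
            best_pair, best_score = (first, second), score
    return best_pair, best_score


# Toy stand-in for cross-validated accuracy that peaks at a first weight of 0.9.
best_weights, _ = grid_search(lambda w_first, w_second: -(w_first - 0.9) ** 2)
```

Because the second weight is derived from the first, only 11 candidate configurations are evaluated per model instead of a full two-dimensional grid.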

6.2.3 Evaluation of Ensemble Block Suggestions

To evaluate the effectiveness of the ensemble learning method for block suggestions, the evaluation experiment applied SimGestion once for every mutable block within the evaluation set of models. For each block, it deleted the block from the model and queried SimGestion for suggestions, with the expected result being the deleted block appearing as a suggestion. Since the number of suggestions shown in the SimGestion UI is customizable, it is first necessary to determine the optimal number of blocks to display to the user, which also formed the success criterion. To establish the optimal configuration, the experiment first performed an evaluation considering the top-N suggestions, where N ranged from 1 to 10, and measured the prediction accuracy. Figure 6.1 shows a plot of this comparison. Based on these results, it was determined that displaying the top 6 suggestions provides strong accuracy without overcrowding the engineer’s screen with suggestions. While increasing the number of suggestions presented to engineers marginally increases the accuracy, the gains demonstrated are not significant enough to increase the default number of suggestions. Furthermore, should the engineer wish, they are able to increase the number of suggestions provided to them through the SimGestion Configuration Wizard presented in Section 4.3.2.

Figure 6.1: Accuracy of Suggestion Based on Number of Suggestions Shown

Based on this optimal configuration, the evaluation process ran the validation experiments with the constraint that the deleted block must be suggested as one of the top six suggestions. If the deleted block appears in the top six suggestions, SimGestion has made a correct suggestion, and if the correct block does not appear, or was ranked below the sixth suggestion, SimGestion did not provide the correct suggestion.
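The success criterion can be expressed as a top-N hit rate. The Python sketch below is illustrative only: `top_n_accuracy` is a hypothetical helper, and the evaluation cases are toy data rather than results from the corpus.

```python
def top_n_accuracy(cases, n=6):
    """Fraction of cases whose deleted block appears among the first n suggestions.

    Each case is a pair (deleted_block, ranked_suggestions), best suggestion first.
    """
    if not cases:
        raise ValueError("at least one evaluation case is required")
    hits = sum(1 for deleted, ranked in cases if deleted in ranked[:n])
    return hits / len(cases)


# Toy cases: the deleted block and the ranked suggestions returned for it.
cases = [
    ("Gain", ["Sum", "Gain", "Scope"]),
    ("Scope", ["Gain", "Sum", "Product", "Constant", "Mux", "Demux", "Scope"]),
    ("Sum", ["Sum"]),
    ("Integrator", ["Gain", "Sum"]),
]

# Sweep N from 1 to 10, mirroring the comparison used to choose the default of 6.
accuracy_by_n = {n: top_n_accuracy(cases, n) for n in range(1, 11)}
```

In this toy data, the "Scope" case counts as incorrect at N = 6 because the deleted block is ranked seventh, which is the "ranked below the sixth suggestion" situation described above.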

6.2.4 Results

The experiments involved evaluating each set of models independently to ensure that the block-prediction models were trained on models from a similar domain to provide the most effective suggestions. For each domain, the evaluation process identifies the number of mutable blocks in the entire set and the number of times that SimGestion identified the removed block in the top six ranked suggestions. This provided an indication of SimGestion’s accuracy. Table 6.1 presents the detailed results of these experiments. While each domain is meant to be processed individually, the table also provides an overall accuracy score to indicate how SimGestion performs in the complete dataset.

Table 6.1: SimGestion Suggestion Accuracy

Domain           Mutable Blocks   Correct Suggestions
Automotive       2470             2044 (82.75%)
Avionics         7723             3210 (41.56%)
Electronics      67936            55225 (81.29%)
Energy           29411            24484 (83.25%)
Robotics         4695             3664 (78.04%)
Miscellaneous    866              706 (81.52%)
Total            113101           89190 (78.86%)

6.2.5 Discussion

Regarding the first research question, the results of the evaluation experiments have demonstrated that SimGestion can provide engineers with step-wise suggestions that are based on the analysis of a configurable model set. More specifically, the results show that, for the majority of domains, SimGestion’s accuracy is in the low 80s (percent). A notable outlier is the experiments with the avionics dataset, which also impacts the total accuracy score significantly. To elaborate, both individual prediction models, as well as their ensemble, have low prediction accuracy on the Avionics dataset, especially compared to the other domains. One possible explanation of this observation is that the Simulink models within the Avionics dataset are not as mutually similar as models in other data sets. As a result, for this dataset, the prediction models are trained poorly and hence make inaccurate predictions.

To evaluate SimGestion’s performance, these experiments employed accuracy as the performance metric instead of other metrics such as precision, recall, and F1-score, which are more commonly used to evaluate the performance of recommender systems [87]. This is because these metrics (precision, recall, and F1-score) are defined in terms of “selected” and “relevant” suggestions, and the relevance of a suggestion cannot be determined given the design of these experiments, which does not involve actual users or experts who could evaluate their relevance.

6.3 SimXample Evaluation

The evaluation of SimXample focuses on two aspects: evaluation of the visualization of the inferred suggestions to engineers (which items are shown / filtered), consistent with how code recommender systems are evaluated [88], and evaluation of the application of those suggestions to the engineers’ interfaces.

6.3.1 Data Preparation

The data-preparation process discovered 16 of the MDL-formatted models from the corpus were invalid, and removed them immediately. To facilitate Simone clone detection, the process then converted the SLX models to the MDL format required by Simone using Matlab’s conversion function. Five of the SLX-formatted models failed to convert to MDL format due to unsupported characters. After conversion, there were a total of 925 MDL-formatted models (557 original + 368 converted). A further analysis revealed that 259 of these models were immutable, thus incapable of being mutated for SimXample evaluation experiments. This produced a total of 666 valid, mutable, MDL-formatted Simulink models. Of these models, 414 models were incompatible with the latest Simone MDL grammar. In particular, the data-preparation process had to discard the entire “Electronics” dataset as almost all models in that set were incompatible with Simone. Following the removal of these models, the final test set for SimXample evaluation consisted of 252 unique MDL models. Having the data-preparation process remove such a large number of models from the dataset might raise concerns regarding the choice of the dataset and/or Simone as the clone detector. However, the number of resulting Simone-compatible models (252) following this removal was still large enough to conduct the evaluation experiments. The model files filtered out due to incompatibility with Simone could be included in the evaluation dataset by updating the Simone MDL grammar. However, updating the Simone MDL grammar file would be out of scope for this thesis.
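The chain of counts in this paragraph can be retraced with a short tally. The Python snippet below only re-derives the figures stated in the text, using the MDL and SLX totals from Section 6.1; the variable names are introduced here for illustration.

```python
# Counts stated in the text (Sections 6.1 and 6.3.1).
mdl_total, slx_total = 573, 373
invalid_mdl = 16
failed_slx_conversions = 5
immutable = 259
simone_incompatible = 414

mdl_valid = mdl_total - invalid_mdl                   # original MDL models kept
slx_converted = slx_total - failed_slx_conversions    # SLX models converted to MDL
all_mdl = mdl_valid + slx_converted                   # MDL-formatted models in total
mutable = all_mdl - immutable                         # valid, mutable MDL models
final_test_set = mutable - simone_incompatible        # Simone-compatible test set

print(mdl_valid, slx_converted, all_mdl, mutable, final_test_set)
```

Each intermediate value matches the corresponding number reported in the paragraph: 557, 368, 925, 666, and 252.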

6.3.2 Evaluation of Visualization of Inferred Suggestions

To validate correctness of the inferred suggestion visualization, SimXample evaluation employs mutation testing, a methodology that has been demonstrably successful in evaluation experiments [85], including clone-related experiments [89, 90]. Specifically, it determines if SimXample recommends the completed version of a mutated model as one of its top suggestions. This is consistent with recommender evaluation strategies that consider rank and retrieval [88]. It is important to stress that SimXample’s evaluation experiments are not conducting an evaluation of Simone’s clone detection results, as this has been accomplished in the past in a systematic evaluation of model clone detection tools [51]. Instead, these experiments evaluate how effectively SimXample configures Simone, and interprets and visualizes Simone’s clone report data for its purposes. Simone has multiple configuration parameters that need to be tuned properly to produce desired results for its application in SimXample. These parameters include the maximum difference (minimum similarity) threshold between the clones, minimum and maximum clone size, and the clone type. Thus, this evaluation aims to illustrate that SimXample uses the appropriate parameters and applies the correct sorting and filtering of Simone results to visualize optimal and appropriate model examples in a novel and useful manner.

This experiment first mutates each of the test models using existing established Simulink mutations [91] at random systems within the models. Specifically, it deletes a single randomly-selected block from each SUD to replicate an incomplete subsystem for which SimXample can provide suggestions. The experiment performs the mutation with the help of a Matlab script (rather than manually) to avoid any human bias in choosing which block to delete. Also, the script uses seeded randomization so that the process can easily be reproduced. Multiple/compound mutations would cause significant distance to the point of no longer being a clone, as was observed in the first round of experiments, which were discarded for the same reason, and thus are not relevant in the context of this experiment. This application of mutation is optimal for testing the inferred suggestion as the concern here is primarily with retrieving the closest match from the repository, which, by design, will be the unmutated model. Increasingly mutated models would yield a larger difference from the intended replacement, but would still be included in the result. This thesis work will make all these mutations and random seeds available in Miami’s public repository for replication purposes.

Using each of the 252 mutated models as MUDs, the evaluation experiment invoked SimXample each time with the matching domain set of models as the only selected repository to search for potential clones. For SimXample to provide a suggestion correctly and successfully, this experiment required that the completed subsystem from the original model be within the top 2 suggestions provided by SimXample, as this means the option is visible and available without engineers having to browse suggestions. While the ideal result is to have the model that was mutated be the top suggestion in each application, the similarity of some of the subsystems in the test set allows for a variety of suitable suggestions. For each input, the experiment involved observing the rank of the “correct” subsystem presented by SimXample and noting whether a correct suggestion was made. The intent of this evaluation is to ensure that SimXample interprets the clone detection results to return and visualize the appropriate recommendations.

6.3.3 Evaluation of the Replacement

This evaluation of the replacement focuses on the correctness of SimXample’s insertion of the SFS into the SUD. It considers a correct insertion as one in which the resulting SUD must contain exactly the elements from the SFS, with appropriate connections, and with notifications being displayed to the engineer when SimXample cannot make connections automatically.

To realize this evaluation, the experiment included simplifying the clone detection and suggestion portions of the process, as this was accomplished in the first part of the evaluation independently. As no single model was found in the model set which could be used directly as the MUD to test all five possible scenarios from Figure 5.3, this experiment first created a custom Simulink model based on the Matlab Central model Rotfe25x/Rotfe25x/models/sim tutorial.mdl as it required only a deletion of one specific block to make it possible to test each of the five possible cases. To prepare the repository set for evaluating the correctness of replacement, this second experiment modified the same base model deliberately to produce two different Simulink models. Each of these models contained clones of the MUD at various levels of containment. Thus, this custom MUD, together with the custom repository, made it possible to run SimXample in different locations within the experimental model to realize the 5 replacement options. The experiment included executing SimXample following the test scenarios in Table 6.2 and comparing the produced outcomes against the expected outcomes in that same table. This evaluation experiment served as a correctness test to ensure proper insertion of the SFS into the SUD.

Table 6.2: Evaluation Tests for the 5 Insertion Cases

#    Scenario                                            Expected Result
1    delete block at top level of MUD, choose            Entire MUD replaced with entire model
     suggestion of large top level model                 of SFS
2    delete block at top level of MUD, choose            Entire MUD replaced with contents of
     suggestion of the contained full system             SFS
3    delete block within subsystem with 1 inport         SUD replaced with SFS, lines
     and 1 outport, choose suggestion with               automatically connected
     matching port signature
4a   delete block within subsystem, choose               SUD replaced with SFS, prompted to
     non-exact suggestion, opt to proceed                manually adjust connections
4b   delete block within subsystem, choose               Replacement Aborted
     non-exact suggestion, opt to abort

6.3.4 Results

Table 6.3 presents the results of the validity experiments on inferred suggestions. For each of the five model sets, this table presents the number of models where clone detection was successful, and of those, the number of mutated models where SimXample found the original/unmutated corresponding model as a clone. Furthermore, the last column of the table narrows this number to indicate the instances where the correct clone is found in position 1 or 2, indicating instances where the engineer is not required to browse to find the correct suggestion, as it has been visualized on the initial results page.

Similarly, Table 6.2 presents the results of the evaluation of the insertion of the SFS into the MUD through the second experiment targeting the five potential insertion scenarios. This table compares each of the five resulting events against the expected results to determine its success. Of the five test scenarios, SimXample demonstrated correct insertion behavior in all five cases.

Table 6.3: Inferred Suggestion Accuracy

Domain           Models in set   Correct clone identified   Clone ranked in position 1 or 2
Automotive       63              59 (93.7%)                 56 (88.9%)
Avionics         69              66 (95.7%)                 61 (88.4%)
Energy           53              49 (92.5%)                 45 (84.9%)
Robotics         47              41 (87.2%)                 37 (78.7%)
Miscellaneous    101             86 (85.2%)                 70 (69.3%)

6.3.5 Discussion

The results of these experiments have answered the second research question presented in Chapter 1. SimXample has devised an approach that provides engineers with similar model examples based on analysis of a configurable model set. When it comes to the specific results, since the data sets are not mutually exclusive, calculating accuracy based on their totals is not possible.

This evaluation of the suggestions identified and visualized by SimXample demonstrates an average accuracy of 82.05% with the test set across all 5 domains. It is also worth calculating the accuracy after excluding the models in the “miscellaneous” set. The reason is that clone detection, and thus SimXample, is most useful when compared to similar sets/domains. After omitting this category, SimXample’s average accuracy increases to 85.23%. Furthermore, considering any time SimXample finds the correct clone, rather than in the first or second position only, the accuracy is increased to 90.83% overall and 92.25% when excluding the “miscellaneous” category. This accuracy is not necessarily the case in all contexts, but was employed for evaluation to confirm SimXample’s ability to interpret and process the model clone detection results to suggest the most appropriate subsystems.
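The four accuracy figures quoted above are consistent with an unweighted mean of the per-domain rates in Table 6.3, rather than a pooled ratio over all models. That reading is an inference from the reported numbers, not a statement taken from the evaluation scripts. The Python sketch below reproduces the figures under that assumption; `mean_accuracy` and the column constants are names introduced here for illustration.

```python
# Rows of Table 6.3: (models in set, correct clone identified, clone in position 1 or 2).
table_6_3 = {
    "Automotive": (63, 59, 56),
    "Avionics": (69, 66, 61),
    "Energy": (53, 49, 45),
    "Robotics": (47, 41, 37),
    "Miscellaneous": (101, 86, 70),
}

MODELS, IDENTIFIED, TOP_TWO = 0, 1, 2


def mean_accuracy(rows, column, exclude=()):
    """Unweighted mean of the per-domain rates, as a percentage with two decimals."""
    rates = [row[column] / row[MODELS]
             for name, row in rows.items() if name not in exclude]
    return round(100 * sum(rates) / len(rates), 2)


top_two_all = mean_accuracy(table_6_3, TOP_TWO)
top_two_without_misc = mean_accuracy(table_6_3, TOP_TWO, exclude=("Miscellaneous",))
identified_all = mean_accuracy(table_6_3, IDENTIFIED)
identified_without_misc = mean_accuracy(table_6_3, IDENTIFIED, exclude=("Miscellaneous",))
```

Averaging per domain avoids counting a model twice when it belongs to more than one of the non-mutually-exclusive sets, which fits the remark that accuracy cannot be calculated from the totals.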

Since these experiments based correctness on finding the original/unmutated model as a top-two suggestion, it might seem reasonable to expect 100% accuracy given a single block deletion and the use of a 30% difference threshold for clone detection. However, it is possible that the block-deletion mutation causes an over-mutation such that Simone is unable to find the correct suggestion. This, however, is a shortcoming of Simone’s sensitivity rather than SimXample’s application, interpretation, and visualization of the results. SimXample always displays the correct suggestion indicated by the Simone clone report.

In regard to performance, a general observation was that as the model repositories increased in size, the time required to perform clone detection increased as well. Thus, it is recommended to select only the most relevant and necessary models as repositories. Regarding the evaluation of the insertion, the experiment was designed to demonstrate coverage of the five possible outcomes of insertion. Since the act of insertion takes the form of a complete copy and replacement of the SFS into the SUD using atomic Matlab commands, this two-step operation completes in constant time.

The success of this first-time realization of an intelligent model assistant using model clones is indicative of significant positive impacts of virtual intelligent modeling assistance. Through reasonably accurate suggestion, visualization, and insertion of candidate subsystems, SimXample allows modelers to become more familiar with standard design principles and effective modeling practices by example. The well-documented gains associated with code completion implementations [92] are likely to transfer to users of modeling assistants such as SimXample. Furthermore, SimXample’s realization and visualization of suggestions through the use of model clone detection has significance on its own. It provides a new application for existing model clone detection tools and an accompanying novel visualization. While this approach employs Simone due to the earlier stated reasons, the modular nature of the SimXample process allows for any model clone detector to be used by tailoring Phase 1 to a respective detector.

6.4 Qualitative Evaluation via IMA Evaluation Grid

Mussbacher et al. describe their initial ideas for performing assessment of intelligent modeling assistance [77]. While they consider it only a first step, it is still prudent to apply their assessment grid to SimIMA as appropriate. Specifically, the student presenting this thesis work and one of their advisors independently self-assessed SimIMA along that grid and came together to reconcile final scores. Table 6.4 summarizes these scores. The reader can find more details on the analysis criteria in Mussbacher et al.’s work [77].

In regard to Quality of the IMA regarding models, we (the student and their advisor) gave SimIMA a score of 1 - Syntactic Quality as syntactic quality is enforced and always produced by SimXample and SimGestion. We could not give a score of 2 nor 3, as it depends on how the modeler chooses both to configure the repositories and employ the suggestions. Regarding Autonomy, SimIMA scores Level 2 - Narrowed Set. This is because SimIMA provides a list of multiple suggestions that have been narrowed down. When it comes to Relevance, the assessment grid considers accuracy as an acceptable metric. For SimGestion, our relevance is Level 79. For SimXample, our relevance is 82. The Confidence property from the grid for SimIMA is 100 as we always (100%) show a confidence value. Our Trust level for SimIMA is 0 as SimIMA gathers no information about the modeler’s trust. We leave such feedback as future work. Explainability for SimIMA is also Level 0 currently as we do not provide any insight into how its suggestions were produced. For the Quality Degree of SimIMA, we evaluated it at a Level 2 - Safe Lookup. This is because we consider the context and external sources. While we do not leverage syntactic or semantic descriptions, we do allow the engineer to tailor their predictive models based on tradeoffs. SimIMA has a Timeliness of Level 2 - Short running and regularly (out of a possible level 5). SimGestion’s FREQ model completes suggestions in roughly one second, and its retrieval time is independent of the training data size. The ARM model also computes suggestions in around one second; however, its retrieval time does increase with training data size. SimXample also takes a matter of seconds. It is not level 3 as it is not iterative, nor level 4 as it is not less than one second in running time. Regarding the Quality of SimIMA’s External Sources, we gave it Levels S1 - Project and infrastructure, A2 - Public, U3 - Periodically updated, and C2 - Curated. For S1, SimIMA uses the current MUD and SUD, and past (repositories) development artifacts. It also uses data, information, and knowledge about Simulink. We gave ourselves Level A2 as all repositories included with SimIMA are public, as are the ones from our experiments. We deemed Level U3 appropriate for SimIMA as we can easily update the standard repositories that we included with SimIMA regularly. Lastly, Level C2 seemed most appropriate since the repositories we include with SimIMA and used in our experiments are from a curated dataset.

Table 6.4: Summary of Qualitative Assessment

Property                                     SimIMA Score
Quality of IMA Regarding Models              Level 1 - Syntactic Quality
Autonomy                                     Level 2 - Narrowed Set
Relevance                                    Levels 79 and 82
Confidence                                   Level 100
Trust                                        Level 0 – Unidentified
Explainability                               Level 0 – No explanation
Quality Degree                               Level 2 - Safe Lookup
Timeliness                                   Level 2 – Short running and regularly
Quality of IMA Regarding External Sources    Level S1 - Project and infrastructure
                                             Level A2 - Public
                                             Level U3 - Periodically updated
                                             Level C2 - Curated

Chapter 7 Conclusion

This chapter first explores threats to validity for this work. It then discusses potential future work that can follow the completion of this thesis, and concludes with a summary of the research accomplished. Some of the contents of this chapter have been submitted to a SoSyM Theme Issue on AI-enhanced model-driven engineering [2]¹.

7.1 Threats to Validity

This section presents some external and internal threats to the validity of this research.

7.1.1 External Threats

An external threat to validity lies in the choice of Simulink to investigate and demonstrate the research originally envisioned by Stephan [11]. The vision of this research is language agnostic: nothing defined and realized in SimIMA is strongly language-specific. That is, while this research prototype exploits certain characteristics and features of Simulink, the general process is designed to be applicable and generalizable to any modeling language.

For SimXample, this is true for any language for which Type 3 model clone detection exists. Simulink features the most prevalent and mature Type 3 clone detection, which is one of the reasons for choosing it for this research. Until this general process is applied to another modeling language, its generality remains a threat to validity. However, more Type 3 model clone detectors need to be developed [93]. Similarly, the choice of the Simone clone detector for SimXample represents a threat to external validity. Simone has not been updated to accommodate recent Simulink versions. While this research motivates and justifies choosing Simone by its maturity and evaluation history, it is a threat that the research did not experiment with other clone detectors.

An additional external threat to validity is the way models were selected for the evaluation experiments. The previous chapter already discussed the motivation for choosing this set, including that it is large, publicly available, and curated. Using industrial models would reduce this threat, at the cost of public availability and replicability.

¹ Under review.

There is also the external threat of the experimenter effect. Specifically, the same researchers (the student and their advisors) who developed SimIMA also conducted its evaluation experiments. For the quantitative evaluations, the person who chose the evaluation model set (the advisor) was different from the person who developed SimIMA (the student). However, the experiments themselves were conducted by the same person who created SimIMA. The experimenter effect is even more of a concern in the qualitative evaluation, where two members (the student and their advisor) performed a self-assessment of SimIMA. This thesis acknowledges this threat to validity and leaves third-party assessment through the assessment grid as future work.

7.1.2 Internal Threats

One internal threat to validity pertains both to the determination of block-prediction model parameters and to the evaluation of the ensemble block-prediction model. Due to the size and nature of the domain data sets, the evaluation experiments employed 5-fold cross-validation instead of a larger 'k' value, such as 10. While both are acceptable, having a larger data set and a larger 'k' value would increase the support for the evaluation and research questions.
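The role of 'k' can be made concrete with a small sketch of how folds are formed. This is an illustration in base MATLAB, not the evaluation code used for SimIMA; the sample count, the random seed, and all variable names are assumptions chosen for the example.

```matlab
% Minimal k-fold partitioning sketch (base MATLAB, no toolboxes).
% nSamples, the seed, and all names here are hypothetical.
nSamples = 23;   % e.g. number of training rows in one domain data set
k = 5;           % number of folds

rng(0);                              % fixed seed so the shuffle is repeatable
order = randperm(nSamples);          % shuffled sample indices
foldId = mod(0:nSamples-1, k) + 1;   % round-robin fold labels 1..k

foldSizes = zeros(1, k);
heldOut = [];
for fold = 1 : k
    testIdx  = order(foldId == fold);   % samples held out in this fold
    trainIdx = order(foldId ~= fold);   % samples used for training
    % A real experiment would train on trainIdx and score on testIdx here.
    assert(numel(trainIdx) + numel(testIdx) == nSamples);
    foldSizes(fold) = numel(testIdx);
    heldOut = [heldOut testIdx]; %#ok<AGROW>
end

disp(foldSizes);   % number of held-out samples per fold
```

Every sample is held out exactly once. With k = 5 each model is trained on roughly 80% of the data, while k = 10 would raise that to roughly 90% at the cost of ten training runs instead of five.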

7.2 Future Work

From an evaluation perspective, there are a number of avenues that can be considered for future work. Now that this thesis has demonstrated the plausibility of these assistants and developed working prototypes, an obvious extension would be to work with industry to perform user evaluations. To better demonstrate the robustness and effectiveness of SimIMA, it would be beneficial to validate it on industrial production models, with Simulink practitioners providing feedback on the correctness of the suggestions and insertions. Applying SimIMA in a more real-world setting would also foster its use and promotion. This is not an indictment of the current evaluation. Rather, application of SimIMA is very context dependent, and industrial use cases would expand the context of the evaluation. Additionally, recruiting third-party researchers, possibly from the original RF-IMA assessment grid paper, to conduct an independent qualitative evaluation of SimIMA would be worthwhile.

As with most research, there is potential for user interface (UI) enhancements. Careful consideration went into the SimIMA design process, as did adherence to established guidelines for creating model recommendation systems [80]. However, conducting a study with Simulink practitioners to solicit feedback on the UI design and usability would be beneficial. Such feedback would be valuable for considering alternative UI designs and determining which features contribute the most to usability and design. The qualitative assessment also identified some areas of potential enhancement, such as providing more explanations and building trust through transparency.

7.3 Summary

This thesis describes the SimIMA research into the plausibility of providing software modeling assistance by applying machine learning and model clone detection to past artifacts. To that end, it answered two subquestions targeting different forms of assistance: providing stepwise (element-level) suggestions and providing similar example models as suggestions. These forms of assistance correspond to two components of the larger SimIMA assistant: SimGestion and SimXample. SimGestion employed ensemble learning by combining association rule mining and frequency information. SimXample configured, executed, and visualized model clone detection and (sub)system replacement.
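The association-rule-mining measures that SimGestion relies on (rule support, rule confidence, and lift, as defined in the comments of Listing A.2) can be worked through on the three-row example table given in that listing. The following is a minimal sketch rather than the thesis implementation; the choice of antecedent and consequent and the variable names `tbl`, `rowsAnt`, and `rowsCon` are assumptions made for illustration.

```matlab
% Worked example of support, confidence, and lift on the toy table
% from Listing A.2. Variable names here are hypothetical.
blockTypes = ["Gain", "Inport", "Outport", "Scope", "Sine", "Sum"];
tbl = logical([1 0 0 0 0 1;    % ss1: Sum, Gain
               0 1 0 1 0 1;    % ss2: Inport, Sum, Scope
               1 0 1 0 1 0]);  % ss3: Gain, Outport, Sine

antecedent = "Sum";    % block types already in the system under development
consequent = "Gain";   % candidate block type to suggest

antCols = ismember(blockTypes, antecedent);   % columns of the antecedent
conCol  = blockTypes == consequent;           % column of the consequent

rowsAnt = all(tbl(:, antCols), 2);   % rows containing every antecedent type
rowsCon = tbl(:, conCol);            % rows containing the consequent type
nRows   = size(tbl, 1);

armSupportRule    = sum(rowsAnt & rowsCon) / nRows;
armConfidenceRule = sum(rowsAnt & rowsCon) / sum(rowsAnt);  % NaN if no row matches
armSupportCon     = sum(rowsCon) / nRows;
armLiftRule       = armConfidenceRule / armSupportCon;

fprintf('support=%.3f confidence=%.3f lift=%.3f\n', ...
    armSupportRule, armConfidenceRule, armLiftRule);
```

Here one of the three rows contains both Sum and Gain, two rows contain Sum, and two contain Gain, so the rule Sum -> Gain has support 1/3, confidence 1/2, and lift 0.75.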

The experiments for the evaluation of SimIMA used a large, publicly available, independently validated, and curated data set to facilitate reproducibility. SimGestion's parameter tuning and evaluation employed k-fold cross-validation. In these experiments on various domains, SimGestion achieved a total accuracy of around 79%. Similarly, SimXample produced correct suggestions roughly 82% of the time. While the overall intent of this research was to establish the feasibility of modeling assistants employing machine learning and model clone detection, these accuracy results are very encouraging.

Appendix A Source Code

This appendix presents the MATLAB implementation of various classes and functions used in the project.

A.1 Class Definitions

Listing A.1: AppState.m class definition

% This class tracks the "state" and/or "config" of appSimxample and appSimgestion.
% Such tracking is necessary to
% - initialize the state from last saved state of the app.
% - 'pass' state information across apps and functions or between appSimxample
%   and appSimgestion (eg. which repos are selected)

classdef AppState
    properties
        % although some of these properties are constant, we still keep
        % them in this 'state' class, so that they can be accessed from both
        % inside the app and outside the app (in functions that
        % create/update the block-prediction models) through the same
        % shared-var. Otherwise, we would have to hardcode the same value
        % (eg. defRepoPath1 in two places -- once inside the app, and also
        % in the library functions)

        simxampleSimoneMinSize = 10;
        simxampleSimoneMaxSize = 20000;
        simxampleSimoneRename = "blind";
        simxampleNSuggsMax = 10;
        simxampleSimoneDiffLimit = 30; % maximum percentage difference limit between clones

        % This flag is set/unset by createMdlFileForSimone()
        simxampleSavedAsMdl = false;

        simgestionNSuggsMax = 6; % number

        % There will be a trade-off between speed and accuracy.
        % accuracy level can be one of following:
        % 1: less accurate (but fast) -- uses FreqModel only

        % 2: medium accurate (medium fast) -- uses ArmModel only
        % 3: very accurate (but slow) -- uses ensemble of FreqModel and ArmModel
        simgestionAccuracyLevel = 3; % integer in range [1,2,3]

        % only the 'miscellaneous' repo (repo 6) is checked by default
        defRepo1Checked = false;
        defRepo2Checked = false;
        defRepo3Checked = false;
        defRepo4Checked = false;
        defRepo5Checked = false;
        defRepo6Checked = true;

        % these paths are relative to simvmaPath
        % We don't store the absolute path here so that the code does not
        % become machine-specific
        defRepoPath1 = "default-repos/automotive"; % constant
        defRepoPath2 = "default-repos/avionics"; % constant
        defRepoPath3 = "default-repos/electronics"; % constant
        defRepoPath4 = "default-repos/energy"; % constant
        defRepoPath5 = "default-repos/robotics"; % constant
        defRepoPath6 = "default-repos/miscellaneous"; % constant

        % unlike default repopaths, these paths are absolute
        cstRepoPath1 = "";
        cstRepoPath2 = "";
        cstRepoPath3 = "";
        cstRepoPath4 = "";
        cstRepoPath5 = "";

        % While SimGestion can work with either MDL or SLX files in the
        % repos, SimXample works with only MDL files. So, we track them
        % both.

        nMdlDefRepoPath1 % number
        nMdlDefRepoPath2 % number
        nMdlDefRepoPath3 % number
        nMdlDefRepoPath4 % number
        nMdlDefRepoPath5 % number
        nMdlDefRepoPath6 % number

        nSlxDefRepoPath1 % number
        nSlxDefRepoPath2 % number
        nSlxDefRepoPath3 % number
        nSlxDefRepoPath4 % number
        nSlxDefRepoPath5 % number
        nSlxDefRepoPath6 % number

        nMdlSlxDefRepoPath1 % number
        nMdlSlxDefRepoPath2 % number
        nMdlSlxDefRepoPath3 % number

        nMdlSlxDefRepoPath4 % number
        nMdlSlxDefRepoPath5 % number
        nMdlSlxDefRepoPath6 % number

        nMdlCstRepoPath1 % number
        nMdlCstRepoPath2 % number
        nMdlCstRepoPath3 % number
        nMdlCstRepoPath4 % number
        nMdlCstRepoPath5 % number

        nSlxCstRepoPath1 % number
        nSlxCstRepoPath2 % number
        nSlxCstRepoPath3 % number
        nSlxCstRepoPath4 % number
        nSlxCstRepoPath5 % number

        nMdlSlxCstRepoPath1 % number
        nMdlSlxCstRepoPath2 % number
        nMdlSlxCstRepoPath3 % number
        nMdlSlxCstRepoPath4 % number
        nMdlSlxCstRepoPath5 % number

        % Unlike for standard-block-prediction-models, we don't track
        % whether custom-block-prediction-models are reset or not. This is
        % because, custom-block-prediction-models are ALWAYS updated when
        % the function updatePredModels() is called.

        % these paths are relative to simvma
        % these will be used to decide whether or not to invoke
        % updatePredModels() from inside appAdmin
        relpathsOfMdlSlxInDefRepo1 % list of strings
        relpathsOfMdlSlxInDefRepo2 % list of strings
        relpathsOfMdlSlxInDefRepo3 % list of strings
        relpathsOfMdlSlxInDefRepo4 % list of strings
        relpathsOfMdlSlxInDefRepo5 % list of strings
        relpathsOfMdlSlxInDefRepo6 % list of strings

        % these paths are absolute
        abspathsOfMdlSlxInCstRepo1 % list of strings
        abspathsOfMdlSlxInCstRepo2 % list of strings
        abspathsOfMdlSlxInCstRepo3 % list of strings
        abspathsOfMdlSlxInCstRepo4 % list of strings
        abspathsOfMdlSlxInCstRepo5 % list of strings

        % this flag is set by:
        % - resetModels()
        % - resetModelsAndCache()
        % and, unset by
        % - updateDefBlockPredModelsAfterChangesInDefRepos()
        defBlockPredModelsReset = false; % boolean


    end

    methods

        function obj = AppState()
            % this information (filepaths, count) needs to be updated after every change in
            % repos. So, we put the corresponding code in public method
            % setModelFilePathsAndCounts() rather than in the constructor itself.
            %
            % Although default repos are not supposed to be changed, this
            % function updates them too (recomputes the same value). This
            % is to keep the code simple, and is affordable as the
            % computation cost is not high.
            obj = obj.setModelFilePathsAndCounts();
        end

        function obj = setModelFilePathsAndCounts(obj)
            % This method does not affect the repopaths.
            obj.relpathsOfMdlSlxInDefRepo1 = removeSimvmaPathPrefix(searchFilesRecursively(obj.defRepoPath1, ["mdl", "slx"]));
            obj.relpathsOfMdlSlxInDefRepo2 = removeSimvmaPathPrefix(searchFilesRecursively(obj.defRepoPath2, ["mdl", "slx"]));
            obj.relpathsOfMdlSlxInDefRepo3 = removeSimvmaPathPrefix(searchFilesRecursively(obj.defRepoPath3, ["mdl", "slx"]));
            obj.relpathsOfMdlSlxInDefRepo4 = removeSimvmaPathPrefix(searchFilesRecursively(obj.defRepoPath4, ["mdl", "slx"]));
            obj.relpathsOfMdlSlxInDefRepo5 = removeSimvmaPathPrefix(searchFilesRecursively(obj.defRepoPath5, ["mdl", "slx"]));
            obj.relpathsOfMdlSlxInDefRepo6 = removeSimvmaPathPrefix(searchFilesRecursively(obj.defRepoPath6, ["mdl", "slx"]));

            obj.abspathsOfMdlSlxInCstRepo1 = searchFilesRecursively(obj.cstRepoPath1, ["mdl", "slx"]);
            obj.abspathsOfMdlSlxInCstRepo2 = searchFilesRecursively(obj.cstRepoPath2, ["mdl", "slx"]);
            obj.abspathsOfMdlSlxInCstRepo3 = searchFilesRecursively(obj.cstRepoPath3, ["mdl", "slx"]);
            obj.abspathsOfMdlSlxInCstRepo4 = searchFilesRecursively(obj.cstRepoPath4, ["mdl", "slx"]);
            obj.abspathsOfMdlSlxInCstRepo5 = searchFilesRecursively(obj.cstRepoPath5, ["mdl", "slx"]);

            obj.nMdlDefRepoPath1 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath1), 'mdl');
            obj.nMdlDefRepoPath2 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath2), 'mdl');
            obj.nMdlDefRepoPath3 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath3), 'mdl');
            obj.nMdlDefRepoPath4 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath4), 'mdl');

            obj.nMdlDefRepoPath5 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath5), 'mdl');
            obj.nMdlDefRepoPath6 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath6), 'mdl');

            obj.nSlxDefRepoPath1 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath1), 'slx');
            obj.nSlxDefRepoPath2 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath2), 'slx');
            obj.nSlxDefRepoPath3 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath3), 'slx');
            obj.nSlxDefRepoPath4 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath4), 'slx');
            obj.nSlxDefRepoPath5 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath5), 'slx');
            obj.nSlxDefRepoPath6 = getNFilesRecursively(addSimvmaPathPrefix(obj.defRepoPath6), 'slx');

            obj.nMdlSlxDefRepoPath1 = length(obj.relpathsOfMdlSlxInDefRepo1);
            obj.nMdlSlxDefRepoPath2 = length(obj.relpathsOfMdlSlxInDefRepo2);
            obj.nMdlSlxDefRepoPath3 = length(obj.relpathsOfMdlSlxInDefRepo3);
            obj.nMdlSlxDefRepoPath4 = length(obj.relpathsOfMdlSlxInDefRepo4);
            obj.nMdlSlxDefRepoPath5 = length(obj.relpathsOfMdlSlxInDefRepo5);
            obj.nMdlSlxDefRepoPath6 = length(obj.relpathsOfMdlSlxInDefRepo6);

            obj.nMdlCstRepoPath1 = getNFilesRecursively(obj.cstRepoPath1, 'mdl');
            obj.nMdlCstRepoPath2 = getNFilesRecursively(obj.cstRepoPath2, 'mdl');
            obj.nMdlCstRepoPath3 = getNFilesRecursively(obj.cstRepoPath3, 'mdl');
            obj.nMdlCstRepoPath4 = getNFilesRecursively(obj.cstRepoPath4, 'mdl');
            obj.nMdlCstRepoPath5 = getNFilesRecursively(obj.cstRepoPath5, 'mdl');

            obj.nSlxCstRepoPath1 = getNFilesRecursively(obj.cstRepoPath1, 'slx');
            obj.nSlxCstRepoPath2 = getNFilesRecursively(obj.cstRepoPath2, 'slx');
            obj.nSlxCstRepoPath3 = getNFilesRecursively(obj.cstRepoPath3, 'slx');
            obj.nSlxCstRepoPath4 = getNFilesRecursively(obj.cstRepoPath4, 'slx');
            obj.nSlxCstRepoPath5 = getNFilesRecursively(obj.cstRepoPath5, 'slx');

            obj.nMdlSlxCstRepoPath1 = length(obj.abspathsOfMdlSlxInCstRepo1);
            obj.nMdlSlxCstRepoPath2 = length(obj.abspathsOfMdlSlxInCstRepo2);
            obj.nMdlSlxCstRepoPath3 = length(obj.abspathsOfMdlSlxInCstRepo3);
            obj.nMdlSlxCstRepoPath4 = length(obj.abspathsOfMdlSlxInCstRepo4);

            obj.nMdlSlxCstRepoPath5 = length(obj.abspathsOfMdlSlxInCstRepo5);

        end

        function obj = cleanCustomRepoPaths(obj)
            % this function does the following jobs:
            % - make sure the custom repo paths do exist (otherwise set them to "" and
            %   corresponding mdl/slx file count to 0)
            % - make sure that app.newState.cstRepoPathX is set (to some non-empty value)
            %   before app.newState.cstRepoPathY, where X < Y


            n = 5; % custom repopaths

            % make sure specified custom repo paths do exist, else reset them
            for i = 1 : n
                propCstRepoPath = "cstRepoPath" + i;
                propNMdlCstRepoPath = "nMdlCstRepoPath" + i;
                propNSlxCstRepoPath = "nSlxCstRepoPath" + i;
                propNMdlSlxCstRepoPath = "nMdlSlxCstRepoPath" + i;

                if ~exist(obj.(propCstRepoPath), 'dir')
                    obj.(propCstRepoPath) = "";
                    obj.(propNMdlCstRepoPath) = 0;
                    obj.(propNSlxCstRepoPath) = 0;
                    obj.(propNMdlSlxCstRepoPath) = 0;
                end
            end

            % make sure obj.cstRepoPathX is set before obj.cstRepoPathY
            % where X < Y
            for i = 1 : n
                for j = i+1 : n
                    propCstRepoPath1 = "cstRepoPath" + i;
                    propNMdlCstRepoPath1 = "nMdlCstRepoPath" + i;
                    propNSlxCstRepoPath1 = "nSlxCstRepoPath" + i;
                    propNMdlSlxCstRepoPath1 = "nMdlSlxCstRepoPath" + i;

                    propCstRepoPath2 = "cstRepoPath" + j;
                    propNMdlCstRepoPath2 = "nMdlCstRepoPath" + j;
                    propNSlxCstRepoPath2 = "nSlxCstRepoPath" + j;
                    propNMdlSlxCstRepoPath2 = "nMdlSlxCstRepoPath" + j;

                    % this is a cool way to access properties by their
                    % string representation.
                    if obj.(propCstRepoPath1) == "" && obj.(propCstRepoPath2) ~= ""
                        obj.(propCstRepoPath1) = obj.(propCstRepoPath2);
                        obj.(propNMdlCstRepoPath1) = obj.(propNMdlCstRepoPath2);

                        obj.(propNSlxCstRepoPath1) = obj.(propNSlxCstRepoPath2);
                        obj.(propNMdlSlxCstRepoPath1) = obj.(propNMdlSlxCstRepoPath2);
                        obj.(propCstRepoPath2) = "";
                        obj.(propNMdlCstRepoPath2) = 0;
                        obj.(propNSlxCstRepoPath2) = 0;
                        obj.(propNMdlSlxCstRepoPath2) = 0;
                    end
                end
            end
        end

    end
end
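The second half of cleanCustomRepoPaths() in Listing A.1 moves non-empty custom repository paths to the front while preserving their order. The same effect can be illustrated on plain arrays with logical indexing. This is only a sketch of the idea and not a drop-in replacement for the class method; the example paths, counts, and variable names are hypothetical.

```matlab
% Order-preserving compaction of custom repo slots (illustrative only).
% The paths and counts below are made-up values.
paths = ["", "/data/repoA", "", "/data/repoB", ""];   % 5 custom repo slots
nMdl  = [0, 12, 0, 7, 0];                             % MDL count per slot

keep   = paths ~= "";                  % slots that hold a path
nEmpty = numel(paths) - nnz(keep);     % slots to pad at the end

% Non-empty entries first (original order kept), then empty padding.
compactPaths = [paths(keep), repmat("", 1, nEmpty)];
compactNMdl  = [nMdl(keep),  zeros(1, nEmpty)];

disp(compactPaths);
disp(compactNMdl);
```

Keeping the paths and their counts in parallel arrays, rather than in numbered properties, is what makes the single indexing step possible; the listing instead uses dynamic property names because the state is stored in individually numbered properties.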

Listing A.2: ArmModel.m class definition

% PUBLIC METHODS:
% ------
% CONSTRUCTOR : returns an 'empty' model
% trainByFilepath : update model by training with 1/more mdlSlx files
% trainByFilehash : update model by training with 1/more mdlSlx files' hash values
% predict : return BlockSuggFromPredModel objects


classdef ArmModel
    properties

        % "data" is a struct such that
        % each field denotes a (sub)system contained in the training models
        % the corresponding field value is a list of BlockTypes (strings)
        % contained in that (sub)system.
        % Since a Simulink model potentially contains multiple sub-systems,
        % it can give multiple such fields.
        % For an instance of ArmModel, data contains one entry for each
        % (sub)system contained in the training mdlSlx files

        % data may look like follows:
        % data = {
        %   'ss1' : ["Sum", "Gain"],
        %   'ss2' : ["Inport", "Sum", "Scope"],
        %   'ss3' : ["Gain", "Outport", "Sine"], ...
        % }

        data = struct;

        % "table" is a result of "processing" 'data'.
        % A table generated after training an armModel on several mdlSlx files
        % containing above data can be visualized as follows:
        %

        %   Gain  Inport  Outport  Scope  Sine  Sum
        %   ---------------------------------------
        %   1     0       0        0      0     1
        %   0     1       0        1      0     1
        %   1     0       1        0      1     0

        % Thus,
        % - 'table' serves as the table on which we can run the
        %   "standard" association-rule-mining algorithm
        % - For convenience, block types are sorted alphabetically (note
        %   that 'table' won't contain the first row i.e. the blockTypes'
        %   names). It only contains a 2D matrix of 0s and 1s such that each
        %   training row is represented by a row of the matrix. Which
        %   blockTypes are present in this table's data is resolved by the
        %   attribute 'blockTypes' which stores the blockTypes in a list
        %   sorted alphabetically. Thus, the first row in above
        %   visualization of table is actually stored in the attribute
        %   'blockTypes'.
        table; % 2D boolean array
        % 'blockTypes' stores the header row of table (read comment for
        % 'table')
        blockTypes = string.empty;

        % hash values of all mdlSlx files with which the model has been
        % trained successfully
        hashTrainingFiles = string.empty;

        % When "finding" Association Rule Mining rules, the antecedent
        % block-type count must be >= this fraction. For detailed comments,
        % see method predict()
        ANT_REDUCTION = 0.5; % tuned experimentally to maximize prediction accuracy
    end

    methods (Access = public)

        function obj = ArmModel()
            % This constructor creates an UNTRAINED instance of ArmModel.
            % The returned ArmModel instance needs to be trained using
            % the trainByFilepath() or trainByFilehash() method
        end

        function obj = trainByFilepath(obj, mdlSlxAbspaths, verbose)
            % Update the model by adding training data from each mdlSlx file
            %
            % PARAMETERS:
            % ------
            % mdlSlxAbspaths (str or list_of_str) : absolute path of mdlSlx file
            % verbose (boolean) : if true, details are printed.
            %

            % IMPORTANT:
            % ------
            % When training the model with multiple mdlSlx files, call this
            % method once with all the mdlSlx paths passed (rather than
            % calling this method multiple times, with 1 mdlSlx path at a
            % time). This will speed up computation time as in that case,
            % opened/loaded models need to be reloaded only once

            % APPROACH:
            % ------
            % Training is accomplished in 2 phases.
            % PHASE 1: The following are set:
            %   - data
            %   - hashTrainingFiles
            % PHASE 2: The following are set:
            %   - blockTypes
            %   - table

            warning off;
            % these are required to restore the state of previously
            % loaded/open models
            openPaths = getOpenModelsAbsFilepaths();
            loadedOnlyPaths = getLoadedOnlyModelsAbsFilepaths();
            bdclose('all');

            % PHASE 1 (set data, hashTrainingFiles)
            for i = 1 : length(mdlSlxAbspaths)
                mdlSlxAbspath = mdlSlxAbspaths(i);
                if verbose
                    dispSpacedAbove("Count : " + i + "/" + length(mdlSlxAbspaths));
                    disp("path : " + mdlSlxAbspath);
                end
                [status, fileHash, data_] = obj.extractDetailsAndUpdateCache(mdlSlxAbspath, verbose);
                if verbose, disp("status : " + status); end

                if status == "SUCCESS"
                    obj.data = mergeArmData(obj.data, data_);
                    obj.hashTrainingFiles = [obj.hashTrainingFiles fileHash];
                end
            end

            % PHASE 2 (set blockTypes, table)
            obj = obj.setBlockTypesAndTable();

            % restore previously loaded/open models
            open_system(openPaths);
            load_system(loadedOnlyPaths);
            warning on;
        end


        function obj = trainByFilehash(obj, fileHashes, verbose)
            % Update the model by adding training data from each mdlSlx file.
            % Duplicate fileHashes (i.e. those fileHashes with which the
            % model is already trained) will be ignored.
            %
            % This method is intended to be used for merging 2 or more
            % ArmModels. This operation is quick because all data is
            % available in the cache.
            %
            % ASSUMPTION:
            % ------
            % - all fileHashes exist in the cache file: shared-vars/simvma_armCache.mat
            %
            % PARAMETERS:
            % ------
            % fileHashes (str or list_of_str) : hash values of mdlSlx files' content
            % verbose (boolean) : if true, details are printed.

            cache = getSharedVar('simvma_armCache');
            if verbose, dispKeyVal('#hashes', length(fileHashes)); end

            % PHASE 1 (set data, hashTrainingFiles)
            for i = 1 : length(fileHashes)
                fileHash = fileHashes(i);
                % we avoid training with duplicate files
                if any(strcmp(obj.hashTrainingFiles, fileHash))
                    status = "DUPLICATE";
                else
                    if ~isfield(cache, fileHash)
                        error("From trainByFilehash: No cache data found for : " + fileHash);
                    end
                    status = cache.(fileHash).status;
                    if status == "SUCCESS"
                        data_ = cache.(fileHash).data;
                        obj.data = mergeArmData(obj.data, data_);
                        obj.hashTrainingFiles = [obj.hashTrainingFiles fileHash];
                    end
                end

                if verbose
                    dispSpacedAbove("Count : " + i + "/" + length(fileHashes));
                    disp("hash : " + fileHash);
                    disp("status : " + status);
                end
            end

            % PHASE 2 (set blockTypes, table)

            obj = obj.setBlockTypesAndTable();
        end


        function [suggs, time, n, r, blockTypesAnt] = predict(obj, context)
            % Returns:
            % 1. a list of BlockSuggFromPredModel objects, sorted by
            %    confidence
            % 2. the execution time (in seconds).
            % 3. n (of C(n,r))
            % 4. r (of C(n,r))
            % 5. the antecedent block types of the last combination tried
            %
            % If no suggestions are possible, it will return an empty list
            % of BlockSuggFromPredModel.
            %
            % Each suggestion will have confidence > 0
            %
            % PARAMETERS:
            % ------
            % context (Context object)

            % This method will internally call one or more private methods,
            % each of which will return suggestions based on various
            % criteria. Then, based on a weighted-model, this method will
            % prepare final list of suggestions and return them.



            % OBSERVATION: It is not very likely to find rows in the table
            % with all blockTypes in SUD present (This is especially true
            % if the training dataset is not big enough). As a result nRowsAnt
            % often becomes zero, thus not producing any suggestion. To
            % circumvent this, we "relax" the antecedent by choosing to remove
            % some of the blockTypes present in SUD from the antecedent. In
            % doing so, we try all possible combinations for a fixed tuned
            % ANT_REDUCTION -- the MRCB block type is always present in each
            % combination because we regard it to be more important than others.
            % If at least 1 suggestion (which is not already present in SUD)
            % is found from a particular combination, we stop immediately.


            tStart = tic;

            suggs = BlockSuggFromPredModel.empty;
            blockTypesAnt = string.empty;

            blockTypesInSudExceptMrcbBlockType = setdiff(context.blockTypesInSud, context.mrcbBlockType);
            n = length(blockTypesInSudExceptMrcbBlockType);


            r = floor(n * obj.ANT_REDUCTION);
            % combination of n taken r at a time
            combinations = combntns(blockTypesInSudExceptMrcbBlockType, r); % C(n,r) : 2D array -- each row is a combination
            [nRows, nCols] = size(combinations);
            for row = 1 : nRows
                blockTypesAnt = combinations(row,:);
                % we regard mrcb's block type as more important than
                % others. So, we include it in all combinations
                blockTypesAnt = [blockTypesAnt context.mrcbBlockType];
                suggsAll = obj.predictByBlockTypesInSud(blockTypesAnt);

                % filter out those suggestions with block-types already
                % present in SUD
                suggs = BlockSuggFromPredModel.empty;
                for i = 1 : length(suggsAll)
                    sugg = suggsAll(i);
                    if ~any(strcmp(sugg.blockType, context.blockTypesInSud))
                        suggs = [suggs sugg];
                    end
                end

                if ~isempty(suggs)
                    time = toc(tStart);
                    % to compensate for MRCB
                    n = n + 1;
                    r = r + 1;
                    return;
                end
            end
            % to compensate for MRCB
            n = n + 1;
            r = r + 1;
            time = toc(tStart);
        end

    end

    methods (Access = private)

        function obj = setBlockTypesAndTable(obj)
            % Set the attributes 'blockTypes' and 'table'

            % set blockTypes
            obj.blockTypes = string.empty;
            for i = 1 : length(fields(obj.data))
                obj.blockTypes = [obj.blockTypes obj.data.("ss" + i)];
            end
            % merge and sort alphabetically
            obj.blockTypes = unique(obj.blockTypes);


            % set table
            obj.table = logical.empty;
            for i = 1 : length(fields(obj.data))
                dataRow = obj.data.("ss" + i);
                tableRow = logical.empty;
                for j = 1 : length(obj.blockTypes)
                    blockType = obj.blockTypes(j);
                    % if a particular blockType is present in a training
                    % data row, represent it by 1 (true), otherwise by 0
                    % (false).
                    if any(strcmp(dataRow, blockType))
                        tableRow = [tableRow true];
                    else
                        tableRow = [tableRow false];
                    end
                end
                obj.table = [obj.table; tableRow]; % append new row
            end
        end

        function [status, fileHash, data] = extractDetailsAndUpdateCache(obj, mdlSlxAbspath, verbose)
            % Extract details from a mdlSlx file.
            % This method also updates the cache stored in shared-var/simvma_armCache.mat, if required

            %
            % PARAMETERS:
            % mdlSlxAbspath (str) : absolute path of mdlSlx file
            %
            % RETURNS:
            % ------
            % status (str):
            %   "INVALID_PATH" : mdlSlxAbspath does not exist
            %   "INVALID_MDL" : invalid mdlSlx file (cannot be loaded)
            %   "DUPLICATE" : a mdlSlx file with the same content has
            %       already been used to train the model
            %   "SUCCESS" : model was updated successfully by training
            %       with this mdlSlx file
            %   "FAIL" : something went wrong during training
            %       the model with this mdlSlx file (although the
            %       mdlSlx file was loaded successfully).
            % fileHash (str): hash value of content;
            %   NaN if status == "INVALID_PATH"
            % data (struct) : same as ArmModel.data, but trained on 1 file
            %   NaN if status != "SUCCESS"

            %
            % IMPORTANT:
            % ------

            % This method does not preserve the state of previously loaded
            % simulink models, if any

            % make sure no simulink model is loaded (as it might introduce
            % some hard-to-debug errors)
            bdclose('all');

            fileHash = NaN; % default
            data = NaN; % default

            if ~exist(mdlSlxAbspath, 'file')
                status = "INVALID_PATH";
                return;
            end

            fileHash = hashByFilepath(mdlSlxAbspath);
            if any(strcmp(obj.hashTrainingFiles, fileHash))
                status = "DUPLICATE";
                return;
            end

            % first, see if it is in cache
            cache = getSharedVar('simvma_armCache');
            if isfield(cache, fileHash)
                if verbose, disp('Using cache'); end
                status = cache.(fileHash).status;
                data = cache.(fileHash).data;
                return;
            end

            % not found in cache; need to extract data
            if verbose, disp('Not found in cache'); end

            try
                load_system(mdlSlxAbspath)
            catch ME
                status = "INVALID_MDL";
                return
            end

            data = struct;

            %%%%%%%%%%%% uncomment to debug try-catch block %%%%%%%%%%%%%%%
            % modelName = bdroot;
            % bdParams = getParamsByPath(modelName); % block diagram params
            % bdHandle = bdParams.Handle;
            % data = obj.extractDetailsHelper(data, bdHandle, modelName);
            % status = "SUCCESS";
            %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

            try

                modelName = bdroot;
                bdParams = getParamsByPath(modelName); % block diagram params
                bdHandle = bdParams.Handle;
                data = obj.extractDetailsHelper(data, bdHandle, modelName);
                status = "SUCCESS";
            catch ME
                status = "FAIL";
            end

            % update cache
            cache.(fileHash).status = status;
            cache.(fileHash).data = data;
            setSharedVar('simvma_armCache', cache);
        end

        function data = extractDetailsHelper(obj, data, handle, parentPath)
            % this method updates only data attribute
            % handle: either top-level system handle or sub-system handle
            % parentPath: parent system's path

            params = getParamsByHandle(handle, false);
            blocks = params.Blocks;

            % add 1 entry in data for all non-System type blocks
            newRow = string.empty;
            for i = 1 : length(blocks)
                block = blocks(i);
                % some block-names contain the character '/'. But since
                % '/' is used as a separator between parent and child
                % blocks, '/' in a block-name is replaced with '//'
                block = strrep(block, '/', '//');
                blockPath = parentPath + "/" + block;
                blockHandle = get_param(blockPath, 'handle');
                blockParams = getParamsByHandle(blockHandle, false);
                blockType = string(blockParams.BlockType);
                if blockType ~= "SubSystem"
                    newRow = [newRow blockType];
                    newRow = unique(newRow); % remove duplicates
                else % handle subsystem recursively
                    data = obj.extractDetailsHelper(data, blockHandle, blockPath);
                end
            end

            nFields = length(fields(data));
            % add new field to struct i.e. add a new training data tuple
            data.("ss" + (nFields + 1)) = newRow;
        end

        function suggs = predictByBlockTypesInSud(obj, blockTypesInSud)
            % Returns a list of BlockSuggFromPredModel objects based on

            % the current context, sorted by confidence.
            %
            % All returned suggestions have confidence > 0.
            %
            % If no suggestions are possible, it will return an empty list
            % of BlockSuggFromPredModel.
            %
            % PARAMETERS:
            % -----------
            % blockTypesInSud (list of strings): block types present in the SUD

            % APPROACH:
            % - create BlockSuggFromPredModel instances for each blockType
            %   present in training dataset, but not in SUD.
            % - Compute armSupportRule, armConfidenceRule, and armSupportCon
            %   for each blockType (not in SUD) as the consequent, and all
            %   blocks in SUD as the antecedent.
            %   - we use prefix 'arm' to distinguish the ARM's confidence
            %     from BlockSuggFromPredModel.confidence
            %   - also, we use postfix 'Rule' to distinguish the arm-support
            %     of the 'rule' from the arm-support of a 'blockType'
            % - Compute armLiftRule (from armConfidenceRule and
            %   armSupportCon)
            % - Compute block-prediction confidence for each predicted
            %   type based on (one or more of) armSupportRule,
            %   armConfidenceRule, and armLiftRule
            % - sort BlockSuggFromPredModel objects by confidence

            % For each predicted blockType, compute armSupportRule,
            % armConfidenceRule, armSupportCon, and armLiftRule
            % NOTATIONS:
            %   Ant(ecedent): all blocks currently present in SUD
            %   Con(sequent): each block (one at a time) present in training
            %                 dataset but not in SUD
            %
            % DEFINITIONS:
            %
            %   armSupportRule    = (nRows with both Ant and Con = 1) / nRows
            %   armConfidenceRule = (nRows with both Ant and Con = 1) / (nRows with Ant = 1)
            %   armSupportCon     = (nRows with Con = 1) / nRows
            %   armLiftRule       = armConfidenceRule / armSupportCon
            %

            [nRows, nCols] = size(obj.table);

            % indices of antecedent columns in table
            antIndices = [];
            for i = 1 : length(blockTypesInSud)
                blockType = blockTypesInSud(i);
                index = find(obj.blockTypes == blockType);
                % if this particular block-type is not present in the
                % training dataset, index will be set to
                % '1x0 empty double row vector', otherwise it will be set
                % to some 'double' value.

                % We IGNORE any blocktype not present in the training
                % dataset (otherwise we can't use armModel at all)
                % i.e. we act as if it were not present in the SUD
                % so that we can define our antecedent with only those
                % block-types present in both training dataset and SUD
                blockExistsInTrainingDataset = any(index);
                if blockExistsInTrainingDataset
                    antIndices = [antIndices index];
                end
            end

            % a sub-matrix of table containing only those rows for which
            % all antecedent column values are 1
            tableAnt = obj.table;
            for i = 1 : length(antIndices)
                index = antIndices(i);
                col = tableAnt(:, index);
                tableAnt = tableAnt(col == true, :);
            end

            % nRowsAnt: # rows with Antecedent = true;
            [nRowsAnt, nCols] = size(tableAnt);

            % block-types for creating suggestions
            % NOTE: armModel never suggests a blocktype already present in SUD
            blockTypes_ = setdiff(obj.blockTypes, blockTypesInSud);
            suggs = BlockSuggFromPredModel.empty;
            for i = 1 : length(blockTypes_)
                blockType = blockTypes_(i);

                conIndex = find(obj.blockTypes == blockType); % consequent's index

                % FIND nRowsAntCon (to compute armSupportRule, armConfidenceRule)
                col = tableAnt(:, conIndex);
                % a sub-matrix of table containing only those rows for which
                % all antecedent as well as the consequent column values are 1
                tableAntCon = tableAnt(col == true, :);
                [nRowsAntCon, nCols] = size(tableAntCon);

                % FIND nRowsCon (to compute armSupportCon, armLiftRule)
                col = obj.table(:, conIndex);
                % a sub-matrix of obj.table containing only those rows for
                % which the consequent column value is 1
                tableCon = obj.table(col == true, :);
                [nRowsCon, nCols] = size(tableCon);

                % COMPUTE ARM METRICS (support, confidence, lift)
                armSupportRule = nRowsAntCon / nRows;
                armConfidenceRule = nRowsAntCon / nRowsAnt;
                % in case nRowsAnt (and hence nRowsAntCon) is 0, we get 0/0 = NaN
                if isnan(armConfidenceRule)
                    % we've slightly modified the definition of confidence in this case
                    armConfidenceRule = 0;
                end

                % only armConfidenceRule is used as the prediction confidence here
                if armConfidenceRule > 0
                    sugg = BlockSuggFromPredModel(blockType, armConfidenceRule);
                    suggs = [suggs sugg];
                end
            end
            suggs = sortObjsByProp(suggs, 'confidence', 'descend');
        end
    end

end
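The three association-rule metrics defined in the comments of predictByBlockTypesInSud can be tried on a toy table. The following is a minimal Python sketch, not the thesis's MATLAB implementation: the function name arm_metrics and the list-of-lists table layout are assumptions made for illustration. It follows the listing in ignoring antecedent block types absent from the training data and in treating a 0/0 confidence as 0, and it also computes lift, which the listing defines but does not use for ranking.

```python
from typing import Dict, List, Sequence


def arm_metrics(table: List[List[int]], block_types: Sequence[str],
                antecedent: Sequence[str], consequent: str) -> Dict[str, float]:
    """Support, confidence and lift of the rule antecedent -> consequent.

    table[r][c] is 1 when training (sub)system r contains block_types[c].
    Antecedent block types that never occur in block_types are ignored.
    """
    col = {bt: i for i, bt in enumerate(block_types)}
    ant_idx = [col[bt] for bt in antecedent if bt in col]
    con_idx = col[consequent]

    n_rows = len(table)
    # rows in which every antecedent column is 1 (all rows if ant_idx is empty)
    rows_ant = [row for row in table if all(row[i] == 1 for i in ant_idx)]
    n_ant = len(rows_ant)
    n_ant_con = sum(1 for row in rows_ant if row[con_idx] == 1)
    n_con = sum(1 for row in table if row[con_idx] == 1)

    support_rule = n_ant_con / n_rows if n_rows else 0.0
    confidence_rule = n_ant_con / n_ant if n_ant else 0.0  # 0/0 treated as 0
    support_con = n_con / n_rows if n_rows else 0.0
    lift_rule = confidence_rule / support_con if support_con else 0.0
    return {
        "support_rule": support_rule,
        "confidence_rule": confidence_rule,
        "support_con": support_con,
        "lift_rule": lift_rule,
    }
```

For a four-row table in which Inport occurs in three rows and Outport co-occurs with it in two of them, the rule Inport -> Outport has support 2/4, confidence 2/3 and lift (2/3)/(2/4) = 4/3.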

Listing A.3: BlockInsertionState.m class definition

classdef BlockInsertionState
    properties
        context
        suggs
        percentTexts
    end

    methods

        function obj = BlockInsertionState(context, suggs, percentTexts)
            obj.context = context;
            obj.suggs = suggs;
            obj.percentTexts = percentTexts;
        end

    end
end

Listing A.4: BlockSugg.m class definition

classdef BlockSugg
    properties
        blockType % string: BlockType of suggested block
        blockTypeAsField % string: BlockType compatible to be used as a 'field' of a struct
        libPath % string: Simulink library path; eg: built-in/Sum
        rank % double: (1:highest) : suggestion's rank
        confidence % double in range [0,1] : prediction confidence for the block
        blockName % string: block's name; eg: Add1
        blockPath % string: block's path as would be returned by gcb; eg: x/y/z/Add1
        foregroundColor % string: foreground color of the suggestion block (when displayed in Simulink workspace)
        backgroundColor % string: background color of the suggestion block (when displayed in Simulink workspace)
    end

    methods
        function obj = BlockSugg(blockType, rank, confidence, blockName, blockPath)
            blockType = string(blockType);
            obj.blockType = blockType;
            obj.blockTypeAsField = convertBlockTypeToField(blockType);
            obj.libPath = getSimulinkLibPathByBlockType(blockType);
            obj.rank = rank;
            obj.confidence = confidence;
            obj.blockName = blockName;
            obj.blockPath = blockPath;
            obj.foregroundColor = "black";
            obj.backgroundColor = "white";

        end
    end
end

function libpath = getSimulinkLibPathByBlockType(blockType)
    libpath = "built-in/" + blockType;
end

Listing A.5: BlockSuggFromPredModel.m class definition

classdef BlockSuggFromPredModel
    properties
        blockType % string: BlockType of suggested block
        blockTypeAsField % string: BlockType compatible to be used as a 'field' of a struct
        confidence % double in range [0,1] : prediction confidence for the block
    end

    methods
        function obj = BlockSuggFromPredModel(blockType, confidence)
            obj.blockType = string(blockType);
            obj.blockTypeAsField = convertBlockTypeToField(obj.blockType);
            obj.confidence = confidence;
        end
    end
end

Listing A.6: Clone.m class definition

classdef Clone
    properties
        nlines
        similarity
        source1 % from mdl file under development
        source2 % from clone
    end

    methods
        function obj = Clone(nlines, similarity, source1, source2)
            obj.nlines = nlines;
            obj.similarity = similarity;
            obj.source1 = source1;
            obj.source2 = source2;
        end
    end
end

Listing A.7: Context.m class definition

classdef Context
    properties

        xMin % x-coordinate value of top-left point of leftmost block in Simulink workspace
        xMax % x-coordinate value of bottom-right point of rightmost block in Simulink workspace
        yMin % y-coordinate value of top-left point of topmost block in Simulink workspace
        yMax % y-coordinate value of bottom-right point of bottommost block in Simulink workspace


        sud % path of (sub)system under development as returned by 'gcs'
        mrcbPath % str: path of most-recently-clicked-block (as returned by gcb)
        mrcbBlockType % Most Recently Clicked Block's Block-Type
        mrcbPosition % position of MRCB in the format [xMin, yMin, xMax, yMax]
        blockNamesInSud % list of all block names (strings, unique, sorted alphabetically) in system-under-development
        blockTypesInSud % list of all block-types (strings, unique, sorted alphabetically) in system-under-development

        mrcbNeighborsMap % struct in which each key (str) is the blockType (as field) of a block connected to the MRCB, and each value (int) is the count of such neighboring blockTypes
        % eg:
        % "Sum": 3
        % "Gain": 1
        % "Outport": 1

        suggBlkDim = 50; % all suggestion panel dimensions will be based on this value

        positionSuggPanelHeaderSS % position of the header subsystem of block suggestion panel (if block suggestions were to be populated in current context)
        positionSuggPanelBodySS % position of the body subsystem of block suggestion panel (if block suggestions were to be populated in current context)
        positionSuggPanelFooterSS % position of the footer subsystem of block suggestion panel (if block suggestions were to be populated in current context)
    end

    methods
        function obj = Context(xMin, xMax, yMin, yMax, ...
                sud, blockNamesInSud, blockTypesInSud, mrcbPath, mrcbBlockType, ...
                mrcbPosition, mrcbNeighborsMap)

            obj.xMin = xMin;
            obj.xMax = xMax;
            obj.yMin = yMin;
            obj.yMax = yMax;
            obj.sud = sud;
            obj.blockNamesInSud = blockNamesInSud;
            obj.blockTypesInSud = blockTypesInSud;
            obj.mrcbPath = mrcbPath;
            obj.mrcbBlockType = string(mrcbBlockType);
            obj.mrcbPosition = mrcbPosition;
            obj.mrcbNeighborsMap = mrcbNeighborsMap;


            obj = obj.setPositions();

        end

        % we don't include this function code within constructor because
        % we may need to re-define the position of the suggestion panel
        % in case a previous context is to be used (assuming the user might
        % have changed 'nSuggsMax' from AdminControl UI -- see code in
        % suggestBlocks.m for such use)
        function obj = setPositions(obj)
            % for now, suggestion panel header's yMin is same as MRCB's yMin
            % this will be adjusted shortly such that the sugg panel will have
            % same yCenter as that of MRCB (see code below)
            xMin = obj.mrcbPosition(3) + obj.suggBlkDim / 2;
            yMin = obj.mrcbPosition(2);
            xMax = obj.mrcbPosition(3) + 4 * obj.suggBlkDim;
            yMax = obj.mrcbPosition(2) + obj.suggBlkDim;

            obj.positionSuggPanelHeaderSS = [xMin, yMin, xMax, yMax];

            yMin = yMax;

            state = getSharedVar('simvma_appState');

            yMax = yMin + (obj.suggBlkDim / 2) * (3 * state.simgestionNSuggsMax + 1);
            obj.positionSuggPanelBodySS = [xMin, yMin, xMax, yMax];

            yMin = yMax;
            yMax = yMin + obj.suggBlkDim;
            obj.positionSuggPanelFooterSS = [xMin, yMin, xMax, yMax];

            % move the suggestion panel vertically so that its yCenter
            % aligns with MRCB's yCenter
            yCenterMrcb = round((obj.mrcbPosition(2) + obj.mrcbPosition(4)) / 2);
            yCenterBody = round((obj.positionSuggPanelBodySS(2) + obj.positionSuggPanelBodySS(4)) / 2);
            shift = yCenterBody - yCenterMrcb;

            obj.positionSuggPanelHeaderSS(2) = obj.positionSuggPanelHeaderSS(2) - shift;
            obj.positionSuggPanelHeaderSS(4) = obj.positionSuggPanelHeaderSS(4) - shift;
            obj.positionSuggPanelBodySS(2) = obj.positionSuggPanelBodySS(2) - shift;
            obj.positionSuggPanelBodySS(4) = obj.positionSuggPanelBodySS(4) - shift;
            obj.positionSuggPanelFooterSS(2) = obj.positionSuggPanelFooterSS(2) - shift;
            obj.positionSuggPanelFooterSS(4) = obj.positionSuggPanelFooterSS(4) - shift;
        end

    end
end
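The arithmetic in setPositions is easier to follow with concrete numbers. The following is a minimal Python sketch of the same geometry, offered as an illustration only: the names panel_positions and round_half_away are hypothetical, and the maximum number of suggestions is passed in as a parameter instead of being read from shared application state. Rectangles use the [xMin, yMin, xMax, yMax] layout of mrcbPosition.

```python
import math
from typing import List, Tuple

Rect = List[float]  # [xMin, yMin, xMax, yMax]


def round_half_away(x: float) -> int:
    """Round to nearest integer with ties away from zero (MATLAB round())."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def panel_positions(mrcb: Rect, n_suggs_max: int,
                    blk_dim: float = 50) -> Tuple[Rect, Rect, Rect]:
    """Header, body and footer rectangles of the suggestion panel.

    The panel sits to the right of the most recently clicked block (MRCB)
    and is shifted vertically so the body's center matches the MRCB's.
    """
    x_min = mrcb[2] + blk_dim / 2
    x_max = mrcb[2] + 4 * blk_dim

    # stack header, body and footer downwards from the MRCB's top edge
    header = [x_min, mrcb[1], x_max, mrcb[1] + blk_dim]
    body_height = (blk_dim / 2) * (3 * n_suggs_max + 1)
    body = [x_min, header[3], x_max, header[3] + body_height]
    footer = [x_min, body[3], x_max, body[3] + blk_dim]

    # vertical shift that aligns the body's center with the MRCB's center
    y_center_mrcb = round_half_away((mrcb[1] + mrcb[3]) / 2)
    y_center_body = round_half_away((body[1] + body[3]) / 2)
    shift = y_center_body - y_center_mrcb
    for rect in (header, body, footer):
        rect[1] -= shift
        rect[3] -= shift
    return header, body, footer
```

With an MRCB at [100, 300, 130, 330] and three suggestions, the body spans y = 190 to 440, so its center (315) coincides with the MRCB's center, and the header and footer sit directly above and below it.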

Listing A.8: FreqModel.m class definition

% PUBLIC METHODS:
% ---------------
% CONSTRUCTOR : returns an 'empty' model
% trainByFilepath : update model by training with 1/more mdlSlx files
% trainByFilehash : update model by training with 1/more mdlSlx files' hash values
% predict : return BlockSuggFromPredModel objects


classdef FreqModel
    properties

        % 'data' will store the results of training
        % we'll implement this nested map using struct
        % the following is an illustration of what this 'map' will look like

        % data = {
        %     'Sum': {
        %         'count': 304,   % total number of 'Sum' blocks in
        %                         % training set
        %         'src': {
        %             'count': 120,   % total number of blocks
        %                             % connected to the src port of
        %                             % 'Sum' blocks in training set.
        %                             % This is equal to the sum of
        %                             % counts of all blockTypes in
        %                             % details{}.
        %             'details': {
        %                 'Sum': 55,
        %                 'Gain': 21,
        %                 'Inport': 44,
        %             }
        %         },
        %         'dst': {
        %             'count': 167,
        %             'details': {
        %                 'Sum': 35,
        %                 'Gain': 13,
        %                 'Inport': 32,
        %                 'Outport': 87,
        %             }
        %         },

        %         'both': {
        %             'count': 287,
        %             'details': {
        %                 'Sum': 90,
        %                 'Gain': 34,
        %                 'Inport': 76,
        %                 'Outport': 87,
        %             }
        %         },
        %     },
        %     'Gain': {
        %         'count': 434,
        %         'src': {
        %             'count': 42,
        %             'details': {
        %                 'Sum': 15,
        %                 'Gain': 23,
        %                 'Inport': 4,
        %             }
        %         },
        %         'dst': {
        %             'count': 107,
        %             'details': {
        %                 'Sum': 35,
        %                 'Gain': 13,
        %                 'Inport': 32,
        %                 'Outport': 27,
        %             }
        %         },
        %         'both': {
        %             'count': 149,
        %             'details': {
        %                 'Sum': 50,
        %                 'Gain': 36,
        %                 'Inport': 36,
        %                 'Outport': 27,
        %             }
        %         },
        %
        %     },
        % }

        data = struct;
        % hash values of all mdlSlx files with which the model has been
        % trained successfully
        hashTrainingFiles = string.empty;

        % weights are tuned experimentally to maximize prediction accuracy
        WEIGHT_MRCB = 0.9;
        WEIGHT_NEIGHBORS = 0.1;
    end


    methods (Access = public)

        function obj = FreqModel()
            % This constructor creates an UNTRAINED instance of FreqModel.
            % The returned FreqModel instance needs to be trained using
            % the trainByFilepath() or trainByFilehash() method
        end

        function obj = trainByFilepath(obj, mdlSlxAbspaths, verbose)
            % Update the model by adding training data from each mdlSlx file
            %
            % PARAMETERS:
            % -----------
            % mdlSlxAbspaths (str or list_of_str) : absolute path(s) of mdlSlx file(s)
            % verbose (boolean) : if true, details are printed.
            %
            % IMPORTANT:
            % ----------
            % When training the model with multiple mdlSlx files, call this
            % method once with all the mdlSlx paths passed (rather than
            % calling this method multiple times, with 1 mdlSlx path at a
            % time). This will speed up computation, as in that case the
            % opened/loaded models need to be reloaded only once

            warning off;
            % these are required to restore the state of previously
            % loaded/open models
            openPaths = getOpenModelsAbsFilepaths();
            loadedOnlyPaths = getLoadedOnlyModelsAbsFilepaths();
            bdclose('all');

            for i = 1 : length(mdlSlxAbspaths)
                mdlSlxAbspath = mdlSlxAbspaths(i);
                if verbose
                    dispSpacedAbove("Count : " + i + "/" + length(mdlSlxAbspaths));
                    disp("path : " + mdlSlxAbspath);
                end
                [status, fileHash, data_] = obj.extractDetailsAndUpdateCache(mdlSlxAbspath, verbose);
                if verbose, disp("status : " + status); end

                if status == "SUCCESS"
                    obj.data = mergeFreqData(obj.data, data_);
                    obj.hashTrainingFiles = [obj.hashTrainingFiles fileHash];
                end
            end

            % restore previously loaded/open models

            open_system(openPaths);
            load_system(loadedOnlyPaths);
            warning on;
        end

        function obj = trainByFilehash(obj, fileHashes, verbose)
            % Update the model by adding training data from each mdlSlx file.
            % Duplicate fileHashes (i.e. those fileHashes with which the
            % model is already trained) will be ignored.
            %
            % This method is intended to be used for merging 2 or more
            % FreqModels. This operation is quick because all data is
            % available in the cache.
            %
            % ASSUMPTION:
            % -----------
            % - all fileHashes exist in the cache file: shared-vars/simvma_freqCache.mat
            %
            % PARAMETERS:
            % -----------
            % fileHashes (str or list_of_str) : hash values of mdlSlx files' content
            % verbose (boolean) : if true, details are printed.

            cache = getSharedVar('simvma_freqCache');
            if verbose, dispKeyVal('#hashes', length(fileHashes)); end
            for i = 1 : length(fileHashes)
                fileHash = fileHashes(i);
                % we avoid training with duplicate files
                if any(strcmp(obj.hashTrainingFiles, fileHash))
                    status = "DUPLICATE";
                else
                    if ~isfield(cache, fileHash)
                        error("From trainByFilehash: No cache data found for : " + fileHash);
                    end
                    status = cache.(fileHash).status;
                    if status == "SUCCESS"
                        data_ = cache.(fileHash).data;
                        obj.data = mergeFreqData(obj.data, data_);
                        obj.hashTrainingFiles = [obj.hashTrainingFiles fileHash];
                    end
                end

                if verbose
                    dispSpacedAbove("Count : " + i + "/" + length(fileHashes));
                    disp("hash : " + fileHash);
                    disp("status : " + status);
                end
            end

        end


        function [suggs, time] = predict(obj, context)
            % Returns:
            % 1. a list of BlockSuggFromPredModel objects, sorted by
            %    confidence
            % 2. the execution time (in seconds).
            %
            % If no suggestions are possible, it will return an empty list
            % of BlockSuggFromPredModel.
            %
            % Each suggestion will have confidence > 0
            %
            % PARAMETERS:
            % -----------
            % context (Context object)

            % This method will internally call one or more private methods,
            % each of which will return suggestions based on various
            % criteria. Then, based on a weighted model, this method will
            % prepare the final list of suggestions and return them.

            tStart = tic;
            assert(obj.WEIGHT_MRCB + obj.WEIGHT_NEIGHBORS == 1, "FROM FreqModel.m: Assertion failed: Sum of all weights must be equal to 1.")

            % if one of the weights is 1, we don't need to compute others
            if obj.WEIGHT_MRCB == 1
                suggs = obj.predictByMrcbBlockType(context.mrcbBlockType);
                time = toc(tStart);
                return;
            elseif obj.WEIGHT_NEIGHBORS == 1
                % the neighbors-based predictor expects the neighbors map,
                % not the MRCB block type
                suggs = obj.predictByMrcbNeighborsBlockTypes(context.mrcbNeighborsMap);
                time = toc(tStart);
                return;
            end

            suggsMrcb = obj.predictByMrcbBlockType(context.mrcbBlockType);
            suggsNeighbors = obj.predictByMrcbNeighborsBlockTypes(context.mrcbNeighborsMap);


            % COMBINE SUGGESTIONS FROM MULTIPLE PRIVATE METHODS

            % to combine suggestions, we first need to make sure the number
            % of suggestions is the same from each private method (to match
            % matrix dimensions).
            % So, we pad "fake" suggestions (with confidence 0), if necessary.

            % These fake suggestions will be ignored by combineBlockSuggsFromPredModel()
            nSuggsMrcb = length(suggsMrcb);
            nsuggsNeighbors = length(suggsNeighbors);
            nSuggsMax = max(nSuggsMrcb, nsuggsNeighbors);

            % pad dummy suggestions to prepare suggs2D (for combineBlockSuggsFromPredModel())
            if nSuggsMrcb < nSuggsMax
                for i = 1 : nSuggsMax - nSuggsMrcb
                    suggsMrcb = [suggsMrcb BlockSuggFromPredModel("", 0)];
                end
            end

            if nsuggsNeighbors < nSuggsMax
                for i = 1 : nSuggsMax - nsuggsNeighbors
                    suggsNeighbors = [suggsNeighbors BlockSuggFromPredModel("", 0)];
                end
            end

            % each row contains suggestions from 1 private method
            suggs2D = [suggsMrcb; suggsNeighbors];
            % these are sorted by confidence, and
            % all suggs have confidence >= 0
            suggs = combineBlockSuggsFromPredModel(suggs2D, [obj.WEIGHT_MRCB, obj.WEIGHT_NEIGHBORS]);
            time = toc(tStart);
        end
    end

    methods (Access = private)

        function [status, fileHash, data] = extractDetailsAndUpdateCache(obj, mdlSlxAbspath, verbose)
            % Extract details from a mdlSlx file.
            % This method also updates the cache stored in shared-vars/simvma_freqCache.mat, if required

            %
            % PARAMETERS:
            % -----------
            % mdlSlxAbspath (str) : absolute path of mdlSlx file
            %
            % RETURNS:
            % --------
            % status (str):
            %   "INVALID_PATH" : mdlSlxAbspath does not exist
            %   "INVALID_MDL" : invalid mdlSlx file (cannot be loaded)
            %   "DUPLICATE" : a mdlSlx file with the same content has
            %                 already been used to train the model

            %   "SUCCESS" : model was updated successfully by training
            %               with this mdlSlx file
            %   "FAIL" : something went wrong while training
            %            the model with this mdlSlx file (although the
            %            mdlSlx file was loaded successfully).
            % fileHash (str): hash value of content;
            %                 NaN if status == "INVALID_PATH"
            % data (struct) : same as FreqModel.data, but trained on 1 file
            %                 NaN if status != "SUCCESS"

            %
            % IMPORTANT:
            % ----------
            % This method does not preserve the state of previously loaded
            % Simulink models, if any

            % make sure no Simulink model is loaded (as it might introduce
            % some hard-to-debug errors)
            bdclose('all');

            fileHash = NaN; % default
            data = NaN; % default

            if ~exist(mdlSlxAbspath, 'file')
                status = "INVALID_PATH";
                return;
            end

            fileHash = hashByFilepath(mdlSlxAbspath);
            if any(strcmp(obj.hashTrainingFiles, fileHash))
                status = "DUPLICATE";
                return;
            end

            % first, see if it is in cache
            cache = getSharedVar('simvma_freqCache');
            if isfield(cache, fileHash)
                if verbose, disp('Using cache'); end
                status = cache.(fileHash).status;
                data = cache.(fileHash).data;
                return;
            end

            % not found in cache; need to extract data
            if verbose, disp('Not found in cache'); end

            try
                load_system(mdlSlxAbspath)
            catch ME
                status = "INVALID_MDL";
                return

            end

            data = struct;

            try
                modelName = bdroot;
                bdParams = getParamsByPath(modelName); % block diagram params
                blocks = string(bdParams.Blocks);
                for i = 1 : length(blocks)
                    block = blocks(i);
                    % some block-names contain the character '/'. But since
                    % '/' is used as a separator between parent and child
                    % blocks, '/' in a block-name is replaced with '//'
                    block = strrep(block, '/', '//');
                    blockPath = modelName + "/" + block;
                    blockHandle = get_param(blockPath, 'handle');
                    data = obj.extractDetailsHelper(data, blockHandle);
                end
                status = "SUCCESS";
            catch ME
                status = "FAIL";
            end

            % update cache
            cache.(fileHash).status = status;
            cache.(fileHash).data = data;
            setSharedVar('simvma_freqCache', cache);
        end

        function data = extractDetailsHelper(obj, data, blockHandle)
            % this method updates only the data attribute

            params = getParamsByHandle(blockHandle, false);
            blockType = string(params.BlockType);
            % make blockType compatible to be used as a field in a struct
            blockType = convertBlockTypeToField(blockType);

            if blockType == "SubSystem"
                % go recursively
                subSystemPath = string(params.Parent) + "/" + string(params.Name);
                blocks = string(params.Blocks); % inner blocks

                for i = 1 : length(blocks)
                    block = blocks(i);
                    % some block-names contain the character '/'. But since
                    % '/' is used as a separator between parent and child
                    % blocks, '/' in a block-name is replaced with '//'
                    block = strrep(block, '/', '//');
                    blockPath = subSystemPath + "/" + block;
                    blockHandle = get_param(blockPath, 'handle');

                    data = obj.extractDetailsHelper(data, blockHandle);
                end
            else % base condition
                if ~isfield(data, blockType)
                    % add new blockType to data
                    data.(blockType) = struct();
                    data.(blockType).count = 1;

                    data.(blockType).src = struct();
                    data.(blockType).src.('count') = 0;
                    data.(blockType).src.('details') = struct();

                    data.(blockType).dst = struct();
                    data.(blockType).dst.('count') = 0;
                    data.(blockType).dst.('details') = struct();

                    data.(blockType).both = struct();
                    data.(blockType).both.('count') = 0;
                    data.(blockType).both.('details') = struct();
                else
                    data.(blockType).count = data.(blockType).count + 1;
                end

                % set/update data.(blockType).src
                mapSrc = getConnectedBlockTypesWithCountByBlockHandle(blockHandle, 'src');
                data.(blockType).src.('count') = data.(blockType).src.('count') + sum(cell2mat(mapSrc.values));

                % cast to string so they can be used to
                % 1. check if field exists in structure using isfield()
                % 2. access using parenthesis notation eg. mapSrc(keysSrc(1))
                keysSrc = string(mapSrc.keys);
                for i = 1 : length(keysSrc)
                    key = keysSrc(i); % blockType
                    val = mapSrc(key); % count
                    key = convertBlockTypeToField(key);
                    if ~isfield(data.(blockType).src.details, key)
                        data.(blockType).src.details.(key) = val;
                    else
                        data.(blockType).src.details.(key) = data.(blockType).src.details.(key) + val;
                    end
                end

                % set/update data.(blockType).dst
                mapDst = getConnectedBlockTypesWithCountByBlockHandle(blockHandle, 'dst');
                data.(blockType).dst.('count') = data.(blockType).dst.('count') + sum(cell2mat(mapDst.values));


                keysDst = string(mapDst.keys);
                for i = 1 : length(keysDst)
                    key = keysDst(i); % blockType
                    val = mapDst(key); % count
                    key = convertBlockTypeToField(key);
                    if ~isfield(data.(blockType).dst.details, key)
                        data.(blockType).dst.details.(key) = val;
                    else
                        data.(blockType).dst.details.(key) = data.(blockType).dst.details.(key) + val;
                    end
                end

                % set/update data.(blockType).both
                mapBoth = getConnectedBlockTypesWithCountByBlockHandle(blockHandle, 'both');
                data.(blockType).both.('count') = data.(blockType).both.('count') + sum(cell2mat(mapBoth.values));

                keysBoth = string(mapBoth.keys);
                for i = 1 : length(keysBoth)
                    key = keysBoth(i); % blockType
                    val = mapBoth(key); % count
                    key = convertBlockTypeToField(key);
                    if ~isfield(data.(blockType).both.details, key)
                        data.(blockType).both.details.(key) = val;
                    else
                        data.(blockType).both.details.(key) = data.(blockType).both.details.(key) + val;
                    end
                end
            end
        end

        function suggs = predictByBlockType(obj, blockType, dstOnly)
            % Returns a list of BlockSuggFromPredModel objects based on
            % the given block type, sorted by confidence.
            % If no suggestions are possible, it will return an empty list
            % of BlockSuggFromPredModel.
            %
            % PARAMETERS:
            % -----------
            % blockType(str):
            %   - This is the BlockType (as field) of the block such that the
            %     suggested block is supposed to be connected to that
            %     block. For example, the most recently clicked block,
            %     or one of its neighbors is a good fit for this argument.
            % dstOnly(bool):
            %   - If true, the suggested block is supposed to be
            %     connected to ONLY a destination port of this block.
            %   - If false, the suggested block is supposed to be

            %     connected to either a source port or a destination
            %     port of this block
            %

            blockType = string(blockType);
            blockType = convertBlockTypeToField(blockType);
            suggs = BlockSuggFromPredModel.empty;

            if isfield(obj.data, blockType)
                if dstOnly
                    % total number of blocks connected at dest port of
                    % argument blockType in entire training set
                    total = obj.data.(blockType).dst.count;
                else
                    % total number of blocks connected at both src and dest
                    % ports of argument blockType in entire training set
                    total = obj.data.(blockType).both.count;
                end

                % NOTE: the candidate counts always come from dst.details,
                % even when total is taken from 'both'
                dstDetails = obj.data.(blockType).dst.details; % struct
                sortedDstBlockTypes = sortStructFieldsMaxToMin(dstDetails);

                % The actual number of suggestions may be less than nSuggsMax;
                % it can even be 0, in which case we return an empty array
                nSugg = min(length(sortedDstBlockTypes), getSharedVarSimgestionNSuggsMax);

                for i = 1 : nSugg
                    blockType_ = sortedDstBlockTypes(i);
                    percentOccurrence = dstDetails.(blockType_) / total;
                    % for FreqModel, confidence is percentOccurrence
                    sugg = BlockSuggFromPredModel(blockType_, percentOccurrence);
                    suggs = [suggs sugg];
                end
            end
        end

        function suggs = predictByMrcbBlockType(obj, mrcbBlockType)
            % Returns a list of BlockSuggFromPredModel objects based on
            % mrcb (most recently clicked block) type, sorted by confidence
            % If no suggestions are possible, it will return an empty list
            % of BlockSuggFromPredModel.
            %
            % PARAMETERS:
            % -----------
            % mrcbBlockType(str):
            %   - This is the BlockType (as field) of the block such that
            %   - the suggested block is supposed to be connected to a
            %     destination port of this block.
            %     (THIS IS DIFFERENT THAN WHAT HAPPENS IN

            %     predictByMrcbNeighborsBlockTypes)
            %   - The most recently clicked block is a good fit for
            %     this argument.
            %
            suggs = obj.predictByBlockType(mrcbBlockType, true);
        end


        function suggs = predictByMrcbNeighborsBlockTypes(obj, neighborsMap)
            % Returns a list of BlockSuggFromPredModel objects based on
            % the type of blocks connected radially (1 step away) to the MRCB
            % (most recently clicked block), sorted by confidence
            % If no suggestions are possible, it will return an empty list
            % of BlockSuggFromPredModel.
            %
            % PARAMETERS:
            % -----------
            % neighborsMap(struct with key=blockType(str), and value=count(int)):
            %   - The key is the BlockType (as field) of a block which
            %     is a neighbor of the MRCB such that
            %   - the suggested block is supposed to be connected to
            %     either a src port or a dest port of that block.
            %     (THIS IS DIFFERENT THAN WHAT HAPPENS IN
            %     predictByMrcbBlockType)

            % APPROACH:
            % - get suggestions for each neighboring blockType
            % - adjust weight of suggestions based on corresponding
            %   blockType's count

            neighborBlockTypes = string(fields(neighborsMap));

            % find total number of neighboring blocks
            nNeighbors = 0;
            for i = 1 : length(neighborBlockTypes)
                neighborBlockType = neighborBlockTypes(i);
                nNeighbors = nNeighbors + neighborsMap.(neighborBlockType);
            end

            % This map will be updated with information of blockType and
            % confidence from each neighbor, and will eventually be used
            % to create the final suggs list
            % key : blockType (as field)
            % value: confidence
            map = struct;


            for i = 1 : length(neighborBlockTypes)
                neighborBlockType = neighborBlockTypes(i);
                suggs = obj.predictByBlockType(neighborBlockType, false);

                for j = 1 : length(suggs)
                    sugg = suggs(j);

                    % adjust confidence of suggestion from this neighboring
                    % BlockType, then update confidence of corresponding
                    % BlockType in map.

                    sugg.confidence = sugg.confidence * neighborsMap.(neighborBlockType) / nNeighbors;
                    blockType = sugg.blockTypeAsField;
                    if isfield(map, blockType)
                        map.(blockType) = map.(blockType) + sugg.confidence;
                    else
                        map.(blockType) = sugg.confidence;
                    end
                end
            end

            suggs = BlockSuggFromPredModel.empty;
            % finally create suggs from map (which now has adjusted
            % confidences for all BlockTypes)
            sortedBlockTypes = sortStructFieldsMaxToMin(map); % this ensures the suggestions are sorted

            for i = 1 : length(sortedBlockTypes)
                blockType = sortedBlockTypes(i);
                sugg = BlockSuggFromPredModel(convertFieldToBlockType(blockType), map.(blockType));
                suggs = [suggs sugg];
            end

        end
    end
end
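The prediction path of FreqModel combines two frequency-based estimates with fixed weights (0.9 for the MRCB, 0.1 for its neighbors). The following Python sketch illustrates that flow under simplifying assumptions: plain dictionaries stand in for the MATLAB structs and for BlockSuggFromPredModel objects, the function names are hypothetical, and the suggestion limit is a parameter instead of a shared variable. As in the listing, candidate counts are always read from the dst details, while the denominator depends on dst_only.

```python
from typing import Dict, List, Sequence, Tuple


def predict_by_block_type(data: dict, block_type: str, dst_only: bool,
                          n_max: int = 3) -> Dict[str, float]:
    """Confidence of each candidate = its dst count / total connections."""
    if block_type not in data:
        return {}
    entry = data[block_type]
    total = entry["dst"]["count"] if dst_only else entry["both"]["count"]
    if total == 0:
        return {}
    ranked = sorted(entry["dst"]["details"].items(),
                    key=lambda kv: kv[1], reverse=True)[:n_max]
    return {bt: count / total for bt, count in ranked}


def predict_by_neighbors(data: dict, neighbors: Dict[str, int],
                         n_max: int = 3) -> Dict[str, float]:
    """Average the per-neighbor predictions, weighted by neighbor counts."""
    n_neighbors = sum(neighbors.values())
    if n_neighbors == 0:
        return {}
    out: Dict[str, float] = {}
    for nb_type, nb_count in neighbors.items():
        preds = predict_by_block_type(data, nb_type, False, n_max)
        for bt, conf in preds.items():
            out[bt] = out.get(bt, 0.0) + conf * nb_count / n_neighbors
    return out


def combine(groups: Sequence[Dict[str, float]],
            weights: Sequence[float]) -> List[Tuple[str, float]]:
    """Weighted sum of confidences per block type, sorted high to low."""
    combined: Dict[str, float] = {}
    for group, weight in zip(groups, weights):
        for bt, conf in group.items():
            if conf > 0:  # zero-confidence entries are padding and are skipped
                combined[bt] = combined.get(bt, 0.0) + weight * conf
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```

Because a block type suggested by both predictors accumulates both weighted contributions, a candidate that ranks moderately for the MRCB can still overtake others when the neighbors also favor it.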

Listing A.9: Source.m class definition

classdef Source
    properties
        filepath % absolute path (unlike in simone report). This, unlike realFilepath, may be a symbolic link
        startline
        endline
        pcid
        system
        realFilepath % absolute path. This, unlike filepath, is always the real filepath. This prop is created to make SimIMA Windows-compatible.
    end

    methods
        function obj = Source(filepath, startline, endline, pcid)
            filepath = string(filepath);
            obj.filepath = filepath;
            obj.startline = startline;
            obj.endline = endline;
            obj.pcid = pcid;

            % All this overhead about 'realFilepath' is to make SimIMA
            % Windows-compatible. This is because in Windows,
            % getSystemPathFromStartline won't work with a symlink as the filepath.
            % So, we need to resolve it to the realpath pointed to by the symlink.
            % Such path resolution is needed for Clone.source2 (which comes
            % from .../mdl-files/...), not for Clone.source1 (which comes from
            % .../mdl-file/sud.mdl)

            prefix = getSimvmaPath() + "/Simone-2.0-Complete-Cygwin64-customized-for-simvma/mdl-files/";

            if filepath.startsWith(prefix) % true for Clone.source2
                % convert symlink to realpath
                symlinkMap = getSharedVar('simvma_symlinkMap');
                keys = symlinkMap.keys();
                for i = 1 : length(keys)
                    key = keys{i};
                    if filepath.startsWith(key)
                        value = symlinkMap(key);
                        obj.realFilepath = filepath.replace(key, value);
                        break;
                    end
                end
            else % true for Clone.source1
                obj.realFilepath = filepath;
            end
            % obj.system is resolved once here for both branches
            obj.system = getSystemPathFromStartline(obj.realFilepath, startline);
        end
    end
end

Listing A.10: Suggestion.m class definition

classdef Suggestion
    properties
        similarity
        source      % corresponding clone.source2
        mdlFilepath
        imgFilepath
        rank        % rank of this suggestion (1:highest)
    end

    methods
        function obj = Suggestion(similarity, source, mdlFilepath, imgFilepath, rank)
            obj.similarity = similarity;
            obj.source = source;
            obj.mdlFilepath = mdlFilepath;
            obj.imgFilepath = imgFilepath;
            obj.rank = rank;
        end
    end
end

A.2 Function Definitions

Listing A.11: addSimvmaPathPrefix.m function definition

function abspath = addSimvmaPathPrefix(relpath)
% Add simvma path prefix to the given relative path
% eg:
% "default-repos/automotive/Matlab_Central/MinorTimeStepLogging/SimpleBounce.mdl"
% --> "/Users/bhisma/courses/cse-700-simvma/simvma/default-repos/automotive/Matlab_Central/MinorTimeStepLogging/SimpleBounce.mdl"
%
% PARAMETERS:
% - relpath (char/string) : file/folder path relative to simvma

    relpath = string(relpath);
    abspath = getSimvmaPath + "/" + relpath;
end

Listing A.12: buildCache.m function definition

function buildCache(dirpaths)
% Use this function to add data to shared-vars/freqCache, and
% shared-vars/armCache.
%
% PARAMETERS:
% ------
% dirpaths (list of string): list of absolute/relative paths of directories which
% contain mdlSlx files. These mdlSlx files may be nested inside inner
% directories as well.

    dirpaths = string(dirpaths);
    mdlSlxPaths = string.empty();
    for i = 1 : length(dirpaths)
        dirpath = dirpaths(i);
        mdlSlxPaths_ = searchFilesRecursively(dirpath, ["mdl", "slx"]);
        mdlSlxPaths = [mdlSlxPaths mdlSlxPaths_];
    end

    dispTitle("Building freqCache");
    fm = FreqModel();
    fm = fm.trainByFilepath(mdlSlxPaths, true);

    dispTitle("Building armCache");
    am = ArmModel();
    am = am.trainByFilepath(mdlSlxPaths, true);
end

Listing A.13: combineBlockSuggsFromPredModel.m function definition

function suggs = combineBlockSuggsFromPredModel(suggs2D, weights)
% Combines two (or more) lists of BlockSuggs into 1 list, based on their weights,
% and sorted by rank
%
% PARAMETERS:
% ------
% - suggs2D (2D array of BlockSuggFromPredModel)
%     - Each row shall contain suggestions from 1 group
%     - To make each 1D array have the same dimension, some suggestions
%       are fake (have confidence = 0). These fake suggestions will be
%       ignored while merging.
% - weights (1D array of floats)
%     - weights(i) is the weight of suggs2D(i) row group

    % compare with a tolerance, because floating-point weights may not sum to exactly 1
    assert(abs(sum(weights) - 1) < 1e-9, ...
        "combineBlockSuggsFromPredModel.m : Assertion failed -- Sum of all weights must be equal to 1.")

    map = struct; % key: blockTypeAsField, value: confidence (based on weighted model)

    [rows, cols] = size(suggs2D);
    for i = 1 : rows
        weight = weights(i);
        suggsRow = suggs2D(i, :);
        for j = 1 : cols
            sugg = suggsRow(j);
            % to ignore the fake suggestions which were padded to match the 2D
            % array dimensions, only process those with confidence > 0
            % To know details, see the caller to this method where we
            % prepare the suggs from each source before merging.
            if sugg.confidence > 0
                if ~isfield(map, sugg.blockTypeAsField)
                    map.(sugg.blockTypeAsField) = weight * sugg.confidence;
                else
                    map.(sugg.blockTypeAsField) = map.(sugg.blockTypeAsField) + weight * sugg.confidence;
                end
            end
        end
    end

    suggs = BlockSuggFromPredModel.empty;
    blockTypesAsFieldSorted = sortStructFieldsMaxToMin(map); % block types sorted by confidence

    for i = 1 : length(blockTypesAsFieldSorted)
        blockTypeAsField = blockTypesAsFieldSorted(i);
        blockType = convertFieldToBlockType(blockTypeAsField);
        confidence = map.(blockTypeAsField);
        sugg = BlockSuggFromPredModel(blockType, confidence);
        suggs = [suggs sugg];
    end
end
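The weighted merge in Listing A.13 can be summarized as a weighted sum per block type. The notation below is introduced here for illustration only (it does not appear in the listing), and the numbers are hypothetical rather than taken from the thesis data: $w_i$ is the weight of row group $i$, and $c_i(b)$ is the confidence that group $i$ assigns to block type $b$ (zero for padded entries).

```latex
c(b) = \sum_{i=1}^{R} w_i \, c_i(b), \qquad \sum_{i=1}^{R} w_i = 1 .

% Hypothetical example with R = 2, w = (0.6, 0.4):
%   c_1(\text{Gain}) = 0.50, \quad c_2(\text{Gain}) = 0.25
%   c_1(\text{Sum})  = 0.30, \quad c_2(\text{Sum})  = 0 \ (\text{padded entry})
c(\text{Gain}) = 0.6 \cdot 0.50 + 0.4 \cdot 0.25 = 0.40, \qquad
c(\text{Sum})  = 0.6 \cdot 0.30 + 0.4 \cdot 0 = 0.18 .
```

Under these assumed values, Gain would be ranked above Sum in the merged list.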

Listing A.14: convertBlockTypeToField.m function definition

function field = convertBlockTypeToField(blockType)
% Convert blockType (str) to a string compatible to be used as a fieldname
% in a struct.
%
% THE REVERSE ACTION IS PERFORMED BY convertFieldToBlockType()
%
% PARAMETERS:
% ------
% blockType(str): BlockType of a block (obtained in its params)

    field = blockType;
    field = strrep(field, '-', '__dash__');
end

Listing A.15: convertFieldToBlockType.m function definition

function blockType = convertFieldToBlockType(field)
% Convert field (str) to the original blockType string. The field string
% was made by making some string replacements using the
% convertBlockTypeToField() function
%
% THE REVERSE ACTION IS PERFORMED BY convertBlockTypeToField()
%
% PARAMETERS:
% ------
% field(str): struct fieldname produced by convertBlockTypeToField()

    blockType = field;
    blockType = strrep(blockType, '__dash__', '-');
end
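The following is a minimal sketch of the round trip performed by Listings A.14 and A.15, written inline with `strrep` so that it runs without the SimIMA sources. The block type `'S-Function'` is chosen only as an illustrative value containing a dash; the variable names are not taken from the thesis code.

```matlab
% A dash is not allowed in a struct fieldname, so it is encoded first.
blockType = 'S-Function';                     % illustrative BlockType containing '-'
field = strrep(blockType, '-', '__dash__');   % same replacement as convertBlockTypeToField

assert(isvarname(field));                     % the encoded name is a valid fieldname

map = struct;
map.(field) = 0.4;                            % hypothetical confidence value

% Decoding restores the original BlockType (same replacement as convertFieldToBlockType).
restored = strrep(field, '__dash__', '-');
assert(strcmp(restored, blockType));
```

The encoding is only reversible as long as no real block type already contains the substring `__dash__`, which the listings implicitly assume.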

Listing A.16: convertSystemNameToGcsFormat.m function definition

function str = convertSystemNameToGcsFormat(str)
% Convert the string to the format followed by gcs
% The following transformations take place:
%   /  --> //
%   \n --> newline
%   \" --> "
%
% PARAMETERS:
% ------
% str : string -- string to be converted into gcs format

    str = str.replace('/', '//');
    str = str.replace('\n', newline);
    str = str.replace('\"', '"');
end

Listing A.17: createCloneFromXmlElementLines.m function definition

function clone = createCloneFromXmlElementLines(cloneLine, srcline1, srcline2, simvmaPath)
    tokens = split(cloneLine);

    nlinesTokens = split(tokens(2), '=');
    nlines = nlinesTokens(2);
    nlines = cell2int(nlines);

    similarityTokens = split(tokens(3), '=');
    similarityTokens = split(similarityTokens(2), '>');
    similarity = similarityTokens(1);
    similarity = cell2int(similarity);

    source1 = createSourceFromXmlElement(srcline1, simvmaPath);
    source2 = createSourceFromXmlElement(srcline2, simvmaPath);

    clone = Clone(nlines, similarity, source1, source2);
end

function value = cell2int(value)
    value = char(value);
    len = length(value);
    value = value(2:len-1); % strip the surrounding quote characters
    value = string(value);
    value = double(value);
    value = int64(value);
end

Listing A.18: createMdlFile.m function definition

function createMdlFile(simvmaPath, modelPath)
% create the mdl file in simvma/mdl-file/mdlfile.mdl
    sysName = gcs;
    disp(sysName)
end

Listing A.19: createMdlFileForSimone.m function definition

function createMdlFileForSimone(mudFilepath, sudPath, simvmaPath)
% Create/Update the mdl file for Simone input.
% - This file, rather than the actual MUD, is used as Simone input
%   to overcome the "nested-clone" problem in Simone.
% - This file contains only the (sub)systems inside and including SUD,
%   not the (sub)systems outside the SUD.
% - This file will always be MDL (not SLX) because Simone works on MDL files only
% - This mdl is a valid mdl file (from Simone's perspective), but not from
%   Simulink's perspective.
%
% PARAMETERS:
% ------
% mudFilepath : string -- absolute filepath of Model Under Development
% sudPath     : string -- path of (sub)System Under Development. This
%                         should be in format as returned by 'gcs' eg:
%                         "mymodel/a/b/mysubsystem"
% simvmaPath  : string -- absolute path of simvma
%
% ASSUMPTIONS:
% - both mudFilepath and sudPath are valid. this means:
%     1. a mdl file exists for given mudFilepath
%     2. the mdl file contains a system given by sudPath
%
%   IF THIS REQUIREMENT IS NOT SATISFIED, AN ERROR WILL BE THROWN (IT
%   WON'T BE HANDLED) STOPPING FURTHER EXECUTION OF PROGRAM IMMEDIATELY
%
% - the mdl file is created by Simulink and is in 'standard' format
%   such that:
%     - the first property of System{} is always 'Name'
%     - "{" and "}" appear only once per line (at max), and only as last
%       character (after stripping white space)
%

    mudFilepath = string(mudFilepath);
    sudPath = string(sudPath);
    simvmaPath = string(simvmaPath);

    % if mudFilepath is an SLX file, first save it in MDL format in
    % simvma/tmp/convertedFromSlx.mdl
    % IMPORTANT: The filename of the created mdl file must be the same as
    % the slx file (with .mdl extension). Otherwise, sudPath won't be found
    % in the created mdl file and getStartAndEndLinesFromSystemPath() will
    % throw error
    state = getSharedVar('simvma_appState');
    if mudFilepath.lower().endsWith(".slx")
        % remove previously generated mdl file (if any) in simvma/tmp/simxample-slx2mdl-output/ folder
        cmd = "rm " + simvmaPath + "/tmp/simxample-slx2mdl-output/*.mdl";
        system(cmd);

        slxFilepath = mudFilepath;
        [slxFolderPath, name, ext] = fileparts(slxFilepath);
        destPath = simvmaPath + "/tmp/simxample-slx2mdl-output/" + name + ".mdl";
        try
            slx2mdl(mudFilepath, destPath);
            mudFilepath = destPath;

            % this flag (state.simxampleSavedAsMdl) will be used by
            % updateMUDWithSuggestion()
            state.simxampleSavedAsMdl = true;
        catch ME
            error("ERROR: FAILED TO CONVERT MODEL UNDER DEVELOPMENT TO MDL FORMAT. " + ...
                "CANNOT INVOKE SIMXAMPLE WITH THIS PARTICULAR MODEL.");
        end
        % slx2mdl conversion closes the MUD. So, we need to open it back.
        open_system(slxFilepath);
    else
        state.simxampleSavedAsMdl = false;
    end
    setSharedVar('simvma_appState', state);

    try
        [sl, el] = getStartAndEndLinesFromSystemPath(mudFilepath, sudPath);
    catch ME
        disp("mudFilepath: " + mudFilepath);
        disp("sudPath: " + sudPath);
        error("*** ERROR: INVALID ARGUMENT(S)")
    end

    readlines = string.empty;
    fid = fopen(mudFilepath, 'r');
    while ~feof(fid)
        line = fgets(fid); % char vector
        line = string(line);
        readlines = [readlines line];
    end
    fclose(fid);

    writelines = ["Model {", newline]; % cannot use \n, coz '\' is escaped
    for i = 1 : length(readlines)
        if i >= sl && i <= el
            writelines = [writelines readlines(i)];
        end
    end
    writelines = [writelines, newline, "}"]; % closing } for "Model"

    dirpath = simvmaPath + "/Simone-2.0-Complete-Cygwin64-customized-for-simvma/mdl-file";
    % create parent directory for sud.mdl (if it does not exist already)
    if ~exist(dirpath, 'dir')
        mkdir(dirpath);
    end

    mdlFilepath = dirpath + "/sud.mdl";

    fid = fopen(mdlFilepath, 'w');
    for i = 1 : length(writelines)
        line = writelines{i};
        line = strrep(line, '%', '%%'); % because '%' is escaped by matlab
        line = strrep(line, '\', '\\');
        fprintf(fid, line);
    end
    fclose(fid);
end

Listing A.20: createMdlFileFromSource.m function definition

function createMdlFileFromSource(source, targetMdlFilepath, simvmaPath)
% Create a mdl file for which the top level system will be that
% corresponding to 'source'

    % must use source.realFilepath, rather than source.filepath, for Windows-compatibility.
    sourceFilepath = source.realFilepath;
    blankMdlFilepath = simvmaPath + "/special-files/blank_r2012b.mdl";
    copyfile(blankMdlFilepath, targetMdlFilepath);

    SYSTEM_START_LINE_IN_BLANK_MDL_FILE = 732;
    SYSTEM_END_LINE_IN_BLANK_MDL_FILE = 748;

    writelines = {}; % lines to be written to target mdl file

    % lines in blank.mdl from beginning to the line above System {
    fid = fopen(blankMdlFilepath, 'r');
    lineCount = 0;
    while ~feof(fid)
        line = fgets(fid);
        lineCount = lineCount + 1;

        if lineCount < SYSTEM_START_LINE_IN_BLANK_MDL_FILE
            len = length(writelines);
            writelines{len+1} = line;
        end
    end
    fclose(fid);

    % lines in clone source file corresponding to System {}
    fid = fopen(sourceFilepath, 'r');
    lineCount = 0;
    while ~feof(fid)
        line = fgets(fid);
        lineCount = lineCount + 1;

        if lineCount >= source.startline && lineCount <= source.endline
            len = length(writelines);
            writelines{len+1} = line;
        end
    end
    fclose(fid);

    % lines in blank.mdl from the line below } (of System) till the end of
    % file
    fid = fopen(blankMdlFilepath, 'r');
    lineCount = 0;
    while ~feof(fid)
        line = fgets(fid);
        lineCount = lineCount + 1;

        if lineCount > SYSTEM_END_LINE_IN_BLANK_MDL_FILE
            len = length(writelines);
            writelines{len+1} = line;
        end
    end
    fclose(fid);

    fid = fopen(targetMdlFilepath, 'w');
    for i = 1 : length(writelines)
        line = writelines{i};
        line = strrep(line, '%', '%%'); % because '%' is escaped by matlab
        fprintf(fid, line);
    end
    fclose(fid);
end

Listing A.21: createOrUpdateSimoneConfigFile.m function definition

function createOrUpdateSimoneConfigFile(difflimit, minsize, maxsize, rename, simvmaPath)
% Create or Update configuration file for Simone
% A new file will be created if a configuration file is not found matching
% provided difflimit
% If a configuration file exists for given difflimit, the same file will be
% updated.
% The file will be located at:
% .../simvma/Simone-2.0-Complete-Cygwin64-customized-for-simvma/config/
%
% PARAMETERS:
% ------
% difflimit  : int -- maximum allowable difference between clones
% minsize    : int -- minimum number of lines required for a (sub)system
%                     to be considered as a potential clone
% maxsize    : int -- maximum number of lines allowed for a (sub)system
%                     to be considered as a potential clone
% rename     : string -- what kind of filtering is to be applied
%                     Possible values: "none"/"blind"/"consistent"
% simvmaPath : absolute path of simvma project

    configFilepath = simvmaPath + "/Simone-2.0-Complete-Cygwin64-customized-for-simvma/config/simone" + difflimit + ".cfg";
    templatePath = simvmaPath + "/Simone-2.0-Complete-Cygwin64-customized-for-simvma/config/template.cfg";

    lines = string.empty;
    fid = fopen(templatePath, 'r');
    while ~feof(fid)
        line = fgets(fid); % char vector
        line = string(line);

        tokens = line.split('=');
        if length(tokens) == 2
            % the literal "\n" appended below is turned into a newline by fprintf when the file is written
            if tokens(1) == "threshold"
                tokens(2) = string(difflimit / 100) + "\n";

            elseif tokens(1) == "minsize"
                tokens(2) = string(minsize) + "\n";

            elseif tokens(1) == "maxsize"
                tokens(2) = string(maxsize) + "\n";

            elseif tokens(1) == "rename"
                tokens(2) = rename + "\n";
            end

            line = tokens.join('=');
        end
        lines = [lines line];
    end
    fclose(fid);

    fid = fopen(configFilepath, 'w');
    for i = 1 : length(lines)
        line = lines{i};
        line = strrep(line, '%', '%%'); % because '%' is escaped by matlab
        fprintf(fid, line);
    end
    fclose(fid);
end

Listing A.22: createSourceFromXmlElement.m function definition

function source = createSourceFromXmlElement(xmlElement, simvmaPath)
    tokens = split(xmlElement);

    fileTokens = split(tokens(2), '=');
    file = fileTokens(2);
    file = cell2char(file);
    file = string(file);
    filepath = simvmaPath + "/Simone-2.0-Complete-Cygwin64-customized-for-simvma/" + file;

    startlineTokens = split(tokens(3), '=');
    startline = startlineTokens(2);
    startline = cell2int(startline);

    endlineTokens = split(tokens(4), '=');
    endline = endlineTokens(2);
    endline = cell2int(endline);

    pcidTokens = split(tokens(5), '=');
    pcidTokens = split(pcidTokens(2), '>');
    pcid = pcidTokens(1);
    pcid = cell2int(pcid);

    source = Source(filepath, startline, endline, pcid);
end

function value = cell2int(value)
    value = char(value);
    len = length(value);
    value = value(2:len-1); % strip the surrounding quote characters
    value = string(value);
    value = double(value);
    value = int64(value);
end

function value = cell2char(value)
    value = char(value);
    len = length(value);
    value = value(2:len-1); % strip the surrounding quote characters
end

Listing A.23: detectChangesInRepos.m function definition

function changed = detectChangesInRepos(prevAppAdminState, newAppAdminState, verbose)
% returns true if there is any change in any default or custom
% repository. A change means one or more of the following:
% - addition/deletion of a mdl or slx file in a default repo
% - addition/deletion of a mdl or slx file in a custom repo
% - change of 'checked' status for 1 or more default repo
% * if a custom repopath NOT containing any mdl or slx files
%   is added, it won't be detected as a 'change' (which is
%   good as we don't want to update our prediction models in
%   that case)

    if verbose
        dispUnderlined("Previous State")
        disp(prevAppAdminState);
        dispUnderlined("New State")
        disp(newAppAdminState);
        disp("Detecting changes in repositories...");
    end

    changed = false; % default

    % loop over default repos
    for i = 1 : 6
        % check for potential change of 'checked' status of 1 or more default repos
        status1 = prevAppAdminState.("defRepo" + i + "Checked");
        status2 = newAppAdminState.("defRepo" + i + "Checked");
        if status1 ~= status2
            changed = true;
            if verbose
                disp("Change detected in default repo " + i + " checked status");
            end
            return;
        end

        % check for potential addition/deletion of mdl/slx files in default repos
        paths1 = prevAppAdminState.("relpathsOfMdlSlxInDefRepo" + i);
        paths2 = newAppAdminState.("relpathsOfMdlSlxInDefRepo" + i);

        diff1 = setdiff(paths1, paths2);
        diff2 = setdiff(paths2, paths1);
        diff = union(diff1, diff2);

        if ~isempty(diff)
            changed = true;
            if verbose
                disp("Change detected in relpathsOfMdlSlxInDefRepo " + i);
            end
            return;
        end
    end

    % loop over custom repos
    % check for potential addition/deletion of mdl/slx files in custom repos
    for i = 1 : 5
        paths1 = prevAppAdminState.("abspathsOfMdlSlxInCstRepo" + i);
        paths2 = newAppAdminState.("abspathsOfMdlSlxInCstRepo" + i);

        diff1 = setdiff(paths1, paths2);
        diff2 = setdiff(paths2, paths1);
        diff = union(diff1, diff2);

        if ~isempty(diff)
            changed = true;
            if verbose
                disp("Change detected in abspathsOfMdlSlxInCstRepo " + i);
            end
            return;
        end
    end

    if verbose
        disp("No change detected in repositories.");
    end
end

Listing A.24: dispBlkSuggsOnWorkspace.m function definition

function dispBlkSuggsOnWorkspace(suggs, context)
% Displays the block suggestions on user's workspace i.e. Simulink's canvas
%
%
% PARAMETERS:
% ------
% suggs (BlockSuggestion): list of BlockSuggestion objects
% context (Context) : the context of the simulink workspace for which
%                     block suggestions are queried.
%

    assignin('base', 'simvma_context', context); % debugging
    disp(context);
    % disp(context.mrcbNeighborsMap);

    % highlightMrcbAndNeighbors(context);

    maskDisplayHeader = "image('" + getSimvmaPath() + "/images/sugg-panel-header.png" + "')";
    maskDisplayFooter = "image('" + getSimvmaPath() + "/images/sugg-panel-footer.png" + "')";
    add_block('built-in/Subsystem', context.sud + "/suggPanelHeader", ...
        'Position', context.positionSuggPanelHeaderSS, 'ShowName', 'off', ...
        'MaskDisplay', maskDisplayHeader, 'OpenFcn', handleHeaderClicked);
    add_block('built-in/Subsystem', context.sud + "/suggPanelBody", ...
        'Position', context.positionSuggPanelBodySS, 'ShowName', 'off', ...
        'OpenFcn', handleBodyClicked);

    % delete previous footer, if any, first
    try
        delete_block(context.sud + "/suggPanelFooter")
    catch ME
    end

    add_block('built-in/Subsystem', context.sud + "/suggPanelFooter", ...
        'Position', context.positionSuggPanelFooterSS, 'ShowName', 'off', ...
        'MaskDisplay', maskDisplayFooter, 'OpenFcn', handleIgnoreSuggestions);

    xMinSugg = context.positionSuggPanelBodySS(1) + (context.suggBlkDim);
    yMinSugg = context.positionSuggPanelBodySS(2) + (context.suggBlkDim / 2);
    xMaxSugg = xMinSugg + context.suggBlkDim;
    yMaxSugg = yMinSugg + context.suggBlkDim;

    % point sized
    xMinPercent = context.positionSuggPanelBodySS(1) + context.suggBlkDim * 2.5;
    yMinPercent = context.positionSuggPanelBodySS(2) + context.suggBlkDim * .9;
    xMaxPercent = xMinPercent;
    yMaxPercent = yMinPercent;

    percentTexts = string.empty;
    for i = 1 : length(suggs)
        % display block
        sugg = suggs(i);
        posSugg = [xMinSugg yMinSugg xMaxSugg yMaxSugg];
        posPercent = [xMinPercent yMinPercent xMaxPercent yMaxPercent];

        % Sometimes block-insertion fails.
        % For example, if there is already a 'control' port in a subsystem,
        % trying to insert another control port fails.
        % So, we need a try-catch block here.
        % Such block-insertion failures are notified from both an error
        % dialog and the console.
        try
            add_block(sugg.libPath, sugg.blockPath, 'Position', posSugg, 'HideAutomaticName', 0, ...
                'ForegroundColor', sugg.foregroundColor, 'BackgroundColor', sugg.backgroundColor, ...
                'OpenFcn', handleSuggestionChosen);
            yMinSugg = yMaxSugg + context.suggBlkDim/2;
            yMaxSugg = yMinSugg + context.suggBlkDim;

            yMinPercent = yMinPercent + 1.5 * context.suggBlkDim;
            yMaxPercent = yMinPercent;

        catch ME
            disp("Block Insertion Failed");
            disp(ME);
            dispUnderlined("Details of the block suggestion that failed to insert");
            disp(sugg);
            dispDlgErr("Failed to insert a block suggestion, see console log for details", "Block Insertion Failure");
        end

        % display block's prediction confidence to its right
        percentText = round(sugg.confidence * 100) + "%";
        % sometimes two block suggestions may have same percentage. In such
        % case, we cannot insert 2 subsystems with same name. So we append
        % a space to the new name to make it unique
        while (any(strcmp(percentText, percentTexts)))
            percentText = percentText + " "; % append space character to make the name unique
        end
        percentTexts = [percentTexts percentText];

        blockPath = context.sud + "/" + percentText;
        add_block('built-in/Subsystem', blockPath, 'Position', posPercent, 'ShowName', 'on');
    end
    state = getSharedVar('simvma_blockInsertionState');
    state.percentTexts = percentTexts;
    setSharedVar('simvma_blockInsertionState', state);
end


function str = handleIgnoreSuggestions()
% Returns a string of callback function code to handle the event when
% the user clicks the 'Ignore Suggestions' annotation

% IMPORTANT:
% The callback code which is read from the file
% 'src/special-files/handleIgnoreSuggestionsCallbackCode.m' is a
% "string" (rather than a function) which is interpreted at runtime (when the event occurs).
% As a result, even if we pass the arguments to this function, they
% won't be available at callback-execution-time.
% To mitigate this, we "pass" necessary arguments using the shared-var
% 'BlockInsertionState'
    str = "";
    fid = fopen(getSimvmaPath + "/src/special-files/handleIgnoreSuggestionsCallbackCode.m", 'r');
    while ~feof(fid)
        line = fgets(fid); % char vector
        line = string(line);
        str = str + line;
    end
    fclose(fid);
end


function str = handleSuggestionChosen()
% Returns a string of callback function code to handle the event when
% the user clicks a suggested block
%
% Unfortunately, we cannot pass arguments to this function (just like
% handleIgnoreSuggestions)

    str = "";
    fid = fopen(getSimvmaPath + "/src/special-files/handleSuggestionChosenCallbackCode.m", 'r');
    while ~feof(fid)
        line = fgets(fid); % char vector
        line = string(line);
        str = str + line;
    end
    fclose(fid);
end


function str = handleHeaderClicked()
    str = "disp('header clicked')";
end

function str = handleBodyClicked()
    str = "disp('body clicked')";
end

Listing A.25: dispClones.m function definition

function dispClones(clones, msg)
% display clones
%
% PARAMETERS:
% ------
% clones : cell vector -- clones to be displayed
% msg : string -- message to be displayed before displaying the clones

    dispTitle(msg);

    if isempty(clones)
        dispTitle("NO CLONE AT ALL");
    else
        for i = 1 : length(clones)
            dispUnderlined("Clone" + i);
            clone = clones{i};
            disp(clone);

            dispUnderlined("clone" + i + ".source1");
            disp(clone.source1);

            dispUnderlined("clone" + i + ".source2");
            disp(clone.source2);

            dispSeparator();
        end
    end
end

Listing A.26: dispDlgConfirmation.m function definition

function answer = dispDlgConfirmation(question)
% Return user's answer (true/false) by prompting the user to choose an option
% (YES/NO) to the passed question.

    windowTitle = "User's confirmation required!";

    buttonName = questdlg(question, windowTitle, 'Yes', 'No', 'No');
    switch buttonName
        case 'Yes'
            answer = true;
        case 'No'
            answer = false;
        otherwise
            answer = false;
    end
end

Listing A.27: dispDlgErr.m function definition

function dispDlgErr(msg, title)
% Display error message in a Window.
%
% PARAMETERS:
% ------
% msg (string): message to be displayed
% title(string): optional; title of the dialog window

    if nargin == 1 % title not provided
        title = "ERROR";
    end
    errordlg(msg, title);
end

Listing A.28: dispDlgMsg.m function definition

function dispDlgMsg(msg, title)
% Display message in a Window.
%
% PARAMETERS:
% ------
% msg (string): message to be displayed
% title(string): optional; title of the dialog window

    if nargin == 1 % title not provided
        title = "User's action required!";
    end
    warndlg(msg, title);
end

Listing A.29: dispImp.m function definition

function dispImp(msg)
    msg = string(msg);
    msg = newline + "*** " + msg + newline;
    disp(msg);
end

Listing A.30: dispKeyVal.m function definition

function dispKeyVal(key, val)
    msg = key + " : " + val;
    disp(msg);
end

Listing A.31: dispNewOnlyBlocks.m function definition

function dispNewOnlyBlocks(fm, am)
% fm : frequency model trained in as many simulink models as possible

% this function is for development purpose only
% both oldBlocks and newBlocks are list of strings

    newBlocksFm = transpose(string(fields(fm.data)));
    newBlocksAm = am.blockTypes;

    newBlocks = [newBlocksFm newBlocksAm];

    oldBlocks = string.empty;
    start = false;

    fid = fopen("/Users/bhisma/courses/cse-700-simvma/simvma/src/classes/BlockSugg.m", 'r');
    while ~feof(fid)
        line = fgets(fid); % char vector
        line = string(line);

        line = strip(line);
        if startsWith(line, "builtinBlocks = [...")
            start = true;
            continue;
        end

        if line == "];"
            break;
        end

        if start
            block = replace(line, ",...", "");
            block = replace(block, '"', '');
            oldBlocks = [oldBlocks block];
        end
    end
    fclose(fid);

    blocks = setdiff(newBlocks, oldBlocks);
    for i = 1 : length(blocks)
        disp(blocks(i));
    end
end

Listing A.32: dispSeparator.m function definition

function dispSeparator()
    disp("------");
end

Listing A.33: dispSpaced.m function definition

function dispSpaced(msg)
    msg = string(msg);
    msg = newline + msg + newline;
    disp(msg);
end

Listing A.34: dispSpacedAbove.m function definition

function dispSpacedAbove(msg)
    msg = string(msg);
    msg = newline + msg;
    disp(msg);
end

Listing A.35: dispSpacedBelow.m function definition

function dispSpacedBelow(msg)
    msg = string(msg);
    msg = msg + newline;
    disp(msg);
end

Listing A.36: dispSuggestions.m function definition

function dispSuggestions(suggestions, msg)
% display suggestions
%
% PARAMETERS:
% ------
% suggestions : cell vector -- suggestions to be displayed
% msg : string -- message to be displayed before displaying the suggestions

    dispTitle(msg);

    if isempty(suggestions)
        dispTitle("NO SUGGESTION AT ALL");
    else
        for i = 1 : length(suggestions)
            dispUnderlined("Suggestion" + i);
            sug = suggestions{i};
            disp(sug);
            disp(sug.source);
            dispSeparator();
        end
    end
end

Listing A.37: dispTitle.m function definition

function dispTitle(msg)
    msg = string(msg);
    msg = newline + "======" + msg + " ======" + newline;
    disp(msg);
end

Listing A.38: dispUnderlined.m function definition

function dispUnderlined(msg)
    msg = char(msg);
    len = length(msg);
    dashes = "";
    for i = 1 : len
        dashes = dashes + "-";
    end

    disp(msg);
    disp(dashes);
end

Listing A.39: dispUnderlinedSpaced.m function definition

function dispUnderlinedSpaced(msg)
    msg = char(msg);
    len = length(msg);
    dashes = "";
    for i = 1 : len
        dashes = dashes + "-";
    end

    disp(newline);
    disp(msg);
    disp(dashes);
    disp(newline);
end

Listing A.40: dispUnderlinedSpacedAbove.m function definition

function dispUnderlinedSpacedAbove(msg)
    msg = char(msg);
    len = length(msg);
    dashes = "";
    for i = 1 : len
        dashes = dashes + "-";
    end

    disp(newline);
    disp(msg);
    disp(dashes);
end

Listing A.41: exportMdlAsPng.m function definition

function exportMdlAsPng(mdlFilepath, pngFilepath)
% export MDL file as PNG file

    [fPath, fName, fExt] = fileparts(mdlFilepath);

    loadedPreviously = bdIsLoaded(fName);

    if ~loadedPreviously
        load_system(mdlFilepath);
    end

    print("-s" + fName, "-dpng", pngFilepath); % print('-sadder', '-dpng', 'adder.png')

    % make sure, if the system was loaded before calling this function,
    % it remains loaded, and vice-versa
    if ~loadedPreviously
        close_system(mdlFilepath);
    end
end

Listing A.42: filenamesMatch.m function definition

function result = filenamesMatch(filepath1, filepath2)
% Return true if filenames match, else return false
%
% PARAMETERS:
% ------
% filepath1 : string -- absolute path of first file
% filepath2 : string -- absolute path of second file

    [folder1, bdName1, ext1] = fileparts(filepath1);
    [folder2, bdName2, ext2] = fileparts(filepath2);

    if bdName1 == bdName2
        result = true;
    else
        result = false;
    end
end

Listing A.43: filterOutClonesBeyondThreshold.m function definition

function filteredClones = filterOutClonesBeyondThreshold(clones, thresholdLower, thresholdUpper)
    filteredClones = {};
    for i = 1 : length(clones)
        c = clones{i};
        if c.similarity >= thresholdLower && c.similarity <= thresholdUpper
            filteredClones{end+1} = c;
        end
    end
end

Listing A.44: filterOutClonesWithSource1NotMatchingSUD.m function definition

function filteredClones = filterOutClonesWithSource1NotMatchingSUD(clones, sudPath)
% This function removes clones for which the source1.system does not match
% the (Sub)System-Under-Development.
%
% PARAMETERS:
% ------
% clones : cell array -- clones on which filtration is to be applied
% sudPath : string -- path of (sub)system under development, as returned by
%                     gcs (eg: "mymodel/a/b/c")
%
% NOTE:
% -----
% When simone detects clones, the source1 (from which we derive source1.system)
% will be from the model-under-development, and source2 will be from the repositories.
% But, the source1 could be other than the (Sub)System Under Development
% which we are interested in (because there can be multiple sub-systems in
% the model-under-development).

    % Since the first input to Simone is the folder containing "sud.mdl" file
    % rather than the actual MUD, the sudPath needs to be adjusted -- the
    % system path up to the SUD needs to be trimmed as "sud.mdl"'s system
    % hierarchy starts with the SUD.
    % eg. if actual sudPath is "dd/s1/s2", then sud.mdl will begin with
    % "s2", thus "dd/s1/" needs to be trimmed.
    %
    % eg1: "dd/s1/s2" --> "s2"
    % eg2: "dd" --> "dd"
    % eg3: "dd/s1/sub//system" --> "sub//system"

    % We want to split the sudPath at '/',
    % but we don't want to split at '//'
    sudPath = sudPath.replace('//', '___slash_in_sub_system_name___');

    tokens = sudPath.split("/");
    sudPath = tokens(end);
    sudPath = sudPath.replace('___slash_in_sub_system_name___', '//');


    filteredClones = {};

    for i = 1 : length(clones)
        clone = clones{i};
        if clone.source1.system == sudPath
            filteredClones{end+1} = clone;
            % disp("pass : " + clone.source1.system);
        else
            % disp("fail : " + clone.source1.system);
        end
    end
end
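The path trimming described in the comments of Listing A.44 can be tried in isolation. The sketch below reuses the three example paths from those comments and the same placeholder technique; the variable names `examples`, `protected` and `leaves` are introduced here for illustration and are not part of the SimIMA sources.

```matlab
% Trim each system path down to its last component, treating '//' as an
% escaped slash inside a (sub)system name rather than as a separator.
examples = ["dd/s1/s2", "dd", "dd/s1/sub//system"];
placeholder = "___slash_in_sub_system_name___";

leaves = strings(size(examples));
for k = 1 : numel(examples)
    protected = replace(examples(k), "//", placeholder); % hide escaped slashes
    tokens = split(protected, "/");                      % split at real separators
    leaves(k) = replace(tokens(end), placeholder, "//"); % restore escaped slashes
end

disp(leaves) % "s2"    "dd"    "sub//system"
```

Replacing `//` before splitting is what keeps a name such as `sub//system` in one piece; splitting first would break it into `sub`, an empty token, and `system`.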

Listing A.45: filterOutDuplicateClones.m function definition

function clones = filterOutDuplicateClones(clones)
    % Often, more than one clone (in repositories) has the same similarity with the
    % SUD. In many cases, these clones (from repos) are 100% similar to each
    % other too. In such a case, we don't want to display duplicate suggestions
    % (for example, the same suggestion (after renaming) in 1st and 2nd rank). This
    % function filters such duplicate clones.
    %
    % This filtration is simple and computationally affordable. It is based on
    % comparing clone.similarity and clone.nlines. This filters out "duplicate"
    % clones (intended), but also filters out those clones which are not
    % duplicates even though they match in similarity and nlines (not intended).
    %
    % A stricter filtration approach would be one based on the Simone
    % similarity value between the potential duplicate clones (those with the same
    % similarity with the SUD). However, that would require us to compute the Simone
    % similarity for each clone pair matching in percentage similarity with
    % the SUD, and would be prohibitively time-intensive.

    clonesBak = clones;
    clones = {};
    for i = 1 : length(clonesBak)
        clone = clonesBak{i};
        if ~matchesAnyPrevious(clone, clones)
            clones{end + 1} = clone;
        end
    end
end

function present = matchesAnyPrevious(clone, prevClones)
    % Returns true if clone matches any one of prevClones (in terms of nlines and
    % similarity). Else returns false.
    present = false;
    for i = 1 : length(prevClones)
        pc = prevClones{i};
        if clone.similarity == pc.similarity && clone.nlines == pc.nlines
            present = true;
            return;
        end
    end
end
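
The keep-the-first-occurrence rule of Listing A.45 can also be expressed on plain numbers. The sketch below is an illustration only and is not the thesis implementation: it assumes each clone is reduced to a hypothetical (similarity, nlines) row, and uses the built-in unique with the 'rows' and 'stable' options to find the rows that would survive the filter.

```matlab
% Hypothetical clones, one row each: [similarity, nlines].
% Rows 1 and 2 agree in both columns, so row 2 counts as a "duplicate".
keys = [90 40;
        90 40;
        85 40];

% 'stable' preserves the original (rank) order and keeps the first
% occurrence of every distinct row.
[~, keepIdx] = unique(keys, 'rows', 'stable');
disp(keepIdx');   % indices of the clones that are kept
```

As in the listing, two different clones that happen to share both values are still merged; the sketch does not remove that limitation.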

Listing A.46: filterOutUninsertableBlockSuggs.m function definition

function suggsPassed = filterOutUninsertableBlockSuggs(suggs)
    % Remove all suggestions which are not insertable into the Simulink
    % workspace
    %
    % PARAMETERS:
    % ------
    % - suggs (array of BlockSugg)

    uninsertableBlockTypes = ["Reference"];

    suggsPassed = BlockSugg.empty;
    for i = 1 : length(suggs)
        sugg = suggs(i);
        pass = true;
        if any(strcmp(sugg.blockType, uninsertableBlockTypes))
            pass = false;
        end

        if pass
            suggsPassed = [suggsPassed sugg];
        end
    end
end

Listing A.47: getBlockSuggs.m function definition

function suggs = getBlockSuggs(context, verbose)
    % Returns an array of a maximum of N BlockSuggs, sorted by rank;
    % N is determined by simvma_appState.simgestionNSuggsMax
    %
    % PARAMETERS:
    % ------
    % context (Context) : the context of the simulink workspace for which
    %                     block suggestions are queried.
    % verbose (logical) : if true, suggestions are also printed to the console.
    %
    % ASSUMPTION:
    % ------
    % - shared-vars/simvma_freqModel.mat and shared-vars/simvma_armModel.mat
    %   exist
    %
    % APPROACH:
    % ------
    % - context is passed in as an argument
    % - prediction-models are read from shared-vars
    % - BlockSuggs are obtained from each prediction model
    % - A final list of BlockSuggs is prepared based on a 'weighted-model'
    %   approach

    state = getSharedVar('simvma_appState');

    switch state.simgestionAccuracyLevel
        case 1 % less accurate but fast
            useArmModel = false;
            useFreqModel = true;
        case 2 % medium accuracy, medium speed
            useArmModel = true;
            useFreqModel = false;
        case 3 % more accurate but slow
            useArmModel = true;
            useFreqModel = true;
        otherwise
            error("From getBlockSuggs(): Invalid value for accuracyLevel " + state.simgestionAccuracyLevel);
    end

    suggsArm = BlockSuggFromPredModel.empty;
    suggsFreq = BlockSuggFromPredModel.empty;

    if useArmModel
        armModel = getSharedVar('simvma_armModel');
        [suggsArm, timeArm] = armModel.predict(context);
        dispKeyVal('execution time for ArmModel', timeArm);
    end

    if useFreqModel
        freqModel = getSharedVar('simvma_freqModel');
        [suggsFreq, timeFreq] = freqModel.predict(context);
        dispKeyVal('execution time for FreqModel', timeFreq);
    end

    % display suggestions from prediction models in console
    if verbose && useArmModel
        dispUnderlinedSpacedAbove("Suggestions from Arm Model")
        for i = 1 : length(suggsArm)
            sugg = suggsArm(i);
            disp(sugg.blockType + " : " + sugg.confidence);
        end
    end
    if verbose && useFreqModel
        dispUnderlinedSpacedAbove("Suggestions from Freq Model")
        for i = 1 : length(suggsFreq)
            sugg = suggsFreq(i);
            disp(sugg.blockType + " : " + sugg.confidence);
        end
    end

    % COMBINE SUGGESTIONS FROM MULTIPLE PREDICTION MODELS

    if useArmModel && useFreqModel
        % weights are tuned experimentally to maximize prediction accuracy
        WEIGHT_ARM = 0.3;
        WEIGHT_FREQ = 0.7;
    elseif useArmModel && ~useFreqModel
        WEIGHT_ARM = 1;
        WEIGHT_FREQ = 0;
    elseif ~useArmModel && useFreqModel
        WEIGHT_ARM = 0;
        WEIGHT_FREQ = 1;
    else
        error("Invalid appState: At least 1 block-prediction-model must be used");
    end

    assert(WEIGHT_ARM + WEIGHT_FREQ == 1, "FROM getBlockSuggs.m: Assertion failed: Sum of all weights must be equal to 1.")

    % to combine suggestions, we first need to make sure the #suggestions
    % is the same from each prediction model (to match matrix dimensions).
    % So, we pad "fake" suggestions (with confidence 0), if necessary.
    % These fake suggestions will be ignored by combineBlockSuggsFromPredModel()
    nSuggsArm = length(suggsArm);
    nSuggsFreq = length(suggsFreq);
    nSuggsMax = max(nSuggsArm, nSuggsFreq);

    % pad dummy suggestions to prepare suggs2D (for combineBlockSuggsFromPredModel())
    if nSuggsArm < nSuggsMax
        for i = 1 : nSuggsMax - nSuggsArm
            suggsArm = [suggsArm BlockSuggFromPredModel("", 0)];
        end
    end
    if nSuggsFreq < nSuggsMax
        for i = 1 : nSuggsMax - nSuggsFreq
            suggsFreq = [suggsFreq BlockSuggFromPredModel("", 0)];
        end
    end

    % each row contains suggestions from 1 prediction model
    suggsPredModels2D = [suggsArm; suggsFreq];
    % these are sorted by confidence
    suggsPredModels = combineBlockSuggsFromPredModel(suggsPredModels2D, [WEIGHT_ARM, WEIGHT_FREQ]);

    % create BlockSuggs from suggsPredModels
    suggs = BlockSugg.empty;
    reservedBlockNames = context.blockNamesInSud;
    for i = 1 : length(suggsPredModels)
        sugg = suggsPredModels(i);
        blockType = sugg.blockType;
        rank = i;
        confidence = sugg.confidence;
        blockName = generateBlockName(blockType, reservedBlockNames);
        reservedBlockNames = [reservedBlockNames blockName];
        blockPath = context.sud + "/" + blockName;

        sugg = BlockSugg(blockType, rank, confidence, blockName, blockPath);
        suggs = [suggs sugg];
    end

    suggs = filterOutUninsertableBlockSuggs(suggs);

    % display suggestions in console
    if verbose && useArmModel
        dispUnderlinedSpacedAbove("Suggestions from Ensemble Model")
        for i = 1 : length(suggs)
            sugg = suggs(i);
            disp(sugg.blockType + " : " + sugg.confidence);
        end
    end

    % limit #suggs
    state = getSharedVar('simvma_appState');
    if length(suggs) > getSharedVarSimgestionNSuggsMax()
        suggs = suggs(1 : state.simgestionNSuggsMax);
    end
end

function name = generateBlockName(blockType, reservedNames)
    % generate a unique name for the block
    % convention: blockType followed by an optional number
    blockType = string(blockType);
    name = blockType;
    i = 0;
    while any(strcmp(reservedNames, name))
        i = i + 1;
        name = blockType + i;
    end
end
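
Listing A.47 delegates the 'weighted-model' combination to combineBlockSuggsFromPredModel(), which is not reproduced in this part of the appendix. To make the idea concrete, the sketch below assumes that the combined confidence of a block type is the weighted sum of the confidences the individual models assign to it. That rule, the variable names, and the example values are all assumptions made for illustration; only the weights 0.3 and 0.7 come from the listing.

```matlab
% Assumed example output of the two prediction models.
armTypes  = ["Gain", "Sum"];
armConf   = [0.8, 0.4];
freqTypes = ["Sum", "Constant"];
freqConf  = [0.6, 0.2];

weightArm  = 0.3;   % WEIGHT_ARM in Listing A.47
weightFreq = 0.7;   % WEIGHT_FREQ in Listing A.47

allTypes = unique([armTypes, freqTypes]);   % every suggested block type once
combined = zeros(size(allTypes));
for k = 1 : numel(allTypes)
    t = allTypes(k);
    % A model that did not suggest this type contributes confidence 0
    % (sum over an empty selection is 0).
    combined(k) = weightArm  * sum(armConf(armTypes == t)) ...
                + weightFreq * sum(freqConf(freqTypes == t));
end

% Rank by combined confidence, highest first.
[combined, order] = sort(combined, 'descend');
allTypes = allTypes(order);
disp(allTypes + " : " + combined);
```

Under this assumed rule, a block type proposed by both models can outrank one that a single model proposes with higher confidence, which is the usual motivation for combining models.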

Listing A.48: getClonesFromXmlFile.m function definition

function clones = getClonesFromXmlFile(xmlFilepath, simvmaPath)
    clones = {};
    fid = fopen(xmlFilepath, 'r');

    lineCount = 0;
    while ~feof(fid)
        line = fgets(fid);
        lineCount = lineCount + 1;

        % the occurrence of the tag repeats every 5 lines,
        % after line 5, which we leverage
        if lineCount > 5
            mod5 = mod(lineCount, 5);

            if mod5==1
                cloneLine = line;
            end

            if mod5==2
                srcLine1 = line;
            end

            if mod5==3
                srcLine2 = line;
            end

            % make new Clone object and append to clones
            if mod5==0
                clone = createCloneFromXmlElementLines(cloneLine, srcLine1, srcLine2, simvmaPath);
                len = length(clones);
                clones{len+1} = clone;
            end
        end
    end
    fclose(fid);
end
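
The modulo-5 bookkeeping of Listing A.48 can be followed on an in-memory example. The sketch below is hypothetical: the placeholder strings stand in for report lines (the real Simone XML content is not shown here), and the records are collected as plain string triples instead of Clone objects.

```matlab
% Placeholder report: 5 header lines, then two records of 5 lines each.
reportLines = "header " + (1:5);
for r = 1 : 2
    reportLines = [reportLines, "clone " + r, "src1 of " + r, ...
                   "src2 of " + r, "other " + r, "other " + r];
end

records = {};
for lineCount = 6 : numel(reportLines)      % lines 1..5 are skipped
    switch mod(lineCount, 5)
        case 1                              % lines 6, 11, ...
            cloneLine = reportLines(lineCount);
        case 2                              % lines 7, 12, ...
            srcLine1 = reportLines(lineCount);
        case 3                              % lines 8, 13, ...
            srcLine2 = reportLines(lineCount);
        case 0                              % lines 10, 15, ...: record complete
            records{end+1} = [cloneLine, srcLine1, srcLine2];
    end
end
disp(numel(records));
```

The approach depends on the report keeping a fixed five-line rhythm; a parser based on the XML structure would be more robust if that layout ever changed.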

Listing A.49: getConnectedBlockHandlesByBlockHandle.m function definition

function handles = getConnectedBlockHandlesByBlockHandle(blockHandle, connectionType)
    % Return an array of handles of all immediately connected blocks.
    %
    % RAISES_ERROR: Since this function is not guaranteed to succeed, you may
    % need to use a try-catch construct in the caller.
    % (see detailed comments in getParamsByHandle.m)
    %
    %
    % By "immediately connected", we mean the blocks are connected by a line
    % directly to the block given by blockHandle
    %
    % SUBSYSTEM HANDLES, IF ANY, WILL BE IGNORED IN THE RETURNED ARRAY
    % AS THAT WOULD NOT BE USEFUL FOR OUR APPLICATION
    %
    % The argument blockHandle can be that of a subsystem too.
    %
    %
    % The returned array will be empty i.e. [] if an unconnected block's handle
    % is passed in.
    %
    % All returned handles are valid i.e. they don't correspond to some
    % invalid/deleted object (at the time of invocation of this function)
    %
    %
    % ASSUMPTIONS:
    % ------
    % corresponding system is loaded
    %
    % PARAMETERS:
    % ------
    % blockHandle(handle) : handle of a block (can be a subsystem too)
    % connectionType(str/char) : src/dst/both
    %

    assert (any(strcmp(["src", "dst", "both"], connectionType)));
    params = getParamsByHandle(blockHandle, false); % may raise error
    portConns = params.PortConnectivity;

    handles = [];
    for i = 1:length(portConns)
        if connectionType == "both"
            if portConns(i).SrcBlock
                handles = [handles portConns(i).SrcBlock];
            else
                handles = [handles portConns(i).DstBlock];
            end
        elseif connectionType == "src"
            if portConns(i).SrcBlock
                handles = [handles portConns(i).SrcBlock];
            end
        else % connectionType == "dst"
            if portConns(i).DstBlock
                handles = [handles portConns(i).DstBlock];
            end
        end
    end

    % It was found that if an unconnected subsystem is provided as
    % blockHandle, -1.00 is returned (which is an invalid handle), but we
    % want to return an empty array i.e. [] in such a case. This statement
    % removes any handle which is -1.00.
    handles = handles(handles ~= -1);

    % filter out subsystem handles
    % this must be done after removing handle=-1.00, if any
    % otherwise get(handle, 'BlockType') will throw an error for handle=-1.00
    handles1 = [];
    for i = 1 : length(handles)
        handle = handles(i);
        blockType = get(handle, 'BlockType');
        if string(blockType) ~= "SubSystem"
            handles1 = [handles1 handle];
        end
    end
    handles = handles1;
end

Listing A.50: getConnectedBlockHandlesByBlockPath.m function definition

function handles = getConnectedBlockHandlesByBlockPath(blockPath, connectionType)
    % Return an array of handles of all immediately connected blocks.
    %
    % RAISES_ERROR: Since this function is not guaranteed to succeed, you may
    % need to use a try-catch construct in the caller.
    % (see detailed comments in getParamsByHandle.m)
    %
    %
    % By "immediately connected", we mean the blocks are connected by a line
    % directly to the block given by blockPath
    %
    % SUBSYSTEM HANDLES, IF ANY, WILL BE IGNORED IN THE RETURNED ARRAY
    % AS THAT WOULD NOT BE USEFUL FOR OUR APPLICATION
    %
    % The argument blockPath can be that of a subsystem too.
    %
    %
    % The returned array will be empty i.e. [] if an unconnected block's path
    % is passed in.
    %
    % All returned handles are valid i.e. they don't correspond to some
    % invalid/deleted object (at the time of invocation of this function)
    %
    %
    % ASSUMPTIONS:
    % ------
    % corresponding system is loaded
    %
    % PARAMETERS:
    % ------
    % blockPath(str/char) : path of a block (can be a subsystem too)
    % connectionType(str/char) : src/dst/both

    assert (any(strcmp(["src", "dst", "both"], connectionType)));

    blockHandle = get_param(blockPath, 'handle'); % path to handle
    handles = getConnectedBlockHandlesByBlockHandle(blockHandle, connectionType); % may raise error
end

Listing A.51: getConnectedBlockTypesWithCountByBlockHandle.m function definition

function map = getConnectedBlockTypesWithCountByBlockHandle(blockHandle, connectionType)
    % Return a map such that the key is BlockType and the value is the count. Each
    % entry in the map corresponds to one type of immediately connected block.
    %
    % RAISES_ERROR: Since this function is not guaranteed to succeed, you may
    % need to use a try-catch construct in the caller.
    % (see detailed comments in getParamsByHandle.m)
    %
    % By "immediately connected", we mean the blocks are connected by a line
    % directly to the block given by blockHandle
    %
    % ANY CONNECTED BLOCKS OF TYPE "SubSystem" ARE IGNORED
    % AS THAT WOULD NOT BE USEFUL FOR OUR APPLICATION

    % ASSUMPTIONS:
    % ------
    % corresponding system is loaded
    %
    % PARAMETERS:
    % ------
    % blockHandle(handle) : handle of a block (can be a subsystem too)
    % connectionType(str/char) : src/dst/both

    assert (any(strcmp(["src", "dst", "both"], connectionType)));

    handles = getConnectedBlockHandlesByBlockHandle(blockHandle, connectionType); % may raise error
    map = containers.Map('KeyType', 'char', 'ValueType', 'double');
    for i = 1:length(handles)
        handle = handles(i);
        blockType = get_param(handle, 'BlockType');
        if map.isKey(blockType)
            map(blockType) = map(blockType) + 1;
        else
            map(blockType) = 1;
        end
    end
end

Listing A.52: getConnectedBlockTypesWithCountByBlockPath.m function definition

function map = getConnectedBlockTypesWithCountByBlockPath(blockPath, connectionType)
    % Return a map such that the key is BlockType and the value is the count. Each
    % entry in the map corresponds to one type of immediately connected block.
    %
    % RAISES_ERROR: Since this function is not guaranteed to succeed, you may
    % need to use a try-catch construct in the caller.
    % (see detailed comments in getParamsByHandle.m)
    %
    % By "immediately connected", we mean the blocks are connected by a line
    % directly to the block given by blockPath
    %
    % ANY CONNECTED BLOCKS OF TYPE "SubSystem" ARE IGNORED
    % AS THAT WOULD NOT BE USEFUL FOR OUR APPLICATION

    % ASSUMPTIONS:
    % ------
    % corresponding system is loaded
    %
    % PARAMETERS:
    % ------
    % blockPath(str/char) : path of a block (can be a subsystem too)
    % connectionType(str/char) : src/dst/both

    assert (any(strcmp(["src", "dst", "both"], connectionType)));

    handles = getConnectedBlockHandlesByBlockPath(blockPath, connectionType); % may raise error
    map = containers.Map('KeyType', 'char', 'ValueType', 'double');
    for i = 1:length(handles)
        handle = handles(i);
        blockType = get_param(handle, 'BlockType');
        if map.isKey(blockType)
            map(blockType) = map(blockType) + 1;
        else
            map(blockType) = 1;
        end
    end
end

Listing A.53: getContext.m function definition

function context = getContext()
    % Return current context as Context object. This context will be used to
    % make block suggestions.
    %
    sud = gcs; % system under development
    mrcbPath = string(gcb); % most recently clicked block
    mrcbParams = getParamsByPath(mrcbPath);
    mrcbBlockType = mrcbParams.BlockType;
    mrcbPosition = mrcbParams.Position;

    map = getConnectedBlockTypesWithCountByBlockPath(mrcbPath, "both"); % containers.Map
    keys = map.keys; % cell array (char)
    values = map.values; % cell array (double)
    mrcbNeighborsMap = struct;
    for i = 1 : length(keys)
        key = string(keys{i});
        key = convertBlockTypeToField(key);
        value = values{i};
        mrcbNeighborsMap.(key) = value;
    end

    % find xMin, xMax, yMin, yMax (to know what they mean, see Context.m)
    xMin = inf;
    xMax = -inf;
    yMin = inf;
    yMax = -inf;

    sudParams = getParamsByPath(sud);
    blocks = string(sudParams.Blocks);
    blockNamesInSud = string.empty;
    blockTypesInSud = string.empty;
    for i = 1 : length(blocks)
        block = blocks(i);
        % some block-names contain the character '/'. But since
        % '/' is used as a separator between parent and child
        % blocks, '/' in a block-name is replaced with '//'
        block = strrep(block, '/', '//');
        blockPath = sud + "/" + block;
        blockParams = getParamsByPath(blockPath); % block params
        blockNamesInSud = [blockNamesInSud blockParams.Name];
        blockTypesInSud = [blockTypesInSud blockParams.BlockType];

        pos = blockParams.Position;
        xMinBlock = pos(1);
        yMinBlock = pos(2);
        xMaxBlock = pos(3);
        yMaxBlock = pos(4);

        if xMinBlock < xMin
            xMin = xMinBlock;
        end

        if xMaxBlock > xMax
            xMax = xMaxBlock;
        end

        if yMinBlock < yMin
            yMin = yMinBlock;
        end

        if yMaxBlock > yMax
            yMax = yMaxBlock;
        end
    end

    blockTypesInSud = unique(blockTypesInSud); % remove duplicates, sort alphabetically
    % block names, unlike block-types, are already unique
    blockNamesInSud = sort(blockNamesInSud); % sort alphabetically

    context = Context(xMin, xMax, yMin, yMax, sud, blockNamesInSud, blockTypesInSud, mrcbPath, mrcbBlockType, mrcbPosition, mrcbNeighborsMap);
end
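
The bounding-box search in Listing A.53 can be illustrated without a loaded model. The following sketch assumes the block positions have already been collected into a matrix with one row per block, in Simulink's Position layout [left top right bottom]; the matrix and its values are made up for the example.

```matlab
% Assumed example: positions of three blocks, one row per block,
% columns are [left top right bottom].
positions = [ 30  40  60  70;
             100  20 130  50;
              55  90  85 120];

xMin = min(positions(:, 1));   % leftmost edge of any block
yMin = min(positions(:, 2));   % topmost edge
xMax = max(positions(:, 3));   % rightmost edge
yMax = max(positions(:, 4));   % bottommost edge

disp([xMin, xMax, yMin, yMax]);
```

One difference from the listing is worth keeping in mind: for a system with no blocks, the loop leaves the initial values inf and -inf in place, whereas min and max of an empty column return an empty array.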

Listing A.54: getCygwinPath.m function definition

function cygwinPath = getCygwinPath()
    % Return Cygwin's path

    % MATLAB recognizes both '\' and '/' as path separators for
    % windows. So, we use '/' instead of '\' for consistency
    cygwinPath = "C:/cygwin64";

end

Listing A.55: getFieldWithMaxVal.m function definition

function fieldWithMaxVal = getFieldWithMaxVal(structure)
    % Return the field (string) with maximum value.
    % If there are multiple fields with the maximum value, only one (whichever
    % comes first when parsing the structure fields) will be returned.
    %
    % If structure is empty i.e. has no field, an empty string i.e. "" will be
    % returned.
    %
    % ASSUMPTIONS:
    % ------
    % - value of each field in the structure is a number (double)
    %
    % PARAMETERS:
    % ------
    % structure (struct) : structure whose fields are to be searched.

    fieldNames = string(fields(structure));
    fieldWithMaxVal = "";
    maxVal = -Inf;

    for i = 1 : length(fieldNames)
        fieldName = fieldNames(i);
        val = structure.(fieldName);
        if val > maxVal
            fieldWithMaxVal = fieldName;
            maxVal = val;
        end
    end
end

Listing A.56: getLoadedModelsAbsFilepaths.m function definition

function loadedModelsAbsFilepaths = getLoadedModelsAbsFilepaths()
    % Return a list of absolute paths of all loaded models.
    % IMPORTANT:
    % ------
    % this returns the absolute "real" path (even if a symlink is used to load
    % the model)
    loadedModelsAbsFilepaths = string(get_param(Simulink.allBlockDiagrams(), 'FileName'));
end

Listing A.57: getLoadedOnlyModelsAbsFilepaths.m function definition

function abspaths = getLoadedOnlyModelsAbsFilepaths()
    % Return a list of absolute paths of all loaded (but not open) models.
    loadedPaths = getLoadedModelsAbsFilepaths();
    abspaths = string.empty;
    for i = 1 : length(loadedPaths)
        path = loadedPaths(i);
        if ~isOpenByAbspath(path)
            abspaths = [abspaths path];
        end
    end
end

Listing A.58: getNFilesRecursively.m function definition

function nFiles = getNFilesRecursively(dirpath, extensions)
    % Return the count of (nested) files located inside the given directory, and
    % having one of the given extensions.
    %
    % If dirpath is empty i.e. "", returns 0
    % PARAMETERS:
    % ------
    % dirpath (string) : path of directory (absolute or relative)
    % extensions (string) : file extensions (without leading .) -- Only those
    %                       files matching these extensions (case is ignored)
    %                       will be counted.
    %                       This can be one of:
    %                       - single string (eg: "mdl")
    %                       - single char-array (eg: 'mdl')
    %                       - list of strings (eg: ["mdl", "slx"])
    %                       This CANNOT be a list of chars (eg: ['mdl', 'slx']
    %                       is INVALID).

    dirpath = string(dirpath);
    extensions = string(extensions);
    nFiles = 0;
    if dirpath == ""
        return;
    end
    for i = 1 : length(extensions)
        ext = extensions(i);
        path = dirpath + "/**/*." + ext;
        nFiles = nFiles + length(dir(path));
    end
end

Listing A.59: getNInportOutport.m function definition

function [nInport, nOutport] = getNInportOutport(systemPath)
    % Return the number of inports and outports of the (sub)system
    % If the system is a top-level system, both nInport and nOutport will be 0
    %
    % ASSUMPTIONS:
    % ------
    % 1. system is loaded
    %
    % PARAMETERS:
    % ------
    % systemPath : (sub)system's path
    %              This should be in the format returned by 'gcs' eg:
    %              "mymodel/a/b/mysubsystem"

    if isTopSystem(systemPath)
        nInport = 0;
        nOutport = 0;
    else
        % IMPORTANT: the subSystem must be a character vector (rather than a string).
        % Otherwise it SOMETIMES (don't know in detail why that happens) fails.

        % systemPath must not be top-level system
        params = get_param(char(systemPath), 'PortHandles');

        nInport = length(params.Inport);
        nOutport = length(params.Outport);
    end
end

Listing A.60: getOpenModelsAbsFilepaths.m function definition

function abspaths = getOpenModelsAbsFilepaths()
    % Return a list of absolute paths of all open (not just loaded) models.
    loadedPaths = getLoadedModelsAbsFilepaths();
    abspaths = string.empty;
    for i = 1 : length(loadedPaths)
        path = loadedPaths(i);
        if isOpenByAbspath(path)
            abspaths = [abspaths path];
        end
    end
end

Listing A.61: getParamsByHandle.m function definition

function structure = getParamsByHandle(handle, sorted)
    % Return a struct containing all parameters of the given handle
    % (sorted alphabetically if requested)
    %
    % RAISES_ERROR: Since this function is not guaranteed to succeed, you may
    % need to use a try-catch construct in the caller.
    % (see detailed comments below)
    %
    %
    % ASSUMPTIONS:
    % ------
    % corresponding system is loaded
    %
    % PARAMETERS:
    % ------
    % handle : handle -- element's handle. The element can be block-diagram/
    %          block/line/...
    % sorted(boolean) : -- if true, params are sorted alphabetically
    %
    % IMPORTANT: Since this function is not guaranteed to succeed (see comment
    % below), use a try-catch construct in the caller.

    % Suppress warnings while the parameters are read. The onCleanup object
    % restores the previous warning state when the function exits, even if
    % get(handle) raises an error (a plain "warning on" after the call would
    % be skipped in that case).
    originalWarningState = warning;
    warning('off', 'all');
    restoreWarnings = onCleanup(@() warning(originalWarningState));

    % 'get(handle)' (and hence this function) raises an exception in some cases.
    % One such case is when one or more of the connected blocks references an unavailable library element.
    % For example, in the following model:
    %   GitHub/T-MATS-master/Resources/Testing/TestBeds/SolverSS/SolverSS_TestBed.slx
    % if the handle of the block 'SolverSS_TestBed/Bus Selector' is passed
    % to this function, get(handle) raises an exception.

    % So, you may want to call this function from a try-catch construct.

    structure = get(handle); % gives warning: Warning: The value of this parameter is only valid when the model is in a compiled state and is a top model.
    if sorted
        [~, neworder] = sort(lower(fieldnames(structure)));
        structure = orderfields(structure, neworder);
    end
end

Listing A.62: getParamsByPath.m function definition

function structure = getParamsByPath(path)
    % Return a struct containing all parameters (not sorted alphabetically) of
    % the given block-diagram/block/line...
    %
    % RAISES_ERROR: Since this function is not guaranteed to succeed, you may
    % need to use a try-catch construct in the caller.
    % (see detailed comments in getParamsByHandle.m)
    %
    %
    % ASSUMPTIONS:
    % ------
    % corresponding system is loaded
    % path is valid
    %
    % PARAMETERS:
    % ------
    % path(string) : element's path in a model (as returned by gcs).
    %                The element can be block-diagram/block/line/...
    %

    handle = get_param(path, 'handle');
    structure = getParamsByHandle(handle, false); % may raise error
end

Listing A.63: getParamsByPathSorted.m function definition

function structure = getParamsByPathSorted(path)
    % Return a struct containing all parameters sorted alphabetically of the
    % given block-diagram/block/line...
    %
    % ASSUMPTIONS:
    % ------
    % corresponding system is loaded
    % path is valid
    %
    % PARAMETERS:
    % ------
    % path(string) : -- element's path. The element can be block-diagram/
    %                block/line/...

    handle = get_param(path, 'handle');
    structure = getParamsByHandle(handle, true);
end

Listing A.64: getSharedVar.m function definition

function value = getSharedVar(name)
    % Return the value of the shared variable from the corresponding file in
    % shared-vars/*.mat
    %
    % PARAMETERS:
    % ------
    % name(string) : name of the variable
    %
    % ASSUMPTION: corresponding .mat file exists in shared-vars/

    % don't include simvma_simvmaPath here.
    % simvma_simvmaPath can be accessed using getSimvmaPath() function
    validVars = [ ...
        "simvma_appState", ...
        "simvma_blockInsertionState", ...
        "simvma_armCache", ...
        "simvma_freqCache", ...
        "simvma_armModel", ... % combination of 1 (of 64) default ArmModel and custom ArmModel
        "simvma_freqModel", ...
        "simvma_armModelCst", ...
        "simvma_freqModelCst", ... % custom FreqModel (i.e. trained on custom repos only)
        "simvma_freqModelStd01", ...
        "simvma_freqModelStd10", ...
        "simvma_freqModelStd11", ...
        "simvma_armModelStd01", ...
        "simvma_armModelStd10", ...
        "simvma_armModelStd11", ...
        "simvma_symlinkMap", ... % key: symbolic path, value: real path
        "simvma_tempvar", ... % this variable is for testing purpose only.
        ];

    % total of 64 models trained on all possible combinations of 6 default
    % repos i.e. simvma_freqModelDef_000000 ... simvma_freqModelDef_111111
    % (and likewise for simvma_armModelDef_*)
    for i = 0 : 1
        for j = 0 : 1
            for k = 0 : 1
                for l = 0 : 1
                    for m = 0 : 1
                        for n = 0 : 1
                            am = "simvma_armModelDef_" + i + j + k + l + m + n;
                            fm = "simvma_freqModelDef_" + i + j + k + l + m + n;
                            validVars = [validVars am fm];
                        end
                    end
                end
            end
        end
    end

    if ~any(strcmp(validVars, name))
        error("Unexpected var : " + name);
    end

    filepath = getSimvmaPath() + "/shared-vars/" + name + ".mat";
    loadedVars = load(filepath);
    value = loadedVars.(name);

end
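
The six nested loops in Listing A.64 enumerate every 6-bit pattern. As a possible alternative, and only as a sketch, the same 128 names can be generated with the built-in dec2bin; the variable names below are hypothetical and the thesis code does not use this form.

```matlab
% dec2bin(0:63, 6) yields a 64x6 char matrix, one zero-padded bit pattern
% per row; string() turns each row into one string element.
suffixes = string(dec2bin(0:63, 6));            % "000000" ... "111111"

armNames  = "simvma_armModelDef_"  + suffixes;  % 64x1 string array
freqNames = "simvma_freqModelDef_" + suffixes;  % 64x1 string array

defaultModelVars = [armNames; freqNames].';     % 1x128 row, like validVars
disp(numel(defaultModelVars));
```

The names come out in a different order than in the listing (all Arm names first, then all Freq names), which does not matter for the membership test performed with strcmp.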

Listing A.65: getSharedVarSimgestionNSuggsMax.m function definition

function value = getSharedVarSimgestionNSuggsMax()
    % Return the value of the shared variable simvma_appState.simgestionNSuggsMax.
    % Created this function because this variable (simgestionNSuggsMax) is frequently
    % accessed.
    %
    state = getSharedVar('simvma_appState');
    value = state.simgestionNSuggsMax;
end

Listing A.66: getSimvmaPath.m function definition

function simvmaPath = getSimvmaPath()
    % Return simvma project's absolute path. If not found, return empty string.
    %
    % IMPORTANT:
    % ------
    % Unlike other functions, this function must not reside inside src/functions
    % because this function is used by initialize.m. (When initialize.m is
    % first invoked, the path src/functions is not yet added to Matlab's
    % path.)

    simvmaPath = "";
    % to avoid redundant computation, we first check if the variable
    % 'simvma_simvmaPath' exists in the base workspace. If found, the same
    % path is returned. If not, a search is made on Matlab's path list to
    % determine simvmaPath.
    % The cached value is written to the base workspace (see assignin below),
    % so it has to be looked up there; a plain exist(..., 'var') would only
    % inspect this function's own workspace and never find it.
    if evalin('base', 'exist(''simvma_simvmaPath'', ''var'')')
        simvmaPath = evalin('base', 'simvma_simvmaPath');
    else
        % the command 'path' returns all paths as ONE LONG char vector
        paths = string(path);
        if ispc
            % MATLAB recognizes both '\' and '/' as path separators for
            % windows. We make this replacement to avoid the mixing of
            % these two symbols (although that would not be a problem for
            % MATLAB).
            paths = paths.replace('\', '/');
            paths = paths.split(";");
        else
            paths = paths.split(":");
        end

        for i = 1:length(paths)
            p = string(paths(i));
            if p.endsWith('SimIMA')
                % more validation to make sure this is indeed the path we are
                % looking for
                dirs = ["src", "Simone-2.0-Complete-Cygwin64-customized-for-simvma"];
                files = ["initialize.m", "getSimvmaPath.m", "sl_customization.m", "README.md"];

                okay = true;
                for j = 1:length(dirs)
                    abspath = p + "/" + dirs(j);
                    if ~exist(abspath, 'dir')
                        disp("From getSimvmaPath(): Directory " + abspath + " not found");
                        okay = false;
                        break;
                    end
                end

                if okay
                    for k = 1:length(files)
                        abspath = p + "/" + files(k);
                        if ~exist(abspath, 'file')
                            disp("From getSimvmaPath(): File " + abspath + " not found");
                            okay = false;
                            break;
                        end
                    end
                end

                if okay
                    simvmaPath = p;
                    break;
                end
            end
        end
    end

    if strlength(simvmaPath) > 0
        % this is not just for debugging, so don't delete
        % we store simvmaPath in workspace to use as a 'cache' for
        % subsequent calls to getSimvmaPath().
        assignin('base', 'simvma_simvmaPath', simvmaPath);
    end
end

Listing A.67: getSortedClones.m function definition

function [clones, simoneError] = getSortedClones(simvmaPath, clonePaths, maxClones, ...
        ignore100PercentMatchingClones, sudPath, simone_difflimit, ...
        simone_minsize, simone_maxsize, simone_rename)
    % get clones as 'cell'
    % each clone is a Clone object,
    % and can be accessed as ans{index}.
    %
    % We are returning simoneError in addition to clones because it is needed
    % by evaluateOneSet.m (in SimXample-evaluation)
    %
    % PARAMETERS:
    % ------
    % simvmaPath : string -- absolute path of simvma project
    % clonePaths : string -- list of absolute paths of the directories containing
    %              mdl files with potential clones
    % maxClones : maximum number of clones to be returned
    % ignore100PercentMatchingClones : if true, 100% matching clones will be
    %              ignored
    % sudPath : string -- path of the (sub)system under development
    % simone_difflimit : int -- maximum allowable difference between clones
    % simone_minsize : int -- minimum number of lines required for a (sub)system
    %              to be considered as a potential clone
    % simone_maxsize : int -- maximum number of lines allowed for a (sub)system
    %              to be considered as a potential clone
    % simone_rename : string -- which kind of renaming is to be applied
    %              Possible values: "none"/"blind"/"consistent"

    originalPath = pwd; % backup
    srcPath = simvmaPath + "/src";
    simonePath = simvmaPath + "/Simone-2.0-Complete-Cygwin64-customized-for-simvma";
    simoneXmlReportPath = simonePath + "/mdl-file_systems-sort-blind-crossclones/mdl-file_systems-sort-blind-crossclones-0." + simone_difflimit + ".xml";

    cd(simonePath)
    disp("Removing links to previous clonePaths, if any ...");
    if exist('mdl-files', 'dir')
        rmdir('mdl-files', 's'); % remove directory mdl-files including its sub-directory tree
    end
    disp(" ");

    % remove previous simone results, if any
    if ispc
        cmd = getCygwinPath() + "/bin/bash cleanall";
    else
        cmd = "./cleanall";
    end
    disp(cmd);
    unix(cmd);

    mkdir('mdl-files'); % re-create empty mdl-files directory

    % make symbolic links of all clone-paths to mdl-files/

    % NOTE: symbolic links will work fine for Simone under Windows too because Simone will be executed from within cygwin.
    % However, later when creating Clone objects from the Simone xml report,
    % MATLAB (in Windows) will complain "Invalid file identifier" as it won't be able to recognize the file using symbolic links.
    % This error will be raised from getClonesFromXmlFile -> createCloneFromXmlElementLines -> createSourceFromXmlElement -> Source -> getSystemPathFromStartLine
    % Thus, to make SimIMA windows-compatible, we save the mapping from the
    % symbolic link to the original file in shared-var simvma_symlinkMap.
    % Since the filepaths include characters such as :, \, /, -, etc. which
    % are not supported as keys in struct, we implement this map using
    % containers.Map

    symlinkMap = containers.Map; % key: symlink, val: original path
    for i = 1:length(clonePaths)
        clonePath = clonePaths(i);
        if ispc
            cmd = getCygwinPath() + "/bin/ln -s " + clonePath + " mdl-files/";
        else
            cmd = "ln -s " + clonePath + " mdl-files/";
        end
        disp(cmd);
        unix(cmd);

        clonePathTokens = clonePath.split('/');
        clonePathDirName = clonePathTokens(end);
        symlinkPath = simonePath + "/mdl-files/" + clonePathDirName;
        symlinkMap(symlinkPath) = clonePath;
    end
    setSharedVar('simvma_symlinkMap', symlinkMap);

    % make sure the configuration file for Simone exists
    createOrUpdateSimoneConfigFile(simone_difflimit, simone_minsize, simone_maxsize, simone_rename, simvmaPath);

    disp(newline + "Running SIMONE to extract clones ... ");
    % eg: ./simonecross -difflimit 50% mdl-file mdl-files
    if ispc
        cmd = getCygwinPath() + "/bin/bash simonecross -difflimit " + simone_difflimit + "% mdl-file mdl-files";
    else
        cmd = "./simonecross -difflimit " + simone_difflimit + "% mdl-file mdl-files";
    end
    unix(cmd);

    cd(srcPath);

    if exist(simoneXmlReportPath, 'file')
        simoneError = false;
        clones = getClonesFromXmlFile(simoneXmlReportPath, simvmaPath);
        clones = sortClonesBySimilarity(clones);
        clones = filterOutDuplicateClones(clones);

        assignin('base', 'allClones', clones); % export to workspace (for debugging)
        dispClones(clones, "ALL SORTED CLONES ( Total : " + length(clones) + " )");
        dispSpaced("# All clones : " + length(clones));

        if ignore100PercentMatchingClones
            clones = filterOutClonesBeyondThreshold(clones, simone_difflimit, 99);
        else
            clones = filterOutClonesBeyondThreshold(clones, simone_difflimit, 100);
        end

        dispSpaced("# clones < 100% similarity : " + length(clones));

        clones = filterOutClonesWithSource1NotMatchingSUD(clones, sudPath);
        dispSpaced("# clones matching SUD : " + length(clones));
        assignin('base', 'clones', clones); % export to workspace (for debugging)

        if length(clones) > maxClones
            clones = clones(1:maxClones);
        end

        dispSpaced("# clones selected to create suggestions : " + length(clones));

    else
        simoneError = true;
        clones = {};
    end

    cd(originalPath) % return to the original path
end

Listing A.68: getStartAndEndLinesFromSystemPath.m function definition

function [startline, endline] = getStartAndEndLinesFromSystemPath(mdlFilepath, systemPath)
% Return the startline, endline of (sub)system that matches given systemPath
%
%
% A (sub)system's 'path' here is simply a string that gives full path of
% the (sub)system beginning from the top system. Thus, systemPath must
% be in the format: "model_name/a/b/c/sub_system_name"
%
% PARAMETERS:
% ------
% mdlFilepath : string -- absolute filepath of Model Under Development
% systemPath : string -- path of (sub)System. This must be in format as
%                        returned by 'gcs' eg: "mymodel/a/b/mysubsystem"
%
% ASSUMPTIONS:
% - both mdlFilepath and systemPath are valid. this means:
%     1. a mdl file exists for given mdlFilepath
%     2. the mdl file contains a system given by systemPath
%
%   IF THIS REQUIREMENT IS NOT SATISFIED AN ERROR WILL BE THROWN (IT
%   WON'T BE HANDLED)
%
% - the mdl file is created by Simulink and is in 'standard' format
%   such that:
%     - the first property of System{} is always 'Name'
%     - "{" and "}" appear only once per line (at max), and only as last
%       character (after stripping white space)

    fid = fopen(mdlFilepath, 'r');

    lines = string.empty; % all lines of file after stripping
    while ~feof(fid)
        line = fgets(fid); % char vector
        line = string(strip(line));
        lines(length(lines) + 1) = line;
    end
    fclose(fid);

    pathStack = string.empty;
    braceStack = string.empty;
    startlineStack = [];

    for i = 1:length(lines)
        line = lines(i);
        if line == "System {"
            startlineStack = [startlineStack i];

            % get name of (sub)system
            nextline = lines(i+1);
            sysName = getSystemNameFromLine(nextline, "gcs");

            pathStack = [pathStack sysName];
            braceStack = [braceStack "sys{"];

        elseif line.endsWith("{")
            braceStack = [braceStack "{"];

        elseif line.endsWith("}")
            braceStack = [braceStack "}"];
        end

        % the length checks guard against indexing an empty or
        % single-entry stack (same guard as in getSystemPathFromStartline)
        len = length(braceStack);
        if len > 1 && braceStack(len) == "}" && braceStack(len-1) == "{"
            % this pair corresponds to some block/line other than a
            % System{} block

            % remove last two entries that form a {} pair
            braceStack(end) = [];
            braceStack(end) = [];
        end

        len = length(braceStack);
        if len > 1 && braceStack(len) == "}" && braceStack(len-1) == "sys{"
            % this pair corresponds to a System{} block

            if systemPath == pathStack.join('/')
                endline = i;
                startline = startlineStack(end);
                return;
            else
                % remove last two entries that form a sys{} pair
                braceStack(end) = [];
                braceStack(end) = [];

                % remove the last entry of pathStack and startlineStack too
                pathStack(end) = [];
                startlineStack(end) = [];
            end
        end
    end
end

Listing A.69: getSuggestionFromClone.m function definition

function suggestion = getSuggestionFromClone(clone, rank, simvmaPath)
% create Suggestion object from clone.source2
% WARNING: This will override any suggestion.mdl and
% suggestion.png files in simvma/tmp/ folder

    % remove corresponding mdl and png files from simvma/tmp/ folder
    cmd = "rm " + simvmaPath + "/tmp/simxample-suggs/suggestion" + rank + "_*";
    unix(cmd);

    source = clone.source2;
    similarity = clone.similarity;
    % we insert a random number in the filename to make it unique every time
    % it is generated, because app-designer does not reload images when the
    % name remains unchanged.
    randomNumber = randi(1000000);
    mdlFilepath = simvmaPath + "/tmp/simxample-suggs/suggestion" + rank + "_" + randomNumber + ".mdl";
    imgFilepath = simvmaPath + "/tmp/simxample-suggs/suggestion" + rank + "_" + randomNumber + ".png";

    createMdlFileFromSource(source, mdlFilepath, simvmaPath);
    exportMdlAsPng(mdlFilepath, imgFilepath);

    suggestion = Suggestion(similarity, source, mdlFilepath, imgFilepath, rank);
end

Listing A.70: getSystemNameFromLine.m function definition

function sysName = getSystemNameFromLine(line, format)
% Return the system's name from line.
% With format "mdl", the returned string is as it appears in the mdl file
% (NOT as returned by gcs); with format "gcs", it is converted to the
% format returned by gcs.
%
% PARAMETERS:
% ------
% line : string -- the line corresponding to the system's name
% format : string
%     "mdl" : returned string is in the same format as it appears
%             in the mdl file
%     "gcs" : returned string is in the same format as this part
%             would appear in the char-vector-returned-by-gcs (converted
%             to string)
%

    if ~any(strcmp(["mdl", "gcs"], format)) % if format matches none in the list
        disp("format: " + format);
        error("*** ERROR: INVALID ARGUMENT");
    end

    line = string(line);
    line = line.strip();
    sysName = extractAfter(line, 5);
    sysName = strip(sysName);

    % remove leading and trailing "
    sysName = char(sysName);
    sysName = sysName(2:end-1);
    sysName = string(sysName);

    if format == "gcs"
        sysName = convertSystemNameToGcsFormat(sysName);
    end
end

Listing A.71: getSystemNamesAndLines.m function definition

function [names, startlines, endlines] = getSystemNamesAndLines(mdlFilepath)
% return 3 arrays : System names, startlines, and endlines (all are of
% same length)

    fid = fopen(mdlFilepath, 'r');

    lines = string.empty; % all lines of file after stripping
    while ~feof(fid)
        line = fgets(fid); % char vector
        line = string(strip(line));
        lines(length(lines) + 1) = line;
    end

    fclose(fid);

    names = string.empty;
    startlines = [];
    endlines = [];

    for i = 1:length(lines)
        line = lines(i);
        if line == "System {"
            startlines(length(startlines) + 1) = i;
            % ASSUMPTION: the first attribute of any System{} is 'Name'
            nextline = lines(i+1);
            tokens = split(nextline);

            name = char(tokens(2)); % attribute 'Name'
            % strip first and last character,
            % because split() returns tokens by surrounding with
            % double-double quotes: eg: ""MySubsystem""
            ll = length(name);
            name = string(name(2:ll-1));

            names(length(names)+1) = name;

            openBraceCount = 1;
            j = i;
            while (true)
                j = j + 1;
                line = lines(j);
                if line.endsWith("{")
                    openBraceCount = openBraceCount + 1;
                end
                if line.endsWith("}")
                    openBraceCount = openBraceCount - 1;
                end
                if openBraceCount == 0
                    endlines(length(endlines) + 1) = j;
                    break;
                end
            end
        end
    end
end
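The brace-counting step used in getSystemNamesAndLines can be tried in isolation. The following is a minimal sketch, assuming a hand-written, hypothetical mdl-like fragment held in memory (the variable names `lines`, `startline`, `depth`, and `endline` are illustrative and not part of SimIMA). It relies on the same assumption stated in the listings: braces appear at most once per line and only as the last character.

```matlab
% Hypothetical mdl-like fragment (already stripped of surrounding whitespace).
lines = [
    "Model {"
    "System {"
    "Name ""top"""
    "Block {"
    "}"
    "}"
    "}"
];

startline = 2;   % line holding "System {"
depth = 1;       % the opening brace of System{} is already counted
j = startline;
while depth > 0
    j = j + 1;
    if endsWith(lines(j), "{")
        depth = depth + 1;   % a nested block opens
    end
    if endsWith(lines(j), "}")
        depth = depth - 1;   % a block (or the System itself) closes
    end
end
endline = j;     % line of the brace that closes System{}
disp(endline);
```

Here the nested Block{} opens on line 4 and closes on line 5, so the counter returns to zero on line 6, which is the closing brace of the System{} block.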

Listing A.72: getSystemPathFromStartline.m function definition

function path = getSystemPathFromStartline(mdlFilepath, startline)
% Return the 'path' of (sub)system that starts at given 'startline'
% If no such (sub)system is found, return "__NOT_FOUND__"
%
% A (sub)system's 'path' here is simply a string that gives full path of
% the (sub)system beginning from the top system. Thus, returned value will
% be in the format : "model_name/a/b/c/sub_system_name"
% This is the same format as the char vector returned by gcs converted to string
%
%
% ASSUMPTIONS:
% - the mdl file is created by Simulink and is in 'standard' format
%   such that:
%     - the first property of System{} is always 'Name'
%     - "{" and "}" appear only once per line (at max), and only as last
%       character (after stripping white space)

    fid = fopen(mdlFilepath, 'r');

    lines = string.empty; % all lines of file after stripping
    while ~feof(fid)
        line = fgets(fid); % char vector
        line = string(strip(line));
        lines(length(lines) + 1) = line;
    end

    fclose(fid);

    if ~lines(startline).startsWith("System {")
        path = "__NOT_FOUND__";
        return
    end

    pathStack = string.empty;
    braceStack = string.empty;

    for i = 1:length(lines)
        if i > startline + 1
            break;
        end

        line = lines(i);
        if line == "System {"
            nextline = lines(i+1);
            sysName = getSystemNameFromLine(nextline, "gcs");
            pathStack = [pathStack sysName];
            braceStack = [braceStack "sys{"];

        elseif line.endsWith("{")
            braceStack = [braceStack "{"];

        elseif line.endsWith("}")
            braceStack = [braceStack "}"];
        end

        len = length(braceStack);

        if len > 1 && braceStack(len) == "}" && braceStack(len-1) == "{"
            % this pair corresponds to some block/line other than a
            % System{} block

            % remove last two entries that form a {} pair
            braceStack(end) = [];
            braceStack(end) = [];
        end

        len = length(braceStack);
        if len > 1 && braceStack(len) == "}" && braceStack(len-1) == "sys{"
            % this pair corresponds to a System{} block

            % remove last two entries that form a sys{} pair
            braceStack(end) = [];
            braceStack(end) = [];

            % remove the last entry of pathStack too
            pathStack(end) = [];
        end

    end
    path = join(pathStack, "/");
end

Listing A.73: hash.m function definition

function res = hash(str)
% Return the MD5 hash (modified) of given string
%
% PARAMETERS:
% ------
% str (string/char): string whose hash is to be computed.

    import java.security.*;
    import java.math.*;
    md = MessageDigest.getInstance('MD5');
    % convert to char first: double() applied to a string scalar attempts a
    % text-to-number conversion (yielding NaN) instead of character codes
    digest = md.digest(double(char(str)));
    bi = BigInteger(1, digest);
    res = char(bi.toString(16));
    res = string(res);
    % the name of a field in a struct must not begin with a number
    % to make sure this hash value is ALWAYS good to be used as a field
    % name in a struct (its intended use in FreqModel), we add a prefix "f_"
    res = "f_" + res;
end
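The reason for the "f_" prefix can be checked directly with MATLAB's `isvarname`, which applies the same naming rules as struct field names. The sketch below is illustrative only: the hex text is an arbitrary example digest, and the variable names `rawHex`, `key`, and `counts` are hypothetical rather than taken from FreqModel.

```matlab
% An example hex digest that happens to start with a digit (illustrative value).
rawHex = '9e107d9d372bb6826bd81d3542a419d6';

startsValid = isvarname(rawHex);   % false: names cannot begin with a digit
key = ['f_' rawHex];               % same prefixing idea as in hash.m
prefixedValid = isvarname(key);    % true: begins with a letter

% The prefixed value can then be used as a dynamic struct field name.
counts = struct;
counts.(key) = 1;
disp(fieldnames(counts));
```

Without the prefix, roughly ten out of sixteen possible leading hex characters would produce an invalid field name, so prefixing unconditionally is simpler than checking each digest.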

Listing A.74: hashByFilepath.m function definition

function res = hashByFilepath(abspath)
% Return the MD5 hash (modified) of the content of the given file
%
% PARAMETERS:
% ------
% abspath (string/char): absolute filepath of the file, whose content's hash
%                        is to be computed

    abspath = string(abspath);
    fileContent = fileread(abspath);
    res = hash(fileContent);
end

Listing A.75: highlightMrcbAndNeighbors.m function definition

function highlightMrcbAndNeighbors(context)
% Highlights the Most Recently Clicked Block (MRCB) and its neighbors
%
% PARAMETERS:
% ------
% context (Context) : the context of the simulink workspace

    mrcbParams = getParamsByPath(context.mrcbPath);
    mrcbHandle = mrcbParams.Handle;
    set_param(mrcbHandle, 'ForegroundColor', 'blue');

    for i = 1 : length(context.mrcbNeighborsHandles)
        handle = context.mrcbNeighborsHandles(i);
        set_param(handle, 'ForegroundColor', 'cyan');
    end
end

Listing A.76: initialize.m function definition

function initialize()
% This function does the following jobs:
% 1. determine simvmaPath by searching through all paths; throw error if
%    the path cannot be determined.
% 2. save simvmaPath (determined from above job) in
%    shared-vars/simvmaPath.mat file so that it can be accessed by any
%    part of the project
% 3. add path of simvma/src to matlab path.
%    This function will be called before performing any action by all the
%    callback functions so that the functions in simvma/src are always
%    available to use.
% 4. create necessary files/directory structure. this includes:
%    - /Simone-2.0-Complete-Cygwin64-customized-for-simvma/mdl-file/
%    - /Simone-2.0-Complete-Cygwin64-customized-for-simvma/mdl-files/
% 5. if any of the custom repopaths in shared-var simvma_appState is non-existent,
%    reset it to "". (A custom repo path might be non-existent because of
%    one of the following reasons: )
%    - the user deleted the folder in the machine
%    - the user copied the project (simvma) from one machine to another
%      where the custom repo path (from source machine) does not exist
% 6. if task 5 "detects" any change in the custom repos, this function also
%    triggers a corresponding update in block-prediction models.
% 7. removes any unwanted mdl files immediately inside /tmp/
%
% ASSUMPTION: simvma path is already added to matlab path (read README.md)

    % task 1, task 2
    simvmaPath = getSimvmaPath();

    if boolean(simvmaPath)
        % task 3
        addpath(simvmaPath + "/src/apps");
        addpath(simvmaPath + "/src/classes");
        addpath(simvmaPath + "/src/functions");
        addpath(simvmaPath + "/src/testfunns");
        addpath(simvmaPath + "/src/special-files");
        addpath(simvmaPath + "/src/functions/devt");

        % task 4
        dirpaths = [
            simvmaPath + "/Simone-2.0-Complete-Cygwin64-customized-for-simvma/mdl-file", ...
            simvmaPath + "/Simone-2.0-Complete-Cygwin64-customized-for-simvma/mdl-files" ...
        ];

        for i = 1 : length(dirpaths)
            dirpath = dirpaths(i);
            if ~exist(dirpath, 'dir')
                mkdir(dirpath);
            end
        end

        % task 5
        state = getSharedVar('simvma_appState');
        stateCleaned = state.cleanCustomRepoPaths();
        setSharedVar('simvma_appState', stateCleaned);

        % task 6
        if detectChangesInRepos(state, stateCleaned, true)
            updatePredModels(true);
        end

        % task 7
        delete(simvmaPath + "/tmp/*.mdl");

    else
        warndlg("SimIMA path could not be resolved. Please read the README.md file and set the SimIMA path", "Path not set");
    end
end

Listing A.77: isLoadedByAbspath.m function definition

function loaded = isLoadedByAbspath(abspath)
% Return true if passed model is loaded, else return false
%
%
% PARAMETERS:
% ------
% abspath : string -- absolute path of model file

    loaded = false;
    loadedPaths = getLoadedModelsAbsFilepaths();
    realPath = symbolicPathToRealPath(abspath);

    for i = 1 : length(loadedPaths)
        if loadedPaths(i) == realPath
            loaded = true;
            break;
        end
    end
end

Listing A.78: isLoadedByModelName.m function definition

function loaded = isLoadedByModelName(modelName)
% Return true if any of the loaded models has the same name as the passed
% model name, else return false
%
%
% PARAMETERS:
% ------
% modelName (string/char) -- model name (as would be returned by gcs)

    modelName = string(modelName); % cannot be char

    loaded = false;
    % this returns the absolute "real"path (even if symlink is used to load
    % the model)
    loadedMdls = get_param(Simulink.allBlockDiagrams(), 'Name');
    for i = 1 : length(loadedMdls)
        if loadedMdls(i) == modelName
            loaded = true;
        end
    end
end

Listing A.79: isOpenByAbspath.m function definition

function isOpen = isOpenByAbspath(abspath)
% Return true if passed model is open (not just loaded), else return false
%
%
% PARAMETERS:
% ------
% abspath : string -- absolute path of model file

    [folder_, bdName, ext_] = fileparts(abspath);
    bdName = string(bdName);

    isOpen = false;
    isLoaded = isLoadedByAbspath(abspath);
    if isLoaded && get_param(bdName, 'Shown') == "on"
        isOpen = true;
    end
end

Listing A.80: isTopSystem.m function definition

function result = isTopSystem(systemPath)
% Return true if passed (sub)system's handle is 'Top' level system, else return false
%
% ASSUMPTIONS:
% ------
% 1. system is loaded
%
% PARAMETERS:
% ------
% systemPath : string -- (sub)system's name (which is to be updated)
%                        This should be in format as returned by 'gcs' eg:
%                        "mymodel/a/b/mysubsystem"

% LOGIC: the 'Type' of top-level system is 'block_diagram', whereas that of
% a sub-system is 'block'

blockType = get_param(systemPath, 'Type');

if string(blockType) == "block_diagram"
    result = true;
else
    result = false;
end

Listing A.81: mdl2slx.m function definition

function mdl2slx(mdlFilePath, slxFilePath)
% Convert mdl file to slx
%
% PARAMETERS:
% ------
% mdlFilePath : (string) absolute path of the mdl file to be converted
% slxFilePath : (string) absolute path of the slx file to be created
%               The corresponding folder will be created if it
%               does not exist yet.
%
% IMPORTANT:
% ------
% - Conversion is not guaranteed to succeed. So, put this function in a
%   try-catch construct (in the caller).
%

    mdlFilePath = string(mdlFilePath);
    [slxFolderPath, slxFileName, slxExt] = fileparts(slxFilePath);

    % create slx-folder if it does not exist already
    if ~exist(slxFolderPath, 'dir')
        mkdir(slxFolderPath);
    end

    load_system(mdlFilePath);
    save_system(gcs, slxFilePath); % save as slx
    close_system(gcs);
end

Listing A.82: mergeArmData.m function definition

function data = mergeArmData(data1, data2)
% Merges 2 'data' attributes of ArmModel
% Original data are not affected.
%
% PARAMETERS:
% ------
% data1 (struct): first data
% data2 (struct): second data
%

    data = data1;
    n = length(fields(data));
    for i = 1 : length(fields(data2))
        data.("ss" + (n+i)) = data2.("ss" + i);
    end
end

Listing A.83: mergeArmModels.m function definition

function am = mergeArmModels(verbose, armModels)
% Merges the given ArmModels into one
% Original models are not affected.
%
% PARAMETERS:
% ------
% verbose : boolean
% armModels : 1D array of ArmModels to be merged

    % there may be duplicates in this variable (hashes).
    % duplicate hashes are dealt with inside ArmModel.trainByFilehash()
    % This is done deliberately so that we can print 'DUPLICATE' status
    % in case duplicate hash values are found.

    hashes = string.empty;
    for i = 1 : length(armModels)
        armModel = armModels(i);
        hashes = [hashes armModel.hashTrainingFiles];
    end

    am = ArmModel();
    am = am.trainByFilehash(hashes, verbose);
end

Listing A.84: mergeFreqData.m function definition

function data = mergeFreqData(data1, data2)
% Merges 2 'data' attributes of FreqModel
% Original data are not affected.
%
% PARAMETERS:
% ------
% data1 (struct): first data
% data2 (struct): second data
%

    data = struct;
    blocks1 = transpose(string(fields(data1))); % 1xM string array
    blocks2 = transpose(string(fields(data2))); % 1xN string array

    blocksCommon = intersect(blocks1, blocks2);
    blocks1Only = setdiff(blocks1, blocks2);
    blocks2Only = setdiff(blocks2, blocks1);

    % common blocks
    for i = 1 : length(blocksCommon)
        blockName = blocksCommon(i);
        data.(blockName) = struct;

        % merge 'count'
        data.(blockName).count = data1.(blockName).count + data2.(blockName).count;

        % merge 'src'
        src1 = data1.(blockName).src;
        src2 = data2.(blockName).src;
        data.(blockName).src = mergeInner(src1, src2);

        % merge 'dst'
        dst1 = data1.(blockName).dst;
        dst2 = data2.(blockName).dst;
        data.(blockName).dst = mergeInner(dst1, dst2);

        % merge 'both'
        both1 = data1.(blockName).both;
        both2 = data2.(blockName).both;
        data.(blockName).both = mergeInner(both1, both2);
    end

    % blocks in data1 only
    for i = 1 : length(blocks1Only)
        blockName = blocks1Only(i);
        data.(blockName) = data1.(blockName);
    end

    % blocks in data2 only
    for i = 1 : length(blocks2Only)
        blockName = blocks2Only(i);
        data.(blockName) = data2.(blockName);
    end

end


function merged = mergeInner(first, second)
% merges the inner fields i.e 'src'/'dst'/'both'
    merged = struct;
    merged.count = first.count + second.count;
    merged.details = struct;
    blocks1 = transpose(string(fields(first.details)));
    blocks2 = transpose(string(fields(second.details)));

    blocksCommon = intersect(blocks1, blocks2);
    blocks1Only = setdiff(blocks1, blocks2);
    blocks2Only = setdiff(blocks2, blocks1);

    % common blocks
    for i = 1 : length(blocksCommon)
        blockName = blocksCommon(i);
        merged.details.(blockName) = first.details.(blockName) + second.details.(blockName);
    end

    % blocks in first only
    for i = 1 : length(blocks1Only)
        blockName = blocks1Only(i);
        merged.details.(blockName) = first.details.(blockName);
    end

    % blocks in second only
    for i = 1 : length(blocks2Only)
        blockName = blocks2Only(i);
        merged.details.(blockName) = second.details.(blockName);
    end
end
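The three-way split (common, first-only, second-only) used when merging the 'details' counts can be illustrated on two small structs. This is a sketch with made-up block names and counts, not data from FreqModel; `first`, `second`, and `merged` are hypothetical stand-ins for the `details` structs handled by mergeInner.

```matlab
% Hypothetical per-block counts from two separately trained models.
first  = struct('Gain', 2, 'Sum', 1);
second = struct('Gain', 3, 'Scope', 4);

names1 = transpose(string(fieldnames(first)));   % 1xM string array
names2 = transpose(string(fieldnames(second)));  % 1xN string array

common     = intersect(names1, names2);  % present in both: add the counts
firstOnly  = setdiff(names1, names2);    % present only in first: copy
secondOnly = setdiff(names2, names1);    % present only in second: copy

merged = struct;
for i = 1 : numel(common)
    merged.(common(i)) = first.(common(i)) + second.(common(i));
end
for i = 1 : numel(firstOnly)
    merged.(firstOnly(i)) = first.(firstOnly(i));
end
for i = 1 : numel(secondOnly)
    merged.(secondOnly(i)) = second.(secondOnly(i));
end
disp(merged);
```

In this example 'Gain' appears in both inputs, so its counts are summed to 5, while 'Sum' and 'Scope' are carried over unchanged.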

Listing A.85: mergeFreqModels.m function definition

function fm = mergeFreqModels(verbose, freqModels)
% Merges the given FreqModels into one
% Original models are not affected.
%
% PARAMETERS:
% ------
% verbose : boolean
% freqModels : 1D array of FreqModels to be merged

    % there may be duplicates in this variable (hashes).
    % duplicate hashes are dealt with inside FreqModel.trainByFilehash()
    % This is done deliberately so that we can print 'DUPLICATE' status
    % in case duplicate hash values are found.

    hashes = string.empty;
    for i = 1 : length(freqModels)
        freqModel = freqModels(i);
        hashes = [hashes freqModel.hashTrainingFiles];
    end

    fm = FreqModel();
    fm = fm.trainByFilehash(hashes, verbose);
end

Listing A.86: removeSimvmaPathPrefix.m function definition

function relpath = removeSimvmaPathPrefix(abspath)
% Remove simvma path prefix from given absolute path
% eg:
% "/Users/bhisma/courses/cse-700-simvma/simvma/default-repos/automotive/Matlab_Central/MinorTimeStepLogging/SimpleBounce.mdl"
% --> "default-repos/automotive/Matlab_Central/MinorTimeStepLogging/SimpleBounce.mdl"
%
% PARAMETERS:
% - abspath (char/string) : absolute file/folder path

% ASSUMPTION:
% - given abspath starts with simvmapath

    simvmaPathLength = length(char(getSimvmaPath()));
    relpath = extractAfter(abspath, simvmaPathLength + 1);
end
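The "+ 1" in the `extractAfter` call skips the path separator that follows the project root. A minimal sketch, assuming a hypothetical root `/home/u/simvma` in place of the value returned by getSimvmaPath():

```matlab
% Hypothetical project root and absolute path (illustrative values only).
root    = "/home/u/simvma";
abspath = "/home/u/simvma/default-repos/x.mdl";

% extractAfter(str, pos) returns everything AFTER character position pos.
% Position strlength(root) + 1 is the "/" right after the root, so the
% result starts at the first character of the relative path.
relpath = extractAfter(abspath, strlength(root) + 1);
disp(relpath);
```

Note that this only produces a meaningful result under the stated assumption that the absolute path really begins with the project root; no prefix check is performed.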

Listing A.87: replaceMdlFileContent.m function definition

function replaceMdlFileContent(replacedFilepath, replacerFilepath)
% replace entire content of one mdl file by that of another
%
% The name of top-level system will still be preserved (i.e. it will still
% match replaced's filename)
%
% The replacement works irrespective of whether the file to be replaced is
% open/loaded in Simulink, and the state (closed/open/loaded) is preserved.
%
% PARAMETERS:
% ------
% replacedFilepath : string -- absolute path of the mdl file whose content
%                              is to be replaced
% replacerFilepath : string -- absolute path of the mdl file whose content
%                              will replace the replaced mdl's content
%
% ASSUMPTIONS:
% ------
% - provided filepaths are valid

    [folder_, bdName, ext_] = fileparts(replacedFilepath);
    bdName = string(bdName);

    isLoaded = bdIsLoaded(bdName);

    % a block-diagram (bd) is considered 'open' if it is not just loaded,
    % but actually OPENED. The window of an opened bd may be minimized;
    % that does not matter.

    if isLoaded && get_param(bdName, 'Shown') == "on"
        isOpen = true;
    else
        isOpen = false;
    end

    % replace content.
    % in doing so, preserve the state (closed/loaded/open, window position)
    if isOpen
        % re-opening the mdl file at the same location (on screen) is tricky
        % after replacement is done, DON'T open directly,
        % rather, load the model first, restore the location, and then open
        % if the model is directly opened (without loading), the focus will
        % be in (sub)system rather than top level system (if such
        % sub-system exists after replacement), as a result set_param(...,
        % 'location') won't show effect

        loc = get_param(bdName, 'location');
        save_system(replacedFilepath);
        close_system(replacedFilepath);
        replaceContent(replacedFilepath, replacerFilepath);

        load_system(replacedFilepath); % IMPORTANT: don't open_model directly
        set_param(bdName, 'location', loc);
        save_system(replacedFilepath);
        open_system(replacedFilepath);
        save_system(replacedFilepath); % required to restore original top-system name

    elseif isLoaded
        save_system(replacedFilepath);
        close_system(replacedFilepath);
        replaceContent(replacedFilepath, replacerFilepath);
        load_system(replacedFilepath);
        save_system(replacedFilepath); % required to restore original top-system name

    else
        replaceContent(replacedFilepath, replacerFilepath);
        % load, save, close (to ensure that top-system's name gets changed
        % to replaced's filename, otherwise top-system's name will be the
        % one copied from replacer's content. this won't be an issue, but
        % we still want to have consistency between model's name and top
        % system's name.)
        load_system(replacedFilepath);
        save_system(replacedFilepath);
        close_system(replacedFilepath);
    end
end


function replaceContent(replacedFilepath, replacerFilepath)
    if ispc
        cmd = getCygwinPath() + "/bin/cat " + replacerFilepath + " > " + replacedFilepath;
    else
        cmd = "cat " + replacerFilepath + " > " + replacedFilepath;
    end
    unix(cmd);
end

Listing A.88: replaceSubSystem.m function definition

function postAction = replaceSubSystem(replacedSubSystemPath, replacerSubSystemPath, MUDFilepath)
% Replace the first sub-system by second sub-system
%
% IMPORTANT:
% ------
% - the two sub-systems MUST be from different mdl files,
%   otherwise an error will be raised.
% - the name of the new sub-system will be the same as that of the old
%   (replaced subsystem)
% - after replacement is made, the 'view' of the replaced model will be at
%   the level of inside the replaced sub-system
%
% PARAMETERS:
% ------
% replacedSubSystemPath : string -- path of replaced sub system, eg: 'mysys/a/b'
% replacerSubSystemPath : string -- path of replacer sub system, eg: 'mysys1/a/x'
% MUDFilepath : string -- filepath of the Model Under Development (MUD),
%                         which contains the replaced sub system
%
% ASSUMPTIONS:
% ------
% - provided paths are valid
% - both mdl files are loaded
% - subsystems are COMPATIBLE for replacement i.e. number of inports and
%   outports match between the replaced subsystem and the replacer
%   subsystem
% - the replaced subsystem is in MUDFilepath

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    postAction = "";

    % verify that the 2 sub-systems are from different mdl files
    replacedSubSystemPath = string(replacedSubSystemPath);
    replacerSubSystemPath = string(replacerSubSystemPath);

    disp("replacedSubSystemPath :" + replacedSubSystemPath);
    disp("replacerSubSystemPath :" + replacerSubSystemPath);

    tokens1 = replacedSubSystemPath.split('/');
    tokens2 = replacerSubSystemPath.split('/');
    bdnameReplaced = tokens1(1); % index 1 gives bdname
    bdnameReplacer = tokens2(1);

    if bdnameReplaced == bdnameReplacer
        error("The two sub-systems MUST be from different mdl files");
    end

    % location of window in screen
    loc = get_param(tokens1(1), 'location');

    % APPROACH
    % 1. get the position of the replaced sub-system
    % 2. delete replaced sub-system
    % 3. copy replacer sub-system to replaced model
    % 4. restore the position

    % position of subsystem within the model
    pos = get_param(replacedSubSystemPath, 'position');
    delete_block(replacedSubSystemPath);
    add_block(replacerSubSystemPath, replacedSubSystemPath);

    [nInport, nOutport] = getNInportOutport(replacedSubSystemPath);

    set_param(replacedSubSystemPath, 'position', pos);

    % in case the user is currently "viewing" MUD at the level of SUD
    % (which is most probably the case), the MUD's window will be lost as
    % soon as delete_block(replacedSubSystemPath) command is executed.
    % So, to make sure the model is still visible to the user after
    % replacement is made, open_system() is executed

    open_system(MUDFilepath); % open model

    % restore window's position in screen
    dispImp("replacesSSPath : " + replacedSubSystemPath);
    set_param(bdnameReplaced, 'location', loc);
    save_system(MUDFilepath);

    % change view level to SUD's parent,
    % so that user can make adjustments to connections
    tmp = replacedSubSystemPath.replace('//', '___double_slash___');
    tokens = tmp.split('/');
    tokens(end) = []; % remove last token i.e. SUD
    parentSystemPath = tokens.join('/');
    parentSystemPath = parentSystemPath.replace('___double_slash___', '//');
    open_system(parentSystemPath);

    if ~(nInport == 1 && nOutport == 1)
        postAction = "ADJUST_CONN";
    end
end
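The parent-path computation at the end of replaceSubSystem has to cope with Simulink's convention that a "/" inside a block name is written as "//" in a block path. The sketch below reproduces that placeholder technique on a hypothetical path using plain string functions (no Simulink required); the path and the variable names are illustrative.

```matlab
% Hypothetical block path: the middle block is named "In/Out", which
% appears as "In//Out" in the path.
p = "mymodel/In//Out/child";

% 1. hide escaped slashes behind a placeholder so split() ignores them
tmp = replace(p, "//", "___double_slash___");

% 2. split on the real separators and drop the last component
tokens = split(tmp, "/");
tokens(end) = [];

% 3. re-join and restore the escaped slashes
parent = replace(join(tokens, "/"), "___double_slash___", "//");
disp(parent);
```

Splitting the original path directly on "/" would instead break "In//Out" into three tokens (including an empty one), and the reconstructed parent path would no longer identify the intended block.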

Listing A.89: replaceSubSystemBySubSystemCreatedFromTopSystem.m function definition

function replaceSubSystemBySubSystemCreatedFromTopSystem(subSystemPath, topSystemPath)
% Replace the sub-system (first arg) by another sub-system which is created
% from the top-system (second arg)
%
% IMPORTANT:
% ------
% - the replacement will be done by wrapping the CONTENTS of the top-system
%   into a sub-system.
%
% - the top-system and the sub-system MUST be from different mdl files,
%   otherwise an error will be raised.
%
% PARAMETERS:
% ------
% subSystemPath : string -- path of sub system, eg: 'mysys2/a/b/c'
% topSystemPath : string -- path of top system, eg: 'mysystem'
%
% ASSUMPTIONS:
% ------
% - provided paths are valid
% - both mdl files are loaded

    % verify that top-system and sub-system are from different mdl files
    topSystemPath = string(topSystemPath);
    subSystemPath = string(subSystemPath);

    tokensT = topSystemPath.split('/');
    tokensS = subSystemPath.split('/');

    if tokensT(1) == tokensS(1)
        error("top-system and sub-system MUST be from different mdl files");
    end

    % % APPROACH:
    % % 1. delete everything from topsystem
    % % 2. copy subsystem contents inside top system
    %
    % Simulink.BlockDiagram.deleteContents(topSystemPath);
    % Simulink.SubSystem.copyContentsToBlockDiagram(subSystemPath, topSystemPath);

    %%%%%%%%%%%% NOT IMPLEMENTED %%%%%%%%%%%%%
end

Listing A.90: replaceTopSystemBySubSystemContents.m function definition

function replaceTopSystemBySubSystemContents(topSystemPath, subSystemPath)
% Replace everything in the top-system by the contents of subsystem
%
% NOTE:
% -----
% This function is not used anymore because it has problems inserting
% certain types of subsystems' contents (see detailed explanation in
% updateMUDWithSuggestion's case (Top, Sub))
%
% IMPORTANT:
% ------
% - the replacement will be done by the CONTENTS of the sub-system,
%   NOT by the subsystem, i.e. this can be thought of as replacing
%   the top-system by the sub-system, followed by expanding the copied
%   subsystem inside the top-system.
%
% - the top-system and the sub-system MUST be from different mdl files,
%   otherwise an error will be raised.
%
% PARAMETERS:
% ------
% topSystemPath : string -- path of top system, eg: 'mysystem'
% subSystemPath : string -- path of sub system, eg: 'mysys2/a/b/c'
%
% ASSUMPTIONS:
% ------
% - provided paths are valid
% - both mdl files are loaded

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    % verify that top-system and sub-system are from different mdl files
    topSystemPath = string(topSystemPath);
    subSystemPath = string(subSystemPath);

    tokensT = topSystemPath.split('/');
    tokensS = subSystemPath.split('/');

    if tokensT(1) == tokensS(1)
        error("top-system and sub-system MUST be from different mdl files");
    end


    % APPROACH:
    % 1. delete everything from topsystem
    % 2. copy subsystem contents inside top system

    Simulink.BlockDiagram.deleteContents(topSystemPath);

    dispKeyVal('topSystemPath', topSystemPath);
    dispKeyVal('subsystemPath', subSystemPath);

    Simulink.SubSystem.copyContentsToBlockDiagram(subSystemPath, topSystemPath);
end
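Both replacement functions above reject a pair of paths that live in the same model file by comparing the first token of each path. The following is a minimal sketch of that guard on its own, assuming only base MATLAB string functions; the helper name `modelOf` is hypothetical and is not part of the tool.

```matlab
% Hypothetical helper: the model name is the text before the first "/".
% Appending "/" first lets the helper handle top-level paths that
% contain no separator at all.
modelOf = @(p) extractBefore(string(p) + "/", "/");

topSystemPath = "mysystem";       % example values from the header comments
subSystemPath = "mysys2/a/b/c";

sameModel = modelOf(topSystemPath) == modelOf(subSystemPath);
if sameModel
    error("top-system and sub-system MUST be from different mdl files");
end
```

Keeping the check in one helper would avoid repeating the split-and-compare lines in each function that needs it.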

Listing A.91: resetModels.m function definition

function resetModels()
% USE WITH CAUTION
%
% Resets the following shared vars:
% - simvma_armModelCst.mat
% - simvma_armModelDef_000000.mat
% ...
% - simvma_armModelDef_111111.mat
% - simvma_armModel.mat
%
% - simvma_freqModelCst.mat
% - simvma_freqModelDef_000000.mat
% ...
% - simvma_freqModelDef_111111.mat
% - simvma_freqModel.mat


    % reset 'final' (those used for prediction by simgestion) pred models
    setSharedVar('simvma_armModel', ArmModel());
    setSharedVar('simvma_freqModel', FreqModel());

    % reset prediction models trained on custom repos
    setSharedVar('simvma_armModelCst', ArmModel());
    setSharedVar('simvma_freqModelCst', FreqModel());

    % reset the pred models trained on all possible combinations of default
    % repos
    for a = 0 : 1
        for b = 0 : 1
            for c = 0 : 1
                for d = 0 : 1
                    for e = 0 : 1
                        for f = 0 : 1
                            setSharedVar("simvma_armModelDef_" + a + b + c + d + e + f, ArmModel());
                            setSharedVar("simvma_freqModelDef_" + a + b + c + d + e + f, FreqModel());
                        end
                    end
                end
            end
        end
    end

    state = getSharedVar('simvma_appState');
    state.defBlockPredModelsReset = true;
    setSharedVar('simvma_appState', state);
end
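The six nested loops in the listing above enumerate every 6-bit combination of the default repositories. As a minimal sketch, assuming only base MATLAB, the same 64 variable names can be produced in one step with `dec2bin`; the variable names `suffixes`, `armNames`, and `freqNames` are illustrative.

```matlab
% dec2bin(0:63, 6) returns a 64-by-6 char matrix, one zero-padded row per
% combination; string() turns each row into one string element.
suffixes  = string(dec2bin(0:63, 6));          % "000000" ... "111111"
armNames  = "simvma_armModelDef_"  + suffixes;
freqNames = "simvma_freqModelDef_" + suffixes;

disp(armNames([1 2 end]));
```

The order matches the nested loops, with `a` as the most significant bit, so a single loop over `armNames` and `freqNames` could call the tool's `setSharedVar` in place of the six loop levels.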

Listing A.92: resetModelsAndCache.m function definition

function resetModelsAndCache()
% Resets both caches and models i.e.
% Resets the following shared vars:
% - simvma_armModelCst.mat
% - simvma_armModelDef_000000.mat
% ...
% - simvma_armModelDef_111111.mat
% - simvma_armModel.mat
%
% - simvma_freqModelCst.mat
% - simvma_freqModelDef_000000.mat
% ...
% - simvma_freqModelDef_111111.mat
% - simvma_freqModel.mat
%
% - simvma_armCache.mat
% - simvma_freqCache.mat

    resetModels();

    % We have not defined a separate function 'resetCache', because
    % resetting the cache without resetting the models would cause problems
    % in future model-merging (as we won't have the cache necessary to merge
    % the models).

    setSharedVar('simvma_armCache', struct);
    setSharedVar('simvma_freqCache', struct);
end
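Both reset functions write through `setSharedVar` (Listing A.94), which stores each value in a `.mat` file under a variable whose name equals the shared-variable name. In base MATLAB, one way to save a value under a name that is only known at run time is a dynamic struct field combined with the `-struct` option of `save`. The following is a minimal sketch of that idiom, not the tool's implementation; the temporary file location and the name `wrapper` are assumptions made for illustration.

```matlab
name  = 'simvma_tempvar';                      % variable name chosen at run time
value = struct('count', 3);                    % any MATLAB value
filepath = fullfile(tempdir, [name '.mat']);   % illustrative location only

wrapper = struct();
wrapper.(name) = value;                        % dynamic field name
save(filepath, '-struct', 'wrapper');          % each field becomes a variable in the file

loaded = load(filepath);                       % loaded.simvma_tempvar holds the value
delete(filepath);                              % clean up the example file
```

With this idiom the file contents are the same as those produced by a function whose input argument carries the variable name, so existing readers of the `.mat` files would not need to change.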

Listing A.93: searchFilesRecursively.m function definition

function paths = searchFilesRecursively(folderPath, extensions)
% Return a list of absolute paths of all files matching the extension and
% contained within the directory given by folderPath. These files could be
% nested in subdirectories.
%
% If folderPath is empty (i.e. "") or non-existent, returns an empty list
%
% PARAMETERS:
% ------
% folderPath(str): absolute/relative path of the folder containing the files
%                  to be searched.
% extensions(str): extension of files to be searched without leading period
%                  This can be one of:
%                  - single string (eg: "mdl")
%                  - single char-array (eg: 'mdl')
%                  - list of strings (eg: ["mdl", "slx"])
%                  This CANNOT be a list of chars, i.e. ['mdl', 'slx']
%                  is INVALID.

    folderPath = string(folderPath);
    extensions = string(extensions);
    paths = string.empty;
    if folderPath == ""
        return;
    end

    for i = 1 : length(extensions)
        ext = extensions(i);
        structures = dir(folderPath + "/**/*." + ext);
        for j = 1 : length(structures)
            s = structures(j);
            path = fullfile(s.folder, s.name);
            paths = [paths path];
        end

    end
end
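The recursive search above relies on the `**` wildcard of `dir`, which descends into subfolders. The following is a minimal, self-contained sketch of the same lookup on a throwaway folder tree; it assumes a MATLAB release in which `dir` supports `**` (R2016b or later), and the folder layout and file names are invented for the example.

```matlab
% Build a temporary folder tree so the example does not touch real files.
root = tempname;
mkdir(fullfile(root, 'sub'));
fclose(fopen(fullfile(root, 'a.mdl'), 'w'));
fclose(fopen(fullfile(root, 'sub', 'b.slx'), 'w'));
fclose(fopen(fullfile(root, 'sub', 'c.txt'), 'w'));

extensions = ["mdl", "slx"];
paths = string.empty;
for ext = extensions
    % "**" matches the folder itself and all nested subfolders
    listing = dir(fullfile(root, '**', char("*." + ext)));
    for k = 1 : numel(listing)
        paths(end + 1) = string(fullfile(listing(k).folder, listing(k).name));
    end
end

rmdir(root, 's');   % remove the temporary tree
disp(paths);
```

Only the `.mdl` and `.slx` files are collected; `c.txt` is ignored because its extension is not in the list.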

Listing A.94: setSharedVar.m function definition

function value = setSharedVar(name, value)
% Set the value of the workspace data in corresponding file in
% shared-vars/*.mat.
% The .mat file will be created if it does not exist already.
% This will overwrite the corresponding .mat file in case it already exists.
%
% PARAMETERS:
% ------
% name(string) : name of the variable
% value(ANY): value to be set
%


    filepath = getSimvmaPath() + "/shared-vars/" + name + ".mat";

    % could not find a generic way to create a variable inside this
    % function with given name and value.
    % so, resorted to this rather ugly approach.

    if name == "simvma_appState"
        setAppState(filepath, value);

    elseif name == "simvma_blockInsertionState"
        setBlockInsertionState(filepath, value);

    elseif name == "simvma_armCache"
        setArmCache(filepath, value);
    elseif name == "simvma_freqCache"
        setFreqCache(filepath, value);

    elseif name == "simvma_tempvar" % for test purpose only
        setTempvar(filepath, value);

    elseif name == "simvma_armModelCst"
        setArmModelCst(filepath, value);

    elseif name == "simvma_armModel"
        setArmModel(filepath, value);
    elseif name == "simvma_freqModelCst"
        setFreqModelCst(filepath, value);
    elseif name == "simvma_freqModel"
        setFreqModel(filepath, value);

    % todo: delete following 6 else-ifs once the 6 default models are trained
    % delete begin
    elseif name == "simvma_freqModelStd01"
        setFreqModelStd01(filepath, value);
    elseif name == "simvma_freqModelStd10"
        setFreqModelStd10(filepath, value);
    elseif name == "simvma_freqModelStd11"
        setFreqModelStd11(filepath, value);
    elseif name == "simvma_armModelStd01"
        setArmModelStd01(filepath, value);
    elseif name == "simvma_armModelStd10"
        setArmModelStd10(filepath, value);
    elseif name == "simvma_armModelStd11"
        setArmModelStd11(filepath, value);
    elseif name == "simvma_symlinkMap"
        setSymlinkMap(filepath, value);
    % delete end



    % The following code (between %%% generated-start %%% and %%% generated-end %%%) was generated with this script

    % #!/usr/local/bin/python3
    %
    % with open('file.txt', 'w') as file:
    %     file.write("\n%%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-start %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%\n")
    %     for model in ['Arm', 'Freq']:
    %         model_lower = model.lower();
    %
    %         for a in range(0, 2):
    %             for b in range(0, 2):
    %                 for c in range(0, 2):
    %                     for d in range(0, 2):
    %                         for e in range(0, 2):
    %                             for f in range(0, 2):
    %                                 s = f"""
    % elseif name == "simvma_{model_lower}ModelDef_{a}{b}{c}{d}{e}{f}"
    %     set{model}ModelDef_{a}{b}{c}{d}{e}{f}(filepath, value);
    % """
    %                                 file.write(s)
    %     file.write("\n%%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%\n")



    %%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-start %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    elseif name == "simvma_armModelDef_000000"
        setArmModelDef_000000(filepath, value);

    elseif name == "simvma_armModelDef_000001"
        setArmModelDef_000001(filepath, value);

    elseif name == "simvma_armModelDef_000010"
        setArmModelDef_000010(filepath, value);

    elseif name == "simvma_armModelDef_000011"
        setArmModelDef_000011(filepath, value);

    elseif name == "simvma_armModelDef_000100"
        setArmModelDef_000100(filepath, value);

    elseif name == "simvma_armModelDef_000101"
        setArmModelDef_000101(filepath, value);

    elseif name == "simvma_armModelDef_000110"
        setArmModelDef_000110(filepath, value);

    elseif name == "simvma_armModelDef_000111"
        setArmModelDef_000111(filepath, value);

    elseif name == "simvma_armModelDef_001000"
        setArmModelDef_001000(filepath, value);

    elseif name == "simvma_armModelDef_001001"
        setArmModelDef_001001(filepath, value);

    elseif name == "simvma_armModelDef_001010"
        setArmModelDef_001010(filepath, value);

    elseif name == "simvma_armModelDef_001011"
        setArmModelDef_001011(filepath, value);

    elseif name == "simvma_armModelDef_001100"
        setArmModelDef_001100(filepath, value);

    elseif name == "simvma_armModelDef_001101"
        setArmModelDef_001101(filepath, value);

    elseif name == "simvma_armModelDef_001110"
        setArmModelDef_001110(filepath, value);

    elseif name == "simvma_armModelDef_001111"
        setArmModelDef_001111(filepath, value);

    elseif name == "simvma_armModelDef_010000"
        setArmModelDef_010000(filepath, value);

    elseif name == "simvma_armModelDef_010001"
        setArmModelDef_010001(filepath, value);

    elseif name == "simvma_armModelDef_010010"
        setArmModelDef_010010(filepath, value);

    elseif name == "simvma_armModelDef_010011"
        setArmModelDef_010011(filepath, value);

    elseif name == "simvma_armModelDef_010100"
        setArmModelDef_010100(filepath, value);

    elseif name == "simvma_armModelDef_010101"
        setArmModelDef_010101(filepath, value);

    elseif name == "simvma_armModelDef_010110"
        setArmModelDef_010110(filepath, value);

    elseif name == "simvma_armModelDef_010111"
        setArmModelDef_010111(filepath, value);

    elseif name == "simvma_armModelDef_011000"
        setArmModelDef_011000(filepath, value);

    elseif name == "simvma_armModelDef_011001"
        setArmModelDef_011001(filepath, value);

    elseif name == "simvma_armModelDef_011010"
        setArmModelDef_011010(filepath, value);

    elseif name == "simvma_armModelDef_011011"
        setArmModelDef_011011(filepath, value);

    elseif name == "simvma_armModelDef_011100"
        setArmModelDef_011100(filepath, value);

    elseif name == "simvma_armModelDef_011101"
        setArmModelDef_011101(filepath, value);

    elseif name == "simvma_armModelDef_011110"
        setArmModelDef_011110(filepath, value);

    elseif name == "simvma_armModelDef_011111"
        setArmModelDef_011111(filepath, value);

    elseif name == "simvma_armModelDef_100000"
        setArmModelDef_100000(filepath, value);

    elseif name == "simvma_armModelDef_100001"
        setArmModelDef_100001(filepath, value);

    elseif name == "simvma_armModelDef_100010"
        setArmModelDef_100010(filepath, value);

    elseif name == "simvma_armModelDef_100011"
        setArmModelDef_100011(filepath, value);

    elseif name == "simvma_armModelDef_100100"
        setArmModelDef_100100(filepath, value);

    elseif name == "simvma_armModelDef_100101"
        setArmModelDef_100101(filepath, value);

    elseif name == "simvma_armModelDef_100110"
        setArmModelDef_100110(filepath, value);

    elseif name == "simvma_armModelDef_100111"
        setArmModelDef_100111(filepath, value);

    elseif name == "simvma_armModelDef_101000"
        setArmModelDef_101000(filepath, value);

    elseif name == "simvma_armModelDef_101001"
        setArmModelDef_101001(filepath, value);

    elseif name == "simvma_armModelDef_101010"
        setArmModelDef_101010(filepath, value);

    elseif name == "simvma_armModelDef_101011"
        setArmModelDef_101011(filepath, value);

    elseif name == "simvma_armModelDef_101100"
        setArmModelDef_101100(filepath, value);

    elseif name == "simvma_armModelDef_101101"
        setArmModelDef_101101(filepath, value);

    elseif name == "simvma_armModelDef_101110"
        setArmModelDef_101110(filepath, value);

    elseif name == "simvma_armModelDef_101111"
        setArmModelDef_101111(filepath, value);

    elseif name == "simvma_armModelDef_110000"
        setArmModelDef_110000(filepath, value);

    elseif name == "simvma_armModelDef_110001"
        setArmModelDef_110001(filepath, value);

    elseif name == "simvma_armModelDef_110010"
        setArmModelDef_110010(filepath, value);

    elseif name == "simvma_armModelDef_110011"
        setArmModelDef_110011(filepath, value);

    elseif name == "simvma_armModelDef_110100"
        setArmModelDef_110100(filepath, value);

    elseif name == "simvma_armModelDef_110101"
        setArmModelDef_110101(filepath, value);

    elseif name == "simvma_armModelDef_110110"
        setArmModelDef_110110(filepath, value);

    elseif name == "simvma_armModelDef_110111"
        setArmModelDef_110111(filepath, value);

    elseif name == "simvma_armModelDef_111000"
        setArmModelDef_111000(filepath, value);

    elseif name == "simvma_armModelDef_111001"
        setArmModelDef_111001(filepath, value);

    elseif name == "simvma_armModelDef_111010"
        setArmModelDef_111010(filepath, value);

    elseif name == "simvma_armModelDef_111011"
        setArmModelDef_111011(filepath, value);

    elseif name == "simvma_armModelDef_111100"
        setArmModelDef_111100(filepath, value);

    elseif name == "simvma_armModelDef_111101"
        setArmModelDef_111101(filepath, value);

    elseif name == "simvma_armModelDef_111110"
        setArmModelDef_111110(filepath, value);

    elseif name == "simvma_armModelDef_111111"
        setArmModelDef_111111(filepath, value);

    elseif name == "simvma_freqModelDef_000000"
        setFreqModelDef_000000(filepath, value);

    elseif name == "simvma_freqModelDef_000001"
        setFreqModelDef_000001(filepath, value);

    elseif name == "simvma_freqModelDef_000010"
        setFreqModelDef_000010(filepath, value);

    elseif name == "simvma_freqModelDef_000011"
        setFreqModelDef_000011(filepath, value);

    elseif name == "simvma_freqModelDef_000100"
        setFreqModelDef_000100(filepath, value);

    elseif name == "simvma_freqModelDef_000101"
        setFreqModelDef_000101(filepath, value);

    elseif name == "simvma_freqModelDef_000110"
        setFreqModelDef_000110(filepath, value);

    elseif name == "simvma_freqModelDef_000111"
        setFreqModelDef_000111(filepath, value);

    elseif name == "simvma_freqModelDef_001000"
        setFreqModelDef_001000(filepath, value);

    elseif name == "simvma_freqModelDef_001001"
        setFreqModelDef_001001(filepath, value);

    elseif name == "simvma_freqModelDef_001010"
        setFreqModelDef_001010(filepath, value);

    elseif name == "simvma_freqModelDef_001011"
        setFreqModelDef_001011(filepath, value);

    elseif name == "simvma_freqModelDef_001100"
        setFreqModelDef_001100(filepath, value);

    elseif name == "simvma_freqModelDef_001101"
        setFreqModelDef_001101(filepath, value);

    elseif name == "simvma_freqModelDef_001110"
        setFreqModelDef_001110(filepath, value);

    elseif name == "simvma_freqModelDef_001111"
        setFreqModelDef_001111(filepath, value);

    elseif name == "simvma_freqModelDef_010000"
        setFreqModelDef_010000(filepath, value);

    elseif name == "simvma_freqModelDef_010001"
        setFreqModelDef_010001(filepath, value);

    elseif name == "simvma_freqModelDef_010010"
        setFreqModelDef_010010(filepath, value);

    elseif name == "simvma_freqModelDef_010011"
        setFreqModelDef_010011(filepath, value);

    elseif name == "simvma_freqModelDef_010100"
        setFreqModelDef_010100(filepath, value);

    elseif name == "simvma_freqModelDef_010101"
        setFreqModelDef_010101(filepath, value);

    elseif name == "simvma_freqModelDef_010110"
        setFreqModelDef_010110(filepath, value);

    elseif name == "simvma_freqModelDef_010111"
        setFreqModelDef_010111(filepath, value);

    elseif name == "simvma_freqModelDef_011000"
        setFreqModelDef_011000(filepath, value);

    elseif name == "simvma_freqModelDef_011001"
        setFreqModelDef_011001(filepath, value);

    elseif name == "simvma_freqModelDef_011010"
        setFreqModelDef_011010(filepath, value);

    elseif name == "simvma_freqModelDef_011011"
        setFreqModelDef_011011(filepath, value);

    elseif name == "simvma_freqModelDef_011100"
        setFreqModelDef_011100(filepath, value);

    elseif name == "simvma_freqModelDef_011101"
        setFreqModelDef_011101(filepath, value);

    elseif name == "simvma_freqModelDef_011110"
        setFreqModelDef_011110(filepath, value);

    elseif name == "simvma_freqModelDef_011111"
        setFreqModelDef_011111(filepath, value);

    elseif name == "simvma_freqModelDef_100000"
        setFreqModelDef_100000(filepath, value);

    elseif name == "simvma_freqModelDef_100001"
        setFreqModelDef_100001(filepath, value);

    elseif name == "simvma_freqModelDef_100010"
        setFreqModelDef_100010(filepath, value);

    elseif name == "simvma_freqModelDef_100011"
        setFreqModelDef_100011(filepath, value);

    elseif name == "simvma_freqModelDef_100100"
        setFreqModelDef_100100(filepath, value);

    elseif name == "simvma_freqModelDef_100101"
        setFreqModelDef_100101(filepath, value);

    elseif name == "simvma_freqModelDef_100110"
        setFreqModelDef_100110(filepath, value);

    elseif name == "simvma_freqModelDef_100111"
        setFreqModelDef_100111(filepath, value);

    elseif name == "simvma_freqModelDef_101000"
        setFreqModelDef_101000(filepath, value);

    elseif name == "simvma_freqModelDef_101001"
        setFreqModelDef_101001(filepath, value);

    elseif name == "simvma_freqModelDef_101010"
        setFreqModelDef_101010(filepath, value);

    elseif name == "simvma_freqModelDef_101011"
        setFreqModelDef_101011(filepath, value);

    elseif name == "simvma_freqModelDef_101100"
        setFreqModelDef_101100(filepath, value);

    elseif name == "simvma_freqModelDef_101101"
        setFreqModelDef_101101(filepath, value);

    elseif name == "simvma_freqModelDef_101110"
        setFreqModelDef_101110(filepath, value);

    elseif name == "simvma_freqModelDef_101111"
        setFreqModelDef_101111(filepath, value);

    elseif name == "simvma_freqModelDef_110000"
        setFreqModelDef_110000(filepath, value);

    elseif name == "simvma_freqModelDef_110001"
        setFreqModelDef_110001(filepath, value);

    elseif name == "simvma_freqModelDef_110010"
        setFreqModelDef_110010(filepath, value);

    elseif name == "simvma_freqModelDef_110011"
        setFreqModelDef_110011(filepath, value);

    elseif name == "simvma_freqModelDef_110100"
        setFreqModelDef_110100(filepath, value);

    elseif name == "simvma_freqModelDef_110101"
        setFreqModelDef_110101(filepath, value);

    elseif name == "simvma_freqModelDef_110110"
        setFreqModelDef_110110(filepath, value);

    elseif name == "simvma_freqModelDef_110111"
        setFreqModelDef_110111(filepath, value);

    elseif name == "simvma_freqModelDef_111000"
        setFreqModelDef_111000(filepath, value);

    elseif name == "simvma_freqModelDef_111001"
        setFreqModelDef_111001(filepath, value);

    elseif name == "simvma_freqModelDef_111010"
        setFreqModelDef_111010(filepath, value);

    elseif name == "simvma_freqModelDef_111011"
        setFreqModelDef_111011(filepath, value);

    elseif name == "simvma_freqModelDef_111100"
        setFreqModelDef_111100(filepath, value);

    elseif name == "simvma_freqModelDef_111101"
        setFreqModelDef_111101(filepath, value);

    elseif name == "simvma_freqModelDef_111110"
        setFreqModelDef_111110(filepath, value);

    elseif name == "simvma_freqModelDef_111111"
        setFreqModelDef_111111(filepath, value);

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    else
        error("Unexpected var : " + name);
    end
end



function setAppState(filepath, simvma_appState)
    save(filepath, 'simvma_appState');
end


function setFreqModelStd01(filepath, simvma_freqModelStd01)
    save(filepath, 'simvma_freqModelStd01');

end

function setFreqModelStd10(filepath, simvma_freqModelStd10)
    save(filepath, 'simvma_freqModelStd10');
end

function setFreqModelStd11(filepath, simvma_freqModelStd11)
    save(filepath, 'simvma_freqModelStd11');
end

function setFreqModelCst(filepath, simvma_freqModelCst)
    save(filepath, 'simvma_freqModelCst');
end

function setFreqModel(filepath, simvma_freqModel)
    save(filepath, 'simvma_freqModel');
end


function setArmModelStd01(filepath, simvma_armModelStd01)
    save(filepath, 'simvma_armModelStd01');
end

function setArmModelStd10(filepath, simvma_armModelStd10)
    save(filepath, 'simvma_armModelStd10');
end

function setArmModelStd11(filepath, simvma_armModelStd11)
    save(filepath, 'simvma_armModelStd11');
end

function setArmModelCst(filepath, simvma_armModelCst)
    save(filepath, 'simvma_armModelCst');
end

function setArmModel(filepath, simvma_armModel)
    save(filepath, 'simvma_armModel');
end


function setBlockInsertionState(filepath, simvma_blockInsertionState)
    save(filepath, 'simvma_blockInsertionState');
end

function setArmCache(filepath, simvma_armCache)
    save(filepath, 'simvma_armCache');
end
function setFreqCache(filepath, simvma_freqCache)
    save(filepath, 'simvma_freqCache');
end

function setTempvar(filepath, simvma_tempvar)
    save(filepath, 'simvma_tempvar');
end

function setBlockTypes(filepath, simvma_blockTypes)
    save(filepath, 'simvma_blockTypes');
end

function setSymlinkMap(filepath, simvma_symlinkMap)
    save(filepath, 'simvma_symlinkMap');
end



% The following code (between %%% generated-start %%% and %%% generated-end %%%)
% was generated using this script
%
% #!/usr/local/bin/python3
%
% with open('file.txt', 'w') as file:
%     file.write("\n%%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-start %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%\n")
%     for model in ['Arm', 'Freq']:
%         model_lower = model.lower();
%
%         for a in range(0, 2):
%             for b in range(0, 2):
%                 for c in range(0, 2):
%                     for d in range(0, 2):
%                         for e in range(0, 2):
%                             for f in range(0, 2):
%                                 s = f"""
% function set{model}ModelDef_{a}{b}{c}{d}{e}{f}(filepath, simvma_{model_lower}ModelDef_{a}{b}{c}{d}{e}{f})
%     save(filepath, 'simvma_{model_lower}ModelDef_{a}{b}{c}{d}{e}{f}');
% end
% """
%                                 file.write(s)
%     file.write("\n%%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%\n")



%%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-start %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function setArmModelDef_000000(filepath, simvma_armModelDef_000000)
    save(filepath, 'simvma_armModelDef_000000');
end

function setArmModelDef_000001(filepath, simvma_armModelDef_000001)
    save(filepath, 'simvma_armModelDef_000001');
end

function setArmModelDef_000010(filepath, simvma_armModelDef_000010)
    save(filepath, 'simvma_armModelDef_000010');
end

function setArmModelDef_000011(filepath, simvma_armModelDef_000011)
    save(filepath, 'simvma_armModelDef_000011');
end

function setArmModelDef_000100(filepath, simvma_armModelDef_000100)
    save(filepath, 'simvma_armModelDef_000100');
end

function setArmModelDef_000101(filepath, simvma_armModelDef_000101)
    save(filepath, 'simvma_armModelDef_000101');
end

function setArmModelDef_000110(filepath, simvma_armModelDef_000110)
    save(filepath, 'simvma_armModelDef_000110');
end

function setArmModelDef_000111(filepath, simvma_armModelDef_000111)
    save(filepath, 'simvma_armModelDef_000111');
end

function setArmModelDef_001000(filepath, simvma_armModelDef_001000)
    save(filepath, 'simvma_armModelDef_001000');
end

function setArmModelDef_001001(filepath, simvma_armModelDef_001001)
    save(filepath, 'simvma_armModelDef_001001');
end

function setArmModelDef_001010(filepath, simvma_armModelDef_001010)
    save(filepath, 'simvma_armModelDef_001010');
end

function setArmModelDef_001011(filepath, simvma_armModelDef_001011)
    save(filepath, 'simvma_armModelDef_001011');
end

function setArmModelDef_001100(filepath, simvma_armModelDef_001100)
    save(filepath, 'simvma_armModelDef_001100');
end

function setArmModelDef_001101(filepath, simvma_armModelDef_001101)
    save(filepath, 'simvma_armModelDef_001101');
end

function setArmModelDef_001110(filepath, simvma_armModelDef_001110)
    save(filepath, 'simvma_armModelDef_001110');
end

function setArmModelDef_001111(filepath, simvma_armModelDef_001111)
    save(filepath, 'simvma_armModelDef_001111');
end

function setArmModelDef_010000(filepath, simvma_armModelDef_010000)
    save(filepath, 'simvma_armModelDef_010000');
end

function setArmModelDef_010001(filepath, simvma_armModelDef_010001)
    save(filepath, 'simvma_armModelDef_010001');
end

function setArmModelDef_010010(filepath, simvma_armModelDef_010010)
    save(filepath, 'simvma_armModelDef_010010');
end

function setArmModelDef_010011(filepath, simvma_armModelDef_010011)
    save(filepath, 'simvma_armModelDef_010011');
end

function setArmModelDef_010100(filepath, simvma_armModelDef_010100)
    save(filepath, 'simvma_armModelDef_010100');
end

function setArmModelDef_010101(filepath, simvma_armModelDef_010101)
    save(filepath, 'simvma_armModelDef_010101');
end

function setArmModelDef_010110(filepath, simvma_armModelDef_010110)
    save(filepath, 'simvma_armModelDef_010110');
end

function setArmModelDef_010111(filepath, simvma_armModelDef_010111)
    save(filepath, 'simvma_armModelDef_010111');
end

function setArmModelDef_011000(filepath, simvma_armModelDef_011000)
    save(filepath, 'simvma_armModelDef_011000');
end

function setArmModelDef_011001(filepath, simvma_armModelDef_011001)
    save(filepath, 'simvma_armModelDef_011001');
end

function setArmModelDef_011010(filepath, simvma_armModelDef_011010)
    save(filepath, 'simvma_armModelDef_011010');

end

function setArmModelDef_011011(filepath, simvma_armModelDef_011011)
    save(filepath, 'simvma_armModelDef_011011');
end

function setArmModelDef_011100(filepath, simvma_armModelDef_011100)
    save(filepath, 'simvma_armModelDef_011100');
end

function setArmModelDef_011101(filepath, simvma_armModelDef_011101)
    save(filepath, 'simvma_armModelDef_011101');
end

function setArmModelDef_011110(filepath, simvma_armModelDef_011110)
    save(filepath, 'simvma_armModelDef_011110');
end

function setArmModelDef_011111(filepath, simvma_armModelDef_011111)
    save(filepath, 'simvma_armModelDef_011111');
end

function setArmModelDef_100000(filepath, simvma_armModelDef_100000)
    save(filepath, 'simvma_armModelDef_100000');
end

function setArmModelDef_100001(filepath, simvma_armModelDef_100001)
    save(filepath, 'simvma_armModelDef_100001');
end

function setArmModelDef_100010(filepath, simvma_armModelDef_100010)
    save(filepath, 'simvma_armModelDef_100010');
end

function setArmModelDef_100011(filepath, simvma_armModelDef_100011)
    save(filepath, 'simvma_armModelDef_100011');
end

function setArmModelDef_100100(filepath, simvma_armModelDef_100100)
    save(filepath, 'simvma_armModelDef_100100');
end

function setArmModelDef_100101(filepath, simvma_armModelDef_100101)
    save(filepath, 'simvma_armModelDef_100101');
end

function setArmModelDef_100110(filepath, simvma_armModelDef_100110)
    save(filepath, 'simvma_armModelDef_100110');
end

function setArmModelDef_100111(filepath, simvma_armModelDef_100111)

    save(filepath, 'simvma_armModelDef_100111');
end

function setArmModelDef_101000(filepath, simvma_armModelDef_101000)
    save(filepath, 'simvma_armModelDef_101000');
end

function setArmModelDef_101001(filepath, simvma_armModelDef_101001)
    save(filepath, 'simvma_armModelDef_101001');
end

function setArmModelDef_101010(filepath, simvma_armModelDef_101010)
    save(filepath, 'simvma_armModelDef_101010');
end

function setArmModelDef_101011(filepath, simvma_armModelDef_101011)
    save(filepath, 'simvma_armModelDef_101011');
end

function setArmModelDef_101100(filepath, simvma_armModelDef_101100)
    save(filepath, 'simvma_armModelDef_101100');
end

function setArmModelDef_101101(filepath, simvma_armModelDef_101101)
    save(filepath, 'simvma_armModelDef_101101');
end

function setArmModelDef_101110(filepath, simvma_armModelDef_101110)
    save(filepath, 'simvma_armModelDef_101110');
end

function setArmModelDef_101111(filepath, simvma_armModelDef_101111)
    save(filepath, 'simvma_armModelDef_101111');
end

function setArmModelDef_110000(filepath, simvma_armModelDef_110000)
    save(filepath, 'simvma_armModelDef_110000');
end

function setArmModelDef_110001(filepath, simvma_armModelDef_110001)
    save(filepath, 'simvma_armModelDef_110001');
end

function setArmModelDef_110010(filepath, simvma_armModelDef_110010)
    save(filepath, 'simvma_armModelDef_110010');
end

function setArmModelDef_110011(filepath, simvma_armModelDef_110011)
    save(filepath, 'simvma_armModelDef_110011');
end

function setArmModelDef_110100(filepath, simvma_armModelDef_110100)
    save(filepath, 'simvma_armModelDef_110100');
end

function setArmModelDef_110101(filepath, simvma_armModelDef_110101)
    save(filepath, 'simvma_armModelDef_110101');
end

function setArmModelDef_110110(filepath, simvma_armModelDef_110110)
    save(filepath, 'simvma_armModelDef_110110');
end

function setArmModelDef_110111(filepath, simvma_armModelDef_110111)
    save(filepath, 'simvma_armModelDef_110111');
end

function setArmModelDef_111000(filepath, simvma_armModelDef_111000)
    save(filepath, 'simvma_armModelDef_111000');
end

function setArmModelDef_111001(filepath, simvma_armModelDef_111001)
    save(filepath, 'simvma_armModelDef_111001');
end

function setArmModelDef_111010(filepath, simvma_armModelDef_111010)
    save(filepath, 'simvma_armModelDef_111010');
end

function setArmModelDef_111011(filepath, simvma_armModelDef_111011)
    save(filepath, 'simvma_armModelDef_111011');
end

function setArmModelDef_111100(filepath, simvma_armModelDef_111100)
    save(filepath, 'simvma_armModelDef_111100');
end

function setArmModelDef_111101(filepath, simvma_armModelDef_111101)
    save(filepath, 'simvma_armModelDef_111101');
end

function setArmModelDef_111110(filepath, simvma_armModelDef_111110)
    save(filepath, 'simvma_armModelDef_111110');
end

function setArmModelDef_111111(filepath, simvma_armModelDef_111111)
    save(filepath, 'simvma_armModelDef_111111');
end

function setFreqModelDef_000000(filepath, simvma_freqModelDef_000000)
    save(filepath, 'simvma_freqModelDef_000000');
end

function setFreqModelDef_000001(filepath, simvma_freqModelDef_000001)
    save(filepath, 'simvma_freqModelDef_000001');
end

function setFreqModelDef_000010(filepath, simvma_freqModelDef_000010)
    save(filepath, 'simvma_freqModelDef_000010');
end

function setFreqModelDef_000011(filepath, simvma_freqModelDef_000011)
    save(filepath, 'simvma_freqModelDef_000011');
end

function setFreqModelDef_000100(filepath, simvma_freqModelDef_000100)
    save(filepath, 'simvma_freqModelDef_000100');
end

function setFreqModelDef_000101(filepath, simvma_freqModelDef_000101)
    save(filepath, 'simvma_freqModelDef_000101');
end

function setFreqModelDef_000110(filepath, simvma_freqModelDef_000110)
    save(filepath, 'simvma_freqModelDef_000110');
end

function setFreqModelDef_000111(filepath, simvma_freqModelDef_000111)
    save(filepath, 'simvma_freqModelDef_000111');
end

function setFreqModelDef_001000(filepath, simvma_freqModelDef_001000)
    save(filepath, 'simvma_freqModelDef_001000');
end

function setFreqModelDef_001001(filepath, simvma_freqModelDef_001001)
    save(filepath, 'simvma_freqModelDef_001001');
end

function setFreqModelDef_001010(filepath, simvma_freqModelDef_001010)
    save(filepath, 'simvma_freqModelDef_001010');
end

function setFreqModelDef_001011(filepath, simvma_freqModelDef_001011)
    save(filepath, 'simvma_freqModelDef_001011');
end

function setFreqModelDef_001100(filepath, simvma_freqModelDef_001100)
    save(filepath, 'simvma_freqModelDef_001100');
end

function setFreqModelDef_001101(filepath, simvma_freqModelDef_001101)
    save(filepath, 'simvma_freqModelDef_001101');

188 893 end 894 895 function setFreqModelDef_001110(filepath, simvma_freqModelDef_001110) 896 save(filepath, 'simvma_freqModelDef_001110'); 897 end 898 899 function setFreqModelDef_001111(filepath, simvma_freqModelDef_001111) 900 save(filepath, 'simvma_freqModelDef_001111'); 901 end 902 903 function setFreqModelDef_010000(filepath, simvma_freqModelDef_010000) 904 save(filepath, 'simvma_freqModelDef_010000'); 905 end 906 907 function setFreqModelDef_010001(filepath, simvma_freqModelDef_010001) 908 save(filepath, 'simvma_freqModelDef_010001'); 909 end 910 911 function setFreqModelDef_010010(filepath, simvma_freqModelDef_010010) 912 save(filepath, 'simvma_freqModelDef_010010'); 913 end 914 915 function setFreqModelDef_010011(filepath, simvma_freqModelDef_010011) 916 save(filepath, 'simvma_freqModelDef_010011'); 917 end 918 919 function setFreqModelDef_010100(filepath, simvma_freqModelDef_010100) 920 save(filepath, 'simvma_freqModelDef_010100'); 921 end 922 923 function setFreqModelDef_010101(filepath, simvma_freqModelDef_010101) 924 save(filepath, 'simvma_freqModelDef_010101'); 925 end 926 927 function setFreqModelDef_010110(filepath, simvma_freqModelDef_010110) 928 save(filepath, 'simvma_freqModelDef_010110'); 929 end 930 931 function setFreqModelDef_010111(filepath, simvma_freqModelDef_010111) 932 save(filepath, 'simvma_freqModelDef_010111'); 933 end 934 935 function setFreqModelDef_011000(filepath, simvma_freqModelDef_011000) 936 save(filepath, 'simvma_freqModelDef_011000'); 937 end 938 939 function setFreqModelDef_011001(filepath, simvma_freqModelDef_011001) 940 save(filepath, 'simvma_freqModelDef_011001'); 941 end 942 943 function setFreqModelDef_011010(filepath, simvma_freqModelDef_011010)

189 944 save(filepath, 'simvma_freqModelDef_011010'); 945 end 946 947 function setFreqModelDef_011011(filepath, simvma_freqModelDef_011011) 948 save(filepath, 'simvma_freqModelDef_011011'); 949 end 950 951 function setFreqModelDef_011100(filepath, simvma_freqModelDef_011100) 952 save(filepath, 'simvma_freqModelDef_011100'); 953 end 954 955 function setFreqModelDef_011101(filepath, simvma_freqModelDef_011101) 956 save(filepath, 'simvma_freqModelDef_011101'); 957 end 958 959 function setFreqModelDef_011110(filepath, simvma_freqModelDef_011110) 960 save(filepath, 'simvma_freqModelDef_011110'); 961 end 962 963 function setFreqModelDef_011111(filepath, simvma_freqModelDef_011111) 964 save(filepath, 'simvma_freqModelDef_011111'); 965 end 966 967 function setFreqModelDef_100000(filepath, simvma_freqModelDef_100000) 968 save(filepath, 'simvma_freqModelDef_100000'); 969 end 970 971 function setFreqModelDef_100001(filepath, simvma_freqModelDef_100001) 972 save(filepath, 'simvma_freqModelDef_100001'); 973 end 974 975 function setFreqModelDef_100010(filepath, simvma_freqModelDef_100010) 976 save(filepath, 'simvma_freqModelDef_100010'); 977 end 978 979 function setFreqModelDef_100011(filepath, simvma_freqModelDef_100011) 980 save(filepath, 'simvma_freqModelDef_100011'); 981 end 982 983 function setFreqModelDef_100100(filepath, simvma_freqModelDef_100100) 984 save(filepath, 'simvma_freqModelDef_100100'); 985 end 986 987 function setFreqModelDef_100101(filepath, simvma_freqModelDef_100101) 988 save(filepath, 'simvma_freqModelDef_100101'); 989 end 990 991 function setFreqModelDef_100110(filepath, simvma_freqModelDef_100110) 992 save(filepath, 'simvma_freqModelDef_100110'); 993 end 994

190 995 function setFreqModelDef_100111(filepath, simvma_freqModelDef_100111) 996 save(filepath, 'simvma_freqModelDef_100111'); 997 end 998 999 function setFreqModelDef_101000(filepath, simvma_freqModelDef_101000) 1000 save(filepath, 'simvma_freqModelDef_101000'); 1001 end 1002 1003 function setFreqModelDef_101001(filepath, simvma_freqModelDef_101001) 1004 save(filepath, 'simvma_freqModelDef_101001'); 1005 end 1006 1007 function setFreqModelDef_101010(filepath, simvma_freqModelDef_101010) 1008 save(filepath, 'simvma_freqModelDef_101010'); 1009 end 1010 1011 function setFreqModelDef_101011(filepath, simvma_freqModelDef_101011) 1012 save(filepath, 'simvma_freqModelDef_101011'); 1013 end 1014 1015 function setFreqModelDef_101100(filepath, simvma_freqModelDef_101100) 1016 save(filepath, 'simvma_freqModelDef_101100'); 1017 end 1018 1019 function setFreqModelDef_101101(filepath, simvma_freqModelDef_101101) 1020 save(filepath, 'simvma_freqModelDef_101101'); 1021 end 1022 1023 function setFreqModelDef_101110(filepath, simvma_freqModelDef_101110) 1024 save(filepath, 'simvma_freqModelDef_101110'); 1025 end 1026 1027 function setFreqModelDef_101111(filepath, simvma_freqModelDef_101111) 1028 save(filepath, 'simvma_freqModelDef_101111'); 1029 end 1030 1031 function setFreqModelDef_110000(filepath, simvma_freqModelDef_110000) 1032 save(filepath, 'simvma_freqModelDef_110000'); 1033 end 1034 1035 function setFreqModelDef_110001(filepath, simvma_freqModelDef_110001) 1036 save(filepath, 'simvma_freqModelDef_110001'); 1037 end 1038 1039 function setFreqModelDef_110010(filepath, simvma_freqModelDef_110010) 1040 save(filepath, 'simvma_freqModelDef_110010'); 1041 end 1042 1043 function setFreqModelDef_110011(filepath, simvma_freqModelDef_110011) 1044 save(filepath, 'simvma_freqModelDef_110011'); 1045 end

191 1046 1047 function setFreqModelDef_110100(filepath, simvma_freqModelDef_110100) 1048 save(filepath, 'simvma_freqModelDef_110100'); 1049 end 1050 1051 function setFreqModelDef_110101(filepath, simvma_freqModelDef_110101) 1052 save(filepath, 'simvma_freqModelDef_110101'); 1053 end 1054 1055 function setFreqModelDef_110110(filepath, simvma_freqModelDef_110110) 1056 save(filepath, 'simvma_freqModelDef_110110'); 1057 end 1058 1059 function setFreqModelDef_110111(filepath, simvma_freqModelDef_110111) 1060 save(filepath, 'simvma_freqModelDef_110111'); 1061 end 1062 1063 function setFreqModelDef_111000(filepath, simvma_freqModelDef_111000) 1064 save(filepath, 'simvma_freqModelDef_111000'); 1065 end 1066 1067 function setFreqModelDef_111001(filepath, simvma_freqModelDef_111001) 1068 save(filepath, 'simvma_freqModelDef_111001'); 1069 end 1070 1071 function setFreqModelDef_111010(filepath, simvma_freqModelDef_111010) 1072 save(filepath, 'simvma_freqModelDef_111010'); 1073 end 1074 1075 function setFreqModelDef_111011(filepath, simvma_freqModelDef_111011) 1076 save(filepath, 'simvma_freqModelDef_111011'); 1077 end 1078 1079 function setFreqModelDef_111100(filepath, simvma_freqModelDef_111100) 1080 save(filepath, 'simvma_freqModelDef_111100'); 1081 end 1082 1083 function setFreqModelDef_111101(filepath, simvma_freqModelDef_111101) 1084 save(filepath, 'simvma_freqModelDef_111101'); 1085 end 1086 1087 function setFreqModelDef_111110(filepath, simvma_freqModelDef_111110) 1088 save(filepath, 'simvma_freqModelDef_111110'); 1089 end 1090 1091 function setFreqModelDef_111111(filepath, simvma_freqModelDef_111111) 1092 save(filepath, 'simvma_freqModelDef_111111'); 1093 end 1094 1095 %%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Listing A.95: setSharedVarsForPredModelsTrainedOnDefRepos.m function definition
function setSharedVarsForPredModelsTrainedOnDefRepos(am1, am2, am3, am4, am5, am6, fm1, fm2, fm3, fm4, fm5, fm6)
    % Sets 64 + 64 = 128 shared vars for all possible combinations of ArmModels
    % and all possible combinations of FreqModels trained on 6 default repos
    %
    % PARAMETERS:
    % am1 : ArmModel trained on 1st default repo only (automotive)
    % am2 : ArmModel trained on 2nd default repo only (avionics)
    % am3 : ArmModel trained on 3rd default repo only (electronics)
    % am4 : ArmModel trained on 4th default repo only (energy)
    % am5 : ArmModel trained on 5th default repo only (robotics)
    % am6 : ArmModel trained on 6th default repo only (miscellaneous)
    % fm1 : FreqModel trained on 1st default repo only (automotive)
    % fm2 : FreqModel trained on 2nd default repo only (avionics)
    % fm3 : FreqModel trained on 3rd default repo only (electronics)
    % fm4 : FreqModel trained on 4th default repo only (energy)
    % fm5 : FreqModel trained on 5th default repo only (robotics)
    % fm6 : FreqModel trained on 6th default repo only (miscellaneous)

    am_000000 = ArmModel(); % empty i.e. not trained at all
    am_000001 = am6;
    am_000010 = am5;
    am_000011 = mergeArmModels(true, [am5, am6]);
    am_000100 = am4;
    am_000101 = mergeArmModels(true, [am4, am6]);
    am_000110 = mergeArmModels(true, [am4, am5]);
    am_000111 = mergeArmModels(true, [am4, am5, am6]);
    am_001000 = am3;
    am_001001 = mergeArmModels(true, [am3, am6]);
    am_001010 = mergeArmModels(true, [am3, am5]);
    am_001011 = mergeArmModels(true, [am3, am5, am6]);
    am_001100 = mergeArmModels(true, [am3, am4]);
    am_001101 = mergeArmModels(true, [am3, am4, am6]);
    am_001110 = mergeArmModels(true, [am3, am4, am5]);
    am_001111 = mergeArmModels(true, [am3, am4, am5, am6]);
    am_010000 = am2;
    am_010001 = mergeArmModels(true, [am2, am6]);
    am_010010 = mergeArmModels(true, [am2, am5]);
    am_010011 = mergeArmModels(true, [am2, am5, am6]);
    am_010100 = mergeArmModels(true, [am2, am4]);
    am_010101 = mergeArmModels(true, [am2, am4, am6]);
    am_010110 = mergeArmModels(true, [am2, am4, am5]);
    am_010111 = mergeArmModels(true, [am2, am4, am5, am6]);
    am_011000 = mergeArmModels(true, [am2, am3]);
    am_011001 = mergeArmModels(true, [am2, am3, am6]);
    am_011010 = mergeArmModels(true, [am2, am3, am5]);
    am_011011 = mergeArmModels(true, [am2, am3, am5, am6]);
    am_011100 = mergeArmModels(true, [am2, am3, am4]);
    am_011101 = mergeArmModels(true, [am2, am3, am4, am6]);
    am_011110 = mergeArmModels(true, [am2, am3, am4, am5]);
    am_011111 = mergeArmModels(true, [am2, am3, am4, am5, am6]);
    am_100000 = am1;
    am_100001 = mergeArmModels(true, [am1, am6]);
    am_100010 = mergeArmModels(true, [am1, am5]);
    am_100011 = mergeArmModels(true, [am1, am5, am6]);
    am_100100 = mergeArmModels(true, [am1, am4]);
    am_100101 = mergeArmModels(true, [am1, am4, am6]);
    am_100110 = mergeArmModels(true, [am1, am4, am5]);
    am_100111 = mergeArmModels(true, [am1, am4, am5, am6]);
    am_101000 = mergeArmModels(true, [am1, am3]);
    am_101001 = mergeArmModels(true, [am1, am3, am6]);
    am_101010 = mergeArmModels(true, [am1, am3, am5]);
    am_101011 = mergeArmModels(true, [am1, am3, am5, am6]);
    am_101100 = mergeArmModels(true, [am1, am3, am4]);
    am_101101 = mergeArmModels(true, [am1, am3, am4, am6]);
    am_101110 = mergeArmModels(true, [am1, am3, am4, am5]);
    am_101111 = mergeArmModels(true, [am1, am3, am4, am5, am6]);
    am_110000 = mergeArmModels(true, [am1, am2]);
    am_110001 = mergeArmModels(true, [am1, am2, am6]);
    am_110010 = mergeArmModels(true, [am1, am2, am5]);
    am_110011 = mergeArmModels(true, [am1, am2, am5, am6]);
    am_110100 = mergeArmModels(true, [am1, am2, am4]);
    am_110101 = mergeArmModels(true, [am1, am2, am4, am6]);
    am_110110 = mergeArmModels(true, [am1, am2, am4, am5]);
    am_110111 = mergeArmModels(true, [am1, am2, am4, am5, am6]);
    am_111000 = mergeArmModels(true, [am1, am2, am3]);
    am_111001 = mergeArmModels(true, [am1, am2, am3, am6]);
    am_111010 = mergeArmModels(true, [am1, am2, am3, am5]);
    am_111011 = mergeArmModels(true, [am1, am2, am3, am5, am6]);
    am_111100 = mergeArmModels(true, [am1, am2, am3, am4]);
    am_111101 = mergeArmModels(true, [am1, am2, am3, am4, am6]);
    am_111110 = mergeArmModels(true, [am1, am2, am3, am4, am5]);
    am_111111 = mergeArmModels(true, [am1, am2, am3, am4, am5, am6]);

    fm_000000 = FreqModel(); % empty i.e. not trained at all
    fm_000001 = fm6;
    fm_000010 = fm5;
    fm_000011 = mergeFreqModels(true, [fm5, fm6]);
    fm_000100 = fm4;
    fm_000101 = mergeFreqModels(true, [fm4, fm6]);
    fm_000110 = mergeFreqModels(true, [fm4, fm5]);
    fm_000111 = mergeFreqModels(true, [fm4, fm5, fm6]);
    fm_001000 = fm3;
    fm_001001 = mergeFreqModels(true, [fm3, fm6]);
    fm_001010 = mergeFreqModels(true, [fm3, fm5]);
    fm_001011 = mergeFreqModels(true, [fm3, fm5, fm6]);
    fm_001100 = mergeFreqModels(true, [fm3, fm4]);
    fm_001101 = mergeFreqModels(true, [fm3, fm4, fm6]);
    fm_001110 = mergeFreqModels(true, [fm3, fm4, fm5]);
    fm_001111 = mergeFreqModels(true, [fm3, fm4, fm5, fm6]);
    fm_010000 = fm2;
    fm_010001 = mergeFreqModels(true, [fm2, fm6]);
    fm_010010 = mergeFreqModels(true, [fm2, fm5]);
    fm_010011 = mergeFreqModels(true, [fm2, fm5, fm6]);
    fm_010100 = mergeFreqModels(true, [fm2, fm4]);
    fm_010101 = mergeFreqModels(true, [fm2, fm4, fm6]);
    fm_010110 = mergeFreqModels(true, [fm2, fm4, fm5]);
    fm_010111 = mergeFreqModels(true, [fm2, fm4, fm5, fm6]);
    fm_011000 = mergeFreqModels(true, [fm2, fm3]);
    fm_011001 = mergeFreqModels(true, [fm2, fm3, fm6]);
    fm_011010 = mergeFreqModels(true, [fm2, fm3, fm5]);
    fm_011011 = mergeFreqModels(true, [fm2, fm3, fm5, fm6]);
    fm_011100 = mergeFreqModels(true, [fm2, fm3, fm4]);
    fm_011101 = mergeFreqModels(true, [fm2, fm3, fm4, fm6]);
    fm_011110 = mergeFreqModels(true, [fm2, fm3, fm4, fm5]);
    fm_011111 = mergeFreqModels(true, [fm2, fm3, fm4, fm5, fm6]);
    fm_100000 = fm1;
    fm_100001 = mergeFreqModels(true, [fm1, fm6]);
    fm_100010 = mergeFreqModels(true, [fm1, fm5]);
    fm_100011 = mergeFreqModels(true, [fm1, fm5, fm6]);
    fm_100100 = mergeFreqModels(true, [fm1, fm4]);
    fm_100101 = mergeFreqModels(true, [fm1, fm4, fm6]);
    fm_100110 = mergeFreqModels(true, [fm1, fm4, fm5]);
    fm_100111 = mergeFreqModels(true, [fm1, fm4, fm5, fm6]);
    fm_101000 = mergeFreqModels(true, [fm1, fm3]);
    fm_101001 = mergeFreqModels(true, [fm1, fm3, fm6]);
    fm_101010 = mergeFreqModels(true, [fm1, fm3, fm5]);
    fm_101011 = mergeFreqModels(true, [fm1, fm3, fm5, fm6]);
    fm_101100 = mergeFreqModels(true, [fm1, fm3, fm4]);
    fm_101101 = mergeFreqModels(true, [fm1, fm3, fm4, fm6]);
    fm_101110 = mergeFreqModels(true, [fm1, fm3, fm4, fm5]);
    fm_101111 = mergeFreqModels(true, [fm1, fm3, fm4, fm5, fm6]);
    fm_110000 = mergeFreqModels(true, [fm1, fm2]);
    fm_110001 = mergeFreqModels(true, [fm1, fm2, fm6]);
    fm_110010 = mergeFreqModels(true, [fm1, fm2, fm5]);
    fm_110011 = mergeFreqModels(true, [fm1, fm2, fm5, fm6]);
    fm_110100 = mergeFreqModels(true, [fm1, fm2, fm4]);
    fm_110101 = mergeFreqModels(true, [fm1, fm2, fm4, fm6]);
    fm_110110 = mergeFreqModels(true, [fm1, fm2, fm4, fm5]);
    fm_110111 = mergeFreqModels(true, [fm1, fm2, fm4, fm5, fm6]);
    fm_111000 = mergeFreqModels(true, [fm1, fm2, fm3]);
    fm_111001 = mergeFreqModels(true, [fm1, fm2, fm3, fm6]);
    fm_111010 = mergeFreqModels(true, [fm1, fm2, fm3, fm5]);
    fm_111011 = mergeFreqModels(true, [fm1, fm2, fm3, fm5, fm6]);
    fm_111100 = mergeFreqModels(true, [fm1, fm2, fm3, fm4]);
    fm_111101 = mergeFreqModels(true, [fm1, fm2, fm3, fm4, fm6]);
    fm_111110 = mergeFreqModels(true, [fm1, fm2, fm3, fm4, fm5]);
    fm_111111 = mergeFreqModels(true, [fm1, fm2, fm3, fm4, fm5, fm6]);

    % The following code (between %%% generated-start %%% and %%% generated-end %%%) was generated with this script
    % #!/usr/local/bin/python3
    %
    % with open('file.txt', 'w') as file:
    %     file.write("\n%%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-start %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%\n")
    %     for model in ['Arm', 'Freq']:
    %         model_lower = model.lower();
    %         model_abb = 'am' if model == 'Arm' else 'fm'
    %
    %         for a in range(0, 2):
    %             for b in range(0, 2):
    %                 for c in range(0, 2):
    %                     for d in range(0, 2):
    %                         for e in range(0, 2):
    %                             for f in range(0, 2):
    %
    %                                 s = f"""
    % setSharedVar('simvma_{model_lower}ModelDef_{a}{b}{c}{d}{e}{f}', {model_abb}_{a}{b}{c}{d}{e}{f})"""
    %                                 file.write(s)
    %     file.write("\n\n%%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%\n")
    %

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-start %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    setSharedVar('simvma_armModelDef_000000', am_000000)
    setSharedVar('simvma_armModelDef_000001', am_000001)
    setSharedVar('simvma_armModelDef_000010', am_000010)
    setSharedVar('simvma_armModelDef_000011', am_000011)
    setSharedVar('simvma_armModelDef_000100', am_000100)
    setSharedVar('simvma_armModelDef_000101', am_000101)
    setSharedVar('simvma_armModelDef_000110', am_000110)
    setSharedVar('simvma_armModelDef_000111', am_000111)
    setSharedVar('simvma_armModelDef_001000', am_001000)
    setSharedVar('simvma_armModelDef_001001', am_001001)
    setSharedVar('simvma_armModelDef_001010', am_001010)
    setSharedVar('simvma_armModelDef_001011', am_001011)
    setSharedVar('simvma_armModelDef_001100', am_001100)
    setSharedVar('simvma_armModelDef_001101', am_001101)
    setSharedVar('simvma_armModelDef_001110', am_001110)
    setSharedVar('simvma_armModelDef_001111', am_001111)
    setSharedVar('simvma_armModelDef_010000', am_010000)
    setSharedVar('simvma_armModelDef_010001', am_010001)
    setSharedVar('simvma_armModelDef_010010', am_010010)
    setSharedVar('simvma_armModelDef_010011', am_010011)
    setSharedVar('simvma_armModelDef_010100', am_010100)
    setSharedVar('simvma_armModelDef_010101', am_010101)
    setSharedVar('simvma_armModelDef_010110', am_010110)
    setSharedVar('simvma_armModelDef_010111', am_010111)
    setSharedVar('simvma_armModelDef_011000', am_011000)
    setSharedVar('simvma_armModelDef_011001', am_011001)
    setSharedVar('simvma_armModelDef_011010', am_011010)
    setSharedVar('simvma_armModelDef_011011', am_011011)
    setSharedVar('simvma_armModelDef_011100', am_011100)
    setSharedVar('simvma_armModelDef_011101', am_011101)
    setSharedVar('simvma_armModelDef_011110', am_011110)
    setSharedVar('simvma_armModelDef_011111', am_011111)
    setSharedVar('simvma_armModelDef_100000', am_100000)
    setSharedVar('simvma_armModelDef_100001', am_100001)
    setSharedVar('simvma_armModelDef_100010', am_100010)
    setSharedVar('simvma_armModelDef_100011', am_100011)
    setSharedVar('simvma_armModelDef_100100', am_100100)
    setSharedVar('simvma_armModelDef_100101', am_100101)
    setSharedVar('simvma_armModelDef_100110', am_100110)
    setSharedVar('simvma_armModelDef_100111', am_100111)
    setSharedVar('simvma_armModelDef_101000', am_101000)
    setSharedVar('simvma_armModelDef_101001', am_101001)
    setSharedVar('simvma_armModelDef_101010', am_101010)
    setSharedVar('simvma_armModelDef_101011', am_101011)
    setSharedVar('simvma_armModelDef_101100', am_101100)
    setSharedVar('simvma_armModelDef_101101', am_101101)
    setSharedVar('simvma_armModelDef_101110', am_101110)
    setSharedVar('simvma_armModelDef_101111', am_101111)
    setSharedVar('simvma_armModelDef_110000', am_110000)
    setSharedVar('simvma_armModelDef_110001', am_110001)
    setSharedVar('simvma_armModelDef_110010', am_110010)
    setSharedVar('simvma_armModelDef_110011', am_110011)
    setSharedVar('simvma_armModelDef_110100', am_110100)
    setSharedVar('simvma_armModelDef_110101', am_110101)
    setSharedVar('simvma_armModelDef_110110', am_110110)
    setSharedVar('simvma_armModelDef_110111', am_110111)
    setSharedVar('simvma_armModelDef_111000', am_111000)
    setSharedVar('simvma_armModelDef_111001', am_111001)
    setSharedVar('simvma_armModelDef_111010', am_111010)
    setSharedVar('simvma_armModelDef_111011', am_111011)
    setSharedVar('simvma_armModelDef_111100', am_111100)
    setSharedVar('simvma_armModelDef_111101', am_111101)
    setSharedVar('simvma_armModelDef_111110', am_111110)
    setSharedVar('simvma_armModelDef_111111', am_111111)
    setSharedVar('simvma_freqModelDef_000000', fm_000000)
    setSharedVar('simvma_freqModelDef_000001', fm_000001)
    setSharedVar('simvma_freqModelDef_000010', fm_000010)
    setSharedVar('simvma_freqModelDef_000011', fm_000011)
    setSharedVar('simvma_freqModelDef_000100', fm_000100)
    setSharedVar('simvma_freqModelDef_000101', fm_000101)
    setSharedVar('simvma_freqModelDef_000110', fm_000110)
    setSharedVar('simvma_freqModelDef_000111', fm_000111)
    setSharedVar('simvma_freqModelDef_001000', fm_001000)
    setSharedVar('simvma_freqModelDef_001001', fm_001001)
    setSharedVar('simvma_freqModelDef_001010', fm_001010)
    setSharedVar('simvma_freqModelDef_001011', fm_001011)
    setSharedVar('simvma_freqModelDef_001100', fm_001100)
    setSharedVar('simvma_freqModelDef_001101', fm_001101)
    setSharedVar('simvma_freqModelDef_001110', fm_001110)
    setSharedVar('simvma_freqModelDef_001111', fm_001111)
    setSharedVar('simvma_freqModelDef_010000', fm_010000)
    setSharedVar('simvma_freqModelDef_010001', fm_010001)
    setSharedVar('simvma_freqModelDef_010010', fm_010010)
    setSharedVar('simvma_freqModelDef_010011', fm_010011)
    setSharedVar('simvma_freqModelDef_010100', fm_010100)
    setSharedVar('simvma_freqModelDef_010101', fm_010101)
    setSharedVar('simvma_freqModelDef_010110', fm_010110)
    setSharedVar('simvma_freqModelDef_010111', fm_010111)
    setSharedVar('simvma_freqModelDef_011000', fm_011000)
    setSharedVar('simvma_freqModelDef_011001', fm_011001)
    setSharedVar('simvma_freqModelDef_011010', fm_011010)
    setSharedVar('simvma_freqModelDef_011011', fm_011011)
    setSharedVar('simvma_freqModelDef_011100', fm_011100)
    setSharedVar('simvma_freqModelDef_011101', fm_011101)
    setSharedVar('simvma_freqModelDef_011110', fm_011110)
    setSharedVar('simvma_freqModelDef_011111', fm_011111)
    setSharedVar('simvma_freqModelDef_100000', fm_100000)
    setSharedVar('simvma_freqModelDef_100001', fm_100001)
    setSharedVar('simvma_freqModelDef_100010', fm_100010)
    setSharedVar('simvma_freqModelDef_100011', fm_100011)
    setSharedVar('simvma_freqModelDef_100100', fm_100100)
    setSharedVar('simvma_freqModelDef_100101', fm_100101)
    setSharedVar('simvma_freqModelDef_100110', fm_100110)
    setSharedVar('simvma_freqModelDef_100111', fm_100111)
    setSharedVar('simvma_freqModelDef_101000', fm_101000)
    setSharedVar('simvma_freqModelDef_101001', fm_101001)
    setSharedVar('simvma_freqModelDef_101010', fm_101010)
    setSharedVar('simvma_freqModelDef_101011', fm_101011)
    setSharedVar('simvma_freqModelDef_101100', fm_101100)
    setSharedVar('simvma_freqModelDef_101101', fm_101101)
    setSharedVar('simvma_freqModelDef_101110', fm_101110)
    setSharedVar('simvma_freqModelDef_101111', fm_101111)
    setSharedVar('simvma_freqModelDef_110000', fm_110000)
    setSharedVar('simvma_freqModelDef_110001', fm_110001)
    setSharedVar('simvma_freqModelDef_110010', fm_110010)
    setSharedVar('simvma_freqModelDef_110011', fm_110011)
    setSharedVar('simvma_freqModelDef_110100', fm_110100)
    setSharedVar('simvma_freqModelDef_110101', fm_110101)
    setSharedVar('simvma_freqModelDef_110110', fm_110110)
    setSharedVar('simvma_freqModelDef_110111', fm_110111)
    setSharedVar('simvma_freqModelDef_111000', fm_111000)
    setSharedVar('simvma_freqModelDef_111001', fm_111001)
    setSharedVar('simvma_freqModelDef_111010', fm_111010)
    setSharedVar('simvma_freqModelDef_111011', fm_111011)
    setSharedVar('simvma_freqModelDef_111100', fm_111100)
    setSharedVar('simvma_freqModelDef_111101', fm_111101)
    setSharedVar('simvma_freqModelDef_111110', fm_111110)
    setSharedVar('simvma_freqModelDef_111111', fm_111111)

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%% generated-end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

end
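The six-digit suffix used in the variable names of Listing A.95 can be read as a bit mask over the six default repositories, with the leftmost digit standing for the first repository. The following Python sketch illustrates that reading; the function names `repos_for_suffix` and `all_suffixes` are hypothetical helpers introduced here for illustration and are not part of the prototype.

```python
from itertools import product

# Repository order follows the PARAMETERS comment of Listing A.95.
REPOS = ["automotive", "avionics", "electronics",
         "energy", "robotics", "miscellaneous"]


def repos_for_suffix(suffix):
    """Return the repositories selected by a bit string such as '010011'."""
    if len(suffix) != len(REPOS) or set(suffix) - {"0", "1"}:
        raise ValueError("expected a 6-character string of 0s and 1s")
    return [name for bit, name in zip(suffix, REPOS) if bit == "1"]


def all_suffixes():
    """Enumerate all 64 suffixes, from '000000' up to '111111'."""
    return ["".join(bits) for bits in product("01", repeat=len(REPOS))]


if __name__ == "__main__":
    # '010011' corresponds to merging the 2nd, 5th and 6th models.
    print(repos_for_suffix("010011"))
    print(len(all_suffixes()))
```

The enumeration order matches the nested loops of the generator script quoted in the listing, where the last digit varies fastest.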

Listing A.96: sl_customization.m function definition
function sl_customization(cm)
    % adds items to the end of Simulink Editor Tools menu
    cm.addCustomMenuFcn('Simulink:ToolsMenu', @getMyMenuItems);

    % adds right-click menu items at the beginning
    cm.addCustomMenuFcn('Simulink:PreContextMenu', @getMyMenuItems);

    % adds right-click menu items at the end
    % cm.addCustomMenuFcn('Simulink:ContextMenu', @getMyMenuItems); %

end

% Define the custom menu function.
function schemaFcns = getMyMenuItems(callbackInfo)
    schemaFcns = {@getItem1, @getItem2, @getItem3};
end

% Define the schema function for first menu item.
function schema = getItem1(callbackInfo)
    schema = sl_action_schema;
    schema.label = 'SimIMA: Suggest Blocks';
    schema.userdata = 'simvma';
    schema.accelerator = 'Ctrl+J'; % todo: fixit -- not working
    schema.callback = @myCallback1;
end


% Define the schema function for second menu item.
function schema = getItem2(callbackInfo)
    schema = sl_action_schema;
    schema.label = 'SimIMA: Suggest Complete Systems';
    schema.userdata = 'simvma';
    schema.callback = @myCallback2;
end

% Define the schema function for third menu item.
function schema = getItem3(callbackInfo)
    schema = sl_action_schema;
    schema.label = 'SimIMA: Configure Block Suggestions';
    schema.userdata = 'simvma';
    schema.callback = @myCallback3;
end


function myCallback1(callbackInfo)
    initialize();
    suggestBlocks();
end

function myCallback2(callbackInfo)
    initialize();
    appSimxample();
end

function myCallback3(callbackInfo)
    initialize();
    appSimgestion();
end

Listing A.97: slx2mdl.m function definition
function slx2mdl(slxFilePath, mdlFilePath)
    % Convert slx file to mdl
    %
    % parameters:
    % ------
    % slxFilePath : (string) absolute path of the slx file to be converted
    % mdlFilePath : (string) absolute path of the mdl file to be created
    %               The corresponding folder will be created if it
    %               does not exist yet.
    %
    % IMPORTANT:
    % ------
    % - Conversion is not guaranteed to succeed, so put this function in a
    %   try-catch construct (in the caller).
    %

    slxFilePath = string(slxFilePath);
    [mdlFolderPath, mdlFileName, mdlExt] = fileparts(mdlFilePath);

    % create mdl-folder if it does not exist already
    if ~exist(mdlFolderPath, 'dir')
        mkdir(mdlFolderPath);
    end

    load_system(slxFilePath);
    save_system(gcs, mdlFilePath); % save as mdl
    close_system(gcs);
end

Listing A.98: sortClonesBySimilarity.m function definition
function clones = sortClonesBySimilarity(clones)
    % sort clones by similarity, high to low

    for i = 1:length(clones)-1
        for j = i+1:length(clones)
            if clones{i}.similarity < clones{j}.similarity
                % swap
                tmp = clones{i};
                clones{i} = clones{j};
                clones{j} = tmp;
            end
        end
    end
end
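For comparison, the same high-to-low ordering can be expressed with a library sort. The Python sketch below assumes, purely for illustration, that each clone is a dictionary with a numeric `similarity` entry; this representation is hypothetical and is not the one used by the prototype.

```python
def sort_clones_by_similarity(clones):
    """Return a new list of clones ordered by similarity, highest first."""
    # sorted() is stable, so clones with equal similarity keep their input order.
    return sorted(clones, key=lambda clone: clone["similarity"], reverse=True)


if __name__ == "__main__":
    example = [
        {"id": "a", "similarity": 0.7},
        {"id": "b", "similarity": 0.9},
        {"id": "c", "similarity": 0.7},
    ]
    print([clone["id"] for clone in sort_clones_by_similarity(example)])
```

Unlike the in-place exchange sort of Listing A.98, this sketch returns a new list and preserves the relative order of ties, which the exchange sort does not guarantee.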

Listing A.99: sortObjsByProp.m function definition
function newObjs = sortObjsByProp(objs, prop, order)
    % Sort objects by the given property in the given order
    %
    % PARAMETERS:
    % ------
    % objs (Objects)      : list of objects
    % prop (String/char)  : property name by which the objects are to be
    %                       sorted (must be comparable)
    % order (String/char) : 'ascend' / 'descend'

    % APPROACH:
    % - convert class instances (objs) to structs
    % - convert the structs to table
    % - sort the table
    % - convert the table back to structs
    % - convert the structs back to class instances

    prop = string(prop);
    order = string(order);

    assert(any(strcmp(order, ["ascend", "descend"])));

    if isempty(objs)
        newObjs = objs;
        return;
    end

    % new objects will be created by updating properties of the passed
    % objects.
    newObjs = objs;

    % object --> struct
    objsStruct = struct.empty();
    warning off;
    for i = 1 : length(objs)
        obj = objs(i);
        objStruct = struct(obj);
        objsStruct = [objsStruct objStruct];
    end
    warning on;

    table_ = struct2table(objsStruct);
    % sort the table by given property in given order
    tableSorted = sortrows(table_, prop, order);
    objsStructSorted = table2struct(tableSorted);

    % struct --> object
    objs = BlockSuggFromPredModel.empty();
    for i = 1 : length(objsStructSorted)
        obj = newObjs(i);
        s = objsStructSorted(i);
        fields_ = fields(s); % cell array
        for j = 1 : length(fields_)
            field = fields_{j};
            obj.(field) = s.(field);
        end
        newObjs(i) = obj;
    end
end
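The idea behind Listing A.99, ordering objects by a property chosen at run time, can be sketched compactly in Python. The class `BlockSuggestion` and its fields below are hypothetical stand-ins chosen for illustration; they are not the prototype's `BlockSuggFromPredModel` class.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass
class BlockSuggestion:
    """Hypothetical stand-in for a block suggestion object."""
    block_type: str
    confidence: float


def sort_objs_by_prop(objs, prop, order):
    """Return objs sorted by attribute `prop`; order is 'ascend' or 'descend'."""
    if order not in ("ascend", "descend"):
        raise ValueError("order must be 'ascend' or 'descend'")
    return sorted(objs, key=attrgetter(prop), reverse=(order == "descend"))


if __name__ == "__main__":
    suggs = [BlockSuggestion("Gain", 0.2), BlockSuggestion("Sum", 0.5)]
    print(sort_objs_by_prop(suggs, "confidence", "descend"))
```

Here the property name is resolved with `attrgetter`, so no intermediate struct or table conversion is needed in this sketch.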

Listing A.100: sortStructFieldsMaxToMin.m function definition
function sortedFields = sortStructFieldsMaxToMin(structure)
    % Return a list of fields (string) sorted by the value (number) of the
    % fields in given structure.
    %
    % ASSUMPTIONS:
    % ------
    % - value of each field in the structure is a number (double)
    %
    % PARAMETERS:
    % ------
    % structure (struct) : structure whose fields are to be sorted.

    sortedFields = string.empty;
    while length(fields(structure))
        f = getFieldWithMaxVal(structure);
        sortedFields = [sortedFields f];
        structure = rmfield(structure, f);
    end
end
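Listing A.100 repeatedly extracts the field with the largest value until the structure is empty. As a minimal sketch of the same result, assuming the structure is modeled as a Python dictionary from field name to number (an assumption made only for this illustration), a single sort by value suffices:

```python
def sort_fields_max_to_min(structure):
    """Return the keys of `structure` ordered by their values, largest first."""
    # Keys with equal values keep their insertion order here; how the
    # original getFieldWithMaxVal helper breaks ties is not shown in this listing.
    return sorted(structure, key=structure.get, reverse=True)


if __name__ == "__main__":
    counts = {"Gain": 3, "Sum": 7, "Scope": 1}
    print(sort_fields_max_to_min(counts))
```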

Listing A.101: subSystemsCompatibleForReplacement.m function definition
function result = subSystemsCompatibleForReplacement(subSystem1Path, subSystem2Path)
    % Return true if passed sub-systems are compatible for replacement (by one
    % another), else return false.
    %
    % ASSUMPTIONS:
    % ------
    % 1. both mdl files are loaded
    %
    % PARAMETERS:
    % ------
    % subSystem1Path : string -- first sub-system's path
    % subSystem2Path : string -- second sub-system's path
    % - Both subSystem1Path and subSystem2Path should be in format as returned
    %   by 'gcs' eg: "mymodel/a/b/mysubsystem"

    % LOGIC:
    % - The two sub-systems must have the same number of inports and outports.
    %   i.e. nInport1 = nInport2 && nOutport1 = nOutport2

    % disp("subSystem1 : " + subSystem1);
    % disp("subSystem2 : " + subSystem2);


    [nInport1, nOutport1] = getNInportOutport(subSystem1Path);
    [nInport2, nOutport2] = getNInportOutport(subSystem2Path);

    if nInport1 == nInport2 && nOutport1 == nOutport2
        result = true;
    else
        result = false;
    end
end
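The compatibility rule of Listing A.101 reduces to comparing two pairs of port counts. The Python sketch below states that rule on its own; the `PortCounts` type is a hypothetical container introduced for illustration, standing in for the values that `getNInportOutport` returns in the prototype.

```python
from typing import NamedTuple


class PortCounts(NamedTuple):
    """Hypothetical container for a subsystem's port counts."""
    inports: int
    outports: int


def compatible_for_replacement(first, second):
    """Two subsystems are interchangeable only if both port counts match."""
    return first.inports == second.inports and first.outports == second.outports


if __name__ == "__main__":
    print(compatible_for_replacement(PortCounts(2, 1), PortCounts(2, 1)))
    print(compatible_for_replacement(PortCounts(2, 1), PortCounts(1, 1)))
```

Note that this check considers only the number of ports, as in the listing; it says nothing about signal types or dimensions.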

Listing A.102: suggestBlocks.m function definition
function suggestBlocks()
    % This is the entry function for generating and displaying block-level
    % suggestions directly onto the system under development.
    % This function will be called by a callback function inside
    % sl_customization.m

    simvmaPath = getSimvmaPath();
    % capture context, before it gets overridden (potentially)
    context = getContext();

    % When suggestBlocks() is called for the first time (after loading Matlab),
    % gcb (and hence context.mrcbPath) gives the actual
    % MostRecentlyClickedBlock (by the user). At other times, if
    % suggestBlocks() is called repeatedly without the user clicking some
    % block in SUD, this value is set to "/path/to/suggPanelFooter". So, we
    % use the previous MostRecentlyClickedBlock (by the user) in such cases.
    %
    % However, we still make sure that the most recent nSuggsMax (this
    % could have been updated from last one) is used to decide the "size"
    % of the suggestion panel
    if context.mrcbPath.endsWith("suggPanelFooter")
        context = evalin('base', 'simvma_context'); % load from workspace
        context = context.setPositions();
    end
    assignin('base', 'simvma_context', context); % not just for debugging

    suggs = getBlockSuggs(context, true);
    assignin('base', 'simvma_suggs', suggs); % for debugging
    if length(suggs) == 0
        dispDlgMsg("Sorry, no suggestions were found!", "Suggestion Not Found");
        return;
    end

    % blockInsertionState must be shared using shared-vars, because it
    % cannot be passed as argument to the callback function code (see
    % more explanation in dispBlkSuggsOnWorkspace.m)
    % blockInsertionState.percenTexts will be set from inside
    % dispBlkSuggsOnWorkspace()
    blockInsertionState = BlockInsertionState(context, suggs, NaN);
    setSharedVar('simvma_blockInsertionState', blockInsertionState);

    dispBlkSuggsOnWorkspace(suggs, context);
end

Listing A.103: symbolicPathToRealPath.m function definition
function realPath = symbolicPathToRealPath(symbolicPath)
    % convert symbolic path to real path
    %
    % IMPORTANT:
    % ------
    % - this is a workaround as matlab does not support the UNIX
    %   command "realpath"
    % - It is limited in functionality and is intended to use only to resolve
    %   the absolute path of the standard repositories

    symbolicPath = string(symbolicPath); % cannot be char
    realPath = symbolicPath.replace("Simone-2.0-Complete-Cygwin64-customized-for-simvma/mdl-files", "default-repos");
end

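Listing A.104: trimLabel.m function definition

function label = trimLabel(label, maxLen)
% trims the label such that the returned value <= maxLen in length
%
% PARAMETERS:
% ------
% label  : string/char
% maxLen : maximum length of the returned label (assumed to be >= 3,
%          so that the leading "..." fits)

    len = length(char(label));
    if len > maxLen
        % keep the last (maxLen - 3) characters, so that the result,
        % including the leading "...", is exactly maxLen characters long
        label = "..." + extractAfter(label, len - maxLen + 3);
    end
end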
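The tail-keeping idea behind trimLabel can be illustrated on a single label. The following is a minimal sketch, not part of the tool: the label and the limit are hypothetical values chosen for illustration, and the budget of three characters for the ellipsis is an assumption about how the length limit is meant to be read.

```matlab
% Hypothetical inputs, chosen only for illustration.
label  = "mymodel/controller/very_long_block_name";
maxLen = 12;

len = strlength(label);

% extractAfter(str, pos) returns everything after character position pos,
% so this keeps the last maxLen characters of the label.
tail = extractAfter(label, len - maxLen);

% If the "..." prefix should count against the limit (an assumption),
% keep three characters fewer before prepending it.
trimmed = "..." + extractAfter(label, len - (maxLen - 3));

disp(tail);
disp(trimmed);
```

Keeping the tail rather than the head seems a reasonable choice for block paths, since the end of the path usually names the block itself.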

Listing A.105: unhighlightMrcbAndNeighbors.m function definition

function unhighlightMrcbAndNeighbors(context, resetAll)
% Undo highlighting of the Most Recently Clicked Block (MRCB) and its
% neighbors
%
% PARAMETERS:
% ------
% context (Context) : the context of the simulink workspace
% resetAll (bool)   : If true, all blocks in SUD will be unhighlighted.
%                     If false, only MRCB and neighbors will be
%                     unhighlighted.

    if resetAll
        params = getParamsByPath(context.sud);
        blocks = string(params.Blocks); % all blocks in SUD

        for i = 1 : length(blocks)
            block = blocks(i);
            % some block-names contain the character '/'. But since
            % '/' is used as a separator between parent and child
            % blocks, '/' in a block-name is replaced with '//'
            block = strrep(block, '/', '//');
            blockPath = context.sud + "/" + block;
            set_param(blockPath, 'ForegroundColor', 'black');
        end

    else
        mrcbParams = getParamsByPath(context.mrcbPath);
        mrcbHandle = mrcbParams.Handle;
        set_param(mrcbHandle, 'ForegroundColor', 'black');

        for i = 1 : length(context.mrcbNeighborsHandles)
            handle = context.mrcbNeighborsHandles(i);
            set_param(handle, 'ForegroundColor', 'black');
        end
    end
end
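The slash-escaping step used when building block paths can be tried in isolation, without Simulink. In this small sketch the system path and block name are hypothetical; only strrep and string concatenation are used.

```matlab
% Hypothetical system path and block name (the name contains a '/').
sud       = "mymodel/sub1";
blockName = "In/Out Selector";

% A '/' inside a block name is doubled, so that it is not mistaken for the
% separator between parent and child in the full block path.
escaped   = strrep(blockName, "/", "//");
blockPath = sud + "/" + escaped;

disp(blockPath);   % mymodel/sub1/In//Out Selector
```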

Listing A.106: unwrapSubSystem.m function definition

function unwrapSubSystem(subSystemPath)
% Un-wrap the given (sub)system.
%
% PARAMETERS
% ------
% subSystemPath : string -- path of the subsystem to be unwrapped
%
% ASSUMPTIONS:
% ------
% - path is a valid path to a sub-system (not the top-system)

    Simulink.BlockDiagram.expandSubsystem(subSystemPath);

    % by default, after unwrapping (expanding) the original subsystem,
    % it is wrapped in an 'annotation' whose name is the same as that of
    % expanded subsystem
    % we want to remove this annotation.

    % APPROACH:
    % 1. get path of parent system
    % 2. get all annotations in parent system
    % 3. remove the annotation whose name matches with that of expanded
    %    subsystem

    subSystemPath = string(subSystemPath);
    tokens = subSystemPath.split('/');
    subsystemName = tokens(end); % to be used later
    tokens(end) = []; % remove last token
    parentPath = tokens.join('/');

    annotations = find_system(parentPath, 'FindAll', 'on', 'Type', 'annotation');
    annotationNames = get_param(annotations, 'Name'); % cell array
    for i = 1 : length(annotationNames)
        annotationName = string(annotationNames(i));
        if annotationName == subsystemName
            delete(annotations(i));
        end
    end
end
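The first step of the approach above, deriving the parent path and the subsystem name from a full path, involves only string operations. The sketch below uses a hypothetical path; note that splitting on "/" would also split an escaped "//" inside a block name, so it assumes names without slashes.

```matlab
% Hypothetical subsystem path, in the format returned by gcs.
subSystemPath = "mymodel/a/b/mysubsystem";

tokens        = split(subSystemPath, "/");       % column string array
subsystemName = tokens(end);                      % last path component
parentPath    = join(tokens(1:end-1), "/");       % everything before it

disp(parentPath);      % mymodel/a/b
disp(subsystemName);   % mysubsystem
```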

Listing A.107: updateDefBlockPredModelsAfterChangesInDefRepos.m function definition

function updateDefBlockPredModelsAfterChangesInDefRepos
% This function updates the following shared vars based on
% current mdl files in default-repo 1 through default-repo 6 folders
% - simvma_armModelDef_000000.mat
% ...
% - simvma_armModelDef_111111.mat
%
% The following custom-block-prediction models are not changed:
% - simvma_armModelCst.mat
% - simvma_freqModelCst.mat
%
% The following are updated too:
% - simvma_armModel.mat
% - simvma_freqModel.mat

    dispTitle("Updating default block-prediction models using the following state");

    state = getSharedVar('simvma_appState');
    state = state.setModelFilePathsAndCounts();
    setSharedVar('simvma_appState', state);
    disp(state);

    % Now, we first create separate ArmModels and FreqModels trained on
    % exactly 1 default repo. Then we merge appropriate models to create
    % all 64 models for both Arm and Freq

    % ArmModel trained on 1st Default repo only
    am1 = ArmModel();
    dispTitle("Training armModelDef_100000");
    am1 = am1.trainByFilepath(state.abspathsOfMdlSlxInDefRepo1, true);

    % ArmModel trained on 2nd Default repo only
    am2 = ArmModel();
    dispTitle("Training armModelDef_010000");
    am2 = am2.trainByFilepath(state.abspathsOfMdlSlxInDefRepo2, true);

    % ArmModel trained on 3rd Default repo only
    am3 = ArmModel();
    dispTitle("Training armModelDef_001000");
    am3 = am3.trainByFilepath(state.abspathsOfMdlSlxInDefRepo3, true);

    % ArmModel trained on 4th Default repo only
    am4 = ArmModel();
    dispTitle("Training armModelDef_000100");
    am4 = am4.trainByFilepath(state.abspathsOfMdlSlxInDefRepo4, true);

    % ArmModel trained on 5th Default repo only
    am5 = ArmModel();
    dispTitle("Training armModelDef_000010");
    am5 = am5.trainByFilepath(state.abspathsOfMdlSlxInDefRepo5, true);

    % ArmModel trained on 6th Default repo only
    am6 = ArmModel();
    dispTitle("Training armModelDef_000001");
    am6 = am6.trainByFilepath(state.abspathsOfMdlSlxInDefRepo6, true);

    % FreqModel trained on 1st Default repo only
    fm1 = FreqModel();
    dispTitle("Training FreqModelDef_100000");
    fm1 = fm1.trainByFilepath(state.abspathsOfMdlSlxInDefRepo1, true);

    % FreqModel trained on 2nd Default repo only
    fm2 = FreqModel();
    dispTitle("Training FreqModelDef_010000");
    fm2 = fm2.trainByFilepath(state.abspathsOfMdlSlxInDefRepo2, true);

    % FreqModel trained on 3rd Default repo only
    fm3 = FreqModel();
    dispTitle("Training FreqModelDef_001000");
    fm3 = fm3.trainByFilepath(state.abspathsOfMdlSlxInDefRepo3, true);

    % FreqModel trained on 4th Default repo only
    fm4 = FreqModel();
    dispTitle("Training FreqModelDef_000100");
    fm4 = fm4.trainByFilepath(state.abspathsOfMdlSlxInDefRepo4, true);

    % FreqModel trained on 5th Default repo only
    fm5 = FreqModel();
    dispTitle("Training FreqModelDef_000010");
    fm5 = fm5.trainByFilepath(state.abspathsOfMdlSlxInDefRepo5, true);

    % FreqModel trained on 6th Default repo only
    fm6 = FreqModel();
    dispTitle("Training FreqModelDef_000001");
    fm6 = fm6.trainByFilepath(state.abspathsOfMdlSlxInDefRepo6, true);

    setSharedVarsForPredModelsTrainedOnDefRepos(am1, am2, am3, am4, am5, am6, fm1, fm2, fm3, fm4, fm5, fm6);

    % update appState
    state = getSharedVar('simvma_appState');
    state.defBlockPredModelsReset = false;
    setSharedVar('simvma_appState', state);

    % update simvma_armModel.mat and simvma_freqModel.mat
    updatePredModels(true);
end
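The 64 default models correspond to the 2^6 subsets of the six default repositories, each named by a six-character bit string. The sketch below only enumerates those names and shows which single-repository models one combination would draw on. How setSharedVarsForPredModelsTrainedOnDefRepos actually merges the models is not shown in this listing, so the sketch makes no claim about it, and the example combination is hypothetical.

```matlab
nRepos = 6;

% One row per subset of repositories: '000000' up to '111111'.
bits  = dec2bin(0:2^nRepos - 1, nRepos);           % 64-by-6 char matrix
names = "simvma_armModelDef_" + string(bits);      % 64-by-1 string array

% For one (hypothetical) combination, list the repositories whose
% single-repo models would have to be merged. The first character
% corresponds to default repo 1, as in armModelDef_100000.
combo    = '010010';
row      = bin2dec(combo) + 1;
repoIdxs = find(bits(row, :) == '1');

disp(names(row));
disp(repoIdxs);
```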

Listing A.108: updateMUDWithSuggestion.m function definition

function updateMUDWithSuggestion(SUD, suggestion, MUDFilepath, simvmaPath)
% Update the Model-Under-Development (MUD) with suggestion
%
% PARAMETERS:
% ------
% SUD         : string -- (sub)system (which is to be updated)
%               This should be in format as returned by 'gcs' eg:
%               "mymodel/a/b/mysubsystem"
% suggestion  : Suggestion object
% MUDFilepath : string -- absolute path of Model-Under-Development
% simvmaPath  : string -- absolute path of simvma project
%
% ASSUMPTIONS:
% - MUD i.e. mdl file at MUDFilepath is loaded
% - mdl file corresponding to suggestion may or may not be loaded.
%   In either case, the state of the file (loaded/not-loaded) will be
%   preserved.
%
% APPROACH:
% - There are 4 possible cases -- Handle each case separately.

    % ======

    dispTitle("updating complete (sub)system");

    % disp(suggestion);
    % disp(suggestion.source);

    % in case suggestion's Mdl filepath needs to be changed, change
    % sugMdlFilepath, not suggestion.source.realFilepath.
    % In Matlab, the following string assignment is "by value", not "by
    % reference", so we can safely change sugMdlFilepath as needed.
    sugMdlFilepath = suggestion.source.realFilepath;

    SFS = getSystemPathFromStartline(sugMdlFilepath, suggestion.source.startline);

    disp("");
    disp("SUD : " + SUD);
    disp("SFS : " + SFS);
    disp("MUDFilepath : " + MUDFilepath);
    disp("Suggestion Filepath : " + sugMdlFilepath);

    % find whether suggestion's mdl file is loaded
    % this is required to preserve its state
    sugMdlWasLoaded = isLoadedByAbspath(sugMdlFilepath);
    if ~sugMdlWasLoaded
        % if the mdl file corresponding to the suggestion (sugMdl) and MUD
        % have the same name, it is not possible to load both models as such.
        % In that case, we need to rename one of them (to load sugMdl).
        % We choose to rename sugMdl, using the following steps:
        % 1. rename sugMdl such that it does not conflict with any loaded
        %    system
        % 2. load sugMdl
        % 3. copy sugMdl to MUD
        % 4. close sugMdl
        % 5. restore original name of sugMdl

        sugMdlFilepathChanged = false;
        sugMdlFilepathOriginal = sugMdlFilepath;

        if filenamesMatch(MUDFilepath, sugMdlFilepath)
            % backup of original sugMdl file
            sugMdlBackupFilepath = simvmaPath + "/tmp/sugMdlBackup.mdl";
            copyfile(sugMdlFilepath, sugMdlBackupFilepath);

            [folder, bdName, ext] = fileparts(sugMdlFilepath);
            bdName = bdName + "_9dsf598sdf473"; % add a random string

            sugMdlFilepath = folder + filesep + bdName + ext;
            movefile(sugMdlFilepathOriginal, sugMdlFilepath); % rename file i.e. move file
            sugMdlFilepathChanged = true;

            % update SFS
            tokens = SFS.split('/');
            tokens(1) = bdName;
            SFS = tokens.join('/');
        end

        load_system(sugMdlFilepath);
    end

    % ======

    % by now, both SUD and SFS are loaded, so we can safely call
    % getNInportOutport
    [nInportSud, nOutportSud] = getNInportOutport(SUD);
    [nInportSfs, nOutportSfs] = getNInportOutport(SFS);

    sudIsTop = isTopSystem(SUD);
    sfsIsTop = isTopSystem(SFS);

    % based on whether SUD and SFS are top-system or sub-system,
    % there are 4 possible cases.

    postAction = "NONE";

    if sudIsTop && sfsIsTop
        dispTitle("CASE I -- SUD:TOP, SFS:TOP");
        postAction = handleTopTop(SUD, MUDFilepath, sugMdlFilepath);

    elseif sudIsTop && ~sfsIsTop
        dispTitle("CASE II -- SUD:TOP, SFS:SUB");
        % replaceTopSystemBySubSystemContents(SUD, SFS); % fails in some cases (read detailed comments below)

        % Earlier, we used the function replaceTopSystemBySubSystem() to
        % accomplish this task. However, it was found that this function
        % fails in some cases (not understood fully). For example, when the
        % replacer subsystem is a "Function" or a "Chart".
        % To avoid the problem altogether, we now "convert" this case
        % (Top, Sub) to the first case i.e. (Top, Top). This conversion is
        % possible because we already have the corresponding suggestion's
        % mdl file (in simvma/tmp/simample-suggs/) which contains exactly
        % the replacer subsystem as the top-level system.

        postAction = handleTopTop(SUD, MUDFilepath, suggestion.mdlFilepath);

    elseif ~sudIsTop && sfsIsTop
        dispTitle("CASE III -- SUD:SUB, SFS:TOP");
        postAction = handleSubTop(SUD, SFS, MUDFilepath, simvmaPath, sugMdlFilepath, nInportSud, nOutportSud, nInportSfs, nOutportSfs);

    else
        dispTitle("CASE IV -- SUD:SUB, SFS:SUB");
        postAction = handleSubSub(SUD, SFS, MUDFilepath, nInportSud, nOutportSud, nInportSfs, nOutportSfs);
    end

    % ======

    % at this stage, sugMdl is loaded
    % And, its filename MIGHT have changed (in case it originally matched
    % the MUD's filename).
    %
    % We now want to restore the original state of sugMdl (i.e. its
    % filename, and its loaded/not loaded state)

    if ~sugMdlWasLoaded
        close_system(sugMdlFilepath);
        if sugMdlFilepathChanged
            movefile(sugMdlBackupFilepath, sugMdlFilepathOriginal);
            delete(sugMdlFilepath);
        end
    end

    % post-replacement stuffs
    disp("postAction : " + postAction);

    % if postAction == "ADJUST_CONN"
    %     msg = "Sub-System under development has been replaced with the suggested Sub-System." + newline + newline + ...
    %         "You may need to adjust signal lines connecting to/from the replaced sub-system.";
    %     dispDlgMsg(msg);
    % end

    if postAction == "ADJUST_CONN"
        msg = "Sub-System under development has been replaced with the suggested Sub-System." ...
            + newline + newline + "You may need to adjust signal lines connecting to/from the replaced sub-system." ...
            + newline + newline + "Number of Inports in Subsystem Under Development = " + nInportSud ...
            + newline + "Number of Inports in Suggested subsystem = " + nInportSfs ...
            + newline + "Number of Outports in Subsystem Under Development = " + nOutportSud ...
            + newline + "Number of Outports in Suggested subsystem = " + nOutportSfs;

        dispDlgMsg(msg);
    end

end


function postAction = handleTopTop(SUD, MUDFilepath, sugMdlFilepath)
    state = getSharedVar('simvma_appState');

    % If the user is working on MDL-formatted model, replacement is
    % straightforward -- simply replace the entire MUD's MDL file with
    % suggestion's MDL file
    %
    % However, if the user is working on SLX-formatted model, replacement
    % is not straightforward (we cannot put MDL content into an
    % SLX-formatted file). In such case, we:
    % 1. replace the content of a temporary MDL file
    %    simvma/tmp/simxample-slx2mdl-output/*.mdl
    %    (which is already created by createMdlFileForSimone()) with
    %    sugMdlFilepath's content
    % 2. convert the temporary MDL file to slx format such that the slx
    %    file so created has same name as MUDFilepath. This effectively
    %    replaces the original MUD SLX file with the one obtained from
    %    conversion.

    if state.simxampleSavedAsMdl
        % we use the file simvma/tmp/simxample-slx2mdl-output/*.mdl
        % (created earlier by createMdlFileForSimone()) as a temporary file
        [folder, filename, ext] = fileparts(MUDFilepath);
        destMdlFilepath = getSimvmaPath() + "/tmp/simxample-slx2mdl-output/" + filename + ".mdl";

        close_system(MUDFilepath);
        open_system(destMdlFilepath);
        replaceMdlFileContent(destMdlFilepath, sugMdlFilepath);
        close_system(destMdlFilepath);

        mdl2slx(destMdlFilepath, MUDFilepath); % MUDFilepath is .slx file
        open_system(MUDFilepath);
    else
        replaceMdlFileContent(MUDFilepath, sugMdlFilepath);
    end

    postAction = "NONE";

    % change view to top level
    % assuming the model is loaded (which is true, here), open(bdname) does
    % the job
    open_system(SUD);
end


function postAction = handleSubTop(SUD, SFS, MUDFilepath, simvmaPath, sugMdlFilepath, nInportSud, nOutportSud, nInportSfs, nOutportSfs)

    % APPROACH
    % 1. copy the suggestion's mdl file to simvma/tmp with the same name as
    %    the original mdl file
    % 2. load the copied file (just load, don't open)
    % 3. wrap the suggested-top-system (from this copied file) in a
    %    sub-system
    % 4. save it.
    % 5. now it is equivalent to case IV (sub, sub); handle accordingly

    dispUnderlined("handling SUD:sub, SFS:top");
    disp("SUD : " + SUD);
    disp("SFS : " + SFS);

    [folder_, sugMdlFilename, ext] = fileparts(sugMdlFilepath);
    destPath = simvmaPath + "/tmp/" + sugMdlFilename + ext;

    copyfile(sugMdlFilepath, destPath);

    % close if the suggestion-mdl-file (original one) is already open
    % because Matlab does not allow two models with the same name (in
    % different folders) to be open simultaneously

    % the second arg (0) is to suppress warning, in case the model is not
    % already open
    close_system(sugMdlFilename, 0);
    % load the copied file
    load_system(destPath);
    wrapperSystem = wrapInSystem(sugMdlFilename);
    save_system(destPath);

    % wrapperSystem will serve as the SFS
    postAction = handleSubSub(SUD, wrapperSystem, MUDFilepath, nInportSud, nOutportSud, nInportSfs, nOutportSfs);

    close_system(destPath);
end


function postAction = handleSubSub(SUD, SFS, MUDFilepath, nInportSud, nOutportSud, nInportSfs, nOutportSfs)

    postAction = "NONE";

    dispUnderlined("handling SUD:sub, SFS:sub");
    disp("SUD : " + SUD);
    disp("SFS : " + SFS);

    if subSystemsCompatibleForReplacement(SUD, SFS)
        postAction = replaceSubSystem(SUD, SFS, MUDFilepath);
    else
        dispImp("Subsystems incompatible for replacement. Asking user for confirmation...");

        question = "The subsystems are not compatible for replacement due to a mismatch in number of input/output ports." ...
            + newline + newline + "Number of Inports in Subsystem Under Development = " + nInportSud ...
            + newline + "Number of Inports in Suggested subsystem = " + nInportSfs ...
            + newline + "Number of Outports in Subsystem Under Development = " + nOutportSud ...
            + newline + "Number of Outports in Suggested subsystem = " + nOutportSfs ...
            + newline + newline + "If you choose to replace your sub-system with the suggested sub-system, you will need to adjust the ports and connections." ...
            + newline + newline + "Do you still want to proceed with the replacement?";

        if (dispDlgConfirmation(question))
            postAction = replaceSubSystem(SUD, SFS, MUDFilepath);
        end
    end
end
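The four-way case split on whether SUD and SFS are top-level systems can be sketched as a small lookup table. This is an illustration only: isTop below is a hypothetical stand-in that treats any path without a "/" as a top-level system, which may differ from what isTopSystem actually checks, and the example paths are invented.

```matlab
% Hypothetical stand-in for isTopSystem: a path with no '/' is top-level.
isTop = @(p) ~contains(p, "/");

% Rows: SUD is sub (1) or top (2). Columns: SFS is sub (1) or top (2).
caseNames = ["CASE IV -- SUD:SUB, SFS:SUB",  "CASE III -- SUD:SUB, SFS:TOP"; ...
             "CASE II -- SUD:TOP, SFS:SUB",  "CASE I -- SUD:TOP, SFS:TOP"];

pickCase = @(sud, sfs) caseNames(isTop(sud) + 1, isTop(sfs) + 1);

disp(pickCase("mymodel", "library"));              % both top-level
disp(pickCase("mymodel", "library/controller"));   % top vs. sub
disp(pickCase("mymodel/plant", "library"));        % sub vs. top
disp(pickCase("mymodel/plant", "library/ctrl"));   % both sub
```

In the listing, Case II is redirected to the Case I handler and Case III is reduced to Case IV by wrapping, so only two replacement strategies are actually implemented.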

Listing A.109: updateModelWithClone.m function definition

function updateModelWithClone(modelPath, clonePath)
% replace the model's entire content with that of the clone

    % any name can be given; it's temporary and doesn't matter
    model = 'temp';
    clonePath = '/Users/bhisma/courses/cse-700-simvma/simvma/simulink-models/automotive/powerwindow.mdl';

    % close the simulink model the user is working on
    % so that a new system with the same name can be saved
    % the second argument, 0 suppresses the warning which is thrown in case
    % modelPath is not loaded
    close_system(modelPath, 0);

    new_system(model, 'FromFile', clonePath);
    % system must be either loaded (load_system) or opened (open_system)
    % before saving (save_system)
    open_system(model);
    save_system(model, modelPath);
end

Listing A.110: updatePredModels.m function definition

function updatePredModels(verbose)
% This function updates the following shared vars, based on the shared var
% 'simvma_appState'.
% - simvma_freqModel
% - simvma_armModel
%
% This function does not update the prediction models trained on default
% repos (eg. simvma_armModelDef_100000) unless app.defBlockPredModelsReset
% is set to true.
%
% This function is called from inside appSimgestion, appSimxample,
%
%
% PARAMETERS:
% ------
% verbose (bool) : If true, details are printed
%
% ASSUMPTIONS:
% ------
% default prediction models (stored in shared vars) are already
% updated (using updateDefBlockPredModelsAfterChangesInDefRepos() )
%
% IMPORTANT:
% ------
% This function DOES NOT update the default models i.e.
% (for example, simvma_armModelDef_010000.mat) when there is a change in mdl
% files in the default-repos folder. To achieve that functionality we have
% another function named updateDefBlockPredModelsAfterChangesInDefRepos().
% That function is in src/functions/devt (as it is to be used during
% development time only because "default-repos" folder is fixed after
% development, and so this function is not necessary at production time)
%
% This function DOES NOT (and CAN NOT) check if there has been a "change"
% in the repositories (both default and custom). So, this function ALWAYS
% updates the prediction models (i.e. simvma_freqModel and
% simvma_armModel). Such check cannot be performed inside this function
% because 'previous' appState is not available here. Therefore, such
% check must be performed in the caller environment (before invoking this
% function) to avoid unnecessary updating of the prediction models (which
% does take considerable time).

    if verbose dispTitle("Updating PredModels"); end

    % appState is set before invoking this function from
    % appAdmin. So, calling getSharedVar() here gives the recent
    % changes made by user from UI, if any.
    state = getSharedVar('simvma_appState');

    % if default block-prediction-models are in 'reset' state,
    % train them now
    if state.defBlockPredModelsReset
        disp("Standard Block Prediction models were found to have been reset.");
        disp("Training them now...");
        % this function also unsets the flag state.defBlockPredModelsReset
        updateDefBlockPredModelsAfterChangesInDefRepos();
    end

    % get all mdl/slx files' path from all custom repos
    mdlSlxPaths = string.empty;
    for i = 1 : 5
        cstRepoPath = state.("cstRepoPath" + i);
        if ~strcmp(cstRepoPath, "")
            mdlSlxPaths = [mdlSlxPaths searchFilesRecursively(cstRepoPath, ["mdl", "slx"])];
        end
    end

    % HANDLE ARM MODEL

    % construct the shared var name (eg: simvma_armModelDef_010000) for
    % required ArmModel (the one trained on checked default repos only)
    am = "simvma_armModelDef_";
    for i = 1 : 6
        if state.("defRepo" + i + "Checked")
            am = am + 1;
        else
            am = am + 0;
        end
    end
    armModelDef = getSharedVar(am);

    % ArmModel trained on all custom repositories
    armModelCst = ArmModel();
    if ~isempty(mdlSlxPaths)
        armModelCst = armModelCst.trainByFilepath(mdlSlxPaths, verbose);
    end

    armModel = mergeArmModels(false, [armModelCst, armModelDef]);
    setSharedVar('simvma_armModel', armModel);

    % HANDLE FREQ MODEL

    % construct the shared var name (eg: simvma_freqModelDef_010000) for
    % required FreqModel (the one trained on checked default repos only)
    fm = "simvma_freqModelDef_";
    for i = 1 : 6
        if state.("defRepo" + i + "Checked")
            fm = fm + 1;
        else
            fm = fm + 0;
        end
    end
    freqModelDef = getSharedVar(fm);

    % FreqModel trained on all custom repositories
    freqModelCst = FreqModel();
    if ~isempty(mdlSlxPaths)
        freqModelCst = freqModelCst.trainByFilepath(mdlSlxPaths, verbose);
    end

    freqModel = mergeFreqModels(false, [freqModelCst, freqModelDef]);
    setSharedVar('simvma_freqModel', freqModel);
end
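The shared-variable name for the selected default model is built by appending one digit per default repository. The sketch below reproduces that construction on a plain logical vector instead of the appState object. The vector checked is a hypothetical stand-in for the six defRepoNChecked flags, and the vectorized form is an alternative shown for comparison, not the code used by the tool.

```matlab
% Hypothetical stand-in for state.defRepo1Checked ... state.defRepo6Checked.
checked = logical([0 1 0 0 1 0]);

% Loop form, mirroring the listing: string + number appends the digit.
loopName = "simvma_armModelDef_";
for i = 1 : numel(checked)
    loopName = loopName + double(checked(i));
end

% Vectorized form: convert each flag to "0"/"1" and join without a delimiter.
suffix  = join(string(double(checked)), "");
vecName = "simvma_armModelDef_" + suffix;

disp(loopName);   % simvma_armModelDef_010010
disp(vecName);    % simvma_armModelDef_010010
```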

Listing A.111: wrapInSystem.m function definition

function wrapperSystem = wrapInSystem(system)
% Wrap the given (sub)system in a (sub)system, and
% return the newly created wrapper (sub)system's name (beginning from top
% system)
%
% PARAMETERS
% ------
% system : string -- name of the (sub)system to be wrapped in a subsystem
%          eg: "mymodel/sub1/sub2/sub3"
%
% NOTES:
% ------
% If a name to a top-level system is passed in (eg: "mymodel"),
% 1. If there is no immediate (at first depth) subsystem named
%    "Subsystem" among the blocks to be wrapped, the newly created wrapper
%    subsystem will be named "Subsystem"
% 2. If there is an immediate subsystem named "Subsystem" among the
%    blocks to be wrapped, the newly created wrapper subsystem will be
%    named "Subsystem1"
% 3. If there are immediate subsystems named "Subsystem" and "Subsystem1"
%    among the blocks to be wrapped, the newly created wrapper subsystem
%    will be named "Subsystem2", and so on.

% - the top-level system will be wrapped inside a subsystem named
%   'Subsystem'
% If a name to a sub-system is passed in (eg: "mymodel/mysubsystem"),
% - the subsystem will be wrapped in another subsystem
% - the wrapper subsystem will take the original name of wrapped subsystem
% - the wrapped subsystem will be renamed to 'Subsystem'
% - Thus, the new hierarchy will look like:
%   "mymodel/mysubsystem/Subsystem"
%
% PORTS
% -----
% - The number of inports and outports remains the same as before.
% - The names of the inports and outports are preserved (both outside
%   and inside the wrapper subsystem). For example, if a name to a
%   top-level system (having inport ip and outport op) is passed, then the
%   resulting model will have:
%   1. an inport named ip
%   2. an outport named op
%   3. a subsystem containing all blocks of the original top-level system.
%      These blocks include the following:
%      1. an inport named ip
%      2. an outport named op
%      3. all other blocks of original top-level system
%
% IMPORTANT:
% ------
% This function is NOT IDEMPOTENT. Executing this function
% multiple times keeps on wrapping the (sub)system by another subsystem in
% a nested fashion. So, use with caution.
%
% ASSUMPTIONS:
% ------
% - The corresponding mdl file is loaded.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    systemIsTopSystem = isTopSystem(system);

    blocks = find_system(system, 'SearchDepth', 1); % 1 is max_depth considered
    bh = [];
    % blocks{1} is the system itself, so start from the second entry
    for i = 2 : length(blocks)
        bh = [bh get_param(blocks{i}, 'handle')];
    end
    Simulink.BlockDiagram.createSubsystem(bh);

    if systemIsTopSystem
        blocks_after = find_system(system, 'SearchDepth', 1); % 1 is max_depth considered
        wrapperSystem = setdiff(blocks_after, blocks);
    else
        wrapperSystem = system;
    end
end
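The naming rule described in the NOTES of wrapInSystem ("Subsystem", then "Subsystem1", "Subsystem2", and so on) can be mimicked with plain strings. The sketch below assumes the rule generalizes exactly as the three numbered notes suggest. It does not call Simulink, and the list of existing block names is hypothetical.

```matlab
% Hypothetical names of the blocks at the first depth of a top-level system.
existing = ["Subsystem", "Gain", "Subsystem1"];

% Try "Subsystem" first, then "Subsystem1", "Subsystem2", ... until a name
% is found that no existing block uses.
candidate = "Subsystem";
k = 0;
while any(existing == candidate)
    k = k + 1;
    candidate = "Subsystem" + k;
end

disp(candidate);   % Subsystem2
```

The listing itself avoids predicting the name: it compares the block list before and after the wrap with setdiff, which stays correct even if the actual naming rule differs.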

References

[1] Matthew Stephan. Towards a cognizant virtual software modeling assistant using model clones. In 2019 IEEE/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER), pages 21–24. IEEE, 2019.

[2] Bhisma Adhikari, Eric Rapos, and Matthew Stephan. SimIMA: a virtual simulink intelligent modeling assistant. Software and System Modeling, 2021, under review.

[3] John Hutchinson, Mark Rouncefield, and Jon Whittle. Model-driven engineering practices in industry. In Proceedings of the 33rd International Conference on Software Engineering, pages 633–642, 2011.

[4] Marco Brambilla, Jordi Cabot, and Manuel Wimmer. Model-driven software engineering in practice. Synthesis lectures on software engineering, 3(1):1–207, 2017.

[5] Antonio Bucchiarone, Jordi Cabot, Richard F Paige, and Alfonso Pierantonio. Grand challenges in model-driven engineering: an analysis of the state of the research. Software and Systems Modeling, pages 1–9, 2020.

[6] Jordi Cabot, Robert Clarisó, Marco Brambilla, and Sébastien Gérard. Cognifying model-driven software engineering. In Federation of International Conferences on Software Technologies: Applications and Foundations, pages 154–160. Springer, 2017.

[7] Muhammad Asaduzzaman, Chanchal K Roy, Kevin A Schneider, and Daqing Hou. Cscc: Simple, efficient, context sensitive code completion. In 2014 IEEE International Conference on Software Maintenance and Evolution, pages 71–80. IEEE, 2014.

[8] Sebastian Proksch, Johannes Lerch, and Mira Mezini. Intelligent code completion with bayesian networks. ACM Transactions on Software Engineering and Methodology (TOSEM), 25(1):1–31, 2015.

[9] Veselin Raychev, Martin Vechev, and Eran Yahav. Code completion with statistical language models. In Conference on Programming Language Design and Implementation, pages 419–428, New York, NY, USA, 2014. ACM.

[10] Gunter Mussbacher, Benoit Combemale, Jörg Kienzle, Silvia Abrahão, Hyacinth Ali, Nelly Bencomo, Márton Búr, Loli Burgueño, Gregor Engels, Pierre Jeanjean, et al. Opportunities in intelligent modeling assistance. Software and Systems Modeling, 19(5):1045–1053, 2020.

[11] Matthew Stephan. Towards a Cognizant Virtual Software Modeling Assistant Using Model Clones. In Proceedings of the 41st International Conference on Software Engineering: New Ideas and Emerging Results, ICSE-NIER ’19, pages 21–24, Piscataway, NJ, USA, 2019. IEEE Press.

[12] Robert Reicherdt and Sabine Glesner. Slicing matlab simulink models. In 2012 34th International Conference on Software Engineering (ICSE), pages 551–561. IEEE, 2012.

[13] Shafiul Azam Chowdhury, Lina Sera Varghese, Soumik Mohian, Taylor T Johnson, and Christoph Csallner. A curated corpus of simulink models for model-based empirical studies. In 2018 IEEE/ACM 4th International Workshop on Software Engineering for Smart Cyber-Physical Systems (SEsCPS), pages 45–48. IEEE, 2018.

[14] Alexander Boll, Florian Brokhausen, Tiago Amorim, Timo Kehrer, and Andreas Vogelsang. Characteristics, potentials, and limitations of open source simulink projects for empirical research. Software and Systems Modeling, tbd(tbd):20pp, 2021. in press.

[15] Stuart Kent. Model driven engineering. In International Conference on Integrated Formal Methods, pages 286–298. Springer, 2002.

[16] Uml diagrams. https://www.uml.org. Accessed: 04-05-2020.

[17] Bran Selic. The pragmatics of model-driven development. IEEE software, 20(5):19–25, 2003.

[18] Florian Deissenboeck, Benjamin Hummel, Elmar Jürgens, Bernhard Schätz, Stefan Wagner, Jean-François Girard, and Stefan Teuchert. Clone detection in automotive model-based development. In 2008 ACM/IEEE 30th International Conference on Software Engineering, pages 603–612. IEEE, 2008.

[19] Eclipse modeling framework (emf). https://www.eclipse.org/modeling/emf/. Accessed: 04-05-2020.

[20] Papyrus real time. https://www.eclipse.org/papyrus-rt. Accessed: 04-05-2020.

[21] Eclipse foundation. https://www.eclipse.org. Accessed: 04-05-2020.

[22] Simulink. https://www.mathworks.com/products/simulink.html. Accessed: 04-05-2020.

[23] Mathworks. https://www.mathworks.com/?s_tid=gn_logo. Accessed: 04-05-2020.

[24] Matlab. https://www.mathworks.com/products/matlab.html. Accessed: 04-05-2020.

[25] Matlab app designer. https://www.mathworks.com/products/matlab/app-designer.html. Accessed: 05-01-2020.

[26] Guide. https://www.mathworks.com/help/matlab/creating_guis/about-the-simple-guide-gui-example.html. Accessed: 05-01-2020.

[27] Comparing guide and app designer. https://www.mathworks.com/products/matlab/app-designer/comparing-guide-and-app-designer.html. Accessed: 05-01-2020.

[28] Arun Lakhotia, Junwei Li, Andrew Walenstein, and Yun Yang. Towards a clone detection benchmark suite and results archive. In 11th IEEE International Workshop on Program Comprehension, 2003., pages 285–286. IEEE, 2003.

[29] Ira D Baxter, Andrew Yahin, Leonardo Moura, Marcelo Sant’Anna, and Lorraine Bier. Clone detection using abstract syntax trees. In Proceedings. International Conference on Software Maintenance (Cat. No. 98CB36272), pages 368–377. IEEE, 1998.

[30] Chanchal Kumar Roy and James R Cordy. A survey on software clone detection research. Queen’s School of Computing TR, 541(115):64–68, 2007.

[31] Stefan Bellon and R Koschke. Detection of software clones tool comparison experiment. In International workshop on source code analysis and manipulation. Montreal, 2002.

[32] Stefan Bellon. Vergleich von techniken zur erkennung duplizierten quellcodes. Master’s Thesis, Institut für Softwaretechnologie, Universität Stuttgart, Stuttgart, Germany, 2002.

[33] Rainer Koschke, Raimar Falke, and Pierre Frenzel. Clone detection using abstract syntax suffix trees. In 2006 13th Working Conference on Reverse Engineering, pages 253–262. IEEE, 2006.

[34] Raghavan Komondoor and Susan Horwitz. Effective, automatic procedure extraction. In 11th IEEE International Workshop on Program Comprehension, 2003., pages 33–42. IEEE, 2003.

[35] Jens Krinke. Identifying similar code with program dependence graphs. In Proceedings Eighth Working Conference on Reverse Engineering, pages 301–309. IEEE, 2001.

[36] Gilad Mishne, Maarten De Rijke, et al. Source code retrieval using conceptual similarity. In RIAO, volume 4, pages 539–554. Citeseer, 2004.

[37] Neil Davey, Paul Barson, Simon Field, Ray Frank, and D Tansley. The development of a software clone detector. International Journal of Applied Software Technology, 1995.

[38] Manar H Alalfi, James R Cordy, Thomas R Dean, Matthew Stephan, and Andrew Stevenson. Models are code too: Near-miss clone detection for simulink models. In 2012 28th IEEE International Conference on Software Maintenance (ICSM), pages 295–304. IEEE, 2012.

[39] Bakr Al-Batran, Bernhard Schätz, and Benjamin Hummel. Semantic clone detection for model-based development of embedded systems. In International Conference on Model Driven Engineering Languages and Systems, pages 258–272. Springer, 2011.

[40] Rainer Koschke. Identifying and removing software clones. In Software Evolution, pages 15–36. Springer, 2008.

[41] J Howard Johnson. Identifying redundancy in source code using fingerprints. In Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: software engineering-Volume 1, pages 171–183. IBM Press, 1993.

[42] Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou. Cp-miner: Finding copy-paste and related bugs in large-scale software code. IEEE Transactions on software Engineering, 32(3):176–192, 2006.

[43] J Howard Johnson. Navigating the textual redundancy web in legacy source. In Proceedings of the 1996 conference of the Centre for Advanced Studies on Collaborative research, page 16. IBM Press, 1996.

[44] Akito Monden, Daikai Nakae, Toshihiro Kamiya, Shin-ichi Sato, and Ken-ichi Matsumoto. Software quality analysis by code clones in industrial legacy software. In Proceedings Eighth IEEE Symposium on Software Metrics, pages 87–94. IEEE, 2002.

[45] J Howard Johnson. Substring matching for clone detection and change tracking. In ICSM, volume 94, pages 120–126, 1994.

[46] Jean Mayrand, Claude Leblanc, and Ettore Merlo. Experiment on the automatic detection of function clones in a software system using metrics. In ICSM, volume 96, page 244, 1996.

[47] Michael Toomim, Andrew Begel, and Susan L Graham. Managing duplicated code with linked editing. In 2004 IEEE Symposium on Visual Languages-Human Centric Computing, pages 173–180. IEEE, 2004.

[48] Cory J Kapser and Michael W Godfrey. “cloning considered harmful” considered harmful: patterns of cloning in software. Empirical Software Engineering, 13(6):645, 2008.

[49] Chanchal K Roy and James R Cordy. Nicad: Accurate detection of near-miss intentional clones using flexible pretty-printing and code normalization. In 16th Int. conf. on Program Compreh., pages 172–181. IEEE, 2008.

[50] Matthew Stephan, Manar H Alalfi, Andrew Stevenson, and James R Cordy. Towards qualitative comparison of simulink model clone detection approaches. In 2012 6th International Workshop on Software Clones (IWSC), pages 84–85. IEEE, 2012.

[51] Matthew Stephan and James R Cordy. Mumonde: A framework for evaluating model clone detectors using model mutation analysis. Software Testing, Verification and Reliability, 29(1-2):e1669, 2019.

[52] BA Wichmann, AA Canning, DL Clutterbuck, LA Winsborrow, NJ Ward, and DWR Marsh. Industrial perspective on static analysis. Software Engineering Journal, 10(2):69–75, 1995.

[53] Dhavleesh Rattan, Rajesh Bhatia, and Maninder Singh. Software clone detection: A systematic review. Information and Software Technology, 55(7):1165–1199, 2013.

[54] Chanchal K Roy, James R Cordy, and Rainer Koschke. Comparison and evaluation of code clone detection techniques and tools: A qualitative approach. Science of computer programming, 74(7):470–495, 2009.

[55] Nam H Pham, Hoan Anh Nguyen, Tung Thanh Nguyen, Jafar M Al-Kofahi, and Tien N Nguyen. Complete and accurate clone detection in graph-based models. In 2009 IEEE 31st International Conference on Software Engineering, pages 276–286. IEEE, 2009.

[56] Florian Deissenboeck, Benjamin Hummel, Elmar Juergens, Michael Pfaehler, and Bernhard Schaetz. Model clone detection in practice. In Proceedings of the 4th International Workshop on Software Clones, pages 57–64, 2010.

[57] Gail C Murphy, Mik Kersten, and Leah Findlater. How are java software developers using the eclipse ide? IEEE software, 23(4):76–83, 2006.

[58] Veselin Raychev, Martin Vechev, and Eran Yahav. Code completion with statistical language models. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 419–428, 2014.

[59] Marcel Bruch, Martin Monperrus, and Mira Mezini. Learning from examples to improve code completion systems. In Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering, pages 213–222, 2009.

[60] Tom M Mitchell et al. Machine learning. 1997.

[61] John R Koza, Forrest H Bennett, David Andre, and Martin A Keane. Automated design of both the topology and sizing of analog electrical circuits using genetic programming. In Artificial Intelligence in Design’96, pages 151–170. Springer, 1996.

[62] Paul Resnick and Hal R Varian. Recommender systems. Communications of the ACM, 40(3):56–58, 1997.

[63] Preeti Paranjape-Voditel and Umesh Deshpande. A stock market portfolio recommender system based on association rule mining. Applied Soft Computing, 13(2):1055–1063, 2013.

[64] Rakesh Agrawal, Tomasz Imieliński, and Arun Swami. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD international conference on Management of data, pages 207–216, 1993.

[65] Thomas G Dietterich. Ensemble methods in machine learning. In International workshop on multiple classifier systems, pages 1–15. Springer, 2000.

[66] Cha Zhang and Yunqian Ma. Ensemble machine learning: methods and applications. Springer, 2012.

[67] Omer Sagi and Lior Rokach. Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 8(4):e1249, 2018.

[68] Thomas Cover and Peter Hart. Nearest neighbor pattern classification. IEEE transactions on information theory, 13(1):21–27, 1967.

[69] Nir Friedman, Dan Geiger, and Moises Goldszmidt. Bayesian network classifiers. Machine learning, 29(2-3):131–163, 1997.

[70] Andrej Dyck, Andreas Ganser, and Horst Lichter. A framework for model recommenders requirements, architecture and tool support. In 2014 2nd International Conference on Model-Driven Engineering and Software Development (MODELSWARD), pages 282–290. IEEE, 2014.

[71] Tobias Kuschke, Patrick Mäder, and Patrick Rempel. Recommending auto-completions for software modeling activities. In International Conference on Model Driven Engineering Languages and Systems, pages 170–186. Springer, 2013.

[72] Ángel Mora Segura, Ana Pescador, Juan de Lara, and Manuel Wimmer. An extensible meta-modelling assistant. In 2016 IEEE 20th International Enterprise Distributed Object Computing Conference (EDOC), pages 1–10. IEEE, 2016.

[73] Sagar Sen, Benoit Baudry, and Hans Vangheluwe. Towards domain-specific model editors with automatic model completion. Simulation, 86(2):109–126, 2010.

[74] Steffen Mazanek, Sonja Maier, and Mark Minas. Auto-completion for diagram editors based on graph grammars. In 2008 IEEE Symposium on Visual Languages and Human-Centric Computing, pages 242–245. IEEE, 2008.

[75] Friedrich Steimann and Bastian Ulke. Generic model assist. In International Conference on Model Driven Engineering Languages and Systems, pages 18–34. Springer, 2013.

[76] Tanumoy Pati, Dennis C Feiock, and James H Hill. Proactive modeling: auto-generating models from their semantics and constraints. In Proceedings of the 2012 workshop on Domain-specific modeling, pages 7–12, 2012.

[77] Gunter Mussbacher, Benoit Combemale, Silvia Abrahão, Nelly Bencomo, Loli Burgueño, Gregor Engels, Jörg Kienzle, Thomas Kühn, Sébastien Mosser, Houari Sahraoui, et al. Towards an assessment grid for intelligent modeling assistance. In Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings, pages 1–10, 2020.

[78] Önder Babur and Matthew Stephan. Mocop: towards a model clone portal. In 2019 IEEE/ACM 11th International Workshop on Modelling in Software Engineering (MiSE), pages 78–81, Montreal, QC, Canada, 2019. IEEE.

[79] Francesco Basciani, Juri Di Rocco, Davide Di Ruscio, Amleto Di Salle, Ludovico Iovino, and Alfonso Pierantonio. Mdeforge: an extensible web-based modeling platform. In CloudMDE@MoDELS, pages 66–75, 2014.

[80] Andrej Dyck, Andreas Ganser, and Horst Lichter. On designing recommenders for graphical domain modeling environments. In International Conference on Model-Driven Engineering and Software Development, pages 291–299, 2014.

[81] Stephen Barrett, Patrice Chalin, and Greg Butler. Model merging falls short of software engineering needs. In Proc. of the 2nd Workshop on Model-Driven Software Evolution. Citeseer, 2008.

[82] Julia Rubin and Marsha Chechik. N-way model merging. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, pages 301–311, 2013.

[83] Davide Anguita, Luca Ghelardoni, Alessandro Ghio, Luca Oneto, and Sandro Ridella. The ‘k’ in k-fold cross validation. In ESANN, pages 441–446, 2012.

[84] Tadayoshi Fushiki. Estimation of prediction error by using k-fold cross-validation. Statistics and Computing, 21(2):137–146, 2011.

[85] James H Andrews, Lionel C Briand, and Yvan Labiche. Is mutation an appropriate tool for testing experiments? In Proceedings of the 27th international conference on Software engineering, pages 402–411, 2005.

[86] Chih-Wei Hsu, Chih-Chung Chang, Chih-Jen Lin, et al. A practical guide to support vector classification, 2003.

[87] Gunnar Schröder, Maik Thiele, and Wolfgang Lehner. Setting goals and choosing metrics for recommender system evaluations. In UCERSTI2 workshop at the 5th ACM conference on recommender systems, Chicago, USA, volume 23, page 53, 2011.

[88] Félix Hernández Del Olmo and Elena Gaudioso. Evaluation of recommender systems: A new approach. Expert Systems with Applications, 35(3):790–804, 2008.

[89] Chanchal K Roy and James R Cordy. A mutation/injection-based automatic framework for evaluating code clone detection tools. In International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pages 157–166, 2009.

[90] Pratiksha Gautam and Hemraj Saini. Mutation testing-based evaluation framework for evaluating software clone detection tools. In Reliability and Risk Assessment in Engineering, pages 21–35. Springer, 2020.

[91] Matthew Stephan, Manar H Alalfi, and James R Cordy. Towards a taxonomy for simulink model mutations. In 2014 IEEE Seventh International Conference on Software Testing, Verification and Validation Workshops, pages 206–215. IEEE, 2014.

[92] Romain Robbes and Michele Lanza. Improving code completion with program history. Automated Software Engineering, 17(2):181–212, 2010.

[93] Matthew Stephan and James R Cordy. A survey of model comparison approaches and applications. In Modelsward, pages 265–277, 2013.
