A Case Study on automated classification, mathematical model generation and solution space exploration of formalised knowledge

Master Thesis Submitted in Fulfillment of the Degree

Master of Science in Engineering (MSc)

University of Applied Sciences Vorarlberg Computer Science

Submitted to Rigger Eugen, Dr. Sc. ETH

Handed in by Christoph Bauer, BSc.

Dornbirn, August 2020

Statutory Declaration

I declare that I have developed and written the enclosed work completely by myself, and have not used sources or means without declaration in the text. Any thoughts from others or literal quotations are clearly marked. This Master Thesis was not used in the same or in a similar version to achieve an academic degree nor has it been published elsewhere.

Dornbirn, 16.8.2020 Christoph Bauer

Abstract

A Case Study on automated classification, mathematical model generation and solution space exploration of formalised product knowledge

This thesis aims to support the product development process. To this end, an approach for the automated solution space exploration of formally predefined design automation tasks, which hold the product knowledge of engineers, is developed, implemented as a prototype and evaluated. First, a classification of product development tasks based on their mathematical model is evaluated using the parameters defined in this thesis. In a second step, the mathematical model is solved. For this purpose, a Solver able to handle the given problem class is identified.

Due to the context of this work, the System Modelling Language (SysML) is chosen for the product knowledge formalisation. The given SysML model is first translated into an object-oriented model. This translation extracts information from a ".xml" file that follows the XML Metadata Interchange (XMI) standard. The information contained in the file is structured using the Unified Modelling Language (UML) profile for SysML. Afterwards, a mathematical model in the MiniZinc language is generated. MiniZinc is a mathematical modelling language that can be interpreted by many different Solvers. The generated mathematical model is classified according to the Variable Type and the linearity of its Constraints and Objective. The output is stored in a ".txt" file.

To evaluate the functionality of the prototype, the time consumed by the different procedures is measured. The data shows that models containing Continuous Variables take longer to classify and optimise. A further observation is that the time needed for the transformation into an object-oriented model and for the translation of this model into a mathematical representation depends on the number of SysML model elements. Using MiniZinc resulted in the restriction that models which use non-linear functions and Boolean Expressions cannot be solved, because the support for non-linear Solvers in MiniZinc is still under development. An investigation of the optimality of the results provided by the Solvers is left for further work.

Kurzreferat

A case study on the automated classification, mathematical model generation and solution space exploration of formalised product knowledge

This thesis aims to support the product development process. To this end, an approach for the automated exploration of the solution spaces of formally predefined design automation tasks, which contain the product knowledge of engineers, is sought, implemented and evaluated. For this purpose, a classification of product development tasks that relates to the representing mathematical model is evaluated on the basis of the parameters defined in this thesis. In a second step, the mathematical model is to be solved. A Solver is identified that is able to handle the given problem class.

Owing to the context of this thesis, the System Modelling Language (SysML) is chosen for formalising the product knowledge. In a next step, the given SysML model has to be translated into an object-oriented model. This translation is realised by extracting information from an Extensible Markup Language (XML) file using the XML Metadata Interchange (XMI) standard. The information contained in the file is structured using the Unified Modelling Language (UML) profile for SysML. Subsequently, a mathematical model is generated in the MiniZinc language. MiniZinc is a mathematical modelling language that can be interpreted by many different Solvers. The generated mathematical model is classified with respect to the Variable Type and the linearity of the Constraints and the Objective of the model. The output is stored in a ".txt" file.

To assess the functionality of the prototype, the time required by the different procedures is measured. These measurements show that models containing Continuous Variables take longer to be classified and optimised. A further observation shows that the transformation into an object-oriented model and the translation of this model into a mathematical representation depend on the number of SysML model elements. The use of MiniZinc led to the restriction that models which use non-linear functions and Boolean Expressions cannot be solved. The reason is that the implementation of non-linear Solvers in MiniZinc is still in the development phase. An investigation of the optimality of the results delivered by the Solvers is proposed for further work.

Acknowledgements

During the work process of this thesis I have received a great deal of support and assistance.

First of all I would like to thank my supervisor Dr. Sc. ETH Eugen Rigger. His expertise as well as his motivating manner were a great support during the work process. He gave me convincing guidance and encouraged me to be professional and do the right thing, even when the road became tough. Without his continued support, the goal of this project would not have been achieved.

Special thanks go to the team members of V-Research, who took the time for discussions and helped me with their expertise during the working process.

I would also like to thank the lecturers and the women in the secretariat of the "Computer Science" course at the University of Applied Sciences Vorarlberg for accompanying and supporting me on my academic path.

Personally, I would like to thank my family, my girlfriend and my friends, who have supported me both in good and stressful times. More particularly, I want to thank my parents Doris and Markus for supporting me in going my way and following my goals. I would like to thank Lukas, my brother, above all for his support in difficult situations of my life like the final phase of this thesis.

This thesis has been partially supported and funded by the Austrian Research Promotion Agency (FFG) via the “Austrian Competence Center for Digital Production” (CDP) under the contract number 854187.

Contents

Statutory Declaration
List of Figures
List of Tables

1 Introduction
  1.1 Motivation
  1.2 Context
  1.3 Research question
  1.4 Overview

2 State of the Art
  2.1 System Modelling Language (SysML)
    2.1.1 Concept of SysML
    2.1.2 Template
  2.2 Model transformation
    2.2.1 Meta Object Facility (MOF)
    2.2.2 UML Profile
    2.2.3 XML Metadata Interchange (XMI)
  2.3 Mathematical Modelling
    2.3.1 Mathematical Modelling Languages
    2.3.2 Problem Classification
    2.3.3 Solvers
  2.4 Setup
  2.5 Summary

3 Use Cases
  3.1 Laboratory Experiments
  3.2 Field Experiment
  3.3 Evaluation process

4 Conceptual approach
  4.1 XMI transformation
  4.2 Mathematical Model generation
  4.3 Problem Classification
  4.4 Solving

5 Implementation
  5.1 Structure
  5.2 Starting point
    5.2.1 Prototype-based extensions
  5.3 Mathematical model generation
  5.4 Classification
  5.5 Solving

6 Result

7 Discussion
  7.1 XMI transformation
  7.2 Mathematical model generation
  7.3 Problem classification
  7.4 Solving

8 Conclusion

Bibliography

List of Figures

2.1 Structure of SysML
2.2 Task Definition Diagram, showing the structure of the template
2.3 Sketch of a convex and concave set
2.4 Unimodal function (a), multimodal function (b)
3.1 Knapsack problem overview
3.2 Knapsack SysML Architecture
3.3 Knapsack SysML Objective
3.4 Connection of a Crank with positive Shaft Hub Connection
3.5 Cross section of a positive Shaft Hub Connection
4.1 Conceptual procedure
4.2 Schematic representation of a regression model (blue, line) with residual (red, dashed line) for observed points (yellow, dots)
5.1 Prototype architecture
5.2 Sequential diagram process overview
5.3 Class Diagram: Object-oriented depiction of the SysML model
5.4 Objective with optimisation target
5.5 Constraint with multiplicities activated
5.6 Constraint Block holding a Decision Table
5.7 Specification of a variable range
6.1 Object-oriented model generation time of the Experiments
6.2 Mathematical model generation time of the Experiments
6.3 Classification time of the Experiments
6.4 Solving time of the Experiments

List of Tables

4.1 Evaluated Solver selection
5.1 Translation table
6.1 Knapsack: Number of elements of the different models
6.2 Shaft Hub Connection: Number of elements of the different models
6.3 Evaluated times in milliseconds

Acronyms

SysML  System Modelling Language
UML  Unified Modelling Language
XML  Extensible Markup Language
XMI  XML Metadata Interchange
MOF  Meta Object Facility
DA  Design Automation
KBE  Knowledge Based Engineering
CDS  Computational Design Synthesis
AI  Artificial Intelligence
INCOSE  International Council on Systems Engineering
OMG  Object Management Group
RFP  Request for Proposal
LP  Linear Programming
NLP  Non-linear Programming
MIP  Mixed Integer Programming
MINLP  Mixed Integer Non-linear Programming
CP  Constraint Programming
CSP  Constraint Satisfaction Problem
COP  Constraint Optimisation Problem
Gecode  Generic Constraint Development Environment
CBC  Coin-or branch and cut
Couenne  Convex Over and Under Envelopes for Non-linear Estimation
Ipopt  Interior Point Optimizer
mXparser  Math Expression Evaluator
DOM  Document Object Model
RSS  Residual Sum of Squares

1 Introduction

Two targets of the ongoing change towards Industry 4.0 are intelligent information acquisition and the development of intelligent assistance systems. In this context, humans can be seen as the providers of domain knowledge [Kli20, p. 161]. This change aims at enhancing the efficiency of production and logistics. It also provides new opportunities for other areas of a company, one of which is the product development process. Intelligent support of the design process increases efficiency and opens up new possibilities. Design Automation (DA) is defined by Rigger [Rig+19] as the application of computational methods and tools that support the design process through the automation of design tasks. He clarifies that two major fields of research contribute to DA: Knowledge Based Engineering (KBE) and Computational Design Synthesis (CDS).

Rocca [Roc12, p. 161] describes KBE as a technology standing at the intersection of Artificial Intelligence (AI), Computational Design Synthesis (CDS) and computer programming. He defines KBE as a technology enabling the automation of repetitive design tasks and supporting multidisciplinary design optimisation to reduce the time and cost of product development. This work focuses on the aspects of CDS related to design automation tasks that support solution space exploration and on creating reasonable alternatives based on computationally encoded knowledge [Cha+11].

1.1 Motivation

This work aims at supporting the product development process by defining an approach for automated solution space exploration of formally predefined design automation tasks. In general, defining and solving these tasks currently takes a long time. First, the knowledge engineer has to acquire the knowledge of a product by interviewing the engineers who designed it. These engineers are the domain experts in this process. In the next step, the product knowledge is formalised, for which a large number of tools are available. In the last step, an expert in mathematical modelling has to convert the formalised product knowledge into a mathematical model, which can then be solved or optimised.

The described process takes a long time and involves a minimum of three people. A process with this many participants always carries certain risks, for example ambiguities in communication or incorrect formalisations due to false assumptions. To avoid such risks, two problems have to be addressed. First, a generic approach for product knowledge formalisation has to be defined. This would give the domain experts the opportunity to formalise their product knowledge independently. The project that provides the context of this work, described in section 1.2, relates to this problem. The System Modelling Language (SysML) is used as the modelling language to formalise product knowledge. It enables the User to build up a model by defining its structure and behaviour in different diagram types (see Section 2.1). The second problem relates to the translation of the formalised product knowledge into a mathematical model that can be solved by a computer. The domain experts should be able to trigger this process based on their self-elaborated SysML model. Subsequently, the expert should receive a result that depends on the goal formalised as the Objective in the SysML model. This result can be:

• a number of optimised values in case of an optimisation task, depending on the possible solutions,

• a feasible, but not necessarily optimal, solution that satisfies the given Constraints, or

• an error indicating that the Constraints of the model cannot be satisfied.

1.2 Context

This thesis is related to ongoing efforts to develop a software studio for design automation task definition. The goal is to build an environment that supports engineers during the product development process. To this end, it links CAD parts and assemblies into a visual formalisation of products, in which the domain knowledge of the product is specified in SysML. Furthermore, the software should be able to perform model analysis and optimisation tasks in an automated way. This thesis was started to evaluate an approach for this part of the software. It covers the model analysis and optimisation part as well as an example implementation that performs these tasks automatically.

The relation to the software imposes some constraints on the planned prototype:

• Java as the programming language

• Reusability of the approach

• Encapsulation of used third party software

1.3 Research question

The objective of this case study is to automatically transform a product knowledge formalisation of an optimisation task from the product development process, modelled in SysML, into a mathematical model describing the problem, and to evaluate this model using optimisation algorithms based on the goal of the formalised knowledge. The following research questions arise in this context:

Is there a classification of product development tasks, formalised in SysML, that relates to the representing mathematical model? What effect do the Solvers have on this classification? Do any restrictions have to be observed in order to automatically convert the visual formalisation of the task into its mathematical equivalent?

To structure the work further, this task is divided into the following parts, which provide the necessary basis for answering the primary questions:

For the classification task, which classification can be evaluated to identify the different types of product development tasks and the parameters meaningful for identification? Under what conditions is it possible to group the product development tasks into classes for a generic solving process, and how could these problem classes be determined?

As a second part, the solving process is evaluated. It has to be determined how many problem classes the Solvers can handle and which configuration they require. Several factors should be clarified: Which criteria influence the selection of the Solvers? Are there any open source Solvers available for the related problem class? How can an optimiser be automatically selected and configured based on the specific problem class?

It should be noted that the optimisation of the Solvers' results is not part of this thesis.

1.4 Overview

First, the current state of the art of the topics related to the goals was examined. The results of this examination, described in chapter 2, are taken as the starting point for the different parts of the thesis. In chapter 3 the evaluated Use Cases are described. The steps for the automated model conversion and evaluation are explained in chapter 4. The implementation of the prototype is the subject of chapter 5. The results are described in chapter 6 and discussed in chapter 7. Chapter 8 sums up the thesis and gives an outlook on further studies.

2 State of the Art

This chapter outlines the current state of the art of the different aspects of this thesis: the model transformation process related to SysML, mathematical modelling languages, the classification of the mathematical model and the evaluated Solvers. It also provides an overview of the tools related to this study.

2.1 System Modelling Language (SysML)

In 2003 Cris Kobryn organised the SysML Partners, an informal association of systems engineering and software modelling tool experts, to develop a dialect of UML version 2 [Gro03]. SysML is used for the specification, analysis, design, validation and verification of systems. These systems can depict processes and products. The goal of formalising a system with SysML is a standardised representation of the system for further processing. SysML was developed to extend UML to support the needs of the systems engineering community. These needs were evaluated and described by the International Council on Systems Engineering (INCOSE) and the Object Management Group (OMG). A link to the specification "UML 2 For System Engineering Request for Proposal (RFP)" can be found in the bibliography [Obj20b]. A second implementation of SysML is the related UML Profile, which extends the UML standard with the SysML-specific parts [Obj20a].

Visual modelling languages like SysML make it possible to transfer the product knowledge of engineers into an abstract model depicted by diagrams. Diagrams are different views of the model. Different diagram types enable the designer to represent the behaviour and structure of a system in a formal graphical way [Obj20b]. In order to meet the requirements of the current state of the art, SysML was chosen as the formalisation method for depicting the systems.

2.1.1 Concept of SysML

SysML is based on UML2. Figure 2.1 depicts the structure of SysML. The basic distinction is made between behaviour and structure. The Behaviour Diagrams located on the left side of figure 2.1 represent the behaviour, functionality, data flows and control mechanisms of a model. The Structure Diagrams, located on the right side, represent the structure and dependencies, Constraints and various types of allocation of a model. For this case study, only Structure Diagrams are used to formalise product knowledge [Obj20b].

Figure 2.1: Structure of SysML Source: [Obj20b]

Model and View

To formalise product knowledge, its Value Properties and Constraints, SysML follows an object-oriented approach [Obj19, p. 174]. This encapsulates the model. A diagram is a depiction of the model in a specific context. Diagrams support the development process of systems. In the context of this work, diagrams are used to develop and depict Use Cases in a human-interpretable way. The model of a system is defined by the following elements:

Block is a basic structural element. It describes the structure of an element or system and its characteristics by defining aspects such as Value Properties, owned Blocks and Constraints.

Property is a structural feature of a Block. Friedenthal, Moore and Steiner [FMS09] describe the main types as follows:

1. Part property defines the usage of a Block in context of an enclosing Block.

2. Reference property defines the usage of a Block, which is not part of the enclosing Block.

3. Value property defines a quantifiable property.

Connector describes the relation between two model elements, as in UML. It is used to define dependencies, multiplicities and inheritance [Obj19, p. 53].

Package organises the model. It is therefore not an essential part of the model, but it makes models easier to understand.

To structure the model and make it interpretable for Users, diagrams are used. In this thesis, two diagram types are mainly used to depict the models and their dependencies.

Parametric Diagram is used to depict mathematical relations of the system, for example Constraints. This diagram enables the User to clarify the relationship between two Variables. It is possible to connect different Constraint Blocks in a Parametric Diagram to depict more complex system bounds or to build up complex calculations [IBM14].

Block Definition Diagram is used to define structural features of a system. In these diagrams, relations between different Blocks, such as Associations or Generalisations, are specified. It also defines the features of a Block [No 20].

2.1.2 Template

The SysML template used here was described in 2019 by Rigger [Rig+19]. It is intended to structure the formalisation of the related product knowledge.

Figure 2.2: Task Definition Diagram, showing the structure of the template. Source: self-elaboration

Figure 2.2 shows the package diagram, which contains the overview of the formalised product knowledge. This Task Definition Diagram gives an overview of the structure of the template. It is divided into three main packages, which split the system into its main parts:

1. Inputs

2. Goals

3. Requirements

Inputs

The input package is used to formalise the domain knowledge of the system. It structures the knowledge according to its context into four sub-packages. The first is the "Component Library", which contains all related components of the system. These components can be parts and assemblies that are needed to specify the system. The second sub-package is named "Geometry" and relates to all components (both assemblies and parts) that are needed to depict the design automation task of the system. The third sub-package is called "Interconnections" and is used to formalise the relations between parts of the system. The last sub-package, named "Modifications", holds the modifications of the system [Rig+19, p. 40].

Goals

This package is used to define the Constraints of the system as well as the Objective of the formalised optimisation task related to the product knowledge. It is structured into four sub-packages. The Objective of the optimisation task is formalised in the sub-package "Objective Function". The Constraints, which tighten the domain of the problem, are separated into spatial and functional Constraints and placed in the corresponding sub-packages. The fourth sub-package contains the "Design Variables" [Rig+19, p. 42].

Requirements

The textual explanation of a system is located in this package. Diagrams in this package should explain the context of the system to the viewer. All information that explains the problem textually, as well as diagrams that are not needed for solving the system, should be placed in this package.

2.2 Model transformation

This section describes the applied technologies and standards used to extract the information of a SysML model.

2.2.1 Meta Object Facility (MOF)

MOF describes the concept of formal meta models as well as the mapping between different MOF-conformant models, for example the mapping between SysML and XMI. MOF can therefore be seen as a language used to specify modelling languages [Obj19].

2.2.2 UML Profile

A UML Profile is an extension of a reference metamodel that enables the User to adapt and customise the referenced model [uml20]. An example of such a profile is the SysML extension for UML Diagrams [Obj20a].

2.2.3 XML Metadata Interchange (XMI)

XMI is a standardised OMG format that describes the storage of data and information based on meta models. It is used to transfer data and information between software development tools. In this context, the data is stored in an Extensible Markup Language (XML) file [Gro15].

2.3 Mathematical Modelling

This section gives an overview of the mathematical characteristics of the mathematically formalised model and links them to the problem classes identified during this study. As a starting point, the problems handled by the Solvers described in section 2.3.3 were used to identify the basic problem classes. These classes and their possible combinations are introduced in section 2.3.2 "Problem Classification". Section 2.3.1 evaluates possible mathematical modelling languages as the input for the Solvers.

2.3.1 Mathematical Modelling Languages

The following sections describe the evaluated modelling languages under the conditions of this thesis, as specified in section 1.3.

MiniZinc

MiniZinc is a free, open source modelling language. It enables high-level, Solver-independent modelling of mathematical problems [MDC20]. MiniZinc also supports reusable code: functionality can be predefined in separate files and reused in different models. Another strength of MiniZinc is the availability of predefined Global Constraints [PMG18].
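To make the idea of a Solver-independent model more concrete, the following Java sketch assembles a small 0/1 knapsack problem as MiniZinc text. It is only an illustration under stated assumptions: the class `KnapsackMiniZincSketch`, its methods and the item data are hypothetical and are not taken from the prototype described in this thesis.

```java
import java.util.StringJoiner;

/** Hypothetical sketch: renders a 0/1 knapsack problem as MiniZinc source text. */
final class KnapsackMiniZincSketch {

    /** Joins integer values into a MiniZinc array literal such as [1, 2, 3]. */
    static String arrayLiteral(int[] values) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (int value : values) {
            joiner.add(Integer.toString(value));
        }
        return joiner.toString();
    }

    /** Builds the model text: parameters, one 0..1 Decision Variable per item, one Constraint, one Objective. */
    static String render(int[] weights, int[] profits, int capacity) {
        if (weights.length != profits.length) {
            throw new IllegalArgumentException("weights and profits must have the same length");
        }
        StringBuilder model = new StringBuilder();
        model.append("int: n = ").append(weights.length).append(";\n");
        model.append("int: capacity = ").append(capacity).append(";\n");
        model.append("array[1..n] of int: weight = ").append(arrayLiteral(weights)).append(";\n");
        model.append("array[1..n] of int: profit = ").append(arrayLiteral(profits)).append(";\n");
        // Decision Variables: take[i] = 1 if item i is packed, 0 otherwise.
        model.append("array[1..n] of var 0..1: take;\n");
        // Constraint: the packed weight must not exceed the capacity.
        model.append("constraint sum(i in 1..n)(weight[i] * take[i]) <= capacity;\n");
        // Objective: maximise the packed profit.
        model.append("solve maximize sum(i in 1..n)(profit[i] * take[i]);\n");
        return model.toString();
    }

    public static void main(String[] args) {
        // Illustrative data only.
        System.out.print(render(new int[] {12, 2, 1, 4}, new int[] {4, 2, 1, 10}, 15));
    }
}
```

The generated text contains only linear expressions over Discrete Variables, so the same file could in principle be handed to different Solvers without changing the model.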

GAMS

GAMS is described as a high-level mathematical modelling system containing a language, a compiler and Solvers. It is therefore able to perform mathematical programming and optimisation. In contrast to MiniZinc, this modelling system is neither open source nor free to use [Cor].

AMPL

AMPL is a modelling tool supporting the entire optimisation modelling life cycle: the development of mathematical models, testing and deployment to the related Solvers. Like the other tools, it integrates a high-level modelling language. Like GAMS, this tool is mainly intended for commercial use [inc20].

2.3.2 Problem Classification

The visual model holding a specific problem needs to be classified in order to select a Solver that fits the problem requirements. The classes are closely related to the problems the Solvers are able to handle. In the context of the Solvers, these problems are distinguished by the methods used in the solving process. The relevant literature distinguishes the following methods:

Linear Programming (LP) describes techniques to minimise or maximise linear functions [Van20, p. 6].

Non-linear Programming (NLP) describes techniques to minimise or maximise non-linear functions [Van20, p. 6].

Mixed Integer Programming (MIP) describes techniques for solving problems in which at least one Decision Variable is restricted to integer values [Fro11].

Mixed Integer Non-linear Programming (MINLP) has the same characteristic as MIP, with the addition that at least one of the functions representing the Objective and the Constraints is non-linear [Bel+13].

Constraint Programming (CP) is a technique for solving problems structured as described under "Constraint Problem" below.

By evaluating these problem types, the following characteristics and problem classes are judged to be valid for classification:

Constraint Problem

Constraint Problems are problems identified by Constraints on their Variables. Nonobe and Ibaraki [NI98] define this problem type by a triple (X, D, C) where:

• X = {X1, ..., Xn} represents a finite set of Variables

• D = {D1, ..., Dn} represents the domains of the Variables, where each Di is a set of, in general, discrete values

• C = {C1, ..., Cm} represents a set of Constraints

In contrast to the problem classes mentioned below, a Constraint Problem is a class related to the structure of a given problem. In the modelling of knowledge-based systems, limitations due to complex equation systems and variable boundaries are important parts. These parts can be evaluated with the help of the classes specified below [NEO19]. Thapper [TJC10] mentions two types of Constraint Problems, identified by their target:

1. Constraint Satisfaction Problem (CSP)

2. Constraint Optimisation Problem (COP)

Solving a CSP means finding a value for each Decision Variable without violating any of the given Constraints. Satisfaction problems do not require the best solution to be found; they are used to prove the consistency of the given task.

A COP is a Constraint Problem with an additional optimisation function, called the Objective. The goal of the problem is therefore to find the best solution or solutions for the given Objective.
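The triple (X, D, C) and the difference between a CSP and a COP can be illustrated with a small Java sketch. It enumerates all assignments exhaustively, which is only feasible for tiny discrete domains and is not how Solvers such as Gecode work internally. The class `ConstraintProblemSketch` and the example Constraints are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

/** Hypothetical sketch: a Constraint Problem (X, D, C) solved by exhaustive enumeration. */
final class ConstraintProblemSketch {

    final int[][] domains;                     // D: domains[i] lists the values allowed for Variable X_i
    final List<Predicate<int[]>> constraints;  // C: each Constraint accepts or rejects a full assignment

    ConstraintProblemSketch(int[][] domains, List<Predicate<int[]>> constraints) {
        this.domains = domains;
        this.constraints = constraints;
    }

    /** CSP: returns the first assignment that violates no Constraint, or null if none exists. */
    int[] satisfy() {
        List<int[]> feasible = enumerate(true);
        return feasible.isEmpty() ? null : feasible.get(0);
    }

    /** COP: returns a feasible assignment that maximises the Objective, or null if none exists. */
    int[] maximise(ToIntFunction<int[]> objective) {
        int[] best = null;
        for (int[] candidate : enumerate(false)) {
            if (best == null || objective.applyAsInt(candidate) > objective.applyAsInt(best)) {
                best = candidate;
            }
        }
        return best;
    }

    private List<int[]> enumerate(boolean stopAtFirst) {
        List<int[]> feasible = new ArrayList<>();
        search(0, new int[domains.length], feasible, stopAtFirst);
        return feasible;
    }

    /** Depth-first enumeration: assigns Variable by Variable and checks all Constraints at the leaves. */
    private void search(int index, int[] assignment, List<int[]> feasible, boolean stopAtFirst) {
        if (stopAtFirst && !feasible.isEmpty()) {
            return;
        }
        if (index == domains.length) {
            for (Predicate<int[]> constraint : constraints) {
                if (!constraint.test(assignment)) {
                    return;
                }
            }
            feasible.add(assignment.clone());
            return;
        }
        for (int value : domains[index]) {
            assignment[index] = value;
            search(index + 1, assignment, feasible, stopAtFirst);
        }
    }

    public static void main(String[] args) {
        int[][] domains = {{0, 1, 2, 3}, {0, 1, 2, 3}};
        List<Predicate<int[]>> constraints = new ArrayList<>();
        constraints.add(a -> a[0] + a[1] <= 4);
        constraints.add(a -> a[0] != a[1]);
        ConstraintProblemSketch problem = new ConstraintProblemSketch(domains, constraints);
        System.out.println(Arrays.toString(problem.satisfy()));
        System.out.println(Arrays.toString(problem.maximise(a -> 2 * a[0] + a[1])));
    }
}
```

For the example data, the satisfaction call stops at the first consistent assignment (0, 1), whereas the optimisation call has to compare all feasible assignments and returns (3, 1) for the Objective 2·X1 + X2.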

Linear Problem

The main characteristic of this problem type is the linearity of its Constraints and Objective. In contrast to CSPs, the variable domains of a linear problem can be Continuous or Discrete. Similar to a COP, the goal of this problem class is to find the optimum solution or solutions for a given Objective [Wri16].
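As an illustration, a linear problem can be written in a common textbook form. The symbols c, A, b and x are notation introduced here for this sketch and do not appear elsewhere in the thesis; for a Discrete linear problem the Variables would be restricted to integer values instead.

```latex
\begin{aligned}
\min_{x} \quad & c^{\top} x \\
\text{subject to} \quad & A x \le b, \\
& x \in \mathbb{R}^{n}
\end{aligned}
```

Both the Objective and every Constraint row are linear in x, which is the defining property of this problem class.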

Non-linear Problem

Like a linear problem, a non-linear problem is an optimisation problem. The difference, as the name suggests, is that at least one of the given Constraints or the Objective is a non-linear function [Wri16].

Variable Type

Besides the functions used to classify the given problem, the Value Properties have to be examined. The Variable Type of the given properties considerably influences the solving process of a given optimisation or satisfaction task. The range of a single Value Property can be Discrete or Continuous. Which Variable Types occur determines whether a problem is a Discrete, Continuous or Mixed Integer problem.
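How the Variable Type and the linearity of the functions combine into the programming classes listed at the beginning of this section can be sketched in Java as follows. This is a simplified reading of the definitions above, not the classification procedure of the prototype; the class `ProblemClassSketch` and its enums are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: derives a problem class from Variable Types and linearity. */
final class ProblemClassSketch {

    enum VariableType { DISCRETE, CONTINUOUS }

    enum DomainKind { DISCRETE, CONTINUOUS, MIXED_INTEGER }

    enum ProgrammingClass { LP, NLP, MIP, MINLP }

    /** Classifies a problem by the Variable Types that occur in it. */
    static DomainKind domainKind(List<VariableType> variables) {
        if (variables.isEmpty()) {
            throw new IllegalArgumentException("a problem needs at least one Variable");
        }
        boolean discrete = variables.contains(VariableType.DISCRETE);
        boolean continuous = variables.contains(VariableType.CONTINUOUS);
        if (discrete && continuous) {
            return DomainKind.MIXED_INTEGER;
        }
        return discrete ? DomainKind.DISCRETE : DomainKind.CONTINUOUS;
    }

    /**
     * Combines the Variable Types with the linearity of the Objective and Constraints.
     * Following the MIP definition above, one integer Variable is enough for MIP or MINLP.
     */
    static ProgrammingClass programmingClass(List<VariableType> variables, boolean allFunctionsLinear) {
        boolean hasIntegerVariable = domainKind(variables) != DomainKind.CONTINUOUS;
        if (hasIntegerVariable) {
            return allFunctionsLinear ? ProgrammingClass.MIP : ProgrammingClass.MINLP;
        }
        return allFunctionsLinear ? ProgrammingClass.LP : ProgrammingClass.NLP;
    }

    public static void main(String[] args) {
        List<VariableType> mixed = Arrays.asList(VariableType.DISCRETE, VariableType.CONTINUOUS);
        System.out.println(domainKind(mixed));
        System.out.println(programmingClass(mixed, false));
    }
}
```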

Convexity

A set in Euclidean space is convex if the line segment connecting any pair of its points lies entirely within the set [Wei].

Figure 2.3: Sketch of a convex and concave set Source: [Wei]

Figure 2.3 depicts the difference between convex and concave result spaces. These result spaces are given by the modelled system.
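The definition of a convex set can also be stated formally. The symbols S, x, y and λ are notation introduced here for illustration:

```latex
S \subseteq \mathbb{R}^{n} \text{ is convex}
\iff
\forall x, y \in S,\ \forall \lambda \in [0, 1]:\quad \lambda x + (1 - \lambda)\, y \in S
```

A set for which at least one such connecting point falls outside S is not convex, which corresponds to the second sketch in Figure 2.3.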

Modality

The modality of a function is defined by the number of its minima. A unimodal function therefore has one minimum, whereas a multimodal function has more than one.

Figure 2.4: Unimodal function (a), multimodal function (b) Source: [DD08]

2.3.3 Solvers

Solver is a generic term for programs, libraries or applications that take a given mathematical problem and solve it depending on the problem type and target. Problem types are described in 2.3.2. The target can be satisfaction or optimisation. The Solvers described below have been selected to provide a suitable Solver for each type of problem. Furthermore, they can be addressed by MiniZinc without additional effort.

COIN-OR Branch and Cut (CBC)

CBC is taken into account for linear problems. This Solver implements a branch-and-cut algorithm. It is used to solve Discrete linear problems [COI20].

Generic Constraint Development Environment (Gecode)

Gecode is a constraint programming system for solving CSPs and COPs. It was first released in 2005. Benefiting from a large community, the system is still under development [STL].

Convex Over and Under Envelopes for Non-linear Estimation (Couenne)

Couenne is a non-linear Solver handling Continuous and Discrete Variables in convex and concave solution spaces. It implements a branch and bound algorithm to solve the given problem [Bel].

Interior Point Optimizer (Ipopt)

This Solver is designed to find local solutions for large-scale non-linear optimisation problems. To this end, it implements an interior point method for optimisation [COI].

2.4 Setup

The prototype is written in Java 1.8 using Gradle as the build automation tool. To realise the prototype, the following libraries were included.

Math Expression Evaluator (mXparser)

The mXparser is a library for parsing and evaluating mathematical expressions represented by Strings [Gro].

Commons-Math 3

Commons Math is a mathematics library. The prototype implemented during this thesis uses the library for analysing the given functions of the Constraints and the Objective [Fou16].

Document Object Model (DOM) Parser

The DOM Parser defines an Interface, which builds up a tree to access and edit the parsed XML file. The tree structure ensures a simple handling and traversing of the parsed document [Tut20].

2.5 Summary

The described technologies are able to take care of the different parts of the product development process. Following the workflow described in section 1.1, handling the described parts takes a long time and requires skills from three different disciplines: product engineering, mathematics and computer science. Shane, Paredis, Burkhart and Schafer [Sha+12] conducted research combining SysML and mathematical programming, using GAMS for automated component sizing of hydraulic systems. The technologies taken into account for this thesis are influenced as follows. The formalisation of the product knowledge in SysML is given by the context of this thesis. For the translation of the formalised product knowledge, a standardised approach that is easy to implement, comprehensible and customised for SysML should be found. The Solvers define the outlines for the mathematical representation of the product knowledge. To provide reusability of the mathematical model, a way should be found to address multiple Solvers using the same model. Therefore, the mathematical modelling languages described in section 2.3.1 are evaluated. The classification of the mathematical model is influenced by the problem types that can be handled by the Solvers. Consequently, the parameters of the different problem types have to be identified. In a next step, a way to evaluate these parameters has to be found. Based on these parameters, a Solver for the mathematical model based on the product knowledge has to be chosen. In the context of this thesis, a main target is to implement a prototype able to handle the described parts automatically. As an investigation into global or local optimality is beyond the scope of this study, this point is left for further research.

3 Use Cases

In the context of this work, practical Use Cases from real-life processes tend to be large in Size and highly Complex. The Size of a Use Case is specified by the number of Decision Variables and Parameters that influence the result of the Solver. Complexity is measured by the number, dependencies and calculation effort of the given Constraints. This makes it difficult to identify the specific requirements of the problem classes. To analyse the Use Cases and check whether the prototype handles different problem classes correctly, it is necessary to break down the Complexity. Therefore, the Use Cases are split into two types.

3.1 Laboratory Experiments

This type of Use Case is chosen to check the functionality of the implemented prototype. It is characterised by being adaptable in scale and Complexity. As a representative Use Case which is easy to handle in Size and Complexity, the Knapsack problem is used.

Figure 3.1: Knapsack problem overview Source: [Bec20]

Figure 3.1 depicts the Knapsack problem. The goal is to maximise the value of the packed items on condition that the maximum load of the knapsack is not exceeded. Therefore, it is an optimisation problem. The boxes are the available items. Each item has three parameters: the price in Euro on top, the weight of the item in kilograms and the number of available items of this kind.
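Written as a mathematical program, the Knapsack problem described above takes the following form. The symbols are assumptions chosen for this illustration: $m$ is the number of item kinds, $v_i$ the price, $w_i$ the weight and $n_i$ the available number of item kind $i$, $W$ the maximum load and $x_i$ the number of packed items of kind $i$.

```latex
\max_{x} \; \sum_{i=1}^{m} v_i \, x_i
\quad \text{subject to} \quad
\sum_{i=1}^{m} w_i \, x_i \le W,
\qquad
x_i \in \{0, 1, \dots, n_i\}, \quad i = 1, \dots, m
```

In this form the Objective and the single Constraint are both linear and all Decision Variables are Discrete.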

A reduction in Size can easily be realised by reducing the number of items of the problem. The Complexity decreases by changing the scope of the problem to maximising the sum of the packed items' weight. A second way of changing Complexity is to vary the function calculating the value or weight.

Figure 3.2: Knapsack SysML architecture Source: self-elaborated

Figure 3.2 shows an overview of the Knapsack problem formalised in SysML. As depicted, the boxes have become Blocks with their Value and Weight as Value Properties. The relation of packable Items is depicted as an Association with multiplicities displaying the available numbers of a specific Item.

Figure 3.3: Knapsack SysML Objective Source: self-elaborated

Figure 3.3 depicts a representative visual formalisation of a Constraint. This specific case shows the Objective of the problem. It is the sum of the values of the different Items, each multiplied by its number. Reducing the number of Items reduces the Size of the problem. Complexity is already low because the problem has neither complex relations between its Constraints nor a large number of them.

3.2 Field Experiment

These experiments prove the usability of the implemented prototype for real-world problems. One representative problem is the positive Shaft Hub Connection.

Figure 3.4: Connection of a Crank with positive Shaft Hub Connection Source: [Jac19]

Figure 3.4 shows an example of the positive Shaft Hub Connection of a crank to another component. As depicted in the red circle, the shaft is connected to the hub by geometric touch. This geometric touch is realised by a key recessed into the shaft. The hub has a recess fitting the key on the shaft.

Figure 3.5: Cross section of a positive Shaft Hub Connection Source: [Jac19]

Figure 3.5 depicts a cross section of the connection type for a better understanding. As shown, the Forces (F) act to the left and to the right.

Due to the form of the two connected components, a connection is given which transmits the force to the flat marked in red. For this type of connection the components do not have to be permanently connected to each other. It is therefore possible to separate them [Jac19].

This problem was formalised using SysML. The given optimisation problem is described by “minimise the key's length”. In the background there are Constraints calculating the necessary length, depth for the recess for the key, diameter of the key way depth of the hub and more. Because of these characteristics, the Use Case is more complex than the laboratory ones. It is also more relevant to the field of engineering because it occurs frequently in practice. For example, the gearbox of a car has many Shaft Hub Connections to fasten the gearwheels.

3.3 Evaluation process

In order to evaluate the Use Cases described, two aspects are examined. First, the number of elements of the models is inspected. Second, the time the prototype needs for each of the following steps is measured:

• Transformation from product knowledge formalisation to object-oriented model.

• Translation from object-oriented model to mathematical model.

• Classification of the mathematical model.

• Solving the mathematical model.

By providing the data, the functionality of the prototype is proven for the individual Use Cases. In a second step the given data are explained and compared to each other to draw conclusions about the functionality of the prototype and the behaviour of the individual problems.
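One possible way to record the time of each of the four steps is sketched below. The class `StageTimerSketch` and the stage names are hypothetical; the empty lambdas only stand in for the real work, and the thesis does not state how the prototype measures its times.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: class and stage names are hypothetical. */
final class StageTimerSketch {

    // LinkedHashMap keeps the stages in the order in which they were measured.
    private final Map<String, Long> elapsedMillis = new LinkedHashMap<>();

    /** Runs one stage and records its wall-clock duration in milliseconds. */
    void time(String stage, Runnable work) {
        long start = System.nanoTime();
        work.run();
        elapsedMillis.put(stage, (System.nanoTime() - start) / 1_000_000L);
    }

    Map<String, Long> results() {
        return elapsedMillis;
    }

    public static void main(String[] args) {
        StageTimerSketch timer = new StageTimerSketch();
        // Empty lambdas stand in for the real work of each step.
        timer.time("transformation", () -> { });
        timer.time("translation", () -> { });
        timer.time("classification", () -> { });
        timer.time("solving", () -> { });
        timer.results().forEach((stage, ms) -> System.out.println(stage + ": " + ms + " ms"));
    }
}
```

`System.nanoTime()` is used instead of `System.currentTimeMillis()` because it is intended for measuring elapsed time.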

4 Conceptual approach

The given research field can be seen as an interdisciplinary research area. It includes the domain-specific input as a model coming from the field of engineering. Mathematical analysis is used to examine the given problem class of a model, which leads to the Solver able to evaluate the solution of the identified problem class. The process is automated through the use of various Java libraries and programs.

Figure 4.1: Conceptual procedure Source: self-elaborated

Figure 4.1 shows the identified main parts of the process. To enable the automated model conversion and evaluation, a SysML formalisation of product knowledge has to pass these steps. Step one, XMI transformation, translates the given SysML model into an object-oriented model manageable by the prototype. In the second step the object-oriented model is used to generate the related mathematical model based on the Objective and Constraints, avoiding the use of unnecessary parts of the given SysML model. It is possible that the product expert formalises some information for better human understandability which is not necessary for the automated solving process. The following step, Problem Classification, analyses the mathematical model and maps it to a problem class based on the parts needed for the model. In the last step, Solving, the Solver for the given problem class is selected and the model is processed. The following sections give a further explanation of the four parts of the process.

4.1 XMI transformation

The first step, XMI transformation, translates the information of the SysML model stored in an XMI file into an object-oriented model. This decouples the following steps from the input format and ensures that the method can be easily reused.

For this transformation the User has to export the model to a ".xml"-file in XMI format using the UML profile for SysML. This format is used because it has been adapted for SysML and can therefore represent specific SysML elements. These elements can be identified by their tags starting with "sysml", which simplifies information extraction.

4.2 Mathematical Model generation

The second step takes care of the creation of the mathematical model. It handles the transformation process from the given object-oriented model into a mathematical model understandable to the different Solvers. In order to reduce Complexity, a general way should be found to address the Solvers. For this reason a mathematical modelling language is used.

In line with the scope of this work, MiniZinc has been chosen as the mathematical modelling language. The selection was made due to the requirement of it being open source and supporting reusability. MiniZinc possesses these properties [MDC20]. During the formalisation process, the following parts of the mathematical model have to be defined in a ".mzn"-file:

1. Value Properties are Variables with a fixed value of type "int" or "float".

2. Decision Variables are Variables evaluated during the solving process.

3. Constraints have to be satisfied by all Decision Variables for an assignment to be a valid solution to the given model.

4. Solving target In case of an optimisation problem the Objective and the target have to be specified (minimise, maximise). In case of a satisfaction problem the Objective contains the String "satisfy".

5. Output is an optional part of the file, to specify the shown values of each solution of the model.
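The five parts listed above can be illustrated with a very small MiniZinc model, assembled here as a Java String. The model content (one item kind with value 5, weight 4 and at most three pieces, capacity 10) and the class name are made up for this sketch and are not taken from the thesis. Note that MiniZinc itself uses the keywords `maximize` and `minimize`.

```java
/** Sketch only: a hand-written toy model, not output of the prototype. */
final class MiniZincPartsSketch {

    /** Returns the content of a minimal ".mzn"-file containing all five parts. */
    static String buildModel() {
        return String.join("\n",
            "int: capacity = 10;            % 1. Value Property with a fixed value",
            "var 0..3: x;                   % 2. Decision Variable",
            "constraint 4 * x <= capacity;  % 3. Constraint",
            "solve maximize 5 * x;          % 4. Solving target and Objective",
            "output [\"x = \\(x)\\n\"];      % 5. optional Output",
            "");
    }

    public static void main(String[] args) {
        System.out.print(buildModel());
    }
}
```

The printed text can be saved as a ".mzn"-file and passed to any Solver that MiniZinc supports for integer problems.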

After compilation of the defined MiniZinc model, it is translated to FlatZinc, a low-level Solver input language captured in one file. During the translation, the model runs through a process of flattening and post-flattening methods. This process reduces and simplifies the generated FlatZinc model, which improves the consistency of the model. The most representative steps of the flattening process are:

• Parameter substitution replaces each globally assigned parameter by its atomic literal value and removes the declaration and assignment.

• Built-ins evaluation evaluates all calls to built-ins that have fixed, atomic literal arguments.

• Comprehension unrolling unrolls all set and array comprehensions once their generator ranges are fully reduced.

• if-then-else evaluation evaluates each if-then-else expression as soon as its conditions are completely reduced.

The post-flattening process depends on decomposition of expressions, conjunction splits and conversion. A more detailed description of the translation step can be found in "MiniZinc: Towards a Standard CP Modelling Language" by Nethercote et al., published in the book "Principles and Practice of Constraint Programming" [Net+07].

4.3 Problem Classification

This step analyses the given mathematical model related to the identified problem classes. As mentioned before, the identified classes are related to the Solvers able to handle the mathematical characteristics of the given optimisation problem. In summary, the class of a problem depends on its given Value Properties, Constraints and Objective. The leading characteristics are:

Variable Type is a characteristic of the Value Properties of the problem. For numerical Value Properties, the types understood by the Solvers are Continuous and Discrete Variables. These Variables are identifiable through their type. The User specifies the type of a Value Property as "Real" or "Integer". The prototype maps this type to its equivalent, "float" or "int". Alphabetical Value Properties have to be mapped to numerical representations if they have an impact on the given problem. For example, if one wants to optimise the cost of a beverage container, its shape has an impact on the formula evaluating the used material. Therefore, the shape has to be matched to the mathematical formula calculating the surface of the shape.

Boolean inputs are a special case of alphabetical inputs. The Boolean type is naturally a binary representation of the predefined values:

• True = 1

• False = 0 or any other value

Linearity is a characteristic of the Constraints and Objective. A linear problem has a linear Objective, and its Constraints form a system of linear equations and inequalities. For non-linear problems at least one of the mentioned parts is a non-linear function.

The following lines describe the process used to identify the Linearity of the given problem. For each Constraint and Objective it has to be checked whether the given function is linear. It is not possible to predict how many Variables are part of the function, as this depends on the input of the User. Therefore, the Linearity check requires a method that works regardless of the number of Variables. The approach chosen comes from regression analysis.

Figure 4.2: Schematic representation of a regression model (blue, line) with residual (red, dashed line) for observed points (yellow, dots) Source: self elaborated

Figure 4.2 is a schematic depiction of a Residual. It depicts a regression model as a thick blue line, the Observed Points calculated with the function to be evaluated as yellow dots and the Residual calculated for the Linearity check as a dashed red line. By squaring each Residual and summing the results, Linearity can be evaluated. A function is linear if the sum of all squared Residuals is zero, since all Observed Points then lie on the blue line of the regression model.

By using the Residual Sum of Squares (RSS), the number of independent Variables is not limited. These Variables are the parameters of the currently evaluated function. For a randomly chosen, standard normally distributed set of input parameters (x), the corresponding output is computed with the current function. This output is referred to below as the observed y-values (y).

$$\hat{y}_i = x_i' \beta, \quad i = 1, \dots, n \tag{4.1}$$

The equation above represents the calculation of the predicted y-values ($\hat{y}$) using a general regression model. $x_i'$ describes the input parameters (x) of observation $i$ as a transposed vector. $\beta$ is a vector of coefficients. $i$ indexes the $n$ observed points to be predicted. As mentioned before, a Residual represents the vertical deviation of the observed point from the prediction made by the regression model. This can be formalised as:

$$y_i = x_i' \beta + \epsilon_i \tag{4.2}$$

$\epsilon$ represents the error vector, which in this context is given by the Residuals. The following equation describes its calculation.

$$\epsilon_i = y_i - x_i' \beta = y_i - \hat{y}_i \tag{4.3}$$

Using formulas 4.1 to 4.3, the Residual of each point is calculated. Because Residuals can be negative, they are squared before being summed.

$$RSS = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{4.4}$$

Formula 4.4 represents the calculation of the RSS. Under the condition that the result is zero, the function to be evaluated is linear [DS19, p.15-46].
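The RSS-based Linearity check can be sketched in plain Java as follows. This is a stand-in under stated assumptions and not the prototype's code: the coefficients are estimated by ordinary least squares via the normal equations, an intercept column is added so that affine functions such as 2a - 3b + 1 count as linear, and the sample size, seed and tolerance `1e-6` are arbitrary choices. All class and method names are hypothetical.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Sketch only: plain-Java stand-in for the RSS check; names and tolerance are assumptions. */
final class LinearityCheckSketch {

    /** Solves the square system a * b = c by Gaussian elimination with partial pivoting. */
    static double[] solve(double[][] a, double[] c) {
        int n = c.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] rowTmp = a[col]; a[col] = a[pivot]; a[pivot] = rowTmp;
            double valTmp = c[col]; c[col] = c[pivot]; c[pivot] = valTmp;
            for (int r = col + 1; r < n; r++) {
                double factor = a[r][col] / a[col][col];
                for (int k = col; k < n; k++) a[r][k] -= factor * a[col][k];
                c[r] -= factor * c[col];
            }
        }
        double[] b = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double sum = c[r];
            for (int k = r + 1; k < n; k++) sum -= a[r][k] * b[k];
            b[r] = sum / a[r][r];
        }
        return b;
    }

    /** Fits y = x' * beta (including an intercept) and returns the Residual Sum of Squares. */
    static double rss(ToDoubleFunction<double[]> f, int variables, int samples, long seed) {
        Random random = new Random(seed);
        int p = variables + 1; // one extra column for the intercept
        double[][] x = new double[samples][p];
        double[] y = new double[samples];
        for (int i = 0; i < samples; i++) {
            double[] input = new double[variables];
            x[i][0] = 1.0;
            for (int j = 0; j < variables; j++) {
                input[j] = random.nextGaussian(); // standard normally distributed input
                x[i][j + 1] = input[j];
            }
            y[i] = f.applyAsDouble(input); // observed y-value
        }
        // Normal equations: (X'X) beta = X'y
        double[][] xtx = new double[p][p];
        double[] xty = new double[p];
        for (int i = 0; i < samples; i++) {
            for (int r = 0; r < p; r++) {
                xty[r] += x[i][r] * y[i];
                for (int k = 0; k < p; k++) xtx[r][k] += x[i][r] * x[i][k];
            }
        }
        double[] beta = solve(xtx, xty);
        double sumOfSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            double predicted = 0.0;
            for (int j = 0; j < p; j++) predicted += x[i][j] * beta[j];
            sumOfSquares += (y[i] - predicted) * (y[i] - predicted);
        }
        return sumOfSquares;
    }

    /** Treats a function as linear if its RSS is zero up to a small numerical tolerance. */
    static boolean isLinear(ToDoubleFunction<double[]> f, int variables) {
        return rss(f, variables, 1000, 42L) < 1e-6;
    }

    public static void main(String[] args) {
        System.out.println(isLinear(v -> 2 * v[0] - 3 * v[1] + 1, 2)); // linear in both variables
        System.out.println(isLinear(v -> v[0] * v[1], 2));             // product of two variables
    }
}
```

Because of floating-point rounding the RSS of a linear function is rarely exactly zero, which is why the sketch compares against a tolerance instead of testing for equality.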

Convexity is not taken into account due to the fact that most of the Solvers are able to handle convex and non-convex functions without being configured.

Modality is not taken into account as it is assumed that in future implementations the choice of local or global optimisation will be left to the User.

The described classifications are necessary to identify the Solver able to handle the mathematical problem related to the given system.

4.4 Solving

The last step, Solving, is related to the used Solvers. Its task is to take care of the solving process. Therefore, it handles the Solver input, configuration and output. In this step the model analysis output is evaluated to choose the Solver for the given problem class. In addition, the Solver is encapsulated to improve the maintainability and flexibility of the prototype.

Table 4.1: Evaluated Solver selection

Table 4.1 links the evaluated problem classes, combinations of Classification Parameters, to the Solvers chosen for the class and the optimisation target described in Section 2.3.2. The chosen Solvers have the ability to solve the problem with no need for additional configuration or value transformation.

5 Implementation

The following sections give an overview of the structure of the prototype and the implementation of the methodology for the applied mechanisms to achieve the desired goals of each main part. To handle the procedure of model transformation and to solve the mathematical model, a project was started. Besides the formulated research question, the implementation should aim at the following secondary goals:

• easy implementation of Solvers

• extendable in the context of model analysis

• extendable interface to address the chain of procedures

These goals have an influence on the chosen technologies and the implementation of the prototype.

5.1 Structure

The architecture of the Prototype is structured in different parts related to the main parts identified in chapter 4. Figure 5.1 depicts the different layers of the architecture (green) and the related Classes (blue) of the layer. This encapsulation provides the possibility of replacing or extending different parts easily.

Figure 5.1: Prototype architecture Source: self-elaborated

The prototype's architecture relates to the four aforementioned procedures shown in Figure 5.1. The Input layer is used as the entry point of the evaluation process. From this point on, the object-oriented model generated in the Input layer serves as the information provider.

Figure 5.2: Sequential diagram process overview Source: self-elaborated

Figure 5.2 depicts the sequential procedure of the process. First of all, the User defines the model to be solved by handing over the path to the XML file. The XMIParser loads the file and generates the object-oriented model. This model is transformed and stored in a ".mzn"-file holding the mathematical formalisation of the model. As a next step, the Value Properties and Constraints, as well as the Objective of optimisation problems, are examined as mentioned before. Afterwards, using the result of the model analysis, the Solver is chosen and the model is evaluated. The result is stored in a ".txt"-file.

5.2 Starting point

The initial situation of the process requires the domain knowledge of the User about the task to be solved. A User generates the graphical model using SysML. This model is the visual formalisation and has to hold all domain-relevant Constraints to ensure a correct solution. The formalised SysML model needs to be saved as a ".xml"-file with XMI content.

The prototype's input-related parts read out the information of the SysML model and generate an object-oriented model. Figure 5.3 depicts the whole structure of the object-oriented model after the translation process. It gives an overview of the classes needed to provide the information for the mathematical model generation. Comparing this object-oriented model to the model applied in the work of Shane, Paredis, Burkhart and Schafer [Sha+12], the following differences can be identified. First of all, the model used in the work of Shane et al. is related to GAMS. Therefore, the classes and grammatical depictions used for the different parts are related to the GAMS API. The currently available Java API for MiniZinc, JMiniZinc, created by Siemens [Sie20], is still under development. To avoid problems related to missing functionality of JMiniZinc, the connection of the prototype to MiniZinc is implemented as a process call. As a result, the depicted Classes are implemented following the object-oriented approach of SysML, explained by the related specification [Gro03, p.7-13]. Secondly, Shane et al. defined hierarchy relations to support features such as a model holding other models and a Variable relating to a model [Sha+12, p.5]. The feature of models holding other models is implemented by the Owner. This Interface enables the Classes implementing it to own other Classes. The relation of Variables to a single Model is described by the relation of the Model holding Model Elements. A Block implements this Abstract Class. The Block holds the related Value Properties. These Value Properties are the representation of the Variables. The table below describes the translation.

Figure 5.3: Class Diagram: Object-oriented depiction of the SysML model Source: self-elaborated

SysML component | Description | Java representation
Block | A Block is a representation of a structural element of SysML. |
Value Property | Value Properties are usually members of Blocks. A fixed value indicates a Parameter, while Variables are identified with a value range, value list or without a value. |
Constraint Block | Block defining system constraints as parametric equations or inequalities. |
Constraint Port | A Constraint Port is a connecting point for parameters of the Constraint. |
Association | Associations are used to depict the dependencies of different parts of the system. In addition, multiplicity is defined by Associations. |
Generalisation | Generalisations show one element as another's specialisation. |
Binding Connector | A Binding Connector is used to visualise the connection between Value Properties and Constraint Ports. |

Table 5.1: Translation table Source: self-elaborated

To translate the XMI content of the ".xml"-file, the W3C DOM parser is used. It reads the whole file and stores it in memory. By searching the tree for the different SysML-related tags, the different elements of the system are identified.

Listing 5.1: SysML tagged Block

Listing 5.1 shows a representative line identifying a SysML Block, with the base_Class attribute holding the ID of the specific element in the file.

Listing 5.2: Block packagedElement

As depicted in Listing 5.2, the characteristics of the Block are identified by its children. Tags, IDs and attributes are taken into account to identify the type of each child and its corresponding object-oriented representation. Translation of the model into Java code is realised by repeating these information-extracting steps for each SysML element.
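The lookup described above can be sketched with the DOM parser that ships with the JDK. The XML snippet is a made-up miniature and not the content of the original listings: the namespace URIs are placeholders, and the `xmi:id` and `name` attributes as well as the class name are assumptions. Only the `sysml` tag prefix, the `base_Class` attribute and the `packagedElement` tag are taken from the text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch only: hypothetical miniature XMI file and hypothetical class name. */
final class SysmlBlockScanSketch {

    static final String SAMPLE =
          "<xmi:XMI xmlns:xmi='http://example.org/xmi' xmlns:uml='http://example.org/uml'"
        + " xmlns:sysml='http://example.org/sysml'>"
        + "<uml:Model>"
        + "<packagedElement xmi:type='uml:Class' xmi:id='id1' name='Knapsack'/>"
        + "<packagedElement xmi:type='uml:Class' xmi:id='id2' name='Helper'/>"
        + "</uml:Model>"
        + "<sysml:Block base_Class='id1'/>"
        + "</xmi:XMI>";

    /** Returns the names of all packagedElements that are referenced by a sysml:Block tag. */
    static List<String> findBlockNames(String xmi) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xmi.getBytes(StandardCharsets.UTF_8)));

            // Pass 1: collect the IDs referenced by the SysML-tagged Block elements.
            Set<String> blockIds = new HashSet<>();
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element element = (Element) all.item(i);
                if ("sysml:Block".equals(element.getTagName())) {
                    blockIds.add(element.getAttribute("base_Class"));
                }
            }

            // Pass 2: resolve the IDs to the packagedElements that carry them.
            List<String> names = new ArrayList<>();
            NodeList packaged = doc.getElementsByTagName("packagedElement");
            for (int i = 0; i < packaged.getLength(); i++) {
                Element element = (Element) packaged.item(i);
                if (blockIds.contains(element.getAttribute("xmi:id"))) {
                    names.add(element.getAttribute("name"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XMI content", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findBlockNames(SAMPLE));
    }
}
```

The sketch relies on the default, non-namespace-aware parser configuration, so tag and attribute names are compared including their prefix.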

5.2.1 Prototype-based extensions

During the analysis of the ".xml"-file it was detected that some of the information necessary for the solving process is not specified by the standard SysML syntax. This missing information is:

Optimisation target is the information about the goal of the problem to be optimised. It is implemented in the prototype as a part of the Objective and separated by a comma. The target is specified by "minimise", "maximise" or "satisfy". Figure 5.4 depicts the Objective of one of the described Field Experiments. As shown, it minimises the length of the spring.

Figure 5.4: Objective with optimisation target Source: self-elaborated

Multiplicity of Value Properties relates to the fact that it cannot be assumed that the multiplicity of a Block affects its Value Properties. This knowledge has to be actively defined by the keyword "mul(...)". By using the keyword surrounding a Value Property, the prototype multiplies the value with the Multiplicity of the Block, as depicted in figure 5.5.

Figure 5.5: Constraint with multiplicities activated Source: self-elaborated

Decision Tables are a common method used by engineers to depict the selection of values based on parameters. These tables are stored as ".csv"-files. The path to the file is stored as the content of a Constraint Block, as depicted in figure 5.6. The Ports of the Block have to be connected with the Value Properties mentioned in the Decision Table.

Figure 5.6: Constraint Block holding a Decision Table Source: self-elaborated

Variable Range has to be defined for classes containing non-linear Constraints or Objective.

Figure 5.7: Specification of a variable range Source: self-elaborated

5.3 Mathematical model generation

To handle the task of creating a mathematical model, MiniZinc is used. As mentioned before, it is a mathematical model language, as well as a tool chain of compilation and translation of the model to FlatZinc.

To generate the content of the MiniZinc ".mzn"-file a bottom-up approach is used. This ensures that only model-relevant elements are transformed. At the beginning, the Objective is written to the content of the ".mzn"-file. Each Variable and Parameter used by this equation is stored for later transformation. The second step is to transform the Constraints. As before, the used Value Properties are stored.

The last step is to finalise the content with the stored Variables and Parameters used in Constraints and Objective. During this step the output String is specified. It is a String holding all Decision Variables to be able to show each decision made by the Solver for each solution, as well as the value of the Objective.
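The bottom-up generation can be sketched as follows. The class, the method signature and the regular expression used to find identifiers are assumptions made for this illustration; the prototype's own data structures are not shown in the text. The sketch writes the Objective first, then the Constraints, and finally only those declarations that were actually referenced. MiniZinc accepts the items of a model in any order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: hypothetical names; identifier detection is deliberately simplistic. */
final class BottomUpModelSketch {

    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Collects all identifiers that occur in an expression, in order of appearance. */
    static Set<String> identifiers(String expression) {
        Set<String> names = new LinkedHashSet<>();
        Matcher matcher = IDENTIFIER.matcher(expression);
        while (matcher.find()) {
            names.add(matcher.group());
        }
        return names;
    }

    /**
     * @param target       MiniZinc keyword "maximize" or "minimize"
     * @param declarations maps a name to its MiniZinc declaration, e.g. "var 0..3: x"
     */
    static String generate(String target, String objective, List<String> constraints,
                           Map<String, String> declarations) {
        StringBuilder model = new StringBuilder();
        Set<String> used = new LinkedHashSet<>();

        // Step 1: write the Objective and remember what it uses.
        model.append("solve ").append(target).append(' ').append(objective).append(";\n");
        used.addAll(identifiers(objective));

        // Step 2: write the Constraints and remember what they use.
        for (String constraint : constraints) {
            model.append("constraint ").append(constraint).append(";\n");
            used.addAll(identifiers(constraint));
        }

        // Step 3: declare only the used names and build the output String.
        List<String> outputs = new ArrayList<>();
        for (String name : used) {
            String declaration = declarations.get(name);
            if (declaration == null) {
                continue; // not a known Value Property, e.g. a built-in function name
            }
            model.append(declaration).append(";\n");
            if (declaration.startsWith("var ")) {
                outputs.add("\"" + name + " = \\(" + name + ")\\n\"");
            }
        }
        outputs.add("\"objective = \\(" + objective + ")\\n\"");
        model.append("output [").append(String.join(", ", outputs)).append("];\n");
        return model.toString();
    }

    public static void main(String[] args) {
        Map<String, String> declarations = new LinkedHashMap<>();
        declarations.put("x", "var 0..3: x");
        declarations.put("y", "var 0..2: y");
        declarations.put("capacity", "int: capacity = 10");
        declarations.put("unused", "int: unused = 7"); // never referenced, so never written
        System.out.print(generate("maximize", "5 * x + 3 * y",
            Arrays.asList("4 * x + 2 * y <= capacity"), declarations));
    }
}
```

Value Properties that are modelled for human readers only, such as `unused` in the example, do not reach the generated file.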

5.4 Classification

To classify the problem under the conditions given by Linearity and Variable Type, the prototype implements the following sub-tasks to perform the model analysis. The Variable Type is identified by the value type defined in the given SysML model. To realise the Linearity check, the dependencies of the given Constraints and Objective have to be evaluated. This step is needed to calculate the observed y-values for each function. These values are used for the Regression Analysis described in chapter 4.3.

To face this calculation task, a class was created to transform the String representation of the given function into a List of Objects. This representation is then transformed into Strings interpretable by the mXparser. In preparation for the regression task, 1000 standard normally distributed x-values in the range of [-10.000, 10.000] are created. Afterwards, the related y-values are calculated.

The last step of the evaluation process uses the Apache Commons Math 3 library to calculate the Residual Sum of Squares. The result of this calculation is checked, allowing for calculation error. If the RSS is zero for each function of the model, the model is marked as linear, following the approach described in Section 4.3.

5.5 Solving

This part of the prototype handles the selection of the Solver, taking into account the analysis carried out. The file created in the previous step is handed over to the MiniZinc Toolchain.

minizinc -o <filepath> --solver <solver-id> <modelpath>
Listing 5.3: Command line string

Listing 5.3 depicts the command to start the MiniZinc toolchain. First, the process is selected by "minizinc". The option "-o" directs the output of the Solver to a defined file identified by the "<filepath>". "--solver" is the argument followed by the ID of the Solver to be used for the given mathematical model. The final argument holds the path to the ".mzn"-file containing the mathematical model in MiniZinc language [PKG18a].
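A process call of this kind could be issued from Java roughly as follows. The class name, the file names and the Solver ID "gecode" are example values chosen for this sketch; only the "minizinc" executable and the options "-o" and "--solver" come from the text. The `main` method only prints the command, so the sketch also runs on a machine without MiniZinc installed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Sketch only: hypothetical class name and example argument values. */
final class MiniZincCallSketch {

    /** Builds the argument list: minizinc -o OUTPUT --solver SOLVER MODEL. */
    static List<String> command(String outputFile, String solverId, String modelFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("minizinc");
        cmd.add("-o");
        cmd.add(outputFile);
        cmd.add("--solver");
        cmd.add(solverId);
        cmd.add(modelFile);
        return cmd;
    }

    /** Starts the toolchain as an external process and returns its exit code. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Printed rather than executed; call run(...) where MiniZinc is on the PATH.
        System.out.println(String.join(" ", command("result.txt", "gecode", "model.mzn")));
    }
}
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces.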

6 Result

The first goal of this work, to find a generic way for solving different classes of problem tasks from the product development process, has been reached. A case study has been made to evaluate the following process:

1. Transformation of the given product knowledge formalisation into an object-oriented model.

2. Translation of the object-oriented model into a mathematical formalisation.

3. Classification of the task based on the previously mentioned parameters.

4. Evaluation of the model by a Solver related to the problem class.

The prototype is able to go through the above-mentioned process fully automatically. To determine the Parameters that are used for classification of the problem, the described Solvers are analysed. Due to the fact that a model containing one non-linear function is treated the same as a model containing only non-linear functions, only Experiments containing linear and non-linear functions have been evaluated.

The numbers of model elements in the models of the different procedures are shown in table 6.1 for the Laboratory Experiments and in table 6.2 for the Field Experiment. What is remarkable is the reduction of elements between the SysML model and the mathematical model. As shown in the tables, there are no Blocks and Relations in the mathematical model.

Table 6.1: Knapsack: Number of elements of the different models Source: self-elaborated

Table 6.2: Shaft Hub Connection: Number of elements of the different models Source: self-elaborated

Table 4.1 depicts the possible combinations of parameters. The column Chosen Solvers holds the name of the Solver evaluated for each problem class. As shown, there are nine possible combinations. The problem classes are identified by the classification parameters Variable Type and Linearity. These nine classes form the basis for choosing a Solver. Further generalisation or grouping could impair the understanding of the classified problem, since basic statements would be lost. To evaluate the process, the Laboratory Experiment described in Section 3.1 is customised for each of the mentioned classes: the Value Properties, Constraints or Objective are changed to match the problem class. In a second step, the prototype was started 1000 times for each problem and the times for the different parts were recorded.
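The measurement procedure described above can be sketched as follows. This is a minimal illustration in Python, not the prototype's code: the function `time_stages`, the stage names and the placeholder workloads are assumptions introduced only for this example.

```python
import time
from statistics import mean


def time_stages(stages, runs=1000):
    """Run every stage `runs` times; return per-stage durations in milliseconds."""
    timings = {name: [] for name, _ in stages}
    for _ in range(runs):
        for name, stage in stages:
            start = time.perf_counter()
            stage()
            timings[name].append((time.perf_counter() - start) * 1000.0)
    return timings


# Hypothetical stand-ins for the four parts of the prototype.
STAGES = [
    ("object_oriented_model", lambda: sorted(range(200))),
    ("mathematical_model", lambda: "".join(str(i) for i in range(200))),
    ("classification", lambda: sum(i * i for i in range(1000))),
    ("solving", lambda: max(range(500))),
]

timings = time_stages(STAGES, runs=1000)
averages = {name: mean(values) for name, values in timings.items()}
```

Keeping the raw samples per stage, rather than only the averages, is what makes boxplots of the distributions possible afterwards.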

Table 6.3: Evaluated times in milliseconds Source: self-elaborated

Table 6.3 holds the measured times the prototype needs for the different parts. As shown, the generation of the object-oriented model and the generation of the mathematical model each need less than 50 milliseconds on average. Classification is the most time-consuming part, taking between 725 and 5087 milliseconds. During the evaluation of the different classes, some issues were detected when using MiniZinc. The implementation of Solvers able to handle non-linear functions still has experimental status [PKG18b]. As a result, the prototype is able to evaluate these problem classes, but MiniZinc is not able to translate the model into FlatZinc. This happens, for example, when mathematical keywords like sqrt(...) or Boolean Expressions are used. For this reason, the second goal of this work, the implementation of Solvers related to the problem classes, only works properly for linear problems.

Figure 6.1: Object-oriented model generation time of the Experiments Source: self-elaborated

Figure 6.1 compares the times needed for parsing the product knowledge formalisation, stored in the ".xml"-file, into the object-oriented model. The transformation of the Laboratory Experiment is faster because of the lower number of elements shown in tables 6.1 and 6.2.

Figure 6.2: Mathematical model generation time of the Experiments Source: self-elaborated

The time needed for the translation of the object-oriented model into the mathematical model is compared in figure 6.2. The difference between the Laboratory Experiments and the Field Experiment can be explained by the number of elements to be translated.

Figure 6.3: Classification time of the Experiments Source: self-elaborated

The boxplots shown in figure 6.3 depict the time consumed by each Use Case for the classification procedure.

Figure 6.4: Solving time of the Experiments Source: self-elaborated

Figure 6.4 depicts the time consumption of each Solver specified by table 4.1. As shown, the Field Experiment is missing because of the use of non-linear Constraints and Decision Tables.

7 Discussion

In this chapter the findings and results of this case study are critically reflected. Therefore, it is structured according to the main parts of the study.

7.1 XMI transformation

The results described in chapter 6 show that it is possible to automatically process optimisation tasks in the area of product development. The starting point, a graphical formalisation in SysML, is traversed and transformed into an object-oriented representation of the model. For this sub-task it is helpful that there is a UML profile for SysML. The XMI export standardised by OMG is used to extract the needed information. This standard implements tags that can be identified as the different SysML Elements, so the information can be extracted easily. Figure 6.1 compares the times the prototype needs to generate the object-oriented representation of the given SysML model. As depicted in numbers in table 4.1 and graphically in figure 6.1, the time for generating the object-oriented model is nearly identical for the Field Experiments. The transformation of the Shaft Hub Connection takes longer because the SysML model of this experiment has 109 more elements.
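The tag-based extraction can be illustrated with a short sketch. It is written in Python with the standard library XML parser and is not the prototype's implementation. The namespace URIs, the sample document and the function `extract_blocks` are assumptions made for this example; exports from real modelling tools differ in namespaces and structure.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; a real export declares its own versions.
XMI_NS = "http://www.omg.org/spec/XMI/20131001"
UML_NS = "http://www.omg.org/spec/UML/20131001"
SYSML_NS = "http://www.omg.org/spec/SysML/20181001/SysML"

# Hypothetical, heavily reduced export of a model with one Block.
SAMPLE = f"""<xmi:XMI xmlns:xmi="{XMI_NS}" xmlns:uml="{UML_NS}" xmlns:sysml="{SYSML_NS}">
  <uml:Model xmi:id="m1" name="Knapsack">
    <packagedElement xmi:type="uml:Class" xmi:id="c1" name="Item">
      <ownedAttribute xmi:id="a1" name="weight"/>
      <ownedAttribute xmi:id="a2" name="value"/>
    </packagedElement>
    <packagedElement xmi:type="uml:Class" xmi:id="c2" name="Helper"/>
  </uml:Model>
  <sysml:Block xmi:id="s1" base_Class="c1"/>
</xmi:XMI>"""


def extract_blocks(xmi_text):
    """Map each Block name to the names of its attributes."""
    root = ET.fromstring(xmi_text)
    id_attribute = f"{{{XMI_NS}}}id"
    # Index all elements by xmi:id so stereotype references can be resolved.
    by_id = {el.get(id_attribute): el for el in root.iter() if el.get(id_attribute)}
    blocks = {}
    for stereotype in root.iter(f"{{{SYSML_NS}}}Block"):
        base = by_id.get(stereotype.get("base_Class"))
        if base is not None:
            blocks[base.get("name")] = [
                attribute.get("name") for attribute in base.findall("ownedAttribute")
            ]
    return blocks


blocks = extract_blocks(SAMPLE)
```

In this sketch the UML class "Helper" is ignored because no SysML stereotype refers to it, which mirrors the idea that only elements identifiable through the profile tags are taken over.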

A challenging part of the implementation comes from the field of engineering. Decision systems are stored as tables with Decision and Action Variables, the so-called Decision Tables. The tables show the specific conditions the Variables have to meet so that certain values are assumed. Furthermore, the tables are often provided by commercial Guidelines as PDF or Excel tables. To ease the integration of these tables, the prototype detects document paths in Constraints. The Decision Tables have to be stored in .csv format at the specified location. Predefined keywords mark a column as a decision or action Variable. During the information extraction process, Decision Tables are read and stored as an Object holding the information of the table. This workaround needs improvement so that Decision Tables can be integrated more efficiently.
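A minimal sketch of how such a table could be read and turned into an "if then else" Constraint is given below. The column prefixes `decision:` and `action:`, the cell format and both function names are hypothetical; the keywords actually used by the prototype are not reproduced here.

```python
import csv
import io

DECISION_PREFIX = "decision:"  # hypothetical keyword marking a decision column
ACTION_PREFIX = "action:"      # hypothetical keyword marking an action column


def read_decision_table(text):
    """Return one (condition, action) pair per table row."""
    rules = []
    for row in csv.DictReader(io.StringIO(text)):
        conditions = [
            f"{column[len(DECISION_PREFIX):]} {cell.strip()}"
            for column, cell in row.items()
            if column.startswith(DECISION_PREFIX)
        ]
        actions = [
            f"{column[len(ACTION_PREFIX):]} = {cell.strip()}"
            for column, cell in row.items()
            if column.startswith(ACTION_PREFIX)
        ]
        # Several columns of the same kind are joined by a conjunction.
        rules.append((" /\\ ".join(conditions), " /\\ ".join(actions)))
    return rules


def to_if_then_else(rules):
    """Render the rules as one MiniZinc-style if-then-else Constraint."""
    if not rules:
        raise ValueError("a Decision Table needs at least one row")
    parts = []
    for index, (condition, action) in enumerate(rules):
        keyword = "if" if index == 0 else "elseif"
        parts.append(f"{keyword} {condition} then {action}")
    return "constraint " + " ".join(parts) + " else false endif;"


# Hypothetical table: shaft diameter d decides the key width w.
TABLE = "decision:d,action:w\n<= 12,4\n<= 17,5\n"
constraint = to_if_then_else(read_decision_table(TABLE))
```

For the sample table this yields a single Constraint whose branches follow the row order, so earlier rows take precedence over later ones.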

7.2 Mathematical model generation

Related to the data in table 6.3, figure 6.2 represents the comparison of time consumed to generate the mathematical model from the object-oriented model. As shown, each problem class takes nearly the same time to be translated. As the evaluated models consist of the same number of elements, this result can be expected. For the generation of the mathematical model it is possible to interface each Solver separately. This would mean that the integration of a new Solver would require the entire mathematical model generation process to be rebuilt for each Solver. To avoid this, a possibility was sought to generalise the modelling process. For this reason an open source mathematical modelling language was applied.
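The idea of generating one Solver-independent model text can be sketched as follows, assuming MiniZinc as the target language. The class `ValueProperty`, the function `generate_minizinc` and the small knapsack-like example are illustrative assumptions and do not reproduce the prototype's generator.

```python
from dataclasses import dataclass


@dataclass
class ValueProperty:
    name: str
    lower: float  # integer bounds give a Discrete, float bounds a Continuous Variable
    upper: float


def generate_minizinc(properties, constraints, objective, sense="maximize"):
    """Emit Variable declarations, Constraints and the Objective as model text."""
    if sense not in ("maximize", "minimize"):
        raise ValueError("sense must be 'maximize' or 'minimize'")
    lines = [f"var {p.lower}..{p.upper}: {p.name};" for p in properties]
    lines += [f"constraint {c};" for c in constraints]
    lines.append(f"solve {sense} {objective};")
    return "\n".join(lines) + "\n"


# Hypothetical two-item knapsack.
model = generate_minizinc(
    [ValueProperty("take_a", 0, 1), ValueProperty("take_b", 0, 1)],
    ["3 * take_a + 4 * take_b <= 5"],
    "10 * take_a + 12 * take_b",
)
```

Because the output is plain model text, adding a further Solver does not require touching this generation step.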

The language used for the mathematical model is MiniZinc. It is an open source language and tool chain supporting the User in implementing mathematical models. The available IDE helps the User to learn quickly how to model in MiniZinc. During processing, the translation to FlatZinc and the flattening of the model improve the performance of the Solver by optimising the mathematical model. It has to be mentioned that MiniZinc is not able to handle multi-objective optimisation. One approach to this issue is to define a weighted optimisation function that combines the Objectives to be optimised.
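A weighted optimisation function of this kind can be sketched in a few lines. The helper `weighted_objective`, the two example Objectives and the weights are assumptions chosen for illustration; in practice the Objectives usually have to be brought to comparable scales before weighting.

```python
def weighted_objective(objectives, weights):
    """Combine several objective functions into a single weighted sum."""
    if len(objectives) != len(weights):
        raise ValueError("one weight per objective is required")

    def combined(solution):
        return sum(w * f(solution) for f, w in zip(objectives, weights))

    return combined


# Hypothetical example: minimise cost and mass of a design at the same time.
def cost(design):
    return 2.0 * design["length"]


def mass(design):
    return 0.5 * design["length"] * design["width"]


objective = weighted_objective([cost, mass], [0.75, 0.25])
score = objective({"length": 2.0, "width": 8.0})
```

The choice of weights expresses the relative importance of the Objectives and therefore changes which solution is considered optimal.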

The implementation of non-linear problems is limited because this part of MiniZinc still has experimental status. Therefore, some issues were detected. One of the most severe restrictions is the missing translation of Boolean Variables and Constraints into the related Integer representation. For this reason, MiniZinc is not able to process the chosen "if then else" representation of Decision Tables. Another limitation with the same cause is that some mathematical keywords, for example sqrt(..), need Boolean Constraints for their transformation to FlatZinc. Therefore, the prototype is not able to process models containing non-linear functions and Decision Tables. This limitation does not apply to all implemented mathematical keywords: the Power operator (pow(...)) works within non-linear functions. Therefore, it should be examined more closely which mathematical keywords work.

The described limitations were discovered during the evaluation of the implementation of non-linear Solvers. The MiniZinc Handbook only mentions the experimental status, the missing Boolean Variables and Boolean Constraints, and a limitation of the Variable Type for Ipopt. A further description of the experimental status is missing. The lack of mathematical keywords was found experimentally. The MiniZinc Handbook promises full integration for future releases [PKG18b].

MiniZinc provides the opportunity to tune the method of some Solvers. For example, the search algorithm of some Solvers can be set through Annotations. For this project this possibility is not analysed further, because the project aims at a general approach. Therefore, the granularity of the analysis was chosen to identify a wider range of classes to fit the problem specification. A short investigation of this topic showed that, for such fine-grained settings, the Size and Complexity of a problem, described in chapter 3, have to be evaluated. Measuring these parameters is a formidable task. A topic for further research could be to evaluate these two parameters in the context of MiniZinc for the different problem classes and related Solvers.

For the identified classes, the implementation of Solvers handling linear Discrete and Continuous problems works well. A model can be formalised in a general way by formalising, in this order, the:

1. Objective

2. Constraints

3. Value Properties

By following the specified order, it is ensured that only optimisation-relevant Value Properties are transformed into the mathematical model. This is necessary since unused Value Properties can negatively affect the Solvers. In a worst-case scenario, the Solver throws an error.
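The effect of this ordering can be illustrated with a small sketch: the Objective and the Constraints are scanned first, and only Value Properties that occur in them are kept. The function name, the regular expression and the example names are assumptions for illustration and do not describe the prototype's internals.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def used_value_properties(objective, constraints, declared):
    """Keep only declared Value Properties referenced by Objective or Constraints."""
    referenced = set(IDENTIFIER.findall(objective))
    for constraint in constraints:
        referenced.update(IDENTIFIER.findall(constraint))
    # Preserve the declaration order of the remaining Value Properties.
    return [name for name in declared if name in referenced]


# Hypothetical model: "colour" is declared but never used for optimisation.
kept = used_value_properties(
    objective="total_value",
    constraints=["total_weight <= 15", "total_value >= 0"],
    declared=["total_weight", "total_value", "colour"],
)
```

Filtering against the declared names means that function names such as sqrt or pow in an expression are ignored automatically.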

7.3 Problem classification

The chosen parameters for classification are suitable for selecting among the different evaluated open source Solvers. Variable Type is a strong parameter as long as the User sets the correct data type. To evaluate Linearity, the Sum of Squared Residuals of a given Constraint is calculated. The calculation effort of this method depends on the number of points taken into account. For the mentioned results, 1000 points are considered. For this number of points, the prototype needs between 800 and 2800 milliseconds on average, as shown in figure 6.3. Furthermore, it can be observed that the classification of non-linear problems is faster than the identification of linear classes in two out of three cases. The reason is that, to identify a linear problem, each Constraint as well as the Objective has to be checked.
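The principle of the linearity check can be sketched for a function of one Variable: sample the function, fit a straight line by ordinary least squares and compare the Sum of Squared Residuals with a tolerance. This is a simplified stand-in written in plain Python; the interval, the tolerance and the function names are assumptions, and the prototype relies on a library for this calculation.

```python
def sum_squared_residuals(function, lower, upper, points=1000):
    """Fit a straight line to sampled values and return the residual sum of squares."""
    step = (upper - lower) / (points - 1)
    xs = [lower + i * step for i in range(points)]
    ys = [function(x) for x in xs]
    mean_x = sum(xs) / points
    mean_y = sum(ys) / points
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def is_linear(function, lower, upper, points=1000, tolerance=1e-6):
    """Treat the function as linear if the line fit leaves almost no residual."""
    # The absolute tolerance is an assumed value and depends on the value scale.
    return sum_squared_residuals(function, lower, upper, points) <= tolerance


linear_result = is_linear(lambda x: 3 * x + 2, 0.0, 10.0)
quadratic_result = is_linear(lambda x: x ** 2, 0.0, 10.0)
```

The cost grows with the number of sampled points, which is consistent with the dependency on the point count described above.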

When comparing the times needed to analyse and classify the model, the following can be observed. For classes related to linear problems, each Constraint and the Objective have to be evaluated to ensure that the problem is linear. Therefore, the evaluation takes longer. The boxplots for the Continuous classes show nearly the same evaluation time. This can be explained by the fact that the non-linear function was evaluated last, so the time for evaluating these two classes is nearly the same. The significantly longer time needed to identify purely Continuous problems suggests that the library used to calculate the Sum of Squared Residuals (RSS) has more work with Continuous Variables.
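The timing behaviour described above follows from stopping at the first non-linear expression. The sketch below makes this visible by counting how many expressions were checked. The function `classify_linearity` and the string-based stub predicate are assumptions for illustration; a real check would evaluate each expression numerically.

```python
def classify_linearity(expressions, is_linear):
    """Return the class and the number of expressions that had to be checked."""
    checked = 0
    for expression in expressions:
        checked += 1
        if not is_linear(expression):
            # One non-linear expression is enough to decide the class.
            return "non-linear", checked
    return "linear", checked


def stub_is_linear(expression):
    """Hypothetical stand-in predicate that only inspects the text."""
    return "pow" not in expression and "sqrt" not in expression


all_linear = classify_linearity(["x + y <= 10", "x >= 0", "y >= 0"], stub_is_linear)
early_exit = classify_linearity(["pow(x, 2) <= 4", "x >= 0", "y >= 0"], stub_is_linear)
late_exit = classify_linearity(["x + y <= 10", "x >= 0", "pow(y, 2) <= 4"], stub_is_linear)
```

When the non-linear expression comes last, as in `late_exit`, just as many checks are needed as for a fully linear model, which matches the nearly identical times observed for the two Continuous classes.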

7.4 Solving

By using a mathematical modelling language, the group of Solvers taken into account grows without the need for a separate integration of each Solver. Chapter 2.3.3 describes the Solvers evaluated for this work. The reader has to keep in mind that the handling of Solvers is delegated to MiniZinc. The integration and capabilities of the Solvers when used through MiniZinc are in parts only sparsely described in the handbook. For this reason, the evaluation of the Solvers required experiments to test their functionality. The IDE shipped with MiniZinc simplifies these exploratory tests.
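Delegating the Solver handling to MiniZinc can be sketched as a lookup from problem class to Solver followed by a call to the MiniZinc command line tool with its `--solver` option. The mapping below is a placeholder and not the content of table 4.1; the Solver identifiers and the file name are assumptions, and the identifiers that are valid depend on the local installation.

```python
# Placeholder mapping; the evaluated assignment is the one given in table 4.1.
SOLVER_BY_CLASS = {
    ("discrete", "linear"): "gecode",
    ("continuous", "linear"): "cbc",
}


def build_command(variable_type, linearity, model_path):
    """Build the MiniZinc command line for the Solver of a problem class."""
    key = (variable_type, linearity)
    if key not in SOLVER_BY_CLASS:
        raise LookupError(f"no Solver registered for problem class {key}")
    return ["minizinc", "--solver", SOLVER_BY_CLASS[key], model_path]


command = build_command("discrete", "linear", "knapsack.mzn")
```

The resulting list could be passed to a process runner such as `subprocess.run`; registering a further Solver then only means adding an entry to the mapping.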

Comparing the different parts of Figure 6.4, it is remarkable that all Solvers need less than 1000 milliseconds on average. The large bodies of the boxplots for Continuous problem classes suggest that the time needed to optimise the problem varies widely. This can be explained by the way the algorithms work and by their starting point. A starting point close to the optimum of the problem leads to a short solving time.

8 Conclusion

The purpose of the present thesis is a case study of the process of automatic model analysis, translation and evaluation for models depicting formalisations of tasks from the product development process. A prototype was implemented for testing purposes and to prove the functionality of the developed chain of procedures. To transfer the visual formalisation, given in SysML, an XMI export has to be done. The prototype reads the file and generates an object-oriented model. For this process, the SysML-related tags of the UML Profile for SysML saved in the ".xml"-file are evaluated. Extracting the needed information by the related tags simplifies the implementation of the transformation process. Next, the generated object-oriented model has to be translated into a mathematical model. As a formalisation of the mathematical model, an open source modelling language has been chosen. The prototype implements a procedure to generate the content of the file in the MiniZinc language automatically. This file is the basis for the calculation procedure of the different Solvers. As a next step, the given problem has to be classified. The parameters chosen to distinguish the different classes are Variable Type and Linearity of the Objective and Constraints. According to these parameters the Solver is chosen. Solver-related communication is delegated to MiniZinc. The described implementation of the Solver has the following benefits:

• Generalisation of the Solver mathematical model

• Easy implementation of new Solvers

• Delegation of Solver-related communication to MiniZinc

To evaluate the implemented prototype, the time consumption of the different procedures of the prototype is measured for different Laboratory and Field Experiments. The measurements show, first, that problems containing Continuous Variables are the most difficult to classify and optimise. Secondly, the conversion into an object-oriented model and the translation into a mathematical model are highly dependent on the number of elements contained in the SysML model.

As a negative point, it has to be mentioned that the integration of Solvers dealing with non-linear problems is still under development. This becomes apparent in the fact that the non-linear Solvers do not support Boolean Variables or Boolean Expressions. As a result, problem classes identified as non-linear that contain Boolean Variables or Boolean Expressions are not solvable at the moment.

For this project, the restrictions for non-linear Solvers have to be seen as a temporary status. They disqualify the prototype itself for commercial usage as long as MiniZinc lacks a working non-linear solving process. From a scientific point of view, the use of MiniZinc has many benefits related to the automation process and should be kept in mind for further evaluation.

For a follow-up work, two aspects should be taken into account. First, the study of global optimality is suggested, which is not part of this work. Second, an investigation of the effects of Size, given by the number of elements, and Complexity, given by the number of connections, of the SysML model on the applied procedures should be carried out.

Bibliography

[Bec20] Klaus Becker. Fallstudie - Das Rucksackproblem. July 2020. url: https://www.inf-schule.de/grenzen/komplexitaet/rucksackproblem/station_loesungsalgorithmus (visited on 07/29/2020).

[Bel] Pietro Belotti. Couenne: a users manual. url: https://github.com/coin-or/Couenne/blob/master/doc/couenne-user-manual.pdf.

[Bel+13] Pietro Belotti et al. “Mixed-integer nonlinear optimization”. en. In: Acta Numerica 22 (May 2013), pp. 1–131. issn: 0962-4929, 1474-0508. doi: 10.1017/S0962492913000032. url: https://www.cambridge.org/core/product/identifier/S0962492913000032/type/journal_article (visited on 07/28/2020).

[Cha+11] Amaresh Chakrabarti et al. “Computer-Based Design Synthesis Research: An Overview”. en. In: Journal of Computing and Information Science in Engineering 11.2 (June 2011), p. 021003. issn: 1530-9827, 1944-7078. doi: 10.1115/1.3593409. url: https://asmedigitalcollection.asme.org/computingengineering/article/doi/10.1115/1.3593409/465824/ComputerBased-Design-Synthesis-Research-An (visited on 07/30/2020).

[COI20] COIN-OR Foundation. coin-or/Cbc. original-date: 2019-03-02T23:20:41Z. July 2020. url: https://github.com/coin-or/Cbc (visited on 07/30/2020).

[COI] COIN-OR Foundation. Ipopt: Documentation. url: https://coin-or.github.io/Ipopt/ (visited on 07/30/2020).

[Cor] GAMS Development Corp. GAMS. url: https://www.gams.com/products/gams/gams-language/ (visited on 07/30/2020).

[DD08] Sankha Deb and U.S. Dixit. “Intelligent Machining: Computational Methods and Optimization”. en. In: Machining. London: Springer London, 2008, pp. 329–358. isbn: 978-1-84800-212-8 978-1-84800-213-5. doi: 10.1007/978-1-84800-213-5_12. url: http://link.springer.com/10.1007/978-1-84800-213-5_12 (visited on 08/12/2020).

[DS19] Norman R Draper and Harry Smith. Applied regression analysis. English. OCLC: 1178727681. New Delhi: Wiley India Pvt. Ltd., 2019. isbn: 978-81-265-3173-8.

[Fou16] The Apache Software Foundation. Math - Commons Math: The Apache Commons Mathematics Library. 2016. url: https://commons.apache.org/proper/commons-math/ (visited on 07/30/2020).

[FMS09] Sanford Friedenthal, Alan Moore, and Rick Steiner. “OMG Systems Modeling Language (OMG SysML) Tutorial”. en. In: INCOSE International Council on Systems Engineering (Sept. 2009), p. 132.

[Fro11] Frontline Systems, Inc. Optimization Problem Types - Mixed-Integer and Constraint Programming. en. Library Catalog: www.solver.com. Jan. 2011. url: https://www.solver.com/integer-constraint-programming (visited on 07/28/2020).

[Gro] Mariusz Gromada. mXparser - Math Expressions Parser. en. Library Catalog: mathparser.org. url: https://mathparser.org/ (visited on 07/30/2020).

[Gro03] Object Management Group. SysML Open Source Project - What is SysML? Who created it? Library Catalog: sysml.org. 2003. url: https://sysml.org/index.html (visited on 07/27/2020).

[Gro15] Object Management Group. XML Meta Data Interchange (XMI) Specification.pdf. en. June 2015. url: https://www.omg.org/spec/XMI/2.5.1/PDF.

[IBM14] IBM. Using SysML parametric diagrams. en. Library Catalog: www.ibm.com. Oct. 2014. url: www.ibm.com/support/knowledgecenter/ssb2mu_8.2.1/com.ibm.rhp.sysml.doc/topics/rhp_c_dm_parametric_dgrms.html (visited on 07/28/2020).

[inc20] AMPL Optimization inc. AMPL. en-US. Library Catalog: ampl.com. 2020. url: https://ampl.com/products/ampl/ (visited on 07/30/2020).

[Jac19] Jacobs G. Welle-Nabe-Verbindung (V02). Dec. 2019. url: https://www.youtube.com/watch?v=F3SfxydBQ9c (visited on 07/30/2020).

[Kli20] Philipp Klimant. “Industrie 4.0 in der industriellen Praxis”. de. In: ed. by Yaman Kouli, Peter Pawlowsky, and Markus Hertwig. Wiesbaden: Springer Fachmedien Wiesbaden, 2020, p. 161. isbn: 978-3-658-22332-8 978-3-658-22333-5. doi: 10.1007/978-3-658-22333-5. url: http://link.springer.com/10.1007/978-3-658-22333-5 (visited on 07/30/2020).

[MDC20] Monash University, Data61, and CSIRO. MiniZinc - Software. 2020. url: https://www.minizinc.org/software.html (visited on 07/29/2020).

[NEO19] NEOS. Types of Optimization Problems. en. 2019. url: https://neos-guide.org/optimization-tree (visited on 07/28/2020).

[Net+07] Nicholas Nethercote et al. “MiniZinc: Towards a Standard CP Modelling Language”. en. In: Principles and Practice of Constraint Programming - CP 2007. Ed. by Christian Bessiere. Vol. 4741. Series Title: Lecture Notes in Computer Science. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007, pp. 529–543. isbn: 978-3-540-74969-1. doi: 10.1007/978-3-540-74970-7_38. url: http://link.springer.com/10.1007/978-3-540-74970-7_38 (visited on 07/29/2020).

[No 20] No Magic, Inc. SysML Block Definition Diagram. Library Catalog: docs.nomagic.com. 2020. url: https://docs.nomagic.com/display/SYSMLP190/SysML+Block+Definition+Diagram (visited on 07/28/2020).

[NI98] Koji Nonobe and Toshihide Ibaraki. “A tabu search approach to the constraint satisfaction problem as a general problem solver”. en. In: European Journal of Operational Research 106.2-3 (Apr. 1998), pp. 599–623. issn: 03772217. doi: 10.1016/S0377-2217(97)00294-4. url: https://linkinghub.elsevier.com/retrieve/pii/S0377221797002944 (visited on 07/28/2020).

[Obj19] Object Management Group. OMG Systems Modeling Language v1.6. en. Dec. 2019. url: https://www.omg.org/spec/SysML/1.6/PDF.

[Obj20a] Object Management Group. SysML FAQ: What is the relation between SysML and UML? 2020. url: https://sysmlforum.com/sysml-faq//labels//sysml-faq/what-is-relation-between-sysml-and-uml.html (visited on 08/13/2020).

[Obj20b] Object Management Group. What is SysML? en. 2020. url: https://www.omgsysml.org/what-is-sysml.htm (visited on 07/27/2020).

[PKG18a] Peter J. Stuckey, Kim Marriott, and Guido Tack. 3.1. The MiniZinc Command Line Tool - The MiniZinc Handbook 2.4.3. en. 2018. url: https://www.minizinc.org/doc-2.4.3/en/command_line.html (visited on 08/03/2020).

[PKG18b] Peter J. Stuckey, Kim Marriott, and Guido Tack. 3.3. Solving Technologies and Solver Backends - The MiniZinc Handbook 2.4.3. 2018. url: https://www.minizinc.org/doc-2.4.3/en/solvers.html (visited on 08/05/2020).

[PMG18] Peter J. Stuckey, Kim Marriott, and Guido Tack. 2.1. Basic Modelling in MiniZinc - The MiniZinc Handbook 2.4.3. 2018. url: https://www.minizinc.org/doc-2.4.3/en/modelling.html (visited on 07/29/2020).

[Rig+19] E. Rigger et al. Task Definition for Design Automation. ETH Zuerich, 2019. url: https://books.google.at/books?id=si9GzQEACAAJ.

[Roc12] Gianfranco La Rocca. “Knowledge based engineering: Between AI and CAD. Review of a language based technology to support engineering design”. en. In: Advanced Engineering Informatics 26.2 (Apr. 2012), pp. 159–179. issn: 14740346. doi: 10.1016/j.aei.2012.02.002. url: https://linkinghub.elsevier.com/retrieve/pii/S1474034612000092 (visited on 07/30/2020).

[STL] Christian Schulte, Guido Tack, and Mikael Z. Lagerkvist. “Modeling and Programming with Gecode”. en. In: (), p. 578.

[Sha+12] Aditya A. Shah et al. “Combining Mathematical Programming and SysML for Automated Component Sizing of Hydraulic Systems”. en. In: Journal of Computing and Information Science in Engineering 12.4 (Dec. 2012). issn: 1530-9827, 1944-7078. doi: 10.1115/1.4007764. url: https://asmedigitalcollection.asme.org/computingengineering/article/doi/10.1115/1.4007764/371348/Combining-Mathematical-Programming-and-SysML-for (visited on 08/03/2020).

[Sie20] Siemens. siemens/JMiniZinc. original-date: 2016-05-30T05:00:00Z. Feb. 2020. url: https://github.com/siemens/JMiniZinc (visited on 08/03/2020).

[TJC10] Johan Thapper, Peter Jonsson, and David Cohen. Aspects of a Constraint Optimisation Problem. English. OCLC: 1141310964. 2010. url: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-52103 (visited on 07/28/2020).

[Tut20] Tutorialspoint. Java DOM Parser - Overview. 2020. url: https://www.tutorialspoint.com/java_xml/java_dom_parser.htm (visited on 07/29/2020).

[uml20] uml-diagrams.org. UML profile allows to adapt or customize a metamodel with constructs that are specific to a particular domain, platform, or a software development method. en. 2020. url: https://www.uml-diagrams.org/profile.html (visited on 08/14/2020).

[Van20] Robert J. Vanderbei. Linear Programming: Foundations and Extensions. en. Vol. 285. International Series in Operations Research and Management Science. Cham: Springer International Publishing, 2020. isbn: 978-3-030-39414-1 978-3-030-39415-8. doi: 10.1007/978-3-030-39415-8. url: http://link.springer.com/10.1007/978-3-030-39415-8 (visited on 07/30/2020).

[Wei] Eric W. Weisstein. Convex. en. Text. url: https://mathworld.wolfram.com/Convex.html (visited on 07/28/2020).

[Wri16] Stephen J. Wright. Optimization. en. Library Catalog: www.britannica.com. Oct. 2016. url: https://www.britannica.com/science/optimization (visited on 08/03/2020).
