DEGREE PROJECT IN EMBEDDED SYSTEMS, 120 CREDITS, ADVANCED LEVEL, STOCKHOLM, SWEDEN 2016

Survey of Modelling Formalisms for MISRA-C:1998 Software Architecture Modelling

JOAKIM GUSTAVSSON

KTH ROYAL INSTITUTE OF TECHNOLOGY

SCHOOL OF INDUSTRIAL ENGINEERING AND MANAGEMENT

Master’s Thesis at ITM Supervisor: Jonas Westman Examiner: De-Jiu Chen

TRITA MMK 2016:08 MES 011

Abstract

The complexity of electrical and electronic automotive systems has increased steadily over the previous decades, with modern vehicles containing as many as 50-70 Electronic Control Units (ECUs) and several CAN communication networks. In order to address the increasing complexity of these safety-critical embedded systems, safety standards such as ISO 26262 are making their way to the market, imposing strong restrictions on the development process of automotive systems in order to ensure safety. With current automotive actors possessing large existing source code bases for their ECUs, primarily written in the C programming language, the demands placed on software architectural models by ISO 26262 are proving to be a challenge to meet, given the difficulties of modelling low-level languages such as C. This thesis aims to survey currently existing modelling formalisms with regard to their ability to model automotive embedded C source code in a way that facilitates ISO 26262 compliance. The scope is limited to the MISRA-C:1998 subset of the C language, a safer subset commonly used in the automotive industry. A short ontology is proposed, coupled with a metric for evaluating the completeness of a modelling formalism. Requirements are placed on suitable modelling formalisms, and AADL, Lustre, SysML and Promela are identified as promising candidates for modelling embedded C code. Semantic constructions present in the C language are identified, and a mapping between these constructions and the semantic constructions present in the selected modelling formalisms is made and analyzed using the proposed completeness evaluation framework. Architectural Description Languages (ADLs), such as AADL, are identified as the most promising with regard to modelling embedded C code. Control Flow Graphs are identified as a promising augmentation to ADLs in order to deal with their lack of control flow semantics.

Summary

Survey of Modelling Formalisms for Modelling of MISRA-C:1998 Architectures

The complexity of the electrical systems found in modern vehicles has increased steadily over recent decades; modern vehicles may contain as many as 50-70 electronic control units and several CAN networks for communication between them. To handle the increased complexity of these safety-critical embedded systems, safety standards such as ISO 26262 have begun to make themselves known on the market. This standard places strict requirements on the development process for embedded systems in order to substantiate their safety. Since many actors in the automotive industry already possess large amounts of source code for the control units they use, often written in the low-level programming language C, the increased requirements imposed by ISO 26262 have proven difficult to meet. This report aims to survey the modelling formalisms available on the market and to evaluate their potential for modelling embedded C source code in a way that facilitates fulfilment of the requirements set by ISO 26262. The scope is limited to the subset of C specified by the MISRA-C:1998 standard, a standard commonly used in the automotive industry to support the development of safe source code. A short ontology is presented together with a framework for evaluating the completeness of a modelling formalism. A number of requirements are placed on the modelling formalisms to be evaluated, and AADL, Lustre, SysML and Promela are identified as promising formalisms for modelling C source code. Semantic elements in C are identified, and a mapping between these elements and elements in the identified modelling formalisms is carried out and evaluated according to the previously proposed framework. Architectural Description Languages (ADLs), such as AADL, are identified as promising for modelling C source code. Control flow graphs are identified as promising for handling the weakness that ADLs have regarding the modelling of control flow.

Acknowledgements

First and foremost I would like to thank my thesis supervisor Jonas Westman, who throughout the course of the thesis project always showed a great interest in my research, and was always there to offer feedback, suggestions or interesting discussions, which served as inspiration for the direction of the work. Without him this thesis would never have been written.

I would also like to thank Maxim Olifer, who was always there to help out when I got stuck, and whose active engagement in lunch discussions regarding modelling helped shape the direction of the research. His work on automatic parsing of C code served as a strong inspiration for my own work.

I would like to thank the people at the RESA department at Scania, especially Mattias Nyberg and Anton Einarson, who were always there with insight into "the Scania way", and could provide valuable insights that could not be found in documentation. I would also like to thank the other Master's thesis workers at RESA, who kept me motivated by bringing me along to lunch and coffee breaks, where I could clear my mind and engage in interesting discussions.

I would like to thank Associate Professor De-Jiu Chen for offering to be the examiner for this thesis. Without him stepping up when everyone else was busy, I would never have been able to start the work in the first place.

I would like to extend my thanks to Lars-Ivar Nero and Anita Sehlin, two of the most inspirational teachers I have had the honor of studying under, and without whom I would likely never have pursued Master's level studies.

Lastly I would like to thank my family, who have always been there with support when I was feeling down or stressed. They helped keep my spirits high so that I could eventually finish my research.

Contents

1 Introduction
  1.1 Subject and Purpose
  1.2 Delimitations
  1.3 Disposition

2 Background
  2.1 Modelling
    2.1.1 Model-Driven Engineering
    2.1.2 Modelling Formalism Families
    2.1.3 Reverse Engineering
  2.2 Functional Safety Standards
    2.2.1 IEC 61508
    2.2.2 ISO 26262
  2.3 The C Programming Language
    2.3.1 ANSI-C
    2.3.2 Usage in Embedded Systems Development
    2.3.3 MISRA-C:1998
    2.3.4 Modelling the C Language
  2.4 Introduction to Scania
  2.5 Related Work

3 The Scania Software Architecture
  3.1 The Layer Model
  3.2 Code Organization
  3.3 Communication Channels
    3.3.1 CAN
    3.3.2 RTDB
    3.3.3 Sensors

4 Method and Evaluation Framework
  4.1 Accurately Representing Software Architectures
  4.2 The Concept of Model Views
  4.3 Extending Model Completeness
  4.4 A Framework for Evaluating Expressiveness
    4.4.1 Constructions in C
  4.5 Requirements on Modelling Formalisms
  4.6 Method of Evaluation
    4.6.1 Modelling code or modelling behaviour?
  4.7 Weaknesses of Method

5 Coverage Analysis
  5.1 C Construction Categories
    5.1.1 Data Storage
    5.1.2 Data Flow
    5.1.3 Control Flow
    5.1.4 Code Structure
    5.1.5 Program Behaviour
  5.2 Modelling Formalisms
    5.2.1 Evaluated Formalisms
    5.2.2 Rejected Formalisms
  5.3 Coverage
    5.3.1 Fulfilment: Data Storage
    5.3.2 Fulfilment: Data Flow
    5.3.3 Fulfilment: Control Flow
    5.3.4 Fulfilment: Code Structure
    5.3.5 Fulfilment: Program Behaviour
  5.4 Coverage Summary

6 Discussion
  6.1 Formalism Coverage
    6.1.1 Data Storage
    6.1.2 Data Flow
    6.1.3 Control Flow
    6.1.4 Code Structure
    6.1.5 Program Behaviour
  6.2 Augmenting Formalisms
  6.3 Automatic Model Generation
  6.4 Validation
  6.5 Future Work

Bibliography

Appendices

A Modelling Formalism Requirements
B Fulfilment: Data Storage
C Fulfilment: Data Flow
D Fulfilment: Control Flow
E Fulfilment: Code Structure
F Fulfilment: Program Behaviour
G AADL Examples
H Promela Examples
I Lustre Examples
J SysML Examples

List of Figures

2.1 The ten parts of ISO 26262.
2.2 The parts and sections of the ISO 26262 standard.
2.3 Hazard and risk analysis categories.
2.4 Example of recommended activities to achieve ASIL targets.
2.5 The usage of the char type to address individual bytes of data. Error handling omitted for the sake of brevity.

3.1 The software architectural layers of COO7. Arrows represent dependencies. Grey layers are unique to COO7, while white layers are general layers.
3.2 Naming convention for parts of the module module1. This is an example module and is not present in any ECU software.
3.3 Directory-based meta-data pertaining to layer architecture. The module names in the figure are obscured, and the figure is as such not an actual representation of the COO7 architecture.
3.4 The façade of the BIOS layer aliases an internal function of the layer with a Bios prefix to indicate that it is a public function of the layer.

4.1 The hierarchical construction of composite data types.
4.2 An example of hierarchies stemming from pointer indirection.
4.3 Summary of Modelling Formalism requirements.
4.4 The steps of the modelling formalism survey.

5.1 A SysML example of nested function calls.
5.2 Summary of Data Storage Coverage.
5.3 Summary of Data Flow Coverage.
5.4 Summary of Control Flow Coverage.
5.5 Summary of Code Structure Coverage.
5.6 Summary of Program Behaviour Coverage.

6.1 Modelling formalism syntax validation tools.

G.1 Example for A1.
G.2 Example for A11.
G.3 Example for A12.
G.4 Example for A13.
G.5 Example for A15.
G.6 Example for A16.
G.7 Example for A2.
G.8 Example for A20.
G.9 Example for A21.
G.10 Example for A22.
G.11 Example for A24.
G.12 Example for A5.
G.13 Example for A6.
G.14 Example for A7.
G.15 Example for A8.
G.16 Example for A8test2.
G.17 Example for B1.
G.18 Example for B2.
G.19 Example for B4.
G.20 Example for B6.
G.21 Example for B8.
G.22 Example for C1.
G.23 Example for C10.
G.24 Example for C11.
G.25 Example for C1flows.
G.26 Example for C2.
G.27 Example for C5.
G.28 Example for C6.
G.29 Example for C7.
G.30 Example for D4.
G.31 Example for D4 2.
G.32 Example for E3.
G.33 Example for E6.
G.34 Example for FileTypes.
G.35 Example for function constants.
G.36 Example for Interrupts.
G.37 Example for Test.
G.38 Example for Test2.
G.39 Example for Test3.
G.40 Example for C13.
G.41 Property for a propertySet.
G.42 Property set for ExternalLang.
G.43 Property set for LoopBinds.
G.44 Property set for Macro.
G.45 Property set for TypedIntegers.

H.1 Example for A1.
H.2 Example for A10.
H.3 Example for A11.
H.4 Example for A12.
H.5 Example for A15.
H.6 Example for A16.
H.7 Example for A20.
H.8 Example for A21.
H.9 Example for A22.
H.10 Example for A22 mod.
H.11 Example for A5.
H.12 Example for A9.
H.13 Example for B2.
H.14 Example for B3.
H.15 Example for B9.
H.16 Example for C1.
H.17 Example for C10.
H.18 Example for C12.
H.19 Example for C13.
H.20 Example for C14.
H.21 Example for C2.
H.22 Example for C4.
H.23 Example for C8.
H.24 Example for D3.
H.25 Example for D4 1.
H.26 Example for D4 2.
H.27 Example for D7.
H.28 Example for E1.
H.29 Example for E6.
H.30 Example for E6 2.

I.1 Example for A1.
I.2 Example for A11.
I.3 Example for A12.
I.4 Example for A14 1.
I.5 Example for A14 2.
I.6 Example for A20.
I.7 Example for A21.
I.8 Example for A22.
I.9 Example for A4.
I.10 Example for A5.
I.11 Example for A8.
I.12 Example for C2.
I.13 Example for D3.
I.14 Example for D4.
I.15 Example for D6.
I.16 Example for E6.
I.17 Example for H.
I.18 Example for Test.

J.1 Example for A1.
J.2 Example for A11.
J.3 Example for A12.
J.4 Example for A13.
J.5 Example for A15.
J.6 Example for A16.
J.7 Example for A2.
J.8 Example for A20.
J.9 Example for A21.
J.10 Example for A3.
J.11 Example for A4.
J.12 Example for A5.
J.13 Example for A7.
J.14 Example for A9.
J.15 Example for B2.
J.16 Example for B4.
J.17 Example for B8 1.
J.18 Example for B8 2.
J.19 Example for B9.
J.20 Example for C1.
J.21 Example for C10.
J.22 Example for C11.
J.23 Example for C12.
J.24 Example for C2.
J.25 Example for C7 1.
J.26 Example for C7 2.
J.27 Example for D4.
J.28 Example for E3.
J.29 Example for E6.
J.30 Example for E8.

List of Tables

5.1 The constructions of the Data Storage category.
5.2 The constructions of the Data Flow category.
5.3 The constructions of the Control Flow category.
5.4 The constructions of the Code Structure category.
5.5 The constructions of the Program Behaviour category.

B.1 Coverage report for Promela Data Storage.
B.2 Coverage report for AADL Data Storage.
B.3 Coverage report for SysML Data Storage.
B.4 Coverage report for Lustre Data Storage.

C.1 Coverage report for Promela Data Flow.
C.2 Coverage report for AADL Data Flow.
C.3 Coverage report for SysML Data Flow.
C.4 Coverage report for Lustre Data Flow.

D.1 Coverage report for Promela Control Flow.
D.2 Coverage report for AADL Control Flow.
D.3 Coverage report for SysML Control Flow.
D.4 Coverage report for Lustre Control Flow.

E.1 Coverage report for Promela Code Structure.
E.2 Coverage report for AADL Code Structure.
E.3 Coverage report for SysML Code Structure.
E.4 Coverage report for Lustre Code Structure.

F.1 Coverage report for Promela Program Behaviour.
F.2 Coverage report for AADL Program Behaviour.
F.3 Coverage report for SysML Program Behaviour.
F.4 Coverage report for Lustre Program Behaviour.

Chapter 1

Introduction

This chapter presents the problem that the thesis aims to address, as well as the benefits that could be gained from its results. The chapter ends with an outline of the subsequent chapters of the thesis.

1.1 Subject and Purpose

Modern automotive vehicles are highly complex systems, containing a large number of mechanical, electrical and electronic components. Recent development trends in the automotive industry have seen a steady move towards increased usage of electronically controlled components, with concepts such as brake-by-wire and steer-by-wire becoming increasingly common[1]. The electric and electronic (E/E) signal architecture of a vehicle is controlled by a large number of Electronic Control Units (ECUs), communicating with each other over a bus network spanning the entirety of the vehicle[2][3]. As the functionality of the vehicle expands, the number of ECUs grows, increasing the complexity of the system architecture.

Several of the ECUs control safety-critical functionality of the vehicle, such as the brake-by-wire system, where a failure could result in the loss of human lives. The robustness and reliability of the E/E system is therefore a key concern. Presently the verification and validation of these systems is performed using extensive testing and simulation, a process that is both time-consuming and error-prone[4]. In order to address this issue, the International Organization for Standardization (ISO) released a functional safety standard titled Road vehicles - Functional safety in November 2011. This standard, known as ISO 26262, establishes requirements on the development process of automotive systems in order to be able to guarantee the safety of the system[5]. ISO 26262 does not currently extend to vehicles with a weight of more than 3.5 metric tonnes, although there are expectations that the standard will either be extended to, or another similar standard formalized for, heavier vehicles in the future[6].

The ISO 26262 standard poses requirements on the software development process in terms of the presence of architectural and requirements models[7]. It has been demonstrated that such models could aid software developers in the validation and verification of developed software[8]. Models have so far almost exclusively been used in forward engineering, through the practice of model-driven development, where a model is created and the final system is then synthesized from the model. For companies with large existing code bases that want to achieve ISO 26262 compliance, however, this approach becomes infeasible. Attempts at reverse-engineering models from existing code artifacts have been made in the past, but with limited success[9, 10, 11, 12, 13, 14]. Modelling of embedded systems has further proven problematic due to the low-level nature of the programming languages commonly used in embedded systems development (C and Assembly Language). While higher-level languages with stricter semantics, such as Ada, exist for embedded systems development, C and Assembly Language remain the dominant languages for this particular type of development today[15].

The difficulty of modelling low-level source code written in C, coupled with the ISO 26262 requirements for a software architecture model, has triggered an interest in the automotive industry in examining how such models of C code could be produced. A first step is to limit the scope of the C language to a safer subset. For this task most automotive companies have adopted the MISRA-C language subset, the result of a set of rules produced by the Motor Industry Software Reliability Association (MISRA), which aim to increase the safety, portability and reliability of code artifacts written in the C language. Even given these restrictions, however, C code remains difficult to model. This thesis aims to investigate the possibility of modelling embedded systems C code in currently available modelling formalisms.
More specifically the thesis aims to answer the following questions:

• Will the modelling formalisms available today be able to preserve the structure of code written in MISRA-compliant C?

• Which constructions that may occur in MISRA-compliant C cannot be expressed in the modelling formalisms?

• How well established are the modelling formalisms in terms of documentation, tool-chains and active development?

The thesis aims to establish the requirements of the automotive industry regarding a modelling formalism that could be used to model automotive embedded systems architectures. The thesis then aims to produce a collection of legal constructions in MISRA-C:1998 and to evaluate the expressiveness of the modelling languages that fit the previously established requirements, in terms of how many of these constructions can be modelled in the given formalism. Recommendations regarding which language or languages would be most suitable for automotive industry needs will be presented. Assuming that none of the reviewed modelling formalisms can be considered ideal, suggestions will also be given for what an ideal modelling language for automotive industry needs could look like, borrowing concepts from the languages examined in the survey.

1.2 Delimitations

This thesis will cover the modelling of MISRA-C:1998 compliant C code. Other programming languages, such as Assembly Language, will not be considered. Comments will be made on whether the modelling formalisms could possibly cover mixed-language software, but no deeper analysis will be made regarding the extent of their capability for these kinds of systems. The choice to limit the scope to MISRA-compliant C stems from the fact that it is widely used in the automotive industry, and considered to be best practice for C language development in this domain[7].

The thesis considers the 1998 revision of the MISRA-C guidelines. Later revisions are supersets of the 1998 revision, with additional rules added. This makes the 1998 revision the least limiting revision, and as such a formalism that can model code compliant with the 1998 revision can also model code compliant with later revisions.

The thesis will be conducted as a survey, and as such a fairly large number of modelling formalisms will be evaluated. This means that the author will not be able to fully master each individual modelling formalism. As a result, only aspects of the modelling formalisms that are clearly documented will be considered. Where documentation is incomplete or missing, the corresponding aspects of the formalisms will not be considered.

A set of requirements that modelling formalisms must fulfil in order to qualify for the survey will be formalized, and any formalism that does not fulfil these requirements will not be considered. This stems from sustainability aspects: for a formalism to be usable by engineers in day-to-day operation within the automotive industry, it needs to offer a degree of documentation, a promise of future development and proper tool support. This allows automotive companies that choose to adopt the formalism to use it in future development with limited risk.

1.3 Disposition

Following the introduction to the topic of study, Chapter 2 will present a brief background to the concept of modelling. Model-driven development practices will be described, and a summary of families of modelling languages will be presented. The C language will be introduced, and its usage in embedded systems development will briefly be explained. The difficulty of modelling the C language will be explained, as well as the difficulty of moving from code artifacts to a model, rather than the other way around. A summary of the ISO 26262 standard will be presented. Finally, an introduction will be given to Scania CV AB, the automotive company at which the work behind this thesis was performed.


Chapter 3 will summarize the Scania software architecture used in heavy trucks and buses. The layer model will be explained, and a description of internal communication will be presented. For reasons of non-disclosure the complete software architecture cannot be presented, but an abbreviated version that is sufficient to describe the goal of the modelling process will be described.

Chapter 4 will outline the approach used by this thesis to determine the strength and capability of the modelling formalisms that are considered in the survey. The goals of the model will be presented in terms of accurate representation of the software as well as the possibility to perform automated verification of the model. The requirements imposed on modelling formalisms in order to ensure proper documentation, development process and tool support will be described and argued for. Finally, a framework for evaluating the expressiveness of the formalisms with regard to their ability to model the C language will be presented.

Chapter 5 will present the results of the survey. A complete list of identified C language constructions will be presented and discussed, and the constructions will be categorized based on the programming concept they express. The formalisms selected for the survey based on the stated requirements will be presented, and formalisms not selected for inclusion will be discussed briefly. The chapter ends with coverage analyses for the surveyed formalisms with regard to the C constructions that they can accurately model. Each construction will be discussed, together with how it could be modelled in each of the formalisms. Example models for each construction will be presented.

Chapter 6 will discuss the results of the survey, as well as present suggestions for how automatic model parsing could potentially be implemented. Suggestions for how to augment existing formalisms in order to increase their strength in modelling the C language will be presented and argued for. The chapter concludes with a discussion of potential future work on the topic of modelling C language code.

Chapter 2

Background

2.1 Modelling

Several attempts[16][17][18] have been made to establish the nature of a model, although there is still disagreement concerning a formal definition of the term. It is commonly agreed, however, that a model is a representation of an object, physical or imaginary, that in some way presents a useful description of the object. This definition is very broad and subjective in nature; what is useful depends on the observer, and as such the definition does not suffice for a thesis concerned with modelling. For the purposes of this thesis, Definition 1 will be used.

Definition 1. A model $P_M$ is the projection of an ordered set of points in domain space $D$ onto model space $M_f$ using a modelling formalism $f(x)$, where $x = \{x_1, \ldots, x_n\}$, $x_i \in D$, as an ordered set mapping function.

A domain constitutes a family of objects that share some common properties that prompt them to be modelled in a similar way. This thesis targets the domain of programs written in MISRA-C:1998 compliant C code. The definition of a domain is intentionally left vague, as what could constitute a family of objects sharing a set of properties is a broad definition. In fact, general purpose modelling formalisms specifically rely on this definition being vague.

A point in domain space represents the smallest possible cohesive subcomponent of an object for a given domain. For the purposes of modelling C language code, this realistically refers to a given statement in the code, such as a variable assignment. It becomes clear that the set of points passed to the modelling formalism mapping function has to be ordered; rearranging the order of the subcomponents would not yield the same composite object. In the domain of code, rearranging the program instructions would likely change the behaviour of the program synthesized from the code.

Model space refers to the space in which projections resulting from the application of a given modelling formalism reside. Points in model space represent the smallest possible cohesive subcomponents of the model projection. It becomes evident that a unique model space exists for every given modelling formalism, as the set of possible subcomponents must depend on the mapping function.

A modelling formalism refers to any mapping function that takes an ordered set of points in a given domain and creates a projection in the form of an ordered set. If this function is injective, the modelling formalism is considered unambiguous; that is, no two distinct objects in domain space map to the same model. A modelling formalism is considered to be lean if the mapping function is surjective; in other words, there are no unused subcomponents in the formalism. A construction is defined according to Definition 2.

Definition 2. A construction defines a category of subcomponents that are identical in regard to purpose but differ in implementation.

Referring to the modelling of C language code, constructions would constitute programming instructions such as assignments or if-else statements. These instructions perform the same operation, but using different defining parameters. This definition implies that a construction is a hyperplane in domain or model space; the vector component that represents the type of subcomponent would be fixed based on the construction type, but the defining parameters would be left open.
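The notions of ordered points and constructions can be illustrated with a small C sketch. The function names below are hypothetical and chosen only for illustration; the code is plain C and has not been checked against the MISRA-C:1998 rules. The first two functions contain the same two assignment statements in different order and give different results, while the last two are two instances of the if-else construction that differ only in their defining parameters.

```c
#include <assert.h>

/* Two points in domain space (two assignment statements), in one order. */
int increment_then_double(int x)
{
    x = x + 1;   /* assignment construction, parameters: (x, x + 1) */
    x = x * 2;   /* assignment construction, parameters: (x, x * 2) */
    return x;
}

/* The same two points in the opposite order: a different composite object. */
int double_then_increment(int x)
{
    x = x * 2;
    x = x + 1;
    return x;
}

/* First instance of the if-else construction: condition (x < low). */
int clamp_low(int x, int low)
{
    int result;
    if (x < low) {
        result = low;
    } else {
        result = x;
    }
    return result;
}

/* Second instance of the same construction: condition (x > high). */
int clamp_high(int x, int high)
{
    int result;
    if (x > high) {
        result = high;
    } else {
        result = x;
    }
    return result;
}

/* Illustrative self-check: reordering the statements changes the behaviour. */
void check_ordering_matters(void)
{
    assert(increment_then_double(3) == 8);   /* (3 + 1) * 2 */
    assert(double_then_increment(3) == 7);   /* (3 * 2) + 1 */
}
```

In the terms used above, both clamp functions fix the "type of subcomponent" to if-else, while the condition and the assigned values are the parameters that are left open.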

An implied property of a model is that it should provide some level of abstraction compared to the object being modelled; otherwise one could just as well study the object itself. What constitutes a suitable level of abstraction is, however, subjective by nature: in order to achieve abstraction, some components are retained and others are removed in the transition from object to model. Which subcomponents are of interest to preserve depends on the intended usage of the model. For the purposes of this thesis, relevant abstractions will be based on the intended audience of the model. Since the thesis concerns itself with modelling C language source code, the target audience will be systems architects and code developers.

2.1.1 Model-Driven Engineering

Model-Driven Engineering is the name given to the practice of conceiving the design and architecture of a system through the use of models. In classical systems engineering[19] the flow of systems design goes from requirements, to a systems specification, to implementation. Requirements are synthesized by a systems architect into a set of specifications for how the system should be implemented, which are then implemented by the developers. After the system has been developed, various testing stages such as unit testing and integration testing are performed in order to evaluate whether the system fulfils the requirements placed on it[20]. The verification and validation of systems developed in this fashion is performed at the end of the development process, when the system has already been realized. This means that errors related to the requirements are only caught once the system has been fully developed, potentially resulting in the need to redevelop the parts of the system that stem from the faulty requirements, a process that could be time-consuming and costly.

Test-Driven Development[21] was introduced as a way to address the problem of errors being discovered at the end of the development process. Test-Driven Development revolves around writing test cases before the code that is meant to be tested is written. This makes problems evident much sooner: problems with requirements can be detected during the construction of the test cases, and problems with a particular function can be detected as soon as that function is written, without having to rely on dedicated sequential development and testing phases.

Model-Driven Engineering[18] takes the fundamental principle of Test-Driven Development, namely early detection of errors, one step further. In a Model-Driven Engineering approach the first step is to create a model of the target system. This model starts out at a high degree of abstraction, and is then expanded as functionality is added to the system. This allows for a multi-tiered testing approach, where the system can be tested at various degrees of abstraction. Testing becomes faster, since the system is less complex at a higher degree of abstraction, and architectural design errors can be detected early. The model-based approach also has the benefit of presenting a number of depictions of the system at various degrees of complexity, targeting a large number of different stakeholders. Management parties might only be interested in the general idea of the system, and as such are looking for a very high degree of abstraction, while developers might want a more detailed architectural model. Model-Driven Engineering has as such shown promise as an effective way of dealing with increasingly complex systems[22].

2.1.2 Modelling Formalism Families

Modelling languages can be divided into a number of families based on which aspect of an object they intend to model. In this section the most common families for the modelling of various aspects of embedded systems are presented.

Data Flow Languages

Data Flow Languages intend to model the flow of data through the structural components of a program. These languages tend to place little focus on how data transformations are achieved, or on the exact methods of communication by which data is passed around, but rather on how a given piece of data is produced, and which intermediate and final data representations derive their values from this given piece of data. Data Flow Languages are in many cases synchronous1, where a steady stream of data is produced based on a system clock. A few examples of data flow languages are Simulink2, Lustre3, Esterel4 and SIGNAL5.

1Lustre, SIGNAL, Simulink
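As a rough illustration of the synchronous data flow view, the sketch below emulates a single node in C as a step function that is called once per clock tick. The names counter_state and counter_step, as well as the reset input, are hypothetical and are not taken from Lustre, SIGNAL or Simulink; the point is only that each output value is derived from the current input and the previous state.

```c
#include <stdbool.h>

/* State carried from one clock tick to the next (hypothetical node). */
typedef struct {
    int previous;   /* output produced at the previous tick */
    bool started;   /* false until the first tick has run */
} counter_state;

/* One tick of a counter node: emits 0 on the first tick or on reset,
 * otherwise the previous output plus one. */
int counter_step(counter_state *s, bool reset)
{
    int out = (!s->started || reset) ? 0 : s->previous + 1;
    s->previous = out;
    s->started = true;
    return out;
}
```

Calling counter_step repeatedly on the same state yields one output value per tick, which is the steady stream of data referred to above.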

Architecture Description Languages

Architecture Description Languages aim to model the different structural components which together make up the architecture of an object. Communication between structural components is defined, but the internal workings of each structural component tend not to be modelled. In embedded systems modelling, architecture models can exist at both the hardware level, where communication between different computational platforms through buses is modelled, or at software level, where communication might take the form of function calls with parameters, or communication through global variables or message queues. Mixed architectures, taking into account both hardware and software, can in many cases be produced6, where the model maps which software components execute on a given processor. Due to the lack of internal component behaviour in the model, data flow and control flow are not guaranteed to be taken into account in the model; while communication interfaces between components are defined, which particular data is communicated over these interfaces is not necessarily clear. Some examples of architecture description languages are AADL7, UniCon8 and ACME9.

Component Modelling Languages/Interface Description Languages

Component Modelling Languages model a system in terms of its structural components, as well as the interfaces between them. Interfaces can take the form of inclusion or communication. Inclusion refers to one structural component with finer granularity, such as a function, being included in a structural component with coarser granularity, such as a source code file or layer. Communication takes the form of data or control being transmitted over an interface. Unlike architecture description languages, component modelling languages do not explicitly detail the method of communication between components. The mapping of software components onto hardware is also not taken into account. This means that we can view component modelling languages as a subset of architecture description languages.

Interface Description Languages focus on the ability to model the communication on interfaces between components. As such they are similar to component modelling languages in that they view the object that is to be modelled in terms of subcomponents that communicate over interfaces. Interface description languages however complement the lack of clear specifications for the method of communication of the component modelling languages with the ability to model the exact data representation that is passed on the interfaces in terms of values that have to be present, and optional parts of a signal.

2http://se.mathworks.com/products/simulink/
3http://www-verimag.imag.fr/Synchrone,30
4http://www-sop.inria.fr/meije/esterel/esterel-eng.html
5http://www.irisa.fr/espresso/Polychrony/
6http://www.openaadl.org/post/2013/04/15/aadl-tutorial/
7http://www.openaadl.org/
8http://www.cs.cmu.edu/~Vit/unicon/
9http://www.cs.cmu.edu/~acme/

Pure component modelling languages are becoming increasingly rare, as augmentations to their semantics in order to integrate the functionality of interface description languages or the software-on-hardware mapping of the architecture description languages are being added10. UML as well as its derivatives, such as SysML, have component modelling language support in the form of component diagrams, but also include deployment diagrams and parametric diagrams to add the functionality of architecture description languages and interface description languages respectively.

2.1.3 Reverse Engineering

Reverse Engineering refers to the practice of extracting data from existing software artifacts[23]. This practice could take the form of disassembling compiled software in order to attempt to determine algorithmic implementations, but it could also entail the extraction of architectural data from source code. The latter interpretation is the main point of focus for this thesis.

Reverse engineering becomes relevant when a paradigm shift in development methodology requires developers to re-examine the way they structure their work. Some such paradigms are presented in section 2.1.1, which all require a different set of project deliverables. A practical example would be the transition from a test-driven development approach to a model-driven approach, where the test cases produced early in the test-driven approach are not required until much later in the model-driven approach. Instead the model-driven approach requires the production of a model of the overarching architecture of the software at an early stage, a step that was not required during any stage of the test-driven approach. In order to not be forced to start over with development every time the paradigm shifts, a process that could be very costly, it is appealing to try to construct these new required deliverables from the already existing work.

The process of reverse engineering involves a number of risks however. Since the general development process moves from a higher level of abstraction to a lower level of abstraction the further into the development cycle a project proceeds[24], reverse engineering project progress from late stages in order to produce deliverables required in earlier stages represents a move from a lower level of abstraction to a higher level of abstraction. This inevitably results in a loss of information during the translation.
If the goal of the new paradigm is the automated synthesis of systems, which is the case with model-driven development, then this loss of information could result in a different system being synthesized from the reverse-engineered artifacts than the one actually in place. This is a phenomenon commonly referred to as architectural drift, where the model of the architecture does not represent the implementation of the architecture.

10UML Deployment Diagrams - http://www.agilemodeling.com/artifacts/deploymentDiagram.htm

2.2 Functional Safety Standards

The International Electrotechnical Commission (IEC) defines the concept of safety as11:

(1) Freedom from unacceptable risk of physical injury or of damage to the health of people, either directly, or indirectly as a result of damage to property or to the environment.

The concept of Functional Safety is defined by IEC as a subset of general safety as the

(2) [...] part of the overall safety that depends on a system or equipment operating correctly in response to its inputs.

The IEC further defines the functional safety of a system as

(3) [...] the detection of a potentially dangerous condition resulting in the activation of a protective or corrective device or mechanism to prevent hazardous events arising or providing mitigation to reduce the consequence of the hazardous event.

The definition of the functional safety of a system states that the system needs to detect dangerous conditions. This definition implies that the system needs to be aware of what could potentially constitute a dangerous condition, and that such conditions need to be defined during system design.

2.2.1 IEC 61508

In order to aid the industry in the identification of dangerous conditions, or hazards, that would need to be taken into account during system design, the IEC defined a set of guidelines for the development process of Electrical/Electronic/Programmable Electronic Safety-related (E/E/PES) systems, which were standardized as IEC 61508 [25]. IEC 61508 consists of seven parts, where the first three parts concern requirements posed on E/E/PES systems in order to fulfil compliance with the standard. The remaining four parts concern guidelines and examples of how to fulfil compliance with the standard.

The IEC 61508 standard requires a hazard and risk analysis to be conducted at the start of development, in order to determine dangerous conditions such as those referred to in the functional safety definition in (3). The result of the hazard and risk analysis is a target Safety Integrity Level (SIL); a target for the strength of safety precautions placed in the system. IEC 61508 then provides a set of development guidelines for how to achieve the desired SIL.

IEC 61508 targets general E/E/PES systems. As a result, several adaptations of the standard have been formalized to aid specific domains in achieving IEC 61508 compliance: IEC 62279 for railway applications, IEC 61513 for nuclear power plants, IEC 62061 for machinery, and ISO 26262 for the automotive industry. ISO 26262 compliance is a driving factor for the work presented in this thesis.

11http://www.iec.ch/functionalsafety/explained/

2.2.2 ISO 26262

In 2011 the International Organization for Standardization (ISO) presented the standard ISO 26262 - Road vehicles - Functional safety as an adaptation of IEC 61508 for the domain of automotive road vehicles with a weight of less than 3.5 metric tonnes. ISO 26262 attempts to achieve the same objective as IEC 61508, namely the guarantee of conceptual functional safety, but specifically for automotive vehicles.

The ISO 26262 Parts

The ISO 26262 standard is divided into ten parts, as detailed in figure 2.1.

The Parts of ISO 26262

1. Vocabulary

2. Management of functional safety

3. Concept phase

4. Product development at the system level

5. Product development at the hardware level

6. Product development at the software level

7. Production and operation

8. Supporting processes

9. Automotive Safety Integrity Level (ASIL)-oriented and safety-oriented analyses

10. Guideline on ISO 26262

Figure 2.1. The ten parts of ISO 26262


The focus of this thesis will be on part 6 of the specification, concerning itself with development at the software level. Given a sufficiently expressive modelling formalism however, parts 3 to 5 could possibly be covered by the formalism. In the scope of this thesis such implications are not evaluated. Sections 6-5, 6-7 and 6-8 of part 6 are of particular interest as these sections concern themselves with modelling the architectural design and software of the system. An illustration of the different parts and sections of the ISO 26262 standard is shown in figure 2.2.

Figure 2.2. The parts and sections of the ISO 26262 standard.

ASIL Classification

The ISO 26262 standard redefines the Safety Integrity Levels (SILs) proposed in IEC 61508 as a number of Automotive Safety Integrity Levels (ASILs). The standard defines five classification levels, enumerated as ASIL A-D as well as QM, where D is considered to be the most severe level, with the highest likelihood of occurrence and the highest severity, and A is considered to be the least likely and with the lowest severity. QM, which stands for Quality Management, represents components that do not have an impact on the safety of the vehicle. During the hazard and risk analysis, each component of the vehicle is evaluated in terms of Severity (S), Exposure (E) and Controllability (C). A summary of the levels of classification for each category is presented in figure 2.3.

Hazard and Risk Classifications

• Severity Classifications (S):

  – S0 No injuries
  – S1 Light to moderate injuries
  – S2 Severe to life-threatening (survival probable) injuries
  – S3 Life-threatening (survival uncertain) to fatal injuries

• Exposure Classifications (E):

  – E0 Incredibly unlikely
  – E1 Very low probability (injury could happen only in rare operating conditions)
  – E2 Low probability
  – E3 Medium probability
  – E4 High probability (injury could happen under most operating conditions)

• Controllability Classifications (C):

  – C0 Controllable in general
  – C1 Simply controllable
  – C2 Normally controllable (most drivers could act to prevent injury)
  – C3 Difficult to control or uncontrollable

Figure 2.3. Hazard and risk analysis categories

Each item in the system will be assigned a classification from each of the three classification categories. These three classification values then translate into the ASIL rating of the item[26]. During each step of the development process as outlined in parts 4-6 of the standard, several activities for achieving the target ASIL are proposed. These activities are classified according to a three-grade scale, with (++) indicating a highly recommended activity, (+) indicating a recommended activity and (◦) indicating an activity that is neither recommended nor not recommended. An example of such activities, as well as the recommended usage for various target ASIL is presented in figure 2.4.
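The translation from the three classes to an ASIL is given by a determination table in part 3 of the standard. A commonly cited shorthand for that table, which is assumed here and is not taken from the text above, is to add the class indices: a sum of 10 gives ASIL D, 9 gives C, 8 gives B, 7 gives A, and anything lower, or any class of 0, gives QM. A minimal sketch under that assumption, with hypothetical names, could look as follows.

```c
/* ASIL ratings, ordered from least to most demanding. */
typedef enum { ASIL_QM, ASIL_A, ASIL_B, ASIL_C, ASIL_D } asil;

/* Assumed shorthand for the ASIL determination table: s in 0..3,
 * e in 0..4, c in 0..3. Any class of 0 yields QM; otherwise the
 * sum s + e + c selects the level (7 -> A, 8 -> B, 9 -> C, 10 -> D). */
asil asil_from_classes(int s, int e, int c)
{
    if (s <= 0 || e <= 0 || c <= 0) {
        return ASIL_QM;
    }
    switch (s + e + c) {
    case 10: return ASIL_D;
    case 9:  return ASIL_C;
    case 8:  return ASIL_B;
    case 7:  return ASIL_A;
    default: return ASIL_QM;
    }
}
```

Under this shorthand an item classified as S3, E4 and C3 is rated ASIL D, while lowering the controllability class to C2 gives ASIL C. The table in the standard itself remains the authority.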


Methods                            ASIL A    B     C     D
1c  Initialization of variables    ++        ++    ++    ++
1f  Limited use of pointers        ◦         +     +     ++
1j  No recursion                   +         +     ++    ++

Figure 2.4. Example of recommended activities to achieve ASIL targets

2.3 The C Programming Language

C is a general purpose programming language developed in 1969-1973 by Dennis Ritchie, inspired by previous programming languages such as BCPL[27] and ALGOL 68[28]. C was designed to be a low-level programming language, adding concepts such as addressability that were not present in earlier programming languages outside of Assembly Languages. The development of C took place in order to produce a programming language capable of implementing the Unix operating system on a PDP-11 computer[29]. Over the coming years the popularity of C would increase, and several compilers were implemented for a number of different platforms[30].

2.3.1 ANSI-C

In order to address the implications of the increased usage of the C language, the American National Standards Institute (ANSI) formed a task force in order to formalize an official set of standards for how the C language should be used. This resulted in the 1989 publication standardizing what would come to be known as ANSI C, or C89[31]. A year later, in 1990, the C standard formalized by ANSI was recognized by the International Organization for Standardization (ISO) and came to be published as ISO/IEC 9899:1990, commonly known as C90[32].

2.3.2 Usage in Embedded Systems Development

C has come to be the most widely used programming language for embedded systems development[33]. The small memory footprint of C code, which stems from the very low degree of abstraction provided by the native C libraries as well as efficient data storage through the usage of unions and bit-fields, is appealing for embedded systems, where resources such as memory and processing power tend to be limited. The ability to directly address memory allows for communication with memory-mapped I/O devices, such as sensors, which are common in embedded systems. The C syntax allows for more compact code to be written compared to that of earlier languages that also provided direct memory access and efficient data storage, such as various dialects of Assembly Language, which has shown benefits in terms of increased productivity in developers[34].
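A short sketch of two of the features mentioned above is given below: bit-fields that pack several flags into one storage unit, and a volatile pointer of the kind used for memory-mapped I/O. The register layout and all names are hypothetical, and an ordinary variable stands in for the device address so that the sketch can be compiled on a desktop machine.

```c
#include <stdint.h>

/* Hypothetical status flags packed into bit-fields instead of three ints. */
typedef struct {
    unsigned int ready : 1;   /* measurement available */
    unsigned int error : 1;   /* sensor fault */
    unsigned int level : 6;   /* measured level, 0..63 */
} sensor_flags;

/* An ordinary variable stands in for a device register here; on a real
 * target the pointer would hold a fixed address from the data sheet. */
static uint8_t fake_register;
static volatile uint8_t *const LEVEL_REG = &fake_register;

/* Read the register through the volatile pointer and unpack it,
 * assuming (hypothetically) that the level sits in the low six bits
 * and the fault flag in the top bit. */
sensor_flags read_sensor(void)
{
    uint8_t raw = *LEVEL_REG;
    sensor_flags f;
    f.ready = 1u;
    f.error = (raw & 0x80u) ? 1u : 0u;
    f.level = raw & 0x3Fu;
    return f;
}
```

The volatile qualifier tells the compiler that the value may change outside of the program's control, so every call to read_sensor performs an actual read of the register.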


2.3.3 MISRA-C:1998

The Motor Industry Software Reliability Association (MISRA) was formed in the early 1990s as a collaboration between vehicle manufacturers, component manufacturers and engineering consultants12. The aim of MISRA was to establish guidelines for the development process for electrical and electronic components used in automotive vehicles in order to ensure the safety of the vehicle users. In 1994 they released their first set of guidelines for the development of vehicle based software[35]. The publication provided a number of guidelines covering aspects such as project models, testing of source code and fault handling.

In order to support compliance with the 1994 publication guidelines, MISRA released a second set of guidelines in 1998, this time targeting the C programming language. The publication, known as MISRA-C:1998[36], establishes guidelines for the safe usage of the C language for safety-critical embedded systems source-code. The publication covers 127 rules, out of which 93 are required for compliance and the remaining 34 are advisory. These rules effectively establish a subset of the ANSI C language, intended to provide increased safety, portability and reliability of developed software. Some restrictions imposed in the MISRA-C:1998 subset that are not present in ANSI C include the prohibition of function-level recursion (rule 70), a restriction of the level of pointer indirection to two levels13 (rule 102) as well as the prohibition of right-hand-side expressions in boolean expressions having side-effects14 (rule 33).

The MISRA-C standard has become widely accepted and used in industry, even outside of the originally intended automotive domain[37]. MISRA-C has been recommended as a suitable standard for the fulfilment of the ISO 26262 safety standard for automotive vehicles[7], and as such serves as a useful restriction for the purposes of models aimed at fulfilling ISO 26262 requirements.
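The motivation for rule 33 can be seen in a small sketch. The right-hand operand of && is only evaluated when the left-hand operand is true, so a side effect placed there happens on some calls and not on others. The function names are hypothetical; the second function shows one possible way of restructuring the code so that the side effect is unconditional.

```c
#include <stdbool.h>

static int reads;   /* counts how often the sensor was sampled */

/* Hypothetical sensor read with a side effect: it bumps the counter. */
static bool sample_sensor(void)
{
    reads++;
    return true;
}

/* Side effect in the right-hand operand: sample_sensor() is skipped
 * whenever enabled is false. */
bool check_with_side_effect(bool enabled)
{
    return enabled && sample_sensor();
}

/* Side effect moved out of the boolean expression: always sampled once. */
bool check_without_side_effect(bool enabled)
{
    bool sampled = sample_sensor();
    return enabled && sampled;
}
```

With the first variant the number of sensor reads depends on the value of enabled, which is exactly the kind of hidden control flow that is hard to capture in an architectural model.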

2.3.4 Modelling the C Language

The C language presents a number of characteristics not normally found in other programming languages, a fact that complicates the establishment of a suitable modelling formalism for the language. This section discusses some of these uncommon concepts, and argues for why they would be considered hard to model.

Weak Typing

Weak typing refers to the concept of variables within the source code having the ability to change type during program execution without explicit type conversions. This concept should not be confused with Static vs. Dynamic Typing, which refers to whether variables have a set data type at compile time; C is a statically typed language. In strongly typed languages data can only change representation through explicit conversions, such as through type-casting. In a weakly typed language however, data contained in variables can freely be passed to functions that accept other data types. The behaviour during this process, known as an implicit type-cast, tends to be undefined in terms of the semantics of the language; the behaviour will either be compiler dependent or platform dependent. This creates a dependency on the ability to model aspects of the environment during a modelling process; if the behaviour of a part of the source code is compiler dependent, then the model needs to factor in aspects that are not explicitly mentioned within the source code itself.

12http://www.misra.org.uk/MISRAHome/AbriefhistoryofMISRA/tabid/69/Default.aspx
13ANSI C allows unbounded pointer indirection.
14This rule is intended to allow for the use of short-circuit boolean evaluation by compilers.
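As a small illustration of such environment dependence, the sketch below passes a char to a function that takes an int without any explicit conversion. Whether plain char is signed is left to the implementation in C, so the same source line can yield -1 on one platform and 255 on another. The function names are hypothetical.

```c
#include <limits.h>

/* Accepts an int; callers may pass narrower types without a cast. */
int widen(int value)
{
    return value;
}

/* Reports whether plain char is a signed type on this platform. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Stores the bit pattern 0xFF in a plain char and lets the call
 * convert it implicitly. The result depends on whether char is signed. */
int implicit_conversion_result(void)
{
    char c = '\xFF';
    return widen(c);
}
```

A model of this code that only looks at the source text cannot say which of the two values is produced; the compiler and target platform have to be part of the model.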

Dynamic Memory Allocation

Dynamic Memory Allocation is the process of reserving blocks of memory of arbitrary size from the memory management system (usually the operating system), that can then be used for data storage. Issues arise when the allocated memory area does not conform to any built-in data type of the programming language. This means that unique data types can effectively be created during the execution of the program that is synthesized from the source code. If the sizes of these data types are dependent on input from the environment, then the memory footprint of the program will be non-deterministic during execution. Dynamic Memory Allocation further has the potential to create unreferenceable data storage; if a block of memory is allocated and the reference to this block is lost, then the given data will exist in memory, but it will not be tied to any variable that exists in the system. Data that gets orphaned in this way will cause what is known as memory leaks, i.e. the continuous loss of available memory during program execution.
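The loss of a reference can be sketched in a few lines. The wrapper functions below are hypothetical and only count live blocks so that the leak becomes observable; the size parameter stands for a value that would come from the environment at run time.

```c
#include <stdlib.h>

static int live_blocks;   /* blocks allocated and not yet freed */

/* Hypothetical wrappers that keep a count of live allocations. */
static void *tracked_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        live_blocks++;
    }
    return p;
}

static void tracked_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

/* Allocates two buffers but overwrites the only reference to the first.
 * Returns the number of blocks still live after freeing what it can reach. */
int leak_one_block(size_t size_from_input)
{
    char *buffer = tracked_alloc(size_from_input);
    buffer = tracked_alloc(size_from_input);   /* first block is now orphaned */
    tracked_free(buffer);
    return live_blocks;
}
```

After the call one block remains allocated although no variable refers to it any longer, so no later part of the program is able to release it.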

Pointers

Pointers refer to data types that hold no data value of their own; instead they reference another address in memory, at which data exists. Pointers in C can take the form of variable pointers, function pointers, indirect pointers and pointers to dynamically allocated memory. Variable pointers refer to pointers that contain the address of another variable in the system; they act as a way of indirectly addressing the data stored within this other variable. Function pointers point to memory locations where executable machine instructions reside; they can be used as a way of redirecting control flow to other parts of the system. Indirect pointers are pointers that contain the address of another pointer. The C language supports unbounded pointer indirection, in other words a pointer can point to a pointer that can point to a pointer [...] and so on. If one views pointers as their own data type, then this allows for an infinite number of data types that exist within the C language. Pointers to dynamically allocated memory are pointers that address the start of a block of memory that has been dynamically allocated. That blocks of dynamically allocated memory can be of arbitrary size results in ambiguity as to what a pointer actually addresses: does the pointer reference a single data storage location of the same size as the pointer is meant to address, does it target the first element of an array of elements of the size that the pointer is meant to address, or was the type of the pointer selected simply because of the size that it aims to target rather than the data contained at the targeted location being of this type?


An example to clarify the third case is presented in figure 2.5.

Example of bit addressing using char

    #include <stdlib.h>

    /* 64 bits to hold the squares of a chess board */
    char *visited;

    /* Allocate the 8 blocks of 8 bits each, with all squares unvisited */
    void initBoard(void)
    {
        visited = calloc(8, 1);
    }

    /* Function to mark a given square as visited */
    void visitSquare(int n)
    {
        /* Determine which block the square resides in */
        int block = n / 8;
        /* Determine which bit that holds the square */
        int bit = n % 8;
        /* Set the bit as visited */
        *(visited + block) |= 1 << bit;
    }

Figure 2.5. The usage of the char data type to address individual bits of data. Error-handling omitted for the sake of briefness.
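The first three pointer forms described in this section, namely variable pointers, indirect pointers and function pointers, can be shown side by side in a short sketch. All names are hypothetical.

```c
/* A function that can be reached through a function pointer. */
static int twice(int x)
{
    return 2 * x;
}

/* Exercises a variable pointer, an indirect pointer and a function pointer.
 * Returns the result of calling twice() on the final value of the local. */
int pointer_forms(void)
{
    int value = 1;
    int *variable_ptr = &value;          /* address of another variable */
    int **indirect_ptr = &variable_ptr;  /* address of another pointer */
    int (*function_ptr)(int) = twice;    /* address of executable code */

    *variable_ptr = 5;                   /* value is now 5 */
    **indirect_ptr += 1;                 /* value is now 6 */
    return function_ptr(value);          /* call redirected to twice() */
}
```

None of the three assignments to value mention the variable by name, which illustrates why data flow through pointers is difficult to recover from the source text alone.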

2.4 Introduction to Scania

Scania was founded in 1900 in Malmö, Sweden[38]. Initially the company primarily manufactured bicycles, but production extended in 1902 to include trucks and cars. In 1911 Scania merged with competitor Vabis as a result of increased competition from the European market, and formed Scania-Vabis. Production at this point extended to also cover buses. Scania-Vabis released its first diesel engine in 1936, which evolved to use standardized components in 1939. This was the start of Scania’s modularization effort. During the 1950s Scania-Vabis expanded its European market shares, and established production facilities in Brazil. Scania-Vabis merged with Swedish avionic and automotive company Saab in 1969, and changed name to Saab-Scania. Expansion abroad continued in the following decades, with production facilities being established in Argentina, the Netherlands and France. Saab-Scania separated back into the two companies Saab and Scania in 1995, and Scania was once again introduced on the stock market. Today Scania continues to be a strong competitor on the international market[39, 40, 41], focusing on the production of heavy trucks, buses and commercial engines.

2.5 Related Work

The reasons to perform reverse engineering of systems, as well as an approach to how this can be achieved in practice, are presented in [42]. Many of the reasons to perform reverse engineering presented are similar to those established in this thesis, namely the ability to understand and model engineering design decisions. The approach taken focuses on analysing low-level code artifacts and transforming them into higher level programming concepts (an example of transforming ASM code to something reminiscent of C is presented). This method is similar to the approach taken in this thesis, where C code constructions are transformed into constructions for a given modelling formalism.

Several attempts have been performed at Scania, as parts of various master’s theses[9, 10, 11, 12, 13] and as doctoral research[14], to transform automotive C code into a model depicting the architecture. These attempts have focused on determining the architectural design of the software in regards to data flow and control flow. The focus in these theses however has been placed on the visualization of the architecture with a human as intended audience. The architectural models that have been presented have used semantics determined by the authors rather than using established formalisms. The work presented in this thesis shares this goal, but rather than attempting to establish a new formalism it aims to review current formalisms and see if they can prove sufficient for modelling the architecture of the system. The scope of this thesis further extends beyond the modelling of data and control flow, and the goal is for all constructions in C to be modellable.

In [43] an attempt is made to model ECU software in the OpenModelica formalism. This effort is similar to that performed in this thesis in that it attempts to use a preexisting modelling formalism to model C code. While this thesis aims to provide more shallow analyses of a number of modelling formalisms in order to determine feasibility of application, [43] instead elects to perform a deeper analysis into one particular formalism, OpenModelica. The work in [43] elects to focus on the aspects of control and data flow during the modelling process.
This is a limitation in scope compared to the work in this thesis, where the aim is to cover as many aspects of the C language as possible.

Work by Garcia et al.[44] was performed in the area of determining suitable methods of architecture recovery. In their paper they analyze several existing methods that can be used to recover software architectures, and present which methods achieved the best success rates. Their goal of recovering a system's architecture is shared in this thesis, but this thesis takes a more formal approach to the resulting model, where strong semantics in the used modelling formalism is a key aspect in order to reduce ambiguity. Automatic recovery of architectures is briefly discussed in this work, but it is not the main focus of the thesis.

In [45] a framework for whole-program analysis is presented, with the aim of producing an abstract syntax tree representing the software architecture of a program. This work is similar to that performed at Scania in [9, 10, 11, 12, 13, 14], but focuses on the space-efficient representation of large code-bases rather than specifically on ECU software. While space-efficiency is certainly a concern, especially for automotive vehicle applications due to the number of ECUs that are part of the architecture, it does not concern itself with a formal representation of the architecture; in fact the measures taken to achieve space efficiency result in a model that is not accurately representative of the system, as identical AST nodes are merged, meaning that redundancy cannot be represented. The aim of this thesis is to attempt to create a model that is as closely representative of the actual source code as possible.

A pattern-based approach to architectural recovery is proposed in [46]. In the approach, a directed graph is produced from the source code of the software, and patterns representing various higher level architectural concepts are defined by a user. A query language is then used to match these patterns to sub-graphs of the system-wide graph in order to identify where the given design pattern is used. This approach is similar to that of matching C constructions, which could be said to be reminiscent of the patterns, to model formalism constructions. This thesis places focus on establishing constructions which can be used to perform a mapping similar to the one presented in [46], but does not concern itself with performing the actual mapping in practice.

Several previous surveys have been performed in order to evaluate the expressiveness of various modelling formalisms[47, 48]. The work performed in this thesis is very similar to the work performed in [47], but aims to address the modelling of C code in particular rather than the general expressiveness of the formalisms. Other differences include the restriction to architecture description languages imposed in [47] and to component modelling formalisms in [48], whereas this thesis aims to survey formalisms from several modelling formalism families. Several formalisms surveyed in the previous surveys are no longer maintained15, prompting a need for an updated survey of formalisms.

In [49] a semiotic framework for evaluating the expressiveness of modelling formalisms is proposed, which is further extended in [50] and [51].
The framework borrows concepts from set theory in order to evaluate the expressiveness of a formalism; the expressiveness is defined as a set difference between the expected results of the modelling process and the achieved results thereof. This metric is compatible with the definition of a model presented in definition 1, which also borrows concepts from set theory. The concept of model completeness is defined in [49] according to (4).

(4) Completeness means that the model contains all the statements about the domain that are correct and relevant. That is, D\M = ∅. The corresponding error class is incompleteness, meaning that the equation does not hold.

This definition is binary in that a formalism can only be said to be complete if the equation in (4) holds. In [50] a non-binary definition is proposed for several similar metrics, but is not explicitly defined for the completeness metric itself. Due to the limitations in scope of this thesis, other evaluation metrics defined in [49], [50] and [51] will not be used, as they factor in aspects such as how a model is perceived by an audience. Evaluation of these aspects lies outside of the realm of unambiguous modelling of C language source code, and while they certainly are useful for determining how well a model will be received, they would have to be covered in future work.

15MetaH evolved into AADL, the UniCon toolset webpage is not available as of 2015-04.
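The binary criterion in (4) can be made concrete with a small sketch in which each statement about the domain is represented by one bit in a mask. The encoding and the names are hypothetical and are not the evaluation framework proposed in this thesis; the sketch only shows that D\M = ∅ holds exactly when no statement of D is missing from M.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t statement_set;   /* one bit per statement about the domain */

/* Set difference D \ M: statements in the domain that the model lacks. */
statement_set missing_statements(statement_set domain, statement_set model)
{
    return domain & ~model;
}

/* Completeness in the sense of (4): the difference is the empty set. */
bool is_complete(statement_set domain, statement_set model)
{
    return missing_statements(domain, model) == 0u;
}
```

A model that covers only two out of three domain statements is reported as incomplete in the same way as a model that covers none of them, which is the binary nature of the definition discussed above.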

Chapter 3

The Scania Software Architecture

This chapter will briefly introduce one of the software architectural models used at Scania. The architecture examined will be that of the Coordinator (COO7), one of the ECUs present in all Scania vehicles. COO7 is developed by the REVE - Embedded Software department at Scania, and is responsible for coordinating communication between the various CAN-buses in the vehicle. This case study of the Coordinator aims to showcase the usage of architectural layers in automotive vehicle system design. While the individual layers are specific to REVE developed ECUs, the concept of architectural layers extends to other automotive architectures such as AUTOSAR1. As such the case study presented in this chapter should be applicable to other layered architectures with a few modifications.

3.1 The Layer Model

The embedded software of the Coordinator is divided into nine architectural layers[52]: APPL, MIDD, DRIV, BIOS, RTDB, DIMA, UTIL, EXEC and SYST as seen in figure 3.1. Each architectural layer has its own set of responsibilities.

3.2 Code Organization

Following individual functions and variables, the next finest granularity code structure block in the ECU software architecture is the module: a set of functions with clearly defined input and output interfaces that handle a specific task in the system. In COO7 all modules consist of a C-file, containing the function implementations of the module, and an H-file, containing the function prototypes that make up the interfaces of the module. Optionally a module can include an internal H-file containing module scope functions and variables, as well as a file containing calibration data for the module. These files all follow a specific naming pattern to highlight that they are different parts of the same module. Figure 3.2 contains an example of the different parts of a module.

1 http://www.autosar.org/about/technical-overview/

Figure 3.1. The software architectural layers of COO7. Arrows represent dependencies. Grey layers are unique to COO7, while white layers are general layers.

File Name        Purpose
module1.c        Implements the functionality of the module
module1.h        Defines the interfaces of the module
module1_cal.c    Calibration data for the module
module1_int.h    Internal functions and variables of the module

Figure 3.2. Naming convention for parts of the module module1. This is an example module and is not present in any ECU software.

Each module belongs to a layer, as described in section 3.1. A layer can be identified both through its communication semantics and through the code organization of the software. As shown by the arrows in figure 3.1, each layer is restricted in terms of which other layers it is allowed to communicate with. The APPL layer for example is only allowed to communicate with the MIDD, RTDB, DIMA, UTIL, EXEC and SYST layers, but never directly with the DRIV or BIOS layers. This layer structure is implicit; in other words the layers appear as a result of an abstract design used by the developers rather than through explicit mechanics of the C language in which the code is written. This means that unless one has access to the design documents, containing the description of the layer abstraction model, the fact that the layers exist is not immediately obvious. One would have to perform data and control flow analysis on the code artifacts in order to be able to identify this method of communication.

The code organization of the software refers to the presence of meta-data that highlights design decisions, while not in fact being part of the code artifacts themselves. In the case of the COO7 layers, this takes the shape of a directory structure, where each directory is named after a layer, and all modules belonging to that layer are located inside this directory. An example of what this sort of meta-data can look like is presented in figure 3.3.

COO7
    appl
        appl_module1.c
        appl_module1.h
        appl_module2.c
        appl_module2.h
        appl_module2_cal.c
        appl_module2_int.h
        [...]
    midd
        midd_module1.c
        midd_module1.h
        midd_module1_int.h
        [...]
    driv
        [...]
    bios
        [...]
    util
        [...]
    exec
        [...]
    syst
        [...]
    rtdb
        [...]
    dima
        [...]

Figure 3.3. Directory-based meta-data pertaining to layer architecture. The module names in the figure are obscured, and this is as such not an actual representation of the COO7 architecture.
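Because the layer rules are implicit in the C code, one way to make them checkable is to write them down as data. The following is a minimal sketch, not part of the COO7 code base, that encodes only the restriction stated for the APPL layer in this section. The enumerator names and the function `appl_may_call` are hypothetical, and the treatment of calls within the APPL layer itself is an assumption.

```c
#include <stdbool.h>

/* The nine COO7 layers named in section 3.1 (enumerator names are hypothetical). */
typedef enum {
    LAYER_APPL, LAYER_MIDD, LAYER_DRIV, LAYER_BIOS, LAYER_RTDB,
    LAYER_DIMA, LAYER_UTIL, LAYER_EXEC, LAYER_SYST
} Layer;

/* Returns true if a module in the APPL layer may communicate with `callee`.
 * Only the APPL rule is stated in the text; the rules for the other layers
 * would have to be read from figure 3.1 and are deliberately left out. */
bool appl_may_call(Layer callee)
{
    switch (callee) {
    case LAYER_MIDD:
    case LAYER_RTDB:
    case LAYER_DIMA:
    case LAYER_UTIL:
    case LAYER_EXEC:
    case LAYER_SYST:
        return true;
    case LAYER_DRIV:
    case LAYER_BIOS:
        return false;   /* never directly from APPL */
    case LAYER_APPL:
        return true;    /* assumption: calls within the same layer are not restricted */
    }
    return false;       /* unknown value: reject */
}
```

A tool that derives the layer of each module from its directory, as in figure 3.3, could use such a table to flag function calls that cross layers in a way the design does not allow.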


3.3 Communication Channels

Communication between the layers primarily takes the form of function calls with parameters and return values. There are three exceptions to this however: the RTDB, the CAN network and the sensor readouts. These exceptions will be described separately in their own sections.

Communication between layers is only allowed to take place according to the hierarchy presented in figure 3.1. Each layer provides an interface for inter-layer communications known as a façade. The façade provides all the public interfaces of the layer, and is named according to the layer it provides the interface for. For example the BIOS layer façade is provided in the file bios.c. The main purpose of the façade is to alias internal functions of the layer that should be available as public interfaces. An example can be seen in figure 3.4.

Example of façade aliasing

    void Bios_ad_init(void)
    {
        Qadc_init();
    }

Figure 3.4. The façade of the BIOS layer aliases an internal function of the layer with a Bios_ prefix to indicate that it is a public interface of the layer.

The use of façades simplifies communication between layers, especially in regard to control and data flow, since both are forced to take the form of function calls with parameters and return values.

Intra-layer communication interfaces are not clearly defined, and can as such take several different forms. Function calls with parameters and return values are common, as are module scope variables used for passing data between various parts of the module. This complicates the tracking of data flow inside layers.

3.3.1 CAN

Several CAN-buses are used for communication between the different ECUs of the vehicle. An arriving CAN-message triggers an interrupt inside the ECU to which the message is addressed. The message is then read before the ISR (interrupt service routine) returns control to the previous program counter address. This complicates control flow analysis, as control could at any point in time be handed over to the ISR following environmental signals.

Due to the asynchronous arrival of data over the CAN bus, modelling data flow by observing just the code artifacts is difficult. Due to the priority protocol for CAN signals, the timing behaviour of data arrival is not exact; in fact the arrival of data is not guaranteed in case of bus saturation. This is a concern when modelling the data flow between ECUs, where one ECU can send data that never arrives.
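The difficulty can be illustrated with a minimal host-side sketch of interrupt-driven reception. It is not taken from COO7: the mailbox variables and the functions `can_rx_isr` and `can_poll` are hypothetical, and the ISR takes the received byte as a parameter only so that the sketch can be run without CAN hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-slot mailbox shared between the ISR and the main loop. */
static volatile uint8_t can_rx_data;
static volatile bool can_rx_pending = false;

/* Stand-in for the CAN receive ISR. On target hardware the interrupt
 * controller would invoke it and the byte would be read from a peripheral
 * register; nothing in the application code calls it explicitly. */
void can_rx_isr(uint8_t received_byte)
{
    can_rx_data = received_byte;
    can_rx_pending = true;
}

/* Called from the main loop. Returns true and copies the byte to *out if a
 * message has arrived since the last call. A real implementation would also
 * have to protect this sequence against the ISR firing in the middle of it. */
bool can_poll(uint8_t *out)
{
    bool received = false;
    if (can_rx_pending) {
        *out = can_rx_data;
        can_rx_pending = false;
        received = true;
    }
    return received;
}
```

Looking only at `can_poll`, there is no call site that explains where `can_rx_data` gets its value, and a message lost on a saturated bus simply shows up as `can_poll` returning false. Both effects would have to be captured by a model of the data flow.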


3.3.2 RTDB

Communication of data values between the APPL and MIDD layers happens through the Real-Time Database layer, the RTDB. The RTDB layer contains a number of functions for storage and retrieval of different data types. The data that resides in the RTDB is stored together with a tag, a keyword that identifies which data value the caller is trying to address. This means that the APPL layer does not have to concern itself with how data was produced; it simply fetches the current value stored in the database under the tag it is interested in. This thesis will not cover the internal workings of the RTDB, but the save-by-tag read-by-tag data flow creates an additional alias for a given piece of data, complicating the tracking of data flow.
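The save-by-tag, read-by-tag pattern can be sketched as follows. This is an illustrative toy and makes no claim about how the actual RTDB is implemented: the tag names, the storage layout and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tags; each one names a data value held in the database. */
typedef enum { TAG_VEHICLE_SPEED, TAG_COOLANT_TEMP, TAG_COUNT } RtdbTag;

static int32_t rtdb_values[TAG_COUNT];

void rtdb_write(RtdbTag tag, int32_t value)
{
    assert(tag < TAG_COUNT);
    rtdb_values[tag] = value;
}

int32_t rtdb_read(RtdbTag tag)
{
    assert(tag < TAG_COUNT);
    return rtdb_values[tag];
}

/* MIDD side: refines a raw reading (tenths of km/h) and publishes it. */
void midd_publish_speed(int32_t raw_tenths_kmh)
{
    rtdb_write(TAG_VEHICLE_SPEED, raw_tenths_kmh / 10);
}

/* APPL side: consumes the value without knowing who produced it. */
bool appl_speed_above(int32_t limit_kmh)
{
    return rtdb_read(TAG_VEHICLE_SPEED) > limit_kmh;
}
```

There is no function call from `appl_speed_above` to `midd_publish_speed`; the two are connected only through the shared tag. A data flow analysis therefore has to treat each tag as an alias for the data it names.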

3.3.3 Sensors

Sensors in COO7 are connected to the main MCU (Microcontroller Unit) via an Analogue-Digital converter (AD-converter) through memory-mapped IO pins. As such the values of the sensor registers are present at a specific memory address. Sampling of the sensors is done through periodic reading of the value at this memory address. Data flow for memory-mapped IO is difficult to track, as the value residing at the memory address is updated by the environment, in this case the AD-converter sampling tick. This value can therefore change without it being evident in the code. This means that in order to model the data flow of values read from sensors, the concept of execution time of the code needs to be taken into account. If it is not, then retrieving the value of the sensor at different points within the scope of the same function could result in phantom reads.
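The effect of reading a memory-mapped register at two different points in the same function can be sketched on a host machine. The sketch below is an assumption-laden illustration: a volatile variable stands in for the AD-converter register, and `adc_simulate_sample`, `sensor_read_twice` and `sensor_read_snapshot` are hypothetical names.

```c
#include <stdint.h>

/* Stand-in for the memory-mapped AD-converter result register. On target
 * hardware this would be a fixed address defined by the MCU; a plain
 * volatile variable is used here so that the sketch runs on a host. */
static volatile uint16_t adc_result_reg;

/* Simulates the AD-converter completing a conversion. */
void adc_simulate_sample(uint16_t value)
{
    adc_result_reg = value;
}

typedef struct {
    uint16_t raw;     /* value as read from the register */
    uint16_t scaled;  /* raw value divided by 4 (arbitrary scaling) */
} SensorReading;

/* Reads the register twice; `next_sample` stands in for a conversion that
 * completes between the two reads. */
SensorReading sensor_read_twice(uint16_t next_sample)
{
    SensorReading r;
    r.raw = adc_result_reg;
    adc_simulate_sample(next_sample);
    r.scaled = (uint16_t)(adc_result_reg / 4u);
    return r;
}

/* Reads the register once and derives both fields from the same sample. */
SensorReading sensor_read_snapshot(void)
{
    const uint16_t sample = adc_result_reg;
    SensorReading r;
    r.raw = sample;
    r.scaled = (uint16_t)(sample / 4u);
    return r;
}
```

With the register holding 400 and a new sample of 800 arriving between the reads, `sensor_read_twice` returns a raw value of 400 together with a scaled value of 200, a pair that never existed at any single point in time. The snapshot variant avoids this, but a model still has to know that the register can change between two invocations.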


Chapter 4

Method and Evaluation Framework

As highlighted in the previous chapters, a hierarchical architectural model of the embedded software is required by ISO 26262. In addition to this, a number of problems with achieving such a model were presented. This chapter aims to address those concerns, and examine a way in which a hierarchical model of the embedded software could be achieved.

4.1 Accurately Representing Software Architectures

In order to create a hierarchical model we must first examine which hierarchical structures exist in C code. Three such aspects could quickly be identified by examining the reference automotive architecture present in COO7: control flow, data flow and code organization. Communication between layers through the façades primarily takes the form of function calls with parameters and return values. As such control is passed between layers when communication takes place. Data propagates from lower layers, in the form of AD-converter values, up through the higher levels where it is gradually transformed into more refined values that can be used by the higher level application modules. A given piece of data will as such have several intermediate representations before it reaches its final form, and is subsequently discarded or stored. Data flow thus becomes a key aspect to track in order to determine where the data refinement of the system takes place. In addition to control and data flow, tracking the organization of the source code itself has proven to provide useful meta-data regarding the architectural design of the software, as shown in figure 3.3.

Examining legal keywords in the C language [27], two more hierarchical concepts appear: hierarchical inclusion in composite data types and pointer hierarchies. The former refers to the way in which composite data types in C can include each other in order to create larger composite types. An example of this is presented in figure 4.1.

    typedef struct S1 {
        int s1_member1;
    } S1;

    typedef struct S2 {
        int s2_member1;
    } S2;

    typedef struct S3 {
        S1 s3_member1;
        S2 s3_member2;
    } S3;

    typedef struct S4 {
        S3 s4_member1;
    } S4;

    [Diagram: the resulting hierarchy, in which S4 contains S3, and S3 contains S1 and S2.]

Figure 4.1. The hierarchical construction of composite data types.

Pointer hierarchies stem from the concept of pointer indirection in the C language. These hierarchies differ from the control flow, data flow, code organization and composite data type hierarchies in that the hierarchy is reversely linked; that is, the child nodes of the hierarchical tree will know the parent, but not the other way around. The opposite is true for e.g. function call hierarchies, where a given function will be aware of which functions it calls, but not by which functions it is called.

Pointer hierarchies represent which pointers alias a given piece of data. A given pointer can address data directly, or it can address another pointer through what is known as pointer indirection. Several pointers can address the same piece of data. This forms an alias hierarchy where the root node is a data value. An example of a pointer hierarchy can be seen in figure 4.2.

    int P1 = 5;

    int *P2 = &P1;
    int *P3 = &P1;
    int **P4 = &P2;
    int ***P5 = &P4;
    int ***P6 = &P4;

    [Diagram: the resulting alias hierarchy with P1 as root; P2 and P3 point to P1, P4 points to P2, and P5 and P6 point to P4.]

Figure 4.2. An example of hierarchies stemming from pointer indirection.


4.2 The Concept of Model Views

Given the wide range of possible hierarchical structures that can be defined, as outlined in section 4.1, it becomes clear that a modelling formalism that could satisfy ISO 26262 would be required to be fairly expressive. ISO 26262 does not explicitly define which hierarchies are relevant for compliance, so the assumption is made in this thesis that all hierarchical structures present in C code must be representable in a model. Not all hierarchies will necessarily be relevant for a given observer however. In order to address this concern, we will introduce the concept of model views. Formally a model view is defined according to definition 3.

Definition 3. A model view $V_i$ of a given modelling formalism $M$ is a subset of $M$. This subset could be proper or improper. The view space $V_M$ of a modelling formalism $M$ is the set of all model views.

Informally a view allows a user to select a part of the model to view, based on the user’s given interest. A programmer might be interested in the view of all library functions present in the source code, while an architect might be interested in the view of all modules present in the code. The broad scope of this definition, which stems from the lack of clarity regarding the general perception of what a model is, as highlighted in section 2.1, is the core concern of this thesis: how do we identify a modelling formalism that is powerful enough to support this definition of a view? The number of possible views grows exponentially as the size of the model increases, placing strong requirements on the modelling formalism in terms of compositionality, in order to efficiently be able to reproduce these given views upon request.

In order to evaluate the feasibility of ISO 26262 model requirements compliance for current off-the-shelf modelling formalisms, there is a need to determine to what extent they can produce given views. Currently ISO 26262 requires hierarchical models, which means that views representing data flow, control flow, pointer hierarchies, composite data type hierarchies as well as code organization hierarchies need to be present at the very minimum. The standard further poses requirements on the use of pointers, recursion, multiple use of variable names etc. For a full list of requirements see [7]. With the increased safety risk present in heavy trucks stemming from their increased weight and longer lifespans however, it would not be unexpected if the requirements are extended in a realization of IEC 61508 for heavy trucks.

To determine if currently available modelling formalisms are sufficient to achieve ISO 26262 compliance one would need to determine which hierarchical structures can be modelled in the given modelling formalism. By extension, in order to cope with any possible additions that could be added in a future heavy vehicle revision of the standard, it is of interest to determine which model views can be created in a given formalism. This allows for the creation of a mapping from the views required by the standard to views that can be created within the formalism, and as such it can quickly be determined if a given formalism could achieve compliance.
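The exponential growth of the view space follows directly from definition 3: if every subset of the model is a view, a model with n elements has 2^n views. The sketch below illustrates this under the assumption that a view is encoded as a bitmask over the model elements; the encoding and the function names `view_count` and `view_contains` are hypothetical and not part of the thesis framework.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Number of distinct views (all subsets, including the empty view and the
 * full model) of a model with n elements: 2^n. */
unsigned long view_count(unsigned int n)
{
    assert(n < sizeof(unsigned long) * CHAR_BIT); /* keep the shift in range */
    return 1UL << n;
}

/* A view encoded as a bitmask: bit i set means that model element i is
 * part of the view. */
bool view_contains(unsigned long view, unsigned int element)
{
    assert(element < sizeof(unsigned long) * CHAR_BIT);
    return ((view >> element) & 1UL) != 0UL;
}
```

A model of only 20 elements already has 1048576 possible views, which is why views would in practice have to be composed on request rather than enumerated in advance.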

4.3 Extending Model Completeness

In [50] a non-binary metric for the evaluation of coverage with regards to various aspects of a modelling language is proposed, as highlighted in section 2.5. This set of metrics does however not cover the completeness of the formalism, as defined in [49]. Borrowing from the theory introduced in [50], we can define a completeness coverage metric according to definition 4, which will be used as the evaluation metric for this thesis.

Definition 4. The completeness coverage $C_C$ of a modelling formalism $M$ with respect to a domain $D$ is

$$C_C = 1 - \frac{|D \setminus M|}{|D|}$$

This metric will produce a real number in [0, 1], representing the ratio of ele- ments in domain space D that can be modelled in the model space M of a given modelling formalism. The metric will be further extended in section 4.7 to address user subjectivity in regards to the expressivity of a given modelling formalism.
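As a worked illustration of the metric, the following sketch computes the completeness coverage for a domain given as an array of flags. The representation is an assumption made for the example: `modelled[i]` is true when the i-th domain construction can be modelled, so the number of false entries corresponds to |D \ M|. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Completeness coverage 1 - |D \ M| / |D| for a domain of `domain_size`
 * constructions, where modelled[i] tells whether construction i is in M. */
double completeness_coverage(const bool modelled[], size_t domain_size)
{
    size_t missing = 0u;
    size_t i;

    assert(domain_size > 0u); /* the metric is undefined for an empty domain */
    for (i = 0u; i < domain_size; ++i) {
        if (!modelled[i]) {
            ++missing;
        }
    }
    return 1.0 - (double)missing / (double)domain_size;
}
```

For a domain of four constructions of which one cannot be modelled, the function returns 1 - 1/4 = 0.75.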

4.4 A Framework for Evaluating Expressiveness

To determine which views can be created in a given formalism, there is a need to establish the elements of model space. These elements will make up the subsets $V_i$ of $M$, and as such make up the views. In order to establish these elements however, a limitation will need to be made to model elements that correspond to concepts that appear in the C programming language. The assumption is made that a given modelling formalism is unlikely to be lean with respect to the C language; that is, the modelling formalism is likely to contain constructions that are not used in the mapping from C to model. This assumption is supported by the modelling formalism families presented in section 2.1.2, where Architecture Description Languages often contain constructions regarding hardware mappings of software components, an aspect that is not necessarily present in the code artifacts. As such an identification process for elements in $M$ that could be members of model views, while certainly possible, would result in an arduous process where many of the identified constructions would have no corresponding mapping in domain space.

The solution used in this thesis will instead start with the elements of domain space, perform the mapping according to a given formalism, and examine the resulting elements in model space. The mapped elements will make up their own view space, which will be a subset of $V_M$. The difference in cardinality between this view space and $V_M$ represents the completeness coverage (see definition 4 in section 4.3) of the modelling formalism, in other words how many concepts that occur in the C language can be modelled in the given formalism.

4.4.1 Constructions in C

In order to determine which constructions exist in domain space, a reference is required detailing the concepts that occur within the language. For this we will use The C Programming Language, 2nd edition by Brian Kernighan and Dennis Ritchie [27]. This book was co-authored by Dennis Ritchie, the inventor of the C language, and should as such be considered an authority on the subject. The constructions identified in the book will then be cross-referenced with the source code of COO7 in order to identify any possible missing elements that could e.g. be compiler specific. The COO7 source code will also be examined for meta-data that would need to be factored into the model, such as directory structures or file naming conventions.

MISRA-C:1998 Restrictions. The MISRA-C:1998 guidelines provide a number of restrictions on what constitutes a legal construction. Since this standard is largely being followed in the automotive industry, the delimitation to only account for constructions allowed in MISRA-C:1998 will be made within this thesis. A consequence of this is that not all constructions allowed in [27] will make it into this thesis. In many cases these limitations impose depth bounds (e.g. maximum levels of pointer indirection) or restrict coding style. These restrictions do not prohibit the use of certain constructions; they simply bound them, or force them to follow certain conventions. Other rules however do impose restrictions on the constructions available (e.g. the prohibition of using dynamic memory allocation). This latter kind of restriction will be reflected in the identified constructions.
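To illustrate the second kind of restriction, the sketch below shows one common way of living without dynamic memory allocation: a fixed-size pool in static storage takes the place of malloc and free. It is an illustration only and makes no claim of full MISRA-C:1998 compliance; the `Message` type, the pool size and the function names are hypothetical, and the C99 header `<stdint.h>` is used for brevity.

```c
#include <stddef.h>
#include <stdint.h>

#define MSG_POOL_SIZE 4u

typedef struct {
    uint8_t payload[8];
    uint8_t in_use;
} Message;

/* Statically allocated and zero-initialised; its size is fixed at build time. */
static Message msg_pool[MSG_POOL_SIZE];

/* Returns a free message slot, or NULL if the pool is exhausted. */
Message *msg_acquire(void)
{
    Message *result = NULL;
    size_t i;

    for (i = 0u; (i < MSG_POOL_SIZE) && (result == NULL); ++i) {
        if (msg_pool[i].in_use == 0u) {
            msg_pool[i].in_use = 1u;
            result = &msg_pool[i];
        }
    }
    return result;
}

/* Returns a slot to the pool; passing NULL is allowed and does nothing. */
void msg_release(Message *msg)
{
    if (msg != NULL) {
        msg->in_use = 0u;
    }
}
```

For the purposes of modelling, the consequence is that every object in the program has a statically known identity, so a construction corresponding to heap allocation does not need to exist in model space.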

4.5 Requirements on Modelling Formalisms

Surveying all existing modelling formalisms present on the market today would be an infeasible amount of work, especially considering the broad definition of a modelling formalism used in this thesis. The aim of the thesis is to present a list of modelling formalisms that could potentially prove beneficial in achieving ISO 26262 compliance within the automotive industry, and as such there is a need to limit the scope of which modelling formalisms are reviewed. In order to narrow down the scope of candidate modelling formalisms a set of requirements will be posed on formalisms that are up for consideration. These requirements aim to restrict the surveyed formalisms to ones that could feasibly be used by line engineers in an industrial environment.

Restricting the reviewed formalisms to those that could model automotive applications is a natural first step. Further restrictions are imposed in terms of availability of documentation and tools. These requirements aim to aid engineers in the usage of the formalisms, and minimize the need for in-house development. Requirements on maturity and development of the language are imposed in order to ensure that the formalisms are actively maintained and have proven to be of use in industrial applications. The formalisms need to be unambiguous in order to comply with ISO 26262. Lastly the formalisms are required to be open. This last requirement stems from a practical limitation of this thesis; the author does not have the ability to procure proprietary tools that require licensing fees. The full set of requirements is presented in appendix A, while a summary can be seen in figure 4.3.

Modelling Formalism Requirements

• Domain Relation

– Formalisms meant to model Automotive applications

• Documentation

– Formalisms require proper documentation to make them viable for widespread use

• Active Development and Maturity

– Formalisms are required to be properly maintained and proven to be viable in industrial applications

• Representation

– Unambiguous formalisms, supporting development of automated tools

• Tool Support

– Presence of tools in order to reduce necessity of in-house development

• Openness

– Formalism should not require licensing fees

Figure 4.3. Summary of Modelling Formalism requirements

4.6 Method of Evaluation

Combining the subtasks outlined in this chapter, we are left with a work-flow for the survey. The steps required in order to evaluate the completeness coverage of modelling formalisms are chronologically outlined in figure 4.4.

Survey Work-Flow

1. Identify constructions in the C language through a literature study and case study of COO7

2. Establish requirements to limit selection of modelling formalisms

3. Establish which modelling formalisms fulfil the requirements, and as such will be a part of the survey

4. For every identified modelling formalism:

   a) Create mapping from domain space to model space for each construction identified in the C language
   b) Determine strength of mapping
   c) Establish the completeness coverage of the formalism with regards to constructions

5. Analyze missing aspects for ISO 26262 compliance of surveyed formalisms and suggest augmentations to increase compliance

6. Review possibility of automatic reverse engineering of code artifacts into models

Figure 4.4. The steps of the modelling formalism survey.

The need for steps 1-4 has been established in previous sections of this chapter. Of note however is step 4b, where the strength of the mapping from domain space to model space is examined. This step aims to establish the unambiguity of the mapping; it might be the case that several constructions in the C language map to the same construction in model space, which causes ambiguity of the model. This could particularly be the case with constructions that effectively alias the same computational concept, so called syntactic sugar. In model space, the need for syntactic sugar is limited, and in many cases even undesirable as it results in ambiguity in model space. In domain space however, syntactic sugar will be present, and the ability to model its presence in code artifacts would therefore be desirable. This dilemma is discussed in further detail in section 4.6.1.

In order to highlight where ambiguities exist in the translation from model space to domain space, three degrees of fulfilment will be used: Fulfilled, Partially Fulfilled and Not Fulfilled. Fulfilled means that there exists an unambiguous mapping from domain space to model space and vice-versa. Formally this means that if $F(x) = a$ is a mapping from domain space to model space, then $\exists! a : F^{-1}(a) = x$ where $F^{-1}$ is a synthesis function, in other words a function that synthesizes a domain object from a model. If this property holds then a model can be derived from code artifacts, and the same code artifacts can be derived from the model. This eliminates the occurrence of architectural drift, as the model and synthesized system are guaranteed to remain consistent even when one is changed.


If $a$ is not unique in the domain-to-model transformation $F(x) = a$, if $x$ is not unique in $F^{-1}(a) = x$, or if $F^{-1} \circ F(x) \neq x$, then architectural drift will occur.

If a construction is determined to be Partially Fulfilled then it can be modelled, but in an ambiguous or not formally defined way. This stems from one of the situations previously mentioned as leading to architectural drift. For these constructions a comment will be attached highlighting what prevents the construction from being considered Fulfilled.

Not Fulfilled indicates that the construction cannot be modelled, that is $\nexists a : F(x) = a$.
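The three degrees of fulfilment can be sketched as a check over a table that represents the mapping F. The table below is a made-up toy, not a result of the survey: it assumes a formalism with a single loop construct and a subprogram construct and no counterpart to goto. The sketch only detects ambiguity caused by several domain constructions sharing one model construction; a mapping that is not formally defined would have to be marked as Partially Fulfilled by hand.

```c
/* Hypothetical domain constructions (C) and model constructions. */
typedef enum { D_FOR_LOOP, D_WHILE_LOOP, D_FUNCTION, D_GOTO, D_COUNT } DomainItem;
typedef enum { M_NONE, M_LOOP, M_SUBPROGRAM } ModelItem;
typedef enum { NOT_FULFILLED, PARTIALLY_FULFILLED, FULFILLED } Fulfilment;

/* Toy forward mapping F: index is the domain item, value is the model item. */
static const ModelItem forward_map[D_COUNT] = {
    M_LOOP,        /* D_FOR_LOOP   */
    M_LOOP,        /* D_WHILE_LOOP */
    M_SUBPROGRAM,  /* D_FUNCTION   */
    M_NONE         /* D_GOTO: no model construction exists */
};

Fulfilment classify(DomainItem x)
{
    const ModelItem a = forward_map[x];
    Fulfilment result = FULFILLED;
    int other;

    if (a == M_NONE) {
        result = NOT_FULFILLED;
    } else {
        /* If another domain item maps to the same model item, the inverse
         * of F cannot recover x uniquely, so the mapping is ambiguous. */
        for (other = 0; other < (int)D_COUNT; ++other) {
            if ((other != (int)x) && (forward_map[other] == a)) {
                result = PARTIALLY_FULFILLED;
            }
        }
    }
    return result;
}
```

In this toy table the for and while loops are both Partially Fulfilled, since a loop in the model cannot be synthesized back into the original construction with certainty, while the function is Fulfilled and goto is Not Fulfilled.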

Step 5 in figure 4.4 aims to explore the possibility of using extensions to modelling formalisms in order to increase the completeness coverage of the formalism. Several modelling formalisms available on the market today support extensions that can be provided by users, e.g. AADL through the use of annexes or SysML through the use of stereotypes. These are not officially a part of the formalism and are as such not formally defined. In order for such augmentations to become feasible for automotive industry use, they would have to be formally defined and accepted by the industry. This formalization is beyond the scope of this thesis. Instead the thesis aims to explore whether such an extension could potentially be achieved, and then formally defined at a later stage. In other words, could the formalisms be augmented without losing their current semantics?

In the last step the possibility of realizing $F(x)$ algorithmically will be explored. Due to the vastness of existing automotive codebases the ability to perform the translation from domain space to model space automatically would be desirable, as it would ensure that architectural drift could be prevented (if manual labour is required to perform the translation, then in order to avoid architectural drift an engineer would manually need to update the model whenever a change is made in domain space). While implementing a full source-to-source compiler from domain space to model space is beyond the scope of this thesis, such an implementation would be key to allow the widespread use of the models. As such this thesis will attempt to provide guidelines for how such a compiler could be algorithmically constructed.

4.6.1 Modelling code or modelling behaviour?

A key concern with the modelling of C language code is to determine what constitutes domain space: the code as written or the code as implemented. When the C code is compiled into an executable piece of software, the compiler is bound to make optimizations that are not present in the code. As such the code as implemented by the compiler will not necessarily show the same behaviour that the source code would imply. Apart from compiler level optimizations such as loop unrolling and strength reduction, syntactic sugar as well as the pre-processor are of interest. Syntactic sugar signifies several ways of writing the same piece of code, depending on programmer preference. One example would be the convention of writing a[i] = 5; rather than *(a + i) = 5;. When compiled these statements are equivalent, however in domain space they could potentially indicate conventions used by a certain programmer, or company best practices. As such it is unclear if this distinction is relevant. The pre-processor performs a set of transformations prior to compilation. Since the code as implemented will not see the transformations of the pre-processor, a concern arises whether the code-base should be modelled before or after this transformation.

In this thesis we will concern ourselves with the code as written rather than the code as implemented. This is a consequence of the lack of complete compiler insight; we do not know for certain which transformations are being made by the compiler. Determining when a certain transformation is performed would require deep insight into the internal workings of the compiler, which would either be time-consuming beyond the scope of the thesis, or in some cases, such as with commercial compilers, strictly impossible as the compiler source code is protected. The thesis will also view the pre-processor instructions as part of the code itself prior to transformation. This follows from the squashing of structural components that occurs in the pre-processor. The most notable example is the include directive, where objects from another structural component are referenced from a given structural component. The pre-processor resolves these references by merging the referenced file into the referencing file, thus making the structural components appear as one. This clearly goes against the preservation of hierarchies required by ISO 26262.
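The array example of syntactic sugar mentioned in this section can be made concrete with two functions that differ only in notation. The function names are hypothetical; the equivalence itself follows from the definition of the subscript operator in C.

```c
/* Writes `value` to element `i` using subscript notation. */
void set_by_index(int a[], int i, int value)
{
    a[i] = value;
}

/* Writes `value` to element `i` using explicit pointer arithmetic.
 * In C, a[i] is defined as *(a + i), so both functions have the same
 * effect even though they read differently in the source. */
void set_by_pointer(int *a, int i, int value)
{
    *(a + i) = value;
}
```

A model of the code as written would keep these as two different constructions in domain space, while a model of the code as implemented would see only one.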

4.7 Weaknesses of Method

This section aims to address the weaknesses of the suggested method, and factors that could potentially mitigate these weaknesses.

Lack of weight in constructions. The completeness coverage metric presented in section 4.3 does not distinguish between views that are relevant to a given observer and views that are irrelevant. This is a consequence of the delimitation to not include humans in the survey. Using the ISO 26262 requirements as a base-line, a modelling formalism that can produce a large number of views, out of which none are hierarchical, will inevitably score a higher completeness coverage metric than a formalism that can produce fewer views but where the views are hierarchical. In order to remedy this concern, an augmented completeness coverage metric is proposed in definition 5.

Definition 5. The Augmented Completeness Coverage $C_C^A$ is given by the formula:

$$C_C^A = \sum_{d_i \in D} \left( P(d_i, M \cap D) \cdot W(d_i) \right)$$

where

$$P(x, Y) = \begin{cases} 1, & \text{if } x \in Y \\ 0, & \text{if } x \notin Y \end{cases}$$

is a presence function that determines whether an element exists in a set, and $W(x)$ is the weight value of variable $x$, where $x \in D$ and $\sum_{d_i \in D} W(d_i) = 1$.

The Augmented Completeness Coverage metric assigns a weight to each given construction in domain space, and calculates a weighted coverage metric. If all weights are the same for all constructions in domain space, this definition yields the same result as the initial completeness coverage metric, in other words $C_C^A = C_C$ if $\forall d_i, d_j \in D : W(d_i) = W(d_j) = 1/|D|$.

Proof. We begin by defining the universe $U$ as $U = D \cup M$. Any elements that lie in $(D \cup M)^{\complement}$ take the form of objects that neither lie in our domain of interest, nor in the domain of objects that can be modelled. These elements are not relevant to consider during the evaluation of a modelling formalism for a given domain and the definition does as such not cause a loss of generality. We know by the definition of the weight function $W$, as well as the presence function $P$, that $\sum_{d_i \in D} W(d_i) = 1$ and that $\forall d_i \in D : P(d_i) \in \{0, 1\}$. From this, coupled with the definition of $P$, follows that

$$C_C^A = \sum_{d_i \in D} P(d_i, D \cap M) \cdot W(d_i) = 1 - \sum_{d_i \in D} P(d_i, (D \cap M)^{\complement}) \cdot W(d_i)$$

Simplifying through the use of De Morgan’s Laws yields

$$1 - \sum_{d_i \in D} P(d_i, (D \cap M)^{\complement}) \cdot W(d_i) = 1 - \sum_{d_i \in D} P(d_i, D^{\complement} \cup M^{\complement}) \cdot W(d_i)$$

The weight of a given $d_i$ is part of the sum iff $P(d_i) = 1 \land d_i \in D$ due to the summation limits as well as the definition of $P$. The sum above can therefore be rewritten as

$$1 - \sum_{d_i \in D} P(d_i, D^{\complement} \cup M^{\complement}) \cdot W(d_i) = 1 - \sum_{d_i \in (D \cap (D^{\complement} \cup M^{\complement}))} W(d_i)$$

We know that $D \cap D^{\complement} = \emptyset$. The summation limit can as such be further simplified as

$$D \cap (D^{\complement} \cup M^{\complement}) = D \cap M^{\complement} = D \setminus M$$

Replacing $W(d_i)$ with $\frac{1}{|D|}$ and inserting the simplified limit yields

$$C_C^A = 1 - \sum_{d_i \in (D \cap (D^{\complement} \cup M^{\complement}))} W(d_i) = 1 - \sum_{d_i \in D \setminus M} \frac{1}{|D|} = 1 - \frac{|D \setminus M|}{|D|} = C_C$$


This metric allows a given user to select which views are of importance to them and select weights accordingly in order to determine the coverage of a formalism. A result of this thesis will be the computation of $P(d, M \cap D)$ for the surveyed formalisms, which follows as a consequence of $F(x) \in M \cap D$; in other words the application of the modelling formalism transfer function will produce constructions that can be modelled, that is constructions in $M \cap D$. As such the results of this thesis can be combined with any user-defined weight function, and the Augmented Completeness Coverage can be computed without the need to redo the survey.

In order to perform this conversion, consider constructions classified as Fulfilled to be $P = 1$ and constructions that are Not Fulfilled as $P = 0$. Constructions that are Partially Fulfilled could be considered either $P = 0$ or $P = 1$, depending on how strictly a user elects to stick to formal definitions. The recommendation of this thesis would be to consider Partially Fulfilled constructions as $P = 0$.
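The conversion from fulfilment degrees to the Augmented Completeness Coverage can be sketched as follows. The function name, the array representation and the `count_partial` switch are assumptions made for this example; the caller is assumed to supply weights that sum to 1, as required by definition 5.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { NOT_FULFILLED, PARTIALLY_FULFILLED, FULFILLED } Fulfilment;

/* Sums the weights of all constructions whose presence function P is 1.
 * Fulfilled counts as P = 1 and Not Fulfilled as P = 0. Partially Fulfilled
 * counts as P = 1 only if `count_partial` is true; passing false follows the
 * recommendation given in the text. */
double augmented_coverage(const Fulfilment result[], const double weight[],
                          size_t n, bool count_partial)
{
    double coverage = 0.0;
    size_t i;

    for (i = 0u; i < n; ++i) {
        const bool present = (result[i] == FULFILLED) ||
                             (count_partial && (result[i] == PARTIALLY_FULFILLED));
        if (present) {
            coverage += weight[i];
        }
    }
    return coverage;
}
```

For four constructions classified as Fulfilled, Partially Fulfilled, Not Fulfilled and Fulfilled with weights 0.5, 0.25, 0.125 and 0.125, the strict reading gives 0.5 + 0.125 = 0.625, while counting the partial construction gives 0.875. Only the weight array has to change when a different user values the constructions differently.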

Extraction of Constructions Given the limited time scope of the thesis work, a complete study of all aspects of the C language could not be performed, and it is therefore possible that some constructions have not been taken into account. Due to different brands of compilers implementing their own pre-processors, constructions such as pragma, which provide specific directives to the compiler, have not been fully considered. The thesis primarily considers compiler independent C constructions, as part of ANSI C. It is however still possible that constructions have been left out that are part of ANSI C, but these constructions would then not have been mentioned in [27], been present in the source code of the Coordinator or been known by the author of the thesis, and should as such be considered to be rare in terms of usage.

Lack of Complete Survey of Formalisms With the broad definition of a modelling formalism, as presented in chapter 2, it would be impossible to cover every formalism within the scope of this thesis. In fact most programming languages fit under the scope of the definition of a modelling formalism, as programs written in C code can in many cases be reproduced in other programming languages. Even with a delimitation to only regard modelling formalisms where the authors claim that the formalism is designed for modelling, the number of formalisms would be very large. As such an initial selection of formalisms was required, which then was further narrowed down by the formalism requirements. The initial selection was based on the survey performed in [47], and extended with popular modelling formalisms currently used in industry as recommended by the thesis supervisor. It is unlikely that all formalisms that could be considered relevant in terms of modelling the C language are considered, but the current selection should provide a baseline for the various families of modelling formalisms as presented in section 2.1.2. The evaluation framework is designed to be general enough that it should be directly applicable to any formalism not included in this initial survey, and the limited scope of formalisms selected should as such not result in a loss of generality.


Chapter 5

Coverage Analysis

This chapter will present the results of the survey. The identified constructions of the C language will be presented, followed by a description of the modelling formalisms that were considered, and subsequently evaluated or rejected, as part of the survey. The chapter is concluded with the coverage analysis of the surveyed formalisms with regards to the C constructions.

5.1 C Construction Categories

For ease of readability the constructions identified during the literature study, as well as during the case study of the source code for COO7, are divided into five categories based on what mechanical concept they cover.

Data Storage represents constructions related to the storage of data in the system. This includes concepts such as variable types, translation from one variable type to another (type casting) and the scopes of variables.

Data Flow covers the different ways that data can travel between structural components of the source code. This category includes concepts such as function parameter passing and return values, modification of data through aliasing pointers, side effects on global variables through function calls and similar concepts.

Control Flow covers ways that program execution paths can be redirected between various structural components of the source code. The category includes concepts such as loops, function pointers, if-else if-else statements and similar.

Code Structure covers human designed concepts of code organization rather than organization that is dictated by the C language itself. This category covers concepts such as the division of source code into various files, inclusion of such source code files, file scope functions, interfacing with other programming languages and similar.

Program Behaviour covers aspects related to the behaviour of the program compiled from the source code, as well as bounds imposed on the source code by external means that are not part of the C language themselves. This covers concepts such as deterministic program execution, requirements posed on the source code, performance metrics of the code and similar concepts.

In the subsections below the identified constructions are listed by category. Each construction is assigned a key in order to allow for quick referencing as well as a more descriptive name, highlighting the concept that it covers. A C code example is provided, showing how the construction could appear in source code. In some cases a clarifying comment is included, highlighting boundary restrictions as imposed by the MISRA standard, or a further description that was deemed too long to be present in the name of the construction.

5.1.1 Data Storage

The Data Storage category includes 24 constructions, as presented in table 5.1.

Key - Name
    Example
    Comments

A1 - Declaration of typed data storage containers
    Example:
        int a;
        float b;

A2 - Custom data types not built into the language
    Example:
        tU32 id_U32;
        tBIOS_STATUS_E can4_s_ramStatus_E;

A3 - Size of data type
    Example: tU32 is 32 bits

A4 - Data type neutral empty element
    Example:
        NULL
    Comments: Applies to pointers

A5 - Composite data type - Structure
    Example:
        typedef struct DataContainer {
            tU32 member1;
            tU64 member2;
        } Data_Container;

A6 - Composite data type - Union
    Example:
        typedef union DataContainer {
            tU32 member1;
            tU64 member2;
        } DataContainer;

A7 - Function scope variables
    Example:
        int fun1() {
            int a = 5;
            return a;
        }

A8 - File scope variables
    Example:
        static int a;

A9 - Global scope variables
    Example:
        int a = 5;

        int main(void) {};

A10 - Block scope variables
    Example:
        int main(void) {
            {
                int a = 5;
            }
        }

A11 - Constant variables
    Example:
        const int a = 5;
    Comments: Applies to all variable scopes

A12 - Arrays
    Example:
        int a[5];
        int b[5][5];
    Comments: How do we model that an array is effectively a pointer to the first element?

A13 - Pointers
    Example:
        int *a;

A14 - External variables
    Example:
        File1.c

        int a;

        File2.c

        extern int a;

A15 - Typecasting of variables
    Example:
        tU32 a = (tU32) 5;
    Comments: Applies both to immediate values (such as 5) and to casts between variables during assignment.

A16 - Persistent function-level variables
    Example:
        int flipFlop(int d) {
            static int a = 0;
            int tmp = a;
            a = d;
            return tmp;
        }

A17 - Pre-processor constants
    Example:
        #define A 100

A18 - Unambiguous data types
    Example:
        char a;

        is this equivalent to

        unsigned char a;

        or

        signed char a;

        ?

A19 - Pointer indirection
    Example:
        int *a;
        int **b = &a;
    Comments: Up to two levels of indirection are allowed by MISRA.

A20 - Nested Custom Datatypes
    Example:
        typedef struct {
            tU32 id_U32;
            union {
                tU32 data_aU32[2];
                tU16 data_aU16[4];
                tU08 data_aU08[8];
            } data_uni;
            tU08 length_U08;
            tB extendedFrame_B;
            tB valid_B;
        } tCAN4_QUEUE_ENTRY_STR;

A21 - Enumerations
    Example:
        typedef enum {
            CAN4_SPI_FREE_E = 0,
            CAN4_SPI_BUSY_TX_E,
            CAN4_SPI_BUSY_RX0_E,
            CAN4_SPI_BUSY_RX1_E
        } tCAN4_SPIUSAGE_E;

A22 - Expression scope variables
    Example:
        fun(1);
    Comments: Only applies to immediate values in ANSI C.

A23 - Address of
    Example:
        printf("%x", &a);
    Comments: How do we model that all variables have an implicit address that can be extracted through the &-operator?

A24 - Bit fields
    Example:
        struct {
            unsigned int widthValidated : 1;
            unsigned int heightValidated : 1;
        } status;

Table 5.1. The constructions of the Data Storage category.
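The question raised for construction A12 can be made concrete with a small C sketch. The function names are hypothetical; the sketch only illustrates the standard array-to-pointer conversion, under which an array name used as a function argument yields a pointer to its first element.

```c
#include <assert.h>

/* Sums n elements through a pointer. An array passed as an argument
   is converted to a pointer to its first element. */
int sum(const int *p, int n)
{
    int i;
    int total = 0;

    assert(p != 0);

    for (i = 0; i < n; i++) {
        total += *(p + i);   /* equivalent to p[i] */
    }
    return total;
}

/* Returns 1 when the array name and the address of its first
   element designate the same location. */
int decays_to_first_element(void)
{
    int a[5] = { 1, 2, 3, 4, 5 };
    const int *first = a;    /* implicit array-to-pointer conversion */

    return (first == &a[0]) && (sum(a, 5) == 15);
}
```

A formalism that models arrays as a distinct container type therefore still has to account for interfaces where only the pointer, and not the array length, is passed on.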


5.1.2 Data Flow

The Data Flow category includes 9 constructions, as detailed in table 5.2.

Key - Name
    Example
    Comments

B1 - Direct assignment
    Example:
        int a = 5;

B2 - Function return value
    Example:
        int a = fun();

B3 - Function parameter passing
    Example:
        int a = 5;
        fun(a);

B4 - Modification by reference - Function
    Example:
        int a = 5;
        fun(&a);

B5 - Modification by reference - Direct
    Example:
        int a = 5;
        int *toA = &a;
        *toA = 3;

B6 - Modification by external factor
    Example:
        #define SENSOR_ADDRESS 0xDEADBEEF

        volatile int *tempSensor = (int *) SENSOR_ADDRESS;

B7 - Transparent modification through interrupt
    Example:
        static volatile int sensorData;
        int fun1(int a) {
            a += sensorData;
            /* Random code here, interrupt happens here */
            a -= sensorData;
            return a;
        }

B8 - Internal language meshing - ASM
    Example:
        int add(int arg1, int arg2) {
            int add;
            asm("addl %%ebx, %%eax;" : "=a" (add) : "a" (arg1), "b" (arg2));
            return add;
        }
    Comments: Data flow for functions written in ASM might need external modelling through e.g. contracts.

B9 - Function call with side effects
    Example:
        int a = 5;
        void modifyIt(int b) {
            a = b;
        }

        int main(void) {
            modifyIt(2);
        }
    Comments: This is a special case of B1 concerning global- or file-scope variables.

Table 5.2. The constructions of the Data Flow category.

5.1.3 Control Flow

The Control Flow category contains 14 constructions, as outlined in table 5.3.

Key - Name
    Example
    Comments

C1 - If-else if-else construction
    Example:
        if (a && b) {}
        else if (a && c) {}
        else {}

C2 - For-loop construction
    Example:
        int i;
        for (i = 0; i < 10; i++) {};

C3 - While-loop construction
    Example:
        int i = 0;
        while (i < 10) i++;

C4 - Do-while-loop construction
    Example:
        int i = 0;
        do { i++; } while (i < 10);

C5 - Function calls with standard output and input parameters
    Example:
        int a = 5;
        a = fun1(a);

C6 - Function calls with pointer parameters
    Example:
        int b;
        int *a = &b;
        *a = 5;
        a = fun(a);

C7 - Function calls with function pointer parameters
    Example:
        void fun1(int (*functionPtr)(int, int));

C8 - Switch constructions
    Example:
        int a = 5;
        switch (a) {
            case 1:
                break;
            default:
                break;
        }

C9 - Interrupts
    Example: No specific example.
    Comments: Switches execution from PC to interrupt handler before switching back to PC.

C10 - Nested function calls
    Example:
        void fun1() {
            fun2();
        }
    Comments: MISRA does not allow fun1 == fun2, but if modelling recursion is possible in the model then that is a bonus.

C11 - Calling of function pointers
    Example:
        void fun1(int (*functionPtr)(int, int)) {
            (*functionPtr)(1, 2);
        }

C12 - Function calls with type-casted variables
    Example:
        int fun1(int a);
        float f = 3.14;
        fun1((int) f);
    Comments: Special case of A15, mainly relevant to modelling languages using strictly typed component interfaces.

C13 - Function-like macros
    Example:
        #define ABS(x) (((x) >= 0) ? (x) : -(x))
    Comments: Can be modelled as functions or in-place; can the model make the distinction?

C14 - Ternary statement
    Example:
        a = (a > 2) ? a : 2;
    Comments: Can be modelled the same way as if-statements.

Table 5.3. The constructions of the Control Flow category.
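The distinction asked about for construction C13 matters because a function-like macro and a function can behave differently when the argument has a side effect. The following C sketch reuses the ABS macro from the table; the helper names (abs_fn, macro_increments, function_increments) are hypothetical and only serve as an illustration.

```c
#include <assert.h>
#include <limits.h>

#define ABS(x) (((x) >= 0) ? (x) : -(x))

/* Function counterpart of the ABS macro. */
int abs_fn(int x)
{
    assert(x != INT_MIN);   /* -INT_MIN would overflow */
    return (x >= 0) ? x : -x;
}

/* The macro pastes the argument expression into the condition and into
   the selected branch, so the increment is executed twice. */
int macro_increments(void)
{
    int i = 5;
    int r = ABS(i++);

    (void)r;
    return i;
}

/* The function evaluates its argument exactly once. */
int function_increments(void)
{
    int i = 5;
    int r = abs_fn(i++);

    (void)r;
    return i;
}
```

A model that represents the macro as an ordinary function block would therefore hide the repeated evaluation, whereas an in-place representation keeps it visible.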

5.1.4 Code Structure

The Code Structure category contains 7 constructions, as detailed in table 5.4.

Key - Name
    Example
    Comments

D1 - External interfacing
    Example: Using functions exported by object files compiled from other languages, e.g. Ada.

D2 - Modularization
    Example: .c files and .h files representing modules of code.
    Comments: Should the .c file and corresponding .h file be modelled as one block, or as separate blocks?

D3 - Module inclusion
    Example:
        #include "scaniatypes.h"
    Comments: Modules need to be able to import other modules, or at the very least reference other modules.

D4 - Hierarchical inclusion
    Example: Module 1 imports Module 2, which in turn imports Module 3, meaning that Module 1 imports Module 3 by inheritance.

D5 - Global Scope Functions
    Example:
        int fun1(int a) {
            return a;
        }

D6 - File Scope Functions
    Example:
        static int fun1(int a) {
            return a;
        }

D7 - Keyword aliasing
    Example:
        #define PRIVATE static

        PRIVATE int a = 5;

Table 5.4. The constructions of the Code Structure category.

5.1.5 Program Behaviour

The Program Behaviour category contains 8 constructions, as outlined in table 5.5.

Key - Name
    Example
    Comments

E1 - Operation support
    Example:
        a + 5;
        a - 3;
        a & b;
        a && b;
    Comments: Arithmetic, logic and bitwise operations.

E2 - Deterministic ambiguous behaviour
    Example: Integer division, e.g.
        -5 / 3 = -1 remainder -2
        -5 / 3 = -2 remainder +1
    Comments: It must be possible to model intended behaviour of constructions with ambiguous behaviour.

E3 - Error Handling
    Example: Several C functions that return integers return -1 when there is an error.
    Comments: Error codes need to be outside of the bounds of legal data, or they will be misinterpreted as legal output. As such interfaces need to handle the sending of error messages over the data ports.

E4 - Variable lifetime
    Example:
        int *a;

        void fun1() {
            int b = 5;
            a = &b;
        }
    Comments: There needs to be a way to model the lifetime of variables. If this is not possible, we could run into situations where a pointer points to data that no longer exists. This is implicitly covered by scope, but the concept becomes complicated when applying it to pointers, warranting its own criteria. Note that this behaviour is disallowed by MISRA.

E5 - Reentrant Functions
    Example: Not applicable.
    Comments: Does this need to be modelled? Whether a function is reentrant or not depends on how it works internally. As such, is this an attribute posed by a requirement? Should function attributes be modelled in some way, such as through a characteristics list?

E6 - Requirements
    Example: Not applicable.
    Comments: External requirements posed on the code. These could be specified in natural language by an engineer, or through formalisms such as LTL. For a complete model, these should somehow be present in the model, either through contracts or in the worst case through comments.

E7 - Performance Parameters
    Example: Not applicable.
    Comments: Information about memory consumption, execution times and similar. These should exist in the model in order to be able to perform schedulability or hardware constraint analysis.

E8 - Escape sequences
    Example:
        char *s = "Hello \"John\"";
    Comments: Can strings such as that be modelled?

Table 5.5. The constructions of the Program Behaviour category.
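The ambiguity listed for construction E2 can be removed in C by stating the intended rounding explicitly. The sketch below is a minimal illustration with hypothetical function names; it relies on the standard library function div, which is specified to truncate the quotient toward zero, and builds a flooring variant on top of it.

```c
#include <assert.h>
#include <stdlib.h>

/* Truncating division: the quotient is rounded toward zero. */
int quot_trunc(int n, int d)
{
    assert(d != 0);
    return div(n, d).quot;
}

int rem_trunc(int n, int d)
{
    assert(d != 0);
    return div(n, d).rem;
}

/* Flooring division: the quotient is rounded toward negative infinity.
   The truncated result is adjusted when the remainder is non-zero and
   its sign differs from the sign of the divisor. */
int quot_floor(int n, int d)
{
    div_t r;

    assert(d != 0);
    r = div(n, d);
    if ((r.rem != 0) && ((r.rem < 0) != (d < 0))) {
        r.quot -= 1;
    }
    return r.quot;
}

int rem_floor(int n, int d)
{
    div_t r;

    assert(d != 0);
    r = div(n, d);
    if ((r.rem != 0) && ((r.rem < 0) != (d < 0))) {
        r.rem += d;
    }
    return r.rem;
}
```

For -5 and 3 the truncating pair gives -1 remainder -2, while the flooring pair gives -2 remainder 1, which are the two alternatives shown in the table. A model would need to record which of the two behaviours is intended.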


5.2 Modelling Formalisms

The identification process for modelling formalisms that could potentially qualify for inclusion in the survey was performed in several steps. The initial step involved identification of formalisms specifically targeting the automotive industry. The assumption was made that others would have faced a similar problem in the past, given the requirements on modelling of source code introduced in ISO 26262. A literature study was performed, which identified EAST-ADL as a formalism designed with the automotive industry as its target audience. The Architecture Analysis and Design Language, AADL, was identified as having been successfully applied to avionics, an industry that shares a number of similarities with the automotive industry. The next step of the identification process attempted to identify formalisms targeting the modelling of non-object-oriented software. SysML was identified as one of the few modelling formalisms that targeted software but, unlike standard UML, did not explicitly focus on object-oriented source code. Following this step, formalisms that would be able to support formal verification of requirements were investigated. SIGNAL, Lustre, Esterel and Promela were identified as formalisms that support formal verification of systems and for which industrial cases showcase the successful application of the formalisms [53] (see footnotes 1 and 2). SPARK Ada was included as a potential candidate, meant to evaluate the strength of the Ada programming language, a language that has been used to build safety-critical systems in the past, and one of the main contenders for embedded systems development (see footnote 3). After a set of formalisms had been identified, a final selection was performed in order to narrow down the list further according to the requirements posed, as well as the applicability of the formalisms. The following two subsections outline the formalisms that were included in the survey, as well as those that were ultimately rejected.

5.2.1 Evaluated Formalisms

AADL The Architecture Analysis and Design Language was selected to be included in the survey. A number of avionic use cases showcased the benefits of the formalism with regard to the modelling of Cyber-Physical systems. The formalism is actively maintained, and hosts a yearly conference. Support for compositional verification, a key research topic with regard to ISO 26262 compliance, is under active development for AADL. All these factors indicated that AADL would constitute an appropriate candidate.

SysML SysML is based on the most widely used modelling formalism, UML, but has been adapted for non-Object Oriented source code, as well as the modelling of hardware components. These factors made SysML stand out as a seemingly suitable candidate for inclusion in the survey.

1 http://spinroot.com/spin/success.html
2 http://www.esterel-technologies.com/success-stories/
3 http://www.spark-2014.org/about

Lustre Out of the three related data flow formalisms, SIGNAL, Lustre and Esterel, Lustre had the best tool support through SCADE, and showed signs of the most active development. Unlike SIGNAL and Esterel, Lustre provided fewer mechanics focused on the development of control systems. These factors resulted in the selection of Lustre as a suitable candidate for the survey.

Promela Promela is a verification formalism, and the language used by the popular SPIN model checker. Promela, through SPIN, provides the ability to formally verify that models are correct, an appealing aspect for safety critical systems. Due to the safety concerns that ISO 26262 aims to address, the ability to construct a model that could also be verified for correctness without additional modifications seemed appealing. For this reason Promela was included in the survey.

5.2.2 Rejected Formalisms

SIGNAL SIGNAL is a data flow language primarily focused on the modelling of systems with several different clocks. This makes SIGNAL an appealing formalism for the modelling of control systems, but the ability to use several clocks is not directly applicable to C source code. SIGNAL was as such considered to be unsuitable for the modelling of source code, and Lustre was selected in favour of SIGNAL for inclusion in the survey.

Esterel Esterel is a data flow language, similar to Lustre and SIGNAL, with the primary focus on modelling and verification of highly parallel systems, and the communications between these parallel components. While this is certainly applicable to ECUs communicating over a CAN network, the level of modelling was deemed to target the overarching system rather than the C source code itself. As a result, Esterel was left out of the survey in favour of Lustre.

Unicon Unicon is an architecture description language, similar to AADL. Unicon provides several concepts that are similar to AADL, making it a suitable candidate for the survey for reasons similar to AADL. However, a license agreement has to be signed by the user in order to gain access to tool support for Unicon. At the time of the survey, the service through which the license agreements were signed and the tools downloaded was not accessible. There was consequently no way of gaining access to the tools required to study the formalism, and the formalism was therefore excluded.

EAST-ADL EAST-ADL is an architecture description language, developed to complement the AUTOSAR architecture design principle. The formalism targets automotive industry, making it seem like an appropriate candidate for the survey. However, developers of the formalism claim that EAST-ADL is designed to model automotive systems at a more abstract level, as opposed to AADL which targets the software implementation of the system [54]. This coupled with the fact that EAST-ADL is designed to target AUTOSAR architectures, which are not the target of this survey, caused EAST-ADL to be rejected in favour of AADL.

SPARK-Ada SPARK-Ada constitutes a subset of the Ada programming language, with additional constraints on timing behaviour and correctness of the system, targeting safety-critical systems. While SPARK-Ada seems like a suitable candidate for the design of safety-critical systems, SPARK-Ada provides no way of constructing hierarchical models, a key requirement in ISO 26262. This fact excluded SPARK-Ada from the survey.

5.3 Coverage

This section will cover the degrees of fulfilment of the modelling formalisms with respect to the constructions in each category. A list of constructions will be provided, and for each construction the degree of fulfilment for each modelling formalism will be determined, together with a short comment justifying the classification. The raw data from the survey is attached in appendices B-F. Examples of how each construction could be modelled in the various modelling formalisms are presented in appendices G-J.

5.3.1 Fulfilment: Data Storage

A1 - Declaration of typed data storage containers

AADL AADL supports the creation of data types through the data keyword. All data types that are part of the C language are predefined in the base types library. This construction is as such classified as Fulfilled.

Lustre Lustre only supports its own built-in data types int, bool and real. These data types do not have a predefined size, and their bit-length is interpreter dependent. This causes this construction to be classified as Not Fulfilled.

SysML SysML can create variables that are part of blocks. All data types of the C language are supported through a C stereotype library that is part of the standard SysML distribution. This construction is as such considered to be Fulfilled.

Promela Promela supports int, short and unsigned byte, in addition to its own built-in data types mtype, bit and bool. Floating point data types are not supported in Promela for complexity reasons; Promela verifies models through the use of finite state machines. A consequence of floating point arithmetic is a complexity explosion in the number of FSM states that must be explored as part of the model validation. In order to make models verifiable in a short amount of time, floating point numbers are as such excluded. The construction is as such deemed to be Partially Fulfilled due to the lack of some of C's built-in data types.

A2 - Custom data types not built into the language

AADL AADL supports the declaration of user-defined data types by using the data keyword. These data types can be assigned properties defined in the data model library, giving the custom data type properties such as bit size and endianness. This construction is as such considered to be Fulfilled.

Lustre Lustre version 6 (Lustre V6) supports abstract data types that are defined by the user through the use of C code. This effectively injects these new data types into the interpreter. This procedure however is not documented in the manual for the language, nor are the exact specifications of how it is realized in the interpreter defined. Due to the procedure not being documented, nothing can be said about its usability. This construction is as such considered Partially Fulfilled. Further documentation would be required in order to produce a stronger classification.

SysML Custom data types can be declared using the datatype block. These can inherit from pre-existing data types, or be completely independent. This construction is as such considered to be Fulfilled.

Promela Promela does not support type definitions of regular data types. Declaration of new data types has to take the form of keyword aliasing by using the Promela preprocessor (the same preprocessor that is used for C). New data types can as such be defined as long as they are identical to pre-existing data types. This process is however aliasing rather than the actual declaration of new data types. The construction is as such deemed to be Partially Fulfilled.
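The difference between preprocessor aliasing and an actual type declaration can be illustrated in C itself, since the same preprocessor is involved. The sketch below is only an illustration under that assumption; the names INT_PTR_ALIAS and int_ptr_t are hypothetical.

```c
#include <assert.h>

#define INT_PTR_ALIAS int *   /* textual alias created by the preprocessor */
typedef int *int_ptr_t;       /* type name known to the compiler */

/* With the alias, only the first declarator becomes a pointer. */
int alias_second_declarator_is_int(void)
{
    INT_PTR_ALIAS p, q;       /* expands to: int *p, q; */
    int target = 1;

    p = &target;
    q = 5;                    /* accepted only because q is a plain int */
    assert(p == &target);
    return *p + q;
}

/* With the typedef, both declarators are pointers. */
int typedef_both_are_pointers(void)
{
    int_ptr_t p, q;
    int target = 1;

    p = &target;
    q = &target;
    return (p == q);
}
```

The alias is substituted as text before the type system is involved, which is why it can only rename an existing type and cannot introduce a new one.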

A3 - Size of data type

AADL Through the use of the data model library the size of declared data types can be defined. This construction is as such considered to be Fulfilled.

Lustre The bit sizes of the built-in data types are compiler dependent, and as no documentation exists for how to use the procedure to implement custom data types it can not be said whether these could be constructed with a given bit size or not. This construction is as such considered to be Not Fulfilled.

SysML Data bit sizes can be inherited from built-in data types with predefined bit sizes, such as the uint32_t data type that is part of the C SysML library. For data types not inheriting from these built-in data types there is no way to unambiguously define their bit sizes. This construction is therefore considered to be Partially Fulfilled.

Promela The size of Promela data types is dependent on the word size of the implementing architecture, similar to how the regular data types, such as int, in C are defined. Unlike C however, Promela does not support fixed length data types such as uint32_t. This construction is as such compiler dependent and deemed to be Not Fulfilled.
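The contrast between implementation-dependent and fixed-size data types can be sketched in C as follows. The typedef mirrors the tU32 naming used in table 5.1, but its underlying type is an assumption that holds on common 32- and 64-bit targets; the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical project typedef; the choice of unsigned int is an
   assumption about the target platform. */
typedef unsigned int tU32;

/* Compile-time check usable in C90: the array size becomes -1, and
   compilation fails, if tU32 is not exactly 32 bits wide. */
typedef char tU32_must_be_32_bits[(sizeof(tU32) * CHAR_BIT == 32) ? 1 : -1];

/* Width in bits of the implementation-defined int type. */
unsigned int int_width_bits(void)
{
    return (unsigned int)(sizeof(int) * CHAR_BIT);
}

/* Width in bits of the fixed-size type, with a run-time backstop that
   mirrors the compile-time check above. */
unsigned int tU32_width_bits(void)
{
    assert(sizeof(tU32) * CHAR_BIT == 32);
    return (unsigned int)(sizeof(tU32) * CHAR_BIT);
}
```

A formalism that fulfils A3 needs a counterpart to the guarantee expressed by the typedef check, rather than only the platform-dependent width returned by int_width_bits.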

A4 - Data type neutral empty element

AADL AADL implements pointers through a mechanic called requires data access, referring to the fact that the structural component requires direct memory access to a given piece of data. As such all data access requests have to reference a variable that already exists; they cannot reference a variable that is not present in the system. This construction is as such considered to be Not Fulfilled.

Lustre Lustre does not support pointers, and as such does not directly support the NULL pointer. Lustre does however have the keyword when, indicating that a given piece of data only has a value when a certain condition holds. When the condition does not hold, the variable simply holds no value. This is similar to how the NULL pointer works in C. As such this construction is considered to be Partially Fulfilled.

SysML Pointers in SysML are represented using aggregates. Aggregates represent that a block is included as a part of another block. Each aggregate has a multiplicity, indicating how many instances of said block are included. This multiplicity can have a value of zero, effectively indicating a NULL pointer. This solution is however not unambiguous; an aggregate representing a pointer to an array of pointers where the first element exists and the other elements are NULL would have a multiplicity of one, which does not accurately represent that the referred-to data is an array. This construction is as such considered to be Partially Fulfilled, as it is not unambiguous.

Promela Pointers are not natively supported in Promela, and a data type neutral element for pointers is as a consequence not present in the formalism. Due to Promela being based on deterministic finite state machine evaluation, type-less transitions are not supported. Operations where data is missing block until data is available rather than proceed without the given data. As a consequence void data types are not supported. This construction is classified as Not Fulfilled.
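What the formalisms are measured against for construction A4 can be shown with a short C sketch. The type node_t and the function names are hypothetical; the sketch only illustrates that the same NULL value serves as the empty element for pointers of any type, including as a "no result" return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical composite type used only for illustration. */
typedef struct { int id; } node_t;

/* Returns a pointer to the first element equal to key, or NULL when
   no such element exists. */
const int *find_int(const int *values, size_t n, int key)
{
    size_t i;

    assert(values != NULL);

    for (i = 0; i < n; i++) {
        if (values[i] == key) {
            return &values[i];
        }
    }
    return NULL;
}

/* The same NULL constant initialises object and function pointers. */
int null_is_type_neutral(void)
{
    int *pi = NULL;
    node_t *pn = NULL;
    int (*pf)(void) = NULL;

    return (pi == NULL) && (pn == NULL) && (pf == NULL);
}
```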

A5 - Composite data type - Structure

AADL The data model library provides a property to indicate that a data type is a structure. It can then be assigned subcomponents of other data types to indicate its member variables. This construction is therefore considered to be Fulfilled.

Lustre Lustre supports the struct keyword, allowing for the creation of composite data types consisting of the built-in Lustre data types of int, bool and real. This construction is as such considered to be Fulfilled.

SysML All SysML blocks can be assigned member variables, including the datatype block. A struct can as such be constructed through the creation of a new datatype block with one or more member variables. This construction is as such considered to be Fulfilled.

Promela Promela supports the declaration of structures in an almost identical way to the C language, through the use of a typedef command. The syntax for the components of the composite structure is identical to the C syntax. The construction is as such classified as being Fulfilled.

A6 - Composite data type - Union

AADL A union property is included in the data model library. This allows unions to be constructed in the same way as structs, causing this construction to be considered to be Fulfilled.

Lustre Lustre supports no way to indicate that two pieces of data within the same composite type can not exist at the same time. Unions can as such not be created, and this construction is as such considered to be Not Fulfilled.

SysML Similar to Lustre, SysML supports no way of indicating that two members of a block can not exist at the same point in time, resulting in unions being impossible to realize. This construction is thus considered to be Not Fulfilled.

Promela There is no way to indicate that two given members of a composite data type can not exist at the same time in Promela. Unions are as such, similar to Lustre and SysML, not possible to realize. The construction is classified as being Not Fulfilled.
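The property that Lustre, SysML and Promela cannot express for construction A6 is that the members of a union occupy the same storage. A minimal C sketch of that property is given below; the type and function names are hypothetical, and unsigned char members are used so that reading one member after writing the other does not depend on the representation of wider types.

```c
#include <assert.h>
#include <stddef.h>

/* The same two members, once as a structure and once as a union. */
typedef struct { unsigned char a[4]; unsigned char b[8]; } both_t;
typedef union  { unsigned char a[4]; unsigned char b[8]; } either_t;

/* Returns 1 when the union members share their storage. */
int union_members_overlap(void)
{
    either_t u;

    u.b[0] = 0x11U;
    u.a[0] = 0x22U;   /* writes the byte that b[0] also occupies */

    assert(sizeof(either_t) >= sizeof(u.b));

    return (u.b[0] == 0x22U)
        && (offsetof(either_t, a) == offsetof(either_t, b))
        && (sizeof(either_t) < sizeof(both_t));
}
```

In the structure both members exist side by side, so a formalism that only offers structure-like composition models both_t rather than either_t.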

A7 - Function scope variables

AADL Functions are modelled through the subprogram concept in AADL. Subprograms can be assigned their own subcomponents, which includes data blocks, representing variables. This construction can as such be considered to be Fulfilled.


Lustre Functions in Lustre are represented through the use of nodes. A node represents a computational unit that accepts input and can produce output. Variables declared inside a node are only visible to that node. The function scope construction can as such be considered to be Fulfilled.

SysML SysML models functions as blocks in a Block Definition Diagram. These blocks can have so-called properties, which represent typed variables. These properties are only available to the block itself unless they are tied to any of the block's interfaces. This construction is as such considered to be Fulfilled.

Promela Variables declared inside a proctype, the Promela equivalent of a function, are local to the proctype itself, similar to how variables declared inside functions in C are local to the declaring function. The construction is as such deemed to be Fulfilled.

A8 - File scope variables

AADL AADL requires data to be members of at least one structural component, which does not include the file itself. In order to create a file scope variable, one would have to model the file as either a system, or create a subprogram group for all functions that are part of the file. This would have to be done in either case in a model, as this is the only way to model intra-file function calls. File scope variables can as such be modelled by assigning the variables to the subprogram group, or to the system component of the file, and then providing access to this variable to all other subprograms. This construction is as such considered to be Fulfilled.

Lustre Due to the synchronous nature of Lustre, where all variable assignments happen on a given clock tick, global variables and file scope variables can not be modelled. A way to circumvent this restriction however is to model a variable through the use of a registry node. This node would save a value when a value is passed, and then return this value every clock cycle until a new value is assigned. This is similar to how a gated D latch works in electronics. The use of such a registry node is not accurately representative of the underlying C code, as the usage is more akin to the use of the mutator method design pattern common in object oriented programming. This construction is as such considered to be Partially Fulfilled as it is ambiguous.

SysML Files in SysML are modelled through the use of blocks, just like data types and functions. Functions belonging to a file are internal blocks to the file block. Since properties, SysML’s version of variables, can be assigned to blocks, this means that they can be assigned to the file block. This construction is as such considered to be Fulfilled.


Promela Promela does not support a file scope for variables or functions. Variables declared outside proctypes are considered to be globally accessible, while variables declared inside proctypes are treated as being local to the given proctype. This construction is as such considered to be Not Fulfilled.
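For reference, the C construction that the formalisms above are asked to represent can be sketched as follows. The identifiers are made up for illustration; the point is only that the static qualifier at file level restricts visibility to the functions of one file.

```c
/* File scope: `static` gives internal linkage, so only this file can see it. */
static int call_count = 0;

/* Both functions belong to the same file and share the variable. */
void record_call(void)
{
    call_count++;
}

int calls_so_far(void)
{
    return call_count;
}
```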

A9 - Global scope variables

AADL AADL does not support variables that do not belong to a structural component. As such no free-floating global variables can exist. Global variables would have to be a part of the entire system, that is, the structural component representing the composition of all entities in the code base. Due to this, the construction is considered to be Partially Fulfilled. By placing the global variables at system level, information about the declaring module is lost, and the representation is as such not a completely accurate representation of the source code.

Lustre In Lustre all variables have to belong to a node. It is as such not possible to model global variables without the use of the registry node design method as presented earlier. The modelling of global variables this way suffers from the same drawbacks as the usage of the method for modelling file scope variables. This construction is as such considered to be Partially Fulfilled.

SysML Global variables in SysML are modellable through the use of a global system-block, similar to how they can be modelled in AADL. This however means that global variable usage will need to be exposed on the interfaces of functions that use the variable: interaction with the environment outside of a block can only be done through the use of block ports, which in this case would need to be present for the global variable interaction. Representing global variable interactions as ports of a block representing a function is not desirable, as they can be mistaken for function parameter or return value interfaces. This construction is as such considered to be Partially Fulfilled due to this ambiguity.

Promela Variables that are declared outside of a proctype or the init block (the Promela equivalent to C’s main function) are treated as globally accessible to all proctypes in the system. This construction is therefore deemed to be Fulfilled.

A10 - Block scope variables

AADL The finest granularity that can be achieved in an AADL model is function-level granularity. The internal workings of the functions can not be modelled in detail. As such internal blocks present inside functions can not be modelled. This construction is as such considered to be Not Fulfilled.

Lustre Nodes are the only constructions in Lustre that can contain variables. As such it is not possible to model variable lifetime within a given node; they simply exist for as long as the node is active. Blocks could be modelled through the use of nodes, and get called by the nodes representing the functions containing the blocks, but this causes ambiguity as a node would represent either a block or a function. This construction is as such deemed to be Not Fulfilled.

SysML SysML allows blocks to contain other blocks. As such a function block could contain another block that would represent the block of code to which the block scope of a variable applies. This is similar to nodes representing blocks in Lustre. This approach suffers from the same issues that arise when using Lustre nodes to model blocks, namely that an ambiguity is created concerning whether the called block is a block or a function. This construction is as such deemed to be Not Fulfilled.

Promela Blocks are supported by Promela through the use of the d_step construction, which indicates that the instructions contained within the block should be executed as a single instruction. Considering that Promela is based internally on finite state machines, this allows the validation engine to coalesce all instructions within the d_step block into a single state in the FSM, reducing the state space, and as such the complexity of evaluation. Variables declared inside a d_step block are local to the block itself, and are not accessible after the block terminates. Worth noting is that all variable names must be unique, regardless of scope. A block scope variable can as such not be declared with the same name as a global variable and mask the global variable for the duration of the block. Since MISRA does not allow the masking of variables in this way, this does not pose a restriction. The construction is as a result deemed to be Fulfilled.
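The C construction under discussion in A10 can be sketched as below. The function and variable names are illustrative only; the sketch shows a variable whose scope is limited to an inner block of a function, without reusing (masking) any outer name.

```c
/* Adds two values and clamps the result; `limit` is a block scope variable. */
int clamp_sum(int a, int b)
{
    int result = a + b;
    {
        /* `limit` exists only inside this inner block */
        const int limit = 100;
        if (result > limit) {
            result = limit;
        }
    }
    /* `limit` is no longer in scope here */
    return result;
}
```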

A11 - Constant variables

AADL AADL realizes constant variables through the use of property sets. Property sets allow a user to specify properties that can be applied to other constructions present in the formalism. In order to create a constant, one would create a property set with a fixed value that applies to data of the given variable type. This property can then be applied to a variable in order to declare it as a constant value. This construction is as such considered to be Fulfilled.

Lustre Variables in Lustre are inherently constant. That is, a variable inside a node can not be reassigned. It is assigned once and then evaluated. In that way it works very similarly to a mathematical equation, or a definition in a functional programming language. They can be assigned as part of an expression however, which could cause them to be dependent on input parameters of the node. They are as such not truly constant. The only way to create a true constant is to create a node that returns a constant value for all inputs. This once again causes ambiguity as to whether this node represents a variable or a function however. This construction is as such considered to be Partially Fulfilled.


SysML Constant variables can not be graphically represented in SysML. Rather they take the form of read-only access and default values assigned to properties. As these are not visible in the model, it is not clear whether these restrictions are part of the modelling tool itself or are an official part of the formalism. The construction is as such deemed to be Partially Fulfilled.

Promela Promela supports declaration of constants through the use of the pre-processor, as well as through the use of enumerations. A variable can by itself not be declared as constant, such as through the const keyword in C. As it is unclear whether pre-processor instructions are part of the C language itself, as they are stripped away prior to compilation, the use of pre-processor constants to replace the use of the const keyword is ambiguous. The construction is as a consequence deemed to be Partially Fulfilled.
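The three ways of expressing a constant that are contrasted above look as follows in C. This is a minimal sketch with invented names; it only illustrates why a pre-processor constant and a const-qualified variable are different constructions in the source code, even though they may hold the same kind of value.

```c
#define MAX_SPEED 250               /* pre-processor constant: replaced textually */
enum { GEAR_COUNT = 6 };            /* enumeration constant */
static const int idle_rpm = 800;    /* const-qualified variable: has a type and storage */

/* Uses all three kinds of constant in one expression. */
int constants_sum(void)
{
    return MAX_SPEED + GEAR_COUNT + idle_rpm;
}
```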

A12 - Arrays

AADL The data model library provides property sets for defining arrays in AADL. The type of the data construction is set as an array, and then given a type and a set of dimensions. This construction is as such considered to be Fulfilled.

Lustre As of Lustre V6, arrays can be defined through the use of the ^-operator. This operator uses a similar syntax to the pair of brackets used to define arrays in C. The construction is as such considered to be Fulfilled.

SysML Properties of blocks can be given a multiplicity, thus effectively establishing an array. Due to the multiplicity indicating the presence of a certain number of variables of the same type however, arrays with more than one dimension can not be defined. This construction is as such considered to be Partially Fulfilled.

Promela Promela directly supports single-dimension arrays with a syntax identical to that of C. Multi-dimensional arrays are not supported in regular syntax. They can be emulated through the use of structures, where a structure can be declared to contain an array. An array of these structures can then be declared, effectively creating a multi-dimensional array. This construction is however not unambiguous as a similar construction is possible with structs in C. This construction is as such deemed to be Partially Fulfilled.
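The ambiguity mentioned for the structure-based emulation can be seen by writing both forms in C. The sketch below uses hypothetical names; a model built from an array of structures can not tell whether the source contained the first or the second declaration.

```c
#define ROWS 2
#define COLS 3

/* Native two-dimensional array. */
static int grid[ROWS][COLS];

/* Emulation: an array of structures that each contain an array. */
struct row {
    int col[COLS];
};
static struct row grid_emulated[ROWS];

/* Writes the same value to the same logical cell in both representations. */
void set_both(int r, int c, int v)
{
    grid[r][c] = v;
    grid_emulated[r].col[c] = v;
}
```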

A13 - Pointers

AADL Pointers in AADL are modelled through the use of data access requirements. A variable can be set as requiring data access, that is direct memory access, to another variable of the same type. This variable can then be declared as providing data access. This construction is similar to how pointers work in C, with the difference that a variable in C does not have to be explicitly declared as being able to be pointed to. A limitation in the AADL way of modelling pointers is that the variable that is pointed to has to be in the same scope as the variable that is pointing to it. This is compliant with MISRA rules, but not with general ANSI C, where pointers can outlive the lifetime of the variable to which they point, a phenomenon known as dangling pointers. The construction is considered to be Fulfilled as the data access construction complies with MISRA guidelines.

Lustre Pointers can not be expressed in Lustre. Global variables in the form of registry nodes could be used to provide a similar, but more limited, usage. However, registry nodes can not handle alias-related propagation; that is, if two pointers point to the same variable, then the value read through both of these pointers changes if the value of the pointed-to variable is changed. This can not be done transparently with registry nodes. This construction is thus considered to be Not Fulfilled.

SysML SysML handles pointers through the use of aggregates. A given block can aggregate another block, indicating that it is being referenced from the given block. The weakness of this method is that the aggregating block gains access to all members of the aggregated block. For pointers to be accurately represented the aggregated block can only contain the variable to which the pointer points. This is problematic given that blocks also represent scopes. Suppose that a given variable is present in a block representing a file (file scope variable). If this variable is aggregated, the aggregating block will also have access to all functions present in the file block. If the variable instead is encapsulated inside its own block, then it is not immediately clear that the variable is indeed file scope. This construction is as such considered to be Partially Fulfilled.

Promela Promela does not support pointers natively. Embedded C code can be included in Promela models to implement the functionality of pointers, but as this method involves hosting non-native code inside the model, it is not deemed to be part of the formalism itself. The construction is as such deemed to be Not Fulfilled.
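The alias-related propagation that a pointer model has to capture can be shown with a short C sketch. The names are illustrative; the sketch only demonstrates that a write through one pointer is observed through every other pointer to the same object.

```c
/* Two pointers alias the same local variable. */
int alias_demo(void)
{
    int speed = 10;
    int *p = &speed;
    int *q = &speed;   /* p and q refer to the same object */

    *p = 25;           /* write through one pointer */
    return *q;         /* the read through the other sees the new value */
}
```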

A14 - External variables

AADL AADL does not support global variables directly. Variables have to be declared as members of a given structural component. The only way to declare a global variable is as such to create a structural component representing the entire system, and to declare the variable in this scope. In order for external variables to exist, a variable would as such have to be declared in one structural component, be aggregated into the system component, and then inherited by other structural components. This can be done through the data access directive, but this uses the same method as pointers, and is as such ambiguous. This construction is therefore determined to be Not Fulfilled.


Lustre Lustre does not support nodes being declared in one file and then defined in another file. By using provides/body parts of a package a declaration can be isolated from the definition, but they both have to be a part of the same file. This fact causes this construction to be considered to be Not Fulfilled.

SysML External variables can be modelled in SysML through the use of aggregates. The defining file block is aggregated into the declaring file block. This method however suffers from the same drawbacks that were outlined when using aggregates to represent pointers, namely that all members get aggregated, and not just the desired variable. Furthermore it is a cause of ambiguity with the pointer construction. This construction is as such to be considered Partially Fulfilled.

Promela Promela lacks the distinction between variable declaration and variable definition. A variable is considered to be defined when it is declared, and is assigned a default value (usually 0) upon declaration. The variable can then be addressed normally. The read-before-write problem with newly defined variables in C, where the value is non-deterministic, does not exist in Promela as all variables are assigned this default value. If variables are declared (and thus defined) in several places in the model, regardless of scope, the compiler will yield an error. This is a consequence of variable masking not being allowed in Promela. This construction is as such deemed to be Not Fulfilled.
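The distinction between declaration and definition that A14 relies on can be sketched in C as below. In a real code base the two parts would normally live in different files; they are placed in one listing here purely for illustration, and the identifiers are hypothetical.

```c
/* Normally in a header or in another source file: declaration only, no storage. */
extern int engine_temp;

/* A function that uses the variable only through its declaration. */
int read_engine_temp(void)
{
    return engine_temp;
}

/* Normally in exactly one source file: the definition that reserves storage. */
int engine_temp = 90;
```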

A15 - Typecasting of variables

AADL AADL uses strictly typed interfaces for communication between structural components. In order for a given data type to travel over a given interface, the interface has to be typed with the same type as the data, or as a supertype to the type of the data. Since C does not support inheritance, the concept of a supertype does not exist in C, and the usage of this construction (such as int being a supertype of uint32_t) will cause a loss of information. Typecasting this way can further not be done between arbitrary data types, such as between floats and ints. This construction is as such to be considered to be Not Fulfilled.

Lustre Lustre uses strict typing, and as such it does not allow for the conversion between data types, making the construction be Not Fulfilled.

SysML SysML supports using connectors to connect data objects to ports. This mapping can be done regardless of the type of the data and the type of the port, effectively representing typecasting of variables traversing a port. It is not clear however if this way of using connectors is considered to be legal in general SysML or if it stems from a bug in the Papyrus modelling software used during the studies in this thesis. It does not seem very formal to allow arbitrary internal blocks to be connected to arbitrary ports in this fashion. This construction is as such considered to be Partially Fulfilled.


Promela Explicit type casting of variables is not supported in Promela, although assignments can freely be performed between variables of different types, resulting in implicit casts being performed. The result of such a cast is compiler specific, similar to how implicit casts are treated in ANSI C. Due to the lack of explicit casts, and the fact that implicit casts are prohibited by MISRA, this construction can only be classified as Partially Fulfilled.
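As a point of reference, the explicit cast construction that the formalisms are evaluated against looks as follows in C. The function names are invented for this sketch; it shows a conversion between a floating-point and an integer type, which is the kind of conversion the AADL supertype approach can not express.

```c
/* Explicit cast from float to int: the fractional part is discarded. */
int whole_part(float f)
{
    return (int)f;
}

/* Explicit cast in the other direction. */
float as_float(int i)
{
    return (float)i;
}
```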

A16 - Persistent function-level variables

AADL AADL allows the declaration of static variables inside functions by using subprogram groups. By declaring a variable inside a subprogram group, and defining a data access request from the subprogram to this variable, the variable lifespan extends past the exit of the subprogram; the variable is part of a persistent subprogram group, providing a sort of meta storage space for the subprograms contained within the group. This design pattern is ambiguous however: data access requests are associated with pointers, and the pattern could as such easily be mistaken for a function that uses a pointer to a variable present elsewhere. What makes the pattern less ambiguous however is that subprogram groups are effectively only used as a meta container for subprograms. It can not be confused with other structural components of the code due to this. This makes the pattern clear enough for the construction to be considered to be Fulfilled.

Lustre Lustre supports the pre statement, indicating that the variable value should be pulled from the previous call to the node. This is identical to the use of static variables in C. This construction is as such considered to be Fulfilled.

SysML SysML allows for setting properties imposed on variables. One of the properties that can be set is the Is Static boolean expression. This does however not show up in the graphical representation, and it is not clear if this property is part of the SysML specification or if it is specific to the Papyrus tool. This construction is therefore considered to be Partially Fulfilled.

Promela Promela supports message channels, which can be used to emulate static variables. At the end of execution the value of the persistent variable is passed over a channel, and then re-read and re-initialized upon the next call of the function. As channels are not native to C, there is no direct risk in declaring channels that are not part of the C code, as they are a support structure in Promela. This use of channels should as such not be ambiguous with other constructions. The construction is as a result deemed to be Fulfilled.
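The C behaviour that the constructions in A16 aim to reproduce is that of a static local variable. A minimal sketch, with a hypothetical function name:

```c
/* Returns 1 on the first call, 2 on the second call, and so on. */
int next_sequence_number(void)
{
    static int counter = 0;  /* initialised once, keeps its value between calls */

    counter++;
    return counter;
}
```

The variable is only visible inside the function, yet its lifetime spans the whole program run, which is what the subprogram group, pre and channel patterns above each try to capture.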

A17 - Pre-processor constants

AADL There is no distinct way to differentiate between regular constants and preprocessor constants in AADL. This construction is as such considered to be Not Fulfilled.

Lustre There is no distinct way to differentiate between regular constants and preprocessor constants in Lustre. This construction is as such considered to be Not Fulfilled.

SysML There is no distinct way to differentiate between regular constants and preprocessor constants in SysML. This construction is as such considered to be Not Fulfilled.

Promela Promela uses the same pre-processor as C, and as such supports the same method of modelling pre-processor constants. The construction is as such deemed to be Fulfilled.

A18 - Unambiguous data types

AADL All basic C data types are provided as part of the data types library. Further specification regarding signedness and endianness, as well as bit size, can be assigned to custom data types through the use of the properties in the data model library. This construction is as such considered to be Fulfilled.

Lustre The implementation of the Lustre data types is interpreter dependent. This is shown in the Luke interpreter (see footnote 4) by the fact that the user can select the data type implementation through command line arguments. This construction is as such Not Fulfilled.

SysML SysML contains support for built-in C data types through the C library that is part of the standard SysML distribution. This library contains exactly the same data types that the C language does. This construction is as such considered to be Fulfilled as it can represent the same amount of information that can be seen in the source code artifacts.

Promela Promela strictly defines the signedness of its built-in data types: bits, bools, bytes and mtypes are unsigned, while shorts and ints are signed. The data types are as such unambiguous. The construction is deemed to be Fulfilled.

A19 - Pointer indirection

AADL AADL models pointers through the use of data access directives. A pointer as such is just a request for access to the value of another variable. A request can not be made on a request. This means that pointer indirection is not possible to model. This construction is as such considered to be Not Fulfilled.

4 http://homepage.cs.uiowa.edu/~tinelli/classes/181/Spring10/luke.shtml


Lustre Lustre does not support pointers. As such it can not support pointer indirection, causing the construction to be considered to be Not Fulfilled.

SysML SysML models pointers through the use of aggregates. An aggregate effectively means that another block is included as a reference in the current block. Aggregating a block that aggregates another block as such places a reference to the indirectly aggregated block in the aggregating block. This effectively unrolls pointer indirection to take the form of regular pointers. Pointer indirection is as such not modellable in SysML and the construction is considered to be Not Fulfilled.

Promela Pointers are not natively supported by Promela, and as a result pointer indirection is not supported. The construction is as such deemed to be Not Fulfilled.
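Pointer indirection, the construction that none of the four formalisms can represent, is sketched below in C with hypothetical names. The pointer-to-pointer parameter is what would correspond to "a request on a request" in AADL terms.

```c
/* Changes where the caller's pointer points, via a pointer to that pointer. */
void retarget(int **slot, int *new_target)
{
    *slot = new_target;
}

/* Starts with p pointing at a, then retargets it to b through indirection. */
int indirection_demo(void)
{
    int a = 1;
    int b = 2;
    int *p = &a;

    retarget(&p, &b);
    return *p;
}
```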

A20 - Nested Custom Datatypes

AADL AADL supports the inclusion of already defined data types in composite data types. A limitation is that all data types need to be declared at package level. Ad hoc data types can as such not be declared inside composite data types, which would have limited their scope to that particular data type. This construction is therefore considered to be Partially Fulfilled.

Lustre Nested data types can be constructed in a similar way in Lustre as in AADL, namely by declaring data types at package scope and then using them in other composite data types. These semantics could however not be verified, as the Luke interpreter did not support composite data types despite these being a formal part of the formalism, and the reference compiler provided by the Lustre developers could not be run on the evaluation platform used during the production of this thesis. The documentation however provides backing that this method does indeed work. This construction is as such to be considered Fulfilled.

SysML Data types declared in a given package will be available to all other data types in the same package. As such a data type can be declared and then included as a member in other data types. This method suffers from the same problem however as AADL and Lustre, namely that the data type can not be declared ad hoc inside another data type and thereby be given data type scope; it must have package scope. The construction is nevertheless considered to be Fulfilled.

Promela Nested data types are supported as long as the types are declared at global scope. It is not possible to declare structures inside other structures in an ad hoc fashion, something that is possible to do in C. This limitation results in the construction being deemed as Partially Fulfilled.
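The ad hoc nesting referred to throughout A20 can be sketched in C as follows. The type and member names are made up; the inner structures have no tag and are declared directly inside the enclosing structure, so they can not be named anywhere else in the code.

```c
struct wheel_speeds {
    /* ad hoc nested types, declared inside the enclosing structure */
    struct { int left; int right; } front;
    struct { int left; int right; } rear;
};

/* Average of the two front wheel speeds (integer division). */
int front_average(const struct wheel_speeds *w)
{
    return (w->front.left + w->front.right) / 2;
}
```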


A21 - Enumerations

AADL AADL allows for enumerations to be declared in property sets that can be applied to data objects. This requires one additional step compared to C where an enumeration is declared once and can then be used. In AADL instead an enumeration is declared and then a data type representing the implementation of the enumeration is declared. The end result is the same though. This construction is therefore determined to be Fulfilled.

Lustre Lustre declares enumerations in an identical way to C. The construction can as such be considered to be Fulfilled.

SysML SysML supports an enumeration block, where the different enumerated values are presented as literals inside this block. As such the construction is considered to be Fulfilled.

Promela The mtype variable type is designed in order to provide enumerated values for user defined data constants. Mtype variables can be defined with an alias, similar to enumerated values in C. The value of the enumerated data type however is implicitly declared by the compiler, and enumerated types can not be given an initial value as a result. As all enumerated types are of mtype, all enumerated types must consequently have different values. This means that two separate enumerations, each with a member initialized to the same value is not possible to model. This poses a limitation to the MISRA case which allows for enumerated values to be initialized. The construction is as such deemed to be Partially Fulfilled.
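The case that mtype can not represent, two separate enumerations with members initialized to the same value, is sketched below in C. The enumeration names are illustrative only.

```c
/* Two independent enumerations whose members are given the same explicit values. */
enum door_state  { DOOR_CLOSED = 0, DOOR_OPEN = 1 };
enum light_state { LIGHT_OFF = 0, LIGHT_ON = 1 };

/* Returns non-zero when the two enumerations use the same underlying values. */
int share_values(void)
{
    return ((int)DOOR_CLOSED == (int)LIGHT_OFF) && ((int)DOOR_OPEN == (int)LIGHT_ON);
}
```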

A22 - Expression scope variables

AADL It is not possible to explicitly declare that a variable is expression scope in AADL. Connecting a value to a port communicating with another structural component requires the use of a pre-declared variable within the calling structural component. This by definition makes the variable function scope. It is possible to neglect connecting a variable to a given function call, and simply state that the call happens. This implies that the passed value is either expression scope or a constant. This method is ambiguous however, since it could also represent a part of the model that simply has not been added yet, in other words a stubbed call. This ambiguity causes this construction to be deemed to be Partially Fulfilled.

Lustre Lustre supports calling of nodes in a similar way to how function calls are declared in C. As such expression scope variables can be passed the same way that they are in C. This construction is as such Fulfilled.


SysML All variables in SysML have to be members of a block, representing a structural component. Variables can as such not be declared as being expression scope for a specific function call. This construction is therefore Not Fulfilled.

Promela Promela supports the calling of functions with constant values, or with parameters to which arithmetic operations are applied (e.g. fun1(a+5)). This is sufficient to reproduce the capability of expression scope variables in C. The construction is as such deemed to be Fulfilled.

A23 - Address of

AADL The address of a variable can be taken through the provides data access construction in AADL. This construction is however not as straightforward to use as its counterpart in C; a variable has to provide data access when being declared, it cannot be declared regularly and then produce its address when requested elsewhere in the code. This complicates collaborative work, as the person implementing the variable definition has to be aware if other developers intend to use the address of that variable later. This construction is for this reason considered to be Partially Fulfilled.

Lustre Pointers do not exist in Lustre, and as such pointer-related operations are not supported. This construction is therefore Not Fulfilled.

SysML SysML allows the aggregation of any block. The address of a variable can therefore be implicitly taken by aggregating in the block to which the variable belongs. This method suffers from the same granularity issue as mentioned earlier concerning aggregates; the entire block has to be aggregated rather than the specific variable that is referenced. This construction is therefore deemed to be Partially Fulfilled.

Promela As pointers are not supported natively in Promela, pointer-related operations such as the address-of operator are not supported. The construction is as such deemed to be Not Fulfilled.

A24 - Bit fields

AADL AADL does not recognize the concept of word size. As such a bit field can be constructed by declaring members, specifying their bit size using the data model library, and placing them inside a composite data type. This construction is as such considered to be Fulfilled.

Lustre Lustre does not support the declaration of custom data types (abstract data types are mentioned in the reference manual, but the chapter concerning their implementation is stubbed as of the writing of this thesis). These data types do not have a formally defined bit size. Bit fields can as such not be produced as the concept of bit sizes is foreign to Lustre semantics. This construction is as such Not Fulfilled.

SysML Papyrus supports the data size property for data types, which can be applied to members of a composite data type in order to represent their bit sizes. This property however does not show up in the graphical representation, and it is therefore questionable whether the implementation of this property is specific to Papyrus or part of the SysML formalism itself. The construction is therefore considered to be Partially Fulfilled.

Promela The Promela data types of bit and bool are implemented internally as bit-fields, but as Promela does not support the declaration of new data types with arbitrary length and simply allows the aliasing of existing data types, user-defined bit-fields are not supported. The construction is as such deemed to be Not Fulfilled.
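A user-defined bit field of the kind discussed in A24 can be sketched in C as follows. The structure and member names are hypothetical; each member is given an explicit width in bits, which is the information a modelling formalism would need to carry.

```c
struct status_flags {
    unsigned int ready : 1;   /* 1 bit wide  */
    unsigned int error : 1;   /* 1 bit wide  */
    unsigned int mode  : 3;   /* 3 bits wide: holds the values 0 to 7 */
};

/* Stores a value in the 3-bit member and returns what was actually kept. */
unsigned int store_mode(unsigned int requested)
{
    struct status_flags f = {0u, 0u, 0u};

    f.mode = requested;       /* only the low 3 bits are kept */
    return f.mode;
}
```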

5.3.2 Fulfilment: Data Flow

B1 - Direct assignment

AADL AADL supports initial value assignment to variables through the use of property sets. As subprograms, AADL’s construction for expressing functions, are black boxes however, no reassignment can be expressed inside the subprograms. This construction is as such considered to be Not Fulfilled.

Lustre Lustre supports variable assignments inside a node. Since variables in Lustre express a set of equations, similar to functional programming languages, a variable can not be reassigned after the initial assignment. In order to express values derived over several steps, variable renaming as such needs to take place. This is however not an ideal way of modelling variables, as it becomes unclear which variables actually exist within the modelled source code artifacts and which ones stem from variable renaming. This ambiguity causes this construction to be considered to be Partially Fulfilled.

SysML It is not possible to model variable assignments in the block diagrams, one would instead have to use activity diagrams or state diagrams. Due to the complexity of state diagrams it would be preferable to not have to resort to having to use them. Activity diagrams however can express the algorithmic implementation of blocks. This construction is as such considered to be Fulfilled.

Promela Promela supports direct assignments to variables in a similar manner to C. Reassignment is legal, unlike in Lustre. Assignments can take the form of a direct value constant, or as part of chains of operators and operands, just like in C. The construction is as such deemed to be Fulfilled.
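The renaming issue described for Lustre can be illustrated by writing the same computation twice in C: once with ordinary reassignment, and once in a single-assignment style where every intermediate step needs its own name. Both functions and all identifiers are invented for this sketch.

```c
/* C style: one variable, reassigned in steps. */
int filtered_c(int raw)
{
    int v = raw;

    v = v - 10;
    v = v * 2;
    return v;
}

/* Single-assignment style: each step introduces a new name (v0, v1, v2),
 * none of which exist in the original source code. */
int filtered_renamed(int raw)
{
    const int v0 = raw;
    const int v1 = v0 - 10;
    const int v2 = v1 * 2;

    return v2;
}
```

The two functions compute the same result, but only the first has a one-to-one correspondence between variables in the model and variables in the code.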


B2 - Function return value

AADL AADL supports function calls through call sequences. A calling entity (such as a subprogram or thread) can specify one or more call sequences, representing the subprograms that get called during its execution. A connections block can be provided, tying variables present in the calling component to ports present in the called component. This makes it clear which variables are used to provide the input to the function, as well as which variables can expect to have their values changed as a result of the function call. This construction is as such considered to be Fulfilled.

Lustre Nodes in Lustre support both parameters during calls to them as well as return values in an almost identical fashion to C. This construction is as such considered to be Fulfilled.

SysML Function calls in SysML are modelled through the use of internal block diagrams with ports. In order to avoid call sequence ambiguities (is a connection between input and output ports of two given functions a call and a return, or two calls?) activity diagrams can be used to specify the internals of a block, and as such specify where in the implementation function calls take place. This construction is as such considered to be Fulfilled.

Promela Return values of functions are not directly supported in Promela as proctypes are meant to represent processes rather than function calls. As such they support communication through the means of regular parallel computing concepts, namely message passing and shared memory (through the use of global variables). Proctypes can however not directly return values. In order to emulate return values a global result variable has to be declared, or a message channel between the calling function and the called function needs to be declared. The use of a global variable is cause for an ambiguous representation, as said variable is not present in the original C code. The channel solution is therefore preferable. However, since the channels need to be declared at global scope to be accessible by both the calling and the called proctype, this makes for a very large number of global interfaces as the call hierarchies grow, something that is undesirable. The construction is as a consequence considered to be Partially Fulfilled.
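The ambiguity of the global result variable workaround can be made concrete by expressing it in C next to the original function. This is only an analogy written in C, not Promela; the names square, square_proc and square_result are hypothetical.

```c
/* Original C function: the result is delivered through the return value. */
int square(int x)
{
    return x * x;
}

/* Emulation in the style of the workaround: the result is left in a global
 * variable that does not exist in the original code. */
static int square_result;

void square_proc(int x)
{
    square_result = x * x;
}
```

A reader of the second form can not tell whether square_result is an artefact of the modelling pattern or a genuine global variable of the code base.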

B3 - Function parameter passing

AADL AADL supports this in the same way that function return values are modelled, namely by call sequences coupled with a connection block representing variables that are used as part of the call or modified as part of the function return. This construction is as such considered to be Fulfilled.


Lustre Lustre nodes support both parameter passing and return values in a similar fashion to how function calls are modelled in C. This construction is as such Fulfilled.

SysML Parameter passing is modelled in exactly the same way as function return values. This construction is as such considered to be Fulfilled.

Promela Function parameter passing is done in an identical fashion to parameter passing to C functions. This construction is therefore considered to be Fulfilled.

B4 - Modification by reference - Function

AADL AADL supports the modelling of modifications through reference by the use of the data access directive, similar to how pointers are modelled. A subprogram can be specified as requiring data access to variables, and can as such modify their values without the use of return values. The same construction must however also be used for functions that modify global variables. In order for a subprogram to modify a global variable it needs access to said variable, which requires a data access request because the global variable is part of the scope of the system, which is not directly addressable by child components of the system. This is a cause for ambiguity. The construction is therefore determined to be Partially Fulfilled.

Lustre Lustre does not support pointers. Modifications by reference can therefore not be modelled. The construction is as such Not Fulfilled.

SysML In SysML in-out flow ports can be used to model modifications through reference. Instead of specifying separate input ports and output ports for a block representing a function, a port going both ways can be used to represent that an input variable is also modified as part of the call itself rather than through a return value. The construction is as such considered to be Fulfilled.

Promela As Promela does not support the modelling of pointers, modifications by reference are not supported. The construction is therefore considered to be Not Fulfilled.

B5 - Modification by reference - Direct

AADL AADL does not support modelling of internal component behaviour. This kind of direct modification can as such not be modelled, as it is not visible to the external interfaces of a function/subprogram. The construction is therefore Not Fulfilled.


Lustre Lustre does not support the modelling of pointers, and this kind of modification is therefore not supported. The construction is considered to be Not Fulfilled.

SysML SysML does not support the modelling of direct modification through reference in an unambiguous way. Block diagrams do not support the detailed modelling of internal behaviour of blocks, and the user is therefore referred to the use of state diagrams (which are complex to use) or activity diagrams. Activity diagrams however do not have formal semantics: they support a small set of control flow operators such as branches and parallelism operators, but the actual behaviour of the branches is specified by the user in text. The semantics therefore depend on the convention that the user elects to adopt, and are as such not formal. This construction is considered to be Not Fulfilled.

Promela As Promela does not support the modelling of pointers, modifications by reference are not supported. The construction is therefore considered to be Not Fulfilled.
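The two modification-by-reference constructions B4 and B5 can be contrasted in a short C sketch. The function and variable names below are hypothetical and chosen only for illustration; they do not come from the surveyed code base.

```c
#include <stdint.h>

/* B4: modification by reference through a function call.
 * The caller's variable changes although nothing is returned. */
void scale_by_two(int32_t *value)
{
    *value = *value * 2;
}

/* B5: direct modification by reference inside a function body.
 * The aliasing is not visible at the interface of the function. */
int32_t direct_modification(void)
{
    int32_t speed = 10;
    int32_t *alias = &speed;  /* alias refers to speed */

    *alias = 25;              /* speed is modified through the alias */
    return speed;
}
```

In the B4 case the pointer appears in the function signature, which is what a formalism with in-out ports or data access can capture. In the B5 case the modification is purely internal, which is why formalisms without behavioural modelling cannot express it.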

B6 - Modification by external factor

AADL Due to AADL’s ability to model hardware platforms, modifications by external factors can be modelled through the use of the device component type. Devices represent hardware actuators or sensors. Sampling of these sensors can be modelled the same way as calls to subprograms; a port on the device is connected to an input port or variable that represents the sampling of the sensor. For regular sampling, a thread with a period can be used. This construction is as such considered to be Fulfilled.

Lustre Lustre can not model hardware specifically. In order to model a sensor or actuator a node would have to be used, which would be cause for ambiguity as nodes can also be used to model function calls as well as global variables. This construction is considered to be Not Fulfilled as a result of this ambiguity.

SysML Since SysML is a block-based modelling formalism, an external sensor or actuator could be modelled through the use of a block with output ports (sensor) or input ports (actuator). The connections to sampling functions can be modelled as part of an internal block diagram, showing the connections between the ports of the devices and the sampling functions, while specific sampling behaviour can be modelled through the use of an activity diagram. It is however not possible to distinguish between a block representing a device and a block representing a software component, such as a function, without the use of user defined naming conventions or through the use of stereotypes. Due to this ambiguity the construction is deemed to be Partially Fulfilled.


Promela Hardware can not be modelled in Promela. It could be possible to model external factors through the use of a process that modifies a value sent over a message channel, but this would mean that the affected variable would constantly have to read a value and update its own value from a message channel, a convention that is not very clean. The construction is as such considered to be Not Fulfilled.

B7 - Transparent modification through interrupt

AADL AADL supports the modelling of interrupts through the use of devices or abstract components. A device can be defined to generate an interrupt either periodically or aperiodically, which can then be passed on to a process representing the interrupt service handler through the use of an event port (a port representing the occurrence of an event without associated data flow). The interrupt service handler can then call other subprograms or generate signals based on how the interrupt needs to be handled. The construction is therefore considered to be Fulfilled.

Lustre Interrupts are not directly supported in Lustre. It is possible to create a node that triggers when a certain clock signal, representing an incoming interrupt, is set high, and that performs an action to process the interrupt signal. Several such signals can be used for various interrupt channels, effectively creating an interrupt handler. These input streams are however difficult to model accurately, as they would require manual construction based on the frequency of occurrence of incoming signals. Modelling periodic interrupts this way is fairly straightforward, as periodic signals can be expressed using counters. Modelling aperiodic signals is more difficult, as they cannot be generated by a generator node. There is still cause for ambiguity with this approach, as it would appear that the interrupt signals are generated by software rather than by hardware. Nodes are indistinguishable from functions, further worsening the ambiguity. The construction is determined to be Partially Fulfilled.

SysML SysML event ports do not cover the concept of time. An event port can be connected from a block representing an interrupt generating device to a block representing an interrupt handler. This method comes with a number of drawbacks however; the lack of time as a concept on event ports means that the exact occurrence type (periodic or aperiodic) cannot be modelled without the use of clarifying activity diagrams. The ambiguity of what a given block represents is still present; is the interrupt generated by a function, or a device? The construction can as such not be considered to be completely fulfilled, and is instead deemed to be Partially Fulfilled.

Promela As Promela does not support the modelling of hardware, interrupt modifications cannot be directly modelled. As interrupt handlers cannot normally address function local variables, this case mainly applies to global variables. It is possible to model the interrupt handler as a process that constantly modifies a global variable, and that is constantly active. Due to process interleaving not being deterministic in Promela however, this is not a reliable method for achieving the desired outcome. The construction is as a result classified as Not Fulfilled.
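As an illustration of construction B7, the following minimal C sketch shows a global variable that the regular code only reads, while an interrupt service routine modifies it. The names `tick_count`, `timer_isr` and `elapsed_ticks` are assumptions made for this example; on a real target the routine would be registered in the interrupt vector table rather than called from C code.

```c
#include <stdint.h>

/* Hypothetical tick counter shared between the ISR and the regular flow.
 * volatile signals that the value may change outside normal control flow. */
volatile uint32_t tick_count = 0u;

/* Hypothetical interrupt service routine, invoked by hardware on a target. */
void timer_isr(void)
{
    tick_count = tick_count + 1u;
}

/* Regular code reads the variable but never assigns to it, so the
 * modification is transparent from the point of view of this function. */
uint32_t elapsed_ticks(uint32_t start)
{
    return tick_count - start;
}
```

On a host machine the interrupt can be imitated by calling `timer_isr()` directly between two reads of `tick_count`.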

B8 - Internal language meshing - ASM

AADL Although AADL cannot model the internals of a given function, the implementing language can be defined through the use of property sets. An enumeration of possible programming languages can be defined and declared applicable to subprograms. This property can then be set for each given subprogram. In-line use of assembly language within a function written in C can however not be modelled, as a subprogram (representing a function) is the finest-granularity structural component present in AADL. For the Scania-specific use case this is not an issue, as Scania internal guidelines forbid the use of in-line assembly language. For the general case however, this is a limiting factor. Due to how in-line assembly is written in C, with registers tied to given variables handling input and return values, a block of in-line assembly code can nevertheless be viewed as a separate function call, as the semantics with input and output parameters are almost identical. As such each in-line assembly block can be modelled as a call to a separate subprogram. This allows the implementing language to be specified for each given subprogram. The construction is as such considered to be Fulfilled.

Lustre Lustre supports no way of modelling the source language for a given piece of code. All program behaviour that is modelled in Lustre must be re-written to use Lustre’s own equation-based syntax, and the semantics for how the behaviour was implemented in the original code artifacts is therefore lost. This construction is as such Not Fulfilled.

SysML SysML has no formal way of defining which language implements the behaviour of a given function (block in SysML). Calls to specific in-line assembly code blocks can be modelled using activity diagrams, but since the blocks representing flow in the activity diagrams lack formal semantics, this usage is not considered to be formal. A more formal way would be to use stereotypes to define a user property representing the implementing language that can be assigned to a block representing a function. Properties such as this are however not present in the visual model, and it is as such unclear if they are specific to Papyrus or part of formal SysML. This construction is considered to be Partially Fulfilled as a result of this.

Promela Promela supports no way to model that specific blocks of code are implemented in a different language. This construction can as such be deemed to be Not Fulfilled.


B9 - Function call with side effects

AADL Side effects on global variables can be modelled through the use of the data access directive. A subprogram that modifies a global variable, or, since global variables do not exist in AADL, a system scope variable, would request data access to the system scope variable in question. The variable can then be modified by the subprogram. An ambiguity exists however in that access to a global variable and access to a pointer both are modelled using the data access construction. It is therefore not clear whether a function gets called with a pointer parameter that is modified, or whether the subprogram is called with no parameters and instead modifies a global variable. This ambiguity causes the construction to be considered to be Partially Fulfilled.

Lustre Global variables can be represented using registry nodes in Lustre. A given node can then call the set function of the registry node in order to modify its value, which then persists until a new assignment is made. The use of registry nodes does however, as mentioned previously, generate an ambiguous definition of what a node represents in Lustre. Since nodes are used to model functions, the use of them to also model global variables is questionable. The ambiguity that stems from this multiple use of nodes means that this construction is considered to be Partially Fulfilled.

SysML The Papyrus tool allows for the connection of an internal block representing a variable to a port representing an external interface. This allows a function to modify the variable by sending data over the port. This is however the same method used to represent function calls with input parameters. There is no clear way to distinguish whether the port is used as an interface to a global variable or for parameter passing. This source of ambiguity means that the construction will have to be considered to be Partially Fulfilled.

Promela Any proctype can modify a global variable through an assignment operation performed inside the proctype body. This is identical to how side effects on global variables are depicted in C. The construction can as such be deemed to be Fulfilled.
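A function call with side effects, as considered in B9, can be sketched in C as follows. The global `error_count` and the function `clamp_percent` are hypothetical names introduced only for this example.

```c
#include <stdint.h>

int32_t error_count = 0;  /* hypothetical global state */

/* Returns the input clamped to 0..100. As a side effect, every
 * out-of-range input is recorded in the global error_count. */
int32_t clamp_percent(int32_t value)
{
    int32_t result = value;

    if (value > 100)
    {
        result = 100;
        error_count = error_count + 1;
    }
    else if (value < 0)
    {
        result = 0;
        error_count = error_count + 1;
    }
    else
    {
        /* value already in range, no side effect */
    }

    return result;
}
```

The signature of `clamp_percent` shows one input and one return value only; the write to `error_count` is the part that a model has to make visible without confusing it with a pointer parameter.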

5.3.3 Fulfilment: Control Flow

C1 - If-else if-else construction

AADL AADL does not support the modelling of the internal behaviour of subprograms (representing C functions). As such conditional branching cannot be directly modelled. AADL does however support alternate control flow directions through a given subprogram. This can be modelled through the use of call sequences. If a different set of functions is called, or the same functions are called in a different order, then a call sequence can be declared for every call trace through the subprogram. The use of call sequences is however limited to calls of subprograms; it is not possible to model alternate data flow due to variable assignments or similar as a result of if-statements. Because this limitation affects a comparatively common case, the construction is considered to be Partially Fulfilled.

Lustre Lustre implements a functional if-statement similar to the ternary statement in C. As a consequence of this, if-statements can be used to control data flow in terms of alternate variable assignments, but not control flow, unless the control flow also involves data flow. Since nodes can return values, the assignment of different node return values to a given variable can be modelled, but if the nodes lack return values, and thus only affect control flow and not data flow, then the conditional statement cannot be modelled. This limitation forces us to consider the construction to be Partially Fulfilled.

SysML SysML supports decision points in activity diagrams. Decision points branch the execution flow based on a given condition in an identical fashion to the C if-statement. This construction is as such considered to be Fulfilled.

Promela Promela supports the if-construction, but selects which branch to take by evaluating which guards hold and then selecting one of them in a non-deterministic fashion. Only one branch is ever selected. This disallows the use of fall-through if-statements, where several conditions can hold and action is taken based on several conditions. An example of a situation where this might be the case is bitwise flag evaluation, where a given branch triggers for every flag, and several flags can be set at the same time. This situation is fairly common in C code, and as such poses a quite severe restriction. The construction is as such deemed to be Partially Fulfilled.
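The bitwise flag evaluation mentioned above can be illustrated with a minimal C sketch. The flag names and the function `count_warnings` are hypothetical and serve only to show that several conditions can hold for the same input.

```c
#include <stdint.h>

/* Hypothetical status flags, one bit each. */
#define FLAG_OVERHEAT  0x01u
#define FLAG_LOW_OIL   0x02u
#define FLAG_DOOR_OPEN 0x04u

/* Counts active warnings. Each if-statement is evaluated independently,
 * so several bodies may execute for the same status word. */
uint8_t count_warnings(uint8_t status)
{
    uint8_t warnings = 0u;

    if ((status & FLAG_OVERHEAT) != 0u)
    {
        warnings++;
    }
    if ((status & FLAG_LOW_OIL) != 0u)
    {
        warnings++;
    }
    if ((status & FLAG_DOOR_OPEN) != 0u)
    {
        warnings++;
    }

    return warnings;
}
```

For a status word of `0x05u` both the overheat and the door-open bodies run, so a model that selects exactly one of the guards that hold would not reproduce this behaviour.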

C2 - For-loop construction

AADL Repeated execution of a given piece of code cannot be modelled in AADL while preserving the formal semantics of the formalism, as internal behaviour cannot be modelled. Repeated execution can be approximated through the use of threads with a given period, or as a call sequence to other subprograms, where the subprograms represent the loop body. Using a subprogram to represent a loop body does however create an ambiguous situation, as subprograms also model function calls. The use of threads to model loops muddles the intended usage of the construction; threads are meant to represent a thread of execution that can be assigned to a process, which in turn can be assigned to a processor. Using a thread to model a loop body therefore causes confusion, as threads can be scheduled for execution independently of other threads. In other words, a thread is an independent unit of execution rather than a subset of a unit of execution. Threads should as such not be used to model loop bodies. The construction is therefore determined to be Not Fulfilled.


Lustre Lustre does not support explicit loops, as they do not comply with Lustre’s functional definition of variables, where reassignment inside a given node is not possible. Instead it supports functions that can be used to iterate over data streams, such as foldl, map and fill, commonly present in functional programming languages. As a result of this, loops that fill a similar functionality to these functions can be reproduced, but loops that affect control flow, such as the repeated calling of a void function, cannot be modelled in this way. This construction is as such considered to be Partially Fulfilled as only a subset of data flow is possible to model.

SysML A loop is effectively a counter modification followed by a decision point based on the results of the counter modification. That this holds is evident from how loops are implemented in assembly dialects, with a comparison operator followed by a conditional branch statement. SysML supports an assembly language-like implementation of loops, with a decision point followed by a branch back to the loop header. This construction is sufficient to cover all loop constructions in C. This construction is as such deemed to be Fulfilled.

Promela Promela supports a do-construction, which acts as an infinite loop. The do-loop can contain several guards; if several guards evaluate to true, then a single guarded statement is selected for execution in a non-deterministic way, similar to how if-statements are handled. A for-loop can be modelled by using a guard that evaluates to true as long as the loop should be running (the negation of the end condition in the for-loop header in C), followed by an else-branch that breaks out of the loop when triggered. This construction is as such deemed to be Fulfilled.

C3 - While-loop construction

AADL While-loops suffer from the same AADL restrictions as mentioned for for-loops. The construction is therefore Not Fulfilled.

Lustre While-loops suffer from the same Lustre restrictions as For-loops. The construction is deemed to be Partially Fulfilled.

SysML The main difference between a for-loop and a while-loop is that the for-loop contains a condition modification as part of the loop construction itself, whereas a while-loop requires the programmer to manually specify a modification that can result in a terminating condition. This effectively makes a for-loop syntactic sugar for the while-loop construction, which lies closer to the assembly-language implementation that the compiler will produce. While-loops are therefore modellable the same way as for-loops. The construction is deemed to be Fulfilled.

Promela While-loops can be modelled the same way as for-loops, through the use of the do construction. This is a cause for ambiguity, but as for-loops in C are syntactic sugar for while-loops, this ambiguity does not pose a severe restriction. The construction is as such classified as being Fulfilled.
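The relationship between for-loops and while-loops used in the argument above can be sketched in C. The two hypothetical functions below compute the same sum; the second one moves the initialisation and the counter modification out of the loop header.

```c
#include <stdint.h>

/* For-loop: initialisation, condition and counter modification
 * are all part of the loop header. */
uint32_t sum_for(const uint32_t *data, uint32_t length)
{
    uint32_t sum = 0u;
    uint32_t i;

    for (i = 0u; i < length; i++)
    {
        sum += data[i];
    }
    return sum;
}

/* While-loop: the same loop with the counter handled manually. */
uint32_t sum_while(const uint32_t *data, uint32_t length)
{
    uint32_t sum = 0u;
    uint32_t i = 0u;

    while (i < length)
    {
        sum += data[i];
        i++;
    }
    return sum;
}
```

The rewrite is behaviour-preserving for loop bodies like this one. A body containing `continue` would need extra care, since in a for-loop `continue` still executes the counter modification.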

C4 - Do-while-loop construction

AADL Do-while-loops suffer from the same AADL restrictions as mentioned for for-loops. The construction is therefore Not Fulfilled.

Lustre Do-while-loops suffer from the same Lustre restrictions as For-loops. The construction is deemed to be Partially Fulfilled.

SysML Do-while-loops differ from while-loops in that the decision point is performed at the end of the loop body rather than in the loop header, which is the case with while-loops. As the decision point and branch block are moveable in the functional diagram used in SysML, this construction can be reproduced by moving the decision point. The construction is as such Fulfilled.

Promela Promela does not support a clean way of constructing a do-while-loop, as guards are always evaluated prior to execution. It is as such not directly possible to force the execution of one iteration of the loop body when the guard does not hold. A way around this is to make use of goto-statements, where a conditional goto is issued at the end of the body to return to the loop header. This representation is however ambiguous, as goto exists in C and is banned by MISRA. The use of the goto statement in the model could as such be seen as confusing. Due to this, the construction is deemed to be Partially Fulfilled.
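The property that distinguishes the do-while-loop, namely that the body executes once even when the condition does not hold, can be shown with a small C sketch. Both function names are hypothetical.

```c
#include <stdint.h>

/* Counts how often the body of a do-while-loop executes.
 * The condition is checked after the body. */
uint32_t iterations_do_while(uint32_t limit)
{
    uint32_t i = 0u;
    uint32_t executed = 0u;

    do
    {
        executed++;
        i++;
    } while (i < limit);

    return executed;
}

/* Same body in a while-loop: the condition is checked before the body. */
uint32_t iterations_while(uint32_t limit)
{
    uint32_t i = 0u;
    uint32_t executed = 0u;

    while (i < limit)
    {
        executed++;
        i++;
    }

    return executed;
}
```

With a limit of zero the do-while variant still runs its body once, while the while variant does not run it at all; for any positive limit the two agree.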

C5 - Function calls with standard output and input parameters

AADL AADL supports function calls with parameters through the use of subprogram calls, specified in call sequences. The construction can as such be reproduced and is considered to be Fulfilled.

Lustre Lustre supports function calls through the use of nodes, which can be passed parameters and can supply return values. The construction is as such deemed to be Fulfilled.

SysML SysML supports function calls through blocks representing functions connected through ports in an internal block diagram. Parameters and return values are passed over the ports connecting the two function blocks. The construction is as such Fulfilled.

Promela Promela can model functions through the use of proctypes, which support input parameters. Return values however are only supported through the use of message passing channels, or through the use of global variables. This is ambiguous, as message channels can be constructed in C, while the use of global variables would introduce variables which are not present in the original code. This construction must as such be deemed as Partially Fulfilled.

C6 - Function calls with pointer parameters

AADL Pointers are supported in AADL through the use of data access requests. Data access requests can however not be passed over ports; instead, a variable is declared as requiring data access to another variable (aliasing the variable it requires data access to). Another structural component that is a supercomponent can then provide data access to one of the variables contained within its scope. This declaration ties the two variables together and establishes the aliasing. A similar method is used in order to provide data access to global scope variables used by functions. It is as such not clear whether a data access request is an access request to a global variable, or the declaration of a pointer. Furthermore, the construction explicitly establishes that the pointer will be passed as an argument to a function, implying that it should be passed on one of the function’s input ports. This is not doable in AADL. The construction is as such considered to be Not Fulfilled.

Lustre Lustre does not support pointers. Function calls with pointer parameters are as a result not modellable. The construction is as such deemed to be Not Fulfilled.

SysML SysML supports pointers through aggregation. An aggregated variable (pointer) is internally indistinguishable from a regular variable; the only way to see that a given variable is a pointer is whether it was declared internally or imported through composition, or whether it was aggregated from another block. Pointers therefore follow the same rules as regular variables and can be passed over regular ports. The construction is deemed to be Fulfilled.

Promela Promela does not support pointers, and as such the passing of pointers to functions is not possible to model. This construction is as such deemed to be Not Fulfilled.

C7 - Function calls with function pointer parameters

AADL AADL supports the use of requires subprogram access in addition to the data access construction. This construction is very similar to how function pointers are implemented, that is, a request is made for access to the memory location at which the executable code of the function resides. Subprogram access requests can be embedded inside a data type, effectively creating a function pointer, which allows them to be passed over port interfaces. This construction is therefore deemed to be Fulfilled.


Lustre Lustre does not support the concept of pointers, neither to data nor to functions (Lustre nodes). This construction is as such Not Fulfilled.

SysML SysML allows arbitrary blocks to be passed over port interfaces. In the case of variable parameters, these blocks represent variables that exist within the code, but in the case of function pointers, aggregated function blocks can be passed over the ports in a similar way to how variables are passed. This construction is as such deemed to be Fulfilled.

Promela As pointers can not be modelled in Promela, neither can pointers to functions be modelled. Function pointer parameter passing is as such not possible, and the construction is determined to be Not Fulfilled.
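For reference, construction C7 corresponds to C code of the following shape. This is a minimal sketch; the type `transform_fn` and the functions `negate`, `twice` and `apply_to_all` are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Function pointer type: defined only by parameter and return types. */
typedef int32_t (*transform_fn)(int32_t);

int32_t negate(int32_t x)
{
    return -x;
}

int32_t twice(int32_t x)
{
    return 2 * x;
}

/* Receives a function pointer as a parameter and applies it
 * to every element of the array. */
void apply_to_all(int32_t *values, size_t count, transform_fn fn)
{
    size_t i;

    for (i = 0u; i < count; i++)
    {
        values[i] = fn(values[i]);
    }
}
```

Any function matching the `transform_fn` signature can be passed, so the called function is only known at the call site of `apply_to_all`, not inside it.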

C8 - Switch constructions

AADL Switch-statements are effectively syntactic sugar for if-else if-else blocks and suffer from the same limitation in the modelling of control flow variation that results from conditional statements. The construction can as such not be fully modelled and is Partially Fulfilled.

Lustre Lustre if-statements are functional, and are as such limited to the ability to express data flow rather than control flow. Switch statements would have to be modelled by using Lustre functional if-statements, and as such suffer from the same limitations. The construction is therefore considered to be Partially Fulfilled.

SysML Switch-statements are modelled through the use of sequential decision points in SysML, similar to if-else if-else statements with a large number of else if-clauses. They can as such be reproduced in the model and the construction is deemed to be Fulfilled.

Promela Switch-statements are modelled through the use of if-statements in Promela. As noted with the if-statement, Promela does not allow fall-through of if-statements, which in the case of if-statements is problematic. ANSI C permits fall-through in switch-statements, however MISRA explicitly forbids several clauses from activating. The lack of fall-through is as such not a limitation when adhering to MISRA guidelines. The construction is as such deemed to be Fulfilled.
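The treatment of switch-statements as a chain of if-else if-else clauses relies on every clause ending in a break. A minimal C sketch of this equivalence is given below; the enumeration `gear_t` and both functions are hypothetical.

```c
/* Hypothetical gear selector used only for this example. */
typedef enum { GEAR_PARK, GEAR_REVERSE, GEAR_DRIVE } gear_t;

/* Switch-statement without fall-through: every clause ends in break. */
int speed_limit_switch(gear_t gear)
{
    int limit;

    switch (gear)
    {
        case GEAR_PARK:
            limit = 0;
            break;
        case GEAR_REVERSE:
            limit = 20;
            break;
        case GEAR_DRIVE:
            limit = 120;
            break;
        default:
            limit = 0;
            break;
    }
    return limit;
}

/* The same decision expressed as an if-else if-else chain. */
int speed_limit_if(gear_t gear)
{
    int limit;

    if (gear == GEAR_PARK)
    {
        limit = 0;
    }
    else if (gear == GEAR_REVERSE)
    {
        limit = 20;
    }
    else if (gear == GEAR_DRIVE)
    {
        limit = 120;
    }
    else
    {
        limit = 0;
    }
    return limit;
}
```

Since exactly one clause is taken in both forms, a formalism that selects a single branch can represent this restricted form of switch.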

C9 - Interrupts

AADL AADL can model interrupts through the use of a device that generates aperiodic (or periodic depending on the nature of the interrupt occurrence in the system) signals that are handled by a thread representing the interrupt handler. As AADL does not have the capability to model the internal behaviour of functions, it is not clearly defined when a subprogram block (representing a function) terminates. If an interrupt is set to trigger the execution of a thread acting as the interrupt handler with a certain period, then this execution takes place in parallel to the subprogram execution. If the subprogram and interrupt handler are defined to be executed on the same processing unit, then the subprogram will automatically stall whenever the interrupt handler executes, similar to how interrupts are handled in C. The construction is as such deemed to be Fulfilled.

Lustre Interrupts are not supported in Lustre and the construction is therefore deemed to be Not Fulfilled.

SysML SysML does not include a concept of timing behaviour for blocks. It is as such not clear when a block executes and when it stalls simply from looking at the block diagrams. If an interrupt generating block, representing a sensor or similar device that would generate interrupts, is connected to a block representing an interrupt handling routine in the form of a function, then the interrupt handler would appear to execute in parallel to the blocks representing the code being executed on the processor. Due to the lack of timing behaviour, it is not evident whether the regular program flow stalls during interrupt handler execution or whether they are executed in parallel. If both the interrupt handler and the regular code are aggregated to be part of a block representing a processing unit, then a similar situation to that which occurs in AADL is produced. As opposed to AADL however, SysML does not include a formal concept of periodicity, or priority of execution. It is therefore not clear whether the interrupt handler stalls in favour of the regular code, or the other way around. This lack of clarity forces us to consider the construction to be Partially Fulfilled.

Promela Promela does not support the modelling of interrupts. Several processes can be run at the same time, but without the use of channels to represent mutexes all processes are assumed to execute concurrently. As such it is not possible to explicitly model that an ISR is run in place of the regular flow of execution without specifying a mutex that the original process has to hold. This implies that explicit synchronization mechanisms exist within the code, which might not be the case. This construction is as such deemed to be Not Fulfilled.

C10 - Nested function calls

AADL AADL supports subprograms calling other subprograms as long as they reside in a scope that is accessible by the calling subprogram. The calling subprogram declares the call as part of one of its call sequences. The called subprogram can in turn call other subprograms by declaring them in one of its own call sequences. The construction is therefore determined to be Fulfilled.


Lustre Lustre calling of nodes follows the same semantics as C function calls. A given node can call other nodes in the same way that C calls other functions from a given function. The construction is as such deemed to be Fulfilled.

SysML SysML supports chains of blocks connected through the use of ports as part of internal block diagrams, representing nested function calls. This convention is in itself ambiguous as it is unclear whether the functions are called in a sequence in a similar order to binomial coefficients, or if the functions are called in a nested fashion. Consider the SysML diagram presented in figure 5.1. In the figure it is not evident if the functions are called as

C10Caller → C10CallerCallee → C10Caller

with return values being passed back, or as

C10Caller → C10CallerCallee → C10Callee → C10CallerCallee → C10Caller

without return values. In order to specify which of the two call sequences is intended, the internals of the blocks have to be exposed through activity diagrams that highlight the actual function calls. This method proves sufficient to establish an unambiguous representation. The construction is as a result deemed to be Fulfilled.

Promela Promela proctypes support the instantiation of other proctypes, representing nested function calls. Due to processes executing concurrently however, only functions that are the result of tail-calls will execute in the intended manner. If a function is called earlier in the body, then the called function and the calling function will execute in parallel. If the calling function relies on output from the called function, this results in unintended behaviour. In order to force sequential execution mutexes would have to be used, which implies code synchronization mechanisms that are not present in the original code. This construction is as a result deemed to be Partially Fulfilled.
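The difference between calling functions in a nested fashion and calling them one after another, which a model of C10 has to keep apart, can be made concrete with a small C sketch. All names are hypothetical. Each function records an upper-case letter on entry and a lower-case letter on exit, so that the call structure can be read from the trace.

```c
#include <stddef.h>
#include <string.h>

char call_trace[16];
size_t trace_len = 0u;

/* Appends one character to the trace, keeping it NUL-terminated. */
void record(char tag)
{
    if (trace_len < (sizeof(call_trace) - 1u))
    {
        call_trace[trace_len] = tag;
        trace_len++;
        call_trace[trace_len] = '\0';
    }
}

void reset_trace(void)
{
    memset(call_trace, 0, sizeof(call_trace));
    trace_len = 0u;
}

void callee(void)
{
    record('C');
    record('c');
}

/* Nested variant: the intermediate function itself calls callee. */
void caller_callee_nested(void)
{
    record('B');
    callee();
    record('b');
}

void caller_nested(void)
{
    record('A');
    caller_callee_nested();
    record('a');
}

/* Sequential variant: the caller invokes both functions in turn. */
void caller_callee_flat(void)
{
    record('B');
    record('b');
}

void caller_sequential(void)
{
    record('A');
    caller_callee_flat();
    callee();
    record('a');
}
```

After `caller_nested()` the trace reads `ABCcba`, while `caller_sequential()` yields `ABbCca`. In a sequential language such as C the called function always completes before the caller continues, which is the ordering that concurrently executing processes do not guarantee by themselves.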

C11 - Calling of function pointers

AADL The calling of function pointers is done by requiring subprogram access to the called subprogram. Given the fact that function pointers can be declared through the use of a container data type, one would assume that these functions could also be called by passing the container data structure to a subprogram and declaring a call sequence using the subprogram contained within the passed data structure. This is however not possible, as AADL does not support declaring calls to subprograms that are contained as subcomponents inside data types. As such the function meant to call the given function pointer will need to request subprogram access itself, bypassing the passed data structure completely. This is the same convention as calling a global function, and is in other words ambiguous. This seems like an oversight in the design of AADL; there seems to be no reason to allow the declaration of function pointer data types and then not allow the calling of them, as this would result in function pointers being unusable. In order to allow for the calling of function pointers, the call sequences would have to be augmented to allow addressing subprogram subcomponents of data types. Currently connection declarations can address data subcomponents in this way, and there seems to be no reason why the same should not be applicable to subprogram subcomponents. Due to this limitation, the construction will have to be considered to be Partially Fulfilled.

Figure 5.1. A SysML example of nested function calls.

Lustre Lustre does not support pointers, and function pointer calls are as such not supported. The construction is as a result deemed to be Not Fulfilled.

SysML SysML supports functions being passed over ports. However, unlike C, where a function pointer is defined by its return value and parameter types, function passing over ports in SysML has to specify the specific function that is allowed, thus nullifying one of the primary uses of function pointers, namely generality5. A further limitation stems from the fact that there is no way to establish that it is actually the passed function that is called, rather than calling the function present in an accessible scope directly (this problem is shared with AADL). The construction is as such deemed to be Partially Fulfilled.

5 This concept is common in the C standard library through functions such as qsort, which can be supplied with arbitrary data types as long as a comparison function is provided.

Promela As Promela does not support pointers it does, as a consequence, not support the calling of function pointers. The construction is as a result deemed to be Not Fulfilled.
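The generality that function pointers provide in C can be sketched using the standard library function qsort. The helper sort_ints_with and the two comparison functions are hypothetical names introduced only for this illustration.

```c
#include <stdlib.h>

/* Comparison callbacks: any function with this return type and these
   parameter types can be passed, regardless of its name. */
static int compare_ascending(const void *a, const void *b)
{
    int lhs = *(const int *)a;
    int rhs = *(const int *)b;
    return (lhs > rhs) - (lhs < rhs);
}

static int compare_descending(const void *a, const void *b)
{
    return compare_ascending(b, a);
}

/* Hypothetical helper: the callee is chosen by the caller at run time and
   is only known here through its signature. */
void sort_ints_with(int *values, size_t count,
                    int (*compare)(const void *, const void *))
{
    qsort(values, count, sizeof values[0], compare);
}

/* Convenience wrappers selecting one of the two callbacks. */
void sort_ascending(int *values, size_t count)
{
    sort_ints_with(values, count, compare_ascending);
}

void sort_descending(int *values, size_t count)
{
    sort_ints_with(values, count, compare_descending);
}
```

The function sort_ints_with never names the function it ends up calling, which is the property that a port restricted to one specific function can not express.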

C12 - Function calls with type-casted variables

AADL AADL does not support arbitrary type casts. In order for a type cast to take place, the source variable would have to be a subtype of the destination variable, a concept that is not supported in C. This construction is as such considered to be Not Fulfilled.

Lustre Lustre does not support type casting between variable types. The construction is as such Not Fulfilled.

SysML SysML variable blocks can be connected to ports regardless of port type, indicating that a type cast takes place. The construction can as such be considered to be Fulfilled.

Promela Promela implicitly type casts variables for all assignment operations between variables of different types, and during function calls with parameters of mismatching data types. Explicit casts are not permitted. MISRA requires all type casts to be explicit, which at first would seem to imply non-compliance. However, since MISRA only supports explicit type casts while Promela only supports implicit casts, there still exists a one-to-one mapping between the C code and the modelling formalism. The construction is therefore deemed to be Fulfilled.
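A function call with a type-casted argument, written with the explicit cast discussed above, could look as follows in C. This is a minimal sketch; the function names and the chosen integer widths are assumptions made for the illustration.

```c
#include <stdint.h>

/* Hypothetical function expecting a 16-bit unsigned argument. */
static uint16_t halve(uint16_t raw)
{
    return (uint16_t)(raw / 2u);
}

/* The caller holds a 32-bit value. The narrowing conversion is spelled out
   with an explicit cast at the call site instead of being left implicit. */
uint16_t halve_reading(uint32_t reading)
{
    return halve((uint16_t)reading);
}
```

The cast discards the upper 16 bits of the argument, which is exactly the kind of information a model has to be able to represent if the call is to be captured unambiguously.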

C13 - Function-like macros

AADL Function-like macros are modelled the same way as regular functions in AADL, namely through the use of subprograms. A property set can be defined in order to identify that the subprograms are declared as function-like macros rather than as regular functions. This construction is as such Fulfilled.

Lustre Lustre does not support a way to model function-like macros. They would instead have to be modelled the same way as regular functions, which would be cause for ambiguity. This construction is as such deemed to be Not Fulfilled.

SysML Similar to Lustre, SysML supports no way to distinguish between a function-like macro and a regular function. The construction is as such, similar to Lustre, deemed to be Not Fulfilled.

Promela As Promela uses the same pre-processor as the C language, all pre-processor commands supported in C are also supported in Promela, including the use of pre-processor macros for functions. The construction is therefore considered to be Fulfilled.
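The reason why a model should be able to distinguish a function-like macro from a regular function can be sketched in C. All names below are hypothetical; the example only shows that textual expansion may evaluate an argument more than once.

```c
/* Function-like macro: expanded textually by the pre-processor. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Regular function with the same intent. */
static int square_function(int x)
{
    return x * x;
}

/* Hypothetical helper with a side effect: increments and returns a counter. */
static int next_value(int *counter)
{
    return ++(*counter);
}

/* The macro argument appears twice after expansion, so next_value is
   called twice and the counter is advanced by two. */
int square_via_macro(int *counter)
{
    return SQUARE_MACRO(next_value(counter));
}

/* The function argument is evaluated exactly once. */
int square_via_function(int *counter)
{
    return square_function(next_value(counter));
}
```

Starting from a counter value of zero, the macro variant leaves the counter at two while the function variant leaves it at one, so treating the two constructions as identical in a model would hide a behavioural difference.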

C14 - Ternary statement

AADL AADL supports conditional statements that affect control flow through the use of call sequences. Since no internal behaviour of subprograms can be modelled, however, it is not possible to model conditional statements affecting data flow. The ternary statement is a conditional statement solely affecting data flow, and can as such not be modelled. The construction is Not Fulfilled.

Lustre Lustre’s functional if-statement only affects data flow, and is semantically identical to the C ternary statement. The construction is as such deemed to be Fulfilled.

SysML SysML activity diagrams supporting decision points can be used to model ternary statements. A decision point followed by two branches indicating the different variable assignments, followed by a merge of the two branches, accurately represents a ternary statement. The construction is as such deemed to be Fulfilled.

Promela The ternary statement can be modelled in Promela through the use of an if-clause. This is ambiguous with the regular if-construction, but as ternary statements are syntactic sugar for an if-else assignment, this will be considered to be acceptable. The construction is therefore considered to be Fulfilled.
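The equivalence between a ternary statement and an if-else assignment, which the Promela and SysML mappings rely on, can be sketched in C as follows. The function names are hypothetical.

```c
/* Ternary form: the condition only selects which value flows into the result. */
int clamp_ternary(int value, int limit)
{
    return (value > limit) ? limit : value;
}

/* Equivalent if-else assignment: one decision, two assignments, one merge. */
int clamp_if_else(int value, int limit)
{
    int result;

    if (value > limit) {
        result = limit;
    } else {
        result = value;
    }
    return result;
}
```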

5.3.4 Fulfilment: Code Structure

D1 - External interfacing

AADL AADL supports the interfacing of other source code languages through the use of property sets. The Data Modelling annex, a part of the standard AADL distribution, supports a property for source language called Source_Language. Seamless meshing of structural components written in several different programming languages is as such possible, and the construction is deemed to be Fulfilled.

Lustre Lustre supports no way to show which source code language implements the functionality of a given node. The construction can therefore be considered to be Not Fulfilled.

SysML SysML can depict the implementing language through the use of stereotypes. A function block for each source language could be declared, inheriting properties from the default SysML block. Ada functions could as such be modelled using an AdaFunction block, for example. Since these blocks inherit from the default SysML block, they can be passed over ports declared to support regular blocks. This construction is as such deemed to be Fulfilled.

Promela Promela does not support a way to model the implementing language of a given function apart from informal comments. The construction will therefore have to be considered to be Not Fulfilled.

D2 - Modularization

AADL AADL supports packages containing structural components. Members of a package can be declared as public, and can then be imported into other packages, or as private, making them accessible only within the declaring package. This is reminiscent of how inclusion works in C, and the construction is therefore deemed to be Fulfilled.

Lustre Lustre supports packages, with separate blocks declaring which nodes are provided (exported) by the package, as well as a body block implementing the exported functionality. Nodes that are declared and implemented in the body, rather than declared in the provides block, are treated as having module scope. This construction is as such classified as Fulfilled.

SysML SysML supports container blocks that can be used to represent C- and H-files. Members of these files are declared to be included in these blocks. There is however no way to formally specify what a block represents. It is as such not clear if a given container block represents a file, layer or entire system. This ambiguity forces us to consider the construction to be Partially Fulfilled.

Promela As Promela supports the C preprocessor, the modularization of code between files is supported in the same way as it is in C, where a given file can be imported into other files. There is however no distinction between h-files and c-files as the concept of declaration does not exist in Promela; a declaration is always an implicit definition. This muddles the code structure slightly. As a result the construction is deemed to be Partially Fulfilled.

D3 - Module inclusion

AADL AADL supports the with keyword to import structural components from other packages. The functionality of this keyword is identical to that of the #include C directive. The construction is as such deemed to be Fulfilled.

Lustre Lustre supports the include keyword to import nodes declared in other packages. The functionality is identical to the corresponding directive in C. The construction is therefore considered to be Fulfilled.

SysML SysML supports importation of members of other blocks through the use of aggregation or composition. This concept makes all the members of the imported block appear to be members of the importing block. This is identical to how the C pre-processor treats included files by simply copying the included code into the including code file. The construction can as such be considered to be Fulfilled.

Promela As inclusions in C are handled through the pre-processor, and Promela supports the same pre-processor as C, inclusions in Promela are identical to inclusions in C. The construction is as such deemed to be Fulfilled.

D4 - Hierarchical inclusion

AADL AADL does not support hierarchical inclusion. If a given package A imports a package B which in turn imports a property set C, then the components of C will only be addressable in B. C would have to be explicitly imported by A for A to be able to address its components. This construction can as such be considered to be Not Fulfilled.

Lustre Hierarchical inclusion is supported by Lustre, and parent packages can address nodes declared in their imported grandchildren. The construction is as such deemed to be Fulfilled.

SysML SysML allows any given block to aggregate another block. Aggregation chains can as such be established in order to represent hierarchical inclusion. The construction is therefore classified as Fulfilled.

Promela Hierarchical inclusion is handled in Promela in the same way as in C, where inclusions are forwarded up the chain. In order to avoid multiple inclusions, resulting in naming conflicts, include guards have to be used. This construction is as a result determined to be Fulfilled.

D5 - Global Scope Functions

AADL A subprogram declared in the public section of a package will be addressable by all packages that import the aforementioned package. This is the same way that global scope functions are implemented in C. The construction is as such deemed to be Fulfilled.

Lustre Any node declared in the provides section of a Lustre package will be made available to all packages that import the declaring package. This is similar to how global functions are represented in C, and the construction is as such deemed to be Fulfilled.

SysML In SysML all members of a block are available to any block that aggregates the declaring block by default. Every function therefore becomes global when it is aggregated into other blocks. The construction is as a result of this considered to be Fulfilled.

Promela All proctypes declared in Promela are considered to be globally scoped. It is not possible to limit the scope of a function to a given file. This construction is as such Fulfilled.

D6 - File Scope Functions

AADL A subprogram declared in the private section of a package will only be accessible by components declared in the same package. By using packages to represent source code files, these subprograms (functions) therefore have file scope. The construction is as such deemed to be Fulfilled.

Lustre Nodes that are declared inside the body of a package, rather than in the provides block, are only addressable by other nodes contained in the same package. This effectively makes the nodes file-scope, and the construction is therefore classified as being Fulfilled.

SysML SysML blocks can contain operations, which represent functions that are declared inside the block. A block representing a file can as such declare all functions contained within the file. Each operation can be given a visibility value, which can be set as either public (representing a global function), or as private (representing a file scope function). This construction is as such deemed to be Fulfilled.

Promela Promela’s version of functions, proctypes, are always global. They can not be limited to the scope of a particular file. This construction is as a result deemed to be Not Fulfilled.
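The distinction between global scope and file scope functions in C, which constructions D5 and D6 refer to, can be sketched as follows. The function names are hypothetical.

```c
/* File scope: the static keyword gives the function internal linkage, so it
   can only be called from within this source file. */
static int checksum_step(int accumulator, int byte)
{
    return (accumulator + byte) & 0xFF;
}

/* Global scope: without static the function has external linkage and can be
   called from any file that declares it, typically through a header. */
int checksum(const int *bytes, int count)
{
    int accumulator = 0;
    int i;

    for (i = 0; i < count; i++) {
        accumulator = checksum_step(accumulator, bytes[i]);
    }
    return accumulator;
}
```

A formalism in which every function is globally addressable can represent checksum, but has no way of expressing that checksum_step is hidden from other files.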

D7 - Keyword aliasing

AADL Keyword aliasing is not supported in AADL. The construction is as such Not Fulfilled.

Lustre Keyword aliasing is not supported in Lustre. The construction is as such Not Fulfilled.

SysML Keyword aliasing is not supported in SysML. The construction is as such Not Fulfilled.

Promela Keyword aliasing in C is performed through definition directives to the pre-processor, which then performs a sweeping pass on the source code, replacing all aliases with the original keyword. As Promela uses the same pre-processor as C, keyword aliasing can be realized in an identical fashion. The construction is therefore deemed to be Fulfilled.
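Keyword aliasing through the pre-processor can be sketched in C as follows. The aliases LOCAL and FOREVER, as well as the function names, are hypothetical and only serve as an illustration of the mechanism.

```c
/* Hypothetical aliases: the pre-processor replaces each alias with the
   original keyword or construct before compilation. */
#define LOCAL static
#define FOREVER for (;;)

/* After pre-processing this is an ordinary static function. */
LOCAL int count_down(int start)
{
    int ticks = 0;

    FOREVER {
        if (start <= 0) {
            break;
        }
        start--;
        ticks++;
    }
    return ticks;
}

/* Externally visible wrapper around the file scope function. */
int ticks_until_zero(int start)
{
    return count_down(start);
}
```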

5.3.5 Fulfilment: Program Behaviour

E1 - Operation support

AADL AADL does not support the modelling of internal behaviour of functions, but rather the data flow and control flow between structural components. As operators are associated with function-internal data flow, they can not be modelled. The construction is as such classified to be Not Fulfilled.

Lustre Lustre supports all arithmetic and logic operators that are present in the C language. The construction is as such deemed to be Fulfilled.

SysML SysML can model internal behaviour of functions through the use of activity diagrams. The activity diagrams however lack formal semantics; an activity is declared as a block, with an associated text string describing what operation the block performs. There is no established convention for how these text strings are to be written, and they seem to borrow heavily from various programming languages or from mathematics. The conventions are as a result not part of the formalism itself, but rather imported from other formalisms. Due to the lack of semantics, rather than just ambiguous semantics, the construction will be classified as Not Fulfilled.

Promela Promela supports all arithmetic and logic operators present in the C language. The construction is as such classified as being Fulfilled.

E2 - Deterministic ambiguous behaviour

AADL Due to the fact that operations can not be modelled in AADL, neither can the behaviour of the operations. This construction is as such deemed to be Not Fulfilled.

Lustre Lustre expresses operators in a similar manner to C. As such it suffers from the same problems as C, where the behaviour of certain operators is compiler, or platform, dependent. This construction is as such classified to be Not Fulfilled.

SysML SysML supports the modelling of behaviour of operators through requirement diagrams. The expected behaviour of the operator can there be expressed as a requirement imposed upon the source code. This is similar to how the situation is handled in the case of the C language; since there is no way to know the behaviour of the ambiguous operator without compiling the code, or referring to the instruction manual of the compiler, the behaviour has to be specified as a requirement, or in the documentation of the program. Since SysML can fulfil this criterion in a way that is similar to C, this construction is classified as Fulfilled.

Promela As Promela models are translated into C and compiled with a C compiler (gcc through MinGW for the example models used in this thesis) in order to be verified, the behaviour of the ambiguous operators is determined by the compiler in an identical fashion to C. The behaviour is as such compiler dependent. The construction is as a consequence deemed to be Not Fulfilled.
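Two commonly cited cases of compiler dependent operator behaviour in C are sketched below, under the assumption that they are representative of what this construction is meant to capture. The function names are hypothetical.

```c
#include <limits.h>

/* Right-shifting a negative signed value is implementation-defined in C:
   the compiler may perform an arithmetic shift (sign preserved) or a
   logical shift (zero filled). */
int shift_negative_right(void)
{
    int value = -8;
    return value >> 1;
}

/* Whether plain char is a signed or an unsigned type is also chosen by the
   implementation. Returns 1 if it is signed and 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

For a given compiler and platform the results are fixed, but they can not be determined from the source code alone, which is why the expected behaviour has to be stated elsewhere.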

E3 - Error Handling

AADL The default distribution of AADL comes with the Error Model annex, which supports the modelling of errors propagated over ports. A given subsystem can contain a block containing the possible errors that can be generated and returned by the subprogram, as well as over which ports these errors are sent and what the error signifies. The construction will as such have to be considered to be Fulfilled.

Lustre Lustre supports error codes based on variable return ranges, in a similar fashion to C functions returning a negative value when an error occurs. The C convention of passing a pointer to a structure intended to house error data, which can be read after the function call to retrieve diagnostic data, is not supported, as pointers can not be modelled. Lustre does however support multiple return values, meaning that data normally modified through the passed pointer reference can be returned as a second return parameter. It is however ambiguous what these additional return parameters represent, as they attempt to model pointer modification through the use of a regular variable. The construction will as such be considered to be Partially Fulfilled.

SysML SysML supports several ways of modelling the occurrence of errors. Activity diagrams can be used to depict the emission of errors at various points inside a function, signals sent over flow ports can be used to trigger error handlers, or negative error codes could be emitted over regular data ports similar to the negative C return codes. The construction is therefore deemed to be Fulfilled.

Promela Negative error codes can be returned over channels the same way as they can be in C. The C convention of using a pointer to a structure as an error parameter, where diagnostic information is returned as part of the structure upon producing an error, is however not supported due to the lack of pointers. This can be remedied with global variables or message channels to indicate error messages, but this use of channels/global variables is ambiguous. The construction is as a result deemed to be Partially Fulfilled.
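The two C error handling conventions referred to in this construction, a negative return code and a pointer to a structure that receives diagnostic data, can be combined as in the following minimal sketch. The structure layout and all names are assumptions made for the illustration.

```c
#include <stddef.h>

/* Hypothetical diagnostic structure filled in by the callee on failure. */
struct error_info {
    int code;       /* negative error code, 0 if no error occurred */
    int divisor;    /* offending input value */
};

/* Returns 0 on success and a negative value on failure. On failure the
   diagnostic data is written through the err pointer if one is supplied. */
int safe_divide(int numerator, int divisor, int *quotient,
                struct error_info *err)
{
    if (divisor == 0) {
        if (err != NULL) {
            err->code = -1;
            err->divisor = divisor;
        }
        return -1;
    }
    *quotient = numerator / divisor;
    return 0;
}
```

The return code alone can be carried by a return value or a channel, whereas the modification of the caller's structure through err is the part that requires pointer semantics in the model.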

E4 - Variable lifetime

AADL Variable lifetime is implicitly modelled in AADL through component lifetimes; when a subprogram produces its return value, then the internal variables of the subprogram will be destroyed. Variables pointed to by a subprogram will have to exist in an encompassing scope of the subprogram, and their lifetimes as such extend past the lifetimes of the pointers. These conditions are sufficient to model the MISRA case, where dangling pointers are prohibited. It is however not sufficient for the general ANSI case. Since MISRA C guidelines need to be enforced for ISO 26262 compliance, this construction is considered to be Fulfilled for the purposes of this thesis.

Lustre As Lustre does not support pointers, the concern of dangling pointers is not present in Lustre models. Variable lifetime extends to the duration of the call of the node, but can be explicitly extended through the use of the pre operator, which refers to the value of the variable in the previous call to the function (similar to static variables in C). As dangling pointers are prohibited in MISRA C, the Lustre notion of variable lifetime is sufficient for function scope variables. Due to global variables being modelled through the use of registry nodes, however, they act as static variables residing inside a mutator pattern function. This representation is ambiguous, as a similar construction could also be produced in C. This construction is as such classified as Partially Fulfilled.

SysML Variable lifetime in SysML is very clearly defined as a result of composition and aggregation. An aggregated variable is expected to outlive the aggregating block, while a composed variable is expected to have the same lifetime as the composing block. Variables contained in blocks that are not aggregated are expected to have a lifetime identical to the encompassing block. These semantics are clear enough for this construction to be deemed to be Fulfilled.

Promela As pointers do not exist, variable lifetime is tied to the scope of the variable. Global variables will have a lifetime that lasts for the duration of the program, function scope variables will last for the duration of the proctype, and block scope variables will last through the duration of the block (usually a d_step clause). Due to the clear semantics for variable lifetime the construction is considered to be Fulfilled.
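The difference in lifetime between an automatic and a static function scope variable in C, the latter being the construction that the Lustre pre operator is compared to above, can be sketched as follows. The function names are hypothetical.

```c
/* Automatic variable: created on every call and destroyed on return, so
   the function has no memory of earlier calls. */
int count_calls_automatic(void)
{
    int calls = 0;

    calls++;
    return calls;
}

/* Static variable: initialized once and kept alive for the duration of the
   program, so the value from the previous call is still available. */
int count_calls_static(void)
{
    static int calls = 0;

    calls++;
    return calls;
}
```

Calling each function twice yields 1 and 1 for the automatic variant, and 1 and 2 for the static variant.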

E5 - Reentrant Functions

AADL Reentrancy can be modelled through the use of AADL property sets. A property can be defined for subprograms that specifies that the subprogram is reentrant, and this property can then be applied to all reentrant functions. This is sufficient for the construction to be classified as Fulfilled.

Lustre Lustre does not support the modelling of properties that apply to a given node. Due to Lustre not allowing the reassignment of variables, variable renaming might have to take place to properly represent the algorithmic implementation of a given C function. This rewriting may affect whether the reentrancy property is preserved in the model. This construction is as a result classified as being Not Fulfilled.

SysML The Papyrus tool supports the modelling of properties that apply to blocks representing functions through the use of stereotypes. It is however not clear whether stereotypes are tool specific or part of general SysML as they do not show up in the graphical representation of the model. This construction is as such classified as being Partially Fulfilled.

Promela Promela does not support a way to attach characteristic attributes to a given function in a formal way. Reentrancy can as such not be modelled. The construction is as a consequence deemed to be Not Fulfilled.
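What makes a C function reentrant or not is typically whether it keeps intermediate state in storage that is shared between invocations. The following sketch uses hypothetical function names and is only one possible illustration of the property.

```c
/* Not reentrant: the running total is kept in a static variable that is
   shared between all invocations. If the function is interrupted and
   entered again, the second invocation overwrites the state of the first. */
int sum_shared_state(const int *values, int count)
{
    static int total;
    int i;

    total = 0;
    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Reentrant: all intermediate state lives in automatic variables, so each
   invocation works on its own copy. */
int sum_reentrant(const int *values, int count)
{
    int total = 0;
    int i;

    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

Both functions have the same signature and return the same result when called sequentially, which is why the property has to be attached to the function explicitly in a model rather than derived from its interface.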

E6 - Requirements

AADL The default AADL distribution includes the AGREE annex, which provides AADL with the ability to model requirements through AG-contracts [55]. Requirements are there modelled as a set of conditions that the given component assumes to hold upon execution, and a set of conditions that it guarantees will hold post execution. AG-contracts do not cover all possible requirements; they can not model persistent internal states of the subprogram, nor can they model timing requirements. The ability to model AG-contracts will be considered sufficient for the purposes of this thesis, as they are sufficient to model LTL/CTL based requirements, which have been identified [56] as able to model a large number of automotive requirements. The construction is as such deemed to be Fulfilled.

Lustre Lustre supports modelling of requirements through the use of synchronous observers, which fulfil similar functionality to pre- and postcondition contracts. This kind of contract will be considered to be sufficient for the construction to be classified as Fulfilled.

SysML SysML supports a diagram type for requirements modelling, where hierarchies of requirements can be modelled, and where requirements can be tied to certain structural components to which the requirements apply. This construction is as such deemed to be Fulfilled.

Promela Promela supports modelling of requirements through assert statements and claims. Assert statements check that properties hold at a given point of the execution trace. Monitoring processes can be used to establish system invariants, as can be seen in the example. Claims establish truths that can not be broken. A never claim, for example, expresses states that should never be reached during program execution. Together these constructions are sufficient to model AG-contracts (asserting that A contracts hold upon entering a process, which can be triggered through a message channel, and that G contracts hold at the end of execution). The construction is as such classified as being Fulfilled.
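The idea of an assume-guarantee contract can be mirrored on the C side with assertions at function entry and exit. This is a minimal sketch under the assumption that run-time assertions are an acceptable stand-in for the contract; the function and its value ranges are hypothetical.

```c
#include <assert.h>

/* Hypothetical component converting a percentage into an 8-bit duty cycle.
   Assumption (A): 0 <= percent <= 100 holds on entry.
   Guarantee  (G): 0 <= result <= 255 holds on exit. */
int percent_to_duty(int percent)
{
    int duty;

    assert(percent >= 0 && percent <= 100);   /* A: checked on entry */

    duty = (percent * 255) / 100;

    assert(duty >= 0 && duty <= 255);         /* G: checked on exit */
    return duty;
}
```

The entry assertion corresponds to what the component assumes about its environment, and the exit assertion to what it guarantees in return, which is the same split that the formalisms above express with contracts, observers or assert statements.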

E7 - Performance Parameters

AADL AADL contains a number of built-in formal properties to model performance behaviour such as periodicity and dispatch protocols. These properties can be extended through further property sets containing metrics such as memory footprint or other required metrics. This construction is as such considered to be Fulfilled.

Lustre Lustre supports no way to model properties that apply to nodes other than informal comments. This construction can as such be considered to be Not Fulfilled.

SysML Papyrus allows for performance properties to be modelled through the use of stereotypes. As previously mentioned however, it is unclear if these stereotypes are an official part of SysML as they do not show up in the graphical representation of the model. The construction is therefore deemed to be Partially Fulfilled.

Promela As evaluation of the Promela models is not deterministic, it is impossible to make predictions about performance. There is no way to model the performance of a function that has been modelled, as properties assigned to structural components do not exist. The construction is as such deemed to be Not Fulfilled.

E8 - Escape sequences

AADL AADL supports escape sequences in an identical fashion to C, through the use of the backslash operator. This construction can as such be classified as being Fulfilled.

Lustre Lustre does not support strings as a data type, and escape sequences are as such not applicable. The construction is deemed to be Not Fulfilled.

SysML SysML automatically escapes special characters contained inside strings declared as text inside blocks. The use of escape sequences is as such not needed. The construction is considered to be Fulfilled.

Promela Strings are not supported in Promela, and escape sequences are as such not applicable to Promela models. The construction is as a result deemed to be Not Fulfilled.

5.4 Coverage Summary

Summaries of the coverage for each surveyed formalism with regard to the various construction categories can be found in figures 5.2, 5.3, 5.4, 5.5 and 5.6.

Figure 5.2. Summary of Data Storage Coverage.

Figure 5.3. Summary of Data Flow Coverage.

Figure 5.4. Summary of Control Flow Coverage.

Figure 5.5. Summary of Code Structure Coverage.

Figure 5.6. Summary of Program Behaviour Coverage.

Chapter 6

Discussion

This chapter presents a discussion regarding the findings of the thesis. First, the results of the coverage analysis are analysed, discussing strengths and weaknesses of the surveyed formalisms. The next section discusses the possibility of augmenting the formalisms surveyed in this thesis in order to enhance their performance with regard to the coverage metrics. The problems of automatic model parsing from C source code are then discussed, highlighting constructions that could pose problems with regard to automated analysis. The chapter concludes with a discussion about future work, where several potential areas of investigation are suggested.

6.1 Formalism Coverage

In order to provide a clear overview of the results of the coverage analysis, this section will be divided up into subsections based on each construction category.

6.1.1 Data Storage

In the Data Storage category, AADL was able to produce a high number of fulfilled cases compared to the other formalisms. Given the purpose of architecture description languages, where architectural components and the communication between these are modelled, the fact that data storage representation is a key aspect of the formalism is not surprising. In order to specify proper communication procedures between components, knowledge about the format of the data that is being sent and received is key.

SysML presents a high degree of partially fulfilled results, stemming from the simplicity of the formalism. Data types are expressed using named blocks, which can in turn contain other blocks in order to form composite data types. The lifetime of data is represented using aggregation and composition, which borrow from Object-Oriented concepts, where the lifetime of an object is tied to another object. This is a clear limitation with regard to modelling embedded systems, as data tends to be passed between components without being tied into the lifetime of the components themselves. Further limitations are presented regarding pointers, as there is no strict way to express that a block is in fact a reference to another block. In this sense SysML could be viewed as a concrete modelling formalism, where there is little support for creating virtual blocks that are in fact not present in the real system as implemented.

Lustre and Promela both suffer from high degrees of non-fulfilled constructions, stemming from the verification-oriented nature of these formalisms. In order to support efficient model checking, limitations in expressibility have to be posed in order to limit the size of the input spaces to be verified. This provides a contrast to other modelling formalisms, which tend to be verified through simulation rather than model checking, and as such do not have strict requirements posed on them regarding absolute correctness.

6.1.2 Data Flow

AADL continues to provide a high degree of fulfilled coverage for data flow. This once again follows from the fact that architecture description languages target the architectural elements of a system, as well as the communication between them. Due to the abstraction of the internal workings of each component, more focus is placed on the phenomena taking place outside of a given component, namely connections between components and their methods of communication. Data flow by definition concerns itself with the exchange of data between components, and is consequently heavily emphasized in AADL.

SysML suffers from a high degree of partially fulfilled constructions, highlighting the ambiguity of the representation of communication present in the formalism. The basic component for data exchange in SysML is the data port, indicating an input or output of data. These ports can be connected to other ports in order to indicate that communication between components takes place. It is however not clear how the data is being transmitted or when it is being transmitted. In fact, if two components are connected with an input and output port, it is not clear if the production of data in one component is dependent on the transmission of data from the other component or not, and as such there is no way to depict the actual flow of data.

Lustre and Promela once again produce a large number of non-fulfilled constructions, but a fairly small number of partially fulfilled constructions. It is evident that these formalisms have very clear semantics, or else the degree of partially fulfilled constructions would be higher. Lustre and Promela both produce a lower number of fulfilled constructions in this category mainly due to the fact that they can not address hardware, a key aspect of embedded systems. In order to represent data being transmitted between hardware and software, such as sensor signals being read, the notion of hardware has to be present in the formalism. This concept is generally abstracted in these formalisms, and is represented the same way as a software component, thus creating cause for ambiguity. The inability to model pointers, stemming from the difficulties of verifying pointer targets (see section 6.3), provides further limitations regarding the expressiveness of the formalisms.

6.1.3 Control Flow

The so far high degree of coverage of AADL starts to drop off in the Control Flow aspects of the C language. The lack of an internal representation of AADL components means that modelling alternate paths of execution proves problematic. The concept of alternate flows, i.e. alternate execution traces throughout the system, helps to mitigate this to some degree, but does not address the alternate traces within a given component. As data communicated between components is described only by a data type, there is no way to model the input and output spaces of a component given internal execution traces.

SysML supports a diagram type known as the activity diagram, which expresses the flow between computational blocks within a component, similar to the Control-Flow Graphs (CFGs) used by many compilers as an intermediate representation of source code prior to full compilation [57]. Given that compilers are able to express entire C programs as CFGs prior to compilation, a diagram type that almost completely mirrors their functionality should realistically be able to express a large number of control flow constructions, which is consistent with the coverage metric results.

Promela and Lustre show high degrees of partially fulfilled and non-fulfilled constructions. This primarily stems from the fact that the control flow of a C program often has to be re-written in order to support the stricter semantics of the verification formalisms. Lustre uses Static Single Assignment form, meaning that once a variable has been assigned a value, this value cannot change. This poses limitations regarding exact modelling of algorithmic implementations. Lustre also differs from C in that it is a declarative formalism. In other words, the goal is to model the facts that should hold, and to verify these facts, rather than to verify how these facts are produced.
Using the example of a Kalman filter, Lustre would express the mathematical formula of the Kalman filter as a fact that should hold, rather than how the programmer implemented the Kalman filter. This causes a disconnect with the low-level nature of the C language, where much focus is placed on the implementation itself.
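To make the contrast concrete, consider the measurement update of a one-dimensional Kalman filter. Declaratively it is the set of equations K = P / (P + R), x' = x + K · (z − x) and P' = (1 − K) · P. The following is a minimal sketch of how a C programmer might implement the same step imperatively; the type and function names (kalman1d_t, kalman1d_update) are hypothetical and are not taken from any surveyed code base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state of a one-dimensional Kalman filter. */
typedef struct {
    double estimate; /* current state estimate, x */
    double variance; /* variance of the estimate, P */
} kalman1d_t;

/* One measurement update, written imperatively. The state is mutated
 * in place, so the gain must be computed before the variance is
 * overwritten: the order of the statements is part of the behaviour. */
void kalman1d_update(kalman1d_t *kf, double measurement, double meas_variance)
{
    double gain;

    assert(kf != NULL);
    assert((kf->variance + meas_variance) > 0.0);

    gain = kf->variance / (kf->variance + meas_variance);                /* K  = P / (P + R)    */
    kf->estimate = kf->estimate + (gain * (measurement - kf->estimate)); /* x' = x + K (z - x)  */
    kf->variance = (1.0 - gain) * kf->variance;                          /* P' = (1 - K) P      */
}
```

A declarative model would state only the three equations. The in-place mutation and the statement ordering seen here are exactly the implementation details that such a model leaves out.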

6.1.4 Code Structure

The coverage metrics are almost identical across all formalisms with regard to the constructions in the Code Structure category. All formalisms support modularization in order to distinguish between multiple components and provide a means to reuse models later on. The construction that poses the largest concern with regard to coverage in this category is keyword aliasing, namely the ability to declare your own keywords by providing a translation to an already existing keyword. This is possible in the C language through the #define pre-processor directive, and is as such included as a construction. In a real-world implementation it is questionable how much of a limitation the lack of fulfilment with regard to this construction actually poses, as it could be argued that redefining keywords in the C language in fact aims to change the syntax of the language itself, and that the result would then be a new programming language rather than an instantiation of the C language.
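As an illustration of keyword aliasing, the sketch below uses #define to give new spellings to an existing keyword and to the block delimiters. The alias names (LOOP_WHILE, BEGIN, END) and the function count_to are hypothetical and chosen only for this example.

```c
#include <assert.h>

/* Hypothetical aliases: each macro is a new spelling for an existing
 * keyword or token of the C language. */
#define LOOP_WHILE while
#define BEGIN {
#define END }

/* Counts from zero up to 'limit' using only the aliased spellings. */
unsigned int count_to(unsigned int limit)
BEGIN
    unsigned int i = 0u;

    LOOP_WHILE (i < limit)
    BEGIN
        i++;
    END

    assert(i == limit); /* post-condition of the loop */
    return i;
END
```

After pre-processing this is ordinary C, but the text seen by a reader, or by a tool that parses the source before pre-processing, no longer follows the C grammar. This is the sense in which aliasing can be argued to create a new language.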

6.1.5 Program Behaviour

The Program Behaviour category addresses the modelling capability regarding the use of meta-data that is not present in the C source code itself, but rather in the implementation of the code on a given computational platform. As a result, formalisms with an explicit notion of hardware perform better in this category, with AADL and SysML producing high degrees of fulfilment. Lustre and Promela, where algorithmic implementations are re-written in order to better comply with verification engines, perform worse, with high degrees of non-fulfilment stemming from the disparity between the implementation in the verification model and the source code it aims to model. Unless the algorithmic implementations remain the same between model and object, little can be said in terms of memory consumption or complexity of the original algorithm, which in the model has been abstracted or obfuscated.

6.2 Augmenting Formalisms

Given the discussion from the previous chapter, we can identify a number of aspects that have to be present in a modelling formalism for embedded systems.

1. The notion of hardware has to be present in the formalism.

2. Both the external behaviour of a hierarchical component, i.e. its interactions with other components and with the environment, and the internal behaviour of the component, in terms of algorithmic decision points, have to be present in the model.

3. In order to verify a model completely, there is a need to limit the state space through the use of transformations which reduce the accuracy of the model.

4. The efficiency of the model, both in terms of generation and interpretation, is determined by the lack of ambiguities in the formalism.

Based on these aspects, Architecture Description Languages seem to form a solid base to extend from, as they are one of the few families of modelling formalisms where hardware is an integral concept. This assumption is strengthened by the fact that AADL achieved the highest fulfilled coverage out of the surveyed formalisms. The primary downside of these formalisms is the lack of ability to express internal workings, an aspect the other surveyed formalisms performed better in.

The solution used in Lustre, where an equation-based declarative semantic is used to describe the behaviour of components, is appealing due to its simplicity; very little code is required in order to express fairly complex behaviours. This simplicity however comes with a number of drawbacks. Being able to succinctly express a complex algorithm as a small set of equations poses strong requirements on the modelling engineer in terms of the mathematical skills required to simplify the algorithm to the highest possible degree, thus resulting in the most succinct representation. The move away from modelling the exact algorithmic implementation further poses problems with regard to accuracy; in many cases a perfectly simplified algorithm will not exhibit the same performance metrics as the original algorithmic implementation. Sorting algorithms highlight this concern, where a fully simplified sorting algorithm will have an average-case time complexity of O(n · log(n)) (Quick Sort), while naïve implementations might have time complexities of O(n²) (Bubble Sort). This removes the possibility of using the model for various kinds of performance analyses.

Another solution would be to represent the internal behaviour of a component in terms of a control flow graph, similar to SysML activity diagrams. SysML produced a very high degree of fulfilment of control flow constructions through the use of this representation, which supports the view that this way of modelling internal behaviour could prove to be powerful. Certain C compilers already use this representation as an intermediate step of compilation, and as such the automatic generation of such graphs is possible. Given recent research in requirements engineering [56], the use of Computation Tree Logic (CTL) as a way of formulating requirements seems promising. Given the breakdown of execution paths in a control flow graph, reasoning based on certain paths through the program becomes natural, and control flow graphs as such support the notion of CTL-based reasoning. This further complies with the notion of traces as presented in [55], as a trace is effectively an execution path through a component. This approach does as such seem favourable as a means of expressing component internal behaviour.
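The relation between a control flow graph and the execution paths through a component can be sketched as follows. The representation is an assumption made for illustration: cfg_t, cfg_count_paths and cfg_two_decisions are hypothetical names, the graph is assumed to be acyclic (no loops), and the recursion means the code is intended for a host-side analysis tool rather than for MISRA-compliant target code.

```c
#include <assert.h>
#include <stddef.h>

#define CFG_MAX_NODES 8u

/* Hypothetical control flow graph: edge[a][b] != 0 means that basic
 * block a can transfer control to basic block b. Assumed acyclic. */
typedef struct {
    size_t node_count;
    unsigned char edge[CFG_MAX_NODES][CFG_MAX_NODES];
} cfg_t;

/* Counts the distinct execution paths from 'from' to 'exit_node'
 * by depth-first traversal of the graph. */
unsigned int cfg_count_paths(const cfg_t *cfg, size_t from, size_t exit_node)
{
    unsigned int paths = 0u;
    size_t next;

    assert(cfg != NULL);
    assert(cfg->node_count <= CFG_MAX_NODES);
    assert((from < cfg->node_count) && (exit_node < cfg->node_count));

    if (from == exit_node) {
        return 1u;
    }
    for (next = 0u; next < cfg->node_count; next++) {
        if (cfg->edge[from][next] != 0u) {
            paths += cfg_count_paths(cfg, next, exit_node);
        }
    }
    return paths;
}

/* Builds the graph of a component with two consecutive if/else
 * decisions: 0 -> {1, 2} -> 3 -> {4, 5} -> 6. */
cfg_t cfg_two_decisions(void)
{
    cfg_t cfg = { 7u, { { 0 } } };

    cfg.edge[0][1] = 1u; cfg.edge[0][2] = 1u;
    cfg.edge[1][3] = 1u; cfg.edge[2][3] = 1u;
    cfg.edge[3][4] = 1u; cfg.edge[3][5] = 1u;
    cfg.edge[4][6] = 1u; cfg.edge[5][6] = 1u;
    return cfg;
}
```

Each counted path corresponds to one possible trace through the component, which is the unit that path-based requirements reason about.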

The ability to perform formal verification on a model of embedded systems C source code would seem to be a goal that might not be reachable with current formalisms. The surveyed modelling formalisms supporting formal verification are able to do so by vastly restricting the input space of data types, and can as such model only a small part of the constructions supported in the C language. These restrictions are cause for concern regarding the mapping between model and object. A number of intermediate verification steps would have to be performed during modelling in order to ensure that the simplifications performed when translating the algorithmic implementation of the object into the simplified representation of the model maintain the correctness of the implementation. This makes automatic generation of a model from existing source code artifacts unlikely, as this transformation process would need to be strictly supervised in order to ensure the correctness of the transformation. Even so, the restricted input space of the verification modelling formalisms would mean that such a transformation will in many cases not be achievable.


Formal verification remains a powerful tool for verifying systems, and should be used where possible, given the high confidence of the verification results, see section 6.5. The limited state space however means that such a formalism is ill-suited for modelling entire automotive systems, as they are simply too complex to fit in the rigid semantics of the verification formalisms. As the aim of this thesis is to investigate the ability of modelling formalisms to model the entirety of embedded C source code, these formalisms do not prove sufficient for this task.

6.3 Automatic Model Generation

Previous work at Scania in the area of extracting model data from source code, as presented in section 2.5, has had success in extracting control flow, data flow and data storage semantics from code artifacts by manipulating the Abstract Syntax Tree, an intermediate compilation step. As noted in the previous section, Control Flow Graphs, an artifact also produced during source code compilation, seem to be a suitable representation of control flow for C programs. Complications start to arise with the introduction of pointers, a key concept in the C language that is commonly used in embedded systems source code, mainly due to their ability to address memory-mapped device data. Pointers cannot be modelled to their full extent, due to the vast input space for memory addresses. In ANSI C, a pointer can manually be assigned a memory address in the memory space of the program. In an embedded system, where system stimuli can come in the form of sensor signals, this results in a situation where a pointer can be assigned to address any point in memory space based on e.g. a read sensor value. As the data passed over a sensor signal cannot be deduced from the source code artifacts, limiting the scope of possible pointer values becomes a matter of having domain-specific knowledge outside of what can be expressed in the C language. The MISRA guidelines pose some restrictions, in that pointer arithmetic during run-time is disallowed. However, given that the memory addresses of memory-mapped IO devices cannot be deduced from the source code, it is still unclear what pointers refer to, even if these pointers are statically assigned during system initialisation. In order to solve this problem, meta-data would need to be provided that could help automatic parsing tools to limit the space that pointers can address, as this information cannot be deduced from the C source code itself, especially for embedded systems.
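The difficulty can be sketched in a few lines of C. To keep the example runnable on a host machine, the memory-mapped device is replaced by a small array; simulated_registers, status_register, register_for and write_register are hypothetical names introduced for this illustration, and the fixed-width types assume a C99 <stdint.h> rather than plain ANSI C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REGISTER_COUNT 4u

/* Hypothetical stand-in for a block of memory-mapped device registers.
 * On a real ECU this would be a fixed address taken from hardware
 * documentation, i.e. information that is not in the C source. */
static volatile uint8_t simulated_registers[REGISTER_COUNT];

/* Statically chosen target: the pointee is known as soon as the base
 * address of the register block is known. */
volatile uint8_t *status_register(void)
{
    return &simulated_registers[0];
}

/* Target selected by a run-time input, e.g. a value read from a sensor.
 * Without a documented range for 'sensor_value', an analysis tool has
 * to assume that any register in the block may be addressed. */
volatile uint8_t *register_for(uint8_t sensor_value)
{
    return &simulated_registers[sensor_value % REGISTER_COUNT];
}

/* Writes through a pointer whose target was decided elsewhere. */
void write_register(volatile uint8_t *reg, uint8_t value)
{
    assert(reg != NULL);
    *reg = value;
}
```

In both functions the source code alone says nothing about which device sits behind the address. Meta-data such as a register map or a value range for the sensor input is what would allow a tool to narrow the set of possible targets.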

6.4 Validation

Following the reasoning regarding weaknesses of the method, as presented in section 4.7, the main concern regarding validation of the thesis results revolves around the mapping between Domain Space and Model Space. In order for the mapping to be valid, it must conform to the syntax rules for the modelling formalism in question, and also be classified correctly with regards to fulfilment.


Syntax Rules. During the course of the investigation, models for each given formalism were constructed using purpose-specific tools which support syntax validation. Where such tools were not present, external validation tools providing similar functionality were used. Table 6.1 presents the syntax validating tools used during the investigation. Every model presented in this thesis has passed syntax validation.

Formalism   Validation Tool
AADL        Open Source AADL2 Tool Environment (OSATE2)¹
SysML       Papyrus²
Lustre      Luke³
Promela     SPIN⁴

Table 6.1. Modelling formalism syntax validation tools.

Mapping. In order to determine the strength of the mapping we must first make the distinction between a mathematically correct mapping and a semantically correct mapping. In the case of the former, a correct mapping would indicate that a one-to-one (injective) relation exists from Domain Space to Model Space for the given construction, i.e. a bijection between the Domain Space constructions and their images in Model Space. This would indicate that there is an unambiguous way to express the C language construction in the given modelling formalism. It can be trivially proven that as long as the cardinality of Model Space is at least that of Domain Space, such a mapping can be produced. A mapping that is semantically correct instead refers to a mapping in which a given construction in Domain Space is mapped to a construction in Model Space that corresponds to the intended meaning of the Model Space construction, as defined by the creator of the modelling formalism. A real-world example of a situation where a mapping is mathematically correct, but semantically wrong, would be using a Stop-sign to indicate that vehicles are allowed to enter a tunnel. As long as the Stop-sign is not used to indicate any other action to be taken by a driver, there exists a one-to-one relation between this sign and the legality of entering a tunnel. To the driver, however, a Stop-sign might not correctly convey the meaning that the tunnel is free to enter.

Validating the mapping in a mathematical sense revolves around not re-using the same construction in Model Space for several constructions in Domain Space. This can be shown given the example mappings presented in appendices G-J, which indicate that no overlap exists between constructions labelled as Fulfilled.

Validation from a semantic perspective revolves around understanding the intention of the modelling formalism creator in order to determine their intended meaning of Model Space constructions. This has been attempted through the justifications presented in section 5.3; however, because none of the surveyed modelling formalisms were purpose-built for the modelling of the C language, a completely correct semantic mapping cannot be produced. The focus has as such been on producing a mathematically correct mapping, which is semantically correct whenever possible.

¹ https://wiki.sei.cmu.edu/aadl/index.php/Osate_2
² https://eclipse.org/papyrus/
³ Tool by Koen Claessen at Chalmers University of Technology - http://www.cse.chalmers.se/~koen/
⁴ http://spinroot.com/spin/whatispin.html
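The mathematical part of this validation amounts to checking that no Model Space construction is used for more than one Domain Space construction. A minimal sketch of such a check is given below; the record type mapping_entry_t, the function mapping_is_injective and the string representation of constructions are assumptions made for illustration, and each Domain Space construction is assumed to appear at most once in the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping entry: one C language construction and the
 * model construction chosen to represent it. */
typedef struct {
    const char *domain_construction;
    const char *model_construction;
} mapping_entry_t;

/* Returns 1 if no model construction is used for more than one domain
 * construction, and 0 otherwise. */
int mapping_is_injective(const mapping_entry_t *entries, size_t count)
{
    size_t i;
    size_t j;

    assert((entries != NULL) || (count == 0u));

    for (i = 0u; i < count; i++) {
        for (j = i + 1u; j < count; j++) {
            if (strcmp(entries[i].model_construction,
                       entries[j].model_construction) == 0) {
                return 0; /* the same model construction is re-used */
            }
        }
    }
    return 1;
}
```

Such a check says nothing about semantic correctness. In terms of the Stop-sign example, it only confirms that the sign is not used for a second meaning.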

6.5 Future Work

Due to limited time, only a select few modelling formalisms were surveyed in this thesis. Given the apparent promise of architecture description languages with regard to modelling embedded systems C code, it would be of interest to investigate a larger number of these modelling formalisms in order to determine the best possible candidate to use as a base formalism.

Verification formalisms, while not suitable for modelling entire embedded systems, show promise in the verification of specific components of a system. None of the formalisms surveyed in this thesis, however, support the explicit modelling of verification activities associated with given components. Investigating possible extensions that associate specific verification models with model elements in a more expressive formalism might therefore be of interest.

The analysis of the coverage of each modelling formalism provided in this thesis only goes as far as to produce the presence function associated with the Augmented Completeness Coverage metric. No investigations were made with regard to the characteristics of the weight function, which would produce a more accurate depiction of the most suitable modelling formalism for a given purpose. The ISO 26262 standard poses a number of requirements on modelling formalisms, in terms of being able to express e.g. hierarchical data. In order to identify the most suitable modelling formalism with regard to ISO 26262 compliance, the weight function for ISO 26262 requirements would need to be produced, a task which could constitute future work.
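How a weight function could change the outcome can be sketched as below. The exact form of the metric is not restated in this chapter, so the normalised weighted sum used here is an assumption made purely for illustration, as are the names presence_t and weighted_fulfilled_coverage. Only constructions classified as fulfilled contribute to the result.

```c
#include <assert.h>
#include <stddef.h>

/* Classification of a construction, mirroring the three fulfilment
 * levels used in the coverage discussion. */
typedef enum {
    NOT_FULFILLED = 0,
    PARTIALLY_FULFILLED = 1,
    FULFILLED = 2
} presence_t;

/* Assumed form of a weighted coverage: the share of the total weight
 * that belongs to fulfilled constructions. Returns 0.0 when the total
 * weight is zero. */
double weighted_fulfilled_coverage(const presence_t *presence,
                                   const double *weight,
                                   size_t count)
{
    double covered = 0.0;
    double total = 0.0;
    size_t i;

    assert((presence != NULL) && (weight != NULL));

    for (i = 0u; i < count; i++) {
        assert(weight[i] >= 0.0);
        total += weight[i];
        if (presence[i] == FULFILLED) {
            covered += weight[i];
        }
    }
    return (total > 0.0) ? (covered / total) : 0.0;
}
```

With equal weights this reduces to the plain share of fulfilled constructions. Raising the weight of constructions that matter for a given purpose, e.g. those tied to ISO 26262 requirements, would shift the ranking of the formalisms accordingly.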

Bibliography

[1] M. Sundar and D. Plunkett, “Brake-by-wire, motivation and engineering - gm sequel,” in SAE Technical Paper. SAE International, 10 2006. [Online]. Available: http://dx.doi.org/10.4271/2006-01-3194

[2] Road vehicles – Controller area network (CAN), ISO Std., Rev. 11898:2003, 2003.

[3] Road vehicles – Communication on FlexRay, ISO Std., Rev. 10681:2010, 2010.

[4] J. Pan, “Software testing,” Carnegie Mellon University, Tech. Rep., 1999. [Online]. Available: http://users.ece.cmu.edu/∼koopman/des s99/sw testing/

[5] Road vehicles – Functional safety – Part 2: Management of functional safety, ISO Std., Rev. 26262-2:2011, 2011.

[6] P. Johannessen, Öjvind Halonen, and O. Örsmark, “Functional safety extensions to automotive spice according to iso 26262,” in Software Process Improvement and Capability Determination, ser. Communications in Computer and Information Science, R. OConnor, T. Rout, F. McCaffery, and A. Dorling, Eds. Springer Berlin Heidelberg, 2011, vol. 155, pp. 52–63. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-21233-8 5

[7] Road vehicles – Functional safety – Part 6: Product development at the software level, ISO Std., Rev. 26262-6:2011, 2011.

[8] M. Whalen, “Formal modeling and analysis of software systems with lustre,” University of Minnesota, 2012. [Online]. Available: http://www.lccc.lth.se/index.php?mact=ReglerSeminars,cntnt01,abstractbio,0&cntnt01abstractID=367&cntnt01returnid=116

[9] O. Molin, “Design verification through software architecture recovery: Meeting ISO 26262 requirements on software using static analysis,” Master Thesis, Uppsala University, 2013.

[10] J. Pantovic, “Automated Data Dependency Visualization for Embedded Systems Programmed in C,” Master Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, Jan. 2014.


[11] W. Wieweg and M. Creutzberg, “Architecture Recovery for ECU Software,” Master Thesis, Stockholm University, Stockholm, Sweden, 2014.

[12] A. Zamouche and O. Chammam, “Towards automated recovery of embedded system functional architecture from source code and product data,” Master Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 2013.

[13] M. Pruscha, “Infrastructure for the Generation of Functional Data-Flow Views for Automotive Embedded Systems,” Master Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 2014.

[14] X. Zhang, M. Persson, M. Nyberg, B. Mokhtari, A. Einarson, H. Linder, J. Westman, D. Chen, and M. Törngren, “Experience on applying software architecture recovery to automotive embedded systems,” in 2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering (CSMR-WCRE), Feb. 2014, pp. 379–382.

[15] J. Cooling, “Languages for the programming of real-time embedded systems a survey and comparison,” Microprocessors and Microsystems, vol. 20, no. 2, pp. 67–77, 1996. [Online]. Available: http://www.sciencedirect.com/science/article/pii/014193319501067X

[16] L. Apostel, “Towards the Formal Study of Models in the Non-Formal Sciences,” in The Concept and the Role of the Model in Mathematics and Natural and Social Sciences, ser. Synthese Library. Springer Netherlands, 1961, no. 3, pp. 1–37. [Online]. Available: http://link.springer.com.focus.lib.kth.se/chapter/10.1007/978-94-010-3667-2 1

[17] R. Aris, Mathematical Modelling Techniques. Courier Corporation, 1978.

[18] M. Brambilla, J. Cabot, and M. Wimmer, “Model-Driven Software Engineering in Practice,” Synthesis Lectures on Software Engineering, vol. 1, no. 1, pp. 1–182, Sep. 2012. [Online]. Available: http://www.morganclaypool.com/doi/abs/10.2200/S00441ED1V01Y201208SWE001

[19] A. Stellman and J. Greene, Applied Software Project Management. O’Reilly Media, Inc., 2006.

[20] M. A. Ould, Testing in Software Development, C. Unwin, Ed. New York, NY, USA: Cambridge University Press, 1987.

[21] K. Beck, Test Driven Development: By Example. Addison-Wesley Professional, 2002. [Online]. Available: http://proquest.safaribooksonline.com.focus.lib.kth.se/book/software-engineering-and-development/software-testing/0321146530

[22] D. C. Schmidt, “Model-driven engineering,” Vanderbilt University, Tech. Rep., 2006. [Online]. Available: http://www.cs.wustl.edu/∼schmidt/PDF/GEI.pdf


[23] E. Chikofsky and J. H. Cross II, “Reverse engineering and design recovery: a taxonomy,” IEEE Software, vol. 7, no. 1, pp. 13–17, Jan. 1990.

[24] D. Firesmith, “Using v models for testing,” Carnegie Mellon University, Tech. Rep., 2013. [Online]. Available: http://blog.sei.cmu.edu/post.cfm/using-v-models-testing-315

[25] Functional Safety of Electrical/Electronic/Programmable Electronic Safety- related Systems, IEC Std., Rev. TR 61508:2005, 2005.

[26] Road vehicles – Functional safety – Part 3: Concept phase, ISO Std., Rev. 26262-3:2011, 2011.

[27] B. W. Kernighan and D. M. Ritchie, The C Programming Language, Second Edition, 2nd ed. Prentice Hall, Mar. 1988. [Online]. Available: http://proquest.safaribooksonline.com.focus.lib.kth.se/book/programming/c/9780133086249

[28] D. M. Ritchie, “The Development of the C Language,” in The Second ACM SIGPLAN Conference on History of Programming Languages, ser. HOPL-II. New York, NY, USA: ACM, 1993, pp. 201–208. [Online]. Available: http://doi.acm.org/10.1145/154766.155580

[29] PDP11 processor handbook, Pdp11/05/10/35/40 ed., 1973.

[30] S. Johnson and D. Ritchie, “UNIX time-sharing system: Portability of C programs and the UNIX system,” The Bell System Technical Journal, vol. 57, no. 6, pp. 2021–2048, Jul. 1978.

[31] Programming Language C, ANSI Std., Rev. X3.159-1989, 1989.

[32] Programming languages – C, ISO Std., Rev. ISO/IEC 9899:1990, 1990.

[33] VDC Research, “What languages do you use to develop software?” 2010. [Online]. Available: http://blog.vdcresearch.com/embedded sw/2010/09/what-languages-do-you-use-to-develop-software.html

[34] S. P. Dandamudi, Introduction to Assembly Language Programming: For Pentium and RISC Processors, 2nd ed. New York, NY: Springer, Nov. 2004.

[35] MISRA, “Development guidelines for vehicle based software,” Motor Industry Software Reliability Association, Tech. Rep., 1994. [Online]. Available: http://www.thelibcommonproject.org/pool/external/misra rules.pdf

[36] Guidelines for the use of the C language in vehicle based software, MISRA Std., Rev. MISRA-C:1998, 1998.


[37] J. Ganssle, “Results of Firmware Standard Survey,” The Embedded Muse, no. 266, Aug. 2014. [Online]. Available: http://www.ganssle.com/tem/tem266.html

[38] Scania, “Scania - Historia.” [Online]. Available: http://www.scania.se/om-scania/historia/

[39] “Truck manufacturers - market share in Western Europe 2013 | Ranking.” [Online]. Available: http://www.statista.com/statistics/265008/market-share-of-truck-manufacturers-in-europe/

[40] M. Kahl, “Green shoots appear for Brazil’s truck market - Automotive World.” [Online]. Available: http://www.automotiveworld.com/analysis/green-shoots-appear-brazils-truck-market/

[41] C. Nürk and M. A. Maier, “Truck market 2024 - sustainable growth in global markets,” Deloitte, Tech. Rep., 2014. [Online]. Available: http://www2.deloitte.com/content/dam/Deloitte/de/Documents/strategy/DELO Truck-Studie-2014-s.pdf

[42] I. D. Baxter and M. Mehlich, “Reverse Engineering is Reverse Forward Engineering,” Proceedings of Fourth Working Conference on Reverse Engineering, October 6-8, Amsterdam, The Netherlands, Aug. 1997. [Online]. Available: https://www.semanticdesigns.com/Company/Publications/WCRE97.pdf

[43] S. Shrestha, “Software Modeling in Cyber-Physical Systems,” Master Thesis, Linköping University, 2014. [Online]. Available: http://www.diva-portal.org/smash/record.jsf?dswid=9052&pid=diva2%3A756384&c=1&searchType=SIMPLE&language=en&query=Shilu+Shrestha&af=%5B%5D&aq=%5B%5B%5D%5D&aq2=%5B%5B%5D%5D&aqe=%5B%5D&noOfRows=50&sortOrder=author sort asc&onlyFullText=false&sf=all&jfwid=9052

[44] J. Garcia, I. Ivkovic, and N. Medvidovic, “A comparative analysis of software architecture recovery techniques,” in 2013 IEEE/ACM 28th International Conference on Automated Software Engineering (ASE), Nov. 2013, pp. 486–496.

[45] T. Panas, D. Quinlan, and R. Vuduc, “Analyzing and Visualizing Whole Program Architectures,” ICSE Workshop on Aerospace Software Engineering (AeroSE), Minneapolis, MN, 2007. [Online]. Available: http://rosecompiler.org/ROSE ResearchPapers/2007-AnalyzingAndVisualizingWholeProgramArchitectures-Aerose-ICSE.pdf

[46] K. Sartipi, “Software architecture recovery based on pattern matching,” in International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings, Sep. 2003, pp. 293–296.


[47] N. Medvidovic and R. N. Taylor, “A Classification and Comparison Framework for Software Architecture Description Languages,” IEEE Transactions on Software Engineering, vol. 26, no. 1, pp. 70–91, Jan. 2000. [Online]. Available: http://ieeexplore.ieee.org.focus.lib.kth.se/stamp/stamp.jsp?tp=&arnumber=825767

[48] I. Crnkovic, M. Chaudron, S. Sentilles, and A. Vulgarakis, “A Classification Framework for Component Models,” in Proceedings of the 7th Conference on Software Engineering Research and Practice in Sweden, Chalmers, Gothenburg, Sweden, Oct. 2007. [Online]. Available: https://gupea.ub.gu.se/bitstream/2077/18179/1/gupea 2077 18179 1.pdf#page=14

[49] O. Lindland, G. Sindre, and A. Solvberg, “Understanding quality in conceptual modeling,” IEEE Software, vol. 11, no. 2, pp. 42–49, Mar. 1994.

[50] J. Krogstie, O. I. Lindland, and G. Sindre, “Defining quality aspects for conceptual models,” Faculty of Electrical Engineering and , The Norwegian Institute of Technology, Tech. Rep., 1995. [Online]. Available: http://www.idi.ntnu.no/∼krogstie/publications/1995/ISCO3/fulltext.pdf

[51] J. Krogstie, G. Sindre, and H. Jørgensen, “Process models representing knowledge for action: A revised quality framework,” Eur. J. Inf. Syst., vol. 15, no. 1, pp. 91–102, Feb. 2006. [Online]. Available: http://dx.doi.org/10.1057/palgrave.ejis.3000598

[52] Programvaruarkitektur i koordinator 7 (COO7), Internal Document, REVE.

[53] A. Gamatié, T. Gautier, and L. Besnard, “Modeling of avionics applications and performance evaluation techniques using the synchronous language SIGNAL,” Electronic Notes in Theoretical Computer Science, vol. 88, pp. 87–103, 2004, SLAP 2003: Synchronous Languages, Applications and Programming, A Satellite Workshop of ECRST 2003. F. Maraninchi, A. Girault and E. Rutten. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1571066104050807

[54] ATESST, “EAST-ADL Overview,” 2010. [Online]. Available: http://www.atesst.org/home/liblocal/docs/ConceptPresentations/01 EAST-ADL OverviewandStructure.pdf

[55] J. Westman, M. Nyberg, and M. Törngren, “Structuring Safety Requirements in ISO 26262 Using Contract Theory,” in LNCS 8153: Computer Safety, Reliability and Security, ser. Lecture Notes in Computer Science, vol. 8153. Toulouse, France: Springer, p. 304. [Online]. Available: http://www.springer.com/computer/security+and+cryptology/book/978-3-642-40792-5

[56] R. Agrawal, “Semi-automated formalization and verification of automotive requirements using simulink design verifier,” Master Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 2015.


[57] D. Mikushin, “Visualizing code structure in ,” 2013. [Online]. Available: http://icsweb.inf.unisi.ch/cms/images/stories/ICS/slides/llvm-graphs.pdf

Appendix A

Modelling Formalism Requirements

“Modelling Language Requirements” Joakim Gustavsson KTH Royal Institute of Technology 2015-02-18

Modelling Language Requirements

Domain Related

● The modelling language must be defined either as a general purpose language for architecture modelling, or must be defined as an automotive modelling language. This is to exclude languages not applicable to the concerned field of study.

● The language must support modelling of the C language. Modelling languages that target other programming languages will not be considered. Subsets of the C language are acceptable as long as they are supersets (proper or non-proper) of MISRA-C.

● The language needs to have a clear definition of which family of modelling languages it falls under. Examples of families are ADLs, dataflow languages, component modelling languages etc. This requirement stems from a need to compare the power of each language; one can with relative ease compare two dataflow languages against each other, but comparing a dataflow language and an ADL might prove more difficult.

● The language needs to have formal and deterministic semantics, and these semantics need to be documented. The aim of the thesis is to aid Scania in achieving ISO 26262 compliance by determining a formal way of modelling the Scania architecture, both existing architecture where the model is reverse engineered as well as new pieces of the architecture that may stem from forward engineering. As such the modelling languages need to have formal semantics that can be used to verify that the model of the architecture is in fact correct.

Documentation

● The documentation of the modelling language must cover all legal constructions in the given language. In other words, there should not be legal patterns that can be constructed in the modelling language that are not fully documented.

● The documentation needs to be centralized in a single repository. This aims to facilitate ease-of-use for engineers. If the engineers need to visit several repositories because the documentation is distributed over several sources, then this will likely cause frustration.

● The documentation needs to be presented in a format that supports ease of navigation and readability. The concept of an old-school manual should serve as a guideline here. Video presentations or slides from lectures are examples of documentation that lack ease of navigation.

● The documentation should target both users of the language as well as tools developers improving the toolchain for the language. This requirement aims to support Scania’s independence in developing their own tools that suit their needs, rather than having to resort to begging a developer to provide functionality for them.

Active Development and Maturity

● The modelling language needs to show signs that active development/refining is taking place. Valid such signs are language-related conferences, new versions of the language being released or similar. The time to live for languages with no signs of active development is one year. In other words, if the language has shown no signs of active development in the past year, it will not be considered.

● The language needs to have been used to model some kind of real industrial system successfully. This is to weed out small-scale academic languages and make certain that the language in question has proven real-life applications.

● The language shall be no younger than 5 years since public release. In other words, only languages that were made public prior to 2010 will be considered.

● The language needs to have been cited or referenced in a number of scientific papers or journals. This demonstrates that there is scientific backing as to the validity of the language.

Representation

● The language needs to support a textual representation of the model. This could be in the form of a specific programming-language-like language, XML or similar. This is to support tool development.
● The language representation needs to be unambiguous. A given model should have a single representation and any differences between representations of a model should reflect changes in the actual model. This applies on a semantic level; human conventions such as variable naming could clearly differ.

Tool Support

● For a given language there need to be freely available development tools that support work with the language. At least some of these tools need to be provided free of charge, either as closed-source freeware or as open source projects. It would be impossible to conduct a survey such as this one if there were no way of testing the languages without paying license fees, which are often severe.
● The licensing of the language needs to allow third parties to develop tools freely. Scania needs to be able to develop their own tools that can aid their development without being restricted by licenses. To clarify, development of tools needs to be allowed without any involvement by the language provider. That means no restrictions and no licensing fees for developers.

Openness

● The language, as well as its documentation, needs to be openly provided, free of charge. This is to support ease of collaboration. Scania needs to be able to rely on industrial partners being able to access the information they need in order to collaborate with Scania on joint ventures without having to pay licensing fees to do so. There are several other benefits to language openness, including, but not limited to, support from the language community to add features that Scania are in need of, the ability for collaboration with educational facilities to educate new engineers in the tools used by Scania etc.

Appendix B

Fulfilment: Data Storage

Key | Degree of Fulfillment | Example | Comments
A1 | 1 | A1.pml | Promela supports the bit, bool, byte, mtype, short and int data types. Floating point data types are not supported natively; floats are only available through embedded C code.
A2 | 1 | | Supported as long as the data types inherit from one of the existing data types. They can then be aliased to the inherent types through preprocessor directives. Typedefing is however not supported.
A3 | 0 | | Promela data types are defined according to the word size of the implementing architecture. As such ints are 32 bits on a 32-bit CPU and 64 bits on a 64-bit CPU. There is no way to express that data types should have a particular bit length.
A4 | 0 | | Promela does not support a built-in neutral element that can be applied to all data types. In order to keep verification of Promela models simple, most functions will simply stall if no appropriate input data is available.
A5 | 2 | A5.pml |
A6 | 0 | | Unions are not supported in Promela.
A7 | 2 | A1.pml |
A8 | 0 | | Variables declared outside of proctypes are global and variables declared inside are function scope. Variables declared inside blocks are block scope. There is no way to make a variable file scope.
A9 | 2 | A9.pml |
A10 | 2 | A10.pml | In the example b can not be addressed outside of the d_step block.
A11 | 1 | A11.pml | Only doable through the use of pre-processor constants or enumerations, which makes for an ambiguous representation.
A12 | 1 | A12.pml | Only single dimension arrays are supported. Multi-dimensional arrays have to be declared as a struct of arrays, which is ambiguous.
A13 | 0 | | Pointers are not supported in Promela. They can be added through embedded C code, which Promela supports, but they are not native to Promela itself.
A14 | 0 | | Promela lacks the notion of variable declaration vs variable definition. A variable is automatically defined when it is declared. If it is global scope, then another file can use it as long as it imports the file that defined it, without declaring it a second time (it can immediately assign values). If the variable is declared/defined a second time, the SPIN verifier will yield an error message.
A15 | 1 | A15.pml | Only implicit type casts are permitted through the use of assignment operators. There is no way to explicitly define a type cast.
A16 | 2 | A16.pml | Can be done by adding a global channel and letting the function pass and read values from the channel on every call. This method however creates a global channel that is not present in the original C code, and is as such not very clean.
A17* | 2 | A11.pml | Promela uses the same pre-processor as C, and the implementation of pre-processor constants is as such identical.
A18 | 2 | | Signedness is strictly defined according to the Promela semantics, where e.g. ints are signed and bytes are unsigned. The sizes of the data types are not clearly defined, but as this is covered by A3, it will be omitted in the evaluation of this construction.
A19 | 0 | | Pointers are not natively supported in Promela, meaning that pointer indirection is also not supported.
A20 | 1 | A20.pml | Nested composite data types are allowed as long as the nested data type is declared outside of the nesting data type. It is not possible to declare ad hoc data structures that are internal to a given data structure.
A21 | 1 | A21.pml | Can not be given an initial value.
A22 | 2 | A22.pml |
A23 | 0 | | Not supported as pointers can not natively be represented in Promela.
A24 | 0 | | Bit and Bool are internally represented as bit-fields, but there is no way to declare bit-fields in the models themselves.

Table B.1. Coverage report for Promela Data Storage.
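The A12 comment notes that multi-dimensional arrays have to be expressed as a struct of arrays. The C sketch below illustrates what that restructuring looks like on the C side, assuming a small 2x3 array; the names `grid`, `wrapped`, `store_both` and `same_contents` are hypothetical and are not taken from the thesis example files.

```c
#include <assert.h>

#define ROWS 2
#define COLS 3

/* Native C: one two-dimensional array. */
static int grid[ROWS][COLS];

/* Workaround shape: a one-dimensional array of structs, each holding a
   one-dimensional array (hypothetical names, for illustration only). */
struct row {
    int cell[COLS];
};
static struct row wrapped[ROWS];

/* Store the same value at (r, c) in both representations. */
void store_both(int r, int c, int value)
{
    assert(r >= 0 && r < ROWS && c >= 0 && c < COLS);
    grid[r][c] = value;
    wrapped[r].cell[c] = value;
}

/* Return 1 if both representations hold identical contents, else 0. */
int same_contents(void)
{
    int r;
    int c;
    for (r = 0; r < ROWS; r++) {
        for (c = 0; c < COLS; c++) {
            if (grid[r][c] != wrapped[r].cell[c]) {
                return 0;
            }
        }
    }
    return 1;
}
```

Both forms carry the same data, but a reader of the second form can no longer tell whether the original source used a two-dimensional array or a genuine array of structures, which is one way to read the ambiguity the table points out.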

Key | Degree of Fulfillment | Example | Comments
A1 | 2 | A1.aadl |
A2 | 2 | A2.aadl |
A3 | 2 | A2.aadl |
A4 | 0 | | AADL implements pointers by requesting access to shared memory, which represents the variable the pointer points to. This is done by the “requires data access” command. As such you can not have variables that request access from a non-existing variable.
A5 | 2 | A5.aadl |
A6 | 2 | A6.aadl |
A7 | 2 | A7.aadl |
A8 | 2 | A16.aadl | See A16. File scope can be done by placing all functions inside a given file in the same subprogram group.
A9 | 1 | A8.aadl, A8test2.aadl | AADL does not support global scopes per se. Variables have scope spanning a process, thread or subprogram. Variables can only be declared inside these scopes. This means that topics such as file scope or external variables become dependent on the execution flow of the program, not the program structure.
A10 | 0 | | The smallest granularity block we can model is a function; as such the smallest scope we can model is function scope.
A11 | 2 | function constants.aadl, A11.aadl |
A12 | 2 | A12.aadl |
A13 | 2 | A13.aadl | Data that is pointed to has to be in a scope that is accessible by all functions that use the pointer. This is sufficient for MISRA, but not for general C (there can be pointers to function scope variables in general C).
A14 | 0 | | AADL does not support global scopes per se. Variables have scope spanning a process, thread or subprogram. Variables can only be declared inside these scopes. This means that topics such as file scope or external variables become dependent on the execution flow of the program, not the program structure. See A8.
A15 | 0 | A15.aadl | The function has to be declared to accept a supertype of the data type. For example, if integer and float are both subtypes of number, then casting between them works as long as the subprogram accepts a number rather than an integer or float.
A16 | 2 | A16.aadl | Can be done using subprogram groups. See table 6-4 in Model-Based Engineering with AADL: An Introduction to the SAE Architecture.
A17* | 0 | function constants.aadl, A11.aadl | Can be modelled as normal constants, although there is no explicit preprocessor construction.
A18 | 2 | | All normal C data types are defined in the base types library. This library makes the distinction between signed and unsigned variables.
A19 | 0 | | Since the pointer implementation in AADL is in effect an alias for shared memory this is not possible. A pointer is shared memory between subprograms or threads and processes, and shared memory can not exist inside the same component.
A20 | 2 | A20.aadl |
A21 | 2 | FileTypes.aadl, A21.aadl |
A22 | 1 | A22.aadl | The call here does not specify a connection, and as such it could be a constant or expression scope variable. More information in AGREE User Guide first example.
A23 | 1 | A13.aadl | Effectively the same as a pointer. Taking the address of a variable is just requesting access to some shared memory variable.
A24 | 2 | A24.aadl | Bit fields are effectively just a guideline for how to pack smaller data types inside a word.

Table B.2. Coverage report for AADL Data Storage.
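The A13 comment mentions that general C allows pointers to function-scope variables, which the shared-memory view of pointers does not cover. A minimal C sketch of that case follows; the function names `increment` and `bump_local` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Writes through a pointer; the callee cannot tell where the target lives. */
void increment(int *target)
{
    assert(target != NULL);
    *target += 1;
}

/* 'local' has function scope, yet its address is handed to another function. */
int bump_local(int start)
{
    int local = start;
    increment(&local);
    return local;
}
```

Here the pointed-to data is not visible to `increment` by scope alone, so a model that requires the data to sit in a scope shared by both functions has no direct counterpart for `local`.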

Key | Degree of Fulfillment | Example | Comments
A1 | 2 | A1.png |
A2 | 2 | A2.png |
A3 | 1 | A3.png | Can be done by extending the built-in C data types uint32_t etc. For other data sizes however this is not unambiguously definable.
A4 | 1 | A4.png | Pointers are represented using aggregates. The aggregate can have a multiplicity of 0, indicating a null pointer. This is however not very clean.
A5 | 2 | A5.png |
A6 | 0 | | There is no way to show that the members of a data type can not exist at the same time.
A7 | 2 | A7.png |
A8 | 2 | A7.png | Since both files and functions are represented using blocks, file scope is represented the same way as function scope, i.e. by adding properties to a block.
A9 | 1 | A9.png | This can technically be done by creating a “Global” block, and then providing input ports, connected to this global block, to all blocks that use these global variables. This however means that a port needs to be declared for global variable use, which clutters the interfaces.
A10 | 0 | A7.png | If a block that is part of a function is modelled as a block in SysML, then it is possible to add a variable that is local to that block. This however means that we need to decompose functions.
A11 | 1 | A11.png | Does not show visibly in the diagram, but rather as read-only access as well as a default value in the properties of the object.
A12 | 1 | A12.png | Works for one-dimensional arrays only.
A13 | 1 | A13.png | Aggregates are used to indicate that data persists after an object has been destroyed. This means that a block can aggregate another block to indicate that it accesses data contained within that block. This however gives access to all block members, meaning that data has to be contained within its own block for this to correctly represent a pointer.
A14 | 1 | | Modelled as aggregation. The block representing the file that uses the external variable aggregates the block representing the file that declares the external variable. This however once again gives access to all members of the aggregated block.
A15 | 1 | A15.png | Using the connectors this way is possible in the tool, but does not feel very formal.
A16 | 1 | A16.png | Does not show visibly in the diagram, but rather as a property of the block that can be configured. It is not guaranteed that this property is part of the formalism.
A17* | 0 | | We can not define pre-processor constants, and are limited to constants that are declared as described in A11.
A18 | 2 | | SysML supports all built-in C data types, and as such it can represent exactly the same amount of information as C can. In the case of char, it would have to be defined using uint8_t or int8_t in order to make the type unambiguous.
A19 | 0 | | Not doable. Pointers are represented as aggregates, and we can not aggregate an already existing aggregate.
A20 | 2 | A20.png |
A21 | 2 | A21.png |
A22 | 0 | | All variable blocks have to be members of another block. This means that we can not use temporary blocks for a single function call.
A23 | 1 | | Can be done by aggregating in the variable we want to address. This implicitly converts it into a pointer.
A24 | 1 | | Can be done using the data size property, but this is not visible in the graphical model. A data type of a given size can also be declared as a stereotype, but one would then have to rely on a naming convention for this data type to be of a certain size, which is cause for ambiguous semantics.

Table B.3. Coverage report for SysML Data Storage.
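The A6 comment concerns data types whose members can not exist at the same time, which is the defining property of a C union. The sketch below contrasts a union with a struct holding the same members; the type and function names are hypothetical, and the overlap check assumes a typical platform where `unsigned long` has no padding bits.

```c
#include <assert.h>
#include <stddef.h>

/* Members of a union share storage: only one is meaningful at a time. */
union reading {
    unsigned long raw;
    unsigned char bytes[sizeof(unsigned long)];
};

/* Members of a struct have separate storage and exist simultaneously. */
struct reading_pair {
    unsigned long raw;
    unsigned char bytes[sizeof(unsigned long)];
};

/* Return 1 if the union occupies less storage than the struct. */
int union_is_smaller(void)
{
    return sizeof(union reading) < sizeof(struct reading_pair);
}

/* Writing one member changes what is read back through the other. */
int write_overlaps(void)
{
    union reading r;
    assert(sizeof r.bytes == sizeof r.raw);
    r.raw = 0UL;
    r.bytes[0] = 0xFFu;
    return r.raw != 0UL;
}
```

A block with two ordinary properties corresponds to `struct reading_pair`; the mutual exclusion expressed by `union reading` is the part that the table reports as not representable.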

Key | Degree of Fulfillment | Example | Comments
A1 | 0 | A1.lus | Only variable types bool, int and real are supported. The int data type is internally represented as a 16-bit data type.
A2 | 1 | | Lustre V6 supports abstract data types, which are defined by the user and imported from another implementing language. The semantics for these data types are however omitted from the Lustre V6 reference manual as “TO DO” clauses, and the behaviour of these data types is compiler dependent.
A3 | 0 | A1.lus | The sizes of the built-in data types int, bool and real are undefined.
A4 | 1 | A4.lus | Pointers are not supported by the language, and as such the NULL element in reference to pointers is not supported. There is however a similar element that exists for the when-statement.
A5 | 2 | A5.lus |
A6 | 0 | | There is no way to show that the members of a data type can not exist at the same time.
A7 | 2 | A1.lus |
A8 | 1 | A8.lus | Not possible due to the synchronous nature of the language. Since several nodes could be trying to update or read a global or file-scope variable at the same time, this would lead to unpredictable behaviour. This can however be bypassed by creating a registry node that acts as a global variable, see A8.lus.
A9 | 1 | | See A8.
A10 | 0 | | Variables are always local to a node. The only way to make a variable block scope would be to break the block out to its own node, which breaks program structure.
A11 | 1 | A11.lus |
A12 | 2 | A12.lus |
A13 | 0 | | Can not be done. Global variables are not permitted short of the registry solution showcased in A8. If that solution is used it still does not solve the issue of aliasing (two pointers pointing to the same data).
A14 | 2 | A14 1.lus, A14 2.lus |
A15 | 0 | | Not allowed.
A16 | 2 | A11.lus | Persistent variables can be depicted using the pre statement, which takes the value of the variable from the previous time the node was called.
A17* | 0 | | No preprocessor directives can be modelled.
A18 | 0 | | This is not obvious and required testing. The Luke simulator uses signed 16-bit numbers to represent integers. Whether this is official or compiler specific is unclear from the reference manual. The Lustre V6 compiler (official) is only provided under Linux and OSX, so testing could not be performed in the Scania Windows environment.
A19 | 0 | | Pointers are not supported, which by extension means that pointer indirection is not supported.
A20 | 2 | | The language semantics seem to allow this, but since this could not be verified with the reference compiler, I am not certain.
A21 | 2 | A21.lus |
A22 | 2 | A22.lus |
A23 | 0 | | Again, related to pointers, which can not be expressed.
A24 | 0 | | Might be possible with the abstract data types, but due to lack of documentation this can not be ascertained.

Table B.4. Coverage report for Lustre Data Storage.
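The A16 comment relates persistent variables to a value remembered from the previous call. Assuming A16 refers to function-scope variables with static storage duration, a minimal C sketch of that behaviour could look as follows. The function name is hypothetical, and the start value 0 is an assumption, since Lustre leaves the first value of pre undefined unless it is initialised explicitly.

```c
#include <assert.h>

/* Returns the argument passed on the previous call, loosely mirroring the
   behaviour the table attributes to the pre statement. */
int previous_sample(int current)
{
    static int last = 0; /* assumed start value; persists between calls */
    int out = last;
    last = current;
    return out;
}

/* Usage: three consecutive calls observe the stored state. */
void previous_sample_demo(void)
{
    assert(previous_sample(5) == 0);
    assert(previous_sample(7) == 5);
    assert(previous_sample(9) == 7);
}
```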


Appendix C

Fulfilment: Data Flow

Key | Degree of Fulfillment | Example | Comments
B1 | 2 | A1.pml |
B2 | 1 | B2.pml | Return values are only supported through global variables and message channels. This is ambiguous.
B3 | 2 | B3.pml |
B4 | 0 | | Not possible as pointers are not natively supported.
B5 | 0 | | Not possible as pointers are not natively supported.
B6 | 0 | | Hardware can not be modelled in Promela. It could be possible to model external factors through the use of a process that modifies a value sent over a message channel, but this would mean that the affected variable would constantly have to read a value and update its own value from a message channel, a convention that is not very clean.
B7 | 0 | | As mentioned in B6, sensor modifications could be modelled by constantly sampling a variable value from a message channel. This is however not at all transparent.
B8 | 0 | | Promela has no way to model that specific blocks of code are implemented in a different language.
B9 | 2 | B9.pml | q is a rendezvous.

Table C.1. Coverage report for Promela Data Flow.


Key | Degree of Fulfillment | Example | Comments
B1 | 0 | TypedIntegers.aadl, B1.aadl | Variable initial values can be modelled using property sets. Reassignment inside a block is not possible, although reassignment can happen across components.
B2 | 2 | B2.aadl |
B3 | 2 | B2.aadl |
B4 | 1 | B4.aadl | Possible to do in an ambiguous way. “Requires data access” can be used, but this is the same convention used to indicate access to a global variable. As such the distinction is not clear.
B5 | 0 | | Not doable as function-internal data flow can not be modelled.
B6 | 2 | B6.aadl | This requires the entire hardware architecture to be modelled since the memory-mapped registry needs to be connected to a sensor, which is part of a system.
B7 | 2 | Interrupts.aadl | This requires a lot of hardware knowledge unless the ISR implementation is stubbed using abstract and a design pattern.
B8 | 2 | ExternalLang.aadl, B8.aadl | Since we do not model the internals of subprograms, we can just add a property stating which language implements it.
B9 | 1 | A8.aadl |

Table C.2. Coverage report for AADL Data Flow.

Key | Degree of Fulfillment | Example | Comments
B1 | 2 | | Not doable in a clean way. It is possible to declare a state machine where variable assignment could be modelled as a state transition, but this way is extremely unclear and complicated.
B2 | 2 | B2.png |
B3 | 2 | B2.png |
B4 | 2 | B4.png | Modellable using in-out flow ports.
B5 | 0 | | Not modellable as intra-block aliasing is not supported.
B6 | 1 | B2.png | Since everything is a block, this can be modelled by creating a block representing the external factor (ISR or similar) and tying it to the block containing the referenced variable using a flow port.
B7 | 1 | B2.png | Event ports have no notion of timing. As such data on an event port could occur at any point during object execution. We can as such model interrupts the same way we modelled B6.
B8 | 1 | B8 1.png, B8 2.png | Modelled using activity diagrams where a block can represent an external call to code in a different language. The output is then tied to a port. This is wrong!
B9 | 1 | B9.png | Has to be done using a port, and even so it is questionably correct in terms of semantics to connect a variable to a port, despite the tool allowing it.

Table C.3. Coverage report for SysML Data Flow.

Key | Degree of Fulfillment | Example | Comments
B1 | 1 | A1.lus | Assignments can be made once, but variables can not be reassigned. This has to do with Lustre seeing variables as equations rather than registries.
B2 | 2 | A22.lus |
B3 | 2 | A22.lus |
B4 | 0 | | Pointers are not supported, so this is not possible.
B5 | 0 | | See B4.
B6 | 0 | | Not possible as Lustre can not reference hardware.
B7 | 1 | | Lustre does not support interrupts. See text.
B8 | 0 | | Since functions need to be re-written using Lustre’s equation system to model the behaviour of functions, the source language does not matter as long as the behaviour is known.
B9 | 1 | A11.lus | Since functions can store previously passed values inside them using the pre statement, it is possible for function calls to have side effects in other nodes. This assumes that we model global variables as described in A11 however, since this is the only way global variables can be expressed.

Table C.4. Coverage report for Lustre Data Flow.
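The B1 comment states that a variable can be assigned once but not reassigned. The C sketch below shows the same computation written with reassignment and in a single-assignment style, where each intermediate value gets its own name; the function names are hypothetical and the rewrite is only one possible way of preparing C code for an equation-based model.

```c
#include <assert.h>

/* Typical C: one variable is reassigned several times. */
int scale_reassign(int x)
{
    int v = x;
    v = v + 1;
    v = v * 2;
    return v;
}

/* Single-assignment form: every intermediate value gets its own name,
   so each line reads like an equation that is never overwritten. */
int scale_single(int x)
{
    const int v0 = x;
    const int v1 = v0 + 1;
    const int v2 = v1 * 2;
    return v2;
}

/* Usage: both forms compute the same result. */
void scale_demo(void)
{
    assert(scale_reassign(3) == 8);
    assert(scale_single(3) == scale_reassign(3));
}
```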

Appendix D

Fulfilment: Control Flow

Key | Degree of Fulfillment | Example | Comments
C1 | 1 | C1.pml | Promela does not allow fall-through if-statements. If several clauses are true, then a single one will be selected in a non-deterministic fashion. In the example file for this construction, the output of the program is non-deterministic and will vary in each execution. This is quite a severe restriction as fall-through if-cases are fairly common.
C2 | 2 | C2.pml | Has to be modelled similar to a while-loop, which is fine since a for-loop is syntactic sugar for a while-loop.
C3 | 2 | C2.pml |
C4 | 1 | C4.pml | Can be done using goto-statements. It is also doable through the use of do-statements and an inverted if-statement, but this is less clean.
C5 | 1 | B2.pml | Ambiguous due to the use of global variables or message channels for return values.
C6 | 0 | | Pointers are not supported in Promela.
C7 | 0 | | Pointers are not supported in Promela, and as a consequence neither are function pointers.
C8 | 2 | C8.pml | For general ANSI C, the lack of fall-throughs for Promela if-statements is a problem. However MISRA forbids the use of fall-throughs in switch constructions. As such the Promela if-statement satisfies the MISRA switch-construction fully.
C9 | 0 | | Not possible to model. Several processes can be run at the same time, but without the use of channels to represent mutexes all processes are assumed to execute concurrently. As such it is not possible to explicitly model that an ISR is run in place of the regular flow of execution without specifying a mutex that the original process has to hold on. This implies that explicit synchronization mechanisms exist within the code, which might not be the case.
C10 | 1 | C10.pml | Unless explicit mutexes are used, the function calls have to be written as tail calls or they will execute in parallel, resulting in race conditions.
C11 | 0 | | As function pointers are not modellable, neither is the calling of them.
C12 | 2 | C12.pml | Only doable through implicit casts as explicit casts are not supported.
C13* | 2 | C13.pml | Supported the same way as in C as the same preprocessor is used.
C14 | 2 | C14.pml | Modellable using an if-else statement. This is ambiguous with regular if-statements, but this is acceptable since ternary statements are syntactic sugar for if-else statements.

Table D.1. Coverage report for Promela Control Flow.
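The C1 comment contrasts C, where clauses are tried in order, with a selection in which any true clause may be chosen. The C sketch below uses overlapping conditions to make the ordering visible; the function name `band` and the thresholds are hypothetical.

```c
#include <assert.h>

/* For x = 20 both conditions (x >= 10) and (x >= 0) are true.
   C evaluates them top to bottom, so the first match always wins. */
int band(int x)
{
    int result;
    if (x >= 10) {
        result = 2;
    } else if (x >= 0) {
        result = 1;
    } else {
        result = 0;
    }
    return result;
}

/* Usage: the outcome is the same on every run. */
void band_demo(void)
{
    assert(band(20) == 2);
    assert(band(5) == 1);
    assert(band(-1) == 0);
}
```

A model that picks non-deterministically among the true clauses could yield either 1 or 2 for `x = 20`, so a faithful translation would presumably need mutually exclusive guards, for example by adding the negation of each earlier condition to the later ones.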

Key | Degree of Fulfillment | Example | Comments
C1 | 1 | C1.aadl, C1flows.aadl | While no internal information about functions can be defined, we can define call sequences that might follow from if-else constructions using call sequences or flows in AADL.
C2 | 0 | C2.aadl | You can model repeated execution of some piece of code (e.g. inside a loop) using a thread with a period or fixed number of iterations as defined in a property set, calling a subprogram representing the loop body.
C3 | 0 | C2.aadl (swap the FOR to a WHILE) | Modelling loops this way works fine for loops with a deterministic number of iterations, but loops running on e.g. a boolean condition that can become false after an arbitrary number of iterations are harder to model.
C4 | 0 | C2.aadl (swap the FOR to a DOWHILE) | See C2-C3 comments.
C5 | 2 | C5.aadl |
C6 | 0 | C6.aadl |
C7 | 2 | C7.aadl |
C8 | 1 | C1.aadl, C1flows.aadl | Same as IF-ELSE.
C9 | 2 | Interrupts.aadl | This requires a lot of hardware knowledge unless the ISR implementation is stubbed using abstract and a design pattern.
C10 | 2 | C10.aadl |
C11 | 1 | C11.aadl | Not clear whether the function call happens to the pointer or directly to the address that the pointer points to.
C12 | 0 | | Type casting is not possible unless the cast happens to a data supertype. See A15.
C13* | 2 | | Have to be modelled as regular functions, or inlined (in which case they can not be modelled as internal behaviour is not considered in AADL). When modelled as regular functions, a property set can be used to declare them as macros.
C14 | 0 | | This is the same as an if-statement but for data flow. We noted in C1 that only control flow can be modelled using if-statements.

Table D.2. Coverage report for AADL Control Flow.
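The C2 and C3 comments distinguish loops with a deterministic number of iterations from loops whose exit depends on a condition. The C sketch below shows one loop of each kind over a small hypothetical data set; the names `samples`, `sum_all` and `count_before_zero` are illustrative only.

```c
#include <assert.h>

#define SAMPLE_COUNT 4

/* Hypothetical input data. */
static const int samples[SAMPLE_COUNT] = { 3, 1, 0, 4 };

/* Fixed iteration count: always SAMPLE_COUNT passes through the body. */
int sum_all(void)
{
    int i;
    int sum = 0;
    for (i = 0; i < SAMPLE_COUNT; i++) {
        sum += samples[i];
    }
    return sum;
}

/* Data-dependent exit: the number of passes is only known at run time. */
int count_before_zero(void)
{
    int i = 0;
    while ((i < SAMPLE_COUNT) && (samples[i] != 0)) {
        i++;
    }
    return i;
}

/* Usage: expected results for the sample data above. */
void loop_demo(void)
{
    assert(sum_all() == 8);
    assert(count_before_zero() == 2);
}
```

The first loop maps naturally onto a fixed iteration count stored as a property, whereas the second has no single count to record, which matches the difficulty noted for C3.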

Key | Degree of Fulfillment | Example | Comments
C1 | 2 | C1.png | Modellable using an activity diagram and decision points.
C2 | 2 | C2.png | Specify counter increment at end of loop, and decision to return to start based on loop iterator.
C3 | 2 | C2.png | Same as for-loop, except with just a condition, and no increment at end.
C4 | 2 | C2.png | Same as for-loop, except with the condition at the end rather than the start.
C5 | 2 | B2.png |
C6 | 2 | B4.png | Variables and pointers appear to be the same in the model. What distinguishes them is whether they were added to the Internal Block Diagram as aggregations or compositions.
C7 | 2 | C7 1.png, C7 2.png | Blocks can be passed on flow ports.
C8 | 2 | C1.png | Switch statements are effectively long if-else if-else blocks.
C9 | 1 | | A block in SysML does not include a concept of time. As such data could flow into the block at any given point during execution. This is modelled in more detail in state machine diagrams, which formalize when signals occur. This means that data from interrupt handlers could appear in a function at any point in time through an event port. We can however not model that we leave execution of one block unless there is an event port to signal us leaving the block. That means that we can not model that execution switches to an ISR in the middle of a block execution, unless we provide event ports to the ISR, which becomes clunky.
C10 | 2 | C10.png | This is ambiguous when represented as an IBD, since we can not see if the blocks are called in a nested fashion or in sequence 1-2-3-2-1. This could however be further specified in an activity diagram that ties messages passed on ports to certain points in the execution, making it unambiguous.
C11 | 1 | C11.png | Again, there is no clear way to show that the function that is called is actually the function that is being passed. The port is also dependent on a particular function being passed, not a function prototype, meaning that the entire purpose of passing a function pointer (generality) falls flat.
C12 | 2 | A15.png |
C13* | 0 | | Modelled the same way as regular functions, i.e. using blocks, but there is no way to differentiate between whether they were declared as macros or as functions. In fact, it is impossible to distinguish between any kind of blocks, whether they represent a layer or a function or a source code file.
C14 | 2 | C12.png | Is effectively an if-statement, meaning that it can be modelled as a decision point, an assignment on each of the decision paths, followed by a merge node.

Table D.3. Coverage report for SysML Control Flow.

Key | Degree of Fulfillment | Example | Comments
C1 | 1 | A21.lus | Only functional if-statements related to variable assignment are supported. We can, for example, not have an if-statement that triggers a function with no return values, as this is not part of an assignment.
C2 | 1 | C2.lus | Loops are not explicitly reproducible, although Lustre supports iteration-like functions such as foldl, map and fill. The example is based on an example from the Lustre V6 reference manual.
C3 | 1 | | See C2.
C4 | 1 | | See C2.
C5 | 2 | A22.lus |
C6 | 0 | | Pointers are not supported.
C7 | 0 | | Pointers to functions (nodes) are not supported.
C8 | 1 | | Has to be implemented using if-statements, which are, as noted in C1, functional only.
C9 | 0 | | Not supported.
C10 | 2 | A22.lus |
C11 | 0 | | Not supported since function pointers are not supported.
C12 | 0 | | Not supported since type casting is not supported.
C13* | 0 | | Pre-processor directives are not supported. Function-like macros have to be modelled as regular nodes, i.e. the same way as functions would be modelled.
C14 | 2 | A21.lus | The functional if-statement of Lustre is almost identical to C’s ternary statement.

Table D.4. Coverage report for Lustre Control Flow.
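The C14 rows treat the ternary statement as an if-else that produces a value. A minimal C sketch of that equivalence follows; the function names are hypothetical, and inputs are assumed to stay away from the most negative `int`, where negation would overflow.

```c
#include <assert.h>

/* Ternary form: the choice is part of an expression that yields a value. */
int magnitude_ternary(int x)
{
    return (x < 0) ? -x : x;
}

/* Statement form: the same choice written as if-else with an assignment
   on each path. */
int magnitude_ifelse(int x)
{
    int result;
    if (x < 0) {
        result = -x;
    } else {
        result = x;
    }
    return result;
}

/* Usage: both forms agree on negative, zero and positive inputs. */
void magnitude_demo(void)
{
    assert(magnitude_ternary(-3) == magnitude_ifelse(-3));
    assert(magnitude_ternary(0) == magnitude_ifelse(0));
    assert(magnitude_ternary(4) == magnitude_ifelse(4));
}
```

An if-else whose branches only call functions without return values has no ternary counterpart, which corresponds to the restriction described for C1.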

Appendix E

Fulfilment: Code Structure

Key | Degree of Fulfillment | Example | Comments
D1 | 0 | | There is no way to distinguish between implementing languages in Promela.
D2 | 1 | | As Promela supports the C preprocessor, the modularization of code between files is supported in the same way as it is in C, where a given file can be imported into other files. There is however no distinction between h-files and c-files as the concept of declaration does not exist in Promela.
D3 | 2 | A22 mod.pml, D3.pml | Done in an identical fashion to C due to Promela using the same preprocessor.
D4 | 2 | D4 1.pml, D4 2.pml | Supported in an identical way to C as the same preprocessor is used.
D5 | 2 | | Proctypes are always global.
D6 | 0 | | Proctypes can not be declared as file scope, they are always global.
D7 | 2 | D7.pml | Supported in an identical way to C as the same preprocessor is used.

Table E.1. Coverage report for Promela Code Structure.

Key | Degree of Fulfillment | Example | Comments
D1 | 2 | B8 | Done in the same way as ASM meshing.
D2 | 2 | | Separate AADL packages for different source files.
D3 | 2 | | Using the “with” keyword for inclusion.
D4 | 0 | D4.aadl, D4 2.aadl, a propertySet.aadl |
D5 | 2 | | All functions are global as long as they are listed in the public section of a package.
D6 | 2 | | All functions are file local as long as they are listed in the private section of a package.
D7 | 0 | | AADL does not allow keyword aliasing. New keywords can be added using annexes, but that is beyond the scope of normal development.

Table E.2. Coverage report for AADL Code Structure.
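The D5 to D7 rows refer to global functions, file-local functions and keyword aliasing. Assuming that D6 concerns functions declared `static` and that D7 concerns aliasing a keyword through the preprocessor, the C side of these constructions could be sketched as below; the macro `PRIVATE` and the function names are hypothetical.

```c
#include <assert.h>

/* Keyword alias via the preprocessor (assumed meaning of D7). */
#define PRIVATE static

/* File-scope function: not visible to other translation units
   (assumed meaning of D6). */
PRIVATE int twice(int x)
{
    return x * 2;
}

/* Global function: part of the interface of this file (D5). */
int api_twice(int x)
{
    return twice(x);
}

/* Usage: callers outside the file only see api_twice. */
void visibility_demo(void)
{
    assert(api_twice(21) == 42);
}
```

In package terms, `api_twice` would be listed in the public section and `twice` in the private section, while the alias `PRIVATE` itself has no counterpart according to the D7 row.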

Key | Degree of Fulfillment | Example | Comments
D1 | 2 | | This can be done using profile stereotypes added to the Block object in SysML. One would simply create a stereotype that extends Block and contains a string representing a programming language.
D2 | 1 | | This can be done using blocks representing different levels of abstraction, which would include C and H files. There is however no formal way to state that a block is a function or module, meaning that the model becomes ambiguous.
D3 | 2 | C7 2.png | Done through composition and aggregation.
D4 | 2 | D4.png |
D5 | 2 | | Done by all blocks aggregating this function, which makes the model very ad hoc.
D6 | 2 | C7 2.png | See text.
D7 | 0 | | All keywords are in effect represented by properties which are tool specific. In Papyrus these are implemented as drop-down menus applicable to blocks, and are as such set.

Table E.3. Coverage report for SysML Code Structure.

Key | Degree of Fulfillment | Example | Comments
D1 | 0 | | Since Lustre nodes work like black boxes the implementing language is irrelevant to the model. Functions can be modelled as nodes no matter the input language. It bears noting however that there is no way to describe what a node represents (a layer, function, file etc.). This also means that there is no way (short of using comments) to specify the source language of a node representing a function.
D2 | 2 | A14 1.lus | H-files and C-files are not represented as separate files but rather as different parts of the same package declaration. H-files are depicted using a “provides” block, while the corresponding C-file resides in the “body” block.
D3 | 2 | D3.lus |
D4 | 2 | D4.lus |
D5 | 2 | A14 1.lus, A14 2.lus | If functions are listed under “provides” then they are global.
D6 | 2 | D6.lus | Nodes without a header in the provides-block are file scope.
D7 | 0 | | Not allowed.

Table E.4. Coverage report for Lustre Code Structure.


Appendix F

Fulfilment: Program Behaviour

Key | Degree of Fulfillment | Example | Comments
E1 | 2 | E1.pml | All C bitwise, arithmetic and logic operators are supported.
E2 | 0 | | As Promela models are compiled into C through the use of a C compiler (gcc through MinGW in this thesis) in order to be verified, the behaviour of the ambiguous operators is determined by the compiler in an identical fashion to C.
E3 | 1 | | Negative error codes can be returned over channels the same way as they can be in C. The C convention of using a pointer to a structure as an error parameter, where diagnostic information is returned as part of the structure upon producing an error, is however not supported due to the lack of pointers. This can be remedied with global variables or message channels to indicate error messages, but this use of channels/global variables is ambiguous.
E4 | 2 | | As pointers do not exist, variable lifetime is tied to the scope of the variable. Global variables will have a lifetime that lasts for the duration of the program, function scope variables will last for the duration of the proctype and block scope variables will last through the duration of the block (usually a d_step clause).
E5 | 0 | | The properties that apply to a given function can not be modelled formally in Promela.
E6 | 2 | E6.pml, E6 2.pml | Promela supports modelling of requirements through assert statements and claims. Assert statements check that properties hold at a given point of the execution trace. Monitoring processes can be used to establish system invariants as can be seen in the example. Claims establish truths that can not be broken. A never claim for example expresses states that should never be reached during program execution. Together these constructions are sufficient to model AG-contracts (asserts that A contracts hold upon entering the process, which can be triggered through a message channel, and that G contracts hold at the end of execution).
E7 | 0 | | As the Promela model is not deterministic, it is impossible to make predictions about performance. There is no way to model the performance of a function that has been modelled, as properties assigned to structural components do not exist.
E8 | 0 | | Strings are not supported in Promela, and escape sequences are as such not applicable to Promela models.

Table F.1. Coverage report for Promela Program Behaviour.
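The E6 comment describes AG-contracts as an assumption checked on entry and a guarantee checked at the end of execution. The same idea can be sketched in C with the standard `assert` macro; the function `clamp_percent` and its bounds are hypothetical and are not taken from the thesis examples.

```c
#include <assert.h>

/* Clamp a non-negative value to the range 0..100. */
int clamp_percent(int x)
{
    int result;

    /* Assumption (A): must hold when the function is entered. */
    assert(x >= 0);

    result = (x > 100) ? 100 : x;

    /* Guarantee (G): must hold when the function returns. */
    assert((result >= 0) && (result <= 100));
    return result;
}
```

Calling `clamp_percent(150)` returns 100 and `clamp_percent(40)` returns 40, while a negative argument violates the assumption and aborts the program in a build where assertions are enabled.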

| Key | Degree of Fulfillment | Example | Comments |
|-----|-----------------------|---------|----------|
| E1 | 0 | | Internal data behaviour cannot be modelled, only data flow between components. |
| E2 | 0 | | Since E1 cannot be modelled, this one becomes irrelevant. |
| E3 | 2 | E3.aadl | Using the Error Model annex. Can also be modelled implicitly (the way C does it, where error codes within the variable range are returned, e.g. -1 on error), or using contracts from the AGREE annex. |
| E4 | 2 | | Implicitly modelled since variable lifetimes last as long as the instance of a component. E.g. when a subprogram exits execution, local variables are destroyed. We do not run into the issue with pointers to local variables since pointers can only point to data with a wider file scope. |
| E5 | 2 | Omitted, see property set examples from e.g. B8. | Modelled using property sets. Reentrant traits can be specified the same way. |
| E6 | 2 | E6.aadl | Can be modelled with contracts using the AGREE annex. |
| E7 | 2 | Interrupts.aadl | Modelled using pre-defined property sets for memory and periodicity etc. |
| E8 | 2 | E6.aadl | Done in the same way as C. |

Table F.2. Coverage report for AADL Program Behaviour.

| Key | Degree of Fulfillment | Example | Comments |
|-----|-----------------------|---------|----------|
| E1 | 0 | | Cannot be done in Papyrus since there is no way to specify what happens inside a block in an activity diagram. |
| E2 | 2 | | Would have to be specified as a requirement, use-case or through a parametric diagram. |
| E3 | 2 | E3.png | Can be modelled in a number of different ways, either as signals being emitted to an error handler, or as exceptions that get logged to a data storage device. The example shows the exception logging method. |
| E4 | 2 | | Since SysML is based on UML, which is meant to model OO programming concepts, the concept of variable lifetime is very clearly defined. Composition implies that the variable lifetime lasts for the duration of the current block, while aggregation implies that the lifetime of the variable lasts for longer than the lifetime of the block, which is the case with pointers. |
| E5 | 1 | | Would, similarly to other properties, have to be modelled through the use of stereotypes applied to a given block. Stereotypes are effectively ways to extend SysML, and are as such not part of the formalism, meaning that they do not have any inherent semantics. |
| E6 | 2 | E6.png | SysML supports an entire diagram type related to requirements, which can be tied to the blocks that fulfill the requirements. |
| E7 | 1 | | Again modellable through stereotypes, which causes a lack of formal semantics. |
| E8 | 2 | E8.png | Special characters are automatically escaped inside SysML strings. |

Table F.3. Coverage report for SysML Program Behaviour.

| Key | Degree of Fulfillment | Example | Comments |
|-----|-----------------------|---------|----------|
| E1 | 2 | A1.lus (partial) | All C arithmetic and logical operators are supported. |
| E2 | 0 | | Lustre suffers from the same ambiguities that C suffers from in this case. |
| E3 | 1 | | Modellable with error codes the same way as C does it. Worth mentioning is that the C convention of passing an error parameter in the form of a pointer is not possible since pointers do not exist in Lustre. We are as such limited to error values in return codes. |
| E4 | 1 | | All variables die when a node is left unless they are redefined in consecutive calls using the pre operator. Since pointers are not modellable we do not have the C-specific issue of pointer vs variable lifetime. |
| E5 | 0 | | Not modellable. |
| E6 | 2 | E6.lus | Can be modelled in the shape of synchronous observers. This is similar to a very simple contract system with pre-conditions and post-conditions. See E6.lus. This functionality is however very simple, bulky and dependent on third party verifiers. |
| E7 | 0 | | Cannot be modelled. |
| E8 | 0 | | Strings cannot be modelled, so escape sequences are irrelevant. |

Table F.4. Coverage report for Lustre Program Behaviour.


Appendix G

AADL Examples

package A1
public
  with Base_Types;

  data Integer
  end Integer;

  subprogram Fun1
    features
      in_parameter : in parameter Integer;
  end Fun1;
end A1;

Figure G.1. Example for A1.

package A11
public
  with base_types;
  with data_model;

  data BoundedInteger
    properties
      Data_Model::Data_Representation => Array;
      Data_Model::Base_Type => (classifier (Base_Types::Integer));
      Data_Model::Dimension => (function_constants::Min_Value);
  end BoundedInteger;
end A11;

Figure G.2. Example for A11.

package A12
public
  with base_types;
  with data_model;

  data OneDim_Array
    properties
      Data_Model::Data_Representation => Array;
      Data_Model::Base_Type => (classifier (Base_Types::Integer));
      Data_Model::Dimension => (8);
  end OneDim_Array;

  data TwoDim_Array
    properties
      Data_Model::Data_Representation => Array;
      Data_Model::Base_Type => (classifier (Base_Types::Integer));
      Data_Model::Dimension => (8, 8);
  end TwoDim_Array;
end A12;

Figure G.3. Example for A12.

package A13
public
  data Integer
  end Integer;

  subprogram fun1
    features
      p1 : requires data access Integer;
  end fun1;

  subprogram fun2
    features
      p2 : requires data access Integer;
  end fun2;

  process proc
  end proc;

  process implementation proc.impl
    subcomponents
      f1 : subprogram fun1;
      f2 : subprogram fun2;
      pointedTo : data Integer;
    connections
      data access pointedTo -> f1.p1;
      data access pointedTo -> f2.p2;
  end proc.impl;
end A13;

Figure G.4. Example for A13.

package A15
public
  data Numeric
  end Numeric;

  data implementation Numeric.integer
  end Numeric.integer;

  data implementation Numeric.float
  end Numeric.float;

  subprogram TakesInt
    features
      inp : in parameter Numeric;
  end TakesInt;

  subprogram TakesFloat
    features
      inp : in parameter Numeric.float;
  end TakesFloat;

  thread t
  end t;

  thread implementation t.impl
    subcomponents
      var1 : data Numeric.integer;
      var2 : data Numeric.float;
      sp1 : subprogram TakesInt;
      sp2 : subprogram TakesFloat;
    calls
      seq1 : {
        call1 : subprogram TakesInt;
        call2 : subprogram TakesFloat;
      };
    connections
      parameter var1 -> call1.inp;
      parameter var2 -> call1.inp;
  end t.impl;
end A15;

Figure G.5. Example for A15.

package A16
public
  data Integer
  end Integer;

  subprogram keepsStaticVar
    features
      staticVar : requires data access Integer;
  end keepsStaticVar;

  subprogram group staticVarFamily
  end staticVarFamily;

  subprogram group implementation staticVarFamily.impl
    subcomponents
      keepStatic : subprogram keepsStaticVar;
      sV : data Integer;
    connections
      data access sV -> keepStatic.staticVar;
  end staticVarFamily.impl;
end A16;

Figure G.6. Example for A16.

package A2
public
  with base_types;
  with data_model;

  data tS16
    properties
      Data_Model::Number_Representation => Signed;
      Source_Data_Size => 2 Bytes;
  end tS16;

  data Float
    properties
      Data_Model::Data_Representation => Float;
  end Float;

  data Float_32 extends Float
    properties
      Data_Model::IEEE754_Precision => Simple;
      Source_Data_Size => 4 Bytes;
  end Float_32;
end A2;

Figure G.7. Example for A2.

package A20
public
  with data_model;
  with base_types;

  data Data_Container
    properties
      Data_Model::Data_Representation => Struct;
  end Data_Container;

  data implementation Data_Container.impl1
    subcomponents
      member1 : data Base_Types::Unsigned_8;
      member2 : data Base_Types::Integer_32;
  end Data_Container.impl1;

  data implementation Data_Container.impl2
    subcomponents
      member1 : data Base_Types::Unsigned_8;
      member2 : data Data_Container.impl1;
  end Data_Container.impl2;
end A20;

Figure G.8. Example for A20.

package A21
public
  with FileTypes;

  Data File
  end File;

  Data implementation file.directory
    properties
      FileTypes::fType => DIR;
  end file.directory;
end A21;

Figure G.9. Example for A21.

package A22
public
  data Numeric
  end Numeric;

  subprogram fun1
    features
      inp : in parameter Numeric;
  end fun1;

  thread t
  end t;

  thread implementation t.impl
    calls
      seq1 : {
        call1 : subprogram fun1;
      };
  end t.impl;
end A22;

Figure G.10. Example for A22.

package A24
public
  with Data_Model;

  data OneBitBitfield
    properties
      Source_Data_Size => 1 bits;
  end OneBitBitfield;

  data PropertyFlags
    properties
      Data_Model::Data_Representation => Struct;
  end PropertyFlags;

  data implementation PropertyFlags.struct
    subcomponents
      flag1 : data OneBitBitfield;
      flag2 : data OneBitBitfield;
  end PropertyFlags.struct;
end A24;

Figure G.11. Example for A24.

package A5
public
  with data_model;
  with base_types;

  data Data_Container
    properties
      Data_Model::Data_Representation => Struct;
  end Data_Container;

  data implementation Data_Container.impl
    subcomponents
      member1 : data Base_Types::Unsigned_8;
      member2 : data Base_Types::Integer_32;
  end Data_Container.impl;
end A5;

Figure G.12. Example for A5.

package A6
public
  with data_model;
  with base_types;

  data Data_Container
    properties
      Data_Model::Data_Representation => Union;
  end Data_Container;

  data implementation Data_Container.impl
    subcomponents
      member1 : data Base_Types::Unsigned_8;
      member2 : data Base_Types::Integer_32;
  end Data_Container.impl;
end A6;

Figure G.13. Example for A6.

package A7
public
  with base_types;

  subprogram fun
  end fun;

  subprogram implementation fun.impl
    subcomponents
      member1 : data Base_Types::Unsigned_8;
  end fun.impl;
end A7;

Figure G.14. Example for A7.

package A8
private
  with base_types;

  data Integer
  end Integer;

  process proc
  end proc;

  subprogram fun
    features
      fileScope : requires data access Integer;
  end fun;

  process implementation proc.impl
    subcomponents
      d : data Integer;
      f : subprogram fun;
    connections
      data access d -> f.fileScope;
  end proc.impl;
end A8;

Figure G.15. Example for A8.

package A8test2
private
  with base_types;

  data Integer
  end Integer;

  thread th
  end th;

  subprogram fun
    features
      fileScope : requires data access Integer;
  end fun;

  thread implementation th.impl
    subcomponents
      d : data Integer;
      f : subprogram fun;
    connections
      data access d -> f.fileScope;
  end th.impl;
end A8test2;

Figure G.16. Example for A8test2.

Figure G.17. Example for B1.

package B2
public
  data Integer
  end Integer;

  subprogram fun1
    features
      b : in parameter Integer;
      ret : out parameter Integer;
  end fun1;

  thread caller
  end caller;

  thread implementation caller.impl
    subcomponents
      a : data Integer;
    calls
      seq1 : {
        call1 : subprogram fun1;
      };
    connections
      parameter a -> call1.b;
      parameter call1.ret -> a;
  end caller.impl;
end B2;

Figure G.18. Example for B2.

package B4
public
  data Integer
  end Integer;

  subprogram fun1
    features
      inp : requires data access Integer;
  end fun1;

  thread caller
  end caller;

  thread implementation caller.impl
    subcomponents
      a : data Integer;
    calls
      seq1 : {
        call1 : subprogram fun1;
      };
    connections
      data access a -> call1.inp;
  end caller.impl;
end B4;

Figure G.19. Example for B4.

package B6
public
  data Integer
  end Integer;

  device actuator
    features
      sensorData : out data port;
  end actuator;

  subprogram FLD
    features
      readData : requires data access Integer;
  end FLD;

  system sys
  end sys;

  process t
    features
      readSensor : in data port;
  end t;

  process implementation t.impl
    subcomponents
      Registry : data Integer;
      FuelD : subprogram FLD;
    connections
      data access Registry -> FuelD.readData;
  end t.impl;

  system implementation sys.impl
    subcomponents
      FuelLevelSensor : device actuator;
      FuelLevelDisplay : subprogram FLD;
      runner : process t.impl;
    connections
      port FuelLevelSensor.sensorData -> runner.readSensor;
  end sys.impl;
end B6;

Figure G.20. Example for B6.

package B8
public
  with ExternalLang;

  subprogram ASMFunction
  end ASMFunction;

  thread caller
  end caller;

  thread implementation caller.impl
    subcomponents
      calledFun : subprogram ASMFunction {
        ExternalLang::language => ASM;
      };
  end caller.impl;
end B8;

Figure G.21. Example for B8.

package C1
public
  data Numeric
  end Numeric;

  data implementation Numeric.integer
  end Numeric.integer;

  data implementation Numeric.float
  end Numeric.float;

  subprogram TakesInt
    features
      inp : in parameter Numeric;
  end TakesInt;

  subprogram TakesFloat
    features
      inp : in parameter Numeric.float;
  end TakesFloat;

  thread t
  end t;

  thread implementation t.impl
    subcomponents
      var1 : data Numeric.integer;
      var2 : data Numeric.float;
      var3 : data Numeric.float;
      sp1 : subprogram TakesInt;
      sp2 : subprogram TakesFloat;
    calls
      seq1 : {
        callseq11 : subprogram TakesInt;
        callseq12 : subprogram TakesFloat;
      };
      seq2 : {
        callseq21 : subprogram TakesInt;
        callseq22 : subprogram TakesFloat;
      };
    connections
      parameter var1 -> callseq11.inp;
      parameter var2 -> callseq11.inp;
      parameter var1 -> callseq21.inp;
      parameter var3 -> callseq22.inp;
  end t.impl;
end C1;

Figure G.22. Example for C1.

package C10
public
  subprogram nested
  end nested;

  subprogram caller
  end caller;

  subprogram outer_caller
  end outer_caller;

  subprogram implementation caller.impl
    calls
      seq1 : {
        call1 : subprogram nested;
      };
  end caller.impl;

  subprogram implementation outer_caller.impl
    calls
      seq1 : {
        call1 : subprogram caller.impl;
      };
  end outer_caller.impl;
end C10;

Figure G.23. Example for C10.

package C11
public
  data Integer
  end Integer;

  subprogram fun1
    features
      inp : in parameter Integer;
      outp : out parameter Integer;
  end fun1;

  subprogram takesFuncPointer
    features
      calledFun : requires subprogram access fun1;
      inp : in parameter FunctionPointer.impl;
      ret : out parameter Integer;
  end takesFuncPointer;

  subprogram implementation takesFuncPointer.impl
    subcomponents
      arg1 : data Integer;
    calls
      seq1 : {
        call1 : subprogram fun1;
      };
    connections
      parameter arg1 -> call1.inp;
      parameter call1.outp -> arg1;
  end takesFuncPointer.impl;

  data FunctionPointer
    features
      pointsTo : provides subprogram access fun1;
  end FunctionPointer;

  data implementation FunctionPointer.impl
    subcomponents
      funcPayLoad : subprogram fun1;
    connections
      pointer : subprogram access funcPayLoad -> pointsTo;
  end FunctionPointer.impl;

  thread caller
  end caller;

  thread implementation caller.impl
    subcomponents
      funP : data FunctionPointer.impl;
      retVal : data Integer;
    calls
      seq1 : {
        call1 : subprogram takesFuncPointer;
      };
    connections
      parameter funP -> call1.inp;
      parameter call1.ret -> retVal;
  end caller.impl;
end C11;

Figure G.24. Example for C11.

package C1flows
public
  thread fun1
    features
      inp : in data port;
      outp : out data port;
    flows
      fpath1 : flow path inp -> outp;
  end fun1;

  thread fun2
    features
      inp : in data port;
      outp : out data port;
    flows
      fpath1 : flow path inp -> outp;
  end fun2;

  thread environ
    features
      c : out data port;
      r : in data port;
    flows
      call : flow source c;
      return : flow sink r;
  end environ;

  process main
  end main;

  process implementation main.impl
    subcomponents
      f1 : thread fun1;
      f2 : thread fun2;
      env : thread environ;
    connections
      CallCon1 : port env.c -> f1.inp;
      CallCon2 : port env.c -> f2.inp;
      CallCon3 : port f1.outp -> env.r;
      CallCon4 : port f2.outp -> env.r;
      CallCon5 : port f2.outp -> f1.inp;
    flows
      Callseq1 : end to end flow env.call -> CallCon1 -> f1.fpath1
                 -> CallCon3 -> env.return;
      Callseq2 : end to end flow env.call -> CallCon2 -> f2.fpath1
                 -> CallCon5 -> f1.fpath1 -> CallCon3 -> env.return;
  end main.impl;
end C1flows;

Figure G.25. Example for C1flows.

package C2
public
  with LoopBinds;

  subprogram loopbody
  end loopbody;

  thread forLoop
    properties
      LoopBinds::LoopType => FOR;
      LoopBinds::Iterations => 20;
  end forLoop;

  thread implementation forLoop.impl
    calls
      callseq1 : {
        Iter : subprogram loopbody;
      };
  end forLoop.impl;
end C2;

Figure G.26. Example for C2.

package C5
public
  data Integer
  end Integer;

  subprogram fun1
    features
      inp : in parameter Integer;
      outp : out parameter Integer;
  end fun1;

  subprogram caller
  end caller;

  subprogram implementation caller.impl
    subcomponents
      toFun : data Integer;
      retValue : data Integer;
    calls
      seq1 : {
        call1 : subprogram fun1;
      };
    connections
      parameter toFun -> call1.inp;
      parameter call1.outp -> retValue;
  end caller.impl;
end C5;

Figure G.27. Example for C5.

package C6
public
  data Integer
  end Integer;

  subprogram fun1
    features
      inp : requires data access Integer;
      outp : out parameter Integer;
  end fun1;

  thread caller
  end caller;

  thread implementation caller.impl
    subcomponents
      toFun : data Integer;
      retValue : data Integer;
    calls
      seq1 : {
        call1 : subprogram fun1;
      };
    connections
      data access toFun -> call1.inp;
      parameter call1.outp -> retValue;
  end caller.impl;
end C6;

Figure G.28. Example for C6.

package C7
public
  data Integer
  end Integer;

  subprogram fun1
    features
      inp : in parameter Integer;
      outp : out parameter Integer;
  end fun1;

  subprogram takesFuncPointer
    features
      inp : in parameter FunctionPointer.impl;
      ret : out parameter Integer;
  end takesFuncPointer;

  data FunctionPointer
    features
      pointsTo : provides subprogram access fun1;
  end FunctionPointer;

  data implementation FunctionPointer.impl
    subcomponents
      funcPayLoad : subprogram fun1;
    connections
      pointer : subprogram access funcPayLoad -> pointsTo;
  end FunctionPointer.impl;

  thread caller
  end caller;

  thread implementation caller.impl
    subcomponents
      funP : data FunctionPointer.impl;
      retVal : data Integer;
    calls
      seq1 : {
        call1 : subprogram takesFuncPointer;
      };
    connections
      parameter funP -> call1.inp;
      parameter call1.ret -> retVal;
  end caller.impl;
end C7;

Figure G.29. Example for C7.

package D4
public
  with a_PropertySet;

  thread T1
    properties
      a_propertySet::Iterations => 2;
  end T1;
end D4;

Figure G.30. Example for D4.

package D4_2
public
  with D4;

  process aProcess
  end aProcess;

  process implementation aProcess.impl
    subcomponents
      T1imp : thread D4::T1;
  end aProcess.impl;
end D4_2;

Figure G.31. Example for D4_2.

package E3
public
  system sys
    features
      err_port : out data port;
    annex Error_Model {**
      error propagations
        use types ErrorTypes;
        err_port : out propagation [BadValue];
    **};
  end sys;
end E3;

Figure G.32. Example for E3.

package E6
public
  with base_types;

  system sys1
    features
      inp : in data port Base_Types::Integer;
      outp : out data port Base_Types::Float;
    annex agree {**
      assume "Input quantity exists" : inp > 0;
      guarantee "Output value \"negative\"" : outp < 0.0;
    **};
  end sys1;
end E6;

Figure G.33. Example for E6.
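For comparison, the assume/guarantee pair in the AGREE annex above can be mirrored in C with run-time assertions. This is only a hedged sketch: the function name `sys1_step` and its body (negating the input) are assumptions made for illustration, since the AADL example specifies the contract but not the computation.

```c
#include <assert.h>

/* Hypothetical C counterpart of sys1; the computation itself is assumed. */
float sys1_step(int inp)
{
    float outp;

    assert(inp > 0);             /* assume: "Input quantity exists" */
    outp = -1.0f * (float)inp;   /* placeholder behaviour */
    assert(outp < 0.0f);         /* guarantee: output value is negative */
    return outp;
}
```

Unlike the AGREE contract, which is checked statically against the model, the assertions here only check the executions that are actually run.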

property set FileTypes is
  fType : enumeration (DIR, FILE, SYM_LINK) applies to (data);
end FileTypes;

Figure G.34. Example for FileTypes.

property set function_constants is
  Integer : type aadlinteger;
  Min_Value : constant function_constants::Integer => 1;
end function_constants;

Figure G.35. Example for function_constants.

-- Based on example by yoogx at https://github.com/OpenAADL/AADLib/blob/master/examples/isr/isr.aadl
package Interrupts
public
  data Integer
  end Integer;

  subprogram fun1
    features
      reg : requires data access Integer;
  end fun1;

  device RTC
    features
      isrOut : out event port;
    properties
      Dispatch_Protocol => Periodic;
      Period => 10ms;
  end RTC;

  thread ISR
    features
      clk_event : in event port;
      isr_call : out event port;
    properties
      Dispatch_Protocol => aperiodic;
  end ISR;

  thread ISR_Handler
    features
      isr_triggered : in event port;
  end ISR_Handler;

  thread implementation ISR_Handler.impl
    subcomponents
      sharedReg : data Integer;
    calls
      seq1 : {
        call1 : subprogram fun1;
      };
    connections
      data access sharedReg -> call1.reg;
  end ISR_Handler.impl;

  process Main
    features
      ISR_call : in event port;
  end Main;

  process implementation Main.impl
    subcomponents
      worker : thread ISR_Handler.impl;
    connections
      C1 : port ISR_call -> worker.isr_triggered;
  end Main.impl;

  system Environment
  end Environment;

  system implementation Environment.impl
    subcomponents
      rtc_clock : device RTC;
      main : process Main.impl;
    connections
      C1 : port rtc_clock.isrOut -> main.ISR_call;
  end Environment.impl;
end Interrupts;

Figure G.36. Example for Interrupts.

package Test
public
  subprogram A
  end A;
end Test;

Figure G.37. Example for Test.

package Test2
public
  with Test3;
end Test2;

Figure G.38. Example for Test2.

package Test3
public
  with Test;
end Test3;

Figure G.39. Example for Test3.

package C13
public
  with Macro;

  subprogram loopbody
    properties
      Macro::type => MACRO;
  end loopbody;
end C13;

Figure G.40. Example for C13.

property set a_propertySet is
  Iterations : aadlinteger applies to (thread);
end a_propertySet;

Figure G.41. Property set for a_propertySet.

property set ExternalLang is
  language : enumeration (ASM, C, Ada, CSHARP) applies to (subprogram);
end ExternalLang;

Figure G.42. Property set for ExternalLang.

property set LoopBinds is
  LoopType : enumeration (FOR, WHILE, DOWHILE) applies to (thread);
  Iterations : aadlinteger applies to (thread);
end LoopBinds;

Figure G.43. Property set for LoopBinds.

property set Macro is
  type : enumeration (MACRO, FUNCTION) applies to (subprogram);
end Macro;

Figure G.44. Property set for Macro.

property set TypedIntegers is
  DataValue : aadlinteger applies to (data);
end TypedIntegers;

Figure G.45. Property set for TypedIntegers.

Appendix H

Promela Examples

init {
    int a;
    byte b;
    a = 5;
}

Figure H.1. Example for A1.

init {
    int a;
    a = 5;
    d_step {
        int b;
        b = 3;
        printf("%d\n", b);
    }
    printf("%d\n", a);
}

Figure H.2. Example for A10.

#define A 100
mtype = {PASS, FAIL}

init {
    int a;
    mtype b;
    a = A;
    b = PASS;
}

Figure H.3. Example for A11.

init {
    int a[10];
    a[0] = 5;
}

Figure H.4. Example for A12.

init {
    int a;
    short b;
    byte c;
    a = 5;
    b = a;
    c = a;
}

Figure H.5. Example for A15.

chan StaticVariable = [1] of { int };

proctype Persistent() {
    int a;
    StaticVariable ? a;
    printf("%d\n", a);
    a = a + 1;
    StaticVariable ! a;
}

init {
    StaticVariable ! 5;
    run Persistent();
    run Persistent();
    run Persistent();
    run Persistent();
    run Persistent();
}

Figure H.6. Example for A16.
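The channel-based model above appears to stand in for a C static local variable: the single-slot channel keeps a value alive between activations of Persistent. A minimal C sketch of that behaviour, assuming this reading of A16, is shown here; the function name `persistent` is hypothetical and the value is returned rather than printed so that it can be checked.

```c
#include <assert.h>

/* Returns the value held before the call, then increments it. */
int persistent(void)
{
    static int a = 5;   /* initialised once, keeps its value between calls */
    int seen = a;

    a = a + 1;
    return seen;
}
```

Successive calls return 5, 6, 7 and so on, which corresponds to the values passed through the StaticVariable channel in the Promela model.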

typedef Struct1 {
    byte a;
    byte b;
}

typedef Struct2 {
    Struct1 a;
    Struct1 b;
}

init {
    Struct2 a;
    Struct1 b;
    byte c = 5;
    a.a = b;
    a.a.a = c;
}

Figure H.7. Example for A20.

mtype = {OK, NOT_OK, DEFERRED};

init {
    mtype status = DEFERRED;
}

Figure H.8. Example for A21.

#ifndef A22
#define A22

proctype fun1(int a) {
    printf("%d\n", a);
}

init {
    run fun1(5);
    run fun1(5+5);
}

#endif

Figure H.9. Example for A22.

#ifndef A22_mod
#define A22_mod

proctype fun1(int a) {
    printf("%d\n", a);
}

#endif

Figure H.10. Example for A22_mod.

typedef DataContainer {
    int a;
    byte b;
};

init {
    DataContainer a;
    a.b = 5;
}

Figure H.11. Example for A5.

int a;

init {
    a = 5;
}

Figure H.12. Example for A9.

int retF1;
chan retF2 = [1] of { int };

proctype F1(int a) {
    retF1 = a + 5;
}

proctype F2(int a) {
    retF2 ! a + 5;
}

init {
    run F1(5);
    run F2(5);
    printf("F1 returned %d\n", retF1);
    int tmp;
    retF2 ? tmp;
    printf("F2 returned %d\n", tmp);
}

Figure H.13. Example for B2.

proctype F1(int a) {
    printf("%d\n", a);
}

init {
    int a = 5;
    run F1(a);
}

Figure H.14. Example for B3.

int glob;
chan q = [0] of { bool };

proctype F1(int a) {
    glob = a;
    q ! 0;
}

init {
    run F1(5);
    q ? 0;
    printf("%d\n", glob);
}

Figure H.15. Example for B9.

init {
    int counter = 0;
    do
    :: skip ->
        if
        :: (counter == 5) -> goto end;
        :: (counter < 5) -> printf("Less than 5");
        :: (counter < 6) -> printf("Less than 6");
            printf("%d\n", counter);
            counter = counter + 1;
        fi
    od
    end:
    skip;
}

Figure H.16. Example for C1.

proctype F1() {
    printf("Inner\n");
}

proctype F2() {
    printf("Outer\n");
    run F1();
}

init {
    printf("Init\n");
    run F2();
}

Figure H.17. Example for C10.

proctype F1(int a) {
    printf("a = %d\n", a);
}

init {
    byte b = 128;
    run F1(b);
}

Figure H.18. Example for C12.

#define NEGATE(x) (-x)

init {
    int a = 5;
    a = NEGATE(a);
    printf("a = %d\n", a);
}

Figure H.19. Example for C13.

init {
    int a, b;
    if
    :: skip -> a = 1;
    :: skip -> a = 2;
    fi
    if
    :: (a > 1) -> b = 2;
    :: else -> b = 1;
    fi

    printf("b = %d\n", b);
}

Figure H.20. Example for C14.

init {
    int i = 0;
    do
    :: (i < 5) ->
        printf("Still in loop!\n");
        i = i + 1;
    :: else -> break;
    od
}

Figure H.21. Example for C2.

init {
    int i = 0;
    loophead:
    printf("Still in loop!");
    i = i + 1;
    if
    :: (i < 5) -> goto loophead;
    :: else -> skip;
    fi
}

Figure H.22. Example for C4.

init {
    int a;
    /* Assign a an arbitrary value */
    if
    :: skip -> a = 5;
    :: skip -> a = 6;
    :: skip -> a = 7;
    :: skip -> a = 8;
    fi

    if
    :: (a == 5) -> printf("a = 5\n");
    :: (a == 6) -> printf("a = 6\n");
    :: (a == 7) -> printf("a = 7\n");
    :: (a == 8) -> printf("a = 8\n");
    :: else -> printf("a has a bad value");
    fi
}

Figure H.23. Example for C8.

#include "A22_mod.pml"
#ifndef D3
#define D3

init {
    int a = 5;
    run fun1(a);
}

#endif

Figure H.24. Example for D3.

#ifndef D4_1
#define D4_1
#include "A22_mod.pml"

proctype fun2(int a) {
    printf("Step 1\n");
    run fun1(a);
}

#endif

Figure H.25. Example for D4_1.

#ifndef D4_2
#define D4_2
#include "D4_1.pml"

init {
    run fun2(5);
    run fun1(10);
}

#endif

Figure H.26. Example for D4_2.

#define function proctype
#define exec run
#define main init

function fun1(int a) {
    printf("a = %d\n", a);
}

main {
    exec fun1(15);
}

Figure H.27. Example for D7.

init {
    int a = 1, b = 2;
    int res;
    res = a << b;
    res = a - b;
}

Figure H.28. Example for E1.

Figure H.29. Example for E6.

Figure H.30. Example for E6_2.

Appendix I

Lustre Examples

node a1 (a : int) returns (ret : int);
var b, c : int;
    d : bool;
let
    b = a + 3;
    c = b / 2;
    d = b > 7;
    ret = if d then 5 else 4;
tel

Figure I.1. Example for A1.

-- Saves the value of a to the registry if set is true, else reads the
-- previously saved value of the registry.
-- The default value of the registry is determined by the constant variable def.

const def : int = 5;

node GlobalVar1 (a : int; set : bool) returns (ret : int);
let
    ret = def -> if set then a else pre(ret);
tel

node Wrong (a : int) returns (ret : int);
let
    ret = 5 + a;
tel

Figure I.2. Example for A11.

node arraySub (const siz : int; arr1, arr2 : int^siz) returns (res : int^siz);
let
    res = arr1 - arr2;
tel

Figure I.3. Example for A12.

package AddLibFuns
provides
    node Add (inp1, inp2 : int) returns (res : int);
body
    node Add (inp1, inp2 : int) returns (res : int);
    let
        res = inp1 + inp2;
    tel
end

Figure I.4. Example for A14_1.

package Main
uses AddLibFuns
provides
    node Fib () returns (ret : int);
body
    node Fib () returns (ret : int);
    let
        ret = 1 -> pre(AddLibFuns:Add(ret, (0 -> pre ret)));
    tel
end

Figure I.5. Example for A14_2.

type DataContainer = struct {
    member1 : real = 0.;
    member2 : int = 5
};

type DataContainer2 = struct {
    member1 : DataContainer;
    member2 : int;
};

Figure I.6. Example for A20.

type direction = enum { left, right, up, down };

node opposite (inp : direction) returns (ret : direction);
let
    ret = if inp = up then down
          else if inp = down then up
          else if inp = right then left
          else right;
tel

Figure I.7. Example for A21.

node counter (clk, reset : bool) returns (ret : int);
let
    ret = 0 -> if reset then 0
               else if clk then pre(ret) + 1
               else pre(ret);
tel

node counterMul (coef : int) returns (ret : int);
let
    ret = counter(true, false) * coef;
tel

node main () returns (mainRet : int);
let
    mainRet = counterMul(5);
tel

Figure I.8. Example for A22.
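To relate the `->` and `pre` operators of the counter node to C, the node can be read as a step function that carries its previous output in an explicit state record. The sketch below is an illustration only: `CounterState`, `counter_init` and `counter_step` are hypothetical names, and `<stdbool.h>` is C99, used here for readability rather than MISRA-C:1998 conformance.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool first;    /* true until the first step has run (left side of ->) */
    int  pre_ret;  /* value of ret in the previous step, i.e. pre(ret) */
} CounterState;

void counter_init(CounterState *s)
{
    s->first = true;
    s->pre_ret = 0;
}

/* One synchronous step of the counter node. */
int counter_step(CounterState *s, bool clk, bool reset)
{
    int ret;

    if (s->first) {
        ret = 0;                 /* initial value given by "0 ->" */
        s->first = false;
    } else if (reset) {
        ret = 0;
    } else if (clk) {
        ret = s->pre_ret + 1;
    } else {
        ret = s->pre_ret;
    }
    s->pre_ret = ret;            /* becomes pre(ret) in the next step */
    return ret;
}
```

Each call to `counter_step` corresponds to one tick of the Lustre base clock, and the state record plays the role that the `pre` operator plays implicitly in the node.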

node a4 (a : bool; b : int) returns (ret : int);
let
    ret = b when a;
tel

Figure I.9. Example for A4.

type DataContainer = struct {
    member1 : real;
    member2 : int;
};

Figure I.10. Example for A5.

-- Saves the value of a to the registry if set is true, else reads the
-- previously saved value of the registry.
node GlobalVar1 (a : int; set : bool) returns (ret : int);
let
    ret = 0 -> if set then a else pre(ret);
tel

Figure I.11. Example for A8.

node incr (inp : int) returns (ret, ret2 : int);
let
    ret2 = inp;
    ret = inp + 1;
tel

node range (start, size : int) returns (ret : int^size);
let
    ret = fill<>(start);
tel

Figure I.12. Example for C2.

include "A22.lus"

node clock () returns (ret : bool);
let
    ret = true -> not pre(ret);
tel

node evenCounter () returns (ret : int);
var fastClk : int;
let
    fastClk = counter(true, false);
    ret = if clock() then fastClk else 0;
    -- ret = fastClk when clock();
tel

Figure I.13. Example for D3.

include "D3.lus"

node dualCounters () returns (ret1, ret2 : int);
var fastClk : int;
let
    fastClk = counter(true, false);
    ret1 = if clock() then fastClk else 0;
    ret2 = if not clock() then fastClk else 0;
tel

Figure I.14. Example for D4.

package AddLibFuns2
provides
    node Add (inp1, inp2 : int) returns (res : int);
body
    node Add (inp1, inp2 : int) returns (res : int);
    let
        res = inp1 + inp2;
    tel

    node Sub (inp1, inp2 : int) returns (res : int);
    let
        res = inp1 - inp2;
    tel
end

Figure I.15. Example for D6.

node counter (clk, reset : bool) returns (ret : int);
let
    ret = 1 -> if reset then 0
               else if clk then pre(ret) + 1
               else pre(ret);
tel

node MainTestAid () returns (req1, req2 : bool);
var res : int;
let
    res = counter(true, false);
    req1 = res > 0 and res < 30;
    req2 = res > 30;
tel

node MainTest () returns (res : bool);
var req1, req2 : bool;
let
    (req1, req2) = MainTestAid();
    res = not (req1 and req2);
tel

Figure I.16. Example for E6.

node H (inp : bool) returns (outp : bool);
let
    outp = inp -> (inp and pre(outp));
tel

Figure I.17. Example for H.

node Fib () returns (ret, ret2 : int);
let
    ret = 1 -> pre(ret + (0 -> pre ret));
    ret2 = -ret;
tel

Figure I.18. Example for Test.

Appendix J

SysML Examples

Figure J.1. Example for A1.


Figure J.2. Example for A11.

Figure J.3. Example for A12.

Figure J.4. Example for A13.

Figure J.5. Example for A15.

Figure J.6. Example for A16.

Figure J.7. Example for A2.


Figure J.8. Example for A20.

Figure J.9. Example for A21.

Figure J.10. Example for A3.

Figure J.11. Example for A4.

Figure J.12. Example for A5.

Figure J.13. Example for A7.

Figure J.14. Example for A9.


Figure J.15. Example for B2.

Figure J.16. Example for B4.

Figure J.17. Example for B8_1.

Figure J.18. Example for B8_2.


Figure J.19. Example for B9.

Figure J.20. Example for C1.

Figure J.21. Example for C10.

Figure J.22. Example for C11.


Figure J.23. Example for C12.

Figure J.24. Example for C2.

Figure J.25. Example for C7_1.

Figure J.26. Example for C7_2.

Figure J.27. Example for D4.


Figure J.28. Example for E3.

Figure J.29. Example for E6.

Figure J.30. Example for E8.
