Masaryk University, Faculty of Informatics

PhD Thesis

Automating Development with Explicit Model Checking

Author: Petr Bauch
Supervisor: Jiří Barnat


Brno, 2015

Acknowledgements

The idea to combine explicit and symbolic approaches to model checking was bestowed upon me by my thesis supervisor, Jiří Barnat, for which I will remain forever grateful to him. We have discussed the topic at length on numerous occasions, always arriving at a new inspiration for further progress. Each of my earlier doubts was shattered and every obstacle we found was overcome with his help. My previous supervisor, Luboš Brim, introduced the idea of analysing requirements without models, which gave me my first experience with autonomous scientific work: an invaluable experience. Yet an even greater influence on me was his course on Dijkstra's method. There was the first spark, my falling for formal methods. I am equally grateful to all members of the ParaDiSe and Sybila laboratories, both present and past, for creating an atmosphere which was as friendly as it was inspiring. Finally, I would like to give credit to the students who helped me implement some of the ideas I had – Vojtěch Havel and Jan Tušil – since their coding skills surpass mine, I was able to learn from them as much as they learnt from me.


Summary

In the last decade it has become common practice to formalise software requirements to improve the clarity of users' expectations. This thesis builds on the fact that functional requirements can be expressed in temporal logic, and we propose new means of verifying both the requirements themselves and the compliance with those requirements throughout the development process. The goal is the automation of some of the verification processes currently employed in software development; the tool is automata-based, explicit-state model checking.

With respect to requirements, we propose new techniques that automatically detect flaws and suggest improvements of given requirements. Specifically, we describe and experimentally evaluate approaches to consistency and redundancy checking that identify all inconsistencies and pinpoint their exact source (the smallest inconsistent set). We further report on the experience obtained from employing the consistency and redundancy checking in an industrial environment. To complete the sanity checking, we also describe a semi-automatic completeness evaluation that can assess the coverage of user requirements and suggest missing properties the user might have wanted to formulate. The usefulness of our completeness evaluation is demonstrated in a case study of an aeroplane control system.

With respect to compliance with requirements, we propose extending explicit model checking to scale to more realistic programs. Automatic verification of programs and computer systems with data non-determinism (e.g. reading from user input) represents a significant and well-motivated challenge. The case of Simulink diagrams is especially difficult, because there the inputs are read iteratively and the number of input variables is in theory unbounded. The case of parallel programs is difficult in a different way, because there the control flow also non-trivially complicates the verification process. We apply the techniques of explicit-state model checking to account for the control aspects of verified programs and use set-based reduction of the data flow, thus handling the two sources of non-determinism separately. We evaluate a number of representations of sets for scalability with respect to the range of input variables. Explicit (enumerating) sets are very fast for small ranges but fail to scale. Symbolic sets, represented as first-order formulae in the bit-vector theory and compared using satisfiability modulo theories solvers, scale well to arbitrary (though still bounded) ranges of input variables. Binary decision diagrams dominate in both space and time complexity, but only if we limit the scope to linear arithmetic and a relatively small number of variables.

We build the theory of set-based reduction using first-order formulae in the bit-vector theory to encode the sets of variable evaluations representing program data. These representations are tested for emptiness and equality (state matching) during the verification, and we harness modern satisfiability modulo theories solvers to implement these tests. We design two methods of implementing the state matching, one using quantifiers and one which is quantifier-free, and we provide both an analytical and an experimental comparison. Further experiments evaluate the efficiency of the set-based reduction, showing that the classical, explicit approach fails to scale with the size of data domains.

Finally, we report on the development of a generic framework for automatic verification of linear temporal logic specifications for programs in LLVM bitcode. To evaluate the framework, we compare our method with state-of-the-art tools on a set of unmodified C programs.

Abstract

This thesis builds on the fact that functional requirements can be expressed in temporal logic, and we propose new means of verifying both the requirements themselves and the compliance with those requirements throughout the development process. The goal is the automation of some of the verification processes currently employed in software development; the tool is automata-based, explicit-state model checking. With respect to requirements, we propose new techniques that automatically detect flaws and suggest improvements of given requirements. Specifically, we describe and experimentally evaluate approaches to consistency and redundancy checking that identify all inconsistencies and pinpoint their exact source (the smallest inconsistent set). With respect to compliance with requirements, we propose extending explicit model checking to scale to more realistic programs. We apply the techniques of explicit-state model checking to account for the control aspects of verified programs and use set-based reduction of the data flow, thus handling the two sources of non-determinism separately. We build the theory of set-based reduction using first-order formulae in the bit-vector theory to encode the sets of variable evaluations representing program data. These representations are tested for emptiness and equality (state matching) during the verification, and we harness modern satisfiability modulo theories solvers to implement these tests. Our experiments evaluate the efficiency of the set-based reduction, showing that the classical, explicit approach fails to scale with the size of data domains.

Contents

1 Introduction
  1.1 Software Development Cycle
  1.2 Model Checking
  1.3 Sanity Checking
  1.4 Case Study: Simulink Diagrams
  1.5 Contribution
    1.5.1 Model-Free Sanity Checking
    1.5.2 Control Explicit—Data Symbolic Model Checking
    1.5.3 Verification Platform
    1.5.4 Thesis Content
    1.5.5 Author's Other Publications

2 Related Work
  2.1 Control Explicit—Data Symbolic Model Checking
    2.1.1 Symbolic Representations
    2.1.2 Bit-Precise LTL Model Checking of Parallel Programs with Inputs
    2.1.3 State Space Reduction by Control-Flow Aggregation
    2.1.4 Combinations of Explicit and Symbolic Approaches
  2.2 Model-Free Sanity Checking
  2.3 Verification of Simulink Diagrams
  2.4 LTL Verification of Parallel LLVM Programs

3 Preliminaries
  3.1 First-Order Theory of Bit-Vectors
  3.2 Control Flow Graphs
  3.3 LTL Model Checking
    3.3.1 Explicit-State Model Checking
  3.4 Sanity Checking

4 Checking Sanity of Software Requirements
  4.1 Consistency and Redundancy
    4.1.1 Implementation of Consistency Checking
  4.2 Completeness of Requirements
  4.3 Sanity Checking Experiments
    4.3.1 Generated Requirements
    4.3.2 Case Study: Aeroplane Control System
    4.3.3 Industrial Evaluation
  4.4 Closing Remarks
    4.4.1 Alternative LTL to Büchi Translation

5 Control Explicit—Data Symbolic Model Checking
  5.1 Set-Based Reduction
  5.2 Symbolic Representation
  5.3 Storing and Searching
    5.3.1 Linear Storage
    5.3.2 Explicating Variables
    5.3.3 Witness-Based Matching Resolution
  5.4 Model Checking Algorithm
    5.4.1 Counterexamples
    5.4.2 Deadlocks
    5.4.3 Witnesses

6 Verification of Simulink Designs
  6.1 Theory of Simulink Verification
    6.1.1 Verification with Explicit Sets
    6.1.2 SAT-Based Verification
    6.1.3 Hybrid Representation
  6.2 Experimental Evaluation
    6.2.1 Implementation
    6.2.2 Experiments
  6.3 Discussion
    6.3.1 Theoretical Limitations
    6.3.2 State Matching by Comparing Function Images

7 Analysis of Semi-Symbolic Representations
  7.1 Case Study: DVE Models
    7.1.1 DVE Language for Open Systems
    7.1.2 Set-Based Reduction with Explicit Sets
    7.1.3 Set-Based Reduction with Symbolic Sets
    7.1.4 Equivalence-Based versus Image-Based State Matching
  7.2 Case Study: Random Models
    7.2.1 Image versus Equivalence using Linear Search
    7.2.2 Experiments with Witness-Based Matching Resolution

8 Verification Platforms
  8.1 Input Languages
    8.1.1 LLVM
    8.1.2 NTS
  8.2 SymDivine
    8.2.1 Memory Structure
    8.2.2 Algorithms
    8.2.3 Containers
    8.2.4 Experimental Evaluation
  8.3 NTS-Platform
    8.3.1 Standard Partial Order Reduction
    8.3.2 Static Partial Order Reduction
    8.3.3 LLVM to NTS Translation
    8.3.4 Architecture of the NTS-Platform
  8.4 Future of Platforms

9 Conclusion
  9.1 Checking Requirements When It Matters
  9.2 Model Checking Needs Symbolic Data

Bibliography

List of Algorithms

1 Consistency Detection
2 Task Successors
3 Task Supersets
4 Consistency Check
5 Completeness Evaluation
6 Generic Accepting Cycle Detection
7 Database Searching
8 Counterexample Extraction
9 Database Searching with Witness-Based Matching Resolution
10 Witness-Based Matching Resolution
11 Extract Witness
12 Update Witnesses
13 Successor Generation for Simulink
14 SymDivine Reachability
15 SymDivine Nested-DFS
16 SymDivine Successors

List of Figures

1.1 Software Development Life Cycle
1.2 Simulink Diagram of the VoterCore System
1.3 Thesis Contributions
4.1 Consistency Checking Example
4.2 Consistency Checking Results
4.3 Redundancy Checking Results
4.4 Sanity Checking Complexity Progression
6.1 VoterCore Experiments: Purely Explicit vs Explicit Sets 1
6.2 VoterCore Experiments: Purely Explicit vs Explicit Sets 2
6.3 VoterCore Experiments: Purely Explicit vs Explicit Sets 3
6.4 VoterCore Experiments: Explicit Sets vs BV Formulae 1
6.5 VoterCore Experiments: Explicit Sets vs BV Formulae 2
6.6 VoterCore Experiments: Explicit Sets vs BV Formulae 3
6.7 VoterCore Experiments: Hybrid Representation
6.8 Time Progression for Simulink Verification
7.1 DVE Experiments: Purely Explicit vs Explicit Sets
7.2 DVE Experiments: Step-Wise Progression
7.3 DVE Experiments: Parallel Scalability
7.4 Random Model Experiments: Image vs Equiv
7.5 Witness-Based Resolution Experiments
8.1 LLVM Compilation Diagram
8.2 SymDivine Overview
8.3 SymDivine User Overview

List of Tables

4.1 Aeroplane Control System Results
4.2 Industrial Case Study Results
7.1 Work Distribution Protocol
7.2 Random Edge Generation
8.1 SymDivine vs CPAchecker
8.2 SymDivine vs LTL Model Checkers

List of Mathematics

3.1 Definition (Syntax of BV)
3.2 Example (BV Semantics)
3.3 Definition (Quantified BV)
3.4 Definition (Control Flow Graph)
3.5 Example (CFG for Parallel Program)
3.6 Definition (Parallel Composition)
3.7 Definition (Synchronous Composition)
3.8 Definition (Sequential Composition)
3.9 Example (CFG Composition)
3.10 Definition (Explicit CFG)
3.11 Definition (LTL Syntax)
3.12 Definition (LTL Semantics)
3.13 Example (Temporal Operators)
3.14 Example (Büchi Automaton)
3.15 Definition (CFG Acceptance)
3.16 Example (Specification and Program Product)
3.17 Example (Model-Based Redundancy and Coverage)
4.1 Definition (Consistency and Redundancy)
4.2 Example (Consistent and Redundant Requirements)
4.3 Definition (Path-Quantified Consistency and Redundancy)
4.4 Theorem (Locating Inconsistent Subsets)
4.5 Example (Redundancy Witnesses)
4.6 Example (Consistency Checking Algorithm)
4.7 Example (Smallest Inconsistent Subset)
4.8 Definition (Almost-Simple Path)
4.9 Definition (Path Coverage)
4.10 Example (Automata Coverage)
5.1 Definition (Set-Reduced CFG)
5.2 Example (Incorrectness of Subsumption)
5.3 Example (Set-Based Reduction)
5.4 Example (Symbolic Execution)
5.5 Theorem (Correctness of Set-Based Reduction)
5.6 Lemma (Symbolic Path Existence)
5.7 Lemma (Explicit Path Existence)
5.8 Lemma (Fix-Point for BV Relations)
5.9 Definition (Composition of BV Relations)
5.10 Lemma (Symbolic Cycle Existence)
5.11 Lemma (Explicit Cycle Existence)
5.12 Example (Set-Based Reduction Complexity)
5.13 Definition (Set-Symbolic CFG)
5.14 Definition (Image-Based Interpretation)
5.15 Theorem (Trace Equivalence for Image-Based Interpretation)
5.16 Definition (Image-Based State Matching)
5.17 Theorem (Image-Based Matching Correctness)
5.18 Definition (Equivalence-Based State Matching)
5.19 Theorem (Admissibility of Equivalence-Based State Matching)
5.20 Example (Iterative Input Reading)
5.21 Theorem (Image-Based Path Existence)
5.22 Lemma (Single Step Equality)
5.23 Example (Counterexample Extraction)
6.1 Example (CFG for Simulink)
6.2 Example (Simulink Semantics)
6.3 Theorem (Simulink Verification Correctness)
7.1 Example (DVE Model)
8.1 Example (LLVM IR)
8.2 Example (NTS for Producer/Consumer)
8.3 Definition (Transition Application)
8.4 Definition (Deterministic Relation)
8.5 Definition (Enabled Transition)
8.6 Definition (Independence Relation)
8.7 Definition (Invisible Transition)
8.8 Definition (Enabled Transition)
8.9 Definition (Transition Application)
8.10 Definition (Transition Independence)
8.11 Example (LLVM with Pthreads Translated to NTS)

Chapter 1

Introduction

The objective of this thesis is to aid the development of software by employing formal methods. The programs being developed may be executed concurrently, which gives them a non-deterministic semantics. The properties being required describe the development of the program's behaviour in time. Formal methods can aid such development by demonstrating that the required properties hold in every execution. The particular formal method for deciding that a program has given properties is called model checking.

Throughout this thesis we systematically extend the application of model checking to further stages of the software development cycle. All these applications share two common aspects central to our work: verification of temporal properties rather than the simpler safety properties, and an orientation towards solving observed problems rather than theoretically interesting ones. Model checking is used to accelerate the building of confidence in the product of individual development stages.

Ultimately, the motivation for introducing any change into the established development process is money. Formal methods thus should decrease the cost of development, by improving confidence in the correctness and by accelerating the development itself. A balance between the required time and the provided confidence is then steered either way, depending on how safety-critical the result must be. The application of model checking we have designed is intended as unobtrusive: if the developers choose to apply our tools, the modifications to their routine should be minimal. Hence any correctness results provided by our tools should be sufficiently cheap for the development of any but the least critical software.

The following text summarises our experience with building and using such tools. We describe the problems we solve and scrutinise the shortcomings of previous solutions. We establish the theoretical background, prove it correct, and specify its position among the forefront of the field. We implement tools that not only enable solving the problems but also allow for comparison with state-of-the-art tools. And finally, we report the results, both positive and negative, of rigorously constructed experiments as well as the results of real-world applications.

1.1 Software Development Cycle

Without introducing the vast amount of theory established around software development, we will describe those stages of the development cycle that appear crucial to the resulting correctness. Hence, much simplified for the context of this thesis, the software development goes through five stages, as depicted in Figure 1.1.

Figure 1.1: The individual stages of the software development life cycle.

The cycle is indeed a closed loop: new requirements may be produced during maintenance and the whole process is restarted. Each of the stages may also impose changes in any of the earlier stages, thus necessitating a reversal within the process. Reversal of the development process is always expensive, and eliminating or minimising the need to redo a particular development stage is most desirable. It follows that finding bugs in the code during the implementation stage is relatively affordable, whereas finding bugs in the requirements during the implementation stage can be substantially more costly. Each development stage produces results that later stages build upon. Hence the longer a bug remains undetected, the more results have to be reproduced and the more stages have to be reverted, all of which increases the cost of development.

Requirements Stage The requirements stage is concerned with the elicitation and analysis of the properties the software being developed is to have. The importance of a clear specification of the requirements in a contract-based development process is apparent from the necessity of final-product compliance verification. Stakeholders and potential customers are interviewed by requirements engineers during the elicitation to communicate their ideas regarding the desired product. These ideas are then detailed and concretised to form a collection of preliminary requirements. Analysing requirements at this stage is mainly concerned with checking that they truly represent what the stakeholders want.

This stage is particularly problematic with respect to any attempts at a rigorously undertaken development. The elicited requirements are vague, incomplete and, as a collection, often contradictory. Formalising the requirements is only possible if the engineers have adequate skill in using advanced formalisms, and only useful if the stakeholders can be guided to understand these formalisms. Antithetically, a rigorously formulated set of requirements is highly desirable, since all the future stages will define correctness with respect to this set. Restating requirements in a precise, formal way enables the requirements engineers to scrutinise their insight into the problem and allows for a considerably more thorough analysis of the final requirement documents [HJC+08].

To accurately delimit the focus of our work on requirements analysis, we make several assumptions. We are only interested in functional requirements, specifically those specifiable in Linear Temporal Logic (LTL) [MP81]. This is a relatively strong assumption, since other requirements, such as performance requirements, are of great importance [Jai08], as are requirements that specify, for example, continuous-time behaviour [AILS07]. Another assumption is that the requirements are already formalised, i.e. the inputs of our tools are LTL formulae. This is a much weaker assumption, since automated tools exist that translate English requirements into LTL [NF96].
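To illustrate the assumed input format on a hypothetical requirement (our own example, not one taken from the case studies of this thesis), an English sentence and its LTL formalisation might read:

```latex
% "Whenever the alarm is raised, the valve is eventually closed."
\mathbf{G}\,(\mathit{alarm} \rightarrow \mathbf{F}\,\mathit{closed})
```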

Design Stage The design phase is concerned with building a model of the system under development, one that complies with all the collected requirements. A great potential value lies in first building a model before implementing it in code. Modelling languages often allow a much more visual representation of the product, while building or modifying a model is much faster. More importantly, however, modelling allows abstracting details that are not relevant for investigating some aspects of the system. For example, the correct functioning of a data-transmitting protocol may be independent of the content of the data being transmitted. Choosing which aspects can be abstracted is an error-prone process that has not been successfully automated yet.

Once an appropriate abstraction is chosen and the design model is built, its compliance with the requirements should be checked. The greater the guarantees expected of this development stage, the higher the level of formalisation must be. This is not to say that testing a design without formalisation merits no recognition. As with testing in general, it is the fastest way to discover a certain portion of the bugs in both the design and the requirements. But no guarantees are provided as to the correctness of the system.

Since models of computer systems are as expressive as their functional counterparts, i.e. programs, their verification is undecidable [Ric53]. In this most general setting, verification of a program means deciding if for every input it behaves exactly as its specification prescribes. There are restricted cases for which verification is decidable. Our work applies formal methods to models without continuous or infinite components; hence we allow neither modelling real time nor the use of mathematical numbers. Finally, the number of communicating subsystems must be fixed and known in advance.

Implementation Stage The implementation phase offers a large space for rigorous analysis of the final product. Developing safety-critical systems prescribes a number of guidelines to be followed to achieve higher confidence in the result. Following these guidelines, however, tends to alter the established process the programmers are used to. More automated methods, e.g. checking correct memory allocation, are slowly becoming a standard part of the development process. Writing assertions that check certain properties at compile time is already a non-trivial activity, but also widely used. Yet similarly as with testing, these methods, while of considerable value, cannot guarantee correctness. Thus despite the existence of development processes that do guarantee correctness [Dij76], an implemented program which passes all tests may still be erroneous. Consequently, we focus our verification efforts on the following, validation phase.

Validation Stage The validation phase justifies its crucial position within the development cycle mainly by the lack of rigour during the implementation phase. At the beginning of validation, the engineers are equipped with three intermediate products: the requirement document, the design model, and the code. The task is to validate that the code complies with all requirements, i.e. that the code implements its specification: the model. Hence the objective is identical to that of the design stage verification. What is different is that the code is typically orders of magnitude larger: each abstraction introduced in the design is concretised in full in the implementation.

Given that the objective is the same, one would be inclined to employ the same methods as for verifying the model. Yet as we mentioned earlier, the code differs from the model mainly by the lack of abstraction. Any verification tool with the ambition to scale to real-world programs must employ some form of abstraction, i.e. automate the work of a design engineer. As we intend to demonstrate several times throughout this thesis, the method we propose, control explicit—data symbolic model checking, performs such automated abstraction. We apply it to both the design of a system and its executable implementation. Hence similar limitations to those restricting our application of formal methods in the design phase apply here as well. One that is perhaps missing in most design languages is the use of complex data structures. Since our methods require a finite representation of the system, we presently cannot handle dynamic data structures.

Maintenance Stage The maintenance phase completes the development cycle by allowing the introduction of new requirements, modifying old requirements, and by ensuring that adequate changes are performed so that conformance to old requirements is preserved. Hence it would clearly be sufficient to postpone all validation efforts to the remaining phases that follow maintenance. On the other hand, assuming that the system was correct prior to maintenance, an incremental verification of only the modified part would be more cost effective. Alas, such incremental verification lies outside the scope of this work.

1.2 Model Checking

Verification of correctness of computer programs is decidable only in certain restricted cases, when both the program under verification and the verified property satisfy some well-defined criteria. Thus from the theoretical point of view, even deciding termination is impossible for general programs, mathematically represented for example by Turing machines. On the other hand, execution of many real-world programs entails only finitely many states of computation. Even parallel programs with a bounded number of threads that do not use dynamic data structures generate only finite state spaces. It follows that verification of many interesting properties is feasible and the only limitation is the efficiency of handling the size of the state spaces.

In the case of verifying parallel programs that read from user input against properties specified in LTL, the state space explosion stems from three sources. (1) The verification complexity is exponential in the size of the formula being verified. In this work we assume sufficiently small formulae, so that their impact on the complexity is negligible, as is often the case in practice. (2) The program under verification is commonly very large, especially considering that the number of possible thread interleavings is exponential in the number of parallel threads. (3) The number of possible evaluations of internal program variables is much larger still, given that the introduction of every input variable multiplies this number by 2^n, where n is the number of bits of the variable and may thus be 64 or even larger.

This thesis is partly concerned with the above described problem of verifying parallel programs with inputs against LTL properties. We argue why it is reasonable to separate the control (specification formula and thread interleaving) from the data (variable evaluation) as two distinct sources of non-determinism that should be handled by different means. The means used here are parallel explicit-state (Büchi automata-based) model checking and satisfiability (SAT)-based symbolic execution. The proposed verification technique introduces the symbolic representation of data into otherwise unmodified LTL model checking or, equivalently, it promotes symbolic execution to LTL verification by implementing exact state matching.

More precisely, the proposed verification enumerates explicitly all reachable control configurations (combinations of program counters and specification automaton states), while the set of variable evaluations associated with a control state is represented symbolically, as a formula in the first-order theory of bit-vectors (BV). Hence, instead of storing many copies of the same control state that differ in the variable evaluation, our verification represents the states with exactly the same control part and exactly the same set of evaluations as a single multi-state. This set-based state space reduction leads in some cases to exponentially smaller state spaces; we prove the correctness of the reduction and discuss its efficiency.

Also, unlike in classical symbolic model checking, where states are characteristic functions of sets of configurations (one for both control and data), we represent data as images of BV functions. Representing sets of variable evaluations as function images has the potential to be more concise than using a canonical representation of the characteristic function, e.g. more concise than Binary Decision Diagrams (BDDs), which are exponential in the number of variables for some arithmetic functions [Bry91]. Furthermore, evaluating the transition relation is syntactic, a simple substitution, which allows very fast generation of state successors. On the other hand, one of our contributions is the identification of a major limitation entailed by using function images for model checking: deciding state matching (equivalence of two sets) is very expensive.

In the context of model checking, it is crucial to be able to decide when two objects in the state space represent the same state. State matching is used in accepting cycle detection and for deciding termination; both these operations are necessary for the correct functioning of the verification process. In order to utilise the advances of modern SAT and Satisfiability Modulo Theories (SMT) solvers, we have devised an SMT query for state matching: a BV formula satisfiable if and only if two given functions have different images. Yet such a query requires quantifying over the input variables, and thus deciding its satisfiability is, in theory, exponentially harder than quantifier-free satisfiability.
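The following is a minimal sketch of such a query in the Z3 C++ API; the unary functions, the names, and the 32-bit width are purely illustrative and do not reproduce the encoding used by our tools.

```cpp
// Image-based state matching as a quantified bit-vector query (sketch).
#include <z3++.h>
#include <iostream>

int main() {
    z3::context c;
    z3::expr x = c.bv_const("x", 32);   // input of the first state's function
    z3::expr y = c.bv_const("y", 32);   // input of the second state's function

    // Two symbolic data parts, given as images of BV functions; here the
    // functions differ syntactically but their images coincide.
    z3::expr f = x * 2;
    z3::expr g = y + y;

    // The images differ iff some value of f is hit by no value of g, or
    // vice versa; each disjunct hides a quantifier alternation.
    z3::expr f_escapes_g = z3::exists(x, z3::forall(y, f != g));
    z3::expr g_escapes_f = z3::exists(y, z3::forall(x, g != f));

    z3::solver s(c);
    s.add(f_escapes_g || g_escapes_f);
    // unsat: equal images, i.e. the two multi-states match
    std::cout << (s.check() == z3::unsat ? "states match" : "states differ")
              << std::endl;
}
```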
Our experiments with randomly generated models and DVE models (the native language of DIVINE [BBH+13]) demonstrate that exact state matching limits our approach to rather small models of programs. In an attempt to ameliorate the complexity of state matching, we follow two directions: using quantifier-free state matching at the price of precision, and modifying the way the model checker searches the database of already visited states. Consider first the quantifier-free state matching implemented by comparing the observable behaviour of the two functions; let us call two states different if and only if their functions map a single input to two distinct outputs. We prove that this equivalence-based state matching, while not precisely comparing two sets of evaluations, leads to a correct model-checking process which terminates provided that input variables are not read on a cycle.
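For contrast, a sketch of the quantifier-free, equivalence-based check described above, under the same illustrative assumptions as the quantified query:

```cpp
// Equivalence-based state matching: both functions are applied to one
// shared input, so no quantifiers are needed.
#include <z3++.h>
#include <iostream>

int main() {
    z3::context c;
    z3::expr x = c.bv_const("x", 32);   // a single shared input variable
    z3::expr f = x * 2;                 // data part of the first state
    z3::expr g = x + x;                 // data part of the second state

    z3::solver s(c);
    s.add(f != g);   // sat: a witness input separating the two states;
                     // unsat: the functions are equivalent
    std::cout << (s.check() == z3::unsat ? "matched" : "distinct")
              << std::endl;
}
```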

The second improvement decreases the number of SAT calls related to the insertion of a new state into the model checker database. The number of symbolic data parts associated with a single explicit control part can be decreased by explicating some variables, which can be efficient if there is exactly one concrete value in the image. Finally, some states can be distinguished by reusing the satisfying models gained from previous SAT calls (evaluations of input variables that witness the difference between states). We have built a hierarchy atop these witness models which in some cases allows simulating a binary search among the database states.

The proposed set-based state space reduction and the associated control explicit—data symbolic model checking are experimentally evaluated, together with the above described improvements, on DVE models and randomly generated test instances. The experiments compare data representations using explicit sets with those using BV formulae, and precise image-based state matching with equivalence-based state matching; we also evaluate the influence of explicating variables and of the witness model hierarchy.

There are a number of interesting observations to be highlighted. For example, there is a threshold on the size of the input variable domains below which the explicit representation outperforms the symbolic one, due to the high complexity of exact state matching. On the other hand, verification with the symbolic representation scales gracefully with the increase of data domains (32-bit domains caused less than a two-fold slowdown compared to 4-bit domains). Regarding the two state matching techniques, the image-based comparison is two orders of magnitude slower than the quantifier-free equivalence-based comparison in a vast majority of cases. Yet our experiments with DVE models lead to relatively short quantifier-free SMT queries that nevertheless required up to 88 seconds to be resolved by state-of-the-art SMT solvers. Finally, the witness-based matching reduces the number of SMT solver calls exponentially in some cases, yet all unsatisfiable instances remain (these are more expensive than the satisfiable ones) and thus the resulting impact is only a linear, 7-fold speedup.

As a final introductory note and to motivate the state space reductions proposed in this thesis, let us consider model checking of a correct system. To demonstrate correctness, the verification process has to refute the existence of accepting cycles, which, in the explicit-state case, entails enumerating the whole transition system. As we have mentioned earlier, the number of successors of each state stems from the combination of three choices the system as a whole can make. First, deciding whether the specification holds requires distinguishing which of the possible valid behaviours we are presently investigating. Second, each parallel process can be the first to execute and thus at least one new successor is generated for each parallel process. And third, every value of every variable leads to at least one successor.

The number of possible valid behaviours is often reasonably small, since it depends on the complexity of the property under verification. Equally, the number of processes in real programs and their models can be safely assumed to be small. This observation holds regardless of the size of individual processes, i.e. only the number of processes increases the number of successors. On the other hand, the number of evaluations of many variables in a given program state can be equal to the domain of that variable, often exceeding 2^32. This observation lies at the core of the thesis mentioned earlier: that the two sources of non-determinism in parallel programs are of a different nature and should thus be handled by different means.

1.3 Sanity Checking

Later in the development, when the requirements are given and a model is designed, formal verification tools can provide a proof of correctness of the system being developed with respect to the formally written requirements. The model of the system, or even the system itself, can be checked using model checking [BBČR10, CCG+02] or theorem proving [BFG+01] tools. If there are some requirements the system does not meet, the cause has to be found and the development reverted. The longer it takes to discover an error in the development, the more expensive the error is to mend. Consequently, errors made during requirements specification are among the most expensive ones in the whole development.

Model checking is particularly efficient in finding bugs in the design; however, it exhibits some shortcomings when applied to requirements analysis. In particular, if the system satisfies the formula, then the model checking procedure only provides an affirmative answer and does not elaborate on the reason for the satisfaction. It could be the case that, e.g., the specified formula is a tautology and hence is satisfied by any system. To mitigate the situation, a subsidiary approach, sanity checking, was proposed to check redundancy and coverage of requirements [Kup06]. Redundant requirements are such that they are trivially satisfied in the given model. Coverage of requirements measures the amount of those behaviours of the model that are described by the requirements. It follows that the existence of a model is still a prerequisite, which postpones the verification until later phases in the development cycle.

Our primary interest with respect to requirements analysis is to extend the techniques that allow the developers of computer systems to check the sanity of their requirements when it matters most, i.e. during the requirements stage and thus before any model is available. Given that previous definitions of sanity presupposed the existence of a model, we lift three properties of sane requirements to a pre-design phase of development. A consistent set of requirements is one that can be implemented in a single system. The satisfaction of a redundant requirement is inevitable in any system that satisfies the rest of the requirements. Finally, a complete set of requirements extends to every possible behaviour that can be considered sensible in an environment. All three notions will be defined properly in Section 4.

We redefine the notion of sanity checking of requirements written as LTL formulae and describe its implementation and evaluation. The novelty of our approach lies in the idea of liberating sanity checking from the necessity of having a model of the system under development. Our approach to consistency and redundancy checking identifies all inconsistent (or redundant) subsets of the input set of requirements. This considerably simplifies the work of requirements engineers because it allows them to pinpoint all the sources of inconsistencies. As for completeness checking, we propose a new behaviour-based coverage metric. Assuming that the user specifies what behaviour of the system is sensible, our coverage metric calculates what portion of this behaviour is described by the requirements specifying the system itself. The method further suggests new requirements to the user that improve the coverage and thus ensure a more accurate description of the users' expectations.
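As a hypothetical illustration of locating the smallest inconsistent subset (the formulae are our own, not taken from the case studies), consider three requirements over the atomic propositions req and grant:

```latex
\varphi_1 = \mathbf{G}\,(\mathit{req} \rightarrow \mathbf{F}\,\mathit{grant})
  % every request is eventually granted
\varphi_2 = \mathbf{G}\,\neg\mathit{grant}
  % no grant is ever issued
\varphi_3 = \mathbf{F}\,\mathit{req}
  % some request eventually occurs
```

Any two of the formulae are consistent, e.g. a system that never requests satisfies both φ1 and φ2; yet the whole set is not: φ3 forces a request, φ1 then forces a grant, and φ2 forbids it. Reporting the smallest inconsistent subset – here all three formulae – pinpoints the conflict far more precisely than a bare inconsistency verdict.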

1.4 Case Study: Simulink Diagrams

Consistent with the primary motivation behind this thesis, i.e. automating the verification tasks included in software development, we now demonstrate our verification method for post-requirement phases on Simulink diagrams. To restate the problem we solve more accurately: the task is to verify that a program (or a model thereof) satisfies its specification, given as an LTL formula. The method is sufficiently general to allow verification of both modelling and programming languages, as we later demonstrate by verifying LLVM bitcode. Yet our research was guided by the concrete application to Simulink and we choose it for the demonstration because of its widespread use in industry.

Figure 1.2: A partial Simulink diagram for the VoterCore system (two inport nodes, a constant, a delay, and the operator blocks +, min, ∗ and ≥ feeding an outport): the complete diagram was verified during the experimental evaluation (see Section 6). Delay nodes form the state space, inport nodes generate new evaluations with every tick, and outport nodes can be referred to in the specification requirements.

In the following, we assume no previous knowledge of the Simulink modelling language and in fact resort to a different notation than the one prescribed by the language standard. Let a Simulink diagram be an oriented graph, which can be interpreted as an abstract syntax tree, except that it may contain self-referencing cycles; for an example see Figure 1.2. The semantics of such a diagram is that of a one-register machine. The delay nodes are the registers of the machine and the computation evaluates the expressions forming the subtrees of delay and outport nodes. The computation is executed synchronously with each tick of the global clock. Each tick thus differs from any other only in the evaluation of delay nodes and the values read from inports. The expression associated with a node is described by the tree underneath it, e.g. in our diagram the next value of the delay node is computed as the multiplication of the two subtrees of the node labelled by ∗ to the left of the delay node. Translated into a full equation, the value of the delay in the next tick is:

delay = min(constant, inport1 + delay) ∗ inport2

Other than the arithmetic and logical operators, there are four types of nodes: delays, inports, outports, and constants. Since the inport nodes serve as inputs – providing a new, non-deterministic value after every tick – a Simulink diagram generates a transition system of the possible values stored in delays. Such a transition system represents the possible evolution of the diagram, describing its behaviour as it changes in time. Similarly as with parallel and reactive programs, a developer may need to verify that the behaviour of a system complies with the specification. Consider for example a specification requiring that: whenever inport1 is negative in three consecutive ticks, the value of outport will eventually be positive. Such a specification cannot be expressed in terms of safety, i.e. as reachability of bad states, because if the system were incorrect the violating run would be infinite. Consequently, verification procedures that do not allow temporal specification, such as most implementations of symbolic execution [Kin76], static analysis [CC77] or deductive verification [OG76], cannot be used in this case.
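Under the set-based view developed later in this thesis, the delay register after k ticks is an expression over the 2k inputs read so far. The following sketch (Z3 C++ API; the 32-bit width and the constant 100 are illustrative assumptions, not values taken from the VoterCore diagram) shows how one tick of the example equation can be executed symbolically:

```cpp
// One synchronous tick of the example diagram, executed symbolically.
#include <z3++.h>
#include <string>
#include <iostream>

// Returns the delay register after one tick as a bit-vector expression
// over fresh inport variables introduced for this tick.
z3::expr tick(z3::context& c, const z3::expr& delay, unsigned step) {
    z3::expr in1 = c.bv_const(("inport1_" + std::to_string(step)).c_str(), 32);
    z3::expr in2 = c.bv_const(("inport2_" + std::to_string(step)).c_str(), 32);
    z3::expr constant = c.bv_val(100, 32);

    z3::expr sum = in1 + delay;                             // the + block
    z3::expr mn  = z3::ite(constant < sum, constant, sum);  // the min block
    return mn * in2;     // delay = min(constant, inport1 + delay) * inport2
}

int main() {
    z3::context c;
    z3::expr delay = c.bv_val(0, 32);          // initial register value
    for (unsigned step = 0; step < 3; ++step)  // unroll three ticks
        delay = tick(c, delay, step);
    std::cout << delay << std::endl;           // image function after 3 ticks
}
```

Comparing two such expressions during the search is precisely the image-based state matching discussed in Section 1.2.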

Model checking, initiated by [CES86, VW86], allows verification against temporal properties, yet Simulink diagrams and similar systems with input variables pose another obstacle in the form of data-flow non-determinism. In order to answer temporal verification queries automatically, an algorithm must allow evaluation of the Peano arithmetic expressions that form subtrees in the diagram. At the same time the input variables may be evaluated to an arbitrary number (which in practice is bounded, e.g. by the number of bits used to express the value) and each of these evaluations must be considered for the verification to be exhaustive. Standard symbolic model checking approaches are either restricted to only a subset of Peano arithmetic, see [BGP97], or are concerned with Boolean input variables, as demonstrated by [McM92]. Standard explicit approaches to model checking already suffer from the state space explosion caused by the control-flow non-determinism and thus cannot scale to any considerable degree with the allowed ranges of input variables.

Hence the motivation for LTL verification of Simulink diagrams: it has the potential to replace testing in the design phase of software (or systems) development. In this sense, purely explicit representations succeed only in part. In a tool-chain, and preceded by a tool to formalise pattern-based, human language requirements into LTL formulae, a model checker could provide similar results as testing for considerably smaller expenses. Yet the severely limited domains of input variables that standard model checking allows still enforce a test-like approach to correctness validation. Effectively, the developer still has to mechanically test (a smaller number of) instances of the same validation query with different (intervals of) input values. Our work ameliorates the limitations of previous approaches by proposing a model checking technique that scales to arbitrarily large domains of input variables. Through the reduction of certain crucial operations required by the model checking process to SMT queries, we have achieved a method for complete verification (for large but finite domains). Though the experiments were conducted on a relatively small Simulink model, they still convey the most important practical result: appropriately adapted model checking may be used to replace testing of safety-critical units as early as in the modelling phase of software development.
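For concreteness, the informal property above admits the following LTL formalisation (our own reading, with the three consecutive ticks expressed by nested next operators):

```latex
% "Whenever inport1 is negative in three consecutive ticks,
%  the value of outport will eventually be positive."
\mathbf{G}\bigl((\mathit{inport1} < 0
    \wedge \mathbf{X}\,(\mathit{inport1} < 0)
    \wedge \mathbf{X}\mathbf{X}\,(\mathit{inport1} < 0))
    \rightarrow \mathbf{F}\,(\mathit{outport} > 0)\bigr)
```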

1.5 Contribution

There are three pivotal contributions that deserve pointing out. In this chapter we only give a brief summary of the most important results and refer the reader to the relevant chapters that describe each contribution in more detail. First, we have developed model-free sanity checking as a way of verifying requirements before a model of the system is designed. We build the theory, describe the implementation and evaluate the experimental results in Section 4. Second, we have developed the control explicit—data symbolic model checking as a method for verifying parallel programs that non-trivially manipulate data. The theory of this method is established in Section 5 and the properties of a number of symbolic data representations are investigated in Section 7. Our primary application of control explicit—data symbolic model checking was the verification of Simulink diagrams, detailed in Section 6. And finally, our last contribution lies in building verification platforms for analysing real code via the LLVM intermediate language. We demonstrate how other researchers can build their own tools using our platforms by implementing the control explicit—data symbolic model checking and a static partial order reduction over LLVM in Section 8.

1.5.1 Model-Free Sanity Checking

We redefine the notion of sanity checking of requirements written as LTL formulae and describe its implementation and evaluation. The proposed notion liberates sanity checking from the necessity of having a model of the developed system. Sanity checking commonly consists of three parts: consistency checking, redundancy checking, and completeness of requirements. Our approach to consistency and redundancy checking is novel in identifying all inconsistent (or redundant) subsets of the input set of requirements. This considerably simplifies the work of requirements engineers because it pinpoints all sources of inconsistencies. For completeness checking, we propose a new behaviour-based coverage metric. Assuming that the user specifies what behaviour of the system is sensible, our coverage metric calculates what portion of this behaviour is described by the requirements specifying the system itself. The method further suggests new requirements to the user that would improve the coverage and thus ensure a more accurate description of the users' expectations. The efficiency and usability of our approach to sanity checking is verified in an experimental evaluation and a case study. Modifying the work flow so that requirements engineers can explicitly specify the universal (∀) or existential (∃) nature of individual requirements makes it possible to express some of the system properties more naturally. We also evaluate the robustness and sensitivity of our consistency and completeness checking procedures with respect to the use of different LTL to BA translators. Furthermore, we adopt redundancy checking for individual LTL formulae into the framework and add proper references to relevant previous work. Finally, we report on the experience we gathered in cooperation with Honeywell International when applying our methodology on some real-life industrial use cases.

1.5.2 Control Explicit—Data Symbolic Model Checking

Let us summarise our work on bit-precise LTL model checking of parallel programs with inputs. First, we establish the theory of set-based verification, on which all presented applications build. The list of applications contains Simulink diagrams, DVE models, and C/C++ parallel programs. Hence our verification method is scrutinised both theoretically – by investigating its properties on hand-crafted models – and practically – by undertaking experiments with real-world systems.

The Simulink diagram verification also includes the methodology of translating state matching into quantified SMT queries. There, we have first formalised the process of model checking of Simulink diagrams, building on the concept of set-based reduction of the underlying state space. This reduction can significantly decrease the number of states, but does not lead to any reduction in the worst case. Also the representation used for the sets of variable evaluations determines the resulting complexity, which can differ exponentially for particular representations. We have implemented the set-based reduction using explicit sets as one representation and using bit-vector formulae as another representation. Explicit sets are exponential in both the number of variables represented and in their ranges, but allow a simple implementation of state matching, i.e. deciding if two states represent the same evaluations. The representation using formulae is exponentially more succinct, but the state matching is a complex operation that requires deciding satisfiability of quantified bit-vector formulae.

The results of Simulink verification show that (1) in the case of Simulink diagrams the set-based reduction is efficient, leading to exponentially smaller state spaces; (2) the representation using explicit sets allows fast verification but only for small ranges of variables; and (3) the representation using bit-vector formulae scales to arbitrary (bounded) ranges, though the state matching is resource expensive. In terms of practical results, the proposed SAT-based verification is fully automated. Furthermore, a hybrid representation is presented which combines explicit and SAT-based representations. This hybrid representation incorporates both the explicit and the symbolic representations, thus allowing it to exploit hidden limits on input values when these are present, while still coping with systems whose inputs are not restricted in any way.

Further experiments concentrate on the comparison between image-based and equivalence-based state matching and the explicating-variables heuristic. We provide detailed proofs of correctness of set-based verification and of the correctness of both state matching techniques and the relation between them. The witness-based matching resolution heuristic is evaluated on randomly generated models, which are also used to gain deeper insight into the relation between the two state matching techniques. To provide a complete view to the reader, Section 5 applies all the accumulated theory in a comprehensive description of the LTL model checking built on the set-based reduction.

The novelty of the verification methodology proposed here lies in the combination of explicit-state model checking with sets of variable evaluations represented as images of bit-vector functions. We build the theory of set-based reduction using the above combination for verification of systems with asynchronous control flow and regular, arithmetical data flow, e.g. for parallel programs with input variables. The result of incorporating the set-based reduction into an existing explicit-state model checker, DIVINE, effectively allows verification of programs reading from input, which was previously impossible in all non-trivial cases. Translating these results to software development, we have extended the limits of previous complete, bit-precise techniques to the point where the system designer no longer has to bound the domains of input variables. Instead of iteratively testing the model on domains of sizes smaller than 100, the set-based reduction and symbolic representation allow verifying the whole domain of 0 to 2^32 − 1 in a single verification run.

1.5.3 Verification Platform

SymDivine provides the verification community with a generic verification platform: the user is given a function that returns an initial state and a function that generates successors of a given state. The tool allows the separation of the representation of control (stored explicitly as the unique sequence of program locations) and data (stored symbolically as a set of variable evaluations). To the best of our knowledge, SymDivine is the only tool that allows verification of LTL properties of parallel LLVM programs with data non-determinism. The genericity of the framework is two-fold.

• As is common in explicit-state model checkers, a number of verification algorithms can be used, provided the algorithm can be implemented using the initial state and successor generator functions. We demonstrate this contribution on two algorithms: reachability for safety verification and Nested Depth-First Search (DFS) for LTL verification; a sketch of an algorithm written against this interface appears after this list.

• The framework provides an interface for the symbolic data representation, with a few methods that must be implemented, leaving the choice of how data are stored completely to the user. Depending on whether the representation implements hashing (or ordering), a more efficient state space searching algorithm is automatically used. We demonstrate this contribution on four representations of sets: explicitly enumerating all set members, a Binary Decision Diagram-based representation, a Satisfiability Modulo Theories-based representation, and an empty representation which completely ignores data.
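To illustrate the interface (sketched with hypothetical names; SymDivine's actual API differs in detail), a reachability algorithm written against an initial state, a successor generator, and a state matching predicate could look as follows:

```cpp
// A sketch of a verification algorithm over the generic interface
// described above; only an initial state, a successor generator, a state
// matching predicate, and an error predicate are assumed.
#include <functional>
#include <utility>
#include <vector>

template <typename State>
bool reachable_error(
        State initial,
        std::function<std::vector<State>(const State&)> successors,
        std::function<bool(const State&, const State&)> match,
        std::function<bool(const State&)> is_error) {
    std::vector<State> visited;       // database of explored multi-states
    std::vector<State> open;
    open.push_back(std::move(initial));
    while (!open.empty()) {
        State s = std::move(open.back());
        open.pop_back();
        if (is_error(s)) return true;          // counterexample reached
        bool seen = false;
        for (const State& v : visited)         // state matching against the
            if (match(s, v)) { seen = true; break; }  // database (SMT calls)
        if (seen) continue;
        for (State& t : successors(s)) open.push_back(std::move(t));
        visited.push_back(std::move(s));
    }
    return false;                    // state space exhausted, property safe
}
```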

Figure 1.3: Software development stages matched against the individual contributions of this thesis (publications outside the circle, tools inside: Chapter 4 – SEFM'12, FAOC, DIVINE-San, looney; Chapters 5 and 8 – MEMICS'14, TOSEM, SymDivine, NTS-platform; Chapters 6 and 7 – HASE'14, SQJ'14, PDP'14, DIVINE-Sym, DIVINE-Set, DIVINE-Eq).

Finally, we report the results of comparing the control explicit—data symbolic (CEDS) approach to LTL model checking against the currently best other approaches. The NTS-platform further expands our attempt at supplying the community with an accessible way to test their verification methods on real programs. The platform provides a C++ library to represent and manipulate transition systems and a translator from the LLVM IR code and from the NTS modelling language. Furthermore, we have implemented methods to inline function calls and sequentialise parallel programs. To demonstrate the applicability of our platform, we have also implemented a static partial order reduction as a transformation of the internal representation. The result is a reduced sequential control-flow graph with behaviour equivalent to its parallel LLVM counterpart.

1.5.4 Thesis Content

Figure 1.3 matches the following sections of this thesis with the publications (outside the circle) and tools (inside the circle) of the author. The contribution of the author with respect to each publication can be approximated as 100/n %, where n is the number of authors. The papers presented at HASE'14 and MEMICS'14 were awarded the Best Paper award. Here we only list the publications directly related to the content of this thesis; the remaining publications of the author are listed in the next subsection. The tools used in our experiments are freely available at the author's web page: http://anna.fi.muni.cz/~xbauch/code.html.

Accepted Conference Papers

MEMICS'14 P. Bauch, V. Havel, J. Barnat. LTL Model Checking of LLVM Bitcode with Symbolic Data. In Proceedings of the 9th Doctoral Workshop on Mathematical and Engineering Methods in Computer Science, volume 8934 of Lecture Notes in Computer Science, pages 47–59. 2014.

PDP'14 J. Barnat, P. Bauch, V. Havel. Model Checking Parallel Programs with Inputs. In Proceedings of the 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, pages 756–759. 2014.

HASE'14 J. Barnat, P. Bauch, V. Havel. Temporal Verification of Simulink Diagrams. In Proceedings of the 15th IEEE International Symposium on High Assurance Systems Engineering, pages 81–88. 2014.

SEFM'12 J. Barnat, P. Bauch, L. Brim. Checking Sanity of Software Requirements. In Proceedings of the 10th International Conference on Software Engineering and Formal Methods, volume 7504 of Lecture Notes in Computer Science, pages 48–62. 2012.

Accepted Journal Papers

SQJ'14 P. Bauch, V. Havel, J. Barnat. Accelerating Temporal Verification of Simulink Diagrams Using Satisfiability Modulo Theories. In Software Quality Journal, volume 22, pages 1–27, 2014.

Submitted Journal Papers

FAOC J. Barnat, P. Bauch, N. Beneš, L. Brim, J. Beran, T. Kratochvíla. Analysing Sanity of Requirements for Avionics Systems. In the submission process (after first review) to Formal Aspects of Computing.

TOSEM P. Bauch, V. Havel, J. Barnat. Control Explicit—Data Symbolic Model Checking. In the submission process (after first review) to Transactions on Software Engineering and Methodology.

Tools

DIVINE-San Built on top of DIVINE version 2.5.2, DIVINE-San checks that a set of requirements possesses the three properties of sanity: consistency, non-redundancy, and completeness.

looney Accelerates the computation of consistency and non-redundancy checking while adding support for redundancy witnesses.

DIVINE-Sym Built on top of DIVINE version 2.94, DIVINE-Sym verifies models written in the DVE modelling language extended with the possibility to specify some variables as input: their initial values are not defined explicitly but only as a range of possible evaluations.

DIVINE-Set Built on top of DIVINE version 2.5.2, DIVINE-Set verifies Simulink diagrams translated into the CESMI format. The tool supports both the explicit and the symbolic representation of the sets of variable evaluations.

DIVINE-Eq Built on top of DIVINE version 2.96, DIVINE-Eq verifies models written in the DVE modelling language, where the evaluation of program variables is represented via bit-vector functions while the control locations are stored explicitly.

SymDivine A standalone framework for the verification of LLVM programs. The tool accepts as input the LLVM code translated from a parallel C++ program that reads from user input. SymDivine houses a number of verification algorithms and provides a tool set for writing new ones. It implements the control explicit—data symbolic model checking using explicit sets, binary decision diagrams, BV formulae, and other representations. (The author would like to acknowledge that a major part of the implementation is to be accredited to Vojtěch Havel.)

NTS-platform A standalone framework for verification of NTS models. The framework consists of a translator from LLVM to NTS, a parser for NTS, a common internal representation for building new tools, and a partial order reduction for NTS models. (The author would like to acknowledge that a major part of the implementation is to be accredited to Jan Tušil.)

1.5.5 Author’s Other Publications

The following are publications the author contributed to during his Master’s studies. Their primary topic is data-parallel graph algorithms with an application in explicit-state model checking. They thus share the theoretical goals of this thesis but not the practical methods to achieve those goals. We only list them for the reader to have a complete picture.

Accepted Conference Papers

PDMC’11 J. Barnat, P. Bauch, L. Brim, M. Češka. Computing Optimal Cycle Mean in Parallel on CUDA. In Proceedings of the 10th International Workshop on Parallel and Distributed Methods in verifiCation, volume 72 of Electronic Proceedings in Theoretical Computer Science, pages 68–83. 2011.

IPDPS’11 J. Barnat, P. Bauch, L. Brim, M. Češka. Computing Strongly Connected Components in Parallel on CUDA. In Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium, pages 544–555, 2011.

ICPADS’10 J. Barnat, P. Bauch, L. Brim, M. Češka. Employing Multiple CUDA Devices to Accelerate LTL Model Checking. In Proceedings of the 16th International Conference on Parallel and Distributed Systems, pages 259–266, 2010.

MEMICS’10 P. Bauch, M. Češka. CUDA Accelerated LTL Model Checking – Revisited. In Proceedings of the 6th Doctoral Workshop on Mathematical and Engineering Methods in Computer Science, pages 11–19, 2010.

Accepted Journal Papers

JPDC’12 J. Barnat, P. Bauch, L. Brim, M. Češka. Designing Fast LTL Model Checking Algorithms for Many-Core GPUs. In Journal of Parallel and Distributed Computing, 72(9):1083–1097, 2012.


Chapter 2

Related Work

When presenting the related directions of research, we follow the same division as the content of this thesis. Following the software development cycle, our work encompasses the requirements stage, the design stage, and the verification stage. Yet arguably the biggest contribution lies with the control explicit—data symbolic model checking. This method is employed in both design verification and code verification and as such deserves a separate treatment with respect to related research efforts. Those readers more closely familiar with the field of program verification might find our selection of related work in Section 2.4 cursory and lacking other important directions. Our opting for a shorter selection of only the most relevant directions was motivated by the specific target of our own verification: parallel C++ programs with input. Hence we have selected those directions that have the potential and ambition to aim for the same target programs.

2.1 Control Explicit—Data Symbolic Model Checking

The results described in this thesis rectify a very particular omission of the verification community. Even though the state space of many real-world programs is finite, the majority of verification efforts concentrates on treating the state spaces as infinite, because the analysis is then often easier. We do not follow this simplification, mainly since it may introduce imprecision by interpreting bit-vector integers as mathematical integers. Let us state precisely the properties of our verification method to substantiate our claims of presenting original results. Control explicit—data symbolic model checking:

• allows for verification of unmodified (1) program designs (Simulink), (2) program code (C/C++ via LLVM),

• handles bounded control-flow non-determinism of parallel programs,

• accepts full LTL as the behaviour specification,

• is bit-precise with respect to the evaluation of program variables,

• is fully automated without reporting false alarms,

• is complete, i.e. the output is effectively a proof of correctness.

At the time of writing this thesis, there is no other method known to the author that is capable of all the above. The following description of related work lists those research directions that aim in part at achieving similar results, while often being considerably more efficient at solving an equally challenging, albeit different, problem. There are three vantage points that appear useful for placing the presented work among the established body of verification methodology. First, based on the problem being solved, this work addresses bit-precise LTL model checking of parallel programs with inputs, where computer variables are interpreted as bit-vectors and not as mathematical integers. Second, based on the achieved results, we have reduced the state space by handling separately the control- and data-flow aspects of parallel programs. And third, based on the distinguishing aspect of our approach, the described verification method combines explicit and symbolic approaches by enumerating the thread interleavings explicitly and by representing variable evaluations as images of bit-vector functions. The related work is analysed from these three perspectives to locate the differences and common aspects with the content of this work. (We have also devoted the first part of this section to shortly describing the symbolic representations most commonly used in verification.)

2.1.1 Symbolic Representations

The symbolic representations employed in formal verification can be divided between canonical and non-canonical. If the object being represented is a Boolean function, then canonicity of the representation means that for two Boolean functions f, f′ : Bⁿ → B such that any n-long vector of Boolean values is mapped to the same value by both f and f′, the representations of f and f′ are exactly the same (syntactically equivalent). The definition of canonicity naturally extends to formulae of FO theories, such as the theory of bit-vectors, which are the objects primarily represented in this work.

Canonical Representations

The most important example of a canonical representation of Boolean functions are the Reduced Ordered Binary Decision Diagrams (ROBDDs) [Bry86], though in the following discussion we simply refer to them as BDDs. Apart from their other uses, BDDs were crucial in the development of symbolic model checking, given the efficient representation of state spaces they provide [McM92]. Yet the pivotal bottleneck of classical BDDs is their size when arithmetic operations are represented. It was shown by Bryant that BDDs grow exponentially for integer multiplication [Bry91], regardless of the variable ordering. The discovery of this limitation led to subsequent modifications and alternative diagrams. Linear functions may be canonically represented by Binary Moment Diagrams (BMDs), which allow efficient representation of digital systems at the word level [BC95], but BMDs cannot represent dividers and some Boolean functions representable by BDDs [BDE97]. Algebraic Decision Diagrams (ADDs are essentially BDDs with leaves that take values from a set of constants) allow representation of modular arithmetic and are smaller than BMDs [RPHS96]. Boolean Expression Diagrams (BEDs extend BDDs with binary Boolean operator vertices) represent any Boolean circuit in linear space; furthermore, quantification and substitution (important in fixed point iterations) are also linear [AH97]. Finally, the Taylor Expansion Diagrams (TEDs) allow constant computation of multiplication and polynomial computation of bounded exponentiation.

Yet the wrap-around behaviour of modular arithmetic cannot be represented (for a comprehensive summary of canonical representations see [CPJ09]).

Non-canonical Representations

The use of non-canonical representations of formulae of FO theories is justified by their potential conciseness, given that all canonical representations grow exponentially for some arithmetic functions [Bry91]. This thesis only mentions the representations using automata [Boi99] or monadic logics [HJJ+95]; while these representations are very effective in some areas, they were shown impractical for sets of variable evaluations [WB00]. Similarly, the use of Presburger arithmetic formulae (for their use in verification, see for example [BW94]) and their combinations with decision diagrams [BGL98] are not detailed, on the grounds that full Peano arithmetic is used in the verified programs. Finally, given that satisfiability of Peano arithmetic formulae is undecidable, most researchers in program verification focused on using the bit-vector theory to represent computation, and designed efficient decision procedures for satisfiability of BV formulae. Early decision procedures for a subset of BV, i.e. for the quantifier-free fragment with bit-level logical operations, composition, and extraction, were based on formula canonisation (using a set of rewriting rules), thus allowing Shostak combination [Sho82] with other theories [CMR97]. The set of allowed operations was extended to include Presburger arithmetic in [BDL98]. The possibility of deciding satisfiability for non-fixed size bit-vectors was investigated in [BS98, Pic03]. A solution based on solving the linear problems using the Müller-Seidl algorithm and the non-linear problems using Newton's p-adic iteration algorithm was implemented in [BM05]. This combination allows for Nelson-Oppen combination without converting the entire problem into a SAT instance (also called flattening). Precise modular arithmetic was first considered in [Wan06], which reduced the satisfiability of modular arithmetic formulae over the finite ring Z_{2^ω} to integer programming. For linear modular arithmetic only a linear number of constraints is needed; for non-linear arithmetic one needs an additional factor of ω (though the constraints are still linear). The lazy approach, i.e. postponing the flattening process and first using structural information such as equalities and arithmetic functions, was followed in [BCF+07]. Deciding satisfiability based on alternately improving under- and over-approximations of the original formula was investigated in [BKO+07]. Finally, [WHdM13] implemented a decision procedure for the quantified bit-vector theory by extending the quantifier-free procedure with model-based quantifier instantiation.

2.1.2 Bit-Precise LTL Model Checking of Parallel Programs with Inputs

All related methods of program verification are instances of static analysis, in that they verify correctness of some aspect of the program (here specified as an LTL formula) without executing the program. We concentrate on three specific instances: model checking, symbolic execution, and abstraction-based program analysis. One outstanding research direction which is related but does not fit the above categories is the study of value-passing transition systems. Defined in [Lin96], these transition systems have edges labelled with guarded assignments and allow the bisimulation of two systems to be reduced to finding the greatest solution of the underlying predicate equation system.

Model Checking

Explicit-state LTL model checking [VW86] does not have any theoretical limitation with respect to parallel programs with inputs other than the state space explosion, since the arithmetic operations are computed explicitly. Given its theoretical basis in ω-regular automata, explicit model checking is local in the sense that only those parts of the system are considered that are relevant to the temporal property being verified. It follows the tableau-based proof process of deciding validity of LTL formulae, where the sequent rules describe what hypotheses must hold for the conclusion to hold (and the hypotheses are often structurally simpler). However, even this limited part of the transition system of parallel programs is often outside the storage capacity of present-day (and arguably even future) computers. Yet most approaches trying to ameliorate the state space explosion, such as partial-order or symmetry reduction and parallelisation (see for example [BBS01]), focus on the control-flow part, ignoring the variable evaluation or assuming it to be fixed. Only rarely was the impact of data-flow on explicit model checking studied [EP02], and the results were unfavourable. The recent attempts at incorporating a systematic treatment of input variables into explicit model checking, culminating in this work, started by introducing a new process into the verified system that generated the possible variable evaluations [BBB+12]. Symbolic CTL model checking [CES86], on the other hand, is global in the sense that the part of the system being represented at some step of the verification may be unreachable from the initial state and thus unnecessary to consider. This seeming redundancy stems from the representation being symbolic: it is more efficient to incorporate some redundancy than to describe what is redundant and that it should be excluded from the representation. Unlike in local model checking, in global model checking one starts from those states that satisfy the simplest (atomic) subformulae of the specification formula and then structurally progresses towards the computation of the set of states that satisfy the whole formula. If this set contains all initial states of the program's transition system, then the program is correct. Similarly as with explicit-state model checking, verifying parallel programs with inputs using symbolic model checking is only limited by the complexity of the task. The classical approach based on BDDs [McM92] suffers from the unmanageable size of BDDs caused by (1) arithmetic operations in the data flow and (2) irregularity of the control flow. Using non-canonical representations, Boolean Expression Diagrams [WBCG00] or propositional logic formulae [Bje99, McM02], to handle the first cause of BDD size increase is limited by the introduction of quantifiers when pre- and post-images are computed during the fixed point search. (Note that our research has met a similar difficulty when discovering accepting cycles.) Bounded symbolic model checking [BCCZ99] continues with using logical formulae instead of BDDs and adopts the locality of the explicit approach, thus improving also on the second cause of size complexity, at the price of being incomplete. Completeness was then introduced in [McM03] and in [dMRS03], using Craig interpolants and k-induction, respectively.
Recently, this approach was improved in [Bra11] by replacing the transition relation unfolding with incrementally generated clauses that approximate the reachability information, thus further accelerating the verification process. Finally, module checking [KV96, Var01] allows verification of systems that interact with an environment that introduces non-deterministic behaviour. The environment can restrict the set of possible choices of some inputs and the verification should be robust with respect to these restrictions; yet such restrictions can only be distinguished by branching time logics, whereas LTL is linear and its validity is thus unaffected by them.

Symbolic Execution

The basic idea of symbolic execution [Kin76, CPDGP01] is to track the execution of a program by building its execution tree. The nodes of such a tree consist of a path condition (a constraint on the input variables that leads the execution to this node) and a symbolic variable evaluation. The symbolic variable evaluation does not interpret the expressions assigned to variables during execution; they are syntactically substituted with the symbolic values of the variables they refer to. This verification approach allows for efficient safety analysis of programs with inputs, since only one execution is necessary to consider all possible executions. An extension of symbolic execution to LTL verification was investigated in [BDKP08] but did not incorporate full LTL, on the grounds of state matching not being decidable for theories with unbounded integers. Without state matching, symbolic execution is not guaranteed to terminate given that the program under verification may contain loops. One possible extension of symbolic execution is the addition of state subsumption [XMSN05, APV06] (effectively subset equality of concrete evaluations), which improves efficiency but not the scope of supported specifications. Furthermore, the original limitation of symbolic execution to verification of sequential programs was lifted in [KPV03] by employing a model checker to generate possible thread interleavings.
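To make the mechanics concrete, the following minimal C++ sketch (with illustrative, hypothetical names; expressions are kept as opaque strings) shows the two ingredients just described: a node of the execution tree holding a path condition and a symbolic evaluation, and the forking of that node at a branch.

    #include <map>
    #include <string>
    #include <vector>

    using Expr = std::string;   // a symbolic expression, kept opaque here

    // A node of the execution tree: the path condition plus the symbolic
    // evaluation of every program variable.
    struct SymState {
        Expr path_condition;
        std::map<std::string, Expr> values;
    };

    // Branching on a condition c forks the state: one successor assumes c,
    // the other assumes its negation. Assignments (not shown) substitute the
    // right-hand side syntactically, exactly as described above.
    std::vector<SymState> branch(const SymState& s, const Expr& c) {
        SymState then_branch = s, else_branch = s;
        then_branch.path_condition = "(" + s.path_condition + ") && (" + c + ")";
        else_branch.path_condition = "(" + s.path_condition + ") && !(" + c + ")";
        return { then_branch, else_branch };
    }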

Abstraction-Based Program Analysis

The verification methods based on abstract interpretation [CC77] do not track the execution of the program under verification at the precision specified in the program. Instead, the precision is approximated to a model that can be more easily manipulated, with each concrete variable value mapped to an abstract object, e.g. an interval, that allows simpler and faster computation. The application of abstract interpretation in static analysis has led to a number of results on real-world, industrial scale programs. Predicate abstraction for temporal properties was described in [PR05], which builds on the idea of encoding temporal properties into fair termination. Later, abstraction-based verification of parallel programs was proposed in [GPR11a]. Finally, the last ingredient for bit-precise verification was an abstract domain forming a lattice over bit-vectors. One possibility, extending the polyhedra abstraction with implicit wrap-around of arithmetic operations, was investigated in [SK07].

2.1.3 State Space Reduction by Control-Flow Aggregation

Cimatti et al. [CRB01] combine the approaches from explicit and symbolic model checking by representing subsets of the set of states as individual BDDs (uncertainty states), relying on the forward, hash-based space traversal used in explicit model checking. Uncertainty states are modified according to the transitions applicable to the states that comprise them. Another approach, investigated in [HGD95, BDG+98], handles the problem of data-flow non-determinism by solving first-order CTL model checking (given that pre- and post-image computation introduces quantifiers, the first-order extension of a temporal logic is sufficiently powerful to represent the problem). From a semantic point of view, their algorithm iteratively approximates the sets of evaluations annotating nodes of a product of two transition systems: one is essentially the control-flow graph, the other connects related subformulae of the input formula. This verification is incomplete for unbounded data domains unless the data and control variables can be clearly separated.

A similar, though still incomplete, approach is to accumulate during verification a path condition that distinguishes between program counter predicates and program variable assertions [GP05]. Finally, [BCZ99] showed how local and global approaches could be combined by aggregating the sets of states satisfying a subformula in the tableau-based verification. Another possible aggregation is to form sets from those states with similar observable events, together building an observation graph. Originally described in [HIK04], this approach was optimised in [BDL07] by constructing powersets using previously formed subsets. The limitation of the original paper to the next-free fragment of LTL was recently lifted in [DLKPTM11].

2.1.4 Combinations of Explicit and Symbolic Approaches

In [BL13] the authors combine explicit-value analysis with various symbolic approaches, such as abstraction, counterexample-guided refinement, and interpolation, to build an abstract reachability graph, extending the work from [BK11]. The explicit analysis only tracks those variables necessary to refute the infeasibility of error paths; variables are added to the precision set iteratively, in further CEGAR iterations. The abstraction maps variables to Z ∪ {⊤, ⊥}, with ⊤ for an unknown value and ⊥ for no satisfiable value, and this abstract variable assignment represents the region of concrete data states for which it is valid. Hence the abstraction clusters together sets of states with the same explicit value of every variable. In [CNR12] a verification method for SystemC designs was implemented by translating them to multi-threaded programs with cooperative scheduling, shared variables, and mutually exclusive thread execution. In this model only one thread is running at a time; when it stops, the scheduler selects the next thread to run. The explicit-scheduler, symbolic-threads combination uses symbolic abstraction for sequential execution of individual threads while the scheduling policy is handled explicitly. A different combination was investigated in [STV05], which represented the system state space symbolically and the property automaton explicitly. Also related is concolic execution, see for example [SMA05], where the standard symbolic execution is, after a number of symbolic steps, supplemented with a single step of concrete execution (on a smaller subset of values). Then the symbolic execution continues, maintaining a much smaller representation. Perhaps the most closely related work is that of [CKS05], where the distinction between data- and control-flow non-determinisms is realised via a SAT-based model checker for concurrent (with a finite number of threads) Boolean programs. Data-flow non-determinism is handled by a combination of symbolic model checking and fixed point detection, and the control-flow non-determinism is handled by symbolic partial-order reduction. The approach is limited to Boolean programs, yet standard programs can be converted to Boolean using predicate abstraction. Our work is precise with respect to the bit-vector semantics of programs since we do not use any abstraction. Finally, [CKS05] verify reachability properties and thus do not require exact state matching. Their approach is consequently much faster, but does not allow checking LTL properties.

2.2 Model-free Sanity Checking

The use of model checking with properties (specified in CTL) derived from real-life avionics software specifications was successfully demonstrated in [CAB+98].

Our work on sanity checking intends to present a preliminary to such a use of a model checking tool, because there the authors presupposed sanity of their formulae. The idea of using coverage as a metric for completeness can be traced back to software testing, where it is possible to use LTL requirements as one of the coverage metrics [WRHM06, RWH07]. Model-based sanity checking was studied thoroughly and using various approaches, but it is intrinsically different from the model-free checking presented in this work. Completeness is measured using metrics based on the state space coverage of the underlying model [CKKV01, CKV01]. Vacuity of temporal formulae was identified as a problem related to model checking and solutions were proposed in [KV03] and in [Kup06], again expecting the existence of a model. Checking consistency (or satisfiability) of temporal formulae is a well understood problem solved using various techniques in many papers (most recently using a SAT-based approach in [RDB+05] or in [RV07], where it was used as a comparison between different LTL to Büchi translation techniques). The classical problem is to decide whether a set of formulae is internally consistent. In this thesis, however, a more elaborate answer is sought: specifically, which of the formulae cause the inconsistency. The approach is then extended to vacuity, which is rarely used in model-free sanity checking. We have adopted the idea of searching for the smallest inconsistent subset of a set of requirements from the SAT community, where finding the minimal unsatisfiable core (a set of propositional clauses) is an important feature of SAT solvers, see e.g. [LMS04]. The authors of [LS08] extend the approach to finding all minimal unsatisfiable subsets, again in the context of propositional logic. Their exclusion of checking those subsets containing known unsatisfiable subsets was extended by our algorithm with a dual heuristic: exclude also subsets with known satisfiable supersets. A problem related to consistency is that of realisability [ALW89]. Realisability generalises consistency by allowing some of the atomic propositions to be controlled by the environment. A set of system requirements is then realisable if there is a strategy which can react to arbitrary environment choices while satisfying the requirements. If no propositions are controlled by the environment then the notion of realisability becomes equivalent to consistency. A number of tools implement realisability, RATSY [BCG+10] for example, but again limit their responses to stating whether the requirements are realisable as a whole. Thus, even though the theory of minimal unrealisable cores had been known for some time [CRST08, KHB09], searching for smallest inconsistent subsets is a feature unique to our tool. We would also like to point out the theory described in [Sch12], which extends the unrealisability cores to search even within individual formulae, providing even finer information on the quality of requirements. Completeness of formal requirements is not as well-established and its definition often differs. The most thorough research in algorithmic evaluation of completeness was conducted in [HL95, Lev00, MTH03]. Authors of those papers use RSML (Requirements State Machine Language) to specify requirements, which they translate (in the last paper) to CTL and to a theorem prover language to discover inconsistencies and detect incompleteness.
Their notion of completeness is based on verifying that for every input value there is a reaction described in the specification. This thesis presents completeness as considering all behaviours described as sensible (and either refuting or requiring them). Finally, a novel semi-formal methodology is proposed in Section 4 that recommends to the user new requirements with the potential to improve completeness of the input specification. Another approach to defining and checking completeness was pursued in [KGG99], in terms of evaluating the extent to which an implementation describes the intended specification. Their evaluation is based on the simulation preorder and on searching for unimplemented transition/state evidence.

In order to create a metric that would drive the engine for generating new requirements, we have built an evaluation algorithm that enumerates the similarity of two paths. Our notion of similarity is to a certain extent based on unimplemented transitions, but we carry the enumeration from two transitions up to the whole automata, thus finally obtaining our completeness metric. The authors of [BB09] elaborate a method for measuring the quality of a set of requirements by detecting the presence of subsets that together require behaviour that could not be detected from the individual requirements. Similarly as in [HL95], the interesting behaviour is described in terms of reactions to input values observed at discrete time instances, yet the properties are limited to those of the form ∀t, At(t) ⟹ Zt(t). Finally, the notion of coverage is well-established in the area of software testing. The similarity between our work and that described for example in [RLS+03, FG03] lies in that both approaches attempt to generate a description of the sets of behaviours that remain uncovered. The dissimilarity lies in that we describe these uncovered sets in terms of software requirements (or LTL formulae), whereas automated testing coverage produces collections of tests, i.e. individual runs of a particular system.

2.3 Verification of Simulink Diagrams

There are three practical research directions that we are aware of describing LTL verification of Simulink diagrams. [MAW+05] translate Simulink models into the synchronous data-flow language Lustre, created by [HCRP91], and then into the native language of the symbolic model checker NuSMV, created by [CCG+02]. NuSMV was then used to verify correctness of the model with respect to a set of specifications (mostly in CTL, but NuSMV can verify full LTL as well) using a BDD-based (Binary Decision Diagrams: see [McM92] for a classical approach) representation of the state space. Although it was not a limitation for the input diagrams they used, BDDs are not readily able to support complex data types with Peano arithmetic, at least not while preserving their efficiency, as we will discuss later in this subsection. Similarly, [MBR06] translate Simulink diagrams for verification using NuSMV, but in this case the translation is direct, without an intermediate language. This work is more relevant because the diagrams used contained integer data types and full Peano arithmetic (NuSMV flattens integer types into Boolean types, but represents the stored values precisely). The authors do not mention explicitly if they used the BDD-based or the SAT-based branch of verification. Unfortunately, we could not compare that approach with the one proposed in this work, since the implementation is not available. Its efficiency can only be assumed from the reported results: when input variables were not bounded explicitly, i.e. allowed values between 0 and 2³², the verification time was almost a week; the bounds for reasonably fast verification had to be relatively small: between 0 and 60. Explicit model checking was also used for verification of Simulink diagrams by [BBB+12], but there the data-flow non-determinism in the form of input variables had an even greater detrimental effect. This is the only comparable implementation presently available and we provide a detailed comparison between this purely explicit approach and the hybrid, control explicit—data symbolic approach described here in Section 6. From a theoretical point of view, the method proposed in Section 6 is effectively equivalent to symbolic execution extended with LTL verification. This combination was attempted before by [BDKP08], but classical symbolic execution does not allow state matching, and thus only a small subset (safety properties) of LTL was supported.

Without state matching, an infinite violating run would cause the tool to diverge, never yielding an answer. On the other hand, it is possible to implement state subsumption (a state represents more heap configurations than another state) in symbolic execution, as shown by [XMSN05], but subsumption is not correct with respect to LTL even though it can significantly reduce the state space (and make it finite). A considerable amount of work pertained to allowing symbolic model checking to use more complex data types than Booleans. The problem is that BDDs grow exponentially in the presence of integer multiplication, as proved by [Bry91]. Other symbolic representations were designed that represent variables on the word level rather than the binary level, such as Binary Moment Diagrams, used by [BC95], or Boolean Expression Diagrams, used by [WBCG00], even though these are no longer canonical. Similarly to our approach, symbolic model checking can also use first-order formulas directly, leading to SAT-based and SMT-based approaches, invented by [BCC+99] and [AMP06], respectively, provided that the procedure is bounded in depth, and consequently incomplete. Various approaches to unbounding have been proposed, most recently the IC3 method of [Bra11], but these are rather theoretical, without implementation, or limited to weaker arithmetics, see e.g. the model checker Kind based on k-induction and created by [HT08]. Module checking, introduced by [KV96] and detailed by [God03], allows verification of open systems. A system is open in the sense that the environment may restrict the non-determinism and the verification has to be robust with respect to arbitrary restrictions. Consequently, the approach to verifying open systems also differs, since only branching time logics can distinguish open from closed systems in the module checking sense. For linear time logics, every path has to satisfy the property and thus open and closed systems collapse into one source of non-determinism. Much closer to our separation between control and data is the work initiated by [Lin96]. Lin's Symbolic Transition Graphs with Assignments represent precisely parallel programs with input variables using a combination of first-order logic and process algebra for communicating systems. As with symbolic execution, the most complicated aspect of this representation is the handling of loops. Lin's solution computes the greatest fixed point of a predicate system representing the first-order term for each loop. Then two transition graphs are bisimilar if the predicate systems representing their loops are equivalent. The set-based reduction of transition systems used in this work effectively computes the fixed points explicitly, represented as sets of variable evaluations. Finally, various combinations of different approaches and representations have been devised and experimented with. When multiple representations were combined, it was mostly to improve on weak aspects of either of the representations; for example, [YWGI06] employed multiple symbolic representations for Boolean and integer variables in combination. Presburger constraints were combined with BDDs by [BGL98] to allow, with some restrictions, the verification of infinite state spaces. Finally, the two approaches to model checking, explicit and symbolic, were combined to improve solely upon control-flow non-determinism. Some improvement was achieved by storing multiple explicit states in a single symbolic state, implemented by [DLKPTM11], or by storing explicitly the property and symbolically the system description, implemented by [STV05]. The approach described by [HGD95] represents control using symbolic model checking and data by purely symbolic manipulation of first-order formulas. There the symbolic data are limited in the sense that they cannot influence the control flow.

2.4 LTL Verification of Parallel LLVM Programs

Again, the breadth of our field demands that we restrict our attention to only a few of the most relevant directions of research. We would also like to add that many of the following methods have a deep theoretical background, and their comprehensive exposition is outside the scope of this thesis. The interested reader is urged to follow the cited sources; any attempt from our side to provide a short overview would likely be to the detriment of understanding.

Explicit Model Checking The tools following the explicit-state approach to model checking (such as SPIN [Hol97]) are inherently limited regarding data non-determinism: should the data be represented explicitly, the state space explosion quickly renders verification infeasible. Regarding the LLVM language support, however, DIVINE [BBH+13] provides a much larger subset, with better coverage of memory allocations and pointer operations and with a comprehensive handling of exceptions. The support for LLVM was recently added [vdB13] to LTSmin as well.

Symbolic Model Checking Fully symbolic tools, e.g. NuSMV [CCG+02], do not distinguish between control and data, simply representing a set of states symbolically as a single entity. While they are designed for CTL model checking, they commonly allow verification of LTL by adding fairness constraints. The language support is limited to a language specific to a given tool: in order to compare with SymDivine we had to translate the input program to their language via the empty data representation.

Interpolation A relatively recent technique for accelerating the reachability algorithm uses Craig's interpolation for generating approximations of the transition relation [McM03]. Employing this algorithm in LTL model checking is theoretically possible by reducing the LTL model checking to reachability [BAS02], yet no tool exists that implements it.

IC3 Another technique for automatic approximation of the transition relation, one that is additionally guided by the property under verification, is IC3 [Bra11]. A CTL model checking tool based on IC3 was also designed [BSHY11]. Furthermore, the new version of NuSMV (called nuXmv) also implements LTL verification [CGMT14], although a bug causing the tool to crash on LTL instances prevented us from comparing with this approach.

Termination Finally, a different family of techniques reduces the problem to program termination. Two possible approaches were investigated: via fair termination [CGP+07] (shown extendable to predicate abstraction by encoding path segments as abstract transitions [PR05]) and via counterexample-guided CTL model checking [CK11]. The authors of the last paper also published inputs for their experiments, which we adopted to compare SymDivine with the available tools. Heidy Khlaaf builds on the results previously achieved by Eric Koskinen and has considerably extended them, both in efficiency and in scope. Her tool is a part of the T2 temporal prover¹. In [CKP14], she considerably accelerated the CTL model checking for C programs using precondition computation. In [CKP15a], she added support for fairness, thus effectively allowing the verification of LTL properties. And finally, in [CKP15b], she further extended the approach to full CTL∗.

¹ http://mmjb.github.io/T2/


Horn Clauses Verifying CTL properties has also been reduced to solving quantified Horn clauses [BBR14]. This approach builds on the incremental results for solving existentially quantified Horn clauses [BPR13] and linear arithmetic constraint solving [GPR11b].

Abstract Interpretation The individual results produced by Antoine Miné would together also enable verifying temporal properties of parallel programs. First, in [Min12], he described an abstract domain for the bit-vector theory. Then, in [Min13], he applied abstract interpretation to allow static analysis of concurrent programs. Finally, in [UM15], they extended the scope of abstract interpretation to a subset of LTL. Note that both this research direction and the direction using Horn clauses depend to some extent on deciding termination.

Program Traces Observing the program as a producer of possible traces, i.e. sequences of transitions, allows building automata that accept these traces. The program analysis engine built on this idea is described in [HHP13] and extended to temporal verification in [DHLP15]. A related idea led the authors of [FKP13] to a deductive verification method for parallel programs. Unlike the partial order reduction implemented in Section 8, the above approach is space-polynomial in the number of threads. And finally, Andrey Kupriyanov, working with a different notion of traces, designed a method for the analysis of multi-threaded programs: first for deciding safety properties in [KF13], then for deciding termination in [KF14], and finally for LTL verification in the presently unpublished [Kup14].


Chapter 3

Preliminaries

Model checking, and in particular explicit-state model checking, is one of the underlying principles that connect the individual parts of this thesis. The sanity of requirements is tested by repeatedly model checking a conjunction of those requirements. The Simulink designs and LLVM programs are translated into control-flow graphs on which a model checking procedure is executed. In the latter case, the explicit model checking is extended with symbolic data representations, but otherwise the verification technique itself is unmodified. This chapter establishes the necessary background in formal methods, as used in this thesis. The definitions pertaining to model checking may appear standard, but the use of symbolic data representations required us to modify them to accommodate the bivalent (explicit control/symbolic data) nature of states. The sanity checking techniques may be new to most readers, and we also recommend reading our short presentation of the bit-vector theory. The principles and limitations of checking satisfiability in this theory lie at the core of some of the crucial concepts of this work.

3.1 First-Order Theory of Bit-Vectors

Following [MP95] we will assume that first-order (FO) formulae refer to variables from a universal set V. Each variable x ∈ V draws its values from a domain sort(x). The set of evaluations E of variables from V contains functions from V to ⋃_{x∈V} sort(x), such that ∀(ν ∈ E), ν(x) ∈ sort(x). Let X ⊆ V denote a set of variables {x₁, ..., xₙ}; we will assume a fixed ordering and overload the notation to allow ν(X) = (ν(x₁), ..., ν(xₙ)). In order to model precisely the semantics of programs, the domain of program variables will consist of fixed-width bit vectors, i.e. sort(x) = {0, ..., 2^{q_x} − 1} for some q_x. Apart from program variables, V also contains program counters pc, one for each parallel process, where sort(pc) is the set of program locations of the given process. Finally, a subset I ⊆ V denotes an infinite set of input variables, drawing non-deterministic values from sort(i) = {0, ..., 2^{q_i} − 1} for i ∈ I. Altogether, V = D ∪ D′ ∪ {pc, pc′} ∪ I, where D is the set of data variables (the next-state versions are primed), pc the program counter, and I the set of input variables. Then the state of a program is uniquely defined as an evaluation of the control and data variables from D ∪ {pc}. The FO theory used in this thesis is the bit-vector theory BV with standard Peano arithmetic and bit-wise operations that, when applied to variables from V, form bit-vector expressions [KS10]. When necessary we will distinguish between the quantifier-free fragment of BV and the full, quantified bit-vector theory QBV.

We will denote both functions and predicates with Greek letters and the set of all BV expressions over V as Φ.

Definition 3.1 (Syntax of BV). We define the syntax of (the quantifier-free fragment of) the theory of bit-vectors as follows:

formula ::= formula ∧ formula | ¬formula | (formula) | atom
atom ::= term rel term | boolean-identifier | term[constant]
rel ::= < | =
term ::= term op term | identifier | ∼term | constant | atom ? term : term | term[constant : constant]
op ::= + | − | · | / | ≪ | ≫ | & | | | ⊕ | ◦

Each BV term is associated with an integer bit-width, and relations and binary operations are well-defined only when applied to terms of equal bit-width. Formally, each bit-width has a unique predicate symbol (denoted with a lower index), but we will not obfuscate the notation.

Example 3.2 (BV Semantics). The standard semantics of bit-vector operations is similar to modular arithmetic, i.e. each operation is computed modulo 2ⁿ, for n the bit-width of the operation.

      11001000 = 200
    + 01100100 = 100
    ------------
      00101100 = 44

Above is an example of bit-width 8 addition. It follows that in the theory of bit-vectors, 200 +₈ 100 = 44 is a valid formula. A large part of this work directly depends on the high efficiency of decision procedures for BV satisfiability. Similarly to decision procedures of other first-order theories, BV satisfiability is decided by a collection of rewriting rules, heuristics, and the DPLL algorithm [DLL62] for propositional logic. Unlike other theories, BV is much more closely related to propositional logic, since a BV formula can be translated to an equisatisfiable propositional formula. This translation assigns a new propositional variable to each bit of each bit-vector variable. Deciding satisfiability of QBV formulae is both theoretically and practically more challenging. To be more precise, each BV formula implicitly existentially quantifies all its free variables. Thus when we talk about QBV, we mean those FO formulae that have at least one quantifier alternation.

Definition 3.3 (Quantified BV). We extend the syntax to quantified formulae in the following manner, where the type of each quantified variable denotes its bit-width:

formula ::= ∃(identifier :: bit-width), formula | ∀(identifier :: bit-width), formula | formula

The efficiency of state-of-the-art solvers for QBV is much less consistent compared to BV. This is partly caused by the fact that very little effort has been directed towards solving QBV satisfiability. To the best of our knowledge, the authors of [WHdM13] implemented the first and only QBV solver as part of Z3.

Their methods combine simplification ideas from theorem proving, such as miniscoping and skolemisation [Har09], with heuristic quantifier instantiation. The latter technique synthesises a model µ of the universally quantified variables for which the input formula ∀x, ϕ is satisfiable, i.e. |=SAT ϕ[x ↦ µ(x)]. Then if x ≠ µ(x) ∧ ϕ is unsatisfiable, the original formula is pronounced satisfiable. The heuristic is relatively effective but only applicable to satisfiable formulae. For unsatisfiable formulae the instantiation continues in a loop until the formula can be simplified to ⊥. In our application, unsatisfiable formulae are created and checked every time we find two matching states. This is thus a relatively common situation in the verification we propose, and an appropriate amount of attention needs to be paid to handling the potential inefficiency.
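As a toy illustration of such a quantified query (a sketch only, assuming the Z3 C++ API is available; the actual state matching in our tools quantifies over the program's data variables), the following program asks whether two symbolically represented sets of 8-bit evaluations are equal. Both formulae describe the set of even bytes, so the solver reports a match:

    #include <z3++.h>
    #include <iostream>

    int main() {
        z3::context ctx;
        z3::expr x = ctx.bv_const("x", 8);   // a data variable
        z3::expr i = ctx.bv_const("i", 8);   // input variables generating the sets
        z3::expr j = ctx.bv_const("j", 8);
        // Two symbolic states: {x | exists i. x = i + i} and {x | exists j. x = j << 1}.
        z3::expr inA = z3::exists(i, x == i + i);
        z3::expr inB = z3::exists(j, x == z3::shl(j, ctx.bv_val(1, 8)));
        // State matching: the sets are equal iff membership coincides for all x;
        // we ask the solver for a counterexample to the equivalence instead.
        z3::solver s(ctx);
        s.add(!z3::forall(x, inA == inB));
        std::cout << (s.check() == z3::unsat ? "states match" : "states differ")
                  << std::endl;
    }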

3.2 Control Flow Graphs

This section will establish the mathematical foundations for program analysis, building on the notion of control flow graphs. In the classical model checking literature, see e.g. [CGP99], they are referred to as Kripke structures or labelled transition systems. Classical model checking, however, does not distinguish between control and data flow, which is why we opt for control flow graphs instead. The formalism of control flow graphs can also be interpreted as omega-regular automata, i.e. automata recognising certain languages over infinite words. To simplify the following discussion we do not introduce another formalism for automata recognising temporal specifications, but instead alter the semantics of control flow graphs. The interested reader is recommended to investigate Büchi automata [Chu62] for more information. Throughout this thesis, we will use the flow graphs for modelling parallel programs, their processes, specification automata, and in general any reactive system.

Definition 3.4 (Control Flow Graph). The control flow graph (CFG) P = (L, l₀, T, F) is a 4-tuple, where

• L is the set of program locations

• l0 ∈ L is the initial location

• T ⊆ L × Φ × L is the transition relation

• F ⊆ L is the set of final locations

Hence a transition t ∈ T is a triple t = (l, ρ, l′) with l, l′ ∈ L, where ρ describes the relation between current-state and next-state data variables; depicted visually, t = l −ρ→ l′. Example 3.5 shows CFGs of the parallel processes of a program, where the parallel composition operator ∥ will be defined momentarily.
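For readers who prefer code, Definition 3.4 can be rendered as a minimal C++ type along the following lines (an illustrative sketch, not the internal representation of our tools; formulae are kept opaque):

    #include <set>
    #include <string>
    #include <vector>

    using Location = std::string;   // e.g. "l1", "l2"
    using Formula  = std::string;   // an opaque BV relation over D and D'

    struct Transition {
        Location from;   // l
        Formula  rho;    // the transition label
        Location to;     // l'
    };

    struct CFG {
        std::set<Location>      locations;    // L
        Location                initial;      // l0
        std::vector<Transition> transitions;  // T
        std::set<Location>      finals;       // F
    };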

Example 3.5 (CFG for Parallel Program). Let us demonstrate how threads of a parallel program can be mapped to our definition of CFGs. The simplified parallel program below is represented by three CFGs (the top one only initialises the global variables with non-deterministic values).

    g1: uint32_t a, b;

    thread1() {
    l1:   if (a > 5) {
    l2:     b++; }
    }

    thread2() {
    m1:   while (b < 20) {
    m2:     a = a % 5; }
    }

[Figure: the corresponding CFGs. The global process moves g₁ −(a′ = ι₃₂)→ g₂ −(b′ = ι₃₂)→ ĝ₁. Thread 1 has locations l₁, l₂, l̂₁ with transitions labelled a > 5 (l₁ to l₂), a ≤ 5 (l₁ to l̂₁), and b′ = b + 1 (l₂ to l̂₁). Thread 2 has locations m₁, m₂, m̂₁ with transitions labelled b < 20 (m₁ to m₂), b ≥ 20 (m₁ to m̂₁), and a′ = a mod 5 (m₂ to m₁).]

The transition labels are simplified in that each variable preserves its value, i.e. x′ = x, whenever it is not explicitly modified. For example, the CFG of thread 1 is G = (L, l₁, T, ∅), where

• L = {l₁, l₂, l̂₁}

• T = { (l₁, a > 5 ∧ a′ = a ∧ b′ = b, l₂),
        (l₁, a ≤ 5 ∧ a′ = a ∧ b′ = b, l̂₁),
        (l₂, a′ = a ∧ b′ = b + 1, l̂₁) }

It will prove useful in the future discussion to introduce three composition operations on CFGs: (1) the parallel composition ∥ to model interleaving of parallel processes, (2) the synchronous composition ∘ to model the connection between a program and its specification, and (3) the sequential composition ; to model the sequential transfer of control from one CFG to the next. To simplify the notation we will define the compositions between two CFGs, but the extension to more CFGs is straightforward.

Definition 3.6 (Parallel Composition). The parallel composition P = P¹ ∥ P² of CFGs P¹ = (L¹, l₀¹, T¹, F¹) and P² = (L², l₀², T², F²) is a CFG P = (L, l₀, T, F), where

• L = L¹ × L²

• l₀ = (l₀¹, l₀²)

• T = {(l₁, l₂) −ρ→ (l₁′, l₂′) | l₁ −ρ→ l₁′ ∈ T¹ ∨ l₂ −ρ→ l₂′ ∈ T²} (the component that does not perform the transition keeps its location)

• F = {(l₁, l₂) | l₁ ∈ F¹ ∨ l₂ ∈ F²}

Definition 3.7 (Synchronous Composition). The synchronous composition P = P¹ ∘ P² of CFGs P¹ = (L¹, l₀¹, T¹, F¹) and P² = (L², l₀², T², F²) is a CFG P = (L, l₀, T, F), where L, l₀, and F are defined similarly as for the parallel composition and

• T = {(l₁, l₂) −ρ→ (l₁′, l₂′) | l₁ −ρ₁→ l₁′ ∈ T¹, l₂ −ρ₂→ l₂′ ∈ T², ρ = ρ₁ ∧ ρ₂}

Definition 3.8 (Sequential Composition). The sequential composition P = P¹ ; P² of CFGs P¹ = (L¹, l₀¹, T¹, F¹) and P² = (L², l₀², T², F²) is a CFG P = (L, l₀, T, F), where

• L = L¹ ∪ L²

• l₀ = l₀¹

• T = T¹ ∪ T² ∪ {l −⊤→ l₀² | l ∈ F¹}

• F = F²
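Continuing the C++ sketch from Definition 3.4, the parallel composition can be implemented as follows (illustrative only; product locations are encoded by concatenating the component names, and the component that does not move keeps its location):

    CFG parallel(const CFG& p1, const CFG& p2) {
        CFG p;
        auto pair = [](const Location& a, const Location& b) { return a + "|" + b; };
        for (const Location& l1 : p1.locations)
            for (const Location& l2 : p2.locations) {
                p.locations.insert(pair(l1, l2));
                // a pair is final if either component is final
                if (p1.finals.count(l1) || p2.finals.count(l2))
                    p.finals.insert(pair(l1, l2));
                // interleaving: either the first component moves ...
                for (const Transition& t : p1.transitions)
                    if (t.from == l1)
                        p.transitions.push_back({pair(l1, l2), t.rho, pair(t.to, l2)});
                // ... or the second one does
                for (const Transition& t : p2.transitions)
                    if (t.from == l2)
                        p.transitions.push_back({pair(l1, l2), t.rho, pair(l1, t.to)});
            }
        p.initial = pair(p1.initial, p2.initial);
        return p;
    }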

Example 3.9 (CFG Composition). Let P₁ and P₂ be the CFGs of threads 1 and 2 from Example 3.5 and P_g the CFG of the global process. Below is the composition P = P_g ; (P₁ ∥ P₂), where some of the transition labels are omitted to avoid clutter.

[Figure: the composed CFG. The initialisation chain g₁ −(a′ = ι₃₂)→ g₂ −(b′ = ι₃₂)→ ĝ₁ is followed by a ⊤-transition into the 3 × 3 grid of product locations (l, m) with l ∈ {l₁, l₂, l̂₁} and m ∈ {m₁, m₂, m̂₁}; the grid transitions carry the labels from Example 3.5, e.g. a > 5, a ≤ 5, b′ = b + 1, b < 20, b ≥ 20, and a′ = a mod 5.]

The CFGs represent programs in a control explicit, data symbolic manner. Explicit-state model checking, however, operates over fully explicit systems. In order to define model checking of programs we must first explicate the manipulation of data. A standard formalism for representing models of programs is the labelled transition system: effectively a CFG where each transition is labelled with the trivially true formula. This means that all the information about the variable values has to be stored in the control-flow locations. Since a large part of our work is concerned with reducing the size of the resulting CFG, we will represent each reduction technique as a process of generating the explicit CFG.

Definition 3.10 (Explicit CFG). The explicit control-flow graph is a CFG G = (S, s₀, T, A), where for each s −σ→ s′ ∈ T it holds that σ = ⊤, and we simplify the notation to s → s′. From a CFG P = (L, l₀, T, F) we can generate the explicit CFG G. The set of locations S is the set of evaluations of {pc} ∪ D. We define the generation as follows:

• S = {s : {pc} ∪ D → L ∪ ⋃_{x∈D} sort(x) | s(pc) ∈ L, ∀(x ∈ D), s(x) ∈ sort(x)}

• s0 = {(pc, l0)} ∪ {(x, 0) | x ∈ D}

• T = {s → s′ | s(pc) −ρ→ s′(pc) ∈ T, |=SAT ⋀_{x∈D} x = s(x) ∧ ρ ∧ ⋀_{x∈D} x′ = s′(x)}

• A = {s ∈ S | s(pc) ∈ F}

The above definition of transitions in the explicit CFG requires that the transition relation ρ holds between the evaluation in s and the evaluation in s′. Note that ρ may also refer to the input variables from I. Hence the fact that the formula is satisfiable, denoted by |=SAT, states that there is an evaluation of the input variables for which the formula is valid.
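In code, the transition test amounts to a single satisfiability query per candidate successor, roughly as in the following sketch (hypothetical names; Formula is the opaque type from the earlier CFG sketch and is_satisfiable stands for an SMT back end such as Z3):

    #include <cstdint>
    #include <map>
    #include <string>

    using Formula = std::string;
    bool is_satisfiable(const Formula& f);   // assumed SMT oracle

    // s -> s' is an explicit transition iff the current evaluation, the label
    // rho, and the primed next-state evaluation are jointly satisfiable; input
    // variables occurring in rho stay free, so satisfiability reads as
    // "there is an input under which the step is possible".
    bool has_transition(const std::map<std::string, std::uint64_t>& s,
                        const Formula& rho,
                        const std::map<std::string, std::uint64_t>& s_next) {
        Formula query = rho;
        for (const auto& [x, v] : s)
            query += " && " + x + " == " + std::to_string(v);
        for (const auto& [x, v] : s_next)
            query += " && " + x + "' == " + std::to_string(v);
        return is_satisfiable(query);
    }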

3.3 LTL Model Checking

As stated in the introduction, LTL is a logic that allows developers to specify temporal properties of their programs. The logic is based on so-called atomic propositions, i.e. formulae from Φ which can be evaluated over locations of the explicit CFG.

Definition 3.11 (LTL Syntax). For an atomic proposition p ∈ Φ, we define the syntax of the Linear Temporal Logic inductively as follows:

ψ ::= p | ¬ψ | ψ ∧ ψ | X ψ | ψ U ψ

Intuitively, X ψ states that ψ will hold in the next program state and ψ₁ U ψ₂ that ψ₁ will hold up to some point in the future at which also ψ₂ holds. The full and precise semantics of LTL relates to the specific transition system, which must be defined first. Let us only add that there are other linear operators that can be defined using X and U, e.g. F ψ := ⊤ U ψ or G ψ := ¬ F ¬ψ formalise the intuitive notion of formulae that hold at some point in the future or globally (in every program state), respectively.
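The inductive syntax, together with the derived operators, maps directly onto a small C++ data type; the sketch below is purely illustrative (a True constant is added so that F can be expressed as ⊤ U ψ):

    #include <memory>
    #include <string>

    struct Ltl;
    using LtlPtr = std::shared_ptr<Ltl>;

    struct Ltl {
        enum Kind { True, Atom, Not, And, Next, Until } kind;
        std::string prop;      // used when kind == Atom
        LtlPtr lhs, rhs;       // subformulae (rhs unused for unary operators)
    };

    LtlPtr mk(Ltl::Kind k, LtlPtr l = nullptr, LtlPtr r = nullptr) {
        return std::make_shared<Ltl>(Ltl{k, "", std::move(l), std::move(r)});
    }
    LtlPtr atom(std::string p) {
        return std::make_shared<Ltl>(Ltl{Ltl::Atom, std::move(p), nullptr, nullptr});
    }

    // Derived operators from the text: F psi := true U psi, G psi := not F not psi.
    LtlPtr F(LtlPtr psi) { return mk(Ltl::Until, mk(Ltl::True), std::move(psi)); }
    LtlPtr G(LtlPtr psi) { return mk(Ltl::Not, F(mk(Ltl::Not, std::move(psi)))); }

    // For instance, the property of Example 3.14 below: G (b < 12).
    LtlPtr example = G(atom("b < 12"));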

Definition 3.12 (LTL Semantics). Let G = (S, s₀, T, A) be an explicit CFG and ψ an LTL formula. A run (or trace) w of G is an infinite sequence of explicit locations w = (s₀, s₁, ...) and wⁱ its i-th suffix, i.e. wⁱ = (sᵢ, sᵢ₊₁, ...). The satisfaction of ψ in a run w is defined inductively, according to the structure of ψ:

w |= p ⇔ s₀ |= p
w |= ¬ψ ⇔ w ⊭ ψ
w |= ψ₁ ∧ ψ₂ ⇔ w |= ψ₁ and w |= ψ₂
w |= X ψ ⇔ w¹ |= ψ
w |= ψ₁ U ψ₂ ⇔ ∃(i ∈ ℕ₀), ∀(j < i), wʲ |= ψ₁ and wⁱ |= ψ₂

Example 3.13 (Temporal Operators). The four runs below illustrate the meaning of the LTL operators. [Figure: four finite prefixes of runs, one for each of the formulae X p, F p, G p, and p U q; nodes are labelled with the atomic propositions that hold in them.]

For each formula on the left it holds that the run on the right satisfies the formula. Note that these are only the finite prefixes of infinite runs and each run is only one of possibly many runs that satisfy the respective formula. The nodes without labels can evaluate atomic propositions to arbitrary values. The nodes with labels represent locations in which their labels evaluate to true, e.g. a node labelled with p evaluates the atomic proposition p to true but any other proposition can be either true or false. With the exception of G, the depicted prefix alone attests to the satisfaction of the whole run, regardless of the suffix. Demonstrating the satisfaction of G, on the other hand, requires the complete infinite run.

We overload the entailment relation |= to denote (1) that a trace w satisfies an LTL formula ψ and (2) that an evaluation of variables from X is a model of a BV formula over variables from Y ⊆ X. Finally, the whole CFG G satisfies ψ if all of its runs do. The task of an LTL model checker is to provide an answer {correct, (incorrect, counterexample)} for an input pair (G, ψ), where counterexample is a run of G that does not satisfy ψ.

Example 3.14 (Büchi Automaton). We continue our running example using the system P shown in Example 3.9. Below is one of the runs w of P. We represent a transition (s, ρ, s′) as s → s′, omitting ρ to ease the exposition. Let us assume that we are considering the satisfaction of an LTL formula ψ : G b < 12.

w₀ = [pc ↦ g₁, a ↦ 0, b ↦ 0]
w₁ = [pc ↦ g₂, a ↦ 6, b ↦ 0]
w₂ = [pc ↦ ĝ₁, a ↦ 6, b ↦ 11]
w₃ = [pc ↦ (l₁, m₁), a ↦ 6, b ↦ 11]
...
w₇ = [pc ↦ (l̂₁, m₁), a ↦ 3, b ↦ 12]

The run w does not satisfy ψ: even though b < 12 holds in the first seven states, the eighth state is not a model of b < 12 and thus even this finite prefix demonstrates that w does not satisfy ψ. Note that w is only one of many runs of the system; in fact there are exactly 2³² distinct successors of the first state of w.

3.3.1 Explicit-State Model Checking

A classical approach to verifying LTL properties is automata-based model checking (see the relevant parts of Section 2 for other approaches). Informally, for an LTL formula ψ one can construct an ω-automaton [CGP99] – an automaton over infinite words – that accepts exactly those words that model the formula ψ. Hence the automaton for the formula ¬ψ accepts exactly the counterexamples for the formula ψ. It follows that if a product automaton of the program P and the automaton for ¬ψ is constructed, then it accepts exactly those counterexamples for ψ that are legal executions of P. The verification is thus reduced to locating such accepted words. The ω-automata used to represent LTL formulae are called Büchi automata, but their structure is similar to CFGs and thus we will not introduce a new formalism. For our purposes it will suffice to define the set of words accepted by a CFG.

Definition 3.15 (CFG Acceptance). A CFG P = (L, l0, T, F) accepts a trace w = (s0, s1, ...) if and only if there is an infinite sequence of locations α = (l0, l1, ...) such that

• ∀(i ∈ N), (li −ρ→ li+1) ∈ T ∧ w(i) |= ρ

• ∀(i0 ∈ N), ∃(i ∈ N), i > i0 ∧ α(i) ∈ F

The first condition requires that each location in w satisfies the relevant transition in P while the second condition requires that the sequence α visits the set of final states F infinitely many times. It is often useful to denote which LTL formula is represented by a particular CFG. For that purpose we will use the notation Aϕ to denote a CFG that accepts all traces w that satisfy ϕ. Since we then interpret CFGs as automata, we can equally say that w is a word in the language accepted by Aϕ, denoted as w ∈ L(Aϕ). Henceforth, we will not need to distinguish between the original program and its CFG representation. Assuming that PA is the CFG for the LTL property being verified, we fix the notation P = (P1 ∥ ... ∥ Pn) ⊗ PA = (L, l0, T, F) and only refer to P as the input for model checking.

Words accepted by P are the witnesses of the program's failure to meet its specification. Thus if the set of accepted words is empty then the program is correct. The existence of accepted words is equivalent to the existence of reachable accepting cycles in the explicit state space. A cycle is a closed path ζ = s1 → s2 → ... → sn = s1 and ζ is accepting if ∃(1 ≤ i < n), si ∈ F. A state s is reachable if there is a path from s0 to s and, by extension, a cycle is reachable if any of its states is reachable. The classical explicit-state model checking thus reduces the LTL verification task to the location of accepting cycles within the transition system of the program/specification product.

Traditionally, a system as a whole is considered to satisfy an LTL formula if all its executions (all infinite words over the states of its CFG) do. However, we might occasionally also be interested in the question whether at least one of a system's executions satisfies the given LTL formula. We thus suggest explicit path quantification using the ∀ and ∃ quantifiers as follows: for a system model M and a formula ϕ, we write M |= ∀ϕ if all executions of M satisfy ϕ, and M |= ∃ϕ if there is at least one execution of M satisfying ϕ. We call the former kind of path-quantified formulae universal and the latter kind existential.

There are several notes to be made. First, we do not allow nesting of path-quantified formulae, as we want to avoid the complexity of dealing with full CTL∗. We only occasionally use the negation operator, with the meaning ¬∀ϕ ≡ ∃¬ϕ and vice versa. Second, one of the advantages of making the hidden universal quantification of LTL explicit is to reduce the confusion over the fact that LTL seemingly does not follow the law of excluded middle: we can have a system model that satisfies neither ϕ nor ¬ϕ. Third, using path-quantified LTL formulae is really just a convenience, as it does not make the verification task any harder. Indeed, to verify whether a system model satisfies ∃ϕ is the same as to verify whether it satisfies ∀¬ϕ and invert the answer. Efficient verification of that satisfaction, however, requires a more systematic approach than the enumeration of all executions. An example of a successful approach is the automata-based approach using Büchi automata.

Example 3.16 (Specification and Program Product). Let us conclude the demonstration started in Example 3.5 by presenting the CFG PA (on the left below) for verifying the formula ψ : G b < 12, i.e. representing the runs that satisfy its negation ¬ψ = F(b ≥ 12). The sourceless arrow points to the initial location.

[Figure: on the left, the specification automaton PA over locations a0 (initial) and a1 (accepting), with transitions labelled b < 12 (loop on a0), b ≥ 12 (from a0 to a1) and ⊤ (loop on a1); on the right, the product path s0 ∪ [pcA ↦ a0] → ... → s6 ∪ [pcA ↦ a0] → s7 ∪ [pcA ↦ a1].]

Assuming that w = (s0, s1, s2, ...) is the run from Example 3.14, the path on the right represents a reachable accepting cycle in the explicit CFG for the synchronous product of PA and the program from Example 3.9. This cycle can be mechanically detected, thus demonstrating the program's incorrectness with respect to ψ.

Without going into too many implementation details, we will briefly describe the internal functioning of an explicit model checker (a more detailed exposition can be found at the end of Section 5). Assume that the input is the composition (P1 ∥ ... ∥ PN) ⊗ PA, but only in an implicit form: the parallel processes and the specification automaton are known but the state space of the composition has to be generated. During accepting cycle detection, the model-checking algorithm generates successors of individual states using the above rule. The set of visited states is maintained in an explicit storage and every newly generated state is tested for having been previously visited. Based on the specific cycle detection algorithm, the state space is traversed in some order until either an accepting cycle is found or no new state can be generated.
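The cycle detection itself can be realised, for instance, by the nested depth-first search of [CVWY92]. The following is a self-contained sketch of that algorithm – recursive for brevity, whereas a real checker uses explicit stacks – with the successor function succs and the acceptance predicate accepting supplied by the caller:

def find_accepting_cycle(initial, succs, accepting):
    """Nested DFS: returns True iff an accepting cycle is reachable.
    succs(s) yields the successors of state s in the product automaton;
    accepting(s) says whether s belongs to the set F."""
    outer_visited, inner_visited = set(), set()

    def inner(seed, s):
        # second search: try to close a cycle back to the accepting seed;
        # inner_visited persists across all inner searches (the key trick)
        for t in succs(s):
            if t == seed:
                return True
            if t not in inner_visited:
                inner_visited.add(t)
                if inner(seed, t):
                    return True
        return False

    def outer(s):
        outer_visited.add(s)
        for t in succs(s):
            if t not in outer_visited and outer(t):
                return True
        # start the nested search in post-order, after all successors
        return accepting(s) and inner(s, s)

    return outer(initial)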

3.4 Sanity Checking

As described in the introduction, the model checking procedure is not designed to decide why a certain property is satisfied in a given system. That is a problem, however, because the reasons for satisfaction might be the wrong ones. If, for example, the system is modelled erroneously or the formula is not appropriate for the system, it is still possible to receive a positive answer. These kinds of inconsistencies between the model and the formula are detected using sanity checking techniques, namely redundancy and coverage. In this thesis they will be described for comparison with their model-free versions; the interested reader should consult for example [Kup06] for more details.

Let K be a CFG, ϕ a formula and ψ its subformula. Let further ϕ[ψ′/ψ] be the modified formula ϕ in which the subformula ψ is substituted by ψ′. Then ψ does not affect the truth value of ϕ in K if the following property holds: K satisfies ϕ[ψ′/ψ] for every formula ψ′ iff K satisfies ϕ. A system K then satisfies a formula ϕ redundantly if K satisfies ϕ and there is a subformula ψ of ϕ such that ψ does not affect ϕ in K. A state s of a CFG K is q-covered by ϕ, for a formula ϕ and an atomic proposition q, if K satisfies ϕ but K̃s,q does not satisfy ϕ. There K̃s,q stands for a CFG equivalent to K except that the valuation of q in the state s is flipped.

Example 3.17 (Model-Based Redundancy and Coverage). To demonstrate redundant satisfaction let us consider a standard liveness formula ϕ : G(req ⇒ F resp). Assume that we try to verify that a system K, in which no request ever occurs, satisfies ϕ. Then setting ϕ⊤ : ϕ[resp ↦ ⊤] and ϕ⊥ : ϕ[resp ↦ ⊥] leads to K |= ϕ⊤ and K |= ϕ⊥, thus demonstrating the redundancy. To demonstrate coverage let us consider the following system:

[Figure: a three-state system K; the initial state is labelled ¬a, ¬b and the remaining two states, among them the state w, are labelled a, b.]

and the formula ϕ : (¬a) U b. There the state w is b-covered by ϕ, because K |= ϕ and K̃w,b ⊭ ϕ. On the other hand, w is not a-covered by ϕ, because K |= ϕ but also K̃w,a |= ϕ.

If one could extract the underlying ideas of redundancy and coverage and show that they do not necessarily require the existence of a model, it would be possible to extrapolate these notions into the model-free environment. Redundancy states that the satisfaction of a formula is given extrinsically (by an environment) and is not related to the formula itself. Thus we may consider the remaining formulae from the set of requirements to constitute the environment. Coverage, on the other hand, attempts to capture the amount of system behaviour that is described by the formulae. Again, we can use the remaining formulae to form the system, but in order to establish a coverage metric capable of differentiating various formulae, we will need to extend the notion from state-coverage to path-coverage.

Although these notions of sanity are dependent on the system that is being verified, we might still utilise at least the notion of redundancy in a model-less setting, using the concept of redundancy witnesses. A redundancy witness to a given formula ϕ is a formula ψ with the following property: if a system K satisfies ψ then it satisfies ϕ redundantly. A standard example is the request-response formula G(req ⇒ F resp), "every request is followed by a response", whose two redundancy witnesses are G(¬req), "no request ever happens", and G F(resp), "responses happen infinitely often". The redundancy witnesses for a given formula can be automatically generated, see [BBDER01]. Moreover, as it is desired for the redundancy witnesses not to hold, we may include this requirement in our original set of requirements with the help of path-quantified LTL formulae. For example, if our set of requirements contains the formula ∀ G(req ⇒ F resp), we add the formulae ∃ F(req) and ∃ F G(¬resp) to the set – those are the negations of the redundancy witnesses.

In Section 4, we translate the concepts of model-based sanity into the model-free environment. We further supplement these notions with consistency verification, thus arriving at a method that aids the creation of a reasonable set of requirements, i.e. a set such that it is reasonable to ask whether a system satisfies it or not. If, however, the intention is to build a system based on this set, then a further property of the set is desirable: that it completely describes the system under development. Consequently, a sane set of requirements is consistent, without redundancies, and complete in a sense we will formalise in Section 4.

Chapter 4

Checking Sanity of Software Requirements

The requirements stage is the earliest of the software development stages and, as various studies have concluded, undetected errors made early in the development cycle are the most expensive to eradicate. The impact of this thesis reaches to this early stage of development by mechanically verifying certain crucial properties of requirements. Our sanity checking method guides the requirements engineers in producing a collection of requirements that may serve in future stages as the specification against which further analysis is run. Hence our overall contribution amounts not only to providing a verification method but also to ensuring that one is verifying relevant properties. It is thus very important that the outcome of the requirements stage – a database of well-formed, traceable requirements – is what the customer intended and that nothing was omitted (not even unintentionally). While a procedure that would ensure these two properties cannot be automated, our work on requirements analysis proposes a methodology to check the sanity of requirements.

In the following, sanity checking will be considered to consist of three related tasks: consistency, redundancy and completeness checking. As consistency and redundancy are more closely related, we focus on them in Section 4.1. The completeness checking is a little more involved than both consistency and redundancy: this is in fact the part of sanity checking that provably cannot be fully automated. Hence we will first describe the problem and then detail the proposed semi-automatic solution. We dedicate Section 4.2 solely to completeness. Through our collaboration with the Honeywell R&D department, we were able to obtain industrial requirement documents, which form one of our experimental studies, described in Section 4.3.

4.1 Consistency and Redundancy

Definition 4.1 (Consistency and Redundancy). A set Γ of LTL formulae over AP is consistent if ∃w ∈ (2^AP)^ω : w satisfies ⋀Γ. Checking consistency of a set Γ entails finding all minimal inconsistent subsets of Γ. A formula ϕ is redundant with respect to a set of formulae Γ if ⋀Γ ⇒ ϕ. To check redundancy of a set Γ entails finding all pairs ⟨ϕ ∈ Γ, Φ ⊆ Γ⟩ such that Φ is consistent and Φ ⇒ ϕ (and for no Φ′ ⊊ Φ does it hold that Φ′ ⇒ ϕ).

The existence of the appropriate w can be tested by constructing A⋀Γ and checking that L(A⋀Γ) is non-empty.

The procedure is effectively equivalent to model checking where the model is a clique over the graph with one vertex for every element of 2^AP (allowing every possible behaviour). This approach to consistency and redundancy is especially efficient if a non-trivial set of requirements needs to be processed: standard sanity checking would only reveal that there is an inconsistency (or redundancy) but would not be able to locate its source. Furthermore, dealing with larger sets of requirements entails the possibility that there will be several inconsistent subsets or that a formula is redundant due to multiple small subsets. Each of these conflicting subsets needs to be considered separately, which can be facilitated using the methodology proposed in this work.

Example 4.2 (Consistent and Redundant Requirements). Let us assume that there are five requirements formalised as LTL formulae over a set of atomic propositions (two-valued signals) {p, q, a}. They are

1. Eventually, the presence of signal p will entail its continued presence until the signal q is observed. ϕ1 = F(p ⇒ p U q)

2. The signal p will occur infinitely often. ϕ2 = GF p

3. The signals a and p cannot occur at the same time. ϕ3 = G ¬(a ∧ p)

4. Each occurrence of the signal q requires that signal a was observed in the previous time step. ϕ4 = G(X q ⇒ a)

5. The signal q will occur infinitely often. ϕ5 = GF q

In this set the formula ϕ4 is inconsistent due to the first three formulae, and the last formula is redundant with respect to (implied by) the first two formulae.

We now reformulate the notions of Definition 4.1 to extend also to path-quantified LTL formulae. Again, we are interested in finding the minimal inconsistent subsets and minimal redundancy witnesses.

Definition 4.3 (Path-Quantified Consistency and Redundancy). A set Γ of path-quantified LTL formulae over AP is consistent if there is a system model M such that M |= ϕ for all ϕ ∈ Γ. A path-quantified formula ϕ is redundant with respect to a set of formulae Γ if every model M satisfying all formulae of Γ also satisfies ϕ.

We now show that checking consistency and redundancy in the path-quantified setting can be easily reduced to the standard setting. Let us partition the set Γ into existential formulae ∃Γ and universal formulae ∀Γ, i.e. ∃Γ := Γ ∩ {∃ϕ}, for ϕ quantifier-free, and ∀Γ := Γ \ ∃Γ. Further, let Ψ1∃ := {Φ ∪ {ϕ} | Φ ⊆ ∀Γ, ϕ ∈ ∃Γ}.

Theorem 4.4 (Locating Inconsistent Subsets). In order to locate all smallest inconsistent subsets of Γ it suffices to search among Ψ1∃. Similarly, checking redundancy may correctly be limited to checking among the pairs

⟨ϕ ∈ Γ, Φ ⊆ ∀Γ⟩         if ϕ is universal,
⟨∀ψ, Φ ∈ Ψ1∃ ∪ ∀Γ⟩      if ϕ is existential, ϕ = ∃ψ.

Proof. Let us first consider consistency. Clearly, if Γ only consists of universal formulae, we may simply ignore their quantifiers. In the case of Γ containing existential formulae, we can make the following two observations:

• First observation: two existential formulae are always consistent provided that they are both self-consistent, i.e. neither of them is equal to false. This can be easily seen, as we can always provide a model with exactly two paths, each satisfying one of the formulae. This observation tells us that every minimal inconsistent subset contains at most one existential formula.

• Second observation: let us write Γ∀ to denote the set Γ in which all path quantifiers have been changed to ∀. If the set Γ contains at most one existential formula then the following holds: Γ is consistent iff Γ∀ is consistent. One direction is obvious; the other follows from the fact that we can always provide a single-path model as a proof of consistency of Γ∀.

The result of the two observations is that if we are careful never to consider sets with more than one existential formula, we may freely ignore the path quantifiers.

Let us now consider redundancy. It is clear that the following holds: ϕ is redundant with respect to Γ iff Γ ∪ {¬ϕ} is inconsistent. This means that redundancy checking is easily reduced to consistency checking. This fact holds in the path-quantified setting as well as in the standard one and will be used later in the implementation. Note that the first observation above has the following consequences: if ϕ is a universal formula, then it makes sense to consider Γ with universal formulae only, as ¬ϕ is existential. Furthermore, if ϕ is an existential formula, it makes sense to consider Γ containing at most one existential formula.

From the above theorem it follows that all formulae that require consideration are universal, which is the implicit quantifier of quantifier-free formulae. In other words, the two statements above, whose validity is established by the argumentation in the proof, enable the same algorithms for checking sanity, described in the next section, to be used in the path-quantified setting as well. Before we come to the implementation, let us illustrate the interplay between the concepts discussed so far (model-less consistency, redundancy, and redundancy witnesses) on a more comprehensive example.
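To illustrate the practical consequence of Theorem 4.4: when enumerating candidate subsets for a consistency check, it suffices to combine subsets of ∀Γ with at most one member of ∃Γ. A minimal sketch of such a generator (hypothetical names, quantifiers assumed already stripped from the formulae):

from itertools import combinations

def candidate_subsets(universal, existential):
    """Yield only the subsets that Theorem 4.4 requires us to inspect:
    subsets of the universal formulae, optionally extended by exactly
    one existential formula; sets with two or more existential members
    never need to be generated."""
    for r in range(len(universal) + 1):
        for phi_set in combinations(universal, r):
            if phi_set:
                yield set(phi_set)                 # purely universal subset
            for psi in existential:
                yield set(phi_set) | {psi}         # plus one existential formula

In the worst case this is still exponential in |∀Γ|; the point is only that the existential formulae never multiply the search space among themselves.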

Example 4.5 (Redundancy Witnesses). Consider a set of requirements containing just two items:

1. If a signal a is ever activated, it will remain active indefinitely, i.e. in every time step it holds that if a is active now it will also be active in the next time step. Translating this natural language requirement to LTL we arrive at: ϕ1 = G(a ⇒ X a)

2. If a signal a is ever activated, it will not be active in the next time step, i.e. a must never hold in two successive steps. After translation, the appropriate LTL formula reads: ϕ2 = G(a ⇒ X ¬a)

Note that although such requirements might seem inconsistent at first glance, they are, in fact, consistent, as any consistency checking algorithm could demonstrate. What the following redundancy analysis is about to reveal is that, while consistent, it is not reasonable to require both formulae to hold at the same time.

Let us elaborate and formalise the reasoning step of redundancy checking. We begin by constructing the redundancy witnesses according to the method in [BBDER01]. The redundancy witnesses for ϕ1 are: G(¬a), G X a. The first witness is quite intuitive: if the signal a is never activated in a system then requiring ϕ1 to hold is unreasonable.

Algorithm 1: Consistency Detection

1 while t ← getTask() do
2   if t.checked() then
3     genSuccs(t)
4   else
5     t.checked ← verCons(t)
6     updateSets(t)
7     Pool.enqueue(t)

Algorithm 2: Task Successors

  genSuccs(Task t)
1 if t.con() then
2   if t.dir = ↑ then
3     genSupsets(t)
4 else
5   if t.dir = ↓ then
6     genSubsets(t)

Similarly, if at every time step we have that a will be active in the next step, then clearly the presence of a in this time step is irrelevant. Exactly the same kind of mechanisable reasoning leads to the redundancy witnesses for ϕ2: G(¬a), G X(¬a). We thus create the new requirements as described in the previous section. The set of requirements now contains ∀ϕ1, ∀ϕ2, ∃ F a, ∃ F X(¬a), ∃ F X a. Take for example ∃ F X(¬a), resulting from the negated witness G X a. Being existentially quantified, this requirement demands the existence of at least one computation of the system, one on which a state whose successor does not observe a will eventually occur. Note that we have converted the implicitly quantified formulae into universal ones and ignored the duplicity of the redundancy witnesses. We now check the redundancy of the requirements and discover that the formula ∃ F X a implies ∃ F a; we thus drop ∃ F a from the set. The remaining four formulae are checked for consistency. We find {∀ϕ1, ∀ϕ2, ∃ F X a} as the only minimal inconsistent subset. The conclusion is that although ϕ1 and ϕ2 are together consistent, the only models that satisfy both of them satisfy them redundantly. The two formulae can thus be said to be redundantly consistent.

4.1.1 Implementation of Consistency Checking

Let us henceforth denote one specific instance of consistency (or redundancy) checking as a check. For consistency and a set Γ it means to check, for some γ ⊆ Γ, whether ⋀γ is satisfiable. For redundancy it means, for γ ⊆ Γ and ϕ ∈ Γ, to check whether ⋀γ ⇒ ϕ holds. In the worst case both consistency and redundancy checking would require an exponential number of checks. However, the proposed algorithm considers previous results and only performs the checks that are actually needed.

Both consistency and redundancy checking use three data structures that facilitate the computation. First, there is the queue of verification tasks called Pool; then there are two sets, Con and Incon, which store the consistent and inconsistent combinations found so far. Finally, each individual task contains a set of integers (that uniquely identify formulae from Γ) and a flag value containing three bits for three binary properties: first, whether the satisfaction check was already performed or not; second, whether the combination is consistent; and third, the direction in the subset relation (up or down in the Hasse diagram) in which the algorithm will continue. The successors of a task will be either subsets or supersets of the current combination.

The idea behind consistency checking is very simple (listed as Algorithm 1). The pool contains all the tasks to be performed, and these tasks are of two types: either to check the consistency of a combination or to generate its successors.

Algorithm 3: Task Supersets

  genSupsets(Task t)
1 foreach i ∈ {1, . . . , n} do
2   t.add(i)
3   if ∀X ∈ Con : X ⊉ t ∧
4      ∀X ∈ Incon : X ⊈ t then
5     enqueue(t)
6   t.erase(i)

Algorithm 4: Consistency Check

  verCons(t = ⟨i1, . . . , ij⟩)
1 F ← createConj(ϕi1, . . . , ϕij)
2 A ← transform2BA(F)
3 return findAccCycle(A)

The symmetry of the solution allows for parallel processing (multiple threads performing Algorithm 1 at the same time), given that the data structures are properly protected from race conditions. The pool needs to be initialised with all single-element subsets of Γ and with Γ itself; the subsequent iterations then check the supersets of the former and the subsets of the latter.

Algorithm 2 is called when the task t on the top of Pool is already checked. At this point either all subsets or all supersets of t should be enqueued as tasks. But not all successors need to be inspected: e.g. if t is consistent then also all its subsets are consistent, so no subset of t needs to be checked. That observation is utilised again in Algorithm 3. It does not suffice to stop generating subsets and supersets when the immediate predecessors are found consistent (inconsistent), because the combination to be checked may also have been formed in a different branch of the Hasse diagram of the subset relation. In order to prevent redundant satisfiability checks, the two sets Con and Incon are maintained (see how these are used on line 4 of Algorithm 3).

The actual consistency (and quite similarly also redundancy) checking is less complicated (see Algorithm 4) and may even be delegated to a third-party tool. First, the conjunction of the formulae encoded in the task is created. In our setting we then check the appropriate Büchi automaton for the existence of an accepting cycle, using nested DFS [CVWY92]. Hence Algorithm 4 can easily be substituted with another method for checking consistency of a set of LTL formulae. A realisability checking tool, such as RATSY, could also be applied: one only needs to state that all signals are controlled by the system.

Example 4.6 (Consistency Checking Algorithm). Let us assume that we want to locate all smallest inconsistent subsets of the 4-element set of requirements S = {a, b, c, d}. To better demonstrate the search, let {a, d}, {b, d}, and {c, d} be those smallest subsets. Our algorithm starts by adding the whole set S and all the singleton subsets to the pool of tasks. During the top-down search, S is found inconsistent and thus all its subsets are added. {a, b, c} is consistent and thus all its subsets are removed from the pool, namely the singletons {a}, {b}, and {c}. During the bottom-up search, {d} is found consistent but its superset {c, d} is not, and thus both {b, c, d} and {a, c, d} are removed from the pool. The situation is illustrated in Figure 4.1.
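A sequential rendering of Algorithms 1–3 may help to see the interplay of the data structures; the sketch below assumes a user-supplied satisfiability test is_consistent (e.g. Algorithm 4) and elides the parallel, lock-protected pool of the actual implementation:

def smallest_inconsistent_subsets(formulae, is_consistent):
    """Bidirectional search of the subset lattice (cf. Algorithms 1-3)."""
    n = len(formulae)
    con, incon = set(), set()                  # decided combinations
    # seed the pool: all singletons search upwards, the full set downwards
    pool = [(frozenset({i}), 'up') for i in range(n)]
    pool.append((frozenset(range(n)), 'down'))
    while pool:
        task, direction = pool.pop(0)
        # skip tasks already decided through a subset/superset relation
        if any(task <= c for c in con) or any(i <= task for i in incon):
            continue
        if is_consistent([formulae[i] for i in task]):
            con.add(task)
            if direction == 'up':              # supersets may be inconsistent
                pool += [(task | {i}, 'up') for i in range(n) if i not in task]
        else:
            incon.add(task)
            if direction == 'down':            # subsets may pinpoint the source
                pool += [(task - {i}, 'down') for i in task if len(task) > 1]
    # keep only the minimal inconsistent combinations
    return [s for s in incon if not any(t < s for t in incon)]

On the set of Example 4.6 this search decides the lightly shaded regions of Figure 4.1 without ever invoking is_consistent on them.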

Implementation of Redundancy Checking

The only difference when performing the redundancy checking is that the task t consists of a list ⟨i1, . . . , ij⟩, which can be empty, and one index ik.

[Figure: the Hasse diagram of all subsets of S = {a, b, c, d}, from the four singletons through the pairs and triples up to S itself.]

Figure 4.1: Illustration of the consistency checking algorithm. Red denotes the inconsistent subsets and green the consistent ones. The lightly shaded regions mark subsets which the algorithm decided without checking their consistency.

Since the task is to decide whether ϕi1 ∧ ... ∧ ϕij ⇒ ϕik, line 1 of Algorithm 4 needs to be altered to:

1 F ← createConj(ϕi1, . . . , ϕij, ¬ϕik)

This is due to the fact discussed in the previous section, namely that the satisfiability of ϕi1 ∧ ... ∧ ϕij ∧ ¬ϕik implies ϕi1 ∧ ... ∧ ϕij ⇏ ϕik, i.e. ϕik is not redundant with respect to {ϕi1, . . . , ϕij}.

Discarding the Büchi automata in every iteration may seem unnecessarily wasteful, especially since the synchronous composition of two (or more) automata is a feasible operation. However, the size of an automaton created by composition is the product of the sizes of the automata being composed. Furthermore, it would not be possible to use the size-optimising techniques employed in the LTL to Büchi translation. These techniques work particularly well in our case, because the translated formulae (conjunctions of requirements) have a relatively small nesting depth (the maximal depth among the requirements plus one).
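The redundancy check thus reduces to a single satisfiability query; in the notation of the earlier sketch (is_satisfiable again standing for the Büchi-automaton emptiness test of Algorithm 4, a conjunction being implied over the list of formulae):

def is_redundant(phi, gamma, is_satisfiable):
    """phi is implied by gamma iff the conjunction of gamma with the
    negation of phi has no model (cf. the altered line 1 above)."""
    return not is_satisfiable(list(gamma) + [('not', phi)])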

Example 4.7 (Smallest Inconsistent Subset). We will demonstrate the search for smallest inconsistent subsets on the following set of requirements.

1. A request to increase the cabin temperature a may always be issued and each such request will eventually be granted by heating unit b. ϕ1 = G(F a ∧ (a ⇒ F b))

2. There is only a finite amount of fuel for the heating unit. ϕ2 = F G(¬b)

3. The system must be robust to handle an overriding command c, after which no request to increase the temperature may be issued. ϕ3 = F(c ∧ c ⇒ G ¬a)

4. There will be no request to increase the temperature initially, but every heating will be preceded by a request which must be turned off after one time unit once the heating starts. Also the request is always issued until the heating starts.

ϕ4 = G(a U b ∧ X b ⇒ (a ∧ X a ∧ X X ¬a)) ∧ ¬a

Our consistency checking algorithm begins by checking in parallel the individual requirements ϕ1, . . . , ϕ4 for consistency and also the whole set Γ = {ϕ1, . . . , ϕ4}. It discovers that ϕ4 is inconsistent on its own and that Γ is also inconsistent.

The fact that ϕ4 is inconsistent is exploited by not considering any of the sets containing ϕ4, i.e. {ϕ1, ϕ4}, {ϕ2, ϕ4}, {ϕ3, ϕ4}, {ϕ1, ϕ2, ϕ4}, {ϕ1, ϕ3, ϕ4}, and {ϕ2, ϕ3, ϕ4}. Next, the algorithm checks the supersets of the largest consistent sets, i.e. {ϕ1, ϕ2}, {ϕ1, ϕ3}, and {ϕ2, ϕ3}. Discovering that both {ϕ1, ϕ2} and {ϕ1, ϕ3} are inconsistent, it terminates, since {ϕ1, ϕ2, ϕ3} is necessarily also inconsistent. The smallest inconsistent subsets are thus {ϕ4}, {ϕ1, ϕ2}, and {ϕ1, ϕ3}.

4.2 Completeness of Requirements

Let us assume that the user specifies three types of requirements: environmental assumptions ΓA, required behaviour ΓR and forbidden behaviour ΓF. The environmental assumptions represent the sensible properties of the world a given system is to work in, e.g. "The plane is either in the air or on the ground, but never in both these states at once", which in LTL would read G(ground ⇔ ¬air). The required behaviour represents the functional requirements imposed on the system: the system will not be correct if any of these is not satisfied, e.g. "If the plane is ever to fly it must first be tested on the ground": F air ⇒ (ground ∧ tested). Dually, the forbidden behaviour contains those patterns that the system must not display, e.g. "Once tested the plane should be put into the air within the next three time steps": ¬tested U (air ∨ X air ∨ X X air).

Theory of Automata Coverage

Assume henceforth the following simplifying notation for Büchi automata: let f be a propositional formula over capital Latin letters A, B, . . ., barred letters Ā, B̄, . . ., and Greek letters α, β, . . ., where A substitutes ⋀_{γ∈ΓA} γ, Ā stands for ⋁_{γ∈ΓA} γ, and all Greek letters represent simple LTL formulae. Then A_f denotes the Büchi automaton that accepts all words satisfying the substituted f, e.g. A_{Ā∨B∧ϕ} accepts the words satisfying ⋁_{γ∈ΓA} γ ∨ ⋀_{γ∈ΓB} γ ∧ ϕ. The automaton A_A thus describes the part of the state space the user is interested in, which the required and forbidden behaviour should together subsume. That is most commonly not the case with freshly elicited requirements, and therefore the problem is the following: find sufficiently simple formulae over AP that would, together with the formulae for R and F, cover a large portion of A_A; in other words, to find such ϕ that A_{R∨F̄∨ϕ} covers as much of A_A as possible.

In order to evaluate the size of the part of A_A covered by a single formula, i.e. how much of the possible behaviour is described by it, an evaluation methodology for Büchi automata needs to be established. The plain enumeration of all possible words accepted by an automaton is impractical, given the fact that Büchi automata operate over infinite words. Similarly, the standard completeness metrics based on state coverage [CKV01, TK01] are unsuitable, because they do not allow for comparison of sets of formulae and they require the underlying model. Equally inappropriate is to inspect only the underlying directed graph, because Büchi automata for different formulae may have isomorphic underlying graphs. The methodology proposed in this thesis is based on the notions of almost-simple paths and a directed partial coverage function.

Definition 4.8 (Almost-Simple Path). Let G be a directed graph. A path π in G is a sequence of vertices v1, . . . , vn such that ∀i : (vi, vi+1) is an edge in G. A path is almost-simple if no vertex appears on the path more than twice. The notion of almost-simplicity is also applicable to words in the case of Büchi automata.

With almost-simple paths one can enumerate the behavioural patterns of a Büchi automaton without having to handle infinite paths. Clearly, it is a heuristic approach and a considerable amount of information will be lost, but since all simple cycles will be considered (and thus all patterns of the infinite behaviour), the resulting evaluation should provide sufficient distinguishing capacity (as demonstrated in Section 4.3.2).

Knowing which paths are interesting, it is possible to propose a methodology that allows comparing two paths. There is, however, a difference between Büchi automata that represent a computer system and those built only from LTL formulae (ones that should restrict the behaviour of the system). The latter automata use a different evaluation function ν̂ that assigns to every edge a set of literals. The reason behind this is that the LTL-based automaton only allows those edges in the system automaton whose source vertex has an evaluation compatible with the edge evaluation in the LTL automaton.

Definition 4.9 (Path Coverage). Let A1 and A2 be two (LTL) Büchi automata over AP and let AP_L be the set of literals over AP. The directed partial coverage function Λ assigns to every pair of edge evaluations a rational number between 0 and 1, Λ : 2^{AP_L} × 2^{AP_L} → Q. The evaluation works as follows:

Λ(A1 = {l11, . . . , l1n}, A2 = {l21, . . . , l2m}) =
    0      if ∃i, j : l1i ≡ ¬l2j
    p/m    otherwise, where p = |A1 ∩ A2|.

From this definition one can observe that Λ is not symmetric. This is intentional, because the goal is to evaluate how much a path in A2 covers a path in A1. Hence the fact that there are some additional restricting literals on an edge of A1 does not prevent automaton A2 from displaying the required behaviour (the one observed in A1). The extension of coverage from edges to paths and to automata is based on averaging over almost-simple paths. An almost-simple path π2 of automaton A2 covers an almost-simple path π1 of automaton A1 by

Λ(π1, π2) = (Σ_{i=0}^{n} Λ(A1i, A2i)) / n

per cent, where n is the number of edges and Aji is the set of labels on the i-th edge of πj. The automaton A2 then covers A1 by

Λ(A1, A2) = (Σ_{i=0}^{m} max_{π2i} Λ(π1i, π2i)) / m

per cent, where m is the number of almost-simple paths of A1 that end in an accepting vertex. It follows that a coverage of 100 per cent occurs when for every almost-simple path of one automaton there is an almost-simple path in the other automaton that exhibits similar behaviour.
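Under this definition, the edge- and path-level computations are only a few lines; the following sketch (ours) encodes a literal as a (proposition, polarity) pair and reproduces the arithmetic of Example 4.10 below:

from fractions import Fraction

def edge_coverage(a1, a2):
    """Directed partial coverage of edge label a1 by edge label a2
    (both sets of literals): 0 on contradicting literals, otherwise
    |a1 & a2| / |a2|, cf. Definition 4.9."""
    if any((p, not v) in a2 for (p, v) in a1):      # some l1i equals the negation of some l2j
        return Fraction(0)
    return Fraction(len(a1 & a2), len(a2))

def path_coverage(pi1, pi2):
    """Average edge coverage along two equally long sequences of labels."""
    return sum(edge_coverage(e1, e2) for e1, e2 in zip(pi1, pi2)) / len(pi1)

# With a = ('a', True), na = ('a', False), etc., the path pi of Example 4.10
# matched against the corresponding path of Ac yields (1/2 + 1/2 + 1 + 1)/4 = 3/4.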

Implementation of Completeness Evaluation

The high-level overview of the implementation of the proposed methodology is based on the partial coverage of the almost-simple paths of A_A: in other words, on finding the most suitable path in A_{R∨F̄∨ϕ} for every almost-simple path in A_A, where ϕ is a sensible simple LTL formula proposed as a candidate for completion. The suitability is assessed as the average of the partial coverage over all edges on the path. The output of such a procedure is a sequence of candidate formulae, each associated with the estimated coverage (a number between 0 and 1) that the addition of this particular formula would entail. The candidate formulae are added in a sequence, so that the best unused formula is selected in every round. Finally, the coverage is always related to A_A; thus, if some behaviour that cannot be observed in A_A is added with a candidate formula, this addition will neither improve nor degrade the coverage.

Example 4.10 (Automata Coverage). This example only shows the enumeration of almost-simple paths and the partial coverage of two paths. What remains of the complete methodology will be shown more structurally in Algorithm 5. Enumerating all almost-simple paths of Aa:

[Figure: the six-vertex graph Aa; from the enumerated paths below, its edges are 1→2, 2→3, 3→4, 3→5, 4→2 and 5→6.]

1. ⟨1, 2, 3, 5⟩

2. ⟨1, 2, 3, 4, 2, 3, 5⟩

3. ⟨1, 2, 3, 5, 6⟩

4. ⟨1, 2, 3, 4, 2, 3, 5, 6⟩.

Let us assume that

[Figure: the automaton Ab, with edges labelled a; a, b; ¬a, b; c, a; a]

is the original automaton and

[Figure: the automaton Ac, with edges labelled a; a, d; ¬c, b; c]

is being evaluated for how thoroughly it covers Ab. There are 4 almost-simple paths in Ab; one of them is π = ⟨a; ¬a, b; c, a; a⟩. The partial coverage between the first edge of π and the first edge of Ac (there is only one possibility) is 0.5, since there is the excessive literal d. The coverage between the second edges is also 0.5, but only because of ¬c in Ac; the superfluous literal ¬a restricts only the behaviour of Ab. Finally, the average similarity between π and the respective path in Ac is 0.75 and it is approximately 0.7 between the two automata.

The topmost level of the completeness evaluation methodology is shown as Algorithm 5. As input, this function requires the three sets of user-defined requirements, the set of candidate formulae and the number of formulae the algorithm needs to select. On lines 1 and 2 the formulae for the conjunction of assumptions and of user requirements (both required and forbidden) are created. They will be used later to form larger formulae to be translated into Büchi automata and evaluated for completeness but, for now, they need to be kept separate. The next step is to enumerate the almost-simple paths of A_A for later comparison, i.e. a baseline state space that the formulae from ΓCand should cover.

The rest of the algorithm forms a cycle that iteratively evaluates all candidates from ΓCand (see line 8 where the corresponding formula is being formed). Among the candidate formulae

Algorithm 5: Completeness Evaluation

Input: ΓA, ΓR, ΓF, ΓCand, n
Output: Best coverage for 1 . . . n formulae from ΓCand

1  γAssum ← ⋀_{γ∈ΓA} γ
2  γDesc ← ⋀_{γ∈ΓR} γ ∨ ⋁_{γ∈ΓF} γ
3  A ← transform2BA(γAssum)
4  pathsBA ← enumeratePaths(A)
5  for i = 1 . . . n do
6    max ← −∞
7    foreach γ ∈ ΓCand do
8      γTest ← γDesc ∨ γ
9      A ← transform2BA(γTest)
10     cur ← avrPathCov(A, pathsBA)
11     if max < cur then
12       max ← cur
13       γMax ← γ
14   print("Best coverage in i-th round is max.")
15   γDesc ← γDesc ∨ γMax
16   ΓCand ← ΓCand \ {γMax}

the one with the best coverage of the paths is selected and subsequently added to the covering system. The functions enumeratePaths and avrPathCov are similar extensions of the BFS algorithm. Unlike BFS, however, they do not keep the set of visited vertices, in order to allow state revisiting (twice in the case of enumeratePaths and an arbitrary number of times in the case of avrPathCov). The avrPathCov search is executed once for every path it receives as input and stops after inspecting all paths up to the length of the input path, or earlier if the current search path is incompatible (see Definition 4.9).
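A sketch of enumeratePaths under Definition 4.8 – a depth-first variant that tolerates up to two visits per vertex and records every path ending in an accepting vertex (the names and the graph encoding are ours):

def enumerate_almost_simple_paths(edges, initial, accepting):
    """All almost-simple paths from `initial` that end in an accepting
    vertex; `edges` maps a vertex to the list of its successors."""
    results = []

    def extend(path, counts):
        if path[-1] in accepting:
            results.append(list(path))
        for t in edges.get(path[-1], ()):
            if counts.get(t, 0) < 2:           # allow at most a second visit
                counts[t] = counts.get(t, 0) + 1
                path.append(t)
                extend(path, counts)
                path.pop()
                counts[t] -= 1

    extend([initial], {initial: 1})
    return results

On the graph of Example 4.10, with 5 and 6 taken as the accepting vertices, this returns exactly the four paths enumerated there.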

4.3 Sanity Checking Experiments

All three sanity checking algorithms were implemented as an extension of the parallel explicit-state LTL model checker DIVINE [BBČR10]. From the many facilities offered by this tool, only the LTL to Büchi translation was used. Like the original tool, this sanity checking extension was implemented using parallel computation.

4.3.1 Generated Requirements

The first set of experiments was conducted on randomly generated LTL formulae. The mo- tivation behind using random formulae is to demonstrate the reduction in the number of consistency checks. A naive algorithm for detecting inconsistent/redundant subsets requires an exponential number of such checks. Generating a large number of requirement collections

48 with varying relations among individual requirements allows a more representative results than a few, real-world collections. In order for the experiments to be as realistic as possible (avoiding trivial or exceedingly complex formulae) requirements with various nesting depths were generated. Nesting depth denotes the depth of the syntactic tree of a formula. Statistics about the most common formulae show, e.g. in [DAC98], that the nesting is rarely higher than 5 and is 3 on average. Following these observations, the generating algorithm takes as input the desired number n of formulae and produces: n/10 formulae of nesting 5, 9n/60 of nesting 1, n/6 of nesting 4, n/4 of nesting 2 and n/3 of nesting 3. Finally, the number of atomic propositions is also chosen according to n (it is n/3) so that the formulae would all contribute to the same state space. All experiments were run on a dedicated workstation with quad core Intel Xeon 5130 @ 2GHz and 16GB RAM. The codes were compiled with optimisation options -O2 using GCC version 4.3.2. Since the running times and even the number of checks needed for completion of all proposed algorithms differ for every set of formulae, the experiments were ran multiple times. The sensible number of formulae starts at 8: for less formulae the running time is negligible. Experimental tests for consistency and redundancy were executed for up to 15 formulae and for each number the experiment was repeated 25 times. Figures 4.2a and 4.3a summarise the running times for sanity checking experiments. For every set of experiments (on the same number of formulae) there is one box capturing median, extremes and the quartiles for that set of experiments. From the figure it is clear that despite the optimisation techniques employed in the algorithm both median and maximal running times increase exponentially with the number of formulae. On the other hand there are some cases for which presented optimisations prevented the exponential blow-up as is observable from the minimal running times. Figures 4.2b and 4.3b illustrate the discrepancy between the number of combinations of formulae and the number of sanity checks that were actually performed. The number of combinations for n formulae is n∗2n−1 but the optimisation often led to much smaller number. As one can see from the experiments on 9 formulae, it is potentially necessary to check almost all the combinations but the method proposed in this thesis requires on average less than 10 per cent of the checks and the relative number decreases with the number of formulae. The actual running times, as reported in Figure 4.2a, depend crucially on the chosen method for consistency checking: automata-based Algorithm4 in our case. Our algorithm for enumerating all smallest inconsistent subsets is robust with respect to the consistency checking algorithm used. Hence any other consistency (or realisability) tool could be used, but that would only affect the actual running times, but not the number of checks, reported in Figure 4.3b. Finally, Figure 4.4 shows how much time did our sanity checking methods require to perform a single check. Since requirement document may contain a larger number of indi- vidual requirements than those we have experimented with, the reader might appreciate the progression of time complexity as depicted in the figure.

4.3.2 Case Study: Aeroplane Control System In order to demonstrate the full potential of the proposed sanity checking, we apply all the steps in a sequence. First, the whole requirements document is checked for consistency and redundancy and only the consistent and non-redundant subset is then evaluated with respect to its completeness. A sensible exposition of the effectiveness of completeness evaluation

[Figure: (a) log-plot of time (ms) against the number of formulae (7–16); (b) log-plot of the relative number of checks performed against the number of formulae (7–16).]

Figure 4.2: Results for consistency checking of randomly generated formulae.

[Figure: (a) log-plot of time (ms) against the number of formulae (7–16); (b) log-plot of the relative number of checks performed against the number of formulae (7–16).]

Figure 4.3: Results for redundancy checking of randomly generated formulae.

[Figure: log-plot of time (ms) per check against the number of formulae (7–16), with one series for consistency and one for redundancy.]

Figure 4.4: Log-plot summarising the relative time complexity of sanity checking.

proposed here requires a more elaborate approach than using random formulae as the input requirements document.

For that purpose a case study has been devised that demonstrates the capacity to assess the coverage of requirements and to recommend suitable coverage-improving LTL formulae. The requirements document was written by a requirements engineer whose task was to evaluate the method, and we only use random formulae as candidates for coverage improvement. The candidate formulae were built from the atomic propositions that appeared in the input requirements, and only very simple generated formulae were selected. It was also required that the candidates do not form an inconsistent or tautological set. Alternatively, pattern-based [DAC98] formulae could be used as candidates. The methodology is general enough to allow semi-random candidates generated using patterns from the input formulae. For example, if an input formula is an implication, its antecedent may not be required by the other formulae, which may not be according to the user's expectations.

The case study attempts to propose a system of LTL formulae that should control the flight, and more specifically the landing, of an aeroplane. The requirements are divided into three categories, as in the text above. Below, we provide each requirement first as an LTL formula and then as its intuitive meaning. The greyed formulae were found either redundant or inconsistent and were not included in the completeness evaluation.

Atomic Propositions:
a ≡ [height = 0], b ≡ [speed ≤ 200], l ≡ [landing], u ≡ [undercarriage], c ≡ [speed ≤ 100]

LTL Requirements:
A1 : G(a ⇔ b)               R1 : G(l ⇒ F(G b ∧ F G c))
A2 : F l ∧ G(l ⇒ F a)       R2 : G(l ⇒ F(b U c))
A3 : G(¬l ⇒ ¬b)             R3 : G(¬b ⇒ ¬u)
A4 : G(u ⇒ F a ∧ u ⇒ c)     R4 : F(l U(u U c))
A5 : G(¬l ⇒ ¬a)             R5 : F(l ∧ G(¬c ∨ F(¬b)))
A6 : G(¬b ∨ F a)            R6 : F(¬b ∧ ¬u ∧ X(¬b ∧ u) ∧ X X(b ∧ u))
                            F1 : F(a ∧ G(¬a))

Environment Assumptions

A1 The plane is on the ground only if its speed is less than or equal to 200 mph.

A2 The plane will eventually land and every landing will result in the plane being on the ground.

A3 If the plane is not landing its speed is above 200 mph.

A4 Whenever the undercarriage is down, first the speed must be at most 100 mph and second the plane must eventually land on the ground.

A5 Whenever the plane is not landing its height above ground is not zero.

A6 At any time during the flight it holds that either the plane is flying faster than 200 mph or it eventually touches the ground.

Required Behaviour

R1 The landing process entails that eventually: (1) the speed will remain below 200 mph indefinitely and (2) after some finite time the speed will also remain below 100 mph.

R2 The landing process also entails that the plane should eventually slow down from 200 mph to 100 mph (during which the speed never goes above 200 mph).

R3 Whenever flying faster than 200 mph, the undercarriage must be retracted.

R4 At some point in the future, the plane should commence landing, then extend the undercarriage, until finally the speed decreases below 100 mph.

R5 In the future the plane will be decommissioned: it will land and then either never fly faster than 100 mph or always slow down below 200 mph.

R6 At some point in the future, the speed will be below 200 mph and the undercarriage extended. Immediately before that, while the speed is still above 200 mph, the undercarriage is extended over two consecutive steps.

Forbidden Behaviour

F1 The plane will eventually be on the ground after which it will never take off again.

Consistency The three categories of requirements are checked for consistency and redundancy separately. The environmental assumptions form a consistent set. The set of required behaviour contained two smallest inconsistent subsets. The requirement R4 is inconsistent with R6; we chose R4 as the more important one and excluded R6 from the collection. The second inconsistent subset was {R1, R2, R5}. There we chose to preserve R1 and R2, leaving out the possibility to decommission the plane.

Redundancy After regaining consistency of the required behaviour, our sanity checking procedure detects no redundancy in this set. On the other hand there are two redundant environmental assumptions. The assumption A5 is implied by the conjunction of A1 and A3 and is thus redundant. It also holds that any environment in which A2 and A3 are true also satisfies A6. Hence both A5 and A6 can be removed without changing the set of sensible behaviour in the environment.

Completeness Initially, the required and forbidden behaviour together do not cover the environment assumptions at all: there are simply so many possible paths allowed by the assumptions but not considered in the requirements that our metric evaluates the coverage to 0. There is thus considerable space for improvement. The completeness-improving process we are about to describe begins by generating a set of candidate requirements, among which those with the best resulting coverage are selected. Table 4.1 summarises the process.

The first formula selected by Algorithm 5, leading to a coverage of 9.1 per cent, was the simple ϕ1 = G(¬l): "The plane will never land". Not particularly interesting per se, it nonetheless emphasises the fact that without this requirement, landing was never required. The formula ϕ1 should thus be included among the forbidden behaviour. Unlike ϕ1, the formula generated second, F(¬b ∧ ¬l), "At some point the plane will fly faster than 200 mph and will not be in the process of landing", should be included among the required behaviour. This formula points out that the required behaviour only specifies what should happen after landing, unlike the assumptions, which also require flight. The third and final formula was F(¬l U(a ∨ b)): "Eventually it will hold that the landing does not commence until either (1) the plane is already on the ground or (2) the speed has decreased below 200 mph". This last formula connects flight and landing and its addition (among the required behaviour) entails a coverage of 54.9 per cent.

There are two conclusions one can draw from this case study with respect to the completeness part of sanity checking. First, although the requirements are generated automatically, they are relevant and already formalised, thus amenable to further testing and verification. Second, the new requirements can improve the insight of the engineer into the problem. Both of these properties are valuable on their own, even though the method did not finish with 100 per cent coverage. The method produced further coverage-improving formulae, but even the best formulae improved the coverage only negligibly, and thus the engineer decided to stop the process.

Table 4.1: Iterations of the completeness evaluation algorithm for the Aeroplane Control System.

iteration   suggested formula       resulting system           coverage of A_A by A_n
1           ϕ1 = G(¬l)              A1 = A_{R∨F∨ϕ1}            9.1%
2           ϕ2 = F(¬b ∧ ¬l)         A2 = A_{R∨F∨ϕ1∨ϕ2}         39.4%
3           ϕ3 = F(¬l U(a ∨ b))     A3 = A_{R∨F∨ϕ1∨ϕ2∨ϕ3}      54.9%

4.3.3 Industrial Evaluation

As a further extension to the original paper [BBBČ12], we have evaluated our consistency and redundancy checking method on real-life sets of requirements obtained in cooperation with Honeywell International. The case study consisted of four Simulink models, each associated with a set of formalised requirements in the form of LTL formulae. The formalisation of the requirements, originally given in natural language, was done with the help of the ForReq tool. The tool allows the user to write formal requirements with the help of the specification patterns [DAC98] and some of their extensions; see [BBB+12] for a detailed description of the tool.

Out of the four models, one (Lift) was a toy model used in Honeywell to evaluate the capabilities of ForReq; the other three (VoterCore, AGS, pbit) were models of real-world avionics subsystems. As the focus of this work is model-less sanity checking, we ignored the models themselves and only considered the formalised requirements. The results of our experiments are summarised in Table 4.2. In all four cases, the whole set of requirements was consistent. As for redundancy, we identified two cases where a formula was implied by another one; these are shown in the redundancy result column. The numbers in the mem(MB) columns represent the maximal (peak) amount of memory used.

Table 4.2: Results of consistency and redundancy checking on an industrial case study.

model       no. of     consistency                    redundancy
name        formulae   result  time(s)    mem(MB)     result     time(s)   mem(MB)
VoterCore   3          OK      0.1        21.6        OK         0.4       22.4
AGS         5          OK      0.1        23.5        ϕ4 ⇒ ϕ2    0.7       23.6
Lift        10         OK      73.6       58.5        ϕ1 ⇒ ϕ5    9460.5    1474.5
pbit        10         OK      210879.0   22649.0     —          —         —

The experiments, with the exception of the last one, were run on a dedicated Linux workstation with an Intel Xeon E5420 @ 2.5 GHz and 8 GB RAM. The last experiment (pbit) was run on a Linux server with an Intel X7560 @ 2.27 GHz and 450 GB RAM. For pbit we only performed the consistency check, as the redundancy checking took too much time. The reason for such great consumption of time (and memory) is that the formulae in the pbit set of requirements are rather complex, some of them even using a precise specification of time. For more details about the timed extension of LTL and its translation to standard LTL, see [BBB+12].

4.4 Closing Remarks

Our work on sanity checking further expands the incorporation of formal methods into software development. Aiming specifically at the requirements stage, we propose a novel approach to sanity checking of requirements formalised as LTL formulae. Our approach is comprehensive in scope, integrating consistency, redundancy and completeness checking to allow the user (or a requirements engineer) to produce a high-quality set of requirements more easily. The novelty of our consistency (redundancy) checking is that it produces all inconsistent (redundant) sets instead of a yes/no answer, and its efficiency is demonstrated in an experimental evaluation.

Finally, the completeness checking presents a new behaviour-based coverage and suggests formulae that would improve the coverage and, consequently, the rigour of the final requirements.

Our sanity checking framework could be of great use especially in the earliest stages of requirements elicitation, when the engineers have to work with highly incomplete, and often contradictory, sets of requirements. Other realisability checking tools can also be used (and with better running times) on the nearly final set of requirements, but our process uniquely targets these crucial early stages. First, finding all inconsistent subsets instead of merely one helps to more accurately pinpoint the source of inconsistency, which may not have been elicited yet. Second, once working with a consistent set, the requirements engineer gets further feedback from the redundancy analysis, with every minimal redundancy witness pointing to a potential lack of understanding of the system being designed. Finally, the semi-interactive, coverage-improving generation of new requirements attempts to partly automate the process of making the set of requirements detailed enough to prevent later development stages from misinterpreting a less thoroughly covered aspect of the system. We demonstrate this potential application on case studies of avionics systems.

The above described addition to the standard elicitation process may seem intrusive, but the feedback we have gathered from requirements engineers at Honeywell was mostly positive. Especially given the tool support for each individual step of the process – pattern-based automatic translation from natural language requirements to LTL and the subsequent fully- and semi-automated techniques for sanity checking – a fast adoption of the techniques was observed. Many of the engineers reported a non-trivial initial investment to master the basics of LTL, but in most cases the resulting overall savings in effort outweighed the initial adoption effort. After the first week, the overhead of tool-supported formalisation into LTL was less than 5 minutes per requirement, while the average time needed for the corresponding stages of requirements elicitation decreased by 18 per cent.

We have also accumulated a considerable amount of experience from employing the sanity checking on real-world sets of requirements during our cooperation with Honeywell International. Of particular importance was the realisation that the time requirements of our original implementation effectively prevented the use of sanity checking on an industrial scale. In order to ameliorate the poor timing results we have reimplemented the sanity checking algorithms using the state-of-the-art LTL to Büchi translator SPOT, which accelerated the process by several orders of magnitude on larger sets of formulae. Another important observation was that it is desirable to translate the concept of model-based redundancy into the model-free environment more directly. We thus extend the sanity checking process with the capacity to generate redundancy witnesses (using the formally described theory of path-quantified formulae). Finally, we summarise these and other observations in a case study consisting of four sets of requirements gathered during the development of industry-scale systems.

One direction of future research is the pattern-based candidate selection mentioned above.
Even though the selected candidates were relatively sensible in the presented experiments, using random formulae can produce useless results. Finally, experimental evaluation on real-life requirements and subsequent incorporation into a toolchain facilitating model-based development are the long-term goals of the presented research. Our exposition of requirements completeness also lacks a formal definition of total coverage (which the proposed partial coverage merely approximates). We intend to formulate an appropriate definition using a uniform probability distribution; such a definition would also allow computing total coverage without approximation and would not be biased by a concrete LTL to BA translation.

56 mation and would not be biased by concrete LTL to BA translation. That solution, however, is not very practical since the underlying automata translation is doubly exponential.

4.4.1 Alternative LTL to Büchi Translation

Although the running time experiments were one of our priorities, their purpose was to investigate the effects of the proposed optimisations rather than the concrete time measurements. In other words, the subject of our investigation was the asymptotic behaviour of the algorithms and how effectively they scale with the number of formulae, on average and in the extremes. Given our recent cooperation with Honeywell and their interest in checking the sanity of real-world, industry-level sets of requirements, the actual running times became of particular interest as well. Initial experiments revealed, however, that on larger sets of requirements, and especially when more complex individual requirements were involved, the proposed sanity checking algorithms were unable to cope with the computational complexity. The bottleneck proved to be the algorithm translating LTL formulae – conjunctions of the original requirements – into Büchi automata. The translator incorporated in DiVinE is sufficiently fast on the small formulae that commonly occur in software verification but was never intended to be used with larger conjunctions. Thus, to enable checking the sanity of real-world sets of requirements, we have reimplemented the algorithms to employ the state-of-the-art LTL to Büchi translator SPOT [DL11] in a tool called Looney¹. As the case study summarising our cooperation with Honeywell in Section 4.3 shows, Looney is able to process much larger formulae than the original tool and also to incorporate a larger number of formulae into the sanity checking process.

The reimplementation of consistency checking was relatively straightforward since SPOT can be interfaced via a shared library with an arbitrary C++ application. Even though this forced us to use the internal representation of formulae from SPOT, which was different from the one we had before, SPOT offers an adequate set of formula-manipulating operations that allowed the translation of our algorithms without their needing to be considerably modified. The reimplementation of the coverage algorithms was complicated by the fact that the output of SPOT is a transition-based Büchi automaton, whereas the algorithm of Section 4.1 expects state-based automata: the difference is that transitions rather than states are denoted as accepting. This shift requires a modification of the coverage evaluation, more precisely of the definition Λ(A_1, A_2) of the coverage among automata. Before, the number m in the denominator referred to the number of almost-simple paths of A_1 that ended in an accepting vertex (state). We propose to redefine m as the number of almost-simple paths that end in an accepting edge (transition). It is clear that the coverage reported by the original and the new metrics would differ: first, because the automata are generated using different methods and may thus be structurally different and, second, because the redefined metric considers different sets of almost-simple paths. With the alternative definition of m one can easily incorporate translators that produce transition-based automata (though parsing these automata may still require some degree of modification due to incompatibility of the internal structures representing the automata).

¹ As we are checking the sanity of requirements, the translator helps us to spot the looney ones. The tool is available at http://anna.fi.muni.cz/~xbauch/code.html#sanity.

Therefore, in order to reestablish the admissibility of the new evaluation as an adequate coverage metric, we have repeated the experiment with the aeroplane control system. As was to be expected given the above argumentation, the concrete evaluation of individual formulae differed from our original experiments. Specifically, the formula generated first (G(¬l)) was evaluated with coverage 12.9 per cent (instead of 9.1); the second formula F(¬b ∧ ¬l) with 35.0 (instead of 39.4); and the third formula F(¬l U (a ∨ b)) with 61.1 (instead of 54.9). Hence there are two observations to be made. First, even though the coverage numbers differ for individual formulae, the same triple was produced by the modified algorithm. Moreover, the three formulae were produced in the same order, thus the new metric behaves appropriately even when a new set of requirements is obtained as a disjunction of the old set with the generated coverage-improving formula.

Chapter 5

Control Explicit—Data Symbolic Model Checking

Given the results of the previous chapter, we can now safely assume that a sensible collection of requirements is present. This chapter establishes a verification method that builds upon that assumption, verifying that a model of the software under development has certain properties specified as LTL formulae. For easier exposition we will assume a mathematical model of the software and explain how to obtain such a model during various development stages in forthcoming chapters. As will then become clear, the pivotal contribution of our efforts, the control explicit—data symbolic model checking, is general enough to accept both software designs and software code as its input.

The thesis on which the program verification part of this work is based can be summarised as follows: explicit-state model checking may be seamlessly extended to handle the control part of programs explicitly while substituting a symbolic representation for the data part. Not all symbolic representations are appropriate for the data part, and indeed not all forms of data are appropriate for LTL model checking. The reason is that most symbolic representations were created to support checking safety properties. Yet LTL model checking requires exact state comparison, while checking safety properties does not. This chapter introduces one possible form of symbolic data, presented as a state space reduction technique. This representation will later prove adequate for arguing about correctness and for deriving the requirements on an appropriate representation that would implement this reduction.

5.1 Set-Based Reduction

The proposed method of handling the data flow consists of clustering together those variable evaluations that are connected to the same control state. This way, most aspects of explicit-state model checking can be preserved, provided that one can efficiently store these sets of evaluations. More precisely, we want to propose a reduction that modifies the state space in such a way that unmodified model checking of the reduced space yields exactly the same result as it would for the unreduced state space. That clearly entails the ability to efficiently decide whether a state has already been seen, using a fast storage data structure and a state matching decision procedure.

Unlike the explicit CFG, which represents the unreduced state space, the set-based reduction does not include data explicitly within states. To simplify further argumentation, we represent the reduction by introducing a new variable data, such that

sort(data) = {(v_1, …, v_n) | v_i ∈ sort(x_i)}.

Definition 5.1 (Set-Reduced CFG). The set-reduced CFG G = (S, s_0, T, A) is similar to the explicit CFG in limiting the transition labels to ⊤ but differs in that S contains the evaluations of only two variables, pc and data. The generation from a CFG P = (L, l_0, 𝒯, F) is defined as follows:

• S = {s : {pc, data} → L ∪ 2^sort(data) | s(pc) ∈ L, s(data) ⊆ sort(data)}

• s_0 = {(pc, l_0), (data, {0})}

• T = {s → s' | s(pc) −ρ→ s'(pc), ∃(v ∈ s(data)), ⊨_SAT ⋀ x_i = v_i ∧ ρ,
  s'(data) = {v' ∈ sort(data) | ∃(v ∈ s(data)), ⊨_SAT ⋀ x_i = v_i ∧ ρ ∧ ⋀ x_i' = v_i'}}

• A = {s ∈ S | s(pc) ∈ F}

Even though the representation of the evaluations of data is very important, let us, for now, abstract from it and investigate the reduction assuming that there is an efficient representation. More precisely, let us assume that we have a representation whose size is linear in the number of program variables but does not depend on the domains of these variables, on the number of times the input was read, on the length of the path from s_0, etc. None of these assumptions will prove achievable using the currently known symbolic representations, but for the purpose of evaluating the general idea, we will assume them regardless.

One might observe that it is no longer obvious when two states of the reduced transition system should be considered equivalent, which needs to be decidable for the accepting cycle detection. First, let us demonstrate that equality based on the subset relation – a newly discovered state s' being equal to a known s provided that s'(data) ⊆ s(data) – is not correct with respect to LTL properties. In the context of symbolic execution, this type of equivalence is known as subsumption [XMSN05], where its use is justified by the fact that symbolic execution verifies reachability properties, for which subsumption is correct.

Example 5.2 (Incorrectness of Subsumption). The incorrectness of subsumption-based reduction for LTL verification can be easily demonstrated. Consider the verification task below, with the program CFG on the left and the specification CFG on the right.

[Figure: program CFG with locations l_0 and l; transition τ_0 from l_0 to l labelled x' ≤ 10, and transition τ_1 looping on l labelled x < 5 ∧ x' = x + 5. Specification CFG with a single accepting location a_0 and a self-loop labelled ⊤.]

After taking the transition τ_0 the set-reduced CFG reaches a state s which evaluates data to {0, …, 10}. Taking τ_1 leads to a state s' with s'(data) = {5, …, 9}. Under subsumption the states s and s' would be equal. Given the specification CFG on the right, each state is accepting. Under subsumption we would thus report an accepting cycle, which would be incorrect. The incorrectness can be demonstrated by listing all paths of the explicit CFG. In short, the τ_1-successors of states with x < 5 evaluate x to a value larger by 5, and states with x ≥ 5 have no successors. Hence there are no cycles in the explicit CFG. ◁
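The generation prescribed by Definition 5.1 can be made concrete with a small sketch. The following is a minimal illustration of ours (not the thesis implementation): transition relations are modelled as Python predicates over pairs of valuations, and the names MultiState and successor are hypothetical. The numbers reproduce the program CFG of Example 5.2.

from itertools import product

# A multi-state of the set-reduced CFG: an explicit control location (pc)
# plus the set of all data valuations admitted at that location.
class MultiState:
    def __init__(self, pc, data):
        self.pc = pc
        self.data = frozenset(data)

def successor(state, edge, domains):
    """One step of Definition 5.1 along a CFG edge (src, rho, dst),
    where rho(v, v2) decides whether valuation v may step to v2."""
    src, rho, dst = edge
    if state.pc != src:
        return None
    # s'(data) = { v' | exists v in s(data) such that (v, v') |= rho }
    image = {v2 for v in state.data
                for v2 in product(*domains) if rho(v, v2)}
    # the transition exists only if some valuation satisfies rho
    return MultiState(dst, image) if image else None

# Example 5.2: one variable x over {0..10}; tau_1 is x < 5 /\ x' = x + 5
domains = [range(11)]
tau1 = ('l', lambda v, v2: v[0] < 5 and v2[0] == v[0] + 5, 'l')
s = MultiState('l', {(x,) for x in range(11)})
print(sorted(successor(s, tau1, domains).data))  # [(5,), ..., (9,)]

Enumerating the full product of the domains is, of course, exactly the scalability problem that the symbolic representations of the following sections are meant to address.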

It appears that unless the underlying model-checking process itself is modified, the reduction has to differentiate every pair of states with different sets of possible variable evaluations. Henceforth this reduction is referred to as set-based reduction, and for an unreduced state space G^u the set-reduced state space is denoted by G^s (whenever we need to distinguish it from other forms of reduction). The reduction is demonstrated in the following example and formalised in the following subsection.

Example 5.3 (Set-Based Reduction). Let us consider verifying the following parallel program P:

thread t0:
1: char b;
2: read b;

thread t1:
1: if ( b > 3 )
2:   while ( true )
3:     b++;

thread t2:
1: if ( b*b <= 16 )
2:   while ( true )
3:     b -= 10;

with respect to a specification represented by the automaton P_A:

[Figure: specification automaton P_A with states 1 and 2; the edge from 1 to 2 and the self-loop on 2 are both labelled b > 10.]

requiring that at some point the execution sets b to a value larger than 10 and afterwards never decreases it below 11. The program P is not represented as a composition of transition systems but using C-like pseudo-code, to further demonstrate the possibility of translating real-world verification problems to the presented formalism. Below is the transition system generated without our set-based reduction:

[Figure: a fragment of the unreduced transition system. States are labelled with the thread program counters pc, the specification location pc_A, and the value of b; e.g. (pc = 1,0,0; pc_A = 1; b = 0) has δ-transitions to (pc = 2,0,0; pc_A = 1; b = 0), …, (pc = 2,0,0; pc_A = 1; b = 4), … and these have ζ-transitions to the interleavings (pc = 2,1,0; pc_A = 1; …) and (pc = 2,0,1; pc_A = 1; …).]

The identification of program states can be divided into two parts: one for control information (marked with lighter blue in the figure) and the other for data (marked with darker red). Note also that the control part contains the program counters pc for the individual threads of the main program and the location pc_A of the specification automaton P_A. Similarly, it is possible to distinguish the two sources of non-determinism in parallel programs: the control-flow non-determinism (thread interleaving) is marked as ζ-transitions and the data-flow non-determinism (variable evaluation) as δ-transitions. Finally, the set-reduced transition system may be visualised as follows:

[Figure: the set-reduced transition system. The initial state (pc = 1,0,0; pc_A = 1; d = (b, 0)) steps to (pc = 2,0,0; pc_A = 1; d = (b, {0..255})), which branches to (pc = 2,1,0; pc_A = 1; d = (b, {4..255})) and (pc = 2,0,1; pc_A = 1; d = (b, {0..4})).]

In the above figure we abbreviate data to d. Note especially how the set-reduction contracts the data non-determinism δ, while preserving the distinctions prescribed by the control. ◁

Example 5.4 (Symbolic Execution). The figure below depicts a program with input on the left-hand side and its associated transition system on the right-hand side.

1: read a;
2: if ( a > 5 )
3:   a++;
4: else
5:   a--;

[Figure: the associated transition system; (0, (a, 0), ⊤) steps via a' = i_1 to (1, (a, i_1), ⊤), which branches on a > 5 and a ≤ 5 into (2, (a, i_1), i_1 > 5) and (4, (a, i_1), i_1 ≤ 5), and continues via a' = a + 1 and a' = a − 1 into (3, (a, i_1 + 1), i_1 > 5) and (5, (a, i_1 − 1), i_1 ≤ 5).]

The set of evaluations (of the one variable a) is represented symbolically, as a function over input symbols from I, here only i_1. ◁

Subset inclusion leads to a unification of states that does not preserve the information on the progress towards the LTL specification. In fact, for any unification among states it must be proven that if an accepting cycle was created then there is one in the unreduced system as well. This holds even for a more restrictive form of state equivalence, when two states are equal only if the evaluations they contain match exactly. Notice that even though atomic propositions may evaluate differently under s(data) within a single state, the sets are divided on-the-fly, based on the labels of the specification CFG transitions. That is ensured by the way transitions are generated in Definition 5.1.

Theorem 5.5 (Correctness of Set-Based Reduction). Let P = (L, l_0, 𝒯, F) be the CFG of the verified program and let G^u = (S^u, s_0^u, T^u, A^u) and G^s = (S^s, s_0^s, T^s, A^s) be the generated explicit and set-reduced CFGs, respectively. Then G^u contains a reachable accepting cycle if and only if G^s contains a reachable accepting cycle.

Lemma 5.6 (Symbolic Path Existence). For any path s_0^u → s_1^u → … → s_n^u in G^u there is a path s_0^s → s_1^s → … → s_n^s in G^s such that ∀(0 ≤ i ≤ n), s_i^u(pc) = s_i^s(pc) ∧ s_i^u(D) ∈ s_i^s(data).

Proof. For n = 0 we have that s_0^u(pc) = l_0 = s_0^s(pc) and s_0^u(D) = 0 ∈ s_0^s(data). Let us assume that the statement holds for paths of length n and take an arbitrary path of length n + 1, say s_0^u → s_1^u → … → s_n^u → s_{n+1}^u. The induction hypothesis gives us the appropriate path s_0^s → s_1^s → … → s_n^s. According to Definition 3.10, the existence of s_n^u → s_{n+1}^u entails s_n^u(pc) −ρ→ s_{n+1}^u(pc) ∈ 𝒯. Thus also s_n^s(pc) −ρ→ s_{n+1}^s(pc) ∈ 𝒯, and since s_n^u(D) ∈ s_n^s(data) we also have s_n^s(data) ≠ ∅. Already it holds that s_n^s → s_{n+1}^s ∈ T^s, and from ⊨_SAT ⋀ x_i = s_n^u(x_i) ∧ ρ ∧ ⋀ x_i' = s_{n+1}^u(x_i) we conclude that s_{n+1}^u(D) ∈ s_{n+1}^s(data).

Lemma 5.7 (Explicit Path Existence). For any path s_0^s → s_1^s → … → s_n^s in G^s and for all v ∈ s_n^s(data) there is a path s_0^u → s_1^u → … → s_n^u in G^u such that s_n^u(D) = v.

Proof. Since s_0^s(data) = {0} the statement holds trivially for n = 0. Let us assume a set-reduced path s_0^s → s_1^s → … → s_n^s → s_{n+1}^s and let us fix a data vector v' ∈ s_{n+1}^s(data). The transition s_n^s → s_{n+1}^s gives us s_n^s(pc) −ρ→ s_{n+1}^s(pc) ∈ 𝒯 and s_n^s(data) ≠ ∅. Also, since v' ∈ s_{n+1}^s(data), we have that ⊨_SAT ⋀ x_i = v_i ∧ ρ ∧ ⋀ x_i' = v_i' for some v ∈ s_n^s(data). Applying the induction hypothesis, we get a path s_0^u → s_1^u → … → s_n^u with v = s_n^u(D) and a transition in G^u from s_n^u to s_{n+1}^u with s_{n+1}^u(D) = v'.

Lemma 5.8 (Fix-Point for BV Relations). Let ρ ∈ Φ be a BV formula. We can associate to ρ a relation r as follows:

r = {(v, v') | v_i, v_i' ∈ sort(x_i), ⊨_SAT ⋀ x_i = v_i ∧ ρ ∧ ⋀ x_i' = v_i'}

For each ρ ∈ Φ there are m, n ∈ ℕ_0 such that for all subsets X of {v | v_i ∈ sort(x_i)} it holds that {r^m(v) | v ∈ X} = {r^{m+n}(v) | v ∈ X}.

Proof. Let R^i = {r^i(v) | v ∈ X} and consider the infinite sequence R^1, R^2, …. Given that each sort(x), x ∈ D, is finite, there are k, l ∈ ℕ_0 such that R^k = R^l. Assume k > l; then we can take l for m and k − l for n to establish the statement.

Definition 5.9 (Composition of BV Relations). For a collection of relations ρ_1, …, ρ_n from Θ we define their composition ρ = ρ_n ∘ … ∘ ρ_1 as the formula

ρ = ρ_n[x ↦ x^n] ∧ ρ_{n−1}[x ↦ x^{n−1}, x' ↦ x^n] ∧ … ∧ ρ_2[x ↦ x^2, x' ↦ x^3] ∧ ρ_1[x' ↦ x^2]
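The renaming scheme of Definition 5.9 is mechanical enough to sketch. Below is a minimal illustration that uses the Z3 Python bindings as a stand-in SMT solver (the thesis does not tie the construction to a particular solver); for n = 2 the single intermediate valuation is renamed to x^2.

from z3 import BitVec, And, substitute, Solver, sat

x, xp = BitVec('x', 8), BitVec('xp', 8)   # current / next state variable
rho1 = xp == x + 1                         # first transition:  x' = x + 1
rho2 = xp == 2 * x                         # second transition: x' = 2x

# Definition 5.9 for n = 2: rho = rho2[x -> x2] /\ rho1[x' -> x2],
# i.e. the intermediate valuation is shared through the fresh x2.
x2 = BitVec('x2', 8)
rho = And(substitute(rho2, (x, x2)), substitute(rho1, (xp, x2)))

# The composed relation maps x = 3 to x' = 2 * (3 + 1) = 8.
s = Solver()
s.add(rho, x == 3)
assert s.check() == sat
print(s.model().eval(xp))   # 8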

Lemma 5.10 (Symbolic Cycle Existence). If s_1^u → … → s_c^u = s_1^u is a cycle in G^u then there is a path in G^s, s_{11}^s → … → s_{c1}^s = s_{12}^s → s_{13}^s → …, such that

1. ∀(1 ≤ i ≤ c)(j ≥ 1), s_i^u(pc) = s_{ij}^s(pc)

2. ∃(m, n ∈ ℕ), s_{1m}^s(data) = s_{1(m+n)}^s(data)

Proof. Unrolling the cycle s_1^u → … → s_c^u and applying Lemma 5.6 we get a path in G^s satisfying (1). Let ρ_i be the formulae labelling the corresponding path in P, i.e. s_i^u(pc) −ρ_i→ s_{i+1}^u(pc), and let ρ be the formula for the composition of ρ_1, …, ρ_c. Applying Lemma 5.8 to ρ we get the m and n satisfying (2).

Lemma 5.11 (Explicit Cycle Existence). If s_1^s → … → s_c^s = s_1^s is a cycle in G^s then there is a path in G^u, s_{11}^u → … → s_{c1}^u = s_{12}^u → s_{13}^u → …, such that

1. ∀(1 ≤ i ≤ c)(j ≥ 1), s_i^s(pc) = s_{ij}^u(pc)

2. ∃(m, n ∈ ℕ), s_{1m}^u(D) = s_{1n}^u(D)

Proof. Unrolling the cycle s_1^s → … → s_c^s and applying Lemma 5.7 we get a path in G^u satisfying (1). Let k = |s_1^s(data)| and consider the prefix s_{11}^u →* s_{1(k+1)}^u. For this prefix it holds that ∀(1 ≤ i ≤ k + 1), s_{1i}^u(D) ∈ s_1^s(data), and thus at least one data vector must repeat, i.e. ∃(1 ≤ m < n ≤ k + 1), s_{1m}^u(D) = s_{1n}^u(D), as required by (2).

Proof of Theorem 5.5. Assume that s_1^u → … → s_n^u = s_1^u is an accepting cycle in G^u reachable from s_0^u. Then Lemma 5.10 gives us a lasso-shaped path s_1^s → … → s_m^s → … → s_{m+n}^s = s_m^s in G^s which also leads to a cycle. This cycle is also accepting given (1) of Lemma 5.10 and the definition of A^s. Finally, applying Lemma 5.6 on the path from s_0^u to s_1^u proves the reachability of an accepting cycle in G^s.

Assume that s_1^s → … → s_n^s = s_1^s is an accepting cycle in G^s reachable from s_0^s. Then Lemma 5.11 gives us a lasso-shaped path s_1^u → … → s_m^u → … → s_n^u = s_m^u in G^u which also leads to a cycle. This cycle is also accepting given the definition of A^u, and reachable from s_0^u (after applying Lemma 5.7 to s_0^s →* s_1^s and s_1^u(D)).

Theorem 5.5 gives us the assurance that an unmodified model checker can be used together with the set-reduced transition system and will provide the correct result. Yet reasoning about the efficiency of the proposed set-based reduction – the ratio between the size of the original system and the size of the reduced system – is rather complicated, as shown in Example 5.12 below. For a program without cycles, the reduction is exponential with respect to the number of input variables and to the sizes of their domains. Note, however, that for trivial cases of data-flow non-determinism even this reduction can be negligible. The case of programs with cycles is considerably more involved. Let us call control cycles those paths in a transition system that start and end in two states with the same control part, i.e. s, s' such that s(pc) = s'(pc). Then the relation r of the cycle, transforming s(D) to s'(D), has a fixed point, as was argued in the above proof, and this fixed point has to be computed (explicitly in our case, as opposed to the symbolic solution [Lin96] of the same problem). That aspect is present in full and reduced state spaces alike, yet may produce an exponential difference in their sizes. If the multi-state already contains the fixed point before the generation reaches the given cycle, as in the second program of Example 5.12, then the reduced system contains only as many multi-states as is the length of that cycle. On the other hand, as the first program of Example 5.12 demonstrates, the reduction can even be to the detriment of the space complexity, even if we assume that the size of multi-states is sub-linear in the number of states contained within.

Example 5.12 (Set-Based Reduction Complexity). There are two programs that nicely exemplify some of the properties of the set-based reduction, which we will use in further discussion, especially regarding the efficiency of this approach. Consider the following program with a loop:

read a;
while ( a > 10 )
  a--;

When only the data part of multi-states is considered, the reduced transition system unfolds to

[Figure: the unfolded reduced system; a = {0} → a = {0..255} → a = {11..255} → a = {10..254} → …, ending in a = {0..10}.]

While the state space is finite, there are as many multi-states in the reduced system as there were states in the original system. Each multi-state contains a different set of evaluations of a, and each pair of multi-states is thus different. Furthermore, many states (individual variable evaluations) are represented multiple times: given that the first and the third states in the above system have the same program counter, the values of a between 10 and 254 are represented twice in these two multi-states alone. On the other hand, for a specification x = 1 and a program

x = 1;
read y;
while ( true )
  y++;

the reduced transition system contains three multi-states and three transitions:

[Figure: three multi-states; (x = 0, y = 0) → (x = 1, y = 0) → (x = 1, y = {0..255}), the last with a self-loop.]

whereas the original transition system contains 256 paths that each enclose into a cycle only after 256 unfoldings of the while loop. ◁

Explicit-Value Representation

The first representation considered in this thesis stores the allowed evaluations in an explicit set, enumerating all possible combinations of individual variable evaluations. Hence this approach is closely related to the purely explicit approach of [BBB+12] but improves on it in several aspects. Effectively, rather than repeating the verification for every evaluation of input variables (the purely explicit approach), with explicit sets we execute the verification only once, from the multi-state containing all initial evaluations. It follows that evaluating branching conditions and computing successors need only the modification that these operations are applied sequentially to all evaluations comprising the current multi-state. Most importantly, however, deciding state matching is possible on the level of syntax: two states are the same if their representations in memory match. Verification with explicit sets clearly retains the biggest limitation of the purely explicit approach, i.e. the spatial complexity grows exponentially with the domains of input variables. We postpone the comparison between the explicit set approach and the purely explicit approach until Section 7 and only remark that while the improvement is considerable, the approach still scales linearly with the domain size. Verifying programs with realistic domains of input variables appears to require a symbolic representation, and we continue with a representation which uses BV formulae and the satisfiability modulo theories (SMT) procedures for this theory.
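A minimal sketch of ours (not the implementation evaluated in Section 7) shows the two properties just described: successors are computed by applying a guarded update to every valuation in the multi-state, and state matching reduces to comparing canonical in-memory representations.

# Explicit-set multi-state: hashing a canonical representation makes
# state matching a plain syntactic comparison.
class ExplicitState:
    def __init__(self, pc, evals):
        self.pc = pc
        self.evals = frozenset(evals)   # all admitted valuation tuples

    def key(self):
        # canonical representation: control part plus sorted valuations
        return (self.pc, tuple(sorted(self.evals)))

    def __hash__(self):  return hash(self.key())
    def __eq__(self, o): return self.key() == o.key()

def post(state, guard, update, dst):
    """Apply one guarded command to every valuation in the multi-state."""
    evals = {update(v) for v in state.evals if guard(v)}
    return ExplicitState(dst, evals) if evals else None

# read b; then the guarded command of thread t1: b > 3 / b' = b + 1
init = ExplicitState('2,0,0', {(b,) for b in range(256)})
succ = post(init, lambda v: v[0] > 3, lambda v: ((v[0] + 1) % 256,), '2,1,0')
print(len(succ.evals))   # 252 valuations survive the guard b > 3

The linear dependence on the domain size is visible directly: the 8-bit domain already pushes 252 valuations through every single operation.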

5.2 Symbolic Representation

The classical way of representing a set of variable evaluations using BV formulae is by interpreting a formula as the characteristic function of a set. A formula ϕ ∈ Φ then represents a set {v ∈ sort(D) | ⊨_SAT ϕ ∧ ⋀ x_i = v_i}. Generating successors with this representation, however, requires the computation of either pre- or post-conditions, which is non-trivial for many instructions. We will thus use a different approach, popular in symbolic execution [Kin76].

The symbolic representation of the set of evaluations considered in this thesis stores the possible current values of variables as images of bit-vector functions. For the following theoretical discussion we set the data part of multi-states, s(data), to a vector of BV formulae (ρ_1, …, ρ_n): the transition relation labels collected along the path from l_0. Such a vector represents a set of evaluations {v | ⊨_SAT ρ_n ∘ … ∘ ρ_1 ∧ ⋀ x_i = 0 ∧ ⋀ x_i' = v_i}, i.e. the image of the transition relation.

In practice, the transition relations obtained from interpreting program commands are mostly a conjunction of a guard formula γ, which refers only to current-state variables, and a set of update formulae x' = ϕ, where ϕ refers to current-state and input variables. This allows representing the data part of multi-states as a sequence of formulae referring only to input variables. The conjunction of guards forms the so-called path condition and the sequence of updates delimits the values of variables. Thus, assuming that a variable x is associated with a formula ϕ in a state s (with path condition δ), taking the transition s(pc) −(γ ∧ y' = ψ)→ s'(pc) creates a state s' with path condition δ ∧ γ[x ↦ ϕ], and y is associated with ψ[x ↦ ϕ]. This representation will not be used in the theoretical part of this thesis, but Section 7 details the experimental implementation which uses it.
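The guard/update discipline can be sketched directly. The following minimal illustration uses Z3's Python bindings as a stand-in solver; the helper take and the concrete store are our assumptions, showing how the new path condition δ ∧ γ[x ↦ ϕ] and the new value ψ[x ↦ ϕ] arise by substitution alone.

from z3 import BitVec, And, substitute

# Symbolic store: every program variable maps to a BV term over the
# input symbols i1, i2, ...; delta is the path condition collected so far.
i1 = BitVec('i1', 8)
x, y = BitVec('x', 8), BitVec('y', 8)

store = {'x': i1, 'y': i1 + 1}   # x holds the input, y its successor
delta = i1 > 5

def take(store, delta, guard, updates):
    """Take a transition gamma /\ y' = psi: substitute the current
    symbolic values of the variables into the guard and the updates."""
    subs = [(x, store['x']), (y, store['y'])]
    new_delta = And(delta, substitute(guard, *subs))
    new_vals = {v: substitute(psi, *subs) for v, psi in updates.items()}
    return {**store, **new_vals}, new_delta

# transition: (x < 10) /\ y' = x + y
store2, delta2 = take(store, delta, x < 10, {'y': x + y})
print(store2['y'])   # i1 + (i1 + 1)
print(delta2)        # And(i1 > 5, i1 < 10)

No pre- or post-condition computation is needed: both the guard and the updates are rewritten over the input symbols, which is precisely why this representation is preferred above.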

Definition 5.13 (Set-Symbolic CFG). The set-symbolic CFG G = (S, s_0, T, A) is similar to the set-reduced CFG in limiting the transition labels to ⊤ but differs in that data evaluates to a vector of BV formulae from Φ. The generation from a CFG P = (L, l_0, 𝒯, F) is defined as follows:

• S = {s : {pc, data} → L ∪ [Φ] | s(pc) ∈ L, s(data) = (ρ_1, …, ρ_n), ρ_i ∈ Φ}

• s_0 = {(pc, l_0), (data, (⋀ x_i' = 0))}

• T = {s → s' | s(pc) −ρ→ s'(pc), s(data) = (ρ_1, …, ρ_n), ⊨_SAT ρ ∘ ρ_n ∘ … ∘ ρ_1, s'(data) = (ρ_1, …, ρ_n, ρ)}

• A = {s ∈ S | s(pc) ∈ F}

Hence a multi-state in the set-symbolic state space consists of a vector of formulae. As stated earlier, we want to interpret this vector as the image of the transition relation that results from composing the individual relations in the vector s(data). For such an interpretation we then prove that paths in the set-reduced CFGs have their equivalent in the set-symbolic CFGs.

Definition 5.14 (Image-Based Interpretation). Let G = (S, s_0, T, A) be a set-symbolic CFG. The image-based interpretation of set-symbolic states is a mapping from S to the set of evaluation vectors {(v_1, …, v_n) | v_i ∈ sort(x_i)}. For s(data) = (ρ_1, …, ρ_n) we define it as:

⟦s⟧ = {v | ⊨_SAT ρ_n ∘ … ∘ ρ_1 ∧ ⋀ x_i' = v_i}

Theorem 5.15 (Trace Equivalence for Image-Based Interpretation). Let P = (L, l_0, 𝒯, F). Under the image-based interpretation, the set-reduced CFG G^s = (S^s, s_0^s, T^s, A^s) is trace-equivalent to the set-symbolic CFG G^i = (S^i, s_0^i, T^i, A^i). That is,

1. for each path s_0^s → s_1^s → … in G^s there is a corresponding path s_0^i → s_1^i → … in G^i, such that ∀(j ∈ ℕ), s_j^s(pc) = s_j^i(pc), s_j^s(data) = ⟦s_j^i⟧

2. for each path s_0^i → s_1^i → … in G^i there is a corresponding path s_0^s → s_1^s → … in G^s, such that ∀(j ∈ ℕ), s_j^s(pc) = s_j^i(pc), s_j^s(data) = ⟦s_j^i⟧

Proof. For zero-length paths, the proof is the same for both (1) and (2). In the set-reduced CFG we have s_0^s(data) = {0}. In the set-symbolic CFG we have

⟦s_0^i⟧ = {v | ⊨_SAT ⋀ x_i' = 0 ∧ ⋀ x_i' = v_i} = {0}.

(1) Let s_0^s → … → s_n^s → s_{n+1}^s be a path in G^s. From the induction hypothesis we have a path s_0^i → … → s_n^i in G^i such that s_n^s(data) = ⟦s_n^i⟧. Let s_n^s(pc) −ρ→ s_{n+1}^s(pc) and v ∈ s_n^s(data) such that ⊨_SAT ρ ∧ ⋀ x_i = v_i. Since v ∈ ⟦s_n^i⟧ we also have ⊨_SAT ρ_n ∘ … ∘ ρ_1 ∧ ⋀ x_i' = v_i for s_n^i(data) = (ρ_1, …, ρ_n). Furthermore, ρ_n ∘ … ∘ ρ_1 ∧ ⋀ x_i' = v_i is equivalent to

ϕ := (ρ_n ∘ … ∘ ρ_1)[X' ↦ X^{n+1}] ∧ ⋀ x_i^{n+1} = v_i

and ρ ∧ ⋀ x_i = v_i is equivalent to ψ := ρ[X ↦ X^{n+1}] ∧ ⋀ x_i^{n+1} = v_i. The only common variables in ϕ and ψ are X^{n+1}, which are bound to the same values, and thus ⊨_SAT ρ ∘ ρ_n ∘ … ∘ ρ_1. All that remains is to show that ⟦s_{n+1}^i⟧ = s_{n+1}^s(data):

v' ∈ ⟦s_{n+1}^i⟧ ⇔ ⊨_SAT ρ ∘ ρ_n ∘ … ∘ ρ_1 ∧ ⋀ x_i' = v_i'
⇔ ⊨_SAT ρ ∘ ρ_n ∘ … ∘ ρ_1 ∧ ⋀ x_i' = v_i' ∧ ⋀ x_i^{n+1} = v_i
⇔ ∃(v ∈ ⟦s_n^i⟧), ⊨_SAT ρ[X ↦ X^{n+1}] ∧ ⋀ x_i' = v_i' ∧ ⋀ x_i^{n+1} = v_i
⇔ ∃(v ∈ s_n^s(data)), ⊨_SAT ρ ∧ ⋀ x_i' = v_i' ∧ ⋀ x_i = v_i
⇔ v' ∈ s_{n+1}^s(data)

(2) Similarly as in (1), the path in G^i gives us s_n^i(pc) −ρ→ s_{n+1}^i(pc) and ⊨_SAT ρ ∘ ρ_n ∘ … ∘ ρ_1. Fixing the evaluation of the X^{n+1} variables from the satisfying model to v and renaming the variables, we get ∃(v ∈ ⟦s_n^i⟧), ⊨_SAT ρ ∧ ⋀ x_i = v_i. Applying the induction hypothesis we get ∃(v ∈ s_n^s(data)), ⊨_SAT ρ ∧ ⋀ x_i = v_i. Finally, showing that s_{n+1}^s(data) = ⟦s_{n+1}^i⟧ is exactly the same as in (1).

The image-based interpretation crucially depends on the fact that states of set-symbolic CFGs represent sets of program states. These sets of data evaluations are stored compactly in the form of BV formulae, but model checking algorithms require distinguishing multi-states. We demonstrate how to compare the explicit part of multi-states efficiently in Section 5.4. Comparing the symbolic part, i.e. performing state matching, requires a different approach for an efficient implementation. Below we define a quantified BV formula which is satisfiable if and only if two given multi-states differ in their data parts. The satisfying assignment of input values for one of the multi-states is such that the resulting evaluation of data variables does not appear in the other state.

Definition 5.16 (Image-Based State Matching). For a set-symbolic CFG G = (S, s_0, T, A) we define an image-based state matching formula image for differentiating two states s^l, s^r ∈ S as follows. Let us fix s^l(data) = (ρ^l_1, …, ρ^l_{n_l}) and s^r(data) = (ρ^r_1, …, ρ^r_{n_r}), where each variable is also indexed with either l or r, i.e. formulae from s^r(data) refer to x^r_1, …, x^r_n and i^r_1, …, i^r_{m_r}.

s^l ⊄ s^r = ρ^l_{n_l} ∘ … ∘ ρ^l_1 ∧ ∀(i^r_1 … i^r_{m_r}), ρ^r_{n_r} ∘ … ∘ ρ^r_1 ⟹ ⋁ x^l_i ≠ x^r_i

image(s^l, s^r) = s^l ⊄ s^r ∨ s^r ⊄ s^l

Theorem 5.17 (Image-Based Matching Correctness). Let s^l and s^r be two set-symbolic states. Then image(s^l, s^r) is satisfiable if and only if s^l and s^r represent two different sets of variable evaluations, i.e. ⟦s^l⟧ ≠ ⟦s^r⟧.

Proof. Let y' be the vector of input values that satisfies image(s^l, s^r). Given the form of image we can assume that s^l ⊄ s^r. Let v' be the satisfying vector of values for the variables x'^l_1, …, x'^l_{n_l}. From the first conjunct we know that v' ∈ ⟦s^l⟧. Let v ∈ ⟦s^r⟧ be chosen arbitrarily, as well as a vector of input values y, one for each i^r_j. Then either ρ^r_{n_r} ∘ … ∘ ρ^r_1 ∧ ⋀ i^r_j = y_j ∧ ⋀ x^r_i = v_i is unsatisfiable, and thus v ∉ ⟦s^r⟧, or, if the above formula is satisfiable, the consequent of s^l ⊄ s^r states that v' differs from v in at least one variable evaluation.

Let v' ∈ ⟦s^l⟧ be such that v' ∉ ⟦s^r⟧ and let y be an arbitrary evaluation of the i^r_j variables. Then either ρ^r_{n_r} ∘ … ∘ ρ^r_1 ∧ ⋀ i^r_j = y_j is unsatisfiable or ⋁ x^l_i ≠ x^r_i holds, since v' differs from every element of ⟦s^r⟧.

The above theorem and the existence of SMT solvers for the quantified bit-vector theory (QBV) give a feasible way of checking state equivalence under the image-based symbolic representation. By extension this allows implementing model checking of arbitrary systems that use input variables, even those that read from input variables iteratively. The most outstanding limitation of such an approach is the complexity of the satisfiability checking of QBV formulae, which, in theory, is an NEXPTIME-complete problem. Reported practical results are much more promising, but the achieved times are still orders of magnitude higher than equivalent operations in classical model checking. There are two possible ways of overcoming this obstacle. One is to reformulate the state matching query to allow faster satisfiability checking. The other is to find a definition of state matching that remains correct while being easier to check.
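To make the shape of the image formula concrete, here is a minimal sketch in Z3's Python bindings (our stand-in solver, with one variable per side). One deviation from the definition is noted in the comments: the sketch also binds the right-hand program variable under the quantifier, since here it is a free variable rather than one eliminated by the composition.

from z3 import BitVec, And, Or, Implies, ForAll, Solver

# One input symbol (il / ir) and one program variable (xl / xr) per side;
# the conjunctions below play the role of the compositions rho^l, rho^r.
il, ir = BitVec('il', 8), BitVec('ir', 8)
xl, xr = BitVec('xl', 8), BitVec('xr', 8)

left  = And(il > 5, xl == il + 1)   # image of s^l
right = And(ir > 6, xr == ir + 1)   # image of s^r

# s^l "not subset" s^r: some run of s^l differs from every run of s^r;
# we quantify xr as well, as it is determined only through `right`.
l_not_in_r = And(left,  ForAll([ir, xr], Implies(right, xl != xr)))
r_not_in_l = And(right, ForAll([il, xl], Implies(left,  xl != xr)))
image = Or(l_not_in_r, r_not_in_l)

s = Solver()
s.add(image)
print(s.check())   # sat: il = 6 gives xl = 7, which s^r cannot produce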

It can be easily observed that semantic equivalence between s(data) relations is not a correct state differentiating mechanism with respect to the set-based reduction. However, given the increase in efficiency of BV (quantifier-free) satisfiability compared to QBV satisfiability, we now describe under what conditions semantic equivalence would be a permissible substitute for the exact, image-comparing equivalence. Let us limit our scope to programs that read only a fixed number of times from the input variables, i.e. I is a finite set of cardinality m and the program statements refer only to input variables among i_1, …, i_m. We claim that under such a restriction, it is admissible to define the state matching based on semantic equivalence as follows.

Definition 5.18 (Equivalence-Based State Matching). Let G = (S, s_0, T, A) be a set-symbolic CFG. We define an equivalence-based state matching formula equiv for differentiating two states s^l, s^r ∈ S as follows.

s^l ⋢ s^r = ρ^l_{n_l} ∘ … ∘ ρ^l_1 ∧ ⋁ x^l_i ≠ x^r_i

equiv(s^l, s^r) = s^l ⋢ s^r ∨ s^r ⋢ s^l ∨ ρ^l_{n_l} ∘ … ∘ ρ^l_1 ≠ ρ^r_{n_r} ∘ … ∘ ρ^r_1

Hence the formula equiv is a quantifier-free BV formula and checking its satisfiability is an NP-complete problem. Note that the satisfying assignment can now either come from the last disjunct – an input variable evaluation that satisfies, say, the left path condition ρ^l_{n_l} ∘ … ∘ ρ^l_1 but does not satisfy the right path condition – or from the modified subset inclusion. In the latter case, the satisfying assignment satisfies the path condition of s^l and at the same time leads to a different evaluation of program variables by s^r. The exact meaning of admissibility with respect to state matching is formalised in Theorem 5.19 below; first, the shape of the quantifier-free query can be sketched.
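A minimal sketch of this quantifier-free reading, again with Z3 as the stand-in solver; the two states share the input symbol i1, as the restriction to finitely many reads guarantees.

from z3 import BitVec, And, Or, Xor, Solver

i1 = BitVec('i1', 8)                 # the shared input symbol
xl, xr = BitVec('xl', 8), BitVec('xr', 8)

pc_l, upd_l = i1 > 5, xl == i1 + 1   # path condition and update of s^l
pc_r, upd_r = i1 > 5, xr == i1 + 2   # path condition and update of s^r

# Either some shared input satisfies one path condition but not the
# other, or it drives the two states to different variable evaluations.
diff_vals = And(pc_l, upd_l, pc_r, upd_r, xl != xr)
diff_pcs  = Xor(pc_l, pc_r)
equiv = Or(diff_vals, diff_pcs)

s = Solver()
s.add(equiv)
print(s.check())   # sat: any i1 > 5 gives xl = i1 + 1 != i1 + 2 = xr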

Theorem 5.19 (Admissibility of Equivalence-Based State Matching). Assuming the verified program reads only finitely many times from input, the model checking procedure which uses equiv for state matching satisfies the following three conditions.

1. sound: every counterexample exists also in the set-reduced CFG.

2. complete: every violating run will eventually be detected.

3. terminating: verification of both correct and incorrect programs will terminate.

Proof. (1) Given that the image-based equivalence is strictly weaker than the semantic equivalence, i.e. ¬equiv ⇒ ¬image, every state equivalence located by the semantic equivalence is also located by the image-based equivalence. Consequently, every accepting cycle found using equiv exists in the set-reduced transition system.

(2) Let us assume there is an accepting cycle in the system, i.e. s_0 →* s_1 →* s_2 such that image(s_1, s_2) is unsatisfiable. Given the restriction above, we know that on the path from s_1 to s_2 the transition labelling relations do not use input variables. Let us denote these relations as ρ_1, …, ρ_m. Their composition ρ_m ∘ … ∘ ρ_1 constitutes a permutation of its domain (otherwise image(s_1, s_2) would be satisfiable), but not necessarily an identity. In the case the composition is not an identity, equiv(s_1, s_2) is satisfiable and the computation continues. However, ρ_m ∘ … ∘ ρ_1 ∘ ρ_m ∘ … ∘ ρ_1 is also a permutation of the same domain, and furthermore the evaluation of control variables remains unmodified, and thus the resulting state is accepting if and only if s_1 was accepting. Finally, given the finite number of possible permutations of a finite domain, we conclude that there exist k_0 ≤ k_1 such that (ρ_m ∘ … ∘ ρ_1)^{k_1−k_0} is an identity on the domain of (ρ_m ∘ … ∘ ρ_1)^{k_0}, and thus equiv(s_{k_1−k_0}, s_{k_1}) is unsatisfiable, where s_k is the state resulting from iterating the original cycle from s_1 to s_2 k + 1 times.

(3) The termination can be established simply by observing that there are only finitely many mappings from I (given the above restriction) to sort(D). The state space of the model checking is then also finite, even though possibly larger than the original, set-reduced transition system.

Apart from the restriction to programs that do not read inputs on a cycle, the proposed method utilising semantic equivalence of states may produce larger state spaces than the original state matching based on image equivalence. Note that in the proof of Theorem 5.19 we relied on the number of permutations being finite in order to detect existing state equivalences. In practice it means that the traversal of the state space has to visit and store (at most) k_1 · m new states before detecting the equivalence. Here k_1 depends on the particular function composition collected from the cycle transitions and m is the length of the cycle in the set-reduced system. As Example 5.12 demonstrated, the value of m also depends on the transition labelling function. Together, the model checking using semantic equivalence has to unroll the transition relation to reach two fixed points: (1) to stabilise the image set and (2) to stabilise the permutation of that set. We investigate the increased complexity in the number of states and the improvement in the verification time in Section 7.

Example 5.20 (Iterative Input Reading). To see why the restriction to programs without iterative input reading is necessary, consider this simple program:

a = 0;
while ( true )
  a = a + read();

The cycle introduces a new input variable with every iteration, but more importantly each variable can modify the evaluation of the program variable a. Let us denote by s_1, s_2, … the states after 1, 2, … iterations of the cycle, s_i(data) = (ρ_1, …, ρ_{n_i}), and let i_j be the variable introduced in the j-th iteration. Then the evaluation i_k = 0, k ∈ {1, …, j − 1}, i_j = 1 always differentiates the last state s_j from all the preceding states: for all k < j we have that only a ↦ 0 satisfies ρ_{n_k} ∘ … ∘ ρ_1 ∧ ⋀ i_l = 0, but s_j evaluates a to 1. ◁

5.3 Storing and Searching

During its state space traversal, an explicit-state model checker enumerates and stores the visited states in order to detect accepting cycles and to be able to decide that the system is correct (when the whole state space was traversed without detecting any accepting cycle). The question of how to implement an efficient representation of multi-states is at the centre of our work. (In the following we refer to the objects in a state space as states, unless the distinction between multi-states and single-states is necessary.)

In classical explicit model checking, individual states represent one evaluation of variables and are thus stored as a list of values, each on a fixed position in a block of memory. Postponing any other state reduction techniques for later discussion, states represented explicitly can be efficiently manipulated by copying and comparing continuous blocks of memory and, more importantly, compressed using hashing. As a hashing function we consider any function (not necessarily injective) that maps a block of memory to a small number (small compared to the domain of the hashing function, often a 32- or 64-bit integer). Storing visited states in a hash table allows one to decide whether a state was visited before with constant average-time complexity. Hashing, however, is only permissible on a set where every element has a canonical representation and this representation can be computed efficiently. More precisely, it is required for two elements e_1, e_2 that if the hashing function h returns different hashes, h(e_1) ≠ h(e_2), then e_1 is not the same object as e_2. This is readily available for sets whose elements are stored explicitly, but for symbolic representations the situation is more complicated.

The set of BV functions does not have the property of efficiently accessible canonical representations, i.e. BV formulae have canonical forms but the process of canonisation is an NP-complete problem. Clearly, a BV function stored as its abstract syntax tree is not a canonical representation of its image nor of its semantics. For example, 3 + 2 and 2 + 3 both map to 5, yet they are syntactically different and thus their representations do not match. Canonical representations of BV formulae exist, but their size grows exponentially in the presence of multiplication, and it is questionable what the improvement over the explicit set representation would be. Furthermore, the canonical form of BV formulae is canonical only with respect to satisfiability, i.e. two functions have the same canonical representation if and only if they are equisatisfiable. It follows that such a canonical form need not preserve the images of the functions contained within the original formula.

5.3.1 Linear Storage

Given the basic principle of the combination of explicit and symbolic approaches to model checking – that the control part of individual states is stored explicitly – one can still partially employ hash-based search. Effectively, only the states that share the same control part must be tested for equivalence of the data part in order to decide state matching. The control part is canonical and the standard hash-based search can be used to distinguish states that differ in the control part. This considerably improves the time efficiency, since the number of states among which one must search using the more expensive procedure is smaller. For further discussion let us denote the set of states of G = (S, s_0, T, A) that share the same control part with a state s as [s]_c, i.e. [s]_c = {s' ∈ S | s'(pc) = s(pc)}.

A straightforward way of storing the set [s]_c is in a linear array. Then deciding whether a new s' is already present in [s]_c constitutes a similar search as in hash table collisions: the array storing [s]_c is traversed in a linear fashion and every element is tested for equivalence with s'. If an s'' is found for which image(s', s'') is unsatisfiable then the traversal is aborted (s' is already stored); otherwise, if [s]_c was traversed to the end, s' is appended to [s]_c as a new state. Note that finding a state s' to be a new state requires the traversal of the whole [s]_c, where for every s'' ∈ [s]_c the formula image(s', s'') is satisfiable. Since finding a satisfying assignment of a QBV formula is considerably faster than proving unsatisfiability, it follows that detecting already visited states is more expensive than finding a state to be genuinely new.

There are several possible ways of improving this basic scheme. One can clearly see that the finer the state space is divided solely by the control part, the faster the state matching is. This can be utilised by moving some of the data into the control part. For example, if we can be certain that a particular variable evaluates to a single value in a state, then it can be safely moved to the control part to be used in further, hash-based, state space division. Another optimisation is to use the satisfying assignments of previously checked image formulae. As stated before, such an assignment is in fact an evaluation of input variables that leads to an evaluation of program variables that exists in one state but not in the other. Since the data parts of states sharing the same control part were formed under similar conditions (often only a different iteration of a cycle), it might happen that the evaluation that differentiated two such states could also differentiate the next ones. Furthermore, testing whether such an evaluation can be reused is possible without calling the SMT solver, using the explicit evaluation method incorporated in the model checker. We now explain these optimising heuristics in more detail; the basic scheme itself is sketched below.
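A minimal sketch of the basic scheme; Database, State, and the injected semantically_equal predicate are hypothetical names of ours, the predicate standing in for the SMT-backed image (or equiv) test.

from collections import namedtuple

State = namedtuple('State', 'pc data')

class Database:
    """Hash on the explicit control part; resolve collisions with the
    expensive semantic check (an SMT query in the real tool)."""
    def __init__(self, semantically_equal):
        self.buckets = {}               # control part -> list of states
        self.eq = semantically_equal    # stands in for image/equiv

    def lookup_or_insert(self, state):
        """Return (seen_before, stored_state)."""
        bucket = self.buckets.setdefault(state.pc, [])
        for old in bucket:              # linear scan over [s]_c
            if self.eq(old, state):     # unsat image(old, state)
                return True, old
        bucket.append(state)            # genuinely new state
        return False, state

# toy semantic test: compare the stored data sets directly
db = Database(lambda a, b: a.data == b.data)
print(db.lookup_or_insert(State('l1', frozenset({1, 2})))[0])  # False
print(db.lookup_or_insert(State('l1', frozenset({1, 2})))[0])  # True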

5.3.2 Explicating Variables

If some variables do not depend on inputs, or their values were limited by path conditions, it may happen that only one value is permissible as their evaluation (see [BL13] for an explicit-value analysis using a similar idea). This can be detected by investigating the satisfiability model and employed during the hash-based search. It requires adding another set of control variables C_D to store the explicit values, i.e. for each x ∈ D we add x^C into C_D. Thus hashing is performed on the set {s(pc), s(x^C_1), …, s(x^C_n)} and we add new pairs (x^C, e ∈ sort(x)) into s when only a single evaluation e of x satisfies the path condition. More precisely, for s'(data) = (ρ'_1, …, ρ'_n), when testing the satisfiability of ρ'_n ∘ … ∘ ρ'_1, we extract the satisfying evaluation as v. Then every data variable x_i is tested for satisfiability of ρ'_n ∘ … ∘ ρ'_1 ∧ x_i ≠ v_i. If this formula is unsatisfiable we add (x^C_i, v_i) into s.

Explicating variables does not require any further modification of either the accepting cycle detection or the database searching. The desired effect of this heuristic is that the number of states with the same explicit part – the evaluation of the variables from {pc, x^C_1, …, x^C_n} – is lowered. The database thus stores more explicit states, but the complexity of a hash-based search is negligible compared to the satisfiability-based search which resolves hash-based collisions. There is a negative effect, however, in that detecting variables with a single satisfying value requires further calls to the SAT solver. The experiments of Section 7 show that even when the explicating is successful in only a few instances, the overall effect is positive.
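The uniqueness test behind explicating a variable amounts to one extra solver query per variable, as the following minimal sketch (Z3 as the stand-in solver, with a toy path condition of our choosing) shows.

from z3 import BitVec, And, Solver, sat, unsat

i1 = BitVec('i1', 8)
x = BitVec('x', 8)
path = And(i1 > 3, i1 < 5, x == i1 * 2)   # forces i1 = 4, hence x = 8

def explicate(path, var):
    """Return the single feasible value of var, or None if several exist."""
    s = Solver()
    s.add(path)
    if s.check() != sat:
        return None
    v = s.model().eval(var, model_completion=True)
    s.add(var != v)                  # is any other value feasible?
    return v if s.check() == unsat else None

print(explicate(path, x))   # 8: x can join the explicit control part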

5.3.3 Witness-Based Matching Resolution

The advantage of hash-based searching is that the amortised complexity of a single search query can, under certain conditions, be constant. Less efficient storage methods use a linear ordering on the elements and tree-like structures where all possible collisions can be detected in a logarithmic number of steps. The following heuristic attempts to simulate the important property of a linear order with respect to searching: every comparison divides the remaining number of potential candidates approximately in half. Furthermore, if there is a more efficient procedure than calling a SAT solver to compare two states, then that other procedure should be used instead.

Two sets of symbolically represented states sharing the same explicit part cannot be distinguished using hashing. Consider the model gained from checking the satisfiability of equiv(s, s'). It is an evaluation of input variables and also a witness of the difference between s and s' (note that we use the equivalence-based state matching). This witness can be reused and can replace some calls to the SAT solver when processing a new state s''. For example, let the model from some previous equiv(s, s') evaluate the input variables to y, such that ρ_n ∘ … ∘ ρ_1 ∧ ⋀ i_j = y_j evaluates the data variables to v, ρ'_{n'} ∘ … ∘ ρ'_1 ∧ ⋀ i_j = y_j evaluates the data variables to v', and v ≠ v'. Then if ρ''_{n''} ∘ … ∘ ρ''_1 ∧ ⋀ i_j = y_j evaluates the data variables to v'', where v'' ≠ v, we have s'' ≠ s; otherwise s'' ≠ s'. Notice that obtaining v'' and testing the equality with v can be done without calling the SAT solver, since we can simply substitute the input variables y and evaluate the resulting formulae.

Let us focus the following discussion on a single set of states with the same explicit part. The above observation can be used more extensively as the number of states increases, since every pair among those states constitutes a witness. We propose to build a hierarchy among the states, attaching two sets of states to every witness, denoted as {s_1, …, s_l} →_y {z_1, …, z_r}, where y is the witness of s_1 ≠ z_1, s_i(data) = (ρ^i_1, …, ρ^i_{n_i}), and z_i(data) = (ζ^i_1, …, ζ^i_{m_i}). Let v_i and w_i be the data variable evaluations gained from the satisfiability models of ρ^i_{n_i} ∘ … ∘ ρ^i_1 ∧ ⋀ i_j = y_j and ζ^i_{m_i} ∘ … ∘ ζ^i_1, respectively. The separation between the s and z states is such that ∀(1 < i ≤ l), v_i = v_1 and ∀(1 ≤ i ≤ r), w_i ≠ v_1.

The witnesses are stored in a priority queue and the one with the highest priority is used for the next state matching. The priority of every witness equals the sum of the numbers of states that it connects, that is l + r, but we also continually update the priority with every state matching: if a state is found different from the states in a set Z then we remove the states in Z from both sets connected by each witness. This ensures that, on average, each state matching removes the highest number of states still to be compared with. Once all witnesses with non-zero priority have been tested, the remaining state matchings must be decided with standard calls to the SAT solver.

A similar, though not equivalent, simplification can be obtained from reusing the witnesses of image(s, s'), i.e. in the case of image-based state matching. There the model is again an evaluation y of the input variables in s such that for any input evaluation for s', s'(data) leads to an evaluation of the state-forming variables different from s(data). While this cannot be used to differentiate a new state completely without any call to the SAT solver, one can exponentially decrease the theoretical complexity of the call by checking satisfiability in BV instead of QBV. (Equivalently, one could generalise the idea by stating that reusing a witness allows for differentiation with satisfiability of formulae with one less quantifier alternation.) If the quantifier-free formula ρ_n ∘ … ∘ ρ_1 ∧ ρ''_{n''} ∘ … ∘ ρ''_1 ∧ ⋀ x_i = x''_i ∧ ⋀ i_j = y_j is satisfiable then s'' ≠ s'; if it is unsatisfiable then s'' ≠ s.
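That reusing a witness needs no solver call is easiest to see as pure term rewriting: substituting the witness value for the input symbol and simplifying yields a concrete valuation. A minimal sketch with Z3 as the stand-in solver (the update terms are hypothetical):

from z3 import BitVec, BitVecVal, substitute, simplify

i1 = BitVec('i1', 8)

def value_under_witness(update, witness):
    """Evaluate a symbolic update under a fixed input evaluation --
    term substitution and simplification only, no solver involved."""
    return simplify(substitute(update, (i1, BitVecVal(witness, 8))))

# y = 7 was a witness separating two earlier states; try it on new terms
old_val = value_under_witness(i1 + 1, 7)   # the stored state maps 7 to 8
new_val = value_under_witness(i1 + 3, 7)   # the new state maps 7 to 10
print(old_val, new_val, old_val.as_long() != new_val.as_long())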

5.4 Model Checking Algorithm

At this point we have all the necessary methods described in sufficient detail to be able to construct the model checking algorithm using the set-based state space reduction. It will prove useful to consider a model checker to consist of four communicating components: (1) an implicit graph description, i.e. the input program, (2) an accepting cycle detection algorithm which continuously generates the explicit graph, (3) a database of already visited states, and (4) a counterexample extractor. Also, even though the model checking will be explained on control-flow graphs using image-based state matching, it works exactly the same for equivalence-based state matching.

An implicit graph description for the purpose of model checking implements three functions – initial, successors, and is_accepting – that together exhaustively represent a program P = (L, l_0, 𝒯, F); a sketch of this interface follows the list below.

• The function initial simply returns the initial state of the program, i.e. s such that s(pc) = l_0, s(data) = (⊤).

• The function successors takes a state s as its parameter and returns the set of all its →-successors generated as described in Definition 5.13, i.e. s' ∈ successors(s) ⇔ s → s'. Notice that generating each successor requires calling the SAT solver to verify that ρ ∘ ρ_n ∘ … ∘ ρ_1 is satisfiable, where s(data) = (ρ_1, …, ρ_n) and s(pc) −ρ→ s'(pc) ∈ 𝒯. Using the explicit representation, computing successors(s) requires computing the successors of each v ∈ ⟦s⟧.

• Finally, is_accepting(s) is ⊤ if and only if s(pc) ∈ F, i.e. s(pc) is an accepting location.
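A minimal sketch of this three-function interface, with satisfiable_composition as a placeholder for the SMT query of Definition 5.13 and the remaining names being our own:

from collections import namedtuple

State = namedtuple('State', 'pc data')   # explicit pc, vector of formulae

def satisfiable_composition(data, rho):
    # placeholder for the SMT query: is rho o rho_n o ... o rho_1 sat?
    return True

class Graph:
    """Implicit graph description of a program P = (L, l0, T, F)."""
    def __init__(self, l0, transitions, accepting):
        self.l0 = l0                    # initial location
        self.transitions = transitions  # location -> [(rho, destination)]
        self.accepting = accepting      # the set F

    def initial(self):
        # s(data) = (T), represented here as the empty formula vector
        return State(pc=self.l0, data=())

    def successors(self, s):
        return [State(pc=dst, data=s.data + (rho,))
                for rho, dst in self.transitions[s.pc]
                if satisfiable_composition(s.data, rho)]

    def is_accepting(self, s):
        return s.pc in self.accepting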

The database Visited is an associative array, storing a pair with each state reference. References are denoted by underlining and are implemented as pointers into a linear array States. Hence Succs is a sequence of references to the successors of s, and π is the reference to the parent state of s. In order to support a graph traversal algorithm, the database implements two functions: [·] and has. The semantics of [s] is such that either the pair associated with s is returned, or a reference to a new pair is returned if s is not in Visited. The other functions have their expected meaning, which is easily inferable from their use in Algorithm 6.

Algorithm 6: Generic Accepting Cycle Detection
Input: Program description Graph; Database Visited
Output: A pair <{Correct, Incorrect}, S>, where S is a subset of program states that contains a reachable accepting cycle if the first element is Incorrect

1  Ready.push(<Graph.initial(), ⊥>)
2  while Answer = Continue do
3      <s, π> := Ready.pop()
4      if ¬Visited.has(s) then
5          Succs := Graph.successors(States[s])
6          SuccRefs := map(Succs, States.push_back)
7          Visited[s] := <SuccRefs, π>
8      <SuccRefs, π> := Visited[s]
9      <Answer, Next> := acc_cycle_step(s, SuccRefs)
10     if Answer = Continue then
11         foreach s' ∈ Next do
12             Ready.push(<s', s>)
13 return <Answer, Next>

Furthermore, all database functions can be implemented by the generic searching function described in Algorithm 7. Note that the function in Algorithm 7 is essentially a hash-based search with linear conflict resolution. The reason for describing it to any extent is to expose the difference between the hash-based search for the set of control-equal, data-conflicting database elements, and the search within this set. While the former is a very fast, syntax-based comparison of two blocks of memory, the latter, represented on line 3 by the call to a SAT solver on one of the image formulae, can be extremely slow, since it has to interpret the function representing variable evaluations. As the experiments will later show, the efficiency of the whole model checking approach crucially depends on the size of these conflicting sets of states.

The accepting cycle detection algorithm itself, depicted in Algorithm 6, is again a generic graph traversal, abstracted from a concrete implementation of some specific cycle detection algorithm. Details of particular algorithms are not necessary for understanding this work and can be found in the original publications of the respective algorithms. The interested reader should consult [BČMŠ04], [ČP03], and [CVWY92] for details on the MAP, OWCTY, and Nested DFS algorithms, respectively (those are the three algorithms implemented in DiVinE). The existence of a database of known states is motivated by three main reasons: (1) it is more expensive to generate successors of a state through the implicit graph representation than to obtain them directly from the database, (2) in order to decide the correctness of a system the algorithm must traverse the whole graph and thus be able to decide when all states have been visited, and (3) the accepting cycle detection requires that state matching is implemented. Hence, based on (1), we should access the successors of a state using the database (line 8) instead of reconstructing them again (line 5) in the case the present state is already stored in the database. The output of the accepting cycle detection, in the case the program is incorrect, is a subset of the state space which contains an accepting cycle reachable from the initial state.

One detail of the accepting cycle detection is interesting with respect to the correctness of the overall model checking procedure. When a state is revisited, the algorithm only preserves the older version of the state: the one stored in the database. In standard model checking these two states are identical, but under the set-based reduction they may differ in both the path condition and the functions representing variable evaluations. The following theorem shows that in set-based reduced systems, i.e. with state matching at least as strongly differentiating as the image-based one, either version of the state can be used with the same effect. The implications of storing only the oldest version of a state are observable when we try to interpret a path in the system, which happens solely when counterexamples are extracted.

Theorem 5.21 (Image-Based Path Existence). Let s be a reachable state in a set-symbolic G = (S, s_0, T, A) and let π be a path from s to s' such that ⊭_SAT image(s, s'). Then for every path ζ_1 from s' to s_1 there is a path from s to s_2 such that ⊭_SAT image(s_1, s_2).

Lemma 5.22 (Single Step Equality). Let s, s' ∈ S be such that ⊭_SAT image(s, s') and let ρ ∈ Φ be such that for r, r' ∈ S we have: r(pc) = s(pc) and r(data) = s(data).ρ; furthermore, r'(pc) = s'(pc) and r'(data) = s'(data).ρ. Then ⟦r⟧ = ⟦r'⟧.

Proof. Let s(data) = (ρ_1, …, ρ_n) and s'(data) = (ρ'_1, …, ρ'_n). Then

v ∈ ⟦r⟧ ⇔ ⊨_SAT ρ ∘ ρ_n ∘ … ∘ ρ_1 ∧ ⋀ x_i' = v_i
⇔ ∃w, ⊨_SAT ρ ∧ ⋀ x_i' = v_i ∧ ⋀ x_i = w_i ∧ ρ_n ∘ … ∘ ρ_1 ∧ ⋀ x_i' = w_i
⇔ ∃(w ∈ ⟦s⟧), ⊨_SAT ρ ∧ ⋀ x_i' = v_i ∧ ⋀ x_i = w_i
⇔ ∃(w ∈ ⟦s'⟧), ⊨_SAT ρ ∧ ⋀ x_i' = v_i ∧ ⋀ x_i = w_i
⇔ ⊨_SAT ρ ∘ ρ'_n ∘ … ∘ ρ'_1 ∧ ⋀ x_i' = v_i
⇔ v ∈ ⟦r'⟧

Proof of Theorem 5.21. Let the path from s_0 to s be s_0(pc) −ρ_1→ … −ρ_n→ s(pc) and the path from s to s' be s(pc) −σ_1→ … −σ_m→ s'(pc). We prove the statement by induction on the length p of the path from s' to s_1. For p = 0 the statement follows from the assumptions. Let s'(pc) −ζ_1→ … −ζ_p→ s_{p1}(pc) −ζ_{p+1}→ s_1(pc); then from the induction hypothesis we have a path s(pc) −θ_1→ … −θ_p→ s_{p2}(pc) such that ⊭_SAT image(s_{p1}, s_{p2}). From

⊨_SAT ζ_{p+1} ∘ ζ_p ∘ … ∘ ζ_1 ∘ σ_m ∘ … ∘ σ_1 ∘ ρ_n ∘ … ∘ ρ_1

and the definition of s_{p2} we have ⊨_SAT ζ_{p+1} ∘ θ_p ∘ … ∘ θ_1 ∘ ρ_n ∘ … ∘ ρ_1, and thus there is s_{p2} → s_2. Finally, from Lemma 5.22 it follows that ⟦s_1⟧ = ⟦s_2⟧.

The subset returned by Algorithm 6 is not in a form presentable to the user as a counterexample and needs to be further clarified, i.e. the actual counterexample has to be extracted from Next using Algorithm 8. The algorithm makes heavy use of the graph induced by parent edges in order to accelerate the location of both an accepting cycle and the path to that cycle from the initial state. Hence three different graph traversal procedures are used in this algorithm: forward search along normal edges (Reach_f), forward search along parent graph edges (Reach_f^p), and backward search along parent graph edges (Reach_b^p). The first three parameters are common to all three procedures, i.e. the number of steps (∗ for unlimited traversal), the starting vertex, and the subgraph that limits the scope of the search. The last parameter of the backward search is a set of vertices on which the traversal should be aborted.

Algorithm 7: Database Searching
Input : State s; Database Visited; Modifying function update
Output: Pair ⟨Answer, s⟩, where Answer ⇔ s ∈ Visited and s is the reference to the stored state. Furthermore, Visited is correctly modified using function update.

1  Key := hash(s(pc))
2  foreach s' ∈ Visited : hash(s'(pc)) = Key do
3    if ⊭_SAT image(s, s') then   // or equiv(s, s')
4      return ⟨⊤, &s'⟩
5  return ⟨⊥, Visited.update(s)⟩

Example 5.23 (Counterexample Extraction). We use the relatively simple CFG below to demonstrate the counterexample extraction.

[Figure: an example CFG with the initial state s0, states s1–s6, and a state sc lying on the accepting cycle; the input subgraph, the states without parent-graph successors, the accepting predecessors, and the detected cycle are highlighted as described below.]

The input is the lighter green area, which contains an accepting cycle. In this example, we represent parent-graph edges with full lines and the remaining edges with dashed lines. The extraction begins by locating the states with no successors in the parent graph, marked with small rectangles, i.e. the states s3, s4, s5, sc. Next, it identifies the states with an accepting predecessor in the parent graph, marked with red-filled rectangles. For each of those states, here demonstrated on the state sc, the algorithm computes the set of successor vertices in the whole graph, marked with a darker blue square. Finally, the accepting cycle is identified as the yellow area at the bottom. ◁

The parent graph is the shortest-path tree and thus contains no cycles. Yet some vertices in the parent graph may have outgoing edges in the full graph that end in one of their parent-graph predecessors. Such vertices cannot have successors in the parent graph. These are first located on line 1 and then refined on line 2, since we are only interested in those that have accepting predecessors; they are then iteratively tested for being on a cycle. Hence the sets R_f^1 and R_b^∗ contain the one-step forward closure and the full backward closure, respectively, the latter along parent edges. The construction of R_b^∗ is also stopped at any vertex from R_f^1, which identifies the vertex sc as lying on a cycle.

Algorithm 8: Counterexample Extraction
Input : Subset Next that contains a reachable accepting cycle
Output: Pair ⟨π, ζ⟩, where π = s_0, …, s_a is a path in Graph, such that s_0 is the initial state and s_a is an accepting state; ζ = s_a, …, s_a is a cycle in Graph.

1  NoSucc := {s ∈ Next | Reach_f^p(1, s, Next) = ∅}
2  Acc := {s ∈ NoSucc | Reach_b^p(∗, s, Next, F) ∩ F ≠ ∅}
3  foreach sc ∈ Acc do
4    R_f^1 := Reach_f(1, sc, Next)
5    R_b^∗ := Reach_b^p(∗, sc, Next, R_f^1)
6    if R_f^1 ∩ R_b^∗ ≠ ∅ then
7      s_a := s ∈ R_b^∗ ∩ F
8      return ⟨Reach_b^p(∗, s_a, Next, {s_0}), R_b^∗⟩

Given that the parent graph is a tree, we know that R_f^1 ∩ R_b^∗ contains at most a single vertex. The cycle R_b^∗ must contain an accepting vertex s_a, from which the final backward traversal towards the initial vertex can be started.

5.4.1 Counterexamples

The ability to produce counterexamples is a crucial property of LTL model checking that must be retained even for our combination with symbolically represented data. In standard, explicit model checking a counterexample is an infinite path that violates the desired property, see Algorithm 8. To the user, it is usually reported as a pair (path to a vertex on an accepting cycle, path from that vertex to itself). With a symbolic representation of data we can either select one variable evaluation from among those comprising the individual states, or use a part of the symbolic representation to provide the user with more information about the violating behaviour than could be inferred from an explicit counterexample. We propose to represent the counterexample as a quadruple (path π to an accepting state, cycle ζ on that state, path conditions for π and ζ). In the description of the paths we are only interested in the choices of the control-flow non-determinism that must be taken to steer the computation into and through the violating cycle. The path conditions, on the other hand, only need to describe what data values are admissible to get to that cycle. It follows that from such a counterexample the user can obtain not only one (or a fixed number of) concrete runs that violate the specified behaviour, but also the precise description of all such runs, represented as the path condition. The negative aspect of symbolic counterexamples relates to the first fixed-point search described in Section 5.1, which can lead to counterexamples exponentially longer than those produced by the standard approach.

Altogether, the counterexample is a quadruple ⟨π, ζ, s_a(pc), s'_a(pc)⟩, where π = s_0, …, s_a and ζ = s'_a, …, s'_a. Note that distinguishing between s_a and s'_a is only necessary when employing the image-based state matching. Given that the equivalence-based state matching does not allow reading from input variables on a cycle, the path conditions for s_a and s'_a must then be exactly the same. Furthermore, we can use the result of Algorithm 8 without any modifications and simply project the paths to only the control-flow information and extract the path condition from s'_a. The image-based state matching must recompute the path condition for s'_a, which is not stored in the database.

Yet s_a is stored, and we know that s_a is the direct predecessor of s'_a; thus the states stored in R_f^1 have correct path conditions. It follows that the s_a in the algorithm already is the correct s'_a for the counterexample.

5.4.2 Deadlocks

The above description is complete and correct for an arbitrary system that does not contain deadlocks, i.e. where at least one guard is satisfiable at any state of the system. In order to extend our approach to systems with deadlocks (which is desirable, since deadlock-freedom is an important property to be verified), the algorithm has to correctly resolve the situation when there is a value for which no guard is satisfiable. Formally, let Γ_p be the guards on the system transitions leaving a state s and Γ_a the guards on the specification transitions. Then, if ρ_n ∘ ⋯ ∘ ρ_1 ∧ ⋁Γ_a ∧ ⋁Γ_p is unsatisfiable, the traversal algorithm creates s' as a copy of s, sets s'(data) to s(data).⊥ and stores it as another successor of s. This way the ability to detect deadlocks is restored, and with it the standard semantics of LTL on finite runs – the implicit appending of a self-loop – is handled correctly.
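A minimal sketch (Python with z3) of this deadlock test; the path condition and all guards below are illustrative assumptions. If the conjunction of the path condition with the disjunctions of specification and system guards is unsatisfiable, the ⊥-successor is created.

from z3 import BitVec, And, Or, ULE, UGT, Solver, unsat

x = BitVec('x', 8)
path_condition = ULE(x, 10)               # rho_n o ... o rho_1 (assumed)
system_guards = [UGT(x, 20), UGT(x, 30)]  # Gamma_p (assumed)
spec_guards = [UGT(x, 15), UGT(x, 25)]    # Gamma_a (assumed)

s = Solver()
s.add(And(path_condition, Or(*spec_guards), Or(*system_guards)))
if s.check() == unsat:                    # no data value enables any transition
    print("deadlock: create s' with s'(data) = s(data).bot")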

5.4.3 Witnesses

In the case of equivalence-based state matching, the witnesses may take two forms: (1) for demonstrating a path condition difference and (2) for demonstrating a data evaluation difference. In both cases, the witness contains two sets of state references: one for positive and one for negative instances. Witnesses of form (1) further contain an evaluation of input variables y, such that for a positive state s it holds that y satisfies the path condition, i.e. ⊨_SAT ρ_n ∘ ⋯ ∘ ρ_1 ∧ ⋀ i_j = y_j; y does not satisfy the path condition of negative states. Witnesses of form (2) contain two sets of evaluations: y of input variables and v of state-forming variables. The idea of witnesses of form (2) is that positive states map y to v, i.e. for a state s it holds that ⊨_SAT ρ_n ∘ ⋯ ∘ ρ_1 ∧ ⋀ i_j = y_j ∧ ⋀ x'_i = v_i. Negative states then map y to a vector other than v.

The use of witnesses can be incorporated into the database searching (Algorithm 7) as demonstrated in Algorithm 9, into which three additions were made. First, the matching resolution was added on line 1a; it prunes the database Visited, removing states using known witnesses while concurrently modifying the priority queue Witnesses (Algorithm 10). The second addition, on line 4b, extracts the satisfiability model and builds a new witness based on this model (Algorithm 11). The final addition is on line 4c, where we update the priority keys of witnesses according to the newly added witness ω' (Algorithm 12).

Algorithm 9: Database Searching with Witness-Based Matching Resolution
Input : State s; Database Visited; Modifying function update
Output: Pair ⟨Answer, s⟩, where Answer ⇔ s ∈ Visited and s is the reference to the stored state. Furthermore, Visited is correctly modified using function update.

1   Key := hash(s(pc))
1a  Visited' := witnessBasedResolution(Visited)
2   foreach s' ∈ Visited' : hash(s'(pc)) = Key do
3     if ⊭_SAT image(s, s') then
4       return ⟨⊤, &s'⟩
4a    else
4b      ω' := extractWitness(s, s')
4c      updateWitnesses(ω')
5   return ⟨⊥, Visited.update(s)⟩

Algorithm 10: Witness-Based Matching Resolution
Input : Database Visited
Output: Subset Visited' of known multi-states

1   Visited' := Visited
2   while ¬Witnesses.empty() do
3     ω := Witnesses.pop()
4     res := (⊨_SAT ρ_n ∘ ⋯ ∘ ρ_1 ∧ ⋀ i_j = ω.y_j ∧ ⋀ x_i = ω.v_i) ? positive : negative
5     Visited' := Visited' \ ω.res
7     foreach ω' ∈ Witnesses do
8       ω'.positive := ω'.positive \ ω.res
9       ω'.negative := ω'.negative \ ω.res
10      key := |ω'.positive ∪ ω'.negative|
11      if key < threshold then
12        Witnesses.remove(ω')
13      Witnesses.decreaseKey(ω', key)
14  return Visited'

Algorithm 11: Extract Witness
Input : Differentiable states s, s'
Output: Witness ω'

1  M := getModel(image(s, s'))
2  ω'.y := M|_I
3  ω'.v := M|_D
4  return ω'

Algorithm 12: Update Witnesses
Input : Witness ω'

1  Witnesses.add(ω', 0)
2  foreach s' ∈ Visited do
3    res := (⊨_SAT ρ_n ∘ ⋯ ∘ ρ_1 ∧ ⋀ i_j = ω'.y_j ∧ ⋀ x_i = ω'.v_i) ? positive : negative
4    ω'.res := ω'.res ∪ {s'}
5    key := |ω'.positive ∪ ω'.negative|
6    Witnesses.decreaseKey(ω', key + 1)

Chapter 6

Verification of Simulink Designs

The Simulink diagram presented in the introduction (in Figure 1.2) shows why these design-stage models are perfect candidates for demonstrating the efficiency of control explicit—data symbolic model checking. Standard explicit-state model checking cannot be used without modifying the diagram, since the semantics dictates that each input is read in each step. A comprehensive explicit verification would thus need to generate 2^{32n} potential successor states in each step, where n is the number of inputs. In Section 3 we formalised the process of generating a CFG from a parallel program. Simulink diagrams share many similarities with standard programs, but they have a few unique features, especially regarding the input/output behaviour. These features may simplify the verification and thus need to be pointed out in detail. We therefore first formalise the diagrams, trying to remain consistent with the notation of Section 3 wherever possible. The results of comparing purely explicit model checking with control explicit—data symbolic model checking are presented next.

6.1 Theory of Simulink Verification

In order to be able to denote explicitly which variables may occur in a particular formula, we will use a lower index, e.g. ϕ_X is a formula over variables from X. In order to distinguish between BV formulae and terms, we will denote formulae with p (for predicates) and functions with f.

A state of a Simulink system is described by the evaluation of X = {d_1, …, d_m}, the delay variables, whose values for the next iteration are given by a function f_{X∪I}, denoted δ_i. There is a finite number of input variable templates, I = {i_1, …, i_n}, which can be read iteratively, but for each there are bit-vector constraints limiting its possible evaluation (represented by a predicate p_I, denoted ι_i). Finally, the complete description of a system also requires the outport functions O = {σ_1, …, σ_k} (each of which is an f_{X∪I}). On the side of the specification, we will abstract the given Büchi automaton to only the predicates p_{X∪I∪O} labelling its transitions, Ψ = {ϕ_1, …, ϕ_l}, and assume that the transition system, i.e. the order in which the predicates occur, is given implicitly. Hence the example below contains all the information needed for a Simulink diagram verification query Q = (X, I, O, Ψ).

Example 6.1 (CFG for Simulink). Let Q_1 = (X, I, O, Ψ), where

X = {d_1, d_2}
  δ_1 : (i_1 + d_1) ∗ (i_2 + d_2)
  δ_2 : d_1

O = {σ_1, σ_2, σ_3}
  σ_1 : max(i_1, i_2, i_3)
  σ_2 : ite (i_1 > 7) d_1 d_2
  σ_3 : d_1 + i_3

I = {i_1, i_2, i_3}
  ι_1 : 1 ≤ i_1 ≤ 20
  ι_2 : 3 ≤ i_2 ≤ 5
  ι_3 : 5 ≤ i_3 ≤ 10

Ψ = {ϕ_1, ϕ_2}
  ϕ_1 : σ_2 > 3 ∨ i_1 ≤ σ_1/2
  ϕ_2 : σ_1 = σ_2 ⇒ i_2 ≤ 4

[Figure: the corresponding Simulink block diagram, with inports i_1–i_3, delay blocks d_1 and d_2, arithmetic blocks +, ∗ and max, an ite block, and outports o_1, o_2.] ◁

As mentioned above, the states of a Simulink transition system are sets of evaluations of the variables from X. The initial state consists of a single evaluation {d_i ↦ 0}. Let I = {x := {i_j ↦ y} | x ⊨ ι_j} be the |I|-dimensional polyhedron of allowed inport evaluations; for Q_1 it would be {1..20} × {3..5} × {5..10}. Then, computing every transition consists of three steps:

1. compute the Cartesian product (of this state) with I: this results in a set of evaluations of variables from X ∪ I;

2. create a new state, initialised with the above set of evaluations, for every available specification formula ϕ (there can be more than one, hence the transition system may branch at this point) and remove those evaluations that do not satisfy ϕ;

3. apply the δ functions to compute the new evaluations of X.

Assuming that the computation is in state s, represented as the evaluation of the delay variables, and that the relevant transition is associated with a specification ϕ, the resulting next state is s' = {x := {d_i ↦ v[δ_i](v(d), v(i))} | v ∈ s × I ∧ x ⊨ ϕ}. The system can reach a deadlock if there is no v ∈ s × I for which {d_i ↦ v[δ_i](v(d), v(i))} ⊨ ϕ. The model checking process computes the graph of the Simulink computation, where identical states are unified, i.e. if a newly generated state s'' (as a successor of s) is found equal to an already existing state s', then s'' is discarded and s' is acknowledged as a successor of s. For the purpose of providing the user with counterexample traces, the states also contain an |I|-long sequence of inport values that satisfies the specification condition. That is easily implementable, and will only be mentioned if a significant change in the computation was necessary to incorporate the counterexample traces.

Example 6.2 (Simulink Semantics). We continue with the query Q_1 from Example 6.1 and simplify the notation by establishing two transformations. First, ϕ̃_i maps a set of evaluations of X ∪ I to the subset in which every evaluation satisfies ϕ_i. Rephrased formally, we have

  ϕ̃_i(ν) = ν  if ν ⊨ ϕ_i,  and  ϕ̃_i(ν) = ⊥  otherwise.

Second, ∆̃ maps a set of the same evaluations to evaluations of X, after the δ functions have been applied, hence

  ∆̃(ν) = {(d_i, v_i) | v_i = δ_i(ν(d_1), …, ν(d_n), ν(i_1), …, ν(i_m))}

Hence a part of the transition system of Q1 is as follows:

  s_01 = ∆̃ ∘ ϕ̃_1(s_0 × I)        s_011 = ∆̃ ∘ ϕ̃_1(∆̃ ∘ ϕ̃_1(s_0 × I) × I)
  s_02 = ∆̃ ∘ ϕ̃_2(s_0 × I)        s_012 = ∆̃ ∘ ϕ̃_2(s_01 × I)

where s_0 = {d_i ↦ 0}. Concretely, the set s_0 × I has 360 elements. ϕ̃_1(s_0 × I) and ϕ̃_2(s_0 × I) have 63 and 315 elements, respectively; s_01 has 12 elements and s_02 has 45 elements. For example, s_01 = {(3, 0), (4, 0), (5, 0), (6, 0), (8, 0), (9, 0), (10, 0), (12, 0), (15, 0), (16, 0), (20, 0), (25, 0)}. ◁
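To make the numbers for ϕ_1 reproducible, here is a small sketch (Python) that enumerates s_0 × I explicitly, filters with ϕ_1, and applies the δ functions of Q_1; integer division stands in for the bit-vector division in σ_1/2. It recomputes the 63 evaluations satisfying ϕ_1 and the 12 delay evaluations of s_01 listed above.

from itertools import product

# iota_1 x iota_2 x iota_3 = {1..20} x {3..5} x {5..10}
I = list(product(range(1, 21), range(3, 6), range(5, 11)))

def phi1(d1, d2, i1, i2, i3):
    sigma1 = max(i1, i2, i3)
    sigma2 = d1 if i1 > 7 else d2           # ite (i1 > 7) d1 d2
    return sigma2 > 3 or i1 <= sigma1 // 2  # sigma_2 > 3 or i_1 <= sigma_1 / 2

def delta(d1, d2, i1, i2, i3):
    return ((i1 + d1) * (i2 + d2), d1)      # (delta_1, delta_2)

s0 = {(0, 0)}
kept = [(d, i) for d in s0 for i in I if phi1(*d, *i)]
s01 = {delta(*d, *i) for (d, i) in kept}
print(len(s0) * len(I), len(kept), len(s01))  # 360 63 12
print(sorted(s01))                            # [(3, 0), (4, 0), ..., (25, 0)]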

The process described so far produces a transition system, where the states are evaluations of the delay variables plus a location of the Büchi automaton, and the branching is given by that same automaton. Without going into the details of the product of a program and its specification – covered in the previous section – the answer to a verification query Q is equivalent to the presence of accepting cycles in the produced transition system. The efficiency of answering a verification query, i.e. the complexity of Simulink model checking, is most noticeably influenced by the choice of how to represent the sets of evaluations. That choice is the biggest contributor to the state-space explosion, since, unlike in parallel programs, the control-flow non-determinism is negligible in Simulink models.

6.1.1 Verification with Explicit Sets

The first representation considered in the context of Simulink verification stores the allowed evaluations in an explicit set, enumerating all possible combinations of individual variable evaluations. This approach is closely related to the one we proposed in [BBB+12] but improves on it in several aspects. Effectively, rather than repeating the execution for every evaluation of both inport and delay variables, we only represent the delay variables by explicit evaluations; the inport variables are stored purely symbolically, only as their constructing predicates ι_j, outside the individual states. The above description of the Simulink verification semantics justifies this optimisation, since the inports are only used for the evaluation of the Ψ predicates from the specification. They are discarded during the ∆̃ transformation, and can be recovered using an inverse transformation if necessary, e.g. during counterexample generation.

The implementation itself is then a relatively straightforward extension of the purely explicit approach. The states enumerate every allowed evaluation, and thus separate access to each of these evaluations is permitted. Computing the Cartesian product with I amounts to enumerating all combinations of delay and inport variable evaluations. For the ϕ̃ and ∆̃ transformations, one iteratively considers the individual evaluations and, respectively, removes those not satisfying ϕ or applies the δ functions. Most importantly, however, deciding equality of states is possible on the level of syntax: two states are the same if their representations in memory match. Verification with explicit sets clearly retains the biggest limitation of the purely explicit approach, i.e. the space complexity grows exponentially with the ranges of the inport variables. We postpone the comparison between the explicit-set approach and the purely explicit approach until Section 6.2.2 and only remark that while the improvement is considerable, it does not allow specifying ranges with respect to their bit-width. That ability appears to require a symbolic representation, and we continue our analysis towards temporal verification of Simulink models with a method that uses BV formulae and satisfiability modulo theories (SMT) procedures for this theory.

6.1.2 SAT-Based Verification

The need to iteratively read inputs via the inport variables complicates the otherwise relatively straightforward application of symbolic execution. Later we will show that the execution-based approach, where the current values of variables are represented as symbolic functions over the input variables, entails another complication in deciding state equality. In standard symbolic execution, the variables are initialised to an arbitrary value only once, at the beginning. Thus reading from inports requires strengthening standard symbolic execution to allow computation with a potentially infinite number of input variables. Henceforth, only the inport variables are considered first-order variables; the delays and outports are only functions in the BV theory (over inport variables). The inports are history-dependent, and thus one may label the symbols from I to stand for the inport variables, i.e. I^0, I^1, I^2 and so on. Given that the only branching in the transition system is caused by the particular choice of a specification transition, the states can be represented as l-long sequences of numbers ρ = (r_1, …, r_l), where l is the length of the computation. Each number in this sequence records which specification formula was selected in the respective branching of the computation. Indeed, the functions representing the delays δ^l, the outports σ^l, and the specification predicates ϕ^l are unambiguously defined for each l recursively as follows:

  δ^0 = δ[i_j/i_j^0, d_j/0],              δ^{l+1} = δ[i_j/i_j^{l+1}, d_j/δ_j^l];
  σ^0 = σ[i_j/i_j^0, d_j/0],              σ^{l+1} = σ[i_j/i_j^{l+1}, d_j/δ_j^l];      (6.1)
  ϕ^0 = ϕ[σ_j/σ_j^0, i_j/i_j^0, d_j/0],   ϕ^{l+1} = ϕ[σ_j/σ_j^{l+1}, i_j/i_j^{l+1}, d_j/δ_j^l],

where f[x/y] is the formula f with every occurrence of x replaced by y. Hence every formula at any point of the verification computation uses only inport variables decorated with history indices. As mentioned above, the system reaches a deadlock if no value satisfies the specification formula. Let the computation be in a state represented by ρ and the outgoing transition be labelled with formula ϕ_p.

Checking whether this transition leads to a deadlock is then equivalent to checking the satisfiability of the following recursive path-condition formula, where ε is the empty sequence:

  path(ε) = ⋀_{1≤j≤|I|} ι_j^0,
  path(ρ.p) = path(ρ) ∧ ⋀_{1≤j≤|I|} ι_j^{|ρ|+1} ∧ ϕ_p^{|ρ|+1}.

Also note that the model of this formula, a satisfying evaluation of the inport variables, is a counterexample trace (one of them, to be precise), which the SMT solver can generate while deciding the satisfiability. Finally, to complete the verification process, one needs to be able to decide whether two states are the same. Let ρ and ρ' be two states for which it needs to be assessed whether they are identical, i.e. whether the sets of possible evaluations of the delay variables are the same. Then the function δ_j^{|ρ|} represents the possible values of d_j in ρ, provided that both path(ρ) and path(ρ') are satisfiable. However, the δ functions represent the current values of the delays by means of the inport variables, describing to which evaluation of the delays a particular combination of inputs leads. The states ρ, ρ' may not be of the same length and thus may differ in the number of inputs. Consequently, comparing two states amounts to comparing the images of the δ functions, i.e. comparing the sets of possible evaluations. We will continue this argument in Section 6.3 and for now simply state the quantified version of ψ(ρ, ρ') := Ψ_{ρρ'} ∨ Ψ_{ρ'ρ} – a formula which is satisfiable iff ρ and ρ' are two different states:

  Ψ_{ρρ'} : path(ρ) ∧ (∀y_1, …, y_{|ρ'|·|I|}) (path(ρ')[i_a'^b/y_{a·b}] ⇒ ⋁_{j=1}^{|X|} δ_j^{|ρ|} ≠ δ_j^{|ρ'|}[i_a'^b/y_{a·b}]).
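A minimal sketch (Python with z3) of both ingredients: the substitution-based unrolling of (6.1) and the quantified image comparison Ψ. The one-delay system below (δ = i + d, with the ι constraints shown) is an illustrative assumption; the query is satisfiable because the one-step state can reach the delay value 7 while the two-step state cannot.

from z3 import BitVec, BitVecVal, substitute, ForAll, Implies, And, ULE, Solver

i, d = BitVec('i', 4), BitVec('d', 4)
delta = i + d                    # delta as a BV term over input and delay (assumed)

def unroll(steps):
    # delta^l of (6.1): inputs get history indices, the delay is replaced recursively
    term = BitVecVal(0, 4)       # d_j / 0
    for l in range(steps):
        term = substitute(delta, (i, BitVec('i_%d' % l, 4)), (d, term))
    return term

d_rho = unroll(1)                # state rho read one input
d_rho2 = unroll(2)               # state rho' read two inputs
y0, y1 = BitVec('y_0', 4), BitVec('y_1', 4)
d_rho2 = substitute(d_rho2, (BitVec('i_0', 4), y0), (BitVec('i_1', 4), y1))

path_rho = ULE(BitVec('i_0', 4), 7)        # iota for rho (assumed)
path_rho2 = And(ULE(y0, 3), ULE(y1, 3))    # iota for rho', renamed (assumed)

# Psi: some delay value reachable in rho is unreachable for any inputs of rho'
psi = And(path_rho, ForAll([y0, y1], Implies(path_rho2, d_rho != d_rho2)))
s = Solver(); s.add(psi)
print(s.check())   # sat: i_0 = 7 gives delay 7, while rho' only reaches 0..6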

Put together, the SAT-based verification of Simulink works as shown in Algorithm 13 – we only describe the generation of one successor of a state s; the rest is the same as in standard model checking. That is, a particular accepting cycle detection algorithm traverses the state space by means of successor generation. Given that the states already visited are known – a property that has to be preserved by any implementation of successor generation – the model checking process is indeed similar to standard explicit model checking, described for example by [VW86]. The functions in teletype font have their expected meaning, i.e. sym and exp return the symbolic and explicit part of the state given as their parameter. As described in Section 3, the symbolic part is the set of evaluations of the input variables (here succinctly represented as the sequence of specification formula indices) and the explicit part is the control state (here only the state of the Büchi automaton is relevant). One interesting observation is that the search for equivalent states (the loop on lines 5–10) is in standard model checking a constant-time operation, using a hashing function. Our approach, however, prohibits hashing, because a representation using BV functions is not canonical, leading to identical states with different hashes. The explicit part, which is canonical, is used (line 5) for a partial hash-based search, so the body of the loop only considers states with the same explicit part. Hence the inner loop performs a linear search among states with a common explicit part, where each comparison requires calling the state matching procedure (line 7), i.e. deciding the satisfiability of the quantified formula ψ. The crucial argument regarding the correctness of the above SAT-based verification of Simulink diagrams is formalised in the following theorem.

Algorithm 13: Successor Generation for Simulink

Input : storage of seen states S; state s; specification automaton A_Φ
Output: S' where all successors of s have been correctly added

1   foreach ϕ_i, b such that exp(s) —ϕ_i→ b ∈ A_Φ do
2     ρ ← sym(s).i
3     if SAT_QFBV(pc(ρ)) then
4       seen ← false
5       foreach s' ∈ S such that exp(s') = b do
6         ρ' ← sym(s')
7         if ¬SAT_QBV(ψ(ρ, ρ')) then
8           s' is a successor of s
9           seen ← true
10          break
11      if ¬seen then
12        exp(s''), sym(s'') ← b, ρ
13        S' ← store(S, s'')
14        s'' is a successor of s

Theorem 6.3 (Simulink Verification Correctness). The proposed equality of states using the function ψ is correct with respect to the semantics of Simulink diagrams, i.e.

  ∀ρ, ρ' : ψ(ρ, ρ') ⇔ s_ρ ≠ s_{ρ'}.

Proof. This can be easily reduced to Ψ_{ρρ'} ⇔ s_ρ ⊈ s_{ρ'} and similarly for the other inclusion. Furthermore, Ψ_{ρρ'} ⇔ ∃x. x ∈ s_ρ ∧ x ∉ s_{ρ'} and

  x ∈ s_κ ⇔ ∃j_1^1, …, j_{|I|}^1, j_1^2, …, j_{|I|}^{|κ|}. x = d(κ, J_1, …, J_{|κ|}),

where J_l unrolls to j_1^l, …, j_{|I|}^l and

  d(κ, J_1, …, J_n) = ∆̃ ∘ ϕ̃_{a_n}(… ∆̃ ∘ ϕ̃_{a_1}({0^{|X|}} × {J_1}) … × {J_n}).

One then only needs to show that for all j_1^1, …, j_{|I|}^1, j_1^2, …, j_{|I|}^{|κ|}, if pc(κ)[i_a^b/j_a^b] then

  ⟨ν[δ_1^{|κ|}[i_a^b/j_a^b]], …, ν[δ_{|X|}^{|κ|}[i_a^b/j_a^b]]⟩ = d(κ, J_1, …, J_{|κ|}).

That can be done by induction on the length of κ. For the base case we have that for an arbitrary interpretation ν, ν[δ_i^0] = 0 = π_i(d(ε)). Assume j_1^{n+1}, …, j_{|I|}^{n+1} are in some ν evaluated such that ϕ_k[i_a/j_a^{n+1}]. Then for all 1 ≤ l ≤ |X| it holds that

  ν[δ_l[i_a/j_a^{n+1}, d_a/π_l(d(κ, J_1, …, J_n))]] = π_l(∆̃ ∘ ϕ̃_k(d(κ, J_1, …, J_n) × {J_{n+1}})),

which finally equals π_l(d(κ.k, J_1, …, J_{n+1})).

Having established the correctness of the ψ function, i.e. of the equivalence among states that represent sets of variable evaluations, the rest follows from the correctness of the set-based reduction, which we proved in Section 5. The complexity of the process is much more difficult to establish. On the one hand, the set-based reduction may, in the worst case, lead to no reduction at all. This closely relates to the computation of fixed points of operations on program cycles, which is discussed in detail by [Lin96, HGD95], among others. The reader should note, however, that in the case of the Simulink diagram and the verified properties used in our experiments this phenomenon did not occur. In fact, the transition systems were exponentially smaller than in the case when no reduction was used. On the other hand, even with exponentially smaller state spaces, every instance of state matching requires deciding satisfiability in the quantified BV theory. That is asymptotically equivalent to the satisfiability of quantified Boolean formulae, and thus NEXPTIME-complete, as proved by [WHdM13]. Given the results of our experiments (Section 6.2.2), this high complexity does not prevent practical use, since many of these decision instances are in fact trivial.

6.1.3 Hybrid Representation

Although the SAT-based representation already achieves viable results for the real-world Simulink model we experimented on, there is still space for further improvement. In some sense, both the explicit and the SAT-based representations carry out an excessive amount of work in testing all possible input values. There may be a large number of values that the Simulink engineers never intended to use, yet this information was not included in the model: either intentionally or because the language does not permit other than interval constraints. Such tacit knowledge may be inferred and utilised to accelerate the verification process. Assume that we want to generate the successor ρ' of a state ρ, represented as a conjunction of BV terms ⋀_{j=1}^{|X|} δ_j^{|ρ|} and the accumulated path condition path(ρ), using a transition guarded by ϕ. A single evaluation of the delay variables belonging to the successor can be obtained by reading a satisfiability model of the formula

  ⋀_{j=1}^{|X|} d_j = δ_j^{|ρ|} ∧ path(ρ.ϕ).

A satisfiability model is an evaluation of all free variables that satisfies the formula. The input variables can be ignored, but we will store the evaluation of the delay variables as a sequence d^1 = ⟨d_1^1, …, d_{|X|}^1⟩. Now we can extend the above formula to

  ⋀_{j=1}^{|X|} d_j = δ_j^{|ρ|} ∧ path(ρ.ϕ) ∧ ⋁_{j=1}^{|X|} d_j ≠ d_j^1,

whose satisfiability model (provided the formula is still satisfiable) contains an evaluation of the delay variables d^2 different from d^1. This process can be repeated until the resulting formula η (Formula 6.2 below) becomes unsatisfiable, at which point we have all the satisfying evaluations, and thus the explicit representation of the successor ρ'. There are several notes to be made regarding the above procedure. It is clearly impractical in cases where there are many evaluations forming ρ'. Yet in many cases the Simulink design

permits only a very small number of evaluations. The procedure is intended as a heuristic: either detecting that only a few evaluations are admissible, or reverting to the classical, symbolic representation. Such detection can be implemented by setting a user-specified threshold thr on the number of evaluations. It immediately follows that using this heuristic requires the ability to switch from the symbolic to the hybrid representation and vice versa. In the general case, a hybrid state comprises a pair: the accumulated path condition path and a non-empty set of delay evaluations D = {d^i}. We can assume D to be non-empty because, initially, D = {⟨0, …, 0⟩}, and with every computation step the evaluations in D are either replaced with new evaluations D' or the path condition is extended (in which case D remains unmodified). The first case occurs when the successor contains fewer than thr values, i.e. |D'| < thr and η is unsatisfiable; the hybrid successor state is then ⟨⟨⟩, D'⟩. The second case occurs when |D'| ≥ thr; the successor is then ⟨path.ϕ, D⟩, where ϕ is the transition guard. We are now ready to formulate the formula η, employed in the generation of successor evaluations:

  η : ⋀_{j=1}^{|X|} d_j = δ_j^{|ρ|} ∧ path(ρ.ϕ) ∧ ⋀_{k=1}^{|D'|} ⋁_{j=1}^{|X|} d_j ≠ d_j^k.      (6.2)
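A minimal sketch (Python with z3) of the enumeration loop built around η: models are extracted and blocked one by one until either η becomes unsatisfiable (the successor is fully explicit) or the threshold is exceeded (the successor stays symbolic). The δ function, path condition, and thr below are illustrative assumptions.

from z3 import BitVec, ULE, Solver, sat

d, i = BitVec('d', 4), BitVec('i', 4)
thr = 4                              # user-specified threshold (assumed)
s = Solver()
s.add(d == i + 1, ULE(i, 7))         # d = delta^{|rho|} and path(rho.phi) (assumed)

D_next = []
while s.check() == sat:
    v = s.model()[d]                 # one concrete delay evaluation d^k
    D_next.append(v)
    if len(D_next) > thr:            # more than thr evaluations: stop explicating
        break
    s.add(d != v)                    # block d^k: the last conjunct of eta

if len(D_next) > thr:
    print('successor stays symbolic: <path.phi, D>')
else:
    print('fully explicit successor:', D_next)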

The hybrid representation thus maintains the invariant that a state is formed by a concrete set of evaluations of the delay variables D and a symbolic path condition consisting of n formulae. This state, however, represents the result of a computation of n + m steps, where the effect of the first m steps is precisely represented and stored in D. The only modification of the preceding SAT-based verification required to incorporate the hybrid representation is an additional set of formulae restricting the initial delay evaluations. More precisely, the set of recursive equations (6.1) has to be preceded by

  ⋁_{k=1}^{|D|} ⋀_{j=1}^{|X|} d_j^0 = d_j^k.

The size of D is initially 1 and never grows above thr: if the successor state contains more than thr values of delays – a fact which is automatically deduced – the verification proceeds with the symbolic representation. Yet whenever the successor contains at most thr values, our hybrid representation becomes fully explicit, succinctly representing the complete information about the system’s computation up to this point.

6.2 Experimental Evaluation

We report the results of the verification of the VoterCore diagram, which was partly presented in Section 1: the figure shows one third of the whole diagram; the other two thirds are connected via the c3 constant. This diagram was selected for two reasons: (1) the operations that must be computed during the state-space traversal are sufficiently complex to demonstrate the strength of SMT, (2) the diagram itself is reasonably simple, so that scalability compared to other approaches can be demonstrated. Note that even though this diagram comes from real-world development, it does not represent the full model used in the design phase. It was identified as a crucial unit and extracted manually for verification purposes.

The above-mentioned diagram was verified against three LTL specifications:

  φ_1 : G (sd23 > 1 ∧ ¬sv1) ⇒ sm1;
  φ_2 : G(pm1 ⇒ G pm1);
  φ_3 : G(sm1 ∧ X sm1 ∧ XX sm1 ∧ XXX sm1 ⇒ XXX pm1),

where the model satisfied φ_1 and φ_2, but was incorrect with respect to φ_3. The previous, purely explicit solution, which we described in [BBB+12] and refer to here as expl, is compared with the two solutions proposed in this section: verification using explicit sets (set) and SAT-based verification (sat). Finally, the hybrid representation (hyb) is compared to sat on a set of experiments using the same model but focusing also on a collection of extreme cases where sat behaved poorly.

6.2.1 Implementation

Both the explicit-set and the SAT-based approaches to verification of Simulink diagrams were implemented¹ in a clone of the DIVINE verification environment, which already supported these diagrams as input, in the form of the CESMI intermediate language (see [BBH+13] for details). The implementation of set is a relatively straightforward extension of expl: instead of keeping the evaluations in separate states, there are fewer multi-states, one for every explicit control state. Furthermore, only the delay variables are represented; the inport variables are stored separately, only once for the whole diagram. For the implementation of sat, the expressions represented by Simulink blocks first needed to be translated into BV formulae.

(set-option :produce-models true) (set-logic QF BV) (declare-fun a0x0 () ( BitVec 4)) (declare-fun a1x1 () ( BitVec 4)) (declare-fun a12x1 () ( BitVec 4))

1 Code available at http://anna.fi.muni.cz/~xbauch/code.html#simulink.

...
(assert (= a0x0 #x0))
(assert (and (bvuge a1x1 #x0) (bvule a1x1 #x3)))
(assert (= a12x1
  (ite (bvuge (bvadd (ite (or (bvugt a1x1 #x1) (not (= a2x1 #x1))) #x1 #x0)
              (ite (bvult (ite (or (bvugt a1x1 #x1) (not (= a2x1 #x1))) a0x0 #x0) #x3)
                   (ite (or (bvugt a1x1 #x1) (not (= a2x1 #x1))) a0x0 #x0)
                   #x3))
        #x3)
   #x1 #x0)))
...
(assert (and (= a12x1 #x1) (= a12x1 #x0)))
(check-sat)
(get-value (a1x1 a2x1 a4x1 a5x1 a7x1 a8x1))

where the variables a0x0, a1x1, and a12x1 are examples of delay, inport, and outport variables, respectively; the xi suffix denotes the history index. The third line from the end is the current path condition: the algorithm is generating successors of the first state, so only one specification formula is present. Finally, the last line requests the model of the system of formulae: the inport variable evaluation for counterexample generation.

State matching, on the other hand, leads to a formula in quantified BV (see line 7 of Algorithm 13), and an adequately powerful solver needs to be used. The following fragment shows the interesting part of the communicated SMT2 file:

(assert (forall ((b0x0 Full) (b3x0 Full) (b6x0 Full))
  (=> (and (= b0x0 #x0) (= b3x0 #x0) (= b6x0 #x0))
      (or (distinct a0x1 b0x0) (distinct a3x1 b3x0) (distinct a6x1 b6x0)))))

where Full is the domain of the delay variables (ai and bi refer to the same variables in different states). Note also that the algorithm allows comparison between delay variables with different history indices, i.e. those from states represented by path conditions of unequal length.

Hybrid Representation Implementation: With regard to the hybrid representation of Section 6.1.3, there are two interesting implementation details: the generation of delay values using SMT queries and the modification of the search among known states. Other than these two changes, both the path condition satisfiability and the state matching have to be extended with the enumeration of the initial delay values:

(assert (or (and (= a6x0 #x0) (= a7x0 #x0) (= a8x0 #x0))
            (and (= a6x0 #x0) (= a7x0 #x0) (= a8x0 #x1))
            (and (= a6x0 #x0) (= a7x0 #x0) (= a8x0 #x2)) ))

which are then referenced in the expressions for other Simulink variables (and possibly delays with a higher history index). These initial delays are the only variables with history index zero, and they retain it throughout the computation. This is justified because the length of the computation represented symbolically is equal to the length of the path sequence; thus we can always label the Simulink variables with the correct index. The generation of delay values is an extension of the path condition satisfiability checking: there, too, the initial delay values have to be enumerated, but instead of collecting the values of the input variables from the satisfiability model, we are interested in the values of the delays. These values are stored and later used in subsequent queries so that the enumeration

can be exhaustive. The interesting part of the query would thus have the following structure:

(assert (and (or (distinct a6x2 #x0) (distinct a7x2 #x2) (distinct a8x2 #x7))
             (or (distinct a6x2 #x3) (distinct a7x2 #x0) (distinct a8x2 #x0)) ))
(check-sat)
(get-value (a6x2) (a7x2) (a8x2))

which instructs the SMT solver to check satisfiability of the path condition using only such inputs whose evaluation leads to a value of the delays different from those explicitly enumerated. Storing the delay evaluations explicitly in the hybrid representation provides a very important property for model checking: the representation is canonical. Hence the slow linear search on lines 5–10 of Algorithm 13 may be replaced with an effectively constant-time hash-based search. Given that the states are completely represented by the delay evaluations, moving these evaluations into the explicit part of the state suffices to implement such a hash-based search. Only those states with path conditions consisting of at least one formula would need to be distinguished using the above SMT queries, and only among themselves. Two states ⟨path, D⟩ and ⟨path', D'⟩ can only be equal if |path| = 0 ⇔ |path'| = 0, i.e. if either both have been exhaustively concretised in the previous step or neither of them. Since two equal states must contain the same number of possible delay evaluations, it follows that the generating procedure would either report that there are more than thr values or successfully enumerate both states.

6.2.2 Experiments

Similar execution conditions were chosen for the comparison between the purely explicit approach (unmodified DIVINE) and the new hybrid approaches: the code was compiled with the optimisation option -O2 using GCC version 4.7.2 and run on a dedicated Linux workstation with a quad-core Intel Xeon 5130 @ 2GHz and 16GB RAM. Given that DIVINE uses on-the-fly verification – the traversal terminates when an accepting cycle is detected – the progression of the verification process is prone to vary. Furthermore, parallel algorithms tend to differ in runtime between individual executions; hence we executed every experiment 10 times and report the averages.

Purely Explicit versus Explicit Sets

The 6 plots in Figures 6.1 to 6.3 contain all the measured experiments: the left-hand side reports the time results and the right-hand side the space results; each row refers to experiments against a different temporal specification – φ_1, φ_2, and φ_3 – described above. Each plot has the range of inport variables on a linear x-axis and either time or space on a logarithmic y-axis. The crucial observation to be made from the figures is that the set experiments scale with the range of variables much more effectively than the expl experiments. This observation, however, is obfuscated by two phenomena: the difference between parallel and sequential algorithms (treated below) and the results for the property φ_3. The VoterCore system is not a model of φ_3, and hence there is an accepting cycle in the system. On-the-fly verification enables very fast detection in some cases (when the cycle is near the initial state), and thus the expl experiments are better on smaller ranges (though both the time and the space requirements are negligible there for either approach).

[Figure 6.1: Plots that report the results of the VoterCore Simulink diagram verification, comparing the purely explicit approach (expl) with the approach that uses explicit sets of variable evaluations (set). The sequential verification (-seq) is compared with the parallel verification (-par), which used four processor cores. The range z of input variables on the x-axis corresponds to the domain {0…z}; time (s) and space (kB) are on logarithmic y-axes. The expl experiments had a time-out limit of 1000 seconds and a memory limit of 10GB. For example, the unreduced state space of range 5 has 375 000 states and 500 million transitions. Here we compare the complexity on the property φ_1.]

[Figure 6.2: Purely explicit vs. explicit sets on the property φ_2.]

The difference between the parallel and sequential algorithms used in the experiments deserves closer scrutiny. Parallel algorithms (expl-par and set-par in the plots) employ breadth-first search, unlike the sequential algorithms (expl-seq and set-seq), which employ depth-first search.

[Figure 6.3: Purely explicit vs. explicit sets on the property φ_3.]

Given that the resulting transition system is very shallow (the longest paths have approx. 20 edges) and at the same time extremely wide (the data-flow explosion causes each state to have many successors), the depth-first approach has a considerable advantage with respect to space complexity. This can be observed in all the right-hand side plots for the expl experiments, yet the set experiments cannot profit from the same phenomenon: the data-flow explosion of the number of successors was transformed into the complexity of their generation, which is common to both the parallel and the sequential approaches. Hence, while retaining the parallel scalability in time (the par experiments use all four processor cores), in space the set experiments demonstrate superiority only over the par experiments. (The φ_1 experiments are anomalous because the property transition system has no branching, and thus the set experiments traverse a linear system and cannot scale in parallel even in theory.)

Explicit Sets versus SAT-Based Verification

The next 6 plots, in Figures 6.4 to 6.6, compare the two approaches proposed here. The setting is exactly the same as before, except that this time we do not include the results of parallel verification, so as not to obfuscate the main message. Most importantly, for the VoterCore diagram and for domains of input variables larger than approximately 40, the SAT-based representation surpasses the one using explicit sets. The plots only report a part of the experiments; the limit on the x-axis is artificial, so that the phenomena before the range of 40 remain clear. In fact, the sat experiments continued to a range corresponding to the domain 0–2^32 of standard unsigned integers, which degraded the time performance by at most a factor of two compared to, for example, the range 10. Furthermore, the space requirements of such verification were almost negligible for an arbitrary range, rarely exceeding 500 MB. For very small ranges of variables, strictly smaller than 40, the space requirements of the set approach are tolerable and its time complexity is better than that of the sat approach. The outstanding spikes in the complexity of sat also deserve further explanation. They correspond to ranges 2^n − 1 for some n. We postulate that this relates to the specific procedure that Z3 uses to decide the satisfiability of quantified BV formulae. The precise explanation would probably require a detailed exposition of the particular decision procedure employed in Z3, and we encourage interested readers to follow it in [WHdM13]. The high-level process in fact incorporates a model checking of quantifier instantiations, guided by previously tried unsuccessful counterexamples. It appears that limiting the range to a value very close to 2^n forces a larger number of wrong instantiations.

Evaluation of the Hybrid Representation

As will become clear once the results pertaining to verification using the hybrid representation (hyb) are presented, there is no need to compare it with the explicit verification. Both the time and space requirements of hyb are lower than the respective requirements of expl, except on the most trivial instances. Yet the theoretical comparison of the two approaches is particularly interesting, because the states themselves have exactly the same representation, i.e. an explicit enumeration of delay variable evaluations. The only difference is in the process by which these evaluations were generated. Unlike in expl, where we have to test all evaluations of inputs to see which delay evaluations they produce, in hyb we only employ those combinations of inputs for which we know that they will produce different delays. This knowledge about the Simulink system is obtained using further SMT queries, and may by itself help the designer in validating that the modelled system is the one intended. The plots in Figure 6.7 compare the purely symbolic and the hybrid representations with respect to verification time. (We do not compare the space requirements because they were negligible for sat already, and for hyb the peak memory consumption never surpassed 6MB.) The timing results for sat are exactly the same as in Figure 6.4.

[Figure 6.4: Plots that report the results of the VoterCore Simulink diagram verification, comparing the representations based on explicit sets (set) and BV formulae (sat). The range z of input variables on the x-axis corresponds to the domain {0…z}; time (s) and space (kB) are on logarithmic y-axes. The set experiments had a memory limit of 10GB. Here we compare the complexity on property φ_1.]

Only here we intentionally pinpoint the difficult domain sizes, i.e. the cases when the input ranges are 0–(2^n − 1). The SMT solver struggles in these cases when it must produce a proof of unsatisfiability for a quantified query.

[Figure 6.5: Property φ_2.]

We hypothesise that this phenomenon is caused by the model-based instantiation of quantified variables, used as one of the techniques in Z3. Setting the upper bound of the input range this close to the limit posed by the fixed bit-width seems to force the instantiation to choose from a larger set of values before the solver can deduce unsatisfiability. It may thus be possible to modify the solver to achieve better results, and we are presently investigating this possibility.

[Figure 6.6: Property φ_3.]

What these plots immediately show is that the undesirable behaviour on the difficult domain sizes was completely eliminated by employing hyb. Even though it occurred in only a logarithmic number of cases, it would still be very unpleasant if the system designer happened to choose such a case.

[Figure 6.7: Plots that report the results of the VoterCore Simulink diagram verification, comparing the representation based on BV formulae (sat) and the hybrid representation (hyb). The range z of input variables on the x-axis corresponds to the domain {0…z}; the panels show the verification complexity for properties φ_1 (a), φ_2 (b), and φ_3 (c).]

Disregarding these extreme cases, with hyb we achieved speedups of 2.4, 6.7, and 2.5 for the three verified properties, respectively. With the new representation, the verification process scales even better, and the complexity is in fact only dependent on the number of steps needed to resolve the verification query. On the other hand, the complexity of individual SMT queries in hyb does not grow with the number of steps. That is not the case for sat, where the number of quantified variables is directly related to the length of the current path condition. Hence the difference in speedups for the three verified properties: checking φ_2 required building a state space whose longest path was twice the length of those for the other properties. We thus conclude that for more complex systems, and in cases where hyb is applicable (as discussed in Section 6.1.3), it would produce an even better speedup compared to sat.

6.3 Discussion

By reducing deadlock checking and state matching to SMT queries, we have managed to accelerate the model checking process to a point where the developer can omit specifying unit-testing inputs and verify that the design is correct for all possible values. Given that the process still requires large amounts of computational resources, the newly proposed hybrid representation accelerates the method further, rendering it even more applicable for verifying critical units in Simulink designs. To summarise the findings of our experiments with the verification of Simulink diagrams, we propose a combination of the set and sat/hyb approaches. Such a combined verification is possible because the two approaches accept identical models as input. The result would correspond to the composition of (the better of) the solid and dotted lines of Figure 6.4. The developer could decide which representation to use based on the range of variables they need to verify against. Or – better yet, and fully automatically – the two approaches can be run in parallel, with the one finishing first reporting the result of the verification (a sketch of this portfolio combination follows below). Arguably, the experimental results and the achieved empirical findings remain largely academic. Even though the VoterCore model used in our experiments comes from a real-world design, it is only a part of a larger Simulink model. An industrial application of the proposed method would equally require manually selecting a safety-critical part of the verified model and verifying that this part has the desired properties. Our experiments already show the limits of using SMT solvers for LTL verification, even when the new hybrid representation is employed. More importantly, however, the experiments also show that if the input model is sufficiently small for the verification to finish in reasonable time, then complete coverage across all possible input values is provided – a quality that the established techniques based on testing cannot provide.
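A minimal sketch (Python) of the portfolio combination: the two verifiers race in parallel and the first verdict wins. The verify_set and verify_sat placeholders stand in for the actual set- and sat-based verifiers.

from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait
import time

def verify_set(model):               # placeholder for the explicit-set verifier
    time.sleep(0.2)
    return 'set: property holds'

def verify_sat(model):               # placeholder for the SAT-based verifier
    time.sleep(0.1)
    return 'sat: property holds'

def verify_portfolio(model):
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(verify_set, model), pool.submit(verify_sat, model)]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()   # the first finished verdict is reported

print(verify_portfolio(None))        # here the sat verifier answers first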

6.3.1 Theoretical Limitations

Given that the proposed verification technique reduces the problem of LTL model checking to a sequence of SMT queries, the limitations of the technique derive from the state of the art of SMT solvers. It follows that improvements in the efficiency of SMT solvers will also improve our technique, and this improvement will require no further modifications. On the other hand, solving quantified SMT queries is a computationally demanding task. We will thus discuss the state matching – the part of our technique that seems to require solving quantified queries – in greater detail.

[Figure 6.8: Time progression of the complexity of individual state matching instances (x-axis: the instances in order of appearance; y-axis: the number of quantifier instantiations, logarithmic scale).]

Figure 6.8 shows a part of the verification process, where the x-axis lists the individual instances of state matching as they appeared in time. The y-axis represents the number of quantifier instantiations the solver needed before finding a counterexample or a proof. Finding counterexamples, i.e. evaluations of delay variables that exist in one state but not in the other, appears to be a considerably simpler task, corresponding in the figure to steps with fewer than 40 instantiations. Finding a proof of equality between two states was much more difficult, often requiring more than 400 instantiations, with correspondingly higher time and space complexity. These higher spikes occur in consecutive pairs, one for each subset inclusion.

6.3.2 State Matching by Comparing Function Images

Returning to the claimed necessity of resorting to quantified formulae for state matching, note that computing the equivalence between δ^{|ρ|} and δ^{|ρ'|} would be sufficient for state matching if the inputs were read only once, at the beginning. The state space would be larger, as some states with the same sets of values would have semantically different representing functions. In Simulink, however, where inputs are read iteratively, such an approach would lead to infinite transition systems in all nontrivial instances. Computing equivalence between the two δ functions does not yield a correct result with respect to image comparison. For example, on the Boolean domain the functions negation and identity are not equivalent, yet their images are the same (see the sketch at the end of this section). Consequently, stating the comparison as satisfiability in the BV theory must differ from simple equality. Assume that the formula Ψ_{ρρ'} is satisfiable iff ρ and ρ' are two different states. Then the model of Ψ_{ρρ'} must encode an element – a current evaluation of the delay variables – of either ρ or ρ' that is not in the other state. But Ψ_{ρρ'} can only refer to the domains of the δ functions, the input variables, because in symbolic execution the current values are represented as functions over the input variables. It follows that the satisfying evaluation of the input variables in ρ must be such that for any evaluation of the inputs in ρ' the δ functions evaluate to different values. It appears that stating functional image comparison within symbolic execution as a satisfiability decision requires the use of quantifiers. That much seems to hold if it is the input variables that are free in Ψ_{ρρ'}, as argued above. The more general question – whether for bit-vector functions f : bv_q^n → bv_q and f' : bv_q^m → bv_q there is a quantifier-free bit-vector formula ψ such that |ψ| ∈ poly(n + m) and ψ is satisfiable iff f(bv_q^n) = f'(bv_q^m), comparing the images of the two functions – is outside the scope of this work and we leave it open for future research.

The more interesting question, at least from the practical point of view, is how to implement the state matching in a set-based reduced transition system, regardless of how the sets are represented. Consider, for example, representing the sets of evaluations by their characteristic functions, i.e. functions over the state-forming variables that, given an evaluation of those variables, return 1 or 0 according to whether the evaluation is part of the set. (In other words, representing the evaluations of variables as the collected post-conditions.) Then the state matching would reduce to deciding the semantic equivalence of these characteristic functions, or, if a canonical representation were available, even to syntactic equivalence.
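The negation/identity observation mentioned above can be replayed directly in z3 (Python) – a minimal sketch: the first query (pointwise difference of the functions) is satisfiable, while the second, quantified query (some value of one function outside the image of the other) is unsatisfiable.

from z3 import Bool, Not, Exists, ForAll, Solver

x, y = Bool('x'), Bool('y')

s1 = Solver()
s1.add(x != Not(x))                           # the functions differ pointwise
print(s1.check())                             # sat

s2 = Solver()                                 # is any Not-value outside id's image?
s2.add(Exists([x], ForAll([y], Not(x) != y)))
print(s2.check())                             # unsat: the images coincide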

Chapter 7

Analysis of Semi-Symbolic Representations

The Simulink designs, verified in the previous chapter, and the LLVM programs, verified in the following chapter, are the main practical results of our work. We demonstrate on these two examples that our verification method is capable of handling unmodified products of both the design stage and the implementation stage of software development. Yet some properties of control explicit—data symbolic model checking may not be revealed by applying the method to two particular types of software models. A thorough investigation of the practical aspects of the method requires a more general approach. We have implemented the set-based reduction using both explicit sets and BV formulae, the state matching using both the image-based and the equivalence-based techniques, and the hash-table collision resolution accelerated by both the value-explicating and the witness-based matching resolution heuristics. All the proposed and implemented¹ techniques were experimentally evaluated on models in the DVE modelling language (either adopted from the BEEM [Pel07] database or hand-written, complex models) and on randomly generated models (which can be easily modified to highlight specific aspects of individual techniques). Finally, we delegated the satisfiability queries to the Z3 tool [dMB08], which was at the time of writing the only one implementing QBV satisfiability. The presented experimental results are intended to demonstrate some of the properties of control explicit—data symbolic model checking. Hence, rather than accumulating a large number of experiments, we select a representative sample and demonstrate the relevant details on that sample. If the reader is interested in more real-world examples, we report our results for Simulink verification in Section 6 and for parallel LLVM programs in Section 8.

7.1 Case Study: DVE Models

The first set of experiments pertains to the native language of DIVINE [BBH+13]. Using an existing language allows for a direct comparison with existing verification techniques, in this case with classical explicit-state LTL model checking. Yet since classical explicit-state model checking only verifies closed systems, i.e. systems without input variables, the technique had to be slightly modified. The modification consists of replacing every transition that uses input variables with as many new transitions as there are values in the domain of the original input variable.

1 Code available at http://anna.fi.muni.cz/~xbauch/code.html#symbolics.

Formally, a transition l −ρ→ l′ is replaced with {l −(ρ ∧ ⋀_j i_j = y_j)→ l′ | ȳ ∈ sort(I)}, and we refer to this approach as purely explicit.

7.1.1 DVE Language for Open Systems

The DVE language was established specifically for the design of protocols for communicating systems. There are three basic modelling structures in DVE: processes, states, and transitions. At any given point of time, every process is in one of its states, and a change in the system is caused by following a transition from one state to another. Communication between processes is facilitated by global variables or by channels, which connect two transitions of different processes such that these two transitions are followed concurrently. Following a transition is conditioned by guard expressions that the source state must satisfy, and entails effects, assignments modifying the variable evaluation. The LTL specification is merely another process, whose transitions are always followed concurrently with some transition of the system. A comprehensive treatment of the DVE syntax and semantics can be found in [Šim06].

The DVE language allows using variables of different types and consequently of different sizes. The proposed modification allows specifying which variables are input variables and what their domains are. Explicit states are represented as a function, assigning a value to each variable and a state to each process. The multi-states used in our setting are composed of two parts. The first, control part is essentially the whole explicit state as defined above. The second, data part introduces a new variable data that represents the evaluations of all input variables. Note that, in order to evaluate expressions, we need an evaluation that assigns a single value to every variable. A possible layout of such a multi-state is sketched below.
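The following C++ sketch shows a hypothetical layout of a multi-state (the field names are ours, not those of the DIVINE implementation); here the data part enumerates the evaluations of the input variables explicitly.

// Hypothetical layout of a multi-state for DVE models with input variables.
#include <map>
#include <set>
#include <string>
#include <vector>

struct MultiState {
  std::vector<int> processStates;        // control: current state of every process
  std::map<std::string, int> variables;  // control: explicit variable evaluation
  std::set<std::vector<int>> data;       // data: evaluations of the input variables
};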

Example 7.1 (DVE Model). Below is a DVE model of a server-based work distribution protocol. The process Control computes the amount of work and then chooses a worker in a loop. The worker processes continuously decrease the amount of work allotted to them.

byte h[4]; byte w[4]; byte ws[0]=0,1000; byte W=32;

process Control {
  byte i=0; byte rem=0; byte val=0; byte y=0;
  state pref, f, q, q1, q2;
  init pref;
  trans
    pref -> f { effect y=ws; },
    f -> q   { effect h[0]=y, h[1]=y, h[2]=y, h[3]=y; },
    q -> q1  { effect i=(i+1)%4, rem=(rem+h[i])%4, val=(val*rem)+h[i]; },
    q1 -> q2 { guard rem == 0; effect w[rem]=(val*val)%W; },
    q1 -> q2 { guard rem == 1; effect w[rem]=(val+val)%W; },
    q2 -> q  { effect i=i; };
}

process Worker0 {
  state w;
  init w;
  trans
    w -> w { guard w[0]>0, w[0]<W/(2*4); effect w[0]=0; },
    w -> w { guard w[0]>0, w[0]>=W/(2*4); effect w[0]=w[0]-(W/(2*4)); };
}

The one interesting extension to the standard DVE language is the declaration of the variable ws. For technical reasons it is declared as an array of 0 elements, but the semantics is such that ws is an input variable with sort(ws) = {0, . . . , 1000}.

7.1.2 Set-Based Reduction with Explicit Sets

As Definition 5.1 prescribes, the function successors must evaluate the guard and assignment of every available transition for every evaluation in the set comprising the given multi-state. For a multi-state s and an LTS transition s(pc) −ρ→ l′, the generation of successors amounts to (1) computing the set Valid = {v̄ ∈ s(data) | ⊨_SAT ρ ∧ ⋀ x_i = v_i} and (2) computing the set Applied = {v̄′ ∈ sort(D) | ∃(v̄ ∈ Valid), ⊨_SAT ρ ∧ ⋀ x_i = v_i ∧ ⋀ x_i′ = v_i′}. Then the successor of s with respect to the above transition is {(pc, l′)} ∪ {(data, Applied)}.

Hence implementing the function successors does not require any modification of the underlying model checking mechanisms. The individual evaluations from s(data) are sequentially loaded into an explicit variable evaluation and the results of taking the transition are collected back, as the sketch below illustrates. The functions initial and is accepting are not modified at all, provided there is only one initial evaluation. Finally, note that the complications related to the database of visited states are avoided in the explicit-sets approach, since these are canonical representations of the sets of evaluations.
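A minimal sketch of this computation follows; the types are assumed, and a transition is represented by two callables (its guard and its effect) rather than by a DVE transition.

// Successor computation over explicit sets: Valid filters the evaluations of
// s(data) that satisfy the guard, Applied collects their images under the effect.
#include <functional>
#include <set>
#include <vector>

using Evaluation = std::vector<int>;  // one value per input variable
using Guard  = std::function<bool(const Evaluation&)>;
using Effect = std::function<Evaluation(const Evaluation&)>;

std::set<Evaluation> successorData(const std::set<Evaluation>& data,
                                   const Guard& guard, const Effect& effect) {
  std::set<Evaluation> applied;
  for (const Evaluation& v : data)  // (1) keep only evaluations satisfying the guard
    if (guard(v))
      applied.insert(effect(v));    // (2) apply the effect and collect the results
  return applied;
}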

Purely Explicit Approach versus Explicit Sets

Similar conditions were chosen for the comparison between the purely explicit approach (original DIVINE, denoted exp) and the new hybrid approach (set-based reduction using explicit sets, denoted sym): the codes were compiled with the optimisation option -O2 using GCC version 4.7.2 and run on a dedicated Linux workstation with a 64-core Intel Xeon 7560 @ 2.27GHz and 446GB RAM.

We have conducted a set of experiments pertaining to Peterson's communication protocol. For the purposes of verification, the protocol is usually modelled in such a way that once a process accesses the critical section, it immediately leaves it, without performing any work. The introduction of input variables allows the model to approximate practical use more closely by simulating some action in the critical section, however artificial that action might be. Note that this and other models from the BEEM database are abstractions of real programs, where a verification engineer had to choose which aspects of the original program or communication protocol were irrelevant. By adding the support for input variables we allow more realistic models to be created automatically. Hence a global input variable l ∈ {0, . . . , r} was added to the model, together with an effect l = (l+1)%r. Note that the effect is not biased towards set-based reduction, because it forces the inclusion of all the subsets {0 . . . r}, {1 . . . r}, . . . even in the reduced state space.

The two plots in Figure 7.1 report the results of liveness verification of this modified Peterson's protocol. Verifying this protocol is nontrivial and the best parallel algorithm, OWCTY [ČP03], requires several iterations before it can answer the verification query. The plots clearly show that exp cannot scale with the range r of the input variable (the x-axis): even with 32 parallel threads, verification with a single variable of range 0..140 required almost 100 seconds. The sym approach scaled markedly better, easily reaching ranges up to 10000 with the same space complexity that exp needed for a two-orders-of-magnitude smaller range.

Figure 7.1: Comparison between the purely explicit model checking (exp) and the set-based reduction using explicit sets (sym) with respect to both (a) time and (b) space for peterson-liveness. The verification was executed using 2 and 32 parallel threads, denoted by w2 and w32, respectively; the x-axis gives the range of the input variables, the y-axes time (s) and memory (kB).

Experiments with other models from the BEEM database gave similar results. Introducing explicit sets had minimal impact on the performance when no data manipulation was present. More importantly, both the time and space limits were reached for significantly larger domains of variables compared to the purely explicit approach. Depending on the complexity of the model, the explicit sets allowed for verification with data sets between one and two orders of magnitude larger. We thus conclude that explicit sets are a useful tool to extend the unit-testing capacity of a model checker, but realistic data domains require a symbolic representation.

7.1.3 Set-Based Reduction with Symbolic Sets

Following the space-efficient form of state representation used in DIVINE, the states under set-based reduction are also stored as contiguous pieces of memory, composed of the explicit and the symbolic part. The s(C) part of a state includes the program counter and the explicated variables and is stored in the same way as in DIVINE; the s(data) part is appended to it. The boundary between the two parts is known to the database component of DIVINE, and thus the searching algorithm correctly hashes only s(C) and resolves the resulting collisions of the symbolic s(data) on the semantic level.

One possible way of representing s(data) is to follow precisely the generating definitions, i.e. to represent both the path condition and the evaluation functions as BV formulae over the input variables from I. This representation, however, may grow exponentially, as in the following sequence of assignments:

x := 1; x := x + x; x := x + x; ...

To prevent this phenomenon the implementation does not substitute symbolic values into new formulae directly, but remembers each evaluation instance as an independent generation of the value of a given variable. Hence the evaluation of assignments decorates each occurrence of every variable with the correct generation and, internally, the above example is seen as

x0 := 1; x1 := x0 + x0; x2 := x1 + x1; ...

where each assignment is stored as a separate formula (a sketch of this encoding appears at the end of this subsection). The substitution is, of course, still required, but can be postponed until the final formula is presented to the SMT solver. Then one can either introduce the new variables explicitly or via temporary expressions, if they are not to be interpreted as free variables. The latter option for the above example would translate to

(let ((x0 1)) (let ((x1 (+ x0 x0))) (let ((x2 (+ x1 x1))) ...)))

and we need to use it when comparing the path conditions of two states s1 and s2. There the only free variables in the resulting formula must be the input variables, and each state has to have differently named variables. Altogether the formula reads

(let ((x_j α_j[x_0, . . . , x_{j−1}, i_1, . . . , i_k])) . . .
(let ((y_j α_j[y_0, . . . , y_{j−1}, i_1, . . . , i_k])) . . .
  (distinct ρ¹_{n1} ∘ · · · ∘ ρ¹_1[x_j, i_j]  ρ²_{n2} ∘ · · · ∘ ρ²_1[y_j, i_j]) . . .))

where f[X] denotes that only the variables from X appear in f.

Since the i_j are the only free variables, the satisfying assignment ȳ is such that ⊨_SAT ρ¹_{n1} ∘ · · · ∘ ρ¹_1 ∧ ⋀ i_j = y_j and ⊭_SAT ρ²_{n2} ∘ · · · ∘ ρ²_1 ∧ ⋀ i_j = y_j (or vice versa).

Thus while the satisfiability checking of individual path conditions and the image-based state matching can be presented to the SMT solver as a simple sequence of assertions, specifying the equivalence-based state matching requires the use of let constructs. The advantage of the former approach is that it can be performed using the C++ interface of Z3 and can thus be fully integrated into DIVINE, without the need to communicate with Z3 through SMT2 files or pipes.

Calling the SMT solver is required on one more occasion: when array elements are referenced using expressions as indices. Each array element is stored as a separate variable, but in order to know which element is referenced, the concrete value of the index has to be resolved. This can be easily achieved by requesting the SMT solver to produce a model of the current set of assertions and then evaluating the index expression on that model.

The final point is the preservation of the ability to accelerate the computation using parallel processing, i.e. the scalability of verification with DIVINE. Given that DIVINE currently distributes the hash function uniformly across the parallel threads, each thread is assigned a subset of the state database, related to some explicit values of s(C). Hence the scalability of the verification of set-based reduced systems is limited, compared to the standard verification, only in that the explicit values of s(data) cannot be used to further partition the state space between parallel threads. The experiments in Section 7.1.4 show that this limitation is rarely of any consequence, since the distribution using s(C) is mostly sufficient. On the other hand, Z3 is only thread-safe for satisfiability checking of quantifier-free formulae, and thus the more general image-based state matching cannot be used at all with parallel state space traversal.
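The following sketch illustrates the generation-based encoding from the beginning of this subsection, using the C++ interface of Z3 (the variable names and the final query are illustrative): every assignment x := x + x contributes one linear-size equation over a fresh generation instead of an exponentially growing substituted term.

#include <z3++.h>
#include <string>

int main() {
  z3::context c;
  z3::solver s(c);

  z3::expr x = c.bv_const("x_0", 16);
  s.add(x == c.bv_val(1, 16));                    // x := 1
  for (int k = 1; k <= 3; ++k) {                  // x := x + x, three times
    z3::expr next = c.bv_const(("x_" + std::to_string(k)).c_str(), 16);
    s.add(next == x + x);                         // one equation per generation
    x = next;
  }
  s.add(x != c.bv_val(8, 16));                    // query over the last generation
  return s.check() == z3::unsat ? 0 : 1;          // unsat: after three doublings x is 8
}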

7.1.4 Equivalence-Based versus Image-Based State Matching

The bulk of our experiments pertains to the work distribution protocol proposed in Example 7.1. The experiments were conducted on a machine with a quad-core Xeon 5130 and 16GB of RAM; DIVINE 2.96 and Z3 4.3.1 were compiled with GCC 4.7.3. We have modelled the protocol in DVE for two worker processes, with various parameters open for experimentation, viz. the initial work load for the whole system ws, the maximal increase of work for individual processes W, and the number of inputs (determining the initial individual work load) i.

Table 7.1 summarises the results for ws = 100, W = 8, i = 1, where the solver used 16 bits for BV expressions. There the hybrid representation refers to the use of the explicating-variables heuristics, the reported times are in seconds, and the memory consumption is in MBs. (The choice of parameter values was decided by an experiment with a 10-minute timeout, i.e. the reported values are the largest for which all 4 experiments finished within 10 minutes; especially the experiments with the hybrid representation managed much higher parameter values.)

The hybrid heuristics allows a number of possible configurations. For example, all variables could initially be stored explicitly and only moved to the symbolic data part after being assigned a value which depends on an input. Another option is whether to allow the variables to switch between the explicit and symbolic parts multiple times. In this particular set of experiments all variables are initially symbolic, the test for explication is run after every modification of s(data) and for each variable (thus each variable may become explicit multiple times), and a variable becomes symbolic again after reading from input.

Table 7.1: Results for the work distribution protocol.

Valuation Representation       |       Pure        |      Hybrid
State Matching Technique       |  Image  |  Equiv  |  Image  |  Equiv
Number of states               |                  97
Number of transitions          |                 167
Number of deadlocks            |                   4
Number of control-flow states  |        5          |        54
Average number of collisions   |      19.4         |       1.80
State space distribution       |  1:2,23:1,24:2    |  1:42,2:5,3:3,9:4
Path condition time            |                ∼3.2
Path condition calls           |                 533
State matching time            |  78.1   |  49.6   |  31.57  |  21.18
State matching calls           |  2123   |  2308   |   419   |   574
Explicit value time            |        n/a        |  13.81  |  14.45
Explicit value calls           |        n/a        |       1680
Deadlock detection time        |               ∼0.55
Deadlock detection calls       |                  93
Verification time              |  81.92  |  59.59  |  49.55  |  40.01
Memory consumption             |  37.6   |  27.9   |  34.6   |  30.4

If a number occupies multiple columns, then it represents the measured value for all of those columns. The two explicit-value rows report the complexity of detecting that a variable can be explicated. Finally, the state space distribution row summarises the ratio among the numbers of multi-states that share a control-flow part (which were hashed to the same value and then distinguished by calling the SMT solver). For example, without explicating variables there were only 5 different control-flow states but 97 multi-states altogether: two control-flow states had unique symbolic data (1:2), one control-flow state was common to 23 multi-states (23:1), and two control-flow states were common to 24 multi-states (24:2).

The efficiency of the hybrid representation depends on the number of variables that get assigned a concrete value (or whose evaluation is restricted to one concrete value), and this particular protocol has only a few such variables. Detection of these variables is thus a time-consuming operation, but given the improved state space distribution and the reduction in the state matching and overall times, the use of this heuristics is well justified.

The advantage of equivalence-based (equiv) state matching over image-based (image) is also clear from the reported experiments for this protocol and parameter values. Given the higher complexity of image, the distinction grows with the complexity of the expressions, especially with the use of nonlinear arithmetical operations such as multiplication and division. Given the satisfiability checking algorithm used in the SMT solver for the quantified theory, proving unsatisfiability is much more complex than proving satisfiability.

On the other hand, even the quantifier-free formulae used in equiv can generate very hard SMT queries, as shown in Figure 7.2a. There the higher spikes for image mostly relate to proving state equality, i.e. the more complex unsatisfiability of a quantified BV formula, which is in general two orders of magnitude more complex than state inequality. For the equivalence-based state matching there is hardly any difference between checking equality and inequality. Until the 140th state-matching instance the runs of equiv were almost negligible, but the last 4 comparisons took 88 seconds each.

This is also the case when equiv has to solve additional state-matching instances because of its lower distinguishing power (compared to image). Since both the variable explication and the equivalence-based state matching are heuristics, their relative performance will differ based on the verified program. In this particular configuration, and generally in most of the configurations we have experimented with, the better-performing state matching is the equivalence-based one and the better-performing symbolic representation is the one explicating variables. While explication entails a reasonably small overhead on programs where it cannot benefit the verification, i.e. if each variable must be symbolic all the time, the equivalence-based state matching may entail considerable overhead.

Figure 7.2b perhaps better illustrates the comparison between image and equiv and between the pure and hybrid representations. The lower lines, measured on the right, report the number of state-matching calls Algorithm 7 needed to decide if a state is already stored in the database, i.e. the number of times line 3 was executed. The full line reports the experiments without explication and the dashed line the experiments with the explicating heuristics. Hence we can see that the heuristics almost eliminated the increase in the number of comparisons for larger state spaces. This improvement can be attributed to the hash-based resolution using the explicit part of multi-states, which precedes the calls of the SMT solver.

The upper lines, measured on the left, report the cumulative time needed by each call of Algorithm 7. The comparison between image and equiv is less obvious in the average case, where both techniques require approximately 0.1 seconds to resolve membership and this time complexity only slowly grows throughout the state space traversal. However, there is a noticeable difference in the worst-case instances, which are two orders of magnitude more complex than the average instances: a phenomenon which does not appear for equiv (except for the special case in Figure 7.2a). The most likely explanation of this observation lies in the implementation of first-order satisfiability. For quantifier-free theories there are very successful heuristics for both satisfiability and unsatisfiability. At the time of writing, however, the heuristics for proving quantified formulae unsatisfiable are much less successful, compared to the satisfiable case.

Finally, the last advantage of equiv resides in its parallelisability, although mainly because of the limitation of Z3, which might be removed in the future. Figure 7.3 reports the parallel scalability of equiv for 1 to 4 CPU cores. Theoretically, the scalability is limited by the distribution of concrete states between parallel threads, which can force some threads to solve a larger number of more difficult state-matching instances than other threads (for example the 1st thread for 3 cores or the 2nd thread for 4 cores). However, the 2.5x speedup for 4 cores demonstrates a relatively successful preservation of the original scalability of DIVINE, which was one of the primary motivations for this work.

7.2 Case Study: Random Models

In order to investigate the relation between image-based and equivalence-based state matching in more detail, we have implemented a simple generator of transitions of the form ∗ −(γ ∧ (x′ = f(x̄)))→ ∗. Hence the influence of the explicit control is completely avoided, since every state has the same control part. The arithmetic operations within f were evenly distributed among bitwise and Peano operations, including division and modulo, although this distribution was parametric. Other parameters of the random models were the number of state-forming variables, the probability of reading from input, the average depth of the syntactic trees of both BV predicates and functions, etc.

Figure 7.2: The step-wise progression of the state matching time and the number of comparisons for different values of ws; the values on the x-axis represent the nth state matching performed by the algorithm. (a) State matching times for ws = 200: the computation of image finished after 140 state-matching instances; the heuristics in the computation of equiv required a few more iterations, with the last spikes reaching 88 s. (b) State matching time and number of comparisons for ws = 500, comparing the pure and hybrid representations for image and equiv.

7.2.1 Image versus Equivalence using Linear Search

For the first set of experiments, 500 edges were generated and their effects stored in a database, i.e. one state s among the states in the database was selected and the newly generated edge l₀ −(γ ∧ x′ = f)→ l₀′ was appended to s.

Figure 7.3: Parallel scalability comparison (ws = 20, i = 2): per-core times (in seconds) of the path condition checking, explicit-value detection, and state matching for image and equiv, measured on 1 to 4 CPU cores.

If ρ_n ∘ · · · ∘ ρ_1 ∧ γ is satisfiable, then a new state s′, where s′(data) = s(data).(γ ∧ x′ = f), is added to the database. Adding to the database requires deciding the membership of s′, which was in this set of experiments implemented as a linear search, using both equivalence-based and image-based state matching for the comparison. To obtain representative results, the experiment was repeated 100 times, each time with a different seed of randomness.

Table 7.2 summarises the statistical distribution of the complexity of these experiments as a sequence of five-number summaries. Each measured value is given in 5 sample percentiles: the smallest observation, the lower quartile, the median, the upper quartile, and the largest observation. The data in the upper part of the table (the top 10 lines) represent the number of calls to the SMT solver required during the experiment. We distinguish between satisfiability queries that were satisfiable (SAT) and those that were unsatisfiable (UNSAT). Each query was preceded by the system encoding (translation into the Z3 internal format) and each had a time limit of 10 seconds, though only the QBV queries of the image-based state matching timed out. Following the definition of equiv, the equivalence-based state matching is split into two parts: comparing the path conditions (Equiv pc) and comparing the functions (Equiv fce), where comparing path conditions preceded comparing functions.

The first line gives an accurate impression of the penalty for using a non-canonical representation. The encoding procedure translates the internal representation of BV formulae into the Z3 format. Calling an SMT solver is an expensive operation and our experiment needed almost 14000 calls on average to revisit a program location 500 times. Another observation is that separating the equivalence-based state matching into path condition equality and evaluation equality may be to the detriment of the total running time. On average almost 3000 checks showed the path conditions equal and thus the resolution still had to be decided by the subsequent evaluation equality (Equiv fce). The separation would be useful provided that checking the path condition was much faster. This does not appear to be the case, as the timing results below demonstrate.

Finally, the lower part reports the memory requirements of each experiment. Terms in the database were stored concisely without substitution (see Section 7.1.3), yet encoding into the Z3 format required substitution; hence the size of the longest path condition sometimes exceeded the size of the whole database, which itself remained relatively small.

Table 7.2: Five-number summary of SMT queries for the random edge generation.

                               min     q1    med     q3      max
Encoding calls                5841   9086  13960  17915    25978
Equiv fce SAT calls            791   2100   2713   4467     7700
Equiv fce UNSAT calls           44     79    104    121      160
Equiv pc SAT calls               0    563   1356   2859     5425
Equiv pc UNSAT calls           818   2145   2768   4522     7779
Image SAT calls               1841   2925   5126   6284     8511
Image UNSAT calls               15     90    118    136      934
Image UNSAT timeouts             0      0      0      0       31
Path condition SAT calls       130    176    239    256      288
Path condition UNSAT calls     212    243    261    324      370
Longest path condition (B)    8547  17280  29791  58526   728392
Database size (B)            60066 109374 144402 183968   293120
Total memory (kB)            12508  14058  15636  18890  1064820

The box plot in Figure 7.4 is derived from the same set of data, reporting only the time distribution. The plot is divided into three sets of five-number summaries: the first collects the sum of times for individual activities during the insertion of all 500 edges into the database, while the second and third report the distribution of the average and the maximum times over the 100 repetitions.

Focusing only on the median values (bold lines) in the average case (the set of plots in the middle), one would conclude that the equivalence-based state matching is only moderately faster. Indeed, in a majority of individual SMT queries, the image-based state matching was at most 4 times slower. Yet as the third set demonstrates, the worst-case instances are one or two orders of magnitude worse. Table 7.2 already showed that 31 unsatisfiable image queries required more than 10 seconds, whereas the highest number for equivalence was below 0.5 seconds. Nor can we observe a clear bound after which all image queries could be pronounced unsatisfiable, since even the satisfiable queries have peaks close to the 10-second time limit. The clear advantage of the equivalence-based state matching is thus the stability of its time requirements. The disadvantage, apart from the limitation to programs with a bounded number of inputs, is that the average-case performance is merely 4 times better than that of the image-based state matching.

7.2.2 Experiments with Witness-Based Matching Resolution

The final set of experiments demonstrates the efficiency of the witness-based matching resolution using satisfiability models as witnesses of state difference (see Section 5.3.3). The comparison between the linear search and the witness-based matching resolution is illustrated in Figure 7.5. The number of SAT queries (Figure 7.5a) grows linearly with the number of seen edges for the linear search (the upper and lower progressions relate approximately to the new state being and not being in the database, respectively). The use of witnesses decreases the number of SAT queries effectively to a constant value, independent of the number of states stored in the database. Yet as Figure 7.5b demonstrates, the time improvement is only 7-fold, i.e. linear, whereas the number of queries suggested an improvement from linear to constant complexity.

Figure 7.4: Five-number summary box plots of the times for the random models (log scale, seconds): total, average, and maximum per query category.

There are two reasons for this relatively modest improvement. First, as is clear from Algorithm 9, the overhead of maintaining the witnesses can be nontrivial and grows with their number. Second, even though only a constant number of instances need to be resolved by a SAT solver, the unsatisfiable queries (when the new state already resides in the database) cannot be resolved using witnesses. The unsatisfiable queries are commonly more time consuming, and the resulting improvement suffers because all unsatisfiable queries still have to be resolved by a SAT solver.
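The following sketch (with assumed types; the actual procedure is Algorithm 9) illustrates the witness-based resolution: models that previously separated two states are replayed as cheap tests before the solver is consulted, and a new separating model is recorded whenever the solver answers sat.

#include <z3++.h>
#include <vector>

// `candidate` and `stored` are characteristic formulae over the input variables;
// `witnesses` are models that have separated some pair of states before.
bool statesEqual(z3::context& c, const z3::expr& candidate,
                 const z3::expr& stored, std::vector<z3::model>& witnesses) {
  for (z3::model& m : witnesses) {
    bool inCandidate = m.eval(candidate, true).is_true();
    bool inStored    = m.eval(stored, true).is_true();
    if (inCandidate != inStored) return false;  // a witness already separates them
  }
  z3::solver s(c);
  s.add(candidate != stored);            // the full SAT check only when needed
  if (s.check() == z3::sat) {
    witnesses.push_back(s.get_model());  // remember the separating model
    return false;
  }
  return true;                           // unsat: the two sets are equal
}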

Figure 7.5: Step-wise evolution of the state matching time and the number of SAT queries on a random model using equivalence-based state matching, comparing the linear search (full) with the witness-based resolution (pruned): (a) number of SAT queries; (b) state matching times. The linear search traverses sequentially the set of multi-states with an equal control part; the witness-based resolution heuristics accelerates the linear search by reusing inequality proofs.


Chapter 8

Verification Platforms

The previous chapters presented our contributions to the fields of requirements and program analysis. Each theoretical concept was experimentally evaluated, but little effort was invested in allowing other researchers to compare their techniques with ours. This last chapter compensates for that omission.

SymDivine exposes control explicit—data symbolic model checking in all its characteristics. With parallel C/C++ as its input, SymDivine can be used to build verification tools both for verification competitions and for analysing real code. Yet building a verification tool on top of LLVM requires non-trivial effort, which may not be worth investing if one only wants to test a hypothesis. The NTS-platform further simplifies the implementation of a new verification method by providing a translator from LLVM to NTS, a reasonably simple modelling language.

8.1 Input Languages

The LLVM language has established itself as the intermediate language for verification both on the side of industry [Met13] and on the side of research [ZNMZ12]. The NTS modelling language is sufficiently widespread and simple, while expressive enough to represent the semantics of real programs without considerable size overhead.

8.1.1 LLVM

LLVM [LA04] is a compiler infrastructure. The central part of LLVM is its well-defined language, the LLVM intermediate representation (IR). IR resembles a three-address code and is in Static Single Assignment (SSA) form. The common use of LLVM in compilers is summarised in Figure 8.1. The most important parts of LLVM are the following:

LLVM IR LLVM IR has three different forms: a human-readable assembly language, a bitcode format for fast parsing and loading, and an in-memory representation. The core libraries provide translation facilities for every pair of these formats.

Basic Structure The basic LLVM IR unit is a module. A module consists of functions, global variables, and other symbols. Each function contains a graph of basicblocks. Each basicblock represents a linear sequence of instructions. Jump instructions in LLVM are restricted to jump only to the first instruction of the target basicblock, and the last instruction of every basicblock must be a terminator instruction (e.g. a jump or a return).

Figure 8.1: LLVM serves as an intermediate language during compilation: frontends (Clang, GCC+DragonEgg) translate input languages (C/C++, Fortran, Objective C, Java) into LLVM IR, which is then lowered to target instruction sets (ARM, PowerPC, SPARC, x86).

Type System and Memory Basic types are integers with bit-widths ranging from 1 to 2²³, a few floating-point types, and the x86 mmx type, a special type used for MMX instructions. From a set of types we can construct derived types; the most important are pointers, arrays, and structures.

Registers and Memory The number of registers in an LLVM function is not bounded. Code in LLVM IR has the syntactic restriction that every register is assigned a value at most once – the so-called SSA form. A register may be seen as a name for the result of the underlying instruction that can later be used to refer to that value. The type of a register is statically determined by the instruction to which the register corresponds. Values in memory (i.e. with an address) can be accessed only through explicit load and store instructions.

Example 8.1 (LLVM IR). The LLVM IR snippet below shows the result of translating a C increment function to LLVM IR using Clang.

define i32 @add(i32 %x) #0 {
  %1 = alloca i32, align 4
  store i32 %x, i32* %1, align 4
  %2 = load i32* %1, align 4
  %3 = add nsw i32 1, %2
  ret i32 %3
}

The function is a global symbol, which is denoted by the @ prefix. Both the return type and the parameter type, i32, are 32-bit integers. The function body consists of a single basicblock ending with the terminator instruction ret.

The symbol %1 denotes the result of the alloca instruction, i.e. a local variable of type pointer to i32. The following store instruction copies the value of the parameter %x to the previously allocated memory location and returns no value. After loading the value from memory and adding the integer constant 1 to it, the result is returned by the last instruction.

8.1.2 NTS

The Numerical Transition Systems (NTS) language [HKG+12] is a relatively simple modelling language with formalised semantics. The NTS language is sufficiently expressive to model arbitrary software. In order to represent specific programs more closely, the language contains facilities for hierarchical and parallel composition of systems. The basic building blocks of NTS are listed below.

BasicNts The most basic building block in NTS is BasicNts: a standalone transition system consisting of variables, control states, and transitions between control states. Control states may be marked as initial, final, or error states. The transitions are similar to those in CFGs, i.e. s −ρ→ s′, where ρ is also called the transition rule. Finally, a transition can only be taken if its transition rule is satisfied.

Transition Rules There are two kinds of transition rules in NTS: call rules and formula rules. The formula rules are standard transition relations as defined for CFGs. We use BV as the logical language, but NTS allows any first-order theory to be used. The call rules are used to call another BasicNts, interpreted as a function. For example

(p1′, p2′) = factorise(x)

calls the BasicNts named factorise with the value of x as its parameter and stores the result in the vector (p1′, p2′).

Type System NTS supports three scalar data types: integer, real, and Boolean. The values of these three types are interpreted as members of Z, R, and {⊤, ⊥}, respectively. Furthermore, any scalar data type can be composed into an array type. Note that NTS lacks support for bit-vector types; in order to support the bit-precise semantics of programs, we have added the bit-vector type to the NTS type system.

Havoc The language contains a predefined function havoc, which is interpreted as an atomic proposition with the following meaning:

havoc(v1, . . . , vn) ≡ ⋀_{v ∈ V \ {v1,...,vn}} v′ = v

where V is the set of variables in the current scope. Hence havoc is merely a syntactic shortcut describing the set of variables that are not modified by a transition.

Example 8.2 (NTS for Producer/Consumer). The text below is the NTS model for the parallel system of one producer and one consumer.

G:int; c:bool;
init G=0 && c=false;
instances producer[1], consumer[1];

producer {
  initial si;
  i:int;
  si->sl { i'=0 && havoc(i) }
  sl->sl { c=false && c'=true && G'=i && i'=i+1 && havoc(c,G,i) }
}

consumer {
  initial si;
  sum, x:int;
  si->sl { sum'=0 && havoc(sum) }
  sl->sh { c=true && x'=G && havoc(x,G) }
  sh->sl { sum'=sum+1 && c'=false && havoc(sum, c) }
}

The initial configuration of the system must satisfy the init formula G = 0 ∧ c = ⊥. The language also supports thread identification via a predefined variable tid: within the BasicNts producer the value of tid is 0; within consumer the value is 1. The local variable i in producer is initialised to 0 by the first transition and incremented by each succeeding transition. The transition sl → sl of the producer can only be taken if c = ⊥. The semantics is such that the producer increments the value of G, which is then summed by the consumer.

8.2 SymDivine

SymDivine is a generic platform enabling program analysis of parallel programs written in LLVM IR. SymDivine stands between the input formalism (LLVM IR) and a program analysis algorithm, providing the LLVM program state space without exposing the complex details of LLVM IR. More precisely, SymDivine offers a set-based reduction generator for programs written in LLVM IR, as depicted in Figure 8.2. The source code of SymDivine is publicly available¹ and there is also an alternative description of its internal functioning [Hav14].

8.2.1 Memory Structure

SymDivine has a layer between the data representation module and its instruction execution core which keeps information about the memory layout. By memory layout we understand a mapping from variables in the LLVM program to variables in the data representation. When a new block of memory is allocated in an LLVM program, it is assigned a unique identifier. Thus each variable in every state is uniquely addressed by a pair Var = (segment_id, offset) of positive numbers, where segment_id is the unique identifier of the memory block to which the variable belongs and offset is the offset of this variable inside that block.

1 http://anna.fi.muni.cz/~xbauch/code.html#llvm

Figure 8.2: Set-based state space reduction in the SymDivine context. The input is LLVM IR code with a Pthreads implementation of parallelism (top left) and an LTL property (top right); in this figure, BV formulae are used for the data representation.

Symbolic Data Interface SymDivine assumes that the data representation module provides the following interface (the list is not complete; more details can be found in the source code). A C++ sketch of the interface follows the list.

• addSegment, eraseSegment – functions that create or delete a segment

• computeStore op r a b – for any arithmetic operation op. Each argument is a Var pair. After a call to computeStore op, the data representation module is supposed to compute a op b and store it in r

• input r – stores a non-deterministic value to r

• prune rel a b – for any relational operator rel. Each call to prune restricts the set of current variable evaluations to only those satisfying a rel b

• empty – returns true iff the set of currently stored evaluations is empty

• equal – compares two sets of valuations for set equality
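A C++ rendering of this interface might look as follows; the signatures and enumerations are assumptions, the authoritative version is in the SymDivine sources.

#include <cstdint>

// Addressing scheme from above: every variable is a (segment, offset) pair.
struct Var { uint32_t segment_id; uint32_t offset; };

enum class Op  { Add, Sub, Mul, Div /* ... */ };
enum class Rel { Eq, Ne, Lt, Le, Gt, Ge };

struct DataStore {
  virtual void addSegment(uint32_t id, uint32_t slots) = 0;
  virtual void eraseSegment(uint32_t id) = 0;
  virtual void computeStore(Op op, Var r, Var a, Var b) = 0;  // r := a op b
  virtual void input(Var r) = 0;                  // r := non-deterministic value
  virtual void prune(Rel rel, Var a, Var b) = 0;  // keep evaluations with a rel b
  virtual bool empty() const = 0;                 // no evaluation left?
  virtual bool equal(const DataStore& other) const = 0;  // set equality
  virtual ~DataStore() = default;
};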

Explicating Variables It is possible that in some states a variable has only one possible value. Owing to the low-level nature of LLVM, this situation occurs for many of the program variables. While such variables may be stored symbolically inside the data representation module, it is more efficient to store their values explicitly (i.e. as concrete values). Not only can arithmetic operations on concrete numbers be computed a few orders of magnitude faster, but any extra information in the explicit part of states reduces the number of hashing collisions, thus speeding up the state space search.
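One possible implementation of the explication test is sketched below (using Z3; the function and its interface are ours, not SymDivine's): the solver provides one admissible value of the variable, and the variable is explicable iff no other value is possible.

#include <z3++.h>
#include <cstdint>

// Returns true iff `var` can take exactly one value under `pathCondition`;
// the single admissible value is then stored in `out`.
bool explicable(z3::solver& s, const z3::expr& pathCondition,
                const z3::expr& var, int64_t& out) {
  s.push();
  s.add(pathCondition);
  if (s.check() != z3::sat) { s.pop(); return false; }  // empty evaluation set
  z3::expr v = s.get_model().eval(var, true);           // one candidate value
  s.add(var != v);                                      // is any other possible?
  bool unique = (s.check() == z3::unsat);
  if (unique) out = v.get_numeral_int64();
  s.pop();
  return unique;
}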

Pointers SymDivine represents pointers as 64-bit numbers, where the upper 32 bits hold the segment number and the lower 32 bits the offset inside the segment. Such a pointer mapping corresponds to our memory layout scheme, therefore dereferencing does not need any complex mapping. Currently we forbid pointers with a symbolic value; more precisely, we require all pointers to be explicable.
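The encoding is then a straightforward bit manipulation, as the following sketch shows (the helper names are ours):

#include <cstdint>

// Pointers: the upper 32 bits identify the segment, the lower 32 bits the offset.
constexpr uint64_t makePointer(uint32_t segment, uint32_t offset) {
  return (static_cast<uint64_t>(segment) << 32) | offset;
}
constexpr uint32_t segmentOf(uint64_t ptr) { return static_cast<uint32_t>(ptr >> 32); }
constexpr uint32_t offsetOf(uint64_t ptr)  { return static_cast<uint32_t>(ptr); }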

Path Reduction The low-level nature of LLVM entails a granularity of operations much finer than what is necessary for verification. τ-reduction [BBR12] proposes a collection of heuristics that lead to a coarser granularity without losing the correctness of the overall verification. The basic idea behind τ-reduction is that the effects of instructions are observable from other threads only when the thread accesses main memory via load or store instructions and a few others; after such instructions we have to generate a successor to enable the interleaving of these observable actions with instructions executed by other threads. We can benefit from this fact by executing more instructions from one thread and accumulating them into a single transition (see [BK11] for the sequential equivalent: large-block encoding).
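A minimal sketch of this accumulation follows; the instruction classification is assumed and much coarser than the actual τ-reduction heuristics.

#include <cstddef>
#include <vector>

enum class Kind { Local, Load, Store, Terminator };  // coarse classification
struct Instruction { Kind kind; /* operands elided */ };
struct Thread { std::vector<Instruction> code; std::size_t pc = 0; };

// Execute thread-local instructions silently and stop at the first instruction
// other threads can observe: the accumulated block forms a single transition
// and a successor is generated only at the returned program counter.
std::size_t runUntilObservable(const Thread& t) {
  std::size_t pc = t.pc;
  while (pc < t.code.size() && t.code[pc].kind == Kind::Local)
    ++pc;
  return pc;
}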

8.2.2 Algorithms

Each verification algorithm supported by the SymDivine framework traverses the CEDS state space of the input program in some manner. Let MltSt denote the type MultiState DataStore of a multi-state parameterised by the specific container DataStore for the set of variable evaluations, i.e. MltSt represents individual states from the point of view of verification algorithms. As stated above, SymDivine provides the user with two functions generating this state space – initial :: MltSt and successors :: MltSt → {MltSt} – with which they can implement their verification algorithm of choice. Figure 8.3 summarises the user's view of SymDivine. To demonstrate the flexibility of this minimal scheme, we have implemented two basic algorithms: reachability for safety analysis and Nested-DFS for LTL model checking. Note that certain data structures are not explicitly mentioned as inputs of the algorithms but are modified by them: the LLVM execution engine, the product graph, and the database of known states.

Reachability Algorithm 14 implements a depth-bounded, breadth-first reachability. If a Büchi automaton is not provided, the algorithm assumes a trivial one and creates the function initial by combining the initial states of the program and of the automaton. The function successors is used on line 9 to produce all successors of the state being expanded. The function storeDatabase attempts to store a successor in the database of known states, returning true for unknown successors. The set open thus stores exactly the unknown successors, to be expanded in later iterations. Being bounded, the reachability stops in three cases: an error state S was found (Right S), no new state was produced (Left false), or the bound was hit (Left true).

Nested-DFS The reason for Reachability being depth-bounded is related to the major limitation of the CEDS approach: a looping construct in a program may require unrolling (see Chapter 5). Hence if the DFS-based traversal chose to follow a computation branch that begins such an unrolling, it would not try other branches until the unrolling completed. The Nested-DFS (Algorithm 15) shows how bounded reachability can be used to generate a part of the state space on which an unbounded Nested-DFS is run. Note also that a counterexample can be correctly detected on a partial graph, but the proof of correctness requires generating the full state space first.

Figure 8.3: The work flow of SymDivine: the code of a C program is translated into LLVM IR by clang, the negated LTL specification is translated into a Büchi automaton by SPOT, and the global variable names are extracted to provide the atomic propositions. The user can either use the predefined verification algorithms for safety and LTL properties or implement their own algorithms; in the latter case, SymDivine provides the basic functionality for state space traversal, initial :: MltSt and successors :: MltSt → [MltSt].

The Nested-DFS algorithm is preceded by the construction of the property automaton on line 1. Currently we limit the specification language to LTL formulae where atomic propositions refer only to global variables, i.e. an atomic proposition can be an arbitrary BV formula, but the variables used in that formula must be declared global in the program. (To better understand the complexity of including local variables, please follow the discussion in [BBR12].) Altogether, we use the model checker SPOT [DL14] to translate LTL to the corresponding Büchi automaton, after which we link the atomic propositions to their respective formulae using an external function atomics.

To better understand the interaction between the program and the property automaton, we also present the implementation of the successors function in Algorithm 16. The function symbolically executes the LLVM code via the function advanceLLVM, which stops the execution on observable actions and calls the function yielder. The symbolic execution spans all currently active threads, producing one successor for each thread. This single-step generation is repeated for every transition of the property automaton, where the guard is used to prune the evaluations. The yielder itself copies the current state of the program (readLLVM and writeLLVM transport the states from and to the LLVM execution engine), pairs this program state with the property state, and finally stores the new transition in the partial productGraph. The productGraph is then stored explicitly as an adjacency list, requiring only 4 bytes per state and transition.

Algorithm 14: SymDivine Reachability
Input : bound :: N
Output: Either B MltSt

 1  if empty BA then BA ← (0 −true→ 0)
 2
 3  initial ← join initialCFG initialBA
 4  if empty bfsQ then push bfsQ (initial, 0)
 5
 6  while ¬(empty bfsQ) do
 7      (S, level) ← front bfsQ
 8      if level < bound then
 9          open ← filter (λx. storeDatabase x) (successors S)
10          if empty open then
11              return (Left false)
12          foreach S′ ∈ open do
13              if erroneous S′ then
14                  return (Right S′)
15              push bfsQ (S′, level + 1)
16  return (Left true)

8.2.3 Containers

The execution used during successor generation is symbolic: arithmetic, logical, and bit-wise operations are not interpreted but instead invoke calls to functions implemented by each data representation. Apart from the above-mentioned interface for implementing operations, a specific container may implement operator< and hash (equal is always required). Based on which of these functions are implemented, a more efficient state space searching algorithm will be used.

Explicit The most trivial extension of the original explicit-state model checking is a representation that stores each evaluation explicitly. While the space reduction is negligible – the control parts of individual states are stored only once per multi-state – using hash-based search is simple given the canonicity of this representation.

SMT The exact opposite approach is to represent multi-states as the sequence of program instructions (as BV formulae) that lead to that particular state of the execution. When a variable is modified multiple times, we create a new symbol representing the next generation of that variable. In theory, the space required may be exponentially smaller than for the explicit representation. While implementing the empty function entails a simple satisfiability query for an SMT solver, equal is much more complicated. The set of evaluations is effectively represented as the codomain of a function from the domain of input variables to the domain of program variables, i.e. the product of the last generations of each variable. Then emptiness is equivalent to the existence of an input that is mapped by the function, but comparing the codomains of two functions given by their inputs requires quantifier alternation (see Chapter 5).

Algorithm 15: SymDivine Nested-DFS
Input : property :: LTL, atomics :: string → BV formula[Globals]
Output: answer :: Either B Path

 1  BA ← parse (ltl2baSPOT property) atomics
 2  b ← 0
 3  while true do
 4      match Reachability (++b) with
 5          (Left true)  ⇒ if ndfs productGraph = (Left true)
 6                             then return (Left true)
 7          (Left false) ⇒ return ndfs productGraph
 8          (Right S)    ⇒ return (Right (pathTo S))

BDD Binary decision diagrams are canonical; in fact each set of evaluations is uniquely represented by a single 4-byte number that can be directly used as the hash. The complexity of using BDDs stems from computing the effect of individual instructions (which is a constant-time operation for SMT). For each instruction we create a BDD representing the relation between the current- and next-state bits of the variables and then existentially quantify over the current-state bits. Hence every variable increases the complexity, even if not used in a particular transition. Furthermore, the BDDs for certain arithmetic operations are exponential in the bit widths. Our experiments showed that the large number of variables in LLVM programs prohibits the use of BDDs for any non-trivial program; this inability to handle LLVM code of non-trivial size prevented us from including the BDD container in the results of the following subsection.

Empty The framework can also be used to explicate the control non-determinism while leaving the data flow completely uninterpreted, by having equal always return true. The result of a traversal using the empty container is a control-flow graph with transitions labelled by program instructions. We use the empty container to enable nuXmv to accept parallel C programs as input.

8.2.4 Experimental Evaluation

The following results were obtained on a dedicated Linux machine with a Xeon CPU E3-1230 @ 3.40GHz and 16GB of RAM, using 64-bit clang and LLVM version 3.4.2 with -O2 optimisations. We compared SymDivine with CPAchecker-1.3.4, with nuXmv-1.0.0, and with the unnamed tool written by Eric Koskinen (for which we only assume the results reported in the corresponding paper [CK11]).

Algorithm 16: SymDivine Successors
Input : S :: MltSt
Output: [MltSt]

 1  (programS, propertyS) ← split S
 2  foreach propertyS −guard→ propertyS′ ∈ BA do
 3      readLLVM programS
 4      pruneData guard
 5      if emptyData then continue
 6
 7      yielder :: void =
 8          programS′ ← writeLLVM
 9          S′ ← (programS′, propertyS′)
10          insert open S′
11          addEdge productGraph (S → S′) (isAccepting S′)
12      advanceLLVM yielder
13  return open

Table 8.1: SymDivine vs CPAchecker on SV-Comp examples.

                       with error                safe
               total  SymDivine  CPA    total  SymDivine  CPA
bitvector/*        9          8    9       36         15   32
eca/*             89         30   33       44         14   32
locks/*            2          2    2       11         11   11
loops/*           32         21   27       35         24   27
ssh-simp/*        12          8   12       14          7   14
systemc/*         37         17   37       25          5   19
sum              181         86  120      165         76  135

SV-Comp Table 8.1 reports the capacity of SymDivine to verify examples from the Software Verification Competition². We compare the number of examples from individual categories that each of the tools (SymDivine and the currently leading CPAchecker [BK11]) was able to verify correctly. A more detailed discussion of this comparison can be found in [Hav14].

SymDivine vs State-of-the-Art Model Checkers The benchmarks used in Table 8.2 were chosen from the selection attached to [CK11]. The table demonstrates two important contributions of this thesis: not only does SymDivine provide other tools (nuXmv in this example) with access to parallel C and C++ programs via its general interface, but our SMT-based CEDS verifier can also compete with state-of-the-art verification tools. The examples attached to [CK11] show to the full extent the major weakness of the CEDS approach: loop unrolling may lead to state spaces of unmanageable size. It follows that while the Nested-DFS algorithm with iterative deepening can in most cases find a counterexample, proving correctness, i.e. enumerating the whole reduced state space, presently lies outside the scope of our tool.

2 https://svn.sosy-lab.org/software/sv-benchmarks/tags/svcomp14/

Table 8.2: Comparison Between Tools for Data Non-Determinism.

Name property safe -Ox Kosk. nuXmv SymDiv. SMT Empty depth 2 2.43 0.7 22-27-12 39-56 4 FG a × 11.0 0 1.13 1.7 38-39-32 4 1 2 0.68 0.38 9-17-3 full ¬ F G a 5.8 X 0 0.39 0.59 15-28-6 36-49 full 2 0.40 0.84 22-33-9 22-33 full FG a 1.3 X 0 >60 >60 234-301-149 7 2 2 0.83 1.40 28-63-12 full ¬ F G a × 2.7 0 >60 8.50 92-127-20 31-44 5 2 >60 0.99 22-35-9 25-37 4 GF a × 1.9 0 39.35 1.43 24-36-10 4 8 2 0.67 >60 237-239-227 228 ¬ G F a 3.6 X 0 0.63 >60 53-63-11 19-26 12 2 >60 0.18 9-11-2 27-42 full GF a 10.8 X 0 8.34 >60 75-106-23 12 9 2 >60 0.16 7-9-4 full ¬ G F a × 1.9 0 29.01 0.39 14-15-6 21-31 4 2 0.20 1.05 30-42-0 37-60 full G(a ⇒ F a) 29.5 X 0 0.21 1.04 30-42-0 full 12 2 0.41 0.23 10-21-9 6 ¬ G(a ⇒ F a) × 3.9 0 0.51 0.40 10-21-9 34-53 6 2 >60 2.37 42-53-16 10-12 10 G a × 0.5 0 0.33 7.73 71-81-24 10 14 2 >60 0.55 20-23-19 7 ¬ G a × 0.6 0 0.30 1.48 33-36-32 13-16 8 2 >60 >60 41-52-11 51-71 14 G(a ⇒ F b) 14.18 X 0 0.31 28.25 36-49-9 full acqrel 2 >60 0.08 6-6-4 4 ¬ G(a ⇒ F b) × N.A. 0 >60 0.11 7-75 28-36 5 2 >60 47.50 22-333-109 1170-1681 26 G a ⇒ G F b × 197.4 0 N.A. Error N.A. N.A. apache 2 >60 15.88 368-378-40 31 ¬(G a ⇒ G F b) × N.A. 0 N.A. Error N.A. Error N.A 2 0.25 >60 95-144-0 24-36 28 G(a ⇒ F b) 27.94 X 0 >60 >60 56-51-19 12 fig8-2007 2 0.48 >60 95-144-94 28 ¬ G(a ⇒ F b) × N.A. 0 >60 >60 144-172-54 77-89 40 2 15.20 >60 0.83 30-41-10 166-274 9 GF a × 0 ( ) >60 3.86 62-85-22 16 pgarch X 2 >60 0.15 11-11-8 6 ¬ G F a × N.A. 0 >60 0.25 16-16-11 185-257 9 2 7.15 1.86 67-71-12 1066-1535 8 G(a ⇒ F a) × 539.0 0 >60 0.62 31-34-5 8 win1 2 8.54 >60 387-405-45 16 ¬ G(a ⇒ F a) × N.A. 0 >60 >60 310-322-61 2920-3962 20 2 0.04 0.18 6-9-2 17-27 full FG a 15.75 X 0 0.04 0.27 11-14-4 full win3 2 0.08 0.38 9-17-3 3 ¬ F G a × N.A. 0 0.08 0.59 15-28-6 23-34 5

Unfortunately, at the time of writing, nuXmv contained a bug in the implementation of k-liveness; hence we only compare with its SAT-based bounded model checking algorithm.

The first three columns of the table report general properties of the benchmarks: identifier, LTL specification, and its validity. For each benchmark we ran the verification with the original specification and with its negation, and for each we called clang with and without optimisations (column 4). The next three columns report the verification times in seconds of the respective tools, with the timeout set to 60. The last three columns report some statistics about the transition system produced by SymDivine: SMT shows the size of the graph in which SymDivine ran the last Nested-DFS algorithm, in the form |S|-|T|-|F| (see Chapter 3); Empty shows the size of the graph sent to nuXmv (with no accepting states); finally, depth is the distance from the initial location that SymDivine had to traverse to decide the validity of the specification.

Two of the presented experiments may require special attention. First, SymDivine could not resolve the value of a pointer in the apache example, thus we were unable to verify it (or produce the model of it). Second, the pgarch model was reported correct in [CK11], while SymDivine found it incorrect. The cause stemmed from the function time being abstracted (possibly by the authors of [CK11]) to always return an undefined value, thus causing the program to be incorrect, while with the unabstracted time the code would be correct.

8.3 NTS-Platform

The NTS language lacks a significant feature that prevents a direct translation of parallel programs into NTS: the composition operators cannot be nested, i.e. one cannot build an NTS where parallel BasicNts blocks are composed sequentially. The NTS-platform ameliorates this by implementing a translation from parallel LLVM to NTS.

Parallelism in general considerably complicates verification, and thus a large portion of the program analysis community concentrates on sequential code. The NTS-platform allows employing tools for sequential verification on parallel programs by implementing a translation from parallel NTS to sequential NTS. Since this translation is exponential in the worst case, the NTS-platform also contains a form of partial order reduction applicable to the NTS formalism. This section describes the individual parts of the platform.

8.3.1 Standard Partial Order Reduction

The standard definition of partial order reduction assumes a state-labelled transition system (equivalent to our explicit CFG) and deterministic transition relations. For a classical exposition see for example [CGP99]. To avoid unnecessary repetition, we fix the following notation for the remainder of this section. Let P = (L, l_0, T, F) be a CFG and G = (S, s_0, T, A) the explicit CFG generated from P.

Definition 8.3 (Transition Application). Let ρ ∈ Φ be a transition relation and s ∈ S an explicit state. We define the application of ρ to s as

\[
\rho(s) = \{\, s' \in S \mid \exists\, (s(pc) \xrightarrow{\rho} s'(pc) \in T),\ \models_{\mathrm{SAT}} \textstyle\bigwedge_i x_i = s(x_i) \,\wedge\, \rho \,\wedge\, \bigwedge_i x_i' = s'(x_i) \,\}.
\]
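To make the satisfiability queries concrete, the following sketch poses them to an off-the-shelf SMT solver through the Z3 C++ API; the 32-bit variable x, its primed copy, and the transition relation are illustrative and not taken from the tool.

#include <z3++.h>
#include <iostream>

int main() {
    z3::context c;
    z3::expr x  = c.bv_const("x", 32);    // current value of x in s
    z3::expr xp = c.bv_const("x'", 32);   // next value of x in s'
    z3::expr rho = (xp == x + 1);         // a transition relation

    z3::solver enabled(c);                // x = s(x) /\ rho (cf. Definition 8.5 below)
    enabled.add(x == c.bv_val(5, 32) && rho);
    std::cout << "enabled: " << (enabled.check() == z3::sat) << "\n";

    z3::solver applies(c);                // Definition 8.3: additionally x' = s'(x)
    applies.add(x == c.bv_val(5, 32) && rho && xp == c.bv_val(6, 32));
    std::cout << "s' in rho(s): " << (applies.check() == z3::sat) << "\n";
}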

Definition 8.4 (Deterministic Relation). A transition relation ρ ∈ Φ is deterministic in P if ∀(s ∈ S), |ρ(s)| ≤ 1.

Definition 8.5 (Enabled Transition). A transition relation ρ ∈ Φ is enabled at s ∈ S if $\models_{\mathrm{SAT}} \bigwedge_i x_i = s(x_i) \wedge \rho$. We denote the set of transition relations enabled at s by enabled(s).

Definition 8.6 (Independence Relation). The independence of two transition relations α, β ∈ Φ in a state s ∈ S is defined as follows:

\[
\alpha \, I_s \, \beta \;\equiv\; \forall (s' \in \beta(s)),\ \alpha \in \mathit{enabled}(s') \;\wedge\; \forall (s' \in \alpha(s)),\ \beta \in \mathit{enabled}(s') \;\wedge\; \forall (s' \in \beta(s))(s'' \in \alpha(s)),\ \alpha(s') = \beta(s'').
\]

The independence relation I ⊆ Φ × Φ is a symmetric, anti-reflexive relation such that α I β if for all s ∈ S it holds that α I_s β. Furthermore, D = Φ × Φ \ I.

Definition 8.7 (Invisible Transition). A transition relation ρ ∈ Φ is invisible if it does not modify any atomic proposition occurring in the specification. Let us assume that P = P_prg ∘ P^A, where P^A = (L^A, l_0^A, T^A, F^A). Then ρ is invisible if

\[
\forall (l \xrightarrow{\sigma} l' \in T^A),\ \not\models_{\mathrm{SAT}} \sigma \wedge \rho \wedge \neg\sigma[x_i \mapsto x_i'].
\]
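For a concrete (illustrative, not benchmark-derived) instance, suppose the only atomic proposition in the specification is $a \equiv (x > 0)$ and $\rho \equiv (y' = y + 1 \wedge x' = x)$. For every specification transition guarded by $\sigma \in \{a, \neg a\}$, the formula $\sigma \wedge \rho \wedge \neg\sigma[x \mapsto x']$ is unsatisfiable, because ρ forces $x' = x$ and thus σ and $\sigma[x \mapsto x']$ are equivalent; hence ρ is invisible.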

The partial order reduction is established by reducing the set of enabled transitions in a state to an ample subset. The following four conditions (C0 to C3) are sufficient.

C0 ample(s) = ∅ ⇔ enabled(s) = ∅.

C1 Let s ∈ S, α, β ∈ Φ, and let π be a path in G such that π(0) = s, β ∈ ample(s), α D β, and π(j + 1) ∈ α(π(j)). Then there must be γ ∈ ample(s) and i < j such that π(i + 1) ∈ γ(π(i)).

C2 If ample(s) ≠ enabled(s), then ∀(ρ ∈ ample(s)), invisible(ρ).

C3 Let ζ be a cycle in G, s ∈ ζ, and ρ ∈ enabled(s). Then there must be s' ∈ ζ such that ρ ∈ ample(s').

8.3.2 Static Partial Order Reduction

The standard partial order reduction as presented in the previous subsection cannot be used in control explicit—data symbolic model checking. This reduction method is mostly implemented dynamically, i.e. the ample sets are computed during the depth-first search generation of the explicit state space. Our intention is a reduction method that transforms CFGs.

To simplify the definitions we fix the following notation. Let P^i = (L^i, l_0^i, T^i, F^i), for i ∈ {1, . . . , n}, be the CFGs for the threads of a parallel program and P^A = (L^A, l_0^A, T^A, F^A) the specification CFG. Furthermore, let (P^1 ∥ · · · ∥ P^n) ∘ P^A = P = (L, l_0, T, F) and let G = (S, s_0, T, A) be the explicit CFG generated from P.

Definition 8.8 (Enabled Transition). A transition relation ρ ∈ Φ is enabled at l ∈ L if

\[
\forall (s \in S),\ s(pc) = l \implies\ \models_{\mathrm{SAT}} \textstyle\bigwedge_i x_i = s(x_i) \wedge \rho.
\]

Definition 8.9 (Transition Application). Let ρ ∈ Φ be a transition relation and l ∈ L a location. We define the application of ρ to l as

\[
\rho(l) = \{\, l' \in L \mid \exists (s \to s' \in T),\ s(pc) = l,\ s'(pc) = l' \,\}.
\]

Definition 8.10 (Transition Independence). Two transition relations α, β ∈ Φ are independent at l ∈ L if

\[
\alpha \, I_l \, \beta \;\equiv\; \forall (l' \in \beta(l)),\ \alpha \in \mathit{enabled}(l') \;\wedge\; \forall (l' \in \alpha(l)),\ \beta \in \mathit{enabled}(l') \;\wedge\; \forall (l' \in \beta(l))(l'' \in \alpha(l)),\ \alpha(l') = \beta(l'').
\]

The independence relation I ⊆ Φ × Φ is a symmetric, anti-reflexive relation such that α I β if for all l ∈ L it holds that α I_l β. Furthermore, D = Φ × Φ \ I.

The task of static partial order reduction is to modify the computation of the composition CFG P such that for each location l ∈ L, instead of including all enabled transitions, only the transitions from ample(l) are inserted into P. We require that ∀(s ∈ S), s(pc) = l =⇒ ample(s) ⊆ ample(l). Both the definitions in this section and those in Section 3 refer to the explicit CFG in order to define the ample sets. Since generating the explicit CFG would render any reduction useless, the common approach is to introduce sufficient heuristics such that the conditions C0 to C3 are satisfied. First, α I β if and only if

1. α labels a transition in P^i and β a transition in P^j with i ≠ j, and
2. there is no variable present in both α and β and modified by either one (cf. the sketch below).
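The two syntactic conditions translate directly into code. The following is a minimal sketch of such a check, assuming a transition relation is summarised by the thread whose CFG it labels and by the sets of variables it references and modifies; the names are illustrative and not the platform's API.

#include <set>

// Illustrative summary of a transition relation: the thread whose CFG
// it labels, the variables it references (including the modified ones),
// and the variables it modifies.
struct TransitionInfo {
    int thread;
    std::set<int> referenced;
    std::set<int> modified;
};

// alpha I beta holds if the two relations label transitions of different
// threads and no variable occurs in both while being modified by either.
bool independent(const TransitionInfo& a, const TransitionInfo& b) {
    if (a.thread == b.thread) return false;          // condition 1
    for (int v : a.modified)
        if (b.referenced.count(v)) return false;     // condition 2
    for (int v : b.modified)
        if (a.referenced.count(v)) return false;
    return true;
}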

One such heuristic suggests setting ample(l) to enabled(l_i) for some i, provided enabled(l_i) satisfies certain conditions, and using enabled(l) otherwise. The conditions for this particular heuristic derive from C0 to C3 as follows.

C0 Provided enabled(l) ≠ ∅, we can only choose l_i such that enabled(l_i) ≠ ∅. One over-approximation is to check, for each ρ ∈ {σ | l_i −→^σ l_i' ∈ T^i}, whether ρ is satisfiable for every evaluation of the unprimed variables. Assuming ρ = γ ∧ µ as suggested in Section 5, this property can be checked syntactically.

C1 Failing C1 entails the presence of a path π = s_1 → · · · → s_m → s_{m+1} in the explicit CFG with the following properties:

1. ∀(1 ≤ i ≤ m), s_{i+1} ∈ α_i(s_i) ∧ α_i ∉ ample(s_1),
2. ∃(β ∈ ample(s_1)), α_m D β,
3. ∀(1 ≤ i < m)(β ∈ ample(s_1)), α_i I β.

Then there is a thread P^i with a transition labelled with ρ such that, for some σ ∈ ample(s_1), ρ and σ share a variable modified by either one. This can be checked by enumerating all variables referenced in each thread.

C2 This condition is simple to test statically, again by enumerating all variables referenced in P^A.

C3 Since the goal is to build P completely, there is no reason not to use depth-first ordering. We can thus check the cycle property using the stack of the traversal algorithm. The sketch below combines these checks into one candidate-selection procedure.
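Putting these approximations together, the selection of ample(l) can be sketched as follows. This continues the previous sketch (reusing TransitionInfo and independent); the helpers alwaysEnabled and invisible stand for the syntactic C0 check on ρ = γ ∧ µ and the C2 check against P^A, and are assumed rather than implemented, so the code illustrates the heuristics rather than the platform's actual implementation.

#include <vector>

// TransitionInfo and independent() are as in the previous sketch.
bool alwaysEnabled(const TransitionInfo& t);  // assumed: syntactic C0 check
bool invisible(const TransitionInfo& t);      // assumed: C2 check against P^A

// Return enabled(l_i) for the first thread i whose candidates pass the
// static approximations of C0-C2; the cycle condition C3 is enforced on
// the DFS stack while P is being built, so it does not appear here.
std::vector<TransitionInfo> ample(
        const std::vector<std::vector<TransitionInfo>>& enabledPerThread,
        const std::vector<TransitionInfo>& allTransitions,
        const std::vector<TransitionInfo>& enabledAll) {
    for (const auto& cand : enabledPerThread) {
        bool ok = !cand.empty();                          // C0: non-empty
        for (const auto& t : cand) {
            if (!alwaysEnabled(t) || !invisible(t))       // C0 and C2
                ok = false;
            for (const auto& u : allTransitions)          // C1: independent of
                if (u.thread != t.thread && !independent(t, u))
                    ok = false;                           // all other threads
        }
        if (ok) return cand;
    }
    return enabledAll;                                    // no reduction possible
}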

8.3.3 LLVM to NTS Translation

The LLVM IR shares a hierarchical division of programs with NTS. While LLVM code is built from functions, an NTS is a collection of BasicNts blocks (or CFGs in the context of this thesis). Both LLVM functions and BasicNts blocks take an arbitrary number of input parameters, although functions can have at most one return parameter. Furthermore, local variables may be defined in both formalisms. In LLVM, functions consist of basic blocks: sequences of instructions. Only the last instruction inside a basic block changes the control flow; the execution of every other instruction is followed by the execution of the next instruction in the sequence. Thus a basic block of m instructions can be translated to a BasicNts with m control states, where the i-th state represents the program location after executing the i-th instruction. The semantics of individual instructions is captured by the transition rules.

Parallelism. We expect that the LLVM code is generated from C++ programs that use shared-memory parallelism and the Pthreads library. The generic nature of this implementation of parallelism allows creating threads anywhere in the program. Since the NTS language is considerably more limited in its support of parallelism, a semantics-preserving translation requires building an NTS which is not the result of a direct, one-to-one translation. Let us assume that at most n parallel threads may run concurrently in any execution of our program. The resulting NTS consists of n + 1 BasicNts blocks: n working threads and 1 main thread. In the working thread there are as many transitions from the initial state as there are parallel functions in the original program. Each of these initial transitions is conditioned by a guard representing the call of pthread_create with the appropriate function as its parameter. Taking such an initial transition leads to a state which initiates a BasicNts semantically equivalent to that parallel function. This allows reusing the working-thread BasicNts blocks for later threads. Example 8.11 demonstrates the translation.

Example 8.11 (LLVM with Pthreads Translated to NTS). Consider the C program below:

void* p1(void*) { .. }
void* p2(void*) { .. }

int main(void) {
  pthread_t t1, t2;
  pthread_create(&t1, NULL, p1, NULL);
  pthread_create(&t2, NULL, p2, NULL);
  return 0;
}

The NTS translated from the above C program is:

instances worker[2], main[1];
sel[2] : int;
init sel[0]=0 && sel[1]=0;

worker {
  initial si;
  si -> s1 {sel[tid]=1 && havoc()}
  si -> s2 {sel[tid]=2 && havoc()}
  s1 -> sf {p1()}
  s2 -> sf {p2()}
  sf -> si {sel'[tid]=0 && havoc(sel)}
}

p1 { .. }
p2 { .. }

main {
  initial s0;
  s0 -> s1 {sel'[0]=1 && havoc(sel)}
  s0 -> s2 {sel'[1]=2 && havoc(sel)}
}

The NTS is simplified, but the principal method should be clear. Note that this translation accepts a large portion of real parallel programs but limits the effect of the partial order reduction.

8.3.4 Architecture of the NTS-Platform

The tools composing the NTS-platform are freely available (http://anna.fi.muni.cz/~xbauch/code.html#nts) to be extended and modified for building other verification tools and for evaluating new verification methods.

nts-parser. The NTS parser is built using a parser combinator for C++ and thus requires the C++14 standard. The output of parsing an NTS file is the internal representation. The parser supports the full NTS language, extended with support for the bit-vector theory. Adding BV required the inclusion of a new type BitVector<n>, where n ∈ N denotes the number of bits. We have also added the arithmetic and bit-wise operators and modified the language non-terminals so that terms consisting of bit-vectors are valid. A possible shape of such a type is sketched below.
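The following C++14 sketch shows one way such a BitVector<n> value type could be realised, with wrap-around (modulo 2^n) arithmetic via masking; this is an assumed, minimal design limited to 64 bits, not the parser's actual internal representation.

#include <cstdint>

// Minimal sketch of a fixed-width bit-vector value with wrap-around
// (modulo 2^N) arithmetic, as required by the bit-vector theory.
template <unsigned N>
struct BitVector {
    static_assert(N >= 1 && N <= 64, "sketch limited to 64 bits");
    std::uint64_t bits;

    static constexpr std::uint64_t mask =
        (N == 64) ? ~0ull : ((1ull << N) - 1);

    explicit BitVector(std::uint64_t v = 0) : bits(v & mask) {}

    BitVector operator+(BitVector o) const { return BitVector(bits + o.bits); }
    BitVector operator&(BitVector o) const { return BitVector(bits & o.bits); }
    BitVector operator~() const            { return BitVector(~bits); }
    bool operator==(BitVector o) const     { return bits == o.bits; }
};

// Example: 8-bit arithmetic wraps around, e.g. 250 + 10 == 4 (mod 256).
static const bool demo = (BitVector<8>(250) + BitVector<8>(10)) == BitVector<8>(4);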

LLVM2NTS. The most important part of the translation from LLVM, with respect to parallel programs, is the encoding of Pthread functions and the resulting model of parallelism (see Section 8.3.3 above). We do not yet support the LLVM language fully: dynamic memory allocation and non-constant pointers are presently not handled by the translation.

POR. The static partial order reduction takes as input an internal representation of an NTS that consists of a number of BasicNts blocks in parallel composition. The result of this reduction is a sequentialised NTS consisting of a single BasicNts. In order to compare the efficiency of the reduction, we have also implemented a non-reducing sequentialiser. A more comprehensive evaluation would require extending the fragment of LLVM presently supported, but the initial results appear promising. We generated the NTS for the program fib_bench_false-unreach-call from SV-COMP, which consists of 3 parallel threads. Without POR the NTS contained 38988 states and 129504 transitions; the reduction decreased the state space to 29383 states and 63888 transitions.


8.4 Future of Platforms

As the experiments in Section 8.2 show, the control explicit—data symbolic approach represents a viable option for temporal verification of unmodified, parallel programs. Yet for some programs our tool was unable to decide correctness within a reasonable amount of time. The cause of this limitation lies with the set-based reduction and the presence of loops in the program: changes in data force the tool to unroll these loops. Hence the most important future work is to avoid the necessity of loop unrolling and to detect accepting cycles by means other than exact state equality.

Less theoretical, although by no means less important for larger programs, is pruning away those parts of the input programs that cannot influence correctness. Especially given the low-level nature of LLVM, clever heuristics for detecting irrelevant code could lead to considerably smaller control-flow graphs.

Apart from the implementation omissions, the NTS-platform offers a few possible extensions that would improve its usability as a framework for building verification tools. Dynamic manipulation of memory on the side of programs does not have an equivalent counterpart in NTS. One possible way to handle it would be to employ the numerical abstraction investigated in [MTLT10] or, more recently, in [SGB+14]. Another way is to encode the heap as an array and add support for the first-order theory of arrays to NTS. With respect to the LLVM to NTS translation of parallelism, a less monolithic approach is desirable, especially given the limitations the present approach imposes on approximating the partial order conditions. Finally, the heuristics used during the computation of ample sets may benefit from checking the satisfiability of transition guards, and thus including an SMT solver might improve the reduction efficiency.


Chapter 9

Conclusion

Automating software development is a challenging task, clearly motivated by the demands of the present-day market. Our work concentrates on a very particular component of this task: checking correctness. We do not work towards any concrete hypothesis, although the theoretical properties of both control explicit—data symbolic model checking and the sanity checking of requirements appear promising. Our experiments do not corroborate any ground-breaking theory, although they attest to the applicability of our techniques on industrial-strength case studies. This thesis collects our scientific experience with using explicit-state model checking to automate some of the tasks of software development that are crucial especially for safety-critical systems.

Of the five stages of software development, this work contributed to (1) requirements analysis, (2) design analysis, and (3) program verification. The control explicit—data symbolic model checking is applicable both to the design and to the final program, whereas the sanity checking is specific to the requirements phase. The following sections should provide adequate closure to the effort we have invested in automating some processes of the three selected phases.

9.1 Checking Requirements When It Matters

Our analysis of user requirements further expands the incorporation of formal methods into software development. Aiming specifically at the requirements stage, we propose a novel approach to sanity checking of requirements formalised as LTL formulae. Our approach is comprehensive in scope, integrating consistency, redundancy, and completeness checking to allow the user (or a requirements engineer) to produce a high-quality set of requirements more easily. The novelty of our consistency (redundancy) checking is that it produces all inconsistent (redundant) sets instead of a yes/no answer, and its efficiency is demonstrated in an experimental evaluation. Finally, the completeness checking presents a new behaviour-based coverage and suggests formulae that would improve the coverage and, consequently, the rigour of the final requirements.

Especially in the earliest stages of requirements elicitation, when the engineers have to work with highly incomplete and often contradictory sets of requirements, our sanity checking framework could be of great use. Other realisability checking tools can also (and with better running times) be used on a nearly final set of requirements, but our process uniquely targets these crucial early stages. First, finding all inconsistent subsets instead of merely one helps to more accurately pinpoint the source of inconsistency, which may not

have been elicited yet. Second, once working with a consistent set, the requirements engineer gets further feedback from the redundancy analysis, with every minimal redundancy witness pointing to a potential lack of understanding of the system being designed. Finally, the semi-interactive, coverage-improving generation of new requirements attempts to partly automate the process of making the set of requirements detailed enough to prevent later development stages from misinterpreting a less thoroughly covered aspect of the system. We demonstrate this potential application on case studies of avionics systems.

The above described addition to the standard elicitation process may seem intrusive, but the feedback we have gathered from requirements engineers at Honeywell was mostly positive. Especially given the tool support for each individual step of the process – pattern-based automatic translation from natural language requirements to LTL and the subsequent fully- and semi-automated techniques for sanity checking – a fast adoption of the techniques was observed. Many of the engineers reported a non-trivial initial investment to acquire the basics of LTL, but in most cases the resulting overall savings in effort outweighed the initial adoption effort. After the first week, the overhead of tool-supported formalisation into LTL was less than 5 minutes per requirement, while the average time needed for the corresponding stages of requirements elicitation decreased by 18 per cent.

We have also accumulated a considerable amount of experience from employing the sanity checking on real-world sets of requirements during our cooperation with Honeywell International. Of particular importance was the realisation that the time requirements of our original implementation effectively prevented the use of sanity checking on an industrial scale. In order to ameliorate the poor timing results, we reimplemented the sanity checking algorithms using the state-of-the-art LTL to Büchi translator SPOT, which accelerated the process by several orders of magnitude on larger sets of formulae. Another important observation was that the concept of model-based redundancy needed to be translated into the model-free environment more directly. We thus extended the sanity checking process with the capacity to generate redundancy witnesses (using the formally described theory of path-quantified formulae). Finally, we summarise these and other observations in a case study consisting of four sets of requirements gathered during the development of industry-scale systems.

9.2 Model Checking Needs Symbolic Data

Moving from the theoretical application of LTL model checking – a methodology to verify correctness of reactive systems – to practical application on parallel programs requires handling efficiently both the control-flow and the data-flow choices the programs can make. And while the majority of approaches treats these two sources of non-determinism equally, this work promotes the appreciation of the fundamental differences between control and data flows. We argue that the nature of control flow prevents verification polynomial in the number of program threads (see [FKP13] for current developments), and thus we propose to handle the control-flow state space explosion by efficient parallelisation. The data flow, however, is often regular enough for symbolic representations to capture it concisely. Hence the object of our investigation: control explicit—data symbolic model checking.

The combination of explicit and symbolic approaches to model checking presents several possible directions of research, and we follow the path where a symbolic representation of data is devised which an explicit model checker can use without requiring modifications. The proposed set-based state space reduction is compliant with this requirement, though at the

price of potential inefficiency. After being set-reduced, the state space contains states with explicit and symbolic parts, the latter representing a set of variable evaluations. Hence the crux of the proposed combination: what symbolic representation would be most efficient in representing a set of variable evaluations?

Given the advancement of modern SAT and SMT solvers, and given the potential inefficiency of the established canonical representation [Bry91], we opted for a representation based on bit-vector formulae: a bit-vector function represents the set that forms its image. Computing the effect of program statements thus translates trivially to syntactic substitution. Deciding state matching, and the related search among already visited states – crucial operations for LTL verification – proved considerably more complicated. We have devised two implementations of the state matching and two heuristics that accelerate searching and storing within the database of states. All these implementations were compared with each other and with classical, purely explicit model checking on DVE models of communication protocols and on randomly generated models (allowing more targeted experiments). The common observation is that, with the proposed combination, explicit model checking can now verify programs with non-trivial data flow, which was previously not possible.

Still, in order to improve the effectiveness of design supported by verification, we have proposed, implemented, and evaluated a combined approach using a hybrid representation of data. This representation leverages the positive aspects of both the concrete and symbolic representations while employing some knowledge about the Simulink diagram which was not explicitly stated by the developer. We have demonstrated how SMT solvers can be used to infer this tacit knowledge, which can subsequently enable the hybrid representation to achieve considerably better performance.

For the use in verification of parallel programs, or at least of safety-critical segments of programs, the cost would probably be prohibitive. We discuss ideas that could lead to expressing the state matching problem without the use of quantifiers, but leave that direction of research to future work. Yet in the case of Simulink diagrams, our approach scales with the ranges of input variables and thus allows the developer to use the range limit of the used data type, e.g. 32-bit integers, where before the developers had to test for ranges that were exponentially smaller. This improvement effectively automates the verification process, replacing manual testing completely, since the resulting confidence in the correctness of the system with model checking is at least as high as it was with classical testing.
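As a closing illustration, the following sketch shows the substitution-based computation of statement effects and a quantifier-free state matching query using the Z3 C++ API. Representing a set directly as a predicate over the current variables is a simplification chosen for the example – the thesis represents sets as images of bit-vector functions – and all names are illustrative.

#include <z3++.h>
#include <iostream>

int main() {
    z3::context c;
    z3::expr x = c.bv_const("x", 32);

    // A set of evaluations kept as a predicate over x: here {0, ..., 9}.
    z3::expr set1 = z3::ult(x, c.bv_val(10, 32));

    // Effect of the statement x := x + 1 by syntactic substitution:
    // a value is in the image iff its predecessor x - 1 was in the set.
    z3::expr_vector src(c), dst(c);
    src.push_back(x);
    dst.push_back(x - 1);
    z3::expr set2 = set1.substitute(src, dst);   // now {1, ..., 10}

    // State matching: the sets are equal iff their predicates cannot
    // disagree on any evaluation, i.e. the disequality is unsatisfiable.
    z3::expr other = z3::uge(x, c.bv_val(1, 32)) && z3::ule(x, c.bv_val(10, 32));
    z3::solver s(c);
    s.add(set2 != other);
    std::cout << "sets equal: " << (s.check() == z3::unsat) << "\n";
}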


Bibliography

[AH97] H. Andersen and H. Hulgaard. Boolean Expression Diagrams. In Proc. of LICS, pages 88–98, 1997. 18

[AILS07] L. Aceto, A. Ingólfsdóttir, K. Larsen, and J. Srba. Reactive Systems: Modelling, Specification and Verification. Cambridge University Press, 2007. 3

[ALW89] M. Abadi, L. Lamport, and P. Wolper. Realizable and Unrealizable Specifications of Reactive Systems. In Proc. of ICALP, pages 1–17, 1989. 23

[AMP06] A. Armando, J. Mantovani, and L. Platania. Bounded Model Checking of Software Using SMT Solvers Instead of SAT Solvers. In SPIN, volume 3925 of LNCS, pages 146–162. Springer, 2006. 25

[APV06] S. Anand, C. Păsăreanu, and W. Visser. Symbolic Execution with Abstract Subsumption Checking. In SPIN, volume 3925 of LNCS, pages 163–181. Springer, 2006. 21

[BAS02] A. Biere, C. Artho, and V. Schuppan. Liveness Checking as Safety Checking. In Proc. of FMICS, pages 160–177, 2002. 26

[BB09] J. Bormann and H. Busch. Method for the Determination of the Quality of a Set of Properties, Usable for the Verification and Specification of Circuits. U.S. Patent No. 7,571,398 B2, 2009. 24

[BBB+12] J. Barnat, J. Beran, L. Brim, T. Kratochvíla, and P. Ročkai. Tool Chain to Support Automated Formal Verification of Avionics Simulink Designs. In FMICS, volume 7437 of LNCS, pages 78–92. Springer, 2012. 20, 24, 55, 65, 83, 89

[BBBČ12] J. Barnat, P. Bauch, L. Brim, and M. Češka. Designing Fast LTL Model Checking Algorithms for Many-Core GPUs. J. Parallel Distr. Com., 72(9):1083–1097, 2012. 54

[BBČR10] J. Barnat, L. Brim, M. Češka, and P. Ročkai. DiVinE: Parallel Distributed Model Checker. In Proc. of HiBi/PDMC, pages 4–7, 2010. 7, 48

[BBDER01] I. Beer, S. Ben-David, C. Eisner, and Y. Rodeh. Efficient Detection of Vacuity in Temporal Model Checking. Form. Methods Syst. Des., 18(2):141–163, 2001. 38, 41

[BBH+13] J. Barnat, L. Brim, V. Havel, J. Havlíček, J. Kriho, M. Lenčo, P. Ročkai, V. Štill, and J. Weiser. DiVinE 3.0 – Explicit-State Model Checker for Multithreaded C/C++ Programs. In CAV, pages 863–868, 2013. 5, 26, 89, 103

[BBR12] J. Barnat, L. Brim, and P. Ročkai. Towards LTL Model Checking of Unmodified Thread-Based C&C++ Programs. In NFM, pages 252–266, 2012. 122, 123

[BBR14] T. Beyene, M. Brockschmidt, and A. Rybalchenko. CTL+FO Verification as Constraint Solving. Preprint available at http://arxiv.org/abs/1406.3988, 2014. 27

[BBS01] J. Barnat, L. Brim, and J. Stříbrná. Distributed LTL Model-Checking in SPIN. In SPIN, volume 2057 of LNCS, pages 200–216. Springer, 2001. 20

[BC95] R. Bryant and Y.-A. Chen. Verification of Arithmetic Circuits with Binary Moment Diagrams. In Proc. of DAC, pages 535–541, 1995. 18, 25

[BCC+99] A. Biere, A. Cimatti, E. Clarke, M. Fujita, and Y. Zhu. Symbolic Model Checking Using SAT Procedures instead of BDDs. In Proc. of DAC, pages 317–320. ACM, 1999. 25

[BCCZ99] A. Biere, A. Cimatti, E. Clarke, and Y. Zhu. Symbolic Model Checking without BDDs. In TACAS, volume 1579 of LNCS, pages 193–207. Springer, 1999. 20

[BCF+07] R. Bruttomesso, A. Cimatti, A. Franzén, A. Griggio, Z. Hanna, A. Nadel, A. Palti, and R. Sebastiani. A Lazy and Layered SMT(BV) Solver for Hard Industrial Verification Problems. In CAV, volume 4590 of LNCS, pages 547–560. Springer, 2007. 19

[BCG+10] R. Bloem, A. Cimatti, K. Greimel, G. Hofferek, R. Könighofer, M. Roveri, V. Schuppan, and R. Seeber. RATSY – A New Requirements Analysis Tool with Synthesis. In Proc. of CAV, pages 425–429, 2010. 23

[BČMŠ04] L. Brim, I. Černá, P. Moravec, and J. Šimša. Accepting Predecessors Are Better than Back Edges in Distributed LTL Model-Checking. In FMCAD, volume 3312 of LNCS, pages 352–366. Springer, 2004. 74

[BCZ99] A. Biere, E. Clarke, and Y. Zhu. Multiple State and Single State Tableaux for Combining Local and Global Model Checking. In Correct System Design, volume 1710 of LNCS, pages 163–179. Springer, 1999. 22

[BDE97] B. Becker, R. Drechsler, and R. Enders. On the Representational Power of Bit-Level and Word-Level Decision Diagrams. In Proc. of ASP-DAC, pages 461–467, 1997. 18

[BDG+98] J. Bohn, W. Damm, O. Grumberg, H. Hungar, and K. Laster. First-Order-CTL Model Checking. In FSTTCS, volume 1530 of LNCS, pages 283–294. Springer, 1998. 21

[BDKP08] P. Braione, G. Denaro, B. Křena, and M. Pezzè. Verifying LTL Properties of Bytecode with Symbolic Execution. In Proc. of Bytecode, pages 1–14. Elsevier Science, 2008. 21, 24

[BDL98] C. Barrett, D. Dill, and J. Levitt. A Decision Procedure for Bit-Vector Arithmetic. In Proc. of DAC, pages 522–527, 1998. 19

[BDL07] S. Baarir and A. Duret-Lutz. Emptiness Check of Powerset Büchi Automata using Inclusion Tests. In Proc. of ACDS, pages 41–50, 2007. 22

[BFG+01] S. Blom, W. Fokkink, J. Groote, I. van Langevelde, B. Lisser, and J. van de Pol. µCRL: A Toolset for Analysing Algebraic Specifications. In CAV, volume 2102 of LNCS, pages 250–254. Springer, 2001. 7

[BGL98] T. Bultan, R. Gerber, and C. League. Verifying Systems with Integer Constraints and Boolean Predicates: A Composite Approach. In Proc. of ISSTA, pages 113–123, 1998. 19, 25

[BGP97] T. Bultan, R. Gerber, and W. Pugh. Symbolic Model Checking of Infinite State Systems Using Presburger Arithmetic. In CAV, volume 1254 of LNCS, pages 400–411. Springer, 1997. 9

[Bje99] P. Bjesse. Symbolic Model Checking with Sets of States Represented as Formulas. Technical report, Chalmers Technical University, 1999. 20

[BK11] D. Beyer and M. Keremoglu. CPAchecker: A Tool for Configurable Software Verification. In CAV, pages 184–190, 2011. 22, 122, 126

[BKO+07] R. Bryant, D. Kroening, J. Ouaknine, S. Seshia, O. Strichman, and B. Brady. Deciding Bit-Vector Arithmetic with Abstraction. In TACAS, volume 4424 of LNCS, pages 358–372. Springer, 2007. 19

[BL13] D. Beyer and S. Löwe. Explicit-State Software Model Checking Based on CEGAR and Interpolation. In FASE, volume 7793 of LNCS, pages 146–162. Springer, 2013. 22, 71

[BM05] D. Babic and M. Musuvathi. Modular Arithmetic Decision Procedure. Technical report, Microsoft Research Redmond, 2005. 19

[Boi99] B. Boigelot. Symbolic Methods for Exploring Infinite State Space. PhD thesis, University of Liège, 1999. 19

[BPR13] T. Beyene, C. Popeea, and A. Rybalchenko. Solving Existentially Quantified Horn Clauses. In Proc. of CAV, pages 869–882, 2013. 27

[Bra11] A. Bradley. SAT-Based Model Checking without Unrolling. In VMCAI, pages 70–87, 2011. 20, 25, 26

[Bry86] R. Bryant. Graph-Based Algorithms for Boolean Function Manipulation. IEEE Trans. Comput., C-35(8):677–691, 1986. 18

[Bry91] R. Bryant. On the Complexity of VLSI Implementations and Graph Representations of Boolean Functions with Application to Integer Multiplication. IEEE Trans. Comput., 40(2):205–213, 1991. 5, 18, 19, 25, 137

[BS98] M. Büchi and E. Sekerinski. Formal Methods for Component Software: The Refinement Calculus Perspective. In ECOOP, volume 1357 of LNCS, pages 332–337. Springer, 1998. 19

[BSHY11] A. Bradley, F. Somenzi, Z. Hassan, and Z. Yan. An Incremental Approach to Model Checking Progress Properties. In Proc. of FMCAD, pages 144–153, 2011. 26

[BST10] C. Barrett, A. Stump, and C. Tinelli. The SMT-LIB Standard: Version 2.0. Technical report, The University of Iowa, 2010. 89

[BW94] B. Boigelot and P. Wolper. Symbolic Verification with Periodic Sets. In CAV, volume 818 of LNCS, pages 55–67. Springer, 1994. 19

[CAB+98] W. Chan, R. Anderson, P. Beame, S. Burns, F. Modugno, D. Notkin, and J. Reese. Model Checking Large Software Specifications. IEEE T. Software Eng., 24:498–520, 1998. 22

[CC77] P. Cousot and R. Cousot. Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints. In Proc. of POPL, pages 238–252. ACM, 1977. 8, 21

[CCG+02] A. Cimatti, E. Clarke, E. Giunchiglia, F. Giunchiglia, M. Pistore, M. Roveri, R. Sebastiani, and A. Tacchella. NuSMV 2: An OpenSource Tool for Symbolic Model Checking. In CAV, pages 241–268, 2002. 7, 24, 26

[CES86] E. Clarke, E. Emerson, and A. Sistla. Automatic Verification of Finite-State Concurrent Systems Using Temporal Logic Specifications. ACM T. Progr. Lang. Sys., 8(2):244–263, 1986. 9, 20

[CGMT14] A. Cimatti, A. Griggio, S. Mover, and S. Tonetta. Verifying LTL Properties of Hybrid Systems with K-Liveness. In Proc. of CAV, pages 424–440, 2014. 26

[CGP99] E. Clarke, O. Grumberg, and D. Peled. Model Checking. MIT Press, 1999. 31, 35, 128

[CGP+07] B. Cook, A. Gotsman, A. Podelski, A. Rybalchenko, and M. Vardi. Proving That Programs Eventually Do Something Good. In Proc. of POPL, pages 265–276, 2007. 26

[Chu62] A. Church. Logic, Arithmetic, and Automata. Proc. Intern. Congr. Math., 0:21–35, 1962. 31

[CK11] B. Cook and E. Koskinen. Making Prophecies with Decision Predicates. In Proc. of POPL, pages 399–410, 2011. 26, 125, 126, 128

[CKKV01] H. Chockler, O. Kupferman, R. Kurshan, and M. Vardi. A Practical Approach to Coverage in Model Checking. In CAV, volume 2102 of LNCS, pages 66–78. Springer, 2001. 23

[CKP14] B. Cook, H. Khlaaf, and N. Piterman. Faster Temporal Reasoning for Infinite-State Programs. In Proc. of FMCAD, pages 16:75–16:82, 2014. 26

[CKP15a] B. Cook, H. Khlaaf, and N. Piterman. Fairness for Infinite-State Systems. In TACAS, volume 9035 of LNCS, pages 384–398. Springer, 2015. 26

[CKP15b] B. Cook, H. Khlaaf, and N. Piterman. On Automation of CTL* Verification for Infinite-State Systems. In CAV, volume 9207 of LNCS, pages 13–29. Springer, 2015. 27

[CKS05] B. Cook, D. Kroening, and N. Sharygina. Symbolic Model Checking for Asynchronous Boolean Programs. In SPIN, volume 3639 of LNCS, pages 75–90. Springer, 2005. 22

[CKV01] H. Chockler, O. Kupferman, and M. Vardi. Coverage Metrics for Temporal Logic Model Checking. In TACAS, volume 2031 of LNCS, pages 528–542. Springer, 2001. 23, 45

[CMR97] D. Cyrluk, O. Möller, and H. Rueß. An Efficient Decision Procedure for the Theory of Fixed-Sized Bit-Vectors. In CAV, volume 1254 of LNCS, pages 60–71. Springer, 1997. 19

[CNR12] A. Cimatti, I. Narasamdya, and M. Roveri. Software Model Checking with Explicit Scheduler and Symbolic Threads. LMCS, 8(2):1–42, 2012. 22

[ČP03] I. Černá and R. Pelánek. Distributed Explicit Fair Cycle Detection (Set Based Approach). In SPIN, volume 2648 of LNCS, pages 623–623. Springer, 2003. 74, 105

[CPDGP01] A. Coen-Porisini, G. Denaro, C. Ghezzi, and M. Pezzè. Using Symbolic Execution for Verifying Safety-Critical Systems. In Proc. of ESEC/FSE, pages 142–151, 2001. 21

[CPJ09] M. Ciesielski, D. Pradhan, and A. Jabir. Canonical Graph-based Representations for Verification of Arithmetic and Data Flow Designs, chapter 7, pages 173–245. Cambridge University Press, 2009. 19

[CRB01] A. Cimatti, M. Roveri, and P. Bertoli. Searching Powerset Automata by Combining Explicit-State and Symbolic Model Checking. In TACAS, volume 2031 of LNCS, pages 313–327. Springer, 2001. 21

[CRST08] A. Cimatti, M. Roveri, V. Schuppan, and A. Tchaltsev. Diagnostic Information for Realizability. In Proc. of VMCAI, pages 52–67, 2008. 23

[CVWY92] C. Courcoubetis, M. Vardi, P. Wolper, and M. Yannakakis. Memory-Efficient Algorithms for the Verification of Temporal Properties. Form. Method Syst. Des., 1(2):275–288, 1992. 43, 74

[DAC98] M. Dwyer, G. Avrunin, and J. Corbett. Property Specification Patterns for Finite-State Verification. In Proc. of FMSP, pages 7–15, 1998. 49, 52, 55

[DHLP15] D. Dietsch, M. Heizmann, V. Langenfeld, and A. Podelski. Fairness Modulo Theory: A New Approach to LTL Software Model Checking. In CAV, volume 9206 of LNCS, pages 49–66. Springer, 2015. 27

[Dij76] E. Dijkstra. A Discipline of Programming. Prentice-Hall, 1976. 3

[DL11] A. Duret-Lutz. LTL Translation Improvements in Spot. In Proc. of VECoS, pages 72–83, 2011. 57

[DL14] A. Duret-Lutz. LTL Translation Improvements in SPOT 1.0. IJCCBS, 5(1):31–54, 2014. 123

[DLKPTM11] A. Duret-Lutz, K. Klai, D. Poitrenaud, and Y. Thierry-Mieg. Self-Loop Aggregation Product — A New Hybrid Approach to On-the-Fly LTL Model Checking. In ATVA, volume 6996 of LNCS, pages 336–350. Springer, 2011. 22, 25

[DLL62] M. Davis, G. Logemann, and D. Loveland. A Machine Program for Theorem Proving. Commun. ACM, 5:394–397, 1962. 30

[dMB08] L. de Moura and N. Bjørner. Z3: An Efficient SMT Solver. In TACAS, volume 4963 of LNCS, pages 337–340. Springer, 2008. 89, 103

[dMRS03] L. de Moura, H. Rueß, and M. Sorea. Bounded Model Checking and Induction: From Refutation to Verification. In CAV, volume 2725 of LNCS, pages 14–26. Springer, 2003. 20

[EP02] C. Eisner and D. Peled. Comparing Symbolic and Explicit Model Checking of a Software System. In SPIN, volume 2318 of LNCS, pages 230–239. Springer, 2002. 20

[FG03] G. Feierbach and V. Gupta. True Coverage: A Goal of Verification. In Proc. of ISQED, pages 75–78, 2003. 24

[FKP13] A. Farzan, Z. Kincaid, and A. Podelski. Inductive Data Flow Graphs. In Proc. of POPL, pages 129–142, 2013. 27, 136

[God03] P. Godefroid. Reasoning about Abstract Open Systems with Generalized Module Checking. In EMSOFT, volume 2855 of LNCS, pages 223–240. Springer, 2003. 25

[GP05] E. Gunter and D. Peled. Model Checking, Testing and Verification Working Together. Form. Asp. Comput., 17:201–221, 2005. 22

[GPR11a] A. Gupta, C. Popeea, and A. Rybalchenko. Predicate Abstraction and Refinement for Verifying Multi-Threaded Programs. In Proc. of POPL, pages 331–344. ACM, 2011. 21

[GPR11b] A. Gupta, C. Popeea, and A. Rybalchenko. Solving Recursion-Free Horn Clauses over LI+UIF. In APLAS, volume 7078 of LNCS, pages 188–203. Springer, 2011. 27

[Har09] J. Harrison. Handbook of Practical Logic and Automated Reasoning. Cambridge University Press, 2009. 31

[Hav14] V. Havel. Generic Platform for Explicit-Symbolic Verification. Master’s thesis, Masaryk University, 2014. 120, 126

[HCRP91] N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The Synchronous Data Flow Programming Language LUSTRE. Proc. IEEE, 79(9):1305–1320, 1991. 24

[HGD95] H. Hungar, O. Grumberg, and W. Damm. What if Model Checking Must Be Truly Symbolic. In CHARME, volume 987 of LNCS, pages 1–20. Springer, 1995. 21, 25, 87

[HHP13] M. Heizmann, J. Hoenicke, and A. Podelski. Software Model Checking for People Who Love Automata. In CAV, volume 8044 of LNCS, pages 36–52. Springer, 2013. 27

[HIK04] S. Haddad, J.-M. Ilié, and K. Klai. Design and Evaluation of a Symbolic and Abstraction-Based Model Checker. In ATVA, volume 3299 of LNCS, pages 196–210. Springer, 2004. 22

[HJC+08] M. Hinchey, M. Jackson, P. Cousot, B. Cook, J. Bowen, and T. Margaria. Software Engineering and Formal Methods. Commun. ACM, 51:54–59, 2008. 3

[HJJ+95] J. Henriksen, J. Jensen, M. Jørgensen, N. Klarlund, R. Paige, T. Rauhe, and A. Sandholm. Mona: Monadic Second-Order Logic in Practice. In TACAS, volume 1019 of LNCS, pages 89–110. Springer, 1995. 19

[HKG+12] H. Hojjat, F. Konečný, F. Garnier, R. Iosif, V. Kuncak, and P. Rümmer. A Verification Toolkit for Numerical Transition Systems. In FM, volume 7436 of LNCS, pages 247–251. Springer, 2012. 119

[HL95] M. Heimdahl and N. Leveson. Completeness and Consistency Analysis of State-Based Requirements. In Proc. of ICSE, pages 3–14, 1995. 23, 24

[Hol97] G. Holzmann. The Model Checker SPIN. IEEE T. Software Eng., 23(5):279–295, 1997. 26

[HT08] G. Hagen and C. Tinelli. Scaling Up the Formal Verification of Lustre Programs with SMT-Based Techniques. In Proc. of FMCAD, pages 1–9, 2008. 25

[Jai08] R. Jain. The Art of Computer Systems Performance Analysis. John Wiley & Sons, 2008. 3

[KF13] A. Kupriyanov and B. Finkbeiner. Causality-Based Verification of Multi-threaded Programs. In Proc. of CONCUR, pages 257–272, 2013. 27

[KF14] A. Kupriyanov and B. Finkbeiner. Causal Termination of Multi-threaded Programs. In Proc. of CAV, pages 814–830, 2014. 27

[KGG99] S. Katz, O. Grumberg, and D. Geist. Have I Written Enough Properties? - A Method of Comparison Between Specification and Implementation. In Proc. of CHARME, pages 280–297, 1999. 23

[KHB09] R. Könighofer, G. Hofferek, and R. Bloem. Debugging Formal Specifications Using Simple Counterstrategies. In Proc. of FMCAD, pages 152–159, 2009. 23

[Kin76] J. King. Symbolic Execution and Program Testing. Commun. ACM, 19(7):385–394, 1976. 8, 21, 66

[KPV03] S. Khurshid, C. P˘as˘areanu, and W. Visser. Generalized Symbolic Execution for Model Checking and Testing. In TACAS, volume 2619 of LNCS, pages 553–568. Springer, 2003. 21

[KS10] D. Kroening and O. Strichman. Decision Procedures: An Algorithmic Point of View. Springer, 2010. 29

[Kup06] O. Kupferman. Sanity Checks in Formal Verification. In CONCUR, volume 4137 of LNCS, pages 37–51. Springer, 2006. 7, 23, 37

[Kup14] A. Kupriyanov. Causality-Based LTL Model Checking without Automata. Presented at FMCAD Poster Session, 2014. 27

[KV96] O. Kupferman and M. Vardi. Module Checking. In CAV, volume 1102 of LNCS, pages 75–86. Springer, 1996. 20, 25

[KV03] O. Kupferman and M. Vardi. Vacuity Detection in Temporal Model Checking. STTT, 4:224–233, 2003. 23

[LA04] C. Lattner and V. Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proc. of CGO, pages 75–86, 2004. 117

[Lev00] N. Leveson. Completeness in Formal Specification Language Design for Process-Control Systems. In Proc. of FMSP, pages 75–87, 2000. 23

[Lin96] H. Lin. Symbolic Transition Graph with Assignment. In CONCUR, volume 1119 of LNCS, pages 50–65. Springer, 1996. 19, 25, 64, 87

[LMS04] I. Lynce and J. P. Marques-Silva. On Computing Minimum Unsatisfiable Cores. In Proc. of SAT, pages 305–310, 2004. 23

[LS08] M. Liffiton and K. Sakallah. Algorithms for Computing Minimal Unsatisfiable Subsets of Constraints. J. Autom. Reasoning, 40(1):1–33, 2008. 23

[MAW+05] S. Miller, E. Anderson, L. Wagner, M. Whalen, and M. Heimdahl. Formal Verification of Flight Critical Software. In Proc. of GNC, pages 1–16, 2005. 24

[MBR06] B. Meenakshi, A. Bhatnagar, and S. Roy. Tool for Translating Simulink Models into Input Language of a Model Checker. In ICFEM, volume 4260 of LNCS, pages 606–620. Springer, 2006. 24

[McM92] K. McMillan. Symbolic Model Checking. PhD thesis, Carnegie Mellon University, 1992. 9, 18, 20, 24

[McM02] K. McMillan. Applying SAT Methods in Unbounded Symbolic Model Checking. In CAV, volume 2404 of LNCS, pages 250–264. Springer, 2002. 20

[McM03] K. McMillan. Interpolation and SAT-Based Model Checking. In CAV, pages 1–13, 2003. 20, 26

[Met13] C. Metz. The One Last Thread Holding Apple and Google Together. Wired, 7:1–1, 2013. 117

[Min12] A. Miné. Abstract Domains for Bit-Level Machine Integer and Floating-Point Operations. In Proc. of WING'12, 2012. 27

[Min13] A. Miné. Static Analysis by Abstract Interpretation of Concurrent Programs. Technical report, École Normale Supérieure, 2013. 27

[MP81] Z. Manna and A. Pnueli. Verification of Concurrent Programs, Part I: The Temporal Framework. Technical report, Stanford University, 1981. 3

[MP95] Z. Manna and A. Pnueli. Temporal Verification of Reactive Systems: Safety. Springer, 1995. 29

[MTH03] S. Miller, A. Tribble, and M. Heimdahl. Proving the Shalls. In FME, volume 2805 of LNCS, pages 75–93. Springer, 2003. 23

[MTLT10] S. Magill, M.-H. Tsai, P. Lee, and Y.-K. Tsay. Automatic Numeric Abstractions for Heap-Manipulating Programs. In Proc. of POPL, pages 211–222, 2010. 133

[NF96] R. Nelken and N. Francez. Automatic Translation of Natural Language System Specifications Into Temporal Logic. In CAV, volume 1102 of LNCS, pages 360–371. Springer, 1996. 3

[OG76] S. Owicki and D. Gries. An Axiomatic Proof Technique for Parallel Programs I. Acta Inform., 6:319–340, 1976. 8

[Pel07] R. Pelánek. BEEM: Benchmarks for Explicit Model Checkers. In SPIN, volume 4595 of LNCS, pages 263–267. Springer, 2007. 103

[Pic03] M. Pichora. Automated Reasoning about Hardware Data Types Using Bit-Vectors of Symbolic Lengths. PhD thesis, University of Toronto, 2003. 19

[PR05] A. Podelski and A. Rybalchenko. Transition Predicate Abstraction and Fair Termination. In Proc. of POPL, pages 132–144, 2005. 21, 26

[RDB+05] S. Roy, S. Das, P. Basu, P. Dasgupta, and P. P. Chakrabarti. SAT Based Solutions for Consistency Problems in Formal Property Specifications for Open Systems. In Proc. of ICCAD, pages 885–888, 2005. 23

[Ric53] H. Rice. Classes of Recursively Enumerable Sets and Their Decision Problems. T. Am. Math. Soc., 74(2):358–366, 1953. 3

[RLS+03] S. Regimbal, J.-F. Lemire, Y. Savaria, G. Bois, E. Aboulhamid, and A. Baron. Automating Functional Coverage Analysis Based on an Executable Specification. In Proc. of IWSOC, pages 228–234, 2003. 24

[RPHS96] K. Ravi, A. Pardo, G. Hachtel, and F. Somenzi. Modular Verification of Multipliers. In FMCAD, volume 1166 of LNCS, pages 49–63. Springer, 1996. 18

[RV07] K. Rozier and M. Vardi. LTL Satisfiability Checking. In SPIN, volume 4595 of LNCS, pages 149–167. Springer, 2007. 23

[RWH07] A. Rajan, M. Whalen, and M. Heimdahl. Model Validation using Automatically Generated Requirements-Based Tests. In Proc. of HASE, pages 95–104, 2007. 23

[Sch12] V. Schuppan. Towards a Notion of Unsatisfiable and Unrealizable Cores for LTL. Sci. Comput. Program., 77(7-8):908–939, 2012. 23

[SGB+14] T. Ströder, J. Giesl, M. Brockschmidt, F. Frohn, C. Fuhs, J. Hensel, and P. Schneider-Kamp. Proving Termination and Memory Safety for Programs with Pointer Arithmetic. In IJCAR, volume 8562 of LNCS, pages 208–223. Springer, 2014. 133

[Sho82] R. Shostak. Deciding Combinations of Theories. In CADE, volume 138 of LNCS, pages 209–222. Springer, 1982. 19

[Šim06] P. Šimeček. DiVinE – Distributed Verification Environment. Master's thesis, Masaryk University, 2006. 104

[SK07] A. Simon and A. King. Taming the Wrapping of Integer Arithmetic. In SAS, volume 4634 of LNCS, pages 121–136. Springer, 2007. 21

[SMA05] K. Sen, D. Marinov, and G. Agha. CUTE: A Concolic Unit Testing Engine for C. ACM SIGSOFT, 30:263–272, 2005. 22

[STV05] R. Sebastiani, S. Tonetta, and M. Vardi. Symbolic Systems, Explicit Properties: On Hybrid Approaches for LTL Symbolic Model Checking. In CAV, volume 3576 of LNCS, pages 100–246. Springer, 2005. 22, 25

[TK01] S. Tasiran and K. Keutzer. Coverage Metrics for Functional Validation of Hardware Designs. IEEE Des. Test. Comput., 18(4):36–45, 2001. 45

[UM15] C. Urban and A. Miné. Proving Guarantee and Recurrence Temporal Properties by Abstract Interpretation. In VMCAI, volume 8931 of LNCS, pages 190–208. Springer, 2015. 27

[Var01] M. Vardi. Branching vs. Linear Time: Final Showdown. In TACAS, volume 2031 of LNCS, pages 1–22. Springer, 2001. 20

[vdB13] F. van der Berg. Model Checking LLVM IR Using LTSmin. Master’s thesis, University of Twente, 2013. 26

[VW86] M. Vardi and P. Wolper. An Automata-Theoretic Approach to Automatic Program Verification. In Proc. of LICS, pages 332–344. IEEE Computer Society Press, 1986. 9, 20, 85

[Wan06] B.-Y. Wang. On the Satisfiability of Modular Arithmetic Formulae. In ATVA, volume 4218 of LNCS, pages 186–199. Springer, 2006. 19

[WB00] P. Wolper and B. Boigelot. On the Construction of Automata from Linear Arithmetic Constraints. In TACAS, volume 1785 of LNCS, pages 1–19. Springer, 2000. 19

[WBCG00] P. Williams, A. Biere, E. Clarke, and A. Gupta. Combining Decision Diagrams and SAT Procedures for Efficient Symbolic Model Checking. In CAV, volume 1855 of LNCS, pages 124–138. Springer, 2000. 20, 25

[WHdM13] C. Wintersteiger, Y. Hamadi, and L. de Moura. Efficiently Solving Quantified Bit-Vector Formulas. Form. Method. Syst. Des., 42(1):3–23, 2013. 19, 30, 87, 89, 95

[WRHM06] M. W. Whalen, A. Rajan, M. P. E. Heimdahl, and S. P. Miller. Coverage Metrics for Requirements-Based Testing. In Proc. of ISSTA, pages 25–36, 2006. 23

[XMSN05] T. Xie, D. Marinov, W. Schulte, and D. Notkin. Symstra: A Framework for Generating Object-Oriented Unit Tests Using Symbolic Execution. In TACAS, volume 3440 of LNCS, pages 365–381. Springer, 2005. 21, 25, 60

[YWGI06] Z. Yang, C. Wang, A. Gupta, and F. Ivančić. Mixed Symbolic Representations for Model Checking Software Programs. In Proc. of MEMOCODE, pages 17–26, 2006. 25

[ZNMZ12] J. Zhao, S. Nagarakatte, M. Martin, and S. Zdancewic. Formalizing the LLVM Intermediate Representation for Verified Program Transformations. In Proc. of POPL, pages 427–440, 2012. 117
