Masaryk University, Faculty of Informatics
PhD Thesis
Automating Software Development with Explicit Model Checking
Author: Petr Bauch
Supervisor: Jiří Barnat
Brno, 2015
Acknowledgements
The idea to combine explicit and symbolic approaches to model checking was bestowed upon me by my thesis supervisor, Jiří Barnat, for which I will remain forever grateful to him. We have discussed the topic at length on numerous occasions, always arriving at a new inspiration for further progress. Each of my earlier doubts was shattered and every obstacle we found was overcome with his help. My previous supervisor, Luboš Brim, introduced the idea of analysing requirements without models, which gave me my first experience with autonomous scientific work: an invaluable experience. Yet an even greater influence on me was his course on Dijkstra's method. There lay the first spark, my falling for formal methods. I am equally grateful to all members of the ParaDiSe and Sybila laboratories, both present and past, for creating an atmosphere which was as friendly as it was inspiring. Finally, I would like to give credit to the students who helped me implement some of my ideas, Vojtěch Havel and Jan Tušil; since their coding skills surpass mine, I was able to learn from them as much as they learnt from me.
Summary
In the last decade it has become common practice to formalise software requirements in order to improve the clarity of users' expectations. This thesis builds on the fact that functional requirements can be expressed in temporal logic, and we propose new means of verifying both the requirements themselves and compliance with those requirements throughout the development process. The goal is the automation of some of the verification processes currently employed in software development. The tool is automata-based, explicit-state model checking. With respect to requirements, we propose new techniques that automatically detect flaws and suggest improvements of given requirements. Specifically, we describe and experimentally evaluate approaches to consistency and redundancy checking that identify all inconsistencies and pinpoint their exact source (the smallest inconsistent set). We further report on the experience obtained from employing the consistency and redundancy checking in an industrial environment. To complete the sanity checking we also describe a semi-automatic completeness evaluation that can assess the coverage of user requirements and suggest missing properties the user might have wanted to formulate. The usefulness of our completeness evaluation is demonstrated in a case study of an aeroplane control system. With respect to compliance with requirements, we propose extending explicit model checking to scale to more realistic programs. Automatic verification of programs and computer systems with data non-determinism (e.g. reading from user input) represents a significant and well-motivated challenge. The case of Simulink diagrams is especially difficult, because there the inputs are read iteratively and the number of input variables is in theory unbounded. The case of parallel programs is difficult in a different way, because there the control flow also non-trivially complicates the verification process.
We apply the techniques of explicit-state model checking to account for the control aspects of verified programs and use set-based reduction of the data flow, thus handling the two sources of non-determinism separately. We evaluate a number of set representations with respect to their scalability in the range of input variables. Explicit (enumerating) sets are very fast for small ranges but fail to scale. Symbolic sets, represented as first-order formulae in the bit-vector theory and compared using satisfiability modulo theories solvers, scale well to arbitrary (though still bounded) ranges of input variables. Binary decision diagrams dominate in both space and time complexities, but only if we limit the scope to linear arithmetic and a relatively small number of variables. We build the theory of set-based reduction using first-order formulae in the bit-vector theory to encode the sets of variable evaluations representing program data. These representations are tested for emptiness and equality (state matching) during the verification, and we harness modern satisfiability modulo theories solvers to implement these tests. We design two methods of implementing the state matching, one using quantifiers and one which is quantifier-free, and we provide both analytical and experimental comparison. Further experiments evaluate the efficiency of the set-based reduction, showing that the classical, explicit approach fails to scale with the size of data domains. Finally, we report on the development of a generic framework for automatic verification of linear temporal logic specifications for programs in LLVM bitcode. To evaluate the framework we compare our method with state-of-the-art tools on a set of unmodified C programs.
Abstract
This thesis builds on the fact that functional requirements can be expressed in temporal logic, and we propose new means of verifying both the requirements themselves and compliance with those requirements throughout the development process. The goal is the automation of some of the verification processes currently employed in software development. The tool is automata-based, explicit-state model checking. With respect to requirements, we propose new techniques that automatically detect flaws and suggest improvements of given requirements. Specifically, we describe and experimentally evaluate approaches to consistency and redundancy checking that identify all inconsistencies and pinpoint their exact source (the smallest inconsistent set). With respect to compliance with requirements, we propose extending explicit model checking to scale to more realistic programs. We apply the techniques of explicit-state model checking to account for the control aspects of verified programs and use set-based reduction of the data flow, thus handling the two sources of non-determinism separately. We build the theory of set-based reduction using first-order formulae in the bit-vector theory to encode the sets of variable evaluations representing program data. These representations are tested for emptiness and equality (state matching) during the verification, and we harness modern satisfiability modulo theories solvers to implement these tests. Our experiments evaluate the efficiency of the set-based reduction, showing that the classical, explicit approach fails to scale with the size of data domains.
Contents
1 Introduction
   1.1 Software Development Cycle
   1.2 Model Checking
   1.3 Sanity Checking
   1.4 Case Study: Simulink Diagrams
   1.5 Contribution
       1.5.1 Model-Free Sanity Checking
       1.5.2 Control Explicit—Data Symbolic Model Checking
       1.5.3 Verification Platform
       1.5.4 Thesis Content
       1.5.5 Author's Other Publications
2 Related Work
   2.1 Control Explicit—Data Symbolic Model Checking
       2.1.1 Symbolic Representations
       2.1.2 Bit-Precise LTL Model Checking of Parallel Programs with Inputs
       2.1.3 State Space Reduction by Control-Flow Aggregation
       2.1.4 Combinations of Explicit and Symbolic Approaches
   2.2 Model-Free Sanity Checking
   2.3 Verification of Simulink Diagrams
   2.4 LTL Verification of Parallel LLVM Programs
3 Preliminaries
   3.1 First-Order Theory of Bit-Vectors
   3.2 Control Flow Graphs
   3.3 LTL Model Checking
       3.3.1 Explicit-State Model Checking
   3.4 Sanity Checking
4 Checking Sanity of Software Requirements
   4.1 Consistency and Redundancy
       4.1.1 Implementation of Consistency Checking
   4.2 Completeness of Requirements
   4.3 Sanity Checking Experiments
       4.3.1 Generated Requirements
       4.3.2 Case Study: Aeroplane Control System
       4.3.3 Industrial Evaluation
   4.4 Closing Remarks
       4.4.1 Alternative LTL to Büchi Translation
5 Control Explicit—Data Symbolic Model Checking
   5.1 Set-Based Reduction
   5.2 Symbolic Representation
   5.3 Storing and Searching
       5.3.1 Linear Storage
       5.3.2 Explicating Variables
       5.3.3 Witness-Based Matching Resolution
   5.4 Model Checking Algorithm
       5.4.1 Counterexamples
       5.4.2 Deadlocks
       5.4.3 Witnesses
6 Verification of Simulink Designs
   6.1 Theory of Simulink Verification
       6.1.1 Verification with Explicit Sets
       6.1.2 SAT-Based Verification
       6.1.3 Hybrid Representation
   6.2 Experimental Evaluation
       6.2.1 Implementation
       6.2.2 Experiments
   6.3 Discussion
       6.3.1 Theoretical Limitations
       6.3.2 State Matching by Comparing Function Images
7 Analysis of Semi-Symbolic Representations
   7.1 Case Study: DVE Models
       7.1.1 DVE Language for Open Systems
       7.1.2 Set-Based Reduction with Explicit Sets
       7.1.3 Set-Based Reduction with Symbolic Sets
       7.1.4 Equivalence-Based versus Image-Based State Matching
   7.2 Case Study: Random Models
       7.2.1 Image versus Equivalence using Linear Search
       7.2.2 Experiments with Witness-Based Matching Resolution
8 Verification Platforms
   8.1 Input Languages
       8.1.1 LLVM
       8.1.2 NTS
   8.2 SymDivine
       8.2.1 Memory Structure
       8.2.2 Algorithms
       8.2.3 Containers
       8.2.4 Experimental Evaluation
   8.3 NTS-Platform
       8.3.1 Standard Partial Order Reduction
       8.3.2 Static Partial Order Reduction
       8.3.3 LLVM to NTS Translation
       8.3.4 Architecture of the NTS-Platform
   8.4 Future of Platforms
9 Conclusion
   9.1 Checking Requirements When It Matters
   9.2 Model Checking Needs Symbolic Data
Bibliography
List of Algorithms
1 Consistency Detection
2 Task Successors
3 Task Supersets
4 Consistency Check
5 Completeness Evaluation
6 Generic Accepting Cycle Detection
7 Database Searching
8 Counterexample Extraction
9 Database Searching with Witness-Based Matching Resolution
10 Witness-Based Matching Resolution
11 Extract Witness
12 Update Witnesses
13 Successor Generation for Simulink
14 SymDivine Reachability
15 SymDivine Nested-DFS
16 SymDivine Successors
List of Figures
1.1 Software Development Life Cycle
1.2 Simulink Diagram of the VoterCore System
1.3 Thesis Contributions
4.1 Consistency Checking Example
4.2 Consistency Checking Results
4.3 Redundancy Checking Results
4.4 Sanity Checking Complexity Progression
6.1 VoterCore Experiments: Purely Explicit vs Explicit Sets 1
6.2 VoterCore Experiments: Purely Explicit vs Explicit Sets 2
6.3 VoterCore Experiments: Purely Explicit vs Explicit Sets 3
6.4 VoterCore Experiments: Explicit Sets vs BV Formulae 1
6.5 VoterCore Experiments: Explicit Sets vs BV Formulae 2
6.6 VoterCore Experiments: Explicit Sets vs BV Formulae 3
6.7 VoterCore Experiments: Hybrid Representation
6.8 Time Progression for Simulink Verification
7.1 DVE Experiments: Purely Explicit vs Explicit Sets
7.2 DVE Experiments: Step-Wise Progression
7.3 DVE Experiments: Parallel Scalability
7.4 Random Model Experiments: Image vs Equiv
7.5 Witness-Based Resolution Experiments
8.1 LLVM Compilation Diagram
8.2 SymDivine Overview
8.3 SymDivine User Overview
List of Tables
4.1 Aeroplane Control System Results
4.2 Industrial Case Study Results
7.1 Work Distribution Protocol
7.2 Random Edge Generation
8.1 SymDivine vs CPAchecker
8.2 SymDivine vs LTL Model Checkers
List of Mathematics
3.1 Definition (Syntax of BV)
3.2 Example (BV Semantics)
3.3 Definition (Quantified BV)
3.4 Definition (Control Flow Graph)
3.5 Example (CFG for Parallel Program)
3.6 Definition (Parallel Composition)
3.7 Definition (Synchronous Composition)
3.8 Definition (Sequential Composition)
3.9 Example (CFG Composition)
3.10 Definition (Explicit CFG)
3.11 Definition (LTL Syntax)
3.12 Definition (LTL Semantics)
3.13 Example (Temporal Operators)
3.14 Example (Büchi Automaton)
3.15 Definition (CFG Acceptance)
3.16 Example (Specification and Program Product)
3.17 Example (Model-Based Redundancy and Coverage)
4.1 Definition (Consistency and Redundancy)
4.2 Example (Consistent and Redundant Requirements)
4.3 Definition (Path-Quantified Consistency and Redundancy)
4.4 Theorem (Locating Inconsistent Subsets)
4.5 Example (Redundancy Witnesses)
4.6 Example (Consistency Checking Algorithm)
4.7 Example (Smallest Inconsistent Subset)
4.8 Definition (Almost-Simple Path)
4.9 Definition (Path Coverage)
4.10 Example (Automata Coverage)
5.1 Definition (Set-Reduced CFG)
5.2 Example (Incorrectness of Subsumption)
5.3 Example (Set-Based Reduction)
5.4 Example (Symbolic Execution)
5.5 Theorem (Correctness of Set-Based Reduction)
5.6 Lemma (Symbolic Path Existence)
5.7 Lemma (Explicit Path Existence)
5.8 Lemma (Fix-Point for BV Relations)
5.9 Definition (Composition of BV Relations)
5.10 Lemma (Symbolic Cycle Existence)
5.11 Lemma (Explicit Cycle Existence)
5.12 Example (Set-Based Reduction Complexity)
5.13 Definition (Set-Symbolic CFG)
5.14 Definition (Image-Based Interpretation)
5.15 Theorem (Trace Equivalence for Image-Based Interpretation)
5.16 Definition (Image-Based State Matching)
5.17 Theorem (Image-Based Matching Correctness)
5.18 Definition (Equivalence-Based State Matching)
5.19 Theorem (Admissibility of Equivalence-Based State Matching)
5.20 Example (Iterative Input Reading)
5.21 Theorem (Image-Based Path Existence)
5.22 Lemma (Single Step Equality)
5.23 Example (Counterexample Extraction)
6.1 Example (CFG for Simulink)
6.2 Example (Simulink Semantics)
6.3 Theorem (Simulink Verification Correctness)
7.1 Example (DVE Model)
8.1 Example (LLVM IR)
8.2 Example (NTS for Producer/Consumer)
8.3 Definition (Transition Application)
8.4 Definition (Deterministic Relation)
8.5 Definition (Enabled Transition)
8.6 Definition (Independence Relation)
8.7 Definition (Invisible Transition)
8.8 Definition (Enabled Transition)
8.9 Definition (Transition Application)
8.10 Definition (Transition Independence)
8.11 Example (LLVM with Pthreads Translated to NTS)
Chapter 1
Introduction
The objective of this thesis is to aid the development of software by employing formal methods. The programs being developed may be executed concurrently, which gives them a non-deterministic semantics. The properties being required describe the evolution of the program's behaviour in time. Formal methods can aid such development by demonstrating that the required properties hold in every execution. The particular formal method for deciding that a program has given properties is called model checking.

Throughout this thesis we systematically extend the application of model checking to further stages of the software development cycle. All these applications share two aspects central to our work: verification of temporal properties rather than the simpler safety properties, and an orientation towards solving observed problems rather than theoretically interesting ones. Model checking is used to accelerate the building of confidence in the product of individual development stages. Ultimately, the motivation for introducing any change into the established development process is money. Formal methods should thus decrease the cost of development, by improving confidence in the correctness and by accelerating the development itself. A balance between the required time and the provided confidence is then steered either way, depending on how safety-critical the result must be. The application of model checking we have designed is intended to be unobtrusive: if the developers choose to apply our tools, the modifications to their routine should be minimal. Hence any correctness results provided by our tools should be sufficiently cheap for the development of all but the least critical software. The following text summarises our experience with building and using such tools. We describe the problems we solve and scrutinise the shortcomings of previous solutions. We establish the theoretical background, prove it correct, and position it with respect to the forefront of the field. We implement tools that not only enable solving the problems but also allow for comparison with state-of-the-art tools. And finally, we report the results, both positive and negative, of rigorously constructed experiments as well as the results of real-world applications.
1.1 Software Development Cycle
Without introducing the vast amount of theory established around software development, we will describe those stages of the development cycle that appear crucial to the resulting correctness. Much simplified for the context of this thesis, the software development goes through five stages, as depicted in Figure 1.1.

[Figure 1.1: The individual stages of the software development life cycle: Requirements, Design, Implementation, Validation, and Maintenance, arranged in a circle.]

The cycle is indeed a closed loop: new requirements may be produced during maintenance and the whole process is restarted. Each of the stages may also impose changes in any of the earlier stages, thus necessitating a reversal within the process. Reversal of the development process is always expensive, and eliminating or minimising the need to redo a particular development stage is most desirable. It follows that finding bugs in the code during the implementation stage is relatively affordable, whereas finding bugs in the requirements during the implementation stage can be substantially more costly. Each development stage produces results that later stages build upon. Hence the longer a bug remains undetected, the more results have to be reproduced and the more stages have to be reverted, all of which increases the cost of development.
Requirements Stage The requirements stage is concerned with the elicitation and analysis of the properties the software being developed is to have. The importance of a clear specification of the requirements in a contract-based development process is apparent from the necessity of final-product compliance verification. Stakeholders and potential customers are interviewed by requirements engineers during the elicitation to communicate their ideas regarding the desired product. These ideas are then detailed and concretised to form a collection of preliminary requirements. Analysing requirements at this stage is mainly concerned with checking that they truly represent what stakeholders want. This stage is particularly problematic with respect to any attempt at a rigorously undertaken development. The elicited requirements are vague, incomplete, and often contradictory as a collection. Formalising the requirements is only possible if the engineers have adequate skill in using advanced formalisms, and only useful if the stakeholders can be guided to understand these formalisms. Conversely, a rigorously formulated set of requirements is highly desirable, since all the future stages will define correctness with respect to this set. Restating requirements in a precise, formal way enables the requirements engineers to scrutinise their insight into the problem and allows for a considerably more thorough analysis of the final requirement documents [HJC+08]. To accurately delimit the focus of our work on requirements analysis, we will make several assumptions. We are only interested in functional requirements, specifically those specifiable in Linear Temporal Logic (LTL) [MP81]. This is a relatively strong assumption, since other requirements, such as performance requirements, are of great importance [Jai08], as are requirements that specify, for example, continuous-time behaviour [AILS07]. Another assumption is that the requirements are already formalised, i.e. the input of our tools are LTL formulae. This is a much weaker assumption, since automated tools exist that translate English requirements into LTL [NF96].
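One of the sanity checks this thesis develops for formalised requirements is consistency checking, which pinpoints a smallest inconsistent subset of the requirements. The idea can be sketched in Python with a deliberately naive, bounded-trace semantics; the single proposition p, the requirement names, and the brute-force subset enumeration are illustrative only, whereas the thesis works with Büchi automata and full LTL.

```python
from itertools import combinations, product

# Toy "requirements" as predicates over finite traces of one
# proposition p (one bool per step). Real consistency checking works
# on Büchi automata for full LTL; this bounded sketch only
# illustrates locating a smallest inconsistent subset.
def always_p(trace):          # G p
    return all(trace)

def eventually_not_p(trace):  # F !p
    return any(not s for s in trace)

def initially_p(trace):       # p holds at step 0
    return trace[0]

def consistent(reqs, length=4):
    # Consistent iff some trace satisfies every requirement at once.
    return any(all(r(t) for r in reqs)
               for t in product([True, False], repeat=length))

def smallest_inconsistent(reqs):
    # Enumerate subsets by increasing size; the first inconsistent
    # one pinpoints the source of the conflict.
    for k in range(1, len(reqs) + 1):
        for sub in combinations(reqs, k):
            if not consistent(list(sub)):
                return [r.__name__ for r in sub]
    return None

reqs = [always_p, eventually_not_p, initially_p]
print(smallest_inconsistent(reqs))  # ['always_p', 'eventually_not_p']
```

The conflicting pair G p and F !p is reported without blaming the innocent third requirement, which is the behaviour the thesis's consistency checking provides at the level of LTL.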
Design Stage The design phase is concerned with building a model of the system under development, one that complies with all the collected requirements. A great potential value lies in first building a model before implementing it in code. Modelling languages often allow a much more visual representation of the product, while building or modifying a model is much faster. More importantly, however, modelling allows abstracting details that are not relevant for investigating some aspects of the system. For example, the correct functioning of a data-transmission protocol may be independent of the content of the data being transmitted. Choosing which aspects can be abstracted is an error-prone process that has not been successfully automated yet. Once an appropriate abstraction is chosen and the design model is built, its compliance with the requirements should be checked. The greater the guarantees expected of this development stage, the higher the level of formalisation must be. This is not to say that testing a design without formalisation merits no recognition. As with testing in general, it is the fastest way to discover a certain portion of the bugs in both the design and the requirements. But no guarantees are provided as to the correctness of the system. Since models of computer systems are as expressive as their functional counterparts, i.e. programs, their verification is undecidable [Ric53]. In this most general setting, verification of a program means deciding if for every input it behaves exactly as its specification prescribes. There are restricted cases for which verification is decidable. Our work applies formal methods to models without continuous or infinite components, hence we allow neither modelling real time nor the use of mathematical numbers. Finally, the number of communicating subsystems must be fixed and known in advance.
Implementation Stage The implementation phase offers a large space for rigorous analysis of the final product. Developing safety-critical systems prescribes a number of guidelines to be followed to achieve higher confidence in the result. Following these guidelines, however, tends to alter the established process the programmers are used to. More automated methods, e.g. checking correct memory allocation, are slowly becoming a standard part of the development process. Writing assertions that check certain properties at compile time is already a non-trivial activity, but also one that is widely used. Yet, as with testing, these methods, while of considerable value, cannot guarantee correctness. Thus despite the existence of development processes that do guarantee correctness [Dij76], an implemented program which passes all tests may still be erroneous. Consequently, we focus our verification efforts on the following, validation phase.
Validation Stage The validation phase justifies its crucial position within the development cycle mainly by the lack of rigour during the implementation phase. At the beginning of validation, the engineers are equipped with three intermediate products: the requirements document, the design model, and the code. The task is to validate that the code complies with all requirements, i.e. that the code implements its specification: the model. Hence the objective is identical to that of the design stage verification. What differs is that the code is typically orders of magnitude larger: each abstraction introduced in the design is concretised in full in the implementation. Given that the objective is the same, one would be inclined to employ the same methods as for verifying the model. Yet, as we mentioned earlier, the code differs from the model mainly by the lack of abstraction. Any verification tool with the ambition to scale to real-world programs must employ some form of abstraction, i.e. automate the work of a design engineer. As we intend to demonstrate several times throughout this thesis, the method we propose, control explicit—data symbolic model checking, performs such automated abstraction. We apply it to both the design of a system and its executable implementation. Hence limitations similar to those restricting our application of formal methods in the design phase apply here as well. One that is perhaps missing in most design languages is the use of complex data structures. Since our methods require a finite representation of the system, we presently cannot handle dynamic data structures.
Maintenance Stage The maintenance phase completes the development cycle by allowing the introduction of new requirements, modifying old requirements, and by ensuring that adequate changes are performed so that conformance to old requirements is preserved. Hence it would clearly be sufficient to postpone all validation efforts to the remaining phases that follow maintenance. On the other hand, assuming that the system was correct prior to maintenance, an incremental verification of only the modified part would be more cost effective. Alas, such incremental verification lies outside the scope of this work.
1.2 Model Checking
Verification of correctness of computer programs is decidable only in certain restricted cases, when both the program under verification and the verified property satisfy some well-defined criteria. Thus from the theoretical point of view, even deciding termination is impossible for general programs, mathematically represented for example by Turing machines. On the other hand, execution of many real-world programs entails only finitely many states of computation. Even parallel programs with a bounded number of threads that do not use dynamic data structures generate only finite state spaces. It follows that verification of many interesting properties is feasible and the only limitation is the efficiency of handling the size of the state spaces. In the case of verifying parallel programs that read from user input against properties specified in LTL, the state space explosion stems from three sources. (1) The verification complexity is exponential in the size of the formula being verified. In this work we assume sufficiently small formulae, so that their impact on the complexity is negligible, as is often the case in practice. (2) The program under verification is commonly very large, especially considering that the number of possible thread interleavings is exponential in the number of parallel threads. (3) The number of possible evaluations of internal program variables is much larger still, given that the introduction of every input variable multiplies this number by 2^n, where n is the bit width of the variable and may thus be as large as 64 or more.

This thesis is partly concerned with the above-described problem of verifying parallel programs with inputs against LTL properties. We argue why it is reasonable to separate the control (specification formula and thread interleaving) from the data (variable evaluation) as two distinct sources of non-determinism that should be handled by different means. The means used here are parallel explicit-state model checking (Büchi automata-based) and satisfiability (SAT)-based symbolic execution. The proposed verification technique introduces a symbolic representation of data into otherwise unmodified LTL model checking or, equivalently, it promotes symbolic execution to LTL verification by implementing exact state matching. More precisely, the proposed verification explicitly enumerates all reachable control configurations (combinations of program counters and specification automaton states), while the set of variable evaluations associated with a control state is represented symbolically, as a formula in the first-order theory of bit-vectors (BV). Hence, instead of storing many copies of the same control state that differ in the variable evaluation, our verification represents the states with exactly the same control part and exactly the same set of evaluations as a single multi-state. This set-based state space reduction leads in some cases to exponentially smaller state spaces; we prove the correctness of the reduction and discuss its efficiency.
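The relative weight of sources (2) and (3) can be put into numbers with a small back-of-the-envelope calculation; the thread and variable counts below are illustrative, not taken from the thesis experiments.

```python
from math import factorial, prod

def interleavings(steps_per_thread):
    # Number of interleavings of independent thread steps: the
    # multinomial coefficient (n1 + n2 + ...)! / (n1! * n2! * ...).
    total = factorial(sum(steps_per_thread))
    return total // prod(factorial(n) for n in steps_per_thread)

def input_valuations(num_vars, bits):
    # Every n-bit input variable multiplies the data state count
    # by 2**bits.
    return (2 ** bits) ** num_vars

print(interleavings([5, 5]))    # 252 interleavings for two 5-step threads
print(input_valuations(3, 32))  # 2**96 valuations for three 32-bit inputs
```

Even three 32-bit inputs dwarf the control explosion of two short threads, which is the motivation for treating data non-determinism symbolically while keeping the control explicit.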
Also, unlike in classical symbolic model checking, where states are characteristic functions of sets of configurations (one for both control and data), we represent data as images of BV functions. Representing sets of variable evaluations as function images has the potential to be more concise than a canonical representation of the characteristic function, e.g. Binary Decision Diagrams (BDDs), which are exponential in the number of variables for some arithmetic functions [Bry91]. Furthermore, evaluating the transition relation is syntactic, a simple substitution, which allows very fast generation of state successors. On the other hand, one of our contributions is the identification of a major limitation entailed by using function images for model checking: deciding state matching (equivalence of two sets) is very expensive. In the context of model checking, it is crucial to be able to decide when two objects in the state space represent the same state. State matching is used in accepting cycle detection and for deciding termination; both these operations are necessary for the correct functioning of the verification process. In order to utilise the advances of modern SAT and Satisfiability Modulo Theories (SMT) solvers, we have devised an SMT query for state matching: a BV formula satisfiable if and only if two given functions have different images. Yet such a query requires quantifying over the input variables, and thus deciding its satisfiability is, in theory, exponentially harder than quantifier-free satisfiability. Our experiments with randomly generated and DVE (the native language of DIVINE [BBH+13]) models demonstrate that exact state matching limits our approach to rather small models of programs. In an attempt to ameliorate the complexity of state matching, we follow two directions: using quantifier-free state matching at the price of precision, and modifying the way the model checker searches the database of already visited states.
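The multi-state idea can be sketched with plain Python sets standing in for the BV formulae of the thesis, using a hypothetical two-step program: read an 8-bit input, then mask its low nibble.

```python
# Explicit state space: one state per (program counter, valuation).
explicit1 = {(1, x) for x in range(256)}          # pc 1: after reading input
explicit2 = {(2, x & 0x0F) for x in range(256)}   # pc 2: after masking

# Set-based reduction: one multi-state per control location, holding
# the whole set of reachable valuations (symbolic in the thesis,
# an enumerated set in this toy sketch).
multi = {
    1: set(range(256)),
    2: {x & 0x0F for x in range(256)},
}

print(len(explicit1) + len(explicit2), len(multi))  # 272 2
```

272 explicit states collapse into 2 multi-states; with wider inputs the explicit count grows as 2^n while the number of multi-states stays fixed, which is the exponential reduction claimed above.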
Consider first the quantifier-free state matching, implemented by comparing the observable behaviour of the two functions; let us call two states different if and only if their functions map some input to two distinct outputs. We prove that this equivalence-based state matching, while not precisely comparing two sets of evaluations, leads to a correct model-checking process which terminates provided that input variables are not read on a cycle. The second improvement decreases the number of SAT calls related to the insertion of a new state into the model checker database. The number of symbolic data parts associated with a single explicit control part can be decreased by explicating some variables, which can be efficient if there is exactly one concrete value in the image. Finally, some states can be distinguished by reusing the satisfying models gained from previous SAT calls (evaluations of input variables that witness the difference between states). We have built a hierarchy atop these witness models which in some cases allows simulating a binary search among database states.
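The gap between the two notions of state matching can be demonstrated by brute force over a tiny domain; in the thesis both checks are SMT queries over BV formulae, while the functions f and g below are illustrative stand-ins.

```python
# Two "BV functions" mapping input valuations to data values over a
# tiny 2-bit domain. They have the same image but differ pointwise.
DOMAIN = range(4)

def f(i):
    return i              # identity

def g(i):
    return (i + 1) % 4    # rotation: same image, different mapping

def image_equal(f, g):
    # Image-based matching: do the functions reach the same set of
    # valuations? (The SMT encoding needs quantifiers.)
    return {f(i) for i in DOMAIN} == {g(i) for i in DOMAIN}

def pointwise_equal(f, g):
    # Equivalence-based matching: quantifier-free check that no
    # single input witnesses a difference in outputs.
    return all(f(i) == g(i) for i in DOMAIN)

print(image_equal(f, g), pointwise_equal(f, g))  # True False
```

Image-based matching would merge the two states (correctly, since they represent the same set of evaluations), whereas the cheaper equivalence-based matching keeps them apart; this is precisely the precision traded away for a quantifier-free query.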
The proposed set-based state space reduction and the associated control explicit—data symbolic model checking are experimentally evaluated, together with the above described improvements, on DVE models and randomly generated test instances. The experiments compare data representations using explicit sets with those using BV formulae, precise image-based state matching with equivalence-based state matching, and also evaluate the influence of explicating variables and the witness model hierarchy.
There are a number of interesting observations to be highlighted. For example, there is a threshold on the size of the input variables' domains below which the explicit representation outperforms the symbolic one, due to the high complexity of exact state matching. On the other hand, verification with the symbolic representation scales gracefully with the increase of data domains (32-bit domains caused less than a two-fold slowdown compared to 4-bit domains). Regarding the two state matching techniques, the image-based comparison is two orders of magnitude slower than the quantifier-free equivalence-based comparison in a vast majority of cases. Yet our experiments with DVE models led to relatively short quantifier-free SMT queries that required up to 88 seconds to be resolved by state-of-the-art SMT solvers. Finally, the witness-based matching reduces the number of SMT solver calls exponentially in some cases, yet all unsatisfiable instances remain (these are more expensive than the satisfiable ones) and thus the resulting impact is only a linear, 7-fold speedup.
As a final introductory note and to motivate the state space reductions proposed in this thesis, let us consider model checking of a correct system. To demonstrate correctness, the verification process has to refute the existence of accepting cycles, which, in the explicit-state case, entails enumerating the whole transition system. As we have mentioned earlier, the number of successors of each state stems from the combination of three choices the system as a whole can make. First, deciding whether the specification holds requires distinguishing which of the possible valid behaviours we are presently investigating. Second, each parallel process can be the first to execute and thus at least one new successor is generated for each parallel process. And third, every value of every variable leads to at least one successor.
The number of possible valid behaviours is often reasonably small since it depends on the complexity of the property under verification. Equally, the number of processes in real programs and their models can be safely assumed to be small. This observation holds regardless of the size of individual processes, i.e. only the number of processes increases the number of successors. On the other hand, the number of evaluations of many variables in a given program state can be equal to the domain of that variable, often exceeding 2^32. This observation lies at the core of the thesis mentioned earlier: that the two sources of non-determinism in parallel programs are of different nature and should thus be handled by different means.
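The asymmetry between the three sources of branching can be made concrete with a back-of-the-envelope count. The concrete numbers below are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope successor counts for a single state, following
# the three sources of branching named above.  Values are illustrative.
behaviours = 3        # runs of the specification automaton
processes  = 4        # parallel processes that may move next
inputs_32  = 2 ** 32  # evaluations of a single 32-bit input variable

explicit  = behaviours * processes * inputs_32   # purely explicit
set_based = behaviours * processes * 1           # data kept as one set

assert explicit == 12 * 2 ** 32   # dominated entirely by the data
assert set_based == 12            # only control-flow branching remains
```

Even a single 32-bit input dwarfs the control-flow branching by nine orders of magnitude, which is why the data non-determinism is the part that must be treated symbolically.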
1.3 Sanity Checking
Later in the development, when the requirements are given and a model is designed, formal verification tools can provide a proof of correctness of the system being developed with respect to the formally written requirements. The model of the system, or even the system itself, can be checked using model checking [BBČR10, CCG+02] or theorem proving [BFG+01] tools. If there are some requirements the system does not meet, the cause has to be found and the development reverted. The longer it takes to discover an error in the development, the more expensive the error is to mend. Consequently, errors made during requirements specification are among the most expensive ones in the whole development. Model checking is particularly efficient in finding bugs in the design; however, it exhibits some shortcomings when applied to requirements analysis. In particular, if the system satisfies the formula, then the model checking procedure only provides an affirmative answer and does not elaborate on the reason for the satisfaction. It could be the case that, e.g., the specified formula is a tautology and hence is satisfied by any system. To mitigate this situation a subsidiary approach, sanity checking, was proposed to check redundancy and coverage of requirements [Kup06]. Redundant requirements are such that they are trivially satisfied in the given model. Coverage of requirements measures the portion of the behaviours of the model that are described by the requirements. It follows that the existence of a model is still a prerequisite, which postpones the verification until later phases of the development cycle. Our primary interest with respect to requirements analysis is to extend the techniques that allow the developers of computer systems to check the sanity of their requirements when it matters most, i.e. during the requirements stage and thus before any model is available.
Given that previous definitions of sanity presupposed the existence of a model, we lift three properties of sane requirements to a pre-design phase of development. A consistent set of requirements is one that can be implemented in a single system. The satisfaction of a redundant requirement is inevitable in any system that satisfies the rest of the requirements. Finally, a complete set of requirements extends to every possible behaviour that can be considered sensible in an environment. All three notions will be defined properly in Section 4. We redefine the notion of sanity checking of requirements written as LTL formulae and describe its implementation and evaluation. The novelty of our approach lies in the idea of liberating sanity checking from the necessity of having a model of the system under development. Our approach to consistency and redundancy checking identifies all inconsistent (or redundant) subsets of the input set of requirements. This considerably simplifies the work of requirements engineers because it pinpoints all the sources of inconsistencies. As for completeness checking, we propose a new behaviour-based coverage metric. Assuming that the user specifies what behaviour of the system is sensible, our coverage metric calculates what portion of this behaviour is described by the requirements specifying the system itself. The method further suggests new requirements to the user that improve the coverage and thus ensure a more accurate description of users' expectations.
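The idea of identifying all minimal inconsistent subsets can be illustrated with a propositional stand-in for the LTL setting. The sketch below treats each requirement as a Boolean function over atomic propositions and enumerates subsets by increasing size, so every reported subset is a minimal core; the actual method works on LTL formulae via automata-based satisfiability, and the requirements r0, r1, r2 are made up for the example.

```python
from itertools import combinations, product

# Propositional analogue of consistency checking: a set of
# requirements is consistent iff some assignment satisfies them all.
ATOMS = ("p", "q")

def consistent(reqs):
    return any(all(r(*vals) for r in reqs)
               for vals in product((False, True), repeat=len(ATOMS)))

def minimal_inconsistent(reqs):
    """All inconsistent subsets none of whose proper subsets are
    inconsistent -- these pinpoint the sources of inconsistency."""
    found = []
    for k in range(1, len(reqs) + 1):
        for subset in combinations(range(len(reqs)), k):
            if any(set(core) <= set(subset) for core in found):
                continue  # already explained by a smaller core
            if not consistent([reqs[i] for i in subset]):
                found.append(subset)
    return found

r0 = lambda p, q: p          # "p holds"
r1 = lambda p, q: not p      # "p fails"  (clashes with r0)
r2 = lambda p, q: q          # "q holds"  (harmless)

assert minimal_inconsistent([r0, r1, r2]) == [(0, 1)]
```

Reporting the core {r0, r1} rather than merely "the set is inconsistent" is what lets a requirements engineer see exactly which requirements to revisit.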
1.4 Case Study: Simulink Diagrams
Consistent with the primary motivation behind this thesis, i.e. automating the verification tasks included in software development, we now demonstrate our verification method for post-requirement phases on Simulink diagrams.

Figure 1.2: A partial Simulink diagram for the VoterCore system: the complete diagram was verified during the experimental evaluation (see Section 6). Delay nodes form the state space, inport nodes generate new evaluations with every tick, and outport nodes can be referred to in the specification requirements.

To restate the problem we solve more accurately: the task is to verify that a program (or a model thereof) satisfies its specification, given as an LTL formula. The method is sufficiently general to allow verification of both modelling and programming languages, as we later demonstrate by verifying LLVM bitcode. Yet our research was guided by the concrete application to Simulink, and we choose it for the demonstration because of its widespread use in industry. In the following, we assume no previous knowledge of the Simulink modelling language and in fact resort to a different notation than the one prescribed by the language standard. Let a Simulink diagram be an oriented graph which can be interpreted as an abstract syntax tree, except that it may contain self-referencing cycles; for an example see Figure 1.2. The semantics of such a diagram is that of a one-register machine. The delay nodes are the registers of the machine, and the computation evaluates the expressions forming the subtrees of delay and outport nodes. The computation is executed synchronously with each tick of the global clock. Each tick thus differs from any other only in the evaluation of delay nodes and the values read from inports. The expression associated with a node is described by the tree underneath it; e.g. in our diagram the next value of the delay node is computed as the multiplication of the two subtrees of the node labelled by ∗ to the left of the delay node. Translated into a full equation, the value of the delay in the next tick is:
delay = min(constant, inport1 + delay) ∗ inport2

Other than the arithmetic and logical operators, there are 4 types of nodes: delays, inports, outports, and constants. Since the inport nodes serve as inputs, providing a new, non-deterministic value after every tick, a Simulink diagram generates a transition system of the possible values stored in delays. Such a transition system represents the possible evolution of the diagram, describing its behaviour as it changes in time. As with parallel and reactive programs, a developer may need to verify that the behaviour of a system complies with the specification. Consider for example a specification requiring that: whenever inport1 is negative in three consecutive ticks, the value of outport will eventually be positive. Such a specification cannot be expressed in terms of safety, i.e. as reachability of bad states, because if the system were incorrect the violating run would be infinite. Consequently, verification procedures that do not allow temporal specification, such as most implementations of symbolic execution [Kin76], static analysis [CC77] or deductive verification [OG76], cannot be used in this case.
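The one-register-machine semantics can be sketched directly in Python: the delay node is the single register, and each tick reads fresh inport values. The equation is taken from the text above; the concrete constant and the tiny input range are illustrative assumptions, not values from the verified model.

```python
from itertools import product

# One-register-machine reading of the partial diagram above.
CONSTANT = 5
INPUT_RANGE = range(-1, 2)   # tiny stand-in for a bit-vector domain

def step(delay, inport1, inport2):
    """Next register value: delay = min(constant, inport1 + delay) * inport2"""
    return min(CONSTANT, inport1 + delay) * inport2

def successors(delay):
    """All next register values, one per pair of input valuations --
    the per-tick input branching that makes the explicit state space
    explode as the input domains grow."""
    return {step(delay, i1, i2)
            for i1, i2 in product(INPUT_RANGE, INPUT_RANGE)}

assert step(0, 1, 2) == 2      # min(5, 1 + 0) * 2
assert 0 in successors(0)      # inport2 = 0 always yields 0
```

With realistic 32-bit inports, `successors` would have to range over 2^64 input pairs per tick, which is precisely the motivation for the symbolic treatment of data that follows.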
Model checking, initiated by [CES86, VW86], allows verification against temporal properties, yet Simulink diagrams and similar systems with input variables pose another obstacle in the form of data-flow non-determinism. In order to answer temporal verification queries automatically, an algorithm must allow evaluation of the Peano arithmetic expressions that form subtrees in the diagram. At the same time the input variables may be evaluated to an arbitrary number (which in practice is bounded, e.g. by the number of bits used to express the value) and each of these evaluations must be considered for the verification to be exhaustive. Standard symbolic model checking approaches are either restricted to only a subset of Peano arithmetic, see [BGP97], or are concerned with Boolean input variables, as demonstrated by [McM92]. Standard explicit approaches to model checking already suffer from the state space explosion caused by the control-flow non-determinism and thus cannot scale to any considerable degree with the allowed ranges of input variables. Hence the motivation for LTL verification of Simulink diagrams: it has the potential to replace testing in the design phase of software (or systems) development. In this sense, purely explicit representations succeed only in part. In a tool-chain, preceded by a tool to formalise pattern-based, human-language requirements into LTL formulae, a model checker could provide similar results as testing at considerably smaller expense. Yet the severely limited domains of input variables that standard model checking allows still enforce a test-like approach to correctness validation. Effectively, the developer still has to mechanically test (a smaller number of) instances of the same validation query with different (intervals of) input values. Our work ameliorates the limitations of previous approaches by proposing a model checking technique that scales to arbitrarily large domains of input variables.
Through the reduction of certain crucial operations required by the model checking process to SMT queries, we have achieved a method for complete verification (for large but finite domains). Though the experiments were conducted on a relatively small Simulink model, they still convey the most important practical result: appropriately adapted model checking may be used to replace testing of safety-critical units as early as in the modelling phase of software development.
1.5 Contribution
There are three pivotal contributions that deserve pointing out. In this chapter we only give a brief summary of the most important results and refer the reader to the relevant chapters that describe each contribution in more detail. First, we have developed model-free sanity checking as a way of verifying requirements before a model of the system is designed. We build the theory, describe the implementation and evaluate the experimental results in Section 4. Second, we have developed the control explicit—data symbolic model checking as a method for verifying parallel programs that non-trivially manipulate data. The theory of this method is established in Section 5 and the properties of a number of symbolic data representations are investigated in Section 7. Our primary application of control explicit—data symbolic model checking was the verification of Simulink diagrams, detailed in Section 6. And finally, our last contribution lies in building verification platforms for analysing real code via the LLVM intermediate language. We demonstrate how other researchers can build their own tools using our platforms by implementing the control explicit—data symbolic model checking and a static partial order reduction over LLVM in Section 8.
1.5.1 Model-Free Sanity Checking
We redefine the notion of sanity checking of requirements written as LTL formulae and describe its implementation and evaluation. The proposed notion liberates sanity checking from the necessity of having a model of the developed system. Sanity checking commonly consists of three parts: consistency checking, redundancy checking, and completeness of requirements. Our approach to consistency and redundancy checking is novel in identifying all inconsistent (or redundant) subsets of the input set of requirements. This considerably simplifies the work of requirements engineers because it pinpoints all sources of inconsistencies. For completeness checking, we propose a new behaviour-based coverage metric. Assuming that the user specifies what behaviour of the system is sensible, our coverage metric calculates what portion of this behaviour is described by the requirements specifying the system itself. The method further suggests new requirements to the user that would improve the coverage and thus ensure a more accurate description of users' expectations. The efficiency and usability of our approach to sanity checking is verified in an experimental evaluation and a case study. Modifying the work flow so that requirements engineers can explicitly specify the universal (∀) or existential (∃) nature of individual requirements allows some system properties to be expressed more naturally. We also evaluate the robustness and sensitivity of our consistency and completeness checking procedures with respect to the use of different LTL to BA translators. Furthermore, we adopt redundancy checking for individual LTL formulae into the framework and add proper references to relevant previous work. Finally, we report on the experience we gathered in cooperation with Honeywell International when applying our methodology to some real-life industrial use cases.
1.5.2 Control Explicit—Data Symbolic Model Checking
Let us summarise our work on bit-precise LTL model checking of parallel programs with inputs. First, we establish the theory of set-based verification, on which all presented applications build. The list of applications contains Simulink diagrams, DVE models, and C/C++ parallel programs. Hence our verification method is scrutinised both theoretically, by investigating its properties on hand-crafted models, and practically, by undertaking experiments with real-world systems. The Simulink diagram verification also includes the methodology of translating state matching into quantified SMT queries. There, we have first formalised the process of model checking of Simulink diagrams, building on the concept of set-based reduction of the underlying state space. This reduction can significantly decrease the number of states, but does not lead to any reduction in the worst case. Also the representation used for the sets of variable evaluations determines the resulting complexity, which can differ exponentially for particular representations. We have implemented the set-based reduction using explicit sets as one representation and bit-vector formulae as another. Explicit sets are exponential in both the number of variables represented and in their ranges, but allow a simple implementation of state matching, i.e. deciding if two states represent the same evaluations. The representation using formulae is exponentially more succinct, but state matching becomes a complex operation that requires deciding satisfiability of quantified bit-vector formulae. The results of Simulink verification show that (1) in the case of Simulink diagrams the set-based reduction is efficient, leading to exponentially smaller state spaces; (2) the representation using explicit sets allows fast verification but only for small ranges of variables; and
(3) the representation using bit-vector formulae scales to arbitrary (bounded) ranges, though the state matching is resource expensive. In terms of practical results, the proposed SAT-based verification is fully automated. Furthermore, a hybrid representation is presented which combines the explicit and SAT-based representations, making it possible to exploit hidden limits on input values when these are present, while also coping with systems whose inputs are not restricted in any way. Further experiments concentrate on the comparison between image-based and equivalence-based state matching and on the value-explication heuristic. We provide detailed proofs of correctness of set-based verification, of the correctness of both state matching techniques, and of the relation between them. The witness-based matching resolution heuristic is evaluated on randomly generated models, which are also used to gain deeper insight into the relation between the two state matching techniques. To provide a complete view to the reader, Section 5 applies all the accumulated theory in a comprehensive description of the LTL model checking built on the set-based reduction. The novelty of the verification methodology proposed here lies in the combination of explicit-state model checking with sets of variable evaluations represented as images of bit-vector functions. We build the theory of set-based reduction using the above combination for verification of systems with asynchronous control flow and regular, arithmetical data flow, e.g. for parallel programs with input variables. The result of incorporating the set-based reduction into the existing explicit-state model checker DIVINE effectively allows verification of programs reading from input, which was previously impossible in all non-trivial cases.
Translating these results to software development, we have extended the limits of previous complete, bit-precise techniques to the point where the system designer no longer has to bound the domains of input variables. Instead of iteratively testing the model on domains of sizes smaller than 100, the set-based reduction and symbolic representation allow verifying the whole domain 0–2^32 in a single verification run.
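The essence of the set-based reduction can be sketched on a toy program: a state pairs a control location with a set of variable evaluations, so the many explicit states that differ only in data collapse into one state per control location. Everything below (the program, the domain size) is an illustrative assumption; the actual tool stores the data part as the image of a bit-vector function rather than an enumerated set.

```python
# Sketch of the set-based reduction on a toy program:
#   x = input();  loc = A if x < 2 else B
DOMAIN = range(16)

def explicit_states():
    """Explicit-state exploration: one state per input value."""
    return {("A" if x < 2 else "B", x) for x in DOMAIN}

def set_based_states():
    """Set-based reduction: one state per control location, with the
    data part being the whole *set* of matching valuations."""
    by_loc = {}
    for x in DOMAIN:
        by_loc.setdefault("A" if x < 2 else "B", set()).add(x)
    return {(loc, frozenset(vals)) for loc, vals in by_loc.items()}

assert len(explicit_states()) == 16   # grows with the data domain
assert len(set_based_states()) == 2   # bounded by control locations
```

The reduction is what keeps the number of states proportional to the control flow, so growing the domain from 16 to 2^32 changes only the size of the data sets, not the state count.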
1.5.3 Verification Platform

SymDivine provides the verification community with a generic verification platform: the user is given a function that returns an initial state and a function that generates successors of a given state. The tool allows the separation of the representation of control (stored explicitly as the unique sequence of program locations) and data (stored symbolically as a set of variable evaluations). To the best of our knowledge, SymDivine is the only tool that allows verification of LTL properties of parallel LLVM programs with data non-determinism. The genericity of the framework is two-fold.
• As is common in explicit-state model checkers, a number of verification algorithms can be used, provided the algorithm can be implemented using the initial state and successor generator functions. We demonstrate this contribution on two algorithms: reachability for safety verification and Nested Depth-First Search (DFS) for LTL verification.
• The framework provides an interface for the symbolic data representation, with a few methods that must be implemented, leaving the choice of how data are stored completely to the user. Depending on whether the representation implements hashing (or ordering), a more efficient state space search algorithm is automatically used. We demonstrate this contribution on four representations of sets: explicitly enumerating all set members,
Binary Decision Diagram-based representation, Satisfiability Modulo Theories-based representation, and an empty representation which completely ignores data.

Figure 1.3: Software development stages matched against the individual contributions of this thesis.
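A minimal sketch of an algorithm written against the generic interface described above is a nested depth-first search for accepting cycles; only an initial state, a successor function, and the set of accepting states are needed. This version is simplified (it launches the inner search eagerly rather than in postorder), and the toy product automaton is an assumption made up for the example.

```python
# Toy product automaton: state 2 is accepting and lies on the cycle
# 1 -> 2 -> 1, so an accepting cycle is reachable from state 0.
SUCC = {0: [1], 1: [2], 2: [1, 3], 3: []}
ACCEPTING = {2}

def nested_dfs(init):
    """Return True iff an accepting cycle is reachable from init."""
    def inner(seed):                 # search for a cycle through seed
        stack, seen = [seed], set()
        while stack:
            s = stack.pop()
            for t in SUCC[s]:
                if t == seed:
                    return True      # closed a cycle through seed
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return False

    stack, seen = [init], {init}
    while stack:                     # outer DFS over all states
        s = stack.pop()
        if s in ACCEPTING and inner(s):
            return True
        for t in SUCC[s]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return False

assert nested_dfs(0)       # the accepting cycle is found
assert not nested_dfs(3)   # nothing accepting is reachable from 3
```

Note that both the `seen` sets rely on deciding state identity, which is exactly where the state matching discussed earlier enters the picture once states carry symbolic data.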
Finally, we report the results of comparing the control explicit—data symbolic (CEDS) approach to LTL model checking against the currently best other approaches. The NTS-platform further expands our attempt at supplying the community with an accessible way to test their verification methods on real programs. The platform provides a C++ library to represent and manipulate transition systems, and a translator from LLVM IR code and from the NTS modelling language. Furthermore, we have implemented methods to inline function calls and sequentialise parallel programs. To demonstrate the applicability of our platform, we have also implemented a static partial order reduction as a transformation of the internal representation. The result is a reduced sequential control-flow graph with behaviour equivalent to its parallel LLVM counterpart.
1.5.4 Thesis Content

Figure 1.3 matches the following sections of this thesis with the publications (outside the circle) and tools (inside the circle) of the author. The contribution of the author with respect to each publication can be approximated as 100/n %, where n is the number of authors. The papers presented at HASE'14 and MEMICS'14 were awarded the Best Paper award. Here we only list the publications directly related to the content of this thesis. The remaining publications of the author are listed in the next subsection. The tools used in our experiments are freely available at the author's web page: http://anna.fi.muni.cz/~xbauch/code.html.
Accepted Conference Papers
MEMICS’14 P. Bauch, V. Havel, J. Barnat. LTL Model Checking of LLVM Bitcode with Symbolic Data. In Proceedings of the 9th Doctoral Workshop on Mathematical and Engineering Methods in Computer Science, volume 8934 of Lecture Notes in Computer Science, pages 47–59. 2014.
PDP’14 J. Barnat, P. Bauch, V. Havel. Model Checking Parallel Programs with Inputs. In Proceedings of the 22nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing, pages 756–759. 2014.
HASE’14 J. Barnat, P. Bauch, V. Havel. Temporal Verification of Simulink Diagrams. In Proceedings of the 15th IEEE International Symposium on High Assurance Systems Engineering, pages 81–88. 2014.
SEFM’12 J. Barnat, P. Bauch, L. Brim. Checking Sanity of Software Requirements. In Proceedings of the 10th International Conference on Software Engineering and Formal Methods, volume 7504 of Lecture Notes in Computer Science, pages 48–62. 2012.

Accepted Journal Papers
SQJ’14 P. Bauch, V. Havel, J. Barnat. Accelerating Temporal Verification of Simulink Diagrams Using Satisfiability Modulo Theories. In Software Quality Journal, volume 22, pages 1–27, 2014.

Submitted Journal Papers
FAOC J. Barnat, P. Bauch, N. Beneš, L. Brim, J. Beran, T. Kratochvíla. Analysing Sanity of Requirements for Avionics Systems. In the submission process (after first review) to Formal Aspects of Computing.
TOSEM P. Bauch, V. Havel, J. Barnat. Control Explicit—Data Symbolic Model Checking. In the submission process (after first review) to Transactions on Software Engineering and Methodology.

Tools
DIVINE-San Built on top of DIVINE version 2.5.2, DIVINE-San checks that a set of requirements possesses the three properties of sanity: consistency, non-redundancy, and completeness.
looney Accelerates the computation of consistency and non-redundancy while adding the support for redundancy witnesses.
DIVINE-Sym Built on top of DIVINE version 2.94, DIVINE-Sym verifies models written in the DVE modelling language extended with the possibility to specify some variables as input: their initial values are not defined explicitly but only as a range of possible evaluations.
DIVINE-Set Built on top of DIVINE version 2.5.2, DIVINE-Set verifies Simulink diagrams translated into the CESMI format. The tool supports both the explicit and the symbolic representation of the sets of variable evaluations.
DIVINE-Eq Built on top of DIVINE version 2.96, DIVINE-Eq verifies models written in the DVE modelling language, where the evaluation of program variables is represented via bit-vector functions while the control locations are stored explicitly.
SymDivine A standalone framework for verification of LLVM programs. The tool accepts as input the LLVM code translated from a parallel C++ program that reads from user input. SymDivine houses a number of verification algorithms as well as providing a tool set for writing new algorithms. It implements the control explicit—data symbolic model checking using explicit sets, binary decision diagrams, BV formulae, and other representations. (The author would like to acknowledge that a major part of the implementation is to be accredited to Vojtěch Havel.)
NTS-platform A standalone framework for verification of NTS models. The framework consists of a translator from LLVM to NTS, a parser for NTS, a common internal representation for building new tools, and a partial order reduction for NTS models. (The author would like to acknowledge that a major part of the implementation is to be accredited to Jan Tušil.)
1.5.5 Author’s Other Publications

The following are publications the author contributed to during his Master’s studies. Their primary topic is data-parallel graph algorithms with an application in explicit-state model checking. Thus they share the theoretical goals of this thesis but not the practical methods to achieve those goals. We list them only for the reader to have a complete picture.

Accepted Conference Papers
PDMC’11 J. Barnat, P. Bauch, L. Brim, M. Češka. Computing Optimal Cycle Mean in Parallel on CUDA. In Proceedings of the 10th International Workshop on Parallel and Distributed Methods in verifiCation, volume 72 of Electronic Proceedings in Theoretical Computer Science, pages 68–83. 2011.
IPDPS’11 J. Barnat, P. Bauch, L. Brim, M. Češka. Computing Strongly Connected Components in Parallel on CUDA. In Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium, pages 544–555, 2011.
ICPADS’10 J. Barnat, P. Bauch, L. Brim, M. Češka. Employing Multiple CUDA Devices to Accelerate LTL Model Checking. In Proceedings of the 16th International Conference on Parallel and Distributed Systems, pages 259–266, 2010.
MEMICS’10 P. Bauch, M. Češka. CUDA Accelerated LTL Model Checking – Revisited. In Proceedings of the 6th Doctoral Workshop on Mathematical and Engineering Methods in Computer Science, pages 11–19, 2010.
Accepted Journal Papers
JPDC’12 J. Barnat, P. Bauch, L. Brim, M. Češka. Designing Fast LTL Model Checking Algorithms for Many-Core GPUs. In Journal of Parallel and Distributed Computing, 72(9):1083–1097, 2012.
Chapter 2
Related Work
When presenting the related directions of research, we follow the same division as the content of this thesis. Following the software development cycle, our work encompasses the requirements stage, the design stage, and the verification stage. Yet arguably the biggest contribution lies with the control explicit—data symbolic model checking. This method is employed in both design verification and code verification and as such deserves a separate treatment with respect to related research efforts. Those readers more closely familiar with the field of program verification might find our selection of related work in Section 2.4 cursory and lacking other important directions. Our opting for a shorter selection of only the most relevant directions was motivated by the specific target of our own verification: parallel C++ programs with input. Hence we have selected those directions that have the potential and ambition to aim for the same target programs.
2.1 Control Explicit—Data Symbolic Model Checking
The results described in this thesis rectify a very particular omission of the verification community. Even though the state space of many real-world programs is finite, the majority of verification efforts concentrate on treating the state spaces as infinite, because the analysis is then often easier. We do not follow this simplification, mainly because it may introduce imprecision by interpreting bit-vector integers as mathematical integers. Let us state precisely the properties of our verification method to substantiate our claims of presenting original results. Control explicit—data symbolic model checking:
• allows for verification of unmodified (1) program designs (Simulink), (2) program code (C/C++ via LLVM),
• handles bounded control-flow non-determinism of parallel programs,
• accepts full LTL as the behaviour specification,
• is bit-precise with respect to the evaluation of program variables,
• is fully automated without reporting false alarms,
• is complete, i.e. the output is effectively a proof of correctness.
At the time of writing this thesis, there is no other method capable of the above known to the author. The following description of related work lists those research directions that aim in part at achieving similar results, while often being considerably more efficient at solving an equally challenging, albeit different, problem. There are three vantage points that appear useful for placing the presented work among the established body of verification methodology. First, based on the problem being solved, this work addresses bit-precise LTL model checking of parallel programs with inputs, where computer variables are interpreted as bit-vectors and not as mathematical integers. Second, based on the achieved results, we have reduced the state space by handling separately the control- and data-flow aspects of parallel programs. And third, based on the distinguishing aspect of our approach, the described verification method combines explicit and symbolic approaches by enumerating the thread interleavings explicitly and by representing variable evaluations as images of bit-vector functions. The related work is analysed from these three perspectives to locate the differences and common aspects with the content of this work. (We have also devoted the first part of this section to briefly describing the symbolic representations most commonly used in verification.)
2.1.1 Symbolic Representations
The symbolic representations employed in formal verification can be divided into canonical and non-canonical. If the object being represented is a Boolean function, then canonicity of the representation means that for two Boolean functions f, f′ : B^n → B such that every n-long vector of Boolean values is mapped to the same value by both f and f′, the representations of f and f′ are exactly the same (syntactically equivalent). The definition of canonicity naturally extends to formulae of FO theories, such as the theory of bit-vectors, which are the objects primarily represented in this work.
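Canonicity can be illustrated with the simplest canonical form there is, the full truth table: two syntactically different formulas computing the same Boolean function get the identical representation. This is only a didactic sketch (a truth table is exponential in n by construction); ROBDDs achieve the same property far more compactly in most cases.

```python
from itertools import product

def truth_table(f, n):
    """Canonical representation of an n-ary Boolean function:
    its outputs over all 2^n input vectors, in a fixed order."""
    return tuple(f(*vals) for vals in product((False, True), repeat=n))

f1 = lambda p, q: not (p and q)       # NAND
f2 = lambda p, q: (not p) or (not q)  # syntactically different, same function
f3 = lambda p, q: p or q              # a genuinely different function

assert truth_table(f1, 2) == truth_table(f2, 2)   # canonical: identical
assert truth_table(f1, 2) != truth_table(f3, 2)
```

Equality of canonical representations is thus a syntactic check, which is exactly what makes state matching cheap for canonical forms and expensive for the non-canonical function images used in this work.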
Canonical Representations
The most important example of a canonical representation of Boolean functions is the Reduced Ordered Binary Decision Diagram (ROBDD) [Bry86], though in the following discussion we simply refer to them as BDDs. Apart from their other uses, BDDs were crucial in the development of symbolic model checking, given the efficient representation of state spaces they provide [McM92]. Yet the pivotal bottleneck of classical BDDs is their size when arithmetic operations are represented. It was shown by Bryant that BDDs grow exponentially for integer multiplication [Bry91], regardless of the variable ordering. The discovery of this limitation led to subsequent modifications and alternative diagrams. Linear functions may be canonically represented by Binary Moment Diagrams (BMDs), which allow efficient representation of digital systems at the word level [BC95], but BMDs cannot represent dividers and some Boolean functions representable by BDDs [BDE97]. Algebraic Decision Diagrams (ADDs are essentially BDDs with leaves that take values from a set of constants) allow representation of modular arithmetic and are smaller than BMDs [RPHS96]. Boolean Expression Diagrams (BEDs extend BDDs with binary Boolean operator vertices) represent any Boolean circuit in linear space; furthermore, quantification and substitution (important in fixed point iterations) are also linear [AH97]. Finally, the Taylor Expansion Diagrams (TEDs) allow constant computation of multiplication and polynomial computation of bounded exponentiation, yet the wrap-around behaviour of modular arithmetic cannot be represented (for a comprehensive summary of canonical representations see [CPJ09]).
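The two reduction rules that make ROBDDs canonical (sharing isomorphic subgraphs and eliminating redundant tests) can be sketched via hash-consing. The toy builder below assumes a fixed variable order and builds the diagram by Shannon expansion; it is a minimal sketch, not Bryant's full apply-based algorithm, and all names are ours.

```python
class BDD:
    """Minimal ROBDD builder: hash-consing under a fixed variable
    order yields a canonical node for each Boolean function."""
    def __init__(self, nvars):
        self.nvars = nvars
        self.unique = {}                    # (var, low, high) -> node id
        self.nodes = {0: None, 1: None}     # terminal nodes

    def mk(self, var, low, high):
        if low == high:                     # redundant-test elimination
            return low
        key = (var, low, high)
        if key not in self.unique:          # isomorphic-subgraph sharing
            nid = len(self.nodes)
            self.nodes[nid] = key
            self.unique[key] = nid
        return self.unique[key]

    def from_function(self, f, var=0, env=()):
        """Shannon expansion along the order var_0 < var_1 < ..."""
        if var == self.nvars:
            return 1 if f(*env) else 0
        low = self.from_function(f, var + 1, env + (False,))
        high = self.from_function(f, var + 1, env + (True,))
        return self.mk(var, low, high)

bdd = BDD(2)
n1 = bdd.from_function(lambda a, b: not (a and b))       # NAND
n2 = bdd.from_function(lambda a, b: (not a) or (not b))  # equivalent
assert n1 == n2   # canonicity: same function, same node
```

The exponential-time construction here is only for illustration; real BDD packages build diagrams compositionally with the apply operation and memoisation.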
Non-canonical Representations
The use of non-canonical representations of formulae of FO theories is justified by their potential conciseness, given that all canonical representations grow exponentially for some arithmetic functions [Bry91]. This thesis only mentions the representations using automata [Boi99] or monadic logics [HJJ+95]; while these representations are very effective in some areas, they were shown impractical for sets of variable evaluations [WB00]. Similarly, the use of Presburger arithmetic formulae (for their use in verification, see for example [BW94]) and their combinations with decision diagrams [BGL98] are not detailed, on the grounds of full Peano arithmetic being used in the verified programs. Finally, given that satisfiability of Peano arithmetic formulae is undecidable, most researchers in program verification focused on using the bit-vector theory to represent computation, and designed efficient decision procedures for satisfiability of BV formulae. Early decision procedures for a subset of BV, i.e. for the quantifier-free fragment with bit-level logical operations, composition, and extraction, were based on formula canonisation (using a set of rewriting rules), thus allowing Shostak combination [Sho82] with other theories [CMR97]. The set of allowed operations was extended to include Presburger arithmetic in [BDL98]. The possibility of deciding satisfiability for non-fixed size bit-vectors was investigated in [BS98, Pic03]. A solution based on solving the linear problems using the Müller-Seidl algorithm and the non-linear problems using Newton's p-adic iteration algorithm was implemented in [BM05]. This combination allows for Nelson-Oppen combination without converting the entire problem into a SAT instance (also called flattening). Precise modular arithmetic was first considered in [Wan06], which reduced the satisfiability of modular arithmetic formulae over the finite ring Z_{2^ω} to integer programming.
For linear modular arithmetic only a linear number of constraints is needed; for non-linear arithmetic one needs an additional factor of ω (though the constraints are still linear). The lazy approach, i.e. postponing the flattening process and first using the structural information such as equalities and arithmetic functions, was followed in [BCF+07]. Deciding satisfiability based on alternately improving under- and over-approximations of the original formula was investigated in [BKO+07]. Finally, [WHdM13] implemented a decision procedure for the quantified bit-vector theory by extending the quantifier-free procedure with model-based quantifier instantiation.
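The flattening step mentioned above reduces a bit-vector constraint to propositional logic. As a hedged illustration, a ripple-carry encoding expresses each output bit of a w-bit addition as a Boolean function of the input bits, with the final carry dropped, which gives exactly the wrap-around semantics of the ring Z_{2^w}; all helper names below are hypothetical.

```python
def add_flatten(a_bits, b_bits):
    """Ripple-carry flattening of bit-vector addition: each output
    bit is a propositional function of the input bits. Bits are plain
    booleans here; in a real solver they would be fresh propositional
    variables constrained by the same formulas."""
    out, carry = [], False
    for a, b in zip(a_bits, b_bits):       # LSB first
        out.append(a ^ b ^ carry)
        carry = (a and b) or (carry and (a ^ b))
    return out                             # final carry dropped: mod 2^w

def to_bits(n, w):  return [(n >> i) & 1 == 1 for i in range(w)]
def to_int(bits):   return sum(1 << i for i, b in enumerate(bits) if b)

w = 4
# Wrap-around semantics: 13 + 7 = 20 ≡ 4 (mod 2^4)
assert to_int(add_flatten(to_bits(13, w), to_bits(7, w))) == (13 + 7) % 2**w
```

A SAT solver then decides the satisfiability of the resulting purely propositional constraint system; the cost of flattening is what the lazy and word-level approaches above try to avoid or postpone.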
2.1.2 Bit-Precise LTL Model Checking of Parallel Programs with Inputs
All related methods of program verification are instances of static analysis, in that they verify correctness of some aspect of the program (here specified as an LTL formula) without executing the program. We concentrate on three specific instances: model checking, symbolic execution, and abstraction-based program analysis. One outstanding research direction which is related but does not fit the above categories is the study of value-passing transition systems. Defined in [Lin96], these transition systems have edges labelled with guarded assignments and allow the bisimulation of two systems to be reduced to finding the greatest solution of the underlying predicate equation system.
Model Checking
Explicit-state LTL model checking [VW86] does not have any theoretical limitation with respect to parallel programs with inputs other than the state space explosion, since the arithmetic operations are computed explicitly. Given its theoretical basis in ω-regular automata, explicit model checking is local in the sense that only those parts of the system are considered that are relevant to the temporal property being verified. It follows the tableau-proof process of deciding validity of LTL formulae, where the sequent rules describe what hypotheses must hold for the conclusion to hold (and the hypotheses are often structurally simpler). However, even this limited part of the transition system of parallel programs is often outside the storage capacity of present-day (and arguably even future) computers. Yet most approaches trying to ameliorate the state space explosion, such as partial-order or symmetry reduction and parallelisation (see for example [BBS01]), focus on the control-flow part, ignoring the variable evaluation or assuming it to be fixed. Only rarely was the impact of data-flow on explicit model checking studied [EP02], and the results were unfavourable. The recent attempts at incorporating a systematic treatment of input variables into explicit model checking, culminating in this work, started by introducing a new process into the verified system that generated the possible variable evaluations [BBB+12]. Symbolic CTL model checking [CES86], on the other hand, is global in the sense that the part of the system being represented at some step of the verification may be unreachable from the initial state and thus unnecessary to consider. This seeming redundancy stems from the representation being symbolic: it is more efficient to incorporate some redundancy than to describe what is redundant and should be excluded from the representation.
Unlike in local model checking, in global model checking one starts from those states that satisfy the simplest (atomic) subformulae of the specification formula and then structurally progresses towards the computation of the set of states that satisfy the whole formula. If this set contains all initial states of the program's transition system, then the program is correct. Similarly as with explicit-state model checking, verifying parallel programs with inputs using symbolic model checking is only limited by the complexity of the task. The classical approach based on BDDs [McM92] suffers from the unmanageable size of BDDs caused by (1) arithmetic operations in the data flow and (2) irregularity of the control flow. Using non-canonical representations, Boolean Expression Diagrams [WBCG00] or propositional logic formulae [Bje99, McM02], to handle the first cause of BDD size increase is limited by the introduction of quantifiers when pre- and post-images are computed during the fixed point search. (Note that our research has met a similar difficulty when discovering accepting cycles.) Bounded symbolic model checking [BCCZ99] continues with using logical formulae instead of BDDs and adopts the locality of the explicit approach, thus also improving the second cause of size complexity for the price of being incomplete. Completeness was then introduced in [McM03] and in [dMRS03], using Craig interpolants and k-induction, respectively. Recently, this approach was improved in [Bra11] by replacing the transition relation unfolding with incrementally generated clauses that approximate the reachability information, thus further accelerating the verification process. Finally, module checking [KV96, Var01] allows verification of systems that interact with an environment that introduces non-deterministic behaviour.
The environment can restrict the set of possible choices of some inputs, and the verification should be robust with respect to these restrictions. Yet such restrictions can only be distinguished by branching time logics: LTL is a linear time logic and its validity is thus unaffected by them.
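The locality of explicit-state LTL model checking is typically realised by searching for an accepting cycle in the product of the system and the property automaton. One classic algorithm for this is nested depth-first search; the sketch below uses illustrative names and recursion for brevity, and is not any particular tool's implementation.

```python
def nested_dfs(initial, successors, accepting):
    """Accepting-cycle detection by nested DFS: an accepting lasso
    reachable from `initial` witnesses violation of the property.
    `successors` maps a state to its successor list."""
    outer_visited, inner_visited = set(), set()

    def inner(s, seed):
        # second DFS: look for a path closing a cycle at the seed
        if s == seed:
            return True
        if s in inner_visited:
            return False
        inner_visited.add(s)
        return any(inner(t, seed) for t in successors(s))

    def outer(s):
        if s in outer_visited:
            return False
        outer_visited.add(s)
        if any(outer(t) for t in successors(s)):
            return True
        # in postorder, start the inner search from accepting states
        return accepting(s) and any(inner(t, s) for t in successors(s))

    return outer(initial)

# Toy product automaton: 0 -> 1 -> 2 -> 1, with state 1 accepting.
graph = {0: [1], 1: [2], 2: [1]}
assert nested_dfs(0, lambda s: graph[s], lambda s: s == 1)
```

The inner search marks states globally, so every state is expanded at most twice; the postorder discipline is what makes this sharing sound.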
Symbolic Execution
The basic idea of symbolic execution [Kin76, CPDGP01] is to track the execution of a program by building its execution tree. The nodes of such a tree consist of a path condition (a constraint on the input variables that leads the execution to this node) and a symbolic variable evaluation. The symbolic variable evaluation does not interpret the expressions assigned to variables during execution; instead, they are syntactically substituted with the symbolic values of the variables they refer to. This verification approach allows for efficient safety analysis of programs with inputs, since only one execution is necessary to consider all possible executions. An extension of symbolic execution to LTL verification was investigated in [BDKP08] but did not incorporate full LTL, on the grounds of state matching not being decidable for theories with unbounded integers. Without state matching, symbolic execution is not guaranteed to terminate, given that the program under verification may contain loops. One possible extension of symbolic execution is the addition of state subsumption [XMSN05, APV06] (effectively subset inclusion of concrete evaluations), which improves efficiency but not the scope of supported specifications. Furthermore, the original limitation of symbolic execution to verification of sequential programs was lifted in [KPV03] by employing a model checker to generate possible thread interleavings.
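The execution-tree construction can be sketched as follows. The toy executor below keeps the symbolic store and the path condition as plain strings and forks at every branch; the string-based encoding and all names are ours, purely for illustration (a real engine would use a constraint solver to prune infeasible paths).

```python
def execute(program, store=None, pc=None):
    """Toy symbolic executor over a single input symbol `x`:
    assignments substitute symbolic values syntactically, and a
    branch forks the execution, extending the path condition on
    each side."""
    store, pc = dict(store or {"x": "x"}), list(pc or [])
    for i, (op, *args) in enumerate(program):
        if op == "assign":                     # v := expr (substituted)
            var, expr = args
            store[var] = expr.format(**store)
        else:                                  # "branch": explore both sides
            cond = args[0].format(**store)
            rest = program[i + 1:]
            return (execute(rest, store, pc + [cond]) +
                    execute(rest, store, pc + ["not(" + cond + ")"]))
    return [(pc, store)]                       # one fully explored path

paths = execute([
    ("assign", "y", "({x} + 1)"),
    ("branch", "{y} > 10"),
    ("assign", "z", "({y} * 2)"),
])
assert len(paths) == 2                         # one leaf per branch outcome
assert paths[0][0] == ["(x + 1) > 10"]         # path condition of first leaf
```

Each leaf of the execution tree is a pair of path condition and symbolic store, so a single symbolic run summarises all concrete runs that follow the same branches.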
Abstraction-Based Program Analysis
The verification methods based on abstract interpretation [CC77] do not track the execution of the program under verification at the precision specified in the program. Instead, the precision is approximated to a model that can be more easily manipulated, with each concrete value of the variables mapped to an abstract object, e.g. an interval, that allows simpler and faster computation. The application of abstract interpretation in static analysis has led to a number of results on real-world, industrial scale programs. Predicate abstraction for temporal properties was described in [PR05], which builds on the idea of encoding temporal properties into fair termination. Later, abstraction-based verification of parallel programs was proposed in [GPR11a]. Finally, the last ingredient for bit-precise verification was an abstract domain forming a lattice over bit-vectors. One possibility, extending the polyhedra abstraction with implicit wrap-around of arithmetic operations, was investigated in [SK07].
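As a concrete illustration of the interval abstraction mentioned above, the sketch below defines abstract transfer functions for addition and multiplication on bounds. It is a minimal, hypothetical example without widening or any of the other machinery of a real analyser.

```python
from itertools import product

class Interval:
    """Interval abstract domain: a set of concrete integers is
    over-approximated by a [lo, hi] pair."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):                  # transfer function for +
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __mul__(self, other):                  # transfer function for *
        ps = [a * b for a, b in product((self.lo, self.hi),
                                        (other.lo, other.hi))]
        return Interval(min(ps), max(ps))
    def __contains__(self, n):                 # membership, for soundness
        return self.lo <= n <= self.hi

x, y = Interval(-2, 3), Interval(1, 4)
z = x * y + y
# Soundness: every concrete result lies inside the abstract interval.
assert all(a * b + b in z for a in range(-2, 4) for b in range(1, 5))
```

The abstract result may of course be strictly larger than the set of concrete outcomes; this loss of precision is the price paid for the simpler, always-terminating computation.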
2.1.3 State Space Reduction by Control-Flow Aggregation
Cimatti et al. [CRB01] combine the approaches from explicit and symbolic model checking by representing subsets of the set of states as individual BDDs (uncertainty states), while relying on the forward, hash-based state space traversal used in explicit model checking. Uncertainty states are modified according to the transitions applicable to the states that comprise them. Another approach, investigated in [HGD95, BDG+98], handles the problem of data-flow non-determinism by solving first-order CTL model checking (given that pre- and post-image computation introduces quantifiers, the first-order extension of a temporal logic is sufficiently powerful to represent the problem). From a semantic point of view, their algorithm iteratively approximates the sets of evaluations annotating nodes of a product of two transition systems: one is essentially the control-flow graph, the other connects related subformulae of the input formula. This verification is incomplete for unbounded data domains unless the data and control variables can be clearly separated. A similar, though still incomplete, approach is to accumulate during verification a path condition that distinguishes between program counter predicates and program variable assertions [GP05]. Finally, [BCZ99] showed how local and global approaches could be combined by aggregating the sets of states satisfying a subformula in the tableau-based verification. Another possible aggregation is to form sets from those states with similar observable events, together building an observation graph. Originally described in [HIK04], this approach was optimised in [BDL07] by constructing powersets using previously formed subsets. The limitation of the original paper to the next-free fragment of LTL was recently lifted in [DLKPTM11].
2.1.4 Combinations of Explicit and Symbolic Approaches
In [BL13] the authors combine explicit-value analysis with various symbolic approaches, such as abstraction, counterexample-guided refinement, and interpolation, to build an abstract reachability graph, extending the work from [BK11]. The explicit analysis only tracks those variables necessary to refute the infeasibility of error paths, and variables are added to the precision set iteratively, in further CEGAR iterations. The abstraction maps variables to Z ∪ {⊤, ⊥}, with ⊤ for an unknown value and ⊥ for no satisfiable value; such an abstract variable assignment represents the region of concrete data states for which it is valid. Hence the abstraction clusters together sets of states with the same explicit value of every variable. In [CNR12] a verification method for SystemC designs was implemented by translating them to multi-threaded programs with cooperative scheduling, shared variables, and mutually exclusive thread execution. In this model only one thread is running at a time; when it stops, the scheduler selects the next thread to run. This explicit-scheduler, symbolic-thread combination uses symbolic abstraction for the sequential execution of individual threads, while the scheduling policy is handled explicitly. A different combination was investigated in [STV05], which represented the system state space symbolically and the property automaton explicitly. Also related is concolic execution, see for example [SMA05], where the standard symbolic execution is, after a number of symbolic steps, supplemented with a single step of concrete execution (on a smaller subset of values). Then the symbolic execution continues, maintaining a much smaller representation. Perhaps the most closely related work is that of [CKS05], where the distinction between data- and control-flow non-determinism is realised via a SAT-based model checker for concurrent (with a finite number of threads) Boolean programs.
Data-flow non-determinism is handled by a combination of symbolic model checking and fixed point detection, and the control-flow non-determinism is handled by symbolic partial-order reduction. The approach is limited to Boolean programs, yet standard programs can be converted to Boolean ones using predicate abstraction. Our work is precise with respect to the bit-vector semantics of programs since we do not use any abstraction. Finally, [CKS05] verify reachability properties and thus do not require exact state matching. Their approach is therefore much faster, but does not allow checking LTL properties.
2.2 Model-free Sanity Checking
The use of model checking with properties (specified in CTL) derived from real-life avionics software specifications was successfully demonstrated in [CAB+98]. Our work on sanity checking is intended as a preliminary to such a use of a model checking tool, because there the authors presupposed the sanity of their formulae. The idea of using coverage as a metric for completeness can be traced back to software testing, where it is possible to use LTL requirements as one of the coverage metrics [WRHM06, RWH07]. Model-based sanity checking was studied thoroughly and using various approaches, but it is intrinsically different from the model-free checking presented in this work. Completeness is measured using metrics based on the state space coverage of the underlying model [CKKV01, CKV01]. Vacuity of temporal formulae was identified as a problem related to model checking, and solutions were proposed in [KV03] and in [Kup06], again expecting the existence of a model. Checking consistency (or satisfiability) of temporal formulae is a well understood problem, solved using various techniques in many papers (most recently using a SAT-based approach in [RDB+05] or in [RV07], where it was used as a comparison between different LTL to Büchi translation techniques). The classical problem is to decide whether a set of formulae is internally consistent. In this thesis, however, a more elaborate answer is sought: specifically, which of the formulae cause the inconsistency. The approach is then extended to vacuity, which is rarely used in model-free sanity checking. We have adopted the idea of searching for the smallest inconsistent subset of a set of requirements from the SAT community, where finding the minimal unsatisfiable core (a set of propositional clauses) is an important feature of SAT solvers, see e.g. [LMS04]. The authors of [LS08] extend the approach to finding all minimal unsatisfiable subsets, again in the context of propositional logic. Their exclusion of checking those subsets containing known unsatisfiable subsets was extended by our algorithm with a dual heuristic: exclude also subsets with known satisfiable supersets.
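The enumeration with both pruning heuristics can be sketched as follows. The oracle `consistent` stands in for an LTL satisfiability check, and the example uses propositional formulas as a stand-in for requirements; with the smallest-first order used in this sketch the second prune happens to be redundant, and it only pays off under other candidate orders.

```python
from itertools import combinations

def minimal_inconsistent_subsets(reqs, consistent):
    """Enumerate all minimal inconsistent subsets of `reqs`, pruning
    (1) supersets of known inconsistent sets and (2) subsets of known
    consistent sets. `consistent` is an oracle; sketch only."""
    inconsistent, consistent_sets = [], []
    for k in range(1, len(reqs) + 1):        # smallest subsets first
        for sub in combinations(reqs, k):
            s = frozenset(sub)
            if any(bad <= s for bad in inconsistent):
                continue                     # contains a known core
            if any(s <= good for good in consistent_sets):
                continue                     # inside a known-consistent set
            if consistent(s):
                consistent_sets.append(s)
            else:
                inconsistent.append(s)       # minimal by construction
    return inconsistent

# Toy propositional stand-in for LTL requirements: {p, ¬p, q}.
reqs = ["p", "not p", "q"]
sat = lambda s: not ({"p", "not p"} <= s)
assert minimal_inconsistent_subsets(reqs, sat) == [frozenset({"p", "not p"})]
```

Each reported set is minimal because all of its proper subsets were tested (and found consistent) in earlier, smaller rounds.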
A problem related to consistency is that of realisability [ALW89]. Realisability generalises consistency by allowing some of the atomic propositions to be controlled by the environment. A set of system requirements is then realisable if there is a strategy which can react to arbitrary environment choices while satisfying the requirements. If no propositions are controlled by the environment, then the notion of realisability becomes equivalent to consistency. A number of tools implement realisability checking, RATSY [BCG+10] for example, but again limit their responses to stating whether the requirements are realisable as a whole. Thus, even though the theory of minimal unrealisable cores had been known for some time [CRST08, KHB09], searching for the smallest inconsistent subsets is a feature unique to our tool. We would also like to point out the theory described in [Sch12], which extends unrealisability cores to search even within individual formulae, providing even finer information on the quality of requirements. Completeness of formal requirements is not as well-established and its definition often differs. The most thorough research in algorithmic evaluation of completeness was conducted in [HL95, Lev00, MTH03]. The authors of those papers use RSML (Requirements State Machine Language) to specify requirements, which they translate (in the last paper) to CTL and to a theorem prover language to discover inconsistencies and detect incompleteness. Their notion of completeness is based on verifying that for every input value there is a reaction described in the specification. This thesis presents completeness as considering all behaviours described as sensible (and either refuting or requiring them). Finally, a novel semi-formal methodology is proposed in Section 4 that recommends to the user new requirements that have the potential to improve the completeness of the input specification.
Another approach to defining and checking completeness was pursued in [KGG99], in terms of evaluating the extent to which an implementation describes the intended specification. Their evaluation is based on the simulation preorder and on searching for unimplemented transition/state evidence. In order to create a metric that would drive the engine for generating new requirements, we have built an evaluation algorithm that computes the similarity of two paths. Our notion of similarity is to a certain extent based on unimplemented transitions, but we carry the evaluation from two transitions up to whole automata, thus finally obtaining our completeness metric. The authors of [BB09] elaborate a method for measuring the quality of a set of requirements by detecting the presence of such subsets that together require behaviour which could not be detected from the individual requirements. Similarly as in [HL95], the interesting behaviour is described in terms of reactions to input values observed at discrete time instances, yet the properties are limited to those of the form ∀t, At(t) ⇒ Zt(t). Finally, the notion of coverage is well-established in the area of software testing. The similarity between our work and that described for example in [RLS+03, FG03] lies in that both approaches attempt to generate a description of the sets of behaviours that remain uncovered. The dissimilarity lies in that we describe these uncovered sets in terms of software requirements (or LTL formulae), whereas automated testing coverage produces collections of tests, i.e. individual runs of a particular system.
2.3 Verification of Simulink Diagrams
There are three practical research directions that we are aware of describing LTL verification of Simulink diagrams. [MAW+05] translate Simulink models into the synchronous data-flow language Lustre, created by [HCRP91], and then into the native language of the symbolic model checker NuSMV, created by [CCG+02]. NuSMV was then used to verify correctness of the model with respect to a set of specifications (mostly in CTL, but NuSMV can verify full LTL as well) using a BDD-based representation of the state space (Binary Decision Diagrams; see [McM92] for a classical approach). Although it was not a limitation for the input diagrams they used, BDDs are not readily able to support complex data types with Peano arithmetic, at least not while preserving their efficiency, as we will discuss later in this subsection. Similarly, [MBR06] translate Simulink diagrams for verification using NuSMV, but in this case the translation is direct, without an intermediate language. This work is more relevant because the diagrams used contained integer data types and full Peano arithmetic (NuSMV flattens integer types into Boolean types, but represents the stored values precisely). The authors do not mention explicitly whether they used the BDD-based or the SAT-based branch of verification. Unfortunately, we could not compare that approach with the one proposed in this work, since the implementation is not available. Its efficiency can only be assumed from the reported results: when input variables were not bounded explicitly, i.e. allowed values between 0 and 2^32, the verification time was almost a week; the bounds for reasonably fast verification had to be relatively small: between 0 and 60. Explicit model checking was also used for verification of Simulink diagrams by [BBB+12], but there the data-flow non-determinism in the form of input variables had an even greater detrimental effect.
This is the only comparable implementation presently available, and we provide a detailed comparison between this purely explicit approach and the hybrid, control-explicit, data-symbolic approaches described here in Section 6. From a theoretical point of view, the method proposed in Section 6 is effectively equivalent to symbolic execution extended with LTL verification. This combination was attempted before by [BDKP08], but classical symbolic execution does not allow state matching and thus only a small subset (safety properties) of LTL was supported. Without state matching, an infinite violating run would cause the tool to diverge, never yielding an answer. On the other hand, it is possible to implement state subsumption (a state represents more heap configurations than another state) in symbolic execution, as shown by [XMSN05], but subsumption is not correct with respect to LTL even though it can significantly reduce the state space (and make it finite). A considerable amount of work pertained to allowing symbolic model checking to use more complex data types than Booleans. The problem is that BDDs grow exponentially in the presence of integer multiplication, as proved by [Bry91]. Other symbolic representations were designed that represent variables on the word level rather than the binary level, such as Binary Moment Diagrams, used by [BC95], or Boolean Expression Diagrams, used by [WBCG00], even though these are no longer canonical. Similarly to our approach, symbolic model checking can also use first-order formulas directly, leading to SAT-based and SMT-based approaches, invented by [BCC+99] and [AMP06], respectively, provided that the procedure is bounded in depth, and consequently incomplete. Various approaches to removing the bound have been proposed, most recently the IC3 method of [Bra11], but these are either rather theoretical, without implementation, or limited to weaker arithmetics; see e.g. the model checker Kind, based on k-induction and created by [HT08]. Module checking, introduced by [KV96] and detailed by [God03], allows verification of open systems. A system is open in the sense that the environment may restrict the non-determinism, and the verification has to be robust with respect to arbitrary restrictions. Consequently, the approach to verifying open systems also differs, since only branching time logics can distinguish open from closed systems in the module checking sense.
For linear time logics, every path has to satisfy the property, and thus open and closed systems collapse into a single source of non-determinism. Much closer to our separation between control and data is the work initiated by [Lin96]. Lin's Symbolic Transition Graphs with Assignments represent parallel programs with input variables precisely, using a combination of first-order logic and a process algebra for communicating systems. Similarly as for symbolic execution, the most complicated aspect of this representation is the handling of loops. Lin's solution computes the greatest fixed point of a predicate system representing the first-order term for each loop. Then two transition graphs are bisimilar if the predicate systems representing their loops are equivalent. The set-based reduction of transition systems used in this work effectively computes the fixed points explicitly, represented as sets of variable evaluations. Finally, various combinations of different approaches and representations have been devised and experimented with. When multiple representations were combined, it was mostly to improve on weak aspects of either of the representations; for example, in [YWGI06] multiple symbolic representations for Boolean and integer variables were employed in combination. Presburger constraints were combined with BDDs by [BGL98] to allow, with some restrictions, the verification of infinite state spaces. Finally, the two approaches to model checking, explicit and symbolic, were combined to improve solely upon control-flow non-determinism. Some improvement was achieved by storing multiple explicit states in a single symbolic state, implemented by [DLKPTM11], or by storing explicitly the property and symbolically the system description, implemented by [STV05]. The approach described by [HGD95] represents control using symbolic model checking and data by purely symbolic manipulation of first-order formulas.
There the symbolic data are limited in the sense that they cannot influence the control flow.
2.4 LTL Verification of Parallel LLVM Programs
Again, the breadth of our field demands that we restrict our attention to only a few of the most relevant directions of research. We would also like to add that many of the following methods have a deep theoretical background, and their comprehensive exposition is outside the scope of this thesis. The interested reader is urged to follow the cited sources, as any attempt on our side to provide a short overview would likely be to the detriment of understanding.
Explicit Model Checking The tools following the explicit-state approach to model checking (such as SPIN [Hol97]) are inherently limited regarding data non-determinism. Should the data be represented explicitly, the state space explosion quickly renders the verification infeasible. Regarding the LLVM language support, however, DIVINE [BBH+13] provides a much larger subset, with better coverage of memory allocations and pointer operations, and with a comprehensive handling of exceptions. The support for LLVM was recently added [vdB13] to LTSmin as well.
Symbolic Model Checking Fully symbolic tools, e.g. NuSMV [CCG+02], do not distinguish between control and data, simply representing a set of states symbolically as a single entity. While they are designed for CTL model checking, they commonly allow verification of LTL by adding fairness constraints. The language support is limited to a language specific to a given tool: in order to compare with SymDivine we had to translate the input program to their language via the empty data representation.
Interpolation A relatively recent technique for accelerating the reachability algorithm uses Craig interpolation for generating approximations of the transition relation [McM03]. Employing this algorithm in LTL model checking is theoretically possible by reducing LTL model checking to reachability [BAS02], yet no tool exists that would implement it.
IC3 Another technique for automatic approximation of the transition relation, one that is additionally guided by the property under verification, is IC3 [Bra11]. A CTL model checking tool based on IC3 was also designed [BSHY11]. Furthermore, the new version of NuSMV (called nuXmv) also implements LTL verification [CGMT14], although a bug causing the tool to crash on LTL instances prevented us from comparing with this approach.
Termination Finally, a different family of techniques reduces the problem to program termination. Two possible approaches were investigated: via fair termination [CGP+07] (shown extendable to predicate abstraction by encoding path segments as abstract transitions [PR05]) and via counterexample-guided CTL model checking [CK11]. The authors of the last paper also published the inputs for their experiments, which we adopted to compare SymDivine with available tools. Heidy Khlaaf builds on the results previously achieved by Eric Koskinen and has considerably extended them, both in efficiency and in scope. Her tool is a part of the T2 temporal prover1. In [CKP14], she considerably accelerated the CTL model checking of C programs using precondition computation. In [CKP15a], she added the
1http://mmjb.github.io/T2/
support for fairness, thus effectively allowing the verification of LTL properties. And finally, in [CKP15b], she further extended the approach to full CTL*.
Horn Clauses Verifying CTL properties has also been reduced to solving quantified Horn clauses [BBR14]. This approach builds on the incremental results for solving existentially quantified Horn clauses [BPR13] and linear arithmetic constraint solving [GPR11b].
Abstract Interpretation The individual results produced by Antoine Miné would together also enable verifying temporal properties of parallel programs. First, in [Min12], he described an abstract domain for the bit-vector theory. Then, in [Min13], he applied abstract interpretation to allow static analysis of concurrent programs. Finally, in [UM15], they extended the scope of abstract interpretation to a subset of LTL. Note that both this research direction and the direction using Horn clauses depend to some extent on deciding termination.
Program Traces Observing the program as a producer of possible traces, i.e. sequences of transitions, allows building automata that accept these traces. The program analysis engine built on this idea is described in [HHP13] and extended to temporal verification in [DHLP15]. A related idea led the authors of [FKP13] to a deductive verification method for parallel programs. Unlike the partial order reduction implemented in Section 8, the above approach is space-polynomial in the number of threads. And finally, Andrey Kupriyanov, working with a different notion of traces, designed a method for the analysis of multi-threaded programs: first for deciding safety properties in [KF13], then for deciding termination in [KF14], and finally for LTL verification in the presently unpublished [Kup14].
Chapter 3
Preliminaries
Model checking, and in particular explicit-state model checking, is one of the underlying principles that connect the individual parts of this thesis. The sanity of requirements is tested by repeatedly model checking a conjunction of those requirements. The Simulink designs and LLVM programs are translated into control-flow graphs on which a model checking procedure is executed. In the latter case, explicit model checking is extended with symbolic data representations, but otherwise the verification technique itself is unmodified. This chapter establishes the necessary background in formal methods, as used in this thesis. The definitions pertaining to model checking may appear standard, but the use of symbolic data representations required us to modify them to accommodate the bivalent (explicit control/symbolic data) nature of states. The sanity checking techniques may be new to most readers, and we also recommend reading our short presentation of the bit-vector theory. The principles and limitations of checking satisfiability in this theory lie at the core of some of the crucial concepts of this work.
3.1 First-Order Theory of Bit-Vectors
Following [MP95], we will assume that first-order (FO) formulae refer to variables from a universal set V. Each variable x ∈ V draws its values from a domain sort(x). The set E of evaluations of variables from V contains functions from V to ⋃_{x∈V} sort(x), such that ∀(ν ∈ E), ν(x) ∈ sort(x). Let X ⊆ V denote a set of variables {x1, . . . , xn}; we will assume a fixed ordering and overload the notation to allow ν(X) = (ν(x1), . . . , ν(xn)). In order to model precisely the semantics of programs, the domain of program variables will consist of fixed-width bit vectors, i.e. sort(x) = {0, . . . , 2^qx − 1} for some qx. Apart from program variables, V also contains program counters pc, one for each parallel process, where sort(pc) is the set of program locations of the given process. Finally, a subset I ⊆ V denotes an infinite set of input variables, drawing non-deterministic values from sort(i) = {0, . . . , 2^qi − 1} for i ∈ I. Altogether, V = D ∪ D′ ∪ {pc, pc′} ∪ I, where D is the set of data variables (the next-state versions are primed), pc the program counter, and I the set of input variables. Then the state of a program is uniquely defined as an evaluation of the control and data variables from D ∪ {pc}. The FO theory used in this thesis is the bit-vector theory BV with standard Peano arithmetic and bit-wise operations that, when applied to variables from V, form bit-vector expressions [KS10]. When necessary, we will distinguish between the quantifier-free fragment of BV
and the full, quantified bit-vector theory QBV. We will denote both functions and predicates with Greek letters and the set of all BV expressions over V as Φ.

Definition 3.1 (Syntax of BV). We define the syntax of (the quantifier-free fragment of) the theory of bit-vectors as follows:
formula ::= formula ∧ formula | ¬formula | (formula) | atom
atom ::= term rel term | boolean-identifier | term[constant]
rel ::= < | =
term ::= term op term | identifier | ∼term | constant | atom ? term : term | term[constant : constant]
op ::= + | − | · | / | ∣ | & | ∣ | ⊕ | ◦
Each BV term is associated with an integer bit-width, and relations and binary operations are well-defined only when applied to terms of equal bit-width. Formally, each bit-width has a unique predicate symbol (denoted with a lower index), but we will not obfuscate the notation with these.

Example 3.2 (BV Semantics). The standard semantics of bit-vector operations is similar to modular arithmetic, i.e. each operation is computed modulo 2^n, for n the bit-width of the operation.
  11001000 = 200
+ 01100100 = 100
= 00101100 =  44
Above is an example of bit-width 8 addition. It follows that in the theory of bit-vectors, 200 +₈ 100 = 44 is a valid formula. △

A large part of this work directly depends on the high efficiency of decision procedures for BV satisfiability. Similarly to decision procedures for other first-order theories, BV satisfiability is decided by a collection of rewriting rules, heuristics, and the DPLL algorithm [DLL62] for propositional logic. Unlike other theories, BV is much more closely related to propositional logic, since a BV formula can be translated to an equisatisfiable propositional formula. This translation assigns a new propositional variable to each bit of each bit-vector variable. Deciding satisfiability of QBV formulae is both theoretically and practically more challenging. To be more precise, each BV formula implicitly existentially quantifies all its free variables. Thus when we talk about QBV, we mean those FO formulae that have at least one quantifier alternation.

Definition 3.3 (Quantified BV). We extend the syntax to quantified formulae in the following manner, where the type of each quantified variable denotes its bit-width:
formula ::= ∃(identifier :: bit-width), formula | ∀(identifier :: bit-width), formula | formula
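Stepping back to the quantifier-free fragment for a moment, the bit-blasting translation described above can be made concrete. The Python sketch below is purely illustrative (the function names are ours, not from any tool): it shows the modular semantics of Example 3.2 and the per-bit circuit view that bit-blasting hands to a SAT solver, where every bit-vector variable becomes a list of propositional (0/1) variables.

```python
# Illustrative sketch: modular bit-vector addition and its bit-level
# (bit-blasted) counterpart; names are ours, not from any tool.

def to_bits(value, width):
    """Little-endian list of the `width` bits of `value`."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(xs, ys):
    """Bit-level addition: the Boolean circuit a SAT solver sees after
    bit-blasting; the final carry is discarded, giving arithmetic
    modulo 2^n."""
    carry, out = 0, []
    for x, y in zip(xs, ys):
        out.append(x ^ y ^ carry)             # sum bit
        carry = (x & y) | (carry & (x ^ y))   # carry bit
    return out

# The addition of Example 3.2: 200 + 100 computed in 8 bits yields 44.
result = from_bits(ripple_carry_add(to_bits(200, 8), to_bits(100, 8)))
print(result)                 # 44
print((200 + 100) % 2 ** 8)   # 44
```

The same per-bit construction, applied to every operation in a formula, yields the equisatisfiable propositional formula mentioned above.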
The efficiency of state-of-the-art solvers for QBV is much less consistent compared to BV. This is partly caused by the fact that very little effort has been directed towards solving QBV satisfiability. To the best of our knowledge, the authors of [WHdM13] implemented the first
and only QBV solver as part of Z3. Their methods combine simplification ideas from theorem proving, such as miniscoping and skolemisation [Har09], with heuristic quantifier instantiation. The latter technique synthesises a model µ of the universally quantified variables for which the input formula ∀x, ϕ is satisfiable, i.e. |=SAT ϕ[x ↦ µ(x)]. Then, if x ≠ µ(x) ∧ ϕ is unsatisfiable, the original formula is pronounced satisfiable. The heuristic is relatively effective but only applicable to satisfiable formulae. For unsatisfiable formulae the instantiation continues in a loop until the formula can be simplified to ⊥. In our application, unsatisfiable formulae are created and checked every time we find two matching states. This is thus a relatively common situation in the verification we propose, and an appropriate amount of attention needs to be paid to handling the potential inefficiency.
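The flavour of such an instantiation loop can be sketched as follows. This is an illustrative, CEGIS-style rendering of the idea, not the Z3 implementation; brute-force enumeration over 4-bit vectors stands in for the SAT oracle, and all names are ours.

```python
# Illustrative sketch: counterexample-guided instantiation for an
# exists/forall bit-vector query ∃y ∀x. phi(x, y). Candidates for y
# are synthesised from finitely many instantiations of x, then checked
# against all x; brute force stands in for the SAT oracle.

WIDTH = 4
DOMAIN = range(2 ** WIDTH)

def solve_exists_forall(phi):
    """Return a witness y with ∀x. phi(x, y), or None if none exists.
    `phi` is a Python predicate standing in for a BV formula."""
    instances = [0]  # the growing set of x-instantiations
    while True:
        # Candidate step: a y consistent with every collected instance.
        candidate = next((y for y in DOMAIN
                          if all(phi(x, y) for x in instances)), None)
        if candidate is None:
            return None          # no candidate left: unsatisfiable
        # Verification step: search for a counterexample x.
        cex = next((x for x in DOMAIN if not phi(x, candidate)), None)
        if cex is None:
            return candidate     # the candidate survives every x
        instances.append(cex)    # refine the instantiation set, retry

# ∃y ∀x. (x + y) mod 16 = x holds exactly for y = 0:
print(solve_exists_forall(lambda x, y: (x + y) % 16 == x))  # 0
# ∃y ∀x. (x + y) mod 16 = 0 has no witness:
print(solve_exists_forall(lambda x, y: (x + y) % 16 == 0))  # None
```

Note the asymmetry discussed above: a satisfiable query terminates as soon as a surviving candidate is found, while an unsatisfiable one must exhaust the candidate space.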
3.2 Control Flow Graphs
This section establishes the mathematical foundations for program analysis, building on the notion of control flow graphs. In the classical model checking literature, see e.g. [CGP99], these are referred to as Kripke structures or labelled transition systems. Classical model checking, however, does not distinguish between control and data flow, which is why we opt for control flow graphs instead. The formalism of control flow graphs can also be interpreted as omega-regular automata, i.e. automata recognising certain languages over infinite words. To simplify the following discussion we do not introduce another formalism for automata recognising temporal specifications, but instead alter the semantics of control flow graphs. The interested reader is referred to the literature on Büchi automata [Chu62] for more information. Throughout this thesis, we will use flow graphs for modelling parallel programs, their processes, specification automata, and in general any reactive system.
Definition 3.4 (Control Flow Graph). The control flow graph (CFG) P = (L, l0, 𝒯, F) is a 4-tuple, where
• L is the set of program locations
• l0 ∈ L is the initial location
• 𝒯 ⊆ L × Φ × L is the transition relation
• F ⊆ L is the set of final locations
Hence a transition t ∈ 𝒯 is a triple t = (l, ρ, l′), l, l′ ∈ L, where ρ describes the relation between current-state and next-state data variables; depicted visually, t = l −ρ→ l′. Example 3.5 shows the CFGs of the parallel processes of a program, where the parallel composition operator ∥ will be defined momentarily.
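Definition 3.4 translates directly into a data structure. The sketch below is illustrative only (the names are ours, not from any tool); transition labels are kept opaque, as any representation of BV formulae would do, and the hatted location of Example 3.5 is written "l^1".

```python
# Illustrative sketch: a CFG as a data structure. Labels are opaque
# stand-ins for BV formulae (here: strings); names are ours.
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    locations: frozenset            # L
    initial: str                    # l0 ∈ L
    transitions: frozenset          # 𝒯 ⊆ L × Φ × L, as (l, rho, l') triples
    final: frozenset = frozenset()  # F ⊆ L

    def successors(self, location):
        """All (rho, l') pairs one step from `location`."""
        return [(rho, tgt) for (src, rho, tgt) in self.transitions
                if src == location]

# Thread 1 of Example 3.5:
thread1 = CFG(
    locations=frozenset({"l1", "l2", "l^1"}),
    initial="l1",
    transitions=frozenset({
        ("l1", "a > 5 ∧ a' = a ∧ b' = b", "l2"),
        ("l1", "a ≤ 5 ∧ a' = a ∧ b' = b", "l^1"),
        ("l2", "a' = a ∧ b' = b + 1", "l^1"),
    }),
)
print(sorted(tgt for _, tgt in thread1.successors("l1")))  # ['l2', 'l^1']
```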
Example 3.5 (CFG for Parallel Program). Let us demonstrate how threads of a parallel program can be mapped to our definition of CFGs. The simplified parallel program on the left is represented by the three CFGs on the right (the top one only initialises the global variables with non-deterministic values).
[Figure: the program and its three CFGs. The global process has locations g1, g̃1, ĝ1 and transitions g1 −a′=ι32→ g̃1 −b′=ι32→ ĝ1; thread 1 has locations l1, l2, l̂1 with transitions l1 −a>5→ l2, l1 −a≤5→ l̂1, and l2 −b′=b+1→ l̂1; thread 2 has locations m1, m2, m̂1 with transitions m1 −b<20→ m2, m1 −b≥20→ m̂1, and m2 −a′=a mod 5→ m1. The program text:]

    uint32_t a, b;

    thread1() {
    l1:  if (a > 5) {
    l2:    b++;
         }
    }

    thread2() {
    m1:  while (b < 20) {
    m2:    a = a % 5;
         }
    }
The transition labels are simplified in that each variable preserves its value, i.e. x′ = x, whenever it is not explicitly modified. For example, the CFG of thread 1 is G = (L, l1, 𝒯, ∅), where
• L = {l1, l2, l̂1}

• 𝒯 = { (l1, a > 5 ∧ a′ = a ∧ b′ = b, l2),
        (l1, a ≤ 5 ∧ a′ = a ∧ b′ = b, l̂1),
        (l2, a′ = a ∧ b′ = b + 1, l̂1) }  △

It will prove useful in future discussion to introduce three composition operations on CFGs: (1) the parallel ∥ to model interleaving of parallel processes, (2) the synchronous ∘ to model the connection between a program and its specification, and (3) the sequential ; to model the sequential transfer of control from one CFG to the next. To simplify the notation we will define compositions between two CFGs, but the extension to more CFGs is straightforward.

Definition 3.6 (Parallel Composition). The parallel composition P = P¹ ∥ P² of CFGs P¹ = (L¹, l0¹, 𝒯¹, F¹) and P² = (L², l0², 𝒯², F²) is a CFG P = (L, l0, 𝒯, F), where

• L = L¹ × L²

• l0 = (l0¹, l0²)

• 𝒯 = {(l1, l2) −ρ→ (l1′, l2′) | (l1 −ρ→ l1′ ∈ 𝒯¹ ∧ l2′ = l2) ∨ (l2 −ρ→ l2′ ∈ 𝒯² ∧ l1′ = l1)}

• F = {(l1, l2) | l1 ∈ F¹ ∨ l2 ∈ F²}

Definition 3.7 (Synchronous Composition). The synchronous composition P = P¹ ∘ P² of CFGs P¹ = (L¹, l0¹, 𝒯¹, F¹) and P² = (L², l0², 𝒯², F²) is a CFG P = (L, l0, 𝒯, F), where L, l0, and F are defined as for the parallel composition and

• 𝒯 = {(l1, l2) −ρ→ (l1′, l2′) | l1 −ρ1→ l1′ ∈ 𝒯¹, l2 −ρ2→ l2′ ∈ 𝒯², ρ = ρ1 ∧ ρ2}

Definition 3.8 (Sequential Composition). The sequential composition P = P¹ ; P² of CFGs P¹ = (L¹, l0¹, 𝒯¹, F¹) and P² = (L², l0², 𝒯², F²) is a CFG P = (L, l0, 𝒯, F), where

• L = L¹ ∪ L²

• l0 = l0¹

• 𝒯 = 𝒯¹ ∪ 𝒯² ∪ {l −⊤→ l0² | l ∈ F¹}

• F = F²
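The three compositions can be sketched over a lightweight CFG representation as follows. This is illustrative code with names of our choosing; the conjunction of labels in the synchronous case is kept symbolic as a tagged tuple.

```python
# Illustrative sketch of Definitions 3.6-3.8. A CFG is a dict with
# locations L, initial l0, transitions T as (l, rho, l') triples, and
# final locations F; names are ours.

def parallel(p1, p2):
    """P1 ∥ P2: interleaving -- exactly one component moves per step."""
    T = {((l1, l2), rho, (t1, l2))
         for (l1, rho, t1) in p1["T"] for l2 in p2["L"]}
    T |= {((l1, l2), rho, (l1, t2))
          for (l2, rho, t2) in p2["T"] for l1 in p1["L"]}
    return {"L": {(x, y) for x in p1["L"] for y in p2["L"]},
            "l0": (p1["l0"], p2["l0"]), "T": T,
            "F": {(x, y) for x in p1["L"] for y in p2["L"]
                  if x in p1["F"] or y in p2["F"]}}

def synchronous(p1, p2):
    """P1 ∘ P2: both components move, labels are conjoined."""
    T = {((l1, l2), ("and", r1, r2), (t1, t2))
         for (l1, r1, t1) in p1["T"] for (l2, r2, t2) in p2["T"]}
    return {"L": {(x, y) for x in p1["L"] for y in p2["L"]},
            "l0": (p1["l0"], p2["l0"]), "T": T,
            "F": {(x, y) for x in p1["L"] for y in p2["L"]
                  if x in p1["F"] or y in p2["F"]}}

def sequential(p1, p2):
    """P1 ; P2: control falls through from the final locations of P1."""
    bridge = {(l, "true", p2["l0"]) for l in p1["F"]}
    return {"L": p1["L"] | p2["L"], "l0": p1["l0"],
            "T": p1["T"] | p2["T"] | bridge, "F": p2["F"]}

a = {"L": {"a0", "a1"}, "l0": "a0", "T": {("a0", "x", "a1")}, "F": {"a1"}}
b = {"L": {"b0"}, "l0": "b0", "T": {("b0", "y", "b0")}, "F": set()}
print(len(parallel(a, b)["T"]))   # 3
```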
Example 3.9 (CFG Composition). Let P1 and P2 be the CFGs of threads 1 and 2 from Example 3.5 and Pg the CFG of the global process. Below is the composition P = Pg ; (P1 ∥ P2), where some of the transition labels were omitted to avoid clutter.
[Figure: the composed CFG. The global locations g1, g̃1, ĝ1 are followed, via a ⊤-labelled transition, by the 3 × 3 grid of product locations (li, mj) for li ∈ {l1, l2, l̂1} and mj ∈ {m1, m2, m̂1}; the visible labels are a′ = ι32, b′ = ι32, ⊤, a > 5, a ≤ 5, b < 20, b ≥ 20, b′ = b + 1, and a′ = a mod 5.] △
The CFGs represent programs in a control-explicit, data-symbolic manner. Explicit-state model checking, however, operates over fully explicit systems. In order to define model checking of programs we must first explicate the manipulation of data. A standard formalism for the representation of models of programs is the labelled transition system: effectively a CFG where each transition is labelled with the formula ⊤. This means that all the information about the variable values has to be stored in the control-flow locations. Since a large part of our work is concerned with reducing the size of the resulting CFG, we will represent each reduction technique as a process of generating the explicit CFG.
Definition 3.10 (Explicit CFG). The explicit control-flow graph is a CFG G = (S, s0, T, F), where for each s −σ→ s′ ∈ T it holds that σ = ⊤, and we simplify the notation to s → s′. From a CFG P = (L, l0, 𝒯, F) we can generate the explicit CFG G. The set of locations S is the set of evaluations of {pc} ∪ D. We define the generation as follows:
• S = {s : {pc} ∪ D → L ∪ ⋃_{x∈D} sort(x) | s(pc) ∈ L, ∀(x ∈ D), s(x) ∈ sort(x)}
• s0 = {(pc, l0)} ∪ {(x, 0) | x ∈ D}
• T = {s → s′ | s(pc) −ρ→ s′(pc) ∈ 𝒯, |=SAT ⋀_{x∈D} x = s(x) ∧ ρ ∧ ⋀_{x∈D} x′ = s′(x)}

• F = {s ∈ S | s(pc) ∈ F}

The above definition of transitions in the explicit CFG requires that the transition relation ρ holds between the evaluation in s and the evaluation in s′. Note that ρ may also refer to the input variables from I. Hence the fact that the formula is satisfiable, denoted by |=SAT, states that there is an evaluation of the input variables for which the formula is valid.
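For tiny, finite domains the generation of Definition 3.10 can be carried out by brute force. In the illustrative sketch below (our names, not thesis code), a Python predicate over the current and next evaluations stands in for the |=SAT check.

```python
# Illustrative sketch: generating the explicit CFG by enumerating all
# evaluations of the data variables. A transition s -> s' is kept
# whenever the label rho -- here a predicate rho(cur, nxt) over
# variable dictionaries, standing in for a BV formula -- holds.
from itertools import product

def explicit_cfg(cfg, variables, domain):
    """States are (location, values) pairs, values a tuple over
    `variables`; cfg["T"] holds (l, rho, l') triples."""
    evals = list(product(domain, repeat=len(variables)))
    states = {(loc, vals) for loc in cfg["L"] for vals in evals}
    trans = set()
    for (l, rho, l2) in cfg["T"]:
        for vals in evals:
            for vals2 in evals:
                cur = dict(zip(variables, vals))
                nxt = dict(zip(variables, vals2))
                if rho(cur, nxt):
                    trans.add(((l, vals), (l2, vals2)))
    init = (cfg["l0"], tuple(0 for _ in variables))
    final = {s for s in states if s[0] in cfg["F"]}
    return states, init, trans, final

# A one-variable counter modulo 4: l --x' = (x + 1) mod 4--> l
cfg = {"L": {"l"}, "l0": "l",
       "T": {("l", lambda c, n: n["x"] == (c["x"] + 1) % 4, "l")},
       "F": set()}
S, s0, T, F = explicit_cfg(cfg, ["x"], range(4))
print(len(S), len(T))  # 4 4
```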
3.3 LTL Model Checking
As stated in the introduction, LTL is a logic that allows developers to specify temporal properties of their programs. The logic is based on so-called atomic propositions, i.e. formulae from Φ that can be evaluated over locations of the explicit CFG.

Definition 3.11 (LTL Syntax). For an atomic proposition p ∈ Φ, we define the syntax of Linear Temporal Logic inductively as follows:
ψ ::= p | ¬ψ | ψ ∧ ψ | X ψ | ψ U ψ
Intuitively, X ψ states that ψ will hold in the next program state, and ψ1 U ψ2 that ψ1 will hold up to some point in the future at which ψ2 also holds. The full and precise semantics of LTL relates to a specific transition system, which must be defined first. Let us only add that there are other temporal operators that can be defined using X and U, e.g. F ψ := ⊤ U ψ or G ψ := ¬F¬ψ formalise the intuitive notion of formulae that hold at some point in the future or globally (in every program state), respectively.
Definition 3.12 (LTL Semantics). Let G = (S, s0, T, F) be an explicit CFG and ψ an LTL formula. A run (or trace) w of G is an infinite sequence of explicit locations w = (s0, s1, . . .), and wi denotes its i-th suffix, i.e. wi = (si, si+1, . . .). The satisfaction of ψ in a run w is defined inductively, according to the structure of ψ:
w |= p ⇔ s0 |= p
w |= ¬ψ ⇔ w ⊭ ψ
w |= ψ1 ∧ ψ2 ⇔ w |= ψ1 and w |= ψ2
w |= X ψ ⇔ w1 |= ψ
w |= ψ1 U ψ2 ⇔ ∃(i ∈ N0), wi |= ψ2 ∧ ∀(j < i), wj |= ψ1

Example 3.13 (Temporal Operators). The four runs below illustrate the meaning of the LTL operators:

[Figure: four finite run prefixes, one satisfying X p (p holds in the second state), one satisfying F p (p holds in some later state), one satisfying G p (p holds in every state), and one satisfying p U q (p holds in every state until a state in which q holds).]
For each formula on the left, the run on the right satisfies the formula. Note that these are only finite prefixes of infinite runs, and each run is only one of possibly many runs that satisfy the respective formula. The nodes without labels can evaluate atomic propositions to arbitrary values. The nodes with labels represent locations in which their labels evaluate to true, e.g. a node labelled with p evaluates the atomic proposition p to true, but any other proposition can be either true or false. With the exception of G, the depicted prefix alone attests to the satisfaction by the whole run, regardless of the suffix. Demonstrating the satisfaction of G, on the other hand, requires the complete infinite run. △
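The semantics of Definition 3.12 can be evaluated mechanically on ultimately periodic runs, i.e. a finite prefix followed by a repeated loop. The sketch below is illustrative (our encoding, not thesis code); it relies on the standard fact that on such a run a witness for U, if one exists, occurs before the loop has been traversed once past the end of the prefix.

```python
# Illustrative sketch: LTL semantics on an ultimately periodic run
# w = prefix . loop^omega, given as two lists of states (sets of
# atomic propositions holding there). Formulae are nested tuples.

def state(w, i):
    prefix, loop = w
    return prefix[i] if i < len(prefix) else loop[(i - len(prefix)) % len(loop)]

def holds(w, i, f):
    prefix, loop = w
    op = f[0]
    if op == "true":
        return True
    if op == "ap":                      # atomic proposition
        return f[1] in state(w, i)
    if op == "not":
        return not holds(w, i, f[1])
    if op == "and":
        return holds(w, i, f[1]) and holds(w, i, f[2])
    if op == "X":
        return holds(w, i + 1, f[1])
    if op == "U":
        # A minimal witness, if any, lies within one loop traversal
        # past max(i, |prefix|); psi1 is checked up to the earliest one.
        for j in range(i, max(i, len(prefix)) + len(loop) + 1):
            if holds(w, j, f[2]):
                return all(holds(w, k, f[1]) for k in range(i, j))
        return False
    if op == "F":                       # F psi := true U psi
        return holds(w, i, ("U", ("true",), f[1]))
    if op == "G":                       # G psi := not F not psi
        return not holds(w, i, ("F", ("not", f[1])))
    raise ValueError(f"unknown operator {op}")

# A run where p holds only in the third state:
w = ([set(), set(), {"p"}], [set()])
print(holds(w, 0, ("F", ("ap", "p"))))  # True
print(holds(w, 0, ("G", ("ap", "p"))))  # False
```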
We overload the entailment relation |= to denote (1) that a trace w satisfies an LTL formula ψ and (2) that an evaluation of variables from X is a model of a BV formula over variables from Y ⊆ X. Finally, the whole CFG G satisfies ψ if all of its runs do. The task of an LTL model checker is to provide an answer from {correct, (incorrect, counterexample)} for an input pair (G, ψ), where counterexample is a run of G that does not satisfy ψ.
Example 3.14 (Büchi Automaton). We continue our running example using the system P shown in Example 3.9. Below is one of the runs w of P. We represent a transition (s, ρ, s′) as s → s′, omitting ρ to ease the exposition. Let us assume that we are considering the satisfaction of the LTL formula ψ : G b < 12.
       w0     w1     w2     w3           ...   w7
pc ↦   g1     g̃1     ĝ1     (l1, m1)     ...   (l̂1, m1)
a  ↦   0      6      6      6            ...   3
b  ↦   0      0      11     11           ...   12
The run w does not satisfy ψ: even though b < 12 holds in the first seven states, the eighth state is not a model of b < 12, and thus even this finite prefix demonstrates that w does not satisfy ψ. Note that w is only one of many runs of the system; in fact, there are exactly 2^32 distinct successors of the first state of w. △
3.3.1 Explicit-State Model Checking

A classical approach to verifying LTL properties is automata-based model checking (see the relevant parts of Section 2 for other approaches). Informally, for an LTL formula ψ one can construct an ω-automaton [CGP99] – an automaton over infinite words – that accepts exactly those words that model the formula ψ. Hence the automaton for the formula ¬ψ accepts exactly the counterexamples to the formula ψ. It follows that if a product automaton of the program P and the automaton for ¬ψ is constructed, then it accepts exactly those counterexamples to ψ that are legal executions of P. The verification is thus reduced to locating such accepted words. The ω-automata used to represent LTL formulae are called Büchi automata, but their structure is similar to CFGs and thus we will not introduce a new formalism. For our purposes it will suffice to define the set of words accepted by a CFG.
Definition 3.15 (CFG Acceptance). A CFG P = (L, l0, 𝒯, F) accepts a trace w = (s0, s1, . . .) if and only if there is an infinite sequence of locations α = (l0, l1, . . .) such that
• ∀(i ∈ N), li −ρ→ li+1 ∈ 𝒯 ∧ w(i) |= ρ
• ∀(i0 ∈ N), ∃(i ∈ N), i > i0 ∧ α(i) ∈ F

The first condition requires that each state in w satisfies the label of the corresponding transition in P, while the second condition requires that the sequence α visits the set of final locations F infinitely many times. It is often useful to denote which LTL formula is represented by a particular CFG. For that purpose we will use the notation Aϕ to denote a CFG that accepts all traces w that satisfy ϕ. Since we then interpret CFGs as automata, we can equally say that w is a word in the language accepted by Aϕ, denoted as w ∈ L(Aϕ). Henceforth, we will not need to distinguish between the original program and its CFG representation. Assuming that PA is the CFG for the LTL property being verified, we fix the notation P = (P1 ∥ . . . ∥ Pn) ∘ PA = (L, l0, 𝒯, F) and only refer to P as the input for model checking. Words accepted by P are the witnesses of the program's failure to meet its specification. Thus if the set of accepted words is empty then the program is correct. The existence of accepted words is equivalent to the existence of reachable accepting cycles in the explicit state space. A cycle is a closed path ζ = s1 → s2 → . . . → sn = s1, and ζ is accepting if ∃(1 ≤ i < n), si ∈ F. A state s is reachable if there is a path from s0 to s and, by extension, a cycle is reachable if any of its states is reachable. Classical explicit-state model checking thus reduces the LTL verification task to the location of accepting cycles within the transition system of the program/specification product. Traditionally, a system as a whole is considered to satisfy an LTL formula if all its executions (all infinite words over the states of its CFG) do. However, we might occasionally also be interested in the question whether at least one of a system's executions satisfies the given LTL formula.
We thus suggest explicit path quantification using the ∀ and ∃ quantifiers as follows: for a system model M and a formula ϕ, we write M |= ∀ϕ if all executions of M satisfy ϕ, and M |= ∃ϕ if there is at least one execution of M satisfying ϕ. We call the former kind of path-quantified formulae universal and the latter kind existential. There are several notes to be made. First, we do not allow nesting of path-quantified formulae, as we want to avoid the complexity of dealing with full CTL*. We only occasionally use the negation operator, with the meaning ¬∀ϕ ≡ ∃¬ϕ and vice versa. Second, one of the advantages of making the hidden universal quantification of LTL explicit is to reduce the confusion over the fact that LTL seemingly does not follow the law of excluded middle: we can have a system model that satisfies neither ϕ nor ¬ϕ. Third, using path-quantified LTL formulae is really just a convenience, as it does not make the verification task any harder. Indeed, to verify whether a system model satisfies ∃ϕ is the same as to verify whether the system model satisfies ∀¬ϕ and invert the answer. Efficient verification of that satisfaction, however, requires a more systematic approach than enumeration of all executions. An example of a successful approach is the enumerative approach using Büchi automata.

Example 3.16 (Specification and Program Product). Let us conclude the demonstration started in Example 3.5 by presenting the CFG PA (on the left) for verifying the formula ψ : G b < 12, i.e. representing the runs that satisfy its negation ¬ψ = F(b ≥ 12). The sourceless arrow points to the initial location.

[Figure: left – the automaton PA with initial location a0 carrying a self-loop labelled b < 12, a transition labelled b ≥ 12 to the accepting location a1, and a ⊤-labelled self-loop on a1; right – the path s0 ∪ [pcA ↦ a0] → . . . → s6 ∪ [pcA ↦ a0] → s7 ∪ [pcA ↦ a1], with a self-loop on the last state.]
Assuming that w = (s0, s1, s2, . . .) is the run from Example 3.14, the path on the right represents a reachable accepting cycle in the explicit CFG for the synchronous product of PA and the program from Example 3.9. This cycle can be mechanically detected, thus demonstrating the program's incorrectness with respect to ψ. △

Without going into too many implementation details, we shortly describe the internal functioning of an explicit model checker (a more detailed exposition can be found at the end of Section 5). Assume that the input is the composition (P1 ∥ . . . ∥ PN) ∘ PA, but only in an implicit form: the parallel processes and the specification automaton are known, but the state space of the composition has to be generated. During accepting cycle detection, the model-checking algorithm generates successors of individual states using the above rule. The set of visited states is maintained in an explicit storage, and every newly generated state is tested for having been previously visited. Based on the specific cycle detection algorithm, the state space is traversed in some order until either an accepting cycle is found or no new state can be generated.
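One classical accepting-cycle detection algorithm fitting this description is the nested depth-first search. The sketch below is illustrative only, not the implementation used in this thesis; states are generated on the fly from a successor function, as described above.

```python
# Illustrative sketch: nested depth-first search for a reachable
# accepting cycle. `succ` maps a state to its successors, `accepting`
# marks final states; names are ours, not from the thesis tool.

def nested_dfs(init, succ, accepting):
    """Return True iff a reachable accepting cycle exists."""
    outer_visited, inner_visited = set(), set()

    def inner(seed, state):
        # Second search: look for a cycle through the accepting `seed`.
        for nxt in succ(state):
            if nxt == seed:
                return True
            if nxt not in inner_visited:
                inner_visited.add(nxt)
                if inner(seed, nxt):
                    return True
        return False

    def outer(state):
        outer_visited.add(state)
        for nxt in succ(state):
            if nxt not in outer_visited and outer(nxt):
                return True
        # Postorder: start the inner search from accepting states only.
        return accepting(state) and inner(state, state)

    return outer(init)

# Two-state example: s0 -> s1 -> s1 with s1 accepting.
graph = {"s0": ["s1"], "s1": ["s1"]}
print(nested_dfs("s0", graph.__getitem__, lambda s: s == "s1"))  # True
```

Sharing the inner search's visited set across all inner searches keeps the whole procedure linear in the state space; its correctness depends on the inner searches being launched in postorder.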
3.4 Sanity Checking
As described in the introduction, the model checking procedure is not designed to decide why a certain property was satisfied in a given system. That is a problem, however, because the reasons for satisfaction might be the wrong ones. If, for example, the system is modelled erroneously or the formula is not appropriate for the system, then it is still possible to receive a positive answer. These kinds of inconsistencies between the model and the formula are detected using sanity checking techniques, namely redundancy and coverage. In this thesis they will be described for comparison with their model-free versions; the interested reader should consult for example [Kup06] for more details. Let K be a CFG, ϕ a formula, and ψ its subformula. Let further ϕ[ψ′/ψ] be the modified formula ϕ in which the subformula ψ is substituted by ψ′. Then ψ does not affect the truth value of ϕ in K if the following property holds: K satisfies ϕ[ψ′/ψ] for every formula ψ′ iff K satisfies ϕ. A system K then satisfies a formula ϕ redundantly if K satisfies ϕ and there is a subformula ψ of ϕ such that ψ does not affect ϕ in K. A state s of a CFG K is q-covered by ϕ, for a formula ϕ and an atomic proposition q, if K satisfies ϕ but K̃s,q does not satisfy ϕ. Here K̃s,q stands for a CFG equivalent to K except that the valuation of q in the state s is flipped.

Example 3.17 (Model-Based Redundancy and Coverage). To demonstrate redundant satisfaction, let us consider a standard liveness formula ϕ : G(req ⇒ F resp). Assume that we try to verify that a system K, in which no request ever occurs, satisfies ϕ. Then setting ϕ⊤ : ϕ[resp ↦ ⊤] and ϕ⊥ : ϕ[resp ↦ ⊥] leads to K |= ϕ⊤ and K |= ϕ⊥, thus demonstrating the redundancy. To demonstrate coverage, let us consider the following system:
[Figure: a three-state system whose states are labelled (¬a, ¬b), (a, b), and (a, b); the middle state is named w.]

and the formula ϕ : ¬a U b. Here the state w is b-covered by ϕ, because K |= ϕ and K̃w,b ⊭ ϕ. On the other hand, w is not a-covered by ϕ, because K |= ϕ but also K̃w,a |= ϕ. △
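Assuming the system's only maximal behaviour is the depicted run (with the last state repeating forever) and reading the formula as ϕ : ¬a U b, the coverage check of Example 3.17 can be replayed by brute force. The mutation K̃s,q becomes a flip of one label; direct evaluation stands in for model checking, and all names below are ours (illustrative sketch only).

```python
# Illustrative sketch: coverage by label flipping. The system's single
# run is a finite prefix whose last state repeats forever; `until`
# evaluates "left U right" at position 0 (sufficient here, since any
# witness for `right` lies inside the prefix or nowhere).

def until(word, left, right):
    for j, st in enumerate(word):
        if right(st):
            return all(left(s) for s in word[:j])
    return False

def flipped(word, pos, prop):
    """The K~_{s,q} mutation: proposition `prop` toggled in state `pos`."""
    new = [set(s) for s in word]
    new[pos] ^= {prop}
    return new

# The three states of the figure: (¬a,¬b), w = (a,b), (a,b).
word = [set(), {"a", "b"}, {"a", "b"}]
phi = lambda w: until(w, lambda s: "a" not in s, lambda s: "b" in s)

print(phi(word))                    # True:  K satisfies ¬a U b
print(phi(flipped(word, 1, "b")))   # False: w is b-covered
print(phi(flipped(word, 1, "a")))   # True:  w is not a-covered
```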
If one could extract the underlying ideas of redundancy and coverage and show that they do not necessarily require the existence of a model, it would be possible to extrapolate these notions to the model-free environment. Redundancy states that the satisfaction of a formula is given extrinsically (by an environment) and is not related to the formula itself. Thus we may consider the remaining formulae from the set of requirements to constitute the environment. Coverage, on the other hand, attempts to capture the amount of system behaviour that is described by the formulae. Again, we can use the remaining formulae to form the system, but in order to establish a coverage metric capable of differentiating various formulae, we will need to extend the notion from state-coverage to path-coverage. Although these notions of sanity are dependent on the system being verified, we might still utilise at least the notion of redundancy in a model-less setting, using the concept of redundancy witnesses. A redundancy witness for a given formula ϕ is a formula ψ with the following property: if a system K satisfies ψ, then it satisfies ϕ redundantly. A standard example is the request-response formula G(req ⇒ F resp), "every request is followed by a response", whose two redundancy witnesses are G(¬req), "no request ever happens", and GF(resp), "responses happen infinitely often". The redundancy witnesses for a given formula can be generated automatically, see [BBDER01]. Moreover, as it is desirable for the redundancy witnesses not to hold, we may include this requirement in our original set of requirements with the help of path-quantified LTL formulae. For example, if our set of requirements contains the formula ∀ G(req ⇒ F resp), we add the formulae ∃ F(req) and ∃ F G(¬resp) to the set – these are the negations of the redundancy witnesses. In Section 4, we translate the concepts of model-based sanity into the model-free environment.
We further supplement these notions with consistency verification, thus arriving at a method that aids the creation of a reasonable set of requirements, i.e. a set such that it is reasonable to ask whether a system satisfies it or not. If, however, the intention is to build a system based on this set, then a further property of such a set is desirable: that the set completely describes the system under development. Consequently, a sane set of requirements is consistent, without redundancies, and complete in a sense we will formalise in Section 4.
Chapter 4
Checking Sanity of Software Requirements
The requirements stage is the earliest of the software development stages, and as various studies have concluded, undetected errors made early in the development cycle are the most expensive to eradicate. The impact of this thesis reaches to this early stage of development by mechanically verifying certain crucial properties of requirements. Our sanity checking method guides the requirements engineers in producing a collection of requirements that may serve in future stages as the specification against which further analysis is run. Hence our overall contribution amounts not only to providing a verification method but also to ensuring that one is verifying relevant properties. It is thus very important that the outcome of the requirements stage – a database of well-formed, traceable requirements – is what the customer intended and that nothing was omitted (not even unintentionally). While a procedure that would ensure these two properties cannot be automated, our work on requirements analysis proposes a methodology to check the sanity of requirements. In the following, sanity checking will be considered to consist of three related tasks: consistency, redundancy, and completeness checking. As consistency and redundancy are more closely related, we focus on them in Section 4.1. Completeness checking is a little more involved than both consistency and redundancy checking: it is in fact the part of sanity checking that provably cannot be fully automated. Hence we will first describe the problem and then detail the proposed semi-automatic solution. We dedicate Section 4.2 solely to completeness. Through our collaboration with the Honeywell R&D department, we were able to obtain industrial requirement documents, which form the basis of one of our experimental studies, described in Section 4.3.
4.1 Consistency and Redundancy
Definition 4.1 (Consistency and Redundancy). A set Γ of LTL formulae over AP is consistent if ∃w ∈ (2^AP)^ω : w satisfies ⋀Γ. Checking the consistency of a set Γ entails finding all minimal inconsistent subsets of Γ. A formula ϕ is redundant with respect to a set of formulae Γ if ⋀Γ ⇒ ϕ. To check the redundancy of a set Γ entails finding all pairs ⟨ϕ ∈ Γ, Φ ⊆ Γ⟩ such that Φ is consistent and ⋀Φ ⇒ ϕ (and for no Φ′ ⊊ Φ does it hold that ⋀Φ′ ⇒ ϕ).
The existence of an appropriate w can be tested by constructing A⋀Γ and checking that L(A⋀Γ) is non-empty. The procedure is effectively equivalent to model checking where the
model is a clique over the graph with one vertex for every element of 2^AP (allowing every possible behaviour). This approach to consistency and redundancy is especially valuable if a non-trivial set of requirements needs to be processed: standard sanity checking would only reveal that there is an inconsistency (or redundancy) but would not be able to locate its source. Furthermore, dealing with larger sets of requirements entails the possibility that there will be several inconsistent subsets, or that a formula is redundant due to multiple small subsets. Each of these conflicting subsets needs to be considered separately, which can be facilitated using the methodology proposed in this work.

Example 4.2 (Consistent and Redundant Requirements). Let us assume that there are five requirements formalised as LTL formulae over a set of atomic propositions (two-valued signals) {p, q, a}. They are:

1. Eventually, the presence of signal p will entail its continued presence until the signal q is observed. ϕ1 = F(p ⇒ p U q)

2. The signal p will occur infinitely often. ϕ2 = GF p

3. The signals a and p cannot occur at the same time. ϕ3 = G ¬(a ∧ p)

4. Each occurrence of the signal q requires that signal a was observed in the previous time step. ϕ4 = G(X q ⇒ a)

5. The signal q will occur infinitely often. ϕ5 = GF q

In this set, the formula ϕ4 is inconsistent together with the first three formulae, and the last formula is redundant with respect to (implied by) the first two formulae. △

We now reformulate the notions of Definition 4.1 to extend also to path-quantified LTL formulae. Again, we are interested in finding the minimal inconsistent subsets and minimal redundancy witnesses.

Definition 4.3 (Path-Quantified Consistency and Redundancy). A set Γ of path-quantified LTL formulae over AP is consistent if there is a system model M such that M |= ϕ for all ϕ ∈ Γ. A path-quantified formula ϕ is redundant with respect to a set of formulae Γ if every model M satisfying all formulae of Γ also satisfies ϕ.

We now show that checking consistency and redundancy in the path-quantified setting can be easily reduced to the standard setting. Let us partition the set Γ into existential formulae ∃Γ and universal formulae ∀Γ, i.e. ∃Γ := Γ ∩ {∃ϕ}, for ϕ quantifier-free, and ∀Γ := Γ \ ∃Γ. Further, let Ψ1∃ := {Φ ∪ {ϕ} | Φ ⊆ ∀Γ, ϕ ∈ ∃Γ}.

Theorem 4.4 (Locating Inconsistent Subsets). In order to locate all smallest inconsistent subsets of Γ it suffices to search among Ψ1∃. Similarly, checking redundancy may correctly be limited to checking among the pairs

⟨ϕ ∈ Γ, Φ ⊆ ∀Γ⟩ for ϕ universal,
⟨∀ψ, Φ ∈ Ψ1∃ ∪ ∀Γ⟩ for ϕ existential, ϕ = ∃ψ.

Proof. Let us first consider consistency. Clearly, if Γ only consists of universal formulae, we may simply ignore their quantifiers. In the case of Γ containing existential formulae, we can make the following two observations:
• First observation: Two existential formulae are always consistent provided that they are both self-consistent, i.e. neither of them is equal to false. This is easily seen as we can always provide a model with exactly two paths, each satisfying one of the formulae. This observation tells us that every minimal inconsistent subset contains at most one existential formula.
• Second observation: Let us write Γ∀ to denote the set Γ in which all path quantifiers have been changed to ∀. If the set Γ contains at most one existential formula then the following holds: Γ is consistent iff Γ∀ is consistent. One direction is obvious, the other follows from the fact that we can always provide a single-path model as a proof of consistency of Γ∀.
The result of the two observations is that if we are careful never to consider sets with more than one existential formula, we may freely ignore the path quantifiers. Let us now consider redundancy. It is clear that the following holds: ϕ is redundant with respect to Γ iff Γ ∪ {¬ϕ} is inconsistent. This means that redundancy checking is easily reduced to consistency checking. This fact holds in the path-quantified setting as well as in the standard one and will be used later in the implementation. Note that the first observation above has the following consequences: if ϕ is a universal formula, then it makes sense to consider Γ with universal formulae only, as ¬ϕ is existential. Furthermore, if ϕ is an existential formula, it makes sense to consider Γ containing at most one existential formula.
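The reduction used here — redundancy of ϕ with respect to Γ as inconsistency of Γ ∪ {¬ϕ} — can be illustrated on plain propositional formulae, where satisfiability is decidable by truth-table enumeration. The helper names below (`is_consistent`, `is_redundant`) are illustrative only and not part of the thesis implementation:

```python
from itertools import product

def is_consistent(formulas, atoms):
    """A set of formulas is consistent iff some valuation satisfies all of them."""
    return any(all(f(dict(zip(atoms, vals))) for f in formulas)
               for vals in product([False, True], repeat=len(atoms)))

def is_redundant(phi, gamma, atoms):
    """phi is redundant w.r.t. gamma iff gamma + {not phi} is inconsistent."""
    return not is_consistent(list(gamma) + [lambda v: not phi(v)], atoms)

atoms = ["p", "q"]
gamma = [lambda v: v["p"],                  # p
         lambda v: (not v["p"]) or v["q"]]  # p => q
phi = lambda v: v["q"]                      # q
print(is_redundant(phi, gamma, atoms))      # q follows from {p, p => q} -> True
```

The same scheme carries over to the LTL setting once the satisfiability oracle is replaced by an emptiness check on a Büchi automaton.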
From the above theorem it follows that all formulae that require consideration are universal, which is the implicit quantifier of quantifier-free formulae. In other words, the two statements above enable the same algorithms for checking sanity, described in the next section, to be used in the path-quantified setting as well; the validity of the two statements, however, required the more detailed argumentation above. Before we come to the implementation, let us illustrate the interplay between the concepts discussed so far (model-less consistency, redundancy, and redundancy witnesses) on a more comprehensive example.
Example 4.5 (Redundancy Witnesses). Consider a set of requirements containing just two items:
1. If a signal a is ever activated, it will remain active indefinitely, i.e. in every time step it holds that if a is active now it will also be active in the next time step. Translating this natural language requirement to LTL we arrive at: ϕ1 = G(a ⇒ X a)
2. If a signal a is ever activated, it will not be active in the next time step, i.e. a must never hold in two successive steps. After translation the appropriate LTL formula thus stands: ϕ2 = G(a ⇒ X ¬a)
Note that although such requirements might seem inconsistent at first glance, they are, in fact, consistent, as any consistency checking algorithm could demonstrate. What the following redundancy analysis reveals is that, while consistent, it is not reasonable to require both formulae to hold at the same time. Let us elaborate and formalise the reasoning steps of redundancy checking. We begin by constructing the redundancy witnesses according to the method in [BBDER01]. The redundancy witnesses for ϕ1 are: G(¬a), G X a. The first witness is quite intuitive: if the signal a is never activated in a system then requiring ϕ1 to hold is unreasonable. Similarly, if at
Algorithm 1: Consistency Detection

 1  while t ← getTask() do
 2      if t.checked() then
 3          genSuccs(t)
 4      else
 5          t.checked ← verCons(t)
 6          updateSets(t)
 7          Pool.enqueue(t)

Algorithm 2: Task Successors

    genSuccs(Task t)
 1  if t.con() then
 2      if t.dir = ↑ then
 3          genSupsets(t)
 4  else
 5      if t.dir = ↓ then
 6          genSubsets(t)
every time step we have that a will be active in the next step, then clearly the presence of a in this time step is irrelevant. Exactly the same kind of mechanisable reasoning leads to the redundancy witnesses for ϕ2: G(¬a), G X(¬a). We thus create the new requirements as described in the previous section. The set of requirements now contains ∀ϕ1, ∀ϕ2, ∃ F a, ∃ F X(¬a), ∃ F X a. Take for example ∃ F X(¬a), resulting from the negated witness G X a. Being existentially quantified, this requirement demands the existence of at least one computation of the system, one on which a state whose successor does not observe a will eventually occur. Note that we have converted the implicitly quantified formulae into universal ones and ignored the duplication of the redundancy witnesses. We now check the redundancy of the requirements and discover that the formula ∃ F X a implies ∃ F a; we thus drop ∃ F a from the set. The remaining four formulae are checked for consistency. We find {∀ϕ1, ∀ϕ2, ∃ F X a} as the only minimal inconsistent subset. The conclusion is that although ϕ1 and ϕ2 are together consistent, the only models that satisfy both of them satisfy them redundantly. The two formulae can thus be said to be redundantly consistent.
4.1.1 Implementation of Consistency Checking

Let us henceforth denote one specific instance of consistency (or redundancy) checking as a check. For consistency of a set Γ it means to check, for some γ ⊆ Γ, whether ⋀ γ is satisfiable. For redundancy it means, for γ ⊆ Γ and ϕ ∈ Γ, to check whether ⋀ γ ∧ ¬ϕ is satisfiable (if it is not, ϕ is redundant with respect to γ). In the worst case both consistency and redundancy checking require an exponential number of checks. The proposed algorithm, however, considers previous results and only performs the checks that are actually needed.

Both consistency and redundancy checking use three data structures that facilitate the computation. First, there is the queue of verification tasks called Pool; then there are two sets, Con and Incon, which store the consistent and inconsistent combinations found so far. Finally, each individual task contains a set of integers (uniquely identifying formulae from Γ) and a flag value (three bits for three binary properties): first, whether the satisfaction check was already performed; second, whether the combination is consistent; and third, the direction in the subset relation (up or down in the Hasse diagram) in which the algorithm will continue. The successors of a task will be either subsets or supersets of the current combination.

The idea behind consistency checking is very simple (listed as Algorithm 1). The pool contains all the tasks to be performed and these tasks are of two types: either to check
Algorithm 3: Task Supersets

    genSupsets(Task t)
 1  foreach i ∈ {1, . . . , n} do
 2      t.add(i)
 3      if ∀X ∈ Con : X ⊉ t ∧
 4         ∀X ∈ Incon : X ⊈ t then
 5          enqueue(t)
 6      t.erase(i)

Algorithm 4: Consistency Check

    verCons(t = ⟨i1, . . . , ij⟩)
 1  F ← createConj(ϕi1, . . . , ϕij)
 2  A ← transform2BA(F)
 3  return findAccCycle(A)
consistency of the combination or to generate its successors. The symmetry of the solution allows for parallel processing (multiple threads performing Algorithm 1 at the same time), provided that the data structures are properly protected from race conditions. The pool is initialised with all single-element subsets of Γ and with Γ itself; subsequent iterations thus check the supersets of the former and the subsets of the latter.

Algorithm 2 is called when the task t on the top of Pool is already checked. At this point either all subsets or all supersets of t should be enqueued as tasks. But not all successors need to be inspected: e.g. if t is consistent then all its subsets are consistent as well, so no subset of t needs to be checked. That observation is utilised again in Algorithm 3. It does not suffice to stop generating subsets and supersets when the immediate predecessors are found consistent (inconsistent), because the combination to be checked may have been formed in a different branch of the Hasse diagram of the subset relation. In order to prevent redundant satisfiability checks, two sets, Con and Incon, are maintained (see their use on lines 3–4 of Algorithm 3).

The actual consistency (and quite similarly also redundancy) checking is less complicated (see Algorithm 4) and may even be delegated to a third-party tool. First, the conjunction of the formulae encoded in the task is created. In our setting we then check the corresponding Büchi automaton for the existence of an accepting cycle, using nested DFS [CVWY92]. Hence Algorithm 4 can easily be substituted with another method for checking consistency of a set of LTL formulae. A realisability checking tool, such as RATSY, could also be applied: one only needs to state that all signals are controlled by the system.

Example 4.6 (Consistency Checking Algorithm).
Let us assume that we want to locate all smallest inconsistent subsets of the 4-element set of requirements S = {a, b, c, d}. To better demonstrate the search, let {a, d}, {b, d}, and {c, d} be those smallest subsets. Our algorithm starts by adding the whole S and all the singleton subsets to the pool of tasks. During the top-down search, S is found inconsistent and thus all its subsets are added. {a, b, c} is consistent and thus all its subsets are removed from the pool, namely the singletons {a}, {b}, and {c}. During the bottom-up search, {d} is found consistent but its superset {c, d} is not, and thus both {b, c, d} and {a, c, d} are removed from the pool. The situation is illustrated in Figure 4.1.
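The subset pruning illustrated in this example can be sketched in a few lines. The sketch below is a sequential, bottom-up simplification of Algorithms 1–3 (no top-down phase, no parallelism), and the `inconsistent` oracle is a hypothetical stand-in for the Büchi-automata check of Algorithm 4, here hard-coded to the situation of Example 4.6:

```python
from itertools import combinations

def minimal_inconsistent_subsets(universe, inconsistent):
    """Enumerate candidate subsets by increasing size; any candidate that
    contains an already-found inconsistent subset cannot be minimal and
    is skipped (the role of the Incon set in Algorithm 3)."""
    found = []
    for size in range(1, len(universe) + 1):
        for cand in combinations(sorted(universe), size):
            cs = frozenset(cand)
            if any(m <= cs for m in found):   # pruned via known results
                continue
            if inconsistent(cs):
                found.append(cs)
    return [tuple(sorted(m)) for m in found]

# Hypothetical oracle reproducing Example 4.6, where the smallest
# inconsistent subsets of S = {a, b, c, d} are {a,d}, {b,d} and {c,d}.
oracle = lambda s: "d" in s and bool(s & {"a", "b", "c"})
print(minimal_inconsistent_subsets({"a", "b", "c", "d"}, oracle))
# → [('a', 'd'), ('b', 'd'), ('c', 'd')]
```

The thesis algorithm additionally searches top-down from Γ and caches consistent combinations in Con, which pays off when large subsets are consistent.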
Implementation of Redundancy Checking

The only difference when performing the redundancy checking is that the task t consists of a list ⟨i1, . . . , ij⟩, which can be empty, and one index ik. Since the task is to decide whether
[Figure 4.1 appears here: the Hasse diagram of the subsets of S = {a, b, c, d}, from ∅ up to S, with levels {a, b, c, d}; {ab, ac, bc, ad, bd, cd}; {abc, abd, bcd, acd}.]

Figure 4.1: Illustration of the consistency checking algorithm. Red denotes the inconsistent subsets and green the consistent ones. The lightly shaded regions mark subsets which the algorithm decided without checking their consistency.
ϕi1 ∧ . . . ∧ ϕij ⇒ ϕik holds, line 1 of Algorithm 4 needs to be altered to:

F ← createConj(ϕi1, . . . , ϕij, ¬ϕik)

This is due to the fact discussed in the previous section, namely that the satisfiability of ϕi1 ∧ . . . ∧ ϕij ∧ ¬ϕik implies ϕi1 ∧ . . . ∧ ϕij ⇏ ϕik, i.e. ϕik is not redundant with respect to
{ϕi1, . . . , ϕij}.

Discarding the Büchi automata in every iteration may seem unnecessarily wasteful, especially since the synchronous composition of two (or more) automata is a feasible operation. However, the size of an automaton created by composition is the product of the sizes of the automata being composed. Furthermore, it would not be possible to use the size-optimising techniques employed in LTL-to-Büchi translation. And these techniques work particularly well in our case, because the translated formulae (conjunctions of requirements) have a relatively small nesting depth (the maximal depth among the requirements + 1).
Example 4.7 (Smallest Inconsistent Subset). We will demonstrate the search for smallest inconsistent subsets on the following set of requirements.
1. A request to increase the cabin temperature a may always be issued and each such request will eventually be granted by heating unit b. ϕ1 = G(F a ∧ (a ⇒ F b))
2. There is only a finite amount of fuel for the heating unit. ϕ2 = F G(¬b)

3. The system must be robust to handle an overriding command c after which no request to increase the temperature may be issued. ϕ3 = F(c ∧ c ⇒ G ¬a)

4. There will be no request to increase the temperature initially, but every heating will be preceded by a request which must be turned off after one time unit once the heating starts. Also the request is always issued until the heating starts.
ϕ4 = G(a U b ∧ X b ⇒ (a ∧ X a ∧ X X ¬a)) ∧ ¬a

Our consistency checking algorithm begins by checking in parallel the individual requirements ϕ1, . . . , ϕ4 for consistency and also the whole set Γ = {ϕ1, . . . , ϕ4}. It discovers that ϕ4 is inconsistent on its own and that Γ is also inconsistent. The fact that ϕ4 is inconsistent is
employed by not considering any of the sets containing ϕ4, i.e. {ϕ1, ϕ4}, {ϕ2, ϕ4}, {ϕ3, ϕ4}, {ϕ1, ϕ2, ϕ4}, {ϕ1, ϕ3, ϕ4}, and {ϕ2, ϕ3, ϕ4}. Next, the algorithm checks the supersets of the largest consistent sets, i.e. {ϕ1, ϕ2}, {ϕ1, ϕ3}, and {ϕ2, ϕ3}. Discovering that both {ϕ1, ϕ2} and {ϕ1, ϕ3} are inconsistent, it terminates, since {ϕ1, ϕ2, ϕ3} is necessarily also inconsistent. The smallest inconsistent subsets are thus {ϕ4}, {ϕ1, ϕ2}, and {ϕ1, ϕ3}.
4.2 Completeness of Requirements
Let us assume that the user specifies three types of requirements: environmental assumptions ΓA, required behaviour ΓR and forbidden behaviour ΓF. The environmental assumptions represent the sensible properties of the world a given system is to work in, e.g. “The plane is either in the air or on the ground, but never in both these states at once”; in LTL this would read G(ground ⇔ ¬air). The required behaviour represents the functional requirements imposed on the system: the system will not be correct if any of these is not satisfied, e.g. “If the plane is ever to fly it must first be tested on the ground”: F air ⇒ (ground ∧ tested). Dually, the forbidden behaviour contains those patterns that the system must not display, e.g. “Once tested the plane should be put into the air within the next three time steps”: ¬tested U (air ∨ X air ∨ X X air).
Theory of Automata Coverage

Assume henceforth the following simplifying notation for Büchi automata: let f be a propositional formula over capital Latin letters A, B, . . ., barred letters Ā, B̄, . . ., and Greek letters α, β, . . ., where A substitutes ⋀_{γ∈ΓA} γ, Ā stands for ⋁_{γ∈ΓA} γ, and all Greek letters represent simple LTL formulae. Then A_f denotes the Büchi automaton that accepts all words satisfying the substituted f, e.g. A_{Ā∨B∧ϕ} accepts words satisfying ⋁_{γ∈ΓA} γ ∨ ⋀_{γ∈ΓB} γ ∧ ϕ. The automaton A_A thus describes the part of the state space the user is interested in and which the required and forbidden behaviour should together submerge. That is most commonly not the case with freshly elicited requirements and therefore the problem is the following: find sufficiently simple formulae over AP that would, together with the formulae for R and F, cover a large portion of A_A. In other words, to find such ϕ that A_{R∨F̄∨ϕ} covers as much of A_A as possible.

In order to evaluate the size of the part of A_A covered by a single formula, i.e. how much of the possible behaviour is described by it, an evaluation methodology for Büchi automata needs to be established. The plain enumeration of all possible words accepted by an automaton is impractical given the fact that Büchi automata operate over infinite words. Similarly, the standard completeness metrics based on state coverage [CKV01, TK01] are unsuitable because they do not allow for comparison of sets of formulae and they require the underlying model. Equally inappropriate is to inspect only the underlying directed graph, because Büchi automata for different formulae may have isomorphic underlying graphs. The methodology proposed in this thesis is based on the notion of almost-simple paths and a directed partial coverage function.
Definition 4.8 (Almost-Simple Path). Let G be a directed graph. A path π in G is a sequence of vertices v1, . . . , vn such that ∀i : (vi, vi+1) is an edge in G. A path is almost-simple if no vertex appears on the path more than twice. The notion of almost-simplicity is also applicable to words in the case of Büchi automata.
With almost-simple paths one can enumerate the behavioural patterns of a Büchi automaton without having to handle infinite paths. Clearly, it is a heuristic approach and a considerable amount of information will be lost, but since all simple cycles will be considered (and thus all patterns of the infinite behaviour) the resulting evaluation should provide sufficient distinguishing capacity (as demonstrated in Section 4.3.2). Knowing which paths are interesting, it is possible to propose a methodology that allows comparing two paths. There is, however, a difference between Büchi automata that represent a computer system and those built from LTL formulae alone (ones that should restrict the behaviour of the system). The latter automata use a different evaluation function ν̂ that assigns to every edge a set of literals. The reason is that the LTL-based automaton only allows those edges (in the system automaton) whose source vertex has an evaluation compatible with the edge evaluation (now in the LTL automaton).
Definition 4.9 (Path Coverage). Let A1 and A2 be two (LTL) Büchi automata over AP and let AP_L be the set of literals over AP. The directed partial coverage function Λ assigns to every pair of edge evaluations a rational number between 0 and 1, Λ : 2^{AP_L} × 2^{AP_L} → Q. The evaluation works as follows:

Λ(A1 = {l11, . . . , l1n}, A2 = {l21, . . . , l2m}) = 0 if ∃i, j : l1i ≡ ¬l2j, and p/m otherwise,

where p = |A1 ∩ A2|. From this definition one can observe that Λ is not symmetric. This is intentional, because the goal is to evaluate how much a path in A2 covers a path in A1. Hence the fact that there are some additional restricting literals on an edge of A1 does not prevent automaton A2 from displaying the required behaviour (the one observed in A1).

The extension of coverage from edges to paths and to automata is based on averaging over almost-simple paths. An almost-simple path π2 of automaton A2 covers an almost-simple path π1 of automaton A1 by Λ(π1, π2) = (Σ_{i=0}^{n} Λ(A1i, A2i)) / n per cent, where n is the number of edges and Aji is the set of labels on the i-th edge of πj. Then automaton A2 covers A1 by Λ(A1, A2) = (Σ_{i=0}^{m} max_{π2i} Λ(π1i, π2i)) / m per cent, where m is the number of almost-simple paths of A1 that end in an accepting vertex. It follows that a coverage of 100 per cent occurs when for every almost-simple path of one automaton there is an almost-simple path in the other automaton that exhibits similar behaviour.
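Under the (illustrative) assumption that a literal is represented as an (atom, polarity) pair, Definition 4.9 translates directly into code; the function and variable names below are ours, not the thesis implementation's:

```python
def edge_coverage(e1, e2):
    """Directed partial coverage Λ of Definition 4.9: a literal is a pair
    (atom, polarity); the result is 0 on contradiction, |e1 ∩ e2|/|e2| otherwise."""
    if any((atom, not pol) in e2 for atom, pol in e1):
        return 0.0                      # the two edges carry opposite literals
    return len(e1 & e2) / len(e2)

def path_coverage(p1, p2):
    """Λ extended to two equally long sequences of edge labels (averaging)."""
    return sum(edge_coverage(a, b) for a, b in zip(p1, p2)) / len(p1)

lit = lambda atom, pos=True: (atom, pos)

# Two edge comparisons (labels borrowed from Example 4.10 below):
print(edge_coverage({lit("a")}, {lit("a"), lit("d")}))                          # → 0.5
print(edge_coverage({lit("a", False), lit("b")}, {lit("c", False), lit("b")}))  # → 0.5
```

In the first comparison the excessive literal d halves the coverage; in the second, only b is shared among the two-literal labels, again yielding 0.5.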
Implementation of Completeness Evaluation

The high-level overview of the implementation of the proposed methodology is based on partial coverage of the almost-simple paths of A_A. In other words, for every almost-simple path in A_A we find the most suitable path in A_{R∨F̄∨ϕ}, where ϕ is a sensible simple LTL formula proposed as a candidate for completion. The suitability is assessed as the average of the partial coverage over all edges on the path. The output of such a procedure is a sequence of candidate formulae, each associated with the estimated coverage (a number between 0 and 1) the addition of this particular formula would entail. The candidate formulae are added in a sequence so that the best unused formula is selected in every round. Finally, the coverage is always related to A_A; thus, if some behaviour that cannot be observed in A_A is added with a candidate formula, this addition will neither improve nor degrade the coverage.
Example 4.10 (Automata Coverage). This example only shows the enumeration of almost-simple paths and the partial coverage of two paths. What remains to complete the methodology will be shown more structurally in Algorithm 5. Enumerating all almost-simple paths of A_a:

[automaton A_a over vertices 1–6 appears here]

we reach the following four:
1. h1, 2, 3, 5i
2. h1, 2, 3, 4, 2, 3, 5i
3. h1, 2, 3, 5, 6i
4. h1, 2, 3, 4, 2, 3, 5, 6i.
Let us assume that A_b:

[automaton A_b appears here; its edges are labelled with literal sets such as {a}, {¬a, b}, {c, a}, {a, b}]

is the original automaton and A_c:

[automaton A_c appears here; its edges are labelled {a, d}, {¬c, b}, . . .]

is being evaluated for how thoroughly it covers A_b. There are 4 almost-simple paths in A_b; one of them is π = ⟨a; ¬a, b; c, a; a⟩. The partial coverage between the first edge of π and the first edge of A_c (there is only one possibility) is 0.5, since there is the excessive literal d. The coverage between the second edges is also 0.5, but only because of ¬c in A_c; the superfluous literal ¬a restricts only the behaviour of A_b. Finally, the average similarity between π and the respective path in A_c is 0.75 and it is approximately 0.7 between the two automata.
The topmost level of the completeness evaluation methodology is shown as Algorithm 5. As input this function requires the three sets of user-defined requirements, the set of candidate formulae and the number of formulae the algorithm needs to select. On lines 1 and 2 the formulae for the conjunction of assumptions and for the user requirements (both required and forbidden) are created. They will later be used to form larger formulae to be translated into Büchi automata and evaluated for completeness but, for now, they need to be kept separate. The next step is to enumerate the almost-simple paths of A_A for later comparison, i.e. a baseline state space that the formulae from ΓCand should cover.

The rest of the algorithm forms a cycle that iteratively evaluates all candidates from ΓCand (see line 8, where the corresponding formula is formed). Among the candidate formulae
Algorithm 5: Completeness Evaluation

    Input: ΓA, ΓR, ΓF, ΓCand, n
    Output: Best coverage for 1 . . . n formulae from ΓCand
 1  γAssum ← ⋀_{γ∈ΓA} γ
 2  γDesc ← ⋀_{γ∈ΓR} γ ∨ ⋁_{γ∈ΓF} γ
 3  A ← transform2BA(γAssum)
 4  pathsBA ← enumeratePaths(A)
 5  for i = 1 . . . n do
 6      max ← −∞
 7      foreach γ ∈ ΓCand do
 8          γTest ← γDesc ∨ γ
 9          A ← transform2BA(γTest)
10          cur ← avrPathCov(A, pathsBA)
11          if max < cur then
12              max ← cur
13              γMax ← γ
14      print( “Best coverage in i-th round is max.” )
15      γDesc ← γDesc ∨ γMax
16      ΓCand ← ΓCand \ {γMax}
the one with the best coverage of the paths is selected and subsequently added to the covering system. The functions enumeratePaths and avrPathCov are similar extensions of the BFS algorithm. Unlike BFS, however, they do not keep the set of visited vertices, to allow state revisiting (twice in the case of enumeratePaths and an arbitrary number of times in the case of avrPathCov). The avrPathCov search is executed once for every path it receives as input and stops after inspecting all paths up to the length of the input path or if the current search path is incompatible (see Definition 4.9).
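The candidate-selection loop described above (lines 5–16 of Algorithm 5) can be sketched independently of the automata machinery. The `coverage` function below is a hypothetical stand-in for transform2BA followed by avrPathCov, here modelled with additive per-formula gains; names and values are illustrative only:

```python
def greedy_completion(candidates, coverage, rounds):
    """In each round pick the candidate whose addition yields the best
    coverage of the chosen set, then remove it from the pool."""
    chosen = []
    for _ in range(rounds):
        best = max(candidates, key=lambda g: coverage(chosen + [g]))
        chosen.append(best)
        candidates = [g for g in candidates if g != best]
    return chosen

# Hypothetical coverage model: each candidate has an independent gain.
gains = {"g1": 0.2, "g2": 0.5, "g3": 0.3}
cov = lambda selected: sum(gains[g] for g in selected)
print(greedy_completion(list(gains), cov, 2))   # → ['g2', 'g3']
```

In the real algorithm the gains are not independent: a candidate's contribution depends on the paths already covered, which is why the coverage is re-evaluated for every remaining candidate in every round.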
4.3 Sanity Checking Experiments
All three sanity checking algorithms were implemented as an extension of the parallel explicit-state LTL model checker DIVINE [BBČR10]. Of the many facilities offered by this tool, only the LTL-to-Büchi translation was used. Like the original tool, this sanity checking extension was implemented using parallel computation.
4.3.1 Generated Requirements
The first set of experiments was conducted on randomly generated LTL formulae. The motivation behind using random formulae is to demonstrate the reduction in the number of consistency checks. A naive algorithm for detecting inconsistent/redundant subsets requires an exponential number of such checks. Generating a large number of requirement collections
with varying relations among individual requirements yields more representative results than a few real-world collections. In order for the experiments to be as realistic as possible (avoiding trivial or exceedingly complex formulae), requirements with various nesting depths were generated. Nesting depth denotes the depth of the syntactic tree of a formula. Statistics about the most common formulae show, e.g. in [DAC98], that the nesting is rarely higher than 5 and is 3 on average. Following these observations, the generating algorithm takes as input the desired number n of formulae and produces: n/10 formulae of nesting 5, 9n/60 of nesting 1, n/6 of nesting 4, n/4 of nesting 2 and n/3 of nesting 3. Finally, the number of atomic propositions is also chosen according to n (it is n/3) so that the formulae all contribute to the same state space.

All experiments were run on a dedicated Linux workstation with a quad-core Intel Xeon 5130 @ 2 GHz and 16 GB RAM. The code was compiled with the optimisation option -O2 using GCC version 4.3.2. Since the running times and even the number of checks needed for completion of all proposed algorithms differ for every set of formulae, the experiments were run multiple times. The sensible number of formulae starts at 8: for fewer formulae the running time is negligible. Experimental tests for consistency and redundancy were executed for up to 15 formulae and for each number the experiment was repeated 25 times.

Figures 4.2a and 4.3a summarise the running times for the sanity checking experiments. For every set of experiments (on the same number of formulae) there is one box capturing the median, the extremes and the quartiles for that set of experiments. From the figures it is clear that despite the optimisation techniques employed in the algorithm, both the median and the maximal running times increase exponentially with the number of formulae.
On the other hand, there are some cases for which the presented optimisations prevented the exponential blow-up, as is observable from the minimal running times. Figures 4.2b and 4.3b illustrate the discrepancy between the number of combinations of formulae and the number of sanity checks that were actually performed. The number of combinations for n formulae is n · 2^(n−1), but the optimisation often led to a much smaller number. As one can see from the experiments on 9 formulae, it is potentially necessary to check almost all the combinations, but the method proposed in this thesis requires on average less than 10 per cent of the checks, and the relative number decreases with the number of formulae.

The actual running times, as reported in Figure 4.2a, depend crucially on the chosen method for consistency checking: the automata-based Algorithm 4 in our case. Our algorithm for enumerating all smallest inconsistent subsets is robust with respect to the consistency checking algorithm used. Hence any other consistency (or realisability) tool could be used; that would only affect the actual running times, not the number of checks reported in Figure 4.3b.

Finally, Figure 4.4 shows how much time our sanity checking methods required to perform a single check. Since a requirements document may contain more individual requirements than we have experimented with, the reader might appreciate the progression of time complexity as depicted in the figure.
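As a side note, the generator's split of the n formulae by nesting depth partitions n exactly (1/10 + 9/60 + 1/6 + 1/4 + 1/3 = 1), which can be checked mechanically:

```python
from fractions import Fraction

# Shares of formulae per nesting depth used by the generator.
shares = {5: Fraction(1, 10), 1: Fraction(9, 60), 4: Fraction(1, 6),
          2: Fraction(1, 4), 3: Fraction(1, 3)}
assert sum(shares.values()) == 1          # the five shares partition n

n = 60                                    # any multiple of 60 splits evenly
print({depth: int(share * n) for depth, share in shares.items()})
# → {5: 6, 1: 9, 4: 10, 2: 15, 3: 20}
```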
4.3.2 Case Study: Aeroplane Control System

In order to demonstrate the full potential of the proposed sanity checking, we apply all the steps in a sequence. First, the whole requirements document is checked for consistency and redundancy and only the consistent and non-redundant subset is then evaluated with respect to its completeness. A sensible exposition of the effectiveness of completeness evaluation
[Figure 4.2 appears here: two box plots of time (ms) and of the relative number of checks performed, against the number of formulae (7–16).]

Figure 4.2: Results for consistency checking of randomly generated formulae. (a) Log-plot summarising the time complexity. (b) Log-plot summarising the number of checks.
[Figure 4.3 appears here: two box plots of time (ms) and of the relative number of checks performed, against the number of formulae (7–16).]

Figure 4.3: Results for redundancy checking of randomly generated formulae. (a) Log-plot summarising the time complexity. (b) Log-plot summarising the number of checks.
[Figure 4.4 appears here: time (ms) per check against the number of formulae (7–16), with one curve for consistency and one for redundancy.]

Figure 4.4: Log-plot summarising the relative time complexity of sanity checking.
proposed here requires a more elaborate approach than using random formulae as the input requirements document.
For that purpose a case study has been devised that demonstrates the capacity to assess the coverage of requirements and to recommend suitable coverage-improving LTL formulae. The requirements document was written by a requirements engineer whose task was to evaluate the method, and we only use random formulae as candidates for coverage improvement. The candidate formulae were built from the atomic propositions that appeared in the input requirements and only very simple generated formulae were selected. It was also required that the candidates do not form an inconsistent or tautological set. Alternatively, pattern-based [DAC98] formulae could be used as candidates. The methodology is general enough to allow semi-random candidates generated using patterns from the input formulae. For example, if an implication is used as an input formula, its antecedent may never be required by the other formulae, which may not match the user's expectations.
The case study attempts to propose a system of LTL formulae that should control the flight, and more specifically the landing, of an aeroplane. The requirements are divided into 3 categories as in the text above. Below, we provide each requirement first as an LTL formula and then as its intuitive meaning. The grayed formulae were found either redundant or inconsistent and were not included in the completeness evaluation.
Atomic Propositions:
a ≡ [height = 0], b ≡ [speed ≤ 200], c ≡ [speed ≤ 100], l ≡ [landing], u ≡ [undercarriage]

LTL Requirements:
A1 : G(a ⇔ b)
A2 : F l ∧ G(l ⇒ F a)
A3 : G(¬l ⇒ ¬b)
A4 : G(u ⇒ F a ∧ u ⇒ c)
A5 : G(¬l ⇒ ¬a)
A6 : G(¬b ∨ F a)
R1 : G(l ⇒ F(G b ∧ F G c))
R2 : G(l ⇒ F(b U c))
R3 : G(¬b ⇒ ¬u)
R4 : F(l U(u U c))
R5 : F(l ∧ G(¬c ∨ F(¬b)))
R6 : F(¬b ∧ ¬u ∧ X(¬b ∧ u) ∧ X X(b ∧ u))
F1 : F(a ∧ G(¬a))
Environment Assumptions
A1 The plane is on the ground only if its speed is less or equal to 200 mph.
A2 The plane will eventually land and every landing will result in the plane being on the ground.
A3 If the plane is not landing its speed is above 200 mph.
A4 Whenever the undercarriage is down, first the speed must be at most 100 mph and second the plane must eventually land on the ground.
A5 Whenever the plane is not landing its height above ground is not zero.
A6 At any time during the flight it holds that either the plane is flying faster than 200 mph or it eventually touches the ground.
Required Behaviour
R1 The landing process entails that eventually: (1) the speed will remain below 200 mph indefinitely and (2) after some finite time the speed will also remain below 100 mph.
R2 The landing process also entails that the plane should eventually slow down from 200 mph to 100 mph (during which the speed never goes above 200 mph).
R3 Whenever flying faster than 200 mph, the undercarriage must be retracted.
R4 At some point in the future, the plane should commence landing, then lower the undercarriage, until finally the speed is decreased below 100 mph.
R5 In the future the plane will be decommissioned: it will land and then either never fly faster than 100 mph or always slow down below 200 mph.
R6 At some point in the future, the speed will be below 200 mph and the undercarriage retracted. Immediately before that, and while the speed is still above 200 mph, the undercarriage is lowered in two consecutive steps.
Forbidden Behaviour
F1 The plane will eventually be on the ground after which it will never take off again.
Consistency The three categories of requirements are checked for consistency and redundancy separately. The environmental assumptions form a consistent set. The set of required behaviour contained two smallest inconsistent subsets. The requirement R4 is inconsistent with R6; we chose R4 as the more important one and excluded R6 from the collection. The second inconsistent subset was {R1, R2, R5}. There we chose to preserve R1 and R2, leaving out the possibility to decommission a plane.
Redundancy After regaining consistency of the required behaviour, our sanity checking procedure detects no redundancy in this set. On the other hand, there are two redundant environmental assumptions. The assumption A5 is implied by the conjunction of A1 and A3 and is thus redundant. It also holds that any environment in which A2 and A3 are true also satisfies A6. Hence both A5 and A6 can be removed without changing the set of sensible behaviours of the environment.
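The first redundancy can be double-checked by hand: A1, A3 and A5 are each of the form G ψ with a propositional body ψ, and G ψ1 ∧ G ψ3 ⇒ G ψ5 holds whenever ψ1 ∧ ψ3 ⇒ ψ5 is a propositional tautology (a sufficient, not necessary, condition; the thesis tooling decides the general LTL case). A minimal truth-table check:

```python
from itertools import product

def tautology(f, atoms):
    """Check a propositional formula over the given atoms by truth table."""
    return all(f(dict(zip(atoms, vals)))
               for vals in product([False, True], repeat=len(atoms)))

atoms = ["a", "b", "l"]
a1 = lambda v: v["a"] == v["b"]          # body of A1: a <=> b
a3 = lambda v: v["l"] or not v["b"]      # body of A3: !l => !b
a5 = lambda v: v["l"] or not v["a"]      # body of A5: !l => !a
print(tautology(lambda v: not (a1(v) and a3(v)) or a5(v), atoms))  # → True
```

Intuitively: whenever the plane is not landing, A3 forces the speed above 200 mph, A1 then forces non-zero height, which is exactly what A5 requires.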
Completeness Initially, the required and forbidden behaviour together do not cover the environment assumptions at all, i.e. the coverage is 0. There are simply so many paths allowed by the assumptions but not considered in the requirements that our metric evaluated the coverage to 0. There is thus considerable space for improvement. The completeness improving process we are about to describe begins by generating a set of candidate requirements, among which those with the best resulting coverage are selected. Table 4.1 summarises the process.

The first formula selected by Algorithm 5, leading to a coverage of 9.1 per cent, was a simple ϕ1 = G(¬l): “The plane will never land”. Not particularly interesting per se, it nonetheless emphasises the fact that without this requirement, landing was never required. The formula ϕ1 should thus be included among the forbidden behaviour. Unlike ϕ1, the formula generated second, F(¬b ∧ ¬l): “At some point the plane will fly faster than 200 mph and will not be in the process of landing”, should be included among the required behaviour. This formula points out that the required behaviour only specifies what should happen after landing, unlike the assumptions, which also require flight. The third and final formula was F(¬l U(a ∨ b)): “Eventually it will hold that the landing does not commence until either: (1) the plane is already on the ground or (2) the speed has decreased below 200 mph”. This last formula connects flight and landing and its addition (among the required behaviour) entails a coverage of 54.9 per cent.

There are two conclusions one can draw from this case study with respect to the completeness part of sanity checking. First, although the requirements are generated automatically, they are relevant and already formalised, thus amenable to further testing and verification. Second, the new requirements can improve the insight of the engineer into the problem.
Both of these properties are valuable on their own, even though the method did not finish with 100 per cent coverage. The method produced further coverage-improving formulae, but even the best of them improved the coverage only negligibly, and the engineer therefore decided to stop the process.
4.3.3 Industrial Evaluation

As a further extension of the original paper [BBBČ12], we have evaluated our consistency and redundancy checking method on real-life sets of requirements obtained in cooperation with Honeywell International. The case study consisted of four Simulink models; each has been
Table 4.1: Iterations of the completeness evaluation algorithm for the Aeroplane Control System.
iteration  suggested formula       resulting system            coverage of A_A by A_n
1          ϕ1 = G(¬l)              A1 = A_{R∨F∨ϕ1}             9.1%
2          ϕ2 = F(¬b ∧ ¬l)         A2 = A_{R∨F∨ϕ1∨ϕ2}          39.4%
3          ϕ3 = F(¬l U (a ∨ b))    A3 = A_{R∨F∨ϕ1∨ϕ2∨ϕ3}       54.9%
Table 4.2: Results of consistency and redundancy checking on an industrial case study.
model      no. of    consistency                   redundancy
name       formulae  result  time(s)    mem(MB)    result     time(s)  mem(MB)
VoterCore  3         OK      0.1        21.6       OK         0.4      22.4
AGS        5         OK      0.1        23.5       ϕ4 ⇒ ϕ2    0.7      23.6
Lift       10        OK      73.6       58.5       ϕ1 ⇒ ϕ5    9460.5   1474.5
pbit       10        OK      210879.0   22649.0    —          —        —

associated with a set of formalised requirements in the form of LTL formulae. The formalisation of the requirements, originally given in natural language, was done with the help of the ForReq tool. The tool allows the user to write formal requirements using the specification patterns [DAC98] and some of their extensions. See [BBB+12] for a detailed description of the tool. Of the four models, one (Lift) was a toy model used in Honeywell to evaluate the capabilities of ForReq; the other three (VoterCore, AGS, pbit) were models of real-world avionics subsystems. As the focus of this work is model-less sanity checking, we ignored the models themselves and only considered the formalised requirements. The results of our experiments are summarised in Table 4.2. In all four cases, the whole set of requirements was consistent. As for redundancy, we have identified two cases where a formula was implied by another; these are shown in the redundancy result column. The numbers in the mem(MB) columns represent the maximal (peak) amount of memory used. The experiments, with the exception of the last one, were run on a dedicated Linux workstation with an Intel Xeon E5420 @ 2.5 GHz and 8 GB RAM. The last experiment (pbit) was run on a Linux server with an Intel X7560 @ 2.27 GHz and 450 GB RAM. The last experiment was a consistency check only, as the redundancy checking took too much time. The reason for such great consumption of time (and memory) is that the formulae in the pbit set of requirements are rather complex, some of them even using precise specification of time. For more details about the timed extension of LTL and its translation to standard LTL, see [BBB+12].
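The redundancy witnesses in Table 4.2 (e.g. ϕ4 ⇒ ϕ2) and the reported minimal inconsistent sets are found through repeated emptiness checks over conjunctions of requirements. Purely as an illustration of the subset search, the following sketch replaces the LTL emptiness oracle with a brute-force propositional one; the requirement predicates and variable count are invented for this example:

```python
from itertools import combinations, product

def consistent(formulas, n_vars):
    """Brute-force stand-in for the oracle: do the formulas share a
    satisfying assignment? Each formula is a predicate over a tuple of booleans."""
    return any(all(f(v) for f in formulas)
               for v in product([False, True], repeat=n_vars))

def minimal_inconsistent_subsets(formulas, n_vars):
    """Enumerate subsets by increasing size; keep the inconsistent ones
    that contain no previously found (hence smaller) inconsistent subset."""
    found = []
    for size in range(1, len(formulas) + 1):
        for idx in combinations(range(len(formulas)), size):
            if any(m <= set(idx) for m in found):
                continue  # a strict subset is already known inconsistent
            if not consistent([formulas[i] for i in idx], n_vars):
                found.append(set(idx))
    return found

# Toy requirements over two propositions (a, b):
reqs = [
    lambda v: v[0],            # r0: a
    lambda v: not v[0],        # r1: not a
    lambda v: v[0] or v[1],    # r2: a or b
    lambda v: not v[1],        # r3: not b
]
print(minimal_inconsistent_subsets(reqs, 2))  # → [{0, 1}, {1, 2, 3}]
```

The subset enumeration is exponential in the worst case, which mirrors why the Büchi-based oracle of the real implementation dominates the running times in Table 4.2.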
4.4 Closing Remarks
Our work on sanity checking further expands the incorporation of formal methods into software development. Aiming specifically at the requirements stage, we propose a novel approach to sanity checking of requirements formalised as LTL formulae. Our approach is comprehensive in scope, integrating consistency, redundancy and completeness checking to allow the user (or a requirements engineer) to produce a high-quality set of requirements more easily. The novelty of our consistency (redundancy) checking is that it produces all inconsistent (redundant) sets instead of a yes/no answer, and its efficiency is demonstrated in an experimental evaluation. Finally, the completeness checking presents a new behaviour-based coverage and suggests formulae that would improve the coverage and, consequently, the rigour of the final requirements. Especially in the earliest stages of requirements elicitation, when the engineers have to work with highly incomplete, and often contradictory, sets of requirements, our sanity checking framework could be of great use. Other realisability checking tools can also be used (and with better running times) on the nearly final set of requirements, but our process uniquely targets these crucial early stages. First, finding all inconsistent subsets instead of merely one helps to more accurately pinpoint the source of inconsistency, which may not have been elicited yet. Second, once working with a consistent set, the requirements engineer gets further feedback from the redundancy analysis, with every minimal redundancy witness pointing to a potential lack of understanding of the system being designed. Finally, the semi-interactive, coverage-improving generation of new requirements attempts to partly automate the process of making the set of requirements detailed enough to prevent later development stages from misinterpreting a less thoroughly covered aspect of the system. We demonstrate this potential application on case studies of avionics systems. The above-described addition to the standard elicitation process may seem intrusive, but the feedback we have gathered from requirements engineers at Honeywell was mostly positive. Especially given the tool support for each individual step of the process – pattern-based automatic translation from natural language requirements to LTL and the subsequent fully- and semi-automated techniques for sanity checking – a fast adoption of the techniques was observed.
Many of the engineers reported a non-trivial initial investment to master the basics of LTL, but in most cases the resulting overall savings in effort outweighed the initial adoption effort. After the first week, the overhead of tool-supported formalisation into LTL was less than 5 minutes per requirement, while the average time needed for the corresponding stages of requirements elicitation decreased by 18 per cent. We have also accumulated a considerable amount of experience from employing the sanity checking on real-world sets of requirements during our cooperation with Honeywell International. Of particular importance was the realisation that the time requirements of our original implementation effectively prevented the use of sanity checking on an industrial scale. In order to ameliorate the poor timing results, we have reimplemented the sanity checking algorithms using the state-of-the-art LTL to Büchi translator SPOT, which accelerated the process by several orders of magnitude on larger sets of formulae. Another important observation was that the concept of model-based redundancy needed to be translated into the model-free environment more directly. We thus extended the sanity checking process with the capacity to generate redundancy witnesses (using the formally described theory of path-quantified formulae). We summarise these and other observations in a case study consisting of four sets of requirements gathered during the development of industry-scale systems. One direction of future research is the pattern-based candidate selection mentioned above. Even though the selected candidates were relatively sensible in the presented experiments, using random formulae can produce useless results. Finally, experimental evaluation on real-life requirements and subsequent incorporation into a toolchain facilitating model-based development are the long-term goals of the presented research.
Our exposition of requirements completeness also lacks a formal definition of total coverage (which the proposed partial coverage merely approximates). We intend to formulate an appropriate definition using a uniform probability distribution, which would also allow computing total coverage without approximation and would not be biased by the concrete LTL to BA translation. That solution, however, is not very practical since the underlying automata translation is doubly exponential.
4.4.1 Alternative LTL to Büchi Translation
Although the running time experiments were one of our priorities, their purpose was to investigate the effects of the proposed optimisations rather than the concrete time measurements. In other words, the subject of our investigation was the asymptotic behaviour of the algorithms and how effectively they scale with the number of formulae, both on average and in the extremes. Given our recent cooperation with Honeywell and their interest in checking the sanity of real-world, industry-level sets of requirements, the actual running times became of particular interest as well. Initial experiments revealed, however, that on larger sets of requirements, and especially when more complex individual requirements were involved, the proposed sanity checking algorithms were unable to cope with the computational complexity. The bottleneck proved to be the algorithm translating LTL formulae – conjunctions of the original requirements – into Büchi automata. The translator incorporated in DiVinE is sufficiently fast on the small formulae that commonly occur in software verification but was never intended to be used with larger conjunctions. Thus, to enable checking the sanity of real-world sets of requirements, we have reimplemented the algorithms to employ the state-of-the-art LTL to Büchi translator SPOT [DL11] in a tool called Looney1. As the case study summarising our cooperation with Honeywell in Section 4.3 shows, Looney is able to process much larger formulae than the original tool and also to incorporate a larger number of formulae into the sanity checking process. The reimplementation of consistency checking was relatively straightforward since SPOT can be interfaced via a shared library with an arbitrary C++ application.
Even though this forced us to use SPOT's internal implementation of formulae, which differed from the one we had before, SPOT offers an adequate set of formula-manipulating operations that allowed the translation of our algorithms without considerable modification. The reimplementation of the coverage algorithms was complicated by the fact that SPOT produces transition-based Büchi automata, whereas the algorithm of Section 4.1 expects state-based automata: the difference is that transitions rather than states are denoted as accepting. This shift requires a modification of the coverage evaluation, more precisely of the definition of Λ(A1, A2), the coverage among automata. Before, the number m in the denominator referred to the number of almost-simple paths of A1 that ended in an accepting vertex (state). We propose to redefine m as the number of almost-simple paths that end in an accepting edge (transition). It is clear that the coverage reported by the original and the new metric would differ: first, because the automata are generated using different methods and may thus be structurally different and, second, because the redefined metric considers different sets of almost-simple paths. With the alternative definition of m one can easily incorporate translators that produce transition-based automata (though parsing these automata may still require some modification due to the incompatibility of the internal structures representing the automata).
1As we are checking the sanity of requirements, the translator helps us to spot the looney ones. The tool is available at http://anna.fi.muni.cz/~xbauch/code.html#sanity.
Therefore, in order to re-establish the admissibility of the new evaluation as an adequate coverage metric, we have repeated the experiment with the aeroplane control system. As was to be expected given the above argument, the concrete evaluation of individual formulae differed from our original experiments. Specifically, the formula generated first (G(¬l)) was evaluated with coverage 12.9 per cent (instead of 9.1); the second formula F(¬b ∧ ¬l) with 35.0 (instead of 39.4); and the third formula F(¬l U (a ∨ b)) with 61.1 (instead of 54.9). Hence there are two observations to be made. First, even though the coverage numbers differ for individual formulae, the same triple was produced by the modified algorithm. Moreover, the three formulae were produced in the same order; thus the new metric behaves appropriately even when a new set of requirements is obtained as a disjunction of the old set with the generated coverage-improving formula.
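The redefinition of m can be illustrated on a toy graph: counting paths that end in an accepting state coincides with counting paths whose last edge is accepting, provided the accepting edges are exactly those entering accepting states. A minimal sketch, using plain simple paths as a stand-in for the thesis's almost-simple paths (the graph and acceptance sets are invented for the illustration):

```python
def count_paths(edges, start, accept):
    """Count simple paths (no repeated vertex) from `start` whose final
    element is accepting; `accept(path)` inspects the path's last edge."""
    def dfs(v, seen, path):
        total = 1 if path and accept(path) else 0
        for i, (u, w) in enumerate(edges):
            if u == v and w not in seen:
                total += dfs(w, seen | {w}, path + [i])
        return total
    return dfs(start, {start}, [])

edges = [(0, 1), (1, 2), (0, 2)]        # a small DAG over states 0, 1, 2
acc_states, acc_edges = {2}, {1, 2}     # edges 1 and 2 are those entering state 2

m_state = count_paths(edges, 0, lambda p: edges[p[-1]][1] in acc_states)
m_trans = count_paths(edges, 0, lambda p: p[-1] in acc_edges)
print(m_state, m_trans)  # → 2 2
```

When the translator distributes acceptance over transitions differently (as SPOT may), the two counts diverge, which is why the recomputed coverage numbers above differ from the original ones.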
Chapter 5
Control Explicit—Data Symbolic Model Checking
Given the results of the previous chapter, we can now safely assume that a sensible collection of requirements is present. This chapter establishes a verification method that builds upon that assumption, verifying that a model of the software under development has certain properties specified as LTL formulae. For easier exposition we will assume a mathematical model of the software and explain how to obtain such a model during various development stages in forthcoming chapters. As will then become clear, the pivotal contribution of our efforts, the control explicit—data symbolic model checking, is general enough to accept both software designs and software code as its input. The thesis on which the program verification part of this work is based can be summarised as follows: explicit-state model checking may be seamlessly extended to handle the control part of programs explicitly while substituting a symbolic representation for the data part. Not all symbolic representations are appropriate for the data part, and indeed not all forms of data are appropriate for LTL model checking. The reason is that most symbolic representations were created to support checking safety properties, yet LTL model checking requires exact state comparison, while checking safety properties does not. This chapter introduces one possible form of symbolic data, presented as a state space reduction technique. This representation will later prove adequate for arguing about correctness and for deriving the requirements on an appropriate representation that would implement this reduction.
5.1 Set-Based Reduction
The proposed method of handling the data flow consists of clustering together those variable evaluations that are connected to the same control state. This way, most aspects of explicit-state model checking can be preserved, provided that one can efficiently store these sets of evaluations. More precisely, we want to propose a reduction that would modify the state space in such a way that unmodified model checking of the reduced space yields exactly the same result as it would for the unreduced state space. That clearly entails the ability to efficiently decide whether a state has already been seen, using a fast storage data structure and a state matching decision procedure.
Unlike the explicit CFG, which represents the unreduced state space, the set-based reduction does not include data explicitly within states. To simplify further argumentation, we represent the reduction by introducing a new variable data, such that
sort(data) = {(v1, . . . , vn) | vi ∈ sort(xi)}.
Definition 5.1 (Set-Reduced CFG). The set-reduced CFG G = (S, s0, T, A) is similar to the explicit CFG in limiting the transition labels to ⊤ but differs in that S contains the evaluations of only two variables, pc and data. The generation from a CFG P = (L, l0, 𝒯, F) is defined as follows:

• S = {s : {pc, data} → L ∪ 2^sort(data) | s(pc) ∈ L, s(data) ⊆ sort(data)}

• s0 = {(pc, l0), (data, {0})}

• T = {s → s′ | s(pc) −ρ→ s′(pc), ∃(v ∈ s(data)), |=SAT ⋀ xi = vi ∧ ρ,
  s′(data) = {v′ ∈ sort(data) | ∃(v ∈ s(data)), |=SAT ⋀ xi = vi ∧ ρ ∧ ⋀ x′i = v′i}}

• A = {s ∈ S | s(pc) ∈ F}

Even though the representation of the evaluations of data is very important, let us, for now, abstract from it and investigate the reduction assuming that there is an efficient representation. More precisely, let us assume that we have a representation whose size is linear in the number of program variables but does not depend on the domains of these variables, on the number of times the input was read, on the length of the path from s0, etc. None of these assumptions will prove achievable using the currently known symbolic representations, but for the purpose of evaluating the general idea, we will assume them regardless. One might observe that it is no longer obvious when two states of the reduced transition system should be considered equivalent, which needs to be decidable for the accepting cycle detection. First, let us demonstrate that equality based on the subset relation – a newly discovered state s′ being equal to a known s provided that s′(data) ⊆ s(data) – is not correct with respect to LTL properties. In the context of symbolic execution, this type of equivalence is known as subsumption [XMSN05], where its use is justified by the fact that symbolic execution verifies reachability properties, for which subsumption is correct.

Example 5.2 (Incorrectness of Subsumption). The incorrectness of subsumption-based reduction for LTL verification can be easily demonstrated. Consider the verification task below, with the program CFG on the left and the specification CFG on the right:

[Figure: program CFG with locations l0 and l, the transition τ0 from l0 to l labelled x′ ≤ 10 and the self-loop τ1 on l labelled x < 5 ∧ x′ = x + 5; specification CFG with a single accepting state a0 carrying a ⊤ self-loop.]
After taking the transition τ0, the set-reduced CFG reaches a state s which evaluates data to {0, ..., 10}. Taking τ1 leads to a state s′ with s′(data) = {5, ..., 9}. Under subsumption, the states s and s′ would be equal. Given the specification CFG on the right, each state is accepting. Under subsumption we would thus report an accepting cycle, which would be incorrect, as can be demonstrated by listing all paths of the explicit CFG. In short, the τ1-successors of states with x < 5 evaluate x to a value larger by 5, and states with x ≥ 5 have no successors. Hence there are no cycles in the explicit CFG.
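A minimal sketch of the successor computation of Definition 5.1 can replay this example, with the SAT queries replaced by enumeration over a small domain (the relation encoding and the 8-bit domain are assumptions of the sketch, not the thesis's implementation):

```python
def post_image(data, rel, domain=range(256)):
    """Successor multi-state per Definition 5.1 (sketch): the SAT query
    'exists v in data with rel(v, v2)' becomes plain enumeration."""
    succ = {v2 for v in data for v2 in domain if rel(v, v2)}
    return succ or None          # empty image: the transition is disabled

tau0 = lambda v, v2: v2 <= 10                 # x' <= 10 (non-deterministic input)
tau1 = lambda v, v2: v < 5 and v2 == v + 5    # x < 5 /\ x' = x + 5

s  = post_image({0}, tau0)                    # {0, ..., 10}
s2 = post_image(s, tau1)                      # {5, ..., 9}
assert s2 < s                                 # subsumption would (wrongly) close a cycle here
assert post_image(s2, tau1) is None           # in fact the loop terminates: no cycle exists
```

The strict-subset check passing while the loop in fact terminates is precisely the spurious accepting cycle that rules out subsumption for LTL model checking.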
It appears that unless the underlying model-checking process itself is modified, the reduction has to differentiate every pair of states with different sets of possible variable evaluations. Henceforth this reduction is referred to as set-based reduction, and for an unreduced state space G^u, the set-reduced state space is denoted by G^s (whenever we need to distinguish it from other forms of reduction). The reduction is demonstrated in the following example and formalised in the following subsection.
Example 5.3 (Set-Based Reduction). Let us consider verifying the following parallel program P :
thread t0
1: char b;
2: read b;
thread t1
1: if ( b > 3 )
2: while ( true )
3: b++;

thread t2
1: if ( b*b <= 16 )
2: while ( true )
3: b -= 10;

with respect to a specification represented by the automaton PA:
[Figure: specification automaton PA with states 1 and 2; the transition from state 1 to state 2 and the self-loop on state 2 are labelled b > 10.]

requiring that at some point the execution sets b to a value larger than 10 and afterwards never decreases it below 11. The program P is not represented as a composition of transition systems but using C-like pseudo-code, to further demonstrate the possibility of translating real-world verification problems into the presented formalism. Below is the transition system generated without our set-based reduction:
[Figure: fragment of the unreduced transition system; each state combines control information (thread program counters pc and the automaton location pcA = 1) with a concrete data valuation such as b = 0 or b = 4, with δ-transitions branching over the possible values read into b and ζ-transitions branching over thread interleavings.]
The identification of program states can be divided into two parts: one for control information (marked with lighter blue in the figure) and the other for data (marked with darker red). Note also that the control part contains the program counters pc for the individual threads of the main program and the location pcA of the specification automaton PA. Similarly, it is possible to distinguish the two sources of non-determinism in parallel programs: the control-flow non-determinism (thread interleaving) is marked as ζ-transitions and the data-flow non-determinism (variable evaluation) as δ-transitions. Finally, the set-reduced transition system may be visualised as follows:
[Figure: the set-reduced transition system; the initial state with d = (b, 0) and pc = 1, 0, 0 leads to a state with d = (b, {0..255}) and pc = 2, 0, 0, whose successors evaluate d to (b, {4..255}) at pc = 2, 1, 0 and to (b, {0..4}) at pc = 2, 0, 1, all with pcA = 1.]
In the above figure we abbreviate data to d. Note especially how the set-reduction contracts the data non-determinism δ, while preserving the distinctions prescribed by the control.

Example 5.4 (Symbolic Execution). The figure below depicts a program with input on the left-hand side and its associated transition system on the right-hand side:
1: read a;
2: if ( a > 5 )
3:   a++;
4: else
5:   a--;

[Figure: the symbolic execution tree. The initial state (0, (a, 0), ⊤) moves under a = i1 to (1, (a, i1), ⊤), which branches under a > 5 to (2, (a, i1), i1 > 5) and under a ≤ 5 to (4, (a, i1), i1 ≤ 5); the update a′ = a + 1 then leads to (3, (a, i1 + 1), i1 > 5) and the update a′ = a − 1 to (5, (a, i1 − 1), i1 ≤ 5).]
The set of evaluations (of the one variable a) is represented symbolically, as a function over input symbols from I, here only i1.

Subset inclusion leads to a unification of states that does not preserve the information on the progress towards the LTL specification. In fact, for any unification among states it must be proven that if an accepting cycle was created then there is one in the unreduced system as well. This holds even for a more restrictive form of state equivalence, where two states are equal only if the evaluations they contain match exactly. Notice that even though atomic propositions may evaluate differently under s(data) within a single state, they are divided on-the-fly, based on the labels of the specification CFG transitions. That is ensured by the way transitions are generated in Definition 5.1.
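The symbolic states of Example 5.4 can be mimicked with closures over the input symbol i1, with satisfiability of a path condition decided by enumerating a (deliberately small) input domain in place of an SMT solver; the state encoding is an invention of this sketch:

```python
def sat(pred, domain=range(256)):
    """Stand-in for an SMT query: enumerate the input domain."""
    return any(pred(i) for i in domain)

# States are (pc, value of a as a function of i1, path condition over i1),
# following:  1: read a;  2: if (a > 5)  3: a++;  4: else  5: a--;
s1     = (1, lambda i: i, lambda i: True)             # a = i1, no constraint yet
s_then = (2, s1[1], lambda i: s1[1](i) > 5)           # path condition i1 > 5
s_else = (4, s1[1], lambda i: s1[1](i) <= 5)          # path condition i1 <= 5
s3     = (3, lambda i: s_then[1](i) + 1, s_then[2])   # a = i1 + 1
s5     = (5, lambda i: s_else[1](i) - 1, s_else[2])   # a = i1 - 1

assert sat(s_then[2]) and sat(s_else[2])   # both branches are feasible
assert s3[1](7) == 8 and s5[1](3) == 2     # concrete instances of the symbolic values
```

Each branch carries its own path condition, so a single symbolic state stands for the whole set of concrete evaluations satisfying that condition, exactly the clustering the set-based reduction performs.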
Theorem 5.5 (Correctness of Set-Based Reduction). Let P = (L, l0, 𝒯, F) be the CFG of the verified program and let G^u = (S^u, s^u_0, T^u, A^u) and G^s = (S^s, s^s_0, T^s, A^s) be the generated explicit and set-reduced CFGs, respectively. Then G^u contains a reachable accepting cycle if and only if G^s contains a reachable accepting cycle.
Lemma 5.6 (Symbolic Path Existence). For any path s^u_0 → s^u_1 → ... → s^u_n in G^u there is a path s^s_0 → s^s_1 → ... → s^s_n in G^s such that ∀(0 ≤ i ≤ n), s^u_i(pc) = s^s_i(pc) ∧ s^u_i(D) ∈ s^s_i(data).
Proof. For n = 0 we have that s^u_0(pc) = l0 = s^s_0(pc) and s^u_0(D) = 0 ∈ {0} = s^s_0(data). Let us assume that the statement holds for paths of length n and take an arbitrary path of length n + 1, say s^u_0 → s^u_1 → ... → s^u_n → s^u_{n+1}. The induction hypothesis gives us the appropriate path s^s_0 → s^s_1 → ... → s^s_n. According to Definition 3.10, the existence of s^u_n → s^u_{n+1} entails s^u_n(pc) −ρ→ s^u_{n+1}(pc) ∈ 𝒯. Thus also s^s_n(pc) −ρ→ s^s_{n+1}(pc) ∈ 𝒯 and since s^u_n(D) ∈ s^s_n(data) we also have s^s_{n+1}(data) ≠ ∅. Already it holds that s^s_n → s^s_{n+1} ∈ T^s, and from |=SAT ⋀ x = s^u_n(x) ∧ ρ ∧ ⋀ x′ = s^u_{n+1}(x) we conclude that s^u_{n+1}(D) ∈ s^s_{n+1}(data).
Lemma 5.7 (Explicit Path Existence). For any path s^s_0 → s^s_1 → ... → s^s_n in G^s and for all v ∈ s^s_n(data) there is a path s^u_0 → s^u_1 → ... → s^u_n in G^u such that s^u_n(D) = v.
Proof. Since s^s_0(data) = {0}, the statement holds trivially for n = 0. Let us assume a set-reduced path s^s_0 → s^s_1 → ... → s^s_n → s^s_{n+1} and let us fix a data vector v′ ∈ s^s_{n+1}(data). The transition s^s_n → s^s_{n+1} gives us s^s_n(pc) −ρ→ s^s_{n+1}(pc) ∈ 𝒯 and s^s_n(data) ≠ ∅. Also, since v′ ∈ s^s_{n+1}(data), we have that |=SAT ⋀ xi = vi ∧ ρ ∧ ⋀ x′i = v′i for some v ∈ s^s_n(data). Applying the induction hypothesis, we get a path s^u_0 → s^u_1 → ... → s^u_n with v = s^u_n(D) and a transition in G^u from s^u_n to s^u_{n+1} with s^u_{n+1}(D) = v′.

Lemma 5.8 (Fix-Point for BV Relations). Let ρ ∈ Φ be a BV formula. We can associate to ρ a relation r as follows:
r = {(v, v′) | vi, v′i ∈ sort(xi), |=SAT ⋀ xi = vi ∧ ρ ∧ ⋀ x′i = v′i}
For each ρ ∈ Φ there are m, n ∈ N0 such that for all subsets X of {v | vi ∈ sort(xi)} it holds that {r^m(v) | v ∈ X} = {r^{m+n}(v) | v ∈ X}.
Proof. Let R^i = {r^i(v) | v ∈ X} and consider the infinite sequence R^1, R^2, .... Given that each sort(x), x ∈ D, is finite, there are k, l ∈ N0 such that R^k = R^l. Assume k > l; then we can take l for m and k − l for n to establish the statement.
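On a finite domain, the m and n of Lemma 5.8 can be found by simply iterating the image computation until a set repeats. A sketch, where the relation r is an invented toy encoding x′ = (x + 1) mod 4 guarded by x ≠ 2:

```python
def image(r, X):
    """One application of relation r (a set of pairs) to a set X."""
    return frozenset(v2 for (v1, v2) in r if v1 in X)

def find_fixpoint_lasso(r, X):
    """Iterate R_i = r^i(X) until a repeat; return (m, n) such that
    R_m = R_{m+n}, mirroring Lemma 5.8 on a finite domain."""
    seen, cur, i = {}, frozenset(X), 0
    while cur not in seen:
        seen[cur] = i
        cur, i = image(r, cur), i + 1
    m = seen[cur]
    return m, i - m

# r: x' = (x + 1) mod 4, enabled only when x != 2
r = {(x, (x + 1) % 4) for x in range(4) if x != 2}
print(find_fixpoint_lasso(r, {0}))  # → (3, 1): {0},{1},{2} then the empty set forever
```

Since there are at most 2^|domain| distinct image sets, the iteration always terminates; the explicit fix-point computation mentioned below Theorem 5.5 is this loop performed on multi-states.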
Definition 5.9 (Composition of BV Relations). For a collection of relations ρ1, ..., ρn from Θ we define their composition ρ = ρn ◦ ... ◦ ρ1 as the formula

ρ = ρn[x ↦ x^n] ∧ ρn−1[x ↦ x^{n−1}, x′ ↦ x^n] ∧ ... ∧ ρ2[x ↦ x^2, x′ ↦ x^3] ∧ ρ1[x′ ↦ x^2]
Lemma 5.10 (Symbolic Cycle Existence). If s^u_1 → ... → s^u_c = s^u_1 is a cycle in G^u then there is a path in G^s, s^s_{11} → ... → s^s_{c1} = s^s_{12} →* s^s_{13} →* ..., such that

1. ∀(1 ≤ i ≤ c)(j ≥ 1), s^u_i(pc) = s^s_{ij}(pc)

2. ∃(m, n ∈ N), s^s_{1m}(data) = s^s_{1(m+n)}(data)

Proof. Unrolling the cycle s^u_1 → ... → s^u_c and applying Lemma 5.6 we get a path in G^s satisfying (1). Let ρi be the formulae labelling the corresponding path in P, i.e. s^u_i(pc) −ρi→ s^u_{i+1}(pc), and let ρ be the formula for the composition of ρ1, ..., ρc. Applying Lemma 5.8 on ρ we get the m and n satisfying (2).
Lemma 5.11 (Explicit Cycle Existence). If s^s_1 → ... → s^s_c = s^s_1 is a cycle in G^s then there is a path in G^u, s^u_{11} → ... → s^u_{c1} = s^u_{12} →* s^u_{13} →* ..., such that

1. ∀(1 ≤ i ≤ c)(j ≥ 1), s^s_i(pc) = s^u_{ij}(pc)

2. ∃(m, n ∈ N), s^u_{1m}(D) = s^u_{1n}(D)

Proof. Unrolling the cycle s^s_1 → ... → s^s_c and applying Lemma 5.7 we get a path in G^u satisfying (1). Let k = |s^s_1(data)| and consider the prefix s^u_{11} →* s^u_{1(k+1)}. For this prefix it holds that ∀(1 ≤ i ≤ k + 1), s^u_{1i}(D) ∈ s^s_1(data), and thus at least one data vector must repeat, i.e. ∃(1 ≤ m < n ≤ k + 1), s^u_{1m}(D) = s^u_{1n}(D), as required by (2).
Proof of Theorem 5.5. Assume that s^u_1 → ... → s^u_n = s^u_1 is an accepting cycle in G^u reachable from s^u_0. Then Lemma 5.10 gives us a lasso-shaped path s^s_1 → ... → s^s_m → ... → s^s_{m+n} = s^s_m in G^s which also leads to a cycle. This cycle is also accepting, given (1) of Lemma 5.10 and the definition of A^s. Finally, applying Lemma 5.6 on the path from s^u_0 to s^u_1 proves the reachability of an accepting cycle in G^s.

Assume that s^s_1 → ... → s^s_n = s^s_1 is an accepting cycle in G^s reachable from s^s_0. Then Lemma 5.11 gives us a lasso-shaped path s^u_1 → ... → s^u_m → ... → s^u_n = s^u_m in G^u which also leads to a cycle. This cycle is also accepting given the definition of A^u and reachable from s^u_0 (after applying Lemma 5.7 to s^s_0 →* s^s_1 and s^u_1(D)).
Theorem 5.5 gives us the assurance that an unmodified model checker can be used together with the set-reduced transition system and will provide the correct result. Yet reasoning about the efficiency of the proposed set-based reduction – the ratio between the size of the original system and the size of the reduced system – is rather complicated, as shown in Example 5.12 below. For a program without cycles, the reduction is exponential with respect to the number of input variables and to the sizes of their domains. Note, however, that for trivial cases of data-flow non-determinism even this reduction can be negligible. The case of programs with cycles is considerably more involved. Let us call control cycles those paths in a transition system that start and end in two states with the same control part, s, s′ such that s(pc) = s′(pc). Then the relation r of the cycle, transforming s(D) to s′(D), has a fixed point, as was argued in the above proof, and this fixed point has to be computed (explicitly in our case, as opposed to the symbolic solution [Lin96] of the same problem). That aspect is present in full and reduced state spaces alike, yet may produce an exponential difference in their sizes. If the multi-state already contains the fixed point before the generation reaches the given cycle, as in the second program of Example 5.12, then the reduced system contains only as many multi-states as the length of that cycle. On the other hand, as the first program of Example 5.12 demonstrates, the reduction can even be detrimental to the space complexity, even if we assume that the size of multi-states is sub-linear in the number of states contained within.
Example 5.12 (Set-Based Reduction Complexity). There are two programs that nicely exemplify some of the properties of the set-based reduction, which we will use in further discussion, especially regarding the efficiency of this approach. Consider the following program with a loop:

read a;
while ( a > 10 )
  a--;
When only the data part of multi-states is considered, the reduced transition system unfolds to
a = {0} → a = {0..255} → a = {11..255} → a = {10..254} → ... → a = {0..10}

and while the state space is finite, there are as many multi-states in the reduced system as there were states in the original system. Each multi-state contains a different set of evaluations of a, and each pair of multi-states is thus different. Furthermore, many states (individual variable evaluations) are represented multiple times: given that the first and the third states in the above system have the same program counter, the values of a between 10 and 254 are represented twice in these two multi-states alone. On the other hand, for a specification x = 1 and a program
x = 1;
read y;
while ( true )
  y++;

the reduced transition system contains three multi-states and three transitions:
(x = 0, y = 0) → (x = 1, y = 0) → (x = 1, y = {0..255})

whereas the original transition system contains 256 paths that each close into a cycle only after 256 unfoldings of the while loop.
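The contrast between the two programs can be observed directly by iterating the loops' post-images on 8-bit multi-states. This sketch tracks only the multi-states along each control cycle (loop-exit states are ignored), and the 8-bit wrap-around semantics is an assumption of the sketch:

```python
def multistates_on_cycle(initial, step):
    """Iterate a loop's post-image, counting distinct multi-states
    until a repeat or until the loop is exhausted (empty image)."""
    seen, cur = [], frozenset(initial)
    while cur and cur not in seen:
        seen.append(cur)
        cur = step(cur)
    return len(seen)

BYTE = set(range(256))

# while (a > 10) a--;  -- every iteration yields a fresh multi-state
dec = lambda s: frozenset(a - 1 for a in s if a > 10)
print(multistates_on_cycle(BYTE, dec))   # → 246

# while (true) y++;    -- with 8-bit wrap-around, {0..255} is already the fix-point
inc = lambda s: frozenset((y + 1) % 256 for y in s)
print(multistates_on_cycle(BYTE, inc))   # → 1
```

The first loop produces a long chain of distinct, heavily overlapping multi-states, while the second reaches its fix-point immediately, which is the exponential difference in reduced state space sizes discussed above.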
Explicit-Value Representation The first representation considered in this thesis stores the allowed evaluations in an explicit set, enumerating all possible combinations of individual variable evaluations. Hence this approach is closely related to the purely explicit approach of [BBB+12] but improves on it in several aspects. Effectively, rather than repeating the verification for every evaluation of input variables (the purely explicit approach), with explicit sets we execute the verification only once, from the multi-state containing all initial evaluations. It follows that evaluating branching conditions and computing successors only needs the modification that these operations are applied sequentially to all evaluations comprising the current multi-state. Most importantly, however, deciding state matching is possible on the level of syntax: two states are the same if their representations in memory match. Verification with explicit sets clearly retains the biggest limitation of the purely explicit approach, i.e. the space complexity grows exponentially with the domains of input variables. We postpone the comparison between the explicit set approach and the purely explicit approach until Section 7 and only remark that while the improvement is considerable, the method still scales linearly with the domain size. Verifying programs with realistic domains of input variables appears to require a symbolic representation, and we continue with a representation which uses BV formulae and satisfiability modulo theories (SMT) procedures for this theory.
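The syntactic state matching can be as simple as hashing the multi-state: a sketch in which a multi-state is a pair of control information and a frozen set of evaluations (the control tuples are invented for the illustration):

```python
visited = set()

def seen_before(pc, data):
    """Multi-state matching on the level of syntax: two states match
    iff their (control, set-of-evaluations) representations are equal."""
    key = (pc, frozenset(data))
    if key in visited:
        return True
    visited.add(key)
    return False

assert not seen_before((2, 0, 0), {0, 1, 2})
assert seen_before((2, 0, 0), {2, 1, 0})    # same set of evaluations: a match
assert not seen_before((2, 0, 0), {0, 1})   # a subset is deliberately NOT a match
```

Note that subsets are distinct states, as required by the correctness argument of Section 5.1; only exact set equality closes the search.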
5.2 Symbolic Representation
The classical way of representing a set of variable evaluations using BV formulae is to interpret a formula as the characteristic function of a set. A formula ϕ ∈ Φ then represents the set
{v ∈ sort(D) | |=SAT ϕ ∧ ⋀ xi = vi}. Generating successors with this representation, however, requires the computation of either pre- or post-conditions, which is non-trivial for many instructions. We will thus use a different approach, popular in symbolic execution [Kin76]. The symbolic representation of the set of evaluations considered in this thesis stores the possible current values of variables as images of bit-vector functions. For the following theoretical discussion we set the data part of multi-states s(data) to a vector of BV formulae (ρ1, ..., ρn): the transition relation labels collected along the path from l0. Such a vector represents a set of evaluations {v | |=SAT ρn ◦ ... ◦ ρ1 ∧ ⋀ xi = 0 ∧ ⋀ x′i = vi}, i.e. the image of the transition relation. In practice, the transition relations obtained from interpreting program commands are mostly a conjunction of a guard formula γ, which refers only to current-state variables, and a set of update formulae x′ = ϕ, where ϕ refers to current-state and input variables. This allows representing the data part of multi-states as a sequence of formulae referring only to input variables. The conjunction of guards forms the so-called path condition and the sequence of updates delimits the values of variables. Thus, assuming that a variable x is associated with a formula ϕ in a state s (with path condition δ), taking the transition s(pc) −γ ∧ y′=ψ→ s′(pc) creates a state s′ with path condition δ ∧ γ[x ↦ ϕ], in which y is associated with ψ[x ↦ ϕ]. This representation will not be used in the theoretical part of this thesis, but Section 7 details the experimental implementation which uses it.
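The guard-and-update step just described can be sketched with closures standing in for BV formulae; γ, ψ and the variable names here are placeholders invented for the sketch:

```python
def take_transition(valuation, pcond, guard, updates):
    """Path-condition update (sketch): substitute the current symbolic
    values into the guard and into each update's right-hand side.
    `valuation` maps variable name -> function of the input vector."""
    new_pcond = lambda i: pcond(i) and guard({x: f(i) for x, f in valuation.items()})
    new_val = {x: (lambda i, ps=psi: ps({y: g(i) for y, g in valuation.items()}))
               for x, psi in updates.items()}
    return {**valuation, **new_val}, new_pcond

# x starts as the input i1; take the transition  x > 2  /\  y' = x + 1
val = {'x': lambda i: i['i1'], 'y': lambda i: 0}
val2, pc2 = take_transition(val, lambda i: True,
                            lambda env: env['x'] > 2,      # gamma
                            {'y': lambda env: env['x'] + 1})  # psi
print(val2['y']({'i1': 7}), pc2({'i1': 7}), pc2({'i1': 1}))  # → 8 True False
```

The right-hand sides are evaluated over the *old* valuation, mirroring the substitution ψ[x ↦ ϕ] of the text; the path condition accumulates the substituted guards exactly as δ ∧ γ[x ↦ ϕ].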
Definition 5.13 (Set-Symbolic CFG). The set-symbolic CFG G = (S, s₀, T, A) is similar to the set-reduced CFG in limiting the transition labels to ⊤, but differs in that data evaluates to a vector of BV formulae from Φ. The generation from a CFG P = (L, l₀, T, F) is defined as follows:
• S = {s : {pc, data} → L ∪ [Φ] | s(pc) ∈ L, s(data) = (ρ₁, …, ρₙ), ρᵢ ∈ Φ}

• s₀ = {(pc, l₀), (data, (⋀ x′ᵢ = 0))}

• T = {s → s′ | s(pc) −ρ→ s′(pc), s(data) = (ρ₁, …, ρₙ), |=SAT ρ ◦ ρₙ ◦ … ◦ ρ₁, s′(data) = (ρ₁, …, ρₙ, ρ)}
• A = {s ∈ S | s(pc) ∈ F }
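The satisfiability side-condition in the definition of T can be mimicked without a solver by brute force over a tiny domain (an illustrative stand-in, not the thesis implementation; all names are ours):

```python
# Hedged sketch: |=SAT rho o rho_n o ... o rho_1 checked by enumerating
# one 3-bit variable instead of calling an SMT solver.

DOMAIN = range(8)

def compose_satisfiable(relations):
    """Is there a chain v0 -rho_1-> v1 -rho_2-> ... through all relations?"""
    reachable = {0}                       # initial evaluation x = 0
    for rho in relations:
        reachable = {w for v in reachable for w in DOMAIN if rho(v, w)}
        if not reachable:
            return False
    return True

def successor(state, target_pc, rho):
    """Append rho to s(data) only when the composition stays satisfiable."""
    pc, data = state
    new_data = data + (rho,)
    return (target_pc, new_data) if compose_satisfiable(new_data) else None

inc = lambda v, w: w == (v + 1) % 8       # x' = x + 1
gt5 = lambda v, w: w == v and v > 5       # guard x > 5, identity update

s0 = ("l0", ())
s1 = successor(s0, "l1", inc)             # image {1}: transition exists
s2 = successor(s1, "l2", gt5)             # 1 > 5 fails: no transition (None)
```

The data part only ever grows by appending the new label, mirroring s′(data) = (ρ₁, …, ρₙ, ρ) in the definition.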
Hence a multi-state in the set-symbolic state space consists of a vector of formulae. As stated earlier we want to interpret this vector as the image of the transition relation that results from appending the individual relations in the vector s(data). For such interpretation we then prove that paths in the set-reduced CFGs have their equivalent in the set-symbolic CFGs.
Definition 5.14 (Image-Based Interpretation). Let G = (S, s0,T,A) be a set-symbolic CFG. The image-based interpretation of set-symbolic states is a mapping from S to the set of evaluation vectors {(v1, . . . , vn) | vi ∈ sort(xi)}. For s(data) = (ρ1, . . . , ρn) we define it as:
⟦s⟧ = {v | |=SAT ρₙ ◦ … ◦ ρ₁ ∧ ⋀ x′ᵢ = vᵢ}

Theorem 5.15 (Trace Equivalence for Image-Based Interpretation). Let P = (L, l₀, T, F). Under the image-based interpretation, the set-reduced CFG Gˢ = (Sˢ, s₀ˢ, Tˢ, Aˢ) is trace-equivalent to the set-symbolic CFG Gⁱ = (Sⁱ, s₀ⁱ, Tⁱ, Aⁱ). That is:

1. for each path s₀ˢ → s₁ˢ → … in Gˢ there is a corresponding path s₀ⁱ → s₁ⁱ → … in Gⁱ, such that ∀(j ∈ ℕ), sⱼˢ(pc) = sⱼⁱ(pc) and sⱼˢ(data) = ⟦sⱼⁱ⟧;

2. for each path s₀ⁱ → s₁ⁱ → … in Gⁱ there is a corresponding path s₀ˢ → s₁ˢ → … in Gˢ, such that ∀(j ∈ ℕ), sⱼˢ(pc) = sⱼⁱ(pc) and sⱼˢ(data) = ⟦sⱼⁱ⟧.

Proof. For zero-length paths, the proof is the same for both (1) and (2). In the set-reduced CFG we have s₀ˢ(data) = {0}. In the set-symbolic CFG we have

⟦s₀ⁱ⟧ = {v | |=SAT ⋀ x′ᵢ = 0 ∧ ⋀ x′ᵢ = vᵢ} = {0}.

(1) Let s₀ˢ → … → sₙˢ → sₙ₊₁ˢ be a path in Gˢ. From the induction hypothesis we have a path s₀ⁱ → … → sₙⁱ in Gⁱ such that sₙˢ(data) = ⟦sₙⁱ⟧. Let sₙˢ(pc) −ρ→ sₙ₊₁ˢ(pc) and v ∈ sₙˢ(data) such that |=SAT ρ ∧ ⋀ xᵢ = vᵢ. Since v ∈ ⟦sₙⁱ⟧ we also have |=SAT ρₙ ◦ … ◦ ρ₁ ∧ ⋀ x′ᵢ = vᵢ for sₙⁱ(data) = (ρ₁, …, ρₙ). Furthermore, ρₙ ◦ … ◦ ρ₁ ∧ ⋀ x′ᵢ = vᵢ is equivalent to

ϕ := (ρₙ ◦ … ◦ ρ₁)[X′ ↦ Xⁿ⁺¹] ∧ ⋀ xᵢⁿ⁺¹ = vᵢ

and ρ ∧ ⋀ xᵢ = vᵢ is equivalent to ψ := ρ[X ↦ Xⁿ⁺¹] ∧ ⋀ xᵢⁿ⁺¹ = vᵢ. The only common variables in ϕ and ψ are Xⁿ⁺¹, which are bound to the same values, and thus |=SAT ρ ◦ ρₙ ◦ … ◦ ρ₁. All that remains is to show that ⟦sₙ₊₁ⁱ⟧ = sₙ₊₁ˢ(data).

v′ ∈ ⟦sₙ₊₁ⁱ⟧ ⇔ |=SAT ρ ◦ ρₙ ◦ … ◦ ρ₁ ∧ ⋀ x′ᵢ = v′ᵢ
⇔ |=SAT ρ ◦ ρₙ ◦ … ◦ ρ₁ ∧ ⋀ x′ᵢ = v′ᵢ ∧ ⋀ xᵢⁿ⁺¹ = vᵢ
⇔ ∃(v ∈ ⟦sₙⁱ⟧), |=SAT ρ[X ↦ Xⁿ⁺¹] ∧ ⋀ x′ᵢ = v′ᵢ ∧ ⋀ xᵢⁿ⁺¹ = vᵢ
⇔ ∃(v ∈ sₙˢ(data)), |=SAT ρ ∧ ⋀ x′ᵢ = v′ᵢ ∧ ⋀ xᵢ = vᵢ
⇔ v′ ∈ sₙ₊₁ˢ(data)

(2) Similarly as in (1), the path in Gⁱ gives us sₙⁱ(pc) −ρ→ sₙ₊₁ⁱ(pc) and |=SAT ρ ◦ ρₙ ◦ … ◦ ρ₁. Fixing the evaluation of the Xⁿ⁺¹ variables from the satisfying model to v and renaming the variables we get ∃(v ∈ ⟦sₙⁱ⟧), |=SAT ρ ∧ ⋀ xᵢ = vᵢ. Applying the induction hypothesis we get ∃(v ∈ sₙˢ(data)), |=SAT ρ ∧ ⋀ xᵢ = vᵢ. Finally, showing that sₙ₊₁ˢ(data) = ⟦sₙ₊₁ⁱ⟧ is exactly the same as in (1). □
The image-based interpretation crucially depends on the fact that states of set-symbolic CFGs represent sets of program states. These sets of data evaluations are stored compactly in the form of BV formulae, but model checking algorithms require distinguishing multi-states. We demonstrate how to compare the explicit part of multi-states efficiently in Section 5.4. Comparing the symbolic part, i.e. performing state matching, requires a different approach for an efficient implementation. Below we define a quantified BV formula which is satisfiable if and only if two given multi-states differ in their data parts. The satisfying assignment of input values for one of the multi-states is such that the resulting evaluation of data variables does not appear in the other state.
Definition 5.16 (Image-Based State Matching). For a set-symbolic CFG G = (S, s₀, T, A) we define an image-based state matching formula image for differentiating two states sˡ, sʳ ∈ S as follows. Let us fix sˡ(data) = (ρˡ₁, …, ρˡ_{nl}) and sʳ(data) = (ρʳ₁, …, ρʳ_{nr}), where each variable is also indexed with either l or r, i.e. formulae from sʳ(data) refer to xʳ₁, …, xʳₙ and iʳ₁, …, iʳ_{mr}.

sˡ ⊈ sʳ = ρˡ_{nl} ◦ … ◦ ρˡ₁ ∧ ∀(iʳ₁ … iʳ_{mr}), (ρʳ_{nr} ◦ … ◦ ρʳ₁ ⟹ ⋁ xˡᵢ ≠ xʳᵢ)
image(sˡ, sʳ) = (sˡ ⊈ sʳ) ∨ (sʳ ⊈ sˡ)

Theorem 5.17 (Image-Based Matching Correctness). Let sˡ and sʳ be two set-symbolic states. Then image(sˡ, sʳ) is satisfiable if and only if sˡ and sʳ represent two different sets of variable evaluations, i.e. ⟦sˡ⟧ ≠ ⟦sʳ⟧.

Proof. Let y′ be the vector of input values that satisfies image(sˡ, sʳ). Given the form of image we can assume that sˡ ⊈ sʳ holds. Let v′ be the satisfying vector of values for the variables x′ˡ₁, …, x′ˡ_{nl}. From the first conjunct we know that v′ ∈ ⟦sˡ⟧. Let v ∈ ⟦sʳ⟧ be chosen arbitrarily, as well as a vector of input values y, one for each iʳⱼ. Then either ρʳ_{nr} ◦ … ◦ ρʳ₁ ∧ ⋀ iʳⱼ = yⱼ ∧ ⋀ xʳᵢ = vᵢ is unsatisfiable, and thus v ∉ ⟦sʳ⟧, or, if the above formula is satisfiable, the consequent of sˡ ⊈ sʳ states that v′ differs from v in at least one variable evaluation.

For the other direction, let v ∈ ⟦sˡ⟧ be such that v ∉ ⟦sʳ⟧ and let y be an arbitrary evaluation of the iʳⱼ variables. Then either ρʳ_{nr} ◦ … ◦ ρʳ₁ ∧ ⋀ iʳⱼ = yⱼ is unsatisfiable or ⋁ xˡᵢ ≠ xʳᵢ holds, since v differs from every element of ⟦sʳ⟧. □

The above theorem and the existence of SMT solvers for the quantified bit-vector theory (QBV) give a feasible way of checking state equivalence under the image-based symbolic representation. By extension, this allows implementing model checking of arbitrary systems that use input variables, even those that read from input variables iteratively.
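The content of the image formula can be illustrated by brute force over a tiny input domain (our own stand-in for the QBV solver; the helper names are hypothetical): two states match exactly when their images coincide, regardless of the syntax of the stored functions.

```python
# Hedged sketch: [[s]] computed by enumerating all input evaluations,
# and image-based matching as plain set inequality of the two images.

from itertools import product

def image(data, input_names, domain=range(2)):
    """All data evaluations reachable for some evaluation of the inputs."""
    return {data(dict(zip(input_names, vals)))
            for vals in product(domain, repeat=len(input_names))}

def images_differ(s_l, s_r, input_names):
    """Mirrors satisfiability of image(s_l, s_r)."""
    return image(s_l, input_names) != image(s_r, input_names)

left  = lambda env: env["i1"] + env["i2"]   # x = i1 + i2
right = lambda env: env["i2"] + env["i1"]   # syntactically different
shift = lambda env: env["i1"] + 1           # genuinely different image

# images_differ(left, right, ...) is False; left vs shift is True
```

The quantified formula of Definition 5.16 replaces this doubly-exponential enumeration by a single solver query, which is the whole point of the symbolic representation.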
The most outstanding limitation of such an approach is the complexity of the satisfiability checking of QBV formulae, which, in theory, is an NEXPTIME-complete problem. Reported practical results are much more promising, but the achieved times are still orders of magnitude higher than equivalent operations in classical model checking. There are two possible ways of overcoming this obstacle. One possible way is to reformulate the state matching query to allow faster satisfiability checking. The other possible way is to reformulate the definition of state matching so that it remains correct while being easier to check. It can be easily observed that semantic equivalence between the s(data) relations is not a correct state-differentiating mechanism with respect to set-based reduction. However, given the increase in efficiency of BV (quantifier-free) satisfiability compared to QBV satisfiability, we now describe under what conditions semantic equivalence would be a permissible substitute for the exact, image-comparing, equivalence. Let us limit our scope to programs that read only a fixed number of times from the input variables, i.e. I is a finite set of cardinality m and the program statements refer only to input variables among i₁, …, iₘ. We claim that under such a restriction, it is admissible to define the state matching based on semantic equivalence as follows.
Definition 5.18 (Equivalence-Based State Matching). Let G = (S, s₀, T, A) be a set-symbolic CFG. We define an equivalence-based state matching formula equiv for differentiating two states sˡ, sʳ ∈ S as follows.

sˡ ⋢ sʳ = ρˡ_{nl} ◦ … ◦ ρˡ₁ ∧ ⋁ xˡᵢ ≠ xʳᵢ
equiv(sˡ, sʳ) = (sˡ ⋢ sʳ) ∨ (sʳ ⋢ sˡ) ∨ (ρˡ_{nl} ◦ … ◦ ρˡ₁ ≠ ρʳ_{nr} ◦ … ◦ ρʳ₁)
Hence the formula equiv is a quantifier-free BV formula and checking its satisfiability is an NP-complete problem. Note that now the satisfying assignment can either come from the last disjunct – it is an input variable evaluation that satisfies, say, the left path condition ρˡ_{nl} ◦ … ◦ ρˡ₁ but does not satisfy the right path condition – or from the modified subset inclusion. In the latter case, the satisfying assignment satisfies the path condition of sˡ and at the same time leads to a different evaluation of program variables by sʳ. The exact meaning of admissibility with respect to state matching is formalised in the following theorem.
Theorem 5.19 (Admissibility of Equivalence-Based State Matching). Assuming the verified program reads only finitely many times from input, the model checking procedure which uses equiv for state matching satisfies the following three conditions.
1. sound: every counterexample exists also in the set-reduced CFG.
2. complete: every violating run will eventually be detected.
3. terminating: verification of both correct and incorrect programs will terminate.
Proof. (1) Given that the image-based equivalence is strictly weaker than the semantic equivalence, i.e. ¬equiv ⇒ ¬image, every state equivalence located by the semantic equivalence is also located by the image-based equivalence. Consequently, every accepting cycle found using equiv exists in the set-reduced transition system.

(2) Let us assume there is an accepting cycle in the system, i.e. s₀ →* s₁ →* s₂ such that image(s₁, s₂) is unsatisfiable. Given the restriction above, we know that on the path from s₁ to s₂ the transition labelling relations do not use input variables. Let us denote these relations as ρ¹, …, ρᵐ. Their composition ρᵐ ◦ … ◦ ρ¹ constitutes a permutation of its domain (otherwise image(s₁, s₂) would be satisfiable), but not necessarily an identity. In the case the composition is not an identity, equiv(s₁, s₂) is satisfiable and the computation continues. However, ρᵐ ◦ … ◦ ρ¹ ◦ ρᵐ ◦ … ◦ ρ¹ is also a permutation of the same domain, and furthermore the evaluation of control variables remains unmodified; thus the resulting state is accepting if and only if s₁ was accepting. Finally, given the finite number of possible permutations of a finite domain, we conclude that there exist k₀ ≤ k₁ such that (ρᵐ ◦ … ◦ ρ¹)^(k₁−k₀) is an identity on the domain of (ρᵐ ◦ … ◦ ρ¹)^(k₀), and thus equiv(s_{k₁−k₀}, s_{k₁}) is unsatisfiable, where s_k denotes the state resulting from iterating the original cycle from s₁ to s₂ (k + 1) times.

(3) The termination can be established simply by observing that there are only finitely many mappings from I (given the above restriction) to sort(D). Then the state space of the model checking is also finite, even though possibly larger than the original, set-reduced transition system. □
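Step (2) of the proof hinges on a simple fact about finite domains: iterating a permutation must eventually repeat an image, yielding the k₀ ≤ k₁ above. A small sketch (toy permutation; names are ours):

```python
# Hedged sketch: iterate a permutation of a finite domain until the
# sequence of images repeats, which closes the cycle in the proof.

perm = lambda v: (v + 3) % 8              # a permutation of {0, ..., 7}

seen, k = {}, 0
current = tuple(range(8))                 # the image, listed in order
while current not in seen:
    seen[current] = k                     # first index of this image
    current = tuple(perm(v) for v in current)
    k += 1
k0, k1 = seen[current], k
# perm^(k1 - k0) is the identity on the stabilised image; here k1 - k0 == 8
```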
Apart from the restriction to programs that do not read inputs on a cycle, the proposed method utilising semantic equivalence of states may produce larger state spaces than the original state matching based on image equivalence. Note that in the proof of Theorem 5.19 we relied on the number of permutations being finite in order to detect existing state equivalences. In practice this means that the traversal of the state space has to visit and store (at most) k₁·m new states before detecting the equivalence. Here k₁ depends on the particular function composition collected from the cycle transitions and m is the length of the cycle in the set-reduced system. As Example 5.12 demonstrated, the value of m also depends on the transition labelling function. Together, the model checking using semantic equivalence has to unroll the transition relation to reach two fixed points: (1) to stabilise the image set and
(2) to stabilise the permutation of that set. We investigate the increased complexity in the number of states and the improvement in the verification time in Section 7. Example 5.20 (Iterative Input Reading). To see why the restriction to programs without iterative input reading is necessary, consider this simple program:
a = 0;
while(true)
    a = a + read();

The cycle introduces a new input variable with every iteration, but more importantly each such variable can modify the evaluation of the program variable a. Let us denote by s₁, s₂, … the states after 1, 2, … iterations of the cycle, with sᵢ(data) = (ρ₁, …, ρ_{nᵢ}), and let iⱼ be the input variable introduced in the j-th iteration. Then the evaluation i_k = 0 for k ∈ {1, …, j − 1}, i_j = 1 always differentiates the last state sⱼ from all the preceding states: for every k < j only a ↦ 0 satisfies ρ_{n_k} ◦ … ◦ ρ₁ ∧ ⋀ i_l = 0, but sⱼ evaluates a to 1.
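The growth of the image can be replayed concretely (a toy simulation with one-bit inputs, not the thesis tooling):

```python
# Hedged sketch: the image of `a` after j iterations of a = a + read(),
# with read() ranging over {0, 1}; every iteration yields a new image.

def images_after(iterations):
    imgs, current = [], {0}               # a = 0 before the loop
    for _ in range(iterations):
        current = {a + bit for a in current for bit in (0, 1)}
        imgs.append(frozenset(current))
    return imgs

imgs = images_after(6)
# the image after j iterations is {0, ..., j}, never seen before
```

Since the images strictly grow, no two states ever match and the traversal cannot terminate, which is exactly the restriction the theorem needs.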
5.3 Storing and Searching
During its state space traversal, an explicit-state model checker enumerates and stores the visited states in order to detect accepting cycles and to be able to decide that the system is correct (when the whole state space was traversed without detecting any accepting cycle). The question of how to implement an efficient representation of multi-states is at the centre of our work. (In the following we refer to the objects in a state space as states, unless the distinction between multi-states and single-states is necessary.) In classical explicit model checking, individual states represent one evaluation of variables and are thus stored as a list of values, each on a fixed position in a block of memory. Postponing any other state reduction techniques for later discussion, states represented explicitly can be efficiently manipulated by copying and comparing contiguous blocks of memory and, more importantly, compressed using hashing. As a hashing function we consider any function (not necessarily injective) that maps a block of memory to a small number (small compared to the domain of the hashing function, often a 32- or 64-bit integer). Storing visited states in a hash table allows one to decide whether a state was visited before in constant average time. Hashing, however, is only permissible on a set where every element has a canonical representation and this representation can be computed efficiently. More precisely, it is required that for two elements e₁, e₂, if the hashing function h returns different hashes, h(e₁) ≠ h(e₂), then e₁ is not the same object as e₂. This is readily available for sets whose elements are stored explicitly, but for symbolic representations the situation is more complicated. The set of BV functions does not have the property of efficiently accessible canonical representations, i.e. BV formulae have canonical forms but the process of canonisation is an NP-hard problem.
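For explicitly stored states the canonical-representation requirement is trivially met, so a hash set suffices (an illustrative sketch; the names are ours):

```python
# Hedged sketch: an explicit state is a fixed-layout block of values,
# which is its own canonical form, so visited-state queries just hash it.

visited = set()

def is_new(state):
    """True when the state was not seen before; O(1) on average."""
    key = (state["pc"], tuple(state["vars"]))
    if key in visited:
        return False
    visited.add(key)
    return True

first = is_new({"pc": 1, "vars": [0, 7]})    # True: stored
again = is_new({"pc": 1, "vars": [0, 7]})    # False: same block of values
```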
Clearly, a BV function stored as its abstract syntax tree is not a canonical representation of its image, nor of its semantics. For example, 3 + 2 and 2 + 3 both evaluate to 5, yet they are syntactically different and thus their representations do not match. Canonical representations of BV formulae exist, but their size grows exponentially in the presence of multiplication, and it is questionable what the improvement would be compared to the explicit set representation. Furthermore, the canonical form of BV formulae is canonical only with respect to satisfiability, i.e. two functions have the same canonical representation if and only if they are equisatisfiable. It follows that such a canonical form need not preserve the images of the functions contained within the original formula.
5.3.1 Linear Storage
Given the basic principle of the combination of explicit and symbolic approaches to model checking – that the control part of individual states is stored explicitly – one can still partially employ hash-based search. Effectively, only the states that share the same control part must be tested for equivalence of the data part in order to decide state matching. The control part is canonical, and the standard hash-based search can be used to distinguish states that differ in the control part. This considerably improves the time efficiency, since the number of states among which one must search using the more expensive procedure is smaller. For further discussion let us denote the set of states of G = (S, s₀, T, A) that share the same control part with a state s as [s]_c, i.e. [s]_c = {s′ ∈ S | s′(pc) = s(pc)}. A straightforward way of storing the set [s]_c is in a linear array. Then deciding whether a new s′ is already present in [s]_c constitutes a similar search as in hash table collisions: the array storing [s]_c is traversed in a linear fashion and every element is tested for equivalence with s′. If an s″ is found for which image(s′, s″) is unsatisfiable, then the traversal is aborted (s′ is already stored); otherwise, if [s]_c was traversed to the end, s′ is appended to [s]_c as a new state. Note that finding a state s′ to be a new state requires the traversal of the whole [s]_c, yet for every s″ ∈ [s]_c the formula image(s′, s″) is satisfiable. Since finding a satisfying assignment to a QBV formula is considerably faster than proving unsatisfiability, it follows that detecting already visited states is more expensive than finding a state to be genuinely new. There are several possible ways of improving this basic scheme. One can clearly see that the finer the state space is divided solely by the control part, the faster the state matching is. This can be utilised by moving some of the data into the control part.
For example, if we can be certain that a particular variable can be evaluated to a single value in a state, then it can be safely moved to the control part to be used in further, hash-based, state space division. Another optimisation is to use the satisfying assignments of previously checked image formulae. As stated before, such an assignment is in fact an evaluation of input variables that leads to an evaluation of program variables that exists in one state but not in the other. Since the data part of states connected by the same control part was formed under similar conditions (often only a different iteration of a cycle), it might happen that the evaluation that differentiated two such states could differentiate also the next ones. Furthermore, to test whether such evaluation could be reused is possible without calling the SMT solver, using the explicit evaluation method incorporated in the model checker. We now explain these optimising heuristics in more detail.
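The bucketed layout described above can be sketched as follows (our own rendering; the expensive image test is replaced by a brute-force stand-in):

```python
# Hedged sketch: hash the explicit control part into buckets; inside a
# bucket, resolve conflicts by the expensive semantic comparison.

from collections import defaultdict

db = defaultdict(list)                     # hash(control part) -> [s]_c

def lookup_or_insert(state, same_data):
    """Linear conflict resolution; each same_data call models one solver call."""
    bucket = db[hash(state["pc"])]
    for old in bucket:
        if same_data(old["data"], state["data"]):
            return old                     # already visited
    bucket.append(state)                   # genuinely new: full scan was needed
    return None

# brute-force stand-in for the image-equality check
same = lambda a, b: set(a) == set(b)

lookup_or_insert({"pc": 3, "data": [0, 1]}, same)        # new state
hit = lookup_or_insert({"pc": 3, "data": [1, 0]}, same)  # semantically equal
```

Note how a state with a fresh control part never triggers the expensive comparison at all, which is why refining the explicit part pays off.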
5.3.2 Explicating Variables
If some variables do not depend on inputs, or their values were limited by path conditions, it may happen that only one value is permissible as their evaluation (see [BL13] for an explicit-value analysis using a similar idea). This can be detected by investigating the satisfiability model and employed during the hash-based search. It requires adding another set of control variables C_D to store the explicit values, i.e. for each x ∈ D we add x^C into C_D. Thus hashing is performed on the set {s(pc), s(x₁^C), …, s(xₙ^C)} and we add a new pair (x^C, e ∈ sort(x)) into s when only a single evaluation e of x satisfies the path condition. More precisely, for s′(data) = (ρ′₁, …, ρ′ₙ), when testing the satisfiability of ρ′ₙ ◦ … ◦ ρ′₁, we extract the satisfying evaluation as v. Then all data variables xᵢ are tested for satisfiability of ρ′ₙ ◦ … ◦ ρ′₁ ∧ xᵢ ≠ vᵢ. If the latter formula is unsatisfiable we add (xᵢ^C, vᵢ) into s.

Explicating variables does not require any further modification of either the accepting cycle detection or the database searching. The desired effect of this heuristic is that the number of states with the same explicit part – the evaluation of the variables from {pc, x₁^C, …, xₙ^C} – is lowered. The database thus stores more explicit states, but the complexity of a hash-based search is negligible compared to the satisfiability-based search which resolves hash-based collisions. There is a negative effect, however, in that detecting variables with a single satisfying value requires further calls to the SAT solver. The experiments of Section 7 show that even when the explication succeeds in only a few instances, the overall effect is positive.
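Over an explicit image set the same idea reads as follows (a hypothetical sketch; in the thesis the single-value test is an unsatisfiability query, here it is plain enumeration):

```python
# Hedged sketch: promote every variable that takes a single value in all
# evaluations of a state into the hashed explicit part.

def explicate(image):
    """Map variable index -> its unique value, for single-valued variables."""
    width = len(next(iter(image)))
    fixed = {}
    for i in range(width):
        values = {evaluation[i] for evaluation in image}
        if len(values) == 1:               # only one permissible evaluation
            fixed[i] = values.pop()
    return fixed

# two variables: x0 is always 4 (promotable), x1 still varies
state_image = {(4, 0), (4, 1), (4, 3)}
# explicate(state_image) == {0: 4}
```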
5.3.3 Witness-Based Matching Resolution
The advantage of hash-based searching is that the amortised complexity of a single search query can, under certain conditions, be constant. Less efficient storage methods use a linear ordering on the elements and a tree-like structure where all possible collisions can be detected in a logarithmic number of steps. The following heuristic attempts to simulate the important property of a linear order with respect to searching: every comparison divides the remaining number of potential candidates approximately in half. Furthermore, if there is a more efficient procedure than calling a SAT solver to compare two states, then that other procedure should be used instead. Two sets of symbolically represented states sharing the same explicit part cannot be distinguished using hashing. Consider the model y gained from checking the satisfiability of equiv(s, s′). It is an evaluation of input variables and also a witness of the difference between s and s′ (note that we use the equivalence-based state matching). This witness can be reused and can replace some calls of the SAT solver when processing a new state s″. For example, let the model from some previous equiv(s, s′) evaluate the input variables to y, such that ρₙ ◦ … ◦ ρ₁ ∧ ⋀ iⱼ = yⱼ evaluates the data variables to v, ρ′_{n′} ◦ … ◦ ρ′₁ ∧ ⋀ iⱼ = yⱼ evaluates the data variables to v′, and v ≠ v′. Then if ρ″_{n″} ◦ … ◦ ρ″₁ ∧ ⋀ iⱼ = yⱼ evaluates the data variables to v″, where v″ ≠ v, we have s″ ≠ s; otherwise s″ ≠ s′. Notice that obtaining v″ and testing the equality with v can be done without calling the SAT solver, since we can simply substitute the input variables y and evaluate the resulting formulae. Let us focus the following discussion on a single set of states with the same explicit part. The above observation can be used more extensively as the number of states increases, since every pair among those states constitutes a witness.
We propose to build a hierarchy among the states, attaching two sets of states to every witness, denoted {s₁, …, s_l} −y− {z₁, …, z_r}, where y is the witness of s₁ ≠ z₁, sᵢ(data) = (ρᵢ₁, …, ρᵢ_{nᵢ}), and zᵢ(data) = (ζᵢ₁, …, ζᵢ_{mᵢ}). Let vᵢ and wᵢ be the data variable evaluations gained from the satisfiability models of ρᵢ_{nᵢ} ◦ … ◦ ρᵢ₁ ∧ ⋀ iⱼ = yⱼ and ζᵢ_{mᵢ} ◦ … ◦ ζᵢ₁ ∧ ⋀ iⱼ = yⱼ, respectively. The separation between the s and z states is such that ∀(1 < i ≤ l), vᵢ = v₁ and ∀(1 ≤ i ≤ r), wᵢ ≠ v₁. The witnesses are stored in a priority queue and the one with the highest priority is used for the next state matching. The priority of every witness equals the sum of the numbers of states that it connects, that is l + r, but we also continually update the priority with every state matching: if a state is found different from the states in a set Z, then we remove the states in Z from both sets connected by each witness. This ensures that, on average, each state matching removes the highest number of states still to be compared with. Once all witnesses with non-zero priority have been tested, the remaining state matchings must be decided with standard calls to the SAT solver.

A similar, though not equivalent, simplification can be obtained from reusing the witnesses of image(s, s′), i.e. in the case of image-based state matching. There the model is again an evaluation y of the input variables in s such that for any input evaluation for s′, s′(data) leads to an evaluation of the state-forming variables different from that of s(data). While this cannot be used to differentiate a new state completely without any call to the SAT solver, one can exponentially decrease the theoretical complexity of the call by checking satisfiability in BV instead of in QBV. (Equivalently, one could generalise the idea by stating that reusing a witness allows for differentiation with satisfiability of formulae with one fewer quantifier alternation.) If the quantifier-free formula ρₙ ◦ … ◦ ρ₁ ∧ ρ″_{n″} ◦ … ◦ ρ″₁ ∧ ⋀ xᵢ = x″ᵢ ∧ ⋀ iⱼ = yⱼ is satisfiable, then s″ ≠ s′; if it is unsatisfiable, then s″ ≠ s.
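The witness-reuse loop can be sketched with states as input-to-data functions (our own stand-ins; solver_differs plays the role of the SAT-solver call and returns the new witness it finds):

```python
# Hedged sketch: try cached witness inputs first (cheap substitution and
# evaluation); only when none separates the pair, fall back to the solver.

def differs(new, old, witnesses, solver_differs):
    """True when the two states denote different evaluation sets."""
    for y in witnesses:
        if new(y) != old(y):              # substitution, no solver call
            return True
    y = solver_differs(new, old)          # expensive call, may yield a witness
    if y is not None:
        witnesses.append(y)               # remember it for later matchings
        return True
    return False

# brute-force stand-in for the solver, inputs ranging over {0, ..., 3}
def solver_differs(a, b):
    return next((y for y in range(4) if a(y) != b(y)), None)

witnesses = []
f = lambda y: y * 2
g = lambda y: y + 2
first = differs(f, g, witnesses, solver_differs)   # solver finds y = 0
# a later, similar pair can now be separated by the cached witness alone
```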
5.4 Model Checking Algorithm
At this point we have described all the necessary methods in sufficient detail to be able to construct the model checking algorithm using set-based state space reduction. It will prove useful to consider a model checker to consist of four communicating components: (1) an implicit graph description, i.e. the input program, (2) an accepting cycle detection algorithm which continuously generates the explicit graph, (3) a database of already visited states, and (4) a counterexample extractor. Also, even though the model checking will be explained on control-flow graphs using image-based state matching, it works exactly the same with equivalence-based state matching. An implicit graph description for the purpose of model checking implements three functions: initial, successors, and is_accepting, that together exhaustively represent a program P = (L, l₀, T, F).
• The function initial simply returns the initial state of the program, i.e. s such that s(pc) = l₀, s(data) = (⊤).
• The function successors takes a state s as its parameter and returns the set of all its →-successors generated as described in Definition 5.13, i.e. s′ ∈ successors(s) ⇔ s → s′. Notice that generating each successor requires calling the SAT solver to verify that ρ ◦ ρₙ ◦ … ◦ ρ₁ is satisfiable, where s(data) = (ρ₁, …, ρₙ) and s(pc) −ρ→ s′(pc) ∈ T. Using the explicit representation, computing successors(s) requires computing the successors for each v ∈ ⟦s⟧.

• Finally, is_accepting(s) is ⊤ if and only if s(pc) ∈ F is an accepting state.
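A toy rendering of this three-function interface (hypothetical program: a one-location counter modulo 3; the class and all names are ours):

```python
# Hedged sketch: the implicit graph interface consumed by the cycle
# detection algorithms, instantiated for a counter modulo 3.

class ImplicitGraph:
    def initial(self):
        return ("l0", 0)                   # (pc, data)

    def successors(self, state):
        pc, v = state
        return [("l0", (v + 1) % 3)]       # single transition v := v + 1 mod 3

    def is_accepting(self, state):
        pc, v = state
        return v == 0                      # stands in for s(pc) in F

graph = ImplicitGraph()
frontier, seen = [graph.initial()], set()
while frontier:                            # exhaustive traversal
    s = frontier.pop()
    if s not in seen:
        seen.add(s)
        frontier.extend(graph.successors(s))
# the traversal visits exactly the 3 reachable states
```

The traversal loop is deliberately oblivious to how successors are produced, which is the point of keeping the graph implicit.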
The database Visited is an associative array, storing a pair
Algorithm 6: Generic Accepting Cycle Detection
Input : Program description Graph; Database Visited
Output: A pair <{Correct, Incorrect}, S>, where S is a subset of program states that contains a reachable accepting cycle if the first element is Incorrect

1 Ready.push(…)
⋮
3 s := Ready.pop()
4 if ¬Visited.has(s) then
5     Succs := Graph.successors(States[s])
6     SuccRefs := map(Succs, States.push_back)
7     Visited[s] := …
⋮
13 return …
Furthermore, all database functions can be implemented by the generic searching function described in Algorithm 7. Note that the function in Algorithm 7 is essentially a hash-based search with linear conflict resolution. The reason for describing it to any extent is to expose the difference between the hash-based search for the set of control-equal, data-conflicting database elements, and the search within this set. While the former is a very fast, syntax-based comparison of two blocks of memory, the latter, represented on line 3 by the call to a SAT solver on one of the image formulae, can be extremely slow, since it has to interpret the function representing variable evaluations. As the experiments will later show, the efficiency of the whole model checking approach crucially depends on the size of these conflicting sets of states. The accepting cycle detection algorithm itself, depicted in Algorithm 6, is again a generic graph traversal, abstracted from a concrete implementation of some specific cycle detection algorithm. Details of particular algorithms are not necessary for understanding this work and can be found in the original publications of the respective algorithms. The interested reader should consult [BČMŠ04], [ČP03], [CVWY92] for details on the MAP, OWCTY, and Nested DFS algorithms, respectively (those are the three algorithms implemented in DIVINE). The existence of a database of known states is used for three main reasons: (1) it is more expensive to generate successors of a state through the implicit graph representation than to obtain them directly from the database, (2) in order to decide the correctness of a system the algorithm must traverse the whole graph and thus be able to decide when all states have been visited, (3) the accepting cycle detection requires that state matching is implemented.
Hence, based on (1), we should access the successors of a state through the database (line 8) instead of reconstructing them again (line 5) in the case the present state is already stored in the database. The output of the accepting cycle detection, in the case the program is incorrect, is a subset of the state space which contains an accepting cycle reachable from the initial state.
One detail of the accepting cycle detection is interesting with respect to the correctness of the overall model checking procedure. When a state is revisited, the algorithm only preserves the older version of the state: the one stored in the database. In standard model checking these two states are identical, but under set-based reduction they may differ in both the path condition and the functions representing variable evaluations. The following theorem shows that in the set-based reduced systems, i.e. with state matching at least as strongly differentiating as the image-based one, any version of the state can be used with the same effect. The implications of storing only the oldest version of a state are observable when we try to interpret a path in the system, which happens solely when counterexamples are extracted.

Theorem 5.21 (Image-Based Path Existence). Let s be a reachable state in a set-symbolic G = (S, s₀, T, A) and let π be a path from s to s′ such that ⊭SAT image(s, s′). Then for every path ζ from s′ to s₁ there is a path from s to s₂ such that ⊭SAT image(s₁, s₂).

Lemma 5.22 (Single Step Equality). Let s, s′ ∈ S be such that ⊭SAT image(s, s′) and let ρ ∈ Φ be such that for r, r′ ∈ S we have: r(pc) = s(pc) and r(data) = s(data).ρ; furthermore, r′(pc) = s′(pc) and r′(data) = s′(data).ρ. Then ⟦r⟧ = ⟦r′⟧.

Proof. Let s(data) = (ρ₁, …, ρₙ) and s′(data) = (ρ′₁, …, ρ′ₙ). Then

v ∈ ⟦r⟧ ⇔ |=SAT ρ ◦ ρₙ ◦ … ◦ ρ₁ ∧ ⋀ x′ᵢ = vᵢ
⇔ ∃w, |=SAT ρ ∧ ⋀ x′ᵢ = vᵢ ∧ ⋀ xᵢ = wᵢ ∧ ρₙ ◦ … ◦ ρ₁ ∧ ⋀ x′ᵢ = wᵢ
⇔ ∃(w ∈ ⟦s⟧), |=SAT ρ ∧ ⋀ x′ᵢ = vᵢ ∧ ⋀ xᵢ = wᵢ
⇔ ∃(w ∈ ⟦s′⟧), |=SAT ρ ∧ ⋀ x′ᵢ = vᵢ ∧ ⋀ xᵢ = wᵢ
⇔ |=SAT ρ ◦ ρ′ₙ ◦ … ◦ ρ′₁ ∧ ⋀ x′ᵢ = vᵢ ⇔ v ∈ ⟦r′⟧ □

Proof of Theorem 5.21. Let the path from s₀ to s be s₀(pc) −ρ₁→ … −ρₙ→ s(pc) and the path from s to s′ be s(pc) −σ₁→ … −σₘ→ s′(pc). We prove the statement by induction on the length p of the path from s′ to s₁. For p = 0 the statement follows from the assumptions. Let s′(pc) −ζ₁→ … −ζₚ→ s_{p1}(pc) −ζ_{p+1}→ s₁(pc); then from the induction hypothesis we have a path s(pc) −θ₁→ … −θₚ→ s_{p2}(pc) such that ⊭SAT image(s_{p1}, s_{p2}). From

|=SAT ζ_{p+1} ◦ ζₚ ◦ … ◦ ζ₁ ◦ σₘ ◦ … ◦ σ₁ ◦ ρₙ ◦ … ◦ ρ₁

and the definition of s_{p2} we have |=SAT ζ_{p+1} ◦ θₚ ◦ … ◦ θ₁ ◦ ρₙ ◦ … ◦ ρ₁, and thus there is s_{p2} → s₂. Finally, from Lemma 5.22 it follows that ⟦s₁⟧ = ⟦s₂⟧. □

The subset returned by Algorithm 6 is not in a form presentable to the user as a counterexample and needs to be further clarified, i.e. the actual counterexample has to be extracted from Next using Algorithm 8. The algorithm heavily uses the graph induced by the parent edges in order to accelerate the location of both an accepting cycle and the path to that cycle from the initial state. Hence three different graph traversal procedures are used in this algorithm: forward search along normal edges (Reach_f), forward search along parent graph edges (Reach_f^p), and backward search along parent graph edges (Reach_b^p). The first three parameters are common to all three procedures, i.e. the number of steps (∗ for unlimited traversal), the starting vertex, and the subgraph that limits the scope of the search. The last parameter of the backward search is a set of vertices at which the traversal should be aborted.
Algorithm 7: Database Searching
Input : State s; Database Visited; Modifying function update
Output: Pair <…>

1 Key := hash(s(pc))
2 foreach s′ ∈ Visited : hash(s′(pc)) = Key do
3     if ⊭SAT image(s, s′) then   // or equiv(s, s′)
4         return <⊤, &s′>
5 return <⊥, Visited.update(s)>
Example 5.23 (Counterexample Extraction). We use the relatively simple CFG below to demonstrate the counterexample extraction.
[Figure: an example graph with initial state s0, states s1–s6 and the cycle-closing state sc.]
The input is the lighter green area, which contains an accepting cycle. In this example, we represent parent graph edges with solid lines and the remaining edges with dashed lines. The extraction begins by locating the states with no successors in the parent graph, marked with a small rectangle, i.e. the states s3, s4, s5, sc. Next, it identifies the states with an accepting predecessor in the parent graph, marked with red-filled rectangles. For each of those states, here demonstrated on the state sc, the algorithm computes the set of successor vertices in the whole graph, marked with a darker blue square. Finally, the accepting cycle is identified as the yellow area at the bottom.
The parent graph is the shortest-path tree and thus contains no cycles. Yet some vertices in the parent graph may have outgoing edges in the full graph that end in one of their parent-graph predecessors. Such vertices cannot have successors in the parent graph. These are first located on line 1 and then refined on line 2, since we are only interested in those that have accepting predecessors; they are then iteratively tested for lying on a cycle. Hence the sets R^1_f and R^∗_b contain the one-step forward closure and the full backward closure, respectively, the latter along parent edges. The formation of R^∗_b is also stopped at any vertex from R^1_f, which identifies the vertex sc as lying on a cycle. Given that the parent graph is a tree, we know that R^1_f ∩ R^∗_b
Algorithm 8: Counterexample Extraction
Input : Subset Next that contains a reachable accepting cycle
Output: Pair <π, ζ>, where π = s0, …, sa is a path in Graph such that s0 is the initial state and sa is an accepting state; ζ = sa, …, sa is a cycle in Graph
1 NoSucc := {s ∈ Next | Reach^p_f(1, s, Next) = ∅}
2 Acc := {s ∈ NoSucc | Reach^p_b(∗, s, Next, F) ∩ F ≠ ∅}
3 foreach sc ∈ Acc do
4     R^1_f := Reach_f(1, sc, Next)
5     R^∗_b := Reach^p_b(∗, sc, Next, R^1_f)
6     if R^1_f ∩ R^∗_b ≠ ∅ then
7         sa := s ∈ R^∗_b ∩ F
8         return <Reach^p_b(∗, sa, Graph, ∅), R^∗_b>
contains at most a single vertex. The cycle R^∗_b must contain an accepting vertex sa, from which the final backward traversal towards the initial vertex can be started.
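The traversals of Algorithm 8 can be sketched over an explicitly stored parent map. In this hypothetical Python rendering, `succs` is a plain adjacency dict, `parent` maps each vertex to its parent in the shortest-path tree, and the backward search along parent edges is simply a walk up the tree aborted at a given stop set.

```python
# A simplified sketch of the lasso extraction, assuming explicit graphs.
def ancestors(v, parent, stop):
    """Backward closure along parent edges, aborted at vertices in `stop`."""
    chain = []
    while v is not None:
        chain.append(v)
        if v in stop:
            break
        v = parent.get(v)
    return chain

def extract_counterexample(next_set, succs, parent, accepting):
    # line 1: vertices with no successors in the parent graph
    has_child = {parent[v] for v in next_set if parent.get(v) in next_set}
    no_succ = [v for v in sorted(next_set) if v not in has_child]
    # line 2: keep those with an accepting parent-graph predecessor
    acc = [v for v in no_succ
           if any(a in accepting for a in ancestors(v, parent, set()))]
    for sc in acc:
        r1f = set(succs.get(sc, ())) & next_set          # forward, one step
        rbs = ancestors(sc, parent, r1f)                 # backward, aborted at r1f
        if r1f & set(rbs):                               # sc closes a cycle
            sa = next(v for v in rbs if v in accepting)  # accepting vertex on it
            path = list(reversed(ancestors(sa, parent, set())))
            cycle = list(reversed(rbs))                  # edge sc -> cycle[0] closes it
            return path, cycle
    return None
```

The returned `path` is the parent-tree branch from the initial vertex to sa, and `cycle` lists the cycle vertices with the closing edge from sc back to the first listed vertex left implicit.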
5.4.1 Counterexamples

The ability to produce counterexamples is a crucial property of LTL model checking that must be retained even for our combination with symbolically represented data. In standard, explicit model checking a counterexample is an infinite path that violates the desired property, see Algorithm 8. To the user, it is usually reported as a pair (path to a vertex on an accepting cycle, path from that vertex to itself). With a symbolic representation of data we can either select one variable evaluation from among those comprising the individual states, or use a part of the symbolic representation to provide the user with more information detailing the violating behaviour than what could be inferred from an explicit counterexample. We propose to represent the counterexample as a quadruple (path π to an accepting state, cycle ζ on that state, path conditions for π and ζ). In the description of the paths we are only interested in the choices of the control-flow non-determinism that are required to steer the computation into and through the violating cycle. The path conditions, on the other hand, need only describe what data values are admissible to get to that cycle. It follows that from such a counterexample the user can get not only one (or a fixed number of) concrete runs that violate the specified behaviour, but also the precise description of all such runs, represented as the path condition. The negative aspect of symbolic counterexamples is related to the first fixed-point search as described in Section 5.1, which can lead to counterexamples exponentially longer than those produced by the standard approach. Altogether, the counterexample is a quadruple <π, ζ, sa(pc), s'a(pc)>, where π = s0, …, sa and ζ = sa, …, s'a. Note that distinguishing between sa and s'a is only necessary when employing image-based state matching.
Given that equivalence-based state matching does not allow reading from input variables on a cycle, the path conditions for sa and s'a must be exactly the same. Furthermore, we can use the result of Algorithm 8 without any modifications and simply project the paths to only the control-flow information, extracting the path condition from s'a. The image-based state matching must recompute the path condition for s'a, which is not stored in the database. Yet sa is stored, and we know that sa is the direct predecessor of s'a, and thus the states stored in R^1_f have correct path conditions. It follows that the state sa found by the algorithm already is the correct s'a for the counterexample.
5.4.2 Deadlocks

The above description is complete and correct for an arbitrary system that does not contain deadlocks, i.e. at least one guard is satisfiable at any state of the system. In order to extend our approach to systems with deadlocks (which is desirable, since deadlock-freedom is an important property to be verified), the algorithm has to correctly resolve the situation when there is a value for which no guard is satisfiable. Formally, let Γp be the guards on system transitions leaving a state s and Γa the guards on specification transitions. Then, if ρn ∘ … ∘ ρ1 ∧ ⋁Γa ∧ ⋁Γp is unsatisfiable, the traversal algorithm creates s' as a copy of s, sets s'(data) to s(data).⊥ and stores it as another successor of s. This way the ability to detect deadlocks is restored, and with it the standard semantics of LTL on finite runs – the implicit appending of a self-loop – is handled correctly.
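On an explicit, finite set of evaluations, the deadlock check reduces to testing whether any evaluation enables both some program guard and some specification guard. The sketch below uses plain Python predicates as a stand-in for the unsatisfiability query on ρn ∘ … ∘ ρ1 ∧ ⋁Γa ∧ ⋁Γp; the "⊥" marker plays the role of the s(data).⊥ successor.

```python
# Finite-domain sketch of the deadlock check; `prog_guards` and
# `spec_guards` are hypothetical predicate lists, `evals` the explicit
# analogue of the path condition reaching state s.
def step_or_deadlock(evals, prog_guards, spec_guards):
    """Return the evaluations enabling some program guard AND some
    specification guard; the marker "⊥" signals a deadlock successor."""
    live = [v for v in evals
            if any(g(v) for g in prog_guards) and any(g(v) for g in spec_guards)]
    return live if live else "⊥"
```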
5.4.3 Witnesses

In the case of equivalence-based state matching, the witnesses may take two forms: (1) demonstrating a path condition difference and (2) demonstrating a data evaluation difference. In both cases, the witness contains two sets of state references: one for positive and one for negative instances. Witnesses of form (1) further contain an evaluation y of the input variables, such that for a positive state s the evaluation y satisfies the path condition, i.e. ⊨SAT ρn ∘ … ∘ ρ1 ∧ ⋀ ij = yj; y does not satisfy the path condition of negative states. Witnesses of form (2) contain two sets of evaluations: y of input variables and v of state-forming variables. The idea for witnesses of form (2) is that positive states map y to v, i.e. for a state s it holds that ⊨SAT ρn ∘ … ∘ ρ1 ∧ ⋀ ij = yj ∧ ⋀ xi = vi. Negative states then map y to a vector other than v.

The use of witnesses can be incorporated into database searching (Algorithm 7) as demonstrated in Algorithm 9, into which three additions were made. First, the matching resolution was added on line 1a, which prunes the database Visited, removing states using known witnesses while concurrently modifying the priority queue Witnesses (Algorithm 10). The second addition, on line 4b, extracts the satisfiability model and builds a new witness based on this model (Algorithm 11). The final addition is on line 4c, where we update the priority keys of witnesses according to the newly added witness ω' (Algorithm 12).
Algorithm 9: Database Searching with Witness-Based Matching Resolution
Input : State s; Database Visited; Modifying function update
Output: Pair <matched?, state reference>
1  Key := hash(s(pc))
1a Visited' := witnessBasedResolution(Visited)
2  foreach s' ∈ Visited' : hash(s'(pc)) = Key do
3      if ⊭SAT image(s, s') then
4          return <⊤, &s'>
4a     else
4b         ω' := extractWitness(s, s')
4c         updateWitnesses(ω')
5  return <⊥, Visited'.update(s)>
Algorithm 10: Witness-Based Matching Resolution
Input : Database Visited
Output: Subset Visited' of known multi-states
1  Visited' := Visited
2  while ¬Witnesses.empty() do
3      ω := Witnesses.pop()
4      res := (⊨SAT ρn ∘ … ∘ ρ1 ∧ ⋀ ij = ω.yj ∧ ⋀ xi = ω.vi) ? positive : negative
5      Visited' := Visited' \ ω.res
6      foreach ω' ∈ Witnesses do
7          ω'.positive := ω'.positive \ ω.res
8          ω'.negative := ω'.negative \ ω.res
9          key := |ω'.positive ∪ ω'.negative|
10         if key < threshold then
11             Witnesses.remove(ω')
12         else Witnesses.decreaseKey(ω', key)
13 return Visited'
Algorithm 11: Extract Witness
Input : Differentiable states s, s'
Output: Witness ω'
1 M := getModel(image(s, s'))
2 ω'.y := M|I
3 ω'.v := M|D
4 return ω'
Algorithm 12: Update Witnesses
Input : Witness ω'
1 Witnesses.add(ω', 0)
2 foreach s' ∈ Visited do
3     res := (⊨SAT ρn ∘ … ∘ ρ1 ∧ ⋀ ij = ω'.yj ∧ ⋀ xi = ω'.vi) ? positive : negative
4     ω'.res := ω'.res ∪ {s'}
5     key := |ω'.positive ∪ ω'.negative|
6     Witnesses.decreaseKey(ω', key + 1)
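The witness bookkeeping behind Algorithms 10–12 can be sketched with an addressable map instead of a real decreaseKey heap. The class below is a simplified, hypothetical interface (witness ids instead of state references, a max-priority pop, and the SMT classification replaced by a boolean argument), not the thesis implementation.

```python
# Sketch of the Witnesses priority structure; priorities equal the
# number of database states a witness can resolve.
class Witnesses:
    def __init__(self, threshold=1):
        self.items = {}                      # wid -> [key, positive, negative]
        self.threshold = threshold

    def add(self, wid):                      # cf. Algorithm 12, line 1
        self.items[wid] = [0, set(), set()]

    def classify(self, wid, state_id, positive):   # cf. Algorithm 12, lines 3-5
        entry = self.items[wid]
        entry[1 if positive else 2].add(state_id)
        entry[0] = len(entry[1] | entry[2])

    def pop(self):                           # cf. Algorithm 10, line 3
        wid = max(self.items, key=lambda w: self.items[w][0])
        return wid, self.items.pop(wid)

    def prune(self, resolved):               # cf. Algorithm 10, lines 6-12
        """Drop resolved states everywhere; discard witnesses that fall
        below the usefulness threshold."""
        for wid in list(self.items):
            _, pos, neg = self.items[wid]
            pos, neg = pos - resolved, neg - resolved
            key = len(pos | neg)
            if key < self.threshold:
                del self.items[wid]
            else:
                self.items[wid] = [key, pos, neg]
```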
Chapter 6
Verification of Simulink Designs
The Simulink diagram presented in the introduction (in Figure 1.2) shows why these design-stage models are perfect candidates for demonstrating the efficiency of control explicit—data symbolic model checking. The standard explicit-state model checking cannot be used without modifying the diagram, since the semantics dictates that each input is read in each step. A comprehensive explicit verification would thus need to generate 2^{32n} potential successor states in each step, where n is the number of inputs. In Section 3 we have formalised the process of generating a CFG from a parallel program. Simulink diagrams share many similarities with standard programs, but there are a few unique features, especially regarding the input/output behaviour. These features may simplify the verification and thus need to be pointed out in detail. We therefore first formalise the diagrams, trying to remain consistent with the notation of Section 3 wherever possible. The results of comparing purely explicit model checking with control explicit—data symbolic model checking are presented next.
6.1 Theory of Simulink Verification
In order to be able to denote explicitly which variables may occur in a particular formula, we will use a lower index, e.g. ϕX is a formula over variables from X. In order to distinguish between BV formulae and terms, we will denote formulae with p (for predicates) and functions with f.
A state of a Simulink system is described by the evaluation of X = {d1, …, dm}, the delay variables, whose values for the next iteration are given by a function fX∪I, denoted δi. There is a finite number of input variable templates, I = {i1, …, in}, which can be read iteratively, but for each of them there are bit-vector constraints limiting its possible evaluation (represented by a predicate pI, denoted ιi). Finally, the complete description of a system also requires the outport functions O = {σ1, …, σk} (each of which is an fX∪I). On the side of the specification, we will abstract the given Büchi automaton to only the predicates pX∪I∪O labelling its transitions, Ψ = {ϕ1, …, ϕl}, and assume that the transition system, i.e. the order in which the predicates occur, is given implicitly. Hence the example below contains all the information needed for a Simulink diagram verification query Q = (X, I, O, Ψ).
Example 6.1 (CFG for Simulink). Let Q1 = (X, I, O, Ψ), where

    X = {d1, d2}
        δ1 : (i1 + d1) ∗ (i2 + d2)
        δ2 : d1

    I = {i1, i2, i3}
        ι1 : 1 ≤ i1 ≤ 20
        ι2 : 3 ≤ i2 ≤ 5
        ι3 : 5 ≤ i3 ≤ 10

    O = {σ1, σ2, σ3}
        σ1 : max(i1, i2, i3)
        σ2 : ite (i1 > 7) d1 d2
        σ3 : d1 + i3

    Ψ = {ϕ1, ϕ2}
        ϕ1 : σ2 > 3 ∨ i1 ≤ σ1/2
        ϕ2 : σ1 = σ2 ⇒ i2 ≤ 4

[Figure: the corresponding Simulink block diagram, with inports i1–i3, the +, ∗, max and ite blocks, delays d1, d2, and outports o1, o2.]
As mentioned above, the states of a Simulink transition system are sets of evaluations of the variables from X. The initial state consists of the single evaluation {di ↦ 0}. Let I = {x := {ij ↦ yj} | x ⊨ ⋀j ιj} be the |I|-dimensional polyhedron of allowed inport evaluations; for Q1 it would be {1..20} × {3..5} × {5..10}. Then computing every transition consists of three steps:
1. compute the Cartesian product (of this state) with I: results in a set of evaluations of variables from X ∪ I;
2. create a new state, initialised with the above set of evaluations, for every available specification formula ϕ (there can be more than one, hence the transition system may branch at this point) and remove those evaluations that do not satisfy ϕ;
3. apply δ functions to compute the new evaluations of X .
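Under the explicit-set representation, the three steps above can be sketched directly in Python. Here `state` is a list of delay evaluations (dicts), `inports` enumerates the polyhedron I, and `phis` and `deltas` are hypothetical Python callables standing in for the specification predicates and the δ functions.

```python
# Sketch of one Simulink transition with explicit evaluation sets.
def successors(state, inports, phis, deltas):
    """One step: (1) Cartesian product with the inport polyhedron I,
    (2) one branch per specification formula, filtering evaluations,
    (3) delta functions applied to obtain the new delay evaluations."""
    joint = [dict(d, **i) for d in state for i in inports]        # step 1
    return {name: {tuple(delta(v) for delta in deltas)           # step 3
                   for v in joint if phi(v)}                     # step 2
            for name, phi in phis.items()}
```

Each successor state is a set of tuples of new delay values, one branch per specification formula, mirroring the branching of the transition system.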
Assuming that the computation is in state s, represented as the evaluation of delay variables, and that the relevant transition is associated with a specification ϕ, the resulting next state is s' = {x := {di ↦ δi(v(d), v(i))} | v ∈ s × I ∧ x ⊨ ϕ}. The system can reach a deadlock if there is no v ∈ s × I for which {di ↦ δi(v(d), v(i))} ⊨ ϕ. The model checking process computes the graph of the Simulink computation, where identical states are unified, i.e. if a newly generated state s'' (as a successor of s) is found equal to an already existing state s', then s'' is discarded and s' is acknowledged as a successor of s. For the purpose of providing the user with counterexample traces, the states also contain an |I|-long sequence of inport values that satisfies the specification condition. This is easily implementable, and will only be mentioned if a significant change in the computation was necessary to incorporate the counterexample traces.
Example 6.2 (Simulink Semantics). We continue with the query Q1 from Example 6.1 and simplify the notation by establishing two transformations. First, ϕ̃i maps a set of evaluations of X ∪ I to the subset in which every evaluation satisfies ϕi. Rephrased formally, we have

    ϕ̃i(ν) = ν   if ν ⊨ ϕi
             ⊥   otherwise

Second, ∆̃ maps a set of the same evaluations to evaluations of X after the δ functions have been applied, hence

    ∆̃i(ν) = {(di, vi) | vi = δi(ν(d1), …, ν(dn), ν(i1), …, ν(im))}
Hence a part of the transition system of Q1 is as follows:

    s0 = {di ↦ 0}
    s01 = ∆̃ ∘ ϕ̃1(s0 × I)        s011 = ∆̃ ∘ ϕ̃1(∆̃ ∘ ϕ̃1(s0 × I) × I)
    s02 = ∆̃ ∘ ϕ̃2(s0 × I)        s012 = ∆̃ ∘ ϕ̃2(s01 × I)

Concretely, the set s0 × I has 360 elements. ϕ̃1(s0 × I) and ϕ̃2(s0 × I) have 63 and 315 elements, respectively; s01 has 12 elements and s02 has 45 elements. For example, s01 = {(3, 0), (4, 0), (5, 0), (6, 0), (8, 0), (9, 0), (10, 0), (12, 0), (15, 0), (16, 0), (20, 0), (25, 0)}.
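The element counts above can be reproduced by brute-force enumeration. One caveat: with d1 = d2 = 0 in s0 the premise σ1 = σ2 of ϕ2 as printed in Example 6.1 never holds, whereas reading ϕ2 as σ1 = σ3 ⇒ i2 ≤ 4 reproduces the stated 315 and 45 exactly; the sketch below uses that reading, labelled here as an assumption.

```python
# Brute-force check of Example 6.2 over s0 × I = {1..20} × {3..5} × {5..10}.
from itertools import product

domain = list(product(range(1, 21), range(3, 6), range(5, 11)))  # s0 × I

def outports(i1, i2, i3, d1=0, d2=0):
    s1 = max(i1, i2, i3)
    s2 = d1 if i1 > 7 else d2            # sigma2 = ite (i1 > 7) d1 d2
    s3 = d1 + i3                         # sigma3
    return s1, s2, s3

def phi1(i1, i2, i3):
    s1, s2, _ = outports(i1, i2, i3)
    return s2 > 3 or i1 <= s1 // 2       # bit-vector division rounds down

def phi2(i1, i2, i3):                    # ASSUMED reading: sigma1 = sigma3 => i2 <= 4
    s1, _, s3 = outports(i1, i2, i3)
    return s1 != s3 or i2 <= 4

kept1 = [v for v in domain if phi1(*v)]
kept2 = [v for v in domain if phi2(*v)]
# delta1 = (i1 + d1) * (i2 + d2), delta2 = d1, evaluated at d1 = d2 = 0
s01 = {(i1 * i2, 0) for (i1, i2, i3) in kept1}
s02 = {(i1 * i2, 0) for (i1, i2, i3) in kept2}
print(len(domain), len(kept1), len(kept2), len(s01), len(s02))
# → 360 63 315 12 45
```

The computed s01 also matches the twelve pairs listed in the example.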
The process described so far produces a transition system, where the states are evaluations of the delay variables together with a location of the Büchi automaton, and the branching is given by that same automaton. Without going into the details of the product of a program and its specification – covered in the previous section – the answer to a verification query Q is equivalent to the presence of accepting cycles in the produced transition system. The efficiency of answering a verification query, i.e. the complexity of Simulink model checking, is most noticeably influenced by the choice of how to represent the sets of evaluations. That choice is the biggest contributor to the state space explosion, since, unlike in parallel programs, the control-flow non-determinism is negligible in Simulink models.
6.1.1 Verification with Explicit Sets

The first representation considered in the context of Simulink verification stores the allowed evaluations in an explicit set, enumerating all possible combinations of individual variable evaluations. This approach is closely related to the one we proposed in [BBB+12], but improves on it in several aspects. Effectively, rather than repeating the execution for every evaluation of both inport and delay variables, we only represent the delay variables in an explicit evaluation; the inport variables are stored purely symbolically, only as their constructing predicates ιj, outside the individual states. The above description of the Simulink verification semantics justifies this optimisation, since the inports are only used for the evaluation of the Ψ predicates from the specification. They are discarded during the ∆̃ transformation, and can be recovered using an inverse transformation if necessary, e.g. during counterexample generation.
The implementation itself is then a relatively straightforward extension of the purely explicit approach. The states enumerate every allowed evaluation and thus separate access to each of these evaluations is permitted. Computing the Cartesian product with I amounts to enumerating all combinations of delay and inport variable evaluations. For the ϕ̃ and ∆̃ transformations, one iteratively considers the individual evaluations and, respectively, removes those not satisfying ϕ or applies the δ functions. Most importantly, however, deciding equality of states is possible on the level of syntax: two states are the same if their representations in memory match. Verification with explicit sets clearly retains the biggest limitation of the purely explicit approach, i.e. the spatial complexity grows exponentially with the ranges of the inport variables. We will postpone the comparison between the explicit-set approach and the purely explicit approach until Section 6.2.2 and only remark that while the improvement is considerable, it does not allow specification of ranges with respect to their bit-width. That ability appears to require a symbolic representation, and we continue our analysis towards temporal verification of Simulink models with a method which uses BV formulae and satisfiability modulo theories (SMT) procedures for this theory.
6.1.2 SAT-Based Verification
The need to iteratively read from inputs via the inport variables complicates the otherwise relatively straightforward application of symbolic execution. Later we will show that the execution-based approach, where current values of variables are represented as symbolic functions over input variables, entails another complication in deciding state equality. In standard symbolic execution, the variables are initialised to an arbitrary value only once, at the beginning. Thus reading from inports requires strengthening the standard symbolic execution to allow computation with a potentially infinite number of input variables. Henceforth, only the inport variables are considered first-order variables; the delays and outports are only functions in the BV theory (over inport variables). The inports are history-dependent, and thus one may label the symbols from I that stand for the inport variables, i.e. I^0, I^1, I^2 and so on. Given that the only branching in the transition system is caused by the particular choice of a specification transition, the states can be represented as l-long sequences of numbers ρ = (r1, …, rl), where l is the length of the computation. Each number in this sequence represents which specification formula was selected in the respective branching of the computation. Indeed, the functions representing the delays δ^l, the outports σ^l and the specification predicates ϕ^l are unambiguously defined for each l recursively as follows:
    δ^0 = δ[ij/i_j^0, dj/0],              δ^{l+1} = δ[ij/i_j^{l+1}, dj/δ_j^l];
    σ^0 = σ[ij/i_j^0, dj/0],              σ^{l+1} = σ[ij/i_j^{l+1}, dj/δ_j^l];      (6.1)
    ϕ^0 = ϕ[σj/σ_j^0, ij/i_j^0, dj/0],    ϕ^{l+1} = ϕ[σj/σ_j^{l+1}, ij/i_j^{l+1}, dj/δ_j^l],

where f[x/y] is the formula f with every occurrence of x replaced by y. Hence every formula at any point of the verification computation uses only the inport variables decorated with history indices. As mentioned above, the system reaches a deadlock if no value satisfies the specification formula. Let the computation be in a state represented by ρ and the outgoing transition labelled with the formula ϕp. Then checking whether this transition leads to a deadlock is equivalent to checking the satisfiability of the following recursive path condition formula, where ε is the empty sequence:

    path(ε) = ⋀_{1≤j≤|I|} ι_j^0,
    path(ρ.p) = path(ρ) ∧ ⋀_{1≤j≤|I|} ι_j^{|ρ|+1} ∧ ϕ_p^{|ρ|+1}.

Also note that a model of this formula, i.e. a satisfying evaluation of the inport variables, is a counterexample trace (one of them, to be precise), which the SMT solver can generate while deciding the satisfiability. Finally, to complete the verification process, one needs to be able to decide whether two states are the same. Let ρ and ρ' be two states for which it needs to be assessed whether they are identical, i.e. whether the sets of possible evaluations of the delay variables are the same. Then the function δ_j^{|ρ|} represents the possible values of dj in ρ, provided that ⊨ path(ρ) ∧ path(ρ'), i.e. both path conditions are satisfiable. However, the δ functions represent the current values of the delays by means of the inport variables, describing to what evaluation of delays a particular combination of inputs leads. The states ρ, ρ' may not be of the same length and thus may differ in the number of inputs. Consequently, comparing two states amounts to comparing the images of the δ functions, i.e. comparing the sets of possible evaluations. We will continue this argument in Section 6.3 and for now simply state the quantified version of ψ(ρ, ρ') := Ψ_{ρρ'} ∨ Ψ_{ρ'ρ} – a formula which is satisfiable iff ρ and ρ' are two different states:
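The recursive construction of path(ρ) can be sketched with one fresh tuple of inport values per step. In this simplified, hypothetical rendering, a formula is a Python predicate receiving the step index and the whole inport history, and a brute-force enumeration over a small finite domain stands in for the SMT query; a real implementation instead builds BV formulae over the history-indexed inport variables and hands them to a solver.

```python
# Finite-domain sketch of path(rho) and its satisfiability check.
from itertools import product

def path_condition(rho, iota, phis):
    """path(rho): the history has |rho| + 1 steps (step 0 plus one step
    per chosen specification formula index in rho)."""
    def holds(history):
        if len(history) != len(rho) + 1:
            return False
        if not all(iota(x) for step in history for x in step):   # iota_j^l
            return False
        return all(phis[p](l + 1, history) for l, p in enumerate(rho))
    return holds

def satisfiable(cond, domain, length):
    """Brute-force stand-in for the SMT query; a found model doubles as
    a counterexample trace, as noted in the text."""
    for history in product(domain, repeat=length):
        if cond(history):
            return history
    return None
```

For instance, with a single inport ranging over {0, 1, 2} and a specification predicate relating two consecutive steps, `satisfiable` returns the first inport history that realises the chosen branch, or None when the transition deadlocks.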