Modeling, Specification Languages, Array Programs
Chapter 4 Modeling, Specification Languages, Array Programs

This chapter presents some techniques for verifying programs with complex data structures. Typical examples of such data structures are arrays and lists. The main ingredient is to extend the language of expressions and formulas into an expressive specification language. Section 4.1 presents generalities about specification languages, while the other sections give an overview of the constructs for modeling and specifying that are available in the Why3 tool [7]. Finally, Section 4.6 is dedicated to the modeling and proof of programs on the array data structure.

4.1 Generalities on Specification Languages

Many variants of specification languages have been proposed in the past. They can be grouped into classes according to the kind of logic they are based on.

Algebraic Specifications
This class of languages is characterized by the use of multi-sorted first-order logic with equality, and usually by the use of rewriting techniques [13], which makes it possible to mix executable-style and declarative-style specifications. Well-known languages and tools of this class are Larch [17], KIV [26], CASL (Common Algebraic Specification Language) [2], OBJ [16] and Maude [10].

Set-theory based approaches
Another class of languages is based on classical set theory. These languages are not typed; typing properties are instead encoded by formulas on sets. Reasoning in such languages is also typically done by rewriting techniques. A precursor of this class is the Z notation [27]. Tools in this class include VDM [18] and Atelier B [1].

Higher-Order Logic
Higher-order logic is characterized by the use of mathematical functions as objects of the language. It is based on advanced lambda-calculi (e.g. System F). Reasoning on such specifications is typically done in an interactive environment, where proofs are constructed manually using tactics/strategies.
Well-known tools implementing higher-order logics are PVS [24], Isabelle [22], HOL4 [23] and Coq [5].

Object-Oriented languages
These are based on the idea of using the object-oriented paradigm in a logic world. They are typically used in the context of object-oriented-style development. The precursor is the language Eiffel [20], which is both a programming language and a specification language, and was designed to support the concept of design-by-contract. Other well-known languages are OCL (Object Constraint Language) [28], which allows formal specifications to be incorporated into the UML diagrammatic design methodology, and the Java Modeling Language [9].

Satisfiability Modulo Theories
In 1980, Nelson and Oppen [21] proposed a new technique for combining propositional reasoning and dedicated decision procedures for specific theories. This gave birth to a new class of automated theorem provers called SMT solvers. A precursor of such solvers is Simplify [14], which was used for program verification in the ESC/Java tool. The SMT-LIB format [25] is not really a specification language but a common language for describing verification conditions as they typically come from program verification. In this context, the intermediate languages for program verification Why [15] and Boogie [3] propose small specification languages that can be supported by SMT solvers.

Nowadays, the successors of Why and Boogie, respectively Why3 [7] and Dafny [19], propose modeling languages that can be seen as a compromise between:

• Expressivity of the language, to be able to design specifications of complex programs;
• Support by automated provers, in particular modern SMT solvers like Alt-Ergo [6], Z3 [12] and CVC3 [4], which support quantified first-order formulas.

The following sections present an overview of the constructs of the Why3 specification language.
4.2 Defined Functions and Predicates

Direct definitions of new functions and predicates can be given in the form

    function f(x_1 : τ_1, ..., x_n : τ_n) : τ = e
    predicate p(x_1 : τ_1, ..., x_n : τ_n) = e

where the τ_i and τ are not reference types. In such definitions, no recursion is allowed, except in the case described in Section 4.3.3. Since expressions have no side effects, these definitions introduce total functions and predicates, which are pure.

Example 4.2.1 Here are a few examples of definitions of functions and predicates.

    function sqr(x:int) : int = x * x
    function min(x:real,y:real) : real = if x ≤ y then x else y
    function abs(x:real) : real = if x ≥ 0.0 then x else -x
    predicate in_0_1(x:real) = 0.0 ≤ x ∧ x ≤ 1.0
    predicate prime(x:int) =
      x ≥ 2 ∧ forall y z:int. y ≥ 0 ∧ z ≥ 0 ∧ x = y*z → y=1 ∨ z=1

4.3 Constructed Types

New logic data types can be defined by the standard constructions of compound types. Since we are in a logic language, these types must be pure in the sense of pure functional programming, that is, the values of these types cannot be modified "in-place".

4.3.1 Product Types

Types can be defined as tuples of already defined types, for example

    type intpair = (int, int)
    function sumprod(n:int,m:int) : intpair = (n+m,n*m)
    type vector = (real, real, real)
    function diag(x:real) : vector = (x,x,x)

The construct let x = e1 in e2 is extended so as to allow introducing a tuple, as in

    function sum(p:intpair) : int = let (x,y) = p in x+y

The other kind of product type is the record type, defined classically by a sequence of named fields. The dot notation gives access to a given field. For example,

    type vector = { x:real; y:real }
    function norm(v:vector) : real = sqrt(v.x * v.x + v.y * v.y)
    function rotate(v:vector,a:real) : vector =
      { x = cos(a) * v.x + sin(a) * v.y ;
        y = cos(a) * v.y - sin(a) * v.x }

Again, it is important to notice that these are pure types; the components of tuples and the fields of records are immutable.
In other words, there is no way to assign a field.

4.3.2 Sum Types

We can define new sum types in a functional programming style, with declarations of the form

    type t =
      | C_1(τ_{1,1}, ..., τ_{1,n_1})
      | ...
      | C_k(τ_{k,1}, ..., τ_{k,n_k})

where C_1, ..., C_k are constructors.

Example 4.3.1 The type of cards in a standard 54-card deck can be defined as

    type suit = Spade | Heart | Diamond | Club
    type value = Ace | King | Queen | Jack | Number(int)
    type card = Normal(suit,value) | Joker

Sum types come with definition by pattern-matching, of the form

    match expression with
    | C_1(x_{1,1}, ..., x_{1,n_1}) → expression_1
    | ...
    | C_k(x_{k,1}, ..., x_{k,n_k}) → expression_k
    end

where each x_{i,j} is a variable that is bound in expression_i. For convenience, one may use extended pattern-matching [8]: inside patterns, an argument of a constructor may itself be a pattern. One may also use the wildcard character _. As in OCaml or Why3, we distinguish between constructors and variables by requiring a capital letter as the first character of a constructor.

Example 4.3.2 Here is a function that returns the score of a given card in an imaginary game.

    function score(c:card,trump:suit) : int =
      match c with
      | Joker → 100
      | Normal(s,Ace) → if s=trump then 50 else 20
      | Normal(s,Number(n)) → if s=trump then 50 else n
      | Normal(_,v) →
        match v with
        | King | Queen → 40
        | Jack → 30
        | _ → 0
        end
      end

4.3.3 Recursive Sum Types

Sum type definitions can be recursive. Recursive definitions of functions or predicates are allowed if the recursive calls are on structurally smaller arguments, which corresponds to the notion of primitive recursion on such a sum type.

Example 4.3.3 Here is a definition of binary trees holding an integer at each node, and a function giving the sum of the elements of a tree.

    type tree = Leaf | Node(tree,int,tree)
    function sum(t:tree) : int =
      match t with
      | Leaf → 0
      | Node(l,x,r) → sum(l) + x + sum(r)
      end

The recursion is allowed since the arguments l and r are subtrees of the input.
4.3.4 Polymorphic Types

All the constructs for type definitions above can be parameterized, i.e. the type is given one or several type variables as parameters, as in

    type vector α = (α,α,α)
    type tagged α = { tag : int ; content : α }
    type option α = None | Some(α)
    type tree α β = Leaf(α) | Node(tree α β, β, tree α β)

4.3.5 Example of Lists

Lists, or finite sequences, are very common data structures. The type of lists can be modeled by a polymorphic, recursive sum type as

    type list α = Nil | Cons(α, list α)

Classical functions on lists can be defined by primitive recursion, such as

    function append(l1:list α,l2:list α) : list α =
      match l1 with
      | Nil → l2
      | Cons(x,l) → Cons(x,append(l,l2))
      end

    function length(l:list α) : int =
      match l with
      | Nil → 0
      | Cons(x,r) → 1 + length(r)
      end

    function rev(l:list α) : list α =
      match l with
      | Nil → Nil
      | Cons(x,r) → append(rev(r),Cons(x,Nil))
      end

When proving programs operating on lists, and more generally on recursive sum types such as the binary trees above, it is very common that auxiliary properties are needed, such as

    lemma append_nil: forall l:list α. append(l,Nil) = l
    lemma append_assoc: forall l1 l2 l3:list α.
      append(append(l1,l2),l3) = append(l1,append(l2,l3))

Such lemmas can be stated in this form in the Why3 tool. They must be proved by induction, which means that they are typically out of reach of SMT solvers.