Axiomatic Semantics
Total Page:16
File Type:pdf, Size:1020Kb
Advanced Topics in Programming Languages Spring Semester, 2012 Lecture 6: April 24, 2012 Lecturer: Mooly Sagiv Scribe: Michal Balas and Yair Asa Axiomatic Semantics 6.1 The basic idea The problem we would like to solve is how to prove that a program does what we require of it. Given a program, we specify its required behavior based on our intuitive understanding of it. We can run it according to the operational semantics or denotational semantics and compare to its behavior there. For some programs we need to be more abstract (for example programs that receive input), and then it is necessary to use some logic to reason about the program (and how it behaves on a set of inputs and not in one specific execution path). In this case we may eventually develop a formal proof system for properties of the program or showing that it satisfies a requirement. We can then use the proof system to show correctness. These rules of the proof system are called Hoare or Floyd-Hoare rules. Floyd-rules are for flow-charts and Hoare-rules are for structured languages. Originally their approach was advocated not just for proving properties of programs but also giving a method for explaining the meaning of program. The meaning of a program was specified in terms of \axioms" saying how to prove properties of it, in other words it is given by a set of verification rules. Therefore, this approach was named axiomatic semantics. Axiomatic semantics has many applications, such as: Program verifiers • Symbolic execution tools for bug hunting • Software validation tools • Malware detection • Automatic test generation • It is also used for proving the correctness of algorithms or hardware descriptions, \extended static checking (e.g., checking array bounds), and documenting programs and interfaces. 2 Advanced Topics in Programming Languages c Tel Aviv Univ. 6.1.1 Example P An example program that computes the sum of the first hundred numbers: 1≤m≤100m. S := 0; N := 1; while (N = 101) do : S := S + N; N := N + 1; We can see that the commands S := 0; N := 1; initialize the values in the locations. so we add comments as follows; S := 0; S = 0 Nf := 1;g N = 1 whilef g(N = 101) do : S := S + N; N := N + 1; We can also add the comment after the execution of the while loop meaning that S will have the required value. S := 0; S = 0 Nf := 1;g N = 1 whilef g(N = 101) do : S := S + N; N := N + 1; P N = 101 S = 1≤m≤100m f ^ g Inside the while loop we add comment on both N and S: N ranges from 1 to 101 and S represents the partial sum. This comment express the key relationship between the value at location S and the value at location N. S := 0; S = 0 Nf := 1;g N = 1 whilef g(N = 101) do : P 1 N < 101 S = 1≤m<N m Sf :=≤ S + N; ^ g P 1 N < 101 S = 1≤m≤N m Nf :=≤ N + 1; ^ g P N = 101 S = 1≤m≤100m f ^ g The basic idea 3 P The assertion S = 1≤m≤N m is called an invariant of the while-loop because it remains true under each iteration of the loop. 6.1.2 Partial Correctness and Total Correctness We can base a proof system on assertions of the form P S Q f g f g where P; Q are assertions, i.e. extensions of boolean expressions, and S is a statement (also called: command). The interpretation of an assertion of this form is: for all states σ which satisfy P , if the execution of S from state σ terminates in state σ0, then σ0 satisfies Q. In other words, any terminating execution of S from a state satisfying P ends up in a state satisfying Q. P is the precondition and Q is the postcondition of the assertion P S Q . For example: y x z := x; z := z + 1 y < z is a valid assertion. f g f g f ≤ g f g Assertions of this form are known as Hoare triples or Hoare assertions, named after the logician C.A.R. Hoare. These assertions are called partial correctness assertions because they do not say anything about the command S if it fails to terminate. An example for a statement that does not terminate when executed from any state is: S while true do skip ≡ Since S does not terminate, true S false is a valid partial correctness assertion. In fact any partial correctness assertionf forg thisf specificg S is valid. Total correctness assertions, on the other hand, are assertions of the form [P ]S[Q] and the interpretation of such an assertion is: for all states σ which satisfy P , the execution of S from state σ must terminate in a state σ0 that satisfies Q. The semantics of assertions for partial correctness: σ = A means that the state σ satisfies the assertion A (or: A is true at state σ). Inj P S Q , the statement S denotes a partial function from initial states to final states. Thus,f g thef partialg correctness assertion means: σ; σ0 Σ:(σ = P S; σ σ0) σ0 = Q: By the denotational8 2 semantics:j ^ h i ! ) j σ Σ:(σ = P S S σ = ) S S σ = Q: If we adopt8 the2 conventionj ^ thatJ K 6 represents? ) J anK undefinedj state that satisfies all assertions, that is, for all A, = A, the partial? correctness assertion P S Q can be defined as follows: σ Σ.σ?j= P S S σ = Q: f g f g 8 2 j ) J K j 4 Advanced Topics in Programming Languages c Tel Aviv Univ. 6.2 Assn - an Assertion Language What kind of assertions should we include? Since we would like to reason about boolean expressions, we include all the assertions in Bexp. We want to make assertions using the quantifiers \ i " and \ i ", therefore we work with extensions of Bexp and Aexp which include8 integer··· variables9 ··· i over which we can quantify. Then , for example, we use i:k = i l: to denote that an integer k is a multiple of another l. We also import well known9 mathematical× concepts, for example, we use: n! = n (n 1) 2 1 for the factorial function. × − × · · · × × We first define Aexpv which is, basically, Aexp extended to include integer variables i; j; k, etc.. So Aexpv is given by: a ::= n X i a0 + a1 a0 a1 a0 a1 j j j j − j × where n ranges over numbers, X ranges over locations and i ranges over integer variables. We extend boolean expressions to include these more general arithmetic expressions and quantifiers, as well as implication: A ::= true false a0 = a1 a0 a1 A0 A1 A0 A1 A A0 a1 i:A i:A j j j ≤ j ^ j _ j : j ) j 8 j 9 We call the set of extended boolean assertions, Assn. 6.2.1 Example A program that computes the gcd of two numbers: while (M = N) do if:M N ≤ then N := N M else M := M − N − In this example we would like to say that there are no preconditions (true) and that the postcondition is that M; N both hold the gcd of the original M; N. In order to formulate this we need to add two variables m; n. Now we can set: precondition: (M = m) (N = n) (m 0) (n 0) postcondition: M = N =^ gcd(m; n)^ ≥ ^ ≥ Assn - an Assertion Language 5 6.2.2 Free and Bound Variables An occurrence of an integer variable i in an assertion is bound if it occurs in the scope of an enclosing quantifier i or i. Otherwise, it is free. Examples: 8 9 i:k = i l •9the occurrence× of the integer variable i is bound, while those of k; l are free. (i + 100 77) ( i:j + 1 = i + 3) • here the≤ same^ integer8 variable i has two different occurrences in the same assertion: the first is free and the second is bound. j is free. A formal definition using definition by structural induction: Let FV (a) be the set of free variables of arithmetic expressions, extended by integer variables (a Aexpv). By structural induction: 2 FV (n) = FV (X) = FV (i) = i ; f g S FV (a0 + a1) = FV (a0 a1) = FV (a0 a1) = FV (a0) FV (a1) for all numbers n, locations X, and− integer variables× i. Let FV (A) be the free variables of an assertion A. By structural induction: FV (true) = FV (false) = ; S FV (a0 = a1) = FV (a0 a1) = FV (a0) FV (a1) ≤ S FV (A0 A1) = FV (A0 A1) = FV (A0 A1) = FV (A0) FV (A1) FV ( A^) = FV (A) _ ) FV (:i:A) = FV ( i:A) = FV (A) i 8 9 n f g for all a0; a1 Aexpv, integer variables i and assertions A0;A1;A. Any variable2 which occurs in an assertion A and is not free is said to be bound. An assertion with no free variables is closed. 6.2.3 Substitution Visualization of an assertion A, with free occurrences of integer variable i: --- i --- i ---- Let a be a \pure" arithmetic expression (contains no integer variables). Then the result of substituting a for i is: A[a=i] --- a --- a ---- ≡ In general substitutions, if a contains integer variables then it may be necessary to rename some bound variables of A in order to avoid some variables of a becoming bound by quantifiers in A.