Program Verification March 2014 Lecture 5 - Axiomatic Semantics Lecturer: Noam Rinetzky Scribe: Nir Hemed

1.1 Axiomatic semantics

The development of the theory is attributed to Robert Floyd, C.A.R. Hoare, and Edsger W. Dijkstra.

Proving program correctness

• Why prove correctness?

– we discussed this in previous classes

• What is correctness?

– we have a formal definition of correctness w.r.t. a formal specification

• How should this be done?

– Reasoning at the level of the operational semantics
  ∗ tedious and cumbersome
  ∗ there is a good chance of making mistakes
– A better option is formal reasoning using axiomatic semantics
  ∗ we will formulate a syntactic proof
  ∗ such proofs can be verified by a computer program (i.e. machine checkable)

Program correctness concepts

• Property - a relationship between an initial state and a final state.

• Partial correctness - properties that hold if a program terminates.

• Termination - the program always terminates.

• Total correctness = partial correctness + termination.

– other correctness conditions exist: on memory, on concurrency (e.g. linearizability), etc.

Example Factorial

Sfac ≡ y:=1; while (x!=1) do (y:=y*x; x:=x-1)

We define a correctness condition: if the statement Sfac terminates then the final value of y will be the factorial of the initial value of x.


Using Natural Semantics we would formally write

⟨Sfac, s⟩ → s′ implies that s′ y = (s x)!
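The correctness condition can be spot-checked by executing Sfac as a small interpreter playing the role of the natural-semantics relation ⟨Sfac, s⟩ → s′. This is only a sanity check over finitely many initial states, not a proof (a sketch; note that Sfac diverges for x < 1, so only x ≥ 1 is tested):

```python
from math import factorial

# Run the factorial statement Sfac from an initial state and return the
# final state; states are dicts mapping variable names to integers.
def run_sfac(s):
    s = dict(s)                    # do not mutate the caller's state
    s['y'] = 1                     # y := 1
    while s['x'] != 1:             # while (x != 1) do
        s['y'] = s['y'] * s['x']   #   y := y * x
        s['x'] = s['x'] - 1        #   x := x - 1
    return s

# The correctness condition: if <Sfac, s> -> s' then s' y = (s x)!
for x0 in range(1, 8):
    s0 = {'x': x0}
    s1 = run_sfac(s0)
    assert s1['y'] == factorial(s0['x'])
```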

A detailed staged proof for this case is given in the slides1. As the proof shows, in order to prove the correctness condition of the program, at every stage we have to examine the derivation tree of the statement. We link pre- and post-states using the derivation trees. The key point is that such a proof is, by definition, a semantic proof. The problems arising from such proofs include:

1. proof is very laborious

• we need to connect all the transitions and argue about relationships between states
• this originates from the fact that we are too closely tied to the semantics of the programming language

2. there is no clear methodology to find this proof

3. can we tell if this proof is correct?

• other than manually examining it?

Axiomatic verification approach We ask again: what do we need in order to prove that a program does what it is supposed to do? We need to:

• Specify the required behaviour

• Compare the behaviour with the one obtained by the (already familiar) operational semantics

• Develop a proof system for showing that the program satisfies a requirement

• Mechanically use the proof system to show correctness

• The meaning of the program will now be a set of verification rules

Assertion based verification (Floyd, ’67) The technique was first applied to (annotated) flow programs. The basis for the axiomatic approach is being able to move out of the syntax of the program and into the domain of, for example, arithmetic. This system handled two problems that exist in computer programs and do not exist in logic sentences:

1. assignments to variables

2. handling of loops

• this was done by examining paths in the program and identifying cut-points, which are points in the program at which a certain inductive assertion holds.

1 The staged proof of Sfac using NS is given in Wiley’s textbook (Pages 169-171)

• a simple path that ends in a cut-point allows one to formulate inductive, loop-free proofs.

Hoare logic C.A.R. Hoare first defined axiomatic semantics (1969). We will now define the semantics of the programming language as a proof system. We aim for a structured programming language. Assertions, a.k.a. Hoare triples, are:

{P } C {Q} where:

• P - a pre-condition, is a state predicate (e.g. x > 0).

• Q - a post-condition, is a state predicate (e.g. x > 1).

• C - a statement.

• to be read ”if P holds in the initial state, and if the execution of C terminates on that state, then Q will hold in the state in which C halts”.

• C is not required to always terminate (e.g. {true} while true do skip {true})

Total correctness is expressed via

[P ] C [Q]

• to be read ”if P holds in the initial state, then the execution of C must terminate on that state, and Q will hold in the state in which C halts”.

Example Factorial - continued. We ask: {?} y:=1; while (x!=1) do (y:=y*x; x:=x-1) {?} Can we say the following?

{x > 0} y:=1; while (x!=1) do (y:=y*x; x:=x-1) {y = x!}

The answer is no: the value of x in the final assertion is already different from the one in the initial state. A possible solution is using logical variables:

{x = n} y:=1; while (x!=1) do (y:=y*x; x:=x-1) {y = n!}

Note: a logical variable is not used by the program and is always immutable. Now, using logical variables, we can provide annotations to the program. Here is the factorial example partial correctness proof outline:

{x = n}
y := 1;
{x > 0 =⇒ y*x! = n! ∧ n ≥ x}
while (x != 1) do
  {x - 1 > 0 =⇒ (y*x)*(x-1)! = n! ∧ n ≥ x - 1}
  y := y*x;
  {x - 1 > 0 =⇒ y*(x-1)! = n! ∧ n ≥ x - 1}
  x := x - 1
{y*x! = n! ∧ n ≥ 0 ∧ x = 1}

Compare this “proof” with the laborious semantic proof. There, we had to break the proof into different parts: one for the body of the loop, one for the loop itself and another for the entire program. In each stage we found a link between an initial and a final state and used that to unroll the loop. We would like to formalise such a relation between properties of initial and final states when using the axiomatic semantics. We do so by introducing the concept of partial correctness.
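The annotations in the proof outline can also be exercised as runtime assertions: each assertion from the outline becomes an `assert` at the corresponding program point. This is a sanity check on sample inputs, not a proof (a sketch; the helper `implies` is introduced here, n plays the role of the logical variable):

```python
from math import factorial

def implies(p, q):
    # Boolean implication, used to encode the "x > 0 ==>" guards.
    return (not p) or q

def annotated_fac(n):
    x = n
    assert x == n                                                  # {x = n}
    y = 1
    assert implies(x > 0, y * factorial(x) == factorial(n) and n >= x)
    while x != 1:
        assert implies(x - 1 > 0,
                       (y * x) * factorial(x - 1) == factorial(n) and n >= x - 1)
        y = y * x
        assert implies(x - 1 > 0,
                       y * factorial(x - 1) == factorial(n) and n >= x - 1)
        x = x - 1
        # the loop invariant is re-established after each iteration
        assert implies(x > 0, y * factorial(x) == factorial(n) and n >= x)
    assert y * factorial(x) == factorial(n) and n >= 0 and x == 1  # postcondition
    return y

for n in range(1, 8):
    assert annotated_fac(n) == factorial(n)
```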

Formalizing partial correctness

• s |= P - “assertion P holds in state s”

• Σ - the set of program states

• ⊥ - a special undefined state

Let us recall how we defined the effect of a statement in natural semantics:

Sns⟦C⟧s = s′, if ⟨C, s⟩ → s′
Sns⟦C⟧s = ⊥, otherwise

We will use this definition to define partial correctness:

•{P } C {Q}

– ∀s, s′ ∈ Σ. (s |= P ∧ ⟨C, s⟩ → s′) =⇒ s′ |= Q

– alternatively: ∀s ∈ Σ. (s |= P ∧ Sns⟦C⟧s ≠ ⊥) =⇒ Sns⟦C⟧s |= Q
– conventions:
  ∗ ∀P. ⊥ |= P
  ∗ ∀s ∈ Σ. s |= P =⇒ Sns⟦C⟧s |= Q
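This definition of partial correctness can be checked by brute force over a small finite set of states. Below, a statement C is modelled as a Python function returning the final state, or None for ⊥ (a sketch; `valid_triple`, `inc` and the example triples are illustrative names, not from the lecture):

```python
# {P} C {Q} holds iff for all s: s |= P and <C, s> -> s' implies s' |= Q.
# When C(s) is None (bottom), the triple holds vacuously for that state.
def valid_triple(P, C, Q, states):
    for s in states:
        if P(s):
            s1 = C(s)
            if s1 is not None and not Q(s1):  # C terminated but Q fails
                return False
    return True

# Example: {x > 0} x := x + 1 {x > 1} is valid; {true} x := x + 1 {x > 1} is not.
inc = lambda s: {**s, 'x': s['x'] + 1}
states = [{'x': v} for v in range(-3, 4)]
assert valid_triple(lambda s: s['x'] > 0, inc, lambda s: s['x'] > 1, states)
assert not valid_triple(lambda s: True, inc, lambda s: s['x'] > 1, states)
```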

Notes: we chose natural semantics since the structure of the derivation trees is similar to what we use in the axiomatic approach. We could alternatively have used structural operational semantics, with a slightly different definition:

• {P} C {Q}

– ∀s, s′ ∈ Σ. (s |= P ∧ ⟨C, s⟩ =⇒∗ s′) =⇒ s′ |= Q
– alternatively: ∀s ∈ Σ. (s |= P ∧ Ssos⟦C⟧s ≠ ⊥) =⇒ Ssos⟦C⟧s |= Q
– conventions:
  ∗ ∀P. ⊥ |= P
  ∗ ∀s ∈ Σ. s |= P =⇒ Ssos⟦C⟧s |= Q

A point to consider: could we have used natural semantics to define total correctness?

How can we express predicates? We can choose between two alternatives:

1. Extensional approach: abstract mathematical functions P : State −→ {tt, ff}

2. Intensional approach: via a language of formulae (a language that describes assertions)

We choose the second option:

An assertion language We will use an assertion language based on first-order logic with arithmetic, because propositional logic is not expressive enough to express the predicates needed for many proofs. Intuitively, we obtain the language by augmenting Bexp in the following way:

• Allow quantifiers (∀z., ∃z., e.g. ∃z.z = k × n)

• Import well known mathematical concepts (e.g. n! ≡ n × (n − 1) × ... × 2 × 1)

• We include both program variables and logical variables

First order logic (reminder)

Free/bound variables A variable is said to be bound in a formula when it occurs in the scope of a quantifier; otherwise it is said to be free.

• ∃i.k = i × m - here i is bound.

• (i + 100 ≤ 44) ∧ (∀i.j + i = i + 3) - here i is free in the first occurrence and bound in the others.

We denote the set of free variables of the expression A as FV (A). FV (A) is defined inductively on the abstract syntax tree of A:
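The inductive definition of FV can be made concrete with a small interpreter over syntax trees. Here formulas are encoded as nested tuples; this encoding is an illustrative choice made here, not part of the lecture:

```python
# Formula encoding (illustrative):
#   ('var', 'i')          a variable
#   ('num', 5)            a numeral
#   (op, e1, e2)          a binary node such as 'eq', 'mul', 'and'
#   ('exists', 'z', A)    a quantifier binding z in A (similarly 'forall')
def fv(a):
    tag = a[0]
    if tag == 'var':
        return {a[1]}
    if tag == 'num':
        return set()
    if tag in ('exists', 'forall'):
        return fv(a[2]) - {a[1]}          # quantifiers bind their variable
    return set().union(*(fv(x) for x in a[1:]))  # union over children

# exists i. k = i * m   --  i is bound; k and m are free
A = ('exists', 'i', ('eq', ('var', 'k'), ('mul', ('var', 'i'), ('var', 'm'))))
assert fv(A) == {'k', 'm'}
```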

Substitutions

• An expression is pure if it does not contain quantifiers (such an expression is sometimes called a term).

• A[t/z] denotes the assertion A′ which is the same as A, except that all free occurrences of the variable z are replaced by t.

Example A ≡ ∃i.k = i × m
A[5/k] = ∃i.5 = i × m
A[5/i] = A

• Figure 1.1 shows how to calculate substitutions

Figure 1.1: calculating substitutions
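The substitution calculation of Figure 1.1 can be sketched as a recursion over syntax trees. Formulas are encoded as nested tuples (an illustrative encoding chosen here: ('var', x), ('num', n), binary nodes, and ('exists'/'forall', z, A)); this simple version does not rename bound variables, so t should not mention them:

```python
# A[t/z]: replace the free occurrences of z in A by t.
def subst(a, t, z):
    tag = a[0]
    if tag == 'var':
        return t if a[1] == z else a
    if tag == 'num':
        return a
    if tag in ('exists', 'forall'):
        if a[1] == z:                 # z is bound here: nothing to replace
            return a
        return (tag, a[1], subst(a[2], t, z))
    return (tag,) + tuple(subst(x, t, z) for x in a[1:])

# A = exists i. k = i * m  (the example above)
A = ('exists', 'i', ('eq', ('var', 'k'), ('mul', ('var', 'i'), ('var', 'm'))))
five = ('num', 5)
# A[5/k] = exists i. 5 = i * m
assert subst(A, five, 'k') == \
    ('exists', 'i', ('eq', five, ('mul', ('var', 'i'), ('var', 'm'))))
# A[5/i] = A, since i is bound in A
assert subst(A, five, 'i') == A
```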

We now return to our formulation of axiomatic semantics;

1.2 Proof rules

Proof rules will be used to define the abstract meaning of the program, or, how to prove properties of programs.

1. Assignment rule (backward-style) [assp]

{P [a/x]} x:=a {P }

• Note that it is a “backward” rule.
• x := a always terminates.
• Why is this true? Recall that in operational semantics ⟨x := a, s⟩ −→ s[x ↦ A⟦a⟧s]. Here, P[a/x] means replacing every occurrence of x in P with a. (Note that a might include x.) For example, if P = 2x < y + 1 and a = x + 3 then P[a/x] = 2(x + 3) < y + 1. Note that for any state s, s |= P[a/x] ⇐⇒ s[x ↦ A⟦a⟧s] |= P.
• For example: {y*z<9} x:=y*z {x<9}
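The key fact s |= P[a/x] ⇐⇒ s[x ↦ A⟦a⟧s] |= P can be spot-checked for the lecture's example P = 2x < y + 1, a = x + 3. States are dicts, P[a/x] is written out by hand, and the check runs over a grid of values (a sanity check on finitely many states, not a proof):

```python
P     = lambda s: 2 * s['x'] < s['y'] + 1            # P
a     = lambda s: s['x'] + 3                         # A[[a]]s
P_sub = lambda s: 2 * (s['x'] + 3) < s['y'] + 1      # P[a/x], by hand

for x in range(-5, 6):
    for y in range(-5, 6):
        s = {'x': x, 'y': y}
        # s |= P[a/x]  iff  s[x -> A[[a]]s] |= P
        assert P_sub(s) == P({**s, 'x': a(s)})
```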

2. Assignment rule (forward-style)

{P } x:=a {∃y.P [y/x] ∧ x = a[y/x]}

• Note that it is a “forward” rule.
• In the final state, y will have the value that x had in the initial state.
• It is considered less elegant than the “backward” version, as it introduces quantifiers.

Backward style proofs have an advantage in the sense that they are “outcome-driven”: the developer knows what she wants to prove and is interested in understanding under what circumstances the execution will be correct.

3. Skip rule [skipp] {P } skip {P }

4. Composition rule [compp]

{P } S1 {Q}{Q} S2 {R}

{P } S1; S2 {R}

• Read: if S1, executed from a state satisfying P, terminates, then it does so in a state satisfying Q; and if S2, executed from a state satisfying Q, terminates, then it does so in a state satisfying R.

5. Condition rule [condp]

{b ∧ P } S1 {Q} {¬b ∧ P } S2 {Q}

{P } if b then S1 else S2 {Q}

6. Loop rule [whilep]

{b ∧ P} S {P}
{P} while b do S {¬b ∧ P}

• Here P is called an invariant of the loop.
  (a) it holds before and after each loop iteration
  (b) finding loop invariants is the most challenging part of proofs
• When the loop finishes, b is false.

7. Rule of Consequence [consp]

{P′} S {Q′}
{P} S {Q}
if P ⇒ P′ and Q′ ⇒ Q

• Allows strengthening the precondition and weakening the postcondition.
• The only rule that is not sensitive to the form of the statement (only to the assertions).
• See the following example:
{y*z<9} x:=y*z {x<9}
{y*z<9 ∧ w=5} x:=y*z {x<10}
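The two side conditions of the consequence rule in this example are the implications y*z<9 ∧ w=5 ⇒ y*z<9 and x<9 ⇒ x<10. They can be spot-checked over a grid of states (a sanity check only; predicates are encoded as Python lambdas on state dicts):

```python
P_strong = lambda s: s['y'] * s['z'] < 9 and s['w'] == 5   # P  (new precondition)
P_weak   = lambda s: s['y'] * s['z'] < 9                   # P' (rule's precondition)
Q_strong = lambda s: s['x'] < 9                            # Q' (rule's postcondition)
Q_weak   = lambda s: s['x'] < 10                           # Q  (new postcondition)

for y in range(-10, 11):
    for z in range(-10, 11):
        for x in range(-10, 11):
            s = {'y': y, 'z': z, 'w': 5, 'x': x}
            assert (not P_strong(s)) or P_weak(s)   # P  ==> P'
            assert (not Q_strong(s)) or Q_weak(s)   # Q' ==> Q
```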

1.2.1 Inference trees

Proofs are written (formally) using inference trees, which are similar to the derivation trees we have seen in natural semantics.

1. The root of the tree is the judgement that we wish to prove.
2. Leaves are instances of axioms.
3. Internal nodes correspond to conclusions of instantiated rules and have the corresponding premises as their immediate sons.
4. A tree is called simple if it consists of only an axiom, and composite otherwise.

1.3 Provability

• We say that an assertion {P} C {Q} is provable if there exists an inference tree for it, denoted by ⊢p {P} C {Q}.
  – Note: inference trees need not be unique; for example, there is always an option to push in consequences.

Example Factorial - inference tree proof:

Annotated programs are a “streamlined version” of inference trees.
• inline inference trees into programs
• a kind of “proof carrying code”
• going from an annotated program to a proof is a linear-time translation

• Annotating Composition: when handling compositions of the form S1; S2; ...; Sn−1 instead of writing deep trees we can simply annotate:

{P1} S1 {P2} S2 ... {Pn−1} Sn−1 {Pn}

• annotated programs are not considered formal proofs, but they enable building inference trees.
• annotations can be used on conditions as well as loops (see fig. 1.3)

1.4 Properties of the semantics

We are interested in several properties of the axiomatic semantics.

1. Equivalence - what is the analogue of program equivalence in axiomatic verification?
2. Soundness - can we prove incorrect properties?
3. Completeness - is there something we cannot prove?

In general, proofs of properties of the axiomatic semantics use induction on the shape of the inference tree.

Figure 1.2: Using annotations

1.4.1 Provable equivalence

• We say that C1 and C2 are provably equivalent if for all P and Q

⊢p {P} C1 {Q} ⇐⇒ ⊢p {P} C2 {Q}

For example, S; skip and S, or S1; (S2; S3) and (S1; S2); S3.

• provable equivalence implies semantic equivalence

1.4.2 Valid assertions

Valid assertions are also called semantically correct.

• We say that {P } C {Q} is valid if

∀s, s′ ∈ Σ. (s |= P ∧ ⟨C, s⟩ → s′) =⇒ s′ |= Q

• Denoted |=p {P} C {Q}

Logical implication and equivalence There is a connection between logical implication and logical equivalence.

• For predicates A, B we write A =⇒ B if for all states s ∈ Σ if s |= A then s |= B.

– {s | s |= A} ⊆ {s | s |= B}
– for every predicate A: false =⇒ A =⇒ true

• We write A ⇐⇒ B if A =⇒ B and B =⇒ A. For example, false ⇐⇒ 5 = 7.

• In writing Hoare-style proofs, we will often replace a predicate A with A′ such that A ⇐⇒ A′ and A′ is “simpler”.

1.4.3 Soundness and Completeness

• The inference system is sound:

⊢p {P} C {Q} =⇒ |=p {P} C {Q}

• The inference system is complete:

|=p {P} C {Q} =⇒ ⊢p {P} C {Q}