Axiomatic Semantics (Floyd-Hoare)

Ranjit Jhala, UC San Diego

Axiomatic Semantics

1. Language for making assertions about programs
2. Rules for establishing, i.e. proving, the assertions

Typical kinds of assertions:
• This program terminates.
• During execution, if var z has value 0, then x equals y
• All array accesses are within array bounds

Some typical languages of assertions:
• First-order logic
• Other logics (e.g., temporal logic)

TODAY’S PLAN

1. Define a small language: syntax and operational semantics
2. Define a logic for verifying assertions

IMP: An Imperative Language

IMP Syntactic Entities

• Int   integer literals         n
• Bool  booleans                 {true, false}
• Loc   locations                x, y, z, …
• Aexp  arithmetic expressions   e
• Bexp  boolean expressions      b
• Comm  commands                 c

Abstract Syntax: Arith Expressions (Aexp)

    e ::= n          for n ∈ Int
        | x          for x ∈ Loc
        | e1 + e2    for e1, e2 ∈ Aexp
        | e1 - e2    for e1, e2 ∈ Aexp
        | e1 * e2    for e1, e2 ∈ Aexp

Note:
• Variables are not declared
• All variables have integer type
• There are no side-effects

Abstract Syntax: Bool Expressions (Bexp)

    b ::= true
        | false
        | e1 = e2     for e1, e2 ∈ Aexp
        | e1 < e2     for e1, e2 ∈ Aexp
        | ! b         for b ∈ Bexp
        | b1 || b2    for b1, b2 ∈ Bexp
        | b1 & b2     for b1, b2 ∈ Bexp

Abstract Syntax: Commands (Comm)

    c ::= skip
        | x := e                  for x ∈ Loc & e ∈ Aexp
        | c1; c2                  for c1, c2 ∈ Comm
        | if b then c1 else c2    for b ∈ Bexp & c1, c2 ∈ Comm
        | while b do c            for c ∈ Comm & b ∈ Bexp

Note:
• Typing rules embedded in syntax definition
  – Other checks may not be context-free
  – need to be specified separately (e.g., variables are declared)
• Commands contain all the side-effects in the language
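The three grammars above map directly onto algebraic data types. The following is a minimal sketch in Haskell; the type and constructor names (`N`, `V`, `Equ`, `Assign`, and so on) are illustrative assumptions, not names taken from the slides.

```haskell
-- Hypothetical Haskell encoding of the IMP abstract syntax.
-- All constructor names are assumptions made for illustration.

type Loc = String              -- locations x, y, z, ...

-- Arithmetic expressions (Aexp)
data Aexp
  = N Integer                  -- integer literal n
  | V Loc                      -- location x
  | Add Aexp Aexp              -- e1 + e2
  | Sub Aexp Aexp              -- e1 - e2
  | Mul Aexp Aexp              -- e1 * e2
  deriving (Eq, Show)

-- Boolean expressions (Bexp)
data Bexp
  = BTrue
  | BFalse
  | Equ Aexp Aexp              -- e1 = e2
  | Lt Aexp Aexp               -- e1 < e2
  | Not Bexp                   -- ! b
  | Or Bexp Bexp               -- b1 || b2
  | And Bexp Bexp              -- b1 & b2
  deriving (Eq, Show)

-- Commands (Comm)
data Comm
  = Skip
  | Assign Loc Aexp            -- x := e
  | Seq Comm Comm              -- c1; c2
  | If Bexp Comm Comm          -- if b then c1 else c2
  | While Bexp Comm            -- while b do c
  deriving (Eq, Show)

-- The program: while x < 5 do x := x + 1
example :: Comm
example = While (Lt (V "x") (N 5)) (Assign "x" (Add (V "x") (N 1)))
```

As on the slide, the typing rules are embedded in the syntax: `Add` only accepts `Aexp` arguments, so an ill-typed term such as `true + 1` cannot be built.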

Semantics of IMP: States of IMP

• Meaning of IMP expressions depends on the values of variables
• A state σ is a function from Loc to Int
  – Value of variables at a given moment
  – Set of all states is Σ = Loc -> Int

Evaluation judgment for expressions:
• Ternary relation on an expression, a state, and a value
• We write: <e, σ> ⇓ n   “Expression e in state σ evaluates to n”

Q: Why no state on the right?
– Evaluation of expressions has no side-effects
– i.e., state unchanged by evaluating an expression

Q: Can we view judgment as a function of 2 args e, σ?
– Only if there is a unique derivation …

Operational Semantics of IMP

Evaluation judgment for commands
• Ternary relation on a command, a state, and a new state
• We write: <c, σ> ⇓ σ’   “Executing cmd c from state σ takes system into state σ’”
• Evaluation of a command has effect
  – but no direct value
  – So, “result” of a command is a new state σ’

Note: evaluation of a command may not terminate

Q: Can we view judgment as a function of 2 args c, σ?
– Only if there is a unique successor state …

Evaluation Rules (for Aexp)

    <n, σ> ⇓ n          <x, σ> ⇓ σ(x)

    <e1, σ> ⇓ n1   <e2, σ> ⇓ n2        <e1, σ> ⇓ n1   <e2, σ> ⇓ n2
    ---------------------------        ---------------------------
      <e1 + e2, σ> ⇓ n1 + n2             <e1 - e2, σ> ⇓ n1 - n2

    <e1, σ> ⇓ n1   <e2, σ> ⇓ n2
    ---------------------------
      <e1 * e2, σ> ⇓ n1 * n2

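Evaluation Rules (for Bexp)

    <true, σ> ⇓ true          <false, σ> ⇓ false

    <e1, σ> ⇓ n1   <e2, σ> ⇓ n2   p is n1 < n2
    -------------------------------------------
                <e1 < e2, σ> ⇓ p

    <e1, σ> ⇓ n1   <e2, σ> ⇓ n2   p is n1 = n2
    -------------------------------------------
                <e1 = e2, σ> ⇓ p

    <b1, σ> ⇓ p1   <b2, σ> ⇓ p2        <b1, σ> ⇓ p1   <b2, σ> ⇓ p2
    ---------------------------        ---------------------------
      <b1 & b2, σ> ⇓ p1 ∧ p2             <b1 || b2, σ> ⇓ p1 ∨ p2

       <b, σ> ⇓ p
    ----------------
    <!b, σ> ⇓ ¬p

Evaluation Rules (for Comm)

    <skip, σ> ⇓ σ

    <c1, σ> ⇓ σ’   <c2, σ’> ⇓ σ’’
    ------------------------------
          <c1; c2, σ> ⇓ σ’’

          <e, σ> ⇓ n
    -------------------------
    <x := e, σ> ⇓ σ[x := n]

Define σ[x := n] as:
    σ[x := n](x) = n
    σ[x := n](y) = σ(y)   for y ≠ x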

Evaluation Rules (for Comm)

    <b, σ> ⇓ true   <c1, σ> ⇓ σ’
    --------------------------------
    <if b then c1 else c2, σ> ⇓ σ’

    <b, σ> ⇓ false   <c2, σ> ⇓ σ’
    --------------------------------
    <if b then c1 else c2, σ> ⇓ σ’
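The evaluation judgments can be read as a recursive interpreter: one equation per rule. The sketch below is an assumption-laden illustration, not the course's implementation. States are finite maps in which unset locations read as 0, the constructor names are invented, and the `While` case uses the standard unfolding `while b do c = if b then (c; while b do c) else skip`, so `exec` may fail to terminate exactly when the command does.

```haskell
import qualified Data.Map as Map

type Loc   = String
type State = Map.Map Loc Integer   -- σ : Loc -> Int; unset locations read as 0 (assumption)

data Aexp = N Integer | V Loc | Add Aexp Aexp | Sub Aexp Aexp | Mul Aexp Aexp
data Bexp = BTrue | BFalse | Equ Aexp Aexp | Lt Aexp Aexp
          | Not Bexp | Or Bexp Bexp | And Bexp Bexp
data Comm = Skip | Assign Loc Aexp | Seq Comm Comm
          | If Bexp Comm Comm | While Bexp Comm

-- <e, σ> ⇓ n : the state is only read, never changed
evalA :: Aexp -> State -> Integer
evalA (N n)       _ = n
evalA (V x)       s = Map.findWithDefault 0 x s
evalA (Add e1 e2) s = evalA e1 s + evalA e2 s
evalA (Sub e1 e2) s = evalA e1 s - evalA e2 s
evalA (Mul e1 e2) s = evalA e1 s * evalA e2 s

-- <b, σ> ⇓ p
evalB :: Bexp -> State -> Bool
evalB BTrue       _ = True
evalB BFalse      _ = False
evalB (Equ e1 e2) s = evalA e1 s == evalA e2 s
evalB (Lt e1 e2)  s = evalA e1 s <  evalA e2 s
evalB (Not b)     s = not (evalB b s)
evalB (Or b1 b2)  s = evalB b1 s || evalB b2 s
evalB (And b1 b2) s = evalB b1 s && evalB b2 s

-- <c, σ> ⇓ σ' : the "result" of a command is a new state
exec :: Comm -> State -> State
exec Skip         s = s
exec (Assign x e) s = Map.insert x (evalA e s) s     -- σ[x := n]
exec (Seq c1 c2)  s = exec c2 (exec c1 s)
exec (If b c1 c2) s = if evalB b s then exec c1 s else exec c2 s
exec (While b c)  s
  | evalB b s = exec (While b c) (exec c s)          -- unfold one iteration
  | otherwise = s
```

Because `evalA`, `evalB` and `exec` are functions, this sketch also answers the two "Q:" questions for this particular encoding: each judgment has at most one result for a given expression or command and state.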

History: Program Verification

• Turing 1949: Checking a large routine
• Floyd 1967: Assigning meanings to programs
• Hoare 1969: An axiomatic basis for computer programming
• Program Verifiers (70’s – 80’s)
• PREfix: Symbolic Execution for bug-hunting (WinXP)
• Software Validation tools

Axiomatic Semantics: Foundation for Software Verification

• Deductive Verifiers: ESCJava, Spec#, Verifast, Y0, ...
• Model Checkers: SLAM, BLAST, ...
• Test Generators: DART, CUTE, EXE, ...

Hoare Triples

• Partial correctness assertion: {A} c {B}
  If A holds in state σ and there exists σ’ s.t. <c, σ> ⇓ σ’, then B holds in σ’
• Total correctness assertion: [A] c [B]
  If A holds in state σ, then there exists σ’ s.t. <c, σ> ⇓ σ’ and B holds in σ’
• A is called the precondition, B is called the postcondition
• Example: { y=x } z := x; z := z+1 { y < z }

The Assertion Language

• Arith Exprs + First-order Predicate logic

    A ::= true | false
        | e1 = e2 | e1 <= e2
        | ¬ A | A1 && A2 | A1 || A2 | A1 => A2
        | \exists x.A | \forall x.A

• IMP boolean expressions are assertions

Semantics of Assertions

• Judgment σ |= A means assertion A holds in the given state σ

    σ |= true           always
    σ |= e1 = e2        iff <e1, σ> ⇓ n1, <e2, σ> ⇓ n2 and n1 = n2
    σ |= e1 <= e2       iff <e1, σ> ⇓ n1, <e2, σ> ⇓ n2 and n1 <= n2
    σ |= A1 && A2       iff σ |= A1 and σ |= A2
    σ |= A1 || A2       iff σ |= A1 or σ |= A2
    σ |= A1 => A2       iff σ |= A1 implies σ |= A2
    σ |= \exists x.A    iff for some n in Z. σ[x := n] |= A
    σ |= \forall x.A    iff for all n in Z. σ[x := n] |= A

Semantics of Assertions

Formal definition of partial correctness assertion:

    |= { A } c { B }
    iff
    forall σ in Σ. σ |= A implies [forall σ’ in Σ. <c, σ> ⇓ σ’ implies σ’ |= B]

Semantics of Assertions

• Total correctness assertion:

    |= [ A ] c [ B ]
    iff
    |= { A } c { B } and
    forall σ in Σ. σ |= A implies [exists σ’ in Σ. <c, σ> ⇓ σ’]

Deriving Assertions

• Formal |= {A} c {B} hard to use
  – Defined in terms of the op-semantics
• Next, symbolic technique (logic) for deriving valid triples |- {A} c {B}

Derivation Rules for Hoare Triples

• Write |- {A} c {B} when we can derive the triple using derivation rules
• One rule per command
• Plus, the rule of consequence:

    A’ => A    |- {A} c {B}    B => B’
    ----------------------------------
             |- {A’} c {B’}

Deriv. Rules for Hoare Logic |- {A} c {B}

Rules for each language construct

                            |- {A} c1 {B}    |- {B} c2 {C}
    -----------------       ------------------------------
    |- {A} skip {A}               |- {A} c1;c2 {C}

    |- {A && b} c1 {B}    |- {A && !b} c2 {B}
    -----------------------------------------
         |- {A} if b then c1 else c2 {B}

          |- {A && b} c {A}
    -------------------------------        -----------------------
    |- {A} while b do c {A && !b}          |- {[e/x]A} x:=e {A}

And the rule of consequence…

Free and Bound Variables

Key idea in logic/PL: scoping & substitution

• Assertions are equivalent up to renaming of bound variables (a.k.a. alpha-renaming)
• Examples:
  – ∀x.x = x is the same as ∀y.y = y
    (Rename bound x with y)
  – ∀x. ∀y.x = y is the same as ∀z. ∀x.z = x
    (Rename bound x with z and y with x)

Substitution

• [e’/x] e is substituting e’ for x in e
  – Also written as e[e’/x]
  – Note: only substitute the free occurrences
• Alpha-rename bound variables to avoid conflicts
  – To subst. [e’/x] in ∀y.x = y, rename y if it occurs in e’
  – Result of alpha-renaming: ∀z. e’ = z
• We say that substitution avoids variable capture

[ x/z ] ∀x.z = x is ?
• ∀x.x = x   Wrong
• ∀y.x = y   Correct
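Capture-avoiding substitution can be sketched in a few lines. The formula type below (equalities under ∀ only), the function names, and the fresh-name scheme `v0, v1, ...` are all assumptions chosen to keep the example small.

```haskell
-- Terms and a tiny formula language: equalities under universal quantifiers.
data Term = Var String | Lit Integer | Plus Term Term
  deriving (Eq, Show)

data Form = Equal Term Term | Forall String Form
  deriving (Eq, Show)

-- Variables of a term (all occurrences in a term are free)
fvT :: Term -> [String]
fvT (Var x)    = [x]
fvT (Lit _)    = []
fvT (Plus a b) = fvT a ++ fvT b

-- All variable names occurring in a formula, free or bound
varsF :: Form -> [String]
varsF (Equal a b)  = fvT a ++ fvT b
varsF (Forall y f) = y : varsF f

-- substT e' x t  computes [e'/x] t
substT :: Term -> String -> Term -> Term
substT e' x (Var y)
  | x == y    = e'
  | otherwise = Var y
substT _  _ (Lit n)    = Lit n
substT e' x (Plus a b) = Plus (substT e' x a) (substT e' x b)

-- substF e' x f  computes [e'/x] f, replacing only free occurrences of x
substF :: Term -> String -> Form -> Form
substF e' x (Equal a b) = Equal (substT e' x a) (substT e' x b)
substF e' x (Forall y f)
  | y == x          = Forall y f                      -- x is bound here: nothing to replace
  | y `elem` fvT e' =                                 -- y would capture a variable of e'
      let y' = fresh (x : fvT e' ++ varsF f)
          f' = substF (Var y') y f                    -- alpha-rename bound y to y'
      in Forall y' (substF e' x f')
  | otherwise       = Forall y (substF e' x f)

-- First name among v0, v1, ... that is not already used (hypothetical scheme)
fresh :: [String] -> String
fresh used = go (0 :: Int)
  where
    go n
      | v `elem` used = go (n + 1)
      | otherwise     = v
      where v = "v" ++ show n
```

On the slide's question, `substF (Var "x") "z" (Forall "x" (Equal (Var "z") (Var "x")))` renames the bound `x` first and yields `Forall "v0" (Equal (Var "x") (Var "v0"))`, which is ∀y.x = y up to the choice of bound name.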

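Example: Assignment

Assume x does not appear in e

Prove |- {true} x:=e { x = e }

Note: [e/x](x = e) is (e = [e/x]e), which is (e = e) since x does not appear in e

Use assignment rule … then conseq. rule

                        |- {e = e} x:=e {x = e}      (assignment; x does not appear in e)

    true => e = e       |- {e = e} x:=e {x = e}
    --------------------------------------------
              |- {true} x:=e {x = e}

Example: Conditional

Prove: {true} if y<=0 then x:=1 else x:=y {x>0}

• Rule for assignment + consequence

    true & y<=0 => 1>0    |- {1>0} x:=1 {x>0}
    ------------------------------------------
         |- {true & y<=0} x:=1 {x>0}

    true & y>0 => y>0     |- {y>0} x:=y {x>0}
    ------------------------------------------
         |- {true & y>0} x:=y {x>0}

• Rule for if-then-else

    |- {true & y<=0} x:=1 {x>0}    |- {true & y>0} x:=y {x>0}
    ----------------------------------------------------------
        |- {true} if y<=0 then x:=1 else x:=y {x > 0}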

Example: Loop

• Prove |- {x<=0} while x<=5 do x:=x+1 {x=6}
• Use the rule for while with invariant x <= 6:

    x<=6 & x<=5 => x+1<=6     |- {x+1<=6} x:=x+1 {x<=6}
    ----------------------------------------------------
           |- {x<=6 & x<=5} x:=x+1 {x<=6}
    ----------------------------------------------------
      |- {x<=6} while x<=5 do x:=x+1 {x<=6 & x>5}

• Finish off with consequence rule (W is the while loop):

    x<=0 => x<=6    |- {x<=6} W {x<=6 & x>5}    x<=6 & x>5 => x=6
    --------------------------------------------------------------
                       |- {x<=0} W {x = 6}

Soundness of Axiomatic Semantics

Formal Statement of Soundness:
    If |- {A} c {B} then |= {A} c {B}

Equivalently:
    If H :: |- {A} c {B} then
    forall σ. if σ |= A and D :: <c, σ> ⇓ σ’ then σ’ |= B

Proof: Simultaneous induction on structure of D and H

Algorithmic Verification

Hoare rules mostly syntax directed, but:

1. When to apply the rule of consequence?
2. What invariant to use for while?
3. How to prove implications (conseq. rule)?

Hint:
(3) involves ... SMT
(2) invariants are the hardest problem
(1) let’s see how to deal with ...

Making Floyd-Hoare Algorithmic: Predicate Transformers

Technique: Weakest Preconditions

    |- { y > 10 } x := y {x > 0}
    |- { y > 100 } x := y {x > 0}
    |- { x=2 & y=5 } x := y {x > 0}

After what preconditions does postcond. x>0 hold?

WP(c,B): weakest predicate s.t. {WP(c,B)} c {B}
• For any A we have {A} c {B} iff A => WP(c, B)

How to verify |- {A} c {B}?
1. Compute: WP(c,B)
2. Prove: A => WP(c,B)

Weakest Preconditions

Define wp(c, B) using Hoare rules

    wp(c1;c2, B) = wp(c1, wp(c2, B))
        from the rule:  |- {A} c1 {B}   |- {B} c2 {C}
                        ------------------------------
                              |- {A} c1;c2 {C}

    wp(x:=e, B) = [e/x]B
        from the rule:  |- {[e/x]A} x:=e {A}

    wp(if e then c1 else c2, B) = (e => wp(c1, B)) && (!e => wp(c2, B))
        from the rule:  |- {A & b} c1 {B}   |- {A & !b} c2 {B}
                        ---------------------------------------
                           |- {A} if b then c1 else c2 {B}
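The three wp equations are structural, so they transcribe directly into a recursive function. This is a sketch for loop-free commands under stated assumptions: assertions are restricted to a quantifier-free fragment so that substitution needs no renaming, the constructor names are invented, and the `Skip` case (wp(skip, B) = B) is read off the skip rule rather than taken from the slide.

```haskell
type Loc = String

data Aexp = N Integer | V Loc | Add Aexp Aexp | Sub Aexp Aexp | Mul Aexp Aexp
  deriving (Eq, Show)

-- Quantifier-free assertions (a hypothetical fragment of the assertion language)
data Assn = ATrue | AFalse
          | Equ Aexp Aexp | Leq Aexp Aexp
          | Not Assn | And Assn Assn | Or Assn Assn | Imp Assn Assn
  deriving (Eq, Show)

-- Loop-free commands; guards are assertions (IMP booleans are assertions)
data Comm = Skip | Assign Loc Aexp | Seq Comm Comm | If Assn Comm Comm

-- substA e x a  computes [e/x] a on expressions
substA :: Aexp -> Loc -> Aexp -> Aexp
substA e x (V y) | x == y = e
substA e x (Add a b) = Add (substA e x a) (substA e x b)
substA e x (Sub a b) = Sub (substA e x a) (substA e x b)
substA e x (Mul a b) = Mul (substA e x a) (substA e x b)
substA _ _ a         = a                 -- literals and other variables

-- subst e x p  computes [e/x] p on assertions (no binders, so no capture)
subst :: Aexp -> Loc -> Assn -> Assn
subst e x (Equ a b) = Equ (substA e x a) (substA e x b)
subst e x (Leq a b) = Leq (substA e x a) (substA e x b)
subst e x (Not p)   = Not (subst e x p)
subst e x (And p q) = And (subst e x p) (subst e x q)
subst e x (Or p q)  = Or  (subst e x p) (subst e x q)
subst e x (Imp p q) = Imp (subst e x p) (subst e x q)
subst _ _ p         = p                  -- ATrue, AFalse

-- Weakest precondition, one equation per command form
wp :: Comm -> Assn -> Assn
wp Skip         q = q
wp (Assign x e) q = subst e x q
wp (Seq c1 c2)  q = wp c1 (wp c2 q)
wp (If b c1 c2) q = And (Imp b (wp c1 q)) (Imp (Not b) (wp c2 q))
```

For instance, with x > 0 written as `Leq (N 1) (V "x")`, `wp (Assign "x" (V "y"))` of that postcondition is `Leq (N 1) (V "y")`, i.e. y > 0. Each of the three preconditions listed on the slide implies it, which is step 2 of the verification recipe.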

Weakest Preconditions for Loops

Start from the equivalence

    while b do c  =  if b then (c; while b do c) else skip

Let W = wp(while b do c, B)

It must be that:

    W = (b => wp(c, W)) & (!b => B)

But this is a recursive equation! How to compute?!
• We’ll return to finding loop WPs later …

Technique: Strongest Postconditions

    |- { y > 100 } x := y {x > 10}
    |- { y > 100 } x := y {x > 20}
    |- { y > 100 } x := y {x > 100}

What postcond. is guaranteed after prec. y>100?

SP(c,A): strongest predicate s.t. {A} c {SP(c,A)}
• For any B we have {A} c {B} iff SP(c,A) => B

How to verify {A} c {B}?
1. Compute: SP(c,A)
2. Prove: SP(c,A) => B

Strongest Postconditions

Define sp(c, A) following Hoare rules

    sp(c1;c2, A) = sp(c2, sp(c1, A))
        from the rule:  |- {A} c1 {B}   |- {B} c2 {C}
                        ------------------------------
                              |- {A} c1;c2 {C}

    sp(x:=e, A) = \exists x0. [x0/x]A && x = [x0/x]e
        from the rule:  |- {[e/x]A} x:=e {A}

    sp(if e then c1 else c2, A) = sp(c1, A & e) || sp(c2, A & !e)
        from the rule:  |- {A & b} c1 {B}   |- {A & !b} c2 {B}
                        ---------------------------------------
                           |- {A} if b then c1 else c2 {B}

Axiomatic Semantics on Flow Graphs: Floyd’s Original Formulation

Axiomatic Semantics over Flow Graphs

Relaxing Specifications via Consequence (will revisit later as subtyping):
• An assertion {P} before a command c may be replaced by {P’} if P’ => P
• An assertion {Q} after a command c may be replaced by {Q’} if Q => Q’

Sequential Composition

    {P}  C1  {Q}  C2  {R}

Backwards using weakest preconditions:

    {x-1 >= y-1}
    x := x-1
    {x >= y-1}
    y := y-1
    {x >= y}

Forwards using strongest postconditions:

    {x >= y}
    x := x-1
    {\exists x0. x0 >= y & x = x0-1}
    y := y-1
    {\exists y0 x0. x0 >= y0 & x = x0-1 & y = y0-1}

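Conditionals

Forwards: from {P} before the test E
    T branch: {P & E}        F branch: {P & !E}
    Example: from { x >= 0 } and test x=0
        T branch: { x>=0 & x = 0 }        F branch: { x>=0 & x!=0 }

Backwards: from {P1} on the T branch and {P2} on the F branch
    before the test E: {(P1 & E) || (P2 & !E)}
    Example: from T branch {x=0}, F branch {x >= 1} and test x=0
        before the test: { x=0 || x>=1 }

Joins

Forwards: from {P1} and {P2} on the incoming edges
    after the join: {P1 || P2}

Backwards: from {P} after the join
    on both incoming edges: {P}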

Conditional+Join: Forward

    { x ≠ 0 || a = 0 }
    test x<>0
        T branch: { x ≠ 0 }   a := 2*x   { x ≠ 0 & a = 2*x }
        F branch: { x=0 & a=0 }
    after the join: { a = 2*x }

Conditionals+Joins: Backward

    { (x ≠ 0 & true) || (x = 0 & a = 2*x) }
    test x<>0
        T branch: { 2*x = 2*x }   a := 2*x   { a = 2*x }
        F branch: { a = 2*x }
    after the join: { a = 2*x }

• Check the implications (simplifications)

Forward or Backward?

• Forward reasoning
  – Know the precondition
  – Want to know what postcond the code guarantees
• Backward reasoning
  – Know what we want the code to establish
  – Want to know under what preconditions this happens

Another Example: Double Locking

[Figure: state machine with transitions labeled lock and unlock]

“An attempt to re-acquire an acquired lock or release a released lock will cause a deadlock.”

Calls to lock and unlock must alternate.

Locking Rules

Boolean variable locked states if lock is held or not

• lock
    { !locked & P[true/locked] } lock { P }
    lock behaves as assert(!locked); locked := true
• unlock
    { locked & P[false/locked] } unlock { P }
    unlock behaves as assert(locked); locked := false

Locking Example

    { !locked }
    test x=0
        T branch: { !locked & x = 0 }   … lock …   { locked & x = 0 }
        F branch: { !locked & x ≠ 0 }
    after the join: { (locked & x = 0) || (!locked & x ≠ 0) }
    test x=0
        T branch: { locked & x = 0 }   … unlock …   { !locked & x = 0 }
        F branch: { !locked & x ≠ 0 }
    after the join: { !locked }

Review

• Assignment:   { Q } x:=E { P }                    if Q => P[E/x]
• Join:         { P1 } and { P2 } flow into { P }   if P1 => P and P2 => P
• Conditional:  { P } before test E, { P1 } on T, { P2 } on F
                                                    if P & E => P1 and P & !E => P2

Implication is always in the direction of the control flow

What about real languages?

• Loops
• Function calls
• Pointers

Reasoning about loops: Rules

            |- {A & b} c {A}
    -------------------------------
    |- {A} while b do c {A & !b}

Rewrite A with I: Loop Invariant

            |- {I & b} c {I}
    -------------------------------
    |- {I} while b do c {I & !b}

With the Rule of Consequence:

    P => I    |- {I} while b do c {I & !b}    I & !b => Q
    ------------------------------------------------------
                 |- {P} while b do c {Q}

Reasoning about loops: Flow Graphs

• Loops can be handled using conditionals and joins
• Consider the while b do S statement

    { P }
    join: { I }                  (loop invariant)
    test b
        T branch: { I & b }   S   { I }   (back to the join)
        F branch: { Q }

  if P => I (loop invariant holds initially)
  and I & !b => Q (loop establishes the postcondition)
  and { I & b } S { I } (loop invariant is preserved)

Loop Example

Verify:

    {x=8 & y=16} while(x>0){x--; y-=2;} {y = 0}

    { x = 8 & y = 16 }
    join: { I }
    test x>0
        T branch: { I & x>0 }   x--; y-=2   { I }
        F branch: { y = 0 }

Find an appropriate invariant I
– Holds initially: x = 8 & y = 16
– Holds at end: y = 0

Loop Example (II)

Guess invariant y = 2*x

    { x = 8 & y = 16 }
    join: { y = 2*x }
    test x>0
        T branch: { y = 2*x & x > 0 }   x--; y-=2   { y = 2*x }
        F branch: { y = 0 }

Check:
– Initial: x = 8 & y = 16 => y = 2*x
– Preservation: y = 2*x & x>0 => y–2 = 2*(x–1)
– Final: y = 2*x & x<=0 => y = 0   Invalid

Loop Example (III)

Guess invariant y = 2*x & x >= 0

    { x = 8 & y = 16 }
    join: { y = 2*x & x >= 0 }
    test x>0
        T branch: x--; y-=2   { y = 2*x & x >= 0 }
        F branch: { y = 0 }

Check:
– Initial: x = 8 & y = 16 => y = 2*x & x >= 0
– Preserv: y = 2*x & x >= 0 & x>0 => y–2 = 2*(x–1) & x–1 >= 0
– Final: y = 2*x & x >= 0 & x <= 0 => y = 0

Loops Discussion

• Simple forward/backward propagation fails
• Require loop invariants
  – Hardest part of program verification
  – Guess the invariants (existing programs)
  – Write the invariants (new programs)

Note: Invariant depends on your proof goal!
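The three checks (initial, preservation, final) for the loop `while(x>0){x--; y-=2;}` can be spot-checked by enumerating a small grid of integer states. This is a testing sketch, not a proof: the grid bounds and all function names are assumptions, and a real verifier would discharge the implications with an SMT solver.

```haskell
type St = (Integer, Integer)              -- (x, y)

pre, post, cond :: St -> Bool
pre  (x, y) = x == 8 && y == 16           -- precondition
post (_, y) = y == 0                      -- postcondition
cond (x, _) = x > 0                       -- loop test

body :: St -> St
body (x, y) = (x - 1, y - 2)              -- x--; y -= 2

inv2, inv3 :: St -> Bool
inv2 (x, y) = y == 2 * x                  -- guess from Loop Example (II)
inv3 (x, y) = y == 2 * x && x >= 0        -- guess from Loop Example (III)

-- Bounded state space used for the spot check (arbitrary bounds)
states :: [St]
states = [(x, y) | x <- [-20 .. 20], y <- [-40 .. 40]]

-- Each check returns the counterexample states it finds for invariant i.
initial, preserved, final :: (St -> Bool) -> [St]
initial   i = [s | s <- states, pre s, not (i s)]                  -- pre => I
preserved i = [s | s <- states, i s, cond s, not (i (body s))]     -- {I & b} S {I}
final     i = [s | s <- states, i s, not (cond s), not (post s)]   -- I & !b => post
```

On this grid `final inv2` is non-empty (it contains, for example, the state x = -1, y = -2), matching the "Invalid" verdict for the weaker guess, while all three lists are empty for `inv3`.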


Verification Example

    int square(int n) {
      int k=0, r=0, s=1;
      while(k != n) {
        r = r + s;
        s = s + 2;
        k = k + 1;
      }
      return r;
    }

Flow graph: { true } k:=0; r:=0; s:=1, then the loop test k!=n with body
c = (r:=r+s; s:=s+2; k:=k+1)

Pick I: r = k^2
• After initialization: { r=0 & k=0 }
• On exit: {r=k^2 & k=n} => {r=n^2}
• Need: {r=k^2 & !k=n} c {r=k^2}
  i.e. {r=k^2 & !k=n} => WP(c, {r=k^2})
  i.e. {r=k^2 & !k=n} => {r+s=(k+1)^2}   Invalid

Verification Example

I: {r=k^2 & s=2k+1}
• After initialization: { r=0 & k=0 & s=1 }
• On exit: { r=k^2 & s=2k+1 & k=n } => {r=n^2}
• Need: {r=k^2 & s=2k+1 & …} c {r=k^2 & s=2k+1}
  i.e. {r=k^2 & s=2k+1 …} => WP(c, {r=k^2 & s=2k+1})
  i.e. {r=k^2 & s=2k+1 …} => {r+s=(k+1)^2 & (s+2) = 2(k+1)+1}   Valid

What about real languages?

• Loops
• Function calls
• Pointers

Functions are big instructions

Suppose we have verified bsearch

    int bsearch(int a[], int p) {
      { sorted(a) }                                    Precondition “Requires”
      …
      { r=-1 || (r>=0 & r < a.length & a[r]=p) }       Postcondition “Ensures”
      return r;
    }

• Function spec = precondition + postcondition
• Also called a contract

Function Calls

• Consider a call to function y := f(e)
  – return variable r
  – precondition Pre, postcondition Post
  – x is the formal parameter of f
• Rule for function call:

    |- P => Pre[e/x]    |- {Pre} f {Post}    |- Post[e/x, y/r] => Q
    ---------------------------------------------------------------
                        |- {P} y:=f(e) {Q}

• On flow graphs:

    { P }  y:=f(e)  { Q }    if P => Pre[e/x] and Post[e/x, y/r] => Q

Function Call: Example

    int bsearch(int a[], int p) {
      { sorted(a) }
      …
      { r=-1 || (r>=0 & r < a.length & a[r]=p) }
      return r;
    }

Consider the call

    { sorted(arr) }
    y := bsearch(arr, 5)
    { y=-1 || arr[y]=5 }

Check:
• sorted(arr) => Pre[arr/a]
• Post[y/r, arr/a, 5/p] => (y=-1 || arr[y]=5)

What about real languages?

• Loops
• Function calls
• Pointers

Assignment and Aliasing

Does assignment rule work with aliasing?

If *x and *y are aliased then:

    {x=y} *x:=5 {*x + *y=10}

Hoare Rules: Assignment and References

• When is the following Hoare triple valid?

    { A } *x := 5 { *x + *y = 10 }

• A should be “*y = 5 or x = y”
• but Hoare rule for assignment gives:

    [5/*x](*x + *y = 10)
      = (5 + *y = 10)
      = (*y = 5)

  (uh oh! we lost one case! What happened?)

Hoare Rules: Assignment and References

Modeling writes with memory expressions
• Treat memory as a whole with memory variables (M)
• upd(M,E1,E2): update M at address E1 with value E2
• sel(M,E1): read M at address E1

Reason about memory expressions with McCarthy’s rule

    sel(upd(M, E1, E2), E3) = E2            if E1 = E3
                            = sel(M, E3)    if E1 ≠ E3

Assignment (update) changes the value of memory

    {B[upd(M, E1, E2)/M]} *E1:=E2 {B}

Memory Aliasing

• Consider again: {A} *x:=5 {*x+*y=10}

    A = [upd(M, x, 5)/M] (*x + *y = 10)
      = [upd(M, x, 5)/M] (sel(M,x) + sel(M,y) = 10)
      = sel(upd(M, x, 5), x) + sel(upd(M, x, 5), y) = 10
      = 5 + sel(upd(M, x, 5), y) = 10
      = sel(upd(M, x, 5), y) = 5
      = (x = y & 5 = 5) || (x != y & sel(M, y) = 5)
      = x=y || *y = 5
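The sel/upd model is easy to execute. The sketch below represents memory as a finite map in which unmapped addresses read as 0 (an assumption), and compares the precondition obtained from the memory rule, B[upd(M, x, 5)/M], with the simplified form x=y || *y = 5 derived above. The names `wpStore` and `preA` are invented for this illustration.

```haskell
import qualified Data.Map as Map

type Addr = Integer
type Mem  = Map.Map Addr Integer      -- memory M; unmapped addresses read as 0 (assumption)

-- sel(M, E1): read M at address E1
sel :: Mem -> Addr -> Integer
sel m a = Map.findWithDefault 0 a m

-- upd(M, E1, E2): update M at address E1 with value E2
upd :: Mem -> Addr -> Integer -> Mem
upd m a v = Map.insert a v m

-- Postcondition B:  *x + *y = 10, i.e. sel(M,x) + sel(M,y) = 10
post :: Mem -> Addr -> Addr -> Bool
post m x y = sel m x + sel m y == 10

-- Precondition from the memory rule for *x := 5:  B[upd(M, x, 5)/M]
wpStore :: Mem -> Addr -> Addr -> Bool
wpStore m x y = post (upd m x 5) x y

-- Simplified precondition from the derivation:  x = y || *y = 5
preA :: Mem -> Addr -> Addr -> Bool
preA m x y = x == y || sel m y == 5
```

`sel (upd m a v) b` returns `v` when `a == b` and `sel m b` otherwise, which is McCarthy's rule. Both the aliased case (x = y) and the unaliased case (*y = 5) therefore satisfy `wpStore`, so the case lost by the naive assignment rule is recovered.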


Program Verification Tools

• Semi-automated
  – You write some invariants and specifications
  – Tool tries to fill in the other invariants
  – And to prove all implications
  – Explains when implication is invalid: counterexample for your specification
• ESC/Java is one of the best tools
• ... Spec#, Verifast, VCC

Algorithmic Program Verification

…or how does ESC/Java work?

Q: How to algorithmically prove {P} c {Q}?

If no loops:
1. Compute: WP(c,Q)
2. Prove: P => WP(c,Q)   (the Verification Condition, proved by an SMT solver)

VC Generation for Loops

Suppose all loops are annotated with an invariant I:

    while_I b do c

Compute VC:

    SMTValid(VC) implies |- {P} c {Q}

Q: Why not iff?
1. Loop invariants may be bogus…
2. SMT solver may not handle logic...

VCGen

We will write a function

    vcgen :: Pred -> Com -> (Pred, [Pred])

Suppose (Q’, L’) = VCG(c, (Q, L))

Then VC for {P} c {Q} is: (P => Q’) && (the conjunction of all f in L’)

• L’: the set of conditions that must be true
  – From loops (init, preservation, final)
• Q’: “precondition” modulo invariants…

VCGen

    -- | The top level verifier, takes:
    -- in : pre `p`, command `c` and post `q`
    -- out: True iff {p} c {q} is a valid Hoare-Triple
    verify :: Pred -> Com -> Pred -> Bool
    verify p c q = all smtValid queries
      where
        (q', conds) = runState (vcgen q c) []
        queries     = p `implies` q' : conds

VCGen

    vcgen :: Pred -> Com -> VC Pred

    vcgen q Skip
      = return q

    vcgen q (Asgn x e)
      = return $ q `subst` (x, e)

    vcgen q (If b c1 c2)
      = do q1 <- vcgen q c1
           q2 <- vcgen q c2
           return $ (b `And` q1) `Or` (Not b `And` q2)

    vcgen q (While i b c)
      = do q' <- vcgen i c
           valid $ (i `And` b)     `implies` q'   -- invariant is preserved by the body
           valid $ (i `And` Not b) `implies` q    -- exit establishes the postcondition
           return i

ESC/Java

Semi-automated “Deductive Verification”

• You write the invariants
• ESC/Java:
  – VCGen
  – Simplify: SMT used to prove VC
• Explains when implication is invalid: counterexample for your specification
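The VCGen pipeline can be sketched end to end in a self-contained form. Several things here are assumptions and differ from the slide code: `vcgen` returns the pair (Pred, [Pred]) directly instead of running in a `VC` monad, a `Seq` case is added, the conditional uses the slide's `(b And q1) Or (Not b And q2)` form, and `boundedValid` is only a stand-in for `smtValid` that tests a formula on a small grid of states. Passing this check is evidence, not a proof of validity.

```haskell
import qualified Data.Map as Map

type Loc = String
data Aexp = N Integer | V Loc | Add Aexp Aexp | Sub Aexp Aexp
data Pred = PTrue | Equ Aexp Aexp | Leq Aexp Aexp
          | Not Pred | And Pred Pred | Or Pred Pred | Imp Pred Pred
data Com  = Skip | Asgn Loc Aexp | Seq Com Com | If Pred Com Com
          | While Pred Pred Com                  -- While invariant test body

-- substA x e a  computes [e/x] a
substA :: Loc -> Aexp -> Aexp -> Aexp
substA x e (V y) | x == y = e
substA x e (Add a b) = Add (substA x e a) (substA x e b)
substA x e (Sub a b) = Sub (substA x e a) (substA x e b)
substA _ _ a         = a

-- subst x e p  computes [e/x] p (no quantifiers, so no capture)
subst :: Loc -> Aexp -> Pred -> Pred
subst x e (Equ a b) = Equ (substA x e a) (substA x e b)
subst x e (Leq a b) = Leq (substA x e a) (substA x e b)
subst x e (Not p)   = Not (subst x e p)
subst x e (And p q) = And (subst x e p) (subst x e q)
subst x e (Or p q)  = Or  (subst x e p) (subst x e q)
subst x e (Imp p q) = Imp (subst x e p) (subst x e q)
subst _ _ PTrue     = PTrue

-- vcgen q c = (q', side conditions), q' being the "precondition modulo invariants"
vcgen :: Pred -> Com -> (Pred, [Pred])
vcgen q Skip          = (q, [])
vcgen q (Asgn x e)    = (subst x e q, [])
vcgen q (Seq c1 c2)   = let (q2, l2) = vcgen q c2
                            (q1, l1) = vcgen q2 c1
                        in (q1, l1 ++ l2)
vcgen q (If b c1 c2)  = let (q1, l1) = vcgen q c1
                            (q2, l2) = vcgen q c2
                        in (Or (And b q1) (And (Not b) q2), l1 ++ l2)
vcgen q (While i b c) = let (q', l) = vcgen i c
                        in (i, Imp (And i b) q'          -- invariant preserved
                             : Imp (And i (Not b)) q     -- exit establishes q
                             : l)

type State = Map.Map Loc Integer

evalA :: State -> Aexp -> Integer
evalA _ (N n)     = n
evalA s (V x)     = Map.findWithDefault 0 x s
evalA s (Add a b) = evalA s a + evalA s b
evalA s (Sub a b) = evalA s a - evalA s b

-- σ |= p for the quantifier-free fragment
holds :: State -> Pred -> Bool
holds _ PTrue     = True
holds s (Equ a b) = evalA s a == evalA s b
holds s (Leq a b) = evalA s a <= evalA s b
holds s (Not p)   = not (holds s p)
holds s (And p q) = holds s p && holds s q
holds s (Or p q)  = holds s p || holds s q
holds s (Imp p q) = not (holds s p) || holds s q

-- Stand-in for smtValid: checks p only on states with every variable in [-10, 10]
boundedValid :: [Loc] -> Pred -> Bool
boundedValid vars p = all (`holds` p) states
  where states = map (Map.fromList . zip vars) (mapM (const [-10 .. 10]) vars)

verify :: [Loc] -> Pred -> Com -> Pred -> Bool
verify vars p c q = all (boundedValid vars) (Imp p q' : conds)
  where (q', conds) = vcgen q c
```

For the earlier loop example {x<=0} while x<=5 do x:=x+1 {x=6}, annotating the loop with the invariant x <= 6 produces three conditions that all pass the bounded check, whereas the bogus invariant `PTrue` fails the exit condition (for instance at x = 7). That is the sense in which a failed VC points at the specification or invariant rather than at the program.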