Spring 2014 Program Analysis and Verification
Lecture 9: Abstract Interpretation I
Roman Manevich
Ben-Gurion University
Syllabus
• Semantics: Natural Semantics, Structural semantics, Axiomatic Verification
• Static Analysis: Automating Hoare Logic, Control Flow Graphs, Equation Systems, Collecting Semantics
• Abstract Interpretation fundamentals: Lattices, Galois Connections, Fixed-Points, Widening/Narrowing, Domain constructors, Interprocedural Analysis
• Analysis Techniques: Numerical Domains, CEGAR, Alias analysis, Shape Analysis
• Crafting your own: Soot, From proofs to abstractions, Systematically developing transformers
2 Previously
• Another static analysis example – constant propagation
• Basic concepts in static analysis
  – Control flow graphs
  – Equation systems
  – Collecting semantics
  – (Trace semantics)
3 Annotating programs
Annotate(P, S) =
  case S is x:=aexpr
    return {P} x:=aexpr {F*[x:=aexpr] P}
  case S is S1; S2
    let Annotate(P, S1) be {P} A1 {Q1}
    let Annotate(Q1, S2) be {Q1} A2 {Q2}
    return {P} A1; {Q1} A2 {Q2}
  case S is if bexpr then S1 else S2
    let Pt = F[assume bexpr] P
    let Pf = F[assume ¬bexpr] P
    let Annotate(Pt, S1) be {Pt} A1 {Q1}
    let Annotate(Pf, S2) be {Pf} A2 {Q2}
    return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 ⊔ Q2}
  case S is while bexpr do S
    N := Nc := P // Initialize
    repeat
      let Pt = F[assume bexpr] Nc
      let Annotate(Pt, S) be {Nc} Abody {N}
      Nc := Nc ⊔ N
    until N = Nc
    return {P} INV= {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}
4 Collecting semantics example: input 1
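Program:
  1 label0:
  2 if x <= 0 goto label1
  3 x := x - 1
  4 goto label0
  5 label1:
[Figure: control-flow graph entry → 2 (if x > 0) → 3 (x := x - 1) → back to 2, and 2 → exit, annotated with the states collected so far]
• entry: … [x↦3] [x↦2] [x↦1]
• 2 (if x > 0): [x↦-1] [x↦0] [x↦1]
• exit: [x↦-1] [x↦0]
• 3 (x := x - 1): [x↦1]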
5 Collecting semantics example: input 2
[Figure: the same program and control-flow graph as in the previous example, with one more input propagated]
• entry: … [x↦3] [x↦2] [x↦1]
• 2 (if x > 0): [x↦-1] [x↦0] [x↦1] [x↦2]
• exit: [x↦-1] [x↦0]
• 3 (x := x - 1): [x↦1] [x↦2]
6 Collecting semantics example: input 3
[Figure: the same program and control-flow graph as in the previous example, with one more input propagated]
• entry: … [x↦3] [x↦2] [x↦1]
• 2 (if x > 0): [x↦-1] [x↦0] [x↦1] [x↦2] [x↦3]
• exit: [x↦-1] [x↦0]
• 3 (x := x - 1): [x↦1] [x↦2] [x↦3]
7 ad infinitum – fixed point
[Figure: the same program and control-flow graph as in the previous examples, at the fixed point]
• entry: … [x↦3] [x↦2] [x↦1]
• 2 (if x > 0): … [x↦-1] [x↦0] [x↦1] [x↦2] [x↦3] …
• exit: … [x↦-2] [x↦-1] [x↦0]
• 3 (x := x - 1): [x↦1] [x↦2] [x↦3] …
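The accumulation of states shown on these slides can be sketched as a plain fixed-point iteration. This is only an illustration under a loud assumption: the input set is restricted to a finite range so that the iteration terminates, whereas the slides consider all integers. The node names ("entry", "head", "body", "exit") and the function `collecting_semantics` are hypothetical.

```python
# Sketch: collecting semantics of the loop "while x > 0: x := x - 1".
# Assumption: inputs come from a finite set, so the iteration terminates.

def collecting_semantics(inputs):
    # One set of states per program point; a state is just the value of x.
    R = {"entry": set(inputs), "head": set(), "body": set(), "exit": set()}
    while True:
        new = {
            "entry": set(inputs),
            # loop head: states from entry, plus states produced by x := x - 1
            "head": R["entry"] | {x - 1 for x in R["body"]},
            # true branch of "if x > 0"
            "body": {x for x in R["head"] if x > 0},
            # false branch of "if x > 0"
            "exit": {x for x in R["head"] if x <= 0},
        }
        if new == R:  # nothing changed: fixed point reached
            return R
        R = new

result = collecting_semantics(range(-1, 4))
print(sorted(result["exit"]))
```

With inputs -1..3 the exit set contains only non-positive values, in line with the predicate at exit on the next slide.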
8 Predicates at fixed point
[Figure: the same program and control-flow graph, annotated with predicates instead of sets of states]
• entry: {true}
• 2 (if x > 0): {true}
• exit: {x≤0}
• 3 (x := x - 1): {x>0} before the assignment, {x≥0} after it
9 Equational definition example
• A vector of variables R[0, 1, 2, 3, 4]
• R[0] = {x∈Z} // established input
  R[1] = R[0] ∪ R[4]
  R[2] = R[1] ∩ {s | s(x) > 0} // semantic function for assume x>0
  R[3] = R[1] ∩ {s | s(x) ≤ 0}
  R[4] = ⟦x:=x-1⟧ R[2] // semantic function for x:=x-1 lifted to sets of states
• A (recursive) system of equations
[Figure: control-flow graph with R[0] at entry, R[1] before "if x > 0", R[2] on the true branch into "x := x-1", R[4] after the assignment, and R[3] on the false branch into exit]
10 General definition
• A vector of variables R[0, …, k], one per input/output of a node
  – R[0] is for entry
• For node n with multiple predecessors add equation R[n] = ∪{R[k] | k is a predecessor of n}
• For an atomic operation node R[m] S R[n] add equation R[n] = ⟦S⟧ R[m]
• Transform if b then S1 else S2 to (assume b; S1) or (assume ¬b; S2)
[Figure: control-flow graph with R[0] at entry, R[1] before "if x > 0", R[2] on the true branch into "x := x-1", R[4] after the assignment, and R[3] on the false branch into exit]
11 Current lecture
Appendix A.
• Semantic domains
  – Preorders
  – Partial orders (posets)
  – Pointed posets
  – Ascending/descending chains
  – The height of a poset
  – Join and Meet operators
  – Complete lattices
  – Constructing new lattices from old
12 Abstract interpretation Theory [1977]
13 Abstract Interpretation [CC77]
• A very general mathematical framework for approximating semantics
  – Generalizes Hoare Logic
  – Generalizes weakest precondition calculus
• Allows designing sound static analysis algorithms
  – Usually compute by iterating to a fixed-point
  – Not specific to any programming language style
• Results of an abstract interpretation are (loop) invariants
  – Can be interpreted as axiomatic verification assertions and used for verification
14 Annotating programs
Annotate(P, S) =
  case S is x:=aexpr
    return {P} x:=aexpr {F*[x:=aexpr] P}
    // Approximates concrete semantics: sp(x:=aexpr, P) ⇒ F*[x:=aexpr] P
  case S is S1; S2
    let Annotate(P, S1) be {P} A1 {Q1}
    let Annotate(Q1, S2) be {Q1} A2 {Q2}
    return {P} A1; {Q1} A2 {Q2}
  case S is if bexpr then S1 else S2
    let Pt = F[assume bexpr] P
    let Pf = F[assume ¬bexpr] P
    let Annotate(Pt, S1) be {Pt} A1 {Q1}
    let Annotate(Pf, S2) be {Pf} A2 {Q2}
    return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 ⊔ Q2}
    // Approximates disjunction
  case S is while bexpr do S
    N := Nc := P // Initialize
    repeat
      let Pt = F[assume bexpr] Nc
      let Annotate(Pt, S) be {Nc} Abody {N}
      Nc := Nc ⊔ N
    until N = Nc
    return {P} INV= {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}
    // [consp]: from { P’ } S { Q’ } conclude { P } S { Q } if P⇒P’ and Q’⇒Q
15 The big picture
• Use semantic domains to define both concrete semantics and abstract semantics
• Relate semantics in a sound way
• Interpret program over abstract semantics
[Diagram: under the collecting semantics, statement S maps a set of states to a set of states; under the abstract semantics, statement S maps an abstract representation of sets of states to an abstract representation of sets of states; "abstraction" and "meaning" arrows relate the two levels on each side]
16 A theory of semantic domains
1. Approximating elements
2. Approximating sets of elements
17 Overall idea
• A semantic domain can be used to define properties (representations of predicates)
  – Also called abstract states
• Common representations
  – Logical formulas
  – Automata
  – Specialized graphs
18 A taxonomy of semantic domain types
Complete Lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤)
Lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤)
Join semilattice (D, ⊑, ⊔, ⊤)    Meet semilattice (D, ⊑, ⊓, ⊥)
Complete partial order (CPO) (D, ⊑, ⊥)
Partial order (poset) (D, ⊑)
Preorder (D, ⊑)
19 preorders
20 Preorder
• Let D be a set of elements
• We say that a binary order relation ⊑ over D is a preorder if the following conditions hold for every d, d’, d’’ ∈ D
  – Reflexive: d ⊑ d
  – Transitive: d ⊑ d’ and d’ ⊑ d’’ implies d ⊑ d’’
• There may exist d, d’ such that d ⊑ d’ and d’ ⊑ d yet d ≠ d’
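The definition can be checked mechanically on a small finite set. The sketch below is not from the lecture: the example relation (comparing integers by absolute value) and the helper names are assumptions chosen only to show a relation that is reflexive and transitive but not anti-symmetric.

```python
from itertools import product

def is_reflexive(D, leq):
    return all(leq(d, d) for d in D)

def is_transitive(D, leq):
    return all(leq(a, c)
               for a, b, c in product(D, repeat=3)
               if leq(a, b) and leq(b, c))

def is_antisymmetric(D, leq):
    return all(a == b
               for a, b in product(D, repeat=2)
               if leq(a, b) and leq(b, a))

# Hypothetical example: order integers by absolute value.
D = [-2, -1, 1, 2]

def leq_abs(a, b):
    return abs(a) <= abs(b)

print(is_reflexive(D, leq_abs), is_transitive(D, leq_abs), is_antisymmetric(D, leq_abs))
```

Here -1 and 1 are each below the other yet differ, which is exactly the situation the last bullet describes.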
21 Preorder examples
• SAV-predicates
  – SAV-factoids = { x = y | x, y ∈ Var } ∪ { x = y + z | x, y, z ∈ Var }
  – SAV-predicates = 2^SAV-factoids
  – Order relation 1: P1 ⊑set P2 iff P1 ⊇ P2
  – Order relation 2: P1 ⊑imp P2 iff P1 ⇒ P2
  – Which order relation is stronger (contains more pairs)?
  – Which order relation is easier to check?
  – What if both P1 and P2 are in the image of explicate?
22 SAV preorder 1: P1 ⊑set P2 iff P1 ⊇ P2
{} Var = {x, y}
{x=y} {y=x} {x=x+x} {y=y+y} {y=x+y} {y=y+x} {x=x+y} {x=y+x}
{x=y, y=x} {x=y, x=x+x} … {x=x+y, x=y+x}
{x=y, x=x+x, x=x+y} … {x=y, x=x+x, x=x+y}
{x=y, y=x, x=x+x, y=y+y, y=x+y, y=y+x, x=x+y, x=y+x}
23 SAV preorder 2: P1 ⊑imp P2 iff P1 ⇒ P2
{} Var = {x, y}
{x=y} {y=x} {x=x+x} {y=y+y} {y=x+y} {y=y+x} {x=x+y} {x=y+x}
{x=y, y=x} {x=y, x=x+x} … {x=x+y, x=y+x}
…
{x=y, x=x+x, x=x+y} … {x=y, x=x+x, x=x+y}
{x=y, y=x, x=x+x, y=y+y, y=x+y, y=y+x, x=x+y, x=y+x}
24 Preorder examples
• CP-predicates
  – CP-factoids = { x = c | x ∈ Var, c ∈ Z }
  – CP-predicates = 2^CP-factoids
  – Order relation 1: P1 ⊑set P2 iff P1 ⊇ P2
  – Order relation 2: P1 ⊑imp P2 iff P1 ⇒ P2
  – Is there a difference?
    • {x=5, x=7, x=9} ⊑set {x=5, x=7}
    • {x=5, x=7, x=9} ⊑imp {x=5, x=7}
    • {x=5, x=7} ⊑imp {x=5, x=7, x=9}
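The difference between the two orders can be tried out on the predicates above. This is a minimal sketch, assuming a CP predicate is encoded as a frozenset of (variable, constant) pairs read as a conjunction; the function names are hypothetical. The implication check relies on the observation that a satisfiable conjunction of factoids implies x=c only when x=c is one of its factoids.

```python
def leq_set(p1, p2):
    # Order relation 1: P1 is below P2 iff P1 is a superset of P2.
    return p1 >= p2

def consistent(p):
    # A conjunction of factoids is satisfiable iff no variable
    # is bound to two different constants.
    seen = {}
    for var, c in p:
        if seen.setdefault(var, c) != c:
            return False
    return True

def leq_imp(p1, p2):
    # Order relation 2: P1 implies P2.
    # An unsatisfiable P1 implies everything; otherwise every
    # factoid of P2 must already appear in P1.
    return (not consistent(p1)) or p1 >= p2

a = frozenset({("x", 5), ("x", 7), ("x", 9)})
b = frozenset({("x", 5), ("x", 7)})

print(leq_set(a, b), leq_set(b, a))   # set order relates them one way only
print(leq_imp(a, b), leq_imp(b, a))   # both are unsatisfiable, so each implies the other
```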
25 CP preorder example
{}
… {x=-3} {x=-2} {x=-1} {x=0} {x=1} {x=2} {x=3} …
Var = {x}
26 CP preorder example
{}
… {x=-3} {x=0} {x=3} … {y=-5} {y=0} {y=36} …
{x=-3, y=-5} {x=0, y=0} {x=3, y=36}
Var = {x, y}
27 The problem with preorders
• Equivalent elements have different representations
  – {x=y, x=a+b} S {Q}
  – {x=y, y=a+b} S {Q’}
• Leads to unpredictability
• Which result should our static analysis give?
28 The problem with preorders
• Equivalent elements have different representations
  – {x=y, x=a+b} assume y≠a+b {x=y, x=a+b}
  – {x=y, y=a+b} assume y≠a+b {false}
• Leads to unpredictability
• Which result should our static analysis give?
29 The problem with preorders
• Equivalent elements have different representations
  – {x=y, x=a+b} assume x≠a+b {false}
  – {x=y, y=a+b} assume x≠a+b {x=y, x=a+b}
• Leads to unpredictability
• Which result should our static analysis give?
In practice many static analyses still use preorders
30 Partial orders
31 Partially ordered sets (partial orders)
• A partially ordered set (poset for short) is a pair (D, ⊑)
• D is a set of elements – a semantic domain
• ⊑ is a partial order between pairs of elements from D. That is, ⊑ ⊆ D × D with the following properties, for all d, d’, d’’ in D
  – Reflexive: d ⊑ d
  – Transitive: d ⊑ d’ and d’ ⊑ d’’ implies d ⊑ d’’
  – Anti-symmetric: d ⊑ d’ and d’ ⊑ d implies d = d’ (makes it easier to choose the best element)
• If d ⊑ d’ and d ≠ d’ we write d ⊏ d’
33 SAV partial order
• SAV-predicates
  – SAV-factoids = { x = y | x, y ∈ Var } ∪ { x = y + z | x, y, z ∈ Var }
  – SAV-predicates = 2^SAV-factoids
• Order relation 1: P1 ⊑set P2 iff P1 ⊇ P2. Is this a partial order?
• Order relation 2: P1 ⊑imp P2 iff P1 ⇒ P2, that is models(P1) ⊆ models(P2). Is this a partial order?
• Order relation 3: P1 ⊑set* P2 iff Explicate(P1) ⊇ Explicate(P2). Is this a partial order?
34 CP partial order
• CP-predicates
  – CP-factoids = { x = c | x ∈ Var, c ∈ Z }
  – CP-predicates = 2^CP-factoids
• Order relation 1: P1 ⊑set P2 iff P1 ⊇ P2. Is it a partial order?
• Order relation 2: P1 ⊑imp P2 iff P1 ⇒ P2. Is it a partial order?
Can we define a more precise partial order?
35 CP partial order
• CP-predicates
  – CP-factoids = { x = c | x ∈ Var, c ∈ Z }
  – CP-predicates^false = 2^CP-factoids ∪ {false}
  – Define reduce : 2^CP-factoids → 2^CP-factoids ∪ {false}
    reduce(P) = if exists {x=c1, x=c2} ⊆ P with c1 ≠ c2 then {false} else P
  – CP^false = { P ∈ 2^CP-factoids | P=reduce(P) } ∪ {false}
• Order relation: P1 ⊑ P2 if P1 ⊇ P2 or P1={false}
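A small sketch of reduce and the resulting order, under assumed encodings: a predicate is a frozenset of (variable, constant) pairs, and the element {false} is modelled by the hypothetical constant `FALSE`. The point is that after reduction all contradictory predicates collapse to one representative, so equivalent elements no longer have different representations.

```python
# Assumed encoding of the element written {false} on the slide.
FALSE = frozenset({"false"})

def reduce(p):
    # Collapse any predicate with two different constants for the
    # same variable to the single element FALSE.
    if p == FALSE:
        return FALSE
    values = {}
    for var, c in p:
        if values.setdefault(var, c) != c:
            return FALSE
    return frozenset(p)

def leq(p1, p2):
    # Order on reduced predicates: P1 is below P2 if P1 is a
    # superset of P2 or P1 is FALSE.
    return p1 >= p2 or p1 == FALSE

a = reduce(frozenset({("x", 5), ("x", 7), ("x", 9)}))
b = reduce(frozenset({("x", 5), ("x", 7)}))
c = reduce(frozenset({("x", 5), ("y", 1)}))

print(a == b)                   # both contradictions reduce to the same element
print(leq(a, c), leq(c, a))     # FALSE is below c, but not the other way round
```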
36 Pointed poset
• A poset (D, ⊑) with a least element ⊥ is called a pointed poset
  – For all d∈D we have that ⊥ ⊑ d
• The pointed poset is denoted by (D, ⊑, ⊥)
• We can always transform a poset (D, ⊑) into a pointed poset by adding a special bottom element: (D ∪ {⊥}, ⊑ ∪ {⊥ ⊑ d | d∈D}, ⊥)
• Example: CP^false = { P ∈ 2^CP-factoids | P=reduce(P) } ∪ {false}
37 chains
38 Chains
• If d ⊑ d’ and d ≠ d’ we write d ⊏ d’
• Similarly define d ⊐ d’
• Let (D, ⊑) be a poset
• An ascending chain is a sequence x1 ⊏ x2 ⊏ … ⊏ xk …
• A descending chain is a sequence x1 ⊐ x2 ⊐ … ⊐ xk …
• The height of a poset is the length of the maximal ascending chain
  – What is the height of the SAV poset?
  – What is the height of the CP poset?
39 Ascending chain example
true
x≤0 x≥0
x<0 x=0 x>0
false
40 Joining elements
41 Bounds
• Let (D, ⊑) be a poset
• Let X ⊆ D be a set of elements from D
• An element d∈D is an upper bound (ub) of X iff for every x∈X we have that x ⊑ d
• An element d∈D is a lower bound (lb) of X iff for every x∈X we have that d ⊑ x
• An element d∈D is the least upper bound (lub) of X iff d is the minimal of all upper bounds of X
• An element d∈D is the greatest lower bound (glb) of X iff d is the maximal of all lower bounds of X
42 Bounds example
true the signs lattice (for variable x)
x≤0 x≥0
x<0 x=0 x>0
false
43 x0 and true are upper bounds
true
x≤0 x≥0
x<0 x=0 x>0
false
44 x0 is the least upper bound
true
x≤0 x≥0
x<0 x=0 x>0
false
45 Join (confluence) operator
• Assume a poset (D, ⊑)
• Let X ⊆ D be a subset of D (finite/infinite)
• The join of X is defined as
  – ⊔X = the least upper bound (LUB) of all elements in X if it exists
  – ⊔X = min{ b | forall x∈X we have that x ⊑ b }
  – The supremum of the elements in X
  – A kind of abstract union (disjunction) operator
• Properties of a join operator
  – Commutative: x ⊔ y = y ⊔ x
  – Associative: (x ⊔ y) ⊔ z = x ⊔ (y ⊔ z)
  – Idempotent: x ⊔ x = x
• x ⊔ y = y iff x ⊑ y
46 Properties of join
• Can be used to define partial order: x ⊔ y = y iff x ⊑ y
• Monotone: if y ⊑ z then (x ⊔ y) ⊑ (x ⊔ z)
• x ⊔ ⊥ = x
• x ⊔ ⊤ = ⊤
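These properties can be checked exhaustively on a small example. The sketch below assumes the powerset of {1, 2, 3} with union as the join; this choice of lattice and the helper names are illustrative only. The order is recovered from the join, as the first bullet suggests.

```python
from itertools import combinations, product

def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

# Assumed example lattice: subsets of {1, 2, 3}, join is union.
D = powerset({1, 2, 3})
BOT, TOP = frozenset(), frozenset({1, 2, 3})

def join(x, y):
    return x | y

def leq(x, y):
    # Partial order recovered from the join operator.
    return join(x, y) == y

commutative = all(join(x, y) == join(y, x) for x, y in product(D, repeat=2))
associative = all(join(join(x, y), z) == join(x, join(y, z))
                  for x, y, z in product(D, repeat=3))
idempotent = all(join(x, x) == x for x in D)
monotone = all(leq(join(x, y), join(x, z))
               for x, y, z in product(D, repeat=3) if leq(y, z))
units = all(join(x, BOT) == x and join(x, TOP) == TOP for x in D)

print(commutative, associative, idempotent, monotone, units)
```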
47 Meet operator
• Assume a poset (D, ⊑)
• Let X ⊆ D be a subset of D (finite/infinite)
• The meet of X is defined as
  – ⊓X = the greatest lower bound (GLB) of all elements in X if it exists
  – ⊓X = max{ b | forall x∈X we have that b ⊑ x }
  – The infimum of the elements in X
  – A kind of abstract intersection (conjunction) operator
• Properties of a meet operator
  – Commutative: x ⊓ y = y ⊓ x
  – Associative: (x ⊓ y) ⊓ z = x ⊓ (y ⊓ z)
  – Idempotent: x ⊓ x = x
48 Complete partial orders
49 Complete partial order (CPO)
• A CPO is a partial order where each ascending chain has a supremum
50 lattices
• A complete lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤) is
• A set of elements D
• A partial order x ⊑ y
• A join operator ⊔
• A meet operator ⊓
52 Join semilattice
• A join semilattice (D, ⊑, ⊔, ⊤) is
• A set of elements D with
• A partial order x ⊑ y
• A join operator ⊔
53 Meet semilattice
• A meet semilattice (D, ⊑, ⊓, ⊥) is
• A set of elements D with
• A partial order x ⊑ y
• A meet operator ⊓
54 Powerset lattices
• For a set of elements X we define the powerset lattice for X as (2^X, ⊆, ∪, ∩, ∅, X)
  – Notice it is a complete lattice
• For a set of program states State, we define the collecting lattice (2^State, ⊆, ∪, ∩, ∅, State)
55 Composing lattices
56 One lattice per variable
true true
x≤0 x≥0 y≤0 y≥0
x<0 x=0 x>0 y<0 y=0 y>0
false false
How can we compose them?
57 Cartesian product of complete lattices
• For two complete lattices L1 = (D1, ⊑1, ⊔1, ⊓1, ⊥1, ⊤1) and L2 = (D2, ⊑2, ⊔2, ⊓2, ⊥2, ⊤2)
• Define the poset Lcart = (D1×D2, ⊑cart, ⊔cart, ⊓cart, ⊥cart, ⊤cart) as follows:
  – (x1, x2) ⊑cart (y1, y2) iff x1 ⊑1 y1 and x2 ⊑2 y2
  – ⊔cart = ? ⊓cart = ? ⊥cart = ? ⊤cart = ?
• Lemma: Lcart is a complete lattice
• Define the Cartesian constructor Lcart = Cart(L1, L2)
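One natural way to fill in the question marks is to define every operation componentwise. The sketch below does that; it is an illustration under assumptions, not the lecture's definition. The `Lattice` record, `powerset_lattice`, and `cart` are hypothetical names, and the component lattices are powersets of the signs '-', '0', '+', which is slightly larger than the signs lattice drawn on the slides because it also contains the element "nonzero".

```python
from collections import namedtuple

# Hypothetical record bundling the operations of a lattice.
Lattice = namedtuple("Lattice", "leq join meet bot top")

def powerset_lattice(universe):
    u = frozenset(universe)
    return Lattice(
        leq=lambda a, b: a <= b,
        join=lambda a, b: a | b,
        meet=lambda a, b: a & b,
        bot=frozenset(),
        top=u,
    )

def cart(l1, l2):
    # Componentwise order, join, meet, bottom and top.
    return Lattice(
        leq=lambda a, b: l1.leq(a[0], b[0]) and l2.leq(a[1], b[1]),
        join=lambda a, b: (l1.join(a[0], b[0]), l2.join(a[1], b[1])),
        meet=lambda a, b: (l1.meet(a[0], b[0]), l2.meet(a[1], b[1])),
        bot=(l1.bot, l2.bot),
        top=(l1.top, l2.top),
    )

signs_x = powerset_lattice("-0+")
signs_y = powerset_lattice("-0+")
L = cart(signs_x, signs_y)

both_negative = (frozenset("-"), frozenset("-"))
both_positive = (frozenset("+"), frozenset("+"))
joined = L.join(both_negative, both_positive)
print(joined)
```

The joined pair allows x and y to take either sign independently, so the correlation between the signs of x and y is lost.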
58 Cartesian product example
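[Hasse diagram of the Cartesian product of the signs lattices for x and y; true stands for (true, true) and false for (false, false)]
true (true, true)
x≤0 x≥0 y≤0 y≥0
x≤0,y≤0 x≤0,y≥0 x≥0,y≤0 x≥0,y≥0 …
x≤0,y<0 x≥0,y<0 x≤0,y=0 x≥0,y=0 x≤0,y>0 x≥0,y>0 … x>0,y≤0 x>0,y≥0 …
x<0,y<0 x<0,y=0 x<0,y>0 x=0,y<0 x=0,y=0 x=0,y>0 x>0,y<0 x>0,y=0 x>0,y>0
false (false, false)
How does it represent (x<0∧y<0) ∨ (x>0∧y>0)?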
59 Disjunctive completion
• For a complete lattice L = (D, ⊑, ⊔, ⊓, ⊥, ⊤)
• Define the powerset lattice L∨ = (2^D, ⊑∨, ⊔∨, ⊓∨, ⊥∨, ⊤∨)
  – ⊑∨ = ? ⊔∨ = ? ⊓∨ = ? ⊥∨ = ? ⊤∨ = ?
• Lemma: L∨ is a complete lattice
• L∨ contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates
• Define the disjunctive completion constructor L∨ = Disj(L)
60 The base lattice CPfalse
true
… {x=-2} {x=-1} {x=0} {x=1} {x=2} …
false
61 The disjunctive completion of CPfalse What is the height of this lattice? true
… {x=-2} {x=-1} {x=0} {x=1} {x=2} …
… {x=-2 ∨ x=-1} {x=-2 ∨ x=0} {x=-2 ∨ x=1} … {x=1 ∨ x=2} …
… {x=-1 ∨ x=1 ∨ x=-2} … {x=0 ∨ x=1 ∨ x=2} … …
false
62 Relational product of lattices
• L1 = (D1, ⊑1, ⊔1, ⊓1, ⊥1, ⊤1), L2 = (D2, ⊑2, ⊔2, ⊓2, ⊥2, ⊤2)
• Lrel = (2^(D1×D2), ⊑rel, ⊔rel, ⊓rel, ⊥rel, ⊤rel) as follows:
  – Lrel = ?
63 Relational product of lattices
• L1 = (D1, ⊑1, ⊔1, ⊓1, ⊥1, ⊤1), L2 = (D2, ⊑2, ⊔2, ⊓2, ⊥2, ⊤2)
• Lrel = (2^(D1×D2), ⊑rel, ⊔rel, ⊓rel, ⊥rel, ⊤rel) as follows:
  – Lrel = Disj(Cart(L1, L2))
• Lemma: Lrel is a complete lattice
• What does it buy us?
64 Cartesian product example
[Hasse diagram of the Cartesian product of the signs lattices for x and y]
true
x≤0 x≥0 y≤0 y≥0
x≤0,y≤0 x≤0,y≥0 x≥0,y≤0 x≥0,y≥0 …
x≤0,y<0 x≥0,y<0 x≤0,y=0 x≥0,y=0 x≤0,y>0 x≥0,y>0 … x>0,y≤0 x>0,y≥0 …
x<0,y<0 x<0,y=0 x<0,y>0 x=0,y<0 x=0,y=0 x=0,y>0 x>0,y<0 x>0,y=0 x>0,y>0
false
How does it represent (x<0∧y<0) ∨ (x>0∧y>0)?
What is the height of this lattice?
65 Relational product example
true
x≤0 x≥0 y≤0 y≥0
(x<0∧y<0)∨(x>0∧y>0) (x<0∧y<0)∨(x>0∧y=0) (x<0∧y≤0)∨(x<0∧y≥0)
…
false
How does it represent (x<0∧y<0) ∨ (x>0∧y>0)?
What is the height of this lattice?
66 Collecting semantics
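Program:
  1 label0:
  2 if x <= 0 goto label1
  3 x := x - 1
  4 goto label0
  5 label1:
[Figure: control-flow graph entry → 2 (if x > 0) → 3 (x := x - 1) → back to 2, and 2 → exit, annotated with the states at the fixed point]
• entry: … [x↦3] [x↦2] [x↦1]
• 2 (if x > 0): … [x↦-1] [x↦0] [x↦1] [x↦2] [x↦3] …
• exit: … [x↦-2] [x↦-1] [x↦0]
• 3 (x := x - 1): [x↦1] [x↦2] [x↦3] …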
67 Defining the collecting semantics
• How should we represent the set of states at a given control-flow node by a lattice?
• How should we represent the sets of states at all control-flow nodes by a lattice?
68 Finite maps
• For a complete lattice L = (D, ⊑, ⊔, ⊓, ⊥, ⊤) and finite set V
• Define the poset LV→L = (V→D, ⊑V→L, ⊔V→L, ⊓V→L, ⊥V→L, ⊤V→L) as follows:
  – f1 ⊑V→L f2 iff for all v∈V f1(v) ⊑ f2(v)
  – ⊔V→L = ? ⊓V→L = ? ⊥V→L = ? ⊤V→L = ?
• Lemma: LV→L is a complete lattice
• Define the map constructor LV→L = Map(V, L)
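A pointwise reading of the map constructor can be sketched with Python dictionaries. The assumptions here are that maps are total over the same finite key set V and that the underlying lattice is a powerset with union and intersection; `map_leq`, `map_join`, and `map_meet` are hypothetical helper names, offered as one plausible way to answer the question marks.

```python
def map_leq(f1, f2, leq):
    # f1 is below f2 iff f1(v) is below f2(v) for every v in V.
    return all(leq(f1[v], f2[v]) for v in f1)

def map_join(f1, f2, join):
    # Pointwise join.
    return {v: join(f1[v], f2[v]) for v in f1}

def map_meet(f1, f2, meet):
    # Pointwise meet.
    return {v: meet(f1[v], f2[v]) for v in f1}

# Assumed example: V is a set of control-flow nodes and the element
# lattice is sets of values of x, ordered by inclusion.
leq = lambda a, b: a <= b
join = lambda a, b: a | b
meet = lambda a, b: a & b

f1 = {"entry": frozenset({1}), "exit": frozenset()}
f2 = {"entry": frozenset({2}), "exit": frozenset({0})}

joined = map_join(f1, f2, join)
print(joined)
print(map_leq(f1, joined, leq), map_leq(f2, joined, leq))
```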
69 The collecting lattice
• Lattice for a given control-flow node v: ?
• Lattice for entire control-flow graph with nodes V: ?
• We will use this lattice as a baseline for static analysis and define abstractions of its elements
70 The collecting lattice
• Lattice for a given control-flow node v: Lv = (2^State, ⊆, ∪, ∩, ∅, State)
• Lattice for entire control-flow graph with nodes V: LCFG = Map(V, Lv)
• We will use this lattice as a baseline for static analysis and define abstractions of its elements
71 Equational definition of the semantics
• Define variables of type set of states for each control-flow node
• Define constraints between them
[Figure: control-flow graph with R[entry] at entry, R[2] at node 2 (if x > 0), R[3] at node 3 (x := x - 1), and R[exit] at exit]
72 Equational definition of the semantics
• R[2] = R[entry] ∪ ⟦x:=x-1⟧ R[3]
• R[3] = R[2] ∩ {s | s(x) > 0}
• R[exit] = R[2] ∩ {s | s(x) ≤ 0}
• A system of recursive equations
• How can we approximate it using what we have learned so far?
[Figure: control-flow graph with R[entry] at entry, R[2] at node 2 (if x > 0), R[3] at node 3 (x := x - 1), and R[exit] at exit]
73 An abstract semantics
• R[2] = R[entry] ⊔ ⟦x:=x-1⟧# R[3], where ⟦x:=x-1⟧# is the abstract transformer for x:=x-1
• R[3] = R[2] ⊓ {s | s(x) > 0}#, where {s | s(x) > 0}# is the abstract representation of {s | s(x) > 0}
• R[exit] = R[2] ⊓ {s | s(x) ≤ 0}#
• A system of recursive equations
[Figure: control-flow graph with R[entry] at entry, R[2] at node 2 (if x > 0), R[3] at node 3 (x := x - 1), and R[exit] at exit]
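The abstract equation system can be solved by iterating from the bottom element until nothing changes. The sketch below is one possible instantiation and rests on several assumptions not fixed by the slides: abstract values are sets of signs ('-', '0', '+') with union as join and intersection as meet, and the table `DEC` is a hand-written abstract transformer for x:=x-1. The names `dec_abs` and `solve` are hypothetical.

```python
# Assumed abstract transformer for x := x - 1 on single signs:
# negative stays negative, zero becomes negative,
# positive becomes zero or positive.
DEC = {"-": {"-"}, "0": {"-"}, "+": {"0", "+"}}

def dec_abs(a):
    return frozenset(s for sign in a for s in DEC[sign])

POSITIVE = frozenset("+")        # abstract representation of {s | s(x) > 0}
NON_POSITIVE = frozenset("-0")   # abstract representation of {s | s(x) <= 0}

def solve(entry):
    # Start from bottom everywhere except the entry.
    R = {"entry": frozenset(entry), "2": frozenset(),
         "3": frozenset(), "exit": frozenset()}
    while True:
        new = {
            "entry": R["entry"],
            "2": R["entry"] | dec_abs(R["3"]),   # join
            "3": R["2"] & POSITIVE,              # meet
            "exit": R["2"] & NON_POSITIVE,       # meet
        }
        if new == R:
            return R
        R = new

print(sorted(solve("-0+")["exit"]))
print(sorted(solve("+")["exit"]))
```

Because the abstract domain is finite, the iteration always terminates, unlike the concrete system over all integers. Starting from a positive input, the analysis concludes that x is zero at exit.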
74 Next lecture: abstract interpretation II