Fundamentals of Programming Languages V Abstract Interpretation
Guoqiang Li
School of Software, Shanghai Jiao Tong University Assignment
Assignment 2 is announced! Deadline Nov. 15 Reference
Cousot P., Cousot R., Abstract Interpretation: A Unified Lattice Mode for Static Analysis of Programs by Construction or Approximation of Fixpoints, POPL’77
Steffen, B., Data Flow Analysis as Model Checking, TACS’91
Schmidt, D.A., Data Flow Analysis is Model Checking of Abstract Interpretations POPL’98
Lacey,D., Proving Correctness of Compiler Optimizations by Temporal Logic POPL’02 Turing Award 1986, Tony Hoare For his fundamental contributions to the definition and design of programming languages...
1991, Robin Milner For three distinct and complete achievements: 1) LCF, the mechanization of Scott’s Logic of Computable Functions, probably the first theoretically based yet practical tool for machine assisted proof construction; 2) ML, the first language to include polymorphic type inference together with a type-safe exception-handling mechanism; 3) CCS, a general theory of concurrency. In addition, he formulated and strongly advanced full abstraction, the study of the relationship between operational and denotational semantics...
1996, Amir Pnueli For seminal work introducing temporal logic into computing science and for outstanding contributions to program and systems verification...
2007, Edmund M. Clarke, E. Allen Emerson and Joseph Sifakis For their roles in developing model checking into a highly effective verification technology, widely adopted in the hardware and software industries... Abstract Interpretation in Practice Hardware Model Checking
• Classical Model Checking (in Clarke’s book)(hardware model checking): Implementer and verifier are same. • During designing a system, each model is checked and refined. • A correct system implementation is finally obtained. • e.g. a modulo 8 counter 0 • v0 = ¬v0 0 • v1 = v0 ⊕ v1 0 • v2 = (v0 ∧ v1) ⊕ v2 Software Model Checking
• Modern Model Checking(software model checking): Implementer and verifier may be different. • Given a system implementation (e.g. a source code), construct a model and verify it. • Construction of models: manually/automatically. • e.g. • security protocols • real-time embedded systems • program analysis Classical Program Analysis
• From POPA • From Dragon book • available expression analysis • dead code detection • reaching definition analysis • constant propagation • very busy expression analysis • common expression detection • live variable analysis • code motion
• Concurrent program analysis • Object-oriented program analysis • deadlock, livelock • point-to analysis • data race • escape analysis • • inference of class invariants Aspect-oriented program analysis • shape analysis • ????? Brief Idea of Abstract Interpretation
• In order to know properties of a program, execute it for all possible inputs. • It will not terminate since domains of variables are often infinite. • Abstract input data to a suitable finite domain. Then finiteness guarantees termination. • Functions needs to be reinterpreted. • Examples: • Security protocols: a principal may receive infinitely many messages from environments. (some affects the principal, some are not) • Embedded systems: a scheduler may have to schedule infinitely many released tasks. (when time exceeds, all tasks have the same results) Abstract Interpretation on Program Analysis: Examples • Even/Odd analysis: x × (x + 1) is even. • Abstraction: • Even • Odd • Interpretation: • Even + Even = Even, Even + Odd = Odd, • Even × Odd = Even,… • Positive/nagative analysis • Abstraction: • Pos • Neg • Zero • Interpretation: • Pos + Pos = Pos, Pos + Neg = {Pos, Zero, Neg},… • Neg × Neg = Pos, Pos × Zero = Zero,… • The result will be a (sound) approximation. Sound Abstraction
The result is an approximation
Completeness: sufficient condition • if no errors detected there are really no errors.
Soundness: necessary condition • if errors detected they are really errors.
f • Data Abstraction : abs : D → Abs DD • Typical concretization : con = abs−1 con abs abs • abs · con id ∧ con · abs ⊇ id ( = Abs) ( D) Int(f) guarantee soundness. Abs Abs
1 Abstract Interpretation
Define abstraction and finite abstract domain
abs : D → Abs
Interpret primitive functions
+, ×, if , =, ≥, ≤,...
Execute looping structures via program semantics
while, function call Program Semantics
Semantics: interpret a program (syntax) as an input-output relation (semantics).
Intuitively, the input-output relation is a relation between input environment and output environment of variables in a scopes.
Looping structure is solved as least fixed point.
Theorem Tarski’s least fixed point theorem: Assume τ : P(D) → P(D) is a monotonic, then there exists the S i unique least fixed point, which is computed as i τ (∅). We will come back to this theorem again! This can be read as ThisThis cancan bebe readread asas • UnfoldExample: while sentence Computing as repeated applications Sum of if. •• UnfoldUnfold whilewhile sentencesentence asas repeatedrepeated applicationsapplications ofof ifif.. env = [n,i,s,c] env = [n,i,s,c] envτ: env = [n,i,s,c]×envenv =→ [n, i,envs, c] ×env /* sum */ if ¬c τ: env×env → env×env /* sum */ τ: τenv: P×(envenv× env→) →envP(env×env× env) /*1: sumread */ n; if ¬c 1: read n; ifthen ¬ cskip 1:2: readi:=0; n; then skip step 0: φ 2: i:=0; thenelse skip τ step 0: φ 2:3: i:=0;S:=0; else step 0: φ 3: S:=0; elseS:=S+i; τ step 1: 3:4: S:=0;c:=True; S:=S+i; τ step 1: 4: c:=True; S:=S+i;i:=i+1; step{([ 01:,0, 0,True], 4:5: c:=True;while c i:=i+1; {([0,0,0,True], 5: while c i:=i+1;c:=i<=n; {([[00,0,,1,00,True],,False])} 5: whiledo c c:=i<=n; τ2 [0,1,0,False])} do c:=i<=n;if ¬c τ2 [0,1,0,False])} 6: doS=S+i; if ¬c τ2 step 2: 6: S=S+i; ifthen ¬ cskip step 2: 6:7: S=S+i;i:=i+1; then skip step{([ 02:,0, 0,True], 7:8: i:=i+1;c:=i<=n; thenelse skip {([0,0,0,True], 7: i:=i+1; else {([[00,0,,1,00,True],,False]), 8: c:=i<=n;od; elseS:=S+i; [0,1,0,False]), 8: c:=i<=n; S:=S+i; ([01,1,,0,0,False]),,True], 9: od;write S S:=S+i;i:=i+1; ([1,0,0,True], od; i:=i+1; ([[11,0,,2,01,True],,False])} 9: write S i:=i+1;c:=i<=n; [1,2,1,False])} 9: write S c:=i<=n; [1,2,1,False])} Unfoldc:=i<=n; while-loop Unfold while-loop Unfold while-loop Example:Example:ThisThis positive/negative can can Positive/Negative be be read read as as analysis • Abstraction: NatAnalysis→ {Pos, Zero}, Bool → Bool • •UnfoldUnfold while whilesentencesentence as as repeated repeated applications applications of of if .if . Abstraction: Nat → {Pos, Zero} envenv = [n,i,s,c] = [n,i,s,c] env = [n,i,s,c]env = [n, i, s, c] τ: τenv: ×envenv×env→ env→ ×envenv×env /* /*sum sum */ */ if ¬c ττ:: Penv(env××envenv) →→P(envenv ××envenv) /* sum */ if ¬c 1: 1:read read n; n; thenif ¬skipc φ 1: read n; then skip step 0: 2: 2:i:=0; i:=0; elsethen skip step 0: φ 2: i:=0; else τ’ step 0: φ 3: 3:S:=0; S:=0; S:=S+i;else τ step 1: 3: S:=0; S:=S+i; τ step 1: 4: 4:c:=True; c:=True; i:=i+1;S:=S+i; {([Zerostep,Zero, 1: Zero,True], 4: c:=True; i:=i+1; {([0,0,0,True], 5: 5:while while c c c:=i<=n;i:=i+1; [Zero{([,Pos,0,0,Zero0,True],,False])} 5: while c c:=i<=n; 2 [0,1,0,False])} do do c:=i<=n; τ2’ [0,1,0,False])} do if ¬c τ2 6: 6:S=S+i; S=S+i; if ¬c step 2: 6: S=S+i; then skip step 2: 7: 7:i:=i+1; i:=i+1; then skip {([Zerostep,Zero, 2: Zero,True], 7: i:=i+1; else {([0,0,0,True], 8: 8:c:=i<=n; c:=i<=n; else [Zero{([,Pos,0,0,Zero0,True],,False]), 8: c:=i<=n; S:=S+i; [0,1,0,False]), od;od; S:=S+i; [Pos,Zero,[0,1,Zero0,False]),,True], od; i:=i+1; ([1,0,0,True], 9: 9:write write S S i:=i+1; [Pos,Pos,([1,0,Pos0,True],, 9: write S c:=i<=n; [1,2,1,False])} c:=i<=n; [1,2,True/False1,False])}])} Unfold while-loop Converged!Least Fixed (analyzed!) Point! Unfold while-loop ThisThis can can be be read read as as VariationExample: of Variationabstraction • Unfold while sentence as repeated applications of if. •• Abstraction:Unfold while Natsentence→ {Pos, as repeatedZero}, Bool applications→ Bool of if. env = [n,i,s,c] Abstraction: Nat → {Many, One, Zero} env = [n,i,s,c] • Abstraction: Nat → {Many,Oneτ: env,Zero×env}, →Boolenv×→envBool /* sum */ τ: env×envenv= [n, i→, s, c]env×env /* sum */ if ¬c 1: /*read sum n; */ ifif ¬ c¬c τ : P(env × env) → P(env × env) 1: read n; then skip 2: 1:i:=0; read n; thenthen skip skip stepφ 0: φ 2: i:=0; else step 0: step 0: φ 3: 2:S:=0; i:=0; elseelse τ 3: S:=0; S:=S+i; ττ’ step 1: 4: 3:c:=True; S:=0; S:=S+i;S:=S+i; step 1: step 1: 4: c:=True; i:=i+1; {([0,0,0,True], 5: 4:while c:=True; c i:=i+1;i:=i+1; {([Zero{([,Zero,0,0,0Zero,True],,True], 5: while c c:=i<=n; [0,1,0,False])} 5:do while c c:=i<=n;c:=i<=n; τ2 [Zero,One,[0,1,Zero0,False])},False])} do if ¬c ττ’2 2 6: S=S+i;do if¬ ¬c 6: S=S+i; thenif cskip step 2: 7: 6:i:=i+1; S=S+i; then skip step 2: step 2: 7: i:=i+1; elsethen skip {([0,0,0,True], 8: 7:c:=i<=n; i:=i+1; else {([Zero{([,Zero,0,0,0Zero,True],,True], 8: c:=i<=n; elseS:=S+i; [0,1,0,False]), 8:od; c:=i<=n; S:=S+i; [Zero,One,[0,1,Zero0,False]),,False]), od; i:=i+1;S:=S+i; ([1,0,0,True], 9: writeod; S i:=i+1; [One,Zero,([1,0,Zero0,True],,True], 9: write S c:=i<=n;i:=i+1; [1,2,1,False])} 9: write S c:=i<=n;c:=i<=n; [One,One,[1,2,One1,False])},True])} Unfold while-loop Unfold while-loop Finer abstraction will find more; but this is trade-off Program Analysis = Abstract Interpretation + Model Checking Model Checking Based Program Analysis
• Each line of a program is regarded as nondeterministic transition. • After abstraction, states becomes finite. • Least fixed point computation over finite domains can be done by model checking. • Advantage: • Obtain more precise results. • Need not develop each program for each analysis, just reduce the analysis to a model checking problem. • Save implementing time. • Disadvantage: • not so efficient. (do we really need preciseness?) Example: Positive/Negative Example: positive/negativeAnalysis analysis •Abstraction:Abstraction:Nat → Nat {Pos→, Zero{Pos,} Zero}, Bool → Bool
/* sum */ 1 zero_n, in_n 1 1 pos_n, in_n 1: read n; 2: i:=0; 2 zero_i 2 2 zero_i 3: S:=0; 3 zero_s 3 3 zero_s 4: c:=True; 4 true_c 4 4 true_c 5: while c false_c do 5 true_c 5 5 5 5 true_c 6: S=S+i; 7: i:=i+1; 6 zero_s 6 pos_s 6 6 zero_s 8: c:=i<=n; od; 7 pos_i 7 pos_i 7 7 zero_i 9: write S 8 false_c 8 8 8 state = [n,i,s,c], false_c true_c in_n, out_s 9 out_s 9 9 n,i,s,c: Boolean CFG zero_n model pos_n Example: Even/Odd Analysis Example: even/odd analysis
statement
state = Line×[x,z,c], stmt(readstate = line x), x, [x, z, c] stmt(writereadX , write x) X ) x,z,c:x, zBoolean, c : Bool(even/odd) even/odd details are AG(!EF(writex) excercise∧x = even))
This is expected to return 1 CFG model spec: AG(!EF(stmt(write x) & x=even)) never reach to exit point with even x Dead Code DetectionExample : dead code detection
1 1 def_n Simplest abstraction: /* sum */ 1 point abstract domain 1: read n; 2 2 def_i 2: i:=0; 3 3 def_s • Detect variables that are defined, 3: S:=0; 4: c:=True; 4 4 def_c but never used or redefined before 5: while c used. do 5 5 use_c 6: S:=S+i; • Abstraction: 6 6 def_s,use_s,use_i 7: c:=False; • defx when x is defined 8: i:=i+1; 7 7 def_c (e.g. x := 2) 9: c:=i<=n; 8 8 def_i,use_i • usex when x is referred od; (e.g. y := x + 1) 10: write S 9 9 def_c,use_n,use_i
10 10 use_s
CFG model Dead Code Detection
• Useless code: x is defined but never used.
AG ¬(defx ∧ AX ¬EF usex)
x is never defined if it will not used • x is defined but not used before re-defined.
AG ¬(defx ∧ AX ¬A(¬usex U defx) Abstract Interpretation in Theory Abstract Interpretation
Mathematical framework that helps the highlevel design of a program analysis.
Based on lattice theory.
Suggests to design an analysis by choosing appropriate lattices and functions between them. PA in a Nutshell
Two main components of a program analysis: • Abstract domain D s.t. ⊥ ∈ D • Transfer function F : D → D
Aim: Compute a fixpoint of F.
Algorithm: Start from ⊥. Apply F repeatedly until fixpoint.
⊥, F(⊥), F2(⊥),..., Fn(⊥), Fn+1(⊥) where n is the first s.t. Fn(⊥) = Fn+1(⊥) A Running Example
assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);
• D = Sign = {⊥, pos, neg, Z} where ⊥ = ∅ • F(d) = pos ∪ (if (d = neg) then Z else d). • Analysis run:
⊥ = ∅, F(⊥) = pos, F2(⊥) = pos Main Concerns
Q1: Does a fixpoint of F exist?
Q2: Is a fixpoint of F a correct answer?
Q3: Does our algorithm always terminate? If not, what should we do?
Abstract interpretation answers these questions. It provides guidance about how to choose D and F. Main Concerns
Q1: Does a fixpoint of F exist?
Q2: Is a fixpoint of F a correct answer?
Q3: Does our algorithm always terminate? If not, what should we do?
Abstract interpretation answers these questions. It provides guidance about how to choose D and F. Tarski’s Fixpoint Theorem Answer of Q1
Recall the components of a abstract interpretation:
D, F : D → D such that ⊥ ∈ D
. Q1: Does F have a fixpoint?
A1: If D is a complete lattice and F is monotone, F has the least and the greatest fixpoints. Preorder and Partial Order
A binary relation v on a set D is a preorder if • reflexivity: d v d for all d in D • transitivity: (d v e ∧ e v f ) ⇒ d v f
A preorder v is a partial order if • antisymmetry: (d v e ∧ e v d) ⇒ e = d
A preset means a set D with a preorder v.A poset means a set D with a partial order v. Usually written in (D, v)
Idea: v is an approximate subset/implication relation Quiz: Poset
[Ex] Which one is a poset? ⊑ is a partial order if reflexive, transitive, 1. Sign: Z antisymmetric. pos neg I think that it is better to write these examples in the whiteboard, and to ask students do solve them.
The first three examples (Sign, Subset, Q1: Sign: ∅ Interval) will be used repeatedly in my lecture.
Q2: Powerset of Z: (P(Z), ⊆) 2. Powerset of Z: (P(Z), ⊆) Q3: Interval: ({[n, m] | n, m ∈ Z ∧ n ≤ m}, ⊆) 3. Interval: ({ [n,m] | n,m∈Z ∧ n≤m }, ⊆) Q4: LinConA :({Ax ≤ b | b ∈ Zn}, ⇒) n 4. LinConA: ({ Ax≤b | b∈Z }, ⇒) Least Upper Bound (lub)
Let (D, v) be a poset, and E a subset of D • d is an upper bound of E if e v d for all e ∈ E • d is a least upper bound (lub) of E if d is an upper bound and d v d0 for all upper bound d0 of E • Notations: lub(E) and tE
Quiz: lub(E) is unique? Greatest Lower Bound (glb)
Dual to the definition of lub. • d is an lower bound of E if e w d for all e ∈ E • d is a greatest lower bound (glb) of E if d is a lower bound and d w d0 for all lower bound d0 of E. • Notations: glb(E) and uE [Ex] Which oneQuiz: is a poset? Compute⊑ is a partial lub order if reflexive, transitive, 1. Sign: Z antisymmetric. pos neg I think that it is better to write these examples in the whiteboard, and to ask students do solve them.
The first three examples (Sign, Subset, Q1: Sign: ∅ Interval) will be used repeatedly in my lecture. • pos t neg =?? 2. Powerset of Z: (P(Z), ⊆) Q2: Powerset of Z : P(Z) 3. • Interval:{2} t {4 } ({=?? [n,m] | n,m∈Z ∧ n≤m }, ⊆) • {2} t {4} t {6} t ... =?? n 4. LinConA: ({ Ax≤b | b∈Z }, ⇒) Q3: Interval: {[n, m] | n, m ∈ Z ∧ n ≤ m} • [2, 2] t [4, 4] =?? • [2, 2] t [4, 4] t [6, 6] t ... =?? Complete Lattice
A poset (D, v) is a complete lattice if every subset of D has the lub and glb. Quiz: Complete Lattice [Ex] Which one is a poset? ⊑ is a partial order if reflexive, transitive, 1. Sign: Z antisymmetric. pos neg I think that it is better to write these examples in the whiteboard, and to ask students do solve them.
The first three examples (Sign, Subset, Q1: Sign: ∅ Interval) will be used repeatedly in my lecture.
Q22. :Powerset of ofZ :Z:(P ((P(Z),Z), ⊆) ⊆)
3. Interval: ({ [n,m] | n,m∈Z ∧ n≤m }, ⊆) Q3: Interval: ([n, m] | n, m ∈ Z ∧ n ≤ m, ⊆)
n 4. LinConA: ({ Ax≤b | b∈Z }, ⇒) • (∅∪{[n, m] | n, m ∈ Z∪{−∞, +∞}∧ n ≤ m}, ⊆) Tarski’s fixpoint theorem
Let (C, ⊆) and (D, v) be presets. A function F : C → D is monotone if c ⊆ c0 ⇒ F(c) v F(c0) .
[Tarski’s Theorem] Every monotone function F on a complete lattice D has both the least and the greatest fixpoints.
Notations: lfp(F) and gfp(F). Quiz: Compute Fixpoints [Ex] Which one is a poset? ⊑ is a partial order if reflexive, transitive, 1. Sign: Z antisymmetric. pos neg I think that it is better to write these examples in the whiteboard, and to ask students do solve them.
The first three examples (Sign, Subset, Q1: Sign: ∅ Interval) will be used repeatedly in my lecture. • F(d) = pos t (if (d = neg) then Z else d). 2. Powerset of Z: (P(Z), ⊆) Q2: Powerset of Z : P(Z) 3. • Interval:F(d) = { 2({} t[n,m] {i + 2| n,m| i ∈∈dZ} ∧ n≤m }, ⊆) Q3: Interval: ∅ ∪ { n m | n m ∈ ∪ {−∞ ∞} ∧ n ≤ m} ⊆ ( [ , ] , nZ , + , ) 4. LinConA: ({ Ax≤b | b∈Z }, ⇒) • F(d) = [2, 2] t (if (d = [n, m]) then [n + 2, m + 2] else ∅) Proof of the Theorem
Tarski’s Theorem Every monotone function F on a complete lattice D has both the least and the greatest fixpoints.
• Let Pre = {d | F(d) v d} and Post = {d | d v F(d)}. • glb(Pre) is the least fixpoint of F. • lub(Post) is the greatest fixpoint of F. Finiteness Gains Termination
Lemma If D is a finite complete lattice and F : D → D is monotonic, then the sequence ⊥, F(⊥), F2(⊥),..., Fn(⊥),... converges to lfp(F) in finite steps.
The lemma implies that our analysis terminates.
Quiz: • What if we drop the finiteness requirement • Modify the lemma for greatest fixpoint Main Concerns
Q1: Does a fixpoint of F exist? • Yes, if D is a complete lattice and F is monotonic. Q2: Is a fixpoint of F a correct answer?
Q3: Does our algorithm always terminate? If not, what should we do? Main Concerns
Q1: Does a fixpoint of F exist? • Yes, if D is a complete lattice and F is monotonic. Q2: Is a fixpoint of F a correct answer?
Q3: Does our algorithm always terminate? If not, what should we do? • Use finite D. Main Concerns
Q1: Does a fixpoint of F exist? • Yes, if D is a complete lattice and F is monotonic. Q2: Is a fixpoint of F a correct answer?
Q3: Does our algorithm always terminate? If not, what should we do? • Use finite D. Galois Connection The Sign Example
assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);
• D = Sign = {⊥, pos, neg, Z} where ⊥ = ∅ • F(d) = pos ∪ (if (d = neg) then Z else d). • lfp(F) = Pos. Correct. It overapproximates all possible values of x at the loop entry. The Sign Example
assume(x < 0); while (nondet()){ x = x + 2; } assert(x > 0);
• D = Sign = {⊥, pos, neg, Z} where ⊥ = ∅ • F(d) = pos ∪ (if (d = neg) then Z else d). • lfp(F) = Pos. Correct? The Guarantee of Correctness
assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);
D = Sign, F(d) = pos ∪ (if (d = neg) then Z else d)
• Define the concrete semantics (called collecting semantics):
C = P(Z), E(c) = {2} ∪ {i + 2 | i ∈ c}
• Connect C and D via Galois connection • Show that F is a sound abstraction of E • Then, correct by fixpoint transfer theorem Galois Connection
Let (C, ⊆) and (D, v) be presets. A Galois connection from C to D is a pair of monotone functions:
α : C → D (abstraction), γ : D → C (concretization) such that for all c in C and d in D,
α(c) v d ⇔ c ⊆ γ(d)
Intuition: elements in D abstract those in C. α(c) is the best abstraction of c. γ(d) is the meaning of d. Order theory Galois Connections GC and Abstract Interpretation GaloisExamples Connection Why is this useful?
fsu-logo
David Henriques Order Theory 21/ 40 Quiz: Galois Connection [Ex] Which one is a poset? ⊑ is a partial order if reflexive, transitive, 1. Sign: Z antisymmetric. pos neg I think that it is better to write these examples in the whiteboard, and to ask students do solve them.
The first three examples (Sign, Subset, Sign: ∅ Interval) will be used repeatedly in my lecture. Powerset of Z : P(Z) Interval: (∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m}, ⊆) 2. Powerset of Z: (P(Z), ⊆) Find Galois connections 3. Interval:• From ({ Powersets [n,m] | ton,m Intervals∈Z ∧ n≤m }, ⊆) • From Intervals to Signs n 4. LinCon• FromA: Powersets ({ Ax≤b to | Signsb∈Z }, ⇒) Soundness of Transfer Functions
Assume C, D: Complete lattices, and Galois connection from C to D: Soundnessα : C → D, γ : D → ofC transfer functions
LetLetE E:: C C→→C andandF F:: D →→DD be monotonicmono. functions. functions. F is a sound abstraction of E if F is a sound abstraction(E ◦ γ)( ofd )E⊆ if(γ ◦ F)(d) for all d (E∈ Do, γ diagrammatically,)(d) ⊑ (γ o F)(d) for all d∈D, diagrammatically, F D D γ γ E ⊑ C C Quiz: Which One is Sound
C = P(Z), E(c) = {2} ∪ {i + 2 | i ∈ c} D = {⊥, pos, neg, Z}
• F1(d) = if (d = neg) then Z else (d t pos)
• F2(d) = pos
• F3(d) = Z Best Abstraction
Let E : C → C andBestF : D →abstractionD be monotonic functions. F is the best abstraction of E if Let E: C→C and F: D→D be mono. functions. F is the best abstractionα ◦ E ◦ofγ E= ifF diagrammatically, α o E o γ = F diagrammatically, F D D
Point out that F1 is indeed the best γ α abstraction. E C C [Ex] Show that the best abstraction F is a sound abstraction of E and is the least such wrt. ⊑. Soundness of transfer functions
Let E: C→C andFixpoint F: D→D be mono. Transfer functions. F is a sound abstraction of E if
Let E : C(E→ oC γand)(d)F ⊑: D(γ→ o DF)(d)be monotonic for all d∈ functions.D, If diagrammatically, F D D γ γ E ⊑ C C then lfp(E) v γ(lfp(F))
Same condition. Preset D. For all d in D, if F(d) v d, then lfp(E) ⊆ γ(d). Guarantee for Correctness
Given data: • Program analysis (D, F : D → D) • Concrete semantics (C, E : C → C)
1 Find a Galois connection from C to D 2 Show that F is a sound abstraction of E 3 Then, correct by fixpoint transfer theorem Main Concerns
Q1: Does a fixpoint of F exist? • Yes, if D is a complete lattice and F is monotonic. Q2: Is a fixpoint of F a correct answer? • Yes, if Galois connection, and sound transfer. Q3: Does our algorithm always terminate? If not, what should we do? • Use finite D. Main Concerns
Q1: Does a fixpoint of F exist? • Yes, if D is a complete lattice and F is monotonic. Q2: Is a fixpoint of F a correct answer? • Yes, if Galois connection, and sound transfer. Q3: Does our algorithm always terminate? If not, what should we do? • Use finite D. Widening and Narrowing Analysis with Intervals
assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);
• D = ∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m} • F(d) = [2, 2] ∪ (if (d = [i, j]) then [i + 2, j + 2] else ∅). • lfp(F) = [2, +∞]. But it cannot be reached in finite steps. So, the analysis will not terminate. • ∅, F(∅) = [2, 2], F2(∅) = [2, 4], F3(∅) = [2, 6],... Widening Operator ∇
binary operator ∇ : D × D → D such that
• di v d1∇d2 for i ∈ {1, 2}; and
• for every sequence {dn}n in D, the following sequence {wn}n has wk with wk = wk+1:
w0 = d0, wn+1 = wn∇dn+1
Intuition: d1∇d2 extrapolates the change from d1 to d2. Quiz: Which One is Widening
Interval = ∅ ∪ {[i, j] | i, j ∈ Z ∪ {−∞, +∞} ∧ i ≤ j}
I. ∅∇d = d∇∅ = d, [i, j]∇[i0, j0] = [i00, j00] where • i00 = (i ≤ i0)?i : −∞ • j00 = (j0 ≤ j)?j : +∞
II. d∇d0 = d t d0
III. ∅∇d = d∇∅ = d, [i, j]∇[i0, j0] = [i00, j00] where • i00 = (min(i, i0) < L)? − ∞ : min(i, i0) • j00 = (max(j, j0) > U)? + ∞ : max(j, j0) Program Analysis with Widening
Analysis: (D, v, ⊥, F, ∇)
Algorithm: Generate wk according to the following rule until wk+1 = wk. w0 = ⊥, wk+1 = wk∇F(wk) Program Analysis with Widening
assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);
• D = ∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m} • F(d) = [2, 2] t (if (d = [i, j]) then [i + 2, j + 2] else ∅). • ∅∇d = d∇∅ = d, • [i, j]∇[i0, j0] = [i00, j00] where • i00 = (i ≤ i0)?i : −∞ • j00 = (j0 ≤ j)?j : +∞
Q: Simulate the analysis algorithm. Program Analysis with Widening
Analysis: (D, v, ⊥, F, ∇)
Algorithm: Generate wk according to the following rule until wk+1 = wk. w0 = ⊥, wk+1 = wk∇F(wk)
Theorem The algorithm terminates.
Theorem On termination, F(wk) v wk. Thus, correct if ∃-Galois connection and F is a sound abstraction. Modification
Analysis: (D, v, ⊥, F, ∇)
Algorithm: Generate wk according to the following rule until F(wk) v wk. w0 = ⊥, wk+1 = wk∇F(wk)
Theorem The algorithm terminates.
Theorem On termination, F(wk) v wk. Thus, correct if ∃-Galois connection and F is a sound abstraction. Quiz
Design a widening operator for P(Z). Narrowing Operation ∆
Binary operator ∆ : D × D → D such that
• d1 u d2 v d1∆d2 v d1 and
• for every decreasing sequence {dn}n in D, the following sequence {vn}n has vk with vk = vk+1:
v0 = d0, vn+1 = vn∆dn+1
Intuition: d1∆d2 interpolates the change from d1 to d1 u d2. Example
Interval = ∅ ∪ {[i, j] | i, j ∈ Z ∪ {−∞, +∞} ∧ i ≤ j}
∅∆d = d∆∅ = ∅
[i, j]∆[i0, j0] = [(i = −∞)?i0 : i, (j = +∞)?j0 : j] Program Analysis with Widening and Narrowing
Analysis: (D, v, ⊥, F, ∇, ∆)
Algorithm:
1 Generate wk according to the following rule until wk+1 = wk.
w0 = ⊥, wk+1 = wk∇F(wk)
2 Refine wk by narrowing, Generate vm according to the following rule until vm+1 = vm
v0 = wk, vm+1 = vm∆(vm u F(vm)) Example
assume(x == 2); while (x < 9) { x = x + 2; }
• D = ∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m} • F(d) = [2, 2] t ((d u [−∞, 9] = [i, j])?[i + 2, j + 2]: ∅). • [i, j]∇[i0, j0] = [(i ≤ i0)?i : −∞, (j0 ≤ j)?j : +∞] • [i, j]∆[i0, j0] = [(i = −∞)?i0 : i, (j = +∞)?j0 : j]
Q: Simulate the analysis algorithm. Correctness
Why is the narrowing step correct? Main Concerns
Q1: Does a fixpoint of F exist? • Yes, if D is a complete lattice and F is monotonic. Q2: Is a fixpoint of F a correct answer? • Yes, if Galois connection, and sound transfer. Q3: Does our algorithm always terminate? If not, what should we do? • Use finite D. Or use widening. Roadmap
Specify an analysis in terms of the following data:
1 Preset D (abstract domain). 2 Monotone function F (abstract transfer function). 3 Widening operator ∇ (unless D is finite). 4 Narrowing operator ∆ (optional). 5 Galois connection from C (concrete domain) to D. 6 Soundness of F wrt. E (concrete transfer function). Then, our analysis terminates and is correct (sound). Report
Rep6. Algorithms for widening and narrowing (Maximal 3 students)