Fundamentals of Programming Languages V Abstract Interpretation

Guoqiang Li

School of , Shanghai Jiao Tong University

Assignment

Assignment 2 is announced! Deadline: Nov. 15

Reference

Cousot, P., Cousot, R., Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints, POPL’77

Steffen, B., Data Flow Analysis as Model Checking, TACS’91

Schmidt, D.A., Data Flow Analysis is Model Checking of Abstract Interpretations, POPL’98

Lacey, D., Proving Correctness of Compiler Optimizations by Temporal Logic, POPL’02

1986, For his fundamental contributions to the definition and design of programming languages...

1991, For three distinct and complete achievements: 1) LCF, the mechanization of Scott’s Logic of Computable Functions, probably the first theoretically based yet practical tool for machine assisted proof construction; 2) ML, the first language to include polymorphic type inference together with a type-safe exception-handling mechanism; 3) CCS, a general theory of concurrency. In addition, he formulated and strongly advanced full abstraction, the study of the relationship between operational and ...

1996, For seminal work introducing temporal logic into computing science and for outstanding contributions to program and systems verification...

2007, Edmund M. Clarke, E. Allen Emerson and For their roles in developing model checking into a highly effective verification technology, widely adopted in the hardware and software industries...

Abstract Interpretation in Practice

Hardware Model Checking

• Classical model checking (in Clarke’s book; hardware model checking): implementer and verifier are the same.
• While designing a system, each model is checked and refined.
• A correct system implementation is finally obtained.
• e.g. a modulo 8 counter
  • v0′ = ¬v0
  • v1′ = v0 ⊕ v1
  • v2′ = (v0 ∧ v1) ⊕ v2

Software Model Checking

• Modern model checking (software model checking): implementer and verifier may be different.
• Given a system implementation (e.g. source code), construct a model and verify it.
• Construction of models: manual or automatic.
• e.g.
  • security protocols
  • real-time embedded systems
  • program analysis

Classical Program Analysis

• From POPA
  • available expression analysis
  • reaching definition analysis
  • very busy expression analysis
  • live variable analysis
• From Dragon book
  • dead code detection
  • constant propagation
  • common expression detection
  • code motion

• Concurrent program analysis
  • deadlock, livelock
  • data race
• Object-oriented program analysis
  • points-to analysis
  • escape analysis
  • inference of class invariants
  • shape analysis
• Aspect-oriented program analysis
  • ?????

Brief Idea of Abstract Interpretation

• To learn the properties of a program, execute it on all possible inputs.
• This will not terminate, since the domains of variables are often infinite.
• Abstract the input data to a suitable finite domain. Finiteness then guarantees termination.
• Functions need to be reinterpreted over the abstract domain.
• Examples:
  • Security protocols: a principal may receive infinitely many messages from the environment. (Some affect the principal, some do not.)
  • Embedded systems: a scheduler may have to schedule infinitely many released tasks. (When time exceeds, all tasks have the same results.)

Abstract Interpretation on Program Analysis: Examples

• Even/odd analysis: x × (x + 1) is even.
  • Abstraction: Even, Odd
  • Interpretation:
    • Even + Even = Even, Even + Odd = Odd,
    • Even × Odd = Even, …
• Positive/negative analysis
  • Abstraction: Pos, Neg, Zero
  • Interpretation:
    • Pos + Pos = Pos, Pos + Neg = {Pos, Zero, Neg}, …
    • Neg × Neg = Pos, Pos × Zero = Zero, …
• The result will be a (sound) approximation.

Sound Abstraction

The result is an approximation

Completeness: sufficient condition • if no errors detected there are really no errors.

Soundness: necessary condition • if errors detected they are really errors.

• Data abstraction: abs : D → Abs
• Typical concretization: con = abs⁻¹
• abs · con = id_Abs ∧ con · abs ⊇ id_D guarantee soundness.

[Diagram: f : D → D on the concrete side and Int(f) : Abs → Abs on the abstract side, related by abs and con.]
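The even/odd abstraction can be checked mechanically on a small range of integers. The sketch below is only an illustration: the names `abs_parity`, `con_parity`, `add_abs`, `mul_abs` and the finite test universe are assumptions introduced here, not part of the lecture.

```python
from itertools import product

EVEN, ODD = "Even", "Odd"

def abs_parity(n: int) -> str:
    """abs : D -> Abs, mapping an integer to its parity."""
    return EVEN if n % 2 == 0 else ODD

def con_parity(a: str, universe) -> set:
    """con = abs^-1, restricted to a finite test universe (an assumption)."""
    return {n for n in universe if abs_parity(n) == a}

def add_abs(a: str, b: str) -> str:
    """Int(+): Even + Even = Even, Even + Odd = Odd, Odd + Odd = Even."""
    return EVEN if a == b else ODD

def mul_abs(a: str, b: str) -> str:
    """Int(x): the product is Odd only when both factors are Odd."""
    return ODD if a == b == ODD else EVEN

def agrees(concrete_op, abstract_op, universe) -> bool:
    """Check abs(f(x, y)) == Int(f)(abs(x), abs(y)) for every pair in the universe."""
    return all(
        abs_parity(concrete_op(x, y)) == abstract_op(abs_parity(x), abs_parity(y))
        for x, y in product(universe, repeat=2)
    )

UNIVERSE = range(-10, 11)
sound_add = agrees(lambda x, y: x + y, add_abs, UNIVERSE)
sound_mul = agrees(lambda x, y: x * y, mul_abs, UNIVERSE)

# abs . con = id on Abs: every element of con(a) abstracts back to a.
round_trip = all(abs_parity(n) == a for a in (EVEN, ODD) for n in con_parity(a, UNIVERSE))

# x * (x + 1) evaluated abstractly for both possible parities of x (1 is Odd).
x_times_succ = {a: mul_abs(a, add_abs(a, ODD)) for a in (EVEN, ODD)}
print(sound_add, sound_mul, round_trip, x_times_succ)
```

Both abstract evaluations of x × (x + 1) yield Even, which is the claim made for the even/odd analysis.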

Abstract Interpretation

1 Define abstraction and finite abstract domain

abs : D → Abs

2 Interpret primitive functions

+, ×, if, =, ≥, ≤, ...

3 Execute looping structures via program semantics

while, function call

Program Semantics

Semantics: interpret a program (syntax) as an input-output relation (semantics).

Intuitively, the input-output relation is a relation between the input environment and the output environment of the variables in a scope.

Looping structure is solved as least fixed point.
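As a minimal sketch of "loop as least fixed point", the code below computes the input-output relation of the loop `while c do S:=S+i; i:=i+1; c:=i<=n od` by iterating an unfolding operator from the empty relation. Restricting to a finite set of start environments, and the helper names `body`, `reachable`, `tau` and `lfp`, are assumptions made here so that the iteration is finite; this is one possible formulation, not necessarily the one used in the lecture.

```python
def body(env):
    """One loop iteration on env = (n, i, s, c): S:=S+i; i:=i+1; c:=i<=n."""
    n, i, s, _ = env
    s = s + i
    i = i + 1
    return (n, i, s, i <= n)

def reachable(starts):
    """All environments the loop can visit from the given start environments."""
    seen, todo = set(), list(starts)
    while todo:
        env = todo.pop()
        if env in seen:
            continue
        seen.add(env)
        if env[3]:  # c is True, so the body runs once more
            todo.append(body(env))
    return seen

def tau(rel, universe):
    """Unfold the loop once as an `if`, acting on input-output relations."""
    exits = {(e, e) for e in universe if not e[3]}          # if not c then skip
    steps = {(e, out) for e in universe if e[3]             # else body; rest of the loop
             for (mid, out) in rel if mid == body(e)}
    return frozenset(exits | steps)

def lfp(universe):
    """Iterate tau from the empty relation until nothing changes."""
    rel = frozenset()
    while True:
        nxt = tau(rel, universe)
        if nxt == rel:
            return rel
        rel = nxt

starts = [(n, 0, 0, True) for n in range(4)]  # state after i:=0; S:=0; c:=True, for n = 0..3
universe = reachable(starts)
relation = lfp(universe)
summary = {e: out for (e, out) in relation if e in starts}
print(summary[(1, 0, 0, True)])
```

For the start environment with n = 1 the relation contains the pair ([1,0,0,True], [1,2,1,False]).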

Theorem (Tarski’s least fixed point theorem): Assume τ : P(D) → P(D) is monotonic. Then there exists a unique least fixed point, which is computed as ⋃ᵢ τⁱ(∅).

We will come back to this theorem again!

Example: Computing Sum

    /* sum */
    1: read n;
    2: i:=0;
    3: S:=0;
    4: c:=True;
    5: while c
       do
    6:   S:=S+i;
    7:   i:=i+1;
    8:   c:=i<=n;
       od;
    9: write S

This can be read as:
• Unfold the while sentence as repeated applications of if.

    if ¬c
    then skip
    else
      S:=S+i;
      i:=i+1;
      c:=i<=n;
      if ¬c
      then skip
      else
        S:=S+i;
        i:=i+1;
        c:=i<=n;
        ...

env = [n, i, s, c]
τ : P(env × env) → P(env × env)

step 0: ∅
step 1 (τ): {([0,0,0,True], [0,1,0,False])}
step 2 (τ²): {([0,0,0,True], [0,1,0,False]), ([1,0,0,True], [1,2,1,False])}

Example: Positive/Negative Analysis

• Abstraction: Nat → {Pos, Zero}, Bool → Bool

The program and its unfolding are as above; τ′ is the abstract version of τ.

step 0: ∅
step 1 (τ′): {([Zero,Zero,Zero,True], [Zero,Pos,Zero,False])}
step 2 (τ′²): {([Zero,Zero,Zero,True], [Zero,Pos,Zero,False]), ([Pos,Zero,Zero,True], [Pos,Pos,Pos,True/False])}

Converged! Least fixed point (analyzed!)

Example: Variation of Abstraction

• Abstraction: Nat → {Many, One, Zero}, Bool → Bool

step 0: ∅
step 1 (τ′): {([Zero,Zero,Zero,True], [Zero,One,Zero,False])}
step 2 (τ′²): {([Zero,Zero,Zero,True], [Zero,One,Zero,False]), ([One,Zero,Zero,True], [One,One,One,True])}

A finer abstraction will find more, but this is a trade-off.

Program Analysis = Abstract Interpretation + Model Checking

Model Checking Based Program Analysis

• Each line of a program is regarded as a nondeterministic transition.
• After abstraction, the set of states becomes finite.
• Least fixed point computation over finite domains can be done by model checking.
• Advantages:
  • Obtain more precise results.
  • No need to develop a separate program for each analysis; just reduce the analysis to a model checking problem.
  • Save implementation time.
• Disadvantage:
  • Not so efficient. (Do we really need preciseness?)

Example: Positive/Negative Analysis

• Abstraction: Nat → {Pos, Zero}, Bool → Bool

    /* sum */
    1: read n;
    2: i:=0;
    3: S:=0;
    4: c:=True;
    5: while c
       do
    6:   S:=S+i;
    7:   i:=i+1;
    8:   c:=i<=n;
       od;
    9: write S

state = [n, i, s, c]; n, i, s, c: Boolean

[Figure: the CFG of the program (nodes 1 to 9) and two models, one for zero_n and one for pos_n. Model nodes are labelled with the propositions zero_n/pos_n, in_n, zero_i/pos_i, zero_s/pos_s, true_c/false_c and out_s.]

Example: Even/Odd Analysis

state = Line × [x, z, c], with statements stmt(read x) and stmt(write x)

x, z, c: Boolean (even/odd); details are an exercise

spec: AG(!EF(stmt(write x) & x=even))

Never reach the exit point with even x. This is expected to return 1.

[Figure: CFG and model]

Example: Dead Code Detection

Simplest abstraction: 1 point abstract domain

• Detect variables that are defined, but never used or redefined before being used.
• Abstraction:
  • def_x when x is defined (e.g. x := 2)
  • use_x when x is referred to (e.g. y := x + 1)

    /* sum */           labels in the model
     1: read n;         def_n
     2: i:=0;           def_i
     3: S:=0;           def_s
     4: c:=True;        def_c
     5: while c         use_c
        do
     6:   S:=S+i;       def_s, use_s, use_i
     7:   c:=False;     def_c
     8:   i:=i+1;       def_i, use_i
     9:   c:=i<=n;      def_c, use_n, use_i
        od;
    10: write S         use_s

[Figure: CFG and model, one node per program line]

Dead Code Detection

• Useless code: x is defined but never used.

AG ¬(def_x ∧ AX ¬EF use_x)

x is never defined if it will not be used.

• x is defined but not used before being redefined.

AG ¬(def_x ∧ AX ¬A(¬use_x U def_x))

Abstract Interpretation in Theory

Abstract Interpretation

A mathematical framework that helps the high-level design of a program analysis.

Based on lattice theory.

Suggests designing an analysis by choosing appropriate lattices and functions between them.

PA in a Nutshell

Two main components of a program analysis:
• Abstract domain D s.t. ⊥ ∈ D
• Transfer function F : D → D

Aim: Compute a fixpoint of F.

Algorithm: Start from ⊥. Apply F repeatedly until fixpoint.

⊥, F(⊥), F²(⊥), ..., Fⁿ(⊥), Fⁿ⁺¹(⊥) where n is the first s.t. Fⁿ(⊥) = Fⁿ⁺¹(⊥)

A Running Example

assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);

• D = Sign = {⊥, pos, neg, Z} where ⊥ = ∅
• F(d) = pos ∪ (if (d = neg) then Z else d).
• Analysis run:

⊥ = ∅, F(⊥) = pos, F²(⊥) = pos

Main Concerns

Q1: Does a fixpoint of F exist?

Q2: Is a fixpoint of F a correct answer?

Q3: Does our algorithm always terminate? If not, what should we do?

Abstract interpretation answers these questions. It provides guidance about how to choose D and F.

Tarski’s Fixpoint Theorem

Answer of Q1

Recall the components of an abstract interpretation:

D, F : D → D such that ⊥ ∈ D.

Q1: Does F have a fixpoint?

A1: If D is a complete lattice and F is monotone, F has the least and the greatest fixpoints.

Preorder and Partial Order

A binary relation ⊑ on a set D is a preorder if
• reflexivity: d ⊑ d for all d in D
• transitivity: (d ⊑ e ∧ e ⊑ f) ⇒ d ⊑ f

A preorder ⊑ is a partial order if
• antisymmetry: (d ⊑ e ∧ e ⊑ d) ⇒ e = d

A preset means a set D with a preorder ⊑. A poset means a set D with a partial order ⊑. Usually written as (D, ⊑).
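On a finite carrier the three axioms can be tested by brute force. The following sketch uses two hypothetical examples chosen here for illustration: divisibility on a few nonzero integers (a preorder that is not antisymmetric, since 2 divides −2 and −2 divides 2) and inclusion on a small powerset.

```python
from itertools import product

def is_preorder(elems, leq) -> bool:
    """Reflexivity and transitivity, checked exhaustively."""
    reflexive = all(leq(d, d) for d in elems)
    transitive = all(
        leq(d, f)
        for d, e, f in product(elems, repeat=3)
        if leq(d, e) and leq(e, f)
    )
    return reflexive and transitive

def is_partial_order(elems, leq) -> bool:
    """A preorder that is additionally antisymmetric."""
    antisymmetric = all(
        d == e
        for d, e in product(elems, repeat=2)
        if leq(d, e) and leq(e, d)
    )
    return is_preorder(elems, leq) and antisymmetric

def divides(a: int, b: int) -> bool:
    """a divides b (a is assumed nonzero)."""
    return b % a == 0

def subset_leq(a: frozenset, b: frozenset) -> bool:
    return a <= b

ints = [-4, -2, -1, 1, 2, 4]
subsets = [frozenset(s) for s in ([], [1], [2], [1, 2])]

print(is_preorder(ints, divides), is_partial_order(ints, divides))
print(is_partial_order(subsets, subset_leq))
```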

Idea: ⊑ is an approximate subset/implication relation

Quiz: Poset

[Ex] Which one is a poset? (⊑ is a partial order if it is reflexive, transitive and antisymmetric.)

Q1: Sign, ordered as in the diagram: Z at the top, pos and neg in the middle, ∅ at the bottom.

Q2: Powerset of Z: (P(Z), ⊆)

Q3: Interval: ({[n, m] | n, m ∈ Z ∧ n ≤ m}, ⊆)

Q4: LinCon_A: ({Ax ≤ b | b ∈ Zⁿ}, ⇒)

Note: it is better to write these examples on the whiteboard and ask students to solve them. The first three examples (Sign, Subset, Interval) will be used repeatedly in the lecture.

Least Upper Bound (lub)

Let (D, ⊑) be a poset, and E a subset of D.
• d is an upper bound of E if e ⊑ d for all e ∈ E
• d is a least upper bound (lub) of E if d is an upper bound and d ⊑ d′ for all upper bounds d′ of E
• Notations: lub(E) and ⊔E
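For a finite poset the definition of lub translates directly into a search. This is a rough sketch on a hypothetical example (divisibility on a few positive integers); the function names are assumptions made here.

```python
def upper_bounds(elems, leq, subset):
    """All d in elems with e ⊑ d for every e in subset."""
    return [d for d in elems if all(leq(e, d) for e in subset)]

def lub(elems, leq, subset):
    """Return the least upper bound of `subset`, or None if it does not exist."""
    ubs = upper_bounds(elems, leq, subset)
    least = [d for d in ubs if all(leq(d, u) for u in ubs)]
    return least[0] if least else None

def divides(a: int, b: int) -> bool:
    return b % a == 0

divisors_of_12 = [1, 2, 3, 4, 6, 12]

print(lub(divisors_of_12, divides, [4, 6]))   # the least common multiple within the poset
print(lub(divisors_of_12, divides, []))       # lub of the empty set is the least element
print(lub([1, 2, 3], divides, [2, 3]))        # no upper bound exists in this smaller poset
```

In the smaller poset {1, 2, 3} the pair {2, 3} has no upper bound at all, so not every poset has all lubs.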

Quiz: Is lub(E) unique?

Greatest Lower Bound (glb)

Dual to the definition of lub.
• d is a lower bound of E if e ⊒ d for all e ∈ E
• d is a greatest lower bound (glb) of E if d is a lower bound and d ⊒ d′ for all lower bounds d′ of E.
• Notations: glb(E) and ⊓E

Quiz: Compute lub

Q1: Sign
• pos ⊔ neg = ??

Q2: Powerset of Z: P(Z)
• {2} ⊔ {4} = ??
• {2} ⊔ {4} ⊔ {6} ⊔ ... = ??

Q3: Interval: {[n, m] | n, m ∈ Z ∧ n ≤ m}
• [2, 2] ⊔ [4, 4] = ??
• [2, 2] ⊔ [4, 4] ⊔ [6, 6] ⊔ ... = ??

Complete Lattice

A poset (D, ⊑) is a complete lattice if every subset of D has the lub and glb.

Quiz: Complete Lattice

Q1: Sign

Q2: Powerset of Z: (P(Z), ⊆)

Q3: Interval: ({[n, m] | n, m ∈ Z ∧ n ≤ m}, ⊆)
• (∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m}, ⊆)

Tarski’s fixpoint theorem

Let (C, ⊆) and (D, ⊑) be presets. A function F : C → D is monotone if c ⊆ c′ ⇒ F(c) ⊑ F(c′).

[Tarski’s Theorem] Every monotone function F on a complete lattice D has both the least and the greatest fixpoints.
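On a small complete lattice the statement can be observed by enumerating every fixpoint. The sketch below uses the powerset of {1, 2, 3} and a hypothetical monotone function `F` invented for this illustration.

```python
from itertools import combinations

UNIVERSE = frozenset({1, 2, 3})

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def F(d: frozenset) -> frozenset:
    """Hypothetical monotone function: always add 1, and add 2 whenever 3 is present."""
    return d | {1} | ({2} if 3 in d else set())

D = powerset(UNIVERSE)

# Monotonicity, checked exhaustively: a ⊆ b implies F(a) ⊆ F(b).
monotone = all(F(a) <= F(b) for a in D for b in D if a <= b)

fixpoints = [d for d in D if F(d) == d]
lfp = next(d for d in fixpoints if all(d <= e for e in fixpoints))
gfp = next(d for d in fixpoints if all(e <= d for e in fixpoints))

print(monotone, sorted(map(sorted, fixpoints)), sorted(lfp), sorted(gfp))
```

Here the fixpoints are {1}, {1, 2} and {1, 2, 3}, and among them a least and a greatest one exist, as the theorem predicts.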

Notations: lfp(F) and gfp(F).

Quiz: Compute Fixpoints

Q1: Sign
• F(d) = pos ⊔ (if (d = neg) then Z else d).

Q2: Powerset of Z: P(Z)
• F(d) = {2} ⊔ {i + 2 | i ∈ d}

Q3: Interval: (∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m}, ⊆)
• F(d) = [2, 2] ⊔ (if (d = [n, m]) then [n + 2, m + 2] else ∅)

Proof of the Theorem

Tarski’s Theorem: Every monotone function F on a complete lattice D has both the least and the greatest fixpoints.

• Let Pre = {d | F(d) ⊑ d} and Post = {d | d ⊑ F(d)}.
• glb(Pre) is the least fixpoint of F.
• lub(Post) is the greatest fixpoint of F.

Finiteness Gains Termination

Lemma: If D is a finite complete lattice and F : D → D is monotonic, then the sequence ⊥, F(⊥), F²(⊥), ..., Fⁿ(⊥), ... converges to lfp(F) in finitely many steps.

The lemma implies that our analysis terminates.
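The iteration in the lemma can be written down for the Sign domain of the running example. This is a minimal sketch: the string encoding of the lattice elements and the helper names `leq`, `join` and `kleene_chain` are assumptions made here.

```python
BOT, POS, NEG, TOP = "⊥", "pos", "neg", "Z"

def leq(a: str, b: str) -> bool:
    """Order of Sign: ⊥ below everything, Z above everything."""
    return a == b or a == BOT or b == TOP

def join(a: str, b: str) -> str:
    """Least upper bound in Sign."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP  # pos and neg are incomparable

def F(d: str) -> str:
    """Transfer function of the running example: pos ⊔ (if d = neg then Z else d)."""
    return join(POS, TOP if d == NEG else d)

def kleene_chain(f, bottom):
    """Return ⊥, f(⊥), f²(⊥), ... up to the first element that f leaves unchanged."""
    chain = [bottom]
    while True:
        nxt = f(chain[-1])
        if nxt == chain[-1]:
            return chain
        chain.append(nxt)

chain = kleene_chain(F, BOT)
print(chain)
```

The chain is ⊥, pos, so the iteration stops after one application of F, well within the four elements of the lattice.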

Quiz:
• What if we drop the finiteness requirement?
• Modify the lemma for the greatest fixpoint.

Main Concerns

Q1: Does a fixpoint of F exist?
• Yes, if D is a complete lattice and F is monotonic.

Q2: Is a fixpoint of F a correct answer?

Q3: Does our algorithm always terminate? If not, what should we do?
• Use finite D.

Galois Connection

The Sign Example

assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);

• D = Sign = {⊥, pos, neg, Z} where ⊥ = ∅
• F(d) = pos ∪ (if (d = neg) then Z else d).
• lfp(F) = pos. Correct. It overapproximates all possible values of x at the loop entry.

The Sign Example

assume(x < 0); while (nondet()){ x = x + 2; } assert(x > 0);

• D = Sign = {⊥, pos, neg, Z} where ⊥ = ∅
• F(d) = pos ∪ (if (d = neg) then Z else d).
• lfp(F) = pos. Correct?

The Guarantee of Correctness

assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);

D = Sign, F(d) = pos ∪ (if (d = neg) then Z else d)

• Define the concrete semantics (called collecting semantics):

C = P(Z), E(c) = {2} ∪ {i + 2 | i ∈ c}

• Connect C and D via a Galois connection
• Show that F is a sound abstraction of E
• Then, correct by the fixpoint transfer theorem

Galois Connection

Let (C, ⊆) and (D, ⊑) be presets. A Galois connection from C to D is a pair of monotone functions:

α : C → D (abstraction), γ : D → C (concretization) such that for all c in C and d in D,

α(c) ⊑ d ⇔ c ⊆ γ(d)
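The defining equivalence can be tested exhaustively once the concrete domain is made finite. In the sketch below the integers are replaced by the stand-in universe {−2, ..., 2}, which is an assumption made purely so that all subsets can be enumerated; `alpha`, `gamma` and `leq` are one plausible choice for the Sign domain, not taken from the lecture.

```python
from itertools import combinations

UNIVERSE = range(-2, 3)  # finite stand-in for Z (assumption, for exhaustive checking)
BOT, POS, NEG, TOP = "⊥", "pos", "neg", "Z"
SIGNS = [BOT, POS, NEG, TOP]

def leq(a: str, b: str) -> bool:
    """Order of Sign: ⊥ below everything, Z above everything."""
    return a == b or a == BOT or b == TOP

def gamma(d: str) -> frozenset:
    """Concretization: the set of integers a sign stands for."""
    if d == BOT:
        return frozenset()
    if d == POS:
        return frozenset(n for n in UNIVERSE if n > 0)
    if d == NEG:
        return frozenset(n for n in UNIVERSE if n < 0)
    return frozenset(UNIVERSE)

def alpha(c: frozenset) -> str:
    """Abstraction: the least sign whose meaning contains c."""
    if not c:
        return BOT
    if all(n > 0 for n in c):
        return POS
    if all(n < 0 for n in c):
        return NEG
    return TOP

subsets = [frozenset(s) for r in range(len(UNIVERSE) + 1)
           for s in combinations(UNIVERSE, r)]

# alpha(c) ⊑ d  <=>  c ⊆ gamma(d), for every c and d.
is_galois = all(leq(alpha(c), d) == (c <= gamma(d)) for c in subsets for d in SIGNS)

# A consequence: gamma(alpha(c)) always contains c.
extensive = all(c <= gamma(alpha(c)) for c in subsets)
print(is_galois, extensive)
```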

Intuition: elements in D abstract those in C. α(c) is the best abstraction of c. γ(d) is the meaning of d.

Galois Connection: Why is this useful?

[Slide from David Henriques, Order Theory, slide 21 of 40]

Quiz: Galois Connection

Sign

Powerset of Z: P(Z)

Interval: (∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m}, ⊆)

Find Galois connections
• From Powersets to Intervals
• From Intervals to Signs
• From Powersets to Signs

Soundness of Transfer Functions

Assume C, D are complete lattices, with a Galois connection from C to D: α : C → D, γ : D → C.

Let E : C → C and F : D → D be monotonic functions. F is a sound abstraction of E if

(E ◦ γ)(d) ⊆ (γ ◦ F)(d) for all d ∈ D,

diagrammatically:

[Diagram: F : D → D on top, E : C → C at the bottom, γ : D → C on both sides, with E ◦ γ ⊑ γ ◦ F.]

Quiz: Which One is Sound

C = P(Z), E(c) = {2} ∪ {i + 2 | i ∈ c} D = {⊥, pos, neg, Z}

• F1(d) = if (d = neg) then Z else (d ⊔ pos)

• F2(d) = pos

• F3(d) = Z

Best Abstraction

Let E : C → C and F : D → D be monotonic functions. F is the best abstraction of E if

α ◦ E ◦ γ = F,

diagrammatically:

[Diagram: F : D → D on top, E : C → C at the bottom, γ : D → C on one side and α : C → D on the other.]

Note: F1 is indeed the best abstraction.

[Ex] Show that the best abstraction F is a sound abstraction of E and is the least such wrt. ⊑.

Fixpoint Transfer

Let E : C → C and F : D → D be monotonic functions. If F is a sound abstraction of E, i.e.

(E ◦ γ)(d) ⊆ (γ ◦ F)(d) for all d ∈ D,

then lfp(E) ⊆ γ(lfp(F)).

Same condition, preset D: for all d in D, if F(d) ⊑ d, then lfp(E) ⊆ γ(d).

Guarantee for Correctness

Given data:
• Program analysis (D, F : D → D)
• Concrete semantics (C, E : C → C)

1 Find a Galois connection from C to D
2 Show that F is a sound abstraction of E
3 Then, correct by the fixpoint transfer theorem

Main Concerns

Q1: Does a fixpoint of F exist?
• Yes, if D is a complete lattice and F is monotonic.

Q2: Is a fixpoint of F a correct answer?
• Yes, if there is a Galois connection and the transfer function is sound.

Q3: Does our algorithm always terminate? If not, what should we do?
• Use finite D.

Widening and Narrowing

Analysis with Intervals

assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);

• D = ∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m}
• F(d) = [2, 2] ∪ (if (d = [i, j]) then [i + 2, j + 2] else ∅).
• lfp(F) = [2, +∞]. But it cannot be reached in finitely many steps, so the analysis will not terminate.
• ∅, F(∅) = [2, 2], F²(∅) = [2, 4], F³(∅) = [2, 6], ...

Widening Operator ∇

Binary operator ∇ : D × D → D such that

• dᵢ ⊑ d₁ ∇ d₂ for i ∈ {1, 2}; and

• for every sequence {dₙ}ₙ in D, the following sequence {wₙ}ₙ has some wₖ with wₖ = wₖ₊₁:

w₀ = d₀, wₙ₊₁ = wₙ ∇ dₙ₊₁

Intuition: d₁ ∇ d₂ extrapolates the change from d₁ to d₂.

Quiz: Which One is Widening

Interval = ∅ ∪ {[i, j] | i, j ∈ Z ∪ {−∞, +∞} ∧ i ≤ j}

I. ∅ ∇ d = d ∇ ∅ = d, [i, j] ∇ [i′, j′] = [i″, j″] where
• i″ = (i ≤ i′) ? i : −∞
• j″ = (j′ ≤ j) ? j : +∞

II. d ∇ d′ = d ⊔ d′

III. ∅ ∇ d = d ∇ ∅ = d, [i, j] ∇ [i′, j′] = [i″, j″] where
• i″ = (min(i, i′) < L) ? −∞ : min(i, i′)
• j″ = (max(j, j′) > U) ? +∞ : max(j, j′)

Program Analysis with Widening

Analysis: (D, ⊑, ⊥, F, ∇)

Algorithm: Generate wₖ according to the following rule until wₖ₊₁ = wₖ.

w₀ = ⊥, wₖ₊₁ = wₖ ∇ F(wₖ)

Program Analysis with Widening

assume(x == 2); while (nondet()){ x = x + 2; } assert(x > 0);

• D = ∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m}
• F(d) = [2, 2] ⊔ (if (d = [i, j]) then [i + 2, j + 2] else ∅).
• ∅ ∇ d = d ∇ ∅ = d,
• [i, j] ∇ [i′, j′] = [i″, j″] where
  • i″ = (i ≤ i′) ? i : −∞
  • j″ = (j′ ≤ j) ? j : +∞
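One way to follow the iteration wₖ₊₁ = wₖ ∇ F(wₖ) for this interval domain is to run it. The sketch below is an illustration under assumptions made here: intervals are encoded as `(lo, hi)` tuples, ∅ as `None`, and infinite bounds as `math.inf`.

```python
import math

EMPTY = None      # the bottom element ∅
INF = math.inf

def join(a, b):
    """Interval join: the smallest interval containing both arguments."""
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def F(d):
    """F(d) = [2, 2] ⊔ (if d = [i, j] then [i + 2, j + 2] else ∅)."""
    shifted = EMPTY if d is EMPTY else (d[0] + 2, d[1] + 2)
    return join((2, 2), shifted)

def widen(a, b):
    """Keep a bound that is stable, push an unstable bound to infinity."""
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    (i, j), (i2, j2) = a, b
    return (i if i <= i2 else -INF, j if j2 <= j else INF)

def analyse(f, widening):
    """Generate w0 = ⊥, w(k+1) = w(k) ∇ f(w(k)) until two consecutive values agree."""
    trace = [EMPTY]
    while True:
        nxt = widening(trace[-1], f(trace[-1]))
        if nxt == trace[-1]:
            return trace
        trace.append(nxt)

# Without widening the chain keeps growing; only the first few iterates are shown.
plain = [EMPTY]
for _ in range(3):
    plain.append(F(plain[-1]))

trace = analyse(F, widen)
print(plain)
print(trace)
```

The plain iterates are ∅, [2, 2], [2, 4], [2, 6], while the widened sequence jumps from [2, 2] to [2, +∞] and stays there.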

Q: Simulate the analysis algorithm.

Program Analysis with Widening

Analysis: (D, ⊑, ⊥, F, ∇)

Algorithm: Generate wₖ according to the following rule until wₖ₊₁ = wₖ.

w₀ = ⊥, wₖ₊₁ = wₖ ∇ F(wₖ)

Theorem: The algorithm terminates.

Theorem: On termination, F(wₖ) ⊑ wₖ. Thus, the result is correct if there exists a Galois connection and F is a sound abstraction.

Modification

Analysis: (D, ⊑, ⊥, F, ∇)

Algorithm: Generate wₖ according to the following rule until F(wₖ) ⊑ wₖ.

w₀ = ⊥, wₖ₊₁ = wₖ ∇ F(wₖ)

Theorem: The algorithm terminates.

Theorem: On termination, F(wₖ) ⊑ wₖ. Thus, the result is correct if there exists a Galois connection and F is a sound abstraction.

Quiz

Design a widening operator for P(Z).

Narrowing Operation ∆

Binary operator ∆ : D × D → D such that

• d₁ ⊓ d₂ ⊑ d₁ ∆ d₂ ⊑ d₁; and

• for every decreasing sequence {dₙ}ₙ in D, the following sequence {vₙ}ₙ has some vₖ with vₖ = vₖ₊₁:

v₀ = d₀, vₙ₊₁ = vₙ ∆ dₙ₊₁

Intuition: d₁ ∆ d₂ interpolates the change from d₁ to d₁ ⊓ d₂.

Example

Interval = ∅ ∪ {[i, j] | i, j ∈ Z ∪ {−∞, +∞} ∧ i ≤ j}

∅ ∆ d = d ∆ ∅ = ∅

[i, j] ∆ [i′, j′] = [(i = −∞) ? i′ : i, (j = +∞) ? j′ : j]

Program Analysis with Widening and Narrowing

Analysis: (D, ⊑, ⊥, F, ∇, ∆)

Algorithm:

1 Generate wₖ according to the following rule until wₖ₊₁ = wₖ.

w₀ = ⊥, wₖ₊₁ = wₖ ∇ F(wₖ)

2 Refine wₖ by narrowing. Generate vₘ according to the following rule until vₘ₊₁ = vₘ.

v₀ = wₖ, vₘ₊₁ = vₘ ∆ (vₘ ⊓ F(vₘ))

Example

assume(x == 2); while (x < 9) { x = x + 2; }

• D = ∅ ∪ {[n, m] | n, m ∈ Z ∪ {−∞, +∞} ∧ n ≤ m}
• F(d) = [2, 2] ⊔ ((d ⊓ [−∞, 9] = [i, j]) ? [i + 2, j + 2] : ∅).
• [i, j] ∇ [i′, j′] = [(i ≤ i′) ? i : −∞, (j′ ≤ j) ? j : +∞]
• [i, j] ∆ [i′, j′] = [(i = −∞) ? i′ : i, (j = +∞) ? j′ : j]
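The two phases, widening up and then narrowing down, can be traced with a short program. As before, this is a sketch under assumptions made here: intervals are `(lo, hi)` tuples, ∅ is `None`, and F uses the guard interval [−∞, 9] exactly as written in the definitions of this example.

```python
import math

EMPTY = None      # the bottom element ∅
INF = math.inf

def join(a, b):
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def meet(a, b):
    if a is EMPTY or b is EMPTY:
        return EMPTY
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else EMPTY

def F(d):
    """F(d) = [2, 2] ⊔ ((d ⊓ [−∞, 9] = [i, j]) ? [i + 2, j + 2] : ∅)."""
    guarded = meet(d, (-INF, 9))
    shifted = EMPTY if guarded is EMPTY else (guarded[0] + 2, guarded[1] + 2)
    return join((2, 2), shifted)

def widen(a, b):
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    (i, j), (i2, j2) = a, b
    return (i if i <= i2 else -INF, j if j2 <= j else INF)

def narrow(a, b):
    """Only replace bounds that widening pushed to infinity."""
    if a is EMPTY or b is EMPTY:
        return EMPTY
    (i, j), (i2, j2) = a, b
    return (i2 if i == -INF else i, j2 if j == INF else j)

def iterate(step, start):
    """Apply `step` repeatedly from `start` until the value stops changing."""
    trace = [start]
    while True:
        nxt = step(trace[-1])
        if nxt == trace[-1]:
            return trace
        trace.append(nxt)

up = iterate(lambda w: widen(w, F(w)), EMPTY)                 # phase 1: widening
down = iterate(lambda v: narrow(v, meet(v, F(v))), up[-1])    # phase 2: narrowing
print(up)
print(down)
```

With these definitions the widening phase ends at [2, +∞] and the narrowing phase refines it to [2, 11], which still satisfies F(v) ⊑ v.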

Q: Simulate the analysis algorithm.

Correctness

Why is the narrowing step correct?

Main Concerns

Q1: Does a fixpoint of F exist?
• Yes, if D is a complete lattice and F is monotonic.

Q2: Is a fixpoint of F a correct answer?
• Yes, if there is a Galois connection and the transfer function is sound.

Q3: Does our algorithm always terminate? If not, what should we do?
• Use finite D. Or use widening.

Roadmap

Specify an analysis in terms of the following data:

1 Preset D (abstract domain).
2 Monotone function F (abstract transfer function).
3 Widening operator ∇ (unless D is finite).
4 Narrowing operator ∆ (optional).
5 Galois connection from C (concrete domain) to D.
6 Soundness of F wrt. E (concrete transfer function).

Then, our analysis terminates and is correct (sound).

Report

Rep6. for widening and narrowing (Maximal 3 students)