<<

Topics

• static – principles: key ideas and connections cs264: Program Analysis – techniques: efficient algorithms catalog name: Implementation of Programming Languages – applications: from compilation to engineering

• advanced topics, time permitting Ras Bodik WF 11-12:30 – • ex.: run-time bug-finding – program representations slides adapted from Mooly Sagiv • ex.: Static Single Assignment form

Course Requirements Course structure

• Prerequisites •Source – a course (cs164) – Nielsen, Nielsen, Hankin, Principles of Program Analysis • Useful but not mandatory – research papers •Format – semantics of programming languages (cs263) – lectures (roughly following the textbook) –algorithms – discussions of research papers – discrete math • you’ll read the paper before lecture and send me a “mini-review” •4 paragraphs •Grade –4-5 homeworks – take-home exam – project

Guest lectures Outline

• Lectures may include several “guest lecturers” • What is static analysis – expert’s view on a more advanced topic – usage in and other clients • Guest lecture time usually not during class; instead • Why is it called ? – PS Seminar (Mondays 4pm in 320 Soda) – handling undecidability – Faculty candidate talks (TBD) – soundness of abstract interpretation – CHESS Seminar (Tuesdays, 4pm, 540 Cory) • Relation to program verification • First speaker • Complementary approaches – Shaz Qadeer, Monday 4pm 320 Soda – paper: KISS: Keep it Simple and Sequential – you’ll write a mini-review (instructions to come)

1 Static Analysis Example Static Analysis Problem

•Goal: • Find variables with constant value at a given program – automatic derivation of properties that hold on every location execution leading to a program location (label) – (without knowing program input) int p(int x) { int p(int x){ return (x * x) ; return (x * x); •Usage: } } void main() void main() – compiler optimizations { { – code quality tools int z; int z; if (getc()) z = p(6) + 8; if (getc()) z = p(3) + 1; •Identify bugs else z = p(5) + 7; else z = p(-2) + 6; • Prove absence of certain bugs printf (z); printf (z); } }

A more problematic program Another example static analysis problem int x; • Find variables which are live at a given program location void p(a) { c = read(); if (c > 0) { – Definition: variable x is live at a program location p if x’sR- a = a -2; value can be used before x is set p(a); – Corresponding property: there exists an x-definition-free a = a + 2; execution path from p to a use of x } x = -2 * a + 5; print(x); } void main { p(7); print(x); }

A simple liveness example Typical compiler source program Scanner /* c */ tokens L0: a := 0 Parser /* ac */ L1: b := a + 1 AST /* bc */ a b Semantic Analysis c := c + b AST + type information /* bc */ Code Generator c a := b * 2 IR /* ac */ Static analysis if c < N goto L1 register interference graph IR + analysis info /* c */ Transformations return c

2 Some Static Analysis Problems The Need for Static Analysis

• Live variables •Compilers • Reaching definitions – Advanced computer architectures (Superscalar pipelined, VLIW, prefetching) • Available expressions – High-level programming languages • Dead code (functional, OO, garbage collected, concurrent) • Pointer variables that never point to same location • Software Productivity Tools – Compile time • Points in the program in which it is safe to free an object • Strengthen type checking for C • A virtual method call whose target method is unique • Detect Array-bound violations • Identify dangling pointers • Statements that can be executed in parallel • Generate test cases • An access to a variable that must reside in the cache • Prove absence of runtime exceptions • Prove pre- and post-conditions • Integer intervals

Software Quality Tools. Detecting Hazards Memory leakage example

• Uninitialized variables: List* reverse(List head) a = malloc() ; typedef struct List { { int d; b = a; List *rev, *n; struct List* next; cfree (a); rev = NULL; } List; c = malloc (); while (head != NULL) { if (b == c) n = headÆnext; // unexpected equality headÆnext = rev; head = n; leakage of address pointed to by head rev = head; • References outside array bounds } • Memory leaks return rev; }

Challenges in Static Analysis Foundation of Static Analysis

• Correctness • Static analysis can be viewed as •Precision – interpreting the program over an “abstract domain” • Efficiency – executing the program over larger set of execution paths •Scaling • Guarantee sound results, ex.: – Every identified constant is indeed a constant – But not every constant is identified as such

3 Example Abstract Interpretation. Casting Out Nines Even/Odd Abstract Interpretation

• A (weak) sanity check of decimal arithmetic using 9 values • Determine if an integer variable is even or odd at a given – 0, 1, 2, 3, 4, 5, 6, 7, 8 program point • The casting-out-nine rule: – whenever an intermediate result exceeds 8, replace by the sum of its digits (recursively) • Example “123 * 457 + 76543 = 132654?” – 123*457 + 76543 = 132654? – 6 * 7 + 7 = 21? – 6 + 7 = 3? – 4 = 3? NO. Report an error. • Why this rule produces no false alarms: – (10a + b) mod 9 = (a + b) mod 9 – (a+b) mod 9 = (a mod 9) + (b mod 9) – (a*b) mod 9 = (a mod 9) * (b mod 9)

Example Program Abstract Interpretation

/* x=? */ while (x !=1) do { /* x=? */ γ if (x %2) == 0 /* x=E */ { x := x / 2; } /* x=? */ else α /* x=O */ { x := x * 3 + 1; /* x=E */ assert (x %2 ==0); } α } Abstract /* x=O*/ Concrete α Descriptors of Sets of stores sets of stores

Odd/Even Abstract Interpretation Odd/Even Abstract Interpretation All concrete states

All concrete states

γ α {x: x c Even} {-2, 1, 5} ? {-2, 1, 5} ? {x: x c Even} α {0,2} {0,2} α EO EO {0} {2} γ {0} {2} α

∅ ∅ ⊥ ⊥ α

4 Odd/Even Abstract Interpretation Odd/Even Abstract Interpretation

γ ⊥ All concrete states α α(X) = if X= ∅ return z else if for all z in X (z%2 == 0) return E ? else if for all z in X (z%2 != 0) return O {-2, 1, 5} else return ? {x: x c Even} α EO {0,2} γ(a) = if a=⊥ return ∅ {0} {2} ⊥ else if a = E return Even ∅ else if a = O return Odd else return Natural

Example Program Concrete and Abstract Interpretation

+0123… *0123… 00123… 00000… 11234… 10123… while (x !=1) do { 22345… 20246… if (x %2) == 0 33456… 30369… { x := x / 2; } MMMMM else MMMMM /* x=O */{ x := x * 3 + 1; /* x=E */ assert (x %2 ==0); } +’ ?OE *’ ?OE } ???? ???E O?EO O?OE E?OE EEEE

Abstract interpretation cannot be always precise Abstract (Conservative) interpretation

Operational semantics x := x/2 statement s {16, 32} {8, 16} Set of states Set of states

α α α

x := x /# 2 statement s abstract abstract abstract E E Abstract ? t representation Abstract representation ⊒ representation semantics a semantics ⊒

5 Abstract (Conservative) interpretation Challenges in Abstract Interpretation

Operational • Finding appropriate program semantics (runtime) semantics • Designing abstract representations statement s • What to forget Set of states Set of states ⊆ Set of states • What to remember • Summarize crucial information

• Handling loops • Handling procedures statement s abstract abstract • Scalability representation Abstract representation •Large programs semantics • Missing •Precise enough

Runtime vs. Abstract Interpretation Example Constant Propagation ( Tools) • Abstract representation set of Runtime Abstract integer values and and extra value “?” denoting variables +# ? 0 1 2 Effectiveness Missed Errors False alarms not known to be constants ? ? ? ? ? • Conservative interpretation of 0 ? 0 1 2 Locate rare errors + 1 ? 1 2 3 Cost Proportional to Proportional to program’s program’s size 2 ? 2 3 4 execution … ? … … …

Example Constant Propagation (Cont) Example Program

• Conservative interpretation of * x = 5; y = 7; *# ? 0 1 2 *# ? 0 1 2 if (getc()) ? ? 0 ? ? ? ? 0 ? ? 0 0 0 0 0 y = x + 2; 0 0 0 0 0 1 ? 0 1 2 1 ? 0 1 2 z = x +y; 2 ? 0 2 4 2 ? 0 2 4 … ? 0 … … … ? 0 … …

6 Example Program (2) Undecidability Issues

• It is undecidable if a program point is reachable if (getc()) in some execution x= 3 ; y = 2; • Some static analysis problems are undecidable even if else the program conditions are ignored x =2; y = 3; z = x +y;

The Constant Propagation Example Coping with undecidabilty

• Loop free programs while (getc()) { if (getc()) x_1 = x_1 + 1; • Simple static properties if (getc()) x_2 = x_2 + 1; • Interactive solutions ... if (getc()) x_n = x_n + 1; • Effects of conservative estimations } – Every enabled transformation cannot change the meaning of y = truncate (1/ (1 + p2(x_1, x_2, ..., x_n)) the code but some transformations are not enabled /* Is y=0 here? */ – Non optimal code – Every potential error is caught but some “false alarms” may be issued

Analogies with Numerical Analysis Violation of soundness

• Approximate the exact semantics • Loop invariant code motion • More precision can be obtained at greater computational • Dead code elimination costs •Overflow – But sometimes more precise can also be more efficient ((x+y)+z) != (x + (y+z)) • Quality checking tools may decide to ignore certain kinds of errors – Sound w.r.t different concrete semantics

7 Optimality Criteria Program Verification

• Precise (with respect to a subset of the programs) • Mathematically prove the correctness of the program • Precise under the assumption that all paths are • Requires formal specification executable (statically exact) • Relatively optimal with respect to the chosen abstract • Example. Hoare Logic {P} S {Q} domain – {x = 1} x++ ; {x = 2} • Good enough –{x =1} {true} if (y >0) x = 1 else x = 2 {?} – {y=n} z = 1 while (y>0) {z = z * y-- ; } {?}

Relation to Program Verification Complementary Approaches

Program Analysis Program Verification • Finite state

• Fully automatic • Requires specification and loop • Unsound approaches • But can benefit from specification invariants – Compute underapproximation • Applicable to a programming • Not decidable language • Program specific • Better design • Can be very imprecise • Type checking • May yield false alarms • Relative complete • Identify interesting bugs • Must provide counter examples • Proof carrying code • Establish non-trivial properties • Provide useful documentation using effective algorithms • Just in time and dynamic compilation •Profiling •Runtime tests

8