
Static Analysis: Control Flow Graph (CFG)

© Copyright 2005, Matthew B. Dwyer and Robby. The syllabus and lectures for this course are copyrighted materials and may not be used in other course settings outside University of Nebraska-Lincoln and Kansas State University in their current form or modified form without express written permission of one of the copyright holders. During this course, students are prohibited from selling notes to or being paid for taking notes by any person or commercial firm without the express written permission of one of the copyright holders.

Control Flow Graph ─ Motivation

- We want a better program representation for describing program flow in a method body:
  - the initial statement
  - the last statement
  - given a statement s, a way to get
    - the next statements (successors) of s
    - the previous statements (predecessors) of s

CFG Example

    static int factorial(int n) {
      int result;
      int i;
      [StaticJavaLib.assertTrue(n >= 1);]1
      [result = 1;]2
      [i = 2;]3
      [while (i <= n) {
        [result = result * i;]5
        [i = i + 1;]6
      }]4
      [return result;]7
    }

[Figure: flow graph of factorial with one node per labelled statement: StaticJavaLib.assertTrue(n >= 1); result = 1; i = 2; while (i <= n) …; result = result * i; i = i + 1; return result]

Control Flow Graph ─ Definition

- A CFG is a 4-tuple (B, E, binit, blast)
  - B is a set of blocks
    - a basic block is a straight-line piece of code without any jumps (e.g., a statement or a sequence of statements)
    - the granularity of a basic block is not fixed: it does not need to be maximal
    - for our work, a statement is a basic block
  - E ⊆ B × B is a relation on B
    - (b1, b2) ∈ E means that b2 is a next (successor) basic block of b1
  - binit and blast are the unique initial basic block and the unique last basic block of the CFG
    - if there is no such unique basic block, we can create a “virtual” one

CFG Example

    static int factorial(int n) {
      int result;
      int i;
      [StaticJavaLib.assertTrue(n >= 1);]1
      [result = 1;]2
      [i = 2;]3
      [while (i <= n) {
        [result = result * i;]5
        [i = i + 1;]6
      }]4
      [return result;]7
    }

CFG for factorial

- B = { 1, 2, 3, 4, 5, 6, 7 }
- E = { (1, 2), (2, 3), (3, 4), (4, 5), (4, 7), (5, 6), (6, 4), (7, blast) }
- binit = 1
- blast (virtual)
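The 4-tuple can be written down directly as data. The following is a minimal sketch, assuming blocks are named by their labels as strings and an edge is stored as a two-element list; the class name `Cfg`, its fields, and the `factorial()` helper are illustrative choices, not part of the course material.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** A CFG as the 4-tuple (B, E, binit, blast); hypothetical representation. */
final class Cfg {
    final Set<String> blocks;       // B
    final Set<List<String>> edges;  // E, each edge stored as the pair [from, to]
    final String init;              // binit
    final String last;              // blast

    Cfg(Set<String> blocks, Set<List<String>> edges, String init, String last) {
        this.blocks = blocks;
        this.edges = edges;
        this.init = init;
        this.last = last;
    }

    /** The CFG of factorial, copied from the sets listed above. */
    static Cfg factorial() {
        Set<String> b = new LinkedHashSet<>(List.of("1", "2", "3", "4", "5", "6", "7"));
        Set<List<String>> e = new LinkedHashSet<>(List.of(
            List.of("1", "2"), List.of("2", "3"), List.of("3", "4"),
            List.of("4", "5"), List.of("4", "7"), List.of("5", "6"),
            List.of("6", "4"), List.of("7", "blast")));
        return new Cfg(b, e, "1", "blast");
    }

    public static void main(String[] args) {
        Cfg g = factorial();
        System.out.println("B = " + g.blocks);
        System.out.println("E = " + g.edges);
        System.out.println("binit = " + g.init + ", blast = " + g.last);
    }
}
```

Storing E as a set of pairs mirrors the definition of E as a relation; a successor map would be an equally reasonable choice.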

Building CFG ─ Auxiliary Functions

- Suppose we have two functions first and last on a sequence of statements
  - first([]) = ? (undefined)
  - first([s]) = s
  - last([]) = ? (undefined)
  - last([s]) = s
  - first([s1, s2, s3, …, sN]) = s1
  - last([s1, s2, s3, …, sN]) = sN
- Notes
  - Convention: S denotes a sequence of statements, and s denotes an individual statement
  - first and last are called undefined (↑) if the given sequence is empty; otherwise they are defined (↓) as above
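One way to model "undefined (↑)" versus "defined (↓)" in Java is `java.util.Optional`: an empty `Optional` stands for ↑. This is a sketch under that assumption; the class name `FirstLast` is hypothetical.

```java
import java.util.List;
import java.util.Optional;

final class FirstLast {
    /** first(S): the first element of S, or empty (undefined) when S = []. */
    static <T> Optional<T> first(List<T> s) {
        return s.isEmpty() ? Optional.empty() : Optional.of(s.get(0));
    }

    /** last(S): the last element of S, or empty (undefined) when S = []. */
    static <T> Optional<T> last(List<T> s) {
        return s.isEmpty() ? Optional.empty() : Optional.of(s.get(s.size() - 1));
    }

    public static void main(String[] args) {
        List<String> s = List.of("s1", "s2", "s3");
        System.out.println(first(s));                         // defined
        System.out.println(last(s));                          // defined
        System.out.println(first(List.<String>of()).isPresent()); // undefined
    }
}
```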

Running Example

    static void running(boolean b1, boolean b2, boolean b3) {
      [b1 = true;]1
      [while (b1) {
        [b2 = !b2;]3
        [if (b2) {
          [return;]5
        } else {
          [while (b3) {
            [b3 = !b3;]7
          }]6
        }]4
      }]2
    }

Building CFG ─ Init and Last Statements

- For a method m whose body is a sequence of statements S:

$$b_{init} = \begin{cases} first(S) & \text{if } first(S)\downarrow \\ b_{last} & \text{otherwise} \end{cases}$$

Running Example (Method Declaration)

    static void running(boolean b1, boolean b2, boolean b3) {
      [b1 = true;]1
      [while (b1) {
        [b2 = !b2;]3
        [if (b2) {
          [return;]5
        } else {
          [while (b3) {
            [b3 = !b3;]7
          }]6
        }]4
      }]2
    }

- E = { } (initially)
  - add (2, blast)
- binit = 1

Building CFG ─ Sequence of Statements

- For S = [s1, …, sN], (si, si+1) ∈ E
  - where 0 < i < N and si is not an if-statement or a return statement

Running Example (Method Body, statements 1 and 2)

- E = { (2, blast) }
  - add (1, 2)

Running Example (Assignment, statement 1)

- E = { (1, 2), (2, blast) }
  - ignored

Building CFG ─ While Statement

- For s = while (e) { S1 }
  - if first(S1)↓
    - (s, first(S1)) ∈ E
    - if last(S1) is not a return statement, (last(S1), s) ∈ E
  - otherwise, (s, s) ∈ E

Running Example (While Statement, statement 2)

- E = { (1, 2), (2, blast) }
  - add (2, 3)
  - add (4, 2)

Running Example (While Body, statements 3 and 4)

- E = { (1, 2), (2, blast), (2, 3), (4, 2) }
  - add (3, 4)

Running Example (Assignment, statement 3)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 2) }
  - ignored

Building CFG ─ If-Statement

- For […, s = if (e) { S1 } else { S2 }, S3]:
  - then-branch
    - if first(S1)↓
      - (s, first(S1)) ∈ E
      - if last(S1) is not a return statement, (last(S1), s’) ∈ E
    - otherwise, (s, s’) ∈ E
  - else-branch
    - if first(S2)↓
      - (s, first(S2)) ∈ E
      - if last(S2) is not a return statement, (last(S2), s’) ∈ E
    - otherwise, (s, s’) ∈ E
  - where

$$s' = \begin{cases} first(S_3) & \text{if } first(S_3)\downarrow \\ \text{next of } s \text{ from its parent} & \text{otherwise} \end{cases}$$

Running Example (If Statement, statement 4)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 2) }
  - next is 2, from (4, 2)
  - remove (4, 2)
  - then-branch: add (4, 5)
  - else-branch: add (4, 6), (6, 2)

Running Example (If-Then Body, statement 5)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 5), (4, 6), (6, 2) }
  - nothing to be added (there is only one statement)

Building CFG ─ Return Statement

- For s = return (e)? ;
  - (s, blast) ∈ E

Running Example (Return Statement, statement 5)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 5), (4, 6), (6, 2) }
  - add (5, blast)

Running Example (If-Else Body, statement 6)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 5), (4, 6), (5, blast), (6, 2) }
  - nothing to be added (there is only one statement)

Running Example (While Statement, statement 6)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 5), (4, 6), (5, blast), (6, 2) }
  - add (6, 7)
  - add (7, 6)

Running Example (While Body, statement 7)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 5), (4, 6), (5, blast), (6, 2), (6, 7), (7, 6) }
  - nothing to be added (there is only one statement)

Running Example (Assignment, statement 7)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 5), (4, 6), (5, blast), (6, 2), (6, 7), (7, 6) }
  - ignored

Running Example (Done!)

- E = { (1, 2), (2, blast), (2, 3), (3, 4), (4, 5), (4, 6), (5, blast), (6, 2), (6, 7), (7, 6) }
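The walk-through can be condensed into one recursive pass. The sketch below is not the procedure from the slides verbatim: instead of adding an edge such as (4, 2) and removing it later, it passes the "next" statement s’ down into nested sequences, which should produce the same edge set. The mini-AST (`Stmt`, `Kind`) and all method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical mini-AST and edge builder for the statement forms covered above. */
final class CfgBuilder {
    enum Kind { SIMPLE, RETURN, WHILE, IF }

    static final String LAST = "blast"; // the virtual last block

    /** A labelled statement; s1 and s2 are the nested sequences S1 and S2. */
    static final class Stmt {
        final Kind kind;
        final String label;
        final List<Stmt> s1;
        final List<Stmt> s2;

        Stmt(Kind kind, String label, List<Stmt> s1, List<Stmt> s2) {
            this.kind = kind;
            this.label = label;
            this.s1 = s1;
            this.s2 = s2;
        }
    }

    static Stmt simple(String label) {
        return new Stmt(Kind.SIMPLE, label, Collections.emptyList(), Collections.emptyList());
    }

    static Stmt ret(String label) {
        return new Stmt(Kind.RETURN, label, Collections.emptyList(), Collections.emptyList());
    }

    static Stmt loop(String label, Stmt... body) {
        return new Stmt(Kind.WHILE, label, Arrays.asList(body), Collections.emptyList());
    }

    static Stmt cond(String label, List<Stmt> thenBranch, List<Stmt> elseBranch) {
        return new Stmt(Kind.IF, label, thenBranch, elseBranch);
    }

    static void edge(Set<List<String>> edges, String from, String to) {
        edges.add(Arrays.asList(from, to));
    }

    /** Adds the edges for seq; next is the block that follows the whole sequence. */
    static void build(List<Stmt> seq, String next, Set<List<String>> edges) {
        for (int i = 0; i < seq.size(); i++) {
            Stmt s = seq.get(i);
            // s' in the rules: the following statement, or the parent's next
            String succ = (i + 1 < seq.size()) ? seq.get(i + 1).label : next;
            switch (s.kind) {
                case SIMPLE:
                    edge(edges, s.label, succ);
                    break;
                case RETURN:
                    edge(edges, s.label, LAST);
                    break;
                case WHILE:
                    edge(edges, s.label, succ);                  // leaving the loop
                    if (s.s1.isEmpty()) {
                        edge(edges, s.label, s.label);           // empty body: (s, s)
                    } else {
                        edge(edges, s.label, s.s1.get(0).label); // (s, first(S1))
                        build(s.s1, s.label, edges);             // body flows back to s
                    }
                    break;
                case IF:
                    for (List<Stmt> branch : Arrays.asList(s.s1, s.s2)) {
                        if (branch.isEmpty()) {
                            edge(edges, s.label, succ);          // (s, s')
                        } else {
                            edge(edges, s.label, branch.get(0).label);
                            build(branch, succ, edges);          // branch flows to s'
                        }
                    }
                    break;
            }
        }
    }

    /** Edge set of the running example, with blast following the method body. */
    static Set<List<String>> runningExample() {
        List<Stmt> body = Arrays.asList(
            simple("1"),
            loop("2",
                simple("3"),
                cond("4", Arrays.asList(ret("5")), Arrays.asList(loop("6", simple("7"))))));
        Set<List<String>> edges = new LinkedHashSet<>();
        build(body, LAST, edges);
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(runningExample());
    }
}
```

For the running example this yields the ten edges of the final E above. Statements after a return still receive edges here; removing them is a separate step.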

CFG Example (Quiz 1)

    static int factorial(int n) {
      int result;
      int i;
      [StaticJavaLib.assertTrue(n >= 1);]1
      [result = 1;]2
      [i = 2;]3
      [while (i <= n) {
        [result = result * i;]5
        [i = i + 1;]6
      }]4
      [return result;]7
    }

For You To Do

- Try building the CFG for the following methods

    int foo(int x) {
      [return x;]1
      [x = x + 1;]2
      [x = x - 1;]3
    }

    void bar(boolean b) {
      [if (b) {
        [return;]2
      } else {
        [return;]3
      }]1
      [return;]4
    }

    void bazzzzz() {
      [if (true) {
        [return;]2
      } else {
        [return;]3
      }]1
    }

Statement Reachability

To compute the statements reachable from a particular statement s:

$$reachable(s) = \{ s \} \cup E^{+}(s)$$

$$unreachable(b_{init}) = B \setminus reachable(b_{init})$$

Note that for a relation R, its transitive closure (+) is

$$R^{+} = \bigcup_{i \in \mathbb{N},\ i \ge 1} R^{i}$$

i.e.,

$$R^{+}(e) = R(e) \cup \Big( \bigcup_{e' \in R(e)} R(e') \Big) \cup \dots$$
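Rather than materializing E+ first, reachable(s) can be computed with a worklist that follows edges outward from s. The sketch below assumes edges are stored as two-element lists of block labels; the class and method names are hypothetical. The example data in `main` is the edge set of a three-statement method whose first statement is a return, so the other two statements are dead code.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class Reachability {
    /** reachable(s) = { s } ∪ E+(s), computed by following edges from s. */
    static Set<String> reachable(String s, Set<List<String>> edges) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        seen.add(s);
        work.push(s);
        while (!work.isEmpty()) {
            String b = work.pop();
            for (List<String> e : edges) {
                // follow (b, b') only if b' has not been visited yet
                if (e.get(0).equals(b) && seen.add(e.get(1))) {
                    work.push(e.get(1));
                }
            }
        }
        return seen;
    }

    /** unreachable(binit) = B \ reachable(binit). */
    static Set<String> unreachable(Set<String> blocks, String init, Set<List<String>> edges) {
        Set<String> result = new TreeSet<>(blocks);
        result.removeAll(reachable(init, edges));
        return result;
    }

    public static void main(String[] args) {
        Set<List<String>> e = new LinkedHashSet<>(Arrays.asList(
            Arrays.asList("1", "blast"), Arrays.asList("2", "3"), Arrays.asList("3", "blast")));
        Set<String> b = new TreeSet<>(Arrays.asList("1", "2", "3"));
        System.out.println(reachable("1", e));      // blocks reachable from 1
        System.out.println(unreachable(b, "1", e)); // dead blocks
    }
}
```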

Building CFG ─ Cleaning Up

- We want to restrict E so that it only relates reachable statements:

$$E' = E \setminus \{ (b_1, b_2) \mid b_1 \in unreachable(b_{init}) \text{ or } b_2 \in unreachable(b_{init}) \}$$

    int foo(int x) {
      [return x;]1
      [x = x + 1;]2
      [x = x - 1;]3
    }

- E = { (1, blast), (2, 3), (3, blast) }
- E’ = { (1, blast) }
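The restriction is a plain filter over E. A minimal sketch, assuming the set of unreachable blocks has already been computed and is passed in; `CleanUp` and `restrict` are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class CleanUp {
    /** E' = E \ { (b1, b2) | b1 or b2 is unreachable }. */
    static Set<List<String>> restrict(Set<List<String>> edges, Set<String> unreachable) {
        Set<List<String>> kept = new LinkedHashSet<>();
        for (List<String> e : edges) {
            // keep an edge only when both endpoints are reachable
            if (!unreachable.contains(e.get(0)) && !unreachable.contains(e.get(1))) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // E for foo; statements 2 and 3 follow the return and are unreachable
        Set<List<String>> e = new LinkedHashSet<>(Arrays.asList(
            Arrays.asList("1", "blast"), Arrays.asList("2", "3"), Arrays.asList("3", "blast")));
        Set<String> dead = new TreeSet<>(Arrays.asList("2", "3"));
        System.out.println(restrict(e, dead));
    }
}
```

On foo, only the edge (1, blast) survives, matching E’ above.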

CFG succs/preds

For a CFG (B, E, binit, blast):

- succs(b) = { b’ | (b, b’) ∈ E’ }
  - a function that, given a basic block b, returns b’s set of next basic blocks
- preds(b) = { b’ | (b’, b) ∈ E’ }
  - a function that, given a basic block b, returns b’s set of previous basic blocks

succs/preds Example

CFG for factorial

- B = { 1, 2, 3, 4, 5, 6, 7 }
- E’ = { (1, 2), (2, 3), (3, 4), (4, 5), (4, 7), (5, 6), (6, 4), (7, blast) }
- binit = 1
- blast (virtual block)

Successors and predecessors

    b    succs(b)     preds(b)
    1    { 2 }        { }
    2    { 3 }        { 1 }
    3    { 4 }        { 2 }
    4    { 5, 7 }     { 3, 6 }
    5    { 6 }        { 4 }
    6    { 4 }        { 5 }
    7    { blast }    { 4 }
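Both functions are simple lookups over E’. The sketch below scans the edge set on each call, which is fine for small graphs; the class and method names are hypothetical, and `factorialEdges()` just restates E’ for factorial.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class SuccsPreds {
    /** succs(b) = { b' | (b, b') ∈ E' }. */
    static Set<String> succs(String b, Set<List<String>> edges) {
        Set<String> out = new TreeSet<>();
        for (List<String> e : edges) {
            if (e.get(0).equals(b)) {
                out.add(e.get(1));
            }
        }
        return out;
    }

    /** preds(b) = { b' | (b', b) ∈ E' }. */
    static Set<String> preds(String b, Set<List<String>> edges) {
        Set<String> out = new TreeSet<>();
        for (List<String> e : edges) {
            if (e.get(1).equals(b)) {
                out.add(e.get(0));
            }
        }
        return out;
    }

    /** E' for factorial, as listed above. */
    static Set<List<String>> factorialEdges() {
        return new LinkedHashSet<>(Arrays.asList(
            Arrays.asList("1", "2"), Arrays.asList("2", "3"), Arrays.asList("3", "4"),
            Arrays.asList("4", "5"), Arrays.asList("4", "7"), Arrays.asList("5", "6"),
            Arrays.asList("6", "4"), Arrays.asList("7", "blast")));
    }

    public static void main(String[] args) {
        Set<List<String>> e = factorialEdges();
        System.out.println("succs(4) = " + succs("4", e));
        System.out.println("preds(4) = " + preds("4", e));
    }
}
```

An analysis that queries these sets often would more likely precompute two maps from block to set once, instead of rescanning E’.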

succs+/preds+ Example

Successors and their transitive closure

    b    succs(b)     succs+(b)
    1    { 2 }        { 2, 3, 4, 5, 6, 7, blast }
    2    { 3 }        { 3, 4, 5, 6, 7, blast }
    3    { 4 }        { 4, 5, 6, 7, blast }
    4    { 5, 7 }     { 4, 5, 6, 7, blast }
    5    { 6 }        { 4, 5, 6, 7, blast }
    6    { 4 }        { 4, 5, 6, 7, blast }
    7    { blast }    { blast }

Predecessors and their transitive closure

    b    preds(b)     preds+(b)
    1    { }          { }
    2    { 1 }        { 1 }
    3    { 2 }        { 1, 2 }
    4    { 3, 6 }     { 1, 2, 3, 4, 5, 6 }
    5    { 4 }        { 1, 2, 3, 4, 5, 6 }
    6    { 5 }        { 1, 2, 3, 4, 5, 6 }
    7    { 4 }        { 1, 2, 3, 4, 5, 6 }
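The closures can be computed by repeatedly composing the relation with itself until nothing new appears, which follows the union-of-powers definition of the transitive closure. This is a naive fixpoint sketch with hypothetical names, not an efficient algorithm. Since (7, blast) ∈ E’, blast is included in succs+(b) for every block b from which 7 can be reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class TransitiveClosure {
    /** R+ = R ∪ R² ∪ R³ ∪ …, computed as a fixpoint. */
    static Set<List<String>> closure(Set<List<String>> r) {
        Set<List<String>> plus = new LinkedHashSet<>(r);
        boolean changed = true;
        while (changed) {
            changed = false;
            // iterate over a snapshot so that plus can grow inside the loop
            for (List<String> ab : new ArrayList<>(plus)) {
                for (List<String> bc : r) {
                    if (ab.get(1).equals(bc.get(0))
                            && plus.add(Arrays.asList(ab.get(0), bc.get(1)))) {
                        changed = true;
                    }
                }
            }
        }
        return plus;
    }

    /** succs+(b): every b' with (b, b') in the closure of r. */
    static Set<String> succsPlus(String b, Set<List<String>> r) {
        Set<String> out = new TreeSet<>();
        for (List<String> e : closure(r)) {
            if (e.get(0).equals(b)) {
                out.add(e.get(1));
            }
        }
        return out;
    }

    /** preds+(b): every b' with (b', b) in the closure of r. */
    static Set<String> predsPlus(String b, Set<List<String>> r) {
        Set<String> out = new TreeSet<>();
        for (List<String> e : closure(r)) {
            if (e.get(1).equals(b)) {
                out.add(e.get(0));
            }
        }
        return out;
    }

    /** E' for factorial, restated from the example. */
    static Set<List<String>> factorialEdges() {
        return new LinkedHashSet<>(Arrays.asList(
            Arrays.asList("1", "2"), Arrays.asList("2", "3"), Arrays.asList("3", "4"),
            Arrays.asList("4", "5"), Arrays.asList("4", "7"), Arrays.asList("5", "6"),
            Arrays.asList("6", "4"), Arrays.asList("7", "blast")));
    }

    public static void main(String[] args) {
        Set<List<String>> e = factorialEdges();
        System.out.println("succs+(4) = " + succsPlus("4", e));
        System.out.println("preds+(7) = " + predsPlus("7", e));
    }
}
```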

For You To Do

- Give the CFG, and its succs, preds, succs+, and preds+ for

    int foo(int x) {
      [return x;]1
      [x = x + 1;]2
      [x = x + 1;]3
    }

    void bar(boolean b) {
      [if (b) {
        [return;]2
      } else {
        [return;]3
      }]1
      [return;]4
    }

    void bazzz() {
      [if (true) {
        [return;]2
      } else {
        [return;]3
      }]1
    }

- Is foo.2 ∈ succs+(foo.1)?
- Is bar.4 ∈ succs+(bar.1)?
- Is bazzz.3 ∈ succs+(bazzz.1)?

Assessment

Control Flow Graph

- It is easy to determine very approximate control flow information for Java
- It is hard to determine accurate information for Java and higher-order languages (e.g., where functions can be passed as arguments, or where exceptions are involved)
- Control flow information is often depicted in graphical form, e.g., as a flow graph
