Index

abstract interpretations, 14 φ placement, 196 access expressions, 137 dominance frontier, 194 access graph, 141 renaming, 199 empty, 141 SSA destruction, 214 operations, 143–146 , 224 cleanUp, 143 , 129–135 CN, 143 data flow equations, 134 extension, 145 of parameters, 268 factorization, 145 data flow equations, 269–271 field, 143 alias relations, 129 lastNode, 143 aliasing makeGraph, 143 data flow equations, 134 path removal, 145 direct aliases, 131 root, 143 indirect aliases, 131 safety of, 146 link aliases, 130 union, 144 node aliases, 130 summarization, 142 of access paths, 3, 9 access path, 3, 137 of parameters, 267 aliasing of, 3 of pointers, 2, 129–135 base of, 137 all paths analysis, 33 directly generated, 139 anti-symmetric, 64 frontier of, 137 anticipable expressions analysis, 37–39 killed, 139 data flow equations, 38 liveness of, 3 any path analysis, 26 simple, 137 applications of data flow analysis, 16 summarization, 3, 12 assignment, 176, 177, 181 target of, 137 maximum fixed point (MFP), 77 transfer of, 139 maximum safe, 75 algorithm meet over paths (MOP), 75 DFST construction, 60 associativity, 66 elimination, 184 available expressions analysis, 33–35 iterative common subexpressions elimination, round-robin, 4, 19, 79, 90, 163– 34 164 data flow equations, 33 work list, 165–172,312, 319, 320, 322 backward data flow problem, 26 SSA construction backward edges, 61

379 380 Index backward flow, 24, 160 termination length, 328 basic block, 23 termination, issues in, 305–310 bidirectional data flow equations, 39, 42, value-based termination, 317–324, 184 328 bidirectional data flow frameworks, 73, calling context, 295 159, 175, 184 restricted, 296–301 bit vector, 23 unrestricted, see call string bit vector frameworks, 23–57, 73, 86, canonical SSA, 207 88, 89, 162, 171 chain, 67 anticipable expressions analysis, 37– chordal graph, 219 39 chromatic number, 216 available expressions analysis, 33– code movement, 33 35 coloring, 222 combined may-must availability anal- combined may-must availability analy- ysis, 53–56 sis, 53 dead variables analysis, 29 data flow equations, 55 lazy code motion, 49–53 common subexpression elimination, 34 live variables analysis, 26–29 commutativity, 66 partial redundancyelimination, 39– complete lattice, 67 49 complexity of data flow analysis partially available expressions anal- bit vector frameworks, 99, 171–175 ysis, 36–37 depth of CFG, 61, 94, 97, 99 reaching definitions analysis, 29– fast frameworks, 175–179 33 information flow paths, 171–184 specification in GCC, 336–340 non-separable frameworks, 179–183 bit vector operations, 23, 56 rapid frameworks, 85–99 boundary information (BI), 26 width, 175 bounded lattice, 70 component function, 102, 153 composite entity functions (cef), 155 call multigraph, 18, 234, 235 computation of MFP assignment, 79 call node, 235 conditional constant propagation, 116 call segment, 295 data flow equations, 117 call site, 235 conservative approximations, 11, 65, 142, call string, 295, 302 147, 151, 238, 246, 250, 251, approximate, 328 254 data flow equations, 298, 303 constant propagation, 108–119, 157 restricted, 296 conditional, 116 data flow equations, 298 copy, 118 unrestricted, 302 data flow equations, 112 construction, 302, 303 linear, 119 data flow equations, 303 constraint resolution systems, 13 equivalence, 312, 317, 318 context independence, 236 for bit vector frameworks, 328 context sensitive analysis, 296 regeneration of, 319 context sensitivity, 12, 15, 17, 157, 243 representation of, 319 control flow graph, 1 Data Flow Analysis: Theory and Practice 381 control flow graph (CFG), 18, 23, 24, data flow framework, 72 235 instance of, 73 convergence, 12 data flow frameworks converging paths, 189 bit vector, 23–57 copy, 32 anticipable expressions analysis, copy propagation, 32 37–39 critical edges, 47, 50 available expressions analysis, 33– critical nodes, 48 35 cross edges, 61 bidirectional, 39, 42 CSSA, see canonical SSA combined may-must availability cyclic call sequence, 308 analysis, 53–56 cyclic return sequence, 308 dead variables analysis, 29 lazy code motion, 49–53 data flow equations, 24 live variables analysis, 26–29 alias analysis, 134 partial redundancyelimination, 39– alias analysis, of parameters, 269– 49 271 partially available expressions anal- anticipable expressions analysis, 38 ysis, 36–37 available expressions analysis, 33 reaching definitions analysis, 29– bidirectional, 39, 42, 160, 184 33 call string, restricted, 298 side effects analysis of procedure call string, unrestricted, 303 calls, 259 combined may-must availability anal- category by flow functions ysis, 55 bit vector, 73, 86, 162, 171 conditional constant propagation, 117 distributive, 73 constant propagation, 112 fast, 89, 175 explicit liveness, see explicit live- k-bounded, 86 ness, data flow equation monotone, 73 faint variables analysis, 103 non-separable, 159, 179 generic, 160 rapid, 86–90 interprocedural data flow analysis separable, 159 constructing summary flow func- direction of flow tions, 278–279 bidirectional, 73, 159, 162, 175, using summary flow functions, 184 283 unidirectional, 73, 159, 162 lazy code motion, 49–52 non-separable, 101–157 liveness, 26 alias analysis, 129–135 partial redundancy elimination, 40, alias analysis of parameters, 268 42 constant propagation, 108–119 points-to analysis, 121 faint variables analysis, 103–106 possibly uninitialized variables, 107 heap liveness analysis, 135–152 reaching definitions, 30 points-to analysis, 119–129 side effects of procedurecalls, 261– possibly uninitialized variables anal- 265 ysis, 106–108 unidirectional, 160 properties of, 86 382 Index data flow information faint variables analysis, 103–106, 157 exhaustiveness of, 11, 66 data flow equations, 103 generation, 24, 56, 102 fast data flow framework, 86, 89, 175 inherited, 237, 238, 254 fixed point, 77 invalidation, 24 fixed point assignment, 77 killing, 56, 102 fixed point theorem, 78 precision of, 12, 112, 115–119,128, flow functions, 64, 71 131, 147, 152 backward, 73 representation of, 12, 18, 23, 25, distributive, 72, 73, 83, 90 54, 64–70, 112, 120, 130, 141 entity functions safety of, 11, 65, 66, 75, 87, 117 composite (cef), 155 synthesized, 237, 238, 254 primitive (pef), 153, 172, 176, dead code, 33 177, 180, 181, 184 , 28 fast, 86, 89 dead variables analysis, 29 forward, 73 def-use chains, 30, 185 k-bounded, 86 depth, see loop connectedness monotonic, 71–73 depth first spanning tree, 60 non-separable, 101, 102, 154, 156 descending chain condition, 67 separable, 101 ff DFST, see depth first spanning tree flow insensitive side e ects, 248, 263 ff distributive data flow frameworks, 73 flow sensitive side e ects, 251, 261 dominance, 62, 191 flow sensitivity, 12, 15, 17, 157, 241, dominance and reducibility, 62 261 dominance frontier, 191–193 forbidden subgraph, 62 algorithm forward edges, 61 complexity, 194 forward flow, 24, 160 downwards exposed, 25 garbage collection, 2 dynamic analysis, 14 generation of liveness, 6 generic data flow analyzer, 360, 368 equivalenceof iterated join and IDF, 201 generic data flow equations, 160 exhaustive analysis, 15 generic flow functions, 162 explicit liveness gimple representation, 334 convergence of, 149 gimple version of CFG, 342–346 data flow equation, 147 glb, see greatest lower bound, 68, 69 efficiency of, 150 global data flow analysis, 17, 23, 360– specification, 140 362 explicitly live access paths, 138 global data flow information, 24 expression GNU Compiler Collection (GCC), 333, anticipable, 38 365 available, 33 graph, 59 partially available, 36 access graph, 141 partially redundant, 36 call multigraph, 18, 235 redundant, 34 chordal, 219 extensive point of function, 78 coloring of chordal, 222 Data Flow Analysis: Theory and Practice 383

control flow graph (CFG), 18, 23, context sensitive, 157 24, 235 flow insensitive, 157 depth first spanning tree, 60 equality-based, 157 depth of, 61 subset-based, 157 memory graph, 136 flow sensitive, 157 parameter binding graph, 273 functional approach, 238, 259–290 points-to graph, 129 summary flow function program summary graph, 291 construction, 278–282 reducible, 62 enumeration, 285–290 supergraph, 18, 235, 293 reduction, 275–278 graph reachability, 291 side effects of procedurecalls, 259– greatest element, 64 265 greatest lower bound, 66 using restricted contexts, 296 using unrestricted context, 301 Hasse diagram, 65 value-based, 239, 293–328 hoisting interprocedurally valid desirability, 41 control flow path, 294 safety, 40, 41 information flow path, 295 intraprocedural segment, 295 idempotence, 66 invocation graph, 329 immediate dominator, 192 iterated dominance frontier, 195 incremental analysis, 15 complexity of algorithm, 196 detection, 189 iterated join, 198 inference systems, 13 information flow join, 66, 198 origin of, 171, 176, 180 join semilattice, 70 information flow paths, 165, 172 in bit vector framework, 171 k-bounded data flow framework, 86 in fast frameworks, 175, 176 killing liveness, 6 in non-separable framework, 179 width of, 174 lattice, 67 interference graph, 216 lattice of flow functions, 274 interprocedural approximation, 6 lazy code motion interproceduralconstant propagation,233 critical edges, splitting of, 50 interprocedural data flow analysis, 17 data flow equations, 49–52 call string earliest points of placement, 50, 53 construction, 302, 303 latest points of placement, 51, 54 equivalence, 312, 317, 318 region of safe placement, 50 for bit vector frameworks, 328 transformation, 52 regeneration of, 319 least element, 64 representation of, 319 least upper bound, 66 termination length, 328 left locations, 120 termination, issues in, 305–310 lifetime of a name, 12 value-based termination, 317–324 link aliases, 130 context insensitive, 157 live range, 208 384 Index

interference, 208 MOP assignment, see meet over paths live variables analysis, 26–29 assignment, 76, 81, 82 data flow equations, 26 undecidability of, 85 liveness, 26 MPCP, see modified Post’s correspon- of access paths, 3 dence problem of heap data, 135–152 must availablility analysis, see available of pointers, 2 expressions analysis summary, 139 must points-to, 121 liveness analysis, 20 of heap, 135–152 node aliases, 130 of variables, see live variables anal- non-separable flow functions, 101 ysis non-separableframeworks, 29, 31, 101– liveness of access paths, 4, 138 157, 159, 179 local data flow analysis, 17, 23–25, 358– alias analysis, 129–135 360 constant propagation, 108–119 local data flow information, 24 faint variables analysis, 103–106 constant, 102 heap liveness analysis, 135–152 dependent, 102 points-to analysis, 119–129 loop closure, 86 possibly uninitialized variables anal- loop connectedness, 61 ysis, 106–108 lost-copy problem, 208, 214 parameter alias analysis, 268 lower bound, 66 parameter binding graph, 273 lub, see least upper bound, 68, 69 partial order, 64 partial redundancy elimination, 39–49 maximum fixed point assignment, 77 critical edges, 47 maximum safe assignment, 75 critical nodes, 48 may alias analysis, 74 data flow equations, 40, 42 may availablility analysis, see partially hoisting path, 40 available expressions analysis transformation, using, 44 may points-to, 121 limitations of, 45 may-must alias analysis, 69 partial transfer functions, 291 may-must availability, 69 partially available expressions analysis, meet, 66 36–37 meet over paths assignment, 75 path meet semilattice, 69 in a graph, 59 memory graph, 136 length, 60 merge operation, 64 PEO, see perfect elimination order MFP assignment, see maximum fixed point PEO ordering, 223 assignment, 81, 82 perfect elimination order, 223 minimal SSA, 190 periodic point, 316 model checking, 14 φ-instruction placement, 194 modified Post’s correspondenceproblem, φ-variable renaming, 196 84 φ-congruence, 209 monotone data flow frameworks, 73 φ-function, 186 Data Flow Analysis: Theory and Practice 385

φ-instruction, 186 use-def chains, 30 φ-related, 209 reducible, 62 φ-variable renaming algorithm, 197–199 reducing function compositions, 275 φ-variables, 187 reducing function confluences, 275 , 157 reductive point of function, 78 strong update, 120 reflexive, 64 weak update, 120 register allocation, 28 points-to analysis, 119–129 register allocation via graph coloring, 216 data flow equations, 121 register copies, 226 degree of certainty, 125 register transfer graph, 226 inverse dependence of, 124 relation, 64 left locations, 120 return node, 235 may points-to, 121 return segment, 295 must points-to, 121 reverse postorder, 28, 61 right locations, 120 right locations, 120 strong update, 120 round-robin iterative algorithm, 4, 19, weak update, 120 79, 90, 163–164 points-to graph, 129 poset, 64 safe assignment, 75 for available expression, 65 safety of data flow analysis, 11 for live variables, 64 scalar variables, 23 possibly uninitialized variables analysis, separable, 102 31, 106–108 separable frameworks, 159 data flow equations, 107 shape analysis, 157 postorder, 28 side effects, 256 predecessor, 24 flow insensitive, 263 primitive entity functions (pef), 153, 172, flow sensitive, 261 176, 177, 180, 181, 184 side effects analysis, 20, 244, 248, 259, procedure cloning, 254 290 procedure inlining, 254 simplicial, 222 product lattice, 101 spanning tree, 60 components, 68 spilling algorithm, 220 program entities, 23 SSA program representations, 16, 17 construction, 189 program summary graph, 291 construction algorithm pruned SSA, 190 φ placement, 196 dominance frontier, 194 qualified data flow value, 295 renaming, 199 correctness of construction, 198–206 rapid condition, 88 destruction, 207 rapid data flow framework, 86–90 destruction algorithm, 214 reaching definitions analysis, 19, 29–33 destruction through register alloca- data flow equations, 30 tion, 216 def-use chains, 30 coalescing, 223 for copy propagation, 32 coloring algorithm, 224 386 Index

register copies, 226 use of the variable, 25 dominance property, 206 use-def chains, 30, 185 form program, 186 stabilization of descending chain, 67 valid transformation to SSA form, 190 static analysis, 14 variable strict dominance, 191 dead, 29 strictly stronger than, 64 faint, 103 strictly weaker than, 64 live, 26 strong update, 120 possibly uninitialized, 31, 106 stronger than, 64 strongly live, 157 strongly live variables analysis, 157 variable definition, 206 successor, 24 variable use, 206 summarization, 7, 12 variable versions, 186 of liveness, 6 very busy expressions analysis, 37 summary flow function, 236, 238, 290 construction, 278–282 weak update, 120 data flow equations, 278–279 weaker than, 64 data flow equations, using, 283 whole program analysis, 253, 274, 290 context insensitive, 253 enumeration, 285–290 context sensitive, 254 reduction, 275–278 work list iterative algorithm, 165–172, side effects of procedurecalls, 259– 312, 319, 320, 322 265 supergraph, 18, 235, 293 swap problem, 208, 215 symmetric segment, 295 termination of call string construction, 305 traditional register allocator, 216–217 transfer, 101 transfer of access path, 6 transformed SSA, 207 transitive, 64 traversal conforming, 174 non-conforming, 174 tree edges, 61 TSSA, see transformed SSA undecidability of MOP assignment, 83 unidirectional data flow frameworks, 73, 159 unrestricted context, 302 upper bound, 66 upwards exposed, 25