MODULE 35 – Loops in Flow Graphs

After discussing function-preserving transformations and loop optimizations in the previous module, we now extend the discussion to optimization carried out at the assembly language level. Peephole optimization is carried out at the assembly language level by looking at the code through a small window. We will also discuss the terminology and concepts involved in loops and control flow graphs.

35.1 Peephole Optimization

Peephole optimization examines a short sequence of target instructions in a window (the peephole) and replaces it with a faster and/or shorter sequence whenever possible. Although it is typically applied at the assembly language level, it can also be carried out at the intermediate code level. The typical optimizations carried out using peephole techniques are the following:

• Redundant instruction elimination
• Flow-of-control optimizations
• Algebraic simplifications
• Use of machine idioms

35.1.1 Redundant instruction elimination

This optimization eliminates redundant loads and stores. Consider the following sequence of instructions, which is typical of the output of the simple code generator algorithm discussed in one of the previous modules:

MOV R0,a
MOV a,R0

When this sequence is observed through a peephole, the second instruction can be deleted, because the first instruction guarantees that the value of a is already in R0. This holds provided the second instruction does not carry a label, since a label would make it a possible branch target. A peephole represents a sequence of instructions with at most one entry point. The first instruction can also be deleted by looking at the next-use information, if live(a) = false.

35.1.2 Deleting Unreachable Code

Code that can never be reached along any control flow path can be removed. This optimization can be carried out at the intermediate code level or at the final code level. Unlabeled blocks that immediately follow an unconditional jump can be removed along with their instructions, since no branch can target them.
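The two rewrites introduced so far, redundant load/store elimination and removal of unlabeled code after an unconditional jump, can be sketched in a few lines of Python. This is a minimal illustration and not the code generator of the earlier modules: the `(label, opcode, operands)` tuple format, the `GOTO` opcode name and the function name `peephole` are assumptions made for this example, and `MOV` uses the source-then-destination operand order shown above.

```python
from typing import List, Optional, Tuple

# Hypothetical instruction model: (label or None, opcode, operands)
Instr = Tuple[Optional[str], str, Tuple[str, ...]]


def peephole(code: List[Instr]) -> List[Instr]:
    """Apply two peephole rules in a single left-to-right pass."""
    out: List[Instr] = []
    skipping = False  # True while inside unlabeled code that follows a GOTO
    for label, op, args in code:
        if skipping:
            if label is None:
                continue      # unreachable: unlabeled and preceded by a jump
            skipping = False  # a label is a possible entry point, stop skipping
        if out and label is None and op == "MOV":
            _, prev_op, prev_args = out[-1]
            # MOV x,y directly followed by an unlabeled MOV y,x is redundant
            if prev_op == "MOV" and prev_args == (args[1], args[0]):
                continue
        out.append((label, op, args))
        if op == "GOTO":
            skipping = True
    return out


program: List[Instr] = [
    (None, "MOV", ("R0", "a")),   # store R0 into a
    (None, "MOV", ("a", "R0")),   # reload a into R0: redundant
    (None, "GOTO", ("L2",)),
    (None, "ADD", ("x", "y")),    # unlabeled and after a GOTO: unreachable
    ("L2", "MOV", ("R0", "b")),
]
optimized = peephole(program)     # three instructions remain
```

The labeled instruction at L2 is kept even though it follows skipped code, which reflects the rule that a peephole may have at most one entry point.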
Consider Figure 35.1, where the condition “0 == 0” is always true, so control always transfers to block L2. The block that starts with the instruction “b:=x+y” is never reachable and can be removed from the control flow graph.

Figure 35.1 Example for unreachable code

35.1.3 Branch Chaining

This optimization is carried out at the intermediate code level. The idea is to shorten chains of branches by modifying target labels.

Figure 35.2 Branch chaining example

Consider Figure 35.2, where the code has two labels. If the condition “a==0” is true, control switches to label L2, and the instruction at label L2 transfers control to label L3. This is referred to as branch over branch (a branch whose target is itself an unconditional branch). It can be avoided by making the initial branch go directly to L3; the instruction labeled L2 can then be removed, provided no other branch targets L2.

35.1.4 Flow-of-Control Optimizations

Figure 35.3 Flow-of-control optimization example

Consider Figure 35.3, where the sequential flow is interrupted by a “goto L1” statement even though control continues sequentially in effect. The statement “goto L1” can be removed, and the two basic blocks can then be merged into a single block.

35.1.5 Algebraic Simplification

Peephole optimization also performs strength reduction, replacing expensive computations with cheaper ones. Consider Figure 35.4, where the exponentiation function is replaced with a multiplication, and division by 8 is replaced with a right shift by 3 bits. The shift is equivalent to the division only for unsigned (non-negative) integer operands. This is carried out at the intermediate code level.

Figure 35.4 Example for algebraic simplification

In addition, mathematical identities can be used to simplify the generated code at the intermediate level. Consider Figure 35.5, where instructions involving mathematical identities have been removed because they do not affect the computation.
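The strength-reduction and identity rules of this section can be sketched over a simple three-address form. This is a minimal sketch under stated assumptions: the `(dest, op, left, right)` tuple layout, the `copy` pseudo-operator and the names `simplify` and `simplify_all` are hypothetical, and the shift rule assumes unsigned operands.

```python
from typing import List, Optional, Tuple

# Hypothetical three-address form: (dest, op, left, right)
Quad = Tuple[str, str, str, str]


def simplify(q: Quad) -> Optional[Quad]:
    """Return a cheaper equivalent of q, or None if q can be deleted."""
    dest, op, left, right = q
    if op == "**" and right == "2":
        return (dest, "*", left, left)             # x ** 2  ->  x * x
    if op == "/" and right.isdigit():
        n = int(right)
        if n > 1 and (n & (n - 1)) == 0:           # divisor is a power of two
            # Assumption: operands are unsigned, so a shift equals division.
            return (dest, ">>", left, str(n.bit_length() - 1))
    if (op, right) in {("*", "1"), ("+", "0")}:
        if dest == left:
            return None                            # b := b * 1 has no effect
        return (dest, "copy", left, "")            # c := d + 0  ->  c := d
    return q


def simplify_all(code: List[Quad]) -> List[Quad]:
    """Simplify every instruction and drop the ones that became no-ops."""
    result: List[Quad] = []
    for q in code:
        s = simplify(q)
        if s is not None:
            result.append(s)
    return result


code: List[Quad] = [
    ("y", "**", "x", "2"),
    ("z", "/", "x", "8"),
    ("b", "*", "b", "1"),
    ("a", "+", "a", "0"),
    ("c", "+", "d", "0"),
]
simplified = simplify_all(code)
# -> [("y", "*", "x", "x"), ("z", ">>", "x", "3"), ("c", "copy", "d", "")]
```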
As we know, the multiplicative identity is “1”, so multiplying by it does not change the value of the LHS variable “b”, and the additive identity is “0”, so adding it does not change the value of the LHS variable “a”.

Figure 35.5 Algebraic simplification using mathematical identities

Algebraic simplification can also be carried out at the assembly language level, where machine idioms are used to generate optimized code. As shown in Figure 35.6, an instruction that adds “1” is replaced with the increment instruction.

Figure 35.6 Machine idioms usage example

35.2 Loops

As discussed in the previous module, loops are the major areas that need optimization. A Control Flow Graph (CFG) is a directed graph that may contain loops, which appear as strongly connected components (SCCs). In general, a loop is a subgraph in which every node can reach every other node along some path. This includes “unstructured” loops, with multiple entry and multiple exit points. A structured loop, also called a normal loop, has one entry point and (generally) a single point of exit. Loops created by mapping high-level source programs to intermediate code or assembly code are normal. A “goto” statement, on the other hand, can create any kind of loop, and a “break” statement creates additional exits.

Figure 35.7 (a) and (b) Loops example

In Figure 35.7, consider each node as a basic block. Determining the number of loops is a little complicated. In Figure 35.7 (a), nodes 2, 3, 4 and 5 all form part of the loop. The nodes on the path “2-3-4-2” form a loop, and so do the nodes on the path “3-5-3”. Taken together, the nodes “2, 3, 4, 5” are said to form an unstructured loop. In Figure 35.7 (b), on the other hand, the paths “2-3-2” and “2-4-2” are structured loops, while “2-3-2-4-2” is an unstructured loop. In the following sections, we will find such structured and unstructured loops in any flow graph, so that they can be used for optimization.
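The statement that the nodes of a loop can all reach one another can be checked mechanically. The sketch below is only an illustration: the edges are reconstructed from the paths named in the text for Figure 35.7 (a), the entry edge from node 1 to node 2 is an assumption because the figure itself is not reproduced here, and the function names are chosen for this example.

```python
from typing import Dict, List, Set

# Edges reconstructed from the paths "2-3-4-2" and "3-5-3" named in the text;
# the entry edge 1 -> 2 is an assumption.
cfg: Dict[int, List[int]] = {1: [2], 2: [3], 3: [4, 5], 4: [2], 5: [3]}


def reachable(graph: Dict[int, List[int]], start: int) -> Set[int]:
    """Nodes reachable from start by following at least one edge."""
    seen: Set[int] = set()
    work = list(graph.get(start, []))
    while work:
        node = work.pop()
        if node not in seen:
            seen.add(node)
            work.extend(graph.get(node, []))
    return seen


def component_of(graph: Dict[int, List[int]], node: int) -> Set[int]:
    """Nodes on a common cycle with node; empty if node lies on no cycle."""
    forward = reachable(graph, node)
    return {m for m in forward if node in reachable(graph, m)}


loop_nodes = component_of(cfg, 2)   # {2, 3, 4, 5}
entry_only = component_of(cfg, 1)   # set(): node 1 lies on no cycle
```

Running a separate search per node is quadratic in the worst case; it is used here for clarity rather than efficiency.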
35.3 Determining Loops in Flow Graphs: Dominators

To discuss loops in flow graphs, an important concept called dominators needs to be studied. Dominators are used to identify the headers of loops in a control flow graph and to optimize those loops. Given two nodes ‘d’ and ‘n’, we say that node ‘d’ dominates ‘n’, denoted “d dom n”, if every path from the initial node of the control flow graph (CFG) to ‘n’ goes through ‘d’. The loop entry dominates all nodes in the loop. The immediate dominator ‘m’ of a node ‘n’ is the last dominator on the path from the initial node to ‘n’. If d ≠ n and “d dom n”, then “d dom m”. Dominator trees are used to indicate which node dominates which other nodes.

Consider Figure 35.8, which has 10 nodes. The root of the CFG is node 1. Without passing through node 1 we cannot reach any other node, so node 1 dominates all other nodes. Node 2 is reached from node 1, and node 3 is also reached directly from node 1. Thus node 2 does not dominate node 3, because node 2 can be skipped on the way to node 3. Every remaining node is reached only through node 3, so node 3 dominates them. There is an edge from node 3 to node 4, and every node below node 4 is reached only through it, so node 4 dominates all the nodes below it.

Figure 35.8 Control flow graph for dominators

Proceeding in a similar fashion, we construct the dominator tree, in which each dominating node is a parent and the nodes it immediately dominates are its children. This is shown in Figure 35.9.

Figure 35.9 Dominator tree

As can be seen, node 1 dominates all nodes, and its children in the tree are nodes 2 and 3. Node 3 dominates all the nodes below it. Similarly, nodes 4, 7 and 8 each dominate the nodes in their own subtrees.

35.4 Natural Loops

The next task is to identify simple or natural loops. A back edge is an edge a → b whose head ‘b’ dominates its tail ‘a’. Given a back edge n → d, we define the natural loop of the edge as ‘d’ together with all the nodes that can reach ‘n’ without going through ‘d’. The loop header is node ‘d’.
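The definitions of “d dom n” and of a back edge can be turned into a short computation. The sketch below uses the common iterative method (repeatedly intersecting the dominator sets of a node's predecessors), which is one possible technique and is not prescribed by this module. The 10-node graph is an assumed example chosen to be consistent with the description of Figure 35.8 in the text; the actual edges of the figure are not reproduced here, and the function name `dominators` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed example graph (successor lists); not taken from Figure 35.8 itself.
succ: Dict[int, List[int]] = {
    1: [2, 3], 2: [3], 3: [4], 4: [3, 5, 6], 5: [7],
    6: [7], 7: [4, 8], 8: [3, 9, 10], 9: [1], 10: [7],
}


def dominators(graph: Dict[int, List[int]], entry: int) -> Dict[int, Set[int]]:
    """Map each node n to the set of nodes d such that d dom n."""
    nodes = set(graph)
    preds: Dict[int, Set[int]] = {n: set() for n in nodes}
    for n, targets in graph.items():
        for t in targets:
            preds[t].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes - {entry}):
            pred_doms = [dom[p] for p in preds[n]]
            common = set.intersection(*pred_doms) if pred_doms else set()
            new = {n} | common
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


dom = dominators(succ, 1)

# An edge a -> b is a back edge when its head b dominates its tail a.
back_edges: List[Tuple[int, int]] = sorted(
    (a, b) for a in succ for b in succ[a] if b in dom[a]
)
# dom[3] == {1, 3}: node 2 is absent because it can be skipped.
# back_edges == [(4, 3), (7, 4), (8, 3), (9, 1), (10, 7)]
```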
Unless two loops have the same header, they are either disjoint or one is nested within the other. A nested loop is an inner loop if it contains no other loop. Algorithm 35.1 gives the construction of the natural loop of a back edge in any control flow graph.

Algorithm 35.1 – Natural loop identification

Input: A flow graph G and a back edge n → d
Output: The set loop consisting of all nodes in the natural loop of n → d

The algorithm places each node of the loop, except ‘d’, on the stack once and then examines its predecessors.

procedure insert(m) {
    if ‘m’ is not in loop then {
        loop := loop ∪ {m};
        push ‘m’ onto stack;
    }
}

stack := empty;
loop := {d};
insert(n);
while stack is not empty {
    pop m;
    for each predecessor p of m do insert(p);
}

Figure 35.10 Natural loop example

Figure 35.10 shows the same CFG with the natural loops indicated. As can be seen, 3-4-3 is a simple natural loop, and so is 7-8-10-7.

35.5 Pre-Headers

Identifying natural loops and dominators helps in loop optimization. As discussed in the previous module, to perform strength reduction, the complex operation is removed from a basic block inside the loop, moved outside that block, and computed before the loop is entered. To facilitate such loop transformations, a compiler often adds a pre-header to a loop. Code motion, strength reduction and other loop transformations populate the pre-header. The layout of the header and pre-header is shown in Figure 35.11. As can be seen, the statements that depend on the loop stay with the header, and the statements that are independent of the loop are moved to the pre-header.
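Algorithm 35.1 and the pre-header step can be sketched together in Python. This is a minimal sketch under stated assumptions: the 10-node graph is an assumed example consistent with the description of Figure 35.10 rather than a copy of it, and the names `predecessors`, `natural_loop`, `add_preheader` and the node name "pre7" are hypothetical. Only the control flow edges are modeled; moving statements into the pre-header is not shown.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, List[Hashable]]

# Assumed example graph (successor lists); not taken from the figure itself.
succ: Graph = {
    1: [2, 3], 2: [3], 3: [4], 4: [3, 5, 6], 5: [7],
    6: [7], 7: [4, 8], 8: [3, 9, 10], 9: [1], 10: [7],
}


def predecessors(graph: Graph) -> Dict[Hashable, Set[Hashable]]:
    """Invert the successor lists."""
    preds: Dict[Hashable, Set[Hashable]] = {node: set() for node in graph}
    for node, targets in graph.items():
        for t in targets:
            preds[t].add(node)
    return preds


def natural_loop(graph: Graph, n: Hashable, d: Hashable) -> Set[Hashable]:
    """Algorithm 35.1: natural loop of the back edge n -> d."""
    preds = predecessors(graph)
    loop: Set[Hashable] = {d}
    stack: List[Hashable] = []

    def insert(m: Hashable) -> None:
        if m not in loop:
            loop.add(m)
            stack.append(m)

    insert(n)
    while stack:
        m = stack.pop()
        for p in preds[m]:
            insert(p)
    return loop


def add_preheader(graph: Graph, loop: Set[Hashable],
                  header: Hashable, name: Hashable) -> Graph:
    """Return a copy of graph in which entries from outside the loop
    reach the header through a new pre-header node."""
    new_graph: Graph = {node: list(ts) for node, ts in graph.items()}
    for node in graph:
        if node not in loop:
            new_graph[node] = [name if t == header else t
                               for t in graph[node]]
    new_graph[name] = [header]   # the pre-header falls through to the header
    return new_graph


loop = natural_loop(succ, 10, 7)                 # {7, 8, 10}
with_pre = add_preheader(succ, loop, 7, "pre7")
# Nodes 5 and 6 now branch to "pre7"; the back edge 10 -> 7 is unchanged.
```

Because ‘d’ is placed in the loop before the search starts, the search never looks at the predecessors of the header, which is what keeps nodes outside the loop from being pulled in.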
