MODULE 35 – Loops in Flow graphs

After discussing the function preserving transformations and loop optimizations in the previous module, we will try to extend the optimization technique that would be carried out in the level. Peephole optimization technique is carried out at the assembly language level by looking at the code through a small window. We will also discuss the terminologies and concepts involved in loops and control flow graphs.

35.1 Peephole Optimization

Peephole optimization technique is carried out at the assembly language level. This optimization technique examines a short sequence of target instructions in a window (peephole) and replaces the instructions by a faster and/or shorter sequence when possible. Peephole optimization can also be carried out at the intermediate code level. The typical optimizations that are carried out using peephole optimization techniques are the following

• Redundant instruction elimination

• Flow-of-control optimizations

• Algebraic simplifications

• Use of machine idioms

35.1.1 Redundant instruction elimination

This optimization technique eliminates redundant loads and stores. Consider the following sequence of instructions which are typical of the simple code generator algorithm that was discussed in one of the previous modules:

MOV R0,a MOV a,R0

When this sequence is observed through a peephole, the second instruction can be deleted provided, if it is not labeled with a target label. Peephole represents sequence of instructions with at most one entry point. The first instruction can also be deleted by looking at the next-use information, if live (a)=false.

35.1.2 Deleting Unreachable Code

Codes that are never to be reached during a control flow could be removed. This optimization can be carried out at the intermediate code level or final code level. Unlabeled blocks can be removed along with their instructions. Consider figure 35.1, where, the condition “0 == 0” is always true and hence, we will end up going to block L2. The block that starts with the instruction, “b:=x+y” is never reachable and can be removed from the control flow graph.

Figure 35.1 Example for unreachable code.

35.1.3 Branch Chaining

This is carried out at the intermediate code level. The idea behind this optimization is to shorten the length of branches by modifying target labels.

Figure 35.2 Branch chaining example

Consider figure 35.2, where the code has two labels. If the condition, “a==0” is true then the control switches to label L2. Label L2 transfer the control to label L3. This is referred to as branch over branch. This could be avoided by having the initial branch to L3 and thus the instruction with the label L2 can be removed.

35.1.4 Flow-of-Control Optimizations

Figure 35.3 Flow-of control optimization example

Consider figure 35.3, where the sequential flow is obstructed by having a goto L1 statement, which essentially converts this to a sequential statement. These statements could be removed and thus resulting in merging the basic blocks into a single block and removing the statement “goto L1”.

35.1.4 Algebraic Simplification

Peephole optimization also incorporates optimization to impact by replacing complex computations with cheaper ones. Consider figure 35.4, where the exponentiation function is replaced with a multiplication and dividing by 8 is replaced with right shifting 3 times. This is carried out at the intermediate code level.

Figure 35.4 Example for algebraic simplification

In addition, mathematic identities could also be used to simplify the generated code at the intermediate level. Consider figure 35.5, were instructions involving mathematic identities have been removed as they do not impact the computation. As we know multiplicative identity is “1” and will not change the value of the LHS variable “b” and additive identity is “0” and will not change the value of the LHS variable “a”.

Figure 35.5 Algebraic simplification using mathematical identities

On the other hand, algebraic simplification could be carried out at the assembly language level where machine idioms are used to generate optimized code. As shown in figure 35.6, the addition by “1” instruction is replaced with the increment operator.

Figure 35.6 Machine idioms usage example

35.2 Loops

As discussed in the previous module, loops are major areas that need optimization. Control Flow Graph (CFG) is a graph, which may contain loops, known as strongly-connected-components (SCC). Generally, a loop is a directed graph, whose nodes can reach all other nodes along some path. This includes “unstructured” loops, with multiple entry and multiple exit points. A structured loop which is called as a normal loop has one entry point, and (generally) a single point of exit. Loops created by mapping high-level source programs to intermediate code or assembly code are normal. A “goto” statement can create any loop on the other hand a “break” statement creates additional exits.

Figure 35.7 (a) and (b) Loops example

As can be seen in figure 35.7, consider each node as a basic block. If we are to determine the number of loops, it is a little complicated. In figure 35.7 (a) nodes 2,3,4,5 all form part of the loop. The nodes in the path “2 – 3- 4 – 2” will form a loop and so is the path that has nodes “3-5- 3”. The nodes, “2,3,4,5” is said to be an unstructured loop. On the other hand, consider figure 35.7 (b), the path “2-3-2” and “2-4-2” are structured loops while, “2-3-2-4-2” will be an unstructured loop. In the following section, we will try to find out such structured and unstructured loops in any flow graph which could be further used for optimization.

35.3 Determining Loops in Flow Graphs: Dominators

To discuss the loops in flow graphs an important concept called as dominators need to be studied. Dominators are used to identify the leaders of a control flow graph and to optimize loops. Given two nodes, ‘d’ and ‘n’, we say, node ‘d’ dominates ‘n’, denotes as “d dom n”, if every path from the initial node of the control flow graph (CFG) to ‘n’ goes through ‘d’. The loop entry dominates all nodes in the loop. The immediate dominator ‘m’ of a node ‘n’ is the last dominator on the path from the initial node to ‘n. If d  n and “d dom n” then “d dom m”.

Dominator Trees are used to indicate which node dominates the other nodes. Consider figure 35.8, having nearly 10 nodes. The root of the CFG is node “1”. Thus without reaching 1, we cannot go to any other node. Thus 1 dominates all other nodes. Consider node 2, which is reached from 1. Node 3 is also reached from 1. Thus 2 is not a dominator node as this node could be skipped to reach node 3. After reaching node 3, this is the key to reach all nodes and there is a simple edge from node 3 to 4 and thus resulting in node 4 being dominant to all the nodes that are below node 4.

Figure 35.8 Control flow graph for dominators

Proceeding in a similar fashion, the dominator tree is constructed that indicates the dominating node as a parent and the dominated nodes as the children. This is shown in figure 35.9.

Figure 35.9 Dominator tree

As can be seen, node 1 dominates all nodes and its immediate children are 2, 3. 3 dominate all other nodes. Similarly, node 4 dominates all other nodes and so is node 7 and 8 respectively.

35.4 Natural Loops The next work is to identify simple or natural loops. A back edge is an edge a  b whose head ‘b’ dominates its tail ‘a’. Given a back edge n  d, we define the natural loop as one that consists of ‘d’ and the nodes that can reach ‘n’ without going through ‘d’. The loop header is identified as node. Unless two loops have the same header, they are disjoint or one is nested within the other. A nested loop is an inner loop if it contains no other loop. Algorithm 35.1 elucidates the construction of natural loop for any control flow graph.

Algorithm 35.1 – Natural loop identification Input: A flow graph G and a back edge n  d

Output: Set loop consisting of all nodes in the natural loop of n d

The algorithm, tries to examine each node in loop except for “d” which is placed on stack and its predecessors are examined. Procedure (insert m) { { if’ ‘m’ is not in loop then loop := loop U {m}; push ‘m’ onto stack; } stack := empty loop := {d}; insert(n); While stack not empty { pop m; for each predecessor p of m do insert(p); } }

Figure 35.10 Natural loop example

Figure 35.10 shows the same CFG with the indication of the natural loops. As can be seen, 3-4-3 is a simple natural loop. So is 7-8-10-7 is a loop.

35.5 Pre-Headers

Identifying natural loops and dominators help in . As discussed in the previous module, to perform strength reduction, the complex operation is removed from a basic block and is moved outside the block and is computed before entering a basic block. To facilitate loop transformations, a often adds a pre-header to a loop. Code motion, strength reduction, and other loop transformations populate the pre-header. The layout of the header and pre-header is shown in figure 35.11. As can be seen, the loops everything stays with the header and the statements that are independent is moved to the pre-header.

Figure 35.11 Header and Preheader

35.6 Reducible and Structured flow graphs

A control flow graph is typically constructed by identifying leaders and target labels of loops. Hence, there is a possibility for this control flow graph to be reduced to make it an optimized one. A control flow graph is said to be reducible, if the forward edges form an acyclic graph where every node can be reached from the initial node G. In addition, if the back edges consists of edges whose heads dominate their tails the graph is said to be reducible.

Figure 35.12 Reducible and irreducible control flow graph

As can be seen from figure 35.12, the first graph satisfies both conditions and hence can be reduced. On the other hand, the second graph, cannot be reduced as we cannot infer whether node 2 dominates node 3 or vice-versa.

A CFG is well-structured and reducible iff all its loops are natural loops characterized by their back edges. In a well-structured control-flow graph there are no jumps into the middles of loops. Each loop is entered only through its header. CFG derived from programs using structured flow- of-control statements such as if-then-else, while-do, continue, and break statements are always well-structured. Many dataflow analysis algorithms work only on well-structured CFGs.

35.6 Control flow graph Synthesis

A Control Flow Graph (CFG) of some program p , named CFG(p), is a static abstraction of p, in which each node represents a Basic Block (B). Edges connecting the nodes in CFG represent the control flow from any one basic block to its successors. A CFG only represents the static control flow. Thus it is not mandatory to store, which of the 2 successors in an “if-else expression” is connected by the true condition. We just have to show 2 successors as the control flow from this “if-else” block.

The CFG Algorithm cfg_build(pc)

Aside from its parameter pc, input to the CFG Algorithm cfg_build() is a list of instructions which is broken into Basic Blocks. One of these blocks holds the select entry instruction at address: pc. The CFG Algorithm creates a new cfg node for each basic block . The CFG algorithm is given in algorithm 35.2

Algorithm 35.2 CFG Synthesis algorithm

Input: The program split into basic blocks.

Output: Control flow graph which is optimized

CFG_Build(pc) {

• For each successor ‘s’ of a Basic Block(n), cfg_build() installs a directed edge from Basic Block(n) to s.

• During this process each basic block is labeled as reached. At completion, all basic blocs are inspected; those not reached are filtered out as unreachable blocks, hence each of its instructions is unreachable code. Thus, the control flow graph building algorithm is used for code optimization.

• Output of cfg_build() is a pointer to the cfg node associated with address pc }

Summary: In this module we looked at an important optimization technique namely peephole optimization. We also looked at the basic properties of flow graphs and definitions which is necessary for data flow analysis