Bachelor Seminar Complexity Analysis of Register Allocation
Markus Koenig
Embedded Systems Group, University of Kaiserslautern
m [email protected]

Abstract

Register allocation is the task of assigning the temporary variables of a program to the registers available in the machine. Optimal register allocation (minimizing the number of registers used) was proved NP-complete by Chaitin et al. by a reduction from the graph coloring problem. In particular, Chaitin et al. showed that for any given graph there exists a program whose interference graph is exactly that graph. In this paper, we study two existing analyses of the complexity of register allocation [5, 1]. [5] proves that although optimal register allocation can be done in polynomial time for programs in static single assignment (SSA) form, the problem remains NP-complete after classical SSA elimination. [1] shows that although register allocation is NP-complete due to its correspondence with the coloring problem, the real complexity arises from the further optimizations of spilling and coalescing and from critical edges. Furthermore, we study a technique for solving combined register allocation and instruction scheduling [4].

1 Introduction

Most programs have to store variables for later use. This makes register allocation important, and it is a reason to perform it as fast as possible. Physical memory is split into two or more levels; here the simple model with main memory and a cache is sufficient. Storing a variable in main memory takes much more time than keeping it in a register, so it should be avoided whenever possible. Such a memory access is called a spill (load/store). Sometimes it is useful to transfer a variable to another register: adding such an instruction is called splitting, and removing such an instruction is called coalescing. So before register allocation is done, it is a good idea to check whether splitting or coalescing can save a spill.
Another problem is to find out which variable is the best candidate for spilling and, more generally, to find the minimum number of variables that have to be spilled. It is also important to mention that the number of registers is part of the input; nevertheless, the algorithm should find the smallest number of registers that need to be allocated.

The SSA (Static Single Assignment) form is used in compilers as a first step before a program is translated into an executable form. In SSA form every variable is defined exactly once.

The second step is the use of instruction scheduling during register allocation. The problems that arise when the two are applied separately will be shown later. The combined method is called CRISP (Combined Register allocation and Instruction Scheduling Problem). To examine the outcome of the combined solution a cost function is used, and a detailed analysis of the individual steps that reorder the basic blocks is needed. With a look at the complexity, an alternative to graph coloring is shown, and finally an experiment provides numbers for comparison with other algorithms.

Section 3.1 introduces the SSA model. Some transformations are applied before the coloring, and hopefully the graph is chordal after the SSA process, so that a simple coloring suffices. Section 3.2 is about the problem structure and takes a closer look at the proof of Chaitin. Coloring an arbitrary graph is NP-complete, but this does not hold in all cases, and by now there are further optimization algorithms. Although the problem is NP-complete, we see that the optimization is useful: it saves one register, as shown in the two example figures. Section 3.3 presents another way to prepare the graph coloring: a combined method that includes instruction scheduling and register allocation. An example that shows the effect of the combined method is given in Figure 8.
In addition, an experiment shows the improvement of this approach over the separate use of instruction scheduling and register allocation. Then some limitations of the different improvements are shown. Unfortunately, in some cases it is impossible to obtain a graph that is easy to color, so the time-expensive spilling and the NP-complete coloring must be done.

2 Related Work

The first NP-completeness proof for register allocation was given by R. Sethi. He modeled the problem as graph coloring, where the variables are the vertices and two vertices are connected by an edge if they are alive at the same time during program execution. The number of available registers is k, the number of possible colors. He used a DAG (directed acyclic graph) and found that the NP-completeness comes from the fact that the order of the instructions in the program code is not fixed. In this first approach we already see that the problem has two parts: on the one hand deciding whether a variable has to be spilled, and on the other hand coloring the graph so that K registers are enough. One of them is defined precisely in the following:

Pereira, F. M. Q., & Palsberg, J. (2006, March). REGISTER ALLOCATION AFTER CLASSICAL SSA ELIMINATION IS NP-COMPLETE. In International Conference on Foundations of Software Science and Computation Structures (pp. 79-93). Springer, Berlin, Heidelberg. [5]

"Core register allocation problem: Instance: a program P and a number K of available registers. Problem: can each of the temporaries of P be mapped to one of the K registers such that temporary variables with interfering live ranges are assigned to different registers?"

When analyzing the problem carefully, it is important to look at the individual steps of the problem instance. First we get a program that consists of instructions with variables. Then, before a graph is modeled and colored, some optimization steps are taken.
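The core register allocation problem quoted above can be illustrated with a minimal sketch. It assumes straight-line code in which each temporary is live over a single interval of program points; the function names `interference_graph` and `k_color` and the example live ranges are hypothetical and not taken from [5]. The exhaustive backtracking search is only meant to make the decision problem concrete; it takes exponential time in the worst case, which is consistent with the NP-completeness of coloring arbitrary graphs.

```python
from itertools import combinations


def interference_graph(live_ranges):
    """Build an interference graph from live ranges.

    live_ranges maps a temporary to a half-open interval (start, end) of
    program points. Two temporaries interfere if their intervals overlap.
    (Assumption: one interval per temporary, i.e. straight-line code.)
    """
    graph = {v: set() for v in live_ranges}
    for a, b in combinations(live_ranges, 2):
        (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
        if s1 < e2 and s2 < e1:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def k_color(graph, k):
    """Return a mapping temporary -> register index in range(k), or None."""
    # Try highly constrained vertices first to prune the search earlier.
    order = sorted(graph, key=lambda v: -len(graph[v]))
    assignment = {}

    def solve(i):
        if i == len(order):
            return True
        v = order[i]
        used = {assignment[n] for n in graph[v] if n in assignment}
        for reg in range(k):
            if reg not in used:
                assignment[v] = reg
                if solve(i + 1):
                    return True
                del assignment[v]  # backtrack
        return False

    return dict(assignment) if solve(0) else None


ranges = {"a": (0, 3), "b": (1, 4), "c": (2, 5), "d": (4, 6)}
graph = interference_graph(ranges)
print(k_color(graph, 2))  # None: a, b and c are pairwise live at the same time
print(k_color(graph, 3))  # a valid assignment using three registers
```

With K = 2 the answer to the decision problem is "no", so at least one temporary would have to be spilled; with K = 3 every temporary receives a register.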
With a special view to the basic blocks there are also interesting things to recognize. There is a similarity between a basic block, which is the smallest part of the program that is analyzed, and the final coloring, which is the biggest step.

Motwani, R., Palem, K. V., Sarkar, V., & Reyen, S. (1995). COMBINING REGISTER ALLOCATION AND INSTRUCTION SCHEDULING. Courant Institute, New York University. [4]

A formulation of the combined register allocation and instruction scheduling problem within a basic block as a single optimization problem, showing that a simple instance of the combined problem (single register, no latencies, single functional unit) is NP-hard, even though instruction scheduling for a basic block with 0/1 latencies on a single pipelined functional unit is not NP-hard.

Figure 1: (a) shows the matrix equation with φ-functions. (b) and (c) show a matrix with its semantics. There is one φ-function for each of the n rows, and each column represents another execution path in the program. [4]

The improvements shorten the time needed for most steps: the graph structure can be improved, possibly to a chordal graph, and the instructions can be ordered favorably. However, this is possible only in some special cases, and even with all the improvements register allocation remains an NP-complete problem.

3 The Solution

First, Section 3.1 introduces the SSA method, which uses interval graphs, SSA-circle graphs and φ-functions. Second, in Section 3.2, we take a closer look at the problem instances and find out which step has which complexity. The last part, Section 3.3, presents another method, in contrast to SSA, for simplifying the program before the graph coloring starts.

3.1 The SSA (Static Single Assignment) Approach

3.1.1 Phi functions

For the SSA form the φ-functions are important. These functions act like a naming system for the variables: they choose the correct name and value for each variable.
They are needed because in SSA form no two definitions may use the same variable name. So if a variable is defined a second time, or in another branch of the program, the φ-function replaces the old name, which was used before, with a new name. Here the syntax is described as a matrix, as modeled by Hack et al. In Figure 1 the φ-functions are evaluated simultaneously at the beginning of each basic block. Because every column corresponds to a separate path in the control flow graph, the variables in a row are independent of each other and can be allocated to the same register.

Figure 2: (a) shows a classical program in SSA form. (b) shows the control flow graph and (c) the program after SSA elimination. The three steps point out the transformation from a normal program into the post-SSA form. [5]

Referring to Figure 2, in (b) we can see in the second block that the variables v11 and i1 interfere but v11 and i2 do not, just as the description of SSA states. The post-SSA form, also called SSA elimination, shown in (c) is the executable program. It is used because φ-functions are not supported in every programming language. This is the reason why the number of variables in the whole program increases but maxlive does not. Because a variable in SSA form may be defined only once, a new variable is always introduced when a value has to be assigned a second time.
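The simultaneous evaluation of the φ-functions at the beginning of a basic block can be sketched as follows. This is a minimal illustration under assumptions: the φ-matrix is represented as a list of rows `(dest, sources)`, where `sources[j]` is the name selected when control arrives along path (column) `j`. The helper `eval_phis` and the variable names are hypothetical and do not come from the cited papers.

```python
def eval_phis(phi_matrix, pred_index, env):
    """Evaluate all phi-functions of a block at once.

    phi_matrix: list of (dest, [source for path 0, source for path 1, ...]).
    pred_index: the column, i.e. the path along which the block was entered.
    env: mapping from variable name to value before entering the block.
    """
    # Read every source in the selected column first ...
    values = [env[sources[pred_index]] for _, sources in phi_matrix]
    # ... then write all destinations, so no phi sees another phi's result.
    new_env = dict(env)
    for (dest, _), value in zip(phi_matrix, values):
        new_env[dest] = value
    return new_env


# Two phi-functions at a loop header that swap their values on the back edge:
#   a2 = phi(a1, b2)
#   b2 = phi(b1, a2)
phis = [("a2", ["a1", "b2"]), ("b2", ["b1", "a2"])]
env = eval_phis(phis, 0, {"a1": 1, "b1": 2})  # first entry: a2 = 1, b2 = 2
env = eval_phis(phis, 1, env)                 # back edge: values are swapped
print(env["a2"], env["b2"])  # 2 1
```

If the two rows were evaluated one after the other instead, the second φ-function would read the already overwritten a2 and both variables would end up with the value 2. SSA elimination has to preserve the simultaneous semantics when it replaces φ-functions by copy instructions.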