The Design of an Optimizing Compiler
Wulf, Johnsson, Weinstock, Hobbs, and Geschke

and

The Production-Quality Compiler-Compiler Project
Leverett, Cattell, Hobbs, Newcomer, Reiner, Schatz, Wulf

Bliss/11
• an outstanding optimizing compiler for the PDP-11
• a fairly simple systems programming language
• relied heavily on special-case knowledge
• strong emphasis on code selection & limited resources

Comp 12, Spring 1996, Lecture 5

Structure of the Bliss/11 Compiler

    lexsynflo → delay → tla → rank → pack → code → final

The phases
• lexsynflo — scanning, parsing, data-flow analysis
• delay — determines shape & evaluation order, estimates costs
• tla, rank, pack — register allocation (memory–memory ops)
• code — tries to generate optimal code for each expression
• final — produces object code and performs peephole optimizations

lex-syn-flo

lex-syn
• hand-constructed DFA finds lexemes & builds tables
• top-down, recursive-descent parser on a simplified language (an operator language without left recursion)
• uses an explicit, auxiliary stack to construct the tree
• local constant folding in the constructors
• retains extra information to improve error recovery

flo
• flow analysis to identify optimization opportunities
• finds common subexpressions and threads them together
• finds possible code motion (hoisting, loop-invariant code motion) & threads it
• works inside out & incrementally (action routines)

Output is a set of tables, trees, and threaded optimization opportunities

delay

Goals
1. determine the "general shape" of the object code
2. estimate the cost of each "program segment"
3.
determine an evaluation order for each program segment

Engineering
• single treewalk using recursive action routines
  – makes preliminary decisions and passes them down the tree
  – on return, uses the returned context to make final decisions
• decides on the profitability of potential optimizations
• uses Sethi-Ullman numbering, modified to track CSEs created & deleted
• delays unary complements; chooses destroyed operands
• turns multiplies into shifts and adds
• distributes multiplies over adds
• more constant folding

tla (1st part of TNBIND)

Structure
• mutually recursive action routines pass around TNs (temporary names)
• performs targeting, cost estimation, lifetime characterization, runtime stack optimization, and label assignment

Engineering
• targeting — parents suggest names for a child's return value
• costs — sum over references of code size, weighted by nesting depth
• stack — uses "dynamic temporaries" from parameter "pops" (an artifact of PDP-11 idiosyncrasies)

rank (2nd part of TNBIND)

Goals
• assign a priority (rank) to each TN for allocation
• cost is by number of uses, with an arithmetic preference for dense uses

Engineering
Divide references into four categories based on binding requirements:
1. binding to a specific hardware register
2. must be bound to a register
3. must be bound to a memory location
4.
may be bound to memory or a register

Bliss/11 lets the programmer specify categories 1–3 by declaration

pack (3rd part of TNBIND)

Goals
• make good use of the machine's register set
• assign TNs to the same register when possible & profitable

Engineering
• build an interference graph
• build a preference graph
• the interference graph drives allocation
• the preference graph drives assignment
• uses bin packing to find an allocation (equivalent to graph coloring, different terminology)

pack tries to minimize memory ops, moves, and initializations

TNBIND lives on (in spirit) in DEC compilers

code

Goals
• capitalize on all the earlier work
• perform target-specific case analysis
• generate outstanding code

Engineering
• myriad small details
• the original compiler used massive case analysis
• PQCC substituted automatic pattern-matching techniques
• PQCC generated the patterns from machine descriptions

Strong arguments for automating this kind of detailed case analysis:

"In the final analysis, the quality of the local code has a greater impact on both the size and speed of the final program than any other optimization"
Wulf et al., p. 89

final

Goals
• produce the final "listing" and object code
• perform some final optimizations on the results of code

Engineering
• phase 1 examines adjacent instructions (lfp-style)
  – cross-jumping
  – removing code after an unconditional branch
  – simplifying algebraic identities involving literals
  – simplifying tests and compares
  – address-mode manipulations (PDP-11 specific)
  – redundant store elimination
• phase 2 looks at each instruction for simplifications
• phase 3 tries to use short branches whenever possible

It is difficult to determine to what extent the final optimizations would be needed if more complete algorithms, rather than heuristics, existed in earlier phases of the compiler.
We suspect that there will always be a role for a module similar to final in compilers with optimization aspirations similar to those of Bliss/11.
Wulf et al., p. 125
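The delay slide says the compiler uses Sethi-Ullman numbering (modified to track CSEs) to pick an evaluation order. A minimal sketch of the classic, unmodified algorithm, assuming a toy binary expression tree; names here are illustrative, not from the Bliss/11 sources:

```python
# Classic Sethi-Ullman numbering: the minimum number of registers needed
# to evaluate an expression tree without spilling. Bliss/11's delay phase
# used a modified version that also accounted for CSEs created and deleted.

class Node:
    def __init__(self, op, left=None, right=None):
        self.op, self.left, self.right = op, left, right

def su_number(n):
    """Minimum registers to evaluate n with no stores to memory."""
    if n.left is None:                 # a leaf needs one register
        return 1
    if n.right is None:                # a unary op reuses its operand's register
        return su_number(n.left)
    l, r = su_number(n.left), su_number(n.right)
    # If the subtrees differ, evaluate the costlier one first and hold its
    # result in one register while evaluating the other; if they are equal,
    # one extra register is unavoidable.
    return max(l, r) if l != r else l + 1

# (a + b) * (c + d): each sum needs 2 registers, the product needs 3.
tree = Node('*', Node('+', Node('a'), Node('b')),
                 Node('+', Node('c'), Node('d')))
print(su_number(tree))   # -> 3
```

Evaluation order falls out of the same comparison: always evaluate the subtree with the larger number first.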
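delay also "turns multiplies into shifts and adds". A hedged sketch of the basic idea, rewriting x * c from the set bits of a positive constant c; Bliss/11's actual version was cost-driven and this helper is invented for illustration:

```python
# Strength reduction of multiplication by a constant: x * c becomes a sum
# of shifts, one per set bit of c. Assumes c > 0. A sketch of the idea,
# not the Bliss/11 case analysis.

def mul_to_shift_add(x_name, c):
    """Return a source-level expression equal to x_name * c."""
    terms = [f"({x_name} << {b})" for b in range(c.bit_length())
             if c & (1 << b)]
    return " + ".join(terms)

print(mul_to_shift_add("x", 10))   # -> (x << 1) + (x << 3)
```

A real compiler would compare the cost of the shift/add sequence against a hardware multiply before committing, which is exactly the kind of profitability decision the slide assigns to delay.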
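The pack slide describes allocation driven by an interference graph, with the slide noting that its bin-packing view is equivalent to graph coloring. A toy greedy sketch of that equivalence, assuming TNs arrive already sorted by rank; all names are illustrative:

```python
# Interference-graph-driven allocation: two TNs that are live at the same
# time (they "interfere") may not share a register. Greedily give each TN,
# in rank order, the first register not used by a neighbor; if none is
# free, it falls to memory. A sketch, not the TNBIND packing algorithm.

def pack(tns, interferes, registers):
    """Map each TN to a register, or to 'memory' if none fits."""
    assignment = {}
    for tn in tns:                                   # tns sorted by rank
        taken = {assignment.get(o) for o in interferes.get(tn, ())}
        free = [r for r in registers if r not in taken]
        assignment[tn] = free[0] if free else "memory"
    return assignment

interferes = {"t1": {"t2"}, "t2": {"t1", "t3"}, "t3": {"t2"}}
print(pack(["t1", "t2", "t3"], interferes, ["r0", "r1"]))
# -> {'t1': 'r0', 't2': 'r1', 't3': 'r0'}
```

The preference graph of the slide would refine the choice within `free`, steering TNs connected by moves toward the same register so the moves disappear.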
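Two of final's phase-1 rules can be sketched over a toy instruction list: deleting unreachable code after an unconditional branch (up to the next label) and dropping a store that is immediately overwritten by another store to the same location. The instruction syntax here is invented for the sketch and is not the PDP-11 encoding final actually worked on:

```python
# A sketch of two adjacent-instruction peephole rules from final's phase 1:
# dead code after an unconditional branch, and redundant store elimination.
# Toy syntax: "store <reg> <loc>", "br <label>", labels written "L<n>:".

def peephole(code):
    out = []
    dead = False
    for ins in code:
        if ins.startswith("L"):        # a label ends the unreachable region
            dead = False
        if dead:
            continue                   # unreachable after a branch: drop it
        # redundant store: the previous store to the same location is
        # overwritten before it could be read
        if (out and out[-1].startswith("store ") and ins.startswith("store ")
                and out[-1].split()[-1] == ins.split()[-1]):
            out.pop()
        out.append(ins)
        if ins.startswith("br "):      # unconditional branch starts dead code
            dead = True
    return out

prog = ["store r0 x", "store r1 x", "br L2", "add r0 r1", "L2:", "ret"]
print(peephole(prog))
# -> ['store r1 x', 'br L2', 'L2:', 'ret']
```

Both rules only inspect adjacent instructions, which is why the slide's caveat holds: a stronger earlier phase could make some of this cleanup unnecessary, yet cheap local rules like these keep paying off regardless.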