The Design of an Optimizing Compiler
Wulf, Johnsson, Weinstock, Hobbs, and Geschke
and
The Production-Quality Compiler-Compiler Project
Leverett, Cattell, Hobbs, Newcomer, Reiner, Schatz, and Wulf
Bliss/11
• an outstanding optimizing compiler for the PDP-11
• a fairly simple systems programming language
• relied heavily on special-case knowledge
• strong emphasis on code selection & limited resources
Comp 12, Spring 1996, Lecture 5

Structure of the Bliss/11 Compiler
lexsynflo → delay → tla → rank → pack → code → final
The phases

• lexsynflo — scanning, parsing, data-flow analysis
• delay — determines shape & evaluation order, estimates costs
• tla, rank, pack — register allocation (memory–memory ops)
• code — tries to generate optimal code for each expression
• final — produce object code and perform peephole optimizations
lex-syn-flo

lex-syn
• hand-constructed dfa finds lexemes & builds tables
• top-down, recursive-descent parser on simplified language (operator language w/o left recursion)
• uses an explicit, auxiliary stack to construct tree
• local constant folding in constructors
• retains extra information to improve error recovery
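The "local constant folding in constructors" point can be sketched as a node constructor that folds an operation to a literal when both operands are already literals. This is an illustrative sketch, not the Bliss/11 tree representation; all names here are hypothetical.

```python
class Leaf:
    """A literal leaf in the expression tree."""
    def __init__(self, value):
        self.value = value

class Node:
    """An interior operator node."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

OPS = {'+': lambda a, b: a + b,
       '-': lambda a, b: a - b,
       '*': lambda a, b: a * b}

def make_node(op, left, right):
    """Build an expression node, folding it to a literal leaf when
    both operands are already literal leaves."""
    if isinstance(left, Leaf) and isinstance(right, Leaf):
        return Leaf(OPS[op](left.value, right.value))
    return Node(op, left, right)
```

Because the folding happens in the constructor, no separate pass is needed: `make_node('+', Leaf(2), Leaf(3))` yields `Leaf(5)` directly as the tree is built.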
flo
• flow analysis to identify optimization opportunities
• finds common subexpressions and threads them together
• finds possible code motion (hoisting, licm) & threads them
• works inside out & incrementally (action routines)
Output is a set of tables, trees, and threaded optimization opportunities
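The "threading" of common subexpressions can be pictured as linking every occurrence of the same (operator, operands) key onto its first occurrence. This is a toy sketch of the idea, not flo's actual table format; the function name and tuple representation are assumptions.

```python
def thread_cses(exprs):
    """exprs: list of (op, arg1, arg2) tuples in evaluation order.
    Returns a dict mapping each expression key to the list of indices
    where it occurs more than once -- the 'thread' of redundant
    computations a later phase can decide to share."""
    threads = {}
    for i, e in enumerate(exprs):
        threads.setdefault(e, []).append(i)
    # keep only genuinely common subexpressions
    return {k: v for k, v in threads.items() if len(v) > 1}
```

Note that flo only records the opportunity; per the source, it is delay that later decides whether exploiting a given thread is actually profitable.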
delay
Goals
1. determine the “general shape” of the object code
2. estimate cost of each “program segment”
3. determine an evaluation order for each program segment
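The evaluation-order goal is classically attacked with Sethi–Ullman numbering, which delay uses in modified form (tracking cse's created and deleted). A minimal sketch of the unmodified labeling, which computes the registers needed to evaluate each subtree without spills — the heavier-labeled child is then evaluated first:

```python
def label(node):
    """Sethi-Ullman label of an expression tree.
    node is ('leaf', name) for an operand, or (op, left, right).
    The label is the number of registers needed to evaluate the
    subtree with no stores to memory."""
    if node[0] == 'leaf':
        return 1
    l, r = label(node[1]), label(node[2])
    # equal labels force an extra register to hold the first result
    return max(l, r) if l != r else l + 1
```

For example, `a + (b * c)` labels to 2: evaluating the multiply first leaves one register holding its result while `a` is loaded.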
Engineering
• single treewalk using recursive action routines
  – makes preliminary decisions and passes them down tree
  – on return, uses returned context to make final decisions
• decides on profitability of potential optimizations
• uses Sethi–Ullman numbering, modified to track cse’s created & deleted
• delays unary complements; chooses destroyed operands
• converts multiplies into shifts and adds
• distributes multiply over adds
• more constant folding
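The multiply-into-shifts-and-adds rewrite rests on the binary decomposition of the constant: one shift (and add) per set bit. A sketch, with a hypothetical helper name — a real compiler would also weigh the shift/add sequence against the cost of the machine's multiply instruction:

```python
def expand_multiply(c):
    """Return the shift amounts for each set bit of the non-negative
    constant c, so that x * c == sum(x << s for s in shifts).
    The compiler would emit one shift per term, combined with adds,
    when that sequence beats a hardware multiply."""
    shifts, bit = [], 0
    while c:
        if c & 1:
            shifts.append(bit)
        c >>= 1
        bit += 1
    return shifts
```

So `x * 10` becomes `(x << 1) + (x << 3)` — two shifts and an add.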
tla — 1st part of TNBIND
Structure
• mutually recursive action routines pass around TNs
• performs targeting, cost estimation, lifetime characterization, runtime stack optimization, and label assignment
Engineering
• targeting — parents suggest names for child’s return value
• costs — sum over refs of code size, weighted by nesting depth
• stack — use “dynamic temporaries” from parameter “pops” (an artifact of PDP-11 idiosyncrasies)
rank — 2nd part of TNBIND
Goals
• assign priorities (ranks) to each TN for allocation
• costs by number of uses, with arithmetic preference to dense uses
Engineering

Divide references into four categories based on binding requirements:
1. must be bound to a specific hardware register
2. must be bound to a register
3. must be bound to a memory location
4. may be bound to memory or a register
Bliss/11 allows programmer to specify 1–3 by declaration
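A rank computation in this spirit sums a TN's references with each reference weighted by its loop-nesting depth, so that uses inside loops dominate the priority order. The function name and the weight base (8 per nesting level) are assumptions for illustration, not rank's actual formula.

```python
def rank_tns(refs):
    """refs: list of (tn_name, loop_depth) reference records.
    Each reference at nesting depth d contributes 8**d to its TN's
    score.  Returns TN names from highest to lowest priority, the
    order in which pack would consider them for registers."""
    score = {}
    for tn, depth in refs:
        score[tn] = score.get(tn, 0) + 8 ** depth
    return sorted(score, key=score.get, reverse=True)
```

A TN used once inside a loop thus outranks one used twice at top level, matching the slide's "preference to dense uses."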
pack — 3rd part of TNBIND
Goals
• make good use of the machine’s register set
• assign TNs to the same register when possible & profitable
Engineering
• build an interference graph
• build a preference graph
• interference graph drives allocation
• preference graph drives assignment
• uses bin packing to find an allocation (equivalent to graph coloring, different terminology)

pack tries to minimize memory ops, moves, and initialization
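The bin-packing view can be sketched with lifetimes as intervals: two TNs interfere when their lifetimes overlap, and each TN takes the first register no live neighbor holds. This toy allocator ignores pack's preference graph and cost model; all names are illustrative.

```python
def allocate(lifetimes, nregs):
    """lifetimes: {tn: (start, end)} half-open intervals.
    Returns {tn: register} with interfering TNs in distinct
    registers; a TN that does not fit spills to memory and is
    mapped to None."""
    def overlap(a, b):
        return a[0] < b[1] and b[0] < a[1]

    assign = {}
    for tn in sorted(lifetimes, key=lambda t: lifetimes[t][0]):
        # registers held by already-placed TNs whose lifetimes overlap
        busy = {assign[o] for o in assign
                if overlap(lifetimes[tn], lifetimes[o])}
        free = [r for r in range(nregs) if r not in busy]
        assign[tn] = free[0] if free else None
    return assign
```

Disjoint lifetimes share a register automatically — the "bin" is reused once its occupant dies — which is the same observation graph coloring expresses as non-adjacent nodes sharing a color.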
TNBIND lives on (in spirit) in DEC compilers
code
Goals
• capitalize on all the earlier work
• perform target-specific case analysis
• generate outstanding code
Engineering
• myriad small details
• original compiler used massive case analysis
• PQCC substituted automatic pattern matching techniques
• PQCC generated patterns from machine descriptions
Strong arguments for automating this kind of detailed case analysis
“In the final analysis, the quality of the local code has a greater impact on both the size and speed of the final program than any other optimization” Wulf et al., p. 89
final
Goals
• produce the final “listing” and object code
• perform some final optimizations on results of code
Engineering
• phase 1 examines adjacent instructions (lfp-style)
  – cross-jumping
  – removing code after unconditional branch
  – simplify algebraic identities involving literals
  – simplify tests and compares
  – address-mode manipulations (PDP-11 specific)
  – redundant store elimination
• phase 2 looks at each instruction for simplifications
• phase 3 tries to use short branches whenever possible
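Two of the phase-1 rewrites can be sketched over a linear instruction list: deleting instructions that follow an unconditional branch until the next label, and removing self-moves (a trivial stand-in for redundant store elimination). The tuple encoding and opcode names are assumptions for illustration.

```python
def peephole(insns):
    """insns: list of (opcode, *operands) tuples.
    Drops unreachable instructions after an unconditional branch
    ('br') until the next ('label', name), and deletes 'mov r,r'
    no-ops.  Returns the simplified instruction list."""
    out, dead = [], False
    for insn in insns:
        if insn[0] == 'label':
            dead = False          # a label makes code reachable again
        if dead:
            continue              # unreachable: discard
        if insn[0] == 'mov' and insn[1] == insn[2]:
            continue              # self-move is a no-op
        out.append(insn)
        if insn[0] == 'br':
            dead = True           # everything until a label is dead
    return out
```

Like final itself, this pass only inspects the local instruction stream; it needs no tables from the earlier phases.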
“It is difficult to determine to what extent the final optimizations would be needed if more complete algorithms, rather than heuristics, existed in earlier phases of the compiler. . . . we suspect that there will always be a role for a module similar to final in compilers with optimization aspirations similar to those of Bliss/11.” Wulf et al., p. 125