Lambda, the Ultimate Label Or a Simple Optimizing Compiler for Scheme

Lambda, the Ultimate Label or A Simple Optimizing Compiler for Scheme William D Clinger* Lars Thomas Hansen Lightship Software 1626 Arthur Place OACIS Eugene OR 97402 Tri-Step lth@cs. uoregon. edu University of Oregon Abstract is designed to reward more sophisticated register targeting, and to make register allocation easy to express by adjusting Optimizing compilers for higher-order languages need not the formal parameter lists of lambda expressions. Its de- be terriblv comdex. The Droblems created bv non-local. “ sign achieves some of the benefits of an intermediate e form non-globrd variables can be eliminated by allocating all such based on continuation-passing style (CP S) without actually variables in the heap. Lambda lifting makes this practical reauirin~ a conversion to CPS [23.1.181. by eliminating all non-local variables except for those that ‘Simp~icit y is achieved by a ‘front e~d whose two passes would have to be allocated in the heap anyway. The elimi- culminate in wholesale lambda iiftingl [3,1]. This lambda nated non-local variables become local variables that can be lifting is made possible by a cascade of simpler optimiza- allocated in registers. Since calls to known procedures are tion, of which the most important is a first order closure just gotos that pass arguments, lifted lambda expressions are analysis known as single assignment analysis. The output just assembly language labels that have been augmented by of the front end is an intermediate e form in which lambda a list of symbolic names for the registers that are live at that expressions are marked to indicate whether they correspond label. to assembly language labels (which represent both register allocation and control), to register allocation (let), or to 1 Introduction closure allocation. Portability is achieved by using a small, stylistically rigid Twobit is a compiler for Scheme in the tradition of Rabbit, subset of IEEE/ANSI Scheme [15] as the intermediate form, Orbit, and Gambit [23,16,9]. Unlike these previous com- in which quoted data in command positions (where the value pilers, which advanced the state of the art in generating is ignored) and stylistic variations convey additional infor- efficient code, the main design goals for Twobit were sim- mation to the code generator. This intermediate form can plicity, portability, and reasonably fast compilation, while be compiled correct~y by any Scheme compiler, but is de- generating code that is good enough for use in fairly high- signed as input to a code generator that will use the encoded performance systems comparable to Chez Scheme, Standard information. ML of New Jersev. and commercial imdementations of Com- The current code generator generates assembly code for a mon Lisp. The~e’ goals have been m’et. The fundamental hypothetical MacScheme machine similar to that described idea on which Twobit is built is that lambda expressions in [4], but register-based instead of stack-based. The Mac- can be viewed as assembly language labels, and the formal Scheme machine instructions have a semantics that was de- .Daramet ers of a lambda exmession. can be viewed as an in- signed for effective peephole optimization. Table-driven op- variant that asserts the contents of general registers that are timizing assemblers currently generate byte code for an in- live at that label. terpreter or machine code for the SPARC. Twobit has been None of the optimizations used in Twobit are new, but used to construct an implementation of Scheme2 known as their combined effectiveness has not been reported previ- Larceny, which has been used for research into the effect of ously, nor has the use of single assignment analysis for first programming style on the performance of Scheme programs order closure analysis. As explained in Section 14, the flow [11]. equation used for lambda lifting in Twobit has some advan- 1Lambda I]fting M called closure-conversion In [1] tages over the similar equations in [1], especially for a simple 2 Larceny is a nearly complete Implementation of IEEE/ANSI compiler. Scheme It is Incomplete mainly because of some bugs In the bignum Twobit is biased toward RISC architectures. The pri- dlv,smn routine mary goal of optimization in the front end is to reduce the problem of generating good code to a matter of register al- Permission to copy without fee all or part of this material is location and targeting. Currently the code generator is a granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the simple, conventional generator that allocates registers as a title of the publication and its date appear, and notice is given stack and performs no register targeting apart from choos- that copying is by permission of the Association of Computing ing an optimal order of evaluation for the operands of a Machinery. To copy otherwise, or to republish, requires a fee procedure call (parallel assignment optimization). Twobit and/or specific permission. * Current affiliation none of the below LISP 94- 6/94 Orlando, Florida USA (3 1994 ACM 0-89791 -643-3/94/0006..$3.50 128 Grammar for output of passes 1, 2, and 3: ((lambda () (begin L> .- (lambda (1-1 . ..) (set! reverse (begin D . ..) (lersbda (.x-2) (quote (R F <decls> <dot>) ((lambda (.1ooP-3) E) (begin 1 (lambda (1-1 . I-rest) (Set! .100p-3 (begin D . ..) (lambda (.x-5 y-5) (quote (R F <decls> <dot>)) (if (null? .x-5) E) y-s -. D> (define I L) (.1ooP-3 (cdr .x-5) -- 1?> (quote X) ; constants (cons (car .x-5) (begin I) ; variable references .y_5))))) / L lambda expressions ((lambda () (.1ooP-3 .x-2 ‘()))))) I (EO El . ..) ; calls ‘# !unspecified) ) ) (set ! I E) assignments ‘reverse) ) ) / (if EO El E2) ; conditionals I (begin EO El E2 . ..) ; sequent ial expressions -- 1> <identifier> Figure2: The output of Pass ]. R> -- ((I <references> <assignments> <calls>) . ..) F> -- (I . ..) Theoutputof Paes lisexpressed using therigidlystyl- ized subset of Scheme shown in Figure 1. Note especially Figure 1: The intermediate form. that avariable misrepresented by the equivalent (begin x), which gives to variable references a list structure that can be shared and side effected. The following invariants hold 2 Overview of Twobit for the output of Pass 1: ● There are no internal definitions. Twobit currently operates as a four-paxs compiler. An op- tional fifth pass is planned but not yet implemented. The ● Noidentifier containing an upper case letter is bound planned passes are any where.3 1. Standardization of syntax. ● Noidentifier is bound in more than one place. 2. Optimization. ● Each Rcontains one entry for every identifier bound 3. Representation inference (not yet implemented). in the formal parameter list and the internal definition list that precede the R. Each entry contains a list 4. Code generation. of pointers to all references to the identifier, a list of pointers to all assignments to the identifier, and a list 5. Assembly. of pointers to all calls to the identifier. These passes will be discussed in order, using a definition of reverse as the main example: ● Except for constants, the expression does not share structure with the original input or itself, except that (define reverse the references and assignments in R are guaranteed to (lambda (x) share structure with the expression. Thus the expres- (define (loop x y) sion may beside effected, and side effects to references (if (null? x) or assignments obtained through R are guaranteed to Y change the references or assignments pointed to by R. (loop (cdr x) (cons (car x) y)))) (loopx ‘()))) ● F is garbage. In Scheme, any of the standard procedures can be re- Theintermediate form is acyclic, anditsprinted form is defined, which in combination with separate compilation genuine Scheme code, but it is large because of the shared means that a compiler cannot generate inline code for calls structure and is hard to read because of all the clutter. Pass to +, car, et cetera. Twobit, like most Scheme compilers, 1 converts the definition of reverse into an intermediate provides acompiler switch (titegrate-usual-procedures) form equivalent to the code shown in Figure 2, which was through which the programmer can promise not to redefine prodnced by a make-readable procedure. a subset of the standard procedures. The examples in this The#!unspecified notation in Figure 2, which stands paper assume that switch is true, which is the default. foracanonical unspecified value, wasprodnced bya particular implementation of the letrec macro. This notation is 3 Pass 1: Standardization of syntax not a standard part of Scheme and is not treated specially by Twobit. Also, Scheme does not permit periods to be- Pass 1 expands macros, eliminates internal definitions, checks gin identifiers, but some such illegal prefix is advisable to syntax, andgives aunique name toeachlocal variable (alpha Sscheme is not case-sensitive. This paper assumes an implement~ conversion). In addition, Pass 1 creates for each local vari- tlon that standardizes variable names to lower case, which allows the able a table R containing all references, assignments, and compiler to use upper case names for its own purpoaefi, The handling procedure calls to that variable. of case is actuslly a parameter to the compiler 129 prevent the renamed local variables from shadowing global Since the single assignment probably resulted from the variables. The particular prefix used by Two bit is a param- elimination of an internal definition during Pass 1, it may eter of the compiler. seem that nothing has been accomplished. The key is that Pass 1 was designed as an extension of an efficient al- an internal definition in the output of Pass 2 records in- gorithm for hygienic macro expansion [6,7], but the current formation gained by a simple closure analysis, whereas an implement at ion still uses non-hygienic macros.

Lambda, the Ultimate Label Or a Simple Optimizing Compiler for Scheme

Cognitive Programming Language (CPL) Programmer's Guide

An Optimized R5RS Macro Expander

A Compiler for a Simple Language. V0.16

Hy Documentation Release 0.12.1+64.G5eb9283

The Evolution of Lisp

Programming Basics - FORTRAN 77

An Industrial Strength Theorem Prover for a Logic Based on Common Lisp

Fortran 90 Programmer's Reference

Control Flow Graphs • Can Be Done at High Level Or Low Level – E.G., Constant Folding

Structured Foreign Types

Chez Scheme Version 5 Release Notes Copyright C 1994 Cadence Research Systems All Rights Reserved October 1994 Overview 1. Funct

Drscheme: a Pedagogic Programming Environment for Scheme