Control Flow Analysis in Scheme

Olin Shivers Carnegie Mellon University [email protected]

Abstract

Traditional flow analysis techniques, such as the ones typically employed by optimising Fortran compilers, do not work for Scheme-like languages. This paper presents a flow analysis technique, control flow analysis, which is applicable to Scheme-like languages. As a demonstration application, the information gathered by control flow analysis is used to perform a traditional flow analysis problem, induction variable elimination. Extensions and limitations are discussed.

The techniques presented in this paper are backed up by working code. They are applicable not only to Scheme, but also to related languages, such as Common Lisp and ML.

To appear at the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation, Atlanta, Georgia, June 22-24, 1988.

1 The Task

Flow analysis is a traditional optimising compiler technique for determining useful information about a program at compile time. Flow analysis determines path invariant facts about points in a program. A flow analysis problem is a question of the form:

    "What is true at a given point p in my program, independent of the execution path taken to p from the start of the program?"

Example domains of interest might be the following:

- Range analysis: What is the range of values that a given reference to an integer variable is constrained to lie within? Range analysis can be used, for instance, to do array bounds checking at compile time.

- Loop invariant detection: Do all possible prior assignments to a given variable reference lie outside its containing loop?

Over the last thirty years, standard techniques have been developed to answer these questions for the standard imperative, Algol-like languages (e.g., Pascal, C, Ada, Bliss, and chiefly Fortran). Representative texts describing these techniques are [Dragon], and in more detail, [Hecht]. Flow analysis is perhaps the chief tool in the optimising compiler writer's bag of tricks; an incomplete list of the problems that can be addressed with flow analysis includes global common subexpression elimination, loop invariant detection, redundant assignment detection, constant propagation, range analysis, code hoisting, induction variable elimination, copy propagation, live variable analysis, and loop jamming.

However, these traditional flow analysis techniques have never successfully been applied to the Lisp family of computer languages. This is a curious omission. The Lisp community has had sufficient time to consider the problem. Flow analysis dates back at least to 1960 ([Dragon], pp. 516), and Lisp is one of the oldest computer programming languages currently in use, rivalled only by Fortran and COBOL.

Indeed, the Lisp community has long been concerned with the execution speed of their programs. Typical Lisp programs, such as large AI systems, are both interactive and cycle-intensive. AI researchers often find their research efforts frustrated by the necessity of waiting several hours for their enormous Lisp-based production system or back-propagation network to produce a single run. Since Lisp users are willing to pay premium prices for special computer architectures specially designed to execute Lisp rapidly, we can safely assume they are even more willing to consider compiler techniques that apply generally to target machines of any nature.

Finally, the problems addressed by flow analysis are relevant to Lisp. None of the flow analysis problems listed above are restricted to arithmetic computation; they apply just as well to symbolic processing. Furthermore, Lisp opens up new domains of interest. For example, Lisp permits weak typing and run time type checking. Type information is not statically apparent, as it is in the Algol family of languages. Type information is nonetheless extremely important to the efficient execution of Lisp programs, both to remove run time safety checks of function arguments, and to open code functions that operate on a variety of argument types. Thus flow analysis offers a tempting opportunity to perform type inference from the occurrence of calls to type predicates in Lisp programs.

So it is clear that flow analysis has much potential with respect to compiling Lisp programs. Unfortunately, this potential has not been realised because Lisp is sufficiently different from the Algol family of languages that the traditional techniques



developed for them are not applicable.

Dialects of Lisp can be contrasted with the Algol family in the following ways:

- Binding versus assignment: Both classes of language have the same two mechanisms for associating values with variables: parameter binding and variable assignment. However, there are differences in frequency of usage. Algol-family languages tend to encourage the use of assignment statements; Lisp languages tend to encourage binding.

- Functions as first class citizens: Functions in modern Lisps are data that can be passed as arguments to procedures, returned as values from function calls, stored into arrays, etc.

Since traditional flow analysis techniques tend to concentrate on tracking assignment statements, it's clear that the Lisp emphasis on variable binding changes the complexion of the problem. Generally, however, variable binding is a simpler, weaker operation than the extremely powerful operation of side effecting assignments. Analysis of binding is a more tractable problem than analysis of assignments, because the semantics of binding are simpler, and there are more invariants for program-analysis tools to invoke. In particular, the invariants of the λ-calculus are usually applicable to modern Lisps.

On the other hand, the higher-order, first-class nature of Lisp functions can hinder efforts even to derive basic flow analysis information from Lisp programs. I claim it is this aspect of Lisp which is chiefly responsible for the mysterious absence of flow analysis from Lisp compilers to date. A brief discussion of traditional flow analytic techniques will show why this is so.

2 The Problem

Consider the following piece of Pascal code:

    FOR i := 0 TO 30 DO BEGIN
        s := a[i];
        IF s < 0 THEN
            a[i] := (s+4)^2
        ELSE
            a[i] := cos(s+4);
        b[i] := s+4;
    END

Flow analysis requires construction of a control flow graph for the code fragment (fig. 1).

[Figure 1: Control flow graph. Basic blocks: START; i:=0; the loop test i<=30?, exiting to STOP; s:=a[i]; the test s<0?; the two arms a[i]:=(s+4)^2 and a[i]:=cos(s+4); and the dashed block b[i]:=s+4; i:=i+1, which loops back to the test.]

Every vertex in the graph represents a basic block of code: a sequence of instructions such that the only branches into the block are branches to the beginning of the block, and the only branches from the block occur at the end of the block. The edges in the graph represent possible transfers of control between basic blocks. Having constructed the control flow graph, we can use graph algorithms to determine path invariant facts about the vertices.

In this example, for instance, we can determine that on all control paths from START to the dashed block (b[i]:=s+4; i:=i+1), the expression s+4 is evaluated with no subsequent assignments to s. Hence, by caching the result of s+4 in a temporary, we can eliminate the redundant addition in b[i]:=s+4. This information is arrived at through consideration of the paths through the control flow graph.

The problem with Lisp is that there is no static control flow graph at compile time. Consider the following fragment of Scheme code:

    (let ((f (foo 7 g k))
          (h (aref a7 i j)))
      (if (< i j) (h 30) (f h)))

Consider the control flow of the if expression. After evaluating the conditional's predicate, control can transfer either to the function that is the value of h, or to the function that is the value of f. But what's the value of f? What's the value of h? Unhappily, they are computed at run time.

If we knew all the functions that h and f could possibly be bound to, independent of program execution, we could build a control flow graph for the code fragment. So, if we wish a control flow graph for a piece of Scheme code, we need to answer the following question: for every function call in the program, what are the possible lambda expressions that call could be a jump to? But this is a flow analysis question! So with regard to flow analysis in Lisp, we are faced with the following unfortunate situation:

- In order to do flow analysis, we need a control flow graph.

- In order to determine control flow graphs, we need to do flow analysis.

Oops.

3 CPS: The Hedgehog's Representation

    "The fox knows many things, but the hedgehog knows one great thing."
        -- Archilochus

The first step towards finding a solution to this conundrum is to develop a representation for our Lisp programs suitably adapted to the task at hand. In this section, we will develop an intermediate representation language, CPS Lisp, which is well suited for doing flow analysis and representing Lisp programs. We will handle the full syntax of Lisp, but at one remove:

"standard" Lisp is mapped into a much simpler, restricted subset of Lisp, which has the effect of greatly simplifying the analysis.

In Lisp, we must represent and deal with transfers of control caused by function calls. This is most important in the Scheme dialects [R3-Report], where lambda expressions occur with extreme frequency. In the interests of simplicity, then, we adopt a representation where all transfers of control (sequencing, looping, function call/return, conditional branching) are represented with the same mechanism: the tail recursive function call. This representation is called CPS, or Continuation Passing Style, and is treated at length in [Declarative].

CPS stands in contrast to the intermediate representation languages commonly chosen for traditional optimising compilers. These languages are conventionally some form of slightly cleaned-up assembly language: quads, three-address code, or triples. The disadvantage of such representations is their ad hoc, machine-specific, and low-level semantics. The alternative of CPS was first proposed by Steele in [Rabbit], and further explored by Kranz, et al. in [ORBIT]. The advantages of CPS lie in its appeal to the formal semantics of the λ-calculus, and its representational simplicity.

CPS conversion can be referred to as the "hedgehog" approach, after the quotation from Archilochus. All control and environment structures are represented in CPS by lambda expressions and their application. After CPS conversion, the compiler need only know "one great thing": how to compile lambdas very well. This approach has an interesting effect on the pragmatics of Scheme compilers. Basing the compiler on lambda expressions makes lambda expressions very cheap, and encourages the programmer to use them explicitly in his code. Since lambda is a very powerful construct, this is a considerable boon for the programmer.

In CPS, the only expressions that can appear as arguments to a function call are constants, variables, and lambda expressions. Standard non-CPS Lisp can be easily transformed into an equivalent CPS program, so this representation carries no loss of generality. Once committed to CPS, we make further simplifications:

- No special syntactic forms for conditional branch. The semantics of primitive conditional branch is captured by the primitive function %if, which takes one boolean argument, and two continuation arguments. If the boolean is true, the first continuation is called; otherwise, the second is called.

- No special labels or letrec syntax for mutual recursion. Mutual recursion is captured by the primitive function Y, which is the "paradoxical combinator" of the λ-calculus.

- Side effects to variables are not allowed. We allow side effects to data structures only. Hence side effects are handled by primitive functions, and special syntax is not required.

Lisp code violating any of these restrictions is easily mapped into equivalent Lisp code preserving them, so they carry no loss of generality. In point of fact, the front end of the ORBIT compiler [ORBIT] performs the transformation of standard Scheme code into the above representation as the first phase of compilation.

These restrictions leave us with a fairly simple language to deal with. There are only five syntactic entities: lambdas, variables, primops, constants and calls (fig. 2), and two basic semantic operations: abstraction and application.

In CPS, function calls are tail recursive. That is, they do

not return; they are like GOTOs. If we are interested in the value computed by a function f of two values, we must pass f a third argument, the continuation. The continuation is a function; after f computes its value v, instead of "returning" the value v, it calls the continuation on v. Thus the continuation represents the control point which will be transferred to after the execution of f.

For example, if we wish to print the value computed by (x + y) * (z + w), we do not write:

    (print (* (+ x y) (+ z w)))

Instead, we write:

    (+ x y (lambda (xy)
      (+ z w (lambda (zw)
        (* xy zw print)))))

Here, the primitive operations +, *, and print are all redefined to take a continuation as an extra argument. The first + function calls its third argument, (lambda (xy) ...), on the sum of its first two arguments, x and y. Likewise, the second + function calls its third argument, (lambda (zw) ...), on the sum of its first two arguments, z and w. And the * function calls its third argument, the print function, on the product of its first two arguments, x + y and z + w. Continuation passing style has the following invariant: control never returns to the point of a call; it always moves forward, into a continuation.

    Figure 2: CPS language grammar

    - Lambda expressions: (lambda var-set call)
    - Variable references: foo, bar, ...
    - Constants: 3, "doghen", '(a 3 elt list), ...
    - Primitive Operations: +, %if, Y, ...
    - Function calls: (fun arg1 ... argn), where fun is a lambda, var, or primop, and the args are lambdas, vars, or constants.

In detail:

- Lambda Expressions: A lambda expression has the syntax (lambda var-set call), where var-set is a list of variables, e.g. (n m a), and call is a function call. A lambda expression denotes a function.

- Function Calls: A function call has the syntax (fun arg1 ... argn) for n >= 0. fun must be a function, and the argi must be args.
  - Function: A function is a lambda expression, a variable reference, or a primop.
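The arithmetic example above can be transcribed into Python closures to make the CPS discipline concrete. This is a sketch, not the paper's code; add, mul, and pr are stand-in names for the CPS +, *, and print primops:

```python
# CPS '+': compute the sum, then hand it to the continuation k.
def add(x, y, k):
    k(x + y)

# CPS '*': compute the product, then hand it to the continuation k.
def mul(x, y, k):
    k(x * y)

results = []

def pr(v):                       # stand-in for the 'print' continuation
    results.append(v)

# (+ x y (lambda (xy) (+ z w (lambda (zw) (* xy zw print)))))
def example(x, y, z, w):
    add(x, y, lambda xy:
        add(z, w, lambda zw:
            mul(xy, zw, pr)))

example(1, 2, 3, 4)              # (1 + 2) * (3 + 4)
print(results[0])                # -> 21
```

Note that no call here returns a useful value; every intermediate result travels forward into a continuation, exactly as in the CPS fragment above.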

  - Args: An argument is a lambda expression, a variable reference, or a constant.

- Primitive Operations: A primop denotes some primitive function, and is given by a predefined identifier, e.g., +, %if.

- Variables: Variables have the standard identifier syntax.

- Constants: Constants have no functional interpretation, and so have syntax and semantics of little relevance to our analysis. I use integers, represented by base 10 numerals, in my examples.

Note that this syntax specification relegates primops to second class status: they may only be called, not used as data, or passed around as arguments. This leads to no loss of generality, since everywhere one would like to use the + primop as data, for instance, one can instead use an equivalent lambda expression: (lambda (a b c) (+ a b c)).

The reason behind splitting out primops as second class is fairly operational: we view primops as being small, open-codeable types of objects. Calls to primops are where computation actually gets done. There are other formulations possible with first-class primops, with corresponding flow analytic solutions. Relegating primops to second class status simplifies the presentation of the analysis technique presented in this paper.

Note also that the definition of CPS Lisp implies that the only possible body a lambda expression can have is a function call. This is directly reflected in our syntax specification.

A restatement of the syntactic invariants in our representation:

- A lambda's body consists of a single function call.

- A function call's arguments may only be lambdas, variables, or constants. That is, nested calls of the form (f (g a) (h b)) are not allowed.

Some Primops:

%if, test: As discussed above, %if is a primop taking three arguments: a boolean, and two continuations. If the boolean is true, the first continuation is called, otherwise the second continuation is called. %if can have specialised test forms that take a non-boolean first argument, and perform some test on it, e.g. (test-zero? x f g) calls f if x is zero, otherwise g. (test-nil? y h k) calls h if y is nil, otherwise k.

Y: Y is the CPS version of the λ-calculus "paradoxical combinator" fixpoint function. The CPS definition is tricky, and is best arrived at in stages. Consider the following use of the non-CPS fixpoint operator:

    (Y (lambda (f)
         (lambda (n)
           (if (zero? n) 1
               (* n (f (- n 1)))))))

Y returns the fixpoint of its argument, i.e. that function f' such that (lambda (f) ...) applied to f' yields f'. However, if we convert (lambda (f) ...) to CPS notation, we get:

    (lambda (f k)
      (k (lambda (n c)
           (test-zero? n
             (lambda () (c 1))
             (lambda ()
               (- n 1 (lambda (n1)
                        (f n1 (lambda (a)
                                (* a n c))))))))))

Note that our functional doesn't return a value (since no CPS function returns a value). Instead, it calls its continuation k on its computed value. So the "fixpoint" of our new, CPS functional is that function f' such that (lambda (f k) ...) applied to f' and some continuation c, calls c on f'. That is, calling the continuation on f' is equivalent to returning f'. We can generalise our notion of "fixpoint" to include groups of mutually recursive functions by allowing our functional to take more than one non-continuation argument, i.e. a "fixpoint" of some CPS function (lambda (f g h c) ...) is a collection of three functions, f', g', and h', such that if (lambda (f g h c) ...) is applied to f', g', h', and some continuation k, then f', g', and h' will be passed to the continuation k.

So our CPS version of the Y combinator looks like:

    (Y (lambda (f g h k) (k f-definition
                            g-definition
                            h-definition))
       c)

The result of this example is to call continuation c on three values, the fixpoints f', g', and h' of (lambda (f g h k) ...).

Figure 3 shows two complete examples using the CPS Y operator.

4 Control Flow Analysis: A Technique

Our intermediate representation defined, we can now proceed to develop a technique for deriving the control flow information present in a Scheme program. The solution to the dilemma of section 2 is to use a flow technique which will "bootstrap" the control flow graph into being. As is typical in flow analysis problems, our solution may err, as long as it errs conservatively. That is, it may introduce spurious edges into the control flow graph, but may not leave out edges representing transfers of control that could actually occur during execution of the program.

The intuition is that once we determine the control flow graph for the Lisp function, we can then use it to do other, standard data flow analyses. We refer to the problem of determining the control flow graph for the purposes of subsequent data flow analysis as control flow analysis. (This intuition will only be partially borne out. The limitations of control flow analysis will be discussed in section 8.)

(a) Recursive factorial:

    (lambda (n k)                   ; Call K on N!
      ;; Y calls continuation (LAMBDA (G) ...) on the
      ;; fixpoint of (LAMBDA (F C) ...).
      (y (lambda (f c)
           (c (lambda (m k1)        ; (LAMBDA (M K1) ...) is the factorial code.
                (test-zero? m
                  (lambda () (k1 1))
                  (lambda ()
                    (- m 1 (lambda (m1)
                             (f m1 (lambda (a)
                                     (* a m k1))))))))))
         (lambda (g) (g n k))))

(b) Iterative factorial:

    (lambda (n k)                   ; Call K on N!
      (y (lambda (f c)
           (c (lambda (m a k1)
                (test-zero? m
                  (lambda () (k1 a))
                  (lambda ()
                    (* a m (lambda (a1)
                             (- m 1 (lambda (m1)
                                      (f m1 a1 k1))))))))))
         (lambda (g) (g n 1 k))))

Figure 3: Two factorial functions using the CPS Y operator

As an initial approximation, let's discuss control flow analysis for a Scheme that does not allow side effects, even side effects to data structures. This allows us to avoid worrying about functions that "escape" into data structures, to emerge who-knows-where. Later in the paper, we will patch our side-effectless solution up to include side effects.

The point of using CPS Lisp for an intermediate representation is that all transfers of control are represented by function call. Thus the control flow problem is defined by the following question:

    For each call site c in program P, what is the set L(c) of lambda expressions that c could be a call to? I.e., if there is a possible execution of P such that lambda l is called from call site c, then l must be an element of L(c).

The definition of the problem does not define a unique function L. The trivial solution is the function L(c) = AllLambdas, i.e. the conclusion that all lambdas can be reached from any call site. What we want is the tightest possible L. It is possible, in fact, to construct algorithms that compute the smallest L function for a given program. The catch lies in determining when to halt the algorithm. In general, then, we must be content with an approximation to the optimal solution.

Let ARGS be the set of arguments to calls in program P, and LAMBDAS be the set of lambda expressions in P. That is, ARGS is the set of variables and lambda expressions in P (we assume variable names are unique here, i.e. that variables have been alpha-converted to unique names). We define a function Defs : ARGS -> P(LAMBDAS). Defs(a) gives all the lambda expressions that a could possibly evaluate to. That is,

    Defs[(lambda ...)] = {(lambda ...)}
    Defs[primop]       = {primop}
    Defs[v]            = {l | v bound to lambda l}

L is trivially determined from Defs. In a call c:(f a1 ... an), L(c) is just Defs(f).

4.1 Handling Lambda

Defs can be given with a recursive definition: in a call c:(f ... ai ...), for all l ∈ Defs(f) of the form (lambda (... vi ...) ...),

    Defs(ai) ⊆ Defs(vi)    [LAM]

I.e. if lambda l can be called from call site c, then l's ith variable can evaluate to any of the lambdas that c's ith argument can evaluate to.

What we are doing here is willfully confusing closures with lambda expressions. A lambda expression is not a function; it must be paired with an environment to produce a function. When we speak of "calling a lambda expression l," we really mean: "calling some function f which is a closure of lambda expression l." An alternate view is that we are reducing a potentially infinite set of functions to a finite set by "folding" together the environments of all functions constructed from the same lambda expression. This issue will be dealt with in detail in a later section.

4.2 Handling Primops

It can be seen from the definition of Defs that we are flowing information about which lambdas are called from which call sites. But this is not the whole story. Not all function calls happen at call sites. Consider the fragment (+ a b (lambda (s) (f a s))). Where is (lambda (s) (f a s)) called from? It is called from the innards of the + primop; there is no corresponding call site in the program syntax to mark this. We need to endow primops with special internal call sites to mark calls to functions that happen internal to the primop. Different primops have different calling behavior with respect to their arguments: + calls its third argument only. %if calls its second and third arguments. Y has even more complex behavior. So we model each primop specially.

For each call to each primop c:(primop arg1 ...), we associate a related set of internal call sites ic^1, ..., ic^n for the primop's use. Most normal primops, e.g. +, have a single internal call site, which marks the call to the primop's continuation. %if has two internal call sites, one for the consequent continuation, and one for the alternate continuation. Let ic^j_{p,c} be the jth internal call site for the primop p called at call site c.

An ordinary primop (e.g. +, cons, aref) takes its single continuation as its last argument. Any lambda which that last argument could evaluate to, then, can be called from the internal call site of the primop. That is, in call c:(primop arg1 ... cont),

    Defs(cont) ⊆ L(ic^1_{primop,c})    [PRIM]

%if is a special primop, in that it takes two continuations as arguments, either of which can be branched to from inside the %if. %if has two internal call sites, the first for the consequent continuations, and the second for the alternate continuations. So there are two propagation conditions for an occurrence of %if, c:(%if pred cons alt). Any lambda the consequent continuation can evaluate to can be called from the first internal call site:

    Defs(cons) ⊆ L(ic^1_{%if,c})    [IF-1]

Any lambda the alternate continuation can evaluate to can be called from the second internal call site:

    Defs(alt) ⊆ L(ic^2_{%if,c})    [IF-2]

The propagation condition for the Y operator is more complicated. The Y primop is not available to the user; it is only introduced during CPS conversion of labels expressions. The CPS transformation for labels expressions turns

    (labels ((f f-lambda)
             (g g-lambda))
      body)

with continuation kont into:

    (Y (lambda (huaskt f g c)
         (c (lambda (k) body)
            f-lambda
            g-lambda))
       (lambda (bodyfun hunoz hukairz)
         (bodyfun kont)))

where hunoz, hukairz, and huaskt are unreferenced, ignored dummy variables.

Since Y is only introduced into the program text in a controlled fashion, we can make particular assumptions about the syntax of its arguments. In particular, we can assume that f-lambda and g-lambda are lambda expressions, and not variables. This restriction can be relaxed with the addition of a small amount of extra machinery.

We call the functional which is Y's second argument the fixpointer. Y's propagation condition has three parts. For an occurrence of the Y primop:

    c:(Y (lambda (v1 ... vn k) (k f1 ... fn)) cont)

1. Y calls the fixpointer from its internal continuation call site:

    (lambda (v1 ... vn k) ...) ∈ L(ic^1_{Y,c})    [Y-1]

2. Y's continuation is passed to the fixpointer as its continuation argument:

    Defs(cont) ⊆ Defs(k)    [Y-2]

3. Extract the args to the call to k, i.e. the definitions fi of the label functions vi, and for each lambda fi, pass fi in as the fixpointer's ith argument:

    fi ∈ Defs(vi)    [Y-3]

4.3 Handling External References

Besides call sites, lambda expressions and primops, there are two other special syntactic values we must represent in our analysis. We need a way to represent unknown functions that are passed into our program from the outside world at run time. In addition, we need a call site to correspond to calls to functions that happen external to the program text. For example, suppose we have the following fragment in our program:

    (foo (lambda (k) (k 7)))

If foo is free over the program, then in general we have no idea at compile time what functions foo could be bound to at run time. The call to foo is essentially an escape from the program to the outside world, and we must record this fact. We do this by including XLAMBDA, the external lambda, in Defs[foo]. Further, since (lambda (k) (k 7)) is passed out of the program to external routines, it can be called from outside the program. This is represented with a special call site, XCALL, the external call. We record that (lambda (k) (k 7)) is in L(XCALL).

Functions such as (lambda (k) (k 7)) in the above example that are passed to the external lambda have escaped to program text unavailable at compile time, and so can be called in arbitrary, unobservable ways. They have escaped close scrutiny, and in the face of limited information, we are forced simply to assume the worst. We maintain a set ESCAPED of escaped functions, which initially contains XLAMBDA and the top level lambda of the program. The rules for the external call, the external lambda and escaped functions are simple:

1. Any escaped function can be called from the external call:

    ESCAPED ⊆ L(XCALL)    [X-1]

2. Any escaped function may be applied to any escaped function: ∀l ∈ ESCAPED of the form (lambda (... vi ...) ...),

    ESCAPED ⊆ Defs(vi)    [X-2]

This provides very weak information, but for external functions whose properties are completely unknown at compile time, it is the best we can do. (On the other hand, many external functions, e.g. print, or length, have well known properties. For instance, both print and length only call their continuations. Thus, their continuations do not properly escape. Nor are their continuations passed escaped functions as arguments. It is straightforward to extend the analysis presented here to utilise this stronger information.)
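The Defs/L propagation of this section can be sketched as a fixpoint iteration. The code below is my own minimal illustration, not the paper's implementation: it handles ordinary calls and rule [LAM] only, omitting primops, Y, and the escape rules, over a tuple encoding of CPS terms that I have made up for the example:

```python
# Terms: ('lam', label, params, body-call) | ('ref', var) | ('call', fun, args)

def cfa(top_call):
    Defs = {}      # variable -> set of lambda labels it may evaluate to
    lams = {}      # lambda label -> lambda term
    L = {}         # call site (by object id) -> set of callable labels
    calls = []

    def index(term):
        if term[0] == 'lam':
            lams[term[1]] = term
            index(term[3])
        elif term[0] == 'call':
            calls.append(term)
            index(term[1])
            for a in term[2]:
                index(a)

    def adefs(term):               # Defs[] on atomic args and call heads
        if term[0] == 'lam':
            return {term[1]}
        return Defs.get(term[1], set())

    index(top_call)
    changed = True
    while changed:                 # iterate the rules to a fixpoint
        changed = False
        for c in calls:
            fset = adefs(c[1])
            site = L.setdefault(id(c), set())
            if not fset <= site:
                site |= fset
                changed = True
            for lab in fset:       # rule [LAM]: Defs(ai) subset Defs(vi)
                for v, a in zip(lams[lab][2], c[2]):
                    d = Defs.setdefault(v, set())
                    new = adefs(a)
                    if not new <= d:
                        d |= new
                        changed = True
    return Defs, L

# ((lambda (g) (g (lambda (x) (halt)))) (lambda (y) (halt)))
halt = ('call', ('ref', 'halt'), ())
prog = ('call',
        ('lam', 'l1', ('g',),
         ('call', ('ref', 'g'), (('lam', 'l3', ('x',), halt),))),
        (('lam', 'l2', ('y',), halt),))
Defs, L = cfa(prog)
print(Defs['g'], Defs['y'])        # -> {'l2'} {'l3'}
```

The analysis discovers that g names l2, so the inner call site can only jump to l2, which in turn binds y to l3.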

5 Two Examples analysing Ð inner loops Ð use side effects primarily for updat- ing iteration variables, which are rarely functional values. The Consider the program control ¯ow of inner loops is usually explicitly written out, and (lambda () (if 5 (+ 3 4) (- 1 2))) tends not to use functions as ®rst class citizens. which evaluates to 7. In CPS form, this is: To further editorialise, I believe that the updating of loop (lambda (k1) (%if 5 (lambda () (+ 3 4 k1)) variables is a task best left to loop packages such as Waters' (lambda () (- 1 2 k1)))) LetS [LetS] or the Yale Loop [YLoop], where the actual up- Labeling all the lambda expressions and call sites for reference, dating technique can be left to the macro writer to implement we get: ef®ciently, and ignored by the application programmer. Even l1:(lambda (k1) the standard Common Lisp and Scheme looping construct, do, c1:(%if 5 l2:(lambda () c2:(+ 3 4 k1)) permits a binding interpretation of its iteration variable update

l3:(lambda () c3:(- 1 2 k1)))) semantics.

Y g ESCAPED is initially fXLAMBDA l1 . Our escaped func- For these reasons, we can afford to settle for a solution that

tion rules tell us that l1 can be called from XCALL, or that is merely correct, without being very informative. Therefore,

l1 ∈ L(XCALL), and that {l1, XLAMBDA} ⊆ Defs(k1). The propagation condition for %if gives us that %if's first internal call site, ic1%if, calls l1, i.e. l1 ∈ L(ic1%if), and %if's second internal call site, ic2%if, calls l2, i.e. l2 ∈ L(ic2%if). The propagation condition for + gives us that L(ic+) contains Defs(k1) = {l1, XLAMBDA}, or that +'s internal continuation call can transfer control to l1 or the XLAMBDA. Similarly, we derive for - that L(ic-) contains {l1, XLAMBDA}. This completes the control flow analysis for the program.

Consider the infinite loop:

    (labels ((f (lambda (n) (f n))))
      (f 5))

With top-level continuation *k*, this is CPSised into:

    (lambda (*k*)
      (Y (lambda (ignore-b f k1)
           (k1 (lambda (k2) (f 5 k2))
               (lambda (n k3) (f n k3))))
         (lambda (b ignore-f) (b *k*))))

Labelling all the call sites for reference:

    (lambda (*k*)
      c1:(Y (lambda (ignore-b f k1)
              c2:(k1 (lambda (k2) c3:(f 5 k2))
                     (lambda (n k3) c4:(f n k3))))
            (lambda (b ignore-f) c5:(b *k*))))

Y's internal call site is labelled icY. After performing the propagations, we get:

    L(XCALL) = {(lambda (*k*) ...), XLAMBDA}
    L(c1)    = {Y}
    L(icY)   = {(lambda (ignore-b f k1) ...)}
    L(c2)    = {(lambda (b ignore-f) ...)}
    L(c3)    = {(lambda (n k3) (f n k3))}
    L(c4)    = {(lambda (n k3) (f n k3))}
    L(c5)    = {(lambda (k2) (f 5 k2))}

6 Side Effects

For the purposes of doing control flow analysis of Lisp, tracking side effects is not of primary importance. Since Lisp makes variable binding convenient and cheap, side effects occur relatively infrequently. (This is especially true of well-written Lisp.) The pieces of Lisp we are most interested in flow analysing are thus largely side-effect free, and the control flow analysis presented here uses only a very weak technique to deal with side effects. The CPS transformation discussed in section 3 removes all side effects to variables. Only side effects to data structures are allowed.

All side effects and accesses to data structures are performed by primops. Among these primops are cons, rplaca, rplacd, car, cdr, vref, and vset. We divide these into two sets: data stashers and data fetchers. Stashers, such as cons, rplaca, rplacd, and vset, take functions and tuck them away into some data structure. Fetchers (car, cdr, vref) fetch functions from some data structure.

The analysis takes the simple view that once a function is stashed into a data structure, it has escaped from the analyser's ability to track it. It could potentially be the value of any data structure access, or escape outside the program to the ESCAPED set. Thus we have two rules for side effects:

1. Any lambda passed to a stasher primop is included in the ESCAPED set:

       (cons a d kont)            ⟹  Defs(a), Defs(d) ⊆ ESCAPED    [S-1]
       (vset vec index val kont)  ⟹  Defs(val) ⊆ ESCAPED           [S-2]
       etc.

2. A fetcher primop can pass any ESCAPED function to its continuation:

       (fetcher ... cont)  ⟹  ∀ (lambda (x) ...) ∈ Defs(cont):
                                  ESCAPED ⊆ Defs(x)                [F-1]

In a sense, stashers and fetchers act as black and white holes: values disappear into stasher primops and pop out at fetcher primops.
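The two rules can be sketched as one step of a fixpoint computation over the analysis sets. The encoding below is a hypothetical illustration, not the paper's implementation: Defs is a map from variables to sets of lambda labels, call sites are tuples of a primop, its value arguments, and its continuation variable, and cont_param maps a continuation lambda's label to the variable it binds.

```python
# A sketch of side-effect rules [S-1]/[S-2] and [F-1], assuming the
# toy encoding described above.

STASHERS = {"cons", "rplaca", "rplacd", "vset"}
FETCHERS = {"car", "cdr", "vref"}

def side_effect_step(call_sites, defs, escaped, cont_param):
    """One pass of the side-effect rules; iterate to a fixpoint."""
    for primop, value_args, cont in call_sites:
        if primop in STASHERS:
            # [S-1]/[S-2]: any lambda stashed into a data structure escapes.
            for a in value_args:
                escaped |= defs.get(a, set())
        elif primop in FETCHERS:
            # [F-1]: for each (lambda (x) ...) in Defs(cont),
            # add ESCAPED to Defs(x).
            for k in defs.get(cont, set()):
                x = cont_param[k]
                defs.setdefault(x, set()).update(escaped)
    return defs, escaped

# Lambda l0 is consed into a pair; a later car may therefore deliver
# l0 to the variable x bound by continuation k1.
sites = [("cons", ("a", "d"), "kont"),
         ("car", ("p",), "c")]
defs = {"a": {"l0"}, "kont": {"k0"}, "c": {"k1"}}
cont_param = {"k0": "ignore", "k1": "x"}
escaped = set()
for _ in range(2):  # two passes reach the fixpoint here
    defs, escaped = side_effect_step(sites, defs, escaped, cont_param)
```

After the fixpoint, l0 is in ESCAPED and, conservatively, in Defs(x): the analysis cannot tell which stashed function the fetch retrieves, so it assumes any of them.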

7 Induction Variable Elimination: An Application

Once we've performed control flow analysis on a program, we can use the information gained to do other data flow analyses. Control flow analysis by itself does not provide enough information to do all the data flow problems we might desire (the limitations of control flow analysis are discussed in section 8), but we can still solve some useful problems.

As an example, consider induction variable elimination. Induction variable elimination is a technique used to transform array references inside loops into pointer incrementing operations. For example, suppose we have the following C routine:

    int a[50][30];

    example() {
        int r, c;
        for(r=0; r<50; r++)
            for(c=0; c<30; c++)
                a[r][c] = 4;
    }

The example function assigns 4 to all the elements of array a. The array reference a[r][c] on a byte-addressed machine is equivalent to the address computation a+4*(c+30*r). This calculation requires two multiplications and two additions. By replacing the array indices with a pointer variable, we can remove the entire address calculation:

    example() {
        int *ptr;
        for(ptr = &a[0][0]; ptr < &a[0][0] + 50*30; ptr++)
            *ptr = 4;
    }

In our CPS representation there are no loops or variable assignments as such, so induction variables must be defined in terms of the call structure. A basic induction variable (BIV) family is a set of variables B = {v1, ..., vn} obeying the following constraints:

1. Every call site that calls one of the vi's lambda expressions may only call one lambda expression. In terms of the Defs and L functions given earlier, this is equivalent to stating that for all call sites c such that (lambda (... vi ...) ...) ∈ L(c) for some vi, L(c) must be a singleton set:

       ∀ call sites c, vi ∈ B:
           (lambda (... vi ...) ...) ∈ L(c)  ⟹  |L(c)| = 1

2. Each vi may only have definitions of the form b (constant b), and a + vj (vj ∈ B, constant a). This corresponds to stating that vi's lambda must be one of:

   - Called from a call site that simply binds vi to a constant:
     ((lambda (x vi z) ...) foo 3 bar).

   - The continuation of a + call, with vj and a constant for addends:
     (+ vj a (lambda (vi) ...)).

   - The continuation of a - call, with vj and a constant for subtrahend and minuend:
     (- vj a (lambda (vi) ...)).

   - Called from a call site that simply binds vi to some vj, e.g.:
     ((lambda (x vi z) ...) foo vj bar).
     (This is a special case of a = 0.)

Given the control flow information, it is fairly simple to compute sets of variables satisfying these constraints.

A dependent induction variable (DIV) family is a set of variables B' = {wi} together with a function f(n) = a + b*n (a, b constant) and a BIV family B = {vi} such that:

   - For each wi, every definition of wi is wi = f(vj) for some vj ∈ B.

We call the value f(vi) the dependent value.

We can introduce three sets of new variables, {zi}, {xi}, and {yi}. The zi and xi track the dependent value, and the yi are introduced as temporaries to help update the basic variable. In the following description, we use for illustration the following piece of Lisp code:

    (+ v1 4 (lambda (v2)
              ...
              (* 3 v2 (lambda (w1) ...))))

Here v1 and v2 are BIVs; w1 is a DIV, defined to be 3*v2, with associated function f(n) = 3*n.

   - Introduce code to compute zj = f(vj) from zi = f(vi) when vj is computed from vi.

     All call sites whose continuations bind one of the vi (i.e. call sites that step the BIV family) are modified to update the corresponding zi value. Temporary variables xi and yi are used to serially compute the new values. Call site (+ v1 4 k) becomes:

         (+ z1 12 (lambda (x2)
           (+ v1 4 (lambda (y2)
             (k y2 x2)))))

   - Replace computation of DIV wi from vj with a simple binding.

     Since zj = f(vj), we can remove the computation of wj from vj, and simply bind wj to zj. Dependent variable computation (* v2 3 (lambda (w1) ...)) becomes:

         ((lambda (w1) ...) z2)

     and z2 can be β-substituted for w1.

Notice that {wi} ∪ {zi} ∪ {xi} now form a new BIV family, and may trigger off new applications.

For example, consider the loop of figure 4(a). It is converted to the partial CPS of (b). Now, {n} is a BIV family, and {m} is a DIV family dependent on {n}, with f(n) = 3n. A wave of IVE transformation gives us the optimised result of (c). Analysis of (c) reveals that {3n, 3n', 3n%} is a BIV family, and {p} is a DIV family, with f(n) = 4 + n. Another wave of IVE gives us (d). If we examine (d), we notice that 3n, 3n', and 3n% are never used, except to compute values for each other. Another analysis using control flow information, Useless Variable Elimination, spots these useless variables; they can be removed, leaving the optimised result of (e).
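The address arithmetic behind the C example above can be checked concretely: under row-major layout, the index expression c + 30*r enumerates every element offset exactly once and in increasing order, which is what licenses replacing the subscripts with a single stepping pointer. (Python is used here purely for the arithmetic check.)

```python
# For the 50x30 array, a[r][c] lives at byte offset 4*(c + 30*r).
# The flattened word offsets are exactly 0..1499 in order, i.e. the
# sequence a stepping pointer visits.
offsets = [c + 30 * r for r in range(50) for c in range(30)]
assert offsets == list(range(50 * 30))
```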

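The two BIV constraints can likewise be sketched as a small checker over the analysis output. The encoding below is hypothetical, chosen only for illustration: L maps call sites to the sets of lambdas they may call, binder_sites maps each candidate variable to the call sites binding it, and each definition is a tagged tuple.

```python
# A sketch of testing whether a set of variables is a BIV family,
# assuming the toy encoding described above.

def is_biv_family(vars_, L, binder_sites, defn_form):
    # Constraint 1: every call site calling a lambda binding some vi
    # must call exactly one lambda, |L(c)| = 1.
    for v in vars_:
        for c in binder_sites[v]:
            if len(L[c]) != 1:
                return False
    # Constraint 2: every definition of vi is a constant, vi = vj + a,
    # vi = vj - a, or a plain copy of some vj in the family.
    for v in vars_:
        for form in defn_form[v]:
            kind = form[0]
            if kind == "const":
                continue
            if kind in ("+", "-", "copy") and form[1] in vars_:
                continue
            return False
    return True

# Drawn from figure 4(b): n is initialised to the constant 0 and
# stepped by (+ n 1 f) and (+ n 2 f), so {n} passes both constraints.
L = {"c_init": {"f"}, "c_step1": {"f"}, "c_step2": {"f"}}
binder_sites = {"n": ["c_init", "c_step1", "c_step2"]}
defn_form = {"n": [("const", 0), ("+", "n", 1), ("+", "n", 2)]}
assert is_biv_family({"n"}, L, binder_sites, defn_form)
```

A variable defined by a multiplication, such as m = 3*n in figure 4(b), fails constraint 2 and so is a candidate DIV rather than a BIV.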
    (labels ((f (lambda (n)
                  (if (< n 50)
                      (if (= (aref a n) n)
                          (f (+ n 1))
                          (block
                            (print (+ 4 (* 3 n)))
                            (f (+ n 2))))))))
      (f 0))

    (a) Before

    (labels ((f (lambda (n)
                  (if (< n 50)
                      (if (= (aref a n) n)
                          (+ n 1 f)
                          (* 3 n (lambda (m)
                                   (+ 4 m (lambda (p)
                                            (print p)
                                            (+ n 2 f))))))))))
      (f 0))

    (b) Partial CPS conversion

    (labels ((f (lambda (n 3n)
                  (if (< n 50)
                      (if (= (aref a n) n)
                          (+ 3n 3 (lambda (3n')
                                    (+ n 1 (lambda (n')
                                             (f n' 3n')))))
                          (+ 4 3n (lambda (p)
                                    (print p)
                                    (+ 3n 6 (lambda (3n%)
                                              (+ n 2 (lambda (n%)
                                                       (f n% 3n%))))))))))))
      (f 0 0))

    (c) IVE wave 1

    (labels ((f (lambda (n 3n 3n+4)
                  (if (< n 50)
                      (if (= (aref a n) n)
                          (+ 3n+4 3 (lambda (3n+4')
                                      (+ 3n 3 (lambda (3n')
                                                (+ n 1 (lambda (n')
                                                         (f n' 3n' 3n+4')))))))
                          (block
                            (print 3n+4)
                            (+ 3n+4 6 (lambda (3n+4%)
                                        (+ 3n 6 (lambda (3n%)
                                                  (+ n 2 (lambda (n%)
                                                           (f n% 3n% 3n+4%))))))))))))))
      (f 0 0 4))

    (d) IVE wave 2

    (labels ((f (lambda (n 3n+4)
                  (if (< n 50)
                      (if (= (aref a n) n)
                          (+ 3n+4 3 (lambda (3n+4')
                                      (+ n 1 (lambda (n')
                                               (f n' 3n+4')))))
                          (block
                            (print 3n+4)
                            (+ 3n+4 6 (lambda (3n+4%)
                                        (+ n 2 (lambda (n%)
                                                 (f n% 3n+4%)))))))))))
      (f 0 4))

    (e) After

Figure 4: Example IVE application
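Since figures 4(a) and 4(e) are ordinary programs, the claim that the two IVE waves plus Useless Variable Elimination preserve behaviour can be spot-checked by direct simulation. The Python below is a hypothetical transliteration of the two figures; the array contents are arbitrary test data.

```python
# before() transliterates figure 4(a); after() transliterates the
# optimised figure 4(e), which carries 3n+4 in a variable (here t)
# instead of recomputing 4 + 3*n.

def before(a):
    out, n = [], 0
    while n < 50:
        if a[n] == n:
            n = n + 1
        else:
            out.append(4 + 3 * n)   # (print (+ 4 (* 3 n)))
            n = n + 2
    return out

def after(a):
    out, n, t = [], 0, 4            # t plays the role of 3n+4
    while n < 50:
        if a[n] == n:
            t, n = t + 3, n + 1
        else:
            out.append(t)           # (print 3n+4)
            t, n = t + 6, n + 2
    return out

a = [i if i % 3 else -1 for i in range(50)]   # arbitrary test data
assert before(a) == after(a)
```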

8 Limitations and Extensions

This paper is the first in a series intended to develop general flow analysis optimisations for Scheme-like languages. The basic control flow analysis technique of section 4 can be developed in many directions. In this section, I will sketch several of these directions, which are beyond the scope of this paper, and will be published in future reports.

8.1 First Order Closures: An Improvement

Control flow analysis attempts to deal with infinite structures by approximation. We would like to flow functions around. But the set of functions that a given program can produce is infinite. Consider the following infinite loop:

    (labels ((f (lambda (g)
                  (f (lambda ()
                       (+ (g) (g)))))))
      (f (lambda () 1)))

g is bound to the infinite set of functions {f(x) = 2^n | n ≥ 0}. So, in general, there is no way to perfectly answer the question "what are all the functions called from a given call site?" The analysis technique presented in section 4 uses the approximation that identifies all functions that have the same "code," i.e. the functions that are closed over the same lambda expression.

We can use finer grained approximations, expending more work to gain more information. To borrow a technique from [Hudak1], we can track multiple closures over the same lambda. Since it's clear that in the real lambda semantics a finite program can give rise to unbounded numbers of closures, we must identify some real closures together. In the zeroth order control flow analysis (0CFA) presented in section 4, we identified all closures over a lambda together. In the first order case (1CFA), the contour created by calling a lambda from call point c1 is distinguished from the contour created by calling that lambda from call point c2. Closures over these two contours are distinct.

A full treatment of 1CFA will be found in the forthcoming CMU tech report[1] based on this conference paper.

[1] Due to space limits, the description of Useless Variable Elimination and the algorithm for finding BIV families are also deferred to the tech report.

8.2 Environment Flow Analysis

Although control flow analysis provides enough information to perform some traditional flow analysis optimisations on Scheme code, e.g. induction variable elimination, it is not sufficient to solve the full range of flow analysis problems. Why is this?

Traditional flow analysis techniques assume a single, flat environment over the program text. New computed values are communicated from one portion of the program to another by side-effecting and referencing global variables. Thus, any two references to the same variable refer to the same binding of that variable. This is, for instance, the model provided by assembly code.

Our technique, however, uses a different intermediate representation, the CPS applicative-order λ-calculus. Here, the situation is not so simple. The indefinite extent of variable bindings implies that we can have simultaneous multiple bindings of the same variable. For instance, consider the following perverse example:

    (let ((f (lambda (h x)
               (if (zero? x) (h)
                   (lambda () x)))))
      (f (f nil 3) 0))

The first call to f, (f nil 3), returns a closure over (lambda () x) in an environment [x/3]. This function is then used as an argument to the second, outer call to f, where it is bound to the variable h. Suppose we naively picked up information from the conditional (zero? x) test, and flowed x = 0 and x ≠ 0 down the THEN arm and the ELSE arm of the if, respectively. This would work in the single flat environment model of assembler. In our example, however, we are undone by the power of the lambda. After ensuring that x = 0 in the THEN arm of the (if (zero? x) ...) conditional, we jump off to h, and evaluate x, which inconveniently persists in evaluating to 3, not 0. The one variable has multiple simultaneously extant values. Confusing two bindings of the same variable can lead us to draw incorrect conclusions.

This problem prevents us from using plain control flow analysis to perform an important flow analysis optimisation for Lisp: type inference. We would like to perform deductions of the form:

    ;; references to x in f are int refs
    (if (integer? x) (f) (g))

Unfortunately, as demonstrated above, we can't safely make such inferences.

What is missing is environment information. We need to know which references to a variable occur in the same environment. For example, consider the lambda expression:

    (lambda (0:x)
      (cond ((zero? 1:x)
             (print 2:x)
             (* 3:x 4))
            (t (+ 4:x 7))))

There is no execution path of a program containing the above lambda such that reference 2:x occurs in the same environment as reference 4:x. On the other hand, in all execution paths, reference 2:x occurs in the same environment as reference 3:x.

We cannot do general data flow analysis unless we can untangle the multiple environments that variables can be bound in. This is not surprising. As is pointed out in [Declarative], lambda serves a dual semantic role, providing both control and environment structure. Control flow analysis has provided information about only one of these structures. We must also gather information about the relationships among the binding environments established during program execution.

The exact nature of the environment information we need is succinctly captured in the lastref function:

    For a reference r to variable v, what is LR(r), the set of possible
    immediately prior references to the same binding of v?
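The perverse example of section 8.2 can be run directly. The Python below is a transliteration for illustration only: the closure returned by the inner call closes over x = 3, so even though the outer call takes the (zero? x) arm, calling h still yields 3.

```python
# Transliteration of the perverse example: two simultaneously live
# bindings of x defeat naive flow of the (zero? x) test result.
def f(h, x):
    return h() if x == 0 else (lambda: x)

# (f (f nil 3) 0): the inner call returns a closure over x = 3; the
# outer call reaches the x = 0 arm, yet h() evaluates x to 3, not 0.
result = f(f(None, 3), 0)
```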

Lastref information provides just enough information to disentangle the multiple environments that can be created over a given variable. If we had lastref information, we could, for example, correctly perform type inference. Deriving lastref information is referred to as environment flow analysis.

A complete treatment of the environment problem, and techniques for computing the lastref function, are beyond the scope of this paper. The next paper in this flow analysis series, Environment Flow Analysis in Scheme, will treat the environment problem, showing its solution, and the application of the lastref function to performing general flow analysis, including type inference as a demonstration example.

8.3 Mathematics

The mathematics of Scheme control, environment, and data flow analysis is captured by abstract semantic interpretations [Cousot] [Hudak2]. This will be the topic of another paper.

9 Acknowledgements

I would like to thank my advisor, Peter Lee, for carefully reviewing several drafts of this paper.

Notes

Note: Difficulties with Binding

For all its pleasant properties, Lisp binding does introduce a problem that surfaces when we attempt to find applications for flow analysis-derived information, namely that lexical scoping coupled with "upward" functions gives the programmer very tight control over environments. This can frustrate optimising techniques involving code motion, where we might decide we would like to move a computation to a given control point p, only to find that the variables appearing in the computation are not available in the environment obtaining at p.

This problem does not arise with Algol-like languages, since their internal representations are typically close to assembly language. The environment is a single, large flat space, hence a given variable is visible over the entire body of code. Some further consequences of this difference are discussed in subsection 8.2.

Note: CPS and Triples

CPS looks a lot like triples: generally, the code is broken down into primitive operations, which take the variables or constants they are applied to, and a continuation that specifies the variable to receive the computed value. CPS differs from triples in the following ways:

1. The continuation actually serves two roles. (1) It specifies the variable to receive the value computed by the primitive operation. (2) It specifies where the control point will transfer to after executing the primitive operation. In triples this is split out.

2. A triple side-effects its target. A continuation binds its variable.

3. Continuations have runtime definitions; triples can be completely determined at compile time.

References

[Dragon] Aho, Ullman. Principles of Compiler Design. Addison-Wesley (1977).

[Hecht] Hecht, Matthew S. Data Flow Analysis of Computer Programs. American Elsevier (New York, 1977).

[R3-Report] J. Rees & W. Clinger, Eds. "The Revised³ Report on the Algorithmic Language Scheme." SIGPLAN Notices 21(12) (Dec. 1986), pp. 37-79.

[Declarative] Steele, Guy L. Lambda: The Ultimate Declarative. AI Memo 379. MIT AI Lab (Cambridge, November 1976).

[Rabbit] Steele, Guy L. Rabbit: A Compiler for Scheme. AI-TR-474. MIT AI Lab (Cambridge, May 1978).

[ORBIT] Kranz, David, et al. "Orbit: An Optimizing Compiler for Scheme." Proceedings of the SIGPLAN '86 Symposium on Compiler Construction (June 1986), pp. 219-233.

[LetS] Waters, Richard C. LETS: an Expressional Loop Notation. AI Memo 680. MIT AI Lab (Cambridge, October 1982).

[YLoop] Online documentation for the T3 implementation of the Yloop package is distributed by its current maintainer: Prof. Chris Riesbeck, Yale CS Dept. ([email protected]).

[Hudak1] Hudak, Paul. "A Semantic Model of Reference Counting and its Abstraction." Proceedings of the 1986 ACM Conference on Lisp and Functional Programming (August 1986).

[Hudak2] Hudak, Paul. Collecting Interpretations of Expressions (Preliminary Version). Research Report YALEU/DCS/RR-497. Yale University (August 1986).

[Cousot] P. Cousot and R. Cousot. "Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints." 4th ACM Symposium on Principles of Programming Languages (1977), pp. 238-252.