Interprocedural Analysis of First-Order Programs

with Tail Call Optimization

Saumya K. Debray and Todd A. Proebsting

Department of Computer Science

The University of Arizona

Tucson, AZ, USA

Email: {debray, todd}@cs.arizona.edu

December

Abstract

The analysis of control flow
Involves figuring out where returns will go.
How this may be done
With items LR and
Is what in this paper we show.

Introduction

Most code optimizations depend on control flow analysis, typically expressed in the form of a control flow graph. Traditional algorithms construct intraprocedural flow graphs, which do not account for control flow between procedures. Optimizations that depend on this limited information cannot consider the behavior of other procedures. Interprocedural versions of these optimizations must capture the flow of control across procedure boundaries. Determining interprocedural control flow for first-order programs is relatively straightforward in the absence of tail call optimization, since procedures return control to the point immediately after the call. Tail call optimization complicates the analysis, because returns may transfer control to a procedure other than the active procedure's caller.

The problem can be illustrated by the following simple program, which takes a list of values and prints, in their original order, all the values that satisfy some property (e.g., exceed 0). To take advantage of tail-call optimization, it uses an accumulator to collect these values as it traverses the input list. However, this causes the order of the values in the accumulator to be reversed, so the accumulated list is reversed (again using an accumulator) before it is returned.

main(L)          = print(extract(L, []))

extract(xs, acc) = if xs = [] then reverse(acc, [])
                   else if hd(xs) > 0 then extract(tl(xs), cons(hd(xs), acc))
                   else extract(tl(xs), acc)

reverse(xs, acc) = if xs = [] then acc
                   else reverse(tl(xs), cons(hd(xs), acc))
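For concreteness, the program above can be transcribed into Python. This is a sketch, not the paper's code: the tested property (exceeding 0) and the list encoding are assumptions of the transcription, and the tail calls inside each function are written as loop iterations, which is exactly the effect tail call optimization achieves.

```python
def extract(xs, acc):
    # Collect values satisfying the property (assumed here to be "> 0"),
    # prepending each to acc; both recursive calls in the original are
    # tail calls, rendered here as iterations of a loop.
    while xs:
        if xs[0] > 0:
            acc = [xs[0]] + acc
        xs = xs[1:]
    return reverse(acc, [])        # tail call: extract's frame is gone

def reverse(xs, acc):
    # Reverse xs onto acc; again tail-recursive, again a loop.
    while xs:
        xs, acc = xs[1:], [xs[0]] + acc
    return acc

def main(L):
    print(extract(L, []))          # non-tail call: control returns here
```

When `reverse` finishes, control returns not to `extract` or to `reverse`'s other call site, but to the point after the call in `main`; this is precisely the behavior the analysis in this paper must capture.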



The work of S. Debray was supported in part by the National Science Foundation under grant number CCR. The work of T. Proebsting was supported in part by NSF grants CCR, CCR, ARPA grant DABT, and IBM Corporation.

Suppose that, for code optimization purposes, we want to construct a flow graph for this entire program. The return from the function reverse corresponds to some basic block, and in order to construct the flow graph we need to determine the successors of this block. The call graph of the program indicates that reverse can be called either from extract or from reverse itself. However, because of tail call optimization, it turns out that reverse does not return to either of these call sites. Instead, it returns to a different procedure entirely, namely to the procedure main: the successor block to the return is the basic block that calls print. Clearly, some nontrivial control flow analysis is necessary to determine this.

Most of the work to date on control flow analysis has focused on higher-order languages. Shivers and Jagannathan and Weeks use abstract interpretation for this purpose, while Heintze and Tang and Jouvelot use type-based analyses. These analyses are very general, but very complex. Many widely used languages, such as Sisal, are first-order languages. Furthermore, even for higher-order languages, specific programs often use only first-order constructs, or can have most higher-order constructs removed via transformations such as inlining and uncurrying. As a pragmatic issue, therefore, we are interested in ordinary first-order programs; our aim is to account for interprocedural control flow in such programs in the presence of tail call optimization. To our knowledge, the only other work addressing this issue is that of Lindgren, who uses set-based analysis for control flow analysis of Prolog. Unlike Lindgren's work, our analyses can maintain context information (see the discussion of first-order control flow analysis below).

The main contribution of this paper is to show how control flow analysis of first-order programs with tail call optimization can be formulated in terms of simple and well-understood concepts from parsing theory. In particular, we show that context-insensitive, or zeroth-order, control flow analysis corresponds to the notion of FOLLOW sets in context-free grammars, while context-sensitive, or first-order, control flow analysis corresponds to the notion of LR(1) items. This is useful because it allows the immediate application of well-understood technology, without, for example, having to construct complex abstract domains. It is also esthetically pleasing in that it provides an application of concepts such as FOLLOW sets and LR(1) items, which were originally developed purely in the context of parsing, to a very different problem.

The remainder of the paper is organized as follows. The next section introduces definitions and notation. We then define an abstract model for control flow, and show how this model can be described using context-free grammars. After that, we discuss control flow analyses that maintain no context information, and then show how context information can be maintained to produce more precise analyses; we illustrate these ideas with a nontrivial example, and discuss tradeoffs between efficiency and precision along the way. Proofs of the theorems may be found in the full version of the paper.

Definitions and Notation

We assume that a program consists of a set of procedure definitions, together with an entry point procedure. It is straightforward to extend these ideas to accommodate multiple entry points. Since we assume a first-order language, the intraprocedural control flow can be modelled by a control flow graph. This is a directed graph where each node corresponds to a basic block, i.e., a maximal sequence of executable code that has a single entry point and a single exit point, and where there is an edge from a node A to a node B if and only if it is possible for execution to leave node A and immediately enter node B. If there is an edge from a node A to a node B, then A is said to be a predecessor of B, and B a successor of A. Because of the effects of tail call optimization, interprocedural control flow information cannot be assumed to be available. Therefore, we assume that the input to our analysis consists of one control flow graph for each procedure defined in the program.

For simplicity of exposition, we assume that each flow graph has a single entry node. Each flow graph consists of a set of vertices, which correspond to basic blocks, and a set of edges, which capture control flow between basic blocks. If a basic block contains a procedure call, the call is assumed to terminate the block; if a basic block B ends in a call to a procedure p, we say that B calls p.

If the last action along an execution path in a procedure p is a call to some procedure q (i.e., if the only action that would be performed on return from q would be to return to the caller of p), the call to q is termed a tail call. A tail call can be optimized: in particular, any environment allocated for the caller p can be deallocated, and control transfer effected via a direct jump to the callee q. This is usually referred to as tail call optimization, and is crucial for efficient implementations of functional and logic languages. If a basic block B ends in a tail call, we say that it is a tail call block; if B ends in a procedure call that is not a tail call, we say B is a call block. In the latter case, B must set a return address L before making the call; L is said to be a return label. If a basic block B ends in a return from a procedure, it is said to be a return block. As is standard in the program analysis literature, we assume that either branch of a conditional can be executed at runtime. The ideas described here are applicable to programs that do not satisfy this assumption; in that case, the analysis results will be sound, but possibly conservative.

The set of basic blocks and labels appearing in a program P are denoted by Blocks_P and Labels_P, respectively; the set of procedures defined in it is denoted by Procs_P. Finally, the Kleene closure of a set S, i.e., the set of all finite sequences of elements of S, is written S*; the reflexive transitive closure of a relation R is written R*.

Abstracting Control Flow

Before we can analyze the control flow behavior of such programs, it is necessary to specify this behavior carefully. First, consider the actual runtime behavior of a program:

- Code not involving procedure calls or returns is executed as expected: each instruction in a basic block is executed in turn, after which control moves to a successor block, and so on.

- Procedure calls are executed as follows:

  - A non-tail call loads arguments into the appropriate locations, saves the return address (for simplicity, we can assume that it is pushed on a control stack), and branches to the callee.

  - A tail call loads arguments, reclaims any space allocated for the caller's environment, and transfers control to the callee.

- A procedure return simply pops the topmost return address from the control stack and transfers control to this address.

We can ignore any aspect of a program's runtime behavior that is not concerned directly with flow of control. Conceptually, therefore, control moves from one basic block to another, pushing a return address on a stack when making a non-tail procedure call, and popping an address from it when returning from a procedure. This can be formalized using a very simple pushdown automaton: the automaton M_P corresponding to a program P is called its control flow automaton. Given a program P, the set of states Q of M_P is given by Q = Blocks_P ∪ Procs_P; its input alphabet is Labels_P; the initial state of M_P is p, where p is the entry point of P; and its stack alphabet is Labels_P ∪ Blocks_P ∪ {⊥}, where ⊥ is a special bottom-of-stack marker that is the initial stack symbol.

The general idea is that the state of M_P at any point corresponds to the basic block being executed by P, while the return labels on its stack correspond to the stack of procedure calls in P. The input string does not play a direct role in determining the behavior of M_P, but it turns out to be technically very convenient to match up symbols read from the input with labels popped from the stack. The language accepted by M_P is then the set of sequences of labels that control can jump to on procedure returns during an execution of the program P.

Let the transitions of a pushdown automaton be denoted as follows: if, from a configuration where it is in state q and has w in its input and α on its stack, it can make a transition to state q′ with input string w′ and with α′ on its stack, we write (q, w, α) ⊢ (q′, w′, α′). The stack contents are written such that the top of the stack is to the left: if α = a_1 ··· a_n, then a_1 is assumed to be at the top of the stack. The moves of M_P are defined as follows:

- If basic block B is a predecessor of basic block B′, and B does not make a call or tail call, then M_P can make an ε-move from B to B′:

      (B, w, γ) ⊢ (B′, w, γ).

- If basic block B makes a call to procedure p with return label ℓ, where the basic block with label ℓ is B′, then M_P can push the two symbols ℓ and B′ on its stack and make an ε-move to state p:

      (B, w, γ) ⊢ (p, w, ℓB′γ).

- If basic block B makes a tail call to procedure p, then M_P can make an ε-move to state p:

      (B, w, γ) ⊢ (p, w, γ).

- If the entry node of the flow graph of a procedure p is B, then M_P can make an ε-move from state p to state B:

      (p, w, γ) ⊢ (B, w, γ).

- If B is a return block, then, if ℓ appears on M_P's input and the label ℓ and block B′ appear on the top of its stack, M_P can read ℓ from the input, pop ℓ and B′ off its stack, and go to state B′:

      (B, ℓw, ℓB′γ) ⊢ (B′, w, γ).

- Finally, M_P accepts by empty stack: for each state q there is the move

      (q, ε, ⊥) ⊢ (q, ε, ε).

We refer to the label appearing on the top of M_P's stack as the current return label, since this is the label of the program point to which control returns from a return block.
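To make the moves concrete, here is a small Python sketch of M_P's transition function, driven along one possible path for the main/extract/reverse program of the Introduction. This is a sketch under stated assumptions: block names B1 through B10 follow the example flow graph in the next section, the input string is omitted (it merely mirrors the labels popped on returns), and "$" plays the role of the bottom-of-stack marker ⊥.

```python
def step(state, stack, move):
    """One move of the control flow automaton (input string omitted)."""
    kind = move[0]
    if kind == "goto":                  # intraprocedural edge B -> B'
        return move[1], stack
    if kind == "call":                  # push return label and return block
        _, callee, label, ret_block = move
        return callee, [label, ret_block] + stack
    if kind == "tailcall":              # enter callee, stack untouched
        return move[1], stack
    if kind == "enter":                 # procedure p -> its entry block
        return move[1], stack
    if kind == "return":                # pop label and return block
        return stack[1], stack[2:]
    raise ValueError(kind)

# One path for main([]): B1 calls extract (return label L2, block B2);
# extract's entry B3 goes to B4, which tail-calls reverse; reverse's
# entry B8 goes to the return block B9, whose return pops L2 and B2.
path = [("enter", "B1"),
        ("call", "extract", "L2", "B2"),
        ("enter", "B3"), ("goto", "B4"),
        ("tailcall", "reverse"),
        ("enter", "B8"), ("goto", "B9"),
        ("return",)]
state, stack = "main", ["$"]            # "$" stands in for the marker
for m in path:
    state, stack = step(state, stack, m)
print(state, stack)                     # -> B2 ['$']
```

The tail call from B4 leaves the stack untouched, so the return in B9 lands directly in B2, inside main, skipping the (deallocated) extract entirely.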

Control Flow Grammars

Given a set of control flow graphs for the procedures in a program, we can construct a context-free grammar that describes its control flow behavior. We call such a grammar a control flow grammar.

Definition  A control flow grammar for a program P is a context-free grammar G_P = (V, T, P, S), where the set of terminals T is the set Labels_P of return labels of P; the set of variables V is given by V = Blocks_P ∪ Procs_P; the start symbol of the grammar is the entry procedure p of the program; and the productions of the grammar are given by the following:

- if B is a basic block that is a predecessor of a basic block B′, then there is a production B → B′;

- if B is a call block with return label ℓ, where the basic block labelled ℓ is B′ and the called procedure is p, then there is a production B → p ℓ B′;

- if B is a tail call block and the called procedure is p, then there is a production B → p;

- if p is a procedure defined in P and the control flow graph of p has entry node B, then there is a production p → B;

- if B is a return block, then there is a production B → ε.
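The definition above translates directly into code. The following Python sketch builds the productions from a simple flow-graph description; the dictionary encoding of blocks and procedures is an assumption of this sketch, not something fixed by the paper.

```python
def cfg_productions(blocks, procs):
    """blocks maps a block name to a dict with:
         kind:  'plain' | 'call' | 'tailcall' | 'return'
         succs: intraprocedural successor blocks (optional)
         callee / ret_label / ret_block: for call and tail-call blocks
       procs maps each procedure name to its entry block."""
    prods = []
    for b, info in blocks.items():
        for s in info.get("succs", []):
            prods.append((b, [s]))                 # B -> B'
        if info["kind"] == "call":                 # B -> p l B'
            prods.append((b, [info["callee"], info["ret_label"],
                              info["ret_block"]]))
        elif info["kind"] == "tailcall":           # B -> p
            prods.append((b, [info["callee"]]))
        elif info["kind"] == "return":             # B -> epsilon
            prods.append((b, []))
    for p, entry in procs.items():                 # p -> entry block
        prods.append((p, [entry]))
    return prods

# The main/extract/reverse program of the Introduction:
blocks = {
    "B1": {"kind": "call", "callee": "extract",
           "ret_label": "L2", "ret_block": "B2"},
    "B2": {"kind": "tailcall", "callee": "print"},
    "B3": {"kind": "plain", "succs": ["B4", "B5"]},
    "B4": {"kind": "tailcall", "callee": "reverse"},
    "B5": {"kind": "plain", "succs": ["B6", "B7"]},
    "B6": {"kind": "tailcall", "callee": "extract"},
    "B7": {"kind": "tailcall", "callee": "extract"},
    "B8": {"kind": "plain", "succs": ["B9", "B10"]},
    "B9": {"kind": "return"},
    "B10": {"kind": "tailcall", "callee": "reverse"},
}
procs = {"main": "B1", "extract": "B3", "reverse": "B8"}
prods = cfg_productions(blocks, procs)
```

Running this on the example flow graph reproduces the productions listed in the example that follows.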

Example  Consider the main/extract/reverse program from the Introduction. A partial flow graph for this program is shown in the figure below, with ordinary control transfers shown using solid lines and calls to procedures using dashed lines. Because control transfers at procedure returns have not yet been determined, the predecessors of basic block B2 and the successors of block B9 are not yet known. The control flow grammar for this program has as its terminals the set of labels {L2}, nonterminals {main, extract, reverse, B1, ..., B10}, and the following productions:

[Figure  A Partial Flow Graph for the main/extract/reverse Program. main: B1 (call extract, ret addr = L2), followed at L2 by B2 (call print). extract: entry B3, with successors B4 (call reverse) and B5; B5 has successors B6 and B7, both of which call extract. reverse: entry B8, with successors B9 (return) and B10 (call reverse).]

main → B1
B1 → extract L2 B2
B2 → print
extract → B3
B3 → B4 | B5
B4 → reverse
B5 → B6 | B7
B6 → extract
B7 → extract
reverse → B8
B8 → B9 | B10
B9 → ε
B10 → reverse

The start symbol of the grammar is main. □

The productions of the control flow grammar G_P closely resemble the moves of the control flow automaton M_P, and it comes as no surprise that they behave very similarly. Let ⇒_lm denote the leftmost derivation relation in G_P. The following theorem, whose proof closely resembles the standard proof of the equivalence between pushdown automata and context-free languages, expresses the intuition that the control flow grammar of a program mirrors the behavior of its control flow automaton.

Theorem  Given a program P with entry point S, control flow grammar G_P, and control flow automaton M_P:

    S ⇒*_lm x A γ   if and only if   (S, xw, ⊥) ⊢* (A, w, γ⊥),

where x, w ∈ Labels_P*, A ∈ Blocks_P ∪ Procs_P, and γ ∈ (Labels_P ∪ Blocks_P)*.

Zeroth-Order Control Flow Analysis

Zeroth-order control flow analysis, also referred to as 0CFA, involves determining, for each procedure p in a program, the set of labels RetLbl(p) to which control can return after some call to p. Consider a call block B in a program P. If B is not a tail-call block, it pushes its return address onto the top of the stack before transferring control to the called procedure. On the other hand, if B is a tail call block, it leaves the control stack untouched and transfers control directly to the callee. Eventually, when executing a return, control branches to the label appearing on the top of the stack. Thus, in either case, the set of labels to which a procedure p can return is the set of current return labels when control enters p in some configuration reachable from the initial configuration of M_P:

    RetLbl(p) = {ℓ | (q_0, w, ⊥) ⊢* (p, w′, ℓγ) for some w, w′, γ},

where q_0 is the initial state (the entry point) of M_P. It is a direct consequence of the theorem above that this set is precisely the FOLLOW set of p in the control flow grammar of the program (see the standard parsing literature for the definition of FOLLOW sets).

Theorem  For any procedure p in a program P, RetLbl(p) = FOLLOW(p).

Proof  Suppose the program P has entry point S. From the definition of RetLbl(p), ℓ ∈ RetLbl(p) if and only if there is a call block A that calls p such that (S, xw, ⊥) ⊢* (A, w, γ) ⊢ (p, w, ℓB′γ). From the previous theorem, this is true if and only if S ⇒*_lm xAδ ⇒ x p ℓ B′ δ, i.e., if and only if S ⇒*_lm x p ℓ ···. But this is true if and only if ℓ ∈ FOLLOW(p). It follows that RetLbl(p) = FOLLOW(p). □

Example  FOLLOW sets for some of the variables in the grammar of the previous example are:

    X        FOLLOW(X)
    main     {$}
    extract  {L2}
    reverse  {L2}

Here, $ refers to the external caller (e.g., the user). It is immediately apparent from this that control returns from reverse to the basic block in main that calls print. □

There is one remaining subtlety in constructing the interprocedural control flow graph of a program once the set of return labels for each function has been computed. If we consider the FOLLOW sets of the grammar of the example above, we find that L2 occurs in FOLLOW(extract), and it is correct to infer from this that control is transferred to the basic block B2, labelled by L2, after completion of the call to extract. However, we cannot conclude that each block that has L2 in its FOLLOW set has the block B2 as a successor: while L2 occurs in the FOLLOW sets of B3, B4, B5, B6, B7, B8, B9, and B10, it is not difficult to see that only B9, which actually contains a return instruction, should have B2 as a successor. The algorithm for constructing the control flow graph of a program taking this into account is given in the figure below.
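The FOLLOW computation used by the algorithm is the standard textbook fixpoint. The following self-contained Python sketch applies it to the example grammar; treating print as an opaque terminal and using "$" for the external caller are assumptions of the sketch.

```python
from collections import defaultdict

# Productions of the control flow grammar for main/extract/reverse.
PRODS = [
    ("main", ["B1"]),
    ("B1", ["extract", "L2", "B2"]), ("B2", ["print"]),
    ("extract", ["B3"]),
    ("B3", ["B4"]), ("B3", ["B5"]), ("B4", ["reverse"]),
    ("B5", ["B6"]), ("B5", ["B7"]),
    ("B6", ["extract"]), ("B7", ["extract"]),
    ("reverse", ["B8"]),
    ("B8", ["B9"]), ("B8", ["B10"]),
    ("B9", []),                          # return block: B9 -> epsilon
    ("B10", ["reverse"]),
]
NONTERMS = {lhs for lhs, _ in PRODS}

def first_and_nullable():
    first, nullable = defaultdict(set), set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in PRODS:
            old = (len(first[lhs]), lhs in nullable)
            all_null = True
            for X in rhs:
                if X in NONTERMS:
                    first[lhs] |= first[X]
                    if X not in nullable:
                        all_null = False
                        break
                else:                    # terminal: FIRST is the symbol itself
                    first[lhs].add(X)
                    all_null = False
                    break
            if all_null:
                nullable.add(lhs)
            changed |= (len(first[lhs]), lhs in nullable) != old
    return first, nullable

def follow_sets():
    first, nullable = first_and_nullable()
    follow = defaultdict(set)
    follow["main"].add("$")              # entry procedure is the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in PRODS:
            for i, X in enumerate(rhs):
                if X not in NONTERMS:
                    continue
                old = len(follow[X])
                tail_nullable = True
                for Y in rhs[i + 1:]:    # FIRST of what follows X
                    if Y in NONTERMS:
                        follow[X] |= first[Y]
                        tail_nullable = Y in nullable
                    else:
                        follow[X].add(Y)
                        tail_nullable = False
                    if not tail_nullable:
                        break
                if tail_nullable:        # X can end lhs's expansion
                    follow[X] |= follow[lhs]
                changed |= len(follow[X]) != old
    return follow

follow = follow_sets()
print(follow["main"], follow["extract"], follow["reverse"])
# -> {'$'} {'L2'} {'L2'}, matching the FOLLOW sets in the example
```

Among the blocks whose FOLLOW sets contain L2, only B9 contains a return instruction, so only B9 gets the edge to B2, exactly as the algorithm's final step requires.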

Applications of 0CFA

An example application of 0CFA is interprocedural unboxing optimization in languages that are either dynamically typed or that support polymorphic typing. In implementations of such languages, the compiler cannot always predict the exact type of a variable at a program point, and as a result it becomes necessary to ensure that values of different types look the same, which is achieved by boxing. Unfortunately, manipulating boxed values is expensive.

The issue of maintaining untagged values has received considerable attention in recent years in the context of strongly typed polymorphic languages. Using explicit representation types, this work relies on the type system to propagate data representation information through the program. While theoretically elegant, the type system cannot be aware of low-level pragmatic concerns, such as the costs of various representation conversion operations and the execution frequencies of different code fragments. As a result, it is difficult to guarantee that the optimized program is in fact more efficient than the unoptimized version. Also, the idea does not extend readily to dynamically typed languages.

Peterson takes a procedure's control flow graph and determines the optimal placement of representation conversion operations, based on basic block execution frequencies and conversion operation costs. As given, this is an intraprocedural optimization. For many programs, unboxing across procedure calls yields significant performance improvements. As an example, we tested a program that computes ∫₀¹ eˣ dx using trapezoidal numerical integration with adaptive quadrature. For this program, intraprocedural unboxing optimization yields a performance improvement of about , with a tail call from a function to itself being

Input:  A program P.

Output: The interprocedural (0CFA) control flow graph for P.

Method:

1. Construct the control flow grammar G_P for P.

2. Compute FOLLOW sets for the nonterminals of G_P.

3. Construct a partial control flow graph for P, without accounting for control transfers due to procedure returns. This is done by adding edges corresponding to intraprocedural control transfers, together with an edge from each call block and tail-call block to the entry node of the corresponding called function.

4. For each ℓ ∈ Labels_P and each nonterminal X of G_P: if ℓ ∈ FOLLOW(X) and the basic block corresponding to X contains a return instruction, then add an edge from X to B, where B is the basic block labelled by ℓ.

Figure  An algorithm for constructing the interprocedural (0CFA) control flow graph of a program.

recognized and implemented as a loop. With interprocedural unboxing, however, performance improves by about .¹ To apply Peterson's algorithm interprocedurally, we need to construct the control flow graph for the entire program. The presence of tail call optimization makes computing the set of successors of a return block difficult with just the call graph of the program. Fortunately, 0CFA provides precisely what is needed to determine the successors of return blocks, and thereby to construct the control flow graph for a program.

Another application of 0CFA is interprocedural basic block fusion. The basic idea of this optimization is straightforward: if we have two basic blocks B_0 and B_1, where B_0 is the only predecessor of B_1 and B_1 is the only successor of B_0, then they can be combined into a single basic block; in the interprocedural case, the blocks being fused may belong to different procedures. The benefits of this optimization include a reduction in the number of jump instructions executed, with a concomitant decrease in pipeline bubbles, as well as potentially improved opportunities for better instruction scheduling in the enlarged basic block resulting from the optimization. Our experience with filter fusion indicates that this optimization can be of fundamental importance for performance in applications involving automatically generated source code.

Consider the main/extract/reverse program from the Introduction; a partial flow graph for this program is given in the figure above. It is not difficult to see that the 0CFA algorithm of the figure above would determine that basic block B2 is the only successor of block B9, and B9 is the only predecessor of B2, thereby allowing these two blocks to be fused. Note that a naive analysis that handles tail calls as if they were calls that returned to an empty basic block immediately following the call site would infer that basic block B2 had three predecessors, thereby preventing the application of the optimization in this case.

First-Order Control Flow Analysis

While 0CFA tells us the possible return addresses for each basic block and procedure, it leaves out all information about the context in which a call occurs, i.e., who called whom to get to this call site. This may render 0CFA inadequate in some situations. Information about where control came from could provide more precise liveness or aliasing information at a particular program point, allowing a compiler to generate better code.

¹We did not implement Peterson's algorithm, because the control flow analysis described here had not been developed when we implemented the unboxing optimization. Instead, our compiler uses a heuristic that produces good, but not necessarily optimal, representation conversion placements. These performance improvements can therefore be seen as lower bounds on what can be attained using an optimal algorithm.

At any point during a program P's execution, the return addresses on its control stack, which correspond to the contents of the stack in some execution of the control flow automaton M_P, give us a complete history of the interprocedural control flow behavior of the program up to that point. Since the set of all possible finite sequences of labels is infinite, we seek finitely computable approximations to this information. An obvious possibility is to keep track of the top k labels on the stack of M_P, for some fixed k. 0CFA, where we keep track of no context information at all, corresponds to choosing k = 0. A control flow analysis that keeps track of the top k return addresses on the stack of M_P is called a kth-order control flow analysis, or kCFA; this corresponds to the call-strings approach of Sharir and Pnueli. In this section, we focus our attention on first-order control flow analysis, or 1CFA.

In the previous section, we showed that the FOLLOW sets of the control flow grammar give 0CFA information. How might we incorporate additional context information into such analyses? In parsing theory, FOLLOW sets are used to construct SLR(1) parsers, which are based on LR(0) items. Because SLR(1) parsers do not maintain much context information, they are unable to handle many simple grammars. Introducing additional context information into the items, using lookahead tokens, fixes this problem; this leads to the use of LR(1) items.

This analogy carries over to control flow analysis: LR(1) items for the control flow grammar G_P are closely related to the information manipulated during 1CFA. Basically, an LR(1) item [A → α·β, a] conveys the information that control can reach A with current return label a. In an item [A → α·β, a], we will often focus on the nonterminal A on the left-hand side of the production and the lookahead token a, but not on the details of the structure of α and β; in such cases, to reduce visual clutter, we will write the item as [A → ···, a]. In the context of this discussion, we are not concerned with whether or not the control flow grammar G_P is LR-parsable.

We know from parsing theory that, given a control flow grammar G_P with variables V and terminals T, there is a nondeterministic finite automaton (NFA) (Q, V ∪ T, δ, q_0, Q) that recognizes viable prefixes of G_P. This NFA, which we will refer to as the viable prefix NFA, is defined as follows: its set of states Q consists of the set of LR(1) items for G_P, together with a state q_0 that is not an item; its alphabet is V ∪ T; the initial state is q_0; every state is a final state; and the transition function δ is defined as follows:

(i)   given a program P with entry point p, δ(q_0, ε) = {[S → ·p, $]}, where S is an auxiliary start symbol and $ marks the external caller;

(ii)  δ([A → α·Bβ, a], ε) ⊇ {[B → ·γ, b] | B → γ is a production and b ∈ FIRST(βa)};

(iii) δ([A → α·Bβ, a], B) ∋ [A → αB·β, a].

An item I is said to be reachable if q_0 ↝ I, i.e., if there is a path from the initial state q_0 of the viable prefix NFA to I. The following result makes explicit the correspondence between LR(1) items in G_P and return addresses on top of the stack of M_P.

Theorem  Given a program with entry point p, (p, xy, ⊥) ⊢* (A, y, aγ) for some γ if and only if there is a reachable item [A → ···, a].

The set of current return labels (i.e., labels at the top of M_P's stack) when control enters a basic block or procedure is now easy to determine.

Corollary  Let A be any basic block or procedure in a program P. The set of current return labels when control enters A is given by {ℓ | there is a reachable LR(1) item [A → ···, ℓ]}.
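The corollary suggests a direct implementation: compute the reachable items by a graph search over the NFA's transitions. A Python sketch for the example grammar follows. It exploits a property of control flow grammars: the symbol after a nonterminal in a production body is either nothing or a terminal return label, so the FIRST computation in rule (ii) collapses to a single case. The augmented start item and the "$" marker are assumptions of the sketch.

```python
PRODS = {
    "S": [["main"]],                     # assumed augmented start symbol
    "main": [["B1"]],
    "B1": [["extract", "L2", "B2"]], "B2": [["print"]],
    "extract": [["B3"]],
    "B3": [["B4"], ["B5"]], "B4": [["reverse"]],
    "B5": [["B6"], ["B7"]], "B6": [["extract"]], "B7": [["extract"]],
    "reverse": [["B8"]],
    "B8": [["B9"], ["B10"]], "B9": [[]], "B10": [["reverse"]],
}

def lookahead(tail, la):
    # In a control flow grammar a nonterminal is followed either by
    # nothing or by a terminal label, so FIRST(tail la) is immediate.
    return tail[0] if tail else la

def reachable_items():
    start = ("S", ("main",), 0, "$")     # the item [S -> .main, $]
    seen, work = {start}, [start]
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot == len(rhs):
            continue
        X = rhs[dot]
        new = [(lhs, rhs, dot + 1, la)]  # rule (iii): shift the dot over X
        if X in PRODS:                   # rule (ii): closure items for X
            b = lookahead(rhs[dot + 1:], la)
            new += [(X, tuple(alt), 0, b) for alt in PRODS[X]]
        for item in new:
            if item not in seen:
                seen.add(item)
                work.append(item)
    return seen

items = reachable_items()

def return_labels(A):
    # Current return labels when control enters A (the corollary above).
    return {la for lhs, _, dot, la in items if lhs == A and dot == 0}

print(return_labels("reverse"), return_labels("extract"))
# -> {'L2'} {'L2'}
```

Every reachable item for reverse carries lookahead L2, recovering, via items this time, the same answer that the FOLLOW sets gave for the example.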

An LR(1) item [A → ···, a] tells us about the return labels that can appear on top of the control stack, i.e., about addresses that control can go to. Fortunately, it turns out that we can use the reachability relation ↝ in the viable prefix NFA to trace the origins of a call. Consider a program P with a call block A that calls a procedure p; in the control flow grammar G_P, this corresponds to a production

    A → p ℓ C,

where ℓ is the return label and C is the block with label ℓ. This production gives rise to LR(1) items of the form [A → ·p ℓ C, b]. Let B_p be the entry node of the flow graph for p; then G_P contains the production p → B_p, so in the viable prefix NFA there is a transition from each of these items to the item [p → ·B_p, ℓ]. Suppose the block B_p has successors C_1, ..., C_k; then G_P has productions B_p → C_1, ..., B_p → C_k, and the viable prefix NFA will therefore have transitions from the item [p → ·B_p, ℓ] to each of the items [B_p → ·C_1, ℓ], ..., [B_p → ·C_k, ℓ]. Suppose one of these blocks, say C_j, makes a tail call to a procedure q whose flow graph has entry node B_q; then G_P contains the productions C_j → q and q → B_q, and this gives rise to transitions from [B_p → ·C_j, ℓ] to [C_j → ·q, ℓ], and thence to [q → ·B_q, ℓ]. We can follow transitions in this way to trace a sequence of control transfers that does not involve any procedure returns. Conversely, we can follow transitions backwards from a call to work out where it could have come from.

Intuitively, we want to be able to characterize a collection of successive basic blocks and procedures control can go through (i.e., a sequence of states of the control flow automaton) without any procedure returns, except perhaps at the very end. Since the set of sequences of blocks is infinite, we need a finite approximation: as before, one simple way to do this is to consider sets of blocks (there are only finitely many), together with the current return label when control reaches each block. These ideas can be made more precise using the notion of a forward chain.

Definition  A forward chain in a program P is a set {(B_0, ℓ_0), ..., (B_n, ℓ_n)}, where each B_i is either a procedure in P or a basic block in P, and ℓ_i ∈ Labels_P, for 0 ≤ i ≤ n; and where, for each i, 0 ≤ i < n, the following hold: (i) B_i is not a return block; and (ii) in the control flow automaton M_P, (B_i, x, ℓ_i γ_i) ⊢ (B_{i+1}, y, ℓ_{i+1} γ_{i+1}) for some x, y, γ_i, γ_{i+1}.

Reasoning as above, it is easy to show the following result.

Theorem  {(B_0, ℓ_0), ..., (B_n, ℓ_n)} is a forward chain in a program P if and only if there is a sequence of transitions in the viable prefix NFA for G_P of the form

    [B_0 → ·B_1 β_0, ℓ_0] ↝ [B_1 → ·B_2 β_1, ℓ_1] ↝ ··· ↝ [B_{n-1} → ·B_n β_{n-1}, ℓ_{n-1}] ↝ [B_n → ·β_n, ℓ_n],

where ℓ_{i+1} ∈ FIRST(β_i ℓ_i), for some β_0, ..., β_n.

Now consider the process of applying the subset construction to the viable prefix NFA to construct an equivalent DFA. Each state of the DFA consists of a set of NFA states (that is, a set of LR(1) items) obtained by starting with a set of NFA states and then adding all the states reachable using only ε-transitions. The DFA construction is useful because the set of NFA states comprising each state of the DFA corresponds to the largest set of NFA states reachable from some initial set using only ε-transitions, i.e., to a set of maximal-length forward chains. In other words, if a forward chain occurs in a state of the viable prefix DFA, it is entirely contained in that state: it can never spill over into another state, thereby simplifying the search for forward chains. This is expressed by the following result.

Corollary  {(B_0, ℓ_0), ..., (B_n, ℓ_n)} is a forward chain in a program if and only if there is a state in the viable prefix DFA for G_P containing items [B_0 → ·B_1 β_0, ℓ_0], [B_1 → ·B_2 β_1, ℓ_1], ..., [B_{n-1} → ·B_n β_{n-1}, ℓ_{n-1}], [B_n → ·β_n, ℓ_n], where ℓ_{i+1} ∈ FIRST(β_i ℓ_i), for some β_0, ..., β_n.

Intuitively, control can come from a call A to a point B if there is a forward chain from A to B that contains no intervening calls, i.e., A is the most recent call preceding B. The following result is now immediate.

Corollary  Let A be a call block or a tail call block in a program P, and B a basic block or a procedure in P. Then control can come from A to B if and only if there is a state in the viable prefix DFA of G_P containing items [B_0 → ·B_1 β_0, ℓ_0], [B_1 → ·B_2 β_1, ℓ_1], ..., [B_{n-1} → ·B_n β_{n-1}, ℓ_{n-1}], [B_n → ·β_n, ℓ_n], where ℓ_{i+1} ∈ FIRST(β_i ℓ_i), for some β_0, ..., β_n, such that B_0 = A, B_n = B, and B_i is not a call block or tail call block for 0 < i < n.

We conjecture that the analogy between control flow analysis and LR items continues to hold when more context information is maintained. In particular, we conjecture that, just as first-order control flow analysis (1CFA) corresponds to LR(1) items, kth-order control flow analysis (kCFA) corresponds to LR(k) items.

Applications of 1CFA

An example application of 1CFA is in context-sensitive interprocedural dataflow analysis. Much of the recent work on interprocedural dataflow analysis has focused on languages such as C and Fortran, whose implementations usually do not support tail call optimization. These analyses determine, for each call, the behavior of the called procedure, then propagate this information to the program point to which that call returns. For the languages considered, the determination of the return points for the calls is straightforward. Because the point to which a call returns is not obvious in the presence of tail call optimization, it is not obvious how to apply these analyses to systems with tail call optimization. While 0CFA can be used to determine the set of successors for each return block, it does not maintain enough context information to determine where control came from. As a result, the analysis can infer spurious pointer aliases by propagating information from one call site back to a different call site. Using context-sensitive interprocedural analyses avoids this by maintaining information about where a call came from, which is precisely the information provided by 1CFA.

As a specific example of the utility of context-sensitive flow information: our experiments with dead code elimination, based on interprocedural liveness analysis in the context of the alto link-time optimizer applied to a number of SPEC benchmarks, indicate that, compared to the number of register loads and stores that can be deleted based on context-insensitive liveness information, an additional  can be deleted using context-sensitive liveness information.

Whether or not a context-sensitive version of an interprocedural analysis is useful depends to a great extent on the analysis and the application under consideration. Our experiments with interprocedural liveness analysis indicate that there are situations where such analyses can lead to a noticeable improvement in the code generated. On the other hand, in comparing context-sensitive and context-insensitive analyses of aliasing due to indirect memory references through pointers, Ruf observes that the context-sensitive analysis does compute more precise alias relationships at some program points, but that when attention is restricted to the locations accessed or modified by indirect memory references, no additional precision is measured. However, if a context-sensitive dataflow analysis is deemed necessary for a language implementation with tail call optimization, the control flow analysis described here can be used to provide the necessary support.

A Larger Example

Consider the following program, adapted from Paulson, which determines whether a propositional formula in conjunctive normal form is a tautology:

fun taut (Conj(p,q)) = taut p andalso taut q
  | taut p           = int (pos p) (neg p) <> []

fun pos (Atom a)      = [a]
  | pos (Neg(Atom a)) = []
  | pos (Disj(p,q))   = app (pos p) (pos q)

fun neg (Atom a)      = []
  | neg (Neg(Atom a)) = [a]
  | neg (Disj(p,q))   = app (neg p) (neg q)

fun int []      ys = []
  | int (x::xs) ys = if mem x ys then x :: int xs ys else int xs ys

fun mem x []      = false
  | mem x (y::ys) = (x = y) orelse mem x ys

fun app []      ys = ys
  | app (x::xs) ys = x :: app xs ys
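For readers who want to run the example, here is a transliteration of the tautology checker into Python. The constructor names Atom, Neg, Conj, and Disj are taken from the pattern matches above; the emptiness test in the second clause of taut is our reading of a damaged line in the extracted text, so treat this as a sketch rather than the paper's exact program.

```python
# Hypothetical Python transliteration of the ML tautology checker above.
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Neg:
    f: object

@dataclass(frozen=True)
class Conj:
    p: object
    q: object

@dataclass(frozen=True)
class Disj:
    p: object
    q: object

def taut(f):
    # A CNF formula is a tautology iff every conjunct (a disjunction of
    # literals) mentions some atom both positively and negatively.
    if isinstance(f, Conj):
        return taut(f.p) and taut(f.q)
    return inter(pos(f), neg(f)) != []

def pos(f):  # atoms occurring positively in a disjunction of literals
    if isinstance(f, Atom):
        return [f.name]
    if isinstance(f, Neg):
        return []
    return app(pos(f.p), pos(f.q))  # Disj case

def neg(f):  # atoms occurring negatively
    if isinstance(f, Atom):
        return []
    if isinstance(f, Neg):
        return [f.f.name]
    return app(neg(f.p), neg(f.q))  # Disj case

def inter(xs, ys):  # list intersection ('int' in the ML version)
    if not xs:
        return []
    head = [xs[0]] if mem(xs[0], ys) else []
    return head + inter(xs[1:], ys)

def mem(x, ys):
    return bool(ys) and (x == ys[0] or mem(x, ys[1:]))

def app(xs, ys):
    return ys if not xs else [xs[0]] + app(xs[1:], ys)

# p \/ ~p is a tautology; p \/ q is not.
p, q = Atom("p"), Atom("q")
assert taut(Disj(p, Neg(p)))
assert not taut(Disj(p, q))
```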

The partial flow graph for this program is shown in Figure . To reduce clutter, we have not explicitly shown control transfers due to procedure calls; moreover, to aid the reader in understanding the control flow behavior of this program, each non-tail call is connected to the basic block corresponding to its return address with a dashed arc. The control flow grammar G = (V, T, P, S) for this program is given by the following: V = {taut, pos, neg, int, mem, app, B0, ..., B33}, T = {L1, ..., L27}, S = taut, and the set of productions P as shown in Figure .

Figure : A Partial Flow Graph for the Tautology Checker Program

Figure : Productions for the control flow grammar of the program in Section

The set of possible return addresses for each function, as obtained using 0CFA, is as follows:

taut: L
pos: L, L, L
neg: L, L, L
int: L, L
mem: L
app: L, L, L, L, L, L
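Return-address sets like these are FOLLOW sets of the control flow grammar, which is the grammatical reading of 0CFA. The sketch below computes them under a simplifying assumption that matches the shape of control flow grammars: every nonterminal occurrence is either immediately followed by a terminal return label (a non-tail call) or ends its right-hand side (a tail call). The toy grammar is illustrative only, not the grammar of the tautology checker.

```python
# Sketch: 0CFA as a FOLLOW-set computation on a control flow grammar.
# Assumption: each nonterminal occurrence is either immediately followed
# by a terminal return label, or is last in its right-hand side (tail call).

def follow_sets(prods, start):
    follow = {n: set() for n in prods}
    follow[start].add("$")  # end of program
    changed = True
    while changed:
        changed = False
        for lhs, alts in prods.items():
            for rhs in alts:
                for i, sym in enumerate(rhs):
                    if sym not in prods:
                        continue  # terminal: a return label
                    if i + 1 < len(rhs):
                        new = {rhs[i + 1]}  # return label of a non-tail call
                    else:
                        new = follow[lhs]   # tail call inherits caller's FOLLOW
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

# Toy grammar (not the paper's): main calls f with return label L1;
# f tail-calls g, so g returns past f, directly to L1.
prods = {
    "main": [["B0"]],
    "B0":   [["f", "L1", "B1"]],
    "B1":   [[]],
    "f":    [["g"]],
    "g":    [[]],
}
fs = follow_sets(prods, "main")
# fs["g"] == {"L1"}: the tail-called g returns to f's return address.
```

The fixpoint loop is exactly the FOLLOW propagation of parsing theory: a tail call makes the callee inherit the caller's FOLLOW set, which is how returns "skip over" tail-call frames.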

Due to space constraints, we do not reproduce all the sets of LR(0) items for this grammar. The difference between 0CFA and 1CFA can be illustrated by examining the behavior of the function app. On examining the viable prefix DFA, we find four states that are relevant to this function. One of these states consists of the following two groups of LR(0) items:

B → pos L · B L        B → pos L · B L
B → · app L            B → · app L
app → · B L            app → · B L
B → · B L              B → · B L
B → · B L              B → · B L
B → · L                B → · L
B → · app L            B → · app L

From the forward chains in the first group, we can determine that app can be called from basic block B of the function pos with return label L (note that this refers to a block that, because of the control flow effects of tail-call optimization, does not belong to the calling function), and that app can then recursively call itself with the same return label. In this case, the return label indicates that the calling function pos was itself called from taut. The second group of LR(0) items shows a similar call sequence to app from basic block B, except that in this case the calling function pos is being called recursively from basic block B in the body of pos. The remaining three states relevant to the function app provide similar information: one of these contains two groups of items, the first of which is identical to the first group above, and the second of which is similar to the second group above except that it refers to a recursive call to pos from basic block B; the remaining two states provide similar information for calls to app from the function neg.

Trading Precision for Efficiency

One of the biggest advantages we see for a grammatical formulation of control flow analysis is that grammars and parsing have been studied extensively and are generally well understood. Because of this, a wide variety of techniques and tools originally devised for syntax analysis are applicable to control flow analysis.

As an example, consider the fact that the efficiency of compile-time analyses can be improved by reducing the amount of information maintained and manipulated, i.e., by decreasing precision. In the case of control flow analysis, determining where control came from involves examining the states of the viable prefix DFA of the control flow grammar, constructed using LR(0) items. It is well known that the number of states in such a DFA can become very large, but that by judiciously merging certain states (those with a common kernel), the number of states can be reduced considerably without significantly sacrificing the information contained in the DFA. Parsers constructed in this way are known as LALR(1) parsers, which can be built efficiently without first building the LR(1) DFA.

It does not come as a surprise that the same idea can be applied to 1CFA as well. The resulting analysis is more precise than 0CFA and potentially somewhat less precise than 1CFA; with tongue firmly in cheek, we call such an analysis ½CFA. It is usually considerably more efficient than 1CFA. As an example, for the tautology checker program of the previous section, the viable prefix DFA constructed from LR(0) items contains noticeably more states than the one constructed from LALR items; if we consider the entire tautology checker from Paulson, which works for arbitrary propositional formulae, the gap between the LR(0) viable prefix DFA and the LALR DFA is larger still. If we focus on calls to the function app, as in the previous section, we find that with LALR items it suffices to examine a single state of the DFA, in contrast to four states for the LR(0) case. Moreover, there is no loss of information regarding the calling contexts in this case.
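The state merging with a common kernel described above can be sketched as follows. The item representation and the block and label names are hypothetical; "kernel" here means the items with their context component stripped, so states that differ only in calling context collapse into one, in the spirit of LALR construction:

```python
# Sketch: merging viable-prefix-DFA states that share a common kernel.
# A state is a frozenset of items (production, dot_position, context);
# states whose items agree on everything except the context component
# are merged, with their contexts unioned.
from collections import defaultdict

def merge_by_kernel(states):
    def kernel(state):
        return frozenset((prod, dot) for prod, dot, _ctx in state)
    groups = defaultdict(set)
    for state in states:
        groups[kernel(state)] |= set(state)
    return [frozenset(items) for items in groups.values()]

# Two states that differ only in the calling context (label L1 vs L2),
# plus one state with a different kernel that must be kept apart:
s1 = frozenset({("B5 -> pos . B6", 1, "L1")})
s2 = frozenset({("B5 -> pos . B6", 1, "L2")})
s3 = frozenset({("B7 -> neg . B8", 1, "L3")})
merged = merge_by_kernel([s1, s2, s3])
# Two states remain: s1 and s2 collapse into one; s3 stays separate.
```

As in LALR parsing, the merged state still records both contexts, so precision is lost only when the same kernel is reached from contexts that the client analysis would otherwise have kept distinct.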

Conclusions

Knowledge of low-level control flow is essential for many compiler optimizations. In systems with tail call optimization, the determination of interprocedural control flow is complicated by the fact that, because of tail call optimization, control flow at procedure returns is not readily evident from the call graph of the program. In this paper we show how interprocedural control flow analysis of first-order programs can be carried out using well-known concepts from parsing theory. In particular, we show that 0CFA corresponds to the notion of FOLLOW sets in context-free grammars, and 1CFA corresponds to the analysis of LR(0) items. The control flow information so obtained can be used to improve the precision of interprocedural dataflow analyses, as well as to extend certain low-level code optimizations across procedure boundaries.

References

A. V. Aho, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1986.

J.-D. Choi, M. Burke, and P. Carini, "Efficient Flow-Sensitive Interprocedural Computation of Pointer-Induced Aliases and Side Effects", Proc. 20th ACM Symposium on Principles of Programming Languages, Jan. 1993.

K. De Bosschere and S. K. Debray, "alto: A Link-Time Optimizer for the DEC Alpha", Technical Report, Dept. of Computer Science, The University of Arizona, Tucson, June.

S. K. Debray and T. A. Proebsting, "Interprocedural Control Flow Analysis of First-Order Programs with Tail Call Optimization", Technical Report, Dept. of Computer Science, The University of Arizona, Tucson, Dec.

N. Heintze, "Control-Flow Analysis and Type Systems", Technical Report, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, Dec.

F. Henglein and J. Jørgensen, "Formally Optimal Boxing", Proc. 21st ACM Symp. on Principles of Programming Languages, Portland, OR, Jan. 1994.

J. E. Hopcroft and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley, 1979.

S. Jagannathan and S. Weeks, "A Unified Treatment of Flow Analysis in Higher-Order Languages", Proc. 22nd ACM Symp. on Principles of Programming Languages, San Francisco, Jan. 1995.

W. Landi and B. G. Ryder, "A Safe Approximate Algorithm for Interprocedural Pointer Aliasing", Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation, June 1992.

T. Lindgren, "Control Flow Analysis of Prolog", Proc. International Symposium on Logic Programming, Dec., MIT Press.

X. Leroy, "Unboxed objects and polymorphic typing", Proc. 19th ACM Symp. on Principles of Programming Languages, Albuquerque, NM, Jan. 1992.

J. C. Peterson, "Untagged Data in Tagged Environments: Choosing Optimal Representations at Compile Time", Proc. Functional Programming Languages and Computer Architecture, London, Sept. 1989.

L. C. Paulson, ML for the Working Programmer, Cambridge University Press, 1991.

S. Peyton Jones and J. Launchbury, "Unboxed values as first class citizens in a non-strict functional language", Proc. Functional Programming Languages and Computer Architecture, 1991.

T. A. Proebsting and S. A. Watterson, "Filter Fusion", Proc. 23rd ACM Symposium on Principles of Programming Languages, Jan. 1996.

E. Ruf, "Context-Insensitive Alias Analysis Reconsidered", Proc. SIGPLAN Conference on Programming Language Design and Implementation, June 1995.

M. Sharir and A. Pnueli, "Two Approaches to Interprocedural Data Flow Analysis", in Program Flow Analysis: Theory and Applications, eds. S. S. Muchnick and N. D. Jones, Prentice-Hall, 1981.

O. Shivers, "Control Flow Analysis in Scheme", Proc. SIGPLAN Conference on Programming Language Design and Implementation, June 1988.

O. Shivers, Control Flow Analysis of Higher-Order Languages, PhD Dissertation, Carnegie Mellon University, May 1991. Also available as a Technical Report, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.

Y. Tang and P. Jouvelot, "Control-Flow Effects for Escape Analysis", Proc. WSA, Bordeaux, France.

Y. Tang and P. Jouvelot, "Separate Abstract Interpretation for Control Flow Analysis", Proc. TACS.

D. Tarditi, G. Morrisett, P. Cheng, C. Stone, R. Harper, and P. Lee, "TIL: A Type-Directed Optimizing Compiler for ML", Proc. SIGPLAN Conference on Programming Language Design and Implementation, June 1996.

R. P. Wilson and M. S. Lam, "Efficient Context-Sensitive Pointer Analysis for C Programs", Proc. SIGPLAN Conference on Programming Language Design and Implementation, June 1995.