Interprocedural Control Flow Analysis of First-Order Programs
with Tail Call Optimization
Saumya K. Debray and Todd A. Proebsting
Department of Computer Science
The University of Arizona
Tucson, AZ 85721, USA.
Email: {debray, [email protected]
Technical Report 96-20
December 2, 1996
Abstract
The analysis of control flow
Involves figuring out where returns will go.
How this may be done
With items LR-0 and -1
Is what in this paper we show.
The work of S. Debray was supported in part by the National Science Foundation under grant number CCR-9502826. The work of T. Proebsting was supported in part by NSF Grants CCR-9415932, CCR-9502397, ARPA Grant DABT63-95-C-0075, and IBM Corporation.
1 Introduction
Most code optimizations depend on control flow analysis, typically expressed in the form of a control flow graph [1]. Traditional algorithms construct intraprocedural flow graphs, which do not account for control flow between procedures. Optimizations that depend on this limited information cannot consider the behavior of other procedures. Interprocedural versions of these optimizations must capture the flow of control across procedure boundaries. Determining interprocedural control flow for first-order programs is relatively straightforward in the absence of tail call optimization, since procedures return control to the point immediately after the call. Tail call optimization complicates the analysis because returns may transfer control to a procedure other than the active procedure's caller.
The problem can be illustrated by the following simple program that takes a list of values and prints, in their original order, all the values that satisfy some property, e.g., exceed 100. To take advantage of tail-call optimization, it uses an accumulator to collect these values as it traverses the input list. However, this causes the order of the values in the accumulator to be reversed, so the accumulated list is reversed (again using an accumulator) before it is returned.
    1  main(L) = print(extract(L, []))
    2  extract(xs, acc) =
    3      if xs = [] then reverse(acc, [])
    4      else if hd(xs) > 100 then extract(tl(xs), cons(hd(xs), acc))
    5      else extract(tl(xs), acc)
    6  reverse(xs, acc) =
    7      if xs = [] then acc
    8      else reverse(tl(xs), cons(hd(xs), acc))
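For concreteness, the program's behavior can be sketched in Python (a transliteration, not the paper's notation; note that Python does not perform tail call optimization, so the tail calls below are ordinary recursive calls):

```python
def main(lst):
    print(extract(lst, []))

def extract(xs, acc):
    # Collect values exceeding 100 into acc (in reverse order),
    # then tail-call reverse to restore the original order.
    if not xs:
        return reverse(acc, [])
    if xs[0] > 100:
        return extract(xs[1:], [xs[0]] + acc)
    return extract(xs[1:], acc)

def reverse(xs, acc):
    # Accumulator-based list reversal.
    if not xs:
        return acc
    return reverse(xs[1:], [xs[0]] + acc)
```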
Suppose that, for code optimization purposes, we want to construct a flow graph for this entire program. The return from the function reverse in line 7 corresponds to some basic block, and in order to construct the flow graph we need to determine the successors of this block. The call graph of the program indicates that reverse can be called either from extract, in line 3, or from reverse, in line 8. However, because of tail call optimization, it turns out that reverse does not return to either of these call sites. Instead, it returns to a different procedure entirely, namely, to the procedure main in line 1: the successor block to the return in line 7 is the basic block that calls print. Clearly, some nontrivial control flow analysis is necessary to determine this.
Most of the work to date on control flow analysis has focused on higher-order languages: Shivers [17, 18] and Jagannathan and Weeks [7] use abstract interpretation for this purpose, while Heintze [4] and Tang and Jouvelot [19, 20] use type-based analyses. These analyses are very general, but very complex. Many widely used languages, such as Sisal and Prolog, are first-order languages. Furthermore, even for higher-order languages, specific programs often use only first-order constructs, or can have most higher-order constructs removed via transformations such as inlining and uncurrying [21]. As a pragmatic issue, therefore, we are interested in "ordinary" first-order programs: our aim is to account for interprocedural control flow in such programs in the presence of tail call optimization. To our knowledge, the only other work addressing this issue is that of Lindgren [9], who uses set-based analysis for control flow analysis of Prolog. Unlike Lindgren's work, our analyses can maintain context information (see Section 6).
The main contribution of this paper is to show how control flow analysis of first-order programs with tail call optimization can be formulated in terms of simple and well-understood concepts from parsing theory. In particular, we show that context-insensitive, or zeroth-order, control flow analysis corresponds to the notion of FOLLOW sets in context free grammars, while context-sensitive, or first-order, control flow analysis corresponds to the notion of LR(1) items. This is useful, because it allows the immediate application of well-understood technology without, for example, having to construct complex abstract domains. It is also esthetically pleasing, in that it provides an application of concepts such as FOLLOW sets and LR(1) items, which were originally developed purely in the context of parsing, to a very different application.
The remainder of the paper is organized as follows. Section 2 introduces definitions and notation. Section 3 defines an abstract model for control flow, and Section 4 shows how this model can be described using context free grammars. Section 5 discusses control flow analyses that maintain no context information, and Section 6 discusses how context information can be maintained to produce more precise analyses. Section 7 illustrates these ideas with a nontrivial example. Section 8 discusses tradeoffs between efficiency and precision.
In order to maintain continuity, the proofs of theorems have been relegated to the appendix.
2 Definitions and Notation
We assume that a program consists of a set of procedure definitions, together with an entry point procedure. It is straightforward to extend these ideas to accommodate multiple entry points. Since we assume a first-order language, the intraprocedural control flow can be modelled by a control flow graph [1]. This is a directed graph where each node corresponds to a basic block, i.e., a maximal sequence of executable code that has a single entry point and a single exit point, and where there is an edge from a node A to a node B if and only if it is possible for execution to leave node A and immediately enter node B. If there is an edge from a node A to a node B, then A is said to be a predecessor of B and B is a successor of A. Because of the effects of tail call optimization, interprocedural control flow information cannot be assumed to be available. Therefore, we assume that the input to our analysis consists of one control flow graph for each procedure defined in the program.
For simplicity of exposition, we assume that each flow graph has a single entry node. Each flow graph consists of a set of vertices, which correspond to basic blocks, and a set of edges, which capture control flow between basic blocks. If a basic block contains a procedure call, the call is assumed to terminate the block; if a basic block B ends in a call to a procedure p, we say that B calls p.
If the last action along an execution path in a procedure p is a call to some procedure q (i.e., if the only action that would be performed on return from q would be to return to the caller of p), the call to q is termed a tail call. A tail call can be optimized: in particular, any environment allocated for the caller p can be deallocated, and control transfer effected via a direct jump to the callee q; this is usually referred to as "tail call optimization," and is crucial for efficient implementations of functional and logic languages. If a basic block B ends in a tail call, we say that it is a tail call block; if B ends in a procedure call that is not a tail call, we say B is a call block. In the latter case, B must set a return address L before making the call: L is said to be a return label. If a basic block B ends in a return from a procedure, it is said to be a return block. As is standard in the program analysis literature, we assume that either branch of a conditional can be executed at runtime. The ideas described here are applicable to programs that do not satisfy this assumption; in that case, the analysis results will be sound but possibly conservative.
The set of basic blocks and labels appearing in a program P are denoted by Blocks_P and Labels_P respectively. The set of procedures defined in it is denoted by Procs_P. Finally, the Kleene closure of a set S, i.e., the set of all finite sequences of elements of S, is written S*. The reflexive transitive closure of a relation R is written R*.
3 Abstracting Control Flow
Before we can analyze the control flow behavior of such programs, it is necessary to specify this behavior carefully. First, consider the actual runtime behavior of a program:
- Code not involving procedure calls or returns is executed as expected: each instruction in a basic block is executed in turn, after which control moves to a successor block, and so on.
- Procedure calls are executed as follows.
  - A non-tail call loads arguments into the appropriate locations, saves the return address (for simplicity, we can assume that it is pushed on a control stack), and branches to the callee.
  - A tail call loads arguments, reclaims any space allocated for the caller's environment, and transfers control to the callee.
- A procedure return simply pops the topmost return address from the control stack and transfers control to this address.
We can ignore any aspect of a program's runtime behavior that is not concerned directly with flow of control. Conceptually, therefore, control moves from one basic block to another, pushing a return address on a stack when making a non-tail procedure call, and popping an address from it when returning from a procedure.
This can be formalized using a very simple pushdown automaton: the automaton M_P corresponding to a program P is called its control flow automaton. Given a program P, the set of states Q of M_P is given by Q = Blocks_P ∪ Procs_P; its input alphabet is Labels_P; the initial state of M_P is p, where p is the entry point of P; and its stack alphabet is Labels_P ∪ Blocks_P ∪ {$}, where $ is a special bottom-of-stack marker that is the initial stack symbol.
The general idea is that the state of M_P at any point corresponds to the basic block being executed by P, while the return labels on its stack correspond to the stack of procedure calls in P. The input string does not play a direct role in determining the behavior of M_P, but it turns out to be technically very convenient to match up symbols read from the input with labels popped from the stack. The language accepted by M_P is then the set of sequences of labels that control can jump to on procedure returns during an execution of the program P.
Let the transitions of a pushdown automaton be denoted as follows [6]: if, from a configuration where it is in state q and has w in its input and α on its stack, it can make a transition to state q' with input string w' and with β on its stack, we write (q, w, α) ⊢ (q', w', β). The stack contents are written such that the top of the stack is to the left: if α = a_1 ... a_n then a_1 is assumed to be at the top of the stack. The moves of M_P are defined as follows:
1. If basic block B is a predecessor of basic block B', and B does not make a call or tail call, then M_P can make an ε-move from B to B':

       (B, w, α) ⊢ (B', w, α)

2. If basic block B makes a call to procedure p with return label ℓ, where the basic block with label ℓ is B', then M_P can push two symbols ℓB' on its stack and make an ε-move to state p:

       (B, w, α) ⊢ (p, w, ℓB'α)

3. If basic block B makes a tail call to procedure p, then M_P can make an ε-move to state p:

       (B, w, α) ⊢ (p, w, α)

4. If the entry node of the flow graph of a procedure p is B, then M_P can make an ε-move from state p to state B:

       (p, w, α) ⊢ (B, w, α)

5. If B is a return block, then if ℓ appears on M_P's input and the label ℓ and block B' appear on the top of its stack, then M_P can read ℓ from the input, pop ℓ and B' off its stack, and go to state B':

       (B, ℓw, ℓB'α) ⊢ (B', w, α)

6. Finally, M_P accepts by empty stack: for each state q, there is the move

       (q, ε, $) ⊢ (q, ε, ε).
We refer to the label appearing on the top of M_P's stack as the current return label, since this is the label of the program point to which control returns from a return block.

[Figure 1: A Partial Flow Graph for the main/extract/reverse Program. main consists of blocks B1 (which calls extract, with ret addr = L2) and B2 (labelled L2, which calls print); extract consists of blocks B3 through B7, with B4 calling reverse and B6 and B7 calling extract; reverse consists of blocks B8 through B10, with B9 a return block and B10 calling reverse.]
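The moves above can be traced concretely on the example program of Section 1; the following sketch (an illustrative encoding, with the stack top at the end of a Python list) follows one execution path and shows why reverse returns to label L2 rather than to either of its call sites:

```python
stack = ["$"]              # bottom-of-stack marker

def call(label, block):
    # Move 2: a non-tail call pushes the return label and its block.
    stack.append(block)
    stack.append(label)

def tail_call():
    # Move 3: a tail call leaves the control stack untouched.
    pass

def ret():
    # Move 5: a return pops the current return label and its block.
    label = stack.pop()
    block = stack.pop()
    return label, block

call("L2", "B2")        # B1 in main calls extract, return address L2
tail_call()             # B6/B7: extract tail-calls itself
tail_call()             # B4: extract tail-calls reverse
tail_call()             # B10: reverse tail-calls itself
label, block = ret()    # B9: reverse executes a return

# Control returns to label L2 (block B2 in main), not to extract or reverse.
```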
4 Control Flow Grammars
Given a set of control flow graphs for the procedures in a program, we can construct a context free grammar that describes its control flow behavior. We call such a grammar a control flow grammar.

Definition 4.1 A control flow grammar for a program P is a context free grammar G_P = (V, T, P, S) where the set of terminals T is the set Labels_P of return labels of P; the set of variables V is given by V = Blocks_P ∪ Procs_P; the start symbol of the grammar is the entry procedure p of the program; and the productions of the grammar are given by the following:
1. if B is a basic block that is a predecessor of a basic block B', then there is a production B → B';
2. if B is a call block with return label ℓ, where the basic block labelled ℓ is B', and the called procedure is p, there is a production B → p ℓ B';
3. if B is a tail call block and the called procedure is p, then there is a production B → p;
4. if p is a procedure defined in P, and the control flow graph of p has entry node B, then there is a production p → B;
5. if B is a return block, then there is a production B → ε.
Example 4.1 Consider the main/extract/reverse program from Section 1. A partial flow graph for this program is shown in Figure 1, with ordinary control transfers shown using solid lines and calls to procedures using dashed lines. Because control transfers at procedure returns have not yet been determined, the predecessors of basic block B2 and the successors of block B9 are not yet known. The control flow grammar for this program has as its terminals the set of labels {L2}, nonterminals {main, extract, reverse, B1, ..., B10}, and the following productions:

    main → B1              B5 → B7
    B1 → extract L2 B2     B6 → extract
    B2 → print             B7 → extract
    extract → B3           reverse → B8
    B3 → B4                B8 → B9
    B3 → B5                B8 → B10
    B4 → reverse           B9 → ε
    B5 → B6                B10 → reverse

The start symbol of the grammar is main. □
The productions of the control flow grammar G_P closely resemble the moves of the control flow automaton M_P, and it comes as no surprise that they behave very similarly. Let ⇒_lm denote the leftmost derivation relation in G_P. The following theorem, whose proof closely resembles the standard proof of the equivalence between pushdown automata and context-free languages [6], expresses the intuition that the control flow grammar of a program mirrors the behavior of its control flow automaton:

Theorem 4.1 Given a program P with entry point S, control flow grammar G_P and control flow automaton M_P,

    S ⇒*_lm xAγ if and only if (S, xw, $) ⊢* (A, w, γ)

where x, w ∈ Labels_P* and A ∈ Blocks_P ∪ Procs_P.
5 Zeroth-Order Control Flow Analysis
Zeroth-order control flow analysis, also referred to as 0-CFA, involves determining, for each procedure p in a program, the set of labels RetLbl(p) to which control can return after some call to p. Consider a call block B in a program P. If B is not a tail-call block, it pushes its return address onto the top of the stack before transferring control to the called procedure. On the other hand, if B is a tail call block, it leaves the control stack untouched and transfers control directly to the callee. Eventually, when executing a return, control branches to the label appearing on the top of the stack. Thus, in either case, the set of labels to which a procedure p can return is the set of current return labels when control enters p, in some configuration reachable from the initial configuration of M_P:

    RetLbl(p) = { ℓ | (S, w, $) ⊢* (p, w', ℓγ) }
It is a direct consequence of Theorem 4.1 that this set is precisely the FOLLOW set of p in the control flow grammar of the program (see [1] for the definition of FOLLOW sets):

Theorem 5.1 For any procedure p in a program P, RetLbl(p) = FOLLOW(p).
Proof. Suppose the program P has entry point S. From the definition of RetLbl(p), ℓ ∈ RetLbl(p) if and only if there is a call block A that calls p, such that (S, xw, $) ⊢* (A, w, γ) ⊢ (p, w, ℓBγ). From Theorem 4.1, this is true if and only if S ⇒*_lm xAγ' ⇒_lm x p ℓ B γ', i.e., S ⇒*_lm x p ℓ δ. But this is true if and only if ℓ ∈ FOLLOW(p). It follows that RetLbl(p) = FOLLOW(p). □
Example 5.1 FOLLOW sets for some of the variables in the grammar of Example 4.1 are:

    X          FOLLOW(X)
    main       $
    extract    L2
    reverse    L2

Here, a '$' refers to the "external caller," e.g., the user. It is immediately apparent from this that control returns from reverse to the basic block in main that calls print. □
Input: A program P.
Output: The interprocedural 0-CFA control flow graph for P.
Method:
1. Construct the control flow grammar G_P for P.
2. Compute FOLLOW sets for the nonterminals of G_P.
3. Construct a partial control flow graph for P, without accounting for control transfers due to procedure returns. This is done by adding edges corresponding to intra-procedural control transfers, as in [1], together with an edge from each call block and tail-call block to the entry node of the corresponding called function.
4. For each ℓ ∈ Labels_P and each nonterminal X of G_P do:
       if ℓ ∈ FOLLOW(X) and the basic block corresponding to X contains a return instruction
       then add an edge from X to B_ℓ, where B_ℓ is the basic block labelled by ℓ;

Figure 2: An algorithm for constructing the interprocedural 0-CFA control flow graph of a program
There is one remaining subtlety in constructing the interprocedural control flow graph of a program once the set of return labels for each function has been computed. If we consider the FOLLOW sets of the grammar of Example 4.1, we find that L2 occurs in FOLLOW(extract), and it is correct to infer from this that control is transferred to the basic block B2, labelled by L2, after completion of the call to extract. However, we cannot conclude that each block that has L2 in its FOLLOW set has the block B2 as a successor: while L2 occurs in the FOLLOW sets of B3, B4, B5, B6, B7, B8, B9 and B10, it is not difficult to see that only B9 (which actually contains a return instruction) should have B2 as a successor. The algorithm for constructing the control flow graph of a program, taking this into account, is given in Figure 2.
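The FOLLOW computation of step 2 is the standard fixpoint from parsing theory; a minimal sketch over the grammar of Example 4.1 (an illustrative encoding, with "$" standing for the external caller):

```python
# Grammar of Example 4.1 as (lhs, rhs) pairs; "L2" is the only terminal.
productions = [
    ("main", ["B1"]), ("B1", ["extract", "L2", "B2"]), ("B2", ["print"]),
    ("extract", ["B3"]), ("B3", ["B4"]), ("B3", ["B5"]),
    ("B4", ["reverse"]), ("B5", ["B6"]), ("B5", ["B7"]),
    ("B6", ["extract"]), ("B7", ["extract"]),
    ("reverse", ["B8"]), ("B8", ["B9"]), ("B8", ["B10"]),
    ("B9", []), ("B10", ["reverse"]),
]
terminals = {"L2"}
start = "main"

# All nonterminal symbols, including `print`, whose flow graph is not given.
nonterminals = {lhs for lhs, _ in productions}
for _, rhs in productions:
    nonterminals.update(s for s in rhs if s not in terminals)

# Fixpoint for nullability and FIRST sets.
nullable, first = set(), {n: set() for n in nonterminals}
changed = True
while changed:
    changed = False
    for lhs, rhs in productions:
        if lhs not in nullable and all(s in nullable for s in rhs):
            nullable.add(lhs)
            changed = True
        f = set()
        for s in rhs:
            if s in terminals:
                f.add(s)
                break
            f |= first[s]
            if s not in nullable:
                break
        if not f <= first[lhs]:
            first[lhs] |= f
            changed = True

# Fixpoint for FOLLOW sets, scanning each right-hand side right to left.
follow = {n: set() for n in nonterminals}
follow[start].add("$")
changed = True
while changed:
    changed = False
    for lhs, rhs in productions:
        trailer = set(follow[lhs])
        for s in reversed(rhs):
            if s in terminals:
                trailer = {s}
                continue
            if not trailer <= follow[s]:
                follow[s] |= trailer
                changed = True
            trailer = (trailer | first[s]) if s in nullable else set(first[s])

# follow["reverse"] == {"L2"}: reverse returns to label L2, as in Example 5.1.
```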
5.1 Applications of 0-CFA
An example application of 0-CFA is interprocedural unboxing optimization in languages that are either dynamically typed, or that support polymorphic typing. In implementations of such languages, the compiler cannot always predict the exact type of a variable at a program point, and as a result it becomes necessary to ensure that values of different types "look the same," which is achieved by "boxing." Unfortunately, manipulating boxed values is expensive.
The issue of maintaining untagged values has received considerable attention in recent years in the context of strongly typed polymorphic languages [5, 10, 13]. Using explicit "representation types," this work relies on the type system to propagate data representation information through the program. While theoretically elegant, the type system cannot be aware of low-level pragmatic concerns such as the costs of various representation conversion operations and the execution frequencies of different code fragments. As a result, it is difficult to guarantee that the "optimized" program is, in fact, more efficient than the unoptimized version. Also, the idea does not extend readily to dynamically typed languages.
Peterson [11] takes a procedure's control flow graph, and determines the optimal placement of representation conversion operations, based on basic block execution frequencies and conversion operation costs. As given, this is an intraprocedural optimization. For many programs, unboxing across procedure calls yields significant performance improvements. As an example, we tested a program that computes ∫₀¹ eˣ dx using trapezoidal numerical integration with adaptive quadrature. For this program, intra-procedural unboxing optimization yields a performance improvement of about 30.3% (with a tail call from a function to itself being recognized and implemented as a loop). With inter-procedural unboxing, however, performance improves by about 52.9%.[1] To apply Peterson's algorithm interprocedurally, we need to construct the control flow graph for the entire program. The presence of tail call optimization makes computing the set of successors of a return block difficult with just the call graph of the program. Fortunately, 0-CFA provides precisely what is needed to determine the successors of return blocks, and thereby to construct the control flow graph for a program.

[1] We did not implement Peterson's algorithm, because the control flow analysis described here had not been developed when we implemented the unboxing optimization. Instead, our compiler uses a heuristic that produces good, but not necessarily optimal, representation conversion placements. These performance improvements can therefore be seen as lower bounds on what can be attained using an optimal algorithm.
Another application of 0-CFA is interprocedural basic block fusion. The basic idea of this optimization is straightforward: if we have two basic blocks B_0 and B_1 where B_0 is the only predecessor of B_1 and B_1 is the only successor of B_0, then they can be combined into a single basic block; in the interprocedural case, the blocks being fused may belong to different procedures. The benefits of this optimization include a reduction in the number of jump instructions executed, with a concomitant decrease in pipeline "bubbles," as well as potentially improved opportunities for better instruction scheduling in the enlarged basic block resulting from the optimization. Our experience with filter fusion [14] indicates that this optimization can be of fundamental importance for performance in applications involving automatically generated source code.
Consider the main/extract/reverse program from Section 1. A partial flow graph for this program is given in Figure 1. It is not difficult to see that the 0-CFA algorithm of Figure 2 would determine that basic block B2 is the only successor of block B9, and B9 is the only predecessor of B2, thereby allowing these two blocks to be fused. Note that a naive analysis that handles tail calls as if they were calls that returned to an empty basic block immediately following the call site would infer that basic block B2 had three predecessors, blocks B4, B6 and B7, thereby preventing the application of the optimization in this case.
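Once the 0-CFA return edges are in place, finding fusable pairs is a purely local check on the flow graph. A minimal sketch (with a hypothetical encoding of a fragment of the example's 0-CFA graph; block names follow Figure 1):

```python
def fusable_pairs(cfg):
    """Return pairs (b0, b1) where b0 is b1's only predecessor and b1 is
    b0's only successor; cfg maps each block to its list of successors."""
    preds = {}
    for b, succs in cfg.items():
        for s in succs:
            preds.setdefault(s, []).append(b)
    return [(b, succs[0]) for b, succs in cfg.items()
            if len(succs) == 1 and preds.get(succs[0]) == [b]]

# Fragment of the 0-CFA graph for the example program: the return edge
# from B9 goes to B2, so B9 and B2 can be fused; the entry block B8 of
# reverse has two predecessors (B4 and B10), so nothing else fuses.
cfg = {
    "B4":  ["B8"],          # tail call to reverse (entry block B8)
    "B8":  ["B9", "B10"],
    "B9":  ["B2"],          # return edge determined by 0-CFA
    "B10": ["B8"],          # tail call back to reverse
    "B2":  [],
}
```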
6 First-Order Control Flow Analysis
While 0-CFA tells us the possible return addresses for each basic block and procedure, it leaves out all information about the "context" in which a call occurs (i.e., who called whom to get to this call site). This may render 0-CFA inadequate in some situations. Information about where control "came from" could provide more precise liveness or aliasing information at a particular program point, allowing a compiler to generate better code.
At any point during a program P's execution, the return addresses on its control stack (which correspond to the contents of the stack in some execution of the control flow automaton M_P) give us a complete history of the interprocedural control flow behavior of the program up to that point. Since the set of all possible finite sequences of labels is infinite, we seek finitely computable approximations to this information. An obvious possibility is to keep track of the top k labels on the stack of M_P, for some fixed k ≥ 0. 0-CFA, where we keep track of no context information at all, corresponds to choosing k = 0. A control flow analysis that keeps track of the top k return addresses on the stack of M_P is called a k-th-order control flow analysis, or k-CFA (this corresponds to the "call-strings approach" of Sharir and Pnueli [16]). In this section, we focus our attention on first-order control flow analysis, or 1-CFA.
In the previous section, we showed that the FOLLOW sets of the control flow grammar give 0-CFA information. How might we incorporate additional context information into such analyses? In parsing theory, FOLLOW sets are used to construct SLR(1) parsers, which are based on LR(0) items. Because SLR(1) parsers do not maintain much context information, they are unable to handle many simple grammars. Introducing additional context information into the items using lookahead tokens fixes this problem: this leads to the use of LR(1) items.
This analogy carries over to control flow analysis. LR(1) items for the control flow grammar G_P are closely related to the information manipulated during 1-CFA. Basically, an LR(1) item [A → α·β, a] conveys the information that control can reach A with current return label a. In an item [A → α·β, a] we will often focus on the nonterminal A on the left hand side of the production and the lookahead token a, but not on the details of the structure of α·β: in such cases, to reduce visual clutter we will write the item as [A → ···, a]. In the context of this discussion we are not concerned with whether or not the control flow grammar G_P is LR(1)-parsable.
We know, from parsing theory, that given a control flow grammar G_P with variables V and terminals T, there is a nondeterministic finite automaton (NFA) (Q, Σ, δ, q_0, Q) that recognizes viable prefixes of G_P [6].
This NFA, which we will refer to as the viable prefix NFA, is defined as follows: its set of states Q consists of the set of LR(1) items for G_P, together with a state q_0 that is not an item; its alphabet Σ = V ∪ T; the initial state is q_0; every state is a final state; and the transition function δ is defined as follows:

(i) Given a program P with entry point p, δ(q_0, ε) = {[S → ·p, $]}.
(ii) δ([A → α·Bβ, a], ε) = {[B → ·γ, b] | B → γ is a production, b ∈ FIRST(βa)}.
(iii) δ([A → α·Bβ, a], B) = {[A → αB·β, a]} (B ≠ ε).
An item I is said to be reachable if q_0 ↝* I, i.e., if there is a path from the initial state q_0 of the viable prefix NFA to I. The following result makes explicit the correspondence between LR(1) items in G_P and return addresses on top of the stack of M_P:
Theorem 6.1 Given a program with entry point p, (p, x, $) ⊢* (A, y, aγ) if and only if there is a reachable item [A → ·α, a].
The set of current return labels, i.e., labels at the top of M_P's stack, when control enters a basic block or procedure is now easy to determine:

Corollary 6.2 Let A be any basic block or procedure in a program P. The set of current return labels when control enters A is given by {ℓ | there is a reachable LR(1) item [A → ·α, ℓ]}.
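Corollary 6.2 can be checked mechanically by closing over transitions (i)-(iii). A sketch for the grammar of Example 4.1 (an illustrative encoding, where an augmented production S' → main plays the role of transition (i)):

```python
from collections import deque

# Grammar of Example 4.1, augmented with S' -> main.
productions = [
    ("S'", ("main",)),
    ("main", ("B1",)), ("B1", ("extract", "L2", "B2")), ("B2", ("print",)),
    ("extract", ("B3",)), ("B3", ("B4",)), ("B3", ("B5",)),
    ("B4", ("reverse",)), ("B5", ("B6",)), ("B5", ("B7",)),
    ("B6", ("extract",)), ("B7", ("extract",)),
    ("reverse", ("B8",)), ("B8", ("B9",)), ("B8", ("B10",)),
    ("B9", ()), ("B10", ("reverse",)),
]
terminals = {"L2"}
nonterminals = {lhs for lhs, _ in productions}
for _, rhs in productions:
    nonterminals.update(s for s in rhs if s not in terminals)

# Standard nullable / FIRST fixpoint.
nullable, first = set(), {n: set() for n in nonterminals}
changed = True
while changed:
    changed = False
    for lhs, rhs in productions:
        if lhs not in nullable and all(s in nullable for s in rhs):
            nullable.add(lhs)
            changed = True
        f = set()
        for s in rhs:
            if s in terminals:
                f.add(s)
                break
            f |= first[s]
            if s not in nullable:
                break
        if not f <= first[lhs]:
            first[lhs] |= f
            changed = True

def first_of(symbols, lookahead):
    """FIRST(symbols . lookahead), as used in transition (ii)."""
    out = set()
    for s in symbols:
        if s in terminals:
            out.add(s)
            return out
        out |= first[s]
        if s not in nullable:
            return out
    out.add(lookahead)
    return out

# Reachable LR(1) items, represented as (production, dot position, lookahead).
items, work = set(), deque([(("S'", ("main",)), 0, "$")])
while work:
    item = work.popleft()
    if item in items:
        continue
    items.add(item)
    (lhs, rhs), dot, la = item
    if dot == len(rhs):
        continue
    X = rhs[dot]
    work.append(((lhs, rhs), dot + 1, la))           # transition (iii)
    if X not in terminals:
        for b in first_of(rhs[dot + 1:], la):        # transition (ii)
            for plhs, prhs in productions:
                if plhs == X:
                    work.append(((plhs, prhs), 0, b))

def return_labels(A):
    """Corollary 6.2: current return labels when control enters A."""
    return {la for (lhs, _), dot, la in items if lhs == A and dot == 0}
```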
An LR(1) item [A → ·α, a] tells us about the return labels that can appear on top of the control stack, i.e., about addresses that control can go to. Fortunately, it turns out that we can use the reachability relation ↝ in the viable prefix NFA to trace the origins of a call. Consider a program P with a call block A that calls a procedure p: in the control flow grammar G_P, this corresponds to a production

    A → p ℓ C

where ℓ is the return label and C is the block with label ℓ. This production gives rise to LR(1) items of the form [A → ·p ℓ C, b]. Let B_p be the entry node of the flow graph for p; then G_P contains the production p → B_p, so in the viable prefix NFA there is an ε-transition from each of these items to the item [p → ·B_p, ℓ]. Suppose the block B_p has successors C_1, ..., C_k; then G_P has productions B_p → C_1, ..., B_p → C_k, and the viable prefix NFA will therefore have ε-transitions from the item [p → ·B_p, ℓ] to each of the items [B_p → ·C_1, ℓ], ..., [B_p → ·C_k, ℓ]. Suppose one of these blocks, say C_j, makes a tail call to a procedure q, whose flow graph has entry node B_q; then G_P contains the productions C_j → q and q → B_q, and this gives rise to ε-transitions from [B_p → ·C_j, ℓ] to [C_j → ·q, ℓ] and thence to [q → ·B_q, ℓ]. We can follow ε-transitions in this way to trace a sequence of control transfers that does not involve any procedure returns. Conversely, we can follow ε-transitions backwards from a call to work out where it could have come from.
Intuitively, we want to be able to characterize a collection of successive basic blocks and procedures control can go through (i.e., a sequence of states of the control flow automaton) without any procedure returns, except perhaps at the very end. Since the set of sequences of blocks is infinite, we need a finite approximation: as before, one simple way to do this is to consider sets of blocks (there are only finitely many), together with the current return label when control reaches each block. These ideas can be made more precise using the notion of a forward chain:
Definition 6.1 A forward chain in a program P is a set {⟨B_0, ℓ_0⟩, ..., ⟨B_n, ℓ_n⟩} where each B_i is either a procedure in P or a basic block in P, ℓ_i ∈ Labels_P for 0 ≤ i ≤ n, and where for each i, 0 ≤ i < n, the following hold: (i) B_i is not a return block; and (ii) in the control flow automaton M_P, (B_i, x, ℓ_i α) ⊢ (B_{i+1}, y, ℓ_{i+1} β) for some x, y, α, β.

Reasoning as above, it is easy to show the following result:

Theorem 6.3 {⟨B_0, ℓ_0⟩, ..., ⟨B_n, ℓ_n⟩} is a forward chain in a program P if and only if there is a sequence of ε-transitions in the viable prefix NFA for G_P of the form

    [B_0 → ·B_1 β_0, ℓ_0] ↝ [B_1 → ·B_2 β_1, ℓ_1] ↝ ··· ↝ [B_{n-1} → ·B_n β_{n-1}, ℓ_{n-1}] ↝ [B_n → ·β_n, ℓ_n]

where ℓ_{i+1} ∈ FIRST(β_i ℓ_i), for some β_0, ..., β_n.

Now consider the process of applying the subset construction to the viable prefix NFA to construct an equivalent DFA. Each state of the DFA consists of a set of NFA states (that is, a set of LR(1) items) obtained by starting with a set of NFA states and then adding all the states reachable using only ε-transitions. The DFA construction is useful because the set of NFA states comprising each state of the DFA corresponds to the largest set of NFA states reachable from some initial set using only ε-transitions, i.e., to a set of maximal-length forward chains. In other words, if a forward chain occurs in a state of the viable prefix DFA, it is entirely contained in that state: it can never spill over into another state, thereby simplifying the search for forward chains. This is expressed by the following result:

Corollary 6.4 {⟨B_0, ℓ_0⟩, ..., ⟨B_n, ℓ_n⟩} is a forward chain in a program if and only if there is a state in the viable prefix DFA for G_P containing items [B_0 → ·B_1 β_0, ℓ_0], [B_1 → ·B_2 β_1, ℓ_1], ..., [B_{n-1} → ·B_n β_{n-1}, ℓ_{n-1}], [B_n → ·β_n, ℓ_n] where ℓ_{i+1} ∈ FIRST(β_i ℓ_i), for some β_0, ..., β_n.

Intuitively, control can "come from" a call A to a point B if there is a forward chain from A to B that contains no intervening calls, i.e., A is the most recent call preceding B.
The following result is now immediate:

Corollary 6.5 Let A be a call block or a tail call block in a program P, and B a basic block or a procedure in P. Then, control can come from A to B if and only if there is a state in the viable prefix DFA of G_P containing items [B_0 → ·B_1 β_0, ℓ_0], [B_1 → ·B_2 β_1, ℓ_1], ..., [B_{n-1} → ·B_n β_{n-1}, ℓ_{n-1}], [B_n → ·β_n, ℓ_n] where ℓ_{i+1} ∈ FIRST(β_i ℓ_i), for some β_0, ..., β_n, such that B_0 ≡ A, B_n ≡ B, and B_i is not a call block or tail call block for 0 < i < n.

We conjecture that the analogy between control flow analysis and LR items continues to hold when more context information is maintained. In particular, we conjecture that just as first-order control flow analysis (1-CFA) corresponds to LR(1) items, k-th-order control flow analysis (k-CFA) corresponds to LR(k) items.

6.1 Applications of 1-CFA

An example application of 1-CFA is in context-sensitive interprocedural data flow analysis. Much of the recent work on interprocedural data flow analysis has focused on languages such as C and Fortran, whose implementations usually do not support tail call optimization. These analyses determine, for each call, the behavior of the called procedure, then propagate this information to the program point to which that call returns. For the languages considered, the determination of the return points for the calls is straightforward. Because the point to which a call returns is not obvious in the presence of tail call optimization, it is not obvious how to apply these analyses to systems with tail call optimization. While 0-CFA can be used to determine the set of successors for each return block, this does not maintain enough context information to determine where control came from. As a result, the analysis can infer spurious pointer aliases by propagating information from one call site back to a different call site.
Using context-sensitive interprocedural analyses avoids this by maintaining information about where a call came from [2, 8, 22], which is precisely the information provided by 1-CFA. As a specific example of the utility of context-sensitive flow information, our experiments with dead code elimination based on interprocedural liveness analysis, in the context of the alto link-time optimizer [3] applied to a number of SPEC benchmarks, indicate that, compared to the number of register loads and stores that can be deleted based on context-insensitive liveness information, an additional 5-8% can be deleted using context-sensitive liveness information.

Whether or not a context-sensitive version of an interprocedural analysis is useful depends, to a great extent, on the analysis and the application under consideration. Our experiments with interprocedural liveness analysis indicate that there are situations where such analyses can lead to a noticeable improvement in the code generated. On the other hand, in comparing context-sensitive and context-insensitive analyses of aliasing due to indirect memory references through pointers, Ruf observes [15] that ". . . the context-sensitive analysis does compute more precise alias relationships at some program points. However, when we restrict our attention to the locations accessed by or modified by indirect memory references, no additional precision is measured." However, if a context-sensitive dataflow analysis is deemed necessary for a language implementation with tail call optimization, the control flow analysis described here can be used to provide the necessary support.
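The kind of imprecision mentioned above is easy to demonstrate in miniature. In the following hypothetical sketch (Python; the procedure, labels, and fact names are invented for illustration and do not come from the paper), a 0-CFA-style analysis knows only the set of possible return points of a procedure f, so it propagates a merged fact set to every return point, while a 1-CFA-style analysis keyed on return labels keeps the contexts apart:

```python
# Hypothetical illustration of 0-CFA vs. 1-CFA precision (names invented).
# Procedure f is reached from two call sites with return labels L1 and L2;
# an alias pair is created inside f only on the path that entered via L1.
aliases_in_f = {"L1": {("p", "q")}, "L2": set()}   # facts per calling context

# 0-CFA: the return block's successor set is {L1, L2}; information from
# all contexts is merged and flows to every possible return point.
merged = set().union(*aliases_in_f.values())       # context-insensitive join
zero_cfa = {label: merged for label in aliases_in_f}

# 1-CFA: "where control came from" is tracked, so each return point
# receives only the facts of its own context.
one_cfa = {label: facts for label, facts in aliases_in_f.items()}

print(zero_cfa["L2"])   # {('p', 'q')}  <- spurious alias at the second site
print(one_cfa["L2"])    # set()         <- precise
```

The merged result at L2 is exactly the spurious back-propagation described above.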
Figure 3: A Partial Flow Graph for the Tautology Checker Program
[Figure not reproduced: it shows the flow graphs of taut (blocks B0-B6), pos (B7-B13), neg (B14-B20), int (B21-B27), mem (B28-B30), and app (B31-B33), with each non-tail call annotated by its return address (L2, L4, L5, L6, L12, L13, L19, L20, L24, L27).]

    taut → B0           B0 → B1 | B3        B1 → taut L2 B2     B2 → ε
    B3 → pos L4 B4      B4 → neg L5 B5      B5 → int L6 B6      B6 → ε
    pos → B7            B7 → B8 | B9        B8 → ε              B9 → B10 | B11
    B10 → ε             B11 → pos L12 B12   B12 → pos L13 B13   B13 → app
    neg → B14           B14 → B15 | B16     B15 → ε             B16 → B17 | B18
    B17 → ε             B18 → neg L19 B19   B19 → neg L20 B20   B20 → app
    int → B21           B21 → B22 | B23     B22 → ε             B23 → mem L24 B24
    B24 → B25 | B26     B25 → int           B26 → int L27 B27   B27 → ε
    mem → B28           B28 → B29 | B30     B29 → ε             B30 → mem
    app → B31           B31 → B32 | B33     B32 → ε             B33 → app

Figure 4: Productions for the control flow grammar of the program in Section 7

7 A Larger Example

Consider the following program, adapted from Section 4.17 of [12], to determine whether a propositional formula in conjunctive normal form is a tautology:

    fun taut (Conj(p,q)) = taut p andalso taut q
      | taut p           = [] <> int (pos p, neg p);

    fun pos (Atom a)      = [a]
      | pos (Neg(Atom a)) = []
      | pos (Disj(p,q))   = app (pos p, pos q);

    fun neg (Atom a)      = []
      | neg (Neg(Atom a)) = [a]
      | neg (Disj(p,q))   = app (neg p, neg q);

    fun int ([], ys)    = []
      | int (x::xs, ys) = if mem(x, ys) then x :: int(xs, ys)
                          else int(xs, ys);

    fun mem (x, [])    = false
      | mem (x, y::ys) = x=y orelse mem(x, ys);

    fun app ([], ys)    = ys
      | app (x::xs, ys) = x :: app(xs, ys);

The partial flow graph for this program is shown in Figure 3.
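The correspondence between 0-CFA return addresses and FOLLOW sets of the control flow grammar can be checked mechanically on the productions of Figure 4. The following sketch (Python; a standard FIRST/FOLLOW fixpoint computation, not the paper's implementation) encodes the grammar and computes the FOLLOW set of each procedure nonterminal:

```python
# Sketch: FOLLOW sets for the control flow grammar of Figure 4. The 0-CFA
# return addresses of each procedure are exactly its FOLLOW set.
RULES = (
    "taut->B0; B0->B1|B3; B1->taut L2 B2; B2->; B3->pos L4 B4;"
    " B4->neg L5 B5; B5->int L6 B6; B6->; pos->B7; B7->B8|B9; B8->;"
    " B9->B10|B11; B10->; B11->pos L12 B12; B12->pos L13 B13; B13->app;"
    " neg->B14; B14->B15|B16; B15->; B16->B17|B18; B17->;"
    " B18->neg L19 B19; B19->neg L20 B20; B20->app; int->B21;"
    " B21->B22|B23; B22->; B23->mem L24 B24; B24->B25|B26; B25->int;"
    " B26->int L27 B27; B27->; mem->B28; B28->B29|B30; B29->; B30->mem;"
    " app->B31; B31->B32|B33; B32->; B33->app"
)

def parse(rules):
    prods = []
    for rule in rules.split(";"):
        lhs, rhs = rule.split("->")
        for alt in rhs.split("|"):
            prods.append((lhs.strip(), alt.split()))
    return prods

def follow_sets(prods, start):
    nts = {lhs for lhs, _ in prods}
    nullable, first = set(), {n: set() for n in nts}
    changed = True
    while changed:                        # FIRST / nullable fixpoint
        changed = False
        for lhs, rhs in prods:
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs); changed = True
            for s in rhs:
                add = first[s] if s in nts else {s}
                if not add <= first[lhs]:
                    first[lhs] |= add; changed = True
                if s not in nullable:
                    break
    follow = {n: set() for n in nts}
    follow[start].add("$")                # end-of-input marker
    changed = True
    while changed:                        # FOLLOW fixpoint
        changed = False
        for lhs, rhs in prods:
            trailer = set(follow[lhs])    # what can follow the rhs tail
            for s in reversed(rhs):
                if s in nts:
                    if not trailer <= follow[s]:
                        follow[s] |= trailer; changed = True
                    trailer = (trailer | first[s]) if s in nullable \
                              else set(first[s])
                else:
                    trailer = {s}
    return follow

follow = follow_sets(parse(RULES), "taut")
for proc in ("taut", "pos", "neg", "int", "mem", "app"):
    print(proc, sorted(follow[proc]))
# e.g. mem -> ['L24'], taut -> ['$', 'L2']
```

The computed sets agree with the 0-CFA return addresses listed in this section.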
To reduce clutter, we have not explicitly shown control transfers due to procedure calls; moreover, to aid the reader in understanding the control flow behavior of this program, each non-tail call is connected to the basic block corresponding to its return address with a dashed arc. The control flow grammar $G_P = (V, T, P, S)$ for this program is given by: $V$ = {taut, pos, neg, int, mem, app, B0, . . . , B33}; $T$ = {L2, L4, L5, L6, L12, L13, L19, L20, L24, L27}; $S$ = taut; and the set of productions $P$ as shown in Figure 4. The set of possible return addresses for each function, as obtained using 0-CFA, is as follows:

    taut : L2, $         pos : L4, L12, L13        neg : L5, L19, L20
    int  : L6, L27       mem : L24                 app : L4, L12, L13, L5, L19, L20

Due to space constraints, we do not reproduce all the sets of LR(1) items for this grammar. The difference between 0-CFA and 1-CFA can be illustrated by examining the behavior of the function app. On examining the viable prefix DFA, we find four states that are relevant to this function. One of these states consists of the following two groups of LR(1) items:

    [B12 → pos L13 · B13, L4]     [B12 → pos L13 · B13, L12]
    [B13 → · app, L4]             [B13 → · app, L12]
    [app → · B31, L4]             [app → · B31, L12]
    [B31 → · B32, L4]             [B31 → · B32, L12]
    [B31 → · B33, L4]             [B31 → · B33, L12]
    [B32 → ·, L4]                 [B32 → ·, L12]
    [B33 → · app, L4]             [B33 → · app, L12]

From the forward chains in the first group, we can determine that app can be called from basic block B13 of the function pos, with return label L4 (note that this refers to a block that, because of the control flow effects of tail call optimization, does not belong to the calling function), and that app can then recursively call itself with the same return label. In this case, the return label indicates that the calling function pos was itself called from taut.
The second group of LR(1) items shows a similar call sequence to app from basic block B13, except that in this case the calling function pos is being called recursively from basic block B11 in the body of pos. The remaining three states relevant to the function app provide similar information: one of these contains two groups of items, the first of which is identical to the first group above, and the second of which is similar to the second group above except that it refers to a recursive call to pos from basic block B12; the remaining two states provide similar information for calls to app from the function neg.

8 Trading Precision for Efficiency

One of the biggest advantages we see for a grammatical formulation of control flow analysis is that grammars and parsing have been studied extensively and are generally well understood. Because of this, a wide variety of techniques and tools originally devised for syntax analysis are applicable to control flow analysis. As an example of this, consider the fact that the efficiency of compile-time analyses can be improved by reducing the amount of information maintained and manipulated, i.e., by decreasing precision. In the case of control flow analysis, determining where control came from involves examining the states of the viable prefix DFA of the control flow grammar, constructed using LR(1) items. It is well known that the number of states in such a DFA can become very large, but that by judiciously merging certain states (those with a common "kernel"; see [1]) the number of states can be reduced considerably without significantly sacrificing the information contained in the DFA. Parsers that are constructed in this way are known as LALR(1) parsers, and they can be built efficiently without first building the LR(1) DFA. It does not come as a surprise that the same idea can be applied to 1-CFA as well.
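The merging step itself is straightforward to sketch. Assuming LR(1) states are represented as sets of items (lhs, rhs, dot position, lookahead), states whose item cores coincide are unioned, as in the following illustrative Python fragment (not the paper's implementation; a real LALR construction would merge on kernel items and rebuild transitions):

```python
# Sketch: merge LR(1) states that share the same set of item cores
# (productions plus dot positions, ignoring lookaheads).
def core(item):
    lhs, rhs, dot, _lookahead = item
    return (lhs, rhs, dot)

def lalr_merge(lr1_states):
    merged = {}                        # core set -> union of item sets
    for state in lr1_states:
        key = frozenset(core(it) for it in state)
        merged.setdefault(key, set()).update(state)
    return list(merged.values())

# Two LR(1) states with identical cores but different lookaheads, like the
# two item groups for `app` in Section 7:
s1 = {("B13", ("app",), 0, "L4"), ("app", ("B31",), 0, "L4")}
s2 = {("B13", ("app",), 0, "L12"), ("app", ("B31",), 0, "L12")}
states = lalr_merge([s1, s2])
print(len(states))       # 1: a single merged state carrying both lookaheads
```

After merging, the single state still records both return labels L4 and L12, which is why, in this example, no calling-context information is lost.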
The resulting analysis is more precise than 0-CFA and potentially somewhat less precise than 1-CFA: with tongue firmly in cheek, we call such an analysis ½-CFA. It is usually considerably more efficient than 1-CFA. As an example, for the tautology checker program of Section 7, the viable prefix DFA constructed from LR(1) items contains 97 states, while that constructed from LALR(1) items contains 55 states; if we consider the entire tautology checker from [12], which works for arbitrary propositional formulae, the LR(1) viable prefix DFA has 304 states while the LALR(1) DFA has 112 states. If we focus on calls to the function app, as in Section 7, we find that with LALR(1) items it suffices to examine a single state of the DFA, in contrast to four states for the LR(1) case. Moreover, there is no loss of information regarding the calling contexts in this case.

9 Conclusions

Knowledge of low-level control flow is essential for many compiler optimizations. In systems with tail call optimization, the determination of interprocedural control flow is complicated by the fact that control flow at procedure returns is not readily evident from the call graph of the program. In this paper, we show how interprocedural control flow analysis of first-order programs can be carried out using well-known concepts from parsing theory. In particular, we show that 0-CFA corresponds to the notion of FOLLOW sets in context-free grammars, and 1-CFA corresponds to the analysis of LR(1) items. The control flow information so obtained can be used to improve the precision of interprocedural dataflow analyses as well as to extend certain low-level code optimizations across procedure boundaries.

References

[1] A. V. Aho, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques and Tools, Addison-Wesley, 1986.

[2] J.-D. Choi, M. Burke, and P. Carini, "Efficient Flow-Sensitive Interprocedural Computation of Pointer-Induced Aliases and Side Effects", Proc. 20th
ACM Symposium on Principles of Programming Languages, Jan. 1993, pp. 232-245.

[3] K. De Bosschere and S. K. Debray, "alto: A Link-Time Optimizer for the DEC Alpha", Technical Report 96-15, Dept. of Computer Science, The University of Arizona, Tucson, June 1996.

[4] N. Heintze, "Control Flow Analysis and Type Systems", Technical Report CMU-CS-94-227, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, Dec. 1994.

[5] F. Henglein and J. Jørgensen, "Formally Optimal Boxing", Proc. 21st ACM Symp. on Principles of Programming Languages, Portland, OR, Jan. 1994, pp. 213-226.

[6] J. E. Hopcroft and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley, 1979.

[7] S. Jagannathan and S. Weeks, "A Unified Treatment of Flow Analysis in Higher-Order Languages", Proc. 22nd ACM Symp. on Principles of Programming Languages, San Francisco, Jan. 1995, pp. 393-407.

[8] W. Landi and B. G. Ryder, "A Safe Approximate Algorithm for Interprocedural Pointer Aliasing", Proc. ACM SIGPLAN '92 Conference on Programming Language Design and Implementation, June 1992, pp. 235-248.

[9] T. Lindgren, "Control Flow Analysis of Prolog", Proc. 1995 International Symposium on Logic Programming, Dec. 1995, pp. 432-446. MIT Press.

[10] X. Leroy, "Unboxed objects and polymorphic typing", Proc. 19th ACM Symp. on Principles of Programming Languages, Albuquerque, NM, Jan. 1992, pp. 177-188.

[11] J. C. Peterson, "Untagged Data in Tagged Environments: Choosing Optimal Representations at Compile Time", Proc. Functional Programming Languages and Computer Architecture, London, Sept. 1989, pp. 89-99.

[12] L. C. Paulson, ML for the Working Programmer, Cambridge University Press, 1991.

[13] S. Peyton Jones and J. Launchbury, "Unboxed values as first class citizens in a non-strict functional language", Proc. Functional Programming Languages and Computer Architecture 1991, pp. 636-666.

[14] T. A. Proebsting and S. A. Watterson, "Filter Fusion", Proc.
23rd ACM Symposium on Principles of Programming Languages, Jan. 1996, pp. 119-129.

[15] E. Ruf, "Context-Insensitive Alias Analysis Reconsidered", Proc. SIGPLAN '95 Conference on Programming Language Design and Implementation, June 1995, pp. 13-22.

[16] M. Sharir and A. Pnueli, "Two Approaches to Interprocedural Data Flow Analysis", in Program Flow Analysis: Theory and Applications, eds. S. S. Muchnick and N. D. Jones, Prentice-Hall, 1981, pp. 189-233.

[17] O. Shivers, "Control Flow Analysis in Scheme", Proc. SIGPLAN '88 Conference on Programming Language Design and Implementation, June 1988, pp. 164-174.

[18] O. Shivers, Control Flow Analysis of Higher-Order Languages, PhD Dissertation, Carnegie Mellon University, May 1991. Also available as Technical Report CMU-CS-91-145, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, May 1991.

[19] Y. Tang and P. Jouvelot, "Control-Flow Effects for Escape Analysis", Proc. WSA 92, Bordeaux, France, 1992.

[20] Y. Tang and P. Jouvelot, "Separate Abstract Interpretation for Control Flow Analysis", Proc. TACS-94, 1994.

[21] D. Tarditi, G. Morrisett, P. Cheng, C. Stone, R. Harper, and P. Lee, "TIL: A Type-Directed Optimizing Compiler for ML", Proc. SIGPLAN '96 Conference on Programming Language Design and Implementation, June 1996, pp. 181-192.

[22] R. P. Wilson and M. S. Lam, "Efficient Context-Sensitive Pointer Analysis for C Programs", Proc. SIGPLAN '95 Conference on Programming Language Design and Implementation, June 1995, pp. 1-12.

A Proofs of Main Theorems

Theorem 4.1 Given a program $P$ with entry point $S$, control flow grammar $G_P$ and control flow automaton $M_P$, $S \Rightarrow_{lm}^{*} xA\gamma$ if and only if $(S, xw, \$) \vdash^{*} (A, w, \gamma)$, where $x, w \in \mathit{Labels}_P^{*}$ and $A \in \mathit{Blocks}_P \cup \mathit{Procs}_P$.

Proof: We prove a slightly stronger result, namely, that for all $i \geq 0$, $S \Rightarrow_{lm}^{i} xA\gamma$ if and only if $(S, xw, \$) \vdash^{i} (A, w, \gamma)$. The proof is by induction on $i$.
The base case, with $i = 0$, is trivial, with $S \equiv A$, $x = \varepsilon$ and $\gamma = \varepsilon$. For the inductive case, assume that the theorem holds for derivations of length up to $i$, and consider a derivation of length $i + 1$. This must have the form

$S \Rightarrow_{lm}^{i} yB\gamma' \Rightarrow_{lm} xA\gamma$

where $x, y \in \mathit{Labels}_P^{*}$. Since $S \Rightarrow_{lm}^{i} yB\gamma'$, from the induction hypothesis we have $(S, y\bar{w}, \$) \vdash^{i} (B, \bar{w}, \gamma')$. Consider the last step in the leftmost derivation in $G_P$ shown above. Depending on the production used in this step, we have the following cases:

Case 1: This covers the cases where either $B$ is not a call block and has basic block $A$ as a successor; or $B$ is a tail call block that calls procedure $A$; or $B$ is a procedure whose flow graph entry node is $A$. In each case, the production used has the form $B \to A$ where $A$ is a nonterminal, and the definition of $M_P$ indicates that the only move $M_P$ can make is to change its state from $B$ to $A$, leaving its input and stack untouched. Thus, $y = x$ and $\gamma = \gamma'$, and we have $(S, xw, \$) \vdash^{i} (B, w, \gamma) \vdash (A, w, \gamma)$. It follows from this that $S \Rightarrow_{lm}^{i+1} xA\gamma$ iff $(S, xw, \$) \vdash^{i+1} (A, w, \gamma)$.

Case 2: $B$ is a call block. In this case, $A$ must be the called procedure. Let $\ell$ be the return label and $B'$ the basic block with label $\ell$. The production used in the derivation of $G_P$ must be

$B \to A\,\ell\,B'$

which yields the derivation $S \Rightarrow_{lm}^{i} yB\gamma' \Rightarrow_{lm} xA\ell B'\gamma'$, i.e., $\gamma = \ell B'\gamma'$. The corresponding move of $M_P$ must be

$(B, w, \gamma') \vdash (A, w, \ell B'\gamma') = (A, w, \gamma)$.

Thus, we have $S \Rightarrow_{lm}^{i+1} xA\gamma$ iff $(S, xw, \$) \vdash^{i+1} (A, w, \gamma)$.

Case 3: $B$ is a return block. The production used in the derivation of $G_P$ is $B \to \varepsilon$, and the resulting derivation is

$S \Rightarrow_{lm}^{i} yB\gamma' \Rightarrow_{lm} y\gamma'$.

It is straightforward to show, by induction on the length of derivations, that $\gamma'$ must be of the form $\ell_1 B_1 \ell_2 B_2 \cdots \ell_n B_n$, where each $\ell_i$ is a terminal and each $B_i$ is a nonterminal. Thus, $x = y\ell_1$, $A = B_1$, and $\gamma = \ell_2 B_2 \cdots \ell_n B_n$. So $\gamma' = \ell_1 A\gamma$. The corresponding move of $M_P$ is

$(B, \bar{w}, \gamma') = (B, \ell_1 w, \ell_1 A\gamma) \vdash (A, w, \gamma)$.

Thus, we have $S \Rightarrow_{lm}^{i+1} xA\gamma$ iff $(S, xw, \$) \vdash^{i+1} (A, w, \gamma)$.

These cases exhaust all the possibilities. The theorem follows. □
Theorem 6.1 Given a program with entry point $p$, $(p, x, \$) \vdash^{*} (A, y, a\gamma)$ if and only if there is a reachable item $[A \to \cdot\alpha, a]$.

Proof: [⇒] We show, by induction on $i$, that if $(p, x, \$) \vdash^{i} (A, y, a\gamma)$ then $q_0 \leadsto^{*} [A \to \cdot\alpha, a]$. The base case is for $i = 0$: in this case, $p \equiv A$, $x \equiv y$, $a \equiv \$$ and $\gamma = \varepsilon$. There is a production $p \to B$ in the control flow grammar $G_P$, where $B$ is the entry node of the flow graph of $p$. It follows that there is a transition from $q_0$ to $[p \to \cdot B, \$]$ in the viable prefix NFA.

Assume that $(p, x, \$) \vdash^{j} (A, y, a\gamma)$ implies $q_0 \leadsto^{*} [A \to \cdot\alpha, a]$ for all $j \leq i$, and consider a sequence of moves

$(p, x, \$) \vdash^{i} (B, y, b\delta) \vdash (A, z, a\gamma)$.

Consider the last move, $(B, y, b\delta) \vdash (A, z, a\gamma)$. We have the following cases, depending on the nature of $B$:

Case 1: $B$ is a call block where the called procedure is $A$, the return label is $a$, and the block with label $a$ is $C$. In this case, $y \equiv z$, and the last move becomes $(B, y, b\delta) \vdash (A, y, aCb\delta)$. In other words, $(B, y, b\delta) \vdash (A, y, a\gamma)$, where $\gamma \equiv Cb\delta$. In the control flow grammar $G_P$, there is a production $B \to A\,a\,C$.

Now we have, from the induction hypothesis, that $q_0 \leadsto^{*} [B \to \cdot\beta, b]$. This can happen if and only if there is some item $I$ of the form $[X \to \mu \cdot B\nu, c]$ such that $q_0 \leadsto^{*} I$. Since $B \to AaC$ is a production, this means that we can have the transition sequence

$q_0 \leadsto^{*} [X \to \mu \cdot B\nu, c] \mapsto [B \to \cdot AaC, d] \mapsto [A \to \cdot\alpha, a]$, where $d \in \mathrm{FIRST}(\nu c)$.

Thus, $(p, x, \$) \vdash^{i+1} (A, y, a\gamma)$ implies $q_0 \leadsto^{*} [A \to \cdot\alpha, a]$.

Case 2: $B$ is a return block. The moves of $M_P$ must be of the form

$(p, x, \$) \vdash^{i} (B, \ell z, \ell A\gamma) \vdash (A, z, \gamma)$.

Now consider how $\ell A$ came to be on $M_P$'s stack: there must have been a call block $C$ that called a procedure $p'$ with return label $\ell$, where the basic block with label $\ell$ is $A$. This means that there is a sequence of moves

$(p, x, \$) \vdash^{j} (C, u, a\gamma') \vdash (p', u, \ell A a\gamma')$

where $j < i$, that control returns to the return address $\ell$ (i.e., the block $A$) when the execution of the call has completed, and that the stack at that point is $a\gamma'$. From the induction hypothesis,

$q_0 \leadsto^{*} [C \to \cdot p'\ell A, a] \mapsto [C \to p' \cdot \ell A, a] \mapsto [C \to p'\ell \cdot A, a] \mapsto [A \to \cdot\alpha, a]$.
Thus, $(p, x, \$) \vdash^{i+1} (A, z, a\gamma')$ implies $q_0 \leadsto^{*} [A \to \cdot\alpha, a]$.

Case 3: $B$ is not a call block, or $B$ is a procedure and $A$ is the entry node in the flow graph of $B$, or $B$ is a tail call block and $A$ the tail-called procedure. The corresponding production in $G_P$ is $B \to A$. This case corresponds to a simple transfer of control that does not affect the stack. So the moves of $M_P$ are

$(p, x, \$) \vdash^{i} (B, z, a\gamma) \vdash (A, z, a\gamma)$.

As in Case 1, $q_0 \leadsto^{*} [B \to \cdot\beta, b]$ implies that there is an item $I$ of the form $[X \to \mu \cdot B\nu, c]$ such that $q_0 \leadsto^{*} I$. Since $B \to A$ is a production, it follows from the induction hypothesis that $q_0 \leadsto^{*} I \mapsto [B \to \cdot A, a] \mapsto [A \to \cdot\alpha, a]$.

This establishes that for all $i \geq 0$, $(p, x, \$) \vdash^{i} (A, z, a\gamma)$ implies $q_0 \leadsto^{*} [A \to \cdot\alpha, a]$.

[⇐] We use the notation $A \leadsto^{k} B$ to denote that there is a path of length $k$ from state $A$ to state $B$ of a viable prefix NFA. We show, by induction on $i$, that $q_0 \leadsto^{i} [A \to \cdot\alpha, a]$ implies $(p, x, \$) \vdash^{*} (A, z, a\gamma)$. The base case, with $i = 1$, is straightforward: in this case $A \equiv p$ is the entry point of the program and $a \equiv \$$. (The reason the base case is for $i = 1$ is that the viable prefix NFA contains an initial state $q_0$ that does not correspond to any item: thus, it requires a single $\varepsilon$-transition to reach an item.)

There is a transition from $q_0$ to the item $[A \to \cdot B, \$]$, where $B$ is the entry node of the flow graph of $A$, and $(A, x, \$)$ is the initial configuration of $M_P$.

Assume that $q_0 \leadsto^{j} [A \to \cdot\alpha, a]$ implies $(p, x, \$) \vdash^{*} (A, y, a\gamma)$ for all $j \leq i$, and consider a transition sequence

$q_0 \leadsto^{i} I_0 : [B \to \cdot\beta, b] \mapsto I_1 : [A \to \cdot\alpha, a]$.

Consider the last transition of the viable prefix NFA: either it is an $\varepsilon$-transition, or it is not. If it is an $\varepsilon$-transition, $B$ cannot be a return block, since a return block $X$ corresponds to an item of the form $[X \to \cdot, a]$, and the corresponding state in the viable prefix NFA has no outgoing transitions. We therefore have the following cases, depending on the nature of $B$:

Case 1: $B$ is a call block.
In this case, given the definition of $G_P$, the item $I_0$ must be of the form $[B \to \cdot AaC, b]$, where $a$ is the return label for this call and $C$ is the basic block with label $a$. This yields the transition sequence

$q_0 \leadsto^{i} I_0 : [B \to \cdot AaC, b] \mapsto I_1 : [A \to \cdot\alpha, a]$.

The lookahead symbol of $I_1$ is obtained from $\mathrm{FIRST}(aCb) = \{a\}$. From the induction hypothesis, $(p, x, \$) \vdash^{*} (B, z, b\delta)$, and since $B$ is a call block, the next move of $M_P$ is $(B, z, b\delta) \vdash (A, z, aCb\delta)$. Thus, we have

$(p, x, \$) \vdash^{*} (B, z, b\delta) \vdash (A, z, a\gamma)$, where $\gamma \equiv Cb\delta$.

Case 2: $B$ is not a call block, i.e., it has no effect on the stack. The corresponding grammar productions are of the form $B \to A$, and we have

$q_0 \leadsto^{i} [B \to \cdot A, a] \mapsto [A \to \cdot\alpha, a]$.

From the induction hypothesis, $(p, x, \$) \vdash^{*} (B, z, a\gamma)$, and from the definition of $M_P$, $(B, z, a\gamma) \vdash (A, z, a\gamma)$. Thus, $M_P$ makes the moves

$(p, x, \$) \vdash^{*} (B, z, a\gamma) \vdash (A, z, a\gamma)$.

Thus, $q_0 \leadsto^{i+1} [A \to \cdot\alpha, a]$ implies $(p, x, \$) \vdash^{*} (A, z, a\gamma)$.

On the other hand, if the last transition is not an $\varepsilon$-transition, the items $I_0$ and $I_1$ must have the form $I_0 = [A \to \mu \cdot X\nu, a]$ and $I_1 = [A \to \mu X \cdot \nu, a]$, for some $X \in V \cup T$. In this case, the transitions of the viable prefix NFA must be of the form

$q_0 \leadsto^{i} I_0 : [A \to \mu \cdot X\nu, a] \mapsto I_1 : [A \to \mu X \cdot \nu, a]$.

Then, since $q_0 \leadsto^{i} I_0 : [A \to \mu \cdot X\nu, a]$, the induction hypothesis implies that $(p, x, \$) \vdash^{*} (A, z, a\gamma)$. This establishes that for all $i \geq 0$, $q_0 \leadsto^{i} [A \to \cdot\alpha, a]$ implies $(p, x, \$) \vdash^{*} (A, z, a\gamma)$.

We have shown that $(p, x, \$) \vdash^{i} (A, z, a\gamma)$ implies $q_0 \leadsto^{*} [A \to \cdot\alpha, a]$ for all $i \geq 0$, and that $q_0 \leadsto^{i} [A \to \cdot\alpha, a]$ implies $(p, x, \$) \vdash^{*} (A, z, a\gamma)$ for all $i \geq 0$. It follows that $(p, x, \$) \vdash^{*} (A, y, a\gamma)$ if and only if $q_0 \leadsto^{*} [A \to \cdot\alpha, a]$. □