Interprocedural Control Flow Analysis of First-Order Programs with Tail Call Optimization 1 Introduction
Total Page:16
File Type:pdf, Size:1020Kb
Interpro cedural Control Flow Analysis of FirstOrder Programs with Tail Call Optimization Saumya K Debray and To dd A Pro ebsting Department of Computer Science The University of Arizona Tucson AZ USA Email fdebray toddgcsarizonaedu Decemb er Abstract The analysis of control ow Involves guring out where returns will go How this may b e done With items LR and Is what in this pap er we show Intro duction Most co de optimizations dep end on control ow analysis typically expressed in the form of a control ow graph Traditional algorithms construct intrapro cedural ow graphs which do not account for control ow b etween pro cedures Optimizations that dep end on this limited information cannot consider the b e havior of other pro cedures Interpro cedural versions of these optimizations must capture the ow of control across pro cedure b oundaries Determining interpro cedural control ow for rstorder programs is relatively straightforward in the absence of tail call optimization since pro cedures return control to the p oint imme diately after the call Tail call optimization complicates the analysis b ecause returns may transfer control to a pro cedure other than the active pro cedures caller The problem can b e illustrated by the following simple program that tak es a list of values and prints in their original order all the values that satisfy some prop erty eg exceed To take advantage of tailcall optimization it uses an accumulator to collect these values as it traverses the input list However this causes the order of the values in the accumulator to b e reversed so the accumulated list is reversedagain using an accumulatorb efore it is returned mainL print extractL extractxs acc if xs then reverseacc else if hdxs then extracttlxs conshdxsacc else extracttlxs acc reversexs acc if xs then acc else reversetlxs conshdxsacc The work of S Debray was supp orted in part by the National Science Foundation under grant numb er CCR The work of T Pro ebsting was supp orted in part by NSF Grants CCR CCR ARPA Grant DABTC and IBM Corp oration Supp ose that for co de optimization purp oses we want to construct a ow graph for this entire program The return from the function reverse in line corresp onds to some basic blo ck and in order to construct the ow graph we need to determine the successors of this blo ck The call graph of the program indicates that reverse can b e called either from extract in line or from reverse in line However b ecause of tail call optimization it turns out that reverse do es not return to either of these call sites Instead it t pro cedure entirely namely to the pro cedure main in line the successor blo ck to the returns to a dieren return in line is the basic blo ck that calls print Clearly some nontrivial control ow analysis is necessary to determine this Most of the work to date on control ow analysis has fo cused on higherorder languages Shivers and Jagannathan and Weeks use abstract interpretation for this purp ose while Heintze and Tang and Jouvelot use typ ebased analyses These analyses are very general but very complex Many are rstorder languages Furthermore even for higherorder widely used languages such as Sisal and Prolog languages sp ecic programs often use only rstorder constructs or can have most higherorder constructs removed via transformations such as inlining and uncurrying As a pragmatic issue therefore we are interested in ordinary rstorder programs our aim is to account for interpro cedural control ow in such programs in the presence of tail call optimization To our knowledge the only other work addressing this issue is that of Lindgren who uses setbased analysis for control ow analysis of Prolog Unlike Lindgrens work our analyses can maintain context information see Section The main contribution of this pap er is to show how control ow analysis of rstorder programs with tail call optimization can b e formulated in terms of simple and wellundersto o d concepts from parsing theory In particular we show that contextinsensitive or zerothorder control ow analysis corresp onds to the notion of FOLLOW sets in context free grammars while contextsensitive or rstorder control ow analysis corresp onds to the notion of LR items This is useful b ecause it allows the immediate application of wellundersto o d technology without for example having to construct complex abstract domains It is also esthetically pleasing in that it provides an application of concepts such as FOLLOW sets and LR items which were originally develop ed purely in the context of parsing to a very dierent application The remainder of the pap er is organized as follows Section intro duces denitions and notation Section denes an abstract mo del for control ow and Section shows how this mo del can b e describ ed using context free grammars Section discusses control ow analysis that maintain no context information and Section discusses how context information can b e maintained to pro duce more precise analyses Section and illustrates these ideas with a nontrivial example Section discusses tradeos b etween eciency precision Pro ofs of the theorems may b e found in Denitions and Notation We assume that a program consists of a set of pro cedure denitions together with an entry p oint pro cedure It is straightforward to extend these ideas to accommo date multiple entry p oints Since we assume a rstorder language the intrapro cedural control ow can b e mo delled by a control ow graph This is a of executable co de directed graph where each no de corresp onds to a basic blo ck ie a maximal sequence that has a single entry p oint and a single exit p oint and where there is an edge from a no de A to a no de B if and only if it is p ossible for execution to leave no de A and immediately enter no de B If there is an edge from a no de A to a no de B then A is said to b e a predecessor of B and B is a successor of A Because of the eects of tail call optimization interpro cedural control ow information cannot b e assumed to b e available Therefore we assume that the input to our analysis consists of one control ow graph for each pro cedure dened in the program For simplicity of exp osition we assume that each ow graph has a single entry no de Eac h ow graph consists of a set of vertices which corresp ond to basic blo cks and a set of edges which capture control ow b etween basic blo cks If a basic blo ck contains a pro cedure call the call is assumed to terminate the blo ck if a basic blo ck B ends in a call to a pro cedure p we say that B cal ls p If the last action along an execution path in a pro cedure p is a call to some pro cedure q ie if the only action that would b e p erformed on return from q would b e to return to the caller of pthe call to q cal l A tail call can b e optimized in particular any environment allo cated for the caller p is termed a tail can b e deallo cated and control transfer eected via a direct jump to the callee q this is usually referred to as tail call optimization and is crucial for ecient implementations of functional and logic languages If ck if B ends in a pro cedure call that is a basic blo ck B ends in a tail call we say that it is a tail cal l blo not a tail call we say B is a cal l block In the latter case B must set a return address L b efore making the call L is said to b e a return label If a basic blo ck B ends in a return from a pro cedure it is said to b e a return block As is standard in the program analysis literature we assume that either branch of a conditional can b e executed at runtime The ideas describ ed here are applicable to programs that do not satisfy this assumption in that case the analysis results will b e sound but p ossibly conservative The set of basic blo cks and lab els app earing in a program P are denoted by Blo cks and Lab els P P resp ectively The set of pro cedures dened in it is denoted by Pro cs Finally the Kleene closure of a set P S ie the set of all nite sequences of elements of S is written S The reexive transitive closure of a relation R is written R Abstracting Control Flow Before we can analyze the control ow b ehavior of such programs it is necessary to sp ecify this b ehavior carefully First consider the actual runtime b ehavior of a program Co de not involving pro cedure calls or returns is executed as exp ected each instruction in a basic blo ck is executed in turn after which control moves to a successor blo ck and so on Pro cedure calls are executed as follows A nontail cal l loads arguments into the appropriate lo cations saves the return address for simplicity we can assume that it is pushed on a control stack and branches to the callee A tail cal l loads arguments reclaims any space allo cated for the callers environment and transfers control to the callee A pro cedure return simply p ops the topmost return address from the control stack and transfers control to this address We can ignore any asp ect of a programs runtime b ehavior that is not concerned directly with ow of control Conceptually therefore control moves from one basic blo ck to another pushing a return address on a stack when making a nontail pro cedure call and p opping an address from it when returning from a pro cedure corresp onding to a This can b e formalized using a very simple pushdown automaton the automaton M P automaton Given a program P the set of states Q of M is given by program P is called its control ow P Q Blo cks Pro cs its input alphab et is Lab els the initial state of M is p where p is the entry p oint P P