Return Value Placement and Tail Call Optimization in High Level Languages

J. LOGIC PROGRAMMING

PETER A. BIGOT AND SAUMYA DEBRAY

This paper discusses the interaction between tail call optimization and the placement of output values in functional and logic programming languages. Implementations of such languages typically rely on fixed placement policies: most functional language implementations return output values in registers, while most logic programming systems return outputs via memory. Such fixed placement policies incur unnecessary overheads in many commonly encountered situations: the former are unable to implement many intuitively iterative computations in a truly iterative manner, while the latter incur a performance penalty due to additional memory references. We describe an approach that determines, based on a low-level cost model for an implementation together with an estimated execution profile for a program, whether the output of a procedure should be returned in registers or in memory. This can be seen as realizing a restricted form of interprocedural register allocation, and it avoids the disadvantages associated with the fixed register and fixed memory output placement policies. Experimental results indicate that it provides good performance improvements compared to existing approaches.

This work was supported in part by the National Science Foundation under a CCR grant. The first author was also supported by graduate fellowships from the U.S. Office of Naval Research and AT&T Bell Laboratories. A preliminary version of this paper appeared in Proc. Eleventh International Conference on Logic Programming, Santa Margherita Ligure, Italy, June.

Address correspondence to Saumya K. Debray, Department of Computer Science, University of Arizona, Tucson, AZ, USA. Email: debray@cs.arizona.edu

THE JOURNAL OF LOGIC PROGRAMMING. © Elsevier Science Publishing Co., Inc., Avenue of the Americas, New York, NY

INTRODUCTION

Programs in functional and logic programming languages tend to
be procedure-call intensive. Because of this, implementations of such languages must handle the data and control transfers at procedure calls and returns efficiently in order to get good performance. The data transfer overhead is usually reduced by placing the arguments to procedures (and, in many systems, the values returned by procedures) in registers. A very important component of techniques that reduce the control transfer overhead is tail call optimization. This paper examines the interaction between the data-passing optimization of placing arguments and return values in registers and the control-passing optimization of tail call optimization.

Implementations of functional languages typically adopt fixed register usage conventions for passing arguments to functions and returning values from them. A common approach is to use a fixed mapping from the position of a value in an argument sequence to the register in which that value is passed: the first argument to a function is passed in the first argument register, the second argument in the second, and so on; likewise, the first return value is returned in the first return register, the second return value in the second, and so on. The S-1 Common Lisp compiler uses this approach for numerical return values. A similar situation arises in systems, such as Standard ML of New Jersey, that use continuation-passing style and pass arguments to known functions in registers: since functions in continuation-passing style do not actually return any values to their caller, but pass them instead as arguments to a continuation, the placement of these return values is determined by the scheme used for passing arguments into a function.

The advantage of such fixed schemes is uniformity and simplicity. They have two disadvantages: first, as we will show in a later section, an a priori commitment to pass return values in registers may force a program to incur unnecessary space and time overheads; and second, a fixed positional mapping of values to registers can require additional register shuffling to
move a value into the appropriate register. The second problem, namely register shuffling, can be addressed to some extent by techniques such as register targeting, but these do not address the additional space overheads that can be incurred by such schemes.

It is interesting to contrast such register-return models, commonly used in functional programming systems, with implementations of logic programming languages such as Prolog. Prolog procedures do not, in general, have any notion of input and output arguments, and a particular argument to a procedure can be an input argument in one invocation and an output argument in another. Because of this, it is simplest to pass all arguments to a procedure in registers, with each unbound variable (usually corresponding to an output argument) passed by reference, as a pointer to the cell occupied by that variable. An output value is returned by binding it to a variable, i.e., by writing to the corresponding memory location. This works well in some cases but incurs unnecessary overheads in others, because of the additional memory references incurred in initializing the output locations, writing values to them, and then reading these values back at the point of use.

At first glance, the placement of return values would seem to be a rather small, and presumably unimportant, aspect of an implementation of a programming language. It turns out that, because of interactions between return value placement and tail call optimization, placement decisions can have a surprisingly large impact on execution speed. Moreover, no single fixed placement scheme is good for all programs: many commonly encountered programs do better with register placements, and many others run faster with memory placements. What is desirable is a method whereby a compiler can determine, for each procedure in a program, which placement scheme is best for it. This paper discusses an algorithm that accomplishes this by taking into account execution frequency estimates and the
relative costs of various low-level operations to evaluate the costs and benefits of various alternatives, and by choosing placements for the different output arguments of different procedures in a program in a way that attempts to minimize the overall execution time of the program.

The assumptions made by our algorithm are fairly weak and are applicable to a reasonably wide variety of languages and systems. The most fundamental assumption we make is that tail call optimization is implemented. In other words, when the last action performed by a procedure p is a call to another procedure q (a situation that is referred to as a tail call), any environment allocated for p is no longer needed and can therefore be reclaimed once the arguments to the call to q have been computed into the appropriate locations. This allows the call to q to be implemented as a simple jump, thereby avoiding unnecessary state saving and a procedure call and return. We assume also that input arguments to a procedure call are passed in registers; the mapping that determines which parameter gets passed in which register need not be the same for all functions. This assumption is satisfied by most modern implementations of high-level languages. Experimental results indicate that our algorithm generally makes the right decisions, choosing register placements for procedures that benefit from having their outputs returned in registers, and memory placements for procedures for which this is better.

OUTPUT VALUE PLACEMENT AND TAIL CALL OPTIMIZATION

Consider the following Scheme function to count the length of a list:

    (define (length x)
      (if (null? x)
          0
          (+ 1 (length (cdr x)))))

Suppose the recursive call to length returns its value in a register. The next action that has to be taken upon return from this call is to increment the value returned, and since it is already available in a register, this can be done by a simple register increment operation. If, on the other hand, the returned value had been placed in memory, it would be necessary to incur several memory operations, which are considerably more expensive, to achieve the same effect. In this case, therefore, the natural place to put the return value is in a register.

As this simple example illustrates, there are in many cases significant performance advantages to returning output values in registers rather than in memory. However, the situation is complicated by the interaction of this optimization with tail call optimization. The problem is that if a tail call returns a value to its caller in a register r, but the caller wants that value in a different location x, then it is necessary to insert move or load instructions after the tail call to reconcile the return locations of caller and callee, and this inserted code precludes tail call optimization.

This can be seen in the context of procedures that recursively construct data structures, which are common in functional and logic programming languages. In many implementations such structures are allocated on the heap. In these cases, if the recursive calls that construct the rest of the structure return their values in registers, additional code is necessary to store the values into memory, rendering tail call optimization inapplicable and increasing the memory requirements of programs. To see this, consider the following Scheme function to double each element of a list:

    (define (ldbl x)
      (if (null? x)
          '()
          (cons (* 2 (car x)) (ldbl (cdr x)))))

This function creates and returns a list, which naturally resides on the heap; thus, the longer the input list, the more space it will need to create its output. However
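
The contrast between the two placement policies for ldbl can be sketched at the source level. The following is only an illustration, not the paper's algorithm or the code any particular compiler generates. It assumes a Scheme with mutable pairs (set-cdr!, as in R7RS), and the names ldbl-reg, ldbl-into!, and ldbl-mem are hypothetical. The first version returns its result as a value, so the cons must happen after the recursive call returns. The second writes each result into a memory cell supplied by the caller, which leaves the recursive call in tail position.

```scheme
;; Register-return style: the recursive call is NOT a tail call,
;; because the caller still has to cons onto the returned value.
(define (ldbl-reg x)
  (if (null? x)
      '()
      (cons (* 2 (car x)) (ldbl-reg (cdr x)))))

;; Memory-return style (destination passing): dest is a pair whose
;; cdr field receives the output. The recursive call is the last
;; action performed, so it is a tail call and can run as a loop.
(define (ldbl-into! x dest)
  (if (null? x)
      (set-cdr! dest '())
      (let ((cell (cons (* 2 (car x)) '())))
        (set-cdr! dest cell)           ; write the output to memory
        (ldbl-into! (cdr x) cell))))   ; tail call

;; Wrapper: allocate a dummy header cell, fill in its cdr, return it.
(define (ldbl-mem x)
  (let ((head (cons #f '())))
    (ldbl-into! x head)
    (cdr head)))
```

Both versions build the same list; they differ in where the output is placed and, as a result, in whether the recursion needs stack space proportional to the length of the input.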
