Abstract Interpretation and Low-Level Code Optimization

Saumya Debray
Department of Computer Science
University of Arizona
Tucson, AZ

This work was supported in part by the National Science Foundation under grant CCR-[...].

Abstract

Abstract interpretation is widely accepted as a natural framework for semantics-based analysis of program properties. However, most formulations of abstract interpretation are in terms of high-level semantic entities that do not adequately address the needs of low-level optimizations. In this paper we discuss the role of abstract interpretation in low-level compiler optimizations, examine some of its limitations, and consider ways in which they might be addressed.

Introduction

The process of compilation, by which executable code is generated from a source program, can be thought of as a series of transformations and translations through a succession of languages, starting at the source language and ending at the target language. In this picture we can distinguish between two kinds of transformations: translations, which take a program in a language and produce a program in a different (usually lower-level) language; and optimizations, which transform a program in a language to another program in the same language. As an example, a compiler that we have implemented for a logic programming language called Janus works by translating the input programs into C, then invoking a C compiler to generate executable code. In this system we can identify the following language levels: (1) the source language; (2) the Janus virtual machine language; (3) C; (4) the intermediate representations within the C compiler; and (5) the target machine language. In principle, optimizing transformations can be applied at each of these five language levels; our current implementation applies optimizations at levels (2), (4), and (5): the Janus virtual machine, the intermediate representations of the C compiler, and the target machine code, the last two within the C compiler. In each case, the optimizations can be seen as program transformations at a particular language level.

A fundamental requirement of the compilation process is that it should be semantics-preserving, in the sense that the meaning or behavior of the executable code should conform to what the semantics of the source program says it should be. For this to happen, it is necessary, in general, that both translations and optimizations be semantics-preserving in this sense. Since our primary focus is on optimizations rather than translations, we will assume here that our translations satisfy this requirement, and focus our attention on optimizations.

It is very often the case that an optimization is not universally applicable. In other words, in order to ensure that an optimization does not alter the observable behavior of a program in unacceptable ways, we have to ensure that certain preconditions particular to that optimization are satisfied. As an example, consider register allocation in a C compiler: the value of a variable can be kept in a register only if certain conditions regarding aliasing are fulfilled. In general, this means that it may be necessary to examine a program and extract some information about its behavior, which can then be used for optimization purposes. Further, in order to verify that the properties so inferred describe all possible runtime behaviors of a program, it is necessary to be able to relate the analyses to the semantics of the language in a precise way.

Semantics-based techniques such as abstract interpretation provide a natural framework for such program analyses. The general idea is to rely on the formal semantics of a program to specify all of its possible computational behaviors, and to derive finitely computable descriptions of such behaviors by systematically approximating the operational behavior of the program. The correctness of an analysis can then be derived from the mathematical relationships between the actual computational domain of the program and the domain of descriptions manipulated by the analysis, and between the actual operations executed by the program and the approximations to those operations used during the analysis.

Optimizing program transformations can be viewed at many levels, corresponding to the different levels of languages encountered during compilation. At a high level, for example, we have transformations such as finite differencing, recursion removal (i.e., transformation of recursive programs to tail-recursive form), deforestation, and transformations for parallelization and vectorization (see, for example, [...]), as well as various transformations described by Bacon et al. At the level of intermediate code, we have machine-independent low-level optimizations such as induction variable elimination, closure representation optimization in functional languages, and dereferencing optimizations in logic programming languages. At a lower level still, we have machine-dependent transformations such as register allocation and instruction scheduling.

Conceptually, we can divide these various optimizations into two classes: high-level optimizations, which correspond roughly to optimizations that can be expressed in terms of transformations on the source program or its abstract syntax tree; and low-level optimizations, which involve constructs and objects that are not visible at the source level and which therefore cannot be so expressed. This classification is not absolute, of course: whether or not an optimization is to be considered low-level depends, among other things, on the language being considered. For example, in a language with explicit constructs for iteration, the implementation of a tail-recursive procedure in terms of iteration could be considered a high-level optimization; in a language without source-level iterative constructs, however, this would be a low-level optimization.

There are two reasons why low-level optimizations are important. The first is that they are beyond the reach of the user. The point is that when faced with a compiler that does not do much in the way of high-level optimizations, the determined user can, in principle, carry out the transformations manually where necessary in order to obtain code with good performance. With a compiler that does not perform low-level optimizations, however, there is little that even the most determined of users can do. In particular, this implies that in the absence of low-level optimizations, even carefully crafted programs written by skilled programmers will incur performance penalties over which they have little control.

The second reason such optimizations are important is that they can produce substantial performance improvements. As an example of this, Table 1 gives some performance numbers for jc, an implementation of a dynamically typed logic programming language. The jc compiler currently performs only low-level optimizations: call forwarding, which is a form of jump redirection at the intermediate code level; a simple form of interprocedural register allocation for output value placement; and representation optimization, i.e., using unboxed values where possible, for numerical values. As Table 1 indicates, for the benchmarks tested these optimizations more than double the speed of the programs on average, and also lead to significant improvements in heap memory usage. The speed of the resulting code is competitive with that of optimized C code written in a "natural" imperative style: on the benchmarks shown, the Janus programs, which are dynamically typed and use dataflow synchronization between producers and consumers, are on average only [...] slower than C code compiled with gcc -O[...], about [...] faster than C compiled with cc -O[...], and [...] faster than C compiled with cc -O[...]. This indicates that low-level optimizations can be a valuable source of performance improvements.

Table 1: Performance improvements due to low-level optimizations (jc on a Sparcstation IPC).
Columns: Execution Time (secs): no-opt, opt, no-opt/opt; Heap Usage (words): no-opt, opt, no-opt/opt.
Benchmarks: aquad, bessel, binomial, chebyshev, e, fib, log, mandelbrot, muldiv, nrev, pi, sum, tak; final row: Geometric Mean.

The appeal of semantics-based program manipulation techniques is that they allow us to reason formally about the manipulations themselves and certify with some confidence that such manipulations will not cause "bad things" to happen. This paper considers the applicability and relevance of semantics-based program analysis [...]

[...] about machine-level entities has been abstracted away. The problem, of course, is that usually we think of the process of abstraction as "forgetting about irrelevant aspects of the behavior of a program," while in this case it is precisely the most relevant aspects of the program's behavior that are being forgotten.

The problem can be addressed by abstract interpretation based on a low-level semantics. While this does not seem different from any other sort of abstract interpretation at a conceptual level, the practical details can become messy. As an example, it is very likely simpler and more convenient to manipulate a high-level representation of a program, such as an abstract syntax tree, for such analyses, since the number of different kinds of objects and operations that have to be dealt with for such representations is relatively small. However, it is not clear that a high-level program representation can encode low-level information in a reasonable way without implicit or explicit assumptions about the behavior of the code generator. This in turn
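The general idea of abstract interpretation described in the Introduction, namely approximating a program's operations over a finite domain of descriptions and deriving correctness from the relationship between concrete values and their descriptions, can be illustrated with a toy sign analysis. The following is only a minimal sketch and is not taken from the paper: the description domain (NEG, ZERO, POS, TOP) and the names alpha, describes, abs_mul, abs_add, and check_soundness are all illustrative assumptions.

```python
# Toy sign analysis: a hypothetical illustration, not the paper's analysis.
# Descriptions form a finite domain; TOP means "any integer".
NEG, ZERO, POS, TOP = "neg", "zero", "pos", "top"


def alpha(n):
    """Abstraction: map a concrete integer to its sign description."""
    if n < 0:
        return NEG
    if n == 0:
        return ZERO
    return POS


def describes(d, n):
    """True if description d covers the concrete integer n."""
    return d == TOP or d == alpha(n)


def abs_mul(a, b):
    """Abstract counterpart of integer multiplication."""
    if a == ZERO or b == ZERO:
        return ZERO  # zero times anything is zero
    if a == TOP or b == TOP:
        return TOP
    return POS if a == b else NEG


def abs_add(a, b):
    """Abstract counterpart of integer addition."""
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    if a == b and a != TOP:
        return a  # pos + pos is pos, neg + neg is neg
    return TOP  # pos + neg (or anything with TOP) may have any sign


def check_soundness(values):
    """Check, on sample values, that each abstract operation describes
    the result of the corresponding concrete operation."""
    for x in values:
        for y in values:
            assert describes(abs_mul(alpha(x), alpha(y)), x * y)
            assert describes(abs_add(alpha(x), alpha(y)), x + y)
    return True
```

Calling check_soundness(range(-5, 6)) exercises the correctness condition on a small sample: the abstract result must cover the concrete result. Note that abs_add(POS, NEG) has to give up and return TOP; this loss of information is the price of a finitely computable description.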
