TIL: a Type-Directed Optimizing Compiler for ML

TIL:ATyp e-Directed Optimizing Compiler for ML David Tarditi Greg Morrisett Perry Cheng Chris Stone Rob ert Harp er Peter Lee February 29, 1996 CMU-CS-96-108 Scho ol of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 This pap er will app ear in the Proceedings of the ACM SIGPLAN '96 ConferenceonProgram- ming Language Design and Implementation, Philadelphia, Pennsylvania, May 21-24, 1996. It is also published as Fox Memorandum CMU-CS-FOX-96-01 Abstract We describ e a new compiler for Standard ML called TIL, that is based on four technologies: intensional p olymorphism, tag-free garbage collection, conventional functional language optimization, and lo op optimization. We use intensional p olymorphism and tag-free garbage collection to provide sp ecialized representations, even though SML is a p olymorphic language. We use conventional functional language optimization to reduce the cost of intensional p olymorphism, and lo op optimization to generate go o d co de for recursive functions. We present an example of TIL compiling an SML function to machine co de, and compare the p erformance of TIL co de against that of a widely used compiler, Standard ML of New Jersey. This researchwas sp onsored in part by the Advanced Research Pro jects Agency CSTO under the title \The Fox Pro ject: Advanced Languages for Systems Software", ARPA Order No. C533, issued by ESC/ENS under Contract No. F19628-95-C-0050, and in part by the National Science Foundation under Grant No. CCR-9502674, and in part by the Isaac Newton Institute for Mathematical Sciences, Cambridge, England. David Tarditi was also partly supp orted byanAT&T Bell Labs PhD Scholarship. The views and conclusions contained in this do cument are those of the authors and should not b e interpreted as representing ocial p olicies, either expressed or implied, of the Advanced Research Pro jects Agency, the U.S. Government, the National Science Foundation or AT&T. Keywords: Standard ML, compilation, typ e theory, p olymorphism, tag-free garbage collection, optimization, applicative (functional) programming 1 Intro duction We are investigating a new approach to compiling Standard ML (SML) based on four key technologies: intensional polymorphism [23], nearly tag-freegarbage col lection [12, 46,34], conventional functional language optimization,andloop optimization. To explore the practicality of our approach, wehave constructed a compiler for SML called TIL, and are thus far encouraged bythe results: On DEC ALPHA workstations, programs compiled by TIL are roughly three times faster, do one- fth the total heap allo cation, and use one-half the physical memory of programs compiled by SML of New Jersey (SML/NJ). However, our results are still preliminary | wehavenotyet investigated how to improve compile time; TIL takes ab out eight times longer to compile programs than SML/NJ. Also, wehave not yet implemented the full mo dule system of SML, although we do provide supp ort for structures and separate compilation. Finally,we exp ect the p erformance of programs compiled by TIL to improve signi cantly as we tune the compiler and implement more optimizations. Twokey issues in the compilation of advanced languages such as SML are the presence of garbage col lection and type variables. Most compilers use a universal representation for values of unknown or variable typ e. In particular, values are forced to t into a tagged machine word; values larger than a machine word are represented as p ointers to tagged, heap-allo cated ob jects. This approach supp orts fast garbage collection and ecient p olymorphic functions, but can result in inecient co de when typ es are known at compile time. Even with recentadvances in SML compilation, such as Leroy's representation analysis [28], values must b e placed in a universal representation b efore b eing stored in up dateable data structures (e.g., arrays) or recursive data structures (e.g., lists). Intensional polymorphism and tag-freegarbage col lection eliminate the need to use a universal representation when compiling p olymorphic languages. TIL uses these technologies to represent many data values \naturally". For example, TIL provides tag-free, unallo cated, word-sized integers; aligned, unboxed oating-p ointarrays; and unallo cated multi-argument functions. These natural representations and calling conventions not only improve the p erformance of SML programs, but also allow them to interop erate with legacy co de written in languages suchasCandFortran. When typ es are unknown at compile time, TIL may pro duce machine co de which is slower and bigger than conventional approaches. This is b ecause typ es must b e constructed and passed to p olymorphic functions, and p olymorphic functions must examine the typ es at run-time to determine appropriate execution paths. However, when typ es are known at compile time, no overhead is incurred to supp ort p olymorphism or garbage collection. Because these technologies make p olymorphic functions slower, it b ecomes imp ortant to eliminate as many p olymorphic functions at compile time as is p ossible. Inlining and uncurrying are well-known techniques for eliminating p olymorphic and higher-order functions. Wehave found that for the b enchmarks used here, these techniques eliminate all p olymorphic functions and all but a few higher-order functions when programs are compiled as a whole. Wehave also found that applying traditional lo op optimizations to recursive functions, suchas common sub-expression elimination and invariant removal, is imp ortant. In fact, these optimization reduce execution time by a median of 39%. An imp ortant prop erty of TIL is that all optimizations and the key transformations are p er- formed on typed intermediate languages (hence the name TIL). Maintaining correct typ e information throughout optimization is necessary to supp ort b oth intensional p olymorphism and garbage collection, b oth of which require typ e information at run time. By using strongly-typ ed intermediate languages, we ensure that typ e information is maintained in a principled fashion, instead of relying up on ad hoc invariants. In fact, using the intermediate forms of TIL, an \untrusted" compiler can 1 pro duce fully optimized intermediate co de, and a client can automatically verify the typ e integrity of the co de. Wehave found that this ability has a strong engineering b ene t: typ e-checking the output of each optimization or transformation helps us identify and eliminate bugs in the compiler. In the remainder of this pap er, we describ e the technologies used by TIL in detail, givean overview of the structure of TIL, present a detailed example showing how TIL compiles ML co de, and give p erformance results of co de pro duced by TIL. 2 Overview of the Technologies This section contains a high-level overview of the technologies we use in TIL. 2.1 Intensional Polymorphism Intensional p olymorphism [23] eliminates restrictions on data representations due to p olymorphism, separate compilation, abstract datatyp es, and garbage collection. It also supp orts ecient calling conventions (multiple arguments passed in registers) and tag-free p olymorphic, structural equality. With intensional p olymorphism, typ es are constructed and passed as values at run time to p olymorphic functions, and these functions can branch based on the typ es. For example, when extracting a value from an array, TIL uses a typecase expression to determine the typeofthe array and to select the appropriate sp ecialized subscript op eration: fun sub[ ](x: array,i:int)= typecase of int => intsub(x, i) | oat => floatsub(x, i) | ptr( ) => ptrsub(x, i) If the typ e of the array can b e determined at compile-time, then an optimizer can eliminate the typecase: sub[ oat](a, 5) ,! floatsub(a, 5) However, intensional p olymorphism comes with two costs. First, wemust construct and pass representations of typ es to p olymorphic functions at run time. Furthermore, wemust compile p olymorphic functions to supp ort any p ossible representation and insert typecase constructs to select the appropriate co de paths. Hence, the co de we generate for p olymorphic functions is b oth bigger and slower, and minimizing p olymorphism b ecomes quite imp ortant. Second, in order to use typ e information at run time, for b oth intensional p olymorphism and tag- free garbage collection, wemust propagate typ es through each stage of compilation. To address this second problem, almost all compilation stages, including optimization and closure conversion, are expressed as typ e-directed, typ e-preserving translations to strongly-typ ed intermediate languages. The key dicultywithusingtyp ed intermediate languages is formulating a typ e system that is expressive enough to statically typ e check terms that branchontyp es at run time, suchassub. The typ e system used in TIL is based on the approach suggested by Harp er and Morrisett [23, 33]. Typ es themselves are represented as expressions in a simply-typ ed -calculus extended with an inductively generated base kind (the monotyp es), and a corresp onding induction elimination form. The induction elimination form is essentially a \Typ ecase" at the typ e level; this allows us to write typ e expressions that track the run-time control ow of term-level typecase expressions. Nevertheless, the typ e system used by TIL remains b oth sound and decidable. This implies that at any stage during optimization, we can automatically verify the typ e integrity of the co de. 2 2.2 Conventional and Lo op-Oriented Optimizations Program optimization is crucial to reducing the cost of intensional p olymorphism,improving lo ops and recursive functions, and eliminating higher-order and p olymorphic functions.

TIL: a Type-Directed Optimizing Compiler for ML

Program Analysis and Optimization for Embedded Systems

Type-Safe Multi-Tier Programming with Standard ML Modules

Four Lectures on Standard ML

Latest Results from the Procedure Calling Test, Ackermann's Function

A Separate Compilation Extension to Standard ML

Functional Programming (ML)

Stochastic Program Optimization by Eric Schkufza, Rahul Sharma, and Alex Aiken

Robert Harper Carnegie Mellon University

The Effect of Code Expanding Optimizations on Instruction Cache Design

Standard ML Mini-Tutorial [1Mm] (In Particular SML/NJ)

ARM Optimizing C/C++ Compiler V20.2.0.LTS User's Guide (Rev. V)

Compiler Optimization-Space Exploration