
Bridging the gulf: a common intermediate language for ML and Haskell

Simon Peyton Jones, University of Glasgow and Oregon Graduate Institute
John Launchbury, Oregon Graduate Institute
Mark Shields, University of Glasgow and Oregon Graduate Institute
Andrew Tolmach, Portland State University

To appear in the ACM Symposium on Principles of Programming Languages (POPL).

Abstract

Compilers for ML and Haskell use intermediate languages that incorporate deeply-embedded assumptions about order of evaluation and side effects. We propose an intermediate language into which one can compile both ML and Haskell, thereby facilitating the sharing of ideas and infrastructure, and supporting language developments that move each language in the direction of the other. Achieving this goal without compromising the ability to compile as good code as a more direct route turned out to be much more subtle than we expected. We address this challenge using monads and unpointed types, identify two alternative language designs, and explore the choices they embody.

Introduction

Functional programmers are typically split into two camps: the strict (or call-by-value) camp, and the lazy (or call-by-need) camp. As the discipline has matured, though, each camp has come more and more to recognise the merits of the other, and to recognise the huge areas of common interest. It is hard, these days, to find anyone who believes that laziness is never useful, or that strictness is always bad. While there are still pervasive stylistic differences between strict and lazy programming, it is now often possible to adopt lazy evaluation at particular places in a strict language (Okasaki), or strict evaluation at particular points in a lazy one (for example, Haskell's strictness annotations (Peterson et al.)).

This rapprochement has not yet, however, propagated to our implementations. The insides of an ML compiler look pervasively different to those of a Haskell compiler. Notably, sequencing and support for side effects and exceptions are usually implicit in an ML compiler's intermediate language (IL), but explicit (where they occur) in a Haskell compiler (Launchbury & Peyton Jones). On the other hand, thunk formation and forcing are implicit in a Haskell compiler's intermediate language, but explicit in an ML compiler. These pervasive differences make it impossible to share code, and hard to share results and analyses, between the two styles.

To say that support for side effects is implicit in an ML compiler's IL, for example, is not to say that an ML compiler will take no notice of side effects; on the contrary, an ML compiler might well perform a global analysis that identifies pure subexpressions (though in practice few do). However, one might wonder whether the analysis would discover all the pure subexpressions in a Haskell program translated into the IL. In the same way, if an ML program were translated into a Haskell compiler's IL, the latter might not discover all the occasions in which a function argument was guaranteed to be already evaluated. This thought motivates the following question: could we design a common compiler intermediate language (IL) that would serve equally well for both strict and lazy languages? The purpose of this paper is to explore the design space for just such a language.

We restrict our attention to higher-order, polymorphically typed intermediate languages. There is considerable interest at the moment in type-directed compilation for polymorphic languages, in which type information is maintained accurately right through compilation and even on to run time (Harper & Morrisett; Shao & Appel; Tarditi et al.). Hence we focus on higher-order, statically typed source languages, represented in this paper by ML (Milner & Tofte) and Haskell (Peterson et al.).

At first we expected the design to be relatively straightforward, but we discovered that it was not. In particular, making sure that the IL has good operational properties for both strict and lazy languages turns out to be rather subtle. Identifying these subtleties is the main contribution of the paper:

- We employ monads to express and delimit state, input/output, and exceptions (Section …). Using monads in this way is now well known to theorists (Moggi) and to language designers (Launchbury & Peyton Jones; Peyton Jones & Wadler; Wadler), but, with one exception [1], no compiler that we know has monads built into its intermediate language.

- We employ unpointed types to express the idea that an expression cannot diverge (Section …). We show that the straightforward use of unpointed types does not lead to a good implementation (Section …). This leads us to explore two distinct language designs. The first, L1, is mathematically simple, but cannot be compiled well (Section …). An alternative design, L2, adds operational significance to unpointed types, by guaranteeing that a variable of unpointed type is evaluated (Section …); this means L2 can be compiled well, but weakens its theory.

- We identify an interaction between unpointed types, polymorphism, and recursion in L1 (Section …). Interestingly, the problem turns out to be more easily solved in L2 than L1 (Section …).

None of these ingredients are new. Our contribution is to explore the interactions of mixing them together. We emerge with the core of a practical IL that has something to offer both the strict and lazy community in isolation, as well as offering them a common framework. Our long-term goal is to establish an intermediate language that will enable the two communities to share both ideas (analyses, transformations) and systems (optimisers, code generators, run-time systems, profilers, etc.) more effectively than hitherto.

[1] Personal communication, Nick Benton, Persimmon IT Ltd.

The ground rules

We seek an intermediate language (IL) with the following properties:

- It must be possible to translate both core ML and Haskell into the IL. Extensions that add laziness to ML, or strictness to Haskell, should be readily incorporated. We make no attempt to treat ML's module system, though that would be a desirable extension.

- In order to accommodate ML and Haskell, the IL's type system must support polymorphism. This ground rule turns out to have very significant, and rather unfortunate, impact upon our language designs (Section …), but it seems quite essential. Nearly all existing compilers generate polymorphic target code, and although researchers have experimented with compiling away polymorphism by type specialisation (Jones; Tolmach & Oliva), problems with separate compilation and potential code explosion remain unresolved.

- The IL should be explicitly typed (Harper & Mitchell). We have in mind a variant of System F (Girard), with its explicit type abstractions and applications. The expressiveness of System F really is required. For example, there are several reasons for wanting polymorphic arguments to functions: the translation of Haskell type classes creates dictionaries with polymorphic components; we would like to be able to simulate modules using records (Jones); and rank-2 polymorphism is required to express encapsulated state (Launchbury & Peyton Jones) and data-structure fusion (Gill, Launchbury & Peyton Jones). IL programs can readily be type-checked, but there is no requirement that one could infer types from a type-erased IL program.

- The IL should have a single well-defined semantics. On the face of it, compilers for both strict and lazy languages already use a common language, namely the lambda calculus. But this similarity is only at the …

- … a less efficient basic evaluation model, especially when starting from ML. Indeed, our hope is that we may ultimately be able to generate better code through this new route.

L1, a totally explicit language

It is clear that the IL must be explicit about things that are implicit in traditional compiler ILs. Where are these implicit aspects of a traditional IL currently made explicit? Answer: in the denotational semantics of the IL. For example, the denotational semantics of a call-by-value lambda calculus looks something like this:

$$
\mathcal{E}[\![e_1\;e_2]\!] =
\begin{cases}
(\mathcal{E}[\![e_1]\!])\; b & \text{if } a = b \\
\bot & \text{if } a = \bot
\end{cases}
\qquad \text{where } a = \mathcal{E}[\![e_2]\!]
$$

Here the two cases in the right-hand side deal with the possible non-termination of the argument. What is implicit in the IL (the evaluation of the argument, in this case) becomes explicit in the semantics. An obvious suggestion is therefore to make the IL reflect the denotational semantics of the source language directly, so that everything is explicit in the IL, and nothing remains to be explicated by the semantics. This is our first design, L1.

Figure … gives the syntax and type rules for L1. We note the following features:

- As a compromise in the interest of brevity, all our formal material describes only a simply-typed calculus, although supporting polymorphism is one of our ground rules. The extensions to add polymorphism, complete with explicit type abstractions and applications in the term language, are fairly standard (Harper & Mitchell; Peyton Jones; Tarditi et al.). However, polymorphism adds some extra complications (Section …).

- We omit recursive data types, constructors, and case expressions for the sake of simplicity, being content with pairs and selectors.

- let is simply very convenient syntactic sugar. It is not there to introduce polymorphism, even in the polymorphic extension of the language; explicit typing removes this motivation for let.

- letrec introduces recursion. Though we only give it one binding here, our intention is that it should accommodate multiple bindings. We use it rather than a constant fix because the latter requires heavy encoding for mutual recursion that is not reflected in an implementation.
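
The remark about letrec versus a constant fix can be made concrete with a small sketch. It is written in Haskell rather than in the IL itself, and the names evenOdd and evenOddFix are hypothetical, chosen only for this illustration. The first definition uses a multi-binding recursive let, which is what a multi-binding letrec would express directly. The second obtains the same two functions from the single fixpoint combinator fix (from Data.Function), which forces the bindings to be packed into a pair and projected out again at every recursive reference. Both definitions assume a non-negative argument.

```haskell
import Data.Function (fix)

-- Mutual recursion written directly, as a multi-binding letrec would allow.
-- Assumes n >= 0; the names are illustrative and do not come from the paper.
evenOdd :: Int -> (Bool, Bool)
evenOdd n =
  let isEven k = if k == 0 then True  else isOdd  (k - 1)
      isOdd  k = if k == 0 then False else isEven (k - 1)
  in (isEven n, isOdd n)

-- The same two functions obtained from a single fixpoint constant.
-- The bindings are packed into a pair, and each recursive reference
-- becomes a projection (fst or snd) from that pair.
evenOddFix :: Int -> (Bool, Bool)
evenOddFix n = (fst fs n, snd fs n)
  where
    fs :: (Int -> Bool, Int -> Bool)
    fs = fix (\p ->
           ( \k -> if k == 0 then True  else snd p (k - 1)
           , \k -> if k == 0 then False else fst p (k - 1) ))
```

For example, evenOdd 10 and evenOddFix 10 both evaluate to (True, False). The pair and the projections in the second version exist only because of the encoding; an implementation of mutually recursive functions would not normally build such a tuple, which is the sense in which the encoding is not reflected in an implementation.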