The Glasgow Haskell Compiler: a Technical Overview

Simon L Peyton Jones, Cordy Hall, Kevin Hammond, Will Partain, Phil Wadler
Department of Computing Science, University of Glasgow, G12 8QQ
email: simonpj@dcs.glasgow.ac.uk
December

This paper appears in the Proceedings of the UK Joint Framework for Information Technology (JFIT) Technical Conference, Keele.

Abstract

We give an overview of the Glasgow Haskell compiler, focusing especially on the way in which we have been able to exploit the rich theory of functional languages to give very practical improvements in the compiler. The compiler is portable, modular, generates good code, and is freely available.

Introduction

Computer Science is both a scientific and an engineering discipline. As a scientific discipline, it seeks to establish generic principles and theories that can be used to explain or underpin a variety of particular applications. As an engineering discipline, it constructs substantial artefacts of software and hardware, sees where they fail and where they work, and develops new theory to underpin areas that are inadequately supported. Milner eloquently argues for this dual approach in Computer Science.

Functional programming is a research area that offers an unusually close interplay between these two aspects (Peyton Jones b). Theory often has immediate practical application, and practice often leads directly to new demands on theory. This paper describes our experience of building a substantial new compiler for the purely functional language Haskell. We discuss our motivations, major design decisions, and achievements, paying particular attention to the interaction of theory and practice.

Scaling prototypes up into large real systems appears to be less valued in the academic community than small systems that demonstrate concepts, being sometimes dismissed as just development work. Nevertheless, we believe that many research problems can only be exposed during the act of constructing large and complex systems. We hope that this paper serves to substantiate this point.

The compiler work described in this paper is one of the two strands of the SERC-funded GRASP project. The other concerns parallel functional programming on the GRIP multiprocessor, but space precludes coverage of both.

Goals

Haskell is a purely functional, non-strict language designed by an international group of researchers (Hudak et al.). It has now become a de facto standard for the non-strict (or lazy) functional programming community, with at least three compilers available.

Our goals in writing a new compiler were these:

  • To make freely available a robust and portable compiler for Haskell that generates good quality code. This goal is more easily stated than achieved. Haskell is a rather large language, incorporating a rich syntax; a new type system that supports systematic overloading using so-called type classes (Wadler & Blott); a wide variety of built-in data types, including arbitrary-precision integers, rationals, and arrays; a module system; and an input/output system.

  • To provide a modular foundation that other researchers can extend and develop. In our experience, researchers are often unable to evaluate their ideas because the sheer effort of building the framework required is too great. We have tried very hard to build our compiler in a well-documented and modular way, so that others will find it relatively easy to modify.

  • To learn what real programs do. The intuition of an implementor is a notoriously poor basis for taking critical design decisions. The RISC revolution in computer architecture was based partly on the simple idea of measuring what real programs actually do most often, and implementing those operations very well (Hennessy & Patterson). Lazy functional programs execute in a particularly non-intuitive fashion, and we plan to make careful quantitative measurements of their behaviour.

An overview of the compiler

The compiler and its runtime system have the following major characteristics:

  • It is written almost entirely in Haskell. The only exception is that the parser is written in Yacc and C.

  • It generates C as its target code. This has now become relatively common, conferring as it does wide portability. The big question is, of course, what efficiency penalty is paid, a matter we discuss later in the paper.

  • We have extended the language to allow mixed-language programming, by supporting arbitrary in-line statements written in C. It is of course not possible to do this in a completely secure way (for example, the C procedure could overwrite the Haskell heap), but our technique is referentially transparent; that is, all the usual program transformations remain valid. This mixed-language working allows us to extend Haskell easily, for example to provide access to existing procedure libraries. Without a general way of calling C, each such extension would require a separate modification to the code generator.

  • Haskell's monolithic arrays are fully implemented, with O(1) access time. In addition, we have extended the language with incrementally-updatable arrays; indeed, the monolithic arrays are implemented using these mutable arrays.

  • The interface between the storage manager and the compiler is carefully defined, and highly configurable. For example, the storage manager comes with no fewer than four different garbage collectors, including a generational one.

  • The compiler and its runtime system also support comprehensive runtime profiling of both space and time, at both the user level and the evaluation-model level (Sansom & Peyton Jones).

Figure: Breakdown of module sizes (columns: Module, Lines).
  Major passes: Main; Parser; Renamer; Type inference; Desugaring; Core-language transformation; STG-language transformation; Code generation.
  Data type definition and manipulation: Haskell abstract syntax; Core language; STG language; Abstract C; Identifier representations; Type representations; Prelude definitions.
  Utility modules: Utilities; Profiling.
  TOTAL.

Organisation

The overall organisation of the compiler is quite conventional. A driver program runs a sequence of Unix processes, namely a literate-script preprocessor, the main parser, the compiler itself, the C compiler, the Unix assembler, and the Unix linker. The main passes of the compiler itself are shown in the figure "Overview of the Glasgow Haskell Compiler"; the figure "Breakdown of module sizes" summarises their sizes. The passes are as follows:

  • A simple recursive-descent parser recognises the simple syntax output by the separate main parser process. The abstract syntax tree produced by this parser faithfully represents every construct in the Haskell source language, even where a distinction is purely syntactic. This improves the readability of error messages from the type checker.

  • The renamer resolves scoping and naming issues, especially those concerned with module imports and exports.

  • The type inference pass annotates the program with type information, and transforms out all the overloading. The details of the latter transformation are given by Wadler & Blott; we do not discuss it further here.

  • The desugarer converts the rich Haskell abstract syntax into a very much simpler functional language we call the Core language. Notice that desugaring follows type inference. As a result, type error messages are expressed in terms of the original source, and they also appear more quickly.

  • A variety of optional Core-language transformation passes improve the code.

  • A simple pass then converts Core to the Shared Term Graph (STG) language [1], an even simpler, but still purely functional, language.

  • A variety of optional STG-language transformation passes improve the STG code.

  • The code generator converts the STG language to Abstract C. Abstract C is no more than an internal data type that can be printed in C syntax or, if preferred (though we have not done this), in assembly-language syntax for a particular machine.

  • The target-code printer prints Abstract C in a form acceptable to a C compiler.

[1] The STG language was originally short for Spineless Tagless G-machine language, but in fact the language is entirely independent ...

Figure: Overview of the Glasgow Haskell Compiler.
  Haskell source -> Lex/Yacc parser -> Prefix form -> Reader -> AbsSyntax -> Renamer -> AbsSyntax -> Typechecker -> AbsSyntax -> Desugarer -> CoreSyntax -> Transform -> CoreToStg -> StgSyntax -> Transform -> CodeGen -> Abstract C -> Flatten -> C -> C compiler -> Native code.
  The figure also marks entry points for other front ends and other code generators.

Compilation by transformation

A consistent theme runs through all our design decisions, namely that most of the compilation process is expressed as correctness-preserving transformations of a purely-functional program. A wide variety of conventional imperative-program optimisations have simple counterparts as functional-language transformations, each of which replaces equals by equals. Here are just three examples:

  • Copy propagation is a special case of inlining a let-bound variable:

      $\mathbf{let}\ v = w\ \mathbf{in}\ e \;\Longrightarrow\; e[w/v]$

    where $v$ and $w$ are variables. It is a special case because the rule remains valid even if $w$ is an arbitrary expression, which is certainly not true in an imperative language.

  • Procedure inlining is an example of beta reduction. If $f$ is defined like this:

      $f = \lambda x\, y.\ b$

    then an expression with a call of $f$, such as $f\ a_1\ a_2$, can be transformed to $b[a_1/x,\ a_2/y]$.

  • Lifting invariant expressions out of loops corresponds to a simple transformation called the full laziness transformation (Hughes; Peyton Jones & Lester).

This idea of compilation by transformation is not new (Appel; Fradet & Le Métayer; Kelsey), but it is particularly applicable in a non-strict language. Although non-strict semantics carries an implementation cost, it also means that transformation rules such as those above can be applied globally and ...
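The first two rules above, copy propagation and procedure inlining by beta reduction, can be sketched on a toy expression type. This is a minimal illustration only: the `Expr` type and the functions `subst`, `copyProp`, and `betaReduce` are hypothetical names introduced here, not the compiler's Core language or its transformation passes. The sketch assumes non-recursive `let` and unique binder names (as a renamer might guarantee), so substitution does not need to rename variables to avoid capture.

```haskell
-- Toy expression language (hypothetical; not the compiler's Core language).
data Expr
  = Var String
  | Lit Int
  | Add Expr Expr
  | App Expr Expr
  | Lam String Expr
  | Let String Expr Expr   -- non-recursive let
  deriving (Eq, Show)

-- subst x e body computes body[e/x].
-- Assumption: binder names are unique, so no variable capture can occur.
subst :: String -> Expr -> Expr -> Expr
subst x e = go
  where
    go body = case body of
      Var y | y == x    -> e
            | otherwise -> body
      Lit _             -> body
      Add a b           -> Add (go a) (go b)
      App f a           -> App (go f) (go a)
      Lam y b | y == x    -> body              -- x is shadowed below here
              | otherwise -> Lam y (go b)
      Let y r b | y == x    -> Let y (go r) b  -- x is shadowed in b only
                | otherwise -> Let y (go r) (go b)

-- Copy propagation: let v = w in e  ==>  e[w/v], when w is a variable.
copyProp :: Expr -> Expr
copyProp expr = case expr of
  Let v (Var w) body -> copyProp (subst v (Var w) body)
  Let v rhs body     -> Let v (copyProp rhs) (copyProp body)
  Add a b            -> Add (copyProp a) (copyProp b)
  App f a            -> App (copyProp f) (copyProp a)
  Lam x b            -> Lam x (copyProp b)
  _                  -> expr

-- Beta reduction: (\x -> b) a  ==>  b[a/x], applied throughout a term.
-- This may not terminate on terms that have no normal form.
betaReduce :: Expr -> Expr
betaReduce expr = case expr of
  App f a -> case betaReduce f of
    Lam x b -> betaReduce (subst x a b)
    f'      -> App f' (betaReduce a)
  Add a b   -> Add (betaReduce a) (betaReduce b)
  Lam x b   -> Lam x (betaReduce b)
  Let v r b -> Let v (betaReduce r) (betaReduce b)
  _         -> expr

-- let v = w in v + v
copyExample :: Expr
copyExample = copyProp (Let "v" (Var "w") (Add (Var "v") (Var "v")))

-- Inline f = \x y -> x + y at the call site f a1 a2.
inlineExample :: Expr
inlineExample = betaReduce (subst "f" fDef call)
  where
    fDef = Lam "x" (Lam "y" (Add (Var "x") (Var "y")))
    call = App (App (Var "f") (Var "a1")) (Var "a2")
```

Loaded into GHCi, `copyExample` evaluates to `Add (Var "w") (Var "w")` and `inlineExample` to `Add (Var "a1") (Var "a2")`. Note that `betaReduce` substitutes the argument unevaluated, which mirrors the point that the rule stays valid for arbitrary expressions in a non-strict language; a real compiler would also have to weigh the cost of duplicating work when the argument is used more than once.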
