A Portable Assembly Language That Supports Garbage

C--: a portable assembly language that supports garbage collection

Simon Peyton Jones (1), Norman Ramsey (2), and Fermin Reig (3)

(1) simonpj@microsoft.com, Microsoft Research Ltd
(2) nr@cs.virginia.edu, University of Virginia
(3) reig@dcs.gla.ac.uk, University of Glasgow

Abstract

For a compiler writer, generating good machine code for a variety of platforms is hard work. One might try to reuse a retargetable code generator, but code generators are complex and difficult to use, and they limit one's choice of implementation language. One might try to use C as a portable assembly language, but C limits the compiler writer's flexibility and the performance of the resulting code. The wide use of C, despite these drawbacks, argues for a portable assembly language. C-- is a new language designed expressly for this purpose.

The use of a portable assembly language introduces new problems in the support of such high-level run-time services as garbage collection, exception handling, concurrency, profiling, and debugging. We address these problems by combining the C-- language with a C-- run-time interface. The combination is designed to allow the compiler writer a choice of source-language semantics and implementation techniques, while still providing good performance.

Introduction

Suppose you are writing a compiler for a high-level language. How are you to generate high-quality machine code? You could do it yourself, or you could try to take advantage of the work of others by using an off-the-shelf code generator. Curiously, despite the huge amount of research in this area, only three retargetable, optimizing code generators appear to be freely available: VPO (Benitez and Davidson), MLRISC (George), and the gcc back end (Stallman). Each of these impressive systems has a rich, complex, and ill-documented interface. Of course, these interfaces are quite different from one another, so once you start to use one, you will be unable to switch easily to another. Furthermore, they are language-specific. To use MLRISC, you must write your front end in ML; to use the gcc back end, you must write it in C; and so on.

All of this is most unsatisfactory. It would be much better to have one portable assembly language that could be generated by a front end and implemented by any of the available code generators. So pressing is this need that it has become common to use C as a portable assembly language (Atkinson et al.; Bartlett; Peyton Jones; Tarditi, Acharya, and Lee; Henderson, Conway, and Somogyi; Pettersson; Serrano and Weis). Unfortunately, C was never intended for this purpose: it is a programming language, not an assembly language. C locks the implementation into a particular calling convention, makes it impossible to compute targets of jumps, provides no support for garbage collection, and provides very little support for exceptions or debugging, as the next section explains.

The obvious way forward is to design a language specifically as a compiler target language. Such a language should serve as the interface between a compiler for a high-level language (the front end) and a retargetable code generator (the back end). The language would not only make the compiler writer's life much easier, but would also give the author of a new code generator a ready-made customer base. In an earlier paper we propose a design for just such a language, C-- (Peyton Jones, Oliva, and Nordin), but the story does not end there. Separating the front and back ends greatly complicates run-time support. In general, the front end, back end, and run-time system for a programming language are designed together. They cooperate intimately to support such high-level features as garbage collection, exception handling, debugging, profiling, and concurrency, which we call high-level run-time services. If the back end is a portable assembler like C--, we want the cooperation without the intimacy: an implementation of C-- should be independent of the front ends with which it will be used.

One alternative is to make all these high-level services part of the abstraction offered by the portable assembler. For example, the Java Virtual Machine, which provides garbage collection and exception handling, has been used as a target for languages other than Java, including Ada (Taft), ML (Benton, Kennedy, and Russell), Scheme (Clausen and Danvy), and Haskell (Wakeling). But a sophisticated platform like a virtual machine embodies too many design decisions. For a start, the semantics of the virtual machine may not match the semantics of the language being compiled (e.g. the exception semantics). Even if the semantics happen to match, the engineering tradeoffs may differ dramatically. For example, functional languages like Haskell or Scheme allocate like crazy (Diwan, Tarditi, and Moss), and JVM implementations are typically not optimised for this case. Finally, a virtual machine typically comes complete with a very large infrastructure (class loaders, verifiers, and the like) that may well be inappropriate. Our intended level of abstraction is much, much lower.

Our problem is to enable a client to implement high-level services while still using C-- as a code generator. As we discuss later in the paper, supporting high-level services requires knowledge from both the front and back ends. The insight behind our solution is that C-- should include not only a low-level assembly language, for use by the compiler, but also a low-level run-time system, for use by the front end's run-time system. The only intimate cooperation required is between the C-- back end and its run-time system; the front end works with C-- at arm's length, through a well-defined language and a well-defined run-time interface (described later in the paper). This interface adds something fundamentally new: the ability to inspect and modify the state of a suspended computation.

It is not obvious that this approach is workable. Can just a few assembly-language capabilities support many high-level run-time services? Can the front-end run-time system easily implement high-level services using these capabilities? How much is overall efficiency compromised by the arm's-length relationship between the front-end runtime and the C-- runtime?
We cannot yet answer these questions definitively. Instead, the primary contributions of this paper are to identify needs that are common to various high-level services, and to propose specific mechanisms to meet these needs. We demonstrate only how to use C-- to implement the easiest of our intended services, namely garbage collection. Refining our design to accommodate exceptions, concurrency, profiling, and debugging has emerged as an interesting research challenge.

It's impossible... or it's C

The dream of a portable assembler has been around at least since UNCOL (Conway). Is it an impossible dream, then? Clearly not: C's popularity as an assembler is clear evidence that a need exists, and that something useful can be done. If C is so popular, then perhaps C is perfectly adequate? Not so. There are many difficulties, of which the most fundamental are these:

  • The C route rewards those who can map their high-level language rather directly onto C. A high-level language procedure becomes a C procedure, and so on. But this mapping is often awkward, and sometimes impossible. For example, some source languages fundamentally require tail-call optimisation: a procedure call whose result is returned to the caller of the current procedure must be executed in the stack frame of the current procedure. This optimisation allows iteration to be implemented efficiently using recursion. More generally, it allows one to think of a procedure as a labelled, extended basic block that can be jumped to, rather than as a subprogram that can only be called. Such procedures give a front end the freedom to design its own control flow. It is very difficult to implement the tail-call optimisation in C, and no C compiler known to us does so across separately compiled modules. Those using C have been very ingenious in finding ways around this deficiency (Steele; Tarditi, Acharya, and Lee; Peyton Jones; Henderson, Conway, and Somogyi), but the results are complex, fragile, and heavily tuned for one particular implementation of C (usually gcc).
  • A C compiler may lay out its stack frames as it pleases. This makes it difficult for a garbage collector to find the live pointers. Implementors either arrange not to keep pointers on the C stack, or they use a conservative garbage collector. These restrictions are Draconian.
  • The unknown stack-frame layout also complicates support for exception handling, debugging, profiling, and concurrency. For example, an exception-handling mechanism needs to walk the stack, perhaps removing stack frames as it goes. Again, C makes it essentially impossible to implement such mechanisms, unless they can be closely mapped onto what C provides (i.e. setjmp and longjmp).
  • A C compiler has to be very conservative about the possibility of memory aliasing. This seriously limits the ability of the instruction scheduler to permute memory operations or hoist them out of a loop. The front-end compiler often knows that aliasing cannot occur, but there is no way to convey this information to the C compiler.

So much for fundamental issues. C also lacks the ability to control a number of important low-level features, including returning multiple values in registers from a procedure, misaligned memory accesses, arithmetic, data layout, and omitting range checks on multiway jumps.

In short, C is awkward to use as a portable assembler, and many of these difficulties translate into performance hits. A portable assembly language should be able to offer better performance, as well as greater ease of use.

An overview of C--

In this section we give an overview of the design of C--. Fuller descriptions can
