`C and tcc: A Language and Compiler for Dynamic Code Generation

Massimiliano Poletto
Laboratory for Computer Science, Massachusetts Institute of Technology
and
Wilson C. Hsieh
Department of Computer Science, University of Utah
and
Dawson R. Engler
Laboratory for Computer Science, Massachusetts Institute of Technology
and
M. Frans Kaashoek
Laboratory for Computer Science, Massachusetts Institute of Technology

Dynamic code generation allows programmers to use run-time information in order to achieve performance and expressiveness superior to those of static code. The `C ("Tick C") language is a superset of ANSI C that supports efficient and high-level use of dynamic code generation. `C provides dynamic code generation at the level of C expressions and statements, and supports the composition of dynamic code at run time. These features enable programmers to add dynamic code generation to existing C code incrementally, and to write important applications (such as "just-in-time" compilers) easily. The paper presents many examples of how `C can be used to solve practical problems.

tcc is an efficient, portable, and freely available implementation of `C. tcc allows programmers to trade dynamic compilation speed for dynamic code quality: in some applications, it is most important to generate code quickly, while in others code quality matters more than compilation speed. The overhead of dynamic compilation is on the order of 100 to 600 cycles per generated instruction, depending on the level of dynamic optimization. Measurements show that the use of dynamic code generation can improve performance by almost an order of magnitude; two- to four-fold speedups are common. In most cases, the overhead of dynamic compilation is recovered in under 100 uses of the dynamic code; sometimes it can be recovered within one use.

Categories and Subject Descriptors: D.3.2 [Programming Languages]: Language Classifications--specialized application languages; D.3.3 [Programming Languages]: Language Constructs and Features; D.3.4 [Programming Languages]: Processors--compilers; code generation; run-time environments

General Terms: Algorithms, Languages, Performance

Additional Key Words and Phrases: Dynamic code generation, dynamic code optimization, ANSI C

Email:[email protected] du, [email protected] h.e du, eng [email protected] , kaasho e [email protected]. L ab o-

rato ry for Co mp ute r Sc ien ce, Massachuset ts Inst it ute o f Techno logy, Cambrid ge, MA 0 213 9. The

sec ond aut hor can b e reached at: University of Ut ah, Comput er Sc ie nce, 5 0 S Centra l Ca mpus

Drive , Ro om 3190 , Salt Lake City, UT 8 4112 -9 205.

This resea rchwas supp o rte d in part bytheAdvanc ed Resea rch Pro ject s Age ncy und er co ntrac ts

N000 14-94-1-098 5 a nd N6 600 1-96-C-85 22, a nd by a NSF Nation al Youn g I nve stigat or Award.

2 · Poletto, Hsieh, Engler, Kaashoek

1. INTRODUCTION

Dynamic code generation (the generation of code at run time) enables the use of run-time information to improve code quality. Information about run-time invariants provides new opportunities for classical optimizations such as strength reduction, dead code elimination, and inlining. In addition, dynamic code generation is the key technology behind just-in-time compilers, compiling interpreters, and other components of modern mobile code and other adaptive systems.

`C is a superset of ANSI C that supports the high-level and efficient use of dynamic code generation. It extends ANSI C with a small number of constructs that allow the programmer to express dynamic code at the level of C expressions and statements, and to compose arbitrary dynamic code at run time. These features enable programmers to write complex imperative code manipulation programs in a style similar to Lisp [Steele Jr. 1990], and make it relatively easy to write powerful and portable dynamic code. Furthermore, since `C is a superset of ANSI C, it is not difficult to improve the performance of code incrementally by adding dynamic code generation to existing C programs.

`C's extensions to C (two type constructors, three unary operators, and a few special forms) allow dynamic code to be type-checked statically. Much of the overhead of dynamic code generation can therefore be incurred statically, which improves the efficiency of dynamic compilation. While these constructs were designed for ANSI C, it should be straightforward to add analogous constructs to other statically typed languages.

tcc is an efficient and freely available implementation of `C, consisting of a front end, back ends that compile to C and to MIPS and SPARC assembly, and two runtime systems. tcc allows the user to trade dynamic code quality for dynamic code generation speed. If compilation speed must be maximized, dynamic code generation and register allocation can be performed in one pass; if code quality is most important, the system can construct and optimize an intermediate representation prior to code generation. The overhead of dynamic code generation is approximately 100 cycles per generated instruction when tcc only performs simple dynamic code optimization, and approximately 600 cycles per generated instruction when all of tcc's dynamic optimizations are turned on.
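As a rough back-of-the-envelope sketch of when these overheads pay off (all numbers other than the 100 to 600 cycles per generated instruction quoted above are hypothetical), the break-even point can be computed as follows:

```c
#include <assert.h>

/* Hypothetical cost model: number of uses needed before the one-time
   generation overhead is amortized by the per-use cycle savings. */
static long break_even_uses(long cycles_per_emitted_instr,
                            long emitted_instrs,
                            long cycles_saved_per_use) {
    long overhead = cycles_per_emitted_instr * emitted_instrs;
    /* round up: a partial use does not recoup the remainder */
    return (overhead + cycles_saved_per_use - 1) / cycles_saved_per_use;
}
```

For example, a 50-instruction dynamic function generated at 100 cycles/instruction, saving 200 cycles per call, breaks even after 25 calls; at 600 cycles/instruction and 300 cycles saved per call it breaks even after 100 calls, consistent with the measurements summarized above.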

This paper makes the following contributions:

- It describes the `C language, and motivates the design of the language.
- It describes tcc, with special emphasis on its two runtime systems, one tuned for code quality and the other for fast dynamic code generation.
- It presents an extensive set of `C examples, which illustrate the utility of dynamic code generation and the ease of use of `C in a variety of contexts.
- It analyzes the performance of tcc and tcc-generated dynamic code on several benchmarks. Measurements show that use of dynamic compilation can improve performance by almost an order of magnitude in some cases, and generally results in two- to four-fold speedups. The overhead of dynamic compilation is usually recovered in under 100 uses of the dynamic code; sometimes it can be recovered within one use.

The rest of this paper is organized as follows. Section 2 describes `C, and Section 3 describes tcc. Section 4 illustrates several sample applications of `C. Section 5 presents performance measurements. Finally, we discuss related work in Section 6, and summarize our conclusions in Section 7. Appendix A describes the `C extensions to the ANSI C grammar.

2. THE `C LANGUAGE

The `C language was designed to support easy-to-use dynamic code generation in a systems and applications programming environment. This requirement motivated some of the key features of the language:

- `C is a small extension of ANSI C: it adds very few constructs (two type constructors, three unary operators, and a few special forms) and leaves the rest of the language intact. As a result, it is possible to convert existing C code to `C incrementally.
- Dynamic code in `C is statically typed. This is consistent with C, and improves the performance of dynamic compilation by eliminating the need for dynamic type-checking. The same constructs used to extend C with dynamic code generation should be applicable to other statically typed languages.
- The dynamic compilation process is imperative: the `C programmer directs the creation and composition of dynamic code. This approach distinguishes `C from several recent declarative dynamic compilation systems [Auslander et al. 1996; Grant et al. 1997; Consel and Noel 1996]. We believe that the imperative approach is better suited to a systems environment, where the programmer wants tight control over dynamically generated code.

In `C, a programmer creates code specifications, which are static descriptions of dynamic code. Code specifications can capture the values of run-time constants, and they may be composed at run time to build larger specifications. They are compiled at run time to produce executable code. The process works as follows:

(1) At static compile time, a `C compiler processes the written program. For each code specification, the compiler generates code to capture its run-time environment, as well as code to generate dynamic code.

(2) At run time, code specifications are evaluated. The specification captures its run-time environment at this time, which we call environment binding time.

(3) At run time, code specifications are passed to the `C compile special form. compile invokes the specification's code generator, and returns a function pointer. We call this point in time dynamic code generation time.

For example, the following code fragment implements "Hello World" in `C:

  void make_hello(void) {
    void (*f)() = compile(`{ printf("hello, world\n"); }, void);
    (*f)();
  }

The code within the backquote and braces is a code specification for a call to printf that should be generated at run time. The code specification is evaluated at environment binding time, and the resulting object is then passed to compile, which generates executable code for the printf and returns a function pointer at dynamic code generation time. The function pointer can then be invoked directly.

The rest of this section describes in detail the dynamic code generation extensions introduced in `C. Section 2.1 describes the ` operator. Section 2.2 describes the type constructors in `C. Section 2.3 describes the unquoting operators, @ and $. Section 2.4 describes the `C special forms.

2.1 The ` Operator

The ` (backquote, or "tick") operator is used to create dynamic code specifications in `C. ` can be applied to an expression or compound statement, and indicates that code corresponding to that expression or statement should be generated at run time. `C disallows the dynamic generation of code-generating code, so ` does not nest. In other words, a code specification cannot contain a nested code specification. Some simple usages of backquote are as follows:

  /* Specification of dynamic code for the expression "4": results
     in dynamic generation of code that produces 4 as a value */
  `4

  /* Specification of dynamic code for a call to printf;
     j must be declared in an enclosing scope */
  `printf("%d", j)

  /* Specification of dynamic code for a compound statement */
  `{ int i; for (i = 0; i < 10; i++) printf("%d\n", i); }

Dynamic code is lexically scoped: variables in static code can be referenced in dynamic code. Lexical scoping and static typing allow type-checking and some instruction selection to occur at static compile time, decreasing dynamic code generation overhead.

The value of a variable after its scope has been exited is undefined, just as in ANSI C. In contrast to ANSI C, however, not all uses of variables outside their scope can be detected statically. For example, one may use a local variable declared in static code from within a backquote expression, and then return the value of the code specification. When the code specification is compiled, the resulting code references a memory location that no longer contains the local variable, because the original function's activation record has gone away. The compiler could perform a data-flow analysis to conservatively warn the user of a potential error; however, in our experience this situation arises very rarely, and is easy to avoid.

The use of several C constructs is restricted within backquote expressions. In particular, a break, continue, case, or goto statement cannot be used to transfer control outside the enclosing backquote expression. For instance, the destination label of a goto statement must be contained in the same backquote expression that contains the goto. `C provides other means for transferring control between backquote expressions; we discuss these methods in Section 2.4. The limitation on goto and other control-transfer statements enables a `C compiler to statically determine whether a control-flow change is legal. The use of return is not restricted, because dynamic code is always implicitly inside a function.

A backquote expression can be dynamically compiled using the compile special form, which is described in Section 2.4. compile returns a function pointer, which can then be invoked like any other function pointer.

2.2 Type Constructors

`C introduces two new type constructors, cspec and vspec. cspecs are static types for dynamic code; their presence allows dynamic code to be type-checked statically. vspecs are static types for dynamic lvalues (expressions that may be used on the left-hand side of an assignment); their presence allows dynamic code to allocate lvalues as needed.

A cspec or vspec has an associated evaluation type, which is the type of the dynamic value of the specification. The evaluation type is analogous to the type to which a pointer points.

2.2.1 cspec Types. cspec (short for code specification) is the type of a dynamic code specification; the evaluation type of the cspec is the type of the dynamic value of the code. For example, the type of the expression `4 is int cspec. The type void cspec is the type of a generic cspec (analogous to the use of void * as a generic pointer).

Applying ` to a statement or compound statement yields an expression of type void cspec. In particular, if the dynamic statement or compound statement contains a return statement, the type of the return value does not affect the type of the backquote expression. Since all type-checking is performed statically, it is possible to compose backquote expressions to create a function at run time with multiple (possibly incompatible) return types. This deficiency in the type system is a design choice: such errors are rare in practice, and checking for them would involve more overhead at dynamic compile time or additional linguistic extensions.

The code generated by ` may include implicit casts used to reconcile the result type of ` with its use; the standard conversion rules of ANSI C apply.

Some simple uses of cspec follow:

  int cspec expr1 = `4;           /* Code specification for expression "4" */
  float x;
  float cspec expr2 = `(x + 4.);  /* Capture free variable x: its value will be
                                     bound when the dynamic code is executed */

  /* All dynamic compound statements have type void cspec, regardless of
     whether the resulting code will return a value */
  void cspec stmt = `{ printf("hello, world\n"); return 0; };

2.2.2 vspec Types. vspec (variable specification) is the type of a dynamically generated lvalue, a variable whose storage class (whether it should reside in a register or on the stack, and in what location exactly) is determined dynamically. The evaluation type of the vspec is the type of the lvalue. void vspec is used as a generic vspec type. Objects of type vspec may be created by invoking the special forms param and local. param is used to create a parameter for the function currently under construction; local is used to reserve space in its activation record or allocate a register, if possible. See Section 2.4 for more details on these special forms.

In general, an object of type vspec is automatically treated as a variable of the vspec's evaluation type when it appears inside a cspec. A vspec inside a backquote expression can thus be used like a traditional C variable, both as an lvalue and an rvalue. For example, the following function creates a cspec that takes a single integer argument, adds one to it, and returns the result:

  void cspec plus1(void) {
    /* param takes the type and position of the argument to be generated */
    int vspec i = param(int, 0);
    return `{ return i + 1; };
  }

vspecs allow us to construct functions that take a run-time-determined number of arguments; this functionality is necessary in applications such as the compiling interpreter described in Section 4.4.2.
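To make the effect concrete, the dynamic code that compile would produce from plus1's cspec behaves like the ordinary C function below (a hand-written sketch of the generated code's behavior, not actual tcc output):

```c
#include <assert.h>

/* Hand-written equivalent of the code built by plus1: one int parameter
   (the one created by param(int, 0)), returning that argument plus one. */
static int plus1_compiled(int i) {
    return i + 1;
}
```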

2.2.3 Discussion. Within a quoted expression, vspecs and cspecs can be passed to functions that expect their evaluation types. The following code is legal:

  int f(int j);
  void main() {
    int vspec v;
    /* ... initialize v to a dynamic lvalue using local or param ... */
    void cspec c1 = `{ f(v); };
  }

Within the quoted expression, f expects an integer. Since this function call is evaluated during the execution of the dynamic code, the integer lvalue to which v refers will already have been created. As a result, v can be passed to f like any other integer.

2.3 Unquoting Operators

The ` operator allows a programmer to create code at run time. In this section we describe two operators, @ and $, that are used within backquote expressions. @ is used to compose cspecs dynamically. $ is used to instantiate values as run-time constants in dynamic code. These two operators "unquote" their operands: their operands are evaluated at environment binding time.

2.3.1 The @ Operator. The @ operator allows code specifications to be composed into larger specifications. @ can only be applied inside a backquote expression, and its operand must be a cspec or vspec. @ "dereferences" its operand at environment binding time: it returns an object whose type is the evaluation type of @'s operand. The returned object is incorporated into the cspec in which the @ occurs. For example, in the following fragment, c is the additive composition of two cspecs:

  /* Compose c1 and c2. Evaluation of c yields "9". */
  int cspec c1 = `4, cspec c2 = `5;
  int cspec c = `(@c1 + @c2);

Statements can be composed through concatenation:

  /* Concatenate two null statements. */
  void cspec s1 = `{}, cspec s2 = `{};
  void cspec s = `{ @s1; @s2; };
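The semantics of @ can be approximated in plain C by treating each cspec as a thunk and splicing a call where the @ occurs. This is a sketch of the composition semantics only, not of tcc's implementation:

```c
#include <assert.h>

typedef int (*int_thunk)(void);

static int four(void) { return 4; }   /* plays the role of c1 = `4 */
static int five(void) { return 5; }   /* plays the role of c2 = `5 */

/* Analogue of c = `(@c1 + @c2): evaluating the composed code
   evaluates both spliced-in pieces and adds their results. */
static int composed(void) {
    int_thunk c1 = four, c2 = five;
    return c1() + c2();
}
```

As with the `C fragment above, evaluating the composition yields 9.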

Applying @ inside a backquote expression to a function that returns a cspec or a vspec causes the function to be called at environment binding time. Its result is incorporated into the backquote expression.

In order to improve the readability of code composition, `C provides some implicit coercions of vspecs and cspecs, so that the @ operator may be omitted in several situations. An expression of type vspec or cspec that appears inside a quoted expression is coerced (with an implicit @) to an object of its corresponding evaluation type under the following conditions:

(1) the expression is not inside an unquoted expression.

(2) the expression is not being used as a statement.

The first restriction also includes implicitly unquoted expressions: that is, expressions that occur within an implicitly coerced expression are not implicitly coerced. For example, the arguments in a call to a function returning type cspec or vspec are not coerced, because the function call itself already is.

These coercions do not limit the expressiveness of `C, because `C supports only one "level" of dynamic code: it does not support dynamic code that generates more dynamic code. Therefore, the ability to manipulate vspecs and cspecs in dynamic code is not useful.

These implicit coercions simplify the syntax of cspec composition. Consider the following example:

  int cspec a = `4; int cspec b = `5;
  int cspec sum = `(a + b);
  int cspec sum_of_sum = `(sum + sum);

This code is equivalent to the following code, due to the implicit coercion of a, b, and sum.

  int cspec a = `4; int cspec b = `5;
  int cspec sum = `(@a + @b);
  int cspec sum_of_sum = `(@sum + @sum);

Compiling sum_of_sum results in dynamic code equivalent to 4+5+4+5.

Statements and compound statements are considered to have type void; an object of type void cspec inside a backquote expression cannot be used inside an expression, but can be composed as an expression statement:

  void cspec hello = `{ printf("hello "); };
  void cspec world = `{ printf("world\n"); };
  void cspec greeting = `{ @hello; @world; };

  void cspec mkscale(int **m, int n, int s) {
    return `{
      int i, j;
      for (i = 0; i < $n; i++) {      /* Loop can be dynamically unrolled */
        int *v = ($m)[i];
        for (j = 0; j < $n; j++)
          v[j] = v[j] * $s;           /* Multiplication can be strength-reduced */
      }
    };
  }

  Fig. 1. `C code to specialize multiplication of a matrix by an integer.

2.3.2 The $ Operator. The $ operator allows run-time values to be incorporated as run-time constants in dynamic code. $ evaluates its operand at environment binding time; the resulting value is used as a run-time constant in the containing cspec. $ may only appear inside a backquote expression, and it may not be unquoted. It may be applied to any object not of type cspec or vspec. The use of $ is illustrated in the code fragment below.

  int x = 1;
  void cspec c = `{ printf("$x = %d, x = %d\n", $x, x); };
  x = 14;
  compile(c, void)();  /* Compile and run: will print "$x = 1, x = 14". */
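The printed output can be mimicked in plain C by separating the two binding times explicitly: $x copies the value of x when the specification is evaluated, while a plain x is read when the generated code finally runs. The struct and helper functions below are hypothetical stand-ins for tcc's environment-capture machinery:

```c
#include <assert.h>

static int x = 1;

/* Snapshot made at environment binding time, as $x requires. */
struct env { int dollar_x; };

static struct env bind_env(void) {   /* like evaluating the cspec */
    struct env e = { x };
    return e;
}

static void set_x(int v) { x = v; }

static int read_dollar_x(struct env e) { return e.dollar_x; }  /* like $x */
static int read_x(void) { return x; }                          /* like x  */
```

Binding the environment while x is 1, then setting x to 14 before "running" the code, reproduces the "$x = 1, x = 14" behavior shown above.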

Use of $ enables specialization of code based on run-time constants. An example of this is the program in Figure 1, which specializes multiplication of a matrix by an integer. The pointer to the matrix, the size of the matrix, and the scale factor are all run-time constants, which enables optimizations such as dynamic loop unrolling and strength reduction of multiplication.
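For instance, with n = 2 and s = 10 bound at environment binding time, the dynamic code produced from Figure 1 behaves like the hand-specialized routine below (a sketch of the residual code; tcc may additionally unroll the loops and strength-reduce the multiply):

```c
#include <assert.h>

/* Hand-written equivalent of mkscale's output for $n == 2, $s == 10:
   the loop bounds and the scale factor are hardwired run-time constants. */
static void scale_2x2_by_10(int **m) {
    int i, j;
    for (i = 0; i < 2; i++) {
        int *v = m[i];
        for (j = 0; j < 2; j++)
            v[j] = v[j] * 10;
    }
}
```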

2.3.3 Discussion. Within an unquoted expression, vspecs and cspecs cannot be passed to functions that expect their evaluation types. The following code is illegal:

  void cspec f(int j);
  int g(int j);
  void main() {
    int vspec v;
    void cspec c1 = `{ @f(v); };  /* error: v is the wrong type */
    void cspec c2 = `{ $g(v); };  /* error: v is the wrong type */
  }

The storage class of a variable declared within the scope of dynamic code is determined dynamically. Therefore, a variable of type T that is local to a backquote expression must be treated as type T vspec when used in an unquoted expression. The example below illustrates this behavior.

  void cspec f(int vspec j);  /* f must take an int vspec... */
  void main() {
    void cspec c = `{ int v; @f(v); };  /* ...because v is a dynamic local */
  }

  Category                     Name       Synopsis
  Management of dynamic code   compile    T *compile(void cspec code, T)
                               free_code  void free_code(T)
  Dynamic variables            local      T vspec local(T)
  Dynamic function arguments   param      T vspec param(T, int param-num)
                               push_init  void cspec push_init(void)
                               push       void push(void cspec args, T cspec next-arg)
  Dynamic control flow         label      void cspec label()
                               jump       void cspec jump(void cspec target)
                               self       T self(T, other-args...)

  Table I. The `C special forms. T denotes a type.

2.4 Special Forms

`C extends ANSI C with several special forms. Most of these special forms take types as arguments, and their result types sometimes depend on their input types. The special forms can be broken into four categories, as shown in Table I.

2.4.1 Management of Dynamic Code. The compile and free_code special forms are used to create executable code from code specifications and to deallocate the storage associated with a dynamic function, respectively. compile generates executable code from code, and returns a pointer to a function returning type T. It also automatically reclaims the storage for all existing vspecs and cspecs: as described in Section 3, vspecs and cspecs are objects that track necessary pieces of program state at environment binding and dynamic code generation time, so they are no longer needed and can be reclaimed after the dynamic code has been generated. Some other dynamic compilation systems [Leone and Lee 1996; Consel and Noel 1996] memoize dynamic code fragments: in the case of `C, the low overhead required to create code specifications, their generally small size, and their susceptibility to changes in the run-time environment make memoization of cspecs and vspecs unattractive.

free_code takes as argument a function pointer to a dynamic function previously created by compile, and reclaims the memory for that function. In this way, it is possible to reclaim the memory consumed by dynamic code when the code is no longer necessary. The programmer can use compile and free_code to explicitly manage the memory used for dynamic code, similarly to how one normally uses malloc and free.
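The pairing behaves like an allocator for code memory. The counter-based sketch below is hypothetical (it models only the ownership discipline, not code generation itself):

```c
#include <assert.h>
#include <stdlib.h>

static int live_functions = 0;  /* dynamic functions not yet freed */

/* Stand-in for compile(): allocates space for a generated function. */
static void *fake_compile(size_t code_bytes) {
    live_functions++;
    return malloc(code_bytes);
}

/* Stand-in for free_code(): releases that space. */
static void fake_free_code(void *code) {
    live_functions--;
    free(code);
}
```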

  /* Construct cspec to sum n integer arguments. */
  void cspec construct_sum(int n) {
    int i, cspec c = `0;
    for (i = 0; i < n; i++) {
      int vspec v = param(int, i);  /* Create a parameter */
      c = `(c + v);                 /* Add param 'v' to current sum */
    }
    return `{ return c; };
  }

  int cspec construct_call(int nargs, int *arg_vec) {
    int (*sum)() = compile(construct_sum(5), int);
    void cspec args = push_init();  /* Initialize argument list */
    int i;
    for (i = 0; i < nargs; i++)     /* For each arg in arg_vec... */
      push(args, `$arg_vec[i]);     /* push it onto the args stack */
    return `sum(args);
  }

  Fig. 2. `C allows programmers to construct functions with dynamic numbers of arguments. construct_sum creates a function that takes n arguments and adds them. construct_call creates a cspec that invokes a dynamic function: it initializes an argument stack by invoking push_init, and dynamically adds arguments to this list by calling push. `C allows an argument list (an object of type void cspec) to be used as a single argument in a call: `sum(args) calls sum using the argument list denoted by args.
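Compiling construct_sum(5) yields code equivalent to a fixed five-argument adder; the sketch below shows that residual function together with the argument-vector view of the args list built by push_init and push (names are hypothetical):

```c
#include <assert.h>

/* Residual code equivalent to compile(construct_sum(5), int). */
static int sum5(int a0, int a1, int a2, int a3, int a4) {
    return a0 + a1 + a2 + a3 + a4;
}

/* The args list acts like an argument vector spread into the call. */
static int call_sum5(int *arg_vec) {
    return sum5(arg_vec[0], arg_vec[1], arg_vec[2], arg_vec[3], arg_vec[4]);
}
```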

2.4.2 Dynamic Variables. The local special form is a mechanism for creating local variables in dynamic code. The objects it creates are analogous to local variables declared in the body of a backquote expression, but they can be used across backquote expressions, rather than being restricted to the scope of one expression. In addition, local enables dynamic code to have an arbitrary number of local variables. local returns an object of type T vspec that denotes a dynamic local variable of type T in the current dynamic function. In `C, the type T may include one of two C storage class specifiers, auto and register: the former indicates that the variable should be allocated on the stack, while the latter is a hint to the compiler that the variable should be placed in a register, if possible.

2.4.3 Dynamic Function Arguments. The param special form is used to create parameters of dynamic functions. param returns an object of type T vspec that denotes a formal parameter of the current dynamic function. param-num is the parameter's position in the function's parameter list, whereas T denotes its evaluation type. As illustrated in Figure 2, param can be used to create a function that has the number of its parameters determined at run time. Figure 3 shows how param can be used to curry functions.

Whereas param serves to create the formal parameters of a dynamic function, push_init and push are used together to dynamically build argument lists for function calls. push_init returns a cspec that corresponds to a new, initially empty dynamic argument list. push adds the code specification for the next argument, next-arg, to the dynamically generated list of arguments, args. T, the evaluation type of next-arg, may not be void. These two special forms allow the programmer to create function calls which pass a dynamically determined number of arguments to the invoked function. Figure 2 illustrates their use.

  typedef int (*write_ptr)(char *, int);

  /* Create a function that calls "write" with "tcb" hardwired as its first argument. */
  write_ptr mkwrite(struct tcb *tcb) {
    char * vspec msg = param(char *, 0);
    int vspec nbytes = param(int, 1);
    return compile(`{ return write($tcb, msg, nbytes); }, int);
  }

  Fig. 3. `C can be used to curry functions by creating function parameters dynamically. In this example, this functionality allows a network connection control block to be hidden from clients, but still enables operations on the connection (write, in this case) to be parameterized with per-connection data.
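Without dynamic code generation, the same currying must be done by carrying the hardwired argument in a closure record and re-reading it on every call; `C instead burns $tcb directly into the generated instructions. A plain-C sketch of the closure alternative (struct tcb's layout and fake_write are hypothetical stand-ins):

```c
#include <assert.h>

struct tcb { int fd; };  /* hypothetical connection control block */

/* Hypothetical stand-in for write(): returns fd + nbytes so a test
   can observe which control block was used. */
static int fake_write(int fd, const char *msg, int nbytes) {
    (void)msg;
    return fd + nbytes;
}

/* Closure-based currying: every call pays to re-read the captured tcb. */
static int curried_write(struct tcb *t, const char *msg, int nbytes) {
    return fake_write(t->fd, msg, nbytes);
}
```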

2.4.4 Dynamic Control Flow. For error-checking purposes, `C forbids goto statements from transferring control outside the enclosing backquote expression. Two special forms, label and jump, are used for inter-cspec control flow: jump returns the cspec of a jump to its argument, target. Target may be any object of type void cspec. label simply returns a void cspec that may be used as the destination of a jump. Syntactic sugar allows jump(target) to be written as jump target. Section 4.1.5 presents example `C code that uses label and jump to implement specialized finite-state machines.
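A hardwired automaton of the kind meant here can be sketched in plain C with one code label per state; label and jump let `C programmers wire up exactly this control structure across separately built cspecs. The recognizer below is a hypothetical example, not taken from Section 4.1.5:

```c
#include <assert.h>

/* One-state machine over a '0'/'1' string: count the '1' characters,
   stopping at the terminator. States become labels; jumps become gotos. */
static int count_ones(const char *s) {
    int n = 0;
state: switch (*s++) {
        case '1': n++; goto state;
        case '0': goto state;
        default:  return n;  /* '\0' or any other byte halts the machine */
    }
}
```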

Lastly, self allows recursive calls in dynamic code without incurring the overhead of dereferencing a function pointer. T denotes the return type of the function that is being dynamically generated. Invoking self results in a call to the function that contains the invocation, with other-args passed as the arguments. self is just like any other function call, except that the return type of the dynamic function is unknown at environment binding time, so it must be provided as the first argument.
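The saving can be seen in plain C: a recursive call through a function pointer costs an indirect jump on every level, while self lets the generated code call itself directly, as a statically bound call would. Factorial is used here purely as a hypothetical illustration:

```c
#include <assert.h>

/* Direct recursion, as self produces in the generated code:
   "fact(n - 1)" plays the role of "self(int, n - 1)". */
static int fact(int n) {
    return n <= 1 ? 1 : n * fact(n - 1);
}

/* The alternative self avoids: recursion through a function pointer. */
static int (*fact_ptr)(int);
static int fact_indirect(int n) {
    return n <= 1 ? 1 : n * fact_ptr(n - 1);
}
```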

3. THE tcc COMPILER

The implementation of tcc was driven by two goals: high-quality dynamic code, and low dynamic compilation overhead. `C allows the user to compose arbitrary pieces of code dynamically, which reduces the effectiveness of static analysis. As a result, many optimizations on dynamic code in `C can only be performed at run time: improvements in code quality require more dynamic code generation time. The rest of this section discusses how tcc handles this tradeoff. Section 3.1 describes the structure of tcc, Section 3.2 gives an overview of the dynamic compilation process, and Section 3.3 discusses in detail some of the machinery that tcc uses to generate code at run time.

3.1 Architecture

The tcc compiler is based on lcc [Fraser and Hanson 1995; 1990], a portable compiler for ANSI C. lcc performs common subexpression elimination within extended basic blocks, and uses lburg [Fraser et al. 1992] to find the lowest-cost implementation of a given IR-level construct. Otherwise, it performs few optimizations.

Figure 4 illustrates the interaction of static and dynamic compilation in tcc. All parsing and semantic checking of dynamic expressions occurs at static compile time.

[Figure 4: a dataflow diagram. At static compile time, the tcc front end splits `C source into static code and code specifications; the static and dynamic back ends then produce executable static code, code to bind environments, and code to create dynamic code. At run time, evaluating code specifications builds closures for them (environment binding), and invoking compile() produces executable dynamic code, which computes the answer.]

Fig. 4. Overview of the tcc compilation process.

Semantic checks are performed at the level of dynamically generated expressions. For each cspec, tcc performs internal type checking. It also tracks goto statements and labels to ensure that a goto does not transfer control outside the body of the containing cspec.

Unlike traditional static compilers, tcc uses two types of back ends to generate code. One is the static back end, which compiles the non-dynamic parts of `C programs, and emits either native assembly code or C code suitable for compilation by an optimizing compiler. The other, referred to as the dynamic back end, emits C code to generate dynamic code. Once produced by the dynamic back end, this C code is in turn compiled by the static back end.

tcc provides two dynamic code generation runtime systems so as to trade off code generation speed for dynamic code quality. The first of these runtime systems is vcode [Engler 1996]. vcode provides an interface resembling that of an idealized load/store RISC architecture; each instruction in this interface is a C macro which emits the corresponding instruction or series of instructions for the target architecture. vcode's key feature is that it generates code with low run-time overhead: as few as ten instructions per generated instruction in the best case. While vcode generates code quickly, it only has access to local information about backquote expressions: the quality of its code could often be improved. The second runtime system, icode, makes a different tradeoff, and produces better code at the expense of additional dynamic compilation overhead. Rather than emit code in one pass, it builds and optimizes an intermediate representation prior to code generation.

lcc is not an optimizing compiler. The assembly code emitted by its traditional static back ends is usually significantly slower (even three or more times slower) than that emitted by optimizing compilers such as gcc or vendor C compilers. To improve the quality of static code emitted by tcc, we have implemented a static back end that generates ANSI C from `C source; this code can then be compiled by any optimizing compiler. lcc's traditional back ends can thus be used when static compilation must be fast (i.e., during development), and the C back end can be used when the performance of the code is critical.

3.2 The Dynamic Compilation Process

As described in Section 2, the creation of dynamic code can be divided into three phases: static compilation, environment binding, and dynamic code generation. This section describes how tcc implements these three phases.

3.2.1 Static Compile Time. During static compilation, tcc compiles the static parts of a `C program just like a traditional C compiler. It compiles each dynamic part (each backquote expression) to a code-generating function (CGF), which is invoked at run time to generate code for dynamic expressions.

In order to minimize the overhead of dynamic compilation, tcc performs as much work as possible statically. When using vcode, both instruction selection (based on operand types) and cspec-local register allocation are done statically. Additionally, the intermediate representation of each backquote expression is processed by the common subexpression elimination and other local optimizations performed by the lcc front end. tcc also uses copt [Fraser 1980] to perform static peephole optimizations on the code-generating macros used by CGFs.

Not all register allocation and instruction selection can occur statically when using vcode. For instance, it is not possible to determine statically what vspecs or cspecs will be incorporated into other cspecs when the program is executed. Hence, allocation of dynamic lvalues (vspecs) and of results of composed cspecs must be performed dynamically. The same is true of variables or temporaries that live across references to other cspecs. Each read or write to one of these dynamically determined lvalues is enclosed in a conditional in the CGF: different code is emitted at run time, depending on whether the object is dynamically allocated to a register or to memory. Since the process of instruction selection is encoded in the body of the code-generating function, it is inexpensive.

When using icode, tcc does not precompute as much information about dynamic code generation. Rather than emitting code directly, the icode macros first build up a simple intermediate representation; the icode runtime system then analyzes this representation to allocate registers and perform other optimizations before emitting code.

State for dynamic code generation is maintained in CGFs and in dynamically allocated closures. Closures are data structures that store five kinds of necessary information about the run-time environment of a backquote expression: (1) a function pointer to the corresponding statically generated CGF; (2) information about inter-cspec control flow (i.e., whether the backquote expression is the destination of a jump); (3) the values of run-time constants bound via the $ operator; (4) the addresses of free variables; (5) pointers to the run-time representations of the cspecs and vspecs used inside the backquote expression. Closures are necessary to reason about composition and out-of-order specification of dynamic code.

cspec_t i = (closure0 = (closure0_t *)alloc_closure(4),
             closure0->cgf = cgf0,            /* code gen func */
             (cspec_t)closure0);

cspec_t c = (closure1 = (closure1_t *)alloc_closure(16),
             closure1->cgf = cgf1,            /* code gen func */
             closure1->cs_i = i,              /* nested cspec */
             closure1->rc_j = j,              /* runtime const */
             closure1->fv_k = &k,             /* free variable */
             (cspec_t)closure1);

Fig. 5. Sample closure assignments.

For each backquote expression, tcc statically generates both its code-generating function and the code to allocate and initialize closures. A new closure is initialized each time a backquote expression is evaluated. Cspecs are represented by pointers to closures.

For example, consider the following code:

int j, k;
int cspec i = `5;
void cspec c = `{ return i + $j * k; };

tcc implements the assignments to these cspecs by assignments to pointers to closures, as illustrated in Figure 5. i's closure contains only a pointer to its code-generating function. c has more dependencies on its environment, so its closure also stores other information.

Simplified code-generating functions for these cspecs appear in Figure 6. cgf0 allocates a temporary storage location, generates code to store the value 5 into it, and returns the location. cgf1 must do a little more work: the code that it generates loads the value stored at the address of free variable k into a register, multiplies it by the value of the run-time constant j, adds this to the dynamic value of i, and returns the result. Since i is a cspec, the code for "the dynamic value of i" is generated by calling i's code-generating function.

3.2.2 Run Time. At run time, the code that initializes closures and the code-generating functions run to create dynamic code. As illustrated in Figure 4, this process consists of two parts: environment binding and dynamic code generation.

3.2.2.1 Environment Binding. During environment binding, code such as that in Figure 5 builds a closure that captures the environment of the corresponding backquote expression. Closures are heap-allocated, but their allocation cost is greatly reduced (down to a pointer increment, in the normal case) by using arenas [Forsythe 1977].


unsigned int cgf0(closure0_t *c) {
    vspec_t itmp0 = tc_local(INT);        /* int temporary */
    seti(itmp0, 5);                       /* set it to 5 */
    return itmp0;                         /* return the location */
}

void cgf1(closure1_t *c) {
    vspec_t itmp0 = tc_local(INT);        /* some temporaries */
    vspec_t itmp1 = tc_local(INT);
    ldii(itmp1, zero, c->fv_k);           /* addr of k */
    mulii(itmp1, itmp1, c->rc_j);         /* runtime const j */
    /* now apply i's CGF to i's closure: cspec composition! */
    itmp0 = c->cs_i->cgf(c->cs_i);
    addi(itmp1, itmp0, itmp1);
    reti(itmp1);                          /* emit a return (not return a value) */
}

Fig. 6. Sample code generating functions.

3.2.2.2 Dynamic Code Generation. During dynamic code generation, the `C runtime processes the code-generating functions. The CGFs use the information in the closures to generate code, and they perform various dynamic optimizations.

Dynamic code generation begins when the compile special form is invoked on a cspec. compile calls the code-generating function for the cspec on the cspec's closure, and the CGF performs most of the actual code generation. In terms of our running example, the code int (*f)() = compile(c, int); causes the run-time system to invoke closure1->cgf(closure1).

When the CGF returns, compile links the resulting code, resets the information regarding dynamically generated locals and parameters, and returns a pointer to the generated code. We attempt to minimize poor cache behavior by laying out the code in memory at a random offset modulo the i-cache size. It would be possible to track the placement of different dynamic functions to improve cache performance, but we do not do so currently.

Cspec composition (the inlining of code corresponding to one cspec, b, into that corresponding to another cspec, a, as described in Section 2.3.1) occurs during dynamic code generation. This composition is implemented simply by invoking b's CGF from within a's CGF. If b returns a value, the value's location is returned by its CGF, and can then be used by operations within a's CGF.

The special forms for inter-cspec control flow, jump and label, are implemented efficiently. Each closure, including that of the empty void cspec `{}, contains a field that marks whether the corresponding cspec is the destination of a jump. The code-generating function checks this field, and if necessary, invokes a vcode or icode macro to generate a label, which is eventually resolved when the runtime system links the code. As a result, label can simply return an empty cspec. jump marks the closure of the destination cspec appropriately, and then returns a closure that contains a pointer to the destination cspec and to a CGF that contains an icode or vcode unconditional branch macro.

Generating efficient code from composed cspecs requires optimization analogous to function inlining and inter-procedural optimization. Performing some optimizations on the dynamic code after the order of composition of cspecs has been determined can significantly improve code quality. tcc's icode runtime system builds up an intermediate representation and performs some analyses before it generates executable code. The vcode runtime system, by contrast, optimizes for code generation speed: it generates code in just one pass, but can make poor spill decisions when there is register pressure.

Some dynamic optimizations performed by tcc do not depend on the runtime system employed, but are encoded directly in the code-generating functions. These optimizations do not require global analysis or other expensive computation, and they can considerably improve the quality of dynamic code.

First, tcc does constant folding on run-time constants. The code-generating functions contain code to evaluate any parts of an expression that consist of static and run-time constants. The dynamically emitted instructions can then encode these values as immediates. Similarly, tcc performs simple local strength reduction based on run-time knowledge. For example, the code-generating functions can replace multiplication by a run-time constant integer with a series of shifts and adds, as described in [Briggs and Harvey 1994].
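The shift-and-add decomposition is easy to see in miniature. The sketch below is only an illustration of the idea: tcc's CGFs emit the shift and add instructions into the dynamic code, whereas this function simply performs them, one shifted add per set bit of the multiplier.

```c
#include <assert.h>

/* Multiply x by the run-time constant c using only shifts and adds:
   one shifted add per set bit of c. A CGF would emit these shift and
   add instructions; here we just execute them to show the decomposition. */
unsigned mul_by_const(unsigned x, unsigned c) {
    unsigned result = 0;
    int shift = 0;
    while (c) {
        if (c & 1)
            result += x << shift;   /* add of a shifted operand */
        c >>= 1;
        shift++;
    }
    return result;
}
```

For a constant like 10 (binary 1010), this yields two shifted adds in place of a multiply, which is the payoff the CGFs capture in the emitted instruction stream.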

In addition, the code-generating functions automatically perform some dynamic loop unrolling and dead code elimination based on run-time constants. If the test of a loop or conditional is invariant at run time, or if a loop is bounded by run-time constants, then control flow can be determined at dynamic code generation time. In addition, run-time constant information propagates down loop nesting levels: for example, if a loop induction variable is bounded by run-time constants, and it is in turn used to bound a nested loop, then the induction variable of the nested loop is considered run-time constant too, within each unrolled iteration of the nested loop.
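The unrolling decision can be sketched with text instead of machine code. This illustrative generator (an assumption for exposition; tcc emits instructions, not source) shows how a loop whose bounds are known at generation time disappears from the generated code, leaving only straight-line statements:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative sketch: when lo and hi are run-time constants, the loop
   runs at generation time, and the generated "code" (here, source text)
   contains only the unrolled straight-line statements. */
int emit_unrolled(char *buf, size_t size, int lo, int hi) {
    int k, off = 0;
    buf[0] = '\0';
    for (k = lo; k < hi; k++)   /* control flow resolved at generation time */
        off += snprintf(buf + off, size - off, "sum += a[%d]*b[%d];\n", k, k);
    return off;
}
```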

This style of optimization, which is hard-coded at static compile time and performed dynamically, produces good code without high dynamic compilation overhead. The code transformations are encoded in the CGF, and do not depend on run-time data structures. Furthermore, dynamic code that becomes unreachable at run time does not need to be generated, which can lead to faster code generation.

3.3 Runtime Systems

tcc provides two runtime systems for generating code. vcode emits code locally, with no global analysis or optimization. icode builds up an intermediate representation in order to support more optimizations: in particular, better register allocation.

These two runtime systems allow programmers to choose the appropriate level of run-time optimization. The choice is application-specific: it depends on the number of times the code will be used and on the code's size and structure. Programmers can select which runtime system to use when they compile a `C program.

3.3.1 vcode. When code generation speed is more important, the user can have tcc generate CGFs that use vcode macros, which emit code in one pass. Register allocation with vcode is fast and simple. vcode provides getreg and putreg operations: the former allocates a machine register, the latter frees it. If there are no unallocated registers when getreg is invoked, it returns a spilled location designated by a negative number; vcode macros recognize this number as a stack offset, and emit the necessary loads and stores. Clients that find these per-instruction if statements too expensive can disable them: getreg is then guaranteed to return only physical register names and, if it cannot satisfy a request, it terminates the program with a run-time error. This methodology is quite workable in situations where register usage is not data-dependent, and the improvement in code generation speed (roughly a factor of two) can make it worthwhile.
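A minimal sketch of such an interface follows. The names getreg and putreg come from the text above; the bitmask bookkeeping and the particular spill-slot numbering are assumptions made for illustration, not vcode's implementation.

```c
#include <assert.h>

/* Sketch of vcode-style register management. getreg hands out one of
   NREGS physical registers; when none are free it returns a negative
   number, which the emitting macros treat as a stack offset and wrap
   with the necessary loads and stores. */
#define NREGS 8

static int freemask = (1 << NREGS) - 1;   /* bit r set => register r free */
static int nextspill = 1;

int getreg(void) {
    int r;
    for (r = 0; r < NREGS; r++)
        if (freemask & (1 << r)) {
            freemask &= ~(1 << r);
            return r;                     /* physical register */
        }
    return -nextspill++;                  /* spilled location */
}

void putreg(int r) {
    if (r >= 0)                           /* spill slots are not recycled here */
        freemask |= (1 << r);
}
```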

tcc statically emits getreg and putreg operations together with other vcode macros in the code-generating functions: this ensures that the register assignments of one cspec do not conflict with those of another cspec dynamically composed with it. However, efficient inter-cspec register allocation is hard, and the placement of these register management operations can greatly affect code quality. For example, if a register is reserved (getreg'd) across a cspec composition point, it becomes unavailable for allocation in the nested cspec and in all cspecs nested within it. As a result, vcode could run out of registers and resort to spills after only a few levels of cspec nesting. To help improve code quality, tcc follows some simple heuristics. First, expression trees are rearranged so that cspec operands of instructions are evaluated before non-cspec operands. This minimizes the number of temporaries that span cspec references, and hence the number of registers allocated by the CGF of one cspec during the execution of the code-generating function of a nested cspec. Secondly, no registers are allocated for the return value of non-void cspecs: the code-generating function for a cspec allocates the register for storing its result, and simply returns this register name to the CGF of the enclosing cspec.

To further reduce the overhead of vcode register allocation, tcc reserves a limited number of physical registers. These registers are not allocated by getreg, but instead are managed at static compile time by tcc's dynamic back end. They can only be used for values whose live ranges do not span composition with a cspec, and are typically employed for expression temporaries.

As a result of these optimizations, vcode register allocation is quite fast. However, if the dynamic code contains large blocks with high register pressure, or if cspecs are dynamically combined in a way that forces many spills, code quality suffers.

3.3.2 icode. When code quality is more important, the user can have tcc generate CGFs that use icode macros, which generate an intermediate representation on which optimizations can be performed. For example, icode can perform global register allocation on dynamic code more effectively than vcode in the presence of cspec composition.

icode provides an interface similar to that of vcode, with two main extensions: (1) an infinite number of registers, and (2) primitives to express changes in estimated usage frequency of code. The first extension allows icode clients to emit code that assumes no spills, leaving the work of global, inter-cspec register allocation to icode. The second allows icode to obtain estimates of code execution frequency at low cost. For instance, prior to invoking icode macros that correspond to a loop body, the icode client could invoke refmul(10): this tells icode that all variable references occurring in the subsequent macros should be weighted as occurring 10 times (an estimated average number of loop iterations) more than the surrounding code. After emitting the loop body, the icode client should invoke a corresponding refdiv(10) macro to correctly weight code outside of the loop. The estimates obtained in this way are useful for several optimizations; they currently provide approximate variable usage counts that help to guide register allocation.

icode's intermediate representation is designed to be compact (two 4-byte machine words per icode instruction) and easy to parse, in order to reduce the overhead of subsequent passes. When compile is invoked in icode mode, icode builds a flow graph, performs register allocation, and finally generates executable code. We have attempted to minimize the cost of each of these operations. We briefly discuss each of them in turn.

3.3.2.1 Flow Graph Construction. icode builds a control-flow graph in one pass after all CGFs have been invoked. The flow graph is a single array that uses pointers for indexing. In order to allocate all required memory in a single allocation, icode computes an upper bound on the number of basic blocks by summing the numbers of labels and jumps emitted by icode macros. After allocating space for an array of this size, it traverses the buffer of icode instructions and adds basic blocks to the array in the same order in which they exist in the list of instructions. Forward references are initially stored in an array of pairs of basic block addresses; when all the basic blocks are built, the forward references are resolved by traversing this array and linking the pairs of blocks listed in it. As it builds the flow graph, icode also collects a minimal amount of local data-flow information (def and use sets for each basic block). All memory management occurs through arenas [Forsythe 1977], which ensures low amortized cost for memory allocation and essentially free deallocation.
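A minimal arena sketch (an assumed interface, not tcc's actual allocator) shows why arena allocation reduces to a pointer increment and why deallocation is essentially free:

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal arena allocator: objects are carved out of one large block by
   bumping a pointer, and the whole arena is freed at once. This sketch
   does not grow the arena when it fills. */
typedef struct arena {
    char *base, *next, *limit;
} arena;

arena *arena_new(size_t size) {
    arena *a = malloc(sizeof *a);
    a->base = a->next = malloc(size);
    a->limit = a->base + size;
    return a;
}

void *arena_alloc(arena *a, size_t n) {
    void *p;
    n = (n + 7) & ~(size_t)7;        /* keep 8-byte alignment */
    if (a->next + n > a->limit)
        return NULL;                 /* a real arena would chain a new block */
    p = a->next;
    a->next += n;                    /* the entire allocation cost */
    return p;
}

void arena_free(arena *a) {          /* essentially free deallocation */
    free(a->base);
    free(a);
}
```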

3.3.2.2 Register Allocation. Good register allocation is the main benefit that icode provides over vcode. icode currently implements several different register allocation algorithms, which differ in overhead and in the quality of the code that they produce: graph coloring, linear scan, and a simple scheme based on estimated usage counts.

The graph coloring allocator implements a simplified version of Chaitin's algorithm [Chaitin et al. 1981]: it does not do coalescing, but employs estimates of usage counts to guide spilling. The live variable information used by this allocator is obtained by an iterative data-flow pass over the icode flow graph. Both the liveness analysis and the register allocation pass were carefully implemented for speed, but their actual performance is inherently limited because the algorithms were developed for static compilers, and prioritize code quality over compilation speed. The graph coloring allocator therefore serves as a benchmark: it produces the best code, but is relatively slow.

At the opposite end of the spectrum is icode's simple "usage count" allocator: it makes no attempt to produce particularly good code, but is fast. This allocator ignores liveness information altogether: it simply sorts all variables in order of decreasing estimated usage counts, allocates the n available registers to the n variables with the highest usage counts, and places all other variables on the stack. Most dynamic code created using `C is relatively small: as a result, despite its simplicity, this allocation algorithm often performs just as well as graph coloring.
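The whole scheme fits in a few lines. This sketch (with an assumed Var record; icode's data structures differ) captures it:

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of the usage-count allocator: ignore liveness entirely, sort
   variables by decreasing estimated usage count, and give the R available
   registers to the R most frequently used variables. reg == -1 => stack. */
typedef struct { int id, count, reg; } Var;

static int by_count_desc(const void *a, const void *b) {
    return ((const Var *)b)->count - ((const Var *)a)->count;
}

void usage_count_alloc(Var *v, int n, int R) {
    int i;
    qsort(v, n, sizeof *v, by_count_desc);
    for (i = 0; i < n; i++)
        v[i].reg = (i < R) ? i : -1;   /* top R get registers, rest go to stack */
}
```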

Lastly, icode implements linear scan register allocation [Poletto and Sarkar 1998]. This algorithm improves performance relative to graph coloring: it does not build and color a graph, but rather assigns registers to variables in one pass over a sorted list of live intervals. Given an ordering (for example, linear layout order, or depth-first order) of the instructions in a flow graph, a live interval of a variable v is the interval [m, n] such that v is not live prior to instruction m or after instruction n. Once the list of live intervals is computed, allocating R available registers so as to minimize the number of spilled intervals requires removing the smallest number of live intervals so that no more than R live intervals overlap. The algorithm skips forward through the sorted list of live intervals from start point to start point, keeping track of the set of overlapping intervals. When more than R intervals overlap, it heuristically spills the interval that ends furthest away, and moves on to the next start point. The algorithm appears in Figure 7, and is discussed in detail in [Poletto and Sarkar 1998].

LinearScanRegisterAllocation
    active <- {}
    foreach live interval i, in order of increasing start point
        ExpireOldIntervals(i)
        if length(active) = R then
            SpillAtInterval(i)
        else
            register[i] <- a register removed from pool of free registers
            add i to active, sorted by increasing end point

ExpireOldIntervals(i)
    foreach interval j in active, in order of increasing end point
        if endpoint[j] >= startpoint[i] then
            return
        remove j from active
        add register[j] to pool of free registers

SpillAtInterval(i)
    spill <- last interval in active
    if endpoint[spill] > endpoint[i] then
        register[i] <- register[spill]
        location[spill] <- new stack location
        remove spill from active
        add i to active, sorted by increasing end point
    else
        location[i] <- new stack location

Fig. 7. Linear scan register allocation.
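As a concrete (if simplified) rendering of Figure 7, the following C sketch allocates R registers over a list of intervals sorted by increasing start point. The Interval type and the fixed-size tables are illustrative assumptions, not icode's actual data structures.

```c
#include <assert.h>
#include <string.h>

/* Simplified C rendering of the linear scan algorithm of Figure 7.
   Intervals must be sorted by increasing start point; reg == -1 means
   "spilled to stack slot loc". */
#define MAXI 64

typedef struct { int start, end, reg, loc; } Interval;

static int active[MAXI], nactive;   /* live interval indices, by end point */
static int freeregs[MAXI], nfree;   /* pool of free registers */

static void expire_old_intervals(Interval *v, int i) {
    int j;
    for (j = 0; j < nactive; j++) {
        if (v[active[j]].end >= v[i].start)
            break;                               /* still overlaps i */
        freeregs[nfree++] = v[active[j]].reg;    /* register back to pool */
    }
    memmove(active, active + j, (nactive - j) * sizeof active[0]);
    nactive -= j;
}

static void add_active(Interval *v, int i) {     /* keep active sorted by end */
    int j = nactive++;
    while (j > 0 && v[active[j - 1]].end > v[i].end) {
        active[j] = active[j - 1];
        j--;
    }
    active[j] = i;
}

void linear_scan(Interval *v, int n, int R) {
    int i, nextloc = 0;
    nactive = nfree = 0;
    for (i = 0; i < R; i++)
        freeregs[nfree++] = i;
    for (i = 0; i < n; i++) {
        expire_old_intervals(v, i);
        if (nactive == R) {                      /* spill at interval */
            int s = active[nactive - 1];         /* active interval ending last */
            if (v[s].end > v[i].end) {           /* spill s, give i its register */
                v[i].reg = v[s].reg;
                v[s].reg = -1; v[s].loc = nextloc++;
                nactive--;
                add_active(v, i);
            } else {                             /* spill i itself */
                v[i].reg = -1; v[i].loc = nextloc++;
            }
        } else {
            v[i].reg = freeregs[--nfree];
            add_active(v, i);
        }
    }
}
```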

icode can obtain live interval information in two different ways. The first method is simply to compute live variable information by iterative analysis, as for graph coloring, and to then coarsen this information to one live interval per variable. This technique produces live intervals that are as accurate as possible, but is not fast. The second method is considerably faster, but produces slightly more conservative intervals. The algorithm finds and topologically sorts the strongly connected components (SCCs) of the flow graph. If a variable is defined or used in an SCC, it is assumed live throughout the whole SCC. The live interval of a variable therefore stretches from the topologically first SCC where it appears to the last. Like the linear scan algorithm, this technique is analyzed in [Poletto and Sarkar 1998].

These different algorithms allow icode to provide a variety of tradeoffs of compile-time overhead versus quality of code. Graph coloring is most expensive and usually produces the best code, linear scan is considerably faster but sometimes produces worse code, and the usage count allocator is faster than linear scan but can produce considerably worse code. However, given the relatively small size of most `C dynamic code, the algorithms perform similarly on the benchmarks presented in this paper. As a result, Section 5 presents measurements only for a representative case: the linear scan allocator using live intervals derived from full live variable information.

3.3.2.3 Code Generation. The final phase of code generation with icode is the translation of the intermediate representation into executable code. The code emitter makes one pass through the icode intermediate representation: it invokes the vcode macro that corresponds to each icode instruction, and prepends and appends spill code as necessary.

icode has several hundred instructions (the Cartesian product of operation kinds and operand types), so a code generator for the entire instruction set is quite large. Most `C programs, however, use only a small subset of all icode instructions. tcc therefore keeps track of the icode instructions used by an application. It encodes this usage information for a given `C source file in dummy symbol names in the corresponding object file. A pre-linking pass then scans all the files about to be linked and emits an additional object file containing an icode-to-binary translator tailored specifically to the icode macros present in the executable. This simple trick significantly reduces the size of the icode code generator; for example, for the benchmarks presented in this paper it usually shrank the code generators by a factor of 5 or 6.

3.3.2.4 Other Features. icode is designed to be a generic framework for dynamic code optimization: it is possible to extend it with additional optimization passes, such as copy propagation, common subexpression elimination, etc. However, preliminary measurements indicate that much dynamic optimization beyond register allocation is probably not practical: the increase in dynamic compile time is not justified by sufficient improvements in the speed of the resulting code.

4. APPLICATIONS

`C is valuable in a number of practical settings. The language can be employed to increase performance through the use of dynamic code generation, as well as to simplify the creation of programs that cannot easily be written in ANSI C. For example, `C can be used to build efficient searching and sorting routines, implement dynamic integrated layer processing for high-performance network subsystems, and create compiling interpreters and "just-in-time" compilers. This section presents several ways in which `C and dynamic code generation can help to solve practical problems. We have divided the examples into four broad categories: specialization, dynamic function call construction, dynamic inlining, and compilation. Many of the applications described below are also used for the performance evaluation in Section 5.


struct hte {                  /* Hash table entry structure */
    int val;                  /* Key that entry is associated with */
    struct hte *next;         /* Pointer to next entry */
    /* ... */
};

struct ht {                   /* Hash table structure */
    int scatter;              /* Value used to scatter keys */
    int norm;                 /* Value used to normalize */
    struct hte **hte;         /* Vector of pointers to hash table entries */
};

/* Hash returns a pointer to the hash table entry, if any, that matches val. */
struct hte *hash(struct ht *ht, int val) {
    struct hte *hte = ht->hte[(val * ht->scatter) / ht->norm];
    while (hte && hte->val != val) hte = hte->next;
    return hte;
}

Fig. 8. A hash function written in C.

4.1 Specialization

`C provides programmers with a general set of mechanisms to build code at run time. Dynamic code generation can be used to hard-wire run-time values into the instruction stream, which can enable code optimizations such as strength reduction and dead code elimination. In addition, `C enables more unusual and complicated operations, such as specializing a piece of code to a particular input (for example, a given array) or to some class of data structures (for example, all arrays with elements of a given length).

4.1.1 Hashing. A simple example of `C is the optimization of a generic hash function, where the table size is determined at run time, and where the function uses a run-time value to help its hash. Consider the C code in Figure 8. The C function has three values that can be treated as run-time constants: ht->hte, ht->scatter, and ht->norm. As illustrated in Figure 9, using `C to specialize the function for these values requires only a few changes. The resulting code can be considerably faster than the equivalent C version, because tcc hard-codes the run-time constants hte, scatter, and norm in the instruction stream, and reduces the multiplication and division operations in strength. The cost of using the resulting dynamic function is an indirect jump on a function pointer.

4.1.2 Vector Dot Product. Matrix and vector manipulations such as dot product provide many opportunities for dynamic code generation. They often involve a large number of operations on values which change relatively infrequently. Matrices may have run-time characteristics (i.e., large numbers of zeros and small integers) that can improve performance of matrix operations, but cannot be exploited by static compilation techniques. In addition, sparse matrix techniques are only efficient for matrices with a high degree of sparseness.

In the context of matrix multiplication, dynamic code generation can remove multiplications by zero, and strength-reduce multiplications by small integers. Because code for each row is created once and then used once for each column, the costs of code generation can be recovered easily. Consider the C code to compute dot product in Figure 10. At run time several optimizations can be employed. For example, the programmer can directly eliminate multiplication by zero. The corresponding `C code appears in Figure 11.

/* Type of the function generated by mkhash: takes a value as input
   and produces a (possibly null) pointer to a hash table entry */
typedef struct hte *(*hptr)(int val);

/* Construct a hash function with the size, scatter, and hash table pointer hard-coded. */
hptr mkhash(struct ht *ht) {
    int vspec val = param(int, 0);
    void cspec code = `{
        struct hte *hte = $ht->hte[(val * $ht->scatter) / $ht->norm];
        while (hte && hte->val != val) hte = hte->next;
        return hte;
    };
    return compile(code, struct hte *);   /* Compile and return the result */
}

Fig. 9. Specialized hash function written in `C.

int dot(int *a, int *b, int n) {
    int sum, k;
    for (sum = k = 0; k < n; k++) sum += a[k] * b[k];
    return sum;
}

Fig. 10. A dot-product routine written in C.

The dot product written in `C can perform substantially better than its static C counterpart. The `C code does not emit code for multiplications by zero. In addition, the `C compiler can encode values as immediates in arithmetic instructions, and can reduce multiplications by the run-time constant $row[k] in strength.
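The zero-skipping part of this specialization can be approximated in static C by precomputing the nonzero structure of row once. The sketch below is only an analogue of the idea: the `C version goes further, folding the surviving indices and values into the instruction stream itself.

```c
#include <assert.h>

/* Static analogue of Figure 11: record row's nonzero entries once, so
   every subsequent dot product with a column skips the zero terms. */
#define MAXN 128

typedef struct { int idx[MAXN], val[MAXN], nnz; } SparseRow;

void mkrow(const int *row, int n, SparseRow *s) {
    int k;
    s->nnz = 0;
    for (k = 0; k < n; k++)
        if (row[k]) {                  /* keep only nonzero terms */
            s->idx[s->nnz] = k;
            s->val[s->nnz] = row[k];
            s->nnz++;
        }
}

int dot_spec(const SparseRow *s, const int *col) {
    int i, sum = 0;
    for (i = 0; i < s->nnz; i++)
        sum += s->val[i] * col[s->idx[i]];
    return sum;
}
```

As with the dynamic version, the setup cost of mkrow is recovered when the same row is used against many columns.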

4.1.3 Binary Search. Figure 12 illustrates a recursive implementation of binary search. We present a recursive version here for clarity: the static code used for measurements in Section 5 is a more efficient, iterative version. Nonetheless, even the iterative implementation incurs some overhead due to looping and because it needs to reference into the input array.

When the input array will be searched several times, however, one can use `C to write code like that in Figure 13. The structure of this code is very similar to that of the recursive binary search. By adding a few dynamic code generation primitives to the original algorithm, we have created a function that returns a cspec for binary search that is tailored to a given input set. We can create a C function pointer from this cspec:

typedef int (*ip)(int key);

ip mksearch(int n, int *x) {

`C and tcc: A Language and Compiler for Dynamic Code Generation · 23

void cspec mkdot(int row[], int n) {
  int k;
  int *vspec col = param(int *, 0);   /* Input vector for dynamic function */
  int cspec sum = `0;                 /* Spec for sum of products; initially 0 */
  for (k = 0; k < n; k++)             /* Only generate code for nonzero multiplications */
    if (row[k])                       /* Specialize on index of col[k] and value of row[k] */
      sum = `(sum + col[$k] * $row[k]);
  return `{ return sum; };
}

Fig. 11. `C code to build a specialized dot-product routine.

int bin(int *x, int key, int l, int u, int r) {
  int p;
  if (l > u) return -1;
  p = u - r;
  if (x[p] == key) return p;
  else if (x[p] < key) return bin(x, key, p+1, u, r/2);
  else return bin(x, key, l, p-1, r/2);
}

Fig. 12. A tail-recursive implementation of binary search.

void cspec gen(int *x, int vspec key, int l, int u, int r) {
  int p;
  if (l > u) return `{ return -1; };
  p = u - r;
  return `{
    if ($x[p] == key) return $p;
    else if ($x[p] < key) @gen(x, key, p+1, u, r/2);
    else @gen(x, key, l, p-1, r/2);
  };
}

Fig. 13. `C code to create a "self-searching" executable array.


typedef double (*dptr)(double);

dptr mkpow(int exp) {
  double vspec base = param(double, 0);           /* Argument: the base */
  double vspec result = local(register double);   /* Local: running product */
  void cspec squares;
  int bit = 2;

  /* Initialize the running product */
  if (1 & exp) squares = `{ result = base; };
  else squares = `{ result = 1.; };

  /* Multiply some more, if necessary */
  while (bit <= exp) {
    squares = `{ @squares; base *= base; };
    if (bit & exp) squares = `{ @squares; result *= base; };
    bit = bit << 1;
  }

  /* Compile a function which returns the result */
  return compile(`{ @squares; return result; }, double);
}

Fig. 14. Code to create a specialized exponentiation function.

  int vspec key = param(int, 0);   /* One argument: the key to search for */
  return (ip)compile(gen(x, key, 0, n-1, n/2), int);
}

In the resulting code, the values from the input array are hard-wired into the instruction stream, and the loop is unrolled into a binary tree of nested if statements that compare the value to be found to constants. As a result, the search involves neither loads from memory nor looping overhead, so the dynamically constructed code is considerably more efficient than its static counterpart. For small input vectors (on the order of 30 elements), this results in lookup performance superior even to that of a hash table.
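As an illustration (with a hypothetical input set, not from the paper), here is the shape of the code gen() builds for the sorted array {10, 20, 30}: a binary tree of nested ifs in which the array elements appear only as immediates.

```c
#include <assert.h>

/* Illustrative sketch: hand-specialized binary search over {10, 20, 30}.
   There are no memory loads and no loop; the tree mirrors the recursion
   of gen() in Figure 13.  Returns the index of the key, or -1. */
static int search3(int key) {
    if (key == 20) return 1;        /* root of the tree: index 1 */
    if (key < 20) {
        if (key == 10) return 0;    /* left subtree: index 0 */
        return -1;
    }
    if (key == 30) return 2;        /* right subtree: index 2 */
    return -1;
}
```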

4.1.4 Exponentiation. Another example of tailoring code to an input set comes from computer graphics [Draves 1995], where it is sometimes necessary to apply an exponentiation function to a large data set. Traditionally, exponentiation is computed in a loop which performs repeated multiplication and squaring. Given a fixed exponent, we can unroll this loop and obtain straight-line code that contains the minimum number of multiplications. The `C code to perform this optimization appears in Figure 14.
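For a concrete exponent, the unrolling is easy to see. The following C sketch (ours) is the straight-line code mkpow(5) would produce: since 5 is 101 in binary, square-and-multiply needs only two squarings and one extra multiply, with no loop or exponent tests left at run time.

```c
#include <assert.h>

/* Illustrative sketch: hand-unrolled analogue of the code mkpow(5)
   generates.  Exponent 5 = 101b. */
static double pow5(double base) {
    double result = base;    /* bit 0 of the exponent is set */
    base = base * base;      /* base^2 */
    base = base * base;      /* base^4 */
    result = result * base;  /* bit 2 of the exponent is set */
    return result;
}
```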

4.1.5 Finite State Machines. It is possible to use `C to specialize code for more complex data than just arrays and primitive values. For example, tcc can compile a DFA description into specialized code, as shown in Figure 15. The function mk_dfa accepts a data structure that describes a DFA with a unique start state and some number of accept states: at each state, the DFA transitions to the next state and produces one character of output based on the next character in its input. mk_dfa uses `C's inter-cspec control flow primitives, jump and label (Section 2.4),


typedef struct {
  int n;          /* State number (start state has n = 0) */
  int acceptp;    /* Nonzero if this is an accept state */
  char *in;       /* I/O and next-state info: on input in[k], */
  char *out;      /*   produce output out[k] and go to state */
  int *next;      /*   number next[k] */
} state_t;

typedef struct {
  int size;          /* Number of states */
  state_t *states;   /* Description of each state */
} dfa_t;

int (*mk_dfa(dfa_t *dfa))(char *in, char *out) {
  char *vspec in = param(char *, 0);    /* Input to dfa */
  char *vspec out = param(char *, 1);   /* Output buffer */
  char vspec t = local(char);
  void cspec *labels = (void cspec *)malloc(dfa->size * sizeof(void cspec));
  void cspec code = `{};                /* Initially dynamic code is empty */
  int i;

  for (i = 0; i < dfa->size; i++)
    labels[i] = label();                /* Create labels to mark each state */
  for (i = 0; i < dfa->size; i++) {     /* For each state ... */
    state_t *cur = &dfa->states[i];
    int j = 0;
    code = `{ @code;                    /* ... prepend the code so far */
              @labels[i];               /* ... add the label to mark this state */
              t = *in; };               /* ... read current input */
    while (cur->in[j]) {                /* ... add code to do the right thing if */
      code = `{ @code;                  /*     this is an input we expect */
                if (t == $cur->in[j]) {
                  in++; *out++ = $cur->out[j];
                  jump labels[cur->next[j]];
                }};
      j++;
    }
    code = `{ @code;                    /* ... add code to return 0 if we're at end */
              if (t == 0) {             /*     of input in an accept state, or */
                if ($cur->acceptp) return 0;
                else return 2;          /*     2 if we're in another state, */
              }                         /*     or 1 if no transition and not end of input */
              else return 1; };
  }
  return compile(code, int);
}

Fig. 15. Code to create a hard-coded finite state machine.

to create code that directly implements the given DFA: each state is implemented by a separate piece of dynamic code, and state transitions are simply conditional branches. The dynamic code contains no references into the original data structure that describes the DFA.
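The flavor of the generated code can be seen in the following hand-written C sketch for a hypothetical two-state machine (state 0, the accept state, consumes 'a' and emits 'x'; state 1 consumes 'b' and emits 'y'). Each state is a label and each transition a goto; the DFA description itself has disappeared.

```c
#include <assert.h>
#include <string.h>

/* Illustrative sketch of mk_dfa-style output.  Return values follow the
   conventions of Figure 15: 0 on accept, 1 on an unexpected input, 2 if
   the input ends in a non-accept state. */
static int run_dfa(const char *in, char *out) {
s0: if (*in == 'a') { in++; *out++ = 'x'; goto s1; }
    if (*in == 0) return 0;
    return 1;
s1: if (*in == 'b') { in++; *out++ = 'y'; goto s0; }
    if (*in == 0) return 2;
    return 1;
}
```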

4.1.6 Swap. The examples so far have illustrated specialization to specific values. Value specialization can improve performance, but it may be impractical if


typedef void (*fp)(void *, void *);

fp mkswap(int size) {
  long *vspec src = param(long *, 0);   /* Arg 0: source */
  long *vspec dst = param(long *, 1);   /* Arg 1: destination */
  long vspec tmp = local(long);         /* Temporary for swaps */
  void cspec s = `{};                   /* Code to be built up, initially empty */
  int i;
  for (i = 0; i < size / sizeof(long); i++)   /* Build swap code */
    s = `{ @s; tmp = src[$i]; src[$i] = dst[$i]; dst[$i] = tmp; };
  return (fp)compile(s, void);
}

Fig. 16. `C code to generate a specialized swap routine.

the values change too frequently. A related approach that does not suffer from this drawback is specialization based on properties, such as size, of data types. For instance, it is often necessary to swap the contents of two regions of memory: in-place sorting algorithms are one such example. As long as the data being manipulated is no larger than a machine word, this process is quite efficient. However, when manipulating larger regions (for example, C structs), the code is often inefficient. One way to copy the regions is to invoke the C memory copy routine, memcpy, repeatedly. Using memcpy incurs function call overhead, as well as overhead within memcpy itself. Another way is to iteratively swap one word at a time, but this method incurs loop overhead.

`C allows us to create a swap routine that is specialized to the size of the region being swapped. The code in Figure 16 is an example, simplified to handle only the case where the size of the region is a multiple of sizeof(long). This routine returns a pointer to a function that contains only assignments, and swaps the region of the given size without resorting to looping or multiple calls to memcpy. The size of the generated code will usually be rather small, which makes this a profitable optimization. Section 5 evaluates a `C heapsort implementation in which the swap routine is customized to the size of the objects being sorted and dynamically inlined into the main sorting code.
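Concretely, on a machine with 8-byte longs (an assumption of this sketch), mkswap(16) would generate the equivalent of the following straight-line C function: two word swaps, no loop, no memcpy.

```c
#include <assert.h>

/* Illustrative sketch: hand-written analogue of mkswap(16) output,
   assuming sizeof(long) == 8, so 16 bytes = 2 words. */
static void swap16(long *src, long *dst) {
    long tmp;
    tmp = src[0]; src[0] = dst[0]; dst[0] = tmp;
    tmp = src[1]; src[1] = dst[1]; dst[1] = tmp;
}
```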

4.1.7 Copy. Copying a memory region of arbitrary size is another common operation. An important application of this is computer graphics [Pike et al. 1985]. Similar to the previous code for swapping regions of memory, the example code in Figure 17 returns a function customized for copying a memory region of a given size.

The procedure mk_copy takes two arguments: the size of the regions to be copied, and the number of times that the inner copying loop should be unrolled. It creates a cspec for a function that takes two arguments, pointers to source and destination regions. It then creates prologue code to copy regions which would not be copied by the unrolled loop (if n mod unrollx ≠ 0), and generates the body of the unrolled loop. Finally, it composes these two cspecs, invokes compile, and returns a pointer to the resulting customized copy routine.
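For instance (our sketch, with hypothetical parameters), mk_copy(10, 4) would build the equivalent of this static C function: the remainder (10 mod 4 = 2 words) is copied by straight-line prologue code, and the main loop then copies four words per iteration.

```c
#include <assert.h>

/* Illustrative sketch: hand-written analogue of mk_copy(10, 4) output. */
static void copy10(unsigned *dst, unsigned *src) {
    int k;
    dst[0] = src[0];                 /* prologue: remainder copy, unrolled */
    dst[1] = src[1];
    for (k = 2; k < 10; k += 4) {    /* main loop, unrolled 4x */
        dst[k]     = src[k];
        dst[k + 1] = src[k + 1];
        dst[k + 2] = src[k + 2];
        dst[k + 3] = src[k + 3];
    }
}
```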


typedef void (*fp)(void *, void *);

fp mk_copy(int n, int unrollx) {
  int i, j;
  unsigned *vspec src = param(unsigned *, 0);   /* Arg 0: source */
  unsigned *vspec dst = param(unsigned *, 1);   /* Arg 1: destination */
  int vspec k = local(int);                     /* Local: loop counter */
  void cspec copy = `{}, unrollbody = `{};      /* Code to build, initially empty */

  for (i = 0; i < n % unrollx; i++)    /* Unroll the remainder copy */
    copy = `{ @copy; dst[$i] = src[$i]; };
  if (n >= unrollx)                    /* Unroll copy loop unrollx times */
    for (j = 0; j < unrollx; j++)
      unrollbody = `{ @unrollbody; dst[k + $j] = src[k + $j]; };
  copy = `{   /* Compose remainder copy with the main unrolled loop */
    @copy;
    for (k = $i; k < $n; k += $unrollx) @unrollbody;
  };

  /* Compile and return a function pointer */
  return (fp)compile(copy, void);
}

Fig. 17. Generating specialized copy code in `C.

4.2 Dynamic Function Call Construction

`C allows programmers to generate functions and calls to functions that have arguments whose number and types are not known at compile time. This functionality distinguishes `C: neither ANSI C nor any of the dynamic compilation systems discussed in Section 6 provides mechanisms for constructing function calls dynamically.

A useful application of dynamic function call construction is the generation of code to marshal and unmarshal arguments stored in a byte vector. These operations are frequently performed to support remote procedure call [Birrell and Nelson 1984]. Generating specialized code for the most active functions results in substantial performance benefits [Thekkath and Levy 1993].

We present two functions, marshal and unmarshal, that dynamically construct marshaling and unmarshaling code, respectively, given a "format vector," types, that specifies the types of arguments. The sample code in Figure 18 generates a marshaling function for arguments with a particular set of types (in this example, int, void *, and double). First, it specifies code to allocate storage for a byte vector large enough to hold the arguments described by the type format vector. Then, for every type in the type vector, it creates a vspec that refers to the corresponding parameter, and constructs code to store the parameter's value into the byte vector at a distinct run-time constant offset. Finally, it specifies code that returns a pointer to the byte vector of marshaled arguments. After dynamic code generation, the function that has been constructed will store all of its parameters at fixed, non-overlapping offsets in the result vector. Since all type and offset computations


typedef union { int i; double d; void *p; } type;
typedef enum { INTEGER, DOUBLE, POINTER } type_t;   /* Types we expect to marshal */
extern void *alloc();

void cspec mk_marshal(type_t *types, int nargs) {
  int i;
  type *vspec m = local(type *);   /* Spec of pointer to result vector */
  void cspec s = `{ m = (type *)alloc(nargs * sizeof(type)); };
  for (i = 0; i < nargs; i++) {    /* Add code to marshal each param */
    switch (types[i]) {
    case INTEGER: s = `{ @s; m[$i].i = param($i, int); }; break;
    case DOUBLE:  s = `{ @s; m[$i].d = param($i, double); }; break;
    case POINTER: s = `{ @s; m[$i].p = param($i, void *); }; break;
    }
  }
  /* Return code spec to marshal parameters and return result vector */
  return `{ @s; return m; };
}

Fig. 18. Sample marshaling code in `C.

typedef int (*fptr)();   /* Type of the function we will be calling */

void cspec mk_unmarshal(type_t *types, int nargs) {
  int i;
  fptr vspec fp = param(fptr, 0);     /* Arg 0: the function to invoke */
  type *vspec m = param(type *, 1);   /* Arg 1: the vector to unmarshal */
  void cspec args = push_init();      /* Initialize the dynamic argument list */
  for (i = 0; i < nargs; i++) {       /* Build up the dynamic argument list */
    switch (types[i]) {
    case INTEGER: push(args, `m[$i].i); break;
    case DOUBLE:  push(args, `m[$i].d); break;
    case POINTER: push(args, `m[$i].p); break;
    }
  }
  /* Return code spec to call the given function with unmarshaled args */
  return `{ fp(args); };
}

Fig. 19. Unmarshaling code in `C.

have been done during environment binding, the generated code will be efficient. Further performance gains could be achieved if the code were to manage details such as alignment.

Dynamic generation of unmarshaling code is equally useful. The process relies on `C's mechanism for constructing calls to arbitrary functions at run time. It not only improves efficiency, but also provides valuable functionality. For example, in Tcl [Ousterhout 1994] the runtime system can make upcalls into an application. However, because Tcl cannot dynamically create code to call an arbitrary function, it marshals all of the upcall arguments into a single byte vector, and forces applications to explicitly unmarshal them. If systems such as Tcl used `C to construct upcalls, clients would be able to write their code as normal C routines, which would increase the ease of expression and decrease the chance for errors.

The code in Figure 19 generates an unmarshaling function that works with the marshaling code in Figure 18. The generated code takes a function pointer as its first argument and a byte vector of marshaled arguments as its second. It unmarshals the values in the byte vector into their appropriate parameter positions, and then invokes the function pointer. mk_unmarshal works as follows: it creates the specifications for the generated function's two incoming arguments, and initializes the argument list. Then, for every type in the type vector, it creates a cspec to index into the byte vector at a fixed offset and pushes this cspec into its correct parameter position. Finally, it creates the call to the function pointed to by the first dynamic argument.
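A static C sketch (ours) shows the layout the marshaling code of Figure 18 produces for the fixed type vector (INTEGER, DOUBLE, POINTER): each parameter lands in its own slot of the result vector. The `C version builds such a function for any type vector at run time.

```c
#include <assert.h>

/* Illustrative sketch: static analogue of generated marshaling code for
   the signature (int, double, void *).  The union mirrors `type' in
   Figure 18. */
typedef union { int i; double d; void *p; } type;

static type *marshal3(type *m, int a0, double a1, void *a2) {
    m[0].i = a0;   /* each argument at a fixed, non-overlapping offset */
    m[1].d = a1;
    m[2].p = a2;
    return m;
}
```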

4.3 Dynamic Inlining

`C makes it easy to inline and compose functions dynamically. This feature is analogous to dynamic inlining through indirect function calls. It improves performance by eliminating function call overhead and by creating the opportunity for optimization across function boundaries.

4.3.1 Parameterized Library Functions. Dynamic function composition is useful when writing and using library functions that would normally be parameterized with function pointers, such as many mathematical and standard C library routines. The `C code for Newton's method [Press et al. 1992] in Figure 20 illustrates its use. The function newton takes as arguments the maximum allowed number of iterations, a tolerance, an initial estimate, and two pointers to functions that return cspecs to evaluate a function and its derivative. In the calls f(p0) and fprime(p0), p0 is passed as a vspec argument. The cspecs returned by these functions are incorporated directly into the dynamically generated code. As a result, there is no function call overhead, and inter-cspec optimization can occur during dynamic code generation.
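The effect of this composition can be sketched in static C (ours, not tcc output): with f(x) = (x+1)^2 and f'(x) = 2(x+1) inlined by hand, the iteration contains no calls through function pointers.

```c
#include <assert.h>
#include <math.h>

/* Illustrative sketch: Newton's method with f and fprime inlined, as the
   dynamic composition in Figure 20 would produce. */
static double newton_inlined(int n, double tol, double p0) {
    int i;
    for (i = 0; i < n; i++) {
        double p = p0 - ((p0 + 1.0) * (p0 + 1.0)) / (2.0 * (p0 + 1.0));
        if (fabs(p - p0) < tol) return p;
        p0 = p;
    }
    return p0;   /* the paper's version reports an error here instead */
}
```

Starting from the paper's parameters (100 iterations, tolerance 1e-6, initial estimate 10), the iteration converges to the root -1.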

4.3.2 Network Protocol Layers. Another important application of dynamic inlining is the optimization of networking code. The modular composition of different protocol layers has long been a goal in the networking community [Clark and Tennenhouse 1990]. Each protocol layer frequently involves data manipulation operations, such as checksumming and byte-swapping. Since performing multiple data manipulation passes is expensive, it is desirable to compose the layers so that all the data handling occurs in one phase [Clark and Tennenhouse 1990].

`C can be used to construct a network subsystem that dynamically integrates protocol data operations into a single pass over memory (e.g., by incorporating encryption and compression into a single copy operation). A simple design for such a system divides each data-manipulation stage into pipes that each consume a single input and produce a single output. These pipes can then be composed and incorporated into a data-copying loop. The design includes the ability to specify prologue and epilogue code that is executed before and after the data-copying loop, respectively. As a result, pipes can manipulate state and make end-to-end checks, such as ensuring that a checksum is valid after it has been computed.

The pipe in Figure 21 can be used to do byte-swapping. Since a byte swapper


typedef double cspec (*dptr)(double vspec);

/* Dynamically create a Newton-Raphson routine. n: max number of iterations;
   tol: maximum tolerance; p0: initial estimate; f: function to solve;
   fprime: derivative of f. */
double newton(int n, double tol, double usr_p0, dptr f, dptr fprime) {
  void cspec cs = `{
    int i; double p, p0 = $usr_p0;
    for (i = 0; i < $n; i++) {
      p = p0 - f(p0) / fprime(p0);       /* Compose cspecs returned by f and fprime */
      if (abs(p - p0) < $tol) return p;  /* Return result if we've converged enough */
      p0 = p;                            /* Seed the next iteration */
    }
    error("method failed after %d iterations\n", i);
  };
  return compile(cs, double)();   /* Compile, call, and return the result. */
}

/* Function that constructs a cspec to compute f(x) = (x+1)^2 */
double cspec f(double vspec x) { return `((x + 1.0) * (x + 1.0)); }

/* Function that constructs a cspec to calculate f'(x) = 2(x+1) */
double cspec fprime(double vspec x) { return `(2.0 * (x + 1.0)); }

/* Call newton to solve an equation */
void use_newton(void) { printf("Root is %f\n", newton(100, .000001, 10., f, fprime)); }

Fig. 20. `C code to create and use routines for Newton's method for solving polynomials.
This example computes the root of the function f(x) = (x + 1)^2.

unsigned cspec byteswap(unsigned vspec input) {
  return `((input << 24) | ((input & 0xff00) << 8) |
           ((input >> 8) & 0xff00) | ((input >> 24) & 0xff));
}

/* "Byteswap" maintains no state and so needs no initial or final code */
void cspec byteswap_initial(void) { return `{}; }
void cspec byteswap_final(void) { return `{}; }

Fig. 21. A sample pipe: byteswap returns a cspec for code that byte-swaps input.

does not need to maintain any state, there is no need to specify initial and final code. The byte swapper simply consists of the "consumer" routine that manipulates the data.

To construct the integrated data-copying routine, the initial, consumer, and final cspecs of each pipe are composed with the corresponding cspecs of the pipe's neighbors. The composed initial code is placed at the beginning of the resulting routine; the consumer code is inserted in a loop, and composed with code which provides it with input and stores its output; and the final code is placed at the end of the routine. A simplified version would look like the code fragment in Figure 22. In a mature implementation of this code, we could further improve performance by unrolling the data-copying loop. Additionally, pipes would take inputs and outputs of different sizes that the composition function would reconcile.
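The result of such a composition can be sketched in static C. The following (our sketch, for a hypothetical two-stage pipe of byte-swapping followed by checksumming) shows the single-pass structure compose() builds: one pipe's output feeds the next, and the checksum's prologue and epilogue surround the copy loop, so the buffer is traversed exactly once.

```c
#include <assert.h>
#include <stdint.h>

/* The byte-swapping expression of Figure 21, as a static 32-bit helper. */
static uint32_t byteswap32(uint32_t x) {
    return (x << 24) | ((x & 0xff00) << 8) |
           ((x >> 8) & 0xff00) | ((x >> 24) & 0xff);
}

/* Illustrative sketch of a composed two-stage pipe: swap, then sum. */
static uint32_t copy_swap_sum(uint32_t *out, const uint32_t *in, int nwords) {
    uint32_t sum = 0;                    /* checksum pipe: initial code */
    int i;
    for (i = 0; i < nwords; i++) {
        uint32_t w = byteswap32(in[i]);  /* stage 1 */
        sum += w;                        /* stage 2 */
        out[i] = w;
    }
    return sum;                          /* checksum pipe: final code */
}
```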


typedef void cspec (*vptr)();
typedef unsigned cspec (*uptr)(unsigned cspec);

/* Pipe structure: contains pointers to functions that return cspecs
   for the initialization, pipe, and finalization code of each pipe */
struct pipe {
  vptr initial, final;   /* initial and final code */
  uptr pipe;             /* pipe */
};

/* Return cspec that results from composing the given vector of pipes */
void cspec compose(struct pipe *plist, int n) {
  struct pipe *p;
  int vspec nwords = param(int, 0);               /* Arg 0: input size */
  unsigned *vspec input = param(unsigned *, 1);   /* Arg 1: pipe input */
  unsigned *vspec output = param(unsigned *, 2);  /* Arg 2: pipe output */
  void cspec initial = `{}, cspec final = `{};    /* Prologue and epilogue code */
  unsigned cspec pipes = `input[i];               /* Base pipe input */

  for (p = &plist[0]; p < &plist[n]; p++) {   /* Compose all stages together */
    initial = `{ @initial; @p->initial(); };  /* Compose initial statements */
    pipes = `p->pipe(pipes);   /* Compose pipes: one pipe's output is the next one's input */
    final = `{ @final; @p->final(); };        /* Compose final statements */
  }

  /* Create a function with initial statements first, consumer statements
     second, and final statements last. */
  return `{ int i;
    @initial;
    for (i = 0; i < nwords; i++) output[i] = pipes;
    @final;
  };
}

Fig. 22. `C code for composing data pipes.

4.4 Compilation

`C's imperative approach to dynamic code generation makes it well-suited for writing compilers and compiling interpreters. `C helps to make such programs both efficient and easy to write: the programmer can focus on parsing, and leave the task of code generation to `C.

4.4.1 Domain-Specific Languages. Small, domain-specific languages can benefit from dynamic compilation. The small query languages used to search databases are one class of such languages [Keppel et al. 1993]. Since databases are large, dynamically compiled queries will usually be applied many times, which can easily pay for the cost of dynamic code generation.

We provide a toy example in Figure 23. The function mk_query takes a vector of queries. Each query contains the following elements: a record field (i.e., CHILDREN or INCOME); a value to compare this field to; and the operation to use in the comparison (i.e., <, >, etc.). Given a query vector, mk_query dynamically creates a query function which takes a database record as an argument and checks whether that record satisfies all of the constraints in the query vector. This check


typedef enum { INCOME, CHILDREN /* ... */ } query_t;   /* Query type */
typedef enum { LT, LE, GT, GE, NE, EQ } bool_op;       /* Comparison operation */

struct query {
  query_t record_field;   /* Field to use */
  unsigned val;           /* Value to compare to */
  bool_op bool_op;        /* Comparison operation */
};

struct record { int income; int children; /* ... */ };   /* Simple database record */

/* Function that takes a pointer to a database record and returns 0 or 1,
   depending on whether the record matches the query */
typedef int (*iptr)(struct record *r);

iptr mk_query(struct query *q, int n) {
  int i, cspec field, cspec expr = `1;   /* Initialize the boolean expression */
  struct record *vspec r = param(struct record *, 0);   /* Record to examine */
  for (i = 0; i < n; i++) {              /* Build the rest of the boolean expression */
    switch (q[i].record_field) {         /* Load the appropriate field value */
    case INCOME:   field = `r->income; break;
    case CHILDREN: field = `r->children; break;
    /* ... */
    }
    switch (q[i].bool_op) {   /* Compare the field value to run-time constant q[i].val */
    case LT: expr = `(expr && field < $q[i].val); break;
    case EQ: expr = `(expr && field == $q[i].val); break;
    case LE: expr = `(expr && field <= $q[i].val); break;
    /* ... */
    }
  }
  return (iptr)compile(`{ return expr; }, int);
}

Fig. 23. Compilation of a small query language.


enum { WHILE, IF, ELSE, ID, CONST, LE, GE, NE, EQ };   /* Multicharacter tokens */

int expect();            /* Consume the given token from the input stream, or fail if not found */
int cspec expr();        /* Parse unary expressions */
int gettok();            /* Consume a token from the input stream */
int look();              /* Peek at the next token without consuming it */
int cur_tok;             /* Current token */
int vspec lookup_sym();  /* Given a token, return corresponding vspec */

void cspec stmt() {
  int cspec e = `0; void cspec s = `{}, s1 = `{}, s2 = `{};
  switch (gettok()) {
  case WHILE:                      /* 'while' '(' expr ')' stmt */
    expect('('); e = expr();
    expect(')'); s = stmt();
    return `{ while (e) @s; };
  case IF:                         /* 'if' '(' expr ')' stmt { 'else' stmt } */
    expect('('); e = expr();
    expect(')'); s1 = stmt();
    if (look(ELSE)) {
      gettok(); s2 = stmt();
      return `{ if (e) @s1; else @s2; };
    } else return `{ if (e) @s1; };
  case '{':                        /* '{' stmt* '}' */
    while (!look('}')) s = `{ @s; @stmt(); };
    return s;
  case ';': return `{};
  case ID: {                       /* ID '=' expr ';' */
    int vspec lvalue = lookup_sym(cur_tok);
    expect('='); e = expr(); expect(';');
    return `{ lvalue = e; };
  }
  default: parse_err("expecting statement");
  }
}

Fig. 24. A sample statement parser from a compiling interpreter written in `C.

is implemented simply as an expression which computes the conjunction of the given constraints. The query function never references the query vector, since all the values and comparison operations in the vector have been hard-coded into the dynamic code's instruction stream.

The dynamically generated code expects one incoming argument, the database record to be compared. It then "seeds" the boolean expression: since we are building a conjunction, the initial value is 1. The loop then traverses the query vector, and builds up the dynamic code for the conjunction according to the fields, values, and comparison operations described in the vector. When the cspec for the boolean expression is constructed, mk_query compiles it and returns a function pointer. That optimized function can be applied to database entries to determine whether they match the given constraints.
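As an illustration (with a hypothetical query, not from the paper), the predicate mk_query would generate for "income < 50000 && children == 2" is equivalent to this static C function: the fields, operators, and constants are hard-coded into one boolean expression, and the query vector is never consulted at run time.

```c
#include <assert.h>

/* Illustrative sketch of a compiled query predicate.  The leading 1 is
   the seed of the conjunction, as in the generated code. */
struct record { int income; int children; };

static int match(struct record *r) {
    return 1 && r->income < 50000 && r->children == 2;
}
```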


4.4.2 Compiling Interpreters. Compiling interpreters (also known as JIT compilers) are important pieces of technology: they combine the flexibility of an interpreted programming environment with the performance of compiled code. For a given piece of code, the user of a compiling interpreter pays a one-time cost of compilation, which can be roughly comparable to that of interpretation. Every subsequent use of that code employs the compiled version, which can be much faster than the interpreted version. Compiling interpreters are also useful in systems in which "just-in-time" compilers are commonly used.

Figure 24 contains a fragment of code for a simple compiling interpreter written in `C. This interpreter translates a simple subset of C: as it parses the program, it builds up a cspec that represents it.

5. EVALUATION

`C and tcc support efficient dynamic code generation. In particular, the measurements in this section demonstrate the following results:

- By using `C and tcc, we can achieve good speedups relative to static C. Speedups by a factor of two to four are common for the programs that we have described.

- `C and tcc do not impose excessive overhead on performance. The cost of dynamic compilation is usually recovered in under 100 runs of a benchmark; sometimes, this cost can be recovered in one run of a benchmark.

- The tradeoff between dynamic code quality and dynamic code generation speed must be made on a per-application basis. For some applications, it is better to generate code faster; for others, it is better to generate better code.

- Dynamic code generation can result in large speedups when it enables large-scale optimization: when interpretation can be eliminated, or when dynamic inlining enables further optimization. It provides smaller speedups if only local optimizations, such as strength reduction, are performed dynamically. In such cases, the cost of dynamic code generation may outweigh its benefits.

5.1 Experimental Methodology

The benchmarks that we measure have been described in previous sections. Table II briefly summarizes each benchmark, and lists the section in which it appears.

The performance improvements of dynamic code generation hinge on customizing code to data. As a result, the performance of all of the benchmarks in this section is data-dependent to some degree. In particular, the amount of code generated by the benchmarks is in some cases dependent on the input data. For example, since dp generates code to compute the dot product of an input vector with a run-time constant vector, the size of the dynamic code (and hence its i-cache performance) is dependent on the size of the run-time constant vector. Its performance relative to static code also depends on the density of 0s in the run-time constant vector, since those elements are optimized out when generating the dynamic code. Similarly, binary and dfa generate more code for larger inputs, which generally improves their performance relative to equivalent static code until negative i-cache effects come into play.

Some other benchmarks (ntn, ilp, and query) involve dynamic function inlining that is affected by input data. For example, the code inlined in ntn depends on the


Benchmark  Description                                                Section  Page
ms         Scale a 100x100 matrix by the integers in [10, 100]        2.3.2    8
hash       Hash table, constant table size, scatter value, and        4.1.1    22
           hash table pointer; one hit and one miss
dp         Dot product with a run-time constant vector:               4.1.2    23
           length 40, one-third zeroes
binary     Binary search on a 16-element constant array;              4.1.3    23
           one hit and one miss
pow        Exponentiation of 2 by the integers in [10, 40]            4.1.4    24
dfa        Finite state machine computation:                          4.1.5    25
           6 states, 13 transitions, input length 16
heap       Heapsort, parameterized with a specialized swap:           4.1.6    26
           500-entry array of 12-byte structures
mshl       Marshal five arguments into a byte vector                  4.2      28
unmshl     Unmarshal a byte vector, and call a function of            4.2      28
           five arguments
ntn        Root of f(x) = (x+1)^2 to a tolerance of 10^-9             4.3.1    30
ilp        Integrated copy, checksum, byteswap of a 16KB buffer       4.3.2    31
query      Query 2000 records with seven binary comparisons           4.4.1    32

Table II. Descriptions of benchmarks.

fun ction tobecom puted, that in ilp on the nature of the proto col stack, and that

in query on t he typ e of query submit ted by the user. The advantage of dynamic

co de over static co de i ncreases with the opp ort unity for inlining and cross-f unction

optimization. For example, an ilp proto colstack comp osed from many small passes

will p erform relatively b etter in dynamic co de that one comp osed from a few larger

passes.

Lastly, a few benchmarks are relatively data-independent. pow, heap, mshl, and unmshl generate varying amounts of code depending, respectively, on the exponent used, or the type and size of the objects being sorted, marshaled, or unmarshaled, but the differences are small for most reasonable inputs. ms obtains performance improvements by hard-wiring loop bounds and strength-reducing multiplication by the scale factor. hash makes similar optimizations when computing a hash function. The values of run-time constants may affect performance to some degree (for example, excessively large constants are not useful for this sort of optimization), but such effects are much smaller than those of more large-scale dynamic optimizations.
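The effect of this kind of specialization is easiest to see for pow. The sketch below models it in Python rather than `C (exec stands in for tcc's run-time code generation, and the function names are ours, not the paper's): for a fixed exponent, the square-and-multiply loop is unrolled into straight-line multiplies, so the specialized function contains neither the loop nor the exponent.

```python
def specialize_pow(n):
    # Build the source of a function specialized to exponent n: each bit
    # of n becomes straight-line square/multiply code, with no loop left.
    lines = ["def f(x):", "    r = 1"]
    for bit in bin(n)[2:]:
        lines.append("    r = r * r")
        if bit == "1":
            lines.append("    r = r * x")
    lines.append("    return r")
    env = {}
    exec("\n".join(lines), env)   # stand-in for dynamic code generation
    return env["f"]

pow10 = specialize_pow(10)
print(pow10(2))  # 1024
```

The one-time cost of building pow10 corresponds to tcc's dynamic compilation overhead; each subsequent call runs the specialized straight-line code.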

Each benchmark was written both in `C and in static C. The `C programs were compiled with both the vcode- and the icode-based tcc back ends. When measuring the performance of the icode runtime system, we always employed linear scan register allocation with live intervals derived from live variable information. The static C programs were compiled both with the lcc compiler and with the GNU C compiler, gcc. The code-generating functions used for dynamic code generation are created from the lcc intermediate representation, using that compiler's code generation strategies. As a result, the performance of lcc-generated code should be used as the baseline to measure the impact of dynamic code generation. Measurements collected using gcc serve to compare tcc to an optimizing compiler of reasonable quality.

The machine used for measurements is a Sun Ultra 2 Model 2170 workstation with 384MB of main memory and two 168MHz UltraSPARC-I CPUs. The UltraSPARC-I can issue up to 2 integer and 2 floating point instructions per cycle, and has a write-through, non-allocating, direct-mapped, on-chip 16KB cache. It implements the SPARC version 9 architecture [SPARC International 1994]. tcc also generates code for the MIPS family of processors; we report only SPARC measurements for clarity, since results on the two architectures are similar.

Times were obtained by measuring a large number of trials (enough to provide several seconds of granularity, with negligible standard deviations) using the getrusage system call. The number of trials varied from 100 to 100,000, depending on the benchmark. The resulting times were then divided by the number of iterations to obtain the average overhead of a single run. This form of measurement ignores the effects of cache refill misses, but is representative of how these applications would likely be used (for example, in tight inner loops).
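The measurement loop itself is simple. A minimal sketch of the methodology (in Python, with perf_counter standing in for the paper's getrusage-based timing; the function name is ours):

```python
import time

def per_run_overhead(fn, trials):
    # Run fn many times and report the average cost of one run,
    # as in the text: total elapsed time divided by the trial count.
    start = time.perf_counter()
    for _ in range(trials):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / trials
```

Choosing trials large enough that the total elapsed time spans several seconds keeps timer granularity and per-call noise negligible, as described above.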

Section 5.2 discusses the performance effects of using dynamic code generation: specifically, the speedup of dynamic code relative to static code, and the overhead of dynamic code generation relative to speedup. Section 5.3 presents breakdowns of the dynamic compilation overhead of both vcode and icode in units of processor cycles per generated instruction.

5.2 Performance

This section shows that tcc provides low-overhead dynamic code generation, and that it can be used to speed up a number of benchmarks. We describe results for the benchmarks in Table II and for xv, a freely available image manipulation package.

We compute the speedup due to dynamic code generation by dividing the time required to run the static code by the time required to run the corresponding dynamic code. We measure overhead by calculating each benchmark's "cross-over" point, if one exists. This point is the number of times that dynamic code must be used so that the overhead of dynamic code generation equals the time gained by running the dynamic code.
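These two quantities reduce to a small calculation. The sketch below (Python; the cycle counts in the example are invented for illustration, not measurements from the paper) computes the speedup and the cross-over point from a one-time compilation cost and the per-run times of the static and dynamic code:

```python
def speedup(t_static, t_dynamic):
    # Speedup as defined in the text: static run time / dynamic run time.
    return t_static / t_dynamic

def cross_over(t_compile, t_static, t_dynamic):
    # Smallest number of uses n at which n*t_static equals
    # t_compile + n*t_dynamic, i.e., when the dynamic code has
    # "paid for itself". No cross-over exists if dynamic code
    # is not faster than static code.
    if t_dynamic >= t_static:
        return None
    return t_compile / (t_static - t_dynamic)

# Hypothetical numbers: 10000-cycle compile, 30 vs. 10 cycles per run.
print(speedup(30, 10))            # 3.0
print(cross_over(10000, 30, 10))  # 500.0
```

A cross-over point below one, as reported later for some benchmarks, simply means a single use of the dynamic code already recovers the compilation cost.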

The performance of dynamic code is up to an order of magnitude better than that of unoptimized static code. In many cases, the performance improvement of using dynamic code generation can be amortized over fewer than ten runs of the dynamic code. The benchmarks that achieve the highest speedups are those in which dynamic information allows the most effective restructuring of code relative to the static version. The main classes of such benchmarks are numerical code in which particular values allow large amounts of work to be optimized away (for example, dp), code in which an expensive layer of data structure interpretation can be removed at run time (for example, query), and code in which inlining can be performed dynamically but not statically (for example, ilp).

vcode generates code approximately three to eight times more quickly than icode. Nevertheless, the code generated by icode can be considerably faster than that generated by vcode. A programmer can choose between the two systems to trade code quality for code generation speed, depending on the needs of the application.


Fig. 25. Speedup of dynamic code over static code (bar chart; one group of bars per benchmark: ms, hash, dp, binary, pow, dfa, heap, mshl, unmshl, ntn, ilp, query; legend: icode-lcc, vcode-lcc, icode-gcc, vcode-gcc; vertical axis: speedup, static time / dynamic time).

5.2.1 Speedup. Figure 25 shows that using `C and tcc improves the performance of almost all of our benchmarks. Both in this figure and in Figure 26, the legend indicates which static and dynamic compilers are being compared. icode-lcc compares dynamic code created with icode to static code compiled with lcc; vcode-lcc compares dynamic code created with vcode to static code compiled with lcc. Similarly, icode-gcc compares icode to static code compiled with gcc, and vcode-gcc compares vcode to static code compiled with gcc.

In general, dynamic code is significantly faster than static code: speedups by a factor of two relative to the best code emitted by gcc are common. Unsurprisingly, the code produced by icode is faster than that produced by vcode, by up to 50% in some cases. Also, the GNU compiler generates better code than lcc, so the speedups relative to gcc are almost always smaller than those relative to lcc. As mentioned earlier, however, the basis for comparison should be lcc, since the code-generating functions are generated by an lcc-style back end, which does not perform static optimizations.

Dynamic code generation does not pay off in only one benchmark, unmshl. In this benchmark, `C provides functionality that does not exist in C. The static code used for comparison implements a special case of the general functionality provided by the `C code, and it is very well tuned.

5.2.2 Cross-over. Figure 26 indicates that the cost of dynamic code generation in tcc is reasonably low. The cross-over point on the vertical axis is the number of times that the dynamic code must be used in order for the total overhead of its compilation and uses to be equal to the overhead of the same number of uses of static code. This number is a measure of how quickly dynamic code "pays for itself."

Fig. 26. Cross-over points, in number of runs (log scale; one group of bars per benchmark; legend: icode-lcc, vcode-lcc, icode-gcc, vcode-gcc). If the cross-over point does not exist, the bar is omitted.

For all benchmarks except query, one use of dynamic code corresponds to one run of the dynamically created function. In query, however, the dynamic code is used as a small part of the overall algorithm: it is the test function used to determine whether a record in the database matches a particular query. As a result, in that case we define one use of the dynamic code to be one run of the search algorithm, which corresponds to many invocations (one per database entry) of the dynamic code. This methodology realistically measures how specialization is used in these cases.

In the case of unmshl, the dynamic code is slower than the static one, so the cross-over point never occurs. Usually, however, the performance benefit of dynamic code generation occurs after a few hundred or fewer runs. In some cases (ms, heap, ilp, and query), the dynamic code pays for itself after only one run of the benchmark. In ms and heap, this occurs because a reasonable problem size is large relative to the overhead of dynamic compilation, so even small improvements in run time (from strength reduction, loop unrolling, and hard-wiring pointers) outweigh the code generation overhead. In addition, ilp and query exemplify the types of applications in which dynamic code generation can be most useful: ilp benefits from extensive dynamic function inlining that cannot be performed statically, and query dynamically removes a layer of interpretation inherent in a database query language.

Figures 25 and 26 show how dynamic compilation speed can be exchanged for dynamic code quality. vcode can be used to perform fast, one-pass dynamic code generation when the dynamic code will not be used very much. However, the code generated by icode is often considerably faster than that generated by vcode: hence, icode is useful when the dynamic code is run more times, so that the code's performance is more important than the cost of generating it.

Convolution mask (pixels)   Time (seconds)                 DCG overhead (seconds)
                            lcc      gcc     tcc (icode)
3 x 3                       5.79     2.44    1.91          2.5 x 10^-3
7 x 7                       17.57    6.86    5.78          3.5 x 10^-3

Table III. Performance of convolution on an 1152x900 image in xv.

5.2.3 xv. To test the performance of tcc on a relatively large application, we modified xv to use `C. xv is a popular image manipulation package that consists of approximately 60,000 lines of code. We picked one of its image processing algorithms and changed it to make use of dynamic code generation. One algorithm is sufficient, since most of the algorithms are implemented similarly. The algorithm, Blur, applies a convolution matrix of user-defined size that consists of all 1's to the source image. The original algorithm was implemented efficiently: the values in the convolution matrix are known statically to be all 1's, so convolution at a point is simply the average of the image values of neighboring points. Nonetheless, the inner loop contains image-boundary checks based on run-time constants, and is bounded by a run-time constant, the size of the convolution matrix.
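The structure of that inner loop can be sketched as follows (in Python, not xv's actual C source; the names and the integer-average convention are ours): the mask size k and the image bounds are exactly the run-time constants that tcc hard-wires into the dynamic version, eliminating the boundary tests and fixing the loop trip counts.

```python
def blur_generic(img, w, h, k):
    # Average over a k x k neighborhood. In the static code, k and the
    # bounds w, h are run-time values, so every inner iteration pays
    # for the boundary checks below; a dynamic compiler can specialize
    # them away.
    out = [[0] * w for _ in range(h)]
    r = k // 2
    for y in range(h):
        for x in range(w):
            total = count = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    if 0 <= y + dy < h and 0 <= x + dx < w:
                        total += img[y + dy][x + dx]
                        count += 1
            out[y][x] = total // count
    return out
```

With k fixed, the two inner loops can also be fully unrolled, which is the kind of restructuring the measurements in Table III reflect.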

Results from this experiment appear in Table III. For both a 3 x 3 and a 7 x 7 convolution mask, the dynamic code obtained using tcc and icode is approximately 3 times as fast as the static code created by lcc, and approximately 20% faster than the static code generated by gcc with all optimizations turned on. Importantly, the overhead of dynamic code generation is almost 3 orders of magnitude less than the performance benefit it provides.

xv is an example of the usefulness of dynamic code generation in the context of a well-known application program. Two factors make this result significant. First, the original static code was quite well-tuned. In addition, the tcc code generator that emits the code-generating code is derived from an lcc code generator: as a result, the default dynamic code, barring any dynamic optimizations, is considerably less well-tuned than equivalent code generated by the GNU compiler. Despite all this, the dynamic code is faster than even aggressively optimized static code, and the cost of dynamic code generation is insignificant compared to the benefit obtained.

5.3 Analysis

This section analyzes the code generation overhead of vcode and icode. vcode generates code at a cost of approximately 100 cycles per generated instruction. Most of this time is taken up by register management; just laying out the instructions in memory requires much less overhead. icode generates code roughly three to eight times more slowly than vcode. Again, much of this overhead is due to register allocation: the choice of register allocator can significantly influence icode performance.


Fig. 27. Dynamic code generation overhead using vcode (cycles per generated instruction, per benchmark; legend: code generation, environment binding).

5.3.1 vcode Overhead. Figure 27 breaks down the code generation overhead of vcode for each of the benchmarks. The vcode back end generates code at approximately 100 cycles per generated instruction: the geometric mean of the overheads for the benchmarks in this paper is 119 cycles per instruction. The cost of environment binding is small; almost all the time is spent in code generation.

The code generation overhead has several components. The breakdown is difficult to measure precisely and varies slightly from benchmark to benchmark, but there are some broad patterns:

- Laying instructions out in memory (bitwise operations to construct instructions, and stores to write them to memory) accounts for roughly 15% of the overhead.
- Dynamically allocating memory for the code, linking, delay slot optimizations, and prologue and epilogue code add approximately another 25%.
- Register management (vcode's putreg/getreg operations) accounts for about 50% of the overhead.
- Approximately 10% of the overhead is due to other artifacts, such as checks on the storage class of dynamic variables, the overhead of calling code-generating functions, etc.

These results indicate that dynamic register allocation, even in the minimal vcode implementation, is a major source of overhead. This cost is unavoidable in `C's dynamic code composition model; systems that can statically allocate registers for dynamic code should therefore have a considerable advantage over `C in terms of dynamic compile-time performance.

5.3.2 icode Overhead. Figure 28 breaks down the code generation overhead of icode for each of the benchmarks. For each benchmark we report two costs. The columns labeled L represent the overhead of using icode with linear scan register allocation based on precise live variable information. The columns labeled U represent the overhead of using icode with the simple allocator that places the variables with the highest usage counts in registers. Both algorithms are described in Section 3.3.2.2. For each benchmark and type of register allocation, we report the overhead due to environment binding, laying out the icode IR, creating the flow graph and doing some setup (allocating memory for the code, initializing vcode, etc.), performing various phases of register allocation, and finally generating code.

Fig. 28. Dynamic code generation overhead using icode (cycles per generated instruction, per benchmark; legend: code generation, live interval construction, setup/flow graph construction, IR layout, environment binding). Columns labeled L denote icode with linear scan register allocation; those labeled U denote icode with a simple allocator based on usage counts.

icode's code generation speed ranges from about 200 to 800 cycles per instruction, depending on the benchmark and the type of register allocation. The geometric mean of the overheads for the benchmarks in this paper, when using linear scan register allocation, is 615 cycles per instruction.

The allocator based on usage counts is considerably faster than linear scan, because it does not have to compute live variables and live intervals, and does not need to build a complete flow graph. By contrast, the traditional graph coloring register allocator (not shown in the figure) is generally over twice as slow as the linear scan allocator. Graph coloring is a useful reference algorithm, but it is not practical for dynamic code generation: linear scan is faster and produces code that is usually just as good. At the other extreme, the usage count allocator is faster than linear scan and often makes good allocation decisions on small benchmarks; however, it sometimes produces very poor code (for example, dfa and heap). As a result, linear scan is the default allocator for icode, and the one for which we show performance results in Figures 25 and 26.
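The usage-count policy is simple enough to state in a few lines. A sketch (Python; the variable names, counts, and register names are invented for illustration): variables are ranked by static use count and the top few get the available registers, with no liveness analysis or flow graph construction at all, which is why this allocator is so much cheaper at dynamic compile time.

```python
def usage_count_allocate(use_counts, registers):
    # Simple allocator described in the text: the variables with the
    # highest usage counts get the available registers; the rest stay
    # in memory. No live intervals or flow graph are needed.
    ranked = sorted(use_counts, key=use_counts.get, reverse=True)
    assignment = {}
    for var, reg in zip(ranked, registers):
        assignment[var] = reg
    return assignment

# Hypothetical counts for a small dynamic function, two registers free.
counts = {"i": 12, "sum": 9, "tmp": 2}
print(usage_count_allocate(counts, ["r1", "r2"]))  # {'i': 'r1', 'sum': 'r2'}
```

The weakness noted above follows directly from the sketch: a heavily used variable is kept in a register for its entire lifetime, even across regions where it is dead, which is what hurts benchmarks such as dfa and heap.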


In addition to register allocation and related liveness analyses, the main sources of overhead are flow graph construction and code generation. The latter roughly corresponds to the vcode code generation overhead. Environment binding and laying out the icode intermediate representation are relatively inexpensive operations.

6. RELATED WORK

Dynamic code generation has a long history [Keppel et al. 1991]. It has been used to increase the performance of operating systems [Bershad et al. 1995; Engler et al. 1995; Pu et al. 1995; Pu et al. 1988], windowing operations [Pike et al. 1985], dynamically typed languages [Chambers and Ungar 1989; Deutsch and Schiffman 1984; Hölzle and Ungar 1994], and simulators [Witchel and Rosenblum 1996; Veenstra and Fowler 1994]. Research on `C and tcc grew out of work on DCG [Engler and Proebsting 1994], a low-level dynamic code generation system. Earlier descriptions of the `C language and tcc have been published elsewhere [Engler et al. 1995; Poletto et al. 1997].

Other languages also provide the ability to create code at run time. For example, most Lisp dialects [Kelsey et al. 1998; Steele Jr. 1990], Tcl [Ousterhout 1994], and Perl [Wall et al. 1996] provide an "eval" operation that allows code to be generated dynamically. This approach is extremely flexible but, unfortunately, comes at a high price: since these languages are dynamically typed, little code generation cost can be pushed to compile time.

Keppel addressed some of the issues in dynamic code generation [Keppel 1991]. He developed a portable system for modifying instruction spaces on a variety of machines. His system dealt with the difficulties presented by caches and operating system restrictions, but it did not address how to select and emit actual binary instructions. Keppel, Eggers, and Henry [Keppel et al. 1993] demonstrated that dynamic code generation can be effective for several different applications.

There has been much recent work on specialization and run-time compilation in C. Unlike `C, which takes an imperative approach to expressing dynamic code generation, and requires the programmer to explicitly manipulate dynamic code objects, most of these systems adopt a declarative approach. In this model, the programmer annotates the C code with directives that identify run-time constants and possibly specify various code generation policies, such as the aggressiveness of specialization and the extent to which dynamic code is cached and reused. Dynamic code generation happens automatically in such systems.

One such system has been developed at the University of Washington [Auslander et al. 1996; Grant et al. 1997]. The first UW compiler [Auslander et al. 1996] provided a limited set of annotations and exhibited relatively poor performance. That system performs data-flow analysis to discover all derived run-time constants, given the run-time constants specified by the programmer. The second system, DyC [Grant et al. 1997], provides a more expressive annotation language and support for several features, including polyvariant division (which allows the same program point to be analyzed for different combinations of run-time invariants), polyvariant specialization (which allows the same program point to be dynamically compiled multiple times, each specialized to different values of a set of run-time invariants), lazy specialization, and interprocedural specialization. These features allow the system to achieve levels of functionality similar to `C, but in a completely different style of programming. DyC does not provide mechanisms for creating functions and function calls with dynamically determined numbers of arguments. For simple forms of specialization, DyC sometimes generates code more quickly than tcc using vcode. For more complex forms of specialization (such as generating a compiling interpreter from an interpreter), DyC is approximately as fast as tcc using icode.

Another automatic dynamic code generation system driven by user annotations is Tempo [Consel and Noël 1996]. Tempo is a template-based dynamic compiler derived from GNU CC. It is similar to DyC, but provides support for only function-level polyvariant division and specialization, and does not provide means of setting policies for division, specialization, caching, and speculative specialization. In addition, it does not support specialization across separate source files. Unlike DyC, however, it performs conservative alias and side-effect analysis to identify partially static data structures. The performance data indicates that Tempo's cross-over points tend to be slightly worse than DyC's, but the speedups are comparable, which indicates that Tempo generates code of comparable quality, but more slowly. Since Tempo does not support complex specialization mechanisms, though, its expressiveness is weaker than that of DyC and `C. The Tempo project has targeted `C as a back end for its run-time specializer.

Fabius [Leone and Lee 1996] is a dynamic compilation system based on partial evaluation that was developed in the context of a purely functional subset of ML. It uses a syntactic form of currying to allow the programmer to express run-time invariants. Given the hints regarding run-time invariants, Fabius performs dynamic compilation and optimization automatically. Fabius achieves fast code generation speeds, but `C is more flexible than Fabius. In Fabius, the user cannot directly manipulate dynamic code, and unlike in Tempo and DyC, the user has no recourse to additional annotations for controlling the code generation process. In essence, Fabius uses dynamic compilation solely for its performance advantages, extending to run time the applicability of traditional optimizations such as copy propagation and dead code elimination.

The Dynamo project [Leone and Dybvig 1997] is a successor to Fabius. Leone and Dybvig are designing a staged compiler architecture that supports different levels of dynamic optimization: emitting a high-level intermediate representation enables "heavyweight" optimizations to be performed at run time, whereas emitting a low-level intermediate representation enables only "lightweight" optimizations. The eventual goal of the Dynamo project is to build a system that will automatically perform dynamic optimization.

From a linguistic perspective, the declarative systems have the advantage that most annotations preserve the semantics of the original code, so it is possible to compile and debug a program without them. Knowing exactly where to insert the annotations, however, can still be a challenge. Also, importantly, only DyC seems to provide dynamic code generation flexibility comparable to that of `C. Furthermore, even with DyC, many common dynamic code programming tasks, such as the various lightweight compiling interpreters presented in this paper, involve writing interpreter functions no less complicated than those one would write for `C. In the end, the choice of system is probably a matter of individual taste.

From a performance perspective, declarative systems can often allow better static optimization than `C, because the control flow within dynamic code can be determined statically. Nonetheless, complicated control flow, such as loops containing conditionals, can limit this advantage. For example, in DyC, the full extent of dynamic code cannot in general be determined statically unless one performs full multi-way loop unrolling, which can cause prohibitive code growth. Finally, only Leone and Lee [Leone and Lee 1996] consistently generate code significantly more quickly than tcc; as we described above, their system provides less functionality and flexibility than `C.

7. CONCLUSION

This paper has described the design and implementation of `C, a high-level language for dynamic code generation. `C is a superset of ANSI C that provides dynamic code generation to programmers at the level of C expressions and statements. Not unlike Lisp, `C allows programmers to create and compose pieces of code at run time. It enables programmers to add dynamic code generation to existing C programs in a simple, portable, and incremental manner. Finally, the mechanisms that it provides for dynamic code generation can be mapped onto statically typed languages other than ANSI C.

tcc is a portable and freely available implementation of `C. Implementing `C demonstrated that there is an important trade-off between the speed of dynamic code generation and the quality of the generated code. As a result, tcc supports two runtime systems for dynamic code generation. The first of these, vcode, emits code in one pass and only performs local optimizations. The second, icode, builds an intermediate representation at run time and performs other optimizations, such as global register allocation, before it emits code.

We have presented several example programs that demonstrate the utility and expressiveness of `C in different contexts. `C can be used to improve the performance of database query languages, network data manipulation routines, math libraries, and many other applications. It is also well-suited for writing compiling interpreters and "just-in-time" compilers. For some applications, dynamic code generation can improve performance by almost an order of magnitude over traditional C code; speedups by a factor of two to four are not uncommon.

Dynamic code generation with tcc is quite fast. vcode dynamically generates one instruction in approximately 100 cycles; icode dynamically generates one instruction in approximately 600 cycles. In most of our examples, the overhead of dynamic code generation is recovered in under 100 uses of the dynamic code; sometimes it can be recovered within one run.

`C and tcc are practical tools for using dynamic code generation in day-to-day programming. They also provide a framework for exploring the trade-offs in the use and implementation of dynamic compilation. A release of tcc, which currently runs on MIPS and SPARC processors, is available at http://pdos.lcs.mit.edu/tickc.

ACKNOWLEDGMENTS

Vivek Sarkar was instrumental in the development of the linear scan register allocation algorithm, which grew out of his work on spill-code minimization within a single basic block. Eddie Kohler provided valuable feedback on the language and on vcode register allocation. Jonathan Litt patiently rewrote parts of xv in `C while tcc was still immature, and thus helped us find several bugs.

A. `C GRAMMAR

The grammar for `C consists of the C grammar given in Harbison and Steele's C reference manual [Harbison and Steele Jr. 1991] with the additions listed below, and the following restrictions:

- An unquoted-expression can only appear inside a backquote-expression, and cannot appear within another unquoted-expression.
- A backquote-expression cannot appear within another backquote-expression.
- cspecs and vspecs cannot be declared within a backquote-expression.

unary-expression: backquote-expression | unquoted-expression
unquoted-expression: at-expression | dollar-expression
backquote-expression: ` unary-expression | ` compound-statement
at-expression: @ unary-expression
dollar-expression: $ unary-expression
pointer: cspec type-qualifier-list_opt | vspec type-qualifier-list_opt
       | cspec type-qualifier-list_opt pointer | vspec type-qualifier-list_opt pointer

REFE RENCES

Auslander, J. , Philipos e, M., Chambers , C., Eggers , S., and B ershad, B . 1 996. Fast , e ec-

tive dyna mic co mpilation . In Proceedings of the SIG PLAN '9 6 Co nferenceonProgram ming

Langua geDesign and I mplem enta tio n. Philad elphia, PA, 14 9{1 59.

Bers had , B. N., Savage, S., Pardyak, P. , Sirer, E . G. , Fiuczynski, M. , B ecker, D ., Eggers ,

S., and Chambers, C. 199 5. Ext ensibility, safety and performance in the SPIN op erat ing

system. In Proceedings of the Fifteenth ACM Symposium o n s Principles.

Copp e r Mou nta in, CO, 267{ 284 .

Birrell, A. D. and Nelson, B . J. 198 4. Implementin g remote pro ce dure ca lls. ACM Tra nsac-

tio ns o n Co mp uter Systems 2, 1Fe b., 39 {59 .

Briggs, P. and Harvey, T. 199 4. Multiplicat ion by integ er c onsta nts.

http :// soft lib .ric e.e du/M SCP .

Chaitin, G.J., Auslander, M. A., Chandra, A. K., Coc ke, J. , Hop kins, M.E., and Mark -

stein, P . W. 19 81. Reg iste r allo cat io n via co loring. Com puter La nguages 6 , 47 {57.

Chambers, C. andUngar, D. 1 989. Custo miza tion: Op timizing c ompile r te chno logy fo r SELF,

a dyna mica lly-typ ed ob jec t-oriente d programming lang uage . I n Proceedings o f PLDI '89 .Port-

lan d, OR, 146 {16 0.

Clark, D. D. and Tennenhouse,D.L.19 90. Architec tural con side rat ions for a new g enerat ion

of p rot o co ls. I n ACM Com municatio n Arch itectures, Protocol s, and Appl ications SIGCOMM

199 0.Philade lphia, PA.

Cons el, C. and N oel, F . 1 996. A gene ra l approachfor run-timesp ecializ ation an d its applicat ion

to C. I n Proceed ings o f the 23th Annu al Symposium o n P rincip les o f P rogra mming La nguages.

St. Pet ersbu rg, FL, 145 {15 6.

Deuts ch, P . and Schiffman, A. 19 84. Ecientimplementat ion of the -80 syste m. In

Proceedings of th e 11th Annua l Symposium on Principles o f Progra mming La nguages. S alt Lake

City, UT, 29 7{30 2.

Draves, S. 199 5. L ightweight lang uage s for inte rac tivegra phics. Technica l Re p ort CMU-CS-95-

148, Carn egie Mellon University.May.

Engler, D . and Proebsting, T. 199 4. DCG : An ecient , ret arget able dyn amic c o de g enera-

tion syst em. Proceedings of the Sixth I nterna tio nal Co nferenceonArch itectura l Sup port for

Progra mming Langua ges and Operating Systems , 263{ 272 .

46  Poletto , Hsieh, Eng ler, K aas hoek

Engler, D. R. 1996. vcode: a retargetable, extensible, very fast dynamic code generation system. In Proceedings of the SIGPLAN '96 Conference on Programming Language Design and Implementation. Philadelphia, PA, 160–170. http://www.pdos.lcs.mit.edu/~engler/vcode.html.

Engler, D. R., Hsieh, W. C., and Kaashoek, M. F. 1995. `C: A language for high-level, efficient, and machine-independent dynamic code generation. In Proceedings of the 23rd Annual Symposium on Principles of Programming Languages. St. Petersburg, FL, 131–144.

Engler, D. R., Kaashoek, M. F., and O'Toole Jr., J. 1995. Exokernel: an operating system architecture for application-specific resource management. In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles. Copper Mountain Resort, Colorado, 251–266.

Forsythe, G. E. 1977. Computer Methods for Mathematical Computations. Prentice-Hall, Englewood Cliffs, NJ.

Fraser, C. 1980. copt. ftp://ftp.cs.princeton.edu/pub/lcc/contrib/copt.shar.

Fraser, C. W. and Hanson, D. R. 1990. A code generation interface for ANSI C. Technical Report CS-TR-270-90, Department of Computer Science, Princeton University.

Fraser, C. W. and Hanson, D. R. 1995. A retargetable C compiler: design and implementation. Benjamin/Cummings Publishing Co., Redwood City, CA.

Fraser, C. W., Henry, R. R., and Proebsting, T. A. 1992. BURG: fast optimal instruction selection and tree parsing. SIGPLAN Notices 27, 4 (April), 68–76.

Grant, B., Mock, M., Philipose, M., Chambers, C., and Eggers, S. 1997. Annotation-directed run-time specialization in C. In Symposium on Partial Evaluation and Semantics-Based Program Manipulation. Amsterdam, The Netherlands.

Harbison, S. and Steele Jr., G. 1991. C, A Reference Manual, Third ed. Prentice Hall, Englewood Cliffs, NJ.



Hölzle, U. and Ungar, D. 1994. Optimizing dynamically-dispatched calls with run-time type feedback. In Proceedings of the SIGPLAN '94 Conference on Programming Language Design and Implementation. Orlando, Florida, 326–335.

Kelsey, R., Clinger, W., Rees, J., Eds., et al. 1998. Revised^5 Report on the Algorithmic Language Scheme. http://www-swiss.ai.mit.edu/~jaffer/r5rs_toc.html.

Keppel, D. 1991. A portable interface for on-the-fly instruction space modification. In Fourth International Conference on Architectural Support for Programming Languages and Operating Systems. Santa Clara, CA, 86–95.

Keppel, D., Eggers, S., and Henry, R. 1991. A case for runtime code generation. TR 91-11-04, University of Washington.

Keppel, D., Eggers, S., and Henry, R. 1993. Evaluating runtime-compiled value-specific optimizations. TR 93-11-02, Department of Computer Science and Engineering, University of Washington.

Leone, M. and Dybvig, R. K. 1997. Dynamo: A staged compiler architecture for dynamic program optimization. Tech. Rep. 490, Indiana University Computer Science Department. Sept.

Leone, M. and Lee, P. 1996. Optimizing ML with run-time code generation. In Proceedings of the SIGPLAN '96 Conference on Programming Language Design and Implementation. Philadelphia, PA, 137–148.

Ousterhout, J. 1994. Tcl and the Tk Toolkit. Addison-Wesley Professional Computing Series. Addison-Wesley, Reading, MA.

Pike, R., Locanthi, B., and Reiser, J. 1985. Hardware/software trade-offs for bitmap graphics on the Blit. Software: Practice and Experience 15, 2 (Feb.), 131–151.

Poletto, M., Engler, D. R., and Kaashoek, M. F. 1997. tcc: A system for fast, flexible, and high-level dynamic code generation. In Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation. Las Vegas, NV, 109–121.

Poletto, M. and Sarkar, V. 1998. Linear scan register allocation. ACM Transactions on Programming Languages and Systems. To appear.

Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. 1992. Numerical Recipes in C, Second ed. Cambridge University Press, Cambridge, UK.


Pu, C., Autry, T., Black, A., Consel, C., Cowan, C., Inouye, J., Kethana, L., Walpole, J., and Zhang, K. 1995. Optimistic incremental specialization: streamlining a commercial operating system. In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles. Copper Mountain, CO.

Pu, C., Massalin, H., and Ioannidis, J. 1988. The Synthesis kernel. Computing Systems 1, 1, 11–32.

SPARC International. 1994. SPARC Architecture Manual Version 9. SPARC International, Englewood Cliffs, New Jersey.

Steele Jr., G. 1990. Common Lisp, Second ed. Digital Press, Burlington, MA.

Thekkath, C. A. and Levy, H. M. 1993. Limits to low-latency communication on high-speed networks. ACM Transactions on Computer Systems 11, 2 (May), 179–203.

Veenstra, J. and Fowler, R. 1994. MINT: a front end for efficient simulation of shared-memory multiprocessors. In Modeling and Simulation of Computers and Telecommunications Systems. Durham, NC.

Wall, L., Christiansen, T., and Schwartz, R. 1996. Programming Perl. O'Reilly & Associates, Sebastopol, CA.

Witchel, E. and Rosenblum, M. 1996. Embra: Fast and flexible machine simulation. In Proceedings of ACM SIGMETRICS '96 Conference on Measurement and Modeling of Computer Systems. Philadelphia, PA, 68–79.

Received October 1997; revised May 1998; accepted June 1998.