Language and Compiler Support for Dynamic Code Generation by Massimiliano A

Language and Compiler Support for Dynamic Code Generation by Massimiliano A. Poletto S.B., Massachusetts Institute of Technology (1995) M.Eng., Massachusetts Institute of Technology (1995) Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY September 1999 © Massachusetts Institute of Technology 1999. All rights reserved. A u th or ............................................................................ Department of Electrical Engineering and Computer Science June 23, 1999 Certified by...............,. ...*V .,., . .* N . .. .*. *.* . -. *.... M. Frans Kaashoek Associate Pro essor of Electrical Engineering and Computer Science Thesis Supervisor A ccepted by ................ ..... ............ ............................. Arthur C. Smith Chairman, Departmental CommitteA on Graduate Students me 2 Language and Compiler Support for Dynamic Code Generation by Massimiliano A. Poletto Submitted to the Department of Electrical Engineering and Computer Science on June 23, 1999, in partial fulfillment of the requirements for the degree of Doctor of Philosophy Abstract Dynamic code generation, also called run-time code generation or dynamic compilation, is the creation of executable code for an application while that application is running. Dynamic compilation can significantly improve the performance of software by giving the compiler access to run-time infor- mation that is not available to a traditional static compiler. A well-designed programming interface to dynamic compilation can also simplify the creation of important classes of computer programs. Until recently, however, no system combined efficient dynamic generation of high-performance code with a powerful and portable language interface. This thesis describes a system that meets these requirements, and discusses several applications of dynamic compilation. First, the thesis presents 'C (pronounced "Tick-C"), an extension to ANSI C that allows the programmer to express and manipulate C code fragments that should be compiled at run time. 'C is currently the only language to provide an imperative programming interface suitable for high- performance dynamic compilation. It can be used both as an optimization, to improve performance relative to static C code, and as a software engineering tool, to simplify software that involves compilation and interpretation. Second, the thesis proposes several algorithms for fast but effective dynamic compilation, including fast register allocation, data flow analysis, and peephole optimization. It introduces a set of dynamic compilation benchmarks, and uses them to evaluate these algorithms both in isolation and as part of an entire 'C implementation, tcc. Dynamic 'C code can be significantly faster than equivalent static code compiled by an optimizing C compiler: two- to four-fold speedups are common on the benchmarks in this thesis. Furthermore, tcc generates code efficiently-at an overhead of about 100 to 600 cycles per generated instruction-which makes dynamic compilation practical in many situations. Finally, the thesis establishes a taxonomy of dynamic optimizations. It highlights strategies for improving code with dynamic compilation, discusses their limits, and illustrates their use in several application areas. Thesis Supervisor: M. Frans Kaashoek Title: Associate Professor of Electrical Engineering and Computer Science 3 4 Acknowledgments This thesis would not exist without Dawson Engler and Frans Kaashoek. Dawson was the main designer of the 'C language, and his enthusiasm for dynamic code generation sparked my interest in this area. Throughout grad school he was a source of energy, ideas, and intellectual curiosity. I am also happy and grateful to have worked with Frans. He has been an excellent advisor-friendly, encouraging, and enthusiastic. I envy his energy and unshakeable optimism. He deserves all the credit for convincing me to finish this thesis when I was nauseated and ready to abandon it all in favor of new projects. I also thank Wilson Hsieh, Vivek Sarkar, and Chris Fraser. Wilson was a great help at many points in my 'C work, and has given me friendship and good advice even though I once ate his ivy plant. I much enjoyed working with Vivek, and thank him for the opportunity to be supported by an IBM Cooperative Fellowship. Chris shared my enthusiasm for the Olympic Mountains, and his minimalist aesthetic continues to influence my thinking about engineering. My thanks to all of the PDOS group for providing an excellent (if sometimes abrasive) work environment over the last five years. It was fun and a little humbling to interact with everyone, from Anthony and Debby when I first started, to JJ, Chuck, and Benjie now that I'm almost done. Neena made bureaucracy easy, and valiantly tried to keep my office plants alive: my evil dark thumb eventually vanquished her good green thumb, and by this I am deeply saddened. A special thank-you to my two long-time officemates, David and Eddie, who put up with my moods and taught me more than most about computers and about myself, and also to Emmett, whose humor helped to keep me out of the bummer tent. I would have quit grad school many crises ago without the support and love of Rosalba. I hope to have given back a fraction of what I have received. Infine, di nuovo, un grazie di cuore ai miei genitori per tutto. Sometimes a scream is better than a thesis. Ralph Waldo Emerson The 'C language was originally designed by Engler et al. [23]. I was involved in later modifications and extensions to the language. Several parts of this thesis, including much of Chapters 2, 3, and 4, are adapted from Poletto et al. [57]. Section 5.2 is a revision of text originally published in Poletto and Sarkar [58]. I was supported in part by the Advanced Research Projects Agency under contracts N00014-94-1-0985, N66001- 96-C-8522, and F30602-97-2-0288, by a NSF National Young Investigator Award to Frans Kaashoek, and by an IBM Cooperative Fellowship. 5 6 Contents 1 Introduction 15 1.1 A n Exam ple . .. ... ... ... .. ... ... .. ... ... ... .. ... ... .. 15 1.2 Contributions .. ... .. ... .. ... .. ... .. ... .. .. - . .. .. 17 1.3 Issues in Dynamic Compilation . ... .. ... .. ... .. ... .. ... .. ... 18 1.3.1 Implementation Efficiency . .. ... .. ... ... .. ... .. ... .. ... 18 1.3.2 Interface Expressiveness .. ... .. ... .. ... .. ... .. ... .. ... 20 1.4 R elated W ork . ... .. ... .. ... .. ... .. ... .. ... .. ... .. .. .. 21 1.4.1 Ease of Use and Programmer Control . ... ... .. ... .. ... .. ... 21 1.4.2 Compilation Overhead and Programmer Control .. ... .. ... .. ... 23 1.5 O utline .. ... .. ... .. ... .. .. ... .. ... .. ... .. .. ... .. .. 25 2 Language Design 27 2.1 The 'C Language ...................................... 27 2.1.1 The Backquote Operator ... .. ... .. ... ... .. ... .. ... .. 28 2.1.2 Type Constructors .. .. .. ... .. .. ... .. ... .. .. ... .. .. 29 2.1.3 Unquoting Operators. .. ... .. ... .. .. ... .. .. ... .. .. .. 30 2.1.4 Special Forms .. ... .. ... .. ... .. ... ... .. ... .. ... .. 31 2.2 Subtleties and Pitfalls . ... .. ... .. ... .. ... .. ... .. ... .. ... 33 2.2.1 The Dynamic Compilation Model. .. .. ... ... .. ... .. ... .. .. 33 2.2.2 Unquoting . .. .. ... .. .. ... .. .. ... .. .. ... .. .. ... .. 34 2.2.3 Interactions with C . .. ... .. ... .. ... ... .. ... .. ... .. .. 35 2.3 Potential Extensions to 'C . ... .. ... .. ... .. ... .. ... .. ... .. .. 37 2.3.1 Syntactic Sugar for Dynamic Code ... .. .. ... .. .. ... .. .. ... 37 2.3.2 Declarative Extensions .. .. .. ... .. .. ... .. .. ... .. .. ... 38 2.3.3 To Add or Not to Add? .. .. ... .. .. .. .. .. ... .. .. .. .. .. 40 3 Applications 43 3.1 A Taxonomy of Dynamic Optimizations . .. .. .. ... .. .. .. .. ... .. .. 43 3.1.1 Specialization to Data Values ... .. .. .. .. ... .. .. .. .. .. ... 44 3.1.2 Specialization to Data Types .. .. ... .. .. .. .. .. ... .. .. .. 50 3.1.3 Elimination of Call Overhead .. .. ... .. .. ... .. .. .. ... .. .. 53 3.1.4 Elimination of Interpretation Overhead . ... .. .. .. .. .. ... .. .. 54 3.2 Limits of Dynamic Compilation . ... .. ... .. ... .. ... .. ... .. ... 58 4 Implementation and Performance 59 4.1 Architecture . .. ....................... .. .. .. .. .. ... .. .*. ... 59 4.2 The Dynamic Compilation Process .. .. .. ... .. .. .. ... .. .. ... .. 60 4.2.1 Static Compile Time .. .. .. .. .. .. ... .. .. .. .. .. ... .. .. 60 4.2.2 Run Time .. .. .. .. ... .. .. .. .. .. ... .. .. .. .. .. ... 62 4.3 Run-Time Systems .. ... .. .. .. .. .. .. ... .. .. .. .. .. .. .. ... 64 4.3.1 V C O D E . .. .. .. .. .. .. .. .. .. .. ... .. .. .. .. .. .. .. 64 4.3.2 ICODE .. .. .. .. .. .. .. .. .. .. .. .. ... .. ... .. 64 4.4 Evaluation Methodology .. .. .. .. .. .. .. .. .. .. - - . .. .. .. 66 7 4.4.1 Benchmark-Specific Details . .... ... 66 4.4.2 Measurement Infrastructure ... ... .... .... ... .... ... .... 67 4.4.3 Comparison of Static and Dynamic Code ... .... ... .... .... .. 67 4.5 Performance Evaluation . ... ... .... ... ... .... ... ... .... ... 67 4.5.1 Performance ... ... ... ... ... ... .... ... ... ... ... .. 68 4.5.2 Analysis .... .... .... .... .... .... .... .... ..... .. 71 4.6 Summary ... .... .... .... ..... .... .... .... ..... .... 72 5 Register Allocation 75 5.1 One-Pass Register Allocation ... .... ... .... .... ... .... .... .. 75 5.2 Linear Scan Register Allocation ... .... .... .... ... .... ...

Language and Compiler Support for Dynamic Code Generation by Massimiliano A

Metaheuristics1

Lecture 4 Dynamic Programming

The LLVM Instruction Set and Compilation Strategy

Dynamic Programming Via Static Incrementalization 1 Introduction

Automatic Code Generation Using Dynamic Programming Techniques

9. Optimization

1 Lecture 25

Dynamic Extension of Typed Functional Languages

Comparative Studies of Programming Languages; Course Lecture Notes

Control Flow Analysis in Scheme

Value Numbering

Bohm2013.Pdf (3.003Mb)