Bridging the Gap Between Machine and Language Using First-Class Building Blocks
Total Page:16
File Type:pdf, Size:1020Kb
Bridging the Gap between Machine and Language using First-Class Building Blocks Inauguraldissertation der Philosophisch-naturwissenschaftlichen Fakultat¨ der Universitat¨ Bern vorgelegt von Toon Verwaest von Belgien Leiter der Arbeit: Prof. Dr. O. Nierstrasz Institut fur¨ Informatik und angewandte Mathematik Von der Philosophisch-naturwissenschaftlichen Fakultat¨ angenommen. Copyright © 2012 Toon Verwaest. Software Composition Group University of Bern Institute of Computer Science and Applied Mathematics Neubruckstrasse¨ 10 CH-3012 Bern http://scg.unibe.ch/ ISBN: 978-1-105-51835-5 This work is licensed under the Creative Commons Attribution–ShareAlike 3.0 License. The license is available at http://creativecommons.org/licenses/by-sa/3.0/. This dissertation is available at http://scg.unibe.ch. Acknowledgments My adventure in Bern started more than four years ago, in October 2007. This page is too short to capture the gratitude I have towards all those who have contributed in any way to this dissertation. First I’d like to extend my gratitude to Oscar Nierstrasz for supporting me throughout my time at the Software Composition Group. He gave me the freedom to develop my own research project, and provided invaluable sup- port in formulating my thoughts. It is thanks to his well-organized research group that developing this thesis almost seemed effortless. I thank Marcus Denker for his continued support and interest in my work. I very much enjoyed our discussions about research; especially while ex- ploring the Swiss mountains. I’d like to thank him for reviewing this thesis, writing the Koreferat, and accepting to be on the PhD committee. I thank Torsten Braun for accepting to chair the PhD defense. I am especially grateful to my bachelor student Olivier Fluckiger,¨ and mas- ter students Camillo Bruni and David Gurtner for the many weeks we have spent discussing, and programming on Pinocchio. Without your optimism and input this thesis would not have been possible! I am much obliged to my co-authors Camillo Bruni, David Gurtner, Adrian Lienhard, Oscar Nierstrasz, Mircea Lungu, Niko Schwarz, Erwann Wernli, and Olivier Fluckiger¨ for the stressful days (and nights) writing the papers that formed the base of this dissertation. I would like to thank Camillo Bruni, Erwann Wernli, Olivier Fluckiger,¨ and Niko Schwartz for struggling through earlier drafts of this dissertation, and for helping to shape it into what it is today. I thank all former and current members of the Software Composition Group: Marcus Denker, Tudor Gˆırba, Orla Greevy, Adrian Kuhn, Adrian Lienhard, Mircea Lungu, Fabrizio Perin, Lukas Renggli, Jorge Ressia, David Rothlis-¨ berger, Niko Schwarz, and Erwann Wernli. I thank Gabriela Arevalo´ for introducing me to this wonderful group. I am thankful to Therese Schmidt, i and especially to Iris Keller, for taking almost all administrative tasks off my shoulders. I thank my bachelor students Daniel Langone, Camillo Bruni, and master student Sandro De Zanet, for the fun projects we worked on. I am thankful to my Hilfsassistants Camillo Bruni for the Programming Lan- guages lecture, and Raffael Krebs for the Compiler Construction lecture, for greatly reducing my additional workload. I thank my friend Adrian Kuhn for the hours of hacking (and drinking) together, especially during the first year of my PhD. I am grateful to my parents for believing in me, for their support, and espe- cially for the many nice visits. I thank my dear friends Sandro De Zanet, Raffael Krebs, and Camillo Bruni for the many dinners, drinks, games, days of skiing, hikes, etc. You have made my stay in Switzerland a pleasant one! Above all, I’m indebted to Laura Sanchez´ Serrano for joining me in Bern, and for sticking through yet another thesis. I thank her for supporting me, taking over chores, and listening to me complain during the writing phase; but also for the many adventures we have lived together. Toon Verwaest February 20, 2012 ii Abstract High-performance virtual machines (VMs) are increasingly reused for pro- gramming languages for which they were not initially designed. Unfor- tunately, VMs are usually tailored to specific languages, offer only a very limited interface to running applications, and are closed to extensions. As a consequence, extensions required to support new languages often entail the construction of custom VMs, thus impacting reuse, compatibility and performance. Short of building a custom VM, the language designer has to choose between the expressiveness and the performance of the language. In this dissertation we argue that the best way to open the VM is to eliminate it. We present Pinocchio, a natively compiled Smalltalk, in which we identify and reify three basic building blocks for object-oriented languages. First we define a protocol for message passing similar to calling con- ventions, independent of the actual message lookup mechanism. The lookup is provided by a self-supporting runtime library written in Smalltalk and compiled to native code. Since it unifies the meta- and base-level we obtain a metaobject protocol (MOP). Then we decouple the language-level manipulation of state from the machine-level implementation by extending the structural reflective model of the language with object layouts, layout scopes and slots. Finally we reify behavior using AST nodes and first-class interpreters sep- arate from the low-level language implementation. We describe the implementations of all three first-class building blocks. For each of the blocks we provide a series of examples illustrating how they enable typical extensions to the runtime, and we provide benchmarks vali- dating the practicality of the approaches. iii iv Contents 1 Introduction1 1.1 Contributions............................3 1.2 Outline................................4 2 Object-Oriented Building Blocks7 2.1 Communication: Efficient Message Dispatch..........7 2.1.1 Lookup Cache........................8 2.1.2 Virtual Method Tables...................8 2.1.3 Inline Caches........................8 2.1.4 Method Inlining...................... 10 2.1.5 Customization....................... 11 2.2 Data: Object Format and Management.............. 11 2.2.1 Hardware Constraints................... 12 2.2.2 Garbage Collection Constraints.............. 13 2.3 Code: Method Structure...................... 15 2.3.1 AST.............................. 15 2.3.2 Bytecode........................... 16 2.3.3 Threaded Code....................... 17 2.3.4 Native Code......................... 20 2.4 Code: Method Behavior...................... 20 2.4.1 Recursive interpretation.................. 21 2.4.2 Manual stack management................ 21 2.4.3 Stack-mapped context frame............... 22 2.4.4 Register-Based Execution................. 23 2.5 Summary............................... 23 3 Background and Problems 25 3.1 Reflection............................... 25 3.1.1 Discrete and Continuous Behavioral Reflection..... 26 3.1.2 Separation of base and meta-level............ 27 3.1.3 Partial Behavioral Reflection............... 27 3.1.4 High-level reflective API.................. 28 3.1.5 AOP............................. 29 3.2 Generating Tailored Virtual Machines.............. 29 3.3 Problem 1: The Tyranny of a Closed VM............. 30 3.4 Problem 2: Low-level Object Structure.............. 32 v Contents 3.5 Problem 3: Low-Level Execution Model............. 33 3.6 Summary............................... 34 4 First-Class Message Lookup 35 4.1 The Message is the Medium.................... 36 4.2 See Pinocchio Run.......................... 38 4.2.1 Invocation Conventions.................. 39 4.2.2 Lookup and Apply..................... 39 4.2.3 Avoiding Meta-regression................. 41 4.2.4 Separation of Class and Behavior............. 42 4.2.5 Native Compilation.................... 42 4.2.6 Bootstrapping Pinocchio.................. 44 4.2.7 Performance......................... 45 4.2.8 Metrics............................ 46 4.3 Evaluation and applications.................... 46 4.3.1 Reconfiguring the meta-level............... 47 4.3.2 Changing lookup semantics................ 49 4.4 Discussion.............................. 49 4.5 Related Work............................ 52 4.6 Summary............................... 53 5 First-Class Object Layouts 55 5.1 Flexible Object Layouts in a Nutshell............... 56 5.2 First-Class Slots........................... 58 5.2.1 Primitive Slots........................ 59 5.2.2 Customized Slots...................... 60 5.2.3 Virtual Slots......................... 65 5.2.4 Implementation and Performance............ 66 5.3 First-Class Layout Scopes..................... 67 5.3.1 Bit Field Layout Scope................... 67 5.3.2 Property Layout Scope................... 69 5.4 Stateful Traits............................ 72 5.4.1 Traits............................. 73 5.4.2 Stateful Traits........................ 74 5.4.3 Installing State of Stateful Traits............. 74 5.4.4 Installing Behavior of Stateful Traits........... 76 5.5 Inspecting Objects.......................... 76 5.6 Dynamic Structure Modifications................. 77 5.6.1 Modification Model.................... 77 5.6.2 Building and Installing................... 78 5.6.3 Pharo Class Installer.................... 79 5.6.4 Metrics............................ 80 5.7 Related Work............................ 80 5.7.1 Language Extensions.................... 81 5.7.2 Meta Modeling....................... 81 5.7.3 Annotations......................... 82 vi 5.7.4 Object Relational Mapping...............