OPTIMIZING SOFTWARE–HARDWARE INTERPLAY IN EFFICIENT VIRTUAL MACHINES

by Gregory B. Prokopski

School of Computer Science McGill University, Montreal

February 2009

A thesis submitted to McGill University in partial fulfillment of the requirements of the degree of Doctor of Philosophy

Copyright 2009 by Gregory B. Prokopski


Abstract

To achieve the best performance, most computer languages are compiled, either ahead of time and statically, or dynamically during runtime by means of a Just-in-Time (JIT) compiler. Optimizing compilers are complex, however, and for many languages such as Ruby, Python, PHP, etc., an interpreter-based Virtual Machine (VM) offers a more flexible and portable implementation method, and moreover represents an acceptable trade-off between runtime performance and development costs. VM performance is typically maximized by use of the basic direct-threading interpretation technique which, unfortunately, interacts poorly with modern branch predictors. More advanced techniques, like code-copying, have been proposed [RS96, PR98, EG03c, Gag02] but have remained practically infeasible due to important safety concerns. On this basis we developed two cost-efficient, well-performing solutions. First, we designed a C/C++ language extension that allows programmers to express the need for the special safety guarantees of code-copying. Our low-maintenance approach is designed as an extension to a highly-optimizing, industry-standard GNU C Compiler (GCC), and we apply it to Java, OCaml, and Ruby interpreters. We tested the performance of these VMs on several architectures, gathering extensive analysis data for both software and hardware performance. Significant improvement is possible, 2.81 times average speedup for OCaml and 1.44 for Java on Intel 32-bit, but varies by VM and platform. We provide detailed analysis and design guidelines to help developers predict and evaluate the benefit provided by safe code-copying. In our second approach we focused on alleviating the limited scope of optimizations in code-copying with an ahead-of-time (AOT) approach. A source-based

approach to grouping instructions together allows for more extensive cross-bytecode optimizations, and so we develop a caching compilation server that specializes interpreter source code to a given input application. Although this requires AOT profiling, it further improves performance over code-copying, 27% on average for OCaml and 4-5% on selected Java benchmarks. This thesis work provides designs for low-maintenance, high-efficiency VMs, and also demonstrates the large performance improvement potentially enabled by tailoring a language implementation to modern hardware. Our designs are both based on understanding the impact of lower-level components on VM behavior. By optimizing the software-hardware interplay we thus show significant speed-up is possible with very minimal VM and compiler development costs.

Résumé

Pour avoir une meilleure performance, la plupart des langages de programmation sont compilés, soit avant leur exécution et statiquement, ou dynamiquement, pendant leur utilisation, à l'aide d'un compilateur "Just-in-Time" (JIT). Cependant, les compilateurs avec des fonctionnalités d'optimisation sont complexes, et plusieurs langages, tel que Ruby, Python, PHP, profitent mieux d'une solution flexible et portable tel qu'une machine virtuelle (MV) interprétée. Cette solution offre un échange acceptable entre la performance d'exécution et les coûts de développement. La performance de la MV est typiquement maximisée par l'utilisation de la technique d'interprétation "direct threading", qui, malheureusement, interagit mal avec les prédicteurs de branches moderne. Des techniques plus avancées, tel que "code-copying", ont été proposées [RS96, PR98, EG03c, Gag02], mais ne sont pas applicable en pratique à cause de préoccupation de sécurité. C'est sur les bases suivantes que nous avons développé deux solutions coût-efficace qui offrent une bonne performance. Premièrement, nous avons développé une extension au langage C qui permet aux programmeurs d'exprimer le besoin pour des garanties spéciales pour la technique de "code-copying". Notre technique, qui requiert très peu de maintenance, est développée comme une extension à un compilateur qui a non seulement des fonctionnalités d'optimisation très élaborées mais qui est aussi un standard d'industrie, le "GNU C Compiler" (GCC). Nous pouvons alors appliquer cette technique sur les interpréteurs Java, OCaml et Ruby. Nous avons évalué la performance de ces MV sur plusieurs architectures, en collectionnant de l'information pour analyser la performance logiciel et matériel. La marge d'amélioration possible est très grande, une accélération d'ordre 2.81 pour OCaml et 1.44 pour Java sur l'architecture Intel 32-bit. Il est important de

noter que cette marge est différente selon les MV et les architectures. Nous fournissons une analyse détaillée et des normes de développement pour aider les programmeurs à prédire et évaluer les bénéfices possibles d'une utilisation sécuritaire de la technique de "code-copying". Notre deuxième solution vise à réduire les limitations sur la portée des optimisations de la technique de "code-copying" basée sur une méthode "ahead-of-time" (AOT). Une méthode basée sur le code source qui regroupe les instructions de bytecode permet des optimisations cross-bytecode plus élaborées. Nous avons donc développé un serveur de compilation à mémoire cache qui se spécialise dans l'interprétation de code source donné comme entrée à une application. Quoique cette solution nécessite un profilage AOT, elle améliore les performances de la technique de "code-copying" par 27% pour le langage OCaml et par 4-5% sur certain tests de performance Java. Cet ouvrage fournis des modèles pour des MV nécessitant peu de maintenance et hautement performant. De plus, il démontre le grand potentiel d'amélioration possible avec des techniques qui personnalisent l'implémentation selon le matériel. Tous nos modèles sont basés sur la compréhension des impacts des composantes de bas-niveau sur le comportement de la MV. En optimisant les interactions entre le matériel et le logiciel, nous démontrons que des améliorations importantes sont possible avec très peu de coût de développement pour le compilateur et la MV.

Acknowledgments

This journey and this thesis would not have been possible without the help of many people. First and foremost, I would like to thank my adviser, Professor Clark Verbrugge, for his guidance, patience, enthusiasm in all kinds of topics, the drive to focus our efforts on what is truly important, and for his financial support. I wish to thank Professor Laurie Hendren for her excellent teaching in the field of compilers, and for the faith that she and Professor Clark Verbrugge had in me in difficult times. It is thanks to Professor Etienne M. Gagnon that I had the opportunity to pursue studies at McGill University, and I am thankful for his support in personal and financial matters, and for seeding the initial idea of this thesis. I owe special thanks to my dear friend Ladan Mahabadi for her brutal honesty and ability to listen. I will always remember the help, warmth, and great laughs we shared with many members of the administrative and system staff of the School of Computer Science. I would like to thank my dear Liza for her companionship, care, and invigorating energy. I am grateful to my parents, who always encouraged my passion for technology, and my brave sister, Krystyna, who unknowingly kept reminding me that a balanced, happy life can be created in many different ways, one for each of us.

This research was supported in part by NSERC and FQRNT.

Contents

Abstract ...... i

Résumé ...... iii

Acknowledgments ...... v

Contents ...... vi

List of Figures ...... ix

List of Tables ...... xiii

1 Introduction ...... 1
  1.1 Virtual Machines ...... 2
  1.2 Efficiency of Software-hardware Interplay ...... 4
  1.3 Contributions ...... 8
  1.4 Thesis Outline ...... 9

2 Background ...... 11
  2.1 Hardware Architectures ...... 11
  2.2 Virtual Machines ...... 14
  2.3 3 Kinds of Interpreters ...... 15
  2.4 Code-copying Technique ...... 17
    2.4.1 Mechanism of Code-copying ...... 18
    2.4.2 Safety Concerns ...... 19
  2.5 Source code comparison ...... 20
  2.6 AOT-based Solution and Runtime Traces ...... 21
  2.7 Virtual Machines, Benchmarks, and Machines Used ...... 22

3 Related Work ...... 26
  3.1 Interpreters ...... 27
  3.2 Compilation techniques ...... 29
  3.3 Virtual Machines ...... 31
  3.4 Compilation Servers and Resource Sharing ...... 32

4 Static Compiler (GCC) Enhancement ...... 34
  4.1 Problem of Compiling for Code-copying ...... 35
  4.2 Architecture of a Compiler ...... 37
  4.3 Design ...... 39
    4.3.1 Generation of safely copyable code ...... 40
    4.3.2 GCC modifications ...... 41
  4.4 Experimental Results ...... 53
  4.5 Notes on Related Work ...... 55
  4.6 Conclusions ...... 56

5 Application to Multiple Virtual Machines, Multiple Architectures ...... 58
  5.1 Application to Java, OCaml and Ruby ...... 59
  5.2 Execution Properties ...... 60
    5.2.1 Architectures and Virtual Machines ...... 61
    5.2.2 VM and Language Characteristics ...... 61
    5.2.3 Analysis ...... 65
  5.3 Code-Copying Results ...... 68
    5.3.1 Profiling ...... 69
    5.3.2 Dynamic behavior of copied code ...... 73
    5.3.3 General Findings ...... 78
    5.3.4 Overhead ...... 92
    5.3.5 Further Performance Factors ...... 94
  5.4 Notes on Related Work ...... 97
  5.5 Conclusions ...... 98

6 Specialization with Caching Compilation Server ...... 101
  6.1 System architecture ...... 103
  6.2 The source of performance improvement ...... 105
  6.3 Creation of the source code ...... 109
    6.3.1 Separate per-instruction source code ...... 110
    6.3.2 The profiling subsystem ...... 110
    6.3.3 Choice and creation of superinstructions ...... 113
  6.4 Specialized optimization ...... 114
  6.5 Compilation of the optimized VM ...... 118
  6.6 Performance results and metrics ...... 119
    6.6.1 Generated code comparison ...... 123
    6.6.2 Quick Comparisons of VMs Performance ...... 126
    6.6.3 Compilation Overhead and Break-even ...... 127
  6.7 Notes on related work ...... 127
  6.8 Conclusions and future directions ...... 129

7 Conclusions and Future Work ...... 131
  7.1 Conclusions ...... 131
  7.2 Future Work ...... 133

List of Figures

1.1 A Virtual Machine is the intermediate layer that can be compared to glue joining an application written in a programming language with the hardware it is executed on. ...... 2
1.2 The taxonomy of virtual machines execution techniques. ...... 3
1.3 Our goal is to achieve better runtime performance while keeping the costs of initial development and maintenance of a virtual machine low. ...... 5

2.1 Virtual Machine is the intermediate layer that translates architecture-independent bytecode into operations directly executable on a target architecture. ...... 12
2.2 Switch interpreter mechanism and its overhead sources. ...... 15
2.3 A comparison of code execution in a traditional, direct-threaded (left) vs. a code-copying (right) code interpreter. Each non-dotted arrow is a jump; the total number is reduced by code-copying and the existing jumps are more predictable, mostly due to multiple copies of the same instruction. ...... 18
2.4 Addressing the effective code of LCMP instruction in a) switch-threaded, b) direct-threaded, and c) code-copying interpreter. ...... 21

4.1 Compiler optimizations can relocate less likely executed code to the outside of labels bracketing code used by code-copying. ...... 35
4.2 Execution of a superinstruction containing a code chunk with a missing part or a call using relative addressing might cause a VM crash. ...... 36
4.3 A largely simplified view of a C/C++ compiler based on GCC internal structure. ...... 38
4.4 To produce copyable code with minimal changes to the internal structure of the compiler we inserted several well isolated passes. ...... 42
4.5 Pragma directives are placed around the code that will be used by the code-copying engine at runtime. ...... 43
4.6 At an early compilation stage volatile statements are automatically inserted by the compiler around the end label to ensure that the target basic block will remain intact throughout optimizations. ...... 45
4.7 To ensure absolute addressing a goto to outside of a copyable area is replaced with a specially crafted computed goto. ...... 46
4.8 Initial marking of basic blocks right after parsing. ...... 48
4.9 From the marking of only two basic blocks, start and target, the complete marking can be restored by following the edges of the control flow graph. Once the marking is restored it is possible to rearrange the basic blocks of a marked copyable area. ...... 50
4.10 Performance comparison of SableVM with standard direct-threaded engine, unsafe code-copying engine and safe code-copying engine using the GCC copyable-code enhancement. ...... 53
4.11 Metrics of code modified and added to GCC. ...... 54

5.1 Percentage of time spent in the interpreter loop of the direct-threaded interpreters for x86_64, and of loaded bytecodes that were potentially copyable, for a) SableVM, b) OCaml, and c) Yarv. ...... 63
5.2 Performance (speedup) results for a) SableVM (Java), b) OCaml, and c) Yarv (Ruby). Note in this figure we measured the overall VM performance, as opposed to other results in this chapter that measure VM behavior only within the interpreter function. ...... 79
5.3 Average lengths (in bytecodes) of executed superinstructions for a) SableVM, b) OCaml, c) Yarv. ...... 80
5.4 Hardware counter results for SableVM JVM running the SPEC compress benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles. ...... 84
5.5 Hardware counter results for SableVM JVM running the SPEC jack benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles. ...... 85
5.6 Hardware counter results for OCaml running the quicksort.fast benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles. ...... 86
5.7 Hardware counter results for OCaml running the almabench benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles. ...... 87
5.8 Hardware counter results for Yarv Ruby VM running the nsieve benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles. ...... 88
5.9 Hardware counter results for Yarv Ruby VM running the pentomino benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles. ...... 89
5.10 Overhead of VMs with pragmas inserted around bytecode instructions used by code-copying and with copied code prepared but not executed, over standard direct-threaded VMs. Part a) shows OCaml, b) Yarv, and c) averages. ...... 93

6.1 Overview of the specialized compilation server using the enhanced GCC with two extended virtual machines. ...... 102
6.2 Order of operations initiated by a virtual machine cooperating with our compilation server. ...... 106
6.3 One of the obvious expected advantages of inter-bytecode optimization was the removal of unnecessary store and read operations for temporary values. ...... 107
6.4 Profiling of superinstruction binary code can be turned on and off by simple memcpy and memmove operations, given that the necessary memory space has been allocated in advance. ...... 112
6.5 Source code optimization for stack accesses. One write and one read are eliminated. Others can be optimized by a regular C compiler. An actual hot superinstruction used by the SPEC compress benchmark. ...... 115
6.6 Summary of the results achieved with the compilation server by SableVM (Java). Y-axis values have been rescaled so as to allow for comparison of several metrics. ...... 120
6.7 Summary of the results achieved with the compilation server by the OCaml VM. Y-axis values have been rescaled so as to allow for comparison of several metrics. ...... 121
6.8 Assembly of an OCaml superinstruction (ACC4 + PUSHACC3 + LTINT + BRANCHIFNOT) constructed by a) code-copying, b) sources concatenation, c) sources concatenation with no #pragmas. ...... 125

List of Tables

5.1 Average size of an instruction (in lines of code) and fraction of copyable instructions in the 3 considered VMs. For SableVM the number in parenthesis is the number of lines after m4 macro expansion of its interpreter loop source code. Also, for the SableVM interpreter loop LOC count we did not include instruction initialization code which, in this particular VM, happens to be interleaved with instruction code. For Yarv the best performance was obtained when the maximum size of copyable instructions was limited to 100-150 bytes, lowering the number of copyable instructions from a potential 71 to the 61 actually used. ...... 62
5.2 Average speedups of total execution time of code-copying over direct-threading measured across virtual machines and architectures running all used benchmarks. Note this table presents the overall VM performance, as opposed to other results in this chapter that measure VM behavior only within the interpreter function. ...... 73
5.3 Dynamic bytecode execution metrics for SableVM (Java) on x86_64. ...... 75
5.4 Dynamic bytecode execution metrics for OCaml on x86_64. ...... 76
5.5 Dynamic bytecode execution metrics for Yarv (Ruby) on x86_64. ...... 77
5.6 Direct-threaded and code-copying implementation branch misprediction percentages across architectures, for the best and worst performing benchmark on each VM. For ia32 and x86_64 the numbers represent the ratio of mispredicted branches to total branches, and for PowerPC the numbers represent the ratio of instruction pipeline flushes due to branch misprediction to total branches. ...... 78
5.7 Absolute runtimes (in seconds) of Java benchmarks presented in Figure 5.2 for the direct-threaded engine of SableVM. ...... 81
5.8 Absolute runtimes (in seconds) of OCaml benchmarks presented in Figure 5.2 for the direct-threaded engine of OCaml. ...... 82
5.9 Absolute runtimes (in seconds) of Ruby benchmarks presented in Figure 5.2 for the direct-threaded engine of Yarv. ...... 82
5.10 Average branch misprediction percentages across benchmarks for ia32 on each VM, direct-threaded and code-copying. ...... 83

6.1 Two cases of stack access optimization from the SPEC-mpegaudio benchmark. a) Removes 3 reads and 3 writes to the stack. b) Removes 4 reads and 4 writes to the stack. Others can be optimized by a regular C compiler. Removed read-write pairs are marked with an index number for each pair. ...... 116
6.2 Correlation between the speedup achieved by using source-optimized superinstructions and the values of selected metrics from Figures 6.6 and 6.7. If any global correlation actually existed it would be visible on all architectures (horizontally). Stronger correlations are marked with a double underline. Weaker correlations are marked with a single underline. ...... 123
6.3 Comparison of the relative performance of a single benchmark (nsieve) on both OCaml and Java VMs and different execution engines. ...... 126
6.4 Compilation times (overhead), average speedups and resulting break-even runtime of optimized VMs. ...... 126

Chapter 1

Introduction

Virtual machines are used as a target compilation architecture by many languages. The most widely known example is Java, but the same is true of OCaml, Ruby, Python, PHP, Perl6, Forth, and many others. In our work we are concerned with improving the efficiency of virtual machines, understood as the return on investment in their development and maintenance, measured in terms of the resulting performance. To this end we choose, from a number of available VM engine solutions, a simple but well-performing technique known as code-copying. We develop the C/C++ compiler support this technique needs to become a practical solution, we test its application to multiple programming languages and virtual machines on a variety of hardware, and finally we use it as the basis for further VM optimization using code specialization, caching, and ahead-of-time compilation. By understanding, at each step, where performance improvement is possible in the interaction between software and hardware, we develop our system in the direction of increased performance without incurring large development and maintenance costs. Below we first present the concept of virtual machines, discuss the efficiency of software-hardware interplay in the context of our work, and finally present the contributions and outline of the entire thesis.



Figure 1.1: A Virtual Machine is the intermediate layer that can be compared to glue joining an application written in a programming language with the hardware it is executed on.

1.1 Virtual Machines

Virtual machines (VMs) are the intermediate layer between the actual hardware and an application written in a particular programming language. As illustrated in Figure 1.1, a VM is the layer that mediates, with the help of an operating system, between the hardware and the application written in a particular programming language. Over the last two decades virtual machines have gained popularity because they enable a number of possibilities that are either not available or more difficult to exploit in more traditional solutions:

• Independence from a particular hardware and operating system platform, allowing programmers to use cleaner interfaces and more universal libraries, thus lowering development and maintenance costs.

• Transparent runtime optimization for a particular platform using all hardware and operating system-specific elements, e.g. using CPU instructions available only on a particular CPU model.

• Transparent runtime optimization based on the behavior of a particular application, e.g. by focusing the optimization efforts on the most often executed code (a.k.a. hot spots) [Sun, IBM].


[Figure 1.2 diagram: a Virtual Machine executes code either with an interpreter (switch-threaded, direct-threaded, or code-copying) or with a compiler (just-in-time or ahead-of-time).]

Figure 1.2: The taxonomy of virtual machines execution techniques.


• Flexibility in experimentation with new language designs and extensions to existing languages, e.g. the introduction of Java generics [NW06] required no changes to the VM, nor did Jython [Jyt] which allows for compilation and execution of Python programs on the Java platform.

• Implementation of security policies, e.g. Java web applets are run inside a sand- box environment with limited privileges.

A virtual machine is often a complex application charged with many tasks, such as: efficient execution of bytecode (the main priority), memory management (often employing an automated garbage collector), concurrency (threading) support, dynamic code loading, exposing OS functionality, and facilitating access to language libraries. Each language targeting a virtual machine uses a virtual assembly, usually called bytecode, to encode mostly simple operations performed on the virtual machine. The choice of the operations represented by the bytecodes and the construction of a virtual machine differ for each language and, as we will show later in this work, have a profound impact on the VM design and runtime behavior.


Because of the differences between languages, the resources available to developers, and their priorities, there exist a number of approaches to constructing virtual machines and, in particular, the bytecode execution engines they employ. Figure 1.2 shows a rough taxonomy of the different kinds of execution techniques used by virtual machines. We will be looking at their specifics in the next chapter, in Section 2.3. Some VMs emulate the fetch-decode-execute cycle for each bytecode (interpreter-based), which provides reasonable performance at a very low cost. Other VMs first translate (compile) a larger number of bytecodes (e.g. a whole method, function, or whole application) into the underlying hardware language and then execute it (compiler-based). The compilation can take place either ahead-of-time (AOT), before the VM starts executing an application, or just-in-time (JIT), while an application is being executed. Mixed designs are also popular [SYK+05, MK00]. Building and maintaining a highly-optimizing compiler for a language is a very time-consuming and overall extremely costly process, and advanced research techniques such as code-copying offer significant promise by providing a cheap development route to better performance. Overall, we can view the different approaches to bytecode execution on a continuum, where we trade off performance for lower development and maintenance costs.

1.2 Efficiency of Software-hardware Interplay

Heuristically, development cost and performance are directly related. As a rough measure, then, we can define VM efficiency as the ratio between the resulting performance and the costs of initial development and maintenance:

    VM efficiency = runtime performance / (initial development cost + maintenance cost)

In this sense an optimization is any change to a system that improves VM efficiency, as per the above definition. This is motivated by the fact that for many environments performance remains important, but the development and maintenance costs of an optimizing compiler may be outweighed by the simplicity and rapid development time of an interpreter-based VM.


[Figure 1.3 diagram: execution designs plotted by cost (initial and maintenance) against runtime performance, from interpreters and non-optimizing JITs at the low-efficiency end, through our work, to optimizing JITs & AOTs at the high-efficiency end.]

Figure 1.3: Our goal is to achieve better runtime performance while keeping the costs of initial development and maintenance of a virtual machine low.

Our work here approaches optimization from this perspective. We take two broad views on improving VM efficiency, one based on extending a known runtime optimization technique, code-copying, with important and practical safety guarantees, and one which investigates the extent to which a simple ahead-of-time technique can be used to further improve performance at low cost. These two directions are intended to complement each other, showing it is possible to develop cost-efficient, well-performing solutions that come close to bridging the gap between interpreter and optimizing compiler-based virtual machines. Figure 1.3 shows the relative positioning of our work in relation to the VM efficiency of various execution designs. Motivation for the two major improvements of our work includes:

1. Code-copying is a promising solution that provides performance better than interpreters and non-optimizing JITs while remaining simple and cheap to implement. Code-copying has been found to be very efficient in a variety of environments [RS96, PR98, Gag02, ETK06]. For example, early work in Java showed it offers almost 2 times better performance than the most popular direct-threading technique, and as we show, almost 3 times better performance for OCaml.


2. AOT and optimized code solutions can offer better efficiency if development costs can be reduced. In our solution we make use of an existing, highly-optimizing static compiler, GCC, as the main part of a specialized caching compilation server. We create a system that operates hands-free, where the source code of the VM interpreter loop is optimized for the execution of the most often occurring sequences of bytecodes, which in code-copying were previously executed unoptimized. This composition of several tools and techniques provides performance even better than code-copying, over 4% average improvement on selected benchmarks for Java, and 27% average improvement for OCaml.

Below we give further motivation and detail on each of these two approaches. We organize our work around three milestones in this respect: 1) ensuring safety in code copying, 2) analyzing code-copying performance and determining relevant factors that guide its best use, and 3) extending performance further through the low-cost compilation server design.

Prerequisite of safety

Code-copying is a very attractive solution because of its simplicity and performance. Although interpreter-based, it does create code dynamically and can be thought of as a partial JIT solution (as shown in Figure 1.2). While we will describe the specifics of code-copying later, we want to make the point that code-copying, unfortunately, comes with one important, nearly fatal issue.

With currently available tools it is almost impossible to guarantee the execution safety of code created by code-copying, especially across the variety of compilers and hardware architectures. Without such a guarantee code-copying offers outstanding and cheap performance that is useless in practice.

Code-copying makes strong assumptions about the underlying compiler behavior that are not always true for modern compilers and machines. Primarily for this reason, to the best of our knowledge, there exist no production systems that use this technique,

despite its simplicity and performance advantages. This motivates the first of the 3 milestones presented in this work, described in Chapter 4, where we focus on ensuring the required conservative correctness in code optimizations and transformations. We handle the safety issue by extending the C/C++ standard and implementing this extension in a highly-optimizing, industry-standard GNU C Compiler (GCC). Despite the unusual nature of our changes we manage to follow the best compiler design practices and ensure long-term maintainability of our modifications within the GCC framework.

Broad and detailed analysis and prognosis of code-copying

The 3 interpreter-based techniques mentioned previously, in the order of increasing performance: switch-threading, direct-threading, and code-copying, perform differently largely because of their interactions with the underlying hardware. It is important to fully understand where the improvement comes from and why, in order to choose an appropriate technique.

Not all virtual machines, programming languages, and hardware architectures benefit equally from the use of code-copying. Before deciding whether code-copying is the right solution we want to be able to understand the reasons behind the expected future performance and be able to assess it before implementing.

This can only be achieved by gaining a broader and deeper view into how programming language features and virtual machine architectures influence the performance of code-copying on various hardware architectures. This issue is the core of the second of the 3 milestones presented in this thesis. This milestone is described in Chapter 5, where we apply the code-copying technique to virtual machines of 3 very different languages (Java, OCaml, and Ruby), on 3 different architectures (Intel 32-bit, x86_64, and PowerPC 64-bit), and perform extensive experiments to measure and analyze both the language execution properties and the interactions of each virtual machine with different hardware architectures.


Further optimization with a compilation server

A natural step to further improve VM performance, beyond what code-copying is able to offer, is to optimize larger portions of code.

The simplicity of code-copying, as well as its conservative correctness requirements, means that some opportunities for further optimization are missed. Some of them are already partially handled in hardware, by the use of caches and other means to minimize the incurred penalties, but it is much more advantageous to perform these optimizations in software. For applications that are run repeatedly the overhead of these missed optimizations accumulates.

Building yet another specialized optimizing compiler would mean incurring a very significant cost, thus, in our opinion, lowering the efficiency of the system. In our third and last milestone, described in Chapter 6, we propose a solution based on a caching compilation server that improves the performance of VMs for repeated execution scenarios. We extend two virtual machines (for Java and OCaml) that already support code-copying to use this server to create specialized versions of the virtual machines, optimized for a particular application. In the server we employ an existing highly-optimizing static compiler, GCC. With this AOT-based approach we are able to further improve performance (about 2-5% improvement over code-copying on Java and 27% on OCaml) while maintaining a good balance between the resulting performance and the overall costs of the solution, as well as ensuring an efficient, portable design.

1.3 Contributions

With our work we make the following specific contributions:

• We develop a safe and practical code-copying technique appropriate for a high-performance interpreter, based on a portable extension to GCC. This provides previously elusive safety guarantees for the code-copying technique.


• Our approach ensures a maintainable design within the context of GCC while also demonstrating a simple and effective path for supporting code-copying in an optimizing compiler in general. Ensuring safety in code-copying could be achieved by large, invasive efforts at nearly all levels of compilation; instead our technique minimizes the impact on other GCC components by reducing it to the insertion of a few well-separated passes.

• We provide an attractive, single-compiler solution for code-copying and demonstrate implementations of this technique for three distinct virtual machines (languages: Java, OCaml, Ruby), supported on three machine architectures (Intel 32-bit, x86_64, PowerPC 64-bit).

• Using static and dynamic software metrics and hardware performance counters we provide a detailed analysis of code-copying under all combinations of our investigated languages and environments. This includes comparisons between direct-threaded and code-copying approaches. Based on our analysis we provide guidance on the expected performance of code-copying, prior to actual implementation.

• Finally, we present a system composed of multiple virtual machines that use a caching compilation server. We show how by concatenating C source code for groups of bytecodes (superinstructions) a virtual machine can be specialized for an application. By caching the resulting specialized binaries we provide a platform that leverages a static compiler and allows us to amortize the cost of compilation over multiple runs of an application. This provides a substantial performance improvement with minimal maintenance and development costs of the system.

1.4 Thesis Outline

In the next chapter we provide detailed background information on interpreter and virtual machine architectures followed by an overview of the related work in Chapter 3.


The first milestone, the design and implementation of our GCC compiler modifications, is found in Chapter 4. The description of our second milestone, a detailed analysis of the code-copying technique applied to Java, OCaml, and Ruby on the Intel 32-bit, x86_64, and PowerPC 64-bit architectures, is found in Chapter 5. A presentation of our third and last milestone, involving the construction of a compilation server for the Java and OCaml VMs, is found in Chapter 6. Finally, notes on future directions and conclusions of our work are presented in Chapter 7.

Chapter 2

Background

In this chapter we focus on highlighting the concepts and tools that are essential for understanding our work. We first discuss the effects of hardware features on the performance of interpreters, then we highlight the reasons for using VMs and different approaches to their design, and then demonstrate the differences between the 3 main kinds of virtual machine interpreters, followed by a more in-depth explanation of the code-copying technique, and an overview of our AOT-based approach. We close this chapter with a summary of the benchmarks and description of the machines used throughout this work.

2.1 Hardware Architectures

The performance of an interpreter often heavily depends on the particularities of the hardware architecture it is compiled for and executed on. Some of the most important factors influencing performance include the number of registers, branch prediction capabilities, the construction of the pipeline, cache size, and the speed of main memory. In practice interpreter writers often hand-optimize register use by assigning a specific register to an often-used variable (like the program counter of the VM) [SGBE05]. Certain interpretation techniques, e.g. stack caching, demand the use of one or more registers solely for the purpose of speeding up the interpretation [PWL04, Ert95].
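As a concrete illustration of this kind of hand tuning, the sketch below pins the interpreter's virtual program counter and stack pointer to specific registers using GCC's local register variable extension. The function name, variable names, and register choices are illustrative assumptions only (real VMs, e.g. the OCaml interpreter, hide such declarations behind per-architecture macros); this is a sketch, not code from any of the VMs studied here.

/* Sketch: hand-optimized register assignment for often-used interpreter
 * state, using GCC's local register variable extension. Register names
 * are ia32-specific and purely illustrative. */
void interpreter_loop(const unsigned char *code, long *stack_base)
{
#if defined(__GNUC__) && defined(__i386__)
    register const unsigned char *pc asm("%esi") = code;   /* virtual PC        */
    register long *sp asm("%edi") = stack_base;            /* VM stack pointer  */
#else
    const unsigned char *pc = code;                        /* portable fallback */
    long *sp = stack_base;
#endif
    /* ... dispatch loop using pc and sp would go here ... */
    (void)pc; (void)sp;
}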


[Figure 2.1 diagram: properties of the language architecture (bytecode instruction set, bytecode dispatch intensity, bytecode sizes and complexity, language features such as memory and concurrency) for languages like Java, OCaml, Ruby, PHP, and properties of the hardware architecture (CPU instruction set, branch prediction capabilities, pipeline length, available registers, instruction and data cache performance, memory performance) for CPUs such as Intel 32-bit, x86_64, PowerPC, ARM, MIPS, meet in the virtual machine, whose engine may be switch-threaded, direct-threaded, code-copying, source-optimized, ahead-of-time or just-in-time compiled.]

Figure 2.1: Virtual Machine is the intermediate layer that translates architecture-independent bytecode into operations directly executable on a target architecture.

While a deeper analysis of the impact of the number of registers on interpreter performance is beyond the scope of this work, we note that, for example, the lower number of registers on the Intel 32-bit architecture is often a concern when implementing interpretation techniques like stack caching. A higher number of registers allows an interpreter to keep more of its often-accessed data (program counter, stack frame pointer, etc.) in registers instead of in memory, thus improving overall performance. Branch prediction accuracy has a tremendous effect on interpreter performance. In modern hardware architectures the execution of each instruction is divided into, roughly, between 10 and 100 pipelined stages. While such a design allows a CPU to

execute more instructions per cycle, it relies on the CPU's long pipeline to keep all the stages busy at all times. In the ideal case of linear code execution this would pose no problem, but real-life code contains a multitude of control flow changes (jumps), possibly conditional. In order to keep the pipeline full such a CPU usually speculatively executes code beyond (conditional or indirect) branches and cancels the speculatively executed operations in case a branch destination turns out to be different. For such an approach to be successful it needs to rely on a branch predictor. The role of a branch predictor is to predict with as great accuracy as possible the destination of each branch. The prediction of unconditional branches is trivial. Most branch predictors focus on predicting the destination of conditional branches but not necessarily indirect (computed) branches. The exact details vary between hardware architectures, and branch prediction usually remains part of the unpublished and vaguely documented work that is critical for each vendor to gain a performance advantage over others. Usually a branch predictor can only predict a single destination of a branch and hence is rather unsuccessful when applied to indirect (computed) branches that may have multiple destinations; such branches are the usual method of instruction dispatch in interpreters. The size and speed of cache and main memory can also have an important impact on the performance of a virtual machine. In particular, if the working body of code of a VM is larger than the instruction cache and the speed of main memory is significantly lower than that of the cache, a noticeable slowdown can be expected. This applies, in particular, to architectures where interpreters can actually be faster than compiled code, because an interpreter can fit fully inside a limited amount of fast memory, while compiled code cannot. In general, different hardware architectures are expected to interact somewhat differently with various approaches to interpretation and compilation in a virtual machine.


2.2 Virtual Machines

The purpose of the kind of virtual machines we are concerned with in this work is to provide a high-level abstraction of the underlying machine and its resources. In particular, VMs provide an abstraction layer with elements such as: available instruction set, architecture of the virtual CPU, memory model, object model (if any), calling convention, and many others. There exist two basic kinds of virtual machines: system and process. System virtual machines are concerned with sharing the underlying hardware resources between different operating system instances (e.g. VMWare, Xen, plex86, UML [VMW, BDF+03, Ple, Hos06]). These virtual machines usually offer an instruction set, memory model and other features identical to the actual hardware they are running on. This results in lower overhead than process VMs, and they very rarely need compilation-based solutions to improve performance. We describe several existing system virtual machines in more detail in Section 3.3. The other kind, the one we are concerned with in this work, are process virtual machines, to which we simply refer throughout this work as virtual machines. This kind of virtual machine is itself an ordinary application running on an operating system. As we illustrate in Figure 2.1, a virtual machine is the layer between a language architecture and a hardware architecture. Just as there exist many languages, there exist many kinds of hardware. Typically a virtual machine supports one programming language and multiple hardware platforms (e.g. Java, P-Code interpreters), although of course it is possible to prepare a VM to support multiple languages. The most important function of a virtual machine is to optimally translate the operations of a programming language (bytecode instructions) into instructions understood by hardware (machine code). As we showed earlier in Figure 1.2, page 3, the translation is usually done using a compiler or interpreter. It is important to note that in the process of translating instructions both the source (language) and target (hardware) architecture features have an important influence on the resulting performance. In our work we are most interested in solutions based on an interpreter or ahead-of-time compiler.


[Figure 2.2 sketch: a bytecode stream (ILOAD_1, ILOAD_0, IADD, ISTORE_2, ...) is dispatched through switch (*bcode++) { ... }, whose cases contain the IADD, ILOAD_0, ILOAD_1, ISTORE_2, ... implementation code.]

Figure 2.2: Switch interpreter mechanism and its overhead sources.

2.3 3 Kinds of Interpreters

Interpreters have the advantage of simplicity compared to compiler-based approaches. They are easier to write, maintain, and modify, which allows for more rapid development and innovation. Their structure can be quite modular, making it easier to experiment with different designs for garbage collectors, threading models, execution engines, etc. Here we present 3 execution techniques for interpreters that build on one another to provide improved performance.

Switch-threaded

A switch-threaded interpreter simulates a basic fetch, decode, execute cycle. Program bytecode is fed as a stream and an interpreter repeatedly fetches the next bytecode to be executed. A large switch-case statement is then used to branch to the actual VM code implementing the behavior of that particular bytecode. The execution cycle


for a single bytecode in this kind of interpreter can be described as test–branch–execute–branch-back, as illustrated in Figure 2.2. This process is straightforward and building a switch-threaded interpreter is conceptually easy. Unfortunately if, as in Java, bytecodes often encode only small operations, the overhead of fetching and decoding an instruction can be proportionally high, making the overall design quite inefficient.
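The following minimal sketch shows this test–branch–execute–branch-back pattern in C; the opcode names and stack layout are invented for illustration and are not taken from any of the VMs studied in this thesis.

/* Switch-threaded dispatch: fetch the next opcode, decode it through the
 * switch, execute its body, then branch back to the top of the loop. */
enum { OP_ILOAD_0, OP_ILOAD_1, OP_IADD, OP_ISTORE_2, OP_HALT };

void run_switch(const unsigned char *bcode, int *locals, int *stack)
{
    int sp = 0;
    for (;;) {
        switch (*bcode++) {                                /* fetch and decode */
        case OP_ILOAD_0:  stack[sp++] = locals[0];           break;
        case OP_ILOAD_1:  stack[sp++] = locals[1];           break;
        case OP_IADD:     sp--; stack[sp - 1] += stack[sp];  break;
        case OP_ISTORE_2: locals[2] = stack[--sp];           break;
        case OP_HALT:     return;
        }                                                  /* branch back */
    }
}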

Direct-threaded

A direct-threaded interpreter is a more advanced interpreter that minimizes the decoding overhead inherent in the switch-threaded case. This kind of interpreter requires an extension offered by some C compilers known as labels-as-values in order to be able to reference runtime code addresses. The C language is of particular importance since many operating systems and VMs are written in C or its close derivatives. Normally, in C a goto instruction can only target a statically specified label. With the labels-as-values extension it is possible to take the address of a label and store it in a pointer-type variable. This variable can then be used as the argument to a goto instruction, allowing indirect control transfer. A direct-threaded interpreter makes use of this capability by replacing a stream of bytecodes with a stream of labels targeting the corresponding bytecode implementations. With this mechanism the interpreter can execute an indirect goto to immediately jump to the implementation of the next bytecode. The optimization comes from reducing the repeated decoding of instructions. The repeated test–branch–execute–branch-back for each bytecode execution is traded for a one-time preparatory action where a stream of bytecodes is translated into a stream of addresses. With this mechanism the operations required for one bytecode execution are simplified to a much faster branch–execute process, thus removing a substantial amount of overhead. Direct-threading is state-of-the-art for interpreter designs, used in the GCJ, Sun, IBM, and OCaml interpreters [GCJ, Sun, IBM, OCa], and many other virtual machines. It is important to notice that the speed advantage of a direct-threaded interpreter over a


switch-threaded interpreter already comes with the requirement of additional, specialized support from the compiler used to compile the interpreter.
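A minimal sketch of the same loop restructured around labels-as-values is shown below (again with invented opcodes and layout); the pre-translated program is a stream of label addresses rather than opcode numbers, so each instruction body ends with an indirect goto to the next one and the central fetch/decode switch disappears.

/* Direct-threaded dispatch using GCC's labels-as-values extension
 * (illustrative sketch only). */
void run_direct_threaded(int *locals, int *stack)
{
    /* Pre-translated program: label addresses instead of opcodes. */
    void *program[] = { &&iload_1, &&iload_0, &&iadd, &&istore_2, &&halt };
    void **pc = program;
    int sp = 0;

    goto *(*pc++);                                    /* start execution */

iload_0:  stack[sp++] = locals[0];           goto *(*pc++);
iload_1:  stack[sp++] = locals[1];           goto *(*pc++);
iadd:     sp--; stack[sp - 1] += stack[sp];  goto *(*pc++);
istore_2: locals[2] = stack[--sp];           goto *(*pc++);
halt:     return;
}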

Code-copying

Code-copying is a further optimization to interpreter design, albeit one which makes relatively strong assumptions about compiler code generation. The basic idea behind code-copying is to make use of the compiler applied to the VM to generate binary code for matching bytecodes. The main source of performance improvement is the large increase this technique brings in the branch prediction success rate of modern CPUs using BTB (Branch Target Buffer) tables. Similar to the direct-threaded approach, a preparatory step is used that translates a stream of bytecodes into a stream of addresses. Different from direct-threading, however, the preparatory process actually copies internal bytecode implementations. This copied code forms superinstructions, effectively producing a branch–execute–execute–... execution pattern, also eliminating many internal branches. We describe this technique in greater detail in the next section.

2.4 Code-copying Technique

In a sense, and as indicated in Figure 1.2, code-copying bridges interpreter and compiler-based VM implementation approaches, and can be thought of as a "partial-JIT" technique. It creates code dynamically, but is usually considered a further optimization to interpreter design. Early works in code-copying showed significant performance improvement is possible, at least if essential safety concerns can be addressed [RS96, PR98, Gag02, PGA07]. Below we describe basic interpreter designs and implementation concerns, as well as the nature of safety considerations for code-copying.


[Figure 2.3 sketch: on the left, the direct-threaded interpreter main loop with separate IADD, ILOAD_0, ILOAD_1, and ISTORE_2 code chunks reached by indirect jumps; on the right, a single code-copying superinstruction ILOAD1_ILOAD0_IADD_ISTORE2 (length 4) formed from copies of those chunks concatenated in execution order.]

Figure 2.3: A comparison of code execution in a traditional, direct-threaded (left) vs. a code-copying (right) code interpreter. Each non-dotted arrow is a jump; the total number is reduced by code-copying and the existing jumps are more predictable, mostly due to multiple copies of the same instruction.

2.4.1 Mechanism of Code-copying

The basic idea behind code-copying is to make use of the compiler applied to the VM to generate binary code for matching bytecodes. The chunks of VM code used to implement the behavior of each bytecode are identified in the source code using the same labels-as-values extension as direct-threading, and then copied and replicated to generate a more efficient instruction stream. At runtime, the code-copying interpreter uses the addresses of pairs of labels to find the necessary code chunks. Each pair of labels encloses and defines the VM code actually used to implement the behavior of a bytecode. By copying the code between such labels and concatenating it with other code chunks from subsequent instructions, a superinstruction can be generated elsewhere in memory. This also eliminates branches between bytecodes in the same

18 2.4. Code-copying Technique

superinstruction, as shown in Figure 2.3. Not all bytecodes should or can be copied in this way. Copying large bytecodes does not bring performance improvement, only causes increased memory usage, and is usually avoided. Certain bytecodes that attempt unusual or special operations (the exact definition depends on the OS and VM) are also not copied. More detailed descriptions of bytecode selection techniques for code-copying and the creation of superinstructions can be found in [ETK06, Gag02, PR98, RS96]. Depending on the application and other factors the code-copying technique can provide substantial performance gains over the direct-threading technique. For example, [Gag02] showed 1.2 to 3 times speedups for a Java interpreter. In this work we will show that by applying code-copying to other languages the speedups over direct-threading can be even greater.
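A highly simplified sketch of the copying step itself is given below. It assumes the bracketing label addresses have already been collected into chunk descriptors and omits concerns a real engine must handle, such as allocating executable memory, appending a dispatch tail, and flushing the instruction cache on some architectures; all names are illustrative.

#include <string.h>

/* Start/end label addresses bracketing one copyable bytecode implementation. */
struct chunk {
    void *start;   /* e.g. the address of a COPIEDCODE_START_... label   */
    void *end;     /* e.g. the address of the matching COPIEDCODE_END_... label */
};

/* Concatenate n copyable code chunks into an (executable) buffer and
 * return the entry point of the resulting superinstruction. */
void *build_superinstruction(unsigned char *buf, const struct chunk *c, size_t n)
{
    unsigned char *out = buf;
    for (size_t i = 0; i < n; i++) {
        size_t len = (unsigned char *)c[i].end - (unsigned char *)c[i].start;
        memcpy(out, c[i].start, len);   /* copy the code between the labels */
        out += len;
    }
    /* A real engine would now emit a dispatch jump to the next
     * (super)instruction at 'out'. */
    return buf;
}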

2.4.2 Safety Concerns

An inherent difficulty with code-copying is ensuring the integrity of the copied code. A chunk of copied code must not have dependencies on its initial location in memory, or its execution after being copied into a new place in memory will differ from the original. Early works in this area assumed a fairly naive compiler model where equivalence of the copied code and original code is straightforward. Unfortunately, this does not remain true in the presence of aggressively optimizing compilers. The C standard does not contain any semantics that would allow us to express and impose the necessary restrictions on selected parts of code. For instance, the bracketing labels placed before and after source code chunks and used to address them do not, in any way, guarantee contiguity of the resulting binary code chunks, nor do they place restrictions on the use of relative as opposed to absolute addressing methods. General compiler optimizations, essential to good VM performance, may relocate basic blocks within a chunk outside of the bracketing labels, and efficient code generation makes best use of short, relative addressing instructions. At VM runtime this can result in incomplete copies of such a code chunk; the use of relative addressing for jump or call targets outside of a code chunk, for instance, will make the copies of such a chunk

contain jumps or calls to invalid addresses. These and other related serious issues have to be handled, otherwise virtual machine crashes or undefined behavior are to be expected. To the best of our knowledge there is no production-quality solution that would ensure the creation of code chunks by an optimizing C compiler that can be safely copied and executed. Without guaranteed safety of code-copying an interpreter cannot practically and reliably make use of this powerful technique. Previous implementations used hand-done examination, trial-and-error, and manual porting combined with specialized test suites [PGA07] in attempts to ensure safety. The large effort required, and the lack of a fully verified result, motivated our design presented in Chapter 4.
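To make the hazard concrete, the hypothetical instruction body below (not taken from any of the studied VMs) contains two constructs that typically break under naive copying: a call to a helper, usually emitted as a PC-relative instruction whose displacement is wrong at the copy's new address, and a rarely taken branch whose target block an optimizer may move outside the bracketing labels, leaving the copied region incomplete.

extern void vm_throw_stack_underflow(void);   /* assumed VM helper */

void instruction_body_example(long *stack, int stack_size)
{
COPIEDCODE_START_EXAMPLE:
    if (stack_size < 2)
        vm_throw_stack_underflow();   /* typically a PC-relative call: the
                                         displacement is invalid once these
                                         bytes are copied elsewhere */
    stack[stack_size - 2] += stack[stack_size - 1];
    /* The compiler may also move the unlikely error path past the end
     * label, so the region between the labels is no longer self-contained. */
COPIEDCODE_END_EXAMPLE:
    (void)0;   /* a label must be followed by a statement */
}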

2.5 Source code comparison

To better demonstrate the practical differences between interpreter designs, Figure 2.4 illustrates the VM code used in each of these designs. The same Java bytecode, LCMP, is shown as it is handled in the 3 types of interpreters. In a switch-threaded VM, we see the statements case INSTRUCTION_LCMP: and break that are part of the switch statement; together they bracket the code implementing the behavior of the LCMP¹ instruction. In the direct-threaded code we see an INSTRUCTION_START_LCMP: label preceding the code. The addresses of these labels represent the bytecode instructions in the input instruction stream, allowing each bytecode implementation to make an indirect jump, goto *(pc++), to the next instruction to be executed. In code-copying we see two labels, COPIEDCODE_START_LCMP: and COPIEDCODE_END_LCMP:, bracketing the implementation. At runtime, the code-copying interpreter uses the addresses of such pairs of labels to find the necessary code chunks. Overall, the implementation code remains largely the same even for more complicated instructions, while the bracketing code changes depending on the interpreter engine used.

¹LCMP compares two long (64-bit) precision scalar types.

a) case INSTRUCTION_LCMP:
   {
     /* instruction body */
     jlong value1 = *((jlong *) (void *) &stack[stack_size - 4]);
     jlong value2 = *((jlong *) (void *) &stack[stack_size - 2]);
     stack[(stack_size -= 3) - 1].jint = (value1 > value2) - (value1 < value2);
   }
   break;

b) INSTRUCTION_START_LCMP:
   {
     /* instruction body */
     jlong value1 = *((jlong *) (void *) &stack[stack_size - 4]);
     jlong value2 = *((jlong *) (void *) &stack[stack_size - 2]);
     stack[(stack_size -= 3) - 1].jint = (value1 > value2) - (value1 < value2);
   }
   goto *(pc++);

c) COPIEDCODE_START_LCMP:
   {
     /* instruction body */
     jlong value1 = *((jlong *) (void *) &stack[stack_size - 4]);
     jlong value2 = *((jlong *) (void *) &stack[stack_size - 2]);
     stack[(stack_size -= 3) - 1].jint = (value1 > value2) - (value1 < value2);
   }
   COPIEDCODE_END_LCMP:

Figure 2.4: Addressing the effective code of LCMP Java bytecode instruction in a) switch-threaded, b) direct-threaded, and c) code-copying interpreter.

2.6 AOT-based Solution and Runtime Traces

Another source of further VM performance improvement is to advance our approach towards actual optimizing compilation. This can be especially important for programs that execute repeatedly or over long periods of time, where the cost of compilation will be amortized. Best results can be obtained if the optimization tool is aware, to some extent, of program behavior. This allows code to be tailored or specialized to the actual execution and usually requires some form of code analysis

along with increasingly important profiling information. In the case of JITs the profile can be collected dynamically at runtime. The AOT approach requires multiple runs, where the data pertaining to one run is collected and stored to be used in subsequent runs. Typically such optimizations are heavily machine-dependent and require a massive effort to produce a full JIT or AOT compiler. In Chapter 6 we present our AOT-based solution, where we leverage an existing static compiler and show how single counter-based profile data can be used to produce a low-maintenance, low-development-cost solution.
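As a rough illustration of what single counter-based profiling can look like (a generic sketch, not the exact mechanism of Chapter 6; all names are hypothetical), an interpreter can simply count how often each method is entered and dump the counters for use by a subsequent run:

    #include <stdio.h>

    #define MAX_METHODS 4096
    static unsigned long invocation_count[MAX_METHODS];

    /* Called on every method entry: a single counter per method. */
    static inline void profile_invocation(int method_id)
    {
        invocation_count[method_id]++;
    }

    /* Called at VM shutdown; a later (AOT) run reads this file to decide
       which methods are worth optimizing. */
    static void profile_dump(const char *path)
    {
        FILE *f = fopen(path, "w");
        if (f == NULL)
            return;
        for (int i = 0; i < MAX_METHODS; i++)
            if (invocation_count[i] != 0)
                fprintf(f, "%d %lu\n", i, invocation_count[i]);
        fclose(f);
    }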

2.7 Virtual Machines, Benchmarks, and Machines Used

The approaches investigated in this thesis focus on practical results that emphasize low development and maintenance costs ahead of raw performance. We thus include extensive experiments, using a wide variety of benchmarks and architectures. Here we discuss the 3 main experimental VM environments as well as the hardware architectures on which we run them throughout our work. These VMs, architectures, and benchmarks are as follows.

• Java. We use the SableVM interpreter (v1.13) [Gag02]. This VM was chosen because of its state-of-the-art design and the fact that it already supports all 3 interpreter designs, including code-copying (originally without safety guarantees). The benchmarks used are the standard SPECjvm98 benchmark suite [Sta], which provides a large spectrum of benchmarks with different behaviors. To further extend the range of tested applications we use two large, object-oriented in-house benchmarks, SableCC (a parser generator) [GH98] and Soot (an analysis and optimization framework for Java) [PQVR+01], and several benchmarks (bloat, fop, luindex and pmd)2 from the DaCapo suite (v2006-10-MR2) [BGH+06].

2Not all DaCapo benchmarks are able to execute on the version of SableVM we used.


• OCaml. We use the OCaml interpreter (v3.10.0) [OCa], which out-of-the-box supports both switch-threaded and direct-threaded engines, and which we extended to also support (safe) code-copying in a way analogous to the SableVM support. OCaml was also formerly used in research on other techniques similar to code-copying and direct-threading [BVZB05,ZBB05]. Formal benchmark suites comparable to those for Java do not exist for OCaml; a large number of the benchmarks we use thus come from the interpreter itself (in the test subdirectory of the sources). These are small to medium-sized programs implementing and testing well-known data structures and algorithms, as well as OCaml language features. Again, to extend the range of tested applications, we further added several benchmarks from the Debian Shootout suite [Deb].

• Ruby. The Ruby interpreter before version 1.9 (a release candidate at the time of this writing) used an interpreter engine largely incompatible with direct-threading or code-copying and hence unsuitable for our purposes. Instead, we base our investigation on the Yarv (Yet Another Ruby VM) interpreter (v0.4.1) [Sas05], which has since become the official engine for Ruby 1.9 due to its better design and performance. Yarv not only supports direct-threading but also contains bytecode analyses that have the potential to make code-copying more efficient in some cases. Ruby is typically used as a tool for the development of short scripts and web applications [Rub], and performance-based benchmarks for Ruby are even less common than for OCaml. With the growing acceptance of Ruby comes interest in improving its performance, which makes Ruby an interesting, industry-relevant research target. We selected the larger (mainly in terms of runtime) of the benchmarks that are distributed with the VM source code (in the benchmark subdirectory) and added several benchmarks from the Debian Shootout. Because most of those Ruby benchmarks had a relatively small size compared to the Java or


OCaml benchmarks, we decided to also include Yarv's supplied self-test program test-all, which has a substantially larger total code size.

Our work includes investigating and optimizing VM design, particularly as it relates to the underlying hardware architectures. We thus employ several machine types in an attempt to fully assess the impact of different hardware designs. The test configurations in this work used the following hardware as representatives of the most popular hardware configurations.

• ia32 – 32-bit Intel. This is a typical desktop machine: a Pentium 4 at 3GHz with hyperthreading; L2 cache size of 1MB, L1 data cache size of 16kB, L1 instruction cache (trace cache) of 12K ops. This machine had 1GB RAM installed and ran Debian GNU/Linux 4.0 ("Etch") using Linux kernel version 2.6.18 with SMP support, optimized for i686 machines.

• xeon – 32-bit Intel. This is a typical server machine: an Intel Xeon at 2.4GHz with hyperthreading; L2 cache size 512kB. This machine had 2GB RAM installed and was running Debian GNU/Linux 4.0r2 (codename "Etch").

• x86_64 – 64-bit x86_64. This is a modern 64-bit machine that can serve as a desktop or small server: an AMD64 (Athlon64) Dual Core 3800+ at 2GHz. L2 cache size was 512kB per CPU (1MB total), L1 data cache 64kB per CPU and L1 instruction cache 64kB per CPU. This machine had 4GB RAM installed and ran Ubuntu 7.10 (codename "Gutsy") using Linux kernel version 2.6.22 with SMP support.

• ppc – 64-bit PowerPC. This is a 64-bit PowerPC machine with RISC CPUs (all other machines are CISC) that can be used as a desktop or server: a Power Macintosh G5 with two PowerPC 970 CPUs running at 1.8GHz. L2 cache size was 512kB per CPU (1MB total), L1 data cache 32kB per CPU and L1 instruction cache 64kB per CPU. This machine had 1.5GB RAM installed and was running Mac OS X Server version 10.4.11.


In the next chapter we present an overview of existing works related to the main areas of our work.

Chapter 3

Related Work

The work presented in this thesis is within several areas of research, mainly: bytecode interpretation methods, code optimization, static and just-in-time compilers, virtual machines, and compilation servers. In this chapter we present a representative selection of the most relevant works in these areas. Note that this chapter only contains descriptions of the related works and does not discuss the differences between these works and our work. To make understanding the content of this thesis easier, the discussion of differences is located in a designated section in each of the chapters discussing the 3 milestones of our work (Sections 4.5, 5.4, and 6.7). Certain related works mentioned below could potentially be assigned to more than one section; our placement is therefore somewhat subjective. We decided that techniques that modify the machine code, that need specialized knowledge about an architecture, or that perform machine code generation (copying memory is not code generation in our understanding) would fall into the category of Compilation Techniques. In the Virtual Machines section the aim was to gather related work pertaining to complete execution environments, some of which might be using compilation, or interpretation, or both, or other techniques. We also included hardware virtualization systems in that section. Compilers, virtual machines and relevant solutions aiming at resource sharing were given a separate section on Compilation Servers and Resource Sharing.


3.1 Interpreters

The most basic (and slowest) bytecode interpreter design is known as switch-threaded, basically consisting of a while loop containing a large switch statement where each bytecode is processed by branching to the appropriate internal functionality. While simple in concept and implementation, such interpreters suffer from an extremely costly dispatch overhead, and it is much better to use the more effective direct-threaded technique. These approaches have been compared and analyzed by Ertl and Gregg in [EG01,EG03a] and are explained in more detail in the next chapter. Much of our work is centered around the technique of code-copying, also (somewhat confusingly) known as code inlining, selective inlining, or (more meaningfully) dynamic superinstructions. Code-copying is a technique originating from direct-threaded interpretation. It was first described by Rossi and Sivalingam [RS96] and later in a better known work by Piumarta and Riccardi on what they called selective inlining [PR98]. Compilers used at the time were not too aggressive in their optimizations and thus these works did not face many of the challenges that the use of code-copying poses today, due to the common use of highly-optimizing compilers. Still, we have to note that their solutions, while delivering substantial performance improvements, did not mention the concern for, nor provide, safety guarantees. The most important reason why code-copying is significantly faster than other interpretation techniques is its positive influence on the success rate of the branch predictors commonly used in today's hardware containing branch target buffers (BTB). Application of the code-copying technique to Gforth has been analyzed in a few studies by Ertl et al. [EG03c,ETK06,EG03b], showing that the majority of the performance improvement is due to improvements in branch prediction. Ertl et al. also compared code-copying (called in these works dynamic superinstructions) to other techniques like dynamic and static instruction replication and static creation of superinstructions. They demonstrated that all of these techniques (often combined together) can also bring significant performance improvements. Their results also demonstrated, once again, that the speedup due to branch prediction improvements expectedly outweighs other negative effects, such as a slight increase in instruction-cache misses.
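The difference between the two dispatch styles can be sketched as follows (a minimal illustration with placeholder opcodes, not taken from any of the cited systems):

    /* a) switch-threaded: a single, shared, hard-to-predict dispatch point. */
    enum { OP_ADD, OP_HALT };

    static void run_switch(const unsigned char *pc)
    {
        for (;;) {
            switch (*pc++) {
            case OP_ADD:  /* ... instruction body ... */ break;
            case OP_HALT: return;
            }
        }
    }

    /* b) direct-threaded (GCC labels-as-values): a loader pass has already
       replaced each opcode in the code array with the address of its label,
       so every instruction body ends with its own indirect jump. */
    static void run_threaded(void **pc)
    {
        goto *(*pc++);              /* dispatch to the first instruction */
    ADD_LABEL:
        /* ... instruction body ... */
        goto *(*pc++);
    HALT_LABEL:
        return;
    }

In a real direct-threaded interpreter the same function is first invoked in a "prepare" mode that stores &&ADD_LABEL and &&HALT_LABEL into the code array, since label addresses are only visible inside the function that defines them.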


Peng et al. [PWL04] described an interpreter using a stack caching technique which exploits the fact that in stack-based virtual machines the top stack elements are the ones accessed most often. By forcing the C compiler to use registers as a cache for these elements and creating multiple versions of certain instructions (to accommodate different arrangements of stack elements in the cache) while maximizing code reuse, this technique achieved visible speedups of about 13%. Analyses of the application of stack caching in interpreters can also be found in works by Ertl et al. [EG03b,Ert95]. Berndl, Zaleski et al. [BVZB05, ZBB05] introduced a new technique they called context threading. This technique leverages the existing hardware prediction mechanisms for call and return assembly instructions to remove about 95% of mispredictions. They show it is possible to further improve the speed of their technique by selective use of code-copying for very small bytecode instructions. Vitale et al. [VZ05] analyzed the applicability of this technique to a Tcl interpreter characterized by large bytecode bodies and, for most applications and benchmarks, little time spent in interpreter dispatch. On a selected set of dispatch-intensive benchmarks they achieved an almost 10% speedup. Gagnon was the first to use the code-copying technique in a Java interpreter, SableVM [Gag02]. This implementation solved some important problems specific to the interpretation of Java bytecode by analyzing the bytecode before execution and, for example, splitting certain operations into multiple, VM-specific bytecodes. The code-copying engine it featured required manual tuning that could not give guarantees of safe execution and therefore could not be regarded as a production-ready solution. Interestingly, SableVM provided very good performance. For example, experiments with a simple, non-optimizing portable JIT for SableVM (SableJIT [B04]) showed that such a JIT was only barely able to achieve speeds comparable to the code-copying engine. This demonstrated once again that code-copying is a very attractive solution, save only for its lack of safety.
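To make the stack-caching idea mentioned above concrete, here is a minimal sketch of our own (hypothetical names, not the cited implementation): the top-of-stack value lives in a local variable that the compiler can keep in a register, so a typical binary operation touches memory once instead of three times.

    /* Fragment of an interpreter with a one-element stack cache. */
    static long run_fragment(long *sp)
    {
        long tos = *--sp;        /* load the cache once on entry          */

        /* IADD with the top element cached: pop only the second operand;
           the result stays in the cache (ideally in a register).         */
        tos = *--sp + tos;

        /* A following IMUL reuses the cached value the same way.         */
        tos = *--sp * tos;

        return tos;              /* spill the cache when leaving          */
    }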


3.2 Compilation techniques

A solution similar to a code-copying engine is a JIT compiler using code generated by a C compiler, as developed by Ertl and Gregg [EG04] for Gforth. In this solution the resulting binary code is created by concatenation of binary code chunks of the VM itself that are found by placing labels at their beginning and end. The main difference between this solution and code-copying is that the resulting concatenated code is actually modified (patched) on the fly, so as to contain immediate values and remove the need for the instruction counter. The patching process requires architecture-specific knowledge about the machine code, hence we consider this solution a code-generation technique. A very similar solution has been applied to a Tcl interpreter by Vitale and Abdelrahman [VA04], but the resulting performance gains were much smaller than for Gforth because bytecodes in Tcl are much larger and thus the dispatch overhead to be removed is much smaller. Other solutions in this area include systems like DyC [GPM+04]. DyC dynamically recompiles programs during their execution so as to enable the use of runtime values. Such an approach benefits from allowing for optimizations based on partial evaluation. A similar solution was presented by Consel et al. [CLLM04] in a C program specialization system. It included run-time specialization based on C source code templates compiled ahead of time with a static compiler and then used at run-time. The specialization technique was applied to Java by Masuhara and Yonezawa [MY01] and to Java and OCaml by Thibault et al. [TCL+00] with good results. Of course, moving towards a full, optimizing JIT represents a significant resource commitment, out of the reach of many scripting or experimental languages. Zaleski, Brown, and Stoodley developed the Gradually Extensible Trace Interpreter (YETI) [ZBS07], which couples an interpreter with a trace-oriented JIT compiler to allow for gradual development of the JIT compiler, lowering the cost of the initial development. There also exist portable JITs like GNU Lightning (used e.g. in OCamlJIT by Starynkevitch [Sta04]), but these often come with support for a limited number of platforms and their own limited set of code primitives. Specialized interpreters are another route to optimized performance. In Vmgen

the VM system can be trained on a set of programs to detect the most often occurring small sequences of bytecodes. The source of an interpreter is then automatically modified to combine these sequences into superinstructions, optimized the next time the interpreter is recompiled [EGKP02]. Such a generalized solution can be used for rapid development of systems that normally include an interpreter, as done by Palacz et al. [PBF+03]. Somewhat similar in spirit is a proposal by Varma and Bhattacharyya [VB04] for a system where Java bytecode is translated into custom-generated C code, including Java-specific optimizations, and then compiled using a standard C compiler. While the speed benefits of these solutions are indisputable, they still require non-automated training, selection of the set of training programs, and interpreter recompilation. Profile-based tuning has also been used to improve bytecode execution. An optimization based on the exploitation of frequently occurring bytecode sequences was shown by Stephenson and Holst under the name of multicode substitution [SH03]. In this work hot sequences of bytecodes are discovered off-line by ahead-of-time profiling, and new code is created (either by a JIT or ahead of time in an interpreter) with aggregated, compact instructions representing larger execution sequences. Stephenson showed that to limit the total number of instructions (including those created by the optimization itself) such an approach must be combined with careful selection of sequences based on how well a sequence of bytecodes can be optimized. Hybrid and other approaches to bytecode execution are of course also possible, and are discussed in the next section on complete virtual machine solutions. Taking a compiler-centered view of these solutions we shall note two approaches demonstrated by Bothner [Bot03] in GCJ and by Lattner and Adve [LA04] in LLVM. GCJ is a GCC-based ahead-of-time compiler including a direct-threaded interpreter for dynamically loaded code. GCJ takes as its input either Java source or Java bytecode (class files) and compiles them to an architecture-specific executable. Its main use is in embedded devices where memory footprint is critical and competitive JIT-based solutions are either too large for a small device or too costly due to their licensing. LLVM is a compilation framework created for lifelong program analysis that features its own code representation, its own compiler and other tools that make it extendable and

reusable. Almost every performance-oriented virtual machine features a just-in-time optimizing compiler able to perform method inlining, branch optimization and other optimizations. Such a compiler often permits several optimization levels chosen depending on how often a section of code is executed. Ma and Pirvu [MP08] presented an interesting work on improving VM startup time by using ahead-of-time (AOT) compilation. Dynamo [BDB00] is an advanced system that employs an interpreter of native code to detect common execution paths and a highly specialized optimizing compiler to optimize future code execution along these paths. Its focus is on runtime recompilation and optimization of native code.

3.3 Virtual Machines

Virtual Machines are complete vehicles that internally can use different solutions to achieve their goals. Here we present several popular Virtual Machines for Java that differ significantly in their design. The most common approach is to use an interpreter engine or a non-optimizing (e.g. template-based) compiler for the early execution of code and for rarely executed code (a.k.a. mixed-mode execution), and then, in performance-oriented solutions, to gather an execution profile and use an optimizing just-in-time compiler, as described earlier. The design of IBM's Java VM [IBM] and Sun Microsystems' HotSpot Java VM [Sun] uses an interpreter and a highly optimizing JIT compiler (with several optimization levels) applied only to frequently executed code. A different approach has been taken by the architects of Kaffe [Kaf], which offers a standard direct-threaded interpreter on many architectures, and employs a good optimizing JIT compiler on a selected few due to the limited resources of the project. SableVM [GH01] is a Java virtual machine that features 3 kinds of interpreters (switch, direct, inline-threaded) built from a single set of definitions of bytecodes, is highly portable, and focuses on maintainability and quality of code. JikesRVM [Jik] is written in an extended Java, does not make use of an interpreter, but employs a JIT compiler with 3 levels of optimizations. Its focus is on delivering high performance. The Parrot Virtual Machine [Par] is a relatively new project that

created a VM capable of executing bytecode of multiple programming languages like Perl 6 or Python. The VM is register-based and is combined with a translator that takes a stream of bytecodes of a language and translates it into a stream of bytecodes understood by the VM. Virtual machines and virtualization matters are not limited to only the execution of code for virtual architectures. There exist a number of solutions used to virtualize actual hardware, known as system virtual machines. System virtual machines are concerned with sharing the underlying hardware resources between different operating system instances. The VMWare [VMW] emulator recompiles native x86 architecture code on the fly to execute both user-level and kernel-level code within a user-level program, and emulates the standard PC hardware. It allows for multiple virtual PC machines executing on a single physical machine with a single host operating system. A similar approach is used by plex86 [Ple]. Another solution, known as User Mode Linux [Hos06], allows a modified Linux kernel to be executed in userspace as an unprivileged application that can then be used as a virtual Linux machine to execute other applications. All these solutions require that a single host OS is installed and running on a physical machine, and that the virtual machines are run as clients (applications) on top of the host OS layer. A recently developed solution is known under the name Xen [BDF+03], with the goal of running multiple equivalent operating system instances on a single machine, with no host OS. This solution is more lightweight but it requires slight modifications to the kernels of the operating systems running on a virtualized physical machine to include more advanced mechanisms of hardware resource sharing.

3.4 Compilation Servers and Resource Sharing

In some environments, due to an imbalance of resources available to different devices, or due to opportunities to avoid the repetition of identical tasks by multiple agents, sharing certain resources (compilation services, compiled code, execution environments) and optimizing the ways of transferring those resources (e.g. minimizing network

traffic, transferring running programs or live objects) can bring tangible benefits. Below we present a selection of works using these techniques. Lee et al. presented a compilation server [LDM04] based on JikesRVM. This system uses a central, powerful machine running JikesRVM as a compilation server that serves smaller, embedded devices connected to the server via relatively slow links. They achieved a significant improvement in execution time, pause time and memory allocation on the client virtual machines. Another server-based solution targeted at handheld devices was presented by Palm et al.; its goal was to reduce power consumption [PLDM02]. Franz proposed a different system for centralized code generation [Fra97]. It uses a specialized, platform-independent software distribution format twice as dense (i.e. taking half the space to carry the same information) as Java bytecode. In this system the generated code is translated into unoptimized native code at the destination VM, at load time. The native code, including the libraries used by it, is profiled and optimized using otherwise idle system cycles. These systems demonstrate that a compilation server is a useful tool that can improve certain characteristics of VM execution by sharing resources among VMs. There were other attempts at using shared resources to improve the performance of VMs. Cabri et al. implemented a mechanism [CLQ06] to capture the state of a running thread and restore it on a different JVM. Such mechanisms can be used to parallelize multithreaded programs and balance the computational load among multiple physical machines. A review of the research on a related mechanism – object persistence in Java – can be found in a work by Lunney and McCaughey [LM03]. Joisha et al. [JMSG01] modified a VM to share binary executable code among multiple VMs. Importantly, with this technique they were able to reduce the amount of writable memory used. As the basis for our later work on fast interpretation of multiple programming languages and the implementation of a VM-oriented compilation service, in the next chapter we present an enhancement to the GNU C Compiler (GCC) that enables safe code-copying.

Chapter 4

Static Compiler (GCC) Enhancement

The main advantage of using an interpreter is its simplicity, which results in low costs of development and maintenance. The performance of interpreters, however, has always been their weakest point. The code-copying technique provides a vast performance improvement at a very low engineering cost. The biggest problem with this technique is the difficulty of providing safety guarantees. Without guaranteed safety of code-copying an interpreter cannot practically and reliably make use of this powerful technique. Previous implementations used examination by hand, trial-and-error, and manual porting combined with specialized test suites [PGA07] in an attempt to ensure safety. The large effort required, and the lack of a fully verified result, motivate the work presented in this chapter. As we discussed previously, the most simplistic, switch-threaded interpreter was the basis for a faster design known as a direct-threaded interpreter. Direct-threading is a technique that uses a compiler extension known as labels-as-values. Specialized support for direct-threading in GCC and other compilers permitted this technique to become the one most commonly used by interpreter developers. In regard to code-copying we believe that providing similar support in an industry-standard compiler will make this technique much more attractive in practical applications. In this chapter we present our enhancement to the industry-standard GNU C Compiler (GCC). As we said previously, our goal was to create a system where the superior performance and simplicity of the code-copying technique can be exploited, while


providing the necessary safety guarantees. We demonstrate that with this enhancement a code-copying Java VM (SableVM) can safely achieve speedups of up to 2.7 times, 1.5 times on average, over direct-threaded interpretation, thus proving that this maintainable enhancement makes the code-copying technique reliable and thus practically usable. Below we first present the design of our enhancement, then the experimental results, a discussion of related work and finally the conclusions.

Figure 4.1: An optimizing compiler can relocate less likely executed code to the outside of the labels bracketing code used by code-copying.

4.1 Problem of Compiling for Code-copying

As numerous studies have shown, the performance gains from using the code-copying technique are clear [ETK06,EG03c,Gag02,GH01,PR98]. However, one of the biggest problems faced by the developers of code-copying interpreters is ensuring that the fragments of the code chunks copied to construct superinstructions are still fully functional in their new locations and as parts of superinstructions. In an optimizing compiler like GCC there are a number of possible issues, caused mostly by a few specific optimizations.

• Basic block partitioning. Optimizing compilers divide basic blocks into those likely to be executed (hot) and those not likely to be executed (cold). Blocks belonging to each group are placed together, so as to improve cache efficiency. Unfortunately, this optimization often moves a basic block belonging to the internal control flow of a code chunk to the outside (usually far away) of the bracketing labels of the code chunk, thus making it unusable for code-copying (see Figure 4.2).

Figure 4.2: Execution of a superinstruction containing a code chunk with a missing part, or a call using relative addressing, might cause a VM crash.

• Most often executed path optimization. As illustrated in Figure 4.1, an optimizing compiler can relocate code that is less likely to be executed, like null pointer checks (common in many bytecodes), to the outside of the pair of labels bracketing the code chunk. If this happens, such a code chunk cannot be used for code-copying, because the only code that is copied is the code between the two bracketing labels. There is no easy and portable way for a VM to even detect such unusable code chunks at runtime. Worse, when such a code chunk is used in code-copying (see Figure 4.2) and the less likely execution path is encountered, the relocated part of the code is missing from the superinstruction and undefined behavior will occur, most likely resulting in a segmentation fault.

• Jumps using relative addressing. A regular C goto to a label can be translated by the compiler into an instruction using either relative or absolute addressing. If relative addressing is used and the target is outside of the copied code chunk then such a bytecode is not suitable for code-copying, because the target of the jump depends on the position of the code, and this position changes when the code is copied. Again, there is no easy and portable way to detect this problem in a VM, and using such a bytecode makes a VM unreliable.

• Calls using relative addressing. On many popular architectures, e.g. on Intel, the target address of a call is specified using an address relative to the currently executed instruction (unless the call is to a far target, which is rarely the case). A code chunk containing such a call cannot be used for code-copying, for the same reasons as in the case of a relative jump.

Previous works [PGA07] attempted to remedy some of these problems by modifications to the C source code or by disabling some of the compiler optimizations. The usual result was mostly a lower likelihood of encountering the above problems, but not an actual guarantee. These attempts prompted us to solve the problem at the source, that is, in the compiler itself.

4.2 Architecture of a Compiler

The modifications to the GNU C Compiler (GCC) presented later in this chapter are easier to understand when viewed against an outline of general compiler architecture. In Figure 4.3 we present a simplified, data-flow oriented structure of a C compiler. For the C language the process of compilation is preceded by processing of the C source code by the C preprocessor (cpp), whose output becomes the input for the C compiler. The compiler first parses the source code and translates it into an intermediate representation (IR). For GCC this intermediate representation is known as tree. This representation is then processed by multiple compilation passes that incrementally analyze and transform it in a way that improves the quality of the code. In this process additional data structures are built around the intermediate representation, like basic blocks (BB) and the control flow graph (CFG). A basic block is a unit of code that has one entry point, one exit point, no jumps within the unit (with the exception of the last operation in the unit), and no outside jumps that jump to the middle of the unit. Basic blocks form the vertices (or nodes) of the control flow graph.

Figure 4.3: A largely simplified view of a C/C++ compiler based on GCC internal structure. The pipeline runs: C source code → C preprocessor → preprocessed source code → parsing pass → internal representation 1 ("tree" in GCC), processed by optimization passes operating on IR1 → lowering pass → internal representation 2 ("RTL" in GCC), processed by optimization passes operating on IR2 → code generation pass → assembler sources → assembler tool (e.g. GNU "as") → resulting binary (library, executable, ...).

The control flow graph is a representation of all control flow paths that might be visited during an execution of the code. After all passes operating on the tree IR are completed1, that representation is transformed into another one, in the case of GCC known as Register Transfer Language (RTL). Register Transfer Language is a lower-level representation that is very close to the actual assembly of a specific hardware architecture2. Again, multiple passes process this representation of the code, analyzing it and incrementally transforming it to improve the quality of the resulting code. In GCC the total number of passes applied to all intermediate representations (IRs) depends on the optimization options used and is usually greater than 50. The last pass of the compiler is the code generation pass. This pass translates the lower intermediate representation, RTL, into textual assembly output. This output is then used by an external, architecture-specific assembler tool (e.g. GNU as) to create the resulting binary object (e.g. an executable or library file). It is important to note that, given the general structure of compilers, and C/C++ compilers in particular, the solution we propose in this chapter is not a special case based on particularities of a specific compiler (GCC). Rather, it is based on the principles on which compilers are built and is expected to be just as applicable to other compilers.
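As a small, self-contained illustration (our own example, not taken from GCC), the function below consists of three basic blocks whose edges form its control flow graph:

    int abs_sum(int a, int b)
    {
        int s = a + b;      /* BB1: runs from function entry to the branch  */
        if (s < 0)          /* BB1 ends with the conditional jump           */
            s = -s;         /* BB2: reached only along the "true" edge      */
        return s;           /* BB3: join point, reached from BB1 or BB2     */
    }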

4.3 Design

Many virtual machines are implemented in C, and problems arise from the fact that there is nothing in the C standard nor in most C compilers' capabilities that guarantees that a part of binary code copied from one place in memory to another will be functionally equivalent to its original image. Code optimization can rearrange instructions outside of expected, source-based limits, and without specific optimization guarantees it is difficult to ensure the correctness of the final implementation. A more in-depth overview of the code-copying technique can be found in Section 2.4.1. Here,

1The actual mechanism is more complicated but the explanation is omitted to improve clarity. 2Definitions of Basic Block, Control Flow Graph, and Register Transfer Language based on Wikipedia [Wik] content.

we immediately continue to the design of our specialized support for this technique. Our approach is to use manual specification by source annotation in the form of the well-known #pragma directive. This directive is used to surround and thus help identify copyable code chunks. The bulk of our design effort is in ensuring safety for code-copying, a result guaranteed by a small set of well-specified additional passes within GCC. Below we first detail the requirements for code to be relocatable and thus suitable for code-copying, followed by a description of the GCC modifications, including the final verification phase.

4.3.1 Generation of safely copyable code

There are specific requirements that a chunk of code has to meet so that it can be copied to another location in memory, concatenated with other chunks, and safely executed. A code chunk can only be safely copied if its copy is functionally equivalent, i.e. for a chunk of code C, C_{baseaddr=α} ≡ C_{baseaddr=β} where α ≠ β. We thus define a chunk of code C to be copyable if all of the following conditions ensuring functional equivalence are true:

• C occupies a single contiguous space in memory that starts and ends with two distinct code labels specified by a programmer.

• Natural control flow enters C only at its “top” and exits only at its “bottom.”

• Any jump from inside of C to code outside of C (e.g. to an exception handler) uses an absolute target address.

• Any jump from the inside of C to another place inside C uses a relative target address.

• Any function call from inside of C uses an absolute target address.

• At the boundaries of C, registers must be used consistently with the boundaries of other code chunks (this is already ensured by GCC's computed goto extension).


4.3.2 GCC modifications

Our goal was to modify a highly optimizing C compiler, such as GNU C Compiler 4.2, to selectively generate code that meets these requirements, thereby ensuring functional equivalence of selected code chunks. As we have shown in Section 4.2, a regular optimizing C compiler like GCC applies over 50 passes to process the code. These passes modify the code in ways that are usually meant to improve the speed of the resulting code or its other parameters. Given the size, complexity and continuous development of the optimization passes it is not feasible to modify and maintain all of these passes to selectively generate code conforming to our requirements. Instead, we modify the compiler to:

• preserve the information about which parts of the code have to be treated specially—from the moment the source code is parsed to the moment the final assembly is generated,

• allow (almost) all of the optimizations to execute without modifications and then at certain selected points of the compilation process use additional passes that modify the code in a manner that makes selected code chunks copyable.

One can think of it as a cross-cutting approach that works by finding, preserving, and processing data across multiple compilation passes. The overall set of modifications is divided into separate passes that collectively track or restore information throughout the whole compilation process; a general description is shown in Figure 4.4. Depending on the representation of the code at each stage of compilation this information is tracked in a different form. In the source code it exists as #pragma lines, then as special flags of selected AST elements; later we attach it to basic blocks and computed gotos, and eventually it is inserted in the form of notes into the RTL. Tracking this information turned out to be the most difficult part of our work: it is because of all the aggressive optimizations that might duplicate, remove, and move the parts of the code in which we are interested that ensuring copyable code is non-trivial. We ensure that this information is not lost, misplaced or mangled by separating it from structures accessed by optimization passes, where possible, and by employing multiple sanity checks in each of our passes that use this information. Below we discuss in great detail our modifications to the GNU C Compiler as presented in Figure 4.4. For an overview of the data-flow structure of a C compiler in which these changes have been implemented please see Figure 4.3.


I. Register pragma start/end locations during parsing.
   (GCC: the source is parsed and the AST is created.)
II. Scan the tree (twice).
   Scan 1: ensure each pragma location is followed by a label; flag these label statements as BEGIN and END; insert volatile assembly around END labels.
   Scan 2: modify gotos within copyable areas to use absolute addressing (via register) if the target is outside of an area; modify calls within areas to use absolute addressing (call via register).
   (GCC: basic blocks and the CFG are created.)
III. Insert permanent marking and ensure areas are solid: initial permanent marking of BEGIN/TARGET basic blocks; restore the marking of copyable areas using BEGIN, TARGET and computed gotos as boundaries (reusable pass).
   (GCC: Tree-SSA and RTL optimizations.)
IV. Correct the ordering of basic blocks in copyable areas: restore the marking of copyable areas (reusable pass); reorder basic blocks of copyable areas.
V. Insert RTL markers of copyable area boundaries.
   (GCC: late and architecture-specific optimizations.)
VI. Verify the RTL of copyable areas: ensure the copyable-code properties hold.

Figure 4.4: To produce copyable code with minimal changes to the internal structure of the compiler we inserted several well-isolated passes.


case SVM_INSTRUCTION_LCMP:
  {
    /* instruction initialization */
    vm->instructions[instr].param_count = 0;
    vm->instructions[instr].copyable_code = &&COPYABLE_START_LCMP;
    env->vm->instructions[instr].copyable_size =
      ((char *) &&END_LCMP) - ((char *) &&COPYABLE_START_LCMP);
    break;
  }

#pragma copyable begin
COPYABLE_START_LCMP:
  {
    /* instruction body */
    jlong value1 = *((jlong *) (void *) &stack[stack_size - 4]);
    jlong value2 = *((jlong *) (void *) &stack[stack_size - 2]);
    stack[(stack_size -= 3) - 1].jint =
      (value1 > value2) - (value1 < value2);
  }
#pragma copyable end
END_LCMP:

Figure 4.5: Pragma directives are placed around the code that will be used by the code-copying engine at runtime.

Phase I: Register pragma locations

Figure 4.5 illustrates a fragment of interpreter source code for a single instruction. The code of an instruction (bytecode) is surrounded by the special copyable #pragma statements that mark the beginning and end of the copyable chunk. To make the compiler recognize and accept the pragmas we reuse the existing standard mechanisms for handling pragmas inside GCC.
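A heavily simplified sketch of what this registration can look like is shown below. Only c_register_pragma and input_location are existing GCC interfaces (their exact signatures differ between GCC versions); the handler bodies and record_copyable_boundary are hypothetical placeholders for our bookkeeping.

    /* Sketch only; real GCC pragma handling differs in detail.   */
    /* Assumed to come from GCC's own headers:                    */
    /*   c_register_pragma(), input_location, struct cpp_reader.  */

    static void record_copyable_boundary (location_t loc, int is_begin);

    static void handle_pragma_copyable_begin (struct cpp_reader *reader)
    {
      /* Remember where "#pragma copyable begin" appeared; a later pass
         checks that a label follows and flags it as a chunk BEGIN.     */
      record_copyable_boundary (input_location, 1);
    }

    static void handle_pragma_copyable_end (struct cpp_reader *reader)
    {
      record_copyable_boundary (input_location, 0);
    }

    void register_copyable_pragmas (void)
    {
      c_register_pragma ("copyable", "begin", handle_pragma_copyable_begin);
      c_register_pragma ("copyable", "end",   handle_pragma_copyable_end);
    }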


Phase II: Scan the tree (1)

To ensure chunks are properly identified and separated, an initial pass is performed to check starting and ending conditions. Each location of #pragma copyable begin and end registered during parsing is checked to ensure it is followed by a label. These start and end labels then have their special start and end flags set accordingly. Finally, the code is modified by artificially inserting into the stream of statements two empty volatile assembly instructions around the end label. The volatile assembly code acts as a barrier to code movement, and is used to ensure that the basic blocks directly following areas, the target blocks, are preserved and act as the sole and unique exits of the natural control flow from a copyable area. Our tests showed that otherwise some optimizations would attempt to remove or merge target blocks. In principle a similar concern applies to the first basic block of a copyable area, the starting block; however, in our tests the compiler would never try to remove or duplicate this block. We did not investigate it further, but if it ever became a problem such an issue could always be handled in the same way as for target blocks. We implemented sanity checks that would fail if blocks marked start or target were removed or duplicated.

Phase II: Scan the tree (2)

In most architectures control flow jumps can be relative or absolute. Relative jumps have the advantage of being (usually) smaller instructions, but have machine-specific limitations on the distance they can cover. Absolute jumps are often longer instruction sequences since the complete target address must be encoded, not just the relative displacement. As mentioned in Section 4.3.1, for control flow that goes outside of the copyable area absolute jumps are required to ensure the code behaves the same once copied. Similarly, jumps within a copyable region must use relative addressing to guarantee a copy will behave in an equivalent fashion. Our second phase thus includes a pass to convert control flow statements that go outside of a copyable area (and not to the target block) to use absolute addresses for their targets. There are two cases of such control flow: a goto and a function call, both complicated by the fact that GCC itself does not produce the final binary code; rather, it uses an external, platform-specific assembler program. It is in fact the assembler's role to choose the addressing mode for each call or jump; typically the shortest addressing mode to reach the target is chosen, but there is no general and relatively platform-agnostic way to specify in the assembler input that a jump or a call is to use absolute addressing. Below we describe how we ensure absolute jumps are used through the use of computed gotos, and then how we process the code chunk to ensure control flow is safe for copying.


Original source code:

#pragma copyable end END_LCMP:

Is changed into:

__asm__ __volatile__ ("" ::: "memory");
END_LCMP:
__asm__ __volatile__ ("" ::: "memory");

Figure 4.6: At an early compilation stage volatile statements are automatically inserted by the compiler around the end label to ensure that the target basic block will remain intact throughout optimizations.

To force selected jumps and calls to use absolute addressing we modify the code of these instructions to make jumps and calls via a register. As shown in Figure 4.7, in C these instructions are represented respectively by a computed goto and a function call using a function pointer. A computed goto is a special feature of the labels-as-values extension of GCC used by the direct-threaded engine: it is a goto whose argument is not a label but a variable containing the address of a label (or any other address). Using a register to hold the destination address may have a negative impact on performance that will vary from platform to platform, or even CPU type. Here the benefits of maintainability and safety are paramount, and as we will show in Section 4.4 our solution is efficient in practice. Nevertheless, more portable ways of expressing absolute addressing could slightly improve performance.


Original code within a copyable area:

goto NullPointerException_label; /* label outside of the copyable area */

Is automatically replaced during early compilation stages with:

{ void *address = &&NullPointerException_label;

/* this assembly prevents constant propagation */ __asm__ __volatile__ ("" : "=r" (address) : "0" (address) : "memory");

goto *address; /* computed goto uses absolute addressing */ }

Figure 4.7: To ensure absolute addressing a goto to outside of a copyable area is replaced with a specially crafted computed goto.
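The analogous transformation for a function call (the second case mentioned above) can be pictured at the C level as follows; the helper name and the vm_env_t type are hypothetical, and the empty asm plays the same role as in Figure 4.7:

    /* Original call inside a copyable area: */
    throw_null_pointer_exception(env);

    /* Is conceptually replaced with a call through a register-held
       function pointer, so the emitted call uses an absolute target: */
    {
        void (*target)(vm_env_t *) = &throw_null_pointer_exception;

        /* prevents the compiler from folding the pointer back into a
           direct (relative) call */
        __asm__ __volatile__ ("" : "=r" (target) : "0" (target) : "memory");

        target(env);
    }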


Our current system assumes that code chunks are small enough that the compiler will use optimal, relative jumps within the code of instructions found in a region. While it does not attempt to ensure intra-area jumps are relative, an appropriate pass could easily be added. The reason this assumption is valid is that relative jumps are preferred by GCC and the external assembler whenever possible because of their smaller size and possibly faster execution. The only case in which absolute jump target addressing is used is when the target of a jump is beyond the reach of relative addressing, which on contemporary architectures is about +/- 32kB or more. Since the application of code-copying only makes sense for small code chunks (less than 1kB, most of the time about 100B), relative addressing is the only one used, and thus this assumption is always valid.

Phase III: Mark and ensure areas are solid

Rather than modifying a large part of GCC to ensure the properties of copyable code regions are preserved at all subsequent compilation stages, by all compilation passes, we instead inserted two additional passes. The first pass modifies the code in a way that ensures a minimum of information about copyable code regions is always preserved. The second (reusable) pass uses this information and is capable of finding all the basic blocks belonging to copyable areas after arbitrary optimizations. Both passes include the sanity checks mentioned earlier, ensuring the additional information on code chunks is not lost or mangled. After the source code is parsed into the stream of statements, the compiler creates descriptions of basic blocks. Each such description contains pointers to the first and the last instruction that a basic block contains. We found that a basic block is a convenient unit to carry the additional information about the copyable code: it gives easy access to smaller components of the code, like each particular instruction, while also being easily accessible via higher-level structures, e.g. the control flow graph. We extended the data structure describing a basic block to store the unique id of the copyable area a block belongs to and to store a field of utility flags. The initial marking of basic blocks is straightforward: we scan the stream of statements for labels earlier marked as start and end, and mark basic blocks located between pairs of such statements with corresponding flags, as shown in Figure 4.8.


Figure 4.8: Initial marking of basic blocks right after parsing. The figure shows the statement stream alongside the corresponding basic blocks: the blocks of the copyable area between #pragma copyable start (COPYABLE_ICMP_START:) and #pragma copyable end (ICMP_END:) are assigned the area id (here 5), the first block is flagged START and the block at the area's end label is flagged TARGET, while code before and after the area keeps area id 0.

In general, optimizations can create new basic blocks, and move or split existing ones. One of the possible results is that some basic blocks that functionally are part of a copyable area might no longer be placed between the start and target basic blocks of this area and might not carry the initial marking. To recover the marking after optimizations we rely on the preservation of the start and target blocks, which in turn is ensured with sanity checks. Area marking restoration can then be done through simple propagation along the control flow graph, starting from the start block of each area and stopping at the target block and at jumps made via computed gotos. It is critical that the compiler had earlier modified all jumps to the outside of copyable areas to use computed gotos. This way it is possible to always find the limits of copyable areas.
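The restoration pass can be pictured as a simple depth-first propagation over the control flow graph. The sketch below uses simplified, hypothetical data structures; GCC's real basic-block representation and edge lists differ.

    #include <stdbool.h>

    #define MAX_SUCCS 8

    struct bb {
        int        area_id;                /* 0 = not in any copyable area   */
        bool       visited;
        bool       is_target;              /* TARGET block that ends an area */
        bool       ends_in_computed_goto;  /* absolute exit out of the area  */
        int        n_succs;
        struct bb *succ[MAX_SUCCS];
    };

    /* Propagate the area id from the (preserved) START block; stop at the
       TARGET block and at computed-goto exits, which bound every area.    */
    static void restore_area_marking(struct bb *b, int area_id)
    {
        if (b->visited)
            return;
        b->visited = true;
        b->area_id = area_id;

        if (b->is_target || b->ends_in_computed_goto)
            return;                        /* area boundary reached */

        for (int i = 0; i < b->n_succs; i++)
            restore_area_marking(b->succ[i], area_id);
    }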


Importantly, our approach does not use a heuristic and is guaranteed to properly restore the list of blocks belonging to each copyable area. We still included extensive sanity checks that in practice should never be triggered. This is because, for instance, we earlier inserted volatile assembly around chunks' end labels (see Figure 4.6) and disabled the cross-jump optimization (see below). With these measures in place, previously executed optimizations should not have inserted or deleted start or target blocks, or caused the control flow graphs of different code chunks to interfere. The one optimization that is nearly guaranteed to cause interference between control flow sub-graphs of different code chunks is, in GCC, called cross-jumping. It is currently the only optimization that has to be disabled for a function that contains copyable code. It attempts to find parts of code within a function that are identical and then share a single copy of the code among all the places in the function where this code is used, reducing overall code size. This optimization clearly conflicts with the need of the code-copying engine to use self-contained code chunks and has therefore always been useless in this context. Selective per-function disabling of this optimization does not change the way the rest of the virtual machine code is compiled.

Phase IV: Correct basic blocks ordering

The main reason for our basic block reordering pass is an optimization performed by GCC by default, basic block partitioning. This pass does two things. It divides the set of basic blocks of a function into those that are expected to be executed frequently (hot blocks) and those that are expected to be executed rarely (cold blocks). In the final assembly all the hot blocks of each function are located contiguously in the upper part of the code, and the cold blocks are located below the hot blocks. This optimization also reorders basic blocks to ensure that fall-thru edges are used for the most often encountered control flow. These are heuristic techniques for improving instruction cache hit rate and simplifying control flow, and this optimization can in practice improve the performance of a virtual machine by several percent; we therefore want to allow for it.

Figure 4.9: From the marking of only two basic blocks, start and target, the complete marking can be restored by following the edges of the control flow graph. Once the marking is restored it is possible to rearrange the basic blocks of a marked copyable area.


Alternatively, we had the option to disable this optimization on a per-function basis. This was deemed unsatisfactory for two reasons. First, we perceive the fall-thru edges optimization as a welcome attempt to improve the quality of the resulting code later used for code-copying. Secondly, we have to be aware that there are other optimization passes that can also relocate basic blocks. Therefore, with or without block partitioning, we had to create a solution that would be able to deal with any kind of relocation of basic blocks. For a chunk of code to be copyable the compiler has to restore the order of basic blocks so that the marked code is self-contained. In this case the goal is to move basic blocks to ensure that the start basic block of the copyable area is followed by all other blocks belonging to it, which are then followed by the target basic block of the same copyable area. After the marking of basic blocks belonging to all areas is restored (as described in the previous section) it is relatively easy to move all basic blocks belonging to an area into the wanted positions, as can be seen in Figure 4.9. Positions of basic blocks that are part of the control flow graph of a copyable area and are initially located between the start and target blocks are left untouched. Basic blocks that are part of the control flow graph of a copyable area but are initially not in between the start and target blocks are moved to immediately precede the target block; the ordering of these basic blocks is preserved. Positions of other basic blocks, not belonging to copyable areas, are left unchanged. This means that almost all optimizations of fall-through edges and others that reposition basic blocks are preserved, which minimizes the negative effect this reorganization might have on performance. In fact, in some cases, this reorganization actually improves VM performance, regardless of the use of code-copying, as we will show in the case of OCaml in Chapter 5, Figure 5.10 on page 93.
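In pseudo-C terms the reordering rule can be sketched as follows, again over hypothetical helpers rather than GCC's real basic-block chain; blocks of an area that drifted outside the start..target range are re-inserted, in their original relative order, just before the TARGET block.

    struct bb;                              /* as in the earlier sketch          */
    struct layout { struct bb *first; };    /* layout order of a function's BBs  */

    /* Hypothetical helpers, not real GCC functions. */
    extern struct bb *layout_next(struct layout *fn, struct bb *b);
    extern int  area_of(struct bb *b);
    extern int  lies_between(struct layout *fn, struct bb *lo,
                             struct bb *hi, struct bb *b);
    extern void move_before(struct layout *fn, struct bb *b, struct bb *pos);

    static void regroup_area(struct layout *fn, int id,
                             struct bb *start, struct bb *target)
    {
        struct bb *b = fn->first, *next;
        for (; b != NULL; b = next) {
            next = layout_next(fn, b);
            if (area_of(b) == id && !lies_between(fn, start, target, b))
                move_before(fn, b, target);   /* keeps the moved blocks' order */
        }
    }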

Phase V and VI: RTL markers and final verification

The additional passes described above modify the structure of the code based on up-to-date information about the boundaries of basic blocks, the construction of the control flow graph, and other data. During the final compilation passes the GCC compiler

discards some of this information or does not keep it up to date. In our tests we found that these last optimization passes do not change the structure of the code enough to invalidate the properties of copyable code. Nonetheless, this was not sufficient for the safety guarantees we required and another solution was needed. We therefore added two simple passes. During the compilation process, not long before the information about basic blocks and the control flow graph becomes unavailable, an additional pass inserts into the program representation (RTL stream) special (untouchable by other passes) notes that mark the start and end of copyable areas, including the ID of an area. Notes are an existing GCC mechanism for RTL code annotations. The second pass is then a simple verification pass that uses only a minimum of information. It is executed just before the final assembly is sent to an external assembler. With the notes inserted by our previous pass it is possible to verify all the necessary properties of copyable areas when the code is final. The verification algorithm traverses the RTL instruction stream (a sketch of such a traversal is given after the list below) and ensures that:

• all copyable areas are present,

• copyable areas do not interleave with one another,

• jumps from a copyable area A to a symbol (in other words, to a label, and thus assumed relative) are either to a target within A or to this area's target label, i.e. the label that begins the target basic block (the relative nature of the jump is trivially ensured, as explained in the description of Phase II: Scan the tree (2) on page 45),

• jumps to the outside of an area are made via register and not a symbol (thus are absolute),

• all calls from within areas are made via register and not a symbol (thus are absolute).
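The following sketch shows such a traversal over a simplified stand-in for the RTL stream; the types and flags are hypothetical, and the "all copyable areas are present" bookkeeping is omitted for brevity, but the remaining checks mirror the list above.

    #include <stdbool.h>

    struct insn {
        enum { NOTE_AREA_BEGIN, NOTE_AREA_END, JUMP, CALL, OTHER } kind;
        int   area_id;          /* for the two NOTE kinds                     */
        bool  via_register;     /* for JUMP/CALL: absolute (register) target? */
        bool  target_in_area;   /* for JUMP to a symbol: inside the area or
                                   the area's target label?                   */
        struct insn *next;
    };

    static bool verify_copyable_areas(const struct insn *stream)
    {
        int open_area = 0;                      /* 0 = outside any area */

        for (const struct insn *i = stream; i != NULL; i = i->next) {
            switch (i->kind) {
            case NOTE_AREA_BEGIN:
                if (open_area != 0)
                    return false;               /* areas must not interleave  */
                open_area = i->area_id;
                break;
            case NOTE_AREA_END:
                if (open_area != i->area_id)
                    return false;
                open_area = 0;
                break;
            case JUMP:
                if (open_area != 0 && !i->via_register && !i->target_in_area)
                    return false;               /* relative jump escapes area */
                break;
            case CALL:
                if (open_area != 0 && !i->via_register)
                    return false;               /* calls must be absolute     */
                break;
            default:
                break;
            }
        }
        return open_area == 0;
    }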

A verification error at this point is uncorrectable and is treated as an internal compiler error. This guarantees that if the source code compiles properly then the copyable chunks of binary code will be safe to copy and execute in a code-copying VM. Sanity checks in all our passes ensure proper flow of the information on code chunks, which allows the final verification to function reliably. In our experience we have not yet encountered a case where the verification pass would fail when all the former passes executed properly.


Figure 4.10: Performance comparison of SableVM with the standard direct-threaded engine, the unsafe code-copying engine, and the safe code-copying engine using the GCC copyable-code enhancement.

4.4 Experimental Results

To examine the practicality of our design we modified a Java Virtual Machine, SableVM [Gag02], to use our enhanced GCC. In the SableVM source we marked code chunks with our copyable #pragma. Code-copying was already supported in SableVM, but it required globally disabling block reordering in GCC and did not provide safety guarantees. During preparations we used our enhanced GCC to verify the unsafe code formerly used by SableVM's code-copying engine and found several cases where execution of a less likely control flow path in a bytecode would result in a VM crash due to a function call using relative addressing.

formerly used by SableVM’s code-copying engine and found several cases where execution of a less likely control flow path in a bytecode would result in a VM crash due to a function call using relative addressing.

Metric                               #
Data structures modified             4
Fields added to data structures      6
Data structures added                3
Functions added to existing files    4
Function calls/hooks inserted        8
Code lines added or modified         139
Code lines in new files              1500

Figure 4.11: Metrics of code modified and added to GCC.

The goal of our main experiments was thus to demonstrate that our new approach allows the code-copying strategy to be realistically and more reliably used while maintaining the performance. The results shown in Figure 4.10 have been gathered using machine ia32 (machine specifications and benchmark descriptions can be found in Section 2.7, page 22). The SPEC and in-house benchmarks were executed with their default settings (-S 100), the resulting execution times were averaged over 10 runs, and performance is shown normalized to the speed of the direct-threaded engine as a baseline for comparison.

The benefits of code-copying are clear. We are able to achieve approximate parity with the unsafe code-copying approach. More surprising perhaps is that the performance of SableVM version 1.13 modified to use our GCC extensions actually improved over the manual code-copying design in most cases. We attribute the general improvement to the fact that previously SableVM had to globally disable basic block reordering for the code-copying engine to work at all. With the added GCC support for code-copying this useful optimization was enabled. We also note that the performance of two SPEC benchmarks that benefit the most from code-copying, as well


as Soot, decreased slightly, by about 2-3%. We suspect that this effect is caused by the memory barriers inserted into the code in places where the special #pragma is used. These barriers might be inhibiting some of the optimizations. It is also the case that inadvertent changes to instruction cache behavior can cause significant performance variations (as much as 10% [GVG06,GVG04]). More detailed analysis of performance gains and losses is thus warranted, but certainly the magnitude of improvement in Figure 4.10 is sufficient to demonstrate the general success of our compiler-facilitated approach. Overall, the effect is clear: our modifications efficiently enable code-copying as a safe technique for VM interpreter design.

One of our goals was to minimize the impact of our changes to GCC on GCC maintenance. Figure 4.11 shows the results of our impact measurements in terms of required changes to code and data structures. In a truly large project such as GCC we see these numbers as indicators that our extension has minimal impact on the existing GCC code and its maintenance. A major upgrade of our enhanced GCC, porting our modifications from GCC v. 3.4 to v. 4.2 (about 2 years of GCC development), took only a few hours and consisted mostly of renaming, via search and replace, to account for changed names of fields in GCC data structures, plus testing. We believe this validates our claim that a relatively simple compiler modification can help improve the performance of dynamic execution environments.

4.5 Notes on Related Work

The first uses of the code-copying technique by Rossi and Sivalingam [RS96] and soon after by Piumarta and Riccardi [PR98] did not, in practice, have to face the issues resulting from the use of highly-optimizing compilers. Compilers at that time were vastly less aggressive than today and there existed a much closer relation between the source code and the resulting binary code. Most of the works that used code-copying also silently ignored the issue, with the exception of Ertl’s work [EG04] on a retargetable JIT using code generated by a C compiler (building on code-copying), which faced similar issues. Ertl’s solution did include automated tests to detect code

chunks that were definitely not copyable, but it was not guaranteed to find all such chunks (see Figure 6 in [PGA07] for an example) and thus did not ensure safety. In Gagnon’s work within SableVM [Gag02], by-hand and trial-and-error methods were used for detecting non-copyable code. As a later improvement to SableVM, a database of copyable bytecodes, per architecture and per compiler (and compiler version), was added. Prokopski et al. [PGA07] described a specialized bytecode testing suite to speed up the process of constructing this database for new architectures or compiler versions. Despite the extensiveness of the test suite, its use still did not and could not provide proper execution safety guarantees.

4.6 Conclusions

For a variety of reasons, including simplicity and dynamic support, many modern languages are based on virtual machine (VM) designs. Efficiency and ease of design are key features for rapidly evolving languages and associated execution environments. Code-copying interpreters offer a good trade-off between performance and maintenance, but were previously limited by the lack of critical safety guarantees, as well as maintenance concerns with respect to the modifications to the static compiler. Copyable code must behave functionally the same when copied, and while conceptually trivial these guarantees are simply not provided by current compilers or C language extensions.

With our work we demonstrate that it is possible to make code-copying safe and practical. Our approach to GCC modifications demonstrates the viability of our technique for ensuring the safety properties essential to code-copying. We show how this technique can be relatively easily integrated with a modern C compiler, while keeping the changes relatively isolated and making only limited assumptions about the inner workings of a compiler, thus ensuring long-term maintainability.

The implementation of a code-copying GCC extension on which we based the work presented in this chapter was focused on supporting the i386 architecture. On

other architectures there might be additional issues with delay slots (e.g. MIPS), relative addressing of externs and globals (e.g. x86 64), or relative-jump span limitations (e.g. PowerPC). Addressing these hardware architecture-specific issues, together with a deeper performance analysis that further determines the source of our gains when code-copying is applied to other VMs and architectures, forms the core of our work presented in the next chapter.

Chapter 5

Application to Multiple Virtual Machines, Multiple Architectures

In the previous chapter we presented a C compiler enhancement implemented within GNU C Compiler (GCC) that provided the support necessary for safe execution of virtual machines that use the code-copying technique. This enhancement allows a programmer to use a simple C pragma to mark chunks of VM source code that will be used at runtime for code-copying. The compiler then ensures that the code chunks generated from these parts of the source code are contiguous in memory and can be safely copied at runtime, while also ensuring as many code optimizations as possible are still applied. This approach makes code-copying practically usable, and was essential in allowing us to scale our development and investigation further. Due to varying language features and virtual machine design, however, not all languages benefit from code-copying to the same extent. We consider here properties of interpreted languages, and in particular bytecode and virtual machine construction that enhance or reduce the impact of code-copying. We implemented code-copying and compared performance with the original direct-threading virtual machines for three languages, Java (SableVM), OCaml, and Ruby (Yarv), examining performance on three different architectures, ia32 (Pentium 4), x86 64 (AMD64) and PowerPC (G5). Best speedups are achieved on ia32 by OCaml (maximum 4.88 times, 2.81 times on average), where a small and simple bytecode design facilitates improvements to


branch prediction brought by code-copying. Yarv only slightly improves over direct-threading; large working sizes of bytecodes, and a relatively small fraction of time spent in the actual interpreter loop, both limit the application of code-copying and its overall net effect. We are able to show that simple ahead of time analysis of VM and execution properties can help determine the suitability of code-copying for a particular VM before an implementation of code-copying is even attempted. Below we first discuss the application of code-copying to 3 different VMs, then we discuss properties of different VMs and languages backed by several metrics. Later we present and discuss the experimental results, comment on related work, and finally draw conclusions.

5.1 Application to Java, OCaml and Ruby

Our experimentation is based on examining code-copying in several environments. SableVM already supported code-copying (in fact, all three interpreter designs [Gag02]). Our modifications to SableVM were thus mainly to modify the VM to use the GCC enhancements described previously, obviating the existing system for verifying correctness of copied code. OCaml and Yarv (Ruby)1 use only direct-threading, and so changes were more extensive. For both the OCaml and Yarv VMs we used the same general scheme for the code-copying implementations. At a high level this involved:

a) adding a small set of functions and data structures supporting the management of superinstructions,

b) modifying the interpreter loop, either through C macros (OCaml) or changes to the C code generator (Yarv) to use the special pragmas to identify the copyable code regions implementing individual bytecodes, and

1 The current version of the popular Ruby interpreter used, at the time of this writing, a different interpretation technique, not readily compatible with code-copying. It is also the case that Yarv contains analyses that can potentially simplify bytecode and improve opportunities for code-copying.


c) modifying the function responsible for bytecode preparation to construct and use copied code, storing the revised code in a modified code array.

To identify copyable sequences several language-specific bytecode analyses are required, making changes to bytecode preparation the most complicated step. For example, before attempting the creation of copied code all bytecodes changing control flow must be found and their jump targets identified. In the case of SableVM other analyses are necessary for detecting potential class loading, identifying GC points, and other language-specific concerns.
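A skeletal version of such a preparation step is sketched below. It is a simplified model under our own naming, not the code of any of the three VMs: the per-opcode tables are assumed to have been filled in from the addresses of the pragma-marked chunks, and details such as appending dispatch code, flushing the instruction cache, and recording the new superinstruction in the prepared code array are only noted in comments.

#include <string.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-opcode tables (names are illustrative only). */
extern void   *chunk_start[];       /* start of the copyable code chunk      */
extern size_t  chunk_size[];        /* size of that chunk in bytes           */
extern bool    is_copyable[];       /* can this opcode be copied at all?     */
extern bool    ends_basic_block[];  /* branch/jump/return-style opcode?      */
extern bool    is_jump_target[];    /* per bytecode *position*, filled in by
                                       a prior control-flow analysis pass    */

/* Copy the chunks of bytecodes code[first..first+count-1] back to back into
 * an executable buffer and return the address of the new superinstruction.  */
static void *build_superinstruction(const int *code, size_t first, size_t count,
                                    unsigned char *exec_buf, size_t *used)
{
    unsigned char *dst = exec_buf + *used;
    for (size_t i = 0; i < count; i++) {
        int op = code[first + i];
        memcpy(exec_buf + *used, chunk_start[op], chunk_size[op]);
        *used += chunk_size[op];
    }
    /* A real VM would also append dispatch code here, flush the instruction
     * cache, and store dst in the prepared code array so that execution
     * jumps into the copied code.                                            */
    return dst;
}

/* Greedily form maximal copyable sequences.  A sequence may contain at most
 * one control-flow bytecode, which must be its last one, and must not span
 * a jump target other than its own first bytecode.                           */
static void prepare_superinstructions(const int *code, size_t ncode,
                                      unsigned char *exec_buf, size_t *used)
{
    size_t i = 0;
    while (i < ncode) {
        if (!is_copyable[code[i]]) { i++; continue; }
        size_t j = i + 1;
        while (j < ncode && is_copyable[code[j]] && !is_jump_target[j] &&
               !ends_basic_block[code[j - 1]])
            j++;
        if (j - i > 1)
            (void)build_superinstruction(code, i, j - i, exec_buf, used);
        i = j;
    }
}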

5.2 Execution Properties

Basic properties essential to good code-copying performance can be gathered ahead of time, through simple dynamic metrics and examination of the virtual machine bytecode design. This allows an assessment of the suitability of a virtual machine for code-copying prior to implementation; our results here can also serve as a heuristic guide during bytecode design for maximizing the performance of code-copying.

Performance and behavior of three different virtual machines on three hardware platforms are examined. These cases illustrate a spectrum of implementations and potential performance. A Java virtual machine, an OCaml interpreter, and a Ruby interpreter are each examined running on 32-bit Intel (ia32), 64-bit AMD64 (x86 64), and 64-bit PowerPC (ppc) hardware. Each virtual machine was tested on each architecture in two configurations: one using the standard direct-threading technique and one using our safe code-copying technique. For each VM we selected a set of benchmarks that would run properly across all architectures. We measured the behavior of each benchmark in 3 ways: its performance, its interaction with the hardware as indicated by hardware counters, and the dynamic characteristics of the creation and usage of code created using the code-copying technique.


5.2.1 Architectures and Virtual Machines

Performance of any program is obviously closely tied to the machine architecture on which it runs. Variability due to different hardware designs can be magnified when investigating the performance of a code-copying virtual machine: the benefit of code-copying is largely produced by better enabling branch prediction, improving code locality, etc. These are factors that can vary significantly on different computer hardware. Our analysis thus compares performance on three popular architectures: 32-bit Intel ia32, 64-bit x86 64, and 64-bit PowerPC ppc (the exact descriptions of machines can be found in Section 2.7).

External performance influences were eliminated or reduced as much as possible. On all machines all used software was installed on a local hard drive, and unnecessary processes and network activity were stopped. To reduce noise due to multiprocessing on ia32 and x86 64 we ensured, through the CPU affinity functionality in glibc and the Linux kernel, that all VM threads execute on a single CPU only. Mac OS X (the ppc machine), however, intentionally does not provide such functionality, leaving the decision of CPU assignment to the OS. For each virtual machine and on each architecture we examine performance for a variety of language-specific benchmarks, as described in Section 2.7.

All of these virtual machines are (primarily) implemented in C so the GCC enhancement presented previously was directly applicable. To measure the performance we timed 7 runs of each benchmark and averaged the results. We computed the standard deviation, and for almost all cases it was less than 0.01 of the average measured value, and never greater than 0.04.
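For reference, the kind of pinning used on the Linux machines can be achieved with glibc's sched_setaffinity interface, roughly as in the minimal sketch below (error handling kept to a minimum; this shows one way of doing it, not necessarily the exact mechanism used by our scripts).

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);          /* allow execution on CPU 0 only            */

    /* pid 0 means the calling process; threads created afterwards inherit
     * the affinity mask, so the whole VM stays on a single CPU.            */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    /* ... launch or exec the virtual machine / benchmark here ...          */
    return 0;
}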

5.2.2 VM and Language Characteristics

The different bytecode sets offered by different virtual machines can have a large impact on code-copying performance. More complex bytecode instructions perform more “work” per bytecode, while smaller bytecodes will tend to have more relative branching overhead, and thus greater opportunity for improvement through code-copying. The extent to which copyable sequences can be exploited is also driven


by characteristics of VM workload. Programs which exercise copyable sequences more often will naturally see greater improvements. This is relative to non-copyable sequences, but in order to account for whole-program improvement it must also be considered relative to the fraction of whole-program time spent in the interpreter loop. If execution depends on expensive library or runtime services then again the relative improvement provided by code-copying will be reduced. Below we discuss these concerns and present dynamic data from our three virtual machines.

The relative complexity of operations executed by a bytecode is a measure of how much work (in terms of CPU time) each bytecode does, including work done by functions called from within a bytecode. A reasonable expectation is that the more work an average bytecode does the less positive the effect of using the code-copying technique. As previously mentioned, much of the improvement shown by code-copying is due to reduced overhead in jumping between bytecodes, removing jumps and simplifying branch prediction. Complex bytecodes mean this optimization operates on a smaller runtime overhead per bytecode, and so has less overall impact.

           interpreter    instructions   LOC per        copyable instructions
           loop kLOC      total          instruction    #           %
SableVM    (15.0) 3.4     319            (57) 25        194         61%
OCaml      1.0            133            8              96          72%
Yarv       1.2            121            10             (71) 61     (59%) 50%

Table 5.1: Average size of an instruction (in lines of code) and fraction of copyable instructions in the 3 considered VMs. For SableVM the numbers in parentheses are the line counts after m4 macro expansion of its interpreter loop source code. Also, for the SableVM interpreter loop LOC count we did not include instruction initialization code which, in this particular VM, happens to be interleaved with instruction code. For Yarv the best performance was obtained when the maximum size of copyable instructions was limited to 100-150 bytes, lowering the number of copyable instructions from a potential 71 to the 61 actually used.

Figure 5.1: Percentage of time spent in the interpreter loop of the direct-threaded interpreters for x86 64, and of bytecodes loaded that were potentially copyable, for a) SableVM, b) OCaml, and c) Yarv.


The presence of consecutive groups of simple or small bytecodes (doing little “work”) not containing control flow branches or targets (from the bytecode point of view, not the internal implementation) is also an important factor. Such consecutive groups are candidates to become superinstructions during runtime, and so a greater percentage of execution spent in such groupings implies greater benefit from code-copying. Arithmetic operations tend to be expressed with small bytecodes, and so programs including significant amounts of direct computation, not too densely interleaved with control flow changes, will show a greater benefit. Note that in our implementation we currently require that the copied code of a superinstruction contains at most one control-flow changing bytecode, and it must be the last one in a superinstruction. This is not, however, an inherent limitation of code-copying. If all bytecode instructions were copyable, a superinstruction span would equate to a basic block, but since not all bytecode instructions are copyable superinstructions are usually smaller.

Static measurements for our three VMs are shown in Table 5.1: the overall size of each actual interpreter loop, the number of distinct bytecodes, the size of instructions, and the relative percentage of the number of copyable bytecode instructions. Copyable instructions were manually identified for this data. Lines-of-code is of course a very approximate measure, not taking into account usage of C macros or inlined functions. SableVM for example does not use C macros, but does generate C code using a preprocessor (m4); the number of lines is thus higher than in other interpreters even for similar bytecodes. Coupled with the complexity of any internal functions called by bytecode implementations this, as we will show later in comparison to dynamic results, makes the average static size of bytecodes a fairly poor indicator of runtime complexity. The relative number of copyable bytecodes is more useful; while it is also only a static indication of performance, suggesting the proportion of execution time that may benefit from code-copying, it reflects runtime performance quite well.

Dynamic results are shown in Figure 5.1. The actual execution time is measured, and the relative amount of runtime spent in the interpreter loop is shown, along with the relative number of bytecodes loaded at runtime that were statically identified as copyable. This data is from the x86 64 architecture, but other machines demonstrate comparable results. It is important to note that these statistics can be easily gathered


early in the VM design process, and importantly before any effort is invested in implementing code-copying.

5.2.3 Analysis

Analysis of these simple measurements already provides strong indicators for the potential performance of code-copying. Here we separately consider first Java, then OCaml, and finally Ruby. Java is an example of a good environment for code-copying; OCaml serves as an example where code-copying excels, and Ruby of an environment in which code-copying will not tend to provide significant improvements.

Analyzing SableVM’s execution behavior is made more complex due to the use of some internal code optimizations. Java, for instance, requires dynamic loading and linking of classes at first runtime reference, and so many otherwise simple bytecodes include more complex implementation behavior to handle this special case. SableVM (and other Java VMs) optimizes this by splitting such instructions into two versions, a slower version to handle initialization, and a faster version for repeated execution. Let us take the example of the PUTFIELD bytecode. The purpose of this bytecode is to set the value of a field in an object. The opcode of this bytecode is followed by a 16-bit parameter that is an index into the constant pool. The element in the constant pool is a symbolic reference to a field in a class. If one were to always assume the pessimistic case and follow the chain of information until the memory address (offset) of an actual field is found, the interpretation process would be incredibly slow. Instead, it is only on the first execution of a specific occurrence of the PUTFIELD bytecode that the pessimistic case is followed. This might result in class loading and static initialization. At the least it is necessary to calculate the address (offset) of the wanted field within a class of objects. Once a specific PUTFIELD occurrence is executed, however, class loading will not be required, and once the relative address is calculated it can be cached and used by subsequent executions of this occurrence of PUTFIELD. Other optimizations include using specialized bytecodes in place of instructions that operate on generic classes of primitives (short, byte, int, etc.). For a more complete explanation of the design see Gagnon [Gag02].
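The mechanism behind this slow/fast split can be illustrated with the following self-contained GNU C toy. It is not SableVM's actual code and none of the names correspond to its real data structures; it only shows the pattern: the slow version performs an expensive resolution once, caches the result in the code array, and patches the instruction so that later executions take the fast path.

#include <stdio.h>

static long expensive_lookup(long key)     /* stands in for field resolution */
{
    return key * 8;                        /* pretend: field offset in bytes */
}

/* opcodes index the dispatch table: 0 = "putfield" in its slow version,
 * 1 = its fast version, 2 = halt; operands[i] is the "constant pool index". */
static long run(const int *opcodes, const long *operands, long times)
{
    static void *dispatch[] = { &&putfield_slow, &&putfield_fast, &&halt };
    void *impl[16];
    long  oper[16];
    long  total = 0;
    int   n, pc;

    for (n = 0; opcodes[n] >= 0; n++) {    /* "prepare" the code array       */
        impl[n] = dispatch[opcodes[n]];
        oper[n] = operands[n];
    }

    for (long t = 0; t < times; t++) {
        pc = 0;
        goto *impl[pc];

    putfield_slow:                          /* first execution only          */
        oper[pc] = expensive_lookup(oper[pc]);  /* resolve once, cache result */
        impl[pc] = dispatch[1];             /* patch to the fast version     */
        /* fall through into the fast version */
    putfield_fast:                          /* all subsequent executions     */
        total += oper[pc];                  /* use the cached "offset"       */
        pc++;
        goto *impl[pc];
    halt:
        ;                                   /* end of one run of the program */
    }
    return total;
}

int main(void)
{
    const int  opcodes[]  = { 0, 2, -1 };  /* putfield (slow), halt          */
    const long operands[] = { 5, 0, 0 };
    /* the expensive lookup runs only once even though we interpret 3 times  */
    printf("%ld\n", run(opcodes, operands, 3));   /* prints 120 (3 * 5*8)    */
    return 0;
}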


Figure 5.1 shows that SableVM spends almost 90% of execution time executing code within the interpreter loop (not counting code executed in functions called from the loop), and that almost 80% of dynamically loaded bytecodes are potentially copyable. With respect to overall execution time, there is certainly significant room for code-copying to achieve very good performance. From Table 5.1, however, the average size of an instruction is relatively large (47 lines). As mentioned, SableVM code, unlike the other two VMs, has expanded macros and moreover many very complex operations related to class loading are implemented directly in the interpreter loop source. The impact of the apparent complexity of bytecodes here can be further discounted by comparing static and dynamic results. 61% of bytecode instructions were statically found to be copyable in the interpreter loop, whereas almost 80% of bytecodes loaded at runtime were copyable. Larger, non-copyable bytecodes seem to be encountered at execution time less often than the shorter, copyable bytecodes. These characteristics suggest a great positive impact of applying code-copying, and as previous works have already noted (e.g. Gagnon [Gag02]), Java bytecode is in fact a great candidate for code-copying, even in the presence of many bytecodes executing complex operations.

OCaml bytecode in many ways resembles that of Java, without much of the complexity brought by Java’s object model, VM and class library interface, and other factors. From Table 5.1, a quick comparison of the source code of SableVM and OCaml reveals that implementations of OCaml instructions are several times smaller than Java instructions (in SableVM), only about 8 lines of code. The fraction of bytecodes defined in the interpreter loop that are copyable is actually higher than in SableVM (72% vs. 61%) while the copyable bytecodes constitute a similar ratio, almost 80% of the loaded bytecodes (Figure 5.1). In this case bytecodes are more evenly sized, and so the dynamic measurement matches the static one better. The OCaml interpreter spends about 85% of execution time in the interpreter loop, also giving it vast space for improvement. With similar overall runtime characteristics, and less complex code size (even accounting for the lack of macro expansion), these initial measurements suggest that the OCaml interpreter may be an even better candidate for code-copying than SableVM.

The size of the Ruby interpreter loop and the average size of an instruction in

terms of lines of code are comparable to those of OCaml (Table 5.1). Still, Yarv has the weakest positive indicators overall. Although 59% of bytecodes are statically copyable, later implementation results showed that applying code-copying to larger bytecodes (more than 100–150 bytes; values in this range produced similar results) reduced performance, and a threshold of 50% of bytecodes provided the best performance (this lower setting is used for actual code-copying experiments later in this chapter). Unlike SableVM or OCaml, the time spent in the interpreter loop of Yarv varied greatly between different benchmarks; Figure 5.1 shows an average of slightly less than 70%, but ranging between 7% and 96%. The number of potentially copyable bytecodes loaded was also much lower, only about 60%, compared to almost 80% in the case of SableVM and OCaml. An inspection of the source code for bytecode instructions further reveals that many of the bytecodes that are small in terms of code size use internal VM function calls, making the bytecodes fairly “heavy” at runtime, and heuristically limiting the impact of code-copying. These properties suggest that code-copying may not bring nearly as much general performance improvement to Yarv as to the other VMs.

Although coarse, for each of the three VMs these observations are sufficient to at least give a relative ranking of expected performance benefit. OCaml has almost all positive attributes, while SableVM suffers from more complex instructions, and Yarv has much less runtime opportunity for its even smaller set of copyable instructions. This gives three widely spread data points for examining code-copying performance. By considering VMs serving as both positive and negative examples we hope to validate our ahead of time analysis, and to allow future researchers and practitioners to understand where opportunities and pitfalls lie in terms of implementing or optimizing a code-copying VM. In the next section we examine the actual performance of code-copying implementations of all of these VMs on different architectures.


5.3 Code-Copying Results

Deeper analysis of code-copying is performed by making detailed measurements of code-copying versions of all three VMs on our different architectures. These implementations all make use of the basic GCC support for code-copying described in Chapter 4. Results from analyzing these implementations demonstrate the value of code-copying in terms of speedup, as well as providing good evidence of how and why code-copying achieves the performance it does, given different VM and architectural characteristics. In our wide performance testing across 3 architectures and 3 virtual machines we used a variety of benchmarks, as described previously. We have made use of hardware counter information for detailed and low-overhead profiling, although this necessarily exposes machine differences and profiling limitations.

The presentation and analysis of results are organized as follows. We first discuss our low-level profiling strategy, along with some of its attendant complexities, followed by the most general overall performance results summary in Table 5.2. We then present detailed performance results for each VM in Figure 5.2, with the absolute runtimes available in Tables 5.7, 5.8, and 5.9. An overview of branch misprediction rates, which we found to be tightly bound to the performance results, is reported in Tables 5.6 and 5.10. This is followed by a closer examination of the best and worst performing benchmark for each VM; precise identification of hardware and VM features that impact performance helps understand performance and orient future optimization and design. In the end we present the results of dynamic metrics, particularly dynamic superinstruction lengths in Figure 5.3, followed by detailed results of multiple metrics for all VMs and benchmarks in Tables 5.3, 5.4, and 5.5. From this we are able to show that the basic ahead of time analysis predictions are validated, and that several potential contributors to code-copying performance do indeed reflect runtime performance. We are looking for answers to the following question: what are the features of a language, its bytecode construction, and thus its virtual machine construction that improve the overall runtime performance, in particular when using the code-copying technique?


5.3.1 Profiling

To assess the interaction of the benchmarked VMs with hardware architectures we used hardware performance counters to do online profiling. On the ia32 and x86 64 machines we used OProfile [OPr] versions 1.9.2 and 1.9.3, respectively. On PowerPC 970 (G5) we used the standard OS X tool Shark 4.5. The goal was to measure the following values for the interpreter loop and copied code:

• CPU time (copied code measured separately),

• number of branches encountered (conditional branches only, whenever possible),

• number of branches mispredicted (or pipeline stalls and other negative events due to branch mispredictions),

• cache misses (treating L2 and L1 separately whenever possible).

Acquiring counter data is straightforward in concept, but poses a number of technical challenges. For instance, because of hardware limitations it is not possible to gather all counter data in a single program execution; only a limited set (and certain combinations) of counters can be used at any one time. We therefore ran each benchmark once as a warm-up with hardware counters disabled, then ran the same benchmark several times with different configurations of hardware counters until all desired data was collected. On the ia32 and x86 64 machines we collected 7 sets of results for each benchmark. Results for the ppc machine are more limited. Available tools permit the measurement of only one hardware counter at a time (there can only be one counter set as Trigger in Shark) and data must be manually processed for the accurate event counts (rather than relative percentages) necessary to evaluate code-copying behavior. On PowerPC 970 (G5) we therefore limited the number of collected results to 3 sets per benchmark for 4 benchmarks per virtual machine (with the exception of Yarv, where we measured 2 benchmarks only).

Variability in hardware counter behavior was also assessed. For each VM we chose 2 benchmarks (the slowest and fastest) and collected data for each counter 7 times (with the exception of PowerPC where, due to the labor-intensive process of gathering


counter data, we limited the number of trials to 3 per counter). With the exception of Yarv, to further ensure any trends visible in the results are actually related to the observed performance we selected 2 more benchmarks, the second slowest and second fastest. For these benchmarks we collected data 3 times (once on PowerPC) for each hardware counter. Primary and secondary tests here correlate well, indicating reasonably stable behavior.

Processing the results collected on Pentium 4 and AMD64, we discarded the highest and the lowest count out of 7 runs, then averaged the remaining 5 and computed their standard deviation. To decide on the level of trust for each result we used the standard deviation divided by the average of the values on which it was computed. This allowed us to easily assess the accuracy of hundreds of results2. The values of this metric are low for the majority of counter measurements. On ia32 and ppc our deviation metric had values below 0.1 and on x86 64 below 0.2 in most cases. Generally higher values existed, as expected, for the noisier cache-related counters. In all cases our observations and relative judgements are based on data changes dramatically larger than the variance. Full data is available in the raw data results [Raw], not included here for space reasons.

On the ia32 machine we collected hardware counter results in two runs with the following settings:

• First run:

– L2 cache misses: BSQ CACHE REFERENCE events with mask 0x700, trigger count 6000.

– Branches: BRANCH RETIRED events with mask 0xc, trigger count 6000.

– Mispredicted branches: RETIRED MISPRED BRANCH TYPE events with mask 0x1f, trigger count 6000.

• Second run:

– CPU time: GLOBAL POWER EVENTS events, trigger count 100000.

2For example, for a series of values 3, 5, 7 this deviation metric is the same as for 30, 50, 70.


On the x86 64 machine we collected hardware counter results in two runs with the following settings:

• First run:

– L1 instruction cache misses: INSTRUCTION CACHE MISSES, trigger count 600.

– L1 data cache misses: DATA CACHE MISSES, trigger count 600.

– Total no. of branches: RETIRED BRANCH INSTRUCTIONS, trigger count 600.

– Mispredicted branches: RETIRED MISPREDICTED BRANCH INSTRUCTIONS, trigger count 600.

• Second run:

– L2 cache misses: L2 CACHE MISS events with mask 0x7, trigger count 600.

– Instruction fetch stalls: INSTRUCTION FETCH STALL events, trigger count 600.

– Dispatch stall because of branch abort: DISPATCH STALL FOR BRANCH ABORT, trigger count 600.

– CPU time: CPU CLK UNHALTED, trigger count 6000.

On the ppc machine we collected each hardware counter result in a separate run:

• Flush caused by branch misprediction (presented as branches mispredicted in the next section): trigger count 6000.

• Total number of branches: trigger count 60000.

• CPU cycles: trigger count 600000.

• L2 data cache misses: trigger count 600.


• L2 instruction cache misses: trigger count 600.

The following hardware counter results were also collected on PowerPC but yielded insignificant or unrelated results (as can be seen in the raw data results [Raw]):

• Branch mispredictions because of: target address misprediction or valid instructions available but Instruction Fetch Unit held by Branch Information Queue or Instruction Decode Unit. Trigger count 6000, raw data name BranchMispredTargetPlus.

• Branch misprediction because of: Condition Register value. Trigger count: 600, raw data name BranchMispredCR.

• Branch misprediction because of: target address prediction events. Trigger count: 600, raw data name BranchMispredTarget.

On all architectures, to ensure the best possible view of hardware behavior, we used higher trigger counts for events that occur more frequently (like CPU cycles) and lower trigger counts for rare events (like cache misses). This is also subject to software and hardware constraints; e.g., we were not able to use trigger counts below 6000 on the Pentium 4. To obtain more comparable results for counters collected at different rates we rescaled the final values as if all hardware counters were set to trigger events at the same counter value, 10000.

Code-copying is primarily an optimization of the interpreter loop, and so in most data gathering we focus on the execution of the loop (and copied) code. Our measurements are narrowed to only those events that were registered while executing inside of the interpreter function (static code space) or in the copied code. Events registered in functions called from the interpreter function were not taken into account. In particular, in the current section, unless otherwise specified, total time values are only measured for the time spent in the interpreter loop, be it that of a direct-threaded interpreter or a code-copying one. Time in copied code values tell how much of the total time spent in the interpreter loop was spent in copied code. Other hardware counter values related to branches and cache behavior were also analyzed only for the code executed inside of the interpreter loop.


          SableVM   OCaml   Yarv
ia32      1.44      2.81    1.14
x86 64    1.32      1.80    1.06
ppc       1.05      1.52    1.03

Table 5.2: Average speedups of total execution time of code-copying over direct-threading measured across virtual machines and architectures running all used benchmarks. Note this table presents the overall VM performance, as opposed to other results in this chapter that measure VM behavior only within the interpreter function.

5.3.2 Dynamic behavior of copied code

We have gathered both overall and detailed performance and other profiling data from code-copying implementations of all three VMs. Overall speedup is summarized in Table 5.2, showing the ratio of direct-threaded execution time to code-copying time. More detailed examination is done through a variety of metrics, presented in Tables 5.3, 5.4, and 5.5 on pages 75, 76, and 77.

The speedups in Table 5.2 show that across architectures code-copying works best on ia32, with high average speedups on our selected benchmarks of 1.44, 2.81, and 1.14 for SableVM, OCaml, and Yarv respectively. On x86 64 code-copying also achieves high speedups of 1.32 and 1.80 on SableVM and OCaml, and more marginal improvement with Yarv. The PowerPC implementation of code-copying is able to reach a high speedup of 1.52 only for the OCaml interpreter, with only small improvement for SableVM and Yarv. As our ahead of time analysis predicted, across virtual machines the best performance was observed on OCaml (reaching a maximum speedup of 4.88 on the ia32 execution of the quicksort.fast benchmark), with good but less-improved performance on SableVM, and minimal overall improvement to Yarv.

To gain a better insight into actual dynamic usage of copied code we gathered further profiling data pertaining to bytecode and superinstruction creation and execution and translated them into several dynamic metrics. Tables 5.3, 5.4, and 5.5 summarize the following measurements for our three virtual machines (a small sketch showing how the last of these metrics can be computed from per-instruction profiles follows the list). Note that we only present these metrics on the x86 64 architecture, as they are similar between

architectures with appropriate scaling.

• Average bytecodes executed per dispatch. Calculated as the number of executed bytecodes divided by the number of dispatches (jumps), this is the length of executed superinstructions in bytecodes, weighted by the relative number of executions of each sequence. More directly, this metric allows us to see how many dispatches are removed by code-copying and thus the length of an average executed instruction.

• Copied code memory usage. This is the amount of memory allocated and used to store copied code (superinstructions). Excessive memory usage can contribute to reduced performance, and so it is important to know the extent of code-copying memory requirements.

• Unique superinstructions. The number of superinstructions created during a VM execution measures the variety of execution, giving indications of memory requirements and potential overhead.

• Copied code memory divided by number of unique superinstructions. This is the average amount of memory used by a superinstruction; another indicator of size and resource requirements.

• Memory occupied by 90% of used copied code instructions. Heuristically, the majority of instructions will come from a small set of potential instructions. We counted the total number of executions of both simple and superinstructions, then found the subset of instructions that make up 90% of executed instructions. Out of those 90% we summed the memory used only by superinstructions (created using code-copying).

• Memory occupied by 90% of used copied code instructions divided by copied code memory usage. This is the percentage of total memory used for copied code that accounts for 90% of executed code.
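As referenced above, the last two metrics can be computed directly from per-instruction execution counts and code sizes. The sketch below uses our own hypothetical data layout (an array of per-instruction statistics); it is only meant to make the definition concrete.

#include <stdlib.h>
#include <stddef.h>

typedef struct {
    long   executions;   /* how many times this (super)instruction ran       */
    size_t code_bytes;   /* copied-code size; 0 for a simple instruction     */
} insn_stats;

static int by_executions_desc(const void *a, const void *b)
{
    const insn_stats *x = a, *y = b;
    return (y->executions > x->executions) - (y->executions < x->executions);
}

/* Memory used by superinstructions within the most frequently executed
 * instructions that together account for 90% of all executions ("memory
 * occupied by 90% of used copied code instructions").                       */
static size_t memory_of_90_percent(insn_stats *insns, size_t n)
{
    long   total = 0, covered = 0;
    size_t mem = 0;

    for (size_t i = 0; i < n; i++)
        total += insns[i].executions;

    qsort(insns, n, sizeof insns[0], by_executions_desc);

    for (size_t i = 0; i < n && covered * 10 < total * 9; i++) {
        covered += insns[i].executions;
        mem     += insns[i].code_bytes;   /* only copied code contributes    */
    }
    return mem;   /* divide by total copied-code memory for the last metric  */
}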

              Avg. bytecodes   Copied code   Unique super-   Copied c. mem./   90% copied   90% copied c. mem./
              per dispatch     memory        instructions    unique sup.       code mem.    copied c. mem.
              [bc]             [kB]                          [B]               [kB]         [%]
SableCC       2.13             505           1972            262               16           3.16
Soot          2.04             975           3698            270               17           1.69
compress      4.70             294           1232            244               10           3.48
db            2.66             290           1307            227               4            1.21
jack          2.10             358           1562            235               17           4.76
javac         2.40             590           2607            232               18           2.94
jess          1.99             398           1744            233               8            2.08
mpeg.         7.96             750           1557            493               23           2.99
mtrt          1.74             359           1591            231               5            1.25
bloat         2.20             1227          4065            309               4            0.32
fop           2.35             1891          4050            478               16           0.84
luindex       2.98             961           3366            292               22           2.27
pmd           1.86             1195          3862            3175              10           0.80
Average       2.84             719           2421            288               13           2.30

Table 5.3: Dynamic bytecode execution metrics for SableVM (Java) on x86 64.

Per-benchmark performance data is further presented in Figure 5.2. Branch prediction behavior is known to be critical to code-copying performance [EG03c,ETK06], and so we present data in Tables 5.6 and 5.10. Another important property, both for understanding performance and for considering future, further optimizations, is the length of superinstructions; this is shown in Figure 5.3. For the best and worst performing benchmark in each VM we give further hardware counter data, including cache behavior. This is shown for SableVM in Figures 5.4 and 5.5, for OCaml in Figures 5.6 and 5.7, and for Yarv in Figures 5.8 and 5.9.

               Avg. bytecodes   Copied code   Unique super-   Copied c. mem./   90% copied   90% copied c. mem./
               per dispatch     memory        instructions    unique sup.       code mem.    copied c. mem.
               [bc]             [kB]                          [B]               [kB]         [%]
almabench      7.68             114           675             172               23           20.07
almabench.fast 7.68             111           675             169               21           18.56
bdd            3.24             31            242             131               3            10.69
fft            17.34            32            171             189               9            27.02
fft.fast       17.34            32            171             189               9            27.02
kb             2.03             41            425             98                1            3.00
nucleic        2.14             113           734             157               2            2.06
quicksort      3.00             19            152             131               1            4.90
quicksort.fast 2.88             18            159             113               1            5.15
ray            2.34             93            710             134               4            4.51
sorts          3.30             214           1478            148               5            2.30
norm           5.12             82            635             132               1            1.54
Average        6.17             75            519             145               7            9.70

Table 5.4: Dynamic bytecode execution metrics for OCaml on x86 64.

                            Dynamic avg.     Copied code   Unique super-   Copied c. mem./   90% copied   90% copied c. mem./
                            sup. ins. len.   memory        instructions    unique sup.       code mem.    copied c. mem.
                            [bc]             [kB]                          [B]               [kB]         [%]
fib                         1.37             0.42          16              27                0.2          46.15
pentomino                   1.22             4.98          72              71                0.4          8.39
tak                         1.44             0.62          16              40                0.1          15.65
meteor-contest              1.21             14.33         145             101               0.3          2.11
nsieve                      1.53             2.18          41              54                0.3          15.41
Average (without test-all)  1.36             4.51          58              59                0.3          17.54
test-all                    1.29             436.29        2328            192               2.4          0.53
Average (with test-all)     1.34             76.47         436             81                0.6          14.71

Table 5.5: Dynamic bytecode execution metrics for Yarv (Ruby) on x86 64.


                            Direct                       Copying
VM        benchmark         ia32     x86 64   ppc        ia32     x86 64   ppc
SableVM   compress          58.4%    37.3%    34.3%      3.8%     5.6%     17.3%
          jack              40.7%    33.0%    29.5%      20.2%    15.0%    22.2%
OCaml     quicksort.fast    53.2%    20.9%    28.7%      4.5%     4.2%     11.3%
          almabench         38.9%    27.1%    37.0%      3.8%     6.5%     26.1%
Yarv      nsieve            12.8%    15.2%    15.5%      0.2%     2.5%     3.8%
          pentomino         9.6%     14.2%    12.8%      6.2%     6.7%     10.8%

Table 5.6: Direct-threaded and code-copying implementation branch misprediction percentages across architectures, for best and worst performing benchmark on each VM. For ia32 and x86 64 the numbers represent the ratio of mispredicted branches to total branches, and for PowerPC the numbers represent the ratio of instruction pipeline flushes due to branch misprediction to total branches.

To make observations of benchmark behavior based on hardware counter data easier, in Figures 5.4 to 5.9 an additional marking showing the expected impact of the measured hardware event on the overall performance is added under each bar chart of hardware counter results. The ‘++’ symbol indicates a very substantial improvement from using code-copying over direct-threading, ‘+’ indicates a useful improvement, ‘≃’ means a lack of meaningful change, and ‘–’ an expected degradation of performance.

5.3.3 General Findings

Overall, branch prediction is, unsurprisingly, one of the main reasons for performance improvement. Reductions in the rate of branch mispredictions (or pipeline flushes due to branch misprediction) closely follow performance. Reductions in the total number of branches, as measured through average superinstruction length, also mirror performance. Both are, however, constrained by the relative amount of time spent in


Figure 5.2: Performance (speedup) results for a) SableVM (Java), b) OCaml, and c) Yarv (Ruby). Note in this figure we measured the overall VM performance, as opposed to other results in this chapter that measure VM behavior only within the interpreter function.


Figure 5.3: Average lengths (in bytecodes) of executed superinstructions for a) SableVM, b) OCaml, c) Yarv.


benchmark     ia32      x86 64    ppc
SableCC       23.04     14.72     21.23
Soot          373.86    230.96    363.31
compress      213.35    101.70    148.96
db            76.97     48.38     74.22
jack          36.71     22.30     31.61
javac         60.26     34.14     52.49
jess          38.25     24.14     36.96
mpegaudio     178.70    84.76     126.82
mtrt          42.04     29.91     36.83
fop           247.91    159.76    237.01
luindex       2418.01   1356.92   1968.47
pmd           1385.74   884.92    1219.03

Table 5.7: Absolute runtimes (in seconds) of Java benchmarks presented in Figure 5.2 for the direct-threaded engine of SableVM.


benchmark     ia32      x86 64    ppc
almabench     50.52     33.50     49.33
alma.fast     50.09     32.95     47.87
bdd           16.77     11.23     16.14
fft           13.01     7.57      12.33
fft.fast      14.27     7.46      12.41
kb            18.02     11.82     18.41
nucleic       28.73     18.36     25.98
quicksort     14.92     6.25      12.91
qsort.fast    15.03     5.71      11.22
ray           175.56    110.21    165.68
sorts         46.63     30.26     45.47
spec-norm     93.69     60.33     96.30
takc          15.13     6.78      13.45
taku          24.47     11.12     24.71

Table 5.8: Absolute runtimes (in seconds) of OCaml benchmarks presented in Figure 5.2 for the direct-threaded engine of OCaml.

benchmark        ia32     x86 64   ppc
fib              27.45    24.10    43.00
pentomino        43.75    40.89    69.13
tak              2.45     2.22     3.92
vm1 length       3.38     2.87     7.76
meteor-contest   26.55    27.39    46.78
nsieve           17.72    14.25    24.51
test-all         14.49    13.96    21.14

Table 5.9: Absolute runtimes (in seconds) of Ruby benchmarks presented in Figure 5.2 for the direct-threaded engine of Yarv.


VM         Direct (ia32)   Copying (ia32)
SableVM    37%             11%
OCaml      44%             5%
Yarv       12%             8%

Table 5.10: Average branch misprediction percentages across benchmarks for ia32 on each VM, direct-threaded and code-copying.


a) (bar chart not reproduced; per-event impact markings: ++ ++ ++ ++ ++ ++ ++)

b)
                              Dir    Cpy     Cpy      Cpy
                                     ia32    x86 64   ppc
total time (interp. func.)    1.0    0.31    0.31     0.65
time in copied code           0.0    0.74    0.51     0.79
branches total                1.0    0.58    0.46     0.64
branches mispred. / total     1.0    0.07    0.15     –
branches mispred. flush       1.0    –       –        0.32
branch abort stalls           1.0    –       0.40     –
instruction fetch stalls      1.0    –       0.45     –

c)
          ia32            x86 64            ppc
          Dir    Cpy      Dir     Cpy       Dir    Cpy
L1 d      –      –        3461    4349      –      –
L1 i      –      –        6       2052      –      –
L2 d      –      –        –       –         53     153
L2 i      –      –        –       –         3      8
L2 i+d    2      4        122     385       –      –

Figure 5.4: Hardware counter results for SableVM JVM running the SPEC compress benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles.


a) (bar chart not reproduced; per-event impact markings: + + + + + +)

b)
                              Dir    Cpy     Cpy      Cpy
                                     ia32    x86 64   ppc
total time (interp. func.)    1.0    0.83    0.63     1.52
time in copied code           0.0    0.45    0.31     0.62
branches total                1.0    0.79    0.70     0.80
branches mispred. / total     1.0    0.50    0.45     –
branches mispred. flush       1.0    –       –        0.60
branch abort stalls           1.0    –       0.67     –
instruction fetch stalls      1.0    –       0.82     –

c)
          ia32            x86 64            ppc
          Dir    Cpy      Dir     Cpy       Dir    Cpy
L1 d      –      –        9167    8721      –      –
L1 i      –      –        1450    13233     –      –
L2 d      –      –        –       –         25     212
L2 i      –      –        –       –         11     4041
L2 i+d    3      16       76      311       –      –

Figure 5.5: Hardware counter results for SableVM JVM running the SPEC jack benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles.


a) (bar chart not reproduced; per-event impact markings: ++ ++ ++ ++ ++ + ++)

b)
                              Dir    Cpy     Cpy      Cpy
                                     ia32    x86 64   ppc
total time (interp. func.)    1.0    0.15    0.48     0.57
time in copied code           0.0    0.76    0.68     0.77
branches total                1.0    0.50    0.55     0.48
branches mispred. / total     1.0    0.08    0.20     –
branches mispred. flush       1.0    –       –        0.19
branch abort stalls           1.0    –       0.71     –
instruction fetch stalls      1.0    –       0.54     –

c)
          ia32            x86 64            ppc
          Dir    Cpy      Dir     Cpy       Dir    Cpy
L1 d      –      –        522     472       –      –
L1 i      –      –        1       2         –      –
L2 d      –      –        –       –         2      2
L2 i      –      –        –       –         0      1
L2 i+d    0      1        5       10        –      –

Figure 5.6: Hardware counter results for OCaml running the quicksort.fast benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles.


a) (bar chart not reproduced; per-event impact markings: + ++ ++ ++ + +)

b)
                              Dir    Cpy     Cpy      Cpy
                                     ia32    x86 64   ppc
total time (interp. func.)    1.0    0.38    0.69     0.61
time in copied code           0.0    0.94    0.92     0.85
branches total                1.0    0.43    0.71     0.55
branches mispred. / total     1.0    0.10    0.24     –
branches mispred. flush       1.0    –       –        0.39
branch abort stalls           1.0    –       0.31     –
instruction fetch stalls      1.0    –       0.90     –

c)
          ia32            x86 64            ppc
          Dir    Cpy      Dir     Cpy       Dir    Cpy
L1 d      –      –        10749   8979      –      –
L1 i      –      –        401     6491      –      –
L2 d      –      –        –       –         2      12
L2 i      –      –        –       –         2      9
L2 i+d    0      1        47      185       –      –

Figure 5.7: Hardware counter results for OCaml running the almabench benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles.


a) (bar chart not reproduced; per-event impact markings: + ++ ++)

b)
                              Dir    Cpy     Cpy      Cpy
                                     ia32    x86 64   ppc
total time (interp. func.)    1.0    0.73    1.05     0.81
time in copied code           0.0    0.20    0.22     0.38
branches total                1.0    0.84    1.85     0.90
branches mispred. / total     1.0    0.02    0.17     –
branches mispred. flush       1.0    –       –        0.22
branch abort stalls           1.0    –       1.15     –
instruction fetch stalls      1.0    –       1.32     –

c)
          ia32            x86 64            ppc
          Dir    Cpy      Dir     Cpy       Dir    Cpy
L1 d      –      –        2108    1914      –      –
L1 i      –      –        2       2         –      –
L2 d      –      –        –       –         6      5
L2 i      –      –        –       –         2      3
L2 i+d    428    598      522     633       –      –

Figure 5.8: Hardware counter results for Yarv Ruby VM running the nsieve benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles.


a) (bar chart not reproduced; per-event impact markings: + +)

b)
                              Dir    Cpy     Cpy      Cpy
                                     ia32    x86 64   ppc
total time (interp. func.)    1.0    1.18    1.18     1.21
time in copied code           0.0    0.11    0.12     0.17
branches total                1.0    1.01    1.12     0.91
branches mispred. / total     1.0    0.64    0.47     –
branches mispred. flush       –      –       –        0.77
branch abort stalls           1.0    –       1.13     –
instruction fetch stalls      1.0    –       1.13     –

c)
          ia32            x86 64            ppc
          Dir    Cpy      Dir     Cpy       Dir    Cpy
L1 d      –      –        2340    3585      –      –
L1 i      –      –        4235    2484      –      –
L2 d      –      –        –       –         7      5
L2 i      –      –        –       –         7      6
L2 i+d    16     26       133     120       –      –

Figure 5.9: Hardware counter results for Yarv Ruby VM running the pentomino benchmark. Subfigures a) and b) present branch-related results. Subfigure c) shows the number of i-cache and d-cache misses per 1M (1000000) CPU cycles.


copied code.

Branch misprediction (or pipeline flush due to branch misprediction) rates are found in Tables 5.6 and 5.10. A comparison between the rates for direct-threaded and code-copying VMs reveals a correlation between these results and performance. In each of the analyzed cases a significant drop in branch mispredictions due to code-copying results in a significant performance improvement, and smaller branch misprediction drops resulted in more moderate performance improvements.

The relative impact can largely be explained by considering the branch prediction capabilities of the different architectures. The Pentium 4 (Prescott core) has a 31-stage pipeline, along with 4k entries in the front-end BTB (Branch Target Buffer) table, and 2k entries in the back-end BTB. A specialized predictor borrowed from the Pentium M series is used to improve the prediction of indirect branches. Unfortunately, this predictor has serious performance problems with consecutive indirect branches, and is designed to work best when indirect branching is interleaved with direct branches, a property which is generally not true of direct-threaded code execution. Limitations such as this, the relatively small size of the BTB tables, and a very long pipeline mean the impact of complex branching can be large on the Pentium 4, and we conclude that these are the reasons for the very high branch misprediction rates in the direct-threaded engines.

The AMD64 X2 Dual CPU (Hammer core) has a 12-stage pipeline, with a branch misprediction penalty of 11 cycles. It uses three branch prediction structures. The local branch history table has 2k entries, the global history table has 16k entries, and a branch selection table is used to decide which of these two predictors is expected to give a more accurate prediction. Additionally it also has a specialized unit called the branch target address calculator which diminishes the penalty caused by a wrong prediction. A short pipeline, an advanced, hybrid prediction strategy, and more abundant resources allow this architecture to greatly reduce misprediction rates compared to the Pentium 4.

Our PowerPC G5 (970FX) uses 10 execution units per CPU with a 25-stage pipeline. Similar to the AMD64 it employs a three-part branch prediction strategy,

90 5.3. Code-Copying Results although with tables allowing 16k entries each. The global history table contains 11- bit long vectors of branch execution history. Up to 2 branches can be predicted per cycle and up to 16 predicted branches can be in flight. Despite the larger tables, on a directed-threaded engine this has a roughly comparable branch-prediction behavior to AMD64. Improvements brought by code-copying mostly correlate well with branch pre- diction behavior—the AMD64 machine is able to better predict branch targets for SableVM and OCaml than the Pentium 4 (ia32 architecture), and so the drop in mispredictions due to using code-copying is correspondingly smaller on x86 64 than ia32. PowerPC shows even less impact from branch prediction improvements, and this hierarchy is reflected in the overall performance of the three architectures. The code-copying implementation of the compress benchmark, for example, reduces the misprediction rate on ia32 from 0.584 to 0.038 (just 7% of direct-threading), on x86 64 from 0.373 to 0.056 (down to 15%), on PowerPC pipeline flushes due to branch mis- predictions drop from 0.343 to 0.173 (32%). Similarly, (Figure 5.2) ia32 performance is greatly increased (speedup 2.66), x86 64 performance is nicely improved (speedup 1.92), and PowerPC performance improved somewhat (speedup 1.45). The number of branches is obviously an important performance factor affected by code-copying, both through reduced branch mispredictions, and with the elision of branches within a superinstruction, through the execution of fewer instructions. The average, dynamic length of superinstructions (in bytecodes) is shown in Figure 5.3. this behavior can be directly related to the average VM performance shown in sum- mary Table 5.2. The longest superinstructions (6.17 bytecode on average) are found in OCaml and it also delivers the best performance out of all 3 VMs. Somewhat shorter (2.84 bytecode on average) superinstructions are used by SableVM, allowing it to deliver very good performance, but not as much improved as OCaml. Yarv execution results in very short (1.36 bytecode on average) superinstructions, and comparatively poor overall performance. Unfortunately, although Yarv still has about half of its bytecodes copyable these do not tend to form large contiguous sequences at runtime. The overall impact of benefits to branch prediction and code execution are reduced if they are not applied reasonably ubiquitously. Data in proportion to time spent in

actual copied code as opposed to non-copyable bytecode executions is shown for the individual case studies in Figures 5.4 through 5.9. In the case of OCaml benchmarks in Figures 5.6 and 5.7, for example, we see that the fraction of interpreter loop time spent in copied code ranges from about 70% to about 90%. The other extreme is again represented by Yarv, where the time spent in copied code ranges from only about 15% to about 25%. Modulo cache and other non-local effects these ratios provide an upper limit on the potential performance improvements.

5.3.4 Overhead

Overhead in a code-copying system comes from several sources. Actual copying of code increases code-preparation time, and memory and superinstruction management add additional costs, although these are one-time costs amortized over the lifetime of execution. Ongoing overhead is mainly due to changes in code generation from the compiler enhancements that support code-copying. Some of the modifications made to the compiled code by our passes introduce elements into the code that can act as code-motion barriers (for example, volatile asm statements) and prevent certain optimizations across such a barrier. This addition of code-motion barriers can be expected to inhibit or alter the application of some optimizations.

To measure the overall overhead we compared the performance of direct-threaded VMs with the performance of code-copying VMs where the copied code is created, but not actually used at runtime. Figure 5.10 (page 93) summarizes the results; SableVM data is not gathered due to difficulties in separating the code-copying engine from direct-threaded execution. Overhead varies considerably, changing performance on average between -5% and 11%, and overall between -20% and nearly 25% for the two VMs. Negative overhead is possible due to instruction cache changes, and perhaps poor heuristic performance of basic block rearrangement or other optimizations that are (locally) prevented by the code-copying enhancement. We note, however, that for OCaml overhead is fairly low and often negative; for Yarv overhead is almost uniformly positive and sometimes large. This also partially accounts for and reflects the lower improvement experienced by Yarv due to code-copying.
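The kind of code-motion barrier referred to above can be written in GNU C as a volatile asm statement with a memory clobber; GCC will not move memory accesses across it. Whether this exact form matches the statements emitted by our passes is an implementation detail; the snippet below simply illustrates the construct.

/* An empty volatile asm with a "memory" clobber: GCC treats it as an
 * opaque instruction that may read or write any memory, so loads, stores
 * and many optimizations cannot be moved across it.                       */
#define CODE_MOTION_BARRIER() __asm__ __volatile__("" ::: "memory")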


Figure 5.10: Overhead of VMs with pragmas inserted around bytecode instructions used by code-copying and with copied code prepared but not executed, over standard direct-threaded VMs. Part a) shows OCaml, b) Yarv, and c) the averages:

                 OCaml     Yarv
    ia32        -2.78%    10.57%
    x86_64       1.36%     3.19%
    ppc         -3.09%     5.93%


5.3.5 Further Performance Factors

While branch instruction effects are primary in a general sense, individual benchmarks naturally vary and raise a number of issues that are not obvious from a simple consideration of the branch-reducing nature of code-copying or its basic overhead. Instruction cache behavior, for instance, is also potentially negatively affected by the size and number of superinstructions. Other benchmark-specific and compiler-driven factors can further intrude. Below we discuss these issues in relation to our experimental data.

Instruction Cache Impact

Using any dynamic code creation technique carries the danger of creating too much code and lowering the performance (increasing the miss rates) of the CPU instruction cache. Our results show that excessive code size is not a significant concern for our VMs and tests. From Tables 5.3, 5.4, and 5.5 we see that copied code usually occupies at most a few hundred bytes per superinstruction, with total size averaging less than 1MB in Java and well under 100kB in OCaml and Ruby. Despite this large size difference, a closer look at the code that matters most shows the VMs are both similar and not overly affected by the number of distinct superinstructions. When looking only at the 90% most actively executed bytecode instructions, memory requirements are much reduced: 13kB on average for Java, 7kB on average for OCaml, and never more than 23kB in any tested case.

Code size is a rough predictor of I-cache performance; better information is obtained by measuring actual cache miss behavior. Part c) of Figures 5.4 through 5.9 (pages 84 through 89) shows runtime instruction and data cache miss rates for the best and worst performing benchmarks. While there is a general, and sometimes marked (Figure 5.4), increase in L1 I-cache misses, the effect is not uniform and does not correlate well with performance. Of course L1 misses are not necessarily that expensive if caught by the L2 cache; a more important concern with L1 usage is that an increase in I-cache pressure can cause more spillover into the L2, and thus a greater chance of having to undergo the costly step of retrieving data from main memory. This effect is evident in the SPEC jack benchmark on PowerPC; in part c) of Table 5.5 there is a large increase in the L2 I-cache miss rate due to code-copying. We believe that this higher L2 I-cache miss rate is part of the reason for this benchmark's lower performance on PowerPC.

Improvements to instruction cache miss rates may be possible by exploiting the heavy concentration of execution in a relatively small number of superinstructions. We expect we could achieve most of the performance benefits of code-copying by creating only a small fraction of the code we create currently, although this requires profiling or adaptive techniques to discover the 90% most actively used superinstructions.

Extreme Case of Code Preparation Overhead

The test-all benchmark on Yarv is the largest Ruby benchmark we used, and its behavior and characteristics are very different from the other benchmarks. This benchmark performs surprisingly poorly, but for other reasons than originally expected. As can be seen in the dynamic metrics in Table 5.5, it creates a large amount of copied code, over 3200 superinstructions using over 430kB of memory, an exceptionally large number of superinstructions in comparison with other Ruby benchmarks. Being a code sanity test, the bulk of the code in this program consists of small tests executed once, or only a few times. As a result, only about 0.5% of the code created accounts for 90% of execution. Code-copying itself has little positive impact in this situation; interestingly, as seen in the actual performance of test-all (Figure 5.2), the larger overhead of copied code does not have as large a negative impact as these relative results for Yarv may suggest.

Performance degradation (on Intel-like architectures) is in fact dominated by code preparation costs, in which the VM initializes code for execution, adding an overhead which is not in this case amortized over multiple executions of the created code: only about 7% of execution time is spent in the interpreter loop. Since the behavior of this benchmark is unusual we present averages in Table 5.5 with and without the test-all benchmark included.

Extreme Case of Inhibited Compiler Optimizations

In early experiments with the code-copying version of Yarv an anomaly was found when running the nsieve benchmark on the x86_64 architecture. Paradoxically, the number of branches executed in the interpreter loop grew by a factor of 1.85 in the code-copying engine; moreover, disabling runtime use of code-copying and re-measuring performance produced the same result. We attribute this to the interaction of code-copying and compiler optimization, an overhead consideration in general. Non-localized disabling of branch or basic block reordering optimizations, in combination with benchmark-specific behavior, would account for the increased number of branches in the interpreted code, irrespective of whether it was actually code-copying or executing code in the normal, direct-threaded fashion. This behavior did not occur with other benchmarks or on other platforms, and while it may have reduced performance it did not result in a slowdown: the overall overhead for nsieve is still small on x86_64 (Figure 5.10). In Chapter 6 we show how, by using source-optimized superinstructions, we can provide performance equal to or better than that of code-copying and avoid inhibiting any compiler optimizations by removing the need for #pragma copyable.

Summary

Different languages and VM (bytecode) designs have a strong impact on the performance benefit provided by code-copying. At one end of the scale, languages and bytecode instruction sets like Ruby (and the particular VM we used, Yarv) introduce several problems when it comes to the application of code-copying. Outwardly small bytecodes in fact perform significant amounts of runtime work, making numerous calls to helper functions and often changing the control flow of the program. This constrains code-copying, making long superinstructions unlikely. With little time overall spent in the actual interpreter loop, the main positive effect of code-copying, reducing branch costs, is greatly lessened.


Dramatically better effects are shown for a language like OCaml, which has functionally small, fine-grained bytecodes containing mostly simple operations. Full implementations are also typically contained within the bytecode itself, with no need for externally called functions. Code-copying applies very well to this kind of VM design, since it is possible to create many superinstructions (improving branch prediction, even for short superinstructions) and superinstructions that are longer (removing many branches). Java bytecode has qualities of both Ruby and OCaml; performance fortunately falls closer to the OCaml side of the spectrum. Further simplification of bytecode implementations, however, would likely increase the performance improvement.

All behavior is affected by the quality of hardware branch prediction. In general PowerPC shows the least gains and Intel 32-bit the most, largely reflecting the relative branch prediction capabilities of the different architectures. The effect of hardware is still not as pronounced as aspects of virtual machine design, and hardware resources are scarce, but clearly improved hardware branch prediction would be another source of optimization especially applicable to interpreter-based virtual machines.

5.4 Notes on Related Work

The related work for this chapter is mostly in the area of hardware branch predictor behavior during interpreter execution. Application of the code-copying technique to GForth has been analyzed in several studies by Ertl [EG03c, ETK06, EG03b], showing that the majority of the performance improvement is due to improvements in branch prediction. Ertl et al. also compared code-copying (called dynamic superinstructions in these works) to other techniques intended to improve branch prediction behavior, like dynamic and static instruction replication and static creation of superinstructions. They demonstrated that all of these techniques (often combined together) can also bring significant performance improvements. Their results also demonstrated, once again, that the speedup due to branch prediction improvements, as expected, outweighs other negative effects, such as a slight increase in instruction-cache misses. Vitale et al. [VZ05] analyzed the applicability of the code-copying technique to a Tcl interpreter characterized by large bytecode bodies and, on a set of dispatch-intensive benchmarks, achieved an almost 10% speedup.

In our work we performed a much broader analysis of hardware behavior by employing 3 hardware architectures and testing code-copying implementations for 3 vastly different programming languages. This allowed us to draw conclusions across different hardware and programming languages, giving a more comprehensive view of the applicability of the code-copying technique. More information regarding the unique extent and direction of our work can be found in the section that follows.

5.5 Conclusions

An examination of the behavior of different languages and virtual machines is useful for any cross-context optimization. Code-copying has been prototyped in single environments before, but our work represents the first multi-language, multi-platform examination of performance, based on a safe implementation model for code-copying. There is clearly a spectrum in the performance impact of code-copying, with behavior depending on virtual machine implementation, bytecode design, and hardware considerations.

Although various factors can influence results, bytecode properties dominate. Simple bytecodes can be easily collected into superinstructions, and imply more time in the actual interpreter loop, raising the upper bound on potential impact. Virtual machine designs that stress simple as opposed to complex bytecode behavior represent a trade-off between simplicity of code generation/execution and smaller code size. Code-copying can be applied in both scenarios, but practical limitations mean the performance improvement is best for simple designs, where overhead is heavily concentrated in interpreter loop dispatch costs.

Our choice of VMs allowed us to begin with an established, existing implementation of code-copying and then extend our field of experimentation with one VM (OCaml) whose characteristics suggested it might be an even better candidate for code-copying than the first VM (SableVM), and another VM (Yarv) whose characteristics suggested that obtaining speedups from the use of code-copying might be problematic. We believe this choice gave us an opportunity to explore both positive and negative examples of code-copying application, allowing future researchers and practitioners to understand both what there is to be gained and why, and what they need to avoid.

Determining the potential impact of code-copying ahead of time benefits virtual machine designers as well as implementers contemplating the use of code-copying. Based on our experiments we are able to tie certain characteristics of bytecode and of interpreters using direct-threading to the performance improvement provided by code-copying. Our experience suggests a short, step-by-step procedure for estimating the degree to which a virtual machine currently using direct-threading may improve after application of code-copying.

1. Determine the bytecode instructions in the source code that can be used in code-copying.

2. Calculate the relative number of bytecodes in the interpreter source code that are potentially copyable. Check whether applying an upper limit to the bytecode size, e.g. 150-200 bytes, makes a difference. Refer to Table 5.1 for evaluation. The more (and the smaller the) copyable bytecodes, the better for code-copying; a small sketch of this calculation appears after this list.

3. On a set of benchmarks measure how many loaded bytecodes are potentially copyable. Refer to Figure 5.1 for evaluation. The more loaded bytecodes that are copyable the better.

4. On a set of benchmarks measure how much time is spent in the interpreter loop (excluding functions called from within the loop). The more consistent the results and the more time is spent in the loop the more generally useful and effective code-copying will be.

5. On a set of benchmarks and using a given architecture measure branch prediction miss rates within the interpreter loop. Refer to Table 5.2 for interpretation. The higher the miss rate, the more room for improvement by using code-copying.
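The calculation in step 2 amounts to a simple count over the per-bytecode data produced at VM build time. The sketch below is a minimal illustration under assumed data structures; the names (bytecode_info, SIZE_LIMIT) are hypothetical and not part of any of the VMs discussed here.

    /* Fraction of bytecodes that are copyable, optionally applying the
     * 150-200 byte size cap suggested in step 2.  Data structures are
     * illustrative only. */
    struct bytecode_info { const char *name; int body_size; int copyable; };

    #define SIZE_LIMIT 200   /* assumed upper bound on body size, in bytes */

    double copyable_fraction(const struct bytecode_info *bc, int n_bytecodes,
                             int apply_size_limit)
    {
        int copyable = 0;
        for (int i = 0; i < n_bytecodes; i++)
            if (bc[i].copyable &&
                (!apply_size_limit || bc[i].body_size <= SIZE_LIMIT))
                copyable++;
        return (double)copyable / n_bytecodes;
    }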


Based on these simple measurements (or as many of them as can be acquired) it should be possible to gain meaningful insight into whether an investment of time and effort into a code-copying implementation for a particular language and virtual machine is worthwhile. Note that an unpromising evaluation outcome does not necessarily mean code-copying will never be applicable to a language or a virtual machine. For example, some forms of bytecode optimization or restructuring, as was originally done for SableVM [Gag02], may improve the end result.

Our implementations of code-copying form an initial step in analyzing and optimizing performance through this technique. The typically small size of the core runtime bytecode working set (the top 90% of executed bytecodes) for all our virtual machines suggests that significant improvements to runtime overhead are possible by not copying all copyable sequences executed, either through adaptive selection of the sequences to copy (e.g., using hotness counts) or by using ahead-of-time profiles to guide choices. This would further improve performance, and make the technique more generally attractive even when branching is not known to be a major performance bottleneck. In the next chapter we present a technique that provides performance comparable to or better than code-copying and requires no C compiler extension. In this approach the 99% most often executed superinstructions are selected and a standard C compiler is used to perform inter-bytecode optimizations on these superinstructions, further improving the performance of interpreters.

Chapter 6

Virtual Machine Specialization with Caching Compilation Server

In this chapter we present a specialized compilation server that provides further performance improvements to virtual machines by enabling optimizations within selected superinstructions. Our server is architecture- and VM-agnostic and can be used with virtually any bytecode-based virtual machine interpreter. In the following sections we discuss the architecture of our system and the specifics of our implementation, present performance results for two virtual machines, compare our system with other related work, and finally present our conclusions.

The result of our work is a system composed of multiple virtual machines that use a caching compilation server service. The compilation server is used to create and store specialized VM binaries optimized for particular applications. To improve the performance of a virtual machine beyond what interpreter-based solutions can provide, a compiler is required. Projects with sufficient funds usually create their own just-in-time compiler from scratch, specialized for a particular language. This costly approach is not available to smaller projects. In our solution we leverage an existing, industry-standard, highly-optimizing static compiler, GCC. A profile-enabled, code-copying interpreter is used to detect frequently used (hot) groups of instructions (superinstructions) that are candidates for optimization. The biggest drawback of such a solution, the long compilation time of a static compiler, is dealt with by caching the resulting binaries. This way the compilation overhead is spread over multiple runs of a virtual machine.



Figure 6.1: Overview of the specialized compilation server using the enhanced GCC with two extended virtual machines.

The performance improvement over the previously evaluated, purely code-copying interpreters is most visible in the case of OCaml, with an average speedup of 27% and a maximum of 81%. The Java interpreter, SableVM, shows smaller benefits of 2-2.5% on average, with the top-5 benchmarks improving by 4-5%, and an 8% maximum. Our analysis of the results indicates that the performance difference between languages results from their dissimilar functional properties.

The rest of this chapter is structured as follows. We first present an overview of our design, then explain the source of the performance improvement, followed by a detailed description of how specialized source code is created, including optional source optimization; we then present the process of optimized VM recompilation, our performance results, and notes on selected related work, and finally we draw our conclusions.

6.1 System architecture

We designed our optimization system to achieve the best possible performance results while avoiding low-level, architecture-specific concerns and maximizing the reuse of existing solutions. Figure 6.1 presents an overview of the implemented system. The main components of the compilation server are as follows:

• Virtual machine sources repository – the server stores multiple sets of partial sources it receives from extended virtual machines. Each set is specialized, i.e. it is generated based on the behavior of a specific application.

• Specialized optimizer – because the server has access to the specialized sources of a VM there exists an opportunity to perform additional analysis and optimizations in the source code that will improve the final result of compilation. This component is optional.

• Compilation queue – the server can receive multiple compilation requests, which might need to be delayed. The server can be configured to allow a specific number of compilations to occur simultaneously. It could also order compilations in a specific manner, e.g. giving priority to a specific VM.

• Enhanced GCC Compiler – we used a compiler supporting safe code-copying because we wanted the resulting virtual machines to use this technique alongside source-optimized superinstructions. We should note that these two techniques are separable and, while not using code-copying would slightly lower the overall performance of the virtual machines in our setup, any C compiler could potentially be used.

• Cache of specialized binaries – after a compilation task is completed the resulting binary of a specialized virtual machine is stored in a file cache that is accessible to extended virtual machines.

Our compilation server requires that virtual machines be extended to cooperate with the server. We optimized the division of tasks between the compilation server and the virtual machines so as to avoid unnecessary complications to the system. By design we perform each task in the most suitable place, where all input data is readily available and an action can be performed on this data directly, instead of encoding and transferring this data for later processing by a separate tool. This principle applies in particular to specialized code generation, which is performed by the virtual machine itself. Each extended virtual machine needs to be able to perform the following tasks.

• For a specific application a VM needs to search the server cache for a specialized binary and, if one is available, load it. If this step is successful then the next tasks are generally not used in this run of the virtual machine.

• Profiling the application – a virtual machine gathers the execution profile of superinstructions used by the application.

• Partial source code generation – the profile is then used to generate the specialized source code, which is sent to the compilation server. We discuss profiling and source code generation in detail in a later section.


The compilation server and the extended virtual machines cooperate in the manner illustrated in Figure 6.2. Initially the compilation server is idle, waiting to be contacted by a virtual machine. A virtual machine about to begin executing an application searches the server cache for an optimized binary. If it finds one, it loads this binary and executes the application on a specialized VM. If it does not find a binary optimized for the application it is about to run, it begins executing the application using a non-specialized, vanilla version of the VM engine and gathers the execution profile. When enough profiling information has been gathered it uses this information to generate optimized partial sources of itself and sends them to the compilation server as a compilation request. The VM then continues execution of the application with no further profiling overhead. The compilation server receives the request, optionally optimizes the partial sources, and enqueues the compilation request. When the compilation is complete the resulting specialized binary is placed in the server cache. From that point on, every virtual machine about to execute the same application will find the specialized binary in the cache, load it, and use it to execute the application at greater speed.
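The VM-side half of this protocol can be sketched as follows. This is only a schematic under assumed helper names (lookup_specialized_binary, start_profiling, and so on are invented for the illustration); the real SableVM and OCaml extensions are organized differently.

    /* Sketch of the VM-side startup logic described above.  All helper
     * functions are hypothetical and declared here only so that the sketch
     * forms a valid translation unit. */
    const char *lookup_specialized_binary(const char *app_id);
    void exec_specialized_vm(const char *binary, const char *app_id);
    void start_profiling(void);
    void run_application_until_profile_complete(const char *app_id);
    void stop_profiling(void);
    void generate_partial_sources(const char *app_id);
    void send_compilation_request(const char *app_id);
    void continue_application(const char *app_id);

    void vm_startup(const char *app_id)
    {
        const char *binary = lookup_specialized_binary(app_id); /* server cache */
        if (binary != NULL) {
            exec_specialized_vm(binary, app_id);  /* run on the optimized VM */
            return;
        }
        start_profiling();                    /* vanilla engine with profiling */
        run_application_until_profile_complete(app_id);
        stop_profiling();                     /* no further profiling overhead */
        generate_partial_sources(app_id);     /* specialized superinstruction sources */
        send_compilation_request(app_id);     /* server compiles and caches the binary */
        continue_application(app_id);         /* this run finishes on the vanilla engine */
    }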

6.2 The source of performance improvement

The goal of our system was to improve the performance of virtual machines. Such improvement does not come from the use of the compilation server itself. Rather, the server is only a convenient way of engineering a system that takes advantage of existing optimization opportunities. In our case this opportunity was the suboptimal code of superinstructions. In the previous two chapters we described superinstructions as created by concatenating binary code from several small, contiguous memory regions into one larger contiguous memory region. While the performance improvement achieved was very substantial, and the added safety allowed reliable implementations in multiple virtual machines, there was at least one clear optimization opportunity that had not yet been exploited.



Figure 6.2: Order of operations initiated by a virtual machine cooperating with our compilation server.


    DUP:           top++; stack[top] = stack[top-1]
    ISTORE_1:      locals[1] = stack[top]; top--

        |  inter-bytecode optimization
        v

    DUP_ISTORE_1:  locals[1] = stack[top]

Figure 6.3: One of the obvious expected advantages of inter-bytecode optimization was the removal of unnecessary store and read operations for temporary values.
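The same idea, written out as C in the style of a concatenated superinstruction body, looks roughly as follows. The function signature and variable names are invented for the sketch; the point is that once both bytecode bodies share one C scope, the compiler can see that the slot written by DUP is consumed immediately by ISTORE_1 and need never reach memory.

    /* Illustrative concatenation of the two bytecode bodies from Figure 6.3
     * into one superinstruction body (names are made up for the sketch). */
    void dup_istore_1(int *stack, int *top, int *locals)
    {
        { /* DUP */
            (*top)++;
            stack[*top] = stack[*top - 1];
        }
        { /* ISTORE_1 */
            locals[1] = stack[*top];
            (*top)--;
        }
        /* After optimization the net effect is just: locals[1] = stack[*top]; */
    }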

The performance improvement in our system comes mainly from performing optimizations within superinstructions. Figure 6.3 presents an actual case where superinstructions are clearly inefficient because they focus on executing each bytecode individually instead of on the overall effect of executing all instructions in a superinstruction. The figure demonstrates how the two bytecodes DUP and ISTORE_1 perform a stack push (a data store to memory and an increment of the stack top pointer) and a pop (a data read from memory and a decrement of the stack top pointer) that undo one another's work. This problem can be remedied by introducing optimizations that span instruction boundaries, optimizations that work globally within a superinstruction (DUP_ISTORE_1 in the figure). One possible solution would be to write a specialized machine-code-to-machine-code optimizing compiler. Such systems already exist (e.g. Dynamo [BDB00]) and require substantial amounts of architecture-specific code and the re-implementation of many generic compiler optimizations. Continuing our practice of maintaining maximum independence from specific hardware, reusing existing tools and solutions, and addressing issues at a higher level, we decided on a different and, from our point of view, more advantageous solution.

The following elements are important when analyzing the sources of performance improvement in our system:

1. crucial inter-instruction optimizations — Our system uses the source code of instructions as the basic elements from which the sources of whole superinstructions are created. This is done simply by concatenating the source code of each instruction instead of its binary code, as is done in code-copying. The source code is then sent to a C compiler which, in almost all cases, can perform the desired optimizations. As we will show later, this approach enabled the compiler to perform many crucial inter-instruction optimizations that were not possible before. For the purpose of this work we call these superinstructions source-optimized superinstructions.

2. independence from code-copying — The use of source-optimized superinstructions is de facto independent of the use of code-copying. In our design we extended a pre-existing framework for code-copying to make the detection (via profiling) and creation of source-optimized superinstructions easier. It would equally be possible to build it on top of a switch-based or direct-threaded interpreter. In the system we implemented, the use of code-copying provides about 1% of the speedup (over direct-threading), while about 99% of the performance improvement (over direct-threading) is due solely to the use of source-optimized superinstructions. This is because in our system we choose the 99% most often executed superinstructions as candidates for source-based optimization. Note that in this chapter we will most often compare the performance of our system with the performance of code-copying.

3. avoiding the use of #pragma copyable — Interestingly, we were also able to improve the quality of the resulting binary code by avoiding the use of #pragma copyable in source-optimized superinstructions. This was possible because source-optimized superinstructions do not need to be copied to other places in memory to be concatenated with other instructions. The use of #pragma copyable in chunks of code used by code-copying engines is necessary for safety purposes. Unfortunately, as we noted in Chapter 4, by ensuring the required safety properties this pragma also inhibits certain optimizations. As we will show later in this chapter, not using this #pragma allows the C compiler to produce better quality code.

4. specialized optimizations — In Section 6.4 we present a tool that operates on C source code and optimizes accesses to a stack structure in ways a C compiler could not optimize by itself.

Overall, the main performance improvement in our system comes from allowing optimizations across instruction boundaries.

6.3 Creation of the source code

For the purpose of optimizing superinstructions across instruction boundaries our system must be able to create source code for these superinstructions. Creation of the source code of a superinstruction is a three-step process.

1. individual source separation (vanilla VM build time) — The source code of each instruction must be separated out from the sequence of instructions defined in the interpreter main loop. The source code of each instruction must be available separately so it can be concatenated as necessary with other instructions' sources.

2. profile required (1* in Figure 6.2) — A VM might be using a large number of superinstructions. It might not be necessary or feasible to make them all source-optimized. This is because the compilation time tends to increase (approximately) exponentially with the number of instructions defined in the main interpreter loop. The superinstructions that are worth optimizing are chosen based on application profiling information, so appropriate profiling data must be gathered.

3. sources concatenation (2* in Figure 6.2) — For the selected, most often used superinstructions the sources of the instructions they contain are concatenated so that they can be later optimized by a compiler. This process is described in detail in the following sections.


6.3.1 Separate per-instruction source code

To be able to create the source code of a superinstruction, the source code of each instruction it contains must be available. The natural place where the source code of each instruction is defined is the interpreter loop. In our system, at virtual machine build time the interpreter loop source code is analyzed and split into multiple files, one per instruction, each containing only the source code of that instruction. To that end the source code is first pre-processed by the cpp preprocessor, and pragmas, enclosing labels, and dispatch code are stripped. The resulting files contain only the effective code of each instruction, that is, the code that actually executes the useful operations prescribed by the definition of the instruction. The mechanism for splitting and stripping the source code varies slightly for each virtual machine. This process occurs only once, at vanilla VM build time, and is not repeated for specialized VMs because the split source code is already available.
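To make the idea concrete, the sketch below contrasts how a bytecode body typically appears inside a direct-threaded interpreter loop with the effective code that survives the splitting step. The names (INSTRUCTION_IADD, slot, pc, stack_size) are illustrative only; the actual SableVM and OCaml sources differ.

    /* Sketch only: GNU C (computed goto), not the actual VM sources. */
    typedef union { long jint; void *implementation; } slot;

    void interpreter_loop_fragment(slot *stack, long stack_size, slot *pc)
    {
     INSTRUCTION_IADD:
        /* #pragma copyable would appear here in the code-copying build */
        {
            /* effective code of the instruction */
            stack[stack_size - 2].jint += stack[stack_size - 1].jint;
            stack_size--;
            /* dispatch to the next instruction */
            goto *((pc++)->implementation);
        }
    }

    /* After splitting, the per-instruction file keeps only the effective code:
     *
     *   {
     *     stack[stack_size - 2].jint += stack[stack_size - 1].jint;
     *     stack_size--;
     *   }
     */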

6.3.2 The profiling subsystem

As we mentioned before, it might not be feasible or advantageous to optimize all superinstructions used by an application, especially if the number of superinstructions is large. This makes it necessary to choose which superinstructions will be optimized, and this choice is based on the behavior of the application as measured by a profiling system. The two most popular approaches to profiling and to using the gathered profile for optimizing the application are the following:

• Executing a training set of applications (or the same application multiple times), then generating an optimized VM using the gathered overall profile.

• Dynamic optimization where profiling and optimization (recompilation) happen in a single run of an application. Most optimizing JITs use this technique.

Our system is more similar to the first approach, with the difference that we use only a single run of an application (or approximately the first 30 seconds of the run) to gather the profile. Also, we do not attempt to create a VM optimized for an "average" application. Instead we optimize one copy of the VM for each application it runs. The optimized VM is made available to all new instances of the application after the compilation of the optimized VM is complete; it is not available to already running VMs, although this could potentially be achieved via on-stack replacement.

The profiling system we implemented measures the frequency with which superinstructions are executed. The choice of the superinstruction as the main profiling unit is natural because the goal of the profiling subsystem was to identify the superinstructions that were executed most often (hot superinstructions). An additional instruction, PROFILE, was defined that is inserted at the head of each profiled superinstruction and can be removed at runtime, as shown in Figure 6.4. Because a superinstruction is a relatively small unit, registering the execution of every single superinstruction causes substantial overhead. We therefore decided to use a simple but more efficient method. For performance reasons the PROFILE instruction uses a global counter so that only every N-th execution of a superinstruction is recorded in the profile. After a few experiments the value of N was chosen to be 10, so as to keep the overhead of profiling at around 2%. Also, after the profile is gathered the profiling of superinstructions is turned off, with no further performance penalties.

While the choice of a particular data structure to hold the profile data has no bearing on the validity of our method, since we query the profile data often we chose an adaptive data structure with efficient query cost. In our system we use a splay tree because, while the cost of an operation is only O(log n), it also has the property of keeping the most often accessed elements closer to the tree root, thus minimizing the access time for these elements. In our implementations of the code-copying engine the descriptions of superinstructions are kept in a splay tree using the integer values of bytecodes as keys. To profile superinstructions effectively we created a second splay tree holding profile data about each superinstruction, using the starting memory address of each superinstruction as the key. At superinstruction creation time an element is inserted into each of these trees for every superinstruction.

Profiling of superinstructions can be easily enabled and disabled at runtime, which allows us to remove the overhead of profiling after a sufficient amount of profiling data has been gathered. As shown in Figure 6.4, making an instruction profiled or not requires only one or two simple memory copy or move operations.



Figure 6.4: Profiling of superinstruction binary code can be turned on and off by simple memcpy and memmove operations, given that the necessary memory space has been allocated in advance.

With this method the starting memory address of a superinstruction remains the same. This is important, because it would not be feasible to visit all loaded bytecode arrays (the content of which has been translated into memory addresses) and modify the addresses at each occurrence of a superinstruction whose profiling we want to enable or disable. In our approach the performance of superinstructions with profiling disabled is equivalent to that of superinstructions initially created without profiling, and the only overhead is the extra memory allocated for the PROFILE instruction code. The function recording an execution of a profiled superinstruction searches the profile splay tree using the start address of the superinstruction as the key. At the time of execution of the PROFILE bytecode this address can be found in *(pc-1), since pc is incremented just before the jump to the start address of the next superinstruction. The profiling code in SableVM ignores profile hits coming from virtual machine startup and only profiles the application; in OCaml the virtual machine startup is non-existent. The profile is considered complete either after a sufficient number of profile samples has been gathered (about 30 seconds of execution for the Java VM) or after the application has exited.

This imperfect method proved to be quite effective in practice. Once enough samples have been gathered, the profile data can be used to choose the superinstructions that will be source-optimized.
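A simplified rendering of the PROFILE body described above is sketched below. The type and helper names (slot, profile_entry, splay_tree_lookup) are placeholders introduced only for the sketch; the real code lives inside the SableVM and OCaml code-copying engines.

    /* Sketch of the PROFILE bytecode: sample every 10th execution and charge
     * it to the superinstruction whose start address is found in *(pc - 1). */
    typedef union { void *implementation; long jint; } slot;
    typedef struct { long sample_count; } profile_entry;

    profile_entry *splay_tree_lookup(void *tree, void *key);  /* assumed helper */

    #define PROFILE_SAMPLE_PERIOD 10   /* keeps profiling overhead around 2% */

    static unsigned long profile_tick;  /* single global counter shared by all sites */

    void profile_bytecode(slot *pc, void *profile_tree)
    {
        if (++profile_tick % PROFILE_SAMPLE_PERIOD == 0) {
            /* pc has just been incremented by the dispatch, so *(pc - 1) holds
             * the start address of the superinstruction we are entering. */
            void *supinsn_start = (pc - 1)->implementation;
            splay_tree_lookup(profile_tree, supinsn_start)->sample_count++;
            /* the lookup splays the entry toward the root, so hot
             * superinstructions stay cheap to find */
        }
        /* execution then falls through into the superinstruction body */
    }

The common path therefore costs only an increment and a compare; only every tenth execution pays for the splay-tree lookup.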

6.3.3 Choice and creation of superinstructions

To decide which superinstructions will be source-optimized, the profile data is used to select the superinstructions that account for 99% of all executions of superinstructions, but no more than 1000 superinstructions. As we mentioned before, the hard limit is necessary because the compilation time tends to increase (approximately) exponentially with the number of instructions defined in the main interpreter loop. In practice the hard limit (of 1000 superinstructions) we defined was never reached by any of our benchmarks (see the source-optimized superinstruction counts in Figure 6.6), but there might exist applications where this limit would be reached.

The source code of the selected superinstructions needs to be provided to the compilation server (and later the compiler) as regular C code, in a file. The name of the file is based on the identification of the benchmark that was being executed when the profile was gathered. For each superinstruction a header is written, then the sources of each instruction in the order they appeared in the superinstruction, and then a footer is appended. The header and footer contain code that creates the labels necessary for direct-threading or code-copying engines, and optionally #pragma copyable if the latter engine were to use source-optimized superinstructions. At this step several other files are also created which, at compilation time, will initialize additional data structures, e.g. tables of addresses and names of the source-optimized superinstructions. All these files are handed over to the compilation server, which in a later step might apply specialized optimizations to the source code (as described in the next section) and finally enqueue a compilation request. Note that only source code for specialized superinstructions is created, since most of the VM is left unchanged. For this reason we call this set of source code files specialized or partial code.
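The selection rule (99% coverage, capped at 1000 superinstructions) amounts to the following; the data structures are illustrative, not the VM's actual ones.

    /* Sort superinstructions by sample count and keep the smallest prefix
     * that covers 99% of all recorded executions, capped at 1000 entries. */
    #include <stdlib.h>

    struct supinsn_profile { int id; long samples; };

    #define COVERAGE_TARGET 0.99
    #define HARD_LIMIT      1000

    static int by_samples_desc(const void *a, const void *b)
    {
        long sa = ((const struct supinsn_profile *)a)->samples;
        long sb = ((const struct supinsn_profile *)b)->samples;
        return (sb > sa) - (sb < sa);   /* descending order of sample counts */
    }

    /* Returns how many of the (sorted) entries should be source-optimized. */
    int select_hot_superinstructions(struct supinsn_profile *p, int n)
    {
        long total = 0, covered = 0;
        for (int i = 0; i < n; i++) total += p[i].samples;

        qsort(p, n, sizeof p[0], by_samples_desc);

        int chosen = 0;
        while (chosen < n && chosen < HARD_LIMIT &&
               covered < (long)(COVERAGE_TARGET * total)) {
            covered += p[chosen].samples;
            chosen++;
        }
        return chosen;
    }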


6.4 Specialized optimization

The partial code of the VMs is stored in the compilation server before it is compiled along with the rest of the VM sources. This creates an opportunity to apply additional analyses and optimizations to the source code to improve the final results of compilation.

In particular, in the case of SableVM, due to initially unsatisfactory performance results, we decided to apply a specialized optimization to the source code of superinstructions. We created a helper application that takes as input the source code of all superinstructions, optimizes Java method stack accesses within each superinstruction, and outputs an optimized version. Our tool works directly on C source code and understands a subset of C large enough to properly parse and analyze the source code of all instructions of SableVM.

This tool exploits one particular property of stack-based code that a standard C compiler has no way of understanding and exploiting. This is because there is no mechanism defined by the C standard (or any GCC extension) to inform the compiler that an array and a pointer together describe a stack structure. Consider an example where a value is pushed on the stack by one instruction and then popped from the stack by the next one. Knowing how a stack works means understanding that the state of the stack after these two operations is the same as if the value had never been written to the stack in the first place. If these two instructions are within a single superinstruction then there is no need to ever write this value to memory, as it can be kept temporarily in a register. Unfortunately GCC (or any C compiler we know of) will see the stack as an array that was updated and will insist on making the write actually happen.

To remedy this inefficiency our optimization tool first analyzes the source code of each superinstruction for stack accesses and stack pointer updates. The input to our tool is the source code and the names of two variables that are provided manually: the variable holding the stack slots and the variable pointing to the top of the stack. The results of this analysis are then used to create a temporal map of accesses and stack sizes, as can be seen in Table 6.1. The temporal map is then used to map all input, output and temporary stack values to actual local variables in each superinstruction's code, as can be seen in Figure 6.5.


a) Original superinstruction source (concatenated instruction bodies):

    #pragma stackopt begin
    {
      { /* ILOAD */
        jint indx = (pc++)->jint;
        stack[stack_size++].jint = locals[indx].jint;
      }
      { /* ILOAD */
        jint indx = (pc++)->jint;
        stack[stack_size++].jint = locals[indx].jint;
      }
      { /* ISUB */
        jint value1 = stack[stack_size - 2].jint;
        jint value2 = stack[--stack_size].jint;
        stack[stack_size - 1].jint = value1 - value2;
      }
      { /* ISTORE */
        jint indx = (pc++)->jint;
        locals[indx].jint = stack[--stack_size].jint;
      }
      { /* ILOAD */
        jint indx = (pc++)->jint;
        stack[stack_size++].jint = locals[indx].jint;
      }
    }

b) The same superinstruction after stack access optimization (accesses 1* through 7* refer to the timeline below):

    {
      /* Local stack slots. */
      jint stack_1_jint;
      jint stack_0_jint;
      /* No inputs. */
      { /* ILOAD */
        jint indx = (pc++)->jint;
        stack_0_jint = locals[indx].jint;        /* 1* */
      }
      { /* ILOAD */
        jint indx = (pc++)->jint;
        stack_1_jint = locals[indx].jint;        /* 2* */
      }
      { /* ISUB */
        jint value1 = stack_0_jint;              /* 3* */
        jint value2 = stack_1_jint;              /* 4* */
        stack_0_jint = value1 - value2;          /* 5* */
      }
      { /* ISTORE */
        jint indx = (pc++)->jint;
        locals[indx].jint = stack_0_jint;        /* 6* */
      }
      { /* ILOAD */
        jint indx = (pc++)->jint;
        stack_0_jint = locals[indx].jint;        /* 7* */
      }
      /* Store outputs. */
      stack[stack_size + 0].jint = stack_0_jint;
      stack_size += 1;
    }

    Stack      Stack accesses timeline
    location    1*      2*          3*      4*          5*      6*     7*
       1                Write(opt)          Read(opt)
       0        Write               Read                Write   Read   Write

Figure 6.5: Source code optimization for stack accesses. One write and one read are eliminated. Others can be optimized by a regular C compiler. An actual hot superinstruction used by the SPEC compress benchmark.


a) ALOAD + ILOAD + FCONST_0 + FASTORE

    Stack      Stack accesses timeline
    location    1      2      3      4      5      6
       2                      W(3)   R(3)
       1               W(2)                 R(2)
       0        W(1)                               R(1)

b) ALOAD_3 + ILOAD + ALOAD_3 + ILOAD + FALOAD + FNEG + FASTORE + IINC

    Stack      Stack accesses timeline
    location    1     2     3     4     5     6     7     8     9     10    11    12
       3                          W(4)  R(4)
       2                    W                 R     W     R     W(3)  R(3)
       1              W(2)                                                  R(2)
       0        W(1)                                                              R(1)

Table 6.1: Two cases of stack access optimization from the SPEC mpegaudio benchmark. a) removes 3 reads and 3 writes to the stack; b) removes 4 reads and 4 writes to the stack. Others can be optimized by a regular C compiler. Removed read-write pairs are marked with an index number for each pair.

For the purpose of describing the modifications done to the source code we introduce the following definitions. A temporary stack value is one that does not exist on the stack either before or after the superinstruction is executed, but is only used internally within the superinstruction. An input stack value is one that existed before superinstruction execution and is used (read) by the superinstruction code. An output stack value is one that is modified (written to) by the superinstruction and is left on the stack after the execution of the superinstruction is complete.


The source code of each superinstruction is modified in the following ways:

• New local variables are declared, one for each data type (int, float, etc.) of an input, output or temporary stack location, e.g. see the Local stack slots section in Figure 6.5b.

• Stack accesses (both reads and writes) are replaced with accesses to these local variables. For example, in Figure 6.5, in the first ILOAD section we see how the stack[stack_size++].jint access is replaced with stack_0_jint, based on the accessed data type and stack location 0 for access number 1*, as seen in the table at the bottom of the Figure.

• Code is added at the head of a superinstruction to read input stack locations into the appropriate local variables (the example code in Figure 6.5 has no stack inputs).

• Code is added at the tail of a superinstruction to write the values of the appropriate local variables into output stack locations, e.g. see the Store outputs section in Figure 6.5b, where the stack_0_jint variable is stored into the top location of the stack.

With the data flow modified in this way, a C compiler is able to avoid unnecessary read and write operations on the stack. The stack (memory) operations that can be avoided are marked as (opt) in Figure 6.5, and with numbers (1), (2), ... in Table 6.1. Some of the reads and writes are not marked because the compiler had all the data needed to optimize them out even without our modifications to the source. This is mainly because only the last write to a stack location actually has to be carried out before the end of a chunk.

We applied this technique to the Java VM only, and we note that we expect OCaml to be much less amenable to it. While both Java and OCaml bytecode instructions use a stack-based machine, the OCaml instruction set allows direct access not only to the top element of the stack but essentially random access to several of the top elements. This makes the actual stack use patterns in OCaml quite different from those of Java, in particular lessening the need for oft-repeated stack push and pop operations in OCaml.


6.5 Compilation of the optimized VM

The final step where the partial source code is used is the actual compilation of a specialized virtual machine. Files with the specialized, partial source code are stored in a separate directory, uniquely marked by a prefix based on the identification of the program that the virtual machine was executing. A previously enqueued compilation request waits until the compilation server is available. We decided to limit the number of concurrent compilations to 1, but there is nothing inherent in our design that would prevent raising this number to allow for simultaneous compilations. The reason for choosing a single compilation at a time is that compilation is a highly CPU- and cache-intensive process, and on a single-CPU machine, especially one running other applications at the same time, increasing the number of concurrent GCC instances could very easily and severely decrease the overall performance of the system.

Once the server is ready to handle a compilation request, the files with partial source code generated for a specific benchmark on a specific VM are copied into the directory containing the complete VM sources. The sources of each VM were modified so as to compile properly and produce a non-specialized, vanilla VM (identical to the original code-copying VM) if these additional files with specialized source are empty. Once the files with specialized code are in place, the virtual machine engine is recompiled and the resulting binary is placed in the server cache, next to the specialized partial sources from which it was generated. Note that by VM compilation we mean that only the VM engine is recompiled, as there is no need to recompile language libraries and other elements that may come as parts of a complete execution environment.

The process consisting of profiling the application, generating specialized partial code of superinstructions based on the profile, optionally optimizing the source code, and finally compiling an optimized VM is now complete. The optimized VM binary is available to any virtual machine instance that attempts to find a binary optimized for the same application.


6.6 Performance results and metrics

To test our expectation of achieving a performance improvement by using source-optimized superinstructions, we extended two virtual machines, SableVM and OCaml, to cooperate with our compilation server. To validate our design we performed a series of performance tests using two machines, ia32 and xeon (described in Chapter 2, Section 2.7 as Intel P4 3GHz and Intel Xeon 2.4GHz, respectively), to run the extended SableVM and OCaml. We executed each benchmark 10 times and took the average runtime as the final measured value. The standard deviation of runtimes for all benchmarks did not exceed 0.15 of the measured value.

The results presented in Figure 6.6 and Figure 6.7 demonstrate that the performance improvement brought by our technique of source-optimized superinstructions differs very significantly between languages and virtual machines. The performance of the Java interpreter (SableVM) increased slightly, by about 2.0% and 2.5% on average, depending on the machine. For the top-5 benchmarks (Soot, compress, jess, mpegaudio and raytrace) the performance improved by 4-5% on average. Individual benchmark performance improved by up to about 8%. The performance of the OCaml interpreter increased dramatically, by 27% on average, with the performance of individual benchmarks improving by up to 81%.

As in the previous stages of our work presented in this thesis, we looked at possible metrics that would explain the difference in results for different benchmarks and virtual machines. We employed several metrics comparing code-copying and source-optimized superinstructions: superinstruction binary code length reduction factor, superinstruction length in bytes, average superinstruction length in bytecodes (weighted by executions of each superinstruction), and the number of created source-optimized superinstructions. We also looked at the data gathered from hardware counters during the experiments described in the previous chapter.

Even without using more sophisticated tools it is clear there is little correlation between the values provided by these metrics and the performance results. Let us attempt to look at Figure 6.7 and view the number of bytecodes in a superinstruction as a predictor of speedup, following the reasoning that a longer chunk of source code can be better optimized.

Figure 6.6: Summary of the results achieved with the compilation server by SableVM (Java). Y-axis values have been rescaled so as to allow for comparison of several metrics. For each benchmark the bars show: src-opt speedup (ia32), src-opt speedup (Xeon), bytes reduced by [%], superinstruction length [bytes], average superinstruction length [bytecodes], and src-opt superinstruction count.


Figure 6.7: Summary of the results achieved with the compilation server by the OCaml VM. Y-axis values have been rescaled so as to allow for comparison of several metrics. For each benchmark the bars show: src-opt speedup [%], bytes reduced by [%], superinstruction length [bytes], average superinstruction length [bytecodes], and src-opt superinstruction count.

We find evidence to the contrary, for example, by comparing the results and metrics for the fft and fft.fast benchmarks with the quicksort and quicksort.fast benchmarks. The first pair (Fast Fourier Transform) has an average executed superinstruction length of about 18 bytecodes, amounting to almost 1kB of binary code, but results in only a moderate speedup of 13% and 17%. The second pair (Quicksort) has an average executed superinstruction length of only about 3 bytecodes, amounting to only about 100 bytes of binary code, yet results in extraordinary speedups of 48% and 81%. Similar examples contradicting the expected correlations are found for all presented metrics. These observations suggest that a more thorough approach is warranted and that a calculation of an actual correlation coefficient is necessary.

Table 6.2 contains correlation coefficients calculated between the speedups achieved using source-optimized superinstructions in each of the benchmarks and the value of each of the metrics. The correlations presented were calculated using Spearman's rank correlation coefficient [rcc] (the formula is reproduced below). The results of applying this formula can be read as follows: possible values range from -1.0 to 1.0, where 0.0 means no correlation, 1.0 means complete correlation, and -1.0 means reverse correlation (e.g. the values in the base series increase while the values in the correlated series always decrease).

We have to report that overall we found no meaningful correlations between these metrics and the performance improvements. In a few cases we found correlations for a specific architecture and VM. For SableVM on ia32 (the P4 machine) the performance seems to be correlated to some degree with the reduction in superinstruction code length (coeff. 0.68: the greater the reduction, the bigger the speedup), and with the length of superinstructions in bytes (coeff. 0.63: the longer the code in bytes, the bigger the speedup). For OCaml on the ia32 machine the performance seems to be somewhat correlated with the number of superinstructions constituting 99% of superinstruction executions (coeff. -0.58: the fewer superinstructions necessary to cover the 99% of executions, the bigger the speedup). Most of the correlation coefficients, however, are much lower. In particular, a strong correlation for any particular metric would be visible on all or most architectures, which is not the case. In our opinion there is no substantial evidence that the performance is bound to any subset of the properties we measured.
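For reference, the standard Spearman rank correlation formula over n benchmarks (assuming no tied ranks; the thesis text itself does not restate it) is:

    \rho_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\,(n^{2} - 1)},

where d_i is the difference between the ranks of benchmark i in the speedup series and in the metric series.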


                                 OCaml (ia32)   SableVM (ia32)   SableVM (Xeon)
    src-opt speedup                  1.00            1.00             1.00
    bytes reduced by [%]            -0.03            0.68             0.05
    supinsn len [bytes]             -0.41            0.63            -0.05
    avg. supinsn len [bcodes]       -0.07            0.03            -0.37
    src-opt supinsns count          -0.58            0.09            -0.26

Table 6.2: Correlation between the speedup achieved by using source-optimized superinstructions and the values of selected metrics from Figures 6.6 and 6.7. If any global correlation actually existed it would be visible on all architectures (horizontally). Stronger correlations are marked with double underline. Weaker correlations are marked with single underline.

Overall, the results of the above analysis and the evident lack of substantial correlation are the motivation for a deeper investigation of the reasons behind the observed results.

6.6.1 Generated code comparison

In our opinion the actual performance improvement is due to compiler optimizations that were enabled by the concatenation of the source code. These new opportunities depend heavily on the actual instructions involved. Optimization opportunities differ between bytecodes for different virtual machines, hence the observed variance of results.

Figure 6.8 presents the assembly of a single superinstruction created in three different ways. Version a) is a superinstruction created by concatenating binary code produced by the GCC compiler using the previously described #pragma copyable extension to ensure the necessary properties of the code. Version b) is the assembly of the same superinstruction, but this time created from concatenated source and compiled again with the enhanced GCC, still using #pragma copyable. Version c) is the same as b) but with #pragma copyable removed from the code. Besides the obvious and clearly visible difference in the lengths of these three binary code chunks, a deeper analysis reveals that the compiler was able to apply several optimizations that could not be used previously, or to apply optimizations with better results. The most important optimizations now applied by the C compiler were:

• load-store-load optimization – keeping more data in registers without writing it back to memory and loading it again (several eliminated memory accesses are marked in Figure 6.8),

• common subexpression elimination – once a value is computed it is kept for reuse by the code that follows,

• low-level, machine-dependent optimizations – a different (better) choice of CPU instructions (e.g. compare the CPU instructions used by the LTINT bytecode instruction to the instructions used in the b) or c) version in Figure 6.8),

• optimizations of the common execution path – by basic block reordering, code reorganization and inversion of conditionals (e.g. see the jump to rarely executed code at label 2 in Figure 6.8c, marked with the N symbol).

We note that in the c) version the actual number of assembly instructions is almost the same as in b), but code expected to be executed less often was moved further away so as not to interfere with the top-down control flow. Also note that because the c) version does not use #pragma copyable it cannot be copied and executed elsewhere in memory. This limitation is largely unimportant because the superinstruction is already fully constructed and there is no reason to create copies or concatenations with other instructions. The c) version was the one used during the performance experiments, while the b) version was used to compute the metrics on binary code size. In our opinion the difference in optimization opportunities arising from the source code of superinstructions, and thus the quality of the resulting binary code, is the main cause of the observed differences in performance across benchmarks and virtual machines.
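As a purely schematic illustration of why concatenating at the source level exposes such opportunities, the fragment below contrasts two hypothetical bytecode bodies (ILOAD and IADD, operating on a simple operand stack) with their source-level concatenation. The bytecodes, names and stack layout are invented for this example and are not taken from SableVM or OCaml; dispatch code and the #pragma copyable annotations are omitted. The point is only that once both bodies are visible as one block, the compiler can remove the push/pop pair at the seam, which is the load-store-load case listed above.

#include <stdio.h>

/* Hypothetical bodies of two bytecodes, written separately ... */
static void op_iload(int **sp, const int *locals, int idx) {
    *(*sp)++ = locals[idx];            /* push locals[idx] */
}
static void op_iadd(int **sp) {
    int b = *--(*sp);                  /* pop two values   */
    int a = *--(*sp);
    *(*sp)++ = a + b;                  /* push their sum   */
}

/* ... and the same two bodies concatenated at the source level into one
 * superinstruction.  With both bodies in a single block the compiler can
 * keep locals[idx] in a register and delete the intermediate push/pop. */
static void supinsn_iload_iadd(int **sp, const int *locals, int idx) {
    int pushed = locals[idx];          /* was: *(*sp)++ = locals[idx] */
    int a = *--(*sp);
    *(*sp)++ = a + pushed;             /* store/reload at the seam eliminated */
}

int main(void) {
    int stack[8] = { 5 };              /* one value already on the stack */
    int locals[1] = { 7 };
    int *sp = stack + 1;

    /* separate bodies, as when bytecodes are executed one at a time */
    op_iload(&sp, locals, 0);
    op_iadd(&sp);
    printf("separate bodies: top = %d\n", stack[0]);      /* 12 */

    /* reset and run the concatenated superinstruction */
    stack[0] = 5;
    sp = stack + 1;
    supinsn_iload_iadd(&sp, locals, 0);
    printf("concatenated:    top = %d\n", stack[0]);      /* 12 */
    return 0;
}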

[Figure 6.8: assembly code of the same superinstruction constructed in three different ways: a) from concatenated binary code, b) from concatenated source compiled with #pragma copyable, and c) from concatenated source compiled without #pragma copyable.]


nsieve              OCaml [s]   Java [s]   relative speed OCaml/Java
direct-threaded       13.5        7.24               0.54
code-copying           4.12       2.38               0.58
src-optimized          3.49       1.85               0.53

Table 6.3: Comparison of the relative performance of a single benchmark (nsieve) on both OCaml and Java VMs and different execution engines.

                           SableVM   SableVM-top5   OCaml
Compilation [s]              73.4        73.4         88.3
Speedup [%]                   2.5         4.6         27.1
Break-even runtime [min]     49.9        26.4          5.4

Table 6.4: Compilation times (overhead), average speedups and resulting break-even runtime of optimized VMs.

6.6.2 Quick Comparison of VM Performance

Experimentation with two VMs of different design but using the same interpretation techniques is also a good opportunity to compare performance across VMs. We used two implementations of the nsieve benchmark, one in Java and one in OCaml, solving the same task. We measured their performance on the different execution engines of both virtual machines and present the results in Table 6.3. While we do not claim that a single-benchmark comparison allows us to draw final conclusions, our results do demonstrate that across the different execution engines the relative speed of this benchmark implemented in Java and in OCaml remained similar, with Java being almost twice as fast in all three evaluated cases. This suggests that Java bytecode is more efficient as an input to an interpreter than OCaml bytecode is.
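For clarity, the relative-speed column of Table 6.3 appears to be simply the inverse ratio of the runtimes, i.e. the Java time divided by the OCaml time for each engine; for the direct-threaded engines, for example, 7.24 s / 13.5 s ≈ 0.54, and likewise 2.38 / 4.12 ≈ 0.58 and 1.85 / 3.49 ≈ 0.53 for the code-copying and source-optimized engines.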


6.6.3 Compilation Overhead and Break-even

Using a static compiler to recompile a complete VM engine is a relatively expensive task, even though our system uses a cache to minimize the number of compilations. It is therefore important to determine what actual gains can be expected. Taking into account the compilation overhead and the resulting speedup we computed the basic break-even expectations found in Table 6.4. SableVM, on average, would need to accumulate a total application runtime of almost one hour (or less than 30 minutes for the top-5 benchmarks) to start gaining on the investment in compilation. OCaml would start benefiting from source-optimized superinstructions after only about 5 minutes of accumulated application runtime. Note that we refer to the accumulated runtime, meaning the total time an application executed on one or more instances of an optimized VM. These times may seem substantial, but they are expected given the use of a highly-optimizing static C compiler (GCC). The cache of optimized binaries helps offset these long required runtimes by allowing the gains to accumulate over a longer period of time, across multiple executions of the same program.
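The break-even figures in Table 6.4 can be reproduced, to within rounding, by dividing the one-time compilation overhead by the fraction of runtime saved. The small sketch below assumes exactly that simple model, with savings accruing in proportion to the optimized runtime; the precise accounting behind the table may differ slightly, which would explain the small discrepancies.

#include <stdio.h>

/* Approximate break-even runtime: how long an optimized VM must run in
 * total, across invocations, before the one-time compilation cost is
 * repaid by the speedup.  Simple model: savings = runtime * speedup%. */
static double break_even_minutes(double compile_s, double speedup_pct) {
    return compile_s / (speedup_pct / 100.0) / 60.0;
}

int main(void) {
    printf("SableVM:      %.1f min\n", break_even_minutes(73.4,  2.5));  /* Table 6.4: 49.9 */
    printf("SableVM-top5: %.1f min\n", break_even_minutes(73.4,  4.6));  /* Table 6.4: 26.4 */
    printf("OCaml:        %.1f min\n", break_even_minutes(88.3, 27.1));  /* Table 6.4:  5.4 */
    return 0;
}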

6.7 Notes on related work

The design of the work presented in this chapter relies on our advanced safe code-copying technique and the previous work on its application to interpreters discussed in Chapters 4 and 5. Others have investigated the application of source-optimized superinstructions or, in general, optimizing sources created from lists of bytecodes. In the system proposed by Varma [VB04], for example, Java bytecode is translated into custom-generated C code, including Java-specific optimizations, and then compiled using a standard C compiler. Ertl created a system known as Vmgen [EGKP02] that generates source for superinstructions based on an execution profile, an idea analogous to our source-optimized superinstructions, but our system additionally provides efficient code reuse and caching. In comparison to Varma's work our system

is only concerned with generating C source code for small groups of bytecodes, not for whole applications. One of the differences between our work and both Varma's and Ertl's work is that our system offers hands-free operation without user intervention (although one can imagine Ertl's system as a basis for a hands-free implementation similar to ours). Our system is centered around a compilation server working in conjunction with a cache system and supporting multiple virtual machines for different languages. In our system there is also no need to choose a training set of applications, since every application is optimized separately. Both our system and Vmgen are able to perform source-optimization of stack accesses; while Vmgen expects the stack operations to be defined in a simple yet specialized language, our Java bytecode source optimizer understands a subset of C. We also pursued a different path of analysis for determining the performance improvements brought by source-optimized superinstructions.

Our design for a VM-based compilation server makes use of a code cache for resource sharing with later invocations of an application. There are many works in this area; here we list the main approaches and how they differ from ours. Several solutions use a powerful machine as a compilation server for mobile devices or devices with limited resources, e.g. Franz et al. [Fra97] or Palm et al. [PLDM02]. In a similar approach Lee et al. [LDM04] presented a compilation server based on the JikesRVM JIT. There were also attempts at using code caching and static ahead-of-time compilation to improve the startup time of the IBM JVM [MP08]. These approaches worked on larger codebases, optimizing parts of complete applications, as opposed to our approach where the basic source-optimization unit is a superinstruction. They also did not use C code as their intermediate representation and employed a JIT rather than a static compiler. Joisha et al. [JMSG01] modified a VM to share binary executable code among multiple VMs to reduce the amount of writable memory. In the area of distributed compilation and caching compiler servers two prominent solutions exist: CCache [CCa] and DistCC [Dis]. Their focus is on improving the compilation process itself, either by distributing compilation or by caching parts of compiled work for future reuse; as such they are complementary to our work and could potentially be used in place of the compiler in our server solution. In our case the

sharing takes place on disk and is done to avoid recompilation of the same source code and to spread the substantial overhead of compilation with a static compiler over multiple runs of an application.

Overall, in contrast with our work, other solutions focused on reusing existing, highly optimizing and specialized JITs, on translating and optimizing larger bodies of code, or on distribution and caching of the compilation process itself. They targeted other environments, or attempted to apply other optimization techniques, not necessarily taking into account the reusability of the solution or the implementation and maintenance costs. We summarize the properties of our solution in the next section.

6.8 Conclusions and future directions

In this chapter we presented a source-optimized superinstructions technique applied to Java and OCaml interpreters. Our solution used a static compiler, GCC, as the main optimization tool, thereby ensuring portability and avoiding the costly development of a JIT. While our implementation used a code-copying-enabled VM as its basis, the technique itself is, in fact, independent of code-copying and can be implemented without the specialized compiler support required for safe code-copying. The performance improvement over previously evaluated purely code-copying interpreters is most visible in the case of OCaml, with a speedup of 27% on average and 81% at maximum. The Java interpreter, SableVM, shows smaller benefits of 2-2.5% on average, with the top-5 benchmarks improving by 4-5% and 8% at maximum. We attribute the large improvements in OCaml to the better optimization opportunities made available to the compiler. At the same time we note that even with this improvement the OCaml interpreter does not seem to beat Java bytecode interpretation performance on an identical task, although more evaluation points are necessary to make this statement conclusive.

Our work can be further extended in several directions. An interpreter able to perform on-stack replacement could be used so that the optimized interpreter

function can be loaded at VM runtime. We also did not attempt to optimize the GCC compilation flags, which could be used to speed up the compilation process while still regaining the majority of the performance improvement. Overall, we achieved a very good performance improvement with this technique using relatively few resources and building upon existing solutions such as GCC and the support for safe code-copying.

Chapter 7

Conclusions and Future Work

7.1 Conclusions

While highly-optimizing compilers will always offer better performance than any interpreter-based solution, there will also always be groups of language developers and users for whom such compilers are much too expensive to write and maintain. These groups are on the lookout for cheaper solutions offering the best possible performance at a lower cost. Our goal in this work has been to develop and improve methods for maximizing the performance of virtual machines in ways that require relatively little effort from a VM programmer.

Part of this goal can be achieved by employing solutions that improve the hardware-software interplay. We have added compiler support for source-specified optimization barriers, solving the long-standing problem of safety in code-copying virtual machine design. Our low-maintenance approach to this important safety problem offers a good trade-off between performance and development costs and allows CPU branch predictors to work with the VM design instead of against it.

In the first milestone of our work we designed a C/C++ language extension that allows programmers to express the need for the special safety guarantees of code-copying. We implemented this extension in the highly-optimizing, industry-standard GNU C Compiler (GCC). Despite the unusual nature of our changes we managed to follow


the best compiler design practices and ensure the long-term maintainability of our modifications within the GCC framework (see Section 4.6 for details). This makes it more likely that the extension will gain wider industry support in the future.

In the second milestone of our work we implemented safe code-copying in Java, OCaml, and Ruby to practically demonstrate the applicability of our compiler extension and the performance gains delivered by code-copying. To comprehensively test and analyze the performance and the issues of software-hardware interplay we gathered extensive data on three largely different architectures: Intel 32-bit, x86_64, and PowerPC 64-bit. This shows the varying nature of the performance improvement across architectures and the importance of considering both language-VM design and hardware-software interplay issues. OCaml on Intel 32-bit was 2.81 times faster on average, and up to 4.8 times at maximum. The performance of Ruby, however, improved only slightly, by an average factor of 1.03 to 1.14 depending on the architecture. Java improved significantly, by an average factor of 1.32 on x86_64 and 1.44 on Intel 32-bit, but only 1.04 on PowerPC 64-bit. Our investigation included a detailed analysis of the causes of these vast performance differences and identified several influencing factors, the most important being the average size of language bytecodes and the behavior of branch predictors on each hardware platform. We believe these results and the detailed conclusions our work contains (see Section 5.5) will provide guidance to developers evaluating the potential use of safe code-copying in their virtual machines.

In the third milestone of our work, to further improve the performance of our system, we implemented an ahead-of-time-based approach with a compilation server. This solution focuses on specializing interpreter source code to each application and optimizing the source code of groups of bytecode instructions. By using a cache to store specialized VM binaries, the overhead of compilation is spread over multiple runs of a virtual machine. With this system we achieved average speedups of 27% for OCaml and 4-5% on selected Java benchmarks over the safe code-copying technique (not direct-threading). The technique itself is in fact completely independent of code-copying and can be implemented as an alternative VM engine with no need for the specialized compiler support required for code-copying.


In summary, our work presents two largely different approaches, both offering great performance and applicable to a variety of virtual machines. Code-copying has the advantage of being fully dynamic and adaptable, and is thus suitable for nearly any VM, but it requires specialized compiler support to ensure execution safety. Our AOT-based approach works with any compiler, but requires an on-disk cache and is most suitable for frequently run or long-running applications. Both solutions offer performance far better than the industry-standard direct-threading, and the choice between them depends on the particularities of the system they will enhance.

7.2 Future Work

The work presented in this thesis can be further extended in several directions. It would be advantageous to extend the safe code-copying support we implemented in the GNU C Compiler (GCC) to more architectures and to ensure that the design generalizes. Moreover, the mechanism used within GCC to implement the special guarantees for code-copying demonstrates a more general approach to providing a variety of localized changes to the generated code. Other uses for such an approach include specialized optimization barriers, turning selected optimizations on and off for selected blocks of code, disabling optimizations for sections of code that are being debugged while keeping the rest highly optimized, and so on. Such features are not currently available in GCC because they seem difficult to create and maintain, but our approach to code-copying can be used as the basis for their implementation.

While we gathered extensive performance results and software and hardware behavior metrics for multiple VMs and hardware platforms, the evaluation of performance on current CPUs is non-trivial and certain observed behaviors could still use a better explanation. We have attributed the disappointing improvement on the PowerPC platform to branch predictor design, but fuller detail on the specific branch prediction features and their operation, and how they relate to our design, would give a more complete explanation of performance. Such detail, unfortunately, is not readily available for commercial and proprietary CPUs.


Currently, in our AOT-based solution, when an optimized VM binary is created it can be used by the next invocation of the VM, while currently executing instances cannot take advantage of the optimized version. This could be addressed by performing on-stack replacement (OSR), so that the optimized version of the VM could be loaded at VM runtime. On-stack replacement is a mechanism whereby the state of a currently executing function (or method) is saved and execution is transferred to a new implementation of that function (or method), which then continues. In the case of an interpreter this would allow us to replace the interpreter engine, currently executing the function containing the main interpreter loop, with a new, optimized (specialized) one. Additionally, to further reduce the cost of compilation, the GCC flags used could be tuned to include only the core optimizations and thus speed up the compilation process while still regaining the majority of the performance improvement.

The Ruby VM (Yarv) contains many bytecodes that are currently unfit for code-copying due to their large size. We suspect that this situation could be improved by bytecode splitting and specialization. For example, more detailed traces could be used to track the behavior of the bytecodes themselves and then, after analysis, to automatically produce simpler specialized bytecodes where necessary. In certain cases the existing bytecode can be transformed and specialized at load time, as is currently done in the Java interpreter we used. We also hope that the gains from the use of code-copying and the advantage provided by our GCC extensions will be recognized, and that the support necessary for this technique will one day be available in some of the major compilers.
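To make the on-stack-replacement idea above slightly more concrete, the sketch below shows one possible shape of such a hand-over for an interpreter: the running engine packs its essential state into a structure and transfers control to a newly loaded engine function, which resumes from that state. Every name here is hypothetical; this is not an existing SableVM or OCaml mechanism, only an outline of the direction suggested.

#include <stdio.h>

/* Hypothetical snapshot of interpreter state; a real VM would also need
 * the operand stack contents, frame pointers, and so on. */
struct interp_state {
    int pc;                       /* index of the next bytecode */
    int sp;                       /* operand stack depth        */
    int stack[64];
};

typedef void (*engine_fn)(struct interp_state *);

/* The newly compiled (specialized) engine resumes where the old one stopped. */
static void optimized_engine(struct interp_state *s) {
    printf("resuming at pc=%d, sp=%d\n", s->pc, s->sp);
}

/* The old engine notices that a newer binary is available, saves its state
 * and transfers control to it instead of continuing its own loop. */
static void old_engine(struct interp_state *s, engine_fn newer) {
    /* ... interpret some bytecodes ... */
    s->pc = 42;
    s->sp = 3;
    if (newer) {
        newer(s);                 /* hand over; do not return to the old loop */
        return;
    }
    /* ... otherwise keep interpreting ... */
}

int main(void) {
    struct interp_state st = { 0 };
    old_engine(&st, optimized_engine);
    return 0;
}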

Bibliography

[AK02] Randy Allen and Ken Kennedy. Optimizing compilers for modern architectures: a dependence-based approach. Morgan Kaufmann Publishers, 2002.

[AR02] Matthew Arnold and Barbara G. Ryder. Thin guards: A simple and effective technique for reducing the penalty of dynamic class loading. In ECOOP '02: Proceedings of the 16th European Conference on Object-Oriented Programming, pages 498–524, London, UK, 2002. Springer-Verlag.

[ASU86] Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986.

[Bél04] David Bélanger. SableJIT: A retargetable just-in-time compiler. Master's thesis, McGill University, August 2004.

[BDB00] Vasanth Bala, Evelyn Duesterwald, and Sanjeev Banerjia. Dynamo: a transparent dynamic optimization system. In PLDI ’00: Proceedings of the ACM SIGPLAN 2000 conference on Programming Language Design and Implementation, pages 1–12, New York, NY, USA, 2000. ACM Press.

[BDF+03] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Xen and the art of virtualization. In SOSP '03: Proceedings of the nineteenth ACM Symposium on Operating Systems Principles, pages 164–177, New York, NY, USA, 2003. ACM Press.

[BGH+06] S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. In OOPSLA '06: Proceedings of the 21st annual ACM SIGPLAN conference on Object-Oriented Programming, Systems, Languages, and Applications, pages 169–190, New York, NY, USA, October 2006. ACM Press.

[BK08] Derek Bruening and Vladimir Kiriansky. Process-shared and persistent code caches. In VEE '08: Proceedings of the fourth ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 61–70, New York, NY, USA, 2008. ACM.

[BKGB06] Derek Bruening, Vladimir Kiriansky, Timothy Garnett, and Sanjeev Banerji. Thread-shared software code caches. In CGO ’06: Proceedings of the International Symposium on Code Generation and Optimization, pages 28–38, Washington, DC, USA, 2006. IEEE Computer Society.

[Bot03] Per Bothner. Compiling Java with GCJ. Linux Journal, 2003(105):4, 2003.

[BVZB05] Marc Berndl, Benjamin Vitale, Mathew Zaleski, and Angela Demke Brown. Context threading: A flexible and efficient dispatch technique for virtual machine interpreters. Code Generation and Optimization (CGO), IEEE/ACM International Symposium on, 0:15–26, 2005.

[CCa] CCache. http://ccache.samba.org.

[CLLM04] C. Consel, J.L. Lawall, and A.-F. Le Meur. A tour of Tempo: A program specializer for the C language. Science of Computer Programming, 2004.


[CLQ06] Giacomo Cabri, Letizia Leonardi, and Raffaele Quitadamo. Enabling Java mobile computing on the IBM Jikes research virtual machine. In PPPJ ’06: Proceedings of the 4th international symposium on Principles and practice of programming in Java, pages 62–71, New York, NY, USA, 2006. ACM Press.

[Deb] Debian Shootout. http://shootout.alioth.debian.org/.

[Dis] DistCC. http://www.distcc.org.

[EG01] M. Anton Ertl and David Gregg. The behavior of efficient virtual machine interpreters on modern architectures. In Euro-Par '01: Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing, pages 403–412, London, UK, 2001. Springer-Verlag.

[EG03a] M. Ertl and D. Gregg. The structure and performance of efficient interpreters. Journal of Instruction-Level Parallelism, 5:1–25, 2003.

[EG03b] M. Anton Ertl and David Gregg. Implementation issues for superinstructions in Gforth. In EuroForth 2003 Conference Proceedings, 2003.

[EG03c] M. Anton Ertl and David Gregg. Optimizing indirect branch prediction accuracy in virtual machine interpreters. In SIGPLAN ’03 Conference on Programming Language Design and Implementation, 2003.

[EG04] M. Anton Ertl and David Gregg. Retargeting JIT compilers by using C-compiler generated executable code. In Parallel Architecture and Compilation Techniques (PACT '04), pages 41–50, 2004.

[EGKP02] M. Anton Ertl, David Gregg, Andreas Krall, and Bernd Paysan. Vmgen: a generator of efficient virtual machine interpreters. Software–Practice and Experience, 32(3):265–294, 2002.

[Ert95] M. Anton Ertl. Stack caching for interpreters. SIGPLAN Not., 30(6):315–327, 1995.


[ETK06] M. Anton Ertl, Christian Thalinger, and Andreas Krall. Superinstructions and replication in the Cacao JVM interpreter. Journal of .NET Technologies, 4:25–32, 2006.

[Fra97] M. Franz. Run-time code generation as a central system service. In HOTOS '97: Proceedings of the 6th Workshop on Hot Topics in Operating Systems (HotOS-VI), page 112, Washington, DC, USA, 1997. IEEE Computer Society.

[FSF06] FSF. GNU Compiler Collection Internals. GNU Project, 2006.

[Gag02] Etienne M. Gagnon. A Portable Research Framework for the Execution of Java Bytecode. PhD thesis, McGill University, 2002.

[Gag03] Etienne M. Gagnon. Porting and tuning inline-threaded interpreters. In CASCON 2003 workshop reports, 2003.

[GCJ] GCJ. http://gcc.gnu.org/java/.

[GH98] Etienne M. Gagnon and Laurie J. Hendren. SableCC, an object-oriented compiler framework. In TOOLS ’98: Proceedings of the Technology of Object-Oriented Languages and Systems, page 140, Washington, DC, USA, 1998. IEEE Computer Society.

[GH01] Etienne Gagnon and Laurie Hendren. SableVM: A research framework for the efficient execution of Java bytecode. In Java Virtual Machine Research and Technology Symposium, 2001.

[GH03] Etienne Gagnon and Laurie Hendren. Effective inline-threaded interpretation of Java bytecode using preparation sequences. In Compiler Construction, 12th International Conference, 2003.

[GPM+04] Brian Grant, Matthai Philipose, Markus Mock, Craig Chambers, and Susan J. Eggers. A retrospective on: “an evaluation of staged run-time optimizations in DyC”. SIGPLAN Not., 39(4):656–669, 2004.


[GVG04] Dayong Gu, Clark Verbrugge, and Etienne Gagnon. Code layout as a source of noise in JVM performance. In Component And Middleware Performance workshop, OOPSLA 2004, 2004.

[GVG06] Dayong Gu, Clark Verbrugge, and Etienne M. Gagnon. Relative factors in performance analysis of Java virtual machines. In VEE '06: Proceedings of the 2nd international conference on Virtual execution environments, pages 111–121. ACM Press, 2006.

[Hos06] Matthew E. Hoskins. User-mode linux. Linux J., 2006(145):2, 2006.

[IBM] IBM JIT. http://publib.boulder.ibm.com/infocenter/javasdk/v1r4m2/index.jsp?topic=/com.ibm.java.doc.diagnostics.142/html/underthejvm.html.

[Jik] JikesRVM. http://jikesrvm.org.

[JMSG01] Pramod G. Joisha, Samuel P. Midkiff, Mauricio J. Serrano, and Manish Gupta. A framework for efficient reuse of binary code in Java. In ICS ’01: Proceedings of the 15th international conference on Supercomputing, pages 440–453, New York, NY, USA, 2001. ACM Press.

[Jyt] Jython. http://www.jython.org.

[Kaf] Kaffe. http://www.kaffe.org.

[KWM+08] Thomas Kotzmann, Christian Wimmer, Hanspeter Mössenböck, Thomas Rodriguez, Kenneth Russell, and David Cox. Design of the Java HotSpot(TM) client compiler for Java 6. ACM Trans. Archit. Code Optim., 5(1):1–32, 2008.

[LA04] Chris Lattner and Vikram Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO ’04: Proceedings of the international symposium on Code generation and optimization, page 75, Washington, DC, USA, 2004. IEEE Computer Society.


[LDM04] Han B. Lee, Amer Diwan, and Eliot B. Moss. Design, implementation, and evaluation of a compilation server. Technical Report CU-CS-978-04, University of Colorado, 2004.

[LM03] Tom Lunney and Aidan McCaughey. Object persistence in Java. In PPPJ ’03: Proceedings of the 2nd international conference on Principles and practice of programming in Java, pages 115–120, New York, NY, USA, 2003. Computer Science Press, Inc.

[MK00] Geetha Manjunath and Venkatesh Krishnan. A small hybrid JIT for embedded systems. SIGPLAN Not., 35(4):44–50, 2000.

[MP08] Kenneth Ma and Marius Pirvu. AOT compilation in a dynamic environment for startup time improvement. In 7th Workshop on Compiler-Driven Performance, 2008.

[Muc97] Steven S. Muchnick. Advanced compiler design and implementation. Morgan Kaufmann Publishers, 1997.

[MY01] Hidehiko Masuhara and Akinori Yonezawa. Run-time bytecode specialization. Lecture Notes in Computer Science, 2053, 2001.

[NW02] Matt Newsome and Des Watson. Proxy compilation of dynamically loaded Java classes with MoJo. In LCTES/SCOPES ’02: Proceedings of the joint conference on Languages, compilers and tools for embedded systems, pages 204–212, New York, NY, USA, 2002. ACM Press.

[NW06] Maurice Naftalin and Philip Wadler. Java Generics and Collections. O’Reilly Media, Inc., 2006.

[OCa] OCaml. http://caml.inria.fr.

[OPr] OProfile. http://oprofile.sf.net/.

[Par] ParrotVM. http://www.parrotcode.org/.


[PBF+03] K. Palacz, J. Baker, C. Flack, C. Grothoff, H. Yamauchi, and J. Vitek. Engineering a customizable intermediate representation. In IVME '03: Proceedings of the 2003 workshop on Interpreters, virtual machines and emulators, pages 67–76, New York, NY, USA, 2003. ACM.

[PGA07] Gregory B. Prokopski, Etienne M. Gagnon, and Christian Arcand. Bytecode testing framework for SableVM code-copying engine. Technical Report SABLE-TR-2007-9, Sable Research Group, School of Computer Science, McGill University, Montréal, Québec, Canada, September 2007.

[PLDM02] Jeffrey Palm, Han Lee, Amer Diwan, and J. Eliot B. Moss. When to use a compilation service? In LCTES/SCOPES ’02: Proceedings of the joint conference on Languages, compilers and tools for embedded systems, pages 194–203, New York, NY, USA, 2002. ACM Press.

[Ple] Plex86. http://plex86.sourceforge.net/.

[PQVR+00] Patrice Pominville, Feng Qian, Raja Vallée-Rai, Laurie Hendren, and Clark Verbrugge. A framework for optimizing Java using attributes. In CASCON '00: Proceedings of the 2000 conference of the Centre for Advanced Studies on Collaborative research, page 8. IBM Press, 2000.

[PQVR+01] Patrice Pominville, Feng Qian, Raja Vallée-Rai, Laurie Hendren, and Clark Verbrugge. A framework for optimizing Java using attributes. In Reinhard Wilhelm, editor, Proceedings of the 10th International Conference on Compiler Construction (CC '01), volume 2027 of Lecture Notes in Computer Science (LNCS), pages 334–354, April 2001.

[PR98] Ian Piumarta and Fabio Riccardi. Optimizing direct threaded code by selective inlining. In PLDI ’98: Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation, pages 291–300, New York, NY, USA, 1998. ACM Press.

[PV07] Gregory B. Prokopski and Clark Verbrugge. Towards GCC as a compiler for multiple VMs. In GCC Developers’ Summit, 2007.


[PV08a] Gregory B. Prokopski and Clark Verbrugge. Analyzing the performance of code-copying virtual machines. In OOPSLA 2008, pages 403–422, 2008.

[PV08b] Gregory B. Prokopski and Clark Verbrugge. Compiler-guaranteed safety in code-copying virtual machines. In Compiler Construction, 17th International Conference, LNCS. Springer, 2008. To appear.

[PWL04] Jinzhan Peng, Gansha Wu, and Guei-Yuan Lueh. Code sharing among states for stack-caching interpreter. In IVME ’04: Proceedings of the 2004 workshop on Interpreters, virtual machines and emulators, pages 15–22, New York, NY, USA, 2004. ACM.

[Raw] Raw results used for this publication. http://www.sable.mcgill.ca/~gproko/gcc/multi-08-raw-results.pdf.

[rcc] Spearman's rank correlation coefficient. http://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient.

[RS96] Markku Rossi and Kengatharan Sivalingam. A survey of instruction dispatch techniques for byte-code interpreters. Technical Report TKO-C79, Faculty of Information Technology, Helsinki University of Technology, May 1996.

[Rub] Ruby On Rails. http://rubyonrails.org/.

[Sas05] Koichi Sasada. YARV: yet another RubyVM: innovating the Ruby interpreter. In OOPSLA '05: Companion to the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 158–159, New York, NY, USA, 2005. ACM.

[SGBE05] Yunhe Shi, David Gregg, Andrew Beatty, and M. Anton Ertl. Virtual machine showdown: Stack versus registers. In Virtual Execution Environments (VEE '05), pages 153–163, 2005.


[SH03] Ben Stephenson and Wade Holst. Multicodes: optimizing virtual machines using bytecode sequences. In OOPSLA '03: Companion of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 328–329, New York, NY, USA, 2003. ACM Press.

[SOT+00] T. Suganuma, T. Ogasawara, M. Takeuchi, T. Yasue, M. Kawahito, K. Ishizaki, H. Komatsu, and T. Nakatani. Overview of the IBM Java Just-in-Time compiler. IBM Systems Journal, 39(1):175–193, 2000.

[Sta] Standard Performance Evaluation Corporation. SPEC JVM98 Benchmarks. http://www.spec.org/jvm98.

[Sta04] Basile Starynkevitch. OCAMLJIT – a faster just-in-time OCaml implementation. In Proceedings of MetaOCaml 2004 workshop, 2004.

[Sto06] Mark Stoodley. Avoiding live lock when patching code in real-time execution environments. In CASCON '06: Proceedings of the 2006 conference of the Centre for Advanced Studies on Collaborative research. IBM Press, 2006.

[Sun] Sun HotSpot. http://java.sun.com/products/hotspot/whitepaper.html.

[SYK+05] Toshio Suganuma, Toshiaki Yasue, Motohiro Kawahito, Hideaki Komatsu, and Toshio Nakatani. Design and evaluation of dynamic optimizations for a Java just-in-time compiler. ACM Trans. Program. Lang. Syst., 27(4):732–785, 2005.

[SZ08] Yu Sun and Wei Zhang. Efficient code caching to improve performance and energy consumption for Java applications. In CASES '08: Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems, pages 119–126, New York, NY, USA, 2008. ACM.


[TCL+00] S. Thibault, C. Consel, J. Lawall, R. Marlet, and G. Muller. Static and dynamic program compilation by interpreter specialization. Higher-Order and Symbolic Computation, 13(3):161–178, September 2000.

[VA04] Benjamin Vitale and Tarek S. Abdelrahman. Catenation and specialization for Tcl virtual machine performance. In IVME '04: Proceedings of the 2004 workshop on Interpreters, virtual machines and emulators, pages 42–50, New York, NY, USA, 2004. ACM.

[VB04] Ankush Varma and Shuvra S. Bhattacharyya. Java-through-C compilation: An enabling technology for Java in embedded systems. In DATE '04: Proceedings of the conference on Design, automation and test in Europe, page 30161, Washington, DC, USA, 2004. IEEE Computer Society.

[VMW] VMWare. http://www.vmware.com.

[VZ05] Benjamin Vitale and Mathew Zaleski. Alternative dispatch techniques for the Tcl VM interpreter. In Proceedings of the 12th Annual Tcl/Tk Conference, October 2005.

[Wik] Wikipedia. http://en.wikipedia.org/.

[ZBB05] Mathew Zaleski, Marc Berndl, and Angela Demke Brown. Mixed mode execution with context threading. In CASCON ’05: Proceedings of the 2005 conference of the Centre for Advanced Studies on Collaborative research, pages 305–319. IBM Press, 2005.

[ZBS07] Mathew Zaleski, Angela Demke Brown, and Kevin Stoodley. Yeti: a gradually extensible trace interpreter. In VEE ’07: Proceedings of the 3rd international conference on Virtual execution environments, pages 83–93, New York, NY, USA, 2007. ACM.

[ZCS05] Shukang Zhou, Bruce R. Childers, and Mary Lou Soffa. Planning for code buffer management in distributed virtual execution environments. In VEE '05: Proceedings of the 1st ACM/USENIX international conference on Virtual execution environments, pages 100–109, New York, NY, USA, 2005. ACM.
