The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages

Total Page:16

File Type:pdf, Size:1020Kb

The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy Craig Chambers March 13, 1992 © Copyright by Craig Chambers 1992 All Rights Reserved ii I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. David Ungar (Principal Advisor) I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. John Hennessy I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. Mark Linton Approved for the University Committee on Graduate Studies: Dean of Graduate Studies iii Abstract Object-oriented programming languages promise to improve programmer productivity by supporting abstract data types, inheritance, and message passing directly within the language. Unfortunately, traditional implementations of object-oriented language features, particularly message passing, have been much slower than traditional implementations of their non-object-oriented counterparts: the fastest existing implementation of Smalltalk-80 runs at only a tenth the speed of an optimizing C implementation. The dearth of suitable implementation technology has forced most object-oriented languages to be designed as hybrids with traditional non-object-oriented languages, complicating the languages and making programs harder to extend and reuse. This dissertation describes a collection of implementation techniques that can improve the run-time performance of object-oriented languages, in hopes of reducing the need for hybrid languages and encouraging wider spread of purely object-oriented languages. The purpose of the new techniques is to identify those messages whose receiver can only be of a single representation and eliminate the overhead of message passing by replacing the message with a normal direct procedure call; these direct procedure calls are then amenable to traditional inline-expansion. The techniques include a type analysis component that analyzes the procedures being compiled and extracts representation-level type information about the receivers of messages. To enable more messages to be optimized away, the techniques include a number of transformations which can increase the number of messages with a single receiver type. Customization transforms a single source method into several compiled versions, each version specific to a particular inheriting receiver type; customization allows all messages to self to be inlined away (or at least replaced with direct procedure calls). To avoid generating too much compiled code, the compiler is invoked at run-time, generating customized versions only for those method/receiver type pairs used by a particular program. Splitting transforms a single path through a source method into multiple separate fragments of compiled code, each fragment specific to a particular combination of run-time types. Messages to expressions of these discriminated types can then be optimized away in the split versions. The techniques are designed to coexist with other requirements of the language and programming environment, such as generic arithmetic, user-defined control structures, robust error-checking language primitives, source-level debugging, and automatic recompilation of out-of-date methods after a programming change. These techniques have been implemented as part of the compiler for the SELF language, a purely object-oriented language designed as a refinement of Smalltalk-80. If only pre-existing implementation technology were used for SELF, programs in SELF would run one to two orders of magnitude slower than their counterparts written in a traditional non-object-oriented language. However, by applying the techniques described in this dissertation, the performance of the SELF system is five times better than the fastest Smalltalk-80 system, better than that of an optimizing Scheme implementation, and close to half that of an optimizing C implementation. These techniques could be applied to other object-oriented languages to boost their performance or enable a more object-oriented programming style. They also are applicable to non-object-oriented languages incorporating generic arithmetic or other generic operations, including Lisp, Icon, and APL. Finally, they might be applicable to languages that include multiple representations or states of a single program structure, such as logic variables in Prolog and futures in Multilisp. iv Acknowledgments My heartfelt appreciation and thanks go to my advisor, David Ungar, for providing me with the great opportunity to participate in the SELF project. David always treated me as a colleague, promoted my work to others, and did his best to prepare me for independent research. He taught me much about research, writing, speaking, advising, and cartooning. I could never have had such an enjoyable and educational graduate experience without him. I will strive to be as good to my students as he was to me. Besides serving on my thesis reading committee and providing much important feedback, Mark Linton also provided advice and direction during my first year at Stanford. His weekly research group meetings throughout my years at Stanford were interesting and educational. I appreciate Mark’s strong support of my work. John Hennessy also served on my reading committee; his perceptive comments on my dissertation significantly improved its presentation, and in the process taught me a great deal about good research. I also thank the other members of the SELF team for stimulating discussions and fun outings. Thanks to Elgin Lee for sharing an office and the birth of the SELF system with me. Bay-Wei Chang and Urs Hölzle worked and played with me as the SELF project matured. These patient people put up with my not-so-occasional coffee-induced tirades and continued to listen and discuss after the caffeine wore off. I will always remember fondly the times we’ve shared. More recently, Randy Smith, Ole Agesen, John Maloney, and Lars Bak have kept the SELF group a fun and stimulating collection of people. Others at Stanford provided moral support and social diversions that helped me through. Ross Finlayson, Steve “Brat” Goldberg, Paul Calder, and John Vlissides were particularly good friends. My brother-in-common-law, Martin Rinard, made my time at Stanford interesting, to say the least. I appreciate the support and patience of my new colleagues at the University of Washington as I finished this dissertation concurrently with my other responsibilities. By helping to make my transition from student to professor so smooth, they enabled me to finish this tome relatively quickly and painlessly. I also owe much to my early training as an undergraduate at MIT in Barbara Liskov’s research group. Mark Day, Bob Scheifler, Paul Johnson, and Bill Weihl, my undergraduate thesis advisor, taught me about research and real system-building, and this experience enabled me to get started on research at Stanford right away. Finally, I thank my family for their continued support and encouragement. I thank Bill McLaughlin, one of my real brothers-in-law, for his help in putting together the final version of this thesis and taking care of the last administrative details. To my wife, Sylvia, I owe my sincerest gratitude and give my deepest love. This research has been generously supported by the National Science Foundation, Sun Microsystems, IBM, Apple Computer, Cray Laboratories, Tandem Computers, NCR, Texas Instruments, and Digital Equipment Corporation. v vi Table of Contents Chapter 1 Introduction . .1 1.1 Introduction 1 1.2 The SELF Language 1 1.3 Our Research 1 1.4 Outline 2 Chapter 2 Language Features and Implementation Problems. .3 2.1 Abstract Data Types 3 2.1.1 Benefits to Programmers 3 2.1.2 Implementation Effects 4 2.2 Object-Oriented Programming 4 2.2.1 Benefits to Programmers 4 2.2.1.1 Message Passing 4 2.2.1.2 Inheritance 6 2.2.2 Implementation Effects 7 2.2.3 Traditional Compromises 7 2.3 User-Defined Control Structures 9 2.3.1 Benefits to Programmers 9 2.3.2 Implementation Effects 9 2.3.3 Traditional Compromises 9 2.4 Safe Primitives 10 2.4.1 Benefits to Programmers 10 2.4.2 Implementation Effects 10 2.4.3 Traditional Compromises 10 2.5 Generic Arithmetic 10 2.5.1 Benefits to Programmers 10 2.5.2 Implementation Effects 11 2.5.3 Traditional Compromises 11 2.6 Summary 11 Chapter 3 Previous Work . .13 3.1 Smalltalk Systems 13 3.1.1 The Smalltalk-80 Language 13 3.1.2 Deutsch-Schiffman Smalltalk-80 System 13 3.1.3 Typed Smalltalk and TS Optimizing Compiler 16 3.1.4 Atkinson’s Hurricane Compiler 19 3.1.5 Summary 20 3.2 Statically-Typed Object-Oriented Languages 20 3.2.1 C++ 20 3.2.2 Other Fast Dispatch Mechanisms 23 3.2.3 Trellis/Owl 25 3.2.4 Emerald 26 3.2.5 Eiffel 27 3.2.6 Summary 27 vii 3.3 Scheme Systems 27 3.3.1 The Scheme Language 27 3.3.2 The T Language and the ORBIT Compiler 28 3.3.3 Shivers’ Control Flow Analysis and Type Recovery 28 3.3.4 Summary 28 3.4 Traditional Compiler Techniques 29 3.4.1 Data Flow Analysis 29 3.4.2 Abstract Interpretation 30 3.4.3 Partial Evaluation 30 3.4.4 Common Subexpression Elimination and Static Single Assignment Form 30 3.4.5 Wegman’s Node Distinction 31 3.4.6 Procedure Inlining 31 3.4.7 Register Allocation 32 3.4.8 Summary 32 Chapter 4 The SELF Language .
Recommended publications
  • Adaptive Optimization for Self: Reconciling High Performance with Exploratory Programming
    ADAPTIVE OPTIMIZATION FOR SELF: RECONCILING HIGH PERFORMANCE WITH EXPLORATORY PROGRAMMING A DISSERTATION SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE AND THE COMMITTEE ON GRADUATE STUDIES IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Urs Hölzle August 1994 © Copyright by Urs Hölzle 1994 All rights reserved I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. _______________________________________ David M. Ungar (Principal Advisor) I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. _______________________________________ John L. Hennessy (Co-Advisor) I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. _______________________________________ L. Peter Deutsch I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. _______________________________________ John C. Mitchell Approved for the University Committee on Graduate Studies: _______________________________________ iii iv Abstract Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries often incurs a substantial run-time overhead in the form of frequent procedure calls. Thus, pervasive use of abstrac- tion, while desirable from a design standpoint, may be impractical when it leads to inefficient programs.
    [Show full text]
  • University of California, Irvine Dissertation Doctor
    UNIVERSITY OF CALIFORNIA, IRVINE Computational State Transfer: An Architectural Style for Decentralized Systems DISSERTATION submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in Information and Computer Science by Michael Martin Gorlick Dissertation Committee: Professor Richard N. Taylor, Chair Associate Professor James A. Jones Professor Crista V. Lopes 2016 Copyright © 2016 Michael Martin Gorlick DEDICATION To the memory of my mother Doreen Viola Gorlick (née Rose) December 1927 – February 2015 and my brother Patrick David Gorlick February 1953 – December 2008 ii TABLE OF CONTENTS Page LIST OF FIGURES vi LIST OF TABLES viii ACKNOWLEDGMENTS ix CURRICULUM VITAE xi ABSTRACT OF THE DISSERTATION xv INTRODUCTION 1 CHAPTER 1: Computation Exchange — From Idiom to Style 14 Thesis and Claims 15 Architecture Can Induce Security 17 Architecture Can Induce Adaptation 18 Capability Confines Risk 27 Computation Exchange: A Historical Perspective 28 Security for Computation Exchange 32 Summary 34 CHAPTER 2: COAST Roots — A Brief History of Mobile Code 35 Early History of Mobile Code (1960–1985) 35 Remote Function Evaluation (1987) 37 Remote Evaluation (1986) 38 Mobile Objects 38 Functional Mobile Code Languages (1995–2007) 41 Mobile Agents 45 Mobility Semantics (1996) 47 COAST Dictates Language Semantics and Infrastructure 49 CHAPTER 3: Computational State Transfer: The Style 50 The Threat Model of Computation Exchange 51 The Communication Model of COAST 54 The COAST Architectural Style 60 COAST Scenarios 64
    [Show full text]
  • Generation Scavenging: a Non-Disruptlve High Perfornm.Nce Storage Reclamation Algorithm
    Generation Scavenging: A Non-disruptlve High Perfornm.nce Storage Reclamation Algorithm David Ungar Computer Science Division Department of Electrical Engineering and Computer Sciences University of California Berkeley, California 94720 1. Introduction ABSTRACT Researchers have designed several interactive pro- gramming environments to expedite software construe- Many interactive computing environmencs tion [She83] Central to such environments are high level provide automatic storage reclamation and vir- languages like Lisp, Cedar Mesa, and Smalltalk-80 TM tual memory to ease the burden of managing iGoR83], that provide virtual memory and automatic storage. Unfortunately, many storage reclams~- storage reclamation. Traditionally, the cost of a storage tion algorithms impede interaction with dis- management strategy has been measured by its use of tracting pauses. Generation Scavenging is a rec- CPU time, primary memory, and backing store opera- lamation algorithm that has no noticeable tions averaged over a session. Interactive systems pauses, eliminates page faults for transient demand good short-term performance as well. Large, objects, compacts objects without resorting to unexpected pauses caused by thrashing or storage recla- indirection, and reclaims circular structures, in mation are distracting and reduce productivity. We have one third the time of traditional approaches. designed~ implemented, and measured Generation We have incorporated Generation Scaveng. Scavenging, a new garbage collector that int. in Berkeley Smalltalk (BS), our Smalltalk- • limits pause times to a fraction of a second, 80* implementation, and instrumented it to obtain performance data. We are also designing • requires no hardware support, a microprocessor with hardware support for • meshes well with virtual memory, Generation Scavenging. • reclaims circular structures, and Keywords: garbage collection, generation, per- • uses less than 2% of the CPU time in one Smalltalk sonal computer, real time, scavenge, Smalltalk, system.
    [Show full text]
  • Gilad Bracha Curriculum Vitae
    Gilad Bracha Curriculum Vitae April 8, 2021 Executive Summary • Specializing in design of programming languages, core platform facilities and development tools • Currently at F5 Networks. • Co-designer of the Dart programming language at Google. • VP, Cloud Programming Model, SAP Labs. • Designer of the Newspeak programming language and platform. • Distinguished Engineer at Cadence Design Systems. • Distinguished Engineer at Sun Microsystems. • Java Language architect and maintainer and co-author of the Java Lan- guage Specification. • Co-maintainer and co-author of the Java Virtual Machine Specification. • Member of the Animorphic Smalltalk team. • Recognized authority on object-oriented language research and develop- ment. Awarded the senior Dahl-Nygaard prize in 2017. • Track record of bringing leading edge research results into practical indus- trial application. 1 Education • Ph.D. Computer Science, 1991, University of Utah. Gary Lindstrom, ad- visor. Research area: Object Oriented Programming Languages. Disser- tation title: The Programming Language Jigsaw: Modularity, Mixins and Multiple Inheritance. • B.Sc. Mathematics and Computer Science, Ben-Gurion University, Israel, 1983. Employment • February 2020 - present Architect, Shape Security division of F5 Net- works. • July 2019 - January 2020. Distinguished Engineer, Shape Security. • July 2017 - April 2019. Tensyr. { Responsibility: Co-designed dataflow model for defining autonomous vehicle behavior. Used Lua, Python and OCaml to implement DSL serving as front end to on-vehicle dataflow engine. • July 2011 - July 2017. Google. { Responsibility: Co-designer of the Dart programming language, responsible for the written language specification and design of re- flective libraries. • June 2010 - July 2011. Vice President, Cloud Programming Model, Office of CTO, SAP Labs. • January 2009 - May 2010. Independent consultant.
    [Show full text]
  • Exploratory and Live, Programming and Coding: a Literature Study
    Exploratory and Live, Programming and Coding A Literature Study Comparing Perspectives on Liveness Patrick Reina, Stefan Ramsona, Jens Linckea, Robert Hirschfelda, and Tobias Papea a Hasso Plattner Institute, University of Potsdam, Germany Abstract Various programming tools, languages, and environments give programmers the impression of changing a program while it is running. This experience of liveness has been discussed for over two decades and a broad spectrum of research on this topic exists. Amongst others, this work has been carried out in the communities around three major ideas which incorporate liveness as an important aspect: live programming, exploratory programming, and live coding. While there have been publications on the focus of each particular community, the overall spectrum of liveness across these three communities has not been investigated yet. Thus, we want to delineate the variety of research on liveness. At the same time, we want to investigate overlaps and differences in the values and contributions between the three communities. Therefore, we conducted a literature study with a sample of 212 publications on the terms retrieved from three major indexing services. On this sample, we conducted a thematic analysis regarding the following aspects: motivation for liveness, application domains, intended outcomes of running a system, and types of contributions. We also gathered bibliographic information such as related keywords and prominent publica- tions. Besides other characteristics the results show that the field of exploratory programming is mostly about technical designs and empirical studies on tools for general-purpose programming. In contrast, publications on live coding have the most variety in their motivations and methodologies with a majority being empirical studies with users.
    [Show full text]
  • 11 European Lisp Symposium
    Proceedings of the 11th European Lisp Symposium Centro Cultural Cortijo de Miraflores, Marbella, Spain April 16 – 17, 2018 In-cooperation with ACM Dave Cooper (ed.) ISBN-13: 978-2-9557474-2-1 ISSN: 2677-3465 Contents ELS 2018 iii Preface Message from the Program Chair Welcome to the 11thth edition of the European Lisp Symposium! This year’s ELS demonstrates that Lisp continues at the forefront of experimental, academic, and practical “real world” computing. Both the implementations themselves as well as their far-ranging applications remain fresh and exciting in ways that defy other programming lan- guages which rise and fall with the fashion of the day. We have submissions spanning from the ongoing refinement and performance improvement of Lisp implementation internals, to prac- tical and potentially lucrative real-world applications, to the forefront of the brave new world of Quantum Computing. Virtually all the submissions this year could have been published, and it was a challenge to narrow them down enough to fit the program. This year’s Program also leaves some dedicated time for community-oriented discussions, with the purpose of breathing new life and activity into them. On Day One, the Association of Lisp Users (ALU) seeks new leadership. The ALU is a venerable but sometimes dormant pan-Lisp organization with a mission to foster cross-pollenization among Lisp dialects. On Day Two, the Common Lisp Foundation (CLF) will solicit feedback on its efforts so far and will brainstorm for its future focus. On a personal note, this year I was fortunate to have a “working retreat” in the week prior to ELS, hosted by Nick & Lauren Levine in their villa nestled in the hills above Marbella.
    [Show full text]