Two Decades of Smalltalk VM Development: Live VM Development Through Simulation Tools Eliot Miranda, Clément Bera, Elisa Gonzalez Boix, Dan Ingalls
To cite this version: Eliot Miranda, Clément Bera, Elisa Gonzalez Boix, Dan Ingalls. Two Decades of Smalltalk VM Development: Live VM development through Simulation Tools. Virtual Machines and Language Implementations VMIL 2018, 2018, Boston, United States. 10.1145/3281287.3281295. hal-01883380
HAL Id: hal-01883380
https://hal.archives-ouvertes.fr/hal-01883380
Submitted on 8 Oct 2018

Eliot Miranda, Feenk, San Francisco, California
Clément Béra, Software Languages Lab, Vrije Universiteit Brussel, Brussel, Belgium
Elisa Gonzalez Boix, Software Languages Lab, Vrije Universiteit Brussel, Brussel, Belgium
Dan Ingalls, ARCOS, Aptos, California

Abstract
OpenSmalltalk-VM is a virtual machine (VM) for languages in the Smalltalk family (e.g. Squeak, Pharo) which is itself written in a subset of Smalltalk that can easily be translated to C. Development is done in Smalltalk, an activity we call “Simulation”. The production VM is derived by translating the core VM code to C. As a result, two execution models coexist: simulation, where the Smalltalk code is executed on top of a Smalltalk VM, and production, where the same code is compiled to an executable through a C compiler.
In this paper, we detail the VM simulation infrastructure and we report our experience developing and debugging the garbage collector and the just-in-time compiler (JIT) within it. Then, we discuss how we use the simulation infrastructure to perform analysis on the runtime, directing some design decisions we have made to tune VM performance.

CCS Concepts • Software and its engineering → Runtime environments; Just-in-time compilers; Interpreters;

Keywords Just-in-Time compiler, garbage collector, virtual machine, managed runtime, tools, live development

ACM Reference Format:
Eliot Miranda, Clément Béra, Elisa Gonzalez Boix, and Dan Ingalls. 2018. Two Decades of Smalltalk VM Development: Live VM development through Simulation Tools. In Proceedings of the 10th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages (VMIL ’18), November 4, 2018, Boston, MA, USA. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3281287.3281295

VMIL ’18, November 4, 2018, Boston, MA, USA
2018. ACM ISBN 978-1-4503-6071-5/18/11...$15.00
https://doi.org/10.1145/3281287.3281295

1 Introduction
To specify their virtual machine (VM), the Smalltalk-80 team at Xerox PARC wrote a Smalltalk VM entirely in Smalltalk [GR83]. In 1995, members of the same team built Squeak [BDN+07], an open-source Smalltalk dialect, and its VM [IKM+97], written in Smalltalk using the code from Smalltalk-80 [GR83] as a starting point. Part of the code base was, however, narrowed down to a subset of Smalltalk, called Slang, to allow Smalltalk to C compilation. Development was done in Smalltalk, an activity we call "Simulation", and the support code for which is "The Simulator". The production VM is derived by translating the core VM code to C, combining this generated C code with a set of platform-specific support files, and compiling with the platform’s C compiler.

Two execution models were effectively available, simulation, where the Smalltalk code is executed on top of a Smalltalk VM, and production, where the same code is compiled to executable code through the C compiler. Simulation is used to develop and debug the VM. Production is used to release the VM. Dummy Smalltalk message sends¹ were used to embed meta information in the code, such as self var: 'foo' type: 'char *' which have no effect during simulation but guide the translation process of the executable Smalltalk.

¹ We use the Smalltalk terminology, send, to discuss virtual calls since we are talking about Smalltalk.

When the Squeak VM was released it consisted mainly in:
• an interpreter with a spaghetti stack,
• a memory manager with a compact but complex object representation: a pointer-reversing tracing garbage collector and a heap divided into two generations,
• WarpBlt, a rotation and scaling extension for the bit-based BitBlt graphics engine,
• external C code and makefiles to support running the VM on popular platforms.

The first three components of the VM were written entirely in Slang. A few extra features, such as file management, were written both in Smalltalk for simulation purposes and in C for the production VM.
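The dummy message sends described in the introduction, which do nothing in simulation but guide translation, can be illustrated with a small sketch. It is written in Python rather than Slang, and every name in it (ROUTINE_SOURCE, SimulatedVM, var_type, simulate, c_declarations) is hypothetical. It shows the general idea only and is not how the Slang-to-C translator is implemented.

```python
import ast

# Hypothetical VM routine, kept as source text so that it can be both
# executed directly ("simulation") and scanned by a toy translator.
ROUTINE_SOURCE = '''
def first_byte(vm, foo):
    vm.var_type("foo", "char *")   # dummy send: no effect when executed
    return foo[0]
'''


class SimulatedVM:
    def var_type(self, name, c_type):
        """Dummy send: ignored during simulation, read by the translator."""
        return None


def simulate(source, func_name, *args):
    """Run the routine as ordinary code; the dummy send does nothing."""
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](SimulatedVM(), *args)


def c_declarations(source):
    """Collect C declarations from var_type(...) calls found in the source."""
    decls = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "var_type"
                and len(node.args) == 2
                and all(isinstance(a, ast.Constant) for a in node.args)):
            name, c_type = (a.value for a in node.args)
            sep = "" if c_type.endswith("*") else " "
            decls.append(f"{c_type}{sep}{name};")
    return decls


result = simulate(ROUTINE_SOURCE, "first_byte", b"hi")
declarations = c_declarations(ROUTINE_SOURCE)
```

The same source text serves both execution models: running it ignores the annotation, while the translator reads the annotation without running anything.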
Over the years, the Squeak VM evolved to give birth recently to OpenSmalltalk-VM², the default VM for various Smalltalk-like systems such as Pharo [BDN+09], Squeak [BDN+07], Cuis, Croquet and Newspeak [BvdAB+10]. As the VM evolved, the original simulator co-evolved as a tool to develop and debug the VM. The metadata used to guide the translation process was replaced by pragmas [DMP16]. But the most significant evolution of the simulator came with the introduction of the Just-In-Time compiler (JIT) [Mir11]. The spaghetti stack was mapped to a more conventional stack frame organisation in a stack zone of a few hundred kilobytes organised into small pages, a scheme called context-to-stack mapping [DS84, Mir99]. A JIT was written. Machine code generated by the JIT was executed by binding multiple processor simulators (Bochs [Law96] for x86 and x64, SkyEye for ARMv6, Smalltalk code for 32-bit MIPS). Support for multiple bytecode sets was added, initially to support Newspeak, and more recently to support adaptive optimization. Finally, a new object representation and garbage collector was added to improve performance and also to support 64 bits [MB15]. This required refactoring the interpreter to allow the object representation to be chosen at start-up³. Slang was also extended with type inference to allow Smalltalk code that is independent of word size by virtue of Smalltalk’s infinite precision arithmetic to be used in both 32 and 64 bit contexts.

One of the key design decisions in OpenSmalltalk-VM was to keep the core components (i.e. the interpreter, the JIT and the memory manager) written in Slang and not in C/C++. By interpreting the Slang code as Smalltalk code, emulating native code using an external processor simulator and simulating the memory using a large byte array, it is possible to simulate the whole VM execution. This allows development and debugging of the VM with the Smalltalk development tools resulting in a live programming experience for VM development⁴.

In this paper, we present key details of the Simulation infrastructure, give examples that demonstrate its productivity advantages, and discuss some of its limitations. The paper is structured as follows. Section 2 introduces the simulation infrastructure used to develop and debug the VM. Section 3 reports our experience developing the VM with the simulation infrastructure. Section 4 discusses some of the

2 Virtual Machine Simulation
The key idea of the Simulation is to allow developers to reuse the whole Smalltalk IDE including the browser, inspectors and debugger to develop the VM. Most new features can be developed interactively, adding code to the VM at runtime, in the simulation environment, like Smalltalk programming.

In this section we first briefly introduce the compilation pipeline to generate the production VM and Smalltalk snapshots, key elements to understand the design of the simulation infrastructure. Then we describe the memory layout of both the production and simulation runtimes. Next we detail how VM execution is simulated and the interactions with the simulated memory. In the following subsection, we explain specific aspects of the simulation infrastructure such as the simulation of the machine code generated by the JIT through the processor simulator. The last subsection discusses simulation features related to the development of the JIT.

2.1 Context
VM compilation. The VM executable is generated in a two step process. Firstly, the Slang-to-C compiler translates the Slang code (interpreter, JIT and GC code) to C code, generating a few C files. This first step usually takes several seconds. Secondly, the C compiler (depending on the platform LLVM, GCC or MSVC) translates all the C files (generated C files and platform-specific C files) into an executable. This second step can take up to a few minutes the first time. Subsequent compilations usually only take a few seconds, since object files for unchanged C files do not need to be recreated.

The VM can be configured in two main flavours, interpreter-only or interpreter+JIT⁵. Although the version with the JIT is the most widely used in production, the interpreter version is convenient for development purposes and essential on devices that outlaw JITs. For example, debugging the garbage collector or evaluating new language features can be done in the interpreter-only VM, avoiding JIT complexity.

Snapshots. Smalltalk is an object system, rather than a language. The entire system, including its development tools and application code is stored in a snapshot file, which is essentially a memory dump of the entire heap.
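The two ideas above, a heap simulated with a large byte array and a snapshot that is essentially a dump of that heap, can be sketched together. The sketch is in Python and rests on stated assumptions: 32-bit little-endian words, no snapshot header, and hypothetical names (SimulatedMemory, long_at, long_at_put, snapshot, from_snapshot). It does not reproduce the actual simulator classes or the real snapshot file format.

```python
import struct

WORD_SIZE = 4  # assumption for this sketch: 32-bit words


class SimulatedMemory:
    """Heap held in one large byte array, addressed by byte offset."""

    def __init__(self, size_in_bytes):
        self.bytes = bytearray(size_in_bytes)

    def long_at(self, address):
        """Read one unsigned little-endian word at the given byte address."""
        return struct.unpack_from("<I", self.bytes, address)[0]

    def long_at_put(self, address, value):
        """Write one word at the given byte address, truncated to 32 bits."""
        struct.pack_into("<I", self.bytes, address, value & 0xFFFFFFFF)

    def snapshot(self):
        """Dump the whole heap as immutable bytes (header omitted here)."""
        return bytes(self.bytes)

    @classmethod
    def from_snapshot(cls, dump):
        """Rebuild a simulated heap from a previously dumped byte string."""
        memory = cls(len(dump))
        memory.bytes[:] = dump
        return memory


memory = SimulatedMemory(16 * WORD_SIZE)
memory.long_at_put(8, 0xCAFE)
restored = SimulatedMemory.from_snapshot(memory.snapshot())
```

Because the heap is an ordinary object in the host language, it can be inspected, copied and restored with the host's own tools, which is the property the simulation approach relies on.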