A New Backend for Standard ML of New Jersey (Draft Paper)

Kavon Farvardin
Computer Science, University of Chicago
Chicago, IL, USA
[email protected]

John Reppy
Computer Science, University of Chicago
Chicago, IL, USA
[email protected]

ABSTRACT
This paper describes the design and implementation of a new backend for the Standard ML of New Jersey (SML/NJ) system that is based on the LLVM compiler infrastructure. We first describe the history and design of the current backend, which is based on the MLRISC framework. While MLRISC has many similarities to LLVM, it provides a lower-level, policy-agnostic approach to code generation that enables customization of the code generator for non-standard runtime models (i.e., register pinning, calling conventions, etc.). In particular, SML/NJ uses a stackless runtime model based on continuation-passing style with heap-allocated continuation closures. This feature, and others, pose challenges to building a backend using LLVM. We describe these challenges and how we address them in our backend.

KEYWORDS
Code Generation, Compilers, LLVM, Standard ML, Continuation-Passing Style

ACM Reference Format:
Kavon Farvardin and John Reppy. 2020. A New Backend for Standard ML of New Jersey (Draft Paper). In Proceedings of 32nd Symposium on Implementation and Application of Functional Languages (IFL 2020). ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/1122445.1122456

1 INTRODUCTION
Standard ML of New Jersey is one of the oldest actively-maintained functional language implementations in existence [1, 7]. Much like the proverbial “Ship of Theseus,” every part of the compiler, runtime system, and libraries has been reimplemented at least once, with some parts having been reimplemented half a dozen times or more.

The backend of the compiler is one such example. The original code generator translated a direct-style λ-calculus intermediate representation (IR) to Motorola 68000 and DEC VAX machine code [7]. Inspired by Kranz et al.'s work on the ORBIT compiler for Scheme [22, 23], Appel and Jim converted the backend to use what they called a “Continuation-Passing, Closure-Passing Style” [3, 6].¹

At the same time, additional machine-code generators were written for the MIPS and SPARC architectures, but with the proliferation of Reduced-Instruction-Set Computers (RISC) in the early 1990's, there was a need for more backends. These code generators also suffered from the problems that they did not share code, that each was a standalone effort, and that they did not support many machine-code-level optimizations. These problems led to the development of MLRISC [20] as a new, portable machine-code generator for SML/NJ. MLRISC defined an abstract load-store virtual-machine architecture that could sit between the language-specific parts of the code generator and the target-machine-specific parts, such as instruction selection, register allocation, and instruction scheduling. Over the past 25 years, MLRISC has been used to support roughly ten different target architectures in the SML/NJ system.² It has also been used by several other compilers [14–16] and as a platform for research into advanced register allocation techniques [5, 19] and SSA-based optimization [27].

Unfortunately, MLRISC is no longer under active development, so we need to consider alternatives. An obvious choice is the LLVM project, which provides a portable framework for generating and optimizing machine code [24, 25]. LLVM takes a language-centric approach to code generation by defining a low-level SSA-based [11] language, called LLVM IR, for describing code. LLVM IR has a textual representation, which we refer to as LLVM assembly code, as well as a binary representation, called bitcode, and a procedural representation in the form of a C++ API for generating LLVM IR in memory. The LLVM framework includes many analysis and optimization passes on both the target-independent LLVM IR and on machine-specific code. Most importantly, it supports the operating systems and architectures that SML/NJ supports, as well as some that we want to support in the future. While LLVM was originally developed to support C and C++ compilers, it has been used by a number of other functional-language implementations [12, 13, 26, 31, 36, 37].

Therefore, we are undertaking a project to migrate the backend of SML/NJ to use the LLVM infrastructure. This paper describes the challenges faced by this migration and how these challenges are being met. While there are many similarities between this effort and previous applications of LLVM to functional-language compilers, there are also a number of novel aspects driven by the SML/NJ runtime model and compiler architecture.

¹This CPS IR, with modifications to support multiple precisions of numeric types [17] and direct calls to C functions [9], continues to be used in the backend of the SML/NJ compiler.
²The last significant work was the addition of support for the amd64 (a.k.a., x86-64) architecture.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
IFL 2020, September 2–4, 2020, Online
© 2020 Association for Computing Machinery.
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00
https://doi.org/10.1145/1122445.1122456

Table 1: CMACHINE general purpose registers
    std-link    holds address of function for standard calls
    std-clos    holds pointer to closure object for standard calls
    std-cont    holds address of continuation
    std-arg     first general-purpose argument register
    misci       miscellaneous argument registers (including callee-save registers)

2 STANDARD ML OF NEW JERSEY
The Standard ML of New Jersey (SML/NJ) system provides both interactive compilation in the form of a Read-Eval-Print Loop (REPL) and batch compilation. In both cases, SML source code is compiled to binary machine code that is either loaded into a heap-allocated code object for execution or written to a file. Linking is handled in the elaborator, which wraps the compilation unit with a λ-abstraction that closes over its free variables; this code is then applied to the dynamic representation of the environment to link it. Dynamically, a compilation unit is represented as a function that takes a tuple of bindings for its free variables and returns a tuple representing the bindings that it has introduced. Thus, the SML/NJ system does not need to understand system-specific object-file formats or dynamic linking.

In the remainder of this section, we first describe SML/NJ's runtime conventions at an abstract level, then discuss the existing backend implementation and the MLRISC-based machine-code generator.

2.1 Runtime Conventions
As described by Appel [2, 3], SML/NJ has a runtime model that can be described as a simple abstract machine (called the CMACHINE). The CMACHINE defines a small set of special registers to represent its state; these are:
• alloc is the allocation pointer, which points to the next word to allocate in the nursery.
• limit is the allocation-limit pointer, which points to the upper limit of the nursery minus a buffer of 1024 words. This buffer, which is called the allocation slop, allows most heap-limit tests to be implemented as a simple comparison.
• store is the store-list pointer, which points to a list of locations that have been modified since the last garbage collection (i.e., it implements a write barrier).
• exnptr is the current-exception-handler pointer, which points to a closure representing the current exception handler.
• varptr is the var pointer, which is a global mutable location that can be used to implement features such as thread-local storage [28].
• base is the base-pointer register, which points to the beginning of the code object that holds the currently executing function. It is used to compute code addresses in a position-independent way.³

The alloc register is always mapped to a hardware register; the other special registers are either mapped to dedicated hardware registers or else represented by stack locations. For example, on the amd64 target, which has 16 general-purpose registers, the alloc, limit, and store registers are mapped to hardware registers, but the exnptr and varptr are represented by stack locations. The first five of these registers (alloc, limit, store, exnptr, and varptr) are live throughout the execution of SML code and, thus, are implicitly passed as parameters across calls. The base register is recomputed on entry to a function (since the caller and callee may be in different modules), and is threaded through the body of the function.

In addition, the compiler assumes that intermediate results, arguments to primitive operations, and arguments to function calls are always held in registers. The CMACHINE registers are assigned specific roles in the calling conventions as described in Table 1. Function calls come in three forms:
(1) Standard function calls are calls to “escaping” functions that use a standard calling convention; i.e., functions where at least some call sites or targets are statically unknown.⁴ The first three arguments of a standard function call are the function's address (std-link), its closure (std-clos), and return continuation address (std-cont). Following these arguments are k callee-save registers [8] (typically k = 3), which are assigned to the first k miscellaneous registers (misc0, ..., misck−1). The remaining arguments correspond to the user arguments to the function and are mapped to registers by type; i.e., pointers and integers are assigned to std-arg, misck, misck+1, etc., and floating-point arguments are assigned to floating-point registers.
(2) Standard continuation calls are calls to “escaping” continuations. The first argument is the continuation's address and is assigned to the std-cont register; it is followed by the k callee-save registers, some of which are used to hold the continuation's free variables. The remaining arguments to the continuation are mapped to registers in the same way as for standard functions.
(3) Known function calls are “gotos with arguments” [34] that represent the internal control flow (loops and join points) in a standard function or continuation. Because the code generator knows both the entry and call sites for known functions, it is able to arrange for arguments to be passed in registers without unnecessary copying [19].

³Some architectures, such as the amd64, support PC-relative addressing, which can also be used for this purpose, but the SML/NJ backend currently does not take advantage of such addressing modes.
⁴It should be noted that SML/NJ does not do any kind of sophisticated control-flow analysis, so escaping functions are quite common.

To illustrate how these conventions are realized in the CPS IR, consider the following trivial SML function:

    fun f x = if (x < 1) then x else f (x-1);

The first-order CPS is a single cluster consisting of two CPS functions as shown below.

    fun f (link, clos, k, cs1, cs2, cs3, arg) =
          lp (arg, k, cs1, cs2, cs3)
    and lp (arg, k, cs1, cs2, cs3) =
          if i63.>=(arg, 1)
            then let val tmp = isub63(arg, 1)
              in lp (tmp, k, cs1, cs2, cs3) end
            else k (k, cs1, cs2, cs3, arg)

Here we have taken the liberty of using meaningful variable names and an SML-like syntax for readability. The function f is a standard function, so its first three parameters are held in the std-link, std-clos, and std-cont CMACHINE registers. The next three parameters are the three callee-save registers followed by the function's actual argument (arg) in the std-arg register. The lp function is internal to the cluster, so the compiler is free to arrange its parameters in any order. The loop terminates by invoking the return continuation (k) using a standard continuation call. Here the first argument to the call (k) will be held in the std-cont register, then come the callee-saves, followed by the function's result in the std-arg register.

The code generator must support one other calling convention, which is the convention used to invoke the garbage collector (GC) [18]. This convention is a modified version of the standard function convention that uses a fixed set of registers (link, clos, cont, the callee-saves, and arg) as garbage-collection roots. Any additional live data, including all non-pointer register values (e.g., untagged integer and floating-point registers), are packaged up in heap objects that are referred to by the arg register.

When a heap-limit check fails, control jumps to a block of code to invoke the GC. This code sets up the fixed set of root registers (as described above), fetches the address of an assembly-language shim from the stack and then does a standard call to the shim code, which, in turn, transfers control to the runtime system. After the GC finishes, control is returned back to the GC-invocation code, which restores the live variables and resumes execution of the SML code. Note that the return from the GC involves the exact same set of fixed registers that are passed as arguments, which is how the updated roots are communicated back to the program.

[Figure 1: The existing backend — the pipeline runs from the Flint IR through CPS Conversion and CPS Optimization (higher-order CPS IR); then CPS Lowering, Literal Lifting, Closure Conversion, Spilling, and Limit Checks (first-order CPS IR); then Clustering and GC Info; and finally MLRISC Codegen, which produces MLRISC IR that MLRISC translates to machine code.]
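The heap-limit test enabled by the allocation slop can be illustrated with a small C sketch; the function and the toy nursery below are our own illustration, not SML/NJ's runtime code:

```c
#include <stdbool.h>
#include <stdint.h>

#define NURSERY_WORDS 4096
#define SLOP_WORDS    1024   /* the allocation slop */

static uint64_t nursery[NURSERY_WORDS];     /* toy nursery for illustration */
static const uint64_t *limit =
    &nursery[NURSERY_WORDS - SLOP_WORDS];   /* nursery top minus the slop */

/* Because `limit` already sits 1024 words below the true end of the
 * nursery, any allocation of at most 1024 words is safe whenever this
 * single pointer comparison is false; only when it is true must the
 * code jump to the GC-invocation block described above. */
static bool need_gc(const uint64_t *alloc) {
    return alloc >= limit;
}
```

Allocations larger than the slop would need an explicit check against the true nursery end, which is why most heap-limit tests reduce to this one comparison.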

2.2 The Backend
The SML/NJ backend takes a higher-order continuation-passing-style (CPS) IR and, via a sequence of optimizing and lowering steps, produces a first-order CPS IR.⁵ Unlike most other compilers, including other CPS-based compilers, SML/NJ foregoes use of a stack to manage calls and returns. Instead, all return continuations are represented by heap-allocated closures. The first-order CPS IR makes these closures explicit in the form of records and record selection operations. Because the runtime model uses heap-allocated continuation closures to represent function returns, the stack is not used in the traditional way. Instead, the runtime system allocates a single large frame that is used for register spilling and holding additional runtime values.

Along with this first-order IR, the compiler computes additional metadata about where heap-limit checks are needed and about which calling conventions should be used. This metadata is stored in auxiliary hash tables.

A program in the CPS IR is a collection of functions that represent both user functions and continuations. The body of a function is a CPS expression (cexp), where the leaves of an expression are applications. Thus, a cexp in the first-order CPS IR, where functions are not nested, can be viewed as an extended basic block [29].

⁵Note that while the invariants for the IR change with lowering, the actual representation as SML datatypes does not.

The phases of this backend are illustrated in Figure 1. We describe those passes that are directly affected by the design and implementation of the new backend.
• The CPS Lowering phase is responsible for expanding certain primitive operations (primops) into lower-level code.
• The Literal Lifting phase lifts floating-point and string literals (as well as data structures formed from these literals) out of the code and replaces them with references to a per-compilation-unit tuple of literal values.
• The Spilling phase ensures that the number of live variables never exceeds the fixed-size spill area (1024 words).⁶
• The Limit Checks and GC Info phases are responsible for determining where heap-limit checks should be added and determining the live variables at those points. Allocation checks are placed at the entry to functions (both escaping and known)

⁶Appel's original code generator used the spilling phase to ensure that the number of live variables did not exceed the available machine registers [3], but the switch to MLRISC, which had a proper register allocator, relaxed this constraint.

and continuations. As discussed above, most functions allocate less than 1024 words, so the allocation slop allows us to simply compare the allocation and limit pointers for these checks.
• The Clustering phase groups CPS functions into clusters, which are connected graphs of CPS functions where the edges correspond to known function calls. The entry nodes for a cluster are escaping functions and continuations; note that a cluster may have more than one entry.

2.3 MLRISC
The final step of the backend is to generate machine code using the MLRISC framework. MLRISC was designed to address many of the same problems as LLVM; it provides a low-level virtual machine based on a load-store (i.e., RISC-like) model. More so than LLVM, MLRISC is a “mechanism, not policy” design, leaving ABI issues such as calling conventions, stack layout, register usage, etc., up to the compiler writer.⁷ It makes heavy use of SML's functors to support specialization for both the target architecture and the source language. For example, the register allocator is defined by a functor that is parameterized over the spilling mechanism, which gives the compiler writer control over stack layout.

MLRISC's policy-agnostic approach was heavily influenced by the needs of SML/NJ's runtime model. SML/NJ's stackless execution model meant that calling conventions could not be baked into the design. Likewise, use of dedicated registers for the allocation pointer, etc., and in the standard calling conventions meant that MLRISC had to support some form of register pinning. The MLRISC register allocator is also able to handle the multi-entry functions that can arise from the clustering phase. Lastly, the need to generate binary machine code meant that MLRISC required an integrated assembler to resolve local branch offsets, but it did not require a direct mechanism for generating object files.

⁷It does provide some higher-level mechanisms, such as implementations of various C-language calling conventions.

3 CHALLENGES TO USING LLVM
LLVM was originally designed to support C and C++ compilers and, as such, maintains a significant architectural bias toward conventional runtime models. Furthermore, because it embeds significant policy decisions about calling conventions, exception-handling mechanisms, garbage collection support, etc., using it as a backend for a non-standard language runtime is challenging. In this section, we enumerate some of the mechanisms that our MLRISC backend uses that do not have direct analogues in LLVM. We also discuss the challenges of incorporating a code generator implemented in C++ into a compiler written in SML. In this discussion, we are focusing on the vanilla LLVM IR; as we describe in the next two sections, LLVM does provide ways to work around these limitations.

3.1 Comparing MLRISC and LLVM
MLRISC and LLVM are both designed to provide support for portable compilers. They are both based on a load-store model with an infinite supply of pseudo registers and a fairly standard set of basic instructions. A major difference, however, is that MLRISC abstracts over the instruction-set architecture, but not over the system ABI or runtime conventions. LLVM, on the other hand, has built-in support for calling conventions, object-file formats, exception-handling mechanisms, garbage-collection metadata, and debugging information. Another major difference is in how they are used. While both systems define a virtual machine that a code generator can target, MLRISC only supports a procedural interface for code generation, whereas LLVM provides LLVM assembly, LLVM bitcode, and a procedural interface for code generation. The combination of builtin runtime conventions plus a textual representation of LLVM IR means that the only way to support different runtime models is to make changes to the LLVM implementation itself.

3.2 Limitations of the LLVM Model
Many of the issues that we face are a consequence of the fact that LLVM abstracts away from the runtime model to a much greater degree than MLRISC.

No direct access to hardware registers. The SML/NJ runtime model relies on being able to map key CMACHINE registers, such as the allocation pointer, to dedicated hardware registers for efficient execution. Unlike MLRISC, LLVM does not provide any mechanism for mapping variables to specific hardware registers.

No direct access to the stack. SML/NJ uses specific slots in the stack frame to communicate information from the runtime system to the SML execution (e.g., the address of the callGC function). Some CMACHINE registers on some targets are also represented by stack locations. In LLVM, however, the layout of a function's stack frame is largely opaque at the LLVM IR level and there is no way to specify access to specific locations in the stack.

Builtin calling conventions. As described in Section 2.1, SML/NJ defines its own register-based calling conventions that do not involve the stack in any way, as well as a stack-based convention for invoking the garbage collector. The call instruction in LLVM is a heavyweight operation that embodies the policy defined by its calling convention. While LLVM has a number of predefined calling conventions, including several language-specific ones, there is not a good match for the SML/NJ runtime. Defining a different convention requires modifying the LLVM source and recompiling the LLVM libraries.

Multi-entry-point functions. The clustering phase of the SML/NJ backend produces clusters that can have multiple entry points. For example, compiling the following function that walks over a binary tree

    fun walk Lf = ()
      | walk (Nd(l, r)) = (walk l; walk r)

will produce a cluster for walk with two entries: a standard function for calling walk on the root or left subtree and a second continuation entry for calling walk on the right subtree. While it is natural to think of mapping clusters to LLVM functions, LLVM functions are restricted to a single entry point.

Tail-call overhead. Efficient tail calls are critical to performance, since all calls in CPS are tail calls. While LLVM provides a tail-call optimization (TCO), its primary purpose is to avoid stack growth. Even when TCO is applied to a function call, the resulting code incurs the overhead of deallocating the caller's frame and then allocating a fresh frame for the callee.

No trapping arithmetic. The semantics of integer arithmetic in SML require that the Overflow exception be raised when the result exceeds the representable range of the type [17]. MLRISC supports this requirement by defining trapping versions of the arithmetic operations, with the semantics that an appropriate trap is signaled on overflow. The runtime system handles this trap by mapping it to a control transfer to the exception handler continuation. While LLVM provides intrinsic functions for integer arithmetic with overflow, it does not provide a mechanism for generating an appropriate trap. While we could generate the control transfer to the exception handler in LLVM, we do not have access to the Overflow exception at that point.

Support for position-independent code. The machine code that SML/NJ uses must be position independent. We achieve this property by using the base pointer to compute absolute addresses from relative offsets, both for evaluating labels and for jump tables. While LLVM also supports position-independent code, it does so by relying on a dynamic linker to patch code when it is loaded.

3.3 Integrating LLVM into the Compiler
There are two ways that one might use LLVM as a backend for a compiler. The first, which is most common, is to generate LLVM assembly code into a text file and then use the command-line toolchain to convert that to a target-machine object file.⁸ This approach has the advantage that it does not require a foreign-function mechanism to communicate between the compiler and LLVM. The downside, however, is that it adds significant overhead in the form of formatting textual output, parsing said output, and running subprocesses. For an interactive compiler, such as SML/NJ's REPL, this approach also requires using system-specific dynamic linking to load and execute the code that was just generated.

The other way to use LLVM, which is used by industrial compilers like the clang C/C++ compiler, is to use LLVM's C++ API to construct a representation of the program directly, which can then be optimized and translated to machine code. This approach is similar to what we currently do with MLRISC, but it poses its own challenges. First of all, the C++ API for LLVM relies heavily on inline functions, which cannot be called from foreign languages. As an alternative, there is a C language wrapper for the C++ API that can be used, but it is less efficient than the C++ API and has a reputation of lagging behind changes in the C++ API. Another problem is the sheer volume of foreign calls that would be required for code generation. Given that foreign function calls in many functional-language implementations, including SML/NJ, are relatively expensive, this volume can add measurable overhead to code generation. Thus, the problem of efficient communication between the compiler and the code generator is a challenge for using LLVM as a backend.

The last challenge to using LLVM for SML/NJ is that it produces object files (the specific object-file format depends on the system). For implementations that use traditional linking tools, this property is not an issue, but for a system like SML/NJ that works with raw code objects, it is necessary to extract the code from the object file.

⁸Typically, this toolchain involves using llc to generate native assembly code and then running an assembler to produce object code.

4 DESIGN OF THE NEW BACKEND
In order to use LLVM in the SML/NJ system, we need solutions to the two broad challenges described above: how to support the SML/NJ runtime model in LLVM (Section 3.2) and how to integrate an LLVM-based backend into a compiler written in SML (Section 3.3).

4.1 Runtime conventions
Function entries and call sites are the key places where we need to guarantee that our register conventions are being followed; elsewhere in the function we can let the register allocator dictate where information is held. Thus, by modifying LLVM to add a new calling convention, we can dictate the register usage at those places. In previous work for the Manticore system [12], we described a new calling convention for LLVM, called Jump With Arguments (JWA), that can be used to support the stackless, heap-allocated-closure runtime model used by both Manticore and SML/NJ. The JWA calling convention has the property that it uses almost all of the available hardware registers for general-purpose parameter passing.⁹ The convention also has the properties that no registers are preserved across calls and that the return convention uses exactly the same register convention as calls.

We furthermore mark every function with the naked attribute, which tells LLVM to omit generating the function prologue and epilogue.¹⁰ Thus the function will run in whatever stack frame exists when it is called, which fits the SML/NJ model of a single frame shared by all SML code.

There is one minor complication, which is that we actually have several different conventions to support (i.e., escaping and known functions, continuations, and GC invocation). While we could define multiple LLVM conventions, we can make them all fit within the JWA convention by careful ordering of parameters and by using LLVM's undefined values for registers that are not part of a particular convention (e.g., the link and clos registers when throwing to a STD_CONT fragment).

⁹For SML/NJ, we use the same register convention that is used in the existing MLRISC backend. On the amd64, we omit the stack pointer and one scratch register from the convention, which leaves 14 registers available for parameter passing.
¹⁰The function prologue and epilogue is where the function's stack frame is allocated and deallocated.

4.2 Integrating LLVM into SML/NJ
Replacing MLRISC with LLVM raises the question of how to connect the SML/NJ compiler, written in SML, with an LLVM code generator, written in C++. Previous functional-language implementations have generated LLVM assembly code and used a command-line toolchain to translate that into machine code, but we decided that this approach was not a good fit for SML/NJ. Specifically, we were concerned about compilation latency, since the interactive REPL is a central part of the SML/NJ system, and about the extra dependencies on executables that we would have to manage. Therefore, we decided to integrate the LLVM libraries into the runtime system.

Having decided to directly generate LLVM code in memory, there was the question of how to do that efficiently. Fortunately,

the problem of how to connect compiler components that are implemented in different languages was addressed many years ago as part of the Zepher Compiler Infrastructure Project [38]. Zepher defined the Abstract Syntax Description Language (ASDL) for specifying compiler intermediate representations and a tool (asdlgen) for generating picklers and unpicklers in multiple languages. The original asdlgen tool does not support modern C++, so we built a new implementation of the tool that generates code that is compatible with LLVM.¹¹

Our plan then was to use an asdlgen pickler to serialize the CPS IR, which would be passed to the runtime system to be the input to an LLVM-based code generator that would essentially be a C++ rewrite of the existing MLRISC code generator. The resulting machine code would then be returned to the SML code as an array of bytes. As we began work on this approach, however, we discovered that the CPS IR was not necessarily the right IR for connecting to LLVM. First, the MLRISC code generator depended heavily on metadata that was external to the CPS IR. Second, the CPS primops were designed to model the corresponding SML operations (e.g., addition on tagged integers), which added a lot of redundancy and extra work to the code generation process. Thus, we decided to introduce a new, lower-level IR that would be the vehicle for communicating with the LLVM-based code generator. This new IR, which we call the CFG IR, is described in detail in the next section, but its key features are that it is self-contained and that its semantics are much closer to the semantics of both LLVM and MLRISC. The latter is important, because we decided to support a second code-generation path that uses MLRISC both as a way to validate the translation to CFG and to support legacy systems, such as the 32-bit x86, for which we do not plan to provide an LLVM-based backend.

4.3 The New Backend Pipeline
We conclude this section with a description of the new backend pipeline, which is illustrated in Figure 2. We have greyed out the labels of those passes from Figure 1 that are unchanged, but, for some passes, changes were required.

• The CPS Lowering phase has been expanded to lower more CPS primops than before. These changes were made to avoid some primops that were difficult to translate directly to LLVM.
• The Clustering phase was modified to avoid multi-entry-point clusters, which requires introducing new CPS functions.
• The CFG Codegen phase replaces the old MLRISC Codegen phase.

Once we have produced the CFG IR, there are two paths to machine code. The legacy path (on the left) compiles the CFG to MLRISC and then uses the existing MLRISC backend to produce machine code.

The new code-generation path first pickles the CFG IR and then passes the linearized representation to the runtime system, where it is unpickled into a C++ representation of the CFG IR. We then generate LLVM IR code using a version of LLVM (currently 10.0.1) extended with the JWA calling convention. For the new code generator, the GC Info pass is part of the LLVM Codegen pass, where we use the function's calling convention and parameter signature to determine the live variables. The next two sections describe the CFG and LLVM code generator in detail.

[Figure 2: The new backend. Components represented by orange boxes are implemented in C++. The pipeline runs from the Flint IR through CPS Conversion and CPS Optimization (higher-order CPS IR); then CPS Lowering, Literal Lifting, Closure Conversion, Spilling, and Limit Checks (first-order CPS IR); then Clustering and CFG Codegen, producing the CFG IR. From the CFG IR, the legacy path runs through GC Info and MLRISC Codegen to MLRISC; the new path pickles the CFG IR into a binary blob that the runtime system unpickles and passes to the LLVM Codegen, producing LLVM IR that LLVM, extended with JWA, compiles to machine code.]

¹¹The original implementation is still available at http://asdl.sourceforge.net; the new implementation, which currently only supports SML and C++, is included in the SML/NJ distribution.

5 THE CFG REPRESENTATION
A major part of the new backend is the new CFG IR that sits between the existing first-order CPS IR and the MLRISC and LLVM code
The main datatypes used to represent the CFG | NUMt of {sz : int} | FLTt of {sz : int} IR are shown in Figure 3; we omit the primitive operators and have simplified the types slightly for space and presentation reasons. type param = lvar * ty Each unit of compilation (e.g., declarations or expressions typed into the REPL or a source file) is mapped to a CFG compilation datatype exp unit, which consists of a list of clusters. The first cluster in the list = VAR of {name : lvar} is the entry cluster, which handles linking the new code with the | LABEL of {name : lvar} existing dynamic environment. CFG clusters roughly correspond | NUM of {iv : IntInf.int, sz : int} to the clusters used in the MLRISC backend; each cluster consists | LOOKER of {oper : looker, args : exp list} of a list of fragments, which are extended basic blocks. Clusters | PURE of {oper : pure, args : exp list} also have attributes, which capture some basic information about the | SELECT of {idx : int, arg : exp} code in the cluster, such as does it require the base pointer register. | OFFSET of {idx : int, arg : exp} In the LLVM backend, clusters map to LLVM functions, which means that they must have a single entry point (unlike the clusters datatype stm used in the MLRISC backend, which can have multiple entry points). = LET of exp * param * stm Because of this restriction, we have modified the clustering phase | ALLOC of alloc * exp list * lvar * stm 12 to optionally split multi-entry-point clusters into several clusters. | ARITH of arith * exp list * param * stm The one complication for this splitting is that the new clusters may | SETTER of setter * exp list * stm require access to the base pointer in order to compute label values. | APPLY of exp * exp list * ty list The original calls to these new clusters are unlikely to have the | THROW of exp * exp list * ty list cluster’s address as a parameter, since they are not standard calls. 
| GOTO of lvar * exp list Thus, we have to change the calling convention slightly in these | SWITCH of exp * stm list cases by adding the base pointer as an additional parameter. In the | BRANCH of branch * exp list * stm * stm rare case that the original function uses all of the available general- purpose registers, we pass the base pointer using a dedicated stack datatype frag_kind location. = STD_FUN | STD_CONT | KNOWN | INTERNAL 5.1 Expressions and Statements CFG expressions (exp) and statements (stm) are used to define datatype frag = Frag of { the computations that make up the body of fragments. While the kind : frag_kind, constructors of these datatypes are in close correspondence to the lab : lvar, CPS IR, there are some important differences. params : param list, First, pure expressions are represented as trees (the exp type), allocChk : word option, instead of having each primitive operation be bound to a lvar. Shar- body : stm ing of common expressions is made explicit by the LET constructor. } Using expression trees has a couple of advantages: it reduces the size of CFG terms, which speeds pickling, and expression trees match the type attrs = { ... } procedural code-generation interfaces of both LLVM and MLRISC. Operations in the CFG IR are closer to machine level than those datatype cluster = Cluster of { of the CPS IR. For example, the default integer type in SML is attrs : attrs, frags : frag list represented by a tagged value that has its lowest bit set (i.e., the } integer n is represented as 2n + 1). Arithmetic on tagged integers requires various additional operations to remove and add tags. In the type comp_unit = cluster list old backend, these were added when generating MLRISC code; we now generate these operations as part of the translation to CFG. The Figure 3: The main CFG types CFG IR also replaces many specialized CPS operations for memory allocation and access with a few lower-level mechanisms. 
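To make the tagging arithmetic concrete, here is a small C++ sketch (our own illustration, not code from the compiler) of the representation and of why tagged addition needs only a single fix-up operation:

```cpp
#include <cstdint>

// Illustration (not compiler code): SML's default int is represented
// as a tagged value with the lowest bit set, i.e., n is stored as 2n + 1.
static inline int64_t tag(int64_t n)   { return 2 * n + 1; }
static inline int64_t untag(int64_t v) { return v >> 1; }

// Tagged addition needs just one extra subtraction, since
//   (2a + 1) + (2b + 1) - 1  ==  2(a + b) + 1.
static inline int64_t tagged_add(int64_t x, int64_t y) { return x + y - 1; }
```

For example, tagged_add(tag(20), tag(22)) equals tag(42). It is this kind of tag bookkeeping that the translation to CFG now makes explicit, so that the code generator (and LLVM's optimizer) can see and simplify it.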
Figure 3 also shows the representation of types in the CFG IR. The types LABt (code addresses), PTRt (pointers or tagged values), and TAGt (tagged values) describe values that the garbage collector can parse and thus can be in a GC root. The other two types represent raw numeric data (integer and floating-point) of the specified size in bits. We map the LABt and PTRt types to the LLVM i64* type (i32* on 32-bit machines). The TAGt type is mapped to i64, while the NUMt and FLTt types are mapped to the LLVM integer and float types of the specified size. We do not try to use LLVM's aggregate types to model heap-allocated objects, since we usually only have that level of type information at the point of allocation.

12 When using the MLRISC backend, this splitting is not necessary.
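The type mapping just described can be sketched as follows; this is a hypothetical illustration (the names ty and lower_ty are ours, and the real code generator produces llvm::Type values through its code-buffer state rather than strings):

```cpp
#include <string>

// Hypothetical sketch of the CFG-type-to-LLVM-type mapping described
// above, for a 64-bit target; strings stand in for llvm::Type values.
enum class ty { LABt, PTRt, TAGt, NUMt, FLTt };

std::string lower_ty(ty t, int sz = 0 /* size in bits, for NUMt/FLTt */) {
    switch (t) {
    case ty::LABt:                    // code addresses
    case ty::PTRt: return "i64*";     // pointers/tagged values (i32* on 32-bit)
    case ty::TAGt: return "i64";      // tagged values
    case ty::NUMt: return "i" + std::to_string(sz);
    case ty::FLTt: return sz == 32 ? "float" : "double";
    }
    return "";
}
```

For instance, lower_ty(ty::NUMt, 32) yields "i32", matching the raw 32-bit integer case.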

5.2 Metadata
The other major difference between the CPS and CFG IRs is that the metadata for calling conventions and GC support has been incorporated into the CFG IR, instead of being held in external tables. This change makes transferring the information to the LLVM code generator much simpler, since we do not have to define a pickle format for the hash tables used to track the data.

The calling-convention metadata is represented by three aspects of the IR:
(1) Fragments are annotated with a frag_kind: STD_FUN for escaping functions, STD_CONT for continuations, and KNOWN for internal known function calls. The INTERNAL kind is used for the functions that are introduced to avoid multiple entry points during clustering.
(2) We use three different application forms: APPLY for functions, THROW for continuations, and GOTO for internal jumps. Calls to KNOWN functions are represented by an APPLY where the function is specified by a LABEL value.
(3) The APPLY and THROW constructs include the type signature of their arguments.

As seen in Figure 3, each fragment is annotated with an allocChk field that contains an optional unsigned integer. A value of SOME n signifies the need for a heap-limit check at the beginning of the fragment. The most common case is where the fragment's allocation is less than the allocation slop, in which case n = 0. For fragments that can allocate more than the allocation-slop amount, n is the upper bound on their allocation requirements.

5.3 C++ Representation
The CFG IR is defined using the ASDL specification language [30], which provides mechanisms for inductive types similar to those found in most functional languages. From this specification, we generate both the SML and C++ representations of the IR, as well as the pickling/unpickling code needed to communicate CFG values from SML to our LLVM code generator. As would be expected, the mapping from ASDL to SML types is straightforward. For C++, most types are represented as classes, but enumerations (e.g., frag_kind in Figure 3) are mapped to C++ enum types. Sum types are represented with an abstract base class for the type and subclasses for the constructors.

6 IMPLEMENTATION DETAILS
In this section, we describe the LLVM code generator (i.e., the orange boxes in Figure 2) in more detail. Our current prototype targets the amd64 architecture, but is almost entirely machine independent, so we expect that porting to other architectures will be straightforward.

6.1 LLVM Code Generation
As described above, the exp and stm types in the CFG IR are represented as abstract classes in C++, with each constructor its own subclass. Code generation is implemented as a two-pass walk over the CFG IR. The first pass collects information, such as a mapping from labels to clusters and fragments, and allocates placeholder objects, such as LLVM functions for clusters, LLVM ϕ-nodes for INTERNAL fragments, and LLVM basic blocks for the arms of BRANCH and SWITCH statements. The second pass walks the representation generating LLVM code.

ASDL provides a mechanism for adding methods to the generated classes. For the cluster, frag, and stm classes, we define a virtual init method for the initialization pass. We also define a virtual codegen method for these classes and for the exp and various primitive-operator classes. Dispatching on the constructor of a sum type is implemented using the standard object-oriented pattern of virtual-method dispatch.

The code-generation process requires keeping track of a significant amount of state, such as the current LLVM module, function, and basic block, and maps from lvars to their LLVM representations. We define the code_buffer class to encapsulate the current state of the code generator as well as target-specific information. The code_buffer class also contains the implementation of various utility methods to support the calling conventions and GC invocation. We create a single object of this class, which is passed by reference to the init and codegen methods. Code generation for most of the CFG IR is straightforward, but we explain how we address the challenges of Section 3 in the sequel.

6.2 ϕ Nodes
LLVM's language is a Static-Single-Assignment (SSA) IR [11]. As the name suggests, variables (a.k.a. pseudo registers) in SSA are assigned to only once. When control flows into a block from multiple predecessors, it is necessary to introduce ϕ nodes, which make explicit the merging of values from multiple sources. Generating the SSA form from the CFG IR is quite straightforward.13 During the initialization pass, we preallocate ϕ nodes for each INTERNAL fragment in a cluster. We define one ϕ node per fragment parameter plus additional nodes for those special registers that are mapped to hardware registers (e.g., alloc, limit, etc.). When compiling a GOTO statement, we record the current values of the special registers and the values generated for the GOTO's arguments in the ϕ nodes of the target fragment.

6.3 Stack References
As discussed in Section 3.2, we need to be able to generate references to specific locations in the stack frame. We have experimented with several possible mechanisms for accessing stack locations. Our first attempt was the @llvm.frameaddress intrinsic, but it requires using a frame pointer, which burns an additional register. We then took the approach of defining native inline assembly code for reading and writing the stack. This approach produced the desired code, but also introduced target dependencies in the code generator. We finally settled on using the @llvm.read_register intrinsic to read the stack pointer.

One change that we had to make to our runtime model is the layout of the frame used to host SML execution. In the existing MLRISC code generator, the spill area is in the upper part of the frame and the locations used to represent special registers, etc., are in the lower part of the frame. For the LLVM code generator, we have to swap this around to match LLVM's frame layout conventions.

13 As has been observed by others [4, 10, 21], there are strong similarities between λ-calculus IRs (especially CPS) and SSA form.
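As a rough illustration of the code-generation scheme of Section 6.1, the following hypothetical C++ sketch shows the shape of the ASDL-generated classes (names simplified; the real codegen methods emit LLVM IR through the code_buffer rather than returning strings):

```cpp
#include <string>

// Hypothetical, much-simplified sketch of the pattern described in
// Section 6.1: an ASDL sum type becomes an abstract base class, each
// constructor becomes a subclass, and case analysis on constructors
// becomes virtual-method dispatch on a codegen method.
struct code_buffer { /* current module, function, block, lvar maps, ... */ };

struct exp {
    virtual ~exp() = default;
    virtual std::string codegen(code_buffer &buf) const = 0;
};

struct NUM : exp {                       // constructor NUM of {iv, sz}
    long long iv; int sz;
    NUM(long long iv, int sz) : iv(iv), sz(sz) {}
    std::string codegen(code_buffer &) const override {
        return "i" + std::to_string(sz) + " " + std::to_string(iv);
    }
};

struct VAR : exp {                       // constructor VAR of {name}
    std::string name;
    explicit VAR(std::string n) : name(std::move(n)) {}
    std::string codegen(code_buffer &) const override {
        return "%" + name;
    }
};
```

Given an exp *e, calling e->codegen(buf) selects the right constructor's case without any explicit tag test, which is how the second pass walks the IR.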

Fortunately, MLRISC makes it easy to specify the location of the spill area, so we can modify the MLRISC backend to be compatible with LLVM.

6.4 Position-independent Code
As described in Section 2.1, we generate code to explicitly maintain a pointer to the beginning of the current module as a mechanism to support position-independent code. For example, if the first function in a module has label l0 and we have a standard function f with label lf, then we can compute the base pointer by base = link − (lf − l0), where (lf − l0) is a compile-time constant. While at first glance it seems easy to encode this computation in the LLVM code generator, it turns out that LLVM, by default, leaves the computation of (lf − l0) to link time. We were able to work around this problem by defining LLVM aliases for the compile-time constant expressions.

In practice, we only need to generate code for the base pointer when the cluster requires it (i.e., when a LABEL is used in a non-application position or if the code contains a SWITCH). We record this information in the attrs record associated with each cluster.

6.5 Invoking GC
The most complicated part of our LLVM code generator is generating the code to invoke the garbage collector. As described in Section 2.1, the invocation code is responsible for packaging up the live registers, calling the collector via the assembly-language shim in the runtime, and then restoring the live registers prior to transferring control back to the fragment that invoked the collection. What makes it complicated is the combination of different cases that have to be managed. For example, a STD_CONT fragment does not use the std-link or std-clos registers, so these are either used to hold excess parameters or else must be nullified before the collection. We also have to handle allocating heap objects for any live raw values and/or excess roots.

The actual invocation of the GC uses a non-tail JWA call. We use the JWA calling convention so that the GC roots are in predictable registers, and we mark the call as a non-tail call so that the runtime can return to the GC-invocation code. The return type of the call is a struct with fields for each of the GC roots (recall that the JWA call uses the same register assignment for calls and returns).

While the MLRISC code generator must also deal with this complexity, it is a problem that is much easier to solve in SML than C++. The MLRISC code generator also implements a mechanism to share GC-invocation code between multiple STD_FUN and STD_CONT fragments. Because the parameters of these fragments are in known locations and the code addresses of these fragments are in known registers (i.e., std-link or std-cont), we can move the invocation code into a function that can then be shared. Measurements done when the GC API was originally designed show that over 95% of STD_FUN GC invocations can be covered by five different functions, while almost 95% of STD_CONT invocations can be covered by just one invocation function [18]. We plan to incorporate this mechanism in the LLVM code generator at some point.

In the future, it might be worthwhile to try to encode the GC-invocation code in the CFG IR, since that would move the complexity out of the C++ code (as noted, SML is a much better language for this kind of programming) and would allow us to share the implementation between the LLVM and legacy MLRISC machine-code generators.

6.6 Trapping arithmetic
To implement trapping arithmetic, we use LLVM's "arithmetic with overflow" intrinsic functions. These functions return a pair of their result and an overflow bit. In the generated LLVM code, we test the overflow bit and jump to a per-cluster piece of code that forces a hardware trap to signal the overflow. As with the MLRISC version, the runtime system handles the trap by dispatching an Overflow exception to the current exception handler. The need for this conditional control flow is one of the reasons why trapping arithmetic is represented as a stm in the CFG IR. LLVM does not provide a mechanism to generate the particular trap instruction that we need on the amd64, so we use its inline native assembly mechanism to inject an "int 4" instruction into the generated code. For example, the SML function

    fun f (x : Int64.int) = x + 42

results in the LLVM code shown in Figure 4.

6.7 Just-in-Time Compilation
LLVM provides rich support for just-in-time (JIT) compilation, but the JIT infrastructure is primarily focused on the problems of multi-threaded compilation, compilation on demand, and dynamic linking. While multi-threaded compilation is a feature that we might want to explore in the future, we already address the problems of compilation on demand and linking in SML/NJ. Therefore, we use the batch-compilation infrastructure, but specify an in-memory output stream for the target of the machine-code generator, which produces an in-memory object file. While the actual format of the object file (e.g., ELF vs. COFF vs. MACH-O) depends on how the LLVM TargetMachine object is configured, we can use generic operations provided by LLVM to identify and extract the code from the in-memory representation. We copy the code into a heap-allocated code object, which is returned to the SML side of the compiler.

7 EVALUATION
Since we are still in the process of shaking out the bugs in our implementation, we have not yet been able to evaluate the approach for either compile time or the quality of the generated code. Based on the performance differences between our LLVM and MLRISC backends for Manticore [12], we expect to see some improvement in the performance of generated code.

We will include a detailed evaluation of the new backend in the final version of the paper.

8 RELATED WORK
The PURE programming language14 appears to have been the first functional language to use LLVM in its implementation (starting in 2008). The implementation of the PURE interpreter is in C++ and LLVM is described in the documentation as being used as a JIT

14 See https://agraef.github.io/pure-lang.

define private jwa void @fn207 (i64** %allocPtr, i64** %limitPtr,
        i64** %storePtr, i64* %0, i64* %1, i64* %2, i64* %3,
        i64* %4, i64* %5, i64 %6) naked
{
    %8 = call { i64, i1 } @llvm.sadd.with.overflow.i64 (i64 %6, i64 42)
    %9 = extractvalue { i64, i1 } %8, 1
    br i1 %9, label %13, label %10

10:
    %11 = extractvalue { i64, i1 } %8, 0
    %12 = bitcast i64* %2 to void (i64**, i64**, i64**, i64*, i64*,
            i64*, i64*, i64*, i64*, i64)*
    tail call jwa void %12 (i64** %allocPtr, i64** %limitPtr,
            i64** %storePtr, i64* undef, i64* undef, i64* %2,
            i64* %3, i64* %4, i64* %5, i64 %11)
    ret void

13:
    call void asm sideeffect "int $$4", ""() #2
    ret void
}

Figure 4: Example of LLVM code for trapping arithmetic

compiler, but there is no published description of the implementation.

Terei and Chakravarty's LLVM-based code generator for the Glasgow Haskell Compiler (GHC) [35, 36] is probably the earliest attempt to use LLVM for a language with a non-standard runtime model. As such, they were the first to confront and solve a number of the technical issues we describe here. In particular, they faced the problem of how to map logical registers in their runtime model to specific machine registers. It appears that Chris Lattner, the creator of LLVM, suggested defining a new calling convention to implement this mechanism.15 The GHC calling convention is now a supported convention in LLVM.

The ErLLVM pipeline is an LLVM-based backend for the HiPE Erlang compiler [31]. As with GHC, and our system, the problem of targeting specific machine registers is solved with a new calling convention; the HiPE convention is also part of the official LLVM distribution. Unlike GHC and SML/NJ, ErLLVM uses, with some adaptation, LLVM's builtin mechanisms for garbage-collection support and exception handling. The ErLLVM pipeline generates LLVM assembly and then uses the LLVM and system tools to produce an object file. They then parse the object file to extract a representation that is compatible with the HiPE loader, which is similar to what we do in SML/NJ.

We know of two other ML implementations that have LLVM backends. The SML# system generates fairly vanilla LLVM assembly code and uses LLVM's existing fastcc calling convention [37]. To ensure that tail recursion is efficient, they added loop detection to their compiler and generate branches in these cases, instead of relying on LLVM's tail-call optimization.16

The MLton SML compiler also has an LLVM backend [26]. Their LLVM compiler is modeled on their backend that generates C code, so they do not have the problems of mapping specialized runtime conventions onto LLVM. As with GHC and ErLLVM, they generate LLVM assembly code; one difference, however, is that they stack allocate all variables and then rely on LLVM's mem2reg pass to convert to SSA.

Our work reported here has as its roots the development of the JWA calling convention for use in Manticore's Parallel ML (PML) compiler [12]. As with the other examples above, the PML compiler generates LLVM assembly and uses the llc tool to generate native assembly code. Because PML programs are linked using standard tools, the compiler does not require special handling of position-independent code or global addresses, such as the code to invoke the GC. It also does not require access to specific locations in the stack. While PML is a dialect of SML, it has a different semantics for arithmetic (i.e., no Overflow exceptions), so it was not necessary to use LLVM's arithmetic-with-overflow intrinsics.

Recently, we have used the PML compiler to explore performance and implementation tradeoffs between different runtime strategies for representing continuations and the call stack [13]. The implementation of heap-allocated continuations in that study was the version from our previous work [12], which lacks the more sophisticated closure optimizations implemented by the SML/NJ compiler [8, 32, 33]. It will be interesting to revisit the experiments using our new LLVM backend for SML/NJ.

15 See http://nondot.org/sabre/LLVMNotes/GlobalRegisterVariables.txt.
16 Recall from Section 3 that LLVM's tail-call optimization does not avoid the overhead of allocating/deallocating stack frames.

9 CONCLUSION
We have described our ongoing effort to port the SML/NJ system to a new backend based on LLVM. The code generator, which takes pickled CFG IR and generates LLVM code using the C++ API, is complete and we are currently testing it as a standalone program that generates code for the 64-bit amd64 architecture. For the final paper, we expect to have the code generator incorporated into the
SML/NJ runtime system and plan to report on the compile-time and runtime performance of the new backend. We are also planning a 64-bit ARM port of the system, which would be a new architecture for SML/NJ. Since the code generator is largely machine independent, we expect that this port should be fairly smooth.

REFERENCES
[1] Andrew Appel and David B. MacQueen. 1991. Standard ML of New Jersey. In Programming Language Implementation and Logic Programming (PLILP '91) (Lecture Notes in Computer Science, Vol. 528), J. Maluszynski and M. Wirsing (Eds.). Springer-Verlag, New York, NY, USA, 1–13. https://doi.org/10.1007/3-540-54444-5_83
[2] Andrew W. Appel. 1990. A Runtime System. Lisp and Symbolic Computation 3, 4 (Nov. 1990), 343–380. https://doi.org/10.1007/BF01807697
[3] Andrew W. Appel. 1992. Compiling with Continuations. Cambridge University Press, Cambridge, England, UK.
[4] Andrew W. Appel. 1998. SSA is Functional Programming. SIGPLAN Notices 33, 4 (April 1998), 17–20. https://doi.org/10.1145/278283.278285
[5] Andrew W. Appel and Lal George. 2001. Optimal Spilling for CISC Machines with Few Registers. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '01) (Snowbird, UT, USA). Association for Computing Machinery, New York, NY, USA, 243–253. https://doi.org/10.1145/378795.378854
[6] A. W. Appel and T. Jim. 1989. Continuation-passing, Closure-passing Style. In Conference Record of the 16th Annual ACM Symposium on Principles of Programming Languages (POPL '89) (Austin, TX, USA). Association for Computing Machinery, New York, NY, USA, 293–302. https://doi.org/10.1145/75277.75303
[7] Andrew W. Appel and David B. MacQueen. 1987. A Standard ML Compiler. In Functional Programming Languages and Computer Architecture (FPCA '87) (Portland, OR, USA) (Lecture Notes in Computer Science, Vol. 274). Springer-Verlag, New York, NY, USA, 301–324. https://doi.org/10.1007/3-540-18317-5_17
[8] Andrew W. Appel and Zhong Shao. 1992. Callee-save Registers in Continuation-Passing Style. Lisp and Symbolic Computation 5 (Sept. 1992), 191–221. https://doi.org/10.1007/BF01807505
[9] Matthias Blume. 2001. No-Longer-Foreign: Teaching an ML compiler to speak C "natively". In First Workshop on Multi-Language Infrastructure and Interoperability (BABEL '01) (Firenze, Italy) (Electronic Notes in Theoretical Computer Science, Vol. 59). Elsevier Science Publishers, New York, NY, USA, 16. Issue 1. https://doi.org/10.1016/S1571-0661(05)80452-9
[10] Manuel M.T. Chakravarty, Gabriele Keller, and Patryk Zadarnowski. 2004. A Functional Perspective on SSA Optimisation Algorithms. Electronic Notes in Theoretical Computer Science 82, 2 (2004), 347–361. Proceedings of Compiler Optimization Meets Compiler Verification (COCV '03). https://doi.org/10.1016/S1571-0661(05)82596-4
[11] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. 1991. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on Programming Languages and Systems 13, 4 (Oct. 1991), 451–490. https://doi.org/10.1145/115372.115320
[12] Kavon Farvardin and John Reppy. 2018. Compiling with Continuations and LLVM. In Proceedings 2016 ML Family Workshop / OCaml Users and Developers Workshops (Nara, Japan) (Electronic Proceedings in Theoretical Computer Science, Vol. 285), Kenichi Asai and Mark Shinwell (Eds.). Open Publishing Association, Waterloo, NSW, Australia, 131–142. https://doi.org/10.4204/EPTCS.285.5
[13] Kavon Farvardin and John Reppy. 2020. From Folklore to Fact: Comparing Implementations of Stacks and Continuations. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '20) (London, England, UK). Association for Computing Machinery, New York, NY, USA, 75–90. https://doi.org/10.1145/3385412.3385994
[14] Kathleen Fisher, Riccardo Pucella, and John Reppy. 2001. A framework for interoperability. In Proceedings of the First International Workshop on Multi-Language Infrastructure and Interoperability (BABEL '01) (Electronic Notes in Theoretical Computer Science, Vol. 59), Nick Benton and Andrew Kennedy (Eds.). Elsevier Science Publishers, New York, NY, 17. Issue 1. https://doi.org/10.1016/S1571-0661(05)80450-5
[15] Matthew Fluet, Mike Rainey, John Reppy, Adam Shaw, and Yingqi Xiao. 2007. Manticore: A Heterogeneous Parallel Language. In Proceedings of the 2007 Workshop on Declarative Aspects of Multicore Programming (DAMP '07) (Nice, France). Association for Computing Machinery, New York, NY, USA, 37–44. https://doi.org/10.1145/1248648.1248656
[16] Fermín Javier Reig Galilea. 2002. Compiler Architecture using a Portable Intermediate Language. Ph.D. Dissertation. University of Glasgow, Glasgow, Scotland, UK.
[17] Emden R. Gansner and John H. Reppy (Eds.). 2004. The Standard ML Basis Library. Cambridge University Press, Cambridge, England, UK.
[18] Lal George. 1999. SML/NJ: Garbage Collection API. (May 1999). https://smlnj.org/compiler-notes/gc-.ps
[19] Lal George and Andrew W. Appel. 1996. Iterated Register Coalescing. ACM Transactions on Programming Languages and Systems 18, 3 (May 1996), 300–324. https://doi.org/10.1145/229542.229546
[20] Lal George, Florent Guillame, and John H. Reppy. 1994. A Portable and Optimizing Back End for the SML/NJ Compiler. In Proceedings of the 5th International Conference on Compiler Construction (CC '94). Springer-Verlag, New York, NY, USA, 83–97. https://doi.org/10.1007/3-540-57877-3_6
[21] Richard A. Kelsey. 1995. A Correspondence between Continuation Passing Style and Static Single Assignment Form. In Papers from the 1995 ACM SIGPLAN Workshop on Intermediate Representations (IR '95) (San Francisco, California, USA). Association for Computing Machinery, New York, NY, USA, 13–22. https://doi.org/10.1145/202529.202532
[22] David Kranz, Richard Kelsey, Jonathan Rees, Paul Hudak, Jonathan Philbin, and Norman Adams. 1986. ORBIT: An Optimizing Compiler for Scheme. In Proceedings of the 1986 Symposium on Compiler Construction (SIGPLAN '86). Association for Computing Machinery, New York, NY, USA, 219–233. https://doi.org/10.1145/12276.13333
[23] David A. Kranz. 1988. ORBIT: An Optimizing Compiler for Scheme. Ph.D. Dissertation. Computer Science Department, Yale University, New Haven, Connecticut. Research Report 632.
[24] Chris Lattner and Vikram Adve. 2004. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proceedings of the International Symposium on Code Generation and Optimization (CGO '04) (Palo Alto, California). IEEE Computer Society, Washington, D.C., USA, 75–86. https://doi.org/10.1109/CGO.2004.1281665
[25] Chris Arthur Lattner. 2002. LLVM: An infrastructure for multi-stage optimization. Master's thesis. University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA.
[26] Brian Andrew Leibig. 2013. An LLVM Back-end for MLton. Master's thesis. Rochester Institute of Technology, Rochester, NY, USA. https://www.cs.rit.edu/~mtf/student-resources/20124_leibig_msproject.pdf
[27] Allen Leung and Lal George. 1999. Static Single Assignment Form for Machine Code. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '99) (Atlanta, GA, USA). Association for Computing Machinery, New York, NY, USA, 204–214. https://doi.org/10.1145/301618.301667
[28] J. Gregory Morrisett and Andrew Tolmach. 1993. Procs and Locks: A Portable Multiprocessing Platform for Standard ML of New Jersey. In Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP '93) (San Diego, California, USA). Association for Computing Machinery, New York, NY, USA, 198–207. https://doi.org/10.1145/155332.155353
[29] Steven S. Muchnick. 1998. Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[30] John Reppy. 2020. ASDL 3.0 Reference Manual. Included in the Standard ML of New Jersey distribution.
[31] Konstantinos Sagonas, Chris Stavrakakis, and Yiannis Tsiouris. 2012. ErLLVM: An LLVM Backend for Erlang. In Proceedings of the Eleventh ACM SIGPLAN Workshop on Erlang (ERLANG '12) (Copenhagen, Denmark). Association for Computing Machinery, New York, NY, USA, 21–32. https://doi.org/10.1145/2364489.2364494
[32] Zhong Shao and Andrew W. Appel. 1994. Space-efficient Closure Representations. SIGPLAN Lisp Pointers VII, 3 (July 1994), 150–161. https://doi.org/10.1145/182590.156783
[33] Zhong Shao and Andrew W. Appel. 2000. Efficient and safe-for-space closure conversion. ACM Transactions on Programming Languages and Systems 22, 1 (2000), 129–161.
[34] Guy L. Steele Jr. 1977. LAMBDA: The Ultimate GOTO. Technical Report AI Memo 443. Massachusetts Institute of Technology, Cambridge, MA, USA.
[35] David A. Terei. 2009. Low Level Virtual Machine for Glasgow Haskell Compiler. Undergraduate Thesis. 73 pages. https://llvm.org/pubs/2009-10-TereiThesis.pdf
[36] David A. Terei and Manuel M.T. Chakravarty. 2010. An LLVM Backend for GHC. In Proceedings of the 2010 ACM SIGPLAN Symposium on Haskell (HASKELL '10) (Baltimore, MD). Association for Computing Machinery, New York, NY, USA, 109–120. https://doi.org/10.1145/1863523.1863538
[37] Katsuhiro Ueno and Atsushi Ohori. 2014. Compiling SML# with LLVM: a Challenge of Implementing ML on a Common Compiler Infrastructure. In Workshop on ML. 1–2. https://sites.google.com/site/mlworkshoppe/smlsharp_llvm.pdf
[38] Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Christopher S. Serra. 1997. The Zephyr Abstract Syntax Description Language. In Proceedings of the Conference on Domain-Specific Languages (DSL '97) (Santa Barbara, California). USENIX Association, Berkeley, CA, USA, 15. https://www.usenix.org/legacy/publications/library/proceedings/dsl97/wang.html