Pallene: a Statically Typed Companion Language for Lua
Total Page:16
File Type:pdf, Size:1020Kb
Pallene: A statically typed companion language for Lua Hugo Musso Gualandi Roberto Ierusalimschy PUC-Rio PUC-Rio [email protected] [email protected] ABSTRACT that interact with the underlying operating system, and a The simplicity and flexibility of dynamic languages make high-level scripting language for the parts that need flexibility them popular for prototyping and scripting, but the lack of and ease of use. Its main advantage is that the programmer compile-time type information makes it very challenging to can choose the programming language most suited for each generate efficient executable code. particular task. The main disadvantages are due to the large Inspired by ideas from scripting, just-in-time compilers, differences between the languages. That makes it hardto and optional type systems, we are developing Pallene, a rewrite a piece of code from one language to the other and statically typed companion language to the Lua scripting also adds run-time overhead in the API between the two language. Pallene is designed to be amenable to standard language runtimes. ahead-of-time compilation techniques, to interoperate seam- Just-in-time compilers [6] dynamically translate high level lessly with Lua (even sharing its runtime), and to be familiar code into low-level machine code during the program execu- to Lua programmers. tion, on demand. To maximize performance, JIT compilers In this paper, we compare the performance of the Pallene may collect run-time information, such as function parameter compiler against LuaJIT, a just in time compiler for Lua, and types, and use that information to generate efficient special- with C extension modules. The results suggest that Pallene ized code. The most appealing aspect of JIT compilers for can achieve similar levels of performance. dynamic languages is that they can provide a large speedup without the need to rewrite the code in a different language. CCS CONCEPTS In practice, however, this speedup is not always guaranteed as programmers often need to rewrite their code anyway, • Software and its engineering → Scripting languages; using idioms and “incantations” that are more amenable to Just-in-time compilers; Dynamic compilers; Imperative lan- optimization [10]. guages; As the name suggests, optional type systems allow pro- ACM Reference Format: grammers to partially add types to programs.1 These systems Hugo Musso Gualandi and Roberto Ierusalimschy. 2018. Pallene: combine the static typing and dynamic typing disciplines in A statically typed companion language for Lua. In XXII Brazilian Symposium on Programming Languages (SBLP 2018), September a single language. From static types they seek better compile- 20–21, 2018, SAO CARLOS, Brazil. ACM, New York, NY, USA, time error checking, machine-checked lightweight documen- 8 pages. https://doi.org/10.1145/3264637.3264640 tation, and run-time performance; from dynamic typing they seek flexibility and ease of use. One of the selling points of 1 INTRODUCTION optional type systems is that they promise a smooth transi- tion from small dynamically typed scripts to larger statically The simplicity and flexibility of dynamic languages make typed applications [27]. The main challenge these systems them popular for prototyping and scripting, but the lack of face is that it is hard to design a type system that is at the compile-time type information makes it very challenging to same time simple, correct, and amenable to optimizations. generate efficient executable code. There are at least three Both scripting and optional types assume that program- approaches to improve the performance of dynamically typed mers need to restrict the dynamism of their code when they languages: scripting, just-in-time compilation, and optional seek better performance. Although in theory JIT compilers type systems. do not require this, in practice programmers also need to The scripting approach [19] advocates the use of two sepa- restrict themselves to achieve maximum performance. Realiz- rate languages to write a program: a low-level system language ing how these self-imposed restrictions result in the creation for the parts of the program that need good performance and of vaguely defined language subsets, and how restricting dy- Permission to make digital or hard copies of all or part of this work namism seems unavoidable, we asked ourselves: what if we for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage accepted the restrictions and defined a new programming and that copies bear this notice and the full citation on the first page. language based on them? By focusing on this “well behaved” Copyrights for components of this work owned by others than the subset and making it explicit, instead of trying to optimize author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, or type a dynamic language in its full generality, we would be requires prior specific permission and/or a fee. Request permissions able to drastically simplify the type system and the compiler. from [email protected]. To study this question, we are developing the Pallene pro- SBLP 2018, September 20–21, 2018, SAO CARLOS, Brazil © 2018 Copyright held by the owner/author(s). Publication rights gramming language, a statically typed companion to Lua. licensed to ACM. 1 ACM ISBN 978-1-4503-6480-5/18/09. $15.00 Gualandi [9] presents a historical review of type systems for dynamic https://doi.org/10.1145/3264637.3264640 languages. SBLP 2018, September 20–21, 2018, SAO CARLOS, Brazil Hugo Musso Gualandi and Roberto Ierusalimschy Pallene is intended to act as a system language counterpart to this advice when the code is mainly about low-level opera- Lua’s scripting, but with better interoperability than existing tions that are easy to express in C, such as doing arithmetic static languages. To avoid the complexity of a JIT compila- and calling external libraries. Another obstacle to this sug- tion, Pallene should be amenable to standard ahead-of-time gestion is that it is hard to estimate in advance both the compiler optimization techniques. To minimize the run-time costs of rewriting the code and the performance benefits to mismatch, Pallene should use Lua’s data representation and be achieved by the change. Often, the gain in performance is garbage collector. To minimize the conceptual mismatch, Pal- not what one would expect: As we will see in Section 5, the lene should be familiar to Lua programmers, syntactically overhead of converting data from one runtime to the other and semantically. can cancel out the inherent gains of switching to a static In the next section of this paper, we overview the main language. approaches currently used to tackle the performance problems of dynamic languages. In Section 3, we describe how we 2.2 JIT Compilers designed Pallene, aiming to combine desirable properties from those approaches. In Section 4, we discuss how our Just-in-time (JIT) compilers are the state of the art in dy- goals for Pallene affected its implementation. In Section 5,we namic language optimization. A JIT compiler initially ex- evaluate the performance of our prototype implementation of ecutes the program without any optimization, observes its Pallene on a set of micro benchmarks, comparing it with the behavior at runtime, and then, based on this, generates highly reference Lua implementation and with LuaJIT [20], a state specialized and optimized executable code. For example, if it of the art JIT compiler for Lua. This evaluation suggests observes that some code is always operating on values of type that it is possible to produce efficient executable code for double, the compiler will optimistically compile a version of Pallene programs. In the last two sections of this paper we this code that is specialized for that type. It will also insert compare Pallene with related research in type systems and tests (guards) in the beginning of the code that jump back optimization for dynamic languages, and we discuss avenues to a less optimized generic version in case some value is not for future work. of type double as expected. JIT compilers are broadly classified as either method-based or trace-based [7], according to their main unit of compilation. 2 OPTIMIZING SCRIPTING In method-based JITs, the unit of compilation is one function LANGUAGES or subroutine. In trace-based JITs, the unit of compilation is In this section we discuss the existing approaches to optimiz- a linear trace of the program execution, which may cross over ing dynamic languages that we mentioned in the introduction. function boundaries. Trace compilation allows for a more embedable implementation and is better at compiling across abstraction boundaries. However, it has trouble optimizing 2.1 Scripting programs which contain unpredictable branch statements. For One way to overcome the slowness of dynamic languages this reason, most JIT compilers now tend to use the method- is to avoid them for performance-sensitive code. Dynamic based approach, with the notable exceptions of LuaJIT [20] scripting languages are often well-suited for a multi-language and the RPython Framework [5]. architecture, where a statically typed low-level system lan- JIT compilers detect types at run-time because inferring guage is combined with a flexible dynamically typed scripting types at compile type is very hard and usually produces less language, a style of programming that has been championed specific results. Additionally, in many dynamic languages by John Ousterhout [19]. data types are created at run-time and there are no data Lua has been designed from the start with scripting in definition declarations or type annotations for the compiler mind [13] and many applications that use Lua follow this to take advantage from.