Counting Immutable Beans: Reference Counting Optimized for Purely Functional Programming

Sebastian Ullrich, Karlsruhe Institute of Technology, Germany
Leonardo de Moura, Microsoft Research, USA

IFL’19, September 2019, Singapore
arXiv:1908.05647v3 [cs.PL] 5 Mar 2020

ABSTRACT

Most functional languages rely on some kind of garbage collection for automatic memory management. They usually eschew reference counting in favor of a tracing garbage collector, which has less bookkeeping overhead at runtime. On the other hand, having an exact reference count of each value can enable optimizations such as destructive updates. We explore these optimization opportunities in the context of an eager, purely functional programming language. We propose a new mechanism for efficiently reclaiming memory used by nonshared values, reducing stress on the global memory allocator. We describe an approach for minimizing the number of reference count updates using borrowed references and a heuristic for automatically inferring borrow annotations. We implemented all these techniques in a new compiler for an eager and purely functional programming language with support for multi-threading. Our preliminary experimental results demonstrate our approach is competitive and often outperforms state-of-the-art compilers.

CCS CONCEPTS

• Software and its engineering → Runtime environments; Garbage collection.

KEYWORDS

purely functional programming, reference counting, Lean

ACM Reference Format:
Sebastian Ullrich and Leonardo de Moura. 2020. Counting Immutable Beans: Reference Counting Optimized for Purely Functional Programming. In Proceedings of International Symposium on Implementation and Application of Functional Languages (IFL’19). ACM, New York, NY, USA, 12 pages.

1 INTRODUCTION

Although reference counting [Collins 1960] (RC) is one of the oldest memory management techniques in computer science, it is not considered a serious garbage collection technique in the functional programming community, and there is plenty of evidence it is in general inferior to tracing garbage collection algorithms. Indeed, high-performance compilers such as ocamlopt and GHC use tracing garbage collectors. Nonetheless, implementations of several popular programming languages, e.g., Swift, Objective-C, Python, and Perl, use reference counting as a memory management technique. Reference counting is often praised for its simplicity, but many disadvantages are frequently reported in the literature [Jones and Lins 1996; Wilson 1992]. First, incrementing and decrementing reference counts every time a reference is created or destroyed can significantly impact performance because these operations not only take time but also affect cache performance, especially in a multi-threaded program [Choi et al. 2018]. Second, reference counting cannot collect circular [McBeth 1963] or self-referential structures. Finally, in most reference counting implementations, pause times are deterministic but may still be unbounded [Boehm 2004].

In this paper, we investigate whether reference counting is a competitive memory management technique for purely functional languages, and explore optimizations for reusing memory, performing destructive updates, and for minimizing the number of reference count increments and decrements. The former optimizations in particular are beneficial for purely functional languages that otherwise can only perform functional updates. When performing functional updates, objects often die just before the creation of an object of the same kind. We observe a similar phenomenon when we insert a new element into a pure functional data structure such as a binary tree, when we use map to apply a given function to the elements of a list or tree, when a compiler applies optimizations by transforming abstract syntax trees, or when a proof assistant rewrites formulas. We call it the resurrection hypothesis: many objects die just before the creation of an object of the same kind. Our new optimization takes advantage of this hypothesis, and enables pure code to perform destructive updates in all scenarios described above when objects are not shared. We implemented all the ideas reported here in the new runtime and compiler for the Lean programming language [de Moura et al. 2015]. We also report preliminary experimental results that demonstrate our new compiler produces competitive code that often outperforms the code generated by high-performance compilers such as ocamlopt and GHC (Section 8).

Lean implements a version of the Calculus of Inductive Constructions [Coquand and Huet 1988; Coquand and Paulin 1990], and it has mainly been used as a proof assistant so far. Lean has a metaprogramming framework for writing proof and code automation, where users can extend Lean using Lean itself [Ebner et al. 2017]. Improving the performance of Lean metaprograms was the primary motivation for the work reported here, but one can apply the techniques reported here to general-purpose functional programming languages.

We describe our approach as a series of refinements starting from λpure, a simple intermediate representation for eager and purely functional languages (Section 3). We remark that in Lean and λpure, it is not possible to create cyclic data structures. Thus, one of the main criticisms against reference counting does not apply. From λpure, we obtain λRC by adding explicit instructions for incrementing (inc) and decrementing (dec) reference counts, and reusing memory (Section 4). The inspiration for explicit RC instructions comes from the Swift compiler, as does the notion of borrowed references. In contrast to standard (or owned) references, of which there should be exactly as many as the object’s reference counter implies, borrowed references do not update the reference counter but are assumed to be kept alive by a surrounding owned reference, further minimizing the number of inc and dec instructions in generated code.

We present a simple compiler from λpure to λRC, discussing heuristics for inserting destructive updates, borrow annotations, and inc/dec instructions (Section 5). Finally, we show that our approach is compatible with existing techniques for performing destructive updates on array and string values, and propose a simple and efficient approach for thread-safe reference counting (Section 7).

Contributions. We present a reference counting scheme optimized for and used by the next version of the Lean programming language.
• We describe how to reuse allocations in both user code and language primitives, and give a formal reference-counting semantics that can express this reuse.
• We describe the optimization of using borrowed references.
• We define a compiler that implements all these steps. The compiler is implemented in Lean itself and the source code is available.
• We give a simple but effective scheme for avoiding atomic reference count updates in multi-threaded programs.
• We compare the new Lean compiler incorporating these ideas with other compilers for functional languages and show its competitiveness.

2 EXAMPLES

In reference counting, each heap-allocated value contains a reference count. We view this counter as a collection of tokens. The inc instruction creates a new token and dec consumes it. When a function takes an argument as an owned reference, it is responsible for consuming one of its tokens. The function may consume the owned reference not only by using the dec instruction, but also by storing it in a newly allocated heap value, returning it, or passing it to another function that takes an owned reference. We illustrate our intermediate representation (IR) and the use of owned and borrowed references with a series of small examples.

The identity function id does not require any RC operation when it takes its argument as an owned reference.

    id x = ret x

As another example, consider the function mkPairOf that takes x and returns the pair (x, x).

[…]

them exactly once. Now we contrast that with a function that only inspects its argument: the function isNil xs returns true if the list xs is empty and false otherwise. If the argument xs is taken as an owned reference, our compiler generates the following code

    isNil xs = case xs of
      (Nil → dec xs ; ret true)
      (Cons → dec xs ; ret false)

We need the dec instructions because a function must consume all arguments taken as owned references. One may notice that decrementing xs immediately after we inspect its constructor tag is wasteful. Now assume that instead of taking the ownership of an RC token, we could borrow it from the caller. Then, the callee would not need to consume the token using an explicit dec operation. Moreover, the caller would be responsible for keeping the borrowed value alive. This is the essence of borrowed references: a borrowed reference does not actually keep the referenced value alive, but instead asserts that the value is kept alive by another, owned reference. Thus, when xs is a borrowed reference, we compile isNil into our IR as

    isNil xs = case xs of (Nil → ret true) (Cons → ret false)

As a less trivial example, we now consider the function hasNone xs that, given a list of optional values, returns true if xs contains a None value. This function is often defined in a functional language as

    hasNone [] = false
    hasNone (None : xs) = true
    hasNone (Some x : xs) = hasNone xs

Similarly to isNil, hasNone only inspects its argument. Thus if xs is taken as a borrowed reference, our compiler produces the following RC-free IR code for it

    hasNone xs = case xs of
      (Nil → ret false)
      (Cons → let h = proj_head xs ; case h of
        (None → ret true)
        (Some → let t = proj_tail xs ; let r = hasNone t ; ret r))

Note that our case operation does not introduce binders. Instead, we use explicit instructions proj_i for accessing the head and tail of the Cons cell. We use suggestive names for cases and fields in these initial examples, but will later use indices instead. Our borrow inference heuristic discussed in Section 5 correctly tags xs as a borrowed parameter.

When using owned references, we know at run time whether
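The difference between the owned and borrowed versions of isNil can be made concrete with a small run-time model. The following Python sketch is only an illustration of the token view described above, not the IR or the Lean runtime: the Cell class, the ops counter, and all function names are hypothetical, and freeing a value when its count reaches zero is deliberately left out. It assumes, following the token view, that a caller who wants to keep using a value after passing it as an owned argument must first create an extra token with inc.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    """Toy heap value with an explicit reference count (hypothetical model)."""
    tag: str                       # "Nil" or "Cons"
    head: object = None            # None models the None option, anything else Some
    tail: Optional["Cell"] = None
    rc: int = 1                    # a fresh value starts with one token


ops = {"inc": 0, "dec": 0}         # number of RC instructions executed so far


def inc(x: Cell) -> None:
    x.rc += 1
    ops["inc"] += 1


def dec(x: Cell) -> None:
    # Simplification: a real runtime would free x once rc reaches 0.
    x.rc -= 1
    ops["dec"] += 1


def nil() -> Cell:
    return Cell("Nil")


def cons(head: object, tail: Cell) -> Cell:
    return Cell("Cons", head, tail)


def is_nil_owned(xs: Cell) -> bool:
    # Owned parameter: the callee must consume the token it was handed.
    result = xs.tag == "Nil"
    dec(xs)
    return result


def is_nil_borrowed(xs: Cell) -> bool:
    # Borrowed parameter: the caller keeps xs alive, so no RC instruction runs.
    return xs.tag == "Nil"


def has_none_borrowed(xs: Cell) -> bool:
    # Mirrors the RC-free IR for hasNone: only tag checks and projections.
    if xs.tag == "Nil":
        return False
    if xs.head is None:
        return True
    return has_none_borrowed(xs.tail)


xs = cons(1, nil())

# Owned call while the caller still needs xs: one inc here, one dec in the callee.
inc(xs)
is_nil_owned(xs)
owned_cost = ops["inc"] + ops["dec"]

# Borrowed call: no RC instruction at all.
ops.update(inc=0, dec=0)
is_nil_borrowed(xs)
borrowed_cost = ops["inc"] + ops["dec"]
```

In this model the owned call costs two counter updates and the borrowed call costs none, while xs ends with the same count in both cases.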

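Returning to the resurrection hypothesis stated in the introduction, the following Python sketch shows why an exact reference count makes destructive updates safe in pure code. It is a run-time toy model under stated assumptions, not the compiler's actual mechanism: Node, cons, map_list, and the allocations counter are hypothetical names, and the decision to reuse a cell is made here by a dynamic check of the count. When map is applied to a list whose cells are unshared (count 1), each old cell would die just before a cell of the same shape is created, so the sketch overwrites it in place; when a cell is shared, it gives up its token and allocates a fresh cell instead.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """Toy list cell with an explicit reference count (hypothetical model)."""
    value: int
    next: Optional["Node"] = None
    rc: int = 1


allocations = 0  # number of cells allocated through cons


def cons(value: int, nxt: Optional[Node]) -> Node:
    global allocations
    allocations += 1
    return Node(value, nxt)


def map_list(f: Callable[[int], int], xs: Optional[Node]) -> Optional[Node]:
    """Map f over a list passed as an owned reference."""
    if xs is None:
        return None
    if xs.rc == 1:
        # Unshared: nobody else can observe this cell, so reuse its memory.
        xs.value = f(xs.value)
        xs.next = map_list(f, xs.next)
        return xs
    # Shared: consume our token, keep the old cell intact, allocate a new one.
    xs.rc -= 1
    if xs.next is not None:
        xs.next.rc += 1  # the old cell still points to its tail; take our own token
    return cons(f(xs.value), map_list(f, xs.next))


def to_list(xs: Optional[Node]) -> List[int]:
    out = []
    while xs is not None:
        out.append(xs.value)
        xs = xs.next
    return out


# Unshared input: every cell is updated in place, no allocation happens.
unshared = cons(1, cons(2, cons(3, None)))
before = allocations
mapped_unshared = map_list(lambda v: v + 1, unshared)
unshared_allocs = allocations - before

# Shared input: a second owner holds a token, so the list must be copied.
shared = cons(1, cons(2, None))
shared.rc += 1
before = allocations
mapped_shared = map_list(lambda v: v * 10, shared)
shared_allocs = allocations - before
```

The shared case leaves the original list untouched for its other owner, which is what keeps the in-place update invisible to pure code.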