Sound Gradual Typing Is Nominally Alive and Well

Sound Gradual Typing is Nominally Alive and Well FABIAN MUEHLBOECK, Cornell University, USA ROSS TATE, Cornell University, USA Recent research has identified significant performance hurdles that sound gradual typing needs to overcome. These performance hurdles stem from the fact that the run-time checks gradual type systems insert into code can cause significant overhead. We propose that designing a type system for a gradually typed language hand in hand with its implementation from scratch is a possible way around these and several other hurdles on the way to efficient sound gradual typing. Such a design process also highlights the type-system restrictions required for efficient composition with gradual typing. We formalize the core of a nominal object-oriented language that fulfills a variety of desirable properties for gradually typed languages, and present evidence that an implementation of this language suffers minimal overhead even in adversarial benchmarks identified in earlier work. CCS Concepts: · General and reference → Performance; · Software and its engineering → Formal language definitions; Runtime environments; Object oriented languages; Additional Key Words and Phrases: Gradual Typing, Nominal, Immediate Accountability, Transparency ACM Reference Format: Fabian Muehlboeck and Ross Tate. 2017. Sound Gradual Typing is Nominally Alive and Well. Proc. ACM Program. Lang. 1, OOPSLA, Article 56 (October 2017), 30 pages. https://doi.org/10.1145/3133880 1 INTRODUCTION Sound gradual typing is the idea that typed and untyped code can be mixed together in a single language in such a way that the typed code is able to execute as if there were no untyped code. This means that typed code can rely on type soundness to enable type-driven optimizations. The basic intuition of how to achieve sound gradual typing is relatively simple: we must protect the guarantees obtained through static type-checking by inserting run-time checks at locations in the code where values flow from untyped components to typed components. If a value from untyped code fails to have its expected type at run time, an exception is thrown. Thus, the statically checked components of the program can assume all values passed to them are well-typed. Painted this way, the picture leads us to expect that gradual typing may incur some overhead for those inserted checks, proportional to the number of times we transition from untyped to typed parts of the program. In the optimal scenario, these checks are infrequent and efficient, and thus the overall cost of gradual typing is low and can be easily estimated and planned for by analyzing the level of interaction between typed and untyped parts of the program. Furthermore, typed code can be optimized in ways untyped code cannot, so one would expect performance to smoothly improve as types are added to a code base. However, we are not aware of any existing proposal for a sound gradually typed language which has such performance behavior and reasonable performance for Authors’ addresses: Fabian Muehlboeck, Cornell University, Ithaca, NY, 14850, USA, [email protected]; Ross Tate, Cornell University, Ithaca, NY, 14850, USA, [email protected]. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice andthe full citation on the first page. Copyrights for components of this work owned by others than the author(s) must behonored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. © 2017 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery. 2475-1421/2017/10-ART56 https://doi.org/10.1145/3133880 Proc. ACM Program. Lang., Vol. 1, No. OOPSLA, Article 56. Publication date: October 2017. 56 56:2 Fabian Muehlboeck and Ross Tate fully untyped code. In fact, Takikawa et al. [2016] measured extreme and unpredictable dips in performance for programs consisting of both typed and untyped code. These measurements were made on their current proposal for sound gradual typing for Racket [Tobin-Hochstadt and Felleisen 2006], but they argue that no other system provides convincing reasons for why it should perform significantly better. In this paper, we present a way to achieve efficient sound gradual typing. The core component of this approach is a nominal type system with run-time type information, which lets us verify assumptions about many large data structures with a single, quick check. The downside of this approach is that it limits expressivity, particularly with respect to structural data. Nonetheless, we argue that one should be able to build useful programming languages even under these limitations, and we sketch ideas about how to address the limitations in the future. Furthermore, by designing this system hand-in-hand with gradual typing, we are able to execute even untyped code efficiently despite our reliance on nominal typing. To support our claims about efficiency, we built a prototype compiler for a nominal object- oriented language and used it to implement key examples presented by Takikawa et al. Given that this involves a major shift in programming paradigms, we engineered the examples to exhibit the same complexity in terms of transitions between untyped and typed code as they did in Racket. Whereas Takikawa et al. measured overheads over 10,000% relative to the performance of fully untyped code, we measured worst-case overheads of less than 10%. In summary, the contributions of this paper are as follows: • We present new desirable properties of sound gradual type systems that we believe significantly improve their performance (Section 3). • We present a simple gradually typed nominal object-oriented language (Section 5 through 7) that fulfills the properties traditionally desired of gradual type systems in addition toour own new properties (Section 8). We also give a crisp connection between the direct semantics of the language (Section 6) and the cast semantics of the language (Section 7). • We provide evidence of our approach’s feasibility and efficiency by presenting an implementation of said language and comparing benchmarks between it, Typed Racket [Takikawa et al. 2016], C# [Bierman et al. 2010], and Reticulated Python [Vitousek et al. 2014] (Section 9). • We illustrate the significant tradeoffs and future challenges of our approach and sketch possible avenues for addressing its current limitations in the future (Section 10). This work draws heavily from earlier work on gradual typing, which we review next. 2 BACKGROUND Gradual typing, as originally proposed by Siek and Taha [2006], features two core elements: a special type dyn and a consistency relation τ ∼ τ ′ expressing that τ and τ ′ are structurally equal except for places featuring dyn. For example, the following types are consistent: dyn ∼ (dyn → dyn) ∼ (int → dyn) ∼ (int → bool) A program in their gradually typed language type-checks according to the original typing rules, but with type equality replaced with the consistency relation in many places. This enables dyn to stand for any type while also maintaining the familiar static typing rules where dyn is not present. The gradually typed program is then translated into a variant of the original statically typed language by inserting dynamic casts where run-time checks are necessary to monitor the boundary between untyped and typed code. Here is how this works for an example program: f : dyn → dyn ⊢ (λf ′ : int → int. f ′ 5) f Proc. ACM Program. Lang., Vol. 1, No. OOPSLA, Article 56. Publication date: October 2017. Sound Gradual Typing is Nominally Alive and Well 56:3 The lambda term expects a parameter of type int → int, but it is applied to an argument of type dyn → dyn. Despite this difference in types, the program type-checks because these two types are considered to be consistent with each other. This gradually typed program is then translated to a statically typed program by inserting run-time casts, resulting in f : dyn → dyn ⊢ (λf ′ : int → int. f ′ 5)(f :: int → int) Here e :: τ represents a run-time check that e has type τ , conceptually throwing a run-time exception if it does not. 2.1 Casting Strategies In most gradual typing settings, casts are the only source of run-time overhead incurred by gradual typing. Thus, where casts are inserted and how they work has a big impact on the performance of a gradually typed program. Before we suggest our own variation on casts in Section 3, we give an overview of existing casting strategies that have been studied as such. 2.1.1 Guarded. Most work on sound gradual typingÐincluding the original works by Siek and Taha [2006], Tobin-Hochstadt and Felleisen [2006], Matthews and Findler [2007], and Gronski et al. [2006]Ðuses the guarded cast semantics. In those systems, a cast like the one above reduces as follows: f :: int → int 7→ λx : int. (f (x :: dyn)) :: int Instead of checking whether the function f always returns an int when given an int, which is generally impossible, it is wrapped in a new function that upcasts its input to dynÐwhich always worksÐand, after the call to f completes, checks that its output is an int. Wadler and Findler [2009], and later Ahmed et al. [2011], showed that this is sound even if it is later discovered that f does not always return an int when given an int. However, instead of having one check at the point where the function is passed to the typed part of the program, this strategy will incur checks every time the function is called, which can cause signficant overhead if that function is heavily used.

Load more