Type-Level Computations for Ruby Libraries

Milod Kazerounian, University of Maryland, College Park, Maryland, USA
Sankha Narayan Guria, University of Maryland, College Park, Maryland, USA
Niki Vazou, IMDEA Software Institute, Madrid, Spain
Jeffrey S. Foster, Tufts University, Medford, Massachusetts, USA
David Van Horn, University of Maryland, College Park, Maryland, USA

arXiv:1904.03521v1 [cs.PL] 6 Apr 2019

Abstract

Many researchers have explored ways to bring static typing to dynamic languages. However, to date, such systems are not precise enough when types depend on values, which often arises when using certain Ruby libraries. For example, the type safety of a database query in Ruby on Rails depends on the table and column names used in the query. To address this issue, we introduce CompRDL, a type system for Ruby that allows library method type signatures to include type-level computations (or comp types for short). Combined with singleton types for table and column names, comp types let us give database query methods type signatures that compute a table's schema to yield very precise type information. Comp types for hash, array, and string libraries can also increase precision and thereby reduce the need for type casts. We formalize CompRDL and prove its type system sound. Rather than type check the bodies of library methods with comp types (those methods may include native code or be complex), CompRDL inserts run-time checks to ensure library methods abide by their computed types. We evaluated CompRDL by writing annotations with type-level computations for several Ruby core libraries and database query APIs. We then used those annotations to type check two popular Ruby libraries and four Ruby on Rails web apps. We found the annotations were relatively compact and could successfully type check 132 methods across our subject programs. Moreover, the use of type-level computations allowed us to check more expressive properties, with fewer manually inserted casts, than was possible without type-level computations. In the process, we found two type errors and a documentation error that were confirmed by the developers. Thus, we believe CompRDL is an important step forward in bringing precise static type checking to dynamic languages.

Keywords: type-level computations, dynamic languages, types, Ruby, libraries, database queries

1 Introduction

There is a large body of research on adding static typing to dynamic languages [2–4, 21, 28, 36, 37, 42–44]. However, existing systems have limited support for the case when types depend on values. Yet this case occurs surprisingly often, especially in Ruby libraries. For example, consider the following database query, written for a hypothetical Ruby on Rails (a web framework, called Rails henceforth) app:

    Person.joins(:apartments).where({name: 'Alice', age: 30, apartments: {bedrooms: 2}})

This query uses the ActiveRecord DSL to join two database tables, people¹ and apartments, and then filter on the values of various columns (name, age, bedrooms) in the result.

We would like to type check such code, e.g., to ensure the columns exist and the values being matched are of the right types. But we face an important problem: what type signature do we give joins? Its return type, which should describe the joined table, depends on the value of its argument. Moreover, for n tables, there are n² ways to join two of them, n³ ways to join three of them, etc. Enumerating all these combinations is impractical.

To address this problem, in this paper we introduce CompRDL, which extends RDL [18], a Ruby type system, to include method types with type-level computations, henceforth referred to as comp types. More specifically, in CompRDL we can annotate library methods with type signatures in which Ruby expressions can appear as types. During type checking, those expressions are evaluated to produce the actual type signature, and then typing proceeds as usual. For example, for the call to Person.joins, by using a singleton type for :apartments, a type-level computation can look up the database schemas for the receiver and argument and then construct an appropriate return type.²

Moreover, the same type signature can work for any model class and any combination of joins. And, because CompRDL allows arbitrary computation in types, CompRDL type signatures have access to the full, highly dynamic Ruby environment. This allows us to provide very precise types for the large set of Rails database query methods. It also lets us give precise types to methods of finite hash types (heterogeneous hashes), tuple types (heterogeneous arrays), and const string types (immutable strings), which can help eliminate type casts that would otherwise be required.

Note that in all these cases, we apply comp types to library methods whose bodies we do not type check, in part to avoid complex, potentially undecidable reasoning about whether a method body matches a comp type, but more practically because those library methods are either implemented in native code (hashes, arrays, strings) or are complex (database queries). This design choice makes CompRDL a particularly practical system which we can apply to real-world programs. To maintain soundness, we insert dynamic checks to ensure that these methods abide by their computed types at runtime. (§2 gives an overview of typing in CompRDL.)

We introduce λ^C, a core, object-oriented language that formalizes CompRDL type checking. In λ^C, library methods can be declared with signatures of the form (a <: e1/A1) → e2/A2, where A1 and A2 are the conventional (likely overapproximate) argument and return types of the method. The precise argument and return types are determined by evaluating e1 and e2, respectively, and that evaluation may refer to the type of the receiver and the type a of the argument. λ^C also performs type checking on e1 and e2, to ensure they do not go wrong. To avoid potential infinite recursion, λ^C does not use type-level computations during this type checking process, instead using the conventional types for library methods. Finally, λ^C includes a rewriting step to insert dynamic checks to ensure library methods abide by their computed types. We prove λ^C's type system is sound. (See §3 for our formalism.)

We implemented CompRDL on top of RDL, an existing Ruby type checker. Since CompRDL can include type-level computation that relies on mutable values, CompRDL inserts additional runtime checks to ensure such computations evaluate to the same result at method call time as they did at type checking time. Additionally, CompRDL uses a lightweight analysis to check that type-level computations (and thus type checking) terminate. The termination analysis uses purity effects to check that calls that invoke iterator methods (the main source of looping in Ruby, in our experience) do not mutate the receiver, which could introduce non-termination. Finally, we found that several kinds of comp types we developed needed to include weak type updates to handle mutation in Ruby programs. (§4 describes our implementation in more detail.)

We evaluated CompRDL by first using it to write type […] type checking these benchmarks required 4.75× fewer type cast annotations compared to standard types, demonstrating comp types' increased precision. (§5 contains the results of our evaluation.)

Our results suggest that using type-level computations provides a powerful, practical, and precise way to statically type check code written in dynamic languages.

¹ Rails knows the plural of person is people.
² The use of type-level computations and singleton types could be considered dependent typing, but as our type system is much more restricted we introduce new terminology to avoid confusion (see §2.4 for discussion).

2 Overview

The starting point for our work is RDL [18], a system for adding type checking and contracts to Ruby programs. RDL's type system is notable because type checking statically analyzes source code, but it does so at runtime. For example, line 6 in Figure 1a gives a type signature for the method defined on the subsequent line. This "annotation" is actually a call to the method type,³ which stores the type signature in a global table. The type annotation includes a label :model. (In Ruby, strings prefixed by colon are symbols, which are interned strings.) When the program subsequently calls RDL.do_typecheck :model (not shown), RDL will type check the source code of all methods whose type annotations are labeled :model.

This design enables RDL to support the metaprogramming that is common in Ruby and ubiquitous in Rails. For example, the programmer can perform type checking after metaprogramming code has run, when corresponding type definitions are available. See Ren and Foster [36] for more details. We note that while CompRDL benefits from this runtime type checking approach (we use RDL's representation of types in our CompRDL signatures, and our subject programs include Rails apps), there is nothing specific in the design of comp types that relies on it, and one could implement comp types in a fully static system.

2.1 Typing Ruby Database Queries

While RDL's type system is powerful enough to type check Rails apps in general, it is actually very imprecise when reasoning about database (DB) queries. For example, consider Figure 1a, which shows some code from the Discourse app. Among others, this app uses two tables, users and emails, whose schemas are shown on lines 2 and 3. Each user has an id, a username, and a flag indicating whether the account was staged. Such staged accounts were created automatically by Discourse and can be claimed by the email address owner.
