
RbSyn: Type- and Effect-Guided Program Synthesis

Sankha Narayan Guria, University of Maryland, College Park, Maryland, USA
Jeffrey S. Foster, Tufts University, Medford, Massachusetts, USA
David Van Horn, University of Maryland, College Park, Maryland, USA

arXiv:2102.13183v2 [cs.PL] 8 Apr 2021
Conference’17, July 2017, Washington, DC, USA

Abstract

In recent years, researchers have explored component-based synthesis, which aims to automatically construct programs that operate by composing calls to existing APIs. However, prior work has not considered efficient synthesis of methods with side effects, e.g., web app methods that update a database. In this paper, we introduce RbSyn, a novel type- and effect-guided synthesis tool for Ruby. An RbSyn synthesis goal is specified as the type for the target method and a series of test cases it must pass. RbSyn works by recursively generating well-typed candidate method bodies whose write effects match the read effects of the test case assertions. After finding a set of candidates that separately satisfy each test, RbSyn synthesizes a solution that branches to execute the correct candidate code under the appropriate conditions. We formalize RbSyn on a core, object-oriented language $\lambda_{syn}$ and describe how the key ideas of the model are scaled up in our implementation for Ruby. We evaluated RbSyn on 19 benchmarks, 12 of which come from popular, open-source Ruby apps. We found that RbSyn synthesizes correct solutions for all benchmarks, with 15 benchmarks synthesizing in under 9 seconds, while the slowest benchmark takes 83 seconds. Using observed reads to guide synthesis is effective: using type-guidance alone times out on 10 of 12 app benchmarks. We also found that using less precise effect annotations leads to worse synthesis performance. In summary, we believe type- and effect-guided synthesis is an important step forward in synthesis of effectful methods from test cases.

Keywords: program synthesis, type and effect systems, Ruby

1 Introduction

A key task in modern software development is writing code that composes calls to existing APIs, such as from a library or framework. Component-based synthesis aims to carry out this task automatically, and researchers have shown how to perform component-based synthesis using SMT solvers [25]; how to synthesize branch conditions [30]; and how to perform synthesis given a very large number of components [12]. This prior work guides the synthesis process using types or special properties of the synthesis domain, which is critical to achieving good performance. However, prior work does not explicitly consider side effects, which are pervasive in many domains. For example, consider synthesizing a method that updates a database. Without reasoning about effects (in this case, that the method body needs to change the database), synthesis of such a method reduces to brute-force search, limiting its performance.

In this paper, we address this issue by introducing RbSyn, a new tool for synthesizing Ruby methods. In RbSyn, the user specifies the desired method by its type signature and a series of test cases it must pass. RbSyn then searches for a solution by enumerating candidates and checking them against the tests. The key novelty of RbSyn is that the search is both type- and effect-guided. Specifically, the search begins with a typed hole tagged with the method’s return type. Each step either replaces a typed hole with an expression of that type, possibly introducing more typed holes; inserts an effect hole, annotated with a write effect that may be needed to satisfy a test assertion; or replaces an effect hole with an expression with the given write effect, possibly inserting another effect hole. Once this process finds a set of method bodies that cumulatively pass all tests, RbSyn uses a novel merging strategy to construct a complete solution: it creates a method whose body branches among the conditions, executing the corresponding (passing) code, thus yielding a single method that passes all tests. (§2 gives a complete example of RbSyn’s synthesis process.)

We formalize RbSyn for $\lambda_{syn}$, a core object-oriented language. The synthesis algorithm comprises three parts. The first part, type-guided synthesis, is similar to prior work [16, 29, 31], but is geared towards imperative, object-oriented programs. The second part is effect-guided synthesis, which tries to fill an effect hole $\diamond : \epsilon$ with an expression with effect $\epsilon$. In $\lambda_{syn}$, an effect accesses a region $A.r$, where $A$ is a class and $r$ is an uninterpreted identifier. For example, Post.author might indicate reading instance field author of class Post. This notion of effects balances precision and tractability: effects are precise enough to guide synthesis effectively, yet coarse enough that reasoning about them is simple. The last part of the synthesis algorithm synthesizes branch conditions to create a merged program that combines solutions for individual tests into an overall solution for the complete problem. (§3 discusses our formalism.)

Our implementation of RbSyn is built on top of RDL, a Ruby type system [15]. Our implementation extends RDL to include effect annotations, including a self region to give more precise effect information in the presence of inheritance. Our implementation also makes use of RDL’s type-level computations [26] to provide precise typing during synthesis. Finally, when searching for solutions, our implementation heuristically prioritizes further exploration of candidates that are small and have passed more assertions. (§4 describes our implementation.)

We evaluated RbSyn on a suite of 19 benchmarks, including seven benchmarks we wrote and 12 benchmarks extracted from three widely used, open-source Ruby apps: Discourse, GitLab, and Diaspora. For the former, we wrote our own specifications. For the latter, we used unit tests that came with the benchmarks. We found that RbSyn synthesizes correct solutions for all benchmarks and does so quickly, taking less than 9 seconds each for 15 of the benchmarks, and 83 seconds for the slowest benchmark. Moreover, type- and effect-guidance is critical. Without it, a majority of the benchmarks time out after five minutes. Finally, we examine the tradeoff of effect precision versus performance. We found that restricting effects to class names only causes 3 benchmarks to time out, and restricting effects to only purity/impurity causes 10 benchmarks to time out. (§5 discusses the evaluation in detail.)

We believe that RbSyn is an important step forward in synthesis of effectful methods from test cases.

2 Overview

In this section, we illustrate RbSyn by using it to synthesize a method from a hypothetical web blogging app. This app makes heavy use of ActiveRecord, a popular database access library for Ruby on Rails. It is the ActiveRecord methods whose side effects RbSyn uses to guide synthesis.

Figure 1 shows the synthesis problem. This particular app includes database tables for users and posts. In ActiveRecord, rows of these tables are represented as instances of classes User and Post, respectively. For reference, the table schemas are shown in lines 1 and 2. Each user has a name and username. Each post has the author’s username, the post’s title, and a slug, used to compute a permalink.

     1  # User schema {name: Str, username: Str}
     2  # Post schema {author: Str, title: Str, slug: Str}
     3
     4  define :update_post, "(Str, Str, {author: ?Str, title:
     5      ?Str, slug: ?Str}) -> Post", [User, Post] do
     6    spec "author can only change titles" do
     7      setup {
     8        seed_db # add some users and their posts to db
     9        @post = Post.create(author: 'author', slug:
    10          'hello-world', title: 'Hello World')
    11        update_post('author', 'hello-world', author:
    12          'dummy', title: 'Foo Bar', slug: 'foobar')
    13      }
    14      postcond { |updated|
    15        assert { updated.id == @post.id }
    16        assert { updated.author == "author" }
    17        assert { updated.title == "Foo Bar" }
    18        assert { updated.slug == 'hello-world' }
    19      }
    20    end
    21    spec "other users cannot change anything" do
    22      setup { ... # same setup as above except next line
    23        update_post('dummy', ...) # other args same
    24      }
    25      postcond { |updated| ... # same other three asserts
    26        assert { updated.title == "Hello World" }
    27      }
    28    end
    29  end

Figure 1. Specification for update_post method

The goal of this particular synthesis problem, given by the call to define, is to create a method update_post that allows users to change the information about a post. Lines 4 and 5 specify the method’s type signature in the format of RDL [15], a Ruby type system that RbSyn uses for types and [...]

These classes can then be used to invoke singleton (class) methods in the synthesized method. For simplicity, we assume that RbSyn can use only these constants for this example. In practice, RbSyn can synthesize predefined numeric or string constants like 0, 1 or the empty string.

Finally, the synthesis problem includes a number of specs, which are just test cases. Each spec has a title, for human convenience; a setup block to establish any necessary preconditions and call the synthesized method; and a postcond block with assertions that must hold after the synthesized method runs. As we will see below, separating the pre- and postconditions allows RbSyn to more easily use effects to guide synthesis. In this example, both specs add a few users and a post created by each of them to the database (call to seed_db, details not shown) and then create a post titled “Hello World” by the user author. The first spec asserts that update_post allows author to update a post’s title.
