Constraint-Based Type-Directed Program Synthesis

Constraint-Based Type-Directed Program Synthesis Peter-Michael Osera Department of Computer Science Grinnell College United States of America [email protected] Abstract 2. The result of pattern matching on the second argument a We explore an approach to type-directed program synthe- (in the Just case, its argument will have type ). sis rooted in constraint-based type inference techniques. By These restrictions highly constrain the set of valid programs doing this, we aim to more efficiently synthesize polymor- that will typecheck, giving the impression that the function phic code while also tackling advanced typing features such writes itself as long as the programmer can navigate the type as GADTs that build upon polymorphism. Along the way, system appropriately. To do this, they must systematically we also present an implementation of these techniques in check the context to see what program elements are rele- Scythe, a prototype live, type-directed programming tool vant to their final goal and try to put them together into for the Haskell programming language and reflect on our a complete program. However, such navigation is usually initial experience with the tool. mechanical and tedious in nature. It would be preferable if a CCS Concepts • Software and its engineering → Se- tool automated some or all of this type-directed development mantics; Automatic programming; • Theory of compu- process. tation → Logic and verification. Such luxuries are part of the promise of type-directed program synthesis tools. Program synthesis is the automatic Keywords Functional Programming, Program Synthesis, generation of programs from specification. In this particular Type Inference, Type Theory case, the specification of our program is its rich type cou- ACM Reference Format: pled with auxiliary information, e.g., concrete examples of Peter-Michael Osera. 2019. Constraint-Based Type-Directed Pro- intended behavior. gram Synthesis. In Proceedings of the 4th ACM SIGPLAN Interna- tional Workshop on Type-Driven Development (TyDe ’19), August 1.1 From Typechecking to Program Synthesis 18, 2019, Berlin, Germany. ACM, New York, NY, USA, 13 pages. Type-directed synthesis techniques search the space of pos- hps://doi.org/10.1145/3331554.3342608 sible programs primarily through a reinterpretation of the programming language’s type system [6, 19, 22]. Tradition- 1 Introduction ally, type systems are specified by defining a relation between Functional programmers frequently comment how richly- a context, expression, and type, e.g., Γ ⊢ e : τ declares that e typed functional programs just “write themselves”.1 For ex- has type τ under context Γ. From this specification, we would ample, consider writing down a function that obeys the type: like to extract a typechecking algorithm for the language. f :: a -> Maybe a -> a However, because relations do not distinguish between in- Because the return type of the function, a, is polymorphic puts and outputs, it is sometimes not clear how to extract we can only produce a value from two sources: such an algorithm. Bi-directional typechecking systems [21] alleviate these concerns by making it explicit which compo- 1. The first argument to the function (of type a). nents of the system are inputs and outputs of the system, 1Once you get over the complexity of the types! typically by distinguishing the cases where we check that a term has a type from the cases where we infer that a term Permission to make digital or hard copies of all or part of this work for has a type. In the former case, the type acts as an input to personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that the system where in the latter case, it is an output. copies bear this notice and the full citation on the first page. Copyrights In both cases, the term being typechecked serves as an for components of this work owned by others than the author(s) must input to the typechecking algorithm. However, with type- be honored. Abstracting with credit is permitted. To copy otherwise, or directed program synthesis, we instead view the term as republish, to post on servers or to redistribute to lists, requires prior specific an output and the type as an input. For example, consider permission and/or a fee. Request permissions from [email protected]. TyDe ’19, August 18, 2019, Berlin, Germany the standard function application rule found in most type © 2019 Copyright held by the owner/author(s). Publication rights licensed systems: to ACM. Γ ⊢ e1 : τ1 → τ2 Γ ⊢ e2 : τ1 ACM ISBN 978-1-4503-6815-5/19/08...$15.00 hps://doi.org/10.1145/3331554.3342608 Γ ⊢ e1 e2 : τ2 TyDe ’19, August 18, 2019, Berlin, Germany Peter-Michael Osera While we can observe that the function application should be there is nothing directly constraining the type variable typed at the result type of the function e1, it isn’t clear which a. parts of the relations are inputs and outputs and the order Current systems that handle polymorphic synthesis in which the checks ought to be carried out. A bidirectional simply enumerate and explore all the possible types interpretation of this rule makes the inputs and outputs that can be created from the context. However, this explicit: may lead to an excessive exploration of type combi- nations that do not work, e.g., choosing a to be Int Γ Γ ⊢ e1 ⇒ τ1 → τ2 ⊢ e2 ⇐ τ1 but not having any functions of type Int -> Bool Γ ⊢ e1 e2 ⇒ τ1 → τ2 available in the context. Furthermore, it may lead to repeated checking of terms that are, themselves, poly- Γ Here, the relation ⊢ e1 ⇒ τ1 → τ2 means that we infer that morphic, e.g., the empty list [] :: [a] for any type a. Γ the type of e1 is τ1 → τ2. The relation ⊢ e2 ⇐ τ1 means We would like to develop a system that systematically that we check that the type of e2 is indeed τ1. With this, the searches the space of polymorphic instantiations while procedure for type checking a function application is clear: minimizing the work done as much as possible. infer a function type τ1 → τ2 for e1 and then check that e2 • Reasoning about richer types: polymorphic types has that input type τ1; the type of the overall application is are relatively easy to handle in an ad hoc fashion. How- then inferred to be the output type τ2. ever, polymorphic types form the basis for a variety of With type-directed synthesis, we turn typechecking into advanced type features such as generalized algebraic term generation by reinterpreting the inputs and outputs of datatypes (GADTs) and typeclasses that are commonly the typechecking relation. used in advanced functional programming languages. Rather than developing ad hoc solutions for all these Γ ⊢ τ1 → τ2 ⇒ e1 Γ ⊢ τ1 ⇒ e2 related features, it would be useful to have a single Γ ⊢ τ2 ⇒ e1 e2 framework for tackling them all at once. The relation Γ ⊢ τ ⇒ e asserts that whenever we have a goal type τ we can generate a term e of that type. Now our term 1.3 Outline generation rule for function application says that whenever In this paper we present a solution to the problems described we have a goal typeτ2, we can generate a function application above: a constraint-based approach to type-directed pro- e1 e2 where the output of the function type agrees with the gram synthesis. Constraint-based typing is used primarily to goal. We can apply this pattern to the other rules of the specify type inference systems which generate constraints type system to obtain a complete term generation system between types in the program and then solves those con- for a language. We can further augment the system with straints to discover the types of unknown type variables. In type-directed example decomposition [6, 19] or richer types the spirit of type-directed program synthesis, we flip the such as refinements [22] to obtain true program synthesis inputs and outputs of the constraint-based typing system systems. to arrive at a synthesis system that tracks type constraints This type-theoretic interpretation of program synthesis throughout the synthesis process. This embodies a new strat- gives us immediate insight into how to synthesize programs egy for synthesis—“infer types while synthesizing”—that for languages with rich type systems. Because rich types allows for efficient synthesis of polymorphic code while also constrain the set of possible programs dramatically, program giving us a framework to tackle GADTs and other advanced synthesis with types can lead to superior synthesis perfor- type features built on polymorphism. mance. On top of this, the type-directed synthesis style di- In section 2, we develop a constraint-based program syn- rectly supports type-directed programming, a hallmark of thesis based on a basic constraint-based type inference sys- richly-typed functional programming languages. tem. In section 3, we extend the system to account for GADTs. In section 4, we discuss our prototype implementation of 1.2 The Perils of Polymorphism this approach in our program synthesis tool, Scythe. And fi- However, in supporting rich types, in particular polymor- nally in section 5 and section 6 we close by discussing future phism, we run into a pair of problems that deserve special extensions and related work. attention. • The type enumeration problem: when dealing with 2 Constraint-based Program Synthesis polymorphic types, we must first instantiate them. We first develop a constraint-based program synthesis sys- However, there may be many possible instantiations tem for a core functional programming language featuring of a polymorphic type. For example, consider the map algebraic datatypes and polymorphism. To do this, we first function of type (a -> b) -> [a] -> [b].

Load more