Learning in System F

Joey Velez-Ginorio, Massachusetts Institute of Technology, Cambridge, Massachusetts, U.S.A. ([email protected])
Nada Amin, Harvard University, Cambridge, Massachusetts, U.S.A. ([email protected])

Abstract
Program synthesis, type inhabitance, inductive programming, and theorem proving. Different names for the same problem: learning programs from data. Sometimes the programs are proofs, sometimes they're terms. Sometimes data are examples, and sometimes they're types. Yet the aim is the same. We want to construct a program which satisfies some data. We want to learn a program. What might a programming language look like, if its programs could also be learned? We give it data, and it learns a program from it. This work shows that System F yields an approach for learning from types and examples. Beyond simplicity, System F's expressivity gives us the potential for two key ideas. (1) Learning as a first-class activity. That is, learning can happen anywhere in a program. (2) Bootstrapping its learning to higher-level languages—much the same way we bootstrap the capabilities of a functional core to modern, high-level functional languages. This work takes some first steps in that direction of the design space.

Keywords: Program Synthesis, Inductive Programming

data Nat = Zero
         | S Nat

succ :: Nat -> Nat
succ n = [,]

main :: Nat -> Nat
main = succ

succ :: Nat -> Nat
succ n = case n [Nat] of
  Zero -> S Zero
  S n2 -> S (S n2)

lam x: (forall R. (Unit -> R) -> (R -> R) -> R).
  forall R.
    lam c0: (Unit -> R).
      (lam c1: (R -> R).
        (c1 (((x R) c0) c1)))

Figure 1. An example learning problem, the learned solution in pretty print, and the learned solution in System F.

1 Introduction

1.1 A tricky learning problem
Imagine we're teaching you a program. Your only data is the type 푛푎푡 → 푛푎푡. It takes a natural number, and returns a natural number. Any ideas? Perhaps a program which computes...

푓(푥) = 푥,  푓(푥) = 푥 + 1,  푓(푥) = 푥 + · · ·

The good news is that 푓(푥) = 푥 + 1 is correct. The bad news is that the data let you learn a slew of other programs too. It doesn't constrain learning enough if we want to teach 푓(푥) = 푥 + 1. As teachers, we can provide better data.

Round 2. Imagine we're teaching you a program. But this time we give you an example of the program's behavior. Your data are the type 푛푎푡 → 푛푎푡 and an example 푓(1) = 2. It takes a natural number, and seems to return its successor. Any ideas? Perhaps a program which computes...

푓(푥) = 푥 + 1,  푓(푥) = 푥 + 2 − 1,  푓(푥) = 푥 + · · ·

The good news is that 푓(푥) = 푥 + 1 is correct. And so are all the other programs, as long as we're agnostic to some details. Types and examples impose useful constraints on learning. They are the data we use when learning in System F [Girard et al. 1989].

Existing work can learn successor from similar data [Osera 2015; Polikarpova et al. 2016]. But suppose 푛푎푡 is a Church encoding. For some base type 퐴, 푛푎푡 ≔ (퐴 → 퐴) → (퐴 → 퐴). Natural numbers are then higher-order functions. They take and return functions. In this setting, existing work can no longer learn successor.

1.2 A way forward
The difficulty is with how to handle functions in the return type. The type 푛푎푡 → 푛푎푡 returns a function, a program of type 푛푎푡.
To learn correct programs, you need to ensure candidates are the correct type or that they obey examples.
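To make the problem concrete, here is a minimal Haskell sketch (ours, not code from the paper) of the Church encoding 푛푎푡 ≔ (퐴 → 퐴) → (퐴 → 퐴) from Section 1.1. Successor takes and returns functions, so checking an example like 푓(1) = 2 means comparing functions, which we can only do at arguments we choose.

-- A minimal Haskell sketch (ours, not the paper's code) of the Church
-- encoding nat := (A -> A) -> (A -> A) from Section 1.1, for a base type a.
type Church a = (a -> a) -> (a -> a)

zero, one, two :: Church a
zero _ = id        -- apply the step function zero times
one  s = s         -- apply it once
two  s = s . s     -- apply it twice

-- Successor is higher-order: it takes a function and returns a function.
-- (Named suc to avoid clashing with Prelude's succ.)
suc :: Church a -> Church a
suc n s = s . n s

-- `suc one == two` is not even well typed: functions have no general
-- equality. We can only observe both sides at arguments we pick ourselves.
main :: IO ()
main = print (suc one (+ 1) (0 :: Int) == two (+ 1) 0)   -- True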

Imagine we want to verify that our candidate program 푓 obeys 푓(1) = 2. With the Church encoding, 푓(1) is a function, and so is 2. To check 푓(1) = 2 requires that we decide function equality—which is undecidable in a Turing-complete language [Sipser et al. 2006]. Functions in the return type create this issue. There are two ways out.

1. Don't allow functions in the return type, keep Turing-completeness.
2. Allow functions in the return type, leave Turing-completeness.

Route 1 is the approach of existing work. They don't allow functions in the return type, but keep an expressive Turing-complete language for learning. This can be a productive move, as many interesting programs don't return functions.

Route 2 is the approach we take. We don't impose restrictions on the types or examples we learn from. We instead sacrifice Turing-completeness. We choose a language where function equality is decidable, but which is still expressive enough to learn interesting programs. Our work shows that this too is a productive move, as many interesting programs return functions. This route leads us to several contributions:

• Detail how to learn arbitrary higher-order programs in System F. (Sections 2 & 3)
• Prove the soundness and completeness of learning. (Sections 2 & 3)
• Implement learning, extending strong theoretical guarantees into practice. (Sections 4 & 5)

2 System F
We assume you are familiar with System F, the polymorphic lambda calculus. You should know its syntax, typing, and evaluation. If you don't, we co-opt its specification from [Pierce 2002]. For a comprehensive introduction we defer the confused or rusty there. Additionally, we provide the specification and relevant theorems in the appendix.

Our focus in this section is to motivate System F: its syntax, typing, and evaluation, and why properties of each are advantageous for learning. Treat this section as an answer to the following question: Why learn in System F?

2.1 Syntax
System F's syntax is simple (Figure 2). There aren't many syntactic forms. Whenever we state, prove, or implement things in System F we often use structural recursion on the syntax. A minimal syntax means we are succinct when we state, prove, or implement those things.

While simple, the syntax is still expressive. We can encode many staples of typed functional programming: algebraic data types, inductive types, and more [Pierce 2002]. For example, consider this encoding of products:

휏1 × 휏2 ≔ ∀훼. (휏1 → 휏2 → 훼) → 훼
⟨푒1, 푒2⟩ ≔ Λ훼. 휆푓:(휏1 → 휏2 → 훼). 푓 푒1 푒2
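As a quick illustration of that encoding, here is a Haskell sketch (ours, assuming GHC's RankNTypes extension; not code from the paper's implementation) of the product type above and its projections.

{-# LANGUAGE RankNTypes #-}

-- A sketch (ours) of the product encoding from Section 2.1:
-- t1 * t2 := forall a. (t1 -> t2 -> a) -> a
type Pair t1 t2 = forall a. (t1 -> t2 -> a) -> a

-- <e1, e2> := /\a. \f:(t1 -> t2 -> a). f e1 e2
pair :: t1 -> t2 -> Pair t1 t2
pair e1 e2 = \f -> f e1 e2

-- The projections just pick which component the continuation returns.
first :: Pair t1 t2 -> t1
first p = p (\x _ -> x)

second :: Pair t1 t2 -> t2
second p = p (\_ y -> y)

main :: IO ()
main = print (first (pair (1 :: Int) "two"), second (pair (1 :: Int) "two"))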

2.2 Typing
System F is safe. Its typing ensures both progress and preservation, i.e. that well-typed programs do not get stuck and that they do not change type [Pierce 2002]. When we introduce learning, we lift this safety and extend it to the programs we learn. We omit the presentation of typing, as it's equivalent to the learning relation presented in the next section.

2.3 Evaluation
System F is strongly normalizing. All its programs terminate. As a result, we can use a simple procedure for deciding equality of programs (including functions).

1. Run both programs until they terminate.
2. Check if they share the same normal form, up to alpha-equivalence (renaming of variables).
3. If they do, they are equal. Otherwise, unequal.
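Here is a sketch of that procedure in Haskell (ours, not the paper's implementation). It works over untyped lambda terms with de Bruijn indices, so alpha-equivalence becomes plain structural equality; termination relies on the input coming from a strongly normalizing calculus such as System F.

-- A sketch (ours) of the normalize-then-compare equality check of Section 2.3,
-- on untyped lambda terms with de Bruijn indices.
data Term = Var Int | Lam Term | App Term Term
  deriving (Eq, Show)

-- shift d c t: add d to every variable in t whose index is >= cutoff c
shift :: Int -> Int -> Term -> Term
shift d c (Var k)   = Var (if k >= c then k + d else k)
shift d c (Lam b)   = Lam (shift d (c + 1) b)
shift d c (App f a) = App (shift d c f) (shift d c a)

-- subst j s t: replace variable j with s in t
subst :: Int -> Term -> Term -> Term
subst j s (Var k)   = if k == j then s else Var k
subst j s (Lam b)   = Lam (subst (j + 1) (shift 1 0 s) b)
subst j s (App f a) = App (subst j s f) (subst j s a)

-- Full beta-normalization; fine here because every program terminates.
normalize :: Term -> Term
normalize (Var k)   = Var k
normalize (Lam b)   = Lam (normalize b)
normalize (App f a) =
  case normalize f of
    Lam b -> normalize (shift (-1) 0 (subst 0 (shift 1 0 a) b))
    f'    -> App f' (normalize a)

-- Two programs are equal iff their normal forms coincide.
equalPrograms :: Term -> Term -> Bool
equalPrograms s t = normalize s == normalize t

-- The example from Section 2.3: \x.x  vs  (\y.y) (\z.z)
main :: IO ()
main = print (equalPrograms (Lam (Var 0)) (App (Lam (Var 0)) (Lam (Var 0))))   -- True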

Syntax

푒 ::=                      terms:
    푥                      variable
    푒1 푒2                  application
    휆푥:휏.푒                 abstraction
    푒 ⌈휏⌉                  type application
    Λ훼.푒                   type abstraction

푣 ::=                      values:
    휆푥:휏.푒                 abstraction
    Λ훼.푒                   type abstraction

휏 ::=                      types:
    휏1 → 휏2                function type
    ∀훼.휏                   polymorphic type
    훼                      type variable

Γ ::=                      contexts:
    ·                      empty
    푥:휏, Γ                 variable
    훼, Γ                   type variable

Figure 2. Syntax in System F

For example, this decision procedure renders these programs equal:

휆푥:휏.푥  =훽  (휆푦:(휏 → 휏).푦) 휆푧:휏.푧

The decision procedure checks that two programs exist in the transitive reflexive closure of the evaluation relation. This only works because programs always terminate in System F [Girard et al. 1989].

3 Learning from Types
We present learning as a relation between contexts Γ, programs 푒, and types 휏.

Γ ⊢ 휏 ⇝ 푒

The relation asserts that given a context Γ and type 휏, you can learn program 푒. Unlike typing, we only consider programs 푒 in normal form. We discuss later how this pays dividends in the implementation. With reference to the syntax in the appendix, we define System F programs 푒 in normal form:

푒 ≔ 푒̂ | 휆푥:휏.푒 | Λ훼.푒
푒̂ ≔ 푥 | 푒̂ 푒 | 푒̂ ⌈휏⌉

3.1 Learning, a relation
If you squint, you may think that learning in Figure 4 looks a lot like typing in Figure 3. The semblance isn't superficial, but instead reflects an equivalence between learning and typing which we later state. Despite this, the learning relation isn't redundant. It forms the core of an extended learning relation in the next section, where we learn from both types and examples.

Like typing, we define the relation with a set of inference rules. These rules confer similar benefits to typing. We can prove useful properties of learning, and the rules guide implementation.

(L-Var) says if 푥 is bound to type 휏 in the context Γ, then you can learn the program 푥 of type 휏.

푥:휏 ∈ 푥:휏
――――――――――― (L-Var)
푥:휏 ⊢ 휏 ⇝ 푥

Typing Γ ⊢ 푒 : 휏

푥:휏 ∈ Γ
――――――――― (T-Var)
Γ ⊢ 푥 : 휏

Γ, 푥:휏1 ⊢ 푒2 : 휏2
―――――――――――――――――――――― (T-Abs)
Γ ⊢ 휆푥:휏1.푒2 : 휏1 → 휏2

Γ ⊢ 푒1 : 휏1 → 휏2    Γ ⊢ 푒2 : 휏1
――――――――――――――――――――――――――――――― (T-App)
Γ ⊢ 푒1 푒2 : 휏2

Γ, 훼 ⊢ 푒 : 휏
――――――――――――――――― (T-TAbs)
Γ ⊢ Λ훼.푒 : ∀훼.휏

Γ ⊢ 푒 : ∀훼.휏1
――――――――――――――――――――― (T-TApp)
Γ ⊢ 푒 ⌈휏2⌉ : [휏2/훼]휏1

Figure 3. Typing in System F

Learning Γ ⊢ 휏 ⇝ 푒

푥:휏 ∈ Γ
――――――――― (L-Var)
Γ ⊢ 휏 ⇝ 푥

Γ, 푥:휏1 ⊢ 휏2 ⇝ 푒2
―――――――――――――――――――――――― (L-Abs)
Γ ⊢ 휏1 → 휏2 ⇝ 휆푥:휏1.푒2

Γ ⊢ 휏1 → 휏2 ⇝ 푒̂    Γ ⊢ 휏1 ⇝ 푒
――――――――――――――――――――――――――――――― (L-App)
Γ ⊢ 휏2 ⇝ 푒̂ 푒

Γ, 훼 ⊢ 휏 ⇝ 푒
――――――――――――――――― (L-TAbs)
Γ ⊢ ∀훼.휏 ⇝ Λ훼.푒

Γ ⊢ ∀훼.휏1 ⇝ 푒̂
――――――――――――――――――――― (L-TApp)
Γ ⊢ [휏2/훼]휏1 ⇝ 푒̂ ⌈휏2⌉

Figure 4. Learning from types in System F

(L-Abs) says if 푥 is bound to type 휏1 in the context and you can learn a program 푒2 from type 휏2, then you can learn the program 휆푥:휏1.푒2 from type 휏1 → 휏2, and 푥 is removed from the context.

Γ, 푥:휏1 ⊢ 휏2 → 휏2 ⇝ 휆푦:휏2.푦
――――――――――――――――――――――――――――――――――― (L-Abs)
Γ ⊢ 휏1 → 휏2 → 휏2 ⇝ 휆푥:휏1.휆푦:휏2.푦

(L-App) says that if you can learn a program 푒̂ from type 휏1 → 휏2 and a program 푒 from type 휏1, then you can learn 푒̂ 푒 from type 휏2.

Γ ⊢ 휏 → 휏 ⇝ 푓    Γ ⊢ 휏 ⇝ 푥
――――――――――――――――――――――――――― (L-App)
Γ ⊢ 휏 ⇝ 푓 푥

(L-TAbs) says that if 훼 is in the context, and you can learn a program 푒 from type 휏, then you can learn a program Λ훼.푒 from type ∀훼.휏, and 훼 is removed from the context.

Γ, 훼 ⊢ 훼 → 훼 ⇝ 휆푥:훼.푥
―――――――――――――――――――――――――――― (L-TAbs)
Γ ⊢ ∀훼.훼 → 훼 ⇝ Λ훼.휆푥:훼.푥

(L-TApp) says that if you can learn a program 푒̂ from type ∀훼.휏1, then you can learn the program 푒̂ ⌈휏2⌉ from type [휏2/훼]휏1.

Γ ⊢ ∀훼.훼 → 훼 ⇝ 푓
―――――――――――――――――――― (L-TApp)
Γ ⊢ 휏 → 휏 ⇝ 푓 ⌈휏⌉

3.2 Metatheory
Learning is a relation. Hence we can discuss its metatheory. We care most about two properties: soundness and completeness. Soundness ensures we learn correct programs. Completeness ensures we learn all programs.

We state the relevant theorems in this section. Most of the heavy lifting for proving these properties is done by standard proofs about type systems, like progress and preservation. Learning exploits these properties of type systems to provide similar guarantees.

Lemma 3.1 (Soundness of Learning). If Γ ⊢ 휏 ⇝ 푒 then Γ ⊢ 푒 : 휏.

Lemma 3.2 (Completeness of Learning). If Γ ⊢ 푒 : 휏 then Γ ⊢ 휏 ⇝ 푒.

Structural induction on the learning and typing rules proves these two lemmas. And together, they directly prove the equivalence of typing and learning.

Theorem 3.3 (Equivalence of Typing and Learning). Γ ⊢ 휏 ⇝ 푒 if and only if Γ ⊢ 푒 : 휏.

Because of the equivalence we can extend strong metatheoretic guarantees to learning from examples, for which learning from types is a key step.
4 Learning from Examples
To learn from examples, we extend our learning relation to include examples [휒].

Γ ⊢ 휏 [휒] ⇝ 푒

The relation asserts that given a context Γ, a type 휏, and examples [휒], you can learn program 푒.

Examples are lists of tuples with the inputs and output to a program. For example, [⟨1, 1⟩] describes a program whose input is 1 and output is 1. If we want more than one example, we can add to the list: [⟨1, 1⟩, ⟨2, 2⟩]. And with System F, types are also valid inputs. So [⟨푁푎푡, 1, 1⟩, ⟨푁푎푡, 2, 2⟩] describes a polymorphic program instantiated at type 푁푎푡 whose input is 1 and output is 1. In general, an example takes the form

휒 ≔ ⟨푒, 휒⟩ | ⟨휏, 휒⟩ | ⟨푒, 푁푖푙⟩

Importantly, the syntax doesn't restrict what can be an input or output. Any program 푒 or type 휏 can be an input. Likewise, any program 푒 can be an output. We can describe any input-output relationship in the language. Note that we use the following shorthand notation for examples: ⟨휏, ⟨푒1, ⟨푒2, 푁푖푙⟩⟩⟩ ≡ ⟨휏, 푒1, 푒2⟩.

4.1 Learning, a relation
Unlike the previous learning relation, Figure 5 looks a bit foreign. Nevertheless, the intuition is simple. We demonstrate with learning polymorphic identity from examples. Without loss of generality, assume a base type 푁푎푡 and natural numbers.

· ⊢ ∀훼.훼 → 훼 [⟨푁푎푡, 1, 1⟩] ⇝ ■

Examples describe possible worlds. ⟨푁푎푡, 1, 1⟩ is a world where ■'s inputs are 푁푎푡 and 1, with an output of 1. Throughout learning we need a way to keep track of these distinct worlds. So our first step is always to duplicate ■, so that we have one per example. We also introduce empty let bindings to record constraints on variables at later steps. (L-Wrld) in Figure 5 formalizes this step.

· ⊢ ∀훼.훼 → 훼 [⟨푁푎푡, 1, 1⟩] ⇝ [let (·) 푖푛 ■]

Now, our target type is ∀훼.훼 → 훼. So we know ■ can bind a variable 훼 to some type. And since we have inputs in [⟨푁푎푡, 1, 1⟩], we know what that type is. (L-ETAbs) in Figure 5 formalizes this step.

훼 ⊢ 훼 → 훼 [⟨1, 1⟩] ⇝ [let (훼 = 푁푎푡) 푖푛 ■]

Our target type is now 훼 → 훼. So we know ■ can bind a variable 푥:훼 to some program. And since we have inputs in [⟨1, 1⟩], we know what that program is. (L-EAbs) in Figure 5 formalizes this step.

훼, 푥:훼 ⊢ 훼 [⟨1⟩] ⇝ [let (훼 = 푁푎푡, 푥:훼 = 1) 푖푛 ■]

There aren't any inputs left to add bindings to our possible worlds. Therefore, we invoke the relation for learning from types (Figure 4) to generate a candidate for ■. Then we check

Learning   Γ ⊢ 휏 [휒] ⇝ 푒

Γ ⊢ 휏 [휒1, . . . , 휒푛] ⇝ [let (·) in 푒1, . . . , let (·) in 푒푛]    ⋀ⁿᵢ₌₁ 푒푖 =훽 푒푛
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― (L-Wrld)
Γ ⊢ 휏 [휒1, . . . , 휒푛] ⇝ 푒푛

Γ ⊢ 휏 ⇝ 푒    ⋀ⁿᵢ₌₁ (Γ ⊢ let (푏푖) in 푒 : 휏)    ⋀ⁿᵢ₌₁ (let (푏푖) in 푒 =훽 휒푖)
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― (L-Base)
Γ ⊢ 휏 [⟨휒1⟩, . . . , ⟨휒푛⟩] ⇝ [let (푏1) in 푒, . . . , let (푏푛) in 푒]

Γ, 푥:휏푎 ⊢ 휏푏 [휒1, . . . , 휒푛] ⇝ [let (푏1, 푥:휏푎 = 푒1) in 푒1, . . . , let (푏푛, 푥:휏푎 = 푒푛) in 푒푛]
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― (L-EAbs)
Γ ⊢ 휏푎 → 휏푏 [⟨푒1, 휒1⟩, . . . , ⟨푒푛, 휒푛⟩] ⇝ [let (푏1) in 휆푥:휏푎.푒1, . . . , let (푏푛) in 휆푥:휏푎.푒푛]

Γ, 훼 ⊢ 휏푎 [휒1, . . . , 휒푛] ⇝ [let (푏1, 훼 = 휏푏) in 푒1, . . . , let (푏푛, 훼 = 휏푏) in 푒푛]
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― (L-ETAbs)
Γ ⊢ ∀훼.휏푎 [⟨휏푏, 휒1⟩, . . . , ⟨휏푏, 휒푛⟩] ⇝ [let (푏1) in Λ훼.푒1, . . . , let (푏푛) in Λ훼.푒푛]

Figure 5. Learning from examples in System F

that this candidate evaluates to the correct output in each possible world.

훼, 푥:훼 ⊢ 훼 ⇝ 푥        let (훼 = 푁푎푡, 푥:훼 = 1) 푖푛 푥 =훽 1

With our examples satisfied, we can trivially extract the program Λ훼.휆푥:훼.푥 from the nested let expression. Note that these expressions are merely a syntactic sugaring of System F programs.

let (·) 푖푛 푡 ≡ 푡
let (푥:휏 = 푠) 푖푛 푡 ≡ (휆푥:휏.푡) 푠
let (훼 = 휏) 푖푛 푡 ≡ (Λ훼.푡) ⌈휏⌉
let (푥:휏 = 푡1, 푏푠) 푖푛 푡2 ≡ let (푥:휏 = 푡1) 푖푛 let (푏푠) 푖푛 푡2
where 푏 ≔ · | 푥:휏 = 푒, 푏 | 훼 = 휏, 푏

4.2 Metatheory
As before, we care most about two properties: soundness and completeness. Soundness ensures we learn correct programs from examples. Completeness ensures we learn all programs from examples.

We state the relevant theorems in this section. This time, there is no equivalence to typing. As a result, the proofs are more distinct. We omit the proofs here, but they are not controversial. Completeness and soundness proofs for similar relations exist in [Osera 2015].

Lemma 4.1 (Soundness of Learning). If Γ ⊢ 휏 [휒1, . . . , 휒푛] ⇝ 푒 then Γ ⊢ 푒 : 휏.

Lemma 4.2 (Completeness of Learning). If Γ ⊢ 푒 : 휏 and there exist some [휒1, . . . , 휒푛] which 푒 satisfies, then Γ ⊢ 휏 [휒1, . . . , 휒푛] ⇝ 푒.

Learning from examples maintains strong metatheoretic properties. As a result, it forms the basis for a powerful theory for learning programs from examples. However, the relation is non-deterministic, and does not directly translate to an algorithm. Nevertheless, it does provide a blueprint for what that algorithm looks like—as we saw when learning polymorphic identity. In the next sections we explore the strength of this blueprint by implementation.

5 Implementation
Throughout this section we discuss the transition from theory to practice. In doing so, we answer two questions:

• Does our learning relation yield an algorithm for learning programs from examples? The answer is yes.
• Can we lift learning in System F to learn in higher-level languages? The answer is also yes, but complicated.

5.1 Language: System F + Sugar
Our implementation is System F with sugared Haskell-like syntax for features of modern functional languages: datatype declarations, function declarations, case analysis, and let bindings. The only new syntactic form we introduce is for specifying learning problems. Otherwise, we desugar, typecheck, learn, and run programs in System F (and in that order).

We use standard methods for desugaring these features to System F. Algebraic data types and case analysis are the most involved. Algebraic data are encoded as folds, and case analysis is syntactic sugar for providing the requisite functions for the fold [Girard et al. 1989; Jansen 2013]. As a result, our case syntax deviates slightly from what you may be familiar with. We illustrate with example code from our implemented language.

data Bool = True
          | False

data Nat = Zero
         | S Nat

isZero :: Nat -> Bool
isZero n = case n [Bool] of
  Zero -> True
  S n2 -> False

There are two key distinctions from the typical case syntax. The type application on the case argument, which specifies the return type of the case analysis. And the lack of a recursive function call in the case analysis. For inductive data like natural numbers, this recursive call is typical. Due to our encoding scheme, it's not necessary to recursively call the function. Instead, case analysis is syntactic sugar for defining the functions we use to fold over algebraic data. Under the hood, that case analysis looks like the following code snippet.

lam n:Nat. n [Bool]
  (lam _:Unit. True)
  (lam n2:Bool. False)
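For intuition, here is a Haskell sketch of the fold encoding (ours, assuming GHC's RankNTypes; the implementation's actual encoding may differ in detail). The S branch of a case receives the already-folded result, which is why no recursive call is needed.

{-# LANGUAGE RankNTypes #-}

-- A sketch (ours) of naturals as folds: a number is the function that folds
-- over itself, taking one handler per constructor.
newtype Nat = Nat (forall r. (() -> r) -> (r -> r) -> r)

zero :: Nat
zero = Nat (\z _ -> z ())

suc :: Nat -> Nat
suc (Nat n) = Nat (\z s -> s (n z s))

-- `case n [Bool] of { Zero -> True; S n2 -> False }` desugars to applying
-- the two handlers; no explicit recursion appears.
isZero :: Nat -> Bool
isZero (Nat n) = n (\_ -> True) (\_n2 -> False)

-- In the S handler, the argument is the fold of the predecessor, which is
-- how `mul` in Figure 7 can use `plus b c` without a recursive call.
toInt :: Nat -> Int
toInt (Nat n) = n (\_ -> 0) (+ 1)

main :: IO ()
main = print (isZero (suc zero), toInt (suc (suc zero)))   -- (False, 2)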
Learning specifications are first-class entities in the language, so they can appear anywhere a term can. If we encounter a learning specification during evaluation, we learn the program and then continue evaluation. In practice, we use a special declaration for learning specifications, since writing examples inside of other programs can quickly obfuscate code. The declaration is a type signature followed by a list of examples.

data Bool = True
          | False

data Nat = Zero
         | S Nat

isZero :: Nat -> Bool
isZero n = case n [Bool] of
  Zero -> True
  S n2 -> False

and :: Bool -> Bool -> Bool
and = [, , , ]

bothZero :: Nat -> Nat -> Nat
bothZero = [, ]

When using these declarations for learning, the code above the declaration sets the context for learning. For example, when learning and, the user-defined helper function isZero will be in context, and so will the constructors for Bool and Nat. Interestingly, once we learn and it becomes part of the learning context for bothZero. This means we can set up a curriculum (a sequence) of learning problems. In practice, we find that setting up a curriculum leads to far quicker synthesis. For instance, the program for bothZero is much simpler if you already know and. Therefore it benefits to set up a curriculum, as we did, where you learn and first.

5.2 Learning from examples
Learning from examples has two stages. The first stage collects constraints from your examples. The second stage learns a term from the type which satisfies those constraints.

Our implementation is a faithful rendition of the first stage, captured by rules L-Wrld, L-EAbs, and L-ETAbs in Figure 5. The algorithm for learning only spends a marginal amount of time at this stage, and follows closely the informal description given in Section 4.1.

It's worth noting that the formal rules here are also fairly simple. Early iterations of this work sought to find a minimal basis with which to express similar ideas in the type-directed synthesis literature [Osera 2015]. We pare down the incidental details in this work, and reformulate similar mechanisms in concise notation. For this stage in particular, this lends itself towards a near direct translation to our implementation.

Lastly, a key distinction in this work is that examples describe any input-output relationship in the language. Most related works impose restrictions on examples that break this property. As a result, learning isn't a first-class activity in the language. Learning can't happen in arbitrary spots in a program, because you can't have functions in the output.
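Here is a rough Haskell sketch of that first stage (ours, with simplified stand-in types and terms; the real implementation works over full System F). It peels the inputs of one example off the goal type, in the spirit of L-ETAbs and L-EAbs, producing a possible world: recorded bindings, a residual goal type, and the expected output.

-- A sketch (ours) of stage one: turning one example into a possible world.
data Ty = TVar String | Arrow Ty Ty | Forall String Ty
  deriving Show

data Tm = Var String | IntLit Int   -- stand-in terms for the sketch
  deriving Show

-- An example is a list of inputs (terms or types) and an expected output,
-- mirroring chi ::= <e, chi> | <tau, chi> | <e, Nil>.
data Input   = ITm Tm | ITy Ty      deriving Show
data Example = Example [Input] Tm   deriving Show

-- One possible world: the bindings collected so far (the let bindings of
-- Section 4.1), the residual goal type, and the expected output.
data Binding = BindTy String Ty | BindTm String Ty Tm   deriving Show
data World   = World [Binding] Ty Tm                    deriving Show

-- Peel the example's inputs off the goal type, recording one binding per
-- input: a type binding for a forall (L-ETAbs), a term binding for an
-- arrow (L-EAbs).
world :: Ty -> Example -> Maybe World
world goal (Example ins out) = go goal ins [] (0 :: Int)
  where
    go ty []                        bs _ = Just (World (reverse bs) ty out)
    go (Forall a ty) (ITy t : rest) bs n = go ty rest (BindTy a t : bs) n
    go (Arrow ta tb) (ITm e : rest) bs n =
      go tb rest (BindTm ("x" ++ show n) ta e : bs) (n + 1)
    go _ _ _ _ = Nothing   -- the example does not fit the goal type

-- The world for learning polymorphic identity from <Nat, 1, 1>:
-- bindings (a = Nat, x0:a = 1), residual goal a, expected output 1.
main :: IO ()
main = print (world (Forall "a" (Arrow (TVar "a") (TVar "a")))
                    (Example [ITy (TVar "Nat"), ITm (IntLit 1)] (IntLit 1)))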
5.3 Learning from types
The second stage is learning from types. We have constraints to satisfy, and now need to search for a program that satisfies those constraints. The algorithm for learning spends almost all its time at this stage. This is unsurprising, given that search can be demanding. When learning non-trivial programs, it's always demanding.

The translation from the rules in Figure 4 to an algorithm is less direct than with learning from examples. So we list a set of design principles not inherent to the relation, but which are crucial in the design of its implicit algorithm.

• Enumerate programs by increasing size, where size is the number of nodes in the program's abstract syntax tree.

• Enumerate programs in normal form. An insightful analysis from [Osera 2015] describes how this dramatically reduces the space of programs.
• When instantiating a polymorphic type, only do so from types in the context.

These are the crux of the design commitments we make which deviate from the relation in Figure 4. For the first two points, [Osera 2015] is a treatise for implementing relations like Figure 4. For a more comprehensive tutorial on those design commitments we defer the interested there. Instead, we focus on the last point.

Unrestricted polymorphism is profoundly expressive [Girard et al. 1989]. And this makes learning hard. Because if you enrich your language with polymorphism, suddenly there are many more programs which inhabit a particular type. The last commitment is a way of taming the combinatorial explosion inherent to polymorphism. Similar to type inference, learning is only possible in System F if we make some concessions. And this is one possible concession that results in huge improvements in performance on programs which manipulate algebraic data.

The rules (L-App) and (L-TApp) are what trigger the combinatorial explosion. In (L-App), if I'm trying to learn a program for type 휏2 then one possibility is an application of two programs. To find that application, I have to learn a program of type 휏1 → 휏2. But I don't have the type 휏1 on hand, and have to consider all possible types leading to 휏2. With polymorphic types in the context, there can be many ways to arrive at 휏2. For example, with ∀푋.푋 → 휏2 in the context there are many options: 퐵표표푙 → 휏2, 푁푎푡 → 휏2, (퐵표표푙 → 퐵표표푙) → 휏2, etc.

The inception of the heuristic came when looking at the System F encoding of case analyses over algebraic data. For many useful programs, the type application in the encoding is always at a type in the context. These type applications, like the ones in previous examples, specify the return type. And generally (but not always), the return type of a case analysis is a type that exists as a binding in the context.
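To make these principles concrete, here is a small Haskell sketch (ours, not the paper's implementation) of the search for the simply typed fragment: it enumerates normal forms by increasing size, and it draws candidate argument types for applications only from types occurring in the context, a monomorphic stand-in for the restriction on instantiating polymorphic types. Type abstraction and type application (L-TAbs, L-TApp) are omitted for brevity.

import Data.List (nub)

-- A sketch (ours) of stage two for the simply typed fragment.
data Ty   = TVar String | Arrow Ty Ty                         deriving (Eq, Show)
data Term = Var String | Lam String Ty Term | App Term Term   deriving Show

type Ctx = [(String, Ty)]

-- Normal forms of a given type and exact size (L-Var, L-Abs, L-App).
normals :: Ctx -> Ty -> Int -> [Term]
normals ctx ty n =
  neutrals ctx ty n ++
  case ty of
    Arrow a b | n > 1 ->
      [ Lam x a body | let x = fresh ctx, body <- normals ((x, a) : ctx) b (n - 1) ]
    _ -> []

-- Neutral terms: a variable applied to zero or more normal arguments.
neutrals :: Ctx -> Ty -> Int -> [Term]
neutrals ctx ty 1 = [ Var x | (x, ty') <- ctx, ty' == ty ]
neutrals ctx ty n | n > 1 =
  [ App f a
  | arg <- argTypes ctx                  -- argument types only from the context
  , nf  <- [1 .. n - 2]
  , f   <- neutrals ctx (Arrow arg ty) nf
  , a   <- normals ctx arg (n - 1 - nf) ]
neutrals _ _ _ = []

-- Types appearing in the context (including their subterms).
argTypes :: Ctx -> [Ty]
argTypes ctx = nub [ t | (_, ty) <- ctx, t <- parts ty ]
  where parts t@(Arrow a b) = t : parts a ++ parts b
        parts t             = [t]

fresh :: Ctx -> String
fresh ctx = "x" ++ show (length ctx)

-- Enumerate by increasing size; a small fixed bound keeps the sketch finite.
learn :: Ctx -> Ty -> [Term]
learn ctx ty = concat [ normals ctx ty n | n <- [1 .. 8] ]

main :: IO ()
main = mapM_ print (take 3 (learn ctx (Arrow (TVar "A") (TVar "A"))))
  where ctx = [("f", Arrow (TVar "A") (TVar "A"))]
  -- prints f, \x1.x1, and \x1. f x1, smallest candidates first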
5.4 Strengths/Weaknesses
The strengths of the implementation are in its ability to handle functions in the output, which enables the learning of a class of programs from examples not previously demonstrated. As testament, we show in the next section a slew of programs over algebraic types that we learn. In each case, the programs always return a function, because each program works over higher-order encodings of algebraic data.

Another strength is the emphasis on a functional core for learning. The language we built on top of the functional core is incidental, and we merely lift the machinery for synthesis to the high-level language. But we could have built other high-level languages all while maintaining the same functional core for learning. This departs from typical design strategies for learning, which embed the learning machinery in the high-level language. This forces language designers who want to implement learning in their own languages to adopt idiosyncrasies of other high-level languages to enable learning. System F has its own idiosyncrasies for sure, but these are amenable to the development of a broad swath of languages—each where we can lift the learning machinery.

A weakness of the current implementation is that learning complex programs is difficult. Since all our programs desugar to System F, they are larger than their sugared representation. In practice, learning succeeds (generally, not just in this work) when there are means to tame the combinatorial explosion in programs. As programs grow larger, the more likely you are to feel the explosion.

While our design strategy in lifting synthesis is quite distinct from related works, we still wanted to be able to learn similar programs. And we do, but performance suffers. This is in part because our System F encodings are much larger, but also because we don't make all the same optimizations as related works. Because of this, we view the implementation as successfully demonstrating a proof of concept: that we can lift learning in System F to higher-level languages to learn non-trivial programs. It sets the stage for further research in this direction, helping to carve what a functional core ought to be.

6 Experiments
The implementation comes with a benchmark suite, whose results we show in Figure 6. They are a variety of programs over algebraic data. Some are simple, and others are quite complex, like Figure 7: a curriculum for learning an interpreter for expressions in the natural number semiring. We now analyze these results, ending with some intuitions about when to expect learning to succeed and when to expect it to fail.

Generally, the language succeeds when algebraic data are simple and the folds required are easy to write in case syntax. If it's not easy to write the fold, then you can try and help by providing a curriculum. In many instances, this helps, like with nat_bothZero, but not always. In instances where curricula don't help, there's simply so much in context for learning that performance suffers. Or the shape of the algebraic data makes it such that it's easy to generate well-typed but useless programs. Both of these are the case with nexpr_eval (Figure 7). By the time learning happens at neval there is a lot in the learning context. Additionally, the shape of natural numbers is such that it's very easy to generate many terms of type Nat under any context. You can generate 1, 2, 3, etc. These candidates are generated despite the fact that constants aren't frequently used for the programs we write. Other algebraic data suffer from similar issues.

A path forward to resolving these bottlenecks is to extend the learning relation in Figure 5 to better exploit examples from algebraic data. In [Osera 2015], they develop clever

Name           Size   Time (s)   # Ex   # Help
bexpr_eval      139       2.50     15       7
bool_nand        58       1.08      4       2
bool_or          42       0.06      4       2
bool_and         42       0.03      4       2
bool_not         53       0.02      2       2
bool_xor         72       3.75      4       2
data_fun         37      21.21     13       8
nat_bothZero    192       0.28     10       6
nexpr_eval      230      53.93     10       7
fun_plus         60       0.01      7       6
nat_plus         68       0.05      3       4
church_like      43       9.10      3       2
not_fun2         77       0.02      4       3
not_fun          36       0.05      4       3
poly_id           4      0.002      3       2
poly_twice       12      0.004      3       3
twice            38      0.002      3       3

Figure 6. Performance on a benchmark suite. Breaks down the size of the learned System F program (which includes the size of provided components as well), the time to learn, the number of examples, and the number of helper functions: including constructors from datatype declarations, user-defined functions, and learned functions.

ways to use examples to reduce the combinatorial explosion of the program space. The trick would be to express the semantics of their learning formulation in terms of our functional core for learning, System F. Another path is by moving up the lambda cube [Barendregt 1992].

7 Related Work

7.1 Rosette
Rosette is a programming language which extends Racket with machinery for synthesis, or learning [Torlak and Bodik 2013]. It imbues the language with SMT solvers.¹ With them, it's possible to write programs with holes, called sketches [Solar-Lezama 2008], which an SMT solver can fill. If a sketch has holes amenable to reasoning via SMT, Rosette can be quite effective.

For instance, recent work shows Rosette can reverse-engineer parts of an ISA [Zorn et al. 2017]—the instruction set architecture of a CPU (central processing unit). From bit-level data, you can recover programs which govern the behavior of the CPU. This works because SMT solvers support reasoning about bit-vector arithmetic, the sort of arithmetic governing low-level behavior of CPUs [Kroening and Strichman 2016].

¹ Satisfiability modulo theories (SMT) is a spin on the satisfiability problem, where one checks whether a boolean formula is satisfied by a certain set of assignments to its variables [Barrett and Tinelli 2018].

7.2 Synquid
Synquid is a programming language which uses refinement types to learn programs [Polikarpova et al. 2016]. A refinement type lets you annotate programs with types containing logical predicates [Freeman 1994]. The logical predicates constrain the space of programs which inhabit that type, and are used to guide learning.

Consider learning with and without refinement types. I'm going to teach you a program. But all I'll say is that it's of type 푛푎푡 → 푛푎푡. As we saw in the introduction, the type alone isn't enough to constrain learning. With Synquid, instead of going the route of examples they go the route of enriching types. Enter refinement types, which let you learn as follows. We want you to learn a program. But all we say is that it's of type 푥:푛푎푡 → {푦:푛푎푡 | 푦 = 푥 + 1}. It takes a natural number 푥, and returns a natural number 푦 one greater than 푥. Any ideas? Perhaps a program which computes...

푓(푥) = 푥 + 1,  푓(푥) = 푥 + 2 − 1,  푓(푥) = 푥 + 3 − 2,  · · ·

The good news is that 푓(푥) = 푥 + 1 is correct. And so are all the other programs, as long as I'm agnostic to implementation details. Refinement types impose useful constraints on learning; that's the crux of Synquid.

7.3 Myth
Myth is a programming language which uses types and examples to learn programs [Osera 2015]. Types, like in Synquid, let you annotate programs by their behavior. But Myth doesn't use refinement types; its types aren't as expressive.

data Nat = Zero
         | S Nat

data NExpr = Val Nat
           | NAdd NExpr NExpr
           | NMul NExpr NExpr

plus :: Nat -> Nat -> Nat
plus = [, , <S Zero, Zero, S Zero>]

mul :: Nat -> Nat -> Nat
mul a b = case a [Nat] of
  Zero -> Zero
  S c  -> plus b c

neval :: NExpr -> Nat
neval = [, , , , , , ]

main :: Unit
main = neval

Figure 7. Learning an interpreter for arithmetic expressions in the natural number semiring. Multiplication provided by the user to speed up learning.
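For reference, here is a Haskell sketch (ours) of the semantics the learned neval is meant to have: interpreting expressions in the natural number semiring.

-- A Haskell sketch (ours) of the intended behavior of neval in Figure 7.
data Nat = Zero | S Nat deriving Show

data NExpr = Val Nat | NAdd NExpr NExpr | NMul NExpr NExpr

plus :: Nat -> Nat -> Nat
plus Zero  b = b
plus (S a) b = S (plus a b)

mul :: Nat -> Nat -> Nat
mul Zero  _ = Zero
mul (S a) b = plus b (mul a b)

neval :: NExpr -> Nat
neval (Val n)      = n
neval (NAdd e1 e2) = plus (neval e1) (neval e2)
neval (NMul e1 e2) = mul  (neval e1) (neval e2)

main :: IO ()
main = print (neval (NAdd (Val (S Zero)) (NMul (Val (S (S Zero))) (Val (S Zero)))))
-- 1 + (2 * 1) = 3, printed as S (S (S Zero))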

Instead of deferring to richer types, Myth offers examples to constrain learning. Examples are nice because they're often convenient to provide, and at times easier to provide than richer types. Think of the difference between teaching chess by explaining all its principles, versus teaching chess by playing. In practice, a mix of both tends to do the trick: formally specifying the rules, but also letting example play guide learning, as when my uncle taught me.

This work was most directly inspired by Myth, and we borrow and refine many ideas therein. Of special curiosity, though it isn't a focus of Myth, is the capability of learning as a first-class activity in languages—which we speculate about further in the conclusion.

8 Conclusion
In the 60's, Peter Landin published The next 700 programming languages [Landin 1966]. He noticed a trend at the time, which was that everyone was designing different languages for solving different problems. Not because they had to, but because it was the only obvious design strategy. As a result, the idiosyncrasies of the languages became intertwined with the idiosyncrasies of the problems they were trying to solve.

So he formulated ISWIM, a functional core language with which you could express other languages. And the idea persists to this day. Modern functional languages typically sit on top of a functional core. We don't need fundamentally new languages for every programming task.

It's a matter of empirical investigation, but the most important idea in this paper is whether we need fundamentally new languages for every learning task. As it stands, the proliferation of learning tools (Rosette, Synquid, Myth) means the proliferation of languages. But could there not be a functional core with which to express other learning mechanisms? We tackle this question with the following contributions:

• (On theory) A simple description of learning from examples as a relation.
• (On practice) An implementation of learning from examples in System F.
• (On practice) An implementation of a high-level language, which lifts learning from System F into itself.

This work steps in a different direction of the design space, and while there's clearly plenty left to be desired, we see plenty of trail ahead.

References

Henk P. Barendregt. 1992. Lambda calculi with types. (1992).
Clark Barrett and Cesare Tinelli. 2018. Satisfiability modulo theories. In Handbook of Model Checking. Springer, 305–343.
Tim Freeman. 1994. Refinement Types for ML. Technical Report CMU-CS-94-110. Carnegie Mellon University, Department of Computer Science.
Jean-Yves Girard, Paul Taylor, and Yves Lafont. 1989. Proofs and Types. Vol. 7.
Jan Martin Jansen. 2013. Programming in the λ-calculus: From Church to Scott and back. In The Beauty of Functional Code. Springer, 168–180.
Daniel Kroening and Ofer Strichman. 2016. Decision Procedures. Springer.
Peter J. Landin. 1966. The next 700 programming languages. Commun. ACM 9, 3 (1966), 157–166.
Peter-Michael Santos Osera. 2015. Program Synthesis with Types. PhD Thesis. University of Pennsylvania, Department of Computer Science.
Benjamin C. Pierce. 2002. Types and Programming Languages. MIT Press.
Nadia Polikarpova, Ivan Kuraj, and Armando Solar-Lezama. 2016. Program synthesis from polymorphic refinement types. In ACM SIGPLAN Notices, Vol. 51. ACM, 522–538.
Michael Sipser et al. 2006. Introduction to the Theory of Computation. Vol. 2. Thomson Course Technology, Boston.
Armando Solar-Lezama. 2008. Program Synthesis by Sketching. Citeseer.
Emina Torlak and Rastislav Bodik. 2013. Growing solver-aided languages with Rosette. In Proceedings of the 2013 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming & Software. ACM, 135–152.
Bill Zorn, Dan Grossman, and Luis Ceze. 2017. Solver Aided Reverse Engineering of Architectural. In Proceedings of the 44th Annual International Symposium on Computer Architecture.