UCAM-CL-TR-618 Technical Report ISSN 1476-2986

Number 618

Computer Laboratory

The Fresh Approach: functional programming with names and binders

Mark R. Shinwell

February 2005

15 JJ Thomson Avenue
Cambridge CB3 0FD
United Kingdom
phone +44 1223 763500
http://www.cl.cam.ac.uk/

© 2005 Mark R. Shinwell

This technical report is based on a dissertation submitted December 2004 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Queens’ College.

Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet:

http://www.cl.cam.ac.uk/TechReports/

ISSN 1476-2986

Abstract

This report concerns the development of a language called Fresh Objective Caml, which is an extension of the Objective Caml language providing facilities for the manipulation of data structures representing syntax involving α-convertible names and binding operations. After an introductory chapter which includes a survey of related work, we describe the Fresh Objective Caml language in detail. Next, we proceed to formalise a small core language which captures the essence of Fresh Objective Caml; we call this Mini-FreshML. We provide two varieties of operational semantics for this language and prove them equivalent. Then in order to prove correctness properties of representations of syntax in the language we introduce a new variety of domain theory called FM-domain theory, based on the permutation model of name binding from Pitts and Gabbay. We show how classical domain-theoretic constructions— including those for the solution of recursive domain equations—fall naturally into this setting, where they are augmented by new constructions to handle name-binding. After developing the necessary domain theory, we demonstrate how it may be exploited to give a monadic denotational semantics to Mini-FreshML. This semantics in itself is quite novel and demonstrates how a simple monad of continuations is sufficient to model dynamic allocation of names. We prove that our denotational semantics is computationally adequate with respect to the operational semantics—in other words, equality of denotation implies observational equivalence. After this, we show how the denotational semantics may be used to prove our desired correctness properties. In the penultimate chapter, we examine the implementation of Fresh Objective Caml, describing detailed issues in the compiler and runtime systems. Then in the final chapter we close the report with a discussion of future avenues of research and an assessment of the work completed so far.

Contents

1 Introduction 7
  1.1 Other approaches to handle binding 9
    1.1.1 de Bruijn indices 9
    1.1.2 Threading of name states 10
    1.1.3 Higher-order abstract syntax 10
  1.2 Other related work 11
  1.3 Report structure 12

2 Fresh Objective Caml 13
  2.1 Representation of object language names 13
    2.1.1 Bindable names: typing 13
    2.1.2 Bindable names: creation 14
  2.2 Representation of object language binding 14
    2.2.1 Abstraction values: construction 15
    2.2.2 Abstraction values: typing 17
    2.2.3 Abstraction values: deconstruction 17
    2.2.4 Abstraction values: two sides of the same coin 19
    2.2.5 Abstraction values: equality testing 20
  2.3 Explicit atom-swapping 21
  2.4 Fresh-for test 21
  2.5 Something missing? 21
  2.6 Reference cells 22
  2.7 A full example 22
  2.8 Pretty-printing 22

3 Mini-FreshML, operationally 25
  3.1 Syntax 25
  3.2 Static semantics 26
  3.3 Dynamic semantics via big-step 27
  3.4 Dynamic semantics via frame stacks 30
    3.4.1 A termination relation 30
    3.4.2 Relationship to the big-step semantics 32
  3.5 Environment style semantics 34
  3.6 Notions of observational equivalence 35
  3.7 Correctness for Mini-FreshML 36

4 A domain theory for names 39
  4.1 FM-sets, FM-cpos and FM-cppos 39
  4.2 Some FM-cpos and their construction 41
    4.2.1 Atoms, lifting, products and sums 41
    4.2.2 Functions and function spaces 42
    4.2.3 Abstraction FM-cppos 44
    4.2.4 Some curiosities 47
  4.3 Fixed points 47
  4.4 Categorical constructions 48
  4.5 Solution of recursive equations on FM-cppos 50
  4.6 FM-sets of syntax 54


5 Mini-FreshML, denotationally 55
  5.1 An overview 55
    5.1.1 Dynamic allocation monads 56
  5.2 Definition of the denotational semantics 57
    5.2.1 Denotation of types 57
    5.2.2 Denotation of expressions and frame stacks 58
  5.3 Some properties of the semantics 61
    5.3.1 Support and equivariance properties 61
    5.3.2 Substitutivity properties 62
  5.4 Computational adequacy 62
    5.4.1 Construction of the logical relations 62
    5.4.2 Relational structures 63
    5.4.3 Properties of the logical relations 68
    5.4.4 Completing the proof 75
  5.5 The road to equivalence 77
  5.6 Algebraic identities 79
  5.7 Correctness of representation 80
    5.7.1 Background theory 80
    5.7.2 Correctness results 83

6 Implementation 87
  6.1 Library, or bespoke system? 87
  6.2 System overview 88
  6.3 Creation of fresh names 88
  6.4 Creation of abstraction values 89
  6.5 Pattern-matching on abstraction values 89
  6.6 Implementation of swapping 90
    6.6.1 When to swap? 90
    6.6.2 How to swap? 92
  6.7 Preservation of sharing 93
  6.8 Freshness inference 94
    6.8.1 The motivation for a static analysis 94
    6.8.2 Static semantics with freshness inference 95
    6.8.3 Purity analysis 98
    6.8.4 A note on denotational semantics 98
    6.8.5 An abstraction monad 99

7 Conclusions and future work 101
  7.1 Future work 101
    7.1.1 Delayed permutations 101
    7.1.2 Data structures and algorithms 103
    7.1.3 Objects and modules 104
    7.1.4 Reference cells 104
    7.1.5 Enhanced abstraction types 105
    7.1.6 Standard library enhancements 106
    7.1.7 Denotational semantics 106

Bibliography 109

1 Introduction

‘Imagination is more important than knowledge.’ —Einstein

THIS REPORT is about two things: names and metaprogramming. Names are ubiquitous in the field of computer science and throughout today’s networked world. Important elements of computer systems such as files, disks, machines and networks are all identified by names. Programs and machines often communicate over virtual ‘channels’ which are identified by names[34]. When writing programs, use is made of variables, file handles, process identifiers and many other sorts of name in disguise.

All of these names may be divided[39, 35] into two varieties: pure names and impure names. A name which is said to be pure is just a handle on something. Given a pure name, we can inspect what it is referring to and check whether it is the same name as another; usually that is all, and no further operations are permitted. In contrast, an impure name has some structure within the name itself which we can examine. An example might be the electronic mail address [email protected] or the Internet address of a computer, say 131.111.8.42. In each of these cases there is some user-visible structure to the name—in these examples, the structure is defined by international standards. Such names still justify their designation as names because they refer to something: a mailbox and a computer respectively.

This report deals with the use of pure names in metaprogramming systems. Metaprogramming is the art of constructing software which manipulates the syntactic structures of other programs. There are many large-scale examples in common use today: compilers, interpreters and theorem provers are good examples. Metaprogramming can be homogeneous, meaning that the metaprogram is written in the same language as the program being manipulated, or heterogeneous, meaning that these languages may differ. Sheard[58] provides a good survey of these and other issues in metaprogramming.

We shall be particularly interested in possibly-heterogeneous metaprogramming in functional languages such as Haskell, Standard ML, Objective Caml and so forth. These languages are particularly good for metaprogramming due to a few salient features, of which the following two are arguably the most important.

• User-defined datatypes and pattern-matching. It is straightforward to represent the syntax of some target language (known as the object language) by defining a datatype to represent the various shapes of syntactic structure which occur in that language. Values of such datatypes are created by parsing a program’s concrete syntax into abstract syntax[30]. A portion of abstract syntax will likely be tagged to say what it is (a variable, a conditional statement, etc.) and it may well contain smaller syntactic entities within it. Functional programming languages make it easy to construct values representing abstract syntax and then pick them apart again using pattern-matching. These features tend to save a large amount of tedious coding and help with the readability of metaprograms.

• Definition by structural recursion. Functional languages permit functions over syntax to be defined by structural recursion[8]. Typically a program will make use of such recursion to manipulate abstract syntax trees by traversing the various structures and performing different actions depending on the type of entity encountered along the way. Many such transformations can be captured using the idea of a catamorphism[31], a generalisation of the traditional foldl and foldr functions over lists. A short sketch illustrating both features follows.
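The following is a minimal sketch in plain OCaml (our own illustration, not drawn from Fresh O’Caml; the tm type and the size function are invented names): a datatype for λ-calculus abstract syntax with textual names, and a function over it defined by structural recursion.

(* A toy object language of lambda-terms, with binders represented naively
   by strings; the point is only to show datatype declaration and
   structural recursion, not a proper treatment of binding. *)
type tm =
  | Var of string
  | Lam of string * tm
  | App of tm * tm

(* The size of a term, computed by structural recursion over its shape. *)
let rec size = function
  | Var _ -> 1
  | Lam (_, body) -> 1 + size body
  | App (t1, t2) -> 1 + size t1 + size t2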
The writing of program-manipulating programs entails the writing of name-manipulating programs. Otherwise, we would have to restrict our attention to singularly uninteresting languages involving no notion of ‘name’ whatsoever. A prime concern in metalanguage design should therefore be the provision of suitable facilities for manipulating names. On the surface this might not seem a difficult problem: after all, names at first sight do not appear to be

particularly complicated things. In fact, they are not—it is the operations in which they are involved which give rise to the complexities. The specific complications which we shall deal with in this report are those arising from the binding of names by language constructs. Here are three short examples—one from mathematics, one from Objective Caml and one from the λ-calculus[5]—which illustrate binding operations.

∫₀¹ f(x, y) dx          an integral
let x = 42 in x ∗ x     local declaration
λx. λy. x y             function abstraction

Name binding is as ubiquitous as names are themselves and poses significant problems for the representation of object language syntax. The nub of the problem is that in pencil-and-paper practice—when designing syntax-manipulating algorithms for the object language or perhaps proving theorems about them—we often identify bound names up to α-conversion. For example, λx. λy. x y is α-convertible to both λx. λz. x z and λy. λx. y x. This process of renaming variables to produce α-variants of terms is often essential during the manipulation of syntax where terms may be substituted into each other, or in other scenarios where there is a possibility of name clashes. A classic example is the algorithm of capture-avoiding substitution which is used in the definition of β-reduction for the λ-calculus. Free-and-easy application of α-conversion during pencil-and-paper work captures the idea that the particular bound names used should not somehow matter, so long as they are kept suitably distinct from each other. This convention that free and bound names should always be kept disjoint and distinct from each other is often known as the ‘Barendregt variable convention’[5]. The mechanisation of algorithms which involve the manipulation of syntax with binders is made difficult and time-consuming by the necessity to incorporate code to handle these scenarios.

In this report, we are concerned with extending a functional language (Objective Caml) with core features which assist these metaprogramming tasks. The resulting language, Fresh O’Caml, provides for the generation of fresh names which we can use to represent object language names which may be involved in binding operations: such names will be referred to as bindable names. It also assists us with the representation of object language binding constructs by providing facilities to package up syntax trees using abstraction expressions. When deconstructing the values of such expressions, the runtime system ensures that the names inside are freshened up accordingly to avoid clashes. Additionally, some extra utility constructs are provided to increase the expressiveness of the language.

Fresh O’Caml is the successor to the FreshML language[63, 49, 18]—a design which yielded an interpretive system accepting a syntax based on the Core language of Standard ML with added features for manipulating names. Early versions of the language, now referred to as FreshML-2000, used a complicated static type system[49] which was thought to be necessary to enforce certain correctness properties. (We give a detailed discussion of this type system in §6.8.) Later versions used a much-simplified type system, which gave rise to more expressive versions of the language and finally led to the development of Fresh O’Caml and the subsequent deprecation of the FreshML system. The FreshML interpretive system is now unsupported.

What properties are desirable of a name-manipulating solution for a metaprogramming language? We believe the following aspects are important. For each, we give a brief description of how Fresh O’Caml addresses the issue.

• Simplicity. The solution should provide the essential functionality—the generation of fresh names together with the construction and deconstruction of representations of object language syntax with binding—in the most straightforward way possible. This assists the programmer who has to use the facilities and the theorist who wishes to prove properties about them. In Fresh Objective Caml we provide fairly general, first-order support for representing binding operations. The new facilities are straightforward to understand and to use.

• Correctness. The additional facilities should have a solid mathematical foundation, together with the necessary proofs, to make it convincing to a user that the representations of syntax which they are using are in some sense correct. In this report, we present a full proof of some correctness properties of a core language Mini-FreshML which captures the essence of Fresh O’Caml—showing how contextual equivalence classes1 of values and expressions correspond in a certain way to α-equivalence classes of object language syntax.

• Readability. Programs written using the new solution should bear a good correspondence to the underlying algorithms. Firstly, there are the usual software engineering arguments, which we shall not delve into here. Secondly, the sort of algorithms often encoded as metaprograms are typically developed on paper, often with some good theoretical foundation. It is a desirable property of a metalanguage that such algorithms can be encoded so that the resulting metaprogram bears a good correspondence to the original statement of the algorithm. As will be seen in the next chapter, Fresh O’Caml programs encoding syntax-manipulating algorithms often bear a close resemblance to descriptions of the algorithms themselves.

• Integration. New language constructs introduced by the solution should integrate well with the existing language features, for example unbounded fixpoint recursion. It is also desirable that the new language constructs should fit into the existing typing system of the programming language in order to ease implementation.

• Efficiency. The solution should be designed in such a way as to permit efficient implementation. At the current time, the Fresh O’Caml implementation is at a sufficiently early stage to still be lacking in this area; however, later in this report we discuss the next avenues to be taken in order to improve this.

This report provides a detailed description and discussion of Fresh O’Caml, along the way encompassing a whole spectrum of computer science. For the pragmatist there is detailed discussion of the language itself—ranging from a user’s perspective through to nitty-gritty details of the runtime system implemented in C. For the theorist, we present a variety of domain theory which is good for reasoning about names and name binding. This forms the basis for a denotational semantics that is sufficiently powerful to establish some strong correctness results about our language—and is also interesting in its own right. But we must emphasise that we are strictly concentrating on functional programming: whilst our work may well be of use in other areas such as logic programming, we do not wish to embark on a moral crusade suggesting that our approach to name binding is universally applicable.

1.1 Other approaches to handle binding

Approaches to the problem of name binding in metaprogramming languages may be partitioned into two: first-order and higher-order. These designations correspond to the varieties of metalanguage constructs which are used to represent object language binding constructs. The Fresh approach described in this report is first-order: object language binding constructs are not represented by higher-order functions. Other first-order schemes include both simple ones using textual representations of names and those based on de Bruijn indices. In contrast, we have the various approaches based on higher-order abstract syntax which all exploit the function abstraction constructs of the metalanguage in order to represent object-level binding. Let us examine these differing approaches in more detail.

1.1.1 de Bruijn indices

An oft-cited means of tackling the problems of name binding is the use of de Bruijn indices[11, 12, 17] in abstract syntax trees. This is a ‘nameless’ way of representing object-level syntax via the use of numbered binders: rather than a language extended wholesale with new constructs, it is simply a scheme of representation which can be used with a standard metaprogramming language. It concentrates on the shape of the binding structure rather than on particular names. Instead of placing a bound name in a syntax tree, we place an integer which identifies how many levels of binders we should skip back over in order to find the one which binds at the point of interest. For example, we could represent the term λx. λy. y x by the nameless term λ. λ. 0 1. Note how names on the binders themselves become redundant.
Whilst de Bruijn indices do indeed capture the idea that ‘actual names of bound variables should not matter’, they are inconvenient in practice. Code which manipulates nameless representations is likely to be difficult to understand and maintain. One particular problem is

1 Two expressions are said to be contextually equivalent if they may be freely exchanged within the source text of some enclosing program without affecting the program’s observable results.

that replacing one subterm with another inside some larger term may in general result in the de Bruijn indices having to be adjusted. Consider for example the following code fragment, where the ellipsis stands for some large function body.

let f = fun x -> ... in
let g = fun y -> y + (f y) in
fun z -> (f, g)

We might wish to perform a transformation on this code fragment to inline the function g, yielding the following.

let f = fun x -> ... in
fun z -> (f, fun y -> y + (f y))

Let us consider using de Bruijn indices to represent these pieces of syntax. Assuming that ‘let’ is a non-recursive binding construct, then a representation of the first fragment above could look like the following.

let _ = fun _ -> ... in
let _ = fun _ -> 0 + (1 0) in
fun _ -> (2, 1)

Applying the inlining transformation, we then obtain the following.

let _ = fun _ -> ... in
fun _ -> (1, fun _ -> 0 + (2 0))

Note that two extra things have had to happen here: firstly, the removal of the binding for g means that the de Bruijn index in the first component of the pair needs adjusting. A similar thing happens in the second component, where the inlining of the body of g has caused it to sit in a different place with respect to the various binders. This simple example shows how a ‘nameless’ style of programming can be difficult to work with. Fresh O’Caml provides a ‘nameful’ style, where the programmer can gain access to concrete names whilst still having the reassurances provided by capture-avoidance and freshness properties.
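The index adjustment just described is the familiar ‘shifting’ operation. The following plain OCaml sketch (our own, illustrative only; the constructor and function names are not Fresh O’Caml’s) shows the kind of bookkeeping a nameless representation forces on the programmer.

(* Nameless lambda-terms in the style of de Bruijn. *)
type db =
  | Idx of int          (* a bound variable: how many binders to skip outwards *)
  | Lam of db           (* a binder carries no name *)
  | App of db * db

(* Shift by [d] every index that points outside the innermost [cutoff]
   binders; needed whenever a subterm is moved under different binders. *)
let rec shift d cutoff = function
  | Idx k -> if k < cutoff then Idx k else Idx (k + d)
  | Lam body -> Lam (shift d (cutoff + 1) body)
  | App (t1, t2) -> App (shift d cutoff t1, shift d cutoff t2)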

1.1.2 Threading of name states

Peyton-Jones and Marlow[41] present a good overview of the problems surrounding name binding in the Glasgow Haskell Compiler inliner. Early versions of that compiler blindly renamed every bound variable to a fresh one in order to avoid problems of name capture. However, not only is this inefficient due to the gratuitous ‘churning’ of names, as those authors put it, but it is also cumbersome due to the lack of side-effecting constructs in Haskell: some kind of name state must be threaded through all algorithms dealing with the generation of fresh names. The authors describe how more modern versions of the compiler cope with this by adapting algorithms such as those for capture-avoiding substitution so that they take note of all names currently in scope. This enables the generation of fresh names to be minimised and there is no gratuitous renaming. When a fresh name is required, a probabilistic approach is used to (hopefully rapidly) find a suitable one.
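To make the ‘threading’ point concrete, here is a small sketch (our own, in OCaml rather than Haskell, and not taken from GHC) of passing a name supply through a program explicitly: every function that might need a fresh name must accept and return the supply.

(* A name supply is just a counter; fresh_name returns a new name together
   with the updated supply, which the caller must remember to pass on. *)
type supply = int

let fresh_name (s : supply) : string * supply =
  ("x" ^ string_of_int s, s + 1)

(* A function needing two fresh names has to thread the supply by hand. *)
let two_fresh s =
  let (n1, s1) = fresh_name s in
  let (n2, s2) = fresh_name s1 in
  ((n1, n2), s2)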

1.1.3 Higher-order abstract syntax

Unlike Fresh O’Caml, higher-order abstract syntax (HOAS)[43, 24] exploits the function abstraction constructs of the metalanguage to represent object-level binding, yielding another ‘nameless’ style of representation. As an example, a HOAS representation of abstract syntax trees of the λ-calculus could be written as

type lam = Lam of (lam -> lam) | App of lam * lam;;

Whilst such an embedding at first seems to be rather elegant, it does not fit well in the framework of a general-purpose functional programming language. The good points about it are that capture-avoidance—and indeed object-level substitution—come for free, by virtue of the properties of meta-level function application. (Having said that, an argument that ‘object-level substitution for free’ is highly desirable does not really hold its ground for a language such as Fresh O’Caml, where such operations are easily defined. See for example our code for capture-avoiding substitution in §2.2.3: it is straightforward and fast to write.)
The serious concern about such declarations, however, is the contravariant occurrence of the type lam. This means that the datatype is a mixed-variance inductive definition, precluding the construction of functions over it by the usual recursive techniques. For example,

when specifying a transformation over such a datatype using a catamorphism (say to pretty-print terms using a function of type lam -> string) then one must also provide the inverse anamorphism (in this case a function of type string -> lam, which amounts to writing a parser). Moreover, there are sufficiently many elements of the function space lam -> lam that many of them, when encapsulated inside lam’s constructors, do not correspond to legal abstract syntax trees of the object language. Those that do not are the so-called ‘exotic terms’; for example:

let exotic = Lam (fun t -> match t with Lam _ -> t | App _ -> App (t, t));;

Whilst this is a typeable expression of type lam, it does not correspond to the abstract syntax of any λ-term.
Another unfortunate property of HOAS representations is that since a value representing a term may contain computations at the binding points, side effects will be delayed until the term is deconstructed via function applications. This may not be in keeping with a strict metalanguage such as Objective Caml.
In an attempt to deal with the problem of exotic terms there are type systems[15] to restrict the values which may be passed as the arguments to constructors such as Lam. These work by using the observation that so long as the types of such constructors only admit suitably parametric functions[69], then no exotic terms arise. However, such restrictions produce a technically-complex framework which still has inadequacies—as seen in the conclusion of [15], where the authors note that an algorithm implementing a simple structural equality test is not definable in their system.
Another alternative[14] is to adopt a ‘weak’ rather than the ‘strong’ HOAS embedding which we saw above. This yields type declarations such as the following:

type lam = Lam of (var -> lam) | App of lam * lam;;

for some type var of object language identifiers. This produces normal, positive inductive definitions at the expense of doing away with object-level substitution for free. We do, however, still retain the capture-avoidance properties which we desire. All begins to look rosy, but there is a further difficulty: how does one define the type var? This needs to be suitably abstract, in some sense, so that exotic terms (involving functions of type var -> lam) cannot be created by case distinction on values of type var. Suitable types for var can be defined, but this abstractness means that one cannot define the algorithm for capture-avoiding substitution, for example.
It still seems to be the case that there exists no single, integrated solution based on higher-order abstract syntax that provides a full-scale name-manipulating metalanguage along the lines of Fresh O’Caml. Previous work by Miller[33] on an ML-style language MLλ equipped with function types τ => τ′ goes some way to remedy the situation, but again becomes somewhat clumsy when it comes to defining functions by structural induction over abstract syntax. Bindlib[53] provides a framework for manipulating abstract syntax with binders in O’Caml, but it is not fully automatic since user-defined functions must be written for term construction. Another take on the subject is the more recent work of Washburn and Weirich[70], who (in addition to a nice survey of some of these problems) have shown how one can enforce parametricity restrictions equivalent to those of [15] by just using first-class polymorphism in Haskell. However, as those authors identify, their approach still fails to provide any direct access to concrete names. So whilst HOAS has had nice applications, for example in the areas of logic programming and theorem proving, the first-order approach presented in this report does appear to be more suited to our target application area.

1.2 Other related work

In this section we provide a brief summary of some other work which is related in one way or another to our own. Much of this revolves around the underlying permutation model of name binding in FM-set theory[20, 18] upon which the FM-domain theory of Chapter 4 is based. This is not an exhaustive list: other related work is cited later in the report where it fits into the main flow of the text.

• Trees with hidden labels. The work by Cardelli et al.[9] on manipulating trees with hidden labels must be singled out as having directly affected the course of the FreshML (and thereafter Fresh O’Caml) language design. For it was this work which led to the realisation that it was not necessary to worry about certain side-effects of name generation which arise during the deconstruction of values representing object language binding constructs. This led to the deprecation of a complicated system of static analysis (discussed at length in §6.8).

• Nominal unification. As we discussed in §1.1.3, there is a distinct split between Fresh O’Caml and HOAS-based approaches by virtue of Fresh O’Caml representing binding using first-order constructs. This same division between first-order and higher-order approaches is apparent in the work[67] by Urban et al. on nominal unification. In that setting, the problem is to take two object-level terms involving binding and determine whether there exists some substitution of terms for variables which makes the two original terms α-equivalent. Those authors’ work is again based on the permutation model of name binding, involving access to concrete names rather than using higher-order representations, and therefore draws on much of the same theory as ours. The work shows how to produce a decidable, most-general unification algorithm which operates on object-level terms involving binding—and this is done without much of the awkwardness associated with traditional higher-order unification problems.

• Logic programming. The work on nominal unification has led to a series of papers on α-Prolog[10], an evolving variant of Prolog which draws on the permutation model of name binding in order to manipulate abstract syntax. It will be interesting to see in the future if that model can be made to fit a logic programming setting as well as it fits a functional one.

• Theorem-proving. The implementation of theorem-proving technologies is an important application of metaprogramming. The doctoral thesis of Gabbay[18], which presents the full details of FM-set theory together with a discussion of an early version of FreshML, includes a formalisation of FM-set theory inside the Isabelle proof checker. Further discussion is available elsewhere[19].

• Game semantics using FM-sets. Abramsky et al. have produced recent work[2] which describes a fully-abstract model of the nu-calculus[65, 50, 66], based in a setting of game semantics. They exploit one of the key properties of FM-set theory just as we do in this report: it enables all dependencies on parameterising name sets to be made implicit. The work is of particular interest since the nu-calculus is closely related to Mini-FreshML (described in Chapter 3), albeit lacking in recursion.

• Term rewriting systems. Work by Gabbay et al.[16] has shown how both first-order and higher-order term rewriting systems may be reformulated into a first-order approach based on FM-set theory. As for Fresh O’Caml and the work on nominal unification, a ‘nameful’ approach is adopted where bound entities are explicitly named; the resulting rewriting systems respect α-conversion of terms in the correct manner.

• Name management calculi. Recent work by Moggi and Ancona[4] presents a monadic calculus for name management using ideas from FM-set theory.

1.3 Report structure

This report consists of seven chapters, of which this is the first. In the next chapter, we describe the extensions to Objective Caml which constitute the Fresh O’Caml language. Chapter 3 then proceeds to formalise a small language which provides the core features of Fresh O’Caml. Two styles of operational semantics, which we prove equivalent, are provided to explain the evaluation strategy of the language. In Chapter 4, we lay the groundwork for a theoretical treatment by providing a variant of classical domain theory specially designed for modelling names. Chapter 5 presents a denotational semantics using this domain theory. It then proceeds to derive observational equivalence results to show in what sense the representations of object language terms are correct. In Chapter 6, we discuss implementation details. Finally, in Chapter 7 we survey future work and draw conclusions.

2 Fresh Objective Caml

‘A programming language is a tool that has profound influence on our thinking habits.’ —Dijkstra

WE NOW BEGIN to describe the Fresh Objective Caml language in detail, using examples along the way to illustrate the new language constructs. From this point onwards we assume that the reader is familiar with the standard O’Caml language, from whose distribution the Fresh Objective Caml system1 is derived. In order to comply with the licence terms of O’Caml, the modifications are distributed as a patch bundled together with a modified configure script which silently applies the patch when the distribution is built for the first time. The modifications to the system which form the patch are distributed under a BSD-style licence.

2.1 Representation of object language names

The first aspect which we examine is the typing and dynamic creation of values which represent object language bindable names.

2.1.1 Bindable names: typing

So what types should be assigned to identifiers representing object language bindable names? An initial attempt might be to have a single metalanguage type, perhaps called name, for this purpose. However, this is not particularly satisfactory since it would be nice to be able to distinguish between different varieties of names which may occur in object language syntax. For example, when representing types and terms of the polymorphic lambda calculus[22, 54, 23, 55] (PLC) we will need to represent type variables and term variables—both of which may take part in binding operations at the object level. If we can distinguish between these varieties of names at the meta level (that is, in the source text of some Fresh O’Caml program), then the compiler will be able to enforce stronger static checks on our source text during type-checking.
Given multiple varieties of names it is desirable to have polymorphism over these varieties. For example, we shall see later that there exists a Fresh O’Caml construct swap a and a′ in e, where a and a′ must be of the same type of names and e is some expression. If we wish to construct a function such as

let map_swap a b = map (fun x -> swap a and b in x);;

then we should be able to assign polymorphic types to the function parameters in such a way as to constrain them to be of a type of names—and furthermore, of the same type of names.
The best solution to this problem would be to declare some variety of type class whose members are those types which represent bindable names. However, since neither Standard ML nor O’Caml provides such Haskell-style type classes[42] we were forced to adopt an alternative strategy. The first attempt at a solution was implemented in the FreshML interpretive system based on Standard ML. This provided for multiple types of bindable names using the bindable_type declaration. For example,

bindable_type var;

introduced a new type constructor of zero arity called var which could be used to represent a certain variety of bindable names. Multiple such declarations could be issued within the

1 Available from http://www.fresh-ocaml.org/, together with example programs by the author and others.


source text. Then, a subclass of the equality type variables2 was created, known as the bindable type variables. These were named '@a, '@b etc. and provided polymorphism in the desired manner. For example, the FreshML equivalent of the above function map_swap would be typed as follows.

map_swap : '@a -> '@a -> 'b list -> 'b list

During the transition to the newer Fresh O’Caml system a rethink was required, for Objective Caml does not provide the notion of classes of type variables. Instead, we use a single polymorphic type 'a name and subsume polymorphism over the different kinds of name under the usual mechanisms for polymorphic types. In order to produce types corresponding to specific varieties of name, for example names of variables and names of type variables in PLC, we use generative type declarations as follows.

type t and var = t name;;
type t and tyvar = t name;;

In Fresh O’Caml, map_swap now receives the following type.

map_swap : 'a name -> 'a name -> 'b list -> 'b list
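As a small illustration of the point (our own sketch, not from the report): once var and tyvar have been declared as above, the type-checker keeps the two varieties of names apart.

(* Comparing two term-variable names is permitted... *)
let same_var (x : var) (y : var) = (x = y);;
(* ...but passing a tyvar where a var is expected is a static type error,
   even though both are types of names underneath. *)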

2.1.2 Bindable names: creation

Initially, there exist no values of any bindable type. To create them, one simply uses the fresh expression:

# let x = fresh;;
val x : 'a name = name_0

This declares a new identifier x and assigns a fresh atom to it. We shall learn rather more about atoms later in this thesis, but for the moment let us just observe that they are the semantic values corresponding to bindable names. No matter what type of bindable names is involved, the atoms are drawn from the same global set at runtime3.
Values of bindable type are printed as name_0, name_1 etc.; this numbering is on a ‘per-response’ basis in the same manner as the usual method for renaming type variables to start from 'a on each response. The actual atoms which are assigned to identifiers of bindable type are completely concealed from the programmer (rather like the way in which the actual addresses of reference cells are concealed in O’Caml).
Having made the above declaration, we could use the identifier x to represent an object language bindable name. If no explicit type annotations are provided by the programmer then the name is assigned a polymorphic type, in this case 'a name. In the following fragment of an interactive session we specify an explicit type annotation for the new name.

# let (x:var) = fresh;;
val x : var = name_0

From the programmer’s point of view, the only thing which may be done with an atom after creation is to test whether it is the same as another. In particular, there is no ordering on atoms, to be consistent with the underlying FM-set theory[20]. Problems which arise as a consequence of this restriction are discussed in §7.1.2.

2.2 Representation of object language binding

In order to represent object language binding constructs Fresh O’Caml uses abstraction values, written <<v>>v′ for values v, v′. Such values are assigned abstraction types <<τ>>τ′ (for types τ, τ′) and may be constructed using the abstraction-former <<e>>e′ (for expressions e, e′). The type τ will usually be a type of bindable names, but it may in fact be any type whose values are comparable (such as an algebraic datatype).

2 Another example of the poor-man’s type class.

3 Writing 𝔸 for the countably infinite set of atoms and 𝔸τ for the set of values of type τ name, what we are effectively doing is setting up an isomorphism between 𝔸 and the disjoint union of the 𝔸τ, as τ ranges over the Fresh O’Caml type expressions.

# let alpha, x, y = fresh, fresh, fresh;;
val alpha : 'a name = name_0
val x : 'a name = name_0
val y : 'a name = name_0
# let t1 = Tgen(<<alpha>>(Tlam(TYvar alpha, <<x>>(Tlam(TYvar alpha, <<y>>(Tvar x))))));;
val t1 : term = Tgen <<name_0>>(Tlam (TYvar name_0, <<name_1>>(Tlam (TYvar name_0, <<name_2>>(Tvar name_1)))))
# let t2 = Tgen(<<alpha>>(Tlam(TYvar alpha, <<x>>(Tlam(TYvar alpha, <<x>>(Tvar x))))));;
val t2 : term = Tgen <<name_0>>(Tlam (TYvar name_0, <<name_1>>(Tlam (TYvar name_0, <<name_1>>(Tvar name_1)))))
# let t3 = Tgen(<<alpha>>(Tlam(TYvar alpha, <<x>>(Tlam(TYvar alpha, <<y>>(Tvar y))))));;
val t3 : term = Tgen <<name_0>>(Tlam (TYvar name_0, <<name_2>>(Tlam (TYvar name_0, <<name_1>>(Tvar name_1)))))

Figure 2.1: Encoding some PLC terms in Fresh O’Caml.

2.2.1 Abstraction values: construction

When we write an abstraction expression <<e>>e′ in a piece of Fresh O’Caml code, we are telling the compiler that we wish to represent a piece of object language binding. We say that the expression between the double angle brackets is in binding position and use the same convention for abstraction values and abstraction types. At runtime, evaluation of e′ yields the value representing the body of the abstraction whereas evaluation of e yields that which corresponds to the bound identifier(s). (Often, e will just evaluate to a single atom, but expressions of more complicated types are permitted in binding position as we noted above.) The evaluation of e must yield a value that is comparable (in the Standard ML setting this corresponds to e having an equality type). Failure to adhere to this condition raises a runtime Invalid_argument exception. We say a little more about this in §2.2.5.
In a programmer’s Fresh O’Caml source code, abstraction types are most likely to arise within type declarations corresponding to object language syntax trees. Here we shall continue our running example of the polymorphic lambda calculus (PLC) by providing a suitable declaration to represent syntax trees corresponding to types and terms of that calculus:

type typ = TYvar of tyvar | TYfn of typ * typ | TYforall of <<tyvar>>typ;;

type term = Tvar of var | Tlam of typ * (<<var>>term) | Tgen of <<tyvar>>term | Tapp of term * term | Tspec of term * typ;;

We can see that abstraction types have been used in those places where we wish to represent some object-level binding. It is now a straightforward matter to create some syntax trees. For example, consider the PLC terms

t1 ≝ Λα. λx : α. λy : α. x
t2 ≝ Λα. λx : α. λx : α. x
t3 ≝ Λα. λx : α. λy : α. y

We can encode the terms in a Fresh O’Caml interactive session as shown in Figure 2.1. Note how straightforward it is to encode the binding in the object language terms: abstraction type-formers are tightly integrated with the other type constructors used in algebraic datatypes and the corresponding values are structurally similar to the original PLC terms. It is in fact the case that a Fresh O’Caml datatype is still said to be algebraic just when it does not involve any occurrences of the function space constructor ->: it therefore does include those datatypes involving abstraction types in their construction. Also, observe how the system numbers the bindable names on a ‘per-response’ basis as mentioned in §2.1.2. We can see from this example that the numbering is particularly useful to highlight the object-level binding inside each term.
In case the reader is wondering, we take the time now to point out that the abstraction-forming expression <<–>>– is not a binding construct—unlike in an approach based on higher-order abstract syntax, where function abstraction is used to represent object-level binding. Similarly, abstraction values do not bind atoms. We can illustrate that <<–>>– is not a binder by considering the Fresh O’Caml expressions

<<x>>z and <<y>>z

where x, y and z are distinct value identifiers of bindable type. The two expressions would be contextually equivalent if the abstraction-former were a binding construct. However, we can see that it is not, by distinguishing the two expressions using the following context:

let x = fresh in let y = fresh in let z = x in [–].

At runtime x will be assigned some fresh atom, a1 say. The same happens for y: call the corresponding atom a2. Then z will be mapped to a1 in the current environment, and thus the two semantic values to which <<x>>z and <<y>>z evaluate are <<a1>>a1 and <<a2>>a1 respectively, which are clearly not contextually equivalent.
The important correctness property of Fresh O’Caml expressions4 and values of datatypes such as term is that their contextual equivalence classes are in bijection with the α-equivalence classes of the object language terms. After some work, we shall formally state and prove such a result for a small language Mini-FreshML later in this thesis. Later in this chapter we shall see how this result enables the testing of object-level α-equivalence to be subsumed under the usual mechanisms for structural equality testing in Fresh O’Caml.
How much support for specific patterns of binding should the language provide? The current version of Fresh O’Caml described here takes a rather ‘hands-off’ approach, in the sense that it provides a single construct (the abstraction expression) which can be used in many different situations. But suppose we were dealing with an object language providing let-expressions which may bind multiple identifiers at once, as illustrated in the following code skeleton. The let-bindings may be mutually recursive if marked with rec; if not, the bindings are non-recursive.

let rec x1 = ...
        x2 = ...
in foo (x1, x2);;

Let us think about the binding forms we have here. If the let is to be non-recursive, then the relevant fragment of a Fresh O’Caml type declaration called term to represent such syntax would be something like:

Elet of term list * (<<var list>>term)

for some type of bindable names var corresponding to object language identifiers. Note how we use a more complicated abstraction type in order to signify that we want to represent the simultaneous binding of multiple identifiers. The term list part corresponds to the right-hand sides of the various let-bindings and the scope of the object-level binding is therefore

4 Not involving side-effects from references, exceptions and the like.

reflected in the type of the constructor argument. More specifically, the various identifiers introduced in the let are not bound within the first component of the pair.
If the binding is recursive, however, this scheme is somewhat unsatisfactory, for in this case the scope of the identifiers must include the let-bindings themselves. The constructor declaration should then look like this:

Elet of <<var list>>(term list * term)

Now, whilst this could be used to produce a correct (in some sense) representation of the above object language syntax, the question arises as to whether Fresh O’Caml should provide a more specific binding form which could be useful here. For there is nothing currently to stop the programmer erroneously constructing a value where the lengths of the var list and term list parts are different. One solution, of course, would be to use a dependently-typed language; alternatively, we could introduce further binding forms into Fresh O’Caml to cope with such a situation.
Our view is that whilst more specific forms would be desirable in certain situations in order to increase the potential for stronger static checks, the aim of Fresh O’Caml should be to provide a general framework for handling name binding. As we noted in Chapter 1, there is something pleasing about keeping new constructs simple and straightforward from both a user’s and a compiler writer’s point of view. Once one starts down the road of incorporating somewhat ‘subsidiary’ features into the language, there is a danger that it will become over-complicated. For example, if we were to incorporate specific constructs to handle the above situations, then how about one which allows mixed recursive/non-recursive bindings too? Finally, it should not be forgotten that our extensions to O’Caml are based on a solid mathematical foundation: multitudinous language constructs do not help to keep the theory and the implementation in step and often just work to obscure the real crux of the matter in the mathematics.

2.2.2 Abstraction values: typing

Given expressions e and e′ of types τ and τ′ respectively, the abstraction expression <<e>>e′ simply has type <<τ>>τ′. In other words, the type inference algorithm functions in exactly the same way as for pair types.
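For instance (our own worked example, using the var and term types declared above): if a has type var and t has type term, then <<a>>t has type <<var>>term, just as the pair (a, t) would have type var * term.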

2.2.3 Abstraction values: deconstruction

So what can we actually do with an abstraction value after creation? Such a value can be thought of as a ‘black box’ whose innards may only be examined via one specific means of deconstruction: matching against an abstraction pattern. The lack of any other means of inspecting abstraction values ensures that they may only be deconstructed in such a way as to maintain the Barendregt variable convention. As a result, the programmer can write syntax-manipulating code without having to worry about possible name clashes5.
Let us see some concrete examples. Here is a function, defined by structural recursion in the usual manner, which performs capture-avoiding substitution of the term t for the variable x throughout the PLC term t'.

let rec subst t x t' = match t' with
    Tvar y -> if x = y then t else t'
  | Tlam (ty, <<y>>t'') -> Tlam (ty, <<y>>(subst t x t''))
  | Tgen (<<a>>t'') -> Tgen (<<a>>(subst t x t''))
  | Tapp (t1, t2) -> Tapp (subst t x t1, subst t x t2)
  | Tspec (t'', ty) -> Tspec (subst t x t'', ty);;

We can see the abstraction pattern-matches in the Tlam and Tgen cases: they are written <<pat>>pat′ for sub-patterns pat and pat′. But what are these magic patterns actually doing, and why should this function implement capture-avoiding substitution? To understand this, first suppose that the function subst is presented with arguments arising from the following function application:

5 Really atom clashes, in fact.

subst (Tvar y) x (Tlam (TYvar alpha, <<y>>(Tapp (Tvar x, Tvar y))));;

The result of this should correspond to the open term (λy : α. x y)[y/x], that is λz : α. y z. The Fresh O’Caml system responds as follows, which provides sufficient information to check that no variable capture has occurred.

Tlam (TYvar name_0, <<name_2>>(Tapp (Tvar name_1, Tvar name_2)))

We now explain how the system has come up with this answer. When the function subst is invoked as above, the second clause of the pattern-match succeeds (against a data value with constructor Tlam). However, before the body of the match clause is invoked, the system ensures that the atom in binding position (mapped to by the pattern identifier y and occurring throughout t'') is sufficiently fresh for the current environment. In the current implementation6, an unused atom is chosen every time such a match is taken. This will be assigned to the identifier y, and then the term t'' will be constructed from the original body of the abstraction value (contained within t') in such a way that the new atom (call it a′) appears in all of the positions where the original atom (call it a) did. In order to perform the change of atoms we use an operation of swapping, otherwise known as transposition, to exchange a and a′. Later in this thesis we will see why swapping, rather than just renaming, is used.
The upshot of all of this is that the Barendregt variable convention is enforced across abstraction pattern-matches: when the programmer gets hold of the innards of an abstraction value after such a match, it is guaranteed that any atoms in binding position inside will be suitably fresh. This in turn implies that name clashes will not occur, so long as the programmer has correctly indicated object-level binding in their type declarations.
Another illuminating example is to consider two functions which attempt (somewhat inefficiently!) to calculate the free and bound variables of a PLC term. We assume the presence of a function remove : 'a -> 'a list -> 'a list which removes an element from a list.

let rec free_vars t = match t with
    Tvar x -> [x]
  | Tlam (ty, <<x>>t') -> remove x (free_vars t')
  | Tgen (<<a>>t') -> free_vars t'
  | Tapp (t1, t2) -> (free_vars t1) @ (free_vars t2)
  | Tspec (t', ty) -> free_vars t';;

let rec bound_vars t = match t with
    Tvar x -> []
  | Tlam (ty, <<x>>t') -> x :: (bound_vars t')
  | Tgen (<<a>>t') -> bound_vars t'
  | Tapp (t1, t2) -> (bound_vars t1) @ (bound_vars t2)
  | Tspec (t', ty) -> bound_vars t';;

Both of these functions are typeable in Fresh O’Caml, and the first is indeed a bona-fide function to calculate the free variables of a term. However, the second function should make the astute reader slightly uneasy. Indeed, we claim that it should be impossible to write a function which really does ‘calculate the bound variables of a PLC term’. This should not be possible because Fresh O’Caml’s correctness property (viz. §2.2.1) prevents us from distinguishing between the encodings of representatives from the same α-equivalence class of PLC terms. The function bound_vars is well-defined, however, because Fresh O’Caml provides fresh atoms upon each successful abstraction pattern-match. Therefore successive applications of the bound_vars function (even on the same terms) will always return distinct lists of atoms, whereas applications of the free_vars function on the same term will always return the same list of atoms.

6 See §6.6.1 for possible optimisations in this area.

# (bound_vars (Tlam(TYvar alpha, <<y>>(Tvar x)))) = (bound_vars (Tlam(TYvar alpha, <<y>>(Tvar x))));;
- : bool = false

This example clearly illustrates how the procedure of matching against abstraction patterns has observable side-effects (in a manner analogous to the evaluation of ref expressions, which create new cells in Objective Caml). The older FreshML-2000 language included strict typing constraints which meant that programs such as bound_vars—which allow the user to observe such effects—would be ruled out. However, this system was found to be too restrictive: we explain why in §6.8. Luckily, the old belief that these restrictions were needed to enforce the desired correctness properties turned out (after some time!) to be incorrect.
Given that Fresh O’Caml permits more general abstraction types than <<name>>τ, we should say a few words about pattern-matching on such values. To do this we must introduce the notion of the algebraic support of a value, which captures the idea of ‘the atoms involved in the value’s construction’. Formally, it is an approximation to the least finite support of the denotation of the value—a notion we will introduce in Chapter 4. Indeed, the algebraic support of some value and the least finite support of its denotation will coincide so long as the value is comparable. If it is not, the algebraic support will be no smaller than the least finite support of the value’s denotation. Intuitively, the algebraic support of some value will correspond to the free variables of the object language term which it encodes. It is calculated by a simple structural recursion: for example, the algebraic support of an atom a is just the singleton set {a} whilst the algebraic support of a pair (v, v′) is the union of the algebraic supports of v and v′. Given an abstraction value <<v>>v′, the algebraic support is calculated as the algebraic support of v′ minus the algebraic support of v. In the case of function values, we have to make an approximation since the least finite support of the denotation of such a value is formally undecidable7: it suffices to take the union of the algebraic supports of the free variables of the function. Note that even though there is contravariance present when calculating the algebraic support of an abstraction value, this preserves the approximation since only comparable values are permitted in binding position.
Returning to the pattern-matching algorithm, consider a value <<v>>v′ of type <<τ>>τ′ which we wish to match against a pattern <<x>>x′. The identifier x′ should be assigned the value produced from v′ by freshening up any atoms therein which also occur in the algebraic support of v. Having generated sufficiently many fresh atoms, this new value is generated by a simple structural recursion over the structure of v′, assigning fresh atoms at the correct points by means of swappings. The same swappings are used throughout the value v to generate the value mapped to x.
Fresh O’Caml also allows nested abstraction patterns. For example, we could match against a value of type <<name>>(<<name>>τ) using a pattern of the form <<x>>(<<x′>>x′′). We shall not delve here into the exact algorithm used to decompose such complex matches, but defer the discussion until §6.5. From the programmer’s point of view, the results are as expected: each abstraction is freshened up accordingly.
We can now understand the important property8 that abstraction values are easy to construct, but harder to deconstruct. From an implementation point of view, we may also read ‘fast’ for ‘easy’, as we shall see later in this thesis (Chapter 6).
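The structural recursion for algebraic support can be sketched in plain OCaml as follows (our own illustration over a deliberately simplified value type; the real runtime works on Fresh O’Caml’s internal representation, and atoms are represented here by integers).

module AtomSet = Set.Make (struct type t = int let compare = compare end)

type value =
  | Atom of int
  | Pair of value * value
  | Abs of value * value        (* Abs (v, v') plays the role of <<v>>v' *)

(* Algebraic support, by structural recursion as described in the text. *)
let rec support = function
  | Atom a -> AtomSet.singleton a
  | Pair (v, v') -> AtomSet.union (support v) (support v')
  | Abs (v, v') -> AtomSet.diff (support v') (support v)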

2.2.4 Abstraction values: two sides of the same coin

We have now seen that abstraction values do indeed provide a form of encapsulation and prevent potentially unsafe deconstruction. However, at the same time they do have concreteness properties. For when we pattern-match against an abstraction value, we obtain value identifiers mapping to specific (carefully-chosen!) atoms with which to work. From practical experience, this delicate combination of abstractness and concreteness provides the best of both sides of the coin.

7 We state exactly why in §6.8.1, where we also hit the same problem.
8 For the purposes of this statement, equality testing on abstraction values classes as a deconstruction. We discuss this in §2.2.5.

This is exemplified by the fact that the evaluation of an abstraction expression <<e>>e′ follows exactly the same pattern as a pair expression (e, e′). Both e and e′ are evaluated in the current environment and simply paired together to form the corresponding value.

2.2.5 Abstraction values: equality testing

Given two abstraction values <<v1>>v2 and <<v1′>>v2′, we need to be able to determine when these two values are (structurally) equal. This is not an entirely trivial question to answer, so we start with an example and some motivation.
We have already seen that the contextual equivalence classes of the values of types such as term can be proved to coincide with the α-equivalence classes of object-level terms. It follows that given the values t1 through t3 from §2.2.1, we can check whether they represent α-equivalent PLC terms just by testing them for equality in Fresh O’Caml:

# t1 = t2;;
- : bool = false
# t1 = t3;;
- : bool = false
# t2 = t3;;
- : bool = true

Very neat. However, what we are really seeing is an illusion which Fresh O’Caml presents to the programmer. At some particular point in time, a particular abstraction value will be some particular representative of the corresponding α-equivalence class (of values identified up to renaming of atoms in binding position). However, which particular representative exists at any point in time may only be determined by the runtime system and not by the programmer. It is thus necessary to take great care in this discussion to distinguish between meta-level equivalence as exposed to the programmer and ‘identity’ of values as seen by the runtime system. After all, here we are defining how to manufacture the test for meta-level equivalence given the lower-level operation of a simple structural comparison9 on values. Therefore, during the following discussion we reserve use of the word equality for that operation invoked by = in Fresh O’Caml. We shall use the word identity for no-frills structural identity on values—the testing of values for equivalence when not working up to renaming of atoms in binding position. In particular, a value <<v1>>v2 is identical to another <<v1′>>v2′ just when v1 is identical to v1′ and v2 is identical to v2′.
The process of testing two abstraction values for equality attempts to determine whether the two values belong to the same equivalence class of values modulo α-conversion of atoms in binding position. In order to do this, we determine whether there exists a pair of swappings of these atoms with fresh ones which respectively map the bodies of each abstraction onto two structurally identical values. Which representative of the equivalence class we end up with is irrelevant: what matters is whether such mappings exist or not. The problem is rather reminiscent of unification.
To see the algorithm in full detail, let us consider the case where the value in binding position is a single atom. Say the abstraction values to be compared are <<a1>>v1 and <<a2>>v2, where a1 and a2 are atoms and v1 and v2 are values of the same type. In order to test for equality, the types of the two abstraction values should match: indeed these two values are both of type <<name>>τ, for some type of bindable names name and where v1 and v2 are both of type τ. Choosing a fresh atom a3, we can produce two swappings (permutations) of atoms: π1 will swap a1 and a3; π2 will swap a2 and a3. Then, the original two abstraction values will be equal iff the value obtained by recursively applying the permutation π1 throughout the value v1 is structurally identical to that produced by applying π2 throughout v2.
Morally speaking, we are comparing the whole of the values minus the positions in which bound atoms occur: since in practice we have to compare the whole values, we force comparisons to succeed at these points by ensuring that the same (fresh) atoms occur at such positions in both values.

So far we have only considered abstraction values binding a single atom. Suppose instead we wish to test two values <<v1>>v2 and <<v1'>>v2' for equality, where v1 and v1' are of the same complicated type. In this case, we traverse the structure of the values v1 and v1' in the same order for each to yield their algebraic supports as lists [a1, . . . , an] and [a1', . . . , an'] respectively.

9 Think memcmp if you like C.

We then require the lengths of these lists to be equal. Then we allocate a list [c1, . . . , cn] of fresh atoms, one for each atom in the support of v1. Next, we swap each ai with ci throughout v1 and see if the resulting value is identical to that obtained by swapping each ai' with ci throughout v1'. Assuming this is the case, the abstraction values are deemed equal if swapping each ai with ci throughout v2 yields a value identical to that obtained by swapping each ai' with ci throughout v2'.

It is necessary to traverse the values v1 and v1' in the same order and to then use a list (rather than a set) to store the result so that we correctly capture the positions of the various atoms throughout the syntax trees. For example, consider testing <<[a, b]>>(a, b) against <<[b, a]>>(b, a). These values are indeed equal. The algebraic supports of the two values in binding position must be enumerated and stored in order as [a, b] and [b, a] (rather than say [a, b] and [a, b]) so that the swappings correctly lead to equal values.

As we noted before, a consequence of this algorithm is that any type in binding position (inside an abstraction type <<τ>>τ') must be one whose values are comparable. For example, attempting to test equality on values of type <<τ → τ'>>τ'' will fail with a runtime Invalid_argument exception.
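Purely by way of illustration, the single-atom case of this algorithm can be phrased as a hypothetical Fresh O'Caml function of our own devising; in reality the procedure is built into = itself, and the structural comparison of the swapped bodies below would recurse through any nested abstractions in exactly the same way.

let abstraction_equal x y =
  match (x, y) with ((<<a1>>v1), (<<a2>>v2)) ->
    (* choose a single fresh atom and swap it with each bound atom *)
    let a3 = fresh in
    (* equal iff the renamed bodies are structurally identical *)
    (swap a1 and a3 in v1) = (swap a2 and a3 in v2);;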

2.3 Explicit atom-swapping

The linearity constraint on O'Caml patterns which forbids multiple occurrences of the same variable within a single pattern means that it is not possible to write a match of the form match t with (<<a>>x, <<a>>y) -> ... where multiple abstractions are deconstructed using the same fresh atoms. Not only are such pattern-matches required in order to encode certain bijections which hold in the underlying FM-set theory (for example between <<'a name>>'b * <<'a name>>'c and <<'a name>>(’b * ’c)), but they have been found to be necessary during practical work[21] with the language. To simulate the effect of such non-linear patterns, Fresh O'Caml provides an expression which explicitly swaps all occurrences of two atoms inside a given value. This allows the simulation of a non-linear matching as follows, using the above isomorphism as an example:

let f t = match t with (<<a>>x, <<b>>y) -> <<a>>(x, swap a and b in y);;

2.4 Fresh-for test

The boolean-valued language construct e freshfor e' is true iff e is of a type of names and evaluates to an atom which does not occur in the algebraic support of the value to which e' evaluates. Recalling that the algebraic support of a Fresh O'Caml representation of some object language term corresponds to the set of free names in that term, we see that the fresh-for test provides a built-in function to calculate such free names.
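For instance (an illustration of ours, using the term representation of §2.2.1), the set of free names of a term need never be computed explicitly just to test membership:

(* x is free in the object-language term represented by t iff x is not fresh for t *)
let occurs_free x t = not (x freshfor t);;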

2.5 Something missing?

Readers who are familiar with previous work using permutation models of name binding[49, 20, 18] may be wondering where the notion of concretion has got to. Concretion is the action of taking an abstraction value10 together with an atom fresh for the body and returning the body with this atom replacing the existing atom in binding position throughout. (This will be formalised in due course—see Lemma 4.2.28.) We write v @ a to mean the concretion of the value v at the atom a. For example, the abstraction value <<a>>(a, b) (where a, b are atoms) may be concreted at an atom c ≠ a, b to yield the result (c, b). The atom which we pick must not occur in the body in order to guarantee that no name clashes will occur. It is in fact the case that the concretion v @ a of an abstraction v = <<a'>>v' (with a single atom abstracted) at the atom a is given by the result of swapping all occurrences of a and a' throughout v'. It is therefore superfluous to incorporate both concretion and explicit atom-swapping; we have chosen the more general explicit swapping.
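As a small sketch of this observation (ours, not a library function), concretion of a single-atom abstraction can be programmed directly with the explicit swapping construct; the caller is assumed to supply an atom that is fresh for the body, as required above.

(* concrete <<a'>>v' at a: replace the bound atom by a throughout the body *)
let concrete abstr a =
  match abstr with <<a'>>v' -> swap a and a' in v';;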

10 For simplicity, we consider this to have only a single atom abstracted.

2.6 Reference cells

The interactions between reference cells and the model of name binding employed by Fresh O’Caml may be surprising to the reader. For the address of a reference cell is treated as being pure—that is to say, it is treated as containing no atoms. It follows that any swapping operations on the address of the cell will not have any effect—in particular, the contents of the cell are unchanged. We can illustrate the semantics with the following code fragment, which always produces the result false11:

let a = fresh;;
let x = <<a>>(ref a);;
match x with <<a'>>a'' -> a' = !a'';;

Compare this to the following fragment, which always produces true:

let a = fresh;;
let x = <<a>>a;;
match x with <<a'>>a'' -> a' = a'';;

This semantics for swapping operations on reference cells is theoretically straightforward, but perhaps pragmatically unsatisfactory. In particular, it is likely to hinder the implementation of efficient algorithms based on the use of mutable cells. For even if a data structure only employs references for efficiency (as opposed to creating cyclic graphs, for example) then we will still hit the problem that abstraction values do not appear to ‘bind’ through the references. A specific example which could hit this problem would be the usual techniques of destructive unification which are employed when implementing efficient type inference algorithms. (Here, we might implement type variables as reference cells which may either contain a pointer to the type unified with that variable, or alternatively a bona-fide type variable identifier.) Whilst one could argue that the current treatment of references is morally the correct one12, we feel that from a pragmatic perspective it is not the best thing to be doing. However, any solution is not straightforward: as we shall see in §7.1.4 it would not be correct just to allow abstractions to bind unchecked through reference cells.

2.7 A full example

As a more complete example of the use of Fresh O'Caml, Figure 2.2 presents a complete typechecker for the polymorphic lambda calculus. We hope that the reader agrees that the code has a certain elegance about it.

2.8 Pretty-printing

We have seen how Fresh O'Caml allows access to concrete names. This gives rise to the problem that a very limited set of operations is provided to manipulate such names. In particular, there appears to be no straightforward way to somehow tag a name with a textual string—as one would want to do in order to pretty-print syntax trees or to report error messages, for example. Many potential applications of Fresh O'Caml (compilers, theorem provers, etc.) receive user input and it is obviously desirable to maintain as many of the user's textual names as possible. (We might extend them with numeric tags: this might be necessary if fresh atoms representing the name have been generated to prevent a capture.)

At first it might seem that the correct way forward is to represent object language names by (atom, string) pairs. However this soon breaks down because the string is unaffected by any swappings applied to the pair, meaning that the correspondence can become garbled. James Leifer at INRIA Rocquencourt pointed out to us that the solution is to instead tag the binding points in the object language syntax trees with the original textual names. These will correspond to those nodes in the syntax trees where abstraction types feature. When such a

11 As of the time of writing (February 2005), there is a bug in the released versions of the Fresh O'Caml system which causes incorrect results when using reference cells. This will be fixed in the near future.
12 References are, after all, part of the 'awkward squad' as Peyton-Jones puts it[40].

type t and var = t name;;
type t and tyvar = t name;;

type typ =
    TYvar of tyvar
  | TYfn of typ * typ
  | TYforall of <<tyvar>>typ;;

type term =
    Tvar of var
  | Tlam of typ * (<<var>>term)
  | Tgen of <<tyvar>>term
  | Tapp of term * term
  | Tspec of term * typ;;

exception Arg_type_mismatch;;
exception Bad_application;;
exception Bad_specialisation;;
exception Not_typeable;;
exception Unbound_var;;

let rec lookup_env env x =
  match env with
    [] -> raise Unbound_var
  | ((x', ty)::_) when x = x' -> ty
  | (_::env') -> lookup_env env' x;;

let rec extend_env env x ty = (x, ty) :: env;;

let rec ty_subst ty' a ty =
  match ty with
    TYvar a' when a = a' -> ty'
  | TYvar _ -> ty
  | TYfn (ty1, ty2) -> TYfn (ty_subst ty' a ty1, ty_subst ty' a ty2)
  | TYforall (<<a''>>ty'') -> TYforall (<<a''>>(ty_subst ty' a ty''));;

let rec typecheck' env t =
  match t with
    Tvar x -> lookup_env env x
  | Tlam (ty, <<x>>t') -> TYfn (ty, typecheck' (extend_env env x ty) t')
  | Tgen (<<a>>t') -> TYforall (<<a>>(typecheck' env t'))
  | Tapp (t1, t2) ->
      begin match (typecheck' env t1, typecheck' env t2) with
        (TYfn (arg_ty, res_ty), ty) ->
          if arg_ty = ty then res_ty else raise Arg_type_mismatch
      | _ -> raise Bad_application
      end
  | Tspec (Tgen (<<a>>t'), ty) -> ty_subst ty a (typecheck' env t')
  | Tspec _ -> raise Bad_specialisation;;

let typecheck = typecheck' [];;

Figure 2.2: Typechecker for PLC.

syntax tree node is deconstructed at runtime we obtain a fresh atom and can then pick a textual name for it based on the original user’s name. (Names which were free in the first place may simply use the technique described in the previous paragraph, as they will be unaffected by the side-effects arising from the deconstruction of abstraction values.) We now present a further enhancement to this technique. For this example we keep the object language simple and just use the terms of the untyped λ-calculus, inductively defined by the following grammar:

t ::= x | λx. t | t t.

type t and var = t name;;

type lam =
    Var of var
  | Lam of string * <<var>>lam
  | App of lam * lam;;

let rec lookup_atom varmap a =
  match varmap with
    [] -> raise Not_found
  | (a', s) :: xs -> if a = a' then s else lookup_atom xs a;;

let rec name_used varmap s =
  match varmap with
    [] -> false
  | (_, s') :: xs -> if s = s' then true else name_used xs s;;

let allocate_name varmap a s =
  if name_used varmap s then
    let s' = s ^ "'" in (s', (a, s') :: varmap)
  else (s, (a, s) :: varmap);;

let rec print_lam varmap t =
  match t with
    Var a -> lookup_atom varmap a
  | Lam (s, <<a>>t') ->
      let cont = function (var_text, new_varmap) ->
        "fn " ^ var_text ^ " => " ^ (print_lam new_varmap t')
      in
      (* if the atom a never occurs in t', then we don't need to record
         the use of the textual name s *)
      cont (if a freshfor t' then (s, varmap) else allocate_name varmap a s)
  | App (t1, t2) ->
      "(" ^ (print_lam varmap t1) ^ ") " ^ "(" ^ (print_lam varmap t2) ^ ")";;

let print = print_lam [];;

let x, y = fresh, fresh;;
let test1 = Lam ("x", <<x>>(Lam ("x", <<x>>(Var x))));;
let test2 =
  Lam ("y", <<y>>(App (Lam ("x", <<x>>(Lam ("y", <<y>>(App (Var x, Var y))))),
                       Var y)));;

# print test1;;
- : string = "fn x => fn x => x"
# print test2;;
- : string = "fn y => (fn x => fn y' => (x) (y')) (y)"

Figure 2.3: Sample pretty-printing code, tests and output.

If the program receives input (encoded in some indicative textual format) fun x -> fun x -> x and subsequently manipulates the parsed term (indeed, even if it just parses it to a syntax tree and then pretty-prints it) then we would hope that only one textual name appears in the output. This ought to be x, of course. No second name should be chosen since the second binding of x shadows the first and there are no uses of the outermost x.

How do we adapt our algorithm to cope with this? Our answer presents an elegant use of the freshfor construct. When encountering a binding point in the syntax tree, we take the atom which the abstraction pattern-match has provided us with and check if it is fresh for the body using freshfor. If it is, we know that the variable is actually unused and therefore its (newly-generated) textual name may be re-used later.

Figure 2.3 presents a sample pretty-printer for closed λ-terms incorporating these techniques, together with its output. For simplicity, we omit code to prevent superfluous parentheses arising in the output; also, we use deeply-inefficient routines for handling maps keyed on atoms. Why do we not use the standard library? The answer is that ways of incorporating names into data structures such as hash tables and balanced binary trees are still under investigation, as we will eventually see in §7.1.2.

3 Mini-FreshML, operationally

‘Inside every large language is a small language struggling to get out.’ —Igarashi et al.[25]

IN THIS CHAPTER we turn our attention to the formalisation of the new constructs seen in Fresh O'Caml with a view to proving some observational equivalence results. We do this by defining a small language Mini-FreshML which provides for the key features:

• the dynamic allocation of fresh atoms;
• the creation and deconstruction of atom-abstractions;
• the process of atom-swapping;
• user-defined datatypes involving atom-abstractions.

After introducing the syntax and static semantics of Mini-FreshML, we will give a traditional 'big-step' operational semantics to describe its evaluation behaviour. This will then be augmented via an operational semantics based on frame stacks[48]—a particular representation of the familiar evaluation, or reduction contexts[29, 71]. We shall need this alternative formulation in Chapter 5 where we give a denotational semantics to Mini-FreshML. It is by using this denotational semantics that we will prove results about certain notions of observational equivalence of Mini-FreshML expressions.

3.1 Syntax

We assume a countably infinite set VId of value identifiers (typical element x) and a countably infinite set 𝔸 of atoms (typical element a) disjoint from VId. The expressions and canonical forms of Mini-FreshML are then the abstract syntax trees generated by the grammars in Figure 3.1. The unusual language constructs have been highlighted like this. Since we will be working with a 'substituted-in' style of semantics (as opposed to 'environment style', such as in [36, 63] and discussed in §3.5) then the canonical forms (also known as values) are a subset of the expressions.

Write Expτ and Valτ for the closed expressions and canonical forms of type τ respectively. Write Val for the set of all closed canonical forms. An important point to note is that expressions are to be identified up to α-conversion of bound value identifiers. The binding forms are as follows, with the binding positions underlined.

fun x(x) = [–]
let x = e in [–]
let (x, x) = e in [–]
let <<x>>x = e in [–]
match e with C1(x1) -> [–] | · · · | CK(xK) -> [–]

Note that the atom-abstraction expression- and value-former <<–>>– is not a binder. However, properties of Mini-FreshML observational equivalence will turn out to have the following consequences:

• For distinct atoms a, a', the value <<a>>a is observationally equivalent to <<a'>>a'.
• For distinct value identifiers x, x', the value <<x>>x is observationally equivalent to <<x'>>x'.
• For distinct value identifiers x, y and z, the value <<y>>x is not observationally equivalent to <<z>>x.


3.2 Static semantics

Mini-FreshML is a strongly-typed language, although we use a monomorphic type system in order to keep the presentation as simple as possible. (Therefore, in particular, let-expressions do not introduce polymorphism unlike in Fresh O’Caml). Expressions are assigned types generated by the grammar in Figure 3.2; let us write Typ for the set of all such types and assume the presence of a single top-level type δ. Values of this type may be built using the data constructors C1, · · · , CK , which have argument types σ1, · · · , σK respectively. The σk are built from the same grammar as types τ (and hence may contain occurrences of δ, in particular). Therefore δ corresponds to a Fresh O’Caml type declaration of the form:

type δ = C1 of σ1 | · · · | CK of σK .

The theory which follows could be easily extended to cope with multiple, possibly mutually-recursive top-level datatype declarations. We write Γ for finite maps from value identifiers to types, thought of as typing contexts; expressions may be assigned types in such contexts using the typing relation `. This is a set of triples (Γ, e, τ) where Γ is a typing context, e is an expression and τ is a type. We write Γ ` e : τ iff (Γ, e, τ) is in the relation, which is defined by structural recursion on e according to the axioms and rules in Figure 3.3. Write Γ, x : τ to indicate that typing context mapping x to τ and otherwise acting as Γ. In the event that Γ is empty (and hence e is closed), we abbreviate the notation and write ` e : τ. Since the canonical forms are a subset of the expressions, then they are also assigned types according to Figure 3.3.

Definition 3.2.1 (Atom sets and atom swapping). Given an expression e, let atms(e) be the finite set of all atoms occurring within e. Given atoms a, a', write (a a') · e for that expression formed by recursively exchanging all occurrences of a and a' throughout the expression e.
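By way of illustration only, the two operations of Definition 3.2.1 might be realised as follows for a small fragment of the expression grammar, written in ordinary OCaml with atoms represented as integers; both the fragment and this representation are assumptions made purely for the sketch.

(* a rough sketch of atms(e) and (a a') . e, not part of Mini-FreshML itself *)
type atom = int

type exp =
  | Atom of atom                (* a *)
  | Pair of exp * exp           (* (e, e') *)
  | Abst of exp * exp           (* <<e>>e' *)
  | Swap of exp * exp * exp     (* swap e1, e2 in e3 *)

(* atms(e): the finite set of all atoms occurring within e, as a de-duplicated list *)
let rec atms = function
  | Atom a -> [a]
  | Pair (e1, e2) | Abst (e1, e2) -> List.sort_uniq compare (atms e1 @ atms e2)
  | Swap (e1, e2, e3) -> List.sort_uniq compare (atms e1 @ atms e2 @ atms e3)

(* (a a') . e: recursively exchange all occurrences of a and a' throughout e *)
let rec swap_atoms a a' = function
  | Atom b -> Atom (if b = a then a' else if b = a' then a else b)
  | Pair (e1, e2) -> Pair (swap_atoms a a' e1, swap_atoms a a' e2)
  | Abst (e1, e2) -> Abst (swap_atoms a a' e1, swap_atoms a a' e2)
  | Swap (e1, e2, e3) -> Swap (swap_atoms a a' e1, swap_atoms a a' e2, swap_atoms a a' e3)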

e ::=   x                                                value identifier
      | ()                                               unit
      | a                                                atom
      | Ck(e)                                            data construction
      | (e, e)                                           pairing
      | fresh                                            fresh name creation
      | <<e>>e                                           abstraction
      | swap e, e in e                                   atom-swapping
      | fun x(x) = e                                     recursive function
      | e e                                              function application
      | if e = e then e else e                           conditional
      | let x = e in e                                   value binding
      | let (x, x) = e in e                              pair deconstruction
      | let <<x>>x = e in e                              abstraction deconstruction
      | match e with C1(x1) -> e1 | · · · | CK(xK) -> eK  data value deconstruction

v ::=   x                  value identifier
      | ()                 unit
      | a                  atom
      | Ck(v)              data construction
      | (v, v)             pairing
      | <<v>>v             abstraction
      | fun x(x) = e       recursive function

Figure 3.1: Expressions and canonical forms of Mini-FreshML.

τ ::=   unit           unit type
      | name           name type
      | δ              data type
      | τ × τ          pair type
      | <<name>>τ      abstraction type
      | τ → τ          function type

Figure 3.2: Types of Mini-FreshML.

Given a permutation π, write π · e for that expression formed by recursively applying the permutation π to the atoms throughout e.

Definition 3.2.2 (Substitutions). Let substitutions ψ be finite maps from value identifiers to closed canonical forms. Write e[ψ] for that expression formed from another expression e by the capture-avoiding application of the substitution ψ. Given a typing context Γ, write ` ψ : Γ just when dom(ψ) = dom(Γ) and for each x ∈ dom(ψ), ` ψ(x) : Γ(x). Write ψ, x ↦ v for that substitution mapping x to v and otherwise behaving as ψ. Occasionally for clarity we will use the notation ψ[x ↦ v] instead of ψ, x ↦ v.

The construction of the typing rules immediately gives the following two lemmata, whose proofs follow by simple inspection.

Lemma 3.2.3 (Substitutivity). For substitutions ψ, typing contexts Γ and expressions e, ` ψ : Γ implies that Γ ` e : τ ⇔ ` e[ψ] : τ.

Lemma 3.2.4 (Equivariance for typing judgements). For any expression e in context Γ and atoms a, a', Γ ` e : τ ⇔ Γ ` (a a') · e : τ.

The simplicity of the typing rules may be surprising (especially to those familiar with [49]), but no further complication is required. This means that production of a Hindley-Milner style type inference algorithm1 for inferring the types of our language constructs is straightforward. For example, note the similarity between the rules for abstraction and pair expressions. This is in contrast with the original FreshML-2000 design, which made use of a complicated static type system (discussed at length in §6.8). The lack of such a type system is particularly helpful when considering how to modify existing compilers in order to support our features, such as we have done for Objective Caml. Let us now move on and consider the evaluation behaviour of Mini-FreshML, which presents more novelties than the static semantics.

3.3 Dynamic semantics via big-step

In this section we provide a big-step operational semantics which defines how expressions evaluate to canonical forms. Overall the evaluation strategy agrees with that of Fresh O'Caml because Mini-FreshML is also a strict call-by-value language. The main point to bear in mind is that during the evaluation of expressions we may have the computational effects arising from the allocation of fresh atoms. This happens not only via the fresh expression but also by pattern-matching against abstraction patterns using let <<x>>x' = e in e'. In order to cope with this, we provide an evaluation relation on 4-tuples (a, e, v, a') where a and a' are finite sets of atoms, e is a closed expression and v a closed canonical form. We write a, e ⇓ v, a' iff (a, e, v, a') is in the relation. Such a judgement is read as 'In the state where the atoms a have been allocated, the expression e evaluates to the value v and allocates the atoms a' \ a'. The evaluation relation is inductively defined by the axioms and rules in Figure 3.4, but restricted to those judgements a, e ⇓ v, a' when a' ⊇ a. From inspection of that Figure, we obtain the following three lemmata and then a subject reduction result.

1 Whose roots may lie as far back as the work of Curry in the 1930s, or possibly even Tarski in the 1920s.

(vid)       x ∈ dom(Γ) and τ = Γ(x)   ⟹   Γ ` x : τ

(atom)      a ∈ 𝔸   ⟹   Γ ` a : name

(unit)      Γ ` () : unit

(con)       Γ ` e : σk   ⟹   Γ ` Ck(e) : δ

(fresh)     Γ ` fresh : name

(pair)      Γ ` e : τ    Γ ` e' : τ'   ⟹   Γ ` (e, e') : τ × τ'

(abst)      Γ ` e : name    Γ ` e' : τ   ⟹   Γ ` <<e>>e' : <<name>>τ

(swap)      Γ ` e1 : name    Γ ` e2 : name    Γ ` e3 : τ   ⟹   Γ ` swap e1, e2 in e3 : τ

(fun)       Γ, f : τ → τ', x : τ ` e : τ'   ⟹   Γ ` fun f(x) = e : τ → τ'

(app)       Γ ` e : τ → τ'    Γ ` e' : τ   ⟹   Γ ` e e' : τ'

(let)       Γ ` e : τ    Γ, x : τ ` e' : τ'   ⟹   Γ ` let x = e in e' : τ'

(let-pair)  Γ ` e : τ1 × τ2    Γ, x : τ1, x' : τ2 ` e' : τ   ⟹   Γ ` let (x, x') = e in e' : τ

(let-abst)  Γ ` e : <<name>>τ    Γ, x : name, x' : τ ` e' : τ'   ⟹   Γ ` let <<x>>x' = e in e' : τ'

(if)        Γ ` e : name    Γ ` e' : name    Γ ` e1 : τ    Γ ` e2 : τ   ⟹   Γ ` if e = e' then e1 else e2 : τ

(match)     Γ ` e : δ    for all 1 ≤ k ≤ K, Γ, xk : σk ` ek : τ   ⟹   Γ ` match e with · · · | Ck(xk) -> ek | · · · : τ

Figure 3.3: Typing rules for expressions.

Lemma 3.3.1 (State growth during evaluation). Given an expression e and a set of atoms a containing the atoms of e, then a, e ⇓ v, a' implies that a' ⊇ a and atms(v) ⊆ a'.

Lemma 3.3.2 (Equivariance for the big-step relation). For any atoms b, b', a, e ⇓ v, a' implies that (b b') · a, (b b') · e ⇓ (b b') · v, (b b') · a', where (b b') · – acting on a finite set of atoms a is that finite set given by exchanging all occurrences of b and b' throughout a.

Lemma 3.3.3 (Determinacy). If a, e ⇓ v, a' and a, e ⇓ v', a'' then there exists a permutation π : (a' \ a) ≅ (a'' \ a) such that v' = π · v.

Lemma 3.3.4 (Subject reduction). Take a substitution ψ, a typing context Γ such that ` ψ : Γ and an expression e with Γ ` e : τ. Then for all a' ⊇ a ⊇ atms(e, ψ), whenever there exists a derivable judgement a, e[ψ] ⇓ v, a', then ` v : τ.

Proof. By the substitutivity property of the typing relation (Lemma 3.2.3), it suffices to prove that for all ψ with ` ψ : Γ and any a ⊇ atms(e, ψ),

∀v, a' ⊇ a. (` e[ψ] : τ ∧ a, e[ψ] ⇓ v, a') ⇒ ` v : τ

by induction over the axioms and rules defining the evaluation relation ⇓. We give just those cases which are specific to Mini-FreshML.

Case (fresh). Since fresh[ψ] = fresh, the result follows immediately by observing the (fresh) rule and noting that for any a ∈ 𝔸, ` a : name.

Case (abst). First note that (<<e>>e')[ψ] = <<e[ψ]>>e'[ψ]. Without loss of generality we have that ` <<e[ψ]>>e'[ψ] : <<name>>τ; the typing rules then tell us that ` e[ψ] : name and ` e'[ψ] : τ. Take any a ⊇ atms(<<e[ψ]>>e'[ψ]) and a'' ⊇ a; then atms(e[ψ]) ⊆ a and

(In the rules below a, a', a'', a''' range over finite sets of atoms, the allocation states, and b, b' over individual atoms.)

(atom)      b ∈ a   ⟹   a, b ⇓ b, a

(unit)      a, () ⇓ (), a

(con)       a, e ⇓ v, a'   ⟹   a, Ck(e) ⇓ Ck(v), a'

(fresh)     b ∈ 𝔸 \ a   ⟹   a, fresh ⇓ b, a ⊎ {b}

(pair)      a, e ⇓ v, a'    a', e' ⇓ v', a''   ⟹   a, (e, e') ⇓ (v, v'), a''

(abst)      a, e ⇓ v, a'    a', e' ⇓ v', a''   ⟹   a, <<e>>e' ⇓ <<v>>v', a''

(swap)      a, e ⇓ b, a'    a', e' ⇓ b', a''    a'', e'' ⇓ v, a'''   ⟹   a, swap e, e' in e'' ⇓ (b b') · v, a'''

(fun)       a, fun f(x) = e ⇓ fun f(x) = e, a

(app)       a, e ⇓ fun f(x) = e'', a'    a', e' ⇓ v', a''    a'', e''[fun f(x) = e''/f, v'/x] ⇓ v, a'''   ⟹   a, e e' ⇓ v, a'''

(let)       a, e ⇓ v', a'    a', e'[v'/x] ⇓ v, a''   ⟹   a, let x = e in e' ⇓ v, a''

(let-pair)  a, e ⇓ (v', v''), a'    a', e'[v'/x, v''/x'] ⇓ v, a''   ⟹   a, let (x, x') = e in e' ⇓ v, a''

(let-abst)  a, e ⇓ <<b>>v', a'    a' ⊎ {b'}, e'[b'/x, (b b') · v'/x'] ⇓ v, a''   ⟹   a, let <<x>>x' = e in e' ⇓ v, a''    (where b' ∈ 𝔸 \ a')

(if)        a, e ⇓ b, a'    a', e' ⇓ b', a''    (b = b' ⇒ a'', e1 ⇓ v, a''')    (b ≠ b' ⇒ a'', e2 ⇓ v, a''')   ⟹   a, if e = e' then e1 else e2 ⇓ v, a'''

(match)     a, e ⇓ Cj(vj), a'    a', ej[vj/xj] ⇓ v, a''   ⟹   a, match e with · · · | Ck(xk) -> ek | · · · ⇓ v, a''

Figure 3.4: Big-step semantics.

atms(e'[ψ]) ⊆ a. Now if a, <<e[ψ]>>e'[ψ] ⇓ <<v>>v', a'', then by Lemma 3.3.1 there exists a' with a'' ⊇ a' ⊇ a such that a, e[ψ] ⇓ v, a' and a', e'[ψ] ⇓ v', a''. Therefore, atms(e'[ψ]) ⊆ a'. By applying the induction hypotheses we can then deduce that ` v : name and ` v' : τ. It follows that ` <<v>>v' : <<name>>τ.

Case (swap). Take atom-sets a''' ⊇ a ⊇ atms(swap e[ψ], e'[ψ] in e''[ψ]); now assume that ` swap e[ψ], e'[ψ] in e''[ψ] : τ and a, swap e[ψ], e'[ψ] in e''[ψ] ⇓ v, a'''. Then Lemma 3.3.1 implies that there exist further sets of atoms a', a'' satisfying the properties a ⊆ a' ⊆ a'' ⊆ a''', atms(e[ψ]) ⊆ a, atms(e'[ψ]) ⊆ a' and atms(e''[ψ]) ⊆ a''. By the induction hypotheses, these come along with values b, b' and v satisfying ` b : name, ` b' : name and ` v : τ. Now we apply Lemma 3.2.4 to get that ` (b b') · v : τ.

Case (let-abst). Consider a'' ⊇ a ⊇ atms(let <<x>>x' = e[ψ] in e'[ψ]). Assume that we have the judgements ` let <<x>>x' = e[ψ] in e'[ψ] : τ and a, let <<x>>x' = e[ψ] in e'[ψ] ⇓ v, a''. Then the typing rules together with Lemma 3.2.3 give us that ` e[ψ] : <<name>>τ' and ` e'[ψ, x ↦ b', x' ↦ v'] : τ, for any atom b' and a closed v' of type τ'. One induction hypothesis together with Lemma 3.3.1 now tells us that there exists a set of atoms a' with a' ⊇ a, a, e[ψ] ⇓ <<b>>v', a' and ` v' : τ'. Taking any b' ∈ 𝔸 \ a', Lemma 3.2.4 then tells us that ` (b b') · v' : τ'. From earlier we now deduce that ` e'[ψ, x ↦ b', x' ↦ (b b') · v'] : τ. Since atms(e'[b'/x, (b b') · v'/x']) ⊆ a' ⊎ {b'} then ` ψ, x ↦ b', x' ↦ (b b') · v' : Γ, x : name, x' : τ'. Therefore the second induction hypothesis together with Lemma 3.3.1 tells us that there exists a'' ⊇ a' ⊎ {b'} with a' ⊎ {b'}, e'[b'/x, (b b') · v'/x'] ⇓ v, a'' and furthermore ` v : τ.

F ::=   Ck([–])
      | ([–], e)
      | (v, [–])
      | <<[–]>>e
      | <<v>>[–]
      | swap [–], e in e
      | swap v, [–] in e
      | swap v, v in [–]
      | if [–] = e then e else e
      | if v = [–] then e else e
      | [–] e
      | v [–]
      | let x = [–] in e
      | let (x, x') = [–] in e
      | let <<x>>x' = [–] in e
      | match [–] with · · · | Ck(xk) -> ek | · · ·

Figure 3.5: Evaluation frames F.

3.4 Dynamic semantics via frame stacks

In this section we introduce an operational semantics for Mini-FreshML which draws on the theory of frame stack semantics[48]. This has two distinct advantages over the previous big-step semantics:

• it provides a structurally-inductive definition of what it means for the evaluation of an expression to terminate;

• it enables reasoning about evaluation properties without using explicit name sets.

For our application, the second of these properties turns out to be the important one. For when developing a denotational semantics for Mini-FreshML, it is much easier to work in a world where the dynamically-allocated names are kept implicit. The domain theory developed in Chapter 4 will enable us to do this in the denotational semantics, which will match up well with this new operational model.

3.4.1 A termination relation

What we are going to do is consider the termination behaviour of expressions e in a frame stack S, defined as follows.

Definition 3.4.1 (Frames and frame stacks). Let an evaluation frame F be as in Figure 3.5. A frame stack S consists of a possibly-empty list of frames like so: S ::= [] | S ◦ F. Write S @ S' for the frame stack consisting of S' appended to S.

Each evaluation frame acts like a fragment of an evaluation context; we can think of a frame stack as representing one whole context. Frame stacks are assigned types using a typing relation `s. This is a set of 4-tuples (Γ, S, τ, τ') where Γ is a typing context, S is a frame stack and τ, τ' are types. We write Γ `s S : τ ⊸ τ' iff (Γ, S, τ, τ') is in the relation, which is inductively defined by the axiom and rules in Figure 3.6. Informally, a frame stack of type τ ⊸ τ' accepts an argument of type τ in the hole and evaluates to a value of type τ'. Often we will not concern ourselves with the result type, simply writing types such as τ ⊸ instead.

In Figure 3.7 we give a set of axioms and rules which define a binary relation between frame stacks and expressions known as the termination relation2. In those Figures, e, e', . . . stand for expressions, v, v', . . . stand for canonical forms and a, a', . . . stand for atoms. The definition is split into two parts according to whether we are currently considering a canonical form or a non-canonical expression. For clarity, write ⟨S, e⟩↓ iff the pair (S, e) lies in the relation. In this case, we say that 'e terminates when evaluated in stack S'.

Note how the definition of the termination relation is both structurally inductive and also free of explicit atom-sets: the frame stacks encapsulate all the necessary information about

2 We could consider an abstract machine to interpret Mini-FreshML terms which evaluates using frame stacks in a way corresponding to the termination relation ⟨S, e⟩↓. However, the semantics which we give here is not directly suitable for this purpose since, in order to make transitions deterministic, it would require runtime tests to determine if arbitrary expressions were in canonical form. A possible solution to this problem would be to use a CK/VK machine in the style of Levy[28] and we refer the reader to this work for details.

(empty)     Γ `s [] : τ ⊸ τ

(con)       Γ `s S : δ ⊸ τ   ⟹   Γ `s S ◦ Ck([–]) : σk ⊸ τ

(pair-l)    Γ `s S : τ1 × τ2 ⊸ τ    Γ ` e : τ2   ⟹   Γ `s S ◦ ([–], e) : τ1 ⊸ τ

(pair-r)    Γ `s S : τ1 × τ2 ⊸ τ    Γ ` v : τ1   ⟹   Γ `s S ◦ (v, [–]) : τ2 ⊸ τ

(abst-l)    Γ `s S : <<name>>τ ⊸ τ'    Γ ` e : τ   ⟹   Γ `s S ◦ <<[–]>>e : name ⊸ τ'

(abst-r)    Γ `s S : <<name>>τ ⊸ τ'    Γ ` v : name   ⟹   Γ `s S ◦ <<v>>[–] : τ ⊸ τ'

(swap-1)    Γ `s S : τ ⊸ τ'    Γ ` e' : name    Γ ` e'' : τ   ⟹   Γ `s S ◦ swap [–], e' in e'' : name ⊸ τ'

(swap-2)    Γ `s S : τ ⊸ τ'    Γ ` v : name    Γ ` e'' : τ   ⟹   Γ `s S ◦ swap v, [–] in e'' : name ⊸ τ'

(swap-3)    Γ `s S : τ ⊸ τ'    Γ ` v : name    Γ ` v' : name   ⟹   Γ `s S ◦ swap v, v' in [–] : τ ⊸ τ'

(app-l)     Γ `s S : τ' ⊸ τ''    Γ ` e : τ   ⟹   Γ `s S ◦ [–] e : (τ → τ') ⊸ τ''

(app-r)     Γ `s S : τ' ⊸ τ''    Γ ` v : τ → τ'   ⟹   Γ `s S ◦ v [–] : τ ⊸ τ''

(if-l)      Γ `s S : τ ⊸ τ'    Γ ` e : name    Γ ` e1 : τ    Γ ` e2 : τ   ⟹   Γ `s S ◦ if [–] = e then e1 else e2 : name ⊸ τ'

(if-r)      Γ `s S : τ ⊸ τ'    Γ ` v : name    Γ ` e1 : τ    Γ ` e2 : τ   ⟹   Γ `s S ◦ if v = [–] then e1 else e2 : name ⊸ τ'

(let)       Γ `s S : τ' ⊸ τ''    Γ, x : τ ` e : τ'   ⟹   Γ `s S ◦ let x = [–] in e : τ ⊸ τ''

(let-pair)  Γ `s S : τ ⊸ τ'    Γ, x : τ1, x' : τ2 ` e : τ   ⟹   Γ `s S ◦ let (x, x') = [–] in e : τ1 × τ2 ⊸ τ'

(let-abst)  Γ `s S : τ' ⊸ τ''    Γ, x : name, x' : τ ` e : τ'   ⟹   Γ `s S ◦ let <<x>>x' = [–] in e : <<name>>τ ⊸ τ''

(match)     Γ `s S : τ ⊸ τ'    for all 1 ≤ i ≤ K, Γ, xi : σi ` ei : τ   ⟹   Γ `s S ◦ match [–] with · · · | Ck(xk) -> ek | · · · : δ ⊸ τ'

Figure 3.6: Typing rules for frame stacks.

the finitely many atoms previously generated. When we need to generate a new atom, we always know that there will be cofinitely many left; and furthermore, by the next lemma then we know that any such one will do. This is an example of the ‘some/any’ duality expressed by

use of the freshness quantifier И in FM-set theory[20, 18].

Lemma 3.4.2 (Equivariance for the termination relation). For atoms a, a', ⟨S, e⟩↓ implies ⟨(a a') · S, (a a') · e⟩↓, where (a a') · S is that frame stack obtained by recursively swapping all occurrences of a and a' in S.

Lemma 3.4.3 (Splitting frame stacks). If ⟨S @ S', e⟩↓ then ⟨S', e⟩↓. Therefore in particular, ⟨S, e⟩↓ implies that ⟨[], e⟩↓.

As an example of Lemma 3.4.2, suppose that we wish to determine whether the fresh expression terminates in some frame stack S of type name ⊸ . We see from the definition of the termination relation that this holds just when ⟨S, a⟩↓ does, for some a not occurring in the atoms of S. Therefore by Lemma 3.4.2 it follows that for any other such a', ⟨(a a') · S, (a a') · a⟩↓ holds; so ⟨S, a'⟩↓ does also.

3.4.2 Relationship to the big-step semantics

We now wish to establish a link between the frame stack semantics and the big-step semantics.

Lemma 3.4.4. Let e be a closed expression of type τ. Then a, e ⇓ v, a' implies that for all frame stacks S of type τ ⊸ , atms(S) ⊆ a ∧ ⟨S, v⟩↓ implies that ⟨S, e⟩↓.

Proof. By showing that the property

Φ(a, e, v, a') ≝ ∀S : τ ⊸ . ` e : τ ∧ atms(S) ⊆ a ∧ ⟨S, v⟩↓ ⇒ ⟨S, e⟩↓

is closed under the axioms and rules inductively defining the evaluation relation ⇓. Note that Lemma 3.3.1 implies that we can always take a' ⊇ a and atms(v) ⊆ a'; moreover, we always have that atms(e) ⊆ a since ⇓ is only defined in the cases when this holds. Let us now give the two interesting cases: those for fresh name creation and deconstruction of abstraction values. The remaining ones are routine.

Case (fresh). For some finite set of atoms a and some b ∈ 𝔸 \ a we want Φ(a, fresh, b, a ⊎ {b}). The assumption atms(S) ⊆ a implies that b ∉ atms(S); combining this with the other assumption that ⟨S, b⟩↓ gives us ⟨S, fresh⟩↓.

Case (let-abst). We wish to show that Φ(a, let <<x>>x' = e in e', v, a'') holds, for finite sets of atoms a ⊆ a' ⊆ a''. As induction hypotheses we have that Φ(a, e, <<b>>v', a') and Φ(a', e'[b'/x, (b b') · v'/x'], v, a'') for b ∈ 𝔸 and any b' ∈ 𝔸 \ a'. Expanding these we obtain as assumptions that

∀S1. atms(S1) ⊆ a ∧ ⟨S1, <<b>>v'⟩↓ ⇒ ⟨S1, e⟩↓; and    (3.1)
∀S2. atms(S2) ⊆ a' ∧ ⟨S2, v⟩↓ ⇒ ⟨S2, e'[b'/x, (b b') · v'/x']⟩↓.    (3.2)

For S with atms(S) ⊆ a we wish to show that ⟨S, let <<x>>x' = e in e'⟩↓ under the assumption that ⟨S, v⟩↓. It therefore suffices to show that ⟨S ◦ let <<x>>x' = [–] in e', e⟩↓ holds. This follows from (3.1) if we can show

atms(S ◦ let <<x>>x' = [–] in e') ⊆ a; and    (3.3)
⟨S ◦ let <<x>>x' = [–] in e', <<b>>v'⟩↓.    (3.4)

We have atms(let <<x>>x' = e in e') ⊆ a (from the earlier observation before the first proof case), which shows (3.3) since atms(S) ⊆ a holds by assumption. To see (3.4) it suffices to show that

⟨S, e'[b''/x, (b b'') · v'/x']⟩↓    (3.5)

holds, where b'' ∈ 𝔸 \ atms(S, b, v', e'). Since a' ⊇ a then atms(S) ⊆ a'. But atms(<<b>>v') ⊆ a' (again from the earlier observations) and therefore atms(b, v') ⊆ a'. However, since we have that atms(let <<x>>x' = e in e') ⊆ a then atms(e') ⊆ a'. Therefore it is certainly the case that b'' ∈ 𝔸 \ a' implies b'' ∈ 𝔸 \ atms(S, b, v', e'). We may therefore choose any b'' ∈ 𝔸 \ a' and attempt to apply (3.2) (using this b'' as the b' in that hypothesis) to conclude (3.5). It therefore remains to show that firstly, atms(S) ⊆ a' and secondly, ⟨S, v⟩↓. However the first of these is satisfied since a' ⊇ a and the second of these follows by earlier assumption.

Lemma 3.4.5. Let e be a closed expression of type τ and let S be a closed frame stack of type τ ⊸ . Then ⟨S, e⟩↓ implies that for all a ⊇ atms(e) there exists a value v and some a' ⊇ a such that a, e ⇓ v, a' ∧ ⟨S, v⟩↓.

Proof. By showing that the property

Φ(S, e) ≝ ` e : τ ∧ `s S : τ ⊸ ∧ ⟨S, e⟩↓ ⇒ ∀a. a ⊇ atms(e) ⇒ ∃v, a'. a' ⊇ a ∧ a, e ⇓ v, a' ∧ ⟨S, v⟩↓

(empty)        ⟨[], v⟩↓

(con)          ⟨S, Ck(v)⟩↓   ⟹   ⟨S ◦ Ck([–]), v⟩↓

(pair-l)       ⟨S ◦ (v, [–]), e⟩↓   ⟹   ⟨S ◦ ([–], e), v⟩↓

(pair-r)       ⟨S, (v', v)⟩↓   ⟹   ⟨S ◦ (v', [–]), v⟩↓

(abst-l)       ⟨S ◦ <<v>>[–], e⟩↓   ⟹   ⟨S ◦ <<[–]>>e, v⟩↓

(abst-r)       ⟨S, <<v>>v'⟩↓   ⟹   ⟨S ◦ <<v>>[–], v'⟩↓

(swap-1)       ⟨S ◦ swap a, [–] in e'', e'⟩↓   ⟹   ⟨S ◦ swap [–], e' in e'', a⟩↓

(swap-2)       ⟨S ◦ swap a, a' in [–], e''⟩↓   ⟹   ⟨S ◦ swap a, [–] in e'', a'⟩↓

(swap-3)       ⟨S, (a a') · v⟩↓   ⟹   ⟨S ◦ swap a, a' in [–], v⟩↓

(app-l)        ⟨S ◦ v [–], e⟩↓   ⟹   ⟨S ◦ [–] e, v⟩↓

(app-r)        ⟨S, e[v/f, v'/x]⟩↓   ⟹   ⟨S ◦ v [–], v'⟩↓    (if v = fun f(x) = e)

(let)          ⟨S, e[v/x]⟩↓   ⟹   ⟨S ◦ let x = [–] in e, v⟩↓

(let-pair)     ⟨S, e[v/x, v'/x']⟩↓   ⟹   ⟨S ◦ let (x, x') = [–] in e, (v, v')⟩↓

(let-abst)     ⟨S, e[a'/x, ((a a') · v)/x']⟩↓   ⟹   ⟨S ◦ let <<x>>x' = [–] in e, <<a>>v⟩↓    (a' ∈ 𝔸 \ atms(S, a, v, e))

(if-l)         ⟨S ◦ if a = [–] then e1 else e2, e'⟩↓   ⟹   ⟨S ◦ if [–] = e' then e1 else e2, a⟩↓

(if-r-1)       ⟨S, e1⟩↓   ⟹   ⟨S ◦ if a = [–] then e1 else e2, a'⟩↓    (if a = a')

(if-r-2)       ⟨S, e2⟩↓   ⟹   ⟨S ◦ if a = [–] then e1 else e2, a'⟩↓    (if a ≠ a')

(match)        v = Ck(vk) ∧ ⟨S, ek[vk/xk]⟩↓   ⟹   ⟨S ◦ match [–] with C1(x1) -> e1 | · · · | CK(xK) -> eK, v⟩↓

(nc-con)       ⟨S ◦ Ck([–]), e⟩↓   ⟹   ⟨S, Ck(e)⟩↓

(nc-fresh)     ⟨S, a⟩↓   ⟹   ⟨S, fresh⟩↓    (a ∈ 𝔸 \ atms(S))

(nc-pair)      ⟨S ◦ ([–], e'), e⟩↓   ⟹   ⟨S, (e, e')⟩↓

(nc-abst)      ⟨S ◦ <<[–]>>e', e⟩↓   ⟹   ⟨S, <<e>>e'⟩↓

(nc-swap)      ⟨S ◦ swap [–], e' in e'', e⟩↓   ⟹   ⟨S, swap e, e' in e''⟩↓

(nc-app)       ⟨S ◦ [–] e', e⟩↓   ⟹   ⟨S, e e'⟩↓

(nc-let)       ⟨S ◦ let x = [–] in e', e⟩↓   ⟹   ⟨S, let x = e in e'⟩↓

(nc-let-pair)  ⟨S ◦ let (x, x') = [–] in e', e⟩↓   ⟹   ⟨S, let (x, x') = e in e'⟩↓

(nc-let-abst)  ⟨S ◦ let <<x>>x' = [–] in e', e⟩↓   ⟹   ⟨S, let <<x>>x' = e in e'⟩↓

(nc-if)        ⟨S ◦ if [–] = e' then e1 else e2, e⟩↓   ⟹   ⟨S, if e = e' then e1 else e2⟩↓

(nc-match)     ⟨S ◦ match [–] with C1(x1) -> e1 | · · · | CK(xK) -> eK, e⟩↓   ⟹   ⟨S, match e with C1(x1) -> e1 | · · · | CK(xK) -> eK⟩↓

Figure 3.7: Termination relation.

is closed under the axioms and rules inductively defining the termination relation. The proof splits into two parts, depending on whether or not e is a canonical form. It is easily seen that in the case where e is indeed canonical, then Φ(S, e) holds trivially. When e is not canonical, we must examine individual cases. Here we give the same two cases as we did for Lemma 3.4.4; the others are again routine. The two cases we pick highlight how the equivariance property of the termination relation (viz. Lemma 3.4.2) must be exploited during calculations.

Case (nc-fresh). Assume ⟨S, fresh⟩↓ and take any finite set of atoms a. Then a ⊇ atms(fresh) and for any b ∉ atms(S), ⟨S, b⟩↓ holds. Now take any b' ∈ 𝔸 \ atms(S) \ a \ {b}. Then (b b') · S = S and by Lemma 3.4.2, ⟨S, b'⟩↓ holds. Since b' ∉ a it also follows that a, fresh ⇓ b', a ⊎ {b'} as required.

Case (nc-let-abst). Make the assumptions that ⟨S, let <<x>>x' = e in e'⟩↓ and a ⊇ atms(let <<x>>x' = e in e'). The termination judgement must have been derived by knowing ⟨S ◦ let <<x>>x' = [–] in e', e⟩↓; since a ⊇ atms(e) it follows (by the first induction hypothesis) that there exists a value v'' and some a' ⊇ a with a, e ⇓ v'', a' and a termination judgement ⟨S ◦ let <<x>>x' = [–] in e', v''⟩↓; Lemma 3.3.4 tells us that we can write v'' as <<b>>v'. This termination judgement must in turn have been derived by knowing that ⟨S, e'[b'/x, (b b') · v'/x']⟩↓, for some b' ∈ 𝔸 \ atms(S, e', b, v'). Taking any b'' ∈ 𝔸 \ atms(S, e', b, b', v') \ a', Lemma 3.4.2 tells us that ⟨S, e'[b''/x, (b b'') · v'/x']⟩↓. It follows (by the second induction hypothesis) that there exists a v and some a'' ⊇ a' with a', e'[b''/x, (b b'') · v'/x'] ⇓ v, a'' and ⟨S, v⟩↓. It follows that the evaluation judgement a, let <<x>>x' = e in e' ⇓ v, a'' holds since b'' ∉ a'.

The following theorem, with associated corollary, now follows immediately by combining Lemmata 3.4.4 and 3.4.5.

Theorem 3.4.6 (Equivalence of dynamic semantics). Let S be a frame stack of type τ ⊸ and e be a closed expression of type τ. Then ⟨S, e⟩↓ holds iff there exists a ⊇ atms(S, e) with a, e ⇓ v, a' and ⟨S, v⟩↓, for some value v and a finite set of atoms a' ⊇ a.

Corollary 3.4.7. Let e be a closed expression of type τ. Then ⟨[], e⟩↓ holds just when there exists a' ⊇ a ⊇ atms(e) and a value v with a, e ⇓ v, a'.

3.5 Environment style semantics

In this section, we very briefly review another variety of semantics: that given in environment style. Such a semantics involves the propagation of what we shall call a value environment: a finite map from value identifiers to values. The space of values may be disjoint from the space of expressions and the environments are used to look up values of previously-introduced identifiers. As an example, the evaluation rule

a, e ⇓ v', a'    a', e'[v'/x] ⇓ v, a''   ⟹   a, let x = e in e' ⇓ v, a''

from §3.3 could be written in environment style as follows:

a, E ` e ⇓ v', a'    a', E[x ↦ v'] ` e' ⇓ v, a''   ⟹   a, E ` let x = e in e' ⇓ v, a''

where E stands for a value environment and E[x ↦ v'] is that value environment mapping x to v' and otherwise acting as E. It should be evident that an environment style semantics corresponds more closely to how an implementation might work. A classic example of such a semantics is that given by Milner et al.[36] in the definition of the Standard ML language.

For the most part, this thesis uses the familiar 'substituted-in' style of semantics which we have seen earlier in this chapter. (In that setting, the canonical forms must be a subset of the expressions of the language.) There is no particular reason for choosing one way or the other for the forthcoming correctness results. However, in Chapter 7 we make use of an environment style semantics in a setting where it greatly simplifies the presentation.
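To make the comparison concrete, the following is a minimal sketch of ours (with an assumed value representation and a mutable counter standing in for the atom state) of how an environment-style evaluator for a tiny fragment might be organised in ordinary OCaml.

type ident = string
type exp = Var of ident | Fresh | Let of ident * exp * exp
type value = VAtom of int

module Env = Map.Make (String)   (* value environments E *)

let next_atom = ref 0            (* stands in for the set of allocated atoms *)

let rec eval (env : value Env.t) = function
  | Var x -> Env.find x env                       (* look x up in E *)
  | Fresh -> incr next_atom; VAtom !next_atom     (* allocate a fresh atom *)
  | Let (x, e, e') ->
      let v' = eval env e in
      eval (Env.add x v' env) e'                  (* evaluate e' in E[x |-> v'] *)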

C ::=   [–]                                              hole
      | x                                                value identifier
      | ()                                               unit
      | a                                                atom
      | Ck(C)                                            data construction
      | (C, C)                                           pairing
      | fresh                                            fresh name creation
      | <<C>>C                                           abstraction
      | swap C, C in C                                   atom-swapping
      | fun x(x) = C                                     recursive function
      | C C                                              function application
      | if C = C then C else C                           conditional
      | let x = C in C                                   value binding
      | let (x, x) = C in C                              pair deconstruction
      | let <<x>>x = C in C                              abstraction deconstruction
      | match C with C1(x1) -> C1 | · · · | CK(xK) -> CK  data value deconstruction

Figure 3.8: Contexts.

3.6 Notions of observational equivalence

Soon we will be talking about the observational equivalence of Mini-FreshML expressions: when does one expression behave the same as another, in some sense? At this stage it is appropriate to pick a particular notion of equivalence with which to work. We are actually going to use what is regarded as something of a benchmark for observational equivalence: the notion of contextual equivalence3. Two expressions are said to be contextually equivalent if they may be freely exchanged within the source text of some enclosing program without affecting the program's observable results.

Definition 3.6.1 (Contexts). A context C consists of an expression containing one or more 'holes', as given in Figure 3.8. For contexts C and C', write C[C'] for the context formed by inserting C' into the holes in C. For an expression e, write C[e] for the expression formed by inserting e into the holes in C. Note that both of these operations4 may involve binders in C capturing identifiers in C' (resp. e); thus, contexts are not identified up to α-conversion.

Definition 3.6.2. For frame stacks S and frames F then define Ctxt(–) by induction on the size of S as follows.

Ctxt([]) ≝ [–]        Ctxt(S ◦ F) ≝ (Ctxt(S))[F].

Lemma 3.6.3. For a closed frame stack S of type τ ⊸ and a closed expression e of type τ,

⟨S, e⟩↓ ⇔ ⟨[], (Ctxt(S))[e]⟩↓.    (3.6)

Proof. By a monotonous induction on the structure of S, whose details we omit.

Definition 3.6.4 (Contextual preorder, contextual equivalence). The 4-tuple (Γ, e, e', τ) is said to be in the type-respecting relation of contextual preorder iff e and e' may be assigned type τ in typing context Γ and for all contexts C such that C[e] and C[e'] are typeable closed expressions,

(∀a ⊇ atms(C[e]). ∃a' ⊇ a. ∃v ∈ Valτ. a, C[e] ⇓ v, a') ⇒ (∀a ⊇ atms(C[e']). ∃a' ⊇ a. ∃v' ∈ Valτ. a, C[e'] ⇓ v', a').

3 Specified in a traditional way, no less: for an arguably more modern exposition see [44], for example.
4 Such operations are a good example of metaprogramming tasks which are significantly simplified by the presence of freshness features in the metalanguage.

Write Γ ` e ≺ctx e' : τ iff (Γ, e, e', τ) is in the relation. The relation of contextual equivalence, denoted by Γ ` e ≈ctx e' : τ, is the symmetrisation of contextual preorder. Let us just write e ≈ctx e' iff e and e' are closed contextually-equivalent expressions.

It is an immediate consequence of Theorem 3.4.6 that we can characterise contextual equivalence using frame stack semantics in the following way.

Lemma 3.6.5 (≈ctx in terms of termination). The contextual preorder Γ ` e ≺ctx e' : τ holds just when for all contexts C such that C[e] and C[e'] are typeable closed expressions, ⟨[], C[e]⟩↓ implies that ⟨[], C[e']⟩↓. Therefore Γ ` e ≈ctx e' : τ just when for all such contexts C, ⟨[], C[e]⟩↓ ⇔ ⟨[], C[e']⟩↓.

Now whilst contextual equivalence is a powerful notion which is convenient for establishing further results, it can be difficult to work with due to the necessity to quantify over all possible contexts. We manage to get around such difficulties by calculating with another notion of equivalence: Mason and Talcott's notion of 'CIU-equivalence'[29]. In §5.5 we will prove that this coincides with Mini-FreshML contextual equivalence.

Definition 3.6.6 (CIU-preorder, CIU-equivalence). The 4-tuple (Γ, e, e', τ) is said to be in the type-respecting relation of CIU-preorder iff Γ ` e : τ, Γ ` e' : τ and for all substitutions ψ with ` ψ : Γ and all closed frame stacks S of type τ ⊸ then ⟨S, e[ψ]⟩↓ ⇒ ⟨S, e'[ψ]⟩↓. Write Γ ` e ≺ciu e' : τ iff (Γ, e, e', τ) is in the relation. The symmetrisation is called CIU-equivalence, written Γ ` e ≈ciu e' : τ. Write e ≈ciu e' iff e and e' are closed CIU-equivalent expressions.

3.7 Correctness for Mini-FreshML

In this section we shall examine some desirable Mini-FreshML correctness properties which will be proved later in Chapter 5. These properties centre around the behaviour of Mini-FreshML expressions which are being used to represent, or compute, values of some object-language syntax. For simplicity, we will consider an object language whose terms t ∈ Λ are those of the untyped λ-calculus,

t ::= x | λx. t | t t

where x ranges over the countably infinite set of variables VId. Whilst this example is simple, it captures the essential features of object languages specified by a certain notion of binding signature[17, 18] to which our theory can be generalised. Providing the operations of the binding signature do not involve function space constructions then the syntax of such a language may be specified by an algebraic datatype5. In this particular case, the suitable datatype declaration is:

type δ = C1 of name | C2 of <<name>>δ | C3 of δ × δ.

Let us write Var for C1, Lam for C2, App for C3 and lam for δ. Furthermore, write vars(t) for the finite set of all variables occurring within a term t. Extend this notation to handle multiple terms in the obvious manner. Given value identifiers x, x' ∈ VId and a λ-term t, write (x x') · t for that term obtained by swapping all occurrences of x and x' throughout t. We shall be particularly concerned with the relationships which exist between Mini-FreshML expressions and α-equivalence classes of object-level terms. We use the characterisation of α-equivalence given by Gabbay and Pitts[20], which is defined by structural recursion on λ-terms as follows.

Definition 3.7.1 (α-equivalence).

x ∈ VId   ⟹   x ≡α x    (3.7)

(x x'') · t ≡α (x' x'') · t'   ⟹   λx. t ≡α λx'. t'    (x'' ∈ VId \ vars(x, x', t, t'))    (3.8)

5 That is, one not involving the function space constructor: note that this notion of 'algebraic' differs from the usual one in the sense that we allow <<name>>τ types to appear.

t1 ≡α t1'    t2 ≡α t2'   ⟹   t1 t2 ≡α t1' t2'    (3.9)
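The swapping-based characterisation lends itself directly to an implementation; the following sketch of ours (in ordinary OCaml, with object-level variables as strings) follows rules (3.7)–(3.9) literally, choosing in the abstraction case a variable fresh for everything in sight.

type lam_term = V of string | Lam of string * lam_term | App of lam_term * lam_term

(* vars(t): all variables occurring within t, binders included *)
let rec vars = function
  | V x -> [x]
  | Lam (x, t) -> x :: vars t
  | App (t1, t2) -> vars t1 @ vars t2

(* (x x') . t: swap all occurrences of x and x' throughout t *)
let rec swap x x' = function
  | V y -> V (if y = x then x' else if y = x' then x else y)
  | Lam (y, t) ->
      let y' = if y = x then x' else if y = x' then x else y in
      Lam (y', swap x x' t)
  | App (t1, t2) -> App (swap x x' t1, swap x x' t2)

(* choose a variable not occurring in the given list *)
let fresh_for avoid =
  let rec go i =
    let v = "_v" ^ string_of_int i in
    if List.mem v avoid then go (i + 1) else v
  in
  go 0

let rec alpha_eq t t' =
  match (t, t') with
  | V x, V x' -> x = x'
  | Lam (x, u), Lam (x', u') ->
      (* rule (3.8): swap each binder with a common variable chosen outside
         vars(x, x', u, u') and compare the resulting bodies *)
      let x'' = fresh_for ((x :: x' :: vars u) @ vars u') in
      alpha_eq (swap x x'' u) (swap x' x'' u')
  | App (t1, t2), App (t1', t2') -> alpha_eq t1 t1' && alpha_eq t2 t2'
  | _ -> false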

Note that α-equivalence is closed under swapping: if t ≡α t' then (x x') · t ≡α (x x') · t' for any value identifiers x, x' ∈ VId. Such identities do not hold when one uses arbitrary renamings as opposed to swappings. For example the open terms λx. x y and λz. z y are α-equivalent, but given a free identifier x we obtain that (λx. x y)[y ↦ x] = λx. x x ≢α λz. z x = (λz. z y)[y ↦ x] (writing [y ↦ x] for the non-capture-avoiding substitution of x for y). However, (x y) · (λx. x y) = λy. y x ≡α λz. z x = (x y) · (λz. z y). Good properties like this did indeed inspire development of the original permutation model of name binding[20].

Given some λ-term t, we may encode it as a Mini-FreshML expression by using the following translation function.

Definition 3.7.2 (Translation to expressions). For each λ-term t, define a Mini-FreshML expression [t]e ∈ Expδ by induction on the structure of t as follows.

[x]e ≝ Var(x)
[λx. t]e ≝ let x = fresh in Lam(<<x>>[t]e)
[t t']e ≝ App([t]e, [t']e)

Recall we identify expressions up to α-equivalence of bound value identifiers. If t is a closed (resp. open) term, then [t]e is a closed (resp. open) value. 

Lemma 3.7.3. [–]e is equivariant: if x, x' ∈ VId then (x x') · [t]e = [(x x') · t]e.

Using our denotational semantics we will be able to prove the following facts about Mini-FreshML. The first of these is the most important result of all, for it shows how contextual equivalence at the meta-level coincides with α-equivalence at the object level.

Fact 3.7.4 (Correctness for expressions). For λ-terms t, t' with free variables contained in a finite set {x0, · · · , xn} ⊆ VId,

t ≡α t' ⇔ {x0 : name, · · · , xn : name} ` [t]e ≈ctx [t']e : lam.

Fact 3.7.5 (Form of expressions). For a closed Mini-FreshML expression e of type lam and a divergent term Ω, then either e ≈ctx Ω or there exists a closed value v of type lam with

e ≈ctx let x1 = fresh in · · · let xn = fresh in v, for value identifiers {x1, . . . , xn}. These results are neither quick nor straightforward to prove. In order to show their correctness, we shall develop a new variety of domain theory and then exploit it to obtain a denotational semantics for Mini-FreshML. By relating this to the operational semantics, we will be able to derive the desired results at the end of Chapter 5.

4 A domain theory for names

‘There is an infinite set A that is not too big.’ —von Neumann

IN THIS CHAPTER, we develop a variant of classical domain theory[3, 52] which is good for reasoning about names and name binding. This theory forms the basis of our denotational model of Mini-FreshML in Chapter 5. The domain theory centres around the permutation model of name binding developed by Pitts and Gabbay[20, 18], although there is an important difference between their presentation and that given here. For those authors take as their foundation an axiomatisation of FM-set theory and develop constructions fully internalised within that world. In this thesis, we take a somewhat less radical approach and make constructions in standard ZF-set theory which parallel those in FM-set theory. This enables us to exploit the novel features offered by FM whilst still retaining a familiar logical foundation.

FM-domain theory provides a relatively uncomplicated setting for reasoning about names and name binding. In particular, all dependencies on parameterising name sets may be made implicit—removing any need to index over 'possible worlds' as seen so widely in approaches based on functor category semantics[17, 66]. The various constructions which we make in FM-domain theory are generally quite straightforward (although occasionally rather subtle) and benefit from not being complicated by such indexing. Although as we shall observe later, working in FM-domain theory is equivalent to working in a certain category of functors.

4.1 FM-sets, FM-cpos and FM-cppos

For simplicity, we work with respect to a fixed countably infinite set 𝔸 of atoms. The theory could, however, be generalised to handle different sorts of atoms rather like one sees in Fresh O'Caml and in FMS-set theory[20].

Definition 4.1.1 (Permutations and actions). Let a permutation π be a member of the permutation group perm(𝔸) on 𝔸. Such a π is therefore a bijection from 𝔸 to itself. Write id for the identity permutation and π' ◦ π for that permutation acting as π and then as π'. An action of the group perm(𝔸) on a set D is specified by a binary operator ·D : perm(𝔸) × D → D satisfying:

id ·D d = d    and    π ·D (π' ·D d) = (π ◦ π') ·D d

for all d ∈ D. Call a set equipped with such an action a perm(𝔸)-set and omit the D subscript of ·D when the meaning is clear.

For example, the set of atoms 𝔸 has a canonical permutation action (π, a) ↦ π(a). One particular set may determine various perm(𝔸)-sets by varying the action: for example, we could construct another perm(𝔸)-set based on 𝔸 by equipping it with the null action (π, a) ↦ a.

We shall often be concerned with those permutations π which consist of a single transposition of two atoms. Let us write (a a') for that permutation exchanging a and a' and acting as the identity otherwise. The following lemma then follows immediately.

Lemma 4.1.2. For atoms a, b, c, d then (a b) · (c d) · x = (c d) · ((c d) · a (c d) · b) · x and (a b) · (c d) · x = ((a b) · c (a b) · d) · (a b) · x. 

Definition 4.1.3. For a finite set a ⊆ 𝔸 and a permutation π, say that π fixes a pointwise iff for all a ∈ a, π(a) = a.


Definition 4.1.4 (Finite support). Let D be a perm(𝔸)-set and take some d ∈ D. Say that d is finitely supported (with respect to the action ·D) iff there exists a finite set a ⊆ 𝔸 such that each permutation π fixing a pointwise also fixes d (that is to say, π · d = d).

It is a consequence of this definition that each finitely-supported element d of a perm(𝔸)-set possesses a least finite support[20, Proposition 3.4]. Let us write supp(d) for this least finite support. When talking about multiple elements of some perm(𝔸)-set(s) at one time, we write supp(d, d') etc. for the union of their supports in the obvious manner. It is in fact possible to express the support of an element d ∈ D as an exact formula, thus:

supp(d) ≝ {a ∈ 𝔸 | {b ∈ 𝔸 | (a b) · d ≠ d} is not finite}

although this alternative definition is somewhat difficult to work with.

The support of an element d may be thought of as 'all of the atoms involved in d's construction'. This corresponds with the fact that for atoms a, a' not in the support of d, (a a') · d = d. We say that such atoms are fresh for d, sometimes written a # d.

Definition 4.1.5 (FM-sets). Let D be a perm(𝔸)-set. Then D forms an FM-set iff all of its elements are finitely supported.

Definition 4.1.6 (Finitely supported and equivariant subsets). Given an FM-set D, say that S ⊆ D is finitely supported iff there exists a finite set of atoms a such that for all π which fix a pointwise, {π · s | s ∈ S} = S. Say that S is an equivariant subset if we can take a to be empty. Write S ⊆eq D just when S is an equivariant subset of D.

Definition 4.1.7 (Powersets). Given an FM-set D write P(D) for the powerset of D, whose members are all finitely-supported subsets of D. This becomes an FM-set when equipped with the action π · S ≝ {π · s | s ∈ S} for all S ⊆ D.

Remark 4.1.8. Note that given FM-sets S, D with S ⊆ D then we also have that π · S ⊆ π · D for any permutation π.

Definition 4.1.9 (Products). Given FM-sets D and E, we can form their product D × E whose underlying set is that of the usual Cartesian product. Writing (d, e) for an element of D × E, the permutation action is given by π · (d, e) ≝ (π ·D d, π ·E e). Each element of D × E is clearly finitely supported by virtue of the elements of D and E having this property.

Definition 4.1.10 (FM-relations). Say that a subset R ⊆ D × E is an FM-relation if it is a finitely-supported subset of D × E in the sense of Definition 4.1.6.

Lemma 4.1.11. Given some FM-set D and any d ∈ D then for all permutations π, supp(π · d) ⊆ π · supp(d) (where the permutation action on the finite set of atoms supp(d) is as in Definition 4.1.7).

Proof. We show that π · supp(d) is a finite support (but not necessarily the least such) for π · d. Given any permutation π' which fixes π · supp(d) ≝ {π(a) | a ∈ supp(d)} pointwise, we have that for all a ∈ supp(d), π' ◦ π · a = π · a. Therefore π⁻¹ ◦ π' ◦ π · a = a. We need to show that π' ◦ π · d = π · d, so it suffices to show π⁻¹ ◦ π' ◦ π · d = d. This follows from above where we see that π⁻¹ ◦ π' ◦ π fixes the support of d.

Lemma 4.1.12. Given some FM-set D and any d ∈ D then for all permutations π, supp(π · d) = π · supp(d).

Proof. Lemma 4.1.11 tells us that supp(π · d) ⊆ π · supp(d). To get the other direction, we can instantiate the same Lemma using the permutation π⁻¹ and the element π · d ∈ π · D to deduce that supp(π⁻¹ · π · d) ⊆ π⁻¹ · supp(π · d). It follows that π · supp(d) ⊆ supp(π · d) by equivariance of ⊆ (viz. Remark 4.1.8).

As is usual when giving a denotational semantics to a programming language we shall require a mathematical object with more structure than a set in order to denote function types. We use an analogue of the familiar notion of chain-complete partially-ordered set (cpo) as follows.

Definition 4.1.13 (FM-cpos). An FM-cpo D is an FM-set equipped with a partial order relation ⊑_D ⊆ D × D, written d ⊑_D d′ whenever (d, d′) ∈ ⊑_D. We write ⊑ instead of ⊑_D when the meaning is clear. The relation ⊑ must be reflexive, anti-symmetric, transitive and equivariant, that is to say 'closed under swapping':

    d ⊑ d′ ⇒ ∀π ∈ perm(𝔸). π · d ⊑ π · d′.    (4.1)

Furthermore, we require that any countable ascending chain (d_n | n < ω) in D which overall possesses a finite support S (that is to say, supp(d_n) ⊆ S for all n < ω) has a least upper bound ⊔_{n<ω} d_n. □

Definition 4.1.14 (FM-complete lattices). An FM-complete lattice is an FM-set D equipped with an equivariant partial order relation ⊑ such that every finitely-supported subset S ⊆ D has a greatest lower bound (and hence also a least upper bound). □

Notation 4.1.15. Given a finitely-supported chain (d_n | n < ω) in some FM-cpo D, write supp(d_n | n < ω) for the least finite support of the chain (so that supp(d_n) ⊆ supp(d_n | n < ω) for all n < ω). □

Note that an FM-cpo may not have least upper bounds of all countable ascending chains. For example, consider the FM-cpo whose underlying set consists of all finite subsets of 𝔸, ordered by inclusion. Enumerating the elements of 𝔸 as a₀, a₁, . . . ¹ we get a chain {a₀} ⊑ {a₀, a₁} ⊑ · · · which is clearly not finitely supported and indeed does not possess a least upper bound. Indeed, any chain in this FM-cpo possesses finite support just when the chain is eventually constant.

¹Such a construction could not be performed formally inside FM-set theory, as the bijection does not possess finite support.

In this thesis we will predominantly work with pointed FM-cpos called FM-cppos, defined as follows.

Definition 4.1.16 (FM-cppos). An FM-cppo is an FM-cpo D equipped with a distinguished least element ⊥: for all d ∈ D, ⊥ ⊑ d. Condition (4.1) then forces the support of ⊥ to be empty in order for ⊥ to be unique. □

We have made the choice of using FM-cppos rather than FM-cpos just because the many and various domain-theoretic constructions seem to work rather more smoothly in this setting. Morally speaking, too, there is something more pleasant about FM-cppos in general. For example, we shall see shortly how to construct a strict function space D ⊸ D′ between FM-cppos; by virtue of using pointed sets the unlift of such a construction is immediately well-defined, without having even to consider the pointedness of D′ as one would otherwise.

Lemma 4.1.17. Let D be an FM-cpo. Then for a finitely-supported chain (d_n | n < ω) in D, we have π · ⊔_{n<ω} d_n = ⊔_{n<ω} π · d_n.

Proof. Follows immediately from (4.1), since this shows that the bijection π preserves ⊑_D. □

Lemma 4.1.18. Take a finitely-supported chain (d_n | n < ω) in an FM-cpo D. Then we have supp(⊔_{n<ω} d_n) ⊆ supp(d_n | n < ω).

Proof. We proceed by showing that all π fixing supp(d_n | n < ω) pointwise also fix ⊔_{n<ω} d_n, for then we know that supp(d_n | n < ω) is a finite support for ⊔_{n<ω} d_n. For such a π it is clear that π · d_n = d_n. Therefore ⊔_{n<ω} π · d_n = ⊔_{n<ω} d_n and we can conclude by Lemma 4.1.17. □

4.2 Some FM-cpos and their construction

4.2.1 Atoms, lifting, products and sums

Definition 4.2.1 (The FM-cpo of atoms). The set 𝔸 can be turned into an FM-cpo (also confusingly named 𝔸) specified as follows.
• Underlying set: the set of atoms 𝔸.
• Permutation action: the canonical action π ·_𝔸 a def= π(a) for all a ∈ 𝔸.
• Partial order: a ⊑_𝔸 a′ def⇔ a = a′, for all a, a′ ∈ 𝔸. □

It is a consequence of this definition that supp(a) = {a} for all a ∈ 𝔸.

Definition 4.2.2 (Lifting). The lift of an FM-cpo D, written D⊥, is the FM-cppo specified by the following.
• Underlying set: D ⊎ {⊥} (note that we assume ⊥ ∉ D).
• Permutation action: as for D.
• Partial order: {(d, d′) ∈ (D ∪ {⊥}) × (D ∪ {⊥}) | d ⊑_D d′ ∨ d = ⊥}.
• Least element: ⊥. □

Definition 4.2.3 (Unlifting). The unlift of an FM-cppo D, written D↓, is the FM-cpo specified by the following.
• Underlying set: the underlying set of D with ⊥ removed.
• Permutation action: as for D.²
• Partial order: {(d, d′) ∈ D × D | d ⊑_D d′ ∧ d ≠ ⊥}. □

²Since π ·_D ⊥ = ⊥ and π is a bijection on D, the action π ·_D – does indeed induce a bijection on D↓.

It is a consequence of these definitions that the least finite support of a non-bottom element d ∈ D, for some FM-cppo D, is the same as that of the element d ∈ D↓. Similarly, given an element e of an FM-cpo E, the least finite support of e is equal to that of the element e ∈ E⊥.

Definition 4.2.4 (Products). The product of FM-cpos D and E, written D × E, is the FM-cpo specified by the following.
• Underlying set: {(d, e) | d ∈ D ∧ e ∈ E}.
• Permutation action: π ·_{D×E} (d, e) def= (π ·_D d, π ·_E e).
• Partial order: (d, e) ⊑_{D×E} (d′, e′) def⇔ d ⊑_D d′ ∧ e ⊑_E e′. □

It follows that given FM-cpos D and E, the least finite support of an element (d, e) ∈ D × E is given by supp(d) ∪ supp(e). Furthermore, if D and E are in fact FM-cppos then so is D × E, with least element (⊥_D, ⊥_E).

Definition 4.2.5 (Smash products). The smash product of FM-cppos D and E is the FM-cppo D ⊗ E def= (D↓ × E↓)⊥. □

It follows that given FM-cppos D, E the least finite support of a non-bottom element, written ⟨d, e⟩ ∈ D ⊗ E, is supp(d) ∪ supp(e).

Definition 4.2.6 (Coalesced sums). The coalesced sum D ⊕ E of FM-cppos D and E is specified by the following.
• Underlying set: {⊥_{D⊕E}} ∪ {in₁(d) | d ∈ D ∧ d ≠ ⊥_D} ∪ {in₂(e) | e ∈ E ∧ e ≠ ⊥_E}, where ⊥_{D⊕E} is a distinguished least element.
• Permutation action: defined as follows for s ∈ D ⊕ E:

    π · s def= in₁(π ·_D d) if s = in₁(d);  in₂(π ·_E e) if s = in₂(e);  ⊥_{D⊕E} otherwise.

• Partial order: again defined by case analysis:

    ⊥_{D⊕E} ⊑ d for all d ∈ D ⊕ E;
    in_i(d) ⊑ in_j(d′) iff i = j and d ⊑ d′. □

It follows that the least finite support of a non-bottom element s ∈ D ⊕ E is supp(d) if s = in₁(d); and supp(e) if s = in₂(e).

4.2.2 Functions and function spaces

Definition 4.2.7 (Permutation action for functions). Take FM-sets X and Y and let f be a total function from the underlying set of X to that of Y. We can equip f with a permutation action by defining

    (π · f)(x) def= π ·_Y (f(π⁻¹ ·_X x)).    (4.2) □

Definition 4.2.8 (Finitely-supported functions). Say that f is a finitely-supported function iff it possesses a finite support with respect to the permutation action (4.2). Write (X → Y) for the FM-set of all functions from X to Y which are finitely supported with respect to the above permutation action. □

The construction of the FM-set (X → Y) highlights the main difference between FM-domain theory and classical domain theory: in the FM world, we must take care when forming subsets and function spaces to ensure that we remain within the world of objects with finite support.

Definition 4.2.9 (Equivariant functions). For FM-sets X, Y say that f ∈ (X → Y) is equivariant iff for all permutations π, π · f = f. □

It is a consequence of this definition that f is equivariant just when for all permutations π and all x ∈ dom(f), f(π · x) = π · (f(x)). We may also refer to an equivariant function as 'having empty support', since clearly f is equivariant just when supp(f) = ∅.

Definition 4.2.10 (Monotone and continuous functions). Take FM-cpos X and Y and a finitely-supported function f from the underlying FM-set of X to that of Y. Say f is monotone iff x ⊑_X x′ implies that f(x) ⊑_Y f(x′). Say f is continuous iff it is monotone and for each finitely-supported chain (x_n | n < ω) in X, f(⊔_{n<ω} x_n) = ⊔_{n<ω} f(x_n). □

Definition 4.2.11 (Strict functions). Given FM-cppos X, Y and a finitely-supported function f from the underlying FM-set of X to that of Y, say that f is strict iff f(⊥_X) = ⊥_Y. □

Definition 4.2.12 (Strict continuous function spaces). Given FM-cppos X and Y write X ⊸ Y for the FM-cppo specified as follows.
• Underlying set: all total, finitely-supported functions f ∈ (X → Y) which are both strict and continuous.
• Permutation action: as for (X → Y).
• Partial order: pointwise: f ⊑_{X⊸Y} f′ iff for all x ∈ X, f(x) ⊑_Y f′(x).
• Least element: the constantly-bottom function. □

Remark 4.2.13. It must be remembered that an FM-cppo X ⊸ Y may well contain many functions which are not equivariant. For example, consider the FM-cppo 1⊥ whose underlying set is {⊥, ⊤}, with ⊥ ⊑ ⊤, ⊥ ⊑ ⊥, ⊤ ⊑ ⊤, π · ⊥ def= ⊥ and π · ⊤ def= ⊤. This may be used to form the FM-cppo 1⊥ ⊸ 𝔸⊥ whose elements include functions mapping ⊤ to some a ∈ 𝔸. Call such a function f. This function cannot have empty support, because if it did then for any permutation π and x ∈ 1⊥ we would have π · (f(x)) = f(π · x) = f(x). This cannot hold, since we could set π to be any permutation exchanging a with some different atom. □

Definition 4.2.14 (Dependent products). The dependent product of FM-cppos D_i indexed by a set I is the FM-cppo written Π_{i∈I} D_i and specified by the following.
• Underlying set: all finitely-supported functions ρ with domain I such that for all i ∈ I, ρ(i) lies in D_i.
• Permutation action: (π · ρ)(i) def= π ·_{D_i} (ρ(i)).
• Partial order: pointwise: ρ ⊑ ρ′ iff for all i ∈ I, ρ(i) ⊑_{D_i} ρ′(i).
• Least element: the function mapping each i ∈ I to the least element of D_i. □

Definition 4.2.15 (Dependent smash products). The dependent smash product of FM-cppos D_i indexed by a set I is the FM-cppo ⨂_{i∈I} D_i specified as follows.
• Underlying set: all finitely-supported functions ρ with domain I such that for all i ∈ I, ρ(i) lies in D_i \ {⊥}; together with the least element (see below).
• Permutation action: (π · ρ)(i) def= π ·_{D_i} (ρ(i)).
• Partial order: pointwise: ρ ⊑ ρ′ iff for all i ∈ I, ρ(i) ⊑_{D_i} ρ′(i).
• Least element: the function mapping each i ∈ I to the least element of D_i. □

Note that we place no restriction on the cardinality of the index set I when constructing a dependent product or dependent smash product. One must simply take care to ensure that each member of such a product is a finitely-supported function out of I.

Two familiar notions which we shall require in §4.5 are those of embeddings and projections, defined as follows.

Definition 4.2.16 (Embeddings and projections). An embedding between FM-cppos is a function e ∈ D ⊸ E for which there exists a (necessarily uniquely-determined) second function p ∈ E ⊸ D, known as a projection, such that p ◦ e = id_D and e ◦ p ⊑ id_E. □

4.2.3 Abstraction FM-cppos

We now come to the main novel construction of this section, that of abstraction FM-cppos. In order to perform this construction, we shall first establish some preliminary results.

Definition 4.2.17. For each FM-cppo D, define the relation ⪯_D ⊆ (𝔸 × D) × (𝔸 × D) as follows:

    (a, d) ⪯_D (a′, d′) def⇔ ∃a″ ∈ 𝔸 \ supp(a, a′, d, d′). (a a″) · d ⊑_D (a′ a″) · d′.

We write ⪯ rather than ⪯_D when the meaning is clear. □

We aim to show that ⪯ is a pre-order. To do this, we require a lemma which establishes another example of the key 'some/any' property first seen in §3.4.1.

Lemma 4.2.18. (a, d) ⪯ (a′, d′) ⇔ ∀a″ ∈ 𝔸 \ supp(a, a′, d, d′). (a a″) · d ⊑_D (a′ a″) · d′.

Proof. The reverse direction follows immediately, since 𝔸 \ supp(a, a′, d, d′) is cofinite (and hence non-empty). For the forwards direction, since (a, d) ⪯ (a′, d′) there exists some a″ ∈ 𝔸 \ supp(a, a′, d, d′) such that (a a″) · d ⊑_D (a′ a″) · d′. Now given any a‴ ∈ 𝔸 \ supp(a, a′, d, d′), the equivariance of ⊑_D gives us that (a″ a‴) · (a a″) · d ⊑_D (a″ a‴) · (a′ a″) · d′. But by virtue of Lemma 4.1.2 and the way we have selected a″ and a‴, this implies that (a a‴) · d ⊑_D (a′ a‴) · d′. □

Lemma 4.2.19. The relation ⪯ is a pre-order.

Proof. It is clear that ⪯ is reflexive since ⊑_D is too. To see transitivity, suppose (a, d) ⪯ (a′, d′) and (a′, d′) ⪯ (a″, d″). Then there exist atoms a₁, a₂ such that (a a₁) · d ⊑_D (a′ a₁) · d′ and (a′ a₂) · d′ ⊑_D (a″ a₂) · d″, with a₁ ∈ 𝔸 \ supp(a, a′, d, d′) and a₂ ∈ 𝔸 \ supp(a′, a″, d′, d″). But by Lemma 4.2.18 we can deduce that for a₃ ∈ 𝔸 \ supp(a, a′, a″, d, d′, d″), (a a₃) · d ⊑_D (a′ a₃) · d′ and (a′ a₃) · d′ ⊑_D (a″ a₃) · d″. We can now conclude immediately by transitivity of ⊑_D. □

Definition 4.2.20. Given an FM-cppo D, define the relation ∼ ⊆ (𝔸 × D) × (𝔸 × D) as follows:

    (a, d) ∼ (a′, d′) def⇔ (a, d) ⪯ (a′, d′) ∧ (a′, d′) ⪯ (a, d).    (4.3) □

It follows from Lemma 4.2.19 that ∼ is an equivalence relation; the pre-order ⪯ induces a partial order on equivalence classes (a, d)/∼. Let us show that such equivalence classes are closed under permuting atoms.

Lemma 4.2.21 (⪯ is equivariant). (a, d) ⪯ (a′, d′) implies that for all permutations π, (π(a), π ·_D d) ⪯ (π(a′), π ·_D d′).

Proof. Assume (a, d) ⪯ (a′, d′). That is to say, for some (or indeed any) a″ ∈ 𝔸 \ supp(a, a′, d, d′) we have (a a″) · d ⊑_D (a′ a″) · d′. Given any permutation π, the equivariance property (4.1) of ⊑_D implies that (π(a) π(a″)) · (π ·_D d) ⊑_D (π(a′) π(a″)) · (π ·_D d′). This enables us to conclude that π · (a, d) ⪯ π · (a′, d′) so long as π(a″) lies in 𝔸 \ supp(π(a), π(a′), π ·_D d, π ·_D d′). But this follows by virtue of the condition on a″. □

Corollary 4.2.22. The equivalence relation ∼ is equivariant.

Proof. Immediate from Lemma 4.2.21 and (4.3). □

We are now in a position to introduce the following important definition, which forms one of the main novelties of this chapter.

Definition 4.2.23 (Abstraction FM-cppos). For any FM-cppo D, the atom-abstraction FM-cppo [𝔸]D is specified by the following.
• Underlying set: the set of equivalence classes (a, d)/∼ in 𝔸 × D. Let us write [a]d for one such equivalence class.
• Permutation action: π ·_{[𝔸]D} [a]d def= [π(a)](π ·_D d). This correctly preserves equivalence classes by virtue of Corollary 4.2.22.
• Partial order: [a]d ⊑_{[𝔸]D} [a′]d′ def⇔ (a, d) ⪯ (a′, d′).
• Least element: the equivalence class [a]⊥_D, for any a ∈ 𝔸. Note that given any other a′ ∈ 𝔸 we have [a]⊥_D ∼ [a′]⊥_D. □

We can think of elements [a]d ∈ [𝔸]D as representing elements of D with one atom abstracted (bound); furthermore, these are identified up to 'renaming of the bound atom'. We must now proceed to verify that the construction in the definition above does indeed satisfy the requirements for an FM-cppo. To do this, we shall first require some additional lemmata.
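Before turning to those lemmata, the following OCaml sketch may help fix the intuition. It is an illustration only, not part of the formal development and not the representation used by the Fresh Objective Caml implementation: atoms are modelled as integers, abstractions as raw (atom, term) pairs, and the equivalence ∼ of Definition 4.2.20 is tested by swapping a fresh atom into both sides, in the spirit of the 'some/any' property of Lemma 4.2.18. The `concretion` function anticipates Lemma 4.2.28 below.

    (* Illustrative sketch: abstractions as pairs (a, d), identified up to
       renaming of the abstracted atom via swapping. *)
    type atom = int

    type term =
      | Var of atom
      | Lam of atom * term          (* a pair (a, d) standing for [a]d *)
      | App of term * term

    (* The swap action (a b) . t: exchange atoms a and b throughout t. *)
    let rec swap a b = function
      | Var c -> Var (if c = a then b else if c = b then a else c)
      | Lam (c, t) ->
          Lam ((if c = a then b else if c = b then a else c), swap a b t)
      | App (t1, t2) -> App (swap a b t1, swap a b t2)

    (* All atoms occurring in t: for raw first-order syntax this is supp(t). *)
    let rec atoms = function
      | Var c -> [c]
      | Lam (c, t) -> c :: atoms t
      | App (t1, t2) -> atoms t1 @ atoms t2

    (* Concretion of [a]d at a' (cf. Lemma 4.2.28 below): swap a and a' in d. *)
    let concretion (a, d) a' = swap a a' d

    (* (a, d) ~ (a', d') iff (a a'') . d = (a' a'') . d' for a fresh a''. *)
    let equal_abstraction (a, d) (a', d') =
      let fresh = 1 + List.fold_left max 0 (a :: a' :: atoms d @ atoms d') in
      swap a fresh d = swap a' fresh d'

For instance, `equal_abstraction (0, Var 0) (1, Var 1)` evaluates to `true`, reflecting [a₀](a₀) = [a₁](a₁), while `equal_abstraction (0, Var 2) (1, Var 1)` evaluates to `false`.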

Lemma 4.2.24. [a]d = [a′]d′ in [𝔸]D just when either a = a′ ∧ d = d′, or a ≠ a′ ∧ a ∉ supp(d′) ∧ d = (a a′) · d′.³

Proof. In the forwards direction, assume [a]d = [a′]d′. That is to say, for all a″ ∈ 𝔸 \ supp(a, a′, d, d′) we have (a a″) · d = (a′ a″) · d′ in D. Now proceed by case analysis. If a = a′ then (a a″) · d = (a a″) · d′ and so d = d′ by equivariance of ⊑_D. For the other case, take a ≠ a′. To see that a ∉ supp(d′), take any a″ ∉ supp(a, a′, d, d′). Then a = (a′ a″) · a = (a′ a″) · (a a″) · a″. Since a″ ∉ supp(d), Lemma 4.1.12 implies that (a′ a″) · (a a″) · a″ ∉ supp((a′ a″) · (a a″) · d). But we also have that (a a″) · d = (a′ a″) · d′, so a ∉ supp((a′ a″) · (a′ a″) · d′) = supp(d′). Then (a a″) · d = (a′ a″) · d′ implies (a a″) · (a a″) · d = (a a″) · (a′ a″) · d′, so d = ((a a″) · a′ (a a″) · a″) · (a a″) · d′ = (a′ a) · (a a″) · d′ = (a a′) · d′ (since a ∉ supp(d′)).

In the reverse direction, take atoms a, a′ and proceed again by case analysis. When a = a′ and d = d′, the result follows easily by equivariance of ⊑_D. For the other case, take a ≠ a′, a ∉ supp(d′) and d = (a a′) · d′. Taking any a″ ∈ 𝔸 \ supp(a, a′, d, d′) we have that (a a″) · d = (a a″) · (a a′) · d′ = (a″ a′) · d′, which implies [a]d = [a′]d′. □

³Do not be fooled into thinking that this lemma generalises to the case of the partial order relation ⊑ rather than just equality, for it does not. We explain why in §4.2.4.

Lemma 4.2.25. d ⊑_D d′ holds just when, for any atom a, [a]d ⊑_{[𝔸]D} [a]d′.

Proof. Given d ⊑_D d′, we can always pick an atom a′ ∈ 𝔸 \ supp(a, d, d′). The result then follows from the equivariance of ⊑_D. A similar argument applies in the reverse direction. □

Lemma 4.2.26. For an element [a]d in [𝔸]D, we have supp([a]d) ⊆ supp(d) \ {a}.

Proof. We show that supp(d) \ {a} is a finite support (but not necessarily the least such) for [a]d. Take any permutation π which fixes supp(d) \ {a} pointwise. We wish to know that π · [a]d = [π(a)](π · d) = [a]d. By Lemma 4.2.24 it therefore suffices to show that either a = π(a) ∧ d = π · d, or a ≠ π(a) ∧ a ∉ supp(π · d) ∧ d = (a π(a)) · (π · d).

If a = π(a), it follows that π fixes supp(d) pointwise and therefore π · d = d; we have the result since in this case (a π(a)) is the identity. In the case where a ≠ π(a), we first wish to know that a ∉ supp(π · d). Lemma 4.1.12 tells us that it suffices to prove a ∉ {π(a′) | a′ ∈ supp(d)}. If this did not hold, there would exist some a′ ∈ supp(d) such that a = π(a′); moreover, since a ≠ π(a) (and π is injective), this implies a ≠ a′. It follows that a′ ∈ supp(d) \ {a} and so π(a′) = a′. But then a = π(a′) = a′, which would contradict our assumption that a ≠ a′.

To complete the proof, we just need to show that in the a ≠ π(a) case, d = (a π(a)) · (π · d). We do this by showing that the permutation (a π(a)) ◦ π fixes all a′ ∈ supp(d) (for if it does this, it must fix d also). Proceed by case analysis on a′. If a′ = a then the permutation is clearly the identity on a′. Otherwise, a′ is mapped to π(a′) by the permutation π. Since π fixes all atoms in supp(d) \ {a}, we have π(a′) = a′ (and this is not equal to a). Moreover, π(a) cannot equal π(a′) since a ≠ a′ and π is injective. Therefore (a π(a)) ◦ π maps a′ to itself, as required. □

Lemma 4.2.27 (Supports in [𝔸]D). Let [a]d be a member of [𝔸]D. Then supp([a]d) = supp(d) \ {a}.

Proof. By Lemma 4.2.26 it remains to show that supp([a]d) is no smaller than supp(d) \ {a}. Without loss of generality, suppose that supp(d) \ {a, a′} supports [a]d, for some a′ not equal to a and occurring in supp(d). Now consider the permutation π which just exchanges a and a′, thus fixing supp(d) \ {a, a′} pointwise. It follows that π · [a]d = [a′]((a a′) · d) and, because supp(d) \ {a, a′} supports [a]d, we must have [a′]((a a′) · d) = [a]d. Applying Lemma 4.2.24, it follows that a′ ∉ supp(d). But this contradicts the assumption above that a′ occurs in supp(d). □

Lemma 4.2.28 (Concretion). For each [a]d in [𝔸]D and a′ ∈ 𝔸 \ supp([a]d), there exists a unique d′ = (a a′) · d such that [a]d = [a′]d′. We call this d′ the concretion of [a]d at a′ and write it ([a]d) @ a′.

Proof. Since a′ ∈ 𝔸 \ supp([a]d), Lemma 4.2.27 tells us that either a = a′, or a ≠ a′ and a′ ∉ supp(d). In the first case, d′ may be specified just by d′ = d = (a a′) · d, meaning that [a]d = [a′]d′. Lemma 4.2.24 tells us that (in this case) the equality only holds for this particular d′ and so it is unique. In the case where a ≠ a′ and a′ ∉ supp(d), we again appeal to Lemma 4.2.24 to deduce that d′ = (a a′) · d is the unique d′ such that [a]d = [a′]d′. □

Lemma 4.2.29 (Supports of concretions). For a′ ∉ supp([a]d) we have supp(([a]d) @ a′) ⊆ supp([a]d) ∪ {a′}.

Proof. We wish to calculate supp((a a′) · d). Since a′ ∉ supp([a]d), Lemma 4.2.27 tells us that either a = a′, or a ≠ a′ and a′ ∉ supp(d). If a = a′ then supp((a a′) · d) = supp(d) = supp([a]d) ∪ {a} as required. If a ≠ a′ and a′ ∉ supp(d) then by inspection we see that supp((a a′) · d) cannot contain a, but may contain a′. Therefore supp((a a′) · d) ⊆ (supp(d) \ {a}) ∪ {a′} = supp([a]d) ∪ {a′}. □

Lemma 4.2.30 (Concretion order-preserving). If [a]d ⊑_{[𝔸]D} [a′]d′ then ([a]d) @ a″ ⊑_D ([a′]d′) @ a″, for any a″ ∈ 𝔸 \ supp([a]d, [a′]d′).

Proof. By Lemma 4.2.28 we know that [a]d = [a″](([a]d) @ a″) = [a″]((a a″) · d) and [a′]d′ = [a″](([a′]d′) @ a″) = [a″]((a′ a″) · d′). Therefore [a″]((a a″) · d) ⊑_{[𝔸]D} [a″]((a′ a″) · d′) and so by Lemma 4.2.25 we have (a a″) · d ⊑_D (a′ a″) · d′, as required. □

Lemma 4.2.31. Given [a]d in [𝔸]D and a′ ≠ a″ with a′, a″ ∉ supp([a]d), then (a′ a″) · (([a]d) @ a′) = ([a]d) @ a″.

Proof. By Lemma 4.2.28 we need to show that (a′ a″) · (a a′) · d = (a a″) · d. But the left-hand side is equal to (a a′) · ((a a′) · a′ (a a′) · a″) · d. This can be shown to be equal to the right-hand side by a tedious case analysis, whose details we omit. □

Lemma 4.2.32. Any finitely-supported chain ([aₙ]dₙ | n < ω) in [𝔸]D possesses a least upper bound, which may be constructed as [a]⊔_{n<ω}(([aₙ]dₙ) @ a) for any a ∉ supp([aₙ]dₙ | n < ω).

Proof. For all n < ω, a ∉ supp([aₙ]dₙ) and therefore we can concrete each element of the chain at a. By Lemma 4.2.30 this forms another chain in D, namely (([aₙ]dₙ) @ a | n < ω). This chain is also finitely supported, since by Lemma 4.2.29 we have supp(([aₙ]dₙ) @ a) ⊆ supp([aₙ]dₙ) ∪ {a} for all n < ω. (The new chain is thus finitely supported by the set supp([aₙ]dₙ | n < ω) ∪ {a}.) It follows that this chain has a least upper bound ⊔_{n<ω}(([aₙ]dₙ) @ a) and, for all n < ω, ([aₙ]dₙ) @ a ⊑_D ⊔_{n<ω}(([aₙ]dₙ) @ a). It follows by Lemma 4.2.25 that [a](([aₙ]dₙ) @ a) ⊑ [a]⊔_{n<ω}(([aₙ]dₙ) @ a) for all n < ω; thus [a]⊔_{n<ω}(([aₙ]dₙ) @ a) is an upper bound for the chain in [𝔸]D.

To see that it is the least such, suppose that there is some other upper bound [a′]d′, so that [aₙ]dₙ ⊑ [a′]d′ for all n < ω. We wish to show that [a]⊔_{n<ω}(([aₙ]dₙ) @ a) ⊑ [a′]d′, which we can do by choosing a″ ∈ 𝔸 \ supp(a, a′, d′, ([aₙ]dₙ | n < ω)) and showing that (a a″) · ⊔_{n<ω}(([aₙ]dₙ) @ a) ⊑ (a′ a″) · d′. The left-hand side of this is equal to ⊔_{n<ω}((a a″) · (([aₙ]dₙ) @ a)) = ⊔_{n<ω}(([aₙ]dₙ) @ a″) by Lemmata 4.1.17 and 4.2.31. Next, Lemma 4.2.30 together with the assumption that [a′]d′ is an upper bound tells us that for all n < ω, ([aₙ]dₙ) @ a″ ⊑_D ([a′]d′) @ a″. In particular, we therefore have ⊔_{n<ω}(([aₙ]dₙ) @ a″) ⊑_D ([a′]d′) @ a″. But Lemma 4.2.28 tells us that the right-hand side of this is equal to (a′ a″) · d′, as we require. □

Lemma 4.2.33. [𝔸]D is an FM-cppo.

Proof. This splits into several parts, as follows.
▶ Finite support property. By Lemma 4.2.27.
▶ Least upper bounds of finitely-supported chains. By Lemma 4.2.32.
▶ Equivariance of ⊑_{[𝔸]D}. Follows by virtue of its definition and Lemma 4.2.21.
▶ Pointedness. The FM-cppo D possesses a least element ⊥ which has empty support. It follows from Lemma 4.2.25 that the least element of [𝔸]D is given by [a]⊥ for any a ∈ 𝔸.

[𝔸]D therefore satisfies all of the requirements for an FM-cppo. □

4.2.4 Some curiosities

FM-domain theory—and abstraction FM-cppos in particular—contains some deep subtleties. Here we identify two traps for the unwary.

First, note that given some FM-cpo D and d ⊑_D d′, we do not necessarily have that supp(d) ⊆ supp(d′). A simple example is given when D is the atom-abstraction FM-cpo [𝔸]P, where P is the FM-cpo with underlying FM-set P(𝔸) (viz. Definition 4.1.7), partially ordered by inclusion. Choosing distinct atoms a₀, a₁, the fact that {a₀} ⊆ (𝔸 \ {a₁}) means we can construct a chain in [𝔸]P starting as follows:

    [a₁]{a₀} ⊑ [a₁](𝔸 \ {a₁}) ⊑ · · ·

(This chain has an overall finite support, namely the set {a₀}—the crucial observation here being that the cofiniteness of 𝔸 \ {a₁} implies that it is finitely supported by {a₁}.) We now have that

    supp([a₁]{a₀}) ⊇ supp([a₁](𝔸 \ {a₁})) ⊇ · · ·

which shows that the elements' support is not monotonically increasing.

Our second example is the following false lemma—a version of Lemma 4.2.24 with ⊑ substituted for =.

Lemma 4.2.34 (False). [a]d ⊑ [a′]d′ in [𝔸]D just when either a = a′ ∧ d ⊑ d′, or a ≠ a′ ∧ a ∉ supp(d′) ∧ d ⊑ (a a′) · d′. □

To see an example of why this lemma fails to hold, take distinct atoms a₁, a₂, a₃ and construct the chain in [𝔸]P starting as

    [a₃]{a₁} ⊑ [a₁](𝔸 \ {a₂, a₃}) ⊑ · · ·

We can now see that a₃ ≠ a₁ and a₃ ∈ supp(𝔸 \ {a₂, a₃}); moreover, since (a₃ a₁) · (𝔸 \ {a₂, a₃}) = 𝔸 \ {a₁, a₂}, we have {a₁} ⋢ (a₃ a₁) · (𝔸 \ {a₂, a₃})—contradicting the false lemma.

4.3 Fixed points

As is usual in denotational semantics, we shall require the notion of fixed point to give a meaning to recursively-defined functions. We will need two constructions of fixed points: that given by continuous functions on finitely-supported chains, and that given by monotone functions on FM-complete lattices.

Definition 4.3.1. Given some FM-cppo D and a monotone function f ∈ (D → D), say that an element d ∈ D is a pre-fixed point for f iff f(d) ⊑ d. Say d is a fixed point for f iff f(d) = d. □

Definition 4.3.2. Given an FM-cppo D and a function f ∈ (D → D), define fⁿ(d) by induction on n as follows: f⁰(d) def= d and fⁿ⁺¹(d) def= f(fⁿ(d)). □

Lemma 4.3.3 (Fixed points on chains). For an FM-cppo D, each continuous function f ∈ (D → D) possesses a least fixed point fix(f), which may be constructed as ⊔_{n<ω} fⁿ(⊥), with supp(fix(f)) ⊆ supp(f).

Proof. We must first show that (fⁿ(⊥) | n < ω) is a finitely-supported chain in D in order for a least upper bound to exist. To do this, let us prove by induction that supp(fⁿ(⊥)) ⊆ supp(f) for arbitrary n < ω. We always have supp(⊥) ⊆ supp(f). Assuming that supp(fⁿ(⊥)) ⊆ supp(f), we can prove supp(fⁿ⁺¹(⊥)) ⊆ supp(f) by showing that every permutation which fixes supp(f) pointwise also fixes fⁿ⁺¹(⊥). Taking such a permutation π, observe that the assumption gives us that π · fⁿ(⊥) = fⁿ(⊥). We also know that π · f = f. It follows, using Definition 4.2.7, that π · (f(fⁿ(⊥))) = (π · f)(π · fⁿ(⊥)) = f(fⁿ(⊥)) = fⁿ⁺¹(⊥), as required. Therefore (fⁿ(⊥) | n < ω) is finitely supported and indeed supp(fix(f)) ⊆ supp(f). That fix(f) is a least fixed point is a standard proof and we omit the remaining details. □

The following lemma is an adaptation of the famous Tarski–Knaster fixed point theorem to the setting of FM-domain theory.

Lemma 4.3.4 (Fixed points on lattices). Given an FM-complete lattice D, each monotone function f ∈ (D → D) possesses a least fixed point fix(f), which may be given as ⨅{d ∈ D | f(d) ⊑ d} (we use ⨅ to indicate a greatest lower bound).

Proof. The proof is standard and we omit the details, save for one crucial point. In order for the set S def= {d ∈ D | f(d) ⊑ d} to have a greatest lower bound in the FM-complete lattice, it is necessary for it to be finitely supported. That this is so follows from the fact that supp(S) ⊆ supp(f). To see this, take any permutation π which fixes supp(f) pointwise. We need to show that for all s ∈ S, π · s ∈ S. If s ∈ S then f(s) ⊑ s and therefore π · f(s) ⊑ π · s, which is equivalent to stating (π · f)(π · s) ⊑ π · s. Since π fixes f, we have f(π · s) ⊑ π · s in D; thus π · s ∈ S. □

It is interesting to note that we can derive equivariance properties of fixed points without considering their explicit construction: the important property is that they are indeed fixed points. The following lemma highlights this: the result can be obtained not only by the proof given but also as a corollary of the proof of Lemma 4.3.4 above. We give this specific example since it is used in the next chapter to reason about an FM-complete lattice of relations.

Lemma 4.3.5. Given an FM-complete lattice D and a monotone function f ∈ (D → D) with supp(f) = ∅, then supp(fix(f)) = ∅.

Proof. Take any atoms a, a′ and observe that (a a′) · fix(f) = (a a′) · (f(fix(f))) = f((a a′) · fix(f)). Therefore (a a′) · fix(f) is a fixed point of f. It is also equal to fix(f), because (a a′) · fix(f) ⊑ fix(f) holds just when fix(f) ⊑ (a a′) · fix(f), by equivariance of ⊑. □
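To fix intuitions about the least-fixed-point constructions above, the following OCaml sketch (an illustration only: the lemmas concern arbitrary FM-cppos and FM-complete lattices, whereas this example uses a finite powerset lattice, where Kleene iteration from ⊥ terminates) computes ⊔_{n<ω} fⁿ(⊥) as in Lemma 4.3.3.

    (* Illustrative only: least fixed point of a monotone function on a finite
       powerset lattice, computed by iterating from the bottom element (the
       empty set), mirroring the construction of Lemma 4.3.3. *)
    module IntSet = Set.Make (Int)

    (* Iterate f from x until a fixed point is reached; termination holds here
       because the lattice is finite and f is monotone. *)
    let rec lfp (f : IntSet.t -> IntSet.t) (x : IntSet.t) : IntSet.t =
      let fx = f x in
      if IntSet.equal fx x then x else lfp f fx

    (* Example use: reachability in a tiny graph, a standard monotone operator. *)
    let successors = function 0 -> [1] | 1 -> [2] | _ -> []

    let reachable_from_0 : IntSet.t =
      lfp
        (fun s ->
          IntSet.union (IntSet.singleton 0)
            (IntSet.of_list
               (List.concat (List.map successors (IntSet.elements s)))))
        IntSet.empty
    (* reachable_from_0 = {0; 1; 2} *)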

4.4 Categorical constructions

Whilst we do not provide a categorical presentation of our denotational semantics, a modicum of category theory will be required in the next chapter. To this end, we now make several definitions which consist of transferring familiar categorical concepts into the FM setting.

Definition 4.4.1 (The category FM-Cpo). We take FM-cpos as the objects of the category; the morphisms f : X −→ Y are equivariant functions f ∈ (X → Y). □

Definition 4.4.2 (The category FM-Cpo⊥). We take FM-cppos as the objects of the category; the morphisms f : X ◦−→ Y are strict, equivariant functions f ∈ (X → Y). □

It is in fact the case that working in FM-Cpo is equivalent to working in the functor category Cpoᴵ, where I is the category of finite sets of atoms and injections between them. Elements of FM-Cpo correspond to elements of Cpoᴵ which preserve pullbacks.

Definition 4.4.3 (LFC-functors). A locally FM-continuous functor (known as an LFC-functor) F : FM-Cpo⊥ −→ FM-Cpo⊥ is specified by the following information: for each FM-cppo D, another FM-cppo F(D); and for each function f ∈ D ⊸ E, a function F(f) ∈ F(D) ⊸ F(E), preserving identities and composition, such that the following conditions are satisfied. Given f ⊑ f′ ∈ D ⊸ E we must have F(f) ⊑ F(f′) ∈ F(D) ⊸ F(E). For any finitely-supported chain (fₙ | n < ω) in D ⊸ E, F(⊔_{n<ω} fₙ) must equal ⊔_{n<ω} F(fₙ). Finally, for all permutations π, π · F must equal F—that is to say, for each function f ∈ D ⊸ E, π · (F(f)) = F(π · f). □

Definition 4.4.4 (Mixed-variance LFC-functors). We specify a mixed-variance LFC-functor F : FM-Cpo⊥^op × FM-Cpo⊥ −→ FM-Cpo⊥ by the following information: for each pair of FM-cppos (D, E), another FM-cppo F(D, E); and for each pair of functions (f, g) ∈ (E ⊸ D) × (D ⊸ E), a function F(f, g) ∈ F(D, D) ⊸ F(E, E) such that F(id_D, id_D) = id_{F(D,D)}, and for additional functions f′ ∈ E′ ⊸ E and g′ ∈ E ⊸ E′, F(f′, g′) ◦ F(f, g) = F(f ◦ f′, g′ ◦ g). These actions must also satisfy the following conditions. Given f ⊑ f′ ∈ E ⊸ D and g ⊑ g′ ∈ D ⊸ E we require F(f, g) ⊑ F(f′, g′) ∈ F(D, D) ⊸ F(E, E). For any finitely-supported chain ((fₙ, gₙ) | n < ω) in (E ⊸ D) × (D ⊸ E), F(⊔_{n<ω} fₙ, ⊔_{n<ω} gₙ) must equal ⊔_{n<ω} F(fₙ, gₙ). Finally, for all permutations π and functions (f, g) ∈ (E ⊸ D) × (D ⊸ E), we must have π · F(f, g) = F(π · f, π · g). □

Notation 4.4.5. Write λ̄x. t for that function which acts as λx. t except that (λ̄x. t)(⊥) def= ⊥. Extend this notation in the obvious way to write λ̄⟨d₁, d₂⟩. t for strict functions in D₁ ⊗ D₂ ⊸ D and λ̄[a]d. t for strict functions in [𝔸]D ⊸ D′. Note that this notation imposes no conditions as to which particular representative of [𝔸]D is chosen. Recall that we write ⟨d₁, d₂⟩ to indicate the construction of a smash pair (such that ⟨d₁, d₂⟩ def= ⊥_{D₁⊗D₂} when either of d₁ ∈ D₁ and d₂ ∈ D₂ is bottom). □

Lemma 4.4.6. The following operations, here expressed in terms of their actions on objects and morphisms, determine LFC-functors.

lifting (–)⊥ : FM-Cpo⊥ −→ FM-Cpo⊥
    D ↦ D⊥
    f ↦ λ̄x. f(x).

atom-abstraction [𝔸](–) : FM-Cpo⊥ −→ FM-Cpo⊥
    D ↦ [𝔸]D
    f ↦ λ̄d. [a]f(d @ a), where a ∈ 𝔸 \ supp(d, f).

smash product (–) ⊗ (–) : FM-Cpo⊥ × FM-Cpo⊥ −→ FM-Cpo⊥
    (D, E) ↦ D ⊗ E
    (f, g) ↦ λ̄⟨d, e⟩. ⟨f(d), g(e)⟩.

coalesced sum (–) ⊕ (–) : FM-Cpo⊥ × FM-Cpo⊥ −→ FM-Cpo⊥
    (D, E) ↦ D ⊕ E
    (f, g) ↦ λ̄s. in₁(f(d)) if s = in₁(d); in₂(g(e)) if s = in₂(e).

strict function space (–) ⊸ (–) : FM-Cpo⊥^op × FM-Cpo⊥ −→ FM-Cpo⊥
    (D, E) ↦ D ⊸ E
    (f, g) ↦ λ̄h. g ◦ h ◦ f.

Proof. We provide the only case which differs significantly from classical domain theory, that for atom-abstraction.

▶ Preservation of identities. Observe that

    ([𝔸]id_D)([a]d) = [a′](([a]d) @ a′)   by definition of [𝔸]–, with a′ ∈ 𝔸 \ supp(a, d)
                    = [a′]((a a′) · d)    by definition of @
                    = [a]d                by Lemma 4.2.28.

▶ Preservation of composition. Take f ∈ D ⊸ D′ and g ∈ D′ ⊸ E, so⁴:

    [𝔸](g ◦ f)([a]d)
      = [a′]((g ◦ f)(([a]d) @ a′))             by definition of [𝔸]–, with a′ ∈ 𝔸 \ supp(a, d, f, g)
      = [a′]((g ◦ f)((a a′) · d))              by definition of @
      = [a′]((a′ a′) · ((g ◦ f)((a a′) · d)))  (a′ a′) is the identity
      = ([𝔸]g ◦ [𝔸]f)([a]d)                    as a′ ∉ supp([a′]f((a a′) · d), g).

⁴Note that when we need to satisfy the condition a′ ∈ 𝔸 \ supp([a]d, f) (arising from the action of [𝔸]– on morphisms) at the second proof step, we impose a stronger condition to ensure that a′ is fresh for g (and indeed a) as well. This a′ then automatically satisfies the same side condition which arises in connection with g. That we are able to do this is another example of the 'some/any' property of choosing fresh atoms which runs throughout FM-domain theory. It is also evident in the Continuity and Equivariance proof cases.

We now check that [𝔸]– is a continuous equivariant operator on finitely-supported chains (fₙ ∈ D ⊸ E | n < ω).

▶ Monotonicity. Take f ⊑ f′ ∈ D ⊸ E. We wish to know that [𝔸]f ⊑ [𝔸]f′; that is to say, for all [a]d ∈ [𝔸]D, ([𝔸]f)([a]d) ⊑ ([𝔸]f′)([a]d). Take any a′ ∈ 𝔸 \ supp([a]d, f, f′). Then ([𝔸]f)([a]d) = [a′]f((a a′) · d) and ([𝔸]f′)([a]d) = [a′]f′((a a′) · d). By Lemma 4.2.25 it now suffices to show f((a a′) · d) ⊑ f′((a a′) · d) to get the result. But this follows since f ⊑ f′.

▶ Continuity. We wish to know that [𝔸]⊔_{n<ω} fₙ = ⊔_{n<ω}[𝔸]fₙ. That is to say, for all [a]d, ([𝔸]⊔_{n<ω} fₙ)([a]d) = (⊔_{n<ω}[𝔸]fₙ)([a]d). Taking the left-hand side together with some a′ ∈ (𝔸 \ supp(fₙ | n < ω)) \ supp([a]d) we obtain

    ([𝔸]⊔_{n<ω} fₙ)([a]d)
      = [a′](⊔_{n<ω} fₙ)((a a′) · d)   by definition of [𝔸]–, as a′ ∉ supp(⊔_{n<ω} fₙ, [a]d)
      = [a′]⊔_{n<ω} fₙ((a a′) · d)     lubs in D ⊸ E
      = ⊔_{n<ω}[a′]fₙ((a a′) · d)      by Lemma 4.2.32
      = ⊔_{n<ω}([𝔸]fₙ)([a]d)           by definition of [𝔸]–, as for all n, a′ ∉ supp(fₙ, [a]d)
      = (⊔_{n<ω}[𝔸]fₙ)([a]d)           lubs in [𝔸]D ⊸ [𝔸]E.

▶ Equivariance. We wish to know that for all permutations π and functions f ∈ D ⊸ E, π · [𝔸]f = [𝔸](π · f). This holds when π · ([𝔸]f([π⁻¹(a)]π⁻¹ · d)) = [𝔸](π · f)([a]d) for any [a]d ∈ [𝔸]D. Taking any a′ ∈ 𝔸 \ supp([π⁻¹(a)]π⁻¹ · d, f), the definition of [𝔸]– implies that the left-hand side of this is equal to π · [a′]f((π⁻¹(a) a′) · π⁻¹ · d). This is equivalent to [π(a′)]π · (f((π⁻¹(a) a′) · π⁻¹ · d)) by virtue of the permutation action on [𝔸]D. Recalling that π · f(π⁻¹ · x) = (π · f)(x), we can simplify this to [π(a′)](π · f)((a π(a′)) · d). Next, Lemma 4.1.12 tells us that a′ ∈ 𝔸 \ supp([π⁻¹(a)]π⁻¹ · d, f) holds just when π(a′) ∈ 𝔸 \ supp([a]d, π · f). We are now done, since we can calculate that [π(a′)](π · f)((a π(a′)) · d) = [𝔸](π · f)([a]d) (we use π(a′) to satisfy the side-condition on [𝔸]–). □
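As a loose programming analogy only (OCaml types do not capture FM-cppos, orderings or finite support, and this is not part of the formal development), the object and morphism actions of the lifting functor (–)⊥ from Lemma 4.4.6 behave like the option type and mapping across it, with None playing the role of ⊥.

    (* A loose analogy, not a formalisation: lifting D_⊥ resembles 'a option,
       and the lifting functor's action on a morphism f resembles mapping f
       across the option.  The None -> None clause mirrors strictness. *)
    let lift_map (f : 'a -> 'b) : 'a option -> 'b option = function
      | None -> None
      | Some d -> Some (f d)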

4.5 Solution of recursive equations on FM-cppos

Perhaps the most important technique in the application of domain theory to the semantics of programming languages is the solution of recursive domain equations [64]. This technique enables us to construct domains whose elements are the denotations of values of user-defined recursive datatypes. The seminal example is the discovery by Scott of a non-trivial solution to the recursive domain equation D ≅ (D → D) in order to represent terms of the λ-calculus.

In the next chapter we will need to solve recursive equations on FM-cppos. In general, such an equation takes the form Φ(D) ≅ D, where Φ is built up from the constructors (–)⊥ (lifting), [𝔸](–) (atom-abstraction), (–) ⊗ (–) (smash product), (–) ⊕ (–) (coalesced sum) and (–) ⊸ (–) (strict continuous function space), together with occurrences of the variable D. What we would like is for Φ to determine an LFC-functor in itself; however, this may not be the case, since D might occur in both covariant and contravariant positions. In order to get around this problem, we follow Pitts [46] and rewrite Φ to an LFC-functor F(D⁻, D⁺) : FM-Cpo⊥^op × FM-Cpo⊥ −→ FM-Cpo⊥ by replacing all occurrences of D in contravariant position with D⁻ and all occurrences of D in covariant position with D⁺. The resulting F is then indeed functorial and it suffices to find an FM-cppo D equipped with an isomorphism i : F(D, D) ≅ D. The question, of course, is which solution to choose if there is more than one: we wish for one which is canonical in some sense. The following notion of minimal invariants captures this idea of canonicity.

Definition 4.5.1 (Minimal invariant property). The solution (i, D) to the recursive equation F(D, D) ≅ D has the minimal invariant property if the least fixed point fix(φ) of the strict continuous function φ ∈ (D ⊸ D) ⊸ (D ⊸ D) defined by φ(f) def= i ◦ F(f, f) ◦ i⁻¹ is the identity on D. □

We now prove the following theorem in full. Whilst the proof is arguably standard, we believe that our presentation is elegant in the sense that it cuts down to the bare minimum required to establish the result.

Theorem 4.5.2. Let F : FM-Cpo⊥^op × FM-Cpo⊥ −→ FM-Cpo⊥ be an LFC-functor. Then there exists a solution (i, D) satisfying the minimal invariant property, with i : F(D, D) ≅ D, which is unique up to isomorphism.

Proof. By virtue of F being an LFC-functor, it is a continuous operator on chains of morphisms, and therefore given any embedding-projection pair (e : D ◦−→ E, p : E ◦−→ D) we can form another such pair (F(p, e) : F(D, D) ◦−→ F(E, E), F(e, p) : F(E, E) ◦−→ F(D, D)) with F(e, p) ◦ F(p, e) = id_{F(D,D)} and F(p, e) ◦ F(e, p) ⊑ id_{F(E,E)}. Since the one-element FM-cppo {⊥} is initial for embeddings (the unique morphism being given by the least element of {⊥} ⊸ D for each FM-cppo D), we can build a chain of embedding-projection pairs

    D₀ ⇄ D₁ ⇄ D₂ ⇄ · · ·    (4.4)

with embeddings e₀, e₁, e₂, . . . from left to right and projections p₀, p₁, p₂, . . . from right to left, where D₀ def= {⊥}, Dₙ₊₁ def= F(Dₙ, Dₙ), e₀ is the unique element of {⊥} ⊸ D₁, p₀ is the unique element of D₁ ⊸ {⊥}, eₙ₊₁ def= F(pₙ, eₙ) and pₙ₊₁ def= F(eₙ, pₙ). Now define functions pₙₘ : Dₙ ◦−→ Dₘ (for m ≤ n) and eₙₘ : Dₙ ◦−→ Dₘ (for m ≥ n) as follows:

    pₙₘ def= id_{Dₙ} if m = n;  p_{(n−1)m} ◦ p_{n−1} if m < n.
    eₙₘ def= id_{Dₙ} if m = n;  e_{(n+1)m} ◦ eₙ if m > n.

This enables us to form an FM-cppo with underlying set

    D def= { d ∈ Π_{n<ω} Dₙ | d has finite support and for all m ≤ n < ω, πₘ(d) = pₙₘ(πₙ(d)) }    (4.5)

where the πₙ (n < ω) are the usual projections out of a member of the dependent product (and the ordering, together with the permutation action, is inherited from Definition 4.2.14). Even though this is a subset of an infinitely-indexed product, it is a legal construction because every element is finitely supported.

Next define a second family of embeddings e′ₙ : Dₙ ◦−→ D and their associated projections p′ₙ : D ◦−→ Dₙ as follows:

    e′ₙ(dₙ) def= (xₘ | m < ω), given xₘ = pₙₘ(dₙ) if m ≤ n and xₘ = eₙₘ(dₙ) if m > n;
    p′ₙ(d) def= πₙ(d)

where we write (xₘ | m < ω) for the tuple (x₀, x₁, . . .). A consequence of these definitions is that

    e′ₙ(p′ₙ(d)) = (xⁿₘ | m < ω), given xⁿₘ def= pₙₘ(πₙ(d)) if m < n; πₙ(d) if m = n; eₙₘ(πₙ(d)) if m > n.

Recalling that the least upper bound of a chain is unaffected by removing finitely many elements from the start of the chain, calculate as follows:

    (⊔_{n<ω}(e′ₙ ◦ p′ₙ))(d) = ⊔_{n<ω}(e′ₙ ◦ p′ₙ)(d) = ⊔_{n<ω}(xⁿₘ | m < ω)
                            = (⊔_{n<ω} xⁿₘ | m < ω) = (⊔_{m≤n<ω} xⁿₘ | m < ω)
                            = (πₘ(d) | m < ω) = d,

the final steps holding because for m ≤ n the consistency condition in (4.5) gives xⁿₘ = pₙₘ(πₙ(d)) = πₘ(d). We now have the important results that

    p′ₙ ◦ e′ₙ = id_{Dₙ}    (4.6)
    ⊔_{n<ω} e′ₙ ◦ p′ₙ = id_D.    (4.7)

The equality (4.7) is the crux. Define the strict continuous functions i : F(D, D) ◦−→ D and j : D ◦−→ F(D, D) as follows:

    i def= ⊔_{n<ω} e′ₙ₊₁ ◦ F(e′ₙ, p′ₙ)   and   j def= ⊔_{n<ω} F(p′ₙ, e′ₙ) ◦ p′ₙ₊₁.

Now using (4.7) we have

    i ◦ j = ⊔_{n<ω} e′ₙ₊₁ ◦ F(e′ₙ, p′ₙ) ◦ F(p′ₙ, e′ₙ) ◦ p′ₙ₊₁
          = ⊔_{n<ω} e′ₙ₊₁ ◦ id_{F(Dₙ,Dₙ)} ◦ p′ₙ₊₁
          = ⊔_{n<ω} e′ₙ₊₁ ◦ p′ₙ₊₁ = id_D.

Composing i and j around the other way and using (4.7) together with the local continuity and functorality of F, we have

    j ◦ i = ⊔_{n<ω} F(p′ₙ, e′ₙ) ◦ p′ₙ₊₁ ◦ e′ₙ₊₁ ◦ F(e′ₙ, p′ₙ)
          = ⊔_{n<ω} F(p′ₙ, e′ₙ) ◦ id_{Dₙ₊₁} ◦ F(e′ₙ, p′ₙ)
          = ⊔_{n<ω} F(p′ₙ, e′ₙ) ◦ F(e′ₙ, p′ₙ)
          = ⊔_{n<ω} F(e′ₙ ◦ p′ₙ, e′ₙ ◦ p′ₙ)
          = F(⊔_{n<ω} e′ₙ ◦ p′ₙ, ⊔_{n<ω} e′ₙ ◦ p′ₙ)
          = F(id_D, id_D) = id_{F(D,D)}.

Therefore i is an isomorphism; from now on let us write i⁻¹ instead of j.

Now note that since pₘₙ = pₙ ◦ · · · ◦ p_{m−1} and eₙₘ = e_{m−1} ◦ · · · ◦ eₙ, we have F(pₘₙ, eₙₘ) = F(p_{m−1}, e_{m−1}) ◦ · · · ◦ F(pₙ, eₙ) = eₘ ◦ · · · ◦ eₙ₊₁. Similarly, F(eₙₘ, pₘₙ) = pₙ₊₁ ◦ · · · ◦ pₘ. Furthermore, for k ≤ n ≤ m one can show, by expansion of the function compositions, that pₘₖ ◦ eₙₘ = pₖ ◦ · · · ◦ p_{n−1} = pₙₖ. Similarly, if n < k ≤ m or n ≤ m < k then pₘₖ ◦ eₙₘ = eₙₖ. Combining all of these facts yields that e′ₘ ◦ eₙₘ = e′ₙ when n ≤ m. In a similar vein, (4.5) immediately gives us that pₘₙ ◦ p′ₘ = p′ₙ when n ≤ m. We can use this and the previous results to calculate as follows.

    i ◦ F(p′ₙ, e′ₙ) = ⊔_{m<ω} e′ₘ₊₁ ◦ F(e′ₘ, p′ₘ) ◦ F(p′ₙ, e′ₙ)
                   = ⊔_{n≤m<ω} e′ₘ₊₁ ◦ F(p′ₙ ◦ e′ₘ, p′ₘ ◦ e′ₙ)
                   = ⊔_{n≤m<ω} e′ₘ₊₁ ◦ F(pₘₙ, eₙₘ)
                   = ⊔_{n≤m<ω} e′ₘ₊₁ ◦ eₘ ◦ · · · ◦ eₙ₊₁
                   = ⊔_{n≤m<ω} e′ₘ₊₁ ◦ e_{(n+1)(m+1)}
                   = ⊔_{n≤m<ω} e′ₙ₊₁ = e′ₙ₊₁.

Symmetrically, we can also calculate the following.

    F(e′ₙ, p′ₙ) ◦ i⁻¹ = ⊔_{m<ω} F(e′ₙ, p′ₙ) ◦ F(p′ₘ, e′ₘ) ◦ p′ₘ₊₁
                     = ⊔_{n≤m<ω} F(p′ₘ ◦ e′ₙ, p′ₙ ◦ e′ₘ) ◦ p′ₘ₊₁
                     = ⊔_{n≤m<ω} F(eₙₘ, pₘₙ) ◦ p′ₘ₊₁
                     = ⊔_{n≤m<ω} pₙ₊₁ ◦ · · · ◦ pₘ ◦ p′ₘ₊₁
                     = ⊔_{n≤m<ω} p_{(m+1)(n+1)} ◦ p′ₘ₊₁
                     = ⊔_{n≤m<ω} p′ₙ₊₁ = p′ₙ₊₁.

We are now in a position to see that (D, i) has the minimal invariant property. It suffices to show that for all n < ω, e′ₙ ◦ p′ₙ = φⁿ(⊥_{D⊸D}), since we can then use (4.7) to deduce that id_D = ⊔_{n<ω} e′ₙ ◦ p′ₙ = ⊔_{n<ω} φⁿ(⊥_{D⊸D}) = fix(φ) as required. By the functorality of F and the results above,

    e′ₙ₊₁ ◦ p′ₙ₊₁ = i ◦ F(p′ₙ, e′ₙ) ◦ F(e′ₙ, p′ₙ) ◦ i⁻¹
                 = i ◦ F(e′ₙ ◦ p′ₙ, e′ₙ ◦ p′ₙ) ◦ i⁻¹.

We can now proceed by induction on n to show that φⁿ(⊥) equals i ◦ F(e′ₙ ◦ p′ₙ, e′ₙ ◦ p′ₙ) ◦ i⁻¹. The base case (n = 0) follows immediately by observing that for any d ∈ D, p′₀(d) = ⊥ and thus e′₀ ◦ p′₀ = ⊥_{D⊸D} by strictness of e′₀. For the inductive step, assume φⁿ(⊥) = i ◦ F(e′ₙ ◦ p′ₙ, e′ₙ ◦ p′ₙ) ◦ i⁻¹. From above, we know that the right-hand side of this is equal to e′ₙ₊₁ ◦ p′ₙ₊₁. Therefore

    φⁿ⁺¹(⊥) = i ◦ F(φⁿ(⊥), φⁿ(⊥)) ◦ i⁻¹
            = i ◦ F(e′ₙ₊₁ ◦ p′ₙ₊₁, e′ₙ₊₁ ◦ p′ₙ₊₁) ◦ i⁻¹,

which gives the desired result.

To see that the solution (i, D) is unique up to isomorphism, suppose there is another solution (i′, E) with the minimal invariant property and consider the square whose top and bottom edges are the map

    θ : (E ⊸ D) × (D ⊸ E) −→ (E ⊸ E) × (D ⊸ D)

and whose left and right vertical edges are the maps m and n respectively, where θ, m and n are defined as follows:

    θ : (f⁻, f⁺) ↦ (f⁺ ◦ f⁻, f⁻ ◦ f⁺)
    m : (f⁻, f⁺) ↦ (i ◦ F(f⁺, f⁻) ◦ i′⁻¹, i′ ◦ F(f⁻, f⁺) ◦ i⁻¹)
    n : (f⁻, f⁺) ↦ (i′ ◦ F(f⁻, f⁻) ◦ i′⁻¹, i ◦ F(f⁺, f⁺) ◦ i⁻¹).

Given these definitions it is easy to see that the diagram commutes, that is, θ ◦ m = n ◦ θ. Now recall Plotkin's uniformity principle, which states that for functions d ∈ (D → D), e ∈ (E → E) and f ∈ D ⊸ E such that e ◦ f = f ◦ d, we have fix(e) = f(fix(d)). Since θ is clearly strict whilst m and n are continuous (as the isomorphisms i, i′ are, and the functor F is an LFC-functor), we can apply this principle to the diagram above to deduce that θ(fix(m)) = fix(n). Writing (g⁻, g⁺) for fix(m), observe that θ(fix(m)) = (g⁺ ◦ g⁻, g⁻ ◦ g⁺) = fix(n) = (id_E, id_D), since (i, D) and (i′, E) both have the minimal invariant property. Moreover, since fix(m) = m(fix(m)) we have that (g⁻, g⁺) = (i ◦ F(g⁺, g⁻) ◦ i′⁻¹, i′ ◦ F(g⁻, g⁺) ◦ i⁻¹), meaning that the squares

    g⁺ ◦ i = i′ ◦ F(g⁻, g⁺)   and   g⁻ ◦ i′ = i ◦ F(g⁺, g⁻)

commute. We therefore have a mediating isomorphism between D and E. □

4.6 FM-sets of syntax

So far in this chapter we have considered FM-sets of domain-theoretic objects. However, it is important to remember that we can also form FM-sets of syntactic entities. In particular, the expressions of Mini-FreshML form an FM-set Exp: the permutation action is given simply by recursively traversing the structure of the expression in question and applying the permutation to any atoms encountered. With respect to this action, it is easy to see that each expression is finitely supported. Since there are no constructs which bind atoms in Mini-FreshML, the support supp(e) of some e ∈ Exp is equal to the atoms atms(e) of e. In a similar manner, we can form an FM-set consisting of the frame stacks of Chapter 3.

In the next chapter, we will see how our denotational semantics does in fact determine equivariant functions on FM-sets of syntax. There, we will use the notation a ♯ e to indicate that a ∉ supp(e) (and similarly for frame stacks and expressions).

5 Mini-FreshML, denotationally

'A mathematician is a machine for turning coffee into theorems.' —Erdős

In the previous chapter we developed a variety of domain theory which we claimed was good for reasoning about names and name binding. In this chapter we support this claim by using it to give a denotational semantics to Mini-FreshML. Our semantics is computationally adequate, meaning that equality of denotation implies observational equivalence. This property will enable us to use the denotational semantics to derive operational equivalence results of the type conjectured in Chapter 3. The major contributions of this chapter are twofold. Firstly, we demonstrate how a standard monad of continuations may be used [61] to provide a model of dynamic allocation. Secondly, we demonstrate how expressions and values in the Mini-FreshML metalanguage correspond to α-equivalence classes of object language terms which we aim to manipulate.

5.1 An overview

In order to model the computational effect of generating fresh names, our denotational semantics is monadic in the sense of Moggi [38]. This means that we distinguish between values and computations. Values of some type τ correspond to elements of an FM-cppo [[τ]], whilst computations of type τ correspond to elements of an FM-cppo T[[τ]], where T is a standard monad of continuations. Given that we are attempting to model the dynamic allocation of names, it may come as a surprise to the reader that we are not using a traditional dynamic allocation monad [37] which keeps track of the names generated during some computation. The reason we do not have to do this is that the finite support properties exhibited by objects in FM-domain theory enable us to keep the allocated names implicit. This fits well with the operational semantics based on frame stacks of §3.4. Furthermore, FM-domain theory has so far resisted attempts to construct standard dynamic allocation monads within it, as we shall see in §5.1.1.

Let us consider in more detail how we can denote values, expressions and frame stacks. To each Mini-FreshML type τ let us assign an FM-cppo [[τ]]. (We provide the exact definition of this map in due course.) Then we can define the denotation [[Γ]] of a typing context Γ as the dependent smash product

    [[Γ]] def= ⨂_{x∈dom(Γ)} [[Γ(x)]].

We treat a non-bottom ρ ∈ [[Γ]] as a finitely-supported function mapping each x ∈ dom(Γ) to an element ρ(x) ≠ ⊥ of the FM-cppo [[Γ(x)]]. Then for values v, frame stacks S and expressions e satisfying Γ ⊢ v : τ, Γ ⊢s S : τ ⊸ and Γ ⊢ e : τ respectively, we provide finitely-supported functions of the following kinds:

    V[[Γ ⊢ v : τ]] ∈ [[Γ]] ⊸ [[τ]]
    S[[Γ ⊢s S : τ ⊸ ]] ∈ [[Γ]] ⊸ [[τ]]^⊥
    E[[Γ ⊢ e : τ]] ∈ [[Γ]] ⊸ [[τ]]^⊥⊥

where we define D^⊥ def= D ⊸ 1⊥ for each FM-cppo D. FM-cppos [[τ]] are the domains of values of type τ, whilst FM-cppos [[τ]]^⊥⊥ are the domains of computations of type τ.

Intuitively, a non-bottom element d ∈ [[τ]] corresponds to a value of type τ. An element σ ∈ [[τ]]^⊥ corresponds to a stack accepting a value of type τ and returning ⊤ for termination, or ⊥ for divergence. Since the behaviour of expressions is determined by the enclosing frame stack, the denotation ε ∈ [[τ]]^⊥⊥ of some expression in context is a function that accepts the denotation of a frame stack in context and returns either ⊤ or ⊥. Thus, the denotations of expressions in context lie in the underlying set of a continuation monad (D ⊸ R) ⊸ R, where the 'result' set R is 1⊥. Although we shall not use them explicitly when giving our semantics¹, we do of course have the following standard monad operations.

▶ Unit. For d ∈ D, return d def= λσ ∈ D^⊥. σ(d).
▶ Kleisli lift. For e ∈ D^⊥⊥ and f ∈ D ⊸ E^⊥⊥,

    let x ⇐ e in f x def= λσ ∈ E^⊥. e(λd ∈ D. f d σ)

such that (let x ⇐ e in f x) ∈ E^⊥⊥.

¹Only since, when calculating in proofs, we have found it necessary to work with the 'macro-expanded' version of these: our definitions reflect this, as we shall see in due course. Simply a 'design decision', as they say in the industry.

We noted earlier that the continuation monad is sufficient to model the dynamic allocation of names—which, as in the operational semantics, will be modelled by atoms a ∈ 𝔸. This may only be done thanks to the finite support property of FM-cppos: given some FM-cppo D and any d ∈ D, we can always find some 'fresh' atom a which does not occur in the support of d. In particular, given the denotation of some frame stack in context σ ∈ [[τ]]^⊥, we can always pick some a ∉ supp(σ). The 'operational' intuition behind this specific case is that such an a corresponds to some atom not yet allocated during the evaluation process. In a nutshell, the denotation of the frame stack encapsulates, via its support, all of the necessary information about 'generated' names at runtime.

Writing [[name]] for the FM-cppo of atoms, constructed as [[name]] def= 𝔸⊥, define an element new ∈ [[name]]^⊥⊥ such that new(σ ∈ [[name]]^⊥) def= σ(a) for any a ∉ supp(σ). We may always pick such an a because for any σ, supp(σ) is finite and thus 𝔸 \ supp(σ) is cofinite. Moreover, given some σ ∈ [[name]]^⊥, the result σ(a) is the same no matter which representative a ∉ supp(σ) is chosen. For given any a′ ∉ supp(σ) not equal to a,

    σ(a) = ((a a′) · σ)(a)          since a, a′ ∉ supp(σ)
         = (a a′) · (σ((a a′) · a)) permutation action on functions
         = (a a′) · (σ(a′))         by definition of transposition
         = σ(a′)                    since ⊤ and ⊥ have empty support.

We have that new is monotone and continuous by virtue of elements of [[name]]^⊥ having these properties. It is strict because the least element of [[name]]^⊥ is the constantly-bottom function. Having defined the morphism new, we immediately obtain the denotation of the fresh expression: E[[Γ ⊢ fresh : name]](ρ) def= new.
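As a rough operational analogue (an illustration only: in the denotational semantics freshness comes for free from finite supports, whereas the sketch below manufactures fresh atoms with a counter, and its `bool` result only crudely stands in for 1⊥), the continuation monad and a fresh-name operation can be written in OCaml as follows. All names in the sketch are our own.

    (* Illustrative only: the continuation monad (D -> R) -> R together with an
       operational stand-in for the morphism 'new'. *)
    type atom = int

    type 'a cont = ('a -> bool) -> bool
    (* 'a cont plays the role of [[tau]]^⊥⊥; true stands in for ⊤ (termination). *)

    let return (d : 'a) : 'a cont = fun k -> k d

    let bind (e : 'a cont) (f : 'a -> 'b cont) : 'b cont =
      fun k -> e (fun d -> f d k)

    (* Supply the continuation with an atom it has not seen before. *)
    let counter = ref 0
    let fresh : atom cont = fun k -> incr counter; k !counter

    (* Example: allocate two names and observe that they are distinct. *)
    let distinct : bool =
      (bind fresh (fun a -> bind fresh (fun b -> return (a <> b))))
        (fun result -> result)

Here `distinct` evaluates to `true`, mirroring the fact that each use of new supplies the continuation with an atom outside its support.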

5.1.1 Dynamic allocation monads

Previous work [63] attempted to make use of a traditional dynamic allocation monad on FM-cppos, but this turned out to have problematic order-theoretic completeness properties. Here, for the record, we briefly review the details of this problem.

Write P_fin(𝔸) for the FM-set of all finite subsets of 𝔸, with permutation action as in Definition 4.1.7. Then for each FM-cppo D let us attempt to construct an FM-cppo T(D) whose underlying set is the quotient of D × P_fin(𝔸) by the equivariant equivalence relation

    (d, X) ∼ (d′, X′) def⇔ ∃π ∈ perm(𝔸). supp(d) \ X = supp(d′) \ X′ ∧ (∀a ∈ supp(d) \ X. π(a) = a) ∧ π · d = d′

and partially ordered by

    (d, X) ⊑ (d′, X′) def⇔ ∃X″ ∈ P_fin(𝔸), e, e′. (d, X) ∼ (e, X″) ∧ (d′, X′) ∼ (e′, X″) ∧ e ⊑ e′.

(The permutation action is inherited from the product FM-cppo.) Write d r X for the equivalence class of (d, X) under ∼. Elements d r X would correspond to denotations of values d whose computation has involved the generation of the atoms in X. Pitts has shown [unpublished note] that whilst ∼ is indeed an equivariant equivalence relation and ⊑ is likewise an equivariant partial order, T(D) is unfortunately not an FM-cppo, since it does not possess least upper bounds of all finitely-supported chains. To see this, take as D the FM-cppo P from §4.2.4 and let A₀ ⊂ A₁ ⊂ · · · be a strictly increasing chain of finite subsets of atoms. Then the following forms a finitely-supported chain in T(D):

    A₀ r A₀ ⊑ A₁ r A₁ ⊑ · · · .

The result of Pitts shows that if A r A′ ⊑ A″ r A‴ in T(D) then |A ∩ A′| ≤ |A″ ∩ A‴|. Therefore if A r A′ were an upper bound for this chain, then |A ∩ A′| ≥ |Aₙ ∩ Aₙ| = |Aₙ| would have to hold for all n. But this contradicts the finiteness of A ∩ A′.

Rather than investigate further along this route to determine a correct construction of such a monad T, we decided to change tack and use the familiar continuation monad instead. That in itself is arguably far more interesting, as we will see in this chapter.

F_unit(−, +):   (D, E) ↦ 1⊥;   (f⁻, f⁺) ↦ id_{1⊥}
F_name(−, +):   (D, E) ↦ 𝔸⊥;   (f⁻, f⁺) ↦ id_{𝔸⊥}
F_δ(−, +):      (D, E) ↦ E;    (f⁻, f⁺) ↦ id_E

F_{<<name>>τ}(−, +):
    (D, E) ↦ [𝔸]F_τ(D, E)
    (f⁻ ∈ D′ ⊸ D, f⁺ ∈ E ⊸ E′) ↦ λd ∈ [𝔸]F_τ(D, E). [a′]F_τ(f⁻, f⁺)(d @ a′)
        for some/any a′ ∈ 𝔸 \ supp(d, f⁻, f⁺)

F_{τ×τ′}(−, +):
    (D, E) ↦ F_τ(D, E) ⊗ F_{τ′}(D, E)
    (f⁻ ∈ D′ ⊸ D, f⁺ ∈ E ⊸ E′) ↦
        λ⟨d, d′⟩ ∈ F_τ(D, E) ⊗ F_{τ′}(D, E). ⟨F_τ(f⁻, f⁺)(d), F_{τ′}(f⁻, f⁺)(d′)⟩.

F_{τ→τ′}(−, +):
    (D, E) ↦ F_τ(E, D) ⊸ (F_{τ′}(D, E))^⊥⊥
    (f⁻ ∈ D′ ⊸ D, f⁺ ∈ E ⊸ E′) ↦
        λf ∈ F_τ(E, D) ⊸ (F_{τ′}(D, E) ⊸ 1⊥) ⊸ 1⊥.
        λd′ ∈ F_τ(E′, D′). λf′ ∈ F_{τ′}(D′, E′) ⊸ 1⊥. f(F_τ(f⁺, f⁻)(d′))(f′ ◦ F_{τ′}(f⁻, f⁺)).

Figure 5.1: Actions of functors F_τ : FM-Cpo⊥^op × FM-Cpo⊥ −→ FM-Cpo⊥.

5.2 Definition of the denotational semantics

5.2.1 Denotation of types

Recall that for each Mini-FreshML type τ, [[τ]] will be an FM-cppo whose non-⊥ elements correspond to closed values of type τ. To construct these, define an LFC-functor F : FM-Cpo⊥^op × FM-Cpo⊥ −→ FM-Cpo⊥ by

    F(−, +) def= F_{σ₁}(−, +) ⊕ · · · ⊕ F_{σ_K}(−, +)

where the family of LFC-functors F_τ is defined by induction on the structure of τ as shown in Figure 5.1. Now we apply Theorem 4.5.2 to deduce the existence of a minimal invariant solution (i, D) to the recursive domain equation F(D, D) ≅ D. Thus i is an isomorphism from F(D, D) to D and the identity on D is fix(φ), where φ is the function in Definition 4.5.1. We may now define the denotation [[τ]] of a type τ as [[τ]] def= F_τ(D, D). It follows that the isomorphism i : [[σ₁]] ⊕ · · · ⊕ [[σ_K]] ≅ [[δ]] serves, along with the in_k, to mediate between [[δ]] and elements of each [[σ_k]].
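As a concrete point of reference (a hypothetical example, not taken from this report: the argument types σ₁, …, σ_K are determined by whatever datatype declaration the program contains), suppose the program declares the usual object-language λ-term datatype. In plain OCaml—which lacks the abstraction type former, so an atom/term pair stands in for it below—the shape of the declaration and of the corresponding recursive domain equation is:

    (* Hypothetical example.  In the metalanguage the Lam argument would have
       the abstraction type; plain OCaml has no such type former, so an
       atom/term pair stands in for it here. *)
    type atom = int            (* stand-in for the type 'name' of bindable names *)

    type lam =
      | Var of atom
      | Lam of atom * lam      (* an abstraction of a name in a term *)
      | App of lam * lam

    (* The corresponding recursive domain equation solved by Theorem 4.5.2:
         [[lam]]  ≅  [[name]] ⊕ [𝔸][[lam]] ⊕ ([[lam]] ⊗ [[lam]])
       with [[name]] = 𝔸⊥, using the functors of Figure 5.1. *)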

5.2.2 Denotation of expressions and frame stacks Each of the maps V[[Γ ` v : τ]] ∈ [[Γ]] ( [[τ]] ⊥ S[[Γ `s S : τ ( ]] ∈ [[Γ]] ( [[τ]] E[[Γ ` e : τ]] ∈ [[Γ]] ( [[τ]]⊥⊥ sends ⊥ to itself and is otherwise defined as given in Figures 5.2 and 5.3, using the continuous equivariant function if defined as follows:

0 0 0 def d if a = a ; if a, a , d, d = 0 (d otherwise. The ‘continuation-passing style’ of the definitions is self-evident. For convenience, when talking about closed values, frame stacks and expressions we often abbreviate V[[∅ ` v : τ]](∅) to V[[v]], etc. Note that each canonical form v has a denotation qua value, V[[Γ ` v : τ]] ∈ [[Γ]] ( [[τ]], and also qua expression, E[[Γ ` v : τ]] ∈ [[Γ]] ( [[τ]]⊥⊥. These two denotations are related by the following lemma. Lemma 5.2.1. For a canonical form v such that Γ ` v : τ, then E[[Γ ` v : τ]] = return ◦ V[[Γ ` v : τ]]. Proof. By induction on the structure of v. I Cases (var), (unit), (name) and (fun) are trivial. I Case (con). The induction hypothesis tells us E[[Γ ` v : τ]] = return ◦ V[[Γ ` v : τ]]. There- fore for ρ ∈ [[Γ]],

E[[Γ ` Ck(v) : δ]](ρ) ⊥ = λσ ∈ [[δ]] . E[[Γ ` v : σk]](ρ)(λd ∈ [[σk]]. σ((i ◦ ink)(d))) ⊥ = λσ ∈ [[δ]] . return(V[[Γ ` v : σk]](ρ))(λd ∈ [[σk]]. σ((i ◦ ink)(d))) ⊥ = λσ ∈ [[δ]] . σ((i ◦ ink)(V[[Γ ` v : σk]](ρ))) = (return ◦ V[[Γ ` Ck(v) : δ]])(ρ).

I Case (pair). The induction hypothesis tells us E[[Γ ` v : τ]] = return ◦ V[[Γ ` v : τ]] and E[[Γ ` v0 : τ]] = return ◦ V[[Γ ` v0 : τ]]. So for ρ ∈ [[Γ]], E[[Γ ` (v, v0) : τ]](ρ) ⊥ = λσ ∈ [[τ × τ 0]] . E[[Γ ` v : τ]](ρ)(λd ∈ [[τ]]. E[[Γ ` v0 : τ 0]](ρ)(λd0 ∈ [[τ 0]]. σhd, d0i)) ⊥ = λσ ∈ [[τ × τ 0]] . return(V[[Γ ` v : τ]](ρ))(λd ∈ [[τ]]. return(V[[Γ ` v0 : τ 0]](ρ))(λd0 ∈ [[τ 0]]. σhd, d0i)) ⊥ = λσ ∈ [[τ × τ 0]] . return(V[[Γ ` v0 : τ 0]](ρ))(λd0 ∈ [[τ 0]]. σhV[[Γ ` v : τ]](ρ), d0i) ⊥ = λσ ∈ [[τ × τ 0]] . σhV[[Γ ` v : τ]](ρ), V[[Γ ` v0 : τ 0]](ρ)i = returnhV[[Γ ` v : τ]](ρ), V[[Γ ` v0 : τ 0]](ρ)i = (return ◦ V[[Γ ` (v, v0) : τ × τ 0]])(ρ).

I Case (abst). As for the pair case: we omit the details.  Now that we have stated the full definitions of V[[–]], S[[–]] and E[[–]] we are in a position to discuss the denotations of the other non-standard constructs, namely abstraction expressions, expressions which deconstruct abstraction values, and atom-swapping expressions. We now consider each of these in turn, starting with abstraction expressions. To understand their denotation, we first need to consider the denotation of a canonical form <>v, say of type <>τ. (For simplicity, take v to be closed.) This is represented simply as

the equivalence class [a]V[[v]] in [𝔸][[τ]]. Then the denotation of a non-canonical abstraction expression <<e>>e′ such that Γ ⊢ <<e>>e′ : <<name>>τ is given by

E[[Γ ⊢ <<e>>e′ : <<name>>τ]](ρ) def= λσ ∈ [[<<name>>τ]]^⊥. E[[Γ ⊢ e : name]](ρ)(λa ∈ [[name]]. E[[Γ ⊢ e′ : τ]](ρ)(λd ∈ [[τ]]. σ([a]d))).

This somewhat complicated-looking clause provides us with a good chance to examine the continuation-passing style of the semantics. Intuitively, the denotation of an abstraction expression acts in such a way as to parallel the following evaluation procedure.

E[[Γ ⊢ x : τ]](ρ) def= λσ ∈ [[Γ(x)]]^⊥. σ(ρ(x)).
E[[Γ ⊢ () : unit]](ρ) def= λσ ∈ [[unit]]^⊥. σ(⊤).
E[[Γ ⊢ a : name]](ρ) def= λσ ∈ [[name]]^⊥. σ(a).
E[[Γ ⊢ Ck(e) : δ]](ρ) def= λσ ∈ [[δ]]^⊥. E[[Γ ⊢ e : σk]](ρ)(λd ∈ [[σk]]. σ((i ∘ in_k)(d))).
E[[Γ ⊢ (e, e′) : τ × τ′]](ρ) def= λσ ∈ [[τ × τ′]]^⊥. E[[Γ ⊢ e : τ]](ρ)(λd ∈ [[τ]]. E[[Γ ⊢ e′ : τ′]](ρ)(λd′ ∈ [[τ′]]. σ⟨d, d′⟩)).
E[[Γ ⊢ fresh : name]](ρ) def= new def= λσ ∈ [[name]]^⊥. σ(a)   (any a ∈ 𝔸 \ supp(σ)).
E[[Γ ⊢ <<e>>e′ : <<name>>τ]](ρ) def= λσ ∈ [[<<name>>τ]]^⊥. E[[Γ ⊢ e : name]](ρ)(λa ∈ [[name]]. E[[Γ ⊢ e′ : τ]](ρ)(λd ∈ [[τ]]. σ([a]d))).
E[[Γ ⊢ swap e, e′ in e″ : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : name]](ρ)(λa ∈ [[name]]. E[[Γ ⊢ e′ : name]](ρ)(λa′ ∈ [[name]]. E[[Γ ⊢ e″ : τ]](ρ)(λd ∈ [[τ]]. σ((a a′) · d)))).
E[[Γ ⊢ fun f(x) = e : τ → τ′]](ρ) def= λσ ∈ [[τ → τ′]]^⊥. σ(fix(λd ∈ [[τ → τ′]]. λd′ ∈ [[τ]]. E[[Γ, f : τ → τ′, x : τ ⊢ e : τ′]](ρ[f ↦ d, x ↦ d′]))).
E[[Γ ⊢ e e′ : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : τ′ → τ]](ρ)(λd ∈ [[τ′ → τ]]. E[[Γ ⊢ e′ : τ′]](ρ)(λd′ ∈ [[τ′]]. (d d′)(σ))).
E[[Γ ⊢ let x = e in e′ : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : τ′]](ρ)(λd′ ∈ [[τ′]]. E[[Γ, x : τ′ ⊢ e′ : τ]](ρ[x ↦ d′])(σ)).
E[[Γ ⊢ let (x, x′) = e in e′ : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : τ1 × τ2]](ρ)(λ⟨d1, d2⟩ ∈ [[τ1 × τ2]]. E[[Γ, x : τ1, x′ : τ2 ⊢ e′ : τ]](ρ[x ↦ d1, x′ ↦ d2])(σ)).
E[[Γ ⊢ let <<x>>x′ = e in e′ : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : <<name>>τ′]](ρ)(λ[a]d′ ∈ [[<<name>>τ′]]. E[[Γ, x : name, x′ : τ′ ⊢ e′ : τ]](ρ[x ↦ a′, x′ ↦ (a a′) · d′])(σ))   (any a′ ∈ 𝔸 \ supp(σ, a, d′, e, e′, ρ)).
E[[Γ ⊢ if e = e′ then e1 else e2 : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : name]](ρ)(λa ∈ [[name]]. E[[Γ ⊢ e′ : name]](ρ)(λa′ ∈ [[name]]. if(a, a′, E[[Γ ⊢ e1 : τ]](ρ)(σ), E[[Γ ⊢ e2 : τ]](ρ)(σ)))).
E[[Γ ⊢ match e with · · · | Ck(xk) -> ek | · · · : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : δ]](ρ)(λd ∈ [[δ]]. E[[Γ, xk : σk ⊢ ek : τ]](ρ[xk ↦ dk])(σ)) for the unique k and dk such that d = (i ∘ in_k)(dk) when d ≠ ⊥.

V[[Γ ⊢ x : τ]](ρ) def= ρ(x).
V[[Γ ⊢ () : unit]](ρ) def= ⊤.
V[[Γ ⊢ a : name]](ρ) def= a.
V[[Γ ⊢ Ck(v) : δ]](ρ) def= (i ∘ in_k)(V[[Γ ⊢ v : σk]](ρ)).
V[[Γ ⊢ (v, v′) : τ × τ′]](ρ) def= ⟨V[[Γ ⊢ v : τ]](ρ), V[[Γ ⊢ v′ : τ′]](ρ)⟩.
V[[Γ ⊢ <<a>>v : <<name>>τ]](ρ) def= [a](V[[Γ ⊢ v : τ]](ρ)).
V[[Γ ⊢ fun f(x) = e : τ → τ′]](ρ) def= fix(λf′ ∈ [[τ → τ′]]. λx′ ∈ [[τ]]. E[[Γ, f : τ → τ′, x : τ ⊢ e : τ′]](ρ[f ↦ f′, x ↦ x′])).

Figure 5.2: Denotation of expressions and canonical forms.

1. Accept a frame stack S expecting an abstraction value.
2. Evaluate the expression in binding position to yield an atom a.
3. Evaluate the remaining expression to yield a value v.
4. Continue evaluation, having filled the hole in frame stack S with the abstraction value <<a>>v.

The sequentiality of this process is encoded in the denotational semantics by identifying the first computation to be performed and then passing that a continuation which encodes the next

S[[Γ ⊢s [] : τ ⊸]](ρ) def= λd ∈ [[τ]]. ⊤.
S[[Γ ⊢s S ◦ Ck([–]) : σk ⊸]](ρ) def= λd ∈ [[σk]]. S[[Γ ⊢s S : δ ⊸]](ρ)((i ∘ in_k)(d)).
S[[Γ ⊢s S ◦ ([–], e) : τ ⊸]](ρ) def= λd ∈ [[τ]]. E[[Γ ⊢ e : τ′]](ρ)(λd′ ∈ [[τ′]]. S[[Γ ⊢s S : τ × τ′ ⊸]](ρ)⟨d, d′⟩).
S[[Γ ⊢s S ◦ (v, [–]) : τ′ ⊸]](ρ) def= λd′ ∈ [[τ′]]. S[[Γ ⊢s S : τ × τ′ ⊸]](ρ)⟨V[[Γ ⊢ v : τ]](ρ), d′⟩.
S[[Γ ⊢s S ◦ <<[–]>>e : name ⊸]](ρ) def= λa ∈ [[name]]. E[[Γ ⊢ e : τ]](ρ)(λd ∈ [[τ]]. S[[Γ ⊢s S : <<name>>τ ⊸]](ρ)([a]d)).
S[[Γ ⊢s S ◦ <<v>>[–] : τ ⊸]](ρ) def= λd ∈ [[τ]]. S[[Γ ⊢s S : <<name>>τ ⊸]](ρ)([V[[Γ ⊢ v : name]](ρ)]d).
S[[Γ ⊢s S ◦ swap [–], e′ in e″ : name ⊸]](ρ) def= λa ∈ [[name]]. E[[Γ ⊢ e′ : name]](ρ)(λa′ ∈ [[name]]. E[[Γ ⊢ e″ : τ]](ρ)(λd ∈ [[τ]]. S[[Γ ⊢s S : τ ⊸]](ρ)((a a′) · d))).
S[[Γ ⊢s S ◦ swap v, [–] in e″ : name ⊸]](ρ) def= λa′ ∈ [[name]]. E[[Γ ⊢ e″ : τ]](ρ)(λd ∈ [[τ]]. S[[Γ ⊢s S : τ ⊸]](ρ)(((V[[Γ ⊢ v : name]](ρ)) a′) · d)).
S[[Γ ⊢s S ◦ swap v, v′ in [–] : τ ⊸]](ρ) def= λd ∈ [[τ]]. S[[Γ ⊢s S : τ ⊸]](ρ)(((V[[Γ ⊢ v : name]](ρ)) (V[[Γ ⊢ v′ : name]](ρ))) · d).
S[[Γ ⊢s S ◦ [–] e : (τ → τ′) ⊸]](ρ) def= λd ∈ [[τ → τ′]]. E[[Γ ⊢ e : τ]](ρ)(λd′ ∈ [[τ]]. (d d′)(S[[Γ ⊢s S : τ′ ⊸]](ρ))).
S[[Γ ⊢s S ◦ v [–] : τ ⊸]](ρ) def= λd ∈ [[τ]]. ((V[[Γ ⊢ v : τ → τ′]](ρ)) d)(S[[Γ ⊢s S : τ′ ⊸]](ρ)).
S[[Γ ⊢s S ◦ let x = [–] in e : τ ⊸]](ρ) def= λd ∈ [[τ]]. E[[Γ, x : τ ⊢ e : τ′]](ρ[x ↦ d])(S[[Γ ⊢s S : τ′ ⊸]](ρ)).
S[[Γ ⊢s S ◦ let (x, x′) = [–] in e : τ × τ′ ⊸]](ρ) def= λ⟨d1, d2⟩ ∈ [[τ × τ′]]. E[[Γ, x : τ, x′ : τ′ ⊢ e : τ″]](ρ[x ↦ d1, x′ ↦ d2])(S[[Γ ⊢s S : τ″ ⊸]](ρ)).
S[[Γ ⊢s S ◦ let <<x>>x′ = [–] in e : <<name>>τ ⊸]](ρ) def= λ[a]d ∈ [[<<name>>τ]]. E[[Γ, x : name, x′ : τ ⊢ e : τ′]](ρ[x ↦ a′, x′ ↦ (a a′) · d])(S[[Γ ⊢s S : τ′ ⊸]](ρ))   (any a′ ∈ 𝔸 \ supp(S, a, d, e, ρ)).
S[[Γ ⊢s S ◦ if [–] = e′ then e1 else e2 : name ⊸]](ρ) def= λa ∈ [[name]]. E[[Γ ⊢ e′ : name]](ρ)(λa′ ∈ [[name]]. if(a, a′, E[[Γ ⊢ e1 : τ]](ρ)(S[[Γ ⊢s S : τ ⊸]](ρ)), E[[Γ ⊢ e2 : τ]](ρ)(S[[Γ ⊢s S : τ ⊸]](ρ)))).
S[[Γ ⊢s S ◦ if v = [–] then e1 else e2 : name ⊸]](ρ) def= λa′ ∈ [[name]]. if(V[[Γ ⊢ v : name]](ρ), a′, E[[Γ ⊢ e1 : τ]](ρ)(S[[Γ ⊢s S : τ ⊸]](ρ)), E[[Γ ⊢ e2 : τ]](ρ)(S[[Γ ⊢s S : τ ⊸]](ρ))).
S[[Γ ⊢s S ◦ match [–] with · · · | Ck(xk) -> ek | · · · : δ ⊸]](ρ) def= λd ∈ [[δ]]. E[[Γ, xk : σk ⊢ ek : τ]](ρ[xk ↦ dk])(S[[Γ ⊢s S : τ ⊸]](ρ)) for the unique k and dk such that d = (i ∘ in_k)(dk).

Figure 5.3: Denotation of frame stacks.

computation to be performed. In the above definition we are constructing two continuations, which are easily seen to correspond to stages 3 and 4 of the above evaluation procedure:

• λa ∈ [[name]]. E[[Γ ⊢ e′ : τ]](ρ)(λd ∈ [[τ]]. σ([a]d)); and

• λd ∈ [[τ]]. σ([a]d).
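The way the clause sequences its two sub-computations can also be pictured with a small continuation-passing sketch in OCaml. This is only an illustration of the sequencing, not the semantics itself: 1⊥ is read crudely as bool, continuations as OCaml functions, and an abstraction value is represented as an atom/value pair; all names are hypothetical.

```ocaml
(* Illustration only: the CPS sequencing used by the clause for <<e>>e'.
   The 'frame stack' is the final continuation k; the two nested lambdas
   correspond to the two continuations listed above. *)
type atom = int
type 'v cont = ('v -> bool) -> bool   (* an informal reading of [[tau]]^(bot-bot) *)

let denote_abst (de : atom cont) (de' : 'v cont) : (atom * 'v) cont =
  fun k ->
    de  (fun a ->      (* evaluate the binding position to an atom a *)
    de' (fun d ->      (* then evaluate the body to a value d *)
    k (a, d)))         (* finally hand the abstraction to the frame stack *)
```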

By virtue of their denotations we would expect abstraction values to share the good properties endowed to members of abstraction FM-cppos. A nice example is the algorithm used for testing abstraction values for equality described in §2.2.5. For suppose that we have a closed value <<a>>v as above together with another such value <<a′>>v′. Let us say that the denotations of v and v′ are d and d′ respectively, so that the denotations of the two abstraction values are [a]d and [a′]d′ respectively. Now according to the properties of the partial order ⊑ on abstraction FM-cppos we have that

[a]d = [a′]d′ ⇔ (a a″) · d = (a′ a″) · d′

for some/any fresh a″ ∈ 𝔸 \ supp(a, a′, d, d′). This exactly parallels the construction of the equality test² for such values in the language. There, we choose a fresh atom a″ and then just compare the values (a a″) · v and (a′ a″) · v′.
Pattern-matching on abstraction values is provided by the let <<x>>x′ = e in e′ construct. The denotation of such an expression is as follows:

E[[Γ ⊢ let <<x>>x′ = e in e′ : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : <<name>>τ′]](ρ)(λ[a]d′ ∈ [[<<name>>τ′]]. E[[Γ, x : name, x′ : τ′ ⊢ e′ : τ]](ρ[x ↦ a′, x′ ↦ (a a′) · d′])(σ))

(any a′ ∈ 𝔸 \ supp(σ, a, d′, e, e′, ρ)).

Again, whilst this looks complicated, careful inspection will reveal that the construction of the function exactly parallels the procedure for deconstructing abstraction values given in Chapter 2. Finally, we consider explicit atom-swapping operations. Denotationally, performing such an operation just comes down to exploiting the permutation action of the particular FM-cppo in question, as follows.

E[[Γ ⊢ swap e, e′ in e″ : τ]](ρ) def= λσ ∈ [[τ]]^⊥. E[[Γ ⊢ e : name]](ρ)(λa ∈ [[name]]. E[[Γ ⊢ e′ : name]](ρ)(λa′ ∈ [[name]]. E[[Γ ⊢ e″ : τ]](ρ)(λd ∈ [[τ]]. σ((a a′) · d)))).

If only implementing swapping in the actual Fresh O'Caml runtime system were so straightforward: we will see in §6.6.2 why it is not.
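The three constructs just discussed (equality of abstraction values, deconstruction, and explicit swapping) can also be illustrated operationally with a small self-contained OCaml sketch over a toy term datatype. The swap, gensym, abst_equal and unbind functions below are illustrative stand-ins, not the Fresh O'Caml implementation.

```ocaml
(* Toy illustration of swapping, abstraction equality (cf. the some/any
   property above) and deconstruction, over an ordinary first-order term
   type.  gensym stands in for choosing a fresh atom. *)
type atom = int
type term = Var of atom | App of term * term

let counter = ref 0
let gensym () = incr counter; !counter

(* the permutation action (a a') . t *)
let rec swap a a' = function
  | Var b -> Var (if b = a then a' else if b = a' then a else b)
  | App (t, u) -> App (swap a a' t, swap a a' u)

(* [a]t = [a']t'  iff  (a a'') . t = (a' a'') . t'  for a fresh a'' *)
let abst_equal (a, t) (a', t') =
  let a'' = gensym () in
  swap a a'' t = swap a' a'' t'

(* deconstruction: freshen the bound atom and swap it in, as in Chapter 2 *)
let unbind (a, t) =
  let a' = gensym () in
  (a', swap a a' t)
```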

5.3 Some properties of the semantics

5.3.1 Support and equivariance properties

Recall §4.6 where we introduced the notion of FM-sets of syntax. Here we refine the FM-set Exp of that section to that of an FM-set of terms in context, denoted by E and defined as follows. The underlying set is

{(Γ, e, τ, ρ) | Γ ⊢ e : τ ∧ ρ ∈ [[Γ]]}

and the permutation action is given by π · (Γ, e, τ, ρ) def= (Γ, π · e, τ, π · ρ). With respect to this action, each element is indeed finitely supported. The maps E[[–]] can now be thought of as elements of the dependent product

∏_{(Γ,e,τ,ρ)∈E} [[τ]]^⊥⊥.

Now we can see that each of these elements is indeed equivariant, since the following lemma tells us that E[[Γ ⊢ π · e : τ]](π · ρ) = π · E[[Γ ⊢ e : τ]](ρ). Similar results hold for V[[–]] and S[[–]]. However, it must be noted that the functions denoting values, frame stacks and expressions do not necessarily have empty support by themselves, since atoms may occur throughout values and expressions. As a consequence they are not necessarily morphisms in the category FM-Cpo⊥ (as one might expect in a categorical semantics, for example).

This is easily seen by taking any a ∈ 𝔸 and considering the function V[[⊢ a : name]](∅), which has a finite support of just {a}.

Lemma 5.3.1 (Equivariance). For permutations π, values v, frame stacks S and expressions e,

Γ ⊢ v : τ ⇒ ∀ρ ∈ [[Γ]]. π · (V[[Γ ⊢ v : τ]](ρ)) = V[[Γ ⊢ π · v : τ]](π · ρ);
Γ ⊢s S : τ ⊸ ⇒ ∀ρ ∈ [[Γ]]. π · (S[[Γ ⊢s S : τ ⊸]](ρ)) = S[[Γ ⊢s π · S : τ ⊸]](π · ρ);
Γ ⊢ e : τ ⇒ ∀ρ ∈ [[Γ]]. π · (E[[Γ ⊢ e : τ]](ρ)) = E[[Γ ⊢ π · e : τ]](π · ρ).

² Of course, in Fresh O'Caml we have more general abstraction types than <<name>>τ, but the similarity still holds.

Proof. By induction over the axioms and rules defining the typing relations ⊢ and ⊢s. □

Corollary 5.3.2. For values v, frame stacks S, expressions e and atoms a,

Γ ⊢ v : τ ⇒ ∀ρ ∈ [[Γ]]. a # v, ρ ⇒ a # V[[Γ ⊢ v : τ]](ρ);
Γ ⊢s S : τ ⊸ ⇒ ∀ρ ∈ [[Γ]]. a # S, ρ ⇒ a # S[[Γ ⊢s S : τ ⊸]](ρ);
Γ ⊢ e : τ ⇒ ∀ρ ∈ [[Γ]]. a # e, ρ ⇒ a # E[[Γ ⊢ e : τ]](ρ).

Therefore supp(V[[Γ ` v : τ]](ρ)) ⊆ supp(v) ∪ supp(ρ); and similarly for frame stacks S and expressions e.

Proof. The second sentence follows from the first just because a # v holds iff a ∉ supp(v) (and similarly for S and e). For the first part, recall that a # v holds just when (a a′) · v = v for all a′ ∈ 𝔸 \ supp(v). We wish to show that (a a′) · (V[[Γ ⊢ v : τ]](ρ)) = V[[Γ ⊢ v : τ]](ρ), which by the previous Lemma is equivalent to asking V[[Γ ⊢ (a a′) · v : τ]]((a a′) · ρ) = V[[Γ ⊢ v : τ]](ρ). However, we already have (a a′) · v = v and (a a′) · ρ = ρ. □

5.3.2 Substitutivity properties

Notation 5.3.3. Write V[[ψ]] for the element of [[Γ]] such that for all x ∈ dom(Γ), V[[ψ]](x) def= V[[ψ(x)]]. □

Lemma 5.3.4 (Substitutivity). For typing contexts Γ, values v, frame stacks S and expressions e,

Γ ⊢ v : τ ⇒ ∀ψ ∈ Subst_Γ. V[[v[ψ]]] = V[[Γ ⊢ v : τ]]V[[ψ]];
Γ ⊢s S : τ ⊸ ⇒ ∀ψ ∈ Subst_Γ. S[[S[ψ]]] = S[[Γ ⊢s S : τ ⊸]]V[[ψ]];
Γ ⊢ e : τ ⇒ ∀ψ ∈ Subst_Γ. E[[e[ψ]]] = E[[Γ ⊢ e : τ]]V[[ψ]].

Proof. By induction over the structures of v, S and e. □

5.4 Computational adequacy

In this section we prove a computational adequacy result which provides an important bridge between the operational and denotational semantics: it states that expressions with equal denotations are observationally equivalent. The adequacy result is not easy to prove, so at this juncture we simply state the theorem and then postpone its proof until we have developed some further theory.

Theorem 5.4.1 (Adequacy). Given a typing context Γ and a typeable expression e such that Γ ⊢ e : τ, then for all closed frame stacks S of type τ ⊸ and substitutions ψ ∈ Subst_Γ³

E[[Γ ⊢ e : τ]]V[[ψ]]S[[S]] = ⊤ ⇔ ⟨S, e[ψ]⟩↓ .

Thus if e is a closed expression we have that E[[e]]S[[S]] = ⊤ ⇔ ⟨S, e⟩↓. □

As far as proof is concerned, the forwards direction of the theorem is the difficult case. We adopt the standard technique of using type-indexed logical relations which relate domain elements to pieces of syntax.

5.4.1 Construction of the logical relations

We construct three families of relations⁴ in the style of [51]:

C^val_τ ⊆eq [[τ]] × Val_τ      C^stk_τ ⊆eq [[τ]]^⊥ × Stack_τ      C^exp_τ ⊆eq [[τ]]^⊥⊥ × Exp_τ

which must satisfy the following properties. Firstly, for each v ∈ Val_τ, the set {d | d C^val_τ v} must contain ⊥ and be closed under least upper bounds of countable, finitely-supported chains. Furthermore we ask the following:

d C^val_unit ()   (5.1)

³ Recall that ψ maps value identifiers to closed values; thus, e[ψ] must be a closed expression since e is typeable in context Γ.
⁴ Note the use of ⊆eq to indicate an equivariant subset, in the sense of Definition 4.1.6.

d C^val_name v ⇔ (d ≠ ⊥ ⇒ d = v)   (5.2)
d C^val_δ Ck(vk) ⇔ ∃dk ∈ [[σk]]. d = (i ∘ in_k)(dk) ∧ dk C^val_σk vk   (5.3)
[a1]d C^val_<<name>>τ <<a2>>v ⇔ (a1 a) · d C^val_τ (a2 a) · v   (5.4)
      for any a ∈ 𝔸 \ supp(a1, a2, d, v)
⟨d1, d2⟩ C^val_τ×τ′ (v1, v2) ⇔ d1 C^val_τ v1 ∧ d2 C^val_τ′ v2   (5.5)
d C^val_τ→τ′ v ⇔ ∀d′ C^val_τ v′. d(d′) C^exp_τ′ v v′   (5.6)

The families of auxiliary relations C^stk_τ ⊆eq [[τ]]^⊥ × Stack_τ (relating domain elements σ to frame stacks S of argument type τ) and C^exp_τ ⊆eq [[τ]]^⊥⊥ × Exp_τ (relating domain elements ε to expressions e to be evaluated in a frame stack of argument type τ) are defined in terms of C^val_τ as follows.

σ C^stk_τ S def⇔ ∀d C^val_τ v. σ(d) = ⊤ ⇒ ⟨S, v⟩↓   (5.7)
ε C^exp_τ e def⇔ ∀σ C^stk_τ S. ε(σ) = ⊤ ⇒ ⟨S, e⟩↓   (5.8)

The meta-notation ∀d C^val_τ v expresses universal quantification over all d ∈ [[τ]] and v ∈ Val_τ such that d C^val_τ v. We read ∀σ C^stk_τ S in a similar manner.
The clauses above define the logical relation C^val_τ in terms of a logical relation C^val_δ that is the fixed point of a certain operator acting on relations, which we define below. However, we cannot simply appeal to Tarski's fixed point theorem to construct this, as the negative occurrence of C^val_τ in clause (5.6) means that this operator is not necessarily monotonic⁵. We thus adapt the ‘relational structures’ of Pitts [46] in order to proceed.

5.4.2 Relational structures

Definition 5.4.2 (Relational structure). A relational structure R on the category FM-Cpo⊥ is specified by the following: for each FM-cppo D, an FM-set R(D) of ‘R-relations on D’, and an equivariant FM-relation ⊂_R ⊆ (D ⊸ E) × R(D) × R(E). We write f : R ⊂ S iff f ∈ D ⊸ E and (f, R ∈ R(D), S ∈ R(E)) is in ⊂_R (abbreviating ⊂_R to ⊂ when the meaning is clear). The relations ⊂ must satisfy: for every R ∈ R(D), id_D : R ⊂ R; and given a second morphism g ∈ E ⊸ E′ with g : S ⊂ T, for T ∈ R(E′), then g ∘ f : R ⊂ T. When id_D : R ⊂ R′ and id_D : R′ ⊂ R imply that R = R′, the relational structure is said to be normal. □

The particular (normal) relational structure R which we shall use is defined as follows. For each FM-cppo D, let R(D) consist of all finitely supported subsets of D × Val_τ which contain {⊥} × Val_τ. Write (d, v : τ) for a member of one of these subsets to make the typing annotation explicit. Since each of these subsets is finitely supported, R(D) is indeed an FM-set when equipped with the permutation action π · R def= {(π · d, π · v : τ) | (d, v : τ) ∈ R} for R ∈ R(D). For a morphism f ∈ D ⊸ E, R ∈ R(D) and S ∈ R(E), say that f : R ⊂ S holds iff (d, v : τ) ∈ R ⇒ (f(d), v : τ) ∈ S. The relation ⊂ has the following important properties.
• Equivariance. It is easy to see that ⊂ is supported by the empty set, the permutation action being given by that in Definition 4.1.9.
• Inverse images. For all f ∈ D ⊸ E and each S ∈ R(E) there exists a unique f*S def= {(d, v : τ) | (f(d), v : τ) ∈ S} such that for all g ∈ D ⊸ D and R ∈ R(D), g : R ⊂ f*S ⇔ f ∘ g : R ⊂ S.
• Intersections. Given any finitely-supported subset of relations S ⊆ R(E) there exists a unique relation ⋂S ∈ R(E) (given by set-theoretic intersection of the relations in S) such that g : R ⊂ ⋂S holds iff g : R ⊂ S for each S ∈ S.

Lemma 5.4.3. If i ∈ E ⊸ D is an isomorphism then for R ∈ R(D) and S ∈ R(E), id_E : S ⊂ i*R ⇔ id_D : (i⁻¹)*S ⊂ R.

Proof. We give the forwards direction, for the reverse is similar. As assumptions we have that (d, v : τ) ∈ S ⇒ (i(d), v : τ) ∈ R and that (d, v : τ) ∈ (i⁻¹)*S. It follows from the second of these that (i⁻¹(d), v : τ) ∈ S by definition of the inverse image operator (–)*. However, the first assumption then tells us that ((i ∘ i⁻¹)(d), v : τ) ∈ R; we conclude since i is iso. □

⁵ See [45, 47] for further discussion of such problems.

F_unit(R⁻, R⁺) def= {(d, () : unit) | d ∈ 1⊥}
F_name(R⁻, R⁺) def= {(⊥, a : name) | a ∈ 𝔸} ∪ {(a, a : name) | a ∈ 𝔸}
F_δ(R⁻, R⁺) def= R⁺
F_<<name>>τ(R⁻, R⁺) def= {([a]d, <<a′>>v : <<name>>τ) | ∃a″ ∈ 𝔸 \ supp(R⁻, R⁺, a, a′, d, v). ((a a″) · d, (a′ a″) · v : τ) ∈ F_τ(R⁻, R⁺)}
F_τ×τ′(R⁻, R⁺) def= {(⟨d, d′⟩, (v, v′) : τ × τ′) | (d, v : τ) ∈ F_τ(R⁻, R⁺) ∧ (d′, v′ : τ′) ∈ F_τ′(R⁻, R⁺)}
F_τ→τ′(R⁻, R⁺) def= {(d ∈ F^und_τ(R⁺, R⁻) ⊸ (F^und_τ′(R⁻, R⁺))^⊥⊥, fun f(x) = e : τ → τ′) | ∀(d′, v′) ∈ F_τ(R⁺, R⁻), σ ∈ F^und_τ′(R⁻, R⁺) ⊸ 1⊥, S : τ′ ⊸. (∀(d″, v″) ∈ F_τ′(R⁻, R⁺). σ(d″) = ⊤ ⇒ ⟨S, v″⟩↓) ⇒ (d(d′)(σ) = ⊤ ⇒ ⟨S, (fun f(x) = e) v′⟩↓)}

F(R⁻, R⁺) def= {(in_k(dk), Ck(vk) : δ) | 1 ≤ k ≤ K ∧ (dk, vk : σk) ∈ F_σk(R⁻, R⁺)}.

Figure 5.4: Actions on R-relations for F and the Fτ .

Definition 5.4.4 (Admissible relations). Call a relation R ∈ R(D) admissible iff for each closed canonical form v of type τ, the subset of D given by {d ∈ D | (d, v : τ) ∈ R} contains ⊥ and is closed under least upper bounds of countable, finitely-supported chains in D. 

Lemma 5.4.5. The FM-set of admissible relations R ∈ R(D), with permutation action inherited from R(D), forms an FM-complete lattice with greatest lower bounds being given by intersections.

Proof. Follows from the Intersections property above. 

In §5.2.1 we built up the FM-cppos [[τ]] using mixed-variance LFC-functors {F, F_τ} : FM-Cpo⊥^op × FM-Cpo⊥ −→ FM-Cpo⊥. We are now going to equip each such functor with an equivariant action on R-relations as shown in Figure 5.4. In that Figure, we use the notation F^und_τ(R⁻, R⁺) to stand for the FM-cppo F_τ(D⁻, D⁺), where D⁻ and D⁺ are the ‘underlying’ FM-cppos such that R⁻ ∈ R(D⁻) and R⁺ ∈ R(D⁺). The equivariant actions are maps on relations which are order-reversing in their first argument and order-preserving in their second argument, written as R⁻, R⁺ ↦ F(R⁻, R⁺). They build new relations out of old in a way analogous to the way that the functors F and the F_τ construct the FM-cppos used to denote types. We require the actions to satisfy the following properties, which we are now going to prove. For f⁻ : R₂⁻ ⊂ R₁⁻ and f⁺ : R₁⁺ ⊂ R₂⁺:
• if R⁺ is admissible then F(R⁻, R⁺) must be also;

• F(f⁻, f⁺) : F(R₁⁻, R₁⁺) ⊂ F(R₂⁻, R₂⁺); and ditto for the F_τ.

To proceed, we need the following technical lemma which establishes a now-familiar ‘some/any’ property.

Lemma 5.4.6. If ([a]d, <<a′>>v : <<name>>τ) ∈ F_<<name>>τ(R⁻, R⁺) then for all a″ ∈ 𝔸 \ supp(R⁻, R⁺, a, a′, d, v), ((a a″) · d, (a′ a″) · v : τ) ∈ F_τ(R⁻, R⁺).

Proof. Given ([a]d, <<a′>>v : <<name>>τ) ∈ F_<<name>>τ(R⁻, R⁺), there exists some a″ ∈ 𝔸 \ supp(R⁻, R⁺, a, a′, d, v) such that ((a a″) · d, (a′ a″) · v : τ) ∈ F_τ(R⁻, R⁺). Given any other a‴ ∈ 𝔸 \ supp(R⁻, R⁺, a, a′, d, v), we have ((a″ a‴) · (a a″) · d, (a″ a‴) · (a′ a″) · v : τ) ∈ (a″ a‴) · F_τ(R⁻, R⁺). Since a″, a‴ ∉ supp(R⁻, R⁺), this and the equivariance of the F_τ implies ((a″ a‴) · (a a″) · d, (a″ a‴) · (a′ a″) · v : τ) ∈ F_τ(R⁻, R⁺). Therefore ((a a‴) · d, (a′ a‴) · v : τ) ∈ F_τ(R⁻, R⁺) as required. □

Lemma 5.4.7 (Actions on relations preserve admissibility). For each type τ and R-relations R⁻, R⁺ ∈ R(D), if R⁺ is admissible then F_τ(R⁻, R⁺) ∈ R(F_τ(D, D)) is also (and respectively for the functor F).

Proof. The case for the functor F is trivial by virtue of the continuity of the injection functions in_k into the smash sum. For the other cases, we proceed by induction on the structure of τ. All of these except the following two are straightforward and we omit them.
▶ Case (abst). We need to show that when R⁺ and F_τ(R⁻, R⁺) satisfy the admissibility property then so does F_<<name>>τ(R⁻, R⁺), i.e. that the FM-set

{[a]d | ([a]d, <<a′>>v′ : <<name>>τ) ∈ F_<<name>>τ(R⁻, R⁺)}

partially ordered as for F_<<name>>τ(D, D) contains [a]⊥ (for any a ∈ 𝔸) and has least upper bounds of finitely-supported chains ([an]dn | n < ω) ∈ F_<<name>>τ(D, D) for some fixed <<a′>>v′. It is easy to see that it contains the aforementioned least element. Moreover, Lemma 4.2.32 tells us that for any a ∉ supp([an]dn | n < ω) such a finitely-supported chain has a least upper bound [a]⨆_{n<ω}(([an]dn)@a), which is equal to [a]⨆_{n<ω}(a an) · dn; we can calculate that this lies in the FM-set {d | (d, v : τ) ∈ F_τ(R⁻, R⁺)} by virtue of the induction hypothesis.
▶ Case (fn). This case is straightforward, save for the crucial observation that for a chain ((dn, fun f(x) = e : τ → τ′) | n < ω) ∈ F_τ→τ′(R⁻, R⁺) and d′, σ as in Figure 5.4, the fact that (⨆_{n<ω} dn)(d′)(σ) = ⨆_{n<ω}(dn(d′)(σ)) = ⊤ implies there exists some n such that dn(d′)(σ) = ⊤. □

Lemma 5.4.8 (Actions on relations preserve ⊂). For each Mini-FreshML type τ, R-relations R₁⁻ ∈ R(D₁⁻), R₂⁻ ∈ R(D₂⁻), R₁⁺ ∈ R(D₁⁺), R₂⁺ ∈ R(D₂⁺) and elements f⁻ ∈ D₂⁻ ⊸ D₁⁻, f⁺ ∈ D₁⁺ ⊸ D₂⁺ such that f⁻ : R₂⁻ ⊂ R₁⁻ and f⁺ : R₁⁺ ⊂ R₂⁺, we have

(d, v : τ) ∈ F_τ(R₁⁻, R₁⁺) ⇒ (F_τ(f⁻, f⁺)(d), v : τ) ∈ F_τ(R₂⁻, R₂⁺)

and similarly for the functor F.

Proof. For the first part we proceed by induction on the structure of τ. When considering elements (d, v : τ) of each F_τ(R₁⁻, R₁⁺) in turn, we omit the d = ⊥ cases since these go through immediately (by virtue of the admissibility of F_τ(R₂⁻, R₂⁺)).
▶ Cases (unit)–(data) are straightforward, by virtue of F_unit, F_name and F_δ being constant functors.
▶ Case (abst). Take ([a]d, <<a′>>v′ : <<name>>τ) ∈ F_<<name>>τ(R₁⁻, R₁⁺). For any a″ ∈ 𝔸 \ supp(R₁⁻, R₁⁺, R₂⁻, R₂⁺, a, d, a′, v′), Lemma 5.4.6 tells us that ((a a″) · d, (a′ a″) · v′ : τ) ∈ F_τ(R₁⁻, R₁⁺). The induction hypothesis then implies that (F_τ(f⁻, f⁺)((a a″) · d), (a′ a″) · v′ : τ) ∈ F_τ(R₂⁻, R₂⁺) and moreover, by equivariance of F_τ, ((a a″) · F_τ(f⁻, f⁺)(d), (a′ a″) · v′ : τ) ∈ F_τ(R₂⁻, R₂⁺). Since F_τ(f⁻, f⁺) has empty support, this implies ([a](F_τ(f⁻, f⁺)(d)), <<a′>>v′ : <<name>>τ) ∈ F_<<name>>τ(R₂⁻, R₂⁺) and thus (F_<<name>>τ(f⁻, f⁺)([a]d), <<a′>>v′ : <<name>>τ) ∈ F_<<name>>τ(R₂⁻, R₂⁺).
▶ Case (pair). Consider an element (d, v : τ × τ′) of F_τ×τ′(R₁⁻, R₁⁺). By the induction hypothesis we have that for all (d1, v1 : τ) ∈ F_τ(R₁⁻, R₁⁺) and (d2, v2 : τ′) ∈ F_τ′(R₁⁻, R₁⁺),

(F_τ(f⁻, f⁺)(d1), v1 : τ) ∈ F_τ(R₂⁻, R₂⁺) and   (5.9)
(F_τ′(f⁻, f⁺)(d2), v2 : τ′) ∈ F_τ′(R₂⁻, R₂⁺).   (5.10)

Writing d as ⟨d1, d2⟩, v as (v1, v2) and observing the action of the functor F_τ×τ′ in Figure 5.4, we see that (d1, v1 : τ) ∈ F_τ(R₁⁻, R₁⁺) and (d2, v2 : τ′) ∈ F_τ′(R₁⁻, R₁⁺); thus we can apply (5.9) and (5.10) to deduce that (F_τ(f⁻, f⁺)(d1), v1) ∈ F_τ(R₂⁻, R₂⁺) ∧ (F_τ′(f⁻, f⁺)(d2), v2) ∈ F_τ′(R₂⁻, R₂⁺). We then conclude by observing the action of F_τ×τ′ given in Figure 5.4.
▶ Case (fn). Consider an element (d, fun f(x) = e) of F_τ→τ′(R₁⁻, R₁⁺). Thus we have that for all (d₁^τ, v₁^τ) ∈ F_τ(R₁⁺, R₁⁻), σ ∈ F_τ′(D₁⁻, D₁⁺) ⊸ 1⊥ and S : τ′ ⊸,

(∀(d₁^τ′, v₁^τ′) ∈ F_τ′(R₁⁻, R₁⁺). σ(d₁^τ′) = ⊤ ⇒ ⟨S, v₁^τ′⟩↓) ⇒ (d(d₁^τ)(σ) = ⊤ ⇒ ⟨S, (fun f(x) = e) v₁^τ⟩↓).   (5.11)

The actions of the F_τ and F_τ′ provide maps F_τ(f⁺, f⁻) ∈ F_τ(D₂⁺, D₂⁻) ⊸ F_τ(D₁⁺, D₁⁻) and F_τ′(f⁻, f⁺) ∈ F_τ′(D₁⁻, D₁⁺) ⊸ F_τ′(D₂⁻, D₂⁺). Thus by the induction hypothesis we have that

(d₂^τ, v₂^τ : τ) ∈ F_τ(R₂⁺, R₂⁻) ⇒ (F_τ(f⁺, f⁻)(d₂^τ), v₂^τ : τ) ∈ F_τ(R₁⁺, R₁⁻),   (5.12)

(d₁^τ′, v₁^τ′ : τ′) ∈ F_τ′(R₁⁻, R₁⁺) ⇒ (F_τ′(f⁻, f⁺)(d₁^τ′), v₁^τ′ : τ′) ∈ F_τ′(R₂⁻, R₂⁺).   (5.13)

We need, for all such (d₂^τ, v₂^τ), σ′ ∈ F_τ′(D₂⁻, D₂⁺) ⊸ 1⊥ and S : τ′ ⊸,

F_τ→τ′(f⁻, f⁺)(d)(d₂^τ)(σ′) = ⊤ ⇒ ⟨S, (fun f(x) = e) v₂^τ⟩↓   (5.14)

under the assumption that:

∀(d₂^τ′, v₂^τ′) ∈ F_τ′(R₂⁻, R₂⁺). σ′(d₂^τ′) = ⊤ ⇒ ⟨S, v₂^τ′⟩↓ .   (5.15)

We can unpack the functorial action⁶ in (5.14) to see that we must prove

d(F_τ(f⁺, f⁻)(d₂^τ))(σ′ ∘ F_τ′(f⁻, f⁺)) = ⊤ ⇒ ⟨S, (fun f(x) = e) v₂^τ⟩↓ .   (5.16)

Given any (d₁^τ′, v₁^τ′) ∈ F_τ′(R₁⁻, R₁⁺) we can combine (5.13) with (5.15) to deduce (σ′ ∘ F_τ′(f⁻, f⁺))(d₁^τ′) = ⊤ ⇒ ⟨S, v₁^τ′⟩↓. We do a sanity check that σ′ ∘ F_τ′(f⁻, f⁺) does indeed map from F_τ′(D₁⁻, D₁⁺) to 1⊥. Then from (5.11) we have that for all (d₁^τ, v₁^τ) ∈ F_τ(R₁⁺, R₁⁻),

d(d₁^τ)(σ′ ∘ F_τ′(f⁻, f⁺)) = ⊤ ⇒ ⟨S, (fun f(x) = e) v₁^τ⟩↓ .   (5.17)

Now, (5.12) tells us that (F_τ(f⁺, f⁻)(d₂^τ), v₂^τ) lies in F_τ(R₁⁺, R₁⁻). Using this pair as the (d₁^τ, v₁^τ) in (5.17) then yields (5.16) as required.
Considering now the functor F, take an element (d, Ck(vk) : δ) ∈ F(R₁⁻, R₁⁺) such that 1 ≤ k ≤ K. We may write any such d as in_k(dk). Then we have that (dk, vk : σk) ∈ F_σk(R₁⁻, R₁⁺). We need to show that (F(f⁻, f⁺)(in_k(dk)), Ck(vk) : δ) ∈ F(R₂⁻, R₂⁺). Using the earlier parts of this lemma and the assumption (dk, vk : σk) ∈ F_σk(R₁⁻, R₁⁺) we now deduce

(F_σk(f⁻, f⁺)(dk), vk : σk) ∈ F_σk(R₂⁻, R₂⁺).

Thus (in_k(F_σk(f⁻, f⁺)(dk)), Ck(vk) : δ) lies in F_δ(R₂⁻, R₂⁺). But by the action of the functor F this is just (F(f⁻, f⁺)(in_k(dk)), Ck(vk) : δ). □

What we are ultimately interested in showing is that a certain invariant relation exists, as stated in the following Lemma, whose proof we defer for just a moment.

Lemma 5.4.9 (Existence of invariant relations). There exists an invariant relation ∆ satisfying the following properties:

i : F(∆, ∆) ⊂ ∆   and   i⁻¹ : ∆ ⊂ F(∆, ∆),

where i is the isomorphism part of the minimal invariant solution to the recursive equation on FM-cppos given in §5.2.1. □

As we shall see, the very fact that the solution to the recursive equation on FM-cppos satisfies the minimal invariant property is crucial to deducing the existence of ∆. Now it turns out that the relation ∆ is in fact the crux of the matter, because we will define

C^val_τ def= F_τ(∆, ∆)

for each type τ and the game is over. Let us now give the proof.

Proof. Observe that

i⁻¹ : ∆ ⊂ F(∆, ∆)
⇔ i⁻¹ ∘ id_D : ∆ ⊂ F(∆, ∆)
⇔ id_D : ∆ ⊂ (i⁻¹)*F(∆, ∆)   by inverse images
⇔ ∆ ⊆ (i⁻¹)*F(∆, ∆)   by definition of – ⊂ –

and moreover

i : F(∆, ∆) ⊂ ∆
⇔ i ∘ id_F(D,D) : F(∆, ∆) ⊂ ∆

⁶ See Figure 5.1 for the definition.

⇔ id_F(D,D) : F(∆, ∆) ⊂ i*∆   by inverse images
⇔ id_D : (i⁻¹)*F(∆, ∆) ⊂ ∆   by Lemma 5.4.3
⇔ (i⁻¹)*F(∆, ∆) ⊆ ∆   by definition of – ⊂ –.

Thus in order to deduce the existence of the invariant relation ∆ it suffices to obtain a fixed point solution to the equation ∆ = Φ(∆), where Φ(R) def= (i⁻¹)*F(R, R). We can see that Φ is an equivariant operator on the FM-complete lattice L of admissible relations R ∈ R(D). But whilst Φ is equivariant, it is not necessarily monotonic due to the possible mixed variance of F. Thus, we define a second operator

Ψ(R⁻, R⁺) def= (i⁻¹)*F(R⁻, R⁺),

which we claim is a monotone, equivariant function from the FM-complete lattice L^op × L to L. Let us check its monotonicity, for the equivariance is easy to see. Given admissible relations R₁⁻, R₂⁻, R₁⁺, R₂⁺ in L such that id_D : R₂⁻ ⊂ R₁⁻ and id_D : R₁⁺ ⊂ R₂⁺, then we have that

F(id_D, id_D) : F(R₁⁻, R₁⁺) ⊂ F(R₂⁻, R₂⁺)

by virtue of the admissible actions of the functor F. Since F(id_D, id_D) = id_F(D,D), this yields that F(R₁⁻, R₁⁺) ⊆ F(R₂⁻, R₂⁺); the monotonicity of Ψ then arises since the inverse image operator (i⁻¹)* is monotone. It follows that the operator

Ψ′(R⁻, R⁺) def= (Ψ(R⁺, R⁻), Ψ(R⁻, R⁺))

is a monotone, equivariant function from the FM-complete lattice L^op × L to itself. This now ‘type-checks’ correctly in order that we can appeal to Tarski (Lemma 4.3.4) to obtain the least fixed point of Ψ′. Let us denote this by (∆⁻, ∆⁺). Since Ψ′ is an equivariant function, we can apply Lemma 4.5 to deduce that ∆⁻ and ∆⁺ possess empty support. Observe further that if we can prove that ∆⁻ = ∆⁺ = ∆ then

(∆, ∆) = Ψ′(∆, ∆) = (Ψ(∆, ∆), Ψ(∆, ∆)) = (Φ(∆), Φ(∆))

and thus ∆ = Φ(∆) as we desire. Given that (∆⁻, ∆⁺) is a fixed point of Ψ′, we calculate that

∆⁻ = Ψ(∆⁺, ∆⁻)   and   ∆⁺ = Ψ(∆⁻, ∆⁺);   (5.18)

moreover, since it is the least fixed point we also have that for any admissible relations (R⁻, R⁺) ∈ L^op × L,

(id_D : R⁻ ⊂ Ψ(R⁺, R⁻) ∧ id_D : Ψ(R⁻, R⁺) ⊂ R⁺) ⇒ (id_D : R⁻ ⊂ ∆⁻ ∧ id_D : ∆⁺ ⊂ R⁺).   (5.19)

To prove that ∆⁻ = ∆⁺ we just need to show that id_D : ∆⁺ ⊂ ∆⁻ and id_D : ∆⁻ ⊂ ∆⁺. We have the first of these straight away, for setting R⁻ = ∆⁺ and R⁺ = ∆⁻ in (5.19) yields

(id_D : ∆⁺ ⊂ Ψ(∆⁻, ∆⁺) ∧ id_D : Ψ(∆⁺, ∆⁻) ⊂ ∆⁻) ⇒ id_D : ∆⁺ ⊂ ∆⁻,

the left-hand side of which holds by virtue of (5.18).
The reverse inclusion (id_D : ∆⁻ ⊂ ∆⁺) is the more interesting case, as its proof is critically dependent on the minimal invariance property of the FM-cppo D, namely that the identity on D is the least fixed point of the function φ(f) def= i ∘ F(f, f) ∘ i⁻¹. For consider the FM-cppo of functions

S def= {f ∈ D ⊸ D | f : ∆⁻ ⊂ ∆⁺},

which contains the least element in D ⊸ D (by admissibility of ∆⁺) and has least upper bounds for all finitely supported chains. Observe that it suffices to prove that f ∈ S ⇒ φ(f) ∈ S, for then we can appeal to Scott induction to get id_D = fix(φ) ∈ S, which yields id_D : ∆⁻ ⊂ ∆⁺ as desired. Now given such an f we can calculate that F(f, f) : F(∆⁺, ∆⁻) ⊂ F(∆⁻, ∆⁺). Secondly, by virtue of inverse images and Lemma 5.4.3 once again, it is easy to see that the equations (5.18) imply i⁻¹ : ∆⁻ ⊂ F(∆⁺, ∆⁻) and i : F(∆⁻, ∆⁺) ⊂ ∆⁺. Thus by transitivity of – ⊂ –, i ∘ F(f, f) ∘ i⁻¹ : ∆⁻ ⊂ ∆⁺, showing that φ(f) does indeed lie in S. Therefore ∆⁻ = ∆⁺ = ∆, meaning that i : F(∆, ∆) ⊂ ∆ and i⁻¹ : ∆ ⊂ F(∆, ∆). □

Since C^stk_τ and C^exp_τ are defined in terms of C^val_τ, the following theorem suffices (at last!) to establish the existence of the logical relations.

Theorem 5.4.10 (Existence of each C^val_τ relation). Each logical relation for values may be constructed as C^val_τ def= F_τ(∆, ∆).

Proof. Since ∆ ∈ R(D), F_τ(∆, ∆) lies in R(F_τ(D, D)), which is in turn equal to R([[τ]]). Thus C^val_τ = F_τ(∆, ∆) ⊆ [[τ]] × Val_τ. Moreover, since ∆ and F_τ both have empty support, we indeed have that C^val_τ ⊆eq [[τ]] × Val_τ. By virtue of the admissibility properties of ∆, for each typeable canonical form v of type τ we have that {d | d C^val_τ v} contains ⊥ and is closed under least upper bounds of finitely-supported chains. Properties (5.1), (5.2), (5.4), (5.5) and (5.6) follow immediately from the definitions of F_unit, F_name, F_<<name>>τ, F_τ×τ′ and F_τ→τ′ respectively, together with Lemma 5.4.6. Property (5.3) follows by first observing that C^val_δ = F_δ(∆, ∆) = ∆; then, if (d, v) ∈ ∆ we have (i⁻¹(d), v) ∈ F(∆, ∆) (since i⁻¹ : ∆ ⊂ F(∆, ∆)). Therefore v is of the form Ck(vk) and there exists dk with i⁻¹(d) = in_k(dk). It follows that d = (i ∘ in_k)(dk) and (dk, vk) ∈ F_σk(∆, ∆) as required. □

5.4.3 Properties of the logical relations

Theorem 5.4.11 (Fundamental theorem of the logical relations). For typing contexts Γ, values v, frame stacks S and expressions e,

Γ ⊢ v : τ ⇒ ∀ρ C_Γ ψ. V[[Γ ⊢ v : τ]](ρ) C^val_τ v[ψ];
Γ ⊢s S : τ ⊸ ⇒ ∀ρ C_Γ ψ. S[[Γ ⊢s S : τ ⊸]](ρ) C^stk_τ S[ψ];
Γ ⊢ e : τ ⇒ ∀ρ C_Γ ψ. E[[Γ ⊢ e : τ]](ρ) C^exp_τ e[ψ],

where ρ C_Γ ψ holds iff the domains of Γ, ρ and ψ are equal, ⊢ ψ : Γ and for each x ∈ dom(ρ), ρ(x) C^val_Γ(x) ψ(x). We write ∀ρ C_Γ ψ to indicate universal quantification over all such (ρ ∈ [[Γ]], ψ ∈ Subst_Γ)-pairs for some particular Γ.

Proof. It is straightforward to see that the theorem holds when ρ = ⊥. In the other cases, we proceed by induction over the axioms and rules defining ⊢s and ⊢. There are many cases to consider, the vast majority of which follow the same pattern. We now provide a good selection of them. The first part of the proof concerns the cases for values.
▶ Case (val-vid). Follows by virtue of the assumption that ρ C_Γ ψ.
▶ Cases (val-unit) and (val-atom) follow immediately from properties (5.1) and (5.2) respectively.
▶ Case (val-con). Given a judgement Γ ⊢ Ck(vk) : δ, which must have been derived by knowing Γ ⊢ vk : σk, we require (i ∘ in_k)(V[[Γ ⊢ vk : σk]](ρ)) C^val_δ (Ck(vk))[ψ]. That is to say, ∃dk ∈ [[σk]]. (i ∘ in_k)(V[[Γ ⊢ vk : σk]](ρ)) = (i ∘ in_k)(dk) ∧ dk C^val_σk vk[ψ], and we can see immediately that this follows from the induction hypothesis.
▶ Case (val-pair). Straightforward to deduce from the induction hypothesis and the property (5.5) of C^val_τ×τ′.
▶ Case (val-abst). Given a valid typing judgement Γ ⊢ <<a>>v : <<name>>τ, which must have been derived by knowing Γ ⊢ a : name and Γ ⊢ v : τ, the induction hypothesis tells us that V[[Γ ⊢ v : τ]](ρ) C^val_τ v[ψ]. Thus by the equivariance of C^val_τ we obtain that for any a′ ∉ supp(V[[Γ ⊢ v : τ]](ρ), v[ψ]),

(a a′) · V[[Γ ⊢ v : τ]](ρ) C^val_τ (a a′) · (v[ψ]).

By property (5.4) we thus obtain [a]V[[Γ ⊢ v : τ]](ρ) C^val_<<name>>τ <<a>>(v[ψ]). It follows that V[[Γ ⊢ <<a>>v : <<name>>τ]](ρ) C^val_<<name>>τ (<<a>>v)[ψ] as required.
▶ Case (val-fn). Under the assumption that Γ ⊢ fun f(x) = e : τ → τ′, which must have been derived by knowing Γ, f : τ → τ′, x : τ ⊢ e : τ′, it is necessary to show that

V[[Γ ⊢ fun f(x) = e : τ → τ′]](ρ) C^val_τ→τ′ (fun f(x) = e)[ψ].

By property (5.6) it therefore suffices to show that Φ(fix(g)) holds, where

Φ(F) def⇔ ∀d C^val_τ v. F(d) C^exp_τ′ (fun f(x) = e[ψ])(v)

and g def= λf′ ∈ [[τ → τ′]]. λx′ ∈ [[τ]]. E[[Γ, f ↦ τ → τ′, x ↦ τ ⊢ e : τ′]](ρ[f ↦ f′, x ↦ x′]). Since Φ(F) holds just in case F C^val_τ→τ′ fun f(x) = e[ψ], the set {F | Φ(F)} is an admissible subset of [[τ → τ′]] (by virtue of the admissibility property of C^val_τ→τ′). Thus by Scott induction we can conclude Φ(fix(g)) by proving that for any F ∈ [[τ → τ′]], Φ(F) ⇒ Φ(g(F)). Write Γ′ for Γ, f ↦ τ → τ′, x ↦ τ and take any d C^val_τ v. By assumption we have that F C^val_τ→τ′ fun f(x) = e[ψ], so ρ[f ↦ F, x ↦ d] C_Γ′ ψ[fun f(x) = e[ψ]/f, v/x]. Since the induction hypothesis of the fundamental theorem tells us that for all σ′ C^stk_τ′ S′ and ρ′ C_Γ′ ψ′, E[[Γ′ ⊢ e : τ′]](ρ′)(σ′) = ⊤ ⇒ ⟨S′, e[ψ′]⟩↓, we know that for all σ′ C^stk_τ′ S′, E[[Γ′ ⊢ e : τ′]](ρ[f ↦ F, x ↦ d])(σ′) = ⊤ ⇒ ⟨S′, e[ψ, fun f(x) = e[ψ]/f, v/x]⟩↓. But combining this with the definition of the termination relation implies that Φ(g(F)) holds.
The next part of the proof concerns the cases for frame stacks.
▶ Case (stk-abst-left). Given a valid typing judgement Γ ⊢s S ◦ <<[–]>>e : name ⊸, which must have been derived from Γ ⊢s S : <<name>>τ ⊸ and Γ ⊢ e : τ, we have for ρ C_Γ ψ that

stk S[[Γ `s S : <>τ ( ]](ρ) C<>τ S[ψ]; and (5.20) exp E[[Γ ` e : τ]](ρ) Cτ e[ψ] (5.21) by the induction hypotheses. We can calculate that we must prove hS[ψ] ◦ <<[–]>>e[ψ], ai↓, and therefore it suffices to show hS[ψ] ◦ <>[–], e[ψ]i↓ (5.22) under the assumption that

S[[Γ `s S ◦ <<[–]>>e : name ( ]](ρ)(a) = > (5.23)

for any a ∈ * . Expanding (5.20)–(5.23) and making use of the definition of the termination 0 0 val 00 0 0 stk 0 relation we obtain that for any [a ]d C<>τ <>v and any σ Cτ S , 0 0 00 0 S[[Γ `s S : <>τ ( ]](ρ)([a ]d ) = > ⇒ hS[ψ] ◦ <>[–], v i↓; (5.24) E[[Γ ` e : τ]](ρ)(σ0) = > ⇒ hS0, e[ψ]i↓; (5.25) E[[Γ ` e : τ]](ρ)(λd ∈ [[τ]]. S[[Γ `s S : <>τ ( ]](ρ)([a]d)) = >. (5.26)

Now, observe that it is sufficient to set

0 σ = λd ∈ [[τ]]. S[[Γ `s S : <>τ ( ]](ρ)([a]d) and S0 = S[ψ] ◦ <>[–]. For then, we can combine (5.25) and (5.26) to conclude (5.22) so 0 stk 0 val 0 0 val long as σ Cτ S . To see this, recall that given any d Cτ v and any a ∈ * then (a a ) · d Cτ 0 val (a a ) · v. Therefore [a]d C<>τ <>v which enables us to conclude by using (5.24). I Case (stk-abst-right). Given a valid typing judgement Γ`s S◦<>[–] : τ ( , which must have been derived by knowing Γ `s S : <>τ and Γ `s S ◦ <>[–] : τ ( , consider some ρ CΓ ψ. Then we have by the induction hypothesis and the remaining assumption that 0 0 val 00 0 00 val 00 for [a ]d C<>τ <>v and d Cτ v , 0 0 00 0 S[[Γ `s S : <>τ]](ρ)([a ]d ) = > ⇒ hS[ψ], <>v i↓; (5.27) 00 S[[Γ `s S ◦ <>[–] : τ ( ]](ρ)(d ) = >. (5.28)

We need to show that hS[ψ] ◦ <>[–], v00i↓ and so it suffices to prove:

hS[ψ], <>v00i↓ . (5.29)

Now, by definition of S[[–]] then (5.28) is equivalent to

00 S[[Γ `s S : <>τ]](ρ)([a]d ) = >. (5.30)

00 val 00 00 val 00 Since d Cτ v then [a]d C<>τ <>v (by a similar argument to that in the previous case). Thus we have (5.29) by virtue of (5.27). 0 00 I Case (stk-swap-1). Given a judgement Γ `s S ◦ swap [–], e in e : name ( , which 0 00 must have been derived by knowing that Γ`s S : τ ( , Γ`e : name and Γ`e : τ, consider some ρ CΓ ψ. Then we have by the induction hypothesis that

val ∀d Cτ v, S[[Γ `s S : τ ( ]](ρ)(d) = > ⇒ hS[ψ], vi↓; (5.31) 70 5. Computational adequacy

0 stk 0 0 0 0 0 ∀σ Cname S , E[[Γ ` e : name]](ρ)(σ ) = > ⇒ hS , e [ψ]i↓; (5.32) 00 stk 00 00 00 00 00 ∀σ Cτ S , E[[Γ ` e : τ]](ρ)(σ ) = > ⇒ hS , e [ψ]i↓ . (5.33) By assumption we also have

0 00 S[[Γ `s S ◦ swap [–], e in e : name ( ]](ρ)(a) = > (5.34) and we wish to show that h(S ◦ swap [–], e0 in e00)[ψ], ai↓. It therefore suffices to prove

hS[ψ] ◦ swap a, [–] in e00[ψ], e0[ψ]i↓ (5.35) by virtue of the definition of the termination relation. Expanding (5.34) we have that

E[[Γ ` e0 : name]](ρ)(σ0) = > (5.36)

0 def 0 00 0 where σ = λa ∈ [[name]]. E[[Γ ` e : τ]](ρ)(λd ∈ [[τ]]. S[[Γ `s S : τ ( ]](ρ)((a a ) · d)). We now see that proving 0 stk 00 σ Cname S[ψ] ◦ swap a, [–] in e [ψ] (5.37) val will allow us to conclude (5.35) by virtue of (5.32). So we take any d Cname v and prove that σ0(d) = > implies hS[ψ] ◦ swap a, [–] in e00[ψ], vi↓. If d = ⊥ then this is trivial. Otherwise, write a0 for d (which must equal v by property (5.2)) and assume that

00 0 E[[Γ ` e : τ]](ρ)(λd ∈ [[τ]]. S[[Γ `s S : τ ( ]](ρ)((a a ) · d)) = >.

We need to show that hS[ψ] ◦ swap a, [–] in e00[ψ], a0i↓ holds and thus it suffices to prove that hS[ψ] ◦ swap a, a0 in [–], e00[ψ]i↓ . (5.38) Now, observe that this holds from (5.33) if

0 stk 0 λd ∈ [[τ]]. S[[Γ `s S : τ ( ]](ρ)((a a ) · d) Cτ S[ψ] ◦ swap a, a in [–]. (5.39)

val Thus we require that for any d Cτ v then

0 0 S[[Γ `s S : τ ( ]](ρ)((a a ) · d) = > ⇒ hS[ψ] ◦ swap a, a in [–], vi↓ .

Therefore using the definition of the termination relation it suffices to show that

0 0 S[[Γ `s S : τ ( ]](ρ)((a a ) · d) = > ⇒ hS[ψ], (a a ) · vi↓

val and this follows from (5.31) by equivariance of Cτ . Thus we have (5.39) which implies (5.38), which in turn implies (5.37). Then we have (5.35) as required. 00 I Case (stk-swap-2). Given a typing judgement Γ `s swap a, [–] in e : name ( , which 00 must have been derived from Γ `s S : τ ( and Γ ` e : τ, consider some ρ CΓ ψ. We have by the induction hypothesis that

val ∀d Cτ v. S[[Γ `s S : τ ( ]](ρ)(d) = > ⇒ hS[ψ], vi↓; (5.40) 0 stk 0 00 0 0 00 ∀σ Cτ S . E[[Γ ` e : τ]](ρ)(σ ) = > ⇒ hS , e [ψ]i↓ . (5.41)

0 We also have by assumption that for any a ∈ + ⊥,

00 0 S[[Γ `s S ◦ swap a, [–] in e : name ( ]](ρ)(a ) = > (5.42)

(which implies that a0 6= ⊥) and we want hS[ψ] ◦ swap a, [–] in e00, a0i↓. It therefore suffices to show that hS[ψ] ◦ swap a, a0 in [–], e00[ψ]i↓ . (5.43) Expanding (5.42) yields that

00 0 E[[Γ ` e : τ]](ρ)(λd ∈ [[τ]]. S[[Γ `s S : τ ( ]](ρ)((a a ) · d)) = >. (5.44)

We can thus obtain (5.43) from (5.41) if we can show that

0 stk 0 λd ∈ [[τ]]. S[[Γ `s S : τ ( ]](ρ)((a a ) · d) Cτ S[ψ] ◦ swap a, a in [–]. (5.45) 5.4. Computational adequacy 71

val val So take any d Cτ v. From (5.40) and equivariance of Cτ we know that 0 0 S[[Γ `s S : τ ( ]](ρ)((a a ) · d) = > ⇒ hS[ψ], (a a ) · vi↓

stk But we can see from the definitions of Cτ and the termination relation that this is just (5.45). Thus we have (5.43) and can conclude. 0 I Case (stk-swap-3). Given a typing judgement Γ `s S ◦ swap a, a in [–] : τ ( , which 0 must have been derived by knowing Γ ` a : name, Γ ` a : name and Γ `s S : τ ( , consider ρ CΓ ψ. Then the only non-trivial statement provided by the induction hypothesis is that for val all d Cτ v, S[[Γ `s S : τ ( ]](ρ)(d) = > ⇒ hS[ψ], vi↓ . (5.46) We also have the assumption that for all such d and v,

0 S[[Γ `s S ◦ swap a, a in [–] : τ ( ]](ρ)(d) = > and expanding this yields that

0 S[[Γ `s S : τ ( ]](ρ)((a a ) · d) = >. (5.47) We need to show that hS[ψ]◦swap a, a0 in [–], vi↓ holds. It thus suffices to prove hS[ψ], (a a0)· vi↓, which does indeed follow by combining (5.46) with (5.47) and using the equivariance val property of Cτ . 0 I Case (stk-let-abst). Given a judgement Γ`s S ◦let <>x = [–] in e : <>τ ( , 0 0 which must have been derived by knowing Γ `s S : τ ( and Γ, x 7→ name, x 7→ τ ` e : 0 τ , consider ρ CΓ ψ. Then we can use the definitions of S[[–]] and the termination relation together with the equivariance property of the latter to calculate that we must prove for all val [a1]d C<>τ <>v and any a ∈ , \ supp(S, a1, d, a2, v, e, ρ) that 0 0 0 E[[Γ, x 7→ name, x 7→ τ ` e : τ ]](ρ[x 7→ a, x 7→ (a a1) · d]) 0 (S[[Γ `s S : τ ( ]](ρ)) = > ⇒ 0 hS[ψ], e[ψ[a/x, (a a2) · v/x ]]i↓ . (5.48)

val val Since [a1]d C<>τ <>v then by virtue of property (5.4) we know that (a a1) · d Cτ (a a2) · v. Thus for any ρ CΓ ψ then 0 0 ρ[x 7→ a, x 7→ (a a1) · d] CΓ,x7→name,x07→τ ψ[a/x, (a a2) · v/x ]. (5.49) The induction hypothesis gives us the following.

0 0 stk Γ `s S : τ ( ⇒ ∀ρ CΓ ψ. S[[Γ `s S : τ ( ]](ρ) Cτ S[ψ] (5.50) Γ, x 7→ name, x0 7→ τ ` e : τ 0 ⇒ 0 0 exp ∀ρ CΓ ψ. E[[Γ, x 7→ name, x 7→ τ ` e : τ ]](ρ) Cτ e[ψ] (5.51) Proceeding in a similar manner to previous cases, we can now use (5.49) to satisfy the universal quantification in (5.50) so that we can combine (5.48) and (5.51) to get the result. The final part of the proof concerns the cases for expressions which are not necessarily in canonical form. We give the three most interesting cases. I Case (exp-val). Given some canonical form v and a valid typing judgement Γ ` v : τ, we exp stk wish to know that for all ρ CΓ ψ, E[[Γ ` v : τ]](ρ) Cτ v[ψ]. That is to say, for all σ Cτ S, E[[Γ ` v : τ]](ρ)(σ) = > ⇒ hS, v[ψ]i↓ . Expanding the definition of E[[–]] this is equivalent to σ(V[[Γ ` v : τ]](ρ)) = > ⇒ hS, v[ψ]i↓ . (5.52)

stk 0 val 0 0 0 Since σ Cτ S then for all v Cτ d , σ(d ) = > ⇒ hS, v i↓ and so by the induction hypothesis we can set d0 = V[[Γ ` v : τ]](ρ) and v0 = v[ψ] to yield (5.52). exp I Case (exp-fresh). Given the typing judgement Γ ` fresh : name then we require new Cτ stk 0 fresh. That is to say for all σ Cname S, we need σ(a) = > ⇒ hS, a i↓ for a ∈ , \ supp(σ) and

0 0 , a ∈ , \ supp(S). Let us set a = a and take a ∈ \ supp(σ, S). Then σ(a) = > ⇒ hS, ai↓⇔ stk hS, freshi↓ since σ Cname S. I Case (exp-abst). Given a valid typing judgement Γ ` <>e0 : <>τ, which must 0 have been derived by knowing Γ ` e : name and Γ ` e : τ, consider ρ CΓ ψ. We calculate that we must prove

E[[Γ ` <>e0 : <>τ]](ρ)(σ) = > ⇒ hS ◦ <<[–]>>e0[ψ], e[ψ]i↓ 72 5. Computational adequacy

stk for all σ C<>τ S and all ρ CΓ ψ. Expanding the definition of E[[–]] this is: E[[Γ ` e : name]](ρ)(λa ∈ [[name]]. E[[Γ ` e0 : τ]](ρ)(λd ∈ [[τ]]. σ([a]d))) = > ⇒ hS ◦ <<[–]>>e0[ψ], e[ψ]i↓ . (5.53)

stk stk The induction hypothesis tells us that for all σ1 Cname S1 and all σ2 Cτ S2,

E[[Γ ` e : name]](ρ)(σ1) = > ⇒ hS1, e[ψ]i↓ ; and (5.54) 0 0 E[[Γ ` e : τ]](ρ)(σ2) = > ⇒ hS2, e [ψ]i↓ . (5.55)

Using (5.53) and (5.54) we see that it suffices to show

0 stk 0 λa ∈ [[name]]. E[[Γ ` e : τ]](ρ)(λd ∈ [[τ]]. σ([a]d)) Cτ S ◦ <<[–]>>e [ψ].

val Take some d Cname v. If d = ⊥ then the implication is trivial. Otherwise, write a for d (so a = v stk also). Then using the definitions of the termination relation and Cτ we deduce that we must prove E[[Γ ` e0 : τ]](ρ)(λd ∈ [[τ]]. σ([a]d)) = > ⇒ hS ◦ <>[–], e0[ψ]i↓

which can be concluded from (5.55) if λd ∈ [[τ]]. σ([a]d) C^stk_τ S ◦ <<a>>[–] holds. Taking some d C^val_τ v, we therefore need σ([a]d) = ⊤ ⇒ ⟨S, <<a>>v⟩↓. Let us take d ≠ ⊥, for if it is bottom then the implication is trivial. This holds since σ C^stk_<<name>>τ S and we can calculate that⁷ d C^val_τ v ⇒ [a]d C^val_<<name>>τ <<a>>v. □

Corollary 5.4.12. For typeable closed values v, frame stacks S and expressions e of type τ, τ ⊸ and τ respectively,

V[[v]] C^val_τ v,   S[[S]] C^stk_τ S   and   E[[e]] C^exp_τ e.

Proof. Immediate from Theorem 5.4.11. □

Lemma 5.4.13. For a closed value v of type δ and d ∈ [[δ]], return(d) C^exp_δ v implies that d = (i ∘ in_k)(dk) and v = Ck(vk) for some 1 ≤ k ≤ K, closed vk of type σk and dk ∈ [[σk]].

Proof. By virtue of i being an isomorphism, each d ∈ [[δ]] may be expressed as (i ∘ in_k)(dk). The typing rules of Mini-FreshML mean that each v of type δ must be of the form Ck′(vk′). It remains to show that k = k′. For each 1 ≤ k ≤ K, construct the frame stack Sk : δ ⊸ as follows:

Sk def= [] ◦ match [–] with · · · | Cj(xj) -> t(j, k, xj) | · · ·

where t(–, –, –) is defined like so, for some divergent term Ω:

t(j, k, xj) def= Cj(xj) if j = k;  Ω otherwise.

Corollary 5.4.12 tells us that S[[Sk]] C^stk_δ Sk. But since return(d) C^exp_δ v, we have

∀σ C^stk_δ S. σ((i ∘ in_k)(dk)) = ⊤ ⇒ ⟨S, Ck′(vk′)⟩↓

and therefore S[[Sk]]((i ∘ in_k)(dk)) = ⊤ ⇒ ⟨Sk, Ck′(vk′)⟩↓. The left-hand side of this may be seen to hold always by expanding the definition of S[[–]]; therefore ⟨Sk, Ck′(vk′)⟩↓ holds. But this can only hold if k = k′. □
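For intuition, the divergent term Ω and the discriminating frame stack Sk used in the proof above can be written down directly in OCaml. This is a loose illustration under hypothetical names, not Mini-FreshML itself.

```ocaml
(* Loose illustration of S_k: a match that returns its argument on the k-th
   constructor and diverges on every other one.  The datatype is hypothetical. *)
type delta = C1 of int | C2 of string

let rec omega () = omega ()   (* the divergent term Omega *)

(* "S_2" as an OCaml context: terminates only on C2 values *)
let s2 (d : delta) : delta =
  match d with
  | C1 _ -> omega ()
  | C2 s -> C2 s
```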

The following lemma is needed to derive some extensionality properties of Mini-FreshML which we shall come to in due course. It tells us about the relationship between C^exp_τ and C^val_τ when only canonical forms are involved.

Lemma 5.4.14 (Canonical forms in C^exp_τ). For a closed value v of type τ and d ∈ [[τ]], return(d) C^exp_τ v implies that d C^val_τ v.

⁷ In a similar manner to the (val-abst) case.

Proof. In the case where d = ⊥, the lemma is trivial. Otherwise, we assume d 6= ⊥ and proceed by induction on the structure of the value v. A tricky proof, whose multiple nested implications demand great care. val I Cases (unit)–(name) are trivial by virtue of simple properties of Cτ . exp I Case (data). Assume return(d) Cδ v. Lemma 5.4.13 then tells us that d must be of the form (i ◦ ink)(dk) and v must be of the form Ck(vk). Therefore

stk ∀σ Cδ S. σ((i ◦ ink)(dk)) = > ⇒ hS, Ck(vk)i↓ . (5.56) Cexp Cval The induction hypothesis tells us that return(dk) σk vk ⇒ dk σk vk. Since we wish to know val exp that d Cδ v then by expanding the definition of Cτ we can see that it suffices to show for all 0 Cstk 0 0 0 0 Cstk 0 0 σ σk S , σ (dk) = > ⇒ hS , vki↓. Taking any such σ σk S and assuming that σ (dk) = >, construct the frame stack S : δ ( and element σ ∈ [[δ]]⊥ as follows:

def 0 S = S ◦ match [–] with · · · | Ck(xk) -> xk | · · · def 0 −1 σ = λd ∈ [[δ]]. σ ((i ◦ ink) (d)).

stk Calculating that σ Cδ S, we can deduce from (5.56) that hS, Ck(vk)i↓. But this must have 0 been derived by knowing that hS , vki↓ as required. exp 0 0 I Case (abst). Assume return(d) C<>τ v. Since d 6= ⊥, it is of the form [a]d with d 6= ⊥ and v is of the form <>v0. Therefore,

stk 0 0 0 ∀σ C<>τ S. σ([a]d ) = > ⇒ hS, <>v i↓ . (5.57)

0 val 0 0 00 0 0 0 We need to show that [a]d C<>τ <>v . That is to say, for any a ∈/ supp(a, a , d , v ) 00 0 val 0 00 0 then (a a ) · d Cτ (a a ) · v . This can be done by applying the induction hypothesis in the 00 0 exp 0 00 0 00 0 val 0 00 0 form: return((a a ) · d ) Cτ (a a ) · v ⇒ (a a ) · d Cτ (a a ) · v (since swapping does not 0 exp affect the size of the value v ). Then by expanding the definition of Cτ we can see it suffices 0 stk 0 0 00 0 to show that the following implication holds: for any σ Cτ S , then σ ((a a ) · d ) = > ⇒ hS0, (a0 a00) · v0i↓. Assume σ0(d0) = > and construct the stack S : <>τ ( together with the element σ ∈ [[<>τ]]⊥ as follows8:

S def= S0 ◦ let <>x0 = [–] in x0 σ def= λ[a]d0 ∈ [[<>τ]]. σ0((a a00) · d0) ( a00 ∈/ supp(σ, a, a0, d0, v0)).

stk 0 00 0 0 One can calculate that σ C<>τ S. Since σ ((a a ) · d ) = >, then we have σ([a]d ) = σ0((a a00) · d0) = >. Therefore we can apply (5.57) to deduce that hS, <>v0i↓; that is to say, hS0 ◦let <>x0 = [–] in x0, <>v0i↓. This must have been deduced by knowing that hS0, (a0 a00) · v0i↓ (note that we can use the same a00 here by virtue of the conditions which we have placed upon it), which is just what we require. exp I Case (pair). Assume return(d) Cτ×τ 0 v. Since d 6= ⊥, it must be of the form hd1, d2i with d1, d2 6= ⊥ and v must be of the form (v1, v2); we therefore have

stk ∀σ Cτ×τ 0 S. σ(hd1, d2i) = > ⇒ hS, (v1, v2)i↓ . (5.58)

exp val exp The induction hypothesis tells us that return(d1) Cτ v1 ⇒ d1 Cτ v1 and return(d2) Cτ 0 val val val val v2 ⇒ d2 Cτ 0 v2. We need to show that d Cτ×τ 0 v; that is to say, d1 Cτ v1 and d2 Cτ 0 v2. exp stk stk The definition of Cτ implies that it suffices to show that for all σ1 Cτ S1 and σ2 Cτ 0 S2, σ1(d1) = > ⇒ hS1, v1i↓ and σ2(d2) = > ⇒ hS2, v2i↓. stk stk So assume that for σ1 Cτ S1 and σ2 Cτ 0 S2, σ1(d1) = > and σ2(d2) = >. Now construct the following stacks S, S0 : τ × τ 0 ( and elements σ, σ0 ∈ [[τ × τ]]⊥:

def 0 S = S1 ◦ let (x, x ) = [–] in x 0 def 0 0 S = S2 ◦ let (x, x ) = [–] in x def 0 σ = λhd1, d2i ∈ [[τ × τ ]]. d1

⁸ σ is well-defined for the same reasons as new (p. 56).

0 def 0 σ = λhd1, d2i ∈ [[τ × τ ]]. d2.

stk 0 stk 0 One can calculate from these definitions that σ Cτ×τ 0 S and σ Cτ×τ 0 S . Since d2 6= ⊥ then σhd1, d2i = σ1(d1) = >. It follows from (5.58) that hS, (v1, v2)i↓. But this must have been 0 0 derived by knowing that hS1, v1i↓. A similar argument applies with σ and S to deduce that hS2, v2i↓ as required. exp I Case (fun). Assume return(d) Cτ→τ 0 v. Then v must be of the form fun f(x) = e. We therefore have stk ∀σ Cτ→τ 0 S. σ(d) = > ⇒ hS, fun f(x) = ei↓ . (5.59) val 0 val 0 0 exp 0 We wish to know that d Cτ→τ 0 v; that is to say, for all d Cτ v then d(d ) Cτ 0 v v . This in 0 stk 0 0 0 0 0 0 val 0 turn holds just when for all σ Cτ 0 S , d(d )(σ ) = > ⇒ hS , v v i↓. Choosing any d Cτ v 0 stk 0 0 0 ⊥ and σ Cτ 0 S then define the stack S : (τ → τ ) ( and the element σ ∈ [[τ → τ ]] as follows:

S def= S0 ◦ [–] v0 σ def= λd ∈ [[τ → τ 0]]. d(d0)(σ0).

stk 0 0 We calculate that σ Cτ→τ 0 S. Since we assumed d(d )(σ ) = > then σ(d) = >; thus (5.59) gives hS0 ◦ [–] v0, fun f(x) = ei↓. This implies that hS0, v v0i↓ as required. 

Definition 5.4.15 (The relations ≤v and ≤e). The type-respecting relation ≤e, written Γ ⊢ e ≤e e′ : τ, holds iff the expressions e and e′ may both be assigned type τ in typing context Γ and for all ρ C_Γ ψ, E[[Γ ⊢ e : τ]](ρ) C^exp_τ e′[ψ]. We write e ≤e e′ iff e and e′ are closed expressions in the relation.
The type-respecting relation ≤v, written Γ ⊢ v ≤v v′ : τ, holds iff the expressions in canonical form v and v′ may both be assigned type τ in typing context Γ and for all ρ C_Γ ψ, V[[Γ ⊢ v : τ]](ρ) C^val_τ v′[ψ]. We write v ≤v v′ iff v and v′ are closed expressions in the relation. □

Lemma 5.4.16 (≤v contained within ≤e). For expressions in canonical form v and v′ which may be assigned type τ in typing context Γ,

Γ ⊢ v ≤v v′ : τ ⇒ Γ ⊢ v ≤e v′ : τ.

Proof. It suffices to show that for all ρ C_Γ ψ,

V[[Γ ⊢ v : τ]](ρ) C^val_τ v′[ψ] ⇒ λσ ∈ [[τ]]^⊥. σ(V[[Γ ⊢ v : τ]](ρ)) C^exp_τ v′[ψ].

By definition of C^exp_τ this is equivalent to asking that

V[[Γ ⊢ v : τ]](ρ) C^val_τ v′[ψ] ⇒ ∀σ C^stk_τ S. σ(V[[Γ ⊢ v : τ]](ρ)) = ⊤ ⇒ ⟨S, v′[ψ]⟩↓,

which holds by virtue of the definition of C^stk_τ. □

Lemma 5.4.17 (Compositionality). For typeable expressions e, e′ and contexts C such that C[e] and C[e′] are also typeable,

Γ ⊢ e ≤e e′ : τ ⇒ Γ′ ⊢ C[e] ≤e C[e′] : τ′.

Proof. The lemma is proved by induction on the structure of the context C. We give just a single case here to illustrate the logic behind the proof. 0 0 I Case (abst). For contexts C and C we wish to show that Γ ` e 6e e : τ ⇒ Γ ` 0 0 0 0 0 (<>C )[e] 6e (<>C )[e ] : <>τ . That is to say, Γ ` e 6e e : τ ⇒ Γ ` 0 0 0 0 0 <>C [e] 6e <>C [e ] : <>τ . As induction hypotheses we have

0 0 Γ ` e 6e e : τ ⇒ Γ ` C[e] 6e C[e ] : name; (5.60) 0 0 0 0 0 Γ ` e 6e e : τ ⇒ Γ ` C [e] 6e C [e ] : τ (5.61)

stk and as further assumptions we have that for all ρ CΓ ψ and σ C<>τ 0 S, 0 Γ ` e 6e e : τ; (5.62) E[[Γ ` <>C0[e] : <>τ 0]](ρ)(σ) = >. (5.63) 5.4. Computational adequacy 75

Expanding (5.63) then we obtain

E[[Γ ` C[e] : name]](ρ)(λa ∈ [[name]]. E[[Γ ` C0[e] : τ 0]](ρ)(λd ∈ [[τ 0]]. σ([a]d))) = >.

We need to show that hS, (<>C0[e0])[ψ]i ↓. It therefore suffices to show that hS ◦ <<[–]>>(C0[e0])[ψ], (C[e0])[ψ]i ↓. This follows from (5.60), (5.62) and 5.64 so long as λa ∈ 0 0 0 stk 0 0 [[name]]. E[[Γ ` C [e] : τ ]](ρ)(λd ∈ [[τ ]]. σ([a]d)) Cname S ◦ <<[–]>>C [e ][ψ]. To see this, we use stk the definition of Cname: take any a ∈ - (for the bottom case is trivial) and assume

E[[Γ ` C0[e] : τ 0]](ρ)(λd ∈ [[τ 0]]. σ([a]d)) = >. (5.64)

We need to prove that hS ◦ <<[–]>>C0[e0][ψ], ai ↓. It therefore suffices to show that hS ◦ <>[–], C0[e0][ψ]i↓ holds. We can prove this from (5.61), (5.62) and (5.64) by showing that 0 stk 0 0 val λd ∈ [[τ ]]. σ([a]d) Cτ S ◦<<[–]>>C [e ][ψ]. Taking any d Cτ 0 v and expanding the definition of stk Cτ , we have to show that hS ◦ <>[–], vi↓ holds given that σ([a]d) = >. It therefore suffices stk val to show that hS, <>vi↓, which holds since σ Cτ S and [a]d C<>τ 0 <>v. 

5.4.4 Completing the proof

We are now in a position to state the proof of the ‘computational adequacy’ result, which we recall was as follows: given a typing context Γ and a typeable expression e such that Γ ⊢ e : τ, then for all closed frame stacks S of type τ ⊸ and substitutions ψ ∈ Subst_Γ⁹

E[[Γ ⊢ e : τ]]V[[ψ]]S[[S]] = ⊤ ⇔ ⟨S, e[ψ]⟩↓ .   (5.65)

Thus if e is a closed expression we have that

E[[e]]S[[S]] = ⊤ ⇔ ⟨S, e⟩↓ .   (5.66)

Proof. We start by proving (5.66), for having established this we can deduce that for expressions e and substitutions ψ as above, E[[e[ψ]]]S[[S]] = ⊤ ⇔ ⟨S, e[ψ]⟩↓. Then we can apply Lemma 5.3.4 to obtain (5.65). Now the forwards direction of (5.66) follows immediately by combining Corollary 5.4.12 first with (5.7) and secondly with (5.8). For the reverse direction (the ‘soundness’ property of the denotational semantics) we show that the property

Φ(S, e) def= ⟨S, e⟩↓ ⇒ E[[e]]S[[S]] = ⊤

is closed under the axiom and rules defining the termination relation (Figure 3.7). The induction splits into two halves, paralleling the construction of the termination relation. For the first part, we take e to be a canonical form (which we call v for clarity) and consider the possible cases for S; here we give the interesting cases.
▶ Case (empty). In this case, ⟨[], v⟩↓ always holds; moreover, E[[v]]S[[[]]] is equal to S[[[]]]V[[v]], which in turn is always ⊤.
▶ Case (con). We wish to show Φ(S ◦ Ck([–]), v). By the induction hypothesis we know ⟨S, Ck(v)⟩↓ ⇒ E[[Ck(v)]]S[[S]] = ⊤ and by assumption we have that ⟨S ◦ Ck([–]), v⟩↓. Considering the derivation of this, we must have that E[[Ck(v)]]S[[S]] = ⊤ and therefore (by definition) S[[S]]((i ∘ in_k)(V[[v]])) = ⊤. However, this is just S[[S ◦ Ck([–])]](V[[v]]) by definition of S[[–]]; therefore E[[v]]S[[S ◦ Ck([–])]] = ⊤.
▶ Case (pair-l). We wish to show Φ(S ◦ ([–], e), v). The induction hypothesis tells us that ⟨S ◦ (v, [–]), e⟩↓ ⇒ E[[e]]S[[S ◦ (v, [–])]] = ⊤ and we have by assumption that ⟨S ◦ ([–], e), v⟩↓. This must have been derived by knowing that ⟨S ◦ (v, [–]), e⟩↓. We need that E[[v]]S[[S ◦ ([–], e)]] = ⊤, i.e. E[[e]](λd′ ∈ [[τ′]]. S[[S]]⟨V[[v]], d′⟩) = ⊤. But this is just E[[e]]S[[S ◦ (v, [–])]] and so we are done.
▶ Case (pair-r). We wish to show Φ(S ◦ (v, [–]), v′). By the induction hypothesis we have that ⟨S, (v, v′)⟩↓ ⇒ E[[(v, v′)]]S[[S]] = ⊤. The right-hand side of this holds by virtue of the assumption (in a similar manner to the previous case); expanding this yields that

⁹ Recall that ψ maps value identifiers to closed values; thus, e[ψ] must be a closed expression since e is typeable in context Γ.

S[[S]]hV[[v]], V[[v0]]i = >. We wish to show that V[[v0]]S[[S ◦ (v, [–])]] = >, i.e. S[[S]]hV[[v]], V[[v0]]i = >. But we already have this from above. I Cases (abst-l) and (abst-r) argue in the same way as for (pair-l) and (pair-r), mutatis mutandis. I Case (swap-1). We wish to show that Φ(S ◦ swap [–], e0 in e00, v) holds. Proceed- ing in a similar manner to the previous case, we obtain from the induction hypothesis that E[[e0]]S[[S ◦ swap v, [–] in e00]] = >. We need to show that E[[v]]S[[S ◦ swap [–], e0 in e00]] = >, the left-hand side of which is just E[[e0]]S[[S ◦ swap v, [–] in e00]]. This is equal to > from above. I Case (swap-2). Proceeds in a similar manner to the previous case; we omit the details. I Case (app-l). We wish to show that Φ(S ◦ [–] e0, v) holds; we thus require that under the assumption hS ◦ [–] e0, vi↓ then E[[v]]S[[S ◦ [–] e0]] = >. This is just E[[e0]]S[[S ◦ v [–]]] = > which again holds by virtue of the induction hypothesis. I Case (app-r). A more interesting case to break the monotony. We wish to show that Φ(S ◦ v [–], v0) holds. We assume that hS ◦v [–], v0i↓, where v is of the form fun f(x) = e and of type τ → τ 0. By considering how this must have been derived, the induction hypothesis yields that E[[e[v/f, v0/x]]]S[[S]] = >. We need to prove that E[[v0]]S[[S ◦ v [–]]] = >, i.e. (V[[v]] V[[v0]])S[[S]] = >. By expanding the definition of V[[–]] this is E[[{f : τ → τ 0, x : τ} ` e : τ 0]]{f 7→ V[[v]], x 7→ V[[v0]]}S[[S]] = >. Now we can apply the substitutivity property of the denotational semantics (Lemma 5.3.4) to deduce that this equivalent to E[[e[v/f, v0/x]]]S[[S]] = >, which holds from the induction hypothesis. I Case (let). We wish to prove that Φ(S ◦ let x = [–] in e0, v), where ` v : τ 0 and {x : τ 0} ` e0 : τ. We assume that hS ◦ let x = [–] in e0, vi↓ holds; this must have been derived by knowing that hS, e0[v/x]i ↓. We can then apply the induction hypothesis to deduce that E[[e0]]S[[S]] = >. We need to prove that E[[v]]S[[S ◦ let x = [–] in e0]] = >. Expanding the left-hand side of this yields E[[{x : τ 0} ` e0 : τ]]V[[v]]. Applying Lemma 5.3.4 once again, we see that this is equivalent to E[[e0[v/x]]]. This is equal to > from the induction hypothesis. I Case (let-pair). As for (let-abst), mutatis mutandis. I Case (let-abst). Given ` v : <>τ 0 and {x : name, x0 : τ 0} ` e0 : τ then we wish to prove that Φ(S ◦ let <>x0 = [–] in e0, v). We therefore have the assumption hS ◦ let <>x0 = [–] in e0, vi↓ which must have been derived by knowing that hS, e0[a/x, ((a a0)· 0 0 0 0 0 v )/x ]i↓, where v = <>v and a ∈ . \ supp(S, a, v, e ). We can thus apply the induction hypothesis to deduce that E[[e0[a/x, ((a a0) · v0)/x0]]]S[[S]] = >. (5.67) What we want to know is E[[<>v0]]S[[S ◦ let <>x0 = [–] in e0]] = >. Expanding the left-hand side of this yields E[[{x : name, x0 : τ 0} ` e0 : τ]]{x 7→ a00, x0 7→ (a a00) · V[[v0]]}S[[S]]

00 0 0 for any atom a ∈ . \ supp(S, a, v , e ). Applying Lemma 5.3.4 once again, we see that this is equivalent to E[[e0[a00/x, (a a00) · v0]]]S[[S]]. Now, we can always choose a0 to be the same as 00 0 0 a (by selecting one which lies in . \ supp(S, a, v, v , e ))—meaning that we can conclude by virtue of (5.67). I Case (match). We wish to show

Φ(S ◦ match [–] with · · · | Ck(xk) -> ek | · · · , v) holds. By the induction hypothesis we know that there exists some 1 6 k 6 K such that

hS, ek[vk/xk]i↓⇒ E[[ek[vk/xk]]]S[[S]] = >, (5.68) where v is of the form Ck(vk). Now, given the assumption

hS ◦ match [–] with · · · | Ck(xk) -> ek | · · · , vi↓ then the left-hand side of (5.68) must hold in the derivation from the termination relation. It therefore suffices to prove that E[[ek[vk/xk]]]S[[S]] = > implies

E[[v]]S[[S ◦ match [–] with · · · | Ck(xk) -> ek | · · · ]] = >. 5.5. The road to equivalence 77

Expanding the definitions of E[[–]] and S[[–]], this is equivalent to

E[[{xk : σk} ` ek : τ]]{xk 7→ dk}S[[S]] = > (5.69)

for the unique dk such that (i ◦ ink)(dk) = V[[v]]. But since v is of the form Ck(vk), then (i ◦ ink)(dk) = V[[Ck(vk)]] = (i ◦ ink)V[[vk]]. Thus dk = V[[vk]] and so we can apply Lemma 5.3.4 to deduce that (5.69) is equivalent to E[[ek[vk/xk]]]S[[S]] = >, which holds by assumption. For the second part of the proof, we take e to be non-canonical. These cases are more straightforward10 and we only provide some salient ones. I Case (nc-con). We wish to show Φ(S, Ck(e)). Calculate from the induction hypothesis and our assumption hS, Ck(e)i ↓ that E[[e]]S[[S ◦ Ck([–])]] = >. It is necessary to show that E[[Ck(e)]]S[[S]] = >; this follows immediately from the previous sentence and the definition of E[[–]]. I Case (nc-fresh). We wish to show that Φ(S, fresh) holds. Consequently we have the assumption that hS, freshi↓; this must have been derived by knowing that hS, ai↓ for some a not in the support of S. We need that E[[fresh]]S[[S]] = >, i.e. S[[S]](a) = > for similar a. But the induction hypothesis tells us that E[[a]]S[[S]] = >, which is just S[[S]](a) = > as required. I Cases (nc-pair) and (nc-app) are as for case (abst), mutatis mutandis. I Case (nc-abst). We wish to show that Φ(S, <>e0) holds. Assuming that hS, <>e0i↓ then the induction hypothesis yields E[[e]]S[[S ◦ <<[–]>>e0]] = >. We wish to know E[[<>e0]]S[[S]] = >, which immediately expands to this equality. I Case (nc-swap). We need that Φ(S, swap e, e0 in e00). Assume hS, swap e, e0 in e00i↓. The induction hypothesis gives that

E[[e]]S[[swap [–], e0 in e00]] = >.

We need E[[swap e, e0 in e00]]S[[S]] = >; however, this is equivalent to the previous statement just by expanding the definitions of E[[–]] and S[[–]]. 

5.5 The road to equivalence

In this section we are going to derive some properties of the denotational semantics which will enable us to connect with the notions of observational equivalence from Chapter 3. The first task is to show that CIU-equivalence and contextual equivalence coincide for Mini-FreshML. In order to do this, we go via the families of logical relations.

0 0 Lemma 5.5.1 (CIU-preorder coincides with 6e). For expressions e, e then Γ`e ≺ciu e : τ ⇔ 0 Γ ` e 6e e : τ.

exp Proof. For the forwards direction, by definition of ≺ciu and Cτ we have as assumptions that 0 ∀S : τ ( , ψ ∈ SubstΓ. hS, e[ψ]i↓ ⇒ hS, e [ψ]i↓ ; (5.70) stk 0 for all σ Cτ S , E[[e]](σ) = >. (5.71)

stk By Corollary 5.4.12 we have that S[[S]] Cτ S for each S of type τ ( . The result then follows immediately from (5.70), (5.71) and the computational adequacy result (Theorem 5.4.1). The reverse direction is similarly straightforward. 

Lemma 5.5.2 (≈ciu contained within ≈ctx). The relation ≺ciu is contained within ≺ctx, so each CIU-equivalence is also a contextual equivalence.

0 Proof. It obviously suffices to consider the pre-orders. Take expressions e, e such that Γ`e ≺ciu 0 0 e : τ. By Lemma 5.5.1 we then deduce Γ ` e 6e e : τ. Given any closing context C such that C[e] and C[e0] are well-typed then we wish to show that h[], C[e]i↓ implies h[], C[e0]i↓. Assuming h[], C[e]i↓ holds, we know by adequacy (Theorem 5.4.1) that E[[C[e]]]S[[[]]] = >. Since 6e is a 0 congruence (Lemma 5.4.17) we also have that C[e] 6e C[e ]. Combining these two facts with property (5.8) yields that h[], C[e0]i↓ as required. 

10Basically because if viewing the termination relation from an ‘abstract machine’ perspective, the actions taken upon encountering a non-canonical form do not vary a great deal. The vast majority of them consist of simply picking the first sub-expression to be evaluated and considering the termination of that in a frame stack directly derived from the original expression. 78 5. The road to equivalence

We aim to show that ≺ctx is contained within ≺ciu to enable us to deduce that the two relations coincide. To do this we must make use of Lemma 5.5.2 to prove the following substitutivity property of contextual equivalence. Lemma 5.5.3. For a typing context Γ, an identifier x ∈/ dom(Γ), a value v (not containing x) such that for some τ, ` v : τ and expressions e, e0 then

0 0 0 0 Γ, x : τ ` e ≺ctx e : τ ⇒ Γ ` e[v/x] ≺ctx e [v/x] : τ .

Thus for substitutions φ with domain X = {x1, . . . , xn}, a typing context Γ (with X chosen suitably so that dom(Γ) ∩ X = ∅) and expressions e, e0 then

0 0 0 0 Γ, x1 : τ1, · · · , xn : τn ` e ≺ctx e : τ ⇒ Γ ` e[ψ] ≺ctx e [ψ] : τ . where for all 1 6 k 6 n we have that ` ψ(xk) : τk. Proof. It is straightforward to see that the second sentence follows from the first. For the latter, first note that for a fresh identifier f not occurring in e or v, then Γ ` (fun f(x) = e)(v) ≈ciu e[v/x] : τ 0 holds by virtue of the definition of h–, –i↓. Since we have established that every CIU-equivalence is also a contextual equivalence, then we also have

0 Γ ` (fun f(x) = e)(v) ≈ctx e[v/x] : τ . (5.72)

0 0 Now given Γ, x : τ ` e ≺ctx e : τ then for any context C binding x, Γ ` C[e] ≺ctx 0 C[e ] : τ since ≺ctx is a congruence. Setting C to be the context (fun f(x) = [–])(v), 0 Γ ` (fun f(x) = e)(v) ≺ctx (fun f(x) = e )(v) : τ. But from (5.72) this implies that 0 Γ ` e[v/x] ≺ctx e [v/x] : τ. 

Lemma 5.5.4 (≈ctx contained within ≈ciu). The relation ≺ctx is contained within ≺ciu and thus each contextual equivalence is also a CIU-equivalence. Proof. Recalling the definition of Ctxt(–) from §3.4, consider two expressions e, e0 related by 0 the contextual pre-order, Γ ` e ≺ctx e : τ. Therefore for any context C, h[], C[e]i↓ implies 0 0 that h[], C[e ]i↓. We need to show that Γ ` e ≺ciu e : τ, so we assume that for any frame stack S of type τ ( and closing substitution ψ then hS, e[ψ]i↓. But if this holds then the forwards direction of Lemma 3.6.3 tells us that h[], (Ctxt(S))[e[ψ]]i↓ does also. This implies 0 0 0 that h[], (Ctxt(S))[e [ψ]]i↓ (since e and e are related by ≺ctx). Therefore hS, e [ψ]i↓ does indeed hold by the reverse direction of Lemma 3.6.3.  Lemmata 5.5.2 and 5.5.4 immediately give us the following.

Theorem 5.5.5 (Coincidence). The relations of ≈ciu and ≈ctx coincide.  We are shortly going to derive some extensionality properties of Mini-FreshML contextual equivalence. These will enable us to take two values which are contextually equivalent and deduce properties about the contextual equivalence of their innards; or indeed the converse. In order to prove these properties, we shall require the following lemma which relates contextual preorder to the logical relations.

0 0 0 Lemma 5.5.6. Given expressions e, e and canonical forms v, v then Γ`e ≺ctx e : τ ⇔ Γ`e 6e 0 0 0 e : τ; also, Γ ` v ≺ctx v : τ ⇔ Γ ` v 6v v : τ. Proof. The first bi-implication holds by Theorem 5.5.5 and Lemma 5.5.1. For the forwards 0 direction of the second we have that Γ`v ≺ctx v : τ, which by the same Theorem and Lemma 0 0 is equivalent to stating Γ ` v ≺ciu v : τ. This in turn implies Γ ` v 6e v : τ by Lemma 5.5.1. 0 Lemma 5.4.14 then enables us to deduce that in fact Γ ` v 6v v : τ. In the reverse direction, 0 0 Lemma 5.4.16 tells us that Γ ` v 6v v : τ implies Γ ` v 6e v : τ, which in turn implies 0 Γ ` v ≺ctx v : τ as previously.  Corollary 5.5.7 (Extensionality). For closed values v, v0, etc. we have the following extension- ality results. 0 0 . For unit values: ` v ≈ctx v : unit ⇔ v = v = (). 0 0 0 . For names: For all a, a ∈ / , ` a ≈ctx a : name ⇔ a = a . 0 0 . For data values: For 1 6 k 6 K, ` Ck(v) ≈ctx Ck(v ) : δ ⇔ ` v ≈ctx v : σk. 5.6. Algebraic identities 79

0 0 . For pair values: ` (v1, v2) ≈ctx (v1, v2) : τ1 × τ2 iff

0 0 ` v1 ≈ctx v1 : τ1 ∧ ` v2 ≈ctx v2 : τ2.

0 00 0 0 0 . For abstraction values: For all a, a ∈ 0 and any a ∈ \ supp(a, a , v, v ),

0 0 00 0 00 0 ` <>v ≈ctx <>v : <>τ ⇔ ` (a a ) · v ≈ctx (a a ) · v : τ.

. For function values: For closed values f, f 0,

0 0 0 0 ` f ≈ctx f : τ → τ ⇔ ∀v ∈ Valτ . ` f v ≈ctx f v : τ .

val Proof. Observing the properties (5.2)–(5.6) required of the logical relation Cτ then the re- sults follow by combining Lemma 5.5.6 with the definitions of E[[–]] and V[[–]]. The most complicated case is that for function values, which we give here. 0 0 0 I Case (fun). Given f ≈ctx f then f ≺ctx f and f ≺ctx f. Without loss of generality we 0 0 may just consider the first case and show that f ≺ctx f ⇔ ∀v ∈ Valτ . f v ≺ctx f v. By Lemma 0 0 5.5.6 it suffices to prove f 6v f ⇔ ∀v ∈ Valτ . f v ≺ctx f v. Observe that for closed values 0 0 val 0 f of type τ → τ and v of type τ, then E[[f v]] = V[[f]]V[[v]]. So if f 6v f then V[[f]] Cτ→τ 0 f ; hence for all v ∈ Valτ it follows (by the fundamental theorem and property (5.6) of the logical exp 0 exp 0 0 relation) that V[[f]]V[[v]] Cτ 0 f v. Therefore E[[f v]] Cτ 0 f v and so f v 6e f v. It follows 0 0 that ` f v ≺ctx f v : τ ; the reverse direction is similarly straightforward.  Our final theorem and corollary in this section show what we can deduce when we know that two expressions, or canonical forms, have the same denotation. Theorem 5.5.8 (Equality of denotation). For expressions e, e0 and a type τ such that Γ ` e : τ and Γ ` e0 : τ then

0 0 E[[Γ ` e : τ]] = E[[Γ ` e : τ]] ⇒ Γ ` e ≈ctx e : τ. (5.73)

Proof. Assume E[[Γ ` e : τ]] = E[[Γ ` e0 : τ]]. That is to say, for ρ ∈ [[Γ]] then E[[Γ ` e : τ]](ρ) = 0 E[[Γ ` e : τ]](ρ). In particular given any closing substitution ψ ∈ SubstΓ, E[[Γ ` e : τ]]V[[ψ]] = E[[Γ ` e0 : τ]]V[[ψ]]. Thus we can apply the computational adequacy result (Theorem 5.4.1) to deduce that the left-hand side of (5.73) implies hS, e[ψ]i↓⇔ hS, e0[ψ]i↓ for all frame stacks S 0 of argument type τ. Therefore Γ ` e ≈ciu e : τ, so Theorem 5.5.5 may be applied to deduce 0 that Γ ` e ≈ctx e : τ.  Corollary 5.5.9 (Denotation of closed canonical forms). If τ and δ do not involve any occurrences of the function space constructor → (making them ‘equality types’ in the Standard ML terminology) and v, v0 are closed values of type τ, then we have full abstraction: V[[v]] = 0 0 0 0 V[[v ]] ⇔ v ≈ctx v . Thus E[[v]] = E[[v ]] ⇔ v ≈ctx v . Proof. The two bi-implications are clearly equivalent by virtue of Lemma 5.2.1. The forwards directions hold by virtue of Theorem 5.5.8. Considering the first, in the reverse direction we proceed by induction on the typing derivation of τ making use of the extensionality properties of Corollary 5.5.7. If τ = unit then v and v0 must be (); thus V[[v]] = V[[v0]]. Similarly, if τ = name then v and v0 must both be the same atom and so we are done. If τ is the data 0 0 type δ, then v and v are of the form Ck(vk) and Ck(vk) respectively (for some 1 6 k 6 0 0 K), with V[[vk]] = V[[vk]] by the induction hypothesis. It thus follows that V[[v]] = V[[v ]] by definition of V[[–]]. A similar argument applies at pair types. At abstraction types, where v and 0 v are of the form <>v1 and <>v2 respectively then the induction hypothesis tells us

that V[[(a1 a) · v1]] = V[[(a2 a) · v2]] for any a ∈ 0 \ supp(a1, a2, v1, v2). Thus <>V[[v1]] = <>V[[v2]], which by definition of V[[–]] is V[[<>v1]] = V[[<>v2]] as required. Finally, the function type case is excluded by the conditions we impose above. 

5.6 Algebraic identities

We are now in a position to use our denotational semantics to verify algebraic identities. For example, we can show that the semantics correctly handles ‘garbage collection’ of fresh names by proving e ≈ctx let x = fresh in e 80 5. Correctness of representation

when x is not free in e. This is done as follows. E[[Γ ` let x = fresh in e : τ]](ρ) = λσ ∈ [[τ]]⊥. E[[Γ ` fresh : name]](ρ) (λa ∈ [[name]]. E[[Γ, x : name ` e : τ]] by definition of E[[–]] (ρ[x 7→ a])(σ) = λσ ∈ [[τ]]⊥. E[[Γ, x : name ` e : τ]](ρ[x 7→ a])(σ) ditto, for a ∈/ supp(e, σ, ρ) = λσ ∈ [[τ]]⊥. E[[Γ ` e : τ]](ρ)(σ) by Lemma 3.2.3, since x not free in e = E[[Γ ` e : τ]](ρ) by η-reduction ⇒ e ≈ctx let x = fresh in e by Theorem 5.5.8. Whilst we do claim that our semantics is good for verifying certain correctness results of Mini- FreshML, we do not claim that it is the be-all-and-end-all. For there are certainly equivalences which we might wish to verify that are not provable in our current system. We take as an example the Mini-FreshML equivalent of an example presented by Stark[65, page 24, equation 12]. This considers the two terms

let n = fresh in fun f(x) = if x = n then () else Ω and fun g(x) = Ω, for a divergent term Ω of type unit. These two terms are indeed con- textually equivalent, by virtue of the fact that the name (bound to n) encapsulated inside the support of the function f cannot be determined externally. However, if we attempt to prove the contextual equivalence of these two expressions via our denotational semantics then we get stuck. The problem is that we need to deduce that the following are equal:

λσ ∈ [[name → unit]]⊥. σ(λd ∈ [[name]]. λσ0 ∈ [[unit]]⊥. if a, d, σ0(>), σ0(⊥)) (for some a ∈/ supp(σ)); and λσ ∈ [[name → unit]]⊥. σ(⊥). So Corollary 5.5.9 notwithstanding, we believe (as would be expected) that our semantics is in general not fully abstract. It seems likely that this is the case not only for the usual sequentiality reasons (a` la PCF) but also due to the presence of the computational effects of generating fresh names. Unfortunately, examples such as the above do not appear to give a definite answer either way. However, recent work by Benton et al.[6] which uses a continuation-based model based on ours to reason about mutable store does support the view that the semantics is not fully abstract.

5.7 Correctness of representation 5.7.1 Background theory In the previous sections we have established key properties of the denotational semantics and shown how these may be used to prove results concerning the operational behaviour of Mini- FreshML expressions. We now put this machinery into operation and turn to the issue of proving the correctness results from §3.7. Recall the use of the untyped λ-calculus from that Section as an example object language. In order to model λ-terms we can simply set K = 3, σ1 = name, σ2 = <>δ and σ3 = δ×δ. Then writing Var to stand for C1, Lam to stand for C2 and App to stand for C3, the type δ corresponds to the Fresh O’Caml declaration

type δ = Var of name | Lam of <>δ | App of δ * δ. The FM-cppo [[δ]] is the minimal invariant solution D to the recursive domain equation i : F (D) =∼ D, where

def 1 F (D) = 1 ⊕ ([ ]D) ⊕ (D ⊗ D). (5.74) Note that the description in §4.5 provides for a more general method of solution than is required here, due to the lack of contravariant occurrences of the variable D. Definition 5.7.1 (Translation to values). For each λ-term t, define a Mini-FreshML value [t]v ∈ Valδ by induction on the structure of t as follows.

def [x]v = Var(x) 5.7. Correctness of representation 81

def [λx. t]v = Lam(<>[t]v) 0 def 0 [t t ]v = App([t]v, [t ]v).

Thus if t is a closed (resp. open) term, [t]v is a closed (resp. open) value.  0 0 0 Lemma 5.7.2. [–]v is equivariant: if x, x ∈ VId then (x x ) · [t]v = [(x x ) · t]v.  Lemma 5.7.3. For all λ-terms t, t0 whose variables (free and bound) are contained within a set

of identifiers x = {x1, · · · , xn} and an injective value substitution ψ : VId→ 2 with dom(ψ) ⊆ x,

0 0 [t]v[ψ] ≈ctx [t ]v[ψ] ⇔ t ≡α t .

Proof. By induction on the structure of t. Note that at each stage we can assume that the structure of t0 is the same as t, for in the forwards direction this follows from the extensionality properties of Lemma 5.5.7 and in the reverse direction this follows from the axiom and rules inductively defining the relation ≡α (Definition 3.7.1). I Case (var). For variables x, x0 (which must lie in the domain of ψ) then we wish to know

0 0 Var(ψ(x)) ≈ctx Var(ψ(x )) ⇔ x ≡α x .

In the forwards direction, the extensionality properties of Corollary 5.5.7 tell us that ψ(x) and 0 0 0 ψ(x ) must be the same atom. Since ψ is injective then x = x and therefore x ≡α x . In the 0 reverse direction, we must have x = x and conclude by reflexivity of ≈ctx. I Case (abst). We wish to know that

0 0 0 Lam(<<ψ(x)>>([t]v[ψ])) ≈ctx Lam(<<ψ(x )>>([t]v[ψ])) ⇔ λx. t ≡α λx . t for terms t, t0 and variables x, x0 in the domain of ψ. The extensionality properties tell us that 0 0 the left-hand side holds iff <<ψ(x)>>([t]v[ψ]) ≈ctx <<ψ(x )>>([t ]v[ψ]) and therefore just when

0 0 (ψ(x) a) · ([t]v[ψ]) ≈ctx (ψ(x ) a) · ([t ]v[ψ]) (5.75)

0 0 for some/any a ∈ 2 \ supp(ψ(x), ψ(x ), [t]v[ψ], [t ]v[ψ]). By the induction hypothesis we have 0 0 0 that [t]v[ψ] ≈ctx [t ]v[ψ] ⇔ t ≡α t , since all the variables occurring in t and t are mapped by 0 0 0 0 0 0 ψ. Moreover, we also have [t]v[ψ ] ≈ctx [t ]v[ψ ] ⇔ t ≡α t , where ψ is that substitution ψ mapping a fresh variable x00 to a and acting as ψ otherwise. Since swapping does not affect the size of the subterms t and t0 we can therefore deduce that

00 0 0 00 0 0 00 0 00 0 [(x x ) · t]v[ψ ] ≈ctx [(x x ) · t ]v[ψ ] ⇔ (x x ) · t ≡α (x x ) · t .

00 0 00 0 0 0 00 0 Lemma 5.7.2 now tells us that (x x ) · t ≡α (x x ) · t iff (ψ (x) ψ (x )) · ([t]v[ψ ]) ≈ctx 0 0 0 00 0 0 0 0 0 (ψ (x ) ψ (x ))·([t ]v[ψ ]). Now observe that: ψ (x) = ψ(x) (and similarly for x ); and [t]v[ψ ] = 0 00 0 [t]v[ψ] (similarly for t ) since x is fresh for both t and t . These facts together with the rule (3.8) enable us to deduce

0 0 0 0 (ψ(x) a) · ([t]v[ψ]) ≈ctx (ψ(x ) a) · ([t ]v[ψ]) ⇔ λx. t ≡α λx . t .

However, the left-hand side of this holds by (5.75). I Case (app). We wish to deduce that

0 0 0 0 [t1 t2]v[ψ] ≈ctx [t1 t2]v[ψ] ⇔ t1 t2 ≡α t1 t2 (5.76)

0 using the induction hypotheses: [t1]v[ψ] ≈ctx [t1]v[ψ] ⇔ t1 ≡α t1 (and similarly for t2, 0 t2). However, using the extensionality properties and the rule (3.9) it is immediate that the conjunction of these two statements holds just when (5.76) does also.  In order to derive correctness results pertaining to expressions which are not in canonical form we shall adopt a method due to Pitts (unpublished manuscript). This identifies λ-terms in a certain way so as to make distinct and disjoint the free and bound variables of each term—effectively enforcing the Barendregt variable convention. For disjoint subsets x, x0 of VId, define the subset Λ(x; x0) ⊆ Λ inductively by the following rules:

0 x ∈ x t ∈ Λ({x} ] x, x ) x ∈ Λ(x, ∅) λx. t ∈ Λ(x, {x} ∪ x0) 82 5. Correctness of representation

0 0 0 0 0 t ∈ Λ(x, x1) t ∈ Λ(x, x2) x1 ∩ x2 = ∅ 0 0 0 . t t ∈ Λ(x, x1 ∪ x2) If t ∈ Λ(x, x0) then: the free variables of t are contained within x; the bound variables of t are mutually distinct and are contained within x0; the sets of free and bound variables of t are disjoint; and vars(t) ⊆ x ∪ x0. Each term t ∈ Λ is α-equivalent to a term in Λ(x, x0) for some x, x0. 0 0

Lemma 5.7.4. For t ∈ Λ(x, x ) and an injective substitution ψ : VId → 3 with dom(ψ) ⊆ x ∪ x then [t]e[ψ] ≈ctx [t]v[ψ] : lam. Proof. Either a ‘denotational’ or ‘operational’11 approach may be taken; we choose the former. By Theorem 5.5.8 it suffices to show that for all σ ∈ [[lam]]⊥,

E[[` [t]e[ψ] : lam]](∅)(σ) = σ(V[[` [t]v[ψ] : lam]](∅)). That this is so follows by induction on the derivation of t ∈ Λ(x, x0). I Case (var). Follows immediately from Lemma 5.2.1, since [x]e is a canonical form. I Case (abst). Under the assumption that E[[[t]e[ψ]]](σ) = σ(V[[[t]v[ψ]]]), where t ∈ Λ({x} ] x, x0) and dom(ψ) ⊆ {x} ] x ] x0, we wish to show that

E[[[λx. t]e[ψ]]](σ) = σ(V[[[λx. t]v[ψ]]]) (5.77) where λx. t ∈ Λ(x, x0 ] {x}). Completely expanding the definition of E[[–]] on the left-hand side we obtain 0 0 E[[[t]e[ψ|x]x0 ]]]{x 7→ a }(λd ∈ [[lam]]. σ((i ◦ in2)[a ]d)) (5.78) 0 for some a ∈ 3 \ supp([t]v[ψ], [t]e[ψ], ψ(x), σ) and where ψ|x]x0 is the restriction of ψ to map 0 0 only from x and x . Observing that we have x ∈/ x and x ∈/ x , extend ψ|x]x0 to form a 0 0 0 new substitution ψ behaving as ψ|x]x0 but also mapping x to a . This ψ is also injective and therefore we can make use of the fact (Lemma 5.3.4) that E[[e]]{x 7→ a0} = E[[e[a0/x]]](∅) for any expression e with a single free value identifier x of type name to deduce that (5.78) is equivalent to 0 0 E[[[t]e[ψ ]]](λd ∈ [[lam]]. σ((i ◦ in2)[a ]d)) which, via an application of the induction hypothesis contracts to

0 0 σ((i ◦ in2)[a ]V[[[t]v[ψ ]]]). (5.79)

Write a for the atom ψ(x). It remains to show that (5.79) is equal to σ((i ◦ in2)[a]V[[[t]v[ψ]]]), at which point we can apply the definition of V[[–]] to conclude (5.77). By Lemma 4.2.24, since 0 0 0 a 6= a and a ∈/ supp([t]v[ψ]) then it suffices to show that (a a ) · V[[[t]v[ψ]]] = V[[[t]v[ψ ]]]. The 0 left-hand side of this is equal to V[[(a a ) · ([t]v[ψ])]] by Lemma 5.3.1 and then we are done, 0 0 since it is easy to see that (a a ) · ([t]v[ψ]) = [t]v[ψ ]. 0 I Case (app). We take as assumptions that E[[[t]e[ψ1]]](σ) = σ(V[[[t]v[ψ1]]]) and E[[[t ]e[ψ2]]](σ) = 0 0 0 0 0 0 σ(V[[[t ]v[ψ2]]]) where t ∈ Λ(x, x1), t ∈ Λ(x, x2) and x1 ∩ x2 = ∅. Let us take the substitutions , to be 0 and 0 respectively (these are still injective since is). Carefully ψ1 ψ2 ψ|x∪x1 ψ|x∪x2 ψ observing the domains of these substitutions we see that we need to prove

0 0 0 E[[[t t ]e[ψ]]](σ) = σ(V[[[t t ]v[ψ]]]) = σ(V[[App([t]v[ψ1], V[[[t ]v[ψ2])]]]]). We proceed as in the previous case by fully expanding the left-hand side to obtain

0 0 0 E[[[t]e[ψ1]]](λd ∈ [[lam]]. E[[[t ]e[ψ2]]](λd ∈ [[lam]]. σ((i ◦ in3)hd, d i)))

0 which from the assumptions is σ((i ◦ in3)(V[[[t]v[ψ1]]], V[[[t ]v[ψ2]]]))).  Lemma 5.7.5 (Divergent terms). For a typeable closed expression e of type lam and the diver- gent term Ω def= (fun f(x) = f(x))(()) then

e ≈ctx Ω ⇔ ∀S : lam ( . E[[e]]S[[S]] = ⊥ ⇔ ∀S : lam ( . hS, ei↓. where we write hS, ei↓. to indicate that hS, ei↓ does not hold.

11Such a proof argues along the same lines, but mainly by making observations relating to the termination relation. Morally speaking, we feel that the ‘denotational’ approach is more in line with the style of semantics presented here. 5.7. Correctness of representation 83

Proof. e ≈ctx Ω holds iff e ≈ciu Ω, i.e. for all S of type lam ( , hS, ei↓ holds just in case hS, Ωi↓. From the definition of the termination relation, we have that hS, Ωi↓ never holds; thus, hS, ei↓.. Combining this with the computational adequacy result (Theorem 5.4.1) concludes the proof. 

5.7.2 Correctness results Theorem 5.7.6 (Correctness for values). The contextual equivalence classes of closed values of type lam are in bijection with the α-equivalence classes of λ-terms with variables in VId. 12 ∼ 0 Proof. Fix any bijection ψ : VId = 4 . Then given any two α-equivalent terms t, t , Lemma 0 5.7.3 tells us that [t]v[ψ] ≈ctx [t ]v[ψ]. In the reverse direction, we need to show that any two contextually-equivalent values v, v0 of type lam arise from α-equivalent terms t, t0 such that 0 0 v = [t]v[ψ] and v = [t ]v[ψ], for then we can apply the same Lemma in the other direction to get the desired result. However, given any value v of type lam then the typing rules tell us that it can only be of the form Var(a) for some atom a, Lam(<>v0) for some value v0 of type lam, or App(v1, v2) for some values v1 and v2 of type lam. It is thus straightforward to see that every value of type lam corresponds to a value [t]v[ψ].  Corollary 5.7.7 (Equality of denotation at type lam). Given any two λ-terms t, t0 with

variables in VId and an injective substitution ψ : VId → 4 then

0 V[[[t]v[ψ]]] = V[[[t]v[ψ]]] ⇔ t ≡α t . Proof. Follows immediately from Corollary 5.5.9 and Theorem 5.7.6. 

The previous theorem tells us something about the relation between elements of [[lam]] and terms of the λ-calculus. What it does not however tell us is that the v 7→ V[[v]] map is surjective onto [[lam]]↓. The following Theorem establishes this useful ‘no junk’ property. Theorem 5.7.8 (No junk in [[lam]]). The non-bottom elements of [[lam]] are in bijection with the α-equivalence classes of λ-terms with variables in VId. Proof. Pitts and Gabbay establish in previous work[20] that the elements of the inductively-

def 4 defined FM-set Λ/≡α = µX. 4 + ([ ]X) + (X × X) are in bijection with the α-equivalence

classes of λ-terms with variables in 4 . Since such λ-terms are in bijection with those containing variables in VId, it suffices to establish that the lifted FM-set (Λ/≡α)⊥ is in bijection with [[lam]], provided that the least elements of these are mapped to each other. This may be done neatly by exhibiting an isomorphism j : F ((Λ/ ≡α)⊥, (Λ/ ≡α)⊥) =∼ (Λ/ ≡α)⊥ such that the pair ((Λ/≡α)⊥, j) has the minimal invariant property for the functor F of §5.2.1. Since such pairs are unique up to isomorphism (by Theorem 4.5.2), we can deduce that [[lam]] is isomorphic to (Λ/≡α)⊥ so long as the least elements are correctly mapped. We must therefore exhibit some −1 def j with j(⊥) = ⊥, j (⊥) = ⊥ and fix(Φ) = id (Λ/≡α)⊥ , where Φ = λf ∈ (Λ/≡α)⊥ ( (Λ/≡α −1 )⊥. j ◦ F (f, f) ◦ j . A suitable equivariant j is as follows:

def def def 0 def 0

j(⊥) = ⊥ j(in1(a ∈ 4 )) = a j(in2([a]t)) = λa. t j(in3(t, t )) = t t

where the functions ink are the usual injections into a coalesced sum. Observe that since Φn+1(⊥)(t) def= Φ(Φn(⊥))(t) then

⊥ if t = ⊥;

a if t = a ∈ 4 ; Φn+1(⊥)(t) = λa0. (Φn)(⊥)((a a0) · t0) if t = λa. t0, with   0 n 0  a ∈ 4 \ supp(Φ (⊥), a, t ); n n (Φ (⊥)(t1)) (Φ (⊥)(t2)) if t = t1 t2.   We are first going to showthat Φn(⊥) has empty support for any n. Proceed by induction on n. For the base case, Φ0(⊥)(t) def= ⊥ which has empty support. Then under the assumption

12If working formally inside FM-set theory, we would have to do this differently since this construction is not finitely-supported. 84 5. Correctness of representation that π · (Φn(⊥)(t)) = Φn(⊥)(π · t) we wish to know π · (Φn+1(⊥)(t)) = Φn+1(⊥)(π · t). This is straightforward to see in all cases except where t is a λ-abstraction, say λa. t0. Then for 0 n 0 a ∈ 5 \ supp(Φ (⊥), a, t ) we have π · (Φn+1(⊥)(t)) = π · λa0. Φn(⊥)((a a0) · t0) by definition = λπ(a0). π · (Φn(⊥)((a a0) · t0)) swapping on the syntax tree = λπ(a0). Φn(⊥)((π(a) π(a0)) · π · t0) by the induction hypothesis = λa00. Φn(⊥)((a00 π(a0)) · π · t0) writing a00 = π(a) = Φn+1(⊥)(π · t0) since π(a0) ∈/ supp(π · t0). Given this equivariance result the previous definition may be simplified to

⊥ if t = ⊥;

a if t = a ∈ 5 ; Φn+1(⊥)(t) =  λa. (Φn)(⊥)(t0) if t = λa. t0;   n n (Φ (⊥)(t1)) (Φ (⊥)(t2)) if t = t1 t2.  We can now see by induction onn that Φn(⊥) is the identity on terms of height at most n

(where ⊥ is the only term of height zero). Therefore fix(Φ) = id (Λ/≡α)⊥ .  Theorem 5.7.9 (Correctness for expressions). For λ-terms t, t0 with free variables contained in the finite set {x0, · · · , xn} ⊆ VId, 0 t ≡α t ⇔ {x0 : name, · · · , xn : name} ` [t]e ≈ctx [t ]e : lam.

0 Proof. For the forwards direction we proceed by induction on the derivation of t ≡α t to show that 0 0 t ≡α t ⇒ [t]e = [t ]e, (5.80) where – = – is equality of Mini-FreshML expressions, identified up to α-conversion of bound value identifiers. I Case (var). The terms t and t0 must both be the same value identifier x ∈ VId and so we are done. 0 0 I Case (lam). We have that λx. t ≡α λx . t and wish to know that

let x = fresh in Lam(<>[t]e) = (5.81) 0 0 0 let x = fresh in Lam(<>[t ]e).

00 0 00 0 The ≡α judgement must have been derived by knowing that (x x ) · t ≡α (x x ) · t for 00 0 some x ∈ 5 \ supp(t, t ). Therefore, Lemma 3.7.3 and the induction hypothesis give us that 00 0 00 0 (x x ) · [t]e = (x x ) · [t ]e and hence 00 00 00 let x = fresh in Lam(<>(x x ) · [t]e) = 00 00 0 00 0 let x = fresh in Lam(<>(x x ) · [t ]e).

00 0 0 However since x does not occur in t or t (and therefore not in [t]e or [t ]e) then the ex- 00 00 00 pression let x = fresh in Lam(<>(x x ) · [t]e) is in the same α-equivalence class as 0 let x = fresh in Lam(<>[t]e). A similar argument applies for the t side of the equality. We thus have (5.81) as required. 0 0 0 0 I Case (app). We have that t1 t2 ≡α t1 t2 and want App([t1]e, [t2]e) = App([t1]e, [t2]e). 0 0 0 However the induction hypotheses tell us that t1 ≡α t1 ⇒ [t1]e = [t1]e and similarly for t2, t2. For the reverse direction of the theorem, by renaming variables we can find a finite set x0 0 0 0 0 and terms tΛ, tΛ ∈ Λ(x, x ) such that tΛ ≡α t and tΛ ≡α t . Applying (5.80) then yields that 0 0 [tΛ]e ≈ctx [tΛ]e : name. Choosing some injective substitution ψ : VId → 5 with domain x ∪ x , 0 we can apply Lemma 5.7.4 to conclude that [tΛ]v[ψ] ≈ctx [tΛ]v[ψ] : lam. Finally, we apply 0 0 Lemma 5.7.3 to obtain t ≡α tΛ ≡α tΛ ≡α t .  Theorem 5.7.10 (Form of expressions). For a closed Mini-FreshML expression e of type lam and a divergent term Ω, then either e ≈ctx Ω or there exists a closed value v of type lam with

e ≈ctx let x1 = fresh in · · · let xn = fresh in v, for value identifiers {x1, . . . , xn}. 5.7. Correctness of representation 85

Proof. Using Lemma 5.7.5 we see that if ` e ≈ctx Ω does not hold, then there exists some frame stack S such that hS, ei↓. Therefore by Lemma 3.4.3 we know that h[], ei↓. We can now apply the forwards direction of Theorem 3.4.6 to deduce that there exists some closed value v0 of type lam and finite sets of atoms a0 ⊇ a ⊇ atms(e) such that: a, e⇓v0, a0 with atms(v0) ⊆ a0 and h[], v0i↓. Pick a finite set X ⊆ VId with cardinality |a0 \ a| and let ψ be a bijection X =∼ (a0 \ a). Then we can write v0 as v[ψ], where v is that value formed from v0 by replacing each atom 0 0 −1 a ∈ a \ a which occurs in v with a value identifier ψ (a). Enumerating X as {x1, . . . , xn} we now show that

e ≈ciu let x1 = fresh in · · · let xn = fresh in v, so we can conclude by Theorem 5.5.5. Call the right-hand side of this conjectured equivalence e0. In the forwards direction, assume hS, ei↓. Theorem 3.4.6 then tells us that there exist sets 0 0 0 0 0 0 of atoms a1 ⊇ a1 ⊇ atms(S, e) such that: a1, e ⇓ v1, a1 with atms(v1) ⊆ a1 and hS, v1i↓. Then 13 0 ∼ 0 0 0 −1 by Lemma 3.3.3 there exists a bijection π : (a1 \a1) = (a \a) with π ·v1 = v . Since π fixes atms(e) pointwise then hS, v[π−1 · ψ]i↓, where π−1 · ψ maps each x ∈ dom(ψ) to π−1(ψ(x)). It now follows that

−1 −1 e ≺ciu let x1 = π (ψ(x1)) in · · · let xn = π (ψ(xn)) in v.

−1 0 However for each 1 6 i 6 n, a = π (ψ(xi)) ∈ (a1 \ a1) and therefore a ∈/ atms(S). We 0 therefore have e ≺ciu e . In the reverse direction, assume hS, e0i ↓. It suffices to show hS, v[ψ]i ↓, since then we can conclude by Theorem 3.4.6 (for we already have that a, e ⇓ v[ψ], a0 from above). Pick 0 def a finite set of atoms a2 ⊇ supp(S, e) together with n distinct atoms {a1, . . . , an}; let a2 = 0 ∼ 0 0 a2 ∪ {a1, . . . , an}. Fix a bijection π : (a \ a) = (a2 \ a2) and observe that hS, e i↓ implies 0 hS, v[π · ψ]i↓ (since each atom in a2 \ a2 is not in the support of S). Since π fixes all atoms in v, we know that hS, π · (v[ψ])i↓; it follows by the equivariance property of the termination relation (and indeed that π−1 fixes the support of S pointwise) that hS, v[ψ]i↓ as required.  This final theorem concludes our work on the denotational semantics of Mini-FreshML. We have shown how a novel monadic semantics can provide a straightforward setting for modelling dynamic allocation. Using the semantics, we have shown how object language terms may be translated into Mini-FreshML expressions and proved how such expressions correspond to the α-equivalence classes of the original terms. Such formal proof, long though it may be, is technically satisfying and provides good evidence that our syntax-manipulating paradigm is sound. In the next chapter then, we shall exchange the art of mathematics for the art of program- ming. We will discuss the innards of the Fresh O’Caml compiler together with some of the past and future issues surrounding the implementation.

13Thought of as a permutation, which fixes all atoms not in its domain.

6 Implementation

‘Theory attracts practice as the magnet attracts iron.’ —Gauss

IN THIS CHAPTER we describe the implementation of the freshness features in Fresh O’Caml. The text here corresponds to the current version of the implementation[60, 62], but since this is still in its infancy we also discuss technologies which are not yet incorporated into that system.

6.1 Library, or bespoke system?

At the start of Fresh O’Caml’s development, it was necessary to decide how the new features would be integrated into the existing language system. In such a scenario there are two basic options1: • implementation as a standard library module, leaving the main compiler and runtime systems unchanged;

• wholesale extension of the compiler and runtime systems themselves. The first of these corresponds to an embedding of the freshness features within the target language whilst the second corresponds to a fundamental extension of the language. Fresh O’Caml falls into the second category: a decision which was reached after consideration of the following issues. I The necessity of swapping. Theoretically speaking, Fresh O’Caml code could be source- to-source translated into standard O’Caml code whose only non-standard dependency is that of a swapping operation on names. (An example of how this can be done for FreshML may be found in previous work[63, §3] by the author and others.) The provision for this in the language runtime system is thus essential. Some ML systems (O’Caml and Moscow ML, for example) permit the inspection and modification of the ML heap at runtime from within an ML program by using unsafe primitives and it has been shown that this can be used to implement a swapping ‘primitive’ without too much difficulty2. However, the solution is somewhat clumsy and prohibits the use of certain optimisations as we identify in Efficiency below. I Syntactic sugar. If implementing freshness as a library it is highly desirable to have some pre-processing of language source text in order to provide idioms such as pattern-matching on abstraction values. These idioms in themselves are one of the distinguishing features of our approach: syntax-manipulating programs may be written concisely and elegantly using them. The source-to-source translations which one requires in order to expand out such constructs could possibly be implemented in a pre-processor such as camlp4[13]. However, past experience implementing (very) early versions of the FreshML system[59] showed that such transformations can be far from straightforward. This is particularly the case when dealing with the expansion of nested abstraction pattern-matches. Additionally, an upshot of such transformations can be the undue obfuscation of compiler error messages. I Efficiency. In order to implement optimisations such the scheme of delayed permutations which will be introduced in §7.1.1, it is necessary to make fundamental changes to the struc- ture of values on the heap. This obviously necessitates modification of the existing compiler and runtime system, rather than just adding a library module. I The moral stance. Morally speaking, we feel that constructs for declaring and manipulating syntax modulo α-equivalence are so fundamental that they ought to be directly incorporated

1Miller[33, §1] presents a similar pair of options. 2Such a solution was first implemented by Claudio Russo in Moscow ML (private communication).

87 88 6. Creation of fresh names

into the target language. Why should abstraction types be on any different a level from the other type constructions which we regularly use when declaring recursive datatypes? I The learning curve. Wholesale extension of an existing language and compiler requires a significant amount of work, not just in understanding the semantics of the programming language but also in familiarisation with voluminous quantities of compiler code. In the case of the initial versions of Fresh O’Caml the lack of freshness inference meant that most of the modifications to the existing compiler code were relatively straightforward. The freshness- specific runtime code is fairly orthogonal to the rest of the runtime system and thus could be inserted into the system without undue difficulty. However as we shall see in this and the next chapter, the implementation is earmarked to become more refined with a view to improving efficiency. This will certainly involve more far-reaching changes affecting large volumes of compiler and runtime system code. I Accessibility. The library solution has the great advantage that freshness features can be incorporated into some existing language on a relatively short timescale (even if only as a quick inefficient hack). It is therefore an attractive proposition for demonstrating the benefits of our approach to users of these languages. Such projects would hopefully increase general awareness and adoption of the new constructs.

6.2 System overview

The patch to Objective Caml is broadly composed of the following parts.

1. Modifications to the front-end of the compiler (lexer, parser and type-checker) to accom- modate the new constructs. Whilst there are multitudinous changes, most are relatively minor. The major exception is the code which compiles abstraction pattern-matches, whose underlying algorithm is explained in §6.5. 2. A new module in the runtime system implementing low-level functionality (sources byterun/freshness.c and byterun/freshness.h).

3. Minor modifications throughout the runtime system and standard library to cope with the addition of new varieties of heap blocks (to represent abstraction values, for exam- ple) and primitive operations (for example atom-swapping).

The majority of the extensions are therefore in the runtime system which is implemented in C[26]; the remainder are written in O’Caml. The system remains a bootstrapping compiler and emits both bytecode and native code. For readers unfamiliar with the internals of the O’Caml system, we must say a few words about how values are stored in memory during the execution of an O’Caml program. Values are partitioned into two varieties: those which are allocated on the heap (boxed) and those which are stored as immediate (unboxed) integer values. Any particular allocated value will consist of one or more heap blocks. Each block has an integer tag which stores information about what is inside. For example, tag number zero specifies a tuple. Blocks may be linked together via pointers and may be aliased (shared) in order to save memory. They are garbage-collected when they are no longer referenced. Immediate values are used to store values such as integers. On a machine whose native processor integer width is n bits, an O’Caml immediate value will have n − 1 bits available for actual data—the final bit being taken up with the flag which specifies whether or not the value is boxed.

6.3 Creation of fresh names

An object language name is represented by a block with tag Atom tag. Such a block contains a single extra field containing an atom identifier, which is a 31-bit integer produced when the atom is created (corresponding to a fresh expression, for example). The integers are gener- ated using a pseudo-random number generator3 whose parameters are chosen to give a very low risk of collision. The reason for using such a linear congruential generator rather than just an incrementing counter is to attempt to ensure the uniqueness of identifiers across multiple

3Thanks to Keith Wansbrough for suggesting this. 6.4. Creation of abstraction values 89

instances of the O’Caml runtime, for there is currently no specific support for marshalling atoms. Future versions will likely use a simple counter and be equipped with such marshalling support. So currently, a heap block representing an object language name looks like4:

Atom tag a

In order to minimise the risk of collisions (in the current implementation) or ‘running out of atoms’ (in future versions), it is highly desirable that atom identifiers should have a range5 of 63 bits. In order to approximate this, on a 32-bit machine the heap layout could be changed to:

Atom tag a1 a2

where a1 (resp. a2) is the most (resp. least) significant part of the atom identifier, or vice- versa. These fields may accommodate at most 31 bits each, so we obtain a 62-bit range. On a 64-bit machine, heap block fields are 64 bits wide by default so no change is needed. This will allow a 63-bit range for the atom identifiers.

6.4 Creation of abstraction values

As we noted in Chapter 2, creation of abstraction values is fast whereas deconstruction is slower. In the implementation, the creation of an abstraction value <>v involves just two steps: 1. the allocation of a heap block with a distinguished tag (Abst tag); 2. the placing of u and v inside as if for a pair. Note that either of u and v may be immediate values, or pointers to boxed values. An abstraction heap block therefore looks like this:

Abst tag u v

6.5 Pattern-matching on abstraction values

It is well-known that the efficient compilation of pattern-matching[68] in functional languages is a non-trivial problem. In Fresh O’Caml we have extended the compiler to accommodate matches against abstraction values as described in §2.2.3. This poses a significant imple- mentation hurdle—especially since the existing code to compile matches in O’Caml is highly optimised[27]. Early in the development of Fresh O’Caml, efficiency was completely put aside in order to get a working system as fast as possible. As a consequence, a temporary pattern-matching scheme was put in place which required only minimal modifications to the compiler code. We shall review this first and then describe more efficient implementation strategies. Suppose we have a function which takes an arbitrary O’Caml value and determines whether there are any abstraction values present within it. Then, whenever compiling a match we emit code to intercept the value to be matched against and pass it through this function. If the value may possibly contain abstractions, then we completely freshen the value such that every abstraction value therein contains fresh atoms corresponding to those in binding position. The match may then be correctly compiled by recursively changing the abstraction patterns to pair patterns and passing the rewritten match to the normal pattern-match compilation code. Now this scheme is obviously quite inefficient but it suffices for a first attempt. To improve upon this requires more substantial compiler modifications. When considering such modifica- tions, we have aimed to minimise the amount of modifications to the central O’Caml pattern- matching code itself. To this end, current versions of Fresh O’Caml use what is effectively

4In case the reader is wondering, the tag field is indeed drawn disproportionately large. 5In the current implementation they are still restricted to 31 bits. 90 6. Implementation of swapping

a source-to-source translation performed just before matches are compiled. It operates as follows. Define a function pat 7→ [[pat]] which rewrites a Fresh O’Caml pattern pat such that each abstraction pattern <>p2 is replaced by the corresponding pair pattern (p1, p2). Similarly, define a function pat 7→ [[pat]]w which rewrites a pattern pat such that every pattern variable within it is replaced by a wildcard. Then we transform a (top-level) pattern match of the form <>pat 0 -> exp into the following code fragment6, for a fresh pattern variable x: 0 [[<>pat ]]w as x -> match freshen x with [[<>pat 0]] -> exp. The function freshen performs the ‘complete freshening’ operation on a value as described above. We must note that this operation is potentially very expensive—especially if a term is subjected to repeated matches down through its structure, where O(n2) time complexity may occur as a result of the swapping process. The scheme of delayed permutations which we shall examine in both the next section and in full detail in §7.1.1 will, however, alleviate this. The obvious thing to note about our current scheme is that it is somewhat eager, in the sense that a single abstraction pattern-match causes the whole of the corresponding abstrac- tion body to be freshened. A more incremental approach could be considered, whereby only the parts of the value being matched against using abstraction patterns get freshened. This saves the computational cost of the freshen function. For example, for atoms a, b then we could consider matching the value <>(a, <>b) against the pattern <

>(q, r). In the current scheme, the innermost abstraction (bound to r) would be fully freshened— even though to actually look inside it will require another abstraction pattern-match. In an incremental scheme, this innermost abstraction would not be freshened (or even copied) in the first instance. An incremental scheme requires significantly more implementation effort, either via a com- plex source-to-source translation or via serious modification of the existing O’Caml pattern- matching compiler code. Future work should conduct some experiments on large-scale sys- tems to determine whether the potential increases in efficiency offered by the incremental approach are worth the extra complexity inside the compiler.

6.6 Implementation of swapping

The crux of our implementation is the runtime swapping of atoms throughout arbitrary values. Even excepting use of the explicit atom-swapping construct swap e and e0 in e00, there will likely be a very high number of transpositions required during the execution of the average syntax-manipulating program written in Fresh O’Caml. The majority (or even all) of these will arise from the deconstruction of abstraction values. We can subdivide the general problem of efficient implementation of swapping into two separate, but related problems: • how do we know when we must perform a swap; and • if a swap is essential, how do we make that operation fast? The following two sections consider these issues in turn.

6.6.1 When to swap? It is more difficult than it might at first seem to give a good answer to the question of when to swap, especially when that answer must be reflected in compiler code! Let us just consider the case of abstraction pattern-matches, which is the main area of concern. Here are some partial code fragments which exhibit some possible situations, for a data constructor C and a function F. In fragments 1 through 3, no freshening needs to be performed on the value arriving to be matched. In fragments 4 and 5, freshening needs to be performed but at different times during evaluation.

6The astute reader will note that the transformation is slightly evil—the resulting code does indeed only work successfully because the heap layout of abstraction values coincides (modulo the tag) with that for pairs. 6.6. Implementation of swapping 91

1. The right-hand side of an abstraction match does not contain occurrences of the pattern variables.

let f1 t = match t with ... | C (<>x) -> true | ...

Such situations can be identified by a simple static analysis and the compiler can omit the emission of freshening code.

2. An abstraction pattern contains a wildcard.

let f2 t = match t with ... | C (<>_) -> ... | ...

In this situation, we may still have to choose as many fresh atoms as there are atoms in the algebraic support of the value in binding position (here corresponding to the pattern variable a), even if we can avoid swapping. There are two possible cases:

• A ‘binding position only’ pattern of the form <> . Here, the programmer does not have access to the body so we will never need to perform any transpositions on that part of the incoming value. However, since a could potentially be used by the programmer7, it may be necessary to pick fresh atoms and perform some swaps in order to freshen the binding part of the incoming value. • A ‘body position only’ pattern of the form << >>x. In this case, the wildcard might as well not be there as the programmer may still make use of x in a situation where it needs freshening. Thus such patterns can be treated by picking a fresh pattern variable for the binding position and proceeding as normal.

3. The right-hand side of an abstraction match is such that no freshening of the incoming value needs to occur. For simplicity we consider a to be of bindable type.

let f3 t = match t with ... | C (<>x) -> (<>x, <>x) | ...

Informally speaking, the reason why we do not need to freshen in this specific case is because the result of the computation on the right-hand side of the match (the pairing operation) is well-defined no matter which atom is bound to a. Formally speaking, we do not need to freshen because the atom bound to a will never occur in the support of the denotation of the value obtained by evaluating the right-hand side of the match. This means that we do not have to allocate any fresh atoms upon the match—thus reducing turnover of atoms. 8 We approximate such a judgement by a freshness judgement a 6 (<>x, <>x),

where the 6 is read as ‘fresh for’. In the FreshML-2000 design[49], such judgements were an integral part of the static type system as it was thought that they were needed in order to ensure correctness properties such as those proved in Chapter 5. However, this turns out not to be the case. Section §6.8 examines freshness inference in detail. For now let us observe that it is a technology not currently exploited inside the Fresh O’Caml compiler but which could be potentially useful for this optimisation.

4. An abstraction match whose incoming value must always be freshened, but not neces- sarily straight away.

let f4 t = match t with ... | C (<>x) -> F x | ...

7This can be used to obtain the same effect as a fresh expression. 8We are running into problems of undecidability here, as we will discuss shortly. 92 6. Implementation of swapping

Here, we may not be able to guarantee that F never uses its argument. However, the act of just invoking the function F does not require access to the innards of x. Therefore, it would suffice to mark the value bound to x as having a pending permutation of atoms attached to it. When (and if) F uses the value, the permutation must actually be applied through the value as far as necessary to ensure all used parts are freshened. Situations like this are very common and theory has been developed—starting with an idea by Mark Shields (private communication)—to reason about such pending permuta- tions. We shall examine this further in §7.1.1. 5. An abstraction match whose incoming value requires immediate freshening: there is no option.

let f5 t = match t with ... | C (<>a’) -> a = a’ | ...

This code fragment could be rewritten using the O’Caml when qualification: another simple example of when the ‘immediate freshening’ situation could occur.

let f5 t = match t with ... | C (<>a’) when a = a’ -> true | ...

Apart from abstraction pattern-matches, the only other construct whose evaluation in- volves swapping of atoms is swap e and e0 in e00. In this case, the permutation may be able to be delayed depending on the surrounding context. Again, an optimisation can be made if the value to which e00 evaluates does not in fact contain any atoms.

6.6.2 How to swap? Having discussed when a swapping operation needs to be performed, we now turn to the issue of actually performing such an operation. Here, we assume the absence of pending permutations: see §7.1.1 for evaluation strategies in that setting. 0 In general, we will have two lists of atoms S = {a1, . . . , an}, S = {b1, . . . , bn} of equal length whose elements are to be exchanged simultaneously (in the sense that for each 1 6 k 6 n, ak is to be exchanged with bk) throughout some value v. At runtime, each atom will correspond to a heap block (viz. §6.3) containing an atom identifier. We thus traverse through the value v, copying the structure as we go9, looking out for atom blocks whose identifiers 0 which correspond to any in S or S . Upon finding a match (say the atom ak), we change the particular atom pointer in v to point at the new atom (in this case bk). Note that we are only changing pointers to atoms, not performing in-place rewriting of atom identifiers. A possible optimisation is to observe that in many cases, one of the lists (say S0) may in fact solely consist of atoms which are fresh for the current environment. (This usually occurs as a result of abstraction pattern-matching where it has been determined that fresh atoms must be allocated.) In this case the swapping is actually only a ‘fresh renaming’. Thus, upon encountering an atom in v we only need to look through the other list (in this case S) to determine if a swap must be performed upon a particular atom. It may also be worth considering methods of simplifying lists of atom-swappings into smaller ones—although it is not clear whether this will give any significant improvement in efficiency. Future benchmarking will be required on this front. So everything about doing swapping sounds fairly straightforward so far. Unfortunately, there is a thorn: (immutable) values on the O’Caml heap may be cyclic. Here is a fragment of an interactive session as a somewhat contrived, but illuminating example. # let x = fresh;; val x : ’a name = name_0 # let rec xs = x :: xs;; val xs : ’a name list = [name_0; name_0; name_0; name_0; name_0; name_0; name_0; name_0; name_0; ...]

9 Potentially highly inefficient: we say some words about this in §6.7.

The heap structure of the value xs would be as shown in the diagram below, where a is the atom identifier bound to the value identifier x. (The tag 1 corresponds to the list cons data constructor.)

[Diagram: a single cons block (tag 1) whose head field points to an atom block (Atom tag, identifier a) and whose tail field points back to the cons block itself.]

In order to cope with atom-swapping throughout such values, we could attempt to use a hash table. Upon encountering a block as we traverse the heap, we check to see if its address k exists as a key in the hash table. If it does not, then it is a previously-unseen block; we perform the swapping operation upon it and record the address of the resulting value10 v in the hash table, k ↦ v. Conversely, if the address k was found in the hash table then we just use the corresponding value v in the hash table as the new value (without delving any deeper into the block k). Note that it is essential to use a data structure with O(1) lookup, such as a hash table, to prevent massive performance degradation.

Unfortunately, the actual C implementation of this algorithm is far from straightforward. This is because during the process of swapping we are very likely to allocate new values on the O'Caml heap. Each such allocation may trigger a minor garbage collection, which may cause heap blocks to be moved. This is problematic since we are attempting to key a hash table on the addresses of such blocks. Even if we can determine which blocks have been moved, we would then need to re-shuffle the hash table, as the invariant mapping keys to entry positions in the table will have been broken.

Instead of performing such operations (which are not only costly but also tricky to implement), we adopt an alternative multi-pass solution. In the first pass, we scan the value in question and determine precisely which heap allocations will need to be performed. This is the only stage which actually requires the hash table in order to prevent looping. After the first stage, we have an array which not only gives the sizes of the allocations required but also indicates which blocks should be shared. In the second stage, we perform the allocations. Then in the third stage we traverse the value in exactly the same order as the first stage, at the same time iterating through the array to determine what to do with the next block which we encounter. At this point, we can perform the actual permutation—or if the block is shared with one encountered previously then we just use the previously-computed permutation of that block. This procedure ensures that the swapping algorithm does not loop on cyclic values.

Here is a short continuation of the previous session.

# let y = fresh;;
val y : 'a name = name_0
# match (swap x and y in xs) with b::bs -> (b = x, b = y);;
Warning: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
[]
- : bool * bool = (false, true)
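As an illustration of the traversal described at the start of this subsection, the following toy sketch in ordinary O'Caml performs a simultaneous swap over an algebraic value type. The type and function names are purely illustrative; atoms are modelled as plain integers, and sharing and cycles are deliberately ignored (the real implementation works in C directly on heap blocks and copes with cycles via the multi-pass scheme above).

(* Toy rendition of the swapping traversal: exchange the k-th element of ss
   with the k-th element of ts, simultaneously, throughout a value.  The two
   lists are assumed to be of equal length and disjoint. *)
type atom = int

type value =
  | Atom of atom
  | Block of string * value list     (* a constructed heap block *)
  | Abst of atom * value             (* an abstraction block *)

let swap_atoms (ss : atom list) (ts : atom list) (v : value) : value =
  let swap_atom a =
    let rec find = function
      | [], [] -> a
      | s :: ss', t :: ts' ->
          if a = s then t else if a = t then s else find (ss', ts')
      | _ -> invalid_arg "swap_atoms: lists of unequal length"
    in
    find (ss, ts)
  in
  let rec go = function
    | Atom a -> Atom (swap_atom a)
    | Block (c, vs) -> Block (c, List.map go vs)
    | Abst (a, body) -> Abst (swap_atom a, go body)
  in
  go v

The 'fresh renaming' optimisation mentioned above corresponds to the case where every element of ts is known to be fresh, so that only ss need be searched when an atom block is encountered.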

6.7 Preservation of sharing

An important concept in the implementation of functional languages is that of sharing: the re-use, possibly multiple times, of existing heap blocks to prevent unnecessary duplication. For example, the expression

10 One would hope that this address would be equal to that of k if there are in fact no atoms occurring inside the block k.

let a = fresh in let x = <<a>>a in (x, x)

will likely yield the following heap structure:

[Diagram: a pair block (tag 0) whose two fields both point to a single abstraction block (Abst tag); both fields of that block in turn point to a single atom block (Atom tag, identifier a1).]

where a1 is the atom identifier bound to the value identifier a. Note how we only use three heap blocks rather than the seven one might at first imagine. When manipulating the heap, it is desirable to preserve as much of this sharing as possible. Therefore when traversing a value in order to perform transpositions we should not copy unnecessarily. Unfortunately, when performing such a traversal we may have to un-share blocks (via copying) in order to produce a correct result. For example, if the above value is to be freshened completely then we must make sure that two atoms are allocated (assuming that the existing identifier a1 is not sufficiently fresh; even if it were, we would still have to allocate one more). Then, each component of the pair value must point to a different abstraction block like so:

[Diagram: a pair block (tag 0) whose two fields point to two distinct abstraction blocks (Abst tag); the fields of the first abstraction block point to an atom block with identifier a2, and those of the second to an atom block with identifier a3.]

where a2 and a3 are fresh atom identifiers. Note how, despite the freshening, some sharing has still been preserved. However, at the time of writing the Fresh O'Caml system does not implement such a scheme: it is a matter for later implementation work. With delayed permutations, of course, necessary un-sharing will only happen at the last minute—thus helping further with memory efficiency.
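For illustration, this kind of sharing is observable in ordinary O'Caml via physical equality; the snippet below is not specific to Fresh O'Caml and merely exhibits the phenomenon that the implementation tries to preserve.

(* Both components of the pair below are the very same heap block. *)
let shared = let x = Some 1 in (x, x)
let same = fst shared == snd shared                 (* true: one block, two pointers *)

(* Building the components separately allocates two distinct blocks. *)
let unshared = (Some 1, Some 1)
let distinct = not (fst unshared == snd unshared)   (* true: two separate blocks *)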

6.8 Freshness inference

6.8.1 The motivation for a static analysis

In §6.6.1 we saw an example of a freshness judgement; we postulated that this could be exploited in order to perform an optimisation. Such freshness judgements can be produced using a certain static analysis known as freshness inference which is performed in parallel with type-checking. In this section we shall examine this analysis in detail, mostly recalling earlier work[49] but adding some extra discussion and insights gained from implementing a system11 which made use of the analysis. When designing that system, it was thought that freshness

11 The FreshML-2000 interpretive system.

inference was necessary in order to obtain correctness properties such as those in Chapter 5. It was only the work of Cardelli et al.[9] which caused the first doubts to be raised about it—leading to its subsequent deprecation. As we noted in Chapter 1, we often use the Barendregt variable convention to reason about programs operating on α-equivalence classes of object language syntax: we simply choose ‘suitably fresh’ names at runtime when necessary. Informally speaking, the result of such a computation appears to be independent of which particular bound names have been chosen just because the denotation of the final value does not include those names in its support. When computing in Fresh O’Caml, these ‘bound names’ are atoms in binding position and so we should be interested in whether results of computations include certain atoms in the denotation of their supports. If the denotation of a result contains an atom in its support which has been dynamically generated at runtime (say by evaluation of fresh), then that effect will be visible to any surrounding context. If the atom is not in the support of the denotation of the result, then no side-effects arising from the generation of the atom will be visible. The aim of freshness inference is to build up a sound approximation to this notion of ‘not in the support of’ by performing a static analysis which yields a relation between value identifiers

and expressions. A freshness judgement such as x # e (read 'x fresh for e') is intended to mean that e will always evaluate to a value v such that a ∉ supp(V[[v]]), where a is the atom bound to the value identifier x at runtime and V[[v]] is the denotation of the value v. The upshot of this analysis is that those program phrases which have observable effects of creating fresh atoms may be rejected at compile-time, thus making the dynamic allocation of atoms referentially transparent. (A convenient consequence of this is that reasoning about programs becomes more straightforward due to the lack of these side-effects.) For example, take once again the untyped λ-calculus with the corresponding Mini-FreshML datatype declaration from §5.7 and consider the expressions

let x = fresh in Lam(<>x) and let x = fresh in Var(x).

The result of evaluating the first expression is a value <>a (for some atom a) which clearly corresponds to the closed object language term λx. x. The dynamic allocation of the atom a has no visible external effect, as no matter which a is chosen the same value is obtained. However, the second expression simply evaluates to the representation of 'a fresh object language variable': which particular atom has been allocated may be determined from

the outside. Writing ¬# to mean 'not fresh for', the procedure of freshness inference would yield the judgements that

x # let x = fresh in Lam(<>x)

x ¬# let x = fresh in Var(x)

which would cause the type-checker to reject the second expression. Another good example of the lack of referential transparency in Fresh O'Caml is the function bound_vars of §2.2.3, whose FreshML-2000 equivalent would have been rejected at compile time.
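To make the contrast concrete, the two expressions above might be rendered in Fresh O'Caml roughly as follows; the datatype declaration and constructor names (including App) are an illustrative sketch rather than the declaration of §5.7.

type t and var = t name;;

type lam = Var of var
         | Lam of <<var>>lam
         | App of lam * lam;;

(* No observable effect: the dynamically allocated atom ends up only in
   binding position, so it is not in the support of the result. *)
let ok = let x = fresh in Lam (<<x>>(Var x));;

(* The fresh atom escapes: which particular atom was chosen is observable. *)
let leaky = let x = fresh in Var x;;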

6.8.2 Static semantics with freshness inference

How is freshness inference actually performed? From a 'logical' rather than an 'inference' point of view, we can express the algorithm as a set of axioms and rules as illustrated in Figure 6.1. In that Figure, we see that the normal system of typing judgements for a language similar to Mini-FreshML has been extended with freshness judgements. In the new system, the standard typing relation ⊢ on 3-tuples (Γ, e, τ) is transformed to another on 5-tuples (Γ, ∇, e, τ, X) where ∇ is a freshness context and X is a finite set of value identifiers. Freshness contexts are finite binary relations on value identifiers, typical element x # x'. The intention here is to carry along with us some information about the freshness of one value identifier relative to another in order that we can use it at the top of the rule trees (at axiom (vid) in particular). The '_' symbol corresponds to a 'don't care'.

Before identifying the other notation used in the rules, note that the first component of each member of a freshness context (here x) must be of type name. For otherwise it makes no sense to consider whether the atom assigned to it at runtime may occur in the support of the denotation of the value assigned to x'. Furthermore, if we have x # x' contained within some

[Figure 6.1: Static semantics with freshness inference. The figure gives the rules (vid), (con), (pair), (abst), (swap), (fun), (app), (let), (let-fresh), (let-pair), (let-abst) and (match) for judgements of the form Γ; ∇ ⊢ e : τ # X, where ∇ is a freshness context and X is the finite set of value identifiers judged fresh for e; the rules for pairs, abstractions and functions are discussed in the text below.]

freshness context ∇ then we may well be interested in whether x' # x. This of course only makes sense if x' is of type name—but this is perfectly possible. Therefore, we must ensure that the freshness contexts are always 'symmetrised' after they are extended, in order to expedite lookups. To do this we employ the following notation, which is used in the Figure.

• Given a finite set of value identifiers X we write ∇, x #Γ X (where the identifier x of type name occurs neither in X nor ∇) to indicate that freshness context judging x # y for each y ∈ X, judging y # x for each y ∈ X such that Γ(y) = name, and acting as ∇ otherwise.

• For clarity we write ∇, x #Γ to mean ∇, x #Γ dom(Γ).

• In rule (vid), we use the notation ∇(x) to indicate the finite set of value identifiers {x' | (x, x') ∈ ∇}.

Note how we have to incorporate let x = fresh in e as a single binding construct rather than just providing a fresh expression as in Mini-FreshML. The reason is that it is necessary to obtain the exact value identifier which has been declared fresh in order that freshness judgements relating to that identifier may be produced. Similarly, we must insist that abstraction expressions have the form <<x>>e rather than <<e'>>e. Having such constraints imposed on the syntax of the language is rather a nuisance. However, in the old FreshML systems this was not so much of a problem, for the 'declaration/expression' split present in the Standard ML language[36] (and not reflected in Mini-FreshML) fits quite well with freshness inference. In this setting, the fresh construct becomes a declaration rather than an expression. Such declarations then elaborate to an environment paired with a set of identifiers which should be treated as fresh throughout the scope of the declaration.

Let us now run through a few of the rules to see the logic behind them. For value identifiers, we simply use the current freshness context to determine which other identifiers are fresh for the one in question. Constructed values are similarly straightforward. However, pair values are more interesting and indeed better understood from an 'inference' point of view. Thinking in this way, we can re-jig the rules somewhat to better express how a freshness inference algorithm would work—just as we could when describing a traditional type system. For example, we could rewrite the rule for pairs as follows:

    Γ; ∇ ⊢ e : τ # X1      Γ; ∇ ⊢ e' : τ' # X2
    --------------------------------------------  (pair)
        Γ; ∇ ⊢ (e, e') : τ × τ' # X1 ∩ X2

This rule tells us that in order to deduce what is fresh for a pair (e, e'), we calculate what is fresh for each component and then take the intersection of the two. This gives the maximal set of value identifiers fresh for all sub-expressions (this can be thought of as being analogous to the concept of 'most general unifier').

Such an algorithm inherently provides relatively coarse-grained freshness information, which could possibly be alleviated by using 'restricted types' τ # X throughout the typing process. Recent work by Schöpp and Stark[56] presents an intriguing type system which makes use of such types with inbuilt freshness information, which they call 'freefrom types'. This (dependently-typed) system enables them to express abstract syntax both in 'FM' style as used in this thesis and in the style of higher-order abstract syntax.

When dealing with an abstraction expression, we simply type-check the body and then deem the identifier in binding position (call it x) to be fresh for the whole abstraction expression. This corresponds to the evaluation strategy: at runtime, this identifier will correspond to a fresh atom a which occurs in binding position inside an abstraction value (call it <<a>>v) and therefore never occurs in the support of its denotation.

Finally, let us examine the rule for recursive functions. As one might expect, deciding 'not in the support of' is undecidable at function types since it relies on extensional equality of functions (recall that a function f does not contain an atom a in its support just when for each x ∈ dom(f), (a a') · (f(x)) = f((a a') · x) for some/any fresh atom a'). Therefore when performing freshness inference we need to produce a sound approximation in order to get a decidable static analysis. Unfortunately, as noted by Pitts and Gabbay[49], freshness is not a logical relation: just because an atom is fresh for a particular value v and also fresh for the result f(v) does not mean that it is fresh for the function. This is due to the fact that functions can contain concealed atoms within their support. For example consider the function fun f(x) = <<a>>x, where a is some value identifier mapping to an atom a. Whilst a is always fresh for the result of the function, it is not fresh for the function itself. So in the procedure of freshness inference, we simply judge that if an identifier is fresh for all of the function's free identifiers then it is fresh for the function itself. This is a sound, albeit over-conservative, approximation which we have to live with.

The difficulty with this particular approximation is that it renders many apparently well-typed programs untypeable. (Indeed, many programs that really do satisfy the necessary conditions are not passed by the inference algorithm.) A good example is provided by the following function, which we saw in Chapter 2. It calculates the free variables of a PLC term.

let rec free_vars t = match t with
    Tvar x -> [x]
  | Tlam (ty, <<x>>t') -> remove x (free_vars t')
  | Tgen (<<a>>t') -> free_vars t'
  | Tapp (t1, t2) -> (free_vars t1) @ (free_vars t2)
  | Tspec (t', ty) -> free_vars t';;

Due to the semantics of remove, any atom assigned to the pattern variable x inside the Tlam clause will always be fresh for the right-hand side of the match. Unfortunately, the 98 6. Freshness inference

freshness inference algorithm cannot deduce this—which would mean that the FreshML-2000 version of this function would be rejected. Intuitively, we can see that the function will be thrown out because it requires evaluation to determine that x is never in the support of remove x (free_vars t'). To get around this problem, one can conceal the atom generated by the pattern-match on Tlam inside an abstraction, as follows.

let rec free_vars t = match t with
    Tvar x -> [x]
  | Tlam (ty, <<x>>t') -> remove (<<x>>(free_vars t'))
  | Tgen (<<a>>t') -> free_vars t'
  | Tapp (t1, t2) -> (free_vars t1) @ (free_vars t2)
  | Tspec (t', ty) -> free_vars t';;

This of course necessitates the refactoring of the remove function to have type <<'a>>('a list) -> 'a list. How grotesque.
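To give a feel for the shape of the analysis just described, here is a toy sketch in ordinary O'Caml of the freshness-set computation for a tiny fragment of the language. It is merely an illustration of the rules discussed above (pairs intersect, abstractions add their binder, functions are treated over-conservatively), not the FreshML-2000 implementation.

module S = Set.Make (String)

type expr =
  | Vid of string                  (* value identifier *)
  | Pair of expr * expr
  | Abst of string * expr          (* <<x>>e, with x of type name *)
  | Fn of string * string * expr   (* fun f(x) = e *)

let rec free_ids = function
  | Vid x -> S.singleton x
  | Pair (e1, e2) -> S.union (free_ids e1) (free_ids e2)
  | Abst (x, e) -> S.add x (free_ids e)
  | Fn (f, x, e) -> S.diff (free_ids e) (S.of_list [f; x])

(* [scope] is the set of identifiers in scope; [ctx x] plays the role of the
   freshness context, giving the identifiers judged fresh for x. *)
let rec fresh scope ctx = function
  | Vid x -> ctx x
  | Pair (e1, e2) ->
      S.inter (fresh scope ctx e1) (fresh scope ctx e2)   (* intersect, as in (pair) *)
  | Abst (x, e) ->
      S.add x (fresh scope ctx e)                         (* the binder is fresh for the whole *)
  | Fn (f, x, e) ->
      (* over-conservative, as in (fun): fresh for the function only if fresh
         for all of its free identifiers *)
      let fvs = S.diff (free_ids e) (S.of_list [f; x]) in
      S.fold (fun v acc -> S.inter acc (ctx v)) fvs scope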

6.8.3 Purity analysis

In order to produce anything near a sensible approximation to the notion of 'not in the support of', the freshness inference algorithm must proceed not only on the structure of expressions but also on the structure of inferred types12. As an example consider the following function, defined for some arbitrary expression e of type unit × unit.

let f p = match p with <<a>>x -> e

Due to the recursion-theoretic problems which we described in the previous section, there is no way that a plain freshness inference algorithm can deduce that the atom bound to a is always fresh for e. The idea of purity checking is to make deductions of the form

'no atoms can ever occur in a value of type unit × unit, so I know that x # e for each identifier x in scope.'

We say that types such as unit × unit are pure. With such judgements, we can extend the freshness inference algorithm to accept more expressions. In order to make this of much use, however, the procedure of purity checking must also cope when the type in question is a user-defined, possibly-recursive datatype. How can one deduce whether atoms ever occur inside a value of such a type? We invented a solution analogous to that used by Standard ML[36] to compute whether such a type admits equality. We start by asserting that values of each of the datatypes being (mutually) declared may never contain atoms. Then, one-by-one until there are no more changes, we remove any such judgement which may be contradicted by examining the types of the various data constructors. In this way, we converge on a greatest fixed point. Since FreshML permits multiple user-defined types of bindable names, we also refine the judgement a little further so we can deduce that values of some datatype δ may never contain occurrences of atoms of bindable type τ, but might possibly contain occurrences of atoms of some other bindable type τ'. Improvements such as these increase the number of programs accepted by the compiler and were found to be essential in practice. However, when a program cannot be accepted it can be difficult for the programmer to understand just why this is the case: the combination of type inference, freshness inference and purity analysis can be formidable.
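The greatest-fixed-point computation can be sketched in ordinary O'Caml as follows; the representation of types and declarations is an illustrative simplification, not the compiler's own.

(* Decide which of a group of (mutually declared) datatypes can never contain
   atoms: assume them all pure, then repeatedly discard any assumption that is
   contradicted by a constructor argument type. *)
type ty =
  | Name                   (* a type of bindable names: never pure *)
  | Unit
  | Prod of ty * ty
  | Abst of ty * ty        (* <<t1>>t2: always contains an atom *)
  | Data of string         (* a reference to one of the declared datatypes *)

(* Each declared datatype is paired with the argument types of its constructors. *)
let pure_datatypes (decls : (string * ty list) list) : string list =
  let rec pure assumed = function
    | Name | Abst _ -> false
    | Unit -> true
    | Prod (t1, t2) -> pure assumed t1 && pure assumed t2
    | Data d -> List.mem d assumed
  in
  let rec iterate assumed =
    let assumed' =
      List.filter
        (fun d -> List.for_all (pure assumed) (List.assoc d decls))
        assumed
    in
    if List.length assumed' = List.length assumed then assumed
    else iterate assumed'
  in
  iterate (List.map fst decls)

The per-bindable-type refinement mentioned above amounts to running this computation once for each type of bindable names.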

6.8.4 A note on denotational semantics

We should observe that freshness judgements inside a context ∇ may be reflected in a denotational setting by taking subsets of the usual denotations of typing contexts. For we can define the FM-cppo

[[Γ; ∇]] def= {ρ ∈ [[Γ]]↓ | ∀(x, x') ∈ ∇. ρ(x) ∉ supp(ρ(x'))} ∪ {⊥}

12 This causes problems when polymorphic types are present in the language, as they can inhibit the inference of any purity information at all.

which is a subset of [[Γ]] (and inherits its permutation action). As an item of possible future work, we postulate that freshness judgements could be incorporated into our continuation-based semantics by providing functions

V[[Γ; ∇ ⊢ v : τ # X]] ∈ [[Γ; ∇]] ⊸ [[τ]]

S[[Γ; ∇ ⊢s S : τ # X]] ∈ [[Γ; ∇]] ⊸ [[τ]]⊥

E[[Γ; ∇ ⊢ e : τ # X]] ∈ [[Γ; ∇]] ⊸ [[τ]]⊥⊥ .

However, it is not clear that such an extended semantics would be particularly useful for proving correctness properties (unless freshness inference makes it back into the language, which is very unlikely). A more likely application would be to validate the correctness of any optimisations which might make use of freshness inference.

6.8.5 An abstraction monad

An interesting stepping-stone along the way to discarding freshness inference was the discovery that a monadic style of programming could be used to conceal the side-effects of generating fresh names. The original motivation for wanting such a monad was simply to get FreshML-2000 programs through the type-checker. The monad may be encoded in Fresh Objective Caml13 in the style of a Kleisli triple, thus:

type ('a,'b) abst = AMunit of 'b | AMabst of <<'a name>>(('a,'b) abst);;

let am_unit v = AMunit v;;

let am_let d f = match d with
    AMunit v -> f v
  | AMabst (<>d') -> AMabst (<>(am_let d' f));;

An alternative (and, we suggest, more readable) encoding could make use of a more general abstraction type as follows.

type ('a,'b) abst = AMabst of <<'a name list>>'b;;

let am_let = function (AMabst (<>v)) -> function f ->
  match (f v) with AMabst (<>v') -> AMabst (<>v');;

The am_let function has the following type, as would be expected.

am_let : ('a, 'b) abst -> ('b -> ('a, 'c) abst) -> ('a, 'c) abst

We can now conceal name generation by using the monad’s data constructors to wrap up dynamically allocated names, coupled with am let to sequence side-effecting computations. However, if multiple types of bindable names are involved we have to extend the monad in order to cope. It is also convenient to add some extra helper functions corresponding to the monad unit and multiplication. For example:

type ('a,'b,'c) abst = AMabst of <<'a name list * 'b name list>>'c;;

let am_unit v = AMabst (<<[],[]>>v);;

let am_mult (AMabst (<>(AMabst (<>v)))) = AMabst (<>v);;

let am_let = function (AMabst (<>v)) -> function f ->
  match (f v) with AMabst (<>v') -> AMabst (<>v');;

13 Originally, we were working with Standard ML-style syntax—but we stick to Fresh O'Caml syntax here for consistency with the rest of this thesis.

Using this extended monad, we can now present a new version of free_vars (the original of which is presented on page 97) which would get through freshness inference.

let rec free_vars t = match t with
    Tvar x -> am_unit [x]
  | Tlam (ty, <<x>>t') ->
      am_mult (AMabst (<<[x],[]>>
        (am_let (free_vars t')
           (fun v -> AMabst (<<[x],[]>>(remove x v))))))
  | Tgen (<<a>>t') -> am_mult (AMabst (<<[],[a]>>(free_vars t')))
  | Tapp (t1, t2) ->
      am_let (free_vars t1) (fun v1 ->
        am_let (free_vars t2) (fun v2 ->
          am_unit (v1 @ v2)))
  | Tspec (t', ty) -> free_vars t';;

The type of free_vars is now as follows. Note the effect of using generative type declarations such as those in §2.1.1: the arguments of the type constructor name may be easily confused.

free_vars : term -> (t, t, var list) abst

We leave it to the reader to decide for themselves which is the most attractive gargoyle: this fragment or the refactored one on page 98.

7 Conclusions and future work

‘We can only see a short distance ahead, but we can see plenty there that needs to be done.’ —Turing

It is now time to take a step back and assess the work in this thesis. We believe that our major contributions have been:

• the development of a full-scale language, Fresh O’Caml, for metaprogramming with bindable names;

• the development of FM-domain theory, for reasoning about names and name binding;

• a denotational semantics using frame-stack semantics and a monad of continuations to model name generation;

• a detailed proof of correctness properties for Mini-FreshML.

So far, Fresh O'Caml has proved itself very useful for the rapid prototyping or first implementation of syntax-manipulating algorithms. However, much remains to be done. The implementation is not yet in a production-ready state and there are still some significant issues to be resolved. These include the improvement of efficiency, which we shall address again later in this chapter, and language enhancements such as a more pragmatic treatment of reference cells (viz. §2.6). We believe that the mathematical theory developed in Chapter 4 is more generally applicable beyond the scope of a denotational semantics for a Mini-FreshML-like language. There are already two examples of wider application within the literature: that which we cited in §1.2 by Abramsky et al.[2], which applies FM-domain theory to the world of game semantics; and that by Benton and Leperchey[6], which applies it to reason about mutable store. The denotational semantics which we presented in Chapter 5 is, however, rather more specialised to our particular setting. Whilst we do claim that it is good for proving properties such as those we have seen, there are clearly open questions—particularly in the areas of full abstraction and the validation of certain troublesome equivalences such as those mentioned in §5.6. However, use of the continuation monad to model name generation is most definitely a generally-applicable tool to use in conjunction with FM-domain theory—see [6] again for example.

7.1 Future work

7.1.1 Delayed permutations

We noted in §6.6.1 that a scheme of delayed permutations (somewhat reminiscent of [1]) could be used to improve efficiency—indeed, this is one of the major speed improvements pending implementation in Fresh O'Caml. In this section we examine how delayed permutations could be introduced to Mini-FreshML, working from an idea originally due to Mark Shields (private communication). The presentation is much simplified by adopting an environment-style semantics (described in §3.5). What we are going to do is to partition the canonical forms into two varieties: those equipped with a pending permutation which is manifest on the outside and those without. Canonical forms in the second category may still contain pending permutations deeper within

their structure; but they may be immediately examined at the top level without any atom-swapping being required. So the new grammar of values is as follows:

v  ::= ()                          unit
     | a                           atom
     | Ck(vp)                      data construction
     | (vp, vp)                    pairing
     | <<a>>vp                     abstraction
     | [E, fun f(x) = e]           recursive function closure

vp ::= π ∗ v                       value with delayed permutation

where π stands for a permutation generated like so:

π  ::= []                          null permutation
     | (a a') :: π                 atom-swap

and where E is a finite map from value identifiers to values with delayed permutations vp. We write π(a) to indicate the bona-fide application of a permutation to an atom. Given a value with delayed permutation vp = π ∗ v and any permutation π' we write π' ∗ vp to be the value (π' @ π) ∗ v, where π' @ π is that permutation whose action first permutes as for π and then as for π'.

The grammar for (non-canonical) expressions is as in Figure 3.1, with the exception that naked atoms a no longer appear. (This is a pleasing consequence of using an environment-style semantics: the canonical forms do not have to be a subset of the expressions. If using substituted-in style, then the necessity to delay permutations throughout closures means that the grammar for non-canonical forms needs extending with delayed permutations as well: this rapidly becomes very difficult to deal with.)

The semantics makes use of an auxiliary function cf(–) which takes a value with delayed permutation vp and pushes the permutation through the first level of its structure, thus making the outermost constructor manifest. This transformation is defined as follows:

cf(π ∗ ())                     def=  ()
cf(π ∗ a)                      def=  π(a)
cf(π ∗ Ck(vp))                 def=  Ck(π ∗ vp)
cf(π ∗ (vp, vp'))              def=  (π ∗ vp, π ∗ vp')
cf(π ∗ <<a>>vp)                def=  <<π(a)>>(π ∗ vp)
cf(π ∗ [E, fun f(x) = e])      def=  [{(x, π ∗ vp) | (x, vp) ∈ E}, fun f(x) = e].
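A minimal O'Caml rendering of this value representation and of cf(–) might look as follows. Atoms are modelled as plain integers, and environments, closures and the evaluator itself are elided; the names are illustrative only.

type atom = int
type perm = (atom * atom) list            (* a composition of transpositions *)

(* Apply a permutation to an atom; the rightmost transposition acts first. *)
let rec apply_perm (pi : perm) (a : atom) : atom =
  match pi with
  | [] -> a
  | (b, c) :: rest ->
      let a' = apply_perm rest a in
      if a' = b then c else if a' = c then b else a'

type v =
  | Unit
  | Atom of atom
  | Con of string * vp
  | Pair of vp * vp
  | Abst of atom * vp
and vp = perm * v                         (* pi * v: value with delayed permutation *)

(* pi * vp: compose a further permutation onto an already-delayed one. *)
let compose (pi : perm) ((pi', v) : vp) : vp = (pi @ pi', v)

(* Push the delayed permutation through one level of structure, making the
   outermost constructor manifest. *)
let cf ((pi, v) : vp) : v =
  match v with
  | Unit -> Unit
  | Atom a -> Atom (apply_perm pi a)
  | Con (c, vp1) -> Con (c, compose pi vp1)
  | Pair (vp1, vp2) -> Pair (compose pi vp1, compose pi vp2)
  | Abst (a, vp1) -> Abst (apply_perm pi a, compose pi vp1)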

Figure 7.1 provides a revised big-step semantics using an evaluation relation ⇓p on 5-tuples (a, E, e, v, a'), where a, a' are finite sets of atoms, E is an environment as described above, e is an expression and v a value with its outermost constructor manifest. We write a, E ⊢ e ⇓p v, a' iff (a, E, e, v, a') is in the relation. Note that the procedure of purity analysis discussed in §6.8.3 could be applied to discard unnecessary pending permutations. For, given a value π ∗ v of type τ, we can simplify it to [] ∗ v if τ is pure. However, implementation of this will require more type information in the runtime heap than is present when using current versions of Fresh O'Caml.

We now wish to relate the new scheme of evaluation to the original one used in Chapter 5. To do this, define the function erase, mapping values with their outermost constructor manifest to canonical forms containing no permutations (in the sense of Figure 3.1), as follows.

erase(())                      def=  ()
erase(a)                       def=  a
erase(Ck(vp))                  def=  Ck(erase(cf(vp)))
erase((vp, vp'))               def=  (erase(cf(vp)), erase(cf(vp')))
erase(<<π ∗ a>>vp)             def=  <<π(a)>>erase(cf(vp))
erase([E, fun f(x) = e])       def=  [{(x, erase(cf(vp))) | (x, vp) ∈ E}, fun f(x) = e].

We now claim the following, whose proof is left to future work.

[Figure 7.1: Big-step semantics with delayed permutations. The figure defines the evaluation relation a, E ⊢ e ⇓p v, a' by rules (vid), (unit), (con), (fresh), (pair), (abst), (swap), (fun), (app), (let), (letpair), (letabst), (if) and (match); delayed permutations are attached to values at abstraction pattern-matching and at swap expressions, and cf(–) is applied wherever a rule needs the outermost constructor of a value.]

Conjecture 7.1.1. Given any environment E, write e[E] for that expression formed by substituting E(x) for x throughout e, for each x ∈ dom(E). Then for any Mini-FreshML evaluation judgement a, e[E] ⇓ v, a' there exists at least one judgement a, E ⊢ e ⇓p p, a' with v = erase(p). Moreover, given a judgement a, E ⊢ e ⇓p p, a' there exists a unique judgement a, e[E] ⇓ erase(p), a'.

Note that the conjecture is carefully worded to take into account the fact that erase(–) is not injective: there may exist multiple values with delayed permutations which map onto the same canonical form. (Take for example atoms a, a' and consider the two values [] ∗ a and ((a a') :: []) ∗ a'.)

7.1.2 Data structures and algorithms

Within the Objective Caml standard library are many useful modules implementing standard data structures and their associated algorithms. Here we shall single out just two: hash tables

and finite maps (the latter being implemented using balanced binary trees). These are worthy of study due to the issues which arise when attempting to store atoms within them. This is a very common occurrence: for example, we often wish to represent an environment inside an evaluator using a map from atoms to values. Unfortunately, the interactions between such data structures and our freshness features are non-trivial to say the least. In particular, there are two serious problems which arise:

1. There is no ordering on atoms, which inhibits the writing of comparison functions (which are needed in order to use the Map.Make functor, for example).

2. Data structures such as hash tables and balanced binary trees are not equivariant constructions, in some sense: blindly swapping atoms throughout them destroys their invariants. In the case of a hash table keyed on atoms, for example, a swapping operation may require values to switch buckets (in the case where the transpositions affect the keys in such a way as to make them correspond to alternative buckets).

A corollary of point 1 above is that there is no immediate way to 'hash atoms' (say for calculating hash table bucket numbers). Recall (§6.3) that atoms inside the runtime system are represented by integers. This can actually be exposed at the language level1 through the use of the function Hashtbl.hash. (For example, this could be used as the key for a balanced binary tree mapping from atoms to pieces of data; see the sketch at the end of this subsection.) We conjecture that the language still correctly represents α-equivalence classes of object language terms even in the presence of an atom-hashing primitive (because it cannot be used to bypass the procedure for deconstructing abstraction values), although this is still not entirely clear and it remains an item of future work to rigorously prove it.

Now what about the problem of fundamentally non-equivariant constructions? It appears that outsourcing some of the swapping operations to the user level may help here. Basically, the idea is to enable the programmer of one of the offending modules to specify a function which should be called by the runtime system whenever a swapping operation must be performed upon such a data structure. The programmer would have to ensure that such a function satisfied the necessary algebraic identities (viz. those in §4.1). Recent work by Sewell, Wansbrough et al. on the Acute project[57], which has produced around 25,000 lines of Fresh O'Caml code, seems to uphold our belief that this proposal would be useful. Using the hash table example, the programmer could specify a function which ensures that values are moved between buckets if the swapping affects the keys of the table. Similar transformations could be effected if balanced binary trees are keyed on atoms. Depending on the particular data structure in question, the programmer may wish to delay swapping operations until they are really needed in order to improve efficiency. At the current time, there is some very experimental support (through an undocumented language construct known as custom swap) for this user-level swapping. It is likely that this will be made robust in the near future. Hopefully, this will make the O'Caml standard library much more useful for Fresh O'Caml programmers.
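As promised above, here is a sketch of the Hashtbl.hash idea; it is purely illustrative. Hashtbl.hash is not injective in general, so hash collisions would make this ordering unsound, and a robust version would have to deal with them. The bindable type declaration follows §2.1.1.

type t and var = t name;;

module VarOrd = struct
  type t = var
  (* Order atoms by their hash values; adequate only as a sketch. *)
  let compare a b = compare (Hashtbl.hash a) (Hashtbl.hash b)
end

module VarMap = Map.Make (VarOrd)

(* For example, an evaluator environment from object-language variables to values. *)
type 'v env = 'v VarMap.t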

7.1.3 Objects and modules

Objects and modules are fundamental parts of the O'Caml language and important tools for structuring programs, so we must say a few words about them. How do such features interact with our facilities for handling binding? In the case of objects, the interactions have not been studied: it is an item of future work which would be worth pursuing. As for modules, we believe that the system is well-behaved, although no formal correctness results have been proven. It is however certainly the case that swapping may occur on values of abstract type with no difficulties, because the values really are concrete when seen by the runtime system.

7.1.4 Reference cells

We saw in §2.6 that the addresses of reference cells are treated as being pure. A major item of future work is to develop the necessary theory to change this view of references, which comes down to formulating a suitable notion of swapping on reference cells. Naïve solutions do not

1 Well, in fact they can be exposed using functions inside the Obj module as well. However, such operations are naughty and enable you to break pretty much any safety property. . .

Fragment 1:

let a = fresh in
let r = ref a in
let x = <<a>>r in
match x with <<p>>q -> !r = a;;

Fragment 2:

let a = fresh in
let x = <<a>>(ref a) in
match x with <<p>>q ->
match x with <<p'>>q' -> p' = !q';;

Figure 7.2: Examples involving mutable state.

work: if the result of swapping atoms a, a' on a reference cell containing a were to be the same reference cell with the contents changed to a', then highly unpredictable behaviour would result. Imagine for a few moments that we use this semantics and consider the fragments in Figure 7.2. The result of evaluating both code fragments would in fact be false. In the first, the pattern-match causes side-effects not only in the generation of the fresh name to be bound to p, but also in updating the reference cell. The second example is even more pertinent: after the first match, the value x will be of the form <<a>>(ref a') for atoms a, a'—the binding structure has been destroyed. The second match enables us to identify this.

We believe that a sensible approach might view references as presenting particular 'views' onto graph structures. Whilst a notion of α-equivalence on such structures should not be problematic to define, there are difficulties arising from the runtime pointer-chasing which this could involve. Discussions with Xavier Leroy identified a possible solution which would exploit the delayed permutations discussed in §7.1.1. The idea is that transpositions may be applied to reference addresses, but remain delayed there until the cell is dereferenced. At that time, a copy of the reference's contents will be returned with the pending permutation now attached to it. This provides multiple 'views' onto the graph structure, each of which may be equipped with different permutations. One difficulty with this scheme, however, is working out how to formulate a suitable notion of equality on references. At first sight, it seems as if the way to compare two reference addresses for equality would be to proceed as normal, discarding any pending permutations. However, then the situation could arise where two reference addresses are equal but their contents differ (due to different pending permutations being attached to each address, which are to be applied only at dereference-time). Future work must establish whether a sound scheme based on this proposal is viable and how difficult it would be to implement.

7.1.5 Enhanced abstraction types

Another fruit of the work on the Acute system[57] has been some ideas relating to enhanced abstraction types. The first proposal is to introduce a distinguished type constructor, perhaps called nobind, which identifies points in syntax trees under which the transpositions introduced by deconstructing abstraction values would be nullified. (One could also consider having the opposite situation, where abstraction expressions do not by default represent binding constructs. To indicate an occurrence of a bound name in the syntax tree, one would have to tag the corresponding part of the type declaration with a distinguished constructor bind.) To see how all of this works, consider a grammar for patterns which can contain both pattern variables and type variables (the latter encapsulated inside explicit type annotations). Both of these varieties of variable would likely be represented in Fresh O'Caml by different types of bindable names. The patterns could then occur within a syntax tree representing a match construct, say, which only binds pattern variables and not type variables. The new scheme permits an encoding as follows.

type t and var = t name;;
type t and tyvar = t name;;

type ty = Tvar of tyvar | ... ;;
type pat = Pvar of var | Ptyped of pat * (ty nobind) | ... ;;
type expr = Ematch of expr * ((<<pat>>expr) list) | ... ;;

Now one can have only the pattern variables freshened up when deconstructing a data value built with Ematch.

A second proposal relates to improvements in efficiency. In a large-scale system, it is clear (and has been experimentally verified) that the overhead of atom-swapping throughout syntax trees is significant. Even if the method of delayed permutations (§7.1.1) were to be implemented, it would still be highly desirable to minimise the amount of swapping required. One means of doing this would be to allow the programmer to specify that swapping operations should never look inside certain parts of values. The particular example cited by Wansbrough (private communication) concerns a type declaration for abstract syntax trees containing closures. Such a closure is designed to always have empty support (since it encapsulates all of the free variables of the relevant function within it). Therefore no swapping operation ever needs to propagate inside. However, there is currently no way of preventing this. It is thought that a distinguished type constructor could be used to identify parts of values which are to be bypassed in this way. The nobind constructor from above may suffice for this purpose. Of course, it is then up to the programmer to ensure that the special tag is used sensibly—rather than the language ensuring safety—but we do not believe that incorporating such a feature would be dangerous (in the sense of directly compromising the correctness properties of Chapter 5).

7.1.6 Standard library enhancements

There is much to be said for minimising the number of language primitives which are hard-coded into the Fresh O'Caml compiler. This can be done by moving functionality into the standard library, making it easier to migrate the compiler modifications to new versions of the O'Caml system. It is not quite clear where the divide between inbuilt and library functionality should lie, however. One reasonable scenario would be to keep the expressions for generating fresh names together with those for constructing and deconstructing abstractions as inbuilt features. The swap and freshfor keywords would then be replaced by functions in the standard library. Alongside these modifications, additional library functions could usefully be provided. A function permitting the application of non-trivial permutations of atoms (rather than just single swaps) to values could be useful; meanwhile, both this and the normal swapping function could be specialised to variants which provide 'fresh renaming' as a convenience. It might also be useful to incorporate a function which takes an arbitrary value and returns its algebraic support. (Such a function already exists inside the runtime system, but is not available to the programmer.) However, it is not clear what type such a function could be assigned: the atoms throughout the value may be of different bindable types. It is possible that a more advanced sorting system for atoms (viz. §2.1.1) would help here.
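Purely as a sketch of what such a library interface might look like (none of these functions currently exists under these names), one could imagine a signature along the following lines.

module type NOMINAL = sig
  (* apply the transposition of the two given atoms throughout a value,
     replacing the current swap keyword *)
  val swap : 'a name -> 'a name -> 'b -> 'b

  (* the test currently provided by the freshfor keyword *)
  val freshfor : 'a name -> 'b -> bool

  (* apply an arbitrary permutation, given as a list of transpositions *)
  val permute : ('a name * 'a name) list -> 'b -> 'b

  (* specialised 'fresh renaming': replace each of the given atoms by a
     freshly generated atom throughout the value *)
  val rename_fresh : 'a name list -> 'b -> 'b
end

The support-returning function discussed above is deliberately omitted from this sketch, since, as noted, it is unclear what type it could be given.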

7.1.7 Denotational semantics

It is clear that our denotational semantics could be of use in order to prove further properties of Mini-FreshML-like languages. For example, it may be possible to prove Conjecture 7.1.1 via denotational methods. The denotational semantics could be extended to provide reasoning principles about other language constructs which might be present in Fresh O'Caml. One obvious place where the semantics is lacking is in support for abstraction types more complicated than just <<name>>τ. Fresh O'Caml supports such types and provides algorithms for computing equality between values of such types, but this does not currently have a justification in our denotational semantics. In order to support this, FM-domain theory needs to be further developed to allow FM-cppos [D]D' to be constructed for arbitrary D and D'.

Whilst one can sketch possible constructions of such an FM-cppo (whose counterparts in FM-set theory are well-understood) it is not at all clear whether, for example, they are suitably chain-complete; previous experiences with dynamic allocation monads (§5.1.1) suggest that such order-theoretic completeness properties may prove troublesome. It should also be noted that generalised abstraction types <<τ>>τ' are very difficult to incorporate with the procedure of freshness inference. For given an identifier x and an expression E def= <<e>>e', where e is of some arbitrary type τ, then we can only deduce x # E if the identifier x is definitely not fresh for e, or else if x is certainly fresh for e and also e'. The difficulty lies in the 'not fresh' part, for this necessitates building up sound judgements x ¬# e2 as well as the 'fresh for' judgements. Despite some (unpublished) work on the subject by the author, no satisfactory algorithm has yet been found to infer such judgements.

Another interesting thing to note is that we should not necessarily restrict ourselves to using the continuation monad (D ⊸ R) ⊸ R. There are others, for example:

(D → R) ⊸ R     non-strict use of the continuation's argument
(D ⊸ R) → R     non-strict use of the continuation

The choice of which monad to use should be driven by the operational semantics of the language. For example, a language which permits any form of control flow enabling a program to abort without examining the current continuation would best be modelled by the (D ⊸ R) → R monad. And although we have not investigated it further, we also note that our current denotational semantics is an example of linearly-used continuations[7]. A useful extension to the denotational semantics would be a treatment of imperative features such as references: mutable state can be easily captured by using S ⊸ 1⊥ instead of just 1⊥ throughout the semantics, given a recursively-defined FM-cppo S of stores. Indeed, recent work by Benton et al.[6] develops a theory along these lines with a view to verifying equivalences such as the famous Meyer-Sieber examples[32].
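For comparison, the programming-language analogue of the (D ⊸ R) ⊸ R monad is the familiar continuation monad, sketched below in ordinary O'Caml; strictness is of course a domain-theoretic notion with no direct counterpart at this level.

(* The continuation monad with answer type 'r. *)
type ('a, 'r) cont = ('a -> 'r) -> 'r

let return (x : 'a) : ('a, 'r) cont = fun k -> k x

let bind (m : ('a, 'r) cont) (f : 'a -> ('b, 'r) cont) : ('b, 'r) cont =
  fun k -> m (fun x -> f x k)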

2 'Staleness judgements' might be a suitable name for such statements, although that name has not found favour with some!

Bibliography

[1] M. Abadi, L. Cardelli, P.-L. Curien, and J.-J. Lévy. Explicit substitutions. In Conference Record of the Seventeenth Annual ACM Symposium on Principles of Programming Languages, San Francisco, California, pages 31–46. ACM, 1990.

[2] S. Abramsky, D. Ghica, A. Murawski, L. Ong, and I. D. B. Stark. Nominal games and full abstraction for the nu-calculus. In Proceedings of the 19th Annual IEEE Symposium on Logic in Computer Science, pages 150–159. IEEE Computer Society Press, 2004.

[3] S. Abramsky and A. Jung. Domain theory. In Handbook of Logic in Computer Science, volume 3, pages 1–168. Clarendon Press, 1994.

[4] D. Ancona and E. Moggi. A fresh calculus for name management. In Generative Programming and Component Engineering, volume 3286 of Lecture Notes in Computer Science. Springer-Verlag, 2004.

[5] H. P. Barendregt. The Lambda Calculus: Its Syntax and Semantics. North-Holland, 1984.

[6] N. Benton and B. Leperchey. Relational reasoning in a nominal semantics for storage (preliminary report). Available from http://research.microsoft.com/∼nick/, November 2004.

[7] J. Berdine, P. O’Hearn, U. Reddy, and H. Thielecke. Linear continuation-passing. Higher Order and Symbolic Computation, 15(2/3):181–208, 2002.

[8] R. M. Burstall. Proving properties of programs by structural induction. The Computer Journal, 12(1):41–48, 1969.

[9] L. Cardelli, P. Gardner, and G. Ghelli. Manipulating trees with hidden labels. In A. D. Gordon, editor, Proceedings of the 6th International Conference on Foundations of Software Science and Computation Structures (FOSSACS 2003), volume 2620 of Lecture Notes in Computer Science, pages 216–232. Springer-Verlag, 2003.

[10] J. Cheney and C. Urban. Alpha-prolog: A logic programming language with names, binding and alpha-equivalence. In B. Demoen and V. Lifschitz, editors, Proceedings of the 20th International Conference on Logic Programming (ICLP 2004), volume 3132 of Lecture Notes in Computer Science, pages 269–283. Springer-Verlag, 2004.

[11] N. de Bruijn. Lambda calculus notation with nameless dummies: a tool for automatic formula manipulation, with application to the Church-Rosser theorem. Indagationes Mathematicae, 34:381–391, 1972.

[12] N. de Bruijn. A survey of the project AUTOMATH. To H. B. Curry: essays on combinatory logic, lambda calculus, and formalism, pages 597–606, 1980.

[13] D. de Rauglaudre. Camlp4 reference manual. Available from http://caml.inria.fr/camlp4/index.html, 2004.

[14] J. Despeyroux, A. Felty, and A. Hirschowitz. Higher-order abstract syntax in Coq. In M. Dezani and G. Plotkin, editors, Proceedings of the TLCA’95 International Conference on Typed Lambda Calculi and Applications, volume 902 of Lecture Notes in Computer Science, pages 124–138. Springer-Verlag, 1995.


[15] J. Despeyroux, F. Pfenning, and C. Schürmann. Primitive recursion for higher-order abstract syntax. In P. de Groote and J. R. Hindley, editors, Proceedings of the TLCA'97 International Conference on Typed Lambda Calculi and Applications, volume 1210 of Lecture Notes in Computer Science, pages 147–163. Springer-Verlag, 1997.

[16] M. Fernández, M. Gabbay, and I. Mackie. Nominal rewriting. Submitted, January 2004.

[17] M. P. Fiore, G. D. Plotkin, and D. Turi. Abstract syntax and variable binding. In Proceedings of the 14th Annual IEEE Symposium on Logic in Computer Science, pages 193–202. IEEE Computer Society Press, 1999.

[18] M. J. Gabbay. A Theory of Inductive Definitions with α-Equivalence: Semantics, Implemen- tation, Programming Language. PhD thesis, University of Cambridge, 2000.

[19] M. J. Gabbay. Automating Fraenkel-Mostowski syntax. In 15th International Conference on Theorem Proving in Higher Order Logics (TPHOLs), 2002. Work-in-progress submission, to be published as a NASA technical report.

[20] M. J. Gabbay and A. M. Pitts. A new approach to abstract syntax with variable binding. Formal Aspects of Computing, 13:341–363, 2002.

[21] M. J. Gabbay and L. Wischik. Implementing a bisimulation checker in Fresh O’Caml. Submitted, April 2004.

[22] J. Y. Girard. Interprétation fonctionnelle et élimination des coupures dans l'arithmétique d'ordre supérieur. PhD thesis, Université Paris VII, 1972.

[23] J. Y. Girard. The system F of variable types, fifteen years later. Theoretical Computer Science, 45:159–192, 1986.

[24] M. Hofmann. Semantical analysis of higher-order abstract syntax. In 14th Logic in Computer Science Conference (LICS’99), pages 204–213. IEEE Computer Society Press, 1999.

[25] A. Igarashi, B. Pierce, and P. Wadler. Featherweight Java: A minimal core calculus for Java and GJ. In L. Meissner, editor, Proceedings of the 1999 ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages & Applications (OOPSLA‘99), volume 34(10), pages 132–146. Addison-Wesley, 1999.

[26] B. W. Kernighan and D. M. Ritchie. The C Programming Language, Second Edition. Prentice-Hall, 1988.

[27] F. Le Fessant and L. Maranget. Optimizing pattern-matching. In Proceedings of the 2001 International Conference on Functional Programming (ICFP’01), pages 26–37. ACM Press, 2001.

[28] P. B. Levy. Martin-Löf Clashes With Griffin, Operationally. Available from http://www.cs.bham.ac.uk/∼pbl/papers/.

[29] I. A. Mason and C. L. Talcott. Equivalence in functional languages with effects. Journal of Functional Programming, 1(3):287–327, 1991.

[30] J. McCarthy. Towards a mathematical science of computation. IFIP Congress 1962, 1963.

[31] E. Meijer, M. Fokkinga, and R. Paterson. Functional programming with bananas, lenses, envelopes, and barbed wire. In Proceedings of the FPCA’91 conference on Functional Programming Languages and Computer Architecture, volume 523 of Lecture Notes in Computer Science, pages 124–144. Springer-Verlag, 1991.

[32] A. R. Meyer and K. Sieber. Towards fully abstract semantics for local variables. In Proceedings of the 15th ACM SIGPLAN/SIGACT Symposium on Principles of Programming Languages, POPL’88, pages 191–203. ACM Press, 1988.

[33] D. A. Miller. An extension to ML to handle bound variables in data structures: Preliminary report. In Proceedings of the Logical Frameworks BRA Workshop, 1990.

[34] R. Milner. Communicating and Mobile Systems: the Pi-Calculus. Cambridge University Press, 1999.

[35] R. Milner. What's in a name? In A. Herbert and K. Spärck Jones, editors, Computer Systems: Theory, Technology and Applications. Springer-Verlag, 2003.

[36] R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML: Revised. The MIT Press, 1997.

[37] E. Moggi. An abstract view of programming languages. Technical Report ECS-LFCS-90-113, Department of Computer Science, University of Edinburgh, 1989.

[38] E. Moggi. Notions of computation and monads. Information and Computation, 93(1):55–92, 1991.

[39] R. M. Needham. Names. In S. Mullender, editor, Distributed Systems, pages 315–326. Addison-Wesley, 1993.

[40] S. L. Peyton Jones. Tackling the awkward squad: Monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell. In C. A. R. Hoare, M. Broy, and R. Steinbrüggen, editors, Engineering Theories of Software Construction, pages 47–96. IOS Press, 2001.

[41] S. L. Peyton Jones. Secrets of the Glasgow Haskell Compiler inliner. Journal of Functional Programming, 12(4):393–434, 2002.

[42] S. L. Peyton Jones, editor. Haskell 98 Language and Libraries: The Revised Report. Cambridge University Press, 2003. Also available from http://www.haskell.org/definition/.

[43] F. Pfenning and C. Elliott. Higher-order abstract syntax. In Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation (PLDI), pages 199–208. ACM Press, 1988.

[44] A. M. Pitts. Typed operational reasoning. In B. C. Pierce, editor, Advanced Topics in Types and Programming Languages, chapter 3, pages 165–199. The MIT Press. To appear.

[45] A. M. Pitts. Computational adequacy via 'mixed' inductive definitions. In Proceedings of the 9th International Conference on Mathematical Foundations of Programming Semantics, volume 802 of Lecture Notes in Computer Science, pages 72–82. Springer-Verlag, 1994.

[46] A. M. Pitts. Relational properties of domains. Information and Computation, 127:66–90, 1996. (A preliminary version of this work appeared as Cambridge University Computer Laboratory Technical Report 321, December 1993).

[47] A. M. Pitts. A note on logical relations between semantics and syntax. Logic Journal of the Interest Group in Pure and Applied Logics, 5(4):589–601, July 1997.

[48] A. M. Pitts. Operational semantics and program equivalence. In G. Barthe, P. Dybjer, and J. Saraiva, editors, Applied Semantics, Advanced Lectures, volume 2395 of Lecture Notes in Computer Science, Tutorial, pages 378–412. Springer-Verlag, 2002.

[49] A. M. Pitts and M. J. Gabbay. A metalanguage for programming with bound names modulo renaming. In R. Backhouse and J. N. Oliveira, editors, Proceedings of the 5th International Conference on Mathematics of Program Construction, volume 1837 of Lecture Notes in Computer Science, pages 230–255. Springer-Verlag, 2000.

[50] A. M. Pitts and I. D. B. Stark. Observable properties of higher order functions that dynamically create local names, or: What's new? In Mathematical Foundations of Computer Science, Proc. 18th Int. Symp., Gdańsk, 1993, volume 711 of Lecture Notes in Computer Science, pages 122–141. Springer-Verlag, 1993.

[51] A. M. Pitts and I. D. B. Stark. Operational reasoning for functions with local state. In A. D. Gordon and A. M. Pitts, editors, Higher Order Operational Techniques in Semantics, Publications of the Newton Institute, pages 227–273. Cambridge University Press, 1998.

[52] G. D. Plotkin. Pisa notes (on domain theory). Available from http://homepages.inf.ed.ac.uk/gdp/publications/, 1983.

[53] C. Raffalli. A package for abstract syntax with binder. 1998.

[54] J. C. Reynolds. Towards a theory of type structure. In Paris Colloquium on Programming, volume 19 of Lecture Notes in Computer Science, pages 408–425. Springer-Verlag, 1974.

[55] J. C. Reynolds. Types, abstraction, and parametric polymorphism. Information Processing, pages 513–523, 1983.

[56] U. Schöpp and I. D. B. Stark. A dependent type theory with names and binding. In Computer Science Logic: Proceedings of the 18th International Workshop CSL 2004, volume 3210 of Lecture Notes in Computer Science, pages 235–249. Springer-Verlag, 2004.

[57] P. E. Sewell, J. J. Leifer, K. Wansbrough, M. Allen-Williams, F. Zappa Nardelli, P. Habouzit, and V. Vafeiadis. Acute: high-level programming language design for distributed computation. Design rationale and language definition. University of Cambridge Computer Laboratory Technical Report 605. Available from http://www.cl.cam.ac.uk/users/pes20/acute/index.html, 2004.

[58] T. Sheard. Accomplishments and research challenges in meta-programming. In Semantics, Applications, and Implementation of Program Generation (SAIG 2001), volume 2196 of Lecture Notes in Computer Science. Springer-Verlag, 2001.

[59] M. R. Shinwell. The implementation of FreshML. Dissertation for Part II of the Cambridge University Computer Science Tripos, 2000. Available from http://www.cl.cam.ac.uk/∼mrs30/.

[60] M. R. Shinwell. Swapping the atom: Programming with binders in Fresh O’Caml. Proceedings of the MERλIN Workshop, 2003.

[61] M. R. Shinwell and A. M. Pitts. On a monadic semantics for freshness. Theoretical Computer Science. To appear; a preliminary version appears in the proceedings of the second workshop of the EU FP5 IST thematic network IST-2001-38957 APPSEM II, Tallinn, Estonia, April 2004.

[62] M. R. Shinwell and A. M. Pitts. Fresh O’Caml User Manual. Cambridge University Computer Laboratory, 2003. Available at http://www.freshml.org/.

[63] M. R. Shinwell, A. M. Pitts, and M. J. Gabbay. FreshML: Programming with binders made simple. In Proceedings of the Eighth ACM SIGPLAN International Conference on Functional Programming (ICFP’03), pages 263–274. ACM Press, 2003.

[64] M. B. Smyth and G. D. Plotkin. The category-theoretic solution of recursive domain equations. SIAM Journal of Computing, 11(4):761–783, 1982.

[65] I. D. B. Stark. Names and Higher-Order Functions. PhD thesis, University of Cambridge, December 1994. Also available as University of Cambridge Computer Laboratory Technical Report 363.

[66] I. D. B. Stark. Categorical models for local names. Lisp and Symbolic Computation, 9(1):77–107, 1996.

[67] C. Urban, A. M. Pitts, and M. J. Gabbay. Nominal unification. In M. Baaz, editor, Computer Science Logic and 8th Kurt Gödel Colloquium (CSL'03 & KGC), Vienna, Austria, volume 2803 of Lecture Notes in Computer Science, pages 513–527. Springer-Verlag, 2003.

[68] P. Wadler. Compilation of pattern matching. In S. L. Peyton Jones, editor, The Implementation of Functional Programming Languages, chapter 7. Prentice-Hall International, 1987.

[69] P. Wadler. Theorems for free! In 4th International Symposium on Functional Programming Languages and Computer Architecture. ACM Press, 1989.

[70] G. Washburn and S. Weirich. Boxes go bananas: Encoding higher-order abstract syntax with parametric polymorphism. In Proceedings of the Eighth ACM SIGPLAN International Conference on Functional Programming (ICFP’03), pages 249–262. ACM Press, 2003.

[71] A. Wright and M. Felleisen. A syntactic approach to type soundness. Information and Computation, 115(1):38–94, 1994.