Practical Implementation of a Dependently Typed Functional Programming Language
by Edwin C. Brady

Submitted in conformity with the requirements for the degree of PhD
Department of Computer Science
University of Durham

Copyright © 2005 by Edwin C. Brady

Abstract

Types express a program’s meaning, and checking types ensures that a program has the intended meaning. In a dependently typed programming language types are predicated on values, leading to the possibility of expressing invariants of a program’s behaviour in its type. Dependent types allow us to give more detailed meanings to programs, and hence be more confident of their correctness.

This thesis considers the practical implementation of a dependently typed programming language, using the Epigram notation defined by McBride and McKinna. Epigram is a high level notation for dependently typed functional programming elaborating to a core type theory based on Luo’s UTT, using Dybjer’s inductive families and elimination rules to implement pattern matching. This gives us a rich framework for reasoning about programs. However, a naïve implementation introduces several run-time overheads since the type system blurs the distinction between types and values; these overheads include the duplication of values, and the storage of redundant information and explicit proofs.

A practical implementation of any programming language should be as efficient as possible; in this thesis we see how the apparent efficiency problems of dependently typed programming can be overcome and that in many cases the richer type information allows us to apply optimisations which are not directly available in traditional languages. I introduce three storage optimisations on inductive families: forcing, detagging and collapsing.
I further introduce a compilation scheme from the core type theory to G-machine code, including a pattern matching compiler for elimination rules and a compilation scheme for efficient run-time implementation of Peano’s natural numbers. We also see some low level optimisations for removal of identity functions, unused arguments and impossible case branches. As a result, we see that a dependent type theory is an effective base on which to build a feasible programming language.

Acknowledgements

My thanks to my supervisors, James McKinna and Zhaohui Luo. James made the original suggestion for a thesis topic and has constantly provided advice and feedback. Zhaohui’s enthusiasm and knowledge has been an inspiration, and I would like to thank both for their feedback on previous drafts of this thesis.

I would also like to thank the other members of the Computer Assisted Reasoning Group, in particular Conor McBride. Conor taught me a lot of what I know about type theory and introduced several implementation techniques to me. I also owe him thanks for his useful comments on an earlier draft of this thesis, and a collection of LaTeX macros which made writing it marginally less painful!

My office mates, successively Yong Luo, Paul Townend, Simon Pears, David Johnstone, Robert Kiessling, and Chris Lindop helped to provide a friendly environment in which to work, and I thank them. Thank you also to the Computing Society, the Go Club and Ustinov College Cricket Club for providing distractions when I needed them.

Finally, I would like to thank my family for their love and encouragement, and Jenny for her support and limitless patience over the last few months.

Contents

1 Introduction
  1.1 Types in Programming
  1.2 Dependent Types in Programming
    1.2.1 Cayenne
    1.2.2 DML
    1.2.3 Inductive Families and Epigram
    1.2.4 Benefits of Dependent Types
    1.2.5 Strong Normalisation
  1.3 Contributions
  1.4 Related Work
  1.5 Overview
    1.5.1 System Overview
    1.5.2 Chapter Outline
    1.5.3 Implementation Note

2 Epigram and its Core Type Theory
  2.1 TT - The Core Type Theory
    2.1.1 The Core Language
    2.1.2 Inductive Datatypes
    2.1.3 Elimination Rules
    2.1.4 Equality
    2.1.5 Properties of TT
    2.1.6 Universe Levels and Cumulativity
    2.1.7 TT Examples
    2.1.8 Labelled Types
  2.2 Programming in Epigram
    2.2.1 Basic Notation
    2.2.2 Programming with Elimination Rules
    2.2.3 Impossible Cases
    2.2.4 Example - Vector lookup
    2.2.5 Alternative Elimination Rules
    2.2.6 Derived Eliminators and Memoisation
    2.2.7 Matching on Intermediate Values
  2.3 Programming Idioms
    2.3.1 Dependent Pairs
    2.3.2 Induction Over Proofs
    2.3.3 Views
    2.3.4 Termination
  2.4 Summary

3 Compiling ExTT
  3.1 Execution Environments
    3.1.1 Normalisation by Evaluation
    3.1.2 Compilation
    3.1.3 Program Extraction
    3.1.4 Execution of Epigram
  3.2 The Run-Time Language RunTT
    3.2.1 Supercombinators and Lambda Lifting
    3.2.2 RunTT Syntax
  3.3 Translating Function Definitions to RunTT
    3.3.1 Grouping λ-abstractions
    3.3.2 Lambda Lifting
    3.3.3 Tidying up
    3.3.4 Arity
  3.4 Translating Elimination Rules to RunTT
  3.5 The G-machine
    3.5.1 Graph Representation
    3.5.2 Machine State
    3.5.3 Informal Semantics
    3.5.4 Operational Semantics
    3.5.5 Translation Scheme
    3.5.6 Example - plus and N-Elim
    3.5.7 Implementing a G-machine Compiler With Dependent Types
  3.6 Proper Tail Recursion
  3.7 Run-time Considerations
    3.7.1 Invariants of Inductive Families
    3.7.2 Proofs
    3.7.3 Number Representation
    3.7.4 Dead Code In Impossible Cases
    3.7.5 Intermediate Data Structures
  3.8 Summary

4 Optimising Inductive Families
  4.1 Elimination Rules and Their Implementation
    4.1.1 Form of Elimination Rules
    4.1.2 Pattern Syntax and its Run-Time Semantics
    4.1.3 Standard Implementation
    4.1.4 Alternative Implementations
  4.2 ExTT and Its Properties
    4.2.1 Properties of ExTT
    4.2.2 Defining Optimisations
    4.2.3 Typechecking via ExTT
  4.3 Building Efficient Implementations
    4.3.1 Eliding Redundant Constructor Arguments
    4.3.2 Eliding Redundant Constructor Tags
    4.3.3 Collapsing Content Free Families
    4.3.4 The Collapsing Optimisation
    4.3.5 Interaction Between Optimisations
    4.3.6 Using the Standard Implementation
  4.4 Compilation Scheme for ExTT
    4.4.1 Extensions to RunTT
    4.4.2 Compiling Elimination Rules
    4.4.3 Extensions to the G-machine
  4.5 Examples
    4.5.1 The Finite Sets
    4.5.2 Comparison of Natural Numbers
    4.5.3 Domain Predicates
    4.5.4 Non-repeating Lists
    4.5.5 Simply Typed λ-calculus
    4.5.6 Results summary
  4.6 A larger example - A well-typed interpreter
    4.6.1 The language
    4.6.2 Representation
    4.6.3 The interpreter
    4.6.4 Optimisation
    4.6.5 Results
  4.7 Summary

5 Number Representation
  5.1 Representing Numbers in Type Theory
    5.1.1 What is N used for?
  5.2 The Word family ...