
Joachim Breitner

Lazy Evaluation

From natural semantics to a machine-checked compiler transformation

The cover background pattern was created using the substrate algorithm by J. Tarbell, as implemented for the XScreensaver project by Mike Kershaw.

Lazy Evaluation
From natural semantics to a machine-checked compiler transformation
by Joachim Breitner

Dissertation, Karlsruher Institut für Technologie (KIT)
Fakultät für Informatik, 2016
Tag der mündlichen Prüfung: 25. April 2016
Erster Gutachter: Prof. Dr.-Ing. Gregor Snelting
Zweiter Gutachter: Prof. Tobias Nipkow, Ph.D.

Impressum

Karlsruher Institut für Technologie (KIT) KIT Scientific Publishing Straße am Forum 2 D-76131 Karlsruhe KIT Scientific Publishing is a registered trademark of Karlsruhe Institute of Technology. Reprint using the book cover is not allowed. www.ksp.kit.edu

This document – excluding the cover, pictures and graphs – is licensed under the Creative Commons Attribution-Share Alike 3.0 DE License (CC BY-SA 3.0 DE): http://creativecommons.org/licenses/by-sa/3.0/de/ The cover page is licensed under the Creative Commons Attribution-No Derivatives 3.0 DE License (CC BY-ND 3.0 DE): http://creativecommons.org/licenses/by-nd/3.0/de/

Print on Demand 2016

ISBN 978-3-7315-0546-4 DOI 10.5445/KSP/1000056002

Contents

1 Introduction  1
  1.1 Notation and conventions  5
  1.2 Reproducibility and artefacts  6
  1.3 Lazy evaluation  7
  1.4 The GHC Haskell compiler  9
    1.4.1 GHC Core  9
    1.4.2 Rewrite rules and list fusion  12
    1.4.3 Evaluation and function arities  15
  1.5 Arities and eta-expansion  17
  1.6 Nominal logic  19
    1.6.1 Permutation sets  20
    1.6.2 Support and freshness  21
    1.6.3 Abstractions  22
    1.6.4 Strong induction rules  22
    1.6.5 Equivariance  23
  1.7 Isabelle  24
    1.7.1 The prettiness of Isabelle code  25
    1.7.2 Nominal logic in Isabelle  28
    1.7.3 Domain theory and the HOLCF package  29


2 Formalizing Launchbury's natural semantics  33
  2.1 Launchbury's semantics  34
    2.1.1 Natural semantics  35
    2.1.2 Denotational semantics  39
    2.1.3 Discussions of modifications  42
  2.2 Correctness  45
    2.2.1 Discussions of modifications  51
  2.3 Adequacy  52
    2.3.1 The resourced denotational semantics  52
    2.3.2 Denotational black holes  54
    2.3.3 Resourced adequacy  56
    2.3.4 Relating the denotational semantics  58
    2.3.5 Concluding the adequacy  59
    2.3.6 Discussions of modifications  59
  2.4 Data type encodings and base values  62
    2.4.1 Data types via Church encoding  62
    2.4.2 Adding Booleans  64
  2.5 A small-step semantics  68
    2.5.1 Sestoft's mark-1 abstract machine  69
    2.5.2 Relating Sestoft's and Launchbury's semantics  70
    2.5.3 Discussions of modifications  74
  2.6 The Isabelle formalisation  76
    2.6.1 Employing nominal logic  76
    2.6.2 The type of environments  77
    2.6.3 Abstracting over the denotational semantics  79
    2.6.4 Relating the domains Value and CValue  82
  2.7 Related work  84

3 Call Arity  87
  3.1 The need for co-call analysis  89
    3.1.1 A syntactical analysis  89
    3.1.2 Incoming arity  90
    3.1.3 Called-once information  91
    3.1.4 Mutually exclusive calls  92
    3.1.5 Co-call analysis  93


  3.2 The type of co-call graphs  94
  3.3 The Call Arity analysis  95
    3.3.1 The specification  95
    3.3.2 The equations  97
  3.4 The implementation  105
    3.4.1 Interesting variables  106
    3.4.2 Finding the fixed points  107
    3.4.3 Top-level values  108
    3.4.4 The graph  109
  3.5 Discussion  110
    3.5.1 Call Arity and list fusion  110
    3.5.2 Limitations  111
    3.5.3 Measurements  113
    3.5.4 Compiler performance  118
  3.6 Related work  119
    3.6.1 GHC's arity analyses  119
    3.6.2 Higher order sharing analyses  121
    3.6.3 Explicit one-shot annotation  122
    3.6.4 unfoldr/destroy and stream fusion  124
    3.6.5 Worker-wrapper list fusion  124
    3.6.6 Control flow based analyses  126
  3.7 Future work  126
    3.7.1 Improvements to the analysis  127
    3.7.2 Tighter integration into GHC  127

4 The safety of Call Arity  129
  4.1 Proof outline  130
  4.2 Arity analyses  133
    4.2.1 A concrete arity analysis  140
    4.2.2 Functional correctness  141
  4.3 Cardinality analyses  146
    4.3.1 Abstract cardinality analysis  146
    4.3.2 Trace cardinality analysis  151
    4.3.3 Co-call cardinality analysis  158
    4.3.4 Call Arity, concretely  164


  4.4 The Isabelle formalisation  166
    4.4.1 Size and effort  166
    4.4.2 Structure  167
    4.4.3 The trace tree type implementation  167
  4.5 The formalisation gap  171
    4.5.1 Core vs. my syntax  172
    4.5.2 Core vs. my semantics  172
    4.5.3 Core's annotations  174
    4.5.4 Implementation vs. formalisation  175
    4.5.5 Performance and safety in the larger context  176
  4.6 Related work  176

5 Conclusion 179

A Formal definitions and main theorems  183
  A.1 Terms  184
  A.2 Semantics  187
    A.2.1 Natural semantics  187
    A.2.2 Small-step semantics  188
    A.2.3 Denotational semantics  188
  A.3 Correctness and adequacy theorems  190
  A.4 Call Arity  190
    A.4.1 Arities  190
    A.4.2 Co-call graphs  192
    A.4.3 The Call Arity analysis  193
    A.4.4 Call Arity theorems  195

B Call Arity code  197
  B.1 Co-call graphs  197
  B.2 The Call Arity analysis  200

Bibliography 211

Index 223

Abstract

High level programming languages, in particular the lazy, pure, functional kind, liberate the programmer from having to think about the low-level details of how his code is going to be executed, and they give the compiler extra leeway in optimising the program. This distance to the actual machine makes it harder to reason about the effect of the compiler's transformations on the program's performance. Therefore, these transformations are often only evaluated empirically by measuring the performance of a few benchmark programs. This yields useful evidence, but not universal assurance.

Formal semantics of programming languages can serve as guide rails to the implementation of a compiler, and formal proofs can universally show that the compiler does not inadvertently change the meaning of a program. Can they also be used effectively to establish that a program transformation performed by the compiler is indeed an optimisation?

In this thesis, I answer this question in three steps: I develop a new compiler transformation; I build the tools to analyse it in an interactive theorem prover; finally I prove safety of the transformation, i.e. that the transformed program – in a suitable abstract sense – performs at least as well as the original one.

My compiler transformation and accompanying program analysis Call Arity, which is now shipped with the Haskell compiler GHC, solves a long-standing problem with the list fusion program transformation: Accumulator passing list consumers like foldl and sum would, if they were allowed to take part in list fusion, produce badly performing code. Call Arity empowers the compiler to further rewrite such code, by eta-expanding function definitions, into a form that runs efficiently again. The key ingredient is a novel cardinality analysis based on the notion of co-call graphs, which can detect whether a variable is used at most once, even in the presence of recursion.

I provide empirical evidence that my analysis is indeed able to solve the problem: Now list fusion can provide significant improvements in these cases. The measurements also show that there are instances besides list fusion where the transformation fires and improves the program. No program in the benchmark suite regressed as a result of introducing Call Arity.

In order to be able to verify these statements formally, I formalise Launchbury's natural semantics for lazy evaluation in the interactive theorem prover Isabelle. As Launchbury's semantics is a very successful and commonly accepted semantics for the lambda calculus with mutually recursive let-bindings that models lazy evaluation, it is a natural choice for this endeavour.

My formalisation uses nominal logic, in the form of the Isabelle package Nominal2, to handle the issue of names and binders, which is generally one of the main hurdles in any formalisation work in programming languages. It is one of the largest Isabelle developments using this method, and the first to effectively combine it with the HOLCF package for domain theory. My first attempt to combine these turned out to be a dead end. I explain how and why that did not go well and how I eventually overcame the challenges.

Furthermore, I give the first rigorous adequacy proof of Launchbury's semantics. The proof sketch given by Launchbury has resisted past attempts to complete it. I found a more elegant and direct proof by slightly deviating from his outline.

Equipped with this formalisation, I model the Call Arity analysis and transformation in Isabelle and prove that it does not degrade program performance. My abstract measure of performance is the number of allocations performed by the program; I explain why this is a suitable choice for my use case. The proof is modular and introduces trace trees as a suitable domain for abstract cardinality analyses.

Every formal development, whether machine-checked or not, has a formalisation gap between the model and the modelled artefact. I discuss the breadth of the gap, in particular its limits given that Call Arity is but one part in a large, real-world compiler.

All in all I present novel program analyses to solve an open problem with list fusion and to generally improve the compiler, and I demonstrate how formal methods can be used to prove an operational property – safety – at this high level.


Zusammenfassung

High-level programming languages, in particular purely functional ones with lazy evaluation, free the programmer from having to think about how their program will actually be executed on the machine. Likewise, the compiler has more leeway when optimising programs in such languages. This distance from the machine, however, also makes it harder to predict how the compiler's program transformations affect the performance of the program. Such transformations are therefore often only evaluated empirically, by measuring the performance of a few example programs. This yields valuable evidence, but no universally valid statements.

Formal semantics of programming languages can serve as guide rails for the implementation of a compiler, and formal proofs can show, with general validity, that a compiler does not inadvertently change the meaning of a program. Can we use them to also prove that a program transformation, as performed by the compiler, is indeed an optimisation?

I pursue this question in this thesis in three steps: I develop a new compiler transformation; I build the tools to analyse it in an interactive theorem prover; and finally I prove that the transformed program – in a suitable abstract sense – performs at least as well as before.

My compiler transformation, called Call Arity, is now shipped with the Haskell compiler GHC and solves a long-standing problem with the list fusion program transformation: functions such as foldl and sum, which consume lists using an accumulator, would have led to undesirably slow code had they been allowed to take part in list fusion. Call Arity enables the compiler to rewrite such code further and bring it back into an efficient form, by suitably eta-expanding function definitions. The decisive ingredient is a new cardinality analysis that can detect when a variable is used at most once, even in recursive code. I show empirically that my analysis indeed solves the problem and that list fusion can now significantly improve the performance of programs in these cases as well. I also show that there are situations beyond list fusion in which my transformation fires and leads to improvements.

In order to be able to verify these statements formally, I formalise Launchbury's semantics for languages with lazy evaluation in the interactive theorem prover Isabelle. This widespread and generally accepted semantics models lazy evaluation in the lambda calculus with mutual recursion and is therefore the natural choice in my case.

To get a grip on the issue of names and binders, which is generally one of the main difficulties in formalising programming languages, I use nominal logic, as implemented for Isabelle in the package Nominal2. My formalisation is one of the largest Isabelle formalisations using Nominal2, and the first to combine it effectively with the HOLCF package, which implements domain theory. My first attempt to combine these techniques failed; I explain how and why, and describe how I eventually overcame the problems.


Furthermore, I have carried out the first rigorous proof that Launchbury's semantics is adequate. Launchbury's proof sketch has so far resisted all attempts to complete it. I deviated slightly from the path he outlined and thereby found a more elegant and direct proof.

Building on this formalisation, I model the Call Arity transformation and analysis in Isabelle and prove that it does not degrade the performance of programs. As an abstract measure of performance I use the number of memory cells that the program allocates, and I explain why this is a suitable choice in my case. The proof is modular and introduces the concept of trace trees, with which abstract cardinality analyses can be described.

Every formalisation, whether machine-checked or not, opens a formalisation gap between the model and the modelled artefact. I assess the breadth of this gap, which in the present case arises in particular from the fact that Call Arity is only one part of a large, production compiler.

All in all, I introduce a new program analysis that solves an open problem with list fusion and improves the compiler beyond that, and I show how formal methods can be used to carry out proofs about non-functional properties, such as performance behaviour, at this high level of abstraction.


Acknowledgements

I have to thank Prof. Gregor Snelting for giving me the opportunity to join his group and to work on this thesis. He expected and allowed me to work with great freedom and latitude, and I could pursue my own ideas without pressure or worries.

I also thank Prof. Tobias Nipkow for serving as the co-referee, but also for creating Isabelle in the first place. Without his work, two thirds of this thesis would not exist.

I am indebted to Simon Peyton Jones, with whom I spent three months as an intern. This was a very fruitful time that heavily influenced my work. He pointed me to the open problem of making foldl a good consumer, which initiated the whole work on Call Arity, and nudged me to go further when I thought my solution was "good enough". Furthermore, I enjoyed the privilege to be his co-author. Finally, without his work on GHC and Haskell, two thirds of this thesis would not exist.

I was able to pursue my research with few constraints, and was able to attend summer schools and conferences, due to a generous scholarship by the Deutsche Telekom Stiftung. I thank Prof. Sigmar Wittig for his commitment as a mentor and for keeping me on track in critical times, and Christiane Frense-Heck for managing the scholarship program so very efficiently, smoothly and friendly.


I also thank Gabriela Weitze-Schmithüsen for helping me to secure the scholarship in the first place, and for not holding a grudge after I left her research group.

The programming paradigms group at the Karlsruhe Institute of Technology has been a great place to work at, and I am happy that I was part of this group. In particular I would like to thank my colleagues for 3-pm-rituals, board game nights and a roughly monthly supply of cake. Special thanks go to my office mate Denis Lohner, who was always available to discuss questions with, and my former office mate Andreas Lochbihler, who introduced me to semantics and interactive theorem proving.

I thank Martin Mohr, Sebastian Buchwald, Denis Lohner, Manuel Mohr, Mareike Schmidtobreick, Ulrike Leyn and Thomas Breitner for proofreading a draft of this thesis.

Finally, I'd like to thank Isabelle for a great many hours of working together in solitude. She is sometimes difficult to work with, but her endless patience and unerring pedantry taught me a lot. She would not take a sorry, but when I admitted that she is – as always – right and I made up for my mistakes, she was never resentful. I also thank the other Isabelle for being quite different: always available for a little quiz, other games or just a mindless chat.

Functional programming combines the flexibility and power of abstract mathematics with the intuitive clarity of abstract mathematics.
Randall Munroe, xkcd #1270

CHAPTER 1 Introduction

It is a pleasure to create programs in functional programming languages, as they allow for a very high-level style of programming, using abstraction and composition. It suits the human brain that is trying to solve a problem, instead of accommodating the machine that has to implement the instructions. But such an abstraction comes at a cost: Generally, programs written in such high-level style perform worse than manually tweaked low-level code. Therefore, we reach out to optimising compilers, with the hope that they can amend this overhead, at least to some extent.

An optimising compiler necessarily needs to be conservative in how it changes the programs: We would not be happy if the program becomes faster, but suddenly computes wrong results. Naturally, this limits the compiler's latitude in applying fancy and far-reaching transformations. Conversely, the more declarative the language is – i.e. the fewer low-level details it specifies – and the fewer side-effects can occur, the more possibilities for optimising transformations arise. This explains why compilers of pure, lazy functional programming languages, such as Haskell, can pull quite astonishing tricks on the code.

A prime example for such a far-reaching transformation is list fusion: This technique transforms a program built from smaller components, each producing and/or consuming a list of values, into one combined loop. This not only avoids having to allocate, traverse and deallocate the list structure of each intermediate value, it also puts the actual processing codes next to each other, allowing for local optimisations to work on code that was originally far apart.

Unfortunately, there are many instances of code where the sufficiently smart compiler could do something clever – and often the uninitiated user actually expects that clever thing to happen – but the real compilers out there just do not do it yet. This thesis is about one such instance: A large number of list processing functions, including common combinators such as sum and length, were not set up to take part in list fusion. This is not because including them in the technique is difficult to do, but because the code that results from such a transformation would perform very badly. I found a new program analysis, called Call Arity, which gives the compiler enough information to further transform that problematic code into nice, straightforward and efficient code – code that is roughly what a programmer would write manually, if he chose to program such low-level code. I motivate and describe the analysis and its impact on program performance, determined empirically.

The same language traits – purity and laziness – which make it easier for the compiler to transform programs also ease a rigorous, formal discussion of the artefacts at hand. I therefore evaluate Call Arity not only empirically, but also prove that it is correct (i.e. does not change the meaning of the program) and safe (i.e. does not make the program's performance worse). While the former is common practice in this field of research, the latter is rarely done with such rigour.

What makes me so confident that my proof deserves to be called rigorous? If I just did a pen-and-paper proof, I would not trust it to that extent. For that reason, I implemented the syntax, the semantics and the compiler transformations in the theorem prover Isabelle and performed all proofs therein. Occasionally, this required deviations from the pen-and-paper presentation given in this thesis. I discuss these differences, and other noteworthy facts about the formalisation, in dedicated sections in the following chapters.


Such a formal proof requires a formal semantics for the programming language at hand, and in order to make statements about an operational property, the semantics has to be sufficiently detailed. Launchbury's natural semantics for lazy evaluation [Lau93] is such a semantics, and can be considered a standard semantics in lazy functional programming language research. I implemented this semantics in Isabelle, including proofs of the two fundamental properties: correctness and adequacy (with regard to a standard denotational semantics). As no rigorous proof of adequacy existed before, I present my proof in detail.

I have structured this thesis as follows: This chapter contains a brief introduction to the Haskell compiler GHC, in particular its intermediate language Core, its rewrite rules and list fusion; it introduces the central notion of arity and describes nominal logic and the interactive theorem prover Isabelle. Chapter 2 lays the foundation by formally introducing the syntax and the various semantics, and contains rigorous correctness and adequacy proofs for Launchbury's semantics. Chapter 3 motivates, describes and empirically evaluates Call Arity. Chapter 4 builds on the previous two chapters and contains the formal proof that Call Arity is safe. Appendix A contains the Isabelle formulation of the main results and relevant definitions. Appendix B lists the Haskell implementation of Call Arity. The bibliography and an index of used symbols and terms, including short explanations, follow. Figure 1 contains a map to the main artefacts and how they relate to each other. In the interest of readability I omit elaborate definitions and descriptions in the figure; if necessary, consult the index.

The present book is essentially my thesis, which is officially published separately [Bre16]. I have fixed spelling mistakes, adjusted the typography, slightly expanded Section 4.6 and mentioned newer, independent measurements in Section 3.5.4.


Figure 1: Main artefacts of this thesis and their relationship. (The diagram relates the Call Arity analysis of Chapter 3 and its abstract counterparts from Chapter 4 to the natural, small-step and denotational semantics of Chapter 2, via the refinement, simulation, correctness, equivalence, adequacy and safety results of Theorems 1 and 4 and Lemmas 8, 11, 12, 14, 16, 23, 26 and 27.)


1.1 Notation and conventions

I use mostly standard mathematical notation in this text, and any custom notation is introduced upon its first use. The index at the end of the thesis also includes symbols and notations, together with a short description of each entry. Proofs are concluded by a black square (∎); theorems, lemmas, definitions and examples span to the next diamond (◇).

When a function argument is just a single symbol, possibly with sub- or superscripts, I usually omit the parentheses for better readability: fv e1 instead of fv(e1). Variables printed with a dot (e.g. α˙) refer to lists whose elements are usually referred to by the plain variable (e.g. α). Variables printed with a bar (e.g. α¯) refer to objects that are partial or total maps from variable names to whatever the plain variable usually stands for. The same notation is used to distinguish related functions: If T is a plain function with one argument of a certain type, then T˙ is a function that expects a list of elements of that type and T¯ expects a map from variable names to values of that type.

Source code listings and code fragments within the text are typeset using a proportional sans-serif font, with language keywords highlighted by heavy type: if p a then f 1 else f 2. When writing Haskell code, I use some Unicode syntax instead of the more common ASCII representation, in particular a lambda instead of a backslash for lambda abstractions, and a proper arrow instead of ->, e.g. (λx → x) :: a → a. The Isabelle code snippets are produced by Isabelle from the sources, and printed in the usual LaTeX style of Isabelle's document generation facilities. The name of the Isabelle file containing the snippet is given in the top-right corner, unless it is the same as for the preceding snippet.

There are various schools of writing concerning the use of "we", "I" and "the author". Since a dissertation thesis is necessarily more tied to the person than a paper, even if it was a single author paper, I decided to use the first person singular whenever I describe what I have done or not done, and why I have done so.

Nevertheless, large parts of the text, especially the proofs, are an invitation to you, the reader, to follow my train of thought. Optimistically assuming that you follow this invitation, I will commonly use "we" in these parts, referring to you and me, just as if we were standing in front of a blackboard where I walk you through my proof. I do not avoid the passive voice as fundamentally as other authors would: It is used whenever I believe readability is best served this way.

1.2 Reproducibility and artefacts

This thesis describes a few artefacts that cannot be included in their entirety in the document, or that will evolve further in the future and thus diverge from what is discussed here. This includes the Call Arity implementation, which is part of the GHC source tree, and the Isabelle formalisations of Launchbury's semantics [Bre13] and of the safety of Call Arity [Bre15d]. Furthermore, I have conducted performance measurements of which only a summary is included in this text (Section 3.5.3), but neither the raw data nor the tools that produced them. In the interest of reproducibility and verifiability, I have collected all these artefacts on http://www.joachim-breitner.de/thesis. In particular, there you will find:

• The Isabelle sources of both developments, in precisely the version that is described in this document, in three formats: The plain .thy file, a browsable HTML version and the Isabelle-generated LaTeX output.

• Scripts to fetch and build GHC in the version discussed in this thesis (7.10.3).

• Patches to that version of GHC to produce the various variants compared in the benchmark sections.

• Scripts to run the benchmark suite, collect the results and produce Tables 1 and 3 in the benchmark sections.


• Code that I have created to check claims in this thesis, e.g. about the performance cost of unsaturated function calls (Section 1.4.3) and the effect of Call Arity on difference lists (Table 2).

• The LaTeX sources of the thesis document itself.

• Errata, if necessary.

1.3 Lazy evaluation

With the title of this thesis sporting the term lazy evaluation so prominently, it seems prudent to briefly introduce it in general terms. Consider the function

writeErrorToLog e =
  writeToLog ("Error " ++ errNum e ++ ": " ++ errDesc e)

which turns an error, given as an element of a structured data type, into a readable text and uses a hypothetical writeToLog function to write the text to a log file. In most programming languages, writeErrorToLog e would first calculate the text and only then call writeToLog. But assume that in the application at hand, logging is optional, and actually turned off: writeToLog would have to discard the text passed to it, and the calculation would have been useless. This behaviour is called strict evaluation or call-by-value.

In a programming language with lazy evaluation, the function writeErrorToLog would not actually assemble the text, but defer this calculation, by creating a thunk that serves as a placeholder. If writeToLog decides that no log file is to be written, it will discard the thunk and the useless calculation never happens. On the other hand, if writeToLog does write to the log file, this will eventually require the actual value of the argument and only then trigger the evaluation of the thunk. We also say that its evaluation is forced.

The point of lazy evaluation is not just to avoid useless computation: One of its main benefits is that code can be refactored much more easily. Consider the following plausible implementation of writeToLog:

writeToLog txt =
  if logLevel >= 1
  then appendFile "error.log" txt
  else return ()

This function does two things: It decides whether logging is actually required, and if so, it performs the logging. Likely there are more places where we need to decide whether logging is required, so it is desirable to abstract over this procedure and implement it in a definition of its own:

ifLogging action =
  if logLevel >= 1
  then action
  else return ()

writeToLog txt = ifLogging (appendFile "error.log" txt)

In a programming language with strict evaluation, this will not work as intended: The argument appendFile "error.log" txt would be evaluated before ifLogging gets a chance to check the log level. Our refactoring just broke the program! Note that even in strict languages, the if-then-else construct evaluates the two branches lazily, but this is a built-in special case for this syntactic construct, and not available for the programmer to abstract over.

In a programming language with lazy evaluation, however, this refactoring is valid. This way, lazy evaluation allows the programmer to define custom control structures.

Another aspect of lazy evaluation that is crucial to my work is sharing: Although the argument to a function is not evaluated until it is used for the first time, it will not be evaluated a second time. For example the code map (2^b *) xs, which multiplies every element of the list xs by a certain power of two, will not actually calculate 2^b if the list xs is empty. But even if xs has more than one element, 2^b is calculated only once, and the result is shared between the various uses. This feature distinguishes lazy evaluation, also called call-by-need, from call-by-name evaluation. According to the latter scheme, which is of less practical relevance, the calculation of an argument is also deferred until it is needed, but it would be re-evaluated repeatedly if used more than once.
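The sharing behaviour just described can be observed directly. The following small program is my own illustration rather than an example from the thesis; it uses trace from GHC's Debug.Trace module to report when the shared value is actually computed (with optimisations enabled, the exact behaviour may differ):

import Debug.Trace (trace)

main :: IO ()
main = do
  -- factor is bound to a thunk; trace reports when it is actually evaluated
  let factor :: Integer
      factor = trace "factor evaluated" (2 ^ (10 :: Integer))
  print (map (factor *) [1, 2, 3])
  print (map (factor *) [4, 5, 6])
  -- Under call-by-need the message appears at most once, even though factor
  -- is used for six list elements; under call-by-name it would be printed
  -- repeatedly, and if both lists were empty it would not be printed at all.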


A common way to implement sharing is to add code to every thunk that, after the evaluation of the thunk has been triggered and its value has been calculated, replaces the thunk by this value, so that every existing reference to the thunk now references the value. This mechanism is called updating.
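To make the updating mechanism a little more concrete, the following is a schematic model in plain Haskell, using an IORef as the mutable reference. It is only my own sketch of the idea; GHC's real thunks are heap objects manipulated by compiled code, not Haskell values:

import Data.IORef

-- A thunk is either a suspended computation or an already computed value.
data Thunk a = Unevaluated (IO a) | Evaluated a

mkThunk :: IO a -> IO (IORef (Thunk a))
mkThunk suspended = newIORef (Unevaluated suspended)

-- Forcing a thunk runs the suspended computation on first use and then
-- updates the reference with the result, so later uses see the cached value.
force :: IORef (Thunk a) -> IO a
force ref = do
  contents <- readIORef ref
  case contents of
    Evaluated value -> return value
    Unevaluated suspended -> do
      value <- suspended
      writeIORef ref (Evaluated value)  -- the update step described above
      return value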

1.4 The GHC Haskell compiler

The programming language Haskell was created in the 1990s by a committee with the aim to overcome the then wild growth of lazy functional programming languages. The committee produced a series of language specifications, including the final Haskell 98 language report [Pey03].¹ This standardisation allowed a number of Haskell compilers to emerge. These days, a number of compilers are still actively developed, but while most of them are dedicated to special purposes or research, only one compiler is of practical relevance: The Glasgow Haskell Compiler (GHC).

In order for my work to have an impact on actual users using Haskell to solve real problems, I implemented Call Arity within GHC. It was first shipped with GHC-7.10, released on March 27th 2015². GHC's internal structure necessarily influenced the design and implementation of Call Arity, so I will outline its relevant features here.

1.4.1 GHC Core

Speaking in terms of syntax, Haskell is a large language: As of version 7.10.3 of GHC, the data types used to represent the abstract syntax tree of a Haskell expression have over 79 constructors, and 24 more are required to express the Haskell types. Therefore GHC – like most compilers – transforms the source language into a smaller intermediate language.

1 After a long phase of stability, a revision was published in 2010 and is now the most recent Haskell specification [Mar10]. With regard to this thesis, the differences are irrelevant.
2 The coincidence with my birthday is, well, coincidental.

In this case, the intermediate language is GHC Core and uses only the 15 constructors given in Fig. 2 to represent expressions. The translation from Haskell to Core is not just a matter of simple syntactic desugaring, as the type systems differ noticeably: Haskell has features in the type system that have a computational meaning; most prominently type classes. Therefore, GHC has to type-check the full Haskell program, and as a side-effect of type-checking the compiler produces the code that implements these features. In the case of type classes the compiler generates dictionaries³ for each instance and passes them around as regular function arguments.

data Expr b
  = Var Id
  | Lit Literal
  | App (Expr b) (Expr b)
  | Lam b (Expr b)
  | Let (Bind b) (Expr b)
  | Case (Expr b) b Type [(AltCon, [b], Expr b)]
  | Cast (Expr b) Coercion
  | Tick (Tickish Id) (Expr b)
  | Type Type
  | Coercion Coercion

data AltCon = DataAlt DataCon | LitAlt Literal | DEFAULT

data Bind b = NonRec b (Expr b) | Rec [(b, (Expr b))]

Figure 2: The data type representing GHC Core expressions

3 From an operational point of view, these might better be called tuples, as they are single-constructor data types and the members are at fixed, statically known positions. There is no runtime string-based lookup as in “dictionaries” in dynamic languages.
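To illustrate the dictionary translation sketched above, here is a rough picture in ordinary Haskell. It is only an illustration of the idea; the names MyEq, MyEqDict and dMyEqInt are made up, and GHC's generated Core uses its own naming scheme:

-- What the programmer writes:
class MyEq a where
  eq :: a -> a -> Bool

member :: MyEq a => a -> [a] -> Bool
member x = any (eq x)

-- Roughly what the translation produces: the class becomes a record of
-- methods (the "dictionary"), each instance becomes a value of that record,
-- and the class constraint becomes an ordinary function argument.
data MyEqDict a = MyEqDict { eqMethod :: a -> a -> Bool }

dMyEqInt :: MyEqDict Int
dMyEqInt = MyEqDict { eqMethod = (==) }

member' :: MyEqDict a -> a -> [a] -> Bool
member' dict x = any (eqMethod dict x)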


Nevertheless, Core does have a type system, and Core terms are explicitly typed. This is used as an effective quality assurance tool [MP12]: The internal type checker (called linter) would complain if the Core generated from the Haskell source is not well-typed, or if any of the further processing steps breaks the typing. The type system is relatively small (12 constructors) but powerful enough to support all features of the Haskell type system, including fancy extensions like GADTs [PVWW06] and type families [SPCS08]. The theory behind Core is System FC, an explicitly typed lambda calculus with explicit type abstraction and application as well as type equality witnesses called coercions [SCPD07]. The latter add another 15 constructors to the count. Core and its theoretical counterpart, System FC, are continuously refined, recently by a stratification of the coercions into roles [WVPZ11; BEPW14].

Most of the published research around Core and System FC revolves around the type system: How to make it more expressive and more powerful. There is, however, a lack of operational treatments of Core in the literature. The extended version of [BEPW14] contains a small-step semantics of System FC. It serves not as a description of Core's operational behaviour but rather as a tool to prove type safety of System FC and punts on let-bindings completely. Eisenberg also maintains a small-step semantics for full Core [Eis15], which is call-by-name. There is no description of how Core implements lazy evaluation besides the actual implementation in GHC, i.e. the Core-to-STG transformation. This lack contributed to the breadth of the formalisation gap of this work (Section 4.5.2).

Almost all of the optimisations performed by GHC are Core-to-Core transformations; Call Arity is no exception. But as not all features of Core are relevant in the description and discussion of Call Arity, the trimmed down lambda calculus introduced in Section 2.1 serves to take the role of Core; I discuss this simplification in Section 4.5.1.


1.4.2 Rewrite rules and list fusion

When we teach functional programming, we often use equational reasoning to explain when two programs are the same, or to derive more specialised or faster programs from specifications or existing programs, e.g. as Bird does [Bir89]. Such equational reasoning is especially powerful in pure, lazy languages, as more equalities hold here: For example, bindings may be floated out of or into expressions, or inlined completely, common code patterns can be abstracted into higher-order functions etc.

But instead of expecting the programmer to apply such equalities, we can actually teach the compiler to do that. This mechanism, called rewrite rules, lets the author of a software library specify rules that contain a code pattern (the left-hand side of the rule) and replacements (the right-hand side of a rule), with free variables that will be matched by any code [PTH01]. For example, the code

{-# RULES "map/map" forall f g xs. map f (map g xs) = map (f . g) xs #-}

allows the compiler to make use of the functoriality of map and replace code like sum (map (+1) (map (*2) [0..10])) by sum (map ((+1) . (*2)) [0..10]), which calls map only once, and hence avoids the allocation, traversal and deallocation of one intermediate list.

What about the other intermediate lists in that code? Can we get rid of them as well? After all, the code could well be written completely list-lessly:⁴

4 Due to the excessive use of the stack, this is not an efficient way to sum the elements of a list, and a real implementation would use a strict accumulator and tail recursion. For the sake of this explanation, please bear with me here.

go 0
  where go n | n > 10    = 0
             | otherwise = (n*2 + 1) + go (n + 1)

This feat is done by list fusion [GLP93], which is essentially a set of rewrite rules that tell the compiler how to transform the high-level code with lists into the nice code above. The central idea is that instead of allocating the list constructors (: and []), the producer of a list passes the head of the list and the (already processed) tail of the list to a function provided by the consumer. Thus a list producer is expected to use the following build function to produce a list, instead of using the constructors directly:

build :: forall a. (forall b. (a → b → b) → b → b) → [a]
build g = g (:) []

The higher rank type signature ensures that g is consistent in using the argument provided by build to produce the result: By requiring the argument g to build a result of an arbitrary type b, it has no choice but to use the given arguments (here (:) and []) to construct it. A list producer implemented using build is called a good producer. For example, instead of defining the enumeration function naively as

[n..m] = go n m
  where go n m | n > m     = []
               | otherwise = n : go (n+1) m

it can be defined in terms of build, and thus the actual code in go is abstract in the list constructors:

[n..m] = build (go n m)
  where go n m cons nil | n > m     = nil
                        | otherwise = n `cons` go (n+1) m cons nil

The build function has a counterpart that is to be used by list consumers; it is the well-known right-fold:

foldr :: (a → b → b) → b → [a] → b
foldr k z = go
  where go []     = z
        go (y:ys) = y `k` go ys


Any list consumer implemented via foldr is called a good consumer. It is a typical exercise for beginners to write a list consuming function like sum in terms of foldr:⁵

sum :: [Int] → Int
sum xs = foldr (+) 0 xs

After rewriting as many list producers as possible in terms of build, and as many list consumers as possible in terms of foldr, what have we gained? The benefit comes from one single and generally applicable rewrite rule

{-# RULES "fold/build" forall k z g. foldr k z (build g) = g k z #-}

which fuses a good producer with a good consumer. It makes the producer use the consumer's combinators instead of the actual list constructors, and thus eliminates the intermediate list. Simplifying our example a bit, we can see that sum [0..10] would, after some inlining, become

foldr (+) 0 (build (go 0 10))
  where go n m cons nil | n > m     = nil
                        | otherwise = n `cons` go (n+1) m cons nil

where the rewrite rule is applicable, and GHC rewrites this to

go 0 10 (+) 0
  where go n m cons nil | n > m     = nil
                        | otherwise = n `cons` go (n+1) m cons nil

which can further be simplified (by a constant propagation and dropping unused arguments) to

go 0 10
  where go n m | n > m     = 0
               | otherwise = n + go (n+1) m

which is roughly the code we would write by hand.

5 As mentioned in the previous footnote, this is not a good and practical definition for summation. In your code, please do use sum = foldl' (+) 0 instead!


A function like map is both a list consumer and a list producer, but it poses no problem to make it both a good consumer and a good producer:

map :: (a → b) → [a] → [b]
map f xs = build (λcons nil → foldr (λx ys → f x `cons` ys) nil xs)

With this definition for map, the compiler will indeed transform the expression sum (map (+1) (map (*2) [0..10])) into the nice list-less code on page 12. It is remarkable that list fusion does not have to be a built-in feature of the compiler, but can be completely defined by library code using rewrite rules.

List fusion based on foldr/build is but one of several techniques to eliminate intermediate data structures; there is unfoldr/destroy [Sve02] and stream fusion [CLS07]; they differ in what functions can be efficiently turned into good producers and consumers [Cou10]. I focus on foldr/build as that is the technique used for the list data type in the Haskell standard libraries.
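Other list functions can be made good consumers and producers following the same pattern. As a sketch (written in plain ASCII Haskell, and not the definition actually used in GHC's base library, which goes through a helper function and rewrite rules), a fusible variant of filter could look like this, with build imported from GHC.Exts:

import GHC.Exts (build)

-- filter as a good consumer (via foldr) and a good producer (via build)
filter' :: (a -> Bool) -> [a] -> [a]
filter' p xs =
  build (\cons nil ->
    foldr (\x ys -> if p x then x `cons` ys else ys) nil xs)

A pipeline such as sum (filter' even [0..10]) can then fuse with the surrounding producers and consumers in the way shown above, provided the definitions are inlined so that the fold/build rule can fire.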

1.4.3 Evaluation and function arities

When GHC is done optimising the program at the Core stage, it transforms it to machine code via yet another intermediate language. GHC Core is translated to the Spineless Tagless G-Machine (STG) [Pey92]. Although still a functional language based on the untyped lambda calculus, it already determines many low-level details of the eventual execution: In particular, allocation of data and of function closures is explicit, the memory layout of data structures is known and all functions have a particular arity, i.e. number of parameters. So although it is not machine code yet, together with the runtime system (which is implemented in C), most details of the runtime behaviour are known by now. The function arity at this stage has an important effect on performance, as a mismatch between the number of arguments a function expects and the number of arguments it is called with causes significant overhead during execution.

multA :: Int → Int → Int
multA 0 y = 0
multA x y = x * y

multB :: Int → Int → Int
multB 0 = λ_ → 0
multB x = λy → x * y

Figure 3: Semantically equal functions with different arities

Consider the two functions in Fig. 3, which both implement a short-circuiting multiplication operator. The first has an arity of 2, while the second has an arity of 1. This matters: Evaluating the expression multB 1 2 is more than 25% slower than evaluating multA 1 2!

Why is that so? For the former, the compiler sees that enough arguments are given to multA to satisfy its arity, so it puts them in registers and simply calls the code of multA. For the latter, the code first pushes onto the stack a continuation that will, eventually, apply its argument to 2. Then it calls multB with only the first argument in a register. multB then evaluates this argument and checks that it is not zero. It then allocates, on the heap, a function closure capturing x, and passes it to the continuation on the stack. This continuation, implemented generically in the runtime, analyses the function closure to see that it indeed expects one more argument, so it finally passes the second argument, and the actual computation can happen.

This example demonstrates why it is important for good performance to have functions expect as many arguments as they are being called with. Could the compiler simply always make a function expect as many arguments as possible? No! Compare the expression sum (map (multA n) [1..1000]) with the expression sum (map (multB n) [1..1000]).

The former will call multA one thousand times and thus perform the check n == 0 over and over again, while the latter calls multB once, hence performs the check once, and then re-uses the returned function a thousand times. In this example the check is rather cheap, but even then, for n=0, the latter code is 20% faster. With different, more expensive checks, the performance difference can become arbitrarily large.

More details about how GHC implements function calls, and why it does it that way, can be found in [MP06].

1.5 Arities and eta-expansion

The notion of arity is central to this thesis, and deserves a more abstract definition in terms of eta-expansion. This definition formally builds on the syntax and semantics introduced later, but can be understood on its own. Eta-expansion replaces an expression e by (λz. e z), where z is fresh with regard to e. More generally, the n-fold eta-expansion is described by

En(e) := (λz1 ... zn. e z1 ... zn),

where the zi are distinct and fresh with regard to e. We intuitively consider an expression e to have arity α ∈ N if we can replace it by Eα(e) without negative effect on the performance – whatever that means precisely. Analogously, for a variable bound by let x = e, its arity xα is the arity of e.

Example The Haskell function

let f x = if x then λy → y + 1
               else λy → y − 1

can be considered to have arity 2: If we eta-expand its right-hand side, and apply some mild simplifications, we get

let f x y = if x then y + 1 else y − 1

which should in general perform better than the original code. Note that in a lazy language, x will be evaluated at most once. ◇


In this example, I determined the arity of an expression based on its definition and obtained its internal arity. Such an analysis has been part of GHC for a while and is described in [XP05]. For the rest of this work, however, I treat e as a black box and instead look at how it is being used, i.e. its context, to determine its external arity. For that, I can give an alternative definition: An expression e has arity α if upon every evaluation of e, there are at least α arguments on the stack.

Example In the Haskell code

let f x = if g x then λy → y + 1
                 else λy → y − 1
in f 1 2 + f 3 4

the function f has arity 2: Because it is always called with two arguments, the eta-expansion itself has no effect, but it allows for subsequent optimisations that improve the code to

let f x y = if g x then y + 1 else y − 1
in f 1 2 + f 3 4.

The internal arity is insufficient to justify this, as in a different context, this transformation could create havoc: Assume the function is passed to a higher-order function such as map (f 1) [1..1000]. If f were now eta-expanded, the possibly costly call to g 1 would no longer be shared, but repeated a thousand times. ◇

If an expression has arity α, then it also has arity α′ for α′ ≤ α; every expression has arity 0. The arities can thus be arranged to form a lattice:

· · · ⊏ 3 ⊏ 2 ⊏ 1 ⊏ 0.

For convenience, I set 0 − 1 = 0. As mentioned in Section 1.1, α¯ is a partial map from variable names to arities, and α˙ is a list of arities.
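As a concrete, if simplistic, illustration of the n-fold eta-expansion operator defined at the beginning of this section, the following plain-Haskell sketch applies it to a toy expression type. It is not part of the thesis' formal development, and it simply assumes that the generated names z1, z2, ... are fresh with regard to e:

data Expr = Var String | App Expr Expr | Lam String Expr
  deriving Show

-- etaExpand n e builds λz1 ... zn. e z1 ... zn
etaExpand :: Int -> Expr -> Expr
etaExpand n e = foldr Lam (foldl App e (map Var zs)) zs
  where zs = ["z" ++ show i | i <- [1 .. n]]

-- etaExpand 2 (Var "f") ==
--   Lam "z1" (Lam "z2" (App (App (Var "f") (Var "z1")) (Var "z2")))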


1.6 Nominal logic

In pen-and-paper proofs about programming languages, it is customary to consider alpha-equivalent terms as equal, i.e. λx. x = λy. y. The human brain is relatively good at following that reasoning, keeping track of the scope of variables and implicitly making the right assumptions about what names in a proof may be equal to another. For example, in a proof by induction on the formation of terms, it often goes without saying that in the case for λx. e, the x is fresh and not related to any name occurring outside the scope of this lambda.

Such loose reasoning stands in the way of a rigorous and formal treatment. If the formalisation introduces terms as raw terms where the name of the bound variable contributes to the identity of the object, i.e. λx. x ≠ λy. y, then in every inductive proof one would have to worry about the bound variable possibly being equal to some name in the context, and if that poses a problem, one has to explicitly alpha-rename the lambda abstraction, which in turn requires a proof that the statement of the lemma indeed respects alpha-equivalence.

One alternative is to use nameless representations such as de-Bruijn indices. With these, every term has a unique representation and the issue of alpha-equivalency disappears. The downsides of such an approach are the need for two different syntactic constructors for variables – one for the index of a bound name, and one for the name of a free variable – and the relatively unnatural syntax, which stands in the way of readability.

A way out is provided by nominal logic, as devised by Pitts [Pit03]. This formalism allows us to use names as usual in binders and terms, while still equating alpha-equivalent terms, and it provides induction principles that allow us to assume bound names to be as fresh as we intuitively want them to be.

This section gives a shallow introduction to nominal logic. I took inspiration from [UT05], simplified some details and omitted the proofs. In the main body of the thesis I present my definitions and proofs in the intuitive and somewhat loose way, without making use of concepts specific to nominal logic.

In particular I do not bother to state the equivariance of my definitions and predicates. Having a machine-checked formalisation, where all these slightly annoying and not very enlightening details have been taken care of, gives me the certainty that no problems lurk here.

1.6.1 Permutation sets

A core idea in nominal logic is that the effect of permuting names in an object describes its binding structure. Full nominal logic supports an infinite number of distinct sorts of names, or atoms, but as I do not need this expressiveness, I restrict this exposition to one sort of atoms, here suggestively named Var. We are concerned with sets that admit swapping names:

Definition 1 (PSets) A pset is a set X with an action • of the group Sym(Var) on X. ◇

Deciphering the group theory language, this means that there is an operation • that satisfies, for every x ∈ X,
- () • x = x and
- (π1 · π2) • x = π1 • (π2 • x) for all permutations π1, π2,
where () is the identity permutation, and · the usual composition of permutations.

The set of atoms, Var, is naturally a pset, with the standard action of the permutation group. Any set can be turned into a pset using the trivial operation, i.e. π • x = x for all elements x of the set. This way, objects that do not "contain names", e.g. the set of natural numbers, or the Booleans, can be elegantly part of the formalism. Such a pset is called pure. Products and sums of psets are psets, with the permutations acting on the components. Similarly, the set of lists with elements in a pset is a pset. Functions from psets to psets are psets, with the action defined as

π • f = λx. π • (f (π⁻¹ • x)).

Note that the permutation acting on the argument has to be inverted.
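To give a feel for what the permutation action looks like on concrete syntax, here is a small plain-Haskell sketch of a single transposition (a b) acting on raw lambda terms. It is only illustrative and not part of the Isabelle development; note how the action renames all occurrences of a name, binding occurrences included, which is what will make the term constructors equivariant (Section 1.6.5):

type Name = String

-- the transposition (a b): swap the two names, leave every other name alone
swapName :: (Name, Name) -> Name -> Name
swapName (a, b) v
  | v == a    = b
  | v == b    = a
  | otherwise = v

data Term = V Name | Ap Term Term | Lm Name Term
  deriving Show

-- the action of the transposition on raw terms
act :: (Name, Name) -> Term -> Term
act p (V v)    = V (swapName p v)
act p (Ap t u) = Ap (act p t) (act p u)
act p (Lm v t) = Lm (swapName p v) (act p t)

-- act ("x", "y") (Lm "x" (V "x")) == Lm "y" (V "y"),
-- in line with treating alpha-equivalent terms as equal.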


1.6.2 Support and freshness

Usually, when discussing names and binders, one of the first definitions is that of fv e, the set of free variables of some term e. Intuitively, it is the set of variables occurring in e that are not hidden behind some binder. But this intuition gets us only so far: Consider the identity function id: Var → Var. On the one hand, it does not operate on any variables, it just passes them through. On the other hand, its graph mentions all variables. So what should its set of free variables be – nothing ({}) or everything (Var)? Nominal logic avoids this problem by giving a general and abstract definition of the set of free variables6 of an element of any pset: Definition 2 (Free and fresh variables) The set of free variables of an element x of some pset X is defined as

fv x = {a | card {b | (a b) • x ≠ x} = ∞}.

A variable v is fresh with regard to x if v ∉ fv x. ◇

Spelled out, this says that a variable a is free in x if there are infinitely many other variables b such that swapping these two affects x. Or, more vaguely, a matters to x. From this definition, many useful and expected equalities about fv can be derived:
• fv v = {v} for v ∈ Var.
• fv((x, y)) = fv x ∪ fv y.
• fv x = {} if x is from a pure pset.
• fv(id) = {}, as π • id = id for all permutations π.
• If a, b ∉ fv x, then (a b) • x = x.

When talking about programming languages, we are used to having "enough" variables, i.e. there is always one that is fresh with regard to everything else around. This is not true in general. For example, let f : Var → N be a bijection, then fv f = Var, as every transposition (a b) changes f.

6 This is commonly called the support. I use the term free variables in this introduction, as the notions coincide in all cases relevant to this thesis.

If such an object were to appear during a proof, we would not be able to say "let x be a variable that is fresh with regard to f". But in practice, such objects do not occur, and there is always a fresh variable. This is captured by the following

Definition 3 (Finite support) A pset X is said to have finite support if fv x is finite for all x ∈ X. ◇

Since Var is infinite, it immediately follows that for every x from a pset with finite support, there is a variable a that is fresh with regard to x. The pset Var, as well as every pure pset, is a set with finite support. Products, sums and lists of psets with finite support have themselves finite support. Sets of functions from psets with finite support, or from an infinite set to a pset with finite support, do in general not have finite support. This can be slightly annoying, as discussed in Section 2.6.2.

1.6.3 Abstractions

The point of nominal logic is to provide a convenient way to work with abstractions. Formally, a nominal abstraction over a pset X is any operation [_]._ : Var → X → X that fulfils
(i) π • ([a].x) = [π • a].(π • x) and
(ii) [a].x1 = [b].x2 ⇐⇒ x1 = (a b) • x2 ∧ (a = b ∨ a ∉ fv x2).

For a pset X with finite support, this implies

fv([a].x) = fv x \{a}, which further shows that this notion of free variables coincides with our intuition and expectation. This notion of abstraction can be extended to multiple binders, e.g. to represent mutually recursive let-expressions [UK12].

1.6.4 Strong induction rules

My use case for nominal logic is to model the syntax of the lambda calculus, and to get better induction principles.


Consider this inductive definition of lambda expressions:

e ∈ Exp ::= x | e e | λx.e

where x is a meta-variable referring to elements of Var. This would yield the following induction rule

(∀x ∈ Var. P(x)) =⇒
(∀e1, e2 ∈ Exp. P(e1) =⇒ P(e2) =⇒ P(e1 e2)) =⇒
(∀x ∈ Var, e ∈ Exp. P(e) =⇒ P(λx.e)) =⇒
P(e)

where in the case for lambda expressions, the proof obligation is to be discharged for any variable x, even if that variable is part of the context (i.e. mentioned in P). This can be a major hurdle during a proof. If one had Exp as a permutation set such that λx.e is a proper nominal abstraction, then it would be possible to prove a stronger induction rule:

(∀s ∈ X, x ∈ Var. P(s, x)) =⇒
(∀s ∈ X, e1, e2 ∈ Exp. P(s, e1) =⇒ P(s, e2) =⇒ P(s, e1 e2)) =⇒
(∀s ∈ X, x ∈ Var, e ∈ Exp. x ∉ fv s =⇒ P(s, e) =⇒ P(s, λx.e)) =⇒
P(s, e)

Here the proposition P explicitly specifies its "context" in its first parameter, which may be of any pset X with finite support. In the case for the lambda abstraction, we may additionally, and without any manual naming or renaming, assume the variable x to be fresh with regard to that context.

The construction of Exp as a permutation set with a nominal abstraction is not trivial and described in [UT05]. Luckily, we do not have to worry about that: The implementation of nominal logic in Isabelle takes care of that (cf. Section 1.7.2).

1.6.5 Equivariance

The last concept from nominal logic that I need to introduce at this point is that of equivariance. In order to systematically construct inductively

23 1 Introduction defined types as psets, and then to define functions over terms of such types by giving equations for each of these “constructors”, the involved operations and functions need to be well-behaving, i.e. oblivious to the concrete names involved. This intuition is captured by the following Definition 4 (Equivariance) A function f : X1 → X2 → · · · → Xn → X, n ≥ 0, between psets is called equivariant if

π • f (x1, x2,..., xn) = f (π • x1, π • x2,..., π • xn). 

Most common operations, such as tupling, list concatenation, the con- structors of Exp etc. are equivariant, and this ability to freely move permutations around is crucial to, for example, being able to prove

(λx. e x) = (λy. e y).

1.7 Isabelle

This work has been formalised in the interactive theorem prover Isabelle [NPW02]. Roughly speaking, an interactive theorem prover has the appearance of a text editor that allows the user to write mathematics (definitions, theorems, proofs), with the very peculiar feature that it understands what is written, and either points out problems to the user, or confirms the correctness of the math. There are a number of such systems in use, with Coq [Coq04] and Isabelle being the most prominent examples. One distinguishing feature of Isabelle is its genericity: It provides a meta-logical framework that can be instantiated with different concrete logics. I build on the logic Isabelle/HOL, which implements a typed higher- order logic of total functions, in contrast to, for example, Isabelle/ZF, which builds on untyped set theory à la Zermelo-Fraenkel. Although I, like – presumably – most mathematicians, have been taught mathemat- ics assuming set theory as the foundation of all math, all the actual math that we commonly do happens in an implicitly typed setting, and the choice of Isabelle/HOL over Isabelle/ZF is indeed natural. Furthermore,

24 1.7 Isabelle the tooling provided by Isabelle – libraries of existing formalisations, con- servative extensions, proof automation – is much more comprehensive for Isabelle/HOL.7 This theses builds on and refers to the Isabelle 2016 release.

1.7.1 The prettiness of Isabelle code

One distinguishing feature of Isabelle is its proof language Isar [Nip02], which has a somewhat legible syntax with keywords in English and allows for proofs that are nicely structured and readable. Furthermore, Isabelle supports generating LATEX code from its theory files. So the question arises whether I could have avoided re-writing everything in the hand-written style, by generating the all the definitions, proofs and theorems of this thesis out of my Isabelle theories. For some parts, this would certainly be a viable option. Consider the hand-written proof and the corresponding fragment of the Isabelle theory in Fig. 4, taken from the case for application in the proof of Theorem 2. To a reader who knows some Isabelle syntax, it is pleasing to see how similar the hand-written proof and the Isabelle formalisation are. However, even this carefully selected fragment still has its warts:

• The syntax does not quite match up. For example, the abstract syntax tree node for an application is written explicitly using the App constructor, whereas it is nicer to simply write e x. This is not possible in Isabelle – juxtaposition cannot be overloaded.8 In some cases, I can define custom syntax in Isabelle that comes very close to what I want. The _ ↓Fn _ operator is a good example for that. Unfortunately, this often comes at the cost of extra incon- venience when entering these symbols. In antiquotations, where Isabelle is asked to produce a certain existing term, such as the conclusion of a previously proven lemma, Isabelle can make use of such fancy syntax automatically, and hence for free, but regular

7 In Isabelle 2016, the HOL directory is more than 13 times the size of the ZF directory, measured in lines of code. 8 At least not within reasonable use of the system.

25 1 Introduction

Isabelle theories will be converted to LATEX as they are entered, so in order to get fancy syntax, fancy syntax needs to be typed in.

• An Isabelle formalisation will almost always contain some techni- calities that I would like not to pervade the presentation. A good example for that is the seemingly stray centre dot after Fn: My formalisation uses the HOLCF package [Huf12], which has a type dedicated to continuous functions. This design choice avoids having to explicitly state continuity as a side conditions, but it also means that normal juxtaposition cannot be used to apply such functions, and a dedicated binary operator has to be used explicitly – this is the “·” seen in some of the Isabelle listings in this thesis.

• While Isabelle commands are chosen so that a theory is reminiscent of a proper English text, it is not a great pleasure to read. Many Is- abelle commands (such as by simp) are only relevant to the system, but should be omitted when addressing a human reader, and other bits of technical syntax (e.g. invoking the induction hypothesis as Application.hyps(9)[OF prem1]) would be out of place. There are ways to hide any part of the Isabelle code from the gen- erated LATEX, but these markers would in turn clutter the Isabelle source code, and defeat the purpose of having a faithful represen- tation of the proof in print.

Other parts of the development are even further away from a clean and easy-to-digest presentation, so I chose to keep most of the Isabelle development separate from the dissertation thesis. Appendix A contains a few snippets of the development, namely the main theorems and all definitions that are involved in them. The full formalisation is published in the Archive of Formal Proof [Bre13; Bre15d].

26 1.7 Isabelle

e x {{Γ}}ρ = e {{Γ}}ρ ↓Fn {{Γ}}ρ x J K J K { by the denotation of application } 0 = ńy. e {{∆}}ρ ↓Fn {{Γ}}ρ x J K { by the induction hypothesis } 0 = ńy. e {{∆}}ρ ↓Fn {{∆}}ρ x J K { see above } 0 = e ({{∆}}ρ)(y7→{{∆}}ρ x) J K { by the denotation of lambda abstraction } 0 = e [y := x] {{∆}}ρ J K { by Lemma 5 }

= v {{Θ}}ρ J K { by the induction hypothesis }

have [[ App e x ]]{|Γ|}$ = ([[ e ]]{|Γ|}$) ↓Fn ({|Γ|}$) x CorrectnessOriginal.thy by simp 0 also have ... = ([[ Lam [y]. e ]]{|∆|}$) ↓Fn ({|Γ|}$) x using Application.hyps(9)[OF prem1] by simp 0 also have ... = ([[ Lam [y]. e ]]{|∆|}$) ↓Fn ({|∆|}$) x unfolding ∗.. 0 also have ... = (Fn·(Λ z. [[ e ]]({|∆|}$)(y := z))) ↓Fn ({|∆|}$) x by simp 0 also have ... = [[ e ]]({|∆|}$)(y := ({|∆|}$) x) by simp 0 also have ... = [[ e [y ::= x] ]]{|∆|}$ unfolding ESem_subst.. also have ... = [[ v ]]{|Θ|}$ by (rule Application.hyps(12)[OF prem2]) finally show [[ App e x ]]{|Γ|}$ = [[ v ]]{|Θ|}$.

Figure 4: A hand-written proof and the corresponding Isabelle code

27 1 Introduction

1.7.2 Nominal logic in Isabelle

I have outlined the concepts of nominal logic in Section 1.6 in general terms. In my formalisation, I did not implement this machinery myself, but rather build on the Nominal2 package for Isabelle by Christian Urban and others [UT05; UK12], which provides all the basic concepts of nominal logic, together with tools to work with them. Permutation sets are modelled as types within the type class pt, which fixes the permutation action •. In the context of this type class, the pack- age provides general definitions for support (supp), freshness (fresh, or written infix as ]). Type classes that extend pt with additional require- ments are fs for permutation sets with finite support and pure for pure permutation sets. I define the function fv as the support, restricted to one sort of atoms: definition fv :: 0a::pt ⇒ 0b::at_base set Nominal-Utils.thy where fv e = {v. atom v ∈ supp e} Nominal2 provides the proof method perm_simp which simplifies proof goals involving permutations by pushing them inside expressions as far as possible. It maintains a list of equivariance theorems that the user can extend with equivariance lemmas about newly defined constants. The command nominal_datatype allows the user to conveniently con- struct a permutation set corresponding to a usual, inductive definition with binding structure annotated. See Section 2.6.1 for an example. The constructors of such a data type cannot be used as constructors with Isabelle tools like fun, because they do not completely behave as such. For example, they are not necessarily injective. Therefore, Nominal2 provides the separate command nominal_function to define functions over a nominal data type. It is not completely automatic and requires the user to discharge a number of proof obligations, such as equivariance of the function’s graph and representation independence of the equations. Similarly, Nominal2 provides the command nominal_inductive, which can be used, after defining an inductive predicate as usual with induc- tive, to specify which free variables of a rule should not clash with the

28 1.7 Isabelle context during a proof by induction. It requires the user to prove that the variable is fresh with regard to the conclusion of the rule, and in return generates a stronger induction rule akin to the one shown in Section 1.6.4. The proof method nominal_induct, which can be used in- stead of the usual induct method, supports the additional option avoiding and instantiates the strong induction rule so that the desired additional freshness assumptions become available.

1.7.3 Domain theory and the HOLCF package Applications of domain theory, i.e. the mathematical field that studies certain partial orders, pervade programming language research: They are used to give semantics to recursive functions and to recursive types; they structure program analysis results and tell us how to find fixpoints. As my use of domain theory in this thesis is quite standard, I will elide most of the technicalities and usually state just the partial order used. My domains are of the pointed, chain-complete kind. I consider only ω-chains, i.e. sequences (ai)i∈N with ai v ai+1; completeness of the F domain implies that every such chain has a least upper bound i∈N ai. A domain is called pointed if it has a least element, written ⊥. This choice is motivated by my use of the Isabelle package HOLCF [Huf12], which is a comprehensive suite of definitions and tools for working with domain theory in Isabelle. In particular, it allows me to de- fine possibly complex recursive domains such as the domain used by the resourced denotational semantics in Section 2.3.3, with one command: domain CValue CValue.thy = CFn (lazy (C → CValue) → (C → CValue)) | CB (lazy bool discr)

This will not only define the type CValue, but also the two injection functions CFn and CB, corresponding projection functions and induction principles. The command fixrec can then define functions over such a domains.

29 1 Introduction

The type CValue is then automatically made a member of a number of type classes that come with HOLCF. Most relevant for us are • po for types supporting a partial order, written with square opera- tors and relations, i.e. v, • cpo for complete partial orders, i.e. types in po where additionally every ω-chain has a least upper bound and • pcpo for pointed complete partial order, which extends cpo by the requirement that a least element ⊥ exists. HOLCF introduces a type dedicated to continuous functions, written 0a → 0b, which is separate from Isabelle’s regular function type, written 0a ⇒ 0b. Encoding the continuity of functions in the types avoid having to explicitly assume functions to be continuous in the various lemmas. This is particularly important when some definition is only well- defined if its arguments are continuous, as it is the case for the fixed-point operator fix : ( 0a → 0a) → 0a (with 0a::pcpo, i.e. the type 0a has an instance of the type class pcpo). Without this trick, fix would not be a total function, and working with partial functions in Isabelle is always annoying to some degree. The downside of this design choice is that such continuous functions cannot be applied directly. Therefore, HOLCF introduces an explicit function application operator _·_ : ( 0a → 0b) ⇒ 0a ⇒ 0b. I advise to simply assume this operator is not there when reading Isabelle code using HOLCF. The custom type has further consequences: Existing tools to define new functions, such as definition, fun and the Nominal-specific com- mand nominal_function know how to define normal functions, but are unable to produce values of type 0a → 0b. In these cases, I have to resort to defining the function by using the – again HOLCF-specific – lambda abstraction for continuous functions written (Λ x. e) on the right-hand side of the definition. I can still prove the intended function equations, with the argument on the left-hand side, manually afterwards, as long as the function definition is indeed continuous. The standard proof principle for functions defined in terms of the afore- mentioned fix is fixed-point induction: In order to prove that a predicate

30 1.7 Isabelle

P holds for fix·F, where the functorial F is of type 0a→ 0a with 0a::pcpo, it suffices to prove that • the predicate P is admissible, i.e. if it holds for all elements of a chain, then it holds for the least upper bound of the chain, • P holds for ⊥ and • P holds for any F·x, given that P holds for x. A derived proof principle is that of parallel fixed-point induction which can be used to establish that a binary predicate P (usually an equality or inequality) holds for fix·F and fix·G. This requires a proof that • the predicate P, understood as a predicate on tuples, is admissible, • P ⊥ ⊥ holds and • P (F·x) (F·y) holds, given that P x y holds. Both principles are provided by HOLCF as lemmas, and an extensible set of syntax-directed lemmas helps to take care of the admissibility proof obligation.

31

I mean, ostensibly, yes. Honestly, we hacked most of it together with Perl. Randall Munroe, xkcd #224

CHAPTER 2 Formalizing Launchbury’s natural semantics

Formal semantics are the basic building block of all rigorous program- ming language research. Not only do they force us to think our work through in all details – without a precise definition of the meaning of programs, we cannot conduct any proofs. Therefore, as I do want to be able to prove theorems about my work, I need a suitable semantics, and also implement it in Isabelle. Furthermore, semantics provide a common ground for the research community: If the same semantics are used, then results can easily be compared and combined. Therefore, I should not just define a se- mantics that happens to suit me, but preferably choose an existing, well-established semantics to build on. One such semantics is John Launchbury’s “Natural Semantics for Lazy Evaluation” [Lau93], which has several important traits: It is simple, as it has only four rules. It is detailed enough to model lazy evaluation. It is abstract enough to not model unnecessary details. And it is widely accepted as a standard semantics. Using a standard denotational semantics, Launchbury underpins his natural semantics by claiming correctness (evaluation in the natural semantics preserves denotation) and adequacy (all programs with a

33 2 Formalizing Launchbury’s natural semantics denotation have a derivation in the natural semantics). While he proves correctness in sufficient detail, he only outlines the adequacy proof – an omission that resisted fixing, despite the popularity of the semantics, and despite serious attempts to follow his proof sketch (e.g. [SHO14]). In this chapter, I reproduce Launchbury’s semantics, including sub- sequent improvements by Sestoft [Ses97] and modernisations to how names binding is handled. This yields a definition that is suitable for formalisation in Isabelle. The original correctness proof was almost di- rectly usable in the mechanisation and required only minor adjustments, which I discuss. I then provide a full adequacy proof, where I do not follow Launchbury’s outline directly, but find a more elegant and direct proof. Parts of this chapter, in particular the adequacy proof, has been submitted to the Journal of Functional Programming [Bre15c]. Dedicated sections explicate the differences to Launchbury’s work, serving two purposes: The reasons for deviation can be educational to someone attempting a similar formalisation. Furthermore they are checklists when combining this work with other Launchbury-based developments. Finally, in preparation of Chapter 4, I extend the semantics and the proofs with a simple base type, and introduce a corresponding small- step semantics.

2.1 Launchbury’s semantics

Launchbury defines a semantics for the simple untyped lambda calculus given in Fig. 5, consisting of variables, lambda abstractions, applications and mutually recursive bindings. The set of free variables of an expression e is denoted by fv e; I overload this notation and use fv with arguments of other types that may con- tain variable names. For example for tuples (or, equivalently, multiple arguments), we have fv(Γ, e) = fv Γ ∪ fv e. A variable x is fresh with regard to an expression e (or a similar object) if x ∈/ fv e. The expression e with every free occurrence of x replaced by y is written as e[x := y].

34 2.1 Launchbury’s semantics

x, y, z, w ∈ Var e ∈ Exp ::= ńx. e | e x | x

| let x1 = e1,..., xn = en in e

Figure 5: Launchbury’s core lambda calculus

I equate alpha-equivalent lambda abstractions (ńx. x = ńy. y) and the bound variable is not part of the set of free variables (fv(ńx. y x) = {y}). let bindings are handled likewise. The theoretical foundation used is nominal logic (see Section 1.6). This does impose a few well-formedness side conditions, such as equivariance of definitions over expressions. I skip them in this presentation, and do so with good conscience, as they have been covered in the machine-checked proofs. Note that the term on the right hand side of an application has to be a variable. A general lambda term of the form e1 e2 would have to be pre-processed to let x = e2 in e1 x before it can be handled by my semantics. This restriction simplifies the semantics, as all bindings on the heap are created by a let expression and we do not have to ensure separately that the evaluation of a function’s argument is shared. This is a standard trick applied by Launchbury [Lau93] and others [Ses97; GS01; HH14]. In some of the less formal parts of this thesis, e.g. in examples, I occasionally use expressions as arguments in the interest of readability. This should be understood as a shorthand for the proper, let-bound form.

2.1.1 Natural semantics

Launchbury gives meaning to this language by way of a natural seman- tics. I present his semantics with minor adjustments due to Sestoft and myself, and explain these differences in Section 2.1.3.

35 2 Formalizing Launchbury’s natural semantics

LAM Γ : ńx. e ⇓L Γ : ńx. e

0 0 Γ : e ⇓L ∆ : ńy. e ∆ : e [y := x] ⇓L Θ : v APP Γ : e x ⇓L Θ : v

Γ : e ⇓L∪{x} ∆ : v VAR x 7→ e, Γ : x ⇓L x 7→ v, ∆ : v

dom ∆ ∩ fv(Γ, L) = {} ∆, Γ : e ⇓L Θ : v LET Γ : let ∆ in e ⇓L Θ : v

Figure 6: Launchbury natural semantics, as revised by Sestoft

The semantics is given by a relation

Γ : e ⇓L ∆ : v with the intuition that the expression e within the heap Γ reduces to the value v, while modifying the heap to ∆, while avoiding the names in the set L. The relation is defined inductively by the rules in Fig. 6, which obey the following naming conventions: Γ, ∆, Θ ∈ Heap = Var * Exp v ∈ Val ::= ńx. e A heap is a partial function from variables to expressions (Var * Exp), and usually represented by Γ, ∆ or Θ. The same type is used for the list of bindings in a let. The domain of a heap Γ, written dom Γ, is the set of variables bound by the heap. In contrast to expressions, heaps are not alpha-equated, so we have dom Γ ⊆ fv Γ. I write x 7→ e for the singleton heap and use commas to combine heaps with distinct domains. A v represents a value, i.e. an expression in weak head normal form. So far, the only values are lambda abstractions; this will change when I add Booleans in Section 2.4.2. I use the predicate isVal e to denote that the expression e is a value.

36 2.1 Launchbury’s semantics

The first rule, LAM, does not actually “do” anything: Expression and heap on the left and on the right are the same. This rule thus states that to evaluate an expression that is already a value, nothing has to be done. The second rule, APP, handles evaluation of an application. As we want to model lazy evaluation, first the called expression e is evaluated. The argument x, which by our syntactic restriction is just a variable, is then substituted into the the resulting lambda abstracted expression, and evaluation continues with that. Observe that the argument itself is not necessarily evaluated. Rule VAR takes care of evaluating a variable x. This is only possible if it is mentioned in the heap. During the evaluation of the expression e, the binding x 7→ e is re- moved from the heap: This way, if the evaluation of e would itself require the evaluation of x, the VAR rule does not apply over and over again, but rather the inference is stuck. An inference algorithm derived from these rules would exhibit the same behaviour as a runtime for lazy functional programs that sports blackholing, where a thunk under eval- uation is replaced by a so-called blackhole which, if evaluated, aborts the program [Pey92]. This rule also implements sharing: After having evaluated e to a value v, this is not only returned as the result of the computation, but also added to the resulting heap as the new binding for x. This updating of x ensures that any further evaluation of x will immediately return with its once evaluated value. The final rule, LET, implements let-bindings, which may be mutually recursive, simply by moving them to the heap. The let-expression itself represents an alpha-equivalency class and hence does not have names for the bound values, so it is the application of this rule that actually determines dom ∆, and the first assumption of the rule ensures that these variables do not clash with existing ones. The set L was not present in Launchbury’s rules. It was added by Sestoft [Ses97] to keep track of variables that must be avoided when choosing new names in the LET rule, but would otherwise not be present in the judgement any more, because they were blackholed by VAR.I explain this modification in greater detail in Section 2.1.3).

37 2 Formalizing Launchbury’s natural semantics

The semantics has a few noteworthy properties, which I describe in the following lemmas. Evaluation does not forget bindings: Lemma 1 If Γ : e ⇓L ∆ : v then dom Γ ⊆ dom ∆.

Proof by induction on the derivation of Γ : e ⇓L ∆ : v.  Furthermore, names that appear as new bindings on the heap do not clash with any names in the set L: Lemma 2 If Γ : e ⇓L ∆ : v then (dom ∆ \ dom Γ) ∩ L = {}.

Proof by induction on the derivation of Γ : e ⇓L ∆ : v. In the case for let expressions, we use that the names chosen for the bound variables are fresh with regard to L, as dom ∆ ∩ fv(Γ, e, L) = {}. 

I consider a judgement Γ : e ⇓L ∆ : v to be closed if fv(Γ, e) ⊆ dom Γ ∪ L, i.e. all occurring names are either bound in the heap, or explicitly listed in the set L of names to avoid. Note that this is preserved by the semantics in the following sense: Lemma 3 If Γ : e ⇓L ∆ : v holds and fv(Γ, e) ⊆ dom Γ ∪ L, then fv(∆, v) ⊆ dom ∆ ∪ L.

Proof 0 In light of Lemma 1, this follows from: If Γ : e ⇓L ∆ : v holds, and x is fresh with regard to Γ and e, then x0 is either also fresh with regard to ∆ and v, or x0 appears in dom ∆, in which case x0 ∈/ L must hold. I prove this by induction on the derivation. Case: LAM This case is trivial. Case: APP By the first induction hypothesis, there are two subcases to consider:

38 2.1 Launchbury’s semantics

• The variable x0 is still fresh with regard to ∆ and ńy. e0. In order to invoke the second induction hypothesis, we need to show that x0 is fresh with regard to e0[y := x]. This is the case, as x0 6= x, by the assumption, and either x0 = y, or x is fresh with regard to e0.

• The variable x0 appears in dom ∆ and is not in L. By Lemma 1, it then also appears in dom Θ.

Case: VAR As x0 is fresh with regard to (x 7→ e, Γ) and x, we have x0 6= x. Fur- thermore, x0 is also fresh with regard to Γ and e, so we can invoke the induction hypothesis.

• If x0 is still fresh with regard to ∆ and v, then it is also fresh with regard to (x 7→ v, ∆), as x0 6= x.

• If x0 ∈ dom ∆ and x0 ∈/ L ∪ {x}, we obviously also have that x0 ∈ dom (x 7→ v, ∆) and x ∈/ L.

Case: LET As this case introduces new names on the heap, this decides what side of the disjunction in the proposition x0 ends up in.

• If x0 ∈ dom ∆, then x0 ∈ dom (∆, Γ), and by Lemma 1, x0 ∈ dom Θ. 0 Also, by the freshness condition on LET, x ∈/ L.

• Otherwise, x is fresh with regard to (∆, Γ) and e by the assumption, so we can invoke the induction hypothesis. 

2.1.2 Denotational semantics

In order to show that the natural semantics behaves as expected, Launch- bury defines a standard denotational semantics for expressions and heaps, following Abramsky [Abr90]. The semantic domain Value is the initial solution to the domain equation

Value = (Value → Value)⊥,

39 2 Formalizing Launchbury’s natural semantics in the category of pointed chain-complete partial orders with continuous functions. In this domain, we can distinguish ⊥ from λx.⊥. The injection

Fn(_): (Value → Value) → Value turns values of type Value → Value into non-bottom values of Value, while the deconstruction

_ ↓Fn _: Value → Value → Value does the converse, defined as ( f v2 if v1 = Fn( f ) v1 ↓Fn v2 = ⊥ otherwise.

The partial order v on Value is the usual, with Fn( f ) v Fn(g) if ∀x. f x v g x for f , g ∈ Value → Value, and ⊥ below everything. A semantic environment maps variables to values,

ρ ∈ Env = Var → Value, and the initial environment ⊥ maps all variables to ⊥. Environments are ordered by lifting the order on Value pointwise. With the domain of an environment ρ, written dom ρ, I denote the set of variables that are not mapped to ⊥. The environment ρ|S, where S is a set of variables, is the restriction of ρ to S: ( ρ x, if x ∈ S (ρ|S) x = ⊥ if x 6∈ S.

The environment ρ \ S is defined as the restriction of ρ to the complement of S, i.e. ρ \ S := ρ|Var\S. The semantics of expressions and heaps are mutually recursive. The meaning of an expression e ∈ Exp in an environment ρ ∈ Env is written as e ρ ∈ Value and is defined by the equations in Fig. 7. J K

40 2.1 Launchbury’s semantics

ńx. e ρ := Fn(λv. e ρt[x7→v]) J K J K e x ρ := e ρ ↓Fn ρ x J K J K x ρ := ρ x J K let ∆ in e ρ := e {{∆}}ρ. J K J K

Figure 7: The standard denotational semantics

We can map this function over a heap to obtain an environment: ( e ρ, if (x 7→ e) ∈ Γ Γ ρ x := J K J K ⊥ if x ∈/ dom Γ.

The semantics of a heap Γ ∈ Heap in an environment ρ, written {{Γ}}ρ ∈ Env, is then obtained as a least fixed point:

0 {{Γ}}ρ = (µρ . ρ ++dom Γ Γ ρ0 ) J K where ( 0 ρ x, if x ∈/ S (ρ ++S ρ ) x := ρ0 x, if x ∈ S is a restricted update operator. The least fixed point exists, as all involved operations are monotone and continuous, and by unrolling the fixed point once, we can see Lemma 4 (Application of the heap semantics)

( e {{Γ}}ρ, if (x 7→ e) ∈ Γ ({{Γ}}ρ) x = J K ρ x, if x ∈/ dom Γ. 

The following substitution lemma plays an important role in finding a more direct proof of adequacy, but is also required for the correctness proof, as performed by Launchbury:

41 2 Formalizing Launchbury’s natural semantics

Lemma 5 (Semantics of substitution)

e ρ(y7→ρ x) = e[y := x] ρ.  J K J K

Proof We first show ∀ρ. ρ x = ρ y =⇒ {{e}}ρ = {{e[y := x]}}ρ by induction on e, using parallel fixed-point induction in the case for let. This allows us to calculate

e ρ(y7→ρ x) = e[y := x] ρ(y7→ρ x) { as ρ(y 7→ ρ x) x = ρ(y 7→ ρ x) y } J K J K = e[y := x] ρ { as y ∈/ fv(e[y := x]).} J K  I sometimes write {{Γ}} instead of {{Γ}}⊥. In an expression {{Γ}}({{∆}}ρ) I omit the parentheses and write {{Γ}}{{∆}}ρ.

2.1.3 Discussions of modifications

It is rare that a formal system developed with pen and on paper can be formalised to the letter, partly because of vagueness (what, exactly, is a “completely” fresh variable?), partly because of formalisation con- venience, and partly because the stated facts – even if morally correct – are wrong when read scrupulously. Launchbury’s work is no exception. This section discusses the required divergence from Launchbury’s work.

Naming

Getting the naming issues right is one of the major issues when for- malising anything involving bound variables. In Launchbury’s work, the names are manifestly part of the syntax, i.e. ńx. x 6= ńy. y, and his rules involve explicit renaming of bound variables to fresh ones in the rule VAR. His definition of freshness is a global one, so the validity of a derivation using VAR depends on everything around it. This is morally what we want, but very impractical. Sestoft [Ses97] noticed this problem and fixed it by adding a set L of variables to the judgement, so that every variable to be avoided occurs

42 2.1 Launchbury’s semantics somewhere in Γ, e, or L. Instead of renaming all bound variables in the rule VAR, he chooses fresh names for the new heap bindings in the rule LET. I build on that, but go one step further and completely avoid bound names in the expressions, i.e. ńx. x = ńy. y. I still have them in the syntax, of course, but these are just representatives of an α-equivalency class. Nominal logic (cf. Section 1.6) forms the formal foundation for this. So in my rule LET I do not have to rename the variables, but simply may assume that the variables used in the representation of the let-expression are sufficiently fresh. The names of bindings on the heap are not abstracted away in that manner; this follows [SHO12].

Closed judgements

Launchbury deliberately allows non-closed configurations in his deriva- tions, i.e. configurations with free variables in the terms that have no corresponding binding on the heap. This is a necessity, as rule VAR models blackholing by removing a binding from the heap during its evaluation. With the addition of the set of variables to avoid, which will always contain such variables, the question of whether non-closed configura- tions should be allowed can be revisited. And indeed, Sestoft defines the notion of L-good configurations, where all free variables are either bound on the heap, or contained in L. He shows that this property is preserved by the operational semantics and subsequently considers only L-good configurations. I follow this example with my definition of closed judgements. Threading the closedness requirement through a proof by rule induction is a typical chore contributing to the overhead of a machine-checked formalisation.

43 2 Formalizing Launchbury’s natural semantics

Join vs. update

Launchbury specifies his denotational semantics using a binary oper- ation t on environments: 0 {{Γ}}ρ = (µρ . ρ t Γ ρ0 ) J K He does not define it explicitly, but the statements in his Section 5.2.1 leave no doubt that he indeed intended this operation to denote the least upper bound of its arguments, as one would expect. Unfortunately, with this definition, his Theorem 2 is false. The proposition of the theorem (which corresponds to Theorem 2 in this document) is

Γ : e ⇓L ∆ : v =⇒ ∀ρ ∈ Env. e {{Γ}}ρ = v {{∆}}ρ J K J K and a counter example is given by

e = x, v = (ńa. let b = b in b), Γ = ∆ = (x 7→ v), and ρ = (x 7→ Fn(λ_.Fn(λx.x))).

Note that the denotation of v is Fn(λ_.⊥) in every environment. We have Γ : e ⇓{} ∆ : v, so according to the theorem, e {{Γ}}ρ = v {{∆}}ρ should hold, but the following calculation show thatJ K it does not:J K

 e {{Γ}}ρ = {{Γ}}ρ x J K = ρ x t v {{Γ}}ρ J K = Fn(λ_.Fn(λx.x)) t Fn(λ_.⊥) = Fn(λ_.Fn(λx.x) t ⊥) = Fn(λ_.Fn(λx.x)) 6= Fn(λ_.⊥)

= v {{∆}}ρ. J K

44 2.2 Correctness

The crucial property of the counter-example is that ρ contains compatible, but better information for a variable also bound in Γ. The mistake in his correctness proof is in the step ({{x 7→ v, ∆}}ρ) x = v {{x7→v,∆}}ρ in the J K case for VAR, which should be ({{x 7→ v, ∆}}ρ) x = v {{x7→v,∆}}ρ t ρ x. J K Intuitively, such rogue ρ are not relevant for a proof of the main The- orem 1. Nevertheless, this issue needs to be fixed before attempting a formal proof. One possible fix is to replace t by a right-sided update operation that just throws away information from the left argument for 0 those variables bound on the right. The syntax ρ ++S ρ denotes this operation. If that is used instead of the least upper bound, then the proof goes through in full rigour. It is slightly annoying having to specify the set S in this operation explicitly, as it is usually clear “from the context”: Morally, it is the set of variables that the object on the right talks about. But as environments, i.e. total functions from Var → Value, do not distinguish between variables not mentioned at all and variables mentioned, but bound to ⊥, this information is not easily exploitable in a formal setting. For the same reason Theorem 2 uses the more explicit equality be- tween restricted environments instead of Launchbury’s ordering ≤ on environments. I elaborate on this in Section 2.2.1.

2.2 Correctness

The main correctness theorem for the natural semantics is Theorem 1 (Correctness) If Γ : e ⇓L ∆ : v holds and is closed, then

e {{Γ}} = v {{∆}}.  J K J K A proof by rule induction requires the following generalisation: Theorem 2 (Correctness, generalized) If Γ : e ⇓L ∆ : v holds and is closed, then for all environments ρ ∈ Env, we have

e {{Γ}}ρ = v {{∆}}ρ and ({{Γ}}ρ)|dom Γ = ({{∆}}ρ)|dom Γ.  J K J K

45 2 Formalizing Launchbury’s natural semantics

The proof follows Launchbury’s steps, but differs in some details. In the interest of a self-contained presentation, I give the full proof here. Two technical lemmas used in the proof are stated and proved subsequently. = 0 | = 0| For clarity, ρ |S ρ abbreviates ρ S ρ S. Proof by induction on the derivation of Γ : e ⇓L ∆ : v. Note that in such a derivation, all occurring judgements are closed. Case: LAM This case is trivial. Case: APP e = ńy e0 {{ }} = The induction hypotheses are {{Γ}}ρ . {{∆}}ρ and Γ ρ |dom Γ {{ }}ρ e0[y := x] J K= v J K{{ }}ρ = {{ }}ρ ∆ as well as {{∆}}ρ {{Θ}}ρ and ∆ |dom ∆ Θ . We have {{Γ}}ρJ x = {{∆K}}ρ x: IfJ xK ∈ dom Γ, this follows from the induction hypothesis. Otherwise, we know x ∈ L, as the judgement is closed, and the new names bound in ∆ avoid L, so we have ρ x on both sides. While the second part follows from the corresponding induction hy- potheses and dom Γ ⊆ dom ∆ (Lemma 1), the first part is a simple calcu- lation:

e x {{Γ}}ρ = e {{Γ}}ρ ↓Fn {{Γ}}ρ x J K J K { by the denotation of application } 0 = ńy. e {{∆}}ρ ↓Fn {{Γ}}ρ x J K { by the induction hypothesis } 0 = ńy. e {{∆}}ρ ↓Fn {{∆}}ρ x J K { see above } 0 = e ({{∆}}ρ)(y7→{{∆}}ρ x) J K { by the denotation of lambda abstraction } 0 = e [y := x] {{∆}}ρ J K { by Lemma 5 }

46 2.2 Correctness

= v {{Θ}}ρ J K { by the induction hypothesis }

Case: VAR 0 0 e 0 = v 0 {{ }} = {{ }} We know that {{Γ}}ρ {{∆}}ρ and Γ ρ |dom Γ ∆ ρ for all environments ρJ0. K J K We begin with the second part:

0 0 {{x 7→ e, Γ}}ρ = µρ . (ρ ++dom Γ{{Γ}}ρ )[x 7→ e {{Γ}}ρ0 ] J K { by the following Lemma 6 } 0 0 = µρ . (ρ ++dom Γ{{Γ}}ρ )[x 7→ v {{∆}}ρ0 ] J K by the induction hypothesis. Note that { } we invoke it for ρ0 with ρ0 6= ρ! 0 0 =| µρ . (ρ ++dom ∆{{∆}}ρ )[x 7→ v {{∆}}ρ0 ] dom (x7→e,Γ) J K { by the induction hypothesis; see below } = {{x 7→ v, ∆}}ρ { again by Lemma 6 }

The second but last step is quite technical, as the |dom (x7→e,Γ) op- erator needs to commute with the fixed-point operator. This goes through by parallel fixed-point induction if we first generalise it to |Var\dom ∆ ∪ dom (x7→e,Γ), the restriction to the complement of the new vari- ables added to the heap during evaluation of x. The first part now follows from the second part:

x {{x7→e,Γ}}ρ = ({{x 7→ e, Γ}}ρ) x J K = ({{x 7→ v, ∆}}ρ) x { by the second part and x ∈ dom (x 7→ e, Γ) }

= v {{x7→v,∆}}ρ J K { by Lemma 4. }

47 2 Formalizing Launchbury’s natural semantics

Case: LET We know that e {{∆,Γ}}ρ = v {{Θ}}ρ and {{∆, Γ}}ρ =| {{Θ}}ρ. For J K J K dom (∆,Γ) the first part we have

let ∆ in e {{Γ}}ρ = e {{∆}}{{Γ}}ρ { by the denotation of let-expressions } J K J K = e {{∆,Γ}}ρ { by the following Lemma 7 } J K = v {{Θ}}ρ { by the induction hypothesis } J K and for the second part we have

{{ }} = {{ }}{{ }} dom Γ ρ |dom Γ ∆ Γ ρ { because ∆ are fresh } = {{∆, Γ}}ρ { again by Lemma 7 } = {{Θ}}ρ. { by the induction hypothesis. } |dom (∆,Γ) 

In the case for VAR, I switched from the usual, simultaneous definition of the heap semantics to an iterative one, in order to be able to make use of the induction hypothesis: Lemma 6 (Iterative definition of the heap semantics)

0 0  {{x 7→ e, Γ}}ρ = µρ . (ρ ++dom Γ {{Γ}}ρ )[x 7→ e {{Γ}}ρ0 ] .  J K A corresponding lemma can be found in Launchbury [Lau93], but with- out proof. As the proof involves some delicate juggling of fixed points, I include it here in detail: Proof Let

0  L = λρ . ρ ++dom (x7→e,Γ) x 7→ e, Γ ρ0 J K be the functorial of the fixed point on the left hand side, and

0 0  = λρ . (ρ ++dom Γ {{Γ}}ρ )[x 7→ e {{Γ}}ρ0 ] . J K the functorial of the fixed point on the right hand side.

48 2.2 Correctness

By Lemma 4, we have

0 0 (µL) y = e µL for y 7→ e ∈ dom Γ, (1) J K (µL) x = e µL, (2) J K (µL) y = ρ y for y ∈/ dom (x 7→ e, Γ). (3)

Similarly, by unrolling the fixed points, we have

0 0 (µR) y = e {{Γ}}(µR) for y 7→ e ∈ dom Γ, (4) J K (µR) x = e {{Γ}}(µR), (5) J K (µR) y = ρ y for y ∈/ dom (x 7→ e, Γ), (6) and also for ρ0 ∈ Env (in particular for ρ0 = (µL) and ρ0 = (µR)), again using Lemma 4,

0 0 0 ({{Γ}}ρ ) y = e {{Γ}}ρ0 for y 7→ e ∈ dom Γ, (7) J K ({{Γ}}ρ0) y = ρ0 y for y ∈/ dom Γ. (8)

We obtain

{{Γ}}(µR) = (µR) (9) from comparing (4)–(6) with (7) and (8). We can also show

{{Γ}}(µL) = (µL), (10) by antisymmetry of v and using that least fixed points are least pre-fixed points:

v: We need to show that (µL) ++dom Γ Γ (µL) v (µL), which follows from (1). J K

w: We need to show that

{{Γ}}(µL) ++dom (x7→e,Γ) x 7→ e, Γ {{Γ}}(µL) v {{Γ}}(µL). J K For dom Γ, this follows from (7), so we show e {{Γ}}(µL) v (µL) x = J K e (µL), which follows from the monotonicity of e _ and case v. J K J K

49 2 Formalizing Launchbury’s natural semantics

To show the conclusion of the lemma, i.e. (µL) = (µR), we again use antisymmetry and the leastness of least fixed points:

v: We need to show that L (µR) = µR, i.e. – ρ y = (µR) y for y ∈/ dom (x 7→ e, Γ), which follows from (6), 0 0 – e µR = (µR) y for y 7→ e ∈ Γ, which follows from (4) and (9)J K and – e µR = (µR) x, which follows from (5) and (9). J K w: Now we have to show that R (µL) = (µL), i.e. – ρ y = (µL) y for y ∈/ dom (x 7→ e, Γ), which follows from (3), 0 0 – e {{Γ}}(µL) = (µL) y for y 7→ e ∈ Γ, which follows from (1) andJ K (10), and – e {{Γ}}(µL) = (µL) x, which follows from (2) and (10). J K  The final lemma required for the correctness proof shows that the de- notation of a set of bindings with only fresh variables can be merged with the heap it was defined over: Lemma 7 (Merging the heap semantics) If dom ∆ is fresh with regard to Γ and ρ, then

{{∆}}{{Γ}}ρ = {{∆, Γ}}ρ. 

Proof We use the antisymmetry of v, and the leastness of least fixed points.

v: We need to show that {{Γ}}ρ ++dom ∆ ∆ {{∆,Γ}}ρ = {{∆, Γ}}ρ, which we verify pointwise. J K – For x ∈ dom ∆, this follows directly from Lemma 4. – For x ∈/ dom ∆, this holds as the variables bound in ∆ are fresh, so the bindings in {{Γ}}ρ keep their semantics.

w: We need to show that ρ ++dom (∆,Γ) ∆, Γ {{∆}}{{Γ}}ρ = {{∆}}{{Γ}}ρ. J K – For x ∈ dom ∆, this follows from unrolling the fixed point on the right hand side once.

50 2.2 Correctness

– For x 7→ e ∈ dom Γ (and hence x ∈/ dom ∆), we have

(ρ ++dom (∆,Γ) ∆, Γ {{∆}}{{Γ}}ρ) x J K = e {{∆}}{{Γ}}ρ J K { by Lemma 4 }

= e {{Γ}}ρ J K { because dom ∆ is fresh with regard to e } = ({{Γ}}ρ) x { by unrolling the fixed point }

= ( ∆ {{Γ}}ρ) x J K { because x ∈/ dom ∆ and Lemma 4. }

– For x ∈/ dom (∆, Γ), we have ρ x on both sides. 

2.2.1 Discussions of modifications

My main Theorem 1 and the generalisation in Theorem 2 differ from Launchbury’s corresponding Theorem 2. The additional requirement that the judgements are closed has already been discussed in Section 2.1.3. Furthermore, the second part of Theorem 2 is phrased differently. Launchbury states {{Γ}}ρ ≤ {{∆}}ρ where ρ ≤ ρ0 is defined as

∀x. ρ x 6= ⊥ =⇒ ρ x = ρ0 x, i.e. ρ0 agrees with ρ on all variables that have a meaning in ρ. The issue with this definition is that there are two reasons why {{Γ}}ρ x = ⊥ can hold: Either x ∈/ dom Γ, or x ∈ dom Γ, but bound to a diverging value. Only the first case is intended here, and actually ≤ is used as if only that case can happen, e.g. in the treatment of VAR in the correctness proof. I therefore avoid the problematic ≤ relation and explicitly show {{ }} = {{ }} Γ ρ |dom Γ ∆ ρ.

51 2 Formalizing Launchbury’s natural semantics

2.3 Adequacy

A correctness theorem for a natural semantics is not worth much on its own. Imagine a mistake in side condition of the LET rule that accidentally prevents any judgement to be derived for programs with a let – the correctness theorem would still hold. It is therefore desirable to prove that all programs that have a meaning, in this case according to the denotational semantics, indeed have a derivation: Theorem 3 (Adequacy) For all expressions e, heap Γ and set of variables L, if e {{Γ}} 6= ⊥, then J K there exists a heap ∆ and a value v so that Γ : e ⇓L ∆ : v. 

The proof uses a modified denotational semantics that keeps track of the number of steps required to determine the non-bottomness of e, which I introduce in the next subsection. I will then show that the natural semantics is adequate with regard to the modified denotational seman- tics, and make the connection by showing how the two denotational semantics relate.

2.3.1 The resourced denotational semantics

The domain used to count the resources is a solution to the equation C = C⊥. The lifting is done by the injection function C : C → C, so the elements are

n ∞ ⊥ @ C ⊥ @ C (C ⊥) @ ··· @ C @ ··· @ C

This is isomorphic to the extended naturals. I use r for variables ranging over C. The notation f |r restricts a function f with domain C to take at 0 0 most r resources: ( f |r) r := f (r u r ). The resourced semantics N e σ of an expression e in environment σ is now a function which takesJ K an additional argument r of type C to indicate the number of steps the semantics is still allowed to perform: Every recursive call in the definition of N e σ r peels one application of C off r until none are left. J K

52 2.3 Adequacy

N e σ ⊥ := ⊥ J K N ńx. e σ (C r) := CFn (λv.N e σt[x7→v]|r) J K J K N e x σ (C r) := ((N e σ r) ↓CFn (σ x)|r) r J K J K N x σ (C r) := σ x r J K N let ∆ in e σ (C r) := N e {{∆}}σ r J K J K

Figure 8: The resourced denotational semantics

The type of the environment changes as well: It is now Var → (C → CValue). I use σ for variables ranging over such resourced environments. The intuition is that if we pass in an infinite number of resources, the two semantics coincide:

∞ ∞ ∀x. ρ x = σ x C =⇒ e ρ = N e σ C , J K J K as Launchbury puts it. While this intuition is intuitively true, it cannot be stated that naively: Because the semantics of an expression is now a function taking a C, this needs to be reflected in the domain equation, which therefore constructs a different domain, as observed by Sánchez- Gil et al.:

CValue = ((C → CValue) → (C → CValue))⊥ The lifting and the projection functions are hence

CFn (_): (C → CValue) → (C → CValue) → CValue

_ ↓CFn _: CValue → (C → CValue) → (C → CValue). The definition of the resourced semantics, given in Fig. 8, resembles the definition of the standard semantics with some additional resource bookkeeping. The semantics of the heap is defined as before: 0 N{{Γ}}σ := (µσ . σ ++dom Γ N Γ σ0 ). J K Given the similarity between this semantics and the standard seman- tics, it is not surprising that Lemmas 4, 5, 6 and 7 hold as well. In fact,

53 2 Formalizing Launchbury’s natural semantics in the formal development, they are stated and proved abstractly, using locales [Bal14] as a modularisation tool, and then simply instantiated for both variants of the semantics. I describe this approach in Section 2.6.3. The correctness lemma needs some adjustments, as a more evalu- ated expression requires fewer resources. It therefore only provides an inequality: Lemma 8 (Correctness, resourced) If Γ : e ⇓L ∆ : v holds and is closed, then for all resourced envi- ronments σ we have N e {{Γ}}σ v N v {{∆}}σ and (N{{Γ}}σ)|dom Γ v J K J K (N{{∆}}σ)|dom Γ.  Proof Analogously to the proof of Theorem 2. 

2.3.2 Denotational black holes

The major difficulty in proving computational adequacy is the black- holing behaviour of the operational semantics: During the evaluation of a variable x the corresponding binding is removed from the heap. Operationally, this is desirable: If the variable is called again during its own evaluation, we would have an infinite loop anyways. But obviously, the variable is still mentioned in the current config- uration, and simply removing the binding will change the denotation of the configuration in unwanted ways: There is no hope of proving N e N{{x7→e,Γ}} = N e N{{Γ}}. JButK a weaker statementJ K holds, which reflects the idea of “not using x during its own evaluation” more closely: Lemma 9 (Denotational blackholing)

N e N{{x7→e,Γ}}r 6= ⊥ =⇒ N e N{{Γ}}r 6= ⊥  J K J K This is a consequence of the following lemma, which states that dur- ing the evaluation of an expression using finite resources, only fewer resources will be passed to the members of the environment (which are of type C → CValue):

54 2.3 Adequacy

Lemma 10

N e σ|C r = N e (σ| )|C r J K J K r where σ|r is an abbreviation for λx.(σ x)|r.  Proof by induction on the expression e. N | = N | In order to show e σ C r e (σ|r) C r it suffices to show that N ( 0) = N J (K 0) J K 0 v e σ C r e (σ|r) C r for any r r. JTheK critical caseJ isK the one for variables, where e = x. We have

0 0 0 0 N x σ (C r ) = σ x r = (σ x|r) r = N x (σ| ) (C r ) J K J K r as r0 v r. In the other cases, the result follows from the fact that nested expres- sions are evaluated with r0 resources or, in the case of lambda abstraction, wrapped inside a |r0 restriction operator. For the case of let, a related lemma for heaps needs to be proven by par- allel fixed-point induction, namely ∀r. (N{{Γ}}σ)|r = (N{{Γ}}(σ|r))|r.  Equipped with this lemma, we can begin the Proof (of Lemma 9) 0 0 Let r be the least resource such that N e N{{x7→e,Γ}}(C r ) 6= ⊥. Such an r0 exists by the assumption, and C r0 vJ rK, and by the continuity of the 0 ∞ 0 semantics r 6= C . In particular, N e N{{x7→e,Γ}}r = ⊥. We first show J K N{{x 7→ e, Γ}}|r0 v N{{Γ}} (∗) by bounded fixed-point induction. So given an arbitrary environment σ v N{{x 7→ e, Γ}}, we may assume σ|r0 v N{{Γ}} and have to prove N x 7→ e, Γ σ|r0 v N{{Γ}}, which we do point-wise: JFor y 7→ Ke0 ∈ Γ, this follows from

0 N x 7→ e, Γ σ|r0 y = N e σ|r0 J K J 0K = N e | 0 { by Lemma 10 } σ|r0 r J 0K v N e σ| 0 J K r

55 2 Formalizing Launchbury’s natural semantics

0 v N e N{{Γ}} { by the induction hypothesis } J K = N{{Γ}} y { by Lemma 4 } while for x, this follows from

N x 7→ e, Γ σ|r0 x = N e σ|r0 J K J K v N e N{{x7→e,Γ}}|r0 { using σ v N{{x 7→ e, Γ}} } J K = ⊥ { by the choice of r0 } = N{{Γ}} x { as x ∈/ dom Γ.} So we can conclude the proof with 0 0 ⊥ @ N e N{{x7→e,Γ}}(C r ) { by the choice of r } J K = N e (C r0) { by Lemma 10 } N{{x7→e,Γ}}|r0 J K 0 v N e N{{Γ}}(C r ) { by (∗) } J K 0 v N e N{{Γ}}r { as C r v r } J K 

2.3.3 Resourced adequacy

Now the necessary tools to handle blackholing are in place for the ade- quacy proof with regard to the resourced semantics. Lemma 11 (Resourced semantics adequacy) For all e, Γ and L, if N e N{{Γ}} r 6= ⊥, then there exists ∆ and v so that J K Γ : e ⇓L ∆ : v.  Proof Because the semantics is continuous, it suffices to show this for r = Cn ⊥, and perform induction on this n, with arbitrary e, Γ and L. 0 The case r = C ⊥ = ⊥ is vacuously true, as N e N{{Γ}} ⊥ = ⊥. For the inductive case assume that the lemmaJ holdsK for r, and that N e N{{Γ}} (C r) 6= ⊥. We proceed by case analysis on the expression e. J K Case: e = x. From the assumption we know that Γ = x 7→ e0, Γ0 for some e0 and Γ0, as otherwise the denotation would be bottom, and furthermore that 0 N e N{{x7→e0,Γ0}} r 6= ⊥ J K

56 2.3 Adequacy

0 With Lemma 9 this implies N e N{{Γ0}} r 6= ⊥, so the induction hypothe- J K 0 0 sis applies and provides ∆ and v with Γ : e ⇓L∪{x} ∆ : v. This implies 0 0 x 7→ e , Γ : x ⇓L ∆ : v by rule VAR, as desired. Case: e = e0 x. Assume that fv(Γ, e0) ⊆ L. No generality is lost here: If a derivation in the natural semantics with a larger set of variables to avoid than required holds, then the same derivation is also valid with the required set L. 0 From the assumption we know (N e N{{Γ}} r ↓CFn (N{{Γ}} x)|r) r 6= 0 J K ⊥. In particular (N e N{{Γ}} r) 6= ⊥, so by the induction hypothesis we 00 J K 0 00 have ∆, y and e with Γ : e ⇓L ∆ : ńy. e , the first hypothesis of APP. This judgement is closed by the extra assumption, so Lemma 8 ensures 0 00 that N e N{{Γ}} v N ńy. e N{{∆}} and N{{Γ}}v N{{∆}}. We can insert that intoJ theK inequalityJ aboveK to calculate 0  ⊥ @ N e N{{Γ}} r ↓CFn (N{{Γ}} x)|r r J K 00  v N ńy. e N{{∆}} r ↓CFn (N{{∆}} x)|r r J 00K  v N ńy. e N{{∆}} r ↓CFn N{{∆}} x r J K 00  = CFn (λv. N e N{{∆}}t[y7→v]|r) ↓CFn N{{∆}} x r J 00K  v CFn (λv. N e N{{∆}}t[y7→v]) ↓CFn N{{∆}} x r 00 J K = N e N{{∆}}t[y7→(N{{∆}} x)] r J 00K = N e [y := x] N{{∆}} r { by Lemma 5 } J K which, using the induction hypothesis again, provides us with Θ and 00 v so that the second hypothesis of APP, ∆ : e [y := x] ⇓L Θ : v, holds, concluding this case. Case: e = ńy. e0 0 This case follows immediately from rule LAM with ∆ = Γ and v = ńy. e . Case: e = let ∆ in e0 We have 0 ⊥ @ N let ∆ in e N{{Γ}} r J 0 K v N e N{{∆}}N{{Γ}} J 0K = N e N{{∆,Γ}} { by Lemma 7 } J K

57 2 Formalizing Launchbury’s natural semantics

0 0 so we have Θ and v with ∆, Γ : e ⇓L Θ : v and hence Γ : let ∆ in e ⇓L Θ : v by rule LET, as desired. 

2.3.4 Relating the denotational semantics

Lemma 11 is almost what we want, but it talks about the resourced denotational semantics. In order to obtain that result for the standard denotational semantics, we need to relate these two semantics. We cannot simply equate them, as they have different denotational domains Value and C → CValue. So we are looking for a relation /. between Value and CValue that expresses the intuition that they behave the same, if the latter is given infinite resources. In particular, it is specified by the two equations

⊥ /. ⊥ and (∀x y. x /. y C∞ =⇒ f x /. g y C∞) ⇐⇒ Fn( f ) /. CFn (g).

Unfortunately, this is not admissible as an inductive definition, as it is self-referential in a non-monotone way, so the construction of this relation is non-trivial. This was observed and performed by Sánchez-Gil et al. [SHO11], and I have subsequently implemented this construction in Isabelle. I lift this relation to environments ρ ∈ Env and resourced environ- ments σ ∈ Var → (C → Value) by

ρ /.∗ σ ⇐⇒ ∀x. ρ x /. σ x C∞.

This allows us to state precisely how the two denotational semantics are related: Lemma 12 (The denotational semantics are related) For all environments ρ ∈ Env and σ ∈ Var → (C → Value) with ρ /.∗ σ, we have ∞ e ρ /. N e σ C .  J K J K

58 2.3 Adequacy

Proof Intuitively, the proof is obvious: As we are only concerned with infinite resources, all the resource counting added to the denotational semantics becomes moot and the semantics are obviously related. A more rigorous proof can be found in [SHO11] and in my formal verification.  Corollary 13 For all heaps Γ, we have {{Γ}} /.∗ N{{Γ}}.

Proof by parallel fixed-point induction and Lemma 12.  I describe my Isabelle formalisation of [SHO11] in Section 2.6.4, including the mistakes in the original work that I found and fixed.

2.3.5 Concluding the adequacy

With this in place, I can give the Proof (of Theorem 3) By Corollary 13 we have {{Γ}} /.∗ N{{Γ}}, and with Lemma 12 this ∞ implies e {{Γ}} /. N e N{{Γ}} C . J K J K With the assumption e {{Γ}} 6= ⊥ and the definition of /. this ensures ∞ J K that N e {{Γ}} C 6= ⊥, and we can apply Lemma 11, as desired. J K  2.3.6 Discussions of modifications

My adequacy proof diverges quite a bit from Launchbury’s. As it is the first rigorous proof, I discuss the differences in greater detail. Launchbury performs the adequacy proof by introducing an alterna- tive natural semantics (ANS) that is closer to the denotational semantics than the original natural semantics (NS). He replaces the rules APP and VAR with the two rules given in Fig. 9. There are three differences to be spotted: 1. In the rule for applications, instead of substituting the argument x for the parameter y, the variable y is added to the heap, bound to x, adding an indirection.

59 2 Formalizing Launchbury’s natural semantics

0 0 Γ : e ⇓L ∆ : ńy. e y 7→ x, ∆ : e ⇓L Θ : v APP’ Γ : e x ⇓L Θ : v

x 7→ e, Γ : e ⇓L ∆ : v VAR’ x 7→ e, Γ : x ⇓L ∆ : v

Figure 9: Launchbury’s alternative natural semantics

2. In the rule for variables, no update is performed: Even after x has been evaluated to the value v, the binding x on the heap is not modified at all.

3. Also in the rule for variables, no blackholing is performed: The binding for x stays on the heap during its evaluation.

Without much ado, Launchbury states that the original natural semantics and the alternative natural semantics are equivalent, which is intuitively convincing. Unfortunately, it turned out that a rigorous proof of this fact is highly non-trivial, as the actual structure of the heaps during evaluation differs a lot: The modification to the application rule causes many indirections, which need to be taken care of. Furthermore, the lack of updates in the variable rules causes possibly complex, allocating expressions to be evaluated many times, each time adding further copies of already existing expressions to the heap. On the other side, the up- dates in the original semantics further obscure the relationship between the heaps in the original and the alternative semantics. On top of all that add the technical difficulty that is due to naming issues: Variables that are fresh in one derivation might not be fresh in the other, and explicit renamings need to be carried along. Sánchez-Gil et al. have attempted to perform this proof. They broke it down into two smaller steps, going from the original semantics to one with only the variable rule changes (called No-update natural semantics, NNS), and from there to the ANS. So far, they have performed the second step, the equivalence between NNS and ANS, in a pen-and-paper proof

60 2.3 Adequacy

[SHO15], while relation between NS and NNS has yet resisted a proper proof [SHO14]. Considering these difficulties, I went a different path, and bridged the differences not on the side of the natural semantics, but on the denotational side, which turned out to work well:

1. The denotational semantics for lambda expressions changes the environment ( ńx. e ρ := Fn(λv. e ρt[x7→v])), while the natural se- mantics uses substitutionJ K into theJ K expression: e[y := x]. This difference is easily bridged on the denotational side by the substitution Lemma 5, which we need anyways for the correctness proof. See the last line of the application case in the proof of Lemma 11 for this step.

2. The removal of updates had surprisingly no effect on the adequacy proof: The main chore of the adequacy proof is to produce evi- dence for the assumptions of the corresponding natural semantics inference rule, which is then, in the last step, applied to produce the desired judgement. The removal of updates only changes the conclusion of the rule, so the adequacy proof is unchanged. Of course updates are not completely irrelevant, and they do affect the adequacy proof indirectly. The adequacy proof uses the cor- rectness theorem for the resourced natural semantics (Lemma 8), and there the removal of updates from the semantics would make a noticeable difference.

3. Finally, and most trickily, there is the issue of blackholing. I ex- plain my solution in Section 2.3.2, which works due to a small modification to the resourced denotational semantics. My proof relies on the property that when we calculate the seman- tics of N e σ r, we never pass more than r resources to the values bound inJ σK (Lemma 10). This concurs with the intuition about resources. In Launchbury’s original definition of the resourced semantics, this lemma does not hold: The equation for lambda expression


ignores the resources passed to it and returns a function involving the semantics of the body:

N⟦λx. e⟧σ (C r) := CFn(λv. N⟦e⟧(σ ⊔ [x ↦ v]))

With that definition, N⟦λx. y⟧σ (C ⊥) = CFn(λ_. σ y), which depends on σ y r for all r, contradicting Lemma 10. Therefore, I restrict the argument of CFn(_) to cap any resources passed to it at r. Analogously, I adjust the equation for applications to cap any resources passed to the value of the argument in the environment, σ x. These modifications do not affect the proof relating the two denotational semantics (Lemma 12), as there we always pass infinite resources, and |C∞ is the identity function.

2.4 Data type encodings and base values

Launchbury’s semantics is a typical core calculus used for research: as minimalistic as possible. Lambda abstraction, application and variables are enough to have a full functional programming language, and let expressions are added to talk explicitly about sharing and recursion.

2.4.1 Data types via Church encoding

Many other features of a typical programming language are omitted, and that is fine, because they can often be modelled with these primitive building blocks. For example, data constructors and case analysis are expressible using a suitable encoding, such as the Church encoding. Consider tuples: The constructor of a product type can be implemented as

Pair = λx y z. z x y

with the projection functions

fst = λp. p (λx y. x)
snd = λp. p (λx y. y).
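This encoding can be replayed directly in Haskell; the following is only an illustrative sketch, and the names pair, fst' and snd' are mine rather than part of the formal development:

-- Church-encoded pairs as a Haskell sketch; 'Pair' is a type synonym
-- introduced only for this example.
type Pair a b r = (a -> b -> r) -> r

pair :: a -> b -> Pair a b r
pair x y = \z -> z x y

fst' :: Pair a b a -> a
fst' p = p (\x _ -> x)

snd' :: Pair a b b -> b
snd' p = p (\_ y -> y)

-- e.g. fst' (pair 1 "x") evaluates to 1.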


For the purposes of analysing lazy evaluation, these encodings sufficiently capture the behaviour of constructors. In particular, the semantics is set up so that the arguments of such a constructor are evaluated at most once, matching the expected behaviour of “real” constructors in a lazy language.

Example
Consider the following code, which stores an unevaluated expression f x in the Pair constructor, extracts it later and evaluates it twice.

let f = λx y. y
    p = Pair (f x) (f y)
in (fst p) (fst p) x

We expect the redex (f x) to be evaluated only once, and indeed, this is the case. But note that the example code does not actually follow my syntax, because we have non-trivial expressions as arguments. By the mentioned preprocessing, the code should actually be

let f = λx y. y
    p = (let y2 = f y in let y1 = f x in Pair y1 y2)
in let y3 = fst p
   in fst p y3 x

If this expression is called e, then we have [] : e ⇓{} ∆ : v where

∆ = f ↦ λx y. y,
    p ↦ λz. z y1 y2,
    y3 ↦ λy. y,
    y2 ↦ f y,
    y1 ↦ λy. y

and v = x, and the pair now stores its first argument in its evaluated form, while the second argument is still unevaluated. Tracing the complete derivation of this judgement (which is too large to be reproduced here), we see that (f x) is indeed evaluated only once.


There is, however, a trait of “real” constructors and case analysis that is not easily modelled by the Church encoding: With the Haskell code

case b of
  True  → do this
  False → do that

it is obvious that either do this or do that is executed, but not both. The same cannot be said for the corresponding code in a Church encoding of Booleans:

True = λx y. x
False = λx y. y
ifThenElse = λp x y. p x y

where the code ifThenElse b (do this) (do that) may well evaluate both branches – there is no guarantee that b is one of the well-behaving expressions True or False.

In the later chapters, I am modelling an analysis that makes use of exactly that: A case analysis evaluates at most one of its branches (and exactly one, unless the scrutinee diverges). To prove that analysis to be correct, I need the language at hand to include a built-in case analysis operator that exhibits that behaviour – the above Church encoding is not enough. But as just that feature is needed, I add just what is required to model it, and not more: Two constructors without arguments, and an if-then-else construct. I deliberately do not add more complex data types that can carry parameters: As just explained, that would not add anything of value to the semantics.
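For intuition, the encoding and its pitfall can also be replayed in Haskell; this is just a sketch with names of my own choosing:

-- Church-encoded Booleans as a Haskell sketch.
type Church a = a -> a -> a

true, false :: Church a
true  x _ = x
false _ y = y

ifThenElse :: Church a -> a -> a -> a
ifThenElse b x y = b x y

-- Nothing forces the scrutinee to be 'true' or 'false': this "boolean"
-- demands both of its arguments, so both branches get evaluated.
evalBoth :: Church [a]
evalBoth x y = x ++ y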

2.4.2 Adding Booleans

The extended language sports three additional syntactical constructs, two of which are also values:

e ∈ Exp ::= ... | Ct | Cf | e ? et : ef
v ∈ Val ::= ... | Ct | Cf


The notation e ? et : ef, taken from the ternary operator in C-like languages, is a succinct way to write an if-then-else construct, and the use of t and f in variable indices avoids repeating rules for the two cases, if the meta variable b is used to represent either of these. This can be seen in the two additional rules for the natural semantics:

CON:
                    ─────────────────
                    Γ : Cb ⇓L Γ : Cb

IFTHENELSE:
    Γ : e ⇓L ∆ : Cb        ∆ : eb ⇓L Θ : v
    ──────────────────────────────────────
            Γ : e ? et : ef ⇓L Θ : v

As the other rules of the semantics are unchanged, the proofs by induction performed in this chapter only need to be extended by the two additional cases. The required changes to the denotational semantics are a bit more involved, as the semantic domain changes: Besides functions, the semantics can also return Booleans, and the equation becomes

Value = ((Value → Value) + 2)⊥,

where + is the disjoint sum and 2 the discrete two-element domain with the (conveniently named) elements {t, f}. In addition to the existing injection and deconstruction functions

Fn(_) : (Value → Value) → Value
_ ↓Fn _ : Value → Value → Value

there is the additional injection function

B(_) : 2 → Value

and the deconstruction function

_ ↓B (_, _) : Value → Value → Value → Value

where

v ↓B (v1, v2) = v1   if v = B(t)
v ↓B (v1, v2) = v2   if v = B(f)
v ↓B (v1, v2) = ⊥    otherwise.
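As a rough intuition (and nothing more – the lifting, the ordering ⊑ and continuity are not captured), the extended domain and its deconstructors could be modelled in Haskell along these lines:

-- A rough Haskell model of the extended Value domain, only for intuition.
data Value = Bottom | Fn (Value -> Value) | B Bool

-- the deconstructor _ ↓Fn _
projFn :: Value -> Value -> Value
projFn (Fn f) v = f v
projFn _      _ = Bottom

-- the deconstructor _ ↓B (_, _)
projB :: Value -> Value -> Value -> Value
projB (B True)  v1 _  = v1
projB (B False) _  v2 = v2
projB _         _  _  = Bottom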


The partial order ⊑ on Value relates neither values of the form Fn(f) with B(b) nor B(t) with B(f). The denotation of the new syntactic constructs is given by

⟦Cb⟧ρ := B(b)
⟦e ? et : ef⟧ρ := ⟦e⟧ρ ↓B (⟦et⟧ρ, ⟦ef⟧ρ)

The thus extended natural semantics is still correct with regard to the denotational semantics:

Proof (of Theorem 2)
Two additional cases need to be handled.

Case: CON
This case is trivial, just like case LAM.

Case: IFTHENELSE
The induction hypotheses are ⟦e⟧{{Γ}}ρ = ⟦Cb⟧{{∆}}ρ and ({{Γ}}ρ)|dom Γ = ({{∆}}ρ)|dom Γ, as well as ⟦eb⟧{{∆}}ρ = ⟦v⟧{{Θ}}ρ and ({{∆}}ρ)|dom ∆ = ({{Θ}}ρ)|dom ∆.

We have ⟦eb⟧{{Γ}}ρ = ⟦eb⟧{{∆}}ρ: Because the judgement is closed, i.e. fv(eb) ⊆ dom Γ ∪ L, it suffices to show ({{Γ}}ρ)|dom Γ∪L = ({{∆}}ρ)|dom Γ∪L. The induction hypothesis provides the equality on dom Γ, and for x ∈ L \ dom Γ we also have x ∉ dom ∆ by Lemma 2, so we have ρ x on both sides.

Like in case APP, the second part follows from the corresponding induction hypotheses and dom Γ ⊆ dom ∆. The first part can be calculated:

⟦e ? et : ef⟧{{Γ}}ρ
  = ⟦e⟧{{Γ}}ρ ↓B (⟦et⟧{{Γ}}ρ, ⟦ef⟧{{Γ}}ρ)    { by the denotation of the if-then-else construct }
  = ⟦Cb⟧{{∆}}ρ ↓B (⟦et⟧{{Γ}}ρ, ⟦ef⟧{{Γ}}ρ)   { by the induction hypothesis }
  = B(b) ↓B (⟦et⟧{{Γ}}ρ, ⟦ef⟧{{Γ}}ρ)         { by the denotation of the constructor }
  = ⟦eb⟧{{Γ}}ρ
  = ⟦eb⟧{{∆}}ρ                               { see above }


  = ⟦v⟧{{Θ}}ρ                                { by the induction hypothesis }   □

The adequacy proof can also be recovered, after extending the resourced domain CValue to contain {t, f}, i.e.

CValue = (((C → CValue) → (C → CValue)) + 2)⊥

with the analogous injection function and deconstructor

CB(_): 2 → CValue

_ ↓CB (_, _) : CValue → CValue → CValue → CValue, which allow the definition of the resourced denotational semantics to be extended by

N⟦Cb⟧ρ := CB(b)
N⟦e ? et : ef⟧ρ := N⟦e⟧ρ ↓CB (N⟦et⟧ρ, N⟦ef⟧ρ).

Proof (of Lemma 11)
We need to extend the case analysis on e:

Case: e = Cb
This follows immediately from the rule CON.

Case: e = e′ ? et : ef
The assumption N⟦e′ ? et : ef⟧N{{Γ}} ≠ ⊥ resolves to

N⟦e′⟧N{{Γ}} ↓CB (N⟦et⟧N{{Γ}}, N⟦ef⟧N{{Γ}}) ≠ ⊥.

From this, we can conclude that N⟦e′⟧N{{Γ}} = CB(b) for a b ∈ {t, f}, and N⟦eb⟧N{{Γ}} ≠ ⊥.

We can therefore apply the first induction hypothesis and obtain ∆ and v so that Γ : e′ ⇓L′ ∆ : v, where L′ = L ∪ fv(Γ, e′) – the extended set of variables ensures that the judgement is closed. By the correctness of the resourced denotational semantics (Lemma 8, the proof of which can be extended analogously to Theorem 2), we have N⟦v⟧N{{∆}} ⊑ N⟦e′⟧N{{Γ}} = CB(b), so the value v necessarily is v = Cb.


The correctness lemma also states N{{Γ}} ⊑ N{{∆}}, so N⟦eb⟧N{{∆}} ≠ ⊥, and the induction hypothesis provides Θ and v′ with ∆ : eb ⇓L′ Θ : v′. By rule IFTHENELSE, this shows Γ : e′ ? et : ef ⇓L′ Θ : v′ and hence, as L ⊆ L′, we have Γ : e′ ? et : ef ⇓L Θ : v′ as desired.  □

The three semantics and the corresponding proofs thus allowed for a modular extension by Booleans: I just added new cases to the syntax, the natural rules, the denotational domains and the various functions, but left the overall structure of the proofs and the other cases as they were. I had to be careful not to make use of the lemma “If Γ : e ⇓L ∆ : v, then v is a lambda abstraction”, which no longer holds in the extended version; this comes up in the proof of Lemma 11.

2.5 A small-step semantics

One important feature of Launchbury’s natural big-step semantics is that the stack is implicit: During the evaluation of an application, for example, the argument is not stored anywhere in the configuration, but lives only in the rules. This is elegant and convenient if during a proof the stack does not need to be taken into account, e.g. in the adequacy proof, but causes headaches when the stack is relevant to the discussion at hand, which will be the case in Chapter 4. For such feats, a semantics with an explicit stack in the configuration is better suited.

As explained in the beginning of the chapter, I need to refrain from just building my own semantics that happens to suit me⁹ but rather build on existing, well-received definitions. Sestoft has derived a small-step semantics with an explicit stack from Launchbury’s semantics, called the mark-1 abstract machine, and proved it to be equivalent to Launchbury’s semantics. I follow that path and pave it by formalising it in Isabelle. I include my addition of Booleans (Section 2.4.2) in the treatment, but it is a modular extension: Simply ignore the cases related to Booleans and you obtain the plain semantics.

9 I tried that, and it did not go well.


(Γ, e x, S) ⇒ (Γ, e, $x·S)                                        APP1
(Γ, λy. e, $x·S) ⇒ (Γ, e[y := x], S)                              APP2
(x ↦ e) ∈ Γ  =⇒  (Γ, x, S) ⇒ (Γ \ x, e, #x·S)                     VAR1
isVal e      =⇒  (Γ, e, #x·S) ⇒ (Γ[x ↦ e], e, S)                  VAR2
(Γ, (e ? et : ef), S) ⇒ (Γ, e, (et : ef)·S)                       IF1
b ∈ {t, f}   =⇒  (Γ, Cb, (et : ef)·S) ⇒ (Γ, eb, S)                IF2
dom ∆ ∩ fv(Γ, S) = {}  =⇒  (Γ, let ∆ in e, S) ⇒ ((∆, Γ), e, S)    LET1

Figure 10: The small-step semantics, due to Sestoft [Ses97]

2.5.1 Sestoft’s mark-1 abstract machine

Sestoft’s semantics operates on configurations (Γ, e, S) that consist of the heap Γ, the control e (i.e. the expression currently under evaluation) and the stack S. The stack is constructed from
• the empty stack, [],
• arguments, written $x·S and put on the stack during the evaluation of an application,
• update markers, written #x·S and put on the stack during the evaluation of a variable’s right-hand-side, and
• alternatives, written (et : ef)·S and put on the stack during the evaluation of the scrutinee of an if-then-else-construct.
Throughout this work we assume all configurations to be good, i.e. dom Γ and #S := {x | #x ∈ S} are disjoint and the update markers on the stack are distinct. The relation ⇒, given in Fig. 10, defines the semantics of a one-step reduction. As usual, ⇒* denotes the reflexive transitive closure of this relation. Note that the semantics takes good configurations to good configurations.
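For readers who prefer code to prose, the shape of these configurations could be rendered as a Haskell data type roughly as follows; this is only an illustration with a cut-down expression type, not the Isabelle definition:

-- A data-type sketch of Sestoft's configurations; names are mine.
type Var = String

data Exp = V Var | App Exp Var | Lam Var Exp
         | Let [(Var, Exp)] Exp | CT | CF | If Exp Exp Exp

type Heap = [(Var, Exp)]

data StackElem
  = Arg Var        -- $x, pushed by APP1, consumed by APP2
  | Upd Var        -- #x, pushed by VAR1, consumed by VAR2
  | Alts Exp Exp   -- (et : ef), pushed by IF1, consumed by IF2

type Conf = (Heap, Exp, [StackElem])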


2.5.2 Relating Sestoft’s and Launchbury’s semantics

Sestoft’s small-step and Launchbury’s big-step semantics are closely related; this section explicates this relationship. I follow [Ses97] here, making minor adjustments to ease the implementation in Isabelle.

Lemma 14 (Small-step simulates big-step)
If Γ : e ⇓L ∆ : v and fv(Γ, e, S) ⊆ dom Γ ∪ L, then (Γ, e, S) ⇒* (∆, v, S).

Proof
By induction on the derivation of Γ : e ⇓L ∆ : v, with S arbitrary.

Case: LAM and CON are trivial, as ⇒* is reflexive.

Case: APP
The side condition of the first induction hypothesis follows from the assumption by fv(Γ, e x, S) = fv(Γ, e, $x·S). From that follows the side condition of the second induction hypothesis, i.e. fv(∆, e′[y := x], S) ⊆ dom ∆ ∪ L, using dom Γ ⊆ dom ∆ (Lemma 1) and using that the natural semantics preserves closedness. It remains to do a simple calculation:

(Γ, e x, S) ⇒  (Γ, e, $x·S)         { by APP1 }
            ⇒* (∆, λy. e′, $x·S)    { first induction hypothesis }
            ⇒  (∆, e′[y := x], S)   { by APP2 }
            ⇒* (Θ, v, S)            { second induction hypothesis }

Case: VAR follows from this calculation:

((x ↦ e, Γ), x, S) ⇒  (Γ, e, #x·S)         { by VAR1 }
                   ⇒* (∆, v, #x·S)         { induction hypothesis }
                   ⇒  ((x ↦ v, ∆), v, S)   { by VAR2 }

The side condition of the induction hypothesis, i.e. the inclusion fv((Γ, e, #x·S)) ⊆ dom Γ ∪ (L ∪ {x}), follows directly from the given assumption fv((x ↦ e, Γ), x, S) ⊆ dom (x ↦ e, Γ) ∪ L.


Case: LET
The calculation is short:

(Γ, let ∆ in e, S) ⇒  ((∆, Γ), e, S)   { by LET1 }
                   ⇒* (Θ, v, S)        { by induction hypothesis }

where the side condition of LET1 follows from the side condition of LET.

Case: IFTHENELSE resembles the case for APP, with a similar proof for the side conditions, and the calculation

(Γ, e ? et : ef, S) ⇒  (Γ, e, (et : ef)·S)     { by IF1 }
                    ⇒* (∆, Cb, (et : ef)·S)    { induction hypothesis }
                    ⇒  (∆, eb, S)              { by IF2 }
                    ⇒* (Θ, v, S)               { induction hypothesis }   □

The proof of the other direction, i.e. that an evaluation in the small-step semantics has a corresponding derivation in the big-step semantics, is a bit more involved, as we need to recover the tree structure of the big-step semantics from the flat sequence of configurations in the small-step semantics. To that end, we use the notion of a balanced execution: An execution c1 ⇒ · · · ⇒ cn, n ≥ 1, is balanced if the stack of each intermediate configuration ci, i ∈ {1, ..., n − 1}, is an extension of the stack of c1, and the stack of cn equals the stack of c1. We write c1 ⇒b* c2 for such a balanced execution.

As every rule of the semantics only pushes or pops at most one element off the stack, balanced executions can be broken into smaller parts, which are still balanced, as shown by the following “intermediate value theorem”:

Lemma 15
Given a balanced execution c1 ⇒ c2 ⇒ · · · ⇒ c5 where the stack of c2 is the stack of c1 with one element pushed, then there are intermediate states c3 and c4 so that

c1 ⇒ c2 ⇒b* c3 ⇒ c4 ⇒b* c5.


Proof
Because the execution is balanced, c5 and c1 have the same stack. In particular, the stack of c5 does not extend the stack of c2. Let c4 be the first configuration in that sequence whose stack does not extend the stack of c2, and c3 be the configuration preceding c4. I claim that c2 ⇒* c3 and c4 ⇒* c5 are indeed balanced.

Every stack in c2 ⇒* c3 extends the stack of c2 by construction. Furthermore, the stack of c3 is equal to the stack of c2: If it were not, then the stack of the configuration following c3, namely c4, would still be an extension of c2’s stack, contradicting the choice of c4.

The stack of c4 is equal to the stack of c5: As it follows an extension of c2’s stack, but is itself not an extension of that, it must be c2’s stack with the top element popped. By assumption, that is c1’s stack. So c4 and c5 have the same stack, and all intermediate states have a stack that is an extension of that.  □

Example
This execution is balanced and fulfils the assumptions of Lemma 15, as the second stack equals the first with one element pushed:

(Γ, x x, S) ⇒ (Γ, x, $x·S)
            ⇒ ([], (λy. (λz. z)), #x·$x·S)
            ⇒ (Γ, (λy. (λz. z)), $x·S)
            ⇒ (Γ, (λz. z), S)

where Γ = x ↦ (λy. (λz. z)).

Lemma 15 decomposes this sequence into the two balanced executions

(Γ, x, $x·S) ⇒b* (Γ, (λy. (λz. z)), $x·S)

and

(Γ, (λz. z), S) ⇒b* (Γ, (λz. z), S),

where the second balanced execution does not actually do any steps.


Lemma 16 (Big-step simulates small-step)
Let (Γ, e, S) ⇒b* (∆, v, S) with isVal v. Then Γ : e ⇓#S ∆ : v.

Proof
By complete induction on the number of steps in (Γ, e, S) ⇒b* (∆, v, S). If there are no intermediate steps, then Γ = ∆, e = v, and we have Γ : v ⇓#S Γ : v either by LAM or CON. Otherwise, we proceed by case analysis on the first rule applied in the execution. This rule cannot be APP2, VAR2 or IF2, as these pop an element off the stack, in contradiction to the execution being balanced.

Case: APP1
We have e = e′ x and, using Lemma 15, we can decompose the execution as follows:

(Γ, e′ x, S) ⇒ (Γ, e′, $x·S) ⇒b* (∆′, e3, $x·S) ⇒ (∆″, e4, S) ⇒b* (∆, v, S).

As only rule APP2 pops argument marker $x off the stack, we obtain

(Γ, e′ x, S) ⇒ (Γ, e′, $x·S) ⇒b* (∆′, λy. e″, $x·S) ⇒ (∆′, e″[y := x], S) ⇒b* (∆, v, S).

By induction, the first balanced execution yields Γ : e′ ⇓#S ∆′ : λy. e″, while the second yields ∆′ : e″[y := x] ⇓#S ∆ : v, which, by APP, concludes this case.

Case: VAR1
By an analogous decomposition using Lemma 15 we find e = x and

(Γ, x, S) ⇒ (Γ \ {x}, e′, #x·S) ⇒b* (∆′, z, #x·S) ⇒ ((x ↦ z, ∆′), z, S) ⇒b* (∆, v, S)

with (x ↦ e′) ∈ Γ and isVal z.

The balanced execution ((x ↦ z, ∆′), z, S) ⇒b* (∆, v, S) is actually empty: With a value as the control, only rules APP2, VAR2 and IF2 can apply. But these pop an element off the stack, so they cannot begin a balanced execution. Therefore, ∆ = (x ↦ z, ∆′) and z = v.


Using the induction hypothesis on the other balanced sub-execution, we obtain Γ \ {x} : e′ ⇓#S∪{x} ∆′ : v, which, by VAR, concludes this case.

Case: IF1
Starting as before, we find e = e′ ? et : ef and

(Γ, e′ ? et : ef, S) ⇒ (Γ, e′, (et : ef)·S) ⇒b* (∆′, Cb, (et : ef)·S) ⇒ (∆′, eb, S) ⇒b* (∆, v, S).

Using the induction hypothesis on the two balanced sub-executions, we obtain Γ : e′ ⇓#S ∆′ : Cb and ∆′ : eb ⇓#S ∆ : v, which, by IFTHENELSE, conclude this case.

Case: LET1
As this rule does not modify the stack, we have

(Γ, let ∆′ in e′, S) ⇒ ((∆′, Γ), e′, S) ⇒b* (∆, v, S).

Using the induction hypothesis on the balanced sub-execution, we obtain (∆′, Γ) : e′ ⇓#S ∆ : v, which, by LET, concludes this case.  □

2.5.3 Discussions of modifications

Sestoft’s paper [Ses97] is already on the rigorous side and quite suitable to be brought into a machine-checkable form. One difference is, of course, due to my choice of nominal logic to implement name binding: While Sestoft’s rule for let expressions renames the let-bound variables to fresh ones, as they enter the heap, my rule LET1 simply assumes them to already be fresh. Intuitively, this is equivalent, but the practical benefit of not having to push the renaming into the expressions by substitution is great.

Constructors

This section already includes the addition of Booleans and the if-then-else construct to the language. Sestoft, following Launchbury, initially only has variables, application, lambda abstraction and mutually recursive let-bindings. As before, my addition is modular: One can simply ignore the extra case and obtain a formalisation of Sestoft’s machine.


He introduces constructors and case expressions in a separate chapter of his paper. His constructors support parameters, but the design is equivalent to mine.

Fusing a proof

The definition of a balanced execution is the same as Sestoft’s, and the proof of Lemma 16 follows his idea, but is structured differently. Sestoft first notices, by use of Lemma 15 and rule inversion on the small-step rules, that every balanced execution is of one of these forms:
• It is empty.

• It is a sequence of rule APP1 followed by a balanced execution, followed by APP2, followed by another balanced execution.

• It is a sequence of rule VAR1 followed by a balanced execution, followed by VAR2.

• It is a sequence of rule IF1 followed by a balanced execution, followed by IF2, followed by another balanced execution.

• It is a sequence of rule LET1 followed by a balanced execution.

This describes a (context-free) grammar of balanced executions, and his proof of Lemma 16 proceeds by induction on the productions of that grammar. I could have followed this path by defining another predicate for balanced executions, as an inductively defined predicate following these rules, then proving that all balanced executions are contained in that grammar, and finally performing the proof of Lemma 16 by induction on that predicate. But that would be considerably more work for little gain, as long as this grammar of balanced executions is not used again, so I chose to fuse¹⁰ these two steps of the proof into one, by using the complete induction on the length of the execution and recovering the tree structure “on the fly” using Lemma 15.
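Had I defined such a predicate, it would have followed the grammar above; rendered as a Haskell data type, the grammar might look roughly like this (a hypothetical sketch, not part of the Isabelle development):

-- One constructor per production of Sestoft's grammar of balanced executions.
data Balanced
  = Empty                        -- the empty execution
  | AppSeq Balanced Balanced     -- APP1, balanced, APP2, balanced
  | VarSeq Balanced              -- VAR1, balanced, VAR2
  | IfSeq  Balanced Balanced     -- IF1, balanced, IF2, balanced
  | LetSeq Balanced              -- LET1, balanced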

10 It is interesting to see that the ideas behind list fusion, i.e. fusing generators and consumers of inductive data types, carry over to transforming proofs so well. The Curry-Howard correspondence at its best.


2.6 The Isabelle formalisation

A distinguishing feature of this dissertation’s treatment of Launchbury’s and Sestoft’s semantics is that I have implemented the definitions, theorems and proofs in the interactive theorem prover Isabelle [NPW02]. Nevertheless, I chose to write most of this thesis in the classical style of hand-written mathematics, addressing the reader who is interested in my constructions, results and proofs and who, although happy to know that everything is machine-checked, is not interested in the Isabelle formalisation itself.

In contrast, the following section addresses the reader who also wonders how I implemented this in Isabelle, what techniques I used, and why. The section also serves as a map to find your way around the Isabelle theories, and draws the connection between the artefacts in the thesis and the Isabelle development.

2.6.1 Employing nominal logic

In Section 2.1, I introduce the syntax of the lambda calculus and state that I consider these expressions to be equal up to alpha-conversion. In the Isabelle formalisation, I use the nominal package (cf. Section 1.6) to create a data type for expressions in my syntax:

nominal_datatype exp =                                        (Terms.thy)
    Var var
  | App exp var
  | LetA as::assn body::exp binds bn as in body as
  | Lam x::var body::exp binds x in body     (Lam [_]. _ [100, 100] 100)
  | Bool bool
  | IfThenElse exp exp exp     (((_)/ ? (_)/ : (_)) [0, 0, 10] 10)
and assn =
    ANil | ACons var exp assn
binder bn :: assn ⇒ atom list
where bn ANil = []
    | bn (ACons x t as) = (atom x) # (bn as)


The annotation binds indicates where in the syntax tree binders are, and what their scope is. The command nominal_datatype then takes care of constructing the data type with the desired equalities. The command does not support nested recursion, so it is not possible to simply write

| Let Γ::((var × exp) list) body::exp binds domA Γ in body Γ

Instead, I have to effectively re-define the list type along with the expression type and simultaneously define the function that collects all the binders. Luckily, the resulting type assn is indeed isomorphic to (var × exp) list, so subsequently I define conversion functions between these two types and define the function Let with the desired type. A definitory command such as nominal_datatype produces a number of definitions and lemmas, such as distinctness of constructors, size lemmas and induction rules. I re-state all of these in terms of Let instead of LetA, which is slightly tedious, but from then on I can use Let exclusively, including in function definitions and inductive proofs, just as if nominal_datatype supported nested recursion directly.

2.6.2 The type of environments

The type for environments used here is var ⇒ Value, as one would expect. But it was non-trivial to actually implement it this way, and an earlier version went a different route that, although eventually abandoned, is worth describing. The defining equation for the semantics of lambda abstractions is

[[ λx. e ]]$ = Fn·(Λ v. [[ e ]]$(x := v))

Note that the argument on the left-hand side is the representative of an equivalence class (defined using the Nominal package), so this definition is only allowed if the right-hand side is indeed independent of the actual choice of x. The nominal_function command requires the user to discharge that proof obligation before the function is actually defined. This is shown most commonly and easily if x is fresh in all the other arguments (x ∉ fv $), and indeed the Nominal package allows me to

specify this as a side condition to the defining equation, which is what I did in the first version of [Bre13].

But this convenience comes at a price: Such side conditions are only allowed if the argument has finite support (otherwise there might be no variable fulfilling x ∉ fv $). More precisely: The type of the argument must be a member of the fs typeclass provided by the Nominal package (cf. Section 1.7.2). The type var ⇒ Value cannot be made a member of this class, as there obviously are elements that have infinite support.

My fix – inspired by HOLCF’s handling of continuity using a dedicated type – was to introduce a new type constructor, fmap, for partial functions with finite domain. This is fine: Only functions with finite domain matter in my formalisation.

The introduction of fmap had further consequences. The main type class of the HOLCF package, which we use to define domains and continuous functions on them, is the class cpo of chain-complete partial orders. With the usual ordering on partial functions, (var, Value) fmap cannot be a member of this class: As there is an infinite supply of variables, there exists a chain of partial functions of ever increasing domain, and the limit of that chain would necessarily have an infinite domain, and hence no longer is in fmap. The fix here is to use a different ordering on fmap and only let elements be comparable that have the same domain. In my formalisation, the domain is always known (e.g. all variables bound on some heap), so this seemed to work out.

But not without causing yet another issue: With this ordering, (var, Value) fmap is a cpo, but lacks a bottom element, i.e. it is no longer a pcpo, and HOLCF’s built-in operator µ x. f x for expressing least fixed points, as they occur in the semantics of heaps, is not available. Furthermore, ⊔ is not a total function, as it is only defined if the arguments have the same domain. In the end, I had to define a rather convoluted set of theories that formalise functions that are continuous on a specific set, fixed points on such sets etc. Eventually, I finished all proofs using that approach, but it amounted to an unreasonable amount of extra work and awkward proofs infested with statements about the domains of environments.


In a later refinement, I found a way to solve this problem much more elegantly. Using a small trick I defined the semantics functions so that

[[ λx. e ]]$ = Fn·(Λ v. [[ e ]]$(x := v))

holds unconditionally. Technically, the definition is

[[ λx. e ]]$ = Fn·(Λ v. [[ e ]]($ f|‘ fv (λx. e))(x := v))

where the right-hand side can be shown to be independent of the choice of x, as x ∉ fv (λx. e). This definition can more easily be shown to be well-formed, and once the function is defined, [[ e ]]$ = [[ e ]]$|fv e can be proved by induction. By using that lemma, I can prove the desired equation for [[ λx. e ]]$ as a lemma. The same trick is applied to the equation for let bindings.

This allows me to use the type var ⇒ Value for the semantic environments and considerably simplifies the formalisation compared to the initial version of [Bre13].

2.6.3 Abstracting over the denotational semantics

I have defined two denotational semantics in this chapter: The standard semantics (⟦e⟧ρ, see Fig. 7) and the resourced denotational semantics (N⟦e⟧σ, see Fig. 8). The definitions are quite similar, and a number of lemmas hold for both of them. Moreover, both definitions are mutually recursive with the definition of the respective heap semantics ({{Γ}}ρ resp. N{{Γ}}σ), which is defined identically in both cases.

In the Isabelle theories, I therefore abstracted over the differences in order to define a generic semantics function once and instantiate it twice. Given the rather large and annoying proofs required for a function definition over nominal terms, this pays off. I define the heap semantics within a locale that abstractly assumes the presence of some denotation function for some type of expressions:

locale has_ESem =                                             (HasESem.thy)
  fixes ESem :: 'exp::pt ⇒ ('var::at_base ⇒ 'value) → 'value::{pure,pcpo}


At this point, the concrete type of expressions, variables and semantic values is left open (the initial apostrophe in 'value denotes a type variable). No further assumptions about ESem are required to define the heap semantics, besides those encoded in the type of ESem:

• The expressions contain variables (pt, provided by the Nominal package, as explained in Section 1.7.2).

• The type of variables is a base value in terms of the Nominal package (at_base). In our setting, var is the only such type.

• The semantics is continuous in the environment (use of → instead of ⇒, provided by the HOLCF package).

• The type of the semantic values is oblivious to names (type class pure, provided by the Nominal package) and it forms a pointed chain-complete partial order (type class pcpo, provided by the HOLCF package).

The former restriction on 'value makes it easier to prove the functions to be equivariant, while the latter is a natural requirement for the fixed-point based definition of the heap semantics, which is

definition                                              (HeapSemantics.thy)
  HSem :: ('var × 'exp) list ⇒ ('var ⇒ 'value) → ('var ⇒ 'value)
where HSem Γ = (Λ $. (µ $'. $ ++domA Γ [[Γ]]$'))

The Isabelle command definition allows defining regular functions (type constructor ⇒) by giving the parameters on the left, but it does not know anything about HOLCF’s type of continuous functions (type constructor →). Therefore, the second argument is consumed by a lambda abstraction using HOLCF’s continuous lambda operator Λ.

The two denotational semantics differ in the concrete domain (Value vs. (C → CValue)), and therefore the injection and deconstruction functions are different. Furthermore, the resourced denotational semantics needs to keep track of the consumed resources. In order to abstractly define the semantics of expressions, I define a locale that provides these components:

locale semantic_domain =                           (AbstractDenotational.thy)
  fixes Fn :: ('Value → 'Value) → ('Value::{pcpo_pt,pure})
  fixes Fn_project :: 'Value → ('Value → 'Value)
  fixes B :: bool discr → 'Value
  fixes B_project :: 'Value → 'Value → 'Value → 'Value
  fixes tick :: 'Value → 'Value

The locale parameter tick is used to count the resources as they are consumed. The type class pcpo_pt combines the classes pcpo (for pointed chain-complete partial orders) with pt (for types that may contain names), additionally ensuring that the permutation of names is continuous.

Within this locale, I define the abstract denotational semantics:

nominal_function ESem :: exp ⇒ (var ⇒ 'Value) → 'Value
where
  ESem (Lam [x]. e)      = (Λ $. tick·(Fn·(Λ v. ESem e·(($ f|‘ fv (Lam [x]. e))(x := v)))))
| ESem (App e x)         = (Λ $. tick·(Fn_project·(ESem e·$)·($ x)))
| ESem (Var x)           = (Λ $. tick·($ x))
| ESem (Let as body)     = (Λ $. tick·(ESem body·(has_ESem.HSem ESem as·($ f|‘ fv (Let as body)))))
| ESem (Bool b)          = (Λ $. tick·(B·(Discr b)))
| ESem (scrut ? e1 : e2) = (Λ $. tick·((B_project·(ESem scrut·$))·(ESem e1·$)·(ESem e2·$)))

Note that this definition has a non-trivial recursion pattern: It uses nested recursion via the heap semantics defined in the has_ESem locale, to which I therefore have to pass the expression semantics ESem – the very thing that I am defining here – as an argument.

From the abstract denotational semantics I can produce the concrete ones by interpretation, where the parameters of the locale are specified. For the standard denotational semantics, no resource accounting takes place, so the last parameter is the (continuous) identity:

interpretation semantic_domain Fn Fn_project B B_project (Λ x. x).     (Denotational.thy)


The arguments to pass for the resourced denotational semantics are not simply the injection and deconstruction functions themselves, as the resource argument needs to be passed along:

interpretation semantic_domain                     (ResourcedDenotational.thy)
  (Λ f. Λ r. CFn·(Λ v. (f·(v))|r))
  (Λ x y. (Λ r. (x·r ↓CFn y|r)·r))
  (Λ b r. CB·b)
  (Λ scrut v1 v2 r. CB_project·(scrut·r)·(v1·r)·(v2·r))
  C_case.

The case analysis function on C, which was produced by the HOLCF package when I defined the domain C, happens to have the right type (C → 'a) → C → 'a to serve as the tick argument to the locale.

In order to convince myself that, despite all this abstraction and these definitional detours, I have defined the semantics that I claim I have defined, I stated the equations as a lemma and proved them. For the standard denotational semantics, this reads:

lemma ESem_simps:                                          (Denotational.thy)
  [[ Lam [x]. e ]]$ = Fn·(Λ v. [[ e ]]$(x := v))
  [[ App e x ]]$ = [[ e ]]$ ↓Fn $ x
  [[ Var x ]]$ = $ x
  [[ Bool b ]]$ = B·(Discr b)
  [[ (scrut ? e1 : e2) ]]$ = B_project·([[ scrut ]]$)·([[ e1 ]]$)·([[ e2 ]]$)
  [[ Let Γ body ]]$ = [[ body ]]{|Γ|}$
by simp_all

2.6.4 Relating the domains Value and CValue

In order to relate the two denotational semantics, I defined the relation /. in Section 2.3.4, closely following the work of Sánchez-Gil et al. in [SHO11]. Their domain D corresponds to my Value, their domain E is my type C → CValue and A is CValue. While Sánchez-Gil et al. construct their domain “by hand”, by a series of domain approximations Dn resp. En, I can use Isabelle’s HOLCF package to construct the domain directly from its domain equation


(which already includes Booleans; as mentioned in Section 2.4.2 this addition is modular and can be ignored to obtain a formalisation closer to the work of Sánchez-Gil et al.).

domain Value = Fn (lazy Value → Value) | B (lazy bool discr)                    (Value.thy)

domain CValue = CFn (lazy (C → CValue) → (C → CValue)) | CB (lazy bool discr)   (CValue.thy)

In my formalisation, the approximations are just subsets of the full domain, and the injection φ^E_n : E_n → E is the identity here. The projections in [SHO11] correspond to the take-functions generated by the HOLCF package, which produce finite approximations of their arguments. For example, ψ^D_n : D → D_n becomes Value_take with type nat ⇒ Value → Value. The Isabelle theories introduce the former notation as abbreviations for the latter to better match the presentation in [SHO11].

Section 2.3 of [SHO11] contains the following two equations without proof:

ψ^E_n((e ↓CFn a) c) = (ψ^E_{n+1}(e) ↓CFn ψ^A_n(a)) c      (2)
ψ^D_n(d ↓Fn d′) = ψ^D_{n+1}(d) ↓Fn ψ^D_n(d′)               (3)

Unfortunately, these equations do not hold in general. A counter-example to (3) can be given by

d = Fn(λe. (e ↓Fn ⊥)), d′ = Fn(λ_. Fn(λ_. ⊥)) and n = 1.

In this case, the left-hand side of the equation simplifies to Fn(λ_.⊥), while the right-hand side is simply ⊥. A counter-example to (2) can be constructed analogously.

The critical property of d′ is that it is “two levels deep”. On the left-hand side, d ↓Fn d′ passes one argument to d and hence returns a result that is one level deep, which goes through ψ^D_1 unaltered, while on the


right-hand side, ψ^D_1(d′) cuts off the structure of d′ after one level and returns Fn(λ_.⊥).

Therefore, in order for the equation to hold, the argument to d on the left-hand side needs to be at most one level deep. An extra invocation of ψ^D_n on the left-hand side can ensure this:

ψ^D_n(d ↓Fn ψ^D_n(d′)) = ψ^D_{n+1}(d) ↓Fn ψ^D_n(d′)

This lemma can already be found in [AO93], equation 4.3.5 (1). The problematic equations are used in the proof of the only-if direction of Proposition 9 in [SHO11]. I fixed this by applying take-induction, which inserts the extra call to ψ^D_n in the right spot and allows me to proceed using the fixed lemma.

2.7 Related work

A large number of developments on formal semantics of functional programming languages in the last two decades build on Launchbury’s work; here is a short selection: Van Eekelen & de Mol [EM04] add strictness annotations to the syntax and semantics of Launchbury’s work. Nakata & Hasegawa [NH09] define a small-step semantics for call-by-need and relate it to a Launchbury-derived big-step semantics. Nakata [Nak10] modifies the denotational semantics to distinguish direct cycles from looping recursion. Sánchez-Gil et al. [SHO10] extend Launchbury’s semantics with distributed evaluation. Baker-Finch et al. [BKT00] create a semantics for parallel call-by-need based on Launchbury’s.

While many of them implicitly or explicitly rely on the correctness and adequacy proof as spelled out by Launchbury, some stick with the original definition of the heap semantics using ⊔, for which the proofs do not go through [EM04; NH09; SHO10; BKHT99], while others use right-sided updates, without further explanation [Nak10; BKT00]. The work by Baker-Finch et al. is particularly interesting, as they switched from the original to the fixed definition between the earlier tech report and the later ICFP publication, unfortunately without motivating that change.


Such disagreement about the precise definition of the semantics is annoying, as it creates avoidable incompatibilities between these publications. I hope that my fully rigorous treatment will resolve this confusion and allow future work to standardise on the “right” definition. Furthermore, none of these works discuss the holes in Launchbury’s adequacy proof, even those that explicitly state the adequacy of their extended semantics. My adequacy proof is better suited for such extensions, as it is rigorous and furthermore avoids the intermediate natural semantics.

This list is just a small collection of many more Launchbury-like semantics. Often the relation to a denotational semantics is not stated, but nevertheless they are standing on the foundations laid by Launchbury. Therefore, it is not surprising that others have worked on formally fortifying these foundations as well:

In particular, Sánchez-Gil et al. worked towards rigorously proving Launchbury’s semantics correct and adequate. They noted that the relation between the standard and the resourced denotational semantics is not as trivial as it seemed at first, and worked out a detailed pen-and-paper proof [SHO11]. I have formalised this, fixing mistakes in the proof, and build on their result here (Lemma 12). They also bridged half the gap between Launchbury’s natural and alternative natural semantics [SHO15], and plan to bridge the other half. I avoided these very tedious proofs by bridging the difference on the denotational side (Section 2.3.6).

As a step towards a mechanisation of their work in Coq, they address the naming issues and suggest a mixed representation, using de Bruijn indices for locally bound variables and names for free variables [SHO12]. This approach corresponds to my treatment of names in the formal development, using the Nominal logic machinery [UK12] locally but not for names bound in heaps, and can be found in other formalisation works as well [PB10].

The aim of this development is to be able to formally prove properties of the language or the compiler, but not so much to prove individual functional programs to be correct; there are better ways to do that. In

the context of using Isabelle to prove properties of Haskell programs, noteworthy approaches include Haskabelle [RH15], which transforms Haskell code into Isabelle code, but punts on issues of laziness; Isabelle’s code generation facilities [Haf09], which go the other way; and HOLCF-Prelude [BHMS13], which models Haskell’s lazy semantics, albeit without observable sharing, using HOLCF function definitions in Isabelle.

The problem with Haskell is that it’s a language built on lazy evaluation and nobody’s actually called for it.
Randall Munroe, xkcd #1312

CHAPTER 3 Call Arity

After more than two decades of development of Haskell compilers, one has become slightly spoiled by the quality and power of optimisations performed by the compiler. For example, list fusion allows us to write concise and easy-to-understand code using combinators and list comprehensions and still get the efficiency of a tight loop that avoids allocating the intermediate lists.

Unfortunately, not all list-processing functions used to take part in list fusion. In particular, before my work, left folds like foldl, foldl', length and derived functions like sum were not fusing, and an expression like sum (filter f [42..2016]) still allocated and traversed one list. The issue is that in order to take part in list fusion, these need to be expressed as right folds, which requires higher-order parameters as in

foldl k z xs = foldr (λv fn z → fn (k z v)) id xs z.

The resulting fused code would be allocating and calling function closures on the heap, causing the final program to run too slowly (see Section 1.4.3). Already Andrew Gill noted that eta-expansion based on an arity analysis would help here [Gil96]. Previous arity analyses, however, are not precise enough to allow for a fusing foldl.
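To make the problem concrete, the following Haskell sketch shows roughly what fusing sum (filter f [42..2016]) yields when sum is based on the foldr-expressed foldl; it is an approximation of the fused code, not GHC’s actual output:

-- Every iteration returns a function of the accumulator, so without
-- eta-expansion each recursive call allocates a closure for (\z -> ...).
sumFiltered :: (Int -> Bool) -> Int
sumFiltered f = go 42 0
  where
    go :: Int -> Int -> Int
    go i = if i > 2016
             then id
             else if f i
                    then \z -> go (i + 1) (z + i)
                    else go (i + 1)

An arity analysis that sees that go is always called with two arguments may eta-expand it, turning the chain of closures into a tight loop.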

let tA = if f a then ... else ...
in let goA x = if f (tB + x) then goA (x+1) else x
       tB = let goB y = if f y then goB (goA y) else tA
            in goB 0 1
   in goA (goA 1)

Figure 11: Is it safe to eta-expand tA?

Why is this so hard? Consider the slightly contrived example in Fig. 11: Our goal is to eta-expand the definition of tA. For that, we need to ensure that it is always called with one argument, which is not obvious: Syntactically, the only use of tA is in goB, and there it occurs without an argument. But we see that goB is initially called with two arguments, and under that assumption calls itself with two arguments as well, and it therefore always calls tA with one argument – done. But tA is a thunk – i.e. not in head normal form – and even if there are many calls to tA, the call to f a is only evaluated once. If we were to eta-expand tA we would be duplicating that possibly expensive work! So we are only allowed to eta-expand tA if we know that it is called at most once. This is tricky: tA is called from a recursive function goB, which itself is called from the mutual recursion consisting of goA and tB, and that recursion is started multiple times! Nevertheless we know that tA is evaluated at most once: tB is a thunk, so although it will be called multiple times by the outer recursion, its right-hand side is only evaluated once. Furthermore, the recursion involving goB is started once and stops when the call to tA happens. Together, this implies that we are allowed to eta-expand tA without losing any work. I have developed an analysis, dubbed Call Arity, that is capable of this reasoning and correctly detects that tA can be eta-expanded. It is a combination of a standard forward call arity analysis ([Gil96], [XP05]) with a novel cardinality analysis, dubbed co-call analysis. The latter determines for an expression and two variables whether one evaluation of the expression can possibly call both variables and – as a special case


– which variables it calls at most once. I found that this is just the right amount of information to handle tricky cases such as those in Fig. 11.

In this chapter, which is based on the work that I presented at the Trends in Functional Programming conference in 2014 [Bre15a], I approach the analysis from the practical, empirical side. Section 3.1 motivates the need for and the design of the Call Arity analysis. The following two sections describe the co-call graph data structure that is central to the analysis (Section 3.2) and the analysis itself (Section 3.3). Section 3.4 describes a few aspects of the implementation (which is reproduced in its entirety in Appendix B.2). Finally, Section 3.5 discusses the analysis and quantifies the performance improvements. Notes on related and future work follow.

3.1 The need for co-call analysis

The main contribution of this chapter is the description of the co-call cardinality analysis and its importance for arity analysis. I want to motivate the analysis based on a sequence of ever more complicated arity analysis puzzles.

3.1.1 A syntactical analysis

The simplest such puzzle is the following code, where a function is defined as taking one argument, but always called with two arguments:

let f x = ...
in f 1 2 + f 3 4

Are we allowed to eta-expand f by another argument? Yes! How would we find out about it? We would analyse each expression of the syntax tree and ask

“For each free variable, what is a lower bound on the number of arguments passed to it?”

This will tell us that f is always called with two arguments, so we eta-expand it.


3.1.2 Incoming arity

Here is a slightly more difficult puzzle:

let f x = ...
    g y = f (y+1)
in g 1 2 + g 3 4

Are we still allowed to eta-expand f? The previous syntactic approach fails, as the right-hand side of g mentions f with only one argument. However, g itself can be eta-expanded, and once that is done we would see that g’s right-hand side is called with one argument more. We could run the previous analysis, simplify the code, and run the analysis once more, but we can do better by asking, for every expression:

“If this expression is called with n arguments, for each free variable, what is a lower bound on the number of arguments passed to it?”

The body of the let will report to call g with two arguments. The pattern on the left-hand side of the definition of g consumes one of them, so we can analyse the right-hand side with an incoming arity of 1, and thus find out that f is always called with two arguments.

For recursive functions this is more powerful than just running the simpler variant multiple times. Consider

let f x = ...
    g y = if y > 10 then f y else g (y + 1)
in g 1 2 + g 3 4

A purely syntactical approach will never be able to eta-expand g or f. But by assuming an incoming arity we can handle the recursive case: The body of the let reports that g is called with two arguments. We initially assume that to be true for all calls to g. Next we analyse the right-hand side of g and will learn – under our assumption – that it calls g with two arguments, too, so our assumption was justified and we can proceed.

Of course, it may well be that the assumption is refuted by analysing the definition of the recursive function:

let f x = ...
    g y = if y > 10 then f y else foo (g (y+1))
in g 1 2 + g 3 4

The body still reports that it calls g with two arguments, but – even under that assumption – the right-hand side of g calls g with only one argument. So we have to re-analyse g with one argument, which in turn calls f with one argument, and no eta-expansion is possible here. This corresponds to the analysis outlined in [Gil96].
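The iteration just described can be pictured as a small fixed-point loop. In the following Haskell sketch, rhsCalls is a hypothetical stand-in for re-analysing the right-hand side of g under an assumed arity and reporting with how many arguments it then calls g; it is not the actual implementation:

type Arity = Int

-- Start from the arity reported by the body and lower it until the
-- right-hand side justifies the assumption.
stableArity :: (Arity -> Arity) -> Arity -> Arity
stableArity rhsCalls assumed
  | found >= assumed = assumed                      -- assumption justified
  | otherwise        = stableArity rhsCalls found   -- re-analyse with the lower arity
  where
    found = rhsCalls assumed

For the last example, starting from the assumption 2 the right-hand side reports 1, so the iteration settles at arity 1, matching the discussion above.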

3.1.3 Called-once information

So far we have only eta-expanded functions; for these the final analysis in the previous section is sufficient. But there is also the case of thunks: If the expression bound to a variable x is not in head-normal form, i.e. the outermost syntactic construct is a function call, case expression or let-binding, but not a lambda abstraction or constructor, then that expression is evaluated upon its first call, and the result is shared with further calls to x.

If we were to eta-expand the expression, though, the expensive operation is hidden under a lambda and will be evaluated for every call to x. Therefore, it is crucial that thunks are only eta-expanded if they are going to be called at most once. So we need to distinguish the situation

let t = foo x
in if x then t 1 else t 2

where t is called at most once and eta-expansion is allowed, from

let t = foo x
in t 1 + t 2

where t is called multiple times and must not be eta-expanded. An analysis that could help us here would be answering this question:

“If this expression is called once with n arguments, for each free variable, what is a lower bound on the number of argu- ments passed to it, and are we calling it at most once?”


In the first example, both branches of the if would report to call t only once (with one argument), so the whole body of the let calls t only once and we can eta-expand t. In the second example the two subexpressions t 1 and t 2 are both going to be evaluated. Combined they call t twice and we cannot eta-expand t.
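The sharing argument can be made tangible with a small Haskell example; foo is a made-up function whose partial application does expensive work once, and the exact operational behaviour depends on optimisation settings, so this only illustrates the principle:

foo :: Int -> Int -> Int
foo x = let s = sum [1 .. x * 1000000]   -- expensive, done once per 'foo x'
        in \y -> s + y

-- 't' is a thunk; the expensive work behind it happens only once,
-- even though 't' is called twice:
shared :: Int -> Int
shared x = let t = foo x in t 1 + t 2

-- After eta-expansion, 't' is a function, and the expensive work
-- is repeated on every call:
unshared :: Int -> Int
unshared x = let t y = foo x y in t 1 + t 2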

3.1.4 Mutually exclusive calls

What can we say in the case of a thunk that is called from within a recursion, like in the following code?

let t = foo x
in let g y = if y > 10 then t else g (y+1)
   in g 1 2

Clearly t is called at most once, but the current state of the analysis does not see that: The right-hand side of g reports to call t and g at most once. But

let t = foo x
in let g y = if y > 10 then id else g (t y)
   in g 1 2

would yield the same result, although t is called many times! How can we extend our analysis to distinguish these two cases? The crucial difference is that in the first code, g calls either t or g, while the second one calls both of them together. So we would like to know, for each expression:

“If this expression is called once with n arguments, for each free variable, what is a lower bound on the number of ar- guments passed to it? Additionally, what set of variables is called mutually exclusively and at most once?”

In the first example, the right-hand side would report to call {t, g} mutually exclusively and this allows us to see that the call to t does not lie on the recursive path, so there will be at most one call to t in every run of the recursion. We also need the information that the body of the let (which reports {g}) and the right-hand side of g both call g at most

once; if the recursion were started multiple times, or were not linear, then we would get many calls to t as well.

3.1.5 Co-call analysis

The final puzzle in this sequence is the code

let t1 = foo x
in let g x = if x > 10
             then t1
             else let t2 = bar x
                  in let h y = if y > 10
                               then g (t2 y)
                               else h (y+1)
                     in h 1 x
   in g 1 2

which shows the shortcomings of the previous iteration and the strength of the actual co-call analysis. Note that both recursions are well-behaved: They are entered once and each recursive function calls either itself once or calls the thunk t1 resp. t2 once. So we would like to see both t1 and t2 eta-expanded.

Unfortunately, with the analysis above, we can only get one of them. The problematic subexpression is g (t2 y): We need to know that g is called at most once and that t2 is called at most once. But we cannot return {g, t2} as that is a lie – they are not mutually exclusive – and the best we can do is to arbitrarily return either {g} or {t2}.

To avoid this dilemma we extend the analysis one last time, in order to preserve all valuable information. We now ask, for each expression:

“If this expression is called once with n arguments, for each free variable, what is a lower bound on the number of argu- ments passed to it, and for each pair of free variables, can both be called during the same execution of the expression?”


The latter tells us, as a special case, whether one variable may be called multiple times. For the problematic expression g (t2 y) we would find that g might be called together with t2, but neither of them is called twice. For the right-hand side of h the analysis would tell us that either h is called at most once and on its own, or g and t2 are called together, but each at most once. The whole inner let therefore calls t2 and g at most once, so we get to eta-expand t2 and learn that the outer recursion is well-behaved.

3.2 The type of co-call graphs

This information – i.e. which pairs of variables can both be called during the same execution – can be represented by a graph on the set of variables. These graphs are undirected, non-transitive and can have loops. I denote the set of such graphs with Graph, and the intuition is that
• only the nodes of G (denoted by dom G) are called, and that
• an edge x—y ∈ G indicates that x and y can be called together, while the absence of an edge guarantees that calls to x resp. y are mutually exclusive.
In particular, the absence of a loop, i.e. x—x ∉ G, implies that x is called at most once.

Example
Consider the three graphs

G1 = {x—y},
G2 = {x—y, y—y}, and
G3 = {y—y},

all of them with nodes {x, y}.

The first graph allows at most one call to y and at most one call to x, both of which can occur together. In contrast, the graph G2 allows any number of calls to y, together with at most one call to x, while G3 describes that any execution performs either at most one call to x or any number of calls to y. 


I often identify the graphs with their set of edges, e.g. in the definition of the Cartesian product of two sets of variables, which is

V × V′ := {x—y | x ∈ V ∧ y ∈ V′ ∨ y ∈ V ∧ x ∈ V′},

and specify its set of nodes separately – which in this case is given by dom (V × V′) := dom V ∪ dom V′. I write V² := V × V for the complete graph on the variables in the set V. The set of neighbours of a variable is Nx(G) := {y | x—y ∈ G}. The graph G \ V is G with the nodes in V removed, while G|V is G with only the nodes in V retained. The graphs are obviously partially ordered by inclusion, i.e.

G ⊑ G′ ⇐⇒ dom G ⊆ dom G′ ∧ G ⊆ G′,

with the empty graph {} being the least element.
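A direct, if naive, rendering of this graph type and some of its operations in Haskell might look as follows; it mirrors the mathematical definitions above rather than GHC’s actual implementation:

import qualified Data.Set as S

type Var = String

data Graph = Graph
  { nodes :: S.Set Var            -- dom G
  , edges :: S.Set (Var, Var)     -- kept symmetric: (x,y) present iff (y,x) is
  }

-- V × V': everything in V may be called together with everything in V'
cartesian :: S.Set Var -> S.Set Var -> Graph
cartesian vs ws = Graph (vs `S.union` ws) (S.fromList (both ++ map swap both))
  where
    both = [(x, y) | x <- S.toList vs, y <- S.toList ws]
    swap (x, y) = (y, x)

-- V², the complete graph (including loops) on V
complete :: S.Set Var -> Graph
complete vs = cartesian vs vs

-- Nx(G), the neighbours of x
neighbours :: Var -> Graph -> S.Set Var
neighbours x g = S.fromList [y | (x', y) <- S.toList (edges g), x' == x]

-- the least upper bound of two graphs: union of nodes and edges
lub :: Graph -> Graph -> Graph
lub g1 g2 = Graph (nodes g1 `S.union` nodes g2) (edges g1 `S.union` edges g2)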

3.3 The Call Arity analysis

Thus having motivated the need for a co-call-based analysis in order to get a precise arity analysis, I devote this section to a formal description of it. I build on the syntax introduced in Section 2.1, allowing expressions as arguments in function calls.

3.3.1 The specification

The goal of the analysis is to determine the call arity of every variable x. As defined in Section 1.5, this is a natural number αx indicating that the compiler can replace a binding let x = e with let x = λz1 ... zα. e z1 ... zα without losing any sharing.

The bottom-up analysis considers each expression e under the assumption of an incoming arity α – which is the number of arguments the expression is currently being applied to – and determines with at least how many arguments e calls its free variables, and which free variables can be called together.


Separating these two aspects into two functions, we have

Aα : Exp → (Var * N) arity analysis

Gα : Exp → Graph       co-call analysis

where * denotes a partial map and Graph is the type of undirected graphs (with self-edges) over the set of variables. The informal specifications for Aα and Gα are
• If Aα(e) x = m, then every call from e (applied to α arguments) to x passes at least m arguments.
• If x1 and x2 are not adjacent in Gα(e), then no execution of e (applied to α arguments) will call both x1 and x2. In particular, if x—x ∉ Gα(e), then x will be called at most once.
We can define a partial order on the results that expresses the notion of precision: If x is correct and x ⊑ y, then y is also correct, but possibly less precise. In particular for A, A′ : (Var * N) we have

A ⊑ A′ ⇐⇒ ∀x ∈ dom(A). A x ≥ A′ x

(note the contravariance), because it is always safe to assume that x is called with fewer arguments.

The partial order on Graph introduced in Section 3.2 is also compatible with this notion of precision: If we have G ⊑ G′, then every behaviour that is allowed by G is also allowed by G′, as it is always safe to pessimistically assume that any two variables are called together, or to assume that one variable is called multiple times.

Thus the always correct and least useful analysis result maps every variable to 0 (making no statements about the number of arguments passed to them), and returns the complete graph on all variables as the co-call graph (allowing everything to be called with everything else). The bottom of the lattice, i.e. the best information, is the empty map and the empty graph. This is the analysis result we expect for closed values such as (λy. y) or Ct.


Aα(x) = [x ↦ α]
Gα(x) = x        (the graph with the single node x and no edges)

Aα(e1 e2) = Aα+1(e1) ⊔ A0(e2)
Gα(e1 e2) = Gα+1(e1) ⊔ {x}² ⊔ fv(e1) × {x}           if e2 = x
Gα(e1 e2) = Gα+1(e1) ⊔ G0(e2) ⊔ fv(e1) × fv(e2)      otherwise

A0(λx. e) = A0(e)
Aα+1(λx. e) = Aα(e)
G0(λx. e) = (fv e)²
Gα+1(λx. e) = Gα(e)

Aα(e ? e1 : e2) = A0(e) ⊔ Aα(e1) ⊔ Aα(e2)
Gα(e ? e1 : e2) = G0(e) ⊔ Gα(e1) ⊔ Gα(e2) ⊔ fv e × (fv(e1) ∪ fv(e2))

Figure 12: The Call Arity analysis equations

3.3.2 The equations

From the specification we can derive equations for every syntactical construct, given in Figs. 12, 13 and 14. Note that, from the above definition of ⊑, the least upper bound ⊔ of two arity analysis results A, A′ ∈ (Var * N) is the pointwise minimum.
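Concretely, if such a result is modelled as a finite map from variables to arities, with absent variables playing the role of ⊥, this least upper bound is just a pointwise minimum. A small illustrative sketch (not the GHC code, which uses VarEnv):

import qualified Data.Map as M

type Var   = String
type Arity = Int

-- An arity analysis result: absent variables are not called at all (⊥).
type ArityRes = M.Map Var Arity

-- The least upper bound: keep every variable called on either side and,
-- where both sides call a variable, assume the smaller number of arguments.
lubArity :: ArityRes → ArityRes → ArityRes
lubArity = M.unionWith min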

Case 1: Variables

Evaluating a variable with an incoming arity of α yields a call to that variable with α arguments, so the arity analysis returns a singleton map. Because we are interested in the effect of one evaluation, we return x as called at most once, i.e. the graph has the node x, but no edges.

Case 2: Application

In this case, the incoming arity is adjusted: If e1 e2 is being called with α arguments, then e1 is called with α + 1 arguments. On the other hand


we do not know how many arguments e2 is called with – this analysis is not higher order (see Section 3.6.2) – so we analyse it with an incoming arity of 0.

The co-call analysis reports all possible co-calls from both e1 and e2. Additionally, it reports that everything that may be called by e1 can be called together with everything called by e2. In the evaluation of a Core program, an argument to a function is shared and thus evaluated only once. Therefore, the co-call information in G0(e2) can be used as is. There is, however, an exception: If the argument is trivial, i.e. a variable x, the Core-to-STG transformation does not introduce an explicit binding for the argument, and no sharing happens at this point. So if e1 uses its argument more than once, x will itself be called multiple times. Hence the analysis pessimistically includes {x}² in the result. This corner case was not handled in an earlier version of Call Arity; see Section 4.5.2 for a discussion of this bug.

Case 3: Lambda abstraction

For lambda abstractions, we have to distinguish two cases. The good case is if the incoming arity is nonzero, i.e. we want to know the behaviour of the expression when applied once to some arguments. In that case, we know that the body is evaluated once, applied to one argument less, and the co-call information from the body can be used directly.

If the incoming arity is zero we have to assume that the lambda abstraction is used as-is, for example as a parameter to a higher-order function, or stored in a data type. In particular, it is possible that it is going to be called multiple times. So while the incoming arity on the body of the lambda stays zero (which is always correct), we cannot obtain any useful co-call results and have to assume that every variable mentioned in e is called with every other.

Naturally, there is no point in passing arity or co-call information about the abstracted variable out of its scope. In the interest of a concise presentation, this is not explicated in Fig. 12. Section 4.3.4 contains a more pedantic formal presentation.


Example The expression e = λx. (x0 ? x1 : x2) will, when analysed with an incoming arity of 1 resp. 0, yield

G1(e) = the graph with the edges x0—x1 and x0—x2 and no self-edges, resp. G0(e) = {x0, x1, x2}², the complete graph with self-edges on all three variables.

Case 4: Case analysis

The arity analysis of a case expression is straightforward: The incoming arity is fed into each of the alternatives, while the scrutinee is analysed with an incoming arity of zero; the results are combined using ⊔. The co-call analysis proceeds likewise. Furthermore, extra co-call edges are added, connecting everything that may be called by the scrutinee with everything that may be called in the alternatives – analogous to analysing applications. This may be an over-approximation: The analysis will yield

G0((z ? x1 : x2) (z ? x3 : x4)) = the graph in which z is connected to every variable (including itself) and each of x1, x2 is connected to each of x3, x4,

which contains the edge x1—x4, although x1 cannot be called together with x4 (and analogously for x2—x3), as the conditionals will choose the same branch in both cases.

Case 5: Non-recursive let

This case is slightly more complicated than the previous, so we describe it in multiple equations in Fig. 13. We analyse the body of the let-expression first, using the incoming arity of the whole expression. Based on that we determine our main analysis result, the call arity of the variable. There are two cases:

1. If the right-hand side expression e1 is a thunk and the body of the let may possibly call it twice, i.e. there is a self-loop in the co-call graph, then there is a risk of losing work when eta-expanding e1, so we do not do that.


αx = 0              if x—x ∈ Gα(e2) and ¬isVal(e1)
αx = Aα(e2) x       otherwise

Grhs = Gαx(e1)      if x—x ∉ Gα(e2) or αx = 0
Grhs = fv(e1)²      otherwise

E = fv(e1) × Nx(Gα(e2))

A = Aαx(e1) ⊔ Aα(e2)
G = Grhs ⊔ Gα(e2) ⊔ E

Aα(let x = e1 in e2) = A
Gα(let x = e1 in e2) = G

Figure 13: Equations for a non-recursive let x = e1 in e2

2. Otherwise, the call arity is the minimum number of arguments passed to x by the code in e2, as reported by Aα(e2). Depending on this result we need to adjust the co-call information obtained from e1. Again, there are two cases:

1. We can use the co-call graph from e1 if e1 is evaluated at most once. This is obviously the case if x is called at most once in the first place. It is also the case if e1 is (and stays!) a thunk, because its result will be shared and further calls to x can be ignored here.

2. If e1 may be evaluated multiple times, we cannot get useful co-call information and therefore return the complete graph on everything that is possibly called by e1.

Finally, we combine the results from the body and the right-hand side, and add the appropriate extra co-call edges. We can be more precise than in the application case, because we can exclude variables that are not called together with x from the complete bipartite graph. Note again that we do not clutter the presentation here with removing the local variable from the final analysis results. The implementation removes x from A and G before returning them.
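The two case distinctions above can also be written down as a small sketch, reusing the Arity synonym and the naive Graph type with square from the sketches earlier in this chapter; callArityOf and rhsGraph are names invented for this illustration and are not part of the implementation.

-- The call arity of a non-recursive binding x = e1 with body e2, analysed
-- at incoming arity α (assuming x is called by e2 at all): a thunk with a
-- self-loop at x in the body's co-call graph must not be eta-expanded.
callArityOf :: Bool    -- isVal e1
            → Bool     -- x—x ∈ Gα(e2)
            → Arity    -- Aα(e2) x
            → Arity
callArityOf isVal selfLoop bodyArity
  | selfLoop && not isVal = 0
  | otherwise             = bodyArity

-- The co-call graph contributed by the right-hand side: the analysed graph
-- if e1 is evaluated at most once (no self-loop at x, or e1 stays a thunk),
-- otherwise the complete graph on its free variables.
rhsGraph :: Bool        -- x—x ∈ Gα(e2)
         → Arity        -- αx
         → Graph        -- Gαx(e1)
         → S.Set Var    -- fv(e1)
         → Graph
rhsGraph selfLoop alphaX gRhs fvRhs
  | not selfLoop || alphaX == 0 = gRhs
  | otherwise                   = square fvRhs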


Example Consider the expression

e = let z = (x ? (λy. x2) : x3) in λ_. (x1 ? x2 : z y)

with an incoming arity of 1. The co-call graph of the body is

G1(λ_. (x1 ? x2 : z y)) = a graph in which z is adjacent to x1 and y and carries no self-edge,

and A1(λ_. (x1 ? x2 : z y)) z = 1. The right-hand side of z’s definition is a thunk, so we must be careful when eta-expanding it. But there is no self-loop at z in the graph, so z is called at most once. The call arity of z is thus αz = 1 and we analyse its right-hand side with an incoming arity of 1 to obtain

G1(x ? (λy. x2) : x3) = the graph with the edges x—x2 and x—x3.

The additional edges E connect all free variables of the right-hand side ({x, x2, x3}) with everything called together with z from the body ({x1, y}), and the overall result (skipping the now out-of-scope z) is

G1(e) = the graph that, in addition to the edges inherited from the body and the right-hand side, connects each of x, x2 and x3 with each of x1 and y.

Note that although x2 occurs in both the body and the right-hand side, there is no self-loop at x2: The analysis has detected that x2 is called at most once. The results are very different if we analyse e with an incoming arity of 0. The body is a lambda abstraction, so it may be called many times, and we have

G0(λ_. (x1 ? x2 : z y)) = {x1, x2, z, y}², the complete graph with self-edges on all four variables.

This time there is a self-loop at z, and we need to set αz = 0 to be on the safe side. This also means that z stays a thunk and we still get some useful information from the right-hand side:


G0(x ? (λ_. x2) : x3) = the graph with the edges x—x2 and x—x3 and a self-edge at x2.

Due to the lower incoming arity we can no longer rule out that x2 is called multiple times, as it is hidden inside a lambda abstraction. The final graph now becomes quite large, because everything in the body is potentially called together with z:

G0(e) = the complete graph on {x, x1, x2, x3, y}, except that x and x3 carry no self-edges.

This is almost the complete graph, but it is still possible to derive that x and x3 are called at most once.

Case 6: Recursive let

The final case, described in Fig. 14, is the most complicated. It is also the reason why the figures are labelled “Equations” and not “Definitions”: They are also mutually recursive and it is the task of the implementation to find a suitable solution strategy (see Section 3.4.2).

The complication arises from the fact that the result of the analysis affects its parameters: If the right-hand side of a variable calls itself with a lower arity than the body, we need to use the lower arity as the call arity. Therefore, the final result (A and G in the equations) is also used to determine the basis for the call-arity and co-call information of the variables.

Thunks aside, we can think of one recursive binding let x = e1 in e2 as an arbitrarily large number of nested non-recursive bindings

let x = e1 in let x = e1 in let x = e1 in . . . let x = e1 in e2.

The co-call information G can be thought of as the co-call information of this expression, and this is how xi—xi ∉ G has to be interpreted: Not


that there is at most one call to xi in the whole recursion (there probably are many, why else would there be a recursive let), but rather that when doing such an unrolling of the recursion, there is at most one call to xi leaving the scope of the outermost non-recursive let. This analogy is flawed for thunks, where multiple nested non-recursive bindings would have a different sharing behaviour. Therefore, I set

αxi = 0 for all thunks in a recursive let; this preserves sharing.

The formulas for the additional co-calls Eⁱ are a bit more complicated than in the non-recursive case, and differ for thunks and non-thunks. Consider one execution that reaches a call to xi. What other variables might have been called on the way? If the call came directly from the body e, then we need to consider everything that is adjacent to xi in Gα(e). But it is also possible that the body has called some other xj, j ≠ i, and ej then has called xi – in that case, we need to take those variables adjacent to xj in Gα(e) and those adjacent to xi in Gʲ.

A = Aα(e) ⊔ ⨆i Aαxi(ei)
G = Gα(e) ⊔ ⨆i Gⁱ ⊔ ⨆i Eⁱ

αxi = 0          if ¬isVal(ei)
αxi = A xi       otherwise

Gⁱ = Gαxi(ei)    if xi—xi ∉ G or αxi = 0
Gⁱ = fv(ei)²     otherwise

Eⁱ = fv(ei) × N(Gα(e) ⊔ ⨆j Gʲ)         if αxi ≠ 0
Eⁱ = fv(ei) × N(Gα(e) ⊔ ⨆j≠i Gʲ)       if αxi = 0

N(G) = {z | z—xi ∈ G for some i}

Aα(let xi = ei in e) = A
Gα(let xi = ei in e) = G

Figure 14: Equations for a recursive let xi = ei in e


In general, every call that can occur together with any recursive call in any of the expressions can occur together with whatever xi does. For a thunk we can get slightly better information: A non-thunk ei can be evaluated multiple times during the recursion, so its free variables can be called together with variables on ei’s own recursive path. A thunk, however, is evaluated at most once, even in a recursive group, so for the calculation of additional co-call edges it is sufficient to consider only the other right-hand sides (and the body of the let, of course). Example Consider the expression

let x1 = λy. (y1 ? x2 y : z1)
    x2 = λy. (y2 ? x1 y : z2)
in λy. x1 y y

with an incoming arity of 1. It is an example of a nice tail-call recursion as it is commonly produced by list fusion: The body has one call into the recursive group, and each function in the group also calls at most one of them. The minimal solution to the equations in Fig. 14 in this example is

G1(e) = {}
αx1 = αx2 = 2
G¹ = G2(e1) = {y1} × {x2, z1}
G² = G2(e2) = {y2} × {x1, z2}
E¹ = {y1, x2, z1} × {y1, y2}
E² = {y2, x1, z2} × {y1, y2}

and the final result is

G = the graph in which each of x1, x2, z1 and z2 is adjacent to both y1 and y2 (which are also adjacent to each other and carry self-edges), with no edges among x1, x2, z1 and z2,


where we see that at most one of z1 and z2 is called by the recursive group, and neither of them twice. In contrast consider this recursion which forks in x2:

let x1 = λy. (y1 ? x2 y : z1)
    x2 = λy. (y2 ? x1 (x1 y y) : z2)
in λy. x1 y y.

We see that now z1 and z2 are possibly called together and multiple times. Indeed x1—x1 ∈ G2(e2) causes x1 ∈ N(. . .) in the equation for Eⁱ, so especially x1—x1 ∈ E² ⊆ G. Therefore, G¹ = fv(e1)², and we also have x2—x2 ∈ G and G² = fv(e2)². Eventually, we find that the result is the complete graph on all variables, i.e. E = {x1, x2, y1, y2, z1, z2}², and in particular z1—z2 ∈ E, as expected.

3.4 The implementation

This section is of primary interest to those readers who want to understand the implementation of Call Arity in GHC. I explain some of the design choices and how the code relates to the definitions in this thesis.

The Call Arity analysis is implemented in GHC as a separate Core-to-Core pass, where Core is GHC’s typed intermediate language based on System FC (cf. Section 1.4.1). See Appendix B.2 for the code of the analysis. This pass does not actually do the eta-expansion; it merely annotates let-bound variables with their call arity. A subsequent pass of GHC’s simplifier then performs the expansion, using the same code as for the regular, definition-based arity analysis, and immediately applies optimisations made possible by the eta-expansion. This separation of concerns keeps the Call Arity implementation concise and close to the formalisation presented here.

GHC Core obviously has more syntactical constructs than our toy lambda calculus, including literals, coercion values, casts, profiling annotations (“ticks”), type lambdas and type applications, but these are

irrelevant for our purposes: For literals and coercion values Call Arity returns the bottom of the lattice; the others are transparent to the analysis. In particular, type arguments are not counted towards the arity here, which coincides with the meaning of arity as returned by GHC’s regular arity analysis.

I want the analysis to make one pass over the syntax tree (up to the iterative calculation of fixed points for recursive bindings, Section 3.4.2). So instead of having two functions – one for the arity analysis and one for the co-call analysis – I defined one function callArityAnal which returns a tuple (UnVarGraph, VarEnv Arity), where UnVarGraph is a data structure for undirected graphs on variable names (see Section 3.4.4) and VarEnv Arity is a partial map from variable names to Arity, which is a type synonym for Int.

The equations refer to fv e, the set of free variables of an expression. In the implementation, I do not use GHC’s corresponding function exprFreeIds, as this would require another traversal of the expression. Instead I use dom (Aα(e)), which by construction happens to be the set of free variables of e, independent of α, as at this stage in the compiler, “obviously” dead code has been removed.

In the sequence of Core-to-Core passes, I inserted Call Arity and its eta-expanding simplifier pass after the simplifier’s phase 0, as that is when all the rewrite rules have been active [PTH01], and before the strictness analyser. This way, the latter has a chance to unbox any new function parameters introduced by Call Arity, such as the accumulator in a call to sum.

3.4.1 Interesting variables

The analysis as presented in the previous section would be too expensive if implemented as is. This can be observed when compiling GHC’s DynFlags module, which defines a record type with 157 elements. The Core code for a setter of one of the fields is

setX42 x (DynFlags x1 . . . x41 _ x43 . . . x157)
  = (DynFlags x1 . . . x41 x x43 . . . x157).


For the body of the function, the analysis would report that 157 variables are called with (at least) 0 arguments, and that all of them are co-called with every other, a graph with 12246 edges. And none of this information is useful: The variables come from function parameters or pattern matches and there is no definition that we can possibly eta-expand!

Therefore, the code keeps track of the set of interesting variables, and only returns information about them. Currently, interesting variables are all let-bound variables of function type, while function parameters and pattern match variables are not interesting.

Generally, considering fewer variables as interesting will trade precision for performance, but preserves soundness: It would be perfectly sound, for example, to consider the variables of a very large recursive group to be uninteresting.

The complete type signature of the analysis is therefore

callArityAnal :: Arity → VarSet → CoreExpr
              → ((UnVarGraph, VarEnv Arity), CoreExpr)

where the arguments are
• the incoming arity,
• the set of interesting variables and
• the expression to analyse,
and the return values are the co-call graph and arity information (both restricted to the set of interesting variables) and the expression with the Call Arity result annotation added.
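To illustrate the overall shape of such a single pass, here is a toy version over a minimal expression type, computing the co-call graph and the arity map together as in Fig. 12. It reuses the naive Graph sketch from Section 3.2 and the Data.Map import from the earlier arity-result sketch, ignores the set of interesting variables and the result annotation, glosses over removing lambda-bound variables from the result, and only covers variables, applications and lambdas; it is an illustration, not the GHC code.

data Expr = Var Var | App Expr Expr | Lam Var Expr

fv :: Expr → S.Set Var
fv (Var x)     = S.singleton x
fv (App e1 e2) = fv e1 `S.union` fv e2
fv (Lam x e)   = S.delete x (fv e)

analyse :: Arity → Expr → (Graph, M.Map Var Arity)
analyse alpha (Var x) =
    -- one call to x with alpha arguments: node x, no edges
    ( Graph (S.singleton x) S.empty, M.singleton x alpha )
analyse alpha (App e1 e2) =
    let (g1, a1) = analyse (alpha + 1) e1
        (g2, a2) = analyse 0 e2
        gArg = case e2 of
          -- a trivial argument is not shared, so pessimistically add x—x
          Var x → square (S.singleton x) `lub` cartesian (fv e1) (S.singleton x)
          _     → g2 `lub` cartesian (fv e1) (fv e2)
    in ( g1 `lub` gArg, M.unionWith min a1 a2 )
analyse alpha (Lam x e)
  | alpha == 0 = let (_, a) = analyse 0 e
                 in ( square (fv e), a )   -- may be called many times
  | otherwise  = analyse (alpha - 1) e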

3.4.2 Finding the fixed points

The equations in the previous section specify the analysis, but do not provide an algorithm: In the case of a recursive let (Fig. 14), the equations are mutually recursive and the implementation has to employ a suitable strategy to find a solution. The implementation finds the solution by iteratively approaching the fixed point, using memoisation of intermediate results.

1. Initially, it sets A = Aα(e) and G = Gα(e).


2. For every variable xi ∈ dom A that has not been analysed before, or has been analysed before with different values for αxi or xi—xi ∈ G, it (re-)analyses it, remembering the parameters and memoising the results Aαxi(ei) and Gⁱ.

3. If any variable has been (re-)analysed in this iteration, it recalculates A and G and repeats from step 2.

This process will terminate, as shown by a simple standard argument:

The variant that proves this consists of αxi and whether xi—xi ∈ G. The former starts at some natural number and decreases, the latter may start as not true, but once it is true, it stays true. Therefore, these parameters can change only a finite number of times, and the loop terminates once all of them are unchanged during one iteration. The monotonicity of the parameters follows from the monotonicity of the equations for Aα and Gα: We have that α ≥ α′ implies Aα(e) ⊑ Aα′(e) and Gα(e) ⊑ Gα′(e).
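The skeleton of this loop can be sketched as follows; Params and recompute are placeholder names standing for the per-variable parameters (αxi together with the self-loop flag) and for one round of re-analysing the changed bindings and reassembling A and G – this is an illustration of the iteration scheme, not the GHC code.

-- The parameters that determine the analysis of the right-hand sides:
-- for each bound variable its current call arity and whether it has a
-- self-loop in the current co-call graph G.
type Params = M.Map Var (Arity, Bool)

-- Iterate one round of re-analysis until the parameters are stable.
-- Termination: each arity only ever decreases and each self-loop flag
-- only ever switches from False to True, so only finitely many rounds
-- can change anything.
solve :: (Params → Params) → Params → Params
solve recompute ps
  | ps' == ps = ps
  | otherwise = solve recompute ps'
  where ps' = recompute ps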

3.4.3 Top-level values

GHC supports modular compilation. Therefore, for exported functions, the compiler does not have the call sites available to analyse. Nevertheless I do want it to be able to analyse and eta-expand at least non-exported top-level functions. To solve this elegantly I treat a module

module Foo(foo) where
bar = . . .
foo = . . .

as if it were a sequence of let-bindings

let bar = . . . in
let foo = . . . in
e

where e represents the external code for which I assume the worst: It calls all exported variables (foo here) with 0 arguments and the co-call graph is the complete graph. This prevents unwanted expansion of foo, but still allows us to eta-expand bar based on how it is called by foo.


Unfortunately, it also means that adding a top-level function to the export list of the module can prevent Call Arity from eta-expanding it and other functions in the module. If, for example, I export all difference-list producing functions in the code mentioned in Section 3.5.3, then I lose the benefits from Call Arity. In this sense, Call Arity can be considered a whole-program analysis that happens to be useful in a setting with separate compilation as well.

3.4.4 The graph data structure

The analysis often builds complete bipartite graphs and complete graphs between sets of variables. A usual graph representation like adjacency lists would be quadratic in size and too inefficient for this use. Hence, the data type UnVarGraph used in the implementation is specifically crafted for this purpose; see Appendix B.1 for the code. It represents graphs symbolically, as multisets (“bags” in the lingo of GHC code) of complete bipartite and complete graphs:

data Gen = CBPG VarSet VarSet | CG VarSet
type UnVarGraph = Bag Gen

This allows for very quick, O(1), creation and combination of graphs. The important query operation, calculating the set of neighbours of a node, is done by traversing the generating subgraphs.

One disadvantage of this data structure is that it does not normalise the representation. In particular, the union of a graph with itself is twice as large. I had to take that into account when I implemented the calculation of fixed points: It would be very inefficient to update G by merging it with the new results in each iteration. Instead, G is always reassembled from Gα(e) and the – new or memoised – results from the bound expressions.

I experimented with simplifying the graph representation using identities like S1 × S2 ∪ S1² ∪ S2² = (S1 ∪ S2)², but it did not pay off, especially as deciding set equality can be expensive.
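To illustrate the idea of the symbolic representation, the neighbours query can be sketched like this, modelling VarSet as a Data.Set and the bag simply as a list; the real code in GHC (Appendix B.1) differs in the details, and neighboursOf is a name chosen only for this sketch.

import qualified Data.Set as S

type Var    = String
type VarSet = S.Set Var

data Gen        = CBPG VarSet VarSet   -- complete bipartite graph
                | CG   VarSet          -- complete graph (with self-edges)
type UnVarGraph = [Gen]                -- a bag of generating subgraphs

-- Neighbours of x: traverse the generators and collect, for every
-- generator that mentions x, the nodes x is connected to in it.
neighboursOf :: Var → UnVarGraph → VarSet
neighboursOf x = S.unions . map gen
  where
    gen (CBPG s1 s2) =
      (if x `S.member` s1 then s2 else S.empty) `S.union`
      (if x `S.member` s2 then s1 else S.empty)
    gen (CG s)
      | x `S.member` s = s
      | otherwise      = S.empty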


3.5 Discussion

3.5.1 Call Arity and list fusion

As hinted at in the introduction, I devised Call Arity mainly to allow for a fusing foldl, i.e. a definition of foldl in terms of foldr that takes part in list fusion while still producing good code. How exactly does Call Arity help here? Consider the code

sum (filter f [42..2016]).

Previously, only filter would fuse with the list comprehension, eliminating one intermediate list, but the call to sum, being a left-fold, would remain: Compiled with previous versions of GHC, this produces code roughly equivalent to

let go = λx → let r = if x == 2016 then [] else go (x + 1)
              in if f x then x : r else r
in foldl (+) 0 (go 42).

If we change the definition of foldl to use foldr, as in

foldl k z xs = foldr (λv fn z → fn (k z v)) id xs z,

all lists are completely fused and we obtain the code

let go = λx → let r = if x == 2016 then id else go (x + 1)
              in if f x then λa → r (a + x) else r
in go 42 0.

Without Call Arity, this was the final code, and as such quite inefficient: The recursive loop go has become a function that takes one argument, then allocates a function closure for r on the heap, and finally returns another heap-allocated function closure which will pop the next argument from the stack – not the fastest way to evaluate this simple program.


With Call Arity the compiler detects that go and r can both safely be eta-expanded with another argument, yielding the code

let go = λx a → let r = λa → if x == 2016 then a else go (x + 1) a
                in if f x then r (a + x) else r a
in go 42 0

where the number of arguments passed matches the number of lambdas that are manifest on the outer level. This avoids allocations of function closures and allows the runtime to do fast calls [MP06], or even tail-recursive jumps.

3.5.2 Limitations

A particularly tricky case is list fusion with generators with multiple (or non-linear) recursion. This arises when flattening a tree to a list. Consider the code

data Tree = Tip Int | Bin Tree Tree

toList :: Tree → [Int]
toList tree = build (toListFB tree)

toListFB root cons nil = go root nil
  where go (Tip x)   rest = cons x rest
        go (Bin l r) rest = go l (go r rest)

which is a good producer; for example filter f (toList t) is compiled to

let go = λt rest → case t of Tip x   → if f x then x : rest else rest
                             Bin l r → go l (go r rest)
in go t [].


If we add a left-fold to the pipeline, i.e. foldl (+) 0 (filter f (toList t)), where the foldl is implemented via foldr, the resulting code (before Call Arity) is

let go = λt fn → case t of Tip x   → if f x then (λa → fn (x + a)) else fn
                           Bin l r → go l (go r fn)
in go t id 0.

Although go is always being called with three arguments, my analysis does not see this. For that it would have to detect that go calls its second parameter with one argument; as it is a forward analysis (in the nomenclature of [XP05]) it cannot do that.

And even if GHC could eta-expand it (in fact it can, due to the one-shot annotation discussed in Section 3.6.3), the result would not be much better: For the recursion, the runtime still needs to create a function closure for the unsaturated call go r fn, which is then called slowly by go, as explained in Section 1.4.3.

Things look better if we adjust the definition of toList so that the worker is tail-recursive. This requires an explicit stack, keeping track of the branches of the tree that are yet to be visited:

toListFB root cons nil = go root []
  where go (Tip x)   s = cons x (goS s)
        go (Bin l r) s = go l (r:s)
        goS []     = nil
        goS (x:xs) = go x xs

Now the resulting code is a nice tail-recursive loop, and it even allows GHC to unbox the accumulator, which usually provides a large performance benefit. But the code that we would really want to see, and which we would write by hand, is

let go = λt a → case t of Tip x   → if f x then a + x else a
                          Bin l r → go l (go r a)
in go t 0

with no continuations or explicit stack whatsoever, and just the accumulator (unboxed by GHC) being passed through the recursion. Such a transformation would require much more involved changes to the code than just eta-expansion followed by simplification, and is out of scope for Call Arity.

We (the GHC developers) still decided to let foldl take part in list fusion based on the benchmark results, presented in the next section. They indicate that the real-world benefits in the common case of linear recursion are larger than the penalty in the non-linear recursion, and if necessary, the producer can be adjusted to be linearly recursive.

3.5.3 Measurements

No work on compiler optimisations without some benchmark results! I compare four variants, all based on the GHC 7.10.3 codebase (revision 97e7c29):

(A) For the baseline, I removed the Call Arity analysis code and undid the changes to the library code, i.e. reverted foldl to its naive, non-fusing definition.

(B) To measure the effect of the Call Arity analysis alone, I enabled it again, but left foldl with the naive definition.

(C) The current, unmodified state of the compiler, with Call Arity enabled and foldl implemented via foldr, is the most relevant variant; in the table this column is highlighted.

(D) To assess the importance of Call Arity for allowing foldl to take part in list fusion, I also measured GHC without Call Arity, but with foldl implemented via foldr.

Setup

The ubiquitous benchmark suite for Haskell is nofib [Par93], a set of 100 example Haskell programs, ranging from small micro-benchmarks to


“real” applications. Most benchmarks support different modes to make them run longer. My numbers all result from the “slow” mode. The measurements are taken on an 8-core Intel i7-3770 machine with 16 GB of RAM running Ubuntu 14.04 on Linux 3.13.0.

Initially, I attempted to use the actual run time measurements, but it turned out to be a mostly pointless endeavour. For example, the knights benchmark would become 9% slower when enabling Call Arity (i.e. when comparing (A) to (B)), a completely unexpected result, given that the changes to the GHC Core code were reasonable. Further investigation using performance data obtained from the CPU indicated that with the changed code, the CPU’s instruction decoder was idling for more cycles, hinting at cache effects and/or bad program layout. Indeed: When I compiled the code with the compiler flag -g, which includes debugging information in the resulting binary, but should otherwise not affect the relative performance characteristics much, the unexpected difference vanished.

I conclude that non-local changes to the Haskell or Core code will change the layout of the generated program code in unpredictable ways and render such run time measurements mostly meaningless. This conclusion has been drawn before [MDHS09], and recently, tools to mitigate this effect, e.g. by randomising the code layout [CB13], were created. Unfortunately, these currently target specific C compilers, so I could not use them here.

In the following measurements, I avoid this problem by not measuring program execution time, but simply by counting the number of instructions performed. This way, the variability in execution time due to code layout does not affect the results. To obtain the instruction counts I employ valgrind [NS07], which runs the benchmarks on a virtual CPU and thus produces more reliable and reproducible measurements.

Results

My results are shown in Table 1. The three columns per metric correspond to the variants (B), (C) and (D) described above, and the given percentages are changes relative to (A). Negative numbers indicate improvement. I list those benchmarks where a difference of 1% or more is observed.


Table 1: Nofib results, relative to (A)

                     Bytes allocated              Instructions executed
                    (B)      (C)      (D)        (B)      (C)      (D)
Arity Analysis       X        X                   X        X
Co-call Analysis     X        X                   X        X
foldl via foldr               X        X                   X        X
anna              -1.5%    -1.7%    +0.1%      -0.5%    -0.4%    +0.7%
bernouilli        -0.0%    -4.2%    +4.5%      +0.0%    -3.5%   +22.5%
binary-trees      -0.0%    -0.0%     0.0%      -7.3%    -7.3%    -0.0%
fem                0.0%    -2.6%    -1.6%      -0.0%    -4.4%    -1.4%
fft2              -0.0%   -48.2%   -48.1%      +0.0%   -18.6%   -14.3%
fibheaps          -4.5%    -4.5%     0.0%     -12.0%   -12.0%    -0.0%
fish              -5.1%    -5.1%     0.0%      -3.9%    -3.9%    +0.0%
fluid             -0.3%    -8.4%    -7.7%      +0.7%    -3.6%    -4.5%
fulsom            -0.4%    -0.4%     0.0%      -1.5%    -1.5%    +0.0%
gen_regexps        0.0%   -53.9%   +33.8%      -0.0%    -7.8%  +205.8%
gg                 0.0%     0.0%     0.0%      +0.0%    +0.0%    -1.1%
hidden            -0.2%    -6.0%    +1.2%      -0.3%    -4.6%    +1.2%
hpg               -0.1%    -1.3%    -1.2%      -0.0%    -1.8%    -0.8%
integrate         -0.0%   -60.9%   -60.9%      +0.0%   -47.2%   -47.2%
lcss              -0.0%    -0.0%     0.0%      -2.5%    -2.5%    -0.0%
life              -0.0%    -0.0%    +0.0%      -0.3%    -0.3%    +2.1%
maillist           0.0%     0.0%     0.0%      -0.1%    -0.6%    +1.0%
minimax            0.0%   -15.5%    +4.0%      -0.0%   -13.6%    +4.7%
scs               -1.7%    -2.5%    +0.4%      -1.5%    -1.9%    -0.2%
simple             0.0%    -9.4%    +8.2%      -0.0%    -1.9%   +17.6%
x2n1              -0.0%   -77.4%   +84.0%      -0.0%    -8.3%  +245.7%
. . . and 78 more
Min               -5.1%   -77.4%   -60.9%     -12.0%   -47.2%   -47.2%
Max                0.0%    +0.0%   +84.0%      +0.7%    +0.8%  +245.7%
Geometric Mean    -0.2%    -4.5%    -0.6%      -0.3%    -1.7%    +2.0%


As expected, enabling Call Arity does not increase the number of dynamic allocations; if it did, something would be wrong – see Chapter 4 for a formal proof of that statement.

On its own, the analysis rarely has an effect: Programmers tend to give their functions the right arities in the first place, and this includes the code in nofib. Some of the improvements, e.g. in fibheaps, can be attributed to the use of take: Even before the introduction of Call Arity, it was set up to be a good consumer, producing similar higher-order code as foldl would. Call Arity can successfully optimise that.

But the real strength of Call Arity can only be seen in combination with making foldl a good consumer: Allocation improves considerably, and without the analysis, this change to foldl would actually degrade the run time performance. The last column (D) shows that without Call Arity, making foldl a good consumer is a bad idea, and that in some cases, the number of allocations and instructions goes through the roof.

Difference lists

Another setting, besides list fusion, where non-expanded function definitions may emerge is when function types are hidden behind a type abstraction, and combined using abstract combinators. A good example for this is the type of difference lists, which represent lists as functions of type [a] → [a]. Module DList in Fig. 15 contains a standard implementation of difference lists. Note that all combinators are eta-reduced: The argument providing the tail of the list is omitted.

As a micro-benchmark, the code in module Bench in Fig. 15 converts a list of non-negative integers into their decimal representation, space separated. Table 2 lists the execution time of applying doIt to a list of 1,000,000 integers, measured using criterion [OSu15]. We can see that without the help of Call Arity, the code is actually 16% slower than the equivalent

module DList where

newtype DList a = DL ([a] → [a])

fromDList :: DList a → [a]
fromDList (DL f) = f []

singleton :: a → DList a
singleton c = DL (c:)

empty :: DList a
empty = DL id

(<>) :: DList a → DList a → DList a
DL f <> DL g = DL (f . g)

module Bench (doIt) where

import DList
import Data.Char (intToDigit)

showInt :: Int → DList Char
showInt n | n < 10    = singleton (intToDigit n)
          | otherwise = showInt (n ‘div‘ 10) <> showInt (n ‘mod‘ 10)

go :: [Int] → DList Char
go []     = empty
go (x:xs) = showInt x <> singleton ' ' <> go xs

doIt :: [Int] → String
doIt xs = fromDList (go xs)

Figure 15: Difference lists

code using String and string concatenation naively. With Call Arity enabled, the difference list code runs twice as fast, beating the String code by 35%.

The benefits of Call Arity are less pronounced if some of the involved functions are exported. In that case, the compiler has to make conservative assumptions about how often the function is called and Call Arity cannot eta-expand it. If showInt or go is added to the export list of module Bench, the performance advantage compared to the String version disappears in the noise.

Table 2: Difference list speedup

                                  Running time
Call Arity                             X        X        X
showInt exported                                X        X
go exported                                              X
String                     129ms    128ms    128ms    136ms
DList                      151ms     84ms    131ms    130ms

3.5.4 Compiler performance

Call Arity could affect the compile times in two ways: It increases them, because the compiler does more work. But it also reduces them, as the compiler itself has been optimised more. Table 3 shows the change in allocations done and time spent by the compiler while compiling the nofib test suite, as well as while compiling GHC itself. In the nofib row we can see that the latter is indeed happening – enabling Call Arity reduces the number of allocations performed by the compiler, despite it doing more work – but this does not incur a significant change in compiler run time. The change to foldl alone increases compile times slightly. In the second row, there is a third factor: The code base itself increases, due to the addition of the Call Arity code itself, which is reflected by the benchmark results.


Note that the benchmark suite is not designed to produce stable measurements of compile time, e.g. the compiler is run only once, so the significance of these numbers should not be taken too seriously. Ağacan measured the contribution of individual compiler passes to the overall compilation time, by compiling selected, widely used Haskell libraries and their dependencies. He reported his findings on the GHC developers’ mailing list11, and found that Call Arity is responsible for 1.1% of the compilation time.

Table 3: Compiling nofib and GHC, relative to (A)

                    Bytes allocated            Compile time
                   (B)      (C)      (D)      (B)      (C)      (D)
Arity Analysis      X        X                 X        X
Co-call Analysis    X        X                 X        X
foldl via foldr              X        X                 X        X
nofib            -0.1%    -0.1%    +0.1%    -0.1%    -1.4%    +0.3%
ghc              +3.8%    +3.8%    -0.0%    +5.7%    +6.0%    -0.2%

3.6 Related work

Gill mentions in his thesis on list fusion [Gil96] that eta-expansion is required to make foldl a good consumer that produces good code, and outlines a simple arity analysis. It does not discuss thunks at all and is equivalent to the second refinement in Section 3.1.

3.6.1 GHC’s arity analyses

The compiler already comes with an arity analysis, which is complementary to Call Arity: It ignores how functions are being used and takes their definition into account. It traverses the syntax tree and for each expression returns its arity, i.e. the number of arguments the expression

11 March 29, 2016; https://mail.haskell.org/pipermail/ghc-devs/2016-March/011651.html

can be applied to before doing any real work. This allows the transformation to turn x ? (λy. e1) : (λy. e2) into λy. (x ? e1 : e2) on the grounds that the check whether x is true or false is a negligible amount of work, and it is therefore better to eta-expand the expression, even if this means that the check is done repeatedly. But this is just a heuristic, and can lead to unwanted performance losses, e.g. if the scrutinee does a deep pattern match.12 Therefore, Call Arity would refrain from doing this unless it knows for sure that the expression is going to be called at most once.

This arity analyser can make use of one-shot annotations on lambda binders. Such an annotation indicates that the lambda will be called at most once, which allows the analysis to derive greater arities and expand thunks: If the lambdas in (f x) ? (λy. e1) : (λy. e2) are annotated as one-shot, this would be expanded to λy. ((f x) ? e1 : e2).

The working notes in [XP05] describe this analysis as the forward arity analysis. Like Call Arity, it can only determine arities of let-bound expressions and will not make any use of arity information on parameters. Consider, for example,

let g = . . .
    s f = f 3
in . . . (s g) . . .

where we would have a chance to find out that g is always called with at least one argument.

A backward arity analysis capable of doing this is also described in [XP05]. This analysis calculates the arity transformer of a function f: a mapping from the number of arguments f is called with to the number of arguments passed to f’s parameters. It is not implemented in GHC as such, but subsumed by the combined strictness/demand/cardinality analyser. This analyser would assign a strictness signature to the function s: The strictness information on the left of that signature indicates that s is strict in its first argument, and also in the value returned by calling its first argument as a function; the usage information on the right indicates that evaluating s f will evaluate f at most once, call it at most once, and

12 See for example http://ghc.haskell.org/trac/ghc/ticket/11029.

the result of that call will be used. The latest description of this analyser can be found in [SVP14].

Neither of these two analyses is capable of transforming the bad code from Fig. 11 into the desired form: The former has to give up as the expression f a might be expensive; the latter looks at the definition of goB before analysing the body and is therefore unable to make use of the fact that goB is always called with two arguments.

3.6.2 Higher order sharing analyses

The role of the co-call analysis in this setting is to provide a simple form of sharing analysis (using the nomenclature of [HHM07]), which is required to safely eta-expand thunks. Such analyses have been under investigation for a long time, e.g. to avoid the updating of thunks that are used at most once, or to enforce uniqueness constraints. These systems often support a higher-order analysis in some way, e.g. using detailed usage types [SVP14], possibly with polyvariance [HHM07].

It would be desirable to have such expressive usage types available in our analysis, and I do not foresee a problem in using them. It will, however, be hard to obtain them: The co-call analysis does not just analyse the code as it is, but rather anticipates its shape after eta-expansion based on the Call Arity result. So in order to determine a precise higher-order demand type for a function f, we need to know its Call Arity. For that we need to analyse the scope of f for how it is used, which is where we want to make use of the higher-order information on f. Going this route would require a fixed-point iteration for every binding, which is prohibitively expensive. This is also why integrating Call Arity directly into GHC’s existing demand analyser [SVP14], which analyses function bodies before their uses, would be difficult.

Another noteworthy difference to the cited analyses is that these either skip the discussion of recursive bindings, or treat them too imprecisely to handle code resulting from list fusion. It would be interesting to see if the concept of a co-call graph could be used in a stand-alone backward sharing analysis to improve precision in the presence of recursion.


3.6.3 Explicit one-shot annotation

While I was pondering the issue of a well-fusing foldl, I also pursued another way of solving the problem, besides Call Arity: For that I created a magic, built-in function oneShot :: (a → b) → a → b. It is semantically the identity, but the compiler may assume that the function oneShot f is called at most once. I can use this function when implementing foldl in terms of foldr:

foldl k z xs = foldr (λv fn → oneShot (λz → fn (k z v))) id xs z

This solves our problem with the bad code generated for sum (filter f [42..2016]) from Section 3.5.1: The compiler sees

let go = λx → let r = if x == 2016 then id else go (x + 1)
              in if f x then oneShot (λa → r (a + x)) else r
in go 42 0

and, because the λa is marked as oneShot, the existing arity analysis will happily eta-expand go.

Note that oneShot is unchecked: The programmer or library author has the full responsibility to ensure that the function is really applied only once. This is given in the case of foldl, as we know the definition of foldr and that it applies its argument at most once for each element of the list.

The GHC developers initially decided to go the Call Arity route because it turned out to be no less powerful than the explicit annotation, has the potential to optimise existing user code as well, and ensures the correctness of the transformation. Later, the developers decided that it does not hurt to simply employ both approaches in GHC. The nofib benchmark suite does not exhibit such a case, but it is quite possible that there are instances out there, maybe similar to the tree example in Section 3.5.2, where Call Arity fails and the oneShot annotation might save the day.

Table 4 compares the performance of
• using only oneShot,
• using only Call Arity and
• using both, as is the case in the released version of GHC

against the baseline of GHC 7.10.3 without oneShot and Call Arity, but with foldl implemented via foldr (i.e. variant (D) in Section 3.5.3). It shows that in most cases, oneShot and Call Arity yield the same performance gains, and only a few benchmarks (e.g. fibheaps, scs) show that Call Arity is a bit more powerful.

Table 4: Measuring the effect of one-shot annotations

                    Bytes allocated              Instructions executed
Call Arity                    X        X                   X        X
oneShot              X                 X          X                 X
anna              -0.2%    -1.7%    -1.8%      -0.6%    -1.1%    -1.1%
bernouilli        -8.3%    -8.3%    -8.3%     -21.3%   -21.2%   -21.3%
binary-trees       0.0%    -0.0%    -0.0%      -0.0%    -7.3%    -7.3%
cacheprof         -1.2%    -0.6%    -0.0%      -1.6%    -2.7%    +0.9%
fem               -1.0%    -1.0%    -1.0%      -3.1%    -3.1%    -3.1%
fft2              -0.2%    -0.2%    -0.2%      -5.2%    -5.1%    -5.1%
fibheaps           0.0%    -4.5%    -4.5%      +0.0%   -12.0%   -12.0%
fish               0.0%    -5.1%    -5.1%      -0.0%    -3.9%    -3.9%
fulsom             0.0%    -0.4%    -0.4%      -0.0%    -1.5%    -1.5%
gen_regexps      -65.6%   -65.6%   -65.6%     -69.9%   -69.9%   -69.9%
gg                 0.0%     0.0%     0.0%      +1.2%    +1.2%    +1.1%
hidden            -6.9%    -7.0%    -7.0%      -5.5%    -5.7%    -5.7%
hpg                0.0%    -0.1%    -0.1%      +0.0%    -1.0%    -1.0%
lcss               0.0%    -0.0%    -0.0%      +0.0%    -2.5%    -2.5%
life              -0.0%    -0.0%    -0.0%      -2.7%    -0.2%    -2.4%
maillist          +0.0%    +0.0%     0.0%      -1.5%    -0.2%    -1.6%
minimax          -18.8%   -18.8%   -18.8%     -17.5%   -17.5%   -17.5%
scs               -1.2%    -2.9%    -2.9%      -0.4%    -2.1%    -1.7%
simple           -16.3%   -16.3%   -16.3%     -16.6%   -16.6%   -16.6%
x2n1             -87.7%   -87.7%   -87.7%     -73.5%   -73.5%   -73.5%
. . . and 80 more
Min              -87.7%   -87.7%   -87.7%     -73.5%   -73.5%   -73.5%
Max               +0.0%    +0.0%     0.0%      +1.2%    +1.2%    +1.1%
Geometric Mean    -3.7%    -3.8%    -3.8%      -3.3%    -3.6%    -3.6%


3.6.4 unfoldr/destroy and stream fusion

There are various contenders to foldr/build-based list fusion, such as unfoldr/destroy [Sve02] and stream fusion [CLS07]. They have no problem fusing foldl, but have their own shortcomings, such as difficulties fusing unzip, filter and/or concatMap; a thorough comparison is contained in [Cou10]. After two decades, this is still an area of active research [FHG14].

These systems are in practical use in array libraries like bytestring and vector. For the typical uses of lists they were inferior to foldr/build-based fusion, and hence the latter is used for the standard Haskell list type. Given the recent advances on both fronts, a reevaluation of this choice is due.

3.6.5 Worker-wrapper list fusion

On the GHC mailing list, Takano suggested an extension to foldr/build-based list fusion that will generate good code for left folds directly [Tak14]. The idea is that the consumer not only specifies what the generator should use instead of the list constructors (:) and [], but also a pair of worker-wrapper functions. Slightly simplified, he extends foldr to foldrW. This function takes two additional arguments, here called wrap and unwrap, which can be used by the consumer to specify the actual type of the recursion.

foldrW :: (forall e. f e → (e → b → b))
       → (forall e. (e → b → b) → f e)
       → (a → b → b) → b → [a] → b
foldrW wrap unwrap f z0 list0 = wrap go list0 z0
  where go = unwrap $ λlist z' → case list of
                                   []   → z'
                                   x:xs → f x (wrap go xs z')

Conversely, he extends build to buildW: Besides passing the list constructors, this also passes a wrapper that does not actually change the type:

newtype Simple b e = Simple { runSimple :: e → b → b }

buildW :: (forall b f. (forall e. f e → (e → b → b))
                     → (forall e. (e → b → b) → f e)
                     → (a → b → b) → b → b)
       → [a]
buildW g = g runSimple Simple (:) []

This way, he can specify a fusion rule similar to the foldr/build rule:

{-# RULES "foldrW/buildW" forall wrap unwrap f z g.
      foldrW wrap unwrap f z (buildW g) = g wrap unwrap f z #-}

Now every list consuming function that wants to benefit from this system needs to specify a custom pair of wrapper and unwrapper functions. For example, his definition of foldl in terms of the extended foldrW becomes

foldl :: forall a b. (b → a → b) → b → [a] → b
foldl f z = λxs → foldrW wrap unwrap g id xs z
  where
    wrap :: forall e. Simple b e → (e → (b → b) → (b → b))
    wrap s e k a = k (s e a)
    unwrap :: forall e. (e → (b → b) → (b → b)) → Simple b e
    unwrap u = λe a → u e id a
    g x next acc = next (f acc x).

Conversely, list producing functions should be defined in terms of buildW, making sure that the wrappers are used to shape the recursion:

[from..to] = buildW (eftFB from to)

eftFB from to wrap unwrap c n = wrap go from n
  where go = unwrap $ λi rest → if i <= to then c i (wrap go (i + 1) rest)
                                           else rest.


This proposal initially looked promising: It handles the case of fusing foldl well, and appears to be more powerful in tricky cases like fusing foldl with a list produced by toList (see Section 3.5.2). Nevertheless, a thorough evaluation13 by David Feuer and Dan Doel revealed that the system is rather unsafe: The above fusion rule is not universally correct, and with certain combinations of producers and consumers, this can yield wrong results. A way forward would be to identify sufficient conditions about the arguments to foldrW resp. buildW that guarantee that fusion is safe, but so far, such conditions have not been found. Given these problems, the GHC developers decided to not pursue this approach any further for now.

3.6.6 Control flow based analyses

The Call Arity analysis uses domain theory to describe its result, and iteratively finds a fixed point in the case of recursive bindings. This suggests connections to the field of data flow analysis, where analysis results are commonly calculated on a control-flow graph representation of the program. It is not obvious how to represent a Core program as such a graph, and although there are approaches to control-flow analysis of functional programs (see [Mid12] for a recent survey), they are not used in the Haskell compiler. GHC does employ data flow analysis based transformations, but at a much later phase and lower level, namely in the code generator [RDP10]. We do not want Call Arity to happen that late in the pipeline, as some Core-to-Core transformations benefit from the effect of Call Arity, e.g. by unboxing the accumulator of sum specialised to Int.

3.7 Future work

As usual, there is always room for improvement, both in the analysis itself and in how it is used.

13 https://ghc.haskell.org/trac/ghc/ticket/9545


3.7.1 Improvements to the analysis

Call Arity does not fully exploit the behaviour of thunks in mutual recursion. Consider this example:

let go x = if x > 10 then x else go (t1 x)
    t1   = if f (t2 a) then (λy → go (y+1)) else (λy → go (y+2))
    t2   = if f b      then (λy → go (y+1)) else (λy → go (y+2))
in go 1 2

Currently, Call Arity will refrain from eta-expanding t1 and t2, as they are part of a recursive binding. But t2 is in fact called at most once! All calls to t2 are done by t1, and t1’s result will be shared. It remains to be seen if such situations occur in the wild and whether the benefits justify the implementation overhead.

3.7.2 Tighter integration into GHC

As explained in Section 3.6.2, Call Arity cannot be directly merged into GHC’s existing demand analyser [SVP14], as they need to process let-bindings in a different order. There is, however, a potential for better cooperation of Call Arity with the existing analyses and transformations in both directions:

Call Arity could make some use of the strictness and demand annotations that happen to be already present in the code, e.g. on imported identifiers. If, for example, the function f in the expression f (λx. g x) happens to be annotated with the information that it calls its first argument at most once with one argument, then we could improve the analysis result and report that g is called at most once. I am, however, reluctant to add this functionality: It would imply that some programs might be optimised better by splitting them into more modules, which is a harsh violation of the principle of least surprise.

Similarly, the other passes could use the information that was found out by the Call Arity pass: Thunks that are determined by Call Arity to be called at most once can be marked as one-shot, even if no eta-expansion is possible, which would allow the code generator to omit the code that implements the updating.


If we were willing to pay the price of including function parameters in the set of interesting variables (Section 3.4.1), then although Call Arity cannot make use of the information thus found to eta-expand anything, it could create a preliminary demand signature for the function that might help the subsequent pass of the demand analyser to get more precise results, or at least to converge earlier.

Finally, the information that a let-bound function or thunk is called at most once from within a recursive function allows more aggressive inlining. For example, currently GHC does not transform

let a = f x0
    b = g x0
in let go 0 = a
       go 1 = b
       go i = go (h i)
   in go n

into

let go 0 = f x0
    go 1 = g x0
    go i = go (h i)
in go n

as in general, inlining into a recursive group can duplicate work [PM02]. In this case, it would be safe, as a and b are called at most once, and Call Arity is able to determine that. In principle, extending GHC to do this is not a problem; practically, the so-called float-out pass will simply undo this change, because – again in general – floating things out of a recursive group is a good idea, as it can increase sharing. In this case, no sharing can be gained, as the expression f x0 is evaluated only once, but this fact is not visible to GHC after a and b have been inlined.14

14 https://ghc.haskell.org/trac/ghc/ticket/10918

I started with P ∧ ¬P and derived your Mom’s phone number.
Randall Munroe, xkcd #704

CHAPTER 4 The safety of Call Arity

The previous chapter introduced a new analysis and transformation for an optimising compiler, and analyses it in the usual detail: A somewhat formal description of the transformation is accompanied by empirical evidence of its usefulness based on benchmark results.

That none of the benchmarks exhibited reduced performance, at least when measured by the number of allocations, suggests that the transformation is indeed a safe optimisation, i.e. does not make matters worse. Nevertheless, I found this unsatisfying: The benchmark suite only contains a small number of example programs – how can I be sure that my transformation really never makes matters worse?

To that end, I want to actually prove that my transformation is safe, and I want to do it in a machine-verified way, in order to attain the highest level of assurance that the proof is correct. Therefore, I set out to go all the way: I took the Call Arity analysis, formalised it in the interactive theorem prover Isabelle and created a machine-checked proof not only of functional correctness, but also that the performance of the transformed program is not worse than that of the original program. This chapter, parts of which I have presented at the Haskell Symposium 2015 [Bre15b], describes this endeavour.

Recall that it is only safe to eta-expand a thunk if the thunk is called at most once. So an arity analysis requires a cardinality analysis, which

determines how often a function or a thunk is called, in order to be able to eta-expand a thunk. If the cardinality analysis were wrong and we eta-expanded a thunk that is called multiple times, we would lose the benefits of sharing and suddenly repeat work. A correctness proof with regard to a standard denotational semantics would not rule that out! A more detailed semantics is required instead. I use the abstract machine introduced in Chapter 2, where the explicit heap allows me to prove that the number of heap allocations does not increase by transforming the program. This is a suitable criterion for safety in this context, as explained shortly.

4.1 Proof outline

In my introduction of the Call Arity analysis in Chapter 3, I explain and motivate the various details of the definition. These may be convincing arguments that Call Arity is indeed safe, but they are far from a general, rigorous proof. How can I go about producing such a proof?

First, I need to find a suitable semantics. The elegant standard denotational semantics for functional programs, such as the one in Section 2.1.2, are unfortunately too abstract and admit no observation of program performance. Therefore, I use a standard small-step operational semantics similar to Sestoft’s mark 1 abstract machine, introduced in Section 2.5.

With that semantics, I could have followed Sands [MS99] and measured performance by counting evaluation steps. But that turns out to be too fine-grained: The eta-expansion transformation causes additional beta-reductions to be performed during evaluation, and without subsequent simplification – which does happen in a real compiler, but which I do not want to include in the proof – these increase the number of steps in my semantics.

Therefore, I measure the performance by counting the number of allocations performed during the evaluation. This is sufficient to detect accidental duplication of work, as shown by this gedankenexperiment: Consider a program e1, which is transformed to e2, and a subexpression e of e1 that also occurs in e2. By replacing e with let x1 = x1, . . . , xn = xn in e,

130 4.1 Proof outline where the variables are fresh, we can force each evaluation of e to per- form at least n allocations, for an arbitrary large choice of n. So if e2 happens to evaluate e more often than e1, we can choose n large enough to make e2 allocate more than e1. Conversely, if our criterion holds, we can conclude that the transformation does not duplicate work. This measure is also realistic: When working on GHC, the number of bytes allocated by a benchmark or a test case is the prime measure that developers observe to detect improvements and regressions, as in practice, it correlates very well with execution time and memory usage, while being more stable across differing environments. A transformation is safe in this sense if the transformed program performs no more allocations than the original program. The arity transformation eta-expands expressions, so in order to prove it safe, I identify conditions when eta-expansion itself is safe, and ensure that these conditions are always met. A sufficient condition for the safety of an n-fold eta-expansion of an expression e is that whenever e is evaluated, the top n elements on the stack are arguments, and neither continuations of a case expression, as eta-expansion would introduce a type error, nor update markers for thunks, as eta-expansion would prevent the sharing. This is stated as Lemma 17. The safety proof for the arity analysis (Lemma 18) keeps track of some invariants during the evaluation which ensure that we can ap- ply Lemma 17 whenever an eta-expanded expression is about to be evaluated. I perform the proof first for a naive arity analysis without a cardi- nality analysis, i.e. one that needs to be conservative in this case. This corresponds to previous work on arity analysis, and furthermore, this is sufficient to prove that the transformation is semantics preserving (Theorem 4). I then formally introduce the concept of a cardinality analysis in Sec- tion 4.3. I do not simply prove safety of the co-call graph based analysis directly, but split it up into a series of increasingly concrete proofs, each building on the result of the previous, for two reasons:


• It is nice to separate various aspects of the proof (i.e. the interaction of the arity analysis with the cardinality analysis; the gap between the steps of the semantics and the structurally recursive nature of the analysis; different treatments of recursive and non-recursive bindings) into individual steps, but more importantly

• while the co-call graph data structure is sufficiently expressive to implement the analysis, it is an unsuitable abstraction for the safety proof. There, we need to describe the possibly complex recursion patterns of the heap as a whole with sufficient detail to identify, and make use of, the fact that some expressions call each other in a nice and linear fashion.

In the first refinement, the cardinality analysis is completely abstract: Its input is the whole configuration and its result is simply which variables on the heap are going to be called more than once. We give conditions (Definition 11) when an arity analysis using such a cardinality analysis is safe (Lemma 23). The next refinement assumes a cardinality analysis that now looks just at expressions, not whole configurations, and returns a much richer analysis result: A trace tree which is a (possibly) infinite tree where each path corresponds to one possible execution and the edges are labelled by the variables called during that evaluation. Given such a trace tree analysis, an abstract analysis as described in the first refinement can be implemented: The trees describing the expressions in a configuration (on the heap, as the control or in the stack) can be combined to a tree describing the behaviour of the whole configuration. This calculation, named s in Section 4.3.2, is quite natural for trace trees, but would be hard to define on co-call graphs only. From that tree I can determine the cardinalities of the individual variables. I specify conditions on the trace tree analysis (Definition 14) and in Lemma 26 show them to be sufficient to fulfil the specification of the first refinement. The third and final refinement assumes an analysis that returns a co- call graph for each expression. Co-call graphs can be seen as compact approximations of trace trees, with edges between variables that can

occur on the same path in the tree. The specification in Definition 15 is shown in Lemma 27 to be sufficient to fulfil the specification of the second refinement. Eventually, I give the definition of the real Call Arity analysis in Section 4.3.4, and as it fulfils the specification of the final refinement, the desired safety theorem (Theorem 5) follows.

4.2 Arity analyses

In general, an arity analysis is a function that, given a binding (Γ, e), consisting of variable names bound to right-hand-sides in Γ and the body e, determines the arity of each of the bound expressions. It depends on the number α of arguments passed to e and may return ⊥ for a name that is not called at all:

Aα(Γ, e): Var → N⊥. Given such an analysis, we can run it over a program and transform it accordingly. The transformation function traverses the syntax tree, keeping track of the number of arguments passed along the way:

Tα(x) = x

Tα(e x) = Tα+1(e) x

Tα(ńx. e) = (ńx. Tα−1(e))

Tα(Cb) = Cb for b ∈ {t, f}

Tα(e ? et : ef ) = T0(e) ? Tα(et) : Tα(ef )

Tα(let Γ in e) = let T_{Aα(Γ,e)}(Γ) in Tα(e)

The actual transformation happens at a binding, where it eta-expands bound expressions according to the result of the arity analysis, using the n-fold eta-expansion operator introduced in Section 1.5. If the analysis determines that a binding is never called, it simply leaves it alone:

Tα¯(Γ) = [x 7→ e if α¯(x) = ⊥; x 7→ Eα(Tα(e)) if α¯(x) = α | (x 7→ e) ∈ Γ]
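To make the shape of this transformation concrete, the following is a minimal Haskell sketch on a simplified, untyped expression type. The type Expr, the fresh names zi and the stub analyseBind are assumptions of this sketch and not the definitions used in the formalisation.

import qualified Data.Map as M

type Name  = String
type Arity = Int

data Expr
  = Var Name
  | App Expr Name            -- application arguments are always variables
  | Lam Name Expr
  | Con Bool                 -- the constructors C_t and C_f
  | If Expr Expr Expr
  | Let [(Name, Expr)] Expr
  deriving Show

-- n-fold eta-expansion  E_n(e) = \z1 ... zn -> e z1 ... zn
etaExpand :: Arity -> Expr -> Expr
etaExpand n e = foldr Lam (foldl App e zs) zs
  where zs = ["z" ++ show i | i <- [1 .. n]]   -- assumed to be fresh

-- Stand-in for the arity analysis A_alpha(Gamma, e) of a binding; the
-- trivial result (empty map, i.e. bottom everywhere) eta-expands nothing.
analyseBind :: Arity -> [(Name, Expr)] -> Expr -> M.Map Name Arity
analyseBind _ _ _ = M.empty

-- The transformation T_alpha, threading the number of arguments passed.
transform :: Arity -> Expr -> Expr
transform _ (Var x)    = Var x
transform a (App e x)  = App (transform (a + 1) e) x
transform a (Lam x e)  = Lam x (transform (max 0 (a - 1)) e)  -- arities are naturals
transform _ (Con b)    = Con b
transform a (If c t f) = If (transform 0 c) (transform a t) (transform a f)
transform a (Let bs e) = Let (transformBinds (analyseBind a bs e) bs) (transform a e)

-- Eta-expand each bound expression to its reported arity; a binding that is
-- never called (absent from the map, i.e. bottom) is left alone.
transformBinds :: M.Map Name Arity -> [(Name, Expr)] -> [(Name, Expr)]
transformBinds arities = map go
  where
    go (x, rhs) = case M.lookup x arities of
      Nothing -> (x, rhs)
      Just n  -> (x, etaExpand n (transform n rhs))

With the trivial analyseBind the sketch eta-expands nothing; a real analysis, such as the one sketched in Section 4.2.1, can be plugged in in its place.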


As motivated earlier, I consider an arity analysis A to be safe if the trans- formed program does not perform more allocations than the original program. A – technical – benefit of this measure is that the number of allocations always equals the size of the heap plus the number of update markers on the stack, as no garbage collector is modelled in the semantics: Definition 5 (Safe transformation) A program transformation T is safe if for every execution

([], e, []) ⇒∗ (Γ, v, []) with isVal v, there is an execution

([], T(e), []) ⇒∗ (Γ′, v′, []) with isVal(v′) and |dom Γ′| ≤ |dom Γ|. An arity analysis A is safe if the transformation T is safe.

This formulation of safety works nicely as the semantics is deterministic up to the choice of variable names. For a genuinely non-deterministic semantics, a definition of safety would have to distinguish different executions and ensure that for each of them, a corresponding execution of the transformed expression exists and does not allocate more. Note that this definition does not entail functional, i.e. semantic, cor- rectness, which is discussed and proved in Section 4.2.2.

Specification

I begin by stating sufficient conditions for an arity analysis to be safe. In order to phrase the conditions, I also need to know the arities an expression e calls its free variables with, assuming it is itself called with α arguments:

Aα(e): Var → N⊥

For notational simplicity, I define A⊥(e) := ⊥. The specification consists of a few naming hygiene conditions and an inequality for most syntactical constructs:


Definition 6 (Arity analysis specification)

dom Aα(e) ⊆ fv e (A-dom)

dom Aα(Γ, e) ⊆ dom Γ (Ah-dom) z ∈/ {x, y} =⇒ Aα(e[x := y]) z = Aα(e) z (A-subst) x, y ∈/ dom Γ =⇒

Aα(Γ[x := y], e[x := y]) = Aα(Γ, e) (Ah-subst)

[x 7→ α] v Aα(x) (A-Var)

Aα+1(e) t [x 7→ 0] v Aα(e x) (A-App)

Aα−1(e) \{x} v Aα(ńx. e) (A-Lam)

A0(e) t Aα(et) t Aα(ef ) v Aα(e ? et : ef ) (A-If) A ( ) t A ( ) v A ( ) t A (let in ) Aα(Γ,e) Γ α e α Γ, e α Γ e (A-Let) where G  Aα(Γ) := A(α x)(e) (x 7→ e) ∈ Γ .  These conditions come quite naturally: An expression should not report calls to names that it does not know about. Replacing one variable by another should not affect the arity of other variables. A variable, evaluated with a certain arity, should report (at most) that arity. In the rules for application and lambda abstraction we keep track of the number of arguments. As this models a forward analysis which looks at bodies before right-hand-sides, we get no useful information on how the argument x in an application e x is called by e. In rule (A-If), the scrutinee is evaluated without arguments, hence it is analysed with arity 0. The rule (A-Let) is a concise way to capture a few requirements. Note that, by (A-dom) and (Ah-dom), the domains of Aα(Γ, e) and Aα(let Γ in e) are disjoint, i.e. Aα(Γ, e) contains the information on how the names of the current binding are called, while Aα(let Γ in e) informs us about the free variables. The left-hand side contains all possible calls, both from the body of the binding and from each bound expression. These are analysed with the arity reported by Aα(Γ, e). The occurrence of Aα(Γ, e) on both sides of the inequality anticipates the fixed-point iteration in the implementation of the analysis.


Definition 6 suffices to prove functional correctness (Section 4.2.2 con- tains a proof of that) but not safety, as the issue of thunks is not touched upon yet. The simplest way to handle thunks – and the only way without the aid of a cardinality analysis – is to simply give up when encoun- tering a thunk: Definition 7 (No-cardinality analysis specification)

x ∈ thunks Γ =⇒ Aα(Γ, e) x = 0 (Ah-thunk) 

Safety

The safety of an eta-expanding transformation rests on the simple ob- servation that, given enough arguments on the stack, an eta-expanded expression evaluates straight to the original expression: Lemma 17 (Safety of eta-expansion)

∗ (Γ, Eα(e), $x1 ··· $xα·S) ⇒ (Γ, e, $x1 ··· $xα·S) 

Proof (Γ, Eα(e), $x1 ··· $xα·S)

= (Γ, (ńz1 ... zα. e z1 ... zα), $x1 ··· $xα·S) ∗ ⇒ (Γ, e x1 ... xα, S) { by APP2 } ∗ ⇒ (Γ, e, $x1 ··· $xα·S) { by APP1 }  So the safety proof for the whole transformation now just has to make sure that whenever we evaluate an eta-expanded value, there are enough arguments on top of the stack. Let args S denote the number of arguments on top of the stack. While tracking the evaluation of the original program in the proof, we need to construct the corresponding configurations in the evaluation of the transformed program. Therefore, we need to keep track of the arity argument to each of the expressions that occurs in a configuration: those on the heap, the control and those in alternatives on the stack. Together,

these arguments form an arity annotation written (α¯, α, α˙). Given such an annotation, we can transform a configuration:

T_(α¯,α,α˙)((Γ, e, S)) = (Tα¯(Γ), Tα(e), T˙α˙(S))

where the stack is transformed by

T˙ α·α˙ ((et : ef )·S) = (Tα(et) : Tα(ef ))·T˙ α˙ (S)

T˙ α˙ ($x·S) = $x·T˙ α˙ (S)

T˙ α˙ (#x·S) = #x·T˙ α˙ (S)

T˙α˙([]) = [].

While carrying the arity annotation through the evaluation of our programs, we need to ensure that it stays consistent with the current configuration.

Definition 8 (Arity annotation consistency)
An arity annotation is consistent with a configuration, written (α¯, α, α˙) . (Γ, e, S), if
• dom α¯ ⊆ dom Γ ∪ #S,
• args S v α,
• (Aα¯(Γ) t Aα(e) t A˙α˙(S))|_{dom Γ∪#S} v α¯, where

A˙ []([]) := ⊥

A˙ α·α˙ ((et : ef )·S) := Aα(et) t Aα(ef ) t A˙ α˙ (S)

A˙ α˙ ($x·S) := [x 7→ 0] t A˙ α˙ (S)

A˙α˙(#x·S) := [x 7→ 0] t A˙α˙(S), and

• α˙ . S, defined as

[] . []
α·α˙ . (et : ef)·S ⇐⇒ α˙ . S ∧ args S v α
α˙ . $x·S ⇐⇒ α˙ . S
α˙ . #x·S ⇐⇒ α˙ . S.
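To make the bookkeeping concrete, here is a small Haskell sketch of the stack, of args, and of the stack transformation T˙, in the setting of the sketch above. The StackElem type and the handling of a too-short arity list are assumptions of this sketch; the consistency invariant guarantees one arity per alternatives frame.

data StackElem = Arg Name           -- $x
               | Upd Name           -- #x, an update marker
               | Alts Expr Expr     -- (e_t : e_f), the branches of a scrutinised if
type Stack = [StackElem]

-- number of arguments on top of the stack
args :: Stack -> Int
args (Arg _ : s) = 1 + args s
args _           = 0

-- T-dot: transform the alternatives on the stack, consuming one arity per frame
transformStack :: [Arity] -> Stack -> Stack
transformStack as     (Arg x : s)      = Arg x : transformStack as s
transformStack as     (Upd x : s)      = Upd x : transformStack as s
transformStack (a:as) (Alts et ef : s) = Alts (transform a et) (transform a ef)
                                         : transformStack as s
transformStack _      _                = []   -- assumed not to happen for consistent annotations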

As this definition does not consider the issue of thunks, I extend it by one additional requirement:


Definition 9 (No-cardinality arity annotation consistency) An arity annotation is no-cardinality consistent with a configuration, written (α¯ , α, α˙ ) .N (Γ, e, S), iff (α¯, α, α˙ ) . (Γ, e, S) and α¯ x = 0 for all x ∈ thunks Γ. 

I do not include this requirement in the definition of . as I will extend it differently when I add a cardinality analysis in Definition 13. Clearly (⊥, 0, []) is a consistent annotation for an initial configura- tion ([], e, []). The rules take consistently annotated configurations to consistently annotated configurations during the evaluation – with one exception which causes a minor technical overhead: Upon evaluation of a variable x, its binding x 7→ e is always taken off the heap first, even when it is already evaluated, i.e. isVal e:

(Γ[x 7→ e], x, S) ⇒ (Γ, e,#x·S) ⇒ (Γ[x 7→ e], e, S)

I would not be able to prove consistency in the intermediate state. To work around this issue, assume that rule VAR1 has an additional constraint ¬isVal e and that the rule

(x 7→ e) ∈ Γ, isVal e =⇒ (Γ, x, S) ⇒ (Γ, e, S) (VAR1′)

is added. This modification makes the semantics skip over one step, which is fine (and closer to what happens in reality).

Lemma 18
Assume A fulfils Definitions 6 and 7. If we have (Γ, e, S) ⇒∗ (Γ′, e′, S′) and (α¯, α, α˙) .N (Γ, e, S), then there exists an arity annotation (α¯′, α′, α˙′) with (α¯′, α′, α˙′) .N (Γ′, e′, S′), and T_(α¯,α,α˙)((Γ, e, S)) ⇒∗ T_(α¯′,α′,α˙′)((Γ′, e′, S′)).

Proof ∗ by the individual steps of ⇒ . For APP1 we have

Aα+1(e) t A˙ α˙ ($x·S) = Aα+1(e) t [x 7→ 0] t A˙ α˙ (S)

v Aα(e x) t A˙ α˙ (S)


using (A-App) and the definition of A˙. So with (α¯ , α, α˙ ) .N (Γ, e x, S) we have (α¯ , α + 1, α˙ ) .N (Γ, e, $x·S). Furthermore ˙ T(α¯ ,α,α˙ )((Γ, e x, S)) = (Tα¯ (Γ), (Tα+1(e)) x, Tα˙ (S))

⇒ (Tα¯ (Γ), Tα+1(e), $x·T˙ α˙ (S))

= T(α¯ ,α+1,α˙ )((Γ, e, $x·S)) by rule APP1. The other cases follow this pattern, where the inequalities in Defini- tion 6 ensure the preservation of consistency. In case VAR1 the variable x is bound to a thunk. From consistency we obtain α¯ x = 0, so we can use E0(T0(e)) = T0(e). Similarly, α = α¯ x = 0 holds in case VAR2. 0 The actual eta-expansion is handled in case VAR1: We have

args(T˙α˙(S)) = args S v α v Aα(x) x v α¯ x

from consistency and (A-Var), and hence

T_(α¯,α,α˙)((Γ, x, S)) ⇒ (Tα¯(Γ), E_{α¯ x}(T_{α¯ x}(e)), T˙α˙(S)) { VAR1′ }
⇒∗ (Tα¯(Γ), T_{α¯ x}(e), T˙α˙(S)) { by Lemma 17 }
= T_(α¯,α¯ x,α˙)((Γ, e, S)).

Case: LET1 The new variables in ∆ are fresh with regard to Γ and S, hence also with regard to α¯ according to the naming hygiene conditions in (α¯ , α, α˙ ) .N (Γ, let ∆ in e, S). So in order to have (Aα(∆, e) t α¯, α, α˙ ) . (∆ · Γ, e, S), it suffices to show

(A_{Aα(∆,e)}(∆) t Aα(e))|_{dom ∆∪dom Γ∪#S} v Aα(∆, e) t α¯,

which follows from (A-Let) and Aα(let ∆ in e)|_{dom Γ∪#S} v α¯. The requirement Aα(∆, e) x = 0 for x ∈ thunks ∆ holds by (Ah-thunk).

The main take-away of this lemma is the following corollary, which states that the transformed program performs the same number of allocations as the original program.


Corollary 19
The arity analysis is safe (in the sense of Definition 5): If ([], e, []) ⇒∗ (Γ, v, []), then there exists Γ′ and v′ such that ([], T0(e), []) ⇒∗ (Γ′, v′, []) where Γ and Γ′ contain the same number of bindings.

Proof
We have (⊥, 0, []) .N ([], e, []). Lemma 18 gives us α¯, α and α˙ so that T_(⊥,0,[])(([], e, [])) ⇒∗ T_(α¯,α,α˙)((Γ, v, [])) and Tα¯(Γ) binds the same names as Γ.

4.2.1 A concrete arity analysis

So far, we have a specification for an arity analysis and a proof that every analysis that fulfils the specification is safe. One possible implementation is the trivial arity analysis which does not do anything useful and simply returns the most pessimistic result: Aα(e) := [x 7→ 0 | x ∈ fv e] and Aα(Γ, e) := [x 7→ 0 | x ∈ dom Γ]. A more realistic arity analysis is defined by

Aα(x) := [x 7→ α]

Aα(e x) := Aα+1(e) t [x 7→ 0]

Aα(ńx. e) := Aα−1(e) \{x}

Aα(e ? et : ef ) := A0(e) t Aα(et) t Aα(ef )

Aα(Cb) := ⊥ for b ∈ {t, f}

Aα(let Γ in e) := (µα¯. Aα¯(Γ) t Aα(e) t [x 7→ 0 | x ∈ thunks Γ]) \ dom Γ

and

Aα(Γ, e) := (µα¯. Aα¯(Γ) t Aα(e) t [x 7→ 0 | x ∈ thunks Γ])|_{dom Γ}

where (µα¯. ...) denotes the least fixed point, which exists as the involved operations are continuous and monotone in α¯. Moreover, the fixed point can be found in a finite number of steps by iterating from ⊥, as the carrier of α¯ is bounded by the finite set fv Γ ∪ fv e, and the pointwise partial order on arities has no infinite ascending chains. As this ignores the issue of thunks, it corresponds to the analysis described in [Gil96]. This implementation fulfils the specifications in Definition 6 and Definition 7, so by Corollary 19, it is safe.
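For illustration, this analysis has a direct executable rendering in the Haskell setting of the earlier sketch. The names aexp, bindFix, lub and isVal are assumptions of this sketch; absence from the map plays the role of ⊥, and the join of two arities for the same variable is their minimum, since eta-expansion is only safe up to the smallest arity a variable is called with.

lub :: M.Map Name Arity -> M.Map Name Arity -> M.Map Name Arity
lub = M.unionWith min

-- A_alpha(e): the arities with which e calls its free variables
aexp :: Arity -> Expr -> M.Map Name Arity
aexp a (Var x)    = M.singleton x a
aexp a (App e x)  = aexp (a + 1) e `lub` M.singleton x 0
aexp a (Lam x e)  = M.delete x (aexp (max 0 (a - 1)) e)   -- arities are naturals
aexp _ (Con _)    = M.empty
aexp a (If c t f) = aexp 0 c `lub` aexp a t `lub` aexp a f
aexp a (Let bs e) = foldr M.delete (bindFix a bs e) (map fst bs)

isVal :: Expr -> Bool
isVal (Lam _ _) = True
isVal (Con _)   = True
isVal _         = False

-- Least fixed point over the binding group: start from the body's demands
-- (with thunks pinned to arity 0) and re-analyse each right-hand side at its
-- current arity until nothing changes. Termination: the domain can only grow
-- (bounded by the free variables) and arities can only shrink.
bindFix :: Arity -> [(Name, Expr)] -> Expr -> M.Map Name Arity
bindFix a bs e = go (aexp a e `lub` thunkFloor)
  where
    thunkFloor = M.fromList [(x, 0) | (x, rhs) <- bs, not (isVal rhs)]
    go cur | new == cur = cur
           | otherwise  = go new
      where new = foldr lub cur
                    [aexp n rhs | (x, rhs) <- bs, Just n <- [M.lookup x cur]]

Restricting bindFix to the bound names gives the Aα(Γ, e) of this section, so it could stand in for the analyseBind stub of the transformation sketch.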


4.2.2 Functional correctness

This section on the functional correctness of the transformation is a slight detour in this chapter, which is mainly about the safety of the transformation. I include it here not only because functional correctness, i.e. the preservation of semantics, is an important property, but also to demonstrate that it holds independent of the correctness of a cardinality analysis. For this section, we expect the analysis to fulfil the specification in Definition 6, but do not require any specific behaviour with regard to thunks, i.e. Definition 7 does not need to hold. Theorem 4 (Functional correctness of Call Arity) For all expressions e, we have

⟦T0(e)⟧ = ⟦e⟧.

As usual, in order to prove this, I need to generalise the statement. In this case, the statement needs to hold for arbitrary incoming arities, instead of just 0, and furthermore for arbitrary environments instead of just ⊥. But in the general case, I would not be able to prove plain equality: Consider the expression e = let x = x in x: When analysed with an incoming arity of 1, we have

T1(e) = let x = ńy. x y in x but e ⊥ = ⊥ 6= Fn(λ_. ⊥) = T1(e) ⊥. J K J K Therefore, I need to generalise the notion of equality as well, to a weaker notion that only demands equality when applied to enough arguments: Definition 10 (Equality up to eta-expansion) For every α ∈ N and v1, v2 ∈ Value let e1 ≈α e2 denote that

v1 ↓Fn z1 ↓Fn ... ↓Fn zα = v2 ↓Fn z1 ↓Fn ... ↓Fn zα. for all z1,..., zα ∈ Value Furthermore, for every arity environment α¯ ∈ Var → N⊥ and envi- ronments ρ1, ρ2 ∈ Var → Value, let

ρ1 ≈α¯ ρ2 := ∀x ∈ dom α¯. (ρ1 x) ≈α¯ x (ρ2 x). 


Note that ≈0 coincides with plain equality. If we have v1 ≈α+1 v2, then also v1 ↓Fn z ≈α v2 ↓Fn z for all z ∈ Value. Conversely, v1[v] ≈α−1 v2[v] for all v ∈ Value implies Fn(λv. v1[v]) ≈α Fn(λv. v2[v]). 0 The relation is monotone in the sense that for α v α , ≈α0 implies ≈α, and analogously for the relation on environments. The main motivation for this definition is that it does not see eta- expansion: Lemma 20 Eα(e) ρ ≈α e ρ J K J K Proof by induction of α.  I proceed by proving soundness of the analysis and then the correctness of the transformation, in its general form. Lemma 21 If ρ1 ≈A (e) ρ2, then e ρ ≈α e ρ . α J K 1 J K 2

Proof by induction over the expression e. Case: e = x By (A-Var), we have α v Aα(e), so the assumption of the lemma im- ( ) ≈ ( ) ( ) ≈ ( ) plies ρ1 x Aα(e) x ρ2 x , which in turn provides ρ1 x α ρ2 x as required. Case: e = e0 x ρ ≈ 0 ρ By (A-App), the assumption of the lemma implies both 1 Aα+1(e ) 2 0 as well as (ρ1 x) ≈0 (ρ2, x). By induction, the former yields e ρ1 ≈α+1 0 J K e ρ . Therefore J K 2 0 0 e x ρ1 = e ρ1 ↓Fn ρ1 x J K J 0K = e ρ1 ↓Fn ρ2 x J K0 ≈α e ρ2 ↓Fn ρ2 x J0 K = e x ρ . J K 2


Case: e = ńx. e0 ρ ≈ 0 ρ By (A-Lam) and the assumption we have 1 Aα−1(e )\{x} 2, which, for v ∈ Value (ρ t [x 7→ v]) ≈ 0 (ρ t [x 7→ v]) any , yields 1 Aα−1(e ) 2 . By the e0 ≈ e0 induction hypothesis, this implies ρ2t[x7→v] α−1 ρ2t[x7→v], and thus J K J K

ńx e0 = Fn(λv e0 ) . ρ1 . ρ1t[x7→v] J K J K0 0 ≈α Fn(λv. e ρ t[x7→v]) = ńx. e ρ . J K 2 J K 2 Case: e = let Γ in e0 This follows immediately from the denotation of let-expressions and ({{ }} ) ≈ ({{ }} ) the inductive hypothesis, once we have Γ ρ1 Aα(e) Γ ρ2 . By (A-Let), this can be generalised to ({{Γ}}ρ1) ≈α¯ ({{Γ}}ρ2) where α¯ = Aα(Γ, e) t Aα(let Γ in e). We prove this by parallel fixed-point induction. The base case is trivial, so we assume we have 0 0 ρ1 ≈α¯ ρ2 0 0 for some ρ1, ρ2, and we need to show

(ρ1 ++dom Γ Γ ρ0 ) ≈α¯ (ρ2 ++dom Γ Γ ρ0 ) J K 1 J K 2 which we do point-wise. Let α0 = α¯ x. 00 00 00 For x 7→ e ∈ Γ, we need to show ( e 0 ) ≈ 0 ( e 0 ). By the ρ1 α ρ1 0 J K 0 J K induction hypothesis, this requires ρ ≈ 00 ρ , which in turn follows 1 Aα0 (e ) 2 0 0 from ρ1 ≈α¯ ρ2 and (A-Let). For x ∈/ dom Γ, we need to show (ρ1 x) ≈α0 (ρ2 x). This follows from α¯ x = Aα(let Γ in e) x and the assumption of the lemma, namely ≈ ρ1 Aα(let Γ in e) ρ2. Case: e = Cb Trivial. 0 Case: e = e ? et : ef By the assumption, (A-If) and the monotonicity of ≈α¯ , we can invoke all 0 0 three induction hypotheses and obtain e ρ1 ≈0 e ρ2 , et ρ1 ≈α et ρ2 0 J K J0 K J K J K and ef ρ1 ≈α ef ρ2 . From this, e ? et : ef ρ1 ≈α e ? et : ef ρ2 follows by J K J K 0 J K J K a case analysis on e ρ . J K 1 


With this in place, we can prove that the transformation is semantics- preserving: Lemma 22 Tα(e) ρ ≈α e ρ. J K J K Proof Again, by induction on e, for arbitrary α and ρ. Case: e = x Trivial. Case: e = e0 x 0 0 By the induction hypothesis, we have Tα+1(e ) ρ ≈α+1 e ρ, so J K J K 0 0 Tα(e x) ρ = Tα+1(e ) x ρ J K J 0 K = Tα+1(e ) ρ ↓Fn ρ x J 0 K ≈α e ρ ↓Fn ρ x J0 K = e x ρ. J K Case: e = ńx. e0 0 0 0 0 0 By the induction hypothesis, we have Tα−1(e ) ρ ≈α−1 e ρ for any ρ , so J K J K

0 0 Tα(ńx. e ) ρ = ńx. Tα−1(e ) ρ J K J K 0 = Fn(λv. Tα+1(e ) ρt[x7→v]) J 0 K ≈α Fn(λv. e ρt[x7→v]) 0 J K = ńx. e ρ. J K Case: e = let Γ in e0 We first need to prove

{{T ( )}} ≈ {{ }} ∗ Aα(Γ,e) Γ ρ Aα(e) Γ ρ.( ) which, using (A-let), follows from

{{T ( )}} ≈ {{ }} Aα(Γ,e) Γ ρ α¯ Γ ρ with α¯ = Aα(Γ, e) t Aα(let Γ in e).


Similar to above, we can prove this using parallel fixedpoint induction. Again, the base case is trivial, so let ρ1, ρ2 be environments for which

ρ1 ≈α¯ ρ2 holds. We need to show

TA (Γ,e)(Γ) ρ ≈α¯ Γ ρ . J α K 1 J K 2 00 0 which we verify point-wise. Let x 7→ e ∈ Γ and α = Aα(Γ, e) x = α¯ x. We have 00 T ( ) = E 0 (T 0 ( )) Aα(Γ,e) Γ ρ1 x α α e ρ1 J K J 00 K 0 0 ≈α Tα (e ) ρ1 { Lemma 20 } J 00 K 0 ≈α e ρ1 { ind. hypothesis } J 00K ≈α0 e ρ { Lemma 21 } J K 2 = Γ ρ x J K 2 where in order to invoke Lemma 21, we need ρ ≈ 00 ρ , which 1 Aα0 (e ) 2 follows from ρ1 ≈α¯ ρ2 and (A-Let). Now we can calculate 0 0 Tα(let Γ in e ) ρ = let TA (Γ,e0)(Γ) in Tα(e ) ρ J K J α K = T (e0) α {{T 0 (Γ)}}ρ J K Aα(Γ,e ) ≈ e0 α {{T 0 (Γ)}}ρ { ind. hypothesis } J K Aα(Γ,e ) 0 ≈α e {{Γ}}ρ { Lemma 21 and (∗)} J K 0 = let Γ in e ρ. J K Case: e = Cb Trivial. 0 Case: e = e ? et : ef By induction we have T0(e) ρ ≈0 e ρ, so T0(e) ρ = e ρ, and by case J K J K J K J K analysis on this value, this follows from the induction hypotheses.  Proof (of Theorem 4) This follows from Lemma 22 with ρ = ⊥, as ≈0 coincides with regular equality. 


4.3 Cardinality analyses

The previous section proved the safety of a straight-forward arity analy- sis. But it was severely limited by not being able to eta-expand thunks, which is desirable in practice.

4.3.1 Abstract cardinality analysis

So the arity analysis needs an accompanying cardinality analysis which prognoses how often a bound variable is going to be evaluated: I model this as a function

Cα(Γ, e): Var → Card

where Card is the three-element lattice ⊥ ⊏ 1 ⊏ ∞, corresponding to “not called”, “called at most once” and “no information”, respectively. We use γ for an element of Card and γ¯ for a mapping Var → Card. The expression γ¯ − x, which subtracts one call from the prognosis, is defined as

(γ¯ − x) y = ⊥ if y = x and γ¯ y = 1
(γ¯ − x) y = γ¯ y otherwise.
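A small Haskell sketch of this lattice and of the subtraction operation, in the setting of the earlier sketches; Card, CardEnv and minusCall are names assumed for this sketch, and absence from the map again encodes ⊥ (“not called”).

data Card = One | Many deriving (Eq, Ord, Show)   -- One is below Many

type CardEnv = M.Map Name Card

-- gamma - x: discharge one call to x. If that was the only predicted call,
-- x drops out of the prognosis; otherwise nothing changes.
minusCall :: CardEnv -> Name -> CardEnv
minusCall env x = case M.lookup x env of
  Just One -> M.delete x env
  _        -> env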

Specification

I start with a very abstract specification for a safe cardinality analysis and prove that an arity transformation that makes use of it it is still safe. I stay oblivious in how the analysis works and defer that to the next refinement step in Section 4.3.2. For the specification we not only need the local view on one binding, as provided by Cα(Γ, e), but also a prognosis on how often each variable is going to be called in the further execution of a complete and arity- annotated configuration:

C(α¯ ,α,α˙ )((Γ, e, S)): Var → Card


Definition 11 (Cardinality analysis specification) The cardinality prognosis and cardinality analysis fulfil some obvious naming hygiene conditions:

dom Cα(∆, e) = dom Aα(∆, e) (Ch-dom)

dom C(α¯ ,α,α˙ )((Γ, e, S)) ⊆ fv (Γ, e, S) (C-dom) = 0 =⇒ α¯ dom Γ α¯ dom Γ

C(α¯ ,α,α˙ )((Γ, e, S)) = C(α¯ 0,α,α˙ )((Γ, e, S)) (C-cong)

α¯ x = ⊥ =⇒ C(α¯ ,α,α˙ )((Γ, e, S)) = C(α¯,α,α˙ )((Γ \{x}, e, S)) (C-not-called) Furthermore, the cardinality analysis is likewise a forward analysis and has to be conservative about function arguments:

$x ∈ S =⇒ [x 7→ ∞] v C(α¯,α,α˙ )((Γ, e, S)) (C-args) The prognosis may ignore update markers on the stack:

C(α¯ ,α,α˙ )((Γ, e,#x·S)) v C(α¯,α,α˙ )((Γ, e, S)) (C-upd) An imminent call better be prognosed:

[x 7→ 1] v C(α¯,α,α˙ )((Γ, x, S)) (C-call) 0 Evaluation improves the prognosis: Note that in (C-Var1) and (C-Var1), we account for the call to x with the − operator.

C(α¯ ,α+1,α˙ )((Γ, e, $x·S)) v C(α¯,α,α˙ )((Γ, e x, S)) (C-App)

C(α¯,α−1,α˙ )((Γ, e[y := x], S)) v C(α¯,α,α˙ )((Γ, ńy. e, $x·S)) (C-Lam) (x 7→ e) ∈ Γ, ¬isVal e =⇒

C(α¯ ,α¯ x,α˙ )((Γ \{x}, e,#x·S)) v C(α¯,α,α˙ )((Γ, x, S)) − x (C-Var1) (x 7→ e) ∈ Γ, isVal e =⇒ 0 C(α¯ ,α¯ x,α˙ )((Γ, e, S)) v C(α¯,α,α˙ )((Γ, x, S)) − x (C-Var1) isVal e =⇒

C(α¯ ,0,α˙ )((Γ[x 7→ e], e, S)) v C(α¯,0,α˙ )((Γ, e,#x·S)) (C-Var2)

C(α¯,0,α·α˙ )((Γ, e, (et : ef )·S)) v C(α¯,α,α˙ )((Γ, e ? et : ef , S)) (C-If1) b ∈ {t, f} =⇒C(α¯,α,α˙ )((Γ, eb, S)) v C(α¯,0,α·α˙ )((Γ, Cb, (et : ef )·S)) (C-If2)


The specification for the let-bindings connects the arity analysis, the cardinality analysis and the cardinality prognosis: dom ∆ ∩ fv (Γ, S) = {}, dom α¯ ⊆ dom Γ ∪ #S =⇒ C (( · )) v C ( ) t C (( let in )) (Aα(∆,e)tα¯ ,α,α˙ ) ∆ Γ, e, S α ∆, e (α¯,α,α˙ ) Γ, ∆ e, S (C-Let) Finally, we need to ensure that the analysis returns the top element of the lattice for thunks that might be called more than once. In contrast to the corresponding Definition 7, this can now make use of the cardinality analysis:

x ∈ thunks Γ, Cα(Γ, e) x = ∞ =⇒ Aα(Γ, e) x = 0 (Ah-∞-thunk)  Safety

The safety proof proceeds similarly to the one for Lemma 18. But now we are allowed to eta-expand thunks that are called at most once. This has considerable technical implications for the proof: • An eta-expanded expression is a value, so in the transformed program, VAR2 occurs immediately after VAR1. In the original program, however, an update marker stays on the stack until the expression is evaluated to a value, and then VAR2 fires without a correspondence in the evaluation of the transformed program. In particular, the update marker can interfere with uses of Lemma 17. • Because the eta-expanded expression is a value, it stays on the heap as it is, whereas in the original program, it is first evalu- ated. Evaluation can reduce the number of free variables of the expression, so subsequent choices of fresh variables in LET1 in the original evaluation might not be suitable in the evaluation of the transformed program. A more complicated variant of Lemma 17 and carrying a variable re- naming around throughout the proof might solve these problems, but would complicate it too much. I therefore apply a small trick and simply allow unwanted update markers to disappear, by defining a variant of the semantics:


Definition 12 (Forgetful semantics) The relation ⇒# is defined by

0 0 0 0 0 0 (Γ, e, S) ⇒ (Γ , e , S ) =⇒ (Γ, e, S) ⇒# (Γ , e , S ) and (Γ, e,#x·S) ⇒# (Γ, e, S). DROPUPD  This way, a one-shot binding can disappear completely after it has been called, making it easier to relate the original program to the transformed program. Because ⇒# contains ⇒, Lemma 17 holds here as well. Af- terwards, and outside the scope of the safety proof, I will recover the original semantics from the forgetful semantics. In the proof I keep track of the set of removed bindings (named r), and write (Γ, e, S) − r := (Γ \ r, e, S − r) for the configuration with bindings from the set r removed. The stack (S − r) is S without update markers #x where x ∈ r. I also keep track of γ¯ : Var → Card, the current cardinalities of the variables on the heap: Definition 13 (Cardinality arity annotation consistency)

I write (α¯ , α, α˙ , γ¯, r) .C (Γ, e, S), iff • the arity information is consistent, (α¯ , α, α˙ ) . (Γ, e, S) − r, • dom α¯ = dom γ¯, • the cardinality information is correct, C(α¯ ,α,α˙ )((Γ, e, S)) v γ¯, • many-called thunks are not going to be eta-expanded, i.e. α¯ x = 0 for x ∈ thunks Γ with γ¯ x = ∞ and • only bindings that are not going to be called (γ¯ x = ⊥) are removed, i.e. r ⊆ (dom Γ ∪ #S) − dom γ¯. 

Lemma 23 Assume A and C fulfil the specifications in Definitions 6 and 11. ∗ 0 0 0 If (Γ, e, S) ⇒ (Γ , e , S ) and (α¯ , α, α˙ , γ¯, r) .C (Γ, e, S) , then there ex- 0 0 0 0 0 0 0 0 0 0 ists an arity annotation (α¯ , α , α˙ , γ¯ , r ) such that (α¯ , α , α˙ , γ¯ , r ) .C 0 0 0 ∗ 0 0 0 0 (Γ , e , S ), and T(α¯ ,α,α˙ )((Γ, e, S) − r) ⇒# T(α¯ 0,α0,α˙ 0)((Γ , e , S ) − r ). 


The lemma is an analogue to Lemma 18. The main difference, besides the extra data to keep track of, is that we produce an evaluation in the forgetful semantics, with some bindings removed. Proof by the individual steps of ⇒∗. The preservation of the arity annotation consistency in the proof of Lemma 18 can be used here as well. Note that both the arity annotation requirement and the transformation are applied to (Γ, e, S) − r, so this goes well together. The correctness of the cardinality information (the second condition in Definition 13) follows easily from the inequalities in Definition 11. I elaborate only on the interesting cases:

Case: VAR1 We cannot have γ¯ x = ⊥ because of (C-call). If γ¯ x = ∞ we get α¯ x = 0, as before, and nothing surprising happens. If γ¯ x = 1, we know that this is the only call to x, so we set r0 = r ∪ {x}, 0 γ¯ = γ¯ − x and use DROPUPD to get rid of the mention of #x on the stack.

Case: VAR2 If x ∈/ r, proceed as before. If x ∈ r, then the transformed configurations ∗ are identical and the ⇒# judgement follows from reflexivity.  Corollary 24 The cardinality based arity analysis is safe for closed expressions, i.e. if fv e = {} and ([], e, []) ⇒∗ (Γ, v, []) then there exists Γ0 and v0 such that ∗ 0 0 0 ([], T0(e), []) ⇒ (Γ , v , []) where Γ and Γ contain the same number of bindings. 

Proof

We need fv e = {} to have C⊥,0,[](([], e, [])) = ⊥, so that (⊥, 0, [], ⊥, []) .C ([], e, []) holds. Now according to Lemma 23 there are α¯, α, α˙ and r so ∗ that T(⊥,0,[])(([], e, [])) ⇒# T(α¯ ,α,α˙ )((Γ, v, []) − r). As the forgetful semantics only drops unused bindings, but does not otherwise behave any different than the real semantics, a technical ∗ 0 lemma allows us to recover T(⊥,0,[])(([], e, [])) ⇒ T(α¯,α,α˙ )((Γ , v, [])) for a 0 0 0 0 0 Γ where Tα¯ (Γ) − r = Γ − r . As r ⊆ Γ and r ⊆ Γ , this concludes the 0 proof of the corollary: Γ, Tα¯ (Γ) and Γ all bind the same variables. 


4.3.2 Trace tree cardinality analysis

In the second refinement, I look – still quite abstractly – at the implemen- tation of the cardinality analysis. For the arity information, the type of the result required for the transformation (Var → N⊥) was sufficiently rich to be used in the analysis as well. This is unfortunately not the case for the cardinality analysis: Even if we know that an expression calls x and y each at most once, this does not tell us whether these calls can occur together (as in e x y) or whether they are exclusive (as in e ? x : y). So I need a richer type that captures the future calls of an expression, that can distinguish different code paths and that maps easily to the type Var → Card: This is the type TTree of (possibly infinite) trees, where each edge is labelled with a variable name, and a node has at most one outgoing edge for each variable name. The paths in the tree correspond to the possible executions and the labels on the edges record each occurring variable call. I use t for values of type TTree. There are other, equivalent ways to interpret this type: Each TTree corresponds to a non-empty set of (finite) lists of variable names that is prefixed-closed (i.e. for every list in the set, its prefixes are also in the set). Each such list corresponds to a (finite) path in the tree. The function

paths : TTree → 2[Var] implements this correspondence. Yet another view is given by the function

next : Var → TTree → TTree⊥,

where nextx t = t′ iff the root of t has an edge labelled x leading to t′, and nextx t = ⊥ if the root of t has no edge labelled x. In that sense, TTree represents automata with labelled transitions, and we can actually define a trace tree t by specifying nextx t for all x ∈ Var. The basic operations on trees are ⊕, given by paths(t ⊕ t′) = paths t ∪ paths(t′), and ⊗, where paths(t ⊗ t′) is the set of all interleavings of lists from paths t with lists from paths(t′). I write t∗ for t ⊗ t ⊗ t ⊗ · · ·. A tree is called repeatable if t = t ⊗ t = t∗.
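Viewed through paths, the operations have a short executable rendering. The following Haskell sketch works on finite path sets only (the names TTree, bothTree etc. are assumptions of the sketch, and t∗ has no finite counterpart here); it merely illustrates the operations.

import qualified Data.Set as S

type Path  = [Name]
type TTree = S.Set Path      -- invariant: non-empty and prefix-closed

emptyTree :: TTree
emptyTree = S.singleton []

singleCall :: Name -> TTree          -- the tree with exactly one edge, labelled x
singleCall x = S.fromList [[], [x]]

-- either: union of the possible behaviours
eitherTree :: TTree -> TTree -> TTree
eitherTree = S.union

-- both: all interleavings of a path from the first tree with one from the second
bothTree :: TTree -> TTree -> TTree
bothTree t u = S.fromList
  [ p | xs <- S.toList t, ys <- S.toList u, p <- interleavings xs ys ]

interleavings :: [a] -> [a] -> [[a]]
interleavings xs []         = [xs]
interleavings [] ys         = [ys]
interleavings (x:xs) (y:ys) = map (x:) (interleavings xs (y:ys))
                           ++ map (y:) (interleavings (x:xs) ys)

Prefix-closedness is preserved by both operations, since a prefix of an interleaving is an interleaving of prefixes.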


The partial order used on TTree is t v t0 ⇐⇒ paths t ⊆ paths t0.

I write for the tree with no edges and simply x for x , the tree with exactly one edge labelled x. The tree t \ V is t with all edges with labels in V contracted, t V is t with all edges but those labelled with variables in V contracted. Example Consider the two trees

[Tree diagrams for t1 and t2 and for t1 ⊕ t2, t1 ⊗ t2, t1∗, (t1 ⊗ t2) \ {x} and (t1 ⊗ t2)|_{y,z} are not reproducible here.]

Given a binding (Γ, e) where we have a TTree describing the calls done by e, and also one TTree for each expression bound in Γ, how can we combine that information into one tree describing the behaviour of the whole binding? A first attempt might be a function

s: (Var → TTree) → TTree → TTree

defined by

nextx(s t¯ t) := ⊥ if nextx t = ⊥
nextx(s t¯ t) := s t¯ (t′ ⊗ t¯ x) if nextx t = t′,

that traverses the tree t and upon every call interleaves the tree of the called name, t¯ x, with the remainder of t.


Example
[The diagrams of this example, giving small concrete trees for t¯ x, t¯ y, t¯ z and t and the resulting tree s t¯ t, are not reproducible here: every call in t is followed by an interleaved copy of the called name’s tree.]

This is a good start, but it does not cater for thunks, where the first call behaves differently than later calls. Therefore, we have to tell s which variables are bound to thunks, and give them special treatment: After a variable x referring to a thunk is evaluated, we pass on a modified map where t¯ x = . Hence I extend the signature to

s: 2^Var → (Var → TTree) → TTree → TTree

and the definition is now

nextx(sT t¯ t) := ⊥ if nextx t = ⊥
nextx(sT t¯ t) := sT t¯ (t′ ⊗ t¯ x) if nextx t = t′, x ∈/ T
nextx(sT t¯ t) := sT (t¯[x 7→ ]) (t′ ⊗ t¯ x) if nextx t = t′, x ∈ T.

The ability to define this function (relatively) easily is the main advantage of working with trace trees instead of co-call graphs at this stage. As s is defined in terms of monotone operations, it is itself monotone in its arguments t¯ and t.

Example
With the same arguments as above, for T = {y} the effect of calling y, i.e. a call to z, happens only once. [The diagram of the resulting tree sT t¯ t is not reproducible here.]
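The automaton view also admits a compact executable rendering of s. The following Haskell sketch uses a lazy, hence possibly infinite, tree representation; Tree, emptyT, orT, andT and subst are names assumed for this sketch. It keeps the thunk set as a plain list so that it is self-contained.

newtype Tree = Tree { next :: Name -> Maybe Tree }

emptyT :: Tree
emptyT = Tree (const Nothing)

-- either on this representation
orT :: Tree -> Tree -> Tree
orT t u = Tree $ \x -> case (next t x, next u x) of
  (Just t', Just u') -> Just (orT t' u')
  (Just t', Nothing) -> Just t'
  (Nothing, r)       -> r

-- both (interleaving): an initial call to x comes either from t or from u
andT :: Tree -> Tree -> Tree
andT t u = Tree $ \x -> case (next t x, next u x) of
  (Just t', Just u') -> Just (andT t' u `orT` andT t u')
  (Just t', Nothing) -> Just (andT t' u)
  (Nothing, Just u') -> Just (andT t u')
  (Nothing, Nothing) -> Nothing

-- s_T tbar t: on every call to x, interleave the tree bound to x into the
-- remaining behaviour; if x is a thunk (x in thunks), forget its tree afterwards.
subst :: [Name] -> (Name -> Tree) -> Tree -> Tree
subst thunks binds t = Tree $ \x -> case next t x of
  Nothing -> Nothing
  Just t' ->
    let binds' y | x `elem` thunks && y == x = emptyT
                 | otherwise                 = binds y
    in Just (subst thunks binds' (t' `andT` binds x))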


We project a TTree to a value of type (Var → Card), as required for a cardinality analysis, using c: TTree → (Var → Card) defined by

c(t) x := ⊥, if x does not occur in t
c(t) x := 1, if on each path in t, x occurs at most once
c(t) x := ∞, otherwise.

From this definition it follows Lemma 25 c(nextx t) v c(t) − x. 
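On the finite path-set sketch above, c has the following direct rendering; toCard is a name assumed for this sketch, and Card and CardEnv are taken from the earlier cardinality sketch.

-- Project a finite trace tree to cardinalities: absent = not called,
-- One = at most once on every path, Many = twice on some path.
toCard :: TTree -> CardEnv
toCard t = M.fromListWith max
  [ (x, if count x p > 1 then Many else One)
  | p <- S.toList t, x <- distinct p ]
  where
    count x  = length . filter (== x)
    distinct = S.toList . S.fromList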

Specification

A tree cardinality analysis determines for every expression e and arity α the tree Tα(e) of calls to free variables of e which are performed by evaluating e with α arguments and using the result in any way. As the resulting value might be passed to unknown code or stored in a data structure, we cannot assume anything about how often the resulting ∗ value is used. This justifies the arity parameter: We expect T0(ńx. y) = y but T1(ńx. y) = y. I write Tα¯ (Γ) for the analysis lifted to bindings, returning ⊥ for vari- ables not bound in Γ or mapped to ⊥ in α¯. I also need a variant Tα(Γ, e) that, given bindings Γ, an expression e and an arity α, reports the calls on dom Γ performed by e and Γ with these bindings in scope. I can now identify conditions on T that allow to satisfy the specifi- cations in Definition 11. Definition 14 (Tree cardinality analysis specification) I expect the cardinality analysis to agree with the arity analysis on which variables are called at all:

dom Tα(e) = dom Aα(e) (T-dom)

dom Tα(Γ, e) = dom Aα(Γ, e) (Th-dom)


Inequalities for the syntactic constructs: ∗ x ⊗ Tα+1(e) v Tα(e x) (T-App)

(Tα−1(e)) \{x} v Tα(ńx. e) (T-Lam) ∗ Tα(e[y := x]) v x ⊗ (Tα(e)) \{y} (T-subst)

x v Tα(x) (T-Var)

T0(e) ⊗ (Tα(et) ⊕ Tα(ef )) v Tα(e ? et : ef ) (T-If) ( (T ( ))(T ( ))) \ dom v T (let in ) sthunks Γ Aα(Γ,e) Γ α e Γ α Γ e (T-Let) For values, analysed without arguments, the analysis is expected to return a repeatable tree:

isVal e =⇒ T0(e) is repeatable (T-value)

The specification for Aα(Γ, e) is closely related to (T-Let):

(s (T ( ))(T (e))) v T ( e) thunks Γ Aα(Γ,e) Γ α dom Γ α Γ, (Th-s) And finally, the connection to the arity analysis:

x ∈ thunks Γ, c(Tα(Γ, e)) x = ∞ =⇒ (Aα(Γ, e)) x = 0 (Th-∞-thunk) 

Safety

Given a tree cardinality analysis, I can define a cardinality analysis in the sense of the previous section. The definition for Cα(Γ, e) is straight forward: Cα(Γ, e) := c(Tα(Γ, e)).

In order to define C(α¯ ,α,α˙ )((Γ, e, S)) I need to fold the tree cardinality analysis over the stack:

T˙_([]) := ⊥

T˙α·α˙ ((et : ef )·S) := T˙α˙ (S) ⊗ (Tα(et) ⊕ Tα(ef )) ∗ T˙α˙ ($x·S) := T˙α˙ (S) ⊗ x

T˙α˙ (#x·S) := T˙α˙ (S).


With this I can define

˙  C(α¯ ,α,α˙ )((Γ, e, S)) := c sthunks Γ (Tα¯ (Γ))(Tα(e) ⊗ Tα˙ (S)) , and set out to prove Lemma 26 Given a tree cardinality analysis satisfying Definition 14, together with an arity analysis satisfying Definition 6, the derived cardinality analysis satisfies Definition 11.  Proof The conditions (C-dom) and (Ch-dom) follow directly from (T-dom) and (Th-dom) with (A-dom) and (Ah-dom). The conditions (C-cong), (C-not-called) and (C-upd) follow directly from the definitions of T and T˙ ∗ We have x v T˙α˙ (S) for $x ∈ S, so (C-args) follows from ∗ ˙ [x 7→ ∞] = c(x ) v c(Tα˙ (S)) v (C(α¯ ,α,α˙ )((Γ, e, S))).

Similar calculations prove (C-call) using (T-Var), (C-App) using (T-App), (C-Lam) using (T-subst) and (T-Lam), (C-If1) using (T-If). Condition (C-If2) is where the precision comes from, as we retain the knowledge that the two code paths are mutually exclusive. The proof is a direct consequence of t v t ⊕ t0. The variable cases are interesting, as these interact with the heap, and hence with the s function. 0 We first show that (C-Var1) is fulfilled. Abbreviate T := thunks Γ and note that x ∈/ T. We have

C(α¯ ,α¯ x,α˙ )((Γ, e, S))  = c sT (Tα¯ (Γ))(Tα¯ x(e) ⊗ T˙α˙ (S))  = c sT (Tα¯ (Γ))(nextx x ⊗ Tα¯ x(e) ⊗ T˙α˙ (S)) { as nextx x = }  v c sT (Tα¯ (Γ))(nextx(x ⊗ T˙α˙ (S))) ⊗ Tα¯ x(e)) 0 0 { using (nextx t) ⊗ t v nextx(t ⊗ t ) }  = c nextx(sT (Tα¯ (Γ))(x ⊗ T˙α˙ (S))) { by the definition of s }


 v c sT (Tα¯ (Γ))(x ⊗ T˙α˙ (S)) − x { by Lemma 25 }  v c sT (Tα¯ (Γ))(Tα(x) ⊗ T˙α˙ (S)) − x { by (T-Var) and s monotone }

= C(α¯ ,α,α˙ )((Γ, x, S)) − x.

Condition (C-Var1) represents the evaluation of a thunk. The proof is 0 analogue, using Tα¯ (Γ)[x 7→ ] = Tα¯ (Γ ) in the step where the definition of s is unfolded. For (C-Var2) abbreviate T := thunks Γ = thunks(Γ[x 7→ e]). We know isVal e, so T0(e) is repeatable, by (T-value). If a repeatable tree t is already contained in the second argument to s, then we can remove it from the range of the first argument: 0 0 sT (t¯[x 7→ t])(t ⊗ t ) = sT t¯ (t ⊗ t )

Altogether, we show

C(α¯,0,α˙ )((Γ[x 7→ e], e, S))  = c sT (Tα¯ (Γ[x 7→ e]))(T0(e) ⊗ T˙α˙ (S))  = c sT (Tα¯ (Γ)[x 7→ Tα¯ x(e)])(T0(e) ⊗ T˙α˙ (S))  v c sT (Tα¯ (Γ)[x 7→ T0(e)])(T0(e) ⊗ T˙α˙ (S))  = c sT (Tα¯ (Γ))(T0(e) ⊗ T˙α˙ (S)) { by T0(e) repeatable }

= C(α¯,0,α˙ )((Γ, e,#x·S)). Proving condition (C-Let) is for the most part a tedious calculation involving freshness of variables. We use that if the domain of t¯0 is disjoint from the variables occurring in t¯ (i.e. ∀y. ∀x ∈ t¯ y. t¯0 x = ), then 0 0 sT (t¯ t t¯ ) t = sT t¯ (sT t¯ t).

Abbreviating T := thunks Γ and T0 := thunks ∆, we show: C (( · )) (Aα(∆,e)tα¯ ,α,α˙ ) ∆ Γ, e, S  = 0 (T ( · ))(T ( ) ⊗ T˙ ( )) c sT∪T Aα(∆,e)tα¯ Γ ∆ α e α˙ S  = 0 (T ( ))( 0 (T ( ))(T ( ) ⊗ T˙ ( ))) c sT∪T α¯ Γ sT∪T Aα(∆,e) ∆ α e α˙ S { by the above equation }


 = (T ( ))( 0 (T ( ))(T ( ) ⊗ T˙ ( ))) c sT α¯ Γ sT Aα(∆,e) ∆ α e α˙ S  = (T ( ))( 0 (T ( ))(T ( )) ⊗ T˙ ( )) c sT α¯ Γ sT Aα(∆,e) ∆ α e α˙ S { as dom ∆ is fresh with regard to S }  = c s (T ( ))(s 0 (T ( ))(T (e)) ⊗ T˙ (S)) t T α¯ Γ T Aα(∆,e) ∆ α α˙ dom ∆  (T ( ))( 0 (T ( ))(T ( )) ⊗ T˙ ( )) \ dom c sT α¯ Γ sT Aα(∆,e) ∆ α e α˙ S ∆  = c s 0 (T ( ))(T (e)) t T Aα(∆,e) ∆ α dom ∆  (T ( ))( 0 (T ( ))(T ( )) \ dom ⊗ T˙ ( )) c sT α¯ Γ sT Aα(∆,e) ∆ α e ∆ α˙ S   v c Tα(∆, e) t c sT (Tα¯ (Γ))(Tα(let ∆ in e) ⊗ T˙α˙ (S)) { by (Th-s) and (T-Let) }

= Cα(∆, e) t C(α¯ ,α,α˙ )((Γ, let ∆ in e, S)).

Finally, (Ah-∞-thunk) follows directly from (Th-∞-thunk). 

4.3.3 Co-call cardinality analysis

The preceding section provides a framework for a cardinality analysis, but the infinite nature of the TTree data type prevents an implementation on that level. Therefore, the concrete implementation uses a practically implementable data type which can serve as an approximation to trace trees: The co-call graphs introduced in Section 3.2. We can convert such a graph to a TTree, using the function t: Graph → TTree given by

paths(t(G)) := {x1 ··· xn | ∀i. xi ∈ dom G ∧ ∀j ≠ i. xi—xj ∈ G}.

Conversely, we can approximate a TTree by a Graph, as implemented by the function g: TTree → Graph where

g(t) := ⨆ {g˙(x˙) | x˙ ∈ paths t}

which uses g˙: [Var] → Graph given by

dom g˙(x1 ··· xn) = {x1,..., xn}

g˙(x1 ··· xn) := {xi—xj | i ≠ j ≤ n}.


The mappings t and g form a monotone Galois connection: g(t) v G ⇐⇒ t v t(G). It even is a Galois insertion, as g(t(G)) = G.

Example
[The diagrams of this example, showing a tree t, its co-call graph g(t) and the tree t(g(t)), are not reproducible here.] They show that converting from trees to graphs and back loses information, in particular about whether something is called twice or more often, but the resulting tree still contains all the paths of the original tree.
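On the finite path-set sketch, the direction from trees to graphs is easy to render executably; Graph, gPath and gOfTree are names assumed for this sketch. Two variables become adjacent iff they occur together on some path, and a variable gets a self-edge iff it occurs twice on some path.

type Graph = (S.Set Name, S.Set (Name, Name))   -- (domain, edges stored in both directions)

-- g-dot on a single path
gPath :: Path -> Graph
gPath p = ( S.fromList p
          , S.fromList [ (x, y) | (i, x) <- zip [0 :: Int ..] p
                                , (j, y) <- zip [0 ..] p
                                , i /= j ] )

-- g: join the per-path graphs of all paths of the tree
gOfTree :: TTree -> Graph
gOfTree t = (S.unions (map fst gs), S.unions (map snd gs))
  where gs = map gPath (S.toList t)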

Specification

I proceed in the usual scheme, by giving a specification for a safe co-call cardinality analysis, connecting it to the tree cardinality analysis, and eventually proving that our implementation fulfils the specification. A co-call cardinality analysis determines for each expression e and incoming arity α its co-call graph Gα(e). As before, there is also a variant that analyses bindings, written Gα(Γ, e). The conditions in the following definition are obviously designed to connect to Definition 14. Definition 15 (Co-call cardinality analysis specification) We want the co-call graph analysis to agree with the arity analysis on what is called at all:

dom Gα(e) = dom Aα(e) (G-dom)


As usual, we have inequalities for the syntactic constructs:

Gα+1(e) t ({x} × fv (e x)) v Gα(e x) (G-App)

Gα−1(e) \{x} v Gα(ńx. e) (G-Lam) Gα(e[y := x]) \{x, y} v Gα(e) \{x, y} (G-subst)

G0(e) t Gα(et) t Gα(ef ) t (dom A0(e) × (dom Aα(et) ∪ dom Aα(ef ))) v Gα(e ? et : ef ) (G-If)

Gα(Γ, e) \ dom Γ v Gα(let Γ in e) (G-Let) 2 isVal e =⇒ (fv e) v G0(e) (G-value)

The following conditions concern Gα(Γ, e), which has to cater for the calls originating in e,

Gα(e) v Gα(Γ, e), (Gh-body) the calls originating in the right-hand-sides, ( 7→ 0) ∈ =⇒ G ( 0) v G ( ) x e Γ Aα(Γ,e) x e α Γ, e , (Gh-heap) and finally the extra edges between what is called from the right-hand- side of a variable and whatever the variable is called with:

(x 7→ e0) ∈ Γ, isVal(e0) =⇒ 0 (fv e ) × Nx(Ga(γ, e)) v Gα(Γ, e). (Gh-extra) For thunks, we can be slightly more precise: Only one call to them matters, so we can ignore a possible edge x—x:

(x 7→ e0) ∈ Γ, ¬isVal(e0) =⇒ 0 (fv e ) × (Nx(Ga(γ, e)) \{x}) v Gα(Γ, e) (Gh-extra’) Finally, we need to ensure that the cardinality analysis is actually used by the arity analysis when dealing with thunks. For recursive bindings, we never eta-expand thunks: rec Γ, x ∈ thunks Γ, x ∈ dom Aα(Γ, e) =⇒

Aα(Γ, e) = 0 (Rec-∞-thunk)


But for a non-recursive thunk, we only have to worry about thunks which are possibly called multiple times:

0 0 x ∈/ fv e , ¬isVal(e ), x—x ∈ Gα(Γ, e) =⇒ 0 Aα([x 7→ e ], e) = 0 (Nonrec-∞-thunk) 

Safety

From a co-call analysis fulfilling Definition 15 we can derive a tree cardinality analysis fulfilling Definition 14, using

Tα(e) := t(Gα(e)).

The definition of Tα(Γ, e) differs for non-recursive and recursive bindings.
• For a non-recursive binding Γ = [x 7→ e′] we have Tα(Γ, e) := t(Gα(e))|_{dom Γ} and
• for recursive Γ we define Tα(Γ, e) := t((dom Aα(Γ, e))²), i.e. the bound variables may call each other in any way.

Lemma 27
Given a co-call cardinality analysis satisfying Definition 15, together with an arity analysis satisfying Definition 6, the derived cardinality analysis satisfies Definition 14.

Proof Most conditions of Definition 14 follow by simple calculation from their counterpart in Definition 15 using the Galois connection

t v t(G) ⇐⇒ g(t) v G and identities such as g(t ⊕ t0) = g(t) t g(t0) and g(t ⊗ t0) = g(t) t g(t0) t (dom t × dom t0). For (T-Let), we use (G-Let) with the following Lemma 28, instantiated = thunks ¯ = T ( ) = T ( ) = dom with T Γ, t Aα(Γ,e) Γ , t α e and S Γ. 


Lemma 28 Given • g(t) v G, • ∀x ∈/ S. t¯ x = ⊥, • ∀x ∈ S. g(t¯ x) v G, • ∀x ∈ S, x ∈/ T. dom (t¯ x) × Nx(G) v G and • ∀x ∈ S, x ∈ T. dom (t¯ x) × (Nx(G) \{x}) v G we have g((sT t¯ t) \ S) v G.  Intuitively, this lemma describes how we can approximate the trace tree that is the result of integrating trace trees representing the bound expressions in a recursive binding with the trace tree describing the calls from the body. The conditions specify that • the body is approximated by the graph G, • the set S encompasses all bound variables, • the effect of each individual bound trace tree is approximated by G, • for a non-thunk x, for every edge x—y ∈ G, there is also an edge from y to anything that is called by x and • similarly for a thunk x, but disregarding a possible loop x—x ∈ G. The absence in G of an edge x—y for x, y ∈ S does not indicate that calls to x are y are exclusive (they are mutually recursive, so typically both will be called, many times), but rather that in an infinite unwrapping of the recursive let, as explained on page 102, the recursion proceeds with either x or y. Therefore, the inequality in the conclusion of the lemma disregards variables from S. Proof In order to prove g((sT t¯ t) \ S) v G, we have to show that g˙(x˙ \ S) v G for every path x˙ ∈ paths(sT t¯ t). I write x˙ \ S for the list x˙ with all elements in S filtered out. The behaviour of the function s can also be described as follows: In order to produce the path x˙ ∈ paths(sT t¯ t), s picks a specific path y˙ ∈ paths t (and disregards t otherwise). Going through each entry x on y˙, the function chooses a a specific path from t¯ x and interleaves it into y˙ after the entry x. This procedure then continues with the interleaved path, at the position following x.


In order to do a proof by induction, I strengthen the proposition, and keep track of two sets: The set V of variables not in S that have been called so far, and the set VT that keeps track of the thunks that have been called. The assumptions of the strengthened proposition are • x˙ is a path produced by interleaving the trees from t¯ into y˙, as described above, • g˙(y˙ \ VT) v G, • V × (y˙ \ VT) v G, • V ∩ S = {}, • VT ⊆ S, • ∀x ∈ VT, t¯ x = ⊥, and we show not only g(x˙ \ S) v G, but also V × (x˙ \ VT) v G. With V = VT = {}, the assumptions are fulfilled, so by proving this proposition, we conclude the lemma. The statement is trivial for y˙ = x˙ = []. Otherwise, x˙ and y˙ are necessar- ily headed by the same variable, so let y˙ = x · y˙0 and x˙ = x · x˙0, where x˙0 is a path produced by interleaving the trees from t¯0 into an interleaving of y˙0 and z˙, where z˙ is a path in the tree t¯ x and t¯0 is t¯ if x ∈/ T and t¯[x 7→ ] otherwise (cf. the definition of s on page 153). There are three cases to consider:

• If x refers to a thunk that we have seen before (i.e. x ∈ VT), then t¯ x is the empty tree. We invoke the inductive hypothesis with the 0 0 same V and VT and obtain g(x˙ \ S) v G and V × (x˙ \ VT) v G. The assumptions are fulfilled, as x ∈ VT and z˙ = [], and the conclusion immediately implies the desired g(x˙ \ S) v G and V × (x˙ \ VT) v G, as x ∈ VT and x ∈ S due to VT ⊆ S. • Otherwise, if x is a recursive call (i.e. x ∈ S), we again invoke the inductive hypothesis, this time extending VT by x if x ∈ T. To establish the assumptions, we first decompose what we have given: 0 0 0 g˙((x · y˙ ) \ VT) = {x} × (y˙ \ VT) t g˙(y˙ \ VT). and 0 0 V × ((x · y˙ ) \ VT) = V × {x} t V × (y˙ \ VT).


Furthermore, we note that g(z˙ \ VT) v G.

It remains to show that V × (z˙ \ VT) v G. Above decomposition provides V × {x} v G, so all calls seen so far are adjacent to x in G, and thus V ⊆ Nx(G) \{x}. Together with z˙ ⊆ dom (t¯ x) the assumption of the lemma provides the desired inequality. 0 0 We then obtain g(x˙ \ S) v G and V × (x˙ \ VT) v G, which, together with V × {x} v G, implies the desired result. • The remaining case is that of a call to something not in S, i.e. x ∈/ S. In this case, z˙ = [], so the above decompositions suffice to establish the assumptions of the inductive hypothesis. Here, we extend V 0 0 with {x} and obtain g(x˙ \ S) v G and (V ∪ {x}) × (x˙ \ VT) v G. This implies the desired result together with V × {x} v G and {x} × (x˙0 \ S) v G, which follows from the second conclusion of the inductive hypothesis due to VT ⊆ S. 

4.3.4 Call Arity, concretely

At last I can give the complete and concrete co-call analysis correspond- ing to GHC’s Call Arity, and establish its safety via our chain of refine- ments, simply by checking the conditions in Definition 15. It is a slightly more concise reformulation of the specification given in Section 3.3.1, and adjusted to the restricted syntax where application arguments are always variables. The arity analysis is:

Aα(x) := [x 7→ α]

Aα(e x) := Aα+1(e) t [x 7→ 0]

Aα(ńx. e) := Aα−1(e) \{x}

Aα(e ? et : ef ) := A0(e) t Aα(et) t Aα(ef )

Aα(Cb) := ⊥ for b ∈ {t, f}

The analysis of a let expression Aα(let Γ in e) as well as the analysis of a binding Aα(Γ, e) are defined differently for recursive and non-recursive bindings.


For a recursive Γ, we have Aα(let Γ in e) := α¯ \ dom Γ and Aα(Γ, e) := α¯|_{dom Γ}, where α¯ is the least fixed point defined by the equation15

α¯ = Aα¯ (Γ) t Aα(e) t [x 7→ 0 | x ∈ thunks Γ].

For a non-recursive binding Γ = [x 7→ e′] we have Aα(let Γ in e) := (Aα′(e′) t Aα(e)) \ dom Γ and Aα(Γ, e) := [x 7→ α′] where

α′ := 0 if ¬isVal(e′) and x—x ∈ Gα(e)
α′ := Aα(e) x otherwise.
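A small Haskell sketch of this arity choice for a single non-recursive binding let x = rhs in body, using aexp and isVal from the earlier arity-analysis sketch. The co-call information is passed in as a set of edges standing in for Gα(body); it is an assumption of this sketch rather than a definition of the analysis.

nonRecArity :: S.Set (Name, Name)   -- edges of G_alpha(body)
            -> Name -> Expr -> Expr -> Arity -> Maybe Arity
nonRecArity coCalls x rhs body a
  | Nothing <- called                          = Nothing  -- never called: leave the binding alone
  | not (isVal rhs), (x, x) `S.member` coCalls = Just 0   -- thunk possibly called twice: keep arity 0
  | otherwise                                  = called
  where
    called = M.lookup x (aexp a body)          -- the arity the body calls x with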

We have dom Gα(e) = dom Aα(e) and

Gα(x) := {}

Gα(e x) := Gα+1(e) t ({x} × fv (e x)) 2 G0(ńx. e) := (fv e) \{x}

Gα+1(ńx. e) := Gα(e) \{x}

Gα(e ? et : ef ) := G0(e) t Gα(et) t Gα(ef ) t

(dom A0(e) × (dom Aα(et) ∪ dom Aα(ef )))

Gα(Cb) := {} for b ∈ {t, f} Gα(let Γ in e) := Gα(Γ, e) \ dom Γ

The analysis result for bindings is different for recursive and non-recursive bindings and uses the auxiliary function

Gα¯;G(x 7→ e′) := (fv(e′))² if isVal(e′) ∧ x—x ∈ G
Gα¯;G(x 7→ e′) := G_{α¯ x}(e′) otherwise,

which calculates the co-calls of an individual binding, adding the extra edges between multiple invocations of a bound variable, unless it is bound to a thunk and hence shared.

15 The initial implementation of Call Arity did not include the third term in this equation, and thunks would erroneously be eta-expanded if they happened to be part of a linearly recursive binding. Working towards this formalisation uncovered the bug (see GHC commit 306d255).


For recursive Γ we define Gα(Γ, e) as the least fixed point fulfilling

Gα(Γ, e) = Gα(e) t ⨆_{(x 7→ e′)∈Γ} G_{Aα(Γ,e);Gα(Γ,e)}(x 7→ e′) t ⨆_{(x 7→ e′)∈Γ} (fv(e′) × Nx(Gα(Γ, e))).

For a non-recursive Γ = [x 7→ e′], we have

Gα(Γ, e) = Gα(e) t G_{Aα(Γ,e);Gα(e)}(x 7→ e′)
           t fv(e′) × (Nx(Gα(e)) \ {x}) if ¬isVal(e′)
           t fv(e′) × Nx(Gα(e)) if isVal(e′).

Theorem 5 Call Arity is safe (in the sense of Definition 5).

Proof By straightforward calculation (and simple induction for (G-subst)), we can show that the analyses fulfil Definitions 6 and 15. So by Lemmas 27, 26 and 23 and Corollary 19, Call Arity is safe. 

4.4 The Isabelle formalisation

On their own, the proofs presented in the previous sections are involved, but otherwise rather standard. What sets them apart from similar work is that these proofs have been carried out in the interactive theorem prover Isabelle [NPW02]. This provides a level of assurance that is hard to reach using pen-and-paper-proofs.

4.4.1 Size and effort

But it also greatly increases the effort involved in obtaining a result like Theorem 5. The Isabelle development corresponding to this chapter, including the definitions of the syntax and the semantics (but excluding unrelated results from Chapter 2, such as correctness and adequacy of

the semantics), contains roughly 12,000 lines of code with 1,200 lemmas (many small, some large) in 75 theories, created over the course of 9 months [Bre15d]. Much of the complexity is owed to the problem of bindings and to the handling of monotonicity and continuity of the analysis. See Section 2.6 for more information on the formalisation, especially on how Nominal logic (Sections 1.6 and 2.6.1) and the HOLCF package (Section 1.7.3) have helped here. So while the actual result shown here might not have warranted that effort on its own – after all, performance regressions due to bugs in the Call Arity analysis do not have very serious consequences – it lays the groundwork for formalising more and more parts of the core data structures and algorithms in our compilers.

4.4.2 Structure

The separation into individual theories (Isabelle’s equivalent to Haskell’s modules) as well as the use of locales ([Bal14], Isabelle’s approximation to a module system) helps to gain insight into the structure of an otherwise very large proof, by ensuring a separation of concerns. For example, the proof of T0(e) = e has only the conditions from Definition 6 available, whichJ showsK J thatK the cardinality analysis is irrelevant for functional correctness.

4.4.3 The trace tree type implementation

Isabelle/HOL is a logic of total functions, and it is not a surprise that it has good support for inductive data types via the datatype command, where each element of the type has a finite size. But the type of trace trees introduced in Section 4.3.2 is not of that kind: To model program behaviour with recursion I need infinite trees as well. Fortunately, it is possible to define such types in Isabelle/HOL, and there are actually a few options available: • The HOLCF package can construct domains with an infinitely deep structure from a domain equation like

TTree = Var → TTree⊥


which can be implemented as

domain ('a::countable) tree' = Node (lazy next' :: 'a discr → 'a tree')
type_synonym 'a tree = 'a discr → 'a tree'

• The codatatype command provides support for co-inductive data types, which can create the appropriate type directly:

codatatype (lset: 'a) tree = Node (nxt : 'a ⇒ 'a tree option)

• The type can be constructed “by hand” using the plain typedef command.

I have experimented with all three variants and found that the first two approaches would not provide me with the right tools (e.g. definition tools, induction principles) that allow me to efficiently define the required operation and prove the required lemmas, so I turned to the “by hand” construction. To that end, I defined the notion of sets of lists where for every list, all its prefixes are in the set as well. These are the downward-closed sets under the prefix-order on lists (theory TTree.thy):

definition downset :: 'a list set ⇒ bool where
  downset xss = (∀ x n. x ∈ xss −→ take n x ∈ xss)

Any non-empty downward-closed set is then such a trace tree:

typedef 'a ttree = {xss :: 'a list set . [] ∈ xss ∧ downset xss} by auto

A typedef is only admissible if the given set is not empty; this is the proof obligation solved automatically using by auto; the set {[]}, corresponding to the tree with no edges, is one possible witness of that. In Section 4.3.2 I describe two concrete interpretations of trace trees: as sets of traces and as automata with labelled transitions. The actual type definition is based on the first, so the function paths returns nothing but the representation of a tree based on that type definition. Using the

helpful machinery of the lifting package [HK13], this function is thus defined to be the identity function, once the abstraction is removed:

lift_definition paths :: 'a ttree ⇒ 'a list set is (λ x. x).

The automata view is realised by the two functions possible and nxt which indicate what labels are present on the root’s edges, and the corresponding child node.

lift_definition possible :: 'a ttree ⇒ 'a ⇒ bool
  is λ xss x. ∃ xs. x#xs ∈ xss.
lift_definition nxt :: 'a ttree ⇒ 'a ⇒ 'a ttree
  is λ xss x. insert [] {xs | xs. x#xs ∈ xss}
  by (auto simp add: downset_def take_Suc_Cons[symmetric] simp del: take_Suc_Cons)

In order to make nxt a total function I add the empty list to the result, which only matters if there was no edge with the requested label in the given tree. The proof obligation following the definition ensures that the result of nxt is a valid tree, i.e. non-empty and downward closed. The important operations on trees, ⊕ and ⊗, are defined in terms of set union and list interleavings. As the Isabelle theory for list interleavings already uses the latter symbol, I use ⊕⊕ resp. ⊗⊗ in my formalisation for the operations on trees:

lift_definition either :: 'a ttree ⇒ 'a ttree ⇒ 'a ttree (infixl ⊕⊕ 80)
  is op ∪ by (auto simp add: downset_def)
lift_definition both :: 'a ttree ⇒ 'a ttree ⇒ 'a ttree (infixl ⊗⊗ 86)
  is λ xss yss. ⋃ {xs ⊗ ys | xs ys. xs ∈ xss ∧ ys ∈ yss}
  by (force simp: ex_ex_eq_hint dest: interleave_butlast)

Further operations include, for example, the function without, which is defined in terms of filter:

lift_definition without :: 'a ⇒ 'a ttree ⇒ 'a ttree
  is λ x xss. filter (λ x'. x' ≠ x) ‘ xss
  by (auto intro: downset_filter) (metis filter.simps(1) imageI)


Operations like this and the similar ttree_restr are partly the reason why using the existing infrastructure for co-inductive definitions failed, as filtering is a notoriously difficult problem here: The paper [LH14] discusses that problem in depth. Most of the code in the Isabelle theory on trace trees is concerned with the s function (see page 153), which is defined as

nextx(sT t̄ t) :=  ⊥                                if nextx t = ⊥
                  sT t̄ (t' ⊗ t̄ x)                  if nextx t = t', x ∈/ T
                  sT (t̄[x ↦ empty]) (t' ⊗ t̄ x)     if nextx t = t', x ∈ T
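As a small worked example (my own, read off the definition above): let T = ∅, let t̄ map x to the tree with the single path [y] and every other variable to the empty tree, and let t be the tree with the single path [x]. Substitution then splices t̄ x in behind the edge labelled x:

    paths(s∅ t̄ t) = {[], [x], [x, y]}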

In the Isabelle formalisation, I first define a related predicate on traces (substitute'), by recursion on the trace, as I need to prove that predicate to be downward-closed before I can lift it to form the real substitute on trace trees:

definition f_nxt :: ('a ⇒ 'a ttree) ⇒ 'a set ⇒ 'a ⇒ ('a ⇒ 'a ttree)
where f_nxt f T x = (if x ∈ T then f (x:=empty) else f )

fun substitute' :: ('a ⇒ 'a ttree) ⇒ 'a set ⇒ 'a ttree ⇒ 'a list ⇒ bool
where
  substitute'_Nil:  substitute' f T t [] ←→ True
| substitute'_Cons: substitute' f T t (x#xs) ←→
    possible t x ∧ substitute' (f_nxt f T x) T (nxt t x ⊗⊗ f x) xs

lift_definition substitute :: ('a ⇒ 'a ttree) ⇒ 'a set ⇒ 'a ttree ⇒ 'a ttree
  is λ f T t. Collect (substitute' f T t)
  by (simp add: downset_substitute)

Depending on the proposition that one wants to show, a different definition of substitute provides a more useful induction scheme. In this alternative definition, it is emphasised that every path in sT t̄ t comes from a specific path in t. I formalised this using the following inductive definition and equivalence proof:

inductive substitute'' :: ('a ⇒ 'a ttree) ⇒ 'a set ⇒ 'a list ⇒ 'a list ⇒ bool
where
  substitute''_Nil: substitute'' f T [] []
| substitute''_Cons:


    zs ∈ paths (f x) =⇒ xs' ∈ interleave xs zs =⇒ substitute'' (f_nxt f T x) T xs' ys
    =⇒ substitute'' f T (x#xs) (x#ys)

lemma substitute_substitute'':
  xs ∈ paths (substitute f T t) ←→ (∃ xs' ∈ paths t. substitute'' f T xs' xs)

This alternative definition is used in one direction of the following lemma, which states that (sT t̄ t)|S = sT t̄ (t|S) holds if all of t̄, i.e. both its domain as well as all variables in its range, is contained in the set S.

lemma ttree_rest_substitute2:
  assumes ∀ x. carrier (f x) ⊆ S
  assumes const_on f (−S) empty
  shows ttree_restr S (substitute f T t) = substitute f T (ttree_restr S t)

This lemma, which I use implicitly and without much ado in the proof on page 158, is a good example of how far intuitive obviousness and proof difficulty can diverge: It is quite obviously true, as s only acts on and adds edges with labels in S, which the restriction to S leaves untouched. Nevertheless, the proof consists of more than 120 lines of Isar, not counting supporting definitions such as the alternative definition of s. The complexity is owed to the fact that on both sides of the equality, we have filtered trees. Induction over a path of a filtered tree is not very useful, as in one step of the induction, many (filtered) edges can still be traversed by s. So I need to get hold of the original, unfiltered tree, which in a way hides behind an existential quantifier, and working in Isabelle with such existentially quantified statements tends to require more explicit steps. This also confirms the observation that combining filter-like operations and co-inductive definitions is tricky.

4.5 The formalisation gap

Every formalisation – whether hand-written or machine-checked – has a formalisation gap, i.e. a difference to the formalised artefact that is

not (and often cannot be) formally bridged. Despite the effort that went into this formalisation, the gap is not very narrow, and has been wide enough to fall into.

4.5.1 Core vs. my syntax

The most obvious difference between formalisation and reality is the syntax of the language: GHC's Core (Fig. 2) is defined with 15 constructors, while the small lambda calculus that I use to represent Core has only six (Fig. 5 plus the two in Section 2.4.2). I argue that it is still a reasonable representation of Core: As I explain in Section 3.4, the additional syntactic constructs are irrelevant to our analysis: It either returns the trivial result (literals, types, coercions), or transparently looks through them (casts, ticks, type abstraction and applications).
A similar difference is that Core is typed. Nevertheless it is acceptable not to include types in the formalisation, as the Call Arity analysis ignores them anyway. One might now worry that the transformation might break type safety. This does not happen, as the type-ignorant Call Arity code does not actually transform the code: It merely annotates it, and the existing general-purpose simplifier then actually eta-expands it, if the types allow this. If they do not (which may be the case with type families), it will simply refuse to do so. This does not affect the safety results, as in my proofs I always only require a lower bound on the analysis result (i.e. an upper bound on the reported arity), so being less aggressive is always safe.

4.5.2 Core vs. my semantics

As mentioned in Section 1.4.1, there is no official operational semantics for GHC Core that captures its evaluation behaviour, besides the actual implementation. It is folklore and my own understanding of GHC's inner workings that justify my use of Launchbury's semantics to model Core's evaluation behaviour. This is not without pitfalls: Core allows arbitrary expressions as arguments, while my syntax follows Launchbury and restricts that to


variables. The transformation is rather simple: Just replace e1 e2 with let z = e2 in e1 z – this is essentially the first step of the Core-to-STG transformation as performed by GHC. But this is not the whole story: If the argument already is a variable, i.e. e1 x, then no extra let-binding is introduced. This subtly changes the evaluation behaviour: While the evaluation of e1 (x y) calls x at most once, the evaluation of e1 x can call x multiple times. Therefore, the equation for Gα(e1 e2) in Fig. 12 has two cases.
Originally, the Call Arity code did not take that into account and would return a wrong result in such instances. Interestingly, this could not be observed: As the argument of an application e1 x is always analysed with an incoming arity of 0, this would thus be the call arity of x, and no eta-expansion would happen. Furthermore, when concluding the analysis of the let-expression where x is bound, αx is definitely 0 and then, according to the equations in Fig. 13 resp. Fig. 14, it does not matter whether x is called multiple times.
It would make a noticeable difference if the cardinality result were to be used elsewhere as well, e.g. to avoid the updating of thunks that are used at most once anyway (Section 3.7.2): Preliminary testing confirms that this would lead to a huge increase in allocations and program run time if this corner case were not taken care of. In order to prove such update avoidance based on my (or any) analysis to be correct, one could use a semantics that models these update flags explicitly. In [PB10] Piróg & Biernacki describe a big-step semantics for STG which is operationally very close to the GHC runtime, as from it they derive – formally verified(!) – a virtual machine equivalent to the STG machine (cf. Section 1.4.3). But Call Arity is a Core-to-Core analysis, so such an STG semantics does not directly apply here.
This clearly shows that there is demand for a precise formal semantics for GHC's Core with enough detail to describe its evaluation and sharing behaviour. It would help to avoid such mistakes and could serve as a reference for developers coming up with and implementing Core-to-Core transformations and other parts of the compiler.


4.5.3 Core’s annotations

Identifiers in GHC's Core are annotated with a wealth of additional information – inlining information, occurrence information, strictness signatures, demand information. As later phases rely on this information, it has to be considered part of the language, and should be included in a formal semantics.
This actually caused a nasty bug¹⁶ that appeared in the third release candidate of GHC 7.10. The symptoms were weird: The program would skip over a call to error and simply carry on with the rest of the code. With Call Arity disabled, nothing surprising happened and the exception was raised as expected.
What went wrong? It boiled down to a function

f :: a → b
f x = error "..."

which the strictness analyser annotates with b, indicating that once f is called with one argument, the result is definitely bottom. In the code at hand, every call to f passes two arguments, i.e. case f x y of {...}. Therefore, Call Arity determines f's external arity to be 2, and changes the definition to

f x y = error "..." y

The strictness annotation on f, however, is still present, allowing the simplifier to change the code that contains the call to f to case f x of {}, as passing one argument is enough to cause the exception to be raised. It also removes all alternatives from the case, as the control flow will not return. On their own, each transformation is correct; together, havoc is created: Due to the eta-expansion, the evaluation of f x does not raise an exception. Because the case expression has no alternatives any more, the execution of the final program continues at some other, undefined part of the program.
One way to fix this would be to completely remove annotations that might no longer be true after eta-expanding a function definition, losing

16 https://ghc.haskell.org/trac/ghc/ticket/10176

the benefit that these annotations provide. The actual fix was more careful and capped the reported arity at the number of arguments with which, according to the strictness signature, the function is definitely bottom.
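The core of the problem can be reproduced at the source level. The following small, self-contained Haskell sketch (my own illustration, not the code from the ticket) contrasts the two shapes of f: forcing the partial application of the original definition raises the exception, while the eta-expanded definition yields a function value.

import Control.Exception (SomeException, evaluate, try)

-- Original shape: applying one argument already yields bottom.
f1 :: Int -> Int -> Int
f1 _ = error "..."

-- Eta-expanded shape, corresponding to an external arity of 2.
f2 :: Int -> Int -> Int
f2 _ y = error "..." y

-- (Behaviour as described when compiled without optimisations; with -O,
-- GHC's own arity machinery may already treat f1 like f2, which is
-- precisely the kind of subtlety discussed above.)
main :: IO ()
main = do
  r1 <- try (evaluate (f1 42)) :: IO (Either SomeException (Int -> Int))
  r2 <- try (evaluate (f2 42)) :: IO (Either SomeException (Int -> Int))
  putStrLn (either (const "f1 42: exception") (const "f1 42: a value") r1)
  putStrLn (either (const "f2 42: exception") (const "f2 42: a value") r2)

This is exactly why the combination of the stale bottom annotation and the eta-expanded definition is dangerous: the simplifier assumes the former behaviour while the code exhibits the latter.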

4.5.4 Implementation vs. formalisation

Clearly I have formalised and verified the algorithm behind Call Arity, but not the implementation. For example, the Isabelle code simply uses fix to calculate the least fixed point during the analysis of a recursive let, where the implementation has to explicitly iterate the analysis result, starting from bottom and stopping when a fixed point is reached. Termination of this implementation is not handled formally, and neither is the correctness of the implementation with regard to its formal description here or in the Isabelle theories.
I currently do not see a practical way to do so: There are a few Haskell-specific formal methods, e.g. refinement types in the form of Liquid Haskell [VSJVP14], but they are not powerful enough for such a full-fledged correctness proof with regard to a formal specification. Another approach would be to employ Isabelle to verify the implementation: As Haftmann writes [Haf10], this requires either converting the Haskell implementation into Isabelle using the tool Haskabelle, or going the other way and using Isabelle's code generation features [Haf09] to produce the implementation from the Isabelle definition. Either direction would require formalising large parts of the existing GHC codebase itself – a daunting prospect.
Furthermore, GHC works with Haskell code that is structured in modules and packages; this naturally affects the implementation, which will, for example, not collect arity and co-call information for external identifiers, as they cannot be used anyway (see Section 3.4.1). This implementation short-cut is ignored here.
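For illustration, the iteration strategy can be sketched in a few lines of Haskell (my own sketch, not GHC's actual code; iterateToFixpoint and its use are invented for this example):

-- Iterate a step function from a given start value until it no longer
-- changes. In the implementation, the step function corresponds to
-- re-analysing the right-hand sides of a recursive let with the current
-- arity and co-call information.
iterateToFixpoint :: Eq a => (a -> a) -> a -> a
iterateToFixpoint step = go
  where
    go x = let x' = step x
           in if x' == x then x else go x'

main :: IO ()
main = print (iterateToFixpoint (\n -> min 5 (n + 1)) (0 :: Int))  -- prints 5

In the implementation, termination rests on an informal argument that this iteration stabilises, which is precisely the part of the gap described above.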


4.5.5 Performance and safety in the larger context

Call Arity is but one transformation in a large number of analyses and transformations performed by GHC, and the safety result established in this chapter does not immediately carry over to the compiler as a whole. It is quite possible that a subsequent transformation works better on the original code e than on the transformed code T0(e), and that this difference outweighs the improvement due to Call Arity. In that sense, the safety theorem is not composable.
The property that I would like to be able to assume about the other transformations is monotonicity: If e2 is better than e1 before the transformation, then the transformed e2 is better than the transformed e1, where “better” refers to the abstract performance measure used – in my case, the number of allocations. Then it would follow from my safety theorem that the insertion of Call Arity into the sequence of transformations will not make the end result perform worse.
In practice, this assumption is certainly “somewhat true”, i.e. holds in common cases, as also shown by the empirical results. But it is unlikely to be true in a complete and rigorous sense, i.e. I expect that one can construct corner cases where it does not hold.
Finally, my formal notion of performance is of course just an approximation for real performance, justified by little more than the empirically observed good correlation between allocations and execution time. Formally capturing the actual runtime of a program on modern hardware with multiple cores, long instruction pipelines, branch prediction and complex caches is currently way out of reach.

4.6 Related work

This work connects arity and cardinality analyses with operational safety properties, using an interactive theorem prover to verify its claims; as such this is a first. However, this is not the first compiler transformation proven correct in an interactive theorem prover. After all there is CompCert (e.g.


[Ler06; Ler12]), a complete verified optimising compiler for C implemented in Coq. Furthermore, a verified Java to Java bytecode compiler [Loc10] was written using Isabelle's code generation facilities, and the CakeML project has produced, among other things, a verified compiler from CakeML to CakeML bytecode, implemented in the HOL4 theorem prover [KMNO14]. The Vellvm project provides formal semantics for LLVM's intermediate representation and uses it for formal verification of compiler transformations [ZNMZ13]. These projects address functional correctness, though, and not (yet) performance.
Using a resource aware program logic for a subset of Java bytecode, which they have implemented in Isabelle, Aspinall, Beringer and Momigliano validate local optimisations [ABM07] to be indeed optimisations with regard to a variety of resource algebras. Following the precedent set by transformation validation, in their work the safety of the transformation is not proved once and for all, but rather a proof is constructed for every concrete program. The Isabelle formalisations of the proofs seem to be lost.
In the realm of functional programming languages, a number of formal treatments of compiler transformations exist, e.g. verification of the CPS transformation in Coq (e.g. [Chl10; DL07]), Twelf (e.g. [Tia06]) or Isabelle (e.g. [MO03]). As their focus lies on finding proper techniques for handling naming, their semantics do not express heap usage and sharing.
An approach that necessarily yields safe program transformations is that of generated transformations: Here, a large number of possible program transformations of a specific form are mechanically enumerated and checked whether they are both semantics preserving and performance improving; these checks often build on SMT solvers. The selected concrete transformations are then safe by construction. Examples of this line of research include [JNR02], [BA06] and [Buc15].
Sands' improvement theory [San92] provides a general, inequational algebra to describe the effect of program transformations on performance. Its notion of improvement is similar to my notion of safety, while the more general notion of weak improvement allows performance regressions up

to a constant factor. This theory was adapted for lazy languages, both for improvement of time [MS99] and space [GS99; GS01]. Recently, Hackett and Hutton [HH14] took up Sands' work and built a general framework to prove worker/wrapper transformations time improving. And while neither that nor Sands' work has yet been machine-checked, at least the semantic correctness of Hutton's worker/wrapper framework has been verified using Isabelle [Gam09].
Could I have built my results on theirs, especially as [HH14] uses almost the same abstract machine? Indeed, eta-expansion can be phrased as an instance of the worker/wrapper transformation, with abstraction and representation contexts Abs = [] and Rep = (λz1 ... zn. ([] z1 ... zn)). Unfortunately, the assumptions of the worker/wrapper improvement theorem are not satisfied, and this is to be expected: Sands' notion of improvement – and hence Hackett and Hutton's theorems – guarantees improvement in all contexts, while in my case the eta-expansion is justified by an analysis of the actual context, and is generally unsafe in other contexts.
So in its current form, improvement theory is tailored to local transformations and, as Sands points out in [GS01], would require the introduction of context information to apply to whole-program transformations such as Call Arity. Such a grand unified improvement theory for call-by-need would be a tremendously useful thing to have.

Excuse me, but real programmers use butterflies. Randall Munroe, xkcd #378

CHAPTER 5 Conclusion

In this work, I have spanned the arc from down-to-earth compiler transformations over formal semantics to machine-verified proofs of operational properties of the compiler transformation.
By introducing the Call Arity analysis into the Haskell compiler GHC, I made it practically possible to let an important class of list-consuming and -processing functions take part in the list fusion program transformation, which is an important mechanism to make idiomatic Haskell code perform well. This solves a long-standing open issue.
My key observation was that in the context of a lazy programming language, a good arity analysis requires the help of a precise cardinality analysis, and my key contribution is the novel cardinality analysis based on the notion of co-call graphs, which allows the compiler to get precise information about how often a variable is used, even if the call occurs from within a recursive function.
Empirical measurements show that introducing the analysis improves the performance of some programs in the standard benchmark suite. Furthermore, changing the definitions of list consumers with accumulators to now take part in list fusion provides a more significant performance boost to a number of existing programs, and a huge improvement to some programs. Thus it now allows performance-aware programmers to write more high-level code, instead of manually transforming their

code into a less idiomatic form to make up for the compiler's previous inability to produce good code in these situations.
One can consider Call Arity to be a whole-program analysis, given that it works best if all occurrences of a function are known. I have shown that a compiler employing separate compilation can still make good use of such a transformation, as inlining in some cases gives the analysis the chance to see all use-sites of a function's definition.
To pave the way to a formal treatment of such analyses and transformations of lazy functional programs, I created a formalisation of Launchbury's natural semantics, of Sestoft's mark-1 abstract machine and of related denotational semantics in the interactive theorem prover Isabelle. These reusable artefacts extend the growing library of formalisations and lower the barrier of entry for formalising further research on programming languages in a machine-checked setting.
The formalisation contains a proof of the adequacy of Launchbury's natural semantics with regard to a standard denotational semantics. The original paper only sketches this proof, and the sketch has so far resisted attempts to complete it with rigour. By slightly deviating from the path outlined in the proof sketch, I found a more elegant and more direct proof of adequacy, which is also machine-checked. This does not shake the foundations of the large swathes of research building on it, as the adequacy theorem holds as expected, but it fortifies them instead.
My formalisation builds on relatively new mechanisms for dealing with names and binders in Isabelle, namely the Isabelle package Nominal2, and constitutes one of the largest developments using this technology. It is also the first to combine it with domain theory in the form of the HOLCF package.
Finally, I used these formalised semantics to model the Call Arity analysis and transformation and proved not only functional correctness, but also – and especially – safety: The performance of the program is not reduced by applying the Call Arity transformation.
I chose to measure performance by counting the number of dynamic allocations, and I explain why this measure is suitable to ensure that the


Call Arity analysis does not go wrong, and that it is a good compromise between formal tractability and “real” performance. I introduced the notion of trace trees as a suitable abstract type to think and reason about cardinality analyses.
My proof is modularised using Isabelle's locales, making it possible to re-use just parts of it. This should make similar formalisation endeavours, such as a safety proof of the other arity- and cardinality-related analyses in GHC, more tractable. As with most formalisation attempts, by pursuing it I have improved the understanding of how and why the analysis works, sharpened its specification and rooted out bugs that the conventional test suite did not find.
Operational properties of compiler transformations, such as safety, are rarely investigated on a formal level, especially not with the rigour provided by a theorem prover. I have demonstrated that such a feat is possible, with an effort that may be justified in certain high-stake use cases.
There are limits to the applicability of my safety theorem, as Call Arity is but one step in a large sequence of other analyses and transformations performed by the compiler. Therefore, the no-regression result does not transfer to the complete compiler in full universality: For example, if subsequent transformations were not monotone, then the introduction of Call Arity could have an overall negative effect on some programs. I elaborate on this and other aspects of the “formalisation gap” inherent in such formal work.
To overcome some of the formalisation gap, it would be desirable to formalise GHC's Core in Isabelle. Using Isabelle's code generation to Haskell and GHC's plugin architecture, even verified implementations of Core-to-Core transformations in GHC would appear to be within reach. This would be a milestone on the way to formally verified compilation of Real-World-Haskell.
All in all, this thesis exhibits an approach to the design and development of compiler transformations that is supported by formal methods. I hope that it will inspire more researchers in this field to dare to not only test their claims, but actually prove them, and to even do that with the rigour provided by machine-checked proofs.


Honk iff you love formal logic. Randall Munroe, xkcd #1033

APPENDIX A Formal definitions and main theorems

Polemically speaking, formal proofs are irrelevant – only their existence matters. Once a proposition has been proved, and the proof has been checked by the theorem prover, this is all that matters to a reader interested in just the assurance that the proof is fine. The same cannot be said for definitions and theorems: The machine cannot check whether these really state what the author claims them to state! Therefore, this appendix reproduces the Isabelle formulation of the main theorems of this thesis, together with all definitions required to understand them. The intention is to enable the reader to check the formal results precisely, without reaching out for the actual Isabelle sources.
Naturally, this does not protect against malice on the side of the author. To rule that out, you'd not only have to process the Isabelle sources yourself, but also verify each line of code for tricks such as introducing new axioms, using other unchecked commands or messing with the parser and printer to obtain misleading results. There is a certain level of assurance that this is not the case, as my work has been accepted in the Archive of Formal Proofs [Bre13; Bre15d].


The listings reproduce the Isabelle code as typed, and hence do not benefit from Isabelle's pretty-printing abilities; the name of the Isabelle theory file that contains the code is printed next to the listing, unless the previous listing is from the same file.
For a few functions, it is not the actual, technical definition that is given here, but rather a proposition involving the function that completely describes it. In these cases, the definition that one intuitively wants is not accepted by the definitory command (e.g. using definition to define a function in HOLCF's type of continuous functions), or the proof obligations produced by such a command are hard to discharge, and a different formulation is easier to work with, and can later be shown to be equivalent to the desired formulation (e.g. with nominal_function's equivariance obligations). Also, in the cases where I use locales to abstract over similar definitions, reproducing the locale interface, the abstract function definition and the actual instantiation of the locale would obscure the view, so I describe the resulting function as if I had defined it directly.
In the end, in Isabelle/HOL, it does not matter whether a function is defined with one set of equations or another: As long as they fully describe the function (i.e. if they are exhaustive and terminating), the resulting constant is identical for all purposes.

A.1 Terms

The definition of the type of terms requires that the type of variables is defined first. In order to use the Nominal machinery, I need to declare that type using the atom_decl command. To us, the resulting type is abstract, and all that we know about it is that it is countably infinite:

atom_decl var   (* Vars.thy *)

Based on that, the type of lambda expressions is defined as follows; see Section 2.6.1 for an explanation of this construction:

nominal_datatype exp =   (* Terms.thy *)
  Var var
| App exp var
| LetA as::assn body::exp binds bn as in body as
| Lam x::var body::exp binds x in body  (Lam [_]. _ [100, 100] 100)
| Bool bool
| IfThenElse exp exp exp  (((_)/ ? (_)/ : (_)) [0, 0, 10] 10)
and assn = ANil | ACons var exp assn
binder bn :: assn ⇒ atom list
where bn ANil = []
    | bn (ACons x t as) = (atom x) # (bn as)

The function atom is provided by the Nominal package. It embeds our type var into the type atom encompassing all possible name types, but as we use only one such type in our formalisation, one can assume var and atom to be isomorphic.
Only lambda abstractions and Booleans are considered to be values, as characterised by the following function:

nominal_function isVal :: exp ⇒ bool
where isVal (Var x) = False
    | isVal (Lam [x]. e) = True
    | isVal (App e x) = False
    | isVal (Let as e) = False
    | isVal (Bool b) = True
    | isVal (scrut ? e1 : e2) = False

The type heap that occurs in some of the listings is but a type abbreviation:

type_synonym heap = (var × exp) list

The domain of such a heap is the set of variables that are bound to some expression:

definition domA   (* AList-Utils.thy *)
where domA h = fst ‘ set h

As explained in Section 2.6.1, the type assn is but a work-around, and we'd really like the Let constructor to have such a heap as the parameter.


Therefore, I define a conversion function and define Let in terms of that; from then on, LetA is not used:

fun heapToAssn :: heap ⇒ assn   (* Terms.thy *)
where heapToAssn [] = ANil
    | heapToAssn ((v,e)#Γ) = ACons v e (heapToAssn Γ)

definition Let :: heap ⇒ exp ⇒ exp
where Let Γ e = LetA (heapToAssn Γ) e

We will use substitution of one variable for another, in variables, expressions and heaps. For variables, this is easily defined:

fun subst_var :: var ⇒ var ⇒ var ⇒ var  (_[_::v=_] [1000,100,100] 1000)   (* Substitution.thy *)
where x[y ::v= z] = (if x = y then z else x)

For expressions and heaps, due to them being mutually recursive, the definition is more involved. In particular, I had to jump through a few hoops to be able to discharge the proof obligations produced by nominal_function; the complex default and invariant annotations were required for that:

nominal_function (default case_sum (λx. Inl undefined) (λx. Inr undefined),
    invariant λ a r. (∀ Γ y z. ((a = Inr (Γ, y, z) ∧ atom ‘ domA Γ ♯∗ (y, z)) −→
      map (λx. atom (fst x)) (Sum_Type.projr r) = map (λx. atom (fst x)) Γ)))
  subst :: exp ⇒ var ⇒ var ⇒ exp  (_[_::=_] [1000,100,100] 1000)
  and subst_heap :: heap ⇒ var ⇒ var ⇒ heap  (_[_::h=_] [1000,100,100] 1000)
where
  (Var x)[y ::= z] = Var (x[y ::v= z])
| (App e v)[y ::= z] = App (e[y ::= z]) (v[y ::v= z])
| atom ‘ domA Γ ♯∗ (y,z) =⇒ (Let Γ body)[y ::= z] = Let (Γ[y ::h= z]) (body[y ::= z])
| atom x ♯ (y,z) =⇒ (Lam [x].e)[y ::= z] = Lam [x].(e[y::=z])
| (Bool b)[y ::= z] = Bool b
| (scrut ? e1 : e2)[y ::= z] = (scrut[y ::= z] ? e1[y ::= z] : e2[y ::= z])
| [][y ::h= z] = []
| ((v,e)# Γ)[y ::h= z] = (v, e[y ::= z]) # (Γ[y ::h= z])


A.2 Semantics

A.2.1 Natural semantics

Launchbury's natural semantics is defined as an inductive predicate:

inductive reds :: heap ⇒ exp ⇒ var list ⇒ heap ⇒ exp ⇒ bool   (* Launchbury.thy *)
  (_ : _ ⇓_ _ : _ [50,50,50,50] 50)
where
  Lambda: Γ : (Lam [x]. e) ⇓L Γ : (Lam [x]. e)
| Application:
    [[ atom y ♯ (Γ,e,x,L,∆,Θ,z);
       Γ : e ⇓L ∆ : (Lam [y]. e');
       ∆ : e'[y ::= x] ⇓L Θ : z ]]
    =⇒ Γ : App e x ⇓L Θ : z
| Variable:
    [[ map_of Γ x = Some e; delete x Γ : e ⇓x#L ∆ : z ]]
    =⇒ Γ : Var x ⇓L (x, z) # ∆ : z
| Let:
    [[ atom ‘ domA ∆ ♯∗ (Γ, L); ∆ @ Γ : body ⇓L Θ : z ]]
    =⇒ Γ : Let ∆ body ⇓L Θ : z
| Bool:
    Γ : Bool b ⇓L Γ : Bool b
| IfThenElse:
    [[ Γ : scrut ⇓L ∆ : (Bool b);
       ∆ : (if b then e1 else e2) ⇓L Θ : z ]]
    =⇒ Γ : (scrut ? e1 : e2) ⇓L Θ : z

The denotational semantics maps expressions to a denotational domain, which is defined using the HOLCF machinery:

domain Value = Fn (lazy Value → Value) | B (lazy bool discr)   (* Value.thy *)

fixrec Fn_project :: Value → Value → Value
where Fn_project·(Fn·f) = f

abbreviation Fn_project_abbr (infix ↓Fn 55)
where f ↓Fn v ≡ Fn_project·f·v

A.2.2 Small-step semantics

Sestoft's mark-1 abstract machine is defined via the inductive command, which is convenient even if there is no recursion. I also introduce some nicer syntax for the transitive reflexive closure.

inductive step :: conf ⇒ conf ⇒ bool (infix ⇒ 50)   (* Sestoft.thy *)
where
  app1: (Γ, App e x, S) ⇒ (Γ, e, Arg x # S)
| app2: (Γ, Lam [y]. e, Arg x # S) ⇒ (Γ, e[y ::= x], S)
| var1: map_of Γ x = Some e =⇒ (Γ, Var x, S) ⇒ (delete x Γ, e, Upd x # S)
| var2: x ∈/ domA Γ =⇒ isVal e =⇒ (Γ, e, Upd x # S) ⇒ ((x,e)# Γ, e, S)
| let1: atom ‘ domA ∆ ♯∗ Γ =⇒ atom ‘ domA ∆ ♯∗ S =⇒ (Γ, Let ∆ e, S) ⇒ (∆@Γ, e, S)
| if1:  (Γ, scrut ? e1 : e2, S) ⇒ (Γ, scrut, Alts e1 e2 # S)
| if2:  (Γ, Bool b, Alts e1 e2 # S) ⇒ (Γ, if b then e1 else e2, S)

abbreviation steps (infix ⇒∗ 50)
where steps ≡ step∗∗

The type conf in the signature of step is also but a type synonym:

type_synonym conf = (heap × exp × stack)   (* SestoftConf.thy *)

A.2.3 Denotational semantics

The denotational semantics is defined by instantiating a more abstract locale, as explained in Section 2.6.3. The following equations fully describe the result, though.

abbreviation ESem_syn :: exp ⇒ (var ⇒ Value) ⇒ Value  ([[ _ ]]_ [60,60] 60)   (* Denotational.thy *)
where [[ e ]]$ ≡ ESem e · $

lemma ESem_simps:
  [[ Lam [x]. e ]]$ = Fn·(Λ v. [[ e ]]$(x := v))
  [[ App e x ]]$ = [[ e ]]$ ↓Fn $ x
  [[ Var x ]]$ = $ x
  [[ Bool b ]]$ = B·(Discr b)
  [[ (scrut ? e1 : e2) ]]$ = B_project·([[ scrut ]]$)·([[ e1 ]]$)·([[ e2 ]]$)
  [[ Let Γ body ]]$ = [[ body ]]{|Γ|}$

Towards defining the recursive heap semantics, the function evalHeap maps a given evaluation function (e.g. the expression semantics above) over the heap, producing a function from variable names to values. As this definition is yet abstract in the choice of expression and value types, its signature contains lots of type variables:

fun evalHeap :: ('var × 'exp) list ⇒ ('exp ⇒ 'value::{pure,pcpo}) ⇒ 'var ⇒ 'value   (* EvalHeap.thy *)
where evalHeap [] _ = ⊥
    | evalHeap ((x,e)#h) eval = (evalHeap h eval)(x := eval e)

I introduce nicer syntax for this operation and then define the heap semantics using the fixed-point operator from HOLCF:

abbreviation HSem_syn ({| _ |}_ [0,60] 60)   (* HeapSemantics.thy *)
where {|Γ|}$ ≡ HSem Γ · $

lemma HSem_def : {|Γ|}$ = (µ $'. $ ++domA Γ [[Γ]]$')

The following listings will mention the restriction of an environment to a set, which is defined as follows:

definition env_restr :: 'a set ⇒ ('a ⇒ 'b::pcpo) ⇒ ('a ⇒ 'b)   (* Env.thy *)
where env_restr S m = (λ x. if x ∈ S then m x else ⊥)

abbreviation env_restr_rev (infixl f |‘ 110)
where env_restr_rev m S ≡ env_restr S m


A.3 Correctness and adequacy theorems

The main results from Chapter 2 are the correctness and adequacy of the natural semantics with regard to the standard denotational semantics. The correctness theorem (Theorem 2) is as follows. Note that the assumption of closedness is written out explicitly.

theorem correctness:   (* CorrectnessOriginal.thy *)
  assumes Γ : e ⇓L ∆ : v
  and fv (Γ, e) ⊆ set L ∪ domA Γ
  shows [[e]]{|Γ|}$ = [[v]]{|∆|}$
  and ({|Γ|}$) f |‘ domA Γ = ({|∆|}$) f |‘ domA Γ

The adequacy theorem (Theorem 3) corresponds even closer to the original formulation. Note that the set S of variables to avoid is unconstrained, i.e. the theorem can produce a judgement for every choice of S. This only works because the set is represented as a list in the formalisation, otherwise finiteness of the set would have to be required explicitly.

theorem adequacy:   (* Adequacy.thy *)
  assumes [[e]]{|Γ|} ≠ ⊥
  shows ∃ ∆ v. Γ : e ⇓S ∆ : v

A.4 Call Arity

For the formalisation of Chapter 4, I introduce a few custom data types.

A.4.1 Arities

I define the type Arity as an isomorphic copy of the type of naturals:

typedef Arity = UNIV :: nat set   (* Arity.thy *)
  morphisms Rep_Arity to_Arity by auto

Having a dedicated type for arities allows me to define the partial order required here. Note that it swaps the arguments of ≤:

190 A.4 Call Arity instantiation Arity :: po begin lift_definition below_Arity :: Arity ⇒ Arity ⇒ bool is λ x y . y ≤ x. On the other hand, this additional abstraction layers requires me to lift a few definitions from the naturals, in particular zero and the prede- cessor and successor functions. The latter are defined in two steps: First lifting the function to Arity, and then into HOLCF’s type of continuous functions: instantiation Arity :: zero begin lift_definition zero_Arity :: Arity is 0. instance.. end lift_definition inc_Arity :: Arity ⇒ Arity is Suc. lift_definition pred_Arity :: Arity ⇒ Arity is (λ x . x − 1). definition inc :: Arity → Arity where inc = (Λ x. inc_Arity x) definition pred :: Arity → Arity where pred = (Λ x. pred_Arity x)

The type AEnv of arity environments is simply var ⇒ Arity⊥. The following functions provide some useful operations on such and similar environments: the domain of an environment, singleton environments, and the removal of one entry.

definition edom :: ('key ⇒ 'value::pcpo) ⇒ 'key set   (* Env.thy *)
where edom m = {x. m x ≠ ⊥}

lemma esing_simps[simp]:
  (esing x · n) x = n
  x' ≠ x =⇒ (esing x · n) x' = ⊥

definition env_delete :: 'a ⇒ ('a ⇒ 'b) ⇒ ('a ⇒ 'b::pcpo)
where env_delete x m = m(x := ⊥)


A.4.2 Co-call graphs

The type CoCalls of co-call graphs is defined to be isomorphic to the set of symmetric relations on var:

typedef CoCalls = {G :: (var × var) set. sym G}   (* CoCallGraph.thy *)
  morphisms Rep_CoCall Abs_CoCall
  by (auto intro: exI[where x = {}] symI)
setup_lifting type_definition_CoCalls

The relevant operations are the calculation of the field of the relation, the member relation, the removal of a node from the graph, the restriction to a set, the Cartesian product and the set of neighbours.

lift_definition ccField :: CoCalls ⇒ var set is Field.

lift_definition inCC :: var ⇒ var ⇒ CoCalls ⇒ bool  (_−−_∈_ [1000, 1000, 900] 900)
  is λ x y s. (x,y) ∈ s.

abbreviation notInCC :: var ⇒ var ⇒ CoCalls ⇒ bool  (_−−_∈/_ [1000, 1000, 900] 900)
  where x−−y∈/S ≡ ¬ x−−y∈S

lift_definition cc_delete :: var ⇒ CoCalls ⇒ CoCalls
  is λ z. Set.filter (λ (x,y). x ≠ z ∧ y ≠ z)

lift_definition cc_restr :: var set ⇒ CoCalls ⇒ CoCalls
  is λ S. Set.filter (λ (x,y). x ∈ S ∧ y ∈ S)

lift_definition ccProd :: var set ⇒ var set ⇒ CoCalls  (infixr G× 90)
  is λ S1 S2. S1 × S2 ∪ S2 × S1
  by (auto intro!: symI elim: symE)

definition ccSquare  (_² [80] 80)
  where S² = ccProd S S

lift_definition ccNeighbors :: var ⇒ CoCalls ⇒ var set
  is λ x G. {y . (y,x) ∈ G ∨ (x,y) ∈ G}.


A.4.3 The Call Arity analysis

The equations for the arity analysis are mutually recursive with a few auxiliary functions to handle the heap.

lemma Aexp_simps[simp]:   (* CoCallAnalysisImpl.thy *)
  Aa (Var x) = esing x·(up·a)
  Aa (Lam [x]. e) = env_delete x (Apred·a e)
  Aa (App e x) = Aexp e·(inc·a) ⊔ esing x·(up·0)
  ¬ nonrec Γ =⇒
    Aa (Let Γ e) = (Afix Γ·(Aa e ⊔ (λ_. up·0) f |‘ thunks Γ)) f |‘ (− domA Γ)
  x ∈/ fv e' =⇒
    Aa (let x be e' in e) = env_delete x (A⊥(ABind_nonrec x e'·(Aa e, Ga e)) e' ⊔ Aa e)
  Aa (Bool b) = ⊥
  Aa (scrut ? e1 : e2) = A0 scrut ⊔ Aa e1 ⊔ Aa e2

lemma CCexp_simps[simp]:
  Ga (Var x) = ⊥
  G0 (Lam [x]. e) = (fv (Lam [x]. e))²
  Ginc·a (Lam [x]. e) = cc_delete x (Ga e)
  Ga (App e x) = Ginc·a e ⊔ {x} G× insert x (fv e)
  ¬ nonrec Γ =⇒
    Ga (Let Γ e) = (CCfix Γ·(Afix Γ·(Aa e ⊔ (λ_. up·0) f |‘ thunks Γ), Ga e)) G|‘ (− domA Γ)
  x ∈/ fv e' =⇒
    Ga (let x be e' in e) = cc_delete x
      (ccBind x e'·(Aheap_nonrec x e'·(Aa e, Ga e), Ga e)
       ⊔ fv e' G× (ccNeighbors x (Ga e) − (if isVal e' then {} else {x}))
       ⊔ Ga e)
  Ga (Bool b) = ⊥
  Ga (scrut ? e1 : e2) = G0 scrut ⊔ (Ga e1 ⊔ Ga e2) ⊔ edom (A0 scrut) G× (edom (Aa e1) ∪ edom (Aa e2))

A superscripted ⊥ indicates that the function is lifted to Arity⊥:

abbreviation Aexp_bot_syn (A⊥_)   (* ArityAnalysisSig.thy *)
where A⊥a e ≡ fup·(Aexp e)·a

abbreviation ccExp_bot_syn (G⊥_)   (* CoCallAnalysisSig.thy *)
where G⊥a ≡ (λe. fup·(ccExp e)·a)


The function Afix, building on ABinds and ABind, implements the fixpointing in the arity analysis:

lemma ABind_eq[simp]: ABind v e · ae = A⊥(ae v) e   (* ArityAnalysisAbinds.thy *)

fun ABinds :: heap ⇒ (AEnv → AEnv)
where ABinds [] = ⊥
    | ABinds ((v,e)#binds) = ABind v e ⊔ ABinds (delete v binds)

lemma Afix_eq: Afix Γ·ae = (µ ae'. (ABinds Γ·ae') ⊔ ae)   (* ArityAnalysisFix.thy *)

In the non-recursive case, the function ABind_nonrec is used; this is where the arity analysis depends on the co-call cardinality analysis.

lemma ABind_nonrec_eq:   (* CoCallFix.thy *)
  ABind_nonrec x e·(ae,G) = (if isVal e ∨ x−−x∈/G then ae x else up·0)

The fixed-point calculation of the co-call graph is defined similarly:

lemma ccBind_eq:   (* CoCallAnalysisBinds.thy *)
  ccBind v e·(ae, G) = (if v−−v∈/G ∨ ¬ isVal e then G⊥(ae v) e G|‘ fv e else (fv e)²)

lemma ccBinds_eq:
  ccBinds Γ·i = (⨆ v↦e ∈ map_of Γ. ccBind v e·i)

lemma ccBindsExtra_eq:
  ccBindsExtra Γ·(ae,G) = G ⊔ ccBinds Γ·(ae,G) ⊔ (⨆ x↦e ∈ map_of Γ. fv e G× ccNeighbors x G)

lemma CCfix_eq:   (* CoCallFix.thy *)
  CCfix Γ·(ae,G) = (µ G'. ccBindsExtra Γ·(ae, G') ⊔ G)

Finally, the actual transformation, which uses the arity analysis, is

lemma transform_simps:   (* ArityTransform.thy *)
  Ta (App e x) = App (Tinc·a e) x
  Ta (Lam [x]. e) = Lam [x]. Tpred·a e
  Ta (Var x) = Var x
  Ta (Let Γ e) = Let (map_transform Aeta_expand (Aheap Γ e·a)
                       (map_transform (λa. Ta) (Aheap Γ e·a) Γ)) (Ta e)
  Ta (Bool b) = Bool b
  Ta (scrut ? e1 : e2) = (T0 scrut ? Ta e1 : Ta e2)

where the auxiliary function map_transform applies a transformation of type Arity ⇒ exp ⇒ exp to an Arity environment and a heap:

lemma lift_transform_simps[simp]:   (* TransformTools.thy *)
  lift_transform t ⊥ e = e
  lift_transform t (up·a) e = t a e

definition map_transform :: ('a::cont_pt ⇒ exp ⇒ exp) ⇒ (var ⇒ 'a⊥) ⇒ heap ⇒ heap
where map_transform t ae = map_ran (λ x e. lift_transform t (ae x) e)

A.4.4 Call Arity theorems

The Call Arity transformation is functionally correct, i.e. does not change the semantics (Theorem 4):

corollary Arity_transformation_correct':   (* ArityAnalysisCorrDenotational.thy *)
  [[ T0 e ]]$ = [[ e ]]$

The main safety theorem for Call Arity (Theorem 5) reads as follows:

theorem end2end_closed:   (* CallArityEnd2EndSafe.thy *)
  assumes closed: fv e = ({} :: var set)
  assumes ([], e, []) ⇒∗ (Γ, v, []) and isVal v
  obtains Γ' and v'
  where ([], T0 e, []) ⇒∗ (Γ', v', []) and isVal v'
    and card (domA Γ') ≤ card (domA Γ)


It’s like someone took a transcript of a couple arguing at IKEA and made random edits until it compiled without errors. Randall Munroe, xkcd #1513

APPENDIX B Call Arity code

This appendix lists the actual implementation of Call Arity, as it is shipped in GHC 7.10.3, which is also the version that I produced the benchmarks in Section 3.5.3 with. I give the code without modifications besides
• the removal of comments and notes and
• whitespace-only changes to better fit the page format and to produce nicer alignment.
The mild pretty-printing and code alignment is performed using lhs2Tex [HL15].

B.1 Co-call graphs

module UnVarGraph
    ( UnVarSet
    , emptyUnVarSet, mkUnVarSet, varEnvDom
    , unionUnVarSet, unionUnVarSets
    , delUnVarSet
    , elemUnVarSet, isEmptyUnVarSet
    , UnVarGraph
    , emptyUnVarGraph
    , unionUnVarGraph, unionUnVarGraphs
    , completeGraph, completeBipartiteGraph
    , neighbors
    , delNode
    ) where

import Id
import VarEnv
import UniqFM
import Outputable
import Data.List
import Bag
import Unique

import qualified Data.IntSet as S

newtype UnVarSet = UnVarSet (S.IntSet) deriving Eq

k :: Var → Int
k v = getKey (getUnique v)

emptyUnVarSet :: UnVarSet
emptyUnVarSet = UnVarSet S.empty

elemUnVarSet :: Var → UnVarSet → Bool
elemUnVarSet v (UnVarSet s) = k v ‘S.member‘ s

isEmptyUnVarSet :: UnVarSet → Bool
isEmptyUnVarSet (UnVarSet s) = S.null s

delUnVarSet :: UnVarSet → Var → UnVarSet
delUnVarSet (UnVarSet s) v = UnVarSet $ k v ‘S.delete‘ s

mkUnVarSet :: [Var] → UnVarSet
mkUnVarSet vs = UnVarSet $ S.fromList $ map k vs

varEnvDom :: VarEnv a → UnVarSet
varEnvDom ae = UnVarSet $ ufmToSet_Directly ae

unionUnVarSet :: UnVarSet → UnVarSet → UnVarSet
unionUnVarSet (UnVarSet set1) (UnVarSet set2) = UnVarSet (set1 ‘S.union‘ set2)

unionUnVarSets :: [UnVarSet] → UnVarSet
unionUnVarSets = foldr unionUnVarSet emptyUnVarSet

instance Outputable UnVarSet where
    ppr (UnVarSet s) = braces $ hcat $ punctuate comma
        [ppr (getUnique i) | i ← S.toList s]

data Gen = CBPG UnVarSet UnVarSet
         | CG UnVarSet
newtype UnVarGraph = UnVarGraph (Bag Gen)

emptyUnVarGraph :: UnVarGraph
emptyUnVarGraph = UnVarGraph emptyBag

unionUnVarGraph :: UnVarGraph → UnVarGraph → UnVarGraph
unionUnVarGraph (UnVarGraph g1) (UnVarGraph g2) = UnVarGraph (g1 ‘unionBags‘ g2)

unionUnVarGraphs :: [UnVarGraph] → UnVarGraph
unionUnVarGraphs = foldl’ unionUnVarGraph emptyUnVarGraph

completeBipartiteGraph :: UnVarSet → UnVarSet → UnVarGraph
completeBipartiteGraph s1 s2 = prune $ UnVarGraph $ unitBag $ CBPG s1 s2

completeGraph :: UnVarSet → UnVarGraph
completeGraph s = prune $ UnVarGraph $ unitBag $ CG s

neighbors :: UnVarGraph → Var → UnVarSet

neighbors (UnVarGraph g) v = unionUnVarSets $ concatMap go $ bagToList g
  where go (CG s)       = (if v ‘elemUnVarSet‘ s then [s] else [])
        go (CBPG s1 s2) = (if v ‘elemUnVarSet‘ s1 then [s2] else []) ++
                          (if v ‘elemUnVarSet‘ s2 then [s1] else [])

delNode :: UnVarGraph → Var → UnVarGraph
delNode (UnVarGraph g) v = prune $ UnVarGraph $ mapBag go g
  where go (CG s)       = CG (s ‘delUnVarSet‘ v)
        go (CBPG s1 s2) = CBPG (s1 ‘delUnVarSet‘ v) (s2 ‘delUnVarSet‘ v)

prune :: UnVarGraph → UnVarGraph
prune (UnVarGraph g) = UnVarGraph $ filterBag go g
  where go (CG s)       = not (isEmptyUnVarSet s)
        go (CBPG s1 s2) = not (isEmptyUnVarSet s1) && not (isEmptyUnVarSet s2)

instance Outputable Gen where
    ppr (CG s)       = ppr s <> char ’2’
    ppr (CBPG s1 s2) = ppr s1 <+> char ’x’ <+> ppr s2

instance Outputable UnVarGraph where
    ppr (UnVarGraph g) = ppr g

B.2 The Call Arity analysis

module CallArity (callArityAnalProgram, callArityRHS) where

import VarSet
import VarEnv
import DynFlags (DynFlags)
import BasicTypes
import CoreSyn
import Id
import CoreArity (typeArity)
import CoreUtils (exprIsHNF, exprIsTrivial)
import UnVarGraph
import Demand

import Control.Arrow (first, second)

callArityAnalProgram :: DynFlags → CoreProgram → CoreProgram
callArityAnalProgram _dflags binds = binds’
  where (_, binds’) = callArityTopLvl [] emptyVarSet binds

callArityTopLvl :: [Var] → VarSet → [CoreBind] → (CallArityRes, [CoreBind])
callArityTopLvl exported _ []
    = ( calledMultipleTimes $
        (emptyUnVarGraph, mkVarEnv $ [(v, 0) | v ← exported])
      , [])
callArityTopLvl exported int1 (b : bs)
    = (ae2, b’ : bs’)
  where
    int2       = bindersOf b
    exported’  = filter isExportedId int2 ++ exported
    int’       = int1 ‘addInterestingBinds‘ b
    (ae1, bs’) = callArityTopLvl exported’ int’ bs
    (ae2, b’)  = callArityBind (boringBinds b) ae1 int1 b

callArityRHS :: CoreExpr → CoreExpr
callArityRHS = snd . callArityAnal 0 emptyVarSet

callArityAnal :: Arity → VarSet → CoreExpr → (CallArityRes, CoreExpr)
callArityAnal _ _ e@(Lit _)
    = (emptyArityRes, e)

callArityAnal _ _ e@(Type _)
    = (emptyArityRes, e)
callArityAnal _ _ e@(Coercion _)
    = (emptyArityRes, e)
callArityAnal arity int (Tick t e)
    = second (Tick t) $ callArityAnal arity int e
callArityAnal arity int (Cast e co)
    = second (λe → Cast e co) $ callArityAnal arity int e
callArityAnal arity int e@(Var v)
    | v ‘elemVarSet‘ int = (unitArityRes v arity, e)
    | otherwise          = (emptyArityRes, e)
callArityAnal arity int (Lam v e)
    | not (isId v)
    = second (Lam v) $ callArityAnal arity (int ‘delVarSet‘ v) e
callArityAnal 0 int (Lam v e)
    = (ae’, Lam v e’)
  where
    (ae, e’) = callArityAnal 0 (int ‘delVarSet‘ v) e
    ae’      = calledMultipleTimes ae
callArityAnal arity int (Lam v e)
    = (ae, Lam v e’)
  where
    (ae, e’) = callArityAnal (arity − 1) (int ‘delVarSet‘ v) e
callArityAnal arity int (App e (Type t))
    = second (λe → App e (Type t)) $ callArityAnal arity int e

callArityAnal arity int (App e1 e2)
    = (final_ae, App e1’ e2’)
  where
    (ae1, e1’) = callArityAnal (arity + 1) int e1
    (ae2, e2’) = callArityAnal 0 int e2

    ae2’ | exprIsTrivial e2 = calledMultipleTimes ae2
         | otherwise        = ae2
    final_ae = ae1 ‘both‘ ae2’
callArityAnal arity int (Case scrut bndr ty alts)
    = (final_ae, Case scrut’ bndr ty alts’)
  where
    (alt_aes, alts’) = unzip $ map go alts
    go (dc, bndrs, e) = let (ae, e’) = callArityAnal arity int e
                        in  (ae, (dc, bndrs, e’))
    alt_ae = lubRess alt_aes
    (scrut_ae, scrut’) = callArityAnal 0 int scrut

    final_ae = scrut_ae ‘both‘ alt_ae
callArityAnal arity int (Let bind e)
    = (final_ae, Let bind’ e’)
  where
    int_body          = int ‘addInterestingBinds‘ bind
    (ae_body, e’)     = callArityAnal arity int_body e
    (final_ae, bind’) = callArityBind (boringBinds bind) ae_body int bind

isInteresting :: Var → Bool
isInteresting v = 0 < length (typeArity (idType v))

interestingBinds :: CoreBind → [Var]
interestingBinds = filter isInteresting . bindersOf

boringBinds :: CoreBind → VarSet
boringBinds = mkVarSet . filter (not . isInteresting) . bindersOf

addInterestingBinds :: VarSet → CoreBind → VarSet
addInterestingBinds int bind
    = int ‘delVarSetList‘    bindersOf bind
          ‘extendVarSetList‘ interestingBinds bind

callArityBind :: VarSet → CallArityRes → VarSet → CoreBind → (CallArityRes, CoreBind)
callArityBind boring_vars ae_body int (NonRec v rhs)
  | otherwise
  = (final_ae, NonRec v’ rhs’)
  where
    is_thunk = not (exprIsHNF rhs)

boring = v ‘elemVarSet‘ boring_vars

    (arity, called_once)
        | boring    = (0, False)
        | otherwise = lookupCallArityRes ae_body v

    safe_arity | called_once = arity
               | is_thunk    = 0
               | otherwise   = arity

trimmed_arity = trimArity v safe_arity

(ae_rhs, rhs’) = callArityAnal trimmed_arity int rhs

    ae_rhs’ | called_once     = ae_rhs
            | safe_arity == 0 = ae_rhs
            | otherwise       = calledMultipleTimes ae_rhs

    called_by_v = domRes ae_rhs’
    called_with_v
        | boring    = domRes ae_body
        | otherwise = calledWith ae_body v ‘delUnVarSet‘ v


final_ae = addCrossCoCalls called_by_v called_with_v $ ae_rhs’ ‘lubRes‘ resDel v ae_body

    v’ = v ‘setIdCallArity‘ trimmed_arity

callArityBind boring_vars ae_body int b@(Rec binds)
    = (final_ae, Rec binds’)
  where
    any_boring = any (‘elemVarSet‘ boring_vars) [i | (i, _) ← binds]

    int_body = int ‘addInterestingBinds‘ b
    (ae_rhs, binds’) = fix initial_binds
    final_ae = bindersOf b ‘resDelList‘ ae_rhs

initial_binds = [(i, Nothing, e) | (i, e) ← binds]

    fix :: [(Id, Maybe (Bool, Arity, CallArityRes), CoreExpr)]
        → (CallArityRes, [(Id, CoreExpr)])
    fix ann_binds
        | any_change = fix ann_binds’
        | otherwise  = (ae, map (λ(i, _, e) → (i, e)) ann_binds’)
      where
        aes_old = [(i, ae) | (i, Just (_, _, ae), _) ← ann_binds]
        ae = callArityRecEnv any_boring aes_old ae_body

rerun (i, mbLastRun, rhs) | i ‘elemVarSet‘ int_body && not (i ‘elemUnVarSet‘ domRes ae) = (False, (i, Nothing, rhs))

| Just (old_called_once, old_arity, _) ← mbLastRun , called_once == old_called_once , new_arity == old_arity = (False, (i, mbLastRun, rhs))


| otherwise = let is_thunk = not (exprIsHNF rhs)

safe_arity | is_thunk = 0 | otherwise = new_arity

trimmed_arity = trimArity i safe_arity

(ae_rhs, rhs’) = callArityAnal trimmed_arity int_body rhs

                  ae_rhs’ | called_once     = ae_rhs
                          | safe_arity == 0 = ae_rhs
                          | otherwise       = calledMultipleTimes ae_rhs

              in (True, (i ‘setIdCallArity‘ trimmed_arity,
                         Just (called_once, new_arity, ae_rhs’), rhs’))
          where
            (new_arity, called_once)
                | i ‘elemVarSet‘ boring_vars = (0, False)
                | otherwise                  = lookupCallArityRes ae i

        (changes, ann_binds’) = unzip $ map rerun ann_binds
        any_change = or changes

callArityRecEnv :: Bool → [(Var, CallArityRes)] → CallArityRes → CallArityRes
callArityRecEnv any_boring ae_rhss ae_body
    = ae_new
  where
    vars = map fst ae_rhss

ae_combined = lubRess (map snd ae_rhss) ‘lubRes‘ ae_body


    cross_calls
        | any_boring          = completeGraph (domRes ae_combined)
        | length ae_rhss > 25 = completeGraph (domRes ae_combined)
        | otherwise           = unionUnVarGraphs $ map cross_call ae_rhss

    cross_call (v, ae_rhs) = completeBipartiteGraph called_by_v called_with_v
      where
        is_thunk = idCallArity v == 0

        ae_before_v
            | is_thunk  = lubRess (map snd $ filter ((/= v) . fst) ae_rhss) ‘lubRes‘ ae_body
            | otherwise = ae_combined

        called_with_v = unionUnVarSets $ map (calledWith ae_before_v) vars
        called_by_v = domRes ae_rhs

    ae_new = first (cross_calls ‘unionUnVarGraph‘) ae_combined

trimArity :: Id → Arity → Arity
trimArity v a = minimum [a, max_arity_by_type, max_arity_by_strsig]
  where
    max_arity_by_type = length (typeArity (idType v))
    max_arity_by_strsig
        | isBotRes result_info = length demands
        | otherwise            = a

    (demands, result_info) = splitStrictSig (idStrictness v)

type CallArityRes = (UnVarGraph, VarEnv Arity)

emptyArityRes :: CallArityRes
emptyArityRes = (emptyUnVarGraph, emptyVarEnv)

unitArityRes :: Var → Arity → CallArityRes
unitArityRes v arity = (emptyUnVarGraph, unitVarEnv v arity)

resDelList :: [Var] → CallArityRes → CallArityRes
resDelList vs ae = foldr resDel ae vs

resDel :: Var → CallArityRes → CallArityRes
resDel v (g, ae) = (g ‘delNode‘ v, ae ‘delVarEnv‘ v)

domRes :: CallArityRes → UnVarSet
domRes (_, ae) = varEnvDom ae

lookupCallArityRes :: CallArityRes → Var → (Arity, Bool)
lookupCallArityRes (g, ae) v
    = case lookupVarEnv ae v of
        Just a  → (a, not (v ‘elemUnVarSet‘ (neighbors g v)))
        Nothing → (0, False)

calledWith :: CallArityRes → Var → UnVarSet
calledWith (g, _) v = neighbors g v

addCrossCoCalls :: UnVarSet → UnVarSet → CallArityRes → CallArityRes
addCrossCoCalls set1 set2
    = first (completeBipartiteGraph set1 set2 ‘unionUnVarGraph‘)

calledMultipleTimes :: CallArityRes → CallArityRes
calledMultipleTimes res = first (const (completeGraph (domRes res))) res

both :: CallArityRes → CallArityRes → CallArityRes
both r1 r2 = addCrossCoCalls (domRes r1) (domRes r2) $ r1 ‘lubRes‘ r2

lubRes :: CallArityRes → CallArityRes → CallArityRes
lubRes (g1, ae1) (g2, ae2) = (g1 ‘unionUnVarGraph‘ g2, ae1 ‘lubArityEnv‘ ae2)

lubArityEnv :: VarEnv Arity → VarEnv Arity → VarEnv Arity
lubArityEnv = plusVarEnv_C min

lubRess :: [CallArityRes] → CallArityRes
lubRess = foldl lubRes emptyArityRes


Bibliography

[Abr90] Samson Abramsky. The lazy lambda calculus. Research topics in functional programming. Ed. by David A. Turner. Addison-Wesley, 1990. Chap. 4.

[AO93] Samson Abramsky and Chih-Hao Luke Ong. Full Abstraction in the Lazy Lambda Calculus. Information and Computation 105.2 (1993). DOI: 10.1006/inco.1993.1044.

[ABM07] David Aspinall, Lennart Beringer and Alberto Momigliano. Optimisation Validation. Compiler Optimisation Meets Compiler Verification (COCV) 2006. Vol. 176-3. ENTCS. 2007. DOI: 10.1016/j.entcs.2006.06.017.

[BKHT99] Clem Baker-Finch, David King, Jon Hall and Phil Trinder. An Operational Semantics for Parallel Call-by-Need. Tech. rep. 99/1. Faculty of Mathematics and Computing, The Open University, 1999.

[BKT00] Clem Baker-Finch, David King and Phil Trinder. An Operational Semantics for Parallel Lazy Evaluation. International Conference on Functional Programming (ICFP). ACM, 2000. DOI: 10.1145/351240.351256.


[Bal14] Clemens Ballarin. Locales: A Module System for Mathematical Theories. Journal of Automated Reasoning 52.2 (2014). DOI: 10.1007/s10817-013-9284-7.

[BA06] Sorav Bansal and Alex Aiken. Automatic Generation of Peephole Superoptimizers. Architectural Support for Programming Languages and Operating Systems (ASPLOS). ACM, 2006. DOI: 10.1145/1168857.1168906.

[Bir89] Richard Simpson Bird. Algebraic Identities for Program Calculation. Computer Journal 32.2 (1989). DOI: 10.1093/comjnl/32.2.122.

[Bre13] Joachim Breitner. The Correctness of Launchbury's Natural Semantics for Lazy Evaluation. Archive of Formal Proofs (Jan. 2013). URL: http://afp.sf.net/entries/Launchbury.shtml.

[Bre15a] Joachim Breitner. Call Arity. Trends in Functional Programming (TFP) 2014. Vol. 8843. LNCS. Springer, 2015. DOI: 10.1007/978-3-319-14675-1_3.

[Bre15b] Joachim Breitner. Formally proving a compiler transformation safe. Haskell Symposium. ACM, 2015. DOI: 10.1145/2804302.2804312.

[Bre15c] Joachim Breitner. The Adequacy of Launchbury's Natural Semantics for Lazy Evaluation. Preprint, submitted to the Journal of Functional Programming. 2015. URL: http://joachim-breitner.de/publications/LaunchburyAdequacy-preprint.pdf.

[Bre15d] Joachim Breitner. The Safety of Call Arity. Archive of Formal Proofs (Feb. 2015). URL: http://afp.sf.net/entries/Call_Arity.shtml.

[Bre16] Joachim Breitner. Lazy Evaluation: From natural semantics to a machine-checked compiler transformation. Ph.D. thesis. Karlsruhe Institute of Technology, 2016. DOI: 10.5445/IR/1000054251.


[BEPW14] Joachim Breitner, Richard A. Eisenberg, Simon Peyton Jones and Stephanie Weirich. Safe Zero-cost Coercions for Haskell. International Conference on Functional Programming (ICFP). ACM, 2014. DOI: 10.1145/2628136.2628141.

[BHMS13] Joachim Breitner, Brian Huffman, Neil Mitchell and Christian Sternagel. Certified HLints with Isabelle/HOLCF-Prelude. Haskell and Rewriting Techniques (HART). 2013. arXiv: 1306.1340.

[Buc15] Sebastian Buchwald. Optgen: A Generator for Local Optimizations. Compiler Construction. Vol. 9031. LNCS. Springer, 2015. DOI: 10.1007/978-3-662-46663-6_9.

[Chl10] Adam Chlipala. A Verified Compiler for an Impure Functional Language. Principles of Programming Languages (POPL). ACM, 2010. DOI: 10.1145/1706299.1706312.

[Coq04] The Coq development team. The Coq proof assistant reference manual. Version 8.0. LogiCal Project. 2004. URL: http://coq.inria.fr.

[Cou10] Duncan Coutts. Stream Fusion: Practical shortcut fusion for coinductive sequence types. Ph.D. thesis. University of Oxford, 2010.

[CLS07] Duncan Coutts, Roman Leshchinskiy and Don Stewart. Stream Fusion. From Lists to Streams to Nothing at All. International Conference on Functional Programming (ICFP). ACM, 2007. DOI: 10.1145/1291151.1291199.

[CB13] Charlie Curtsinger and Emery D. Berger. STABILIZER: Statistically Sound Performance Evaluation. Architectural Support for Programming Languages and Operating Systems (ASPLOS). ACM, 2013. DOI: 10.1145/2451116.2451141.

[DL07] Zaynah Dargaye and Xavier Leroy. Mechanized Verification of CPS Transformations. Logic for Programming, Artificial Intelligence, and Reasoning (LPAR). Vol. 4790. LNCS. Springer, 2007. DOI: 10.1007/978-3-540-75560-9_17.

213 Bibliography

[EM04] Marko van Eekelen and Maarten de Mol. Mixed lazy/strict graph semantics. Tech. rep. NIII-R0402. Radboud University Nijmegen, Jan. 2004. [Eis15] Richard Eisenberg. System FC, as implemented in GHC. Ver- sion 414e20b. 2015. URL: https://github.com/ghc/ghc/blob/m aster/docs/core-spec/core-spec.pdf (visited on 04/24/2015). [FHG14] Andrew Farmer, Christian Höner zu Siederdissen and An- drew John Gill. The HERMIT in the Stream: Fusing Stream Fu- sion’s concatMap. Partial Evaluation and Program Manipu- lation (PEPM). ACM, 2014. DOI: 10.1145/2543728.2543736. [Gam09] Peter Gammie. The Worker/Wrapper Transformation. Archive of Formal Proofs (Oct. 2009). URL: http://afp.sf.net/entries/ WorkerWrapper.shtml. [Gil96] Andrew John Gill. Cheap deforestation for non-strict func- tional languages. Ph.D. thesis. University of Glasgow, 1996. [GLP93] Andrew John Gill, John Launchbury and Simon Peyton Jones. A Short Cut to Deforestation. Functional Programming Languages and Computer Architecture (FPCA). ACM, 1993. DOI: 10.1145/165180.165214. [GS99] Jörgen Gustavsson and David Sands. A Foundation for Space-Safe Transformations of Call-by-Need Programs. Higher Order Operational Techniques in Semantics (HOOTS). Vol. 26. ENTCS. 1999. DOI: 10.1016/S1571-0661(05)80284-1. [GS01] Jörgen Gustavsson and David Sands. Possibilities and Limi- tations of Call-by-Need Space Improvement. International Con- ference on Functional Programming (ICFP). ACM, 2001. DOI: 10.1145/507635.507667. [HH14] Jennifer Hackett and Graham Hutton. Worker/Wrapper/ Makes It/Faster. International Conference on Functional Programming (ICFP). ACM, 2014. DOI: 10.1145/2628136. 2628142.

214 Bibliography

[Haf09] Florian Haftmann. Code Generation from Specifications in Higher Order Logic. Ph.D. thesis. Technische Universität München, 2009. [Haf10] Florian Haftmann. From Higher-order Logic to Haskell: There and Back Again. Partial Evaluation and Program Manipula- tion (PEPM). ACM, 2010. DOI: 10.1145/1706356.1706385. [HHM07] Jurriaan Hage, Stefan Holdermans and Arie Middelkoop. A generic usage analysis with subeffect qualifiers. International Conference on Functional Programming (ICFP). ACM, 2007. DOI: 10.1145/1291151.1291189. [HL15] Ralf Hinze and Andres Löh. lhs2tex: Preprocessor for type- setting Haskell sources with LaTeX. Version 1.19. Apr. 2015. URL: http://hackage.haskell.org/package/lhs2tex-1.19. [Huf12] Brian Huffman. HOLCF ’11: A Definitional Domain Theory for Verifying Functional Programs. Ph.D. thesis. Portland State University, 2012. [HK13] Brian Huffman and OndˇrejKunˇcar. Lifting and Transfer: A Modular Design for Quotients in Isabelle/HOL. Certified Programs and Proofs (CPP). Vol. 8307. LCNS. Springer, 2013. DOI: 10.1007/978-3-319-03545-1 9. [JNR02] Rajeev Joshi, Greg Nelson and Keith Randall. Denali: A Goal-directed Superoptimizer. ACM, 2002. DOI: 10.1145/5125 29.512566. [KMNO14] Ramana Kumar, Magnus O. Myreen, Michael Norrish and Scott Owens. CakeML: A Verified Implementation of ML. Prin- ciples of Programming Languages (POPL). ACM, 2014. DOI: 10.1145/2535838.2535841. [Lau93] John Launchbury. A Natural Semantics for Lazy Evaluation. Principles of Programming Languages (POPL). ACM, 1993. DOI: 10.1145/158511.158618.

215 Bibliography

[Ler06] Xavier Leroy. Formal certification of a compiler back-end, or: programming a compiler with a proof assistant. Principles of Programming Languages (POPL). ACM, 2006. DOI: 10 . 1145/1111037.1111042. [Ler12] Xavier Leroy. Mechanized Semantics for Compiler Verifica- tion. Asian Symposium on Programming Languages and Systems (APLAS). Vol. 7705. LNCS. Invited talk. Springer, 2012. DOI: 10.1007/978-3-642-35182-2 27. [Loc10] Andreas Lochbihler. Verifying a Compiler for Java Threads. European Symposium on Programming (ESOP). Vol. 6012. LNCS. Springer, 2010. DOI: 10.1007/978-3-642-11957-6 23. [LH14] Andreas Lochbihler and Johannes Hölzl. Recursive Func- tions on Lazy Lists via Domains and Topologies. Interactive Theorem Proving (ITP). Vol. 8558. LNCS (LNAI). Springer, 2014. DOI: 10.1007/978-3-319-08970-6 22. [Mar10] Simon Marlow, ed. Haskell 2010 Language Report. 2010. [MP06] Simon Marlow and Simon Peyton Jones17. Making a fast curry: push/enter vs. eval/apply for higher-order languages. Journal of Functional Programming 16.4-5 (2006). DOI: 10.1017/S0956796806005995. [MP12] Simon Marlow and Simon Peyton Jones17. The Glasgow Haskell Compiler. The Architecture of Open Source Appli- cations, Volume II. Ed. by Amy Brown and Greg Wilson. Lulu, 2012. Chap. 5. [Mid12] Jan Midtgaard. Control-flow Analysis of Functional Programs. ACM Computing Surveys (CSUR) 44.3 (June 2012). DOI: 10.1145/2187671.2187672. [MO03] Yasuhiko Minamide and Koji Okuma. Verifying CPS Trans- formations in Isabelle/HOL. Mechanized Reasoning About Languages with Variable Binding (MERLIN). ACM, 2003. DOI: 10.1145/976571.976576.

17 famously known as “The Simons”

216 Bibliography

[MS99] Andrew K. Moran and David Sands. Improvement in a Lazy Context: An Operational Theory for Call-By-Need. Principles of Programming Languages (POPL). ACM, 1999. DOI: 10. 1145/292540.292547. [MDHS09] Todd Mytkowicz, Amer Diwan, Matthias Hauswirth and Peter F. Sweeney. Producing Wrong Data Without Doing Anything Obviously Wrong! Architectural Support for Pro- gramming Languages and Operating Systems (ASPLOS). ACM, 2009. DOI: 10.1145/1508284.1508275. [Nak10] Keiko Nakata. Denotational semantics for of letrec: black holes as exceptions rather than divergence. Fixed Points in Computer Science (FICS). 2010. [NH09] Keiko Nakata and Masahito Hasegawa. Small-step and big- step semantics for call-by-need. Journal of Functional Pro- gramming 19.6 (2009). DOI: 10.1017/S0956796809990219. [NS07] Nicholas Nethercote and Julian Seward. Valgrind: A Frame- work for Heavyweight Dynamic Binary Instrumentation. Pro- gramming Language Design and Implementation (PLDI). ACM, 2007. DOI: 10.1145/1273442.1250746. [Nip02] Tobias Nipkow. Structured Proofs in Isar/HOL. Types for Proofs and Programs (TYPES). Vol. 2646. LNCS. Springer, 2002. DOI: 10.1007/3-540-39185-1 15. [NPW02] Tobias Nipkow, Lawrence C. Paulson and Markus Wen- zel. Isabelle/HOL – A Proof Assistant for Higher-Order Logic. Vol. 2283. LNCS. Springer, 2002. DOI: 10.1007/3-540-45949- 9. [OSu15] Bryan O’Sullivan. criterion: Robust, reliable performance mea- surement and analysis. Version 1.1.0.0. Mar. 2015. URL: http: //hackage.haskell.org/package/criterion-1.1.0.0. [Par93] Will Partain. The nofib Benchmark Suite of Haskell Programs. Functional Programming 1992. Workshops in Computing. Springer, 1993. DOI: 10.1007/978-1-4471-3215-8 17.

217 Bibliography

[Pey92] Simon Peyton Jones. Implementing Lazy Functional Lan- guages on Stock Hardware: The Spineless Tagless G-Machine. Journal of Functional Programming 2.2 (1992). DOI: 10. 1017/S0956796800000319. [Pey03] Simon Peyton Jones, ed. Haskell 98 Language and Libraries – The Revised Report. Journal of Functional Programming 13.1 (2003). [PM02] Simon Peyton Jones and Simon Marlow17. Secrets of the Glasgow Haskell Compiler Inliner. Journal of Functional Pro- gramming 12.5 (2002). DOI: 10.1017/S0956796802004331. [PTH01] Simon Peyton Jones, Andrew Tolmach and Tony Hoare. Playing by the rules: rewriting as a practical optimisation tech- nique in GHC. Haskell Workshop. 2001. [PVWW06] Simon Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich and Geoffrey Washburn. Simple Unification-based Type Inference for GADTs. International Conference on Func- tional Programming (ICFP). ACM, 2006. DOI: 10. 1145/ 1159803.1159811. [PB10] Maciej Pirog and Dariusz Biernacki. A Systematic Derivation of the STG Machine Verified in Coq. Haskell Symposium. ACM, 2010. DOI: 10.1145/1863523.1863528. [Pit03] Andrew M. Pitts. Nominal logic, a first order theory of names and binding. Vol. 186. Information and Computation 2. El- sevier, 2003. DOI: 10.1016/S0890-5401(03)00138-X. [RDP10] Norman Ramsey, João Dias and Simon Peyton Jones. Hoopl: A Modular, Reusable Library for Dataflow Analysis and Trans- formation. Haskell Symposium. ACM, 2010. DOI: 10.1145/ 1863523.1863539. [RH15] Tobias Rittweiler and Florian Haftmann. Haskabelle – con- verting Haskell source files to Isabelle/HOL theories. 2015. URL: http://isabelle.in.tum.de/haskabelle.html.

218 Bibliography

[SHO10] Lidia Sánchez-Gil, Mercedes Hidalgo-Herrero and Yolanda Ortega-Mallén. An Operational Semantics for Distributed Lazy Evaluation. Trends in Functional Programming (TFP) 2009. Intellect, 2010. [SHO11] Lidia Sánchez-Gil, Mercedes Hidalgo-Herrero and Yolanda Ortega-Mallén. Relating function spaces to resourced function spaces. Symposium on Applied Computing (SAC). ACM, 2011. DOI: 10.1145/1982185.1982469. [SHO12] Lidia Sánchez-Gil, Mercedes Hidalgo-Herrero and Yolanda Ortega-Mallén. A Locally Nameless Representation for a Nat- ural Semantics for Lazy Evaluation. Theoretical Aspects of Computing (ICTAC). Vol. 7521. LNCS. Springer, 2012. DOI: 10.1007/978-3-642-32943-2 8. [SHO14] Lidia Sánchez-Gil, Mercedes Hidalgo-Herrero and Yolanda Ortega-Mallén. Launchbury’s semantics revisited: On the equivalence of context-heap semantics (Work in progress). XIV Jornadas sobre Programación y Lenguajes (2014). [SHO15] Lidia Sánchez-Gil, Mercedes Hidalgo-Herrero and Yolanda Ortega-Mallén. The role of indirections in lazy natural se- mantics. Perspectives of System Informatics (PSI) 2014. Vol. 8974. LNCS. Springer, 2015. DOI: 10.1007/978-3-662- 46823-4 24. [San92] David Sands. Operational Theories of Improvement in Func- tional Languages (Extended Abstract). Glasgow Workshop on Functional Programming 1991. Workshops in Computing. Springer, 1992. DOI: 10.1007/978-1-4471-3196-0 24. [SPCS08] Tom Schrijvers, Simon Peyton Jones, Manuel M. T. Chak- ravarty and Martin Sulzmann. Type Checking with Open Type Functions. International Conference on Functional Programming (ICFP). ACM, 2008. DOI: 10.1145/1411204. 1411215.

219 Bibliography

[SVP14] Ilya Sergey, Dimitrios Vytiniotis and Simon Peyton Jones. Modular, Higher-order Cardinality Analysis in Theory and Prac- tice. Principles of Programming Languages (POPL). ACM, 2014. DOI: 10.1145/2535838.2535861. [Ses97] Peter Sestoft. Deriving a lazy abstract machine. Journal of Functional Programming 7.03 (1997). DOI: 10.1017/S09567 96897002712. [SCPD07] Martin Sulzmann, Manuel M. T. Chakravarty, Simon Pey- ton Jones and Kevin Donnelly. System F with type equality coercions. Types in Languages Design and Implementation (TLDI). ACM, 2007. DOI: 10.1145/1190315.1190324. [Sve02] Josef Svenningsson. Shortcut Fusion for Accumulating Pa- rameters & Zip-like Functions. International Conference on Functional Programming (ICFP). ACM, 2002. DOI: 10.1145/ 581478.581491. [Tak14] Akio Takano. Worker-Wrapper Fusion. Prototype. 2014. URL: https://github.com/takano-akio/ww-fusion (visited on 02/02/2014). [Tia06] Ye Henry Tian. Mechanically Verifying Correctness of CPS Compilation. Computing: The Australasian Theory Sympo- sium (CATS). Vol. 51. CRPIT. ACS, 2006, pp. 41–51. [UK12] Christian Urban and Cezary Kaliszyk. General Bindings and Alpha-Equivalence in Nominal Isabelle. Logical Methods in Computer Science 8.2 (2012). DOI: 10.2168/LMCS-8(2: 14)2012. [UT05] Christian Urban and Christine Tasson. Nominal Tech- niques in Isabelle/HOL. Automated Deduction – CADE-20. Vol. 3632. LNCS. Springer, 2005. DOI: 10.1007/11532231 4. [VSJVP14] Niki Vazou, Eric L. Seidel, Ranjit Jhala, Dimitrios Vytiniotis and Simon Peyton Jones. Refinement types for Haskell. Iter- national conference on Functional programming (ICFP). ACM, 2014. DOI: 10.1145/2628136.2628161.

220 Bibliography

[WVPZ11] Stephanie Weirich, Dimitrios Vytiniotis, Simon Peyton Jones and Steve Zdancewic. Generative type abstraction and type-level computation. Principles of Programming Lan- guages (POPL). ACM, 2011. DOI: 10.1145/1926385.1926411. [XP05] Dana N. Xu and Simon Peyton Jones. Arity Analysis. Work- ing notes. 2005. [ZNMZ13] Jianzhou Zhao, Santosh Nagarakatte, Milo M.K. Martin and Steve Zdancewic. Formal Verification of SSA-based Op- timizations for LLVM. ACM, 2013. DOI: 10.1145/2491956. 2462164.

221

Index

The following is a combined index and glossary. For each listed symbol, function or term, it gives the page of its first occurrence and often a brief explanation.

Symbols

[]
    empty stack ...... 69
    empty trace tree ...... 152
∞ ...... 146
    top element of Card
1 ...... 146
    called at most once, member of Card
( : ) ...... 69
    alternatives (on the stack)
$
    argument (on the stack) ...... 69
    singleton trace tree ...... 152
# ...... 69
    update marker (on the stack)
(ᾱ, α, α̇) ...... 137
    arity annotation
² ...... 95
    complete graph on a set of variables
• ...... 20
    permutation operation on psets (nominal logic)
↓CFn ...... 53
    projection function, CValue → (C → CValue) → (C → CValue)
· ...... 30
    application operator for continuous functions (HOLCF)
— ...... 94
    edge in a Graph
↓Fn ...... 40
    deconstruction function, Value → (Value → Value)
Λ ...... 80
    HOLCF’s continuous lambda operator
−
    removing a set of bindings from a configuration ...... 149
    subtracting a call from a cardinality environment ...... 146
⊕ ...... 151
    union of trace trees
⊗ ...... 151
    interleaving of trace trees
|
    on Graph ...... 95
    on environments ...... 40
    on functions taking a C ...... 52
    on trace trees ...... 152
\
    on Graph ...... 95
    on environments ...... 40
    on trace trees ...... 152
⊔
    on environments ...... 44
× ...... 94
    Cartesian product of two sets of variables
++ ...... 41
    restricted update operator on environments
↓B( , ) ...... 65
    deconstruction function, Value → (Value → Value → Value)
=| ...... 46
    restricted equality on semantic environments
arity annotation consistency ...... 137
.C ...... 149
    arity annotation consistency (with cardinality information)
.N ...... 137
    arity annotation consistency (without cardinality information)
→ ...... 30
    type of continuous functions (HOLCF)
⇀ ...... 36
    type of partial functions
⇒ ...... 69
    small step reduction (one step)
⇒# ...... 149
    forgetful small-step semantics
⇒∗ ...... 69
    small step reduction (many steps)
/ ...... 58
    relating Value and C → CValue
/.∗ ...... 58
    relating Heap and Var → (C → CValue)
⊑
    on (Var ⇀ N) ...... 96
    on C ...... 52
    on Graph ...... 95
    on TTree ...... 152
    on Value ...... 40
    on arities ...... 18
    on environments ...... 40
: ⇓ : ...... 36
    Launchbury’s natural semantics
⟦ ⟧ ...... 41
    denotational semantics of expressions
N⟦ ⟧ ...... 53
    resourced denotational semantics of expressions
{{ }} ...... 41
    denotational semantics of heaps
N{{ }} ...... 53
    resourced denotational semantics of heaps

Defined Terms

A( ) ...... 96, 134
    arity analysis
A( , ) ...... 133
    arity analysis of a binding
APP ...... 36
    natural semantics rule for application
APP’ ...... 60
    alternative natural semantics rule for application
APP1, APP2 ...... 69
    small step rules for application
args ...... 136
    number of arguments on top of the stack
at_base ...... 80
    class of types representing names (Nominal2)
B( ) ...... 65
    injection function, 2 → Value
C ...... 52
    injection function of domain C
C ...... 52
    domain of resources
C ...... 64
    Boolean constructors
c ...... 154
    projection from TTree to (Var → Card)
Card ...... 146
    three-element lattice
C( ) ...... 146
    cardinality analysis of a configuration
CFn( ) ...... 53
    injection function, ((C → CValue) → (C → CValue)) → CValue
C( , ) ...... 146
    cardinality analysis of a binding
CON ...... 65
    natural semantics rule for constructors
cpo ...... 30
    type class for complete partial orders (HOLCF)
CValue ...... 53
    resourced semantic domain
dom
    domain of a Graph ...... 94
    of a heap ...... 36
    of an environment ...... 40
Env ...... 40
    semantic environment
E( ) ...... 17
    eta-expansion
Exp ...... 35
    type of lambda expressions
fmap ...... 78
    type of partial functions with finite domain
Fn( ) ...... 40
    injection function, (Value → Value) → Value
fv ...... 34
    free variables
    in nominal logic ...... 21
g ...... 158
    the Graph approximation of a trace tree
G( ) ...... 96, 159
    co-call analysis
G( , ) ...... 159
    co-call analysis of a binding
Graph ...... 94, 96
    undirected graphs over the set of variables (see the sketch after this list)
Heap ...... 36
    type of heaps, bindings
IF1, IF2 ...... 69
    small step rules for case analysis
isVal ...... 36
LAM ...... 36
    natural semantics rule for lambda abstraction
LET ...... 36
    natural semantics rule for let bindings
LET1 ...... 69
    small step rule for let bindings
N ...... 17
    natural numbers including 0; used to represent arities
N⟦ ⟧ ...... 53
    resourced denotational semantics of expressions
N{{ }} ...... 53
    resourced denotational semantics of heaps
next ...... 151
    a specific immediate subtree of a trace tree
N( ) ...... 95
    neighbours of a node in a Graph
paths ...... 151
    the set of paths of a trace tree
pcpo ...... 30
    type class for pointed cpos (HOLCF)
po ...... 30
    type class for partial orders (HOLCF)
pt ...... 80
    class of values with names (Nominal2)
pure ...... 80
    class of types with no names (Nominal2)
s ...... 152
    combining the trace trees of a binding
t ...... 158
    the trace tree represented by a Graph
T( ) ...... 154
    trace tree analysis on heaps
T( ) ...... 154
    trace tree analysis
T( , ) ...... 154
    trace tree analysis on bindings
T( ) ...... 133
    arity transformation of an expression
T( ) ...... 137
    arity transformation of a configuration
T( ) ...... 133
    arity transformation of a heap
Ṫ( ) ...... 137
    arity transformation of a stack
TTree ...... 151
    trace trees
Val ...... 36
    type of value expressions, subset of Exp
Value ...... 39
    semantic domain
VAR ...... 36
    natural semantics rule for variables
Var ...... 35
    type of variables
VAR’ ...... 60
    alternative natural semantics rule for variables
VAR1, VAR2 ...... 69
    small step rules for variables
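
The entries Card, Graph, N( ), ² and × above concern the small structures that drive the co-call and cardinality analyses. The following Haskell sketch is purely illustrative: it is neither the thesis’s Isabelle formalisation nor GHC’s implementation, and all names in it (Card, Graph, lubCard, neighbours, complete, cross, unionG) are chosen for this example only. It shows one plausible concrete reading of a three-element cardinality lattice and of an undirected co-call graph with the operations named in the index.

    -- Illustrative sketch only: a three-element cardinality lattice and a
    -- naive undirected co-call graph over variables. The names mirror the
    -- index entries, but the representation is made up for this example.
    module CoCallSketch where

    import qualified Data.Set as Set

    type Var = String

    -- Card: a variable is called not at all, at most once, or possibly
    -- many times. The derived Ord instance happens to coincide with the
    -- lattice order None, then Once, then Many on top.
    data Card = None | Once | Many
      deriving (Eq, Ord, Show)

    -- Least upper bound on Card.
    lubCard :: Card -> Card -> Card
    lubCard = max

    -- An undirected graph over variables, stored as normalised pairs.
    newtype Graph = Graph (Set.Set (Var, Var))
      deriving (Eq, Show)

    -- Normalise an edge so that (x, y) and (y, x) denote the same edge.
    edge :: Var -> Var -> (Var, Var)
    edge x y = (min x y, max x y)

    -- Is there an edge between x and y?
    adjacent :: Graph -> Var -> Var -> Bool
    adjacent (Graph es) x y = edge x y `Set.member` es

    -- N(x): the neighbours of a variable.
    neighbours :: Graph -> Var -> Set.Set Var
    neighbours (Graph es) x =
      Set.fromList [ if a == x then b else a
                   | (a, b) <- Set.toList es, a == x || b == x ]

    -- The complete graph on a set of variables (including self-edges).
    complete :: Set.Set Var -> Graph
    complete vs =
      Graph (Set.fromList [ edge x y | x <- Set.toList vs, y <- Set.toList vs ])

    -- Cartesian product: every variable of s is connected to every one of t.
    cross :: Set.Set Var -> Set.Set Var -> Graph
    cross s t =
      Graph (Set.fromList [ edge x y | x <- Set.toList s, y <- Set.toList t ])

    -- Union of two co-call graphs.
    unionG :: Graph -> Graph -> Graph
    unionG (Graph a) (Graph b) = Graph (Set.union a b)

With this representation, complete (Set.fromList ["x", "y"]) relates x and y to each other and to themselves, which corresponds to the conservative result that everything may be called together and more than once.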

Glossary

arity ...... 17
arity annotation ...... 137
atom ...... 20
    a name (nominal logic)
balanced execution ...... 71
blackholing ...... 37
build ...... 13
    list fusion interface for list producers
closed judgement ...... 38
configuration ...... 69
    heap, expression and stack (small-step semantics)
consistent ...... 137
    arity annotation with regard to a configuration
de-Bruijn index ...... 19
equivariance ...... 23
    commutes with name permutations (nominal logic)
eta-expansion ...... 17
fixed-point induction ...... 30
foldr ...... 13
    list fusion interface for list consumers
fresh ...... 21, 34
GHC ...... 9
    Core ...... 10
    STG ...... 15
good consumer ...... 14
good producer ...... 13
HOLCF ...... 29
    Isabelle package for domain theory
Isabelle ...... 24
    HOL ...... 24
    HOLCF ...... 29
    Isar ...... 25
lazy evaluation ...... 7
list fusion ...... 13
    (see the sketch after this glossary)
nominal logic ...... 19
Nominal2 ...... 28
    Isabelle package for nominal logic
pointed domain ...... 29
    domain with a least element
pset ...... 20
    set with a permutation action (nominal logic)
pure ...... 20
    sets with empty carrier (nominal logic)
rewrite rule ...... 12
safe transformation ...... 134
sharing ...... 8
STG ...... 15
System FC ...... 11
thunk ...... 7
    deferred calculation during lazy evaluation
updating ...... 9
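
The glossary entries build, foldr, good producer, good consumer, rewrite rule and list fusion belong together. The following self-contained Haskell sketch, with invented names (myBuild, myFoldr, upto, total), shows the shape of that interface and of the rewrite rule; GHC’s actual definitions and its "fold/build" rule in GHC.Base follow essentially the same pattern.

    {-# LANGUAGE RankNTypes #-}
    -- Minimal illustration of the foldr/build list fusion interface.
    -- All names are made up for this sketch.
    module FusionSketch where

    -- A good producer abstracts over the list constructors (:) and [] ...
    myBuild :: (forall b. (a -> b -> b) -> b -> b) -> [a]
    myBuild g = g (:) []

    -- ... and a good consumer is written as a fold over them.
    myFoldr :: (a -> b -> b) -> b -> [a] -> b
    myFoldr _ z []     = z
    myFoldr k z (x:xs) = k x (myFoldr k z xs)

    -- The rewrite rule that lets the intermediate list disappear.
    {-# RULES
    "myFoldr/myBuild" forall k z (g :: forall b. (a -> b -> b) -> b -> b).
                      myFoldr k z (myBuild g) = g k z
      #-}

    -- A producer and a consumer written against this interface:
    upto :: Int -> Int -> [Int]
    upto lo hi = myBuild (\cons nil ->
      let go i | i > hi    = nil
               | otherwise = cons i (go (i + 1))
      in go lo)

    total :: Int -> Int -> Int
    total lo hi = myFoldr (+) 0 (upto lo hi)

Once the simplifier applies such a rule (and upto is inlined), a call like total 1 100 computes its sum directly from the generator without allocating the intermediate list, which is what makes a producer "good".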

LAZY EVALUATION

In order to solve a long-standing problem with list fusion, a new compiler transformation, “Call Arity”, is developed and implemented in the Haskell compiler GHC. It is formally proven not to degrade program performance; the proof is machine-checked using the interactive theorem prover Isabelle. To that end, a formalization of Launchbury’s Natural Semantics for Lazy Evaluation is modelled in Isabelle, including a correctness and adequacy proof.

ISBN 978-3-7315-0546-4
