Department of Computing Science Catamorphism-Based Program
Total Page:16
File Type:pdf, Size:1020Kb
Department of Computing Science Catamorphism-Based Program Transformations for Non-Strict Functional Languages L´aszl´oN´emeth A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computing Science at the University of Glasgow November 2000 c L´aszl´oN´emeth 2000 Abstract In functional languages intermediate data structures are often used as glue to connect separate parts of a program together. These intermediate data structures are useful because they allow modularity, but they are also a cause of inefficiency: each element need to be allocated, to be examined, and to be deallocated. Warm fusion is a program transformation technique which aims to eliminate intermediate data structures. Functions in a program are first transformed into the so called build-cata form, then fused via a one-step rewrite rule, the cata-build rule. In the process of the transformation to build-cata form we attempt to replace explicit recursion with a fixed pattern of recursion (catamorphism). We analyse in detail the problem of removing — possibly mutually recursive sets of — polynomial datatypes. We have implemented the warm fusion method in the Glasgow Haskell Compiler, which has allowed practical feedback. One important conclusion is that catamorphisms and fusion in general deserve a more prominent role in the compilation process. We give a detailed measurement of our implementation on a suite of real application programs. Contents Abstract ii 1 Introduction 1 1.1 Contributions................................... 2 1.2 Structureofthethesis . .. .. .. .. .. .. .. .. 3 2 State of the Art in Intermediate Data Structure Removal 5 2.1 Wadler’s schema deforestation . ..... 7 2.2 Therulesandstrategiesapproach. ..... 9 2.3 Listlessnessanddeforestation . ...... 10 2.4 Cheapdeforestation .............................. 12 2.5 Supercompilation................................ 13 2.6 WarmFusion ................................... 13 2.7 Warm Fusion (almost) without inlining . ...... 14 2.8 HyloFusion.................................... 15 2.9 Deforestationforfree. 16 2.10 Generic program transformation . ..... 16 3 The Theory of Warm Fusion 18 3.1 Preliminaries ................................... 19 3.2 Catamorphisms.................................. 20 3.3 Build........................................ 25 3.4 Thecorrectnessofbuildify. 25 iii Contents iv 4 The Practice of Warm Fusion I: The Basics 27 4.1 Informalintroductiontowarmfusion. ...... 27 4.1.1 Buildifyinformally . 29 4.1.2 Catifyinformally............................. 32 4.2 Definitions..................................... 36 4.3 Overviewofthemethod ............................. 37 4.3.1 Thepreprocessingstage . 38 4.3.2 Firststageoffusion ........................... 39 4.3.3 Buildifydetailed ............................. 41 4.3.4 Catifydetailed .............................. 42 4.3.5 Thesecondstage............................. 44 4.3.6 Cleaningup................................ 44 4.4 Discussion..................................... 44 4.4.1 Do catas deserve a special treatment or should they be ordinary Core bindings? ................................. 45 4.4.2 When should catas, maps and builds be derived? . 46 4.4.3 When to transform functions to build-cata form .......... 48 4.4.4 Buildify-catify vs catify-buildify . ....... 48 4.5 First-orderfusion............................... 49 4.5.1 Derivingmaps .............................. 49 4.5.2 Deriving catas: implementing the cata evaluation rule........ 51 4.5.3 Cata-Corerules.............................. 54 4.5.4 Buildify .................................. 55 4.5.5 Catify ................................... 60 5 The Practice of Warm Fusion II: Extensions 64 5.1 Functions with more than one argument . 64 5.1.1 Avoiding more than one argument . 65 5.1.2 Higher-ordercatas ............................ 67 5.1.3 Buildify .................................. 69 Contents v 5.1.4 Catify ................................... 70 5.1.5 Higher-orderfusion. 73 5.1.6 Static argument transformation . 74 5.2 Mutuallyrecursivedatatypes . 74 5.2.1 Derivingmaps .............................. 76 5.2.2 Derivingcatas .............................. 78 5.2.3 NewCata-Corerules........................... 79 5.2.4 Buildify .................................. 80 5.2.5 Catify ................................... 81 5.3 Thedynamicrewritesystem. 84 5.3.1 Thedetails ................................ 84 5.4 Standardisingargumentordering . ..... 86 5.5 Twopracticalissues .............................. 88 5.5.1 Separatecompilation. 88 5.5.2 Listcomprehensions ........................... 91 6 Measuring Warm Fusion 99 6.1 Measuringwarmfusion ............................. 99 6.2 Whatwewanttomeasure............................ 100 6.3 Howtomeasureit?................................ 101 6.4 Adetailedexample................................ 102 6.5 Thebenchmarks ................................. 117 6.6 Ashortanalysisofthebenchmarks. 117 6.7 Summary ..................................... 120 6.7.1 Thecontrolrun.............................. 120 6.7.2 Normalisedrun.............................. 121 6.7.3 Buildifyonly ............................... 123 6.7.4 Catifyonly ................................ 124 6.8 Conclusions .................................... 129 7 Conclusions and Further Work 136 Contents vi 7.1 FurtherWork................................... 137 7.1.1 Automatically deriving code from types . 137 7.1.2 Special abstract machine for fused programs . 138 7.1.3 Transparency of transformations . 138 7.1.4 Moreaggressiveinlining . 139 7.1.5 Fusion for datatypes with embedded functions . 139 7.1.6 Fegarasstylefolds ............................ 140 7.1.7 Monadicmaps,foldsandfusion. 140 7.1.8 Warmerfusion .............................. 141 A The Framework 142 A.1 Thecompiler(pre-warmfusion). 142 A.2 Thesimplifier................................... 144 A.3 Thecompiler(post-warmfusion) . 146 List of figures 2.1 Theideaofprogramtransformation . .... 6 2.2 Wadler’s algebraic deforestation rules . ......... 8 4.1 Overview of the fusion transformation . ...... 39 4.2 Rules for the interaction of catamorphisms and Core . ......... 55 5.1 Cata-Core rules in the presence of mutually recursive datatypes . 79 5.2 Semantics of Haskell list comprehensions . ........ 92 5.3 Traditional list comprehension desugaring scheme . ........... 93 5.4 Optimal list comprehension desugaring scheme . ........ 98 A.1 Glasgow Haskell Compiler passes . 149 A.2 SyntaxoftheCorelanguage. 150 vii List of tables 2.1 Summaryofdeforestationefforts . .... 7 6.1 Programsoftheimaginarysubset. 117 6.2 Programsofthespectralsubset . 118 6.3 Programs of the spectral subset: the Hartel Benchmarks . .......... 119 6.4 Programsoftherealsubset . 119 6.5 Controlrun: imaginarysubset. 121 6.6 Controlrun: theHartelBenchmarks . 121 6.7 Controlrun:spectralsubset. 122 6.8 Controlrun:therealsubset . 123 6.9 Normalise: imaginarysubset. 123 6.10 Normalise: the Hartel Benchmarks . 124 6.11 Normalise: spectralsubset. 125 6.12 Normalise: therealsubset . 126 6.13 A datatype making buildify too successful . ........ 126 6.14 Buildifyonly: imaginarysubset . 127 6.15 Buildify only: the Hartel Benchmarks . 127 6.16 Buildifyonly: spectralsubset . 128 6.17 Buildifyonly: therealsubset . 129 6.18 Catifyonly: imaginarysubset . 130 6.19 Catifyonly: spectralsubset . 131 6.20 Catify only: the Hartel Benchmarks . 132 6.21 Catifyonly: therealsubset . 132 viii List of tables ix 6.22 Buildify, catify and the cata-build rule: imaginary subset . 133 6.23 Buildify, catify and the cata-build rule: spectralsubset. 134 6.24 Buildify, catify and the cata-build rule: the Hartel Benchmarks . 135 6.25 Buildify, catify and the cata-build rule: therealsubset. 135 A.1 Localtransformations . 145 Acknowledgements I would like to thank my supervisor, Simon Peyton Jones for his support, encouragement, and for answering my endless stream of questions about GHC. I would also like to thank my previous supervisor, Phil Wadler who encouraged me to apply to Glasgow. I spent my second year at the Oregon Graduate Institute (Portland, Oregon). It was a terrible year — both personally and professionally — but this was not entirely their fault. The country, the climate, Rock Creek, and myself also had prominent roles in it. The person I need to thank from this period is John Launchbury who said ’yes’ to a question which, at that time, seemed very important to me. As a consequence of my time in the US (and probably the winter months in Glasgow) I developed a taste for living in sunny places. Thanks to Limsoon Wong of Kent Ridge Digital Laboratories, Singapore for giving me the opportunity to work in Singapore for three months and to discover the pleasures of eating a real mango and laksa. For similar reasons (a fine mixture of sunshine and wild night life), I would like to thank Wai Wong of the Hong Kong Baptist University for the six months I worked there. It was a refreshing break from hacking GHC: I was hacking HOL instead. I would also like to thank my long-suffering officemates, flatmates and all sorts of other people I met for bearing with me during long, dark periods of depression. I know I hurt a lot of them so besides my thanks I would also like to apologise. Special thanks go to Mark Shields, office and flatmate, who actually saved my life. Joy Goodman read the first draft of this thesis and corrected many of my mistakes. Manuel Chakravarty released his haskell.sty for LATEX just in time to let me typeset the code appearing in this thesis. The friendship