Graph Reduction Without Pointers
TR89-045, December 1989

William Daniel Partain
The University of North Carolina at Chapel Hill
Department of Computer Science
CB#3175, Sitterson Hall
Chapel Hill, NC 27599-3175

UNC is an Equal Opportunity/Affirmative Action Institution.

Graph Reduction Without Pointers

by William Daniel Partain

A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science. Chapel Hill, 1989.

Approved by: Jan F. Prins, reader; Donald F. Stanat, reader.

©1989 William D. Partain. ALL RIGHTS RESERVED.

WILLIAM DANIEL PARTAIN. Graph Reduction Without Pointers (Under the direction of Gyula A. Magó.)

Abstract

Graph reduction is one way to overcome the exponential space blow-ups that simple normal-order evaluation of the lambda-calculus is likely to suffer. The lambda-calculus underlies lazy functional programming languages, which offer hope for improved programmer productivity based on stronger mathematical underpinnings. Because functional languages seem well-suited to highly-parallel machine implementations, graph reduction is often chosen as the basis for these machines' designs.

Inherent to graph reduction is a commonly-accessible store holding nodes referenced through "pointers," unique global identifiers; graph operations cannot guarantee that nodes directly connected in the graph will be in nearby store locations. This absence of locality is inimical to parallel computers, which prefer isolated pieces of hardware working on self-contained parts of a program.

In this dissertation, I develop an alternate reduction system using "suspensions" (delayed substitutions), with terms represented as trees and variables by their binding indices (de Bruijn numbers). Global pointers do not exist, and all operations, except searching for redexes, are entirely local.
The system is provably equivalent to graph reduction, step for step. I show that if this kind of interpreter is implemented on a highly-parallel machine with a locality-preserving, linear program representation and fast scan primitives (an FFP Machine is an appropriate architecture), then the interpreter's worst-case space complexity is the same as that of a graph reducer (that is, equivalent sharing), and its time complexity falls short in only one unimportant case. On the other side of the ledger, graph operations that involve chaining through many pointers are often replaced with a single associative-matching operation. What is more, this system has no difficulty with free variables in redexes and is good for reduction to full beta-normal form. These results suggest that non-naive tree reduction is an approach to supporting functional programming that a parallel-computer architect should not overlook.

Preliminaries

Programming language. I support some of my descriptions by showing an implementation encoded in Standard ML, set in a sans serif font. Appendix A is a reader's introduction to ML and defines the supporting routines for the programs in the dissertation proper. I chose ML because the compiler from Bell Laboratories [8] was the best-implemented functional language available to me. The code shown is directly extracted from working programs.

A stylistic matter. I veer from the usual habit of calling myself "we," siding with E. B. White:

It is almost impossible to write anything decent using the editorial "we," unless you are the Dionne family. Anonymity, plus the "we," gives a writer a cloak of dishonesty, and he finds himself going around, like a masked reveler at a ball, kissing all the pretty girls [84, page 121].

Acknowledgments. I am not sure how I got into the Ph.D. business, but I know how I got through it.
The Team Magó members have been the best of colleagues, notably David Middleton, initiator into the Mysteries; Edoardo Biagioni, with his startling imagination; and Bruce Smith, with his breadth of understanding and good sense. Charles Molnar and his group added more than a little spice through their collaboration with us. Vern Chi and the Microelectronics Systems Lab have provided superb computing facilities, even if I was mainly an intruder. "Net people" provided many small helps and assurances; Paul Watson went beyond the call of duty by sending a copy of his hard-to-get thesis. Bharat Jayaraman and Rick Snodgrass, former committee members, read drafts even after they had decided to leave; that is conscientiousness! György Révész provided a needed boost during his visit from the T. J. Watson Research Center.

As for sanity, my family and friends have served admirably as bouncers for Dorothea Dix hospital, including my father, who has hounded me mercilessly about finishing, and my mother, who made a point not to. Ed McKenzie was as good a lunch crony as one could hope to find, but I will never forgive him for completing his degree in only four years. Phil and Paige LeMasters failed to disguise their deliberate effort to keep me in contact with the world outside Sitterson Hall.

My noble and ennobling committee (Gyula Magó, David Plaisted, Jan Prins, Don Stanat, and Jennifer Welch) have bravely weathered the drafts from room 327 and have vastly improved my material. Prof. Stanat lured me into the Department and has stood behind his mistakes, even as I flamed the faculty and wrote purple prose into proposals. Prof. Magó has enhanced his reputation as the best thesis advisor in the Department, unbuffeted by the fashions of graduate-student whims, tolerant of not-always-serious meetings, readily available for consultation, and incisive (but not opaque) in his critique. I offer my heartfelt thanks to each one. I am grateful to the U. S.
Army Research Office, which provided financial support for this work through an Army Science and Technology Fellowship (grant number DAAL03-86-G-0050).

Comments. I welcome your comments and corrections. My e-mail address is partain@cs.unc.edu, and paper mail will reach me via the Computer Science Department, UNC, Sitterson Hall, Chapel Hill, NC 27599-3175. Electronic comments sent to Prof. Magó (mago@cs.unc.edu) will be forwarded to me even after I leave UNC.

Contents

1 The problem
1.1 Motivation
1.2 Thesis statement
1.3 Dissertation organization

2 The λ-calculus
2.1 Syntax
2.2 Computing with the λ-calculus
2.3 Other λ-calculus reduction rules
2.4 β-normal form and normal-order reduction
2.5 Other normal forms and evaluation orders
2.6 The name-free λ-calculus
2.7 Combinators
2.8 The practical use of the λ-calculus
2.9 The necessity of sharing for normal-order evaluation

3 Graph reduction: the λg-interpreter
3.1 Graph structure and terminology
3.2 Finding a redex and βg-reduction
3.3 Sharing free expressions and lazy copying
3.4 More on variable bindings
3.4.1 Graph reduction with binding indices
3.4.2 Wadsworth's use of backpointers
3.5 Graph-reduction architectures and sharing
3.6 Graph rewriting

4 Reduction with suspensions: the λs-interpreter
4.1 λs-term structure and terminology
4.2 βs-reduction: the βs rule
4.3 Searching for the next redex
4.4 Lazy copying of shared rators
4.5 Tidying λs-terms
4.5.1 Removing useless suspensions
4.5.2 Removing trivial suspensions
4.5.3 Moving λs-abstractions above suspensions
4.5.4 Rotating suspensions
4.5.5 Upward and downward s-connections
4.5.6 Constraints on moving suspensions
4.5.7 Tidying: definition and important properties
4.5.8 The recurring example on the λs-interpreter
4.6 α-equivalence of λs-terms
4.7 Equivalence to graph reduction: correctness
4.8 Related approaches to λ-calculus evaluation
4.8.1 Efforts to find simpler reduction rules
4.8.2 Comparison with environment-based evaluation
4.8.3 Environment/reduction hybrids
4.8.4 Pointers versus λs-pointers

5 The λs-interpreter on an FFP Machine
5.1 Introduction to the FFP Machine
5.1.1 Project history and design goals
5.1.2 Basic structure of an FFP Machine
5.1.3 Communication and partitioning
5.1.4 Broadcasting and sorting operations
5.1.5 Single-result operations
5.1.6 Scan, or parallel prefix, operations
5.1.7 Computing level numbers
5.1.8 Calculating exotic level numbers
5.1.9 Calculating selectors and first/last bits
5.1.10 Low-level programming style in an FFP Machine
5.1.11 Copying and sharing in an FFP Machine
5.1.12 Related non-graph-reduction architectures
5.2 An implementation of a λs-interpreter
5.2.1 Basic algorithms
5.2.2 Controlling the interpreter
5.2.3 Reducing to β-normal or root-lambda form
5.2.4 βs-reduction and local tidying
5.2.5 Nonlocal tidying
5.2.6 Global checking
5.2.7 Summary of the FFP Machine implementation
5.3 Equivalence to graph reduction: efficiency
5.3.1 Space complexity
5.3.2 Time complexity
5.4 Previous FFP Machine implementations of the λ-calculus

6 Embellishments
6.1 Suspension lists
6.2 Dealing with recursion
6.3 Exotic suspensions
6.4 More parallelism
6.5 Ordering in λs-terms and supercombinators
6.6 Speculative copying
6.7 Going further with binding indices

7 Conclusions

A Programming with ML
A.1 The one-minute ML programmer
A.2 Utility functions