Self-Adjusting Computation
Umut A. Acar
May 2005
CMU-CS-05-129
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
Thesis Committee:
Guy Blelloch, co-chair
Robert Harper, co-chair
Daniel Dominic Kaplan Sleator
Simon Peyton Jones, Microsoft Research, Cambridge, UK
Robert Endre Tarjan, Princeton University
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
© 2005 Umut A. Acar
This research was sponsored in part by the National Science Foundation under grants CCR-0085982, CCR-0122581, and EIA-9706572, and the Department of Energy under contract no. DE-FG02-91ER40682. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government, or any other entity.

Keywords: Self-adjusting computation, dynamic algorithms, dynamic data structures, kinetic data structures, dynamic dependence graphs, memoization, change propagation, trace stability, functional programming, lambda calculus, type systems, operational semantics, modifiable references, selective memoization, sorting, convex hulls, parallel tree contraction, dynamic trees, rake-and-compress trees.

Abstract
This thesis investigates a model of computation, called self-adjusting computation, where computations adjust to any external change to their data (state) automatically. The external changes can change any data (e.g., the input) or the decisions made during the computation. For example, a self-adjusting program can compute a property of a dynamically changing set of objects, or of a set of moving objects. This thesis presents algorithmic and programming-language techniques for devising, analyzing, and implementing self-adjusting programs. From the algorithmic perspective, we describe novel data structures for tracking the dependences in a computation and a change-propagation algorithm for adjusting computations to changes. We show that the overhead of our dependence-tracking techniques is O(1). To determine the effectiveness of change propagation, we present an analysis technique, called trace stability, and apply it to a number of applications. From the languages perspective, we describe language facilities for writing self-adjusting programs in a type-safe and correct manner. The techniques make writing self-adjusting programs nearly as easy as writing ordinary (non-self-adjusting) programs. A key property of the techniques is that they enable the programmer to control the cost of dependence tracking by applying it selectively. Using language techniques, we also formalize the change-propagation algorithm and prove that it is correct. We demonstrate that our techniques are efficient both in theory and in practice by considering a number of applications. Our applications include a random-sampling algorithm on lists, the quick sort and merge sort algorithms, the Graham's Scan and quick hull algorithms for planar convex hulls, and the tree-contraction algorithm. From the theoretical perspective, we apply trace stability to our applications and show complexity bounds that are within an expected constant factor of the best bounds achieved by special-purpose algorithms.
From the practical perspective, we implement a general-purpose library for writing self-adjusting programs, and implement and evaluate self-adjusting versions of our applications. Our experiments show that our techniques dramatically simplify writing self-adjusting programs, and can yield very good performance even when compared to special-purpose algorithms, both in theory and in practice.
For my parents Dürdane and İsmail Acar
Acknowledgments
This thesis would not have been possible without the support and encouragement of my advisors Guy Blelloch and Robert Harper. Many of the results in this thesis have come out of long conversations and meetings with Bob and Guy. I thank you both very much. I thank Simon Peyton Jones and Bob Tarjan for supporting my research endeavors and for giving me feedback on the thesis. Simon's feedback was critical in tying the different parts of the thesis together. Bob's feedback helped simplify the first three parts of the thesis and started us thinking about some interesting questions. My colleague Maverick Woo helped with the proofs in the third part of the thesis, especially on the stability of tree contraction. Maverick also uncovered a body of related work that we were not aware of. I thank Renato Werneck (of Princeton) for giving us his code for the Link-Cut trees, and for many discussions about dynamic-trees data structures. I thank Jernej Barbic for his help with the graphics library for visualizing kinetic convex hulls. I have had the chance to work with bright young researchers at CMU. Jorge Vittes (now at Stanford) did a lot of the hard work in implementing our libraries for kinetic data structures and for dynamic trees. Kanat Tangwongsan helped implement key parts of our SML library and helped with the experiments. I thank Jorge and Kanat for their enthusiasm and dedication. Matthias Blume (of Toyota Technological Institute) helped with the implementation of the SML library. Matthias and John Reppy (of the University of Chicago) helped in figuring out the intricacies of the SML/NJ garbage collector. Many colleagues and friends from CMU enriched my years in graduate school.
I thank them for their friendship: Konstantin Andreev, Jernej Barbic, Paul Bennett, Mihai Budiu, Armand Debruge, Terry Derosia, Derek Dreyer, Jun Gao, Mor Harchol-Balter, Stavros Harizopolous, Rose Hoberman, Laurie Hiyakumoto, Yiannis Koutis, Gary Miller, Aleks Nanevski, Alina Oprea, Florin Oprea, Lina Papadaki, Spiros Papadimitriou, Sanjay Rao, Daniel Spoonhower, Srinath Sridhar, Mike Vandeweghe, Virginia Vassilevska, Maverick Woo, Shuheng Zhou. Outside CMU, I thank Yasemin Altun (of TTI), Görkem Çelik (of UBC), Thu Doan (of IBM Austin), Burak Erdogan (of UIUC), and Ram Mettu (of Dartmouth College) for their friendship and for the great times biking, rock-climbing, snowboarding, or just enjoying ourselves. I had great fun hanging out with the RISD crowd; thanks to you Jenni Katajamäki, Celeste Mink, Quinn Shamlian, Amy Stein, and Jared Zimmerman. I thank my family for their unwavering support. My parents Dürdane and İsmail, my brother Uğur, and my sister Aslı have always been there when I needed them. Finally, I thank my wife Melissa for her love and support over the years and for her help in proofreading this thesis.
Contents
1 Introduction 1 1.1 Overview and Contributions of this Thesis ...... 4 1.1.1 Part I: Algorithms and Data Structures ...... 4 1.1.2 Part II: Trace Stability ...... 5 1.1.3 Part III: Applications ...... 5 1.1.4 Part IV: Language Techniques ...... 5 1.1.5 Part V: Implementation and Experiments ...... 6
2 Related Work 9 2.1 Algorithms Community ...... 9 2.1.1 Design and Analysis Techniques ...... 10 2.1.2 Limitations ...... 10 2.1.3 This Thesis ...... 12 2.2 Programming Languages Community ...... 13 2.2.1 Static Dependence Graphs ...... 13 2.2.2 Memoization ...... 13 2.2.3 Partial Evaluation ...... 14 2.2.4 This Thesis ...... 14
I Algorithms and Data Structures 17
3 The Machine Model 21 3.1 The Closure Machine ...... 22 3.2 The Normal Form ...... 24 3.2.1 The Correspondence between a Program and its Normal Form ...... 26
4 Dynamic Dependence Graphs 29 4.1 Dynamic Dependence Graphs ...... 29
4.2 Virtual Clock and the Order Maintenance Data Structure ...... 30 4.3 Constructing Dynamic Dependence Graphs...... 31 4.4 Change Propagation ...... 33 4.4.1 Example Change Propagation ...... 35
5 Memoized Dynamic Dependence Graphs 37 5.1 Limitations of Dynamic Dependence Graphs ...... 38 5.2 Memoized Dynamic Dependence Graphs ...... 39 5.3 Constructing Memoized Dynamic Dependence Graphs...... 40 5.4 Memoized Change Propagation ...... 42 5.5 Discussions ...... 44
6 Data Structures and Analysis 45 6.1 Data Structures ...... 45 6.1.1 The Virtual Clock ...... 45 6.1.2 Dynamic Dependence Graphs ...... 45 6.1.3 Memo Tables ...... 46 6.1.4 Priority Queues ...... 46 6.2 Analysis ...... 46
II Trace Stability 51
7 Traces and Trace Distance 55 7.1 Traces, Cognates, and Trace Distance ...... 55 7.2 Intrinsic (Minimum) Trace Distance ...... 58 7.3 Monotone Traces and Change Propagation ...... 58 7.3.1 Change Propagation for Monotone Traces ...... 60
8 Trace Stability 65 8.1 Trace Models, Input Changes, and Trace Stability ...... 65 8.2 Bounding the priority queue overhead ...... 67 8.2.1 Dependence width ...... 67 8.2.2 Read-Write Regular Computations ...... 69 8.3 Trace Stability Theorems ...... 71
III Applications 73
9 List Algorithms 77
9.1 Combining with an Arbitrary Associative Operator ...... 77 9.1.1 Analysis ...... 78 9.2 Merge sort ...... 82 9.2.1 Analysis ...... 83 9.3 Quick Sort ...... 85 9.3.1 Analysis ...... 86 9.4 Graham’s Scan ...... 91 9.4.1 Analysis ...... 91
10 Tree Contraction 97 10.1 Tree Contraction ...... 97 10.2 Self-Adjusting Tree Contraction ...... 99 10.3 Trace Stability ...... 101
IV Language Techniques 107
11 Adaptive Functional Programming 111 11.1 Introduction ...... 112 11.2 A Framework for Adaptive Computing ...... 113 11.2.1 The ML library ...... 113 11.2.2 Making an Application Adaptive ...... 114 11.2.3 Adaptivity ...... 114 11.2.4 Dynamic Dependence Graphs ...... 116 11.2.5 Change Propagation ...... 117 11.2.6 The ML Implementation ...... 120 11.3 An Adaptive Functional Language ...... 122 11.3.1 Abstract Syntax ...... 122 11.3.2 Static Semantics ...... 124 11.3.3 Dynamic Semantics ...... 124 11.4 Type Safety of AFL ...... 128 11.4.1 Location Typings ...... 129 11.4.2 Trace Typing ...... 130 11.4.3 Type Preservation ...... 131 11.4.4 Type Safety for AFL ...... 135 11.5 Change Propagation ...... 136 11.5.1 Type Safety ...... 138 11.5.2 Correctness ...... 140
12 Selective Memoization 153 12.1 Introduction ...... 153 12.2 Background and Related Work ...... 154 12.3 A Framework for Selective Memoization ...... 155 12.4 The MFL Language ...... 161 12.4.1 Abstract Syntax ...... 161 12.4.2 Static Semantics ...... 162 12.4.3 Dynamic Semantics ...... 163 12.4.4 Soundness of MFL ...... 165 12.4.5 Performance ...... 170 12.5 Implementation ...... 170 12.6 Discussion ...... 174
13 Self-Adjusting Functional Programming 177 13.1 The Language ...... 178 13.2 An Example: Quicksort ...... 181 13.3 Static Semantics ...... 182 13.4 Dynamic Semantics ...... 187 13.5 Type Safety ...... 191 13.6 Correctness ...... 191 13.7 Performance ...... 192
14 Imperative Self-Adjusting Programming 197 14.1 The Language ...... 197 14.1.1 Static Semantics ...... 198 14.1.2 Dynamic Semantics ...... 199 14.2 Change Propagation ...... 201
V Implementation and Experiments 209
15 A General-Purpose Library 213 15.1 The Library ...... 213 15.2 Applications ...... 216 15.2.1 Modifiable Lists ...... 216 15.2.2 Combining Values in a List ...... 217 15.2.3 Sorting ...... 217 15.2.4 Convex Hulls ...... 218 15.3 Implementation and Experiments ...... 226
15.3.1 Implementation ...... 226 15.3.2 Experiments ...... 226 15.3.3 Generation of inputs ...... 227 15.3.4 Experimental Results ...... 227
16 Kinetic Data Structures 233 16.1 Background ...... 233 16.2 Our Work ...... 234
17 Dynamic Trees 237 17.1 Overview ...... 238 17.2 Rake-and-Compress Trees ...... 239 17.2.1 Tree Contraction and RC-Trees ...... 239 17.2.2 Static Trees and Dynamic Queries ...... 242 17.2.3 Dynamic Trees and Dynamic Queries ...... 243 17.3 Applications ...... 243 17.3.1 Path Queries ...... 244 17.3.2 Subtree Queries ...... 245 17.3.3 Diameter ...... 245 17.3.4 Distance to the Nearest Marked Vertex ...... 245 17.3.5 Centers and Medians ...... 246 17.3.6 Least Common Ancestors ...... 246 17.4 Implementation and the Experimental Setup ...... 247 17.4.1 The Implementation ...... 247 17.4.2 Generation of Input Forests ...... 248 17.4.3 Generation of Operation Sequences ...... 248 17.5 Experimental Results ...... 249 17.5.1 Change Propagation ...... 249 17.5.2 Batch Change Propagation ...... 250 17.5.3 Application-Specific Queries ...... 252 17.5.4 Application-Specific Data Changes ...... 254 17.5.5 Comparison to Link-Cut Trees ...... 254 17.5.6 Semi-Dynamic Minimum Spanning Trees ...... 256 17.5.7 Max-Flow Algorithms ...... 257
18 Conclusion 259 18.1 Future Work ...... 260
A The Implementation 263
A.1 Boxes ...... 264 A.2 Combinators ...... 264 A.3 Memo Tables ...... 266 A.4 Meta Operations ...... 266 A.5 Modifiables ...... 267 A.6 Order Maintenance ...... 270 A.7 Priority Queues ...... 274

List of Figures
3.1 The instruction set...... 23 3.2 Transforming function to destination-passing style...... 25 3.3 Splitting function calls at reads and jumps...... 26 3.4 An ordinary program and its normal-form...... 27 3.5 Example call tree for a native program and its primitive version...... 27
4.1 An example dynamic dependence graph...... 30 4.2 The operations on time stamps...... 31 4.3 The pseudo-code for write, call, and read instructions...... 32 4.4 The change-propagation algorithm...... 34 4.5 Dynamic dependence graphs before and after change propagation...... 35
5.1 Dynamic dependence graphs before and after change propagation...... 38 5.2 Call trees in change propagation...... 39 5.3 The pseudo-code for write, read, ccall, and call instructions...... 41 5.4 The change-propagation algorithm...... 43
7.1 Example traces...... 56 7.2 Example monotone traces...... 60 7.3 The primitives of uˆ and vˆ and their ancestors...... 61 7.4 The call trees of a computation before and after a change...... 62
9.1 Combining with the associative operator + and with identity element Identity...... 78 9.2 The lists at each round before and after inserting the key f with value 7...... 79 9.3 Merge Sort ...... 83 9.4 The code for quick sort ...... 86 9.5 Pivot trees for quick sort before and after the deletion of q...... 86 9.6 Code for Graham’s Scan...... 92 9.7 Inserting p∗ into the hull...... 93
9.8 The half planes defined by the directed lines de, ap∗, ef, and p∗b...... 94
10.1 Randomized tree-contraction algorithm of Miller and Reif...... 98 10.2 Randomized tree-contraction in the closure model...... 100 10.3 An example trace of tree-contraction at some round...... 101
11.1 Signature of the adaptive library...... 113 11.2 The complete code for non-adaptive (left) and adaptive (right) Quicksort...... 115 11.3 Example of changing input and change propagation for Quicksort...... 116 11.4 The DDG for an application of filter' to the modifiable list 2::3::nil...... 117 11.5 The change-propagation algorithm...... 118 11.6 Snapshots of the DDG during change propagation...... 119 11.7 The adaptive code for factorial...... 119 11.8 The signature for virtual clocks...... 120 11.9 The implementation of the adaptive library...... 121 11.10 The abstract syntax of AFL...... 123 11.11 Function sum written with the ML library (top), and in AFL (bottom)...... 123 11.12 Typing of stable expressions...... 125 11.13 Typing of changeable expressions...... 126 11.14 Evaluation of stable expressions...... 127 11.15 Evaluation of changeable expressions...... 128 11.16 Typing of Traces...... 131 11.17 Change propagation rules (stable and changeable)...... 137 11.18 Change propagation simulates a complete re-evaluation...... 140 11.19 Application of a partial bijection B to values, and stable and changeable expression...... 142
12.1 Fibonacci and expressing partial dependences...... 156 12.2 Memo tables for memoized Knapsack can be discarded at completion...... 158 12.3 The Quicksort algorithm...... 159 12.4 Quicksort's recursion tree with inputs L = [15, 30, 26, 1, 3, 16, 27, 9, 35, 4, 46, 23, 11, 42, 19] (left) and L′ = [20, L] (right)...... 160 12.5 The abstract syntax of MFL...... 161 12.6 Typing judgments for terms...... 163 12.7 Typing judgments for expressions...... 164 12.8 Evaluation of terms...... 166 12.9 Evaluation of expressions...... 167 12.10 The signatures for the memo library and boxes...... 171 12.11 The implementation of the memoization library...... 172
12.12 Examples from Section 12.3 in the SML library...... 175
13.1 The abstract syntax of SLf...... 179 13.2 The code for ordinary, adaptive, and self-adjusting filter functions f...... 181 13.3 The complete code for ordinary (left) and self-adjusting (right) Quicksort...... 183 13.4 Typing of values...... 184 13.5 Typing of stable (top) and changeable (bottom) terms...... 185 13.6 Typing of stable (top) and changeable (bottom) expressions...... 186 13.7 Evaluation of stable terms...... 189 13.8 Evaluation of changeable terms...... 190 13.9 Evaluation of stable expressions...... 193 13.10 Evaluation of changeable expressions...... 194 13.11 The rules for memo look up for changeable (top) and stable (bottom) traces...... 195 13.12 Change propagation for stable (top) and changeable (bottom) traces...... 196
14.1 The abstract syntax of SLi...... 198 14.2 Typing judgments for values...... 199 14.3 Typing of terms (top) and expressions (bottom)...... 203 14.4 Evaluation of terms...... 204 14.5 Evaluation of expressions...... 205 14.6 The rules for memo look up...... 206 14.7 Change propagation...... 207
15.1 Signatures for boxed values, combinators, and the meta operations...... 214 15.2 The signature for modifiable lists and an implementation...... 220 15.3 Self-adjusting list combine...... 221 15.4 Self-adjusting quick sort...... 222 15.5 Self-adjusting merge sort...... 223 15.6 Self-adjusting Graham Scan...... 224 15.7 Self-adjusting Quick Hull...... 225 15.8 Timings for quick sort...... 229 15.9 Timings for merge sort...... 230 15.10 Timings for Graham's Scan...... 231 15.11 Timings for quick hull...... 232
16.1 Snapshots from a kinetic quickhull simulation...... 235
17.1 A weighted tree...... 240 17.2 An example tree-contraction...... 240
17.3 A clustering...... 241 17.4 An RC-Tree...... 241 17.5 An RC-Tree with tags...... 244 17.6 Change Propagation & chain factor...... 250 17.7 Change propagation & input size...... 251 17.8 Batch change propagation...... 251 17.9 Queries versus chain factor and input size...... 252 17.10 Weight changes vs chain factor & forest size...... 253 17.11 Link and cut operations...... 255 17.12 Data Changes...... 255 17.13 Path queries...... 255 17.14 Semi-dynamic MST...... 256 17.15 Max-flow ...... 257

Chapter 1

Introduction
This thesis investigates a model of computation, called self-adjusting computation, where computations adjust to any external change to their data (state) automatically. The external changes can take any form. For example, they can change the input or the outcome of decisions made during the computation. As an example, consider the convex-hull problem from computational geometry. This problem requires finding the convex hull of a given set of points, i.e., the smallest polygon enclosing the points. A self-adjusting convex-hull computation enables the user to insert points into, or delete points from, the input while the computation adjusts the output to each change. Like a dynamic convex-hull algorithm [75, 69, 20], this computation can compute the convex hull of a dynamically changing set of points. Similarly, a self-adjusting convex-hull computation enables the user to change the outcome of a computed predicate. Like a kinetic convex-hull data structure [13], this computation can be used to compute the convex hull of a set of points in motion. The problem of adjusting computations to external changes has been studied extensively in both the algorithms community and the programming-languages community. The algorithms community has focused on devising so-called dynamic algorithms (or data structures) and kinetic algorithms (or data structures) that would give the fastest update times under external changes. Although dynamic/kinetic algorithms often achieve fast, even optimal, updates, they have some important limitations that make them difficult to use in practice.
Inherent complexity: Dynamic algorithms can be very difficult to devise, even for problems that are easy in the static case (when the input does not change). Furthermore, dynamic algorithms are often very sensitive to the definition of the problem. For example, a problem can be easy for certain kinds of changes but very difficult for others, or a slight extension to the problem can require substantial revisions to the algorithm. As an example, consider the problem of computing the minimum spanning tree (MST) of a graph. In the static case (when the graph does not change), giving an algorithm for computing MSTs is easy. When the graph is changed by insertions only, it is also easy to give an algorithm for MSTs. When the graph is changed by deletions, however, the problem becomes very difficult. Indeed, it took nearly two decades [36, 33, 45, 46, 49] to devise an efficient solution to this problem. Many dynamic algorithms exhibit this property. Another example is the dynamic-trees problem of Sleator and Tarjan [91], whose many variants have been studied extensively [91, 92, 23, 84, 46, 96, 10, 38, 11, 95, 6].
Lack of support for a broad set of changes: Most dynamic algorithms are designed to support one insertion and/or one deletion at a time. More complex changes, such as batch changes where multiple insertions and deletions take place simultaneously, must be handled separately. Similarly, kinetic data structures are often designed to support one kinetic change that changes the outcome of a predicate, but not insertions and deletions, or batch changes [43, 13]. As Guibas points out [43], these are important limitations. To address them, Guibas suggests that "the interaction between kinetic and dynamic data structures needs to be better developed" [43].

Lack of support for composition: Dynamic and kinetic data structures are not composable—it is not possible to take two dynamic and/or kinetic algorithms and feed the output of one into the other. The key reason for this is that dynamic algorithms cannot detect change; rather, they assume that changes take place only through a predefined set of operations. Another reason is that through composition, a small change can propagate into more complex changes. For example, to be composable, kinetic data structures must support insertions and deletions [43]—not just kinetic changes.

These limitations make it difficult to use dynamic/kinetic algorithms in practice. In practice, solving a problem often requires adapting an existing algorithm to the needs of the particular problem, extending the algorithm for different kinds of changes, and composing algorithms. Because of the limitations of existing dynamic/kinetic algorithms and the inherent complexity of these problems, it is not reasonable to expect the practitioner to employ dynamic/kinetic algorithms. The programming-languages community has focused on developing general-purpose techniques for transforming static (non-dynamic) programs into dynamic programs. This is known as incremental computation.
Previous work on incremental computation addresses some of the limitations of the approach preferred in the algorithms community. In particular, incremental computation helps reduce the inherent-complexity problem by providing programming abstractions. The techniques often support a broad range of changes and support composition. However, all previously proposed incremental-computation techniques have a critical limitation: they either apply only to certain restricted classes of computations, or they deliver suboptimal, often inadequate, performance.

The most successful incremental-computation techniques are based on static dependence graphs and memoization (or function caching). Static dependence graphs, introduced by Demers, Reps, and Teitelbaum [29], can provide efficient updates. The key problem with the technique is that it is applicable only to a certain class of computations [81]. Memoization, first applied to incremental computation by Pugh and Teitelbaum [83], is a general-purpose technique. The key problem with memoization is that it is not efficient. For example, the best bound for incremental list sorting with memoization is linear [57].

This thesis develops general-purpose techniques for self-adjusting computation. The techniques enable writing self-adjusting programs like ordinary programs. In particular, an ordinary program can be transformed into a self-adjusting program by applying a methodical transformation. Self-adjusting programs support any change to their state whatsoever, and they are composable. The techniques have O(1)-time overhead and yield self-adjusting programs that are efficient both in theory and in practice. We achieve our results by combining algorithmic and programming-language techniques. We show that our techniques are effective in practice by implementing and evaluating them.
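The inefficiency of plain memoization can be made concrete with a small experiment (a hypothetical Python sketch, not code from this thesis): in a memoized merge sort, prepending a single element shifts the contents of nearly every recursive call's argument, so almost none of the cached results are reusable and the update cost is linear rather than logarithmic.

```python
from functools import lru_cache

calls = 0  # counts how many merge-sort calls actually execute a body

@lru_cache(maxsize=None)
def msort(xs):  # xs is a tuple so results can be cached by argument
    global calls
    calls += 1
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    a, b = msort(xs[:mid]), msort(xs[mid:])
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):  # merge the sorted halves
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return tuple(out) + a[i:] + b[j:]

msort(tuple(range(8)))                   # 15 calls: the full recursion tree
before = calls
result = msort((99,) + tuple(range(8)))  # prepend one element
fresh = calls - before                   # 8 fresh calls: an O(n) update cost
```

Only the subproblems whose contents happen to coincide with old ones (here, mostly singletons) hit the cache; the number of fresh calls grows linearly with the input, which is the linear bound for incremental sorting with memoization noted above.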
From the algorithmic perspective, we introduce efficient data structures for tracking dependences in a computation and a change-propagation algorithm for adjusting computations to external changes. We prove that the overhead of our dependence-tracking techniques is O(1). To bound the time for change propagation, we present an analysis technique, called trace stability. Trace stability enables bounding the time for change propagation in terms of the distance between computation traces. To determine the effectiveness of change propagation, we consider a number of applications including
• a random-sampling algorithm for combining the elements of a list,
• the quick sort and the merge sort algorithms,
• the Graham’s Scan [42] and the quick hull algorithms for planar convex hulls,
• the parallel tree-contraction algorithm of Miller and Reif [62, 63], and
• kinetic data structures.
These applications are chosen to span a number of problem-solving paradigms such as random sampling, divide-and-conquer, incremental result construction, and parallel, bottom-up result construction. Using trace stability, we show complexity bounds for these applications (except for kinetic data structures) under insertions and deletions to their input. The trace-stability results yield upper bounds that are within an expected constant factor of the best bounds achieved by special-purpose dynamic algorithms. From the programming-languages perspective, we present techniques for writing self-adjusting programs. Using our techniques, the programmer can transform an ordinary program into a self-adjusting program by applying a methodical transformation technique. The key properties of the techniques are that
• they are general purpose,
• they enable applying dependence tracking selectively,
• they yield type safe programs,
• they yield correct self-adjusting programs, and
• they admit an efficient implementation.
Our language techniques apply to all programs, both purely functional and imperative. Selective dependence tracking enables the programmer to identify parts of the computation data as changeable, i.e., possibly affected by an external change, and other parts as stable. The techniques then track only the dependences pertaining to changeable data. This enables the programmer to control the cost of dependence tracking by controlling its granularity. We employ type theory to ensure that type-safe programs written in the language are safe, i.e., do not "go wrong" during execution. A key property of our techniques is that they yield correct self-adjusting programs, i.e., self-adjusting programs adjust to any permissible change correctly. To study the correctness of our techniques, we formalize our change-propagation algorithm and memoization techniques, and show that they are correct. To prove that change propagation is correct, we show that the algorithm and a from-scratch execution are semantically equivalent (performance, of course, differs).

To study the practical effectiveness of self-adjusting computation, we perform an experimental evaluation in two contexts. On the one hand, we implement a general-purpose ML library for writing self-adjusting programs, and present an experimental evaluation based on our applications. Our experiments confirm our asymptotic complexity bounds and show that the constant factor hidden in our O(1) overhead bound is small. For the input sizes that we consider, we measure up to three orders of magnitude time difference between self-adjusting programs and recomputing from scratch—since the gap is asymptotically significant, it grows with the input size. On the other hand, we implement a specialized version of our library for the dynamic-trees problem of Sleator and Tarjan [91].
We solve the dynamic-trees problem by applying self-adjusting computation to the tree-contraction algorithm of Miller and Reif [62] and perform an extensive experimental evaluation over a broad range of applications. When applicable, we compare our implementation to Werneck's implementation of the fastest dynamic-trees data structure [98], known as Link-Cut Trees [91, 92]. Our experiments show that for certain operations, Link-Cut Trees are faster; for other operations, self-adjusting tree contraction is faster. We note that self-adjusting tree contraction supports a broader set of changes and applications than Link-Cut Trees. The experimental results show that self-adjusting computation performs very well even when compared to special-purpose algorithms.
1.1 Overview and Contributions of this Thesis
This section presents an overview of the thesis and describes its key contributions. The thesis consists of five parts: algorithms and data structures, trace stability, applications, language techniques, and implementation and experiments. Some of the work reported in this thesis has been published in conference proceedings and journals in the areas of algorithms [6, 5] and programming languages [2, 4, 3].
1.1.1 Part I: Algorithms and Data Structures
We introduce the key data structures and algorithms that self-adjusting computation relies on. The data structures, called dynamic dependence graphs (DDGs) and memoized dynamic dependence graphs (MDDGs), are described in Chapters 4 and 5 respectively. In Chapter 6 we present complexity bounds for the overhead and the effectiveness of these data structures.
Dynamic Dependence Graphs (DDGs)
We introduce dynamic dependence graphs (DDGs) for representing computations and provide a change-propagation algorithm for them. Given the DDG of a computation and any change to the state of the computation, the change-propagation algorithm adjusts the computation to that change. In Chapter 6, we show that the DDG of a computation can be constructed with O(1) overhead.
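The flavor of DDGs and change propagation can be conveyed with a toy model (a greatly simplified Python sketch; the data structures developed in Part I are far more refined): each read of a modifiable is recorded with a virtual-clock time stamp, and when a modifiable is written, the affected reads are re-executed in time order from a priority queue.

```python
import heapq
from itertools import count

class DDG:
    """A toy dynamic dependence graph (sketch): reads are time-stamped
    and re-executed in time order during change propagation."""
    def __init__(self):
        self.clock = count()  # stand-in for the thesis's virtual clock

    def modifiable(self, value):
        return {"value": value, "reads": []}  # reads: (time stamp, closure)

    def read(self, mod, use):
        mod["reads"].append((next(self.clock), use))
        use(mod["value"])

    def write(self, mod, value, queue):
        if mod["value"] != value:
            mod["value"] = value
            for t, use in mod["reads"]:
                heapq.heappush(queue, (t, use, mod))  # schedule affected reads

    def propagate(self, queue):
        while queue:
            t, use, mod = heapq.heappop(queue)
            # Re-run the affected read. (The real algorithm also removes the
            # stale sub-computation that the old execution of the read built.)
            use(mod["value"])

ddg = DDG()
x = ddg.modifiable(5)
y = ddg.modifiable(None)
ddg.read(x, lambda v: y.update(value=v * v))  # y's value depends on x

q = []
ddg.write(x, 7, q)  # an external change to x
ddg.propagate(q)    # y is adjusted without re-running the whole program
```

Re-executing reads in time-stamp order is the essential point: it guarantees that a computation is adjusted before any later computation that depends on it.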
Memoized Dynamic Dependence Graphs (MDDGs)
We point out the limitations of DDGs and address these limitations by introducing memoized dynamic dependence graphs (MDDGs) and the memoized change-propagation algorithm for them. Given the MDDG of a computation, and any change to the state of the computation, the memoized change-propagation algorithm adjusts the computation to that change. A key property of memoized DDGs is that they enable memoization under side effects (mutations) to the memory. In Chapter 6, we show that the MDDG of a computation can be constructed with O(1) overhead and present a bound on the complexity of memoized change propagation.
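A toy illustration of the underlying idea (hypothetical Python, not the thesis's mechanism): a memo table can cache a call's result together with the dependence records the call created, so that a cache hit splices the previously recorded dependencies back into the current computation instead of discarding them.

```python
# Toy memo table (sketch): cache each call's result together with the
# dependence records the call produced, so a hit can reuse both.
memo = {}

def memo_call(f, arg, reads_log):
    key = (f.__name__, arg)
    if key in memo:
        result, cached_reads = memo[key]
        reads_log.extend(cached_reads)  # splice the old dependencies back in
        return result
    local_reads = []
    result = f(arg, local_reads)
    memo[key] = (result, local_reads)
    reads_log.extend(local_reads)
    return result

def square(x, reads):
    reads.append(("read", x))  # pretend this call read a modifiable holding x
    return x * x

log_first, log_second = [], []
memo_call(square, 6, log_first)   # computed: one read is recorded
memo_call(square, 6, log_second)  # cache hit: result and read are reused
```

In the real MDDG, reusing a call also reuses the sub-graph of its trace, which is what makes memoization compatible with mutation and with change propagation.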
1.1.2 Part II: Trace Stability
Although the complexity bound for change propagation developed in the first part of this thesis applies to all programs, it is somewhat unsatisfactory, because it relies on a low-level understanding of memoized dynamic dependence graphs. In this part of the thesis, we develop a higher-level analysis technique for determining the complexity of change propagation. The analysis technique, called trace stability, relies on representing computations with traces. To determine the complexity of change propagation, it suffices to measure the distance between two traces. In particular, if the trace distance between two computations is O(f(n)) (for some measure n), then we show that change propagation can transform one computation into the other in O(f(n) log f(n)) time in general, and in O(f(n)) time in certain cases. Chapter 7 defines the notions of traces and trace distance. Chapter 8 defines and proves our trace-stability theorems.
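As a toy illustration of measuring trace distance (hypothetical Python; the cognate-based definition of Chapter 7 is more refined), one can model a trace as the set of comparisons a run performs and take the distance to be the number of operations appearing in exactly one of the two traces:

```python
def trace_distance(trace_a, trace_b):
    """Toy distance: operations appearing in exactly one of the two traces
    (a crude stand-in for the thesis's cognate-based definition)."""
    return len(set(trace_a) ^ set(trace_b))

def qs_trace(xs):
    """The comparisons performed by a (deterministic) quick sort."""
    if not xs:
        return []
    pivot, rest = xs[0], xs[1:]
    return ([(pivot, x) for x in rest]
            + qs_trace([x for x in rest if x < pivot])
            + qs_trace([x for x in rest if x >= pivot]))

d = trace_distance(qs_trace([3, 1, 4, 2]), qs_trace([3, 1, 4, 2, 5]))
# d == 2: only the comparisons (3, 5) and (4, 5) are new
```

A small trace distance for a small input change is exactly what a stability bound asserts: change propagation then needs to redo only roughly that much work.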
1.1.3 Part III: Applications
We consider a number of algorithms on lists (Chapter 9) and an algorithm on trees (Chapter 10), and show trace-stability bounds for them. The list algorithms include a random-sampling algorithm for combining the values in a list, the merge sort and quick sort algorithms, and the Graham's Scan and quick hull algorithms for planar convex hulls. For trees, we consider the tree-contraction algorithm of Miller and Reif [62]. These algorithms are chosen to span a number of computing paradigms, such as random sampling, divide-and-conquer, incremental result construction, and parallel, bottom-up result construction. All our bounds are for insertions/deletions to the input and are parameterized by the size of the input. For combining values in a list, we show an expected O(log n) bound, where the expectation is taken over the internal randomization of the algorithm. A self-adjusting version of this algorithm can compute the minimum, maximum, sum, etc., of a list in expected O(log n) time under insertions/deletions. For the quick sort algorithm, we show an expected O(log n) bound, where the expectation is taken over all possible positions of insertions/deletions into/from the input. For the randomized merge sort algorithm that uses random splitting, we show an expected O(log n) bound, where the expectation is taken over the internal randomization of the algorithm. For the Graham's Scan algorithm, we show an expected O(1) stability bound for line-side tests under insertions/deletions into the list; the expectations are taken over all points in the input. For the tree-contraction algorithm, we show an expected O(log n) stability bound under edge insertions/deletions. Using self-adjusting tree contraction, the dynamic-trees problem can be solved in expected O(log n) time. All of the stability bounds directly yield complexity bounds for insertions/deletions. These bounds are within an expected constant factor of the best bounds achieved by special-purpose dynamic algorithms.
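The random-sampling combining algorithm can be pictured as repeated randomized contraction of the list. The sketch below is our simplification of the idea (the combining operation `op` must be associative); each element pairs with its neighbor with probability about one half, so the list shrinks geometrically and the expected number of rounds is O(log n):

```python
import random

def combine(xs, op):
    # Repeatedly contract the list by combining randomly selected adjacent
    # pairs. Each round removes a constant fraction of the elements in
    # expectation, giving expected O(log n) rounds.
    while len(xs) > 1:
        ys, i = [], 0
        while i < len(xs):
            if i + 1 < len(xs) and random.random() < 0.5:
                ys.append(op(xs[i], xs[i + 1]))   # combine a random pair
                i += 2
            else:
                ys.append(xs[i])                   # keep the element this round
                i += 1
        xs = ys
    return xs[0]

random.seed(1)
# Order is preserved, so any associative op gives a deterministic result.
assert combine(list(range(1, 11)), lambda a, b: a + b) == 55
```

The random pairing decisions, not the input positions, determine the trace, which is what makes the computation stable under insertions and deletions.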
1.1.4 Part IV: Language Techniques
In this part of the thesis, we present language techniques for writing self-adjusting programs. The techniques make it possible to transform an ordinary (non-self-adjusting) program into a safe, correct, and efficient self-adjusting program by making small, methodical changes to the code.
Adaptive Functional Programming and the AFL Language
Adaptive functional programming (Chapter 11) enables writing self-adjusting programs based on (non-memoized) dynamic dependence graphs. A key property of the techniques is that they enable the programmer to apply DDG-based techniques selectively. We study the techniques in the context of a purely functional language called the Adaptive Functional Language (AFL). We give a static and dynamic semantics for the language and show that the language is type safe. Based on the dynamic semantics, we prove that the change-propagation algorithm on dynamic dependence graphs is correct.
Selective Memoization and the MFL Language
Selective memoization (Chapter 12) enables the programmer to control the effectiveness of memoization by providing control over the cost of equality tests, the input-output dependences, and space consumption. The most important aspect of selective memoization is its mechanisms for supporting memoization with precise input-output dependences. We study the technique in the context of a functional language, called the Memoizing Functional Language (MFL). We present the static and dynamic semantics for the language. We prove that the language is type safe and that the result-reuse technique with precise dependences is correct.
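The idea of precise input-output dependences can be illustrated with a memo table keyed only on the parts of the input that a function actually touches. This is a schematic model of ours, not MFL's actual constructs:

```python
# Schematic selective memoization: record exactly the fields a function reads,
# and reuse a stored result whenever those fields (and only those) match.
class Tracked:
    def __init__(self, data):
        self.data, self.touched = data, set()
    def get(self, key):
        self.touched.add(key)          # remember that this field was read
        return self.data[key]

memo = []                              # entries: (reads-as-dict, result)

def memo_call(f, inp):
    for reads, result in memo:
        if all(inp[k] == v for k, v in reads.items()):
            return result              # hit: only the touched fields matter
    tracked = Tracked(inp)
    result = f(tracked)
    memo.append(({k: inp[k] for k in tracked.touched}, result))
    return result

def double_x(t):
    return t.get("x") * 2              # reads only the field "x"

assert memo_call(double_x, {"x": 3, "y": 1}) == 6
assert memo_call(double_x, {"x": 3, "y": 99}) == 6   # reuse: "y" was never read
assert len(memo) == 1
```

Because the second call differs only in the untouched field "y", the stored result is reused; a memo keyed on the whole input would have missed.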
Self-Adjusting Functional Programming and the SLf Language
Self-adjusting functional programming (Chapter 13) enables writing self-adjusting programs based on memoized dynamic dependence graphs. We study the techniques in the context of a purely functional language, called the Self-adjusting functional Language (SLf). The key property of the SLf language is that it supports efficient change propagation. In particular, SLf enables matching the change-propagation bounds obtained in the algorithmic setting. Note that these bounds rely on an imperative computation model that provides explicit management of memory. To achieve this, the SLf language combines the AFL and MFL languages and extends them with support for non-strict and strict dependences. We demonstrate the effectiveness of SLf by implementing it as an ML library and presenting an experimental evaluation based on our applications.
Imperative Self-Adjusting Programming and the SLi Language
We present an imperative language, called SLi, for writing self-adjusting programs (Chapter 14). The language extends the SLf language with constructs that enable memory locations to be written multiple times. This shows that the techniques presented here apply beyond the domain of single-assignment or purely functional programs.
1.1.5 Part V: Implementation and Experiments
In this part of the thesis, we study the practical effectiveness of self-adjusting computation in the context of a general-purpose ML library (Chapter 15) and an implementation of self-adjusting tree contraction in the C++ language (Chapter 17). Based on our general-purpose library, we implement a library for transforming ordinary programs into kinetic programs and show snapshots of a kinetic simulation based on our techniques (Chapter 16).
A Library for Self-Adjusting Computation
We show that our techniques are practical by implementing an ML library to support self-adjusting computation, implementing our list applications based on this library, and presenting an experimental evaluation (Chapter 15). Our experiments show that the overhead of our library over non-self-adjusting programs is a factor of between four and ten, and that the overhead of change propagation is no more than six. The experiments confirm the theoretical bounds obtained by trace-stability analysis. In our experiments, we observe that self-adjusting programs are up to three orders of magnitude faster than recomputing from scratch for the input sizes that we consider (due to the asymptotic difference, the gap can be arbitrarily large).
Kinetic Data Structures
In Chapter 16, we describe a library for transforming ordinary programs into kinetic programs. The transformation involves two steps. In the first step, the programmer transforms the ordinary program into a self-adjusting program. In the second step, the comparisons employed by the program are replaced by kinetic comparison tests. The second step is trivial and can easily be performed automatically. As an example, we implement kinetic geometry tests for computing convex hulls and apply the technique to our self-adjusting convex hull algorithm. We show some snapshots of our kinetic simulations. Our work addresses two key limitations of previous work: our kinetic programs directly support insertions/deletions, changes to the motion parameters (flight-plan updates), and batch changes; furthermore, our programs are directly composable.
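A kinetic comparison returns the current outcome of a test together with the time at which its certificate fails. The sketch below is our own one-dimensional simplification for points moving linearly, not the geometry tests of Chapter 16:

```python
def kinetic_less(p, q, now):
    # p and q move linearly: position(t) = a + b * t. Return the current
    # outcome of "p < q" plus the time at which this certificate fails.
    (a1, b1), (a2, b2) = p, q
    outcome = a1 + b1 * now < a2 + b2 * now
    if b1 == b2:
        failure = float("inf")         # equal velocities: order never changes
    else:
        t = (a2 - a1) / (b1 - b2)      # solve a1 + b1*t == a2 + b2*t
        failure = t if t > now else float("inf")
    return outcome, failure

# p starts behind q but moves faster; the certificate fails at t = 5.
outcome, failure = kinetic_less((0.0, 1.0), (5.0, 0.0), 0.0)
assert outcome and failure == 5.0
```

A kinetic simulator keeps the earliest failure times in a priority queue, advances to the next failure, and lets change propagation adjust the computation to the flipped comparison.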
Self-Adjusting Computation in Dynamic Trees
In Chapter 10, we solve the dynamic-trees problem of Sleator and Tarjan [91] by applying self-adjusting computation to the tree-contraction algorithm of Miller and Reif [62]. In Chapter 17, we present an implementation of our solution in the C++ language and perform an extensive experimental evaluation. When possible, we compare our implementation to an implementation of the fastest dynamic-trees data structure, known as Link-Cut Trees [91, 92]. Our experiments show that for certain external changes self-adjusting tree contraction is up to five times slower than Link-Cut Trees, whereas for certain other changes and queries, it is up to three times faster. It is also important to note that self-adjusting tree contraction supports a broader range of applications and a broader range of input changes than LC-Trees. For example, self-adjusting tree contraction supports subtree queries directly, whereas LC-Trees require a sophisticated extension for subtree queries [23, 84]. Similarly, self-adjusting tree contraction supports batch changes (i.e., multiple simultaneous changes) directly, whereas LC-Trees only support one-by-one processing of changes. These results demonstrate that self-adjusting computation performs well even compared to special-purpose algorithms.
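To illustrate the contraction style (this is our toy sketch, not the C++ implementation of Chapter 17), a rake-only contraction repeatedly deletes all leaves; Miller and Reif's algorithm pairs rake with a compress operation that splices unary paths, so that even long paths contract in O(log n) rounds:

```python
def rake_rounds(children):
    # children: dict mapping each node to its list of children. Each round
    # removes all leaves (rake). Compress, which splices chains of unary
    # nodes, is omitted; without it a path of n nodes needs n - 1 rounds.
    rounds = 0
    while len(children) > 1:
        leaves = {n for n, cs in children.items() if not cs}
        children = {n: [c for c in cs if c not in leaves]
                    for n, cs in children.items() if n not in leaves}
        rounds += 1
    return rounds

star = {0: [1, 2, 3], 1: [], 2: [], 3: []}
chain = {0: [1], 1: [2], 2: [3], 3: []}
assert rake_rounds(star) == 1     # all leaves vanish in one round
assert rake_rounds(chain) == 3    # a path rakes slowly: compress is needed
```

Self-adjusting tree contraction tracks the dependences of these rounds so that an edge insertion or deletion re-executes only the affected rake/compress decisions.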
Related Work
The problem of adjusting a computation to external changes has been studied extensively in both the algorithms/theory community and the programming-languages community. This chapter reviews the previous work, discusses its limitations, and describes how it relates to this thesis. Section 2.1 reviews the work of the algorithms and theory community; Section 2.2 reviews the work of the programming-languages community.
2.1 Algorithms Community
The related work in the algorithms community has focused on devising dynamic algorithms (or data structures) and kinetic algorithms (or data structures). This section discusses previously proposed design and analysis techniques (Section 2.1.1) and the limitations of these approaches (Section 2.1.2). Section 2.1.3 discusses the relationship between the work in this thesis and the previous work, and how our work addresses the limitations of the previous work. A dynamic algorithm (or data structure) computes a property of its input while allowing the user to change the input. For example, a typical dynamic convex hull algorithm computes the convex hull of a set of points while also allowing the user to insert and delete points. An enormous amount of work has been done on dynamic algorithms. The problems that are considered in this thesis have been studied extensively since the early eighties [17, 74, 75, 91, 92, 67, 69, 68, 89, 21, 37, 38, 10, 32, 46, 49, 95]. Dynamic algorithms have also been studied extensively within particular fields. For examples of dynamic algorithms in graphs and in computational geometry, we refer the interested reader to the papers by Eppstein, Galil, and Italiano [32], and Chiang and Tamassia [21]. Kinetic data structures [13] compute properties of continuously moving objects. For example, a kinetic convex hull algorithm computes (and maintains) the convex hull of a set of points as they move continuously in the plane. By using the continuity of motion, kinetic data structures compute a discrete event space, where an event corresponds to a change in the result of a predicate. Every time an event takes place, the data structure is adjusted to that event by using techniques specific to the problem. For other examples of kinetic data structures, we refer the interested reader to other papers [14, 43, 7, 55, 25, 54].
2.1.1 Design and Analysis Techniques
Most of the fastest dynamic algorithms rely on techniques that exploit the particular structure of the problem considered [91, 92, 37, 38, 10, 46, 49, 95]. Other dynamic algorithms are constructed from static (non-dynamic) algorithms by applying somewhat more general design principles; this is known as dynamization. These techniques include rebuilding and decomposition strategies, and explicit maintenance of the history of computations. Rebuilding strategies [74, 75, 21] rely on maintaining a static data structure and allowing dynamic updates to violate the invariants of the static data structure for short periods of time. Periodically, the data structure is rebuilt to restore the invariants. By controlling the number of operations between periodic rebuilds, and the cost of rebuilding, these techniques can handle dynamic changes reasonably efficiently for certain applications [75, 69, 89, 21]. Decomposition strategies rely on decomposing a problem into smaller problems that are solved using some static algorithm. When a dynamic update occurs, the parts of the problem affected by the update are reconstructed using the static algorithm. The idea is that a small change to the input will affect a small number of the subproblems. This idea dates back at least to the 70s. Bentley and Saxe [17] and Overmars [74, 75] described how saving partial results can be used to efficiently dynamize any algorithm that belongs to certain classes of computations. Overmars used the approach, for example, to design a dynamic algorithm for convex hulls that supports insertions and deletions of points in O(log² n) time. Another approach to dynamization relies on maintaining the history of a computation and updating the history when the input changes. Intuitively, the history is thought of as the trace of the execution of some static, often incremental, algorithm. The update is performed by going back in time and propagating the change forward to reconstruct the history.
Mulmuley [67, 69, 68, 70] and Schwarzkopf [89] apply this technique to a number of computational geometry problems. In this approach, the history and the propagation algorithm are devised, analyzed, and implemented on a per-problem basis. Kinetic data structures [43, 13] rely on an approach similar to history maintenance. In a kinetic data structure, the history of a computation consists of some representation of the predicates determined by the execution of a static (non-kinetic) program. Kinetic data structures simulate motion by determining the time of the earliest predicate failure and rebuilding the history according to the new value of the failed predicate(s). As with the work of Mulmuley and Schwarzkopf, the structure of the history and the algorithm for rebuilding the history are devised and implemented on a per-problem basis.
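The decomposition strategy above can be made concrete with Bentley and Saxe's logarithmic method: keep static structures of sizes 2^i and rebuild by merging on insertion, exactly like carries in a binary counter. The sketch below is our minimal rendition for the membership problem:

```python
import bisect

class LogStructure:
    # Bentley-Saxe style logarithmic method for a decomposable search
    # problem (membership). blocks[i] is empty or a sorted list of 2**i keys.
    def __init__(self):
        self.blocks = []
    def insert(self, x):
        carry = [x]
        for i in range(len(self.blocks)):
            if not self.blocks[i]:
                self.blocks[i] = carry
                return
            # Occupied slot: merge (a static rebuild) and carry upward,
            # exactly like adding 1 to a binary counter.
            carry = sorted(self.blocks[i] + carry)
            self.blocks[i] = []
        self.blocks.append(carry)
    def contains(self, x):
        # Query every block; answers to a decomposable problem combine by "or".
        for block in self.blocks:
            j = bisect.bisect_left(block, x)
            if j < len(block) and block[j] == x:
                return True
        return False

ls = LogStructure()
for x in [5, 1, 4, 2, 3]:
    ls.insert(x)
assert all(ls.contains(x) for x in [1, 2, 3, 4, 5])
assert not ls.contains(6)
```

Each element is merged O(log n) times over its lifetime, so insertion is O(log² n) amortized here, and a query inspects O(log n) blocks.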
2.1.2 Limitations
The focus of the algorithms community has been to devise fast dynamic/kinetic algorithms. Indeed, many of the devised algorithms achieve optimal or near optimal update times under external changes. The resulting algorithms, however, have important limitations.
Inherent Complexity
Dynamic algorithms can be very difficult to devise, even for problems that are easy in the static case (when the input does not change). Furthermore, dynamic algorithms are often very sensitive to the input changes considered and to the definition of the problem. A problem can be easy for certain kinds of changes and very difficult for others, or a slight extension to the problem can require substantial, sophisticated revisions to the algorithm. As an example, consider the problem of computing the minimum spanning tree (MST) of a graph. In the static case, i.e., when the graph does not change, giving an algorithm for computing MSTs is easy. Similarly, when the graph is restricted to change only by insertions, it is easy to give an algorithm for MSTs. When the graph changes by insertions and deletions, however, the problem becomes very difficult. Indeed, it took nearly two decades of research [36, 33, 45, 46, 49] to devise an efficient solution to this problem. As another example, consider the dynamic-trees problem [91]. Since Sleator and Tarjan proposed the problem and solved it with the Link-Cut Trees data structure, much research has been done on variations of the problem [91, 92, 23, 84, 46, 96, 10, 38, 11, 95, 6]. A look through these papers should convince the reader that these data structures are very complex and differ significantly from each other, even though they all address variations of the original problem. Real-world applications often require extending or modifying an existing algorithm to support the needs of an application. Because of the inherent complexity of dynamic algorithms, it is not reasonable to expect a practitioner or a non-expert to extend an existing dynamic/kinetic algorithm. The dynamic-trees problem mentioned above illustrates this.
A look through previously proposed dynamic-trees data structures should convince the reader that a practitioner or non-expert in this area cannot easily adapt an existing solution to the problem, let alone implement it correctly.
Lack of Support for a Broad Set of Changes
A key limitation of dynamic algorithms is that they are designed to support a small, predefined set of changes to the input. For example, almost all algorithms in the literature support a single insertion into or deletion from their input. To be practical, however, a dynamic algorithm or kinetic data structure must support a broader set of changes, because 1) in practice, applications require support for a broad set of changes, and 2) arbitrary changes can arise due to the composition of dynamic/kinetic algorithms. In many application domains, it is important to support multiple simultaneous changes as a batch. The user can make multiple changes at a time, or multiple changes can arise within a dynamic algorithm itself. For example, the dynamic-connectivity and minimum-spanning-tree algorithms of Holm et al. [49] maintain a hierarchy of dynamic trees. When a single edge is deleted from the input graph, a number of edges in one level of the hierarchy can be inserted into the next level. This data structure can benefit significantly from a dynamic-trees data structure that supports batch changes. As Guibas points out [43], in kinetic data structures multiple events can take place in a small time interval. In such cases, it is desirable to process all events together as a batch [43]. Why composition requires support for arbitrary changes is discussed in the next section.
Lack of Support for Composition
When building practical systems, it is often necessary to compose algorithms, i.e., to run an algorithm on the output of another algorithm. Composition is arguably essential to computation: algorithms use other algorithms as sub-steps. Unfortunately, it is not possible to combine dynamic/kinetic algorithms, because these algorithms
1. cannot detect an external change—rather, they assume that an external change takes place only through a set of operations that they provide, and
2. support only a narrow, predefined set of changes.
Consider, for example, a program that counts the number of points in the convex hull of a set of points. One way to write such a program is to compute the convex hull first and then call a length function on the hull. Suppose now that a single point is inserted into the input of this program. Since the convex hull algorithm is dynamic, it will update the convex hull accordingly. One problem is that the length function cannot detect the change in the convex hull (its input). It must be given the change, but the dynamic convex hull algorithm does not provide such information. To solve this problem, the programmer must either eliminate composition by changing the internals of the convex hull program so that it outputs the length of the hull, or devise some way of computing the change in the hull. The second problem is that the length function will in general see an arbitrary number of changes to its input, because inserting one point into the input can change the convex hull dramatically. Since dynamic algorithms are not designed to process arbitrary changes, the programmer needs to implement a mechanism for splitting up large changes into smaller changes. To the best of our knowledge, no previous work addresses the issue of composition. Similarly, when composing kinetic data structures, a kinetic change can propagate as an insertion or deletion, or as a change to the motion parameters of objects. To be composable, kinetic data structures must be combined with dynamic algorithms [43]. Guibas points out that composition is critical and suggests that the “interaction between kinetic and dynamic data structures needs to be better developed” [43].
2.1.3 This Thesis
In terms of design techniques, the approaches based on maintaining a history of the computation, as proposed by Mulmuley [67, 69, 68, 70] and Schwarzkopf [89], and as used in kinetic data structures [43, 13], are closest to the approach presented in this thesis. As with these proposals, we maintain a history of a computation and update the computation by propagating a change through the history. The key difference between our approach and the previously proposed approaches is that, in our approach, the notion of history is defined for general computations. We construct the history automatically as a static (non-dynamic, non-kinetic) program executes, and we keep it up to date by using a general-purpose change-propagation algorithm. Our approach does not have the limitations discussed in the previous section. Self-adjusting programs can be obtained from ordinary programs by applying a methodical transformation. This addresses the inherent-complexity problem by dramatically simplifying the implementation of self-adjusting programs. For example, it generally takes the author one hour to transform an ordinary planar convex hull algorithm into a self-adjusting convex hull algorithm. A self-adjusting computation adjusts to any external change, including batch changes or any combination of kinetic and dynamic changes, automatically and correctly. Furthermore, self-adjusting programs are composable.
2.2 Programming Languages Community
In the programming-languages community, the related work has been done in the area of incremental computation. For a broad set of references on incremental computation, we refer the interested reader to Ramalingam and Reps's bibliography [85]. Previous theses on incremental computation also give good accounts of related work [87, 50, 81, 34, 56]. Much of the previous work on incremental computation can be broadly divided into three classes based on the approach. The rest of this section reviews each approach separately and discusses its limitations. Section 2.2.4 describes how our work addresses the limitations of the previously proposed techniques.
2.2.1 Static Dependence Graphs
Static-dependence-graph techniques rely on representing a computation in the form of a circuit. When the input to the computation changes, the change is propagated through the circuit by recomputing the values affected by the change. Demers, Reps, and Teitelbaum [29], and Reps [87], introduced the idea of using dependence graphs for incremental computation in the context of attribute grammars. Reps then gave an algorithm for propagating a change optimally through the circuit [88], and Hoover generalized the techniques outside the domain of attribute grammars [50]. Yellin and Strom applied the static-dependence-graph ideas within the INC language [99] and extended them with incremental computation within the array primitives. The main limitation of the INC language is that it is not general purpose: it does not support recursion or looping. The chief limitation of static-dependence-graph techniques is that the dependence graphs are static: the dependence structure remains the same during change propagation. As Pugh points out [81], this severely limits the effectiveness of the approach, because it is undecidable to determine how the dependences of an arbitrary program change without executing the program. For certain classes of computations, such as syntax-directed computations, it is nevertheless possible to update the dependences by exploiting the structure of the computation.
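Schematically, static-dependence-graph propagation recomputes a fixed circuit in topological order. The sketch below (our own minimal model) also exposes the limitation the previous paragraph describes: the set of nodes and edges never changes during propagation.

```python
class Node:
    # A circuit node: a function over the values of its (fixed) dependencies.
    def __init__(self, fn, deps=()):
        self.fn, self.deps, self.value = fn, list(deps), None
    def eval(self):
        self.value = self.fn(*[d.value for d in self.deps])

def propagate(order):
    # Recompute nodes in topological order. Note that the dependence
    # structure is static: no nodes or edges can be added while propagating.
    for node in order:
        node.eval()

a, b = Node(lambda: 2), Node(lambda: 3)
s = Node(lambda x, y: x + y, [a, b])
propagate([a, b, s])
assert s.value == 5
a.fn = lambda: 10              # change an input value
propagate([a, b, s])
assert s.value == 13
```

Dynamic dependence graphs (Part I) remove exactly this restriction: re-executed closures may create new nodes and edges, so arbitrary programs, including recursion, can be represented.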
2.2.2 Memoization
Memoization, or function caching [15, 61, 60], is based on the idea of caching the results of each function call, indexed by the arguments to that call. If a function is called with the same arguments a second time, the result from the cache is reused and the call is skipped. Since, in general, the result of a function call may not depend on all of its arguments, caching results based on precise input-output dependences can dramatically improve result reuse. Abadi, Lampson, and Levy [1], and Heydon, Levin, and Yu [47], introduced techniques for memoizing function calls based on precise input-output dependences. Pugh [81], and Pugh and Teitelbaum [83], were the first to apply memoization to incremental computation. They were motivated by the lack of a general-purpose approach to incremental computation: previously proposed techniques based on static dependence graphs apply only to a restricted class of applications. They developed techniques for implementing memoization efficiently and studied incremental algorithms using memoization. They showed complexity bounds on incremental computation for certain classes of divide-and-conquer algorithms based on so-called stable decompositions. Unlike static dependence graphs, memoization is general: any purely functional program can be made incremental using memoization. The effectiveness of memoization, however, crucially depends on the management of memo tables, or function caches. For efficiency, it is critical to evict elements from the cache, but it is difficult to decide which elements should be kept and which should be evicted. Liu and Teitelbaum made some progress on this problem by using program-analysis techniques [59, 56]. Their techniques automatically determine which results should be memoized and use transformations to make programs incremental. The main limitation of memoization is its limited effectiveness under small input changes.
For memoization to be effective, a small change to the input must affect only a small number of function calls. Although this may seem plausible at an intuitive level, it is not true in general. In fact, for many applications, most changes will affect a large number of function calls. Consider, for example, a program that computes some property of a list by recursively walking down the list. Imagine changing the input list by inserting a new element in the middle. Since the new element is in the input of half of all the recursive calls, many recursive calls will not find their results in the memo table, and the update will take linear time on average. Similar scenarios arise in many applications. For example, the best asymptotic bound for sorting with memoization is linear [56].
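This behavior is easy to reproduce: memoize a recursive list sum on suffixes and insert an element in the middle; every suffix containing the new element misses the memo table. (Our own illustration, not code from the works cited above.)

```python
# Why plain memoization updates in linear time for a mid-list insert: memo
# keys are the list suffixes, and every suffix containing the new element
# is a fresh key that has never been seen before.
memo = {}

def total(xs):
    key = tuple(xs)
    if key not in memo:
        memo[key] = 0 if not xs else xs[0] + total(xs[1:])
    return memo[key]

total([1, 2, 3, 4])               # warm the memo table
before = len(memo)
assert total([1, 2, 9, 3, 4]) == 19   # insert 9 in the middle
misses = len(memo) - before
# Suffixes (9,3,4), (2,9,3,4), (1,2,9,3,4) are new; (3,4) onward are hits.
assert misses == 3
```

Roughly half the calls re-execute, which is exactly the linear-time update the text describes; MDDGs avoid this by matching calls on locations rather than whole values.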
2.2.3 Partial Evaluation
Other approaches to incremental computation are based on partial evaluation [35, 94]. Sundaresh and Hudak's approach [94] requires the user to fix the partition of the input that the program will be specialized on. The program is then partially evaluated with respect to this partition, and the input outside the partition can be changed incrementally. The main limitation of this approach is that it allows input changes only within a predetermined partition [56, 34]. Field [34], and Field and Teitelbaum [35], present techniques for incremental computation in the context of the lambda calculus. Their approach is similar to Sundaresh and Hudak's, but they present formal reduction systems that use partially evaluated results optimally.
2.2.4 This Thesis
Like previous work on incremental computation, our goal is to develop techniques for transforming ordinary programs into programs that can adjust to external changes efficiently. Compared to previous work, our techniques take important steps toward achieving this goal. The key differences between our work and previously proposed techniques are that
• our techniques are general purpose,
• the computational complexity of self-adjusting programs can be determined using analytical techniques, and
• our techniques are efficient both in theory and in practice.
One of the main contributions of this thesis is a data structure that overcomes the limitations of static dependence graphs. The data structure, called dynamic dependence graphs (DDGs), allows the dependence structure to change during change propagation. This makes it possible to represent arbitrary computations using dynamic dependence graphs. Although dynamic dependence graphs are general purpose, they alone do not suffice for efficient incremental computation. To achieve efficiency, we introduce memoized dynamic dependence graphs (MDDGs) and a change-propagation algorithm for them. A key property of memoized DDGs is that they enable memoization under side effects (mutations) to the memory. We present language techniques for writing programs so that their memoized DDGs can be constructed by a run-time system. Using these techniques, the programmer can transform an ordinary program into a self-adjusting program by making small, methodical changes to the code. A key property of our techniques is that their computational complexity can be determined using analytical techniques. We present complexity bounds for change propagation and present an analysis technique, called trace stability, for determining the computational complexity of self-adjusting programs. Informally, we show that a program that has “stable traces” adjusts to changes efficiently. We show that our techniques are effective by considering a number of applications. From a theoretical perspective, we show bounds on the complexity of our applications under certain classes of input changes. These bounds closely match (within an expected constant factor) complexity bounds obtained in the algorithms field. None of the previously proposed incremental-computation techniques achieves these theoretical bounds. From a practical perspective, we implement our techniques and present an experimental evaluation.
Our experiments show that our techniques have low practical overhead and perform well even when compared to solutions developed in the algorithms community.

Part I
Algorithms and Data Structures
Introduction
This part of the thesis describes the key algorithms and data structures for supporting self-adjusting computation. Chapter 4 describes dynamic dependence graphs (DDGs) for representing computations and a change-propagation algorithm for propagating changes through these graphs. Although DDGs and change propagation are general-purpose techniques, they rarely yield optimal update times under external changes. Chapter 5 introduces memoized dynamic dependence graphs (MDDGs) and a memoized change-propagation algorithm that address this problem. Memoized DDGs and the memoized change-propagation algorithm constitute the key data structure and algorithm behind our techniques. Chapter 6 presents data structures for memoized DDGs and memoized change propagation on the standard RAM model. Based on these data structures, we present complexity bounds for the overhead of constructing memoized DDGs and for the memoized change-propagation algorithm. The complexity bound for memoized change propagation applies only to programs written in the so-called normal form. This causes no loss of generality, because any program can be transformed into the normal form. Part II provides a higher-level analysis technique, called trace stability, for determining the complexity of change propagation. To facilitate a precise description of (memoized) dynamic dependence graphs and (memoized) change propagation, we define a machine model called the closure machine. The closure machine extends the standard RAM model with direct support for function calls. This enables remembering the state of function calls in the form of closures (i.e., closed functions represented by code plus data). Chapter 3 describes the machine model, defines the normal form, and describes how to transform an arbitrary program into the normal form.
Chapter 3
A Machine Model for Self-Adjusting Computation
This chapter describes a Turing-complete machine model, called the closure machine (CM) model, and defines the notion of normal form. Programs expressed in the model are called self-adjusting, because they can be made to adjust to changes by running a change-propagation algorithm. Normal-form programs are written in a specific form that enables efficient dependence tracking and change propagation. Self-adjusting computation relies on tracking the data and control dependences of programs as they execute. Once a self-adjusting program completes its execution, the user can perform a revision by changing the values stored in memory, and update the execution by running a change-propagation algorithm. This revise-and-update step can be repeated as desired. The change-propagation algorithm takes a dependence structure representing an execution and adjusts the execution, the dependence structure, and the memory according to the changes performed during the revision. The key difference between the standard RAM model and the CM model is that the CM model provides direct support for function calls. This is critical, because the change-propagation algorithm relies on the ability to identify and re-execute the function calls affected by an external change. Functions delimit the pieces of code that need to be re-executed when a change takes place. The CM model provides two different mechanisms for allocating memory. One mechanism is identical to standard memory allocation, where memory is allocated in blocks by a memory allocator. From the point of view of the program, the memory allocator is non-deterministic, because there is no control over where an allocated block may be placed in memory. The other mechanism provides labeled memory allocation: each memory allocation specifies a unique label to identify the block being allocated. The labels are then used to ensure that blocks with the same label are allocated at the same address in memory.
Since labels are unique throughout an execution, no two allocations return the same block. Labels prove important when analyzing the performance of CM programs under change propagation. It is also possible to omit labels; we take this approach from the language perspective (Chapter 13). From a practical perspective, omitting labels is preferable because it is difficult to check statically that labels are unique. From an algorithmic perspective, however, labeled memory allocation enables cleaner analysis and is therefore the preferred approach in this part of the thesis. A program for the closure model must write to each memory location at most once. This restriction,
also known as the write-once or single-assignment restriction, ensures that programs written in the model are persistent [31]. Persistence enables tracing the history of an execution and revising it upon an input change. This chapter does not address the issue of statically checking that a program writes to each memory location at most once; the third part of the thesis presents language techniques that address this issue. The write-once restriction can be eliminated by maintaining a version list for each memory location. Driscoll et al. use this idea to make data structures persistent automatically. It is possible to extend the techniques presented here to multiple writes; Chapter 14 presents a language and its semantics based on the versioning idea. Multiple writes, however, are not the focus of this thesis.
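To make the write-once restriction concrete, here is a small Python sketch of such a store. The names (WriteOnceMemory and its methods) are ours, not the thesis's; reading an unwritten location models reading the undefined value ⊥, on which the machine halts.

```python
class WriteOnceMemory:
    """Illustrative write-once (single-assignment) store."""

    def __init__(self):
        self._cells = {}   # location -> value, present only once written
        self._next = 0     # next fresh location

    def alloc(self):
        loc = self._next
        self._next += 1
        return loc         # freshly allocated location, contents are undefined

    def write(self, loc, value):
        if loc in self._cells:
            raise RuntimeError("location %d already written" % loc)
        self._cells[loc] = value

    def read(self, loc):
        if loc not in self._cells:
            # models using an undefined value: the machine halts
            raise RuntimeError("read of undefined location")
        return self._cells[loc]
```

A second write to the same location is rejected at run time; the thesis instead arranges, via language techniques, for this property to hold statically.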
3.1 The Closure Machine
Like the RAM model, the closure machine consists of a random-access memory and a finite set of registers. Memory consists of a finite but expandable set of locations. Every register or location can contain a primitive value, which is either a reference (pointer) or a value of a primitive type, such as an integer, a floating-point number, etc. The special value null represents an empty reference and the special value ⊥ represents an undefined value. When the machine attempts to use the contents of a register or memory location containing ⊥, it halts. A program for a closure machine consists of a set of function definitions containing a main function where the execution starts and ends. Each function is defined as a sequence of instructions, and the instructions of a program are consecutively numbered from one. Figure 3.1 shows the kinds of instructions available in the model. The first kind provides for operations on register data. The second kind provides for conditional branches based on a predicate on the registers. The read and write instructions read from and write to memory locations specified by the contents of two registers, one specifying the beginning of the memory block and the other specifying the particular location inside that block. Memory is allocated in blocks by two kinds of alloc instructions. A block of size n contains n memory locations, which can be randomly accessed for reading or writing. The machine provides two mechanisms for allocating memory. The alloc r instruction performs non-deterministic memory allocation by allocating a memory block whose size is specified by the contents of r. This allocation mechanism is called non-deterministic because the program has no control over where the allocated block may reside in memory. The alloc[r3, . . . , rm] r instruction performs labeled memory allocation by allocating a memory block whose size is specified by the contents of r and labeling it with the contents of r3, . . . , rm.
This allocation mechanism is called labeled allocation and provides for deterministic allocation. The model does not provide an instruction for freeing memory. Unused memory can be recovered via garbage collection without affecting asymptotic performance [44]. To support function calls, the closure machine maintains a call stack. The call instruction performs a function call by pushing the address of the next instruction onto the stack, setting all registers that are not passed as arguments to undefined (⊥), and passing control to the first instruction of the function. The ccall instruction performs a continuation call, or tail call. A continuation call does not push the return address onto the stack; otherwise, a continuation call is like an ordinary function call. Informally speaking, a continuation call is a function call that does not return to its call site. The return instruction passes control to the address at the top of the call stack after setting all but the returned registers to undefined. Setting unused registers to undefined (⊥) before a function call ensures that the function does not use
r1 ← r2 ⊕ r3 : Combine the contents of the registers r2 and r3 by applying the operation ⊕ and place the result in register r1.

if p(r1, r2) then jump i : If the contents of r1 and r2 satisfy the predicate p, then jump to instruction i.

r1 ← read r2[, r3] : Place in r1 the contents of the memory location specified by the contents of r3 within the memory block specified by the contents of r2. If r3 is not specified, the first location in the block specified by r2 is read.

r1[, r2] ← write r3 : Place the contents of the register r3 in the memory location specified by the contents of r2 within the memory block specified by the contents of r1. If r2 is not specified, the first location in the block specified by r1 is written.

r1 ← alloc r2 : Allocate a memory block and place a reference to the start of the block in r1. The number of locations in the block is specified by the contents of r2. All locations in the allocated block have the initial value ⊥.

r1 ← alloc[r3, . . . , rm] r2 : Allocate a memory block and place a reference to the start of the block in r1. The number of locations in the block is specified by the contents of r2, and the allocated block is labeled with the contents of r3, . . . , rm. All locations in the allocated block have the initial value ⊥.

call f, r1, . . . , rn : Call function f with arguments r1, . . . , rn. Push the return address onto the call stack and set all registers except for the arguments (r1, . . . , rn) to ⊥ before passing control to f.

ccall f, r1, . . . , rn : Continuation-call (tail-call) function f with arguments r1, . . . , rn. Set all registers except for the arguments (r1, . . . , rn) to ⊥ and pass control to f. Do not push the return address onto the call stack (continuation calls do not return to their call site).

return r1, . . . , rn : Return r1, . . . , rn from a function call by passing control to the return address at the top of the call stack and setting all registers except for r1, . . . , rn to ⊥.
Figure 3.1: The instruction set.

any values other than those accessible through its own arguments. Similarly, setting unused registers to undefined after function calls (before the return) ensures that the caller does not use values that are not returned to it. These properties enable tracking dependences between data and function calls by observing function arguments and values returned from functions. The closure machine enables remembering the "state" of a function call by remembering the function and its arguments. This state, consisting of the function and the arguments, is called a closure. The ability to create closures of function calls is critical to change propagation because the change-propagation algorithm operates by identifying and re-executing closures that are affected by input changes. All valid closure-machine programs must satisfy the following restrictions.
1. Write-once restriction: Each memory location is written at most once.
2. Unique labeling restriction: All memory blocks have a unique label—no two or more memory blocks are given the same label.
The write-once restriction can be checked statically by using type systems. Chapter 11 presents language facilities for writing self-adjusting programs that are guaranteed to be write once. The uniqueness of labels is more difficult to enforce without significantly restricting the set of acceptable programs. Requiring that all labels be unique, however, is stronger than required. Chapter 13 presents language facilities that do not rely on labeling. We chose to support labeled memory allocation because it dramatically simplifies complexity analysis of self-adjusting programs under change propagation.
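The role of labels can be illustrated with a small, hypothetical allocator sketch in Python (the class and method names are ours, and the uniqueness check is dynamic rather than static): because allocation is keyed by label, re-running a program yields the same address for the same label, which is what lets blocks be matched across executions.

```python
class LabeledAllocator:
    """Illustrative labeled allocation: the same label always yields the
    same base address; reusing a label within one run is rejected."""

    def __init__(self):
        self._addr_of = {}   # label -> base address, stable across runs
        self._next = 0       # next free address
        self._seen = set()   # labels used in the current run

    def start_run(self):
        self._seen = set()

    def alloc(self, label, size):
        if label in self._seen:
            raise RuntimeError("label %r used twice in one run" % (label,))
        self._seen.add(label)
        if label not in self._addr_of:   # first occurrence: really allocate
            self._addr_of[label] = self._next
            self._next += size
        return self._addr_of[label]
```

Two runs that allocate the same labels obtain the same addresses, regardless of the order of the allocations; this is the determinism that makes analysis under change propagation cleaner.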
3.2 The Normal Form
This section defines normal-form programs and describes how an arbitrary program can be transformed into normal form. The motivation for introducing normal-form programs is to enable efficient dependence tracking and change propagation. By restricting the kinds of values that can be returned from functions, and by restricting where memory reads and jumps take place, the normal form ensures that all dependences in a computation can be determined by tracking just the memory operations, and that the change-propagation algorithm can update a computation in a single pass. We define normal-form programs as follows.
Definition 1 (Normal Form) A program is in normal-form if
1. each function returns either no value, or it returns a number of locations that are allocated at the very beginning of the function;
2. all reads take place at the beginning of function calls; more precisely, no instruction other than a read or an alloc precedes a read inside a function; and
3. all branches perform local jumps, i.e., the branch instruction and the instruction to which the branch may jump are within the same function call.
The condition that functions return only memory locations forces all data dependences to arise through the reading and writing of memory. This enables tracking of dependences by observing only memory operations. The condition that all reads take place at the beginning of function calls helps establish a total order on the dependences based on the ordering of function calls. This enables the change-propagation algorithm to update a computation in a single pass over the computation. The condition that all branches be local ensures that the dependences between code and data can be tracked by tracking function calls only. The transformation of an ordinary program for the closure machine into the normal form involves
1. converting the functions into destination-passing style, and
2. splitting functions at memory reads and branches with continuation-calls.
This transformation can be performed in linear time (in the size of the code) by a compiler. The rest of this section describes the transformation via an example.
Original program (instruction numbers are kept aligned between the two versions; unused numbers are omitted):

     1 fun f() =
     2   A
     3   call g(r2, r3)
     7   if r0 < 0 then jump 10
     8   add r0, −1
     9   jump 7
    10   B

    11 fun g(r2, r3) =
    14   A
    17   return r0, r1

Transformed program (destination-passing style):

     1 fun f() =
     2   A
     3   call g(r2, r3)
     4   r0 ← read d0
     5   r1 ← read d1
     7   if r0 < 0 then jump 10
     8   add r0, −1
     9   jump 7
    10   B

    11 fun g(r2, r3) =
    12   d0 ← alloc word size
    13   d1 ← alloc word size
    14   A
    15   d0 ← write r0
    16   d1 ← write r1
    17   return d0, d1
Figure 3.2: Transforming function to destination-passing style.
The first step of the transformation forces function calls to return their results through memory instead of registers by using a form of destination passing. Consider a function that returns some values. The function is rewritten so that it first allocates a number of destinations (locations), and then writes its return values to these destinations. All callers of the function are then rewritten to read the returned destinations. Figure 3.2 shows an example of this transformation. For the example, assume that A and B denote basic blocks of straight-line code with no function calls or branches. The transformation rewrites the function g so that it creates two destinations d0 and d1, writes its results to these destinations, and returns them. The caller f is rewritten so that the destinations are read into the registers in which f expects g to return its results. To denote destinations, we use special registers written as d0, d1, . . .. Since each function can return no more than n values, where n is the number of registers available to the original program, the transformed program requires no more than 2n registers. The second step of the transformation divides functions so that (1) all reads take place at the beginning of function calls, and (2) all branches are local. The transformation is similar to the well-known transformation of arbitrary programs into continuation-passing style (c.p.s.). The main difference is that the continuations are not passed as arguments but instead are called explicitly. Figure 3.3 shows an example of the second step of the transformation. The function f is split into four functions f0, f1, f2, and f3. The functions f1, f2, and f3 take all registers and destinations, denoted r1, . . . , rn, as arguments (for brevity, we do not differentiate between registers and destinations in argument positions). The function f1 is created to ensure that the reads take place at the start of a function call.
The functions f2 and f3 are created to ensure that branches are local. These two functions start at instructions that are targets of branches. The functions are connected via continuation calls (ccall instructions), and non-local jumps are replaced with continuation calls. Figure 3.4 shows the example program and its normal form yielded by the transformation.
Before the split (destination-passing style):

     1 fun f() =
     2   A
     3   call g(r2, r3)
     7   r0 ← read d0
     8   r1 ← read d1
    12   if r0 < 0 then jump 17
    13   add r0, −1
    14   jump 12
    17   B

    18 fun g(r2, r3) =
    19   d0 ← alloc word size
    20   d1 ← alloc word size
    21   A
    22   d0 ← write r0
    23   d1 ← write r1
    24   return d0, d1

After the split:

     1 fun f0() =
     2   A
     3   call g(r2, r3)
     4   ccall f1(r1, . . . , rn)

     6 fun f1(r1, . . . , rn) =
     7   r0 ← read d0
     8   r1 ← read d1
     9   ccall f2(r1, . . . , rn)

    11 fun f2(r1, . . . , rn) =
    12   if r0 < 0 then ccall f3(r1, . . . , rn)
    13   add r0, −1
    14   jump 12

    16 fun f3(r1, . . . , rn) =
    17   B

    18 fun g0(r2, r3) =
    19   d0 ← alloc word size
    20   d1 ← alloc word size
    21   A
    22   d0 ← write r0
    23   d1 ← write r1
    24   return d0, d1
Figure 3.3: Splitting function calls at reads and jumps.
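The two transformation steps can be mimicked in ordinary Python, with the caveat that Python has no tail calls, so each ccall is approximated by returning the result of a direct call. The register and destination names follow the figures; the dictionary-based encoding of registers and memory, and the particular values written by g, are our own illustration.

```python
# Destination passing: g0 allocates destinations, writes its results
# there, and returns the locations. `mem` maps locations to values.
def g0(mem):
    d0, d1 = len(mem), len(mem) + 1  # "alloc" two fresh locations
    mem[d0], mem[d1] = 3, 7          # block A produces 3 and 7, say
    return d0, d1

def f0(regs, mem):
    # block A; call g; then continuation-call f1
    regs["d0"], regs["d1"] = g0(mem)
    return f1(regs, mem)

def f1(regs, mem):
    # all reads occur at the very start of the function call
    regs["r0"] = mem[regs["d0"]]
    regs["r1"] = mem[regs["d1"]]
    return f2(regs, mem)

def f2(regs, mem):
    # the backward local jump stays a loop; the non-local branch to
    # block B becomes a continuation call to f3
    while not (regs["r0"] < 0):
        regs["r0"] -= 1
    return f3(regs, mem)

def f3(regs, mem):
    # block B
    return regs
```

Running `f0({}, {})` decrements r0 from 3 down past zero and ends in f3, mirroring the control flow of the normal-form program in Figure 3.4.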
3.2.1 The Correspondence between a Program and its Normal Form
The dependence-tracking techniques and the change-propagation algorithms described in this thesis assume that programs are written in the normal form. For the purposes of complexity analysis of self-adjusting programs under change propagation, we consider not just normal-form but also ordinary programs. Our analysis techniques, therefore, must relate the normal form of a program to the ordinary version of that program. This section describes the correspondence between a program and its normal form and introduces some terminology to be used in later chapters (in particular in Section 7.3).
Consider some program P and its normal form Pn obtained by the normal-form transformation. We say that P is a native program and Pn is a primitive program. Similarly, we call the functions of P and Pn native and primitive functions, respectively. The normal-form transformation splits each native function into one primary primitive function and a number of secondary primitive functions. The primitive functions derived from a native function f are called the primitives of f. The primary primitive function for a native function f is consistently given the name f0. For example, in Figure 3.4, the function f is split into the primitive functions f0, f1, f2, and f3. The primitives of the native function f consist of the functions f0, f1, f2, and f3. The function f0 is a primary primitive function, whereas f1, f2, and f3 are secondary functions. The dependence-tracking techniques rely on representing the function calls in an execution as a function-call tree.
Ordinary program:

     1 fun f() =
     2   A
     5   call g(r2, r3)
    12   if r0 < 0 then jump 17
    13   add r0, −1
    14   jump 12
    17   B

    18 fun g(r2, r3) =
    21   A
    24   return r0, r1

Normal form:

     1 fun f0() =
     2   A
     3   call g(r2, r3)
     4   ccall f1(r1, . . . , rn)

     6 fun f1(r1, . . . , rn) =
     7   r0 ← read d0
     8   r1 ← read d1
     9   ccall f2(r1, . . . , rn)

    11 fun f2(r1, . . . , rn) =
    12   if r0 < 0 then ccall f3(r1, . . . , rn)
    13   add r0, −1
    14   jump 12

    16 fun f3(r1, . . . , rn) =
    17   B

    18 fun g0(r2, r3) =
    19   d0 ← alloc word size
    20   d1 ← alloc word size
    21   A
    22   d0 ← write r0
    23   d1 ← write r1
    24   return d0, d1
Figure 3.4: An ordinary program and its normal-form.
[Diagram: on the left, the call tree of the native program, in which the call to f performs the call to g; on the right, the call tree of the primitive version, in which the call to f is represented by the calls f0, f1, f2, and f3.]
Figure 3.5: Example call tree for a native program and its primitive version.

The function-call tree, or call tree, of an execution consists of vertices that represent the function calls and edges that represent the caller-callee relationships: an edge (u, v) represents the function call u performing the call v. The function-call tree of an execution is constructed by tracking the call and ccall instructions. When analyzing the performance of change propagation, we argue about the correspondence between the function-call trees of native and primitive programs. Consider executing a native program P on some input I, and executing the normal form Pn of P on the same input. Let T be the function-call tree for the execution with P and let Tn be the function-call tree for the execution with Pn. For example, Figure 3.5 shows the function-call trees of the programs shown in Figure 3.4 for a hypothetical run with the same inputs. Each function call in the native call tree (the call tree for the native program P) is represented by a sequence of calls in the primitive call tree (the call tree for the normal program Pn). For example, in Figure 3.4, the function call to f is represented by the sequence of calls f0, f1, f2, and f3. Of the sequence of calls representing f, f0 is special because it is created by a call instruction; the calls to f1, f2, and f3 are created by ccall instructions. We refer to the call to f0 as the primary (primitive) call of f. Similarly, we refer to the calls to f1, f2, and f3 as the secondary (primitive) calls of f. It is a property of the transformation that
1. all primary calls are executed via the call instruction,
2. all secondary calls are executed via the ccall instruction, and
3. each native function call corresponds to one primary call and a number of secondary calls.
Dynamic Dependence Graphs
This chapter introduces the Dynamic-Dependence-Graph (DDG) data structure for representing computations and a change-propagation algorithm for adjusting computations to external changes. The change-propagation algorithm takes as input the DDG of a computation and a set of external changes to the memory, and adjusts the computation and the DDG to those changes. It is a property of the change-propagation algorithm that the DDG and the memory after an execution of the algorithm are identical to those that would be obtained by running the program from scratch on the changed memory.
4.1 Dynamic Dependence Graphs
The dynamic dependence graph of an execution represents the control and data dependences in the execution. The nodes of a dynamic dependence graph consist of vertices and locations: the vertices represent the function calls, and the locations represent the memory locations allocated during the execution. The edges are partitioned into call edges and dependence edges. Call edges represent control dependences between functions, and dependence (or read) edges represent data dependences between memory locations and function calls. Formally, a dynamic dependence graph DDG = ((V, L), (E, D)) consists of nodes partitioned into vertices V and locations L, and a set of edges partitioned into call edges E ⊆ V × V and dependences D ⊆ L × V. The tree (V, E) is the (function-)call tree for the execution. To enable change propagation, DDGs are augmented with a tag function, denoted α, and a memory, denoted σ.
Tags (α): The function α maps each vertex in V to a tag consisting of a triple of (1) the function being called, (2) the values of the arguments, and (3) a time interval consisting of a pair of time stamps. The time interval of a vertex corresponds to the interval of execution time during which the corresponding function call executes. The next section motivates and describes time intervals.
Memory(σ): The function σ, called the memory, maps each location to a value.
Figure 4.1 shows the dynamic dependence graph of an execution of a hypothetical program on some input. The memory is drawn as an array of locations. The vertices are drawn as circles and are labeled with their time intervals, each consisting of two time stamps. Each time stamp is represented by an integer.
[Diagram: a dynamic dependence graph. The memory is an array of ten locations holding the values 7, 4, 3, 8, 9, 1, 6, 5, 0, 7. The call tree has a root with time interval 1-20 and children with intervals 2-11 and 12-19; the remaining vertices carry the intervals 3-6, 7-10, 13-14, 15-18, 16-17, 4-5, and 8-9.]
Figure 4.1: An example dynamic dependence graph.

The curved edges between locations and vertices are the read edges. The straight edges between vertices are the call edges. The tags other than the time stamps are not shown in this example. Throughout this thesis, we omit time stamps when drawing DDGs and assume that a preorder traversal of the call tree corresponds to the execution order (and thus the time order) of the function calls.
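As a concrete, purely illustrative representation, the components ((V, L), (E, D)) together with α and σ might be bundled as follows in Python (the class and field names are ours). The readers query returns exactly the vertices that a change to a location would activate during change propagation.

```python
from dataclasses import dataclass, field

@dataclass
class DDG:
    vertices: set = field(default_factory=set)     # V: one vertex per function call
    locations: set = field(default_factory=set)    # L: allocated locations
    call_edges: set = field(default_factory=set)   # E, a subset of V x V
    read_edges: set = field(default_factory=set)   # D, a subset of L x V
    tags: dict = field(default_factory=dict)       # alpha: v -> (f, args, (ts, te))
    memory: dict = field(default_factory=dict)     # sigma: l -> value

    def readers(self, loc):
        """Vertices that read `loc`: the calls affected when `loc` changes."""
        return {v for (l, v) in self.read_edges if l == loc}
```

For example, a graph with a root calling a child that reads location l0 would record the call edge (root, child) in E and the read edge (l0, child) in D.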
4.2 Virtual Clock and the Order Maintenance Data Structure
To enable change propagation and to construct the DDG of an execution efficiently, the run-time system relies on a virtual-clock data structure. The data structure maintains the current time, dispenses time stamps, and provides a number of operations on time stamps. Figure 4.2 shows the operations supported by the virtual clock. The data structure maintains the current time, currentTime, explicitly. The init operation initializes the virtual clock by creating the initial time stamp and setting the current time to this time stamp. Every time stamp created subsequently is inserted after the initial time stamp. The advanceTime operation advances the current time, and the next operation returns the first time stamp following the current time. The compare operation compares two time stamps, and the delete operation deletes a given time stamp. All but the next operation are directly supported by the order-maintenance data structure of Dietz and Sleator [30] in constant time. It is straightforward to extend this order-maintenance data structure to support the next operation in constant time. The run-time system uses the virtual clock to time-stamp function calls. For each function call, the system creates two time stamps, one at the beginning and one at the end of the call. The time stamps of a call are called the start time stamp and the stop time stamp. The start and stop time stamps of a call together define the time interval for that call. For example, in Figure 4.1, the root call has the start time 1 and stop time 20; the left child of the root has the start time 2 and stop time 11. The system ensures that the ordering of start and stop time stamps accurately represents the time intervals in which functions execute. The system uses time stamps for determining (1) the sequential execution order of function calls, and (2) the containment relation between function calls.
Sequential execution order: The sequential execution order of function calls is defined by the time at which they start executing. For example, if the call f(a) starts executing before the call g(b), we say that f(a) comes before g(b) in sequential execution order. The sequential execution order of function
currentTime : the current time (stamp).
init() : initialize the data structure.
advanceTime() : insert a new time stamp t immediately after the current time (before any other time stamp that comes after the current time); return t.
delete(t) : delete the time stamp t.
compare(t1, t2) : compare the time stamps t1 and t2.
next(t) : return the time stamp that comes immediately after the time stamp t.
Figure 4.2: The operations on time stamps.
calls can be determined by comparing their start time stamps.
Containment relation: We say that a function call g(b) is contained in another function call f(a) if g(b) is a descendant of f(a) in the call tree. The containment relation between two calls can be determined by comparing their time intervals. In particular, if the time interval of g(b) lies within the time interval of f(a), then g(b) is contained in f(a). For example, in Figure 4.1, the left child of the root is contained in the root. Indeed, the interval of the left child, 2-11, is contained in the interval of the root, 1-20.
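A simple list-backed stand-in for the virtual clock can make these operations concrete. This sketch (our own) is correct but costs O(n) per operation, whereas the Dietz-Sleator order-maintenance structure supports them in constant time; the method names mirror Figure 4.2.

```python
class VirtualClock:
    """Naive order-maintenance sketch: stamps kept in a Python list."""

    def __init__(self):                 # plays the role of init()
        self._stamps = [0]              # the initial time stamp
        self._fresh = 0
        self.current = 0                # currentTime

    def advance_time(self):
        self._fresh += 1
        t = self._fresh
        # insert immediately after the current time
        self._stamps.insert(self._stamps.index(self.current) + 1, t)
        self.current = t
        return t

    def compare(self, t1, t2):
        # negative iff t1 comes before t2
        return self._stamps.index(t1) - self._stamps.index(t2)

    def next(self, t):
        i = self._stamps.index(t)
        return self._stamps[i + 1] if i + 1 < len(self._stamps) else None

    def delete(self, t):
        self._stamps.remove(t)
```

With intervals encoded as (start, stop) pairs, containment reduces to two comparisons: g(b) is contained in f(a) when compare(start_f, start_g) < 0 and compare(stop_g, stop_f) < 0. Change propagation also rolls the clock back by resetting `current` to an earlier stamp, so that freshly dispensed stamps land inside the re-executed interval.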
4.3 Constructing Dynamic Dependence Graphs.
As a program executes, a run-time system constructs the DDG of the execution by tracking the read, write, call, and ccall instructions. To enable the construction and change propagation, the system globally maintains the following:
1. a virtual clock,
2. the DDG being built, denoted DDG = ((V,L), (E,D)),
3. the tags α for function calls,
4. the memory σ, and
5. an active-call queue Q, containing function calls.
The run-time system initializes V, E, D, and Q to the empty set and initializes the virtual clock via the init operation. The input of the execution determines the initial values of the set of locations L and the memory σ. The construction is performed by the alloc, write, call, ccall, and read instructions; other instructions execute as usual. Executing an alloc instruction extends the set of locations with the allocated location. Figure 4.3 shows the pseudo-code for the functions specifying the execution of the write, call, ccall, and read instructions. The instructions call and ccall execute identically.
Global state: memory σ; DDG = ((V, L), (E, D)); tags α; queue Q

 1 write l, a
 2   if σ[l] ≠ a then
 3     σ[l] ← a
 4     Q ← Q ∪ {v | (l, v) ∈ D}

 5 u : call(ccall) f, a1, . . . , an
 6   v ← newVertex()
 7   V ← V ∪ {v}
 8   E ← E ∪ {(u, v)}
 9   ts ← advanceTime()
10   execute f(a1, . . . , an)
11   te ← advanceTime()
12   α(v) ← (f, (a1, . . . , an), (ts, te))

13 u : read l
14   D ← D ∪ {(l, u)}
15   return σ[l]
Figure 4.3: The pseudo-code for write, call, and read instructions.
In the pseudo-code, we assume that the parameters of the instructions are the contents of the registers. For the read and write instructions, we assume that the location being read or written is provided explicitly. Throughout, we denote locations by the letter l and its variants, and primitive values by the letter a and its variants. The notation u : ⟨instruction⟩ means that the function call corresponding to vertex u is executing the instruction.
The write instruction: The write instruction extends the memory σ and updates the active-call queue. Execution of "write l, a" first checks whether the value a being written to l is the same as σ[l], the value currently in memory, by comparing their bit patterns. If the two values have the same bit pattern, they are considered equal, and no action is taken. Otherwise, σ[l] is set to a and all function calls that read l are inserted into the active-call queue. Since each location can be written at most once during an execution, the active-call queue is always empty during an initial execution. During change propagation, however, the active-call queue becomes non-empty due to input changes or writes to memory locations.
The call and ccall instructions: The executed call and ccall instructions build the call tree and advance the time. Figure 4.3 shows the pseudo-code for these instructions. The vertex u denotes the function call executing the instruction. Execution of "call(ccall) f, a1, . . . , an" creates a new vertex v and extends V with v
(V = V ∪ {v}), inserts a call-tree edge from the caller u to v (E = E ∪ {(u, v)}), advances the time by creating two time stamps, one before and one after the execution of the function call, and extends α by mapping v to the triple consisting of the function being called f, the arguments, and the time interval (ts, te), i.e., α(v) = (f, (a1, . . . , an), (ts, te)).

The read instruction: The executed read instructions build the dependence edges. Execution of "read l" inserts a dependence edge from the location l to the vertex u of the function call that performs the read, i.e., D = D ∪ {(l, u)}.
4.4 Change Propagation
Self-adjusting computation hinges on a change-propagation algorithm for adjusting a program to external changes. The pseudo-code for change propagation is shown in Figure 4.4. The algorithm takes a set of changes χ that maps locations to new values. First, the algorithm changes the memory: for each pair (l, v) ∈ χ, the algorithm sets the value stored in σ[l] to v (line 19). Second, the algorithm inserts the vertices reading the changed locations into the priority queue (line 20). A vertex is called active during change propagation if the value of one or more of the locations that it reads has changed since the last execution. After these initializations, the algorithm proceeds to re-execute the active vertices via the propagateLoop function. The propagateLoop function re-executes the active vertices in the order of their start time stamps. Since the start time stamps correspond to the sequential execution order of the vertices, the vertices are re-executed in sequential execution order. Each iteration of the while loop processes an active vertex v. To process a vertex, the algorithm
1. rolls the virtual clock back to the start time ts of the function call,

2. identifies the vertices V − whose start time stamps are within the time interval of v, and deletes V − and all the edges incident to V − from the DDG by using the delete function,
3. determines the caller u of v and re-executes the function-call for v. Re-executing v inserts the call tree and the dependences obtained by the re-execution into the dynamic dependence graph.
The set of vertices V − deleted in the second step consists of the descendants of the re-executed vertex v, including v itself. As described in Section 4.2, this is because the time interval of a vertex u is contained in the time interval of a vertex v if and only if u is a descendant of v in the call tree (i.e., u is contained in v). Since time intervals are properly nested (two time intervals are either disjoint, or one falls within the other), checking that the start time stamps are within the interval of the vertex being re-executed suffices to determine containment. The second and third steps replace the dependences created by the previous execution of v with those obtained by the re-execution. The delete function takes a set of vertices and deletes all information pertaining to them. To this end, delete (1) deletes the vertices and their tags, (2) deletes the call edges adjacent to the vertices, (3) deletes the dependence edges leading into the vertices, (4) deletes the time stamps of the vertices, and (5) deletes the vertices from the queue. We refer to these vertices as deleted.
Global state: memory σ; DDG = ((V, L), (E, D)); tags α; queue Q

 1 delete(V −)
 2   E− ← {(u, v) ∈ E | u ∈ V − ∨ v ∈ V −}
 3   V ← V \ V −
 4   E ← E \ E−
 5   D ← D \ {(l, u) | u ∈ V − ∧ (l, u) ∈ D}
 6   α ← {(v, tag) | (v, tag) ∈ α ∧ v ∉ V −}
 7   ∀u ∈ V −. deleteTimeStamps(u)
 8   Q ← Q \ V −

 9 propagateLoop()
10   while Q ≠ ∅
11     v ← findMin(Q)
12     (f, a, (ts, te)) = α(v)
13     currentTime ← ts
14     V − ← {y | ts ≤ startTime(y) < te}
15     delete(V −)
16     u ← parent(v)
17     u : call f, a

18 propagate(χ)
19   σ ← (σ \ {(l, σ(l)) | (l, _) ∈ χ}) ∪ χ
20   Q ← {v | (l, _) ∈ χ, (l, v) ∈ D}
21   propagateLoop()
Figure 4.4: The change-propagation algorithm.
Re-executing active function calls in sequential-execution order, or in the order of their start time stamps, ensures that
1. values read from memory are up-to-date. For example, consider function-calls f and g such that f writes to a memory location that g reads. If f were not re-executed before g, then g may read a location that has not been updated.
2. function calls that do not belong to the computation are not re-executed. When a vertex is re-executed, all vertices that are descendants of that vertex are removed from the queue by the delete function. This is critical because the function calls corresponding to these vertices may not belong to the revised computation (due to conditionals). Since the descendants of a vertex have larger start time stamps than the vertex itself, re-executing vertices in the order of their start time stamps ensures that vertices that do not belong to the revised computation are deleted from the queue before they are re-executed.
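The pieces of Figures 4.3 and 4.4 can be assembled into a toy engine. The sketch below is ours and takes deliberate liberties: a Python list of stamps stands in for the order-maintenance structure (so clock operations are O(n) rather than O(1)), vertices are re-executed by re-calling stored closures, and labels and the call stack are omitted.

```python
class Engine:
    """Toy change-propagation engine in the spirit of Figures 4.3 and 4.4."""

    def __init__(self):
        self.stamps, self.current, self.ns = ["t0"], "t0", 0  # virtual clock
        self.memory = {}     # sigma: location -> value
        self.readers = {}    # D, indexed by location: l -> set of vertices
        self.calls = {}      # alpha: v -> (f, args, ts, te)
        self.live = set()    # V
        self.active = set()  # the queue Q
        self.nv = self.nl = 0

    def advance_time(self):
        self.ns += 1
        t = "t%d" % self.ns
        self.stamps.insert(self.stamps.index(self.current) + 1, t)
        self.current = t
        return t

    def before(self, t1, t2):
        return self.stamps.index(t1) < self.stamps.index(t2)

    def alloc(self):
        self.nl += 1
        return "l%d" % self.nl

    def write(self, l, a):                 # Figure 4.3, write
        if self.memory.get(l) != a:
            self.memory[l] = a
            self.active |= self.readers.get(l, set()) & self.live

    def read(self, v, l):                  # Figure 4.3, read
        self.readers.setdefault(l, set()).add(v)
        return self.memory[l]

    def call(self, f, *args):              # Figure 4.3, call/ccall
        self.nv += 1
        v = "v%d" % self.nv
        self.live.add(v)
        ts = self.advance_time()
        f(self, v, *args)
        te = self.advance_time()
        self.calls[v] = (f, args, ts, te)
        return v

    def propagate(self, changes):          # Figure 4.4
        for l, a in changes.items():
            self.write(l, a)
        while self.active:
            v = min(self.active,
                    key=lambda u: self.stamps.index(self.calls[u][2]))
            self.active.discard(v)
            f, args, ts, te = self.calls[v]
            # delete descendants: start stamps strictly inside (ts, te)
            doomed = [u for u in self.live
                      if self.before(ts, self.calls[u][2])
                      and self.before(self.calls[u][2], te)]
            for u in doomed:
                self.live.discard(u)
                self.active.discard(u)
                for rs in self.readers.values():
                    rs.discard(u)
                _, _, us, ue = self.calls.pop(u)
                self.stamps.remove(us)
                self.stamps.remove(ue)
            for rs in self.readers.values():
                rs.discard(v)              # drop v's old read edges
            self.current = ts              # roll the clock back
            f(self, v, *args)              # re-execute v in its interval

# Example: a two-stage computation b = 2a, c = 2b.
def double(eng, v, src, dst):
    eng.write(dst, 2 * eng.read(v, src))

def prog(eng, v, a, b, c):
    eng.call(double, a, b)
    eng.call(double, b, c)

eng = Engine()
a, b, c = eng.alloc(), eng.alloc(), eng.alloc()
eng.write(a, 3)
eng.call(prog, a, b, c)   # initial run
eng.propagate({a: 5})     # change the input and propagate
```

In the example, the initial run produces b = 6 and c = 12; propagating the change a = 5 re-executes only the two calls to double, in start-time order, yielding b = 10 and c = 20 without re-running prog.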
Figure 4.5: Dynamic dependence graphs before and after change propagation.
4.4.1 Example Change Propagation
Figure 4.5 shows an example of change propagation. The dynamic dependence graph for an execution of a hypothetical program on some input is shown on the left. The dynamic dependence graph after changing the input value 8 to 4 is shown on the right. Change propagation takes as input the DDG on the left and the external change, and yields the DDG on the right. Since the function call e reads the changed location, the change-propagation algorithm first re-executes e. Assume that this changes the value 5 in memory to 8 and the value 0 to 5. Since these locations are read by g and i, these vertices become active. Since g has the smaller start time (it comes earlier than i in a pre-order traversal), the algorithm re-executes g next. Since i is a descendant of g, re-execution of g deletes i and removes it from the active-call queue. Since there are no more active vertices, change propagation terminates. In this example, some of the vertices are not affected by the change. These vertices are drawn as dark circles. All deleted vertices are drawn as squares, and all fresh vertices, which are created during re-execution, are shown as empty circles. All data dependences created during change propagation are drawn as thick lines. A property of the change-propagation algorithm is that the DDG after change propagation is identical to the DDG that would have been produced by a from-scratch re-execution of the program on the changed input. We prove this claim in Chapter 11 by giving a semantics for change propagation and proving its correctness.

Chapter 5
Memoized Dynamic Dependence Graphs
Dynamic dependence graphs (DDGs) enable a change-propagation algorithm to adjust a computation to an external change by re-executing the function calls that are affected by the change. Since the change-propagation algorithm re-executes affected function calls from scratch, its running time depends critically on how expensive the re-executed function calls are. For certain classes of applications and external changes, change propagation on DDGs suffices to yield efficient updates. For most applications, however, it performs poorly, because a change often affects an expensive function call. It can be shown, for example, that for the quick sort and merge sort algorithms, an insertion or deletion of a key can take Θ(n log n) time. To address this problem, we combine memoization and DDGs by introducing memoized dynamic dependence graphs (MDDGs) and the memoized change-propagation algorithm. Memoized DDGs are annotated DDGs whose vertices can be re-used via memoization. The memoized change-propagation algorithm is an extension of the change-propagation algorithm that performs memo lookups before executing function calls. Self-adjusting computation relies on MDDGs and memoized change propagation for adjusting computations to external changes. The rest of this thesis concerns these techniques unless otherwise stated.

Memoized DDGs and memoized change propagation address the fundamental challenge of performing memoization under side effects. Since memoization relies on remembering and re-using results of function calls, the underlying computation must ordinarily be pure, i.e., the values stored in memory locations cannot be changed via side effects. This is critical because otherwise there is no effective way of knowing whether a result can be re-used.1 Self-adjusting computation, however, critically relies on side effects. Once a program is executed, any memory location can be changed, and furthermore, memory locations can be written (side-effected) during change propagation.
To provide result re-use under side effects, we rely on DDGs and change propagation. We allow a function call to be re-used only if its DDG is also available for re-use. When a function call is re-used, we perform a change propagation on the DDG of that call. This change propagation adjusts the function call to the side effects that took place since the call was first computed. Memoized DDGs and memoized change-propagation have the following properties.
• Side effects: Memoized DDGs enable re-use of function calls under side effects to memory.
1In principle, all memory locations that the function call may read can be checked, but this is prohibitively expensive both in theory and in practice. For example, checking if a function that computes the length of a list can be re-used takes linear time.
Figure 5.1: Dynamic dependence graphs before and after change propagation.
• Equality checks: To check if a function call can be re-used, it suffices to use shallow equality.
• Efficient Re-use: Memoized change-propagation supports efficient result re-use.
• Efficiency: Memoized DDGs require O(1) overhead and the same space (asymptotically) as DDGs.
Since memoized DDGs enable adjusting function calls to side effects, shallow equality can be used to determine when function calls can be re-used. Shallow equality deems two locations equal if they have the same address, regardless of the values stored in them. Note that, due to side effects, a location can contain different values at different times. The key property of shallow equality is that it requires constant time. For example, two lists are equal as long as they reside at the same location, regardless of their contents. A key property of memoized DDGs is that they enable efficient re-use of computations and are amenable to asymptotic analysis. In Chapter 6 and Part II, we present techniques for analyzing the performance of change propagation. In Section 17.3, we apply trace stability to a number of applications and show complexity bounds that closely match the best known bounds for these applications. Memoized DDGs can be implemented efficiently in terms of both time and space: the technique introduces O(1) overhead over dynamic dependence graphs. Chapter 6 proves these efficiency claims.
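The constant-time nature of shallow equality can be seen in a small Python sketch, where a hypothetical Loc box stands in for a memory location and object identity stands in for address equality:

```python
class Loc:
    """A hypothetical stand-in for a memory location (a mutable box)."""
    def __init__(self, value):
        self.value = value

def shallow_eq(l1, l2):
    # Two locations are equal iff they are the same location (same
    # address); the stored values are never inspected, so the check is
    # O(1) even if the locations hold long lists.
    return l1 is l2

a = Loc([1, 2, 3])
b = a                 # the same location
c = Loc([1, 2, 3])    # a different location with equal contents
a.value = [9]         # a side effect changes the contents of a (and b)
```

Note that a and b remain shallow-equal after the side effect, while a and c never are, exactly because contents play no role in the comparison.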
5.1 Limitations of Dynamic Dependence Graphs
As an example of how an external change can require an expensive change propagation, consider the DDGs of a hypothetical program shown in Figure 5.1. Assume for this example that the labels of vertices denote the names of the functions being called, and that each function takes the location that it reads as its argument. If we are given the DDG on the left, then we can obtain the DDG on the right by changing the appropriate memory locations and running the change-propagation algorithm. Since the root labeled a reads one of the changed locations, the change-propagation algorithm will re-execute a after deleting the tree rooted at a (the whole tree). Change propagation will therefore recompute the whole computation from scratch. This example demonstrates a typical problem that arises when using DDGs. Especially when functions are combined, a simple change to the input can propagate to changes that affect expensive function calls. It can be shown, for example, that the merge sort and quick sort algorithms both perform no better than a from-scratch execution for nearly all insertions and deletions.

Figure 5.2: Call trees in change propagation.
5.2 Memoized Dynamic Dependence Graphs
Consider the example in Figure 5.1, and note that change propagation can be performed more effectively if the function calls b, c, d, f, j, which remain unaffected by the change, can be re-used. These calls are not affected because the memory locations that they read have not been changed. Memoized DDGs enable such re-use. Change propagation with (ordinary) DDGs maintains a queue of affected function calls and re-executes these calls. Before re-executing a call, the algorithm deletes the subtree rooted at that call. In the example shown in Figure 5.1, the algorithm deletes the whole tree before re-executing a. The deletion of the subtree is critical for correctness because the function a can now take an entirely different execution path on the new value read from memory. Often, however, re-executing a function call on a slightly different input performs a computation similar to the previous computation. In the example (Figure 5.1), the subtrees rooted at a are quite similar. The idea behind memoized DDGs is to enable vertices of the deleted subtree to be re-used. The memoized change-propagation algorithm maintains a queue of affected function calls and re-executes them. When re-executing a function call, the algorithm does not delete the subtree. Instead, every time a function call is executed, it checks whether the call is already in the subtree. If so, the call is re-used and change propagation is performed on the subtree of the call recursively. This change propagation ensures that the re-used call is adjusted to the side effects appropriately. As an example, consider the memoized dynamic dependence graphs shown in Figure 5.1. Suppose we are given the memoized DDG on the left and suppose that the memory location containing the value 1 is changed to 0. Since this change only affects a, we re-execute a. Figure 5.2 shows the call tree being built during change propagation. During re-execution, suppose that a changes the value 8 to 4 and makes a call to function b.
Changing 8 to 4 inserts e into the queue of calls to be re-executed. Since the call to b is already in the subtree being re-executed, it is re-used, and a change propagation is performed on the subtree rooted at b recursively. During this change propagation, the function e is re-executed. Suppose that this function changes the value 5 in memory to 8, inserts g into the call queue, and performs calls to k and f. Since f is in the subtree being re-executed (the subtree rooted at e), it is re-used. This completes the recursive change propagation on the subtree rooted at b. Note that the function call b is re-used even though it needs to be adjusted to changes. Control now passes to the function call a that was being re-executed. Now a calls g, which is found in the subtree rooted at a, and a change propagation on g takes place. This change propagation re-executes g. Re-executing g calls j, and that call is also re-used. Since there are no more function calls in the call queue, change propagation terminates. Memoized change propagation creates the complete memoized DDG for a computation as a mixture of “new” and “old” function calls. The new function calls are those that take place during change propagation, and the old function calls are those that are re-used from the memoized DDG of the computation. In Figure 5.2, the new calls are drawn on the right, and the old calls are drawn on the left to emphasize the distinction. The calls of the (old) memoized DDG that are not re-used are deleted. Figure 5.2 shows these calls with dashed lines.
5.3 Constructing Memoized Dynamic Dependence Graphs.
The memoized DDG of an execution is built by a run-time system and is adjusted to external changes by a change propagation algorithm. To enable the construction and change propagation, the system globally maintains the following:
1. a virtual clock,
2. the DDG being built, denoted DDG = ((V,L), (E,D)),
3. the tags α for function calls,
4. the memory σ, and
5. an active-call queue Q, consisting of function calls, and
6. a memo finger Φ, which is the last time stamp of the current memo window.
The only difference between this global information and the information maintained in the original construction (Chapter 4) is the memo finger Φ. The memo finger and the current time stamp together define the current memo window, which is the interval [currentTime, Φ]. The current memo window is used to determine whether a function call can be re-used: a function call is re-used in place of another call if the two calls are equal and the interval of the function call is within the current memo window. We say that an interval [t1, t2] is within another interval [t3, t4], written [t1, t2] ⊏ [t3, t4], if t1 is greater than t3 and t2 is less than t4. The run-time system initializes V, E, D, Q to the empty set, initializes the virtual clock, and sets the memo finger to the initial time stamp created. The input to the computation determines the set of locations L and the memory σ. As in the original construction, the dependence graph and the active-call queue are built by the alloc, write, ccall, call, and read instructions. Executing an alloc instruction extends the set of locations with the allocated location. Figure 5.3 shows the pseudo-code for the write, ccall, call, and read instructions. The algorithms for the write, read, and ccall instructions remain the same as in the original construction; the call instruction is different. When writing pseudo-code, we use the v : ⟨instruction⟩ notation, where the vertex v is the function call executing the instruction.
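The within relation on intervals can be sketched directly from its definition, with integers standing in for time stamps (note the strict comparisons on both ends):

```python
def within(inner, outer):
    """[t1, t2] is within [t3, t4] iff t1 is greater than t3 and t2 is
    less than t4 (both comparisons strict)."""
    (t1, t2), (t3, t4) = inner, outer
    return t3 < t1 and t2 < t4

# A call with interval (3, 5) is reusable in memo window (2, 9):
reusable = within((3, 5), (2, 9))
```

Because the check compares only two pairs of time stamps, it takes constant time given constant-time time-stamp comparisons.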
Memory = σ    DDG = ((V, L), (E, D))    Tags = α    Queue = Q    Memo Finger = Φ

 1  write l, a
 2    if σ[l] ≠ a then
 3      σ[l] ← a
 4      Q ← Q ∪ {v | (l, v) ∈ D}

 5  v : read l
 6    D ← D ∪ {(l, v)}
 7    return σ[l]

 8  u : ccall f, a1, . . . , an
 9    v ← newVertex()
10    V ← V ∪ {v}
11    E ← E ∪ {(u, v)}
12    ts ← advanceTime()
13    execute f(a1, . . . , an)
14    te ← advanceTime()
15    α(v) ← (f, (a1, . . . , an), (r1, . . . , rn), [ts, te])

16  u : call f, a1, . . . , an
17    if (∃v ∈ V. fun(v) = f ∧ (a1, . . . , an) = args(v) ∧
18        timeInterval(v) ⊏ [currentTime, Φ])
19    then (* Memo match *)
20      (f, _, (a′1, . . . , a′n), [ts, te]) = α(v)
21      E ← E ∪ {(u, v)}
22      V− ← {y ∈ V | currentTime < startTime(y) < ts}
23      delete (V−)
24      propagateLoop (te)
25      currentTime ← te
26      r1 ← a′1 . . . rn ← a′n
27    else (* No memo match *)
28      v ← newVertex()
29      V ← V ∪ {v}
30      E ← E ∪ {(u, v)}
31      ts ← advanceTime()
32      execute f(r1, . . . , rn)
33      te ← advanceTime()
34      α(v) ← (f, (a1, . . . , an), (r1, . . . , rn), [ts, te])
Figure 5.3: The pseudo-code for write, read, ccall, and call instructions.
The executed ccall and call instructions build the call tree and advance the time. Figure 5.3 shows the pseudo-code for these instructions. The algorithm for ccall remains the same as in the original construction, but call is extended to perform memo lookups. Execution of call f, r1, . . . , rn first checks whether there is a matching call to f with the current arguments that lies within the current memo window. The algorithm then proceeds based on the outcome of the memo lookup.

No memo match: If no matching result is found within the current memo window, then the algorithm proceeds as usual. It extends V with a new vertex v, inserts a call-tree edge from the caller u to v (E ← E ∪ {(u, v)}), advances the time, and performs the function call. When the call completes, v is tagged with the tuple consisting of the function being called, f, the arguments, (r1, . . . , rn), the result currently in the registers, and the time interval for the call.
Memo match: There is a matching result if there exists some v ∈ V such that fun(v) = f, (a1, . . . , an) = args(v), and the time interval of v lies within the current memo window. In this case, the algorithm inserts an edge from the caller u to v (E ← E ∪ {(u, v)}); this connects the caller to the call tree of the re-used call. The algorithm then determines the set V− of vertices whose start times are within the interval defined by the current time and ts (the first time stamp of v), and removes V− using the delete subroutine. As in the original construction, the delete subroutine deletes the subgraph of the DDG induced by V−. The pseudo-code for delete is shown in Figure 4.4. Since equality of arguments is checked using shallow equality, and since shallow equality deems two locations equal if their addresses are the same (even if their contents differ), the algorithm can decide to re-use a function call that is not identical to the call being performed. The algorithm therefore performs change propagation until te, the last time stamp of the call. This change propagation ensures correctness by adjusting the function call to the mutations that have been performed since the last execution of the call. The algorithm completes by recovering the contents of the registers to those at the end of the re-used function call.
An important point of the memo lookup mechanism is that it takes place with respect to the current memo window defined by the interval [currentTime, Φ]. A function call with interval [ts, te] can be re-used only if its interval lies within the current memo window, i.e., [ts, te] ⊏ [currentTime, Φ] (line 17 in Figure 5.3). The memo finger is initialized to the base time stamp by the run-time system and can be changed only by the change-propagation algorithm. A key property of the technique is that function calls are only re-used during change propagation. This is ensured by setting the memo finger to the initial time stamp. As we will see in the next section, the change-propagation algorithm sets the memo finger to the last time stamp of the call currently being re-executed. This makes all calls that are in the subtree of the call currently being re-executed available for re-use. When a function call is found in the memo, the call subroutine removes all function calls whose start times are between the current time and the beginning time of the function call being re-used (line 22 in Figure 5.3). This effectively slides the window past the function call being re-used. Sliding the window ensures that no function call prior to this call is re-used. Note also that the function call itself can never be re-used again.
5.4 Memoized Change Propagation
The pseudo-code for memoized change propagation is shown in Figure 5.4. The algorithm takes a set of changes χ that maps locations to values. First, the algorithm changes the memory: for each pair (l, v) ∈ χ, the algorithm sets σ[l] to v (line 11). Second, the algorithm inserts the vertices reading the changed locations into the priority queue (line 12). A vertex is called active during change propagation if it reads a location that has changed since the last execution. After the initialization, the algorithm proceeds to re-execute all active vertices by calling propagateLoop.

Memory = σ    DDG = ((V, L), (E, D))    Tags = α    Queue = Q    Memo Finger = Φ

    propagateLoop (t)
 1    while Q ≠ ∅
 2      v ← findMin(Q)
 3      (f, a, r, [ts, te]) = α(v)
 4      if ts ≤ t then
 5        Φ ← te
 6        currentTime ← ts
 7        u ← parent(v)
 8        u : call f, r
 9        V− ← {y ∈ V | currentTime ≤ startTime(y) < te}
10        delete (V−)

    propagate (χ)
11    σ ← (σ \ {(l, σ(l)) | (l, _) ∈ χ}) ∪ χ
12    Q ← {v | (l, _) ∈ χ, (l, v) ∈ D}
13    propagateLoop (t∞)

Figure 5.4: The change-propagation algorithm.

The propagateLoop subroutine takes a time stamp t and performs change propagation until t by re-executing the active vertices before t. The change-propagation algorithm calls propagateLoop with the largest time stamp t∞; the subroutine is called with other time stamps when a memo match occurs (line 24 in Figure 5.3). The subroutine re-executes the active vertices in order of their start time stamps. Since the start time stamps correspond to the sequential execution order of the vertices, this mimics the sequential execution order. Each iteration of the while loop processes an active vertex v. If the start time of v is greater than t, then the algorithm returns. Otherwise, the algorithm
1. sets the memo finger to the end time te of v,

2. rolls the virtual clock back to the start time ts of the function call,

3. determines the caller u of v and re-executes the function call for v, and

4. determines the function calls remaining in the current memo window, and deletes them by calling the subroutine delete.

Note that re-execution of v adds the call tree for the re-execution, together with its dependences, into the dynamic dependence graph.
The delete subroutine takes a set of vertices V− and removes all call edges induced by the vertices in V−, removes all dependence edges leading into the vertices in V−, deletes the time stamps of the vertices in V−, and deletes each vertex in V− from the queue. The code for delete (Figure 4.4) remains the same as in the original change-propagation algorithm. The key difference between the memoized change-propagation algorithm and the original (non-memoized) algorithm from Section 4.4 is the use of delete. Before re-executing some function call v, the non-memoized algorithm deletes all vertices contained in the interval of v from the graph by calling the delete subroutine. This effectively deletes all descendants of v from the DDG. The memoized change-propagation algorithm instead makes the descendants of the re-executed function available for re-use by setting the memo finger to the end time stamp of v. When the re-execution of a vertex completes, the algorithm deletes only those vertices that have not been re-used. Since the call subroutine slides the window past the re-used vertices by deleting all vertices that start between the current time and the re-used vertex (line 22 in Figure 5.3), the vertices that have not been re-used during re-execution have start times between the current time and te. A relatively small difference between the memoized and non-memoized change-propagation algorithms is that, in memoized propagation, the propagateLoop subroutine now takes a time stamp and performs change propagation only up to that time stamp. This is important to enable the call instruction to adjust a re-used function call by performing change propagation on the re-used call (until the re-used call is updated and no further).
This ensures that the memoized change-propagation algorithm simulates a complete sequential execution by sweeping the execution time-line from beginning to end, skipping over the unaffected parts and rebuilding the affected parts of the execution.
5.5 Discussions
Memoized DDGs and the change-propagation algorithm described here have two key limitations. First, re-use of function calls takes place only during change propagation. Second, when a call is re-used, all calls that were executed between the current time and the re-used call are deleted (and cannot be re-used). The first limitation ensures that the subgraph of a memoized DDG induced by the call edges is a tree. In other words, there is no sharing of function calls within a computation. It is conceivable that memoized DDGs can be extended to support sharing. With this extension, the call graph would take the form of a directed acyclic graph (DAG) instead of a tree. The second limitation ensures that function calls are re-used in the same order as they were originally executed. This is critical to ensure that the time stamps of re-used function calls respect their existing ordering as defined by the virtual clock. It is possible to eliminate this limitation by using a virtual-clock data structure that allows time stamps to be exchanged (swapped). Such a data structure can be implemented in logarithmic time based on balanced trees, but it is likely difficult, if not impossible, to achieve this in constant time.

Chapter 6
Data Structures and Analysis
This chapter presents efficient data structures for memoized dynamic dependence graphs and memoized change propagation. The data structures assume the standard RAM model of computation. Based on these data structures, we present complexity bounds on the overhead of constructing memoized DDGs and on the time for memoized change propagation. This chapter concerns only memoized DDGs and memoized change propagation unless otherwise stated. The term “memoized” is often omitted for brevity.
6.1 Data Structures
This section describes the data structures for representing virtual clocks, memoized dynamic dependence graphs, and priority queues. We represent memoized dynamic dependence graphs by a combination of dependence graphs and memo tables. The memo tables map the function calls of dynamic dependence graphs to their results.
6.1.1 The Virtual Clock
Section 4.2 describes the virtual-clock data structure required for (memoized) dynamic dependence graphs and (memoized) change propagation. In addition to operations for creating, deleting, and comparing time stamps, the data structure supports a next operation for retrieving the time stamp that comes immediately after a given time stamp. These operations can be supported efficiently by using the order maintenance data structure of Dietz and Sleator [30]. Order-maintenance data structures support creating, deleting, and comparing time stamps in constant time. The Dietz-Sleator order-maintenance data structures can be easily extended to support the next operation in constant time.
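The interface the virtual clock needs can be sketched with a naive Python list standing in for the Dietz–Sleator structure. Unlike the real structure, every operation here is O(n); the sketch only shows the operations (insert-after, compare, next, delete), not the constant-time implementation:

```python
class VirtualClock:
    """Naive O(n) stand-in for an order-maintenance structure.
    A time stamp is an opaque token; its position in the internal
    list determines its place in the total order."""
    def __init__(self):
        self._order = []
        self.base = self.insert_after(None)   # the initial (base) time stamp

    def insert_after(self, stamp):
        new = object()
        if stamp is None:
            self._order.append(new)
        else:
            self._order.insert(self._order.index(stamp) + 1, new)
        return new

    def compare(self, s, t):
        # negative if s comes before t, positive if after, zero if equal
        return self._order.index(s) - self._order.index(t)

    def next(self, s):
        # the time stamp immediately after s, or None at the end
        i = self._order.index(s) + 1
        return self._order[i] if i < len(self._order) else None

    def delete(self, s):
        self._order.remove(s)
```

The Dietz–Sleator structure supports the same interface with constant-time operations; the next operation here corresponds to the extension mentioned above.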
6.1.2 Dynamic Dependence Graphs
To represent dynamic dependence graphs, we maintain a set of locations and a set of vertices for the function calls. Data-dependence (read) edges between locations and vertices are represented by maintaining a list of pointers from each location to the vertices that read that location. Since the change-propagation algorithm
relies on the time stamps to determine the containment relation between vertices, we do not represent the call edges explicitly. The call instruction and the change-propagation algorithm both rely on the ability to determine the set of vertices whose start times are within a particular interval. To support this operation, we maintain at each time stamp t a pointer to the vertex for which t is the start time. For a specified interval, we traverse the interval from start to end using the next operation of the virtual clock and collect the vertices pointed to by the time stamps.
6.1.3 Memo Tables
We use hash tables to represent memo tables. Each function is associated with its own memo table. Each memo-table entry corresponds to a function call and maps the arguments to the result and the time interval for that call. Since a function can be called with the same arguments more than once, there can be multiple entries for the same call; these entries have different time intervals. A memo lookup first performs a usual hash-table lookup to determine the set of time intervals for that call, and then selects the earliest interval that is within the current memo window. Checking that an interval is contained within the current memo window requires comparing the start and end time stamps of the interval to those of the current memo window, and therefore takes constant time. When a vertex is deleted during change propagation, the memo-table entry for that call is deleted. This ensures that the number of memo-table entries is never more than the number of vertices in the dynamic dependence graph. Memo tables therefore add a constant-factor space overhead to dynamic dependence graphs. If each function is called with the same arguments no more than a constant number of times, then all memo-table operations can be implemented with expected constant-time overhead using standard hashing techniques [66, 24]. This assumption holds for the classes of programs that we consider. In this thesis, we assume that memo-table operations take constant time.
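A per-function memo table can be sketched as a Python dict keyed by the argument tuple, holding one (interval, result) entry per distinct call; lookup filters by the current memo window and picks the earliest interval. The names are hypothetical, and integers stand in for time stamps:

```python
memo = {}   # args -> list of ((ts, te), result) entries

def memo_insert(args, interval, result):
    """Record one call: a function can have several entries for the
    same arguments, distinguished by their time intervals."""
    memo.setdefault(args, []).append((interval, result))

def memo_lookup(args, window):
    """Return the earliest (interval, result) whose interval lies
    strictly within the current memo window, or None."""
    lo, hi = window
    hits = [(iv, r) for (iv, r) in memo.get(args, [])
            if lo < iv[0] and iv[1] < hi]
    return min(hits, key=lambda entry: entry[0], default=None)
```

The filter and the containment check cost constant time per candidate interval, matching the constant-time claim above when the number of entries per call is constant.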
6.1.4 Priority Queues
The change-propagation algorithm requires a priority-queue data structure that supports insert, remove, and find-minimum operations. For the analysis, we assume a priority-queue data structure that supports the find-minimum operation in constant time, and all other operations in p(·) time as a function of the size of the priority queue. We assume that the same vertex is never inserted into the priority queue multiple times. This is easily achieved by maintaining a flag for each vertex that indicates whether the vertex is in the priority queue or not.
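The no-duplicates priority queue can be sketched with Python's heapq plus an in-queue set acting as the per-vertex flag (the names are hypothetical; vertices are keyed by start time):

```python
import heapq

class VertexQueue:
    """Priority queue of vertices ordered by start time; the _inq set
    plays the role of the per-vertex flag, so inserting a vertex that
    is already queued is a no-op."""
    def __init__(self):
        self._heap = []
        self._inq = set()

    def insert(self, start_time, v):
        if v not in self._inq:          # flag check: no duplicates
            self._inq.add(v)
            heapq.heappush(self._heap, (start_time, v))

    def find_min(self):
        return self._heap[0][1] if self._heap else None

    def remove_min(self):
        _, v = heapq.heappop(self._heap)
        self._inq.discard(v)
        return v
```

With a binary heap, find-minimum is O(1) while insert and remove-minimum are O(log n), i.e., p(n) = O(log n) in the notation above.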
6.2 Analysis
We show that the (memoized) DDG of a program can be constructed with constant-time overhead and present time bounds for change propagation.
We first show a theorem that bounds the overhead of constructing (memoized) DDGs during a from-scratch execution of a program. We refer to such an execution as the initial execution.

Theorem 2 (Overhead for Constructing (Memoized) DDGs)
A self-adjusting program can be executed on the standard RAM model while constructing its (memoized) DDG with constant-time overhead.
Proof: We show that each closure-machine instruction can be simulated on the standard RAM model with constant-time overhead. The register-to-register instructions of the closure model can be performed directly on the standard RAM model with no overhead. The non-labeled memory-allocation instruction alloc can be performed in constant time using standard allocation techniques. The labeled memory allocation alloc can be performed in constant time by memoizing the standard allocation instruction. The read instruction creates a pointer from the location being read to the vertex that reads that location. This requires constant time. The write instruction changes the value at a given location. Since each location is written at most once during an initial execution, and the read of a location can take place only after that location is written, a location has no reads when it is written. Executing a write instruction therefore takes constant time. The ccall instruction creates a vertex and two time stamps and executes the function call. Since creating a vertex and a time stamp takes constant time, the overhead of the ccall instruction is constant. Since the memo window is empty during an initial execution, no memo matches take place. The overhead for a call instruction is therefore constant. Note that in an initial execution, the call instructions with memoized and non-memoized DDGs execute identically. We conclude that the (memoized) DDG of a program can be constructed with constant-time overhead.
We now present a bound on the time for memoized change propagation. For brevity, we refer to memoized change propagation simply as change propagation. For the analysis, we distinguish between a few types of vertices.

Definition 3 (Active, Deleted, and Fresh Vertices)
Consider executing change propagation on some program and let (V, E) and (V′, E′) denote the function-call trees before and after the change-propagation algorithm. We call certain vertices of V and V′ active, deleted, or fresh as follows.

Active: A vertex v ∈ V is called active if it is inserted into the priority queue during change propagation (line 4 in Figure 5.3, or line 12 in Figure 5.4).

Deleted: A vertex v ∈ V is said to be deleted if it is deleted from the call tree by the delete subroutine (Figure 4.4).

Fresh: A vertex v ∈ V′ is called fresh if it is created by a function call during change propagation (lines 28 and 9 in Figure 5.3).
Lemma 4 (Active Vertices) All active vertices are deleted.
Proof: By definition, an active vertex is inserted into the priority queue. Since change propagation continues until the priority queue becomes empty, and since only the delete subroutine removes vertices from the queue, all active vertices are deleted.
For the analysis, we assume a priority-queue data structure that supports find-minimum operations in constant time, and all other operations in p(·) time in the size of the priority queue. We analyze the time for priority-queue operations separately from the time spent for change propagation. We charge the cost of the find-minimum operations to the change-propagation algorithm. To account for all other operations, we define the priority-queue overhead as the total time spent by the remove and insert operations.
Lemma 5 (Priority-Queue Overhead) Let NA be the number of active vertices during change propagation. The priority-queue overhead is O(NA · p(m)), where m is the maximum size of the base priority queue during change propagation.
Proof: Consider the first time an active vertex v is inserted into the priority queue. Since the change-propagation algorithm ensures that the priority queue consists of distinct vertices, v will not be inserted into the priority queue again until it is removed. When removed from the priority queue, v is either discarded or re-executed. If discarded, v will never be inserted into the priority queue again. Suppose now that v is re-executed. After v starts re-executing, no memory location that is read by v can be written to, because otherwise v would be reading a location that has not been written, or a location would be written more than once. Therefore, v is inserted into the priority queue once, and removed once. We conclude that the priority-queue overhead is bounded by O(NA · p(m)), where m is the maximum size of the priority queue during change propagation.
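The invariant used in this proof, that the queue holds distinct vertices so each active vertex is inserted and removed exactly once, can be sketched as a binary heap paired with a membership set. This is illustrative code, not the thesis's data structure:

```python
import heapq

class VertexQueue:
    """Min-priority queue keyed by time stamp that ignores duplicate inserts,
    so the queue always contains distinct vertices."""
    def __init__(self):
        self.heap = []
        self.members = set()

    def insert(self, vertex, timestamp):
        if vertex in self.members:      # already queued: no second copy
            return
        self.members.add(vertex)
        heapq.heappush(self.heap, (timestamp, vertex))

    def remove_min(self):
        timestamp, vertex = heapq.heappop(self.heap)
        self.members.remove(vertex)     # vertex may be re-inserted later
        return vertex

q = VertexQueue()
q.insert("v1", 3); q.insert("v2", 1); q.insert("v1", 3)
assert len(q.heap) == 2               # duplicate insert was ignored
assert q.remove_min() == "v2"         # earliest time stamp first
```

With a binary heap, insert and remove cost p(m) = O(log m), which is the standard instantiation of the p(·) bound above.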
The analysis bounds the time for change propagation in terms of the number of active vertices and the total execution time of all fresh and deleted vertices. In our bounds, we disregard the cost of initializing the active-call queue. Since initializations are performed only at the beginning of change propagation, it is straightforward to generalize the theorems to take the initialization cost into account. The analysis assumes that the number of read edges going into a vertex is proportional to the execution time of that vertex. This is a conservative assumption, because the number of read edges can be less than the execution time. It is therefore possible to tighten the analysis by considering the read edges separately. For all applications we consider, however, this bound is tight. We define the weight of a vertex as follows.
Definition 6 (Weight of a Vertex) The weight of a vertex v, denoted w(v), is the execution time of the corresponding function call, excluding the time spent by the callees of the function.
Theorem 7 (Change Propagation) Consider executing change propagation on some program and let (V, E) and (V′, E′) denote the function-call trees before and after the change-propagation algorithm. Let
1. NA be the number of active vertices,
2. VD ⊆ V be the set of deleted vertices and WD be their total weight, WD = Σ_{v ∈ VD} w(v),
3. VF ⊆ V′ be the set of fresh vertices and WF be their total weight, WF = Σ_{v ∈ VF} w(v),
4. p(·) be the time for a priority-queue operation, and
5. m be the maximum size of the priority queue during change propagation.
The time for change propagation is
1. O (WD + WF + NA · p(m)), and
2. amortized O (WF + NA · p(m)).
Furthermore, m ≤ NA ≤ WD.
Proof: Considering the change-propagation algorithm, we separate the total time for change propagation into the following disjoint items:
I. the cost of finding the vertices to be deleted,
II. the cost of delete,
III. the cost for the iteration of the main loop of propagateLoop,
IV. the cost of calls to ccall subroutine,
V. the cost of calls to call subroutine,
VI. the cost of finding the active vertices, and
VII. the priority queue overhead.
As described in Section 6.1.2, the vertices to be deleted are found by traversing the time interval that contains them. This takes time proportional to the number of deleted vertices. Item I is therefore O(WD). Since a vertex can be deleted in constant time, and the number of dependences leading into a vertex is bounded by its execution time, Item II, the cost of delete operations, is bounded by the total weight of the deleted vertices, O(WD). This cost excludes the priority-queue operations, which will be accounted for cumulatively. Item III, the cost of the iteration of the change-propagation loop, is bounded by the number of active vertices. Since each active vertex is deleted by Lemma 4, the cost for the main loop is O(WD). The ccall subroutine creates a vertex and the time stamps for the function being executed, and executes the function call. Since vertices and time stamps are created in constant time, and each ccall creates a fresh vertex whose weight is equal to the execution time of the function call being executed, the total time for ccall instructions is bounded by O(WF). To analyze the cost for call instructions, we divide the cost into three disjoint items:
1. the cost for executing the function being called,
2. the cost for memo lookups, and
3. the cost for finding and deleting the vertices prior to the re-used vertex.
Item 1 is the cost of performing an ordinary non-memoized call and is incurred only when a memo miss occurs. Since each call that misses the memo creates a fresh vertex, Item 1 is bounded by O(WF). To bound Item 2, we consider the memo lookups that miss and those that hit separately. The cost for the memo lookups that miss is bounded by O(WF), because each miss performs a fresh call, and memo lookups take constant time. Consider now a function call that is found in the memo. If the caller of this function is a fresh call, then the cost of performing the memo lookup, which is constant, is charged to the caller. The cost for this case is therefore bounded by expected O(WF). If the caller is the change-propagation algorithm itself, then this is an active call. Since each active call is deleted, the cost for performing the memo lookup is charged to the cost of deleting that vertex. The cost for this case is therefore O(WD). Item 3 is bounded by O(WD) as discussed previously (Items I and II). To bound Item VI, the time for finding the active vertices (line 4), note that the cost of finding the active vertices is proportional to the total number of dependences leading into them. Since this is no more than the execution time of the active vertices, and all active vertices are deleted, Item VI is bounded by O(WD). We bound Item VII, the time for the priority-queue operations (line 2 in Figure 5.4 and line 8 in Figure 4.4), over the complete execution of change propagation. By Lemma 5, the overhead of the base priority-queue operations is O(NA · p(m)). Since the time for find-minimum operations is constant, the cost of these operations can be charged to the cost of executing the main loop. Since the priority queue contains distinct vertices, the size of the priority queue is bounded by NA, i.e., m ≤ NA. By Lemma 4, each active vertex is deleted and therefore NA ≤ WD. The total time for change propagation is therefore bounded by O(WD + WF + NA · p(m)).
Since a vertex can be deleted exactly once through a sequence of change propagations, the cost of deleting a vertex can be charged to the execution that created the vertex. This yields the amortized bound O(WF + NA · p(m)).

Part II
Trace Stability
Introduction
This chapter presents an analysis technique, called trace stability, for bounding the time for memoized change propagation under a class of input changes. Since the memoized change-propagation algorithm is more powerful than the non-memoized change-propagation algorithm, the rest of this thesis considers only the memoized change-propagation algorithm. For brevity, the term “change propagation” refers to “memoized change propagation” unless otherwise stated. Trace stability applies to a large class of computations, called monotone computations, and does not require that programs be written in the normal form. Most programs are either naturally monotone or are easily made monotone by small changes to the program. For example, all applications that we consider in this thesis are monotone. The techniques can be generalized to arbitrary, non-monotone programs by incurring an additional logarithmic-factor slowdown. Trace stability relies on representing a computation with a trace consisting of the function calls performed in that computation. The distance between two computations is measured as the symmetric set difference between their sets of function calls. The chapter shows that, for two computations, the change-propagation algorithm transforms one computation into the other in time proportional to the distance between the traces of the computations. More precisely, if the distance between the traces T and T′ of the computations is d, then the change-propagation algorithm requires O(d · p(m)) time to transform one computation into the other, where p(m) is the priority-queue overhead during change propagation with m ≤ d. Using a standard priority queue, this yields an O(d log d) bound on change propagation. Chapter 7 makes precise the notion of traces and proves this bound. Chapter 8 generalizes this bound to a class of input changes. To present tight asymptotic bounds for change propagation, it is often necessary to bound the priority-queue overhead carefully.
Chapter 8 presents two techniques for this. We introduce the notion of dependence width between two computations, and show that the dependence width gives a bound on the size of the priority queue when transforming one computation to the other. By analyzing the dependence width, it is often possible to show that the size of the priority queue, and therefore the priority-queue overhead, is constant. As a second technique, we define a class of computations, called read-write regular, for which a constant-time FIFO queue extended with priorities, called a FIFOP queue, can be used during change propagation. Since the data structure requires constant time per operation, the priority-queue overhead for read-write regular computations is bounded by a constant.
Chapter 7
Traces and Trace Distance
This chapter defines the notions of traces and trace distance, and presents time bounds for memoized change propagation in terms of the distance between traces. We define the trace of a computation as the function call tree of the computation annotated with certain information. For two traces that satisfy certain requirements, we define the distance based on the symmetric set difference of the set of function calls of the traces.
7.1 Traces, Cognates, and Trace Distance
We define the trace of an execution (or a computation) as the ordered function-call tree of the execution, where each function-call is tagged with certain information. Each vertex of a trace represents a function call and each edge represents the caller-callee relationship between the parent and the child.
Definition 8 (Trace) A trace is an ordered tree representing the function-call tree of an execution. The children of a vertex are ordered in their execution order. Every vertex is annotated with a sequence of values that together constitute a tag. The tag of a vertex v consists of
1. the function being called, denoted fun(v),
2. the arguments (a1, . . . , ak) to the function call, denoted args(v),
3. the tuple (a1, . . . , am) of values read in the body of the function (the values extracted, not the locations being read), denoted reads(v),
4. the tuple (a1, . . . , an) of values returned to the function from its callees, denoted returns(v), and
5. the weight of the function, which is equal to its execution time (not including the running time of its callees), denoted w(v).
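Definition 8 can be read as a record type. The sketch below mirrors the tags fun(v), args(v), reads(v), returns(v), and w(v); the class and field names are illustrative, not the thesis's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class TraceVertex:
    fun: str                      # fun(v): the function being called
    args: tuple                   # args(v): arguments to the call
    reads: tuple                  # reads(v): values read in the body
    returns: tuple                # returns(v): values returned by callees
    weight: int                   # w(v): execution time, excluding callees
    children: list = field(default_factory=list)  # ordered by execution

# The root of a trace in the style of Figure 7.1: a called with
# argument 1, reading 2, with returns 3 and 4 from its callees.
root = TraceVertex("a", (1,), (2,), (3, 4), 1)
assert root.fun == "a" and root.args == (1,)
```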
Figure 7.1 shows two traces of two hypothetical executions. Each vertex is labeled with the function call and the arguments, the reads, and the returns. For example, the root represents the call to function a with argument 1, reads 2, and returns 3 and 4.
[Figure 7.1: Example traces. Two function-call trees of hypothetical executions; each vertex is labeled with the function call and its arguments, reads, and returns, e.g., a(1),(2),(3,4) at the root.]
Our analysis techniques rely on a notion of similarity and equality between function calls. We say that two function calls are similar if they call the same function and the arguments are equal. Given two argument tuples (a1, . . . , am) and (b1, . . . , bn), we say that the arguments are equal if m = n and ∀i. 1 ≤ i ≤ m. ai = bi. We determine equality of values, i.e., the ai's and bi's, using shallow equality. Two primitive values, such as integers, booleans, etc., are equal if they are the same value, e.g., 1 = 1, true = true. Two locations are equal if they have the same labels. Note that, since locations are considered equal based on their labels or addresses, two function calls can be similar even if the values stored at their arguments are different. Since the closure machine allocates locations that have the same labels at the same memory address, at the machine level, shallow equality is the same as bit-pattern equality.
Definition 9 (Call Similarity) The calls v and v′ are similar, denoted v ∼ v′, if

1. fun(v) = fun(v′), and

2. args(v) = args(v′).
We say that two function calls are equal if they are similar, they read the same values from memory, and they receive the same values from the functions that they call. As with arguments, we define equality of reads and returns based on shallow equality.
Definition 10 (Call Equality) Two function calls v and v′ are equal, denoted v ≡ v′, if

1. v ∼ v′,

2. reads(v) = reads(v′), and

3. returns(v) = returns(v′).
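Definitions 9 and 10 translate directly into predicates over the tags of trace vertices. In this sketch a vertex is abbreviated to its tag, and shallow equality is abbreviated to tuple comparison (locations would be compared by label); the names are illustrative:

```python
from collections import namedtuple

# Tag of a trace vertex: fun(v), args(v), reads(v), returns(v).
V = namedtuple("V", "fun args reads returns")

def similar(v, u):
    # Definition 9: same function and equal arguments (shallow equality).
    return v.fun == u.fun and v.args == u.args

def equal(v, u):
    # Definition 10: similar, plus equal reads and equal returns.
    return similar(v, u) and v.reads == u.reads and v.returns == u.returns

b1 = V("b", (2,), (3,), (4, 5))
b2 = V("b", (2,), (9,), (4, 5))
assert similar(b1, b2)        # both are the call b(2)
assert not equal(b1, b2)      # but they read different values
```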
Since the arguments, the values read, and the values returned completely specify the values that the execution of the function depends on, two equal function calls execute identically. Note that this is true only for the part of the execution that takes place inside the body of the function, not for its subcalls. Given two traces T and T′, we pair up equal vertices from T and T′ and define the distance between the traces as the total weight of the vertices that do not have a pair. The notion of cognates formalizes this idea.
Definition 11 (Cognates) Consider the traces T and T′ of a program on two inputs and let V and V′ be the sets of vertices of T and T′ respectively. A cognate relation C of T and T′ is a relation such that

1. C ⊆ V × V′,

2. each pair of cognates consists of equal vertices: if (v, v′) ∈ C then v ≡ v′, and

3. no vertex of V or V′ is paired up with more than one vertex: if (v, v′1) ∈ C and (v, v′2) ∈ C, then v′1 = v′2; similarly, if (v1, v′) ∈ C and (v2, v′) ∈ C, then v1 = v2.

For example, in Figure 7.1, the set of vertex pairs with labels {(a(1),a(1)), (b(2),b(2)), (c(3),c(3)), (d(4),d(4)), (e(10),e(10))} is a well-defined set of cognates for the two traces. To see how cognates formalize the notion of sharing between traces, consider two traces T and T′ and a cognate relation C between the vertices V and V′ of T and T′ respectively. Color the vertices in V and V′ blue, yellow, or red as follows.
Blue Vertices: The set of blue vertices B consists of the vertices of T that have a cognate, B = {v | v ∈ V ∧ ∃v′. (v, v′) ∈ C}.

Yellow Vertices: The set of yellow vertices Y consists of the vertices of T that have no cognates, Y = V \ B.

Red Vertices: The set of red vertices R consists of the vertices of T′ that have no cognates, R = V′ \ {v′ | v′ ∈ V′ ∧ ∃v. (v, v′) ∈ C}.
If we are given the trace T, then we can construct T′ by deleting the yellow vertices from T, re-using the blue vertices (of T), and creating the red vertices by executing the corresponding function calls. For example, in Figure 7.1, the trace on the right can be obtained from the trace on the left by re-using the vertices labeled {a(1),b(2),c(3),d(4),e(10)}, deleting the vertices labeled {d(5),e(6),f(7),e(8),f(9)}, and executing the vertices labeled {d(5),g(1),g(2),f(7)}. Based on this re-use technique, we can obtain T′ from T by performing work proportional to the execution time of the red vertices plus the time for deleting the yellow vertices. Deleting a yellow vertex requires time proportional to the number of edges incident to it. Since the number of edges incident on a vertex is bounded by the weight (execution time) of the vertex, the time for deletion is upper bounded by the execution time.1 The following distance metric defines the distance between two traces based on this idea.
Definition 12 (Trace Distance) Let T and T′ be the traces of a program on two inputs and let V and V′ denote the sets of vertices of T and T′ respectively. Let C ⊆ V × V′ be a cognate relation for V and V′. Let Y (yellow) and R (red) be the subsets of V and V′ consisting of the vertices that do not have cognates, i.e., Y = V \ {v | v ∈ V ∧ ∃v′. (v, v′) ∈ C}, and R = V′ \ {v′ | v′ ∈ V′ ∧ ∃v. (v, v′) ∈ C}. The distance from T to T′ with respect to C, denoted δC(T, T′), is defined as

δC(T, T′) = Σ_{v ∈ Y} w(v) + Σ_{v′ ∈ R} w(v′).
1 This upper bound is tight for all the applications that we consider in this paper, but may not be tight in general.
For example, in Figure 7.1, the distance between the two traces is 9 if we assume that each function call has weight 1. Since all yellow vertices are deleted exactly once, and the time for their deletion can be bounded by their execution time, the cost of deleting yellow vertices can be amortized against the cost of creating them. Based on this amortization, another notion of trace distance can be defined as the total weight of the red vertices alone. We prefer the measure defined above, because it is symmetric and it yields real-time, instead of amortized, bounds.
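Given a cognate relation, Definition 12 computes the distance as the total weight of the unmatched (yellow and red) vertices. A sketch over weighted vertex sets, with an illustrative unit-weight example in the spirit of Figure 7.1 (the vertex names below are abbreviations, not the figure's exact tags):

```python
def trace_distance(V, V2, C, w):
    """Definition 12: total weight of vertices of V and V2 without cognates.
    C is a set of (v, v2) pairs; w maps vertices to weights."""
    matched_left = {v for (v, _) in C}
    matched_right = {v2 for (_, v2) in C}
    yellow = [v for v in V if v not in matched_left]    # deleted
    red = [v2 for v2 in V2 if v2 not in matched_right]  # freshly executed
    return sum(w[v] for v in yellow) + sum(w[v] for v in red)

# Unit weights: 5 cognate pairs, 5 unmatched on the left,
# 4 unmatched on the right, so the distance is 9.
V  = {"a1", "b2", "c3", "d4", "e10", "d5", "e6", "f7", "e8", "f9"}
V2 = {"a1", "b2", "c3", "d4", "e10", "d5'", "g1", "g2", "f7'"}
C  = {(x, x) for x in ["a1", "b2", "c3", "d4", "e10"]}
w  = {v: 1 for v in V | V2}
assert trace_distance(V, V2, C, w) == 9
```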
7.2 Intrinsic (Minimum) Trace Distance
We define the intrinsic distance between two traces as the minimum distance over all possible cognates. The intrinsic distance gives a lower bound for the technique of updating traces by re-using blue, deleting yellow, and re-executing red vertices under the assumption that deleting a vertex takes time proportional to its execution time. In the next section, we show that the distance between two traces is equal to the intrinsic distance between them, if the traces are monotone. For monotone traces, we show that the time for change propagation can be bounded as a function of the intrinsic distance between them.
Definition 13 (Intrinsic Distance and Maximal Cognates) Let T and T′ be the traces of a deterministic program on two inputs and let V and V′ denote the sets of vertices of T and T′ respectively. The intrinsic distance from T to T′, denoted δmin(T, T′), is equal to the minimum distance from T to T′ over all possible cognates between T and T′. More precisely,

δmin(T, T′) = min_{C ⊆ V × V′} δC(T, T′).

We say that a cognate relation Cmax is maximal if δ_{Cmax}(T, T′) = δmin(T, T′).
Given two traces T and T′, a maximal cognate relation can be constructed by greedily pairing equal vertices from T and T′. We start with an empty cognate set C = ∅, and add pairs to C by considering the vertices of T in an arbitrary order. For each vertex v of T, we check whether there is a vertex v′ of T′ that is equal to v, v ≡ v′, and that has not yet been paired with another vertex, i.e., ∀u. (u, v′) ∉ C. If such a vertex v′ exists, we extend C with the pairing (v, v′), C ← C ∪ {(v, v′)}. If there are multiple choices for v′, we pick one arbitrarily. This greedy-pairing algorithm yields a maximal relation. To see this, note first that the algorithm pairs equal vertices, and it pairs a vertex with at most one other vertex. The algorithm therefore constructs a well-defined cognate relation. Since any vertex can have at most one cognate, the relation is maximal in size (the number of pairs). Since all equal vertices have the same weight, the algorithm yields a maximal cognate relation.
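The greedy-pairing argument can be made concrete: scan the vertices of T in any order and match each to an as-yet-unpaired equal vertex of T′. In this sketch a vertex is abbreviated to a label and equality to label comparison:

```python
def greedy_cognates(V, V2, equal):
    """Pair each vertex of V with at most one unpaired equal vertex of V2."""
    C = set()
    unpaired = set(V2)
    for v in V:
        for v2 in unpaired:
            if equal(v, v2):
                C.add((v, v2))
                unpaired.remove(v2)   # v2 can never be paired again
                break
    return C

# With unique function calls (concise programs) every vertex has at most
# one possible match, so the greedy choice is forced and maximal.
V  = ["a(1)", "b(2)", "f(7)"]
V2 = ["b(2)", "a(1)", "g(3)"]
C = greedy_cognates(V, V2, lambda x, y: x == y)
assert C == {("a(1)", "a(1)"), ("b(2)", "b(2)")}
```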
7.3 Monotone Traces and Change Propagation
This section presents a bound on the time for change propagation in terms of the intrinsic distance between the traces of a computation. The bound holds only for certain computations that we call monotone. We start by defining concise programs and monotone traces. We then prove the main theorem of this section.
Definition 14 (Concise Programs) Let P be a program and let I be the set of all inputs that are valid for P . We say that P is concise if it satisfies the following properties:
1. all memory allocation instructions are uniquely labeled, and
2. any function call that is performed in an execution of P on any input I ∈ I is unique, i.e., for any two calls f(a) and f(b) of the same function f, the arguments a and b are different.
We say that the traces of a concise program are monotone if they satisfy certain properties. Let P be a concise program with input domain I and let T and T′ be traces of P on two inputs from I. We say that T and T′ are (relatively) monotone if the execution order and the caller-callee relationships between similar function calls are the same in both T and T′. Formally, we define monotone traces as follows.
Definition 15 (Monotone Traces) Let P be a concise program with input domain I and let T and T′ be the traces of P on two inputs from I. We say that T and T′ are monotone if for any two vertices u and v of T for which there exist vertices u′ and v′ of T′ such that u ∼ u′ and v ∼ v′, the following are satisfied:

1. if u comes before v in a preorder traversal of T, then u′ comes before v′ in a preorder traversal of T′, and

2. if u is an ancestor of v, then u′ is an ancestor of v′.
Throughout this thesis, we only compare traces belonging to the same program. For brevity, we often omit the reference to the program.
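Definition 15 can be checked directly: restrict both trees to the shared calls and compare preorder positions and ancestor relations. This is a quadratic sketch under the assumption that calls are uniquely named (as in concise programs), with trees written as ("call", [children]); it uses the symmetric reading of condition 2:

```python
def preorder(tree, out=None):
    # Flatten a tree ("call", [children]) into preorder call order.
    if out is None:
        out = []
    name, children = tree
    out.append(name)
    for c in children:
        preorder(c, out)
    return out

def ancestors(tree, anc=(), table=None):
    # Map each call to the set of its proper ancestors.
    if table is None:
        table = {}
    name, children = tree
    table[name] = set(anc)
    for c in children:
        ancestors(c, anc + (name,), table)
    return table

def monotone(T, T2):
    """Definition 15, assuming unique calls: shared calls must agree on
    preorder order and on ancestor-descendant relationships."""
    p1, p2 = preorder(T), preorder(T2)
    shared = set(p1) & set(p2)
    if [v for v in p1 if v in shared] != [v for v in p2 if v in shared]:
        return False
    a1, a2 = ancestors(T), ancestors(T2)
    return all((u in a1[v]) == (u in a2[v]) for u in shared for v in shared)

T  = ("a", [("b", [("c", []), ("d", [])]), ("f", [])])
T2 = ("a", [("b", [("c", []), ("g", [])]), ("f", [])])
assert monotone(T, T2)   # shared calls a, b, c, f agree in both trees
```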
Definition 16 (Monotone Cognates and Monotone Distance) Consider two monotone traces T and T′ and let V and V′ be the sets of vertices of T and T′ respectively. The monotone cognates of T and T′, denoted CMonotone(T, T′), is the relation CMonotone(T, T′) = {(v, v′) | v ∈ V ∧ v′ ∈ V′ ∧ v ≡ v′}. The monotone distance between T and T′ is the distance with respect to the monotone cognates, i.e., δ_{CMonotone(T,T′)}(T, T′).

We first show that the monotone cognates satisfy Definition 11 and therefore are well defined. It is easy to see that Definition 16 satisfies the first and the second conditions of Definition 11. The third condition, that no vertex is paired with more than one vertex, follows from the fact that monotone traces consist of unique function calls. We then show that monotone cognates are maximal, and therefore the distance measure defined based on monotone cognates is equal to the intrinsic (minimum) distance.
Theorem 17 (Monotone Cognates are Maximal) Consider two monotone traces T and T′. The set C of monotone cognates of T and T′, C = CMonotone(T, T′), is maximal, and the monotone distance between T and T′ is equal to the intrinsic distance between them, i.e., δC(T, T′) = δmin(T, T′).
Proof: Since T and T′ both consist of unique function calls, each vertex v of T can have at most one cognate in any possible cognate relation between T and T′. The maximal cognate relation Cmax between T and T′ will therefore consist of all pairs of equal vertices from T and T′. Therefore Cmax = CMonotone(T, T′), and the intrinsic distance is equal to the distance with respect to Cmax.

[Figure 7.2: Example monotone traces (the same pair of traces as in Figure 7.1).]
Figure 7.2 shows a pair of traces that are monotone (this is the same example as shown in Figure 7.1). To determine whether two traces are monotone, it suffices to consider just the function names and arguments; the reads and returns can be ignored. The function calls that are similar in both traces are drawn as filled circles; other function calls are drawn as empty circles. To see that the traces are monotone, it suffices to consider the filled circles and show that they have the same ancestor-descendant and execution-order relationships. An easy way to check for monotonicity is to check that the traces are equal when the empty circles are contracted away. To compute the distance between the traces shown in Figure 7.2, we ignore the structure of the traces and compare the sets of function calls. Assuming that each function call has weight 1, the distance between these traces is 9: the function calls contributing to the distance are the calls {d(5),e(6),f(7),e(8),f(9)} from the first trace and {d(5),g(1),g(2),f(7)} from the second trace, because these vertices do not have cognates.
7.3.1 Change Propagation for Monotone Traces
This section presents a bound on the time for change propagation in terms of the distance between monotone traces. For the analysis, we consider programs written for the closure model, called native programs, and transform native programs into primitive programs by applying the normal-form transformation described in Section 3.2. This is necessary because dynamic dependence graphs and change propagation are defined only for normal-form programs. We use different notation for denoting trace vertices, which represent native function calls, and MDDG vertices, which represent primitive function calls. We say that a vertex or the corresponding function call is primitive if the vertex is created during the execution of a primitive program. Similarly, we say that a vertex or the corresponding function call is native if the vertex is created during the execution of a native program. We use lower-case letters u, v, . . . , x for primitive function calls and lower-case letters with a hat, û, v̂, . . . , x̂, for native function calls. When not ambiguous, we use the same letter for a primitive call and the native call (distinguished by the presence of the hat) that the primitive call belongs to. The analysis often requires checking the similarity and equality of primitive vertices. We define similarity and equality of primitive vertices the same way as for native (trace) vertices.

As described in Section 3.2.1, the normal-form transformation splits a function into one primary and a number of secondary functions. The primary function is always called via the call instruction and the secondary functions are always called via the ccall instruction. It is a property of the transformation that each native function call corresponds to one primary function call (vertex) and a number of secondary function calls. We say that the primitive function calls belong to the native function call that they correspond to. Section 3.2.1 describes the correspondence between native and primitive function calls and the properties of primitive programs in more detail.

[Figure 7.3: The primitives of û and v̂ and their ancestors.]

The main idea behind the proof is to show that each vertex deleted during change propagation and each fresh vertex created during change propagation corresponds to a native function call in the trace that does not have a cognate. We start by proving a key lemma showing that vertices related by re-execution are similar.

Lemma 18 (Re-Execution Yields Similar Vertices) Let P be a concise program and G be the dynamic dependence graph of P with some input I. Consider changing the input to I′ and performing change propagation on G. Let G′ be the dynamic dependence graph obtained after change propagation. Let T and T′ be the traces of P with inputs I and I′. Let u be a primitive vertex that is re-executed during change propagation and let v be the vertex created by the re-execution of u. Let û and v̂ denote the native vertices corresponding to u and v respectively. The vertices û and v̂ are similar.
Proof: For the proof, we consider the primitive vertices of G and G′ in relation to the native vertices of T and T′. Figure 7.3 illustrates the call trees of G and G′, û, v̂, and the other vertices considered in the proof. Let y and z be the primary vertices for û and v̂ respectively. By the normal-form transformation, we know that y and û are tagged with the same function call, and that z and v̂ are tagged with the same function call. That is, we know that y ∼ û and z ∼ v̂. To show that û ∼ v̂, we show that y ∼ z. Since primary vertices are never re-executed, y and u are distinct vertices, and so are v and z, i.e., u ≠ y and v ≠ z. Suppose now that y, or some other ancestor of u that is a descendant of y, is re-executed. Since all secondary vertices are performed via ccall instructions, and these instructions do not perform memo lookups, the vertex u will be deleted. But then u cannot be re-executed. We therefore conclude that no proper ancestor of u that is a descendant of y is re-executed. Consider all the ancestors of u and v in G and G′ respectively. If none of the ancestors of u is re-executed, then all proper ancestors of u and v read the same values. Since the primary vertices y and z are ancestors of u and v, we know that y ∼ z. Suppose now that an ancestor of u is re-executed; let a be the least re-executed ancestor of u and let b be the vertex created by the re-execution of a. Since u is re-executed, some ancestor c of u is found in the memo during the execution of b. Let d be the descendant of b that finds c in the memo. Since memo re-use takes place between similar function calls, we know that c ∼ d. Since only primary function calls can perform memo lookups, we know that c is an ancestor of y. Since a is the least re-executed vertex, we know that no other vertex on the path between c and u is re-executed. Since c ∼ d, we know that the vertices on the path from c to u are similar to the vertices from d to v. We therefore conclude that the primary vertices y ∼ z.

[Figure 7.4: The call trees of a computation before and after a change.]
We now prove a lemma showing that the weight of the deleted and fresh vertices during change propagation can be bounded by the total weight of the vertices of T and T′ that do not have a cognate. Since the trace distance is the sum of the weights of all vertices of T and T′ that have no cognates, the main theorem follows directly from this lemma and the change-propagation theorem (Theorem 7).

Lemma 19 (Cognates versus Fresh and Deleted Vertices) Let P be a program and let G be the dynamic dependence graph of P with some input I. Consider changing the input to I′ and performing change propagation on G. Let T and T′ be the traces of P with inputs I and I′ and let V̂ and V̂′ be the sets of vertices of T and T′ respectively. If T and T′ are monotone, and C is the set of monotone cognates of T and T′, then
1. the total weight of the vertices deleted during change propagation, denoted WD, is no more than the total weight of the vertices of V̂ that do not have cognates, i.e., WD ≤ Σ_{v̂ ∈ V̂ \ {û | ∃û′. (û, û′) ∈ C}} w(v̂), and
2. the total weight of the fresh vertices created during change propagation, denoted WF, is no more than the total weight of the vertices of V̂′ that do not have cognates, i.e., WF ≤ Σ_{v̂′ ∈ V̂′ \ {û′ | ∃û. (û, û′) ∈ C}} w(v̂′).

Proof: Consider two native vertices û and v̂ of T and T′ respectively. We show that if û and v̂ are cognates, then none of the primitive vertices that belong to û and v̂ is fresh or deleted. The proof relies on the correspondence between the call trees of the native program P and its normal form Pn as described in Section 3.2.1. Figure 7.4 illustrates the call trees of G and G′ and the primitive and native vertices defined in this proof. Let u and v denote the primary vertices of û and v̂ respectively. There are two cases depending on whether u has an ancestor that is re-executed.
In the first case, none of the ancestors of u is re-executed. Since û ≡ v̂, none of the primitives that belong to û is re-executed either, and thus none of the primitives of û is deleted and none of the primitives of v̂ is fresh. In the second case, some ancestor of u is re-executed. Let a be the greatest ancestor of u that is re-executed during change propagation and let b denote the vertex created by the re-execution of a. Let â and b̂ denote the native vertices for a and b respectively. Since a is re-executed, the native vertices â and b̂ are similar, by Lemma 18. Since the traces T and T′ are monotone, â ∼ b̂ and û ∼ v̂, and â is an ancestor of û, it follows that b̂ is an ancestor of v̂. Since û ≡ v̂, and since the primaries u and v perform the same function call as û and v̂ respectively, we know that u ∼ v. Consider the execution of b during change propagation. Since the traces are monotone, no other primary call in the subtree rooted at b that comes before v can have a similar primary call that comes after u in the preorder traversal. Therefore, when v is executed, u will be found in the memo. Since û ≡ v̂, all values read by the primitives of û are unchanged in the memory, and therefore no other secondary of û is executed. We therefore conclude that no primitives of û are deleted, and no primitives of v̂ are fresh. Therefore the total weight of the deleted vertices is bounded by the weight of the vertices of T that have no cognates, and similarly, the total weight of the fresh vertices is bounded by the weight of the vertices of T′ that have no cognates.
We are ready to prove the main result of this section. Theorem 20 (Change Propagation and Trace Distance) Let P be a program for the closure machine and let G be the dynamic dependence graph of P with some input I. Consider changing the input to I′ and performing change propagation. Let
1. T and T′ be the traces of P with I and I′ respectively,
2. m be the maximum size of the active-call queue during change propagation,
3. p(·) be the time for priority-queue operations,
4. NA be the number of active vertices, and
5. d = δmin(T, T′).
If T and T′ are respectively monotone, then the time for change propagation is bounded by each of the following:
1. O(d + NA · p(m)),
2. O(d + d · p(m)),
3. O(d · p(d)).
Proof: Let WF and WD be the total weight of the fresh vertices created during change propagation and the total weight of the vertices deleted during change propagation, respectively. By Lemma 19 and Definition 12, we know that WF + WD ≤ δ(T, T′). Since the traces are monotone, we know, by Theorem 17, that WF + WD ≤ δmin(T, T′). The theorem follows from this inequality and from Theorem 7. Chapter 8
Trace Stability
This chapter formalizes the notion of trace stability. Informally, a program is said to be O(f(n))-stable for some class of input changes ∆ if the distance between the traces for that program before and after a change consistent with ∆ is bounded by O(f(n)). We show that if a program is O(f(n))-stable, then it self-adjusts in O(f(n) · p(m)) time, where p(m) is the priority-queue overhead. We show that p(m) ∈ O(log f(n)), and therefore the time for change propagation is O(f(n) log f(n)). To enable tighter bounds on the priority-queue overhead, we identify two classes of computations, called bounded-width computations and read-write regular computations. We show that for these computations, the priority-queue overhead can be reduced to O(1). This yields the improved bound of O(f(n)) for change propagation.
8.1 Trace Models, Input Changes, and Trace Stability
We define a trace model as a triple consisting of a machine model, a set of traces, and a trace generator that maps programs and inputs to their traces. For the definition, let M denote a machine model, P the set of valid programs, and I the set of possible start states (inputs) for the model. Definition 21 (Trace Model) A trace model for a machine M consists of a set of possible traces T , a trace generator T : P × I → T that maps the set of all valid programs P and the set of all possible start states (inputs) I to traces, and a trace distance δ(·, ·): T × T → ℕ.
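As a toy illustration of these three components, one might instrument an interpreter so that the trace generator records the function calls of an execution, and take the distance to be the number of calls present in one trace but not the other. The sketch below makes that concrete; the names `trace_of`, `distance`, and `sum_list` are illustrative only and are not part of the formal model.

```python
def trace_of(program, inp):
    # Toy trace generator T: record each (function name, argument)
    # call performed through the instrumented call operator.
    trace = []

    def call(f, x):
        trace.append((f.__name__, x))
        return f(x)

    program(inp, call)
    return tuple(trace)

def distance(t1, t2):
    # A simple trace distance: calls present in one trace but not the other.
    return len(set(t1) ^ set(t2))

def sum_list(xs, call):
    # Toy program: sum a list by structural recursion on its tail.
    def step(ys):
        return 0 if not ys else ys[0] + call(step, ys[1:])
    return call(step, tuple(xs))
```

Two runs of `sum_list` on inputs that share a suffix produce traces that share all but their first calls, so their distance is small even when the inputs are long.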
Given some trace model, we are interested in measuring the distance between the traces of a program under some class of input changes. We define a class of input changes as follows. Definition 22 (Class of Input Changes) A class of input changes is a relation ∆ whose domain and co-domain are both the input space, i.e., ∆ ⊆ I × I. If (I, I′) ∈ ∆, then we say that I and I′ are related by ∆, and that the modification of I to I′ is consistent with ∆.
For the purposes of analysis, we often partition ∆ into disjoint subclasses whose members have the same size, for some notion of size. We say that ∆ is indexed if ∆ can be partitioned into ∆0, ∆1, ... such that ∀i, j ∈ ℕ. i ≠ j ⟹ ∆i ∩ ∆j = ∅, and ⋃_{i=0}^{∞} ∆i = ∆. We write an indexed ∆ as ∆ = ∆0, ∆1, ..., where ∆i denotes the ith partition. In all our uses, ∆ is partitioned and indexed according to the size of one of the two inputs. In general, however, ∆ can be indexed in any way; for example, for output-sensitive programs, ∆ can be indexed according to the size of the change in the output. As an example, consider a sorting program. For this program, we may want to consider the class of input changes ∆ consisting of single insertions. To analyze the stability of such a program under ∆, we partition ∆ into ∆0, ∆1, ..., indexed by the length of the first list. Assuming that the lists contain natural numbers, the partition ∆0 is the infinite set ∆0 = {([ ], [0]), ([ ], [1]), ([ ], [2]), ...}. We define trace stability as a measure of the similarity of the traces of a program on inputs related by some class of input changes.
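The single-insertion class and its indexing by the length of the first list can be sketched as a pair of predicates; the names `single_insertion` and `index_of` below are hypothetical, chosen only for illustration.

```python
def single_insertion(I, I2):
    # (I, I2) ∈ ∆ iff I2 arises from I by inserting exactly one element.
    if len(I2) != len(I) + 1:
        return False
    # removing some single position of I2 must recover I exactly
    return any(I2[:k] + I2[k + 1:] == I for k in range(len(I2)))

def index_of(I, I2):
    # The subclass ∆_n containing (I, I2), indexed by the length of
    # the first list.
    return len(I)
```

For example, ([ ], [0]) lies in ∆0 and ([1, 3], [1, 2, 3]) lies in ∆2, while ([1, 3], [3, 2, 1]) is not a single insertion at all.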
Definition 23 (Worst-Case Stability) For a particular trace model, let P be a program, and ∆ = ∆0, ∆1, ... be an indexed class of input changes. For all n ∈ ℕ, define d(n) = max_{(I,I′)∈∆n} δ(T(P, I), T(P, I′)). We say that P is S-stable for ∆ if d(·) ∈ S.
Note that O(f(·))-stable, Ω(f(·))-stable, and {f(·)}-stable are all valid uses of the stability notation. For randomized programs, it is important to use the same random bits on inputs related by some class of input changes, because otherwise the traces will certainly not be stable. We use Tr(P, I) to denote the trace of a randomized program P that uses the random bits r on input I. When analyzing the stability of randomized programs, we assume that the same random bits are used on inputs related by a class of input changes, and take expectations over all possible choices of the random bits based on some probability distribution on the initial random bits. We define the expected distance between the traces of some program P on inputs I and I′ with respect to some probability-density function φ on the initial random bits, written Eφ[δ(T(P, I), T(P, I′))], as
Eφ[δ(T(P, I), T(P, I′))] = Σ_{r∈{0,1}*} δ(Tr(P, I), Tr(P, I′)) · φ(r).
Based on this notion of expected distance, we define the expected-case stability as follows.
Definition 24 (Expected-Case Stability) For a particular trace model, let P be a randomized program, let ∆ = ∆0, ∆1, ... be an indexed class of input changes, and let φ(·) be a probability-density function on random bit strings {0, 1}*. For all n ∈ ℕ, define d(n) = max_{(I,I′)∈∆n} Eφ[δ(T(P, I), T(P, I′))]. We say that P is expected S-stable for ∆ and φ if d(·) ∈ S.
Note that expected O(f(·))-stable, expected Ω(f(·))-stable, and expected {f(·)}-stable are all valid uses of the stability notation. All of our trace-stability results apply to monotone programs, which are defined as follows.
Definition 25 (Monotone Programs) We say that P is monotone with respect to some class of input changes ∆ if P is concise and if, for any (I, I′) ∈ ∆, the traces T = T(P, I) and T′ = T(P, I′) are respectively monotone.
8.2 Bounding the Priority-Queue Overhead
This section presents two techniques for bounding the priority-queue overhead of change propagation. The first technique introduces the notion of the dependence width between two computations. We show that, for monotone programs, the dependence width gives a bound on the size of the priority queue during change propagation. This result implies that if the dependence width of a computation is constant, then the priority-queue overhead of that execution is constant even when using a general-purpose priority-queue data structure. Many of the applications that we consider in this thesis have constant dependence width under a constant-size change to the input. The second technique introduces a class of computations called read-write regular. The memory reads and writes in a read-write regular computation exhibit a particular structure. By exploiting this structure, we show that it is possible to use a constant-time first-in-first-out (FIFO)-based queue for change propagation instead of a general-purpose queue. Since all queue operations take constant time, the priority-queue overhead for read-write regular computations is constant.
8.2.1 Dependence Width
Let P be a program and I be an input for P. We define the ordered vertex set of the execution of P with I, denoted V(P, I), as the set of vertices in the trace T = T(P, I) of the execution, ordered with respect to their execution times. The ordering on the vertices is also given by a preorder traversal of T. For the rest of this section, we treat V(P, I) as an ordinary set and use the terms "come after/before" when referring to the ordering relation between the vertices. For an execution of P with input I, we define the memory, denoted σ(P, I), as the function that maps all locations created during the execution to their values. Given a vertex v ∈ V(P, I), we define the prefix (memory) of a memory σ at v, denoted σv, as the subset σv ⊆ σ consisting of the location-value pairs (l, a) ∈ σ such that l is written before v starts to execute; more precisely, l is written during the execution of some vertex that comes before v. Consider some program P that is monotone under some class of input changes ∆. Let (I, I′) ∈ ∆ be a pair of inputs, and define V = V(P, I) and V′ = V(P, I′). We define the ordered set of shared vertices of V and V′, denoted S(V, V′), as the set S(V, V′) = {(v, v′) | v ∈ V ∧ v′ ∈ V′ ∧ v ∼ v′}, where the pairs are ordered with respect to one of their elements; that is, for any pairs (u, u′) ∈ S(V, V′) and (v, v′) ∈ S(V, V′), (u, u′) comes before (v, v′) if u comes before v or u′ comes before v′. We call a pair (v, v′) ∈ S(V, V′) a shared pair, and say that each of v and v′ is shared. Note that shared vertices are defined only for monotone programs. To see that the total order on the shared vertices is well defined, recall that executions of monotone programs do not contain two similar vertices, and that the relative order of similar vertices in two executions remains the same. More precisely, since P is monotone for ∆, the set V contains no distinct vertices u, v ∈ V such that u ∼ v; the same holds for V′.
Therefore, the shared pairs match each vertex from V and V′ with at most one other vertex. Also, for any u, v ∈ V and u′, v′ ∈ V′ such that u ∼ u′ and v ∼ v′, the ordering of u and v is the same as the ordering of u′ and v′: if u comes before v, then u′ comes before v′. Therefore, given any (u, u′) ∈ S(V, V′) and (v, v′) ∈ S(V, V′), we know that either u comes before v (and u′ comes before v′), or u comes after v (and u′ comes after v′). For two executions, we define the set of affected locations at a shared vertex as follows. Definition 26 (Affected Locations) Let P be a program that is monotone for some class of input changes ∆. Consider the executions of P with inputs I and I′, and let V = V(P, I), V′ = V(P, I′), S = S(V, V′), σ = σ(P, I), and σ′ = σ(P, I′). The set of affected locations at (v, v′) ∈ S(V, V′), denoted A(σv, σ′v′), consists of the locations that have different values in σv and σ′v′, i.e., A(σv, σ′v′) = {l | l ∈ dom(σv) ∧ l′ ∈ dom(σ′v′) ∧ label(l) = label(l′) ∧ σv(l) ≠ σ′v′(l′)}. As an example, consider the prefix memories σv = {(l1, 1), (l2, 2), (l3, 3)} and σ′v′ = {(l′1, 0), (l′2, 2), (l′3, 4)}. Suppose that the label of li is equal to the label of l′i for i = 1, 2, 3; then the set of affected locations at v is {l1, l3}. Definition 27 (Active Vertices) Let P be a monotone program for ∆ and consider the executions of P with inputs I and I′, (I, I′) ∈ ∆. Let V = V(P, I), V′ = V(P, I′), and S = S(V, V′). Let (u, u′) ∈ S be a shared pair and let (v, v′) ∈ S be the first shared pair that comes after (u, u′) in S. We say that a vertex w ∈ V is active at (u, u′) if w reads a location that is affected at (v, v′) and w comes after u.
Note that the definition is not symmetric: it is defined only with respect to the first set of vertices (V). For two executions, we define the dependence width as the maximum number of active vertices over all shared pairs. Definition 28 (Dependence Width) Let P be a program that is monotone for some class of input changes ∆ and consider the executions of P with inputs I and I′, (I, I′) ∈ ∆. Let V = V(P, I) and V′ = V(P, I′). The dependence width of the executions is the maximum number of active vertices over all shared pairs in S(V, V′).
Consider the execution of P with I and suppose that we change the input to I′ and perform change propagation. We show that the size of the priority queue during change propagation can be asymptotically bounded by the dependence width of the executions with I and I′.
Theorem 29 (Dependence Width and Queue Size) Let P be a program that is monotone for some class of input changes ∆, and consider executing P with input I, changing the input to I′, and performing change propagation. The size of the priority queue is asymptotically bounded by the dependence width of the executions of P with I and I′.
Proof: Since change propagation is only defined for programs written in the normal form, we need to consider the primitive versions of native programs. Note first that the executed alloc, write, and read operations of a native program and its primitive version are identical except for the operations that allocate, read, and write destinations. Recall that destinations are created by the primitive programs to force all functions to return their values through memory.
We first show that the size of the priority queue is increased by at most a constant due to operations performed on destinations. The normal-form transformation ensures that destinations 1) are read immediately after they are written, 2) are read by one primitive function, and 3) are read exactly once. These properties ensure that there is at most one function call that reads an affected destination, and that there is at most one affected destination whose readers are not re-executed. The memory operations on destinations can therefore increase the size of the priority queue by at most one. For the rest of the proof, we therefore consider operations performed by both the native and the primitive programs. Since only insertions into the queue increase the size of the priority queue, we consider only insertions. For the rest of the proof, let V̂ = V(P, I), V̂′ = V(P, I′), and Ŝ = S(V̂, V̂′). Consider the execution of a write instruction that inserts a vertex into the priority queue. Let v₀ be the vertex currently being executed and let v̂₀ be the native of v₀. Let (û, û′) ∈ Ŝ and (ŵ, ŵ′) ∈ Ŝ be shared pairs such that û′ is the latest shared vertex that comes before v̂₀, and ŵ′ is the first shared vertex that comes after v̂₀ such that ŵ′ ≠ v̂₀. Let u, u′ be the primaries of û and û′, and w, w′ be the primaries of ŵ and ŵ′. Consider the set A of all locations affected by the time the write instruction executes. Since v₀ comes before w′, and since the set of affected locations grows monotonically during change propagation, we know that A ⊆ A(σw, σ′w′). Since v₀ comes after u′, and since the priority queue only contains vertices that come after v₀, the vertices in the priority queue are the primitives of û or the primitives of vertices that come after û. Therefore, the vertices in the priority queue belong to the active vertices at (û, û′).
Since each native vertex has a constant number of primitive vertices, and since the dependence width of the executions is defined as the maximum number of active vertices over all shared pairs in Ŝ, the theorem follows.
Definition 30 (Constant-Width Programs) Let P be a program that is monotone with respect to a class of input changes ∆. The program has constant width for ∆ if, for any (I, I′) ∈ ∆, the dependence width of the executions of P with I and I′ is asymptotically bounded by a constant.
By Theorem 29, we know that the priority-queue overhead for constant-width programs is constant.
Theorem 31 (Priority-Queue Overhead for Constant-Width Programs) Let P be a program that is monotone with respect to a class of input changes ∆ and suppose that P is constant-width for ∆. The priority-queue overhead of change propagation for P under ∆ is constant.
8.2.2 Read-Write Regular Computations
We show that for read-write regular computations, an O(1)-time, FIFO-like queue can be used for change propagation instead of a general-purpose logarithmic-time priority queue.
Definition 32 (Read-Write Regular Computation) A program P with input domain I is read-write regular for a class of input changes ∆ ⊆ I × I if the following conditions are satisfied:
1. in an execution of P with any input I ∈ I, each location is read a constant number of times,
2. for an execution of P with any input I ∈ I, if a location l1 is written before another location l2, then all function calls that read l1 start to execute before any function call that reads l2, and
3. for any (I, I′) ∈ ∆, if l1 and l2 are locations allocated in both the execution of P with I and the execution of P with I′, then l1 and l2 are written in the same order in both executions.
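Condition 2 can be phrased as a checkable predicate over a simplified execution log consisting of the write order of locations and timestamped reads. The representation and the name `reads_respect_write_order` below are illustrative, not part of the formal development.

```python
def reads_respect_write_order(write_order, reads):
    # Condition 2 of read-write regularity for one execution:
    # if l1 is written before l2, every read of l1 must start before
    # any read of l2.
    #   write_order: locations in the order they are written
    #   reads: list of (start_time, location) pairs
    rank = {loc: i for i, loc in enumerate(write_order)}
    # Walk the reads in start-time order; the write ranks of the
    # locations they touch must then be non-decreasing.
    ranks = [rank[loc] for _, loc in sorted(reads)]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))
```

For example, reads of l1 at times 1 and 2 followed by a read of l2 at time 3 respect the write order (l1, l2), whereas reading l2 before l1 violates it.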
For read-write regular computations, we use a specialized version of the change-propagation algorithm that inserts active function calls into the priority queue in sorted order. In particular, when a location is changed, the algorithm first sorts all function calls that read that location and inserts them into the priority queue in that order. Since each location is read a constant number of times, sorting the function calls does not change the asymptotic running time. We show that if a program is read-write regular for some class of input changes, then the specialized change-propagation algorithm can use a constant-time priority queue. The priority queue, called FIFOP (FIFO with Priorities), extends a standard first-in-first-out (FIFO) queue with priorities. Deletions from a FIFOP queue occur only at the front of the queue, as with FIFO queues. Insertions are slightly different: to insert an item i, we compare the priority of i with that of the item f at the front of the queue. If the priority of i is less than that of f, then i is inserted at the front; otherwise, i is inserted at the tail of the queue. Theorem 33 (Priority-Queue Overhead for Read-Write-Regular Computations) If a program is read-write regular for some class of input changes ∆, then the priority-queue overhead for change propagation under ∆ is O(1).
Proof: Consider the normal form Pn of P obtained from the normal-form transformation described in Section 3.2. We show that a FIFOP queue with time stamps as priorities suffices for change propagation with Pn. We first show that the following invariant holds: during change propagation, the priorities of the items in the FIFOP queue increase from the front to the tail of the queue. The invariant holds at the beginning of change propagation, because the priority queue is initialized by inserting the function calls reading the changed locations in sorted order. Assume now that the invariant holds at some time during change propagation. We show that it holds after the next remove or insert operation. Since the remove operation deletes the item at the front of the queue, the invariant holds trivially for removals. For insertions, we distinguish between two kinds of locations: destinations, which are allocated only by Pn for the purpose of supporting function-call returns, and all other locations. Consider an insertion. In change propagation, an insertion occurs only when some location, say l, is written. Let f be the item at the front of the queue, let t be the item at the tail of the queue, and let i be the item being inserted. If t and i both read l, then l is not a destination, because destinations are read by exactly one function call. Since function calls reading the same location are all inserted in order of their time stamps, the time stamp of i is greater than the time stamp of t. Item i will therefore be inserted immediately after t, and the invariant is satisfied. Suppose now that t and i read different locations, and let l′ be the location read by t. We consider two cases, depending on whether l is a destination. If l is not a destination, then we know that l′ is written before l. Since the program is read-write regular, all reads of l′ start before the reads of l. Therefore, the time stamp of i is greater than that of t.
In this case, i will be inserted at the tail of the queue, and the invariant is satisfied. Suppose now that l is a destination. Since a destination is written only at the end of some function call, and read immediately after that function call returns, none of the reads currently in the queue can have a time stamp less than that of i. Therefore, i will be inserted at the front of the queue and the invariant is satisfied. As a result of this invariant, using the FIFOP queue instead of a general priority queue is correct. Since all locations are read by a constant number of function calls, the change-propagation algorithm is slowed down by at most a constant factor when using a FIFOP queue. Since inserts and removes on FIFOP queues take constant time, the theorem follows.
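The FIFOP queue itself admits a direct constant-time sketch (Python used for illustration; the class name and interface are ours, not part of the thesis's implementation):

```python
from collections import deque

class FIFOP:
    """FIFO with Priorities: deletions occur only at the front; an
    insertion goes to the front if its priority is less than the
    front's, otherwise to the tail.  Every operation is O(1).  It
    behaves as a priority queue only under the insertion discipline
    established for read-write regular computations."""

    def __init__(self):
        self.q = deque()

    def insert(self, priority, item):
        if self.q and priority < self.q[0][0]:
            self.q.appendleft((priority, item))
        else:
            self.q.append((priority, item))

    def remove(self):
        # Remove from the front only, as with an ordinary FIFO queue.
        return self.q.popleft()

    def __len__(self):
        return len(self.q)
```

Under the invariant established in the proof, the front of the queue always holds the minimum-priority item, so removals return items in priority order even though no comparison-based reordering ever happens.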
8.3 Trace Stability Theorems
This section presents two trace-stability theorems that bound the time for change propagation in terms of the trace stability of monotone programs.
Theorem 34 (Worst-Case Trace Stability) Consider the trace model for the closure machine consisting of the set of all traces and the minimum trace-distance measure δmin(·, ·). If a program P in the closure model is monotone for a class of input changes ∆ and P is O(f(n))-stable for ∆, then P self-adjusts for ∆ in O(f(n) log f(n)) time. In general, if the priority-queue overhead is bounded by O(p(m)), then P self-adjusts in O(f(n) · p(m)) time. If P is read-write regular, or if P has constant width under ∆, then P self-adjusts for ∆ in O(f(n)) time.
Proof: Consider a program P that is O(f(n))-stable for some class of input changes ∆. Let T and T′ be the traces of P on inputs I and I′ where (I, I′) ∈ ∆. Since P is O(f(n))-stable, δmin(T, T′) ∈ O(f(n)). The bound for the general case follows directly from Theorem 20 and from the fact that the priority-queue overhead is bounded by the logarithm of the size of the queue. By Theorems 31 and 33, the priority-queue overhead for constant-width and read-write regular computations is bounded by a constant. The bounds for these computations are therefore equal to the stability bound itself.
Theorem 35 (Expected-Case Trace Stability) Consider the trace model for the closure machine consisting of the set of all traces and the minimum trace-distance measure δmin(·, ·). Let P be a monotone program with respect to a class of input changes ∆. If P is expected O(f(n))-stable for ∆ and the priority-queue overhead is bounded by O(1), then P self-adjusts under ∆ in expected O(f(n)) time. In particular, if P is read-write regular for ∆, or if P has constant width under ∆, then P self-adjusts under ∆ in expected O(f(n)) time.
Proof: Consider a program P that is expected O(f(n))-stable for some class of input changes ∆. Let T and T′ be the traces of P on inputs I and I′ with the same initial random bits, where (I, I′) ∈ ∆. If the priority-queue overhead is bounded by O(1), then the time for change propagation is bounded by O(δmin(T, T′)) by Theorem 20. Since P is expected O(f(n))-stable, δmin(T, T′) is expected O(f(n)), where the expectation is taken over all initial random bit strings r. Change propagation therefore takes expected O(f(n)) time. The special case for constant-width and read-write regular programs follows by Theorems 31 and 33. Part III
Applications
Introduction
This chapter proves trace-stability results for a range of algorithms under insertions and deletions to their inputs. Based on these bounds and the trace-stability theorems (Theorems 34 and 35), we obtain time bounds for change propagation under the considered classes of input changes. Our bounds are randomized and match the best known bounds within an expected constant factor. We consider a number of algorithms on lists, including combining the data in a list with an arbitrary associative operator, the merge sort and quick sort algorithms, and the Graham's Scan algorithm for planar convex hulls. In Chapter 9, we study the list algorithms. As an algorithm on trees, we consider, in Chapter 10, the tree-contraction algorithm of Miller and Reif [62, 63]. To show the effectiveness of the technique, we choose algorithms that span a number of design paradigms, including random sampling (list combination), two forms of divide and conquer (merge sort and quick sort), incremental result construction (the Graham's Scan algorithm), and bottom-up, data-parallel computing (tree contraction).
Chapter 9
Algorithms on Lists: Combining, Sorting, and Convex Hulls
This chapter considers a number of algorithms on lists including an algorithm for combining the values in a list with an arbitrary associative operator (Section 9.1), merge sort (Section 9.2), quick sort (Section 9.3), and the Graham’s Scan algorithm for convex hulls (Section 9.4). For each considered algorithm, we show a bound that is within an expected constant factor of the best known upper bound for the corresponding problem.
9.1 Combining with an Arbitrary Associative Operator
We consider combining the elements of a list using any associative operator; we require neither commutativity nor an inverse. In addition to obvious operators such as addition and maximum, combining can be used for many other applications, including finding the longest increasing sequence or evaluating a linear recurrence. Scanning the list from the head is not stable, because inserting an element anywhere near the front of the list can propagate through all the rest of the partial sums. Building a tree alleviates this problem, but it is important that the tree structure be balanced and stable with respect to insertions and deletions at arbitrary positions in the list. Any deterministic method for building the tree based on just list position is not going to be stable: a single insert can change every element's position. Instead we use a randomized approach. The code for the approach is shown in Figure 9.1. It is based on a function HalfList, which takes a list of pairs consisting of a value and a key, and returns a list of key-value pairs that is expected to be half as long. We assume that the keys are unique. For each key in the input, HalfList flips an unbiased coin and keeps the key if the coin comes up heads. The value of a kept key is the partial sum of all input values from the kept element up to, but not including, the next kept key. To account for the keys up to the first kept key, HalfList also returns the partial sum of those keys. HalfList is applied repeatedly in Combine until the list is empty. To generate random coin flips, the code uses a family H of random hash functions. The family H is a collection of 2-wise independent hash functions mapping the identifiers to {0, 1}. We randomly select one
function Combine(l, round)
  case ∗l of
    nil => return Identity
  | cons( , ) =>
      (v, l′) ← HalfList(l, round)
      v′ ← Combine(l′, round + 1)
      return (v + v′)
function HalfList(l, round)
  case ∗l of
    nil => return (Identity, l)
  | cons((key, v), t) =>
      (v′, t′) ← HalfList(t, round)
      if binaryHash(key, round) = 1 then
        l′ ← allocate{key, round}(3)
        ∗l′ ← cons((key, v + v′), t′)
        return (Identity, l′)
      else
        return (v + v′, t′)
Figure 9.1: Combining with the associative operator + and with identity element Identity.
member of H per round. Since there are families H with |H| polynomial in n [97], we need O(log² n) random bits. We analyze trace stability assuming a fixed selection of random bits and take expectations over all selections of the bits. For fixed random bits, the algorithm generates the same coin flip on round r for any key k in any two executions. This ensures that the trace remains stable across different executions.
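A runnable sketch of Combine and HalfList follows (illustrative Python). The md5-derived `coin` merely stands in for a member of the 2-wise independent family H — it is deterministic per (key, round), so the same key sees the same flips in any two executions — and the memory operations of the real self-adjusting version are elided.

```python
import hashlib

def coin(key, rnd):
    # Stand-in for the 2-wise independent hash family H: a fixed
    # pseudo-random bit determined by (key, round).
    return hashlib.md5(f"{key}:{rnd}".encode()).digest()[0] & 1

def half_list(lst, rnd, identity, op):
    # Returns (partial sum of values before the first kept key,
    # halved list of key/value pairs).
    if not lst:
        return identity, []
    (key, v), tail = lst[0], lst[1:]
    v2, t2 = half_list(tail, rnd, identity, op)
    if coin(key, rnd) == 1:                       # keep the key
        return identity, [(key, op(v, v2))] + t2
    return op(v, v2), t2                          # drop the key

def combine(lst, identity, op, rnd=0):
    # Apply half_list repeatedly until the list is empty.
    if not lst:
        return identity
    v, rest = half_list(lst, rnd, identity, op)
    return op(v, combine(rest, identity, op, rnd + 1))
```

Combining the nine keys of Figure 9.2 with + yields their total 38 regardless of the coin flips, and combining with string concatenation (associative but not commutative) shows that the left-to-right order of the list is preserved.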
9.1.1 Analysis
Consider the class of single insertions or deletions, denoted ∆, consisting of the pairs of inputs that differ by a single insertion or deletion. We show that the algorithm is expected O(log n)-stable for ∆, where the expectation is over the initial randomness. To simplify the analysis and to establish an analogy to skip lists [82], we consider a slightly different version of the Combine function. We assume that the input list contains a head item that initially contains the identity element. We refer to this cell and its copies as the sentinel. The modified Combine function calls HalfList with the tail of the first cell and adds the value returned by HalfList to the value stored in the sentinel. No modifications are required to HalfList. The modified Combine code is shown below.
Figure 9.2: The lists at each round before and after inserting the key f with value 7. Each execution is shown with the final round at the top and the initial input at the bottom.

Before inserting f:
  a 38
  a 18   g 20
  a 0    b 18   g 15   i 5
  a 0    b 13   d 3    e 2    g 15   i 4    j 1
  a 0    b 8    c 5    d 3    e 2    g 9    h 6    i 4    j 1

After inserting f:
  a 45
  a 25   g 20
  a 0    b 25   g 15   i 5
  a 0    b 13   d 3    e 2    f 7    g 15   i 4    j 1
  a 0    b 8    c 5    d 3    e 2    f 7    g 9    h 6    i 4    j 1
function Combine(cons((key, v), t), round)
  case ∗t of
    nil => return
  | cons( , ) =>
      (v′, t′) ← HalfList(t, round)
      Combine(cons((key, v + v′), t′), round + 1)
      return
As an example, Figure 9.2 shows an execution of Combine with the lists [(a, 0), (b, 8), (c, 5), (d, 3), (e, 2), (g, 9), (h, 6), (i, 4), (j, 1)] and [(a, 0), (b, 8), (c, 5), (d, 3), (e, 2), (f, 7), (g, 9), (h, 6), (i, 4), (j, 1)], which differ by the key f with value 7. The keys are a, . . . , j. The head of both lists is the sentinel cell with key a. The trace for the algorithm consists of calls to Combine and HalfList. Recall that the calls are tagged with the arguments, the values read, and the values returned from subcalls. The calls to Combine are tagged with the round number, the value of the first key, the second key (if any), and the value and the list returned from the call to HalfList. The second key is needed because the tail is labeled with that key. The calls to HalfList are tagged with the round number, the first cell in the input, and the value and the list returned from the recursive call.
Theorem 36 The combine application is monotone for single insertions and deletions.
Proof: First note that all memory allocations are labeled. Since all calls to HalfList take the round number and the list as arguments, and since the list cells are labeled with the key of their first cell, calls to HalfList are unique. Calls to Combine are also unique, because Combine is called exactly once in each round, and each call to Combine takes the round number as an argument. The program is therefore concise. For the monotonicity of the execution order and the calling relationships, we consider the calls to Combine and HalfList separately. Since Combine calls HalfList deterministically exactly once in each round, the interaction between the two obeys the monotonicity requirements. Both the calling order and the caller-callee relationships between HalfList calls in each round are determined by the ordering of keys in the input list. Since a single insertion or deletion does not alter the relative order of the existing keys, the calls to HalfList satisfy the monotonicity properties. Note that calls to HalfList that occur in two different rounds are independent. The calling order and the caller-callee relationships between calls to Combine are determined by the round numbers and therefore satisfy the monotonicity properties.
For our analysis, we consider insertions only; since the application is monotone, the bounds for insertions and deletions are the same. To analyze the stability of Combine, we consider executing Combine with an input I with n keys and with an input I′ obtained from I by inserting a new key α∗, and count the number of calls to Combine and HalfList that do not have a cognate. Throughout the analysis, we denote the executions with I and I′ as X and X′ respectively, and the number of rounds in X and X′ as R and R′ respectively. Given some round r, we denote the input list to that round in X and X′ as Ir and I′r respectively. We first show a property of the lists in each round. Lemma 37 (Lists differ by at most α∗) The executions X and X′ satisfy the following properties:
1. R ≤ R′,
2. for any round r ≤ R, I′r consists of the same keys as Ir plus perhaps α∗, and
3. for any round R < r ≤ R′, the input I′r consists of the sentinel and perhaps α∗.
Proof: Whether a key is kept at one round is determined entirely by the key and by the random hash function being used. Since the executions X and X′ are performed with the same initial randomness, any key α will see the same random bits in both executions. Therefore, the lists I_r and I′_r will differ only by the newly inserted key α∗, for any r ≤ R, and therefore R ≤ R′. Since all keys from I will be deleted by the end of round R, the lists I′_r for r > R will consist of the sentinel key and perhaps α∗, if α∗ has not yet been deleted. For example, in Figure 9.2, the lists in each round contain the same keys except for the key f. We show that the distance between the traces of X and X′ can be determined by counting the number of affected keys.
Definition 38 (Affected Keys) We say that a key α is affected at round r if α is in at least one of I_r or I′_r in that round, and any one of the following holds:
1. α is affected at some previous round,
2. α = α∗ or α comes immediately before α∗ in I′_r,
3. the key β that comes immediately after α in I_r and I′_r is affected, and β is deleted in round r, or
4. α is the sentinel key and r > R.
Note that the third possibility is well defined, because if β ≠ α∗, then β is the same in both I_r and I′_r, and since the decision to delete β is made solely based on the key and the round number, it will either be deleted or not deleted in both executions. In the example shown in Figure 9.2, the cells containing the affected keys are highlighted. We show that if a key has two different values in the two executions, then it is affected.
Lemma 39 (Affected Keys and Values) If α is a key that is in both I_r and I′_r, and the values of α in the two lists are different, then α is affected at round r.
Proof: The proof is by induction on the number of rounds. The lemma holds trivially in the first round, because all keys that are present in both input lists have the same values. Suppose now that the lemma holds at some round r − 1 ≥ 1 and consider round r. Let α be some key that is present in both lists I_r and I′_r with different values. If this is not the first round in which the values of α differ, then by the induction hypothesis α is affected at some previous round, and therefore at this round. Suppose that this is the first round in which the values of α differ. Consider the set of keys β_1, …, β_m in I_r that come after α and that are deleted in X, and the set β′_1, …, β′_n in I′_r that come after α and that are deleted in X′. Note that, by Lemma 37, {β′_1, …, β′_n} \ {β_1, …, β_m} ⊆ {α∗}. Since the value assigned to α is determined by the sum of the values of these keys, the two sums must differ. Therefore, either some β′_i is α∗, or the value associated with β_i for some i is different in I_r and I′_r. In both cases α is affected at round r by Definition 38.
Lemma 40 (Affected Keys and Trace Distance) Consider the traces T and T′ of Combine with the input lists I and I′, where I′ is obtained from I by inserting some key α∗. The monotone distance between T and T′, δ_min(T, T′), is bounded by the total number of affected keys summed over all rounds.
Proof: Since the traces are monotone, δ_min(T, T′) is equal to the number of calls from T and T′ that do not have a cognate in the other trace. To bound this quantity, we consider the executions X and X′ of Combine with I and I′ respectively, and show that each call without a cognate is uniquely associated with an affected key. We consider the calls to Combine and HalfList separately. For the proof, let R and R′ denote the number of rounds in X and X′ respectively.

Consider the calls to Combine. Recall that the calls to Combine are tagged with the round number, the value of the first key, the second key (if any), and the value and the list returned from the call to HalfList. Consider a call to Combine that does not have a cognate, and let r be the round of this call. We will show that the sentinel (first) key is affected in this round. If r > R, then this holds trivially by Definition 38. Suppose that r ≤ R and assume that the sentinel is not affected. By Lemma 39, the values of the sentinels are the same in both I_r and I′_r, and, by Definition 38, the keys β and β′ that come after the sentinel in I_r and I′_r respectively are the same. Furthermore, β is either not deleted, or it is not affected at round r. If β is not deleted, then the call will be returned the identity value and the same lists in both X and X′; the calls to Combine are therefore cognates. If β is deleted, then β is not affected, and we therefore know that all deleted keys that contribute to the value returned by β are not affected. In this case too, the call will be returned the same value and the same list in X and X′, and the calls are cognates. In both cases, we reach a contradiction. We therefore conclude that the sentinel is affected in round r.

Consider the calls to HalfList. Recall that the calls are tagged with the round number, the first cell of the input, and the value and the list returned from the recursive call.
Consider a call to HalfList that does not have a cognate, let r be the round of this call, and let α be the first key. We will show that α is affected in round r. Assume, for the purposes of contradiction, that α is not affected. Since α is not affected, α ≠ α∗, the values of α in I_r and I′_r are the same, and the keys next to α in both lists are the same key β. Furthermore, β is either not deleted in both X and X′, or it is not affected. If β is not deleted, then α will be returned the same value and the same result cell, and thus the calls to HalfList in the two executions are cognates. Suppose that β is deleted. Since β is not affected, all deleted keys that contribute to the value returned to β are not affected. Since unaffected keys have the same values, α will be returned the same value. Since any key that immediately precedes α∗ is affected, α will also be returned the same cell. We conclude that the calls to HalfList with key α are cognates. In both cases, we reach a contradiction. We therefore conclude that α is affected.
Based on the definition of affected keys, it can be shown that the number of affected keys after a single insertion is bounded by O(log n). In fact, we set up the algorithms so that there is an isomorphism between the lists from each round and the skip-lists data structure [82]. In particular, if the key α∗ is inserted into list l, then the set of affected keys is the same as the keys that are compared to α∗ when performing an insertion into the skip lists data structure that contains the keys in l. Since, in skip lists, insertion time is bounded by expected O(log n), we have the following theorem.
Theorem 41 (Stability of Combine) The Combine code is O(log n) stable with respect to insertions or deletions in the input list.
To obtain a complexity bound on change propagation, all that remains is to bound the priority-queue overhead. This is straightforward, because the algorithm is read-write regular: each location is read at most once, and if a location is written before another, then all its reads are performed before the other's. By Theorem 35, it follows that the function Combine self-adjusts to any insertion/deletion in expected O(log n) time.
Theorem 42 (Self-Adjusting List Combine) The Combine code self-adjusts to insertions/deletions in expected O(log n) time.
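The round structure of Combine can be made concrete with a short Python sketch (ours, not the thesis's ML code; default_coin stands in for the pairwise-independent hash family, and destination-passing allocations are dropped). Each round halves the list, folding the value of every deleted key into the nearest surviving key before it, until only the sentinel remains.

```python
import random

def default_coin(key, rnd):
    # Stand-in for the thesis's pairwise-independent binary hash family:
    # a deterministic pseudo-random bit derived from (key, round).
    return random.Random((key, rnd)).random() < 0.5

def half_list(xs, rnd, op, coin):
    """One halving round: every non-sentinel key survives iff coin says so;
    a deleted key's value is folded (with op) into the nearest surviving
    key before it, mirroring HalfList."""
    out = [xs[0]]                        # the sentinel always survives
    for key, val in xs[1:]:
        if coin(key, rnd):
            out.append((key, val))       # key kept this round
        else:
            pk, pv = out[-1]
            out[-1] = (pk, op(pv, val))  # fold into previous survivor
    return out

def combine(xs, op, coin=default_coin):
    """Run halving rounds until only the sentinel remains; its value is
    then the op-combination of all values, mirroring Combine."""
    rnd = 0
    while len(xs) > 1:
        xs = half_list(xs, rnd, op, coin)
        rnd += 1
    return xs[0][1]
```

Because the coin depends only on the key and the round number, re-running with one extra key reproduces the surviving sublists of the original run except near the new key, which is the conciseness/monotonicity property the proofs above exploit.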
9.2 Merge sort
We consider a randomized version of the merge sort algorithm and analyze its stability. The deterministic merge sort algorithm splits its input list evenly in the middle into two sublists, recursively sorts the sublists, and merges them. This is not stable, however, because inserting/deleting a key from the input can change the input to both recursive calls due to the deterministic splitting. We therefore consider a randomized version that splits the input by randomly deciding the list in which each key should be placed. The code for merge sort with randomized splitting is shown in Figure 9.3. The function Split splits a list into two lists by randomly deciding the list in which each key should be placed.
function Split(c) =
  case c of
    nil => return (nil, nil)
  | cons(h, t) =>
      (left, right) ← Split(∗t)
      new ← alloc{c}(2)
      if binaryHash(h) then
        ∗new ← left
        return (cons(h, new), right)
      else
        ∗new ← right
        return (left, cons(h, new))
function Merge(ca, cb) =
  case (ca, cb) of
    (nil, _) => return cb
  | (_, nil) => return ca
  | (cons(ha, ta), cons(hb, tb)) =>
      t ← alloc{ha, hb}(2)
      if ha < hb then
        ∗t ← Merge(∗ta, cons(hb, tb))
        return cons(ha, t)
      else
        ∗t ← Merge(cons(ha, ta), ∗tb)
        return cons(hb, t)
function MSort(c) =
  if length(c) < 2 then
    return c
  else
    (ca, cb) ← Split(c)
    return Merge(MSort(ca), MSort(cb))
Figure 9.3: Merge sort.

For randomization, the function relies on a family H of pairwise independent binary hash functions. We select one member of the family for each round (recursion depth of MSort). Since there are families H with |H| polynomial in n [97], we need O(log² n) random bits in expectation. The merge function takes two lists and merges them in the standard manner.
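The figure's code can be rendered as a short executable Python sketch (ours; binaryHash is replaced by a coin seeded per key and recursion level, and the destination-passing allocations are dropped):

```python
import random

def split(xs, coin):
    """Split a list by letting coin decide, per key, which sublist it joins."""
    left, right = [], []
    for x in xs:
        (left if coin(x) else right).append(x)
    return left, right

def merge(a, b):
    """Merge two sorted lists in the standard manner."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def msort(xs, depth=0):
    """Merge sort with randomized splitting: a fresh coin (standing in for
    the hash function chosen per recursion depth) is used at each level."""
    if len(xs) < 2:
        return xs
    coin = lambda x: random.Random((x, depth)).random() < 0.5
    a, b = split(xs, coin)
    return merge(msort(a, depth + 1), msort(b, depth + 1))
```

Seeding the coin by (key, depth) gives the determinism the stability argument relies on: inserting one key leaves every other key's placement at every level unchanged.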
9.2.1 Analysis
Consider the class of single insertions or deletions, denoted ∆, consisting of the pairs of inputs that differ by a single insertion or deletion. We will show that the merge sort algorithm is expected O(log n)-stable for ∆, where the expectation is over the initial random bits used to select the hash functions for randomized splitting.
The trace for merge sort consists of calls to MSort, Merge, and Split. Each call to MSort is tagged with the first cell of the input and the two cells returned by the subcall to Split. Each call to Merge is tagged with the first cells of the two inputs and the first cell of the sorted list returned by the recursive call. The calls to Split are tagged with the first cell of the input and the two cells returned by the recursive call. In all cases, the reads of the tails do not contribute to the tags, because the read values are only passed to subcalls. Since none of the functions is called on the same list cell(s) more than once, the algorithm is concise. It is also straightforward to show that merge sort is monotone for ∆, based on the following properties: (1) the execution order of the calls to Split is determined by the order of keys in the input list, (2) the execution order of the calls to Merge is determined by the sorted order of keys in the input list, and (3) the execution order of the calls to MSort is determined by the recursion level and the order of keys in the input. Since both Merge and Split are singly recursive, their caller-callee relationships are determined by their execution order. Since one insertion/deletion into/from the input does not swap the locations of two keys in the input, the calls to Merge and Split satisfy the monotonicity requirements. Inserting/deleting one key may introduce one new call to MSort at each level without swapping the order of calls along any recursive call path. Since one insertion/deletion does not change the relative execution order of other keys, the calls to MSort satisfy the monotonicity requirements. The merge sort application is therefore monotone for one insertion/deletion. Since merge sort is monotone, stability bounds with respect to insertions and deletions are symmetric.
We will therefore consider insertions only. For the rest of this section, consider two lists I and I′, where I′ is obtained from I by inserting α∗. Let X and X′ denote the executions, and T and T′ the traces, of merge sort with I and I′ respectively. For the analysis, we divide the execution into rounds based on the recursion level of the MSort calls: the first round corresponds to the top call to MSort and contains the top-level calls to Split and Merge; the second round corresponds to the two recursive calls of the top-level MSort and the corresponding calls to Split and Merge; and so on. For each round, we show that the number of function calls without cognates is constant in expectation. Consider the calls to Split in X and X′. The key α∗ is the first cell of one call to Split in each round, and α∗ is returned to an expected two calls in each round; all the other calls have the same tags. Therefore the calls to Split contribute an expected three to the trace distance in each round. Consider the calls to MSort. Since MSort is called at most once per round with a list that contains α∗, the calls to MSort add at most one per round to the distance between T and T′. Since the expected number of rounds in both X and X′ is O(log n), where n is the number of keys in I, the calls to Split and MSort contribute O(log n) to the trace distance. We now show that the contribution of the calls to Merge in each round is also expected constant. Consider some call to Merge that does not have a cognate. There are several possibilities. In the first case, the arguments to Merge are different: the cells are either created by a Merge call that performs a comparison that takes place in only one execution, or the cells themselves contain keys that have never been compared; we charge this cognate to the comparison. In the second case, the call is returned a list that is different than before: this means that either the recursive call performs a
comparison that takes place in only one of the executions, or its callee does. In this case too, we charge the cognate to the comparison. Since each Merge performs one comparison and allocates one cell, each comparison is charged a constant number of times. To analyze the stability of Merge, it therefore suffices to count the number of comparisons that differ in X and X′. We first show the key sampling lemma behind the analysis.
Lemma 43 Consider merging two sorted lists that are sampled randomly from a list by flipping a coin for each key and putting together the keys with the same outcome. The expected number of comparisons that a key is involved in is 2.
Proof: Since the input lists to merge are sampled randomly, each key is equally likely to be placed in either of the inputs to merge. Fix one of the input lists, let k be a key in that list, and let k′ be its predecessor. The expected number of keys in the sorted output list between k′ and k is 1 (since all keys in between must have been selected into the other list). Therefore, the expected number of times that k is compared is 2.
Since the inputs I and I0 differ by a single key, and since each key is input to exactly one Merge call in each round, the number of comparisons that differ between the two executions is expected constant per round. Since the expected number of rounds is O(log n) in the size of the input, the comparisons differ by expected O(log n) between X and X0. Therefore, the calls to Merge contribute O(log n) to the trace distance.
Theorem 44 (Merge Sort Stability) Merge sort with randomized splitting of lists is expected O(log n) stable for the class of single insertions/deletions to/from the input list.
Based on the facts that (1) a single insertion/deletion changes the output of Split and the output of Merge at a single location by inserting/deleting a key, and (2) each location is read at most two times, we can show that merge sort has constant-width for one insertion/deletion. By Theorem 35, the stability bound yields an equivalent time bound.
Theorem 45 (Self-Adjusting Merge sort) Merge sort with randomized splitting self-adjusts to insertions/deletions anywhere in the input in expected O(log n) time.
9.3 Quick Sort
We consider one of the many variants of quick sort on a list. It uses the first key of the input as the pivot, and it avoids appends by passing the sorted tail in an accumulator. We note that many other variants would work, including using the last key, or a maximal-priority element, as the pivot. Using the middle element as the pivot would not work, since this is not stable: randomly inserting an element has a constant probability of changing the middle element. Figure 9.4 shows the code for the version that we consider. The argument rest is the accumulator for the sorted tail, and the location dest is where the result will be stored (the destination).
function Split(c, p, rest, dest) =
  case c of
    nil => return (nil, nil)
  | cons(h, t) =>
      (ca, cb) ← Split(∗t, p, rest, dest)
      l ← allocate{h, p}(2)
      if h < p then
        ∗l ← ca
        return (cons(h, l), cb)
      else
        ∗l ← cb
        return (ca, cons(h, l))
function QSort(c, rest, dest) =
  case c of
    nil => ∗dest ← rest
  | cons(h, t) =>
      (ca, cb) ← Split(∗t, h, rest, dest)
      mid ← allocate{h}(2)
      QSort(cb, rest, mid)
      QSort(ca, cons(h, mid), dest)
Figure 9.4: The code for quick sort
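The accumulator-passing structure of Figure 9.4 can be rendered as a purely functional Python sketch (ours; it drops the destination-passing store operations and the tagged allocations, keeping only the control structure):

```python
def qsort(xs, rest=None):
    """Accumulator-passing quick sort: the first key is the pivot; the
    sorted tail is threaded through `rest` so that no append is needed,
    mirroring the structure (not the store-passing style) of Figure 9.4."""
    if rest is None:
        rest = []
    if not xs:
        return rest
    pivot, tail = xs[0], xs[1:]
    smaller = [x for x in tail if x < pivot]
    larger = [x for x in tail if x >= pivot]
    # sort the larger half onto rest, put the pivot in front of that,
    # then sort the smaller half onto the result
    return qsort(smaller, [pivot] + qsort(larger, rest))
```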
Figure 9.5: Pivot trees for quick sort before and after the deletion of q.
9.3.1 Analysis
Consider the class of single insertions or deletions, denoted ∆, consisting of the pairs of inputs that differ by a single insertion or deletion. We will show that quick sort is expected O(log n)-stable for ∆, where the expectation is over all permutations of the input. The analysis relies on an isomorphism between quick sort and random search trees, such as treaps [90]. Consider the call tree formed by the executed QSort calls and label each vertex (call) with the pivot of that call; we call this the pivot tree of the execution. It is a property of quick sort that the pivot tree is isomorphic to the binary search tree obtained by starting with an empty tree and inserting the keys in the order specified by the input list. The pivot tree is therefore isomorphic to the treap obtained by assigning each key a priority equal to its position in the input. Figure 9.5 shows the pivot trees for the executions of quick sort with the inputs [m, q, s, c, v, u, f, o, n, b, r, a, p, g, h, t, l, u] and [m, s, c, v, u, f, o, n, b, r, a, p, g, h, t, l, u]. For the rest of this section, consider the executions X and X′ of QSort with inputs I and I′, where I′ is obtained from I by deleting some key α∗. Let P and P′ denote the corresponding pivot trees. We start by showing that the pivot trees P and P′ are related by a zip operation. For a pivot tree P and a node α in P, we define the left spine of α as the rightmost path of the subtree rooted at the left child of α, and the right spine of α as the leftmost path of the subtree rooted at the right child of α. In the example shown in Figure 9.5, the left spine of q contains the vertices o, p, and the right spine of q contains the vertices s, r. To zip a tree at the vertex α, we replace the subtree rooted at α with the tree obtained by merging the subtrees P_l and P_r rooted at the left and the right children of α.
We merge P_l and P_r by creating a path consisting of the vertices of the left and the right spines of α in their input order and connecting the remaining subtrees to this path. For example, zipping the tree on the left of Figure 9.5 at q yields the tree on the right, assuming that the keys are inserted in the following relative order: [s, o, r, q]. Given the pivot tree P, we say that a key α is a primary key if α = α∗ or α is a proper ancestor of α∗, and that α is a secondary key if α is on the left or the right spine of α∗. We say that a key α is affected if α is a primary or a secondary key, and unaffected otherwise. For example, in Figure 9.5, the affected keys are m, q, o, p, s, r; the primary keys consist of m and q, and the secondary keys consist of o, p, s, r. Given an execution of QSort and the pivot tree P for the execution, we associate an interval with each key. Let α be a key in P and let Π be the path from α to the root of P. Let α_1 be the greatest key whose right child is on Π, or −∞ if no such key exists, and let α_2 be the least key whose left child is on Π, or +∞ if no such key exists. We define the execution interval of α to be the pair (α_1, α_2). As an example, consider the tree on the left in Figure 9.5: the execution interval of p is (o, q), the execution interval of i is (c, m), and the execution interval of a is (−∞, b). The trace of quick sort consists of calls to QSort and Split. The arguments to QSort contribute to the tag the first cell of the input list, the first cell of the sorted accumulator list (rest), and the destination location; the returns contribute the first cells of the split lists. For the calls to Split, the arguments contribute the first cell of the input, the pivot, the first cell of the accumulator, and the destination; the returns contribute the two cells returned by the recursive call. Since a value read from a tail is only passed to a recursive call, it is redundant and therefore does not contribute to the tag.
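The isomorphism between the pivot tree and the insertion-order binary search tree can be checked directly. The following Python sketch (ours; it assumes distinct keys) builds both trees for the same input order and compares them node by node:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def pivot_tree(xs):
    """Pivot tree of quick sort on xs: the root is the first key (the
    pivot); the children are the pivot trees of the two partitions."""
    if not xs:
        return None
    pivot = xs[0]
    node = Node(pivot)
    node.left = pivot_tree([x for x in xs[1:] if x < pivot])
    node.right = pivot_tree([x for x in xs[1:] if x > pivot])
    return node

def bst_insert(root, key):
    """Standard insertion into a binary search tree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root

def same_shape(a, b):
    """Structural equality of two binary trees."""
    if a is None or b is None:
        return a is b
    return (a.key == b.key and same_shape(a.left, b.left)
            and same_shape(a.right, b.right))
```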
We show a lemma that establishes a connection between affected vertices and function calls.

Lemma 46 (Affected Keys and Function Calls) Let v be a QSort call in T and let α be the pivot of that call. Let v′ be a QSort call in T′ with pivot α. Let w_1, …, w_m and w′_1, …, w′_r be the recursive calls to Split starting at v and v′ respectively. If α is not affected, then v ≡ v′, m = r, and w_i = w′_i for all 1 ≤ i ≤ m. If α is a primary key and α ≠ α∗, then r = m + 1 and all but an expected five of the calls w_i, 1 ≤ i ≤ m, and w′_i, 1 ≤ i ≤ r, have cognates. Furthermore, if α ≠ α∗ and α is not the parent of α∗, then v = v′.
Proof: Let P and P′ be the pivot trees for X and X′. Recall that P′ is obtained from P by applying the zip operation. The zip operation ensures that (1) all subtrees of P and P′ except for those rooted at affected vertices are identical, and (2) the parent of an unaffected key is the same in both P and P′.
We will consider the two cases separately.
α is not affected: By property 1, we know that the input lists to v and v′ consist of the same keys. By property 2, we know that the lists are in fact identical, because the tags of the input cells are determined by the parent and the keys in the input. We conclude that the input lists to v and v′ are identical, i.e., consist of equal cells; therefore the cells returned to them by the subcalls to Split are also identical. To show that the destinations and the first cells of the accumulators of v and v′ are equal, we show that α has the same execution interval in P and P′. It is a property of the algorithm that the execution interval determines these two arguments. We consider two cases. First, suppose that α is not a descendant of α∗ in P. In this case, the paths from α to the root of P and from α to the root of P′ are identical; therefore, the execution intervals of α are the same in both P and P′. Second, suppose that α is a descendant of α∗ in P, and let β be the least affected key that is an ancestor of α. We consider two symmetric cases. Consider the first case, where β is on the left spine. Let (α_1, α_2) be the execution interval of α in P. We know that α_2 is a descendant of β, and that α_1 is either a descendant of α, or an ancestor of α∗, or −∞. For example, in Figure 9.5, if α = n, then α_1 = m and α_2 = o. Consider P′ and let (α′_1, α′_2) be the execution interval of α in P′. Recall that P′ is obtained from P by zipping P at α∗, and that the zip operation merges the left and the right spines and leaves the subtrees hanging off the spines intact. Since all the keys on the right spine are bigger than the keys on the left spine, when the two spines are merged, each key from the left spine has only a left child on the merged path. Therefore α′_1 = α_1 and α′_2 = α_2. The case where β is on the right spine is symmetric. Since all arguments to v and v′ are the same, and since the input lists are identical, the recursive calls to Split are equal.
α is a primary key and α ≠ α∗: Since α is a proper ancestor of α∗, α has the same execution interval with respect to both P and P′. We therefore conclude that the destination and the first cell of the accumulator are the same for both v and v′. Furthermore, we know that v and v′ have the same pivot. If α is not the parent of α∗, then α∗ is not the first or the second key in the input list, and therefore the first cells of the inputs to v and v′ are the same; thus v ≡ v′. Furthermore, the cells returned by Split are also the same. Since the inputs to v and v′ differ only by α∗, and since α∗ is the first key in one call to Split and is expected to be the first key in two calls to Split, all but an expected five (2 + 2 + 1) of the recursive calls to Split have cognates.
A key property of the algorithm is that the calls to QSort and Split in T whose pivots are secondary keys or the key α∗ do not have a similar call in T′. This is because these calls in T have α∗ in their execution interval, and therefore either their destination or their accumulator is determined by α∗. Since α∗ is not present in T′, none of these calls has a similar function call in T′. By inspecting the pivot trees P and P′, it is easy to see that for all other calls, the execution order and the caller-callee relationships remain the same in both T and T′. Since the quick sort algorithm is concise, we conclude that it is monotone for a single insertion/deletion.
To analyze the stability of QSort, we compare the set of function calls performed before and after an insertion/deletion. We perform the analysis in two steps. First, we show that quick sort is expected O(n) stable for the deletion of the first key in the input; for this bound, expectations are taken over all permutations of the input. Second, we show that quick sort is expected O(log n) stable for a deletion at a uniformly randomly selected position; for this bound, the expectation is over all permutations of the input list as well as over all possible deletion positions. Note that these bounds depend only on the position of the deleted key, not on the key itself.
Lemma 47 Quick sort is O(n)-stable for an insertion/deletion to/from the head of the input list with n keys.
Proof: Since quick sort is monotone, insertions and deletions are symmetric; therefore we consider only deletions. Let I be an input list and let α∗ be the first key of I, i.e., I = α∗ :: I′, where I′ is the list obtained by deleting α∗. Let P and P′ denote the pivot trees, and T and T′ the traces, with I and I′ respectively. Note that P′ is obtained from P by zipping P at α∗. By Lemma 46, we know that all calls to QSort and Split whose pivots are not affected are equal. Since quick sort is monotone, all these calls also have cognates and therefore do not contribute to the trace distance. We therefore need to consider only the calls whose pivots are affected. Since the deleted key α∗ is at the root of the tree, there are no primary keys. Consider now the calls associated with secondary keys, i.e., the keys on the left and the right spines of α∗ (in P). The total number of calls to QSort along the left and right spines of α∗ is expected to be logarithmic in the size of the input, because the depth of the quick sort pivot tree is expected logarithmic (over all permutations of the input). The total number of calls to Split associated with a key is the size of the subtree rooted at that key. To bound the number of calls associated with the secondary keys, we show that the total size of the subtrees along any path in the tree starting at the root is linear in the size of the input.
Consider the sizes of the subtrees for the vertices on such a path, and define the random variables X_1, …, X_k such that X_i is the least number of vertices down the spine for which the subtree size becomes (3/4)^i n or less after it first becomes (3/4)^{i−1} n or less. We have k ≤ ⌈log_{4/3} n⌉, and the sum of the sizes of the subtrees along the path is
C(n) ≤ ∑_{i=1}^{⌈log_{4/3} n⌉} X_i (3/4)^{i−1} n.
Since the probability that a key on the path has rank between m/4 and 3m/4, where m is the size of the subtree at that key, is 1/2, we have E[X_i] ≤ 2 for i ≥ 1. We therefore have
E[C(n)] ≤ ∑_{i=1}^{⌈log_{4/3} n⌉} E[X_i] (3/4)^{i−1} n ≤ 2 ∑_{i=1}^{⌈log_{4/3} n⌉} (3/4)^{i−1} n.
Thus, E[C(n)] = O(n). This argument applies to the left and the right spines of α∗, and also to the path obtained by merging them. Since each call to Split and QSort takes constant time excluding the time spent in other calls, each call has constant weight. We conclude that quick sort is O(n) stable for insertions/deletions into/from the head of the input list.
Lemma 48 Quick sort is expected
1. O(log n) stable for any deletion from the input list, where the expectation is taken over all permutations of the input, and
2. O(log n) stable for an insertion at a uniformly randomly chosen position in the input list, where the expectation is taken over all permutations of the input as well as the insertion positions.
Proof: Since quick sort is monotone, insertions and deletions are symmetric; we will therefore consider deletions only. Let I be a list and let α∗ be some key in I. Let I′ be the list obtained after deleting α∗ from I. Let P be the pivot tree for I and let m be the size of the subtree of P rooted at α∗. Let P′ be the pivot tree obtained by zipping P at α∗; we know that P′ corresponds to an execution of quick sort with the list I′. To show the lemma, we consider separately the function calls whose pivots are in the subtree of α∗ and the calls whose pivots are not in that subtree. For keys in the subtree of α∗, we take expectations over all permutations of I for which the subtree of α∗ has size m, and we invoke Lemma 47 to show that the distance between the traces of quick sort before and after the deletion of α∗ due to these calls is bounded by O(m). To invoke the lemma, we must take care that the expectations are taken appropriately. Let A be the set of affected keys and consider all permutations of I for which the subtree of α∗ consists only of the keys in A; by Lemma 47, the trace distance for the calls with affected pivots is O(m). Considering now all permutations of I for which the subtree of α∗ has exactly m keys, and applying Lemma 47 to each affected set, we conclude that the trace distance due to affected calls is O(m). Over all permutations of I, the expected size of the subtree rooted at α∗ is O(log n) [90]; we therefore conclude that the trace distance due to affected calls is expected O(log n). Consider now the calls whose pivots are not in the subtree rooted at α∗. By Lemma 46, all calls whose pivots are the same unaffected key in T and T′ are equal; since quick sort is monotone, these calls have cognates and do not contribute to the trace distance. Consider finally the calls whose pivots are primary keys. By Lemma 46, these calls add an expected O(log n) to the trace distance. We therefore conclude that the trace distance is bounded by expected O(log n).
By combining Lemma 47 and Lemma 48 we have the following stability for quick sort.
Theorem 49 (Stability of Quick Sort) Quick sort is
1. expected O(n) stable for insertions/deletions at the beginning of the input,
2. expected O(log n) stable for insertions/deletions at a uniformly random position in the input.
To give a tight bound for change propagation with quick sort, we make a small change to the code to bound the dependence width by a constant. Instead of having Split allocate locations based on comparisons, we have Split allocate locations based on the locations along with the prev and next arguments. This ensures that the locations allocated by the secondary vertices before and during change propagation are disjoint. Based on this fact and the fact that a single insertion/deletion changes the output of Split at a single location by inserting/deleting a key, it is straightforward to show that the quick sort algorithm has constant width for insertions/deletions. This modification does not change the stability bound. We therefore have the following complexity bound for self-adjusting quick sort.
Theorem 50 (Self-Adjusting Quick Sort) Quick sort self-adjusts to an
1. insertion/deletion at the beginning of the input in expected O(n) time,
2. insertion/deletion at a uniformly random position in expected O(log n) time.
All expectations are taken over all permutations of the input as well as the internal randomization (but not over the values inserted).
9.4 Graham’s Scan
As an example of an incremental algorithm that builds its output by inserting the inputs one by one, we consider the Graham's Scan algorithm [42] for computing convex hulls in the plane. For simplicity, we only consider finding the upper hull. Since the lower hull can be computed using the same algorithm after negating the outcome of the line-side tests, this assumption causes no loss of generality. Figure 9.6 shows the code for the algorithm. The algorithm first sorts the input points from left to right (in order of their x coordinates, with ties broken by comparing the y coordinates). The algorithm then constructs the hull by incrementally inserting each point into the current hull h, which starts out empty. The hull is represented as a list of points ordered from right to left. We assume that the points are in general position (no three points are collinear). Since any two points can be connected by a line, the general-position assumption implies that the points are unique. To simplify the terminology and the cases that arise during the analysis, we assume that the first point of the input is a special point pmin. We define pmin to be the leftmost point; any point is to the right of a line that contains pmin. This assumption causes no loss of generality, because the comparison functions can be extended to check for the special point with constant-time overhead. The true convex hull can be obtained by removing pmin from the output in constant time.
9.4.1 Analysis
Since we analyzed sorting algorithms separately in the previous two sections, we assume, for the analysis, that the input is sorted and that all changes obey the sorted ordering. Define the class of single insertions/deletions, denoted ∆, to be the pairs of inputs sorted from left to right that differ by a single insertion
function Insert(p, c) =
  case c of
    nil => return c
  | cons(a, ta) =>
      case (∗ta) of
        nil =>
          t ← allocate{p}(2)
          ∗t ← c
          return cons(p, t)
      | cons(b, tb) =>
          if leftTurn(p, a, b) then
            return Insert(p, cons(b, tb))
          else
            t ← allocate{p}(2)
            ∗t ← c
            return cons(p, t)
function Scan(cl, ch) =
  case cl of
    nil => return ch
  | cons(p, t) => Scan(∗t, Insert(p, ch))
function GrahamScan(l) =
  ls ← Sort(l)
  return Scan(ls, nil)
Figure 9.6: Code for Graham's Scan.

or deletion. We will show an input-sensitive bound on the stability of the algorithm for ∆. Based on this bound, we will show an O(1) expected stability bound, where expectations are taken over all positions of the insertion or deletion. Given two points a and b, we denote the (infinite) line passing through a and b as →ab, and the line segment from a to b as ab. For a line l, we define the upper half plane U_l as the set of points that are above (or to the left of) the line l. We say that a point p sees a line segment ab if p is to the left of the line segment ab, i.e., p ∈ U→ab. If p sees ab, then we say that ab is visible from p. We start by stating a known property of convex hulls. The lemma is stated only for insertions, but it holds for deletions by symmetry.
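The pseudo-code of Figure 9.6 can be rendered in ordinary Python as a sketch. Everything below is an assumption-laden rendition: points are pairs, the sees predicate is implemented with a cross product whose sign convention is our choice, and the hull list is kept rightmost-first as in the text.

```python
def sees(p, a, b):
    """Line-side test: p sees segment ab (a to the right of b) iff p lies
    strictly above the line through b and a (cross-product sign test)."""
    return (a[0] - b[0]) * (p[1] - b[1]) - (a[1] - b[1]) * (p[0] - b[0]) > 0

def insert(p, hull):
    """Insert p (to the right of all hull points) into the upper hull,
    popping every segment visible from p, as Insert does in Figure 9.6."""
    while len(hull) >= 2 and sees(p, hull[0], hull[1]):
        hull = hull[1:]          # p sees the rightmost segment: drop its endpoint
    return [p] + hull

def graham_scan(points):
    """Scan the points left to right; the hull is ordered right to left."""
    hull = []
    for p in sorted(points):     # sort by x, ties broken by y
        hull = insert(p, hull)
    return hull
```

For example, `graham_scan([(0,0),(1,2),(2,1),(3,3),(4,0)])` yields the upper hull `[(4,0),(3,3),(1,2),(0,0)]`, rightmost point first; the interior point (2,1) is discarded when (3,3) is inserted.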
Lemma 51 Consider a list of points I and let H be the convex hull of I. Let I′ be a list obtained from I by inserting a single point p∗. The convex hull H′ of I′ can be obtained from H by inserting p∗ into H in the correct left-to-right order and deleting all line segments of H that are visible from p∗. Any point in H′ except for p∗ is in the hull H.
Figure 9.7: Inserting p∗ into the hull.
As an example, consider Figure 9.7. The initial hull H is [c, a, d, e, f, g, h, b, i, j] and the hull after inserting p∗ is [c, a, p∗, b, i, j]. Inserting p∗ removes the points d, e, f, g, h and inserts p∗. All comparisons performed by GrahamScan involve a point and a line segment. More specifically, the algorithm compares the point p currently being inserted to the segments on the convex hull from right to left until it finds a segment that is not visible from p. We distinguish between two types of comparisons performed by GrahamScan. The first type involves a line segment on the hull and a point that sees that segment. The second type involves a point p and a line segment ab that is not visible from p. This type of comparison arises when ab is the rightmost line segment in the hull when p is being inserted, but p does not see ab; or when p sees the line segment bc to the right of ab but does not see ab. We say that a point p probes the line segment ab if it is compared to ab but does not see it. For example, in Figure 9.7, the hull consists of the points [c, a, d, e] when p∗ is being inserted, and p∗ sees the segments ad, de and probes ca. Consider executions X and X′ of the Graham's Scan algorithm with input I and with input I′, where I′ is obtained from I by inserting the point p∗. Let C and C′ denote the sets of comparisons (line-side tests) performed in X and X′. We bound the symmetric set difference between C and C′ based on an input-sensitive cost metric. Given an input and any point p in the input, we define the cost of p, denoted degree(p), as the total number of comparisons (line-side tests) that involve p in a from-scratch execution of GrahamScan with I. The value degree(p) is bounded by the total number of calls to Insert with p plus the number of calls to Insert at which p appears as the first point of the hull. We show that GrahamScan is O(degree(p∗)) stable for the insertion/deletion of p∗.
Lemma 52 The following properties hold between probing comparisons.
1. If p probes the line segment de in X′, where d ≠ p∗, e ≠ p∗, and p ≠ p∗, then p probes de in X.

2. If p probes the line segment de in X but not in X′, then there is some point a to the left of e, or a = e, such that p probes ap∗ or p∗a.
Proof: Let Hp and H′p be the convex hulls consisting of the points to the left of p in I and I′ respectively. We consider the two properties separately.
Figure 9.8: The half planes defined by →de, →ap∗, →ef, and →p∗b.
Consider the first property. If p is to the left of p∗, then Hp and H′p are identical and the property holds trivially. Suppose that p is to the right of p∗. We consider the cases where p may probe de separately. In the first case, de is the rightmost segment of H′p. By Lemma 51, de is also the rightmost segment of Hp, and therefore p probes de. In the second case, there is some segment ef that is visible to p. We have two subcases. In the first subcase, f ≠ p∗, and therefore both de and ef are in Hp, and thus p probes de. In the second subcase, f = p∗. Let dk be a line segment in Hp. If dk exists, then we know, by Lemma 51, that p sees dk. Since p sees dp∗, we know that p sees dk, and therefore p probes de in X. In all cases, p probes de in X.

Consider the second property. If p is to the left of p∗ and p probes de in X′, then p will probe de in X, because the hulls Hp and H′p are identical. Suppose now that p is to the right of p∗. If de is the rightmost segment in Hp, then p∗ is the rightmost point in H′p. In this case, p will probe the rightmost segment ap∗, and the property holds. Suppose now that de is not the rightmost segment and let ef be a segment in Hp. In this case, we know, by Lemma 51, that de or ef (or both) are buried by p∗. If ef is not buried, we know that de is buried. In this case, p will probe either dp∗ or p∗e (pick a = d or a = e). Suppose now that ef is buried. Let ap∗ and p∗b denote the line segments in H′p with end point p∗. Since pmin is always in H′p, a exists. Note that, if p sees ap∗, then p will see de, because the part of the half plane U→ap∗ to the right of b is contained in the part of the half plane U→de to the right of b (see Figure 9.8). Since p does not see de, p does not see ap∗. We have two cases depending on whether b exists or not.
p∗b exists: We know that p sees p∗b because, as shown in Figure 9.8, the subset of U→ef consisting of the points to the right of b is a subset of U→p∗b. Therefore p sees p∗b and will be compared to ap∗. Since p does not see ap∗, p probes ap∗ and therefore the property holds.

p∗b does not exist: In this case, only ap∗ exists and p∗ is the rightmost point to the left of p. Therefore p will be compared to ap∗. Since p does not see ap∗, p probes ap∗ and therefore the property holds.
Lemma 53 Consider executions X and X′ of the Graham's Scan algorithm with input I and with input I′, where I′ is obtained from I by inserting the point p∗. Let C and C′ denote the sets of comparisons (line-side tests) performed in X and X′. The number of comparisons in the symmetric difference between C and C′, |(C \ C′) ∪ (C′ \ C)|, is bounded by 2 degree(p∗).
Proof: We consider the sets C \ C′ and C′ \ C separately and show that the size of each set is bounded by degree(p∗). Let p be a point and let de be a line segment that is involved in a comparison from C \ C′. Let Hp denote the hull immediately before the insertion of p in X, and let H′p denote the hull immediately before the insertion of p in X′. We have two cases depending on whether p sees or probes de.
1. p sees de. By Lemma 51, de is in H′p unless it is buried by p∗. Since p does not see de in X′, de is buried by p∗. Therefore, de is compared to p∗. We charge the comparison between p and de to the comparison between de and p∗. Since de is deleted from the hull by p, there is no other point p′ that sees de in X. Therefore the comparison between p∗ and de is charged at most once by all non-probing comparisons.

2. p probes de. By Lemma 52, we know that there is some point a such that p probes ap∗. In this case, we charge the comparison between p and de to the comparison between p and ap∗. Since p performs exactly one probing comparison in both X and X′, the comparison between p and ap∗ is charged at most once.
In the first case, we are able to charge each non-probing comparison from C \ C′ to a non-probing comparison involving p∗ such that each non-probing comparison involving p∗ is charged at most once. In the second case, we are able to charge each probing comparison from C \ C′ to a probing comparison involving p∗ such that each probing comparison involving p∗ is charged at most once. We therefore conclude that |C \ C′| ≤ degree(p∗). Now let p be a point and let de be a line segment on the hull that is involved in a comparison from C′ \ C. We have two cases depending on whether p sees or probes de. If p sees de in X′, then either d or e is p∗, because otherwise de is also in the hull in X by Lemma 51, and therefore p would see de in X. In the second case, p probes de. By Lemma 52, we know that either d or e is equal to p∗. Therefore the total number of comparisons in C′ \ C is bounded by degree(p∗), i.e., |C′ \ C| ≤ degree(p∗). We conclude that |(C \ C′) ∪ (C′ \ C)| ≤ 2 degree(p∗).
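The bound of Lemma 53 can be checked empirically with an instrumented scan. The sketch below makes the same assumptions as before (points as pairs, a cross-product sees test of our choosing); it records each line-side test as a triple and compares the comparison sets before and after inserting a point p∗:

```python
def sees(p, a, b):
    # p sees segment ab (a to the right of b) iff p is above the line through b, a
    return (a[0] - b[0]) * (p[1] - b[1]) - (a[1] - b[1]) * (p[0] - b[0]) > 0

def comparisons(points):
    """Run the upper-hull scan on sorted points, recording every
    line-side test (p, a, b) that Insert would perform."""
    tests, hull = set(), []
    for p in points:
        while len(hull) >= 2:
            a, b = hull[0], hull[1]
            tests.add((p, a, b))
            if sees(p, a, b):
                hull = hull[1:]   # p sees ab: a leaves the hull
            else:
                break             # p probes ab: stop scanning
        hull = [p] + hull
    return tests

I = [(0, 0), (1, 2), (3, 3), (4, 0)]      # a small sorted example
p_star = (2, 1)
I2 = sorted(I + [p_star])
C, C2 = comparisons(I), comparisons(I2)
degree = sum(1 for t in C2 if p_star in t)   # line-side tests involving p*
assert len(C ^ C2) <= 2 * degree             # the bound of Lemma 53
```

On this instance the symmetric difference has size 2 and degree(p∗) = 2, so the bound holds with room to spare.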
The trace of the Graham's Scan algorithm consists of a call to GrahamScan at the root and a chain of calls to Scan, one for each point in the input. Each call to Scan starts a chain of calls to Insert, one for each point removed from the hull. Each call to Scan is tagged with the vertex being inserted into the hull, and each call to Insert is tagged with the point being inserted and the rightmost point of the hull. Therefore both the execution order and the caller-callee relationships between function calls are determined by the ordering of the points in the sorted input and the hull. Since inserting/deleting one point into/from the input does not exchange the order of points in the sorted input or in the hull (which is always a sublist of the input points), we have the following theorem.
Theorem 54 The Graham’s Scan algorithm is monotone with respect to single insertions/deletions into/from the input.
Since the algorithm is monotone, insertions and deletions are symmetric, and the trace distance can be measured by comparing the sets of function calls.
Theorem 55 The Graham’s Scan algorithm for finding planar convex hulls is O(degree(p∗)) stable for insertion/deletion of a point p∗.
Proof: Since insertions and deletions are symmetric for monotone programs, we consider only insertions. Consider inserting the point p∗ into the input I to obtain the input I′. Let C and C′ denote the sets of line-side tests performed by executions of GrahamScan with I and I′. Since GrahamScan is called exactly once in each execution, it contributes zero to the trace distance. Since Scan is called exactly once for each input point, it contributes a constant to the distance. For calls to Insert, note that each call is tagged with the points involved in the line-side test being performed. Since all memory is allocated based on the points, the quantity |(C \ C′) ∪ (C′ \ C)| bounds the number of calls to Insert with no cognates. Since |(C \ C′) ∪ (C′ \ C)| ≤ 2 degree(p∗), we conclude that the algorithm is O(degree(p∗)) stable for insertions/deletions.
This, together with Theorem 34, gives us the following bound on the change-propagation time for Graham's Scan.

Theorem 56 The Graham's Scan algorithm updates its output in O(degree(p∗) log degree(p∗)) time after the insertion/deletion of a point p∗.
Since each call to Insert either inserts a point into the current hull or deletes a point from the current hull, the total number of calls to Insert is at most 2n. Since line-side tests are performed only by Insert, the total number of line-side tests is bounded by 2n. Since each line-side test involves three points, the sum of the degree(·)'s over all points is no more than 6n. The Graham's Scan algorithm is therefore expected constant stable for the deletion of a uniformly chosen point from the input. The same bound applies to an insertion when taking expectations over all points in the input. Since the degree of a point is bounded by n, we have the following theorem.
Theorem 57 (Self-Adjusting Graham’s Scan) The Graham’s Scan algorithm is expected O(1)-stable for a uniformly random deletion/insertion into a list with n points, and handles an insertion/deletion in expected O(log n) time.
Recall that self-adjusting sorting requires expected O(log n) time. Since a single insertion/deletion changes the output of the sort by one key, the two algorithms, when combined, yield an expected O(log n)-time self-adjusting convex-hull algorithm.

Chapter 10

Tree Contraction and Dynamic Trees
This chapter shows that the randomized tree-contraction algorithm of Miller and Reif is expected O(log n) stable [62]. The self-adjusting tree-contraction algorithm solves a generalization of Sleator and Tarjan's dynamic-trees problem [91, 92] in expected O(log n) time. Section 10.1 describes tree contraction and how it can be implemented. Section 10.3 proves the expected O(log n) stability bound for tree contraction under edge insertions/deletions.
10.1 Tree Contraction
The tree-contraction algorithm of Miller and Reif takes a t-ary forest, F, and contracts each tree in the forest to a single vertex in a number of rounds. In each round, each tree is contracted by applying the rake and compress operations. The rake operations delete all leaves of the tree (if the tree is a single edge, then only one leaf is deleted). The compress operations delete an independent set of degree-two vertices. By an independent set, we mean a set of vertices none of which are adjacent to each other. All rake and compress operations within a round are local and are applied "in parallel": all decisions are based on the state when the round starts. The algorithm terminates when no edges remain. Various versions of tree contraction have been proposed, depending on how they select the independent set of degree-two vertices to compress. There are two basic approaches, one deterministic and the other randomized. We use a randomized version. In randomized tree contraction, each vertex flips a coin in each round, and a degree-two vertex is compressed if it flips heads, its two neighbors both flip tails, and neither neighbor is a leaf. This compress rule is a slight generalization of the original rule of Miller and Reif [62], which applies only to rooted trees. Using tree contraction, the programmer can perform various computations on trees by associating data with edges and vertices and defining how data is accumulated during rake and compress [62, 64]. In Chapter 17, we consider a broad range of applications studied in the context of the dynamic-trees problem [91, 38, 96, 46, 11, 5, 95], and show how they can be computed using rake and compress operations. Since the rake and compress operations are entirely determined by the structure of the tree and not by the data, we assume, for this section, that trees come with no associated data.
Due to the orthogonality between rake/compress operations and the data operations, all results hold in the presence of data.
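As a concrete, if purely sequential, sketch of the rake and compress rounds described above, the following Python models one full contraction. The conventions are assumptions for illustration: vertices are integers, the single-edge rake tie is broken by vertex identity, and per-round coins are drawn independently from a seeded generator rather than from a hash family.

```python
import random

def tree_contract(n, edges, rng):
    """Contract a forest on vertices 0..n-1; return the number of rounds.
    All rake/compress decisions use degrees and coins from the round start."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    rounds = 0
    while any(adj[v] for v in adj):
        rounds += 1
        heads = {v: rng.random() < 0.5 for v in adj}   # this round's coin flips
        deg = {v: len(adj[v]) for v in adj}            # round-start degrees
        for v in list(adj):
            if deg[v] == 1 and adj[v]:                 # rake a leaf
                u = next(iter(adj[v]))
                if deg[u] > 1 or u > v:                # single edge: rake one side
                    adj[u].discard(v); del adj[v]
            elif deg[v] == 2 and len(adj[v]) == 2:     # compress
                u, w = adj[v]
                if (deg[u] > 1 and deg[w] > 1 and
                        heads[v] and not heads[u] and not heads[w]):
                    adj[u].discard(v); adj[w].discard(v); del adj[v]
                    adj[u].add(w); adj[w].add(u)       # splice the path
    return rounds
```

For example, a star contracts in one round (all leaves are raked at once), while a path of 64 vertices contracts in an expected logarithmic number of rounds.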
Tree Contract MR(Vs, Es)
  while Es ≠ ∅ do
    (Vt, Et) ← CreateForest(Vs, Es)
    for each v ∈ Vs do
      Contract(v, (Vs, Es), (Vt, Et))
    Vs ← Vt
    Es ← Et

Contract(v, (Vs, Es), (Vt, Et))
  if degree(v) = 1 then                        // rake
    let (u, v) ∈ Es
    if degree(u) > 1 or u > v then
      DeleteNode(v, (Vs, Es), (Vt, Et))
    else
      CopyNode(v, (Vs, Es), (Vt, Et))
  else if degree(v) = 2 then                   // compress
    let (u, v) ∈ Es ∧ (w, v) ∈ Es
    if degree(u) > 1 and degree(w) > 1 and flips(u, v, w) = (T, H, T) then
      DeleteNode(v, (Vs, Es), (Vt, Et))
      ut ← next(u)
      wt ← next(w)
      Et ← Et ∪ {(ut, wt)}
    else
      CopyNode(v, (Vs, Es), (Vt, Et))
  else
    CopyNode(v, (Vs, Es), (Vt, Et))

CreateForest((Vs, Es))
  (Vt, Et) ← (∅, ∅)
  for each v ∈ Vs do
    u ← new node
    Vt ← Vt ∪ {u}
    next(v) ← u

CopyNode(v, (Vs, Es), (Vt, Et))
  vt ← next(v)
  for each (u, v) ∈ Es do
    ut ← next(u)
    Et ← Et ∪ {(ut, vt)}

DeleteNode(v, (Vs, Es), (Vt, Et))
  vt ← next(v)
  Vt ← Vt \ {vt}
  for each (ut, vt) ∈ Et do
    Et ← Et \ {(ut, vt)}
Figure 10.1: Randomized tree-contraction algorithm of Miller and Reif.
Figure 10.1 shows the pseudo-code for a sequential version of the randomized tree-contraction algorithm of Miller and Reif. The algorithm takes a source forest (Vs, Es) and contracts the forest into a forest consisting of disconnected vertices in a number of rounds. In each round, the algorithm creates an empty target forest (Vt, Et) by copying each vertex in the source forest. The next function maps each vertex to its copy in the next round. Initially the target forest contains no edges. The algorithm then contracts each vertex by calling Contract. The Contract function processes a vertex v based on its degree. If v has degree one, then it may be raked; if it has degree two, then it may be compressed. When a vertex is not raked or compressed, it is copied to the target forest by calling the CopyNode function. This function inserts the edges between the vertex and all its neighbors into the target forest. During a rake or compress, a vertex is deleted by the DeleteNode function. This function deletes all the edges incident on the vertex from the target forest.

degree(v) = 1: The algorithm checks if the neighbor u of v has degree one. If not, then v is raked from the target forest. If u has degree one, then v is raked if it is greater than u (we assume an arbitrary total order on the vertices). If v is not raked, then it is copied to the next round.

degree(v) = 2: The algorithm checks the degrees of the neighbors, u and w, of v. If u and w both have degree two or more, and they both flip tails and v flips heads, then the algorithm compresses v by deleting it from the target forest and inserting an edge between u and w. If v is not compressed, then it is copied to the next round.

degree(v) > 2: In this case, v is copied to the next round.
Miller and Reif originally studied tree contraction in the context of parallel computation for directed trees. The code presented here is a sequential version of their algorithm extended to support undirected trees. To make sure that every location is written at most once, the pseudo-code considered here makes a copy of the forest at each round. All decisions on the rake and compress operations at some round are performed “in parallel” based on the forest at the beginning of that round. Miller and Reif showed that their algorithm for rooted trees requires an expected logarithmic number of rounds and linear work. The key theorem behind these bounds shows that a constant fraction of all vertices is deleted in each round with high probability [63, 62]. These theorems are easily extended to the slight generalization of the compress rule presented here by only modifying the constant factors.
10.2 Self-Adjusting Tree Contraction
This section describes an implementation of the tree-contraction algorithm in the closure model by specifying data structures for the pseudo-code presented in Figure 10.1. The main data structure is an adjacency-list representation for forests, where a forest is represented as a list of vertices. Each vertex maintains its neighbors, an array of pointers to the adjacent vertices; its degree; a pointer next to its copy in the next round; a deleted flag; a unique identity (an integer); and a round number. Each element of the neighbor array of v consists of a pointer to a neighbor u and an integer, called backIndex, equal to the index (position) of the neighbor pointer from u to v. Figure 10.2 shows the tree-contraction algorithm in the closure model. The algorithm specializes the algorithm given in Figure 10.1 to the adjacency-list representation. The algorithm repeatedly applies the
function Tree Contract(cs) =
  if HasEdges(cs, false) then
    ct ← CreateForest(cs)
    ContractAll(cs)
    Tree Contract(ct)
  else
    return cs

function ContractAll(c) =
  case c of
    nil => return
  | cons(v, t) =>
      Contract(v)
      ContractAll(∗t)

function Contract(v) =
  if degree(v) = 1 then                       // rake
    [(u, _)] ← GetNeighbors(v)
    if degree(u) > 1 or id(u) > id(v) then
      deleted(next(v)) ← true
    else
      CopyVertex(v, next(v), 0)
  else if degree(v) = 2 then                  // compress
    [(u, i_uv), (w, i_wv)] ← GetNeighbors(v)
    if degree(u) > 1 and degree(w) > 1 and flips(u, v, w) = (T, H, T) then
      deleted(next(v)) ← true
      neighbors(u)[i_uv] ← (w, i_wv)
      neighbors(w)[i_wv] ← (u, i_uv)
    else
      CopyVertex(v, next(v), 0)
  else
    CopyVertex(v, next(v), 0)

function HasEdges(c, flag) =
  case c of
    nil => return flag
  | cons(v, t) =>
      if degree(v) > 0 then HasEdges(∗t, true)
      else HasEdges(∗t, flag)

function CreateForest(c) =
  case c of
    nil => return nil
  | cons(v, t) =>
      cc ← CreateForest(∗t)
      if deleted(v) then return cc
      else
        l ← alloc[id(v), round(v)](1)
        ∗l ← cc
        u ← EmptyVertex(id(v))
        round(u) ← round(v) + 1
        next(v) ← u
        return cons(u, l)

function CopyVertex(v, vt, i) =
  if (i < max degree) then
    CopyVertex(v, vt, i + 1)
    if neighbor(v)[i] ≠ null then
      (pu, i_uv) ← neighbor(v)[i]
      ut ← next(∗pu)
      neighbor(ut)[i_uv] ← (vt, i)
Figure 10.2: Randomized tree-contraction in the closure model.
Figure 10.3: An example trace of tree contraction at some round.

function Contract to each vertex in rounds until the forest reduces to a disconnected set of vertices. The function HasEdges determines if there are any edges in the forest. Each round creates an empty target forest of disconnected vertices and applies Contract to each vertex in the source. The Contract function first checks the degree of the vertex. If the vertex can be raked or compressed, then the deleted flag of its copy is set to true. If the vertex is to be compressed, then an edge is inserted between the neighbors of the compressed vertex. If the vertex is not raked or compressed, then it is copied to the target forest by calling the CopyVertex function. The CopyVertex function copies a vertex v by walking over the neighbors of v and making the copies of each neighbor point to the copy of v. For brevity, the code does not show how the degree is computed. The degree is computed by the CopyVertex function while copying the neighbors. For each neighbor, the function checks if it will be raked in the next round (this is determined by checking the degree of the vertex and its id) and subtracts one from the current degree for each raked neighbor. To ensure that each location is written at most once, the code relies on a liveness flag. When a vertex is raked or compressed, its copy in the target forest is marked deleted by setting its deleted flag to true, instead of removing the vertex from the target. Removing the vertex from the target would require writing the tail element of the preceding vertex a second time. Deleted vertices are removed from the forest by not copying them when creating the target forest. To generate random coin flips, we use a family H of 3-wise independent hash functions mapping {1, . . . , n} to {0, 1}. We randomly select one member of H per round. Since there are families H with |H| polynomial in n [97], we need O(log^2 n) random bits in expectation.
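One standard way to instantiate such a family H is with degree-two polynomials over a prime field Z_p with p > n, which are 3-wise independent; reducing mod 2 yields nearly unbiased bits. The sketch below is an illustration under these assumptions, not the implementation used in the thesis (the residual bias of about 1/p is ignored here; exact constructions exist [97]):

```python
import random

def sample_flip_function(p, rng):
    """Draw h(x) = ((a*x^2 + b*x + c) mod p) mod 2 with random a, b, c.
    Degree-2 polynomials over the field Z_p form a 3-wise independent
    family, so one draw per round costs only 3 log p = O(log n) bits."""
    a, b, c = (rng.randrange(p) for _ in range(3))
    return lambda x: ((a * x * x + b * x + c) % p) % 2

# One hash function per round; vertex v's coin at that round is flip(v).
flip = sample_flip_function(101, random.Random(7))   # prime p = 101 > n
coins = [flip(v) for v in range(100)]
```

Because the function is fixed once per round, vertex v always flips the same coin when the round is re-executed, which is exactly the property the stability analysis below relies on.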
As discussed in Section 8.1, we analyze trace stability assuming a fixed selection of random bits and take expectations over all selections of bits. For fixed random bits, vertex i will generate the same coin flip in round j for any input. This ensures that the trace remains stable across different executions.
10.3 Trace Stability
This section shows that the tree-contraction algorithm is expected O(log n) stable for single edge insertions/deletions. We partition the trace into rounds. Each round consists of a sequence of calls to the hasEdges function followed by a possibly empty sequence of calls to the createForest and contractAll functions. Each call to contractAll performs a single call to contract. Each call to contract performs at most one call to getNeighbors and at most one call to copyVertex. Figure 10.3 shows a hypothetical trace of tree contraction at some round where the forest contains ten vertices. In the figure, the vertices representing the calls to copyVertex stand for a constant number of recursive calls to this function (the number of calls is determined by the maximum degree that each vertex may have). The tree-contraction algorithm allocates all memory locations using unique labels based on the identity of the vertex being considered and its round number. Since each function call takes as argument the vertex or a list cell that contains a vertex, and since each function is called on a vertex at most once in each round, the algorithm is concise. To see that the algorithm is monotone, note that the list of vertices input to some round is a sublist of the vertices in the previous round. Furthermore, a single insertion or deletion of an edge does not exchange the order of vertices in the original input lists. It is straightforward to show that the algorithm is monotone for single insertions and deletions based on these two facts.
Theorem 58 The tree-contraction algorithm is monotone for the class of single edge insertions/deletions.
Consider a forest F = (V, E) with n vertices and execute tree contraction on F. We identify vertices by their identities; two vertices are considered the same if they have the same identity. We denote the contracted forest at round i as F^i = (V^i, E^i). Throughout this section, the term "at round i" means "at the beginning of round i". We say that a vertex v is live at round i if v ∈ V^i. A vertex is live at round i if it has not been deleted (compressed or raked) in some previous round. We define the configuration of a vertex v at round i ≥ 1 as

    κ^i_F(v) = {(u, σ(u)) | (v, u) ∈ E^i}   if v ∈ V^i
    κ^i_F(v) = deleted                       if v ∉ V^i
Here σ(u) is the singleton status of vertex u. The singleton status of a vertex at some round is true if the vertex has degree one at that round and false otherwise. The motivation behind defining the configuration of a vertex is to measure the distance between two traces by comparing the configurations of vertices. The configuration of a vertex v is defined to be a superset of the tags of all function calls where v is passed as an argument or is stored in the first cell of the adjacency list. Since tree contraction is monotone and every function call in the trace has constant weight (takes constant time to execute), to measure the trace distance it suffices to count the number of vertices whose configurations differ in the two contractions.
Consider some forest G = (V, E \ {(u, v)}) obtained from F by deleting the edge (u, v), and let X_F and X_G be the contractions with F and G respectively. By a contraction, we mean an execution of the tree-contraction algorithm on some forest. Let T_F and T_G be traces for the contractions X_F and X_G. Define the set of vertices V of F and G as V = {v_1, . . . , v_n}. The distance between T_F and T_G is within a constant factor of the difference between X_F and X_G, that is,

    δ(T_F, T_G) = O( Σ_{i=1}^{k log n} Σ_{j=1}^{n} neq(κ^i_F(v_j), κ^i_G(v_j)) ),

where neq(·, ·) is one if the configurations are not equal and zero otherwise. To bound the trace distance, it therefore suffices to count the places where the two configurations do not match in X_F and X_G. We say that a vertex v becomes affected in round i if i is the earliest round for which κ^{i+1}_F(v) ≠ κ^{i+1}_G(v). Once a vertex becomes affected, it remains affected in any subsequent round. Only input changes can make a vertex affected at round one. We say that a vertex v is a frontier at round i if v is affected and is adjacent to an unaffected vertex at round i. We denote the set of affected vertices at round i as A^i; note that A^i includes all affected vertices, live or deleted. For any A ⊆ A^i, we denote the forests induced by A on F^i and G^i as F^i_A and G^i_A respectively. Since all deleted vertices have the same configuration,

    δ(T_F, T_G) = O( Σ_{i=1}^{k log n} (|F^i_{A^i}| + |G^i_{A^i}|) ).

An affected component at round i, AC^i, is defined as a maximal set satisfying (1) AC^i ⊆ A^i, and (2) F^i_{AC^i} and G^i_{AC^i} are trees. To prove that tree contraction is expected O(log n) stable, we will show that the expected sizes of F^i_{A^i} and G^i_{A^i} are constant at any round i. The first lemma establishes two useful properties that we use throughout the analysis.
Lemma 59 (1) A frontier is live and is adjacent to the same set of unaffected vertices at any round i in both X_F and X_G. (2) If a vertex becomes affected in round i, then it is either adjacent to a frontier or it has a neighbor that is adjacent to a frontier at that round.
Proof: The first property follows from the facts that (A) a frontier is adjacent to an unaffected vertex at that round, and (B) unaffected vertices have the same configuration in both X_F and X_G. For the second property, consider some vertex v that is unaffected at round i. If v's neighbors are all unaffected, then v's neighbors at round i + 1 will be the same in both X_F and X_G. Thus, if v becomes affected in some round, then either it is adjacent to an affected vertex u, or v has a neighbor u whose singleton status differs in X_F and X_G at round i + 1, which can happen only if u is adjacent to an affected vertex.
We now partition A^i into two affected components AC^i_u and AC^i_v, A^i = AC^i_u ∪ AC^i_v, such that G^i_{A^i} = G^i_{AC^i_u} ∪ G^i_{AC^i_v} and F^i_{A^i} = F^i_{AC^i_u ∪ AC^i_v}. The affected components correspond to the end-points of the deleted edge (u, v). Note that G^i_{AC^i_u} and G^i_{AC^i_v} are disconnected.
Lemma 60 At round one, each of AC^1_u and AC^1_v contains at most two vertices and at most one frontier.
Proof: Deletion of (u, v) makes u and v affected. If u does not become a leaf, then its neighbors remain unaffected. If u becomes a leaf, then its neighbor u′ (if it exists) becomes affected. Since u and v are not connected in G, the set {u}, or {u, u′} if u′ is also affected, is an affected component; this set will be AC^1_u. Either u, or u′ if it exists, may be a frontier. The set AC^1_v is built by a similar argument.
Lemma 61 Assume that AC^i_u and AC^i_v are the only affected components in round i. Then there are exactly two affected components in round i + 1, AC^{i+1}_u and AC^{i+1}_v.
Proof: By Lemma 59, we consider vertices that become affected due to each frontier. Let U and V be the sets of vertices that become affected due to a frontier in AC^i_u and AC^i_v respectively, and define AC^{i+1}_u = AC^i_u ∪ U and AC^{i+1}_v = AC^i_v ∪ V. This accounts for all affected vertices: A^{i+1} = A^i ∪ U ∪ V. By Lemma 59, the vertices of U and V are connected to some frontier by a path. Since tree contraction preserves
connectivity, F^{i+1}_{AC^{i+1}_u}, F^{i+1}_{AC^{i+1}_v} and G^{i+1}_{AC^{i+1}_u}, G^{i+1}_{AC^{i+1}_v} are trees, and G^{i+1}_{AC^{i+1}_u} and G^{i+1}_{AC^{i+1}_v} remain disconnected.
By induction on i, using Lemmas 60 and 61, the number of affected components is exactly two at every round. We now show that each affected component has at most two frontiers. The argument applies to both components; we therefore consider some affected component AC.
Lemma 62 Suppose there is exactly one frontier in AC^i. Then at most two vertices become affected in round i due to the contraction of vertices in AC^i, and there are at most two frontiers in AC^{i+1}.
Proof: Let x be the sole frontier at round i. Since x is adjacent to an unaffected vertex, it has degree at least one. Furthermore, x cannot have degree one, because otherwise its leaf status would differ in the two contractions, making its neighbor affected. Thus x has degree two or greater.
Suppose that x is compressed in round i in XF or XG. Then x has at most two unaffected neighbors y and z, which may become affected. Since x is compressed, y will have degree at least one after the compress, and thus no (unaffected) neighbor of y will become affected. The same argument holds for z. Now suppose that x is not deleted in either XF or XG. If x does not become a leaf, then no vertices become affected. If x becomes a leaf, then it has at most one unaffected neighbor, which may become affected. Thus, at most two vertices become affected or become frontiers.
Lemma 63 Suppose there are exactly two frontiers in AC^i. Then at most two vertices become affected in round i due to the contraction of vertices in AC^i, and there are at most two frontiers in AC^{i+1}.
Proof: Let x and y be the two frontiers. Since x and y are in the same affected component, they are connected by a path of affected vertices, and each is also adjacent to an unaffected vertex. Thus each has degree at least two in both contractions at round i. Assume that x is compressed in either contraction. Then x has at most one unaffected neighbor w, which may become affected. Since w has degree at least one after the compress, no (unaffected) neighbor of w will become affected. If x is not deleted in either contraction, then no vertices will become affected, because x cannot become a leaf—a path to y remains, because y cannot be raked. Therefore, at most one vertex will become affected, and it will possibly become a frontier. The same argument applies to y.
Lemma 64 The total number of affected vertices in F^i and G^i is O(1) in expectation.
Proof: Consider applying tree contraction on F and note that F^i_{A^i} will contract by a constant factor when we disregard the frontier vertices, by an argument similar to that of Miller and Reif [62], based on the independence of the randomness between different rounds. The number of affected components is exactly two, and each component has at most two frontiers and causes at most two vertices to become affected (by Lemmas 60, 62, and 63). Thus, there are at most four frontiers, and at most four new vertices may become affected in round i; together, the disregarded frontiers and the newly affected vertices account for an additive term of eight. Thus, for some constant 0 < c < 1, we have

E[ |F^{i+1}_{A^{i+1}}| given F^i_{A^i} ] ≤ (1 − c) |F^i_{A^i}| + 8, and hence E[|F^{i+1}_{A^{i+1}}|] ≤ (1 − c) E[|F^i_{A^i}|] + 8.
Since the number of affected vertices in the first round is at most four, E[|F^i_{A^i}|] = O(1) for any i. A similar argument holds for G^i.
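The recurrence at the heart of this proof is easy to check numerically. The following sketch (the contraction constant c = 0.25 is a made-up value for illustration) iterates s ← (1 − c)s + 8 and confirms convergence to the fixed point 8/c, so the per-round expected number of affected vertices is indeed O(1):

```python
# Numeric sanity check (with a hypothetical contraction constant c) that
# the recurrence s_{i+1} <= (1 - c) * s_i + 8 converges to the fixed
# point 8 / c, independent of the round number.
def iterate(s0, c, rounds):
    s = s0
    for _ in range(rounds):
        s = (1 - c) * s + 8
    return s

c = 0.25                                  # hypothetical constant
assert abs(iterate(4.0, c, 200) - 8 / c) < 1e-6
```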
For our main result, we rely on the following theorem. This theorem is a simple extension of Theorem 3.5 from Miller and Reif’s original work [63].
Theorem 65 There exist constants a, b, and n0 such that for all n ≥ n0, randomized tree contraction reduces a forest with n vertices to a single vertex in a log n + b rounds with probability of failure at most 1/n.
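As a rough illustration of the a log n + b bound, the toy simulation below contracts a path by randomized splicing. This is not the thesis's rake-and-compress algorithm, only a simpler process with the same coin-flip flavor: each round, a node leaves when it flips heads and its successor flips tails, so no two adjacent nodes leave in the same round.

```python
import math
import random

def contract_rounds(n, seed=1):
    """Contract a path of n nodes; return the number of rounds used."""
    rng = random.Random(seed)
    nodes = list(range(n))
    rounds = 0
    while len(nodes) > 1:
        heads = [rng.random() < 0.5 for _ in nodes]
        # A node is spliced out when it flips heads and its successor
        # flips tails, so no two adjacent nodes leave together.
        nodes = [v for i, v in enumerate(nodes)
                 if not (heads[i] and i + 1 < len(nodes) and not heads[i + 1])]
        rounds += 1
    return rounds

# At most half the nodes leave per round (lower bound log2 n), and in
# expectation about a quarter leave, so the round count is Theta(log n).
assert math.log2(1024) <= contract_rounds(1024) <= 100
```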
Theorem 66 Tree contraction is expected O(log n) stable for a single edge insertion or deletion.
Proof: By Theorem 65, we know that tree contraction takes a log n + b rounds for sufficiently large n with probability of failure no more than 1/n. Let D be the random variable denoting the number of affected vertices in a contraction and let R be the random variable denoting the number of rounds. We will bound the expectation of D by conditioning it on the number of rounds. In particular, we know that
E[D] = Σ_{r=0}^{∞} E[D | R = r] × Pr[R = r].
Note now that E[D | R = r] = O(r), because, by Lemma 64, the expected number of live affected vertices per round is constant, independent of the number of rounds. We therefore have
E[D] = O( Σ_{r=0}^{∞} r × Pr[R = r] ).

Let L = a log n + b; we have
Σ_{r=0}^{∞} r × Pr[R = r] = Σ_{i=0}^{∞} Σ_{r=Li}^{L(i+1)−1} r × Pr[R = r] ≤ Σ_{i=0}^{∞} L(i+1) Pr[R ≥ Li] ≤ Σ_{i=0}^{∞} L(i+1) (1/n)^i ≤ L × Σ_{i=0}^{∞} (L/n)^i,

where Pr[R ≥ Li] ≤ (1/n)^i follows by i independent applications of Theorem 65.
Since Σ_{i=0}^{∞} (L/n)^i = O(1), we have E[D] = O(L). We therefore conclude that the expected number of affected vertices is O(log n).
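The final step can also be checked numerically: with L = a log n + b for hypothetical constants a = 2 and b = 1, the geometric series Σ (L/n)^i is within a whisker of 1 once n is large, so E[D] = O(L) = O(log n).

```python
import math

# Numeric check that sum_i (L/n)^i = O(1) when L = a*log n + b << n.
# The constants a and b here are made-up values for illustration.
def tail_sum(n, a=2.0, b=1.0, terms=200):
    L = a * math.log(n) + b
    return sum((L / n) ** i for i in range(terms))

# For n = 10^6, L/n is about 3e-5, so the series is essentially 1.
assert 1.0 <= tail_sum(10**6) < 1.001
```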
It is a property of the tree-contraction algorithm that the order in which the vertices of the forest are contracted within each round does not affect the result. Therefore a FIFO queue can be used during change propagation instead of a general-purpose priority queue. We therefore have the following theorem for the dynamic trees problem of Sleator and Tarjan [91].
Theorem 67 (Dynamic Trees) The self-adjusting tree-contraction algorithm adjusts to a single edge insertion or deletion in expected O(log n) time. Self-adjusting tree contraction thus solves the dynamic trees problem in expected O(log n) time.

Part IV
Language Techniques
Introduction
This part of the thesis presents language facilities for writing self-adjusting programs. We consider three purely functional languages, called the Adaptive Functional Language (AFL), the Memoizing Functional Language (MFL), and the Self-Adjusting Functional Language (SLf, read “self”), as well as an imperative language where memory locations can be written multiple times. The AFL (Chapter 11) and MFL (Chapter 12) languages are based on dynamic dependence graphs and memoization, respectively. The SLf language combines and extends AFL and MFL to provide support for non-strict dependences. The imperative language, called SLi, extends the SLf language with support for multiple writes (side effects).

Chapter 11 introduces the AFL language. AFL enables the programmer to transform an ordinary, non-self-adjusting program into a self-adjusting program by making simple, methodical changes to the code. As an AFL program runs, a run-time system builds the dynamic dependence graph (DDG) of the execution. The DDG is then used to adjust the computation to external changes. A key property of the AFL language is that it supports selective dependence tracking. The programmer decides what parts of the input will be “changeable”, and the language ensures that all dependences pertaining to the changeable data are tracked. The dependences pertaining to the “stable” parts of the input, i.e., the parts that cannot change, are not tracked. We present a static and a dynamic semantics for the AFL language and prove that the language is type safe. We also formalize the change-propagation algorithm for dynamic dependence graphs and prove that it is correct. The correctness theorem shows that change propagation is identical to a from-scratch execution, except, of course, that it is faster. We also describe an implementation of the language as an ML library and give the code for this library.
The implementation is based on the data structures described in Chapter 6.

Chapter 12 describes a language for selective memoization. Selective memoization enables the programmer to apply memoization efficiently by providing control over (1) the cost of equality tests, (2) the input-output dependences, and (3) space consumption. The most important aspect of selective memoization is its mechanism for supporting memoization with precise input-output dependences. We study selective memoization in the context of a purely functional language, called MFL, and present a static and a dynamic semantics for the language. Based on the semantics, we prove the correctness of the selective memoization techniques. We also describe an efficient implementation of the MFL language as an SML library and present the code for the implementation.

Chapter 13 describes a language, called SLf, based on memoized dynamic dependence graphs and the memoized change-propagation algorithm. The SLf language combines the AFL and the MFL languages and extends them with constructs to distinguish between strict and non-strict dependences. The language enables the programmer to transform an ordinary, non-self-adjusting program into a self-adjusting program
by making small, methodical changes to the code. The key properties of the SLf language are that (1) it yields efficient self-adjusting programs, and (2) it enables determining the time complexity of change propagation by using trace-stability techniques. In Part V we present an experimental evaluation of the SLf language by considering a number of applications.

Chapter 14 presents an imperative language for self-adjusting computation where memory locations can be written multiple times. The key idea is to make computations persistent by keeping track of all writes to memory by versioning [31]. This chapter shows that our techniques are also applicable in the imperative setting.

Chapter 11
Adaptive Functional Programming
This chapter describes language techniques for writing self-adjusting programs by applying dynamic dependence graphs selectively. Since the techniques described in this chapter rely only on ordinary (non-memoized) dynamic dependence graphs, we use the term adaptive when referring to programs based on these techniques. We reserve the term self-adjusting for programs based on memoized dependence graphs.

We describe a language, called AFL (Adaptive Functional Language), where every program is adaptive. We give the static semantics (type system) and the dynamic semantics of the AFL language and prove that the language is safe. Based on the static and dynamic semantics of AFL, we formalize the change-propagation algorithm for dynamic dependence graphs and prove that it is correct. We also present the complete code for an implementation of the language as an ML library.

Using the ML library, Section 11.2 illustrates the main ideas of adaptive functional programming. We describe first how to transform an ordinary quick sort written in ML into an adaptive quick sort by applying a methodical transformation. We then describe the particular version of the dynamic dependence graphs and the change-propagation algorithm used by the library. We finish by presenting the code for our library.

Section 11.3 defines an adaptive functional programming language, called AFL, as an extension of a simple call-by-value functional language with adaptivity primitives. The static semantics of AFL shows that AFL is consistent with purely functional programming. Since the language involves a particular form of references, yet is purely functional, the type safety proof for the language is intricate. The dynamic semantics of AFL is given by an evaluation relation that maintains a record of the adaptive aspects of the computation, called a trace. The trace is used by the change-propagation algorithm to adapt the computation to external changes.
Section 11.5 gives a semantics to the change propagation algorithm based on the dynamic semantics of AFL. The change propagation algorithm interprets a trace to determine the expressions that need to be re-evaluated and to determine the order in which the re-evaluation takes place. We prove that the change- propagation algorithm is correct by showing that it yields essentially the same result as a complete re- execution on the changed inputs.
11.1 Introduction
The proposed mechanism extends call-by-value functional languages with a small set of primitives to support adaptive functional programming. Apart from requiring that the host language be purely functional, we make no other restriction on its expressive power. In particular, our mechanism is compatible with the full range of effect-free constructs found in ML. The proposed mechanism has these strengths:
• Generality: It applies to any purely functional program. The programmer can build adaptivity into an application in a natural and modular way.
• Flexibility: It enables the programmer to control the amount of adaptivity. For example, a programmer can choose to make only one portion or aspect of a system adaptive, leaving the others to be implemented conventionally.
• Simplicity: It requires small changes to existing code. For example, the adaptive version of Quicksort presented in the next section requires only minor changes to the standard implementation.
• Efficiency: The mechanism admits a simple implementation with constant-time overhead. The computational complexity of an adaptive program can be determined using standard analytical techniques.
The adaptivity mechanism is based on the idea of a modifiable reference (or modifiable, for short) and three operations for creating (mod), reading (read), and writing (write) modifiables. A modifiable allows the system to record the dependence of one computation on the value of another. A modifiable reference is essentially a write-once reference cell that records the value of an expression whose value may change as a (direct or indirect) result of changes to the inputs. Any expression whose value can change must store its value in a modifiable reference; such an expression is said to be changeable. Expressions that are not changeable are said to be stable; stable expressions are not associated with modifiables.

Any expression that depends on the value of a changeable expression must express this dependence by explicitly reading the contents of the modifiable storing the value of that changeable expression. This establishes a data dependence between the expression reading that modifiable, called the reader, and the expression that determines the value of that modifiable, the writer. Since the value of the modifiable may change as a result of changes to the input, the reader must itself be deemed a changeable expression. This means that a reader cannot be considered stable, but may only appear as part of a changeable expression whose value is stored in some other modifiable.

By choosing the extent to which modifiables are used in a program, the programmer can control the extent to which the program is able to adapt to change. For example, a programmer may wish to make a list-manipulation program adaptive to insertions into and deletions from the list, but not to changes to the individual elements of the list. This can be represented in our framework by making only the “tail” elements of a list adaptive, leaving the “head” elements stable.
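To make the modifiable-list idea concrete, here is a hypothetical Python rendering (the thesis's real data type is the SML list of Figure 11.2): heads are stable values, while each tail sits in a modifiable cell, so the list can change structurally with a single write, without touching its elements.

```python
# Hypothetical sketch of a modifiable list: heads are stable, tails are
# modifiable, so insertions/deletions are writes to tail cells.
class ModCell:
    def __init__(self, value):
        self.value = value            # NIL (None) or a Cons cell

class Cons:
    def __init__(self, head, tail):
        self.head = head              # stable element
        self.tail = tail              # a ModCell: the adaptive part

NIL = None

def from_list(xs):
    """Build a modifiable list; return the list and its last cell."""
    last = ModCell(NIL)
    cell = last
    for x in reversed(xs):
        cell = ModCell(Cons(x, cell))
    return cell, last

def to_plain(cell):
    out = []
    while cell.value is not NIL:
        out.append(cell.value.head)
        cell = cell.value.tail
    return out

lst, last = from_list([2, 3])
last.value = Cons(7, ModCell(NIL))    # insert at the end: one write
assert to_plain(lst) == [2, 3, 7]
```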
However, once certain aspects are made changeable, all parts of the program that depend on those aspects are, by implication, also changeable.

The key to adapting the output to a change of the input is to record the dependences between readers and writers that arise during the initial evaluation. These dependences are maintained by using a version of the dynamic-dependence-graph (DDG) data structure presented in Chapter 4. The key difference between this version and the standard DDGs is that this version only tracks modifiable references and reads, instead of
signature ADAPTIVE = sig
  type 'a mod
  type 'a dest
  type changeable
  val mod:   ('a * 'a -> bool) -> ('a dest -> changeable) -> 'a mod
  val read:  'a mod * ('a -> changeable) -> changeable
  val write: 'a dest * 'a -> changeable
  val init:      unit -> unit
  val change:    'a mod * 'a -> unit
  val propagate: unit -> unit
end
Figure 11.1: Signature of the adaptive library.

all memory locations and all function calls. This is how AFL enables applying dynamic-dependence-graph-based dependence tracking selectively.
11.2 A Framework for Adaptive Computing
We give an overview of our techniques based on our ML library and an adaptive version of Quicksort.
11.2.1 The ML library
The signature of our adaptive library for ML is given in Figure 11.1. The library provides functions to create (mod), to read from (read), and to write to (write) modifiables, as well as meta-functions to initialize the library (init), change input values (change), and propagate changes to the output (propagate). The meta-functions are described later in this section.

The library distinguishes between two “handles” to each modifiable: a source of type 'a mod for reading from, and a destination of type 'a dest for writing to. When a modifiable is created, correct usage of the library requires that it only be accessed as a destination until it is written, and then only be accessed as a source.¹ All changeable expressions have type changeable, and are used in a “destination passing” style: they do not return a value, but rather take a destination to which they write a value. Correct usage requires that a changeable expression end with a write (we define “ends with” more precisely when we discuss time stamps). The destination written will be referred to as the target destination. The type changeable has no interpretable value.

The mod function takes two parameters, a conservative comparison function and an initializer. A conservative comparison function returns false when the values are different, but may return true or false when the values are the same. This function is used by the change-propagation algorithm to avoid unnecessary propagation. The mod function creates a modifiable and applies the initializer to the new modifiable’s destination. The initializer is responsible for writing the modifiable. Its body is therefore a changeable expression, and correct usage requires that the body’s target match the initializer’s argument. When the initializer completes, mod returns the source handle of the modifiable it created.

The read function takes the source of a modifiable and a reader, a function whose body is changeable. The read accesses the contents of the modifiable and applies the reader to it. Any application of read is itself a changeable expression, since the value being read could change. If a call Ra to read is within the dynamic scope of another call Rb to read, we say that Ra is contained within Rb. This relation defines a hierarchy on the reads, which we will refer to as the containment hierarchy (of reads).

¹The library does not enforce this restriction statically, but can enforce it with run-time checks. In the following discussion we will use the term “correct usage” to describe similar restrictions for which run-time checks are needed to check correctness. The language described in Section 11.3 enforces all these restrictions statically using a modal type system.
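The interface just described can be sketched in Python for intuition. This is a hypothetical rendering only: the real library is the SML signature of Figure 11.1, and the sketch omits time stamps, the source/destination distinction, and the deletion of obsolete reads, re-running recorded readers naively on change.

```python
# Hypothetical Python sketch of mod/read/write/change from Figure 11.1.
class Modifiable:
    def __init__(self, cmp):
        self.cmp = cmp                # conservative comparison function
        self.value = None
        self.readers = []             # readers depending on this value

def mod(cmp, initializer):
    m = Modifiable(cmp)
    initializer(m)                    # the initializer must write to m
    return m

def read(src, reader):
    src.readers.append(reader)        # record the dependence edge
    reader(src.value)

def write(dest, v):
    dest.value = v

def change(m, v):
    if not m.cmp(m.value, v):         # cut off if conservatively equal
        m.value = v
        for r in list(m.readers):     # naive propagation: re-run readers
            r(v)

eq = lambda a, b: a == b
n = mod(eq, lambda d: write(d, 3))
out = mod(eq, lambda d: read(n, lambda v: write(d, v + 1)))
assert out.value == 4
change(n, 10)                         # the reader re-runs automatically
assert out.value == 11
```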
11.2.2 Making an Application Adaptive
The transformation of a non-adaptive program into an adaptive program involves two steps. First, the input data structures are made “modifiable” by placing the desired elements in modifiables. Second, the original program is updated by making the reads of modifiables explicit and by placing the result of each expression that depends on a modifiable into another modifiable. This means that all values that directly or indirectly depend on modifiable inputs are placed in modifiables. The changes to the program are therefore determined by which parts of the input data structure are made modifiable.

As an example, consider the code for a standard Quicksort, qsort, and an adaptive Quicksort, qsort’, as shown in Figure 11.2. To avoid linear-time concatenations, qsort uses an accumulator to store the sorted tail of the input list. The transformation is done in two steps. First, we make the lists “modifiable” by placing the tail of each list element into a modifiable, as shown in lines 1–3 in Figure 11.2. (If desired, each element of the list could have been made modifiable as well; this would allow changing an element without changing the list structurally.) The resulting structure, a modifiable list, allows the user to insert and delete items to and from the list. Second, we change the program so that the values placed in modifiables are accessed explicitly via a read. The adaptive Quicksort uses a read (line 22) to determine whether the input list l is empty and writes the result to a destination d (line 24). This destination belongs to the modifiable that is created by a call to mod (through modl) in line 28 or 33. These modifiables form the output list, which is now a modifiable list. The function fil is similarly transformed into an adaptive one, fil’ (lines 7–19). The modl function takes an initializer and passes it to the mod function with a constant-time, conservative comparison function for lists.
The comparison function returns true if and only if both lists are NIL, and returns false otherwise. This comparison function is sufficiently powerful to prove the O(log n) bound for adaptive Quicksort.
11.2.3 Adaptivity
An adaptive program allows the programmer to change the input to the program and update the result by running change propagation. This process can be repeated as desired. The library provides the meta-function change to change the value of a modifiable and the meta-function propagate to propagate these changes to the output.

Figure 11.3 illustrates an example. The function fromList converts a list to a modifiable list, returning both the modifiable list and its last element. The test function first performs an initial evaluation of the adaptive Quicksort by converting the input list lst to a modifiable list l and sorting it into r. It then changes the input by adding a new key v to the end of l. To update the output r, test calls propagate.
Non-adaptive version (left column of Figure 11.2):

 1  datatype 'a list =
 2      nil
 3    | cons of ('a * 'a list)

 7  fun fil test l =
 8    let
 9      fun f(l) =
10        case l of
11          nil => nil
12        | cons(h,t) =>
13            if test(h) then
14              cons(h, f t)
15            else
16              f(t)
17    in
18      f(l)
19    end

20  fun qsort(l) =
21    let
22      fun qs(l,r) =
23        case l of
24          nil => r
25        | cons(h,t) =>
26            let
27              val l = fil (fn x => x

Adaptive version (right column of Figure 11.2):

 1  datatype 'a list' =
 2      NIL
 3    | CONS of ('a * 'a list' mod)

 4  fun modl f =
 5    mod (fn (NIL,NIL) => true
 6           | _ => false) f

 7  fun fil' test l =
 8    let
 9      fun f(l,d) = read(l, fn l' =>
10        case l' of
11          NIL => write(d, NIL)
12        | CONS(h,t) =>
13            if test(h) then write(d,
14              CONS(h, modl(fn d => f(t,d))))
15            else
16              f(t,d))
17    in
18      modl(fn d => f(l, d))
19    end

20  fun qsort'(l) =
21    let
22      fun qs(l,r,d) = read(l, fn l' =>
23        case l' of
24          NIL => write(d, r)
25        | CONS(h,t) =>
26            let
Figure 11.2: The complete code for non-adaptive (left) and adaptive (right) Quicksort.
The update results in a list identical to the one that would have been returned if v had been added to the end of l before the call to qsort. In general, any number of inputs can be changed before running propagate.
1 fun new(v) = modl(fn d => write(d,v))
 2  fun fromList(l) =
 3    case l of
 4      nil => let val m = new(NIL) in (m,m) end
 5    | h::t => let val (l,last) = fromList(t) in
 6        (new(CONS(h,l)), last)
 7      end
 8  fun test(lst,v) =
 9    let val _ = init()
10        val (l,last) = fromList(lst)
11        val r = qsort'(l)
12    in
13      (change(last, CONS(v, new(NIL)));
14       propagate();
15       r)
16    end
Figure 11.3: Example of changing input and change propagation for Quicksort.
11.2.4 Dynamic Dependence Graphs
The crucial issue is to support change propagation efficiently. To do this, an adaptive program, as it evaluates, creates a record of the adaptive activity in the form of a dynamic dependence graph. Dynamic dependence graphs are defined and formalized in Chapter 4. In this chapter, we use a version of DDGs that is similar to the data structure described in Chapter 4. The key difference between this version and the standard DDGs is that this version only tracks modifiable references and reads, instead of all memory locations and all function calls. The correspondence between modifiable references and memory locations, and between reads and function calls, is not ad hoc. This correspondence shows how AFL enables applying dependence tracking selectively. For the rest of this chapter, the terms “dynamic dependence graph” and DDG will refer to this particular version unless otherwise stated.

In a dynamic dependence graph, each node represents a modifiable and each edge represents a read. An evaluation of mod adds a node, and an evaluation of read adds an edge to the graph. In a read, the node being read becomes the source, and the target of the read (the modifiable that the reader finished by writing to) becomes the target. We also tag each edge with the reader function, which is a closure (i.e., a function and an environment consisting of values for its free variables). We say that a node (and the corresponding modifiable) is an input if it has no incoming edges.

When the input to an adaptive program changes, a change-propagation algorithm updates the output and the dynamic dependence graph by propagating changes through the graph and re-executing the reads affected by the change. When re-evaluated on the changed source, a read can, due to conditionals, create completely different reads than it previously did. It is therefore critical that the reads created by the previous execution of a read are deleted from the graph.
This is done by maintaining a containment hierarchy between reads. A read e is contained within another read e′ if e was created during the execution of e′. During change propagation, the reads contained in a re-evaluated read are removed from the graph.
[Figure 11.4 shows a DDG. The input is the modifiable list 2::3::nil, stored in nodes l0, l1, l2; the output is the modifiable list 3::nil, stored in nodes l3, l4. The read edges carry the time stamps ⟨0.1⟩, ⟨0.2⟩, and ⟨0.3⟩; the output nodes l3 and l4 carry the time stamps ⟨0.5⟩ and ⟨0.4⟩. The legend identifies modifiables li, CONS cells, values, reads, and time stamps ⟨·⟩.]

Figure 11.4: The DDG for an application of fil’ to the modifiable list 2::3::nil.
The containment hierarchy is represented using time stamps generated by a virtual clock (or order-maintenance) data structure (Section 6.1.1). Each edge and node in the dynamic dependence graph is tagged with a time stamp corresponding to its execution “time” in the sequential execution order. Time stamps are generated by the mod and read expressions. The time stamp of an edge is generated by the corresponding read, before the reader is evaluated, and the time stamp of a node is generated by the mod after the initializer is evaluated (the time corresponds to the time that the node is initialized). Correct usage of the library requires that the order of time stamps be independent of whether the write or the mod generates the time stamp for the corresponding node. This is what we mean by saying that a changeable expression must end with a write to its target.

The time stamp of an edge is called its start time, and the time stamp of the target of the edge is called the edge’s stop time. The start and stop times of an edge define the time interval of the edge. Time intervals are then used to identify the containment relationship of reads: a read Ra is contained in a read Rb if and only if the time interval of the edge associated with Ra is within the time interval of the edge associated with Rb. For now, we will represent time stamps with real numbers.

As an example of dynamic dependence graphs, consider the adaptive filter function fil’ shown in Figure 11.2. The function takes a predicate test and a modifiable list l as parameters and outputs a modifiable list that contains the items of l satisfying test. Figure 11.4 shows the dependence graph for an evaluation of fil’ with the function (fn x => x > 2) and a modifiable input list of 2::3::nil. The output is the modifiable list 3::nil. Although not shown in the figure, each edge is also tagged with a reader. In this example, all edges have an instance of the reader (fn l’ => case l’ of ...)
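The containment test itself fits in a couple of lines. The following sketch uses plain floats for time stamps, as the text does, and checks the interval nesting from Figure 11.4, where the read stamped ⟨0.3⟩ (target stamped ⟨0.4⟩) is contained in the read stamped ⟨0.2⟩ (target stamped ⟨0.5⟩):

```python
# Sketch: a read Ra is contained in a read Rb exactly when Ra's time
# interval (start, stop) nests strictly inside Rb's.
def contained(ra, rb):
    (start_a, stop_a), (start_b, stop_b) = ra, rb
    return start_b < start_a and stop_a < stop_b

assert contained((0.3, 0.4), (0.2, 0.5))       # nested intervals
assert not contained((0.1, 0.6), (0.2, 0.5))   # not nested
```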
(lines 9–16 of fil’ in Figure 11.2). The time stamps for input nodes are not relevant and are marked with stars in Figure 11.4. We note that readers are closures, i.e., code with captured environments. In particular, each of the readers in our example has its source and its target in its environment.
11.2.5 Change Propagation
Given a dynamic dependence graph and a set of changed input modifiables, the change-propagation algorithm updates the DDG and the output by propagating changes through the DDG. The change-propagation algorithm for DDGs is described in Section 4.4. In this section, we describe a version of that algorithm adapted to the particular version of DDGs that we use here. The idea behind the change-propagation algorithm is to re-evaluate the reads that are affected by the
Propagate Changes
  I is the set of changed inputs
  G = (V,E) is a DDG

 1  Q := ∪_{v ∈ I} outEdges(v)
 2  while Q is not empty
 3    e := deleteMin(Q)
 4    (Ts,Te) := timeInterval(e)
 5    V := V − {v ∈ V | Ts < T(v) < Te}
 6    E′ := {e′ ∈ E | Ts < T(e′) < Te}
 7    E := E − E′
 8    Q := Q − E′
 9    v′ := apply(reader(e), val(src(e))) in time (Ts,Te)
10    if v′ ≠ val(target(e)) then
11      val(target(e)) := v′
12      Q := Q + outEdges(target(e))
Figure 11.5: The change-propagation algorithm. input change in the sequential execution order. Re-executing the reads in the sequential execution order ensures that the source of an edge (read) is updated before the re-execution of that read. We say that an edge or read, is affected if the source of the edge changes value. For correctness, it is critical that the edges that are created by an edge e, i.e., the edges contained in e, are deleted from the graph when e is re-evaluated. We say that an edge is obsolete if it is contained within an affected edge. Figure 11.5 shows the change-propagation algorithm. The algorithm maintains a priority queue of af- fected edges. The queue is prioritized on the time stamp of each edge, and is initialized with the out-edges of the changed input values. Each iteration of the while loop processes one affected edge—each iteration is called an edge update. An edge update re-evaluates the reader associated with the edge. Re-evaluation makes any code that was within the reader’s dynamic scope obsolete. A key aspect of the algorithm is that when an edge is updated, all nodes and edges that are contained within that edge are deleted from both the graph and queue. This prevents the reader of an obsolete edge from being re-evaluated. After the reader is re-evaluated the algorithm checks if the value of the target has changed (line 10) by using the conservative comparison function passed to mod. If the target has changed, the out-edges of the target are added to the queue to propagate that change. As an example, consider an initial evaluation of filter whose dependence graph is shown in Fig- ure 11.4. Now, suppose we change the modifiable input list from 2::3::nil to 2::4::7::nil by creating the modifiable list 4::7::nil and changing the value of modifiable l1 to this list. The top left frame in Figure 11.6 shows the input change. Now, we run the change-propagation algorithm to update the output. 
First, we insert the sole outgoing edge of l1, namely (l1,l3), into the queue. Since this is the only (hence, the earliest) edge in the queue, we remove it from the queue and establish the current time interval as ⟨0.2⟩–⟨0.5⟩. Next, we delete all the nodes and edges contained in this edge from the DDG and from the queue (which is empty), as shown in the top right frame in Figure 11.6. Then we redo the read by re-
h∗i h∗i h∗i h∗i
l5 l6 NIL l5 l6 NIL
4 7 4 7 h∗i h∗i h∗i h∗i h∗i h∗i h∗i h∗i h∗i h∗i
l0 l1 l2 NIL l0 l1 l2 NIL l0 l1 l5 l6 NIL
2 3 2 3 2 4 7 h0.1i h0.2i h0.3i h0.1i h0.2i xh0.3i h0.1i h0.2i h0.3i h0.4i l3 l4 NIL l3 xl4 NIL l3 l7 l8 NIL h0.5i 3 h0.4i h0.5i 3 h0.4i h0.5i 4 0.475i 7 h0.45ih
Figure 11.6: Snapshots of the DDG during change propagation.
1 fun factorial (n:int mod, p:bool mod) = 2 let 3 fun fact (n:int mod) = 4 if (n = 0) then 1 5 else n * fact(n-1) 6 in 7 modl (fn d => 8 read (p, fn p’ => 9 if (not p’) then write (d,1) 10 else read (n, fn n’ => write (d, fact n’)))) 11 end
Figure 11.7: The adaptive code for factorial. evaluating the reader (fn l’ => case l’ of ...) (8-15 in Figure 11.2) in the current time interval h0.2i-h0.5i. The reader walks through the modifiable list 4::7::nil as it filters the items and writes the head of the result list to l3, as shown in the bottom frame in Figure 11.6. This creates two new edges, which are given the time stamps, h0.3i, and h0.4i. The targets of these edges, l7 and l8, are assigned the time stamps, h0.475i, and h0.45i, matching the order that they were initialized (these time stamps are otherwise chosen arbitrarily to fit in the range h0.4i-h0.5i). Note that after change propagation, the modifiables l2 and l4 become unreachable and can be garbage collected. The reason for deleting obsolete reads and never re-evaluating them is that they may be inconsistent with a from-scratch evaluation of the adaptive program on the changed input. Re-evaluating an obsolete read can therefore change the semantics and the performance of the program. For example, it can cause non- termination, or raise an exception (assuming the language supports exceptions as ML does). As an example, consider the factorial function shown in Figure 11.7. The program takes an integer modifiable n and a boolean modifiable p whose value is true if the value of n is positive and false otherwise. Consider evaluating the function with a positive n with p set to true. Now change n to negative two and p to false. This change will make the read on line 8 affected and therefore the read on line 10 will be obsolete. With our change propagation algorithm, the read on line 8 will be re-evaluated and one will be written to the result modifiable—the obsolete on line 10 will not be re-evaluated. Note that re-evaluating the obsolete 120 CHAPTER 11. ADAPTIVE FUNCTIONAL PROGRAMMING read on line 10 will call the function fact on negative two and will therefore cause non-termination.
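The change-propagation loop can be modeled compactly. The following Python sketch is our own illustrative code, not the thesis's implementation: it keeps a queue of affected edges prioritized by time stamp and re-executes their readers in order. The containment hierarchy, obsolete-edge deletion, and the reuse of time intervals are all elided, and real time stamps come from an order-maintenance structure rather than a simple counter.

```python
import heapq
import itertools

counter = itertools.count()   # simplified stand-in for real time stamps
queue = []                    # priority queue of affected edges, keyed by stamp

class Mod:
    """A modifiable: a value plus the out-edges (readers) that depend on it."""
    def __init__(self, value=None):
        self.value, self.out = value, []

    def write(self, v):
        if self.value != v:              # conservative comparison, as in mod
            self.value = v
            for edge in self.out:        # the out-edges become affected
                heapq.heappush(queue, edge)
            self.out = []

def read(src, reader):
    """Apply reader to src's value and record the dependence as an edge."""
    def run():
        src.out.append((next(counter), run))  # (time stamp, re-execution thunk)
        reader(src.value)
    run()

def propagate():
    """Re-execute affected reads in time-stamp order."""
    while queue:
        _, rerun = heapq.heappop(queue)
        rerun()

x, y, z = Mod(1), Mod(), Mod()
read(x, lambda v: y.write(v + 1))        # y depends on x
read(y, lambda v: z.write(2 * v))        # z depends on y
x.write(5)                               # change the input...
propagate()                              # ...and adjust the outputs
```

After propagation, y holds 6 and z holds 12; a later write to x followed by another call to propagate adjusts both again, which is the behavior the DDG machinery provides at constant overhead.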
11.2.6 The ML Implementation
We present an implementation of our adaptive mechanism in ML. The implementation is based on a library for virtual clocks and a standard priority queue. In the virtual-clock interface (shown in Figure 11.8), delete deletes all time stamps between two given time stamps and isDeleted returns true if the time stamp has been deleted and false otherwise.
signature VIRTUAL_CLOCK = sig
  type t
  val init      : unit -> t      (* Initialize *)
  val compare   : t*t -> order   (* Compare two nodes *)
  val insert    : t ref -> t     (* Insert a new node *)
  val delete    : t*t -> unit    (* Delete an interval *)
  val isDeleted : t -> bool      (* Is the node deleted? *)
end
Figure 11.8: The signature for virtual clocks.
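The interface can be modeled naively with an ordered list of time stamps. The Python sketch below is illustrative only (the real implementation uses a constant-time order-maintenance structure), and it assumes that delete removes the stamps strictly between its two arguments:

```python
class VirtualClock:
    """Naive virtual clock: an ordered list of opaque time stamps."""
    def __init__(self):
        self.stamps = [object()]                 # initial sentinel stamp

    def insert(self, after):
        """Create a fresh time stamp immediately after `after`."""
        t = object()
        self.stamps.insert(self.stamps.index(after) + 1, t)
        return t

    def compare(self, t1, t2):
        """Negative if t1 precedes t2, positive if it follows."""
        return self.stamps.index(t1) - self.stamps.index(t2)

    def delete(self, t1, t2):
        """Delete all stamps strictly between t1 and t2 (assumed semantics)."""
        i, j = self.stamps.index(t1), self.stamps.index(t2)
        del self.stamps[i + 1:j]

    def is_deleted(self, t):
        return t not in self.stamps

clock = VirtualClock()
t0 = clock.stamps[0]
t1 = clock.insert(t0)      # t0 < t1
t2 = clock.insert(t1)      # t0 < t1 < t2
t3 = clock.insert(t2)      # t0 < t1 < t2 < t3
clock.delete(t1, t3)       # removes t2
```

Each list operation here costs linear time; the point of the order-maintenance structure is to support insert, compare, and delete in (amortized) constant time.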
Figure 11.9 shows the code for the ML implementation. The implementation differs somewhat from the earlier description. The edge and node types correspond to edges and nodes in the DDG. The reader and time interval are represented explicitly in the edge type, but the source and destination are implicit in the reader: the reader starts by reading the source and ends by writing to the destination. The node consists of the corresponding modifiable's value (value), its out-edges (out), and a write function (wrt) that implements writes or changes to the modifiable. A time stamp is not needed, since edges keep both start and stop times. The timeNow is used to help generate the sequential time stamps, which are generated for the edge on line 27 and for the node on line 22 by the write operation.

Some of the tasks assigned to the change-propagate loop in Figure 11.5 are performed by the write operation in the ML code. This includes the functionality of lines 10 and 12 in Figure 11.5, which are executed by lines 17 and 20 in the ML code. Another important difference is that the deletion of contained edges is done lazily. Instead of deleting edges from the queue and from the graph immediately, the time stamp of the edge is marked as deleted (by being removed from the ordered-list data structure), and the edge is discarded when it is next encountered. This can be seen on line 38.

We note that the implementation given does not include sufficient run-time checks to verify "correct usage". For example, the code does not verify that an initializer writes its intended destination. The code does, however, check for a read before a write. Recall that there is a strong correspondence between the version of DDGs used here and the standard DDG data structure. The version of DDGs that we use here tracks modifiables instead of all memory locations, and reads instead of all function calls.
Based on this correspondence, we can show that the presented implementation has constant-time overhead and supports change propagation within the same time bounds as shown in Chapter 6.
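The lazy-deletion idea used by propagate' is a general priority-queue technique: deleting an entry only marks it, and stale entries are discarded when they surface, just as the Time.isDeleted test filters obsolete edges. A minimal illustrative sketch (our own Python, not the thesis's code):

```python
import heapq

class LazyDeletionQueue:
    """Priority queue where deletion only invalidates an entry;
    invalid entries are skipped lazily at delete_min time."""
    def __init__(self):
        self.heap = []

    def insert(self, priority, item):
        entry = [priority, item, True]   # entry[2]: is the entry still live?
        heapq.heappush(self.heap, entry)
        return entry                     # handle used for later deletion

    def delete(self, entry):
        entry[2] = False                 # O(1): no heap restructuring needed

    def delete_min(self):
        while self.heap:                 # discard marked entries lazily
            _, item, live = heapq.heappop(self.heap)
            if live:
                return item
        return None

q = LazyDeletionQueue()
a = q.insert(1, "a")
b = q.insert(2, "b")
c = q.insert(3, "c")
q.delete(a)                              # mark only; "a" is skipped later
```

The design choice mirrors the ML code: arbitrary deletion from a binary heap would require locating the entry, whereas marking is constant time and the extra pops are paid for by the insertions that created the stale entries.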
1  structure Adaptive :> ADAPTIVE = struct
2    type changeable = unit
3    exception unsetMod
4    type edge = {reader:(unit -> unit), timeInterval:(Time.t * Time.t)}
5    type 'a node = {value:(unit -> 'a) ref, wrt:('a->unit) ref, out:edge list ref}
6    type 'a mod = 'a node
7    type 'a dest = 'a node
8    val timeNow = ref(Time.init())
9    val PQ = ref(Q.empty)  (* The priority queue *)
10   fun init() = (timeNow := Time.init(); PQ := Q.empty)
11   fun mod cmp f =
12     let val value = ref(fn() => raise unsetMod)
13         val wrt = ref(fn(v) => raise unsetMod)
14         val out = ref(nil)
15         val m = {value=value, wrt=wrt, out=out}
16         fun change t v =
17           (if cmp(v,(!value)()) then ()
18            else (value := (fn() => v);
19                  List.app (fn x => PQ := Q.insert(x,!PQ)) (!out);
20                  out := nil);
21            timeNow := t)
22         fun write(v) = (value := (fn() => v); wrt := change(Time.insert(timeNow)))
23         val _ = wrt := write
24     in f(m); m end
25   fun write({wrt, ...} : 'a dest, v) = (!wrt)(v)
26   fun read({value, out, ...} : 'a mod, f) =
27     let val start = Time.insert(timeNow)
28         fun run() =
29           (f((!value)()); out := {reader=run,timeInterval=(start,(!timeNow))}::(!out))
30     in run() end
31   fun change(l: 'a mod, v) = write(l, v)
32   fun propagate'() =
33     if (Q.isEmpty(!PQ)) then ()
34     else let val (edge, pq) = Q.deleteMin(!PQ)
35              val _ = PQ := pq
36              val {reader=f,timeInterval=(start,stop)} = edge
37          in
38            if (Time.isDeleted start) then
39              propagate'()  (* Obsolete read, discard. *)
40            else
41              (Time.delete(start,stop);  (* Delete interval *)
42               timeNow := start;
43               f();  (* Rerun the read *)
44               propagate'())
45          end
46   fun propagate() = let val ctime = !timeNow in
47     (propagate'(); timeNow := ctime)
48   end
49 end
Figure 11.9: The implementation of the adaptive library.
11.3 An Adaptive Functional Language
The first half of this chapter described the adaptivity mechanism in an informal, algorithmic setting. The purpose was to introduce the basic concepts of adaptivity and to show that the mechanism can be implemented efficiently. We now turn to the question of whether the proposed mechanism is sound. To this end, we present a small, purely functional language, called AFL, with primitives for adaptive computation. AFL ensures correct usage of the adaptivity mechanism statically by using a modal type system and employing implicit "destination passing."

The adaptivity mechanisms of AFL are similar to those of the adaptive library presented in Section 11.2. The chief difference is that the target of a changeable expression is implicit in AFL. Implicit passing of destinations is critical for ensuring various safety properties of the language. AFL does not include analogues of the meta-operations for making and propagating changes as in the ML library. Instead, we give a direct presentation of the change-propagation algorithm in Section 11.5, which is defined in terms of the dynamic semantics of AFL given here. As with the ML implementation, the dynamic semantics must keep a record of the adaptive aspects of the computation. Rather than use DDGs, however, the semantics maintains this information in the form of a trace, which guides the change-propagation algorithm. The trace representation simplifies the proof of correctness of the change-propagation algorithm given in Section 11.5.
11.3.1 Abstract Syntax
The abstract syntax of AFL is given in Figure 11.10. We use the meta-variables x, y, and z (and variants) to range over an unspecified set of variables, and the meta-variable l (and variants) to range over a separate, unspecified set of locations; the locations are modifiable references. The syntax of AFL is restricted to "2/3-cps", or "named form", to streamline the presentation of the dynamic semantics.

The types of AFL include the base types int and bool; the stable function type, τ1 →s τ2; the changeable function type, τ1 →c τ2; and the type τ mod of modifiable references of type τ. Extending AFL with product, sum, recursive, or polymorphic types presents no fundamental difficulties, but they are omitted here for the sake of brevity.

Expressions are classified into two categories, the stable and the changeable. The value of a stable expression is not sensitive to modifications to the inputs, whereas the value of a changeable expression may, directly or indirectly, be affected by them. The familiar mechanisms of functional programming are embedded in AFL as stable expressions. These include basic types such as integers and booleans, and a sequential let construct for ordering evaluation. Ordinary functions arise in AFL as stable functions. The body of a stable function must be a stable expression; the application of a stable function is correspondingly stable. The stable expression modτ ec allocates a new modifiable reference whose value is determined by the changeable expression ec. Note that the modifiable itself is stable, even though its contents are subject to change.

Changeable expressions are written in destination-passing style, with an implicit target. The changeable expression writeτ(v) writes the value v of type τ into the target. The changeable expression read v as x in ec end binds the contents of the modifiable v to the variable x, then continues evaluation of ec. A read is considered changeable because the contents of the modifiable on which it depends
Types                   τ  ::= int | bool | τ mod | τ1 →s τ2 | τ1 →c τ2

Values                  v  ::= c | x | l | funs f(x : τ1): τ2 is es end
                              | func f(x : τ1): τ2 is ec end

Operators               o  ::= not | + | - | = | < | ...

Constants               c  ::= n | true | false

Expressions             e  ::= es | ec

Stable Expressions      es ::= v | o(v1, . . . , vn) | applys(v1, v2)
                              | let x be es in es′ end | modτ ec
                              | if v then es else es′

Changeable Expressions  ec ::= writeτ(v) | applyc(v1, v2)
                              | let x be es in ec end
                              | read v as x in ec end
                              | if v then ec else ec′
Figure 11.10: The abstract syntax of AFL.
fun sum (l:int modlist):int =
  let
    fun sum' (l:int modlist, s:int, dest:int mod) =
      read (l, fn l' =>
        case l' of
          NIL => write (dest,s)
        | CONS(h,t) => sum' (t, s+h, dest))
  in
    mod (fn d => sum' (l,0,d))
  end

funs sum (l:int modlist):int mod is
  let sum'' be
    func sum' (l:int modlist, s:int):int is
      read l as l' in
        case l' of
          NIL => writeint(s)
        | CONS(h,t) => applyc (sum',(t,s+h))
      end
  in
    modint applyc (sum'',(l,0))
  end
Figure 11.11: Function sum written with the ML library (top), and in AFL (bottom).

is subject to change. A changeable function itself is stable, but its body is changeable; correspondingly, the application of a changeable function is a changeable expression. The sequential let construct allows for the inclusion of stable sub-computations in changeable mode. Finally, conditionals with changeable branches are themselves changeable.

As an example, consider a function that sums up the keys in a modifiable list. Such a function can be written by traversing the list and accumulating a sum, which is written to the destination at the end. The code for this function using our ML library (Section 11.2) is shown at the top of Figure 11.11. Note that all recursive calls to the function sum' share the same destination. The code for the sum function in AFL is shown at the bottom of Figure 11.11, assuming constructs for supporting lists and pattern matching. The critical difference between the two implementations is that in AFL, destinations are passed implicitly, and the sum' function is a changeable function. Since sum' is changeable, all recursive calls to it share the same destination, which is created in the body of sum.

The advantage of sharing destinations is performance. Consider, for example, calling sum on some list and then changing the list by an insertion or deletion at the end. Propagating this change takes constant time, and the result is updated in constant time. If instead each recursive call to sum' created its own destination and copied the result of the recursive call into that destination, then this change would propagate up the recursive-call tree and would take linear time. This is the motivation for including changeable functions in the AFL language.
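The stable/changeable split of Figure 11.10 can be rendered as a small datatype with a mode classifier. The Python below is a hypothetical model of a fragment of AFL (our own names, not part of the language), covering just enough constructors to show how let takes the mode of its body and thereby mediates between the two modes:

```python
from dataclasses import dataclass

@dataclass
class Write:                  # write_tau(v): changeable
    value: object

@dataclass
class Read:                   # read v as x in e end: changeable
    source: str
    var: str
    body: object

@dataclass
class ModE:                   # mod_tau e: stable, with a changeable body
    body: object

@dataclass
class Let:                    # let x be e in e' end: mode of its body
    var: str
    bound: object
    body: object

def mode(e):
    """Classify an expression as 's' (stable) or 'c' (changeable)."""
    if isinstance(e, (Write, Read)):
        return 'c'
    if isinstance(e, ModE):
        return 's'            # the modifiable itself is stable
    if isinstance(e, Let):
        return mode(e.body)   # let is stable or changeable with its body
    return 's'                # values, primitives, stable applications
```

For instance, mod wrapped around a write is stable, while a let whose body is a write is changeable, matching the inclusion provided by the mod and let rules.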
11.3.2 Static Semantics
The AFL type system is inspired by the type theory of modal logic given by Pfenning and Davies [79]. We distinguish two modes, the stable and the changeable, corresponding to the distinction between terms and expressions, respectively, in Pfenning and Davies' work. However, they have no analogue of our changeable function type, and they do not give an operational interpretation of their type system.

The judgement Λ; Γ ⊢s e : τ states that e is a well-formed stable expression of type τ, relative to Λ and Γ. The judgement Λ; Γ ⊢c e : τ states that e is a well-formed changeable expression of type τ, relative to Λ and Γ. Here Λ is a location typing and Γ is a variable typing; these are finite functions assigning types to locations and variables, respectively. (In Section 11.4 we will impose additional structure on location typings that will not affect the definition of the static semantics.)

The typing judgements for stable and changeable expressions are shown in Figures 11.12 and 11.13, respectively. For primitive functions, we assume a typing relation ⊢o. For stable expressions, the interesting rules are those for mod and for changeable functions: the bodies of these expressions are changeable expressions, and therefore they are typed in the changeable mode. For changeable expressions, the interesting rule is the let rule: the body of a let is a changeable expression and is thus typed in the changeable mode, whereas the bound expression is a stable expression and is thus typed in the stable mode. The mod and let rules therefore provide the inclusion between the two modes.
11.3.3 Dynamic Semantics
The evaluation judgements of AFL have one of two forms. The judgement σ, e ⇓s v, σ′, Ts states that evaluation of the stable expression e, relative to the input store σ, yields the value v, the trace Ts, and the updated store σ′. The judgement σ, l ← e ⇓c σ′, Tc states that evaluation of the changeable expression e, relative to the input store σ, writes its value to the target l, and yields the trace Tc and the updated store σ′.

In the dynamic semantics, a store, σ, is a finite function mapping each location in its domain, dom(σ), to either a value v or a "hole" □. The defined domain, def(σ), of σ consists of those locations in dom(σ) not mapped to □ by σ. When a location is created, it is assigned the value □ to reserve that location while
(number)    Λ; Γ ⊢s n : int
(true)      Λ; Γ ⊢s true : bool
(false)     Λ; Γ ⊢s false : bool

(location)  Λ; Γ ⊢s l : τ mod        (Λ(l) = τ)
(variable)  Λ; Γ ⊢s x : τ            (Γ(x) = τ)

(fun/stable)
    Λ; Γ, f : τ1 →s τ2, x : τ1 ⊢s e : τ2
    ------------------------------------------------
    Λ; Γ ⊢s funs f(x : τ1): τ2 is e end : (τ1 →s τ2)

(fun/changeable)
    Λ; Γ, f : τ1 →c τ2, x : τ1 ⊢c e : τ2
    ------------------------------------------------
    Λ; Γ ⊢s func f(x : τ1): τ2 is e end : (τ1 →c τ2)

(primitive)
    Λ; Γ ⊢s vi : τi (1 ≤ i ≤ n)    ⊢o o : (τ1, . . . , τn) ⇒ τ
    ----------------------------------------------------------
    Λ; Γ ⊢s o(v1, . . . , vn) : τ

(if)
    Λ; Γ ⊢s x : bool    Λ; Γ ⊢s e : τ    Λ; Γ ⊢s e′ : τ
    ---------------------------------------------------
    Λ; Γ ⊢s if x then e else e′ : τ

(apply)
    Λ; Γ ⊢s v1 : (τ1 →s τ2)    Λ; Γ ⊢s v2 : τ1
    ------------------------------------------
    Λ; Γ ⊢s applys(v1, v2) : τ2

(let)
    Λ; Γ ⊢s e : τ    Λ; Γ, x : τ ⊢s e′ : τ′
    ---------------------------------------
    Λ; Γ ⊢s let x be e in e′ end : τ′

(modifiable)
    Λ; Γ ⊢c e : τ
    -----------------------
    Λ; Γ ⊢s modτ e : τ mod
Figure 11.12: Typing of stable expressions.

its value is being determined. With a store σ, we associate a location typing Λ and write σ : Λ if the store satisfies the typing Λ; this is defined formally in Section 11.4.

A trace is a finite data structure recording the adaptive aspects of evaluation. The abstract syntax of traces is given by the following grammar:
Trace       T  ::= Ts | Tc
Stable      Ts ::= ε | ⟨Tc⟩_{l:τ} | Ts ; Ts′
Changeable  Tc ::= Wτ | R_l^{x.e}(Tc) | Ts ; Tc

When writing traces, we adopt the convention that ";" is right-associative. A stable trace records the sequence of allocations of modifiables that arise during the evaluation of a stable expression. The trace ⟨Tc⟩_{l:τ} records the allocation of the modifiable, l, its type, τ, and the trace of
(write)
    Λ; Γ ⊢s v : τ
    ----------------------
    Λ; Γ ⊢c writeτ(v) : τ

(if)
    Λ; Γ ⊢s x : bool    Λ; Γ ⊢c e : τ    Λ; Γ ⊢c e′ : τ
    ---------------------------------------------------
    Λ; Γ ⊢c if x then e else e′ : τ

(apply)
    Λ; Γ ⊢s v1 : (τ1 →c τ2)    Λ; Γ ⊢s v2 : τ1
    ------------------------------------------
    Λ; Γ ⊢c applyc(v1, v2) : τ2

(let)
    Λ; Γ ⊢s e : τ    Λ; Γ, x : τ ⊢c e′ : τ′
    ---------------------------------------
    Λ; Γ ⊢c let x be e in e′ end : τ′

(read)
    Λ; Γ ⊢s v : τ mod    Λ; Γ, x : τ ⊢c e : τ′
    ------------------------------------------
    Λ; Γ ⊢c read v as x in e end : τ′
Figure 11.13: Typing of changeable expressions.
the initialization code for l. The trace Ts ; Ts′ results from the evaluation of a let expression in stable mode, the first trace resulting from the bound expression, the second from its body.

A changeable trace has one of three forms. A write, Wτ, records the storage of a value of type τ in the target. A sequence Ts ; Tc records the evaluation of a let expression in changeable mode, with Ts corresponding to the bound stable expression and Tc corresponding to its body. A read trace R_l^{x.e}(Tc) specifies the location read, l, the context of use of its value, x.e, and the trace, Tc, of the remainder of the evaluation within the scope of that read. This records the dependency of the target on the value of the location read.

We define the domain dom(T) of a trace T as the set of locations read or written in the trace T, and the defined domain def(T) of T as the set of locations written in the trace T. Formally, the domain and the defined domain of traces are defined as follows:
def(ε) = ∅                                dom(ε) = ∅
def(⟨Tc⟩_{l:τ}) = def(Tc) ∪ {l}           dom(⟨Tc⟩_{l:τ}) = dom(Tc) ∪ {l}
def(Ts ; Ts′) = def(Ts) ∪ def(Ts′)        dom(Ts ; Ts′) = dom(Ts) ∪ dom(Ts′)
def(Wτ) = ∅                               dom(Wτ) = ∅
def(R_l^{x.e}(Tc)) = def(Tc)              dom(R_l^{x.e}(Tc)) = dom(Tc) ∪ {l}
def(Ts ; Tc) = def(Ts) ∪ def(Tc)          dom(Ts ; Tc) = dom(Ts) ∪ dom(Tc).
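These equations translate directly into code. The following Python sketch (our own illustrative rendering, with hypothetical names and type subscripts omitted) represents traces as a small datatype and computes def(T) and dom(T) by structural recursion:

```python
from dataclasses import dataclass

@dataclass
class Empty:                # the empty trace (epsilon)
    pass

@dataclass
class Alloc:                # <Tc>_{l:tau}: allocation of l (l is written)
    loc: str
    body: object

@dataclass
class Seq:                  # T ; T'
    first: object
    second: object

@dataclass
class WriteT:               # W_tau: a write to the (implicit) target
    pass

@dataclass
class ReadT:                # R_l^{x.e}(Tc): a read of l
    loc: str
    body: object

def defs(t):
    """def(T): the locations written in T."""
    if isinstance(t, Alloc): return defs(t.body) | {t.loc}
    if isinstance(t, Seq):   return defs(t.first) | defs(t.second)
    if isinstance(t, ReadT): return defs(t.body)
    return set()            # Empty and WriteT contribute no locations

def dom(t):
    """dom(T): the locations read or written in T."""
    if isinstance(t, Alloc): return dom(t.body) | {t.loc}
    if isinstance(t, Seq):   return dom(t.first) | dom(t.second)
    if isinstance(t, ReadT): return dom(t.body) | {t.loc}
    return set()
```

For the trace of mod allocating l1 whose body reads l0 and then writes, def is {l1} while dom also includes the read location l0.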
The dynamic dependence graphs described in Section 11.2 may be seen as an efficient representation of traces. Time stamps may be assigned to each read and write operation in the trace in left-to-right order; these correspond to the time stamps in the DDG representation. The containment hierarchy is directly represented by the structure of the trace. Specifically, the trace Tc (and any read in Tc) is contained within
(value)
    σ, v ⇓s v, σ, ε

(primitive)
    (v′ = app(o, (v1, . . . , vn)))
    -------------------------------
    σ, o(v1, . . . , vn) ⇓s v′, σ, ε

(if/true)
    σ, e ⇓s v, σ′, Ts
    ---------------------------------------
    σ, if true then e else e′ ⇓s v, σ′, Ts

(if/false)
    σ, e′ ⇓s v, σ′, Ts
    ----------------------------------------
    σ, if false then e else e′ ⇓s v, σ′, Ts

(apply)
    (v1 = funs f(x : τ1): τ2 is e end)
    σ, [v1/f, v2/x] e ⇓s v, σ′, Ts
    ----------------------------------
    σ, applys(v1, v2) ⇓s v, σ′, Ts

(let)
    σ, e ⇓s v, σ′, Ts
    σ′, [v/x]e′ ⇓s v′, σ″, Ts′
    ---------------------------------------------
    σ, let x be e in e′ end ⇓s v′, σ″, (Ts ; Ts′)

(mod)
    σ[l → □], l ← e ⇓c σ′, Tc    (l ∉ dom(σ))
    ------------------------------------------
    σ, modτ e ⇓s l, σ′, ⟨Tc⟩_{l:τ}
Figure 11.14: Evaluation of stable expressions.
the read trace R_l^{x.e}(Tc).

Stable Evaluation. The evaluation rules for stable expressions are given in Figure 11.14. Most of the rules are standard for a store-passing semantics. For example, the let rule sequences evaluation of its two expressions and performs binding by substitution. Less conventionally, it yields a trace consisting of the sequential composition of the traces of its sub-expressions.
The most interesting rule is the evaluation of modτ e. Given a store σ, a fresh location l is allocated and initialized to □ prior to the evaluation of e. The sub-expression e is evaluated in changeable mode, with l as the target. Pre-allocating l ensures that the target of e is not accidentally re-used during evaluation; the static semantics ensures that l cannot be read before its contents is set to some value v.

Each location allocated during the evaluation of a stable expression is recorded in the trace and is written to: if σ, e ⇓s v, σ′, Ts, then dom(σ′) = dom(σ) ∪ def(Ts) and def(σ′) = def(σ) ∪ def(Ts). Furthermore, all locations read during evaluation are defined in the store: dom(Ts) ⊆ def(σ′).

Changeable Evaluation. The evaluation rules for changeable expressions are given in Figure 11.15. The let rule is similar to the corresponding rule in stable mode, except that the bound expression, e, is evaluated in stable mode, whereas the body, e′, is evaluated in changeable mode. The read expression substitutes the binding of location l in the store σ for the variable x in e, and continues evaluation in changeable mode. The read is recorded in the trace, along with the expression that employs the value read. The write rule simply assigns its argument to the target. Finally, the application of a changeable function passes the target of the caller to the callee, avoiding the need to allocate a fresh target for the callee and a corresponding read to return its value to the caller.

Each location allocated during the evaluation of a changeable expression is recorded in the trace and is written; the destination is also written: if σ, l ← e ⇓c σ′, Tc, then dom(σ′) = dom(σ) ∪ def(Tc) and def(σ′) = def(σ) ∪ def(Tc) ∪ {l}. Furthermore, all locations read during evaluation are defined in the store: dom(Tc) ⊆ def(σ′).

(write)
    σ, l ← writeτ(v) ⇓c σ[l ← v], Wτ

(if/true)
    σ, l ← e ⇓c σ′, Tc
    -----------------------------------------
    σ, l ← if true then e else e′ ⇓c σ′, Tc

(if/false)
    σ, l ← e′ ⇓c σ′, Tc
    ------------------------------------------
    σ, l ← if false then e else e′ ⇓c σ′, Tc

(apply)
    (v1 = func f(x : τ1): τ2 is e end)
    σ, l ← [v1/f, v2/x] e ⇓c σ′, Tc
    ----------------------------------
    σ, l ← applyc(v1, v2) ⇓c σ′, Tc

(let)
    σ, e ⇓s v, σ′, Ts
    σ′, l ← [v/x]e′ ⇓c σ″, Tc
    ---------------------------------------------
    σ, l ← let x be e in e′ end ⇓c σ″, (Ts ; Tc)

(read)
    σ, l ← [σ(l′)/x] e ⇓c σ′, Tc
    ------------------------------------------------------
    σ, l ← read l′ as x in e end ⇓c σ′, R_{l′}^{x.e}(Tc)

Figure 11.15: Evaluation of changeable expressions.
11.4 Type Safety of AFL
The static semantics of AFL ensures the following five properties of its dynamic semantics: (1) each modifiable is written exactly once; (2) no modifiable is read before it is written; (3) dependences are not lost, i.e., if a value depends on a modifiable, then that value is also placed in a modifiable; (4) the store is acyclic; and (5) the data dependences (the dynamic dependence graph) are acyclic. These properties are critical for the correctness of the adaptivity mechanisms. The last two properties show that AFL is consistent with pure functional programming by ensuring that no cycles arise during evaluation.
The write-once property (1) and the no-lost-dependences property (3) are relatively easy to observe. A write can take place only in the changeable mode and can write only to the current destination. Since entering the changeable mode requires the creation of a new destination by the mod construct, and only the current destination can be written, each modifiable is written exactly once. For property (3), note that dependences are created by read operations, which take place in the changeable mode, are recorded in the trace, and end with a write. Thus, dependences are recorded, and the result of a read is always written to a destination.

The proof that the store is acyclic is more involved. We order the locations (modifiables) of the store with respect to the times at which they are written, and require that the value of each expression typecheck with respect to the locations written before that expression. The total order directly implies that the store is acyclic (property 4), i.e., no two locations refer to each other. The restriction that an expression typecheck with respect to the previously written locations ensures that no location is read before it is written (property 2). This fact, along with the total ordering on locations, implies that there are no cyclic dependences, i.e., the dynamic dependence graph is acyclic (property 5).

The proof of type safety for AFL hinges on a type preservation theorem for the dynamic semantics. Since the dynamic semantics of AFL is given by an evaluation relation, rather than a transition system, the proof of type safety is indirect. First, we prove the type preservation theorem, stating that the outcome of evaluation is type consistent, provided that the inputs are. Second, we prove a canonical forms lemma characterizing the "shapes" of closed values of each type. Third, we augment the dynamic semantics with rules stating that evaluation "goes wrong" in the case that the principal argument of an elimination form is non-canonical.
Finally, we argue that, by the first two results, these rules can never apply to a well-typed program. Since the last two steps are routine, given the first two, we concentrate on preservation and canonical forms.
11.4.1 Location Typings
For the safety proof, we enrich location typings with a total ordering on locations that respects the order in which they are written. A location typing, Λ, consists of three parts:
1. A finite set, dom(Λ), of locations, called the domain of the store typing.
2. A finite function, also written Λ, assigning types to the locations in dom(Λ).
3. A linear ordering ≤Λ of dom(Λ).
The relation l <Λ l′ holds if and only if l ≤Λ l′ and l ≠ l′. The restriction, ≤Λ ↾ L, of ≤Λ to a subset L ⊆ dom(Λ) is the intersection ≤Λ ∩ (L × L).

As can be expected, stores are extended with respect to the total order: the ordered extension, Λ′ = Λ[l′:τ′ <Λ l], of Λ with a fresh location l′ placed immediately before l ∈ dom(Λ) is defined by:

1. dom(Λ′) = dom(Λ) ∪ { l′ };

2. Λ′(l″) = τ′ if l″ = l′, and Λ′(l″) = Λ(l″) otherwise;

3. (a) l′ ≤Λ′ l;
   (b) if l″ ≤Λ l, then l″ ≤Λ′ l′;
   (c) if l″ ≤Λ l‴, then l″ ≤Λ′ l‴.

Location typings are partially ordered with respect to a containment relation by defining Λ ⊑ Λ′ if and only if

1. dom(Λ) ⊆ dom(Λ′);

2. if l ∈ dom(Λ), then Λ′(l) = Λ(l);

3. ≤Λ′ ↾ dom(Λ) = ≤Λ.

Ordered extensions yield bigger stores: if l ∈ dom(Λ) and l′ ∉ dom(Λ), then Λ ⊑ Λ[l′:τ′ <Λ l]. The restriction, Λ′ = Λ ↾ l, of Λ to the locations prior to l is defined by:

1. dom(Λ′) = { l′ ∈ dom(Λ) | l′ <Λ l };

2. if l′ <Λ l, then Λ′(l′) = Λ(l′);

3. ≤Λ′ = ≤Λ ↾ dom(Λ′).

Note that if Λ ⊑ Λ′ and l ∈ dom(Λ), then Λ ↾ l ⊑ Λ′ ↾ l.

Definition 68 (Store Typing) A store σ may be assigned a location typing Λ, written σ : Λ, if and only if the following two conditions are satisfied.

1. dom(σ) = dom(Λ).

2. For each l ∈ def(σ), Λ ↾ l ⊢s σ(l) : Λ(l).

The location typing, Λ, imposes a linear ordering on the locations in the store, σ, such that the values in the store have the types assigned to them by Λ, relative to the types of the preceding locations in the ordering.

11.4.2 Trace Typing

The formulation of the type safety theorem requires a notion of typing for traces. The judgement Λ, l0 ⊢s Ts ⇝ Λ′ states that the stable trace Ts is well-formed relative to the input location typing Λ and the cursor l0 ∈ dom(Λ′). The output location typing Λ′ is an extension of Λ with typings for the locations allocated by the trace; these will all be ordered prior to the cursor.

(empty)
    Λ, l0 ⊢s ε ⇝ Λ

(sequence/stable)
    Λ, l0 ⊢s Ts ⇝ Λ′    Λ′, l0 ⊢s Ts′ ⇝ Λ″
    ----------------------------------------
    Λ, l0 ⊢s Ts ; Ts′ ⇝ Λ″

(allocate)
    Λ[l:τ <Λ l0], l ⊢c Tc : τ ⇝ Λ′
    --------------------------------
    Λ, l0 ⊢s ⟨Tc⟩_{l:τ} ⇝ Λ′

(write)
    Λ, l0 ⊢c Wτ : τ ⇝ Λ    (Λ(l0) = τ)

(sequence/changeable)
    Λ, l0 ⊢s Ts ⇝ Λ′    Λ′, l0 ⊢c Tc : τ ⇝ Λ″
    ------------------------------------------
    Λ, l0 ⊢c Ts ; Tc : τ ⇝ Λ″

(read)
    Λ ↾ l0, x:Λ(l) ⊢c e : τ    Λ, l0 ⊢c Tc : τ ⇝ Λ′    (l ≤Λ l0)
    -------------------------------------------------------------
    Λ, l0 ⊢c R_l^{x.e}(Tc) : τ ⇝ Λ′

Figure 11.16: Typing of Traces.
When Λ′ is not important, we simply write Λ ⊢s Ts ok to mean that Λ, l0 ⊢s Ts ⇝ Λ′ for some Λ′. Similarly, the judgement Λ, l0 ⊢c Tc : τ ⇝ Λ′ states that the changeable trace Tc is well-formed relative to Λ and l0 ∈ dom(Λ). As with stable traces, Λ′ is an extension of Λ with the newly allocated locations of the trace. When Λ′ is not important, we write Λ ⊢c Tc : τ to mean that Λ, l0 ⊢c Tc : τ ⇝ Λ′ for some Λ′.

The rules for deriving these judgements are given in Figure 11.16. The input location typing specifies the active locations, of which only those prior to the cursor are eligible as subjects of a read; this ensures that a location is not read before it is written. The cursor changes when processing an allocation trace to make the allocated location active, but unreadable, thereby ensuring that no location is read before it is allocated. The output location typing determines the ordering of the locations allocated by the trace relative to the ordering of the input locations. Specifically, the ordering of the newly allocated locations is determined by the trace, and is such that they are all ordered to occur immediately prior to the cursor. The ordering so determined is essentially the same as that used in the implementation described in Section 11.2.

The following invariants hold for traces and trace typings.

1. Every l ∈ def(T) is written exactly once: l appears exactly once in a write position in T, i.e., in a subtrace of the form ⟨Tc⟩_{l:τ} for some Tc.

2. If Λ, l0 ⊢c Tc : τ ⇝ Λ′, then dom(Λ′) = dom(Λ) ∪ def(Tc) and dom(Tc) ⊆ dom(Λ′).

3. If Λ, l0 ⊢s Ts ⇝ Λ′, then dom(Λ′) = dom(Λ) ∪ def(Ts) and dom(Ts) ⊆ dom(Λ′).

11.4.3 Type Preservation

For the proof of type safety we shall make use of a few technical lemmas. First, typing is preserved by the addition of typings of "irrelevant" locations and variables.

Lemma 69 (Weakening) If Λ ⊑ Λ′ and Γ ⊆ Γ′, then

1. if Λ; Γ ⊢s e : τ, then Λ′; Γ′ ⊢s e : τ;

2. if Λ; Γ ⊢c e : τ, then Λ′; Γ′ ⊢c e : τ;

3.
if Λ ⊢s Ts ok, then Λ′ ⊢s Ts ok;

4. if Λ ⊢c Tc : τ, then Λ′ ⊢c Tc : τ.

Second, typing is preserved by substitution of a value for a free variable of the same type as the value.

Lemma 70 (Value Substitution) Suppose that Λ; Γ ⊢s v : τ.

1. If Λ; Γ, x:τ ⊢s e′ : τ′, then Λ; Γ ⊢s [v/x]e′ : τ′.

2. If Λ; Γ, x:τ ⊢c e′ : τ′, then Λ; Γ ⊢c [v/x]e′ : τ′.

The type preservation theorem for AFL states that the result of evaluating a well-typed expression is itself well-typed. The location l0, called the cursor, is the current allocation point: all locations prior to the cursor have been written to, and all locations following the cursor are allocated but not yet written. All new locations are allocated prior to l0 in the ordering, and each newly allocated location becomes the cursor. The theorem requires that the input expression be well-typed relative to those locations preceding the cursor, so as to preclude forward references to locations that have been allocated but not yet initialized. In exchange, the result is assured to be sensible relative to those locations prior to the cursor, all of which are allocated and initialized. This ensures that no location is read before it has been allocated and initialized.

Theorem 71 (Type Preservation)

1. If

(a) σ, e ⇓s v, σ′, Ts,
(b) σ : Λ,
(c) l0 ∈ dom(Λ),
(d) l <Λ l0 implies l ∈ def(σ),
(e) Λ ↾ l0 ⊢s e : τ,

then there exists Λ′ ⊒ Λ such that

(f) Λ′ ↾ l0 ⊢s v : τ,
(g) σ′ : Λ′, and
(h) Λ, l0 ⊢s Ts ⇝ Λ′.

2. If

(a) σ, l0 ← e ⇓c σ′, Tc,
(b) σ : Λ,
(c) Λ(l0) = τ0,
(d) l <Λ l0 implies l ∈ def(σ),
(e) Λ ↾ l0 ⊢c e : τ0,

then

(f) l0 ∈ def(σ′),

and there exists Λ′ ⊒ Λ such that

(g) σ′ : Λ′, and
(h) Λ, l0 ⊢c Tc : τ0 ⇝ Λ′.

Proof: Simultaneously, by induction on evaluation. We will consider several illustrative cases.

• Suppose that

(1a) σ, modτ e ⇓s l, σ″, ⟨Tc⟩_{l:τ};
(1b) σ : Λ;
(1c) l0 ∈ dom(Λ);
(1d) l′ <Λ l0 implies l′ ∈ def(σ);
(1e) Λ ↾ l0 ⊢s modτ e : τ mod.
Since the typing and evaluation rules are syntax-directed, it follows that
(1a(i)) σ[l ↦ □], l ← e ⇓c σ″, Tc, where l ∉ dom(σ), and
(1e(i)) Λ↾l0 ⊢c e : τ.
Note that l ∉ dom(Λ), by (1b). Let σ′ = σ[l ↦ □] and let Λ′ = Λ[l:τ], with l ordered immediately prior to l0, so that l becomes the cursor. Then:
(2a′) σ′, l ← e ⇓c σ″, Tc, by (1a(i));
(2b′) σ′ : Λ′. Since σ : Λ by (1b), we have Λ↾l′ ⊢s σ(l′) : Λ(l′) for every l′ ∈ def(σ). Now def(σ′) = def(σ), so for every l′ ∈ def(σ′), σ′(l′) = σ(l′) and Λ′(l′) = Λ(l′). Therefore, by Lemma 69, we have Λ′↾l′ ⊢s σ′(l′) : Λ′(l′), as required.
(2c′) Λ′(l) = τ, by definition;
(2d′) l′ <Λ′ l implies l′ ∈ def(σ′), since l′ <Λ′ l implies l′ <Λ l0, and (1d);
(2e′) Λ′↾l ⊢c e : τ, since Λ′↾l = Λ↾l0 and (1e(i)).
Therefore, by induction,
(2f′) l ∈ def(σ″); and there exists Λ″ ⊒ Λ′ such that
(2g′) σ″ : Λ″;
(2h′) Λ′, l ⊢c Tc : τ ; Λ″.
Hence we have
(1f) Λ″↾l0 ⊢s l : τ, since (Λ″↾l0)(l) = Λ″(l) = τ;
(1g) σ″ : Λ″, by (2g′);
(1h) Λ, l0 ⊢s ⟨Tc⟩_{l:τ} ; Λ″, by (2h′).

• Suppose that
(2a) σ, l0 ← writeτ(v) ⇓c σ[l0 ← v], Wτ;
(2b) σ : Λ;
(2c) Λ(l0) = τ;
(2d) l <Λ l0 implies l ∈ def(σ);
(2e) Λ↾l0 ⊢c writeτ(v) : τ.
By the syntax-directed nature of the typing rules it follows that
(2e(i)) Λ↾l0 ⊢s v : τ.
Let Λ′ = Λ and σ′ = σ[l0 ← v]. Then we have:
(2f) l0 ∈ def(σ′), by definition of σ′;
(2g) σ′ : Λ′, since σ : Λ by (2b), Λ↾l0 ⊢s v : τ by (2e(i)), and Λ(l0) = τ by (2c);
(2h) Λ′, l0 ⊢c Wτ : τ ; Λ′, by definition.

• Suppose that
(2a) σ, l0 ← read l as x in e end ⇓c σ′, R_l^{x.e}(Tc);
(2b) σ : Λ;
(2c) Λ(l0) = τ0;
(2d) l′ <Λ l0 implies l′ ∈ def(σ);
(2e) Λ↾l0 ⊢c read l as x in e end : τ0.
By the syntax-directed nature of the evaluation and typing rules, it follows that
(2a(i)) σ, l0 ← [σ(l)/x]e ⇓c σ′, Tc;
(2e(i)) Λ↾l0 ⊢s l : τ mod, hence (Λ↾l0)(l) = Λ(l) = τ, and so l <Λ l0 and Λ(l) = τ;
(2e(ii)) Λ↾l0; x:τ ⊢c e : τ0.
Since l <Λ l0, it follows that Λ↾l ⊑ Λ↾l0.
Therefore,
(2a′) σ, l0 ← [σ(l)/x]e ⇓c σ′, Tc, by (2a(i));
(2b′) σ : Λ, by (2b);
(2c′) Λ(l0) = τ0, by (2c);
(2d′) l′ <Λ l0 implies l′ ∈ def(σ), by (2d).
Furthermore, by (2b′), we have Λ↾l ⊢s σ(l) : Λ(l), hence Λ↾l0 ⊢s σ(l) : Λ(l), and so by Lemma 70 and (2e(ii)),
(2e′) Λ↾l0 ⊢c [σ(l)/x]e : τ0.
It follows by induction that
(2f′) l0 ∈ def(σ′), and there exists Λ′ ⊒ Λ such that
(2g′) σ′ : Λ′;
(2h′) Λ, l0 ⊢c Tc : τ0 ; Λ′.
Therefore we have
(2f) l0 ∈ def(σ′), by (2f′);
(2g) σ′ : Λ′, by (2g′);
(2h) Λ, l0 ⊢c R_l^{x.e}(Tc) : τ0 ; Λ′, since
(a) Λ↾l0, x:Λ(l) ⊢c e : τ0, by (2e(ii));
(b) Λ, l0 ⊢c Tc : τ0 ; Λ′, by (2h′);
(c) l ≤Λ l0, by (2e(i)).

11.4.4 Type Safety for AFL

Type safety follows from the canonical forms lemma, which characterizes the shapes of closed values of each type.

Lemma 72 (Canonical Forms)
Suppose that Λ ⊢s v : τ. Then
• If τ = int, then v is a numeric constant.
• If τ = bool, then v is either true or false.
• If τ = τ1 →c τ2, then v = func f(x : τ1) : τ2 is e end with Λ; f:τ1 →c τ2, x:τ1 ⊢c e : τ2.
• If τ = τ1 →s τ2, then v = funs f(x : τ1) : τ2 is e end with Λ; f:τ1 →s τ2, x:τ1 ⊢s e : τ2.
• If τ = τ′ mod, then v = l for some l ∈ dom(Λ) such that Λ(l) = τ′.

Theorem 73 (Type Safety)
Well-typed programs do not "go wrong".

Proof (sketch): Instrument the dynamic semantics with rules that "go wrong" in the case of a non-canonical principal argument to an elimination form. Then show that no such rule applies to a well-typed program, by appeal to type preservation and the canonical forms lemma.

11.5 Change Propagation

We formalize the change-propagation algorithm and the notion of an input change, and prove the type safety and correctness of the change-propagation algorithm. We represent an input change via difference stores. A difference store is a finite map assigning values to locations.
Unlike a store, a difference store may contain "dangling" locations that are not defined within the difference store. The process of modifying a store with a difference store is defined as follows.

Definition 74 (Store Modification)
Let σ : Λ be a well-typed store for some Λ and let δ be a difference store. The modification of σ by δ, written σ ⊕ δ, is a well-typed store σ′ : Λ′ for some Λ′ ⊒ Λ, defined as
σ′ = σ ⊕ δ = δ ∪ { (l, σ(l)) | l ∉ dom(δ) and l ∈ dom(σ) }.

Note that the definition requires that the result store be well typed and that the types of modified locations be preserved. Modifying a store σ with a difference store δ changes the locations in the set dom(σ) ∩ dom(δ). This set is called the changed set and is denoted by χ. A location in the changed set is said to be changed.

Figure 11.17 shows a formal version of the change-propagation algorithm that was stated in algorithmic terms in Section 11.2. In the rest of this section, the term "change-propagation algorithm" refers to the formal version. The change-propagation algorithm takes a modified store, a trace obtained by evaluating an AFL program with respect to the original store, and a changed set with respect to some difference store. The algorithm scans the trace, seeking reads of changed locations. When such a read is found, the body of the read is re-evaluated with the new value to update the trace and the store. Since re-evaluation can change the value written at the target of the re-evaluated read, the target is added to the changed set. Note that the change-propagation algorithm scans the trace once, in the order in which it was originally generated.

Formally, the change-propagation algorithm is given by these two judgements:
1. Stable propagation: σ, Ts, χ ⤳s σ′, Ts′, χ′;
2. Changeable propagation: σ, l ← Tc, χ ⤳c σ′, Tc′, χ′.
(value)  σ, ε, χ ⤳s σ, ε, χ

(mod)  if σ, l ← Tc, χ ⤳c σ′, Tc′, χ′
       then σ, ⟨Tc⟩_{l:τ}, χ ⤳s σ′, ⟨Tc′⟩_{l:τ}, χ′

(let)  if σ, Ts, χ ⤳s σ′, Ts″, χ′ and σ′, Ts′, χ′ ⤳s σ″, Ts‴, χ″
       then σ, (Ts ; Ts′), χ ⤳s σ″, (Ts″ ; Ts‴), χ″

(write)  σ, l ← Wτ, χ ⤳c σ, Wτ, χ

(read/no change)  if l ∉ χ and σ, l′ ← Tc, χ ⤳c σ′, Tc′, χ′
                  then σ, l′ ← R_l^{x.e}(Tc), χ ⤳c σ′, R_l^{x.e}(Tc′), χ′

(read/change)  if l ∈ χ and σ, l′ ← [σ(l)/x]e ⇓c σ′, Tc′
               then σ, l′ ← R_l^{x.e}(Tc), χ ⤳c σ′, R_l^{x.e}(Tc′), χ ∪ {l′}

(let)  if σ, Ts, χ ⤳s σ′, Ts′, χ′ and σ′, l ← Tc, χ′ ⤳c σ″, Tc′, χ″
       then σ, l ← (Ts ; Tc), χ ⤳c σ″, (Ts′ ; Tc′), χ″

Figure 11.17: Change-propagation rules (stable and changeable). The judgements define change propagation for a stable trace Ts (respectively, a changeable trace Tc) with respect to a store σ and a changed set χ ⊆ dom(σ). For changeable propagation, a target location l is maintained, as in the changeable evaluation mode of AFL.

The rules defining the change-propagation judgements are given in Figure 11.17. Given a trace, change propagation mimics the evaluation rule of AFL that originally generated that trace. To stress this correspondence, each change-propagation rule is marked with the name of the evaluation rule to which it corresponds. For example, the propagation rule for the trace Ts ; Ts′ mimics the let rule of the stable mode that gives rise to this trace. The most interesting rule is the read rule, which mimics a read operation: it evaluates an expression after binding its specified variable to the value of the location read. The read rule takes two different actions depending on whether the location being read is in the changed set. If the location is in the changed set, then the expression is re-evaluated with the new value of the location. Re-evaluation yields a revised store and a new trace.
The new trace "repairs" the original trace by replacing the trace of the re-evaluated read. Since re-evaluating the read may change the value written at the target, the target location is added to the changed set. If the location being read is not in the changed set, then there is no need to re-evaluate this read, and change propagation continues to scan the rest of the trace.

The purely functional change-propagation algorithm presented here scans the whole trace. A direct implementation of this algorithm will therefore run in time linear in the size of the trace. The change-propagation algorithm, however, revises the trace only by replacing the changeable traces of re-evaluated reads. Thus, if one is content with updating the trace with side effects, then the traces of re-evaluated reads can be replaced in place, while skipping all the rest of the trace. This is precisely what dynamic dependence graphs support (Section 11.2.4). The ML implementation performs change propagation using dynamic dependence graphs (Section 11.2.6).

11.5.1 Type Safety

The change-propagation algorithm also enjoys a type-preservation property stating that if the initial state is well-formed, then so is the result state. This ensures that the results of change propagation can subsequently be used as further inputs. For the preservation theorem to apply, the store modification must respect the typing of the store being modified.

Theorem 75 (Type Preservation)
Suppose that def(σ) = dom(σ).
1. If
(a) σ, Ts, χ ⤳s σ′, Ts′, χ′,
(b) σ : Λ,
(c) l0 ∈ dom(Λ),
(d) Λ, l0 ⊢s Ts ok, and
(e) χ ⊆ dom(Λ),
then for some Λ′ ⊒ Λ,
(f) σ′ : Λ′,
(g) Λ, l0 ⊢s Ts′ ; Λ′, and
(h) χ′ ⊆ dom(Λ′).
2. If
(a) σ, l0 ← Tc, χ ⤳c σ′, Tc′, χ′,
(b) σ : Λ,
(c) Λ(l0) = τ0,
(d) Λ, l0 ⊢c Tc : τ0, and
(e) χ ⊆ dom(Λ),
then there exists Λ′ ⊒ Λ such that
(f) σ′ : Λ′,
(g) Λ, l0 ⊢c Tc′ : τ0 ; Λ′, and
(h) χ′ ⊆ dom(Λ′).
Proof: By induction on the definition of the change-propagation relations, making use of Theorem 71. We consider the case of re-evaluation of a read. Suppose that l ∈ χ and
(2a) σ, l0 ← R_l^{x.e}(Tc), χ ⤳c σ′, R_l^{x.e}(Tc′), χ ∪ {l0};
(2b) σ : Λ;
(2c) Λ(l0) = τ0;
(2d) Λ, l0 ⊢c R_l^{x.e}(Tc) : τ0;
(2e) χ ⊆ dom(Λ).
By the syntax-directed nature of the change-propagation and trace-typing rules, it follows that
(2a(i)) σ, l0 ← [σ(l)/x]e ⇓c σ′, Tc′;
(2b(i)) Λ↾l0 ⊢s σ(l) : Λ(l), by (2b);
(2d(i)) l <Λ l0 and Λ(l) = τ for some type τ;
(2d(ii)) Λ↾l0; x:τ ⊢c e : τ0;
(2d(iii)) Λ, l0 ⊢c Tc : τ0.
Therefore
(2a′) σ, l0 ← [σ(l)/x]e ⇓c σ′, Tc′, by (2a(i));
(2b′) σ : Λ, by (2b);
(2c′) Λ(l0) = τ0, by (2c);
(2d′) l′ <Λ l0 implies l′ ∈ def(σ), by the assumption that def(σ) = dom(σ);
(2e′) Λ↾l0 ⊢c [σ(l)/x]e : τ0, by (2d(ii)), (2b(i)), and Lemma 70.
Hence, by Theorem 71,
(2f′) l0 ∈ def(σ′); and there exists Λ′ ⊒ Λ such that
(2g′) σ′ : Λ′;
(2h′) Λ, l0 ⊢c Tc′ : τ0 ; Λ′.
Consequently,
(2f) σ′ : Λ′, by (2g′);
(2g) Λ, l0 ⊢c R_l^{x.e}(Tc′) : τ0 ; Λ′, by (2d(i)), (2d(ii)), (2h′), and Lemma 69;
(2h) χ ∪ {l0} ⊆ dom(Λ′), since l0 ∈ dom(Λ) and Λ′ ⊒ Λ.

Figure 11.18: Change propagation simulates a complete re-evaluation. Initial evaluation: σi, e ⇓s vi, σi′, Ts^i. Subsequent evaluation: with σs = σi ⊕ δ, σs, e ⇓s vs, σs′, Ts^s. Propagation: with σm = σi′ ⊕ δ, σm, Ts^i, dom(σi) ∩ dom(δ) ⤳s σm′, Ts^m, χ, where σs′ ⊆ σm′ (up to the choice of locations).

11.5.2 Correctness

Change propagation simulates a complete re-evaluation by re-evaluating only the affected sub-expressions of an AFL program. This section shows that change propagation is correct by proving that it yields the same output and trace as a complete re-evaluation.

Consider evaluating an adaptive program (a stable expression) e with respect to an initial store σi; call this the initial evaluation.
As shown in Figure 11.18, the initial evaluation yields a value vi, an extended store σi′, and a trace Ts^i. Now modify the initial store with a difference store δ as σs = σi ⊕ δ, and re-evaluate the program with this store in a subsequent evaluation. To simulate the subsequent evaluation via change propagation, apply the modifications δ to σi′ and let σm denote the modified store, i.e., σm = σi′ ⊕ δ. Now perform change propagation with respect to σm, the trace Ts^i of the initial evaluation, and the set of changed locations dom(σi) ∩ dom(δ). Change propagation yields a revised store σm′ and trace Ts^m. For change propagation to work properly, δ must change only input locations, i.e., dom(σi′) ∩ dom(δ) ⊆ dom(σi).

To prove correctness, we compare the store and the trace obtained by the subsequent evaluation, σs′ and Ts^s, to those obtained by change propagation, σm′ and Ts^m (see Figure 11.18). Since locations (names) are chosen arbitrarily during evaluation, the subsequent evaluation and change propagation can choose different locations. We therefore show that the traces are identical modulo the choice of locations, and that the store σs′ is contained in the store σm′ modulo the choice of locations. To study relations between traces and stores modulo the choice of locations, we use an equivalence relation for stores and traces that matches different locations via a partial bijection. A partial bijection is a one-to-one mapping from a set of locations D to a set of locations R that need not map every location in D.

Definition 76 (Partial Bijection)
B is a partial bijection from a set D to a set R if it satisfies the following:
1. B ⊆ { (a, b) | a ∈ D, b ∈ R },
2. if (a, b) ∈ B and (a, b′) ∈ B, then b = b′,
3. if (a, b) ∈ B and (a′, b) ∈ B, then a = a′.
The value of a location l under the partial bijection B is denoted by B(l).
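As a concrete aid to intuition, the one-to-one conditions of Definition 76 have a direct executable reading. The following Python sketch is ours, not part of the formal development (the class and method names are invented for illustration): it maintains forward and backward maps so that conditions 2 and 3 are enforced, and applies B to a location by leaving unmapped locations unchanged.

```python
class PartialBijection:
    """A one-to-one partial map between two sets of locations (Definition 76)."""

    def __init__(self):
        self.fwd = {}  # a -> b
        self.bwd = {}  # b -> a, kept to enforce injectivity

    def add(self, a, b):
        # Conditions 2 and 3: no location may be paired with two partners.
        if a in self.fwd and self.fwd[a] != b:
            raise ValueError(f"{a} is already mapped to {self.fwd[a]}")
        if b in self.bwd and self.bwd[b] != a:
            raise ValueError(f"{b} is already the image of {self.bwd[b]}")
        self.fwd[a] = b
        self.bwd[b] = a

    def __call__(self, l):
        # B(l): rename l when the mapping is defined, else leave it unchanged.
        return self.fwd.get(l, l)
```

For example, after `b.add("l1", "k1")`, the call `b("l1")` yields `"k1"` while `b("l2")` is left as `"l2"`, and a conflicting `add` raises an error.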
A partial bijection B can be applied to an expression e, to a store σ, or to a trace T, written B[e], B[σ], and B[T], by replacing, wherever defined, each location l with B(l):

Definition 77 (Application of Partial Bijections)
Expression: The application of a partial bijection B to an expression e yields another expression obtained by substituting each location l in e with B(l) (when defined), as shown in Figure 11.19.
Hole: The application of a partial bijection to a hole yields a hole, B[□] = □.
Store: The application of a partial bijection B to a store σ yields another store B[σ], defined as B[σ] = { (B[l], B[σ(l)]) | l ∈ dom(σ) }.
Trace: The application of a partial bijection to a trace is defined as
B[ε] = ε
B[⟨Tc⟩_{l:τ}] = ⟨B[Tc]⟩_{B[l]:τ}
B[Ts ; Ts′] = B[Ts] ; B[Ts′]
B[Wτ] = Wτ
B[R_l^{x.e}(Tc)] = R_{B[l]}^{x.B[e]}(B[Tc])
B[Ts ; Tc] = B[Ts] ; B[Tc].

B[c] = c
B[x] = x
B[l] = l′ if (l, l′) ∈ B; B[l] = l otherwise
B[funs f(x : τ) : τ′ is e end] = funs f(x : τ) : τ′ is B[e] end
B[func f(x : τ) : τ′ is e end] = func f(x : τ) : τ′ is B[e] end
B[o(v1, . . . , vn)] = o(B[v1], . . . , B[vn])
B[applys(v1, v2)] = applys(B[v1], B[v2])
B[let x be es in es′ end] = let x be B[es] in B[es′] end
B[if v then es else es′] = if B[v] then B[es] else B[es′]
B[modτ ec] = modτ B[ec]
B[applyc(v1, v2)] = applyc(B[v1], B[v2])
B[let x be es in ec end] = let x be B[es] in B[ec] end
B[if v then ec else ec′] = if B[v] then B[ec] else B[ec′]
B[read v as x in ec end] = read B[v] as x in B[ec] end
B[writeτ(v)] = writeτ(B[v])

Figure 11.19: Application of a partial bijection B to values and to stable and changeable expressions.

Destination: The application of a partial bijection to an expression with a destination is defined as B[l ← e] = B[l] ← B[e].
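The Store clause of Definition 77 likewise admits a short executable model. The following Python sketch is our illustration (the function name and the simplifying convention that a stored value is either a location or a plain constant are assumptions): a bijection is represented as a dict from locations to locations, and application renames both the domain and any location-valued contents.

```python
def apply_bijection_to_store(B, store):
    """B[sigma] = { (B[l], B[sigma(l)]) | l in dom(sigma) } (Definition 77).

    B maps locations to locations; a location absent from B is left
    unchanged.  A stored value that is itself a location is renamed;
    any other value (a constant) is kept as is.
    """
    def rename(v):
        return B.get(v, v)  # locations are renamed, constants fall through
    return {rename(l): rename(v) for l, v in store.items()}
```

For example, with σ = {"l1": 3, "l2": "l1"} and B = {"l1": "k1"}, the result is {"k1": 3, "l2": "k1"}: the location l1 is renamed wherever it occurs, both as an address and as a stored value.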
In the correctness theorem, we will show that the traces obtained by change propagation and by the subsequent evaluation are identical under some partial bijection B; referring back to Figure 11.18, we will show that B[Ts^m] = Ts^s. The relationship between the store σs′ of the subsequent evaluation and the store σm′ of change propagation is more subtle. Since change propagation is performed on the store σi′ that the initial evaluation yields, and no allocated location is ever deleted, the store after change propagation will contain leftover, now unused (garbage) locations from the initial evaluation. We will therefore show that B[σm′] contains σs′.

Definition 78 (Store Containment)
We say that a store σ is contained in another store σ′, written σ ⊑ σ′, if
1. dom(σ) ⊆ dom(σ′), and
2. for all l ∈ def(σ), σ(l) = σ′(l).

We now state and prove the correctness theorem. The theorem hinges on a lemma (Lemma 81) for expressions that are equal up to a partial bijection; such expressions arise from the substitution of a variable by two different locations. The correctness theorem, however, is concerned only with equal programs, i.e., stable expressions that are syntactically identical. We note that the equivalence relation between expressions does not "dereference" locations: two expressions are equal even when the stores contain different values for the same location.

Theorem 79 (Correctness)
Let e be a well-typed program with respect to a store typing Λ, let σi : Λ be an initial store such that def(σi) = dom(σi), let δ be a difference store, and let σs = σi ⊕ δ and σm = σi′ ⊕ δ, as shown in Figure 11.18. If
(A1) σi, e ⇓s vi, σi′, Ts^i (initial evaluation),
(A2) σs, e ⇓s vs, σs′, Ts^s (subsequent evaluation),
(A3) dom(σi′) ∩ dom(δ) ⊆ dom(σi),
then the following hold:
1. σm, Ts^i, (dom(σi) ∩ dom(δ)) ⤳s σm′, Ts^m, χ, and
2. there is a partial bijection B such that
(a) B[vi] = vs,
(b) B[Ts^m] = Ts^s,
(c) B[σm′] ⊒ σs′.
Proof: The proof is by an application of Lemma 81. To apply the lemma, define the partial bijection B (of Lemma 81) to be the identity function with domain dom(σi). The following hold:
1. dom(σm) ⊇ dom(σi′) (by the definition of store modification).
2. For all l ∈ (def(σi′) − def(σi)), σm(l) = σi′(l) (by assumption (A3) and the definition of store modification).
3. B[σm] ⊒ σs (since B is the identity, this reduces to σm ⊒ σs, which holds because σs = σi ⊕ δ, σm = σi′ ⊕ δ, and σi′ ⊒ σi).
Applying Lemma 81 with changed locations dom(σi) ∩ dom(δ) = { l | l ∈ dom(σi) ∧ σi(l) ≠ σs(l) } yields a partial bijection B′ such that
1. B′[vi] = vs,
2. B′[Ts^m] = Ts^s,
3. B′[σm′] ⊒ σs′.
Thus taking B = B′ proves the theorem.

We turn our attention to the main lemma. Lemma 81 considers initial and subsequent evaluations of expressions that are equivalent under some partial bijection, and shows that the subsequent evaluation is identical to change propagation under an extended partial bijection. The need to consider expressions that are equivalent under some partial bijection arises because arbitrarily chosen locations (names) can be substituted for the same variable in two different evaluations. We start by defining the notion of a changed set: the set of changed locations of a store with respect to some modified store and a partial bijection.

Definition 80 (Changed Set)
Given two stores σ and σ′, and a partial bijection B from dom(σ) to dom(σ′), the set of changed locations of σ is
χ(B, σ, σ′) = { l | l ∈ dom(σ), B[σ(l)] ≠ σ′(B[l]) }.

The main lemma consists of two symmetric parts, for stable and for changeable expressions. For each kind of expression, it shows that the trace and the store of a subsequent evaluation of an expression under some partial bijection are identical to those that would have been obtained by change propagation under some extended partial bijection.
The lemma constructs a partial bijection by mapping locations created by change propagation to those created by the subsequent evaluation. We will assume, without loss of generality, that the expressions are well-typed with respect to the stores with which they are evaluated. Indeed, the correctness theorem (Theorem 79) requires that the expressions and store modifications be well typed. The proof also assumes that a changeable expression evaluates to a different value when re-evaluated, i.e., that the value of the destination location changes. This assumption causes no loss of generality, and can be eliminated with additional machinery that enables comparison of the old and new values of the destination location.

Lemma 81 (Change Propagation)
Consider stores σi and σs, and let B be a partial bijection from dom(σi) to dom(σs). The following hold:
• If
σi, l ← e ⇓c σi′, Tc^i, and
σs, B[l ← e] ⇓c σs′, Tc^s,
then for any store σm satisfying
1. dom(σm) ⊇ dom(σi′),
2. for all l′ ∈ (def(σi′) − def(σi)), σm(l′) = σi′(l′), and
3. B[σm] ⊒ σs,
there exists a partial bijection B′ such that
σm, l ← Tc^i, χ(B, σi, σs) ⤳c σm′, Tc^m, χ,
where
1. B′ ⊇ B,
2. dom(B′) = dom(B) ∪ def(Tc^m),
3. B′[σm′] ⊒ σs′,
4. B′[Tc^m] = Tc^s, and
5. χ = χ(B′, σi′, σs′).
• If
σi, e ⇓s vi, σi′, Ts^i, and
σs, B[e] ⇓s vs, σs′, Ts^s,
then for any store σm satisfying
1. dom(σm) ⊇ dom(σi′),
2. for all l′ ∈ (def(σi′) − def(σi)), σm(l′) = σi′(l′), and
3. B[σm] ⊒ σs,
there exists a partial bijection B′ such that
σm, Ts^i, χ(B, σi, σs) ⤳s σm′, Ts^m, χ,
where
1. B′ ⊇ B,
2. dom(B′) = dom(B) ∪ def(Ts^m),
3. B′[vi] = vs,
4. B′[σm′] ⊒ σs′,
5. B′[Ts^m] = Ts^s, and
6. χ = χ(B′, σi′, σs′).

Proof: The proof is by simultaneous induction on the evaluation. Among the changeable expressions, the most interesting are write, let, and read. Among the stable expressions, the most interesting are let and mod.
We refer to the following properties of the modified store σm as the modified-store properties:
1. dom(σm) ⊇ dom(σi′),
2. for all l′ ∈ (def(σi′) − def(σi)), σm(l′) = σi′(l′), and
3. B[σm] ⊒ σs.

• Write: Suppose
σi, l ← writeτ(v) ⇓c σi[l ↦ v], Wτ, and
σs, B[l ← writeτ(v)] ⇓c σs[B[l] ↦ B[v]], Wτ.
Then for any store σm satisfying the modified-store properties we have
σm, l ← Wτ, χ(B, σi, σs) ⤳c σm, Wτ, χ(B, σi, σs).
The partial bijection B satisfies the following properties:
1. B ⊇ B.
2. dom(B) = dom(B) ∪ def(Wτ).
3. B[σm] ⊒ σs[B[l] ↦ B[v]]: we know that B[σm] ⊒ σs, and thus we must show that B[l] is mapped to B[v] in B[σm]. Observe that σm(l) = (σi[l ↦ v])(l) = v by modified-store property 2, thus B[σm](B[l]) = B[v].
4. B[Wτ] = Wτ.
5. χ(B, σi, σs) = χ(B, σi[l ↦ v], σs[B[l] ↦ B[v]]), by definition.
Thus, pick B′ = B for this case.

• Apply (Changeable): Suppose that
(I.1) σi, l ← [v/x, func f(x : τ1) : τ2 is e end/f] e ⇓c σi′, Tc^i,
(I.2) σi, l ← applyc(func f(x : τ1) : τ2 is e end, v) ⇓c σi′, Tc^i,
(S.1) σs, B[l] ← [B[v]/x, B[func f(x : τ1) : τ2 is e end]/f] B[e] ⇓c σs′, Tc^s,
(S.2) σs, B[l ← applyc(func f(x : τ1) : τ2 is e end, v)] ⇓c σs′, Tc^s.
Consider evaluations (I.1) and (S.1) and a store σm that satisfies the modified-store properties. By induction we have a partial bijection B0 and
σm, l ← Tc^i, χ(B, σi, σs) ⤳c σm′, Tc^m, χ,
where
1. B0 ⊇ B,
2. dom(B0) = dom(B) ∪ def(Tc^m),
3. B0[Tc^m] = Tc^s,
4. B0[σm′] ⊒ σs′, and
5. χ = χ(B0, σi′, σs′).
Since (I.2) and (S.2) return the trace and store returned by (I.1) and (S.1), we pick B′ = B0 for this case.

• Let: Suppose that
(I.1) σi, e ⇓s vi, σi′, Ts^i,
(I.2) σi′, l ← [vi/x]e′ ⇓c σi″, Tc^i,
(I.3) σi, l ← let x be e in e′ end ⇓c σi″, (Ts^i ; Tc^i),
(S.1) σs, B[e] ⇓s vs, σs′, Ts^s,
(S.2) σs′, B[l] ← [vs/x]B[e′] ⇓c σs″, Tc^s,
(S.3) σs, B[l ← let x be e in e′ end] ⇓c σs″, (Ts^s ; Tc^s).
Consider any store σm that satisfies the modified-store properties.
The following judgements show change propagation applied with the store σm to the output trace Ts^i ; Tc^i:
(P.1) σm, Ts^i, χ(B, σi, σs) ⤳s σm′, Ts^m, χ,
(P.2) σm′, l ← Tc^i, χ ⤳c σm″, Tc^m, χ′,
(P.3) σm, l ← (Ts^i ; Tc^i), χ(B, σi, σs) ⤳c σm″, (Ts^m ; Tc^m), χ′.
We apply the induction hypothesis to (I.1), (S.1), and (P.1) to obtain a partial bijection B0 such that
1. B0 ⊇ B,
2. dom(B0) = dom(B) ∪ def(Ts^m),
3. vs = B0[vi],
4. B0[σm′] ⊒ σs′,
5. B0[Ts^m] = Ts^s, and
6. χ = χ(B0, σi′, σs′).
Using these properties, we now show that we can apply the induction hypothesis to (I.2) and (S.2) with the partial bijection B0.
– B0[l ← [vi/x]e′] = B[l] ← [vs/x]B[e′]: By Properties 1 and 2 it follows that B[l] = B0[l]. By Property 3, B0[vi] = vs. To show that B[e′] = B0[e′], note that locs(e′) ⊆ dom(σi) ⊆ dom(σm) (since e′ is well typed with respect to σi). It follows that def(Ts^m) ∩ locs(e′) = ∅. Since dom(B0) = dom(B) ∪ def(Ts^m), we have B[e′] = B0[e′].
– χ = χ(B0, σi′, σs′): this is Property 6.
– σm′ satisfies the modified-store properties:
1. dom(σm′) ⊇ dom(σi″): this holds because dom(σm′) ⊇ dom(σm) ⊇ dom(σi″) ⊇ dom(σi′).
2. For all l′ ∈ (def(σi″) − def(σi′)), σm′(l′) = σi″(l′): observe that
(a) for all l′ ∈ (def(σi″) − def(σi)), σm(l′) = σi″(l′),
(b) def(σi″) − def(σi′) = def(Tc^i) ∪ {l},
(c) def(Ts^i) ∩ (def(σi″) − def(σi′)) = ∅,
and that the evaluation (P.1) changes the values of locations only in def(Ts^i).
3. B0[σm′] ⊒ σs′: this follows by Property 4.
Now we can apply the induction hypothesis to (I.2) and (S.2) to obtain a partial bijection B1 such that
1′. B1 ⊇ B0,
2′. dom(B1) = dom(B0) ∪ def(Tc^m),
3′. B1[σm″] ⊒ σs″,
4′. B1[Tc^m] = Tc^s, and
5′. χ′ = χ(B1, σi″, σs″).
Based on these, we have
1″. B1 ⊇ B. This holds because B1 ⊇ B0 ⊇ B.
2″. dom(B1) = dom(B) ∪ def(Ts^m ; Tc^m). We know that dom(B1) = dom(B0) ∪ def(Tc^m) and dom(B0) = dom(B) ∪ def(Ts^m). Thus dom(B1) = dom(B) ∪ def(Ts^m) ∪ def(Tc^m) = dom(B) ∪ def(Ts^m ; Tc^m).
3″. B1[σm″] ⊒ σs″. This follows by Property 3′.
4″. B1[Ts^m ; Tc^m] = Ts^s ; Tc^s. This holds if and only if B1[Ts^m] = Ts^s and B1[Tc^m] = Tc^s. We know that B0[Ts^m] = Ts^s, and since dom(B1) = dom(B0) ∪ def(Tc^m) and def(Ts^m) ∩ def(Tc^m) = ∅, we have B1[Ts^m] = Ts^s. We also know that B1[Tc^m] = Tc^s, by Property 4′.
5″. χ′ = χ(B1, σi″, σs″). This follows by Property 5′.
Thus we pick B′ = B1.

• Read: Assume that we have
(I.1) σi, l′ ← [σi(l)/x]e ⇓c σi′, Tc^i,
(I.2) σi, l′ ← read l as x in e end ⇓c σi′, R_l^{x.e}(Tc^i),
(S.1) σs, B[l′] ← [σs(B[l])/x]B[e] ⇓c σs′, Tc^s,
(S.2) σs, B[l′ ← read l as x in e end] ⇓c σs′, R_{B[l]}^{x.B[e]}(Tc^s).
Consider a store σm that satisfies the modified-store properties. There are two cases for the corresponding change-propagation evaluation. In the first case, l ∉ χ(B, σi, σs), and we have
(P.1) σm, l′ ← Tc^i, χ(B, σi, σs) ⤳c σm′, Tc^m, χ (l ∉ χ(B, σi, σs)),
(P.2) σm, l′ ← R_l^{x.e}(Tc^i), χ(B, σi, σs) ⤳c σm′, R_l^{x.e}(Tc^m), χ.
In this case, we apply the induction hypothesis to (I.1), (S.1), and (P.1) with the partial bijection B to obtain a partial bijection B0 such that
1. B0 ⊇ B,
2. dom(B0) = dom(B) ∪ def(Tc^m),
3. B0[σm′] ⊒ σs′,
4. B0[Tc^m] = Tc^s, and
5. χ = χ(B0, σi′, σs′).
Furthermore, the following hold for B0:
1. dom(B0) = dom(B) ∪ def(R_l^{x.e}(Tc^m)). This follows by Property 2 and because def(R_l^{x.e}(Tc^m)) = def(Tc^m).
2. B0[R_l^{x.e}(Tc^m)] = R_{B[l]}^{x.B[e]}(Tc^s). We have B0[R_l^{x.e}(Tc^m)] = R_{B0[l]}^{x.B0[e]}(B0[Tc^m]) = R_{B0[l]}^{x.B0[e]}(Tc^s), by Property 4. Thus we need to show that B0[l] = B[l] and B0[e] = B[e].
This is true because
(a) l ∉ def(Tc^m), and thus B[l] = B0[l], and
(b) every l″ ∈ locs(e) satisfies l″ ∈ dom(σm) and thus l″ ∉ def(Tc^m), which implies that B[l″] = B0[l″], and hence B[e] = B0[e].
Thus we pick B′ = B0.
In the second case, l ∈ χ(B, σi, σs) and the read R_l^{x.e} is re-evaluated:
(P.4) σm, l′ ← [σm(l)/x]e ⇓c σm′, Tc^m (l ∈ χ(B, σi, σs)),
(P.5) σm, l′ ← R_l^{x.e}(Tc^i), χ(B, σi, σs) ⤳c σm′, R_l^{x.e}(Tc^m), χ(B, σi, σs) ∪ {l′}.
Since B[σm] ⊒ σs, the evaluation in (P.4) is identical to the evaluation in (S.1), and thus there is a partial bijection B1 ⊇ B such that B1[Tc^m] = Tc^s and dom(B1) = dom(B) ∪ def(Tc^m). Thus we have
1. B1 ⊇ B,
2. dom(B1) = dom(B) ∪ def(Tc^m),
3. B1[Tc^m] = Tc^s,
4. χ(B, σi, σs) ∪ {l′} = χ(B1, σi′, σs′). To show this, observe that
(a) dom(σi′) ∩ def(Tc^m) = ∅, because dom(σm) ⊇ dom(σi′),
(b) χ(B1, σi′, σs′) = χ(B, σi′, σs′), because dom(B1) = dom(B) ∪ def(Tc^m),
(c) χ(B, σi′, σs′) = χ(B, σi, σs) ∪ {l′}, because dom(B) ⊆ dom(σi) (assuming, without loss of generality, that the value of l′ changes because of the re-evaluation).
5. B1[σm′] ⊒ σs′. We know that B1[σm] ⊒ σs. Furthermore B1[σm′ − σm] = σs′ − σs, and thus B1[σm′] ⊒ σs′.
Thus pick B′ = B1.

• Value: Suppose that
σi, v ⇓s v, σi, ε, and
σs, B[v] ⇓s B[v], σs, ε.
Let σm be any store that satisfies the modified-store properties. We have
σm, ε, χ(B, σi, σs) ⤳s σm, ε, χ(B, σi, σs),
where
1. B ⊇ B,
2. dom(B) = dom(B) ∪ def(ε),
3. B[σm] ⊒ σs, by modified-store property 3,
4. B[ε] = ε,
5. χ(B, σi, σs) = χ(B, σi, σs).
Thus pick B′ = B.

• Apply (Stable): This is similar to apply in the changeable mode.

• Mod: Suppose that
(I.1) σi[li ↦ □], li ← e ⇓c σi′, Tc^i (li ∉ dom(σi)),
(I.2) σi, modτ e ⇓s li, σi′, ⟨Tc^i⟩_{li:τ},
(S.1) σs[ls ↦ □], ls ← B[e] ⇓c σs′, Tc^s (ls ∉ dom(σs)),
(S.2) σs, B[modτ e] ⇓s ls, σs′, ⟨Tc^s⟩_{ls:τ}.
Let σm be a store that satisfies the modified-store properties.
Then we have
(P.1) σm, li ← Tc^i, χ(B, σi, σs) ⤳c σm′, Tc^m, χ,
(P.2) σm, ⟨Tc^i⟩_{li:τ}, χ(B, σi, σs) ⤳s σm′, ⟨Tc^m⟩_{li:τ}, χ.
Consider the partial bijection B0 = B[li ↦ ls]. It satisfies the following:
– B0[li ← e] = ls ← B[e], because B0(li) = ls and li ∉ locs(e).
– B0[σm] ⊒ σs[ls ↦ □]. We know that B[σm] ⊒ σs by modified-store property 3. Since ls ∉ dom(σs), we have B0[σm] ⊒ σs. Furthermore ls = B0(li) and li ∈ dom(σm), because dom(σm) ⊇ dom(σi′). Thus B0[σm] ⊒ σs[ls ↦ □].
– For all l ∈ (def(σi′) − def(σi[li ↦ □])), σm(l) = σi′(l), because for all l ∈ (def(σi′) − def(σi)), σm(l) = σi′(l), by modified-store property 2.
Thus, we can apply the induction hypothesis to (I.1) and (S.1) with the partial bijection B0 = B[li ↦ ls] to obtain a partial bijection B1 such that the following hold:
1. B1 ⊇ B0,
2. dom(B1) = dom(B0) ∪ def(Tc^i),
3. B1[σm′] ⊒ σs′,
4. B1[Tc^m] = Tc^s, and
5. χ = χ(B1, σi′, σs′).
Furthermore, B1 satisfies
1. dom(B1) = dom(B) ∪ def(⟨Tc^i⟩_{li:τ}), by Property 2 and because def(⟨Tc^i⟩_{li:τ}) = def(Tc^i) ∪ {li} and li ∈ dom(B0).
2. B1[⟨Tc^m⟩_{li:τ}] = ⟨Tc^s⟩_{ls:τ}, because B1[Tc^m] = Tc^s by Property 4 and B1(li) = ls.
Thus we can pick B′ = B1.

• Let (Stable): This is similar to the let rule in changeable mode.

Chapter 12

Selective Memoization

This chapter presents a framework for applying memoization selectively. The framework provides programmer control over equality, space usage, and the identification of precise dependences, so that memoization can be applied according to the needs of an application. Two key properties of the framework are that it is efficient and that it yields programs whose performance can be analyzed using standard techniques. We describe the framework in the context of a functional language and an implementation as an SML library.
The language is based on a modal type system and allows the programmer to express programs that reveal their true data dependences when executed. The SML implementation cannot support this modal type system statically, but instead employs run-time checks to ensure correct usage of its primitives.

12.1 Introduction

Memoization is a fundamental and powerful technique for result re-use. It dates back half a century [15, 60, 61] and has been used extensively in many areas such as dynamic programming [8, 22, 24, 58], incremental computation [29, 83, 35, 94, 51, 1, 100, 57, 47, 3], and many others [18, 65, 52, 72, 57]. In fact, lazy evaluation provides a limited form of memoization [53].

Although memoization can dramatically improve performance and can require only small changes to the code, no language or library support for memoization has gained broad acceptance. Instead, many successful uses of memoization rely on application-specific support code. The underlying reason for this is one of control: since memoization is all about performance, the user must be able to control the performance of memoization. Many subtleties of memoization, including the cost of equality checking and the cache replacement policy for memo tables, can make the difference between exponential and linear running time.

To be general and widely applicable, a memoization framework must provide control over three areas: (1) the kind and cost of equality tests; (2) the identification of precise dependences between the input and the output of memoized code; and (3) space management. Control over equality tests is critical, because this is how re-usable results are identified. Control over the identification of precise dependences is important to maximize result re-use. Being able to control when memo tables or individual entries are purged is critical, because otherwise the user will not know whether or when results are re-used.
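As a point of reference for this discussion, the conventional memo-table scheme reviewed in the next section can be sketched in a few lines of Python. This sketch is ours (the thesis's actual implementation is an SML library); in particular, the `key` parameter is an invented stand-in for the programmer-controlled equality test argued for above.

```python
from functools import wraps

def memoize(key=lambda *args: args):
    """Classic memoization: a memo table maps argument keys to prior results.

    The `key` function decides which calls count as "the same" argument,
    and hence when a stored result is re-used; it models (crudely) the
    control over equality that the text argues a framework must provide.
    """
    def wrap(f):
        table = {}
        @wraps(f)
        def memoized(*args):
            k = key(*args)
            if k not in table:       # memo lookup before each call
                table[k] = f(*args)  # miss: perform the call, record result
            return table[k]
        memoized.table = table       # exposed so entries can be inspected/purged
        return memoized
    return wrap

@memoize()
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

With the table, `fib(30)` performs only 31 distinct evaluations instead of exponentially many; purging `fib.table` (or never doing so) is exactly the kind of space-management decision the text says must remain under user control.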
In this chapter, we propose a framework for memoization that provides control over equality and identification of dependences, and some control over space management. We study the framework in the context of a small language called MFL and provide an implementation for it in Standard ML. We also prove the type safety and correctness of MFL; that is, the semantics of memoized programs are preserved with respect to their non-memoized counterparts. As an example, we show how to analyze the performance of a memoized version of Quicksort within our framework.

In the next section we describe background and related work. In Section 12.3 we introduce our framework via some examples. In Section 12.4 we formalize the MFL language and discuss its safety, correctness, and performance properties. In Section 12.5 we present a simple implementation of the framework as a Standard ML library. In Section 12.6 we discuss how the framework might be extended to allow for better control of space usage, and we discuss the relationship of this work to our previous work on adaptive computation [3].

12.2 Background and Related Work

A typical memoization scheme maintains a memo table mapping argument values to previously computed results. This table is consulted before each function call to determine whether the particular argument is in the table. If so, the call is skipped and the stored result is returned; otherwise the call is performed and its result is added to the table. The semantics and implementation of the memo lookup are critical to performance. Here we review some key issues in implementing memoization efficiently.

Equality. Any memoization scheme needs to search a memo table for a match to the current arguments. Such a search will, at minimum, require a test for equality; typically it will also require some form of hashing. In standard language implementations, testing for equality on structures, for example, can require traversing the whole structure.
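The typical memo-table scheme above can be sketched in a few lines of Python. The sketch is only illustrative (the names memoize and fib are ours, and the thesis's actual framework is an SML library), but it shows where the memo lookup and update sit:

```python
# A minimal sketch of the typical memoization scheme: a table maps
# arguments to previously computed results and is consulted before
# each call.  Illustrative Python, not the SML library of this chapter.

def memoize(f):
    table = {}                       # memo table: argument -> result

    def memoized(x):
        if x in table:               # lookup: equality (and hashing) on x
            return table[x]
        result = f(x)                # miss: perform the call
        table[x] = result            # update: record the result
        return result

    return memoized

@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

With the table, fib(30) performs only 31 distinct calls instead of exponentially many. Note that both the lookup and the update hinge on testing the argument for equality and hashing it, which is exactly where the costs discussed next arise.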
The cost of such an equality test can negate the advantage of memoizing and may even change the asymptotic behavior of the function. A few approaches have been proposed to alleviate this problem. The first is based on the fact that, for memoization, equality need not be exact: the test may report unequal when two arguments are actually equal. The implementation could therefore decide to skip the test when equality is too expensive, or could use a conservative equality test such as “location” equality. The problem with such approaches is that whether a match is found depends on particulars of the implementation and will surely not be evident to the programmer.

Another approach for reducing the cost of equality tests is to ensure that there is only one copy of every value, via a technique known as “hash-consing” [41, 9, 93]. If there is only one copy, then equality can be implemented by comparing locations; in fact, the location can also be used as a key into a hash table. In theory, the overhead of hash-consing is constant in the expected case (the expectation is over the internal randomization of the hash functions). The reality, however, is rather different, because of the large memory demands of hash-consing and its interaction with garbage collection. Indeed, several researchers have argued that hash-consing is too expensive for practical purposes [81, 12, 71]. As an alternative to hash-consing, Pugh proposed lazy structure sharing [81]: whenever two equal values are compared, they are made to point to the same copy to speed up subsequent comparisons. As Pugh points out, the disadvantage of this approach is that performance depends on the order of comparisons and is thus difficult to analyze.

We note that even with hash-consing, or any other method, it remains critical to define equality on all types, including reals and functions.
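Hash-consing as described here can be sketched as follows (a Python illustration with invented names; production implementations typically use weak tables so that unused entries can be collected). Every constructor call goes through a table that returns the single canonical copy, so equality reduces to comparing locations:

```python
# Hash-consing sketch: keep one canonical copy of every cons cell, so
# structural equality coincides with identity ("location") equality.

class Cons:
    __slots__ = ("head", "tail")

    def __init__(self, head, tail):
        self.head, self.tail = head, tail

NIL = None
_table = {}                          # (head, id of canonical tail) -> cell

def cons(head, tail):
    # The tail is already canonical, so its identity encodes its structure.
    key = (head, id(tail))
    cell = _table.get(key)
    if cell is None:
        cell = Cons(head, tail)
        _table[key] = cell
    return cell

xs = cons(1, cons(2, NIL))
ys = cons(1, cons(2, NIL))           # returns the very same cells as xs
```

Here xs is ys holds, so a memo table can key on locations alone. The table also pins every cell in memory, which is the interaction with garbage collection noted above; and since location equality works for any canonical value, it offers one way to give equality a meaning even at types such as functions, provided their values are boxed.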
Claiming that functions are never equivalent, for example, is not satisfactory, because the result of a call involving a function as a parameter would never be re-used.

Precise Dependences. To maximize result re-use, the result of a function call must be stored with respect to its true dependences. This issue arises when the function examines only parts, or an approximation, of its parameter. To enable “partial” equality checks, the unexamined parts of the parameter should be disregarded; to increase the likelihood of result re-use, one should be able to match on the approximation rather than on the parameter itself. As an example, consider the code

    fun f(x,y,z) = if (x > 0) then fy(y) else fz(z)

The result of f depends on either (x,y) or (x,z). Moreover, it depends only on an approximation of x (whether or not it is positive) rather than on its exact value. Thus, the memo entry (7,11,20) should match (7,11,30) or (4,11,50), since, when x is positive, the result depends only on y. Several researchers have remarked that partial matching can be very important in some applications [77, 76, 1, 47]. Abadi, Lampson, and Lévy [1] and Heydon, Levin, and Yu [47] have suggested program-analysis methods for tracking dependences for this purpose. Although their techniques are likely effective in catching potential matches, they do not provide a programmer-controlled mechanism for specifying which dependences should be tracked. Also, their program-analysis technique can change the asymptotic performance of a program, making it difficult to assess the effects of memoization.

Space management. Another problem with memoization is its space requirement. As a program executes, its memo tables can become large, limiting the utility of memoization. To alleviate this problem, memo tables or individual entries should be disposed of under programmer control.
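One concrete form of such control is ordinary lexical scoping, sketched below in hedged Python (the function names are invented): the memo table is held in a local of an auxiliary function, so it is reclaimed as soon as that function returns.

```python
# Sketch of disposing of a memo table by scoping: `table` lives only for
# the duration of one top-level knapsack computation.

def knapsack(capacity, items):       # items: list of (weight, value) pairs
    table = {}                       # memo table local to this call

    def ks(c, i):
        if i == len(items):
            return 0
        if (c, i) not in table:
            w, v = items[i]
            if c < w:
                table[(c, i)] = ks(c, i + 1)
            else:
                table[(c, i)] = max(ks(c, i + 1), v + ks(c - w, i + 1))
        return table[(c, i)]

    return ks(capacity, 0)           # table becomes garbage here
```

All re-use happens among the recursive calls of ks; once knapsack returns, the table is unreachable and can be collected.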
In some applications, such as dynamic programming, most result re-use occurs among the recursive calls of some function. Thus, the memo table of such a function can be disposed of whenever the function terminates. In other applications, where result re-use is less structured, individual memo-table entries should be purged according to a replacement policy [48, 81]. The problem is to determine which replacement policy should be used and to analyze the performance effects of the chosen policy. One widely used approach is to replace the least recently used entry. Other, more sophisticated, policies have also been suggested [81]. In general, the replacement policy must be application-specific, because, for any fixed policy, there are programs whose performance is made worse by that choice [81].

12.3 A Framework for Selective Memoization

We present an overview of our framework via some examples. The framework extends a purely functional language with several constructs to support selective memoization. In this section, we use an extension of an ML-like language for the discussion. We formalize the core of this language and study its safety, soundness, and performance properties in Section 12.4.

The framework enables the programmer to determine precisely the dependences between the input and the result of a function. The main idea is to deem the parameters of a function as resources and to provide primitives for incrementally exploring any value, including the underlying value of a resource. This incremental exploration process reveals the dependences between the parameters of the function and its result.

The incremental exploration process is guided by types. If a value has the modal type !τ, then the underlying value of type τ can be bound to an ordinary, unrestricted variable by the let! construct; this creates a dependence between the underlying value and the result. If a value has a product type, then its two parts can be bound to two resources using the let* construct; this creates no dependences. If the value has a sum type, then it can be case analyzed using the mcase construct, which branches according to the outermost form of the value and binds the inner value to a resource; mcase creates a dependence on the outer form of the value of the resource. The key aspect of let* and mcase is that they bind resources rather than ordinary variables.

Exploring the input to a function via let!, mcase, and let* builds a branch recording the dependences between the input and the result of the function: let! adds the full value to the branch, mcase adds the kind of the sum, and let* adds nothing. Consequently, a branch contains both data dependences (from let!'s) and control dependences (from mcase's). When a return is encountered, the branch recording the revealed dependences is used to key the memo table. If the result is found in the memo table, then the stored value is returned; otherwise the body of the return is evaluated and the memo table is updated to map the branch to the result. The type system ensures that all dependences are made explicit by precluding the use of resources within return's body.

As an example, consider the Fibonacci function fib and its memoized counterpart mfib, shown in Figure 12.1. The memoized version, mfib, exposes the underlying value of its parameter, a resource, before performing the two recursive calls as usual.

Non-memoized:

    fib: int -> int
    fun fib (n) =
      if (n < 2) then n
      else fib(n-1) + fib(n-2)

    f: int * int * int -> int
    fun f (x, y, z) =
      if (x > 0) then fy y
      else fz z

Memoized:

    mfib: !int -> int
    mfun mfib (n') =
      let !n = n' in return (
        if (n < 2) then n
        else mfib(!(n-1)) + mfib(!(n-2)))
      end

    mf: int * !int * !int -> int
    mfun mf (x', y', z') =
      mif (x' > 0) then
        let !y = y' in return (fy y) end
      else
        let !z = z' in return (fz z) end

Figure 12.1: Fibonacci and expressing partial dependences.
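The branch mechanism can be imitated in Python to make the bookkeeping concrete (an illustration of the idea only; fy and fz are stand-ins, and MFL's type system enforces statically what this sketch does by convention). Each exploration step appends to a branch: a mif records only the outcome of the test, a let! records a full value, and return then keys the memo table with the finished branch.

```python
# Branch sketch for a function like mf(x, y, z): the mif on x contributes
# a control dependence (its sign only); the let! on y or z contributes a
# data dependence (the full value).  The finished branch keys the table.

fy = lambda y: y * 2                 # stand-ins for the fy, fz of the text
fz = lambda z: z + 1

table = {}

def mf(x, y, z):
    branch = [("mif", x > 0)]        # control dependence: sign of x only
    if x > 0:
        branch.append(("let!", y))   # data dependence on y alone
        compute = lambda: fy(y)
    else:
        branch.append(("let!", z))   # data dependence on z alone
        compute = lambda: fz(z)
    key = tuple(branch)              # return: branch keys the memo table
    if key not in table:
        table[key] = compute()
    return table[key]
```

The calls mf(7, 11, 20) and mf(4, 11, 50) build the identical branch (("mif", True), ("let!", 11)), so the second is a memo hit, which is precisely the partial matching discussed in Section 12.2.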
Since the result depends on the full value of the parameter, the parameter has a bang type. The memoized Fibonacci function runs in linear time, as opposed to the exponential time of the non-memoized version.

Partial dependences between the input and the result of a function can be captured using the incremental exploration technique. As an example, consider the function f shown in Figure 12.1. The function checks whether x is positive and returns fy(y) or fz(z) accordingly. Thus the result of the function depends on an approximation of x (its sign) and on either y or z. The memoized version mf captures this by first checking whether x' is positive and then exposing the underlying value of y' or z' accordingly. Consequently, the result depends on the sign of x' and on either y' or z'. Thus, if mf is called with parameters (1, 5, 7) first and then (2, 5, 3), the result will be found in the memo the second time, because when x' is positive the result depends only on y'. Note that the mif construct used in this example is just a special case of the more general mcase construct.

A critical issue for efficient memoization is the implementation of memo tables, along with the lookup and update operations on them. In our framework we support expected-constant-time memo-table lookup and update operations by representing memo tables as hash tables. To do this, we require that the underlying type τ of a modal type !τ be an indexable type. An indexable type is associated with an injective function, called an index function, that maps each value of the type to a unique integer; this integer is called the index of the value. The uniqueness of indices within a given type ensures that two values are equal if and only if their indices are equal. In our framework, equality is defined only for indexable types. This enables us to implement memo tables as hash tables keyed by branches consisting of indices.
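The role of index functions can be illustrated with a small Python sketch (illustrative names; the framework itself ties indexability to the type system): the identity serves for integers, while boxed values are indexed by a unique tag assigned at creation.

```python
# Sketch of indexable types: an injective index function maps each value
# to an integer, and branches of such indices key the memo hash table.

import itertools

_fresh = itertools.count()           # supply of unique tags

class Box:
    def __init__(self, value):
        self.value = value
        self.index = next(_fresh)    # unique location stands in as index

def index_int(n):
    return n                         # identity: injective on integers

def index_box(b):
    return b.index                   # location equality for boxed values

b1, b2 = Box(3), Box(3)              # equal contents, distinct boxes
```

Two boxes with equal contents receive distinct indices, so they are unequal unless they are constructed through a hash-consing function such as hCons below; a branch is then just a tuple of indices, hashable in expected constant time.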
We assume that each primitive type comes with an index function. For example, for integers the identity function can be chosen as the index function. Composite types such as lists or functions must be boxed to obtain an indexable type. A boxed value of type τ has type τ box. When a box is created, it is assigned a unique location (or tag), and this location is used as the unique index of the boxed value. For example, we can define boxed lists as follows.

    datatype α blist' = NIL | CONS of α * ((α blist') box)
    type α blist = (α blist') box

Based on boxes, we implement hash-consing as a form of memoization. For example, hash-consing for boxed lists can be implemented as follows.

    hCons: !α * !(α blist) -> α blist
    mfun hCons (h', t') =
      let !h = h' and !t = t' in
        return (box (CONS(h,t)))
      end

The function takes an item and a boxed list and returns the boxed list formed by consing them. Since the function is memoized, if it is ever called with two values that have already been hash-consed, then the same result is returned. The advantage of being able to define hash-consing as a memoized function is that it can be applied selectively.

To control the space usage of memo tables, our framework gives the programmer a way to dispose of memo tables by conventional scoping. Each memoized function is allocated its own memo table; when the function goes out of scope, its memo table can be garbage collected. For example, in many dynamic-programming algorithms result re-use occurs between recursive calls of the same function. In this case, the programmer can scope the memoized function inside an auxiliary function so that its memo table is discarded as soon as the auxiliary function returns. As an example, consider the standard algorithm for the Knapsack Problem, ks, and its memoized version mks, shown in Figure 12.2.
Since result sharing mostly occurs among the recursive calls of mks, it can be scoped inside some other function that calls mks; once that function returns, the memo table of mks goes out of scope and can be discarded. We note that this technique gives only partial control over space usage. In particular, it does not give control over when individual memo-table entries are purged. In Section 12.6, we discuss how the framework might be extended so that each memo table is managed according to a programmer-specified caching scheme. The basic idea is to require the programmer to supply a caching scheme as a parameter to mfun and to maintain the memo table according to the chosen scheme.

Non-memoized:

    ks: int * ((int*real) list) -> int
    fun ks (c, l) =
      case l of
        nil => 0
      | (w,v)::t =>
          if (c < w) then ks(c,t)
          else
            let v1 = ks(c,t)
                v2 = v + ks(c-w,t)
            in if (v1 > v2) then v1 else v2
            end

Memoized:

    mks: !int * !((int*real) blist) -> int
    mfun mks (c', l') =
      let !c = c' and !l = l' in return (
        case (unbox l) of
          NIL => 0
        | CONS((w,v),t) =>
            if (c < w) then mks(!c,!t)
            else
              let v1 = mks(!c,!t)
                  v2 = v + mks(!(c-w),!t)
              in if (v1 > v2) then v1 else v2
              end)
      end

Figure 12.2: Memo tables for memoized Knapsack can be discarded at completion.

Memoized Quicksort. As a more sophisticated example, we consider Quicksort. Figure 12.3 shows an implementation of the Quicksort algorithm and its memoized counterpart. The algorithm first divides its input into two lists, containing the keys less than the pivot and the keys greater than the pivot, using the filter function fil. It then sorts the two sublists and returns the concatenation of the results. The memoized filter function mfil uses hash-consing to ensure that there is only one copy of each result list. The memoized Quicksort algorithm mqs exposes the underlying value of its parameter and is otherwise similar to qs. Note that mqs does not build its result via hash-consing; it can output two copies of the same result.
Since in this example the output of mqs is not consumed by any other function, there is no need to do so. Even if the result were consumed by some other function, one could choose not to use hash-consing, because operations such as insertions into and deletions from the input list will surely change the result of Quicksort.

Non-memoized:

    fil: (int -> bool) * int list -> int list
    fun fil (g: int->bool, l: int list) =
      case l of
        nil => nil
      | h::t =>
          let tt = fil(g,t) in
            if (g h) then h::tt
            else tt
          end

    qs: int list -> int list
    fun qs (l) =
      case l of
        nil => nil
      | h::t =>
          let s = fil(fn x => x < h, t)
              g = fil(fn x => x >= h, t)
          in (qs s) @ (h::(qs g)) end

Memoized:

    empty = box NIL

    mfil: (int -> bool) * int blist -> int blist
    fun mfil (g, l) =
      case (unbox l) of
        NIL => empty
      | CONS(h,t) =>
          let tt = mfil(g,t) in
            if (g h) then hCons(h,tt)
            else tt
          end

    mqs: !(int blist) -> int blist
    mfun mqs (l': !(int blist)) =
      let !l = l' in return (
        case (unbox l) of
          NIL => NIL
        | CONS(h,t) =>
            let s = mfil(fn x => x < h, t)
                g = mfil(fn x => x >= h, t)
            in ... end)
      end

Figure 12.3: The Quicksort algorithm.

When the memoized Quicksort algorithm is called on “similar” inputs, one would expect some of the results to be re-used. Indeed, we show that the memoized Quicksort algorithm computes its result in expected linear time when its input is obtained from a previous input by inserting a new key at the beginning. Here the expectation is over all permutations of the input list and also over the internal randomization of the hash functions used to implement the memo tables. For the analysis, we assume, without loss of generality, that all keys in the list are unique.

Theorem 82 Let L be a list and let L′ = [a, L]. Consider running memoized Quicksort on L and then on L′. The running time of Quicksort on the modified list L′ is expected O(n), where n is the length of L′.

Proof: Consider the recursion tree of Quicksort with input L, denoted Q(L), and label each node with the pivot of the corresponding recursive call (see Figure 12.4 for an example). Consider any pivot (key) p from L and let Lp denote the keys that precede p in L. It is easy to see that a key k is in the subtree rooted at p if and only if the following two properties are satisfied for any key k′ ∈ Lp:

1. if k′ < p then k > k′, and
2. if k′ > p then k < k′.

Of the keys that are in the subtree of p, those that are less than p are in its left subtree and those greater than p are in its right subtree. Now consider the recursion tree Q(L′) for L′ = [a, L] and let p be any pivot in Q(L′). Suppose p < a and let k be any key in the left subtree of p in Q(L). Since k < p, by the two properties k is in the left subtree of p in Q(L′). Similarly, if p > a, then any k in the right subtree of p in Q(L) is also in the right subtree of p in Q(L′).

Since filtering preserves the relative order of keys in the input list, for any pivot p < a the input to the recursive call corresponding to its left child will be the same. Similarly, for p > a, the input to the recursive call corresponding to its right child will be the same. Thus, when sorting L′, these recursive calls will find their results in the memo. Therefore only the recursive calls corresponding to the root, to the children of the nodes on the rightmost spine of the left subtree of the root, and to the children of the nodes on the leftmost spine of the right subtree of the root may be executed (the two spines are shown with thick lines in Figure 12.4). Furthermore, the results for the calls adjacent to the spines will be found in the memo. Consider the calls whose results are not found in the memo.
In the worst case, these will be all the calls along the two spines. Consider the sizes of the inputs for the nodes on a spine, and define the random variables X1 ... Xk such that Xi is the least number of recursive calls (nodes) performed for the input size to become

Figure 12.4: Quicksort’s recursion tree with inputs L = [15, 30, 26, 1, 3, 16, 27, 9, 35, 4, 46, 23, 11, 42, 19] (left) and L′ = [20, L] (right).