Conversational Concurrency

Tony Garnock-Jones

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

College of Computer and Information Science
Northeastern University
2017

Tony Garnock-Jones: Conversational Concurrency
Copyright © 2017 Tony Garnock-Jones

This document was typeset on December 31, 2017 at 9:22 using the typographical look-and-feel classicthesis developed by André Miede, available at https://bitbucket.org/amiede/classicthesis/

Abstract

Concurrent computations resemble conversations. In a conversation, participants direct utterances at others and, as the conversation evolves, exploit the known common context to advance the conversation. Similarly, collaborating software components share knowledge with each other in order to make progress as a group towards a common goal.

This dissertation studies concurrency from the perspective of cooperative knowledge-sharing, taking the conversational exchange of knowledge as a central concern in the design of concurrent programming languages. In doing so, it makes five contributions:

1. It develops the idea of a common dataspace as a medium for knowledge exchange among concurrent components, enabling a new approach to concurrent programming. While dataspaces loosely resemble both “fact spaces” from the world of Linda-style languages and Erlang’s collaborative model, they significantly differ in many details.

2. It offers the first crisp formulation of cooperative, conversational knowledge-exchange as a mathematical model.

3. It describes two faithful implementations of the model for two quite different languages.

4. It proposes a completely novel suite of linguistic constructs for organizing the internal structure of individual actors in a conversational setting. The combination of dataspaces with these constructs is dubbed Syndicate.

5. It presents and analyzes evidence suggesting that the proposed techniques and constructs combine to simplify concurrent programming.

The dataspace concept stands alone in its focus on representation and manipulation of conversational frames and conversational state and in its integral use of explicit epistemic knowledge. The design is particularly suited to integration of general-purpose I/O with otherwise-functional languages, but also applies to actor-like settings more generally.

Acknowledgments

    Networking is interprocess communication.
    —Robert Metcalfe, 1972, quoted in Day (2008)

I am deeply grateful to the many, many people who have supported, taught, and encouraged me over the past seven years.

My heartfelt thanks to my advisor, Matthias Felleisen. Matthias, it has been an absolute privilege to be your student. Without your patience, insight and willingness to let me get the crazy ideas out of my system, this work would not have been possible.

My gratitude also to the members of my thesis committee, Mitch Wand, Sam Tobin-Hochstadt, and Jan Vitek. Sam in particular helped me convince Matthias that there might be something worth looking into in this concurrency business. I would also like to thank Olin Shivers for providing early guidance during my studies.
Thanks also to my friends and colleagues from the Programming Research Lab, including Claire Alvis, Leif Andersen, William Bowman, Dan Brown, Sam Caldwell, Stephen Chang, Ben Chung, Andrew Cobb, Ryan Culpepper, Christos Dimoulas, Carl Eastlund, Spencer Florence, Oli Flückiger, Dee Glaze, Ben Greenman, Brian LaChance, Ben Lerner, Paley Li, Max New, Jamie Perconti, Gabriel Scherer, Jonathan Schuster, Justin Slepak, Vincent St-Amour, Paul Stansifer, Stevie Strickland, Asumu Takikawa, Jesse Tov, and Aaron Turon. Sam Caldwell deserves particular thanks for being the second ever Syndicate programmer and for being willing to pick up the ideas of Syndicate and run with them.

Many thanks to Alex Warth and Yoshiki Ohshima, who invited me to intern at CDG Labs with a wonderful research group during summer and fall 2014, and to John Day, whose book helped motivate me to return to academia. Thanks also to the DARPA CRASH program and to several NSF grants that helped to fund my PhD research.

I wouldn’t have made it here without crucial interventions over the past few decades from a wide range of people. Nigel Bree hooked me on Scheme in the early ’90s, igniting a lifelong interest in functional programming. A decade later, while working at a company called LShift, my education as a computer scientist truly began when Matthias Radestock and Greg Meredith introduced me to the π-calculus and many related ideas. Andy Wilson broadened my mind with music, philosophy and political ideas both new and old. A few years later, Alexis Richardson showed me the depth and importance of distributed systems as we developed new ideas about messaging middleware and programming languages while working together on RabbitMQ.

My colleagues at LShift were instrumental to the development of the ideas that ultimately led to this work. My thanks to all of you. In particular, I owe an enormous debt of gratitude to my good friend Michael Bridgen. Michael, the discussions we have had over the years contributed to this work in so many ways that I’m still figuring some of them out.

Life in Boston wouldn’t have been the same without the friendship and hospitality of Scott and Megs Stevens. Thank you both.

Finally, I’m grateful to my family. The depth of my feeling prevents me from adequately conveying quite how grateful I am. Thank you Mum, Dad, Karly, Casey, Sabrina, and Blyss. Each of you has made an essential contribution to the person I’ve become, and I love you all. Thank you to the Yates family and to Warren, Holden and Felix for much-needed distraction and moments of zen in the midst of the write-up. But most of all, thank you to Donna. You’re my person.

Tony Garnock-Jones
Boston, Massachusetts
December 2017

Contents

i Background
1 Introduction
2 Philosophy and overview of the Syndicate design
  2.1 Cooperating by sharing knowledge
  2.2 Knowledge types and knowledge flow
  2.3 Unpredictability at run-time
  2.4 Unpredictability in the design process
  2.5 Syndicate’s approach to concurrency
  2.6 Syndicate design principles
  2.7 On the name “Syndicate”
3 Approaches to coordination
  3.1 A concurrency design landscape
  3.2 Shared memory
  3.3 Message-passing
  3.4 Tuplespaces and databases
  3.5 The fact space model
  3.6 Surveying the landscape

ii Theory
4 Computational model I: the dataspace model
  4.1 Abstract dataspace model syntax and informal semantics
  4.2 Formal semantics of the dataspace model
  4.3 Cross-layer communication
  4.4 Messages versus assertions
  4.5 Properties
  4.6 Incremental assertion-set maintenance
  4.7 Programming with the incremental protocol
  4.8 Styles of interaction
5 Computational model II: Syndicate
  5.1 Abstract Syndicate/λ syntax and informal semantics
  5.2 Formal semantics of Syndicate/λ
  5.3 Interpretation of events
  5.4 Interfacing Syndicate/λ to the dataspace model
  5.5 Well-formedness and errors
  5.6 Atomicity and isolation
  5.7 Derived forms: during and select
  5.8 Properties

iii Practice
6 Syndicate/rkt tutorial
  6.1 Installation and brief example
  6.2 The structure of a running program: ground dataspace, driver actors
  6.3 Expressions, values, mutability, and data types
  6.4 Core forms
  6.5 Derived and additional forms
  6.6 Ad-hoc assertions
7 Implementation
  7.1 Representing assertion sets
    7.1.1 Background
    7.1.2 Semi-structured assertions & wildcards
    7.1.3 Assertion trie syntax
    7.1.4 Compiling patterns to tries
    7.1.5 Representing Syndicate data structures with assertion tries
    7.1.6 Searching
    7.1.7 Set operations
    7.1.8 Projection
    7.1.9 Iteration
    7.1.10 Implementation considerations
    7.1.11 Evaluation of assertion tries
    7.1.12 Work related to assertion tries
  7.2 Implementing the dataspace model
    7.2.1 Assertions
    7.2.2 Patches and multiplexors
    7.2.3 Processes and behavior functions
    7.2.4 Dataspaces
    7.2.5 Relays
  7.3 Implementing the full Syndicate design
    7.3.1 Runtime
    7.3.2 Syntax
    7.3.3 Dataflow
  7.4 Programming tools
    7.4.1 Sequence diagrams
    7.4.2 Live program display
8 Idiomatic Syndicate
  8.1 Protocols and protocol design
  8.2 Built-in protocols
  8.3 Shared, mutable state
  8.4 I/O, time, timers and timeouts
  8.5 Logic, deduction, databases, and elaboration
    8.5.1 Forward-chaining
    8.5.2 Backward-chaining and Hewitt’s “Turing” Syllogism
    8.5.3 External knowledge sources: the file-system driver
    8.5.4 Procedural knowledge and elaboration: “Make”
    8.5.5 Incremental truth-maintenance and aggregation: all-pairs shortest paths
    8.5.6 Modal reasoning: advertisement
  8.6 Dependency resolution and lazy startup: service presence
  8.7 Transactions: RPC, streams, memoization
  8.8 Dataflow and reactive programming

iv Reflection
9 Evaluation: patterns
  9.1 Patterns
  9.2 Eliminating and simplifying patterns
  9.3 Simplification as key quality attribute
  9.4 Event broadcast, …