Kent Academic Repository Full text document (pdf)

Citation for published version Barnes, Frederick R.M. (2003) Dynamics and Pragmatics for High Performance Concurrency. Doctor of Philosophy (PhD) thesis, University of Kent at Canterbury.

Link to record in KAR http://kar.kent.ac.uk/13969/

Document Version Publisher pdf

DYNAMICS AND PRAGMATICS FOR HIGH PERFORMANCE CONCURRENCY

A thesis submitted to The University of Kent at Canterbury in the subject of computer science for the degree of Doctor of Philosophy.

By Frederick R. M. Barnes
June 2003

Abstract

This thesis is concerned with support at all levels for building highly concurrent and dynamic parallel processing systems. The CSP model of concurrency, as (largely) embodied in the occam programming language, is used due to its simplicity, expressiveness, architecture-independent nature, and potential for high performance. Additionally, occam provides guarantees regarding freedom from aliasing and race-hazard error.

This thesis addresses one of the grand challenges of present-day computer science: providing a technology that offers the dynamic flexibility and performance of mainstream object-oriented environments with the level of safety, formal analysis, modularity and lightweight concurrency offered by CSP/occam. Two approaches to this challenge are possible: do something to make the mainstream languages (e.g. Java, C++) safe, or make occam dynamic without compromising its existing good properties. This thesis follows the latter route.

The first part of this thesis concentrates on enhancing the occam language and run-time system on a commodity platform (IBM PC) running the freely available Linux operating system. After a brief introduction to the various components of the KRoC occam system, additions and extensions to the occam programming language and supporting run-time system are examined. These provide a greater degree of programming flexibility in occam (for example, by adding support for dynamic allocation, mobile semantics and dynamic network construction), without compromising the safety of programs which use them. Benchmarks are reported that demonstrate significant improvements in performance (for example, channel communication in tens of nanoseconds).

The second part concentrates on improving the level of interaction between occam programs and the OS environment, for example by providing easy access to sockets and networking.

This thesis concludes with a discussion of the work presented herein, with consideration given to parallels with object-oriented languages. Also described are details of ongoing and potential future research. The modified language grammar, details of new generated code, and miscellany are provided in the appendices.

Contents

Abstract
List of Tables
List of Figures
List of Algorithms
Acknowledgements

1 Introduction
  1.1 History
  1.2 Aims of This Work
  1.3 Other Approaches to Compiling occam
  1.4 Motivation
  1.5 Structure of This Thesis

I Dynamic Parallel Computing

2 Introduction to KRoC/Linux
  2.1 Origins of KRoC/Linux
  2.2 Packaging
  2.3 The occam Compiler - occ21
  2.4 The Translator - tranx86
  2.5 The Run-Time Kernel - CCSP

3 Extending occam
  3.1 Channel Direction Specifiers
    3.1.1 Syntax Changes
    3.1.2 Compiler Checks
  3.2 Extending PROTOCOLs
    3.2.1 Protocol Inclusion
    3.2.2 Protocol Inheritance
    3.2.3 Implementation Aspects of Protocol Inheritance
  3.3 RESULT Parameters and Abbreviations
  3.4 Array-Constructors
    3.4.1 Syntax and Transformation
  3.5 Other Language Extensions
    3.5.1 Optional OF
    3.5.2 Empty Array Support
  3.6 STEP in Replicators
    3.6.1 Supporting STEP at Run-Time
    3.6.2 Loop-End Optimisation
  3.7 The Extended Rendezvous
    3.7.1 Syntax
    3.7.2 Implementing the Extended Rendezvous
    3.7.3 Further Uses of the Extended Rendezvous
    3.7.4 Formal Semantics
  3.8 Dynamic Memory Support for occam
  3.9 Other Extensions
    3.9.1 Modified SKIP in ALT Checking
    3.9.2 Reversed ALT Disabling
    3.9.3 Enhanced ALT Enabling
    3.9.4 Benchmarking the ALT Enhancements
    3.9.5 Auto-Extended CASE Input
    3.9.6 Strict Checking
  3.10 A C-Like Syntax for occam
    3.10.1 Definitions and Declarations
    3.10.2 Sequential and Parallel Composition
    3.10.3 Conditionals
    3.10.4 Alternatives
    3.10.5 Communication, Assignment and Expressions
    3.10.6 Inclusion and Separate Compilation

4 Mobile Data, Channels and Processes
  4.1 Introduction
  4.2 Mobile Data Types
    4.2.1 Declaring Mobiles
    4.2.2 Mobilespace
    4.2.3 Scoping, Communication and Assignment
    4.2.4 Dynamic Mobile Arrays
  4.3 Mobile Channel Types
    4.3.1 Declaration and Initialisation of Mobile Channels
    4.3.2 Communicating Mobile Channels
    4.3.3 Semaphore Support for Shared Channels
  4.4 Anonymous and Recursive Channel Types
    4.4.1 Anonymous Channel Types
    4.4.2 Implementing Anonymous Channel Types
    4.4.3 Recursive Channel Types
  4.5 Mobile Process Types
    4.5.1 Mobile Agents
    4.5.2 Process Types for occam
    4.5.3 Using Mobile Processes
    4.5.4 Extending Mobile Processes
  4.6 Undefined Usage Checking
    4.6.1 Undefinedness
    4.6.2 Implementation
    4.6.3 Handling Arrays and Records
  4.7 Usage Checking Channel Types

5 Dynamic Process Creation
  5.1 Recursion in occam
    5.1.1 Implementing Recursion
    5.1.2 Mobilespace For Recursive Processes
    5.1.3 Tail Call Optimisation
    5.1.4 Mutual Recursion
  5.2 n-replicated PARs
    5.2.1 Implementing n-replicated PARs
    5.2.2 Mobilespace Support for n-replicated PARs
    5.2.3 Usage Checking n-replicated PARs
  5.3 FORKs and FORKING
    5.3.1 FORK Parameter Passing
    5.3.2 Semantics of FORK
    5.3.3 Unsynchronised FORKing
    5.3.4 Implementing FORK
    5.3.5 Mobilespace for FORKed Processes
    5.3.6 Forked Recursion

II Wider Interaction and Accessibility

6 CCSP and Linux
  6.1 Introduction
  6.2 Interfacing With occam
    6.2.1 Target Register Mapping
    6.2.2 Run-time Kernel Calling
    6.2.3 External Synchronisations
  6.3 Blocking System Calls
    6.3.1 Building Internet Applications
    6.3.2 The occam Interface
    6.3.3 Safe Termination
    6.3.4 Implementing Blocking System Calls
    6.3.5 Dispatching Blocking Calls
    6.3.6 Collecting Finished Calls
    6.3.7 Terminating Blocking Calls
    6.3.8 Performance
    6.3.9 Interacting with occam Programs
  6.4 User Defined Channels
    6.4.1 The occam Interface
    6.4.2 The C Interface
    6.4.3 Generating Code for User Defined Channels
    6.4.4 Compile-Time Translation for User Defined Channels
    6.4.5 Run-Time Handling for User Defined Channels
    6.4.6 ALTing on User Defined Channels
  6.5 Dynamic Loadable Processes
    6.5.1 Generating Dynamic Processes
    6.5.2 Loading and Running Dynamic Processes
    6.5.3 Suspending and Resuming Dynamic Processes
    6.5.4 Implementing Suspend and Resume
  6.6 CPU Timer Support
    6.6.1 Motivation
    6.6.2 Implementation
    6.6.3 64-Bit Timers
    6.6.4 Optimising Process Timeouts
  6.7 Direct OS Kernel Support for CSP
    6.7.1 Named Channels for Inter-Process Communication
    6.7.2 Support for Sleeping
    6.7.3 Support for Blocking System Calls

7 Further Extensions
  7.1 Concurrent C and occam
    7.1.1 Creating and Running C Processes
    7.1.2 Masquerading as occam
  7.2 Process Priority
    7.2.1 The occam Interface
    7.2.2 Implementing Priority Handling
    7.2.3 Performance
  7.3 Post-Mortem Debugging
    7.3.1 Standard Run-Time Errors
    7.3.2 Low-Level Debugging
    7.3.3 Deadlock Detection
  7.4 Support for Higher Level Interaction
    7.4.1 Accessing and Using Protocol Converters
    7.4.2 Detaching and Attaching Dynamic Mobile Arrays
    7.4.3 Implementation of Protocol Conversion
  7.5 A Pre-Processor for occam
    7.5.1 Named Constants
    7.5.2 Built-In Defines
    7.5.3 Conditional Compilation and Indentation
    7.5.4 User-Generated Errors

8 Conclusions and Further Work
  8.1 occam and Object Orientation
    8.1.1 A Ring of Processes
    8.1.2 Process Types
  8.2 Desirable OO Features for occam
    8.2.1 Objects
    8.2.2 Inheritance
    8.2.3 Polymorphism
  8.3 Future Work - Tidying Up
    8.3.1 Arbitrary Process FORKing
    8.3.2 Full Nested Mobilespace Support
    8.3.3 Implementation of Mobile Processes
  8.4 Future Work - Moving On
    8.4.1 Fault-Tolerance for Concurrent Systems
    8.4.2 Higher-Order Channel-Type Communication
    8.4.3 Scalar Types For occam
  8.5 Concluding Remarks

Bibliography

III Appendices

A Ordered Syntax
  A.1 Names, Strings and Numbers
  A.2 Core Language Grammar
  A.3 Pre-Processor Grammar

B Extended Transputer Code Additions
  B.1 New Instructions
    B.1.1 Dynamic Allocation
    B.1.2 Mobile Communication
    B.1.3 Extended Inputs
    B.1.4 External Communication
    B.1.5 Miscellany
  B.2 New ETC Specials
    B.2.1 Mobilespace Initialisation Specials
    B.2.2 New LOOPEND Specials
    B.2.3 Magic Compiler Comments
    B.2.4 Semaphores, Rescheduling and Others

List of Tables

3.1 Outcomes for channel-direction specifier compatibility checks

6.1 Transputer process state
6.2 Intel i386 general-purpose integer registers
6.3 Transputer register mapping on the Intel i386
6.4 Methods used to transfer control from occam to the CCSP run-time kernel
6.5 Blocking system-call ‘killcall()’ results
6.6 Channel IOCTL calls for the Linux CSP-driver
6.7 Channel direction constants for the CSP-driver
6.8 Timeout IOCTL calls for the Linux CSP-driver
6.9 Blocking system-call IOCTL calls for the Linux CSP-driver

7.1 Run-time integer errors
7.2 Intel floating-point errors reported by KRoC/Linux
7.3 Compiler generated pre-processor defines

A.1 Key to ordered syntax additions

B.1 Virtual-Transputer instructions to support dynamic memory
B.2 Virtual-Transputer instructions to support MOBILE communications
B.3 Virtual-Transputer instructions to support the extended rendezvous
B.4 Virtual-Transputer instructions to support external communication
B.5 Virtual-Transputer instructions to support miscellaneous extensions

List of Figures

2.1 The KRoC/Linux compilation sequence

3.1 Process network for a running-sum integrator
3.2 Using protocol conversion components to wire up a GUI
3.3 Workspace layout and code-generation for supporting arbitrary replicator STEP values
3.4 A tapped ‘squares’ process pipeline
3.5 Multi-way synchronising processes
3.6 Example extended synchronising processes
3.7 Alt-benchmark results for a replicated ALT
3.8 Alt-benchmark results for a replicated PRI ALT
3.9 Alt-benchmark results for a replicated fair-ALT

4.1 Copying, aliasing and movement semantics for communication
4.2 Allocation of variables in mobilespace
4.3 Allocation of process instances in mobilespace
4.4 Mobile-communicating process network showing initial and final mobilespace
4.5 Mobilespace with temporary aliasing
4.6 Dynamic mobile array memory allocation
4.7 Example ‘encode’ server and clients connected using mobile channel-ends
4.8 Layout of the ‘encode’ channel type in memory
4.9 Example ‘encode’ servers, clients and a manager process, using shared channel-ends
4.10 Modified ‘encode’ network, incorporating a shared channel
4.11 Recursive channel-type communication
4.12 Global-state based mobile agent
4.13 An example mobile process and local connections
4.14 Example ‘integrator’ mobile process implementations
4.15 Mobile-process communicating processes
4.16 Mobile-process communicating process network
4.17 Mobile process interface conversion component
4.18 Undefined nesting for records
4.19 Undefined nesting for arrays

5.1 Process network for the parallel recursive Sieve of Eratosthenes
5.2 Workspace allocation for occam procedure calls
5.3 Workspace allocation for recursive occam procedure calls
5.4 Dynamic mobilespace allocation for recursive processes
5.5 Setting up a fixed-size replicated PAR
