A Python-Based Framework for Distributed Programming and Rapid Prototyping of Distributed Programming Models

by Alexey S. Fedosov
B.S. (University of San Francisco) 2001

THESIS
Submitted in partial satisfaction of the requirements for the degree of MASTER OF SCIENCE in the DEPARTMENT OF COMPUTER SCIENCE of the UNIVERSITY OF SAN FRANCISCO

Approved:
Gregory D. Benson, Professor (Chair)
Peter S. Pacheco, Professor
Christopher H. Brooks, Associate Professor
Jennifer E. Turpin, Dean of College of Arts and Sciences

2009

Copyright 2009 by Alexey S. Fedosov

Abstract

River is a Python-based framework for rapid prototyping of reliable parallel and distributed run-time systems. The current quest for new parallel programming models is hampered, in part, by the time and complexity required to develop dynamic run-time support and network communication. The simplicity of the River core combined with Python's dynamic typing and concise notation makes it possible to go from a design idea to a working implementation in a matter of days or even hours. With the ability to test and throw away several implementations, River allows researchers to explore a large design space. In addition, the River core and new extensions can be used directly to develop parallel and distributed applications in Python. This thesis describes the River system and its core interface for process creation, naming, discovery, and message passing. We also discuss various River extensions, such as the Remote Access and Invocation (RAI) extension, the Trickle task-farming programming model, and a River MPI implementation.

Soli Deo Gloria

Contents

List of Figures
List of Tables
1 Introduction
2 Background
  2.1 Motivation
  2.2 Parallel and Distributed Languages
  2.3 Communication Libraries and Mechanisms
  2.4 The Python Language
  2.5 Distributed Programming in Python
3 Programming with River
  3.1 Virtual Resources
  3.2 Virtual Machines
  3.3 VR Execution
  3.4 Discovery, Allocation, and Deployment
  3.5 Super Flexible Messaging
  3.6 A Complete River Program
  3.7 Selective Discovery
  3.8 Examples
    3.8.1 Monte Carlo Pi
    3.8.2 Distributed Word Frequency Count
    3.8.3 Parallel Dot Product and Matrix Multiplication
  3.9 Extensions
4 Design and Implementation
  4.1 Super Flexible Messaging
    4.1.1 Encoding
    4.1.2 Decoding
    4.1.3 Queue Matching
  4.2 River VM
    4.2.1 Sending Data
    4.2.2 Receiving Data
  4.3 Connection Model and Caching
    4.3.1 Name Resolution
  4.4 Discovery, Allocation, and Deployment
  4.5 Fault Tolerance
  4.6 Experimental Results
    4.6.1 SFM
    4.6.2 River Core
      4.6.2.1 TTCP
      4.6.2.2 Conjugate Gradient Solver
5 Remote Access and Invocation
  5.1 Programming Model
    5.1.1 Passing Remote Object References
  5.2 Local Directory Export
  5.3 Design and Implementation
  5.4 Discussion
  5.5 Comparison with PYRO
6 The Trickle Programming Model
  6.1 Programming Model
    6.1.1 Connecting and Injecting
    6.1.2 Synchronous Remote Access
    6.1.3 Asynchronous Remote Invocation
    6.1.4 Dynamic Scheduling
  6.2 Word Frequency Example
  6.3 Implementation
    6.3.1 Deployment
    6.3.2 Dynamic Scheduling
  6.4 Discussion
  6.5 Comparison with IPython
  6.6 Experimental Results
  6.7 Trickle Quick Reference
7 River MPI Implementation
  7.1 Design
  7.2 Programming Model
  7.3 Implementation
    7.3.1 Base Implementation
    7.3.2 Derived Datatypes
    7.3.3 Non-blocking Communication
    7.3.4 Optimized Collectives
  7.4 Comparison with other Python MPI packages
  7.5 Experimental Results
8 Conclusions and Future Work
  8.1 Experience with River Development
  8.2 Future Work
Bibliography
A Examples Source Code
  A.1 Monte Carlo Pi Calculator [mcpi.py]
  A.2 Word Frequency Counter [wfreq.py]
  A.3 Conjugate Gradients Solver [rcg.py]
  A.4 Remote Dictionary Extension [remdict.py]
  A.5 Ping-pong Benchmark [pingpong.py]
  A.6 Word Frequency Counter with Trickle [wfreq-trickle.py]
  A.7 Word Frequency Counter with IPython [wfreq-ipy.py]
B River Core Source Code
  B.1 Virtual Resource Base Class [vr.py]
  B.2 River Virtual Machine [riverVM.py]
  B.3 Control VR [control.py]
  B.4 Super Flexible Packet [packet.py]
  B.5 SFM Queue [oneq.py]
  B.6 Network Connection [nc.py]
  B.7 Connection Pool [pool.py]
C River Extensions Source Code
  C.1 RAI Client Code [remote.py]
  C.2 RAI Server Code [rai.py]
  C.3 Trickle [trickle.py]
  C.4 rMPI Base [mpi.py]
  C.5 rMPI Derived Datatypes [dd.py]
  C.6 rMPI Non-blocking [nbmpi.py]
  C.7 rMPI Optimized Collectives [optmpi.py]

List of Figures

2.1 Using Python Keyword Arguments
3.1 River Virtual Resource
3.2 River Virtual Machines on a Network
3.3 River Program in Execution
3.4 Simple SFM Example
3.5 SFM Matching
3.6 First River Program
3.7 Broader Discovery
3.8 Selective Discovery with Functions
3.9 Monte Carlo Pi
3.10 Word Count
3.11 Parallel Dot-Product and Matrix Multiplication
3.12 Word Frequency Example with the Remote Dictionary Extension
3.13 Remote Dictionary Extension
4.1 SFM Encoding Process
4.2 River Virtual Machines
4.3 Sending Data
4.4 Receiving Data
4.5 Multiplexing Connections
4.6 Connection Model
4.7 Control VR
4.8 Obtaining Base Class List
4.9 Determining the Class to Instantiate
4.10 Ping-Pong Latency
4.11 Ping-Pong Bandwidth
4.12 TTCP Bandwidth
4.13 Conjugate Gradients Solver Speedup
5.1 Remote Access and Invocation
5.2 Sample RAI Server Module
5.3 Sample RAI Client
5.4 RAI Client for an Unrestricted Server
5.5 Using Local Directory Export
5.6 Locating a Parent for Orphan Remote Objects
5.7 Asynchronous Invocation in RAI
5.8 Partial Function Application in RAI
6.1 Trickle Model
6.2 A Simple Trickle Program
6.3 Trickle Injection
6.4 Synchronous Data Access
6.5 Synchronous Function Invocation
6.6 Synchronous Object Invocation
6.7 Asynchronous Invocation
6.8 Dynamic Work Scheduling
6.9 Document Word Frequency Counter
6.10 Trickle Deployment
6.11 Implementation of forkwork()
6.12 Implementation of forkgen()
6.13 Word Frequency Performance
6.14 Word Frequency Speedup
6.15 Trickle Quick Reference
7.1 A Simple rMPI Program
