Investigating the Feasibility of an MPI-Like Library Implemented in .Net Using Only Fully Managed Code
Daniel Holmes
MSc in High Performance Computing
The University of Edinburgh
Year of Presentation: 2007

Abstract

The .Net development platform and the C# language, in particular, offer many benefits to programmers including increased productivity, security, reliability and robustness, as well as standards-based application portability and cross-language inter-operation. The Message Passing Interface (MPI) is a standardised high performance computing paradigm with efficient, frequently-used implementations in many popular languages. A partial implementation of McMPI, the first MPI-like library to be targeted at .Net and written in pure C#, is presented. It is sufficiently complete to demonstrate typical application code and to evaluate relative performance. Although the effective bandwidth for large messages (over 100 Kbytes) using 100 Mbit/s Ethernet is good, the overheads introduced by .Net remoting and object serialisation are shown to result in high latency and to limit bandwidth to 166 Mbit/s when using a 1 Gbit/s Ethernet interconnection. A possible resolution that still uses pure C#, i.e. using .Net sockets, is proposed but not implemented.

Contents

Chapter 1. Introduction
Chapter 2. Background
  2.1 Object-Oriented HPC
  2.2 Comparing .Net and Java
    2.2.1 Portability
    2.2.2 Connectivity
    2.2.3 Software Engineering
    2.2.4 Security
    2.2.5 Other Benefits
    2.2.6 Numerical Issues
    2.2.7 Performance Issues
  2.3 Programming in .Net
  2.4 HPC and .Net Programming
  2.5 Implementation Paradigms for MPI in C#
    2.5.1 Platform Invoke – MPI.NET
    2.5.2 Sockets vs. Remoting
Chapter 3. Requirements Analysis for a Solution in C#
  3.1 Headers and Message Data
  3.2 Queues and Progress
  3.3 Threads and Locks
  3.4 External Interface
    3.4.1 Data types
    3.4.2 Memory References
    3.4.3 Layout
    3.4.4 Process Identifiers
Chapter 4. Designing and Building a Solution in C#
  4.1 Object Interaction
    4.1.1 Non-Blocking Send
    4.1.2 Non-Blocking Receive
  4.2 Class Structure
    4.2.1 The Internal Storage Structures
    4.2.2 The RemoteController Class
    4.2.3 The Process Class
    4.2.4 The Communicator Class
    4.2.5 The Request and Status Classes
  4.3 Configuration, Start up and Shutdown
    4.3.1 The Node Manager Utility
    4.3.2 The Initialise Method
    4.3.3 The Finalise Method
  4.4 Summary
Chapter 5. Testing
  5.1 Proving Correctness using Ping-Pong Duplex
  5.2 Investigating Performance using Ping-Pong Simplex
  5.3 Discussion of Ping Pong Simplex Results
  5.4 Demonstrating Real-World Code using Image Processor
  5.5 Discussion of Image Processor Results
Chapter 6. Further Work
  6.1 Reducing Latency
  6.2 Testing Support for InfiniBand and Myrinet
  6.3 Increasing Portability
  6.4 Scaling to Large Parallel Systems
  6.5 Completing McMPI
  6.6 Building Remoting from MPI
Chapter 7. Conclusion
Appendix A. Project Plan
Appendix B. Data tables
Appendix C. References & Bibliography
Appendix D. Glossary

List of Figures

Figure 1: Building and Executing Java Programs
Figure 2: Building and Executing .Net Programs
Figure 3: Definition of Type-safe and Verifiably Type-safe in .Net (25)
Figure 4: Simplified View of Code Access Security in .Net (26)
Figure 5: Remoting infrastructure overview (33)
Figure 6: Source code for using the IsSerializable property
Figure 7: UML Sequence diagram for non-blocking send and receive
Figure 8: UML Static Structure for the public interface of the McMPI classes
Figure 9: Source code for the internal storage structures
Figure 10: Source code for the PullMessageData method
Figure 11: Source code for the PushSendHeader method
Figure 12: Source code for the CombineMatched and PullData functions
Figure 13: Ping Pong round-trip time for small messages – 100 Mbit/s
Figure 14: Ping Pong round-trip time for small messages – 1 Gbit/s
Figure 15: Ping Pong round-trip time for all message sizes – 100 Mbit/s
Figure 16: Ping Pong round-trip times for all message sizes – 1 Gbit/s
Figure 17: Ping Pong effective bandwidth – 100 Mbit/s
Figure 18: Ping Pong effective bandwidth – 1 Gbit/s
Figure 19: Image Processor times – small image, single precision
Figure 20: Image Processor times – small image, double precision
Figure 21: Image Processor times – medium image, single precision
Figure 22: Image Processor times – medium image, double precision
Figure 23: Image Processor times – large image, single