The MVAPICH Approach

Enhancing MPI Communication using Accelerated Verbs: The MVAPICH Approach
Talk at UCX BoF (SC '18) by Dhabaleswar K. (DK) Panda, The Ohio State University
E-mail: [email protected]
http://www.cse.ohio-state.edu/~panda
Network Based Computing Laboratory

Introduction, Motivation, and Challenge

  • HPC applications require high-performance, low-overhead data paths that provide:
    – Low latency
    – High bandwidth
    – High message rate
  • Hardware-offloaded tag matching
  • Different families of accelerated verbs are available:
    – Burst family: accumulates packets to be sent into bursts of single-SGE packets
    – Poll family: optimizes send completion counts, receive completions for which only the length is of interest, and completions that contain the payload in the CQE
  • Challenge: can we integrate accelerated verbs into existing HPC middleware to extract peak performance and overlap?

Verbs-level Performance: Message Rate

[Figure: three panels (Read, Write, Send) comparing regular vs. accelerated verbs message rate, in millions of messages per second, for message sizes from 1 to 4096 bytes. Labeled peaks for the accelerated path: 10.47 (Read), 10.20 (Write), and 9.66 (Send) million messages/second; the regular path peaks lower, with labeled values in the 7.4–9.0 range.]

Testbed: ConnectX-5 EDR (100 Gbps), Intel Broadwell E5-2680 @ 2.4 GHz, MOFED 4.2-1, RHEL-7, kernel 3.10.0-693.17.1.el7.x86_64

Verbs-level Performance: Bandwidth

[Figure: three panels (Read, Write, Send) comparing regular vs. accelerated verbs bandwidth, in millions of bytes per second (log scale), for message sizes from 2 to 512 bytes. Labeled small-message values, accelerated vs. regular: 16.39 vs. 12.21 (Read), 18.35 vs. 14.72 (Write), and 16.10 vs. 14.13 (Send). Same testbed as above.]

The MVAPICH Approach

[Architecture diagram:]
  • High-performance parallel programming models:
    – Message Passing Interface (MPI)
    – PGAS (UPC, OpenSHMEM, CAF, UPC++)
    – Hybrid, MPI + X (MPI + PGAS + OpenMP/Cilk)
  • High-performance and scalable communication runtime with diverse APIs and mechanisms:
    – Point-to-point primitives, collectives algorithms, remote memory access, job startup, energy-awareness, I/O and file systems, fault tolerance, active messages, introspection & analysis, virtualization
  • Support for modern networking technology (InfiniBand, iWARP, RoCE, Omni-Path):
    – Transport protocols: RC, XRC, UD, DC
    – Modern interconnect features: SR-IOV, Multi-Rail, UMR, ODP
    – Accelerated verbs family*: Burst, Poll, Tag Match
    – Modern switch features: Multicast, SHARP
  * Upcoming
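The hardware-offloaded tag matching mentioned above moves MPI's matching of incoming messages against posted receives from the host CPU onto the NIC. As a rough illustration of the semantics being offloaded (not MVAPICH's implementation), here is a minimal Python sketch of MPI-style (source, tag) matching, with a posted-receive queue searched in FIFO order and an unexpected-message queue for early arrivals; all class and function names are hypothetical:

```python
ANY = object()  # stands in for MPI_ANY_SOURCE / MPI_ANY_TAG wildcards

class TagMatcher:
    """Models the matching an MPI runtime (or a tag-matching NIC) performs."""
    def __init__(self):
        self.posted = []      # receives posted by the application, in order
        self.unexpected = []  # messages that arrived before a matching receive

    def post_recv(self, source, tag):
        # First drain the unexpected queue, in arrival order.
        for i, (src, tg, payload) in enumerate(self.unexpected):
            if self._match(source, tag, src, tg):
                del self.unexpected[i]
                return payload
        self.posted.append((source, tag))
        return None  # receive stays pending

    def arrive(self, src, tg, payload):
        # Match the incoming message against posted receives in FIFO order.
        for i, (source, tag) in enumerate(self.posted):
            if self._match(source, tag, src, tg):
                del self.posted[i]
                return payload  # delivered to the matched receive
        self.unexpected.append((src, tg, payload))
        return None  # queued as unexpected

    @staticmethod
    def _match(want_src, want_tag, src, tg):
        return (want_src is ANY or want_src == src) and \
               (want_tag is ANY or want_tag == tg)

m = TagMatcher()
m.post_recv(source=1, tag=7)           # receive posted, nothing arrived yet
print(m.arrive(1, 7, b"hello"))        # -> b'hello' (matches posted receive)
print(m.arrive(2, 9, b"early"))        # -> None (goes to unexpected queue)
print(m.post_recv(source=ANY, tag=9))  # -> b'early' (drains unexpected queue)
```

Offloading this search to the NIC is attractive precisely because, done in software, every arrival walks the posted queue on the host CPU, stealing cycles from computation and adding latency on the critical path.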

