NIC-Based Offload of Dynamic User-Defined Modules for Myrinet Clusters

Adam Wagner, Hyun-Wook Jin and Dhabaleswar K. Panda
Network-Based Computing Laboratory
Dept. of Computer Science and Engineering
The Ohio State University
{wagnera, jinhy, panda}@cse.ohio-state.edu

Rolf Riesen
Scalable Computing Systems Dept.
Sandia National Laboratories
[email protected]

Abstract

Many of the modern networks used to interconnect nodes in cluster-based computing systems provide network interface cards (NICs) that offer programmable processors. Substantial research has been done with the focus of offloading processing from the host to the NIC processor. However, the research has primarily focused on the static offload of specific features to the NIC, mainly to support the optimization of common collective and synchronization-based communications. In this paper, we describe the design and implementation of a new framework based on MPICH-GM to support the dynamic NIC-based offload of user-defined mod-

The common approach to NIC-based offload is to hard-code an optimization into the control program which runs on the NIC in order to achieve the highest possible performance gain. While such approaches have proved successful in improving performance, they suffer from several drawbacks. First, NIC-based coding is quite complex and error prone due to the specialized nature of the NIC firmware and the difficulty of validating and debugging code on the NIC. Because of the level of difficulty involved in making such changes and the potential consequences of erroneous code, these sorts of optimizations may only be performed by system experts. Second, hard-coding features into the NIC firmware is inflexible.