2015 IEEE 23rd Annual Symposium on High-Performance Interconnects

A Brief Introduction to the OpenFabrics Interfaces: A New Network API for Maximizing High Performance Application Efficiency

Paul Grun∗, Sean Hefty†, Sayantan Sur†, David Goodell‡, Robert D. Russell§, Howard Pritchard¶, Jeffrey M. Squyres‡
∗Cray, Inc. [email protected]
†Intel Corp. {sean.hefty, sayantan.sur}@intel.com
‡Cisco Systems, Inc. {dgoodell, jsquyres}@cisco.com
§UNH InterOperability Laboratory [email protected]
¶Los Alamos National Laboratory [email protected]

Abstract—OpenFabrics Interfaces (OFI) is a new family of application program interfaces that exposes communication services to middleware and applications. Libfabric is the first member of OFI and was designed under the auspices of the OpenFabrics Alliance by a broad coalition of industry, academic, and national laboratory partners over the past two years. Building and expanding on the goals and objectives of the verbs interface, libfabric is specifically designed to meet the performance and scalability requirements of high performance applications such as Message Passing Interface (MPI) libraries, Symmetric Hierarchical Memory Access (SHMEM) libraries, Partitioned Global Address Space (PGAS) programming models, Database Management Systems (DBMS), and enterprise applications running in a tightly coupled network environment. A key aspect of libfabric is that it is designed to be independent of the underlying network protocols as well as the implementation of the networking devices. This paper provides a brief discussion of the motivation for creating a new API, describes the novel requirements gathering process which drove its design, and summarizes the API's high-level architecture and design.

Keywords-fabric; interconnect; networking; interface

I. INTRODUCTION

OpenFabrics Interfaces, or OFI, is a framework focused on exporting communication services to applications. OFI is specifically designed to meet the performance and scalability requirements of high-performance computing (HPC) applications such as MPI, SHMEM, PGAS, DBMS, and enterprise applications running in a tightly coupled network environment. The key components of OFI are: application interfaces, provider libraries, kernel services, daemons, and test applications.

Libfabric is a library that defines and exports the user-space API of OFI, and is typically the only software that applications deal with directly. Libfabric is supported on commonly available Linux-based distributions. The libfabric API is independent of the underlying networking protocols, as well as the implementation of particular networking devices over which it may be implemented.

OFI is based on the notion of application centric I/O, meaning that the libfabric library is designed to align fabric services with application needs, providing a tight semantic fit between applications and the underlying fabric hardware. This reduces overall software overhead and improves application efficiency when transmitting or receiving data over a fabric.

II. MOTIVATIONS

The motivations for developing OFI evolved from experience gained by developing OpenFabrics Software (OFS), which is produced and distributed by the OpenFabrics Alliance (OFA) [1]. Starting as an implementation of the InfiniBand Trade Association (IBTA) Verbs specification [2], OFS evolved over time to include support for both the iWARP [3–5] and RoCE [6] specifications. As use of these technologies grew, new ideas emerged about how to best access the features available in the underlying hardware. New applications appeared with the potential to utilize networks in previously unanticipated ways. In addition, demand arose for greatly increased scalability and performance. More recently, new paradigms, such as Non-Volatile Memory (NVM), have emerged.

All of the above events combined to motivate a "Birds-of-a-Feather" session at the SC13 Conference. The results of that session provided the impetus for the OFA to form a new project known as OpenFabrics Interfaces, and a new group called the OpenFabrics Interface Working Group (OFIWG). From the beginning, this project and the associated working group sought to engage a wide spectrum of user communities who were already using OFS, who were already moving beyond OFS, or who had become interested in high-performance interconnects. These communities were asked to contribute their ideas about the existing OFS and, most importantly, they were asked to describe their requirements for interfacing with high-performance interconnects.

The response was overwhelming. OFIWG spent months interacting with enthusiastic representatives from various groups, including MPI, SHMEM, PGAS, DBMS, NVM and others. The result was a comprehensive requirements document containing 168 specific requirements. Some requests were educational – complete, on-line documentation. Some were practical – a suite of examples and tests. Some were organizational – a well-defined revision and distribution mechanism. Some were obvious but nevertheless challenging – scalability to millions of communication peers.

Some were specific to a particular user community – provide tag-matching that could be utilized by MPI. Some were an expansion of existing OFS features – provide a full set of atomic operations. Some were a request to improve existing OFS features – redesign memory registration. Some were aimed at the fundamental structure of the interface – divide the world into applications and providers, and allow users to select specific providers and features. Some were entirely new – provide remote byte-level addressing.

After examining the major requirements, including a requirement for independence from any given network technology and a requirement that a new API be more abstract than other network APIs and more closely aligned with application usage, the OFIWG concluded that a new API based solely on application requirements was the appropriate direction.

III. ARCHITECTURAL OVERVIEW

Figure 1 highlights the general architecture of the two main OFI components, the libfabric library and an OFI provider, shown situated between OFI-enabled applications and a hypothetical NIC that supports process direct I/O. The libfabric library defines the interfaces used by applications, and provides some generic services. However, the bulk of the OFI implementation resides in the providers. Providers hook into libfabric and supply access to fabric hardware and services. Providers are often associated with a specific hardware device or NIC. Because of the structure of libfabric, applications access the provider implementation directly for most operations in order to ensure the lowest possible software latencies.

As captured in Figure 1, libfabric can be grouped into four main services.

A. Control Services

These are used by applications to discover information about the types of communication services available in the system. For example, discovery will indicate what fabrics are reachable from the local node, and what sort of communication each fabric provides.

Discovery services allow an application to request specific features, or capabilities (for example, the desired communication model), from the underlying provider. In response, a provider can indicate what additional capabilities an application may use without negatively impacting performance or scalability, as well as requirements on how an application can best make use of the underlying fabric hardware. A provider indicates the latter by setting mode bits which encode restrictions on an application's use of the interface. Such restrictions are due to performance reasons based on the internals of a particular provider's implementation.

The result of the discovery process is that a provider uses the application's request to select a software path that is best suited for both that application's needs and the provider's restrictions.
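To make this request/response flow concrete, the following minimal sketch (ours, not from the original text; the capability and endpoint-type values are illustrative choices) uses libfabric's discovery call, fi_getinfo, to request tagged messaging over a reliable unconnected endpoint and then walks the returned provider paths:

/* Discovery sketch: the requested capabilities (FI_MSG | FI_TAGGED)
 * and endpoint type (FI_EP_RDM) are illustrative. */
#include <stdio.h>
#include <rdma/fabric.h>

int discover_providers(void)
{
    struct fi_info *hints, *info, *cur;
    int ret;

    hints = fi_allocinfo();
    if (!hints)
        return -1;
    hints->caps = FI_MSG | FI_TAGGED;  /* capabilities the app wants      */
    hints->ep_attr->type = FI_EP_RDM;  /* reliable, unconnected endpoints */

    ret = fi_getinfo(FI_VERSION(1, 0), NULL, NULL, 0, hints, &info);
    if (ret) {
        fi_freeinfo(hints);
        return ret;
    }

    /* Each returned fi_info entry describes one usable software path;
     * 'mode' carries the provider's restrictions on interface usage. */
    for (cur = info; cur; cur = cur->next)
        printf("provider %s, mode bits 0x%llx\n",
               cur->fabric_attr->prov_name,
               (unsigned long long) cur->mode);

    fi_freeinfo(info);
    fi_freeinfo(hints);
    return 0;
}

An application would retain the fi_info entry it selects and pass it to the object-creation calls described in Section IV.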

" $ . . . % Libfabric Enabled Applications

)(!% !4(" 21642) 20091(" 721 203)$721  6 4 15%$4 (5"2@$4A 211$"721 &06 @$16 9$9$5 $55 &$ 9$9$5  ##4$55 $"6245 2916$45  &  6"'(1& 620("5

)" 42@(#$4 (5"2@$4A 211$"721 &06 @$16 9$9$5 $55 &$ 9$9$5  ##4$55 $"6245 2916$45  &  6"'(1& 620("5

"  200 1# 9$9$5  200 1# 9$9$5

Figure. 1: Architecture of libfabric and an OFI provider layered between applications and a hypothetical NIC

B. Communication Services

These services are used to set up communication between nodes. They include calls to establish connections (connection management) as well as the functionality used to address connectionless endpoints (address vectors). Communication interfaces are designed to abstract fabric and hardware specific details used to connect and configure communication endpoints.

The connection interfaces are modeled after sockets to support ease of use. However, address vectors are designed around minimizing the amount of memory needed to store addressing data for potentially millions of remote peers.

C. Completion Services

Libfabric exports asynchronous interfaces, and completion services are used by a provider to report directly to an application the results of previously initiated asynchronous operations. Completions may be reported either by using event queues or lower-impact counters. Entries in event queues provide details about the completed operation in several formats that an application can select in order to minimize the data that must be set by the provider. Counters simply report the number of completed operations.

D. Data Transfer Services

These services are sets of interfaces designed around different communication paradigms. Figure 1 shows four basic data transfer interface sets. These data transfer services give an application direct access to the provider's implementation of the corresponding service.

(i) Message queues expose the ability to send and receive data in which message boundaries are maintained. They act as FIFOs, with messages arriving from a remote sender being matched with enqueued receive requests in the order that they are received by the local provider.

(ii) Tag matching is similar to message queues in that it maintains message boundaries, but differs in that received messages are directed to enqueued receive requests based on small steering tags that are carried in the message.

(iii) RMA stands for "Remote Memory Access". RMA transfers allow an application to write data from local memory directly into a specified memory location in a target process, or to read data into local memory directly from a specified memory location in a target process.

(iv) Atomic operations are similar to RMA transfers in that they allow direct access to a specified memory location in a target process, but are different in that they allow for manipulation of the value found in that memory, such as incrementing or decrementing it.

Data transfer interfaces are designed to eliminate branches that would occur within the provider implementation and reduce the number of memory references, for example, by enabling the provider to preformat command buffers to further reduce the number of instructions executed in a transfer.
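As a sketch of the tag-matching service just described (ours; the tag value is hypothetical, and 'ep' is assumed to be an enabled endpoint opened with the FI_TAGGED capability), a posted receive is matched by its steering tag rather than by arrival order alone:

#include <rdma/fi_tagged.h>

#define GREETING_TAG 0x1234  /* hypothetical steering tag */

/* Receiver: match GREETING_TAG exactly (ignore mask 0), any source. */
ssize_t post_tagged_recv(struct fid_ep *ep, void *buf, size_t len)
{
    return fi_trecv(ep, buf, len, NULL /* desc */, FI_ADDR_UNSPEC,
                    GREETING_TAG, 0 /* ignore bits */, NULL /* context */);
}

/* Sender: the tag travels with the message and steers it to the
 * matching receive posted above. */
ssize_t send_tagged(struct fid_ep *ep, const void *buf, size_t len,
                    fi_addr_t peer)
{
    return fi_tsend(ep, buf, len, NULL /* desc */, peer,
                    GREETING_TAG, NULL /* context */);
}

Completion of both calls is reported asynchronously through the completion services of Section III-C.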

IV. OBJECT MODEL

The libfabric architecture is based on object-oriented design concepts. At a high-level, individual fabric services are associated with a set of interfaces. For example, RMA services are accessible using a set of well-defined functions. Interface sets are associated with objects exposed by libfabric. The relationship between an object and an interface set is roughly similar to that between an object-oriented class and its member functions, although the actual implementation differs for performance and scalability reasons.

An object is configured based on the results of the discovery services. In order to enable optimized code paths between the application and fabric hardware, providers dynamically associate objects with interface sets based on the modes supported by the provider and the capabilities requested by the application.

Figure 2 shows a high-level view of the parent-child relationships between libfabric objects.

Figure 2: Object Model of libfabric

(i) Fabric: A fabric represents a collection of hardware and software resources that access a single physical or virtual network. All network ports on a system that can communicate with each other through the fabric belong to the same fabric domain. A fabric not only includes local and remote NICs, but corresponding software, switches, routers, and any necessary fabric or subnet management components.

(ii) Domain: A domain represents a logical connection into a fabric. For example, a domain may map to a physical or virtual NIC. A domain defines the boundary within which fabric resources may be associated. Each domain belongs to a single fabric.

The properties of a domain describe how associated resources will be used. Domain attributes include information about the application's threading model, and how fabric resources may be distributed among threads. A domain also defines interactions that occur between endpoints, completion queues and counters, and address vectors. The intent is for an application to convey enough information that the provider can select an optimized implementation tailored to its needs.

(iii) Passive Endpoint: Passive endpoints are used by connection-oriented protocols to listen for incoming connection requests, conceptually equivalent to listening sockets.

(iv) Active Endpoint: An active endpoint (or, simply, endpoint) represents a communication portal, and is conceptually similar to a socket. All data transfer operations are initiated on endpoints.

Endpoints are usually associated with a transmit context and/or a receive context. These contexts are often implemented using hardware queues that are mapped directly into the process's address space, which enables bypassing the operating system kernel for data transfers. Data transfer requests are converted by the underlying provider into commands that are inserted into transmit and/or receive contexts.

A more advanced usage model of endpoints allows for resource sharing. Because transmit and receive contexts may be associated with limited hardware resources, libfabric defines mechanisms for sharing contexts among multiple endpoints. Shared contexts allow an application or resource manager to prioritize where resources are allocated and how shared hardware resources should be used.

In contrast with shared contexts, the final endpoint model is known as a scalable endpoint. Scalable endpoints allow a single endpoint to take advantage of multiple underlying hardware resources by having multiple transmit and/or receive contexts. Scalable contexts allow applications to separate resources to avoid thread synchronization or data ordering restrictions, without increasing the amount of memory needed for addressing.

(v) Event Queue: An event queue (EQ) is used to collect and report the completion of asynchronous operations and events. It handles control events that are not directly associated with data transfer operations, such as connection requests and asynchronous errors.

(vi) Completion Queue: A completion queue (CQ) is a high-performance queue used to report the completion of data transfer operations. Transmit and receive contexts are associated with completion queues. The format of events read from a completion queue is determined by an application. This enables compact data structures with minimal writes to memory. Additionally, the CQ interfaces are optimized around reporting operations that complete successfully, with error completions handled "out of band". This allows error events to report additional data without incurring additional overhead that would be unnecessary in the common case of a successful transfer.

(vii) Completion Counter: A completion counter is a lightweight alternative to a completion queue, in that its use simply increments a counter rather than placing an entry into a queue. Similar to CQs, an endpoint is associated with one or more counters. However, counters provide finer granularity in the types of completions that they can track.
(viii) Wait Set: A wait set provides a single underlying wait object to be signaled whenever a specified condition occurs on an event queue, completion queue, or counter belonging to the set. Wait sets enable optimized methods for suspending and signaling threads. Applications can request that a specific type of wait object be used, such as a file descriptor, or allow the provider to select an optimal object. The latter grants flexibility in current or future underlying implementations.

(ix) Poll Set: Although libfabric is architected to support providers that offload data transfers directly into hardware, it supports providers that use the host CPU to progress operations. Libfabric defines a manual progress model where the application agrees to use its threads for this purpose, avoiding the need for underlying software libraries to allocate additional threads. A poll set enables applications to group together completion queues or counters, allowing one poll call to make progress on multiple completions.

(x) Memory Region: A memory region describes an application's local memory buffers. In order for a fabric provider to access application memory during certain types of data transfer operations, such as RMA and atomic operations, the application must first grant the appropriate permissions to the fabric provider by constructing a memory region. Libfabric defines multiple modes for creating memory regions. It supports a method that aligns well with existing InfiniBand™ and iWARP™ hardware, but, in order to scale to millions of peers, also allows for addressing using offsets and user-specified memory keys.

(xi) Address Vector: An address vector is used by connectionless endpoints to map higher-level addresses which may be more natural for an application to use, such as IP addresses, into fabric-specific addresses. This allows providers to reduce the amount of memory required to maintain large address look-up tables, and to eliminate expensive address resolution and look-up methods during data transfer operations.

Libfabric borrowed and expanded on concepts found in other APIs, then brought them together in an extensible framework. Additional objects can easily be introduced, or new interfaces to an existing object can be added. However, object definitions and interfaces are designed specifically to promote software scaling and low latency, where needed. Effort went into ensuring that objects provide the correct level of abstraction in order to avoid inefficiencies in either the application or the provider.
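Pulling the object model together, the following consolidated sketch (ours; error handling is abbreviated and the CQ attributes are illustrative) opens the objects of Figure 2 parent-first, binds a completion queue to the endpoint, and enables the endpoint before use:

#include <rdma/fabric.h>
#include <rdma/fi_domain.h>
#include <rdma/fi_endpoint.h>

int open_object_tree(struct fi_info *info)
{
    struct fid_fabric *fabric;
    struct fid_domain *domain;
    struct fid_ep *ep;
    struct fid_cq *cq;
    struct fi_cq_attr cq_attr = {
        .format = FI_CQ_FORMAT_CONTEXT,  /* compact completion entries */
    };
    int ret;

    ret = fi_fabric(info->fabric_attr, &fabric, NULL);  /* (i) fabric    */
    if (ret)
        return ret;
    ret = fi_domain(fabric, info, &domain, NULL);       /* (ii) domain   */
    if (ret)
        return ret;
    ret = fi_endpoint(domain, info, &ep, NULL);         /* (iv) endpoint */
    if (ret)
        return ret;
    ret = fi_cq_open(domain, &cq_attr, &cq, NULL);      /* (vi) CQ       */
    if (ret)
        return ret;

    /* Bind the CQ to the endpoint's transmit and receive contexts,
     * then enable the endpoint for data transfers. */
    ret = fi_ep_bind(ep, &cq->fid, FI_TRANSMIT | FI_RECV);
    if (ret)
        return ret;
    ret = fi_enable(ep);
    if (ret)
        return ret;

    /* ... data transfers ... Teardown closes children before parents. */
    fi_close(&ep->fid);
    fi_close(&cq->fid);
    fi_close(&domain->fid);
    fi_close(&fabric->fid);
    return 0;
}

For unconnected endpoints, an address vector must also be bound before the endpoint is enabled, as sketched in Section VI-A below.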

V. CURRENT STATE

An initial (1.0) release of libfabric is now available [7, 8] with complete user-level documentation ("man pages") [9]. New releases are planned quarterly. This release provides enough support for HPC applications to adapt to using its interfaces. Areas where improvements can be made should be reported back to the OFI working group, either by posting concerns to the ofiwg mailing list, bringing them to the working group's attention during one of the weekly conference calls, or by opening an issue in the libfabric GitHub™ database.

Although the API defined by the 1.0 release is intended to enable optimized code paths, provider optimizations that take advantage of those features will be phased in over the next several releases.

The 1.0 release supports several providers. A sockets provider is included for developmental purposes, and runs on both Linux and Mac OS X systems. It implements the full set of features exposed by libfabric. A general verbs provider allows libfabric to run over hardware that supports the libibverbs interface. A "usnic" provider supports the usNIC (user-space NIC) feature of Cisco's Virtualized Interface Card (VIC) hardware. Finally, the PSM provider supports Intel's Performance Scaled Messaging (PSM) interface.

In addition to the current providers, support for additional hardware is actively under development. A Cray Aries network provider will be available in a post 1.0 release. An MXM provider will enhance libfabric support for Mellanox hardware. And future support will also include Intel's new Omni Path Architecture. Optimizations are also under development for select hardware and vendors. The details of this work will become available as it moves closer to completion, and may be tracked through the GitHub repositories.

VI. ANALYSIS OF THE INTERFACE

The libfabric interfaces aim to achieve multiple objectives. Among them are hardware implementation independence, improved software scalability, and decreased software overhead. In order to analyze whether the proposed interfaces meet these objectives, we provide a comparison of using libfabric alongside a widely used interface for HPC middleware, libibverbs. In order to ensure as fair a comparison as reasonable, we restrict the comparison to both using InfiniBand based hardware, the architecture that libibverbs is based upon. Additionally, we focus on the impact that the API itself has on application performance, not differences that may arise as a result of the underlying implementation.

A. Scalability

The address vector interfaces of libfabric are specifically designed to improve software scalability. For the purposes of this comparison, we analyze the memory footprint needed to access peer processes when using an unconnected endpoint. A summary for a 64-bit platform is shown in Figure 3.

Figure 3: libibverbs compared to libfabric when accessing an unconnected endpoint

An application that uses the libibverbs interface requires a total of 36 bytes of related addressing metadata for every remote peer. Applications submitting a transfer request must provide 3 input parameters: a pointer to an address handle structure (ibv_ah), a destination queue pair number (QPN), and a queue key (qkey). For this analysis, it is assumed that an application uses a single qkey, which may be ignored. An address handle is required to send to each peer process, and is allocated by the libibverbs library. The minimal size of the address handle is given in Figure 3, and consists of two pointers, plus a 32-bit kernel identifier or handle. To account for data alignment, we include 4 additional bytes of padding which we will make use of below.

With libfabric, data transfers require that applications provide an input value of data type fi_addr_t. This is defined as a uint64. Address vectors in libfabric can be one of two different types: FI_AV_TABLE or FI_AV_MAP. In the case of FI_AV_TABLE, applications reference peers using a simple index. For applications, such as MPI or SHMEM, the index can be mapped either to rank or PE number, eliminating any application level storage needed for addressing. With FI_AV_MAP, however, applications must store an opaque fi_addr_t address, which requires 8 bytes of storage per peer.

It should be noted that the memory footprint required by an application to use an interface is meaningless if the metadata is simply moved underneath the interface. Although the amount of metadata ultimately required is fabric specific, for InfiniBand hardware, the metadata required to send a transfer to an unconnected queue pair is shown in Figure 4.

Figure 4: InfiniBand metadata for an unconnected transfer

An InfiniBand path within a subnet is defined as a tuple: <SLID, DLID, SL>. Packets carry this metadata when transferring between peer endpoints. As mentioned, unconnected transfers also require a qkey (which may be constant for a given job) and a qpn (which is chosen randomly for each peer queue pair). With the SLID determined by the local endpoint, to reach a given peer requires storing the tuple: <DLID, SL, QPN>. As shown in Figure 4, this requires 6 bytes of metadata. (Note that the SL is only 4 bits, but is expanded to use a full byte.) The QPN is already accounted for in the libibverbs API (Figure 3). However, a libibverbs provider must maintain the DLID and SL for each destination. We propose that a provider may store this metadata in the address handle (ibv_ah) in the space where a compiler would normally provide structure padding. This optimally keeps the memory footprint to 36 bytes per peer.

Using similar metadata, Figure 4 shows that the libfabric address vector of type FI_AV_TABLE reduces the memory footprint to only 6 bytes, which requires a table lookup on any transfer call, a cost similar to dereferencing a pointer to struct ibv_ah. If FI_AV_MAP is used, the memory footprint remains at 8 bytes per peer, because the fi_addr_t encodes the <DLID, SL, QPN> tuple directly, with the extra bits unused.

With both address vector types, the memory footprint is reduced approximately 80%. Application selection of address vector type then becomes a matter of further optimizing for memory footprint or execution time on transfers.
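The FI_AV_TABLE usage analyzed above can be sketched as follows (our illustration; the out-of-band exchange that produces 'raw_addrs' is assumed, and the binding must occur before the endpoint is enabled). Because the fi_addr_t of peer i is simply its insertion index, an MPI rank or SHMEM PE number needs no per-peer storage in the application:

#include <stdlib.h>
#include <rdma/fi_domain.h>
#include <rdma/fi_endpoint.h>

int build_av_table(struct fid_domain *domain, struct fid_ep *ep,
                   void *raw_addrs, size_t npeers, struct fid_av **av_out)
{
    struct fi_av_attr av_attr = {
        .type = FI_AV_TABLE,  /* index-based: fi_addr_t of peer i is i */
        .count = npeers,      /* sizing hint for the provider          */
    };
    struct fid_av *av;
    fi_addr_t *tmp;
    int ret;

    ret = fi_av_open(domain, &av_attr, &av, NULL);
    if (ret)
        return ret;
    ret = fi_ep_bind(ep, &av->fid, 0);
    if (ret)
        return ret;

    tmp = malloc(npeers * sizeof *tmp);
    if (!tmp)
        return -1;
    ret = fi_av_insert(av, raw_addrs, npeers, tmp, 0, NULL);
    /* With FI_AV_TABLE, tmp[i] == i, so the returned addresses can be
     * discarded and peers indexed directly by rank or PE number. */
    free(tmp);
    if (ret != (int) npeers)
        return -1;

    *av_out = av;
    return 0;
}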

B. Performance Optimizations

To analyze performance optimizations, we examine the transmit code path that results from the interface definition itself. For this analysis, we measure the count and number of bytes of memory writes that an application must invoke in order to use the interface. Additionally, we determine where the API results in the underlying provider needing to take conditional branches, including structured loops, in order to ensure correct operation. Figure 5 summarizes this analysis.

The libibverbs interface uses a single interface, ibv_post_send, into the provider to transmit data, with the following prototype:

int ibv_post_send(struct ibv_qp *qp,
                  struct ibv_send_wr *wr,
                  struct ibv_send_wr **bad_wr);

In order to send a transfer to an unconnected peer, the fields shown in Figure 5 must be filled out. Additionally, wr, a reference to the send work request structure, plus the bad_wr parameter, must be written to the stack. The net result is that 14 memory writes for 84 total bytes of metadata must be written by the application as part of the transmit operation. (The qp parameter and its corresponding libfabric ep parameter are ignored for this study.)

Figure 5: Transmit code path comparison

Furthermore, if we examine the work request fields, we can identify those which will result in branches in the underlying provider. Send work requests may be chained together, and the number of SGEs is variable. Both of these fields result in for-loops in the provider code. The type of operation is specified through the opcode field. Since the opcode and flags fields determine which of the other fields of the work request are valid, providers must check these fields and act accordingly. Lastly, libibverbs defines a single entry point into the provider. Transfer operations on all queue pairs, regardless of their type, branch off from the single ibv_post_send entry point. In total, a transmit call will take at least 5 branches before the request can be written to hardware. (A code review of one provider showed that at least 19 branches would be taken, though we suspect that number could be reduced through code restructuring.)

Libfabric, on the other hand, associates entry points per endpoint, and provides multiple calls, similar to the socket transfer calls send, write, writev, and sendmsg. Data transfer flags are specified as part of endpoint initialization, which enables them to be removed from the transmit path. For a transmit call that sends a single message, the libfabric API requires that applications write 5 values onto the stack, for 40 total bytes of metadata. The API itself does not result in provider branches.
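The contrast can be seen in code. Below is a sketch (ours; buffer, key, and addressing values are placeholders) of the two transmit paths: the libibverbs UD path marshals a work request whose chained list, variable SGE count, opcode, and flags all force provider-side branches, while the libfabric path passes five scalar arguments whose interpretation was fixed at endpoint initialization:

#include <stdint.h>
#include <infiniband/verbs.h>
#include <rdma/fi_endpoint.h>

/* libibverbs: fill out a send work request, then post it. */
int verbs_send(struct ibv_qp *qp, struct ibv_ah *ah, uint32_t qpn,
               uint32_t qkey, void *buf, uint32_t len, uint32_t lkey)
{
    struct ibv_sge sge = {
        .addr = (uintptr_t) buf, .length = len, .lkey = lkey,
    };
    struct ibv_send_wr wr = {
        .next = NULL,                     /* chain: provider walks a list */
        .sg_list = &sge,
        .num_sge = 1,                     /* variable: provider loops     */
        .opcode = IBV_WR_SEND,            /* provider switches on opcode  */
        .send_flags = IBV_SEND_SIGNALED,  /* provider parses flags        */
    };
    struct ibv_send_wr *bad_wr;

    wr.wr.ud.ah = ah;                     /* unconnected (UD) addressing  */
    wr.wr.ud.remote_qpn = qpn;
    wr.wr.ud.remote_qkey = qkey;
    return ibv_post_send(qp, &wr, &bad_wr);
}

/* libfabric: one call, five arguments, no work-request marshaling. */
ssize_t ofi_send(struct fid_ep *ep, const void *buf, size_t len,
                 void *desc, fi_addr_t peer)
{
    return fi_send(ep, buf, len, desc, peer, NULL /* op context */);
}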
VII. SUMMARY

The libfabric interfaces were co-designed between application developers and hardware providers. The result is an API not just tailored to meet application needs, but also designed to allow for efficient implementation. Libfabric was designed to map well to MPI, SHMEM, PGAS, DBMS, and socket applications. In addition, strong consideration was given to how those interfaces could be implemented over a variety of hardware and fabrics.

The initial focus of the OFIWG was quickly enabling libfabric over a variety of hardware, so that application developers could begin porting their applications and exploring its use. The next phase of libfabric development is focused on the creation of native, optimized providers.

REFERENCES

[1] OpenFabrics Alliance, "http://www.openfabrics.org."
[2] InfiniBand Trade Association, "InfiniBand Architecture Specification Volume 1, Release 1.2.1," Nov. 2007.
[3] R. Recio, B. Metzler, P. Culley, J. Hilland, and D. Garcia, "A Remote Direct Memory Access Protocol Specification," RFC 5040, Oct. 2007. [Online]. Available: http://www.ietf.org/rfc/rfc5040.txt
[4] H. Shah, J. Pinkerton, R. Recio, and P. Culley, "Direct Data Placement over Reliable Transports," RFC 5041, Oct. 2007. [Online]. Available: http://www.ietf.org/rfc/rfc5041.txt
[5] P. Culley, U. Elzur, R. Recio, S. Bailey, and J. Carrier, "Marker PDU Aligned Framing for TCP Specification," RFC 5044, Oct. 2007. [Online]. Available: http://www.ietf.org/rfc/rfc5044.txt
[6] InfiniBand Trade Association, "Supplement to InfiniBand Architecture Specification Volume 1, Release 1.2.1: Annex A16: RDMA over Converged Ethernet (RoCE)," Apr. 2010.
[7] OpenFabrics Interfaces, "https://github.com/ofiwg/libfabric."
[8] Libfabric Programmer's Manual, "http://ofiwg.github.io/libfabric."
[9] Libfabric man pages, v1.0.0 release, "http://ofiwg.github.io/libfabric/v1.0.0/man."
