Ross R, Amvrosiadis G, Carns P et al. Mochi: Composing data services for high-performance computing environments. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 35(1): 121–144 Jan. 2020. DOI 10.1007/s11390-020-9802-0

Mochi: Composing Data Services for High-Performance Computing Environments

Robert B. Ross1, George Amvrosiadis2, Philip Carns1, Charles D. Cranor2, Matthieu Dorier1, Kevin Harms1, Greg Ganger2, Garth Gibson3, Samuel K. Gutierrez4, Robert Latham1, Bob Robey4, Dana Robinson5, Bradley Settlemyer4, Galen Shipman4, Shane Snyder1, Jerome Soumagne5, and Qing Zheng2

1 Argonne National Laboratory, Lemont, IL 60439, U.S.A.
2 Parallel Data Laboratory, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.
3 Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
4 Los Alamos National Laboratory, Los Alamos, NM, U.S.A.
5 The HDF Group, Champaign, IL, U.S.A.

E-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
E-mail: [email protected]; [email protected]; [email protected]; [email protected]
E-mail: [email protected]; [email protected]; [email protected]; {bws, gshipman}@lanl.gov
E-mail: [email protected]; [email protected]; [email protected]

Received July 1, 2019; revised November 2, 2019.

Abstract    Technology enhancements and the growing breadth of application workflows running on high-performance computing (HPC) platforms drive the development of new data services that provide high performance on these new platforms, provide capable and productive interfaces and abstractions for a variety of applications, and are readily adapted when new technologies are deployed. The Mochi framework enables composition of specialized distributed data services from a collection of connectable modules and subservices. Rather than forcing all applications to use a one-size-fits-all data staging and I/O software configuration, Mochi allows each application to use a data service specialized to its needs and access patterns.
This paper introduces the Mochi framework and methodology. The Mochi core components and microservices are described. Examples of the application of the Mochi methodology to the development of four specialized services are detailed. Finally, a performance evaluation of a Mochi core component, a Mochi microservice, and a composed service providing an object model is performed. The paper concludes by positioning Mochi relative to related work in the HPC space and indicating directions for future work.

Keywords    storage and I/O, data-intensive computing, distributed services, high-performance computing

Regular Paper. Special Section on Selected I/O Technologies for High-Performance Computing and Data Analytics.

This work is in part supported by the Director, Office of Advanced Scientific Computing Research, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357; in part supported by the Exascale Computing Project under Grant No. 17-SC-20-SC, a joint project of the U.S. Department of Energy's Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation's exascale computing imperative; and in part supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program.

©Institute of Computing Technology, Chinese Academy of Sciences & Springer Nature Singapore Pte Ltd. 2020

1 Introduction

The technologies for storing and transmitting data are in a period of rapid technological change. Non-volatile storage technologies such as 3D NAND [1] provide new levels of performance for persistent storage, while emerging memory technologies such as 3D XPoint [2] blur the lines between memory and storage. The performance, cost, and durability of these technologies motivate the development of complex, mixed deployments to obtain the highest value. At the same time, networking technologies are also improving rapidly. The availability of high-radix routers has fostered the adoption of very low-diameter network architectures such as Dragonfly [3], Slim Fly [4], and Megafly [5] (a.k.a. Dragonfly+ [6]). Coupled with advanced remote direct memory access (RDMA) capabilities, single-microsecond access times to remote data are feasible at the hardware level. Because of the performance sensitivity of high-performance computing (HPC) applications, HPC facilities are often early adopters of these cutting-edge technologies.

The application mix at HPC facilities and the associated data needs are undergoing change as well. Data-intensive applications have begun to consume a significant fraction of compute cycles. These applications exhibit very different patterns of data use from the checkpoint/restart behavior common with computational science simulations. Specifically, the applications rely more heavily on read access throughout the course of workflow execution, and they access and generate data more irregularly than their bulk-synchronous counterparts do.

Additionally, the types of data these workflows operate on differ significantly from the structured multidimensional data common in simulations: raw data from sensors and collections of text or other unstructured data are often common components. Machine learning and artificial intelligence algorithms are increasingly a component of scientific workflows as well. For example, researchers have employed the Cori system① at the National Energy Research Scientific Computing (NERSC) Center② for training models to identify new massive supersymmetric ("RPV-Susy") particles in Large Hadron Collider experiments③. As with data-intensive applications, these workloads exhibit irregular accesses, a greater prominence of read accesses, and nontraditional types of data.

Further complicating the situation is that many modern workflows are in fact a mix of these different types of activities. For example, recent work investigating optical films for photovoltaic applications employed HPC resources for a multiphase workflow [7]. The first phase of the workflow extracted candidate materials from the academic literature by using natural language processing. In the second phase, many small jobs filtered materials to remove inappropriate (e.g., poisonous) materials. The third phase relied on traditional quantum chemistry simulation, resulting in a set of materials for experimental analysis. Each phase of the workflow has its own unique data needs.

I/O Service Specialization: A Requirement of Technology and Application Diversity. The combination of rapid technological advancement and adoption along with an influx of new applications calls for the development of an ecosystem of new data services that both make efficient use of these technologies and provide high-performance implementations of the capabilities needed by applications. No single service model (e.g., key-value (KV) store, document store, file system) is understood to meet this wide variety of needs. A report from NERSC [8] states the following:

"Ensuring that users, applications, and workflows will be ready for this transition will require immediate investment in testbeds that incorporate both new non-volatile storage technologies and advanced object storage software systems that effectively use them. These testbeds will also provide a foundation on which a new class of data management tools can be built ...."

However, the question of how to enable the required ecosystem of such data services remains open. Without significant code reuse, the cost of each implementation is too high for an ecosystem to develop; and without significant componentization, no single implementation is likely to be readily ported from one platform to another.

Composition Model for Providing Specialized I/O Software with Mochi. Mochi directly addresses the challenges of specialization, rapid development, and ease of porting through a composition model. The Mochi framework provides a methodology and tools for communication, data storage, concurrency management, and group membership for rapidly developing distributed data services. Mochi components provide remotely accessible building blocks such as blob and KV stores that have been optimized for modern hardware. Together, these tools enable teams to quickly build new services catering to the needs of specific applications, to extract high performance from specific platforms, and to move to new platforms as they are brought online. In doing so, Mochi is enabling an ecosystem of services to develop and mature, supporting the breadth of HPC users on a wide variety of underlying technologies.

Contributions. This paper makes the following contributions: 1) presents the Mochi components and methodology for building services; 2) describes a new service composed from Mochi components; 3) evaluates a set of Mochi components and services on modern hardware.

① http://www.nersc.gov/users/computational-systems/cori/, Nov. 2019.
② https://www.nersc.gov, Nov. 2019.
③ http://home.cern/topics/large-hadron-collider, Nov. 2019.

2 Mochi

In this section we describe the components of the Mochi framework, the ... (SSG), provides a concept of a group of providers and aids in tracking membership. ABT-IO provides hooks to POSIX I/O capabilities. Lastly, Argobots [9] ...
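The composition model described above, in which reusable, independently deployable building blocks such as a KV store and a blob store are wired together into an application-specific service (here, an object model), can be illustrated with a small conceptual sketch. The sketch below is purely illustrative: the class and method names (`KVStore`, `BlobStore`, `ObjectStore`, and so on) are hypothetical stand-ins and do not correspond to the actual Mochi C APIs (Margo, SSG, ABT-IO), which operate over RPC/RDMA rather than in-process calls.

```python
# Conceptual sketch of Mochi-style composition: a toy "object store"
# assembled from two reusable building blocks -- a KV component for
# name-to-location metadata and a blob component for bulk data.
# All names are hypothetical illustrations, not Mochi APIs.

class KVStore:
    """Toy key-value building block (stands in for a KV microservice)."""
    def __init__(self):
        self._table = {}

    def put(self, key, value):
        self._table[key] = value

    def get(self, key):
        return self._table[key]


class BlobStore:
    """Toy blob building block (stands in for a blob/region microservice)."""
    def __init__(self):
        self._blobs = []

    def create(self, data: bytes) -> int:
        self._blobs.append(data)
        return len(self._blobs) - 1  # blob id

    def read(self, blob_id: int) -> bytes:
        return self._blobs[blob_id]


class ObjectStore:
    """Composed service: object names map (via the KV component) to
    blob ids holding the object data (in the blob component)."""
    def __init__(self, kv: KVStore, blobs: BlobStore):
        self.kv = kv
        self.blobs = blobs

    def write_object(self, name: str, data: bytes):
        self.kv.put(name, self.blobs.create(data))

    def read_object(self, name: str) -> bytes:
        return self.blobs.read(self.kv.get(name))


store = ObjectStore(KVStore(), BlobStore())
store.write_object("particle-data", b"\x01\x02\x03")
print(store.read_object("particle-data"))  # b'\x01\x02\x03'
```

The point of the sketch is the separation of concerns: neither building block knows about objects, and the composed service contains only the glue logic, so either component could be swapped for an implementation tuned to a different storage technology without changing the service's interface.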