Bridging the Gap Between HPC and Iaas

1 Bridging the Gap Between HPC and IaaS Andrew J. Younge and Geoffrey C. Fox Pervasive Technology Institute, Indiana University 2729 E 10th St., Bloomington, IN 47408, U.S.A. Email: fajyounge,[email protected] Abstract—With the advent of virtualization and of use with clouds often have implicit performance impacts Infrastructure-as-a-Service (IaaS), the broader scientific that, up until now, have been overlooked. computing community is considering the use of clouds for their technical computing needs. This is due to the relative II. PERFORMANCE scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for As with the HPC industry, performance must be a first class data-intensive applications. However, there is still a notable gap function of the new cloud architecture. In the Magellan Project that exists between the performance of IaaS when compared to [2], one of the major obstacles identified with virtualized envi- typical high performance computing (HPC) resources, limiting the applicability of IaaS for many potential users. ronments is the performance gap when compared to supercom- This work proposes to bridge the gap between supercomput- puters. While virtualization can induce overhead to the com- ing and clouds, specifically by enabling both tightly coupled putation, clouds or virtualization do not intrinsically impose applications and distributed data-intensive applications under any theoretical limits to performance by design. Rather, the a single, unified framework. Through heterogeneous hardware, implementation of such technologies leads to the bottlenecks performance-tuned virtualization technologies, advanced I/O interconnects, and unique open-source IaaS software, a distributed, and limitations identified by the Magellan project. Like many high performance cloud computing architecture is formed. new technologies some overhead may be unavoidable, but the goal is to minimize the overhead whenever possible and, in time, consider virtualization’s overhead in the same light as I. INTRODUCTION we do with language compiler overhead today. A detailed analysis of the overhead of various virtualization Scientific computing endeavours have created clusters, technologies has already commenced [3]. From a compu- grids, and supercomputers as high performance computing tational perspective, the amount of overhead introduced by (HPC) platforms and paradigms, which are capable of tack- virtualization needs to be minimized or eliminated entirely. ling non-trivial parallel problems. HPC resources continually This applies not only to typical floating-point operations strive for the best possible computational performance on the common in MPI applications and represented in the Top500 cusp of Moore’s Law. This pursuit can be seen through the [4], [5], but also with data-intensive calculations as embodied focus on cutting edge architectures, high-speed, low-latency by the Graph500 benchmark [6]. Using HPC benchmarks interconnects, parallel languages, and dynamic libraries, all provide an ideal opportunity to model applications on real tuned to maximize computational efficiency. Performance has hardware implementations. These and other benchmarks can been the keystone of HPC since its conception, with many perform well in IaaS when based on a design backed by ways to evaluate performance. The Top500 supercomputer experimentally-verified practices. ranking system, which has been running for over 20 years, However, CPU and memory utilization is only one aspect of is a hallmark to performance’s importance in supercomputing. supercomputing application performance. Another key focal Many industry leaders have focused on leveraging the point is the I/O bandwidth and interconnect performance. economies of scale from data center operations and advanced Advanced networking technologies have been relatively un- virtualization technologies to service two classes of prob- available in clouds, which often only support Gigabit Eth- lems: handling millions of user interactions concurrently or ernet (the only notable exception is the use of 10GbE in organizing, cataloging, and retrieving mountains of data in Amazon Cluster Compute Instances). InfiniBand and MMP short order. The result of these efforts has led to the advent interconnects, the backbone of the HPC industry, have largely of cloud computing, which leverages data center operations, gone unused in clouds. Work on the Palacios project has virtualization, and a unified and user-friendly interface to looked at supporting 10GbE networks in the VMM with interact with computational resources. This is culminated in notable success [7]. As such, a comprehensive system for the *aaS mentality, where everything is delivered as a service. connecting and interweaving virtual machines though a high This model treats both data and compute resources as a speed interconnects is paramount to the success of a high commodity and places an immediate and well defined value performance IaaS architecture. on the cost of using large resources [1]. Users are able to scale Two technologies have become available to enable virtu- their needs from a single small compute instance to thousands alzied environments to leverage the same interconnects as or more instances aggregated together in a single data center. many supercomputers; hardware-assisted I/O virtualization Clouds also provide access to complex parallel resources (VT-d or IOMMU) and Single Root I/O Virtualization (SR- through simple interfaces, which have enabled a new class of IOV). With VT-d, PCI-based hardware passed directly to a internet applications and tools ranging from social networking guest VM, thereby removing the overhead of communicating to cataloging the world’s knowledge. Unfortunately, the ease with the host OS through emulated drivers. As a result, VMs 2 gain direct access to InfiniBand adapters and use drivers V. DIRECTION without emulation or modification to achieve near-native I/O Currently implementation and experimentation is under way performance. Second, with the use of SR-IOV, adapter func- to evaluate the applicability of a high performance IaaS archi- tions can be assigned directly to a given VM. This enables IaaS tecture. Research is currently leveraging the OpenStack IaaS providers to leverage InfiniBand interconnects for applications framework, as it represents an open-source, scalable, commu- that utlize RDMA, IB Verbs, or even IP while simultanously nity driven software stack with a wide consortium of users providing a guaranteed QoS based on SR-IOV configuration. [11]. Utilizing the Xen hypervisor to obtain near-native performance of supercomputing resources with experimentally- III. HETEROGENEITY validated deployments and hardware-assisted virtualization, The heterogeneous nature of the HPC industry has existed we hypothesize VM performance overhead becoming negli- since its birth, with a number of unique architectures that gible. Furthermore, we plan to leverage advanced hardware enable novel applications within scientific computing [8]. such as InfiniBand and nVidia GPUs available within today’s However, a one-size-fits-all supercomputer or cloud that exists supercomputing resources within an OpenStack IaaS. The to serve the needs of all is unlikely to exist. This can be noted FutureGrid project [12], a distributed grid and cloud testbed, by the current state of the XSEDE project (formally TeraGrid), provides an idea test-bed to build an experimental IaaS based which hosts almost a dozen different supercomputers that are on real-world resources, eliminating the need for simulation not directly compatible. While we do not hope to adapt a one- or emulation. size-fits-all model to HPC, we instead propose to provide these While using supercomputing benchmarks provides a way to heterogeneous resources through the same unified interface evaluate experimental computational resources, we will also provided by current IaaS. Specifically, supporting a wide array look for novel scientific applications in Bioinformatics to proof hardware such as accelerator cards, GPUs, ARM & x86 vide in-situ evaluation of the proposed architecture. Looking at CPU architectures, and networking resources is now becoming large scale analysis of newly sequenced and divergent genomes possible through the use of hardware-assisted virtualization and applying computationally intensive analysis techniques or bare-metal provisioning. As a proof of concept, enabling provides an ideal use case to evaluate a distributed high nVidia Tesla GPUs provides an ideal use case, especially as performance IaaS system. A successful evaluation could lead many supercomputing resources look towards utilizing GPU to substantial impacts in Bioinfromatics and many other fields architectures as the community moves through petascale and looking to better utilize advanced computational resources. towards exascale computing. Providing heterogeneity in IaaS deployments is akin to REFERENCES many Grid computing solutions which saw a swath of meta- [1] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, schedulers and workflow models to take advantage of het- G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “A view of erogeneous resources in an imperfect user environment [9]. cloud computing,” Commun. ACM, vol. 53, no. 4, pp. 50–58, Apr. 2010. [2] K. Yelick, S. Coghlan, B. Draney, and R. S. Canon, “The magellan Applying similar abstract models of resource federation from report on cloud computing for science,”

Load more