Implementation of GPU virtualization using PCI pass-through mechanism

Chao-Tung Yang, Jung-Chun Liu, Hsien-Yi Wang & Ching-Hsien Hsu

The Journal of Supercomputing: An International Journal of High-Performance Computer Design, Analysis, and Use

ISSN 0920-8542

J Supercomput DOI 10.1007/s11227-013-1034-4






© Springer Science+Business Media New York 2013

Abstract As a general-purpose scalable parallel programming model for coding highly parallel applications, CUDA from NVIDIA provides several key abstractions: a hierarchy of thread blocks, shared memory, and barrier synchronization. It has proven to be rather effective at programming multithreaded many-core GPUs that scale transparently to hundreds of cores; as a result, scientists throughout industry and academia are using CUDA to dramatically expedite production and research codes. GPU-based clusters are likely to play an essential role in future cloud computing centers, because some computation-intensive applications may require GPUs as well as CPUs. In this paper, we adopted the PCI pass-through technology and set up virtual machines in a virtual environment; thus, we were able to use the NVIDIA graphics card and CUDA high performance computing in them as well. In this way, the virtual machine has not only the virtual CPU but also the real GPU for computing. The performance of the virtual machine is predicted to increase dramatically. This paper measures the performance difference between physical and virtual machines using CUDA, and investigates how varying the number of CPUs in virtual machines influences CUDA performance. Finally, we compare the CUDA performance of two open source virtualization hypervisor environments, with and without PCI pass-through. Through the experimental results, we are able to tell which environment is the most efficient for CUDA in a virtual environment.

C.-T. Yang (B) · J.-C. Liu · H.-Y. Wang Department of Computer Science, Tunghai University, Taichung 40704, Taiwan e-mail: [email protected] J.-C. Liu e-mail: [email protected]

C.-H. Hsu Department of Computer Science and Information Engineering, Chung Hua University, Hsinchu, Taiwan e-mail: [email protected]


Keywords CUDA · GPU virtualization · Cloud computing · PCI pass-through

1 Introduction

1.1 Motivations

Graphics processing units (GPUs) are true many-core processors with hundreds of processing elements. The GPU is a specialized microprocessor that offloads and accelerates graphics rendering from the central microprocessor. Modern GPUs are very efficient at manipulating computer graphics, and their highly parallel structures make them more effective than general-purpose CPUs over a range of complex algorithms. Currently, a CPU has only 8 cores in a single chip, but a GPU has grown to 448 cores. From the number of cores, the GPU is well suited to executing programs amenable to massive parallel processing. Although the clock frequency of cores on the GPU is lower than that of the CPU, its powerful parallel processing ability compensates for the lower frequency. So far, the GPU has been used on supercomputers: on the TOP500 list of November 2010 [1], three of the top five supercomputers were built with NVIDIA GPUs [2], and Titan, the world's fastest supercomputer according to the TOP500 list released in November 2012, was also powered by NVIDIA GPUs [1]. In recent years, the virtualization environment on the Cloud [3] has become more popular than before. The balance between performance and cost is the most important factor. To live up to the potential of server resources, virtualization technology is the main solution for running many more virtual machines on a server so that its resources can be used far more effectively. However, virtual machines have their own performance limitations, so users are restrained from running heavy computation on them. Building a virtual environment in a Cloud computing system for users has become an important trend in the last few years. Proper use of the hardware resources and computing power of each virtual machine is the aim of Infrastructure as a Service (IaaS), which is one of the service architectures of Cloud computing. Nevertheless, virtual machines are limited when the virtualization environment does not support the Compute Unified Device Architecture (CUDA); one remedy is to let virtual machines use the physical General-Purpose computing on Graphics Processing Units (GPGPU) [4, 41–43] in the real machine to assist computing. Since the GPU is a real many-core processor, the computing power of virtual machines will be increased.

1.2 Goal and contribution

In this paper, we explored various hypervisor environments for virtualization, different virtualization types on the cloud system, and several types of hardware virtualization [40]. We focused on GPU virtualization, implemented a system with a virtualization environment, and used the PCI pass-through [5] technology that enables the virtual machines on the system to use the GPU accelerator to increase their computing power. We conducted experiments to compare performance between virtual machines with GPU virtualization and PCI pass-through and the native machine with a GPU.

Then we showed the GPU performance of the virtual machine versus the native machine, and also compared the system time of virtual machines with that of the native machine. At last, we analyzed two other GPU virtualization technologies; the experimental results displayed the performance advantage of using PCI pass-through over the other GPU virtualization technologies.

1.3 Organization of paper

The rest of this work is organized as follows. Section 2 provides background reviews of Cloud computing, virtualization technology, and CUDA [6]. Section 3 describes the system implementation, architecture, and specifications of the Tesla C1060 and Tesla C2050, and the end-user's interface. Section 4 presents the experimental environment, the methods used, the results of GPU virtualization, and the improved performance of the proposed approach. Finally, conclusions are made in Sect. 5.

2 Background review

2.1 Cloud computing

Cloud computing [3] is a computing approach based on the Internet, in which users can remotely use software services and data storage in remote servers. It is a new service architecture that brings a new choice of software and data storage services to users. To use the "Cloud," users no longer need to find out details of the infrastructure in advance, do not need to possess professional knowledge, and are without direct control of the real machines that provide the services. The National Institute of Standards and Technology (NIST) defined the following five basic features for Cloud computing in April 2009 [7]:
• On-demand self-service
• Broad network access
• Resource pooling
• Rapid elasticity
• Measured service
Cloud computing can be considered to include three levels of service: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) [3]. The architecture of cloud computing is shown in Fig. 1.
• Infrastructure as a Service (IaaS): users subscribe to the required level of computers, network equipment, and other resources from the service provider, and may request changes to the settings. The cost is calculated according to the use of CPU, memory, disk space, and network load.
• Platform as a Service (PaaS): service vendors rent out computers with the necessary hardware and software development environment; developer fees are calculated in accordance with the amount of resources used.
• Software as a Service (SaaS): software stored in the data center is provided to users as a network-accessible service. Charges are per period or pay-per-order.


Fig. 1 Architecture of cloud computing

Fig. 2 Diagram of virtualization

2.2 Virtualization

Virtualization technology [8] is a technology that creates a virtual version of something, such as a hardware platform, an operating system, a storage device, or network resources. The goal of virtualization is to centralize administrative tasks while improving scalability and overall hardware-resource utilization. By using virtualization, several operating systems can be run in parallel on a single powerful server without glitches. The diagram of virtualization is shown in Fig. 2. The case of a general operating system is shown in Fig. 3. To protect instructions, there are four levels of permissions. The user's applications are executed in Ring 3 of the CPU, and the operating system is executed in Ring 0 to control the CPU and hardware. The hardware directly executes requests of the operating system and instructions of user applications.


Fig. 3 The general operating system

Fig. 4 The virtualization operating system

Figure 4 shows the schematic of the virtualized operating system. User's applications are still implemented in Ring 3, and the virtual operating system (Guest OS) is implemented in Ring 1. The original operating system becomes a Virtual Machine Manager (VMM). The Guest OS is not executed directly by the CPU; instead, the VMM translates its instructions for the CPU and other hardware to execute.

2.2.1 Full-virtualization

Unlike the traditional way, in which the operating system kernel runs at the Ring 0 level, full-virtualization uses the hypervisor instead. The hypervisor manages all instructions sent to Ring 0 from the Guest OS. Full-virtualization is shown in Fig. 5; it uses the Binary Translation technology to translate all instructions sent to Ring 0 from the Guest OS and then forwards the requests to the hardware. The hypervisor virtualizes all hardware, and the Guest OS accesses the hardware just like a real machine. It has high independence, but the Binary Translation technology reduces the performance of virtual machines.


Fig. 5 Full-virtualization [9]

Fig. 6 Para-virtualization [10]


2.2.2 Para-virtualization

Para-virtualization, shown in Fig. 6, does not virtualize all hardware. A unique Host OS called Domain0, running in parallel with the other Guest OSs, uses the native operating system to manage the hardware drivers. The Guest OS accesses the real hardware by calling the driver in Domain0 through the hypervisor. A request sent by the Guest OS to the hypervisor is called a hypercall. To let the Guest OS send hypercalls instead of requests directly to the hardware, the Guest OS's kernel needs rewriting; thus some non-open-source operating systems cannot support this. Unlike full-virtualization with its Binary Translation technology, para-virtualization lets the Guest OS use hardware through Domain0. Although the performance of virtual machines is obviously enhanced, the hardware driver is bound to Domain0 and the kernel of the Guest OS needs rewriting; thus its independence is lower than that of full-virtualization (Fig. 6).

2.2.3 Xen

As shown in Fig. 7, there are two types of host virtualization software: the Host OS type and the Hypervisor type. The VM Layer of the Host OS type is deployed on top of the Host OS, such as Windows or Linux, and then the other operating system is installed on top of the VM Layer.


Fig. 7 Host and hypervisor types

Fig. 8 Domain0 and DomainU

The operating system on top of the VM Layer is called the Guest OS. Xen's hypervisor is installed directly on the host, and the other desired operating systems are deployed on top of it; in this way, it is easier to manage the CPU, memory, networks, storage, and other resources. The main purposes of Xen [11] using the hypervisor type and a Virtual Machine Monitor (VMM) are safer and more efficient control of the host CPU, memory, and other resources. There are two types of hypervisors used by Xen: para-virtualization and full-virtualization. The features of these two types of virtualization have been described in detail in Sects. 2.2.1 and 2.2.2. Xen uses a unit called a Domain to manage virtual machines. Its Domains are divided into two types, as shown in Fig. 8. One type, called Domain0, acting like the Host OS and holding the control AP of Xen, is used for management. The other type, called DomainU, is the field where the Guest OS is installed. To use physical resources, DomainU cannot directly call the hardware driver; it must act through Domain0. In industry, Xen has been used in SUSE Linux Enterprise Server (SLES) by Novell, in Red Hat Enterprise Linux (RHEL), and in other commercial Linux versions.


Fig. 9 Architecture of KVM

In addition, Oracle also introduced a virtualization product called Oracle VM, and Sun Microsystems released xVM Server, both based on Xen. In other words, Xen has been widely supported by system vendors in virtualization software.

2.2.4 KVM

The Kernel-based Virtual Machine (KVM) [12] is part of the virtualization architecture in the Linux kernel. The architecture of KVM is shown in Fig. 9. For now, KVM supports native virtualization, that is, hardware-assisted virtualization supported by the CPU. This virtualization technology is called VT-x on Intel CPUs and AMD-V on AMD CPUs. These two kinds of CPUs use different modules to support KVM, namely kvm-intel.ko and kvm-amd.ko in Linux. The Linux kernel has included KVM since version 2.6.20, and FreeBSD uses kernel modules to support KVM. KVM's architecture consists of two parts:
• Kernel Device Driver—used to manage and simulate virtual machine hardware.
• User Space Process—QEMU, a PC hardware emulator, which becomes kqemu after being modified by KVM.
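Whether a Linux host can provide this hardware-assisted virtualization for KVM can be checked from the shell; the commands below are only a minimal sketch using the standard /proc/cpuinfo flags and the module names mentioned above, and the exact module handling may differ by distribution.

    # count CPU flags indicating Intel VT-x (vmx) or AMD-V (svm) support
    egrep -c '(vmx|svm)' /proc/cpuinfo
    # check whether the KVM modules are already loaded
    lsmod | grep kvm
    # load the Intel module if needed (use kvm-amd on AMD hosts)
    modprobe kvm-intel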

2.3 CUDA

CUDA [6, 13–19] is the parallel computing architecture developed by NVIDIA. It was the first time a C compiler was included in a development environment for the GPU; hence, CUDA's programming model maintains a low learning curve for programmers familiar with standard programming languages such as C and FORTRAN (refer to Fig. 10). The architecture of CUDA is compatible with OpenCL [20–22] and the C compiler. The instructions are transformed into PTX code by drivers, whether they come from the CUDA C language or OpenCL, and then executed by the graphics cores. As shown in Fig. 11, the processing flow of CUDA consists of four steps. The first step is to copy data from the main memory of the CPU to the memory of the GPU. In the second, the CPU instructs the GPU to process the data. In the third, the GPU executes the task in parallel on each of its cores. In the last, the results are copied from the memory of the GPU back to the main memory of the CPU.
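To make this four-step flow concrete, the following minimal CUDA C sketch (our own illustrative example, not taken from the SDK; the array size and kernel name are arbitrary) copies two input vectors to the GPU, launches a kernel, and copies the result back:

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Step 3: each GPU thread adds one pair of elements in parallel
    __global__ void vecAdd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *h_a = (float *)malloc(bytes);
        float *h_b = (float *)malloc(bytes);
        float *h_c = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

        float *d_a, *d_b, *d_c;
        cudaMalloc((void **)&d_a, bytes);
        cudaMalloc((void **)&d_b, bytes);
        cudaMalloc((void **)&d_c, bytes);

        // Step 1: copy input data from CPU (host) memory to GPU (device) memory
        cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

        // Step 2: the CPU instructs the GPU to process the data (kernel launch)
        vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);

        // Step 4: copy the result from GPU memory back to CPU main memory
        cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
        printf("c[0] = %f\n", h_c[0]);

        cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
        free(h_a); free(h_b); free(h_c);
        return 0;
    }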

2.4 Virtualization on GPU

In recent years, virtualization has become popular, and the demand for it has also increased. Common virtual machines are inadequate for such use, because the environment of the virtual machines is, after all, provided through virtualization.


Fig. 10 CUDA programming model from nVidia [2]

Fig. 11 Processing flow on CUDA from Wiki [16]

Figures 12 and 13 show the two common virtualization approaches to emulate devices and support I/O. Figure 12 shows virtualization with user-space device emulation. The Guest OS must use the emulated device created in the Host OS to communicate with the physical device. Rather than embedding the device emulation within the hypervisor, it is implemented in user space. QEMU [23], which provides not only device emulation but a hypervisor as well, supplies the device emulation and is used by a large number of independent hypervisors such as the Kernel-based Virtual Machine (KVM) and VirtualBox [24]. Figure 13 shows the other way to emulate devices. All devices and I/Os in the virtual machine are emulated by the hypervisor. This is a common method implemented within an operating system-based hypervisor. In this model, the hypervisor includes emulations of common devices that can be shared among various guest operating systems, including virtual disks, virtual network adapters, and other necessary platform elements.


Fig. 12 User space device emulation

Fig. 13 Hypervisor-based device emulation

Fig. 14 Pass-through within the hypervisor

Unlike the two kinds of device emulation described above, device pass-through provides isolation of devices to a given guest operating system, as shown in Fig. 14.


Fig. 15 Front-end and back-end

Assigning devices to specific guests is useful when those devices cannot be shared. As for performance, near-native performance can be achieved using device pass-through.

2.5 Green computing

Green computing [25, 26] means using resources effectively, for example through energy-efficient CPUs, servers, and peripherals, as well as reduced resource consumption. Green computing uses virtualization technology and power management to reach the goals of energy saving and carbon emission reduction. Virtualization is one of the most effective tools for more cost-effective, greener, energy-efficient computing, where each server is divided into multiple virtual machines that run different applications.

2.6 Related works

In recent years, the virtualization environment on the Cloud has become more and more popular. The balance between performance and cost is the most important factor that people focus on. For more effective use of resources on the server, virtualization technology is the solution: by running many virtual machines on a server, resources can be used more effectively. But the performance of virtual machines has limits, and users might be restrained from running heavy computation on virtual machines. To solve this problem, one way is to let the virtual machines use the physical GPGPU in the real machine to help with computing. The other way is using CUDA. There are several approaches for virtualization of the CUDA Runtime API for VMs, such as rCUDA [27–29], vCUDA [30], GViM [31], and gVirtuS [32]. These solutions feature a distributed middleware composed of two parts: the front-end and the back-end [33]. Figure 15 shows that the front-end middleware is installed in the virtual machine, and the back-end middleware, with direct access to the acceleration hardware, is run by the host OS executing the VMM. rCUDA uses the sockets API to let the client and server communicate with each other; through it, a client can use the GPU on a server.


Fig. 16 Architecture of rCUDA

Fig. 17 The vCUDA architecture

There is a production-ready framework to run CUDA applications from VMs, based on a recent CUDA API version. We can use this middleware to make a customized communications protocol [27]. The architecture is shown in Fig. 16. Unlike rCUDA, GViM and vCUDA do not come at the expense of losing VMM independence. The key idea in vCUDA is API call interception and redirection. With API interception and redirection, applications in VMs can access the graphics hardware device and achieve high performance for computing applications; it allows applications executing within virtual machines to leverage hardware acceleration. Shi et al. explained how to transparently access graphics hardware in VMs by API call interception and redirection [30]. Their evaluation showed that GPU acceleration for HPC applications in VMs is feasible and competitive with those running in a native, non-virtualized environment. The architecture is shown in Fig. 17. GViM is a system designed for virtualizing and managing the resources of a general purpose system accelerated by graphics processors.

GViM uses Xen-specific mechanisms for the communication between the front-end and back-end middleware. The GViM virtualization infrastructure for a GPGPU platform enables the sharing and consolidation of graphics processors. The experimental measurements of a Xen-based GViM implementation on a multicore platform with multiple attached NVIDIA graphics accelerators demonstrated small performance penalties for virtualized vs. nonvirtualized settings, coupled with substantial improvements concerning fairness in accelerator use by multiple VMs [31]. VMGL [34] provides OpenGL hardware 3D acceleration for virtual machines. OpenGL applications can run inside a virtual machine through VMGL. VMGL can be used on VMware guests, Xen HVM domains (depending on hardware virtualization extensions), and Xen paravirtual domains, using XVnc or the virtual frame buffer. VMGL is available for X11-based guest OSs: Linux, FreeBSD, and OpenSolaris. VMGL is GPU-independent and supports ATI, NVIDIA, and Intel GPUs. In Duato et al.'s work, a remote GPU was used for the virtual machine. Although this virtualization technique noticeably increases execution time when using a 1 Gbps Ethernet network, it performs almost as efficiently as a local GPU when higher performance interconnects are used. Therefore, the small overhead incurred by the remote use of GPUs is worth the savings attained by a cluster configuration with fewer GPUs than nodes [29]. Kawai et al. proposed DS-CUDA, a middleware to virtualize a GPU cluster as a distributed shared GPU system. It simplifies the development of codes that use multiple GPUs distributed over a network. Results with good scalability were shown in their paper, and the usefulness of the redundant calculation mechanism was confirmed.

3 System implementation

3.1 System architecture

To use the GPU accelerator on virtual machines, we propose to use PCI pass-through to implement a high performance system. In terms of performance, near-native performance can be achieved using device pass-through. This technology is ideal for networking applications, applications with high disk I/O, or applications that would like to use hardware accelerators and have not adopted virtualization because of contention and performance degradation through the hypervisor. Assigning devices to specific guests is also useful when those devices cannot be shared; for example, if a system included multiple video adapters, those adapters could be passed through to unique guest domains. VT-d pass-through is a technique to give a DomU exclusive access to a PCI function using the IOMMU [35] provided by VT-d. It is primarily targeted at HVM (fully virtualized) guests, because para-virtualized pass-through does not require VT-d. Importantly, the hardware must support this feature: in addition to the motherboard chipset and BIOS, the CPU must also support IOMMU I/O virtualization (VT-d). VT-d is disabled by default; to enable it, the "iommu" parameter is used (refer to Fig. 18). This paper used Xen and KVM as hypervisors, and implemented PCI pass-through to pass GPUs to virtual machines via the hypervisor as shown in Fig. 19. Figure 20 shows the user's architecture. Users can use the GPU accelerator via the Internet once the GPU virtualization environment has been set up.
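For reference, the sketch below outlines the host-side steps involved. It is only an illustrative configuration under our assumptions: the PCI address 0000:03:00.0 and the guest name vm1 are hypothetical placeholders, and the exact boot parameters and toolstack commands depend on the distribution and the Xen/KVM versions in use.

    # enable the IOMMU at boot
    #   Linux kernel command line (KVM host):    intel_iommu=on
    #   Xen hypervisor command line (Xen host):  iommu=1

    # Xen: mark the GPU as assignable, then attach it to a running guest
    xl pci-assignable-add 0000:03:00.0
    xl pci-attach vm1 0000:03:00.0

    # or statically, in the guest configuration file:
    #   pci = [ '0000:03:00.0' ]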


Fig. 18 IOMMU on

Fig. 19 System architecture

Fig. 20 User architecture


Fig. 21 Tesla T10

3.2 Tesla C1060 computing processor board

The NVIDIA Tesla C1060 [36] transforms a workstation into a high-performance computer that outperforms a small cluster. This gives technical professionals a dedicated computing resource at their desk-side that is much faster and more energy-efficient than a shared cluster in the data center. The details of the NVIDIA Tesla C1060 computing processor board's specification are shown below.
• One Tesla T10
• 240 CUDA cores
• 1.296 GHz core frequency
• 933 Gflops Single Precision
• 78 Gflops Double Precision
• 4 GB GDDR3 memory at 102 GB/s bandwidth
• 800 MHz memory frequency
A computer system with an available PCI Express ×16 slot is required for the Tesla C1060. To have the best system bandwidth between the host processor and the Tesla C1060, it is recommended (but not required) that the Tesla C1060 be installed in a PCI Express ×16 Gen2 slot. The Tesla C1060 is based on the massively parallel, many-core Tesla processor, which cooperates with the standard CUDA C programming [15] environment to simplify many-core programming. The architecture of Tesla T10 is shown in Fig. 21.

3.3 Tesla C2050 computing processor board

The NVIDIA Tesla C2050 [38] is based on the next-generation CUDA architecture codenamed "Fermi." The 20-series family of Tesla GPUs supports many "must have" features for technical and enterprise computing, including C++ support, ECC memory for uncompromised accuracy and scalability, and a 7X increase in double precision performance compared to Tesla 10-series GPUs.


Fig. 22 PCI pass-through is successful

Compared to the latest quad-core CPUs, Tesla C2050 computing processors deliver equivalent supercomputing performance at 1/10th the cost and 1/20th the power consumption. Its specifications are shown below.
• One Tesla core
• 448 CUDA cores
• 1.15 GHz core frequency
• 1.03 Tflops Single Precision
• 515 Gflops Double Precision
• 3 GB GDDR5 memory at 144 GB/s bandwidth
• 1.5 GHz memory frequency

3.4 End user’s operating interface

When users create a virtual machine and pass the GPU through to the virtual machine successfully, they can see the result through an application called "virtual machine manager" in Linux. In Fig. 22, the GPU pass-through has been built successfully and the virtualized machine is running. Alternatively, users can also use PieTTY or VNC. Users need Internet and VNC connections; after setting the IP address and port, they can connect to the virtual machine. In the console, users can use the command "lspci" to check whether the PCI pass-through is working or not. The setup processes are shown in Fig. 23 for using PieTTY, Fig. 24 for using VNC, and Fig. 25 for using lspci.
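Besides lspci, the passed-through GPU can also be verified from inside the virtual machine with a short CUDA program. The following minimal sketch (our own example) simply lists the devices visible to the CUDA runtime, where a successfully passed-through Tesla card should appear:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        // ask the CUDA runtime how many GPUs are visible in this (virtual) machine
        cudaError_t err = cudaGetDeviceCount(&count);
        if (err != cudaSuccess) {
            printf("CUDA error: %s\n", cudaGetErrorString(err));
            return 1;
        }
        for (int i = 0; i < count; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            // print the name, multiprocessor count, and core clock of each device
            printf("Device %d: %s, %d multiprocessors, %.2f GHz\n",
                   i, prop.name, prop.multiProcessorCount, prop.clockRate / 1.0e6);
        }
        return 0;
    }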

3.5 System environment

Above, we have described the design principles and implementation methods. Here, we present the experimental settings on two machines. The nodes' hardware and software specifications are listed in Table 1.


Fig. 23 Using Pietty

Fig. 24 Using VNC

Fig. 25 Using lspci

As listed in Table 1, we used two machines with the same hardware specification and two hypervisors: Xen and KVM. The purpose is to compare the performance between these two hypervisors using PCI pass-through with the same GPU. The NVS 295 [37] was used as the primary graphics card; the Tesla C1060 (or Tesla C2050) was used for computing and is passed through to the virtual machines. Table 2 lists the hardware/software specifications of the virtual machines. We created three virtual machines with the same specifications except for the number of CPUs and the kind of GPU. We wanted to find out whether the number of CPUs affects the performance of virtual machines with PCI pass-through.


Table 1 Hardware/software specification

CPU Memory Disk OS Hypervisor GPU

Node1 Xeon E5506 12GB 1TB CentOS 6.2 Xen Quadro NVS 295/Tesla C1060/Tesla C2050

Table 2 Hardware/software specification of virtual machine

CPU Memory Disk OS Hypervisor GPU Virtualization

VM1 1, 2, 4 1GB 12GB CentOS 6.2 Xen Quadro NVS 295 Full
VM2 1, 2, 4 1GB 12GB CentOS 6.2 Xen Tesla C1060 Full
VM3 1, 2, 4 1GB 12GB CentOS 6.2 Xen Tesla C2050 Full

So we used 1, 2, or 4 CPUs in the virtual machines for each kind of GPU to see the performance difference among them. We used full virtualization, because we found that PCI pass-through did not work with para-virtualization in our research. Table 3 shows the GPU software environment.

Table 3 GPU software environments
Driver 285.05.33
CUDA toolkit 4.1.28
CUDA SDK 4.1.28

4 Experimental methods and results

4.1 Experimental methods

We set up ten comparison benchmarks: alignedTypes, asyncAPI, BlackScholes, clock, convolutionSeparable, fastWalshTransform, matrixMul, bandwidthTest, matrixmul-sizeable, and VecAdd. The first seven benchmarks are part of the CUDA SDK [13]. From the benchmarks in the CUDA SDK suite, we selected seven representative SDK benchmarks of varying computation loads and data sizes with different CUDA features. These benchmarks were executed with their default settings. Another two benchmarks, matrixmul-sizeable and VecAdd, were selected since their problem sizes can be set to produce a high computation load. The execution time of each SDK benchmark was measured with the command "time" in CentOS [39]. Table 4 shows the size of data transfer for each benchmark. The first experiment is a GPU performance comparison between the native and virtual machine; we present the effect of using PCI pass-through to pass the GPU to virtual machines. The second experiment is a performance comparison between virtual machines with 1 CPU, 2 CPUs, and 4 CPUs, to see whether the number of CPUs in virtual machines affects GPU performance. The final experiment compares the GPU performance of the proposed implementation using PCI pass-through with two other GPU virtualization technologies, i.e., rCUDA and vCUDA. We will show that the implementation of GPU virtualization using PCI pass-through results in better GPU performance.
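For illustration, timing one of the SDK binaries with the time command looks like the following; the install path shown is only indicative of a default CUDA SDK 4.1 setup and may differ on other systems.

    # run an SDK benchmark under time; "real" is the wall-clock execution time,
    # while "user" and "sys" are the user-mode and kernel-mode CPU times reported
    cd ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release
    time ./matrixMul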


Table 4 Data transfers of benchmarks
SDK name Data transfer

Aligned Types 413.26 MB
Async API 128.00 MB
Black Scholes 76.29 MB
Clock 2.50 KB
Convolution Separable 36.00 MB
Fast Walsh Transform 64.00 MB
Matrix Mul 79.00 KB

4.2 Experimental results

To maximize performance, the execution time of an application should be minimized; thus, we can define the performance of a computer as

performance = 1 / execution time    (1)

Thus, to compare two different computers A and B, we have

performance_A / performance_B = execution time_B / execution time_A    (2)

Likewise, to maximize performance, the bandwidth available to applications should be maximized; thus the performance of a computer is proportional to the value of the bandwidth. We first analyzed the GPU performance with the seven CUDA SDK benchmarks running in a VM using PCI pass-through, and compared the execution times of the benchmarks between the VM and a native machine that calls the regular CUDA Runtime library in a nonvirtualized environment. The results of these experiments are reported in the plots below. Figures 26, 27, and 28 show the execution times, user times, and system times for processing the seven SDK benchmarks on the native machine and on the virtual machine with one CPU on Xen using PCI pass-through. We can see that the measured execution times of these benchmarks on the virtual machine are less than those on the native machine. The execution time consists of system time and user time. The user time, or GPU computing time, is very close when processing the same applications on both native and virtual machines; however, the system time on the native machine is more than that on the virtual machines, resulting in a smaller execution time of the SDK benchmarks on virtual machines than on the native machine. Note that the range of the y-axis on the plots is adjusted according to the range of results. The system time looks significantly different in Figs. 26 and 27, since the range of the y-axis in both figures is much smaller than that in Fig. 28.


Fig. 26 Execution time between native and VM with C1060

Fig. 27 Execution time between native and VM with C2050

Fig. 28 Execution time between native and VM with NVS295

Figures 29, 30, and 31 show the execution times, user times, and system times of the seven benchmarks on virtual machines with one and two CPUs. From the figures, we can see that the number of CPUs does not visibly affect the user time, which means that the GPU computing time does not change significantly when the number of CPUs increases from one to two.


Fig. 29 Execution time between 1 Core and 2 Core VMs with C1060

Fig. 30 Execution time between 1 Core and 2 Core VMs with C2050

Fig. 31 Execution time between 1 Core and 2 Core VMs with NVS295

Since the computing task is executed by the same GPU, the number of CPUs does not affect the user time as long as no requests are queued up due to the demand for processing on the one-core CPU.


Fig. 32 Execution time between 2 Core and 4 Core VMs with C1060

Fig. 33 Execution time between 2 Core and 4 Core VMs with C2050

Fig. 34 Execution time between 2 Core and 4 Core VMs with NVS295

Figures 32, 33, and 34 show the execution times, user times, and system times of the seven benchmarks on Xen-based virtual machines with two CPUs and with four CPUs. In the figures, it is obvious again that the number of CPUs does not have a perceivable effect on the performance of the GPU. Since the computing task is executed by the same GPU, the number of CPUs does not affect the user time as long as no requests are queued up due to the demand for processing on the two-core CPU.


Fig. 35 User time with C1060

Fig. 36 User time with C2050

Fig. 37 User time with NVS295

From Figs. 26 to 34, we demonstrate that on the same machine the execution time and user time for processing each benchmark are very close; only the system time differs significantly, which will be further illustrated. For further illustration, in Figs. 35, 36, and 37 we plot the user time, i.e., the GPU computing time, for processing each SDK benchmark. Device pass-through provides isolation of devices to a given guest operating system, as shown in Fig. 14; whether on the native machine or on virtual machines with PCI pass-through, the performance of the GPU is close as long as the I/O bandwidth is similar and no demand is queued up in the CPUs.


Fig. 38 System time with C1060

Fig. 39 System time with C2050

Fig. 40 System time with NVS295

Differences in user time among the native and virtual machines are found to be very slight. Since the user times of the benchmark "clock" are all under 0.001 s, it is not easy to tell the difference in the figures. Figures 38, 39, and 40 show the system time for processing each SDK benchmark. When using the GPU accelerator to help computing, the inner communication of the system is also important. The system time of the native machine, which calls the regular CUDA Runtime library in a nonvirtualized environment, is obviously much longer than that of the virtual machines using PCI pass-through. The system time of the virtual machine with one CPU is shorter than the others, which means that if we run programs with heavy GPU computing, we can simply use one CPU for the virtual machines to save resources on the host server for other users.
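The bandwidth results that follow (Figs. 41–43) presumably correspond to the bandwidthTest benchmark listed in Sect. 4.1. Conceptually, a host-to-device (H2D) measurement can be sketched with CUDA events as below; the buffer size and iteration count are arbitrary choices of ours, and pinned memory, which bandwidthTest can also use, is omitted for brevity.

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    int main() {
        const size_t bytes = 64 << 20;   // 64 MB test buffer (contents irrelevant)
        const int iters = 20;
        float *h_buf = (float *)malloc(bytes);
        float *d_buf;
        cudaMalloc((void **)&d_buf, bytes);

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        // time repeated host-to-device copies with CUDA events
        cudaEventRecord(start, 0);
        for (int i = 0; i < iters; ++i)
            cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);

        // report the average H2D bandwidth in MB/s, as plotted in the figures
        double mb = (double)bytes * iters / (1024.0 * 1024.0);
        printf("H2D bandwidth: %.1f MB/s\n", mb / (ms / 1000.0));

        cudaEventDestroy(start); cudaEventDestroy(stop);
        cudaFree(d_buf); free(h_buf);
        return 0;
    }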


Fig. 41 Bandwidth test with C1060

Fig. 42 Bandwidth test with C2050

Fig. 43 Bandwidth test with NVS295

In Figs. 41, 42, and 43, H2D means "Host to Device" and D2H means "Device to Host." There is another term, D2D, meaning "Device to Device," for which the measured bandwidth values are almost the same, so we skip those plots here. It is obvious that the bandwidths of the native machine are higher than the others. In the figures, the CPU numbers of the virtual machines do not have a significant effect on the bandwidth. PCI pass-through is the main factor that affects the bandwidth between the virtual machine and the GPU accelerator: on average, the bandwidth of the virtual machines is found to be about 400 MB/s lower than that of the native machine. In Figs. 44, 45, and 46, the application is VecAdd with varied problem sizes of 128, 256, 512, and 1024. The differences in execution time in these four environments, i.e., the native machine and the 1-, 2-, and 4-core virtual machines, are found to be very slight, and yet on Tesla C1060 and Quadro NVS 295 the execution times of the virtual machines are slightly shorter than those of the native machine.


Fig. 44 Execution time of VecAdd with C1060

Fig. 45 Execution time of VecAdd with C2050

Fig. 46 Execution time of VecAdd with NVS295

We think this is caused by the difference in system time between the real machine and the virtual machines, since the GPU performance results, i.e., the user times, among these four machine settings are very close. However, on Tesla C2050, the execution times of the virtual machines are slightly longer than those of the native machine. We think this is caused by the large difference in bandwidth between the native machine and the virtual machines on Tesla C2050, as shown in Fig. 42.


Fig. 47 Execution time of MatrixMul with C1060

Fig. 48 Execution time of MatrixMul with C2050

Fig. 49 Execution time of MatrixMul with NVS295

Figures 47, 48, and 49 show the execution time for processing MatrixMul with varied problem sizes of 256, 512, 1024, and 2048. In the figures, we can see results similar to those of the previous example. The execution times in these four environments are very close. The execution times of the virtual machines are also found to be slightly shorter than those of the native machine on Tesla C1060 and Quadro NVS 295, which is due to the difference in system time between the real machine and the virtual machines, since the GPU performance among these four machine settings is very close.


Fig. 50 Performance of PCI pass-through compared with rCUDA

Fig. 51 Performance of PCI pass-through compared with vCUDA

However, on Tesla C2050, the execution times of the native machine are slightly longer than those of the virtual machine, due to the higher bandwidth of the native machine on Tesla C2050 as shown in Fig. 42. The difference in execution time for problem size 256 is less than 0.1 s, so it is difficult to see the difference clearly in the figures. We also compared the performance of our implementation of GPU virtualization using PCI pass-through with rCUDA and vCUDA. Figures 50 and 51 show the comparison. The time in the figures is the difference in execution time with and without GPU virtualization; we calculated it by subtracting the time before GPU virtualization from the time after GPU virtualization. The execution times are taken from [27, 30]. From these two figures, we can see that using PCI pass-through does not add much time, since PCI pass-through provides isolation of devices to a given guest operating system so that the device, i.e., the GPU, can be used exclusively by that virtual machine. Compared with these two technologies, PCI pass-through is more efficient. Our work has shown that GPU performance is similar on the native and virtual machines; no matter how many CPUs are used in the virtual machines, the GPU provides the same performance through PCI pass-through. It is also seen that if one uses virtual machines, the system time is less than on the real machine, and the system time of the virtual machine with one CPU is less than that with four CPUs.

The inner communication in the virtual machines does not go through the real hardware but simply relies on the memory of the real machine. The data transfer time is shorter than with rCUDA because rCUDA is network based and, as seen, the speed of the network is the key factor for rCUDA; rCUDA, though, can let the virtual machine use not only a local GPU but also a remote GPU over the network. PCI pass-through is also more direct than vCUDA: vCUDA uses middleware as the connection point, which takes more time than PCI pass-through. Thus, using PCI pass-through to implement computing with GPU accelerators in virtual machines can save resources while delivering the same high performance as in real machines.

5 Conclusions and future work

5.1 Concluding remark

In this study, we found that the GPU performance is similar on the native machine and on virtual machines using PCI pass-through: no matter how many CPUs are used in the virtual machines, the GPU provides the same performance through PCI pass-through. When we use virtual machines with PCI pass-through to run applications, the system time is found to be less than on the native machine (1.6 s vs. 0.8 s, or 200 %, on NVIDIA Tesla C1060 and C2050; refer to Figs. 38 and 39). The system time of the virtual machine with one CPU is less than that of virtual machines with two or four CPUs, since the inner communication inside virtual machines does not go through real hardware but relies on the memory of the real machine. The data transfer time of our GPU virtualization implementation using PCI pass-through is shorter than that of rCUDA (by about 4.5 s for alignedTypes; refer to Fig. 50), because rCUDA is network based and the speed of the network is the key factor for rCUDA. Codes need to be rewritten to use rCUDA, but not for PCI pass-through; rCUDA, though, can let the virtual machine use not only a local GPU but also a remote GPU over the network. Our implementation of GPU virtualization using PCI pass-through is also more direct than vCUDA (the data transfer time is about 66.5 s shorter for alignedTypes with PCI pass-through; refer to Fig. 51), since vCUDA uses middleware as the connection point, which takes more time than PCI pass-through. Thus, using PCI pass-through to implement computing with GPU accelerators in virtual machines can save resources and achieve high performance similar to that of real machines.

5.2 Future work

In the future, we plan to test more GPU boards for PCI pass-through and implement GPU hot-plugging for virtual machines. One of the problems introduced with device pass-through arises when live migration is needed. Live migration is a great feature to support load balancing of VMs over a network of physical hosts, but it presents a performance bottleneck when PCI pass-through devices are used. In future work, we will use hot-plugging to address this issue, since hot-plugging allows PCI devices to join and leave a given kernel; thus GPU hot-plugging is very useful for the performance of virtual machines and of the whole system.


There is an open source monitoring system called OpenNebula, in which the interface of virtual machines can be controlled through webpages. Therefore, we may add OpenNebula to control virtual machines with GPUs using PCI pass-through.

Acknowledgement This work was supported in part by the National Science Council, Taiwan, ROC, under grant numbers NSC 102-2218-E-029-002, NSC 101-2218-E-029-004, and NSC 102-2622-E-029-005-CC3. This work was also supported in part by Tunghai University, Taiwan, ROC, under grant number GREEnS 04-2.

References

1. TOP 500 (2013) http://www.top500.org. Accessed 17 September 2013
2. nVidia (2013) http://www.nvidia.com. Accessed 17 September 2013
3. Cloud computing (2013) http://en.wikipedia.org/wiki/Cloud_computing. Accessed 17 September 2013
4. GPGPU (2013) http://en.wikipedia.org/wiki/GPGPU. Accessed 17 September 2013
5. PCI-pass-through (2013) http://www.ibm.com/developerworks/linux/library/l-pci-passthrough. Accessed 17 September 2013
6. CUDA (2013) http://www.nvidia.com.tw/object/cuda_home_new_tw.html. Accessed 17 September 2013
7. National Institute of Standards and Technology (2013) http://www.nist.gov/index.html. Accessed 17 September 2013
8. Virtualization (2013) http://en.wikipedia.org/wiki/Virtualization. Accessed 17 September 2013
9. Full virtualization (2013) http://en.wikipedia.org/wiki/Full_virtualization. Accessed 17 September 2013
10. Para virtualization (2013) http://en.wikipedia.org/wiki/Paravirtualization. Accessed 17 September 2013
11. Xen (2013) http://www.xen.org. Accessed 17 September 2013
12. KVM (2013) http://www.linux-kvm.org/page/Main_Page. Accessed 17 September 2013
13. NVIDIA CUDA SDK (2013) http://developer.nvidia.com/cuda-cc-sdk-code-samples. Accessed 17 September 2013
14. Download CUDA (2013) http://developer.nvidia.com/object/cuda.htm. Accessed 17 September 2013
15. NVIDIA CUDA programming guide (2013) http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#abstract. Accessed 17 September 2013
16. CUDA-wiki (2013) http://en.wikipedia.org/wiki/CUDA. Accessed 17 September 2013
17. Lionetti FV, McCulloch AD, Baden SB (2010) Source-to-source optimization of CUDA C for GPU accelerated cardiac cell modeling. In: Euro-Par 2010—parallel processing. Lecture notes in computer science, vol 6271, pp 38–49
18. Jung S (2009) Parallelized pairwise sequence alignment using CUDA on multiple GPUs. BMC Bioinform 10(Suppl 7):A3
19. Che S, Boyer M, Meng J, Tarjan D, Sheaffer JW, Skadron K (2008) A performance study of general-purpose applications on graphics processors using CUDA. J Parallel Distrib Comput 68(10):1370–1380
20. OpenCL (2013) http://www.khronos.org/opencl. Accessed 17 September 2013
21. OpenCL-wiki (2013) http://en.wikipedia.org/wiki/OpenCL. Accessed 17 September 2013
22. Harvey MJ, De Fabritiis G (2011) Swan: a tool for porting CUDA programs to OpenCL. Comput Phys Commun 182(4):1093–1099
23. QEMU (2013) http://wiki.qemu.org/Main_Page. Accessed 17 September 2013
24. VirtualBox (2013) https://www.virtualbox.org. Accessed 17 September 2013
25. Lo C-TD, Qian K (2010) Green computing methodology for next generation computing scientists. In: Proceedings of IEEE 34th annual computer software and applications conference, pp 250–251
26. Zhong B, Feng M, Lung C-H (2010) A green computing based architecture comparison and analysis. In: Proceedings of the 2010 IEEE/ACM int'l conference on green computing and communications & int'l conference on cyber, physical and social computing (GREENCOM-CPSCOM'10), pp 386–391


27. Duato J, Peña AJ, Silla F, Mayo R, Quintana-Ortí ES (2010) rCUDA: reducing the number of GPU-based accelerators in high performance clusters. In: Proceedings of the 2010 international conference on high performance computing & simulation (HPCS 2010), June 2010, pp 224–231
28. Duato J, Pena AJ, Silla F, Fernandez JC, Mayo R, Quintana-Orti ES (2011) Enabling CUDA acceleration within virtual machines using rCUDA. In: Proceedings of 18th international conference on high performance computing 2010 (HiPC), pp 1–10
29. Duato J, Peña AJ, Silla F, Mayo R, Quintana-Orti ES (2011) Performance of CUDA virtualized remote GPUs in high performance clusters. In: Proceedings of international conference on parallel processing (ICPP), September 2011, pp 365–374
30. Shi L, Chen H, Sun J (2009) vCUDA: GPU accelerated high performance computing in virtual machines. In: Proceedings of IEEE international symposium on parallel and distributed processing (IPDPS'09), pp 1–11
31. Gupta V, Gavrilovska A, Schwan K, Kharche H, Tolia N, Talwar V, Ranganathan P (2009) GViM: GPU-accelerated virtual machines. In: 3rd workshop on system-level virtualization for high performance computing. ACM, NY, USA, pp 17–24
32. Giunta G, Montella R, Agrillo G, Coviello G (2010) A GPGPU transparent virtualization component for high performance computing clouds. In: Ambra PD, Guarracino M, Talia D (eds) Euro-Par 2010—parallel processing. Lecture notes in computer science, vol 6271. Springer, Berlin, pp 379–391
33. Front and back ends (2013) http://en.wikipedia.org/wiki/Front_and_back_ends. Accessed 17 September 2013
34. VMGL (2013) http://sysweb.cs.toronto.edu/vmgl. Accessed 17 September 2013
35. Amit N, Ben-Yehuda M, Yassour B-A (2012) IOMMU: strategies for mitigating the IOTLB bottleneck. In: Computer architecture. Lecture notes in computer science, vol 6161, pp 256–274
36. NVIDIA Tesla C1060 computing processor (2012) http://www.nvidia.com/object/product_tesla_c1060_us.html. Accessed 12 May 2012
37. NVIDIA Quadro NVS 295 (2012) http://www.nvidia.com.tw/object/product_quadro_nvs_295_tw.html. Accessed 12 May 2012
38. NVIDIA Tesla C2050 computing processor (2013) http://www.nvidia.com.tw/object/product_tesla_C2050_C2070_tw.html. Accessed 17 September 2013
39. CentOS (2013) http://www.centos.org. Accessed 17 September 2013
40. Lagar-Cavilla HA, Tolia N, Satyanarayanan M, de Lara E (2007) VMM-independent graphics acceleration. In: Proceedings of the 3rd international conference on virtual execution environments (VEE'07). ACM, New York, pp 33–43
41. Yang CT, Huang CL, Lin CF (2010) Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters. Comput Phys Commun 182(1):266–269
42. Yang CT, Huang CL, Lin CF, Chang TC (2010) Hybrid parallel programming on GPU clusters. In: Proceedings of international symposium on parallel and distributed processing with applications (ISPA), September 2010, pp 142–147
43. Yang CT, Chang TC, Wang HY, Chu WCC, Chang CH (2011) Performance comparison with OpenMP parallelization for multi-core systems. In: Proceedings 2011 IEEE 9th international symposium on parallel and distributed processing with applications (ISPA), pp 232–237