Boosting GPU Virtualization Performance with Hybrid Shadow Page Tables

Yaozu Dong¹,², Mochi Xue¹,², Xiao Zheng², Jiajun Wang¹,², Zhengwei Qi¹, Haibing Guan¹
¹Shanghai Jiao Tong University, ²Intel Corporation
{eddie.dong, xiao.zheng}@intel.com
{xuemochi, jiajunwang, qizhenwei, hbguan}@sjtu.edu.cn

https://www.usenix.org/conference/atc15/technical-session/presentation/dong

This paper is included in the Proceedings of the 2015 USENIX Annual Technical Conference (USENIX ATC ’15), July 8–10, 2015, Santa Clara, CA, USA. ISBN 978-1-931971-225. Open access to the Proceedings of the 2015 USENIX Annual Technical Conference (USENIX ATC ’15) is sponsored by USENIX.

Abstract

The increasing adoption of Graphics Processing Units (GPUs) for computation-intensive workloads has stimulated a new computing paradigm called GPU cloud (e.g., Amazon’s GPU Cloud), which necessitates the sharing of GPU resources among multiple tenants in a cloud. However, state-of-the-art GPU virtualization techniques such as gVirt still suffer from non-trivial performance overhead for graphics memory-intensive workloads involving frequent page table updates.

To understand such overhead, this paper first presents GMedia, a media benchmark, and uses it to analyze the causes of such overhead. Our analysis shows that frequent updates to the guest VM’s page tables cause excessive updates to the shadow page table in the hypervisor, due to the need to guarantee consistency between the guest page table and the shadow page table. To this end, this paper proposes gHyvi¹, an optimized GPU virtualization scheme based on gVirt, which uses adaptive hybrid page table shadowing that combines strict and relaxed page table schemes. By significantly reducing trap-and-emulation due to page table updates, gHyvi significantly improves gVirt’s performance for memory-intensive GPU workloads. Evaluation using GMedia shows that gHyvi can achieve up to 13x performance improvement compared to gVirt, and up to 85% of native performance for multi-thread media transcoding.

¹ The source code of gHyvi will be available at https://01.org/igvt-g.

1 Introduction

The emergence of HPC cloud [30] has shifted many computation-intensive workloads such as machine learning [24], molecular dynamics simulations [31] and media transcoding to cloud environments. This necessitates the use of GPUs to boost the performance of such computation-hungry applications, resulting in a new computing paradigm called GPU cloud (such as Amazon’s GPU cloud [2]). Hence, it is now vitally important to provide efficient GPU virtualization to provision elastic GPU resources to multiple users.

To address this challenge, two full GPU virtualization techniques, gVirt [29] and GPUvm [28], have recently been proposed. gVirt is the first open-source product-level full GPU virtualization approach based on the Xen hypervisor [11] for Intel GPUs, while GPUvm provides a GPU virtualization approach for NVIDIA cards. This paper mainly focuses on gVirt due to its open-source availability. Specifically, gVirt presents a vGPU instance to each VM to run the native graphics driver, which achieves high performance and good scalability for GPU-intensive workloads.

While gVirt has made an important first step toward full GPU virtualization, our measurement shows that it still incurs non-trivial overhead for media transcoding workloads. Specifically, we build GMedia using Intel’s MSDK (Media Software Development Kit) to characterize the performance of gVirt. Our analysis uncovers that gVirt still suffers from non-trivial performance slowdown due to an issue we call the Massive Update Issue. This is caused by frequent updates to guest page tables, which lead to excessive VM-exits to the hypervisor to synchronize the shadow page table with the guest page table.

To address the Massive Update Issue, this paper introduces gHyvi, which provides a hybrid page table shadowing scheme to deliver optimized full GPU virtualization based on the Xen hypervisor for Intel GPUs. Inspired by the GPU programming model, we introduce a new asynchronous mechanism, namely relaxed page table shadowing, which removes trap-and-emulation and thus reduces the overhead of massive page table modifications. To minimize the overhead of keeping guest and shadow page tables consistent, we combine the two mechanisms into an adaptive hybrid page table shadowing scheme, which takes advantage of both the traditional strict and the new relaxed page table shadowing. When there are infrequent page table accesses, gHyvi works in strict page table shadowing; once gHyvi detects that the guest VM is frequently updating the page table, it switches to relaxed page table shadowing.

One critical issue of using the relaxed page table shadowing scheme is reconstructing the shadow pages when they are inconsistent with the guest pages. To better understand the tradeoffs of different reconstruction policies, we implement and evaluate four page table reconstruction policies: full reconstruction, static partial reconstruction, dynamic partial reconstruction and dynamic segmented partial reconstruction. Our analysis shows that the last one usually performs better than the others, and it is thus used as the default policy for gHyvi.

We have implemented gHyvi based on gVirt; the implementation comprises 600 LoCs. Experiments using GMedia on an Intel GPU card show that gHyvi can achieve up to 13x performance improvement compared to gVirt, and up to 85% of native performance for multi-thread media transcoding. Our analysis shows that gHyvi wins due to a reduction of up to 69% in VM-exits.

In summary, this paper makes the following contributions:

• A GPU-enabled benchmark for media transcoding performance (GMedia), which invokes functions from Intel MSDK to evaluate and collect performance data on Intel’s GPU platforms.

• A relaxed page table shadowing mechanism as well as a hybrid shadow page table scheme, which combines strict page table shadowing with relaxed page table shadowing.

• Four reconstruction policies for the relaxed page table shadowing mechanism: the full reconstruction policy, the static partial reconstruction policy, the dynamic partial reconstruction policy, and the dynamic segmented partial reconstruction policy.

• An evaluation showing that gHyvi achieves up to 85% of native performance for multi-thread media transcoding and a 13x speedup over gVirt.

The rest of the paper is organized as follows: Section 2 describes some background information on gVirt and the GPU programming model. Section 3 presents our benchmark for media transcoding and discusses the Massive Update Issue in detail, followed by the design and implementation of gHyvi in Section 4. Then, Section 5 evaluates gHyvi and Section 6 discusses related work. Finally, Section 7 concludes with a brief discussion of future work.

2 Background

2.1 GPU for Computing

[Figure 1: GPU Programming Model. Components shown: CPU, system memory, ring buffer (head, tail), batch buffer, render engine, GPU page table, graphics memory, frame buffer.]

GPU programming model: Figure 1 illustrates the GPU programming model. The graphics driver, driven by high-level programming APIs like OpenGL and DirectX, produces GPU commands into the primary buffer and the batch buffer. The GPU consumes the commands and fulfills the acceleration work accordingly. The primary buffer is a ring structure (ring buffer), which is designed to deliver the primary commands. Due to the limited space in the ring buffer, the majority (up to 98%) of commands are in the batch buffer chained to the ring buffer.

A register tuple, which includes a head register and a tail register, is implemented in the ring buffer. The CPU fills commands from tail to head, and the GPU fetches commands from head to tail, all within the ring buffer. The driver notifies the GPU of the submission and completion of commands through the tail, while the GPU updates the head. Once the CPU completes the placement of commands in the ring buffer and batch buffer, it informs the GPU to fetch the commands. In general, the GPU will not fetch the commands placed by the CPU in the ring buffer until the CPU updates the tail register [29].

GPU Cloud: Due to their massive computing power, GPUs have expanded from the original graphics computing to general-purpose computing. The rise of GPU cloud, which extends today’s elastic resource management capability from CPU to GPU, further enables efficient hosting of GPU workloads in cloud and datacenter environments. The strong demand for hosting GPU applications calls for GPU clouds that offer full GPU virtualization solutions with good performance, full features and sharing capability.
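
To make the head/tail protocol described under "GPU programming model" concrete, the following is a minimal Python sketch of a ring buffer with a head and a tail register. It is an illustration only, not code from gVirt, gHyvi or any Intel driver: the class name `RingBuffer`, its methods, and the private `_staged` cursor are hypothetical, and the chained batch buffer is omitted. The property it shows is the one the text states: commands placed by the CPU stay invisible to the GPU until the CPU updates the tail register.

```python
class RingBuffer:
    """Toy model of the primary (ring) command buffer.

    All names here are hypothetical and chosen for illustration.
    """

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0      # next slot the GPU will fetch (updated by the GPU)
        self.tail = 0      # tail register as seen by the GPU (updated by the CPU)
        self._staged = 0   # CPU-side write cursor, not yet published to the GPU

    def free_slots(self):
        # One slot is kept empty so that head == tail always means "empty".
        return (self.head - self._staged - 1) % self.size

    def cpu_place(self, cmd):
        """CPU writes a command into the ring without notifying the GPU."""
        if self.free_slots() == 0:
            raise BufferError("ring buffer full")
        self.slots[self._staged] = cmd
        self._staged = (self._staged + 1) % self.size

    def cpu_update_tail(self):
        """CPU publishes all placed commands by writing the tail register."""
        self.tail = self._staged

    def gpu_fetch_all(self):
        """GPU fetches commands from head up to tail, advancing head."""
        fetched = []
        while self.head != self.tail:
            fetched.append(self.slots[self.head])
            self.head = (self.head + 1) % self.size
        return fetched


if __name__ == "__main__":
    ring = RingBuffer(4)
    ring.cpu_place("CMD_A")
    ring.cpu_place("CMD_B")
    print(ring.gpu_fetch_all())  # nothing is visible before the tail update
    ring.cpu_update_tail()
    print(ring.gpu_fetch_all())  # both commands are fetched in order
```

Keeping a separate CPU-side cursor is one simple way to model the gap between placing commands and publishing them; a real driver writes a hardware register instead.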

2.2 GPU Benchmarks

While there are many GPU benchmarks evaluating the performance of GPU cards, they mainly focus on the graphics ability of cards [1, 8], either for OpenGL or DirectX commands. Though there are a few benchmarks for general-purpose computing (GPGPU) such as Rodinia [12] and Parboil [27], they are not available for Intel’s GPUs. Besides, existing benchmarks neglect media processing workloads, which are key to the performance of media applications in the cloud.

To this end, this paper presents GMedia, a media transcoding benchmark shown in Figure 4, based on Intel’s MSDK (Media Software Development Kit). Intel’s MSDK grants media application developers access to hardware acceleration through a unified API.

[Figure 2: Shared shadow global page table. Labels shown: VM1 global page table and VM2 global page table (each with a ballooned region) on the guest side; shadow global page table on the host side; system memory.]