<<

DELIVERING HIGH-PERFORMANCE REMOTE GRAPHICS WITH NVIDIA GRID VIRTUAL GPU Andy Currid NVIDIA WHAT YOU’LL LEARN IN THIS SESSION

. NVIDIA's GRID Virtual GPU Architecture — What it is and how it works

. Using GRID Virtual GPU on Citrix XenServer

. How to deliver great remote graphics from GRID Virtual GPU WHY VIRTUALIZE?

ENGINEER / DESIGNER Workstation

POWER USER High-end PC

KNOWLEDGE Entry-level WORKER PC WHY VIRTUALIZE?

. Awesome performance!

. High cost Desktop workstation

. Hard to fully utilize, limited mobility Quadro GPU . Challenging to manage

. Data security can be a problem … CENTRALIZE THE WORKSTATION

. Awesome performance! Datacenter

. Easier to fully DNotebookesktop or workstationthin client utilize, manage and secure Quadro GPU Remote Graphics . Even more expensive!

… VIRTUALIZE THE WORKSTATION

Datacenter

GPU-enabled server Virtual Machine Notebook or Guest OS Guest OS thin client Apps Apps NVIDIA NVIDIA Driver Driver Citrix XenServer VMware ESX Direct GPU Remote Red Hat Enterprise Linux access from Graphics Open source , KVM guest VM

Dedicated GPU per user NVIDIA GRID GPU … SHARE THE GPU

Datacenter

GPU-enabled server Virtual Machine Virtual Machine Notebook or Guest OS Guest OS thin client Hypervisor Apps Apps GRID Virtual GPU NVIDIA NVIDIA Manager Driver Driver

Direct GPU Remote Physical GPU Hypervisor access from Management Graphics guest VMs NVIDIA GRID vGPU NVIDIA GRID VIRTUAL GPU . Standard NVIDIA driver stack in each guest VM GPU-enabled server VM 1 VM 2 — API compatibility Guest OS Guest OS Hypervisor Apps Apps . Direct hardware access GRID Virtual GPU NVIDIA NVIDIA Manager Driver Driver from the guest Hypervisor — Highest performance

NVIDIA GRID vGPU . GRID Virtual GPU Manager — Increased manageability VIRTUAL GPU RESOURCE SHARING . Frame buffer — Allocated at VM startup GPU-enabled server VM 1 VM 2 . Channels Guest OS Guest OS Hypervisor Apps Apps — Used to post work to the GPU GRID Virtual GPU NVIDIA NVIDIA Manager Driver Driver — VM accesses its channels via GPU Base Address Register Hypervisor CPU MMU (BAR), isolated by CPU’s Memory Management Unit GPU BAR VM1 BAR VM2 BAR (MMU) NVIDIA GRID vGPU Channels

Timeshared Scheduling Framebuffer . GPU Engines VM1 FB 3D CE NVENC NVDEC VM2 FB — Timeshared among VMs, like multiple contexts on single OS

VIRTUAL GPU ISOLATION . GPU MMU controls access from engines to framebuffer GPU-enabled server and system memory VM 1 VM 2 Guest OS Guest OS Hypervisor . Apps Apps vGPU Manager maintains per-VM pagetables in GPU’s GRID Virtual GPU NVIDIA NVIDIA Manager Driver Driver framebuffer

Hypervisor Translated DMA access to VM physical memory and FB . Valid accesses are routed to framebuffer or system NVIDIA GRID vGPU Framebuffer GPU MMU VM1 FB memory Untranslated accesses VM2 FB VM1 pagetables 3D CE NVENC NVDEC VM2 pagetables . Invalid accesses are blocked Pagetable access VIRTUAL GPU DISPLAY

GPU-enabled server VM 1 Guest OS Hypervisor Apps GRID Virtual GPU NVIDIA . Virtual GPU exposes virtual display Manager Driver heads for each VM Hypervisor — E.g. 2 heads at 2560x1600 resolution

NVIDIA GRID vGPU Framebuffer . Primary surfaces (front buffers) for

VM1 FB each head are maintained in a VM’s Head Head framebuffer 1 2 3D CE NVENC . Physical scanout to a monitor is replaced by hardware delivery direct to system memory NVIDIA GRID REMOTE GRAPHICS SDK

. Available on vGPU and Network Apps Remote passthrough GPU Apps Apps Graphics Stack Graphics H.264 or . Fast readback of desktop or commands raw streams individual render targets

GRID GPU or vGPU

NVENC . Hardware H.264 encoder 3D NVIFR NVFBC . Citrix XenDesktop

Render Front . VMware View Target Buffer . NICE DCV Framebuffer . HP RGS USING NVIDIA GRID vGPU

. Citrix XenServer — First hypervisor to support GRID vGPU — Also supports GPU passthrough XenServer vSphere — Open source — Full tools integration for GPU — GRID certified server platforms

. VMware vSphere — Coming soon!

XENSERVER SETUP

. Install XenServer . Install XenCenter management GUI on PC . Install GRID Virtual GPU Manager

rpm -i NVIDIA-vgx-xenserver-6.2-331.30.i386.rpm ASSIGNING A vGPU TO A VIRTUAL MACHINE

. Citrix XenCenter management GUI

. Assignment of virtual GPU, or passthrough of dedicated GPU

BOOT, INSTALL OF NVIDIA DRIVERS

. VM’s console accessed through XenCenter

. Install NVIDIA guest vGPU driver vGPU OPERATION . NVIDIA driver now loaded, vGPU is fully operational

. Verify with NVIDIA control panel DELIVERING GREAT REMOTE GRAPHICS

.Use a high performance remote graphics stack

.Tune the platform for best graphics performance

TUNING THE PLATFORM

.Platform basics

.GPU selection

.NUMA considerations

PLATFORM BASICS

. Use sufficient CPU! — Graphically intensive apps typically need multiple cores

. Ensure CPUs can reach their highest clock speeds — Enable extended P-states / TurboBoost in the system BIOS — Set XenServer’s frequency governor to performance mode

xenpm set-scaling-governor performance

/opt/xensource/libexec/xen-cmdline --set-xen

cpufreq=xen:performance . Use sufficient RAM! - don’t overcommit memory

. Fast storage subsystem - local SSD or fast NAS / SAN MEASURING UTILIZATION

[root@xenserver-vgx-test2 ~]# nvidia-smi Mon Mar 24 09:56:42 2014 . nvidia-smi +------+ | NVIDIA-SMI 331.62 Driver Version: 331.62 | |------+------+------+ command line | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | utility |======+======+======| | 0 GRID K1 On | 0000:04:00.0 Off | N/A | | N/A 31C P0 20W / 31W | 530MiB / 4095MiB | 61% Default | +------+------+------+ | 1 GRID K1 On | 0000:05:00.0 Off | N/A | | N/A 29C P0 19W / 31W | 270MiB / 4095MiB | 46% Default | . +------+------+------+ Reports GPU | 2 GRID K1 On | 0000:06:00.0 Off | N/A | | N/A 26C P0 15W / 31W | 270MiB / 4095MiB | 7% Default | utilization, +------+------+------+ | 3 GRID K1 On | 0000:07:00.0 Off | N/A | memory usage, | N/A 28C P0 19W / 31W | 270MiB / 4095MiB | 46% Default | +------+------+------+ | 4 GRID K1 On | 0000:86:00.0 Off | N/A | temperature, and | N/A 26C P0 19W / 31W | 270MiB / 4095MiB | 45% Default | +------+------+------+ much more | 5 GRID K1 On | 0000:87:00.0 Off | N/A | | N/A 27C P0 15W / 31W | 10MiB / 4095MiB | 0% Default | +------+------+------+ | 6 GRID K1 On | 0000:88:00.0 Off | N/A | | N/A 33C P0 19W / 31W | 270MiB / 4095MiB | 53% Default | +------+------+------+ | 7 GRID K1 On | 0000:89:00.0 Off | N/A | | N/A 32C P0 19W / 31W | 270MiB / 4095MiB | 46% Default | +------+------+------+ MEASURING UTILIZATION . GPU utilization graph in XenCenter PICK THE RIGHT GRID GPU GRID K2 2 high-end Kepler GPUs 3072 CUDA cores (1536 / GPU) 8GB GDDR5 (4GB / GPU) ENGINEER / DESIGNER

GRID K1 4 entry Kepler GPUs POWER USER 768 CUDA cores (192 / GPU) 16GB DDR3 (4GB / GPU)

KNOWLEDGE WORKER SELECT THE RIGHT VGPU GRID K2 2 high-end Kepler GPUs 3072 CUDA cores (1536 / GPU) GRID K260Q ENGINEER 8GB GDDR5 (4GB / GPU) 2GB framebuffer DESIGNER 4 heads, 2560x1600

GRID K240Q 1GB framebuffer POWER USER 2 heads, 2560x1600

GRID K200 KNOWLEDGE 256MB framebuffer WORKER 2 heads, 1920x1200 SELECT THE RIGHT vGPU

GRID K1 GRID K140Q 4 entry Kepler GPUs 1GB framebuffer POWER USER 768 CUDA cores (192 / GPU) 2 heads, 2560x1600 16GB DDR3 (4GB / GPU)

GRID K100 KNOWLEDGE 256MB framebuffer WORKER 2 heads, 1920x1200 TAKE ACCOUNT OF NUMA . Non-Uniform Memory Access Memory 0 Memory 1 . Memory and GPUs connected to each CPU

CPU Socket 0 CPU Socket 1 Core Core Core Core Core Core . CPUs connected via

Core Core Core CPU Core Core Core proprietary interconnect Interconnect PCI PCI . CPU/GPU access to memory Express Express on same socket is fastest

GPU GPU GPU GPU . Access to memory on remote socket is slower PIN vCPUS TO SOCKETS . VM pinned to CPU Memory 0 Memory 1 socket by restricting its vCPUs to run only on that socket CPU Socket 0 CPU Socket 1 Core Core Core Core Core Core Virtual Machine Core Core Core Core Core Core vCPU vCPU . xe vm-param-set vCPU vCPU PCI Express uuid= VCPUs-params:mask= GPU GPU GPU GPU 0,1,2,3,4,5 SELECTING A vGPU ON A SPECIFIC SOCKET

Memory 0 Memory 1

CPU Socket 0 CPU Socket 1 Core Core Core Core Core Core Core Core Core Core Core Core

GRID K2 GRID K2 GRID K2 GRID K2

GPU GPU GPU GPU GPU GPU GPU GPU 1 2 3 4 5 6 7 8 GPU GROUPS . XenServer manages physical Memory 0 Memory 1 GPUs by means of GPU groups

GPUCPU Group Socket “GRID 0 K2” CPU Socket 1 . Default behavior: all physical

AllocationCore Core policy: Core depth first Core Core Core GPUs of same type are Core Core Core Core Core Core placed in one GPU group Physical GPUs: . GPU group allocation policy: — Depth first: allocate vGPU on K2 K2 K2 K2 most loaded GPU

— Breadth first: allocate vGPU on GPU GPU GPU GPU GPU GPU GPU GPU least loaded GPU 1 2 3 4 5 6 7 8 GPU GROUPS . Default GPU group takes no Memory 0 account of where a VM is running

GPUCPU Group Socket “GRID 0 K2” CPU Socket 1 . Your VM may end up using a Virtual Machine Core Core Core vGPU that’s allocated on a AllocationvCPU policy:vCPU depth first Core Core Core GPU on a remote CPU socket PhysicalvCPU GPUs: vCPU

GPU 7

K2 GPU GROUPS . Create custom GPU groups Memory 0 Memory 1 — Per socket, or per GPU for ultimate control

. GPU CPUGroup Socket “GRID 0K2 Socket 0” GPU GroupCPU Socket“GRID K2 1 Socket 1” xe gpu-group-create name-label= AllocationCore Core policy: Core breadth first AllocationCore policy:Core Corebreadth first "GRID K2 Socket 0” PhysicalCore GPUs:Core Core PhysicalCore GPUs: Core Core . xe pgpu-param-set uuid= gpu-group-uuid= GRID K2 GRID K2 GRID K2 GRID K2

. xe gpu-group-param-set GPU GPU GPU GPU GPU GPU GPU GPU uuid= 1 2 3 4 5 6 7 8 allocation-algorithm= breadth-first WRAP UP . NVIDIA's GRID Virtual GPU Architecture

. GRID Virtual GPU on Citrix XenServer

. Remote graphics performance RESOURCES . NVIDIA GRID vGPU User Guide — Included with GRID vGPU drivers — Visit http://www.nvidia.com/vgpu, look for driver download link

. Citrix XenServer with 3D Graphics Pack — Visit http://www.citrix.com/go/vgpu

. Qualified server platforms — Visit http://www.nvidia.com/buygrid

RESOURCES . Remote Graphics — Citrix XenDesktop http://www.citrix.com/xendesktop

— HP Remote Graphics (RGS) http://www8.hp.com/us/en/campaigns/workstations/remote- graphics-software.html

— NICE Desktop Cloud Visualization (DCV) https://www.nice-software.com/products/dcv

. XenServer CPU performance tuning — http://www.xenserver.org/partners/developing-products-for- xenserver/19-dev-help/138-xs-dev-perf-turbo.html

THANK YOU! GRID Website www.nvidia.com/vdi

. NVIDIA GRID Sign up for the monthly GRID VDI Resources Newsletter http://tinyurl.com/gridinfo

GRID YouTube Channel http://tinyurl.com/gridvideos

Questions? Ask on our Forums https://gridforums.nvidia.com

NVIDIA GRID on LinkedIn http://linkd.in/QG4A6u

Follow us on Twitter @NVIDIAGRID