DELIVERING HIGH-PERFORMANCE REMOTE GRAPHICS WITH NVIDIA GRID VIRTUAL GPU
Andy Currid, NVIDIA
WHAT YOU’LL LEARN IN THIS SESSION
. NVIDIA's GRID Virtual GPU Architecture — What it is and how it works
. Using GRID Virtual GPU on Citrix XenServer
. How to deliver great remote graphics from GRID Virtual GPU
WHY VIRTUALIZE?
. Engineer / designer: workstation
. Power user: high-end PC
. Knowledge worker: entry-level PC
WHY VIRTUALIZE?
Desktop workstation with Quadro GPU
. Awesome performance!
. High cost
. Hard to fully utilize, limited mobility
. Challenging to manage
. Data security can be a problem
… CENTRALIZE THE WORKSTATION
Workstation with Quadro GPU in the datacenter, delivering remote graphics to a notebook or thin client
. Awesome performance!
. Easier to fully utilize, manage and secure
. Even more expensive!
… VIRTUALIZE THE WORKSTATION
[Figure: a GPU-enabled server in the datacenter hosts virtual machines, each with its own guest OS, apps and NVIDIA driver, and delivers remote graphics to a notebook or thin client]
. Hypervisor: Citrix XenServer, VMware ESX, Red Hat Enterprise Linux, open source Xen, KVM
. Direct GPU access from guest VM
. Dedicated GPU per user (NVIDIA GRID GPU)
… SHARE THE GPU
[Figure: a GPU-enabled server in the datacenter whose hypervisor runs the GRID Virtual GPU Manager for physical GPU management; guest VMs, each with its own apps and NVIDIA driver, have direct GPU access to a shared NVIDIA GRID vGPU and deliver remote graphics to a notebook or thin client]
NVIDIA GRID VIRTUAL GPU
. Standard NVIDIA driver stack in each guest VM
— API compatibility
. Direct hardware access from the guest
— Highest performance
. GRID Virtual GPU Manager
— Increased manageability
VIRTUAL GPU RESOURCE SHARING
. Frame buffer
— Allocated at VM startup
. Channels
— Used to post work to the GPU
— VM accesses its channels via GPU Base Address Register (BAR), isolated by CPU’s Memory Management Unit (MMU)
. GPU Engines (3D, CE, NVENC, NVDEC)
— Timeshared among VMs, like multiple contexts on single OS
VIRTUAL GPU ISOLATION
. GPU MMU controls access from engines to framebuffer and system memory
. vGPU Manager maintains per-VM pagetables in GPU’s framebuffer
. Valid accesses are routed to framebuffer or system memory
. Invalid accesses are blocked
[Figure: the engines (3D, CE, NVENC, NVDEC) issue untranslated accesses to the GPU MMU, which reads the VM1 and VM2 pagetables and emits translated DMA accesses to VM physical memory and to each VM's framebuffer]
VIRTUAL GPU DISPLAY
. Virtual GPU exposes virtual display heads for each VM
— E.g. 2 heads at 2560x1600 resolution
. Primary surfaces (front buffers) for each head are maintained in a VM’s framebuffer
. Physical scanout to a monitor is replaced by hardware delivery direct to system memory
NVIDIA GRID REMOTE GRAPHICS SDK
. Available on vGPU and passthrough GPU
. Fast readback of desktop or individual render targets
. Hardware H.264 encoder
. Citrix XenDesktop
. VMware View
. NICE DCV
. HP RGS
[Figure: apps send graphics commands to the GRID GPU or vGPU; the 3D engine draws to a render target or front buffer in the framebuffer; NVIFR and NVFBC read these back, with NVENC available for encoding, and hand H.264 or raw streams to the remote graphics stack for delivery over the network]
USING NVIDIA GRID vGPU
. Citrix XenServer
— First hypervisor to support GRID vGPU
— Also supports GPU passthrough
— Open source
— Full tools integration for GPU
— GRID certified server platforms
. VMware vSphere
— Coming soon!
XENSERVER SETUP
. Install XenServer
. Install XenCenter management GUI on PC
. Install GRID Virtual GPU Manager
rpm -i NVIDIA-vgx-xenserver-6.2-331.30.i386.rpm
ASSIGNING A vGPU TO A VIRTUAL MACHINE
. Citrix XenCenter management GUI
. Assignment of virtual GPU, or passthrough of dedicated GPU
BOOT, INSTALL OF NVIDIA DRIVERS
. VM’s console accessed through XenCenter
. Install NVIDIA guest vGPU driver
vGPU OPERATION
. NVIDIA driver now loaded, vGPU is fully operational
. Verify with NVIDIA control panel
DELIVERING GREAT REMOTE GRAPHICS
. Use a high performance remote graphics stack
. Tune the platform for best graphics performance
TUNING THE PLATFORM
. Platform basics
. GPU selection
. NUMA considerations
PLATFORM BASICS
. Use sufficient CPU!
— Graphically intensive apps typically need multiple cores
. Ensure CPUs can reach their highest clock speeds
— Enable extended P-states / TurboBoost in the system BIOS
— Set XenServer’s frequency governor to performance mode
xenpm set-scaling-governor performance
/opt/xensource/libexec/xen-cmdline --set-xen cpufreq=xen:performance
. Use sufficient RAM!
— Don’t overcommit memory
. Fast storage subsystem
— Local SSD or fast NAS / SAN
MEASURING UTILIZATION
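For scripted monitoring, one option is to read utilization from nvidia-smi in CSV form rather than from its table output. The sketch below assumes the installed driver supports `--query-gpu` with `--format=csv,noheader,nounits` (`nvidia-smi --help-query-gpu` lists the fields available on a given host). The helper name `busy_gpus` and the 50% threshold are illustrative and not part of the session material.

```shell
#!/bin/sh
# busy_gpus THRESHOLD
# Hypothetical helper: reads CSV lines "index, utilization.gpu, memory.used"
# on stdin and prints one line per GPU whose utilization is at or above
# THRESHOLD percent.
busy_gpus() {
    awk -F', *' -v limit="$1" '
        $2 + 0 >= limit + 0 { printf "GPU %s: %s%% busy, %s MiB used\n", $1, $2, $3 }
    '
}

# On a host with the NVIDIA driver installed (assumed invocation):
#   nvidia-smi --query-gpu=index,utilization.gpu,memory.used \
#              --format=csv,noheader,nounits | busy_gpus 50

# Offline demonstration with three readings taken from the nvidia-smi
# listing in this section (GPUs 0, 2 and 6).
sample='0, 61, 530
2, 7, 270
6, 53, 270'
printf '%s\n' "$sample" | busy_gpus 50
```

With the sample data, GPUs 0 and 6 are reported and GPU 2 (7% busy) is filtered out. Keeping the parsing in a function that reads standard input means the same filter can be tested without a GPU present.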
. nvidia-smi command line utility
. Reports GPU utilization, memory usage, temperature, and much more
[root@xenserver-vgx-test2 ~]# nvidia-smi
Mon Mar 24 09:56:42 2014
+------------------------------------------------------+
| NVIDIA-SMI 331.62     Driver Version: 331.62         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K1             On   | 0000:04:00.0     Off |                  N/A |
| N/A   31C    P0    20W /  31W |    530MiB /  4095MiB |     61%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GRID K1             On   | 0000:05:00.0     Off |                  N/A |
| N/A   29C    P0    19W /  31W |    270MiB /  4095MiB |     46%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GRID K1             On   | 0000:06:00.0     Off |                  N/A |
| N/A   26C    P0    15W /  31W |    270MiB /  4095MiB |      7%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GRID K1             On   | 0000:07:00.0     Off |                  N/A |
| N/A   28C    P0    19W /  31W |    270MiB /  4095MiB |     46%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GRID K1             On   | 0000:86:00.0     Off |                  N/A |
| N/A   26C    P0    19W /  31W |    270MiB /  4095MiB |     45%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GRID K1             On   | 0000:87:00.0     Off |                  N/A |
| N/A   27C    P0    15W /  31W |     10MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GRID K1             On   | 0000:88:00.0     Off |                  N/A |
| N/A   33C    P0    19W /  31W |    270MiB /  4095MiB |     53%      Default |
+-------------------------------+----------------------+----------------------+
|   7  GRID K1             On   | 0000:89:00.0     Off |                  N/A |
| N/A   32C    P0    19W /  31W |    270MiB /  4095MiB |     46%      Default |
+-------------------------------+----------------------+----------------------+
MEASURING UTILIZATION
. GPU utilization graph in XenCenter
PICK THE RIGHT GRID GPU
. GRID K2: 2 high-end Kepler GPUs, 3072 CUDA cores (1536 / GPU), 8GB GDDR5 (4GB / GPU)
. GRID K1: 4 entry Kepler GPUs, 768 CUDA cores (192 / GPU), 16GB DDR3 (4GB / GPU)
. User tiers shown alongside: engineer / designer, power user, knowledge worker
SELECT THE RIGHT vGPU
GRID K2: 2 high-end Kepler GPUs, 3072 CUDA cores (1536 / GPU), 8GB GDDR5 (4GB / GPU)
. GRID K260Q: 2GB framebuffer, 4 heads, 2560x1600 (engineer / designer)
. GRID K240Q: 1GB framebuffer, 2 heads, 2560x1600 (power user)
. GRID K200: 256MB framebuffer, 2 heads, 1920x1200 (knowledge worker)
SELECT THE RIGHT vGPU
GRID K1: 4 entry Kepler GPUs, 768 CUDA cores (192 / GPU), 16GB DDR3 (4GB / GPU)
. GRID K140Q: 1GB framebuffer, 2 heads, 2560x1600 (power user)
. GRID K100: 256MB framebuffer, 2 heads, 1920x1200 (knowledge worker)
TAKE ACCOUNT OF NUMA
. Non-Uniform Memory Access
. Memory and GPUs connected to each CPU
. CPUs connected via proprietary interconnect
. CPU/GPU access to memory on same socket is fastest
. Access to memory on remote socket is slower
[Figure: CPU socket 0 and CPU socket 1, each with its own memory and its own GPUs attached via PCI Express, linked by the CPU interconnect]
PIN vCPUS TO SOCKETS
. VM pinned to CPU socket by restricting its vCPUs to run only on that socket
. xe vm-param-set uuid=
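The pinning step can be sketched as follows. The command on the slide is truncated in this copy, so what follows is an assumption rather than the presenter's exact invocation: XenServer documents a `VCPUs-params:mask` VM parameter that takes a comma-separated list of physical CPU numbers. The helper `socket_cpu_mask` is hypothetical, and it assumes CPUs are numbered contiguously socket by socket, which should be checked on the host (for example with `xenpm get-cpu-topology`) before use.

```shell
#!/bin/sh
# socket_cpu_mask SOCKET CPUS_PER_SOCKET
# Hypothetical helper: prints a comma-separated list of physical CPU numbers
# for one socket, ASSUMING contiguous numbering per socket
# (socket 0 = 0..N-1, socket 1 = N..2N-1).
socket_cpu_mask() {
    first=$(( $1 * $2 ))
    last=$(( first + $2 - 1 ))
    list=""
    cpu=$first
    while [ "$cpu" -le "$last" ]; do
        list="${list:+$list,}$cpu"   # append with a comma unless list is empty
        cpu=$(( cpu + 1 ))
    done
    echo "$list"
}

# Example: socket 1 on a host with 6 CPUs per socket (illustrative value).
mask=$(socket_cpu_mask 1 6)
echo "$mask"

# Assumed pinning command; <vm-uuid> is a placeholder to fill in by hand.
echo "xe vm-param-set uuid=<vm-uuid> VCPUs-params:mask=$mask"
```

The script only prints the command instead of running it, so the mask can be reviewed against the real CPU topology first.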
[Figure: two CPU sockets with local memory and four GRID K2 boards, providing physical GPUs 1 to 8]
GPU GROUPS
. XenServer manages physical GPUs by means of GPU groups
. Default behavior: all physical GPUs of same type are placed in one GPU group
. GPU group allocation policy:
— Depth first: allocate vGPU on most loaded GPU
— Breadth first: allocate vGPU on least loaded GPU
[Figure: GPU group “GRID K2”, allocation policy depth first, containing physical GPUs 1 to 8]
GPU GROUPS
. Default GPU group takes no account of where a VM is running
. Your VM may end up using a vGPU that’s allocated on a GPU on a remote CPU socket
[Figure: a VM whose vCPUs run on CPU socket 0 is allocated a vGPU on GPU 7, on a GRID K2 board attached to the remote socket]
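The two allocation policies can be illustrated with a small model. This is a sketch of the selection rule as stated on the slides, not XenServer's implementation: the function `pick_gpu`, the `gpu_id vgpus_in_use capacity` input format, and the tie-breaking rule (the first listed GPU wins) are all assumptions made for the example.

```shell
#!/bin/sh
# pick_gpu POLICY
# Toy model: reads lines "gpu_id vgpus_in_use capacity" on stdin and prints
# the id of the GPU on which a new vGPU would be placed:
#   depth-first   -> most loaded GPU that still has a free slot
#   breadth-first -> least loaded GPU that still has a free slot
# Ties go to the GPU listed first. Prints nothing if every GPU is full.
pick_gpu() {
    awk -v policy="$1" '
        $2 < $3 {
            if (best == "" ||
                (policy == "depth-first"   && $2 > load) ||
                (policy == "breadth-first" && $2 < load)) {
                best = $1; load = $2
            }
        }
        END { if (best != "") print best }
    '
}

# Four physical GPUs, each able to host 4 vGPUs (illustrative numbers).
state='1 4 4
2 3 4
3 1 4
4 0 4'
printf '%s\n' "$state" | pick_gpu depth-first     # prints 2: fullest GPU with room left
printf '%s\n' "$state" | pick_gpu breadth-first   # prints 4: emptiest GPU
```

In this model depth first packs VMs onto as few GPUs as possible, while breadth first spreads them out so each VM shares its GPU with fewer neighbours.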
GPU GROUPS
. Create custom GPU groups
— Per socket, or per GPU for ultimate control
. xe gpu-group-create name-label="GRID K2 Socket 0"
. xe pgpu-param-set uuid=
. xe gpu-group-param-set uuid=
[Figure: GPU groups “GRID K2 Socket 0” and “GRID K2 Socket 1”, each with allocation policy breadth first and its own set of physical GPUs]
. GRID Virtual GPU on Citrix XenServer
. Remote graphics performance
RESOURCES
. NVIDIA GRID vGPU User Guide
— Included with GRID vGPU drivers
— Visit http://www.nvidia.com/vgpu, look for driver download link
. Citrix XenServer with 3D Graphics Pack — Visit http://www.citrix.com/go/vgpu
. Qualified server platforms — Visit http://www.nvidia.com/buygrid
. Remote Graphics
— Citrix XenDesktop http://www.citrix.com/xendesktop
— HP Remote Graphics Software (RGS) http://www8.hp.com/us/en/campaigns/workstations/remote-graphics-software.html
— NICE Desktop Cloud Visualization (DCV) https://www.nice-software.com/products/dcv
. XenServer CPU performance tuning
— http://www.xenserver.org/partners/developing-products-for-xenserver/19-dev-help/138-xs-dev-perf-turbo.html
THANK YOU!
. NVIDIA GRID Resources
— GRID Website www.nvidia.com/vdi
— Sign up for the monthly GRID VDI Newsletter http://tinyurl.com/gridinfo
— GRID YouTube Channel http://tinyurl.com/gridvideos
— Questions? Ask on our Forums https://gridforums.nvidia.com
— NVIDIA GRID on LinkedIn http://linkd.in/QG4A6u
— Follow us on Twitter @NVIDIAGRID