GPU Virtualization
Yiying Zhang

A Full GPU Virtualization Solution with Mediated Pass-Through

Kun Tian, Yaozu Dong, David Cowperthwaite

GPUvm: Why Not Virtualizing GPUs at the Hypervisor?

Yusuke Suzuki* in collaboration with Shinpei Kato**, Hiroshi Yamada***, Kenji Kono*

* Keio University
** Nagoya University
*** Tokyo University of Agriculture and Technology

Graphics Processing Unit (GPU)

• GPUs are used for data-parallel computations
  – Composed of thousands of cores
  – Peak double-precision performance exceeds 1 TFLOPS
  – Performance-per-watt of GPUs outperforms CPUs
• GPGPU is widely accepted for various uses
  – Network systems [Jang et al. ’11], FS [Silberstein et al. ’13] [Sun et al. ’12], DBMS [He et al. ’08], etc.

[Figure: a GPU with per-core L1 caches, a shared L2 cache, and video memory, alongside a CPU with main memory]

Motivation

• GPUs are not first-class citizens of cloud computing environments
  – Cannot multiplex GPGPU among virtual machines (VMs)
  – Cannot consolidate VMs that run GPGPU applications
• GPU virtualization is necessary
  – Virtualization is the norm in the cloud

[Figure: a hypervisor sharing a single GPU of the physical machine among VMs]

Virtualization Approaches

• Categorized into three approaches
  1. I/O pass-through
  2. API remoting
  3. Para-virtualization

I/O pass-through

• Amazon EC2 GPU instance, Intel VT-d
  – Assign physical GPUs to VMs directly
  – Multiplexing is impossible

[Figure: the hypervisor assigns each GPU directly to one VM]

API remoting

• GViM [Gupta et al. ’09], rCUDA [Duato et al. ’10], VMGL [Lagar-Cavilla et al. ’07], etc.
  – Forward API calls from VMs to the host’s GPUs
  – API and version compatibility problems
  – Enlarges the trusted computing base (TCB)

[Figure: wrapper libraries in the VMs forward API calls through the hypervisor to the host library and driver; library versions (v4, v5) can differ between guests and host]

Para-virtualization

• VMware SVGA2 [Dowty ’09], LoGV [Gottschalk et al. ’10]
  – Expose an ideal GPU device model to VMs
  – Guest device driver must be modified or rewritten

[Figure: PV drivers in the VMs issue hypercalls through the hypervisor to the host driver, which drives the GPU]

gVirt

• Full GPU virtualization
  – Run native graphics driver in VM
• Full-featured vGPU
• Up to 95% native performance
• Scale up to 7 VMs
• Mediated pass-through
  – Pass-through performance-critical operations
  – Trap-and-emulate privileged operations

GPU Virtualization Approaches

[Figure: API Forwarding, Direct Pass-Through, and Full GPU Virtualization, each rated on performance, feature, and sharing]

gVirt

• Open source implementation
  – GPL/BSD dual-license
  – Currently based on Xen (codenamed XenGT)
  – KVM support is coming
• Supports Intel® Processor Graphics built into 4th generation Intel® Core™ processors
  – Principles apply to different GPUs
• Trademarked as Intel® GVT-g
  – Intel® Graphics Virtualization Technology for virtual GPU

Challenges

Complexity in virtualizing a modern GPU

Efficiency when sharing the GPU

Secure isolation among the VMs

Architecture of Intel Processor Graphics

gVirt Architecture

• gVirt stub
  – Extends the Xen vMMU to selectively present or hide address ranges to VMs
• Mediator
  – Emulates vGPUs for privileged resources
  – Context-switches vGPUs
• Native driver in VM
  – Directly accesses a portion of the performance-critical resources
• QEMU for legacy VGA mode

GPU Sharing

• Render engine scheduling
  – Time slices of 16 ms are assigned to VMs (vGPUs)
  – Waits until the guest ring buffer becomes idle before switching
• Render context switch
  – Saves/restores internal pipeline and I/O register states, and flushes caches/TLBs

Pass-Through Accesses

• Graphics memory resource partition
  – Each VM gets a (fixed) portion of the real graphics memory => perf impact
  – Addresses must be translated between the guest and host views => perf impact
  – Translation can be avoided by adding fake (ballooned) guest address ranges

GPU Page Table Virtualization
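As a rough illustration of the graphics memory partitioning bullets from the Pass-Through Accesses slide, the sketch below contrasts a guest view that needs an offset on every access with a ballooned guest view whose addresses already match the host. All names (`Partition`, `partition_graphics_memory`, `ballooned_ranges`, and so on) and the equal-split policy are assumptions made for illustration; they are not gVirt's actual code or interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Half-open range [start, end) of graphics memory owned by one VM."""
    start: int
    end: int


def partition_graphics_memory(total_size, vm_count):
    """Split graphics memory into equal fixed partitions (assumed policy)."""
    share = total_size // vm_count
    return [Partition(i * share, (i + 1) * share) for i in range(vm_count)]


def guest_to_host_with_translation(part, guest_addr):
    """Guest view starts at 0, so every address needs an offset added."""
    if not 0 <= guest_addr < part.end - part.start:
        raise ValueError("address outside the VM's partition")
    return part.start + guest_addr


def ballooned_ranges(part, total_size):
    """Ranges the guest driver marks as reserved (ballooned) so that it
    only allocates inside its own partition of the full address space."""
    ranges = []
    if part.start > 0:
        ranges.append((0, part.start))
    if part.end < total_size:
        ranges.append((part.end, total_size))
    return ranges


def guest_to_host_ballooned(part, guest_addr):
    """With ballooning the guest address is already the host address;
    only a bounds check remains."""
    if not part.start <= guest_addr < part.end:
        raise ValueError("address falls in a ballooned range")
    return guest_addr
```

With four VMs sharing 1024 units of graphics memory, the second VM owns [256, 512) and balloons out [0, 256) and [512, 1024); an address it uses inside its partition then needs no translation.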

Command Protection

• Command buffers
  – The primary buffer is a statically allocated ring buffer
  – Batch buffers are pages allocated on demand
• gVirt audits guest command buffers when commands are submitted
  – to guarantee no unauthorized address references
• But what about the window between when commands are submitted (and audited) and when they are actually executed?
  – What if a malicious VM modifies its commands during this window?

Smart Shadowing

Utilize specific programming model

• Ring buffer: statically allocated, limited page number => lazy shadowing
• Batch buffer: allocated on demand, rare access after submission => write protection

Lazy Shadowing

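[Figure: the VM graphics driver submits commands; the mediator copies and audits them, then submits them to the GPU; the GPU executes them and completion is reported back through the mediator to the driver]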
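The lazy shadowing flow can be sketched roughly as follows: the mediator copies the newly submitted ring-buffer commands into a shadow buffer, audits the copy, and lets the GPU execute only the copy, so later guest modifications cannot reach the GPU. This is a minimal sketch under stated assumptions, not gVirt's implementation: the `(opcode, address)` command format and the names `LazyShadowRing`, `audit`, and `AuditError` are hypothetical, and ring wrap-around is ignored.

```python
class AuditError(Exception):
    """Raised when a command references an address the VM does not own."""


def audit(commands, allowed):
    """Reject any command whose address lies outside the allowed range."""
    for opcode, addr in commands:
        if addr not in allowed:
            raise AuditError(f"{opcode} references unauthorized address {addr:#x}")


class LazyShadowRing:
    """Mediator-side shadow of a guest ring buffer (illustrative only)."""

    def __init__(self, allowed):
        self.allowed = allowed  # range of addresses assigned to this VM
        self.shadow = []        # commands the GPU will actually execute

    def submit(self, guest_ring, head, tail):
        # Copy first, then audit the copy: the guest can still change its
        # own ring, but never the snapshot that was audited.
        snapshot = list(guest_ring[head:tail])
        audit(snapshot, self.allowed)
        self.shadow.extend(snapshot)

    def execute(self):
        """Hand the audited shadow commands to the (simulated) GPU."""
        executed, self.shadow = self.shadow, []
        return executed
```

If the guest rewrites a ring entry after `submit` returns, `execute` still yields the audited snapshot, which closes the window between audit and execution for ring buffers.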

Write-Protection

[Figure: the VM graphics driver submits commands; the mediator audits them and turns write-protection on before submitting them to the GPU; when the GPU completes execution, the mediator turns write-protection off]
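For batch buffers, the write-protection flow can be sketched as below: the buffer is protected at submission, audited, handed to the GPU, and unprotected on completion, so any guest write in between is trapped. This is a simplified model under stated assumptions: real write protection works on memory pages, whereas here a flag on a hypothetical `BatchBuffer` object stands in for it, and the names `Mediator`, `ProtectedWriteError`, and `AuditError` are invented for illustration. Protecting before auditing is also an assumption of this sketch.

```python
class AuditError(Exception):
    """Raised when a command references an address the VM does not own."""


class ProtectedWriteError(Exception):
    """Raised when the guest writes to a write-protected batch buffer."""


class BatchBuffer:
    """Guest batch buffer; the flag models page-level write protection."""

    def __init__(self, commands):
        self._commands = list(commands)
        self.write_protected = False

    def guest_write(self, index, command):
        if self.write_protected:
            raise ProtectedWriteError("write trapped while buffer is in flight")
        self._commands[index] = command

    def commands(self):
        return tuple(self._commands)


class Mediator:
    def __init__(self, allowed):
        self.allowed = allowed  # range of addresses assigned to this VM

    def submit(self, buf):
        # Protect before auditing so the audited content cannot change.
        buf.write_protected = True
        for opcode, addr in buf.commands():
            if addr not in self.allowed:
                buf.write_protected = False
                raise AuditError(f"{opcode} references {addr:#x}")
        return buf.commands()  # what the (simulated) GPU will execute

    def complete(self, buf):
        # Execution finished: the guest may reuse the buffer again.
        buf.write_protected = False
```

Because batch buffers are rarely touched after submission, trapping writes should cost little in the common case while avoiding a copy of every buffer.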

VM Performance

• 3D benchmarks
  – LightsMark, OpenArena, UrbanTerror, Nexuiz
• 2D benchmark: Cairo-perf-trace
  – firefox-asteroids, firefox-scrolling, midori-zoomed, gnome-system-monitor

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.

Summary

• Full GPU virtualization + mediated pass-through
  – Run native graphics driver in VM
• Good balance for performance, feature and sharing capability
• Publicly available patches
  – https://github.com/01org/XenGT-Preview-xen
  – https://github.com/01org/XenGT-Preview-kernel
  – https://github.com/01org/XenGT-Preview-qemu

GPUvm: Why Not Virtualizing GPUs at the Hypervisor?

GPU Internals

• PCIe-connected discrete GPU (NVIDIA, AMD GPU)
• Driver accesses the GPU with MMIO through PCIe BARs
• Three major components
  – GPU computing cores, GPU channels, and GPU memory

[Figure: the driver and apps on the CPU reach the GPU via MMIO through PCIe BARs; inside the GPU are GPU channels, GPU computing cores, and GPU memory]

GPU Channel & Computing Cores

• A GPU channel is a hardware unit for submitting commands to the GPU computing cores
• The number of GPU channels is fixed
• Multiple channels can be active at a time

[Figure: apps submit GPU commands to GPU channels; commands are executed on the GPU computing cores]

GPU Memory

• Memory accesses from computing cores are confined by GPU page tables

[Figure: each GPU channel holds a pointer to a GPU page table, which translates the GPU virtual addresses used by the computing cores into GPU physical addresses in GPU memory]

Unified Address Space

• GPU and CPU memory spaces are unified
  – A GPU virtual address (GVA) is translated to either a CPU physical address or a GPU physical address (GPA)
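To make the unified address space concrete, the sketch below models a GPU page table whose entries can point either into GPU memory or into CPU memory, so one GVA lookup yields a target memory plus a physical address. The 4 KiB page size, the single-level table, and the names `GpuPageTable` and `Target` are assumptions for illustration only; real GPU page table formats are device-specific.

```python
from enum import Enum

PAGE_SIZE = 4096  # assumed page size for this sketch


class Target(Enum):
    GPU_MEMORY = "gpu"  # entry holds a GPU physical address (GPA)
    CPU_MEMORY = "cpu"  # entry holds a CPU physical address


class GpuPageTable:
    """Single-level page table: GVA page -> (target memory, physical page)."""

    def __init__(self):
        self._entries = {}

    def map(self, gva_page, target, phys_page):
        self._entries[gva_page] = (target, phys_page)

    def translate(self, gva):
        """Return (target memory, physical address) for a GPU virtual address."""
        page, offset = divmod(gva, PAGE_SIZE)
        if page not in self._entries:
            raise LookupError(f"GVA {gva:#x} is not mapped")
        target, phys_page = self._entries[page]
        return target, phys_page * PAGE_SIZE + offset
```

A hypervisor that wants to isolate GPU memory has to police both kinds of entries, since a page table entry can reach host CPU memory as well as GPU memory.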

[Figure: the GPU page table maps a GVA either to a GPA in GPU memory or to a CPU physical address in CPU memory, forming a unified address space]

GPUvm overview

• Isolate GPU channels, computing cores & memory

[Figure: VM1 and VM2 each see a virtual GPU; the physical GPU's channels and memory are partitioned between VM1 and VM2, and the GPU computing cores are time-shared]

GPUvm components

1. GPU shadow page table
  – Isolate GPU memory
2. GPU shadow channel
  – Isolate GPU channels
3. GPU fair-share scheduler
  – Isolate GPU time using GPU computing cores

Conclusion

• GPUvm shows the design of full GPU virtualization
  – GPU shadow page table
  – GPU shadow channel
  – GPU fair-share scheduler
• Full virtualization exhibits non-trivial overhead
  – MMIO handling
    • Intercept TLB flush and scan page table
  – Optimizations and para-virtualization reduce this overhead
  – However, still 2-3 times slower
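The overhead named above (intercepting a TLB flush and scanning the page table) can be illustrated with a small model of a GPU shadow page table. This is a hedged sketch, not GPUvm's code: it assumes each VM owns one contiguous region of GPU memory pages starting at `vm_base`, that the guest page table is a flat mapping of virtual page to guest physical page, and that the whole table is rescanned on every flush. The class name `ShadowPageTable` and the `pages_scanned` counter are invented for illustration.

```python
class ShadowPageTable:
    """Shadow of one VM's GPU page table, rebuilt on each TLB flush."""

    def __init__(self, vm_base, vm_size):
        self.vm_base = vm_base    # first real GPU page assigned to the VM
        self.vm_size = vm_size    # number of pages assigned to the VM
        self.shadow = {}          # virtual page -> real GPU page
        self.pages_scanned = 0    # rough proxy for the scanning overhead

    def on_tlb_flush(self, guest_table):
        """Called when the hypervisor intercepts the guest's TLB flush."""
        new_shadow = {}
        for vpage, guest_ppage in guest_table.items():
            self.pages_scanned += 1
            # Isolation check: the guest may only map its own pages.
            if not 0 <= guest_ppage < self.vm_size:
                raise PermissionError(f"page {guest_ppage} is outside the VM's memory")
            new_shadow[vpage] = self.vm_base + guest_ppage
        # Install the new mappings only after every entry passed the check.
        self.shadow = new_shadow
```

In this model every flush costs one check per page table entry, which hints at why reducing the amount of scanning is a natural optimization target.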