Time Slices of 16 ms Are Assigned to VMs (vGPUs)

GPU Virtualization
Yiying Zhang

A Full GPU Virtualization Solution with Mediated Pass-Through
Kun Tian, Yaozu Dong, David Cowperthwaite

GPUvm: Why Not Virtualizing GPUs at the Hypervisor?
Yusuke Suzuki*, in collaboration with Shinpei Kato**, Hiroshi Yamada***, Kenji Kono*
* Keio University, ** Nagoya University, *** Tokyo University of Agriculture and Technology

Graphics Processing Unit (GPU)
• GPUs are used for data-parallel computations
  – Composed of thousands of cores
  – Peak double-precision performance exceeds 1 TFLOPS
  – Performance-per-watt of GPUs outperforms CPUs
• GPGPU is widely accepted for various uses
  – Network systems [Jang et al. ’11], FS [Silberstein et al. ’13] [Sun et al. ’12], DBMS [He et al. ’08], etc.
[Figure: an NVIDIA GPU with per-core L1 caches, an L2 cache, and video memory, next to the CPU and main memory]

Motivation
• The GPU is not a first-class citizen of cloud computing environments
  – Cannot multiplex GPGPU among virtual machines (VMs)
  – Cannot consolidate VMs that run GPGPU applications
• GPU virtualization is necessary
  – Virtualization is the norm in the cloud
[Figure: VMs on a hypervisor share a single GPU on one physical machine]

Virtualization Approaches
• Categorized into three approaches
  1. I/O pass-through
  2. API remoting
  3. Para-virtualization

I/O pass-through
• Amazon EC2 GPU instance, Intel VT-d
  – Assign physical GPUs to VMs directly
  – Multiplexing is impossible
[Figure: the hypervisor assigns one physical GPU directly to each VM]

API remoting
• GViM [Gupta et al. ’09], rCUDA [Duato et al. ’10], VMGL [Lagar-Cavilla et al. ’07], etc.
  – Forward API calls from VMs to the host’s GPUs
  – API and version compatibility problems
  – Enlarges the trusted computing base (TCB)
[Figure: wrapper libraries in the VMs (library v4 and v5) forward API calls through the hypervisor to the host library (v4) and driver, which drive the GPU]

Para-virtualization
• VMware SVGA2 [Dowty ’09], LoGV [Gottschalk et al. ’10]
  – Expose an ideal GPU device model to VMs
  – Guest device driver must be modified or rewritten
[Figure: PV drivers in the VMs issue hypercalls to the hypervisor; the host library and driver access the GPU]

gVirt
• Full GPU virtualization
  – Run native graphics driver in VM
• Mediated pass-through
  – Pass through performance-critical operations
  – Trap-and-emulate privileged operations
• Full-featured vGPU
• Up to 95% native performance
• Scale up to 7 VMs

GPU Virtualization Approaches
[Figure: API forwarding, direct pass-through, and full GPU virtualization compared on performance, features, and sharing]

gVirt
• Open source implementation
• GPL/BSD dual-license
• Currently based on Xen (codenamed XenGT)
• KVM support is coming
• Supports Intel® Processor Graphics built into 4th generation Intel® Core™ processors
• Principles apply to different GPUs
• Trademarked as Intel® GVT-g
• Intel® Graphics Virtualization Technology for virtual GPU

Challenges
• Complexity in virtualizing a modern GPU
• Efficiency when sharing the GPU
• Secure isolation among the VMs

Architecture of Intel Processor Graphics
[Figure: architecture diagram]

gVirt Architecture
• gVirt stub
  – Extends the Xen vMMU to selectively present or hide address ranges to VMs
• Mediator
  – Emulates vGPUs for privileged resources
  – Context switches vGPUs
• Native driver in VM
  – Directly accesses a portion of the performance-critical resources
• QEMU for legacy VGA mode

GPU Sharing
• Render engine scheduling
  – Time slices of 16 ms are assigned to VMs (vGPUs)
  – Waits for the guest ring buffer to become idle before switching
• Render context switch
  – Save/restore internal pipeline and I/O register states, and flush caches/TLBs

Pass-Through Accesses
• Graphics memory resource partition
  – Each VM gets a (fixed) portion of the real graphics memory => perf impact
  – Needs to translate between guest and host view => perf impact
  – Translation can be avoided by adding fake (ballooned) guest address ranges

GPU Page Table Virtualization
[Figures: GPU page table virtualization diagrams]

Command Protection
• Command buffers
  – The primary buffer is a statically allocated ring buffer
  – Batch buffers are pages allocated on demand
• gVirt audits guest command buffers when commands are submitted
  – to guarantee no unauthorized address references
• But what about the window between when commands are submitted (and audited) and when they are actually executed?
  – What if a malicious VM modifies its commands during this window?

Smart Shadowing
• Utilize the specific programming model
  – Ring buffer: statically allocated, limited page number => lazy shadowing
  – Batch buffer: allocated on demand, rarely accessed after submission => write protection

Lazy Shadowing
[Figure: the VM graphics driver submits commands; the mediator copies and audits them, then submits them to the GPU, which executes them and signals completion]

Write-Protection
[Figure: on submit, the mediator audits the commands and turns write protection on; after the GPU executes and completes, write protection is turned off]

Linux VM Performance
• 3D benchmark: Phoronix Test Suite
  – LightsMark, OpenArena, UrbanTerror, Nexuiz
• 2D benchmark: cairo-perf-trace
  – firefox-asteroids, firefox-scrolling, midori-zoomed, gnome-system-monitor

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.
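
The command-protection slides above describe auditing commands at submission and shadowing them so that later guest writes cannot change what the GPU runs. A minimal sketch of that copy-then-audit idea follows. The command model, `VGpu`, `audit`, and `submit_with_shadow` are hypothetical names chosen for illustration; they are not gVirt's actual data structures, and the eager copy here simplifies the lazy shadowing the slides refer to.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical command model: (opcode, graphics memory address).
Command = Tuple[str, int]


@dataclass
class VGpu:
    """Hypothetical per-VM state: the graphics memory range owned by the VM."""
    base: int
    size: int

    def owns(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


def audit(vgpu: VGpu, commands: List[Command]) -> None:
    """Reject any command that references memory outside the VM's partition."""
    for opcode, addr in commands:
        if not vgpu.owns(addr):
            raise PermissionError(
                f"{opcode} references unauthorized address {addr:#x}"
            )


def submit_with_shadow(vgpu: VGpu, guest_ring: List[Command]) -> List[Command]:
    """Copy the guest ring buffer, then audit the copy.

    The returned shadow copy is what would be handed to the GPU, so guest
    writes made after submission cannot alter the audited commands.
    """
    shadow = list(guest_ring)  # commands are immutable tuples, so a shallow copy suffices
    audit(vgpu, shadow)
    return shadow
```

Because the audit runs on the shadow copy rather than on guest-writable memory, the window between audit and execution no longer exposes the commands to the guest. Write protection reaches the same goal for batch buffers without the copy.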
Summary
• Full GPU virtualization + mediated pass-through
• Run native graphics driver in VM
• Good balance of performance, features, and sharing capability
• Publicly available patches
  – https://github.com/01org/XenGT-Preview-xen
  – https://github.com/01org/XenGT-Preview-kernel
  – https://github.com/01org/XenGT-Preview-qemu

GPUvm: Why Not Virtualizing GPUs at the Hypervisor?

GPU Internals
• PCIe-connected discrete GPU (NVIDIA, AMD GPU)
• The driver accesses the GPU with MMIO through PCIe BARs
• Three major components
  – GPU computing cores, GPU channels, and GPU memory
[Figure: the driver and apps on the CPU reach the GPU through MMIO on PCIe BARs; the GPU contains GPU channels, GPU computing cores, and GPU memory]

GPU Channel & Computing Cores
• A GPU channel is a hardware unit for submitting commands to the GPU computing cores
• The number of GPU channels is fixed
• Multiple channels can be active at a time
[Figure: apps submit GPU commands to GPU channels; the commands are executed on the computing cores]

GPU Memory
• Memory accesses from the computing cores are confined by GPU page tables
[Figure: each GPU channel holds a pointer to a GPU page table, which translates GPU virtual addresses to GPU physical addresses in GPU memory]

Unified Address Space
• GPU and CPU memory spaces are unified
  – A GPU virtual address (GVA) is translated to CPU physical addresses as well as GPU physical addresses (GPA)
[Figure: the GPU page table maps a GVA either to a GPA in GPU memory or to a CPU physical address in CPU memory]

GPUvm overview
• Isolate GPU channels, computing cores & memory
[Figure: each VM sees a virtual GPU; GPU channels and GPU memory are partitioned between VM1 and VM2, and the GPU computing cores are time-shared]

GPUvm components
1. GPU shadow page table
  – Isolate GPU memory
2. GPU shadow channel
  – Isolate GPU channels
3. GPU fair-share scheduler
  – Isolate GPU time on the GPU computing cores

Conclusion
• GPUvm shows the design of full GPU virtualization
  – GPU shadow page table
  – GPU shadow channel
  – GPU fair-share scheduler
• Full virtualization exhibits non-trivial overhead
  – MMIO handling
    • Intercept TLB flush and scan page table
  – Optimizations and para-virtualization reduce this overhead
  – However, it is still 2-3 times slower.
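
The conclusion mentions intercepting TLB flushes and scanning the page table to maintain the GPU shadow page table. A rough sketch of that scan is shown here, assuming a flat page table keyed by page number and a contiguous memory partition per VM. The function name, its parameters, and the policy of dropping out-of-range mappings are assumptions made for this example, not GPUvm's implementation.

```python
from typing import Dict


def rebuild_shadow(guest_table: Dict[int, int],
                   partition_base: int,
                   partition_pages: int) -> Dict[int, int]:
    """Scan a guest GPU page table and build the shadow table the GPU would use.

    guest_table maps a GPU virtual page number to a guest GPU physical page
    number. The shadow table maps the same virtual page numbers to host
    physical page numbers inside the VM's partition (an assumed layout).
    """
    shadow: Dict[int, int] = {}
    for vpn, guest_ppn in guest_table.items():
        if not 0 <= guest_ppn < partition_pages:
            # Assumed policy: drop mappings that fall outside the VM's partition.
            continue
        shadow[vpn] = partition_base + guest_ppn
    return shadow
```

In this model a hypervisor would call `rebuild_shadow` each time it traps a guest TLB flush. The full scan on every flush illustrates why the slides count MMIO handling as a main source of overhead.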
