Good Times with a Pile of GPUs
Elizabeth Baumel

About me

● - DOTS Team
● Disbelief - Unreal games
  ○ Gears of War 4 - D3D12 multi-GPU support on UWP
  ○ Other, more exotic D3D12 multi-GPU stuff
● Sony - PS4 graphics dev support
● - DirectX 11 drivers

Haha, what?

● A pile implies more than 1
● Multiple GPUs
  ○ ...you might say...
● multi-GPU :)

What is Multi-GPU??
● Using multiple GPUs in a single machine to do more work.
  ○ Rendering
  ○ Compute
  ○ Games/real-time rendering and simulation

● NOT about:
  ○ GPU supercomputing clusters
  ○ Cryptocurrency mining

History of Multi-GPU
● 1990
● Silicon Graphics SkyWriter
  ○ Dual pipeline
    ■ Dual screen
  ○ Hyperpipeline
    ■ 2 GPUs, 1 display
● Alternate frame rendering
  ○ the OG AFR!!

Source: https://web.archive.org/web/20110715174342/http://www.reputable.com/~skywriter/skywriter/techreport/6.html

History of Multi-GPU
● 1998
● 3dfx
● SLI
  ○ Scan-Line Interleave
  ○ Each card rendered alternating scan lines
● Higher max resolution
  ○ 1024x768 on 2 cards
  ○ 800x600 on 1 card

Source: https://en.wikipedia.org/wiki/Scan-Line_Interleave#/media/File:STBVoodoo2SLIcards.jpg

History of Multi-GPU
● 2002
● ATI Multi-Rendering
● Super Tiling
  ○ across N GPUs
  ○ Where N is “dozens”
● Used by Evans and Sutherland

Source: https://hothardware.com/reviews/ati-crossfire-multigpu-technology-preview?page=2

History of Multi-GPU
● 2004
● SLI
  ○ Now “Scalable Link Interface”
  ○ Custom PCB, links 2 identical GPUs
● Split-frame Rendering (SFR)
  ○ Load balanced
● Alternate frame Rendering (AFR)
● DirectX 9
  ○ mGPU handled by driver
  ○ Game-specific profiles

Source: https://www.hexus.net/tech/reviews/graphics/916-nvidias-sli-an-introduction/?page=6

History of Multi-GPU
● 2005
● ATI CrossFire
  ○ Dual-link DVI Y-dongle
  ○ Links cards in same family
● Modes:
  ○ SuperTiling
  ○ Scissor (SFR)
  ○ AFR
  ○ Super AA

History of Multi-GPU
● 2006
  ○ Nvidia
    ■ Plex up to 8 GPUs
    ■ GeForce SLI up to 4 GPUs
  ○ ATI
    ■ CrossFire -> bridge
    ■ Bought by AMD
● 2007
  ○ CrossFireX up to 4 GPUs

History of Multi-GPU
● 2008
  ○ AMD Hybrid CrossFire
    ■ 780G/V’s HD3200 integrated GPU
    ■ Radeon HD3450 discrete GPU
  ○ Lucid Logix Hydra Engine
● 2009
  ○ DirectX 11
● 2011
  ○ AMD Llano APU Dual Graphics
    ■ SoC IGP + discrete GPU

History of Multi-GPU
● 2013
  ○ AMD Mantle
    ■ Explicit multi-GPU support!!
● 2015
  ○ DirectX 12
● 2018
  ○ Vulkan 1.1

Implicit vs Explicit Multi-GPU

Implicit:
● Drivers manage resources
● IHV implements
● Game only sees 1 GPU
● AFR only
● Vendor-specific APIs let you give the driver hints
● Surgery while wearing oven mitts!

Explicit:
● Engine manages resources
● Developer implements
● Game can see all GPUs
● Flexible rendering modes
● No driver overhead
● You do the malpractice yourself!!

Why do Multi-GPU?
● Performance
● *extremely IHV voice*: to sell more GPUs
● Heterogeneous multi-GPU setups common now
● Why not?

Games that use explicit multi-GPU
● Ashes of the Singularity
● Gears of War 4
● Deus Ex Mankind Divided
● Strange Brigade
● Rise of the Tomb Raider
● Shadow of the Tomb Raider
● Civilization Beyond Earth
● Civilization VI
● Hitman (2016)
● Battlefield 1
● Sniper Elite 4

Games that use explicit multi-GPU
● Ashes of the Singularity
  ○ Broad support for 2+ mixed adapters
● Gears of War 4
  ○ AFR on 2 linked adapters
● Civilization Beyond Earth
  ○ SFR

What you can do with Explicit Multi-GPU!
● Hardware Configurations
  ○ Linked Device Adapters
  ○ Heterogeneous multi-GPU aka Mixed Device Adapters
● Rendering/Work Distribution Modes
  ○ Alternate Frame Rendering (AFR)
  ○ Split Frame Rendering (SFR)
  ○ Tiled
  ○ Frame Pipelining
  ○ Asymmetric

Linked Adapter Multi-GPU

● Pros
  ○ Fast cross-GPU copies
  ○ Same cards, easy scaling
  ○ HUGE resolutions
    ■ e.g. Nvidia Mosaic
● Cons
  ○ $$$$$$$$$$$$$$$$$

Linked Adapter Multi-GPU
[Diagram, labels only: IDXGIAdapter]

Heterogeneous Multi-GPU

● Pros
  ○ Use any GPUs you have!
● Cons
  ○ Can’t assume GPUs support the same texture layouts.
  ○ May have vastly different specs/feature support.

[Figure label: Compute Units]

Heterogeneous Multi-GPU
[Diagram, labels only: IDXGIAdapter, IDXGIAdapter, IDXGIAdapter]

Alternate Frame Rendering
● Good when you have beefy GPUs and few inter-frame dependencies
● Well-understood
● Issues
  ○ Need basically the same GPUs for this to work well
  ○ Frame pacing
  ○ Input latency
  ○ Syncing double-buffered stuff
  ○ Temporal effects

Split Frame Rendering
● Split final frame into even parallel workloads
● Great for VR!
● Good for low input latency
● Load balancing
● Frame compositing
● Not widely used in recent times

Tiled Rendering
● Sorta like SFR but a lot more split up
● Homogenize work all over your entire frame
● Potentially lots of cross-GPU borders
● Even rarer than SFR in recent times

Frame Pipelining
● Copy intermediate steps to the next GPU
● Works better with temporal techniques
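The work distribution modes above (AFR, SFR, tiled) boil down to different ways of deciding which GPU owns which piece of work. Here is a minimal sketch of that bookkeeping in plain Python. It does not touch D3D12 or Vulkan; every function name and the "ms per row" cost measure are hypothetical, chosen only for illustration. A real engine would also have to handle the copies and synchronization between GPUs.

```python
from __future__ import annotations


def afr_gpu_for_frame(frame_index: int, gpu_count: int) -> int:
    """AFR: whole frames are handed to GPUs round-robin."""
    return frame_index % gpu_count


def sfr_split_rows(frame_height: int, ms_per_row: list[float]) -> list[tuple[int, int]]:
    """SFR: split one frame into horizontal bands, one band per GPU.

    ms_per_row is an assumed per-GPU cost measurement. Each GPU gets a
    share of rows proportional to its speed (1 / cost), so a slower GPU
    gets a thinner band. Returns (first_row, row_count) per GPU.
    """
    speeds = [1.0 / cost for cost in ms_per_row]
    total_speed = sum(speeds)
    bands = []
    first_row = 0
    for i, speed in enumerate(speeds):
        if i == len(speeds) - 1:
            rows = frame_height - first_row  # last GPU takes the remainder
        else:
            rows = round(frame_height * speed / total_speed)
        bands.append((first_row, rows))
        first_row += rows
    return bands


def tile_owner(tile_x: int, tile_y: int, gpu_count: int) -> int:
    """Tiled: interleave small tiles so every GPU sees a mix of the frame."""
    return (tile_x + tile_y) % gpu_count


# Two GPUs alternate frames under AFR.
print([afr_gpu_for_frame(f, 2) for f in range(4)])
# Under SFR, a GPU that is twice as slow per row gets half as many rows.
print(sfr_split_rows(1080, [1.0, 2.0]))
```

Feeding fresh timings into something like `sfr_split_rows` every frame is one simple way to do the "load balancing" bullet; the tiled variant trades that balancing logic for more cross-GPU borders.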

Source (Frame Pipelining): https://developer.nvidia.com/explicit-multi-gpu-programming-directx-12-part-2

Asymmetric Multi-GPU
● Weak baby integrated GPU and RIPPED DISCRETE GPU?
● As long as you got compute units, you can do Something
● Short trip between iGPU and CPU, save PCIe bandwidth

How do you actually do this
● Enumerate adapters
  ○ Neat sample that shows both D3D and Vulkan:
    ■ https://github.com/GPUOpen-LibrariesAndSDKs/VkD3DDeviceMapping
● Find out what features the GPUs support
● Figure out where your resources will live
● Figure out what needs to be synced
  ○ USE D3DDEBUG/VULKAN VALIDATION
● Figure out what needs to be copied
  ○ Do copies on the COPY QUEUE!!!!!!!

Challenges to anticipate
● SYNCHRONIZATION
  ○ Cross-node, cross-adapter, CPU/GPUs, AFR frame sets... etc...
● Bandwidth limitations
  ○ Them texture copies ain’t free
● If you have a Finished™ engine
  ○ Fixing all the places you assumed you had 1 GPU (heaps, command lists, basically everything)
● Tools?
  ○ lmao

Tools...?
● GPUView
  ○ Windows only :’)
● roll your own
● printf debugging

GPUView
● Let’s profile!!
● Download here:
  ○ https://docs.microsoft.com/en-us/windows-hardware/get-started/adk-install
  ○ Part of the Windows Performance Toolkit
● Using D3D12HeterogeneousMultiadapter sample
  ○ https://github.com/Microsoft/DirectX-Graphics-Samples

GPUView
● [capture walkthrough]

GPUView
● [capture walkthrough]
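To put a rough number on "them texture copies ain’t free" from the challenges slide, here is a back-of-envelope sketch. The figures are assumptions for illustration only: a 3840x2160 RGBA8 render target and the theoretical PCIe 3.0 x16 peak of about 15.75 GB/s. Real copies usually achieve less than peak, so treat the result as a lower bound rather than a measurement.

```python
def copy_time_ms(width: int, height: int, bytes_per_pixel: int,
                 bandwidth_gb_per_s: float) -> float:
    """Time to move one uncompressed texture across a link, in milliseconds."""
    size_bytes = width * height * bytes_per_pixel
    seconds = size_bytes / (bandwidth_gb_per_s * 1e9)
    return seconds * 1e3


# Assumed example: one 4K RGBA8 target at the theoretical PCIe 3.0 x16 peak.
one_copy_ms = copy_time_ms(3840, 2160, 4, 15.75)
frame_budget_ms = 1000.0 / 60.0  # 60 fps target, also an assumption

print(f"copy: {one_copy_ms:.2f} ms")
print(f"share of a 60 fps frame: {one_copy_ms / frame_budget_ms:.1%}")
```

Under these assumptions a single full-resolution copy costs about 2.1 ms, which is why it pays to decide up front what actually needs to cross between GPUs each frame.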

Conclusion
● It’s cool you should try it!!!!
● Wide open, lots of space for creativity
● BIG CHALLENGE
● If you ever wanted a project to really force you to think about your hardware, here u go

Questions?
● @Icetigris

ADDENDA

Dual GPU Cards
● 1999
● Quantum3D
  ○ 2x Voodoo2 SLI board

Source: https://en.wikipedia.org/wiki/Scan-Line_Interleave#/media/File:Quantum3D_Obsidian_X24_SLI_PCI.png

Dual GPU Cards
● 2008
  ○ AMD Radeon HD 3850 and 3870 X2

Some PERF NUMBERS
● Gears of War 4
  ○ 15.2 ms -> 8.6 ms
  ○ single GeForce GTX 980 Ti -> AFR
● Ashes of the Singularity
  ○ 17.1 ms
  ○ Radeon R9 Fury + GeForce GTX 980
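As a quick worked example of what the Gears of War 4 numbers mean, the sketch below converts them into a speedup and frame rates. It assumes the figures are per-frame times and that the AFR run used 2 GPUs (the earlier slide lists Gears of War 4 as AFR on 2 linked adapters); the helper names are made up for this example.

```python
def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the second configuration is."""
    return before_ms / after_ms


def fps(frame_ms: float) -> float:
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_ms


single_gpu_ms = 15.2  # single GeForce GTX 980 Ti, from the slide
afr_ms = 8.6          # AFR, from the slide
gpu_count = 2         # assumption: AFR across 2 linked adapters

gain = speedup(single_gpu_ms, afr_ms)
print(f"speedup: {gain:.2f}x")
print(f"fps: {fps(single_gpu_ms):.1f} -> {fps(afr_ms):.1f}")
print(f"scaling efficiency: {gain / gpu_count:.0%}")
```

Under those assumptions AFR gives roughly a 1.77x speedup, or about 88% of perfect 2-GPU scaling.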

GPUView
● [capture walkthrough]

GPUView
● [capture walkthrough]

GPUView
● [capture walkthrough]

GPUView
● [capture walkthrough]