Crowd Simulation in Froblins

Jeremy Shopf, AMD GPG

Beyond Programmable Shading: In Action Outline

• Motivation • Global Navigation using the GPU • Local Navigation and Avoidance • Spatial Queries • Conclusions

Beyond Programmable Shading: In Action 2 All slides © 2008 , Inc. Used with permission. Dynamic and Engaging World Through AI

• Global path finding along with local navigation and avoidance – Crowd dynamics similar to fluids – Agents “flow” towards the closest goal, along the path of least resistance – Froblins follow correct paths and do not “get stuck” • Implemented in DirectX® 10.1

Beyond Programmable Shading: In Action 3 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Video

Beyond Programmable Shading: In Action 4 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Global Navigation

• Continuum approach [Treuille et al. 2006] – Smooth flow-like crowd movement – Congestion avoidance – Emergent behavior

Beyond Programmable Shading: In Action 5 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Continuum Model

• Model environment as positive cost function Static Cost + Agent Density + Hazards = Total Cost

Beyond Programmable Shading: In Action 6 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Continuum Model

• Potential ( Φ ) = cost integrated along shortest path to goal. “Travel-time” • Φ=0 at goal, satisfies eikonal equation everywhere else: (x) = F

Beyond Programmable Shading: In Action 7 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Global Navigation Direction

Cost Potential ∆ Potential • By following the negative gradient of the potential, agents are guaranteed to follow optimal path

Beyond Programmable Shading: In Action 8 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

• Initialization: Set goal cell Φ = 0. Mark goal cell as KNOWN. All else UNKNOWN.

UNKNOWN ∞ ∞ ∞ NEIGHBOR ∞ 0 ∞ KNOWN ∞ ∞ ∞

Beyond Programmable Shading: In Action 9 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

1. Mark UNKNOWN cells adjacent to KNOWN cells as NEIGHBOR

UNKNOWN ∞ ∞ ∞ NEIGHBOR ∞ 0 ∞ KNOWN ∞ ∞ ∞

Beyond Programmable Shading: In Action 10 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

2. Calculate Φ at each NEIGHBOR

UNKNOWN ∞ 2 ∞ NEIGHBOR 1 0 2 KNOWN ∞ 3 ∞

Beyond Programmable Shading: In Action 11 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

3. Mark NEIGHBOR cell with smallest Φ from step 2 as KNOWN

UNKNOWN ∞ 2 ∞ NEIGHBOR 1 0 2 KNOWN ∞ 3 ∞

Beyond Programmable Shading: In Action 12 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

1. Mark UNKNOWN cells adjacent to KNOWN cells as NEIGHBOR

UNKNOWN ∞ 2 ∞ NEIGHBOR 1 0 2 KNOWN ∞ 3 ∞

Beyond Programmable Shading: In Action 13 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

2. Calculate Φ at each NEIGHBOR

UNKNOWN 3 2 ∞ NEIGHBOR 1 0 2 KNOWN 4 3 ∞

Beyond Programmable Shading: In Action 14 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

3. Mark NEIGHBOR cell with smallest Φ from step 2 as KNOWN

UNKNOWN 3 2 ∞ NEIGHBOR 1 0 2 KNOWN 4 3 ∞

Beyond Programmable Shading: In Action 15 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

1. Mark UNKNOWN cells adjacent to KNOWN cells as NEIGHBOR

UNKNOWN 3 2 ∞ NEIGHBOR 1 0 2 KNOWN 4 3 ∞

Beyond Programmable Shading: In Action 16 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

2. Calculate Φ at each NEIGHBOR

UNKNOWN 3 2 4 NEIGHBOR 1 0 2 KNOWN 4 3 ∞

Beyond Programmable Shading: In Action 17 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

3. Mark NEIGHBOR cell with smallest Φ from step 2 as KNOWN

UNKNOWN 3 2 4 NEIGHBOR 1 0 2 KNOWN 4 3 ∞

Beyond Programmable Shading: In Action 18 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method[Sethian 1996]

3. Mark NEIGHBOR cell with smallest Φ from step 2 as KNOWN

UNKNOWN 3 2 4 .. And so on until the neighbor list is empty NEIGHBOR 1 0 2 KNOWN 4 3 ∞

Beyond Programmable Shading: In Action 19 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Marching Method

• Worst-case complexity O(n lg n) • One cells’ potential calculated per iteration • Not fast enough for our application • Not readily parallelizable

Beyond Programmable Shading: In Action 20 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Fast Iterative Method[Jeong and Whitaker 2007]

• No ordered data structure • Iteratively update active cells in parallel • Active cells determined by convergence measure • Active cells updated and culled in tiles

Beyond Programmable Shading: In Action 21 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Simplified Fast Iterative Method

• For smaller datasets, maintaining active tile list and testing convergence has negative performance impact • Conservative iteration # determined empirically • Solve 4 solutions simultaneously – 98% ALU utilization • 5-39x speedup over CPU FMM*

*Configuration: AMD reference platform with AMD Athlon™ 64 X2 Dual-Core Processor 4600+, 2.40GHz, 2GB RAM. GPU: ATI ™ HD 4870 Graphics. Beyond Programmable Shading: In Action Motherboard: ASUSTek M2R32-MVP. Memory: DDR2-800 400 MHz. Operating 22 System: Windows Vista® SP1. All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Global Navigation Direction Results

Beyond Programmable Shading: In Action 23 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Global Navigation Direction

Results

Beyond Programmable Shading: In Action 24 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Implementation thoughts

• High API overhead – API calls and state changes – Compute API would provide quicker launch • Would benefit from – Perform several iterations for a tile in one launch [Jeong and Whitaker 2007] – Local Data Share (CAL/DX11/OpenCL) • Will map easily to DX11 Compute Shader

Beyond Programmable Shading: In Action 25 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. More on Continuum Model

• Computed spatially • Scales well with number of agents • Motion fidelity dependent on resolution of solution

Beyond Programmable Shading: In Action 26 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Local Navigation and Avoidance

• “Sense” surroundings • Update velocity based on positions and velocities of nearby agents/obstacles • Based on Velocity Obstacle formulation [Fiorini and Shiller 1998]

Beyond Programmable Shading: In Action 27 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Binning

Agent Positions Bin Counter Bin Array 3 0 1 0 0 3 1 5 1 0 0 1 1 5 2 0 0 0 1 2 6 4 0 0 0 2 4 6 • World space position mapped to 2D index • Bin Counter: color buffer, tracks bin loads • ID Array: depth array, stores binned IDs

Beyond Programmable Shading: In Action 28 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Bin Queries

• Translate position to 2D index • Fetch bin load from color buffer • Fetch binned agent IDs from depth array – IDs are stored in array in sorted order

Beyond Programmable Shading: In Action 29 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Algorithm Overview

• Binning is a multi-pass algorithm – A single agent can fall into a given bin per pass – Once binned, agents removed from working set – Algorithm repeats until working set is empty

Beyond Programmable Shading: In Action 30 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Binning Algorithm

• Rasterize agents as point primitives • Use depth-test unit to bin in sorted order • Additively blend binned points to maintain bin counts

Beyond Programmable Shading: In Action 31 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Binning with Reduction

• A naïve implementation requires n passes over m points • Use GS & stream-out to cull binned points • Use predicated rendering to terminate unnecessary draw calls once all agents are binned • Detailed description in course notes

Beyond Programmable Shading: In Action 32 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Agent Direction Determination

• Evaluate a fixed set of possible directions relative to global navigation direction

Beyond Programmable Shading: In Action 33 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Agent Direction Determination

• Calculate fitness of each discrete direction • Fitness based on minimum time to collision and angle relative to global navigation direction • Choose direction with greatest fitness

Beyond Programmable Shading: In Action 34 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Agent Direction Determination

• Calculate fitness of each discrete direction • Fitness based on minimum time to collision and angle relative to global navigation direction • Choose direction with greatest fitness

Beyond Programmable Shading: In Action 35 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Agent Direction Determination

• Calculate fitness of each discrete direction • Fitness based on minimum time to collision and angle relative to global navigation direction • Choose direction with greatest fitness

Beyond Programmable Shading: In Action 36 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Results

• Real-time frame rates (>20fps) on ATI Radeon™ HD 4870* – >65,000 agents simulated or – 3,000 highly detailed froblins rendered and simulated or – 16,384 simplified agents rendered and simulated

*Configuration: AMD reference platform with AMD Athlon™ 64 X2 Dual-Core Processor 4600+, 2.40GHz, 2GB RAM. GPU: ATI Radeon™ HD 4870 Grahpics. Beyond Programmable Shading: In Action Motherboard: ASUSTek M2R32-MVP. Memory: DDR2-800 400 MHz. Operating 37 System: Windows Vista® SP1. All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Advantages/Disadvantages

Advantages Disadvantages

Smooth flow Group-based Computed spatially Global knowledge Global Avoids congestion Emergent phenomena

Fine-grained navigation and Small set of discrete avoidance directions may cause Local oscillations

Deadlock scenarios

Beyond Programmable Shading: In Action 38 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Conclusions

• Massive crowd simulation entirely utilizing the GPU • Novel binning algorithm • See course notes for “Advances in Real- Time Rendering in 3D Graphics and Games” for more on Froblins

Beyond Programmable Shading: In Action 39 All slides © 2008 Advanced Micro Devices, Inc. Used with permission. Thank you

• Game Computing Applications Group – Natalya Tatarchuk, Chris Oat, Joshua Barczak, Abe Wiley • Dan Abrahams-Gessel • Won-Ki Jeong

Trademark Attribution

AMD, the AMD Arrow logo, AMD Opteron, AMD Athlon, AMD Turion, AMD Phenom, ATI, the ATI logo, Radeon, Catalyst, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Microsoft, Windows, and Windows Vista are registered trademarks of Microsoft Corporation in the United States and/or other jurisdictions. Other names are for identification purposes only and may be trademarks of their respective owners.

©2008 Advanced Micro Devices, Inc. All rights reserved. Beyond Programmable Shading: In Action 40 All slides © 2008 Advanced Micro Devices, Inc. Used with permission.