Making Games Sound as Good as They Look

Real-time Geometric Acoustics on the GPU

Sound in Modern Games

• Commercial middleware (FMOD, Wwise)
• Hardware or software rendering (DirectSound 3D, OpenAL, EAX)
• Some additional extensions to improve immersion – e.g. calculate occlusion (per ) and filter if occluded
• Existing systems are parametric, rather than simulations

Parametric vs. Simulation

• Parametric model: parameters are derived from game state (reverb wet/dry level, room size, source distance, direction, etc.)
• Parameters then map to DSP variables which operate on the raw source (recorded samples)
• Game state never directly interacts with the signal processing

Parametric vs. Simulation
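As a concrete illustration of the parametric model just described, here is a minimal sketch of game state being reduced to DSP parameters. The function name, the inverse-distance curve, and the volume-to-wet mapping are all hypothetical stand-ins for the designer-tuned curves a middleware tool would expose:

```python
import math

def parametric_params(distance_m, room_volume_m3, ref_distance=1.0):
    """Map game state to DSP parameters (a hypothetical parametric model).

    Inverse-distance attenuation drives the dry gain, and a crude
    log-of-volume curve drives the reverb wet level.  Game state never
    touches the signal itself -- only these parameters do.
    """
    dry_gain = ref_distance / max(distance_m, ref_distance)       # 1/r law, clamped
    wet = min(1.0, math.log10(max(room_volume_m3, 1.0)) / 4.0)    # bigger room -> wetter
    return {"dry_gain": dry_gain, "reverb_wet": wet}
```

The point of the sketch is the indirection: the mapping can be reshaped freely by a sound designer without ever touching the signal chain.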

• Simulation model: game state directly generates the audio
• Advantages: the audio experience is more directly influenced by the game world; can be more immersive if done right
• Disadvantages: can easily sound “broken” if the model is used outside its limitations – not simple to fix by ‘fudging’ parameters

S.T.A.L.K.E.R.™ : Call of Pripyat

• Basic positional audio (rendered through OpenAL)
• Almost no environment dependence
• Limited immersion, high performance
• Typical for games where the focus is on graphics

ARMA 2: Operation Arrowhead

• 2010 (PC only), continuously updated (version from late 2012)
• Simulation focus, HRTF and “software” occlusion
• Rendering through OpenAL (although OpenAL cannot process geometry)
• Focuses on sound as a gameplay mechanic

Quake 3 Implementation

q3dm1
• Acoustic version shown (3186 triangles)
• 4096 rays to generate reflections
• 3 orders of reflection – specular & diffuse; HRTF per reflection (12k)
• IIR-based material descriptions

Example: Used in Quake 3 Arena

For video: https://www.youtube.com/watch?v=TXUTgEmnD6U (please use headphones if possible!)

Geometry Engine

“Backwards” Ray-tracing
• Start at the listener
• Use a specular and diffuse reflection approximation at each bounce
• Generate an impulse response
• 1 ray per thread
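To make the "bounce rays, write an impulse response" idea concrete, here is a drastically simplified sketch reduced to one dimension: a single ray leaves the listener, bounces between two walls, and writes an attenuated tap into the impulse response each time its path crosses the source. The geometry, reflection coefficient, and spreading-loss model are illustrative assumptions; the real engine traces thousands of 3-D rays (one per GPU thread) with specular and diffuse bounces and per-ray HRTF:

```python
C = 343.0  # speed of sound (m/s)

def trace_ray_1d(xl, xs, L, direction=1, orders=3, refl=0.7, fs=8000, n=4096):
    """One backwards-traced ray in a 1-D 'room' with walls at 0 and L.

    xl: listener position, xs: source position.  Each time the ray's
    path segment crosses the source, a tap (delay from path length,
    amplitude from bounce losses and spreading) is added to the IR.
    """
    ir = [0.0] * n
    x, d = xl, direction
    travelled, energy, bounces = 0.0, 1.0, 0
    while bounces <= orders:
        wall = L if d > 0 else 0.0
        if (x - xs) * (wall - xs) <= 0:        # source lies on this segment
            dist = travelled + abs(xs - x)
            tap = int(dist / C * fs)
            if tap < n:
                ir[tap] += energy / (1.0 + dist)   # simple spreading loss
        travelled += abs(wall - x)
        x, d = wall, -d                        # specular bounce
        energy *= refl                         # wall absorption
        bounces += 1
    return ir
```

Shooting many such rays in random directions and summing their taps is what turns geometry sampling into a usable impulse response.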

Geometry Engine

• Finding exact paths vs. sampling the sound field
• The impulse response acts as a filter
• Each ray has different frequency characteristics due to the HRTF (different incident direction)
• Unique among real-time systems
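"The impulse response acts as a filter" means the dry source is convolved with the simulated IR. A direct-convolution sketch shows the operation (a real engine would use FFT or partitioned convolution for speed; the sparsity shortcut below reflects that ray-traced IRs are mostly zeros):

```python
def convolve(ir, signal):
    """Apply a simulated impulse response to a dry signal by direct
    convolution -- the IR literally is the filter the geometry engine
    produces.  Skips zero taps, since ray-traced IRs are sparse."""
    out = [0.0] * (len(ir) + len(signal) - 1)
    for i, h in enumerate(ir):
        if h == 0.0:
            continue
        for j, s in enumerate(signal):
            out[i + j] += h * s
    return out
```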

Example: Corner Bass Reinforcement
• In the pictured situation, both listeners are facing the source
• Listener 1 is near the center of the room; its frequency response is fairly flat
• Listener 2 is in a corner; its frequency response has the low frequencies boosted
• Why is this?

Performance
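The corner question above can be answered with a small image-source calculation (the geometry here is invented for illustration, with perfectly rigid walls): near a two-wall corner, the wall reflections travel paths only slightly longer than the direct path, so at low frequencies all arrivals are nearly in phase and add constructively, while at high frequencies the phases scatter:

```python
import cmath, math

C = 343.0  # speed of sound (m/s)

def pressure(f, paths):
    """Magnitude of the phasor sum over (distance, gain) propagation paths."""
    return abs(sum(g / d * cmath.exp(-2j * math.pi * f * d / C)
                   for d, g in paths))

def corner_demo(f):
    """Source and listener near a rigid corner (walls x=0 and y=0).

    The two single reflections and the double reflection are modeled as
    image sources mirrored across the walls.  At low f the four paths
    arrive nearly in phase -> bass reinforcement."""
    src, lis = (0.5, 0.5), (1.0, 1.0)
    images = [src, (-src[0], src[1]), (src[0], -src[1]), (-src[0], -src[1])]
    paths = [(math.dist(s, lis), 1.0) for s in images]
    return pressure(f, paths)
```

At 10 Hz the summed response is more than twice the direct-path level; sweeping `f` upward shows the reinforcement give way to comb filtering.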

• Computationally expensive (scales as Orders × Sources × Triangles × Rays)
• Triangle count can be fairly low (use the collision mesh instead of the displayed mesh)
• A single-level BVH (bounding volume “hierarchy”) is sufficient

Optimizations (Geometry)

• Would rather not tolerate linear scaling in the number of sources (reasoning: everything else can be mitigated in design by reducing quality)
• Solution: dynamically allocate rays to sources
• Psychoacoustic justification: a more chaotic sound scene means less ability to discern individual sounds
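A minimal sketch of dynamic ray allocation, assuming a fixed total ray budget split in proportion to some perceptual weight per source (e.g. loudness at the listener). The weight metric, the minimum-rays floor, and the trimming policy are illustrative choices, not the engine's actual rules:

```python
def allocate_rays(budget, weights, min_rays=8):
    """Split a fixed ray budget across sources proportionally to a
    perceptual weight, keeping a floor so no source is starved, and
    trimming the largest allocations if rounding overshoots the budget."""
    total = sum(weights)
    alloc = [max(min_rays, int(budget * w / total)) for w in weights]
    while sum(alloc) > budget:
        i = alloc.index(max(alloc))
        alloc[i] -= 1
    return alloc
```

The payoff is that total cost stays fixed as sources are added; each source just gets a coarser sampling, which the psychoacoustic argument above says listeners tolerate.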

Dynamic Ray Allocation

• Dynamic ray allocation raises two problems:
• First problem: amplitude changes as rays are reassigned
• Second problem: slower ray-tracing (warp divergence between bounding volumes)

Ray Sorting

• Re-sort rays by assigned source in each block (inspired by Garanzha & Loop, 2010)
• Advantage: gain spatial coherence when performing occlusion testing
• Disadvantage: ray source positions and directions are less coherent (remember, each source still needs omnidirectional sampling of rays)
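The per-block re-sort amounts to a stable counting sort on source id, so that threads testing occlusion against the same source's bounding volume run contiguously. A sketch (the `(source_id, payload)` tuple is a stand-in for the real ray record; on the GPU this would be done in shared memory):

```python
def sort_rays_by_source(rays, num_sources):
    """Stable counting sort of a block's rays by assigned source id.
    Contiguous same-source rays reduce warp divergence during the
    per-source bounding-volume occlusion tests."""
    buckets = [[] for _ in range(num_sources)]
    for ray in rays:
        buckets[ray[0]].append(ray)        # ray[0] is the source id
    return [ray for bucket in buckets for ray in bucket]
```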

GPU Audio Processing

• Each ray generates an audio stream for mix-down (4096 in the example)
• Each stream is delayed and filtered according to the materials it reflected from and its HRTF position
• Parallel mix-down (optimal performance depends on architecture)
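The per-ray processing above can be sketched serially: each ray contributes a delayed, scaled copy of the source into the output buffer. For brevity the whole filter chain (material biquads plus HRTF) is collapsed here into a single gain per ray, which is an illustrative simplification:

```python
def mix_down(taps, signal, out_len):
    """Mix per-ray streams into one buffer.  Each tap is
    (delay_samples, gain); delay comes from path length, gain stands in
    for the ray's material/HRTF filtering."""
    out = [0.0] * out_len
    for delay, gain in taps:
        for i, s in enumerate(signal):
            j = delay + i
            if j < out_len:
                out[j] += gain * s
    return out
```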

Behavior at Reflection

• Typical commercial software uses per-frequency-band simulation
• The reflection coefficient (1.0 − absorption) is a scalar for each band (separate sets of coefficients may be used for diffuse reflections)
• Approximate the multi-band simulation by representing reflection behavior as a biquad filter (loses some degrees of freedom)
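Representing a material as a biquad means each reflection applies one second-order IIR section instead of per-band scalars. A plain direct-form-I biquad (normalized so a0 = 1); the coefficients for a given material would be fitted to its absorption data, which is outside this sketch:

```python
def biquad(x, b0, b1, b2, a1, a2):
    """Run one biquad section (a0 normalized to 1) over a signal --
    the compact stand-in for a material's per-band reflection
    coefficients."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for s in x:
        out = b0 * s + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x1, x2, y1, y2 = s, x1, out, y1   # shift delay line
        y.append(out)
    return y
```

The "loss of degrees of freedom" on the slide is visible here: five coefficients must stand in for an arbitrary per-band curve.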

GPU Mix-down

• Atomic operations on Kepler GPUs (easy)
• Round-robin method on pre-Kepler GPUs (similar to a parallel reduction)
• Only part of the work is done on the GPU (the CPU is needed to mix down the shared memory regions)
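The reduction-style mix-down for GPUs without fast atomics can be sketched as a pairwise tree reduction over per-thread buffers: at each step, half the remaining participants add a partner's buffer into their own. This is an illustration of the parallel-reduction pattern, not the engine's exact kernel:

```python
def tree_reduce_buffers(buffers):
    """Pairwise (tree) reduction of per-thread audio buffers: log2(n)
    steps, each adding a partner buffer at the current stride."""
    bufs = [list(b) for b in buffers]
    n = len(bufs)
    stride = 1
    while stride < n:
        for i in range(0, n - stride, 2 * stride):
            bufs[i] = [a + b for a, b in zip(bufs[i], bufs[i + stride])]
        stride *= 2
    return bufs[0]
```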

HRTF and Diffraction

• At low frequencies (below 1 kHz) the HRTF is essentially flat
• These are the frequencies which diffract the most around architecture (wavelength at 250 Hz ≈ 1.4 m)
• Precise positioning of diffracted sources is not very important
• HRTF IIR filter coefficients are derived from FIR experiments (Prony’s method can be used, but it is easier just to match the approximate amplitude and cutoff)

Integrating Artificial Reverb
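The "match approximate amplitude and cutoff" shortcut for HRTF filters amounts to something as cheap as a one-pole lowpass tuned to the measured cutoff. A sketch, assuming this simplest possible IIR stands in for the fitted filter (real HRTF fits would use higher-order sections):

```python
import math

def one_pole_lowpass(fc, fs):
    """One-pole lowpass coefficients matched to a cutoff fc at sample
    rate fs: y[n] = b*x[n] + a*y[n-1], with unity gain at DC."""
    a = math.exp(-2.0 * math.pi * fc / fs)
    return (1.0 - a, a)

def filter_one_pole(x, b, a):
    """Run the one-pole filter over a signal."""
    y, prev = [], 0.0
    for s in x:
        prev = b * s + a * prev
        y.append(prev)
    return y
```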

• Simulation is useful for the first 3 or 4 orders (for reference, on a GTX Titan using the q3dm1 map, 3 orders of simulation takes ~30% of real-time performance)
• 3 orders generates about 500 ms of reverb time, but in practice, T60 for a comparable room can be several seconds (proportional to volume and inversely proportional to absorption)
• Additional problem: simulated reverberation gets grainier as the reverb time gets longer

Integrating Artificial Reverb
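The "proportional to volume, inversely proportional to absorption" relation is Sabine's formula, and the mean free path (4V/S) is the standard companion result for sizing the artificial tail. A sketch with a made-up example room:

```python
def sabine_t60(volume_m3, surface_m2, absorption):
    """Sabine's reverberation formula: T60 = 0.161 * V / (S * alpha),
    with V in m^3, S in m^2, and alpha the mean absorption coefficient."""
    return 0.161 * volume_m3 / (surface_m2 * absorption)

def mean_free_path(volume_m3, surface_m2):
    """Mean free path of a sound ray in an enclosure: 4V / S."""
    return 4.0 * volume_m3 / surface_m2
```

For an illustrative 10 × 10 × 3 m room with 10% average absorption, T60 comes out near 1.5 s, well beyond the ~500 ms the three simulated orders supply.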

• Can estimate the artificial reverb length with the mean free path
• Intuitive to apply reverb at the end of the DSP chain, but problematic in some situations
• Resolved by applying reverb at the front of the DSP chain (before HRTF filtering)

Design Considerations…

• On a parametric system, it is easy to incorporate creative input
• Make the sound more ‘exciting’ or more ‘chaotic’
• We don’t want the sound designer to become a level designer
• Simulation may only be suitable for certain types of games (realistic, rather than cinematic)

Thanks for Listening!

• For more information, see the dissertation entitled: Design of a Real-Time GPU Accelerated Acoustic Simulation Engine for Interactive Applications (University of Illinois Press, 2014)