MPEG Surround
Bridging the gap between stereo and multi-channel

Schuyler Quackenbush
Audio Research Labs

ARL audio research labs

MPEG Audio Standards

• Family of MPEG Audio standards
– 1997 MPEG-2 (AAC)
– 2003 MPEG-4 High Efficiency AAC (HE-AAC)
– 2006 MPEG-D MPEG Surround (MPS)
• Each builds upon the previous
– MPEG Surround: 5.1-channel coding
– HE-AAC: 2-channel MPEG Surround core coder
– AAC: 2-channel HE-AAC core coder

Exploiting “Dimensions” of Perception

• SNR (AAC)
– Perceptually shaped quantization noise
• Frequency (HE-AAC)
– Perceptually coded spectrum replication
• Space (MPEG Surround)
– Perceptual soundstage coding

Spatial Perception of Sound

• ILD – level difference
– Due to head shadowing differences
• ITD – time difference
– Due to distance differences
• ICC – coherence
– Due to reverberation differences

[Figure: direct path and reverberation paths from source to listener; path-length difference r θ + r sin(θ)]

Example Soundstage
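The three cues listed under Spatial Perception of Sound can be estimated from a pair of ear signals. The following is a minimal sketch, not the method used in the standard. It assumes the figure label reads r θ + r sin(θ) as a path-length difference, and the head radius, speed of sound, and all function names are illustrative assumptions that do not come from the slides.

```python
import math

def itd_from_angle(theta_rad, head_radius_m=0.0875, speed_of_sound=343.0):
    """Time difference in seconds for a source at angle theta.

    Uses the path-length difference r*theta + r*sin(theta).
    The default head radius and speed of sound are assumed values.
    """
    path_diff = head_radius_m * (theta_rad + math.sin(theta_rad))
    return path_diff / speed_of_sound

def ild_db(left, right):
    """Level difference in dB between two equal-length sample lists."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)

def icc(left, right):
    """Coherence estimated as the normalized cross-correlation at zero lag."""
    cross = sum(l * r for l, r in zip(left, right))
    norm = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return cross / norm

if __name__ == "__main__":
    left = [1.0, -2.0, 3.0]
    right = [0.5 * x for x in left]   # same waveform at half the amplitude
    print(ild_db(left, right), icc(left, right), itd_from_angle(math.pi / 2))
```

A source straight ahead (θ = 0) gives a zero time difference in this model, and two identical waveforms at different levels give full coherence with a non-zero level difference.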

Time/Frequency Decomposition

[Figure: time/frequency plot; horizontal axis Time (ticks 0.00 to 6.00), vertical axis Frequency (ticks 0.00 to 14000.00)]

MPEG Surround principle

• Compatible extension of existing stereo services to multi-channel
• Can be combined with any core coder
– AAC, HE-AAC
• Spatial parameters are a fraction of overall bit rate
– 40 kb/s HE-AAC core + 8 kb/s spatial side information
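As a quick check of how small the spatial side information is, the arithmetic for the example on this slide can be written out. This sketch assumes the "8" for the spatial side information is in kb/s, like the 40 kb/s core; the variable and function names are illustrative.

```python
core_kbps = 40.0   # HE-AAC core rate, from the slide
side_kbps = 8.0    # spatial side information, assumed to be in kb/s

def side_info_fraction(core, side):
    """Share of the total bit rate spent on spatial parameters."""
    return side / (core + side)

total_kbps = core_kbps + side_kbps
fraction = side_info_fraction(core_kbps, side_kbps)
print(f"total {total_kbps:.0f} kb/s, side info {fraction:.1%}")
# prints: total 48 kb/s, side info 16.7%
```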

HE-AAC Core Coder

Spatial Encoder Block Diagram

Spatial Decoder Block Diagram

Approach

• Modular, using simple building blocks
– One-To-Two (OTT) box
– Two-To-Three (TTT) box

[Figure: OTT box, 1 channel to 2 channels, with parameters; TTT box, 2 channels to 3 channels, with parameters]

Stereo to Surround Tree
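To make the One-To-Two building block concrete, here is a heavily simplified sketch of one such box driven by a single level-difference parameter. It is an illustration under stated assumptions, not the box defined by the standard: the parameter name `cld_db`, the plain-sum downmix, and the function names are all hypothetical, and the coherence parameter and any decorrelation are left out entirely.

```python
import math

def ott_gains(cld_db):
    """Gains that split one channel into two with a level difference of cld_db."""
    ratio = 10.0 ** (cld_db / 10.0)          # power ratio between the two outputs
    gain_1 = math.sqrt(ratio / (1.0 + ratio))
    gain_2 = math.sqrt(1.0 / (1.0 + ratio))
    return gain_1, gain_2                     # gain_1**2 + gain_2**2 == 1

def ott_downmix(ch_1, ch_2, eps=1e-12):
    """Encoder side: sum two channels and measure their level difference in dB."""
    mono = [a + b for a, b in zip(ch_1, ch_2)]
    energy_1 = sum(a * a for a in ch_1) + eps
    energy_2 = sum(b * b for b in ch_2) + eps
    return mono, 10.0 * math.log10(energy_1 / energy_2)

def ott_upmix(mono, cld_db):
    """Decoder side: recreate two channels whose level ratio matches cld_db."""
    gain_1, gain_2 = ott_gains(cld_db)
    return [gain_1 * x for x in mono], [gain_2 * x for x in mono]

if __name__ == "__main__":
    mono, cld = ott_downmix([2.0, 2.0], [1.0, 1.0])
    out_1, out_2 = ott_upmix(mono, cld)
    print(cld, out_1, out_2)
```

The upmix restores the level ratio between the two channels, not their original waveforms. A tree for stereo to surround would chain several such boxes, each with its own parameters.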

Flat implementation instead of tree

MPEG Surround - encoder

• MPEG Surround encoder generates
– Stereo downmix: encoded by core codec
– Spatial parameters: multiplexed in ancillary data of core codec

[Figure: Input 5.1 PCM → MPEG Surround Encoder (Down Mix, Parameter encoder) → Core Codec encoder → Output bit stream carrying Stereo data and MPEG Surround data]

MPEG Surround - decoder

• Decoding of stereo downmix
• Extraction of spatial parameters from ancillary data
• Combine stereo downmix and spatial parameters into multi-channel output

[Figure: Home Theatre Set-up with MPEG Surround Set-Top-Box. Input (digital) MPEG Surround bit stream → Set-Top-Box (MPEG Surround): Core Codec decoder → Stereo data and MPEG Surround data → MPEG Surround decoder → 5.1 PCM → Output (analog) → Home Theatre Set-up]
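The idea of carrying spatial parameters in ancillary data next to the core-coded stereo data, and extracting them again at the decoder, can be sketched with a toy frame container. The byte layout below is entirely hypothetical and is not the syntax of any MPEG bit stream; the function names are illustrative as well.

```python
import struct

HEADER = ">HH"                       # hypothetical: two 16-bit big-endian lengths
HEADER_SIZE = struct.calcsize(HEADER)

def mux_frame(core_payload: bytes, spatial_params: bytes) -> bytes:
    """Pack core-coded stereo data followed by ancillary spatial data."""
    header = struct.pack(HEADER, len(core_payload), len(spatial_params))
    return header + core_payload + spatial_params

def demux_core(frame: bytes) -> bytes:
    """Return only the stereo data; the ancillary part is never touched."""
    core_len, _ = struct.unpack_from(HEADER, frame, 0)
    return frame[HEADER_SIZE:HEADER_SIZE + core_len]

def demux_spatial(frame: bytes) -> bytes:
    """Return the spatial parameters stored after the core payload."""
    core_len, anc_len = struct.unpack_from(HEADER, frame, 0)
    start = HEADER_SIZE + core_len
    return frame[start:start + anc_len]

if __name__ == "__main__":
    frame = mux_frame(b"stereo-downmix", b"spatial-params")
    print(demux_core(frame), demux_spatial(frame))
```

Because the stereo data can be read without interpreting the ancillary part, a decoder that only calls `demux_core` still produces a valid stereo signal from the same frame.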

MPEG Surround - Legacy decoder

• Legacy stereo decoder
– Compatible decoding of stereo downmix
– Disregarding spatial parameters in ancillary data

[Figure: Standard Set-up with stereo-only Set-Top-Box. Input (digital) MPEG Surround bit stream → Set-Top-Box (standard), stereo-only: Core Codec decoder → Stereo data (MPEG Surround data disregarded) → Stereo PCM → Output (analog)]

MPEG Surround – Binaural

• Low complexity Binaural stereo
– No intermediate 5.1 channel required
• Personalized experience
– Application of Binaural Room Transfer Functions (BRTF)
– Supports head tracking
• Also works for legacy stereo input

[Figure: MPEG Surround mobile audio playback. Input (digital) MPEG Surround bit stream → Mobile audio playback device (MPEG Surround): Core Codec decoder → Stereo data and MPEG Surround data → MPEG Surround decoder with personal rendering (Personal BRTFs, Head tracking) → “Stereo” PCM → Output (analog)]

Multi-channel performance

Operating Range / Performance

[Figure: MUSHRA [MOS] score (0 to 100) versus MPEG Surround side info bitrate [kb/s] (ticks 0, 1, 10, 100)]

Two typical use cases

• DVB (e.g. Layer II)
• Mobile music player (e.g. MP3 or AAC)

One codec for many uses

Excellent addition to the MPEG Audio Family!

• AAC
– Perceptual coder: shaped noise
– 64 kb/s/chn
• HE-AAC
– SBR: Parametrically coded high frequencies
– 24 kb/s/mono, 32 kb/s/stereo
• MPEG Surround
– Spatial coding: Parametrically coded 2D audio space
– 9.6 kb/s/chn (48 kb/s/5-channel)