MPEG Surround Audio Research Labs Schuyler Quackenbush Bridging the Gap Between Stereo
MPEG Surround
Bridging the gap between stereo and multi-channel
Schuyler Quackenbush
Audio Research Labs (ARL)

MPEG Audio Standards
• Family of MPEG Audio standards
  – 1997 MPEG-2 Advanced Audio Coding (AAC)
  – 2003 MPEG-4 High Efficiency AAC (HE-AAC)
  – 2006 MPEG-D MPEG Surround (MPS)
• Each builds upon the previous
  – MPEG Surround: 5.1-channel coding
  – HE-AAC: 2-channel MPEG Surround core coder
  – AAC: 2-channel HE-AAC core coder

Exploiting “Dimensions” of Perception
• SNR: AAC
  – Perceptually shaped quantization noise
• Frequency: HE-AAC
  – Perceptually coded spectrum replication
• Space: MPEG Surround
  – Perceptual soundstage coding

Spatial Perception of Sound
• ILD – level difference
  – Due to head shadowing differences
• ITD – time difference
  – Due to distance differences
• ICC – coherence
  – Due to reverberation differences
[Figure: listener's head with the direct path and reverberation paths marked; path-length annotation rθ + r sin(θ)]

Example Soundstage
[Figure: example soundstage]

Time/Frequency Decomposition
[Figure: time/frequency plot; horizontal axis Time, 0.00 to 6.00; vertical axis Frequency, 0.00 to 14000.00]

MPEG Surround principle
• Compatible extension of existing stereo services to multi-channel
• Can be combined with any core coder
  – AAC, HE-AAC
• Spatial parameters are a fraction of overall bit rate
  – 40 kb/s HE-AAC core + 8 kb/s spatial side information

HE-AAC Core Coder
[Figure: HE-AAC core coder]

Spatial Encoder Block Diagram
[Figure: spatial encoder block diagram]

Spatial Decoder Block Diagram
[Figure: spatial decoder block diagram]

Approach
• Modular, using simple building blocks
  – One-To-Two (OTT) box
  – Two-To-Three (TTT) box
[Figure: OTT box (1 channel to 2, with parameters) and TTT box (2 channels to 3, with parameters)]

Stereo to Surround Tree
[Figure: stereo to surround tree]

Flat implementation instead of tree
[Figure: flat implementation]

MPEG Surround - encoder
[Block diagram: 5.1 PCM input; MPEG Surround encoder producing the downmixed stereo data and the MPEG Surround data (parameters); core codec encoder producing the core bit stream output]
• MPEG Surround encoder generates
  – Stereo downmix: encoded by core codec
  – Spatial parameters: multiplexed in ancillary data of core codec

MPEG Surround - decoder
Home Theatre Set-up with MPEG Surround Set-Top-Box
[Block diagram: core bit stream input (digital); Set-Top-Box (MPEG Surround) with core codec decoder producing stereo data and MPEG Surround data, followed by the MPEG Surround decoder; 5.1 PCM output (analog)]
• Decoding of stereo downmix
• Extraction of spatial parameters from ancillary data
• Combine stereo downmix and spatial parameters into multi-channel output

MPEG Surround - Legacy decoder
Standard Set-up with stereo-only Set-Top-Box
[Block diagram: core bit stream input (digital); Set-Top-Box (standard) with core codec decoder, MPEG Surround data disregarded; stereo PCM output (analog)]
• Legacy stereo decoder
  – Compatible decoding of stereo downmix
  – Disregarding spatial parameters in ancillary data

MPEG Surround – Binaural rendering
MPEG Surround mobile audio playback device
[Block diagram: core bit stream input (digital); mobile audio device (MPEG Surround) with core codec decoder and MPEG Surround decoder, fed by head tracking and personal BRTFs; “stereo” PCM output (analog)]
• Low complexity binaural stereo
  – No intermediate 5.1-channel signal required
• Personalized experience
  – Application of Binaural Room Transfer Functions (BRTF)
  – Supports head tracking
• Also works for legacy stereo input data

Multi-channel performance
[Figure: multi-channel performance]

MPEG Surround Operating Range / Performance
[Figure: MUSHRA [MOS] score, 0 to 100, versus side info bitrate [kb/s] on a logarithmic axis from 1 to 100]

Two typical use cases
• DVB (e.g. Layer II)
• Mobile music player (e.g. MP3 or AAC)

One codec for many uses
[Figure: one codec for many uses]

Excellent addition to the MPEG Audio Family!
• AAC
  – Perceptual coder: shaped noise
  – 64 kb/s/chn
• HE-AAC
  – SBR: Parametrically coded high frequencies
  – 24 kb/s/mono, 32 kb/s/stereo
• MPEG Surround
  – Spatial coding: Parametrically coded 2D audio space
  – 9.6 kb/s/chn (48 kb/s/5-channel)
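
Worked example: bit-rate arithmetic
The bit-rate figures quoted on the slides appear to fit together: a 40 kb/s HE-AAC core plus 8 kb/s of spatial side information gives the 48 kb/s total, and dividing by five channels gives the 9.6 kb/s/chn figure. The short Python sketch below only reproduces that arithmetic. The constant and function names are illustrative and do not come from the slides or from any MPEG reference software.

```python
# Figures quoted on the slides, in kb/s. All names here are illustrative.
HE_AAC_CORE_KBPS = 40.0        # stereo HE-AAC core coder
SPATIAL_SIDE_INFO_KBPS = 8.0   # MPEG Surround spatial side information
NUM_CHANNELS = 5               # the slides divide the total by 5 channels


def total_bitrate(core_kbps, side_kbps):
    """Total stream rate: core coder plus spatial side information."""
    return core_kbps + side_kbps


def per_channel_bitrate(total_kbps, channels):
    """Average rate per reproduced channel."""
    return total_kbps / channels


def side_info_share(core_kbps, side_kbps):
    """Fraction of the total rate spent on spatial parameters."""
    return side_kbps / (core_kbps + side_kbps)


total = total_bitrate(HE_AAC_CORE_KBPS, SPATIAL_SIDE_INFO_KBPS)
per_channel = per_channel_bitrate(total, NUM_CHANNELS)
share = side_info_share(HE_AAC_CORE_KBPS, SPATIAL_SIDE_INFO_KBPS)

print(f"total: {total:.1f} kb/s")
print(f"per channel: {per_channel:.1f} kb/s")
print(f"side-info share: {share:.1%}")
```

Under these figures the spatial parameters take one sixth of the stream, which is the sense in which they are "a fraction of overall bit rate".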
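
Worked example: estimating ILD, ITD and ICC
The three cues listed under "Spatial Perception of Sound" can be illustrated with textbook estimators on a pair of signals: a power ratio in dB for the level difference, the lag of the cross-correlation peak for the time difference, and the normalized cross-correlation for the coherence. The Python sketch below is a minimal broadband, time-domain illustration under those assumptions. It is not the MPEG Surround parameter extraction, which the slides suggest operates on a time/frequency decomposition, and the function names and test signals are hypothetical.

```python
import math


def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)


def ild_db(left, right):
    """Level difference in dB: 10 * log10(P_left / P_right)."""
    return 10.0 * math.log10(energy(left) / energy(right))


def cross_correlation(left, right, lag):
    """Sum of left[i] * right[i + lag] over the overlapping samples.

    Assumes both signals have the same length.
    """
    n = len(left)
    start, stop = max(0, -lag), min(n, n - lag)
    return sum(left[i] * right[i + lag] for i in range(start, stop))


def itd_samples(left, right, max_lag):
    """Lag (in samples) of the cross-correlation peak.

    A positive result means the right signal is delayed relative to the left.
    """
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cross_correlation(left, right, lag))


def icc(left, right, lag=0):
    """Cross-correlation at the given lag, normalized by the signal energies."""
    norm = math.sqrt(energy(left) * energy(right))
    return cross_correlation(left, right, lag) / norm


# Hypothetical test pair: right is left delayed by 2 samples at half amplitude.
left = [0.0, 1.0, 0.5, -0.5, 0.25, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.5, 0.25, -0.25, 0.125, 0.0]

delay = itd_samples(left, right, max_lag=3)
print("ILD [dB]:", round(ild_db(left, right), 2))
print("ITD [samples]:", delay)
print("ICC at the ITD lag:", icc(left, right, delay))
```

Because the right signal is a scaled, delayed copy of the left one, the coherence measured at the estimated delay is 1; added reverberation would lower it, which is the effect the ICC cue describes.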