The MPEG-4 General Audio Coder

The MPEG-4 General Audio Coder

The MPEG-4 General Audio Coder The MPEG-4 General Audio Coder Bernhard Grill Fraunhofer Institute for Integrated Circuits (IIS) grl 6/98 page 1 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Outline • MPEG-2 Advanced Audio Coding (AAC) • MPEG-4 Extensions: – Perceptual Noise Substitution (PNS) – Long Term Prediction – TwinVQ Coding Core • The MPEG-4 Scalable General Audio Coder • Results of Listening Tests • Demonstration of a Real-Time Player grl 6/98 page 2 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder MPEG-2 AAC Encoder Overview: Bitstream Output Bitstream Multiplexer Gain Filter TNS M/S Quant. Control Scale Bank Coding Factors Coupling Intensity/ Noiseless Prediction Input time signal Perceptual Model Rate/Distortion Control grl 6/98 page 3 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Extension: Perceptual Noise Substitution (PNS) Bitstream Output Bitstream Multiplexer Gain Filter TNS M/S Quant. Control Scale Bank Coding Factors Coupling Intensity/ Noiseless Prediction PNS Input time signal Perceptual Model Rate/Distortion Control grl 6/98 page 4 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Perceptual Noise Substitution (2) Background: • Parametric coding of noise-like signal components has been used widely e.g. in speech coding MPEG-4: • Perceptual Noise Substitution (PNS) permits a frequency selective parametric coding of noise-like signal components • Noise-like signal components are detected on a scalefactor band basis • Corresponding groups of spectral coefficients are excluded from quantization/coding • Instead, only a "noise substitution flag" plus total power of the substituted band is transmitted in the bitstream • Decoder inserts pseudo random vectors with desired target power as spectral coefficients grl 6/98 page 5 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Perceptual Noise Substitution (3) Perceptual "Perceptual Noise Model Substitution" (PNS): Audio Analysis Quantization Bitstream Bitstream Perceptual coder + Input Filterbank & Coding Multiplexer Out parametric represent. Noise Noise subst. signaling Detection of noise-like signals Substituted signal energies Encoder Decoder Audio Synthesis Inverse Bitstream Bitstream Output Filterbank Quantization Demultiplexer In Noise Noise subst. signaling Generator Substituted signal energies grl 6/98 page 6 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Extension 2: Long Term Prediction Bitstream Output Bitstream Multiplexer Gain Filter TNS M/S Quant. Control Scale Bank Coding Factors Coupling Intensity/ Noiseless Prediction Input time signal Perceptual Model Rate/Distortion Control grl 6/98 page 7 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Long Term Prediction (2) Motivation: • Tone-like signals require much higher coding precision than noise-like signals (e.g. 20 dB vs. 6 dB) • Tonal signal components are predictable MPEG-2 AAC: • Prediction of each spectral coefficient with backward adaptive predictor • High complexity (ca. 50% of decoder computation & RAM) MPEG-4: • Long Term Predictor (LTP) as known from speech coding • Lower complexity: Saving of approx. 50% in terms of computation and memory over MPEG-2 predictors • Comparable performance to MPEG-2 predictors grl 6/98 page 8 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Extension 3: Twin-VQ Bitstream Output Bitstream Multiplexer Gain Filter TNS M/S Quant. Control Scale Bank Coding Factors Coupling Intensity/ Noiseless Noiseless Prediction Input time signal time Input Perceptual Model Rate/Distortion Control grl 6/98 page 9 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Transform-Domain Weighted Interleave VQ (2) Background: • Audio coding at extremely low bitrates (6-8 kbit/s) • CELP speech coders do not perform well for music • 0.5 Bits per frequency line at these data rates !! MPEG-4: • Transform-Domain Weighted Interleave Vector Quantization (TwinVQ) as alternative coding kernel – Vector selection under control of the perceptual model • Fully integrated into MPEG-4 AAC coding system: – Uses same spectral representation as AAC coder – Makes use of other MPEG-4 tools (e.g. LTP, TNS, joint stereo coding) grl 6/98 page 10 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Transform-Domain Weighted Interleave VQ (3) Structure: • Normalization of spectral coefficients: – LPC envelope (overall spectral shape) – Periodic component coding (harmonic components) – Bark-scale envelope coding (additional flattening) • Vector Quantization (VQ) process: – Interleaving of spectral coefficients into new sub-vectors – Vector quantization (two sets of codebooks, weighted distortion measure allows distortion control by perceptual model) ⇒ no bit/noise allocation or rate control iteration grl 6/98 page 11 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalability Definition: • Capability to decode useful sub-sets of the bitstream Types of • SNR / NMR (Noise to Mask Ratio) Scalability: Scalability: – “Extension layers improve the SNR/NMR of the coded signal” • Audio Bandwidth Scalability: – “Extension layers increase the decodable audio band width” • Restriction of Generality: – Very low bit rate core coder optimized for special signals, e.g speech. Additional layers provide good quality for all types of signals. • Implementation Complexity: grl 6/98 page 12 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Application examples • Network based (packetized) transmission – Requires routers which know about the importance of a packet – Less important (outer layer) packets may be dropped if the available bandwidth decreases • Broadcast – The most important (inner layer) packets are transmitted with a better error protection scheme • Music data base – High quality content is encoded and stored – Access to a lower quality version is possible without recoding to allow for pre-listening with a lower quality grl 6/98 page 13 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalable GA Coder (I) Encoder Block Diagram - Reconstruct F Q&C S MDCT Spectrum + Q&C S Perceptual Model • Encoding of the error signal of an AAC or Twin-VQ Quantization and Coding (Q&C) module in a second, or third, or n-th similar quantization module in the frequency domain • Solutions using only AAC, or only Twin-VQ modules possible • Additionally, Twin-VQ / AAC combinations defined • Useful for large enhancement steps of >= 8 kbit/s per step grl 6/98 page 14 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalable GA Coder (II) • Twin-VQ Q&C Modules Decoder Block Diagram – 8 kbit/s fixed step size Vector Quantizer (VQ) modules Reconstruct – optional 6 kbit/s in first layer Spectrum Inv. – first choice for a 6 or 8 kbit/s base layer for the coding FSS of general audio signals + • AAC Q&C modules Reconstruct Add – Any step size possible Spectrum – Reasonable step sizes from 8 to >64 kbit/s – The same end quality can be achieved as from a single step AAC coder IMDCT – However, a higher bit rate may be required for the same audio quality grl 6/98 page 15 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalable GA Coder : Combination with CELP Coder (I) CELP MDCT Quantization CODEC FSS & Coding MDCT Encoder Perceptual. model • Very low bitrate core coder ( e.g. speech coder) • Core coder typically operating at a lower sampling frequency • MDCT used for efficient up-sampling grl 6/98 page 16 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalable GA Coder : Combination with Core Coder (II) CELP MDCT IMDCT DECODER IFSS Re- quantization + Decoder Re- quantization grl 6/98 page 17 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalable Stereo Coding: Stereo / Stereo grl 6/98 page 18 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalable Stereo Coding: Mono / Stereo grl 6/98 page 19 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalable Stereo Coding: Mono Core / Mono GA / Stereo GA grl 6/98 page 20 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Scalable GA Coder : Typical Configurations • Some successfully tested mono/mono combinations: 6 kbit/s CELP + 18 kbit/s AAC 6 kbit/s TwinVQ + 18 kbit/s AAC 8 kbit/s TwinVQ + 8 kbit/s TwinVQ 6 kbit/s CELP + 18 kbit/s + 24 kbit/s AAC • Mono/stereo combinations 6 kbit/s mono CELP + 18 kbit/s mono + 24 kbit/s stereo AAC 24 kbit/s mono + 16 kbit/s stereo + 16 kbit/s stereo AAC 24 kbit/s mono + 72 kbit/s stereo AAC • Stereo/stereo combinations 2 x 6 kbit/s mono CELP + 36 kbit/s stereo AAC grl 6/98 page 21 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Results (I) Mono Configurations 4,5 4 3,5 3 2,5 2 1,5 1 0,5 0 CELP+A Tw inV Q Layer 3 AAC 24 AC 24 + AAC AAC 18 24 kbit/s kbit/s kbit/s 24 kbit/s kbit/s Series1 Series2 3,5 4,1 3,85 3,65 3,27 grl 6/98 page 22 115 FRAUNHOfER Institut Integrierte Schaltungen The MPEG-4 General Audio Coder Results (II) Mono / Stereo Configuration 5 4,5 Mono Stereo 4 3,5 3 2,5 2 1,5 1 0,5 0 L3 24 AAC scal L3 40 AAC scal L3 56 AAC scal kbit/s 24 24 24 kbit/s 40 40 kbit/s 56 56 grl 6/98 kbit/s kbit/s kbit/s kbit/s kbit/s kbit/s page 23 115 FRAUNHOfER

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    24 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us