
Transform Coding

Another concept for partially exploiting the memory gain of vector quantization
Used in virtually all lossy image and video coding applications
Samples of the source s are grouped into vectors s of adjacent samples
Transform coding consists of the following steps:
1. Linear analysis transform A, converting source vectors s into transform coefficient vectors u = As
2. Scalar quantization of the transform coefficients u → u′
3. Linear synthesis transform B, converting quantized transform coefficient vectors u′ into decoded source vectors s′ = Bu′
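The three steps can be sketched in a few lines of NumPy; the 2×2 matrix and the quantizer step size below are illustrative assumptions, not part of the slides:

```python
import numpy as np

# Toy transform coder: 1) analysis transform, 2) uniform scalar quantization
# of the coefficients, 3) synthesis transform with B = A^{-1} = A^T.
def transform_code(s, A, step):
    u = A @ s                          # 1) u = A s
    u_q = step * np.round(u / step)    # 2) scalar quantization u -> u'
    return A.T @ u_q                   # 3) s' = B u'

A = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
s = np.array([3.0, 4.0])
s_rec = transform_code(s, A, step=0.5)
# for an orthogonal transform the reconstruction error stays small,
# bounded by the quantizer step size
assert np.max(np.abs(s_rec - s)) <= 0.5
```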

[Figure: scatter plots of adjacent sample pairs (S0, S1) and of the corresponding transform coefficients (U0, U1) after a 2-d transform, a rotation by ϕ = 45°:

A = (  cos ϕ   sin ϕ
      −sin ϕ   cos ϕ ) ]

January 23, 2013

Overview

Structure of Transform Coding Systems
Motivation of Transform Coding
Orthogonal Block Transforms
Bit Allocation for Transform Coefficients
Karhunen Loève Transform (KLT)
Signal-Independent Transforms
Walsh-Hadamard Transform
Discrete Fourier Transform (DFT)
Discrete Cosine Transform (DCT)
2-d Transforms in Image and Video Coding
Entropy Coding of Transform Coefficients
Distribution of Transform Coefficients for Images
Modified Discrete Cosine Transform (MDCT)
Summary

Structure of Transform Coding Systems

[Figure: block diagram of a transform coding system: the analysis transform A maps s to the coefficients u0 … uN−1; the scalar quantizers Q0 … QN−1 map each un to u′n; the synthesis transform B maps the quantized coefficients to s′]

The synthesis transform is typically the inverse of the analysis transform

Separate scalar quantizer Qn for each transform coefficient un
Vector quantization of all bands or some of them is also possible, but:
  Transforms are designed to have a decorrelating effect ⇒ the memory gain of VQ is reduced
  The shape gain can be obtained by ECSQ
  The space-filling gain is left as a possible additional gain for VQ
The combination of a decorrelating transform, scalar quantization, and entropy coding is highly efficient in terms of both rate-distortion performance and complexity

Motivation of Transform Coding

Exploitation of statistical dependencies:
  Transforms are typically designed such that, for typical input signals, the signal energy is concentrated in a few transform coefficients
  Coding of a few significant coefficients and many zero-valued coefficients can be very efficient (e.g., using run-length coding)
  Scalar quantization is more effective in the transform domain
Efficient trade-off between coding efficiency and complexity:
  Vector quantization requires searching through a codebook for the best matching vector
  The combination of a transform and scalar quantization typically results in a substantial reduction in computational complexity
Suitable for quantization using perceptual criteria:
  In image and video coding, quantization in the transform domain typically leads to an improvement in subjective quality
  In speech and audio coding, frequency bands might be used to simulate the processing of the human ear
  Reduces perceptually irrelevant content

Transform Encoder and Decoder

[Figure: encoder, consisting of the analysis transform A, quantizer mappings α0 … αN−1 that map the coefficients u0 … uN−1 to indices i0 … iN−1, and the entropy coder γ producing the bitstream b; decoder, consisting of the entropy decoder γ⁻¹, inverse mappings β0 … βN−1 producing u′0 … u′N−1, and the synthesis transform B producing s′]

Linear Block Transforms

Linear Block Transform
Each component of the N-dimensional output vector represents a linear combination of the N components of the N-dimensional input vector
Can be written as a matrix multiplication

Analysis transform:   u = A · s   (1)
Synthesis transform:  s′ = B · u′   (2)

Vector interpretation: s′ is represented as a linear combination of the column vectors of B

s′ = Σ_{n=0}^{N−1} u′_n · b_n = u′_0 · b_0 + u′_1 · b_1 + ··· + u′_{N−1} · b_{N−1}   (3)

Linear Block Transforms (cont'd)

Perfect Reconstruction Property
Consider the case that no quantization is applied (u′ = u)
Optimal synthesis transform:   B = A⁻¹   (4)
The reconstructed samples are equal to the source samples

s′ = B u = B A s = A⁻¹ A s = s   (5)

Optimal Synthesis Transform (in the presence of quantization)
Optimality: minimum MSE distortion among all synthesis transforms
B = A⁻¹ is optimal if
  A is invertible and produces independent transform coefficients
  the component quantizers are centroidal quantizers
If the above conditions are not fulfilled, a synthesis transform B ≠ A⁻¹ may reduce the distortion

Orthogonal Block Transforms

Orthonormal Basis
An analysis transform A forms an orthonormal basis if
  its basis vectors (matrix rows) are orthogonal to each other
  its basis vectors have length 1
The corresponding transform is called an orthogonal transform
The transform matrices are called unitary matrices
Unitary matrices with real entries are called orthogonal matrices
Inverse of unitary matrices: conjugate transpose

A⁻¹ = A†   (for orthogonal matrices: A⁻¹ = Aᵀ)   (6)

Why are orthogonal transforms desirable?
  The MSE distortion can be minimized by independent scalar quantization of the transform coefficients
  Orthogonality of the basis vectors is sufficient: vector norms can be taken into account in the quantizer design

Properties of Orthogonal Block Transforms

Transform coding with an orthogonal transform and perfect reconstruction (B = A⁻¹ = A†) preserves the MSE distortion:

d_N(s, s′) = (1/N) (s − s′)† (s − s′)
           = (1/N) (A⁻¹ u − B u′)† (A⁻¹ u − B u′)
           = (1/N) (A† u − A† u′)† (A† u − A† u′)
           = (1/N) (u − u′)† A A⁻¹ (u − u′)
           = (1/N) (u − u′)† (u − u′)
           = d_N(u, u′)   (7)

Scalar quantization that minimizes the MSE in the transform domain also minimizes the MSE in the original signal space
For the special case of orthogonal matrices: (···)† = (···)ᵀ

Properties of Orthogonal Block Transforms (cont'd)

Covariance matrix of transform coefficients

C_UU = E{ (U − E{U}) (U − E{U})ᵀ }
     = E{ A (S − E{S}) (S − E{S})ᵀ Aᵀ }
     = A C_SS Aᵀ = A C_SS A⁻¹   (8)

Since the trace of a matrix is similarity-invariant,

tr(X) = tr(PXP −1), (9)

and the trace of an autocovariance matrix is the sum of the variances of the vector components, we have

(1/N) Σ_{i=0}^{N−1} σ_i² = σ_S²   (10)

The arithmetic mean of the variances of the transform coefficients is equal to the variance of the source

Geometrical Interpretation of Orthogonal Transforms

Inverse 2-d transform matrix (= transpose of the forward transform matrix):

B = ( b0  b1 ) = (1/√2) ( 1  1 ; 1  −1 ) = Aᵀ

Vector interpretation for a 2-d example:

s = u0 · b0 + u1 · b1
( s0 ; s1 ) = u0 · (1/√2) ( 1 ; 1 ) + u1 · (1/√2) ( 1 ; −1 )
( 4 ; 3 ) = 3.5 · ( 1 ; 1 ) + 0.5 · ( 1 ; −1 )

yielding the transform coefficients

u0 = √2 · 3.5,   u1 = √2 · 0.5

An orthogonal transform is a rotation from the signal coordinate system into the coordinate system of the basis functions

Transform Example N = 2

[Figure: scatter plots of adjacent samples (S0, S1) of a Gauss-Markov source for the correlation factors ρ = 0, 0.5, 0.9, 0.95 (top row) and of the corresponding coefficients (U0, U1) of the orthonormal 2-d transform (bottom row)]

Example for Waveforms (Gauss-Markov Source with ρ = 0.95)

[Figure: top: signal s[k]; middle: transform coefficient u0[k/2], also called the dc coefficient; bottom: transform coefficient u1[k/2], also called the ac coefficient]

The number of transform coefficients u0 is half the number of samples s
The number of transform coefficients u1 is half the number of samples s

Scalar Quantization in Transform Domain

Consider Transform Coding with Orthogonal Transforms

[Figure: quantization cells for direct coding (signal space), for transform coding in the transform domain, and for transform coding in signal space]

The quantization cells are hyper-rectangles as in scalar quantization, but rotated and aligned with the transform basis vectors
The number of quantization cells with appreciable probabilities is reduced ⇒ indicates improved coding efficiency for correlated sources

Bit Allocation for Transform Coefficients

Problem: Distribute R among the N transform coefficients such that the resulting distortion D is minimized

min D(R) = (1/N) Σ_{i=1}^{N} D_i(R_i)   subject to   (1/N) Σ_{i=1}^{N} R_i ≤ R   (11)

with D_i(R_i) being the operational distortion-rate functions of the scalar quantizers
Approach: minimize the Lagrangian cost function J = D + λR

∂/∂R_i ( Σ_{i=1}^{N} D_i(R_i) + λ Σ_{i=1}^{N} R_i ) = ∂D_i(R_i)/∂R_i + λ = 0   (12)

Solution: Pareto condition

∂D_i(R_i)/∂R_i = −λ = const   (13)

Move bits from coefficients with a small distortion reduction per bit to coefficients with a larger distortion reduction per bit

Similar to the bit allocation problem in discrete sets: min Σ (D_i + λ R_i)

Bit Allocation for Transform Coefficients (cont'd)

Operational distortion-rate function of scalar quantizers can be written as

D_i(R_i) = σ_i² · g_i(R_i)   (14)

It is justified to assume that

g_i(R_i) is a continuous strictly convex function
g_i(R_i) has a continuous strictly increasing derivative g_i′(R_i) with g_i′(∞) = 0
The Pareto condition becomes

−σ_i² · g_i′(R_i) = λ   (15)

If λ ≥ −σ_i² g_i′(0), the quantizer for u_i cannot be operated at the given slope
⇒ set the corresponding component rate to R_i = 0
Bit allocation rule:

R_i = 0                  if −σ_i² g_i′(0) ≤ λ
R_i = η_i( −λ/σ_i² )     if −σ_i² g_i′(0) > λ      (16)

where η_i(·) denotes the inverse of the derivative g_i′(·)

Approximation for Gaussian Sources

The transform coefficients also have a Gaussian distribution
Experimentally found approximation for entropy-constrained scalar quantization of Gaussian sources (a ≈ 0.952):

g(R) = (πe/(6a)) · ln( a · 2^{−2R} + 1 )   (17)

Use the parameter

θ = ( 3(a+1) / (πe ln 2) ) · λ   with   0 ≤ θ ≤ σ²_max   (18)

Bit allocation rule:

R_i(θ) = 0                                           if θ ≥ σ_i²
R_i(θ) = (1/2) log2( (σ_i²/θ)(a+1) − a )             if θ < σ_i²      (19)

Resulting component distortions:

D_i(θ) = σ_i²                                                       if θ ≥ σ_i²
D_i(θ) = −(ε² ln 2 / a) · σ_i² · log2( 1 − (θ/σ_i²) · a/(a+1) )     if θ < σ_i²      (20)

High-Rate Approximation

Assumption: High-rate approximation valid for all component quantizers High-rate approximation for distortion-rate function of component quantizers

D_i(R_i) = ε_i² · σ_i² · 2^{−2R_i}   (21)

where ε_i² depends on the transform coefficient distribution and the quantizer
Pareto condition:

∂D_i(R_i)/∂R_i = −2 ln 2 · ε_i² σ_i² 2^{−2R_i} = −2 ln 2 · D_i(R_i) = −λ = const   (22)

states that all quantizers are operated at the same distortion
Bit allocation rule:

R_i(D) = (1/2) log2( ε_i² σ_i² / D )   (23)

Overall operational rate-distortion function:

R(D) = (1/N) Σ_{i=0}^{N−1} R_i(D) = (1/(2N)) Σ_{i=0}^{N−1} log2( σ_i² ε_i² / D )   (24)
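The high-rate bit allocation rule can be sketched numerically; the variances below are made-up example values, and ε² = πe/6 is the Gaussian ECSQ factor used later in the slides:

```python
import numpy as np

# High-rate bit allocation R_i = 0.5 * log2(eps2 * sigma_i^2 / D):
# all quantizers run at the same distortion D, chosen so the rates average to R.
def allocate_rates(variances, R, eps2=np.pi * np.e / 6.0):
    v = np.asarray(variances, dtype=float)
    # from eq. (24): D = eps2 * geometric_mean(v) * 2^(-2R)
    D = eps2 * np.exp(np.log(v).mean()) * 2.0 ** (-2.0 * R)
    return 0.5 * np.log2(eps2 * v / D)

rates = allocate_rates([1.9, 0.1], R=3.0)
assert np.isclose(rates.mean(), 3.0)   # average rate equals the target rate
assert np.all(rates >= 0.0)            # valid here; at low rates clipping is needed
```

Note that this sketch assumes all component rates stay non-negative; the general low-rate case requires the clipping rule of eq. (16).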

High-Rate Approximation (cont'd)

Overall operational rate-distortion function:

R(D) = (1/(2N)) Σ_{i=0}^{N−1} log2( σ_i² ε_i² / D ) = (1/2) log2( ε̃² σ̃² / D )   (25)

with the geometric means

σ̃² = ( Π_{i=0}^{N−1} σ_i² )^{1/N}   and   ε̃² = ( Π_{i=0}^{N−1} ε_i² )^{1/N}   (26)

Overall distortion-rate function:

D(R) = ε̃² · σ̃² · 2^{−2R}   (27)
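The geometric mean can be computed directly; the ratio of the arithmetic to the geometric mean of the coefficient variances is exactly the high-rate transform coding gain G_T of eq. (30). A small check with the N = 2 example variances σ²(1 ± ρ):

```python
import numpy as np

# Ratio of arithmetic to geometric mean of the transform coefficient variances;
# at high rates this ratio is the transform coding gain G_T (eq. (30)).
def mean_ratio(variances):
    v = np.asarray(variances, dtype=float)
    return v.mean() / np.exp(np.log(v).mean())

rho = 0.9
gain = mean_ratio([1.0 + rho, 1.0 - rho])     # N = 2 example variances
assert np.isclose(gain, 1.0 / np.sqrt(1.0 - rho ** 2))   # = G_T for N = 2
```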

For Gaussian sources (the transform coefficients are also Gaussian) and entropy-constrained scalar quantizers, we have ε_i² = ε² = πe/6, yielding

D_G(R) = (πe/6) · σ̃² · 2^{−2R}   (28)

Transform Coding Gain at High Rates

The transform coding gain is the ratio of the distortion for direct scalar quantization and the distortion for transform coding:

G_T = ( ε_S² · σ_S² · 2^{−2R} ) / ( ε̃² · σ̃² · 2^{−2R} ) = ( ε_S² · σ_S² ) / ( ε̃² · σ̃² )   (29)

with

σ_S²: variance of the input signal
ε_S²: factor of the high-rate approximation for direct scalar quantization

High-rate transform coding gain for Gaussian sources

G_T = σ_S² / σ̃² = [ (1/N) Σ_{i=0}^{N−1} σ_i² ] / [ ( Π_{i=0}^{N−1} σ_i² )^{1/N} ]   (30)

Ratio of the arithmetic and geometric means of the transform coefficient variances
The high-rate transform coding gain for Gaussian sources is maximized if the geometric mean is minimized (⇒ Karhunen Loève Transform)

Example: Orthogonal Transformation with N = 2

Input vector and transform matrix

s = ( s_0 ; s_1 )   and   A = (1/√2) ( 1  1 ; 1  −1 )   (31)

Transformation:

u = ( u_0 ; u_1 ) = A · s = (1/√2) ( 1  1 ; 1  −1 ) ( s_0 ; s_1 )   (32)

Coefficients:

u_0 = (1/√2)(s_0 + s_1),   u_1 = (1/√2)(s_0 − s_1)   (33)

Inverse transformation:

A⁻¹ = Aᵀ = A = (1/√2) ( 1  1 ; 1  −1 )   (34)
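The 2×2 transform can be checked numerically: it is orthogonal, and it decorrelates a Gauss-Markov pair with correlation ρ (unit variance assumed here for simplicity), matching the variance and cross-correlation results derived next:

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
rho = 0.9
R_ss = np.array([[1.0, rho], [rho, 1.0]])    # E{S S^T} for sigma_s^2 = 1

assert np.allclose(A @ A.T, np.eye(2))       # A^{-1} = A^T = A
R_uu = A @ R_ss @ A.T
# coefficient variances sigma_s^2 (1 + rho) and sigma_s^2 (1 - rho)
assert np.allclose(np.diag(R_uu), [1.0 + rho, 1.0 - rho])
assert abs(R_uu[0, 1]) < 1e-12               # coefficients are uncorrelated
```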

Example: Orthogonal Transformation with N = 2 (cont'd)

Variance of the transform coefficients:

σ_0² = E{U_0²} = (1/2) E{ (S_0 + S_1)² }
     = (1/2) ( E{S_0²} + E{S_1²} + 2 · E{S_0 S_1} )
     = (1/2) ( σ_s² + σ_s² + 2 σ_s² ρ ) = σ_s² (1 + ρ)   (35)

σ_1² = E{U_1²} = σ_s² (1 − ρ)   (36)

Cross-correlation of the transform coefficients:

E{U_0 U_1} = (1/2) E{ (S_0 + S_1) · (S_0 − S_1) }
           = (1/2) E{ S_0² − S_1² } = (1/2) ( σ_s² − σ_s² ) = 0   (37)

Transform coding gain for Gaussian sources (assuming optimal bit allocation):

G_T = σ_S² / √(σ_0² σ_1²) = 1 / √(1 − ρ²)   (38)

Example: Analysis of Transform Coding for N = 2

R-d cost before the transform: J⁽⁰⁾ = 2(D + λR)   (for 2 samples)
R-d cost after the transform:

J⁽¹⁾ = (D_0 + D_1) + λ(R_0 + R_1)   (for both transform coefficients)

Gain in r-d cost due to the transform at the same rate (R_0 + R_1 = 2R):

ΔJ = J⁽⁰⁾ − J⁽¹⁾ = 2D − D_0 − D_1   (39)

For Gaussian sources, input and output of the transform have Gaussian pdfs
With the operational distortion-rate function of an entropy-constrained scalar quantizer at high rates (D = ε² · σ² · 2^{−2R} with ε² = πe/6), we have

ΔJ = ε² σ_S² ( 2^{−2R+1} − (1+ρ) 2^{−2R_0} − (1−ρ) 2^{−2R_1} )   (40)

By eliminating R_1 using R_1 = 2R − R_0, we get

ΔJ = ε² σ_S² ( 2^{−2R+1} − (1+ρ) 2^{−2R_0} − (1−ρ) 2^{−2(2R−R_0)} )   (41)

Example: Analysis of Transform Coding for N = 2 (cont'd)

Gain in r-d cost due to the transform:

ΔJ = ε² σ_S² ( 2^{−2R+1} − (1+ρ) 2^{−2R_0} − (1−ρ) 2^{−2(2R−R_0)} )   (42)

To maximize the gain, we set

∂ΔJ/∂R_0 = 2 ln 2 · (1+ρ) 2^{−2R_0} − 2 ln 2 · (1−ρ) 2^{−4R+2R_0} = 0   (43)

yielding the bit allocation rule

R_0 = R + (1/2) log2 √( (1+ρ)/(1−ρ) )   (44)

The same expression is obtained by using the previously derived high-rate bit allocation rule

R_i = (1/2) log2( ε² σ_i² / D )   (45)

Operational high-rate distortion-rate function (Gaussian, ECSQ, N = 2):

D(R) = (πe/6) · √(1 − ρ²) · σ_S² · 2^{−2R}   (46)

Example: Experimental R-D curves for N = 2

[Figure: experimental distortion versus bit-rate for N = 2, showing the theoretical curve for TC1, experimental ECSQ results for TC1, the optimal bit allocation using the Pareto condition, and a cloud of R-D points obtained by combining the R-D values of TC1 through TC4]

Example: Experimental R-D curves for N = 4

[Figure: experimental distortion versus bit-rate for N = 4, showing the theoretical curve for TC1, experimental ECSQ results for TC1, the optimal bit allocation using the Pareto condition, and a cloud of R-D points obtained by combining the R-D values of TC1 through TC4]

General Bit Allocation for Transform Coefficients

For Gaussian sources, the following points need to be considered:
  High-rate approximations are not valid for low bit rates; better approximations should be used for low rates
  For low rates, the Pareto condition cannot be fulfilled for all transform coefficients, since the component rates R_i must not be less than 0
Solution:
  Use a generalized approximation of D_i(R_i) for the component quantizers
  Set the component rates R_i to zero for all transform coefficients for which the Pareto condition ∂D_i(R_i)/∂R_i = −λ cannot be fulfilled with R_i ≥ 0
  Distribute the rate among the remaining coefficients
For non-Gaussian sources, the following needs to be considered in addition:
  The transform coefficients have different (non-Gaussian) distributions (except for large transform sizes)
  Using the same quantizer design for all transform coefficients with D_i(R_i) = σ_i² g(R_i) is suboptimal

Linear Transform Examples

[Figure: basis vectors b0 … b7 of the WHT (Walsh-Hadamard Transform), DFT (Discrete Fourier Transform), DCT (Discrete Cosine Transform), and KLT (Karhunen Loève Transform)]

Karhunen Loève Transform (KLT)

Karhunen Loève Transform
Orthogonal transform that decorrelates the input vectors
The transform matrix depends on the source
Autocorrelation matrix of the input vectors s:

R_SS = E{ S Sᵀ }   (47)

Autocorrelation matrix of transform coefficient vectors u

R_UU = E{ U Uᵀ } = E{ (A S)(A S)ᵀ } = A E{ S Sᵀ } Aᵀ = A R_SS Aᵀ   (48)

By multiplying with A⁻¹ = Aᵀ from the left, we get

R_SS · Aᵀ = Aᵀ · R_UU   (49)

To get uncorrelated transform coefficients, we need a diagonal autocorrelation matrix R_UU for the transform coefficients

Karhunen Loève Transform (cont'd)

Expression for the autocorrelation matrices:

R_SS · Aᵀ = Aᵀ · R_UU   (50)

RUU is a diagonal matrix if the eigenvector equation

R_SS b_i = ξ_i b_i   (51)

is fulfilled for all basis vectors b_i (column vectors of Aᵀ, row vectors of A)
The transform matrix A decorrelates the input vectors if its rows are equal to the unit-norm eigenvectors v_i of R_SS:

A_KLT = [ v_0  v_1  ···  v_{N−1} ]ᵀ   (52)
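The construction can be sketched directly from the eigenvector equation; N and ρ below are example values for a Gauss-Markov source:

```python
import numpy as np

# KLT for a Gauss-Markov source: the rows of A are the unit-norm
# eigenvectors of the autocorrelation matrix R_SS (sigma_s^2 = 1).
N, rho = 4, 0.9
R_ss = rho ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))

eigvals, eigvecs = np.linalg.eigh(R_ss)   # columns of eigvecs are the v_i
A_klt = eigvecs.T                         # rows of A_KLT are the eigenvectors

R_uu = A_klt @ R_ss @ A_klt.T
# R_UU is diagonal with the eigenvalues of R_SS on its main diagonal
assert np.allclose(R_uu, np.diag(eigvals), atol=1e-10)
```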

The resulting autocorrelation matrix R_UU is a diagonal matrix with the eigenvalues of R_SS on its main diagonal:

R_UU = diag( ξ_0, ξ_1, …, ξ_{N−1} )   (53)

Optimality of KLT for Gaussian Sources

Transform coding with an orthogonal N×N transform matrix A and B = Aᵀ
Scalar quantization using scaled quantizers:

D(R, A_k) = Σ_{i=0}^{N−1} σ_i²(A_k) · g(R_i)   (54)

with σ_i²(A_k) being the variance of the i-th transform coefficient and A_k being the transform matrix

Consider an arbitrary orthogonal transform matrix A_0 and an arbitrary bit allocation given by the vector r = [R_0, ···, R_{N−1}]ᵀ with Σ_{i=0}^{N−1} R_i = R
Starting with the arbitrary orthogonal matrix A_0, apply an iterative algorithm that generates a series of orthonormal transform matrices {A_k}, k = 1, 2, …

The iteration A_{k+1} = J_k A_k consists of a Jacobi rotation and a re-ordering ⇒ the transform matrix approaches a KLT matrix

It can be shown that for all A_k: D(R, A_{k+1}) ≤ D(R, A_k)
⇒ The KLT is the optimal transform for Gaussian sources (minimizes the MSE)

Asymptotic Rate Distortion Efficiency for KLT of Gaussian Sources at High Rates

The transform coefficient variances σ_i² are equal to the eigenvalues ξ_i of R_SS
High-rate approximation for a Gaussian source and optimal ECSQ:

D(R) = (πe/6) · σ̃² · 2^{−2R} = (πe/6) · ξ̃ · 2^{−2R}
     = (πe/6) · 2^{ (1/N) Σ_{i=0}^{N−1} log2 ξ_i } · 2^{−2R}   (55)

For N → ∞, we can apply Szegő's theorem for infinite Toeplitz matrices: if all eigenvalues ξ_i of an infinite autocorrelation matrix (N → ∞) are finite and G(ξ_i) is any continuous function over all eigenvalues,

lim_{N→∞} (1/N) Σ_{i=0}^{N−1} G(ξ_i) = (1/2π) ∫_{−π}^{π} G(Φ_SS(ω)) dω   (56)

Resulting distortion-rate function for a KLT of infinite size at high rates:

D_KLT^∞(R) = (πe/6) · 2^{ (1/2π) ∫_{−π}^{π} log2 Φ_SS(ω) dω } · 2^{−2R}   (57)

Asymptotic Rate Distortion Efficiency for KLT of Gaussian Sources at High Rates (cont'd)

Asymptotic distortion-rate function for KLT of infinite size for high rates

D_KLT^∞(R) = (πe/6) · 2^{ (1/2π) ∫_{−π}^{π} log2 Φ_SS(ω) dω } · 2^{−2R}   (58)

The information distortion-rate function (fundamental bound) is smaller by a factor ε² = πe/6:

D(R) = 2^{ (1/2π) ∫_{−π}^{π} log2 Φ_SS(ω) dω } · 2^{−2R}   (59)

Asymptotic transform gain (N → ∞) at high rates:

G_T^∞ = ε² σ_S² 2^{−2R} / D_KLT^∞(R) = [ (1/2π) ∫_{−π}^{π} Φ_SS(ω) dω ] / [ 2^{ (1/2π) ∫_{−π}^{π} log2 Φ_SS(ω) dω } ]   (60)

The asymptotic transform gain (N → ∞) at high rates is identical to the asymptotic prediction gain at high rates

KLT Transform Size Gain for Gauss-Markov at High Rates

Operational distortion-rate function for a KLT of size N, ECSQ, and optimal bit allocation for Gauss-Markov sources with correlation factor ρ:

D_N(R) = (πe/6) · σ_s² · (1 − ρ²)^{1−1/N} · 2^{−2R}   (61)

[Figure: 10 log10( D_N(R) / D_1(R) ) in dB versus transform size N for ρ = 0.9; the curve approaches the asymptotic gain G_T^∞ = 7.21 dB]

Distortion-Rate Functions for Gauss-Markov

Distortion-rate curves for coding a first-order Gauss-Markov source with correlation factor ρ = 0.9 and different transform sizes N

[Figure: SNR in dB versus bit rate for ECSQ+KLT with N = 2, 4, 8, 16, and N → ∞, compared with EC-Lloyd (no transform) and the distortion-rate function D(R); the gain approaches G_T^∞ = 7.21 dB, with a remaining space-filling gain of 1.53 dB to D(R)]

KLTs for Gauss-Markov Sources

[Figure: KLT basis vectors b0 … b7 (N = 8) for Gauss-Markov sources with ρ = 0.1, 0.5, 0.9, 0.95]

Walsh-Hadamard Transform

Very simple orthogonal transform (only additions and a final scaling)
For transform sizes N that are positive integer powers of 2:

A_N = (1/√2) ( A_{N/2}  A_{N/2} ; A_{N/2}  −A_{N/2} )   with   A_1 = [1]   (62)

Transform matrix for N = 8:

A_8 = (1/(2√2)) ·
  [  1  1  1  1  1  1  1  1
     1 −1  1 −1  1 −1  1 −1
     1  1 −1 −1  1  1 −1 −1
     1 −1 −1  1  1 −1 −1  1
     1  1  1  1 −1 −1 −1 −1
     1 −1  1 −1 −1  1 −1  1
     1  1 −1 −1 −1 −1  1  1
     1 −1 −1  1 −1  1  1 −1  ]   (63)

Piecewise-constant basis vectors
Image and video coding: produces subjectively disturbing artifacts when combined with strong quantization

Discrete Fourier Transform (DFT)

Discrete version of the Fourier transform
Forward transform:

u[k] = (1/√N) Σ_{n=0}^{N−1} s[n] e^{−j 2πkn/N}   (64)

Inverse transform:

s[n] = (1/√N) Σ_{k=0}^{N−1} u[k] e^{j 2πkn/N}   (65)

The DFT is an orthonormal transform (specified by a unitary transform matrix)
Produces complex transform coefficients
For real inputs, it obeys the symmetry u[k] = u*[N − k], so that N real samples are mapped onto N real values
The FFT is a fast algorithm for DFT computation that uses a sparse matrix factorization
Implies periodic signal extension: differences between the left and right signal boundaries reduce the rate of convergence of the Fourier series
Strong quantization ⇒ significant high-frequency artifacts

Discrete Fourier Transform vs. Discrete Cosine Transform

[Figure: (a) input time-domain signal; (b) time-domain replica in case of the DFT; (c) time-domain replica in case of the DCT-II]

Derivation of DCT

High-frequency DFT quantization error components can be reduced by introducing an implicit symmetry at the boundaries of the input signal and applying a DFT of approximately double length
Signal with mirror symmetry:

s*[n] = s[n − 1/2]           for 0 ≤ n < N
s*[n] = s[2N − n − 3/2]      for N ≤ n < 2N      (66)

Transform coefficients (orthonormal: divide u*[0] by √2):

u*[k] = (1/√(2N)) Σ_{n=0}^{2N−1} s*[n] e^{−j 2πkn/(2N)}
      = (1/√(2N)) Σ_{n=0}^{N−1} s[n − 1/2] ( e^{−j (π/N) kn} + e^{−j (π/N) k(2N−n−1)} )
      = (1/√(2N)) Σ_{n=0}^{N−1} s[n] ( e^{−j (π/N) k (n + 1/2)} + e^{j (π/N) k (n + 1/2)} )
      = √(2/N) Σ_{n=0}^{N−1} s[n] cos( (π/N) k (n + 1/2) )   (67)

Discrete Cosine Transform (DCT)

The implicit periodicity of the DFT leads to a loss of coding efficiency
This can be reduced by introducing mirror symmetry at the boundaries and applying a DFT of approximately double size
Due to the mirror symmetry, the imaginary sine terms are eliminated and only cosine terms remain
The most common DCT is the so-called DCT-II (mirror symmetry with sample repetition at both sides: n = −1/2)
The DCT and IDCT of Type II are given by

u[k] = α_k Σ_{n=0}^{N−1} s[n] · cos( (π/N) k (n + 1/2) )   (68)

s[n] = Σ_{k=0}^{N−1} α_k · u[k] · cos( (π/N) k (n + 1/2) )   (69)

where α_0 = √(1/N) and α_k = √(2/N) for k ≠ 0
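The DCT-II of eqs. (68)/(69) can be written as an orthonormal matrix and checked numerically (N = 8 is an example size):

```python
import numpy as np

# DCT-II basis as an orthonormal N x N matrix: row k holds
# alpha_k * cos((pi/N) * k * (n + 1/2)) for n = 0 .. N-1.
def dct2_matrix(N):
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    A = np.sqrt(2.0 / N) * np.cos(np.pi / N * k * (n + 0.5))
    A[0, :] = np.sqrt(1.0 / N)   # alpha_0 = sqrt(1/N); the cosine is 1 for k = 0
    return A

A = dct2_matrix(8)
assert np.allclose(A @ A.T, np.eye(8))   # orthonormal: the inverse is A^T
s = np.arange(8, dtype=float)
assert np.allclose(A.T @ (A @ s), s)     # eq. (69) inverts eq. (68)
```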

DCT vs. KLT

The correlation matrix of a first-order Markov process can be written as

R_SS = σ_S² ·
  [  1        ρ        ρ²       ···   ρ^{N−1}
     ρ        1        ρ        ···   ρ^{N−2}
     ⋮        ⋮        ⋮        ⋱     ⋮
     ρ^{N−1}  ρ^{N−2}  ρ^{N−3}  ···   1        ]   (70)

The DCT is a good approximation of the eigenvectors of R_SS
The DCT basis vectors approach the basis functions of the KLT for first-order Markov processes for ρ → 1
The DCT does not depend on the input signal
Fast algorithms exist for computing the forward and inverse transforms
Justification for the wide usage of the DCT (or integer approximations thereof) in image and video coding: JPEG, H.261, H.262/MPEG-2, H.263, MPEG-4, H.264/AVC, H.265/HEVC

KLT Convergence Towards DCT for ρ → 1

[Figure: basis vectors b0 … b7 of the KLT for ρ = 0.9 and of the DCT-II, together with the difference between the two transform matrices,

δ(ρ) = ||A_KLT(ρ) − A_DCT||_2^2

plotted over ρ; δ(ρ) decreases towards 0 for ρ → 1]

Bit Allocation for Audio Signals

o January 23, 2013 43 / 56 Transform Coding Bit Allocation for Audio Signals

Bit allocation based on the human response to sound signals: psychoacoustic models (PM)
After the transform, sets of frequencies are grouped together to form bands
For each band, the PM gives the maximum allowed quantization noise such that the distortion cannot be heard, i.e., the noise is masked
The goals of the encoder are:
  Enforcing the bit rate specified by the user
  Implementing the PM threshold
  Adding noise in less offensive places when there are not enough bits
As can be seen, the transform provides a means for eliminating not only redundancy but also irrelevancy

Transform Type

Coding efficiency for a speech signal [Zelinski and Noll, 1977]

Two-dimensional Transforms

2-D linear transform: the input image is represented as a linear combination of basis images
An orthonormal transform is separable and symmetric if the transform of a signal block s of size N × N can be expressed as

u = A · s · Aᵀ   (71)

where A is the transform matrix and u is the matrix of transform coefficients, both of size N × N
The inverse transform is

s = Aᵀ · u · A   (72)

Great practical importance: the transform requires 2 matrix multiplications of size N × N instead of one multiplication of a vector of size 1 × N² with a matrix of size N² × N²
Reduction of the complexity from O(N⁴) to O(N³)

DCT Example

[Figure: 16×16 image block (left) and the result of a column-wise 1-d DCT (right)]

The 1-d DCT is applied column-wise to the image block to obtain the column-wise DCT result
Notice the energy concentration in the first row (dc coefficients)

DCT Example (cont'd)

[Figure: column-wise DCT result (left) and the result of an additional row-wise 1-d DCT (right)]

For convenience, the column-wise DCT of the previous slide is repeated on the left side
The 1-d DCT is applied row-wise to the column-wise DCT result to obtain the final result
Notice the energy concentration in the first coefficient

Entropy Coding of Transform Coefficients

The AC coefficients are very likely equal to zero (for moderate quantization)
For 2-d transforms, the coefficients are ordered by a zig-zag (or similar) scan
Example for zig-zag scanning in case of a 2-d transform:

185   3   1   1  -3   2  -1   0
  1   1  -1   0  -1   0   0   1
  0   0   1   0  -1   0   0   0
  1   1   0  -1   0   0   0  -1
  0   0   1   0   0   0  -1   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

Huffman code for events {number of leading zeros, coefficient value} or events {end-of-block, number of leading zeros, coefficient value}
Arithmetic coding: for example, use the probabilities that a particular coefficient is unequal to zero when quantizing with a particular step size
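The zig-zag ordering itself is easy to generate by walking the anti-diagonals of the block, alternating direction; the first few indices below follow the familiar 8×8 pattern:

```python
# Zig-zag scan order for an N x N coefficient block, as used before
# run-length and entropy coding of the quantized coefficients.
def zigzag_indices(N):
    order = []
    for d in range(2 * N - 1):                # walk the anti-diagonals
        diag = [(i, d - i) for i in range(N) if 0 <= d - i < N]
        order.extend(diag if d % 2 else diag[::-1])
    return order

idx = zigzag_indices(8)
assert idx[:6] == [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
assert len(idx) == 64 and len(set(idx)) == 64  # every coefficient visited once
```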

Analysis of DCT Coefficient Distributions for Images

Let s_{x,y} denote the pixel intensities in a block. The s_{x,y} are assumed to be identically distributed, but not necessarily Gaussian

Each DCT coefficient is a weighted summation of the s_{x,y}
By the central limit theorem, a weighted summation of identically distributed random variables can be well approximated as having a Gaussian distribution
Therefore, the DCT coefficients of such a block should be approximately Gaussian distributed:

f(u | σ²) ≈ (1/(√(2π) σ)) e^{−u²/(2σ²)}   (73)

In typical images, the variance of the blocks has approximately an exponential distribution:

f(σ²) ≈ λ e^{−λσ²}   (74)

One can show that the pdf of each transform coefficient then has approximately a Laplacian distribution:

f(u) ≈ √(λ/2) e^{−√(2λ) |u|}   (75)

Example Histograms of DCT Coefficients – Picture “Lena”

[Figure: histograms of the transform coefficients for the picture “Lena”]

Distribution of Variances

[Figure: histogram of the block variances σ² with the fitted exponential distribution f(σ²) ≈ λ e^{−λσ²}]

Calculation of Approximate Distribution of DCT Coefficients

The distribution of the DCT coefficients can be written as

f(u) = ∫_0^∞ f(u | σ²) · f(σ²) · dσ²   (76)

With the conditional pdf assumption

f(u | σ²) = (1/(√(2π) σ)) e^{−u²/(2σ²)}   (77)

we get a Laplacian distribution for the transform coefficients:

f(u) = ∫_0^∞ (1/(√(2π) σ)) e^{−u²/(2σ²)} · λ e^{−λσ²} · dσ²
     = λ √(2/π) ∫_0^∞ e^{−u²/(2σ²) − λσ²} · dσ
     = λ √(2/π) · (1/2) √(π/λ) e^{−2√(λ u²/2)} = √(λ/2) e^{−√(2λ) |u|}   (78)

Note (from an integral table): ∫_0^∞ e^{−a x² − b x^{−2}} dx = (1/2) √(π/a) e^{−2√(ab)}

Modified Discrete Cosine Transform (MDCT)

The MDCT was introduced by [Princen, Johnson, and Bradley 1987] to avoid block boundary artifacts
The MDCT is a lapped transform that maps 2N-dimensional data to N-dimensional data, F: R^{2N} → R^N
Forward transform:

u_k = Σ_{n=0}^{2N−1} s_n cos( (π/N) (n + 1/2 + N/2) (k + 1/2) )   (79)

Inverse transform:

s_n = (1/N) Σ_{k=0}^{N−1} u_k cos( (π/N) (n + 1/2 + N/2) (k + 1/2) )   (80)

Perfect reconstruction is achieved by adding the IMDCTs of subsequent overlapping blocks
Used extensively in audio coding: MP3, AAC, Dolby AC-3, etc.

MDCT Block Diagram

[Figure: MDCT block diagram: overlapping 2N-sample windows (hop size N) over frames k … k+3 are each mapped to N MDCT coefficients; the 2N-sample IMDCT outputs are overlap-added to reconstruct frames k+1 and k+2]

Summary on Transform Coding

Orthonormal transform: rotation of the coordinate system in signal space
Purpose of the transform: decorrelation, energy concentration ⇒ align quantization cells with the primary axes of the joint pdf
The KLT achieves optimum decorrelation, but is signal dependent and, hence, without a fast algorithm
The DCT shows reduced blocking artifacts compared to the DFT
For Gauss-Markov sources and ρ → 1: the DCT approaches the KLT
For Gaussian sources: bit allocation proportional to the logarithm of the variance (similar to bit allocation in discrete sets, D + λR)
For high rates: optimum bit allocation yields equal component distortions
A larger transform size increases the gain for Gauss-Markov sources
For picture coding: decorrelating transform + entropy-constrained quantization + zig-zag scan + entropy coding is widely used today (e.g., JPEG, MPEG-1/2/4, ITU-T H.261/2/3/4)
For pictures: the pdf of the transform coefficients is Laplacian because of the exponential distribution of the block variances
For audio coding: the MDCT is widely used