Transform Codingcoding

TransformTransform CodingCoding • Principle of block-wise transform coding • Properties of orthonormal transforms • Discrete cosine transform (DCT) • Bit allocation for transform coefficients • Entropy coding of transform coefficients • Typical coding artifacts • Fast implementation of the DCT Thomas Wiegand: Digital Image Communication Transform Coding - 1 TransformTransform CodingCoding PrinciplePrinciple • Structure uvUV T Quantizer T-1 • Transform coder (T ) / decoder ( T-1) structure u UVi v T T-1 • Insert entropy coding ( ) and transmission channel u UiVi b b v T channel 1 T-1 Thomas Wiegand: Digital Image Communication Transform Coding - 2 TransformTransform CodingCoding andand QuantizationQuantization u1 U1 V1 v1 u U V v 2 T 2 Q 2 T-1 2 M M M M uN UN VN vN • Transformation of vector u=(u1,u2,…,uN) into U=(U1,U2,…,UN) • Quantizer Q may be applied to coefficients Ui • separately (scalar quantization: low complexity) • jointly (vector quantization, may require high complexity, exploiting of redundancy between Ui) • Inverse transformation of vector V=(V1,V2,…,VN) into v=(v1,v2,…,vN) Why should we use a transform ? Thomas Wiegand: Digital Image Communication Transform Coding - 3 GeometricalGeometrical InterpretationInterpretation • A linear transform can decorrelate random variables • An orthonormal transform is a rotation of the signal vector around the origin • Parseval‘s Theorem holds for orthonormal transforms u2 DC basis function U1 U u1 2 1 1 1 2 u = = T u 1 = 2 2 1 1 AC basis function u1 U1 1 1 12 1 3 U = = Tu = = U 2 2 1 11 2 1 Thomas Wiegand: Digital Image Communication Transform Coding - 4 PropertiesProperties ofof OrthonormalOrthonormal TransformsTransforms • Forward transform U = Tu N transform coefficients input signal block of size N Transform matrix of size NxN • Orthonormal transform property: inverse transform u = T1U = TT U • Linearity: u is represented as linear combination of “basis functions“ Thomas Wiegand: Digital Image Communication Transform Coding - 5 TransformTransform CodingCoding ofof ImagesImages Exploit horizontal and vertical dependencies by processing blocks original image reconstructed image Original Reconstructed image block image block Transform T Quantization Inverse Entropy Coding transform T-1 Transform Transmission Quantized transform coefficients coefficients from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 6 SeparableSeparable OrthonormalOrthonormal Transforms,Transforms, II • Problem: size of vectors N*N (typical value of N: 8) • An orthonormal transform is separable, if the transform of a signal block of size N*N-can be expressed by U = TuT1 N*N transform Orthonormal transform N*N block of coefficients matrix of size N*N input signal • The inverse transform is U = TT UT • Great practical importance: transform requires 2 matrix multiplications of size N*N instead one multiplication of a vector of size 1*N2 with a matrix of size N2*N2 • Reduction of the complexity from O(N4) to O(N3) from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 7 SeparableSeparable OrthonormalOrthonormal Transforms,Transforms, IIII Separable 2-D transform is realized by two 1-D transforms • along rows and • columns of the signal block N T u column-wise Tu row-wise TuT N N-transform N-transform N*N block of N*N block of pixels transform coefficients from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 8 Criteria for the Selection of a Particular Transform • Decorrelation, energy concentration • KLT, DCT, … • Transform should provide energy compaction • Visually pleasant basis functions • pseudo-random-noise, m-sequences, lapped transforms, … • Quantization errors make basis functions visible • Low complexity of computation • Separability in 2-D • Simple quantization of transform coefficients from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 9 KarhunenKarhunen LoèveLoève TransformTransform (KLT)(KLT) • Decorrelate elements of vector u T Ru=E{uu }, U=Tu, T T T T RU=E{UU }=TE{uu }T = TRuT =diag{ }i • Basis functions are eigenvectors of the covariance matrix of the input signal. • KLT achieves optimum energy concentration. • Disadvantages: - KLT dependent on signal statistics - KLT not separable for image blocks - Transform matrix cannot be factored into sparse matrices. from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 10 ComparisonComparison ofof VariousVarious Transforms,Transforms, II Comparison of 1D basis functions for block size N=8 (1) Karhunen Loève transform (1948/1960) (2) Haar transform (1910) (3) Walsh-Hadamard transform (1923) (4) Slant transform (Enomoto, Shibata, 1971) (5) Discrete CosineTransform (DCT) (Ahmet, Natarajan, Rao, 1974) (1) (2) (3) (4) (5) Thomas Wiegand: Digital Image Communication Transform Coding - 11 ComparisonComparison ofof VariousVarious Transforms,Transforms, IIII • Energy concentration measured for typical natural images, block size 1x32 (Lohscheller) • KLT is optimum • DCT performs only slightly worse than KLT Thomas Wiegand: Digital Image Communication Transform Coding - 12 DCTDCT - Type II-DCT of blocksize M x M - 2D basis functions of the DCT: is defined by transform matrix A containing elements (2k +1)i a = cos ik i 2M i,k = 0...(M 1) with 1 = 0 M 2 = i 0 i M Thomas Wiegand: Digital Image Communication Transform Coding - 13 Discrete Cosine Transform and Discrete Fourier Transform • Transform coding of images using the Discrete Fourier Transform (DFT): edge • For stationary image statistics, the energy folded concentration properties of the DFT converge against those of the KLT for large block sizes. • Problem of blockwise DFT coding: blocking effects • DFT of larger symmetric block -> “DCT“ due to circular topology of the DFT and Gibbs phenomena. pixel folded • Remedy: reflect image at block boundaries from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 14 HistogramsHistograms ofof DCTDCT Coefficients:Coefficients: • Image: Lena, 256x256 pixel • DCT: 8x8 pixels • DCT coefficients are approximately distributed like Laplacian pdf Thomas Wiegand: Digital Image Communication Transform Coding - 15 DistributionDistribution ofof thethe DCTDCT CoefficientsCoefficients • Central Limit Theorem requests DCT coefficients to be Gaussian distributed • Model variance of Gaussian DCT coefficients distribution as random variable (Lam & Goodmann‘2000) 2 2 1 u p(u | ) = exp 2 2 2 • Using conditional probability p(u) = p(u | 2 ) p( 2 )d 2 0 Thomas Wiegand: Digital Image Communication Transform Coding - 16 DistributionDistribution ofof thethe VarianceVariance Exponential Distribution 2 2 p( ) = μexp{}μ Thomas Wiegand: Digital Image Communication Transform Coding - 17 DistributionDistribution ofof thethe DCTDCT CoefficientsCoefficients p(u) = p(u | 2 ) p( 2) d 2 0 2 1 u 2 2 = exp 2 μexp{}μ d 0 2 2 2 2 u 2 = μ exp 2 μ d 0 2 2 1 μu = μ exp 2 2 μ 2 2μ = exp 2μ u Laplacian Distribution 2 {} Thomas Wiegand: Digital Image Communication Transform Coding - 18 BitBit AllocationAllocation forfor TransformTransform CoefficientsCoefficients II • Problem: divide bit-rate R among N transform coefficients such that resulting distortion D is minimized. N 1 1 N D(R) = Di (Ri ), s.t. R R N i Average i=1 N i=1 Average rate distortion Distortion contributed by coefﬁcient i Rate for coefﬁcient i • Approach: minimize Lagrangian cost function N N ! d dDi (Ri ) Di (Ri ) + Ri = + =0 dRi i=1 i=1 dRi • Solution: Pareto condition dD (R ) i i = dRi • Move bits from coefficient with small distortion reduction per bit to coefficient with larger distortion reduction per bit Thomas Wiegand: Digital Image Communication Transform Coding - 19 BitBit AllocationAllocation forfor TransformTransform CoefficientsCoefficients IIII • Assumption: high rate approximations are valid 2 2Ri dDi (Ri ) 2 2Ri Di (Ri ) a i 2 , 2aln2 i 2 = dRi 2aln2 R log + log R log log ˜ + R = log i + R i 2 i 2 i 2 i 2 2 ˜ 1 1 N 1 N 2aln2 N N 2aln2 R = Ri = log2 i + log2 = log2 i + log2 N i 1 N i 1 i 1 = 14 = 2 4 3 14= 2 4 3 log2 ˜ ˜ • Operational Distortion Rate function for transform coding: N N N i 1 a a 2log2 2R 2 2Ri 2 2 2R D(R) = Di (Ri ) i 2 = i 2 = a˜ 2 N i=1 N i=1 N i=1 1 N N Geometric mean: ˜ = i i=1 Thomas Wiegand: Digital Image Communication Transform Coding - 20 EntropyEntropy CodingCoding ofof TransformTransform CoefficientsCoefficients 2 1 i • Previous derivation assumes: Ri = max log2 ,0 bit 2 D • AC coefficients are very likely to be zero • Ordering of the transform DC coefficient coefficients by zig-zag scan • Design Huffman code for event: # of zeros and coefficient value • Arithmetic code maybe 1 simpler Probability 0.5 that coefficient is not zero 1 when quantizing with: 0 2 1 3 Vi=100*round(Ui/100) 2 4 3 4 5 5 6 6 7 7 8 8 Thomas Wiegand: Digital Image Communication Transform Coding - 21 EntropyEntropy CodingCoding ofof TransformTransform CoefficientsCoefficients IIII Encoder DCT 201 195 188 193 169 157 196 190 185 3 1 1 -3 2 -1 0 1480 49 33 -15 -14 33 -38 20 193 188 187 201 195 193 213 193 1 1 -1 0 -1 0 0 1 10 -52 11 -12 16 17 -13 -12 184 192 180 195 182 151 199 193 0 0 1 0 -1 0 0 0 19 32 -22 -10 22 -20 9 8 176 172 179 179 152 148 198 183 1 1 0 -1 0 0 0 -1 16 10 17 27 -31 12 6 -5 196 195 169 171 159 185 218 175 0 0 1 0 0 0 -1 0 -30 -6 13 -12 8 4 -3 -3 214 213 205 170 173 185 206 150 0 0 0 0 0 0 0 0 -25 16 6 -24 9 3 3 3 207 205 207 184 180 167 173 160 0 0 0 0 0 0 0 0 -2

Load more