Transform Codingcoding

TransformTransform CodingCoding • Principle of block-wise transform coding • Properties of orthonormal transforms • Discrete cosine transform (DCT) • Bit allocation for transform coefficients • Entropy coding of transform coefficients • Typical coding artifacts • Fast implementation of the DCT Thomas Wiegand: Digital Image Communication Transform Coding - 1 TransformTransform CodingCoding PrinciplePrinciple • Structure uvUV T Quantizer T-1 • Transform coder (T ) / decoder ( T-1) structure u UVi v T T-1 • Insert entropy coding ( ) and transmission channel u UiVi b b v T channel 1 T-1 Thomas Wiegand: Digital Image Communication Transform Coding - 2 TransformTransform CodingCoding andand QuantizationQuantization u1 U1 V1 v1 u U V v 2 T 2 Q 2 T-1 2 M M M M uN UN VN vN • Transformation of vector u=(u1,u2,…,uN) into U=(U1,U2,…,UN) • Quantizer Q may be applied to coefficients Ui • separately (scalar quantization: low complexity) • jointly (vector quantization, may require high complexity, exploiting of redundancy between Ui) • Inverse transformation of vector V=(V1,V2,…,VN) into v=(v1,v2,…,vN) Why should we use a transform ? Thomas Wiegand: Digital Image Communication Transform Coding - 3 GeometricalGeometrical InterpretationInterpretation • A linear transform can decorrelate random variables • An orthonormal transform is a rotation of the signal vector around the origin • Parseval‘s Theorem holds for orthonormal transforms u2 DC basis function U1 U u1 2 1 1 1 2 u = = T u 1 = 2 2 1 1 AC basis function u1 U1 1 1 12 1 3 U = = Tu = = U 2 2 1 11 2 1 Thomas Wiegand: Digital Image Communication Transform Coding - 4 PropertiesProperties ofof OrthonormalOrthonormal TransformsTransforms • Forward transform U = Tu N transform coefficients input signal block of size N Transform matrix of size NxN • Orthonormal transform property: inverse transform u = T1U = TT U • Linearity: u is represented as linear combination of “basis functions“ Thomas Wiegand: Digital Image Communication Transform Coding - 5 TransformTransform CodingCoding ofof ImagesImages Exploit horizontal and vertical dependencies by processing blocks original image reconstructed image Original Reconstructed image block image block Transform T Quantization Inverse Entropy Coding transform T-1 Transform Transmission Quantized transform coefficients coefficients from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 6 SeparableSeparable OrthonormalOrthonormal Transforms,Transforms, II • Problem: size of vectors N*N (typical value of N: 8) • An orthonormal transform is separable, if the transform of a signal block of size N*N-can be expressed by U = TuT1 N*N transform Orthonormal transform N*N block of coefficients matrix of size N*N input signal • The inverse transform is U = TT UT • Great practical importance: transform requires 2 matrix multiplications of size N*N instead one multiplication of a vector of size 1*N2 with a matrix of size N2*N2 • Reduction of the complexity from O(N4) to O(N3) from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 7 SeparableSeparable OrthonormalOrthonormal Transforms,Transforms, IIII Separable 2-D transform is realized by two 1-D transforms • along rows and • columns of the signal block N T u column-wise Tu row-wise TuT N N-transform N-transform N*N block of N*N block of pixels transform coefficients from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 8 Criteria for the Selection of a Particular Transform • Decorrelation, energy concentration • KLT, DCT, … • Transform should provide energy compaction • Visually pleasant basis functions • pseudo-random-noise, m-sequences, lapped transforms, … • Quantization errors make basis functions visible • Low complexity of computation • Separability in 2-D • Simple quantization of transform coefficients from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 9 KarhunenKarhunen LoèveLoève TransformTransform (KLT)(KLT) • Decorrelate elements of vector u T Ru=E{uu }, U=Tu, T T T T RU=E{UU }=TE{uu }T = TRuT =diag{ }i • Basis functions are eigenvectors of the covariance matrix of the input signal. • KLT achieves optimum energy concentration. • Disadvantages: - KLT dependent on signal statistics - KLT not separable for image blocks - Transform matrix cannot be factored into sparse matrices. from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 10 ComparisonComparison ofof VariousVarious Transforms,Transforms, II Comparison of 1D basis functions for block size N=8 (1) Karhunen Loève transform (1948/1960) (2) Haar transform (1910) (3) Walsh-Hadamard transform (1923) (4) Slant transform (Enomoto, Shibata, 1971) (5) Discrete CosineTransform (DCT) (Ahmet, Natarajan, Rao, 1974) (1) (2) (3) (4) (5) Thomas Wiegand: Digital Image Communication Transform Coding - 11 ComparisonComparison ofof VariousVarious Transforms,Transforms, IIII • Energy concentration measured for typical natural images, block size 1x32 (Lohscheller) • KLT is optimum • DCT performs only slightly worse than KLT Thomas Wiegand: Digital Image Communication Transform Coding - 12 DCTDCT - Type II-DCT of blocksize M x M - 2D basis functions of the DCT: is defined by transform matrix A containing elements (2k +1)i a = cos ik i 2M i,k = 0...(M 1) with 1 = 0 M 2 = i 0 i M Thomas Wiegand: Digital Image Communication Transform Coding - 13 Discrete Cosine Transform and Discrete Fourier Transform • Transform coding of images using the Discrete Fourier Transform (DFT): edge • For stationary image statistics, the energy folded concentration properties of the DFT converge against those of the KLT for large block sizes. • Problem of blockwise DFT coding: blocking effects • DFT of larger symmetric block -> “DCT“ due to circular topology of the DFT and Gibbs phenomena. pixel folded • Remedy: reflect image at block boundaries from: Girod Thomas Wiegand: Digital Image Communication Transform Coding - 14 HistogramsHistograms ofof DCTDCT Coefficients:Coefficients: • Image: Lena, 256x256 pixel • DCT: 8x8 pixels • DCT coefficients are approximately distributed like Laplacian pdf Thomas Wiegand: Digital Image Communication Transform Coding - 15 DistributionDistribution ofof thethe DCTDCT CoefficientsCoefficients • Central Limit Theorem requests DCT coefficients to be Gaussian distributed • Model variance of Gaussian DCT coefficients distribution as random variable (Lam & Goodmann‘2000) 2 2 1 u p(u | ) = exp 2 2 2 • Using conditional probability p(u) = p(u | 2 ) p( 2 )d 2 0 Thomas Wiegand: Digital Image Communication Transform Coding - 16 DistributionDistribution ofof thethe VarianceVariance Exponential Distribution 2 2 p( ) = μexp{}μ Thomas Wiegand: Digital Image Communication Transform Coding - 17 DistributionDistribution ofof thethe DCTDCT CoefficientsCoefficients p(u) = p(u | 2 ) p( 2) d 2 0 2 1 u 2 2 = exp 2 μexp{}μ d 0 2 2 2 2 u 2 = μ exp 2 μ d 0 2 2 1 μu = μ exp 2 2 μ 2 2μ = exp 2μ u Laplacian Distribution 2 {} Thomas Wiegand: Digital Image Communication Transform Coding - 18 BitBit AllocationAllocation forfor TransformTransform CoefficientsCoefficients II • Problem: divide bit-rate R among N transform coefficients such that resulting distortion D is minimized. N 1 1 N D(R) = Di (Ri ), s.t. R R N i Average i=1 N i=1 Average rate distortion Distortion contributed by coefﬁcient i Rate for coefﬁcient i • Approach: minimize Lagrangian cost function N N ! d dDi (Ri ) Di (Ri ) + Ri = + =0 dRi i=1 i=1 dRi • Solution: Pareto condition dD (R ) i i = dRi • Move bits from coefficient with small distortion reduction per bit to coefficient with larger distortion reduction per bit Thomas Wiegand: Digital Image Communication Transform Coding - 19 BitBit AllocationAllocation forfor TransformTransform CoefficientsCoefficients IIII • Assumption: high rate approximations are valid 2 2Ri dDi (Ri ) 2 2Ri Di (Ri ) a i 2 , 2aln2 i 2 = dRi 2aln2 R log + log R log log ˜ + R = log i + R i 2 i 2 i 2 i 2 2 ˜ 1 1 N 1 N 2aln2 N N 2aln2 R = Ri = log2 i + log2 = log2 i + log2 N i 1 N i 1 i 1 = 14 = 2 4 3 14= 2 4 3 log2 ˜ ˜ • Operational Distortion Rate function for transform coding: N N N i 1 a a 2log2 2R 2 2Ri 2 2 2R D(R) = Di (Ri ) i 2 = i 2 = a˜ 2 N i=1 N i=1 N i=1 1 N N Geometric mean: ˜ = i i=1 Thomas Wiegand: Digital Image Communication Transform Coding - 20 EntropyEntropy CodingCoding ofof TransformTransform CoefficientsCoefficients 2 1 i • Previous derivation assumes: Ri = max log2 ,0 bit 2 D • AC coefficients are very likely to be zero • Ordering of the transform DC coefficient coefficients by zig-zag scan • Design Huffman code for event: # of zeros and coefficient value • Arithmetic code maybe 1 simpler Probability 0.5 that coefficient is not zero 1 when quantizing with: 0 2 1 3 Vi=100*round(Ui/100) 2 4 3 4 5 5 6 6 7 7 8 8 Thomas Wiegand: Digital Image Communication Transform Coding - 21 EntropyEntropy CodingCoding ofof TransformTransform CoefficientsCoefficients IIII Encoder DCT 201 195 188 193 169 157 196 190 185 3 1 1 -3 2 -1 0 1480 49 33 -15 -14 33 -38 20 193 188 187 201 195 193 213 193 1 1 -1 0 -1 0 0 1 10 -52 11 -12 16 17 -13 -12 184 192 180 195 182 151 199 193 0 0 1 0 -1 0 0 0 19 32 -22 -10 22 -20 9 8 176 172 179 179 152 148 198 183 1 1 0 -1 0 0 0 -1 16 10 17 27 -31 12 6 -5 196 195 169 171 159 185 218 175 0 0 1 0 0 0 -1 0 -30 -6 13 -12 8 4 -3 -3 214 213 205 170 173 185 206 150 0 0 0 0 0 0 0 0 -25 16 6 -24 9 3 3 3 207 205 207 184 180 167 173 160 0 0 0 0 0 0 0 0 -2

Transform Codingcoding

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support