<<

TransformTransform CodingCoding

• Principle of block-wise transform coding

• Properties of orthonormal transforms

• Discrete cosine transform (DCT)

• Bit allocation for transform coefficients

• Entropy coding of transform coefficients • Typical coding artifacts • Fast implementation of the DCT

Thomas Wiegand: Digital Image Communication Transform Coding - 1 TransformTransform CodingCoding PrinciplePrinciple

• Structure uvUV T Quantizer T-1

• Transform coder (T ) / decoder (  T-1) structure u UVi v T   T-1

• Insert entropy coding (  ) and transmission channel u UiVi b b v T   channel   1 T-1  Thomas Wiegand: Digital Image Communication Transform Coding - 2 TransformTransform CodingCoding andand QuantizationQuantization

u1 U1 V1 v1 u U V v 2 T 2 Q 2 T-1 2 M M M M uN UN VN vN

• Transformation of vector u=(u1,u2,…,uN) into U=(U1,U2,…,UN) • Quantizer Q may be applied to coefficients Ui • separately (scalar quantization: low complexity) • jointly (, may require high complexity,

exploiting of redundancy between Ui) • Inverse transformation of vector V=(V1,V2,…,VN) into v=(v1,v2,…,vN) Why should we use a transform ?

Thomas Wiegand: Digital Image Communication Transform Coding - 3 GeometricalGeometrical InterpretationInterpretation

• A linear transform can decorrelate random variables • An orthonormal transform is a rotation of the vector around the origin • Parseval‘s Theorem holds for orthonormal transforms

u2 DC basis function

U1 U  u1  2 1  1 1 2 u =   =   T   u  1 =    2    2 1 1

AC basis function u1

U1  1  1 12 1  3  U =   = Tu =    =   U 2  2 1 11 2 1

Thomas Wiegand: Digital Image Communication Transform Coding - 4 PropertiesProperties ofof OrthonormalOrthonormal TransformsTransforms

• Forward transform U = Tu N transform coefficients input signal block of size N Transform matrix of size NxN • Orthonormal transform property: inverse transform

u = T1U = TT U

• Linearity: u is represented as linear combination of “basis functions“

Thomas Wiegand: Digital Image Communication Transform Coding - 5 TransformTransform CodingCoding ofof ImagesImages

Exploit horizontal and vertical dependencies by processing blocks

original image reconstructed image

Original Reconstructed image block image block

Transform T Quantization Inverse Entropy Coding transform T-1 Transform Transmission Quantized transform coefficients coefficients from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 6 SeparableSeparable OrthonormalOrthonormal Transforms,Transforms, II

• Problem: size of vectors N*N (typical value of N: 8) • An orthonormal transform is separable, if the transform of a signal block of size N*N-can be expressed by U = TuT1

N*N transform Orthonormal transform N*N block of coefficients matrix of size N*N input signal

• The inverse transform is U = TT UT • Great practical importance: transform requires 2 matrix multiplications of size N*N instead one multiplication of a vector of size 1*N2 with a matrix of size N2*N2 • Reduction of the complexity from O(N4) to O(N3)

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 7 SeparableSeparable OrthonormalOrthonormal Transforms,Transforms, IIII Separable 2-D transform is realized by two 1-D transforms • along rows and • columns of the signal block

N T u column-wise Tu row-wise TuT N N-transform N-transform

N*N block of N*N block of pixels transform coefficients

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 8 Criteria for the Selection of a Particular Transform

• Decorrelation, energy concentration • KLT, DCT, … • Transform should provide energy compaction • Visually pleasant basis functions • pseudo-random-noise, m-sequences, lapped transforms, … • Quantization errors make basis functions visible • Low complexity of computation • Separability in 2-D • Simple quantization of transform coefficients

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 9 KarhunenKarhunen LoèveLoève TransformTransform (KLT)(KLT)

• Decorrelate elements of vector u T  Ru=E{uu },  U=Tu,  T T T T RU=E{UU }=TE{uu }T = TRuT =diag{ }i

• Basis functions are eigenvectors of the covariance matrix of the input signal.

• KLT achieves optimum energy concentration. • Disadvantages:

- KLT dependent on signal statistics - KLT not separable for image blocks - Transform matrix cannot be factored into sparse matrices.

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 10 ComparisonComparison ofof VariousVarious Transforms,Transforms, II

Comparison of 1D basis functions for block size N=8

(1) Karhunen Loève transform (1948/1960)

(2) Haar transform (1910)

(3) Walsh-Hadamard transform (1923)

(4) Slant transform (Enomoto, Shibata, 1971)

(5) Discrete CosineTransform (DCT) (Ahmet, Natarajan, Rao, 1974) (1) (2) (3) (4) (5)

Thomas Wiegand: Digital Image Communication Transform Coding - 11 ComparisonComparison ofof VariousVarious Transforms,Transforms, IIII • Energy concentration measured for typical natural images, block size 1x32 (Lohscheller) • KLT is optimum • DCT performs only slightly worse than KLT

Thomas Wiegand: Digital Image Communication Transform Coding - 12

DCTDCT

- Type II-DCT of blocksize M x M - 2D basis functions of the DCT: is defined by transform matrix A containing elements (2k +1)i a = cos ik i 2M i,k = 0...(M 1) with 1  = 0 M 2  = i  0 i M

Thomas Wiegand: Digital Image Communication Transform Coding - 13 Discrete Cosine Transform and Discrete Fourier Transform

• Transform coding of images using the Discrete Fourier Transform (DFT): edge • For stationary image statistics, the energy folded concentration properties of the DFT converge against those of the KLT for large block sizes.

• Problem of blockwise DFT coding: blocking effects

• DFT of larger symmetric block -> “DCT“ due to circular topology of the DFT and Gibbs phenomena. pixel folded • Remedy: reflect image at block boundaries

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 14 HistogramsHistograms ofof DCTDCT Coefficients:Coefficients:

• Image: Lena, 256x256 pixel • DCT: 8x8 pixels • DCT coefficients are approximately distributed like Laplacian pdf

Thomas Wiegand: Digital Image Communication Transform Coding - 15 DistributionDistribution ofof thethe DCTDCT CoefficientsCoefficients

• Central Limit Theorem requests DCT coefficients to be Gaussian distributed • Model variance of Gaussian DCT coefficients distribution as random variable (Lam & Goodmann‘2000)  1  u 2  2  p(u | ) =  exp 2  2  2

• Using conditional probability   2  2 2 p(u) =  p(u | ) p( )d 0

Thomas Wiegand: Digital Image Communication Transform Coding - 16 DistributionDistribution ofof thethe VarianceVariance

Exponential Distribution 2 2 p( ) = μexp{}μ

Thomas Wiegand: Digital Image Communication Transform Coding - 17 DistributionDistribution ofof thethe DCTDCT CoefficientsCoefficients

 p(u) =  p(u | 2 ) p( 2) d 2 0  2 1 u  2 2 =  exp 2   μexp{}μ  d 0 2 2   2 2 u 2  = μ  exp  2  μ   d  0 2   2  1  μu  =  μ  exp 2     2 μ 2  2μ = exp{} 2μ u Laplacian Distribution 2

Thomas Wiegand: Digital Image Communication Transform Coding - 18 BitBit AllocationAllocation forfor TransformTransform CoefficientsCoefficients II

• Problem: divide bit-rate R among N transform coefficients such that resulting distortion D is minimized. N 1 1 N D(R) = Di (Ri ), s.t. R  R N   i Average i=1 N i=1 Average rate distortion Distortion contributed by coefficient i Rate for coefficient i • Approach: minimize Lagrangian cost function N N ! d dDi (Ri ) Di (Ri ) + Ri = + =0 dRi i=1 i=1 dRi • Solution: Pareto condition dD (R ) i i =  dRi • Move bits from coefficient with small distortion reduction per bit to coefficient with larger distortion reduction per bit

Thomas Wiegand: Digital Image Communication Transform Coding - 19 BitBit AllocationAllocation forfor TransformTransform CoefficientsCoefficients IIII • Assumption: high rate approximations are valid

2 2Ri dDi (Ri ) 2 2Ri Di (Ri )  a i 2 , 2aln2 i 2 =  dRi

2aln2  i Ri  log2  i + log2  Ri  log2  i  log2 ˜ + R = log2 + R  ˜

1 1 N 1 N 2aln2  N  N 2aln2 R = Ri = log2 i + log2 = log2  i + log2 N i 1 N i 1   i 1   = 14 = 2 4 3 14= 2 4 3 log2 ˜ ˜ • Operational Distortion Rate function for transform coding:

N N N  i 1 a a 2log2 2R 2 2Ri 2  2 2R D(R) = Di (Ri )   i 2 =  i 2 = a˜ 2 N i=1 N i=1 N i=1 1  N  N Geometric mean: ˜ =   i   i=1  Thomas Wiegand: Digital Image Communication Transform Coding - 20 EntropyEntropy CodingCoding ofof TransformTransform CoefficientsCoefficients  2   1  i • Previous derivation assumes: Ri = max log2 ,0 bit 2  D  • AC coefficients are very likely to be zero • Ordering of the transform DC coefficient coefficients by zig-zag scan • Design Huffman code for event: # of zeros and coefficient value • Arithmetic code maybe 1 simpler Probability 0.5 that coefficient is not zero 1 when quantizing with: 0 2 1 3 Vi=100*round(Ui/100) 2 4 3 4 5 5 6 6 7 7 8 8

Thomas Wiegand: Digital Image Communication Transform Coding - 21 EntropyEntropy CodingCoding ofof TransformTransform CoefficientsCoefficients IIII

Encoder

DCT 

201 195 188 193 169 157 196 190 185 3 1 1 -3 2 -1 0 1480 49 33 -15 -14 33 -38 20 193 188 187 201 195 193 213 193 1 1 -1 0 -1 0 0 1 10 -52 11 -12 16 17 -13 -12 184 192 180 195 182 151 199 193 0 0 1 0 -1 0 0 0 19 32 -22 -10 22 -20 9 8 176 172 179 179 152 148 198 183 1 1 0 -1 0 0 0 -1 16 10 17 27 -31 12 6 -5 196 195 169 171 159 185 218 175 0 0 1 0 0 0 -1 0 -30 -6 13 -12 8 4 -3 -3 214 213 205 170 173 185 206 150 0 0 0 0 0 0 0 0 -25 16 6 -24 9 3 3 3 207 205 207 184 180 167 173 160 0 0 0 0 0 0 0 0 -2 17 4 -6 0 -4 -9 8 198 203 205 186 196 149 159 163 0 0 0 0 0 0 0 0 1 -2 6 0 7 -5 -8 -7 Original 8x8 block

zig-zag-scan run-level coding entropy coding Mean of block (185 3 1 0 1 1 1 -1 0 1 0 1 1 0 -3 & transmission (0,3) (0,1) (1,1) (0,1) (0,1) (0,-1) 2 -1 0 0 0 0 0 0 1 -1 -1 0 -1 0 0 : 185 (1,1) (1,1) (0,1) (1,-3) (0,2) (0,-1) 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 (6,1) (0,-1) (0,-1) (1,-1) (14,1) 0 0 0 0 0 -1 -1 EOB) (9,-1) (0,-1) (EOB)

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 22 EntropyEntropy CodingCoding ofof TransformTransform CoefficientsCoefficients IIIIII

Decoder Inverse run-level- decoding zig-zag- Mean of block: 185 (185 3 1 0 1 1 1 -1 0 1 0 1 1 0 -3 transmission (0,3) (0,1) (1,1) (0,1) (0,1) (0,- scan 2 -1 0 0 0 0 0 0 1 -1 -1 0 -1 0 0 1) (1,1) (1,1) (0,1) (1,-3) (0,2) 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 (0,-1) (6,1) (0,-1) (0,-1) (1,-1) 0 0 0 0 0 -1 -1 EOB) (14,1) (9,-1) (0,-1) (EOB)

Scaling and inverse 185 3 1 1 -3 2 -1 0 1 1 -1 0 -1 0 0 1 DCT:  196 193 187 192 179 176 196 189 0 0 1 0 -1 0 0 0 198 188 182 198 196 192 208 200 1 1 0 -1 0 0 0 -1 185 189 191 197 174 159 184 189 0 0 1 0 0 0 -1 0 167 181 182 177 154 153 187 189 0 0 0 0 0 0 0 0 201 199 178 165 163 185 206 179 0 0 0 0 0 0 0 0 220 217 193 176 165 179 197 170 0 0 0 0 0 0 0 0 194 198 195 193 169 156 180 179 210 196 192 209 185 149 157 160 Reconstructed 8x8 block

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 23 Detail in a Block vs. DCT Coefficients Transmitted

image block DCT coefficients quantized DCT block of block coefficients reconstructed of block from quantized coefficients

30 30

20 20

10 10

0 0

-10 -10

-20 -20 0 0 -30 -30 2 2 0 0 2 4 2 4 4 6 4 6 6 6

30 30

20 20

10 10

0 0

-10 -10

-20 -20 0 0 -30 -30 2 2 0 0 2 4 2 4 4 6 4 6 6 6

30 30

20 20

10 10

0 0

-10 -10

-20 -20 0 0 -30 -30 2 2 0 0 2 4 2 4 4 6 4 6 6 6

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 24 TypicalTypical DCTDCT CodingCoding ArtifactsArtifacts

DCT coding with increasingly coarse quantization, block size 8x8

quantizer step size quantizer step size quantizer step size for AC coefficients: 25 for AC coefficients: 100 for AC coefficients: 200

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 25 InfluenceInfluence ofof DCTDCT BlockBlock SizeSize • Efficiency as a function of block size NxN, measured for 8 bit Quantization in the original domain and equivalent quantization in the transform domain. Memoryless entropy of original signal = mean entropy of transform coefficients

• Block size 8x8 is a good compromise between coding efficiency and complexity from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 26 FastFast DCTDCT AlgorithmAlgorithm II

DCT matrix factored into sparse matrices (Arai, Agui, and Nakajima; 1988): y = M  x

= S  P  M1  M2  M3  M4  M5  M6  x

s 0 1 1 1 s 1 1 1 1 0 s 2 0 1 1 0 1 1 s S = 3 P = 1 M = 1 M = -1 1 s4 1 1 1 1 2 1 s 5 1 0 1 1 1 1 0 s6 1 1 -1 0 1 s7 1 -1 1 -1 1

1 1 1 1 1 1 0 1 1 0 1 -1 0 1 1 1 1 C 4 1 1 1 -1 0 1 1 1 1 1 -1 1 1 M = -C M = M = M = 0 0 3 2 -C 6 4 1 5 -1 -1 6 1 -1 C 4 1 1 1 1 -1 0 -C C 0 0 6 2 1 1 1 1 -1 1 1 1 1 0 -1

from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 27 FastFast DCTDCT AlgorithmAlgorithm IIII

Signal flow graph for fast (scaled) 8-DCT according to Arai, Agui, Nakajima:

x0 s 0 y 0 scaling

x 1 s 4 y 4 x a s y 2 1 2 2 only 5 + 8 Multiplications x 3 s 6 y 6 (direct matrix x a s y 4 2 5 5 multiplication: x 5 a 3 s 1 y 1 64 multiplications) s x 6 a 4 7 y 7

x 7 s 3 y 3

a 5 a1 = C4 Multiplication: Addition: 1 a = C  C s = u m m u u 2 2 6 0 u+v 2 2 v 1 a = C s = ;k =1...7 3 4 K 4 C u  K u-v a = C + C C = cos ks v 4 6 2 K ()16

a5 = C6 from: Girod

Thomas Wiegand: Digital Image Communication Transform Coding - 28 TransformTransform Coding:Coding: SummarySummary • Orthonormal transform: rotation of coordinate system in signal space • Purpose of transform: decorrelation, energy concentration • KLT is optimum, but signal dependent and, hence, without a fast algorithm • DCT shows reduced blocking artifacts compared to DFT

• Bit allocation proportional to logarithm of variance

• Threshold coding + zig-zag scan + 8x8 block size is widely used today (e.g. JPEG, MPEG-1/2/4, ITU-T H.261/2/3)

• Fast algorithm for scaled 8-DCT: 5 multiplications, 29 additions

Thomas Wiegand: Digital Image Communication Transform Coding - 29