
Discrete Cosine Transform

Nuno Vasconcelos, UCSD

Discrete Fourier Transform
• in the last classes, we studied the DFT
• due to its computational efficiency, the DFT is very popular
• however, it has strong disadvantages for some applications
– it is complex
– it has poor energy compaction
• energy compaction
– is the ability to pack the energy of the spatial sequence into as few frequency coefficients as possible
– this is very important for compression
– we represent the image in the frequency domain
– if compaction is high, we only have to transmit a few coefficients
– instead of the whole set of pixels

Discrete Cosine Transform
• a much better transform, from this point of view, is the DCT
– in this example we see the amplitude spectra of the image above, under the DFT and the DCT
– note the much more concentrated histogram obtained with the DCT
• why is energy compaction important?
– the main reason is image compression
– it also turns out to be beneficial in other applications

Image compression
• an image compression system has three main blocks
– a transform (usually the DCT on 8x8 blocks)
– a quantizer
– a lossless (entropy) coder
• each block tries to throw away information which is not essential to understand the image, but costs bits

Image compression
• the transform throws away correlations
– if you make a plot of the value of a pixel as a function of one of its neighbors
– you will see that the pixels are highly correlated (i.e. most of the time they are very similar)
– this is just a consequence of the fact that surfaces are smooth

Image compression
• the transform eliminates these correlations
– this is best seen by considering the 2-pt transform
– an orthogonal transform can be written in matrix form as X = Tx, with T^T T = I, i.e. T has orthogonal columns
– note that the first coefficient is always the DC value, X[0] = x[0] + x[1], and the second is the difference, X[1] = x[0] - x[1]

Image compression
• the transform eliminates these correlations
– note that if x[0] is similar to x[1], then

    X[0] = x[0] + x[1] ≈ 2x[0]
    X[1] = x[0] - x[1] ≈ 0

– in the transform domain we only have to transmit one number, without any significant cost in image quality
– by "decorrelating" the signal we reduced the rate to ½!
– note that an orthogonal matrix (T^T T = I) applies a rotation to the pixel space
– this aligns the data with the canonical axes
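As a sanity check, the 2-pt example can be run numerically. A minimal NumPy sketch (the pixel values are made up; T is the unnormalized matrix from the slide, so T^T T = 2I rather than I, but its columns are still orthogonal):

```python
import numpy as np

# 2-pt transform from the slide: X[0] = x[0] + x[1], X[1] = x[0] - x[1]
T = np.array([[1.0,  1.0],
              [1.0, -1.0]])

# the columns of T are orthogonal (T^T T is diagonal)
assert np.allclose(T.T @ T, 2 * np.eye(2))

# two highly correlated "neighboring pixels" (synthetic values)
x = np.array([100.0, 102.0])
X = T @ x

print(X)   # [202.  -2.] : X[0] ~ 2*x[0], X[1] ~ 0
```

The first coefficient carries essentially all the information; the second is nearly zero and can be coded very cheaply.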

Image compression
• a second advantage of working in the frequency domain
– is that our visual system is less sensitive to errors around edges
– the transition associated with the edge masks our ability to perceive the error
– e.g. if you blow up a compressed picture, it is likely to look like this
– in general, the compression errors are more annoying in the smooth image regions

Image compression
• three JPEG examples, at 36KB, 5.7KB, and 1.7KB
– note that the blockiness is more visible in the torso

Image compression
• important point: by itself, the transform
– does not save any bits
– does not introduce any distortion
• both of these happen when we throw away information
• this is called "lossy compression" and is implemented by the quantizer
• what is a quantizer?
– think of the round() function, which rounds to the nearest integer
– round(1) = 1; round(0.55543) = 1; round(0.0000005) = 0
– instead of an infinite range of inputs between 0 and 1 (an infinite number of bits to transmit)
– the output is zero or one (1 bit)
– we threw away all the stuff in between, but saved a lot of bits
– a quantizer does this less drastically

Quantizer
• it is a function of this type
– inputs in a given range are mapped to the same output
• to implement this, we
– 1) define a quantizer step size Q
– 2) apply a rounding function

    xq = round( x / Q )

– the larger the Q, the fewer reconstruction levels we have
– more compression at the cost of larger distortion
– e.g. for x in [0,255], we need 8 bits and have 256 color values
– with Q = 64, we only have 4 levels and only need 2 bits
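A minimal sketch of such a quantizer (the function names quantize/dequantize are mine, not from the slides):

```python
import numpy as np

def quantize(x, Q):
    # inputs within the same step of size Q map to the same integer index
    return np.round(x / Q).astype(int)

def dequantize(xq, Q):
    # reconstruction level corresponding to each index
    return xq * Q

x = np.array([0.0, 37.0, 100.0, 163.0, 255.0])
xq = quantize(x, 64)            # coarse: only a handful of indices survive
x_hat = dequantize(xq, 64)

print(xq)                       # [0 1 2 3 4]
print(np.abs(x - x_hat).max())  # distortion is bounded by Q/2 = 32
```

Transmitting the small integer indices instead of the raw values is where the bit savings come from; the price is the bounded reconstruction error.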

Quantizer
• note that we can quantize some frequency coefficients more heavily than others by simply increasing Q
• this leads to the idea of a quantization matrix
• we start with an image block (e.g. 8x8 pixels)

Quantizer
• next we apply a transform (e.g. the 8x8 DCT)


Quantizer
• and quantize with a varying Q



Quantizer
• note that higher frequencies are quantized more heavily
– the entries of the Q matrix increase with frequency
– as a result, many high frequency coefficients are simply wiped out

Quantizer
• this saves a lot of bits, but after the inverse DCT we no longer have an exact replica of the original image block

Quantizer
• note, however, that the original and decompressed blocks are visually not very different
– we have saved lots of bits without much "perceptual" loss
– this is the reason why JPEG and MPEG work
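The whole transform, quantize, dequantize, inverse-transform loop on one 8x8 block can be sketched in a few lines. The block contents and the quantization matrix below are made up (a smooth ramp, and a step size that simply grows with frequency); only the structure follows the slides:

```python
import numpy as np

def dct_matrix(N=8):
    # orthonormal DCT-II basis: row k samples cos(pi*k*(2n+1)/(2N))
    n = np.arange(N)
    D = np.cos(np.pi * np.outer(n, 2 * n + 1) / (2 * N)) * np.sqrt(2.0 / N)
    D[0] /= np.sqrt(2)
    return D

D = dct_matrix(8)

# a smooth synthetic 8x8 block, like a patch of a real image
x = 4.0 * np.add.outer(np.arange(8), np.arange(8)) + 100.0

X = D @ x @ D.T                  # 2D DCT (row-column decomposition)

# made-up quantization matrix: step size increases with frequency
Q = 8.0 + 4.0 * np.add.outer(np.arange(8.0), np.arange(8.0))
Xq = np.round(X / Q)             # most high-frequency coefficients -> 0

x_hat = D.T @ (Xq * Q) @ D       # dequantize, then inverse DCT

print(np.count_nonzero(Xq))      # only a few coefficients survive
print(np.abs(x - x_hat).max())   # yet the block barely changes
```

Because the block is smooth, almost all its energy sits in a few low-frequency coefficients; the heavy quantization of the rest costs very little in pixel-domain error.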

Image compression
• the three JPEG examples again, at 36KB, 5.7KB, and 1.7KB
– note that the two images on the left look identical
– JPEG requires 6x fewer bits

Discrete Cosine Transform
• note that
– the better the energy compaction
– the larger the number of coefficients that get wiped out
– the greater the bit savings for the same loss
• this is why the DCT is important
• we will work mostly with the 1D-DCT
– the formulas are simpler, the insights the same
– as always, the extension to 2D is trivial

Discrete Cosine Transform
• the first thing to note is that there are various versions of the DCT
– these are usually known as DCT-I to DCT-IV
– they vary in minor details
– the most popular is the DCT-II, also known as the even symmetric DCT, or as "the DCT"
• the analysis and synthesis equations are

    C_x[k] = sum_{n=0..N-1} 2 x[n] cos( pi k (2n+1) / (2N) ),   0 <= k < N
           = 0,                                                  otherwise

    x[n] = (1/N) sum_{k=0..N-1} w[k] C_x[k] cos( pi k (2n+1) / (2N) ),   0 <= n < N
         = 0,                                                             otherwise

  with

    w[k] = 1/2,   k = 0
         = 1,     1 <= k < N

Discrete Cosine Transform
• recall the analysis equation

    C_x[k] = sum_{n=0..N-1} 2 x[n] cos( pi k (2n+1) / (2N) ),   0 <= k < N

• from this equation we can immediately see that the DCT coefficients are real
• to understand the better energy compaction
– it is interesting to compare the DCT to the DFT
– it turns out that there is a simple relationship between the two
• we consider a sequence x[n] which is zero outside of {0, …, N-1}
• to relate the DCT to the DFT we need three steps
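The DCT pair above translates directly into code. A small sketch implementing exactly those two formulas (nothing beyond the slides' definitions), verifying that the pair inverts:

```python
import numpy as np

def dct(x):
    # analysis: C[k] = sum_n 2 x[n] cos( pi k (2n+1) / (2N) ), 0 <= k < N
    N = len(x)
    n = np.arange(N)
    k = np.arange(N)[:, None]
    return (2 * x * np.cos(np.pi * k * (2 * n + 1) / (2 * N))).sum(axis=1)

def idct(C):
    # synthesis: x[n] = (1/N) sum_k w[k] C[k] cos( pi k (2n+1) / (2N) )
    N = len(C)
    w = np.ones(N)
    w[0] = 0.5                       # w[k] = 1/2 for k = 0, 1 otherwise
    n = np.arange(N)[:, None]
    k = np.arange(N)
    return (w * C * np.cos(np.pi * k * (2 * n + 1) / (2 * N))).sum(axis=1) / N

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
C = dct(x)
print(np.allclose(idct(C), x))       # True: perfect reconstruction
```

Note that C is real by construction, with no complex arithmetic anywhere, unlike the DFT.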

Discrete Cosine Transform
• step 1): create a sequence

    y[n] = x[n] + x[2N - 1 - n]
         = x[n],              0 <= n < N
         = x[2N - 1 - n],     N <= n < 2N

• step 2): compute the 2N-point DFT of y[n]

    Y[k] = sum_{n=0..2N-1} y[n] e^{-j 2 pi k n / (2N)},   0 <= k < 2N

• step 3): rewrite Y[k] as a function of N terms only

    Y[k] = sum_{n=0..N-1} y[n] e^{-j 2 pi k n / (2N)} + sum_{n=N..2N-1} y[n] e^{-j 2 pi k n / (2N)}

Discrete Cosine Transform
• step 3): rewrite as a function of N terms only

    Y[k] = sum_{n=0..N-1} x[n] e^{-j 2 pi k n / (2N)} + sum_{n=N..2N-1} x[2N-1-n] e^{-j 2 pi k n / (2N)}

– with the change of variable m = 2N - 1 - n (i.e. n = 2N - 1 - m) in the second sum,

    Y[k] = sum_{n=0..N-1} x[n] e^{-j 2 pi k n / (2N)} + sum_{m=0..N-1} x[m] e^{-j 2 pi k (2N-1-m) / (2N)}
         = sum_{n=0..N-1} x[n] { e^{-j 2 pi k n / (2N)} + e^{j 2 pi k n / (2N)} e^{-j 2 pi k} e^{j 2 pi k / (2N)} }

– and, since e^{-j 2 pi k} = 1,

    Y[k] = sum_{n=0..N-1} x[n] { e^{-j 2 pi k n / (2N)} + e^{j 2 pi k n / (2N)} e^{j 2 pi k / (2N)} }

– to write this as a cosine we need to make the two terms into "mirror" exponents
– factoring out e^{j pi k / (2N)},

    Y[k] = sum_{n=0..N-1} x[n] e^{j pi k / (2N)} { e^{-j pi k (2n+1) / (2N)} + e^{j pi k (2n+1) / (2N)} }
         = e^{j pi k / (2N)} sum_{n=0..N-1} 2 x[n] cos( pi k (2n+1) / (2N) )

– from which

    Y[k] = e^{j pi k / (2N)} C_x[k],   0 <= k < 2N

Discrete Cosine Transform
• it follows that

    C_x[k] = e^{-j pi k / (2N)} Y[k],   0 <= k < N
           = 0,                         otherwise

• in summary, we have three steps

    x[n]  <->  y[n]  <-DFT->  Y[k]  <->  C_x[k]
    (N-pt)    (2N-pt)        (2N-pt)    (N-pt)

• this interpretation is useful in various ways
– it provides insight on why the DCT has better energy compaction
– it provides a fast algorithm for the computation of the DCT
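Both points can be demonstrated in a few lines. The sketch below computes the DCT through the three steps (mirror, 2N-pt FFT, twiddle), then uses it to compare energy compaction against the DFT on a simple ramp (the test signal and the "first 3 coefficients" cutoff are arbitrary choices of mine):

```python
import numpy as np

def dct_via_fft(x):
    N = len(x)
    y = np.concatenate([x, x[::-1]])               # 1) y[n] = x[n] + x[2N-1-n]
    Y = np.fft.fft(y)                              # 2) 2N-pt DFT, via an FFT
    k = np.arange(N)
    C = np.exp(-1j * np.pi * k / (2 * N)) * Y[:N]  # 3) C_x[k] = e^{-j pi k/2N} Y[k]
    return C.real                                  # imaginary part is ~0

# brute-force check against C_x[k] = sum_n 2 x[n] cos( pi k (2n+1) / (2N) )
rng = np.random.default_rng(0)
x = rng.random(8)
n = np.arange(8)
k = np.arange(8)[:, None]
C_direct = (2 * x * np.cos(np.pi * k * (2 * n + 1) / 16)).sum(axis=1)
print(np.allclose(dct_via_fft(x), C_direct))       # True

# energy compaction: a ramp has a big jump in its periodic extension
def energy_fraction(X, m):
    e = np.abs(X) ** 2
    return e[:m].sum() / e.sum()                   # energy in m lowest coefficients

ramp = np.arange(16.0)
print(energy_fraction(dct_via_fft(ramp), 3))       # close to 1 : DCT packs the energy
print(energy_fraction(np.fft.fft(ramp), 3))        # much lower : DFT spreads it out
```

The ramp is exactly the kind of signal where the DFT's implicit periodic extension introduces a large discontinuity, which the mirroring in step 1 removes.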

Energy compaction

    x[n]  <->  y[n]  <-DFT->  Y[k]  <->  C_x[k]
    (N-pt)    (2N-pt)        (2N-pt)    (N-pt)

• to understand the energy compaction property
– we start by considering the sequence y[n] = x[n] + x[2N-1-n]
– this just consists of adding a mirrored version of x[n] to itself
– next we remember that the DFT is identical to the DFS of the periodic extension of the sequence
– let's look at the periodic extensions for the two cases
• when the transform is the DFT: we work with the extension of x[n]
• when the transform is the DCT: we work with the extension of y[n]

Energy compaction

    x[n]  <->  y[n]  <-DFT->  Y[k]  <->  C_x[k]
    (N-pt)    (2N-pt)        (2N-pt)    (N-pt)

• compare the two periodic extensions, for the DFT and for the DCT
– note that in the DFT case the extension introduces discontinuities
– this does not happen for the DCT, due to the symmetry of y[n]
– the elimination of this artificial discontinuity, which contains a lot of high frequencies,
– is the reason why the DCT is much more efficient

Fast algorithms
• the interpretation of the DCT as

    x[n]  <->  y[n]  <-DFT->  Y[k]  <->  C_x[k]
    (N-pt)    (2N-pt)        (2N-pt)    (N-pt)

– also gives us a fast algorithm for its computation
– it consists exactly of the three steps
– 1) y[n] = x[n] + x[2N-1-n]
– 2) Y[k] = DFT{y[n]}, which can be computed with a 2N-pt FFT
– 3) C_x[k] = e^{-j pi k / (2N)} Y[k] for 0 <= k < N, and 0 otherwise
– the complexity of the N-pt DCT is that of the 2N-pt DFT

2D DCT
• the extension to 2D is trivial
• the procedure is the same

    x[n1,n2]  <->  y[n1,n2]  <-2D DFT->  Y[k1,k2]  <->  C_x[k1,k2]
    (N1xN2-pt)    (2N1x2N2-pt)          (2N1x2N2-pt)   (N1xN2-pt)

• with

    y[n1,n2] = x[n1,n2] + x[2N1-1-n1, n2] + x[n1, 2N2-1-n2] + x[2N1-1-n1, 2N2-1-n2]

• and

    C_x[k1,k2] = e^{-j pi k1 / (2N1)} e^{-j pi k2 / (2N2)} Y[k1,k2],   0 <= k1 < N1, 0 <= k2 < N2
               = 0,                                                    otherwise

2D DCT
• the end result is the 2D DCT pair

    C_x[k1,k2] = sum_{n1=0..N1-1} sum_{n2=0..N2-1} 4 x[n1,n2] cos( pi k1 (2n1+1) / (2N1) ) cos( pi k2 (2n2+1) / (2N2) ),
                 for 0 <= k1 < N1, 0 <= k2 < N2, and 0 otherwise

    x[n1,n2] = (1/(N1 N2)) sum_{k1=0..N1-1} sum_{k2=0..N2-1} w1[k1] w2[k2] C_x[k1,k2] cos( pi k1 (2n1+1) / (2N1) ) cos( pi k2 (2n2+1) / (2N2) ),
               for 0 <= n1 < N1, 0 <= n2 < N2, and 0 otherwise

  with

    w1[k1] = 1/2 for k1 = 0, and 1 for 1 <= k1 < N1
    w2[k2] = 1/2 for k2 = 0, and 1 for 1 <= k2 < N2

• it is possible to show that the 2D-DCT can be computed with the row-column decomposition (homework)

2D-DCT
• 1) create an intermediate sequence f[k1,n2] by computing the 1D-DCT of the rows of x[n1,n2] (transforming index n1 into k1)
• 2) compute the 1D-DCT of the columns of f[k1,n2] (transforming index n2 into k2), obtaining C_x[k1,k2]
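The row-column decomposition (the homework claim above) is easy to check numerically. A sketch using the unnormalized DCT defined in these slides, compared against the direct 2D formula:

```python
import numpy as np

def dct1(x):
    # 1D DCT-II along the last axis: C[k] = sum_n 2 x[n] cos( pi k (2n+1) / (2N) )
    N = x.shape[-1]
    n = np.arange(N)
    k = np.arange(N)[:, None]
    return x @ (2 * np.cos(np.pi * k * (2 * n + 1) / (2 * N))).T

def dct2d_rowcol(x):
    f = dct1(x.T).T      # 1) 1D-DCT of the rows: n1 -> k1, giving f[k1, n2]
    return dct1(f)       # 2) 1D-DCT of the columns: n2 -> k2, giving C[k1, k2]

# check against the direct 2D definition with the 4 cos(.) cos(.) kernel
rng = np.random.default_rng(1)
x = rng.random((4, 4))
N = 4
n = np.arange(N)
k = np.arange(N)[:, None]
B = np.cos(np.pi * k * (2 * n + 1) / (2 * N))     # B[k, n]
C_direct = 4.0 * B @ x @ B.T

print(np.allclose(dct2d_rowcol(x), C_direct))     # True
```

Because the 2D kernel is separable into a product of two cosines, the double sum factors into two passes of the 1D transform, which is exactly what the two steps above compute.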