Digital Signal Processing, Fall 2009: The Discrete Cosine Transform

ENEE425: Digital Signal Processing, Fall 2009
The Discrete Cosine Transform
Steve Tjoa
Dept. of Electrical and Computer Engineering, University of Maryland
November 19, 2009

1 Definitions

(Some of the variable names in this guide are different from the ones in Oppenheim and Schafer.)

• The discrete cosine transform (DCT) of a signal $x[n]$ is

$$y^c[k] = \alpha[k] \sum_{n=0}^{N-1} x[n] \cos\left(\frac{\pi(2n+1)k}{2N}\right)$$

where the scaling factor $\alpha[k]$ is equal to

$$\alpha[k] = \begin{cases} \sqrt{1/N}, & k = 0, \\ \sqrt{2/N}, & k \in \{1, 2, \ldots, N-1\}. \end{cases}$$

• Like the discrete Fourier transform (DFT), the DCT provides a decomposition of any discrete-time signal as a weighted sum of basis functions. In the DFT, the basis functions are complex exponentials. In the DCT, the basis functions are cosines.

• There are actually eight versions of the DCT: type-I through type-VIII. The version mentioned above, and the one used most often in practice, is the type-II DCT.

• The inverse discrete cosine transform (IDCT) of $y^c[k]$ is

$$x[n] = \sum_{k=0}^{N-1} \alpha[k]\, y^c[k] \cos\left(\frac{\pi(2n+1)k}{2N}\right).$$

• Define the vectors $x$ and $y^c$ as

$$x = \begin{bmatrix} x[0] \\ x[1] \\ \vdots \\ x[N-1] \end{bmatrix}, \qquad y^c = \begin{bmatrix} y^c[0] \\ y^c[1] \\ \vdots \\ y^c[N-1] \end{bmatrix}.$$

Define the DCT matrix $C$ as

$$C = \begin{bmatrix} c[0,0] & c[0,1] & \cdots & c[0,N-1] \\ c[1,0] & c[1,1] & \cdots & c[1,N-1] \\ \vdots & & \ddots & \vdots \\ c[N-1,0] & c[N-1,1] & \cdots & c[N-1,N-1] \end{bmatrix}$$

where $c[k,n] = \alpha[k] \cos\left(\frac{\pi(2n+1)k}{2N}\right)$. Then $y^c = Cx$.

• Exercise: Show that $C^T C = I$. (If $C^T C = I$, then $C$ is an orthonormal matrix.)

• Since $C$ is an orthonormal matrix,

$$C^T y^c = C^T C x = x.$$

Therefore, the forward and inverse DCT can be concisely described as

$$y^c = Cx, \qquad x = C^T y^c.$$

2 Properties

1. If $x[n]$ is real for all $n$, then $y^c[k]$ is real for all $k$.

2. Parseval's Relation: $\|x\| = \|y^c\|$, or equivalently, $\sum_n |x[n]|^2 = \sum_k |y^c[k]|^2$.

3. The DCT has excellent energy compaction for many real-world signals (e.g., signals with high correlation among neighboring samples).

Digression: Let $y^s = Sx$, where $y^s$ is the discrete sine transform (DST) of $x$, and $S$ is the DST matrix.
(Properties 1 and 2 also hold for the DST. We will not properly introduce the DST because it is rarely used.) If, for most real-world signals $x[n]$ and for any choice of $K$ less than $N-1$,

$$\sum_{k=0}^{K} |y^c[k]|^2 > \sum_{k=0}^{K} |y^s[k]|^2$$

then the DCT has better energy compaction than the DST (generally speaking).

Fact: The DCT has better energy compaction than the DST.

3 Uses in Signal Compression

• Compression of a digital signal involves both truncation and quantization of its transform coefficients.

• The mean-squared error (MSE) between two signals of length $N$, $x[n]$ and $\hat{x}[n]$, is defined to be

$$\mathrm{MSE}(x, \hat{x}) = \frac{1}{N} \|x - \hat{x}\|^2 = \frac{1}{N} \sum_{n=0}^{N-1} |x[n] - \hat{x}[n]|^2.$$

• The peak signal-to-noise ratio (PSNR) between $x[n]$ and $\hat{x}[n]$ is, in decibels,

$$\mathrm{PSNR}(x, \hat{x}) = 10 \log_{10} \frac{P^2}{\mathrm{MSE}(x, \hat{x})}$$

where $P$ is the maximum possible value that $x[n]$ or $\hat{x}[n]$ can take. For example, in images, $P = 255$ because eight-bit pixel values are between 0 and 255.

• Define

$$\hat{y}^c[k] = \begin{cases} y^c[k], & 0 \le k \le K, \\ 0, & K < k \le N-1, \end{cases} \qquad \hat{y}^s[k] = \begin{cases} y^s[k], & 0 \le k \le K, \\ 0, & K < k \le N-1. \end{cases}$$

(Exercise: Show that an equivalent way of comparing the energy compaction between the DCT and DST is $\|\hat{y}^c\| > \|\hat{y}^s\|$.)

Now, define the reconstructed signals $\hat{x}^c = C^T \hat{y}^c$ and $\hat{x}^s = S^T \hat{y}^s$. (Since Parseval's relation holds for the DST, $S$ is also orthonormal, so $S^T$ inverts it.)

Exercise: Use the energy compaction property to show that

$$\mathrm{MSE}(x, \hat{x}^c) < \mathrm{MSE}(x, \hat{x}^s),$$

i.e., $\hat{x}^c$ retains more information about $x$ than $\hat{x}^s$. Equivalently,

$$\mathrm{PSNR}(x, \hat{x}^c) > \mathrm{PSNR}(x, \hat{x}^s).$$

• When compressing signals, the DCT is rarely computed over the entire signal. Instead, the DCT is computed for several (possibly overlapping) shorter segments within the signal.

• Suppose that, for some integers $k$ and $l$, $\mathrm{Var}(y^c[k]) > \mathrm{Var}(y^c[l])$. If more bits are devoted to representing $y^c[k]$ than $y^c[l]$, then the MSE is lower than in the case where more bits are devoted to representing $y^c[l]$ than $y^c[k]$, because a higher-variance coefficient incurs more quantization error at a given number of bits.

• The compression ratio (CR) of a compression scheme is the original file size divided by the compressed file size. A good compression scheme has a high compression ratio.
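The definitions in Sections 1 through 3 can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the function names (`dct_matrix`, `truncate`, `mse`, `psnr`, and so on) and the eight-sample ramp test signal are hypothetical choices made for illustration, and the matrix is built directly from the formula for $c[k,n]$ rather than with a fast algorithm.

```python
import math

def dct_matrix(N):
    """Build the N x N type-II DCT matrix, c[k][n] = alpha[k] cos(pi (2n+1) k / (2N))."""
    C = []
    for k in range(N):
        alpha = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        C.append([alpha * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                  for n in range(N)])
    return C

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dct(x):
    """Forward DCT, y^c = C x."""
    return matvec(dct_matrix(len(x)), x)

def idct(y):
    """Inverse DCT, x = C^T y^c."""
    return matvec(transpose(dct_matrix(len(y))), y)

def truncate(y, K):
    """Keep coefficients 0..K and set the remaining ones to zero."""
    return [v if k <= K else 0.0 for k, v in enumerate(y)]

def mse(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def psnr(x, x_hat, peak=255.0):
    """PSNR in dB; peak=255 assumes eight-bit sample values."""
    err = mse(x, x_hat)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

# Hypothetical smooth test signal (a ramp), chosen only for illustration.
x = [100.0 + 10.0 * n for n in range(8)]
y = dct(x)
x_hat = idct(truncate(y, 2))   # reconstruct from coefficients 0..2 only

print("energy of x:", sum(v * v for v in x))
print("energy of y:", sum(v * v for v in y))
print("MSE :", mse(x, x_hat))
print("PSNR:", psnr(x, x_hat), "dB")
```

Because $C$ is orthonormal, the MSE printed here should equal the energy of the discarded coefficients divided by $N$, which is the Parseval argument behind the exercises above.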
4 Two-Dimensional DCT

• The two-dimensional DCT of a two-dimensional signal $x[m,n]$ is

$$Y^c = C X C^T$$

and the inverse two-dimensional DCT of $y^c[k,l]$ is

$$X = C^T Y^c C,$$

where $X$ and $Y^c$ are the $N \times N$ matrices with entries $x[m,n]$ and $y^c[k,l]$, respectively.

• Interpretation: The 1D DCT provides a decomposition of $x[n]$ as a weighted sum of basis vectors (i.e., cosines), where $y^c[k]$ defines the weights of each basis vector. Similarly, the 2D DCT decomposes $x[m,n]$ as a weighted sum of basis images, where $y^c[k,l]$ defines the weights of each basis image.
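The matrix form of the 2D DCT can be sketched the same way. The code below is an illustrative sketch that assumes a square $N \times N$ input; the helper names (`dct2`, `idct2`, `matmul`) and the 4 x 4 sample block are hypothetical and do not come from the original notes.

```python
import math

def dct_matrix(N):
    """Build the N x N type-II DCT matrix, c[k][n] = alpha[k] cos(pi (2n+1) k / (2N))."""
    C = []
    for k in range(N):
        alpha = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        C.append([alpha * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                  for n in range(N)])
    return C

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def dct2(X):
    """2D DCT of a square block, Y^c = C X C^T."""
    C = dct_matrix(len(X))
    return matmul(matmul(C, X), transpose(C))

def idct2(Y):
    """Inverse 2D DCT of a square block, X = C^T Y^c C."""
    C = dct_matrix(len(Y))
    return matmul(matmul(transpose(C), Y), C)

# Hypothetical 4 x 4 block of pixel values, for illustration only.
X = [[52.0, 55.0, 61.0, 66.0],
     [63.0, 59.0, 55.0, 90.0],
     [62.0, 59.0, 68.0, 113.0],
     [63.0, 58.0, 71.0, 122.0]]

Y = dct2(X)
X_rec = idct2(Y)

max_err = max(abs(X[m][n] - X_rec[m][n]) for m in range(4) for n in range(4))
print("Y[0][0] (weight of the constant basis image):", Y[0][0])
print("max reconstruction error:", max_err)
```

Each entry `Y[k][l]` is the weight of one basis image. A constant block, for instance, should produce a single nonzero coefficient at `Y[0][0]`, since every other basis image is orthogonal to a constant.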
