JPEG Compression

Reference: Chapter 6 of Steinmetz and Nahrstedt

Motivations:

1. Uncompressed video and audio data are huge. In HDTV, the bit rate easily exceeds 1 Gbps, which creates big problems for storage and network communications.

2. The compression ratio of lossless methods (e.g., Huffman, Arithmetic, LZW) is not high enough for image and video compression, especially when the distribution of pixel values is relatively flat.

1. What is JPEG?

 "Joint Photographic Experts Group". Voted as an international standard in 1992.
 Works with color and grayscale images, e.g., satellite, medical, ...

2. JPEG overview

 Encoding

 Decoding -- Reverse the order

3. Major Steps

 DCT (Discrete Cosine Transform)
 Quantization
 Zigzag Scan
 DPCM on DC component
 RLE on AC components
 Entropy Coding

3a. Discrete Cosine Transform (DCT)

 Overview:

 Definition (8 point DCT):

 DC and AC components.

DC Component: F(0,0). Proportional to the average value of all the pixels in the block (8 times the mean under the usual JPEG normalization of the 8x8 DCT).

AC Components: the remaining 63 coefficients. They represent the amplitudes of progressively higher horizontal and vertical spatial frequencies in the block.
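A minimal sketch of the forward 8x8 DCT, assuming the normalization commonly used for JPEG: F(u,v) = (1/4) C(u) C(v) sum over x,y of f(x,y) cos((2x+1)u*pi/16) cos((2y+1)v*pi/16), with C(0) = 1/sqrt(2) and C(k) = 1 otherwise. The function names are illustrative, and the level shift that JPEG applies before the transform (subtracting 128 from 8-bit samples) is left out to keep the example short.

```python
import math

N = 8  # JPEG block size


def c(k):
    # Normalization factor: 1/sqrt(2) for the zero-frequency term, else 1.
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct_8x8(block):
    """Forward 2-D DCT of an 8x8 block given as 8 rows of 8 numbers."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


# A perfectly flat block: all the energy should land in the DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct_8x8(flat)
dc = coeffs[0][0]  # 8 times the mean pixel value under this normalization
```

For the flat block, the DC coefficient comes out at 8 times the pixel value and every AC coefficient is zero up to floating-point error. This direct form costs 64 multiply-adds per coefficient; practical codecs use fast factorizations instead.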

Further Exploration

Try the Interactive FFT examples and the Interactive DCT examples. Note: You must download the MathCad browser in order to use these examples.

3b. Quantization

 Why? -- To throw out bits.
 Example: 101101 = 45 (6 bits). Truncate to 4 bits: 1011 = 11. Truncate to 3 bits: 101 = 5.
 Quantization error is the main source of loss in lossy compression.

Uniform quantization

 Divide by a constant N and round the result (N = 4 or 8 in the examples above, where the rounding is truncation).
 Non-powers-of-two give finer control (e.g., N = 6 loses log2(6), about 2.6 bits).
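The arithmetic can be checked with a short sketch. The helper names are made up for illustration, and rounding halves away from zero is an assumption (Python's built-in round() rounds halves to even, so it is avoided here).

```python
import math


def quantize(value, n):
    """Divide by n and round to the nearest integer (halves away from zero)."""
    q = int(math.floor(abs(value) / n + 0.5))
    return q if value >= 0 else -q


def dequantize(level, n):
    """Reconstruct an approximation of the original value."""
    return level * n


truncated_4 = 45 >> 2         # 0b101101 -> 0b1011
truncated_3 = 45 >> 3         # 0b101101 -> 0b101
rounded_8 = quantize(45, 8)   # 45 / 8 = 5.625, nearest integer is 6
error_6 = 45 - dequantize(quantize(45, 6), 6)  # 45 -> level 8 -> 48
bits_lost_6 = math.log2(6)    # bits discarded when dividing by 6
```

Truncation and round-to-nearest differ for N = 8 (5 versus 6), which is why the reconstruction error depends on the rounding rule as well as on N.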

Quantization Tables

 In JPEG, each F[u,v] is divided by a constant q(u,v) and rounded to the nearest integer.
 The table of q(u,v) is called the quantization table.

 ------
  16  11  10  16  24  40  51  61
  12  12  14  19  26  58  60  55
  14  13  16  24  40  57  69  56
  14  17  22  29  51  87  80  62
  18  22  37  56  68 109 103  77
  24  35  55  64  81 104 113  92
  49  64  78  87 103 121 120 101
  72  92  95  98 112 100 103  99
 ------

 The eye is most sensitive to low frequencies (upper left corner) and less sensitive to high frequencies (lower right corner), so the divisors grow toward the lower right.
 The standard provides 2 example quantization tables, one for luminance (above) and one for chrominance, which are widely used as defaults.
 Custom quantization tables can be put in the image/scan header.
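As a sketch of table-driven quantization, the luminance table above can be applied to a block of DCT coefficients as follows. The function names and the sample coefficient values are invented for illustration; rounding halves away from zero is an assumption.

```python
import math

# Luminance quantization table from the text above.
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_half_away(x):
    """Round to nearest integer, halves away from zero."""
    r = int(math.floor(abs(x) + 0.5))
    return r if x >= 0 else -r


def quantize_block(coeffs, table=LUMA_Q):
    """Divide each F[u][v] by q(u,v) and round."""
    return [[round_half_away(coeffs[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]


def dequantize_block(levels, table=LUMA_Q):
    """Decoder side: multiply each level back by q(u,v)."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]


# Made-up coefficients: a large DC term, one low and one high frequency.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 800.0
coeffs[0][1] = -30.0
coeffs[7][7] = 40.0

levels = quantize_block(coeffs)
restored = dequantize_block(levels)
```

The high-frequency coefficient at (7,7) is divided by 99 and rounds to zero, while the low-frequency terms survive with small error. This is the mechanism that produces the long runs of zeros exploited by the later steps.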

3c. Zig-zag Scan

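 Why? -- To group the low-frequency coefficients at the start of the vector, which leaves the (mostly zero) high-frequency coefficients in a long run at the end.
 Maps the 8 x 8 block to a 1 x 64 vector.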
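One way to generate the zigzag order, shown here as a sketch rather than the lookup table a real codec would use, is to sort the (row, column) positions by anti-diagonal and alternate the direction of travel on each diagonal. The function names are illustrative.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    def key(rc):
        r, c = rc
        s = r + c  # index of the anti-diagonal
        # Odd diagonals run top-right to bottom-left (row increasing),
        # even diagonals run bottom-left to top-right (column increasing).
        return (s, r if s % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block into a vector following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Label each position with its row-major index to make the path visible.
block = [[8 * r + c for c in range(8)] for r in range(8)]
vector = zigzag_scan(block)
```

With the row-major labels, the vector starts 0, 1, 8, 16, 9, 2, 3, 10, 17, 24 and ends at 63, which traces the familiar path from the upper left to the lower right corner.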

3d. Differential Pulse Code Modulation (DPCM) on DC component

 The DC component is large and varied, but often close to the value in the previous block (lossless JPEG relies on the same kind of prediction).
 Encode the difference from the DC value of the previous 8x8 block -- DPCM.
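A minimal DPCM sketch for the DC terms of consecutive blocks. The sample DC values are invented, and starting the predictor at 0 for the first block is an assumption of this example.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous one."""
    prev = 0  # assumed initial predictor
    diffs = []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs


def dpcm_decode(diffs):
    """Rebuild the DC values by accumulating the differences."""
    prev = 0
    values = []
    for d in diffs:
        prev += d
        values.append(prev)
    return values


dc_values = [50, 52, 51, 51, 48]  # quantized DC terms of five blocks
diffs = dpcm_encode(dc_values)
```

After the first block the differences are small numbers clustered around zero, which are cheaper to entropy-code than the raw DC values. DPCM itself is lossless: decoding the differences returns the original sequence exactly.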

3e. Run Length Encode (RLE) on AC components

 The 1x64 vector has lots of zeros in it.
 Encode as (skip, value) pairs, where skip is the number of zeros preceding value, and value is the next non-zero component.
 Send (0,0) as the end-of-block sentinel value.
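The (skip, value) scheme can be sketched as below. This follows the simplified description in these notes: it always appends (0,0) and puts no limit on the skip count, whereas a real JPEG encoder has extra rules for very long zero runs. The function names and sample data are illustrative.

```python
def rle_encode_ac(ac):
    """Encode the 63 zigzag-ordered AC coefficients as (skip, value) pairs."""
    pairs = []
    run = 0
    for value in ac:
        if value == 0:
            run += 1            # count zeros until the next non-zero value
        else:
            pairs.append((run, value))
            run = 0
    pairs.append((0, 0))        # end-of-block sentinel; trailing zeros implied
    return pairs


def rle_decode_ac(pairs, length=63):
    """Expand (skip, value) pairs back into a vector of the given length."""
    ac = []
    for skip, value in pairs:
        if (skip, value) == (0, 0):
            break
        ac.extend([0] * skip)
        ac.append(value)
    ac.extend([0] * (length - len(ac)))  # fill in the implied trailing zeros
    return ac


ac = [-3, 0, 0, 2, 0, 1] + [0] * 57
pairs = rle_encode_ac(ac)
```

Here 63 coefficients collapse to four pairs, because the 57 trailing zeros are covered by the sentinel alone.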

3f. Entropy Coding

 Encode the DC and AC values using Huffman coding (the baseline method); the standard also allows arithmetic coding.
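Assuming the entropy coder is Huffman coding (the bitstream section mentions Huffman tables), a generic code builder can be sketched as follows. It assigns codes directly to whole (skip, value) pairs, which is a simplification: actual JPEG Huffman-codes a (run, size) symbol and appends the amplitude bits separately, and it often uses predefined tables. The symbol counts are invented.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from a list of symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: partial code}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)   # two least frequent subtrees
        c2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in codes1.items()}
        merged.update({s: "1" + code for s, code in codes2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


# Made-up stream of (skip, value) pairs with a skewed distribution.
stream = ([(0, 0)] * 8 + [(0, 1)] * 4 + [(1, 1)] * 2
          + [(2, -1)] + [(0, -3)])
codes = huffman_code(stream)
total_bits = sum(len(codes[s]) for s in stream)
```

The most frequent pair gets a 1-bit code and the rarest pairs get 4-bit codes, so the 16 symbols take 30 bits instead of the 48 needed with a fixed 3-bit code for five symbols.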

4. Overview of the JPEG bitstream

 A "frame" is a picture, a "scan" is a pass through the pixels (e.g., the red component), a "segment" is a group of blocks, and a "block" is an 8x8 group of pixels.

 Frame header:
   o sample precision
   o (width, height) of image
   o number of components
   o unique ID (for each component)
   o horizontal/vertical sampling factors (for each component)
   o quantization table to use (for each component)

 Scan header:
   o number of components in scan
   o component ID (for each component)
   o Huffman table to use (for each component)

 Misc. (can occur between headers):
   o quantization tables
   o Huffman tables
   o arithmetic coding tables
   o comments
   o application data
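The header fields listed above can be pictured as plain records. This is only a sketch of the information carried, not a parser: every class and field name is hypothetical, the sample values are invented, and the single Huffman table selector per component mirrors the list in these notes rather than the exact byte layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FrameComponent:
    component_id: int      # unique ID
    h_sampling: int        # horizontal sampling factor
    v_sampling: int        # vertical sampling factor
    quant_table_id: int    # which quantization table to use


@dataclass
class FrameHeader:
    sample_precision: int  # bits per sample
    width: int
    height: int
    components: List[FrameComponent] = field(default_factory=list)


@dataclass
class ScanComponent:
    component_id: int
    huffman_table_id: int  # which Huffman table to use


@dataclass
class ScanHeader:
    components: List[ScanComponent] = field(default_factory=list)


def scan_is_consistent(frame, scan):
    """Every component named in the scan must be declared in the frame."""
    frame_ids = {c.component_id for c in frame.components}
    return all(c.component_id in frame_ids for c in scan.components)


# Invented example: three components, the first one sampled at 2x2.
frame = FrameHeader(8, 640, 480, [
    FrameComponent(1, 2, 2, 0),
    FrameComponent(2, 1, 1, 1),
    FrameComponent(3, 1, 1, 1),
])
scan = ScanHeader([ScanComponent(1, 0)])
ok = scan_is_consistent(frame, scan)
```

Keeping the per-component settings in the frame header and only referring to component IDs in the scan header is what lets a frame be split into several scans.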

5. Various JPEG Modes

 Baseline/Sequential -- the one that we described in detail.
 Lossless -- a special mode of JPEG in which there is no loss at all; it uses prediction instead of DCT and quantization.
 Progressive Mode
   o Goal: display a low-quality image and successively improve it.
   o Two ways to successively improve the image:
       1. Spectral selection: send the DC component, then the first few AC components, then some more AC components, etc.
       2. Successive approximation: send the DCT coefficients from MSB (most significant bit) to LSB (least significant bit).
 "Motion JPEG" -- Baseline JPEG applied to each image in a video.
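The two progressive refinements can be illustrated with a rough sketch. The band boundaries, the bit counts, and the function names are arbitrary choices for this example, and the successive-approximation helper handles only a non-negative magnitude.

```python
def spectral_bands(zz, cuts=(1, 6, 64)):
    """Spectral selection: split a 64-entry zigzag vector into bands,
    each band being sent in its own scan (DC first, then AC groups)."""
    bands = []
    start = 0
    for end in cuts:
        bands.append(zz[start:end])
        start = end
    return bands


def successive_approximation(value, total_bits=8, first_bits=4):
    """Successive approximation: the value a decoder can reconstruct after
    receiving the top first_bits bits, then one more bit per later scan."""
    approximations = []
    for kept in range(first_bits, total_bits + 1):
        shift = total_bits - kept
        approximations.append((value >> shift) << shift)
    return approximations


zz = list(range(64))                   # stand-in for one block's coefficients
bands = spectral_bands(zz)
steps = successive_approximation(45)   # 45 = 0b00101101
```

In both cases the first scan is small and gives a coarse picture, and each later scan only adds detail, so the decoder can display something useful before the whole file has arrived.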