JPEG Compression
Reference: Chapter 6 of Steinmetz and Nahrstedt

Motivations:
1. Uncompressed video and audio data are huge. In HDTV, the bit rate easily exceeds 1 Gbps, which creates big problems for storage and network communications.
2. The compression ratio of lossless methods (e.g., Huffman, Arithmetic, LZW) is not high enough for image and video compression, especially when the distribution of pixel values is relatively flat.

1. What is JPEG?
"Joint Photographic Experts Group". Voted as an international standard in 1992. Works with color and grayscale images, e.g., satellite, medical, ...

2. JPEG overview
Encoding -- apply the major steps listed in Section 3, in order.
Decoding -- reverse the order.

3. Major Steps
o DCT (Discrete Cosine Transform)
o Quantization
o Zigzag Scan
o DPCM on DC component
o RLE on AC components
o Entropy Coding

3a. Discrete Cosine Transform (DCT)
Definition (8 point DCT), in the form used by the JPEG standard, for an 8x8 block of pixel values f(x,y):

$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}, \qquad C(k) = \begin{cases} 1/\sqrt{2} & k = 0 \\ 1 & \text{otherwise} \end{cases}$$

DC and AC components:
o DC component: F(0,0). Proportional to the average value of all the pixels in the block.
o AC components: the remaining 63 coefficients. They represent the amplitudes of progressively higher horizontal and vertical spatial frequencies in the block.

3b. Quantization
Why? -- To throw out bits.
Example: 101101 = 45 (6 bits). Truncate to 4 bits: 1011 = 11. Truncate to 3 bits: 101 = 5.
Quantization error is the main source of loss in lossy compression.

Uniform quantization
o Divide by a constant N and round the result (N = 4 or 8 in the examples above).
o Non-powers-of-two give finer control (e.g., N = 6 loses about 2.6 bits, since log2(6) is roughly 2.58).

Quantization tables
In JPEG, each F(u,v) is divided by a constant q(u,v). The table of q(u,v) is called the quantization table.
----------------------------------
 16  11  10  16  24  40  51  61
 12  12  14  19  26  58  60  55
 14  13  16  24  40  57  69  56
 14  17  22  29  51  87  80  62
 18  22  37  56  68 109 103  77
 24  35  55  64  81 104 113  92
 49  64  78  87 103 121 120 101
 72  92  95  98 112 100 103  99
----------------------------------

o The eye is most sensitive to low frequencies (upper left corner) and less sensitive to high frequencies (lower right corner).
o The standard defines 2 default quantization tables, one for luminance (above) and one for chrominance.
o Custom quantization tables can be put in the image/scan header.

3c. Zig-zag Scan
Why? -- To group the low frequency coefficients at the top of the vector.
Maps the 8 x 8 block to a 1 x 64 vector.

3d. Differential Pulse Code Modulation (DPCM) on DC component
o The DC component is large and varied, but often close to the DC value of the previous block (like lossless JPEG).
o Encode the difference from the previous 8x8 block -- DPCM.

3e. Run Length Encode (RLE) on AC components
o The 1x64 vector has lots of zeros in it.
o Encode as (skip, value) pairs, where skip is the number of zeros and value is the next nonzero component.
o Send (0,0) as the end-of-block sentinel value.

3f. Entropy Coding
Encode the DC and AC values using Huffman coding.

4. Overview of the JPEG bitstream
A "frame" is a picture, a "scan" is a pass through the pixels (e.g., the red component), a "segment" is a group of blocks, and a "block" is an 8x8 group of pixels.

Frame header:
o sample precision
o (width, height) of image
o number of components
o unique ID (for each component)
o horizontal/vertical sampling factors (for each component)
o quantization table to use (for each component)

Scan header:
o number of components in scan
o component ID (for each component)
o Huffman table to use (for each component)

Misc. (can occur between headers):
o Quantization tables
o Huffman tables
o Arithmetic coding tables
o Comments
o Application data

5. Various JPEG Modes
Baseline/Sequential -- the one that we described in detail.
Lossless -- a special mode of JPEG in which there is no loss.
Progressive Mode
o Goal: display a low quality image and successively improve it.
o Two ways to successively improve the image:
1. Spectral selection: send the DC component, then the first few AC components, then some more AC components, etc.
2. Successive approximation: send the DCT coefficients from MSB (most significant bit) to LSB (least significant bit).
"Motion JPEG" -- Baseline JPEG applied to each image in a video.
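The baseline steps of Section 3 (DCT, quantization, zigzag scan, DPCM on the DC component, RLE on the AC components) can be sketched for a single 8x8 block as follows. This is a minimal illustration, not a conforming encoder: all function names (dct_8x8, quantize, zigzag_order, rle_ac, dpcm_dc, encode_block) are hypothetical, the DCT uses the standard 8-point normalization, and details of the real standard such as level shifting, the limit on run lengths, and the Huffman stage are left out.

```python
import math

# Luminance quantization table from Section 3b, row by row.
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def dct_8x8(block):
    """Forward 8x8 DCT, computed directly from the definition (slow but clear)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


def quantize(coeffs, table=Q_LUMA):
    """Divide each F(u,v) by q(u,v) and round to the nearest integer."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]


def zigzag_order():
    """Return the 64 (row, col) positions in zigzag order."""
    order = []
    for s in range(15):  # anti-diagonals where row + col == s
        diag = [(r, s - r) for r in range(8) if 0 <= s - r < 8]
        # Even diagonals are walked upward (row decreasing), odd ones downward.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def rle_ac(ac):
    """Encode AC values as (skip, value) pairs, ending with the (0, 0) sentinel."""
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append((0, 0))  # end-of-block; trailing zeros are implied
    return pairs


def dpcm_dc(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    diffs, prev = [], 0
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs


def encode_block(block, prev_dc=0):
    """Run one block through DCT, quantization, zigzag, DPCM and RLE."""
    q = quantize(dct_8x8(block))
    vec = [q[r][c] for r, c in zigzag_order()]
    return vec[0] - prev_dc, rle_ac(vec[1:])
```

For a flat block in which every pixel is 128, encode_block returns a DC difference of 64 and only the end-of-block pair (0, 0), since all AC coefficients quantize to zero. The (DC difference, pairs) output is what the entropy coding stage of Section 3f would then encode.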