Image Compression Relative Data Redundancy

Relative Data Redundancy
► Let b and b' denote the number of bits in two representations of the same information. The relative data redundancy R of the representation with b bits is
$$R = 1 - \frac{1}{C}$$
where C is the compression ratio, defined as
$$C = \frac{b}{b'}$$
For example, if C = 10, the relative data redundancy of the larger representation is 0.9, indicating that 90% of its data is redundant.
4/14/2014

Why do we need compression?
► Data storage
► Data transmission

How can we implement compression?
► Coding redundancy: most 2-D intensity arrays contain more bits than are needed to represent the intensities.
► Spatial and temporal redundancy: pixels of most 2-D intensity arrays are correlated spatially, and video sequences are temporally correlated.
► Irrelevant information: most 2-D intensity arrays contain information that is ignored by the human visual system.

Examples of Redundancy

Coding Redundancy
The average number of bits required to represent each pixel is
$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k) = 0.25(2) + 0.47(1) + 0.25(3) + 0.03(3) = 1.81 \text{ bits}$$
where $l(r_k)$ is the number of bits used to represent intensity $r_k$ and $p_r(r_k)$ is the probability of that intensity. Relative to an 8-bit fixed-length code,
$$C = \frac{8}{1.81} \approx 4.42$$
$$R = 1 - \frac{1}{4.42} = 0.774$$

Spatial and Temporal Redundancy
1. All 256 intensities are equally probable.
2. The pixels along each line are identical.
3. The intensity of each line was selected randomly.
A run-length pair specifies the start of a new intensity and the number of consecutive pixels that have that intensity. Each 256-pixel line of the original representation is replaced by a single 8-bit intensity value and length 256 in the run-length representation. The compression ratio is
$$\frac{256 \times 256 \times 8}{(256 + 256) \times 8} = 128:1$$

Irrelevant Information
If the entire 256 × 256 × 8-bit image is represented by a single 8-bit value, the compression ratio is
$$\frac{256 \times 256 \times 8}{8} = 65536:1$$

Measuring Image Information
A random event E with probability P(E) is said to contain
$$I(E) = \log \frac{1}{P(E)} = -\log P(E)$$
units of information.
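As a numerical check of the coding-redundancy example above, the following sketch recomputes the average code length, the compression ratio, and the relative redundancy. The function names are illustrative only, and the probabilities (0.25, 0.47, 0.25, 0.03) and code lengths (2, 1, 3, 3) are assumed to be the ones intended in that example, chosen so that the probabilities sum to 1.

```python
def average_code_length(probabilities, lengths):
    """L_avg = sum over k of l(r_k) * p_r(r_k)."""
    return sum(p * l for p, l in zip(probabilities, lengths))


def compression_ratio(b, b_prime):
    """C = b / b', the ratio of the two representation sizes in bits."""
    return b / b_prime


def relative_redundancy(c):
    """R = 1 - 1/C."""
    return 1 - 1 / c


# Assumed values for the four-intensity example (hypothetical names).
probs = [0.25, 0.47, 0.25, 0.03]
lengths = [2, 1, 3, 3]

l_avg = average_code_length(probs, lengths)
c = compression_ratio(8, l_avg)   # 8-bit fixed-length code vs. variable-length code
r = relative_redundancy(c)

print(f"L_avg = {l_avg:.2f} bits, C = {c:.2f}, R = {r:.3f}")
```

Working per pixel is enough here, because the image size cancels in the ratio b/b'.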
Given a source of statistically independent random events from a discrete set of possible events $\{a_1, a_2, \ldots, a_J\}$ with associated probabilities $\{P(a_1), P(a_2), \ldots, P(a_J)\}$, the average information per source output, called the entropy of the source, is
$$H = -\sum_{j=1}^{J} P(a_j) \log P(a_j)$$
The $a_j$ are called source symbols. Because they are statistically independent, the source is called a zero-memory source.

If an image is considered to be the output of an imaginary zero-memory "intensity source", we can use the histogram of the observed image to estimate the symbol probabilities of the source. The intensity source's entropy becomes
$$H = -\sum_{k=0}^{L-1} p_r(r_k) \log_2 p_r(r_k)$$
where $p_r(r_k)$ is the normalized histogram.

For Fig. 8.1(a),
$$H = -\left[0.25 \log_2 0.25 + 0.47 \log_2 0.47 + 0.25 \log_2 0.25 + 0.03 \log_2 0.03\right] \approx 1.66 \text{ bits/pixel}$$

Fidelity Criteria
Let $f(x, y)$ be an input image and $\hat{f}(x, y)$ be an approximation of $f(x, y)$. The images are of size $M \times N$. The root-mean-square error is
$$e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right]^2 \right]^{1/2}$$
The mean-square signal-to-noise ratio of the output image, denoted $SNR_{ms}$, is
$$SNR_{ms} = \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x, y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right]^2}$$
[Figure: three example images, with RMSE = 5.17, RMSE = 15.67, and RMSE = 14.17]

Image Compression Models

Image Compression Standards

Some Basic Compression Methods: Huffman Coding
The average length of this code is
$$L_{avg} = 0.4(1) + 0.3(2) + 0.1(3) + 0.1(4) + 0.06(5) + 0.04(5) = 2.2 \text{ bits/pixel}$$

Some Basic Compression Methods: Golomb Coding
Given a nonnegative integer n and a positive integer divisor m > 0, the Golomb code of n with respect to m, denoted $G_m(n)$, is constructed as follows:
Step 1. Form the unary code of the quotient $\lfloor n/m \rfloor$. (The unary code of an integer q is defined as q 1s followed by a 0.)
Step 2. Let $k = \lceil \log_2 m \rceil$, $c = 2^k - m$, $r = n \bmod m$, and compute the truncated remainder $r'$ such that
$$r' = \begin{cases} r \text{ truncated to } k-1 \text{ bits}, & 0 \le r < c \\ r + c \text{ truncated to } k \text{ bits}, & \text{otherwise} \end{cases}$$
Step 3. Concatenate the results of Steps 1 and 2.

Example: $G_4(9)$
Step 1. $\lfloor 9/4 \rfloor = 2$, so the unary code is 110.
Step 2. $k = \lceil \log_2 4 \rceil = 2$, $c = 2^2 - 4 = 0$, $r = 9 \bmod 4 = 1$. Because r is not less than c, $r' = r + c = 1$ truncated to k = 2 bits, so $r' = 01$.
Step 3. Concatenating the results of Steps 1 and 2 gives $G_4(9) = 11001$.

Exercise: $G_4(7)$?

Some Basic Compression Methods: Arithmetic Coding
How to encode a2a1a2a4?

Decode 0.572. The length of the message is 5. Since 0.4 < code word < 0.8, the first symbol should be a3.
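The decoding of the code word 0.572 can be sketched as repeated interval subdivision. This is a minimal sketch, not a production decoder: the source model below (a1: [0, 0.2), a2: [0.2, 0.4), a3: [0.4, 0.8), a4: [0.8, 1.0)) is an assumption read off the interval boundaries of this example, the names `MODEL` and `arithmetic_decode` are hypothetical, and exact fractions are used so that interval comparisons are not affected by floating-point rounding.

```python
from fractions import Fraction

# Assumed model: each symbol owns a fixed sub-interval of [0, 1).
MODEL = [
    ("a1", Fraction(0), Fraction(2, 10)),
    ("a2", Fraction(2, 10), Fraction(4, 10)),
    ("a3", Fraction(4, 10), Fraction(8, 10)),
    ("a4", Fraction(8, 10), Fraction(1)),
]


def arithmetic_decode(code, length, model=MODEL):
    """Decode `length` symbols from a single code value in [0, 1)."""
    low, high = Fraction(0), Fraction(1)
    message = []
    for _ in range(length):
        width = high - low
        for symbol, lo, hi in model:
            # Scale the symbol's sub-interval into the current interval.
            sub_low = low + width * lo
            sub_high = low + width * hi
            if sub_low <= code < sub_high:
                message.append(symbol)
                low, high = sub_low, sub_high
                break
        else:
            raise ValueError("code value lies outside the current interval")
    return message


decoded = arithmetic_decode(Fraction("0.572"), 5)
print(" ".join(decoded))
```

The message length is passed explicitly because the code value alone does not say where the message ends; a practical coder would use an end-of-message symbol or transmit the length.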
Interval boundaries at each decoding step (the current interval is subdivided in proportion to the symbol intervals):

Boundary        Step 1   Step 2   Step 3   Step 4   Step 5
top of a4       1.0      0.8      0.72     0.592    0.5728
a3 / a4         0.8      0.72     0.688    0.5856   0.57152
a2 / a3         0.4      0.56     0.624    0.5728   0.56896
a1 / a2         0.2      0.48     0.592    0.5664   0.56768
bottom of a1    0.0      0.4      0.56     0.56     0.5664

Therefore, the message is a3a3a1a2a4.

LZW (Dictionary coding)
LZW (Lempel-Ziv-Welch) coding assigns fixed-length code words to variable-length sequences of source symbols, but requires no a priori knowledge of the probability of the source symbols.
LZW is used in:
• Tagged Image File Format (TIFF)
• Graphics Interchange Format (GIF)
• Portable Document Format (PDF)
LZW was formulated in 1984.

The Algorithm:
• A codebook or "dictionary" containing the source symbols is constructed.
• For 8-bit monochrome images, the first 256 words of the dictionary are assigned to the gray levels 0-255.
• The remaining part of the dictionary is filled with sequences of the gray levels.

Important features of LZW
1. The dictionary is created while the data are being encoded, so encoding can be done "on the fly".
2. The dictionary is not required to be transmitted. It is rebuilt during decoding.
3. If the dictionary "overflows", we have to reinitialize the dictionary and add a bit to each one of the code words.
4. Choosing a large dictionary size avoids overflow, but reduces compression.

Example:
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126

Decoding LZW
Let the bit stream received be: 39 39 126 126 256 258 260 259 257 126
In LZW, the dictionary that was used for encoding need not be sent with the image. A separate dictionary is built by the decoder, "on the fly", as it reads the received code words.

Recognized   Encoded value   Pixels       Dic. address   Dic. entry
             39              39
39           39              39           256            39-39
39           126             126          257            39-126
126          126             126          258            126-126
126          256             39-39        259            126-39
256          258             126-126      260            39-39-126
258          260             39-39-126    261            126-126-39
260          259             126-39       262            39-39-126-126
259          257             39-126       263            126-39-39
257          126             126          264            39-126-126

Some Basic Compression Methods: Run-Length Coding
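Run-length coding, as described in the spatial redundancy example earlier, replaces each run of identical pixels with an (intensity, run length) pair. The sketch below illustrates the idea on a synthetic 256 × 256 image whose lines each hold one randomly chosen intensity. The function name `run_length_encode` is hypothetical, and the bit count assumes that both the intensity and the run length are stored in 8 bits each (for a run of 256 this would mean storing, for instance, the length minus one).

```python
import random


def run_length_encode(line):
    """Return a list of (intensity, run_length) pairs for one image line."""
    pairs = []
    for value in line:
        if pairs and pairs[-1][0] == value:
            # Same intensity as the current run: extend it.
            pairs[-1] = (value, pairs[-1][1] + 1)
        else:
            # New intensity: start a new run.
            pairs.append((value, 1))
    return pairs


SIZE = 256   # pixels per line and number of lines
BITS = 8     # assumed bits per intensity and per run length

random.seed(0)
# Each line has a single random intensity, repeated across the whole line.
image = [[random.randrange(256)] * SIZE for _ in range(SIZE)]

encoded = [run_length_encode(row) for row in image]
num_pairs = sum(len(pairs) for pairs in encoded)

original_bits = SIZE * SIZE * BITS
encoded_bits = num_pairs * 2 * BITS   # two 8-bit values per pair
ratio = original_bits / encoded_bits

print(f"{num_pairs} pairs, compression ratio {ratio:.0f}:1")
```

On images with short runs the same scheme can expand the data, since every run costs two values regardless of its length.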
