Entropy Coding – Run Length Codes
Entropy Achieving Codes
Univ.-Prof. Dr.-Ing. Markus Rupp
LVA 389.141 Fachvertiefung Telekommunikation (LVA 389.137 Image and Video Compression)
Last change: Jan. 20, 2020

Resume
• Lossless coding example
• Consider the following source, in which each two-bit input occurs with equal probability. Redundancy is added by a parity-check bit:

Input   Output
00      000
01      011
10      101
11      110

Resume
• How large is the entropy of this source?

$$H(U) = -\sum_{k=0}^{K-1} P(a_k)\,\log_2 P(a_k) = -\sum_{k=0}^{K-1} p_k \log_2 p_k = -4\cdot\tfrac{1}{4}\log_2\tfrac{1}{4} = 2\ \text{bit}$$

• Let us assume we receive only the first and third bit, thus 2 bit per symbol.
• It must therefore be possible to recompute the original signal. How?

Resume
• Solution:

Input   Output
0X0     000
0X1     011
1X1     101
1X0     110

b2 = b1 XOR b3

Resume
• Consider the pmf fX(x) of a discrete memoryless source.
• Q: What happens to the entropy if a constant is added? fX(x) → fX(x + c)
  – A: Nothing, the entropy remains unchanged.
• Q: What happens to the entropy if the range is doubled? x → 2x
  – A: Nothing, the entropy remains unchanged.

Resume
• Consider the pdf fX(x) of a continuous memoryless source.
• Q: What happens to the entropy if a constant is added? fX(x) → fX(x + c)
  – A: Nothing, the entropy remains unchanged.
• Q: What happens to the entropy if the range is doubled? x → 2x, i.e. fX(x) → ½ fX(x/2)
  – A: A lot.

Resume
• Q: What happens to the entropy if the range is doubled? x → 2x, i.e. fX(x) → ½ fX(x/2)
• A: If x → cx, then
  – for |c| < 1 the density becomes more concentrated around the mean, and thus the entropy decreases;
  – for |c| > 1 the density is spread out, and thus the entropy increases.
  (Formally, for differential entropy, h(cX) = h(X) + log2 |c|.)

Resume
Augustin Louis Cauchy (21.8.1789 – 23.5.1857), French mathematician
• Are there distributions without variance?
• Consider the Cauchy distribution:

$$f_X(x) = \frac{a}{\pi\,(x^2 + a^2)}$$

$$m_X = \int_{-\infty}^{\infty} \frac{a\,x}{\pi\,(x^2 + a^2)}\,dx = 0, \qquad \sigma_X^2 = \int_{-\infty}^{\infty} \frac{a\,x^2}{\pi\,(x^2 + a^2)}\,dx = \infty$$

Resume
Paul Pierre Lévy (15.9.1886 – 15.12.1971), French mathematician
• Are there distributions without mean and variance?
• Consider the Lévy distribution (a special case of the inverse-gamma distribution):

$$f_X(x) = \sqrt{\frac{c}{2\pi}}\;\frac{e^{-c/(2(x-a))}}{(x-a)^{3/2}}, \qquad x \ge a$$

$$m_X = \int_a^{\infty} x\,f_X(x)\,dx = \infty, \qquad \sigma_X^2 = \int_a^{\infty} x^2\,f_X(x)\,dx = \infty$$

Resume
[Three figure-only slides; the graphics are not recoverable from this extraction. Source: http://pillowlab.cps.utexas.edu/teaching/CompNeuro10/slides/slides16_EntropyMethods.pdf]

Outline
• Back to entropy
• Entropy achieving codes
  – Huffman codes
  – Golomb and Elias codes
  – Arithmetic coding
  – Adaptive entropy coding
  – Run length codes

Optimal Code Example
• Take for example 16 symbols Z = {0000, 0001, …, 1110, 1111}, all with probability p = 1/16.
• H(Z) = ?
• The fixed-length 4-bit code is optimal for equally distributed symbols!
• But what if the symbols are not equally distributed?

Huffman Code Example
[Figure-only slide; the code-tree graphic is not recoverable from this extraction.]

Example: Morse vs. Huffman
[Figure-only slide; the comparison graphic is not recoverable from this extraction.]

Vector Huffman Coding
• Huffman coding: a special form of arithmetic coding → achieving entropy.
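To make the Huffman construction concrete before moving on, here is a minimal Python sketch (not from the slides; the symbol set, the probabilities, and the function name huffman_code are illustrative assumptions). It repeatedly merges the two least probable subtrees, prefixing one branch with 0 and the other with 1; for the dyadic probabilities chosen here, the average codeword length meets the entropy of 1.75 bit/symbol exactly.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code; returns {symbol: codeword}."""
    tiebreak = count()  # unique ints keep heapq from comparing dicts
    heap = [(p, next(tiebreak), {s: ''}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # two least probable subtrees
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: '0' + w for s, w in c0.items()}
        merged.update({s: '1' + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Dyadic source: codeword lengths 1, 2, 3, 3 -> average 1.75 bit = H
probs = {'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}
code = huffman_code(probs)
print(code)  # e.g. {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
print(sum(p * len(code[s]) for s, p in probs.items()))  # 1.75
```

Which branch receives the 0 is arbitrary, so the exact codewords can vary; only the lengths are determined by the probabilities.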
Unary Codes
• Unary coding is an entropy encoding that represents a natural number n by n − 1 ones followed by a zero; for example, 5 is represented as 11110 (the "Alternative" column below). Some representations instead use n − 1 zeros followed by a one (the "Unary code" column); ones and zeros are interchangeable without loss of generality.

n    Unary code    Alternative
1    1             0
2    01            10
3    001           110
4    0001          1110
5    00001         11110
6    000001        111110
7    0000001       1111110
8    00000001      11111110
9    000000001     111111110
10   0000000001    1111111110

(German: unär or monadisch; the opposite is binary.)

Unary Codes
• Unary coding is an optimally efficient encoding for the discrete probability distribution

$$p_n = 2^{-n}, \qquad n = 1, 2, 3, \dots$$

• In symbol-by-symbol coding, it is optimal for any geometric distribution

$$p_n = (k - 1)\,k^{-n}, \qquad n = 1, 2, 3, \dots$$

for which k ≥ φ = 1.61803398879…, the golden ratio, or, more generally, for any discrete distribution for which

$$p_n \ge p_{n+1} + p_{n+2}, \qquad n = 1, 2, 3, \dots$$

Golomb Codes
Solomon W. Golomb (May 30, 1932 – May 1, 2016)
• Efficient for geometric distributions, less complex than unary codes.

Example: Elias Gamma Code
Peter Elias (Nov. 23, 1923 – Dec. 7, 2001)

n    Decomposition   Codeword     Implied probability
1    2^0 + 0         1            1/2
2    2^1 + 0         010          1/8
3    2^1 + 1         011          1/8
4    2^2 + 0         00100        1/32
5    2^2 + 1         00101        1/32
6    2^2 + 2         00110        1/32
7    2^2 + 3         00111        1/32
8    2^3 + 0         0001000      1/128
9    2^3 + 1         0001001      1/128
10   2^3 + 2         0001010      1/128
11   2^3 + 3         0001011      1/128
12   2^3 + 4         0001100      1/128
13   2^3 + 5         0001101      1/128
14   2^3 + 6         0001110      1/128
15   2^3 + 7         0001111      1/128
16   2^4 + 0         000010000    1/512
17   2^4 + 1         000010001    1/512

• Interpretation: a Golomb code with flexible subblock length!

Elias Coding: pre-Arithmetic Code
• The codeword length L satisfies

$$-\log_2 f_X(x) + 1 \;\le\; L \;\le\; -\log_2 f_X(x) + 2$$

Arithmetic Coding Example: SQUEEZE
• Assume every letter occurrence is equally likely, i.e., has probability 1/7 ≈ 0.143.
• We then find P(S) = P(Q) = P(U) = P(Z) = 1/7 and P(E) = 3/7.
• The entropy is 14.9 bit for the entire word.
• Huffman coding would result in 15 bit: E: 0, Q: 100, S: 101, U: 110, Z: 111.

Example: Arithmetic Coding
[Figure slide showing the successive subdivision of the unit interval; only its annotation survived the extraction:]
0.647705 decimal = 0.101001011101 binary

Example: Arithmetic Coding
• Eventually, the range for the last letter E becomes 0.64769 – 0.64772. This means that we can encode the word SQUEEZE by a single number in this range. The binary number in this range with the smallest number of bits is 0.101001011101, which corresponds to 0.647705 decimal. The '0.' prefix does not have to be transmitted, because every arithmetic-coded message starts with it. So we only need to transmit the sequence 101001011101, which is only 12 bits. This is even below the optimal number of 14.9 bits, but the comparison is not entirely fair: since the last letter of SQUEEZE is also the most common one, the final range is relatively large, making it easier to fit a value into it with fewer than the optimal number of bits. For the word SQUEEEZ we would also have needed 15 bits with arithmetic coding; as messages get longer, however, arithmetic coding clearly outperforms Huffman coding.
• The arithmetic decoding process is similar to the encoding process. We know that the value 0.647705 was transmitted. Starting at the top of the figure above, we see that this number falls in the range 0.571 – 0.715, so the first letter must be an S. The decoder also subdivides this range, and we see that the value 0.647705 now falls in the range 0.633 – 0.653, so the second letter must be a Q. This process is repeated until the entire word SQUEEZE is decoded.
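Two short Python sketches tie the preceding slides together. First, the unary and Elias gamma constructions from the tables above; the function names unary and elias_gamma are my own, and the decomposition n = 2^k + r follows the slide.

```python
def unary(n):
    """n - 1 zeros followed by a one (the table's first column)."""
    return '0' * (n - 1) + '1'

def elias_gamma(n):
    """Write n = 2^k + r: emit k zeros, then the (k+1)-bit binary of n.
    The prefix is unary(k + 1), fused with the leading 1 of the binary part."""
    k = n.bit_length() - 1
    return '0' * k + format(n, 'b')

for n in (1, 2, 3, 4, 8, 17):
    print(n, unary(n), elias_gamma(n))
# gamma: 1 -> 1, 2 -> 010, 3 -> 011, 4 -> 00100, 8 -> 0001000, 17 -> 000010001
```

Second, a sketch that reproduces the SQUEEZE numbers with exact rational arithmetic. The letter model and the alphabetical interval order (E, Q, S, U, Z, matching the ranges quoted above) come from the example; the helper names and the brute-force search for the shortest binary fraction are my own simplifications (a practical coder emits bits incrementally instead).

```python
from fractions import Fraction
from math import ceil

probs = {'E': Fraction(3, 7), 'Q': Fraction(1, 7), 'S': Fraction(1, 7),
         'U': Fraction(1, 7), 'Z': Fraction(1, 7)}

# Cumulative lower interval bounds in alphabetical order: E starts at 0.
cum, acc = {}, Fraction(0)
for c in sorted(probs):
    cum[c], acc = acc, acc + probs[c]

def encode(word):
    """Shrink [low, low + rng) once per letter; return the final interval."""
    low, rng = Fraction(0), Fraction(1)
    for c in word:
        low += rng * cum[c]
        rng *= probs[c]
    return low, low + rng

def shortest_binary_fraction(low, high):
    """Fewest-bits fraction m / 2^k inside [low, high); returns the k bits after '0.'."""
    k = 1
    while True:
        m = ceil(low * 2 ** k)            # smallest candidate numerator at depth k
        if Fraction(m, 2 ** k) < high:    # inside the interval?
            return format(m, 'b').zfill(k)
        k += 1

low, high = encode('SQUEEZE')
print(float(low), float(high))              # ~0.647690 .. 0.647722 (the slide's range)
print(shortest_binary_fraction(low, high))  # 101001011101 (12 bits)
```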
Problem in Entropy Codes
• Entropy-achieving codes show quite good performance as long as no single event has a probability larger than 0.5.
• Note that the AC coefficients of the DCT are Laplacian distributed, with a high likelihood (p > 0.5) of zeros.
• → Example:

Example
• Let us assume only three values occur:
  – 0 with p = 0.75
  – −1 and +1 with p = 0.125 each
• Let us use a Golomb code:
  – 1 for 0
  – 010 for −1
  – 011 for +1

Example
• Entropy: H = −0.75 log2 0.75 − 2 · 0.125 log2 0.125 = 1.0613 bit/symbol
• Golomb code: average length 1 · 0.75 + 3 · 0.125 + 3 · 0.125 = 1.5 bit/symbol
• Note that for
  – 0 with p = 0.5
  – −1, +1 with p = 0.25 each
  the entropy is H = 1.5 bit/symbol!

Run Length (En)Coding
• Consider a screen containing plain black text on a solid white background. There will be many long runs of white pixels in the blank space and many short runs of black pixels within the text. Let us take a hypothetical single scan line, with B representing a black pixel and W representing white:
WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW
• If we apply the run-length encoding (RLE) data compression algorithm to this scan line, we get:
12W1B12W3B24W1B14W
• Interpret this as twelve W's, one B, twelve W's, three B's, etc.
• The run-length code represents the original 67 characters in only 18 characters.

Run Length Coding
• More efficient coding notation: W = 0, B = 1.
• WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW
• 12W 1B 12W 3B 24W 1B 14W
• JPEG: (12,1), (12,1), (0,1), (0,1), (24,1), EOB. Each pair is (run of zeros, following value); EOB marks that only zeros remain.
• Even better: 12, 12, 0, 0, 24, EOB. Since every nonzero value here is 1, only the zero-run lengths need to be coded.
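A closing sketch of both run-length representations from this section; the scan line is the one from the slide, and the variable names are my own.

```python
from itertools import groupby

# The hypothetical scan line from the slide: 12W 1B 12W 3B 24W 1B 14W
line = 'W' * 12 + 'B' + 'W' * 12 + 'B' * 3 + 'W' * 24 + 'B' + 'W' * 14

# Plain RLE: one (count, symbol) pair per run.
rle = ''.join(f'{len(list(g))}{s}' for s, g in groupby(line))
print(len(line), '->', len(rle), rle)  # 67 -> 18 12W1B12W3B24W1B14W

# JPEG-style with W=0, B=1: (zero-run, value) pairs; trailing zeros become EOB.
pairs, zeros = [], 0
for s in line:
    if s == 'W':
        zeros += 1
    else:
        pairs.append((zeros, 1))
        zeros = 0
print(pairs, 'EOB')  # [(12, 1), (12, 1), (0, 1), (0, 1), (24, 1)] EOB
```

Dropping the constant value 1 from each pair gives the slide's final form 12, 12, 0, 0, 24, EOB.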