
CSE 589 Applied Algorithms
Autumn 2001

Golomb Coding
Arithmetic Coding
LZW
Sequitur

CSE 589 - Lecture 6 - Autumn 2001

Binary Golomb Codes
• Binary source with 0's much more frequent than 1's.
• Variable-to-variable length code.
• Prefix code.
• Golomb code of order 4:

    input   output
    0000    0
    0001    100
    001     101
    01      110
    1       111

Example
• Using the order 4 code above:

    input:  000000001001000000010001
    output: 001111010100100

• The input parses as 0000 0000 1 001 0000 0001 0001, which encodes to 0 0 111 101 0 100 100.

Constructing a Binary Golomb Code
• Let p be the probability of 0.
• Choose the order k such that $p^{k+1} \le 1 - p^k < p^{k-1}$.
[Figure: code tree whose leaves $0^{k-1}1$, ..., 001, 01, 1 run from least likely to most likely.]

Example
• p = .89
• $p^7 = .442$, $1 - p^6 = .503$, $p^5 = .558$
• Order = 6

    input    output
    000000   0
    000001   1000
    00001    1001
    0001     1010
    001      1011
    01       110
    1        111

[Figure: code tree with leaves 000001, 00001, 0001, 001, 01, 1.]

Comparison of GC with Entropy
[Figure: plot of (GC - entropy) / entropy against p, annotated with the order.]

Notes on Binary Golomb Code
• Very effective as a binary run-length code.
• As p goes to 1, its effectiveness approaches the entropy.
• There are adaptive Golomb codes, so that the order can change according to the observed frequency of 0's.

Arithmetic Coding
• Huffman coding works well for larger alphabets and gets to within one bit of the entropy lower bound. Can we do better? Yes.
• Basic idea in arithmetic coding:
  – Represent each string x of length n by an interval [l,r) in [0,1). We assume n is known.
  – The width r-l of the interval [l,r) represents the probability of x occurring.
  – The interval [l,r) can itself be represented by any number, called a tag, within the half-open interval.
  – The k significant bits of the tag .t1t2t3... are the code of x. That is, .t1t2t3...tk000... is in the interval [l,r).

Example of Arithmetic Coding (1)
1. The tag must be in the half-open interval.
2. The tag can be chosen to be (l+r)/2.
3. The code is the significant bits of the tag.
[Figure: the interval [0,1) divided between a and b (labeled 1/3 and 2/3), then subdivided through bb down to bba.]

    bba:  l = 15/27 = .100011100...
          r = 19/27 = .101101000...
    tag = 17/27 = .101000010...
    code = 101

Some Tags are Better than Others
[Figure: the interval [0,1) divided between a and b (labeled 1/3 and 2/3), then subdivided through ba down to bab.]

    bab:  l = 11/27 = .011010000...
          r = 15/27 = .100011100...
    Using tag = (l+r)/2:  tag = 13/27 = .011110110...   code = 0111
    Alternative:          tag = 14/27 = .100001001...   code = 1

Code Generation from Tag
• If the binary tag is .t1t2t3... = (r+l)/2 in [l,r), then we want to choose k to form the code t1t2...tk.
• Short code:
  – Choose k to be as small as possible so that l ≤ .t1t2...tk000... < r.
  – Good if n is known.
• Guaranteed code:
  – Choose $k = \lceil -\log_2(r - l) \rceil + 1$.
  – l < .t1t2...tkb1b2b3... < r for any bits b1b2b3...
  – For fixed-length strings this provides a good prefix code.
  – Example: [.000000000..., .000010010...), tag = .000001001...
    Short code: 0
    Guaranteed code: 000001

Example of Codes
• P(a) = 1/3, P(b) = 2/3.

    string   interval [l,r)    l in binary      tag = (l+r)/2    code
    aaa      [0/27, 1/27)      .000000000...    .000001001...    0
    aab      [1/27, 3/27)      .000010010...    .000100110...    0001
    aba      [3/27, 5/27)      .000111000...    .001001100...    001
    abb      [5/27, 9/27)      .001011110...    .010000101...    01
    baa      [9/27, 11/27)     .010101010...    .010111110...    01011
    bab      [11/27, 15/27)    .011010000...    .011110111...    0111
    bba      [15/27, 19/27)    .100011100...    .101000010...    101
    bbb      [19/27, 27/27)    .101101000...    .110110100...    11

    27/27 = .111111111...

• .95 bits/symbol
• .92 entropy lower bound

Guaranteed Code Example
• P(a) = 1/3, P(b) = 2/3.

    string   tag = (l+r)/2    short code   prefix code
    aaa      .000001001...    0            0000
    aab      .000100110...    0001         0001
    aba      .001001100...    001          001
    abb      .010000101...    01           0100
    baa      .010111110...    01011        01011
    bab      .011110111...    0111         0111
    bba      .101000010...    101          101
    bbb      .110110100...    11           11

Arithmetic Coding Algorithm
• P(a1), P(a2), … , P(am)
• C(ai) = P(a1) + P(a2) + … + P(ai-1)
• Encode x1x2...xn

    Initialize l := 0 and r := 1;
    for i = 1 to n do
        w := r - l;
        l := l + w * C(xi);
        r := l + w * P(xi);
    t := (l+r)/2;
    choose code for the tag

Arithmetic Coding Example
• P(a) = 1/4, P(b) = 1/2, P(c) = 1/4
• C(a) = 0, C(b) = 1/4, C(c) = 3/4
• abca

    symbol   w      l      r
                    0      1
    a        1      0      1/4
    b        1/4    1/16   3/16
    c        1/8    5/32   6/32
    a        1/32   5/32   21/128

    w := r - l;
    l := l + w C(x);
    r := l + w P(x)

    tag = (5/32 + 21/128)/2 = 41/256 = .001010010...
    l = .001010000...
    r = .001010100...
    code = 00101
    prefix code = 00101001

Arithmetic Coding Exercise
• P(a) = 1/4, P(b) = 1/2, P(c) = 1/4
• C(a) = 0, C(b) = 1/4, C(c) = 3/4
• bbbb

    symbol   w      l      r
                    0      1
    b        1
    b
    b
    b

    tag =
    l =
    r =
    code =
    prefix code =

Decoding (1) to (3)
• Assume the length is known to be 3.
• 0001, which converts to the tag .0001000...
  (1) .0001000... lies in the interval of a: output a.
  (2) .0001000... lies in the interval of aa: output a.
  (3) .0001000... lies in the interval of aab: output b.

Arithmetic Decoding Algorithm
• P(a1), P(a2), … , P(am)
• C(ai) = P(a1) + P(a2) + … + P(ai-1)
• Decode b1b2...bm, number of symbols is n.

    Initialize l := 0 and r := 1;
    t := .b1b2...bm000...
    for i = 1 to n do
        w := r - l;
        find j such that l + w * C(aj) ≤ t < l + w * (C(aj)+P(aj))
        output aj;
        l := l + w * C(aj);
        r := l + w * P(aj);

Decoding Example
• P(a) = 1/4, P(b) = 1/2, P(c) = 1/4
• C(a) = 0, C(b) = 1/4, C(c) = 3/4
• 00101

    tag = .00101000... = 5/32

    w      l      r        output
           0      1
    1      0      1/4      a
    1/4    1/16   3/16     b
    1/8    5/32   6/32     c
    1/32   5/32   21/128   a

Practical Arithmetic Coding
• Scaling:
  – By scaling we can keep l and r in a reasonable range of values so that w = r - l does not underflow.
  – The code can be produced progressively, not at the end.
  – Complicates decoding some.
• Integer arithmetic coding avoids floating point altogether.

Notes on Arithmetic Coding
• Arithmetic codes come close to the entropy lower bound.
• Grouping symbols is effective for arithmetic coding.
• Arithmetic codes can be used effectively on small symbol sets. Advantage over Huffman.
• Context can be added so that more than one probability distribution can be used.
  – The best coders in the world use this method.
• There are very effective adaptive arithmetic coding methods.

Dictionary Coding
• Does not use statistical knowledge of data.
• Encoder: As the input is processed, develop a dictionary and transmit the index of strings found in the dictionary.
• Decoder: As the code is processed, reconstruct the dictionary to invert the process of encoding.
• Examples: LZW, LZ77, Unix Compress, gzip, GIF

LZW Encoding Algorithm

    Repeat
        find the longest match w in the dictionary
        output the index of w
        put wa in the dictionary where a was the unmatched symbol

LZW Encoding Example (1) to (6)
• Input: a b a b a b a b a
• The dictionary starts with 0 = a and 1 = b.

    step   output so far   dictionary
    (1)                    0 a, 1 b
    (2)    0               0 a, 1 b, 2 ab
    (3)    0 1             0 a, 1 b, 2 ab, 3 ba
    (4)    0 1 2           0 a, 1 b, 2 ab, 3 ba, 4 aba
    (5)    0 1 2 4         0 a, 1 b, 2 ab, 3 ba, 4 aba, 5 abab
    (6)    0 1 2 4 3       0 a, 1 b, 2 ab, 3 ba, 4 aba, 5 abab

• In step (6) the remaining input ba matches entry 3, and the input is exhausted, so no new entry is added.

LZW Decoding Algorithm
• Emulate the encoder in building the dictionary.
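
Returning to the binary Golomb code slides, the construction can be sketched in Python as follows. This is a minimal sketch, not the lecture's own code: the function names `golomb_order`, `golomb_table` and `golomb_encode` are hypothetical, and the rule used to assign suffix bits for orders other than 4 and 6 is an assumption that generalises the two tables shown on the slides.

```python
def golomb_order(p):
    """Smallest k with p**(k+1) <= 1 - p**k; for that k, 1 - p**k < p**(k-1) too."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    k = 1
    while p ** (k + 1) > 1 - p ** k:
        k += 1
    return k


def _bits(value, width):
    """Binary representation of value padded to width bits ('' when width is 0)."""
    return format(value, "b").zfill(width) if width > 0 else ""


def golomb_table(k):
    """Map input words 0^k and 0^i 1 (i < k) to code words.

    Assumption: the longest runs get b-bit suffixes and the shortest runs
    get (b-1)-bit suffixes, which reproduces the order 4 and order 6 tables.
    """
    table = {"0" * k: "0"}
    b = (k - 1).bit_length()          # ceil(log2(k))
    n_long = 2 * k - 2 ** b           # number of words with a b-bit suffix
    for j in range(k):
        word = "0" * (k - 1 - j) + "1"
        if j < n_long:
            suffix = _bits(j, b)
        else:
            suffix = _bits(j - n_long + n_long // 2, b - 1)
        table[word] = "1" + suffix
    return table


def golomb_encode(bits, k):
    """Encode a string of '0'/'1' characters with the order-k table."""
    table = golomb_table(k)
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in table:
            out.append(table[word])
            word = ""
    if word:
        raise ValueError("input ends in an incomplete run of 0's")
    return "".join(out)
```

With these definitions, `golomb_order(0.89)` returns 6, and `golomb_encode("000000001001000000010001", 4)` returns `"001111010100100"`, matching the order 4 example. Raising an error on a trailing incomplete run is a design choice of this sketch; the slides do not say how the end of the input is handled.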
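
The arithmetic coding algorithm and the two ways of choosing a code from the tag can be tried out with exact rational arithmetic. The sketch below is illustrative only: the names `encode_interval`, `short_code` and `guaranteed_code` are hypothetical, the symbol order of the `probs` dictionary is assumed to define C, and the short code is assumed to have at least one bit (as in the slide tables, where aaa gets the code 0). It uses `fractions.Fraction` instead of the scaling or integer techniques mentioned under practical arithmetic coding.

```python
import math
from fractions import Fraction


def encode_interval(message, probs):
    """Return the interval [l, r) for message; probs maps symbol -> P(symbol)."""
    # C(a_i) = P(a_1) + ... + P(a_{i-1}), following the dictionary's order.
    c, total = {}, Fraction(0)
    for sym, p in probs.items():
        c[sym] = total
        total += p
    l, r = Fraction(0), Fraction(1)
    for x in message:
        w = r - l
        l = l + w * c[x]
        r = l + w * probs[x]
    return l, r


def tag_bits(t, k):
    """First k bits of the binary expansion of t, for 0 <= t < 1."""
    return format(math.floor(t * 2 ** k), "b").zfill(k)


def short_code(l, r):
    """Fewest bits k >= 1 of the tag (l+r)/2 such that .t1...tk000... is in [l, r)."""
    t = (l + r) / 2
    k = 1
    # The truncated tag never exceeds t < r, so only the lower bound is checked.
    while Fraction(math.floor(t * 2 ** k), 2 ** k) < l:
        k += 1
    return tag_bits(t, k)


def guaranteed_code(l, r):
    """First ceil(-log2(r - l)) + 1 bits of the tag (l+r)/2."""
    m = 0
    while Fraction(1, 2 ** m) > r - l:   # smallest m with 2**-m <= r - l
        m += 1
    return tag_bits((l + r) / 2, m + 1)
```

For P(a) = 1/4, P(b) = 1/2, P(c) = 1/4 and the string abca, `encode_interval` gives (5/32, 21/128), `short_code` gives `"00101"` and `guaranteed_code` gives `"00101001"`, in agreement with the arithmetic coding example.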
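
The arithmetic decoding algorithm can be sketched the same way. As before this is an assumption-laden illustration rather than the lecture's implementation: the function name `arithmetic_decode` is hypothetical, the number of symbols n is passed in explicitly (the slides assume it is known), and exact fractions stand in for a practical scaled integer coder.

```python
from fractions import Fraction


def arithmetic_decode(code, n, probs):
    """Decode n symbols from the bit string code; probs maps symbol -> P(symbol)."""
    # t = .b1b2...bm000... as an exact fraction.
    t = Fraction(int(code, 2), 2 ** len(code)) if code else Fraction(0)
    l, r = Fraction(0), Fraction(1)
    out = []
    for _ in range(n):
        w = r - l
        c = Fraction(0)                      # running C(a_j)
        for sym, p in probs.items():
            lo = l + w * c
            hi = lo + w * p
            if lo <= t < hi:                 # tag falls in this symbol's subinterval
                out.append(sym)
                l, r = lo, hi
                break
            c += p
        else:
            raise ValueError("tag is outside every subinterval")
    return "".join(out)
```

With P(a) = 1/4, P(b) = 1/2, P(c) = 1/4, decoding `"00101"` with n = 4 yields `"abca"`, and with P(a) = 1/3, P(b) = 2/3, decoding `"0001"` with n = 3 yields `"aab"`, matching the two decoding examples. The lower comparison is non-strict because the tag 5/32 coincides with the left endpoint of the interval for abc.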
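
The LZW encoding loop and a decoder that emulates it can be sketched as below. The names `lzw_encode` and `lzw_decode` are hypothetical, the dictionary is assumed to start with the single symbols of a given alphabet, and every input symbol is assumed to belong to that alphabet. The decoder's handling of an index that is not yet in its dictionary is the usual LZW convention and is not stated on the slides recovered here.

```python
def lzw_encode(text, alphabet):
    """Return the list of dictionary indices produced for text."""
    dictionary = {sym: i for i, sym in enumerate(alphabet)}
    out, w = [], ""
    for ch in text:
        if w + ch in dictionary:
            w += ch                                  # keep extending the match
        else:
            out.append(dictionary[w])                # output index of longest match w
            dictionary[w + ch] = len(dictionary)     # add w plus the unmatched symbol
            w = ch
    if w:
        out.append(dictionary[w])                    # flush the final match
    return out


def lzw_decode(codes, alphabet):
    """Rebuild the text by emulating the encoder's dictionary construction."""
    dictionary = list(alphabet)
    if not codes:
        return ""
    prev = dictionary[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code < len(dictionary):
            entry = dictionary[code]
        elif code == len(dictionary):
            # Index of the entry the encoder had just created: it must be
            # the previous string followed by its own first symbol.
            entry = prev + prev[0]
        else:
            raise ValueError("invalid LZW index")
        out.append(entry)
        dictionary.append(prev + entry[0])
        prev = entry
    return "".join(out)
```

For the input a b a b a b a b a with alphabet "ab", `lzw_encode("ababababa", "ab")` returns `[0, 1, 2, 4, 3]`, and `lzw_decode([0, 1, 2, 4, 3], "ab")` returns the original string. Index 4 arrives at the decoder before entry 4 exists, which is the case the `elif` branch covers.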