Entropy Coding

Entropy coding is also known as "zero-error coding", "data compression", or "lossless compression". Entropy coding is widely used in virtually all popular international multimedia compression standards such as JPEG and MPEG.

A complete entropy codec, which is an encoder/decoder pair, consists of the process of "encoding" or "compressing" a random source (typically quantized transform coefficients) and the process of "decoding" or "decompressing" the compressed signal to "perfectly" regenerate the original random source. In other words,
there is no loss of information due to the process of entropy coding. Thus, entropy coding does not introduce any distortion, and hence, the combination of the entropy encoder and entropy decoder faithfully reconstructs the input to the entropy encoder.

    Random Source     --> [Entropy Encoding] --> Compressed Source
    Compressed Source --> [Entropy Decoding] --> Random Source
Therefore, any possible loss of information or distortion that may be introduced in a signal compression system is not due to entropy encoding/decoding. As we discussed previously, a typical image compression system, for example, includes a transform process, a quantization process, and an entropy coding stage. In such a system, the distortion is introduced by quantization. Moreover, for such a system, and from the perspective of the entropy encoder, the input "random source" to that encoder is the quantized transform coefficients.

    Random Source --> [Transform] --> [Quantization] --> [Entropy Coding] --> Compressed Source
                      (e.g., KLT,                        (e.g., Huffman,
                       DCT, wavelets)                     arithmetic)
Code Design and Notations

In general, entropy coding (or "source coding") is achieved by designing a code, $C$, which provides a one-to-one mapping from any possible outcome of a random variable $X$ (the "source") to a codeword.

There are two alphabets in this case: one alphabet is the traditional alphabet $\mathcal{X}$ of the random source $X$, and the second alphabet, $B$, is the one that is used for constructing codewords. Based on the second alphabet $B$, we can construct and define the set $D^*$, which is the set of all finite-length strings of symbols drawn from the alphabet $B$.
The most common and popular codes are binary codes, where the alphabet of the codewords is simply the binary bits "one" and "zero".

[Figure: example mapping from the alphabet $\mathcal{A}$ of a random source $X$ to a set of codewords in $D^*$, e.g. $A \mapsto 00$, $a \mapsto 01$, $b \mapsto 100$, $c \mapsto 101$, $. \mapsto 1110$. The alphabet of code symbols used to construct the codewords is $B = \{b_1, b_2\} = \{0, 1\}$; in this example, $|B| = D = 2$.]
Binary codes can be represented efficiently using binary trees. In this case, the first two branches of the root node represent the possible bit assigned to the first bit of a codeword. Once that first bit is known, and if the codeword has a second bit, then the second pair of branches represents the second bit, and so on.

[Figure: binary tree representation of a binary ($D$-ary; $D = 2$) prefix code with the set of codewords $\{0, 10, 110, 111\}$, built from the code-symbol alphabet $B = \{0, 1\}$; the depth-3 leaves are 000, 001, ..., 111.]
Definition

A source code, $C$, is a mapping from a random variable (source) $X$ with alphabet $\mathcal{X}$ to a finite-length string of symbols, where each string of symbols (codeword) is a member of the set $D^*$:

    $C: \mathcal{X} \to D^*$

The codewords in $D^*$ are formed from an alphabet $B$ that has $D$ elements: $|B| = D$. We say that we have a $D$-ary code, or that $B$ is a $D$-ary alphabet. As discussed previously, the most common case is when the alphabet $B$ is the set $B = \{0, 1\}$; therefore, in this case, $D = 2$ and we have binary codewords.
Example

Let $X$ be a random source with $x \in \{1, 2, 3, 4\}$. Let $B = \{0, 1\}$, and hence $|B| = D = 2$. Then:

    $D^* = \{0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, \ldots, 111, \ldots\}$

We can define the code $C$ as follows:

    Codeword                Length
    $C(1) = C(x=1) = 0$     $L_1 = 1$
    $C(2) = C(x=2) = 10$    $L_2 = 2$
    $C(3) = C(x=3) = 110$   $L_3 = 3$
    $C(4) = C(x=4) = 111$   $L_4 = 3$
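This example code is small enough to exercise directly. A minimal Python sketch (the names `code` and `encode` are illustrative, not from the notes) represents $C$ as a lookup table and encodes a source sequence by concatenating codewords:

    # The example code C: outcome -> binary codeword in D*
    code = {1: "0", 2: "10", 3: "110", 4: "111"}

    def encode(sequence, code):
        """Concatenate the codeword of each outcome in the sequence."""
        return "".join(code[x] for x in sequence)

    print(encode([1, 2, 3, 4], code))  # -> "010110111"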
Definition

For a random variable $X$ with a p.m.f. $(p_1, p_2, \ldots, p_m)$, the expected length of a code $C(X)$ is:

    $L(C) = \sum_{i=1}^{m} p_i L_i$

Code Types

The design of a good code follows the basic notion of entropy: for random outcomes with a high probability, a good code assigns "short" codewords, and vice versa. The overall objective is to have the average length $L = L(C)$ be as small as possible.
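As a concrete check, the expected length of the example code above follows directly from the formula $L(C) = \sum_i p_i L_i$. A minimal sketch, assuming an illustrative p.m.f. for the four outcomes:

    # Assumed p.m.f. (for illustration only) and the codeword lengths of C
    p = [0.5, 0.25, 0.125, 0.125]
    L = [1, 2, 3, 3]

    expected_length = sum(pi * Li for pi, Li in zip(p, L))
    print(expected_length)  # 0.5*1 + 0.25*2 + 0.125*3 + 0.125*3 = 1.75 bits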
In addition, we have to design codes that are uniquely decodable. In other words, if the source generates a sequence $x_1, x_2, x_3, \ldots$ that is mapped into a sequence of codewords $C(x_1), C(x_2), C(x_3), \ldots$, then we should be able to recover the original source sequence $x_1, x_2, x_3, \ldots$ from the codeword sequence $C(x_1), C(x_2), C(x_3), \ldots$.

In general, and as a start, we are interested in codes that map each random outcome $x_i$ into a unique codeword that differs from the codeword of any other outcome. For a random source with alphabet $\{1, 2, \ldots, m\}$, a non-singular code meets the following constraint:

    $C(x_i) \neq C(x_j) \quad \forall i \neq j$
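In code, non-singularity is just distinctness of the codewords; a one-function sketch (the function name is hypothetical):

    def is_non_singular(code):
        """True if all codewords in the mapping are distinct."""
        codewords = list(code.values())
        return len(set(codewords)) == len(codewords)

    print(is_non_singular({1: "0", 2: "10", 3: "110", 4: "111"}))  # True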
Although a non-singular code is uniquely decodable for a single symbol, it does not guarantee unique decodability for a sequence of outcomes of $X$.

Example:

    Code C1             Code C2
    $C(x=1) = 1$        $C(x=1) = 10$
    $C(x=2) = 10$       $C(x=2) = 00$
    $C(x=3) = 101$      $C(x=3) = 11$
    $C(x=4) = 111$      $C(x=4) = 110$
In the above example, the code C1 is non-singular; however, it is not uniquely decodable. Meanwhile, the code C2 is both non-singular and uniquely decodable. Therefore, not all non-singular codes are uniquely decodable; however, every uniquely decodable code is non-singular.

It is important to note that a uniquely decodable code may require the decoding of multiple codewords to uniquely identify the original source sequence. This is the case for the above code C2. (Can you give an example when the C2 decoder needs to wait for more codewords before being able to uniquely decode a sequence?)
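Both behaviors are easy to exhibit by brute force. The sketch below (the helper name `parses` is illustrative) enumerates every way a bit string splits into codewords; C1 yields two parses for the string 101, while C2 always yields at most one, though its decoder may have to read ahead before committing:

    def parses(bits, code, prefix=()):
        """Enumerate every split of `bits` into codewords of `code`."""
        if not bits:
            yield prefix
        for symbol, word in code.items():
            if bits.startswith(word):
                yield from parses(bits[len(word):], code, prefix + (symbol,))

    C1 = {1: "1", 2: "10", 3: "101", 4: "111"}
    C2 = {1: "10", 2: "00", 3: "11", 4: "110"}

    print(list(parses("101", C1)))     # [(2, 1), (3,)] -- ambiguous: not UD
    print(list(parses("110000", C2)))  # [(3, 2, 2)] -- unique, but the decoder
                                       # must count zeros before deciding 11 vs 110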
Therefore, it is highly desirable to design a uniquely decodable code that can be decoded instantaneously when receiving each codeword. Codes of this type are known as instantaneous, prefix-free, or simply prefix codes. In a prefix code, a codeword cannot be used as a prefix of any other codeword.

Example:

In the following example, no codeword is used as a prefix of any other codeword:

    $C(1) = 0$, $C(2) = 10$, $C(3) = 110$, $C(4) = 111$
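The prefix property is a simple pairwise test, as in this sketch (the function name is hypothetical):

    def is_prefix_code(code):
        """True if no codeword is a prefix of any other codeword."""
        words = list(code.values())
        return not any(a != b and b.startswith(a) for a in words for b in words)

    print(is_prefix_code({1: "0", 2: "10", 3: "110", 4: "111"}))  # True
    print(is_prefix_code({1: "1", 2: "10", 3: "101", 4: "111"}))  # False (code C1)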
It should be rather intuitive that every prefix code is uniquely decodable, but the inverse is not always true.

In summary, the three major types of codes (non-singular, uniquely decodable, and prefix codes) are related as shown in the following diagram:

    All possible codes ⊃ Non-singular codes ⊃ Uniquely decodable codes ⊃ Prefix (instantaneous) codes
Kraft Inequality

Based on the above discussion, it should be clear that uniquely decodable codes represent a subset of all possible codes. Also, prefix codes are a subset of uniquely decodable codes. Prefix codes meet a certain constraint, which is known as the Kraft inequality.

Theorem

For any prefix $D$-ary code $C$ with codeword lengths $L_1, L_2, \ldots, L_m$, the following must be satisfied:

    $\sum_{i=1}^{m} D^{-L_i} \leq 1$
Conversely, given a set of codeword lengths that meet the inequality $\sum_{i=1}^{m} D^{-L_i} \leq 1$, there exists a prefix code for this set of lengths.

Proof

A prefix code $C$ can be represented by a $D$-ary tree. Below we illustrate the proof using a binary code and a corresponding binary tree. (The same principles apply to higher-order codes/trees.) For illustration purposes, let us consider the code:

    $C(1) = 0$, $C(2) = 10$, $C(3) = 110$, $C(4) = 111$

This code can be represented as follows.
[Figure: binary tree representation of the binary ($D$-ary; $D = 2$) prefix code $\{0, 10, 110, 111\}$, with its depth-3 leaves 000, 001, ..., 111.]

An important attribute of the above tree representation of codes is the number of leaf nodes that are associated with each codeword. For example, for the first codeword, $C(1) = 0$, there are four leaf nodes associated with it. Similarly, the codeword $C(2) = 10$ has two leaf nodes.
[Figure: the same binary tree, highlighting the leaf nodes of the codeword 0 (group $\ell_1$) and the leaf nodes of the codeword 10 (group $\ell_2$).]

The last two codewords are leaf nodes themselves, and hence each of these is associated with a single leaf node (itself).
[Figure: the same binary tree, highlighting all four groups of leaf nodes: $\ell_1$ for codeword 0, $\ell_2$ for codeword 10, $\ell_3$ for codeword 110, and $\ell_4$ for codeword 111.]

Note that for a prefix code, a codeword cannot be an ancestor of any other codeword. Let $L_{\max}$ be the maximum length among all codeword lengths of a prefix code. Each codeword with length $L_i \leq L_{\max}$ is at depth $L_i$ of the $D$-ary tree. Hence, the total number
of leaf nodes that are associated with (descendants of) a codeword at level $L_i$ is $D^{L_{\max} - L_i}$. Furthermore, since each group $\ell_i$ of leaf nodes of a codeword with length $L_i$ is disjoint from any other group of leaf nodes $\ell_j$, then:

    $\sum_{i=1}^{m} D^{L_{\max} - L_i} \leq D^{L_{\max}}$, which implies $\sum_{i=1}^{m} D^{-L_i} \leq 1$.

By similar arguments, one can construct a prefix code for any set of lengths that satisfies the above constraint: $\sum_{i=1}^{m} D^{-L_i} \leq 1$. QED
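Both directions of the theorem can be made concrete in a few lines. The sketch below (function names are illustrative) checks the Kraft sum and, for lengths that satisfy it, greedily builds a binary prefix code by always taking the lexicographically smallest string that no chosen codeword prefixes:

    from itertools import product

    def kraft_sum(lengths, D=2):
        """Left-hand side of the Kraft inequality."""
        return sum(D ** -L for L in lengths)

    def prefix_code_from_lengths(lengths):
        """Greedily construct a binary prefix code for Kraft-feasible lengths."""
        assert kraft_sum(lengths) <= 1, "lengths violate the Kraft inequality"
        chosen = []
        for L in sorted(lengths):
            for bits in product("01", repeat=L):
                word = "".join(bits)
                if not any(word.startswith(c) for c in chosen):
                    chosen.append(word)
                    break
        return chosen

    print(kraft_sum([1, 2, 3, 3]))                 # 1.0
    print(prefix_code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']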
Optimum Codes

Here we address the issue of finding minimum-length codes given the constraint imposed by the Kraft inequality. In particular, we are interested in finding codes that satisfy:

    $\min_{L_1, L_2, \ldots, L_m} L(C) = \min_{L_1, L_2, \ldots, L_m} \sum_{i=1}^{m} p_i L_i \quad \text{such that} \quad \sum_{i=1}^{m} D^{-L_i} \leq 1$

If we assume that equality is satisfied, $\sum_{i=1}^{m} D^{-L_i} = 1$, we can formulate the problem using Lagrange multipliers.
Consequently, we can minimize the following objective function:

    $J = \sum_{i=1}^{m} p_i L_i + \lambda \sum_{i=1}^{m} D^{-L_i}$

    $\frac{\partial J}{\partial L_i} = p_i - \lambda D^{-L_i} \ln D = 0 \quad \Rightarrow \quad D^{-L_i} = \frac{p_i}{\lambda \ln D}$

Using the constraint $\sum_{i=1}^{m} D^{-L_i^*} = 1$ gives $\lambda = 1/\ln D$, and therefore:

    $D^{-L_i^*} = p_i \quad \Leftrightarrow \quad L_i^* = -\log_D p_i$
Therefore, the average length $L(C^*)$ of an optimum code can be expressed as:

    $L^* = \sum_{i=1}^{m} p_i L_i^* = -\sum_{i=1}^{m} p_i \log_D p_i \quad \Rightarrow \quad L^* = H_D(X)$

where $H_D(X)$ is the entropy of the original source $X$ (measured with a logarithmic base $D$). For a binary code, $D = 2$, and the average length is the same as the standard (base-2) entropy measured in bits.

Based on the above derivation, achieving an optimum prefix code $C^*$ with an entropy length $L^* = H_D(X)$ is only possible when:
    $D^{-L_i^*} = p_i \quad \Leftrightarrow \quad L_i^* = -\log_D p_i$

However, in general, the probability distribution values ($p_i$) do not necessarily guarantee integer-valued lengths for the codewords.

Below, we state one of the most fundamental theorems in information theory, which relates the average length of any prefix code to the entropy of a random source with general distribution values ($p_i$). This theorem, commonly known as the entropy bound theorem, shows that no code can have an average length smaller than the entropy of the random source.
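A small numeric sketch of this point, using an assumed (non-dyadic) distribution: the ideal lengths $-\log_2 p_i$ come out non-integer, and rounding them up keeps the Kraft inequality satisfied but pushes the average length above the entropy:

    from math import ceil, log2

    p = [0.4, 0.3, 0.2, 0.1]          # assumed example p.m.f. (not dyadic)
    ideal = [-log2(pi) for pi in p]   # optimal real-valued lengths -log2(p_i)
    H = sum(pi * li for pi, li in zip(p, ideal))

    L = [ceil(li) for li in ideal]    # feasible integer lengths (round up)
    avg = sum(pi * Li for pi, Li in zip(p, L))

    print(ideal)   # [1.32..., 1.73..., 2.32..., 3.32...] -- not integers
    print(H, avg)  # entropy ~1.846 bits < average length 2.4 bits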
Theorem (Entropy Bound)

The expected length $L$ of a prefix $D$-ary code $C$ for a random source $X$ with an entropy $H_D(X)$ satisfies the following inequality:

    $L \geq H_D(X)$

with equality if-and-only-if $D^{-L_i} = p_i$.

Observations from the Entropy Bound Theorem

The entropy bound theorem and its proof lead to important observations that we outline below:

For random sources with distributions that satisfy $p_i = D^{-L_i}$, where $L_i$ is an integer for $i = 1, 2, \ldots, m$, there exists a prefix code that achieves the entropy
$H_D(X)$. Such distributions are known as $D$-adic. For the binary case, $D = 2$, we have a dyadic distribution (and a dyadic code).

An example of a dyadic distribution is:

    $p_1 = \frac{1}{2}$, $p_2 = \frac{1}{4}$, $p_3 = \frac{1}{8}$, $p_4 = \frac{1}{8}$; and $L_1 = 1$, $L_2 = 2$, $L_3 = 3$, $L_4 = 3$.

Entropy Coding Methods

Here, we will discuss leading examples of entropy coding methods that are broadly used in practice, and which have been adopted by leading international compression standards. In particular, we will discuss Huffman coding and arithmetic coding, both of which lead to optimal entropy coding.
Key Properties of Optimum Prefix Codes

Here, we outline a few key properties of optimum prefix codes that will lead to the Huffman coding procedure. We adopt the notation $C_i$ to represent the codeword with length $L_i$ of a code $C$.

Property 1

If $C_j$ and $C_k$ are two codewords of an optimum prefix code $C$, then:

    $p_j > p_k \;\Rightarrow\; L_j \leq L_k$
Property 2

Assuming $p_1 \geq p_2 \geq \cdots \geq p_{m-1} \geq p_m$, the largest codewords of an optimum code have the same length:

    $L_{m-1} = L_m$

[Figure: binary code tree illustrating codeword lengths $L_{m-1} = 2$ and $L_m = 3$.]
[Figures: if $L_{m-1} < L_m$, the longest codeword has an unused sibling branch in the tree, so it could be replaced by a shorter, unused codeword; this contradicts optimality and forces $L_{m-1} = L_m$.]
Property 3

There exists an optimum code $C$ where the largest codewords are siblings (i.e., they differ in one bit).

Property 4

For a binary random source, the optimum prefix code is of length:

    $L_1 = L_2 = 1$
The Huffman Entropy Coding Procedure

The above properties lead to the Huffman entropy coding procedure for generating prefix codes. A core notion in this procedure is the observation that optimizing a given code $C$ is equivalent to optimizing a shortened version $C'$.

The Huffman coding procedure can be summarized by the following steps:

1. Sort the outcomes according to the probability distribution: $p_1 \geq p_2 \geq \cdots \geq p_{m-1} \geq p_m$.

2. Merge the two least probable outcomes, and assign a "zero" to one outcome and a "one" to the
other outcome (treat them as a binary source, and use an "optimum" binary code).

3. Repeat step 2 until we reach a binary source which, when merged, results in a probability of 1.

We now illustrate the Huffman procedure using a few examples.

Example

Find an optimum set of codewords for:

    $p_1 = \frac{1}{3}$, $p_2 = \frac{1}{3}$, $p_3 = \frac{1}{4}$, $p_4 = \frac{1}{12}$

    $C_1 = ?$, $C_2 = ?$, $C_3 = ?$, $C_4 = ?$

The optimum codewords must meet the following: $L_1 \leq L_2 \leq L_3 \leq L_4$; $L_3 = L_4$; and $C_3$ and $C_4$ are siblings.
Combining the least probable outcomes: merging $p_3 = \frac{1}{4}$ and $p_4 = \frac{1}{12}$, assigning a "0" to one branch and a "1" to the other, produces a combined probability of $\frac{1}{4} + \frac{1}{12} = \frac{1}{3}$.

Next, use the least probable outcomes of the shortened code: merging $p_2 = \frac{1}{3}$ with the combined probability $\frac{1}{3}$ produces $\frac{2}{3}$.
Finally, merging $p_1 = \frac{1}{3}$ with $\frac{2}{3}$ produces a probability of one; there is nothing else to merge. Reading the assigned bits back from the root to each outcome gives the codewords:

    $C_1 = 0$, $C_2 = 10$, $C_3 = 110$, $C_4 = 111$

What is the average length $L$?
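The procedure above is only a few lines with a heap. A minimal sketch (the function and variable names are illustrative, not from the notes) reproduces one valid optimum for this example and answers the question about $L$:

    import heapq
    from fractions import Fraction as F
    from itertools import count

    def huffman(pmf):
        """Return a binary Huffman code {symbol: codeword} for a p.m.f. dict."""
        tie = count()  # unique tie-breaker so the heap never compares dicts
        heap = [(p, next(tie), {s: ""}) for s, p in pmf.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p0, _, c0 = heapq.heappop(heap)  # least probable subtree
            p1, _, c1 = heapq.heappop(heap)  # second least probable subtree
            merged = {s: "0" + w for s, w in c0.items()}
            merged.update({s: "1" + w for s, w in c1.items()})
            heapq.heappush(heap, (p0 + p1, next(tie), merged))
        return heap[0][2]

    pmf = {1: F(1, 3), 2: F(1, 3), 3: F(1, 4), 4: F(1, 12)}
    code = huffman(pmf)
    print(code)  # {4: '00', 3: '01', 1: '10', 2: '11'} -- a different, equally
                 # valid optimum than the one above (ties were merged differently)
    print(sum(pmf[s] * len(w) for s, w in code.items()))  # average length L = 2

Either tie-breaking choice yields the same average length, a point taken up next.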
In some cases, we may encounter more than one choice for merging the probability distribution values. (This was the case in the above example.) One important question is: what is the impact of selecting one choice for combining the probabilities versus the other? We illustrate this below by selecting an alternative option for combining the probabilities.

Merging $p_1 = \frac{1}{3}$ and $p_2 = \frac{1}{3}$ at the second step (instead of $p_2$ and the combined $\frac{1}{3}$) also reaches probability 1.0, but yields the codewords:

    $C_1 = 00$, $C_2 = 01$, $C_3 = 10$, $C_4 = 11$
As can be seen in the above example, the Huffman procedure can lead to different prefix codes (if multiple options for merging are encountered). Hence, an important question is: does one option provide a better code (in terms of providing a smaller average code length $L$)? (In this example, both choices yield the same average length, $L = 2$.)

The Huffman procedure can also be used for the case when $D > 2$ (i.e., when the code is not binary anymore). Care should be taken, though, when dealing with a non-binary code design.
Arithmetic Coding

Although Huffman codes are optimal on a symbol-by-symbol basis, there is still room for improvement in terms of achieving lower "overhead". For example, a binary source with entropy $H(X) < 1$ still requires one bit per symbol when using a Huffman code. Hence, if, for example, $H(X) = 0.5$, then a Huffman code spends double the amount of bits per symbol (relative to the true optimum limit of $H(X) = 0.5$).

Arithmetic coding is an approach that addresses the overhead issue by coding a continuous sequence of source symbols while trying to approach the entropy limit $H(X)$. Arithmetic coding has roots in a coding approach proposed by Shannon, Fano, and Elias, and
hence is sometimes called Shannon-Fano-Elias (SFE) coding. Therefore, we first outline the principles and procedures of SFE codes, and then describe arithmetic coding.

Shannon-Fano-Elias Coding

The SFE coding procedure is based on using the cumulative distribution function (CDF) $F(x)$ of a random source $X$:

    $F(x) = \Pr(X \leq x)$

The CDF provides a unique one-to-one mapping for the possible outcomes of any random source $X$.
In other words, if we denote the alphabet of a discrete random source $X$ by the integer index set $\{1, 2, \ldots, m\}$, then it is well known that:

    $F(i) \neq F(j), \quad \forall i \neq j$

This can be illustrated by the following example of a typical CDF of a discrete random source.

[Figure: staircase CDF $F(x)$ of a discrete random source with outcomes $x = 1, 2, 3, 4$ and values $F(1), F(2), F(3), F(4)$.]
One important characteristic of the CDF of a discrete random source is that the CDF defines a set of non-overlapping intervals in its range of possible values between "zero" and "one". (Recall that the CDF provides a measure of probability, and hence it is always confined between "zero" and "one".)

Based on the above CDF example, we can have a well-defined set of non-overlapping intervals, as shown in the next figure.
[Figure: the same staircase CDF, with the non-overlapping intervals $[0, F(1))$, $[F(1), F(2))$, $[F(2), F(3))$, and $[F(3), F(4))$ marked on the vertical axis.]

Another important observation is that the size of each (non-overlapping) interval in the range of the CDF $F(x)$ is defined by the probability-mass-function (PMF) value $p(i) = \Pr(X = i)$ of the particular outcome $X = i$. This is the same as the size of the "jumps" that we can observe in the staircase-like shape of the CDF of a discrete random source. This is highlighted by the next figure.
[Figure: the same staircase CDF, with each jump labeled by its PMF value: $p_1, p_2, p_3, p_4$.]

Overall, by using the CDF of a random source, one can define a unique mapping between any possible outcome and a particular (unique) interval in the range between "zero" and "one". Furthermore, one can select any value within each (unique) interval of a corresponding random outcome ($i$) to represent that
outcome. This selected value serves as a "codeword" for that outcome ($i$).

The SFE procedure, which is based on the above CDF-driven principles of unique mapping, can be defined as follows:

1. Map each outcome $X = i$ to the interval $[F(i-1), F(i))$, where the lower endpoint $F(i-1)$ is inclusive and the upper endpoint $F(i)$ is exclusive.
2. Select a particular value within the interval $[F(i-1), F(i))$ to represent the outcome $X = i$. This value is known as the "modified CDF" and is denoted by $\bar{F}(i)$.

In principle, any value within the interval $[F(i-1), F(i))$ can be used for the modified CDF $\bar{F}(i)$. A natural choice is the middle of the corresponding interval $[F(i-1), F(i))$. Hence, the modified CDF can be expressed as follows:
    $\bar{F}(i) = F(i-1) + \frac{p_i}{2}$

which, in turn, can be expressed as:

    $\bar{F}(i) = \frac{F(i-1) + F(i)}{2}$

This is illustrated by the next figure.

[Figure: the staircase CDF with the modified CDF values $\bar{F}(1), \bar{F}(2), \bar{F}(3), \bar{F}(4)$ marked at the midpoint of each jump.]
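In code, the modified CDF is one cumulative pass over the p.m.f. A minimal sketch (the helper name is illustrative):

    from itertools import accumulate

    def modified_cdf(p):
        """Midpoint of each outcome's CDF interval: F(i-1) + p_i / 2."""
        F = list(accumulate(p))                     # F(1), ..., F(m)
        return [F[i] - p[i] / 2 for i in range(len(p))]

    print(modified_cdf([0.5, 0.25, 0.125, 0.125]))  # [0.25, 0.625, 0.8125, 0.9375]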
So far, it should be clear that $\bar{F}(i) \in [0, 1)$, and that it provides a unique mapping for the possible random outcomes of $X$.

3. Generate a codeword to represent $\bar{F}(i)$, and hence to represent the outcome $X = i$. Below we consider simple examples of such codewords according to the SFE coding procedure.

Examples of Modified CDF Values and Codewords

The following table outlines a "dyadic" set of examples of values that could be used for a modified CDF $\bar{F}(i)$ and the corresponding codewords for such values.
    $\bar{F}(i)$       Binary Representation   Codeword
    $1/2 = 2^{-1}$     0.1                     1
    $1/4 = 2^{-2}$     0.01                    01
    $1/8 = 2^{-3}$     0.001                   001

The above values of the modified CDF can be combined to represent higher-precision values, as shown in the next table.

    $\bar{F}(i)$                Binary Representation   Codeword
    $0.75 = 2^{-1} + 2^{-2}$    0.11                    11
    $0.625 = 2^{-1} + 2^{-3}$   0.101                   101
In general, the number of bits needed to code the modified CDF value $\bar{F}(i)$ could be infinite, since $\bar{F}(i)$ could be any real number. In practice, however, a finite number of bits $L_i$ is used to represent ("approximate") $\bar{F}(i)$. It should be clear that the number of bits $L_i$ used must be sufficiently large to make sure that the codeword representing $\bar{F}(i)$ is unique (i.e., there should not be overlap in the intervals representing the random outcomes). By using a truncated value for the original value $\bar{F}(i)$, we anticipate a loss in precision.
Let $\lfloor \bar{F}(i) \rfloor_{L_i}$ be the truncated value used to represent the original modified CDF $\bar{F}(i)$ based on $L_i$ bits. Naturally, the larger the number of bits used, the higher the precision, and the smaller the difference between $\bar{F}(i)$ and $\lfloor \bar{F}(i) \rfloor_{L_i}$.

It can be shown that the difference between the original modified CDF value $\bar{F}(i)$ and its approximation $\lfloor \bar{F}(i) \rfloor_{L_i}$ satisfies the following inequality:

    $\bar{F}(i) - \lfloor \bar{F}(i) \rfloor_{L_i} < \frac{1}{2^{L_i}}$
Consequently, and based on the definition of the modified CDF value, $\bar{F}(i) = F(i-1) + \frac{p(i)}{2}$, in order to maintain a unique mapping, the maximum error $2^{-L_i}$ has to be smaller than $p(i)/2$:

    $\frac{1}{2^{L_i}} < \frac{p(i)}{2}$

This leads to the following constraint on the length $L_i$:

    $\log_2\!\left(\frac{1}{2^{L_i}}\right) < \log_2\!\left(\frac{p_i}{2}\right) \;\Rightarrow\; -L_i < \log_2 p_i - 1 \;\Rightarrow\; L_i > \log_2\!\left(\frac{1}{p_i}\right) + 1$
Therefore:

    $L_i = \left\lceil \log_2\!\left(\frac{1}{p_i}\right) \right\rceil + 1$

Example

The following table shows an example of a random source $X$ with four possible outcomes and the corresponding PMF, CDF, and modified CDF values and codewords used based on SFE coding.
    $X=i$   $p_i$    $F(i)$   $\bar{F}(i)$   $\bar{F}(i)$ (Binary)   SFE Code   $L_i$
    1       0.5      0.5      0.25           0.01                    01         2
    2       0.25     0.75     0.625          0.101                   101        3
    3       0.125    0.875    0.8125         0.1101                  1101       4
    4       0.125    1.0      0.9375         0.1111                  1111       4

Arithmetic Coding

The advantages of the SFE coding procedure can be realized when it is used to code multiple outcomes of the random source under consideration. Arithmetic coding is basically SFE coding applied to multiple outcomes of the random source.
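Since arithmetic coding builds directly on SFE coding, it is worth making the SFE procedure fully concrete first. This sketch (names are illustrative) computes $\bar{F}(i)$, sets $L_i = \lceil \log_2(1/p_i) \rceil + 1$, and keeps the first $L_i$ fractional bits of $\bar{F}(i)$; it reproduces the table above:

    from itertools import accumulate
    from math import ceil, log2

    def sfe_code(p):
        """Shannon-Fano-Elias rows (i, p_i, F(i), Fbar(i), codeword, L_i)."""
        F = list(accumulate(p))
        rows = []
        for i, pi in enumerate(p):
            Fbar = F[i] - pi / 2               # modified CDF (interval midpoint)
            Li = ceil(log2(1 / pi)) + 1        # bits needed for a unique codeword
            bits, frac = "", Fbar
            for _ in range(Li):                # truncate to Li binary digits
                frac *= 2
                bits += str(int(frac))
                frac -= int(frac)
            rows.append((i + 1, pi, F[i], Fbar, bits, Li))
        return rows

    for row in sfe_code([0.5, 0.25, 0.125, 0.125]):
        print(row)  # (1, 0.5, 0.5, 0.25, '01', 2), ... matching the table above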
Under arithmetic coding (AC), we code a sequence of $n$ outcomes $\mathbf{x} = (i_1, i_2, \ldots, i_n)$, where each outcome $i_j \in \{1, 2, \ldots, m\}$. Each possible vector $\mathbf{x}$ of the random source $X$ is mapped to a unique value:

    $\bar{F}(\mathbf{x}) = \bar{F}^{(n)} \in [0, 1)$

The best way to illustrate arithmetic coding is through a couple of examples, as shown below.
Example 1

Arithmetic coding begins with dividing the "zero" to "one" range based on the CDF of the random source. In this example, the source can take one of three possible outcomes: $x \in \{1, 2, 3\}$.

[Figure: the interval $[0, 1)$ divided at $F(1)$, $F(2)$, and $F(3) = 1$.]
If we assume that we are interested in coding $n = 2$ outcomes, the following figures show the particular interval, and the corresponding value $\bar{F}(\mathbf{x})$, on which arithmetic coding focuses to code the vector $\mathbf{x} = (i_1, i_2) = (3, 2)$.

[Figure: the sub-interval $[F(2), F(3))$ selected for $i_1 = 3$, subdivided again to select the sub-interval corresponding to $i_2 = 2$.]
[Figure: the value $\bar{F}(\mathbf{x})$ inside the final sub-interval for $\mathbf{x} = (3, 2)$; this single number is transmitted to represent the vector $\mathbf{x} = (3, 2)$.]
Similarly, the following figure shows the particular interval, and the corresponding value $\bar{F}(\mathbf{x})$, on which arithmetic coding focuses to code the vector $\mathbf{x} = (i_1, i_2) = (1, 3)$.

[Figure: the sub-interval $[0, F(1))$ selected for $i_1 = 1$, with its top sub-interval selected for $i_2 = 3$ and the transmitted value $\bar{F}(\mathbf{x})$ inside it.]
Based on the above examples, we can define:

    $\bar{F}^{(n)} = \frac{F_l^{(n)} + F_u^{(n)}}{2}$ and $\Delta^{(n)} = F_u^{(n)} - F_l^{(n)}$

where $F_u^{(n)}$ and $F_l^{(n)}$ are the upper and lower bounds of the unique interval $[F_l^{(n)}, F_u^{(n)})$ that $\bar{F}^{(n)}$ belongs to. Below, we use these expressions to illustrate the arithmetic coding procedure.

Example

The coding process starts with the initial step values:

    $F_l^{(0)} = 0$, $F_u^{(0)} = 1$, $\Delta^{(0)} = F_u^{(0)} - F_l^{(0)} = 1$
After the initial step, the interval $\Delta^{(n)} = F_u^{(n)} - F_l^{(n)}$ and the corresponding value $\bar{F}^{(n)} = \frac{F_l^{(n)} + F_u^{(n)}}{2}$ are updated according to the particular outcomes that the random source is generating. This is illustrated below.

[Figure: the initial interval with $F_l^{(0)} = 0$, $F_u^{(0)} = 1$, and $\Delta^{(0)} = F_u^{(0)} - F_l^{(0)}$, divided at $F(1)$, $F(2)$, $F(3)$.]
[Figure: coding the example sequence $\mathbf{x} = (i_1, i_2) = (2, 3)$. After the first outcome, $i_1 = 2$, the interval is narrowed to new bounds $F_u^{(1)}$ and $F_l^{(1)}$ with width $\Delta^{(1)} = F_u^{(1)} - F_l^{(1)}$.]
The bounds after the first outcome ($i_1 = 2$) are:

    $F_u^{(1)} = F_l^{(0)} + \Delta^{(0)} F(i_1) = 0 + 1 \cdot F(2) = F(2)$

    $F_l^{(1)} = F_l^{(0)} + \Delta^{(0)} F(i_1 - 1) = 0 + 1 \cdot F(1) = F(1)$
After the second outcome ($i_2 = 3$):

    $F_u^{(2)} = F_l^{(1)} + \Delta^{(1)} F(i_2) = F_l^{(1)} + \Delta^{(1)} F(3)$

    $F_l^{(2)} = F_l^{(1)} + \Delta^{(1)} F(i_2 - 1) = F_l^{(1)} + \Delta^{(1)} F(2)$

The arithmetic coding procedure can be summarized by the steps outlined below.
    $\bar{F}(\mathbf{x}) = \bar{F}^{(n)} = \frac{F_l^{(n)} + F_u^{(n)}}{2}$

    $F_u^{(n)} = F_l^{(n-1)} + \Delta^{(n-1)} F(i_n)$

    $F_l^{(n)} = F_l^{(n-1)} + \Delta^{(n-1)} F(i_n - 1)$

    $\Delta^{(n-1)} = F_u^{(n-1)} - F_l^{(n-1)}$

    $F_l^{(0)} = 0$, $F_u^{(0)} = 1$, $\Delta^{(0)} = 1$

Similar to SFE coding, after determining the value $\bar{F}^{(n)}$, we use $L^{(n)}$ bits to represent $\bar{F}^{(n)}$ according to the constraint:

    $L^{(n)} = \left\lceil \log_2\!\left(\frac{1}{p(\mathbf{x})}\right) \right\rceil + 1$
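These recursions translate almost line-for-line into code. A minimal floating-point sketch of the encoder (practical arithmetic coders instead use fixed-point arithmetic with renormalization; all names and the demo p.m.f. here are illustrative):

    from itertools import accumulate
    from math import ceil, log2

    def arithmetic_encode(sequence, p):
        """Return (F_bar, L) for a sequence of 1-based outcomes with p.m.f. p."""
        F = [0.0] + list(accumulate(p))  # F(0) = 0, F(1), ..., F(m)
        lo, hi = 0.0, 1.0                # F_l^(0), F_u^(0)
        prob = 1.0                       # p(x): product of the symbol probabilities
        for i in sequence:
            delta = hi - lo              # Delta^(n-1)
            hi = lo + delta * F[i]       # F_u^(n) = F_l^(n-1) + Delta^(n-1) F(i_n)
            lo = lo + delta * F[i - 1]   # F_l^(n) = F_l^(n-1) + Delta^(n-1) F(i_n - 1)
            prob *= p[i - 1]
        F_bar = (lo + hi) / 2            # midpoint of the final interval
        L = ceil(log2(1 / prob)) + 1     # bits to transmit, as in SFE coding
        return F_bar, L

    print(arithmetic_encode([3, 2], [0.2, 0.5, 0.3]))  # approximately (0.835, 4)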