Chapter 5: Differential Entropy and Gaussian Channels

Po-Ning Chen, Professor
Institute of Communications Engineering
National Chiao Tung University
Hsin Chu, Taiwan 30010, R.O.C.

Continuous sources

• Model: $\{X_t \in \mathcal{X},\, t \in \mathcal{I}\}$
  – Discrete sources
    ∗ Both the alphabet $\mathcal{X}$ and the index set $\mathcal{I}$ are discrete.
  – Continuous sources
    ∗ Discrete-time continuous sources: $\mathcal{X}$ is continuous; $\mathcal{I}$ is discrete.
    ∗ Waveform sources: both $\mathcal{X}$ and $\mathcal{I}$ are continuous.
• We have so far examined information measures and their operational characterization for discrete-time discrete-alphabet systems. In this chapter, we turn our focus to discrete-time continuous-alphabet (real-valued) sources.

Information content of continuous sources

• If a random variable takes on values in a continuum, the minimum number of bits per symbol needed to losslessly describe it must be infinite. This is illustrated in the following example and validated in Lemma 5.2.

Example 5.1
– Consider a real-valued random variable X that is uniformly distributed on the unit interval, i.e., with pdf given by
  $$f_X(x) = \begin{cases} 1 & \text{if } x \in [0,1); \\ 0 & \text{otherwise.} \end{cases}$$
– Given a positive integer m, we can discretize X by uniformly quantizing it into m levels, partitioning the support of X into equal-length segments of size $\Delta = 1/m$ ($\Delta$ is called the quantization step-size) such that
  $$q_m(X) = \frac{i}{m} \quad \text{if } \frac{i-1}{m} \le X < \frac{i}{m}, \qquad 1 \le i \le m.$$
– Then the entropy of the quantized random variable $q_m(X)$ is given by
  $$H(q_m(X)) = -\sum_{i=1}^{m} \frac{1}{m}\log_2\frac{1}{m} = \log_2 m \ \text{(in bits).}$$
– Since the entropy $H(q_m(X))$ of the quantized version of X is a lower bound to the entropy of X (as $q_m(X)$ is a function of X) and satisfies in the limit
  $$\lim_{m\to\infty} H(q_m(X)) = \lim_{m\to\infty} \log_2 m = \infty,$$
  we obtain that the entropy of X is infinite. □

• The above example indicates that compressing a continuous source without incurring any loss or distortion requires an infinite number of bits.
• Thus, when studying continuous sources, the entropy measure is of limited effectiveness and a new measure must be introduced.
• Such a new measure is obtained upon close examination of the entropy of a uniformly quantized real-valued random variable minus the quantization accuracy, as the accuracy increases without bound.

Lemma 5.2 Consider a real-valued random variable X with support $[a, b)$ and pdf $f_X$ such that
(i) $-f_X \log_2 f_X$ is (Riemann-)integrable, and
(ii) $-\int_a^b f_X(x)\log_2 f_X(x)\,dx$ is finite.
Then a uniform quantization of X with n-bit accuracy (i.e., with a quantization step-size of $\Delta = 2^{-n}$) yields an entropy approximately equal to
$$-\int_a^b f_X(x)\log_2 f_X(x)\,dx + n \ \text{bits}$$
for n sufficiently large. In other words,
$$\lim_{n\to\infty}\,[H(q_n(X)) - n] = -\int_a^b f_X(x)\log_2 f_X(x)\,dx,$$
where $q_n(X)$ is the uniformly quantized version of X with quantization step-size $\Delta = 2^{-n}$.

Proof:

Step 1: Mean-value theorem. Let $\Delta = 2^{-n}$ be the quantization step-size, and let
$$t_i := \begin{cases} a + i\Delta, & i = 0, 1, \ldots, j-1, \\ b, & i = j, \end{cases} \qquad \text{where } j = \left\lceil \frac{b-a}{\Delta} \right\rceil.$$
From the mean-value theorem, we can choose $x_i \in [t_{i-1}, t_i]$ for $1 \le i \le j$ such that
$$p_i := \int_{t_{i-1}}^{t_i} f_X(x)\,dx = f_X(x_i)(t_i - t_{i-1}) = \Delta \cdot f_X(x_i).$$

Step 2: Definition of $h^{(n)}(X)$. Let
$$h^{(n)}(X) := -\sum_{i=1}^{j} [f_X(x_i)\log_2 f_X(x_i)]\,\Delta = -\sum_{i=1}^{j} [f_X(x_i)\log_2 f_X(x_i)]\,2^{-n}.$$
Since $-f_X(x)\log_2 f_X(x)$ is (Riemann-)integrable,
$$h^{(n)}(X) \to -\int_a^b f_X(x)\log_2 f_X(x)\,dx \quad \text{as } n \to \infty.$$
Therefore, given any $\varepsilon > 0$, there exists N such that for all $n > N$,
$$\left| -\int_a^b f_X(x)\log_2 f_X(x)\,dx - h^{(n)}(X) \right| < \varepsilon.$$

Step 3: Computation of $H(q_n(X))$. The entropy of the (uniformly) quantized version of X, $q_n(X)$, is given by
$$H(q_n(X)) = -\sum_{i=1}^{j} p_i \log_2 p_i = -\sum_{i=1}^{j} (f_X(x_i)\Delta)\log_2(f_X(x_i)\Delta) = -\sum_{i=1}^{j} (f_X(x_i)2^{-n})\log_2(f_X(x_i)2^{-n}),$$
where the $p_i$'s are the probabilities of the different values of $q_n(X)$.

Step 4: $H(q_n(X)) - h^{(n)}(X)$. From Steps 2 and 3,
$$H(q_n(X)) - h^{(n)}(X) = -\sum_{i=1}^{j} [f_X(x_i)2^{-n}]\log_2(2^{-n}) = n\sum_{i=1}^{j}\int_{t_{i-1}}^{t_i} f_X(x)\,dx = n\int_a^b f_X(x)\,dx = n.$$
Hence, we have that for $n > N$,
$$-\int_a^b f_X(x)\log_2 f_X(x)\,dx + n - \varepsilon < H(q_n(X)) = h^{(n)}(X) + n < -\int_a^b f_X(x)\log_2 f_X(x)\,dx + n + \varepsilon,$$
yielding that $\lim_{n\to\infty}[H(q_n(X)) - n] = -\int_a^b f_X(x)\log_2 f_X(x)\,dx$. □
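Lemma 5.2 is easy to check numerically. The following is a minimal sketch (added here, not part of the original notes) that quantizes a continuous random variable with step-size $\Delta = 2^{-n}$ and compares $H(q_n(X)) - n$ with $-\int f_X \log_2 f_X$. The pdf $f_X(x) = 3x^2$ on $[0, 1)$, whose CDF is $x^3$, is an arbitrary illustrative choice, and the function names are mine.

```python
import numpy as np

def cdf(x):
    """CDF of the illustrative pdf f_X(x) = 3x^2 on [0, 1)."""
    return x**3

def quantized_entropy(n):
    """Entropy (in bits) of q_n(X), the uniform quantization with step 2^-n."""
    t = np.linspace(0.0, 1.0, 2**n + 1)   # cell boundaries t_0, ..., t_j
    p = np.diff(cdf(t))                   # p_i = F(t_i) - F(t_{i-1})
    p = p[p > 0]                          # guard against log2(0)
    return -np.sum(p * np.log2(p))

# -\int f_X log2 f_X via a fine midpoint Riemann sum.
dx = 1e-5
x = np.arange(dx / 2, 1.0, dx)            # midpoints of a fine grid (all > 0)
f = 3 * x**2
h = -np.sum(f * np.log2(f)) * dx

for n in range(1, 11):
    print(f"n={n:2d}  H(q_n(X)) - n = {quantized_entropy(n) - n:.6f} bits")
print(f"-int f log2 f = {h:.6f} bits")    # the gap H(q_n(X)) - n should approach this
```

Replacing `cdf` with $x \mapsto x^2$ (the CDF of the pdf $f_X(x) = 2x$ used in Example 5.5 below) reproduces the $H(q_n(X)) - n$ values tabulated there.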
• Lemma 5.2 actually holds not only for support $[a, b)$ but for any support $S_X$.

Theorem 5.3 [340, Theorem 1] (Rényi 1959) For any real-valued random variable with pdf $f_X$, if
$$-\sum_i p_i \log_2 p_i$$
is finite, where the $p_i$'s are the probabilities of the different values of the uniformly quantized $q_n(X)$ over the support $S_X$, then
$$\lim_{n\to\infty}\,[H(q_n(X)) - n] = -\int_{S_X} f_X(x)\log_2 f_X(x)\,dx,$$
provided the integral on the right-hand side exists.

This suggests that $\int_{S_X} f_X(x)\log_2 \frac{1}{f_X(x)}\,dx$ could be a good information measure for continuous-alphabet sources.

5.1 Differential entropy

Definition 5.4 (Differential entropy) The differential entropy (in bits) of a continuous random variable X with pdf $f_X$ and support $S_X$ is defined as
$$h(X) := -\int_{S_X} f_X(x)\log_2 f_X(x)\,dx = E[-\log_2 f_X(X)],$$
when the integral exists.

Example 5.5 A continuous random variable X with support $S_X = [0, 1)$ and pdf $f_X(x) = 2x$ for $x \in S_X$ has differential entropy equal to
$$h(X) = \int_0^1 -2x\log_2(2x)\,dx = \left[\frac{x^2\,(\log_2 e - 2\log_2(2x))}{2}\right]_0^1 = \frac{1}{2\ln 2} - \log_2 2 = -0.278652 \ \text{bits.}$$

Next, we have that $q_n(X)$ is given by
$$q_n(X) = \frac{i}{2^n} \quad \text{if } \frac{i-1}{2^n} \le X < \frac{i}{2^n}, \qquad 1 \le i \le 2^n.$$
Hence,
$$\Pr\left[q_n(X) = \frac{i}{2^n}\right] = \frac{2i-1}{2^{2n}},$$
which yields
$$H(q_n(X)) = -\sum_{i=1}^{2^n} \frac{2i-1}{2^{2n}}\log_2\frac{2i-1}{2^{2n}} = \frac{1}{2^{2n}}\left[-\sum_{i=1}^{2^n}(2i-1)\log_2(2i-1) + 2^{2n}\log_2(2^{2n})\right].$$

   n    H(q_n(X))       H(q_n(X)) − n
   1    0.811278 bits   −0.188722 bits
   2    1.748999 bits   −0.251000 bits
   3    2.729560 bits   −0.270440 bits
   4    3.723726 bits   −0.276275 bits
   5    4.722023 bits   −0.277977 bits
   6    5.721537 bits   −0.278463 bits
   7    6.721399 bits   −0.278600 bits
   8    7.721361 bits   −0.278638 bits
   9    8.721351 bits   −0.278648 bits

As the table shows, $H(q_n(X)) - n$ indeed converges to $h(X) = -0.278652$ bits, as guaranteed by Lemma 5.2.

Example 5.7 (Differential entropy of a uniformly distributed random variable) Let X be a continuous random variable that is uniformly distributed over the interval $(a, b)$, where $b > a$; i.e., its pdf is given by
$$f_X(x) = \begin{cases} \frac{1}{b-a} & \text{if } x \in (a, b); \\ 0 & \text{otherwise.} \end{cases}$$
So its differential entropy is given by
$$h(X) = -\int_a^b \frac{1}{b-a}\log_2\frac{1}{b-a}\,dx = \log_2(b-a) \ \text{bits.}$$
• Note that if $(b - a) < 1$ in the above example, then $h(X)$ is negative.

Example 5.8 (Differential entropy of a Gaussian random variable) Let $X \sim \mathcal{N}(\mu, \sigma^2)$; i.e., X is a Gaussian (or normal) random variable with finite mean $\mu$, variance $\mathrm{Var}(X) = \sigma^2 > 0$ and pdf
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
for $x \in \mathbb{R}$. Then its differential entropy is given by
$$h(X) = \int_{\mathbb{R}} f_X(x)\left[\frac{1}{2}\log_2(2\pi\sigma^2) + \frac{(x-\mu)^2}{2\sigma^2}\log_2 e\right]dx = \frac{1}{2}\log_2(2\pi\sigma^2) + \frac{\log_2 e}{2\sigma^2}\,E[(X-\mu)^2] = \frac{1}{2}\log_2(2\pi\sigma^2) + \frac{1}{2}\log_2 e = \frac{1}{2}\log_2(2\pi e\sigma^2) \ \text{bits.} \qquad (5.1.1)$$
• Note that for a Gaussian random variable, its differential entropy is only a function of its variance $\sigma^2$ (it is functionally independent of its mean $\mu$).
• This is similar to the differential entropy of a uniform random variable, which depends only on the difference $(b - a)$ and not on the mean $(a + b)/2$.
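Formula (5.1.1) can be checked numerically. The following is a minimal sketch (added here, not part of the original notes) that estimates $h(X) = E[-\log_2 f_X(X)]$ for a Gaussian by Monte Carlo and compares it with $\frac{1}{2}\log_2(2\pi e\sigma^2)$; the parameter values $\mu = 1$, $\sigma = 2$ and the sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0                       # arbitrary illustrative parameters
x = rng.normal(mu, sigma, size=1_000_000)  # samples of X ~ N(mu, sigma^2)

# Monte Carlo estimate of h(X) = E[-log2 f_X(X)]
log2_pdf = (-0.5 * np.log2(2 * np.pi * sigma**2)
            - (x - mu)**2 / (2 * sigma**2) * np.log2(np.e))
h_mc = -np.mean(log2_pdf)

# Closed form (5.1.1): h(X) = (1/2) log2(2*pi*e*sigma^2)
h_exact = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)

print(h_mc, h_exact)                       # the two values should closely agree
```

Changing `mu` leaves both values unchanged, in line with the remark that the Gaussian differential entropy depends only on $\sigma^2$.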
5.2 Joint and conditional differential entropy, divergence and mutual information

Definition 5.9 (Joint differential entropy) If $X^n = (X_1, X_2, \ldots, X_n)$ is a continuous random vector of size n (i.e., a vector of n continuous random variables) with joint pdf $f_{X^n}$ and support $S_{X^n} \subseteq \mathbb{R}^n$, then its joint differential entropy is defined as
$$h(X^n) := -\int_{S_{X^n}} f_{X^n}(x_1, x_2, \ldots, x_n)\,\log_2 f_{X^n}(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2\cdots dx_n = E[-\log_2 f_{X^n}(X^n)]$$
when the n-dimensional integral exists.

Definition 5.10 (Conditional differential entropy) Let X and Y be two jointly distributed continuous random variables with joint pdf $f_{X,Y}$ and support $S_{X,Y} \subseteq \mathbb{R}^2$ such that the conditional pdf of Y given X, given by
$$f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)},$$
is well defined for all $(x, y) \in S_{X,Y}$, where $f_X$ is the marginal pdf of X. Then the conditional differential entropy of Y given X is defined as
$$h(Y|X) := -\int_{S_{X,Y}} f_{X,Y}(x,y)\,\log_2 f_{Y|X}(y|x)\,dx\,dy = E[-\log_2 f_{Y|X}(Y|X)],$$
when the integral exists.
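To illustrate Definition 5.10, here is a minimal sketch (again not part of the original notes) that estimates $h(Y|X) = E[-\log_2 f_{Y|X}(Y|X)]$ by Monte Carlo for a jointly Gaussian pair constructed as $Y = \rho X + \sqrt{1-\rho^2}\,Z$ with X, Z independent standard normals; $\rho = 0.8$ is an arbitrary choice. For this construction, Y given $X = x$ is $\mathcal{N}(\rho x, 1-\rho^2)$, so by (5.1.1) the estimate should be close to $\frac{1}{2}\log_2(2\pi e(1-\rho^2))$.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.8                                    # arbitrary correlation coefficient
n = 1_000_000

x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)   # (X, Y) jointly Gaussian

# Conditional pdf: given X = x, Y ~ N(rho*x, 1 - rho^2)
var = 1 - rho**2
log2_cond_pdf = (-0.5 * np.log2(2 * np.pi * var)
                 - (y - rho * x)**2 / (2 * var) * np.log2(np.e))

h_cond_mc = -np.mean(log2_cond_pdf)                   # Monte Carlo estimate of h(Y|X)
h_cond_exact = 0.5 * np.log2(2 * np.pi * np.e * var)  # (5.1.1) with sigma^2 = 1 - rho^2

print(h_cond_mc, h_cond_exact)
```

For $\rho \ne 0$ this value is strictly smaller than $h(Y) = \frac{1}{2}\log_2(2\pi e)$ (here $\mathrm{Var}(Y) = 1$), reflecting that conditioning on X reduces the uncertainty about Y.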
