
Information Theory
Lecture 10: Differential Entropy
Stefan Höst

Differential entropy

Definition. Let X be a real-valued continuous random variable with probability density function f(x). The differential entropy is
\[
H(X) = E[-\log f(X)] = -\int_{\mathbb{R}} f(x) \log f(x)\, dx
\]
where the convention 0 \log 0 = 0 is used. Sometimes the notation H(f) is also used in the literature.

Interpretation. Differential entropy cannot be interpreted as uncertainty; unlike discrete entropy it can, for instance, be negative.

Differential entropy

Example. For a uniform distribution, f(x) = 1/a for 0 \le x \le a, the differential entropy is
\[
H(X) = -\int_0^a \frac{1}{a} \log \frac{1}{a}\, dx = \log a
\]
Note that H(X) < 0 for 0 < a < 1.

Example. For a Gaussian (normal) distribution N(m, \sigma), the differential entropy is
\[
H(X) = -\int_{\mathbb{R}} f(x) \log f(x)\, dx = \frac{1}{2} \log(2\pi e \sigma^2)
\]

Translation and scaling

Theorem.
- Let Y = X + c. Then f_Y(y) = f_X(y - c) and H(Y) = H(X).
- Let Y = aX, a > 0. Then f_Y(y) = \frac{1}{a} f_X\!\big(\frac{y}{a}\big) and H(Y) = H(X) + \log a.

Gaussian distribution

Lemma. Let g(x) be the density function of X \sim N(m, \sigma), and let f(x) be an arbitrary density function with the same mean m and variance \sigma^2. Then
\[
\int_{\mathbb{R}} f(x) \log g(x)\, dx = \int_{\mathbb{R}} g(x) \log g(x)\, dx
\]

Differential entropy

Definition. The joint differential entropy is the entropy of a 2-dimensional random variable (X, Y) with joint density function f(x, y),
\[
H(X, Y) = E[-\log f(X, Y)] = -\int_{\mathbb{R}^2} f(x, y) \log f(x, y)\, dx\, dy
\]

Definition. The multi-dimensional joint differential entropy is the entropy of an n-dimensional random vector (X_1, \ldots, X_n) with joint density function f(x_1, \ldots, x_n),
\[
H(X_1, \ldots, X_n) = -\int_{\mathbb{R}^n} f(x_1, \ldots, x_n) \log f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n
\]

Mutual information

Definition. The mutual information for a pair of continuous random variables with joint probability density function f(x, y) is
\[
I(X; Y) = E\Big[\log \frac{f(X, Y)}{f(X) f(Y)}\Big] = \int_{\mathbb{R}^2} f(x, y) \log \frac{f(x, y)}{f(x) f(y)}\, dx\, dy
\]
The mutual information can be derived as
\[
I(X; Y) = H(X) + H(Y) - H(X, Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)
\]
where H(X|Y) = -\int_{\mathbb{R}^2} f(x, y) \log f(x|y)\, dx\, dy.

Relative entropy

Definition. The relative entropy for a pair of continuous random variables with probability density functions f(x) and g(x) is
\[
D(f \| g) = E_f\Big[\log \frac{f(X)}{g(X)}\Big] = \int_{\mathbb{R}} f(x) \log \frac{f(x)}{g(x)}\, dx
\]

Theorem. The relative entropy is non-negative, D(f \| g) \ge 0, with equality if and only if f(x) = g(x) for all x.

Mutual information

Corollary. The mutual information is non-negative, i.e.
\[
I(X; Y) = H(X) - H(X|Y) \ge 0
\]
with equality if and only if X and Y are independent.

Corollary. The entropy will not increase by considering side information, i.e.
\[
H(X|Y) \le H(X)
\]
with equality if and only if X and Y are independent.

Chain rule

Theorem. The chain rule for differential entropy states that
\[
H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} H(X_i \mid X_1, \ldots, X_{i-1})
\]

Corollary. From the chain rule it follows that
\[
H(X_1, X_2, \ldots, X_n) \le \sum_{i=1}^{n} H(X_i)
\]
with equality if and only if X_1, X_2, \ldots, X_n are independent.

Gaussian distribution

Theorem. The Gaussian distribution maximises the differential entropy over all distributions with mean m and variance \sigma^2.
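The closed-form entropies above can be checked numerically. The sketch below is not part of the lecture: it assumes natural logarithms (nats), and the helper diff_entropy is an illustrative name for a Riemann-sum approximation of the defining integral. It reproduces H(X) = log a for the uniform density, H(X) = (1/2) log(2πeσ²) for the Gaussian, and the scaling rule H(aX) = H(X) + log a.

```python
# Numerical sketch (not from the slides): approximate differential entropies
# by a Riemann sum and compare with the closed-form expressions. Natural
# logarithm (nats) is used; grid limits and step counts are illustrative.
import numpy as np

def diff_entropy(pdf, grid):
    """Approximate H(X) = -integral of f(x) log f(x) dx on a uniform grid."""
    fx = pdf(grid)
    dx = grid[1] - grid[0]
    safe = np.where(fx > 0, fx, 1.0)      # convention: 0 log 0 = 0
    return -np.sum(fx * np.log(safe)) * dx

a, m, s = 0.5, 1.0, 2.0

# Uniform on [0, a]: H(X) = log a, negative for 0 < a < 1
x_uni = np.linspace(0.0, a, 100_000)
uni = lambda x: np.full_like(x, 1.0 / a)
print(diff_entropy(uni, x_uni), np.log(a))

# Gaussian N(m, s): H(X) = 0.5 * log(2 pi e s^2)
gauss = lambda x: np.exp(-(x - m) ** 2 / (2 * s ** 2)) / np.sqrt(2 * np.pi * s ** 2)
x_g = np.linspace(m - 10 * s, m + 10 * s, 200_000)
print(diff_entropy(gauss, x_g), 0.5 * np.log(2 * np.pi * np.e * s ** 2))

# Scaling Y = 2X: f_Y(y) = f_X(y/2) / 2 and H(Y) = H(X) + log 2
scaled = lambda y: gauss(y / 2.0) / 2.0
x_y = np.linspace(2 * (m - 10 * s), 2 * (m + 10 * s), 200_000)
print(diff_entropy(scaled, x_y), diff_entropy(gauss, x_g) + np.log(2))
```

With a = 0.5 the uniform case gives a negative value, which is one concrete reason why differential entropy cannot be read as an uncertainty.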
Continuous vs discrete

Theorem. Let X be a continuous random variable with density f_X(x). Construct a discretized version X^\Delta, where
\[
p(x_k^\Delta) = \int_{k\Delta}^{(k+1)\Delta} f_X(x)\, dx = f_X(x_k)\Delta
\]
Then, in general, \lim_{\Delta \to 0} H(X^\Delta) does not exist.

Theorem. Let X and Y be continuous random variables with densities f_X(x) and f_Y(y). Construct discretized versions X^\Delta and Y^\delta, where
\[
p(x_k^\Delta) = \Delta f(x_k), \quad f(x_k) = \sum_{\ell} \delta f(x_k, y_\ell),
\qquad
p(y_\ell^\delta) = \delta f(y_\ell), \quad f(y_\ell) = \sum_{k} \Delta f(x_k, y_\ell)
\]
Then
\[
\lim_{\Delta, \delta \to 0} I(X^\Delta; Y^\delta) = I(X; Y)
\]
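A minimal numerical sketch of the first theorem (not from the slides): quantizing X ~ N(0, 1) into bins of width Δ gives a discrete entropy close to H(X) + log(1/Δ), which grows without bound as Δ → 0, so the limit does not exist. The helper quantized_entropy, the truncation range and the chosen Δ values are illustrative; natural logarithms are used.

```python
# Numerical sketch (not from the slides): discrete entropy of a quantized
# Gaussian diverges like log(1/Delta) as the bin width Delta shrinks.
import numpy as np

def quantized_entropy(pdf, lo, hi, delta):
    """Entropy of X^Delta with p(x_k) ~ f(x_k) * Delta on bins of width Delta."""
    centers = np.arange(lo, hi, delta) + delta / 2.0
    p = pdf(centers) * delta
    p = p / p.sum()                        # renormalise the truncated tail mass
    p = p[p > 0]                           # convention: 0 log 0 = 0
    return -np.sum(p * np.log(p))

gauss = lambda x: np.exp(-x ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
h_x = 0.5 * np.log(2.0 * np.pi * np.e)     # differential entropy of N(0, 1)

for delta in (0.5, 0.1, 0.01, 0.001):
    h_delta = quantized_entropy(gauss, -8.0, 8.0, delta)
    print(f"Delta={delta:6.3f}  H(X^Delta)={h_delta:8.4f}  "
          f"H(X)+log(1/Delta)={h_x - np.log(delta):8.4f}")
```

The two printed columns agree increasingly well as Δ shrinks: H(X^Δ) diverges at the rate log(1/Δ), while the difference H(X^Δ) + log Δ converges to the differential entropy. Mutual information, by contrast, is a difference of such entropies, so the diverging log(1/Δ) terms cancel, which is the content of the second theorem.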