Inequalities in Information Theory: A Brief Introduction

Xu Chen
Department of Information Science, School of Mathematics, Peking University
Mar. 20, 2012

Part I: Basic Concepts and Inequalities

Outline
1. Basic Concepts
2. Basic Inequalities
3. Bounds on Entropy

Basic Concepts: The Entropy

Definition
1. The Shannon information content of an outcome x is defined to be
       h(x) = \log_2 \frac{1}{P(x)}.
2. The entropy of an ensemble X is defined to be the average Shannon information content of an outcome:
       H(X) = \sum_{x \in \mathcal{X}} P(x) \log_2 \frac{1}{P(x)}.    (1)
3. The conditional entropy of a random variable X given another random variable Y is
       H(X \mid Y) = -\sum_i \sum_j p(x_i, y_j) \log_2 p(x_i \mid y_j).    (2)

The Joint Entropy

The joint entropy of X and Y is
       H(X, Y) = \sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} p(x, y) \log_2 \frac{1}{p(x, y)}.    (3)

Remarks
1. The entropy H answers the question of what the ultimate data compression is.
2. The entropy is a measure of the average uncertainty in a random variable: it is the average number of bits required to describe it.

See [2] (Thomas) and [4] (David) for reference.
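As a quick numerical illustration of definitions (1)-(3), the short Python sketch below computes H(X), H(X | Y), and H(X, Y) for a small two-by-two joint pmf and checks the standard chain rule H(X, Y) = H(Y) + H(X | Y). The joint pmf values and the helper function H are illustrative assumptions, not material from the slides.

    import numpy as np

    # Illustrative joint pmf p(x, y); rows index x, columns index y (assumed values).
    p_xy = np.array([[0.4, 0.1],
                     [0.2, 0.3]])

    p_x = p_xy.sum(axis=1)   # marginal p(x)
    p_y = p_xy.sum(axis=0)   # marginal p(y)

    def H(p):
        """Entropy in bits of a pmf given as an array of probabilities, eq. (1)."""
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    H_joint = p_joint = H(p_xy.ravel())                      # H(X, Y), eq. (3)
    p_x_given_y = p_xy / p_y                                 # column j holds p(x | y_j)
    H_cond = float(-(p_xy * np.log2(p_x_given_y)).sum())     # H(X | Y), eq. (2)

    print(H(p_x), H_cond, H_joint)
    assert np.isclose(H_joint, H(p_y) + H_cond)              # H(X, Y) = H(Y) + H(X | Y)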
The Mutual Information

Definition
The mutual information is the reduction in uncertainty about one random variable when another is given. For two random variables X and Y this reduction is
       I(X; Y) = H(X) - H(X \mid Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}.    (4)

The capacity of a channel is
       C = \max_{p(x)} I(X; Y).

The Relationships

Figure: the relationships between entropy and mutual information (graphic from [3] Simon, 2011).

The Relative Entropy

Definition
The relative entropy, or Kullback-Leibler distance, between two probability mass functions p(x) and q(x) is defined as
       D(p \,\|\, q) = \sum_{x \in \mathcal{X}} p(x) \log \frac{p(x)}{q(x)} = E_p \log \frac{p(X)}{q(X)}.    (5)

1. Relative entropy and mutual information:
       I(X; Y) = D(p(x, y) \,\|\, p(x)\, p(y)).    (6)
2. Pythagorean decomposition: let X = AU; then
       D(p_x \,\|\, p_u) = D(p_x \,\|\, \tilde{p}_x) + D(\tilde{p}_x \,\|\, p_u).    (7)

Conditional Definitions

Conditional mutual information:
       I(X; Y \mid Z) = H(X \mid Z) - H(X \mid Y, Z)    (8)
                      = E_{p(x, y, z)} \log \frac{p(X, Y \mid Z)}{p(X \mid Z)\, p(Y \mid Z)}.    (9)

Conditional relative entropy:
       D(p(y \mid x) \,\|\, q(y \mid x)) = \sum_x p(x) \sum_y p(y \mid x) \log \frac{p(y \mid x)}{q(y \mid x)}    (10)
                                         = E_{p(x, y)} \log \frac{p(Y \mid X)}{q(Y \mid X)}.    (11)
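To make equations (4)-(6) concrete, the sketch below evaluates I(X; Y) directly from (4) and confirms numerically that it equals D(p(x, y) || p(x)p(y)) as in (6). It reuses the small assumed joint pmf from the previous sketch; base-2 logarithms are used here, so the result is in bits.

    import numpy as np

    p_xy = np.array([[0.4, 0.1],     # same assumed joint pmf as before
                     [0.2, 0.3]])
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)

    def D(p, q):
        """Relative entropy D(p || q) in bits, eq. (5); assumes q > 0 wherever p > 0."""
        mask = p > 0
        return float((p[mask] * np.log2(p[mask] / q[mask])).sum())

    prod = np.outer(p_x, p_y)                                # product pmf p(x) p(y)
    I_xy = float((p_xy * np.log2(p_xy / prod)).sum())        # mutual information, eq. (4)

    assert np.isclose(I_xy, D(p_xy.ravel(), prod.ravel()))   # eq. (6)
    print(I_xy)                                              # roughly 0.12 bits for these numbers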
Differential Entropy

Definition
1. The differential entropy h(X_1, X_2, \ldots, X_n), sometimes written h(f), is defined by
       h(X_1, X_2, \ldots, X_n) = -\int f(x) \log f(x)\, dx.    (12)
2. The relative entropy between probability densities f and g is
       D(f \,\|\, g) = \int f(x) \log \frac{f(x)}{g(x)}\, dx.    (13)

Chain Rules
1. Chain rule for entropy:
       H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} H(X_i \mid X_{i-1}, \ldots, X_1).    (14)
2. Chain rule for mutual information:
       I(X_1, X_2, \ldots, X_n; Y) = \sum_{i=1}^{n} I(X_i; Y \mid X_{i-1}, \ldots, X_1).    (15)
3. Chain rule for relative entropy:
       D(p(x, y) \,\|\, q(x, y)) = D(p(x) \,\|\, q(x)) + D(p(y \mid x) \,\|\, q(y \mid x)).    (16)

Basic Inequalities

Jensen's Inequality

Definition
A function f is said to be convex if
       f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2)    (17)
for all 0 \le \lambda \le 1 and all x_1, x_2 in the convex domain of f.
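Inequality (17) can be checked numerically on random points for a specific convex function. The sketch below does this for f(t) = t \log_2 t on (0, \infty), whose second derivative is positive there; the choice of function, the sampling range, and the seed are all assumptions made for illustration. Jensen's inequality, E[f(X)] \ge f(E[X]) for convex f, is this same convexity statement averaged over a distribution, and it is the basic tool behind D(p \| q) \ge 0 and the bounds on entropy listed in the outline.

    import numpy as np

    def f(t):
        """A convex function on (0, inf): f(t) = t * log2(t), with f''(t) = 1 / (t ln 2) > 0."""
        return t * np.log2(t)

    rng = np.random.default_rng(0)
    x1, x2 = rng.uniform(0.1, 10.0, size=(2, 1000))   # random points in the domain
    lam = rng.uniform(0.0, 1.0, size=1000)            # random mixing weights in [0, 1]

    lhs = f(lam * x1 + (1 - lam) * x2)                # f(lambda x1 + (1 - lambda) x2)
    rhs = lam * f(x1) + (1 - lam) * f(x2)             # lambda f(x1) + (1 - lambda) f(x2)
    assert np.all(lhs <= rhs + 1e-12)                 # inequality (17) holds at every sample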
