10.3 Joint Differential Entropy, Conditional (Differential) Entropy, and Mutual Information

Definition 10.17 The joint differential entropy h(X) of a random vector X with joint pdf f(x) is defined as
\[
h(\mathbf{X}) = -\int_{\mathcal{S}} f(\mathbf{x}) \log f(\mathbf{x})\, d\mathbf{x} = -E \log f(\mathbf{X}).
\]

Corollary If X_1, X_2, \ldots, X_n are mutually independent, then
\[
h(\mathbf{X}) = \sum_{i=1}^{n} h(X_i).
\]

Theorem 10.18 (Translation) h(X + c) = h(X).

Theorem 10.19 (Scaling) h(AX) = h(X) + \log |\det(A)|.
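The scaling theorem can be sanity-checked numerically through the closed-form entropy of a Gaussian vector (Theorem 10.20 below). The sketch works in nats (natural logarithms) and uses an arbitrary positive definite covariance K and a random invertible matrix A as assumed test inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_h(K):
    """Differential entropy (in nats) of N(0, K): 0.5 * log((2*pi*e)^n |K|)."""
    n = K.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))

n = 3
M = rng.standard_normal((n, n))
K = M @ M.T + n * np.eye(n)          # an arbitrary positive definite covariance
A = rng.standard_normal((n, n))      # invertible with probability 1

h_X  = gaussian_h(K)
h_AX = gaussian_h(A @ K @ A.T)       # covariance of AX is A K A^T

# Theorem 10.19: h(AX) = h(X) + log|det(A)|
print(h_AX - h_X, np.log(abs(np.linalg.det(A))))
```

Translation invariance (Theorem 10.18) is already visible here: the covariance, and hence the entropy, of X + c does not depend on c.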
Theorem 10.20 (Multivariate Gaussian Distribution) Let X \sim \mathcal{N}(\mu, K). Then
\[
h(\mathbf{X}) = \frac{1}{2} \log\left[(2\pi e)^n |K|\right].
\]

Proof
1. Let K be diagonalized as Q\Lambda Q^\top, where Q is orthogonal.
2. Write X = QY, where the random variables in Y are uncorrelated with \mathrm{var}\, Y_i = \lambda_i, the i-th diagonal element of \Lambda (cf. Corollary 10.7).
3. Since X is Gaussian, so is Y.
4. Then the random variables in Y are mutually independent because they are uncorrelated.
5. Now consider
\[
\begin{aligned}
h(\mathbf{X}) &= h(Q\mathbf{Y}) \\
&= h(\mathbf{Y}) + \log|\det(Q)| && \text{(Theorem 10.19)} \\
&= h(\mathbf{Y}) + 0 && (\det(Q) = \pm 1) \\
&= \sum_{i=1}^{n} h(Y_i) \\
&= \sum_{i=1}^{n} \frac{1}{2} \log(2\pi e \lambda_i) \\
&= \frac{1}{2} \log\left[(2\pi e)^n \prod_{i=1}^{n} \lambda_i\right] \\
&= \frac{1}{2} \log\left[(2\pi e)^n |\Lambda|\right] \\
&= \frac{1}{2} \log\left[(2\pi e)^n |K|\right],
\end{aligned}
\]
where the last equality follows because K = Q\Lambda Q^\top gives
\[
|K| = |Q|\,|\Lambda|\,|Q^\top| = |Q|^2\,|\Lambda| = |\Lambda|.
\]
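The two routes in the proof can be compared numerically: the closed form \frac{1}{2}\log[(2\pi e)^n |K|] against the sum of the marginal entropies of the decorrelated components. The covariance below is an arbitrary assumed test input (entropies in nats):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 4
M = rng.standard_normal((n, n))
K = M @ M.T + np.eye(n)                     # arbitrary positive definite covariance

# Closed form of Theorem 10.20: h(X) = 0.5 * log((2*pi*e)^n |K|)   (nats)
h_closed = 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))

# Proof route: K = Q Lam Q^T; the decorrelated Y has independent components
# with var Y_i = lambda_i, so h(X) = sum_i 0.5 * log(2*pi*e*lambda_i).
lam, Q = np.linalg.eigh(K)
h_sum = sum(0.5 * np.log(2 * np.pi * np.e * l) for l in lam)

print(h_closed, h_sum)
```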
K = Q ⇤ Q> | | | || || | 2 = Q ⇤ | | | | = ⇤ | | Theorem 10.20 (Multivariate Gaussian Distribu- 5. Now consider tion) Let X (µ,K). Then ⇠N h(X)=h(QY) 1 n = h(Y)+log det(Q) h(X)= log (2⇡e) K . | | 2 | | h i = h(Y)+0 n Proof = h(Yi) 1. Let K be diagonalizable as Q⇤Q>. iX=1 2. Write X = QY, where the random variables in Y n 1 are uncorrelated with var Yi = i, the ith diagonal = log(2⇡e i) 2 element of ⇤(cf. Corollary 10.7). iX=1 3. Since X is Gaussian, so is Y. n 1 n = log (2⇡e) i 4. Then the random variables in Y are mutually inde- 2 2 3 pendent because they are uncorrelated. iY=1 4 5 1 n = log[(2⇡e) ⇤ ] 2 | | 1 n = log[(2⇡e) K ]. 2 | | The Model of a “Channel” with Discrete Output
Definition 10.21 The random variable Y is related to the random variable X through a conditional distribution p(y|x) defined for all x means
X (general) ─→ p(y|x) ─→ Y (discrete)

The Model of a “Channel” with Continuous Output
Definition 10.22 The random variable Y is related to the random variable X through a conditional pdf f(y|x) defined for all x means

X (general) ─→ f(y|x) ─→ Y (continuous)

Conditional Differential Entropy
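Note that X is allowed to be general (discrete, continuous, or mixed); only the channel output is constrained. A small simulation sketch, with an additive-Gaussian-noise channel and a hard-decision quantizer as assumed examples of the two channel types:

```python
import numpy as np

rng = np.random.default_rng(2)

# X "general": here a mixed source -- a random sign times a uniform magnitude.
x = rng.choice([-1.0, 1.0], size=10000) * rng.uniform(0.5, 1.5, size=10000)

# Continuous-output channel (Definition 10.22): f(y|x) = N(y; x, sigma^2).
sigma = 0.4
y_cont = x + sigma * rng.standard_normal(x.shape)

# Discrete-output channel (Definition 10.21): p(y|x) induced by thresholding
# the noisy observation, so Y = sign(X + noise) takes values in {-1, +1}.
y_disc = np.sign(y_cont)

print(np.mean(y_disc == np.sign(x)))   # fraction of correct hard decisions
```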
Definition 10.23 Let X and Y be jointly distributed random variables where Y is continuous and is related to X through a conditional pdf f(y|x) defined for all x. The conditional differential entropy of Y given \{X = x\} is defined as
\[
h(Y|X=x) = -\int_{\mathcal{S}_Y(x)} f(y|x) \log f(y|x)\, dy
\]
and the conditional differential entropy of Y given X is defined as
\[
h(Y|X) = \int_{\mathcal{S}_X} h(Y|X=x)\, dF(x) = -E \log f(Y|X).
\]
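The expression h(Y|X) = -E \log f(Y|X) lends itself to Monte Carlo estimation. A sketch assuming an additive Gaussian noise channel Y = X + Z, Z \sim \mathcal{N}(0, \sigma^2) independent of X, for which h(Y|X=x) = \frac{1}{2}\log(2\pi e \sigma^2) for every x:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = 0.7
n = 200000

# X can be arbitrary (exponential here); the channel is additive Gaussian
# noise, so f(y|x) = N(y; x, sigma^2).
x = rng.exponential(1.0, size=n)
y = x + sigma * rng.standard_normal(n)

# h(Y|X) = E[-log f(Y|X)], estimated by averaging over (X, Y) samples.
log_f = -0.5 * np.log(2 * np.pi * sigma**2) - (y - x) ** 2 / (2 * sigma**2)
h_est = -np.mean(log_f)

h_exact = 0.5 * np.log(2 * np.pi * np.e * sigma**2)   # = h(Z), same for all x
print(h_est, h_exact)
```

That the answer does not depend on the distribution of X here reflects the translation theorem: conditioning on X = x only shifts the noise.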
Proposition 10.24 If

X (general) ─→ f(y|x) ─→ Y (continuous)

then f(y) exists and is given by
\[
f(y) = \int f(y|x)\, dF(x).
\]

Remark Proposition 10.24 says that the pdf of Y exists regardless of the distribution of X. The next proposition is its vector generalization.
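When X is discrete, the integral against dF(x) reduces to a sum and f(y) is a finite mixture. A numeric check with an assumed two-point X and a unit-variance Gaussian channel, comparing the mixture density against sampled frequencies:

```python
import numpy as np

rng = np.random.default_rng(4)

# Discrete X on {-2, +2} with equal probability; channel f(y|x) = N(y; x, 1).
# Proposition 10.24: f(y) = sum_x p(x) f(y|x), a two-component mixture.
phi = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
f_Y = lambda y: 0.5 * phi(y + 2) + 0.5 * phi(y - 2)

# P(a < Y < b) from integrating f_Y should match the empirical frequency.
n = 400000
x = rng.choice([-2.0, 2.0], size=n)
y = x + rng.standard_normal(n)

a, b = -1.0, 1.0
N = 2000
dy = (b - a) / N
mids = a + dy * (np.arange(N) + 0.5)       # midpoint rule over (a, b)
prob_int = f_Y(mids).sum() * dy
prob_emp = np.mean((y > a) & (y < b))
print(prob_int, prob_emp)
```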
Proposition 10.25 Let X and Y be jointly distributed random vectors where Y is continuous and is related to X through a conditional pdf f(y|x) defined for all x. Then f(y) exists and is given by
\[
f(\mathbf{y}) = \int f(\mathbf{y}|\mathbf{x})\, dF(\mathbf{x}).
\]
Proof of Proposition 10.24
1. We have
\[
F_Y(y) = F_{XY}(\infty, y) = \int_{-\infty}^{\infty} \left[\int_{-\infty}^{y} f_{Y|X}(v|x)\, dv\right] dF_X(x).
\]
2. Since
\[
\int_{-\infty}^{\infty} \int_{-\infty}^{y} f_{Y|X}(v|x)\, dv\, dF_X(x) = F_{XY}(\infty, y) = F_Y(y) \le 1,
\]
f_{Y|X}(v|x) is absolutely integrable.
3. By Fubini’s theorem, the order of integration in F_Y(y) can be exchanged, and so
\[
F_Y(y) = \int_{-\infty}^{y} \underbrace{\left[\int_{-\infty}^{\infty} f_{Y|X}(v|x)\, dF_X(x)\right]}_{g(v)}\, dv.
\]
4. Then
\[
f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} \int_{-\infty}^{y} g(v)\, dv = g(y) = \int_{-\infty}^{\infty} f_{Y|X}(y|x)\, dF_X(x),
\]
where the third equality follows from the fundamental theorem of calculus.
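The Fubini exchange in the proof can be made concrete for a discrete X, where dF_X(x) is a finite sum and both iterated integrals are computable. A sketch with an assumed two-point X and unit-variance Gaussian channel f(v|x) = \mathcal{N}(v; x, 1):

```python
import numpy as np
from math import erf, sqrt

p_x = {-1.0: 0.3, 0.5: 0.7}                       # an arbitrary discrete X

Phi = lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0)))  # standard normal CDF
phi = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

y = 0.8

# Order 1 (step 1 of the proof): integrate over v first, giving the CDF
# Phi(y - x), then integrate over x against dF_X.
F1 = sum(p * Phi(y - x) for x, p in p_x.items())

# Order 2 (after Fubini, steps 3-4): form g(v) = sum_x p(x) f(v|x) = f_Y(v),
# then integrate g up to y (midpoint rule on a truncated range).
lo, N = -12.0, 200000
dv = (y - lo) / N
mids = lo + dv * (np.arange(N) + 0.5)
g = sum(p * phi(mids - x) for x, p in p_x.items())
F2 = g.sum() * dv

print(F1, F2)
```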
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 __ y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F (__,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 _y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y_) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so g(v) X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so
X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so
X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so
X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so
X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
y 1 fY X (v x) dv dFX (x) | | Z 1 Z 1 y 1 = fY X (v x) dv dFX (x) | | Z 1 Z 1 = F ( ,y) XY 1 = FY (y) 1, fY X (v x) is absolutely integrable. | | Proposition 10.24 If 3. By Fubini’s theorem, the order of integration in FY (y) can be exchanged, and so
X (general) f(y x) Y (continuous) | y 1 FY (y)= fY X (v x) dFX (x) dv. " | | # Z 1 Z 1 then f(y) exists and is given by 4. Then
f (y)= 1 f (y x) dF (x). Y Y X X d | | Z 1 fY (y)= FY (y) dy Proof d y = g(v) dv 1. dy Z 1 = g(y) FY (y)=FXY ( , y) 1 1 y = fY X (y x) dFX (x). 1 | | = fY X (v x) dv dFX (x). Z 1 | | Z 1 Z 1
2. Since
Mutual Information
Definition 10.26 Let

X (general) → f(y|x) → Y (continuous).

1. The mutual information between X and Y is defined as

    I(X; Y) = E log [ f(Y|X) / f(Y) ] = ∫_{S_X} ∫_{S_Y(x)} f(y|x) log [ f(y|x) / f(y) ] dy dF(x).

2. When both X and Y are continuous and f(x, y) exists,

    I(X; Y) = E log [ f(Y|X) / f(Y) ] = E log [ f(X, Y) / (f(X) f(Y)) ].
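Definition 10.26(2) can be exercised numerically. In the sketch below (the bivariate Gaussian model and correlation ρ = 0.8 are assumptions, not from the text), a Monte Carlo average of log[f(X,Y)/(f(X)f(Y))] is compared with the closed form I(X;Y) = −(1/2) log(1 − ρ²) nats.

```python
import math, random

# Assumed example: X ~ N(0,1), Y = rho*X + sqrt(1-rho^2)*Z with Z ~ N(0,1),
# so (X, Y) is standard bivariate Gaussian and I(X;Y) = -0.5*log(1-rho^2).
random.seed(1)
rho, n = 0.8, 200_000

def log_ratio(x, y):
    # log [ f(x,y) / (f(x) f(y)) ] for the standard bivariate Gaussian.
    q = (x*x - 2*rho*x*y + y*y) / (1 - rho**2)
    return -0.5*math.log(1 - rho**2) - 0.5*q + 0.5*(x*x + y*y)

est = 0.0
for _ in range(n):
    x = random.gauss(0, 1)
    y = rho*x + math.sqrt(1 - rho**2)*random.gauss(0, 1)
    est += log_ratio(x, y)
est /= n                          # Monte Carlo estimate of E log[f(X,Y)/(f(X)f(Y))]

exact = -0.5*math.log(1 - rho**2)
print(round(est, 3), round(exact, 3))
```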
Remarks
• With Definition 10.26, the mutual information is defined when one r.v. is general and the other is continuous.
• In Ch. 2, the mutual information is defined when both r.v.'s are discrete.
• Thus the mutual information is defined when each of the r.v.'s is either discrete or continuous.
Conditional Mutual Information
Definition 10.27 Let

X (general) → f(y|x, t) → Y (continuous), where T is general.

The mutual information between X and Y given T is defined as

    I(X; Y|T) = ∫_{S_T} I(X; Y|T = t) dF(t) = E log [ f(Y|X, T) / f(Y|T) ],

where

    I(X; Y|T = t) = ∫_{S_X(t)} ∫_{S_Y(x,t)} f(y|x, t) log [ f(y|x, t) / f(y|t) ] dy dF(x|t).
Interpretation of I(X;Y)
1. Assume f(x, y) exists and is continuous.

2. For a fixed Δ > 0 and all integers i and j, define the intervals

    A_x^i = [iΔ, (i+1)Δ) and A_y^j = [jΔ, (j+1)Δ)

and the rectangle A_xy^{i,j} = A_x^i × A_y^j.

3. Define discrete r.v.'s

    X̂_Δ = i if X ∈ A_x^i,  Ŷ_Δ = j if Y ∈ A_y^j.

4. X̂_Δ and Ŷ_Δ are quantizations of X and Y, respectively.

5. For all i and j, let (x_i, y_j) ∈ A_x^i × A_y^j.

6. Then

    I(X̂_Δ; Ŷ_Δ)
      = Σ_i Σ_j Pr{(X̂_Δ, Ŷ_Δ) = (i, j)} log [ Pr{(X̂_Δ, Ŷ_Δ) = (i, j)} / (Pr{X̂_Δ = i} Pr{Ŷ_Δ = j}) ]
      ≈ Σ_i Σ_j f(x_i, y_j) Δ² log [ f(x_i, y_j) Δ² / ((f(x_i) Δ)(f(y_j) Δ)) ]
      = Σ_i Σ_j f(x_i, y_j) Δ² log [ f(x_i, y_j) / (f(x_i) f(y_j)) ]
      ≈ ∫∫ f(x, y) log [ f(x, y) / (f(x) f(y)) ] dx dy
      = I(X; Y).

7. Therefore, I(X; Y) can be interpreted as the limit of I(X̂_Δ; Ŷ_Δ) as Δ → 0.

8. This interpretation continues to be valid for general distributions of X and Y.
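The quantization argument can be tested empirically. The sketch below (the correlation ρ, bin width Δ, and sample size are all assumptions) estimates I(X̂_Δ; Ŷ_Δ) from empirical cell frequencies of a correlated Gaussian pair and compares it with the limiting value I(X;Y) = −(1/2) log(1 − ρ²) nats.

```python
import math, random
from collections import Counter

# Quantize a correlated Gaussian pair (X, Y) into bins of width Delta and
# compute the discrete mutual information I(X_hat; Y_hat) from empirical
# frequencies; as Delta shrinks it approaches I(X;Y) (assumed parameters).
random.seed(2)
rho, n, delta = 0.8, 200_000, 0.25

pairs = []
for _ in range(n):
    x = random.gauss(0, 1)
    y = rho*x + math.sqrt(1 - rho**2)*random.gauss(0, 1)
    pairs.append((math.floor(x/delta), math.floor(y/delta)))  # bin indices

joint = Counter(pairs)
px = Counter(i for i, _ in pairs)
py = Counter(j for _, j in pairs)

# Plug-in estimate of I(X_hat; Y_hat), in nats.
mi_hat = sum((c/n) * math.log((c/n) / ((px[i]/n) * (py[j]/n)))
             for (i, j), c in joint.items())

exact = -0.5*math.log(1 - rho**2)
print(round(mi_hat, 3), round(exact, 3))
```

The plug-in estimate is itself a discrete mutual information, so it is nonnegative by Theorem 10.31, and for this bin width it already sits close to the continuous value.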
Definition 10.28 Let
X (discrete) → f(y|x) → Y (continuous).

The conditional entropy of X given Y is defined as

    H(X|Y) = H(X) − I(X; Y).

Proposition 10.29 For two random variables X and Y,

1. h(Y) = h(Y|X) + I(X; Y) if Y is continuous;
2. H(Y) = H(Y|X) + I(X; Y) if Y is discrete.

Proposition 10.30 (Chain Rule)

    h(X_1, X_2, …, X_n) = Σ_{i=1}^{n} h(X_i | X_1, …, X_{i−1}).
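Proposition 10.29(1) together with Definition 10.28 can be checked on a simple additive-noise model (the binary input and unit-variance Gaussian noise are assumptions): with X uniform on {−1, +1} and Y = X + Z, h(Y) is obtained by numerical integration of the mixture density, h(Y|X) is the Gaussian entropy (1/2) log(2πeσ²), their difference gives I(X;Y), and H(X|Y) = H(X) − I(X;Y).

```python
import math

# Assumed example: X uniform on {-1, +1}, Y = X + Z with Z ~ N(0, sigma^2).
sigma = 1.0

def phi(y, m):
    return math.exp(-(y - m)**2/(2*sigma**2)) / math.sqrt(2*math.pi*sigma**2)

def f_Y(y):                      # mixture density of Y
    return 0.5*phi(y, -1.0) + 0.5*phi(y, 1.0)

# h(Y) = -int f log f, rectangle rule on [-12, 12] (nats).
step = 0.001
h_Y = -sum(f_Y(k*step)*math.log(f_Y(k*step))
           for k in range(-12000, 12001)) * step
h_Y_given_X = 0.5*math.log(2*math.pi*math.e*sigma**2)  # Gaussian, same for both x

I = h_Y - h_Y_given_X            # Proposition 10.29(1)
H_X_given_Y = math.log(2) - I    # Definition 10.28, with H(X) = log 2 nats
print(round(I, 3), round(H_X_given_Y, 3))
```

Both I(X;Y) and H(X|Y) land strictly between 0 and log 2, as they must for a binary input.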
Theorem 10.31

    I(X; Y) ≥ 0,

with equality if and only if X is independent of Y.
Corollary 10.32

    I(X; Y|T) ≥ 0,

with equality if and only if X is independent of Y conditioning on T.
Corollary 10.33 (Conditioning Does Not Increase Differential Entropy)

    h(X|Y) ≤ h(X),

with equality if and only if X and Y are independent.
Remarks For continuous r.v.'s,

1. h(X) ≥ 0 and h(X|Y) ≥ 0 DO NOT hold in general;
2. I(X; Y) ≥ 0 and I(X; Y|Z) ≥ 0 always hold.
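A minimal numerical illustration of these remarks (the particular distributions are assumptions): differential entropy can be negative, e.g. h(X) = log(1/2) < 0 for X uniform on [0, 1/2], while the Gaussian mutual information −(1/2) log(1 − ρ²) is nonnegative for every ρ.

```python
import math

# h(X) for X ~ Uniform[0, a] is log(a); with a = 1/2 it is negative.
a = 0.5
h_X = math.log(a)                         # = -log 2 < 0 nats

# I(X;Y) = -0.5*log(1 - rho^2) for a standard bivariate Gaussian pair,
# nonnegative for every rho in (-1, 1) and zero only at rho = 0.
mis = [-0.5*math.log(1 - rho**2) for rho in (-0.9, -0.5, 0.0, 0.5, 0.9)]
print(round(h_X, 4), [round(v, 4) for v in mis])
```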
Corollary 10.34 (Independence Bound for Differential Entropy)
    h(X_1, X_2, …, X_n) ≤ Σ_{i=1}^{n} h(X_i),

with equality if and only if X_1, X_2, …, X_n are mutually independent.
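Corollary 10.34 can be made concrete with a bivariate Gaussian (unit variances and ρ = 0.6 are assumptions): by Theorem 10.20, h(X_1, X_2) = (1/2) log[(2πe)² |K|], and the gap to h(X_1) + h(X_2) is exactly I(X_1; X_2) = −(1/2) log(1 − ρ²) ≥ 0.

```python
import math

# Bivariate Gaussian with unit variances and correlation rho (assumed).
rho = 0.6
det_K = 1 - rho**2                        # |K| for unit variances

h_joint = 0.5*math.log((2*math.pi*math.e)**2 * det_K)   # Theorem 10.20
h_sum   = 2 * 0.5*math.log(2*math.pi*math.e)            # h(X1) + h(X2)
gap = h_sum - h_joint                                   # slack in the bound

print(round(gap, 4), round(-0.5*math.log(1 - rho**2), 4))
```

The bound is tight exactly when ρ = 0, i.e. when X_1 and X_2 are independent, matching the equality condition of the corollary.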