Moment-Entropy Inequalities for a Random Vector
Erwin Lutwak, Deane Yang, and Gaoyong Zhang

Abstract—The p-th moment matrix is defined for a real random vector, generalizing the classical covariance matrix. Sharp inequalities relating the p-th moment and Rényi entropy are established, generalizing the classical inequality relating the second moment and the Shannon entropy. The extremal distributions for these inequalities are completely characterized.

I. INTRODUCTION

In [9] the authors demonstrated how the classical information-theoretic inequality for the Shannon entropy and second moment of a real random variable can be extended to inequalities for Rényi entropy and the p-th moment. The extremals of these inequalities were also completely characterized. Moment-entropy inequalities, using Rényi entropy, for discrete random variables have also been obtained by Arikan [2].

We describe how to extend the definition of the second moment matrix of a real random vector to that of the p-th moment matrix. Using this, we extend the moment-entropy inequalities and the characterization of the extremal distributions proved in [9] to higher dimensions.

Variants and generalizations of the theorems presented here can be found in work of the authors [8], [10], [11] and of Bastero and Romance [3].

The authors would like to thank Christoph Haberl for his careful reading of this paper and valuable suggestions for improving it.

E. Lutwak ([email protected]), D. Yang ([email protected]), and G. Zhang ([email protected]) are with the Department of Mathematics, Polytechnic University, Brooklyn, New York, and were supported in part by NSF Grant DMS-0405707.

II. THE p-TH MOMENT MATRIX OF A RANDOM VECTOR

A. Basic notation

Throughout this paper we denote:

  Rⁿ = n-dimensional Euclidean space
  x · y = standard Euclidean inner product of x, y ∈ Rⁿ
  |x| = √(x · x)
  S = the set of positive definite symmetric n-by-n matrices
  |A| = determinant of A ∈ S
  |K| = Lebesgue measure of K ⊂ Rⁿ.

The standard Euclidean ball in Rⁿ will be denoted by B, and its volume by ω_n.

Each inner product on Rⁿ can be written uniquely as

  (x, y) ∈ Rⁿ × Rⁿ ↦ ⟨x, y⟩_A = Ax · Ay,

for A ∈ S. The associated norm will be denoted by |·|_A.

Throughout this paper, X denotes a random vector in Rⁿ. The probability measure on Rⁿ associated with a random vector X is denoted m_X. We will denote the standard Lebesgue density on Rⁿ by dx. By the density function f_X of a random vector X, we mean the Radon–Nikodym derivative of the probability measure m_X with respect to Lebesgue measure.

If V is a vector space and Φ : Rⁿ → V is a continuous function, then the expected value of Φ(X) is given by

  E[Φ(X)] = ∫_{Rⁿ} Φ(x) dm_X(x).

We call a random vector X nondegenerate if E[|v · X|] > 0 for each nonzero v ∈ Rⁿ.

B. The p-th moment of a random vector

For p ∈ (0, ∞), the standard p-th moment of a random vector X is given by

  E[|X|^p] = ∫_{Rⁿ} |x|^p dm_X(x).    (1)

More generally, the p-th moment with respect to the inner product ⟨·,·⟩_A is

  E[|X|_A^p] = ∫_{Rⁿ} |x|_A^p dm_X(x).

C. The p-th moment matrix

The second moment matrix of a random vector X is defined to be

  M₂[X] = E[X ⊗ X],

where for v ∈ Rⁿ, v ⊗ v is the linear transformation given by x ↦ (x · v)v. Recall that M₂[X − E[X]] is the covariance matrix. An important observation is that the definition of the moment matrix does not use the inner product on Rⁿ.

A unique characterization of the second moment matrix is the following: Let M = M₂[X]. The inner product ⟨·,·⟩_{M^{−1/2}} is the unique one whose unit ball has maximal volume among all inner products ⟨·,·⟩_A that are normalized so that the second moment satisfies E[|X|_A²] = n.

We extend this characterization to a definition of the p-th moment matrix M_p[X] for all p ∈ (0, ∞).

Theorem 1: If p ∈ (0, ∞) and X is a nondegenerate random vector in Rⁿ with finite p-th moment, then there exists a unique matrix A ∈ S such that

  E[|X|_A^p] = n

and

  |A| ≥ |A′|,

for each A′ ∈ S such that E[|X|_{A′}^p] = n. Moreover, the matrix A is the unique matrix in S satisfying

  I = E[AX ⊗ AX |AX|^{p−2}].

We define the p-th moment matrix of a random vector X to be M_p[X] = A^{−p}, where A is given by the theorem above. The proof of the theorem is given in §IV.

III. MOMENT-ENTROPY INEQUALITIES

A. Entropy

The Shannon entropy of a random vector X is defined to be

  h[X] = −∫_{Rⁿ} f_X log f_X dx,

provided that the integral above exists. For λ > 0 the λ-Rényi entropy power of a density function is defined to be

  N_λ[X] = (∫_{Rⁿ} f_X^λ dx)^{1/(1−λ)}  if λ ≠ 1,
  N₁[X] = e^{h[X]}  if λ = 1,

provided that the integral above exists. Observe that N_λ[X] → N₁[X] as λ → 1. The λ-Rényi entropy of a random vector X is defined to be

  h_λ[X] = log N_λ[X].

The entropy h_λ[X] is continuous in λ and, by the Hölder inequality, decreasing in λ. It is strictly decreasing, unless X is a uniform random vector.

It follows by the chain rule that

  N_λ[AX] = |A| N_λ[X],    (2)

for each A ∈ S.

B. Relative entropy

Given two random vectors X, Y in Rⁿ, their relative Shannon entropy or Kullback–Leibler distance [6], [5], [1] (see also page 231 in [4]) is defined by

  h₁[X, Y] = ∫_{Rⁿ} f_X log (f_X / f_Y) dx,    (3)

provided that the integral above exists. Given λ > 0, we define the relative λ-Rényi entropy power of X and Y as follows. If λ ≠ 1, then

  N_λ[X, Y] = (∫_{Rⁿ} f_Y^{λ−1} f_X dx)^{1/(1−λ)} (∫_{Rⁿ} f_Y^λ dx)^{1/λ} / (∫_{Rⁿ} f_X^λ dx)^{1/(λ(1−λ))},    (4)

and, if λ = 1, then

  N₁[X, Y] = e^{h₁[X,Y]},

provided in both cases that the right-hand side exists. Define the λ-Rényi relative entropy of random vectors X and Y by

  h_λ[X, Y] = log N_λ[X, Y].

Observe that h_λ[X, Y] is continuous in λ.

Lemma 2: If X and Y are random vectors such that h_λ[X], h_λ[Y], and h_λ[X, Y] are finite, then

  h_λ[X, Y] ≥ 0.

Equality holds if and only if X = Y.

Proof: If λ > 1, then by the Hölder inequality,

  ∫_{Rⁿ} f_Y^{λ−1} f_X dx ≤ (∫_{Rⁿ} f_Y^λ dx)^{(λ−1)/λ} (∫_{Rⁿ} f_X^λ dx)^{1/λ},

and if λ < 1, then we have

  ∫_{Rⁿ} f_X^λ dx = ∫_{Rⁿ} (f_Y^{λ−1} f_X)^λ f_Y^{λ(1−λ)} dx ≤ (∫_{Rⁿ} f_Y^{λ−1} f_X dx)^λ (∫_{Rⁿ} f_Y^λ dx)^{1−λ}.

The inequality for λ = 1 follows by taking the limit λ → 1. The equality conditions for λ ≠ 1 follow from the equality conditions of the Hölder inequality. The inequality for λ = 1, including the equality condition, follows from the Jensen inequality (details may be found, for example, on page 234 in [4]).

C. Generalized Gaussians

We call the extremal random vectors for the moment-entropy inequalities generalized Gaussians and recall their definition here.

Given t ∈ R, let t₊ = max(t, 0). Let

  Γ(t) = ∫₀^∞ x^{t−1} e^{−x} dx

denote the Gamma function, and let

  β(a, b) = Γ(a)Γ(b)/Γ(a + b)

denote the Beta function.

For each p ∈ (0, ∞) and λ ∈ (n/(n + p), ∞), define the standard generalized Gaussian to be the random vector Z in Rⁿ whose density function f_Z : Rⁿ → [0, ∞) is given by

  f_Z(x) = a_{p,λ} (1 + (1 − λ)|x|^p)₊^{1/(λ−1)}  if λ ≠ 1,
  f_Z(x) = a_{p,1} e^{−|x|^p}  if λ = 1,    (5)

where

  a_{p,λ} = p(1 − λ)^{n/p} / (n ω_n β(n/p, 1/(1 − λ) − n/p))  if λ < 1,
  a_{p,λ} = p / (n ω_n Γ(n/p))  if λ = 1,
  a_{p,λ} = p(λ − 1)^{n/p} / (n ω_n β(n/p, λ/(λ − 1)))  if λ > 1.

Any random vector Y in Rⁿ that can be written as Y = AZ, for some A ∈ S, is called a generalized Gaussian.

D. Information measures of generalized Gaussians

If 0 < p < ∞ and λ > n/(n + p), the λ-Rényi entropy power of the standard generalized Gaussian random vector Z is given by

  N_λ[Z] = (1 + n(λ − 1)/(pλ))^{1/(λ−1)} a_{p,λ}^{−1}  if λ ≠ 1,
  N₁[Z] = e^{n/p} a_{p,1}^{−1}  if λ = 1.

If 0 < p < ∞ and λ > n/(n + p), then the p-th moment of Z is given by

  E[|Z|^p] = [λ(1 + p/n) − 1]^{−1}.

If λ ≠ 1, then by (9) and (5), (1), (11), and (7),

  ∫_{Rⁿ} f_Y^{λ−1} f_X dx ≥ a_{p,λ}^{λ−1} t^{n(1−λ)} + (1 − λ) a_{p,λ}^{λ−1} t^{n(1−λ)−p} ∫_{Rⁿ} |x|^p f_X(x) dx
    = a_{p,λ}^{λ−1} t^{n(1−λ)} (1 + (1 − λ) t^{−p} E[|X|^p])
    = a_{p,λ}^{λ−1} t^{n(1−λ)} (1 + (1 − λ) E[|Z|^p])
    = t^{n(1−λ)} ∫_{Rⁿ} f_Z^λ dx,    (12)

where equality holds if λ < 1. It follows that if λ ≠ 1, then by Lemma 2, (4), (10) and (12), and (11), we have
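Theorem 1 characterizes A through the fixed-point condition I = E[AX ⊗ AX |AX|^{p−2}], which suggests a simple numerical scheme. The sketch below is not from the paper: the function names, the renormalizing iteration, and its convergence are all our own assumptions, offered only as an illustration of the characterization. For p = 2 it recovers A = (E[X ⊗ X])^{−1/2}, so that M₂[X] = A^{−2} = E[X ⊗ X] as in the text.

```python
import numpy as np

def inv_sqrtm(M):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.T

def moment_matrix_A(X, p, iters=500, tol=1e-12):
    """Estimate the matrix A of Theorem 1 from samples (rows of X) by
    iterating on the fixed-point condition I = E[(AX)(AX)^T |AX|^(p-2)].
    Heuristic sketch: convergence is assumed here, not proved."""
    N, n = X.shape
    A = np.eye(n)
    for _ in range(iters):
        Y = X @ A.T                                  # rows are A x_i
        r = np.linalg.norm(Y, axis=1)
        M = (Y * r[:, None] ** (p - 2)).T @ Y / N    # empirical E[(AX)(AX)^T |AX|^(p-2)]
        if np.linalg.norm(M - np.eye(n)) < tol:
            break
        A = inv_sqrtm(M) @ A                         # push the moment matrix toward I
    # |x|_A depends only on A^T A, so return the symmetric positive definite part
    w, V = np.linalg.eigh(A.T @ A)
    return (V * np.sqrt(w)) @ V.T
```

Note that at the fixed point, taking traces gives E[|X|_A^p] = tr I = n automatically, which is exactly the normalization appearing in the theorem; the p-th moment matrix is then M_p[X] = A^{−p}.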

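The closed-form expressions for the generalized Gaussian can be sanity-checked numerically. The sketch below is an illustration, not part of the paper (all function names are ours): it evaluates the normalizing constant a_{p,λ} of (5), then integrates the radial profile of f_Z in dimension n = 2 to confirm that f_Z has total mass 1, that E[|Z|^p] = [λ(1 + p/n) − 1]^{−1}, and that the stated formula for N_λ[Z] matches direct integration of f_Z^λ.

```python
import numpy as np
from math import gamma, pi, exp

def beta_fn(a, b):
    """Beta function via the Gamma function."""
    return gamma(a) * gamma(b) / gamma(a + b)

def omega(n):
    """Volume omega_n of the unit ball in R^n."""
    return pi ** (n / 2) / gamma(n / 2 + 1)

def a_const(n, p, lam):
    """Normalizing constant a_{p,lambda} of the standard generalized Gaussian (5)."""
    c = n * omega(n)
    if lam < 1:
        return p * (1 - lam) ** (n / p) / (c * beta_fn(n / p, 1 / (1 - lam) - n / p))
    if lam == 1:
        return p / (c * gamma(n / p))
    return p * (lam - 1) ** (n / p) / (c * beta_fn(n / p, lam / (lam - 1)))

def f_Z(r, n, p, lam):
    """Density (5) as a function of the radius r = |x|."""
    a = a_const(n, p, lam)
    if lam == 1:
        return a * np.exp(-r ** p)
    return a * np.maximum(1.0 + (1.0 - lam) * r ** p, 0.0) ** (1.0 / (lam - 1.0))

def radial_integral(g, rmax=80.0, m=800000):
    """Midpoint-rule integral of g over [0, rmax]."""
    dr = rmax / m
    r = (np.arange(m) + 0.5) * dr
    return float(np.sum(g(r)) * dr)

def total_mass_and_moment(n, p, lam):
    s = n * omega(n)   # surface-area factor n * omega_n of the unit sphere
    mass = radial_integral(lambda r: s * r ** (n - 1) * f_Z(r, n, p, lam))
    mom = radial_integral(lambda r: s * r ** (n - 1 + p) * f_Z(r, n, p, lam))
    return mass, mom

def renyi_power_closed_form(n, p, lam):
    """N_lambda[Z] as stated in Section III-D."""
    a = a_const(n, p, lam)
    if lam == 1:
        return exp(n / p) / a
    return (1 + n * (lam - 1) / (p * lam)) ** (1 / (lam - 1)) / a

def renyi_power_numeric(n, p, lam):
    """N_lambda[Z] by direct integration of f_Z^lambda (lam != 1)."""
    s = n * omega(n)
    I = radial_integral(lambda r: s * r ** (n - 1) * f_Z(r, n, p, lam) ** lam)
    return I ** (1 / (1 - lam))
```

For n = p = 2 and λ ∈ {0.8, 1, 2}, the computed mass is 1 and the p-th moment agrees with [λ(1 + p/n) − 1]^{−1}, i.e. 5/3, 1, and 1/3 respectively.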
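Lemma 2 asserts that N_λ[X, Y] ≥ 1, i.e. h_λ[X, Y] ≥ 0, with equality exactly when X = Y. The following quick numerical illustration of formula (4) is our own sketch; the example densities (one-dimensional Gaussians of different widths) are arbitrary choices, not taken from the paper.

```python
import numpy as np

def relative_renyi_power(fX, fY, x, lam):
    """Relative lambda-Renyi entropy power N_lambda[X, Y] of (4), lam != 1,
    computed by numerical integration of densities sampled on a uniform grid x."""
    dx = x[1] - x[0]
    A = np.sum(fY ** (lam - 1) * fX) * dx    # integral of f_Y^(lam-1) f_X
    B = np.sum(fY ** lam) * dx               # integral of f_Y^lam
    C = np.sum(fX ** lam) * dx               # integral of f_X^lam
    return A ** (1 / (1 - lam)) * B ** (1 / lam) / C ** (1 / (lam * (1 - lam)))

def gaussian_pdf(x, sigma):
    return np.exp(-x ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-12.0, 12.0, 240001)
f1 = gaussian_pdf(x, 1.0)
f2 = gaussian_pdf(x, 1.5)

n_eq = relative_renyi_power(f1, f1, x, 2.0)    # X = Y: equals 1
n_neq = relative_renyi_power(f1, f2, x, 2.0)   # X != Y: strictly greater than 1 (about 1.04)
```

The equal-density case returns 1 for every λ, matching the equality condition of the lemma, while distinct densities give a value strictly above 1 for λ on either side of 1.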