New connections between the entropy power inequality and geometric inequalities

Arnaud Marsiglietti¹
Center for the Mathematics of Information
California Institute of Technology
Pasadena, CA 91125, USA
[email protected]

Victoria Kostina²
Department of Electrical Engineering
California Institute of Technology
Pasadena, CA 91125, USA
[email protected]

¹ Supported by the Walter S. Baer and Jeri Weiss CMI Postdoctoral Fellowship.
² Supported in part by the National Science Foundation (NSF) under Grant CCF-1566567.

Abstract—The entropy power inequality (EPI) has a fundamental role in information theory, and has deep connections with famous geometric inequalities. In particular, it is often compared to the Brunn-Minkowski inequality in convex geometry. In this article, we further strengthen the relationships between the EPI and geometric inequalities. Specifically, we establish an equivalence between a strong form of reverse EPI and the hyperplane conjecture, a long-standing conjecture in high-dimensional convex geometry. We also provide a simple proof of the hyperplane conjecture for a certain class of distributions, as a straightforward consequence of the EPI.

Keywords. Entropy power inequality, reverse entropy power inequality, hyperplane conjecture.

I. INTRODUCTION

Since the pioneering work of Costa and Cover [9], and Dembo, Cover and Thomas [13], several relationships have been developed between information theory and convex geometry.

We explore in this article an intimate connection between the entropy power inequality (EPI), fundamental in information theory, and the hyperplane conjecture, a major unsolved problem in high-dimensional geometry.

Let us recall that the entropy power inequality of Shannon [26] states that for all independent continuous random vectors X, Y in R^n,

    N(X + Y) ≥ N(X) + N(Y),    (1)

where N(X) ≜ (1/(2πe)) e^{(2/n) h(X)} denotes the entropy power of X. Furthermore, the hyperplane conjecture, raised by Bourgain [8], can be stated as follows.

Conjecture 1 (Hyperplane Conjecture [8]). There exists a universal constant c > 0 such that for every n ≥ 1, for every convex body K ⊂ R^n of volume 1, there exists a hyperplane H such that

    Vol_{n−1}(K ∩ H) ≥ c,    (2)

where Vol_{n−1}(·) denotes the (n − 1)-dimensional volume.

There are several equivalent formulations of the hyperplane conjecture, mainly expressed in geometric or probabilistic language. One of particular interest is an information-theoretic formulation developed by Bobkov and Madiman [4]. Let us recall that a random vector X in R^n is isotropic if X is centered and if K_X = E[XX^T], the covariance matrix of X, is the identity matrix I_n, and that X is log-concave if X ∼ f_X with log(f_X) concave.

Conjecture 2 (Entropic Hyperplane Conjecture [4]). There exists a universal constant c > 0 such that for every n ≥ 1, for every isotropic log-concave random vector X in R^n,

    D(X||Z) ≤ cn,    (3)

where D(X||Z) denotes the relative entropy between X and the standard Gaussian Z ∼ N(0, I_n).

Although the isotropicity condition is not assumed in the formulation of [4], the above formulation of Conjecture 2 is actually equivalent. This is because the quantity

    D(X||Z_X) = h(Z_X) − h(X),    (4)

where Z_X ∼ N(0, K_X), is affine invariant, and we can always find for arbitrary X an affine transformation T such that T(X) is isotropic. Hence, we can assume without loss of generality that we work with isotropic random vectors.

Let us briefly explain the equivalence between Conjectures 1 and 2. The isotropic constant of a random vector X ∼ f_X in R^n is defined as

    L_X ≜ ||f_X||_∞^{1/n} |K_X|^{1/(2n)},    (5)

where ||f_X||_∞ denotes the essential supremum of f_X, and |K_X| denotes the determinant of K_X. A convex body K is isotropic if K is centered, Vol(K) = 1, and its covariance matrix (∫_K x_i x_j dx)_{1≤i,j≤n} is a multiple of the identity.
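As a concrete illustration of the quantities appearing in (3)-(5), the following minimal numerical sketch (ours, not part of the paper; it assumes only NumPy and uses the isotropic uniform cube as a hypothetical example) evaluates the isotropic constant and the relative entropy to the standard Gaussian in closed form:

```python
import numpy as np

# Illustration only: the isotropic uniform cube X ~ Unif([-sqrt(3), sqrt(3)]^n)
# is centered with covariance K_X = I_n, and f_X = (2*sqrt(3))^(-n) on its support.
L_X = (2.0 * np.sqrt(3.0)) ** (-1.0)   # ||f_X||_inf^{1/n} |K_X|^{1/(2n)}, as in (5)

for n in (1, 10, 100):
    h_X = n * np.log(2.0 * np.sqrt(3.0))          # h(X) for the cube (nats)
    h_Z = 0.5 * n * np.log(2.0 * np.pi * np.e)    # h(Z), Z ~ N(0, I_n)
    D = h_Z - h_X                                 # D(X||Z) = h(Z) - h(X), cf. (4)
    print(n, round(L_X, 4), round(D / n, 4))
# L_X is dimension-free and D(X||Z)/n is a constant (~0.176) for every n,
# consistent with the linear-in-n bound asserted by Conjecture 2.
```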
Hensley [15] proved that there exist universal constants d1, d2 > 0 such that for any isotropic convex body K, for any hyperplane H passing through the origin,

    d1 ≤ L_K Vol_{n−1}(K ∩ H) ≤ d2.    (6)

Here, L_K ≜ L_X with X uniformly distributed on K. Hence, lower bounding Vol_{n−1}(K ∩ H) by a universal constant (Conjecture 1) is equivalent to upper bounding L_K by a universal constant. Ball [1] proved that upper bounding L_K by a universal constant for any isotropic convex body K is equivalent to upper bounding L_X by a universal constant for any isotropic log-concave random vector X. Finally, Bobkov and Madiman [4] proved that the Rényi entropies of a log-concave random vector X in R^n are comparable,

    h_∞(X) ≤ h(X) ≤ h_∞(X) + n,    (7)

where h_∞(X) ≜ −log(||f_X||_∞) is the ∞-Rényi entropy of X. Using (7), they [4] showed that upper bounding L_X by a universal constant for any isotropic log-concave random vector X is equivalent to Conjecture 2.

Information-theoretic formulations have attracted increasing interest over the past few years, mainly due to the simplicity of the formulations and of the ensuing proofs. For example, an information-theoretic argument easily yields the monotonicity of entropy (see [28], [22], [10]), properties of the maximal correlation between sums of random variables (see [17]), and important functional inequalities in mathematics (see [27]). Furthermore, the validity of the hyperplane conjecture has several applications to information theory. For example, bounds can be obtained for the entropy rate of log-concave random processes (see [5]), as well as for the capacity and rate-distortion function under log-concavity assumptions (see [24]).

The goal of this article is to develop a new information-theoretic formulation of the hyperplane conjecture (Section II), and to show that the EPI immediately implies the hyperplane conjecture for a class of random vectors more general than the class of (c1, c2)-regular distributions introduced by Polyanskiy and Wu [25], hence simplifying and extending a result of Dörpinghaus [14], who implicitly showed that Conjecture 2 holds for isotropic log-concave (c1, c2)-regular distributions, as long as c1 ≤ c and c2 ≤ c√n for some universal constant c > 0 (Sections III and IV).

II. EQUIVALENCE BETWEEN REVERSE ENTROPY POWER INEQUALITY AND HYPERPLANE CONJECTURE

In this section, we establish a new information-theoretic formulation of the hyperplane conjecture as a reverse form of the EPI. We introduce the following conjecture.

Conjecture 3. There exists a universal constant c′ > 0 such that for every n ≥ 1, for every k ≥ 1, for i.i.d. isotropic log-concave random vectors X1, ..., Xk in R^n,

    N(X1 + ··· + Xk) ≤ c′ (N(X1) + ··· + N(Xk)) = c′ k N(X1).    (8)

Theorem 1. Conjectures 2 and 3 are equivalent.

Proof. Assume first that Conjecture 3 is true. Let X1, ..., Xk be i.i.d. isotropic log-concave random vectors. Then,

    N(X1 + ··· + Xk) ≤ c′ k N(X1).    (9)

It follows that

    N((X1 + ··· + Xk)/√k) ≤ c′ N(X1).    (10)

By letting k go to +∞, and using the entropic central limit theorem developed by Barron [3] (valid in any dimension, see [16]), we obtain

    N(Z) ≤ c′ N(X1),    (11)

where Z is standard Gaussian. Hence,

    1 ≤ c′ N(X1).    (12)

To conclude, we note that

    1 ≤ c′ N(X1)  ⟺  (n/2) log(2πe) ≤ (n/2) log(c′) + h(X1)  ⟺  D(X1||Z) ≤ cn,    (13)

where c = log(c′)/2, which is the hyperplane conjecture in the equivalent version of Conjecture 2.

Now, assume that Conjecture 2 is true. Let k ≥ 1, and let X1, ..., Xk be i.i.d. isotropic log-concave random vectors. Since the Xi's are i.i.d. and isotropic, we have

    |K_{X1+···+Xk}|^{1/n} = |K_{X1} + ··· + K_{Xk}|^{1/n} = k.    (14)

It is well known that Gaussian distributions maximize entropy under a covariance matrix constraint (see, e.g., [16]), hence we have the following upper bound for any X with finite covariance matrix K_X,

    h(X) ≤ h(Z_X) = (n/2) log(2πe |K_X|^{1/n}).    (15)

Thus, using (14),

    h(X1 + ··· + Xk) ≤ (n/2) log(2πe |K_{X1+···+Xk}|^{1/n}) = (n/2) log(2πe) + (n/2) log(k).    (16)

Using (4), Conjecture 2 tells us that

    (n/2) log(2πe) ≤ h(X1) + cn,    (17)

hence

    h(X1 + ··· + Xk) ≤ h(X1) + cn + (n/2) log(k).    (18)

We conclude that

    N(X1 + ··· + Xk) ≤ c′ k N(X1),    (19)

where c′ = e^{2c}.
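The limit step (10)-(11) can be visualized numerically. The sketch below (our illustration, not from the paper; it assumes NumPy and uses the unit-variance Laplace density as a hypothetical one-dimensional isotropic log-concave example) estimates the entropy power of the standardized sum by discretizing the density and convolving, and shows it increasing towards N(Z) = 1, as guaranteed by Barron's entropic central limit theorem [3]:

```python
import numpy as np

# Illustration only: entropy power of (X_1 + ... + X_k)/sqrt(k) for i.i.d.
# unit-variance Laplace X_i, estimated by numerical convolution on a grid.
dx = 0.005
x = np.arange(-20.0, 20.0, dx)
f = np.exp(-np.sqrt(2.0) * np.abs(x)) / np.sqrt(2.0)   # centered, variance 1

def differential_entropy(p, dx):
    # Riemann-sum estimate of h = -integral p log p (nats).
    p = p[p > 1e-300]
    return -np.sum(p * np.log(p)) * dx

def entropy_power(h, n=1):
    # N = exp(2h/n) / (2*pi*e), as defined after (1).
    return np.exp(2.0 * h / n) / (2.0 * np.pi * np.e)

p = f.copy()
for k in range(1, 7):
    if k > 1:
        p = np.convolve(p, f) * dx   # density of X_1 + ... + X_k (same grid spacing)
    h_std = differential_entropy(p, dx) - 0.5 * np.log(k)   # h of the sum over sqrt(k)
    print(k, round(entropy_power(h_std), 4))
# The printed entropy powers increase from ~0.865 (k = 1) towards N(Z) = 1.
```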
Several reverse entropy power inequalities exist in the literature (see, e.g., [12], [6], [2], [11]), but none of them solves Conjecture 3. For example, Bobkov and Madiman [6] established that for all independent log-concave random vectors X, Y in R^n, there exist linear volume-preserving maps u, v such that

    N(u(X) + v(Y)) ≤ c (N(X) + N(Y)),    (20)

where c > 0 is a universal constant. By induction, one deduces the existence of a universal constant c > 0 such that for every k ≥ 1, for all independent log-concave random vectors X1, ..., Xk, there exist volume-preserving maps u1, ..., uk such that

    N(u1(X1) + ··· + uk(Xk)) ≤ c^{k−1} (N(X1) + ··· + N(Xk)).    (21)

Inequality (21) yields a dependence on k which is exponential, while Conjecture 3 asks for a linear dependence on k. However, inequality (21) is valid for independent non-identically distributed log-concave random vectors, while Conjecture 3 is restricted to i.i.d. isotropic log-concave random vectors. In the i.i.d. case, a slower growth with k can be demonstrated. For example, the following reverse EPI, due to Cover and Zhang [12] (see also [29], [23]), holds for i.i.d. log-concave random vectors Xi,

    N(X1 + ··· + Xk) ≤ k² N(X1).    (22)

Currently, the best known bound on the hyperplane conjecture, due to Klartag [19] (see also [21]), is equivalent (via Theorem 1) to the following reverse EPI,

    N(X1 + ··· + Xk) ≤ c√n k N(X1),    (23)

which holds for i.i.d. log-concave random vectors Xi, and strengthens (22) when k is large. Conjecture 3 asks whether k² in (22) and c√n k in (23) can be replaced with ck, for some universal constant c > 0.
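As a sanity check on both directions of the picture, the following minimal sketch (our illustration, not from the paper; it assumes only NumPy and uses two i.i.d. uniform random variables as a hypothetical example, with the standard closed-form entropies of the uniform and triangular densities) verifies the EPI (1) and the Cover-Zhang reverse EPI (22) for k = 2:

```python
import numpy as np

def entropy_power(h, n=1):
    # N(X) = exp(2h/n) / (2*pi*e), as defined after (1).
    return np.exp(2.0 * h / n) / (2.0 * np.pi * np.e)

# X1, X2 i.i.d. Uniform[0, 1]:  h(X1) = 0 (nats).
# X1 + X2 has the triangular density on [0, 2]:  h(X1 + X2) = 1/2.
h_X1, h_sum, k = 0.0, 0.5, 2

N_X1 = entropy_power(h_X1)      # ~0.0585
N_sum = entropy_power(h_sum)    # ~0.1592

print(N_sum >= k * N_X1)        # True: EPI (1), N(X1+X2) >= N(X1) + N(X2)
print(N_sum <= k**2 * N_X1)     # True: reverse EPI (22) with k = 2
```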
III. A SIMPLE PROOF FOR THE ENTROPIC HYPERPLANE CONJECTURE FOR (c1, c2)-REGULAR RANDOM VECTORS

In what follows, X denotes an isotropic random vector in R^n (not necessarily log-concave). Following the terminology of [25], we say that X ∼ f_X is (c1, c2)-regular if

    ||∇ log f_X(x)||_2 ≤ c1 ||x||_2 + c2,  ∀x ∈ R^n,    (24)

for some non-negative constants c1, c2 ≥ 0. Here, ||·||_2 denotes the Euclidean norm in R^n.

As a straightforward consequence of the EPI, we will show that Conjecture 2 holds whenever X is (c1, c2)-regular, as long as c1 ≤ c and c2 ≤ c√n for some universal constant c > 0.

Proof. It is well known that the EPI implies the following inequality, sometimes referred to as the isoperimetric inequality for entropies, which is equivalent to the log-Sobolev inequality,

    N(X) I(X) ≥ N(Z) I(Z) = n,    (25)

where Z is standard Gaussian (see, e.g., [13]). Here,

    I(X) ≜ E[ ||∇ log f_X(X)||_2² ]    (26)

denotes the Fisher information of X. Hence,

    I(X)/n = I(X)/I(Z) ≥ N(Z)/N(X) = e^{(2/n) D(X||Z)}.    (27)

Taking the logarithm, we deduce that

    D(X||Z) ≤ (n/2) log( I(X)/n ).    (28)

In view of (28), to establish (3) it is enough to show that

    I(X) ≤ c′ n,    (29)

for some universal constant c′ > 0. But since X is (c1, c2)-regular, we have

    ||∇ log f_X(x)||_2² ≤ c1² ||x||_2² + 2 c1 c2 ||x||_2 + c2².    (30)

Hence,

    I(X) = E[ ||∇ log f_X(X)||_2² ] ≤ c1² E[ ||X||_2² ] + 2 c1 c2 E[ ||X||_2 ] + c2².    (31)

We have, by Jensen's inequality and the isotropicity of X,

    E[ ||X||_2 ] ≤ √( E[ ||X||_2² ] ) = √n.    (32)

As c1 ≤ c and c2 ≤ c√n for some universal constant c > 0 by assumption, we conclude that

    I(X) ≤ c′ n,    (33)

for some universal constant c′ > 0.

Remark 1. As shown in [25], random vectors of the form X + Z_{σ²}, where X ⊥⊥ Z_{σ²} ∼ N(0, σ² I_n), are (c1, c2)-regular, with c1 = 3/σ² and c2 = (4/σ²) E[ ||X||_2 ]. In particular, the entropic hyperplane conjecture holds for the random vector (X + Z)/√2, where Z is standard Gaussian, and X is an isotropic random vector independent of Z.

Remark 2. We do not know whether the assumption of (c1, c2)-regularity combined with isotropicity necessarily implies that c1 ≤ c and c2 ≤ c√n, for some universal constant c > 0.

IV. THE ENTROPIC HYPERPLANE CONJECTURE FOR (c1||x||_p^q + c2)-REGULAR LOG-CONCAVE RANDOM VECTORS

Let p > 1 and q > 0. Following the terminology of [20], we say that a random vector X ∼ f_X in R^n is (c1||x||_p^q + c2)-regular if for almost all x ∈ R^n,

    ||∇ log f_X(x)||_2 ≤ c1 ||x||_p^q + c2,    (34)

for some non-negative constants c1, c2 ≥ 0. Here,

    ||x||_p ≜ ( Σ_{i=1}^n |x_i|^p )^{1/p}.    (35)

Note that (c1, c2)-regular distributions correspond to (c1||x||_2 + c2)-regular distributions. A typical example is the distribution with density C e^{−||x||_p^q / q}, p, q > 1, where C > 0 is the normalizing constant, which is ||x||_{min{p, 2(p−1)}}^{q−1}-regular.

Theorem 2. Let p ≥ 1 be independent of the dimension n. The entropic hyperplane conjecture holds for (c1||x||_p^q + c2)-regular isotropic log-concave random vectors in R^n, whenever q ≤ p/2, c1 ≤ c, and c2 ≤ c√n, for some universal constant c > 0.

In order to prove Theorem 2, we need the following reverse Jensen inequality, valid for log-concave random variables.

Lemma 1 ([18], [7]). Let X be a centered log-concave random variable. Then, for every 1 ≤ q ≤ r < +∞,

    E[ |X|^r ]^{1/r} ≤ C(q, r) E[ |X|^q ]^{1/q},    (36)

where C(q, r) is a constant depending on q and r only. Moreover, one may take

    C(q, r) = 2 Γ(r + 1)^{1/r} / Γ(q + 1)^{1/q}.    (37)

Proof of Theorem 2. As seen in Section III, it is enough to show that

    I(X) ≤ c′ n,    (38)

for some universal constant c′ > 0. Since X is (c1||x||_p^q + c2)-regular, we deduce that

    I(X) ≤ c1² E[ ||X||_p^{2q} ] + 2 c1 c2 E[ ||X||_p^q ] + c2².    (39)

Using Jensen's inequality, we have

    E[ ||X||_p^q ] ≤ E[ ||X||_p^p ]^{q/p},    (40)
    E[ ||X||_p^{2q} ] ≤ E[ ||X||_p^p ]^{2q/p}.    (41)

On the other hand, using the reverse Jensen inequality (36), we have

    E[ ||X||_p^p ] = Σ_{i=1}^n E[ |X_i|^p ] ≤ C(p) Σ_{i=1}^n E[ |X_i|² ]^{p/2},    (42)

for some constant C(p) depending on p only. Since X is isotropic, we have for every i ∈ {1, ..., n},

    E[ |X_i|² ] = 1.    (43)

Hence,

    E[ ||X||_p^p ] ≤ C(p) n.    (44)

As p is independent of n, 2q ≤ p, c1 ≤ c, and c2 ≤ c√n, for some universal constant c > 0, we conclude that

    I(X) ≤ c1² C(p)^{2q/p} n^{2q/p} + 2 c1 c2 C(p)^{q/p} n^{q/p} + c2² ≤ c′ n,    (45)

for some universal constant c′ > 0.
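Lemma 1 is the only distribution-dependent ingredient behind (42). The following minimal sketch (our illustration, not from the paper; it assumes SciPy for the Gamma function and uses a centered unit-variance Laplace random variable as a hypothetical example, whose absolute moments are E|X|^s = Γ(s+1)/2^{s/2}) checks (36)-(37) numerically:

```python
import numpy as np
from scipy.special import gamma

def abs_moment_laplace(s):
    # E|X|^s for the centered, unit-variance Laplace distribution.
    return gamma(s + 1.0) / 2.0 ** (s / 2.0)

def C(q, r):
    # The constant in (37): C(q, r) = 2 Gamma(r+1)^{1/r} / Gamma(q+1)^{1/q}.
    return 2.0 * gamma(r + 1.0) ** (1.0 / r) / gamma(q + 1.0) ** (1.0 / q)

for q, r in [(1.0, 2.0), (2.0, 4.0), (2.0, 6.0), (3.0, 5.0)]:
    lhs = abs_moment_laplace(r) ** (1.0 / r)
    rhs = C(q, r) * abs_moment_laplace(q) ** (1.0 / q)
    print((q, r), lhs <= rhs)   # True in every case, as asserted by Lemma 1
```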
We end this section with a criterion for regularity, which extends [25, Proposition 2], applicable to convolutions of (c1, c2)-regular random vectors with Gaussians.

Proposition 1. Let V = X + Z, where X ⊥⊥ Z, E[ ||X||_r ] < ∞, and Z has density

    f_Z(z) = c_{r,d}^n e^{−||z||_r^r / (rd)},  z ∈ R^n,    (46)

with r ≥ 2, where d = E[ ||Z||_r^r ] / n, and c_{r,d} is the normalizing constant. Then, V is (c1 ||x||_r^{r−1} + c2)-regular with

    c1 = 3·2^{r−2} / d,   c2 = (2^{r−1} (1 + 2^{r−2}) / d) E[ ||X||_r ]^{r−1}.    (47)

Proof. We follow the proof of [25, Proposition 2]. Let f_V be the density of V. One has

    ∇ log f_V(v) = E[ ∇ log f_Z(v − X) | V = v ].    (48)

Hence,

    ||∇ log f_V(v)||_2 ≤ E[ ||∇ log f_Z(v − X)||_2 | V = v ] = (1/d) E[ ||v − X||_{2(r−1)}^{r−1} | V = v ].    (49)

Since r ≥ 2, one has ||v − x||_{2(r−1)} ≤ ||v − x||_r. Thus

    ||∇ log f_V(v)||_2 ≤ (1/d) E[ ||v − X||_r^{r−1} | V = v ].    (50)

Let us denote

    a(X, v) ≜ f_Z(v − X) / f_V(v).    (51)

We have

    E[ ||v − X||_r^{r−1} | V = v ] = E[ ||v − X||_r^{r−1} a(X, v) ]
        ≤ 2 E[ ||v − X||_r^{r−1} 1_{{a(X,v) ≤ 2}} ] + E[ ||v − X||_r^{r−1} a(X, v) 1_{{a(X,v) > 2}} ].    (52)

Note that

    {a(x, v) > 2} = { ||v − x||_r^r < rd log( c_{r,d}^n / (2 f_V(v)) ) }.    (53)

Hence,

    E[ ||v − X||_r^{r−1} a(X, v) 1_{{a(X,v) > 2}} ] ≤ [ rd log( c_{r,d}^n / (2 f_V(v)) ) ]_+^{(r−1)/r},    (54)

where [x]_+ ≜ max{0, x}. We have

    f_V(v) = E[ f_Z(v − X) ]
        ≥ E[ f_Z(v − X) 1_{{||X||_r ≤ 2 E[||X||_r]}} ]
        ≥ P[ ||X||_r ≤ 2 E[||X||_r] ] c_{r,d}^n e^{−(||v||_r + 2 E[||X||_r])^r / (rd)}
        ≥ (c_{r,d}^n / 2) e^{−(||v||_r + 2 E[||X||_r])^r / (rd)},    (55)

where the last inequality follows from Markov's inequality. We deduce that

    E[ ||v − X||_r^{r−1} a(X, v) 1_{{a(X,v) > 2}} ] ≤ ( ||v||_r + 2 E[||X||_r] )^{r−1}
        ≤ 2^{r−2} ( ||v||_r^{r−1} + 2^{r−1} E[||X||_r]^{r−1} ),

where the last inequality follows from the convexity of x ↦ x^{r−1}. We conclude that

    ||∇ log f_V(v)||_2 ≤ (2^{r−1}/d) ( ||v||_r^{r−1} + E[||X||_r]^{r−1} ) + (2^{r−2}/d) ( ||v||_r^{r−1} + 2^{r−1} E[||X||_r]^{r−1} )
        = (3·2^{r−2}/d) ||v||_r^{r−1} + (2^{r−1}(1 + 2^{r−2})/d) E[||X||_r]^{r−1}.    (56)
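For the Gaussian case r = 2 (so d = σ²), the constants in (47) reduce to c1 = 3/σ² and c2 = (4/σ²) E[||X||_2], matching Remark 1. The following minimal sketch (our illustration, not from the paper; it assumes NumPy and SciPy, and uses the hypothetical one-dimensional example X ∼ Unif[−1, 1] with V = X + σZ, whose density has a closed form in terms of the Gaussian CDF) checks the resulting regularity bound on a grid:

```python
import numpy as np
from scipy.stats import norm

sigma = 1.0
EX = 0.5                    # E|X| for X ~ Unif[-1, 1]
c1 = 3.0 / sigma**2         # constants from (47) with r = 2, d = sigma^2,
c2 = 4.0 * EX / sigma**2    # i.e., the constants of Remark 1

v = np.linspace(-8.0, 8.0, 1601)

# Closed-form density of V = X + sigma*Z for X ~ Unif[-1, 1]:
# f_V(v) = (Phi((v+1)/sigma) - Phi((v-1)/sigma)) / 2, and its score f_V'/f_V.
f_V = (norm.cdf((v + 1.0) / sigma) - norm.cdf((v - 1.0) / sigma)) / 2.0
df_V = (norm.pdf((v + 1.0) / sigma) - norm.pdf((v - 1.0) / sigma)) / (2.0 * sigma)
score = np.abs(df_V / f_V)

print(np.all(score <= c1 * np.abs(v) + c2))   # True: the regularity bound (24) holds
```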
REFERENCES

[1] K. Ball. Logarithmically concave functions and sections of convex sets in R^n. Studia Math. 88 (1988), no. 1, 69–84.
[2] K. Ball, P. Nayar, T. Tkocz. A reverse entropy power inequality for log-concave random vectors. Studia Math. 235 (2016), no. 1, 17–30.
[3] A. R. Barron. Entropy and the central limit theorem. Ann. Probab. 14 (1986), no. 1, 336–342.
[4] S. G. Bobkov, M. Madiman. Entropy and the hyperplane conjecture in convex geometry. Proceedings of the 2010 IEEE International Symposium on Information Theory, Austin, Texas, 2010, pp. 1438–1442.
[5] S. G. Bobkov, M. Madiman. The entropy per coordinate of a random vector is highly constrained under convexity conditions. IEEE Trans. Inform. Theory 57 (2011), no. 8, 4940–4954.
[6] S. Bobkov, M. Madiman. Reverse Brunn-Minkowski and reverse entropy power inequalities for convex measures. J. Funct. Anal. 262 (2012), no. 7, 3309–3339.
[7] C. Borell. Complements of Lyapunov's inequality. Math. Ann. 205 (1973), 323–331.
[8] J. Bourgain. On high-dimensional maximal functions associated to convex bodies. Amer. J. Math. 108 (1986), no. 6, 1467–1476.
[9] M. H. M. Costa and T. M. Cover. On the similarity of the entropy power inequality and the Brunn-Minkowski inequality. IEEE Trans. Inform. Theory 30 (1984), no. 6, 837–839.
[10] T. A. Courtade. Monotonicity of entropy and Fisher information: a quick proof via maximal correlation. Commun. Inf. Syst. 16 (2016), no. 2, 111–115.
[11] T. A. Courtade. Links between the logarithmic Sobolev inequality and the convolution inequalities for entropy and Fisher information. Preprint, arXiv:1608.05431.
[12] T. M. Cover, Z. Zhang. On the maximum entropy of the sum of two dependent random variables. IEEE Trans. Inform. Theory 40 (1994), no. 4, 1244–1246.
[13] A. Dembo, T. M. Cover, and J. A. Thomas. Information-theoretic inequalities. IEEE Trans. Inform. Theory 37 (1991), no. 6, 1501–1518.
[14] M. Dörpinghaus. A lower bound on the entropy rate for a large class of stationary processes and its relation to the hyperplane conjecture. Proc. IEEE Information Theory Workshop (ITW) 2016, Cambridge, UK, Sept. 2016.
[15] D. Hensley. Slicing convex bodies: bounds for slice area in terms of the body's covariance. Proc. Amer. Math. Soc. 79 (1980), 619–625.
[16] O. Johnson. Information theory and the Central Limit Theorem. Imperial College Press, London, 2004.
[17] S. Kamath and C. Nair. The strong data processing constant for sums of iid random variables. Proceedings of the 2015 IEEE International Symposium on Information Theory, Hong Kong, June 2015.
[18] S. Karlin, F. Proschan, and R. E. Barlow. Moment inequalities of Pólya frequency functions. Pacific J. Math. 11 (1961), no. 3, 1023–1033.
[19] B. Klartag. On convex perturbations with a bounded isotropic constant. Geom. Funct. Anal. 16 (2006), no. 6, 1274–1290.
[20] V. Kostina. Data compression with low distortion and finite blocklength. IEEE Trans. Inform. Theory 63 (2017), no. 7, 4268–4285.
[21] Y. T. Lee, S. Vempala. Eldan's stochastic localization and the KLS hyperplane conjecture: an improved lower bound for expansion. Preprint, arXiv:1612.01507.
[22] M. Madiman and A. R. Barron. The monotonicity of information in the central limit theorem and entropy power inequalities. Proc. IEEE International Symposium on Information Theory, pp. 1021–1025, Seattle, July 2006.
[23] M. Madiman and I. Kontoyiannis. Entropy bounds on abelian groups and the Ruzsa divergence. Preprint, arXiv:1508.04089.
[24] A. Marsiglietti, V. Kostina. A lower bound on the differential entropy of log-concave random vectors with applications. Entropy 20 (2018), no. 3, 185.
[25] Y. Polyanskiy, Y. Wu. Wasserstein continuity of entropy and outer bounds for interference channels. IEEE Trans. Inform. Theory 62 (2016), no. 7, 3992–4002.
[26] C. E. Shannon. A mathematical theory of communication. Bell System Tech. J. 27 (1948), 379–423, 623–656.
[27] G. Toscani. An information-theoretic proof of Nash's inequality. Atti Accad. Naz. Lincei Rend. Lincei Mat. Appl. 24 (2013), no. 1, 83–93.
[28] A. M. Tulino, S. Verdú. Monotonic decrease of the non-Gaussianness of the sum of independent random variables: a simple proof. IEEE Trans. Inform. Theory 52 (2006), no. 9, 4295–4297.
[29] Y. Yu. Letter to the editor: On an inequality of Karlin and Rinott concerning weighted sums of i.i.d. random variables. Adv. in Appl. Probab. 40 (2008), no. 4, 1223–1226.