Forty-Sixth Annual Allerton Conference, Allerton House, UIUC, Illinois, USA, September 23-26, 2008

Cayley's Hyperdeterminant, the Principal Minors of a Symmetric Matrix and the Entropy Region of 4 Gaussian Random Variables

Sormeh Shadbakht and Babak Hassibi
Electrical Engineering Department, California Institute of Technology, Pasadena, California 91125
Email: sormeh, [email protected]

This work was supported in part by the National Science Foundation through grant CCF-0729203, by the David and Lucille Packard Foundation, by the Office of Naval Research through a MURI under contract no. N00014-08-1-0747, and by Caltech's Lee Center for Advanced Networking.

Abstract— It has recently been shown that there is a connection between Cayley's hyperdeterminant and the principal minors of a symmetric matrix. With an eye towards characterizing the entropy region of jointly Gaussian random variables, we obtain three new results on the relationship between Gaussian random variables and the hyperdeterminant. The first is a new (determinant) formula for the 2 × 2 × 2 hyperdeterminant. The second is a new (transparent) proof of the fact that the principal minors of an n × n symmetric matrix satisfy the 2 × 2 × ... × 2 (n times) hyperdeterminant relations. The third is a minimal set of 5 equations that 15 real numbers must satisfy to be the principal minors of a 4 × 4 symmetric matrix.

I. INTRODUCTION

Let $X_1, \cdots, X_n$ be $n$ jointly distributed discrete random variables with arbitrary alphabet sizes. The vector of all the $2^n - 1$ joint entropies of these random variables is referred to as their "entropy vector", and conversely any $(2^n - 1)$-dimensional vector whose elements can be regarded as the joint entropies of some $n$ random variables, for some alphabet size $N$, is called "entropic". The entropy region is defined as the region of all possible entropic vectors and is denoted by $\Gamma_n^*$ [1]. Due to its deep connections with important problems in information theory and probabilistic reasoning, such as the capacity of information networks [2][3] or the conditional independence compatibility problem [4], characterizing this region is of fundamental importance. While it is completely solved for $n = 2, 3$ random variables, the complete characterization for $n \geq 4$ remains an interesting open problem.

The above discussion focused on discrete random variables; however, characterizing the entropy region of a collection of continuous random variables is just as important. In fact, it has been shown in [5] that there is a correspondence between continuous and discrete information inequalities, and therefore one region can be characterized from the other. Let $\mathcal{N} = \{1, \cdots, n\}$ and for any $\alpha \subseteq \mathcal{N}$ let $H_\alpha = H(X_i, i \in \alpha)$ (or $h_\alpha$ whenever the underlying probability distributions are continuous) be the joint entropies. A discrete information inequality of the form $\sum_\alpha a_\alpha H_\alpha \geq 0$ is called balanced if for all $i \in \mathcal{N}$, $\sum_{\alpha : i \in \alpha} a_\alpha = 0$. For example $H_1 + H_2 - H_{12} \geq 0$ is balanced and $H_1 \geq 0$ is not.

Theorem 1 (Discrete/continuous information inequalities) [5]:
1) A linear continuous information inequality $\sum_\alpha a_\alpha h_\alpha \geq 0$ is valid if and only if its discrete counterpart $\sum_\alpha a_\alpha H_\alpha \geq 0$ is balanced and valid.
2) A linear discrete information inequality $\sum_\alpha a_\alpha H_\alpha \geq 0$ is valid if and only if it can be written as $\sum_\alpha \beta_\alpha H_\alpha + \sum_{i=1}^{n} r_i \left( H_{i, i^c} - H_{i^c} \right) \geq 0$ for some $r_i \geq 0$, where $\sum_\alpha \beta_\alpha h_\alpha \geq 0$ is a valid continuous information inequality ($i^c$ denotes the complement of $i$ in $\mathcal{N}$).

Therefore one can study continuous random variables to determine $\Gamma_n^*$. Among all continuous random variables, Gaussians are the most natural ones to study first. In fact, it turns out that these distributions have interesting properties that make them even more desirable to study.

Let $X_1, \cdots, X_n \in \mathbb{R}^T$ be $n$ jointly distributed zero-mean vector-valued Gaussian random variables of dimension $T$, with covariance matrix $R \in \mathbb{R}^{nT \times nT}$. (Since differential entropy is invariant to shifts, there is no loss of generality in assuming zero means for the $X_i$.) Clearly, $R$ is symmetric, positive semi-definite, and consists of block matrices of size $T \times T$ (corresponding to each random variable). We will allow $T$ to be arbitrary and will therefore consider the normalized joint entropy of any subset $\alpha \subseteq \mathcal{N}$ of these random variables,

$$h_\alpha = \frac{1}{T} \cdot \frac{1}{2} \log \left( (2\pi e)^{T|\alpha|} \det R_\alpha \right), \qquad (1)$$

where $|\alpha|$ denotes the cardinality of the set $\alpha$ and $R_\alpha$ is the $|\alpha|T \times |\alpha|T$ matrix obtained by keeping those block rows and block columns of $R$ that are indexed by $\alpha$. Note that our normalization is by the dimensionality of the $X_i$, i.e., by $T$, and that we have used $h$ to denote normalized entropy.
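As an illustration of how (1) can be evaluated in practice, the following sketch (ours, not part of the paper; it assumes NumPy, and the function names and the example covariance are hypothetical) computes the normalized entropy vector of $n$ vector-valued Gaussian variables from their block covariance matrix.

```python
# Illustrative sketch (not from the paper): computing the normalized joint
# entropy h_alpha of eq. (1) for vector-valued Gaussian random variables.
import numpy as np
from itertools import combinations

def normalized_gaussian_entropy(R, alpha, T):
    """h_alpha = (1/T) * (1/2) * log((2*pi*e)^(T*|alpha|) * det(R_alpha)),
    where R_alpha keeps the T x T blocks of R indexed by the subset alpha."""
    idx = np.concatenate([np.arange(i * T, (i + 1) * T) for i in sorted(alpha)])
    R_alpha = R[np.ix_(idx, idx)]
    return (0.5 / T) * (T * len(alpha) * np.log(2 * np.pi * np.e)
                        + np.log(np.linalg.det(R_alpha)))

def gaussian_entropy_vector(R, n, T):
    """Return {alpha: h_alpha} over all nonempty subsets of {0, ..., n-1}."""
    subsets = [set(c) for k in range(1, n + 1)
               for c in combinations(range(n), k)]
    return {frozenset(a): normalized_gaussian_entropy(R, a, T) for a in subsets}

if __name__ == "__main__":
    n, T = 3, 2                            # 3 Gaussian vectors of dimension T = 2
    G = np.random.randn(n * T, n * T)
    R = G @ G.T + n * T * np.eye(n * T)    # random positive definite covariance
    for a, v in gaussian_entropy_vector(R, n, T).items():
        print(sorted(a), round(v, 4))
```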
Normalization has the following important consequence.

Theorem 2 (Convexity of the region for h): The closure of the region of normalized Gaussian entropy vectors is convex [6].

It further turns out that for $n = 2, 3$ random variables, vector-valued Gaussian random variables can be used to obtain the entire entropy region for continuous random variables [6].

Theorem 3 (Gaussians generate the entropy region for n = 2, 3): For two and three random variables, the cone generated by the space of vector-valued Gaussian entropy vectors is the entire entropy region for continuous random variables.

In an effort to characterize the entropy region of discrete random variables, some inner and outer bounds have been established, among which the Ingleton bound is the most well known. The Ingleton inequality was first discovered for the ranks of representable matroids [7]. In fact, let $v_1, \cdots, v_n$ be $n$ vector subspaces and $\mathcal{N} = \{1, \cdots, n\}$. Further, for $\alpha \subseteq \mathcal{N}$ let $r_\alpha$ be the rank function defined as the dimension of the subspace $\oplus_{i \in \alpha} v_i$.
Then for any subsets $\alpha_1, \alpha_2, \alpha_3, \alpha_4 \subseteq \mathcal{N}$, the Ingleton inequality is

$$r_{\alpha_1} + r_{\alpha_2} + r_{\alpha_1 \cup \alpha_2 \cup \alpha_3} + r_{\alpha_1 \cup \alpha_2 \cup \alpha_4} + r_{\alpha_3 \cup \alpha_4} - r_{\alpha_1 \cup \alpha_2} - r_{\alpha_1 \cup \alpha_3} - r_{\alpha_1 \cup \alpha_4} - r_{\alpha_2 \cup \alpha_3} - r_{\alpha_2 \cup \alpha_4} \leq 0. \qquad (2)$$

Although not all entropy vectors satisfy this inequality [8], it turns out that certain types of entropy vectors, in particular all the linearly representable ones (corresponding to linear codes over finite fields) and the abelian-group-characterizable ones, do, and hence fall into this inner bound. An important property of Gaussian random variables is that the entropy vector of 4 jointly Gaussian distributed random variables can be arranged so as to violate the Ingleton bound [6][9]; a sketch of how the Ingleton expression can be evaluated for Gaussian entropies follows below.
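The following sketch (again ours, assuming NumPy; the covariance used is an arbitrary example, not the Ingleton-violating construction of [6][9]) evaluates the left-hand side of (2) with the ranks replaced by scalar Gaussian joint entropies. A positive value would certify a violation of the Ingleton bound.

```python
# Illustrative sketch (ours): evaluating the Ingleton expression of (2)
# with ranks replaced by Gaussian joint entropies (scalar case, T = 1).
import numpy as np

def gaussian_h(R, alpha):
    """h_alpha = 0.5 * log((2*pi*e)^|alpha| * det(R_alpha)) for scalar
    jointly Gaussian variables with covariance R."""
    idx = sorted(alpha)
    R_a = R[np.ix_(idx, idx)]
    return 0.5 * (len(idx) * np.log(2 * np.pi * np.e)
                  + np.log(np.linalg.det(R_a)))

def ingleton_expression(h, a1, a2, a3, a4):
    """Left-hand side of (2): non-positive whenever Ingleton holds."""
    u = lambda *sets: frozenset().union(*sets)
    return (h(a1) + h(a2) + h(u(a1, a2, a3)) + h(u(a1, a2, a4)) + h(u(a3, a4))
            - h(u(a1, a2)) - h(u(a1, a3)) - h(u(a1, a4))
            - h(u(a2, a3)) - h(u(a2, a4)))

if __name__ == "__main__":
    G = np.random.randn(4, 4)
    R = G @ G.T + 4 * np.eye(4)             # arbitrary 4 x 4 covariance
    h = lambda a: gaussian_h(R, frozenset(a))
    val = ingleton_expression(h, {0}, {1}, {2}, {3})
    print("Ingleton expression:", val)       # > 0 would mean a violation
```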
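A. Cayley's Hyperdeterminant

Recall that the entropy of a collection of Gaussian random variables is simply the "log-determinant" of their covariance matrix. Similarly, the entropy of any subset of variables from a collection of Gaussian random variables is simply the "log" of the corresponding principal minor of the covariance matrix. Therefore one approach to characterizing the entropy region of Gaussians is to study the determinantal relations of a symmetric positive semi-definite matrix.

For example, consider 3 Gaussian random variables. While the entropy vector of 3 random variables is a 7-dimensional object, there are only 6 free parameters in a 3 × 3 symmetric positive semi-definite matrix. Therefore the minors must satisfy a relation. It has very recently been shown that this relation is given by Cayley's so-called 2 × 2 × 2 "hyperdeterminant" [10]. The hyperdeterminant is a generalization of the matrix determinant to tensors, first introduced by Cayley in 1845 [11].

There are a couple of equivalent definitions of the hyperdeterminant, among which we choose the definition through the degeneracy of a multilinear form. Consider the multilinear form

$$f(X_1, X_2, \ldots, X_n) = \sum_{i_1, \ldots, i_n} a_{i_1, i_2, \ldots, i_n} \, x_{1, i_1} x_{2, i_2} \cdots x_{n, i_n} \qquad (3)$$

in the vector variables $X_j = (x_{j,1}, \ldots, x_{j,k_j})$. The multilinear form $f$ is said to be degenerate if and only if there is a non-trivial solution $(X_1, X_2, \ldots, X_n)$ to the following system of partial derivative equations [12]:

$$\frac{\partial f}{\partial x_{j,i}} = 0 \quad \text{for all } j = 1, \ldots, n \text{ and } i = 1, \ldots, k_j. \qquad (4)$$

The unique (up to scale) irreducible polynomial with integral coefficients in the entries $a_{i_1, i_2, \ldots, i_n}$ of the tensor $A$ that vanishes when $f$ is degenerate is called the hyperdeterminant.

Example (2 × 2 hyperdeterminant): Consider the 2 × 2 multilinear form $f(X_1, X_2) = \sum_{i,j=0}^{1} a_{i,j} x_i y_j$. The form $f$ is degenerate if there is a non-trivial solution for $X_1, X_2$ of

$$\frac{\partial f}{\partial x_0} = a_{00} y_0 + a_{01} y_1 = 0, \qquad (5)$$
$$\frac{\partial f}{\partial y_0} = a_{00} x_0 + a_{10} x_1 = 0, \qquad (6)$$
$$\frac{\partial f}{\partial x_1} = a_{10} y_0 + a_{11} y_1 = 0, \qquad (7)$$
$$\frac{\partial f}{\partial y_1} = a_{01} x_0 + a_{11} x_1 = 0. \qquad (8)$$

Solving this system of equations, we obtain

$$\frac{y_0}{y_1} = \frac{-a_{01}}{a_{00}} = \frac{-a_{11}}{a_{10}}, \qquad (9)$$
$$\frac{x_0}{x_1} = \frac{-a_{10}}{a_{00}} = \frac{-a_{11}}{a_{01}}. \qquad (10)$$

We see that a non-trivial solution exists if and only if $a_{00} a_{11} - a_{10} a_{01} = 0$, i.e., the hyperdeterminant is simply the determinant in this case.

The hyperdeterminant of a 2 × 2 × 2 multilinear form was first computed by Cayley [11] and is as follows:

$$\begin{aligned}
&-a_{000}^2 a_{111}^2 - a_{100}^2 a_{011}^2 - a_{010}^2 a_{101}^2 - a_{001}^2 a_{110}^2 \\
&- 4 a_{000} a_{110} a_{101} a_{011} - 4 a_{100} a_{010} a_{001} a_{111} \\
&+ 2 a_{000} a_{100} a_{011} a_{111} + 2 a_{000} a_{010} a_{101} a_{111} \\
&+ 2 a_{000} a_{001} a_{110} a_{111} + 2 a_{100} a_{010} a_{101} a_{011} \\
&+ 2 a_{100} a_{001} a_{110} a_{011} + 2 a_{010} a_{001} a_{110} a_{101} = 0. \qquad (11)
\end{aligned}$$

In [10] it is further shown that the principal minors of an $n \times n$ symmetric matrix satisfy the 2 × 2 × ... × 2 ($n$ times) hyperdeterminant relations. It is thus clear that determining the entropy region of Gaussian random variables is intimately related to Cayley's hyperdeterminant. It is with this viewpoint in mind that we study the hyperdeterminant in this paper.

As a quick numerical sanity check of the $n = 3$ case of the result cited from [10] (the sketch below is ours and assumes NumPy), one can build the 2 × 2 × 2 tensor of principal minors of a random symmetric 3 × 3 matrix, with the empty minor set to 1, and evaluate the hyperdeterminant (11); the result should vanish up to rounding error.

```python
# Numerical illustration (ours) of the n = 3 case of the result from [10]:
# the 2 x 2 x 2 tensor of principal minors of a symmetric 3 x 3 matrix
# (with the empty minor set to 1) has vanishing Cayley hyperdeterminant.
import numpy as np
from itertools import product

def principal_minor_tensor(A):
    """a[i1, i2, i3] = det of the principal submatrix of A whose rows/columns
    are those k with i_k = 1; the empty minor is 1 by convention."""
    a = np.empty((2, 2, 2))
    for bits in product((0, 1), repeat=3):
        idx = [k for k, b in enumerate(bits) if b == 1]
        a[bits] = np.linalg.det(A[np.ix_(idx, idx)]) if idx else 1.0
    return a

def cayley_hyperdet_2x2x2(a):
    """Cayley's hyperdeterminant of a 2 x 2 x 2 tensor, as in (11) up to sign."""
    return (a[0,0,0]**2 * a[1,1,1]**2 + a[0,0,1]**2 * a[1,1,0]**2
            + a[0,1,0]**2 * a[1,0,1]**2 + a[0,1,1]**2 * a[1,0,0]**2
            - 2 * (a[0,0,0]*a[0,0,1]*a[1,1,0]*a[1,1,1]
                   + a[0,0,0]*a[0,1,0]*a[1,0,1]*a[1,1,1]
                   + a[0,0,0]*a[0,1,1]*a[1,0,0]*a[1,1,1]
                   + a[0,0,1]*a[0,1,0]*a[1,0,1]*a[1,1,0]
                   + a[0,0,1]*a[0,1,1]*a[1,0,0]*a[1,1,0]
                   + a[0,1,0]*a[0,1,1]*a[1,0,0]*a[1,0,1])
            + 4 * (a[0,0,0]*a[0,1,1]*a[1,0,1]*a[1,1,0]
                   + a[0,0,1]*a[0,1,0]*a[1,0,0]*a[1,1,1]))

if __name__ == "__main__":
    B = np.random.randn(3, 3)
    A = (B + B.T) / 2                       # random symmetric 3 x 3 matrix
    a = principal_minor_tensor(A)
    print("hyperdeterminant:", cayley_hyperdet_2x2x2(a))  # ~ 0 up to round-off
```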