
SEMIDEFINITE PROGRAMMING AND MULTIVARIATE CHEBYSHEV BOUNDS

Katherine Comanor Lieven Vandenberghe Stephen Boyd

The RAND Corporation, Santa Monica; Electrical Engineering Department, UCLA; Electrical Engineering Department, Stanford University

Abstract: Chebyshev inequalities provide bounds on the probability of a set based on known expected values of certain functions, for example, known power moments. In some important cases these bounds can be efficiently computed via convex optimization. We discuss one particular type of generalized Chebyshev bound, a lower bound on the probability of a set defined by strict quadratic inequalities, given the mean and the covariance of the distribution. We present a semidefinite programming formulation, give an interpretation of the dual problem, and describe some applications.

1. INTRODUCTION

The classical Chebyshev inequality states that

    Prob(|X| ≥ 1) ≤ σ²

for a zero-mean random variable X ∈ R with variance σ². Generalized Chebyshev bounds provide similar upper or lower bounds on the probability of a set in R^n based on known expected values of certain functions, for example, the mean and covariance. Several such multivariate generalizations of Chebyshev's inequality appeared during the 1950s and 1960s; see Isii (1959), Isii (1963), Isii (1964), Marshall and Olkin (1960), Karlin and Studden (1966). Isii (1964), for example, considers the problem of computing upper and lower bounds on E f0(X), where X is a random variable on R^n that satisfies the moment constraints

    E fi(X) = ai,  i = 1, . . . , m.

The best lower bound on E f0(X) is given by the optimal value of the linear optimization problem

    minimize    E f0(X)
    subject to  E fi(X) = ai,  i = 1, . . . , m,                         (1)

where we optimize over all probability distributions on R^n. The dual of this problem is

    maximize    z0 + Σ_{i=1}^m ai zi
    subject to  z0 + Σ_{i=1}^m zi fi(x) ≤ f0(x)  for all x,              (2)

and has a finite number of variables zi, i = 0, . . . , m, but infinitely many constraints. Isii shows that strong duality holds under appropriate constraint qualifications, so we can find a sharp lower bound on E f0(X) by solving (2). Note that the constraints in (2) can be written as a single constraint g(z0, . . . , zm) ≤ 0, where

    g(z0, . . . , zm) = sup_x ( z0 + Σ_{i=1}^m zi fi(x) − f0(x) ).

This is a convex function of z, but in general difficult to evaluate. Hence, (2) is usually an intractable optimization problem.

Bertsimas and Sethuraman (2000), Lasserre (2002), Bertsimas and Popescu (2005), and Popescu (2005) have recently shown that various types of generalized Chebyshev bounds can be computed via semidefinite programming. In this paper we discuss one important example: a lower bound on the probability of a set defined by strict quadratic inequalities, given the mean and the covariance of the distribution. The main purpose of the paper is to outline a proof of the main result from semidefinite programming duality (see Vandenberghe et al. (2006)), give a geometric interpretation, and describe some applications.
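To make the framework concrete before specializing it, note that the classical inequality at the start of this section is recovered by taking f0 equal to the 0-1 indicator function of C = (−1, 1), f1(x) = x, f2(x) = x², with a1 = 0 and a2 = σ²: the choice z0 = 1, z1 = 0, z2 = −1 is feasible in (2), since 1 − x² ≤ f0(x) for all x, and yields the lower bound Prob(|X| < 1) ≥ 1 − σ², i.e., the classical bound Prob(|X| ≥ 1) ≤ σ².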

2. PROBABILITY OF A SET DEFINED BY QUADRATIC INEQUALITIES

Let C be defined by m strict quadratic inequalities

    x^T Ai x + 2 bi^T x + ci < 0,  i = 1, . . . , m,                     (3)

with Ai ∈ S^n (not necessarily positive semidefinite), bi ∈ R^n, ci ∈ R. It is easily verified that the optimal value of the following semidefinite program (SDP) is a lower bound on Prob(X ∈ C) for distributions with E X = x̄ and E XX^T = S:

    maximize    1 − tr(SP) − 2 q^T x̄ − r
    subject to  [P  q; q^T  r] ⪰ 0
                [P − λi Ai   q − λi bi; (q − λi bi)^T   r − 1 − λi ci] ⪰ 0,  i = 1, . . . , m
                λi ≥ 0,  i = 1, . . . , m.                                (4)

The variables in this SDP are P ∈ S^n, q ∈ R^n, r ∈ R, and λi ∈ R, for i = 1, . . . , m.

To verify this, we first note that the constraints imply that

    x^T P x + 2 q^T x + r ≥ 1 + λi (x^T Ai x + 2 bi^T x + ci)

and x^T P x + 2 q^T x + r ≥ 0 for all x. This means that

    x^T P x + 2 q^T x + r ≥ 1 − 1_C(x),

where 1_C is the 0-1 indicator function of C (i.e., 1_C(x) = 1 if x ∈ C and 1_C(x) = 0 otherwise). Hence, if E X = x̄ and E XX^T = S, then

    tr(SP) + 2 q^T x̄ + r = E (X^T P X + 2 q^T X + r) ≥ 1 − E 1_C(X) = 1 − Prob(X ∈ C).

This simple argument shows that 1 − tr(SP) − 2 q^T x̄ − r is a lower bound on Prob(X ∈ C). In the SDP (4) we compute the best lower bound that can be constructed this way.

The proof does not establish that the bound is actually achieved by a distribution with the specified moments. This stronger result can be proved by combining Isii's semi-infinite linear programming bounds with the S-procedure (Boyd et al., 1994, page 23). It can also be proved directly from semidefinite programming duality. The dual problem of (4) is

    minimize    1 − Σ_{i=1}^m λi
    subject to  [Zi  zi; zi^T  λi] ⪰ 0,  i = 1, . . . , m
                Σ_{i=1}^m [Zi  zi; zi^T  λi] ⪯ [S  x̄; x̄^T  1]
                tr(Ai Zi) + 2 bi^T zi + ci λi ≥ 0,  i = 1, . . . , m,     (5)

which is an SDP with variables Zi ∈ S^n, zi ∈ R^n, and λi ∈ R. It can be shown that from every feasible solution of (5) a discrete distribution can be constructed with E X = x̄, E XX^T = S and Prob(X ∈ C) ≤ 1 − Σ_i λi (Vandenberghe et al. (2006)). Tightness of the generalized Chebyshev inequality (4) then follows from the fact that the two SDPs are duals and have the same optimal value.
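As an illustration of how (4) can be set up numerically, the following sketch (our own illustrative code, not from the paper; the function name and calling convention are placeholders) formulates the SDP in CVXPY, collecting P, q, r in a single matrix variable M = [P q; q^T r]:

# Illustrative sketch (not the authors' code) of the lower-bound SDP (4) in CVXPY.
# C is given as a list of tuples (A_i, b_i, c_i) describing x^T A_i x + 2 b_i^T x + c_i < 0;
# xbar is E X and S is the second-moment matrix E X X^T.
import numpy as np
import cvxpy as cp

def chebyshev_lower_bound(xbar, S, quads):
    n = len(xbar)
    # M collects the variables of (4) as M = [P q; q^T r].
    M = cp.Variable((n + 1, n + 1), symmetric=True)
    lam = cp.Variable(len(quads), nonneg=True)

    # Moment matrix [S xbar; xbar^T 1]; then tr(Sigma M) = tr(S P) + 2 q^T xbar + r.
    Sigma = np.block([[S, xbar.reshape(-1, 1)],
                      [xbar.reshape(1, -1), np.ones((1, 1))]])
    E = np.zeros((n + 1, n + 1))
    E[n, n] = 1.0   # accounts for the "-1" in the lower-right entry of the LMIs in (4)

    constraints = [M >> 0]
    for i, (A, b, c) in enumerate(quads):
        Q = np.block([[A, b.reshape(-1, 1)],
                      [b.reshape(1, -1), np.array([[c]])]])
        # [P - lam_i A_i, q - lam_i b_i; (q - lam_i b_i)^T, r - 1 - lam_i c_i] >= 0
        constraints.append(M - lam[i] * Q - E >> 0)

    prob = cp.Problem(cp.Maximize(1 - cp.trace(Sigma @ M)), constraints)
    prob.solve()
    return prob.value

# Example from Section 4: C = {x in R : x^2 < 1}, E X = 0.4, E X^2 = 0.2; bound is about 0.9.
# chebyshev_lower_bound(np.array([0.4]), np.array([[0.2]]),
#                       [(np.array([[1.0]]), np.array([0.0]), -1.0)])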

3. GEOMETRIC INTERPRETATION

Figure 1 shows an example in R^2. The set C is defined by three linear inequalities and two concave quadratic inequalities. The mean x̄ = E X is shown as a small circle, and the set

    {x | (x − x̄)^T (S − x̄ x̄^T)^{-1} (x − x̄) = 1}

is shown as the dashed ellipse. The optimal Chebyshev lower bound on Prob(X ∈ C) for this example is 0.3992. The six solid dots are the possible values of the discrete distribution computed from the optimal solution of the SDP (5). The numbers next to the dots give the probability of the six values. (Since C is defined as an open set, the five points on the boundary are not in C itself, so Prob(X ∈ C) = 0.3992 for this distribution.)

Fig. 1. The distribution that achieves the lower bound on Prob(X ∈ C) for a given mean and covariance.

The solid ellipse is the level curve

    {x | x^T P x + 2 q^T x + r = 1}

where P, q, and r are the optimal solution of the lower bound SDP (4). We notice that the optimal distribution allocates nonzero probability to the points where the ellipse touches the boundary of C, and to its center. This relation between the solutions of the upper and lower bound SDPs holds in general, and is a consequence of the optimality conditions of semidefinite programming.

4. EXAMPLES

In the simplest cases, the two SDPs can be solved analytically. As an example, we derive an extension of the Chebyshev inequality known as Selberg's inequality (Karlin and Studden, 1966, page 475). We take C = (−1, 1) = {x ∈ R | x² < 1}. We show that if E X = x̄ and E X² = s, then

    Prob(|X| < 1) ≥  0                                if s ≥ 1
                     1 − s                            if |x̄| ≤ s < 1
                     (1 − |x̄|)² / (s − 2|x̄| + 1)      if s < |x̄|,        (6)

and that there is a distribution that achieves the bound. We will assume that x̄ ≥ 0. The SDP (4) is

    maximize    1 − sP − 2 x̄ q − r
    subject to  [P  q; q  r − 1] ⪰ λ [1  0; 0  −1]
                [P  q; q  r] ⪰ 0,  λ ≥ 0,

with variables P, q, r, λ ∈ R. The values P = q = λ = 0, r = 1 are obviously feasible, with objective value zero. The values P = λ = 1, r = q = 0 are also feasible, with objective value 1 − s. The values

    [P  q; q  r] = (1 / (s − 2x̄ + 1)²) [1 − x̄; s − x̄] [1 − x̄; s − x̄]^T,    λ = (1 − x̄) / (s − 2x̄ + 1)

are feasible if s < x̄, with objective value

    1 − sP − 2 x̄ q − r = (1 − x̄)² / (s − 2x̄ + 1).

This proves (6). To show that the inequality is tight, we use the dual SDP (5):

    minimize    1 − λ
    subject to  Z ≥ λ
                [Z  z; z  λ] ⪯ [s  x̄; x̄  1]
                [Z  z; z  λ] ⪰ 0,

with variables λ, Z, z ∈ R. If x̄ > s, the values

    Z = z = λ = (s − x̄²) / (s − 2x̄ + 1) = (s − x̄²) / (s − x̄² + (x̄ − 1)²)

are feasible with objective value

    1 − λ = (1 − x̄)² / (s − 2x̄ + 1).

If x̄ ≤ s < 1, we can also take Z = s, z = x̄, λ = s, which provides a feasible point with a smaller objective value, 1 − s. Finally, if s ≥ 1, we can take Z = s, z = x̄, λ = 1, which has objective value zero. This shows that there are distributions with the specified mean and variance for which the right-hand side of (6) is equal to Prob(X ∈ C). For example, if x̄ = E X = 0.4 and s = E X² = 0.2 we get Prob(|X| < 1) ≥ 0.9, and the bound is achieved by the distribution

    X = 1 with probability 0.1,   X = 1/3 with probability 0.9.

This is illustrated in figure 2.

Fig. 2. Primal and dual SDP solution for C = {x | x² < 1}, E X = 0.4 and E X² = 0.2.
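The closed-form bound (6) is also easy to evaluate directly; the small helper below (our own illustrative code, not from the paper) returns its right-hand side:

# Helper (illustrative): evaluate the right-hand side of (6), the lower bound on
# Prob(|X| < 1) given xbar = E X and s = E X^2.
def selberg_bound(xbar, s):
    a = abs(xbar)
    if s >= 1:
        return 0.0
    if a <= s:                                   # |xbar| <= s < 1
        return 1.0 - s
    return (1.0 - a) ** 2 / (s - 2.0 * a + 1.0)  # s < |xbar|

# selberg_bound(0.4, 0.2) returns 0.9, matching the example above.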

Figure 3 illustrates the extension of the bound (6) to a unit ball C = {x ∈ R^n | x^T x < 1}, for

    x̄ = (0.2, 0.3),    S = [0.20  0.06; 0.06  0.11].

The optimal bound is Prob(X ∈ C) ≥ 0.73 and is achieved by the discrete distribution with three possible values shown in the figure.

Fig. 3. Geometrical interpretation of the Chebyshev bound for C = {x ∈ R^2 | x^T x < 1}.

5. BOUNDING MANUFACTURING YIELD

The yield of a manufacturing process can be expressed as

    Y(x) = Prob(x + w ∈ C),

where x denotes the nominal or target values of a set of design parameters, w ∈ R^n is a random vector that represents variations in the manufacturing process, and C ⊆ R^n denotes the set of acceptable parameter values for the product. If the set C is defined by quadratic inequalities, we can compute a lower bound on the yield Y(x), valid for all distributions of w with given mean and covariance, by solving the SDPs (4) and (5). Figure 4 shows an example for a polyhedral set C (shown with a dashed line).

Fig. 4. A few contour lines of the Chebyshev lower bound on Y(x) = Prob(x + w ∈ C) for E w = 0, E ww^T = I.
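One convenient way to evaluate this bound is to absorb the nominal point x into the quadratics: substituting y = x + w in (3) shows that x + w ∈ C exactly when

    w^T Ai w + 2 (Ai x + bi)^T w + (x^T Ai x + 2 bi^T x + ci) < 0,  i = 1, . . . , m,

so for each nominal x the yield bound is the Chebyshev bound (4) applied to w, with the given first and second moments of w and these shifted coefficients. A sketch (again our own illustrative code, reusing the chebyshev_lower_bound sketch from Section 2):

# Illustrative sketch: lower-bound the yield Y(x) = Prob(x + w in C) by applying the
# SDP (4) to w with shifted quadratic data (assumes chebyshev_lower_bound from above).
def yield_lower_bound(x, w_mean, w_second_moment, quads):
    shifted = [(A, A @ x + b, float(x @ A @ x + 2 * b @ x + c)) for (A, b, c) in quads]
    return chebyshev_lower_bound(w_mean, w_second_moment, shifted)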

6. PROBABILITY OF DETECTION ERROR

Consider a signal constellation of m possible symbols or signals s ∈ {s1, s2, . . . , sm}. One of the symbols is transmitted over a noisy channel. The received signal is x = s + v, where v is a noise vector with E v = 0 and E vv^T = σ²I. The receiver then estimates s based on the received x. The minimum distance detector chooses the symbol sk closest (in Euclidean norm) to x, so sk is detected correctly if x = sk + v is closer to sk than to any of the other symbols. The set of values of the random variable x for which symbol sk is correctly detected is therefore a polyhedron Ck (the Voronoi region of sk in the constellation). Generalized Chebyshev bounds can be used to provide lower bounds on the probability of correct detection Prob(sk + v ∈ Ck), valid for any noise distribution with zero mean and covariance σ²I. An example is given in (Boyd and Vandenberghe, 2004, page 381).

As an extension we can consider constellations with unequal noise covariances, i.e., situations in which the noise covariance is E vv^T = Σk when symbol sk is transmitted. Suppose the detector chooses the symbol sk with the minimum Mahalanobis distance

    ((x − sk)^T Σk^{-1} (x − sk))^{1/2}.

(This is also the maximum likelihood detector if the noise is Gaussian.) Then the region of correct detection of symbol sk is the set defined by the m − 1 quadratic inequalities

    (x − sk)^T Σk^{-1} (x − sk) < (x − sj)^T Σj^{-1} (x − sj),  j ≠ k.        (7)

Tight lower bounds on the probability of correct detection can be computed by solving the SDP (4).
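Each inequality in (7) is of the form (3): expanding both quadratics gives

    x^T (Σk^{-1} − Σj^{-1}) x + 2 (Σj^{-1} sj − Σk^{-1} sk)^T x + (sk^T Σk^{-1} sk − sj^T Σj^{-1} sj) < 0,

so, in the notation of (3), the inequality associated with symbol sj has Ai = Σk^{-1} − Σj^{-1}, bi = Σj^{-1} sj − Σk^{-1} sk, and ci = sk^T Σk^{-1} sk − sj^T Σj^{-1} sj. The sketch below (our own illustrative code, intended to feed the chebyshev_lower_bound sketch of Section 2) builds this data:

# Illustrative sketch: build the quadratics (3) that describe the correct-detection
# region C_k from the inequalities (7), given the symbols and per-symbol covariances.
import numpy as np

def detection_quadratics(symbols, covariances, k):
    Pk = np.linalg.inv(covariances[k])
    sk = symbols[k]
    quads = []
    for j, (sj, Sigma_j) in enumerate(zip(symbols, covariances)):
        if j == k:
            continue
        Pj = np.linalg.inv(Sigma_j)
        A = Pk - Pj
        b = Pj @ sj - Pk @ sk
        c = float(sk @ Pk @ sk - sj @ Pj @ sj)
        quads.append((A, b, c))
    return quads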

Figure 5 shows an example with m = 7 symbols in R^2, shown as circles. The dashed ellipses are defined as {x | (x − sk)^T Σk^{-1} (x − sk) = 1}. The solid lines show the boundaries of the regions of correct detection for each symbol, as defined by the quadratic inequalities (7). The figure also illustrates the optimal SDP solution for s1. The solid ellipse is defined by x^T P x + 2 q^T x + r = 1, for the optimal values P, q, r in (4). From the dual SDP we can construct the worst-case distribution (i.e., with the highest probability of error) that matches the specified noise covariance. The worst-case distribution is the discrete distribution indicated by the six solid dots.

Fig. 5. Detection example.

7. CONCLUSION

We have discussed a multivariate extension of Chebyshev's inequality that can be efficiently computed via semidefinite programming. The result follows from more general results of Isii (1964) and Bertsimas and Popescu (2005), and can also be proved directly from semidefinite programming duality. The bounds obtained are the best possible, over all distributions with given first and second order moments. From the optimal solution of the SDPs, the worst-case distribution can be established. We have also described some applications in detection theory and design centering.

REFERENCES

D. Bertsimas and I. Popescu. Optimal inequalities in probability theory: a convex optimization approach. SIAM J. on Optimization, 15(3):780–804, 2005.

D. Bertsimas and J. Sethuraman. Moment problems and semidefinite optimization. In H. Wolkowicz, R. Saigal, and L. Vandenberghe, editors, Handbook of Semidefinite Programming, chapter 16, pages 469–510. Kluwer Academic Publishers, 2000.

S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan. Linear Matrix Inequalities in System and Control Theory, volume 15 of Studies in Applied Mathematics. SIAM, Philadelphia, PA, June 1994. ISBN 0-89871-334-X.

S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. www.stanford.edu/~boyd/cvxbook.

K. Isii. On a method for generalizations of Tchebycheff's inequality. Annals of the Institute of Statistical Mathematics, 10:65–88, 1959.

K. Isii. On sharpness of Tchebycheff-type inequalities. Annals of the Institute of Statistical Mathematics, 14:185–197, 1963.

K. Isii. Inequalities of the types of Chebyshev and Cramér-Rao and mathematical programming. Annals of the Institute of Statistical Mathematics, 16:277–293, 1964.

S. Karlin and W. J. Studden. Tchebycheff Systems: With Applications in Analysis and Statistics. Wiley-Interscience, 1966.

J. B. Lasserre. Bounds on measures satisfying moment conditions. The Annals of Applied Probability, 12(3):1114–1137, 2002.

A. W. Marshall and I. Olkin. Multivariate Chebyshev inequalities. Annals of Mathematical Statistics, 31:1001–1014, 1960.

I. Popescu. A semidefinite programming approach to optimal moment bounds for convex classes of distributions. Mathematics of Operations Research, 30(3):632–657, 2005.

L. Vandenberghe, S. Boyd, and K. Comanor. Generalized Chebyshev bounds via semidefinite programming. To appear in SIAM Review, 2006.