The Ellipsoid method

R.M. Freund / C. Roos (MIT/TUD) e-mail: [email protected] URL: http://www.isa.ewi.tudelft.nl/~roos

WI 4218 March 21, A.D. 2007

Optimization Group 1/26 Outline

⊲ Sherman-Morrison formula
⊲ Ellipsoids
⊲ Ellipsoid method
⊲ Basic construction
⊲ Prototypical iteration
⊲ Iteration bound
⊲ Two theorems
⊲ Two more theorems
⊲ Optimization with the ellipsoid method
⊲ Convexity of the homogeneous problem
⊲ The Key Proposition

In memory of George Dantzig and Leonid Khachiyan (1)

Leonid Khachiyan (1952-2005) passed away Friday April 29 at the age of 52. He died in his sleep, apparently of a heart attack, colleagues said. Khachiyan was best known for his 1979 use of the ellipsoid algorithm, originally developed for convex programming, to give the first polynomial-time algorithm to solve linear programming problems. While the simplex algorithm solved linear programs well in practice, Khachiyan gave the first formal proof of an efficient algorithm in the worst case.

"He was among the world's most famous computer scientists," said Haym Hirsh, chairman of the computer science department at Rutgers.

George B. Dantzig, the father of Linear Programming and the creator of the Simplex Method, died at the age of 90 on May 13. In a statement INFORMS President Richard C. Larson mourned his death (see http://www.informs.org/Press/dantzigobit.htm). The tributes included obituaries in the Washington Post, the San Francisco Chronicle, Mercury News and New York Times. National Public Radio commentator Keith Devlin remembered George Dantzig in a broadcast on Saturday, May 21.

In memory of George Dantzig and Leonid Khachiyan (2)

Leonid Khachiyan in Shanghai (with Bai, Peng and Terlaky, 2002)

Sherman-Morrison formula

Let $Q$, $R$, $S$ be matrices such that
• $Q$ and $Q + RS^T$ are nonsingular.
• $R$ and $S$ are $n \times k$ matrices of rank $k \le n$.

Then
$$(Q + RS^T)^{-1} = Q^{-1} - Q^{-1}R\,(I + S^TQ^{-1}R)^{-1}S^TQ^{-1}.$$

Lemma 1 Let $U$ be such that $QU = R$. Then $I + S^TU$ is invertible.

Proof: Suppose $w \in \mathbb{R}^k$ satisfies $(I + S^TU)w = 0$. Then,
$$(Q + RS^T)Uw = (QU)w + R(S^TUw) = Rw - Rw = 0.$$
$Q + RS^T$ being nonsingular, this gives $Uw = 0$. Since $\mathrm{rank}(U) = k$ this implies $w = 0$. •

Theorem 1 If $Qx_0 = q$ and $(I + S^TU)y = S^Tx_0$ then $x = x_0 - Uy$ satisfies $(Q + RS^T)x = q$.

Proof:
$$\begin{aligned}
(Q + RS^T)(x_0 - Uy) &= Qx_0 + RS^Tx_0 - QUy - RS^TUy\\
&= q + R(I + S^TU)y - Ry - RS^TUy = q. \quad\bullet
\end{aligned}$$

The solution $x$ of $(Q + RS^T)x = q$ is given by
$$\begin{aligned}
x = x_0 - Uy &= Q^{-1}q - Q^{-1}R\,(I + S^TU)^{-1}S^Tx_0\\
&= \bigl(Q^{-1} - Q^{-1}R\,(I + S^TQ^{-1}R)^{-1}S^TQ^{-1}\bigr)q.
\end{aligned}$$
Since Theorem 1 holds for all $q \in \mathbb{R}^n$ the Sherman-Morrison formula follows.

Reference: W. W. Hager. Updating the inverse of a matrix. SIAM Rev., 31(2):221–239, June 1989.
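Theorem 1 reads directly as a solution procedure. The following is a minimal sketch in plain Python, assuming for simplicity that $Q$ is diagonal and $k = 1$, so that $R$ and $S$ are single columns $r$ and $s$ and $I + S^TU$ is a scalar. The function names and the test data are illustrative only and do not come from the slides.

```python
def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def solve_rank_one_update(q_diag, r, s, rhs):
    """Solve (Q + r s^T) x = rhs with Q = diag(q_diag), following Theorem 1.

    Assumes Q and Q + r s^T are nonsingular, so 1 + s^T Q^{-1} r != 0.
    """
    x0 = [b / d for b, d in zip(rhs, q_diag)]   # Q x0 = rhs
    u = [ri / d for ri, d in zip(r, q_diag)]    # Q u = r
    y = dot(s, x0) / (1.0 + dot(s, u))          # (1 + s^T u) y = s^T x0
    return [x0i - ui * y for x0i, ui in zip(x0, u)]


def residual(q_diag, r, s, rhs, x):
    """Largest entry of |(Q + r s^T) x - rhs|."""
    sx = dot(s, x)
    return max(abs(d * xi + ri * sx - b)
               for d, xi, ri, b in zip(q_diag, x, r, rhs))


# Made-up example data.
q_diag, r, s, rhs = [2.0, 3.0, 4.0], [1.0, 0.0, 2.0], [1.0, 1.0, 1.0], [1.0, 2.0, 3.0]
x = solve_rank_one_update(q_diag, r, s, rhs)
print(residual(q_diag, r, s, rhs, x))  # should be at rounding-error level
```

Only systems with $Q$ are solved, plus one small $k \times k$ system; this is the reason the formula is useful for updating ellipsoid matrices.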

Ellipsoids

Let $M$ be a (symmetric) positive definite $n \times n$ matrix and $z \in \mathbb{R}^n$. Then
$$E_{M,z} := \{x : (x - z)^TM(x - z) \le 1\}$$
denotes an ellipsoid centered at $z$. Note that
$$x \in E_{M,z} \;\Leftrightarrow\; \|M^{1/2}(x - z)\| \le 1.$$
Hence, putting $u = M^{1/2}(x - z)$, we have $x = z + M^{-1/2}u$ and $\|u\| \le 1$. Therefore,
$$E_{M,z} = \{x = z + M^{-1/2}u : u^Tu \le 1\}.$$
In other words,
$$E_{M,z} = z + M^{-1/2}B(0,1),$$
where $B(0,1)$ denotes the unit ball, centered at the origin. The volume of $B(0,1)$ is given by
$$\nu(n) = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)},$$
and the volume of $E_{M,z}$ by
$$\mathrm{vol}\,E_{M,z} = \frac{\nu(n)}{\sqrt{\det(M)}}.$$
Hence
$$\ln \mathrm{vol}\,E_{M,z} = \ln \nu(n) - \tfrac{1}{2}\ln\det(M).$$

Ellipsoid method

Given is a convex set $S$. We want to find $s \in S$. We assume that an ellipsoid $E_{M,z}$ is given such that $S \subseteq E_{M,z}$.

Step 1 : $k = 0$; $M^k = M$; $z^k = z$;
Step 2 : if $z^k \in S$: STOP;
Step 3 : Find nonzero vector $a$ such that $a^Tx \le a^Tz^k$, $\forall x \in S$;
Step 4 : Construct smallest volume ellipsoid that contains $E_{M^k,z^k} \cap \{x \in \mathbb{R}^n : a^T(x - z^k) \le 0\}$; let this ellipsoid have matrix $M^{k+1}$ and center $z^{k+1}$.
Step 5 : $k = k + 1$;
Step 6 : Goto Step 2.
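Steps 2 and 3 are usually provided together by a separation oracle. As a hedged illustration, assume $S$ is a polyhedron $\{x : Ax \le b\}$ (an assumption made here, the slides allow any convex $S$). If row $i$ is violated at $z$, then $a = A_i$ satisfies $a^Tx \le b_i < a^Tz$ for all $x \in S$, which is what Step 3 asks for. The function name and the data are hypothetical.

```python
def separation_oracle(A, b, z):
    """Return None if A z <= b (z lies in S), else a violated row a of A.

    For a violated row, a^T x <= b_i < a^T z for every x in S, so a is a
    valid vector for Step 3.
    """
    for row, bi in zip(A, b):
        if sum(aj * zj for aj, zj in zip(row, z)) > bi:
            return row
    return None


# Made-up example: S = {x : x1 >= 1, x2 >= 1, x1 + x2 <= 3}.
A = [[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]]
b = [-1.0, -1.0, 3.0]
print(separation_oracle(A, b, [0.0, 0.0]))  # a violated row
print(separation_oracle(A, b, [1.5, 1.2]))  # None: the point is in S
```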

Basic construction (1)

Find $m = (z, 0, \dots, 0)$, $M = \mathrm{diag}(a_1, a_2, \dots, a_n)$, such that the ellipsoid
$$E_{M,m} = \{x \in \mathbb{R}^n : (x - m)^TM(x - m) \le 1\}$$
contains $B(0,1) \cap \{x : x_1 \ge 0\}$ and has minimal volume. Note that
$$(x - m)^TM(x - m) = a_1(x_1 - z)^2 + \sum_{i=2}^n a_ix_i^2.$$
The unit vectors $e_1, \pm e_2, \dots, \pm e_n$ are on the boundary of $B(0,1) \cap \{x : x_1 \ge 0\}$. We require them to lie on the boundary of the ellipsoid. This gives:
$$a_1(1 - z)^2 = 1, \qquad a_1z^2 + a_i = 1, \quad 2 \le i \le n.$$
From this we obtain
$$a_1 = \frac{1}{(1 - z)^2}, \qquad a_i = 1 - a_1z^2 = 1 - \frac{z^2}{(1 - z)^2} = \frac{1 - 2z}{(1 - z)^2}, \quad i \ge 2.$$
Recall that $\mathrm{vol}\,E_{M,m}$ is minimal if $\det M$ is maximal. Since
$$\det M = \frac{(1 - 2z)^{n-1}}{(1 - z)^{2n}},$$
one may easily verify that this occurs if $z = \frac{1}{n+1}$. Substituting this value we obtain
$$a_1 = \left(1 + \frac{1}{n}\right)^2, \qquad a_i = 1 - \frac{1}{n^2}, \quad i \ge 2.$$
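The verification left to the reader can be written out as follows. This is a sketch of the calculus, not taken from the slides, and it assumes $0 \le z < \frac12$ and $n \ge 2$ so that all $a_i$ are positive.

```latex
\begin{align*}
\frac{d}{dz}\ln\det M
  &= \frac{d}{dz}\Bigl[(n-1)\ln(1-2z) - 2n\ln(1-z)\Bigr]
   = -\frac{2(n-1)}{1-2z} + \frac{2n}{1-z} = 0 \\
  &\Longleftrightarrow\quad n(1-2z) = (n-1)(1-z)
   \quad\Longleftrightarrow\quad z = \frac{1}{n+1}, \\
\frac{d^2}{dz^2}\ln\det M \,\Big|_{z=\frac{1}{n+1}}
  &= -\frac{4(n-1)}{(1-2z)^2} + \frac{2n}{(1-z)^2}
   = -\frac{2(n+1)^3}{n(n-1)} < 0 .
\end{align*}
```

The negative second derivative confirms that the stationary point is a maximum of $\det M$. With $1 - z = \frac{n}{n+1}$ and $1 - 2z = \frac{n-1}{n+1}$ the stated values $a_1 = \frac{(n+1)^2}{n^2}$ and $a_i = \frac{n^2-1}{n^2}$ follow.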

Illustration for n = 2

[Figure: illustration of the basic construction for n = 2; both axes run from -1 to 1.]

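Basic construction (2)

$$a_1 = \frac{(n+1)^2}{n^2}, \qquad a_i = \frac{n^2 - 1}{n^2}, \quad i \ge 2.$$

$$B = \{y : y^Ty \le 1\}, \qquad H = \{y : y_1 \ge 0\}$$
$$\bar M = \frac{n^2 - 1}{n^2}\left(I + \frac{2}{n-1}e_1e_1^T\right), \qquad \bar z = \frac{1}{n+1}e_1$$
$$E = E_{\bar M,\bar z}$$

Theorem 2 $(B \cap H) \subseteq E$.

Proof: One has $y \in E$ if and only if
$$\left(y - \frac{e_1}{n+1}\right)^T \frac{n^2 - 1}{n^2}\left(I + \frac{2e_1e_1^T}{n-1}\right)\left(y - \frac{e_1}{n+1}\right) \le 1.$$
This is equivalent to
$$\frac{n^2 - 1}{n^2}\sum_{i=1}^n y_i^2 + \frac{1}{n^2} + \frac{2n+2}{n^2}\,y_1(y_1 - 1) \le 1.$$
Now let $y \in B \cap H$. Then $0 \le y_1 \le 1$ implies $y_1(y_1 - 1) \le 0$. Also $\sum_{i=1}^n y_i^2 \le 1$. Since
$$\frac{n^2 - 1}{n^2} + \frac{1}{n^2} = 1,$$
we obtain $y \in E$. •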
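Theorem 2 can be spot-checked numerically. The sketch below, in plain Python with hypothetical function names, samples points of $B \cap H$ for $n = 3$ and evaluates $(y - \bar z)^T\bar M(y - \bar z)$; it assumes $n \ge 2$. A sampling check like this illustrates the statement but is of course no substitute for the proof.

```python
import random


def quad_form_basic(y):
    """(y - zbar)^T Mbar (y - zbar) for Mbar, zbar of the basic construction."""
    n = len(y)
    scale = (n * n - 1) / (n * n)
    d = list(y)
    d[0] -= 1.0 / (n + 1)                       # subtract zbar = e1 / (n + 1)
    # Mbar = scale * (I + 2/(n-1) e1 e1^T): only coordinate 1 gets extra weight.
    return scale * (sum(di * di for di in d) + 2.0 / (n - 1) * d[0] * d[0])


def sample_half_ball(n, rng):
    """Rejection-sample a point of B ∩ H = {y : ||y|| <= 1, y_1 >= 0}."""
    while True:
        y = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        if sum(yi * yi for yi in y) <= 1.0:
            y[0] = abs(y[0])
            return y


rng = random.Random(0)
worst = max(quad_form_basic(sample_half_ball(3, rng)) for _ in range(10000))
print(worst <= 1.0 + 1e-12)                     # Theorem 2 says this holds
print(quad_form_basic([1.0, 0.0, 0.0]))         # e1 lies on the boundary of E
```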

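Basic construction (3)

$$B = \{y : y^Ty \le 1\}$$
$$\bar M = \frac{n^2 - 1}{n^2}\left(I + \frac{2}{n-1}e_1e_1^T\right)$$
$$E = E_{\bar M,\bar z}$$

Theorem 3 $\mathrm{vol}(E) < \mathrm{vol}(B)\, e^{-\frac{1}{2(n+1)}}$.

Proof: One has
$$\frac{\mathrm{vol}(E)}{\mathrm{vol}(B)} = \frac{\sqrt{\det I}}{\sqrt{\det \bar M}} = \frac{1}{\sqrt{\det \bar M}}.$$
Moreover,
$$\det \bar M = \left(\frac{n^2 - 1}{n^2}\right)^n\left(1 + \frac{2}{n-1}\right) = \left(\frac{n^2 - 1}{n^2}\right)^{n-1}\left(\frac{n+1}{n}\right)^2.$$
Hence, using $1 + x \le e^x$ we get
$$\begin{aligned}
\frac{1}{\det \bar M} &= \left(\frac{n^2}{n^2 - 1}\right)^{n-1}\left(\frac{n}{n+1}\right)^2 = \left(1 + \frac{1}{n^2 - 1}\right)^{n-1}\left(1 - \frac{1}{n+1}\right)^2\\
&\le e^{\frac{n-1}{n^2-1}}\, e^{-\frac{2}{n+1}} = e^{-\frac{1}{n+1}}.
\end{aligned}$$
Taking square roots, and noting that $1 + x < e^x$ for $x \ne 0$, this implies the theorem. •

Prototypical iteration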

$$E_{M,z} := \{x : (x - z)^TM(x - z) \le 1\} = z + M^{-1/2}B(0,1)$$
We know $M$, $z$, and a nonzero vector $a$. Define
$$\bar M = \frac{n^2 - 1}{n^2}\left(M + \frac{2}{n-1}\,\frac{aa^T}{a^TM^{-1}a}\right), \qquad \bar z = z + \frac{1}{n+1}\,\frac{M^{-1}a}{\sqrt{a^TM^{-1}a}}.$$
By the Sherman-Morrison formula, one has
$$\bar M^{-1} = \frac{n^2}{n^2 - 1}\left(M^{-1} - \frac{2}{n+1}\,\frac{M^{-1}aa^TM^{-1}}{a^TM^{-1}a}\right).$$

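Theorem 4
$$E_{M,z} \cap \{x : a^Tx \ge a^Tz\} \subseteq E_{\bar M,\bar z}.$$
(The center $\bar z$ is moved from $z$ in the direction $M^{-1}a$, so the half that is kept is $a^Tx \ge a^Tz$. For the cut $a^Tx \le a^Tz^k$ of Step 3 of the algorithm, apply the theorem with $a$ replaced by $-a$; this leaves $\bar M$ unchanged.)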

Theorem 5
$$\mathrm{vol}\,E_{\bar M,\bar z} < \mathrm{vol}\,E_{M,z}\; e^{-\frac{1}{2(n+1)}}.$$
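The update and the loop of the algorithm can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: it stores $P = M^{-1}$ and uses the Sherman-Morrison form of $\bar M^{-1}$, it keeps the half $\{x : a^Tx \le a^Tz\}$ as in Step 3 of the algorithm (so the center moves in the direction $-M^{-1}a$), and it assumes $n \ge 2$. The test set, a small disc inside $B(0,4)$, and all names are made up.

```python
import math


def ellipsoid_step(P, z, a):
    """One update keeping the half {x : a^T x <= a^T z}; P stores M^{-1}."""
    n = len(z)
    Pa = [sum(P[i][j] * a[j] for j in range(n)) for i in range(n)]
    aPa = sum(a[i] * Pa[i] for i in range(n))
    root = math.sqrt(aPa)
    z_new = [z[i] - Pa[i] / ((n + 1) * root) for i in range(n)]
    scale = n * n / (n * n - 1.0)
    P_new = [[scale * (P[i][j] - 2.0 / (n + 1) * Pa[i] * Pa[j] / aPa)
              for j in range(n)] for i in range(n)]
    return P_new, z_new


def ellipsoid_method(oracle, P, z, max_iter=1000):
    """Steps 1-6: iterate until the oracle reports that z lies in S."""
    for k in range(max_iter):
        a = oracle(z)            # None means z is in S
        if a is None:
            return z, k
        P, z = ellipsoid_step(P, z, a)
    return None, max_iter


# Made-up example: S is the disc with center (1, 1) and radius 0.25.
CENTER, RADIUS = [1.0, 1.0], 0.25


def disc_oracle(z):
    """Return None if z is in S, else a = z - center (then a^T x < a^T z on S)."""
    a = [zi - ci for zi, ci in zip(z, CENTER)]
    return None if math.sqrt(sum(ai * ai for ai in a)) <= RADIUS else a


# Start from B(0, 4), i.e. M = I / 16 and P = 16 I.
point, iterations = ellipsoid_method(disc_oracle, [[16.0, 0.0], [0.0, 16.0]], [0.0, 0.0])
print(point, iterations)
```

Working with $P = M^{-1}$ avoids solving a linear system in every iteration, since both $\bar z$ and $\bar M^{-1}$ only need the product $M^{-1}a$.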

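Proof of Theorem 5

$$u := M^{-1/2}a, \qquad b := \|u\|e_1, \qquad R := \frac{2(u + b)(u + b)^T}{\|u + b\|^2} - I.$$
One has $R^T = R$, $Ru = b$, $R^2 = I$. Hence
$$\begin{aligned}
\bar M &= \frac{n^2 - 1}{n^2}\left(M + \frac{2}{n-1}\,\frac{aa^T}{a^TM^{-1}a}\right)\\
&= \frac{n^2 - 1}{n^2}\,M^{1/2}\left(I + \frac{2}{n-1}\,\frac{M^{-1/2}aa^TM^{-1/2}}{a^TM^{-1}a}\right)M^{1/2}\\
&= \frac{n^2 - 1}{n^2}\,M^{1/2}R\left(I + \frac{2}{n-1}\,\frac{RM^{-1/2}aa^TM^{-1/2}R}{a^TM^{-1}a}\right)RM^{1/2}\\
&= \frac{n^2 - 1}{n^2}\,M^{1/2}R\left(I + \frac{2}{n-1}\,\frac{RM^{-1/2}aa^TM^{-1/2}R}{a^TM^{-1/2}RRM^{-1/2}a}\right)RM^{1/2}\\
&= \frac{n^2 - 1}{n^2}\,M^{1/2}R\left(I + \frac{2}{n-1}e_1e_1^T\right)RM^{1/2}.
\end{aligned}$$
Therefore,
$$\det \bar M = \left(\frac{n^2 - 1}{n^2}\right)^n \det M\,(\det R)^2\,\det\left(I + \frac{2}{n-1}e_1e_1^T\right) = \left(\frac{n^2 - 1}{n^2}\right)^n\left(1 + \frac{2}{n-1}\right)\det M,$$
and
$$\frac{\det \bar M}{\det M} = \left(\frac{n^2 - 1}{n^2}\right)^n\left(1 + \frac{2}{n-1}\right) > e^{\frac{1}{n+1}}.$$
Hence
$$\frac{\mathrm{vol}\,E_{\bar M,\bar z}}{\mathrm{vol}\,E_{M,z}} = \sqrt{\frac{\det M}{\det \bar M}} < e^{-\frac{1}{2(n+1)}}. \quad\bullet$$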

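Proof of Theorem 4

$$E_{M,z} \cap \{x : a^Tx \ge a^Tz\} \subseteq E_{\bar M,\bar z}.$$

Proof: Suppose $(x - z)^TM(x - z) \le 1$ and $a^Tx \ge a^Tz$. Put
$$u = M^{-1/2}a, \qquad Ru = b = \|u\|e_1, \qquad y := RM^{1/2}(x - z).$$
Then
$$y^Ty = (x - z)^TM^{1/2}R^2M^{1/2}(x - z) = (x - z)^TM(x - z) \le 1.$$
Moreover,
$$\|u\|\,y_1 = \|u\|\,e_1^Ty = b^Ty = u^TRy = a^TM^{-1/2}RRM^{1/2}(x - z) = a^T(x - z) \ge 0.$$
Now
$$\begin{aligned}
(x - \bar z)^T\bar M(x - \bar z)
&= \left(x - z - \tfrac{1}{n+1}\tfrac{M^{-1}a}{\sqrt{a^TM^{-1}a}}\right)^T \bar M \left(x - z - \tfrac{1}{n+1}\tfrac{M^{-1}a}{\sqrt{a^TM^{-1}a}}\right)\\
&= \left(x - z - \tfrac{1}{n+1}\tfrac{M^{-1}a}{\sqrt{a^TM^{-1}a}}\right)^T M^{1/2}R\,\frac{n^2 - 1}{n^2}\left(I + \frac{2}{n-1}e_1e_1^T\right)RM^{1/2}\left(x - z - \tfrac{1}{n+1}\tfrac{M^{-1}a}{\sqrt{a^TM^{-1}a}}\right)\\
&= \left(y - \tfrac{1}{n+1}\tfrac{\|u\|e_1}{\|u\|}\right)^T \frac{n^2 - 1}{n^2}\left(I + \frac{2}{n-1}e_1e_1^T\right)\left(y - \tfrac{1}{n+1}\tfrac{\|u\|e_1}{\|u\|}\right) \le 1.
\end{aligned}$$
The inequality follows just as in the proof of Theorem 2, since $y^Ty \le 1$ and $y_1 \ge 0$. •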

Two theorems

Theorem 6 Suppose we want to find a point in the set $S$, with $S \subseteq E_{M^0,z^0}$ and $\mathrm{vol}(S) > 0$. Then the algorithm will find a point in $S$ after at most
$$\left\lfloor 2(n+1)\ln\left(\frac{\mathrm{vol}\,E_{M^0,z^0}}{\mathrm{vol}(S)}\right)\right\rfloor$$
iterations.

Proof: After $k$ iterations, we have
$$\mathrm{vol}\,E_{M^k,z^k} \le \mathrm{vol}\,E_{M^0,z^0}\; e^{-\frac{k}{2(n+1)}}.$$
Since $S \subseteq E_{M^k,z^k}$ for each $k$ we obtain
$$\mathrm{vol}(S) \le \mathrm{vol}\,E_{M^0,z^0}\; e^{-\frac{k}{2(n+1)}}.$$
Hence
$$\ln \mathrm{vol}(S) \le \ln \mathrm{vol}\,E_{M^0,z^0} - \frac{k}{2(n+1)},$$
which implies
$$k \le 2(n+1)\ln\left(\frac{\mathrm{vol}\,E_{M^0,z^0}}{\mathrm{vol}(S)}\right).$$
This proves the theorem. •

Two theorems (cont.)

Let $B(c,\delta) := \{x : \|x - c\| \le \delta\}$.

Theorem 7 Suppose we know $R$ such that $S \subseteq B(0,R)$ and that $S$ contains a ball $B(\hat x, r)$ for some $\hat x$ and $r > 0$. Then the algorithm will find a point in $S$ after at most
$$2n(n+1)\ln\left(\frac{R}{r}\right)$$
iterations.

Proof: After $k$ iterations, we have
$$\mathrm{vol}(B(\hat x, r)) \le \mathrm{vol}(S) \le \mathrm{vol}(B(0,R))\; e^{-\frac{k}{2(n+1)}}.$$
Taking logarithms again we get
$$\ln \mathrm{vol}(B(\hat x, r)) \le \ln \mathrm{vol}(B(0,R)) - \frac{k}{2(n+1)},$$
and hence
$$k \le 2(n+1)\ln\left(\frac{\mathrm{vol}(B(0,R))}{\mathrm{vol}(B(\hat x, r))}\right) = 2(n+1)\ln\left(\frac{\nu(n)R^n}{\nu(n)r^n}\right) = 2(n+1)\,n\ln\left(\frac{R}{r}\right).$$
This proves the theorem. •
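The two bounds are easy to evaluate. The following sketch, with hypothetical function names and made-up values for $n$, $R$ and $r$, computes the volume-based bound of Theorem 6 and the radius-based bound of Theorem 7 and shows that they agree when the volumes are those of the two balls.

```python
import math


def unit_ball_volume(n):
    """nu(n) = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def bound_from_volumes(n, vol_E0, vol_S):
    """Theorem 6: floor(2 (n+1) ln(vol E0 / vol S))."""
    return math.floor(2 * (n + 1) * math.log(vol_E0 / vol_S))


def bound_from_radii(n, R, r):
    """Theorem 7: 2 n (n+1) ln(R / r)."""
    return 2 * n * (n + 1) * math.log(R / r)


n, R, r = 5, 10.0, 0.1
nu = unit_ball_volume(n)
print(bound_from_radii(n, R, r))
print(bound_from_volumes(n, nu * R ** n, nu * r ** n))
```

The bound grows only logarithmically in $R/r$ but quadratically in the dimension $n$.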

Two more theorems

Actually, we can improve the previous two results a lot in terms of what we need to know.

Theorem 8 Suppose $\mathrm{vol}(S \cap E_{M^0,z^0}) > 0$. Then the algorithm will find a point in $S$ after at most
$$2(n+1)\ln\left(\frac{\mathrm{vol}\,E_{M^0,z^0}}{\mathrm{vol}(S \cap E_{M^0,z^0})}\right)$$
iterations.

Proof: Obvious. •

Theorem 9 Suppose we know $R$ and that $S \cap B(0,R)$ contains a ball $B(\hat x, r)$ for some $\hat x$ and $r > 0$. Then the algorithm will find a point in $S$ after at most
$$2n(n+1)\ln\left(\frac{R}{r}\right)$$
iterations.

Proof: Obvious. •

Optimization with the ellipsoid method

Consider the minimization problem
$$z^* = \min\{c^Tx : x \in S\}.$$
Let
$$P_\epsilon := \{x : x \in S,\ c^Tx \le z^* + \epsilon\}.$$

Theorem 10 Suppose we know $E_{M^0,z^0}$ such that $E_{M^0,z^0} \cap P_\epsilon \ne \emptyset$. Then the algorithm will find a point in $P_\epsilon$ after at most
$$\left\lfloor 2(n+1)\ln\left(\frac{\mathrm{vol}\,E_{M^0,z^0}}{\mathrm{vol}(E_{M^0,z^0} \cap P_\epsilon)}\right)\right\rfloor$$
iterations.

Proof: Obvious. •

Theorem 11 Suppose we know $R$ such that $B(0,R) \cap P_\epsilon$ contains a ball $B(\hat x, r)$ for some $\hat x$ and $r > 0$. Then the algorithm will find a point in $P_\epsilon$ after at most
$$2(n+1)\,n\ln\left(\frac{R}{r}\right)$$
iterations.

Proof: Obvious. •

Optimization (cont.)

Unfortunately, the bounds above require that we know $R$. Let us fix this.
$$(P)\quad \min\{c^Tx : x \in S\} \qquad\Leftrightarrow\qquad (H)\quad \min\left\{\frac{c^Ty}{\theta} : \frac{y}{\theta} \in S,\ \theta > 0\right\}.$$
Note that the problem $(H)$ at the right is homogeneous: if $(y,\theta)$ is feasible (or optimal) then so is $(\lambda y, \lambda\theta)$ for any $\lambda > 0$. Taking $\lambda = 1/\theta$ we get a feasible (or optimal) solution $x = y/\theta$ of $(P)$.

The objective function in (H) is not convex. However, the level sets are convex and this is enough for the ellipsoid method.

As far as $y$ is concerned, the feasible region is the cone generated by the convex set $S$, without the origin (since $\theta > 0$). Hence the (convex!) feasible region is not closed, but this does not hurt the ellipsoid method either.

Finally, by adding the constraint
$$\|(y,\theta)\| \le 1$$
we lose nothing, and the whole feasible region lies in $B((0,0),1)$!
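The homogenization can be illustrated with a few helper functions. This is a sketch with hypothetical names and made-up data: it evaluates the objective of $(H)$, rescales a point $(y,\theta)$ into the unit ball without changing $y/\theta$, and maps the result back to a point of $(P)$.

```python
import math


def objective_H(c, y, theta):
    """Objective of (H): c^T y / theta, defined for theta > 0."""
    return sum(ci * yi for ci, yi in zip(c, y)) / theta


def recover_x(y, theta):
    """Map a point (y, theta) of (H) to the point x = y / theta of (P)."""
    return [yi / theta for yi in y]


def scale_into_unit_ball(y, theta):
    """Rescale (y, theta) by some lambda > 0 so that ||(y, theta)|| <= 1."""
    norm = math.sqrt(sum(yi * yi for yi in y) + theta * theta)
    lam = 1.0 / max(norm, 1.0)
    return [lam * yi for yi in y], lam * theta


c, x = [1.0, 2.0], [3.0, 4.0]            # made-up data; c^T x = 11
y, theta = scale_into_unit_ball(x, 1.0)  # (x, 1) rescaled into the unit ball
print(objective_H(c, y, theta))          # unchanged by the rescaling
print(recover_x(y, theta))               # recovers x up to rounding
```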

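Convexity of the homogeneous problem

Proposition 1 The set
$$\left\{(y,\theta) : \frac{y}{\theta} \in S,\ \theta > 0\right\}$$
is convex.

Proof: Let $(y_1,\theta_1)$, $(y_2,\theta_2)$ belong to the set. For $0 \le \lambda \le 1$, let $(y,\theta) = \lambda(y_1,\theta_1) + (1-\lambda)(y_2,\theta_2)$. Then
$$\frac{y}{\theta} = \frac{\lambda y_1 + (1-\lambda)y_2}{\lambda\theta_1 + (1-\lambda)\theta_2} = \frac{\lambda\theta_1\frac{y_1}{\theta_1} + (1-\lambda)\theta_2\frac{y_2}{\theta_2}}{\lambda\theta_1 + (1-\lambda)\theta_2} = \alpha\frac{y_1}{\theta_1} + (1-\alpha)\frac{y_2}{\theta_2} \in S,$$
where $\alpha = \frac{\lambda\theta_1}{\lambda\theta_1 + (1-\lambda)\theta_2}$. •

Proposition 2 The level sets of $\frac{c^Ty}{\theta}$ are convex.

Proof: For any $z \in \mathbb{R}$ one has
$$\frac{c^Ty}{\theta} \le z \;\Leftrightarrow\; c^Ty - \theta z \le 0.$$
Here we used $\theta > 0$. This implies the result. •

The Key Proposition

We have seen that we can apply the ellipsoid method to the homogeneous problem $(H)$, with $R = 1$.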

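$B_n$ denotes the unit ball in $\mathbb{R}^n$.

Proposition 3 Suppose the level set $P_\epsilon$ contains a ball $B(\hat x, r)$. Let
$$HP_\epsilon := \left\{(y,\theta) : \frac{y}{\theta} \in P_\epsilon,\ \theta > 0\right\}.$$
Then
$$\frac{\mathrm{vol}(B_{n+1})}{\mathrm{vol}(HP_\epsilon \cap B_{n+1})} \le \frac{(n+1)\,\nu(n+1)\left(1 + (r + \|\hat x\|)^2\right)^{\frac{n+1}{2}}}{\nu(n)\,r^n}.$$

Theorem 12 Solving $(P)$ via $(H)$, starting at the unit ball $B_{n+1}$, the algorithm will find an $\epsilon$-solution of $(H)$ (and hence of $(P)$) after at most
$$\left\lfloor (n+1)(n+2)\ln\left(\frac{1}{r^2} + \left(1 + \frac{\|\hat x\|}{r}\right)^2\right) + 2(n+2)\ln\left(\pi r\sqrt{n}\right)\right\rfloor$$
iterations.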
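As a hedged illustration, the bound of Theorem 12 can be evaluated directly. The function name and the sample values of $n$, $r$ and $\|\hat x\|$ are made up.

```python
import math


def theorem12_bound(n, r, xhat_norm):
    """Iteration bound of Theorem 12 for dimension n, radius r and ||xhat||."""
    t = 1.0 / r ** 2 + (1.0 + xhat_norm / r) ** 2
    return math.floor((n + 1) * (n + 2) * math.log(t)
                      + 2 * (n + 2) * math.log(math.pi * r * math.sqrt(n)))


print(theorem12_bound(2, 0.5, 1.0))
print(theorem12_bound(10, 0.01, 5.0))
```

Note that $R$ no longer appears: only the dimension, the radius $r$ and the norm of the center $\hat x$ enter the bound.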

Graphical illustration

[Figure: graphical illustration; labels include $B_{n+1}$ and $x$.]

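Proof of Proposition 3

$$HB(\hat x, r) := \left\{(y,\theta) : \frac{y}{\theta} \in B(\hat x, r),\ \theta > 0\right\}.$$
The longest vector in the set
$$T := \left\{(y,\theta) : \frac{y}{\theta} \in B(\hat x, r),\ 0 < \theta \le 1\right\} \subseteq HB(\hat x, r)$$
is the vector
$$\begin{pmatrix}\hat x\\ 1\end{pmatrix} + r\begin{pmatrix}\hat x/\|\hat x\|\\ 0\end{pmatrix} = \begin{pmatrix}\left(1 + \frac{r}{\|\hat x\|}\right)\hat x\\ 1\end{pmatrix}$$
and has length
$$\delta := \sqrt{1 + \left(1 + \frac{r}{\|\hat x\|}\right)^2\|\hat x\|^2} = \sqrt{1 + (r + \|\hat x\|)^2}.$$
Hence
$$\delta^{-1}T \subseteq B_{n+1} \cap HB(\hat x, r) \subseteq B_{n+1} \cap HP_\epsilon.$$
Therefore,
$$\begin{aligned}
\mathrm{vol}(B_{n+1} \cap HP_\epsilon) &\ge \mathrm{vol}(\delta^{-1}T) = \frac{1}{\delta^{n+1}}\mathrm{vol}(T) = \frac{1}{\delta^{n+1}}\int_0^1 \nu(n)\,\theta^n r^n\, d\theta\\
&= \frac{\nu(n)\,r^n}{(n+1)\,\delta^{n+1}} = \frac{\nu(n)\,r^n}{(n+1)\left(1 + (r + \|\hat x\|)^2\right)^{\frac{n+1}{2}}}.
\end{aligned}$$
Hence Proposition 3 follows. •

Second proof of Proposition 3

$$HB(\hat x, r) := \left\{(y,\theta) : \frac{y}{\theta} \in B(\hat x, r),\ \theta > 0\right\}.$$
Let $dS(x)$ denote an infinitesimal surface element around $x \in T$, where
$$T := \left\{(y,\theta) : \frac{y}{\theta} \in B(\hat x, r),\ \theta = 1\right\} \subseteq HB(\hat x, r).$$
The surface on $B_{n+1}$ cut off by the cone generated by $dS(x)$ is equal to
$$\frac{\sin\phi(x)\, dS(x)}{\delta(x)^n},$$
where $\delta(x) = \|(x,1)\|$, and $\sin\phi(x) = 1/\delta(x)$. Thus the surface on $B_{n+1}$ cut off by the cone generated by $T$ equals
$$\int_{x \in B(\hat x, r)} \frac{dS(x)}{\delta(x)^{n+1}} \ge \frac{1}{\delta^{n+1}}\int_{x \in B(\hat x, r)} dS(x) = \frac{\nu(n)\,r^n}{\delta^{n+1}},$$
because we have for each $x \in T$ that $\delta(x) \le \delta$, with $\delta$ as in the previous proof. Hence $\mathrm{vol}(B_{n+1} \cap HP_\epsilon)$ is at least equal to
$$\frac{\nu(n)\,r^n/\delta^{n+1}}{\mathrm{surf}(B_{n+1})}\,\mathrm{vol}(B_{n+1}) = \frac{\nu(n)\,r^n/\delta^{n+1}}{(n+1)\,\nu(n+1)}\,\mathrm{vol}(B_{n+1}) = \frac{\nu(n)\,r^n}{(n+1)\,\delta^{n+1}}.$$
Hence, in the same way as in the previous proof, Proposition 3 follows. •

Proof of Theorem 12

By Theorem 10 the algorithm needs at most
$$2(n+2)\ln\left(\frac{\mathrm{vol}(B_{n+1})}{\mathrm{vol}(HP_\epsilon \cap B_{n+1})}\right)$$
iterations. Due to Proposition 3 this is
$$\le 2(n+2)\ln\left(\frac{(n+1)\,\nu(n+1)\left(1 + (r + \|\hat x\|)^2\right)^{\frac{n+1}{2}}}{\nu(n)\,r^n}\right).$$
Since
$$\frac{(n+1)\,\nu(n+1)}{\nu(n)} = \frac{(n+1)\,\pi^{\frac{n+1}{2}}\,\Gamma\left(\frac{n}{2} + 1\right)}{\pi^{\frac{n}{2}}\,\Gamma\left(\frac{n+1}{2} + 1\right)} \le \pi\sqrt{n},$$
we obtain (omitting the brackets) the bound
$$\begin{aligned}
2(n+2)\ln\left(\frac{\pi\sqrt{n}\left(1 + (r + \|\hat x\|)^2\right)^{\frac{n+1}{2}}}{r^n}\right)
&= 2(n+2)\ln\left(\pi r\sqrt{n}\left(\frac{1}{r^2} + \left(1 + \frac{\|\hat x\|}{r}\right)^2\right)^{\frac{n+1}{2}}\right)\\
&= (n+1)(n+2)\ln\left(\frac{1}{r^2} + \left(1 + \frac{\|\hat x\|}{r}\right)^2\right) + 2(n+2)\ln\left(\pi r\sqrt{n}\right).
\end{aligned}$$
This completes the proof. •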
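The inequality $(n+1)\nu(n+1)/\nu(n) \le \pi\sqrt{n}$ used in this proof is stated without derivation. The sketch below, with hypothetical function names, checks it numerically for $n = 1, \dots, 50$ using `math.gamma`; it is a sanity check over a finite range, not a proof.

```python
import math


def unit_ball_volume(n):
    """nu(n) = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def surface_to_volume_ratio(n):
    """(n+1) nu(n+1) / nu(n): surface of B_{n+1} divided by volume of B_n."""
    return (n + 1) * unit_ball_volume(n + 1) / unit_ball_volume(n)


# Small tolerance because equality holds (up to rounding) for n = 1.
holds = all(surface_to_volume_ratio(n) <= math.pi * math.sqrt(n) + 1e-12
            for n in range(1, 51))
print(holds)
```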
