Math. Program., Ser. B 104, 263Ð292 (2005)
Digital Object Identifier (DOI) 10.1007/s10107-005-0616-1
James V. Burke · Adrian S. Lewis · Michael L. Overton Variational analysis of functions of the roots of polynomials
Dedicated to R. Tyrrell Rockafellar on the occasion of his 70th birthday. Terry is one of those rare individuals who combine a broad vision, deep insight, and the outstanding writing and lecturing skills crucial for engaging others in his subject. With these qualities he has won universal respect as a founding father of our discipline. We, and the broader mathematical community, owe Terry a great deal. But most of all we are personally thankful to Terry for his friendship and guidance.
Received: May 2, 2004 / Accepted: February 25, 2005 Published online: July 14, 2005 Ð © Springer-Verlag 2005
Abstract. The Gauss-Lucas Theorem on the roots of polynomials nicely simplifies the computation of the subderivative and regular subdifferential of the abscissa mapping on polynomials (the maximum of the real parts of the roots). This paper extends this approach to more general functions of the roots. By combining the Gauss-Lucas methodology with an analysis of the splitting behavior of the roots, we obtain characterizations of the subderivative and regular subdifferential for these functions as well. In particular, we completely char- acterize the subderivative and regular subdifferential of the radius mapping (the maximum of the moduli of the roots). The abscissa and radius mappings are important for the study of continuous and discrete time linear dynamical systems.
1. Introduction
Let Pn denote the linear space of complex polynomials of degree n or less. Define the root mapping on Pn to be the multifunction R : Pn → C given by R(p) = {λ | p(λ) = 0} , and let φ : C → IR ∪ {+∞} = IR¯ be a lower semi-continuous convex function. We are concerned with the variational properties of functions φˆ : Pn → IR ∪ {±∞} defined as φ(p)ˆ = sup {φ(λ) | λ ∈ R(p)} . (1) The abscissa and radius mappings on Pn are obtained by taking φ(λ) = Re λ and φ(λ) = |λ|, respectively. Indeed, the abscissa and radius mappings are the primary motivation for
J.V. Burke: Department of Mathematics, University of Washington, Seattle, WA 98195, USA. e-mail: [email protected] Research supported in part by the National Science Foundation Grant DMS-0203175. A. S. Lewis: School of Operations Research and Industrial Engineering, Cornell University, Ithaca, NY 14853, USA. e-mail: [email protected] Research supported in part by the Natural Sciences and Engineering Research Council of Canada. M.L. Overton: Courant Institute of Mathematical Sciences, NewYork University, NewYork, NY 10012, USA. e-mail: [email protected] Research supported in part by the National Science Foundation Grant DMS-0412049. Mathematics Subject Classification (1991): 90C46, 49K40, 65K05, 15A42 264 J.V. Burke et al. this study. The variational behavior of these functions is important for our understanding of the stability properties of continuous and discrete time dynamical systems [1, 2]. The functions defined by (1) possess a very rich variational structure. In addition, these functions are related to important applied problems. Consequently, they offer an ideal setting in which to test the utility and robustness of any theory for analyzing the variational structure of nonsmooth functions. We study the variational behavior of the class (1) on the set Mn of polynomials of degree n. This set is an open dense subset of the linear space Pn (endowed with the topology of pointwise convergence). Note that on the set Mn the sup in (1) can be replaced by max. The supremum in (1) is only required for constant polynomials. We also note that the non-constant members of the class (1) are never locally Lipschitz on Mn and are always unbounded in the neighborhood of any point on the boundary of Mn. The non-Lipschitzian behavior is seen by considering n ˆ the family of polynomials p(λ) = (λ − λ0) − . To see that φ must be unbounded on the boundary of Mn recall that for any non-constant convex function φ there must exist a nonzero direction a ∈ C for which φ(τa) →+∞as τ ↑∞.Ifp ∈ Pn is any −1 polynomial of degree less than n, the polynomial defined by q(λ) = (1 − a λ)p(λ) n −1 is in P , and satisfies q → p as 0. Moreover, a ∈ R(q) for all >0. ˆ Therefore, φ(q) →+∞as 0. It is this essential unboundedness of the roots on the boundary of Mn that motivates the restriction to Mn.OnMn the roots of polynomials are continuous functions of their coefficients, so the functions φˆ defined in (1) are lower semi-continuous on Mn. We use the tools developed in [6, 7, 12, 14] to study the variational properties of φˆ. Our earlier work demonstrates that these techniques are well suited to applications in stability theory [1Ð5]. In addition, we make fundamental use of a classical result originally due to Gauss and commonly known as the Gauss-Lucas Theorem. This result establishes a beautiful and elementary convexity relationship between the roots of a polynomial and the roots of its derivative. Theorem 1. [Gauss-Lucas] All critical points of a non-constant polynomial p lie in the convex hull H of the set of roots of p. If the roots of p are not collinear, no critical point of p lies on the boundary of H unless it is a multiple root of p. The Gauss-Lucas Theorem implies the following chain of inclusions for any poly- nomial of degree n:
− − conv R(p(n 1)) ⊂ conv R(p(n 2) ⊂···⊂conv R(p ) ⊂ conv R(p), (2) where conv S denotes the convex hull of the set S. Theorem 1 is referred to by Gauss as early as 1830 and has been rediscovered many times. In 1879, Lucas [9, 10] published a refinement of Gauss’s result. For more on the Gauss-Lucas Theorem and its uses see Marden [11]. With regard to the chain (2), it is useful to observe that
φ(p)ˆ = sup {φ(λ) | λ ∈ R(p)} = sup {φ(λ) | λ ∈ conv R(p)} . (3)
To see this note that the second supremum is clearly greater than or equal to the first so we need only show the reverse inequality. Let λ ∈ conv R(p), then, by Caratheodory, Variational analysis of functions of the roots of polynomials 265 there exist λ1,λ2,λ3 ∈ R(p) and 0 ≤ µ1,µ2,µ3 with µ1 + µ2 + µ3 = 1 such that λ = µ1λ1 + µ2λ2 + µ3λ3. But then by convexity
φ(λ) ≤ µ1φ(λ1) + µ2φ(λ2) + µ3φ(λ3) ≤ max φ(λj ), j=1,2,3 so that the second supremum in (3) is less than or equal to the first. In [5], Burke and Overton investigate the variational properties of the abscissa map- ping using an approach modeled on work of Levantovskii [8]. However, this approach is difficult and lengthy, and provides little insight into the underlying variational geometry. Furthermore, extending this approach to other functions of the roots of polynomials would be a daunting task at best. In [3], an approach based on the Gauss-Lucas Theo- rem is introduced to simplify the derivation of the tangent cone to the epigraph of the n abscissa mapping at the polynomial p(λ) = (λ − λ0) . This derivation is one of the two most difficult technical hurdles in [5]. The second is the verification of subdifferential regularity. In this paper, we apply the Gauss-Lucas ideas in [3] to the class of functions given by (1). As in [3], we focus on the computation of the tangent cone to the epigraph at the polynomials
= − n e(n,λ0)(λ) (λ λ0) .
We recover our earlier result for the abscissa mapping and obtain the corresponding result for the radius mapping which is stated below. Here and throughout, we denote the complex conjugate of the complex scalar z by z¯. Theorem 2. Let r : Pn → IR denote the radius mapping on Pn:
r(p) = max {|λ| | λ ∈ R(p)} . ∈ Pn × = n Let (v, η) IR be such that v k=0 bke(n−k,λ0). (i) (v, η) is an element of the tangent cone to the epigraph of r at the polynomial p(λ) = λn if and only if
1 η ≥ |b | and 0 = b ,k= 2, 3,...,n. n 1 k (ii) (v, η) is an element of the tangent cone to the epigraph of r at the polynomial n p(λ) = (λ − λ0) with λ0 = 0 if and only if 1 η ≥ |b |− λ¯ b , | | 2 Re 0 1 n λ0 ¯ 0 = Re λ0 −b2, and 0 = bk,k= 3,...,n.
The notation and definitions follow those established in [14]. The definitions of the terms epigraph and tangent cone used in Theorem 2 appear in the next section. Notation specific to the study of polynomials on the complex plane is introduced below. 266 J.V. Burke et al.
Recall that the complex plane C is a Euclidean space when endowed with the usual real inner product ω, λ = Re ωλ¯ . By extension, Cn is also a Euclidean space when given the real inner product n T T (ω1,...,ωn) ,(λ1,...,λn) = ωk ,λk . = k 1 ∈ C | = Pn Given λ0 we define the basis e(k,λ0) k 0, 1,...,n for , where = − k = e(k,λ0)(λ) (λ λ0) ,k 0, 1,...,n. Each such basis defines a real inner product (or duality pairing)onPn: n = ¯ p, q λ0 Re akbk, = k 0 = n = n = where p k=0 ake(k,λ0) and q k=0 bke(k,λ0). In the case n 0, we recover the real inner product on C. When λ0 = 0, we drop the subscript on the inner product and simply write · , · . Note that this family of inner products behaves continuously → in p, q, and λ0 in the sense that the mapping (p,q,λ0) p, q λ0 is continuous on Pn × Pn × C. To see this simply note that n p(k)(λ) q(k)(λ) p, q = Re . λ k! k! k=0 The inner product√ on a Euclidean space gives rise to a norm |·| in the usual way by setting |x| = x,x . Given a mapping φ : C → IR¯ , we define the mapping φ˜ : IR 2 → IR¯ by the composition φ˜ = φ ◦ , where : IR 2 → C is the linear transformation
(x) = x1 + i x2, (4) i ∈ C denoting the imaginary unit. If we endow IR 2 with its usual inner product and C with the inner product above, we have − ∗ Re µ 1µ = µ = . Im µ We say that φ is differentiable in the real sense if φ˜ is differentiable, in which case the derivative of φ is given by the chain rule as φ(ζ ) = ∇φ(˜ ∗ζ). Here ∇φ˜ denotes the gradient of φ˜. Similarly, we say that φ is twice differentiable in the real sense if φ˜ is twice differentiable and again the chain rule gives ∗ ∗ φ (ζ )δ = ∇2φ(˜ ζ) δ, (5) where ∇2φ˜ denotes the Hessian of φ˜. Since these are the only notions of differentiabil- ity we employ, we omit the qualifying phrase in the real sense when specifying that a function from C to IR¯ is differentiable or twice differentiable. We also make use of the following notation: φ (ζ ; δ) = φ (ζ ) , δ and φ (ζ; ω,δ) = ω, φ (ζ )δ . Variational analysis of functions of the roots of polynomials 267
2. The tangent cone and the subderivative
We use the tools in [7, 14] to describe the variational geometry of the function φˆ. Recall that the epigraph of a function f mapping a Euclidean space E into the extended real numbers IR¯ = IR ∪ {+∞} is the subset of E × IR given by epi (f ) = {(x, µ) | f(x)≤ µ} , (6) and the domain is the set dom (f ) = {x | f(x)<∞} . The fundamental variational object for f is the tangent cone to its epigraph [14, Definition 6.1]. Given a set C in a linear space S and a point x ∈ C, the tangent cone to C at x is defined to be the set ∃{ k}⊂ { }⊂ = x C, tk IR such that TC(x) d k → ↓ −1 k − → . x x, tk 0, and tk (x x) d The tangent cone to epi (f ) at a point can be viewed as the epigraph of another function called the subderivative of f at x. It is denoted by df (x) [14, Theorem 8.2]:
epi (df (x)) = {(w, η) | df (x)(w) ≤ η } = Tepi (f )(x, f (x)) (7) for all x ∈ dom (f ) = {x | f(x)<+∞}. The subderivative generalizes the notion of a directional derivative as seen by the following alternative formula [14, Definition 8.1]: f(x+ tw) − f(x) df (x)(w) = lim inf . (8) t0 t w→w In the case of the function class (1), the computation of the tangent cone is simplified due to the fact that (p, η) ∈ epi (φ)ˆ if and only if (ζp, η) ∈ epi (φ)ˆ for every nonzero complex scalar ζ. Lemma 1. Define Mn ⊂ Mn to be the set of monic polynomials of degree n: 1 Mn = + ∈ Pn−1 1 e(n,0) q q .
ˆ n ¯ Define φ1 : P → IR by φ(p)ˆ ,ifp ∈ Mn, φˆ (p) = 1 1 +∞ , otherwise.
Let λ0 ∈ dom (φ), bk ∈ C,k= 0, 1,...,n, η ∈ IR , and set n n = ˜ = v bke(n−k,λ0) and v bke(n−k,λ0). k=0 k=1 Then ∈ (v, η) Tepi (φ)ˆ e(n,λ0),φ(λ0) if and only if (v,η)˜ ∈ T ˆ e ,φ(λ ) . epi (φ1) (n,λ0) 0 268 J.V. Burke et al.
Remark 1. The lemma shows that T ˆ e ,φ(λ ) = C e + T ˆ e ,φ(λ ) , epi (φ) (n,λ0) 0 (n,λ0) epi (φ1) (n,λ0) 0 C = | ∈ C where e(n,λ0) ξe(n,λ0) ξ . Therefore, we can restrict our analysis of the tan- Mn gent cone to sequences that lie in the set 1. This is the approach taken in [4]. Here we work on the seemingly more general space Mn in order to simplify applications. ∈ ↓ Proof. Suppose (v, η) Tepi (φ)ˆ e(n,λ0),φ(λ0) , that is, there exists a sequence ξj 0 such that + + + + ∈ ˆ (e(n,λ0) ξj v o(ξj ), φ(λ0) ξj η o(ξj )) epi (φ), or equivalently, ξ v˜ + j + + + ∈ ˆ (e(n,λ0) o(ξj ), φ(λ0) ξj η o(ξj )) epi (φ1). 1 + ξj b0 Hence (v,η)˜ ∈ T ˆ e(n,λ ),φ(λ0) . epi (φ1) 0 Conversely, suppose (v,η)˜ ∈ T ˆ e ,φ(λ ) . By definition, there exists epi (φ1) (n,λ0) 0 ξj ↓ 0 such that + ˜ + + + ∈ ˆ (e(n,λ0) ξj v o(ξj ), φ(λ0) ξj η o(ξj )) epi (φ1). (9) + ˜ + + Multiplying e(n,λ0) ξj v o(ξj ) by (1 ξj b0) gives + + ˜ + = + + (1 ξj b0)(e(n,λ0) ξj v o(ξj )) e(n,λ0) ξj v o(ξj ). That is, (9) is equivalent to + + + + ∈ ˆ (e(n,λ0) ξj v o(ξj ), φ(λ0) ξj η o(ξj )) epi (φ) . ∈ Therefore, (v, η) Tepi (φ)ˆ e(n,λ0),φ(λ0) which proves the result. Next recall that the epigraph of the convex function φ as well as all of the lower level sets
levφ(µ) = {λ | φ(λ) ≤ µ} are convex sets [13]. This allows us to apply the Gauss-Lucas Theorem in a powerful way. Since ˆ µ ≥ φ(p) ⇐⇒ R(p) ⊂ levφ(µ), (10) the Gauss-Lucas Theorem yields the following chain of inclusions for any polynomial of degree n providing (p, µ) ∈ epi (φ)ˆ :
(n−1) conv R(p ) ⊂···⊂conv R(p ) ⊂ conv R(p) ⊂ levφ(µ). (11) This system of inclusions along with subdifferential information about the function φ provide the basis for a set of necessary conditions for a pair (v, η) ∈ Pn × IR to be an element of the tangent cone to epi (φ)ˆ . Variational analysis of functions of the roots of polynomials 269
The convexity of φ implies that φ is directionally differentiable in all directions at every point in λ0 ∈ dom (φ) [13] and one has
φ(λ0 + τξ)− φ(λ0) φ(λ0 + τξ)− φ(λ0) φ (λ0; ξ) = lim = inf . τ↓0 τ τ>0 τ
Taking τ = 1 and ξ = λ − λ0 in the right hand side of this expression gives the subdifferential inequality φ(λ) ≥ φ(λ0) + φ (λ0; λ − λ0).
A vector ω is a subgradient of φ, written ω ∈ ∂φ(λ), if and only if
φ(λ) ≥ φ(λ0) + ω, λ− λ0 ∀ λ ∈ C . (12)
The subdifferential ∂φ(λ0) is always a closed convex set, although it may be empty at points on the boundary of the set dom (φ). The subdifferential is related to the directional derivative of φ by the formula φ (λ0; λ) = sup { z, λ |z ∈ ∂φ(λ0)} . (13)
Therefore, at points
λ0 ∈ dom (∂φ) = {λ | ∂φ(λ) =∅} , we have φ (λ0;·) : C → IR ∪{+∞} is a lower semi-continuous convex function. Since φ (λ0;·) is a convex function, we can use (1) to define ˆ = ;· φλ0 φ (λ0 ). ˆ Pn → ∪ {±∞} That is, we define the function φλ0 : IR by ˆ = ; | = φλ0 (q) sup φ (λ0 λ) q(λ) 0 , and observe that by (13), (3), and (11),
φˆ (q) = sup { z, λ) |(z, λ) ∈ ∂φ(λ ) × conv R(q)} λ0 0 ≥ sup z, λ) (z, λ) ∈ ∂φ(λ0) × conv R(q ) = ˆ φλ0 (q ). (14) The next lemma shows how to extend the subdifferential inequality for φ to the function ˆ ˆ φ by using the function φλ0 .
Lemma 2. Given 0 <τ∈ IR and λ0 ∈ dom (φ), define the linear transformation S Pn → Pn S = + ∈ Pn [τ,λ0] : by [τ,λ0](p)(λ) p(λ0 τλ). Then, for every p , ˆ ≥ + ˆ S ( ) φ(p) φ(λ0) τφλ0 ([ [τ,λ0](p)] ), (15) for = 0, 1,...,(deg(p) − 1). 270 J.V. Burke et al.
Proof. For the case = 0, we have ˆ = { | = } φ(p) max φ(λ) p(λ) 0 = max φ(λ) S (p)(γ ) = 0,λ= λ + τγ [τ,λ0] 0 = max φ(λ + τγ) S (p)(γ ) = 0 0 [τ,λ0] ≥ + ; S = max φ(λ0) τφ (λ0 γ) [τ,λ0](p)(γ ) 0 = + ˆ S φ(λ0) τφλ0 ( [τ,λ0](p)). The remaining cases follow immediately from (14) and (11). We now translate the content of Lemma 2 into statements about the coefficients of the underlying polynomials. Consider the polynomial n = p ake(n−k,λ0), (16) k=0 with a0 = 0. The th derivative of this polynomial is given by n− ( ) = − p ! b(n k, )ake(n−(k+ ),λ0), (17) k=0 ≤ = n! where b(n, k) with k n are the binomial coefficients b(n, k) k!(n−k)! . Applying the S operator [τ,λ0] to p yields n (n−k) S = − [τ,λ0](p) akτ e(n k,0), (18) k=0 and n− ( ) n −k S = − − + [ [τ,λ0](p)] !τ b(n k, )τ ake(n (k ),0), (19) k=0 for = 0, 1, 2,...,(n− 1) (the case = 0 is just (18). With this notation, we have the following simple consequence of Lemma 2. Lemma 3. Let p ∈ Pn be as in (16) and let t ∈ IR be positive. Then s ˆ ≥ + 1/s ˆ − − −k/s φ(p) φ(λ0) t φλ0 b(n k,n s)t ake(s−k,0) , (20) k=0 for s = 1,...,n. Proof. In (19) take = n − s and τ = t1/s for = 0, 1, 2,...,(n− 1), or equivalently, s = 1,...,n, to obtain s (n−s) n/s −k/s S = − − − [τ,λ0](p) !t b(n k,n s)t ake(s k,0), k=0 = ˆ = for s 1,...,n. Plugging this expression into (15) yields the result since φλ0 (p) ˆ φλ0 (ζp) for every nonzero complex number ζ . Variational analysis of functions of the roots of polynomials 271
The main result of the section now follows.
Theorem 3. Let λ0 ∈ dom (∂φ) with ∂φ(λ0) ={0}.If ∈ (v, η) Tepi (φ)ˆ e(n,λ0),φ(λ0) with n = v bke(n−k,λ0), (21) k=0 then
η ≥ φ (λ ;−b /n), (22) 0 1 0 = g, −b2 ∀ g ∈ ∂φ(λ0), and (23)
0 = bk,k= 3,...,n. (24)
Thus, in particular, if rspan (∂φ(λ0)) = C, then b2 = 0, where for any D ⊂ C the set N rspan (D) = τkxk | N ∈ IN , τ k ∈ IR , x k ∈ Dk= 1,...,N k=1 is the real linear span of D. Proof. Let (v, η) ∈ T ˆ e ,φ(λ ) with v given by (21).Then, by Lemma 1, epi (φ) (n,λ 0) 0 (v,η)˜ ∈ T ˆ e ,φ(λ ) , where epi (φ1) (n,λ0) 0 n ˜ = v bke(n−k,λ0). k=1 ˆ Hence there exist sequences tj ↓ 0 and {(pj ,µj )}∈epi (φ1) such that −1 − → ˜ tj ((pj ,µj ) (e(n,λ0),φ(λ0))) (v,η).
{ j j j }∈Cn+1 That is, there exists (a0 ,a1 ,...,an) such that n = j pj ak e(n−k,λ0), k=0
−1 − → j = = j → −1 j → tj (µj φ(λ0)) η, a0 1 for all j 1, 2,..., and ak 0 with tj ak bk for k = 1,...,n. By applying Lemma 3 to pj for each j = 1, 2,... with t = tj we obtain for s = 1, 2,...,n s ≥ + 1/s ˆ − − −k/s j µj φ(λ0) tj φλ0 b(n k,n s)tj ak e(s−k,0) , k=0 272 J.V. Burke et al. or equivalently, s −1/s − ≥ ˆ − − −k/s j tj (µj φ(λ0)) φλ0 b(n k,n s)tj ak e(s−k,0) (25) k=0 for each s = 1,...,n. We now consider the limit as j →∞in each of these inequalities. First observe that s − − −k/s j → − + b(n k,n s)tj ak e(s−k,0) b(n, n s)e(s,0) bs k=0 for s = 1,...,n. Hence s R − − −k/s j → R − + b(n k,n s)tj ak e(s−k,0) b(n, n s)e(s,0) bs k=0 = j = = for s 1,...,n, since a0 1 for all j 1, 2,... and the roots of a polynomial are a continuous function of its coefficients on Mn. Therefore, the lower semi-continuity of ˆ φλ0 and the inequalities (25) imply that ≥ ˆ − + = ;− η φλ0 (b(n, n 1)e(1,0) b1) φ (λ0 b1/n) (26) which proves (22). For s = 2,...,n, the inequalities (25) imply that ˆ 0 ≥ φλ (b(n, n − s)e(s,0) + bs) 0 1/s πki /s −bs ω = e2 = max φ λ0; ω . (27) b(n, n − s) k = 0,...,s− 1
By assumption there exists g ∈ ∂φ(λ0) with g = 0. Inequality (27) implies that −b 1/s 0 ≥ g, s ω , b(n, n − s)
2πki /s for ω = e (k = 0, 1,...,s − 1).Fors = 3,...,nthis can only occur if bs = 0 which gives (24). For s = 2wehave −b 1/2 0 ≥ g,± 2 ∀ g ∈ ∂φ(λ ), b(n, n − 2) 0 or equivalently, condition (23) holds.
The conditions (23)and (24) play a key role in our subsequent analysis. Condition (24) is transparent, but condition (23) is not since it is nonlinear in b2. In the following lemma we describe the underlying convexity of this condition. For this we use the notion of the cone generated by a subset of the complex plane. Given a set D contained in C we define the cone generated by D to be the set
cone (D) = {τx | τ ∈ IR + and x ∈ D } . Variational analysis of functions of the roots of polynomials 273
If D is a convex subset of C, then cone (D) is convex and rspan (D) = cone (D) − cone (D). The polar of any convex cone K in C is the set
◦ K = {z | z, x ≤ 0 ∀ x ∈ K } , and we have the relationship cl (K) = K◦◦.
Lemma 4. Let λ0 ∈ dom (∂φ) with ∂φ(λ0) ={0}, and let b2 ∈ C. Then the following statements are equivalent. √ − = ∈ (i) g, b2 0 for all g√ ∂φ(λ0). (ii) Either b2 = 0 or rspan ( b2) = rspan (∂φ(λ0)). 2 (iii) Either b2 = 0 or cone (∂φ(λ0) ) = cone ({b2}), where 2 2 ∂φ(λ0) = g | g ∈ ∂φ(λ0) .
In addition, the set of all b ∈ C satisfying (i) is given by the convex cone K◦ , where 2 λ0
K =− 2 + 2 λ0 cone (∂φ(λ0) ) i [rspan (∂φ(λ0) )] (28) so that { } = C K◦ = 0 ,ifrspan (∂φ(λ0)) , and λ 2 0 cone (∂φ(λ0) ) , otherwise.
Proof. Observe that if b2 = 0, then the condition in (i) implies that the real linear span rspan (∂φ(λ0)) must be a line through the origin in C since ∂φ(λ0) ={0}. Note that any line through the origin can be written as rspan (v) for some v ∈ C and that the line perpendicular to this line in the real inner product is the line i [rspan (v)]. With these observations, we have the following string of equivalences: √ g, −b2 = 0 ∀ g ∈ ∂φ(λ0) √⇐⇒ either b2 = 0or v, i b2 = 0 ∀ v ∈ rspan (∂φ(λ0)) ⇐⇒ = either b2 0or √ v,w = 0 ∀(v, w) ∈ rspan (∂φ(λ0)) × ( i [rspan ( b2)]) ⇐⇒ √ either b2 = 0 or rspan (∂φ(λ0)) = rspan ( b2), where the final equivalence follows since the lines i [rspan (∂φ(λ0))] and the line rspan (∂φ(λ0)) are perpendicular whenever b2 = 0. Therefore, (i) and (ii) are equivalent. 274 J.V. Burke et al.
The equivalence of (ii) and (iii) follows from the following string of equivalences: √ either b2 = 0 or rspan (∂φ(λ0)) = rspan ( b2) ⇐⇒ = either b2 0or √ ∀ g ∈ ∂φ(λ0) ∃ τ ∈ IR such that g = τ b2 ⇐⇒ either b2 = 0or 2 2 ∀ g ∈ ∂φ(λ0) ∃ τ ∈ IR such that g = τ b2 ⇐⇒ either b2 = 0or 2 cone ({g }) = cone ({b2}) ∀ g ∈ ∂φ(λ0) \{0} ⇐⇒ 2 either b2 = 0 or cone (∂φ(λ0) ) = cone ({b2}).
Next note that the condition, rspan (∂φ(λ0)) = C is equivalent to the condition 2 rspan (∂φ(λ0) ) = C since the subdifferential ∂φ(λ0) is a convex set. In this case the K = C, and so K◦ ={0}. Therefore, if rspan (∂φ(λ )) = C, then the set of all b λ0 λ0 0 2 satisfying (23) equals K◦ ={0}. λ0 If rspan (∂φ(λ0)) = C, the set rspan (∂φ(λ0)) is a line through the origin since ∂φ(λ0) ={0}. This line must equal the real linear span of any nonzero element of 2 ∂φ(λ0). Hence the set cone (∂φ(λ0) ) is a ray emanating from the origin (not a line), K C and the set λ0 is a convex cone in . An easy computation shows that the polar of 2 this cone is the ray cone (∂φ(λ0) ). Hence, by (iii), the set of all b2 satisfying (23) is contained in K◦ . On the other hand, suppose b ∈ K◦ = cone (∂φ(λ )2). Since λ0 2 λ0 0 2 cone (∂φ(λ0) ) is a ray emanating from the origin, we have
◦ K = cone (∂φ(λ )2) = cone ({g2}) ∀ g ∈ ∂φ(λ ) \{0}. λ0 0 0
∈ \{ } ≥ = 2 Hence,√ for√ each g ∂φ(λ0) 0 there is a τg 0 such that b2 τgg , or equivalently, b2 =± τgg. Consequently, √ g, −b2 =± g, i τgg = 0 ∀ g ∈ ∂φ(λ0) \{0}, which shows that every b ∈ K◦ satisfies (23) completing the proof that K◦ is precisely 2 λ0 λ0 the set of all complex numbers that satisfy (23).
If φ is twice continuously differentiable with φ (λ0;·, ·) positive definite, then The- orem 3 can be sharpened. For this we make use of the following technical result. Lemma 5. If H is a 2-by-2 real symmetric matrix with nonnegative trace then the func- tion √ √ f(w)= −1 w, H−1 w is sublinear, i.e. positive homogeneous and subadditive (see (4) for the definition of ). Variational analysis of functions of the roots of polynomials 275
Proof. If ab H = bc and z = x + i y with x and y real, then − − f(z2) = 1z, H 1z = ax2 + 2bxy + cy2 a + c a − c = |z2|+ Re(z2) + b Im(z2). 2 2 Hence a + c a − c f(w)= |w|+ Re(w) + b Im(w) 2 2 and the result follows.
Definition 1. A function f : C → IR is said to be quadratic on C exactly when f composed with (defined in (4)) is quadratic on IR 2.
We now sharpen the inequality (22) using the notation established at the end of Section 1. Theorem 4. If in Theorem 3 it is further assumed that φ is either (i) quadratic, or (ii) twice continuously differentiable at λ0 with φ (λ0;·, ·) positive definite, then condition (22) can be strengthened to 1 η ≥ φ (λ ;−b ) + φ (λ ; −b , −b ) . (29) n 0 1 0 2 2 Proof. If φ is quadratic, then the proof follows essentially the same pattern of proof as in the positive definite case. Therefore, we only provide the proof in the case where φ is assumed to be twice continuously differentiable at λ0 with φ (λ0;·, ·) positive definite. ∈ Let (v, η) Tepi (φ)ˆ e(n,λ0),φ(λ0) with v given by (21). By Theorem 3, (v, η) satisfies (22)Ð(24). By Lemma 1, (v,η)˜ ∈ T ˆ e ,φ(λ ) , epi (φ1) (n,λ0) 0 ˜ = n = = where v k=1 bke(n−k,λ0). By (24), bk 0,k 3,...,n and so there exist sequences tr ↓ 0, ηr → η, b1r → b1,b2r → b2, and bkr → 0,k= 3,...,nsuch that ˆ φ(λ0) + ηr tr + o(tr ) ≥ φ(pr ), (30) where n = + + pr e(n,λ0) (bkrtr o(tr ))e(n−r,λ0) k=1 = + + + + + e(n,λ0) (b1r tr o(tr ))e(n−1,λ0) (b2r tr o(tr ))e(n−2,λ0) o(tr ), 276 J.V. Burke et al. with the second equality following since bkr → 0,k= 3,...,n. Let λ1r ,...,λnr denote the roots of the polynomials pr for r = 1, 2,..., respectively. We have