CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS
A Dissertation Submitted For The Award of the Degree of Master of Philosophy in Mathematics
Neelam Patel
School of Mathematics
Devi Ahilya Vishwavidyalaya (NAAC Accredited Grade “A”)
Indore (M.P.)
2012-2013

Contents
Introduction
Chapter-1 Preliminaries
Chapter-2 Constraint Qualifications
Chapter-3 Lagrangian Duality & Saddle Point Optimality Conditions
References
Introduction
The dissertation is a study of Constraint Qualifications, Lagrangian Duality and Saddle Point Optimality Conditions. In fact, it is a reading of Chapters 5 and 6 of [1]. The first chapter is about preliminaries: we collect results which are useful in the subsequent chapters, such as the Fritz John necessary and sufficient conditions for optimality and the Karush-Kuhn-Tucker necessary and sufficient conditions for optimality. In the second chapter we define the cone of tangents T and show that F₀ ∩ T = ∅ is a necessary condition for local optimality. The constraint qualifications defined are Abadie's, Slater's, Cottle's, Zangwill's, Kuhn-Tucker's and the linear independence constraint qualification. We shall prove

LICQ ⇒ CQ ⇒ ZCQ ⇒ KTCQ ⇒ AQ, and SQ ⇒ CQ,

where CQ, ZCQ, KTCQ, AQ and SQ denote Cottle's, Zangwill's, Kuhn-Tucker's, Abadie's and Slater's constraint qualifications, respectively. We derive the KKT conditions under various constraint qualifications and study their interrelationships. In the third chapter, we define the Lagrangian dual problem and give its geometric interpretation. We prove the weak and strong duality theorems. We also develop the saddle point optimality conditions and their relationship with the KKT conditions. Further, some important properties of the dual function, such as concavity, differentiability and subdifferentiability, are discussed. Special cases of Lagrangian duality for linear and quadratic programs are also discussed.
Chapter 1 Preliminaries
We collect definitions and results which will be useful.

Definition 1.1 (Convex function): Let f : S → R, where S is a nonempty convex set in Rⁿ. The function f is said to be convex on S if

f(λx₁ + (1-λ)x₂) ≤ λf(x₁) + (1-λ)f(x₂) for each x₁, x₂ ∈ S and for each λ ∈ (0, 1).

Definition 1.2 (Pseudoconvex function): Let S be a nonempty open set in Rⁿ, and let f : S → R be differentiable on S. The function f is said to be pseudoconvex if for each x₁, x₂ ∈ S with ∇f(x₁)ᵗ(x₂ − x₁) ≥ 0 we have f(x₂) ≥ f(x₁).

Definition 1.3 (Strictly pseudoconvex function): Let S be a nonempty open set in Rⁿ, and let f : S → R be differentiable on S. The function f is said to be strictly pseudoconvex if for each x₁, x₂ ∈ S with x₁ ≠ x₂ and ∇f(x₁)ᵗ(x₂ − x₁) ≥ 0 we have f(x₂) > f(x₁).

Definition 1.4 (Quasiconvex function): Let f : S → R, where S is a nonempty convex set in Rⁿ. The function f is said to be quasiconvex if, for each x₁ and x₂ ∈ S,
f(λx₁ + (1-λ)x₂) ≤ max{f(x₁), f(x₂)} for each λ ∈ (0, 1).

Notation 1.5:

F₀ = {d : ∇f(x₀)ᵗd < 0}

The cone of feasible directions:

D = {d : d ≠ 0, x₀ + λd ∈ S for all λ ∈ (0, δ) for some δ > 0}

Theorem 1.6: Consider the problem to minimize f(x) subject to x ∈ S, where f : Rⁿ → R and S is a nonempty set in Rⁿ. Suppose f is differentiable at x₀ ∈ S. If x₀ is a local minimum, then F₀ ∩ D = ∅. Conversely, suppose F₀ ∩ D = ∅, f is pseudoconvex at x₀, and there exists an ε-neighborhood N_ε(x₀), ε > 0, such that d = (x − x₀) ∈ D for any x ∈ S ∩ N_ε(x₀). Then x₀ is a local minimum of f.

Lemma 1.7: Consider the feasible region S = {x ∈ X : gᵢ(x) ≤ 0 for i = 1,…,m}, where X is a nonempty open set in Rⁿ, and where gᵢ : Rⁿ → R for i = 1,…,m. Given a feasible point x₀ ∈ S, let I = {i : gᵢ(x₀) = 0} be the index set for the binding or active constraints, and assume that the gᵢ for i ∈ I are differentiable at x₀ and that the gᵢ for i ∉ I are continuous at x₀. Define the sets

G₀ = {d : ∇gᵢ(x₀)ᵗd < 0 for each i ∈ I}   [cone of interior directions at x₀]
G₀′ = {d ≠ 0 : ∇gᵢ(x₀)ᵗd ≤ 0 for each i ∈ I}

Then we have G₀ ⊆ D ⊆ G₀′.

Theorem 1.8: Consider the Problem P to minimize f(x) subject to x ∈ X and gᵢ(x) ≤ 0 for i = 1,…,m, where X is a nonempty open set in Rⁿ, f : Rⁿ → R, and gᵢ : Rⁿ → R for i = 1,…,m. Let x₀ be a feasible point, and denote I = {i : gᵢ(x₀) = 0}. Furthermore, suppose f and gᵢ for i ∈ I are differentiable at x₀ and gᵢ for i ∉ I are continuous at x₀. If x₀ is a local optimal solution, then F₀ ∩ G₀ = ∅. Conversely, if F₀ ∩ G₀ = ∅, and if f is pseudoconvex at x₀ and the gᵢ for i ∈ I are strictly pseudoconvex over some ε-neighborhood of x₀, then x₀ is a local minimum.

Theorem 1.9 (The Fritz John Necessary Conditions): Let X be a nonempty open set in Rⁿ and let f : Rⁿ → R and gᵢ : Rⁿ → R for i = 1,…,m. Consider the Problem P to minimize f(x) subject to x ∈ X and gᵢ(x) ≤ 0 for i = 1,…,m. Let x₀ be a feasible solution, and denote I = {i : gᵢ(x₀) = 0}.
Furthermore, suppose f and gᵢ for i ∈ I are differentiable at x₀ and gᵢ for i ∉ I are continuous at x₀. If x₀ locally solves Problem P, then there exist scalars u₀ and uᵢ for i ∈ I such that

u₀∇f(x₀) + Σ_{i∈I} uᵢ∇gᵢ(x₀) = 0
u₀, uᵢ ≥ 0 for i ∈ I
(u₀, u_I) ≠ (0, 0)

where u_I is the vector whose components are uᵢ for i ∈ I. Furthermore, if the gᵢ for i ∉ I are also differentiable at x₀, then the foregoing conditions can be written in the following equivalent form:

u₀∇f(x₀) + Σ_{i=1}^{m} uᵢ∇gᵢ(x₀) = 0
uᵢgᵢ(x₀) = 0 for i = 1,…,m
u₀, uᵢ ≥ 0 for i = 1,…,m
(u₀, u) ≠ (0, 0)

where u is the vector whose components are uᵢ for i = 1,…,m.

Theorem 1.10 (Fritz John Sufficient Conditions): Let X be a nonempty open set in Rⁿ and let f : Rⁿ → R and gᵢ : Rⁿ → R for i = 1,…,m. Consider the Problem P to minimize f(x) subject to x ∈ X and gᵢ(x) ≤ 0 for i = 1,…,m. Let x₀ be an FJ solution, and denote I = {i : gᵢ(x₀) = 0}. Define S as the relaxed feasible region for Problem P in which the nonbinding constraints are dropped.
a. If there exists an ε-neighborhood N_ε(x₀), ε > 0, such that f is pseudoconvex over N_ε(x₀) ∩ S and the gᵢ, i ∈ I, are strictly pseudoconvex over N_ε(x₀) ∩ S, then x₀ is a local minimum for Problem P.
b. If f is pseudoconvex at x₀ and if the gᵢ, i ∈ I, are both strictly pseudoconvex and quasiconvex at x₀, then x₀ is a global optimal solution for Problem P. In particular, if these generalized convexity assumptions hold true only when the domain of f is restricted to N_ε(x₀) for some ε > 0, then x₀ is a local minimum for Problem P.

Theorem 1.11 (Karush-Kuhn-Tucker Necessary Conditions): Let X be a nonempty open set in Rⁿ and let f : Rⁿ → R and gᵢ : Rⁿ → R for i = 1,…,m. Consider the Problem P to minimize f(x) subject to x ∈ X and gᵢ(x) ≤ 0 for i = 1,…,m. Let x₀ be a feasible solution, and denote I = {i : gᵢ(x₀) = 0}. Suppose f and gᵢ for i ∈ I are differentiable at x₀ and gᵢ for i ∉ I are continuous at x₀.
Furthermore, suppose ∇gᵢ(x₀) for i ∈ I are linearly independent. If x₀ locally solves Problem P, then there exist scalars uᵢ for i ∈ I such that

∇f(x₀) + Σ_{i∈I} uᵢ∇gᵢ(x₀) = 0
uᵢ ≥ 0 for i ∈ I

In addition to the above assumptions, if gᵢ for each i ∉ I is also differentiable at x₀, then the foregoing conditions can be written in the following equivalent form:

∇f(x₀) + Σ_{i=1}^{m} uᵢ∇gᵢ(x₀) = 0
uᵢgᵢ(x₀) = 0 for i = 1,…,m
uᵢ ≥ 0 for i = 1,…,m

Theorem 1.12 (Karush-Kuhn-Tucker Sufficient Conditions): Let X be a nonempty open set in Rⁿ and let f : Rⁿ → R and gᵢ : Rⁿ → R for i = 1,…,m. Consider the Problem P to minimize f(x) subject to x ∈ X and gᵢ(x) ≤ 0 for i = 1,…,m. Let x₀ be a KKT solution, and denote I = {i : gᵢ(x₀) = 0}. Define S as the relaxed feasible region for Problem P in which the constraints that are not binding at x₀ are dropped. Then:
a. If there exists an ε-neighborhood N_ε(x₀), ε > 0, such that f is pseudoconvex over N_ε(x₀) ∩ S and the gᵢ, i ∈ I, are differentiable at x₀ and are quasiconvex over N_ε(x₀) ∩ S, then x₀ is a local minimum for Problem P.
b. If f is pseudoconvex at x₀, and if the gᵢ, i ∈ I, are differentiable and quasiconvex at x₀, then x₀ is a global optimal solution to Problem P. In particular, if this assumption holds true only when the feasible region is restricted to N_ε(x₀) for some ε > 0, then x₀ is a local minimum for P.

Theorem 1.13 (Farkas' Lemma): Let A be an m × n matrix and c an n-vector. Then exactly one of the following two systems has a solution:
System 1: Ax ≤ 0 and cᵗx > 0 for some x ∈ Rⁿ
System 2: Aᵗy = c and y ≥ 0 for some y ∈ Rᵐ

Theorem 1.14 (Gordan's Theorem): Let A be an m × n matrix. Then exactly one of the following systems has a solution:
System 1: Ax < 0 for some x ∈ Rⁿ
System 2: Aᵗp = 0 and p ≥ 0 for some nonzero p ∈ Rᵐ

Theorem 1.15 (Closest point theorem): Let S be a nonempty closed convex set in Rⁿ and y ∉ S. Then there exist a nonzero vector p and a scalar α such that pᵗy > α and pᵗx ≤ α for each x ∈ S.
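As a numerical sanity check of Farkas' lemma, the following sketch uses a small instance of our own choosing (A = I and c = (1, 1), not taken from the text): it exhibits a solution of System 2 and spot-checks that System 1 is then empty, via the identity cᵗx = yᵗ(Ax) ≤ 0 whenever Ax ≤ 0 and y ≥ 0.

```python
import numpy as np

# Illustrative instance (our own choice): A = I (2x2), c = (1, 1).
# System 2 (A^T y = c, y >= 0) is solved by y = (1, 1),
# so by Farkas' lemma System 1 (A x <= 0 and c^T x > 0) has no solution.
A = np.eye(2)
c = np.array([1.0, 1.0])

y = np.array([1.0, 1.0])
assert np.allclose(A.T @ y, c) and np.all(y >= 0)   # y solves System 2

# Why System 1 fails: if A x <= 0, then
#   c^T x = (A^T y)^T x = y^T (A x) <= 0   (since y >= 0),
# so c^T x > 0 is impossible. Spot-check on sampled points with A x <= 0.
rng = np.random.default_rng(0)
for _ in range(1000):
    x = -rng.random(2)              # here A x = x <= 0 componentwise
    assert np.all(A @ x <= 0)
    assert c @ x <= 0               # System 1's strict inequality never holds
print("System 2 solvable; System 1 empty on all samples")
```

The sampling does not prove emptiness of System 1, of course; the one-line identity in the comments is the actual argument, and the samples merely illustrate it.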
Corollary 1.16 (Existence of supporting hyperplane): Let S be a nonempty convex set in Rⁿ and x₀ ∉ int S. Then there is a nonzero vector p such that pᵗ(x − x₀) ≤ 0 for each x ∈ cl S.

Lemma 1.17: Let f : Rⁿ → R be a convex function. Consider any point x₀ ∈ Rⁿ and a nonzero direction d ∈ Rⁿ. Then the directional derivative f′(x₀, d) of f at x₀ in the direction d exists.

Theorem 1.18: Let S be a nonempty convex set in Rⁿ, and let f : S → R be convex. Then, for x₀ ∈ int S, there exists a vector ξ such that the hyperplane

H = {(x, y) : y = f(x₀) + ξᵗ(x − x₀)}

supports epi f at (x₀, f(x₀)). In particular,

f(x) ≥ f(x₀) + ξᵗ(x − x₀) for each x ∈ S,

i.e., ξ is a subgradient of f at x₀.

Theorem 1.19: Let S be a nonempty convex set in Rⁿ, and let f : S → R be convex on S. Consider the problem to minimize f(x) subject to x ∈ S. Suppose that x₀ ∈ S is a local optimal solution to the problem.
1. Then x₀ is a global optimal solution.
2. If either x₀ is a strict local minimum or f is strictly convex, then x₀ is the unique global optimal solution, and it is also a strong local minimum.

Theorem 1.20: Let f : Rⁿ → R be a convex function, and let S be a nonempty compact polyhedral set in Rⁿ. Consider the problem to maximize f(x) subject to x ∈ S. An optimal solution x₀ to the problem then exists, where x₀ is an extreme point.

Chapter 2 Constraint Qualifications

Consider a problem

P : Minimize f(x)
    subject to x ∈ X
    gᵢ(x) ≤ 0, i = 1, 2,…,m

Usually, the Fritz John necessary conditions (at local optimality) are derived first. Then, under certain constraint qualifications, it is asserted that the multiplier associated with the objective function is positive at a local minimum. These are called the Karush-Kuhn-Tucker (KKT) necessary conditions. In view of Theorem 1.8, local optimality implies that F₀ ∩ G₀ = ∅, which implies the Fritz John conditions.
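The passage from local optimality to the FJ and KKT conditions can be made concrete on a toy instance of our own (not taken from [1]): minimize f(x) = x₁² + x₂² subject to g(x) = 1 − x₁ − x₂ ≤ 0. The minimum is x₀ = (1/2, 1/2), the constraint is binding there, and u = 1 satisfies the KKT system; the sketch below verifies this numerically.

```python
import numpy as np

# Toy instance (illustrative, not from the dissertation):
#   minimize  f(x) = x1^2 + x2^2   subject to  g(x) = 1 - x1 - x2 <= 0.
# Optimum: x0 = (0.5, 0.5); the constraint is active (I = {1}), multiplier u = 1.
x0 = np.array([0.5, 0.5])
grad_f = 2.0 * x0                     # ∇f(x0) = (1, 1)
grad_g = np.array([-1.0, -1.0])       # ∇g(x0)
g_at_x0 = 1.0 - x0.sum()              # g(x0) = 0, so the constraint is binding

u = 1.0
assert np.allclose(grad_f + u * grad_g, 0.0)   # stationarity: ∇f + u∇g = 0
assert u >= 0 and abs(u * g_at_x0) < 1e-12     # dual feasibility, complementarity

# F0 ∩ G0 = ∅ at x0: no direction d has both ∇f(x0)·d < 0 and ∇g(x0)·d < 0,
# since ∇g(x0) = -∇f(x0) here. Spot-check on random directions.
rng = np.random.default_rng(1)
for _ in range(1000):
    d = rng.standard_normal(2)
    assert not (grad_f @ d < 0 and grad_g @ d < 0)
print("KKT conditions hold at x0 =", x0)
```

Note that the random directions only illustrate F₀ ∩ G₀ = ∅; the exact reason is that ∇g(x₀) is the negative of ∇f(x₀) in this particular instance.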
Under the linear independence constraint qualification or, more generally, G₀ ≠ ∅, we deduced that the Fritz John conditions can only be satisfied if the Lagrangian multiplier associated with the objective function is positive. This leads to the KKT conditions.

Local optimality ⇒ F₀ ∩ D = ∅ ⇒ FJ conditions ⇒ KKT conditions,

where the first implication is Theorem 1.6, the second follows from Theorem 1.8, and the third holds under a constraint qualification.

Below, we first show that a necessary condition for local optimality is that F₀ ∩ T = ∅, where T is the cone of tangents. Using the constraint qualification T = G′, we get F₀ ∩ G′ = ∅. Further, using Farkas' lemma (Theorem 1.13), we get the KKT conditions.

Local optimality ⇒ F₀ ∩ T = ∅ ⇒ F₀ ∩ G′ = ∅ ⇒ KKT conditions,

where the first implication is Theorem 2.5, the second is Theorem 2.7, and the third follows from Farkas' lemma.

Definition 2.1 (The cone of tangents of S at x₀): Let S be a nonempty set in Rⁿ, and let x₀ ∈ cl S. The cone of tangents of S at x₀, denoted by T, is the set of all directions d such that

d = lim_{k→∞} λₖ(xₖ − x₀),

where λₖ > 0, xₖ ∈ S for each k, and xₖ → x₀.

Note 2.2: It is clear that d belongs to the cone of tangents if there is a feasible sequence {xₖ} converging to x₀ such that the directions of the chords xₖ − x₀ converge to d.

Remark 2.3 (Alternative equivalent descriptions): The cone of tangents T can be equivalently characterized in either of the following ways:
a) T = {d : there exist a sequence {λₖ} → 0⁺ and a function α : R → Rⁿ, with α(λ) → 0 as λ → 0⁺, such that xₖ = x₀ + λₖd + λₖα(λₖ) ∈ S for each k}
b) T = {d : d = lim_{k→∞} (xₖ − x₀)/λₖ, where λₖ > 0, {xₖ} → x₀, and where xₖ ∈ S and xₖ ≠ x₀ for each k}

Proof: We have α(λₖ) = d − λₖ(xₖ − x₀) → 0 as k → ∞