Convex Optimization 5. Duality Prof. Ying Cui Department of Electrical Engineering Shanghai Jiao Tong University 2018 SJTU Ying Cui 1 / 46 Outline Lagrange dual function Lagrange dual problem Geometric interpretation Optimality conditions Perturbation and sensitivity analysis Examples Generalized inequalities SJTU Ying Cui 2 / 46 Lagrangian standard form problem (not necessarily convex) min f0(x) x s.t. fi (x) ≤ 0, i = 1, ..., m hi (x) = 0, i = 1, ..., p m p ∗ domain D = i=0 domfi ∩ i=1 domhi and optimal value p ◮ basic idea in Lagrangian duality: take the constraints into T T account by augmenting the objective function with a weighted sum of the constraint functions Lagrangian: L : Rn × Rm × Rp → R, with dom L = D× Rm × Rp, m p L(x, λ, ν)= f0(x)+ λi fi (x)+ νi hi (x) Xi=1 Xi=1 ◮ weighted sum of objective and constraint functions ◮ λi is Lagrange multiplier associated with fi (x) ≤ 0 ◮ νi is Lagrange multiplier associated with hi (x) = 0 SJTU Ying Cui 3 / 46 Lagrange dual function Lagrange dual function (or dual function): g : Rm × Rp → R m p g(λ, ν)= inf L(x, λ, ν)= inf f0(x)+ λi fi (x)+ νi hi (x) x∈D x∈D i i ! X=1 X=1 ◮ g is concave even when problem is not convex, as it is pointwise infimum of a family of affine functions of (λ, ν) ◮ pointwise minimum or infimum of concave functions is concave ◮ g can be −∞ when L is unbounded below in x SJTU Ying Cui 4 / 46 Lower bound property The dual function yields lower bounds on the optimal value of the primal problem, i.e., for any λ 0 and any ν, g(λ, ν) ≤ p∗ ◮ the inequality holds but is vacuous when g(λ, ν)= −∞ ◮ the dual function gives a nontrivial lower bound only when λ 0 and (λ, ν) ∈ domg, i.e., g(λ, ν) > −∞ ◮ refer to (λ, ν) with λ 0, (λ, ν) ∈ domg as dual feasible proof: Supposex ˜ is feasible, i.e., fi (˜x) ≤ 0 and hi (˜x)= 0, and λ 0. Then, we have m p λi fi (˜x)+ νi hi (˜x) ≤ 0 =⇒ L(˜x, λ, ν) ≤ f0(˜x) i i X=1 X=1 Hence, g(λ, ν)= inf L(x, λ, ν) ≤ L(˜x, λ, ν) ≤ f0(˜x) x∈D Minimizing over all feasiblex ˜ gives p∗ ≥ g(λ, ν). SJTU Ying Cui 5 / 46 Examples Least-norm solution of linear equations min xT x x s.t. Ax = b dual function: ◮ to minimize L(x,ν)= xT x + νT (Ax − b) over x (unconstrained convex problem), set gradient equal to zero: T T ∇x L(x,ν) = 2x + A ν = 0 ⇒ x = −(1/2)A ν ◮ plug in L(x,ν) to obtain g: g(ν)= L((−1/2)AT ν,ν) = (−1/4)νT AAT ν − bT ν which is a concave quadratic function of ν, as −AAT 0 lower bound property: p∗ ≥ (−1/4)νT AAT ν − bT ν, for all ν SJTU Ying Cui 6 / 46 Examples Standard form LP min cT x x s.t. Ax = b, x 0 dual function: ◮ Lagrangian L(x, λ, ν)= cT x + νT (Ax − b) − λT x = −bT ν + (c + AT ν − λ)T x is affine in x (bounded below only when identically zero) ◮ dual function −bT ν, AT ν − λ + c = 0 g(λ, ν) = inf L(x, λ, ν)= x (−∞, otherwise lower bound property: nontrivial only when λ 0 and AT ν − λ + c = 0, and hence p∗ ≥−bT ν if AT ν + c 0 SJTU Ying Cui 7 / 46 Examples Two-way partitioning problem (W ∈ Sn) min xT Wx x 2 s.t. xi = 1, i = 1, ..., n ◮ a nonconvex problem with 2n discrete feasible points ◮ find the two-way partition of {1, ..., n} with least total cost ◮ Wij is cost of assigning i, j to the same set ◮ −Wij is cost of assigning i, j to different sets dual function: T 2 g(ν) =inf(x Wx + νi (x − 1)) x i i X −1T ν, W + diag(ν) 0 =inf xT (W + diag(ν))x − 1T ν = x (−∞, otherwise lower bound property: p∗ ≥−1T ν if W + diag(ν) 0 ∗ example: ν = −λmin(W )1 gives bound p ≥ nλmin(W ) SJTU Ying Cui 8 / 46 Lagrange dual function and conjugate function ◮ conjugate f ∗ of a function f : Rn → R: f ∗(y)= sup (y T x − f (x)) x∈dom f ◮ dual function of min f0(x) x s.t. x = 0 g(ν) = inf(f (x)+ νT x)= − sup((−ν)T x − f (x)) x x ◮ relationship: g(ν)= −f ∗(−ν) ◮ conjugate of any function is convex ◮ dual function of any problem is concave SJTU Ying Cui 9 / 46 Lagrange dual function and conjugate function more generally (and more usefully), consider an optimization problem with linear inequality and equality constraints min f0(x) x s.t. Ax b, Cx = d dual function: T T g(λ, ν)= inf f0(x)+ λ (Ax − b)+ ν (Cx − d) x∈dom f0 T T T T T = inf f0(x) + (A λ + C ν) x − b λ − d ν x∈dom f0 ∗ T T T T = −f0 (−A λ − C ν) − b λ − d ν ∗ domain of g follows from domain of f0 : T T ∗ domg = {(λ, µ)|− A λ − C ν ∈ domf0 } ◮ simplify derivation of dual function if conjugate of f0 is known SJTU Ying Cui 10 / 46 Examples Equality constrained norm minimization min kxk x s.t. Ax = b dual function: T T T ∗ T −b ν, ||A ν||∗ ≤ 1 g(ν)= −b ν − f0 (−A ν)= (−∞, otherwise ◮ conjugate of f0 = || · ||: ∗ 0, ||y||∗ ≤ 1 f0 (y)= (∞, otherwise i.e., the indicator function of the dual norm unit ball, where T kyk∗ = supkuk≤1 u y is dual norm of k · k SJTU Ying Cui 11 / 46 Lagrange dual problem max g(λ, ν) λ,ν s.t. λ 0 ◮ find best lower bound on p∗, obtained from Lagrange dual function ◮ always a convex optimization problem (maximize a concave function over a convex set), regardless of convexity of primal problem, optimal value denoted by d ∗ ◮ λ, ν are dual feasible if λ 0 and g(λ, ν) > −∞ (i.e., (λ, ν) ∈ domg = {(λ, ν)|g(λ, ν) > −∞}) ◮ can often be simplified by making implicit constraint (λ, ν) ∈ dom g explicit, e.g., ◮ standard form LP and its dual min cT x max − bT ν x ν s.t. Ax = b, x 0 s.t. AT ν + c 0 SJTU Ying Cui 12 / 46 Weak duality and strong duality weak duality: d ∗ ≤ p∗ ◮ always holds (for convex and nonconvex problems) ◮ can be used to find nontrivial lower bounds for difficult problems, e.g., ◮ solving the SDP max − 1T ν ν s.t. W + diag(ν) 0 gives a lower bound for the two-way partitioning problem strong duality: d ∗ = p∗ ◮ does not hold in general ◮ (usually) holds for convex problems ◮ conditions that guarantee strong duality in convex problems are called constraint qualifications ◮ there exist many types of constraint qualifications SJTU Ying Cui 13 / 46 Slater’s constraint qualification One simple constraint qualification is Slater’s condition (Slater’s constraint qualification): convex problem is strictly feasible, i.e., there exists an x ∈ intD such that fi (x) < 0, i = 1, · · · , m, Ax = b ◮ can be refined, e.g., ◮ can replace intD with relintD (interior relative to affine hull) ◮ affine inequalities do not need to hold with strict inequality ◮ reduce to feasibility when the constraints are all affine equalities and inequalities ◮ implies strong duality for convex problems ◮ implies that the dual value is attained when d ∗ > −∞, i.e., there exists a dual feasible (λ∗,ν∗) with g(λ∗,ν∗)= d ∗ = p∗ SJTU Ying Cui 14 / 46 Examples Inequality form LP primal problem: min cT x x s.t. Ax b dual function: T T T T T −b λ, A λ + c = 0 g(λ) = infx (c + A λ) x − b λ = (−∞, otherwise dual problem: max − bT λ λ s.t. AT λ + c = 0, λ 0 ◮ from weaker form of Slater’s condition: strong duality holds for any LP provided the primal problem is feasible, implying strong duality holds for LPs if the dual is feasible ◮ in fact, p∗ = d ∗ except when primal and dual are infeasible SJTU Ying Cui 15 / 46 Examples n Quadratic program: P ∈ S++ min xT Px x s.t. Ax b dual function: T T 1 T −1 T T g(λ) = infx (x Px + λ (Ax − b)) = − 4 λ AP A λ − b λ dual problem: max − (1/4)λT AP−1AT λ − bT λ λ s.t. λ 0 ◮ from weaker form of Slater’s condition: strong duality holds provided the primal problem is feasible ◮ in fact, p∗ = d ∗ always holds SJTU Ying Cui 16 / 46 Examples A nonconvex problem with strong duality: A 6 0 min xT Ax + 2bT x x s.t. xT x ≤ 1 dual function: T T g(λ) =infx (x (A + λI )x + 2b x − λ) −bT (A + λI )†b − λ, A + λI 0, b ∈ R(A + λI ) = (−∞, otherwise dual problem and equivalent SDP: max − bT (A + λI )†b − λ max − t − λ λ λ,t A + λI b s.t. A + λI 0, b ∈ R(A + λI ) s.t. 0 bT t ◮ strong duality holds although primal problem is nonconvex (difficult to show) SJTU Ying Cui 17 / 46 Geometric interpretation geometric interpretation via set of values ◮ set of values taken on by the constraint and objective functions: G = {(f1(x), · · · , fm(x), h1(x), · · · , hp(x), f0(x)) ∈ Rm × Rp × R|x ∈ D} ◮ optimal value: p∗ = inf{t|(u, v, t) ∈ G, u 0, v = 0} ◮ dual function: g(λ, ν) = inf{(λ, ν, 1)T (u, v, t)|(u, v, t) ∈ G} ◮ if the infimum is finite, then (λ, ν, 1)T (u, v, t) ≥ g(λ, ν) defines a nonvertical supporting hyperplane to G ◮ weak duality: for all λ 0, p∗ =inf{t|(u, v, t) ∈ G, u 0, v = 0} ≥ inf{(λ, ν, 1)T (u, v, t)|(u, v, t) ∈ G, u 0, v = 0} ≥ inf{(λ, ν, 1)T (u, v, t)|(u, v, t) ∈ G} =g(λ, ν) SJTU Ying Cui 18 / 46 Geometric interpretation Example consider a simple problem with one constraint ∗ min f0(x) p =inf{t|(u, t) ∈ G, u ≤ 0} x s.t.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages46 Page
-
File Size-