
Nonlinear Programming Models
Fabio Schoen, Introduction, 2008
http://gol.dsi.unifi.it/users/schoen

NLP problems

Standard form:
$$\min f(x) \quad \text{s.t.}\quad h_i(x) = 0,\ i = 1,\dots,m; \qquad g_j(x) \le 0,\ j = 1,\dots,k.$$
Here the feasible set is $S = \{x \in \mathbb{R}^n : h_i(x) = 0\ \forall i,\ g_j(x) \le 0\ \forall j\}$.

Local and global optima

A global minimum, or global optimum, is any $x^\star \in S$ such that
$$f(x) \ge f(x^\star) \quad \forall x \in S \subseteq \mathbb{R}^n.$$
A point $\bar{x} \in S$ is a local optimum if $\exists\, \varepsilon > 0$ such that
$$x \in S \cap B(\bar{x}, \varepsilon) \Rightarrow f(x) \ge f(\bar{x}),$$
where $B(\bar{x}, \varepsilon) = \{x \in \mathbb{R}^n : \|x - \bar{x}\| \le \varepsilon\}$ is a ball in $\mathbb{R}^n$.
Any global optimum is also a local optimum, but the converse is generally false.

Convex functions

A set $S \subseteq \mathbb{R}^n$ is convex if $x, y \in S \Rightarrow \lambda x + (1-\lambda) y \in S$ for all choices of $\lambda \in [0,1]$.
Let $\Omega \subseteq \mathbb{R}^n$ be a non-empty convex set. A function $f : \Omega \to \mathbb{R}$ is convex iff
$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y) \quad \text{for all } x, y \in \Omega,\ \lambda \in [0,1].$$
[Figure: the graph of a convex function lies below the chord joining the points above $x$ and $y$.]

Properties of convex functions

Every convex function is continuous in the interior of $\Omega$. It might be discontinuous, but only on the frontier.
If $f$ is continuously differentiable, then it is convex iff
$$f(y) \ge f(x) + (y - x)^T \nabla f(x) \quad \text{for all } x, y \in \Omega.$$
[Figure: a differentiable convex function lies above its tangent line at any point between $x$ and $y$.]

If $f$ is twice continuously differentiable, then $f$ is convex iff its Hessian matrix
$$\nabla^2 f(x) := \left[\frac{\partial^2 f}{\partial x_i \partial x_j}\right]$$
is positive semi-definite: $\nabla^2 f(x) \succeq 0$, i.e.
$$v^T \nabla^2 f(x)\, v \ge 0 \quad \forall v \in \mathbb{R}^n$$
or, equivalently, all eigenvalues of $\nabla^2 f(x)$ are non-negative.

Example: an affine function is convex (and concave). For a quadratic function ($Q$: symmetric matrix)
$$f(x) = \frac{1}{2} x^T Q x + b^T x + c$$
we have $\nabla f(x) = Qx + b$ and $\nabla^2 f(x) = Q$, so $f$ is convex iff $Q \succeq 0$ (a numerical sketch of this test appears after the examples below).

Convex Optimization Problems

Slight abuse of notation: a problem
$$\min_{x \in S} f(x)$$
is a convex optimization problem iff $S$ is a convex set and $f$ is convex on $S$. For a problem in standard form
$$\min f(x) \quad \text{s.t.}\quad h_i(x) = 0,\ i = 1,\dots,m; \qquad g_j(x) \le 0,\ j = 1,\dots,k,$$
if $f$ is convex, the $h_i(x)$ are affine functions and the $g_j(x)$ are convex functions, then the problem is convex.

Maximization

A problem
$$\max_{x \in S} f(x)$$
is called convex iff $S$ is a convex set and $f$ is a concave function (not to be confused with minimization of a concave function, or maximization of a convex function, which are NOT convex optimization problems).

Convex and non-convex optimization

Convex optimization "is easy", non-convex optimization is usually very hard.
Fundamental property of convex optimization problems: every local optimum is also a global optimum (a proof will be given later).
Minimizing a positive semidefinite quadratic function on a polyhedron is easy (polynomially solvable); if even a single eigenvalue of the Hessian is negative, the problem becomes NP-hard.

Convex functions: examples

Many (of course not all ...) functions are convex!
- affine functions $a^T x + b$
- quadratic functions $\frac{1}{2} x^T Q x + b^T x + c$ with $Q = Q^T$, $Q \succeq 0$
- any norm is a convex function
- $x \log x$ (however, $\log x$ is concave)
- $f$ is convex if and only if, $\forall x_0, d \in \mathbb{R}^n$, its restriction to any line, $\varphi(\alpha) = f(x_0 + \alpha d)$, is a convex function
- a non-negative linear combination of convex functions is convex
- $g(x, y)$ convex in $x$ for all $y$ $\Rightarrow$ $\int g(x, y)\, dy$ is convex
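As a concrete illustration of the second-order condition above, the following is a minimal sketch (assuming only NumPy; the helper name `is_convex_quadratic` and the test matrices are illustrative, not from the slides) that checks convexity of a quadratic $f(x) = \frac{1}{2} x^T Q x + b^T x + c$ by inspecting the eigenvalues of its Hessian $Q$:

```python
import numpy as np

def is_convex_quadratic(Q: np.ndarray, tol: float = 1e-10) -> bool:
    """f(x) = 0.5 x'Qx + b'x + c is convex iff its Hessian Q is
    positive semi-definite, i.e. iff all eigenvalues of the
    symmetric part of Q are non-negative."""
    Qs = 0.5 * (Q + Q.T)              # only the symmetric part contributes to x'Qx
    eigvals = np.linalg.eigvalsh(Qs)  # eigenvalues of a symmetric matrix, ascending
    return bool(eigvals.min() >= -tol)

# PSD example: eigenvalues of [[2,1],[1,1]] are (3 +- sqrt(5))/2 > 0  -> convex
print(is_convex_quadratic(np.array([[2.0, 1.0], [1.0, 1.0]])))   # True
# One negative eigenvalue -> not convex (this is the NP-hard case on a polyhedron)
print(is_convex_quadratic(np.diag([1.0, -0.5])))                 # False
```

The second test matrix mirrors the remark above: a single negative eigenvalue is exactly what turns the quadratic problem NP-hard.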
More examples

- $\max_i \{a_i^T x + b_i\}$ is convex
- $f, g$ convex $\Rightarrow$ $\max\{f(x), g(x)\}$ is convex
- $f_a$ convex functions for any $a \in \mathcal{A}$ ($\mathcal{A}$ a possibly uncountable set) $\Rightarrow$ $\sup_{a \in \mathcal{A}} f_a(x)$ is convex
- $f$ convex $\Rightarrow$ $f(Ax + b)$ is convex
- let $S \subseteq \mathbb{R}^n$ be any set $\Rightarrow$ $f(x) = \sup_{s \in S} \|x - s\|$ is convex
- $\mathrm{Trace}(A^T X) = \sum_{i,j} A_{ij} X_{ij}$ is convex (it is linear!)
- $\log \det X^{-1}$ is convex over the set of matrices $\{X \in \mathbb{R}^{n \times n} : X \succ 0\}$
- $\lambda_{\max}(X)$ (the largest eigenvalue of a matrix $X$) is convex

Data Approximation

Table of contents:
- norm approximation
- maximum likelihood
- robust estimation

Norm approximation

Problem:
$$\min_x \|Ax - b\|$$
where $A$, $b$ are parameters. Usually the system is over-determined, i.e. $b \notin \mathrm{Range}(A)$. For example, this happens when $A \in \mathbb{R}^{m \times n}$ with $m > n$ and $A$ has full rank. $r := Ax - b$ is the "residual".

Examples

- $\|r\| = \sqrt{r^T r}$: least squares (or "regression")
- $\|r\| = \sqrt{r^T P r}$ with $P \succ 0$: weighted least squares
- $\|r\| = \max_i |r_i|$: minimax, or $\ell_\infty$, or Tchebichev approximation
- $\|r\| = \sum_i |r_i|$: absolute or $\ell_1$ approximation

Possible (convex) additional constraints:
- maximum deviation from an initial estimate: $\|x - x_{\mathrm{est}}\| \le \epsilon$
- simple bounds: $\ell_i \le x_i \le u_i$
- ordering: $x_1 \le x_2 \le \dots \le x_n$

[Figures: histograms of the residuals $r = Ax - b$ for a matrix $A \in \mathbb{R}^{100 \times 30}$ under $\ell_1$, $\ell_\infty$ and $\ell_2$ norm approximation.]

Variants

$$\min_x \sum_i h(y_i - a_i^T x)$$
where $h$ is a convex penalty function, for example:
- linear-quadratic: $h(z) = z^2$ if $|z| \le 1$, $h(z) = 2|z| - 1$ if $|z| > 1$
- "dead zone": $h(z) = 0$ if $|z| \le 1$, $h(z) = |z| - 1$ if $|z| > 1$
- logarithmic barrier: $h(z) = -\log(1 - z^2)$ if $|z| < 1$, $h(z) = \infty$ if $|z| \ge 1$

[Figure: comparison of the penalties norm1(x), norm2(x), linquad(x), deadzone(x) and logbarrier(x) on $[-2, 2]$.]

A sketch of the three basic norm-approximation problems follows.
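To make the norm-approximation family concrete, here is a minimal sketch, assuming the cvxpy modelling package and NumPy are available (neither is prescribed by the slides), with random data of the same $100 \times 30$ size used in the figures:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 30))   # over-determined: m = 100 > n = 30
b = rng.standard_normal(100)

x = cp.Variable(30)
for p in (1, 2, "inf"):              # l1 (absolute), l2 (least squares), l-infinity (Tchebichev)
    prob = cp.Problem(cp.Minimize(cp.norm(A @ x - b, p)))
    prob.solve()
    r = A @ x.value - b              # residual vector at the optimum
    print(f"l{p}: objective = {prob.value:.3f}, max |r_i| = {np.abs(r).max():.3f}")
```

All three problems are convex; the $\ell_1$ and $\ell_\infty$ cases can equivalently be posed as linear programs, while the $\ell_2$ case is an ordinary least-squares problem.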
Maximum likelihood

Given a sample $X_1, X_2, \dots, X_k$ and a parametric family of probability density functions $\mathcal{L}(\cdot\,; \theta)$, the maximum likelihood estimate of $\theta$ given the sample is
$$\hat{\theta} = \arg\max_\theta \mathcal{L}(X_1, \dots, X_k; \theta).$$
Example: linear measurements with additive i.i.d. (independent, identically distributed) noise:
$$X_i = a_i^T \theta + \varepsilon_i$$
where the $\varepsilon_i$ are i.i.d. random variables with density $p(\cdot)$:
$$\mathcal{L}(X_1, \dots, X_k; \theta) = \prod_{i=1}^k p(X_i - a_i^T \theta).$$

Max likelihood estimate - MLE

Taking the logarithm (which does not change optimum points):
$$\hat{\theta} = \arg\max_\theta \sum_i \log p(X_i - a_i^T \theta).$$
If $p$ is log-concave $\Rightarrow$ this problem is convex. Examples:
- $\varepsilon \sim \mathcal{N}(0, \sigma)$, i.e. $p(z) = (2\pi\sigma^2)^{-1/2} \exp(-z^2/2\sigma^2)$ $\Rightarrow$ the MLE is the $\ell_2$ estimate: $\hat{\theta} = \arg\min_\theta \|A\theta - X\|_2$
- $p(z) = (1/(2a)) \exp(-|z|/a)$ $\Rightarrow$ the $\ell_1$ estimate: $\hat{\theta} = \arg\min_\theta \|A\theta - X\|_1$
- $p(z) = (1/a) \exp(-z/a)\, 1_{\{z \ge 0\}}$ (negative exponential) $\Rightarrow$ the estimate can be found by solving the LP problem:
$$\min_\theta 1^T (X - A\theta) \quad \text{s.t.}\quad A\theta \le X$$
- $p$ uniform on $[-a, a]$ $\Rightarrow$ the MLE is any $\theta$ such that $\|A\theta - X\|_\infty \le a$

Ellipsoids

An ellipsoid is a subset of $\mathbb{R}^n$ of the form
$$\mathcal{E} = \{x \in \mathbb{R}^n : (x - x_0)^T P^{-1} (x - x_0) \le 1\}$$
where $x_0 \in \mathbb{R}^n$ is the center of the ellipsoid and $P$ is a symmetric positive-definite matrix. Alternative representations:
$$\mathcal{E} = \{x \in \mathbb{R}^n : \|Ax - b\|_2 \le 1\}$$
where $A \succ 0$, or
$$\mathcal{E} = \{x \in \mathbb{R}^n : x = x_0 + Au,\ \|u\|_2 \le 1\}$$
where $A$ is square and non-singular (an affine transformation of the unit ball).

Robust Least Squares

Least squares: $\hat{x} = \arg\min_x \sqrt{\sum_i (a_i^T x - b_i)^2}$.
Hypothesis: the $a_i$ are not known exactly, but it is known that
$$a_i \in \mathcal{E}_i = \{\bar{a}_i + P_i u : \|u\| \le 1\}$$
where $P_i = P_i^T \succeq 0$. Definition: the worst-case residuals are
$$\sqrt{\sum_i \max_{a_i \in \mathcal{E}_i} (a_i^T x - b_i)^2}.$$
A robust estimate of $x$ is the solution of
$$\hat{x}_r = \arg\min_x \sqrt{\sum_i \max_{a_i \in \mathcal{E}_i} (a_i^T x - b_i)^2}.$$

RLS

It holds that $|\alpha + \beta^T y| \le |\alpha| + \|\beta\| \|y\|$. Choosing $y^\star = \beta/\|\beta\|$ if $\alpha \ge 0$ and $y^\star = -\beta/\|\beta\|$ if $\alpha < 0$, we get $\|y^\star\| = 1$ and
$$|\alpha + \beta^T y^\star| = |\alpha + \mathrm{sign}(\alpha)\, \beta^T \beta / \|\beta\|\,| = |\alpha| + \|\beta\|.$$
Then:
$$\max_{a_i \in \mathcal{E}_i} |a_i^T x - b_i| = \max_{\|u\| \le 1} |\bar{a}_i^T x - b_i + u^T P_i x| = |\bar{a}_i^T x - b_i| + \|P_i x\|.$$

Thus the Robust Least Squares problem reduces to
$$\min_x \left( \sum_i \left( |\bar{a}_i^T x - b_i| + \|P_i x\| \right)^2 \right)^{1/2}$$
or, equivalently,
$$\min_{x,t} \|t\|_2 \quad \text{s.t.}\quad \bar{a}_i^T x - b_i + \|P_i x\| \le t_i, \qquad -\bar{a}_i^T x + b_i + \|P_i x\| \le t_i$$
(a convex optimization problem). A sketch of this formulation follows.
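As a minimal sketch of this last reformulation (again assuming cvxpy and NumPy; the nominal data $\bar{a}_i$, $b_i$, the problem sizes and the uncertainty matrices $P_i$ below are made-up illustration values, not from the slides):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
m, n = 20, 5
A_bar = rng.standard_normal((m, n))      # nominal rows a_bar_i
b = rng.standard_normal(m)
P = [0.1 * np.eye(n) for _ in range(m)]  # uncertainty: a_i = a_bar_i + P_i u, ||u|| <= 1

x = cp.Variable(n)
t = cp.Variable(m)                       # epigraph variables t_i
constraints = []
for i in range(m):
    r_i = A_bar[i] @ x - b[i]            # nominal residual, affine in x
    # together these enforce |a_bar_i' x - b_i| + ||P_i x|| <= t_i
    constraints += [ r_i + cp.norm(P[i] @ x, 2) <= t[i],
                    -r_i + cp.norm(P[i] @ x, 2) <= t[i]]
prob = cp.Problem(cp.Minimize(cp.norm(t, 2)), constraints)
prob.solve()
print("robust objective:", prob.value)

# For comparison: the plain (non-robust) least-squares objective
x_ls = np.linalg.lstsq(A_bar, b, rcond=None)[0]
print("plain LS objective:", np.linalg.norm(A_bar @ x_ls - b))
```

Each constraint pair enforces the worst-case residual bound $|\bar{a}_i^T x - b_i| + \|P_i x\| \le t_i$, so minimizing $\|t\|_2$ reproduces the objective above; the resulting problem is in fact a second-order cone program.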