Numerical Optimization
Alberto Bemporad
http://cse.lab.imtlucca.it/~bemporad/teaching/numopt
Academic year 2020-2021
©2021 A. Bemporad - Numerical Optimization

Course objectives

Solve complex decision problems by using numerical optimization.

Application domains:
• Finance, management science, economics (portfolio optimization, business analytics, investment plans, resource allocation, logistics, ...)
• Engineering (engineering design, process optimization, embedded control, ...)
• Artificial intelligence (machine learning, data science, autonomous driving, ...)
• Myriads of other applications (transportation, smart grids, water networks, sports scheduling, health-care, oil & gas, space, ...)

What this course is about:
• How to formulate a decision problem as a numerical optimization problem? (modeling)
• Which numerical algorithm is most appropriate to solve the problem? (algorithms)
• What’s the theory behind the algorithm? (theory)

Course contents

• Optimization modeling
  – Linear models
  – Convex models
• Optimization theory
  – Optimality conditions, sensitivity analysis
  – Duality
• Optimization algorithms
  – Basics of numerical linear algebra
  – Convex programming
  – Nonlinear programming

Other references

• Stephen Boyd’s “Convex Optimization” courses at Stanford: http://ee364a.stanford.edu http://ee364b.stanford.edu
• Lieven Vandenberghe’s courses at UCLA: http://www.seas.ucla.edu/~vandenbe/
• For more tutorials/books see http://plato.asu.edu/sub/tutorials.html

Optimization modeling

What is optimization?

• Optimization = assign values to a set of decision variables so as to optimize a certain objective function
• Example: Which is the best velocity to minimize fuel consumption?

[Figure: fuel consumption [ℓ/km] versus velocity [km/h]]
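As a rough illustration of the velocity example, the sketch below minimizes a made-up fuel-consumption curve by brute-force grid search. The function `fuel`, its coefficients, and the velocity range are assumptions chosen only for illustration (they are not data from the course), and the grid search merely stands in for a proper numerical solver.

```python
import numpy as np

def fuel(v):
    """Hypothetical fuel consumption [l/km] at velocity v [km/h] (illustrative only)."""
    return 0.05 + 1e-5 * (v - 70.0) ** 2

# Candidate velocities: the decision variable sampled on a fine grid (assumed range).
v_grid = np.linspace(10.0, 160.0, 1501)

# Evaluate the cost function on every candidate and keep the best one.
cost = fuel(v_grid)
k = int(np.argmin(cost))
v_opt, f_opt = float(v_grid[k]), float(cost[k])

print(f"optimal velocity ~ {v_opt:.1f} km/h, best fuel consumption ~ {f_opt:.4f} l/km")
```

A grid search only works for a handful of variables; its cost grows exponentially with the number of decision variables, which is one reason dedicated optimization algorithms are needed.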
In this example:
• optimization variable: velocity
• cost function to minimize: fuel consumption
• parameters of the decision problem: engine type, chassis shape, gear, …

[Figure: fuel consumption [ℓ/km] versus velocity [km/h], with the optimal velocity and the best fuel consumption marked]

Optimization problem

$$\min_x f(x)$$

• $f^* = \min_x f(x)$ = optimal value
• $x^* = \arg\min_x f(x)$ = optimizer

where $x \in \mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}$, and

$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \qquad f(x) = f(x_1, x_2, \ldots, x_n)$$

A maximization problem is written $\max_x f(x)$.

[Figure: graph of $f(x)$ with the optimizer $x^*$ and the optimal value $f(x^*)$ marked]

Most often the problem is difficult to solve by inspection: use a numerical solver implementing an optimization algorithm.

• The objective function $f : \mathbb{R}^n \to \mathbb{R}$ models our goal: minimize (or maximize) some quantity. For example fuel, money, distance from a target, etc.
• The optimization vector $x \in \mathbb{R}^n$ is the vector of optimization variables (or unknowns) $x_i$ to be decided optimally. For example velocity, number of assets in a portfolio, voltage applied to a motor, etc.

Constrained optimization problem

• The optimization vector $x$ may not be completely free, but rather restricted to a feasible set $\mathcal{X} \subseteq \mathbb{R}^n$
• Example: the velocity must be smaller than 60 km/h

[Figure: fuel consumption [ℓ/km] versus velocity [km/h], with the optimal velocity and the best fuel consumption under the velocity constraint]

The new optimizer is $x^* = 42$ km/h.

In general, a constrained optimization problem is written

$$\min_x f(x) \quad \text{s.t.} \quad g(x) \le 0, \quad h(x) = 0$$

• The (in)equalities define the feasible set $\mathcal{X}$ of admissible variables:

$$\mathcal{X} = \{x \in \mathbb{R}^n : g(x) \le 0,\ h(x) = 0\}, \qquad g : \mathbb{R}^n \to \mathbb{R}^m, \quad h : \mathbb{R}^n \to \mathbb{R}^p$$
The constraint functions $g$ and $h$ collect the individual constraints:

$$g(x) = \begin{bmatrix} g_1(x_1, x_2, \ldots, x_n) \\ \vdots \\ g_m(x_1, x_2, \ldots, x_n) \end{bmatrix}, \qquad h(x) = \begin{bmatrix} h_1(x_1, x_2, \ldots, x_n) \\ \vdots \\ h_p(x_1, x_2, \ldots, x_n) \end{bmatrix}$$

• Further constraints may restrict $\mathcal{X}$, for example:
  – $x \in \{0, 1\}^n$ ($x$ = binary vector)
  – $x \in \mathbb{Z}^n$ ($x$ = integer vector)

A few observations

• An optimization problem can always be written as a minimization problem:
$$\max_{x \in \mathcal{X}} f(x) = -\min_{x \in \mathcal{X}} \{-f(x)\}$$
• Similarly, an inequality $g_i(x) \ge 0$ is equivalent to $-g_i(x) \le 0$
• An equality $h(x) = 0$ is equivalent to the double inequalities $h(x) \le 0$, $-h(x) \le 0$ (often this is only good in theory, but not numerically)
• Scaling $f(x)$ to $\alpha f(x)$ and/or $g_i(x)$ to $\beta_i g_i(x)$, or shifting to $f(x) + \gamma$, does not change the optimizer, for all $\alpha, \beta_i > 0$ and $\gamma$. The same holds if $h_j(x)$ is scaled to $\gamma_j h_j(x)$, $\forall \gamma_j \ne 0$
• Adding constraints makes the objective worse or equal:
$$\min_{x \in \mathcal{X}_1} f(x) \le \min_{x \in \mathcal{X}_1,\ x \in \mathcal{X}_2} f(x)$$
• Strict inequalities $g_i(x) < 0$ can be approximated by $g_i(x) \le -\epsilon$ ($0 < \epsilon \ll 1$)

Infeasibility and unboundedness

• A vector $x \in \mathbb{R}^n$ is feasible if $x \in \mathcal{X}$, i.e., it satisfies the given constraints
• A problem is infeasible if $\mathcal{X} = \emptyset$ (the constraints are too tight)
• A problem is unbounded if $\forall M > 0$ $\exists x \in \mathcal{X}$ such that $f(x) < -M$. In this case we write
$$\inf_{x \in \mathcal{X}} f(x) = -\infty$$

Global and local minima

• A vector $x^* \in \mathbb{R}^n$ is a global optimizer if $x^* \in \mathcal{X}$ and $f(x) \ge f(x^*)$, $\forall x \in \mathcal{X}$
• A vector $x^* \in \mathbb{R}^n$ is a strict global optimizer if $x^* \in \mathcal{X}$ and $f(x) > f(x^*)$, $\forall x \in \mathcal{X}$, $x \ne x^*$
• A vector $x^* \in \mathbb{R}^n$ is a (strict) local optimizer if $x^* \in \mathcal{X}$ and there exists a neighborhood¹ $\mathcal{N}$ of $x^*$ such that $f(x) \ge f(x^*)$, $\forall x \in \mathcal{X} \cap \mathcal{N}$ ($f(x) > f(x^*)$, $\forall x \in \mathcal{X} \cap \mathcal{N}$, $x \ne x^*$)

¹ Neighborhood of $x$ = open set containing $x$

Example: Least Squares

• We have a dataset $(u_k, y_k)$, $u_k, y_k \in \mathbb{R}$, $k = 1, \ldots, N$
• We want to fit a line $\hat{y} = au + b$ to the dataset that minimizes

$$f(x) = \sum_{k=1}^{N} (y_k - a u_k - b)^2 = \sum_{k=1}^{N} \left( \begin{bmatrix} u_k & 1 \end{bmatrix} x - y_k \right)^2 = \left\| \begin{bmatrix} u_1 & 1 \\ \vdots & \vdots \\ u_N & 1 \end{bmatrix} x - \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix} \right\|_2^2$$
with respect to $x = \begin{bmatrix} a \\ b \end{bmatrix}$.

• The problem $\begin{bmatrix} a^* \\ b^* \end{bmatrix} = \arg\min f\left(\begin{bmatrix} a \\ b \end{bmatrix}\right)$ is a least-squares problem: $\hat{y} = a^* u + b^*$

[Figure: data points $y$ versus $u$ with the fitted line]

In MATLAB:

    % u, y: N-by-1 column vectors; backslash solves the least-squares problem
    x=[u ones(size(u))]\y

In Python:

    import numpy as np
    # u, y: arrays of shape (N, 1); A stacks the columns [u 1]
    A=np.hstack((u,np.ones(u.shape)))
    # lstsq returns a tuple; element 0 is the least-squares solution [a; b]
    x=np.linalg.lstsq(A,y,rcond=0)[0]

Least Squares using Basis Functions

• More generally: we can fit nonlinear functions $y = f(u)$ expressed as the sum of basis functions, $y_k \approx \sum_{i=1}^{n} x_i \phi_i(u_k)$, using least squares
• Example: fit the polynomial function $y = x_1 + x_2 u_1 + x_3 u_1^2 + x_4 u_1^3 + x_5 u_1^4$

$$\min_x \sum_{k=1}^{N} \left( y_k - \begin{bmatrix} 1 & u_k & u_k^2 & u_k^3 & u_k^4 \end{bmatrix} x \right)^2 \qquad \text{(least squares)}$$

The term $\begin{bmatrix} 1 & u_k & u_k^2 & u_k^3 & u_k^4 \end{bmatrix} x$ is linear with respect to $x$, with basis functions

$$\phi(u) = \begin{bmatrix} 1 \\ u_1 \\ u_1^2 \\ u_1^3 \\ u_1^4 \end{bmatrix}$$

[Figure: data points and the fitted polynomial]

Least Squares - Fitting a circle

• Example: fit a circle to a set of data²

$$\min_{x_0, y_0, r} \sum_{k=1}^{N} \left( r^2 - (x_k - x_0)^2 - (y_k - y_0)^2 \right)^2$$

• Let $x = \begin{bmatrix} x_0 \\ y_0 \\ r^2 - x_0^2 - y_0^2 \end{bmatrix}$ be the optimization vector (note the change of variables!)
• The problem becomes the least squares problem

$$\min_x \sum_{k=1}^{N} \left( \begin{bmatrix} 2x_k & 2y_k & 1 \end{bmatrix} x - (x_k^2 + y_k^2) \right)^2$$

[Figure: data points and the fitted circle]

² http://www.utc.fr/~mottelet/mt94/leastSquares.pdf

Convex sets

Definition: A set $S \subseteq \mathbb{R}^n$ is convex if for all $x_1, x_2 \in S$

$$\lambda x_1 + (1 - \lambda) x_2 \in S, \quad \forall \lambda \in [0, 1]$$

[Figure: a convex set and a nonconvex set, each with two points $x_1$, $x_2$ and the segment joining them]

Convex functions

• $f : S \to \mathbb{R}$ is a convex function if $S$ is convex and

$$f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2), \quad \forall x_1, x_2 \in S,\ \lambda \in [0, 1]$$

  This is Jensen’s inequality (Jensen, 1906), after Johan Jensen (1859–1925).
• If $f$ is convex and differentiable at $x_2$, take the limit $\lambda \to 0$ and get³

$$f(x_1) \ge f(x_2) + \nabla f(x_2)'(x_1 - x_2)$$

• A function $f$ is strictly convex if $f(\lambda x_1 + (1 - \lambda) x_2) < \lambda f(x_1) + (1 - \lambda) f(x_2)$, $\forall x_1 \ne x_2 \in S$, $\forall \lambda \in (0, 1)$

³ $f(x_1) - f(x_2) \ge \lim_{\lambda \to 0} \left( f(x_2 + \lambda(x_1 - x_2)) - f(x_2) \right)/\lambda = \nabla f(x_2)'(x_1 - x_2)$
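The convexity definition above can be probed numerically. The sketch below samples random points and checks Jensen's inequality. The helper `jensen_holds`, the two test functions, the sample count, and the tolerance are all illustrative assumptions. Sampling can only find a counterexample: a passing check suggests convexity but does not prove it.

```python
import numpy as np

rng = np.random.default_rng(0)

def jensen_holds(f, dim, trials=1000, tol=1e-9):
    """Return False as soon as a sampled (x1, x2, lam) violates Jensen's inequality."""
    for _ in range(trials):
        x1, x2 = rng.normal(size=dim), rng.normal(size=dim)
        lam = rng.uniform()  # lam in [0, 1)
        lhs = f(lam * x1 + (1 - lam) * x2)
        rhs = lam * f(x1) + (1 - lam) * f(x2)
        if lhs > rhs + tol:
            return False
    return True

def squared_norm(x):
    """Convex example: f(x) = x'x."""
    return float(x @ x)

def neg_squared_norm(x):
    """Concave (hence nonconvex) example: f(x) = -x'x."""
    return float(-(x @ x))

convex_result = jensen_holds(squared_norm, dim=3)
nonconvex_result = jensen_holds(neg_squared_norm, dim=3)
print(convex_result, nonconvex_result)
```

For the squared norm the gap between the two sides is $-\lambda(1-\lambda)\|x_1 - x_2\|_2^2 \le 0$, so no violation can be found. For its negative the sign flips and a violation appears for almost any sampled pair.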
Convex functions

• A function $f : S \to \mathbb{R}$ is strongly convex with parameter $m \ge 0$ if

$$f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2) - \frac{m \lambda (1 - \lambda)}{2} \|x_1 - x_2\|_2^2$$

• If $f$ is strongly convex with parameter $m \ge 0$ and differentiable, then

$$f(y) \ge f(x) + \nabla f(x)'(y - x) + \frac{m}{2} \|y - x\|_2^2$$

• Equivalently, $f$ is strongly convex with parameter $m \ge 0$ if and only if $f(x) - \frac{m}{2} x'x$ is convex
• Moreover, if $f$ is twice differentiable this is equivalent to $\nabla^2 f(x) \succeq mI$ (i.e., matrix $\nabla^2 f(x) - mI$ is positive semidefinite), $\forall x \in \mathbb{R}^n$
• A function $f$ is (strictly/strongly) concave if $-f$ is (strictly/strongly) convex
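To make the Hessian and first-order characterizations of strong convexity concrete, the sketch below uses a quadratic $f(x) = \frac{1}{2} x'Qx + c'x$. The matrix `Q`, the vector `c`, the sample count, and the tolerances are assumptions picked for illustration. Since the Hessian of this function is `Q` everywhere, the largest admissible parameter $m$ is the smallest eigenvalue of `Q`.

```python
import numpy as np

# Hypothetical quadratic f(x) = 0.5 x'Qx + c'x; Q and c are illustrative choices.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
c = np.array([-1.0, 0.5])

def f(x):
    return 0.5 * x @ Q @ x + c @ x

def grad(x):
    return Q @ x + c

# The Hessian of f is Q everywhere, so the largest valid m is its smallest eigenvalue.
m = float(np.linalg.eigvalsh(Q).min())

# Hessian test: Q - m*I must be positive semidefinite (up to rounding).
hessian_ok = bool(np.linalg.eigvalsh(Q - m * np.eye(2)).min() >= -1e-12)

# First-order test: f(y) >= f(x) + grad(x)'(y - x) + (m/2)||y - x||^2 on random pairs.
rng = np.random.default_rng(0)
bound_ok = True
for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    lower = f(x) + grad(x) @ (y - x) + 0.5 * m * np.sum((y - x) ** 2)
    if f(y) < lower - 1e-9:
        bound_ok = False

print(m, hessian_ok, bound_ok)
```

For this `Q` the eigenvalues are $(5 \pm \sqrt{5})/2$, so $m \approx 1.382$. Any smaller nonnegative $m$ would also satisfy both conditions, only with a looser bound.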