
Lagrange Multipliers
Optimization with Constraints

“As long as algebra and geometry have been separated, their progress have been slow and their uses limited; but when these two sciences have been united, they have lent each mutual forces, and have marched together towards perfection.”

Motivation

Many problems in bioinformatics require us to find the maximum or minimum of a differentiable function $f$. For example, in this course we want to find the optimal rotation that superimposes one protein on another in 3D space. We will also need a type of optimization when we look at classification problems.

Introduction (1)

The points in the domain of $f$ where a minimum or maximum occurs are called the extreme points. For a differentiable $f$ they are found among the critical points, where the derivative is zero.

You have already seen examples of critical points in elementary calculus: given a differentiable $f(x)$, find values of $x$ such that $f(x)$ is a local minimum (or maximum).

Usual approach: we assume these $x$ values satisfy $f'(x) = 0$. That is, the derivative is zero at this critical point in the $x$ domain.

Introduction (2)

Our optimization problems will be more complicated for two reasons:
1. The function $f$ depends on several variables, that is, $f(x_1, x_2, \ldots, x_n)$.
2. In many applications, we will have a second equation $g(x_1, x_2, \ldots, x_n) = 0$ that must also be satisfied.

We still want a point in $\mathbb{R}^n$ that is an extreme point for $f$, but it must also lie on the line or surface defined by $g$.

We will consider three optimization problems corresponding to $n = 1, 2, 3$, each with no constraint and with a constraint (6 examples altogether).

Three Optimization Problems (1)

Problem 1 (n = 1): Find $x$ that will minimize $f(x) = x^2 - 4x + 8$.

Since $f(x) = (x - 2)^2 + 4$, it is clear that the minimum is at $x = 2$ and the $f$ value is 4.

Note also that $f'(x) = 2x - 4$, and setting this to zero also gives us $x = 2$.

[Figure: plot of $f = x^2 - 4x + 8$ against $x$.]
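The two routes to the minimum of Problem 1 (completing the square and setting the derivative to zero) can be checked with a few lines of Python. This is only a minimal sketch: the helper names `f_prime` and `central_difference` and the step size `h` are illustrative choices, not part of the slides.

```python
def f(x):
    """Problem 1 objective: f(x) = x^2 - 4x + 8."""
    return x**2 - 4*x + 8


def f_prime(x):
    """Analytic derivative from the text: f'(x) = 2x - 4."""
    return 2*x - 4


def central_difference(func, x, h=1e-4):
    """Numerical derivative estimate; h is an illustrative step size."""
    return (func(x + h) - func(x - h)) / (2*h)


# Root of f'(x) = 2x - 4 = 0, as derived in the text.
x_star = 2.0

print("f(x*)            =", f(x_star))
print("f'(x*) analytic  =", f_prime(x_star))
print("f'(x*) numerical =", central_difference(f, x_star))
```

The numerical derivative is included only as an independent check on the hand-computed $f'(x)$; for a quadratic the central difference agrees with the analytic value up to rounding.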
Three Optimization Problems (2)

Problem 2 (n = 2): Minimize $f(x) = x_1^2 + 4x_2^2 + 8$.

Because the squared terms cannot be negative, the minimum occurs when they are both zero, i.e. when $x_1 = 0$, $x_2 = 0$, and at this critical point the $f$ value is 8.

Note also that $\partial f / \partial x_1 = 2x_1$ and $\partial f / \partial x_2 = 8x_2$. Setting both of these to zero also gives us $x_1 = 0$, $x_2 = 0$.

Three Optimization Problems (3)

Problem 3 (n = 3): Minimize $f(x) = x_1^2 + 4x_2^2 + 16x_3^2 - 8x_2 + 24$.

We can rewrite this as $f(x) = x_1^2 + 4(x_2 - 1)^2 + 16x_3^2 + 20$.

Because the squared terms cannot be negative, the minimum will occur when they are all zero, that is, when $x_1 = 0$, $x_2 = 1$, $x_3 = 0$, and at this critical point the $f$ value is 20.

Note also that $\partial f / \partial x_1 = 2x_1$, $\partial f / \partial x_2 = 8x_2 - 8$ and $\partial f / \partial x_3 = 32x_3$. Setting all of these to zero also gives us $x_1 = 0$, $x_2 = 1$, $x_3 = 0$.

Unfortunately, a graph of $f$ versus $x_1, x_2, x_3$ would be in a four-dimensional space and impossible to visualize.

Optimization Via Derivatives

For these simple cases (no constraint function) the general strategy is to compute the derivatives with respect to all the independent variables:

$\partial f / \partial x_i, \quad i = 1, 2, \ldots, n.$

Setting these to zero,

$\partial f / \partial x_i = 0, \quad i = 1, 2, \ldots, n,$

gives $n$ equations in $n$ unknowns. Solve the equations to get the critical points.

Level Sets (1)

Level sets provide a visual aid for some of the geometric arguments that we will need to make.

Given some particular real value, for example $r$, a level set is the set of points in the domain of $f$ such that $f$ is equal to this value $r$.

Formally: $\{ (x_1, x_2, \ldots, x_n) \mid f(x_1, x_2, \ldots, x_n) = r \}$.

For Problem 1, the level set consists of two points when $r > 4$ and a single point when $r = 4$ (empty otherwise).
For Problem 2, the level set is an ellipse when $r > 8$.
For Problem 3, the level set is an ellipsoid when $r > 20$.

Level Sets (2)

The level sets for our three optimization problems:
[Figure: level sets for the three problems. Problem 1: $\{ x \mid f(x) = x^2 - 4x + 8 = 20 \}$, $r = 20$. Problem 2: $\{ (x_1, x_2) \mid f(x_1, x_2) = x_1^2 + 4x_2^2 + 8 = 12 \}$, $r = 12$. Problem 3: $\{ (x_1, x_2, x_3) \mid f(x_1, x_2, x_3) = 36 \}$, $r = 36$; the graph of $f$ itself would need a 4D space.]

Gradient

Given a scalar function $f$ mapping a vector $x = (x_1, x_2, \ldots, x_n)^T$ to a real value, that is $f : \mathbb{R}^n \to \mathbb{R}$, we define the gradient of $f$, or grad($f$), as:

$\nabla f = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^T.$

Alternate notation for $n = 3$:

$\nabla f = \frac{\partial f}{\partial x_1} i + \frac{\partial f}{\partial x_2} j + \frac{\partial f}{\partial x_3} k.$

Note that it is a column vector, but its entries are functions of $x = (x_1, x_2, \ldots, x_n)^T$. We really have a vector “field”: one vector defined for each point in the $n$-dimensional space.

So, to get a critical point we find those $x = (x_1, x_2, \ldots, x_n)^T$ such that $\nabla f = 0$.

Gradients and Level Sets (1)

A gradient essentially tells us how level sets change as $r$ increases: gradient vectors point in the direction of increasing $r$.

Consider a gradient vector at $x$, where $x$ is a level set point.

Problem 1: $\frac{df}{dx} = 2x - 4$. At the two points of the level set $r = 20$: $\frac{df}{dx} = -8$ at $x = -2$ and $\frac{df}{dx} = 8$ at $x = 6$.
Problem 2: $\nabla f(x) = (2x_1, \; 8x_2)^T$.
Problem 3: $\nabla f(x) = (2x_1, \; 8x_2 - 8, \; 32x_3)^T$.

Gradients and Level Sets (2)

In some cases a diagram will show a set of gradient vectors taken at regular intervals from the background field along with a set of level curves.

[Figure: Problem 2 with an array of gradient vectors and 4 level curves.]

Gradients and Level Sets (3)

THEOREM: The gradient of $f$ at a point is normal to the level set of $f$ through that point.

Proof sketch: Consider a point $p$ in the level set and the gradient vector that is defined at $p$. The level set going through $p$ is $\{ x \mid f(x) = f(p) \}$.

Assume $C$ is any differentiable curve that:
1. Is parameterized by $t$: $C(t) = (c_1(t), c_2(t), \ldots, c_n(t))^T$.
2. Lies within the level set of $f$: $f(c_1(t), c_2(t), \ldots, c_n(t)) = f(p)$.
3. Passes through $p$ when $t = t_p$: $C(t_p) = (c_1(t_p), c_2(t_p), \ldots, c_n(t_p))^T = p$.

Gradients and Level Sets (4)

Proof sketch continued: Since $f(p)$ is a constant, the second assumption gives us

$\frac{d}{dt} f(c_1(t), c_2(t), \ldots, c_n(t)) = \frac{d}{dt} f(p) = 0.$

But using the chain rule on the LHS we can write:

$\sum_{i=1}^{n} \frac{\partial f(c_1(t), c_2(t), \ldots, c_n(t))}{\partial x_i} \frac{dc_i(t)}{dt} = 0.$

At $t = t_p$ this can be written as:

$\nabla f^T \left. \frac{dC}{dt} \right|_{t = t_p} = 0,$

where $\nabla f$ is the gradient vector at $p$ and $\left. \frac{dC}{dt} \right|_{t = t_p}$ is the tangent to $C$ at $p$.

So the gradient at $p$ is normal to the tangent at $p$ of any curve in the level set, implying that the gradient is normal to the level set.

Problems with Constraints (1)

We now consider the same problems but with constraints.

Recall: We still want to minimize or maximize $f$, but now the point must also satisfy the equation $g(x_1, x_2, \ldots, x_n) = 0$.

Let us go through the various problems and consider the effects of a constraint.

Problem 1:
• Minimize $f(x) = x^2 - 4x + 8$ subject to $g(x) = 0$.
• Find the $x$ values such that $g(x) = 0$ (a discrete set), then find which of these $x$ values produces the minimal $f$. This case is not very interesting.

Problems with Constraints (2)

Problem 2 with a constraint: Minimize $f(x) = x_1^2 + 4x_2^2 + 8$ subject to $g(x) = x_1 + 2x_2 - 4 = 0$.

In this case we can solve $g(x) = 0$ for $x_1$ to get $x_1 = 4 - 2x_2$. Substituting into $f$, it becomes:

$(4 - 2x_2)^2 + 4x_2^2 + 8 = 8x_2^2 - 16x_2 + 24 = 8(x_2 - 1)^2 + 16.$

So $x_2 = 1$ and $x_1 = 2$, giving a value for $f$ that is 16.

[Figure: the constraint $g(x) = x_1 + 2x_2 - 4 = 0$, with the points (4, 0), (2, 1) and (0, 2) marked.]

Problems with Constraints (3)

In the last problem we were “lucky” because $g(x_1, x_2) = 0$ could be solved for $x_1$ or $x_2$, so that a substitution could be made into $f(x_1, x_2)$, reducing it to a minimization problem in one variable.

In general, we cannot rely on this strategy. The constraint $g(x_1, x_2) = 0$ could be so complicated that it is impossible to solve for one of its variables.

Note that the previous visualization in “$f$ vs $x$ space” cannot be carried over to Problem 3.

We need a new approach. This will require level curves, gradients, and eventually the Lagrange multiplier strategy.
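The substitution used for Problem 2 with the constraint can be reproduced in a short Python sketch. It eliminates $x_1$ through the constraint, recovers the coefficients of the resulting one-variable quadratic from three samples, and takes its vertex. The names `x1_on_constraint` and `h` are hypothetical helpers introduced for illustration; the approach works here only because this particular constraint can be solved for $x_1$.

```python
def f(x1, x2):
    """Problem 2 objective: f = x1^2 + 4*x2^2 + 8."""
    return x1**2 + 4*x2**2 + 8


def g(x1, x2):
    """Constraint from the text: g = x1 + 2*x2 - 4 (must equal 0)."""
    return x1 + 2*x2 - 4


def x1_on_constraint(x2):
    """Solve g = 0 for x1: x1 = 4 - 2*x2."""
    return 4 - 2*x2


def h(x2):
    """f restricted to the constraint line, a function of x2 only."""
    return f(x1_on_constraint(x2), x2)


# h is a quadratic a*x2^2 + b*x2 + c; recover a, b, c from three samples.
c = h(0)
a = (h(1) + h(-1) - 2*c) / 2
b = (h(1) - h(-1)) / 2

# Vertex of the quadratic gives the constrained minimizer.
x2_star = -b / (2*a)
x1_star = x1_on_constraint(x2_star)

print("reduced quadratic coefficients:", a, b, c)
print("constrained minimizer:", (x1_star, x2_star))
print("f at the minimizer:", f(x1_star, x2_star))
```

The recovered coefficients can be compared with the reduced form $8x_2^2 - 16x_2 + 24$ above, and `g(x1_star, x2_star)` confirms that the point lies on the constraint.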