Optimization. A first course of mathematics for economists Xavier Martinez-Giralt Universitat Autonoma` de Barcelona [email protected] I.3.- Differentiability OPT – p.1/91 Differentiation and derivative Definitions A function f : A IR IR is differentiable at a point x0 A if the limit ⊂ → ∈ f(xo+h) f(x0) f ′(x0) = limh 0 h− exists. Equivalently, → ′ f(x0+h) f(x0) f (x0)h limh 0 − h − = 0 or → ′ f(x) f(x ) f (x )(x x ) | − 0 − 0 − 0 | limx x0 x x = 0. → | − 0| n m A function f : A IR IR is differentiable at a point x0 A if we can find a linear⊂ function→ Df(x ) : IRn IRm (that we∈ 0 → refer to as the derivative of f at x0) such that f(x) [f(x )+ Df(x )(x x )] lim k − 0 0 − 0 k = 0 x x0 x x → k − 0k If f is differentiable x A, we say that f is differentiable inA. ∀ ∈ Derivative is the slope of the linear approximation of f at x0. OPT – p.2/91 Differentiability- Illustration y = f(x0) + Df (x0)( x − x0) ′ y = f(x0) + f (x0)( x − x0) x2 f f(x) x2 x 0 x1 x0 x1 OPT – p.3/91 Differentiation and derivative (2) Some theorems m Theorem 1: Let f : A IR be differentiable at x0 A. Assume A IRn is an→ open set. Then, there is a unique∈ linear ⊂ approximation Df(x0) to f at x0. Recall some one-dimensional results Theorem 2 (Fermat): Let f : (a,b) IR be differentiable at → c (a,b). If c is an extreme point of f then, f (c) = 0. ∈ ′ Theorem 3 (Rolle): Let f : [a,b] IR be continuous. Assume → f is differentiable in (a,b). Assume also f(a)= f(b) = 0. Then, c (a,b) such that f (c) = 0. ∃ ∈ ′ Theorem 4 (Mean-Value): Let f : [a,b] IR be continuous. → Assume f is differentiable in (a,b). Then, c (a,b) such that ∃ ∈ f(b) f(a)= f (c)(b a). − ′ − Corollary: If, in addition, f ′ = 0 on (a,b), then f is constant. OPT – p.4/91 Differentiation and derivative (3) Proof of theorem 2 Let f have a maximum at c. Then, for h 0, [f(c + h) f(c)]/h 0. Letting h 0,h 0 we get ≥ − ≤ → ≥ f (c) 0. Similarly, for h 0, it follows that f (c) 0. Hence, ′ ≤ ≤ ′ ≥ f ′(c) = 0. A parallel argument holds when f has a minimum at c. Proof of theorem 3 If f(x) = 0, x [a,b], we can choose any c. ∀ ∈ If f = 0, applying the boundedness theorem c1 where f reaches6 a maximum and c where f reaches∃ a minimum. ∃ 2 Since f(a)= f(b) = 0, at least one of c1,c2 lies in (a,b). Assume c (a,b). Then, applying theorem 1 f (c ) = 0. 1 ∈ ′ 1 Mutatis mutandis for c2 OPT – p.5/91 One-dimension theorems - Illustration f ′ = 0 f(x) ′ ′ 0 f(x) f = 0 f > f ′ < 0 c − h c c + h x c − h c c + h x Theorem 2 f(x) c2 c4 c1 c3 x Theorem 3 OPT – p.6/91 Differentiation and derivative (4) Proof of theorem 4 Define an auxiliary function f(b) f(a) g(x)= f(x) f(a) (x a) b−a − − − − g(x) is continuous in [a,b] and differentiable in (a,b). f(b) f(a) g′(x)= f ′(x) b−a . Also, g(a)= g(b) = 0. − − Applying Rolle’s theorem, g′(c) = 0, Hence, f(b) f(a)= f (c)(b a). − ′ − Proof of the Corollary Apply Theorem 4 to f on [a,x]. Then, f(x) f(a)= f (c)(x a) = 0. − ′ − Thus, f(x)= f(a) x [a,b] ∀ ∈ Therefore, f is constant. OPT – p.7/91 One-dimension theorems - Illustration f(x) g(x) d f(b) − f(a) α b − a a x c b x f(b) − f(a) d − f(a) tg (α) = = b − a x − a g(x) = f(x) − [f(a) + tg (α)( x − a)] = f(x) − d Theorem 4 OPT – p.8/91 The general Mean-Value theorem We say that c is on the line segment joning x and y if it can be written as a convex combination of x and y. Namely, c = (1 λ)x + λy, λ 0, λ [0, 1] − ≥ ∈ The general Mean-Value theorem (i) Let f : A IRn IR be differentiable on A (an open set). ⊂ → Consider (x, y) A s.t. the segment defined by their convex combination∈ lies in A. Then c in that segment such that f(y) ∃f(x)= Df(c)(y x) − − (ii) Let f : A IRn IRm be differentiable on A (open set). ⊂ → Let f = (f1, f2,...,fm). Consider (x, y) A s.t. the segment defined by their convex combination∈ lies in A. Then (c ,c ,...,c ) on that segment such that ∃ 1 2 m f (y) f (x)= Df (c )(y x), i = 1, 2,...,n i − i i i − OPT – p.9/91 The Jacobian matrix Let f : A IRn IRm: f(x ,...,x )= ⊂ → 1 n [f1(x1,...,xn), f2(x1,...,xn),...,fm(x1,...,xn)] Compute the partial derivative of fj wrt xi, i.e. ∂fj , j = 1,...,m; i = 1,...,n ∂xi Partial derivative. Definition ∂fj fj (x1,...,xi+h,...,xn) fj (x1,...,xn) (x1,...,xn) = limh 0 − ∂xi → h Theorem Let f : A IRm be differentiable. Assume A IRn is → ⊂ an open set. Then, the partial derivatives ∂fj/∂xi exist and the matrix Df(x) is given by ∂f1 (x) ∂f1 (x) . ∂f1 (x) ∂x1 ∂x2 ∂xn ∂f2 (x) ∂f2 (x) . ∂f2 (x) ∂x1 ∂x2 ∂xn (Jacobian matrix of f) . .. . ∂fm ∂fm ∂fm (x) (x) . (x) ∂x1 ∂x2 ∂xn OPT – p.10/91 The gradient of f The gradient is the generalization of the concept of derivative of a function in IR to a function in IRm. The gradient is the vector of the n partial derivatives of f. Gradient of f. Definition. Let f : A IRn IR be differentiable. The gradient of f is the⊂ vector→ whose components are the elements of Df(x). That is, f = ∂f (x), ∂f (x),..., ∂f (x) ∇ ∂x1 ∂x2 ∂xn The gradient points towards the direction of greatest rate of increase of the function f. (See below, p. 19) Recall: Let f : A IR. If f is differentiable at x0 then it is continuous at that→ point. OPT – p.11/91 Gradient - Illustration Consider f(x, y) = 4x2 + y2. Then, f = (8x, 2y). The figure represents the gradient at three points∇ for the level set k = 1. They are orthogonal to the level surfice as they show the direction of greatest increase of f. y ∇f(0 , 1) = (0 , 2) "∇ f(0 , 1) " = 2 1 1 √3 f( , ) = (2 , √3) ∇ 4 2 √3 1 √3 2 f( , ) = √7 #∇ 4 2 # k = 1 1 1 x 4 2 1 ∇f( , 0) = (4 , 0) 2 1 "∇ f( , 0) " = 4 2 OPT – p.12/91 Continuity and differentiability Theorem: Let f : A IRm be differentiable in A. Assume A IRn is an open set.→ Then, f is continuous. ⊂ Proof: f(x) f(x ) − 0 f differentiable at x0 means f ′(x0) = limx x0 x x → − 0 We need to prove that limx x f(x)= f(x0) → 0 f(x) f(x0) lim [f(x) f(x0)] = lim (x x0) − x x0 − x x0 − x x → → − 0 f(x) f(x0) = lim (x x0) lim − = 0 f ′(x0) = 0 x x0 − x x0 x x · → → − 0 Therefore, limx x0 f(x)= f(x0) proving that f is continuous. → OPT – p.13/91 Continuity and differentiability (2) Theorem (Lipschitz property): Let f : A IRm be differentiable in A. Assume A IRn is an→ open set. Then, f is ⊂ continuous. More precisely, x0 A, M > 0 and δ0 > 0 such that x x <δ implies f∀(x) ∈ f(x∃ M x x . k − 0k 0 k − 0k≤ k − 0k The Lipschitz property defines a stronger notion of continuity where, the number M (called the “Lipschitz constant") represents the bound of the slope of the function at x0. A particular case of Lipschitz continuity is the property of a function being a contraction, when M < 1 (useful for fixed-point theorems, and stability of equilibria). Theorem: Consider f : A IRm. Assume A IRn is an open → ⊂ set. Assume f = (f1, f2,...,fm). If each of the partial derivatives ∂fj /∂xi exists, and is a continuous function in A, then f is differentiable in A. OPT – p.14/91 Directional derivatives Intuition: Consider a function defined in a n-dimensional space. The directional derivative is the rate of change of the function f in a particular direction e. Definition: Let f : IRn IRm. Assume f is defined in a neighborhood of x →IRn. Let e IRn be a unit vector. Then, 0 ∈ ∈ the directional derivative of f at x0 in the direction e is defined as d f(x0+he) f(x0) Duf(x0) f(x0 + he) h=0 = limh 0 − ≡ dh | → h This is very similar to the definition of a partial derivative. However, this limit may be difficult to compute. An equivalent formula can be derived using the gradient of f. OPT – p.15/91 Directional derivatives - using the gradient Introduction For illustrative purposes, the argument is developed in IR2, but straightforward generalization Consider f(x, y) and a unit vector e = (e1,e2). Define g(z) f(x, y) with x =x ˜ + e z and y =y ˜ + e z.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages91 Page
-
File Size-