QE Optimization, WS 2016/17
Part 2. Differential Calculus for Functions of n Variables (about 5 Lectures)

Supporting Literature:
Angel de la Fuente, "Mathematical Methods and Models for Economists", Chapter 2
C. Simon, L. Blume, "Mathematics for Economists"

Contents
2 Differential Calculus for Functions of n Variables
2.1 Partial Derivatives
2.2 Directional Derivatives
2.3 Higher Order Partials
2.4 Total Differentiability
2.5 Chain Rule
2.6 Taylor's Formula
2.7 Implicit Functions
2.8 Inverse Functions
2.9 Unconstrained Optimization
2.10 First-Order Conditions
2.11 Second-Order Conditions
2.12 A Rough Guide: How to Find the Global Maxima/Minima
2.13 Envelope Theorems
2.14 Gâteaux and Fréchet Differentials

In this chapter we consider functions
\[ f : \mathbb{R}^n \to \mathbb{R} \]
of $n \ge 1$ variables (multivariate functions). Such functions are the basic building block of formal economic models.

2 Differential Calculus for Functions of n Variables

2.1 Partial Derivatives

Everywhere below, $U \subseteq \mathbb{R}^n$ will be an open set in the space $(\mathbb{R}^n, \|\cdot\|)$ (with the Euclidean norm $\|\cdot\|$) and
\[ f : U \to \mathbb{R}, \qquad U \ni (x_1,\dots,x_n) \mapsto f(x_1,\dots,x_n) \in \mathbb{R}. \]

Definition 2.1.1. The function $f$ is partially differentiable with respect to the $i$-th coordinate (or variable) $x_i$ at a given point $x \in U$ if the following limit exists:
\[
D_i f(x) := \lim_{h \to 0} \frac{f(x + h e_i) - f(x)}{h}
= \lim_{h \to 0} \frac{1}{h}\bigl[ f(x_1,\dots,x_{i-1}, x_i + h, x_{i+1},\dots,x_n) - f(x_1,\dots,x_n) \bigr],
\]
where $e_i := (0,\dots,0,1,0,\dots,0)$, with the $1$ in the $i$-th position, is the $i$-th basis vector in $\mathbb{R}^n$.

Since $U$ is open, there exists an open ball $B_\varepsilon(x) \subseteq U$. In the definition of $\lim_{h \to 0}$ one considers only "small" $h$ with $|h| < \varepsilon$.

$D_i f(x)$ is called the $i$-th partial derivative of $f$ at the point $x$.

Notation: We also write $D_{x_i} f(x)$, $\partial_i f(x)$, $\partial f(x)/\partial x_i$.

The partial derivative $D_i f(x)$ can be interpreted as a usual derivative w.r.t. the $i$-th coordinate, whereby all the other $n-1$ coordinates are kept fixed. Namely, in the $\varepsilon$-neighbourhood of $x_i$, let us define a function
\[ (x_i - \varepsilon, x_i + \varepsilon) \ni \xi \mapsto g_i(\xi) := f(x_1,\dots,x_{i-1},\xi,x_{i+1},\dots,x_n). \]
Then by Definition 2.1.1,
\[ D_i f(x) := \lim_{h \to 0} \frac{g_i(x_i + h) - g_i(x_i)}{h} = g_i'(x_i). \]

Definition 2.1.2. A function $f : U \to \mathbb{R}$ is called partially differentiable if $D_i f(x)$ exists for all $x \in U$ and all $1 \le i \le n$. Furthermore, $f$ is called continuously partially differentiable if all partial derivatives $D_i f : U \to \mathbb{R}$, $1 \le i \le n$, are continuous functions.

Example 2.1.3.
(i) Distance function
\[ r(x) := |x| = \sqrt{x_1^2 + \dots + x_n^2}, \qquad x \in \mathbb{R}^n. \]
Let us show that $r(x)$ is partially differentiable at all points $x \in \mathbb{R}^n \setminus \{0\}$. Define
\[ \xi \mapsto g_i(\xi) := \sqrt{x_1^2 + \dots + \xi^2 + \dots + x_n^2} \in \mathbb{R}. \]
Using the chain rule for derivatives of real-valued functions (cf. standard courses in Calculus),
\[ \frac{\partial r}{\partial x_i}(x) = \frac{1}{2\sqrt{x_1^2 + \dots + x_n^2}} \cdot 2x_i = \frac{x_i}{r(x)}. \]
Generalization: Let $f : \mathbb{R}_+ \to \mathbb{R}$ be differentiable. Then $\mathbb{R}^n \ni x \mapsto f(r(x))$ is partially differentiable at all points $x \in \mathbb{R}^n \setminus \{0\}$ and
\[ \frac{\partial}{\partial x_i} f(r) = f'(r) \cdot \frac{\partial r}{\partial x_i} = f'(r) \cdot \frac{x_i}{r}. \]

(ii) Cobb–Douglas production function with $n$ inputs:
\[ f(x) := x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n} \quad \text{for } \alpha_i > 0,\ 1 \le i \le n, \]
defined on $U := \{(x_1,\dots,x_n) \mid x_i > 0,\ 1 \le i \le n\}$. We calculate the so-called marginal-product function of input $i$:
\[ \frac{\partial f}{\partial x_i}(x) = \alpha_i x_1^{\alpha_1} \cdots x_i^{\alpha_i - 1} \cdots x_n^{\alpha_n} = \alpha_i \frac{f(x)}{x_i}. \]
Mathematicians will say: multiplicative functions with separable variables, polynomials.
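As a quick numerical sanity check of the marginal-product formula, here is a minimal Python sketch (not part of the original notes; the helper names and the particular choices of $\alpha$ and $x$ are illustrative): it compares a central finite-difference approximation of $\partial f/\partial x_i$ with the closed form $\alpha_i f(x)/x_i$.

```python
import numpy as np

def cobb_douglas(x, alpha):
    # f(x) = x_1^{a_1} * ... * x_n^{a_n} on the positive orthant
    return float(np.prod(x ** alpha))

def partial_fd(f, x, i, h=1e-6):
    # central finite-difference approximation of the i-th partial D_i f(x)
    e = np.zeros_like(x)
    e[i] = h
    return (f(x + e) - f(x - e)) / (2 * h)

alpha = np.array([0.3, 0.5, 0.2])   # arbitrary illustrative exponents
x = np.array([2.0, 1.5, 4.0])       # arbitrary point with x_i > 0

for i in range(len(x)):
    numeric = partial_fd(lambda z: cobb_douglas(z, alpha), x, i)
    closed = alpha[i] * cobb_douglas(x, alpha) / x[i]
    print(i, numeric, closed)       # the two columns agree up to O(h^2)
```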
Economists are especially interested in the case $\alpha_i \in (0,1)$. This is an example of a homogeneous function of order (degree) $a = \alpha_1 + \dots + \alpha_n$, which means
\[ f(\lambda x) = \lambda^a f(x), \qquad \forall \lambda > 0,\ x \in U. \]
Moreover, the Cobb–Douglas function is log-linear:
\[ \log f(x) = \alpha_1 \log x_1 + \dots + \alpha_n \log x_n. \]

(iii) Quasilinear utility function: $f(m,x) := m + u(x)$ with $m \in \mathbb{R}_+$ (i.e., $m \ge 0$) and some differentiable $u : \mathbb{R} \to \mathbb{R}$:
\[ \frac{\partial f}{\partial m} = 1, \qquad \frac{\partial f}{\partial x} = u'(x). \]

(iv) Constant elasticity of substitution (CES) production function with $n$ inputs, which also describes aggregate consumption for $n$ types of goods:
\[ f(x_1,\dots,x_n) := (\delta_1 x_1^\alpha + \dots + \delta_n x_n^\alpha)^{1/\alpha}, \]
with $\alpha > 0$, $\delta_i > 0$ and $\sum_{1 \le i \le n} \delta_i = 1$, defined on the open domain $U := \{(x_1,\dots,x_n) \mid x_i > 0,\ 1 \le i \le n\}$. We calculate the marginal-product function
\[ \frac{\partial f}{\partial x_i}(x) = \frac{1}{\alpha}(\delta_1 x_1^\alpha + \dots + \delta_n x_n^\alpha)^{\frac{1}{\alpha}-1} \cdot \alpha \delta_i x_i^{\alpha-1} = \delta_i x_i^{\alpha-1} (\delta_1 x_1^\alpha + \dots + \delta_n x_n^\alpha)^{\frac{1-\alpha}{\alpha}}. \]
Note that $f$ is homogeneous of degree $1$: $f(\lambda x) = \lambda f(x)$.

Definition 2.1.4. Let $U \subseteq \mathbb{R}^n$ be open and $f : U \to \mathbb{R}$ be partially differentiable. Then the vector
\[ \nabla f(x) := \operatorname{grad} f(x) := \left( \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right) \in \mathbb{R}^n \]
is called the gradient of $f$ at the point $x \in U$.

Example 2.1.5.
(i) Distance function $r(x)$:
\[ \operatorname{grad} r(x) = \frac{x}{r(x)} \in \mathbb{R}^n, \qquad x \in U := \mathbb{R}^n \setminus \{0\}. \]
(ii) Let $f, g : U \to \mathbb{R}$ be partially differentiable. Then
\[ \nabla(f \cdot g) = f \cdot \nabla g + g \cdot \nabla f. \]
Proof. This follows from the product rule
\[ \frac{\partial}{\partial x_i}(fg) = f \frac{\partial g}{\partial x_i} + g \frac{\partial f}{\partial x_i}. \]

2.2 Directional Derivatives

Fix a directional vector $v \in \mathbb{R}^n$ with $|v| = 1$ (of unit length!).

Definition 2.2.1. The directional derivative of $f : U \to \mathbb{R}$ at a point $x \in U$ along the unit vector $v \in \mathbb{R}^n$ (i.e., with $|v| = 1$) is given by
\[ \partial_v f(x) := D_v f(x) := \lim_{h \to 0} \frac{f(x + hv) - f(x)}{h}. \]

Remark 2.2.2.
(i) Define a new function $h \mapsto g_v(h) := f(x + hv)$. If $g_v(h)$ is differentiable at $h = 0$, then $f$ is differentiable at the point $x \in U$ along the direction $v$ and
\[ D_v f(x) = g_v'(0). \]
(ii) From the above definitions it is clear that the partial derivatives are exactly the directional derivatives along the basis vectors $e_i$, $1 \le i \le n$:
\[ \frac{\partial f}{\partial x_i}(x) = D_{e_i} f(x), \qquad 1 \le i \le n. \]

Example 2.2.3. Consider the "saddle" function in $\mathbb{R}^2$
\[ f(x_1,x_2) := -x_1^2 + x_2^2, \]
and find $D_v f(x)$ along the direction $v := (\sqrt{2}/2, \sqrt{2}/2)$, $|v| = 1$. Define
\[ g_v(h) := -\bigl(x_1 + h\sqrt{2}/2\bigr)^2 + \bigl(x_2 + h\sqrt{2}/2\bigr)^2 = -x_1^2 + x_2^2 + \sqrt{2}\, h (x_2 - x_1). \]
Then $D_v f(x) = g_v'(0) = \sqrt{2}(x_2 - x_1)$. Note that $D_v f(x) = 0$ if $x_1 = x_2$: along the direction $v$, the function $f$ is stationary on the diagonal $x_1 = x_2$ (indeed, $f$ vanishes identically there).

Relation between $\nabla f(x)$ and $D_v f(x)$:
\[ D_v f(x) = \langle \nabla f(x), v \rangle_{\mathbb{R}^n} = \sum_{i=1}^n \partial_i f(x) \cdot v_i. \tag{$*$} \]
Proof: will be done later, as soon as we prove the chain rule for $\nabla f$.

2.3 Higher Order Partials

Let $f : U \to \mathbb{R}$ be partially differentiable, i.e., there exist
\[ \frac{\partial}{\partial x_i} f : U \to \mathbb{R}, \qquad 1 \le i \le n. \]
Analogously, for $1 \le j \le n$ we can define (if it exists)
\[ \frac{\partial}{\partial x_j} \frac{\partial}{\partial x_i} f : U \to \mathbb{R}. \]
Notation:
\[ \frac{\partial^2 f}{\partial x_j \partial x_i}, \qquad \text{or } \frac{\partial^2 f}{\partial x_i^2} \text{ if } i = j. \]
Warning: In general,
\[ \frac{\partial^2 f}{\partial x_j \partial x_i} \ne \frac{\partial^2 f}{\partial x_i \partial x_j} \quad \text{if } i \ne j. \]

Theorem 2.3.1 (H. A. Schwarz; also known as Young's theorem). Let $U \subseteq \mathbb{R}^n$ be open and $f : U \to \mathbb{R}$ be twice continuously differentiable, $f \in C^2(U)$ (i.e., all derivatives $\frac{\partial^2 f}{\partial x_j \partial x_i}$, $1 \le i,j \le n$, exist and are continuous). Then for all $x \in U$ and $1 \le i,j \le n$
\[ \frac{\partial^2 f(x)}{\partial x_j \partial x_i} = \frac{\partial^2 f(x)}{\partial x_i \partial x_j}, \]
i.e., for cross-partial derivatives the order of differentiation is irrelevant.

Example: (i) The above theorem works:
\[ f(x_1,x_2) := x_1^2 + b x_1 x_2 + x_2^2, \qquad (x_1,x_2) \in \mathbb{R}^2 \]
(both mixed partials equal $b$).

Counterexample: (ii) The above theorem does not work:
\[ f(x_1,x_2) := \begin{cases} x_1 x_2 \dfrac{x_1^2 - x_2^2}{x_1^2 + x_2^2}, & (x_1,x_2) \in \mathbb{R}^2 \setminus \{(0,0)\}, \\[1ex] 0, & (x_1,x_2) = (0,0). \end{cases} \]
We calculate
\[ \frac{\partial^2 f}{\partial x_1 \partial x_2}(0,0) = 1 \ne -1 = \frac{\partial^2 f}{\partial x_2 \partial x_1}(0,0). \]
Reason: $f \notin C^2(\mathbb{R}^2)$.
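The failure of symmetry in this counterexample can also be observed numerically. The following Python sketch (an illustration added here; the step sizes are ad-hoc choices) approximates the two mixed partials at the origin by nested central differences, taking the inner step much smaller than the outer one so that the two limits are resolved in the right order.

```python
def f(x1, x2):
    # the classical counterexample: smooth away from the origin, not C^2 at (0,0)
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x2 * (x1**2 - x2**2) / (x1**2 + x2**2)

def d1(x1, x2, k=1e-6):
    # central difference in x1 (inner limit)
    return (f(x1 + k, x2) - f(x1 - k, x2)) / (2 * k)

def d2(x1, x2, k=1e-6):
    # central difference in x2 (inner limit)
    return (f(x1, x2 + k) - f(x1, x2 - k)) / (2 * k)

h = 1e-4  # outer step, deliberately larger than the inner step k

d12 = (d2(h, 0.0) - d2(-h, 0.0)) / (2 * h)  # d/dx1 (df/dx2) at (0,0) -> about +1
d21 = (d1(0.0, h) - d1(0.0, -h)) / (2 * h)  # d/dx2 (df/dx1) at (0,0) -> about -1
print(d12, d21)  # the mixed partials disagree, so f cannot be C^2 at the origin
```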
Notation:
\[ D_{i_k \dots i_1} f, \qquad \frac{\partial^k f}{\partial x_{i_k} \cdots \partial x_{i_1}}, \]
for any $i_1,\dots,i_k \in \{1,\dots,n\}$.

In general, for any $v \in \mathbb{R}^n$ with $|v| = 1$, we have by $(*)$ and the Cauchy–Schwarz inequality
\[ |D_v f(x)| \le |\nabla f(x)|_{\mathbb{R}^n}. \]
Geometrical interpretation of $\nabla f$: Define the normalized vector
\[ v := \frac{\nabla f(x)}{|\nabla f(x)|_{\mathbb{R}^n}} \in \mathbb{R}^n. \]
Then, for this $v$,
\[ D_v f(x) = \langle \nabla f(x), v \rangle_{\mathbb{R}^n} = |\nabla f(x)|_{\mathbb{R}^n}. \]
In other words, the gradient $\nabla f(x)$ of $f$ at the point $x$ points in the direction in which the slope of $f$ is the largest in absolute value.

2.4 Total Differentiability

Intuition: repetition of the 1-dimensional case.

Definition 2.4.1. A function $g : \mathbb{R} \to \mathbb{R}$ is differentiable at a point $x \in \mathbb{R}$ if the following limit exists:
\[ \lim_{h \to 0} \frac{g(x+h) - g(x)}{h} =: g'(x) \in \mathbb{R}. \tag{$*$} \]
Geometrical picture: locally, i.e., for small $|h| \to 0$, we can approximate the values of $g(x+h)$ by the linear function $g(x) + ah$ with $a := g'(x) \in \mathbb{R}$. Indeed, the limit $(*)$ can be rewritten as
\[ \lim_{h \to 0} \frac{g(x+h) - [g(x) + ah]}{h} = 0. \]
The approximation error $E_g(h)$ equals
\[ E_g(h) := g(x+h) - [g(x) + ah] \in \mathbb{R}, \]
and it goes to zero faster than $h$:
\[ \lim_{h \to 0} \frac{E_g(h)}{h} = 0, \quad \text{i.e.,} \quad \lim_{h \to 0} \frac{|E_g(h)|}{|h|} = 0. \]
The latter can be written as
\[ E_g(h) = o(h) \text{ as } h \to 0, \qquad g(x+h) \sim g(x) + ah \text{ as } h \to 0. \]
Summary: $g : \mathbb{R} \to \mathbb{R}$ is differentiable at $x \in \mathbb{R}$ if, for points $x+h$ sufficiently close to $x$, the values $g(x+h)$ admit a "nice" approximation by a linear function $g(x) + ah$, with an error $E_g(h) := g(x+h) - g(x) - ah$ that goes to zero "faster" than $h$ itself, i.e.,
\[ \lim_{h \to 0} \frac{|E_g(h)|}{|h|} = 0. \]
Now we extend the notion of differentiability to functions $f : \mathbb{R}^n \to \mathbb{R}^m$, for arbitrary $n, m \ge 1$:
\[ f(x) = \begin{pmatrix} f_1(x) \\ \vdots \\ f_m(x) \end{pmatrix} = \begin{pmatrix} f_1(x_1,\dots,x_n) \\ \vdots \\ f_m(x_1,\dots,x_n) \end{pmatrix}. \]
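Before working out the $n$-dimensional definition, here is a small numerical illustration of the one-dimensional error term above (a sketch added here; $g = \exp$ at $x = 0$ and the step sizes are arbitrary choices): the ratio $|E_g(h)|/|h|$ shrinks with $h$, as $E_g(h) = o(h)$ requires.

```python
import numpy as np

g, dg, x = np.exp, np.exp, 0.0          # g = exp, so g'(x) is known in closed form

for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    E = g(x + h) - (g(x) + dg(x) * h)   # error of the linear approximation
    print(h, abs(E) / abs(h))           # ratios ~ h/2, vanishing as h -> 0
```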