Taylor Expansions and (log)linearizing

Stéphane Dupraz

1 Mean Value Theorem

Theorem 1.1 (Mean Value Theorem in R). Let f : [a, b] → R. If f is continuous on [a, b] and differentiable on (a, b), then there exists z ∈ (a, b) such that:

f(b) − f(a) = f'(z)(b − a)

Proof. The proof is in two steps. First we prove the result—called Rolle’s theorem—in the particular case where f(b) = f(a). In this case we want to find z such that f'(z) = 0: a critical point. Since f is continuous on the compact [a, b], f has a maximum and a minimum on [a, b] by the Weierstrass theorem. To conclude, we just need to make sure that a maximum or a minimum is attained in the interior (a, b). There are three cases. First, if f is constant over [a, b], then f' = 0 and any z will do. Second, if there exists some x such that f(x) > f(a), then there is an interior maximum. Finally, if there exists some x such that f(x) < f(a), then there is an interior minimum. For the general case, the trick is to define g(t) = (f(b) − f(a))t − (b − a)f(t). It satisfies g(b) = g(a), so we can apply Rolle’s theorem to g: there exists z ∈ (a, b) such that g'(z) = f(b) − f(a) − (b − a)f'(z) = 0. QED.
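For instance, take f(x) = x² on [0, 2]: f(2) − f(0) = 4 and b − a = 2, so the theorem asserts the existence of z ∈ (0, 2) with f'(z) = 2z = 2, namely z = 1.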

The mean value theorem has a straightforward extension to functions from R^n to R.

Theorem 1.2 (Mean Value Theorem in R^n). Let f : S → R, S an open subset of R^n. If f is differentiable, then for any x, y ∈ S such that [x, y] ⊂ S, there exists z ∈ (x, y) (that is, z = λx + (1 − λ)y for some λ ∈ (0, 1)) such that:

f(x) − f(y) = f'(z)(x − y)

Proof. Just define g(λ) = f(λx + (1 − λ)y) for λ ∈ [0, 1] and apply the mean value theorem for functions from R to R: g(1) − g(0) = g'(λ∗) for some λ∗ ∈ (0, 1), which by the chain rule is exactly the statement with z = λ∗x + (1 − λ∗)y. QED.

2 Taylor Expansions

We start with kth-order Taylor expansions for functions from R to R, then consider first- and second-order Taylor expansions for functions from R^n to R.

2.1 Taylor formula for functions from R to R

We have seen that we can look at the derivative of a function f at a point x as providing the best affine approximation of f around x. We say that we linearize the function f around x. A Taylor expansion of order k of f at x generalizes this notion by looking at the best approximation of f around x by a polynomial of order k. There exist variants of the Taylor theorem, depending on the precise meaning of “best approximation” (some require f to be k + 1 times differentiable, others only k times differentiable). Here we state the Taylor-Lagrange theorem, whose proof relies on the mean value theorem.

Theorem 2.1 (Taylor-Lagrange theorem for functions from R to R). Let f : [a, b] → R. If f is k + 1 times differentiable on [a, b], then for all x, y ∈ [a, b], there exists z ∈ (x, y) such that:

f(y) = f(x) + \sum_{i=1}^{k} [f^{(i)}(x)/i!] (y − x)^i + [f^{(k+1)}(z)/(k + 1)!] (y − x)^{k+1}

We call the term f(x) + \sum_{i=1}^{k} [f^{(i)}(x)/i!] (y − x)^i the Taylor expansion of order k of f at x. We often note the result less rigorously as:

f(y) ≈ f(x) + \sum_{i=1}^{k} [f^{(i)}(x)/i!] (y − x)^i

Proof. The proof is by induction on k. The base case k = 0 is the mean value theorem. The induction step relies on the mean-value theorem. We do not go into the details here.
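For instance, for f(y) = e^y around x = 0 with k = 2, the theorem gives e^y = 1 + y + y²/2 + [e^z/3!] y³ for some z between 0 and y. The order-2 Taylor expansion 1 + y + y²/2 evaluated at y = 0.1 gives 1.105, against an exact value e^{0.1} ≈ 1.10517.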

2.2 Second-order Taylor approximations for functions from R^n to R

For functions of n variables, we restrict to first- and second-order Taylor expansions. We have already seen first-order Taylor expansions—differentiation/linearization. Stated not too rigorously, the second-order Taylor expansion of a twice differentiable function from R^n to R is

f(y) ≈ f(x) + f'(x)(y − x) + (1/2)(y − x)'f''(x)(y − x)

≈ f(x) + \sum_{i=1}^{n} f'_i(x)(y_i − x_i) + (1/2) \sum_{i=1}^{n} \sum_{j=1}^{n} f''_{ij}(x)(y_i − x_i)(y_j − x_j)
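For instance, take f(x_1, x_2) = x_1 x_2 around x = (1, 1): the gradient is (x_2, x_1) = (1, 1) and the Hessian has zeros on the diagonal and ones off it, so f(y) ≈ 1 + (y_1 − 1) + (y_2 − 1) + (y_1 − 1)(y_2 − 1)—which is in fact exact here, since f is itself a polynomial of degree 2.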

The main application of first- and second-order Taylor approximations you will see will be in solving Dynamic Stochastic General Equilibrium (DSGE) models in macroeconomics. The perturbation method solves numerically for the equilibrium of an economy defined by a set of difference equations by first taking a first-order or second-order approximation to these equations around the steady state.

3 Linearizing

We now take a more applied turn on first-order approximations: linearization.

3.1 Linearizing a function

As we have seen, to get the first-order Taylor expansion of a function f at a point x∗—to linearize the function f at x∗—we simply need to calculate its derivative at x∗, f'(x∗). Then the approximation is f(x) = f(x∗) + f'(x∗)(x − x∗). Defining dx ≡ x − x∗ and df(x) ≡ f(x) − f(x∗), we can note this as:

df(x) = f'(x∗)dx = f'_1(x∗)dx_1 + f'_2(x∗)dx_2 + ... + f'_n(x∗)dx_n

and see it as relating the deviation of f(x) from f(x∗) to the deviation of x from x∗. This is most useful to linearize equations. Consider an equation of the form:

f(x1, ..., xn) = 0

We can differentiate both sides at x∗ to get:

f'(x∗)dx = f'_1(x∗)dx_1 + f'_2(x∗)dx_2 + ... + f'_n(x∗)dx_n = 0

This turns the equation that relates the variables x_1, ..., x_n into a linear equation that relates the deviations of each variable, dx_1, ..., dx_n.

3.2 Practical rules

The rules of differentiation can be easily expressed in those terms. The linearity of the differentiation operator and the chain rule translate into:

1. d(x + y) = dx + dy.

2. d(λx) = λdx.

3. df(g(x)) = f'(g(x∗))dg(x) = f'(g(x∗))g'(x∗)dx.

For example, to linearize a resource constraint:

f(K_t) = C_t + K_{t+1} − (1 − δ)K_t

around C_t = C∗ and K_{t+1} = K_t = K∗:

f'(K∗)dK_t = dC_t + dK_{t+1} − (1 − δ)dK_t
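Rearranged, the linearized constraint reads dK_{t+1} = (f'(K∗) + 1 − δ)dK_t − dC_t: a linear difference equation in the deviations from steady state.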

4 Loglinearizing

In economics, it is often more relevant to focus on relative deviations (or percentage deviations) rather than on absolute deviations (or level deviations). For instance, knowing that “consumption is 5% below its steady-state value” is more meaningful than knowing that “consumption is $100 below its steady-state value”, the meaning of which depends on how high steady-state consumption is. We are more interested in the relative deviations of variables, which we will denote with a hat:

x̂ = dx/x∗ = (x − x∗)/x∗

But it turns out that at first order1:

x̂ = dx/x∗ = dx × ln'(x∗) = d ln(x)

where the last term means (as for any function): d ln(x) = ln(x) − ln(x∗). For this reason, turning a function—a relationship between inputs x_1, ..., x_n and output y = f(x)—into a linear relationship between the relative variations in x, x̂_1, ..., x̂_n, and the relative variation in y, ŷ, is called loglinearizing the function. (Turning an equation—a relationship between variables x_1, ..., x_n—into a linear equation that relates the relative variations of each variable, x̂_1, ..., x̂_n, is called loglinearizing the equation.)
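For instance, if steady-state consumption is C∗ = 200 and current consumption is C = 210, then Ĉ = 10/200 = 5%, while ln(210) − ln(200) = ln(1.05) ≈ 4.88%: the relative deviation and the log deviation coincide up to first order.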

4.1 Formally

Formally, to loglinearize a function f(x) instead of linearizing it, we first do some changes of variables.

• Instead of the variation dxi, we want the variation d ln(xi), so we define yi = ln(xi).

• Instead of the variation df(x), we want the variation d ln(f(x)), so we consider the function ln(f(x)).

This way, we consider the function:

g(y) = ln(f(e^{y_1}, ..., e^{y_n}))

1 You can also see it through: d ln(x) = ln(x) − ln(x∗) = ln(x/x∗) = ln((x − x∗)/x∗ + 1) ≈ (x − x∗)/x∗ ≡ x̂.

Linearizing:

dg(y) = \sum_{i=1}^{n} [f'_i(e^{y∗_1}, ..., e^{y∗_n}) / f(e^{y∗_1}, ..., e^{y∗_n})] e^{y∗_i} dy_i

d ln(f(x)) = \sum_{i=1}^{n} [f'_i(x∗)/f(x∗)] x∗_i d ln(x_i)

Definition 4.1. The elasticity of a function f at x∗ with respect to x_i is:

ε^i_f(x∗) = [f'_i(x∗)/f(x∗)] x∗_i = ∂ ln(f(x∗))/∂ ln(x_i)

It expresses the percentage change of f(x) for a 1% increase in x_i (at x∗).
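For instance, for a Cobb-Douglas function f(A, K, L) = AK^α L^{1−α}, the elasticity with respect to K is [f'_K(x∗)/f(x∗)]K∗ = α at every point; similarly, the elasticity with respect to L is 1 − α and the elasticity with respect to A is 1.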

We finally get:

\widehat{f(x)} = ε^1_f(x∗) x̂_1 + ... + ε^n_f(x∗) x̂_n

In practice, we can rely on two approaches to loglinearize a function or an equation.

4.2 First Approach and Practical Rules

In practice, we can use the following method:

Approach 1. To loglinearize a function f:

1. Take the log of f.

2. Linearize ln(f), remembering d ln(x) = x̂.

For instance, let us loglinearize a Cobb-Douglas production function Y = AK^α L^{1−α} at any point:

ln(Y ) = ln(A) + α ln(K) + (1 − α) ln(L)

d ln(Y ) = d ln(A) + αd ln(K) + (1 − α)d ln(L)

Ŷ = Â + αK̂ + (1 − α)L̂

This example is straightforward because of the following practical rules for loglinearization, which are simply the translation of the corresponding rules for linearization:

1. \widehat{xy} = x̂ + ŷ

2. \widehat{x^α} = α x̂

3. \widehat{f(g(x))} = ε_f(g(x∗)) ε_g(x∗) x̂.

Power functions have constant elasticities; they are to loglinearization what linear functions, which have a constant derivative, are to linearization.
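For instance, combining rules 1 and 2, \widehat{x^2 y^{1/2}} = 2x̂ + (1/2)ŷ: no steady-state constants appear, which is what makes products and powers so convenient to loglinearize.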

4.3 Second approach to deal with sums

The previous rules imply that it is straightforward to loglinearize a function that contains only products and power functions using the first approach. However, with the first approach it is not so easy to deal with sums. In contrast, it was easy to deal with sums when linearizing. Hence, when a function f includes sums, it is easier to loglinearize it using the alternative method:

Approach 2. To loglinearize a function f:

1. Linearize f.

2. Replace absolute deviations dx by x∗x̂.

For instance, to loglinearize the resource constraint above:

f'(K∗)dK_t = dC_t + dK_{t+1} − (1 − δ)dK_t

f'(K∗)K∗ K̂_t = C∗ Ĉ_t + K∗ K̂_{t+1} − (1 − δ)K∗ K̂_t

f'(K∗) K̂_t = (C∗/K∗) Ĉ_t + K̂_{t+1} − (1 − δ) K̂_t
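For instance, if production were Cobb-Douglas in capital, f(K_t) = AK_t^α (a particular functional form used here only for illustration), then f'(K∗) = αY∗/K∗ with Y∗ = AK∗^α, and the last equation becomes α(Y∗/K∗)K̂_t = (C∗/K∗)Ĉ_t + K̂_{t+1} − (1 − δ)K̂_t.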

4.4 Mixing the two approaches

Which of the two approaches should we use in practice? We have seen that when a function involves products and power functions (for instance the Cobb-Douglas production function), taking the log and linearizing is the quickest way to go. But when the function is a sum (for instance a resource or budget constraint), then it is easier to linearize and replace dx = x∗x̂. Now, most functions are sums of products or products of sums. Consider for instance the following resource constraint, where the production function is Cobb-Douglas:

AK^α L^{1−α} = C + I

It is a sum, but with one of its terms (on the LHS) being itself a product. Start by seeing it as a sum—which it is—so linearize, remembering dx = x∗x̂:

(A∗K∗^α L∗^{1−α}) \widehat{AK^α L^{1−α}} = C∗ Ĉ + I∗ Î

Then treat the term \widehat{AK^α L^{1−α}} with the rules on products and power functions to get:

(A∗K∗^α L∗^{1−α})(Â + αK̂ + (1 − α)L̂) = C∗ Ĉ + I∗ Î

Â + αK̂ + (1 − α)L̂ = [C∗/(A∗K∗^α L∗^{1−α})] Ĉ + [I∗/(A∗K∗^α L∗^{1−α})] Î

Alternatively, you may face a product some of whose factors are sums. Then treat it as a product (take the log and linearize), and then deal with the sums inside each factor as above. Finally, if you run into a function whose functional form is not made explicit, use the general formula with elasticities.
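For instance (a made-up example for illustration), to loglinearize X = Z(C + I): take logs, ln(X) = ln(Z) + ln(C + I), and linearize to get X̂ = Ẑ + [C∗/(C∗ + I∗)]Ĉ + [I∗/(C∗ + I∗)]Î, where the weights come from linearizing the sum inside the log.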

4.5 When loglinearizing equations, reduce the number of constants

When we loglinearize a function f(x_1, ..., x_n) at x∗, constants may appear that depend on x∗_1, ..., x∗_n. If we are loglinearizing an equation f(x_1, ..., x_n) = 0, those constants are linked through the equation at x∗, f(x∗_1, ..., x∗_n) = 0. For instance, in the previous example, we have two constants that depend on x∗. However, they are linked through:

A∗K∗^α L∗^{1−α} = C∗ + I∗

1 = C∗/(A∗K∗^α L∗^{1−α}) + I∗/(A∗K∗^α L∗^{1−α})

Noting s_C = C∗/(A∗K∗^α L∗^{1−α}) the share of consumption in GDP, the loglinearized equation becomes:

Â + αK̂ + (1 − α)L̂ = s_C Ĉ + (1 − s_C) Î

There is some arbitrariness in the choice of the constants to keep.

4.6 A remark on interest, inflation, ... rates

Just a remark to finish: many authors use a different definition of the hat variable for interest and inflation rates. What they note r̂, which could be taken as the relative deviation of the net interest rate, is actually the relative deviation of the gross interest rate, \widehat{1 + r}. The connection between the two is:

\widehat{1 + r} = d(1 + r)/(1 + r∗) = dr/(1 + r∗) = [r∗/(1 + r∗)] r̂
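For instance, with r∗ = 4% and r_t = 5%, the relative deviation of the net rate is r̂_t = 0.01/0.04 = 25%, while \widehat{1 + r_t} = 0.01/1.04 ≈ 0.96%—roughly the one-percentage-point deviation itself.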

The advantage of doing so is that it allows us to consider the case of a steady-state value of the net rate equal to zero, which might be of particular interest for the nominal interest rate or the inflation rate. Indeed, we cannot loglinearize the net interest rate around a steady-state value of zero, but we can loglinearize the gross interest rate around 1. Note that in this case, the relative deviation of the gross interest rate is the net interest rate itself:

\widehat{1 + r_t} = dr_t = r_t
