
John Nachbar Washington University September 6, 2016

The Implicit Function Theorem¹

1 Introduction

The Implicit Function Theorem is a non-linear version of the following observation from linear algebra. Suppose first that $F : \mathbb{R}^2 \to \mathbb{R}$ is given by $F(x) = ax_1 + bx_2$. If $a \neq 0$, then the zero set of $F$ (the set of $(x_1, x_2)$ such that $F(x) = 0$; also called the kernel of $F$, since $F$ is linear) can be written as the graph of a function $\psi : \mathbb{R} \to \mathbb{R}$ given by

\[ \psi(x_2) = -\frac{b}{a}\,x_2. \]

In particular, $F(\psi(x_2), x_2) = a[-(b/a)x_2] + bx_2 = 0$.

More generally, suppose that $L = M + N$ and that $F : \mathbb{R}^L \to \mathbb{R}^M$ is linear and can be written in the form $F(x) = Ax_\mu + Bx_\nu$, where $x = (x_\mu, x_\nu) \in \mathbb{R}^L$, $x_\mu \in \mathbb{R}^M$, $x_\nu \in \mathbb{R}^N$, $A$ is an $M \times M$ matrix, and $B$ is an $M \times N$ matrix. If $A$ is invertible then the zero set of $F$ can be written as the graph of a linear function $\psi : \mathbb{R}^N \to \mathbb{R}^M$ given by

\[ \psi(x_\nu) = -A^{-1}Bx_\nu. \]
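As a quick numerical illustration of the linear case, the sketch below checks that $F(\psi(x_\nu), x_\nu) = 0$ for one small instance with $M = 2$ and $N = 1$. The matrices `A` and `B` and the helper names are illustrative assumptions, not taken from the notes; only the Python standard library is used.

```python
# Linear case: F(x) = A x_mu + B x_nu with M = 2, N = 1.
# A and B are illustrative values, not taken from the notes.
A = [[2.0, 1.0],
     [1.0, 3.0]]   # M x M, invertible (determinant 5)
B = [[4.0],
     [-1.0]]       # M x N


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


# psi(x_nu) = -A^{-1} B x_nu, so the matrix representing psi is -A^{-1} B.
PSI = [[-v for v in row] for row in matmul(inverse_2x2(A), B)]


def psi(x_nu):
    """Map the scalar x_nu to the vector x_mu = -A^{-1} B x_nu."""
    return [row[0] * x_nu for row in PSI]


def F(x_mu, x_nu):
    """Evaluate A x_mu + B x_nu for this 2 + 1 dimensional instance."""
    return [A[i][0] * x_mu[0] + A[i][1] * x_mu[1] + B[i][0] * x_nu
            for i in range(2)]


for x_nu in (-2.0, 0.5, 3.0):
    # Each residual should vanish up to floating-point rounding.
    print(x_nu, psi(x_nu), F(psi(x_nu), x_nu))
```

The same check works for any invertible `A`; the explicit 2 x 2 inverse is used only to keep the sketch dependency-free.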

In particular, $F(\psi(x_\nu), x_\nu) = A[-A^{-1}Bx_\nu] + Bx_\nu = 0$.

One version of the Implicit Function Theorem says the following. Suppose that $F : \mathbb{R}^L \to \mathbb{R}^M$ is $C^r$, that $F(x^*) = 0$, and that $D_\mu F(x^*)$ (the $M \times M$ matrix of partial derivatives with respect to the $x_\mu$ variables) is invertible. Then, near the point $x^*$, the zero set of $F$ can be written as the graph of a $C^r$ function $\psi$ that gives $x_\mu$ as a function of $x_\nu$. This $\psi$ is the function implicitly defined by $F(x) = 0$. Moreover, for any $x_\nu$ near $x_\nu^*$, setting $x = (\psi(x_\nu), x_\nu)$,

\[ D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x), \]

which is the analog of what we found in the linear case. This allows us to compute $D\psi$ even when we cannot solve for $\psi$ analytically.

¹ This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.

Much of the intuition for the Implicit Function Theorem is illustrated by the unit circle in $\mathbb{R}^2$. Explicitly, the unit circle can be expressed as the zero set of $F : \mathbb{R}^2 \to \mathbb{R}$,

\[ F(x) = x_1^2 + x_2^2 - 1. \]

I can write the circle as the union of the graphs of four differentiable functions on $(-1, 1)$, two giving $x_1$ as a function of $x_2$,

\[ x_1 = \sqrt{1 - x_2^2} \quad\text{and}\quad x_1 = -\sqrt{1 - x_2^2}, \]

and two giving $x_2$ as a function of $x_1$,

\[ x_2 = \sqrt{1 - x_1^2} \quad\text{and}\quad x_2 = -\sqrt{1 - x_1^2}. \]

There is substantial overlap across these graphs, but that is not a problem. Note the following subtleties.

1. I cannot express the entire unit circle, the zero set of $F$, as the graph of a single function.

2. Although I expressed the Implicit Function Theorem above as saying that the first $M$ variables could be written as a function of the last $N$ variables, there is nothing sacred about order. As long as $DF$ has full rank (namely $M$), I can express some $M$ of the variables as a function of the remaining $N$ variables. In the circle example, this shows up as sometimes writing $x_2$ as a function of $x_1$ and sometimes writing $x_1$ as a function of $x_2$.

Example 1. In economics, a standard example of an implicit function involves indifference curves (or indifference surfaces, in higher dimensions). If the utility function is $u : \mathbb{R}^L \to \mathbb{R}$ and if $u(x^*) = c^*$, then the indifference curve through $x^*$ is defined implicitly as the zero set of the function $F(x) = u(x) - c^*$. If $L = 2$, and if $D_2 u(x^*) \neq 0$, then the Implicit Function Theorem says that, near the point $x^*$, the indifference curve through $x^*$ can be given as the graph of a function $\psi$, with

\[ D\psi(x_1) = -\frac{D_1 u(x)}{D_2 u(x)}, \]

which is (the negative of) the marginal rate of substitution.

The next two examples illustrate pathologies.

Example 2. Define $F : \mathbb{R}^2 \to \mathbb{R}$ by

\[ F(x) = x_1 x_2. \]

Let $x^* = (0, 0)$. Then

\[ DF(x^*) = \begin{bmatrix} 0 & 0 \end{bmatrix}, \]

which violates the full rank condition of the Implicit Function Theorem. The zero set of $F$ resembles a "+" sign. There is no way to represent this zero set in a neighborhood of the origin as the graph of a function, differentiable or otherwise. □

Example 3. Define $F : \mathbb{R}^2 \to \mathbb{R}$ by

\[ F(x) = (x_1 - x_2)^2. \]

Then at $x^* = (0, 0)$, $DF(x^*) = \begin{bmatrix} 0 & 0 \end{bmatrix}$. Here, however, a $C^\infty$ implicit function $\psi$ exists, namely,

\[ \psi(x_2) = x_2, \]

whose graph is the line $x_1 = x_2$, which is exactly the zero set of $F$.

So the full rank condition on $DF(x^*)$ is not necessary either for the existence of an implicit function or for its differentiability. In contrast, for the Inverse Function Theorem, the full rank condition, while not necessary for existence of an inverse function, was necessary for the differentiability of the inverse function. □
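The rank condition in the circle example and in Examples 2 and 3 can be checked mechanically. The sketch below hard-codes the gradients of the three functions and reports, at a given point, which variable the theorem allows us to solve for when $M = 1$. The function names are illustrative assumptions, and the check says nothing about whether an implicit function exists when the condition fails (Example 3 shows that it may).

```python
# Gradients DF(x) = [D_1 F(x), D_2 F(x)] for the three functions in this section.
def grad_circle(x1, x2):
    """F(x) = x1^2 + x2^2 - 1."""
    return (2.0 * x1, 2.0 * x2)


def grad_product(x1, x2):
    """Example 2: F(x) = x1 * x2."""
    return (x2, x1)


def grad_square(x1, x2):
    """Example 3: F(x) = (x1 - x2)^2."""
    return (2.0 * (x1 - x2), -2.0 * (x1 - x2))


def solvable_for(grad):
    """Variables the theorem lets us solve for when M = 1.

    Variable i qualifies exactly when D_i F is non-zero at the point.
    """
    return [name for name, d in zip(("x1", "x2"), grad) if d != 0.0]


print(solvable_for(grad_circle(1.0, 0.0)))   # x1 only: D_2 F vanishes at (1, 0)
print(solvable_for(grad_circle(0.6, 0.8)))   # either variable
print(solvable_for(grad_product(0.0, 0.0)))  # neither: full rank fails
print(solvable_for(grad_square(0.0, 0.0)))   # neither, even though x1 = x2 works
```

The last line is the point of Example 3: an empty list means only that the theorem is silent, not that no implicit function exists.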

2 The Implicit Function Theorem

Consider a function $F : O \to \mathbb{R}^M$ where $O$ is an open subset of $\mathbb{R}^L$, $L = M + N$. Denote a point $x \in \mathbb{R}^L$ as $(x_\mu, x_\nu)$, where $x_\mu \in \mathbb{R}^M$ and $x_\nu \in \mathbb{R}^N$. At a point $x^*$, let $D_\mu F(x^*)$ denote the first $M$ columns of $DF(x^*)$ (the $x_\mu$ columns) and let $D_\nu F(x^*)$ denote the remaining $N$ columns (the $x_\nu$ columns).

Theorem 1 (Implicit Function Theorem). Let $O$ be a nonempty open subset of $\mathbb{R}^L$. Let $F : O \to \mathbb{R}^M$ be $C^r$, where $r$ is a positive integer. Consider any $x^* \in O$ such that $F(x^*) = 0$. If $DF(x^*)$ has full rank, namely $M$, then there is an open set $W$ in $\mathbb{R}^L$ such that the restriction of the zero set $F^{-1}(0)$ to $W$ is the graph of a $C^r$ function.

In particular, suppose, for concreteness and simplicity of notation, that the first $M$ columns of $DF(x^*)$ (the $x_\mu$ columns) are linearly independent. Then there are open sets $U \subseteq \mathbb{R}^N$ and $W \subseteq \mathbb{R}^L$, and a $C^r$ function $\psi : U \to \mathbb{R}^M$ such that $D_\mu F(x)$ has full rank for all $x \in W$, and

1. $x_\nu^* \in U$, $x^* \in W$,

2. $\psi(x_\nu^*) = x_\mu^*$,

3. For any $x \in W$, $x_\nu \in U$,

4. For any $x_\nu \in U$, $\psi(x_\nu)$ is the unique $x_\mu$ such that, setting $x = (x_\mu, x_\nu)$,

   (a) $x \in W$,

   (b) $F(x) = 0$,

5. For any $x_\nu \in U$, setting $x = (\psi(x_\nu), x_\nu)$,

\[ D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x). \tag{1} \]

Proof. See Section 3. □
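To illustrate how equation (1) gives $D\psi$ without a closed form for $\psi$, consider the hypothetical example $F(x) = x_1^5 + x_1 - x_2$ with $x^* = (0, 0)$, which is not taken from the notes. Here $D_1 F(x) = 5x_1^4 + 1$ is never zero, but $x_1$ cannot be written as an elementary function of $x_2$. The sketch below evaluates $\psi$ numerically by Newton's method and compares the derivative from equation (1) with a central finite difference.

```python
def F(x1, x2):
    """Hypothetical example: F(x) = x1^5 + x1 - x2."""
    return x1**5 + x1 - x2


def D1F(x1, x2):
    """Partial derivative in x1; at least 1 everywhere, so always invertible."""
    return 5.0 * x1**4 + 1.0


def D2F(x1, x2):
    """Partial derivative in x2."""
    return -1.0


def psi(x2, tol=1e-12, max_iter=100):
    """Solve F(x1, x2) = 0 for x1 by Newton's method, starting from x1 = 0."""
    x1 = 0.0
    for _ in range(max_iter):
        step = F(x1, x2) / D1F(x1, x2)
        x1 -= step
        if abs(step) < tol:
            break
    return x1


def dpsi_formula(x2):
    """Equation (1) with M = N = 1: Dpsi = -D2F / D1F at x = (psi(x2), x2)."""
    x1 = psi(x2)
    return -D2F(x1, x2) / D1F(x1, x2)


def dpsi_finite_difference(x2, h=1e-6):
    """Central difference approximation to the derivative of psi."""
    return (psi(x2 + h) - psi(x2 - h)) / (2.0 * h)


for x2 in (0.0, 0.5, 2.0):
    print(x2, psi(x2), dpsi_formula(x2), dpsi_finite_difference(x2))
```

At $x_2 = 2$ the root is $x_1 = 1$, so equation (1) gives $D\psi(2) = 1/6$, and the finite difference should agree to several digits.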

The Implicit Function Theorem thus states that if $F$ is continuously differentiable, if $F(x^*) = 0$, and if $DF(x^*)$ has full rank then the zero set of $F$ is, near $x^*$, an $N$-dimensional manifold in $\mathbb{R}^L$. Example 2 in Section 1 shows what can go wrong.

Note that the focus on zero sets of functions is really without loss of generality. Suppose that $f : \mathbb{R}^L \to \mathbb{R}^M$ and that $f(x^*) = y^*$. Then the level set of $f$ through $x^*$ is just the zero set of the function $F : \mathbb{R}^L \to \mathbb{R}^M$, $F(x) = f(x) - y^*$.
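The reduction from level sets to zero sets is the one used for indifference curves in Example 1. As a sketch, assume a Cobb-Douglas utility $u(x) = x_1^{\alpha} x_2^{1-\alpha}$ with $\alpha = 0.3$ and the point $x^* = (2, 3)$. Both the functional form and the numbers are illustrative assumptions. For this $u$ the indifference curve happens to have a closed form, so the slope $-D_1 u / D_2 u$ can be compared with a finite difference along the curve.

```python
ALPHA = 0.3  # hypothetical Cobb-Douglas share parameter


def u(x1, x2):
    """Assumed utility function u(x) = x1^ALPHA * x2^(1 - ALPHA)."""
    return x1**ALPHA * x2**(1.0 - ALPHA)


def D1u(x1, x2):
    """Marginal utility of good 1."""
    return ALPHA * x1**(ALPHA - 1.0) * x2**(1.0 - ALPHA)


def D2u(x1, x2):
    """Marginal utility of good 2."""
    return (1.0 - ALPHA) * x1**ALPHA * x2**(-ALPHA)


x_star = (2.0, 3.0)
c_star = u(*x_star)  # the level c* = u(x*); F(x) = u(x) - c* vanishes at x*


def psi(x1):
    """x2 on the indifference curve u = c_star (closed form for this u)."""
    return (c_star / x1**ALPHA) ** (1.0 / (1.0 - ALPHA))


def slope_formula(x1):
    """Dpsi(x1) = -D1u / D2u, the negative of the marginal rate of substitution."""
    x2 = psi(x1)
    return -D1u(x1, x2) / D2u(x1, x2)


def slope_finite_difference(x1, h=1e-6):
    """Central difference approximation to the slope of the curve."""
    return (psi(x1 + h) - psi(x1 - h)) / (2.0 * h)


print(psi(2.0), slope_formula(2.0), slope_finite_difference(2.0))
```

For Cobb-Douglas utility the formula simplifies to $-\alpha x_2 / ((1-\alpha) x_1)$, which at $x^*$ equals $-0.9/1.4$.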

The proof of the Implicit Function Theorem is an application of the Inverse Function Theorem; the Implicit Function Theorem can be viewed as a corollary.

The fact that $D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x)$, labeled equation (1) in the statement of the Implicit Function Theorem, is consistent with the Chain Rule. Explicitly, suppose that we are simply told that the differentiable function $\psi$ exists. Define $g : U \to \mathbb{R}^L$ by $g(x_\nu) = (\psi(x_\nu), x_\nu)$. Define $h : U \to \mathbb{R}^M$ by $h(x_\nu) = F(g(x_\nu))$. Then $h(x_\nu) = 0$ for all $x_\nu \in U$, hence

\[ Dh(x_\nu) = 0. \]

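On the other hand, by the Chain Rule, letting $x = (\psi(x_\nu), x_\nu)$,

\begin{align*}
Dh(x_\nu) &= DF(x)\,Dg(x_\nu) \\
&= \begin{bmatrix} D_\mu F(x) & D_\nu F(x) \end{bmatrix} \begin{bmatrix} D\psi(x_\nu) \\ I \end{bmatrix} \\
&= D_\mu F(x)\,D\psi(x_\nu) + D_\nu F(x),
\end{align*}

where $I$ is the $N \times N$ identity matrix. Putting all this together,

\[ 0 = D_\mu F(x)\,D\psi(x_\nu) + D_\nu F(x). \]

Since $D_\mu F(x)$ is invertible, rearranging yields equation (1).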
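As a worked instance of this Chain Rule argument, take the unit circle from Section 1, $F(x) = x_1^2 + x_2^2 - 1$, with $x_\mu = x_1$, $x_\nu = x_2$, and the upper branch $\psi(x_2) = \sqrt{1 - x_2^2}$ on $(-1, 1)$. The choice of branch is an illustrative assumption; the other branch works the same way with the sign reversed.

```latex
\begin{align*}
h(x_2) &= F(\psi(x_2), x_2) = \psi(x_2)^2 + x_2^2 - 1 = 0, \\
Dh(x_2) &= 2\psi(x_2)\,D\psi(x_2) + 2x_2 = 0, \\
D\psi(x_2) &= -\frac{2x_2}{2\psi(x_2)} = -\frac{x_2}{\sqrt{1 - x_2^2}}.
\end{align*}
```

The last line agrees both with the general formula $-[D_1 F(x)]^{-1} D_2 F(x) = -x_2/x_1$ and with differentiating $\sqrt{1 - x_2^2}$ directly.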

3 Proof of the Implicit Function Theorem.

Define $G : O \to \mathbb{R}^L$ by $G(x) = (F(x), x_\nu)$. Then $G$ is $C^r$ and

\[ DG(x) = \begin{bmatrix} D_\mu F(x) & D_\nu F(x) \\ 0 & I \end{bmatrix}, \]

where $0$ is the $N \times M$ matrix of zeroes and $I$ is the $N \times N$ identity. Note that $G(x^*) = (0, x_\nu^*)$. Moreover, $DG(x^*)$ is invertible. In fact, direct calculation confirms that for any $x$ such that $D_\mu F(x)$ is invertible,

\[ [DG(x)]^{-1} = \begin{bmatrix} [D_\mu F(x)]^{-1} & -[D_\mu F(x)]^{-1} D_\nu F(x) \\ 0 & I \end{bmatrix}. \tag{2} \]

Therefore, by the Inverse Function Theorem, there is an open set $\tilde{O} \subseteq O$, with $x^* \in \tilde{O}$, and an open set $V \subseteq \mathbb{R}^L$, with $(0, x_\nu^*) \in V$, such that $DG(x)$ has full rank for every $x$ in $\tilde{O}$, $G$ maps $\tilde{O}$ 1-1 onto $V$, and the inverse $G^{-1} : V \to \tilde{O}$ is $C^r$. Since $D_\mu F(x^*)$ has full rank, $F$ is continuously differentiable, and the determinant function is continuous, one can take $\tilde{O}$ such that $D_\mu F(x)$ has full rank for any $x \in \tilde{O}$.

Since $V$ is open, there exists an open set $U \subseteq \mathbb{R}^N$ such that $x_\nu^* \in U$ and, for every $x_\nu \in U$, $(0, x_\nu) \in V$.² Let $W = (\mathbb{R}^M \times U) \cap \tilde{O}$. This is open, since it is the intersection of two open sets.

For any $x_\nu \in U$, if $G^{-1}(0, x_\nu) = (x_\mu, x_\nu)$ then $x_\mu$ is the unique point such that, setting $x = (x_\mu, x_\nu)$, $x \in \tilde{O}$ and $F(x) = 0$. Moreover, by construction, $x \in W$.

Therefore, define $\psi : U \to \mathbb{R}^M$ by setting $\psi(x_\nu)$ equal to the first $M$ coordinates of $G^{-1}(0, x_\nu)$. Since $G^{-1}$ is $C^r$ on $V$, $\psi$ is $C^r$ on $U$. Since $DG^{-1} = [DG]^{-1}$, $D\psi$ is given by the upper-right sub-matrix in equation (2), which implies equation (1). □
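The direct calculation behind equation (2) can be spelled out as a block-matrix product. In the sketch below the argument $x$ is suppressed, and $I_M$, $I_N$ denote the $M \times M$ and $N \times N$ identity matrices; these subscripts are notation added here for clarity.

```latex
\begin{align*}
\begin{bmatrix} D_\mu F & D_\nu F \\ 0 & I_N \end{bmatrix}
\begin{bmatrix} [D_\mu F]^{-1} & -[D_\mu F]^{-1} D_\nu F \\ 0 & I_N \end{bmatrix}
&=
\begin{bmatrix}
D_\mu F\,[D_\mu F]^{-1} & -D_\mu F\,[D_\mu F]^{-1} D_\nu F + D_\nu F \\
0 & I_N
\end{bmatrix} \\
&=
\begin{bmatrix} I_M & 0 \\ 0 & I_N \end{bmatrix}.
\end{align*}
```

Because $DG(x)$ is square, a right inverse is automatically a two-sided inverse, so this single product suffices.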

² As discussed in the notes on $\mathbb{R}^N$, any open ball is contained in an open cube and vice versa. In the present case, since $V$ is open and contains $(0, x_\nu^*)$, there is an $\varepsilon > 0$ such that $N_\varepsilon(0, x_\nu^*) \subseteq V$. Choose any $r > 0$ such that $r\sqrt{M + N} \leq \varepsilon$. Then the open $(M + N)$-dimensional cube with sides of length $2r$ and centered at $(0, x_\nu^*)$ is contained in $N_\varepsilon(0, x_\nu^*)$, which is contained in $V$. Take $U$ to be the open $N$-dimensional cube with sides of length $2r$ and centered at $x_\nu^*$.
