
0.1 Tangent Spaces and Lagrange Multipliers

If $\vec{G} = (G_1, \dots, G_k)\colon E^{n+k} \to E^k$ is a differentiable function, then the set $S$ defined by $S = \{\vec{x} \mid \vec{G}(\vec{x}) = \vec{v}\}$ is called the level surface for $\vec{G}(\vec{x}) = \vec{v}$. Note that each of the functions $G_i\colon E^{n+k} \to \mathbb{R}$. If we denote by $S_i$ the level surface for the equation $G_i(\vec{x}) = v_i$, then $S = \bigcap_{i=1}^{k} S_i$.

Suppose that $\vec{x}^0 = (x_1^0, \dots, x_{n+k}^0) \in S$ and that $\vec{G}'(\vec{x}^0) \in L(E^{n+k}, E^k)$ has rank $k$. Let $\delta_{i,j} = 1$ if $i = j$ and $0$ if $i \neq j$. With respect to the standard basis $\{\vec{e}_j = (\delta_{1,j}, \dots, \delta_{n+k,j}) \mid j = 1, 2, \dots, n+k\}$ for $E^{n+k}$ and the analogous smaller basis for $E^k$, we note that the matrix $[\vec{G}'(\vec{x}^0)]_{k \times (n+k)}$ has its whole set of $k$ row vectors linearly independent, and these row vectors are the vectors $\nabla G_1(\vec{x}^0), \nabla G_2(\vec{x}^0), \dots, \nabla G_k(\vec{x}^0)$. Let $\vec{\phi}\colon \mathbb{R} \to S$ be a differentiable function for which $\vec{\phi}(0) = \vec{x}^0$. Then we call the vector $\vec{v} = \vec{\phi}'(0)$ a tangent vector to $S$ at $\vec{x}^0$.

Definition 0.1.1. The tangent space $T_{\vec{x}^0}(S)$ at $\vec{x}^0 \in S$ is the set of all tangent vectors to $S$ at $\vec{x}^0$. The translate
$$\vec{x}^0 + T_{\vec{x}^0}(S) = \{\vec{x}^0 + \vec{v} \mid \vec{v} \in T_{\vec{x}^0}(S)\}$$
is called the tangent plane to the surface $S$, with point of tangency at $\vec{x}^0$.

A translate $\vec{a} + V = \{\vec{a} + \vec{x} \mid \vec{x} \in V\}$ of a vector subspace $V$ of $E^n$ is called an affine subspace of $E^n$. An affine subspace is a vector subspace if and only if $\vec{a} \in V$. (See Exercise 1.)

Theorem 0.1.1. Let $\vec{G}\colon E^{n+k} \to E^k$ be a differentiable function. Let
$$\vec{x}^0 \in S = \{\vec{x} \mid \vec{G}(\vec{x}) = \vec{v}\}.$$
Suppose $\vec{G}'(\vec{x}^0)$ has rank $k$. Then the tangent space $T_{\vec{x}^0}(S)$ is the vector subspace
$$T_{\vec{x}^0}(S) = \left(\mathrm{span}_{\mathbb{R}}\{\nabla G_1(\vec{x}^0), \dots, \nabla G_k(\vec{x}^0)\}\right)^{\perp}$$
of $E^{n+k}$. In words, $T_{\vec{x}^0}(S)$ is the orthogonal complement of the span of the $k$ gradient vectors $\nabla G_1(\vec{x}^0), \dots, \nabla G_k(\vec{x}^0)$.

Proof. Suppose first that $\vec{v} = \vec{\phi}'(0) \in T_{\vec{x}^0}(S)$. This implies that $\vec{\phi}$ maps $\mathbb{R}$ into each level surface $G_i(\vec{x}) = v_i$. We will show that $\vec{v} \perp \nabla G_i(\vec{x}^0)$ for each $i = 1, \dots, k$. In fact, $G_i(\vec{\phi}(t)) \equiv v_i$, a real constant. We differentiate using the Chain Rule to find that $G_i'(\vec{\phi}(0))\vec{\phi}'(0) = 0$. In terms of the matrix representation of the left side of the latter equation, we have $\nabla G_i(\vec{x}^0) \cdot \vec{\phi}'(0) = 0$, so that $\vec{v} \perp \nabla G_i(\vec{x}^0)$. This shows that $T_{\vec{x}^0}(S) \subseteq \nabla G_i(\vec{x}^0)^{\perp}$ for each $i$. This implies that
$$T_{\vec{x}^0}(S) \subseteq \left(\mathrm{span}_{\mathbb{R}}\{\nabla G_1(\vec{x}^0), \dots, \nabla G_k(\vec{x}^0)\}\right)^{\perp}.$$
The hypothesis that $\mathrm{rank}(\vec{G}'(\vec{x}^0)) = k$ implies that $\dim T_{\vec{x}^0}(S) \leq n$. If we can show that the tangent space is at least $n$-dimensional, then it will have to be the entire orthogonal complement of the span of the gradient vectors, as claimed. Thus it will suffice to produce a linearly independent set of $n$ vectors in the tangent space.

Because the rank of a matrix is also the number of linearly independent column vectors, it follows that the matrix $[\vec{G}'(\vec{x}^0)]$ has $k$ linearly independent columns. We can rearrange the order of the $n+k$ elements of the standard basis of $E^{n+k}$ so that the first $k$ columns are linearly independent. By the Implicit Function Theorem, there exists an open set $U \subset E^k$ containing $(x_1^0, \dots, x_k^0)$ and an open set $V \subset E^n$ containing $(x_{k+1}^0, \dots, x_{k+n}^0)$ such that there are unique differentiable functions

$$x_1 = \psi_1(x_{k+1}, \dots, x_{k+n})$$
$$\vdots$$
$$x_k = \psi_k(x_{k+1}, \dots, x_{k+n})$$
solving the equation
$$\vec{G}(\psi_1(x_{k+1}, \dots, x_{k+n}), \dots, \psi_k(x_{k+1}, \dots, x_{k+n}), x_{k+1}, \dots, x_{k+n}) = \vec{v}.$$
Next we define $n$ differentiable curves on $S$ by the equations

$$\vec{\phi}_1(t) = (\psi_1(x_{k+1}^0 + t, x_{k+2}^0, \dots, x_{k+n}^0), \dots, \psi_k(x_{k+1}^0 + t, x_{k+2}^0, \dots, x_{k+n}^0), x_{k+1}^0 + t, x_{k+2}^0, \dots, x_{k+n}^0)$$
$$\vdots$$
$$\vec{\phi}_n(t) = (\psi_1(x_{k+1}^0, \dots, x_{k+n-1}^0, x_{k+n}^0 + t), \dots, \psi_k(x_{k+1}^0, \dots, x_{k+n-1}^0, x_{k+n}^0 + t), x_{k+1}^0, \dots, x_{k+n-1}^0, x_{k+n}^0 + t)$$
In comparing the vectors $\vec{\phi}_i'(0)$ for $i = 1, \dots, n$, observe that for each of these vectors the final $n$ entries are all $0$ except for a single entry, which is $1$. The location of the $1$ is different for each of these vectors. Thus the $n$ vectors are linearly independent, and the theorem is proved.
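The simplest instance of the theorem ($n = k = 1$) can be checked numerically: for the unit circle $G(x, y) = x^2 + y^2 = 1$, a tangent vector produced by a parametrized curve should be orthogonal to $\nabla G$. The curve $\vec{\phi}$ in the sketch below is our own choice of illustration, not part of the text's construction.

```python
import math

# Constraint G(x, y) = x^2 + y^2; the level surface G = 1 is the unit circle S.
def grad_G(x, y):
    return (2 * x, 2 * y)

# A differentiable curve phi: R -> S with phi(0) = (1, 0) (our choice).
def phi(t):
    return (math.cos(t), math.sin(t))

# Numerical tangent vector v = phi'(0) via a central difference.
h = 1e-6
v = tuple((a - b) / (2 * h) for a, b in zip(phi(h), phi(-h)))

# By the theorem, v must be orthogonal to grad G at phi(0).
g = grad_G(*phi(0))
dot = v[0] * g[0] + v[1] * g[1]
print(dot)  # approximately 0
```

Here $T_{\vec{x}^0}(S)$ at $(1, 0)$ is the $y$-axis, exactly the orthogonal complement of $\nabla G(1, 0) = (2, 0)$.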

Corollary 0.1.1. Let $\vec{G}\colon E^{k+n} \to E^k$ be a differentiable function and let
$$\vec{x}^0 \in S = \{\vec{x} \mid \vec{G}(\vec{x}) = \vec{v}\}.$$
Suppose $\vec{x}^0$ is a local extreme point of a differentiable function $f\colon S \to \mathbb{R}$ and that $\vec{G}'(\vec{x}^0)$ has rank $k$. Then there exist numbers $\lambda_1, \dots, \lambda_k$ such that
$$\nabla f(\vec{x}^0) = \lambda_1 \nabla G_1(\vec{x}^0) + \cdots + \lambda_k \nabla G_k(\vec{x}^0). \tag{1}$$

The numbers $\lambda_1, \dots, \lambda_k$ are called Lagrange multipliers.

Proof. If $\vec{\phi}\colon \mathbb{R} \to S$ is a differentiable curve on $S$ with $\vec{\phi}(0) = \vec{x}^0$, let $\psi(t) = f(\vec{\phi}(t))$. Since this function has an extreme point at $0$, we have
$$\psi'(0) = \nabla f(\vec{\phi}(0)) \cdot \vec{\phi}'(0) = 0.$$
It follows from Theorem 0.1.1 that $\nabla f(\vec{x}^0)$ is orthogonal to the tangent space $T_{\vec{x}^0}(S)$. Since the codimension of $T_{\vec{x}^0}(S)$ is $k$, it follows that $\nabla f(\vec{x}^0)$ lies in the span of the $k$ vectors $\nabla G_1(\vec{x}^0), \dots, \nabla G_k(\vec{x}^0)$. This proves the corollary.

The method of Lagrange multipliers permits an optimization problem to be replaced by the problem of solving a system of equations. From the $k+n$ components of the vectors in Equation (1), we obtain a system of $k+n$ equations in the $n+2k$ unknowns $x_1, \dots, x_{k+n}, \lambda_1, \dots, \lambda_k$. We get $k$ additional equations from the $k$ components of the equation $\vec{G}(\vec{x}) = \vec{v}$. Thus we obtain a system of $n+2k$ equations in $n+2k$ unknowns. Although we have replaced a calculus problem with an algebraic one, the algebraic problem can still be challenging. Nevertheless, the method of Lagrange multipliers is a powerful tool for optimization problems.
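As a small illustration of this counting (with $n = k = 1$), consider maximizing $f(x, y) = x + y$ on the circle $x^2 + y^2 = 1$: the recipe yields three equations in the three unknowns $x$, $y$, $\lambda$. The sketch below, our own example rather than one from the text, checks a hand-computed solution against that system.

```python
import math

# Lagrange system for f(x, y) = x + y subject to G(x, y) = x^2 + y^2 = 1:
#   1 = 2 * lam * x   (x-component of grad f = lam * grad G)
#   1 = 2 * lam * y   (y-component)
#   x^2 + y^2 = 1     (the constraint itself)
# Three equations in the three unknowns x, y, lam (n = 1, k = 1, so n + 2k = 3).
def residuals(x, y, lam):
    return (1 - 2 * lam * x, 1 - 2 * lam * y, x * x + y * y - 1)

# Hand-computed solution: x = y = 1/sqrt(2), lam = sqrt(2)/2.
x = y = 1 / math.sqrt(2)
lam = math.sqrt(2) / 2

assert max(abs(r) for r in residuals(x, y, lam)) < 1e-12
print(x + y)  # the maximum value of f on the circle: sqrt(2), about 1.4142
```

The other solution of the system, $x = y = -1/\sqrt{2}$ with $\lambda = -\sqrt{2}/2$, gives the minimum value $-\sqrt{2}$.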

Example 0.1.1. We will begin with a three-dimensional example. Consider the surface $S$ defined by the equation $x^4 + y^4 + z^4 = 1$ in $E^3$, shown in Figure 1.

[Figure 1: the surface $x^4 + y^4 + z^4 = 1$]

We will find both the maximum and the minimum values of the function $f(\vec{x}) = x^2 + y^2 + z^2$ on $S$. (In effect, we are determining the closest and farthest distances from the origin to $S$.) In this example, we denote $\vec{x} = (x, y, z)$. Observe that if we define $G(\vec{x}) = x^4 + y^4 + z^4$, then $S = G^{-1}(\{1\})$. Hence $S$ is closed because $G$ is continuous. $S$ is also bounded. (Why?) Hence the function $f$ must achieve both a maximum and a minimum value somewhere on $S$. Since $S$ is smooth at all points and since $\nabla G$ is non-vanishing on $S$, the extreme points must occur at those points for which $\nabla f(\vec{x}) = \lambda \nabla G(\vec{x})$. This yields the following system of equations.

$$x(1 - 2\lambda x^2) = 0$$
$$y(1 - 2\lambda y^2) = 0$$
$$z(1 - 2\lambda z^2) = 0$$
$$x^4 + y^4 + z^4 = 1$$

The reader should check the following by making the necessary calculations.

• If none of the three variables is zero, then $x^2 = y^2 = z^2 = \frac{1}{2\lambda}$, showing that $\lambda = \frac{\sqrt{3}}{2}$. This implies that $f(x, y, z) = \sqrt{3}$.

• If exactly one of the three variables is zero, then at a point satisfying the system of equations we must have $f(x, y, z) = \sqrt{2}$.

• If exactly two of the variables are zero, then at a point satisfying the system we must have $f(x, y, z) = 1$.

It follows that the maximum value of $f$ on $S$ is $\sqrt{3}$. But the reader should be able to explain why at least one of the variables must be non-zero. Thus the minimum value is $1$. There is also an easy way to explain, even from the outset, why $f(x, y, z) \geq 1$ everywhere on $S$.
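The three cases can be sanity-checked numerically. The points below are representative critical points for each case, our own choices among the several solutions that exist by symmetry.

```python
import math

f = lambda x, y, z: x * x + y * y + z * z          # objective
G = lambda x, y, z: x ** 4 + y ** 4 + z ** 4       # constraint; S is G = 1

# One representative critical point per case (others follow by symmetry):
p3 = (3 ** -0.25, 3 ** -0.25, 3 ** -0.25)  # no variable zero
p2 = (2 ** -0.25, 2 ** -0.25, 0.0)         # exactly one variable zero
p1 = (1.0, 0.0, 0.0)                       # exactly two variables zero

for p, expected in [(p3, math.sqrt(3)), (p2, math.sqrt(2)), (p1, 1.0)]:
    assert abs(G(*p) - 1.0) < 1e-12   # the point lies on S
    assert abs(f(*p) - expected) < 1e-12
print("candidate values sqrt(3), sqrt(2), 1 all verified")
```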

Exercises 0.1.

1. Prove that the tangent plane $\vec{x}^0 + T_{\vec{x}^0}(S)$ is a vector subspace of $E^n$ if and only if $\vec{x}^0 \in T_{\vec{x}^0}(S)$.

2. Describe both the tangent space and the tangent plane to the sphere $S^{n-1} = \{\vec{x} \in E^n \mid \|\vec{x}\| = 1\}$ at the point $\vec{x}^0 = \left(\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}, \dots, \frac{1}{\sqrt{n}}\right)$.

3. The sphere $S^3 \subset E^4$ is defined by
$$S^3 = \left\{\vec{x} \,\middle|\, \sum_{i=1}^{4} x_i^2 = 1\right\}.$$

Define $f\colon S^3 \to \mathbb{R}$ by $f(\vec{x}) = \sum_{i=1}^{4} a_i x_i$, where $a_i$ is a constant for each $i \in \{1, 2, 3, 4\}$. Show that the maximum and minimum values of $f$ on $S^3$ are $\pm\sqrt{\sum_{i=1}^{4} a_i^2}$.

4. The group $SL(2, \mathbb{R})$ of matrices was defined in Exercise ??.??. Let $f\colon SL(2, \mathbb{R}) \to \mathbb{R}$ be defined by $f(\vec{x}) = x_1^2 + x_2^2 + x_3^2 + x_4^2$, where we identify the matrix
$$X = \begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix} \in SL(2, \mathbb{R})$$
with the vector $\vec{x} = (x_1, x_2, x_3, x_4)$, constrained to the surface $S$ in $E^4$ defined by the equation
$$x_1 x_4 - x_2 x_3 = 1.$$

(a) Prove that $f$ achieves a minimum value on $S$ but that it has no maximum.
(b) Use the method of Lagrange multipliers to find the minimum value of $f$ on $S$.

5. Let $f(\vec{x}) = \sum_{i=1}^{4} x_i^2$ for all $\vec{x} \in E^4$. Let $S_1$ be the surface in $E^4$ determined by $x_1 x_4 - x_2 x_3 = 1$ and let $S_2$ be the surface defined by $\sum_{i=1}^{4} x_i = 2$. Let $S = S_1 \cap S_2$.
(a) Prove that $f(\vec{x})$ has a minimum value on $S$ but no maximum.
(b) Find the minimum value of $f(\vec{x})$ on $S$.