Constrained Local Extrema Without Lagrange Multipliers and the Higher Derivative Test

SALVADOR GIGENA

Dedicated to the future generations of Mathematicians. Mainly, and with very special affection, to Camila, Cristian, Pamela, Santino, Lautaro, Martina (Gabrielita) and Brenda.

Abstract. This article has two main objectives. First, we introduce a method for determining and analyzing constrained local extrema that offers an alternative to all previous works on the topic, by eliminating Lagrange multipliers and reducing constrained problems to unconstrained ones. It has two added advantages: it is straightforward, effective and computationally faster than the previous methods; and it is simple in its conception and in the theoretical background needed for its efficient application. In fact, to apply it successfully, undergraduate students only need a basic knowledge of three central results from classical mathematical analysis: the implicit function theorem, the chain rule and Taylor's formula, together with some very basic notions of topology. Second, we develop another method for determining the nature of known critical points by resorting, when needed, to higher order derivatives. The result presented here constitutes our proposed solution to a long-standing problem: that of extending to higher dimensions what has been known for many years for functions defined on open sets of the real line, regarding the classification of critical points when enough derivatives are known at those points.

1. Introduction.

The method of Lagrange multipliers is one of the milestones of mathematical analysis (calculus), not only because of its intrinsic historical significance in both theoretical and practical matters, but also because it has been used almost exclusively when trying to determine and analyze constrained local extrema (see, for example, [1], [2], [3], [4], [5], [11], [12], [13], [14], [15] and further references therein). In [16], F. Zizza describes two alternative methods which, by using differential forms, eliminate the multipliers from the first derivative test, i.e., from that part of the calculation in which the critical points are determined. He does not indicate, however, a corresponding alternative method for analyzing the critical points with regard to their being local maxima, minima, or of saddle type. As for the analysis of critical points, one of the last articles in the literature seems to be that by D. Spring [14], which describes how to classify constrained extrema by relying strongly on Lagrange multipliers. More recently, E. Constantin [4] also uses the multipliers as part of the computation, within the context of the theory of tangent sets, establishing higher order smooth necessary conditions for optimality and thus solving, in particular, some examples for which the second derivative test fails.

The first main purpose of this article is to present a comprehensive method for determining and analyzing constrained local extrema which, at the same time, represents an alternative to the methods presented in the three articles just mentioned, as well as in all previous works published on the topic: no use of Lagrange multipliers and/or differential forms is made in its formulation and development. Moreover, it consists, basically, in reducing a constrained problem to an unconstrained one, thus fully eliminating the Lagrange multipliers from the calculations, including those with higher order derivatives, for all possible cases of dimension and codimension. Let us remark again that the elimination of the Lagrange multipliers had been achieved previously, but only for the first derivative test [16].

2010 Mathematics Subject Classification. Primary 26B99, 26B10, 26B12; Secondary 54C30.
Key words and phrases. Constrained local extrema, Lagrange multipliers, implicit function theorem, chain rule, Taylor's formula.
Partially supported by Secyt-UNC.

Moreover, it should be emphasized that the method presented here is not only effective and straightforward, but also simple in its conception and in the theoretical background it requires. In fact, in order to apply it correctly, undergraduate students only need a basic knowledge of three central results from calculus: the implicit function theorem, the chain rule and Taylor's formula. Some elementary notions of topology are also needed in order to fully justify the method.

Our interest in this topic is neither limited nor new. In fact, most textbooks explain how to find constrained critical points by the Lagrange multipliers method, but fall short of delivering a classification criterion similar to the one usually explained for the unconstrained case; see for example [1], [3], [5], [11], [15], amongst many others. It was precisely this absence of an accurate explanation that raised the author's curiosity when he first learned the subject as an undergraduate student. Years later, in consultation with several fellow mathematicians, we came to the conclusion that the topic was worth elucidating, and we started by dedicating to it the corresponding parts of a book [6], now out of print, and various articles and congress presentations afterwards; see also, for example, [7], [8], [9], [10]. Besides, the topic is of interest not only as a purely academic matter, mainly in the mathematical analysis of vector functions and the teaching of mathematics at the undergraduate level, but also because of its applications to other areas, such as economics and optimization. See for instance [2], [3], [4], [12], as well as several other references therein.

With regard to the computational time needed to perform the operations involved, let us recall that in his article [16] F. Zizza established an experimental comparison between the Lagrange multipliers method and the two methods that he proposes as alternatives. Analogously, our assertions on those matters are only experimental. It should be remarked as well that the equations involved are usually nonlinear, and the appearance of Lagrange multipliers only complicates the systems even more. Thus, comparing the complexity of the various algorithms involved is not as simple a matter as counting and comparing the number of equations in each of them. Such a complexity analysis is still pending and goes beyond the scope of this article.

We include, as part of our exposition, carefully chosen examples where a step by step comparison is established with the first two previously mentioned methods ([16], [14]). Thus, in section 2 we consider one particular example in the lowest possible case of dimension and codimension, i.e., for a real valued function defined on the plane, subject to only one constraint. After some theoretical remarks, we also present alternatives to two of the examples whose solutions were previously proposed in [4]. In section 3 we develop theoretically the first of the methods proposed for the general constrained problem, i.e., how to determine and analyze local constrained extrema for all possible finite dimensions and codimensions, by proving two of the main results of this work: the first derivative test and the second derivative test, both of them without resorting to the classical Lagrange multipliers. Two more examples, again establishing suitable comparisons with the other two methods ([16], [14]), and including a preview of the use of higher order derivatives in order to achieve the full classification in one of the examples, occupy section 4.

In section 5 we continue our exposition by remarking three important facts. First, we emphasize, and show, the reasons for the similarity of results between the method introduced here and F. Zizza's second alternative method when it comes to calculating the critical points (first derivative test). Second, since the recovery of Lagrange multipliers may be of importance in some applications (see for example [2], [3], [4], [12]), we show how to compute them as an additional output of the presently proposed method, but with much less effort and calculation time than those required when the classical method is used. Third, we point out that the present method does not stop with the second derivative test when it comes to analyzing critical points, as shown in two of the examples presented in this article, but can go on, in case of failure of that test, by considering higher order derivatives until the analysis and conclusions are fully achieved, somewhat anticipating what is to follow.

In section 6 we present the second main objective of this article: to extend to higher dimensions the use of higher order derivatives of a given function in order to classify the nature of known critical points: maxima, minima, saddle type. Recall that this is very well known in dimension one, i.e., for functions defined on an open set of the real numbers and, in fact, we use it to solve independently some of the cases below, namely Examples 2.1 and 2.4 and Exercise 2.5, where the Lagrange multipliers method failed to provide the answers.

It should also be remarked that this is a long-standing problem in mathematical analysis (calculus). The solution proposed, which requires only that enough derivatives of a function be known at a given critical point, is achieved through a finite sequence of constrained subsidiary problems in which only the determination of, and evaluation at, critical points is required, not for the function itself but for some of the homogeneous polynomials that appear in its Taylor expansion, restricted to a suitable algebraic submanifold of the corresponding unit sphere.

Most of the calculations in this article were made with the help of two independent computer algebra systems: Scientific WorkPlace, Version 2.5, abbreviated from now on as Swp2.5, and Mathematica 4.0. In each case we shall indicate which one we are using. We assume, without further discussion, all of the theoretical background that supports those implementations. However, a word of caution is in order: for some real algebraic equations the systems may return solutions that do not belong to the real field but to its algebraic closure, the complex field, with nontrivial imaginary components. Since we are only interested in real solutions, such complex solutions are discarded from our considerations wherever they appear.

2. Introductory examples and theoretical considerations.

Let us consider diverse possibilities for solving the following problem:

Example 2.1. Find and classify the local extrema of the function $f(x,y) = xy$ subject to the constraint $G(x,y) = -2x^3 + 15x^2y + 11y^3 - 24y = 0$.

First, by using F. Zizza's method [16] we have, from the expressions of the differential forms

\[ df = y\,dx + x\,dy, \qquad dG = (-6x^2 + 30xy)\,dx + (15x^2 + 33y^2 - 24)\,dy, \]

the condition that, at the critical points, their wedge product must be vanishing, i.e.,

\[ df \wedge dG = 3\,(2x^3 - 5x^2y + 11y^3 - 8y)\,dx \wedge dy = 0. \]

This, together with the restriction $G(x,y) = 0$, furnishes the system of equations

\[ \begin{cases} 2x^3 - 5x^2y + 11y^3 - 8y = 0 \\ -2x^3 + 15x^2y + 11y^3 - 24y = 0 \end{cases} \]

whose solution set, calculated with the help of Swp2.5, is

\[ \{X = (x,y)\} = \{(0,0),\ (1,1),\ (-1,-1)\} \cup S_4, \tag{2.1} \]

where $S_4$ is a set with cardinality equal to 4, i.e., having exactly four elements, described by

\[ S_4 = \left\{ (x,y) = \left(\rho,\ \tfrac{17}{22}\rho^3 - \tfrac{20}{11}\rho\right) : \rho \text{ is a root of } 17Z^4 - 58Z^2 + 32 \right\}. \]

The actual values for $\rho$ are:

\[ \tfrac{1}{17}\sqrt{493 + 51\sqrt{33}}, \quad -\tfrac{1}{17}\sqrt{493 + 51\sqrt{33}}, \quad \tfrac{1}{17}\sqrt{493 - 51\sqrt{33}}, \quad -\tfrac{1}{17}\sqrt{493 - 51\sqrt{33}}. \]

Observe that, in this particular example, it would be very difficult to establish the nature of the critical points obtained. Are there among them maxima, minima, saddle type?
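As a sanity check on the solution set (2.1), the following Python sketch substitutes the listed points back into the two equations of the system. It is not part of the original computation (which used Swp2.5); the helper names `G`, `N` and `s4_points` are ours, and the four points of $S_4$ are only checked in floating point.

```python
from fractions import Fraction
import math

def G(x, y):
    # Constraint of Example 2.1.
    return -2*x**3 + 15*x**2*y + 11*y**3 - 24*y

def N(x, y):
    # Coefficient of dx^dy in df^dG, divided by 3.
    return 2*x**3 - 5*x**2*y + 11*y**3 - 8*y

# The three rational critical points, checked exactly.
rational_points = [(0, 0), (1, 1), (-1, -1)]
exact_ok = all(
    G(Fraction(x), Fraction(y)) == 0 and N(Fraction(x), Fraction(y)) == 0
    for x, y in rational_points
)

def s4_points():
    # rho runs over the four real roots of 17 Z^4 - 58 Z^2 + 32,
    # i.e. rho^2 = (29 +/- 3*sqrt(33)) / 17.
    points = []
    for inner in (1, -1):
        rho_squared = (29 + inner * 3 * math.sqrt(33)) / 17
        for outer in (1, -1):
            rho = outer * math.sqrt(rho_squared)
            points.append((rho, 17/22 * rho**3 - 20/11 * rho))
    return points

# Largest residual of either equation over the four points of S_4.
max_residual = max(max(abs(G(x, y)), abs(N(x, y))) for x, y in s4_points())
print(exact_ok, max_residual)
```

A residual at the level of rounding error is what one expects if the description of $S_4$ is correct; the check says nothing yet about the nature of the points.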

Second, we use the multipliers method and construct the Lagrangian function [14]:

\[ L(\lambda, x, y) = f(x,y) + \lambda\,G(x,y) = xy + \lambda\,(-2x^3 + 15x^2y + 11y^3 - 24y). \]

Then, in order to find the critical points, we have to solve the system of equations represented by $\nabla L := \left(\frac{\partial L}{\partial \lambda}, \frac{\partial L}{\partial x}, \frac{\partial L}{\partial y}\right) = 0$, i.e.,

\[ \begin{cases} y - 6\lambda x^2 + 30\lambda xy = 0 \\ x + 15\lambda x^2 + 33\lambda y^2 - 24\lambda = 0 \\ -2x^3 + 15x^2y + 11y^3 - 24y = 0. \end{cases} \]

By using Swp2.5, we calculate the solution set and express it as:

\[ \{(\lambda, X)\} = \left\{ \big(0,(0,0)\big),\ \big(-\tfrac{1}{24},(1,1)\big),\ \big(\tfrac{1}{24},(-1,-1)\big) \right\} \cup S_4^{*}, \]

where $S_4^{*}$ is again a set with cardinality equal to 4 that can be described by

\[ S_4^{*} = \left\{ \big(\lambda,(x,y)\big) = \left(-\tfrac{1}{24}\rho,\ \left(\rho,\ \tfrac{17}{22}\rho^3 - \tfrac{20}{11}\rho\right)\right) : \rho \text{ is a root of } 17Z^4 - 58Z^2 + 32 \right\}. \]

This produces the same solutions as the previous method, with the additional information represented by the Lagrange multipliers at each critical point.

Next, we express the (bordered) Hessian matrix of $L$ by (see [14])

\[
HL(\lambda, X) =
\begin{pmatrix}
0 & \dfrac{\partial G}{\partial x}(X) & \dfrac{\partial G}{\partial y}(X) \\[2mm]
\dfrac{\partial G}{\partial x}(X) & \dfrac{\partial^2 L}{\partial x^2}(\lambda, X) & \dfrac{\partial^2 L}{\partial x \partial y}(\lambda, X) \\[2mm]
\dfrac{\partial G}{\partial y}(X) & \dfrac{\partial^2 L}{\partial y \partial x}(\lambda, X) & \dfrac{\partial^2 L}{\partial y^2}(\lambda, X)
\end{pmatrix}
=
\begin{pmatrix}
0 & -6x^2 + 30xy & 15x^2 + 33y^2 - 24 \\
-6x^2 + 30xy & \lambda(-12x + 30y) & 1 + 30\lambda x \\
15x^2 + 33y^2 - 24 & 1 + 30\lambda x & 66\lambda y
\end{pmatrix}.
\]

To conclude the analysis, one evaluates the determinant, i.e., $\Gamma_3 = \det HL(\lambda, X)$, at the points of the solution set above and finds that the function defined by $f(x,y) = xy$, subject to the constraint $G(x,y) = -2x^3 + 15x^2y + 11y^3 - 24y = 0$, attains local minimum values at the points of the set $S_4^{*}$, and local maximum values at the points $\big(-\tfrac{1}{24},(1,1)\big)$ and $\big(\tfrac{1}{24},(-1,-1)\big)$. However, at the critical point $\big(0,(0,0)\big)$ we have $\Gamma_3 = \det HL(0,(0,0)) = 0$. So that, the Lagrange multipliers method does not allow any conclusion with regards to the latter point, i.e., this is an indeterminate case.
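The determinant test above can be reproduced exactly with rational arithmetic. The sketch below is ours (the function names `bordered_hessian` and `det3` are illustrative, not taken from [14]). It evaluates $\Gamma_3$ at the three rational critical points, with the usual sign convention for two variables and one constraint: a positive determinant indicates a local maximum, a negative one a local minimum, and zero gives no conclusion.

```python
from fractions import Fraction

def bordered_hessian(lam, x, y):
    # Bordered Hessian of L = xy + lam * G for Example 2.1.
    gx = -6*x**2 + 30*x*y
    gy = 15*x**2 + 33*y**2 - 24
    return [
        [0,  gx,                 gy],
        [gx, lam*(-12*x + 30*y), 1 + 30*lam*x],
        [gy, 1 + 30*lam*x,       66*lam*y],
    ]

def det3(m):
    # Cofactor expansion along the first row of a 3x3 matrix.
    return (m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0]))

# (lambda, x, y) for the three rational critical points.
points = [
    (Fraction(0), 0, 0),
    (Fraction(-1, 24), 1, 1),
    (Fraction(1, 24), -1, -1),
]
gammas = [det3(bordered_hessian(lam, x, y)) for lam, x, y in points]
print(gammas)
```

The first entry of `gammas` corresponds to the origin, where the test is inconclusive; the other two are positive, in agreement with the local maxima reported in the text.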

Here is our Proposal for an Alternative Method of Solution. (Previously exposed in [6], [7], [8], [9], and [10]).

If we compute the Jacobian matrix of $G$,

\[ G'(x,y) := \left(\frac{\partial G}{\partial x}, \frac{\partial G}{\partial y}\right) = \left(-6x^2 + 30xy,\ 15x^2 + 33y^2 - 24\right), \]

and apply the implicit function theorem, for example at points where

\[ \frac{\partial G}{\partial y} = 15x^2 + 33y^2 - 24 \neq 0, \]

we can consider the function $J(x) := f(x,y) = f(x,h(x)) = x \cdot h(x)$, where $y = h(x)$ is defined implicitly by

\[ G(x,y) = -2x^3 + 15x^2y + 11y^3 - 24y = -2x^3 + 15x^2 h(x) + 11\,(h(x))^3 - 24\,h(x) \equiv 0. \]

It is easy to see that the restriction $f|_S$, where $S := \{(x,y) : G(x,y) = 0,\ \frac{\partial G}{\partial y} \neq 0\}$, has a local, constrained extremity (maximum, minimum, saddle point) at $(x_0, y_0) \in S$ if, and only if, $J$ has the same kind of (unconstrained) extremity at $x_0 \in \mathbb{R}$. Thus, we proceed to find, first, the critical points of the latter function by the usual method for the one variable case, i.e., we determine the set of points $\{x : J'(x) = \frac{dJ}{dx} = 0\}$. Now, in order to compute the derivative of $J$ we have, by its very definition and the chain rule, that

\[ J'(x) = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,h'(x). \]

Thus, in order to obtain the derivative of $J$ we have to compute first the derivative of $h$, and we proceed to do this implicitly: $K(x) := G(x,y) = G(x,h(x)) \equiv 0$ implies that

\[ \frac{dK}{dx} = \frac{\partial G}{\partial x} + \frac{\partial G}{\partial y}\,\frac{dh}{dx} \equiv 0. \]

It follows from the latter that

\[ h'(x) = \frac{dh}{dx} = -\frac{\partial G/\partial x}{\partial G/\partial y} = -\frac{-6x^2 + 30xy}{15x^2 + 33y^2 - 24}, \tag{2.2} \]

and we obtain for the first derivative of $J$

\[ J'(x) = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,h'(x) = y - x\,\frac{-6x^2 + 30xy}{15x^2 + 33y^2 - 24} = \frac{-5x^2y + 11y^3 - 8y + 2x^3}{5x^2 + 11y^2 - 8}. \tag{2.3} \]

Hence, in order to find the critical points we must solve the system of equations represented by both conditions $J'(x) = 0$ and $G(x,y) = 0$ which, except for the denominator in the first equation, is the same system as in the case of Zizza's method. Swp2.5 provided us with exactly the same set of solutions, recorded above as equation (2.1). In order to classify the critical points we compute, next, the second derivative of the function $J$, which is again related to the derivatives of $h$ up to the second order. For example, since all of the derivatives of the (constant) function $K$ must vanish, we obtain

\[ \frac{d^2K}{dx^2} = \frac{\partial^2 G}{\partial x^2} + 2\,\frac{\partial^2 G}{\partial x \partial y}\,\frac{dh}{dx} + \frac{\partial^2 G}{\partial y^2}\left(\frac{dh}{dx}\right)^2 + \frac{\partial G}{\partial y}\,\frac{d^2h}{dx^2} \equiv 0. \]

Then, it follows that

\[
\begin{aligned}
h''(x) = \frac{d^2h}{dx^2} &= -\frac{1}{\partial G/\partial y}\left[\frac{\partial^2 G}{\partial x^2} + 2\,\frac{\partial^2 G}{\partial x \partial y}\,h'(x) + \frac{\partial^2 G}{\partial y^2}\,\big(h'(x)\big)^2\right] \\
&= -2\,\frac{50x^5 - 440x^3y^2 - 242xy^4 + 352xy^2 - 128x - 331x^4y}{(5x^2 + 11y^2 - 8)^3} - 2\,\frac{550x^2y^3 + 400yx^2 + 605y^5 - 880y^3 + 320y}{(5x^2 + 11y^2 - 8)^3}
\end{aligned}
\]

and, consequently,

\[
\begin{aligned}
J''(x) &= \frac{\partial^2 f}{\partial x^2} + 2\,\frac{\partial^2 f}{\partial x \partial y}\,h'(x) + \frac{\partial^2 f}{\partial y^2}\,\big(h'(x)\big)^2 + \frac{\partial f}{\partial y}\,h''(x) \\
&= \frac{-2x}{(5x^2 + 11y^2 - 8)^3}\left(-660x^3y^2 + 160x^3 - 484xy^4 + 704xy^2 - 256x - 81yx^4\right) \\
&\quad + \frac{-2x}{(5x^2 + 11y^2 - 8)^3}\left(1650x^2y^3 - 400yx^2 + 1815y^5 - 2640y^3 + 960y\right).
\end{aligned}
\]

Alternatively, one could obtain $J''$ by calculating directly the derivative of $J'$ from equation (2.3), and then replacing in it the value of $\frac{dy}{dx} = h'(x)$ given by equation (2.2).

Then, straightforward computations show that $J''$ is positive at all the points of the set $S_4$. Therefore, they are local minima for the function $J$ and hence for $f$. Similarly, one verifies that, at both the points $(1,1)$ and $(-1,-1)$, it holds $J'' < 0$: they are local maxima for $f$. Thus, these results coincide with the ones obtained previously by using the Lagrange multipliers method. On the other hand, at the point $(0,0)$ it holds the condition $J''(0) = 0$, so that with this method the second derivative test fails too. However, in the present situation we can consider higher-order derivatives. In fact, by following the same kind of procedure as before, when calculating the second derivative $J''$, we can go on to find the expressions of higher order derivatives. It is not difficult to compute, then, that at the origin the third derivative also vanishes, $J'''(0) = 0$, but the fourth derivative is different from zero: $J^{IV}(0) = -2$. Thus, Taylor's theorem expansion of $J$ in a neighborhood of this critical point can be written $J(x) = f(x,h(x)) = -\tfrac{1}{12}x^4 + \cdots$, where the omitted terms involve powers in $x$ greater than or equal to five. Consequently, we conclude that at the point $(0,0)$ the function $f(x,y) = xy$, subject to the constraint $G(x,y) = -2x^3 + 15x^2y + 11y^3 - 24y = 0$, reaches a local maximum.
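The Taylor coefficients quoted above can be cross-checked without differentiating $J$ four times. The sketch below is our own illustration, not the procedure used in the text: it rewrites the constraint as the fixed-point equation $y = (-2x^3 + 15x^2y + 11y^3)/24$, iterates it on truncated power series with exact rational coefficients to obtain the expansion of $h$ at $x = 0$, and then forms $J(x) = x\,h(x)$. The truncation degree `DEG` and the helper names are assumptions of the sketch.

```python
from fractions import Fraction

DEG = 6  # all series are truncated beyond x**DEG

def mul(a, b):
    # Product of two truncated power series given as coefficient lists.
    out = [Fraction(0)] * (DEG + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= DEG:
                out[i + j] += ai * bj
    return out

def series_h(iterations=6):
    # Iterate h <- (-2x^3 + 15 x^2 h + 11 h^3) / 24, starting from h = 0.
    x2 = [Fraction(0)] * (DEG + 1)
    x2[2] = Fraction(1)
    x3 = [Fraction(0)] * (DEG + 1)
    x3[3] = Fraction(1)
    h = [Fraction(0)] * (DEG + 1)
    for _ in range(iterations):
        h_cubed = mul(h, mul(h, h))
        x2_h = mul(x2, h)
        h = [(-2*x3[k] + 15*x2_h[k] + 11*h_cubed[k]) / 24
             for k in range(DEG + 1)]
    return h

h = series_h()
J = [Fraction(0)] + h[:DEG]      # J(x) = x * h(x): shift coefficients by one
fourth_derivative = 24 * J[4]    # J''''(0) = 4! * (coefficient of x^4)
print(J[:5], fourth_derivative)
```

Each iteration fixes at least two more orders of the series, so six iterations are more than enough for degree 6. The first nonzero coefficient of `J` has even order and negative sign, which is the local maximum found in the text.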

Remark 2.2. At the beginning of our argument we assumed the second component of the Jacobian of $G$, $G'(x,y) := \left(\frac{\partial G}{\partial x}, \frac{\partial G}{\partial y}\right) = \left(-6x^2 + 30xy,\ 15x^2 + 33y^2 - 24\right)$, to be non-vanishing, i.e., $\frac{\partial G}{\partial y} = 15x^2 + 33y^2 - 24 \neq 0$. Had we assumed, instead and also in order to exhaust all of possibilities, that the other component is the one different from zero, i.e., $\frac{\partial G}{\partial x} = -6x^2 + 30xy \neq 0$, then we could proceed in a similar fashion. Only that in this case, when applying the implicit function theorem, we would have that the condition for constraint, i.e., $G(x,y) = -2x^3 + 15x^2y + 11y^3 - 24y = 0$, now defines $x$ as a function of $y$, say $x = h(y)$. Following with the procedure we would also have that the function to optimize is now given by the expression $J(y) := f(x,y) = f(h(y), y) = h(y)\,y$, whose first derivative is

\[ J'(y) = \frac{\partial f}{\partial x}\,h'(y) + \frac{\partial f}{\partial y} = -\frac{\partial f}{\partial x}\,\frac{\partial G/\partial y}{\partial G/\partial x} + \frac{\partial f}{\partial y} = -\frac{1}{\partial G/\partial x}\left(\frac{\partial f}{\partial x}\frac{\partial G}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial G}{\partial x}\right). \]

After substituting the actual values for $f$ and $G$ we would find the condition

\[ J'(y) = -\frac{1}{2}\,\frac{-5x^2y + 11y^3 - 8y + 2x^3}{-x^2 + 5xy} = 0, \]

which is totally identical with the one obtained previously, except again for the denominator, and provides, together with $G(x,y) = 0$, the same solution set as before. Finally observe that, in order to ensure ourselves to have found all of the critical points, it is necessary to analyze both cases as above. In the general case, on the other hand, one will have to treat all possibilities where the Jacobian matrix of $G$ is non-vanishing and, in such circumstances, these may amount to some more cases to consider. Nevertheless, there will always be only a finite number to analyze and resolve.
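The statement that both choices of the implicit variable lead to the same numerator can be checked pointwise. In the sketch below (ours; the sample point $(2, 3)$ is arbitrary and is not a critical point) both expressions for the derivative are cleared of their denominators and compared with $\frac{\partial f}{\partial x}\frac{\partial G}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial G}{\partial x}$.

```python
from fractions import Fraction

def partials(x, y):
    # First partial derivatives of f = xy and of G from Example 2.1.
    fx, fy = y, x
    gx = -6*x**2 + 30*x*y
    gy = 15*x**2 + 33*y**2 - 24
    return fx, fy, gx, gy

def dJ_dx(x, y):
    # Case y = h(x): J'(x) = f_x - f_y * G_x / G_y   (needs G_y != 0).
    fx, fy, gx, gy = partials(x, y)
    return fx - fy * gx / gy

def dJ_dy(x, y):
    # Case x = h(y): J'(y) = f_y - f_x * G_y / G_x   (needs G_x != 0).
    fx, fy, gx, gy = partials(x, y)
    return fy - fx * gy / gx

x, y = Fraction(2), Fraction(3)
fx, fy, gx, gy = partials(x, y)
common = fx*gy - fy*gx                 # shared numerator
print(dJ_dx(x, y) * gy == common)      # numerator of J'(x)
print(dJ_dy(x, y) * gx == -common)     # numerator of J'(y), up to sign
```

Since the two derivatives differ only by a nonvanishing factor wherever both are defined, they vanish at the same points of the constraint set.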

Remark 2.3. We can illustrate the theoretical situation with a couple of diagrams. In Figure 1, we describe the problem we are dealing with: try to find and analyze the critical points of the real valued function in two variables $f(x,y)$, in the case where $f$ is restricted to the set

\[ S := \left\{(x,y) : G(x,y) = 0,\ G'(x,y) = \left(\frac{\partial G}{\partial x}, \frac{\partial G}{\partial y}\right) \neq 0\right\}. \]

Figure 1

Let us observe, incidentally, that the components of the Jacobian matrix $G'(x,y)$ are the same as those of the gradient, $\nabla G(x,y) = \left(\frac{\partial G}{\partial x}, \frac{\partial G}{\partial y}\right)$, used by D. Spring [14], and as those of the differential form $dG = \frac{\partial G}{\partial x}\,dx + \frac{\partial G}{\partial y}\,dy$, used by F. Zizza [16].

In Figure 2 below we describe the fact that, by assuming $\frac{\partial G}{\partial y} \neq 0$, the neighborhood of any given point $X_0 := (x_0, y_0) = (U_0, V_0) \in S$, with $\frac{\partial G}{\partial y}(X_0) \neq 0$, can be parametrized by the function $x \overset{H}{\mapsto} (x, h(x))$ as indicated, where $y = h(x)$ is defined implicitly by the further condition $G(x,y) = G(x,h(x)) = 0$. This gives rise to the construction, by composition, of the auxiliary real-valued functions

\[ J(x) := f \circ H(x) = f(x,h(x)) = f(x,y), \qquad K(x) := G \circ H(x) = G(x,h(x)) = G(x,y). \]

Figure 2

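Henceforth, as pointed out before, the problem of finding and classifying local, constrained critical points of $f|_S$ is reduced to the same problem with respect to the (unconstrained) function $J$. However, in the expression of the latter it appears the function $h$, which may be difficult, or even impossible to calculate. Nevertheless, since we only need to obtain the derivatives of $J$, we may proceed to do so by using the chain rule, which furnishes the expression

\[ J'(x) = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,h'(x). \]

Then, in order to compute the derivative $h'(x)$, we use the fact that the function $K$ vanishes identically, so that its derivatives are also vanishing. From this we have that

\[ \frac{dK}{dx} = \frac{\partial G}{\partial x} + \frac{\partial G}{\partial y}\,h'(x) \equiv 0 \]

and, since $\frac{\partial G}{\partial y} \neq 0$, we obtain first, for the derivative of $h$:

\[ h'(x) = -\frac{\partial G/\partial x}{\partial G/\partial y} \]

and, for that of $J$:

\[ J'(x) = \frac{\partial f}{\partial x} - \frac{\partial f}{\partial y}\,\frac{\partial G/\partial x}{\partial G/\partial y}. \]

Observe that this shows, in particular, that this derivative, $J'(x)$, can always be expressed in terms of the first derivatives of the given data: $f$ and $G$. Moreover, by the same token, if we assume enough differentiability of the data $f$, $G$, we can express the corresponding order of derivative of $J$ in terms of the derivatives of $f$ and $G$ up to the same order. For example for the second order derivative we have, since $K(x) \equiv 0$, that

\[ K''(x) = \frac{d^2K}{dx^2} = \frac{\partial^2 G}{\partial x^2} + 2\,\frac{\partial^2 G}{\partial x \partial y}\,\frac{dh}{dx} + \frac{\partial^2 G}{\partial y^2}\left(\frac{dh}{dx}\right)^2 + \frac{\partial G}{\partial y}\,\frac{d^2h}{dx^2} \equiv 0. \]

The latter equation implies that

\[ h''(x) = \frac{d^2h}{dx^2} = -\frac{1}{\partial G/\partial y}\left[\frac{\partial^2 G}{\partial x^2} + 2\,\frac{\partial^2 G}{\partial x \partial y}\left(-\frac{\partial G/\partial x}{\partial G/\partial y}\right) + \frac{\partial^2 G}{\partial y^2}\left(-\frac{\partial G/\partial x}{\partial G/\partial y}\right)^2\right]. \]

So that

\[ J''(x) = \frac{\partial^2 f}{\partial x^2} - 2\,\frac{\partial^2 f}{\partial x \partial y}\,\frac{\partial G/\partial x}{\partial G/\partial y} + \frac{\partial^2 f}{\partial y^2}\left(\frac{\partial G/\partial x}{\partial G/\partial y}\right)^2 - \frac{\partial f}{\partial y}\,\frac{1}{\partial G/\partial y}\left[\frac{\partial^2 G}{\partial x^2} - 2\,\frac{\partial^2 G}{\partial x \partial y}\,\frac{\partial G/\partial x}{\partial G/\partial y} + \frac{\partial^2 G}{\partial y^2}\left(\frac{\partial G/\partial x}{\partial G/\partial y}\right)^2\right]. \]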
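The closed formula for $J''$ can be evaluated directly at the critical points of Example 2.1. The following sketch is ours (the function name `second_derivative_J` is illustrative). It hard-codes the partial derivatives of $f = xy$ and $G$ from that example, and uses exact rational arithmetic at the rational points and floating point at one point of $S_4$.

```python
from fractions import Fraction
import math

def second_derivative_J(x, y):
    # Partial derivatives of f = xy.
    fxx, fxy, fyy, fy = 0, 1, 0, x
    # Partial derivatives of G = -2x^3 + 15x^2 y + 11y^3 - 24y.
    gx, gy = -6*x**2 + 30*x*y, 15*x**2 + 33*y**2 - 24
    gxx, gxy, gyy = -12*x + 30*y, 30*x, 66*y
    r = gx / gy                      # r = -h'(x); requires G_y != 0
    return (fxx - 2*fxy*r + fyy*r**2
            - fy / gy * (gxx - 2*gxy*r + gyy*r**2))

# Exact values at the three rational critical points.
exact = {p: second_derivative_J(Fraction(p[0]), Fraction(p[1]))
         for p in [(1, 1), (-1, -1), (0, 0)]}

# One point of S_4 in floating point: rho^2 = (29 + 3*sqrt(33)) / 17.
rho = math.sqrt((29 + 3*math.sqrt(33)) / 17)
on_s4 = second_derivative_J(rho, 17/22 * rho**3 - 20/11 * rho)
print(exact, on_s4)
```

A negative value indicates a local maximum of $J$, a positive one a local minimum, and a zero value means that the second derivative test is inconclusive at that point.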

Example 2.4. Let us consider the function $F(x_1, x_2) = x_2^6 + x_1^3 + 4x_1 + 4x_2$ subject to the constraint $G(x_1, x_2) = x_1^5 + x_2^4 + x_1 + x_2 = 0$. This problem was presented as Example 2 in the article by E. Constantin [4], and solved by using the theory of tangent sets. For our part, in the theoretical context developed above, we see that the Jacobian matrix is given by $G'(x_1, x_2) = \left(5x_1^4 + 1,\ 4x_2^3 + 1\right)$ and we may assume first $G_2 = 4x_2^3 + 1 \neq 0$. Then, from the identity $K(x_1) = G(x_1, h(x_1)) \equiv 0$ we obtain $K'(x_1) = G_1 + G_2\,h'(x_1) \equiv 0$. Thus, we have $h'(x_1) = -G_1/G_2 = -(5x_1^4 + 1)/(4x_2^3 + 1)$ and $J'(x_1) = F_1 + F_2\,h'(x_1) = 3x_1^2 + 4 + (6x_2^5 + 4)\left(-(5x_1^4 + 1)/(4x_2^3 + 1)\right)$. Next, we get that the system of equations represented by both conditions, $J'(x_1) = 0$ and $G(x_1, x_2) = 0$, has a solution set, obtained by using Swp2.5, with quite a few elements. In particular, the point $(0,0)$ belongs to this set. This is our primary interest here and it is easy to verify that $J'(0) = J''(0) = 0$ and $J'''(0) = 6$. It follows that $(0,0)$ is a saddle point for the problem (cf. [4], pp. 47-48).

Exercise 2.5. In a similar fashion, consider Example 3, in the same article [4], i.e., the function $F(x_1, x_2) = x_2^6 + x_1^3 + 2x_1^2 - x_2^2 + 4x_1 + 4x_2$, subject to the same constraint as in the latter example. Show then that $J'(0) = 0$ and $J''(0) = 2$, so that $(0,0)$ is a point of strict minimum (cf. [4], p. 48).
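The derivative values claimed in Example 2.4 and Exercise 2.5 can be checked by expanding everything in power series at the origin. This is our own sketch, not the computation of [4] or of the text: it assumes the constraint $G(x_1, x_2) = x_1^5 + x_2^4 + x_1 + x_2 = 0$ and the objective functions $x_2^6 + x_1^3 + 4x_1 + 4x_2$ and $x_2^6 + x_1^3 + 2x_1^2 - x_2^2 + 4x_1 + 4x_2$, as we read them from the two examples. Solving the constraint for $x_2 = h(x_1)$ near the origin gives the fixed-point equation $h = -(x_1 + x_1^5 + h^4)$, which is iterated below on truncated series with exact rational coefficients.

```python
from fractions import Fraction

DEG = 6  # keep terms up to x**DEG

def mono(k, c=1):
    # The monomial c * x**k as a truncated coefficient list.
    p = [Fraction(0)] * (DEG + 1)
    p[k] = Fraction(c)
    return p

def add(*ps):
    return [sum(p[k] for p in ps) for k in range(DEG + 1)]

def scale(c, p):
    return [c * a for a in p]

def mul(a, b):
    out = [Fraction(0)] * (DEG + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= DEG:
                out[i + j] += ai * bj
    return out

def power(p, n):
    out = mono(0)
    for _ in range(n):
        out = mul(out, p)
    return out

def implicit_h(iterations=8):
    # Fixed point of h = -(x + x**5 + h**4), starting from h = 0.
    h = scale(0, mono(0))
    for _ in range(iterations):
        h = scale(-1, add(mono(1), mono(5), power(h, 4)))
    return h

x = mono(1)
h = implicit_h()
# J = F(x, h(x)) for Example 2.4 and for Exercise 2.5.
J_24 = add(power(h, 6), power(x, 3), scale(4, x), scale(4, h))
J_25 = add(power(h, 6), power(x, 3), scale(2, power(x, 2)),
           scale(-1, power(h, 2)), scale(4, x), scale(4, h))

# The k-th derivative at 0 equals k! times the k-th coefficient.
print(J_24[1], 2 * J_24[2], 6 * J_24[3])
print(J_25[1], 2 * J_25[2])
```

In Example 2.4 the first nonvanishing derivative has odd order, which gives a saddle point; in Exercise 2.5 it is the second derivative and it is positive, which gives a strict minimum.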

3. The General Constrained Problem.

In this section we shall show that extensions of the method are valid for all possible cases of larger dimension and co-dimension. In fact, in the general case we have a function

\[ y = f(x_1, \ldots, x_n, x_{n+1}, \ldots, x_{n+m}), \]

whose local extrema are to be determined in the case where the function $f$ is subject to subsidiary constraints represented by a set of equations like:

\[ G_i(x_1, \ldots, x_n, x_{n+1}, \ldots, x_{n+m}) = 0, \quad \text{where } i = 1, 2, \ldots, m. \]

We assume that the given functions are defined with enough differentiability on an open subset $U \subset \mathbb{R}^{n+m}$, $f : U \to \mathbb{R}$, $G = (G_1, \ldots, G_m) : U \to \mathbb{R}^m$. Then, we want to determine and classify the critical points of $f$ restricted to the set $\{X = (x_1, x_2, \ldots, x_{n+m}) : G(X) = 0\} \subset \mathbb{R}^{n+m}$. The treatment of this problem is greatly facilitated if we limit ourselves, first, to consider the case of those points where, besides, it holds the additional condition that allows applying the implicit function theorem, namely, that the rank of the corresponding Jacobian matrix is maximal, i.e., $\operatorname{rank}(G'(X)) = m$. Thus, we consider the restriction of $f$ to the set, indeed $n$-dimensional differentiable manifold:

\[ S = \{X = (x_1, x_2, \ldots, x_{n+m}) : G(X) = 0,\ \operatorname{rank}(G'(X)) = m\} \subset \mathbb{R}^{n+m}. \]

More precisely, since the function $G = (G_1, \ldots, G_m) : U \to \mathbb{R}^m$ is assumed to be enough differentiable we can consider, first, its Jacobian matrix

\[
G'(X) =
\begin{pmatrix}
\dfrac{\partial G_1}{\partial x_1} & \dfrac{\partial G_1}{\partial x_2} & \cdots & \dfrac{\partial G_1}{\partial x_{n+m}} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial G_m}{\partial x_1} & \dfrac{\partial G_m}{\partial x_2} & \cdots & \dfrac{\partial G_m}{\partial x_{n+m}}
\end{pmatrix},
\]

whose rank is assumed to equal $m$, i.e., $G'(X)$ is to have at least $m$ columns linearly independent. Let $X_0 = (x_{01}, \ldots, x_{0n}, x_{0,n+1}, \ldots, x_{0,n+m}) \in S$ and let us suppose for convenience, and without loss of generality, that the $m$ linearly independent columns in $G'(X_0)$ are the last ones (if not, the variables can be re-ordered): $\frac{\partial G}{\partial x_{n+1}}, \frac{\partial G}{\partial x_{n+2}}, \ldots, \frac{\partial G}{\partial x_{n+m}}$. Introducing the notations $U_0 = (x_{01}, \ldots, x_{0n})$ and $V_0 = (x_{0,n+1}, \ldots, x_{0,n+m})$ for the components, the implicit function theorem implies that there exists a differentiable function $h = (h_1, \ldots, h_m) : N \to \mathbb{R}^m$, defined on an open neighborhood $N \subset \mathbb{R}^n$ of $U_0$, such that it holds $h(U_0) = V_0$ and $H(u) := (u, h(u)) \in S,\ \forall u \in N$. Hence, we also have that $S \cap (N \times \mathbb{R}^m) = \operatorname{Graph}(h) = \operatorname{Image}(H)$. In other words, the function $H : N \to \mathbb{R}^{n+m}$ realizes a vector-valued parametrization of the differentiable manifold $S$ in a neighborhood of $X_0$.

As in the case where $n = m = 1$, it is quite clear that the (constrained) function $f|_S : S \to \mathbb{R}$ reaches a local extremity (maximum, minimum or saddle point) at $X_0$ if, and only if, the (unconstrained) function $J := f \circ H : N \to \mathbb{R}$ reaches the same kind of local extremity at $U_0$. Therefore, we can reduce our original problem to the determination and analysis of the possible critical points of the latter function, following the steps required in the corresponding problem for free, unconstrained extrema. Again, this problem would be quite easy to solve if we knew the actual, explicit expression of the function $H = (\mathrm{Id}, h)$. However, as we also know from most of examples, this may be very difficult, or even impossible to obtain. Thus, in these cases we have to compute the derivatives of the function $h$ implicitly, by using the fact that the function $K$ obtained by composing $G$ with $H$ vanishes identically, i.e., $K := G \circ H = G(\mathrm{Id}, h) \equiv 0$.

We represent again in a graph the whole picture as Figure 3 that follows:

Figure 3

As stated previously we assume, for the rest of the present exposition, that both the real-valued function $f$ and the vector-valued function $G$ are enough differentiable. For example we have to require at least class $C^{(2)}$ in order to develop the second derivative test. Both the gradient and the Hessian matrix of $J := f \circ H$ can be computed as follows: we denote by $u = (u_1, u_2, \ldots, u_n)$ the $n$ first coordinates of a general point $X$ with coordinates $X = (x_1, \ldots, x_n, x_{n+1}, \ldots, x_{n+m})$ and by $v = (v_1, v_2, \ldots, v_m)$ the last ones. Then, the function $h$ is defined by $h(u) = (v_1(u), v_2(u), \ldots, v_m(u))$. By $G'_v$ we denote the matrix whose $i$-th row is made up with the partial derivatives of $G_i$ with respect to the variables $v_1, v_2, \ldots, v_m$, i.e.,

\[
G'_v =
\begin{pmatrix}
\dfrac{\partial G_1}{\partial v_1} & \dfrac{\partial G_1}{\partial v_2} & \cdots & \dfrac{\partial G_1}{\partial v_m} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial G_m}{\partial v_1} & \dfrac{\partial G_m}{\partial v_2} & \cdots & \dfrac{\partial G_m}{\partial v_m}
\end{pmatrix}
=
\begin{pmatrix}
\dfrac{\partial G_1}{\partial x_{n+1}} & \dfrac{\partial G_1}{\partial x_{n+2}} & \cdots & \dfrac{\partial G_1}{\partial x_{n+m}} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial G_m}{\partial x_{n+1}} & \dfrac{\partial G_m}{\partial x_{n+2}} & \cdots & \dfrac{\partial G_m}{\partial x_{n+m}}
\end{pmatrix}.
\]

In what follows we denote with sub-indices the successive derivatives of the scalar and vector-valued functions needed in our exposition. Thus, for example:

\[
\frac{\partial f}{\partial x_i} := f_i; \quad
\frac{\partial^2 f}{\partial x_i \partial x_j} := f_{ij}; \quad
\frac{\partial G}{\partial x_i} := G_i; \quad
\frac{\partial^2 G}{\partial x_i \partial x_j} := G_{ij}; \quad
\frac{\partial G_{\alpha-n}}{\partial x_i} := G_{\alpha-n,i}; \quad
\frac{\partial^2 G_{\alpha-n}}{\partial x_i \partial x_j} := G_{\alpha-n,ij}.
\]

Observe that, when there are already existing sub-indices which indicate components, we separate with a comma to indicate derivation of the corresponding real valued function. Besides, we shall use the following range for indices: small Latin letters shall indicate indices running from $1$ to $n$, i.e., $1 \le i, j, k, l, \ldots \le n$; while small Greek letters shall run from $n+1$ to $n+m$, i.e., $n+1 \le \alpha, \beta, \gamma, \lambda, \ldots \le n+m$. With this convention it turns out, for example, that the several expressions of the matrix $G'_v$ can indistinctly be written as

\[ [G'_v] = \left(\frac{\partial G_{\alpha-n}}{\partial v_{\beta-n}}\right) = \left(\frac{\partial G_{\alpha-n}}{\partial x_{\beta}}\right) = \left(G_{\alpha-n,\beta}\right). \]

Moreover, since the latter is assumed to be non-singular, we shall denote its inverse by

\[ [G'_v]^{-1} := \left(G^{\alpha\beta}\right). \]

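Now, returning to our exposition, since the function $K$ vanishes identically on an open set, i.e., $K(u) := G(u, h(u)) \equiv 0,\ \forall u \in N \subset \mathbb{R}^n$, it follows that all of the derivatives of this function are also identically vanishing. In particular, for the first derivatives, with respect to $u_i$, we obtain, by also using the chain rule:

\[ \frac{\partial K}{\partial u_i}(u) = \frac{\partial G}{\partial u_i}\big(u, h(u)\big) + G'_v\big(u, h(u)\big) \cdot \frac{\partial h}{\partial u_i}(u) = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}, \]

which in terms of the alternative, indicial notation can also be represented by:

\[ K_{\alpha-n,i} = G_{\alpha-n,i} + \sum_{\beta} G_{\alpha-n,\beta}\,h_{\beta-n,i} = 0. \]

Then,

\[ \frac{\partial h}{\partial u_i}(u) = -\left[G'_v\big(u, h(u)\big)\right]^{-1} \cdot \frac{\partial G}{\partial u_i}\big(u, h(u)\big), \]

and, again in terms of the indicial notation, we may represent by

\[ h_{\gamma-n,i} = -\sum_{\alpha} G^{\gamma\alpha}\,G_{\alpha-n,i}. \]

We obtain, similarly, from the fact that the second derivatives of the function $K$ also vanish, i.e.,

\[ K_{\alpha-n,ij} = G_{\alpha-n,ij} + \sum_{\mu} G_{\alpha-n,i\mu}\,h_{\mu-n,j} + \sum_{\beta} G_{\alpha-n,\beta j}\,h_{\beta-n,i} + \sum_{\beta,\nu} G_{\alpha-n,\beta\nu}\,h_{\beta-n,i}\,h_{\nu-n,j} + \sum_{\beta} G_{\alpha-n,\beta}\,h_{\beta-n,ij} \equiv 0, \]

the expression

\[
\begin{aligned}
h_{\gamma-n,ij} = {} & -\sum_{\alpha} G^{\gamma\alpha}\,G_{\alpha-n,ij} + \sum_{\alpha,\mu,\rho} G^{\gamma\alpha}\,G_{\alpha-n,i\mu}\,G^{\mu\rho}\,G_{\rho-n,j} + \sum_{\alpha,\beta,\rho} G^{\gamma\alpha}\,G_{\alpha-n,\beta j}\,G^{\beta\rho}\,G_{\rho-n,i} \\
& - \sum_{\alpha,\beta,\nu,\rho,\mu} G^{\gamma\alpha}\,G_{\alpha-n,\beta\nu}\,G^{\nu\rho}\,G_{\rho-n,j}\,G^{\beta\mu}\,G_{\mu-n,i}.
\end{aligned}
\]

Now, for the function $J = f \circ H$ we calculate the first derivatives $\frac{\partial J}{\partial u_i}(u)$:

\[ J_i = f_i + \sum_{\alpha} f_{\alpha}\,h_{\alpha-n,i} = f_i - \sum_{\alpha,\beta} f_{\alpha}\,G^{\alpha\beta}\,G_{\beta-n,i}. \]

Next, we compute the derivative of this with respect to $u_j$:

\[
\begin{aligned}
J_{ij} = {} & f_{ij} - \sum_{\alpha,\beta} f_{i\alpha}\,G^{\alpha\beta}\,G_{\beta-n,j} - \sum_{\alpha,\beta} f_{\alpha j}\,G^{\alpha\beta}\,G_{\beta-n,i} + \sum_{\alpha,\beta,\mu,\nu} f_{\alpha\beta}\,G^{\alpha\mu}\,G_{\mu-n,i}\,G^{\beta\nu}\,G_{\nu-n,j} \\
& + \sum_{\gamma} f_{\gamma} \Bigg( -\sum_{\alpha} G^{\gamma\alpha}\,G_{\alpha-n,ij} + \sum_{\alpha,\mu,\rho} G^{\gamma\alpha}\,G_{\alpha-n,i\mu}\,G^{\mu\rho}\,G_{\rho-n,j} + \sum_{\alpha,\beta,\rho} G^{\gamma\alpha}\,G_{\alpha-n,\beta j}\,G^{\beta\rho}\,G_{\rho-n,i} \\
& \qquad\qquad - \sum_{\alpha,\beta,\nu,\rho,\mu} G^{\gamma\alpha}\,G_{\alpha-n,\beta\nu}\,G^{\nu\rho}\,G_{\rho-n,j}\,G^{\beta\mu}\,G_{\mu-n,i} \Bigg).
\end{aligned}
\tag{3.1}
\]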

The above theory and notations, developed so far, allow us to state the following two theorems:

Theorem 3.1 (First Derivative Test, without multipliers). Let $U \subset \mathbb{R}^{n+m}$ be an open set and let $f : U \to \mathbb{R}$, $G = (G_1, \ldots, G_m) : U \to \mathbb{R}^m$ be differentiable functions of class $C^{(1)}$ such that the Jacobian matrix of $G$ has maximal rank, i.e., $\operatorname{rank}(G'(X)) = m$, and consider the restriction of $f$ to the set $S = \{X = (x_1, x_2, \ldots, x_{n+m}) : G(X) = 0,\ \operatorname{rank}(G'(X)) = m\} \subset \mathbb{R}^{n+m}$, assumed on the other hand to be non-empty and thus, indeed, an $n$-dimensional differentiable manifold. Then, the critical points of the function $y = f(x_1, \ldots, x_n, x_{n+1}, \ldots, x_{n+m})$, subject to the constraints determined by both of conditions $G_i(x_1, \ldots, x_n, x_{n+1}, \ldots, x_{n+m}) = 0$, $i = 1, 2, \ldots, m$ and $\operatorname{rank}(G'(X)) = m$, may be calculated, taking also into account every possible case where $m$ columns of the matrix $G'(X)$ are linearly independent, by solving, in each of those cases, the system of equations represented by

\[ \begin{cases} \nabla J = 0 \\ G = 0. \end{cases} \]

Proof. Obvious from the above considerations, stressing once again the very important observation that, in order to obtain all of the critical points, one also has to analyze, separately, all of the remaining possible cases where $m$ columns of the matrix $G'(X)$ are linearly independent. Once such a listing is exhausted, the job of finding all of the critical points is done. The theorem is proved.

Observe, besides, that it may be more convenient to break the above in terms of components,

\[ \left.\begin{aligned} J_i &= 0, & i &= 1,2,\dots,n\\ G_{\alpha-n} &= 0, & \alpha-n &= 1,2,\dots,m \end{aligned}\right\}. \]

Here we further observe that this is a system of $n+m$ equations in $n+m$ unknowns. In fact, in the first place, the gradient $\nabla J$ had been previously written in terms of the derivatives of the given data $f$ and $G$ up to the first order and, therefore, both the gradient $\nabla J$ and the vector-valued function $G$ can also be expressed in terms of the components of the variable, general point $X = (x_1,\dots,x_n,x_{n+1},\dots,x_{n+m})$.
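As an illustration of how the components $J_i$ are assembled from the data, the following minimal Python sketch treats the simplest situation of a single constraint ($m = 1$), where $J_i = f_i - f_v\,G_i/G_v$. The function name `reduced_gradient` and the test problem ($f = x + y$ on the circle $x^2 + y^2 = 2$, with $u = x$, $v = y$) are assumptions chosen only for illustration; they do not appear in the text.

```python
def reduced_gradient(f_u, f_v, G_u, G_v):
    """Components J_i = f_i - f_v * G_i / G_v for one constraint (m = 1).

    f_u, G_u: lists of partial derivatives with respect to u_1..u_n.
    f_v, G_v: partial derivatives with respect to the single variable v.
    """
    if G_v == 0:
        # The implicit function theorem does not apply in this chart.
        raise ValueError("G_v vanishes: choose another column of G'(X)")
    return [f_i - f_v * G_i / G_v for f_i, G_i in zip(f_u, G_u)]


# Hypothetical example: f(x, y) = x + y, G(x, y) = x^2 + y^2 - 2.
def J_x(x, y):
    return reduced_gradient([1.0], 1.0, [2.0 * x], 2.0 * y)[0]


for point in [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]:
    print(point, J_x(*point))
```

The two points where the printed value vanishes, $(1,1)$ and $(-1,-1)$, are the constrained critical points of this toy problem; the third point lies on the circle but is not critical.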


Theorem 3.2 (Classical Second Derivative Test, without multipliers). With the above notations and conditions, assume now that the functions $f$ and $G$ are differentiable of class $C^{(2)}$. Then, the critical points of the function $y = f(x_1,\dots,x_n,x_{n+1},\dots,x_{n+m})$, subject to the constraints determined by the conditions $G_i(x_1,\dots,x_n,x_{n+1},\dots,x_{n+m}) = 0$, $i = 1,2,\dots,m$; $\operatorname{rank}(G'(X)) = m$, may be analyzed in order to determine whether they are maxima, minima or of saddle type, through the $n\times n$ Hessian matrix $\operatorname{Hess}(J) = (J_{ij})$, whose components $J_{ij}$ were described above, in equation (3.1), i.e., in terms of the derivatives of the given data $f$ and $G$ up to the second order, for each of the possible cases where $m$ columns of the matrix $G'(X)$ are linearly independent.

Proof. Obvious.

Finally, in case of failure of the latter second derivative test (indeterminate case) we could, if necessary, proceed in a similar fashion to calculate higher order derivatives of the function $J$. Suitable use of Taylor's formula would then allow us to classify the critical points. Example 4.1, in the next section, illustrates this fact, while in section 6 we present a new, comprehensive method for the analysis of critical points which, when needed, resorts to those higher order derivatives. In fact, we shall also see there that it even furnishes a valid alternative to the latter second derivative test.

Remark 3.3. So far, when dealing with this problem, we have required two conditions to hold: $G(X) = 0$, and the Jacobian matrix of $G$ having maximal rank, i.e., $\operatorname{rank}(G'(X)) = m$. However, there may exist points satisfying the first and not the second (critical points for $G$), or it could even occur that the latter condition is satisfied at no point of the (nonempty) solution set of the equation. See various examples described by J. Nunemacher in [13], and also the proof of theorem 6.3, around equation (6.6) ahead, as well as exercise 7.5. In this situation we may still try to determine and classify extrema of the given function $f$, restricted to the points of the set $\{X : G(X) = 0\} \subset \mathbb{R}^{n+m}$ such that $\operatorname{rank}(G'(X)) < m$. Observe, first, that the solution set of this system, assumed to be nonempty, may not contain any differentiable manifold of dimension equal to $n$; and, second, that in general the system of equations involved could be quite complicated, possibly even including transcendental terms, so that obtaining exact solutions may fall beyond the reach of the knowledge currently available on the topic. An exception arises if, in the case of analytic functions, we consider truncating the corresponding power series and introduce some additional criterion for dealing with approximate solutions, a question that goes far beyond the objectives and scope of the present article. However, in some cases, for example if all of the equations in the system are of polynomial type and therefore the solution set is an algebraic variety, irreducible or not, we may try to express that solution set as a finite union of solution sets of other systems of equations such that, for every one of those new systems, the two conditions that guarantee treatment by means of the implicit function theorem are fulfilled.
If that is the case, for each of those systems we should have, obviously, that the number of equations and the maximal rank, say $\bar m$, are less than or equal to the dimension of the ambient space, but strictly greater than the original one, i.e., $m < \bar m \le m + n$, and consequently the dimension of the corresponding differentiable manifold is equal to $m + n - \bar m$.

4. Further comparison of the three methods.

Consider, first, the following example.

Example 4.1. Find and classify the local extrema of the function $f(x,y,z) = x\cdot y\cdot z$ subject to the constraint $G(x,y,z) = -2x^3 + 15x^2y + 11y^3 - 24y = 0$.


Differential forms methods [16]:

From the functions $f(x,y,z) = x\cdot y\cdot z$ and $G(x,y,z) = -2x^3 + 15x^2y + 11y^3 - 24y$ we obtain, by exterior differentiation:

\[ df = yz\,dx + xz\,dy + xy\,dz, \qquad dG = (-6x^2 + 30xy)\,dx + (15x^2 + 33y^2 - 24)\,dy + 0\,dz. \]

Thus, their wedge (exterior) product can be written

\[ df\wedge dG = 3z\,(2x^3 - 5x^2y - 8y + 11y^3)\,dx\wedge dy + 6x^2y\,(x - 5y)\,dx\wedge dz - 3xy\,(5x^2 + 11y^2 - 8)\,dy\wedge dz. \]

Zizza's First alternative. The condition $df\wedge dG = 0$, together with the constraint equation, furnishes the system of four equations in three unknowns:

\[ \left.\begin{aligned} -2x^3 + 15x^2y + 11y^3 - 24y &= 0\\ 3z\,(2x^3 - 5x^2y - 8y + 11y^3) &= 0\\ 6x^2y\,(x - 5y) &= 0\\ 3xy\,(5x^2 + 11y^2 - 8) &= 0 \end{aligned}\right\} \]

with solution set, calculated again with the help of Swp2.5, and expressed as

\[ \{(0,0,z) : z\in\mathbb{R}\} \cup \left\{\left(0, \tfrac{2}{11}\sqrt{66}, 0\right), \left(0, -\tfrac{2}{11}\sqrt{66}, 0\right)\right\}. \tag{4.1} \]
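The solution set (4.1) can be checked numerically by substituting it back into the four equations. The short Python sketch below does so; the function name `zizza_system` and the sample values of $z$ are illustrative choices, not part of the original computation (which was carried out with Swp2.5).

```python
import math


def zizza_system(x, y, z):
    """Left-hand sides of the four equations of Zizza's first alternative."""
    return (
        -2 * x**3 + 15 * x**2 * y + 11 * y**3 - 24 * y,     # constraint G = 0
        3 * z * (2 * x**3 - 5 * x**2 * y - 8 * y + 11 * y**3),  # dx^dy coefficient
        6 * x**2 * y * (x - 5 * y),                          # dx^dz coefficient
        3 * x * y * (5 * x**2 + 11 * y**2 - 8),              # dy^dz coefficient, up to sign
    )


y0 = 2 * math.sqrt(66) / 11
# Two isolated points of (4.1) plus two sample points of the line (0, 0, z).
candidates = [(0.0, y0, 0.0), (0.0, -y0, 0.0), (0.0, 0.0, 1.5), (0.0, 0.0, -3.0)]
residuals = [max(abs(r) for r in zizza_system(*p)) for p in candidates]
print(residuals)
```

All residuals vanish up to rounding, whereas a nearby point such as $(0, \tfrac{2}{11}\sqrt{66}, 1)$ violates the second equation.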

Zizza's Second alternative. Consider the coordinate system $(y_1, y_2, y_3)$, where $y_1 = G$, $y_2 = x$ and $y_3 = z$. Thus, we obtain the condition

\[ dy_1\wedge dy_2\wedge dy_3 = (15x^2 + 33y^2 - 24)\,dx\wedge dy\wedge dz \ne 0, \quad\text{i.e.,}\quad 15x^2 + 33y^2 - 24 \ne 0. \]

If we now proceed to calculate the solutions of the system as suggested by Zizza, i.e., $df\wedge dG\wedge dx = 0$, $df\wedge dG\wedge dz = 0$, $G = 0$, we find a solution set which contains, in addition to the set determined by the previous alternative, two more points where the coordinate system fails to be valid, i.e., where $15x^2 + 33y^2 - 24 = 0$. Thus, in order to overcome this difficulty we consider, instead, the system represented symbolically by

\[ \left.\begin{aligned} \frac{df\wedge dG\wedge dx}{dy_1\wedge dy_2\wedge dy_3} &= 0\\ \frac{df\wedge dG\wedge dz}{dy_1\wedge dy_2\wedge dy_3} &= 0\\ G &= 0 \end{aligned}\right\}. \]


Observe that the quotient of differential forms makes sense since all of them are one-dimensional objects. Then, with this modified version of Zizza's Second Alternative, the solution set is the same as that obtained with the first alternative. Finally, as happened in the case of Example 2.1, Zizza's method does not allow us to classify any of the critical points obtained.

Lagrange method [14].

First, we form the Lagrangian function:

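\[ L(\lambda,x,y,z) = f(x,y,z) + \lambda\,G(x,y,z) = xyz + \lambda\,(-2x^3 + 15x^2y + 11y^3 - 24y). \]

Thus, for the present case, in order to find the critical points we have to solve the system

\[ \nabla L := \left(\frac{\partial L}{\partial\lambda}, \frac{\partial L}{\partial x}, \frac{\partial L}{\partial y}, \frac{\partial L}{\partial z}\right) = 0, \quad\text{i.e.,} \]

\[ \left.\begin{aligned} -2x^3 + 15x^2y + 11y^3 - 24y &= 0\\ yz + \lambda\,(-6x^2 + 30xy) &= 0\\ xz + \lambda\,(15x^2 + 33y^2 - 24) &= 0\\ xy &= 0 \end{aligned}\right\} \]

Swp2.5 provided the solution, which we represent by

\[ \{(\lambda, X)\} = \{(0,(0,0,z)) : z\in\mathbb{R}\} \cup \left\{\left(0, 0, \tfrac{2}{11}\sqrt{66}, 0\right), \left(0, 0, -\tfrac{2}{11}\sqrt{66}, 0\right)\right\}. \]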

Next, for the (bordered) Hessian matrix of L we obtain

\[ HL(\lambda, X) = \begin{bmatrix} 0 & -6x^2 + 30xy & 15x^2 + 33y^2 - 24 & 0\\ -6x^2 + 30xy & \lambda\,(-12x + 30y) & z + 30\lambda x & y\\ 15x^2 + 33y^2 - 24 & z + 30\lambda x & 66\lambda y & x\\ 0 & y & x & 0 \end{bmatrix}. \]

Then, a straightforward computation shows that, at the critical points

\[ \left(0, 0, \tfrac{2}{11}\sqrt{66}, 0\right) \quad\text{and}\quad \left(0, 0, -\tfrac{2}{11}\sqrt{66}, 0\right) \]

we have that $\Gamma_3 = 0$ and $\Gamma_4 = \tfrac{363096}{1331} \ne 0$. Hence, both are saddle points. However, at the rest of the critical points, i.e., $\{(0,(0,0,z)) : z\in\mathbb{R}\}$, we have $\Gamma_3 = 0$, $\Gamma_4 = 0$ and, therefore, at every one of these points the method does not allow us to draw any conclusion, i.e., all are indeterminate cases.

Alternative proposed method [6], [7], [8], [9], [10].

With the convention that the variables are relabeled as $x = u_1$, $z = u_2$, $y = v_1$, in order to suit our previously exposed notation, we consider the Jacobian matrix of the function defining the restriction $G(x,y,z) = -2x^3 + 15x^2y + 11y^3 - 24y$:

\[ G'(X) = [\,G_1 \;\; G_2 \;\; G_3\,] = [\,-6x^2 + 30xy \;\;\; 0 \;\;\; 15x^2 + 33y^2 - 24\,], \]

and assume that $G_3 = 15x^2 + 33y^2 - 24 \ne 0$. Hence, by the implicit function theorem, the constraint condition $G(x,y,z) = -2x^3 + 15x^2y + 11y^3 - 24y = 0$ allows us to consider $y = v_1$ as a function of $x = u_1$ and $z = u_2$, i.e., $v_1 = y = h(x,z) = h(u_1,u_2)$. Therefore, we can write

\[ K(u_1,u_2) := G\big(u_1,u_2,h(u_1,u_2)\big) = -2x^3 + 15x^2\,h(x,z) + 11\,\big(h(x,z)\big)^3 - 24\,h(x,z) \equiv 0. \]

From the latter we obtain $K_1 = G_1 + G_3h_1$, $K_2 = G_2 + G_3h_2$, and it follows that

\[ h_1 = -\frac{G_1}{G_3} = -2x\,\frac{-x + 5y}{5x^2 + 11y^2 - 8}, \qquad h_2 = -\frac{G_2}{G_3} = \frac{0}{15x^2 + 33y^2 - 24} = 0. \]

Thus, for the first derivatives of the function $J(x,z) = f\big(x,h(x,z),z\big) = x\,z\,h(x,z)$ we find

\[ J_1 = f_1 + f_3h_1 = z\,\frac{-5x^2y + 11y^3 + 2x^3 - 8y}{5x^2 + 11y^2 - 8}, \qquad J_2 = f_2 + f_3h_2 = xy. \]

Then, the critical points are found by solving the system

\[ \left.\begin{aligned} z\,\frac{-5x^2y + 11y^3 + 2x^3 - 8y}{5x^2 + 11y^2 - 8} &= 0\\ xy &= 0\\ -2x^3 + 15x^2y + 11y^3 - 24y &= 0 \end{aligned}\right\}. \]

This system exhibits the same solutions as those obtained by using Zizza's method, labeled above as (4.1). Continuing with the calculations we proceed to find, next, the second derivatives of the function $J$. We do this by calculating first the corresponding derivatives of $h$, which are obtained by using the fact, described previously, that all of the derivatives of $K$ vanish. In this fashion we finally obtain, by means of straightforward calculations, that:

\[
\begin{aligned}
J_{11} ={}& 2xz\,\frac{660x^3y^2 - 160x^3 + 484xy^4 - 704xy^2 + 256x}{(5x^2 + 11y^2 - 8)^3}\\
& + 2xz\,\frac{81yx^4 - 1650x^2y^3 + 400x^2y - 1815y^5 + 2640y^3 - 960y}{(5x^2 + 11y^2 - 8)^3},\\
J_{12} = J_{21} ={}& \frac{-5x^2y + 11y^3 + 2x^3 - 8y}{5x^2 + 11y^2 - 8}, \qquad J_{22} = 0.
\end{aligned}
\]

It follows that, at the critical point $\left(0, \tfrac{2}{11}\sqrt{66}, 0\right)$, the Hessian matrix of $J$ is given by

\[ \operatorname{Hess}(J) = \begin{bmatrix} 0 & \tfrac{2}{11}\sqrt{66}\\ \tfrac{2}{11}\sqrt{66} & 0 \end{bmatrix}, \]

and since the eigenvalues of the latter are $\tfrac{2}{11}\sqrt{66}$ and $-\tfrac{2}{11}\sqrt{66}$ there is a saddle at that point, a conclusion that coincides with the one obtained by using the Lagrange multipliers method. A similar situation occurs at the symmetric critical point $\left(0, -\tfrac{2}{11}\sqrt{66}, 0\right)$. Now, at the rest of the critical points it is easy to see that the Hessian matrix vanishes, so that the second derivative test fails with this method too. But we can continue with the computation and analysis of higher-order derivatives. In fact, by a similar procedure we find first that, at the critical points of the form $\{(0,0,z) : z\in\mathbb{R}\}$, all of the third order derivatives vanish, i.e., $J_{111} = J_{222} = J_{122} = J_{112} = 0$, while most of the fourth order ones also vanish, except one of them: $J_{1111} = -2z$. It follows, by also using the Taylor expansion of $J$ in a neighborhood of each critical point, that all of the points of the form $(0,0,z_0)$ with $z_0 > 0$ are maxima, while those with $z_0 < 0$ are minima. It also follows that at the origin, $(0,0,0)$, all of the fourth order derivatives vanish, so that we have to resort to the fifth order ones. In fact, most of these also vanish, with the exception of those having four indices equal to 1 and one equal to 2, i.e., $J_{11112} = J_{11121} = J_{11211} = J_{12111} = J_{21111} = -2$. We conclude that the function has a saddle point at the origin. This completes the analysis of critical points by the present method. The full theoretical justification of what we have exposed here in this paragraph will take place in section 6 ahead, where the so-called Higher Order Derivative Test is presented.

To finish this section, let us consider the following problem, which represents an extension of the one exposed by F. Zizza in [16]:

Example 4.2. Find and classify the local extrema of the function defined by

\[ f(u,v,x,y) = (x - u)^2 + (y - v)^2, \]

subject to the constraints represented by the equations

\[ G_1(u,v,x,y) = \frac{x^2}{4} + \frac{y^2}{9} - 1 = 0, \qquad G_2(u,v,x,y) = (u - 3)^2 + (v + 5)^2 - 1 = 0. \]

Recall that Zizza's problem was only to find the minimum distance between points on the ellipse and points on the circle. We seek to perform the full analysis of the problem by using the present approach. By enumerating the variables as $u_1 = u$, $u_2 = x$, $v_1 = h_1 = v$, $v_2 = h_2 = y$, we compute the Jacobian matrix of $G = (G_1, G_2)$:

\[ G'(X) = \begin{bmatrix} G_{1,1} & G_{1,2} & G_{1,3} & G_{1,4}\\ G_{2,1} & G_{2,2} & G_{2,3} & G_{2,4} \end{bmatrix} = \begin{bmatrix} 0 & x/2 & 0 & 2y/9\\ 2u - 6 & 0 & 2v + 10 & 0 \end{bmatrix}, \]

and assume that the minor

\[ G'_v = \begin{bmatrix} G_{1,3} & G_{1,4}\\ G_{2,3} & G_{2,4} \end{bmatrix} = \begin{bmatrix} 0 & 2y/9\\ 2v + 10 & 0 \end{bmatrix} \]

is non-singular. Its inverse matrix is then given by

\[ [G'_v]^{-1} = \begin{bmatrix} G^{11} & G^{12}\\ G^{21} & G^{22} \end{bmatrix} = \begin{bmatrix} 0 & \dfrac{1}{2(v + 5)}\\[2mm] \dfrac{9}{2y} & 0 \end{bmatrix}. \]

The first derivatives of $f$ are $f_1 = -2(x - u)$, $f_2 = 2(x - u)$, $f_3 = -2(y - v)$, $f_4 = 2(y - v)$; from the condition $K(u_1,u_2) := G\big(u_1,u_2,h_1(u_1,u_2),h_2(u_1,u_2)\big) \equiv 0$ we obtain

\[ h_{1,1} = -\frac{u - 3}{v + 5}, \qquad h_{2,1} = 0, \qquad h_{1,2} = 0, \qquad h_{2,2} = -\frac{9x}{4y}; \]

and, consequently, the first derivatives of $J := f\circ H$ may be represented by

\[
\begin{aligned}
J_1 &= f_1 + f_3h_{1,1} + f_4h_{2,1} = -2x + 2u - 2\,\frac{u - 3}{v + 5}\,(-y + v),\\
J_2 &= f_2 + f_3h_{1,2} + f_4h_{2,2} = 2x - 2u + \frac{9}{2}\,\frac{x}{y}\,(-y + v).
\end{aligned}
\]

In a similar fashion we compute the second derivatives:

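\[
\begin{aligned}
J_{11} &= 2\,\frac{5v^2 + 50v + 170 + 5u^2 - 30u + yv^2 + 10yv + 34y + yu^2 - 6yu}{(v + 5)^3},\\
J_{12} = J_{21} &= -\frac{1}{2}\,\frac{4yv + 20y + 9xu - 27x}{y\,(v + 5)},\\
J_{22} &= \frac{1}{8}\,\frac{-20y^3 + 36y^2v + 81x^2v}{y^3}.
\end{aligned}
\]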

Now, the critical points are obtained by solving the system

\[ \left.\begin{aligned} -2x + 2u - 2\,\frac{u - 3}{v + 5}\,(-y + v) &= 0\\ 2x - 2u + \frac{9}{2}\,\frac{x}{y}\,(-y + v) &= 0\\ \frac{x^2}{4} + \frac{y^2}{9} - 1 &= 0\\ (u - 3)^2 + (v + 5)^2 - 1 &= 0 \end{aligned}\right\} \]

Mathematica 4.0 provided the following (approximate) solution set:

\[
\begin{aligned}
X_1 &:= (u_1, x_1, v_1, y_1) = (3.41407,\ -0.580423,\ -5.91025,\ 2.87089),\\
X_2 &:= (u_2, x_2, v_2, y_2) = (2.58593,\ -0.580423,\ -4.08975,\ 2.87089),\\
X_3 &:= (u_3, x_3, v_3, y_3) = (3.64566,\ 0.982085,\ -5.76362,\ -2.61341),\\
X_4 &:= (u_4, x_4, v_4, y_4) = (2.35434,\ 0.982085,\ -4.23638,\ -2.61341).
\end{aligned}
\]

This set is the same as that obtained by using F. Zizza's first alternative. It took about the same time as that required for solving the problem by means of the second alternative, but without exhibiting, of course, the four points reported by that author where the coordinate system is not valid. Recall too that, in his article, he indicates the comparison in computation time with the Lagrange method. Next, we analyze the Hessian matrix at these points. For example:

\[ \operatorname{Hess}(J)\big|_{X_1} := \begin{bmatrix} J_{11} & J_{12}\\ J_{12} & J_{22} \end{bmatrix}_{X_1} = \begin{bmatrix} -20.873 & -2.4139\\ -2.4139 & -12.616 \end{bmatrix}, \]

and since the (approximate) eigenvalues of this matrix are $-21.527$ and $-11.962$, i.e., both are negative, it turns out that the function $J$ reaches a maximum at $X_1$. Hence, so does $f$. By a similar analysis one finds that both $X_2$ and $X_3$ are saddle points while $X_4$ is a minimum. The (approximate) distances are: 2.1254 for the minimum; 9.647 for the maximum; 7.647 and 4.1253 at the saddle points.
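The Hessian entries and eigenvalues quoted for $X_1$ can be reproduced directly from the closed-form second derivatives of Example 4.2. The following Python sketch does this with the standard library only; the helper names `hessian_J` and `eigenvalues_sym2` are illustrative, and the coordinates of $X_1$ are the rounded values reported above, so agreement is expected only to the displayed precision.

```python
import math


def hessian_J(u, x, v, y):
    """Entries J_11, J_12, J_22 of Example 4.2 at a point (u, x, v, y)."""
    J11 = 2 * (5 * v**2 + 50 * v + 170 + 5 * u**2 - 30 * u
               + y * v**2 + 10 * y * v + 34 * y + y * u**2 - 6 * y * u) / (v + 5)**3
    J12 = -0.5 * (4 * y * v + 20 * y + 9 * x * u - 27 * x) / (y * (v + 5))
    J22 = (-20 * y**3 + 36 * y**2 * v + 81 * x**2 * v) / (8 * y**3)
    return J11, J12, J22


def eigenvalues_sym2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], smallest first."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius


X1 = (3.41407, -0.580423, -5.91025, 2.87089)
J11, J12, J22 = hessian_J(*X1)
lo, hi = eigenvalues_sym2(J11, J12, J22)
print(J11, J12, J22)
print(lo, hi)   # both negative: a maximum of J at X1
```

Calling `hessian_J` on $X_2$, $X_3$ and $X_4$ in the same way gives the remaining classifications.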

It is to be noted finally that, if we were to use the Lagrange multipliers method in order to classify the critical points, we would have to consider a (bordered) Hessian represented by a matrix of order $(n + 2m)\times(n + 2m) = 6\times 6$ (see [14]).

5. Theoretical observations and considerations.

Remark 5.1. The fact that, in the examples presented here, the first derivative test of both our own approach and that of Zizza's second alternative furnished the same critical points, except for those where, in the latter, the new coordinates are not valid, is by no means a coincidence. It is not difficult to prove, in fact, that the expression for what we called the $i$-th partial derivative of the function $J$, i.e.,

\[ J_i = f_i + \sum_{\alpha} f_{\alpha}\,h_{\alpha-n,i} = f_i - \sum_{\alpha,\beta} f_{\alpha}\,G^{\alpha\beta}G_{\beta-n,i}, \]

is the same as that obtained by making the quotient

\[ (-1)^{i-1}\,\frac{df\wedge dy_1\wedge\cdots\wedge\widehat{dy_i}\wedge\cdots\wedge dy_{n+m}}{dy_1\wedge\cdots\wedge dy_i\wedge\cdots\wedge dy_{n+m}} = (-1)^{i-1}\,\frac{df\wedge du_1\wedge\cdots\wedge\widehat{du_i}\wedge\cdots\wedge du_n\wedge dG_1\wedge\cdots\wedge dG_m}{du_1\wedge\cdots\wedge du_i\wedge\cdots\wedge du_n\wedge dv_1\wedge\cdots\wedge dv_m}, \]

where the hat marks the omitted factor and where we have conveniently adapted and combined Zizza's notation with ours.

Remark 5.2. As was explained, the method presented here avoids the use of Lagrange multipliers. However, in some applications, as for example in economics, optimization, etc., [2], [3], [4], [12], it may be desirable to obtain their concrete values. We show next that the historic multipliers, being no longer part of the problem, can also be obtained by the present method as an output of the solution, with no significant additional effort. In fact, suppose we have already computed the critical points, and let $X_0$ be one of them. Then, the Lagrange multipliers theorem asserts that there exist multipliers $\lambda_1,\dots,\lambda_m$ such that $(f + \lambda_1G_1 + \cdots + \lambda_mG_m)'(X_0) = (0,\dots,0)$. Since the point $X_0$ is already known, the latter represents a linear system of $n + m$ equations in the $m$ unknowns $\lambda_1,\dots,\lambda_m$. However, by the assumptions made in the development of the method, we can reduce the above to the system of $m$ linear equations represented by

\[ [\,\lambda_1 \;\cdots\; \lambda_m\,] \begin{bmatrix} \dfrac{\partial G_1}{\partial v_1} & \dfrac{\partial G_1}{\partial v_2} & \cdots & \dfrac{\partial G_1}{\partial v_m}\\ \vdots & \vdots & & \vdots\\ \dfrac{\partial G_m}{\partial v_1} & \dfrac{\partial G_m}{\partial v_2} & \cdots & \dfrac{\partial G_m}{\partial v_m} \end{bmatrix} = -\left[\,\dfrac{\partial f}{\partial v_1} \;\cdots\; \dfrac{\partial f}{\partial v_m}\,\right], \]

where all derivatives are evaluated at $X_0$. Therefore, we can express the solution by

\[ \lambda_{\beta-n} = -\sum_{\alpha} f_{\alpha}(X_0)\,G^{\alpha\beta}(X_0). \]

Let us apply the above to obtain the multipliers in example 4.2, which is the only remaining case where the multipliers are not known yet. We do this by specializing at each critical point the expression:

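\[ [\,\lambda_1 \;\; \lambda_2\,] = -[\,f_3 \;\; f_4\,] \begin{bmatrix} G^{11} & G^{12}\\ G^{21} & G^{22} \end{bmatrix} = -[\,-2(y - v) \;\;\; 2(y - v)\,] \begin{bmatrix} 0 & \dfrac{1}{2(v + 5)}\\[2mm] \dfrac{9}{2y} & 0 \end{bmatrix} = -\left[\,\dfrac{9(y - v)}{y} \;\;\; -\dfrac{y - v}{v + 5}\,\right]. \]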

Then, by substituting into the latter expression the corresponding values, already computed, we find:

At $X_1$ the multipliers are (approximately): $\lambda_1 = -27.528$, $\lambda_2 = -9.647$.
At $X_2$: $\lambda_1 = -21.821$, $\lambda_2 = 7.647$.
At $X_3$: $\lambda_1 = 10.849$, $\lambda_2 = -4.1254$.
At $X_4$: $\lambda_1 = 5.5891$, $\lambda_2 = 2.1254$.
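The multiplier values listed above follow from the row-vector formula $\lambda = -f_v\,[G'_v]^{-1}$ evaluated at each critical point. A minimal Python sketch of that evaluation for Example 4.2 is given below; the function name `multipliers` is an illustrative choice, and the input coordinates are the rounded critical points reported earlier.

```python
def multipliers(u, x, v, y):
    """Lagrange multipliers (lambda_1, lambda_2) of Example 4.2 at (u, x, v, y)."""
    f_v = (-2 * (y - v), 2 * (y - v))                  # (f_3, f_4)
    (a, b), (c, d) = (0.0, 2 * y / 9), (2 * v + 10, 0.0)  # minor G'_v
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))   # inverse of G'_v
    lam1 = -(f_v[0] * inv[0][0] + f_v[1] * inv[1][0])
    lam2 = -(f_v[0] * inv[0][1] + f_v[1] * inv[1][1])
    return lam1, lam2


critical_points = {
    "X1": (3.41407, -0.580423, -5.91025, 2.87089),
    "X2": (2.58593, -0.580423, -4.08975, 2.87089),
    "X3": (3.64566, 0.982085, -5.76362, -2.61341),
    "X4": (2.35434, 0.982085, -4.23638, -2.61341),
}
for name, point in critical_points.items():
    print(name, multipliers(*point))
```

The general $2\times 2$ inverse is used on purpose, so that the same lines apply to any other non-singular minor.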

It is to be remarked again that the whole procedure performed in this way, i.e., within the framework of the method exposed here, including the calculations of Lagrange multipliers, is, at least experimentally, much faster than trying to execute the problem and obtain its solution by the traditional Lagrange multipliers method.

Remark 5.3. As shown by example 4.1, the second derivative test may fail, whether one uses the Lagrange multipliers method or the test without multipliers (Theorem 3.2 above). However, as also shown in that same example, the use of higher order derivatives may allow us to continue the analysis of critical points, as we are about to prove in the next section.

6. The Higher Derivative Test.

In this section we present another method that, in case of failure of the second derivative test (indeterminate case), can be used in order to complete the classification of already known critical points. It consists, basically and when necessary, in calculating higher order derivatives of the function $J$. Suitable use of Taylor's formula then allows us to classify the critical points. The only necessary condition is that the latter be non-trivial, i.e., that there exists at least one derivative, of order greater than or equal to two, which is different from zero at a given critical point. Example 4.1 illustrated this quite well and the following results justify that procedure from the theoretical point of view. Observe first of all, from the above expositions and considerations, that it is enough to consider the case where the objects to analyze are the extrema of a function $f: U \to \mathbb{R}$ where $U$ is an open subset of $\mathbb{R}^d$, since we have already reduced the analysis of critical points of a constrained problem to a non-constrained one.

On the other hand, observe too that the method is very well known for the case of dimension $d = 1$ and, in particular, we used it in order to solve the cases presented as Examples 2.1, 2.4 and Exercise 2.5, where the Lagrange multipliers method failed to provide an answer. Thus, we assume in this section that $d \ge 2$. It will also be clear from what follows that the method provides, besides, a valid alternative even for the classical second derivative test. Let $X_0 = (x_{0_1},\dots,x_{0_d}) \in U$ be a critical point of the given function $f: U \to \mathbb{R}$ and assume that the first non-vanishing derivative at that point is of $k$-th order. Then, for points $X = (x_1,\dots,x_d)$ in a neighborhood of that point we may write, by using Taylor's formula, that

\[ f(X) = f(X_0) + \frac{1}{k!}\sum_{i_1,i_2,\dots,i_k} f_{i_1i_2\dots i_k}(X_0)\,(x_{i_1} - x_{0_{i_1}})(x_{i_2} - x_{0_{i_2}})\cdots(x_{i_k} - x_{0_{i_k}}) + \text{higher degree terms}. \]

The last displayed term is a homogeneous polynomial function of $k$-th degree, represented as

\[ P_k = P_k(x_1 - x_{0_1}, x_2 - x_{0_2}, \dots, x_d - x_{0_d}) = \frac{1}{k!}\sum_{i_1,i_2,\dots,i_k} f_{i_1i_2\dots i_k}(X_0)\,(x_{i_1} - x_{0_{i_1}})(x_{i_2} - x_{0_{i_2}})\cdots(x_{i_k} - x_{0_{i_k}}). \]

For this kind of homogeneous polynomial function it is well known that, for any real number $t \ne 0$, the following relationship holds:

\[ P_k\big(t(x_1 - x_{0_1}), \dots, t(x_d - x_{0_d})\big) = t^k\,P_k(x_1 - x_{0_1}, x_2 - x_{0_2}, \dots, x_d - x_{0_d}). \tag{6.1} \]

The method will rely on evaluating, on the corresponding unit sphere, the successive nonvanishing homogeneous polynomials that appear in the Taylor development of $f$. Thus, we consider next the:

First Subsidiary Problem: Determine the critical points of $P_k(X) = P_k(x_1 - x_{0_1}, x_2 - x_{0_2}, \dots, x_d - x_{0_d})$ subject to the restriction that $X$ belongs to the unit sphere centered at $X_0$, i.e., $X \in S^{d-1} := S^{d-1}(X_0; 1) = \{X : (x_1 - x_{0_1})^2 + (x_2 - x_{0_2})^2 + \cdots + (x_d - x_{0_d})^2 = 1\}$.

Let us observe, first, that in the last expression, and also in what follows, we are assuming that the given sphere, boundary of the corresponding unit ball, is fully contained in the open set $U$, by performing a rescaling of coordinates in case that were necessary in order to achieve this convenient situation. Then, if we define $G_1(X) := (x_1 - x_{0_1})^2 + (x_2 - x_{0_2})^2 + \cdots + (x_d - x_{0_d})^2 - 1$, the restriction condition is given by $G_1(X) = 0$, with Jacobian $G_1'(X) = 2\,(x_1 - x_{0_1}, x_2 - x_{0_2}, \dots, x_d - x_{0_d})$. It follows, by considering all of the possibilities where one of these components is non-vanishing, i.e., $x_i - x_{0_i} \ne 0$, $i = 1,\dots,d$, that the sphere $S^{d-1}$ is covered by a finite number of the corresponding parametrizing functions, all of them in the form of Monge, i.e., graph maps, as indicated by the method exposed in the proof of Theorem 3.1. Indeed, there are $d$ of them. However, if one requires connectedness for the image of those functions, that number rises to $2d$, 2 for each of the coordinate maps. As is very well known, this is one of the most common and useful parametrizations of the sphere, and we shall refer to it as the distinguished or canonical parametrization.

In other words, the initial problem to be considered is that we have a homogeneous polynomial function $P_k: \mathbb{R}^d \to \mathbb{R}$ and want to find all of the critical points of $P_k|_{S^{d-1}}: S^{d-1} \to \mathbb{R}$. This is a continuous function and, hence, there must be at least two points of extreme on the (compact) set $S^{d-1}$: an absolute maximum $X_{Max}$ and an absolute minimum $X_{min}$, which could reduce to a single point, $X_{Max} = X_{min}$, in case both coincide. These considerations guarantee that some of the systems of equations below, which were indicated in the formulation of the method in order to determine the critical points, exposed in section 3, have real solutions

∇=JX()0 Pk  , (6.2) GX()=0 1  with JPH= . Observe besides that, whereas we have stressed the dependence of such a function Pkk on Pk , it should also be remarked once again that it depends on the parametrization chosen as well. Now, since the latter is a polynomial system of equations there could exist, in the best of possible − situations, only a finite number of solutions XX, ,..., X∈ Sd 1 because, as pointed out before, there 12 k0 should exist at least two: an absolute maximum X Max and an absolute minimum X min . However, the solution set could also consists of a finite union of objects with diverse dimensions, some of them being − non-finite subsets of the sphere S d 1 which, in a such a case, will turn out to be, as we shall see ahead, − − algebraic differentiable submanifolds of S d 1 with dimensions strictly less than dim()Sdd 1 =− 1 .

Next, by computing the values of $P_k$ at the solution set, for example in case there is only a finite listing of points, $P_k(X_1), P_k(X_2), \dots, P_k(X_{k_0})$, we find by comparison that

\[ P_k(X_{min}) \le P_k(X_j) \le P_k(X_{Max}), \qquad j = 1,2,\dots,k_0. \]

Moreover, we also have that $P_k(X_{min}) \le P_k(X) \le P_k(X_{Max})$, for every $X \in S^{d-1}$.

Remark 6.1. Let us observe that the image set $P_k(S^{d-1})$ could reduce to a single point. This happens, for example, in the case where

\[ P_k(X) = P_{2p}(X) = \lambda\,\big((x_1 - x_{0_1})^2 + (x_2 - x_{0_2})^2 + \cdots + (x_d - x_{0_d})^2\big)^p, \tag{6.3} \]

the image being precisely the real number $\lambda$, which must be non-vanishing since we are assuming $P_{2p} = P_k \ne 0$. It will also be useful, and very convenient in our future exposition, to consider a kind of converse to this result, in the form of various equivalent properties. Furthermore, in order to make things easier for such a purpose, let us assume, without loss of generality, that the center lies at the origin of coordinates, i.e., $X_0 = (x_{0_1}, x_{0_2}, \dots, x_{0_d}) = (0,0,\dots,0)$, by performing if necessary a translation of coordinates:

Lemma 6.2. Let $P_k: \mathbb{R}^d \to \mathbb{R}$, with $P_k = P_k(x_1, x_2, \dots, x_d) = \sum a_{i_1i_2\dots i_k}\,x_{i_1}x_{i_2}\cdots x_{i_k}$, be a non-trivial homogeneous polynomial function. Then, the following statements are equivalent:

1) The homogeneous polynomial function $P_k$ restricted to $S^{d-1}$ is identically equal to a non-vanishing constant, $P_k|_{S^{d-1}} \equiv \lambda \ne 0$, $\lambda \in \mathbb{R}$.

2) For every one of the distinguished parametrizations of the sphere $S^{d-1}$, defined above, the corresponding composite function $J_{P_k} = P_k\circ H$ is also identically equal to the same non-vanishing constant, $J_{P_k} = P_k\circ H \equiv \lambda$.

3) All of the first order derivatives of the latter functions are identically vanishing, in their respective domains of definition.

4) The polynomial function $P_k: \mathbb{R}^d \to \mathbb{R}$ is the one represented as equation (6.3) above, with $X_0 = 0$, i.e., there exist a positive integer $p \in \mathbb{N}$, $p > 0$, and a non-vanishing real number $\lambda \in \mathbb{R}$, $\lambda \ne 0$, such that $P_k(X) = P_{2p}(X) = \lambda\,(x_1^2 + x_2^2 + \cdots + x_d^2)^p$.

Proof. Equivalence of the first three statements is rather obvious.

Let us prove next that 4) implies 3). In this context, the Jacobian becomes now $G_1'(X) = 2\,(x_1, x_2, \dots, x_d)$ and if we consider the parametrization determined by the case where a given $x_i \ne 0$ we obtain, for every $j \ne i$, that the derivative with respect to $x_j$ vanishes identically, i.e.,

\[ J_{P_k,j} = \frac{\partial J_{P_k}}{\partial x_j} = \frac{\partial (P_k\circ H)}{\partial x_j} = 2\lambda p\,(x_1^2 + \cdots + x_d^2)^{p-1}x_j - 2\lambda p\,(x_1^2 + \cdots + x_d^2)^{p-1}x_i\,\frac{x_j}{x_i} \equiv 0. \]

In order to finish the proof, let us prove that 1) implies 4). Let us observe first that the degree $k$ cannot be odd, because of equation (6.1). In fact, if we assume it to be odd, putting $t = -1$ in that equation leads to a contradiction. Thus $k$ is even, i.e., there exists a positive integer $p \in \mathbb{N}$, $p > 0$, such that $k = 2p$. Then, we may proceed by induction on the dimension $d$. Thus, let us take first, for $d = 2$, a homogeneous polynomial function $P_k: \mathbb{R}^2 \to \mathbb{R}$ such that its restriction to $S^1$ is identically equal to a nonvanishing constant, $P_k|_{S^1} \equiv \lambda \ne 0$, $\lambda \in \mathbb{R}$. We may display the homogeneous polynomial function as follows:

\[
\begin{aligned}
P_k(X) = P_{2p}(X) ={}& a_1^{k}\,x_1^{2p} + a_{12}^{2p-1,1}\,x_1^{2p-1}x_2 + a_{12}^{2p-2,2}\,x_1^{2p-2}x_2^{2} + a_{12}^{2p-3,3}\,x_1^{2p-3}x_2^{3} + a_{12}^{2p-4,4}\,x_1^{2p-4}x_2^{4} + \cdots\\
& + a_{12}^{2p-i,i}\,x_1^{2p-i}x_2^{i} + \cdots + a_{12}^{1,2p-1}\,x_1x_2^{2p-1} + a_2^{k}\,x_2^{2p},
\end{aligned}
\]

with $X = (x_1, x_2)$ and where we have conveniently abbreviated the notation for the constant coefficients: the subscript indices run in this case only from 1 to 2, while the superscripts indicate the number of times that those are repeated, separated by a comma when necessary. Thus, for example, $a_{12}^{3,4} := a_{1112222}$, $a_{12}^{4,0} = a_1^{4} := a_{1111}$, $a_{12}^{0,5} = a_2^{5} := a_{22222}$, and so on. We may choose to work within the parametrization defined by assuming $x_2 \ne 0$ and compute, next, the first derivative of the composite function $J_{P_k} = P_k\circ H$:

\[
\begin{aligned}
J_{P_k,1} ={}& 2p\,a_1^{k}\,x_1^{2p-1} + (2p-1)\,a_{12}^{2p-1,1}\,x_1^{2p-2}x_2 + (2p-2)\,a_{12}^{2p-2,2}\,x_1^{2p-3}x_2^{2} + \cdots\\
& + (2p-i)\,a_{12}^{2p-i,i}\,x_1^{2p-i-1}x_2^{i} + \cdots + a_{12}^{1,2p-1}\,x_2^{2p-1}\\
& - \Big(a_{12}^{2p-1,1}\,x_1^{2p-1} + 2\,a_{12}^{2p-2,2}\,x_1^{2p-2}x_2 + \cdots + i\,a_{12}^{2p-i,i}\,x_1^{2p-i}x_2^{i-1} + \cdots + 2p\,a_2^{k}\,x_2^{2p-1}\Big)\frac{x_1}{x_2}.
\end{aligned}
\]

By collecting terms we find:

\[
\begin{aligned}
J_{P_k,1} ={}& \big(2p\,a_1^{k} - 2\,a_{12}^{2p-2,2}\big)\,x_1^{2p-1} + \big((2p-1)\,a_{12}^{2p-1,1} - 3\,a_{12}^{2p-3,3}\big)\,x_1^{2p-2}x_2 + \cdots\\
& + \big((2p-i)\,a_{12}^{2p-i,i} - (i+2)\,a_{12}^{2p-i-2,i+2}\big)\,x_1^{2p-i-1}x_2^{i} + \cdots\\
& + \big(2\,a_{12}^{2,2p-2} - 2p\,a_2^{k}\big)\,x_1x_2^{2p-2} + a_{12}^{1,2p-1}\,x_2^{2p-1} - a_{12}^{2p-1,1}\,\frac{x_1^{2p}}{x_2}.
\end{aligned}
\]

Thus, the condition $J_{P_k,1} \equiv 0$ implies successively, first that

\[ a_{12}^{1,2p-1} = a_{12}^{2p-1,1} = 0 = a_{12}^{2p-3,3} = a_{12}^{2p-5,5} = \cdots = a_{12}^{2p-(2l+1),2l+1} = \cdots = a_{12}^{3,2p-3} = a_{12}^{5,2p-5} = \cdots = a_{12}^{2l+1,2p-(2l+1)} = 0. \]

And then, that

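\[ 2p\,a_1^{k} - 2\,a_{12}^{2p-2,2} = 0 \;\Rightarrow\; a_{12}^{2p-2,2} = p\,a_1^{k}. \]

\[ (2p-2)\,a_{12}^{2p-2,2} - 4\,a_{12}^{2p-4,4} = 0 \;\Rightarrow\; a_{12}^{2p-4,4} = \frac{2p-2}{4}\,a_{12}^{2p-2,2} = \frac{p(p-1)}{2}\,a_1^{k}. \]

\[ (2p-4)\,a_{12}^{2p-4,4} - 6\,a_{12}^{2p-6,6} = 0 \;\Rightarrow\; a_{12}^{2p-6,6} = \frac{p-2}{3}\,a_{12}^{2p-4,4} = \frac{p(p-1)(p-2)}{3!}\,a_1^{k}. \]

\[ (2p-6)\,a_{12}^{2p-6,6} - 8\,a_{12}^{2p-8,8} = 0 \;\Rightarrow\; a_{12}^{2p-8,8} = \frac{p-3}{4}\,a_{12}^{2p-6,6} = \frac{p(p-1)(p-2)(p-3)}{4!}\,a_1^{k} = \frac{p!}{4!\,(p-4)!}\,a_1^{k}. \]

\[ \cdots \]

\[ (2p-2l)\,a_{12}^{2p-2l,2l} - (2l+2)\,a_{12}^{2p-2l-2,2l+2} = 0 \;\Rightarrow\; a_{12}^{2p-2l-2,2l+2} = \frac{p-l}{l+1}\,a_{12}^{2p-2l,2l} = \frac{p!}{(l+1)!\,(p-l-1)!}\,a_1^{k}. \]

\[ \cdots \]

Thus, for $l = p - 1$ we obtain

\[ a_2^{k} = a_{12}^{0,2p} = \frac{1}{p}\,a_{12}^{2,2p-2} = \frac{p!}{p!\,0!}\,a_1^{k} = a_1^{k}. \]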

Since, obviously, we also have that

\[ \lambda = P_k(1,0) = P_k(0,1) = a_1^{k} = a_2^{k}, \]

it follows that

\[ P_k(x_1,x_2) = P_{2p}(x_1,x_2) = \lambda\,(x_1^2 + x_2^2)^p, \]

and the result is valid for $d = 2$.
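The chain of relations above is the familiar recurrence for binomial coefficients: each even coefficient equals the previous one multiplied by $(p - l)/(l + 1)$. A minimal Python sketch, using exact rational arithmetic, confirms that the recurrence reproduces $\binom{p}{l}$, i.e., the coefficients of $(x_1^2 + x_2^2)^p$. The function name `even_coefficients` is an illustrative choice, and the coefficients are normalized by $a_1^k$.

```python
from fractions import Fraction
from math import comb


def even_coefficients(p):
    """Ratios a_12^{2p-2l,2l} / a_1^k for l = 0..p, built from the recurrence."""
    coeffs = [Fraction(1)]                     # l = 0: the coefficient a_1^k itself
    for l in range(p):
        # a^{2p-2l-2, 2l+2} = (p - l)/(l + 1) * a^{2p-2l, 2l}
        coeffs.append(coeffs[-1] * Fraction(p - l, l + 1))
    return coeffs


for p in (1, 2, 3, 6):
    print(p, [int(c) for c in even_coefficients(p)])
```

In particular the last ratio is always 1, which is the identity $a_2^k = a_1^k$ obtained for $l = p - 1$.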


To finish the proof, let us assume that the result holds in every case where the dimension of the sphere equals $d - 2$ and assume 1), i.e., that $P_k = P_k(x_1, x_2, \dots, x_d) = \sum a_{i_1i_2\dots i_k}\,x_{i_1}x_{i_2}\cdots x_{i_k}$ is equal to a non-vanishing constant $\lambda \ne 0$, $\lambda \in \mathbb{R}$, when restricted to $S^{d-1}$. Observe, first of all, that

\[ \lambda = P_k(e_i) = P_{2p}(e_i) = a_i^{2p} = a_i^{k}, \quad\text{for every } i = 1,2,\dots,d, \]

where $e_i = (0,0,\dots,0,1,0,\dots,0)$ represent the elements of the usual Euclidean basis of $\mathbb{R}^d$.

Next, we consider the further restriction of $P_k|_{S^{d-1}}$ to the $(d-2)$-dimensional sphere obtained by intersecting $S^{d-1}$ with the hyperplane $\Pi_d := \{X = (x_1, x_2, \dots, x_d) : x_d = 0\}$, $S^{d-2} = S^{d-1}\cap\Pi_d$.

Obviously, we still have $P_k|_{S^{d-2}} \equiv \lambda \ne 0$, $\lambda \in \mathbb{R}$, so that we may write, by inductive hypothesis, that

\[ P_k|_{\Pi_d} = P_k(x_1, x_2, \dots, x_{d-1}, 0) = \lambda\,(x_1^2 + x_2^2 + \cdots + x_{d-1}^2)^p. \]

It follows that we can also write

\[ P_k = P_k(x_1, x_2, \dots, x_{d-1}, x_d) = \lambda\,(x_1^2 + x_2^2 + \cdots + x_{d-1}^2)^p + \sum_{i_j = d} a_{i_1i_2\dots i_k}\,x_{i_1}x_{i_2}\cdots x_{i_k}, \]

where $\sum_{i_j = d} a_{i_1i_2\dots i_k}\,x_{i_1}x_{i_2}\cdots x_{i_k}$ is a homogeneous polynomial function of degree equal to $k$ with each term containing at least one of the variables equal to the last one in the listing, $x_{i_j} = x_d$. In order to determine the coefficients in the latter polynomial we consider the restriction of $P_k$ to the plane spanned by $e_d$ and any of the other members of the usual basis, say $e_i$, $i \ne d$. By denoting this plane by $\Pi_{i,d}$, we still have that $P_k|_{S^{d-1}\cap\Pi_{i,d}} \equiv \lambda \ne 0$, $\lambda \in \mathbb{R}$, and, by taking into account the proof for the case of dimension $d = 2$ exposed above, we can also write for the restriction to the one dimensional sphere $S^1_{i,d} := S^{d-1}\cap\Pi_{i,d}$ that

\[ P_k|_{S^1_{i,d}} = P_k(0,\dots,0,x_i,0,\dots,0,x_d) = \lambda\,(x_i^2 + x_d^2)^p. \]

Hence, we finally obtain

\[
\begin{aligned}
P_k(x_1, x_2, \dots, x_d) &= \lambda\left[\Big(\sum_{i=1}^{d-1} x_i^2\Big)^{p} + p\,\Big(\sum_{i=1}^{d-1} x_i^2\Big)^{p-1}x_d^{2} + \cdots + \frac{p!}{(p-l)!\,l!}\,\Big(\sum_{i=1}^{d-1} x_i^2\Big)^{p-l}x_d^{2l} + \cdots + x_d^{2p}\right]\\
&= \lambda\,\Big(\sum_{i=1}^{d-1} x_i^2 + x_d^2\Big)^{p} = \lambda\,(x_1^2 + x_2^2 + \cdots + x_d^2)^p,
\end{aligned}
\]

and the lemma is proved.

As we shall see, it is very easy to obtain conclusions when the equivalent conditions of the latter lemma hold. Thus, including this particular possibility as a sub-case, we state the next auxiliary result.

»»d → Lemma 6.3. Let Pk : be a nontrivial homogeneous polynomial function. Then, it follows that: CONSTRAINED LOCAL EXTREMA 27

d −1 d −1 =⊂[]\ a) The image set PSk () is a closed interval in the real line, i.e., PSk () ab, , =≤=() ( ) with aPXkkMAXmin bPX . b) If k is odd, then ab<<0 . c) In case k is even we have the following possible situations regarding the extreme values of the interval: <≤ c)1 0 ab; ≤< c)2 ab0; << c)3 ab0 ; =< c)4 0 ab; <= c)5 ab0.

Remark 6.4. It is easy to construct examples of each of these five cases; this is left to the reader as an exercise.

Proof. We treat the various cases separately:

a) The sphere $S^{d-1}$ is a compact and connected subset of $\mathbb{R}^d$. Then its image under the (continuous) function $P_k$ must also be compact and connected as a subset of $\mathbb{R}$, thus a closed interval $[a, b]$.

b) Since we assumed that $P_k$ is nontrivial, there exists a point $X \in S^{d-1}$ such that $P_k(X) \neq 0$. It follows from (6.1) that $P_k(-X) = (-1)^k P_k(X) = -P_k(X)$, which proves our assertion.

c) It is obvious that the stated cases cover all of the different possibilities for the closed interval $[a, b]$, with $a \leq b$.
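The interval $[a, b]$ of Lemma 6.3 can be estimated numerically before attempting an exact computation. The following is only a minimal sketch for $d = 3$, assuming a plain latitude/longitude grid on $S^2$; the names `sphere_grid` and `image_interval` are hypothetical helpers, not part of the method of this article, and a grid can only approximate the true extreme values.

```python
import math

def sphere_grid(n=100):
    """Yield points of the unit sphere S^2 on a polar/azimuthal grid."""
    for i in range(n + 1):
        theta = math.pi * i / n              # polar angle in [0, pi]
        for j in range(2 * n):
            phi = math.pi * j / n            # azimuth in [0, 2*pi)
            yield (math.sin(theta) * math.cos(phi),
                   math.sin(theta) * math.sin(phi),
                   math.cos(theta))

def image_interval(P, n=100):
    """Approximate [a, b] = P(S^2) by the extreme values of P on the grid."""
    values = [P(x, y, z) for (x, y, z) in sphere_grid(n)]
    return min(values), max(values)

# Quadratic form with positive coefficients (an instance of case c)1).
a, b = image_interval(lambda x, y, z: 5 * x * x + 3 * y * y + 7 * z * z)
print(a, b)
```

For this quadratic form the grid happens to pass (up to rounding) through the coordinate axes, so the estimate is close to the interval $[3, 7]$; for a general polynomial the grid values only bound the interval from the inside.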

Theorem 6.5. (Higher Derivative Test). Let $X_0 = (x_{01}, \dots, x_{0d}) \in U$ be a critical point of the function $f : U \to \mathbb{R}$, $U$ an open subset of $\mathbb{R}^d$, and assume that the first nonvanishing derivative at that point is of $k$-th order, with $k \geq 2$. Then

1) $k$ odd implies that $X_0$ is a saddle point.

2) For $k$ even we have, for the first non-vanishing homogeneous polynomial function $P_k$, the five possibilities described in the previous lemma and, according to them, it follows that:

In case c)1, $f(X_0)$ is a strict local minimum.

In case c)2, $f(X_0)$ is a strict local maximum.

In case c)3, $X_0$ is a saddle point.

In case c)4, $f(X_0)$ is a candidate to furnish a minimum. However, there exists a subset of $S^{d-1}$, $(P_k)^{-1}(0) \cap S^{d-1} \subset S^{d-1}$, symmetric with respect to the origin, with not necessarily a finite number of points, where the $k$-th homogeneous polynomial $P_k$ vanishes, and we need a further, deeper analysis. In fact, the algebraic set $(P_k)^{-1}(0) \cap S^{d-1}$ may be expressed as a finite union of algebraic submanifolds of the sphere $S^{d-1}$, with dimensions strictly less than $d - 1$, and one may need to repeat the analysis on the behavior of the next nonvanishing homogeneous polynomial, say $P_l$, $l > k$, on each one of those submanifolds. By considering the expression of Taylor's formula in every one of the directions determined by that set, we have the following sub-cases, which we label as c)41, c)42, c)43, c)44, c)45, and exhibit next:


c)41 If in the Taylor development of $f$ around $X_0$ there are no further homogeneous polynomials, or if there are further homogeneous polynomials but all of them vanish on the set $(P_k)^{-1}(0) \cap S^{d-1}$, it follows that $X_0$ is a point of non-strict local minimum for $f$. Besides, if only a finite number of homogeneous polynomials vanish on the latter set, and there exists a next one that is non-vanishing there, one proceeds to apply to that polynomial the considerations that follow. In such a case, in order to avoid unnecessary complications in notation, we still call $P_l$ the corresponding homogeneous polynomial.

c)42 If the next nonvanishing homogeneous polynomial, say $P_l$, $l > k$, is of odd order and there exists a point $X_1 \in (P_k)^{-1}(0) \cap S^{d-1}$ such that $P_l(X_1) \neq 0$, it follows that $X_0$ is a saddle point.

c)43 If the next nonvanishing homogeneous polynomial $P_l$ is of even order and $P_l(X) > 0$ for every $X \in (P_k)^{-1}(0) \cap S^{d-1}$, it follows that $X_0$ is a point of strict local minimum.

c)44 If the next nonvanishing homogeneous polynomial $P_l$ is of even order, but there exists some point $X_1 \in (P_k)^{-1}(0) \cap S^{d-1}$ such that $P_l(X_1) < 0$, then $X_0$ is a saddle point.

c)45 If the next nonvanishing homogeneous polynomial $P_l$ is of even order and $P_l(X) \geq 0$ for every $X \in (P_k)^{-1}(0) \cap S^{d-1}$, with strict inequality at some points, but with the existence of at least one point $X_1 \in (P_k)^{-1}(0) \cap S^{d-1}$ such that $P_l(X_1) = 0$, then the expectation that $X_0$ be a local minimum is kept. However, in that case one needs to reiterate the latter analysis resorting, if necessary, to the corresponding further subsidiary problems, by considering first the next nonvanishing homogeneous polynomial, beyond $P_l$, appearing in the Taylor development of the function $f$ at the critical point $X_0$, restricted, on this occasion, to the algebraic set $(P_l)^{-1}(0) \cap (P_k)^{-1}(0) \cap S^{d-1}$. Again, that set may be expressed as a finite union of algebraic submanifolds of the sphere $S^{d-1}$, with dimensions strictly less than $d - 2$.

In case c)5 the situation is the reverse of that in case c)4, i.e., $X_0$ may be a saddle point, a point of strict local maximum, or a point of non-strict local maximum. Finally, since in most cases the dimensions strictly diminish with each new nonvanishing homogeneous polynomial analyzed, except possibly in cases like c)41, we conclude that the procedure is exhausted after a finite number of steps.
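The first step of the test, which only uses the parity of $k$ and the interval $[a, b] = P_k(S^{d-1})$ of Lemma 6.3, can be summarized as a small decision function. This is a sketch under stated assumptions: the function name `first_step_verdict` and the tolerance parameter `tol` are hypothetical, and the values $a$, $b$ are assumed to be known exactly (or to within `tol`).

```python
def first_step_verdict(k, a, b, tol=0.0):
    """Conclusion from the first nonvanishing polynomial P_k alone.

    k    : order of the first nonvanishing derivative (k >= 2)
    a, b : extreme values of P_k on the unit sphere, a <= b
    tol  : values within tol of zero are treated as zero
    """
    if k < 2 or a > b:
        raise ValueError("expected k >= 2 and a <= b")
    if k % 2 == 1:
        return "saddle point"                      # part 1), k odd
    if a > tol:
        return "strict local minimum"              # case c)1
    if b < -tol:
        return "strict local maximum"              # case c)2
    if a < -tol and b > tol:
        return "saddle point"                      # case c)3
    if b > tol:
        return "candidate minimum: examine next polynomial"   # case c)4
    if a < -tol:
        return "candidate maximum: examine next polynomial"   # case c)5
    raise ValueError("P_k vanishes on the sphere, so it is not nontrivial")

print(first_step_verdict(2, 0.0, 5.0))
```

Only cases c)4 and c)5 require the further sub-case analysis c)41 through c)45; the sketch deliberately stops there.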

Proof. We consider the stated cases one by one:

1) By the proof of b) in the previous lemma, we have a point $X \in S^{d-1}$ such that $P_k(X) \neq 0$, and we consider the values of $f$ along the straight line $\{X_0 + t(X - X_0) : t \in \mathbb{R}\}$. We obviously have that

\[ f(X_0 + t(X - X_0)) = f(X_0) + \alpha t^k + \text{higher degree terms in } t, \qquad \alpha \neq 0 . \]

Since we are assuming that $k$ is odd, it follows that in every neighborhood of $X_0$ there are points, say $Y, Z$, such that $f(Y) < f(X_0) < f(Z)$, proving our assertion.
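The sign change along a line used in part 1) can be observed numerically. The sketch below uses a hypothetical function, not taken from this article, whose first nonvanishing term at the origin is the cubic $x^3 - 3xy^2$; the helper name `signs_along_line` is likewise an assumption made for illustration.

```python
def f(x, y):
    # Hypothetical example: P_3 = x^3 - 3*x*y^2 is the first nonvanishing term at (0, 0).
    return x**3 - 3 * x * y**2 + x**4 + y**4

def sign(v):
    return (v > 0) - (v < 0)

def signs_along_line(func, direction, ts):
    """Signs of func(t*V) - func(0) and func(-t*V) - func(0) for each t in ts."""
    vx, vy = direction
    base = func(0.0, 0.0)
    return [(sign(func(t * vx, t * vy) - base),
             sign(func(-t * vx, -t * vy) - base)) for t in ts]

# Direction V = (1, 0), where P_3(V) = 1 is nonzero.
pairs = signs_along_line(f, (1.0, 0.0), [0.1, 0.01, 0.001])
print(pairs)
```

Each pair has opposite signs, so points above and below $f(X_0)$ occur arbitrarily close to the critical point, which is the saddle behavior asserted for odd $k$.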

2) Next, we assume that $k$ is even and consider first the case described as c)1, $0 < a \leq b$. Then it follows that, for every $X \in S^{d-1}$, $P_k(X) > 0$. By analyzing the values of the function $f$, as in the previous case, we can assert that there exists a positive real number, $r > 0$, such that for every $t$ with $0 < |t| < r$ it holds that $f(X_0 + t(X - X_0)) > f(X_0)$. By continuity of $f$ we conclude that there exists another positive real number, $\delta_r > 0$, such that for every point $Y \neq X_0$ within the open cylinder with axis $\{X_0 + t(X - X_0) : t \in (-r, r)\}$ and radius $\delta_r$ it holds that $f(Y) > f(X_0)$. By projecting the cylinder along its axis onto the sphere $S^{d-1}$ we obtain an open $(d-1)$-dimensional (geodesic) ball around $X$, $B_X$. The union of all of those balls, $\bigcup_{X \in S^{d-1}} B_X$, provides an open covering of the sphere $S^{d-1}$. Hence, by compactness of the latter, there exists a finite open sub-covering, with balls, say, $B_{X_1}, B_{X_2}, \dots, B_{X_q}$, and corresponding positive values $r_1, r_2, \dots, r_q$. Let $r_0 = \min\{r_1, r_2, \dots, r_q\}$; then it follows that for every $Y \neq X_0$ within the $d$-ball centered at $X_0$ and radius $r_0$, i.e., $Y \in B(X_0; r_0)$, it also holds that $f(Y) > f(X_0)$, and $X_0$ is a point of strict local minimum as asserted. Alternatively, one could also define $r_0$ above as follows: let $r_0(X) = \min\big(1, \max\{ r : f(X_0 + t(X - X_0)) > f(X_0) \text{ for } 0 < |t| < r \}\big)$. The latter is a positive, continuous function on (the compact set) $S^{d-1}$. Therefore we have, by writing $r_0 := \min\{r_0(X) : X \in S^{d-1}\}$, that $r_0 > 0$.

Similarly, in the case labeled as c)2, $a \leq b < 0$, we have that $f(X_0)$ is a strict local maximum.

In the next case c)3, $a < 0 < b$, we choose two points $X_1, X_2 \in S^{d-1}$ with $P_k(X_1) < 0$ and $P_k(X_2) > 0$. Then, along the ray emanating from $X_0$ and passing through $X_1$ there exists $r_1 > 0$ such that $f(X_0 + t(X_1 - X_0)) < f(X_0)$ for every $t$ with $0 < t < r_1$. In a similar fashion, along the ray emanating from $X_0$ and passing through $X_2$ we have the existence of a positive $r_2 > 0$ such that $f(X_0 + t(X_2 - X_0)) > f(X_0)$ for every $t$ with $0 < t < r_2$. It necessarily follows that within every neighborhood of $X_0$ there are points $Y, Z$ such that $f(Y) < f(X_0) < f(Z)$, i.e., $X_0$ is a saddle point as asserted.

We consider next the case c)4, $0 = a < b$, and observe that the even degree polynomial function $P_k$ cannot be of the form

\[ P_k(X) = P_{2p}(X) = \lambda \left( (x_1 - x_{01})^2 + (x_2 - x_{02})^2 + \cdots + (x_d - x_{0d})^2 \right)^p, \]

by Lemma 6.2. Thus, it is easy to see, first, that the set properly included in the sphere where the $k$-th homogeneous polynomial $P_k$ vanishes cannot be any relatively open subset of the sphere either and, second, that such an algebraic set, denoted by $(P_k)^{-1}(0) \cap S^{d-1} \subset S^{d-1}$, is precisely characterized by the vanishing of the vector function

\[ G = (G_1, G_2) : U \to \mathbb{R}^2, \]

where $G_1(X) := (x_1 - x_{01})^2 + (x_2 - x_{02})^2 + \cdots + (x_d - x_{0d})^2 - 1$ and $G_2 := P_k$, i.e., by the system

\[ \begin{cases} G_1(X) := (x_1 - x_{01})^2 + (x_2 - x_{02})^2 + \cdots + (x_d - x_{0d})^2 - 1 = 0 \\ G_2 := P_k = P_k(X) = P_k\big((x_1 - x_{01}), (x_2 - x_{02}), \dots, (x_d - x_{0d})\big) = 0 \end{cases} \qquad (6.4) \]


Then, it is clear that the non-empty solution set of this system can be expressed as a finite union of algebraic submanifolds of the sphere $S^{d-1}$. All of these submanifolds may have diverse dimensions strictly less than $d - 1$, the extreme cases being those with dimension $0$, i.e., isolated points, and $d - 2$, which occurs if there exist points $X \in G^{-1}(0,0) \subset (P_k)^{-1}(0) \cap S^{d-1}$ where the Jacobian matrix $G'$ attains maximal rank, i.e., $\operatorname{rank}(G'(X)) = 2$. In order to go on with the analysis of this case it is convenient to consider, first, the case where the solution set consists only of a finite number of isolated points, say $X_1, X_2, \dots, X_q$, and we evaluate at every one of them the next homogeneous polynomial which appears as non-vanishing in the Taylor formula of the function. Let us denote that polynomial by

\[ P_l = P_l\big((x_1 - x_{01}), (x_2 - x_{02}), \dots, (x_d - x_{0d})\big) = \sum a_{i_1 i_2 \dots i_l} (x_{i_1} - x_{0 i_1})(x_{i_2} - x_{0 i_2}) \cdots (x_{i_l} - x_{0 i_l}), \qquad (6.5) \]

with $l > k$, $a_{i_1 i_2 \dots i_l} = \frac{1}{l!} f_{i_1 i_2 \dots i_l}(X_0)$. Then, it is easy to verify in this particular instance the parts of the statement of the theorem labeled as sub-cases c)41 through c)45.

Next, let us consider the case where there exist points $X \in G^{-1}(0,0) = (P_k)^{-1}(0) \cap S^{d-1}$ such that the Jacobian matrix $G'$ attains maximal rank, i.e., $\operatorname{rank}(G'(X)) = 2$. As stated previously, we have here the existence of an algebraic submanifold of the solution set of (6.4) with the maximum possible dimension, $d - 2$. It will be clear, from the exposition that follows, that the other cases of existence of submanifolds with dimensions less than that may be treated in a quite similar fashion. Thus, we consider next the

Second Subsidiary Problem: Determine the critical points of $P_l = P_l\big((x_1 - x_{01}), (x_2 - x_{02}), \dots, (x_d - x_{0d})\big)$ with the restriction that $X$ belongs to the differentiable manifold $M^{d-2} := \{X : G(X) = 0,\ \operatorname{rank}(G'(X)) = 2\}$, where $G = (G_1, G_2)$, with $G_1(X) := (x_1 - x_{01})^2 + (x_2 - x_{02})^2 + \cdots + (x_d - x_{0d})^2 - 1$ and $G_2 := P_k$.

It is clear that the above is a $(d-2)$-dimensional algebraic differentiable submanifold contained in the compact, algebraic set $(P_k)^{-1}(0) \cap S^{d-1}$, and that, as in the case of $P_k$, the image of the restriction $P_l|_{S^{d-1}}$ is a closed finite interval, according to the statement in Lemma 6.3. Hence, the image of the further restriction $P_l|_{M^{d-2}}$ is contained in the latter.

In order to continue the argument in the case c)4 for $P_k$, let us consider, now, the case where there exist points $X \in G^{-1}(0,0) \subset (P_k)^{-1}(0) \cap S^{d-1}$ such that the Jacobian matrix $G'$ does not attain maximal rank, i.e., $\operatorname{rank}(G'(X)) < 2$. Since, on the other hand, we are also assuming that the system (6.4) admits a non-empty solution set, it follows that the set satisfying those conditions can be expressed as a finite union of algebraic submanifolds of the sphere $S^{d-1}$ with dimensions strictly less than $d - 2$. Therefore, each of those submanifolds may also be represented as the solution set of a system of equations similar to the one described next:

\[ \begin{cases} G_1(X) := (x_1 - x_{01})^2 + (x_2 - x_{02})^2 + \cdots + (x_d - x_{0d})^2 - 1 = 0 \\ G_2 := P_{k_1} = P_{k_1}(X) = P_{k_1}\big((x_1 - x_{01}), (x_2 - x_{02}), \dots, (x_d - x_{0d})\big) = 0 \\ \qquad \vdots \\ G_{k_r + 1} := P_{k_r} = P_{k_r}(X) = P_{k_r}\big((x_1 - x_{01}), (x_2 - x_{02}), \dots, (x_d - x_{0d})\big) = 0 \end{cases} \qquad (6.6) \]

Here, all of the polynomials appearing on the left hand side have degrees strictly lower than $k$, and the indicated number of equations, $k_r + 1$, is less than or equal to $d$: $3 \leq k_r + 1 \leq d$. Besides, the rank of the corresponding Jacobian matrix is maximal, i.e., equal to $k_r + 1$ and, consequently, the previously determined solution set represents an algebraic submanifold of the unit sphere with dimension $d - (k_r + 1)$, $M^{d - (k_r + 1)} \subset (P_k)^{-1}(0) \cap S^{d-1} \subset S^{d-1}$.

Moreover we can proceed now, in each of those cases, to consider the corresponding subsidiary problem of determining the critical points of the restricted polynomial $P_l|_{M^{d-(k_r+1)}}$, i.e., of the mentioned polynomial $P_l = P_l\big((x_1 - x_{01}), (x_2 - x_{02}), \dots, (x_d - x_{0d})\big)$ with the restriction that $X$ belongs to the latter differentiable manifold $M^{d-(k_r+1)}$, following the same procedure as we did before for the case of $M^{d-2}$. It is also clear that the compact, algebraic set $(P_k)^{-1}(0) \cap S^{d-1}$ is equal to the union of those submanifolds together with the previous one, $M^{d-2}$.

Thus, with a procedure similar to the previous considerations regarding the different possibilities for the behavior of $P_k$ on $S^{d-1}$, we may go on to draw conclusions from the behavior of $P_l$ on the compact, algebraic subset $(P_k)^{-1}(0) \cap S^{d-1}$. For example:

1) for $P_l$. If $l$ is odd and there exists $X \in (P_k)^{-1}(0) \cap S^{d-1}$ such that $P_l(X) \neq 0$, then $X_0$ is a saddle point.

2) for $P_l$. If $l$ is even we have again, for the homogeneous polynomial function $P_l$, five possibilities similar to those described in Lemma 6.3, from which we conclude that:

In case c)1 for $P_l$, $f(X_0)$ is a strict local minimum, because we can make the same kind of reasoning as in the previous case by observing now that, for every $X \in S^{d-1}$, $P_k(X) > 0$ or $P_l(X) > 0$. Hence, by means of a refinement of the argument used in the same case for $P_k(X)$, and by analyzing again the values of the function $f$, one concludes that there exists a positive real number, $r > 0$, such that for every $t$ with $0 < |t| < r$ it holds that $f(X_0 + t(X - X_0)) > f(X_0)$. The rest of the argument follows at once.

In case c)2 for $P_l$, $f(X_0)$ is of saddle type, since it was previously a candidate for a minimum on the basis of the analysis of $P_k$ on $S^{d-1}$, but the further restriction of $P_l$ to $(P_k)^{-1}(0) \cap S^{d-1}$ makes it a prospect for the opposite behavior, i.e., a local maximum.

In case c)3 for $P_l$, $X_0$ is obviously a saddle point. And the same can be said of case c)5 for $P_l$, because there exists $X \in (P_k)^{-1}(0) \cap S^{d-1}$ such that $P_l(X) \neq 0$.


Now, in case c)4 for $P_l$, $f(X_0)$ is kept as a candidate for furnishing a minimum and, once again, there exists a subset $(P_l)^{-1}(0) \cap (P_k)^{-1}(0) \cap S^{d-1} \subset (P_k)^{-1}(0) \cap S^{d-1} \subset S^{d-1}$ where the $l$-th homogeneous polynomial $P_l$ vanishes, as was previously the case with $P_k$. In this instance, it is easy to see, for the same reasons as before (Lemma 6.2), that the even degree polynomial function $P_l$ can neither be of the form

\[ P_l(X) = P_{2p}(X) = \lambda \left( (x_1 - x_{01})^2 + (x_2 - x_{02})^2 + \cdots + (x_d - x_{0d})^2 \right)^p \]

nor be identically vanishing on $(P_k)^{-1}(0) \cap S^{d-1}$. Moreover, $P_l$ cannot vanish on any relatively open subset of $(P_k)^{-1}(0) \cap S^{d-1}$.

Observe that, now, the compact, algebraic set $(P_l)^{-1}(0) \cap (P_k)^{-1}(0) \cap S^{d-1}$ is characterized by the vanishing of the vector function

\[ G = (G_1, G_2, G_3) : U \to \mathbb{R}^3, \]

with $G_1(X) := (x_1 - x_{01})^2 + (x_2 - x_{02})^2 + \cdots + (x_d - x_{0d})^2 - 1$, $G_2 := P_k$, $G_3 := P_l$, i.e., by the system

\[ \begin{cases} G_1(X) := (x_1 - x_{01})^2 + (x_2 - x_{02})^2 + \cdots + (x_d - x_{0d})^2 - 1 = 0 \\ G_2 := P_k = P_k(X) = P_k\big((x_1 - x_{01}), (x_2 - x_{02}), \dots, (x_d - x_{0d})\big) = 0 \\ G_3 := P_l = P_l(X) = P_l\big((x_1 - x_{01}), (x_2 - x_{02}), \dots, (x_d - x_{0d})\big) = 0 \end{cases} \qquad (6.7) \]

Hence, and again as was previously the case after equation (6.4), the non-empty solution set of this system can be expressed as a finite union of algebraic submanifolds of $(P_k)^{-1}(0) \cap S^{d-1} \subset S^{d-1}$. All of these submanifolds may have diverse dimensions strictly less than $d - 2$, the extreme cases being those with dimension $0$, i.e., isolated points, and $d - 3$, which happens if there exist points $X \in G^{-1}(0,0,0) \subset (P_l)^{-1}(0) \cap (P_k)^{-1}(0) \cap S^{d-1}$ where the Jacobian matrix $G'$ attains maximal rank, i.e., for the present case, $\operatorname{rank}(G'(X)) = 3$.

Finally, it is also easy to repeat the procedure as before, from case 1) for $P_l$ on and, in particular, to verify in this instance the similar parts in the statement of the theorem labeled as c)41 through c)45. It is now quite clear that the above procedure may be repeated again and again, until all possibilities are exhausted after a finite number of steps, since in every one of those further steps the dimensions diminish, thus providing the complete proof for the case labeled at the beginning as c)4.

Analogously, it is also clear that the case c)5 may be treated in a completely similar way to the previous case c)4, by simply reversing all of the inequalities; the strong possibility now is that the critical point $X_0$ provides a maximum.

The theorem is proved.

7. Final Examples and Exercises.

Example 7.1. Let us analyze the following function $f(x,y,z) = 5x^2 + 3y^2 + o(2)$, which is already expressed as in Taylor's formula, where the term $o(2)$ involves, as usual, homogeneous polynomials of degree greater than or equal to $3$. Then, the origin with coordinates $(0,0,0)$ is a candidate for an extremum and, in the terminology of the previous theorem, $P_2(x,y,z) = 5x^2 + 3y^2$. Thus, we consider first the subsidiary problem of determining the critical points of $P_2|_{S^2} : S^2 \to \mathbb{R}$, i.e., finding the possible extrema of $P_2$ subject to the restriction $G(x,y,z) = x^2 + y^2 + z^2 - 1 = 0$. Here, the Jacobian matrix may be written $G'(X) = 2(x,y,z)$, and we may assume first that $z \neq 0$. Then, in terms of Theorem 3.1 we further obtain successively

\[ H(x,y) = (x, y, h(x,y)); \qquad J(x,y) = P_2(H(x,y)) = 5x^2 + 3y^2 ; \]
\[ h_x = -\frac{G_x}{G_z} = -\frac{x}{z}; \qquad h_y = -\frac{G_y}{G_z} = -\frac{y}{z} ; \]
\[ J_x = P_{2,x} + P_{2,z} h_x = 10x + 0 \cdot \Big(-\frac{x}{z}\Big) = 10x; \qquad J_y = P_{2,y} + P_{2,z} h_y = 6y + 0 \cdot \Big(-\frac{y}{z}\Big) = 6y . \]

Thus, we have to solve the system of polynomial equations

\[ \begin{cases} 10x = 0 \\ 6y = 0 \\ x^2 + y^2 + z^2 - 1 = 0 \end{cases} \]

Either by simple inspection, or by using Swp2.5, we find the solution set of the latter as:

\[ \{x = 0,\ y = 0,\ z = 1\}; \qquad \{x = 0,\ y = 0,\ z = -1\} . \]

Similarly, for $x \neq 0$, we may write successively

\[ H(y,z) = (h(y,z), y, z); \qquad J(y,z) = P_2(H(y,z)) = 5\,(h(y,z))^2 + 3y^2 ; \]
\[ h_y = -\frac{G_y}{G_x} = -\frac{y}{x}; \qquad h_z = -\frac{G_z}{G_x} = -\frac{z}{x} ; \]
\[ J_y = P_{2,y} + P_{2,x} h_y = 6y + 10x \Big(-\frac{y}{x}\Big) = -4y; \qquad J_z = P_{2,z} + P_{2,x} h_z = 0 + 10x \Big(-\frac{z}{x}\Big) = -10z . \]

The system of polynomial equations to solve here is

\[ \begin{cases} -4y = 0 \\ -10z = 0 \\ x^2 + y^2 + z^2 - 1 = 0 \end{cases} \]

with solution set $\{x = 1,\ y = 0,\ z = 0\}$; $\{x = -1,\ y = 0,\ z = 0\}$.

Finally, for $y \neq 0$, one obtains

\[ H(x,z) = (x, h(x,z), z); \qquad J(x,z) = P_2(H(x,z)) = 5x^2 + 3\,(h(x,z))^2 ; \]
\[ h_x = -\frac{G_x}{G_y} = -\frac{x}{y}; \qquad h_z = -\frac{G_z}{G_y} = -\frac{z}{y} ; \]
\[ J_x = P_{2,x} + P_{2,y} h_x = 10x + 6y \Big(-\frac{x}{y}\Big) = 4x; \qquad J_z = P_{2,z} + P_{2,y} h_z = 0 + 6y \Big(-\frac{z}{y}\Big) = -6z . \]

The system of polynomial equations to solve now is

\[ \begin{cases} 4x = 0 \\ -6z = 0 \\ x^2 + y^2 + z^2 - 1 = 0 \end{cases} \]

with solution set $\{x = 0,\ y = 1,\ z = 0\}$; $\{x = 0,\ y = -1,\ z = 0\}$.

Then, evaluating the polynomial $P_2(x,y,z) = 5x^2 + 3y^2$ at the points obtained gives:

\[ P_2(0,0,1) = 0, \quad P_2(0,0,-1) = 0, \quad P_2(1,0,0) = 5, \quad P_2(-1,0,0) = 5, \quad P_2(0,1,0) = 3, \quad P_2(0,-1,0) = 3, \]

and it follows that $P_2(S^2) = [0, 5]$.
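The chart computations of Example 7.1 all follow one pattern: if $G$ is solved for the variable with index $s$, the reduced derivatives are $J_i = P_{2,i} - P_{2,s}\, G_i / G_s$ for $i \neq s$. The following minimal sketch checks the six points found above against that formula; the helper name `reduced_gradient` and the dictionary `charts` are hypothetical, introduced only for this illustration.

```python
def P2(x, y, z):
    return 5 * x * x + 3 * y * y

def grad_P2(x, y, z):
    return (10 * x, 6 * y, 0.0)

def grad_G(x, y, z):
    # G(x, y, z) = x^2 + y^2 + z^2 - 1
    return (2 * x, 2 * y, 2 * z)

def reduced_gradient(X, s):
    """J_i = P_i - P_s * G_i / G_s for i != s; valid only where G_s(X) != 0."""
    gp, gg = grad_P2(*X), grad_G(*X)
    return tuple(gp[i] - gp[s] * gg[i] / gg[s] for i in range(3) if i != s)

# Key = index of the variable solved for (0: x, 1: y, 2: z); value = points found in that chart.
charts = {
    2: [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)],
    0: [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    1: [(0.0, 1.0, 0.0), (0.0, -1.0, 0.0)],
}

residuals = [reduced_gradient(X, s) for s, pts in charts.items() for X in pts]
values = [P2(*X) for pts in charts.values() for X in pts]
print(residuals)
print(min(values), max(values))
```

All reduced derivatives vanish at the six points, and the extreme values of $P_2$ among them reproduce the endpoints of the interval found in the example.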

Hence, we are in the situation described in Theorem 6.5 as case c)4: the origin is a candidate for a minimum, but we need to analyze higher degree terms at the points on the 2-sphere where $P_2$ vanishes. In the present case there are only two such points, as described above: $(0,0,1)$ and $(0,0,-1)$. Thus, we consider next

Example 7.1.1. Let us modify the previous example by assuming now that the function is expressed by $f(x,y,z) = 5x^2 + 3y^2 + xy^2 + 2xz^2 + z^3 + o(3)$. Then it is easy to see that the homogeneous third degree polynomial $P_3(x,y,z) = xy^2 + 2xz^2 + z^3$ attains the values $1$ and $-1$, respectively, at the points where the previous one, $P_2$, was vanishing. Hence, the function in this particular example has a saddle point at the origin, since we are in the situation described in Theorem 6.5, case c)42.

Example 7.1.2. Let us next modify the previous example, by also adding higher order terms, as follows: $f(x,y,z) = 5x^2 + 3y^2 + xy^2 + 2xz^2 + 5xy^3 + 7xz^3 + 3z^4 + o(4)$. Then, we see that the third degree homogeneous polynomial $P_3(x,y,z) = xy^2 + 2xz^2$ vanishes at the points $(0,0,1)$ and $(0,0,-1)$, which allows us to keep the expectation that the origin may furnish a minimum for the function and, in fact, by analyzing the values of the next, fourth order homogeneous polynomial, i.e., $P_4(x,y,z) = 5xy^3 + 7xz^3 + 3z^4$, we find that $P_4(0,0,1) = P_4(0,0,-1) = 3$. Thus, this belongs to the situation described in Theorem 6.5, cases c)41, second part, and c)43. The function has a strict minimum at the origin.
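The evaluations used in Examples 7.1.1 and 7.1.2 are easy to reproduce. The sketch below assumes the polynomials as read from those two examples; the function names are hypothetical labels chosen for this illustration.

```python
# Points of the 2-sphere where P_2 = 5x^2 + 3y^2 vanishes.
zeros_of_P2 = [(0, 0, 1), (0, 0, -1)]

def P3_ex711(x, y, z):
    # Cubic term of Example 7.1.1.
    return x * y**2 + 2 * x * z**2 + z**3

def P3_ex712(x, y, z):
    # Cubic term of Example 7.1.2.
    return x * y**2 + 2 * x * z**2

def P4_ex712(x, y, z):
    # Quartic term of Example 7.1.2.
    return 5 * x * y**3 + 7 * x * z**3 + 3 * z**4

vals_711 = [P3_ex711(*X) for X in zeros_of_P2]
vals_712_cubic = [P3_ex712(*X) for X in zeros_of_P2]
vals_712_quartic = [P4_ex712(*X) for X in zeros_of_P2]
print(vals_711, vals_712_cubic, vals_712_quartic)
```

An odd-degree term that is nonzero on the zero set of $P_2$ gives a saddle (case c)42), while a cubic that vanishes there followed by a positive quartic is the situation of cases c)41, second part, and c)43.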

Example 7.1.3. By the same token, if we have, for the given function, situations like

7.1.3.1. $f(x,y,z) = 5x^2 + 3y^2 + g(x,y,z)$, where $g(x,y,z)$ is a (not necessarily finite) sum of terms, every one of them containing at least one of the variables $x$ or $y$ as a factor, then, since $g(0,0,1) = g(0,0,-1) = 0$, we are in the situation described in Theorem 6.5, case c)41, and the origin is a point of non-strict minimum for the function. In particular, $f(0,0,z) = 0$ for every $z \in \mathbb{R}$.

7.1.3.2. $f(x,y,z) = 5x^2 + 3y^2 + g(x,y,z) + z^{2p} + \text{higher degree terms}$, where $g(x,y,z)$ is a sum of terms, every one of them containing at least one of the variables $x$ or $y$, with degree strictly greater than $2$ and less than or equal to $2p$, $p$ a natural number, $p \in \mathbb{N}$, $p > 1$; then the origin is a point of strict minimum for the function. Theorem 6.5, cases c)41 and c)43.

7.1.3.3. $f(x,y,z) = 5x^2 + 3y^2 + g(x,y,z) + z^{2p+1} + \text{higher degree terms}$, where $g(x,y,z)$ is a sum of terms, every one of them containing at least one of the variables $x$ or $y$, here with degree strictly greater than $2$ and less than or equal to $2p+1$, $p$ a natural number, $p \in \mathbb{N}$, $p > 1$; then the origin is a saddle point for the function. Theorem 6.5, cases c)41 and c)42.

Exercise 7.2. Prove that every function of the form $f(x,y,z) = 5x^2 + 3y^2 + 7z^2 + o(2)$ has a strict minimum at the origin of coordinates, $(0,0,0)$, by showing that in these cases the image of the 2-sphere, $S^2$, under the homogeneous polynomial $P_2(x,y,z) = 5x^2 + 3y^2 + 7z^2$ is the closed interval $[3, 7]$. Theorem 6.5, case c)1.

Example 7.3. Let us consider now the function $f(x,y,z) = x^2 + 2xy + y^2 + o(2)$, where the first homogeneous polynomial is $P_2(x,y,z) = x^2 + 2xy + y^2$. Again, the subsidiary problem in this case consists in determining the critical points of $P_2|_{S^2} : S^2 \to \mathbb{R}$, i.e., finding the possible extrema of $P_2$ subject to the restriction $G(x,y,z) = x^2 + y^2 + z^2 - 1 = 0$, whose Jacobian matrix is given by $G'(X) = 2(x,y,z)$, and we assume first that $z \neq 0$. Then, in terms of Theorem 3.1 we further obtain

\[ H(x,y) = (x, y, h(x,y)); \qquad J(x,y) = P_2(H(x,y)) = x^2 + 2xy + y^2 ; \]
\[ h_x = -\frac{G_x}{G_z} = -\frac{x}{z}; \qquad h_y = -\frac{G_y}{G_z} = -\frac{y}{z} ; \]
\[ J_x = P_{2,x} + P_{2,z} h_x = 2x + 2y + 0 \cdot \Big(-\frac{x}{z}\Big) = 2x + 2y ; \]
\[ J_y = P_{2,y} + P_{2,z} h_y = 2x + 2y + 0 \cdot \Big(-\frac{y}{z}\Big) = 2x + 2y . \]

Thus, we have to solve the system of polynomial equations

\[ \begin{cases} 2x + 2y = 0 \\ 2x + 2y = 0 \\ x^2 + y^2 + z^2 - 1 = 0 \end{cases} \]

whose solution set we find, either by simple inspection or by using Swp2.5, to be:

\[ \Big\{ x = \tfrac{1}{2}\sqrt{-2z^2 + 2},\ y = -\tfrac{1}{2}\sqrt{-2z^2 + 2},\ z = z \Big\}; \qquad \Big\{ x = -\tfrac{1}{2}\sqrt{-2z^2 + 2},\ y = \tfrac{1}{2}\sqrt{-2z^2 + 2},\ z = z \Big\} . \]

Obviously, this representation of the solution, which is even valid or extendable by continuity to $z = 0$, and/or also included in the next case considered below ($x \neq 0$), provides us with, and may be interpreted as, a parametrization of the 1-dimensional manifold $M^1 := S^2 \cap \{(x,y,z) : x + y = 0\}$, obtained by intersecting the 2-sphere $S^2$ with the plane determined by the equation $x + y = 0$, the parameter $z$ taking values in the interval $[-1, 1]$.

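Similarly, for $x \neq 0$, we may write

\[ H(y,z) = (h(y,z), y, z); \qquad J(y,z) = P_2(H(y,z)) = (h(y,z))^2 + 2\,h(y,z)\,y + y^2 ; \]
\[ h_y = -\frac{G_y}{G_x} = -\frac{y}{x}; \qquad h_z = -\frac{G_z}{G_x} = -\frac{z}{x} ; \]
\[ J_y = P_{2,y} + P_{2,x} h_y = 2x + 2y + (2x + 2y)\Big(-\frac{y}{x}\Big) = 2x - \frac{2y^2}{x} ; \]
\[ J_z = P_{2,z} + P_{2,x} h_z = 0 + (2x + 2y)\Big(-\frac{z}{x}\Big) = -2z - \frac{2yz}{x} . \]

Now, the system of equations to solve is

\[ \begin{cases} 2x - \dfrac{2y^2}{x} = 0 \\[4pt] -2z - \dfrac{2yz}{x} = 0 \\[4pt] x^2 + y^2 + z^2 - 1 = 0 \end{cases} \]

with solution set:

\[ \Big\{ x = \tfrac{1}{2}\sqrt{-2z^2 + 2},\ y = -\tfrac{1}{2}\sqrt{-2z^2 + 2},\ z = z \Big\}; \qquad \Big\{ x = -\tfrac{1}{2}\sqrt{-2z^2 + 2},\ y = \tfrac{1}{2}\sqrt{-2z^2 + 2},\ z = z \Big\}; \]
\[ \Big\{ x = \tfrac{1}{2}\sqrt{2},\ y = \tfrac{1}{2}\sqrt{2},\ z = 0 \Big\}; \qquad \Big\{ x = -\tfrac{1}{2}\sqrt{2},\ y = -\tfrac{1}{2}\sqrt{2},\ z = 0 \Big\} . \]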

Finally, for y ≠ 0 , we may proceed to perform again the computations as above or, otherwise, observe that this problem is symmetric in the variables x, y so that the solution set, obtained after repeating the procedure, is the same as the latter one.


Next, we evaluate the polynomial $P_2(x,y,z) = x^2 + 2xy + y^2$ at the solution set, obtaining: for points on $M^1$, $P_2(x,y,z) = x^2 + 2xy + y^2 = (x + y)^2 = 0$, while for both of the remaining two points we get $P_2(x,y,z) = x^2 + 2xy + y^2 = (\sqrt{2})^2 = 2$.

It follows that the image of the 2-sphere $S^2$ under $P_2$ is $P_2(S^2) = [0, 2]$, i.e., the closed interval $[0, 2]$. Hence, we are in the situation described in Theorem 6.5 as case c)4: the origin is a candidate for a minimum, but we need to analyze higher degree terms at the points on the 2-sphere where $P_2$ vanishes.
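As a cross-check that does not rely on the subsidiary problem, the same interval can be read off from an elementary inequality. The following short derivation is offered only as a sketch supporting the computation above:

```latex
% On S^2 we have x^2 + y^2 + z^2 = 1, hence
\[
0 \;\le\; (x+y)^2 \;=\; 2\,(x^2+y^2) - (x-y)^2 \;\le\; 2\,(x^2+y^2) \;\le\; 2\,(x^2+y^2+z^2) \;=\; 2 .
\]
% Equality on the left holds exactly when x + y = 0 (the set M^1);
% equality on the right holds exactly when x = y and z = 0,
% i.e. at the two points \pm(\sqrt{2}/2, \sqrt{2}/2, 0).
```

Both bounds are attained, which agrees with the closed interval found from the critical points.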

Example 7.3.1. Let us modify the latter example by assuming now that the function to be considered is $f(x,y,z) = x^2 + 2xy + y^2 + x^4 + y^4 + o(4)$. Thus, we can use the information obtained previously and proceed to study the behavior of the homogeneous polynomial $P_4(x,y,z) = x^4 + y^4$ on the subset of the 2-sphere $S^2$ where $P_2$ vanishes, i.e., in the previous notation, the compact, algebraic 1-dimensional submanifold of the 2-sphere, $P_2^{-1}(0) \cap S^2 = M^1$. We may do so by considering the subsidiary problem of finding the critical points of $P_4(x,y,z) = x^4 + y^4$ subject to the restrictions determined by the simultaneous equations $G_1(x,y,z) = G(x,y,z) = x^2 + y^2 + z^2 - 1 = 0$, $G_2(x,y,z) = x + y = 0$. We leave it as an exercise to show that the solution set of this problem consists of the four points:

\[ \Big(\tfrac{1}{2}\sqrt{2}, -\tfrac{1}{2}\sqrt{2}, 0\Big); \quad \Big(-\tfrac{1}{2}\sqrt{2}, \tfrac{1}{2}\sqrt{2}, 0\Big); \quad (0,0,1); \quad (0,0,-1); \]

the values of $P_4$ at those points being:

\[ P_4\Big(\tfrac{1}{2}\sqrt{2}, -\tfrac{1}{2}\sqrt{2}, 0\Big) = P_4\Big(-\tfrac{1}{2}\sqrt{2}, \tfrac{1}{2}\sqrt{2}, 0\Big) = \tfrac{1}{2}; \qquad P_4(0,0,1) = P_4(0,0,-1) = 0 . \]

Then it follows that the image of the compact, algebraic submanifold $M^1$ under $P_4$ is $P_4(M^1) = [0, 1/2]$. Thus, in order to conclude whether or not the origin is a minimum, we would have to analyze higher degree terms of the function, but now only at the two points where $P_4$ vanishes. This situation is quite similar to the one in Example 7.1 and those following it, and the construction of the further, various possibilities is left as an exercise for the interested reader.
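The image of $M^1$ under $P_4$ can also be checked by sampling a parametrization of the circle. The sketch below assumes the parametrization $(s, -s, \pm\sqrt{1 - 2s^2})$ with $|s| \leq \sqrt{2}/2$, which is one convenient choice and not the one used in the text; the name `M1_points` is hypothetical.

```python
import math

def P4(x, y, z):
    return x**4 + y**4

def M1_points(n=1000):
    """Sample M^1 = S^2 ∩ {x + y = 0} as (s, -s, ±sqrt(1 - 2 s^2))."""
    s_max = math.sqrt(0.5)
    for i in range(n + 1):
        s = -s_max + 2.0 * s_max * i / n
        # Guard against tiny negative values caused by rounding at the endpoints.
        z = math.sqrt(max(0.0, 1.0 - 2.0 * s * s))
        yield (s, -s, z)
        yield (s, -s, -z)

values = [P4(*X) for X in M1_points()]
print(min(values), max(values))
```

On this parametrization $P_4 = 2s^4$, so the sampled extreme values approximate the endpoints $0$ (at $s = 0$) and $1/2$ (at $s = \pm\sqrt{2}/2$).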

Example 7.4. We increase now the ambient space dimension and consider the function initially given by $f(w,x,y,z) = x^2 + 2xy + y^2 + o(2)$. Homogeneous polynomial: $P_2(w,x,y,z) = x^2 + 2xy + y^2$. Subsidiary problem: determine the critical points of $P_2|_{S^3} : S^3 \to \mathbb{R}$, i.e., find the critical points of $P_2$ subject to the restriction $G(w,x,y,z) = w^2 + x^2 + y^2 + z^2 - 1 = 0$, whose Jacobian matrix is given by $G'(X) = 2(w,x,y,z)$, and we analyze first the region where $z \neq 0$. By Theorem 3.1 we write

\[ H(w,x,y) = (w, x, y, h(w,x,y)); \qquad J(w,x,y) = P_2(H(w,x,y)) = x^2 + 2xy + y^2 ; \]
\[ h_w = -\frac{G_w}{G_z} = -\frac{w}{z}; \qquad h_x = -\frac{G_x}{G_z} = -\frac{x}{z}; \qquad h_y = -\frac{G_y}{G_z} = -\frac{y}{z} ; \]
\[ J_w = P_{2,w} + P_{2,z} h_w = 0 + 0 \cdot \Big(-\frac{w}{z}\Big) = 0 ; \]
\[ J_x = P_{2,x} + P_{2,z} h_x = 2x + 2y + 0 \cdot \Big(-\frac{x}{z}\Big) = 2x + 2y ; \]
\[ J_y = P_{2,y} + P_{2,z} h_y = 2x + 2y + 0 \cdot \Big(-\frac{y}{z}\Big) = 2x + 2y . \]

Thus, we have to solve the system of polynomial equations

\[ \begin{cases} 0 = 0 \\ 2x + 2y = 0 \\ 2x + 2y = 0 \\ w^2 + x^2 + y^2 + z^2 - 1 = 0 \end{cases} \]

The solution set here is:

\[ \begin{cases} \Big\{ w = w,\ x = \tfrac{1}{2}\sqrt{-2z^2 - 2w^2 + 2},\ y = -\tfrac{1}{2}\sqrt{-2z^2 - 2w^2 + 2},\ z = z \Big\} \\[4pt] \Big\{ w = w,\ x = -\tfrac{1}{2}\sqrt{-2z^2 - 2w^2 + 2},\ y = \tfrac{1}{2}\sqrt{-2z^2 - 2w^2 + 2},\ z = z \Big\} \end{cases} \qquad (7.1) \]

It is rather obvious, for reasons of symmetry, that we get the same result in the region where $w \neq 0$. On the other hand, when $P_2(w,x,y,z) = x^2 + 2xy + y^2$ is applied to the elements of this solution one obtains, obviously, $P_2(w,x,y,z) = (x + y)^2 = 0$.

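Let us analyze next the regions where $x \neq 0$:

\[ H(w,y,z) = (w, h(w,y,z), y, z); \qquad J(w,y,z) = P_2(H(w,y,z)) = (h(w,y,z))^2 + 2\,h(w,y,z)\,y + y^2 ; \]
\[ h_w = -\frac{G_w}{G_x} = -\frac{w}{x}; \qquad h_y = -\frac{G_y}{G_x} = -\frac{y}{x}; \qquad h_z = -\frac{G_z}{G_x} = -\frac{z}{x} ; \]
\[ J_w = P_{2,w} + P_{2,x} h_w = 0 + (2x + 2y)\Big(-\frac{w}{x}\Big) = -2w - \frac{2yw}{x} ; \]
\[ J_y = P_{2,y} + P_{2,x} h_y = 2x + 2y + (2x + 2y)\Big(-\frac{y}{x}\Big) = 2x - \frac{2y^2}{x} ; \]
\[ J_z = P_{2,z} + P_{2,x} h_z = 0 + (2x + 2y)\Big(-\frac{z}{x}\Big) = -2z - \frac{2yz}{x} . \]

Therefore, the system of equations to solve is

\[ \begin{cases} -2w - \dfrac{2yw}{x} = 0 \\[4pt] 2x - \dfrac{2y^2}{x} = 0 \\[4pt] -2z - \dfrac{2yz}{x} = 0 \\[4pt] w^2 + x^2 + y^2 + z^2 - 1 = 0 \end{cases} \]

whose solution set is the union of the ones described in the following equations:

\[ \begin{cases} \big\{ w = \sqrt{-2y^2 - z^2 + 1},\ x = -y,\ y = y,\ z = z \big\} \\[2pt] \big\{ w = -\sqrt{-2y^2 - z^2 + 1},\ x = -y,\ y = y,\ z = z \big\} \end{cases} \qquad (7.2) \]

\[ \begin{cases} \Big\{ w = 0,\ x = \tfrac{1}{2}\sqrt{-2z^2 + 2},\ y = -\tfrac{1}{2}\sqrt{-2z^2 + 2},\ z = z \Big\} \\[4pt] \Big\{ w = 0,\ x = -\tfrac{1}{2}\sqrt{-2z^2 + 2},\ y = \tfrac{1}{2}\sqrt{-2z^2 + 2},\ z = z \Big\} \end{cases} \qquad (7.3) \]

\[ \begin{cases} \Big\{ w = 0,\ x = \tfrac{1}{2}\sqrt{2},\ y = \tfrac{1}{2}\sqrt{2},\ z = 0 \Big\} \\[4pt] \Big\{ w = 0,\ x = -\tfrac{1}{2}\sqrt{2},\ y = -\tfrac{1}{2}\sqrt{2},\ z = 0 \Big\} \end{cases} \qquad (7.4) \]

\[ \begin{cases} \Big\{ w = 0,\ x = \tfrac{1}{2}\sqrt{2},\ y = -\tfrac{1}{2}\sqrt{2},\ z = 0 \Big\} \\[4pt] \Big\{ w = 0,\ x = -\tfrac{1}{2}\sqrt{2},\ y = \tfrac{1}{2}\sqrt{2},\ z = 0 \Big\} \end{cases} \qquad (7.5) \]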

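If we analyze now the regions where $y \neq 0$ we obtain, for reasons of symmetry of the four variables involved, by interchanging $x \leftrightarrow y$ and $w \leftrightarrow z$, the following additional solution sets:

\[ \begin{cases} \big\{ w = w,\ x = x,\ y = -x,\ z = \sqrt{-2x^2 - w^2 + 1} \big\} \\[2pt] \big\{ w = w,\ x = x,\ y = -x,\ z = -\sqrt{-2x^2 - w^2 + 1} \big\} \end{cases} \qquad (7.6) \]

\[ \begin{cases} \Big\{ w = w,\ x = \tfrac{1}{2}\sqrt{-2w^2 + 2},\ y = -\tfrac{1}{2}\sqrt{-2w^2 + 2},\ z = 0 \Big\} \\[4pt] \Big\{ w = w,\ x = -\tfrac{1}{2}\sqrt{-2w^2 + 2},\ y = \tfrac{1}{2}\sqrt{-2w^2 + 2},\ z = 0 \Big\} \end{cases} \qquad (7.7) \]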


It is easy to see that the polynomial $P_2(w,x,y,z) = x^2 + 2xy + y^2 = (x + y)^2$ vanishes on almost all of the subsets above, except at the points $\big(0, -\tfrac{\sqrt{2}}{2}, -\tfrac{\sqrt{2}}{2}, 0\big)$ and $\big(0, \tfrac{\sqrt{2}}{2}, \tfrac{\sqrt{2}}{2}, 0\big)$, described in (7.4), where it takes the value $P_2\big(0, -\tfrac{\sqrt{2}}{2}, -\tfrac{\sqrt{2}}{2}, 0\big) = P_2\big(0, \tfrac{\sqrt{2}}{2}, \tfrac{\sqrt{2}}{2}, 0\big) = 2$. Therefore, the image of $S^3$ under $P_2$ is the closed interval $P_2(S^3) = [0, 2]$ and, consequently, the origin $(0,0,0,0)$ is a candidate for a minimum, but one needs to analyze higher order homogeneous polynomials at the points of the sphere $S^3$ where $P_2$ vanishes, i.e., on the set $S^3 \cap P_2^{-1}(0)$, which is easily seen to be the union of a compact, 2-dimensional algebraic submanifold of $S^3$, say $M^2$, described in (7.1), (7.2) and (7.6) above, the union of two 1-dimensional ones, $M^1$, described in (7.3) and (7.7), and two isolated points, $\big(0, \tfrac{\sqrt{2}}{2}, -\tfrac{\sqrt{2}}{2}, 0\big)$ and $\big(0, -\tfrac{\sqrt{2}}{2}, \tfrac{\sqrt{2}}{2}, 0\big)$, given in (7.5). However, observe too that those two points and $M^1$ are subsets of $M^2$, so that for analyzing further higher degree homogeneous polynomials it is enough to consider only the latter set.
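The description of the zero set can be spot-checked on a few sample points. The sketch below assumes the family (7.2) in the form $w = \pm\sqrt{1 - 2y^2 - z^2}$, $x = -y$, and the two points of (7.4); the helper names `family_7_2` and `on_sphere` are hypothetical.

```python
import math

def P2(w, x, y, z):
    return x * x + 2 * x * y + y * y

def on_sphere(X, tol=1e-12):
    """True if X lies on the unit sphere S^3 up to rounding."""
    return abs(sum(c * c for c in X) - 1.0) < tol

def family_7_2(y, z, sign=1):
    """Point of family (7.2): w = ±sqrt(1 - 2 y^2 - z^2), x = -y."""
    return (sign * math.sqrt(1.0 - 2.0 * y * y - z * z), -y, y, z)

# A few parameter values with 1 - 2 y^2 - z^2 >= 0.
samples = [family_7_2(y, z, s)
           for y in (-0.5, 0.0, 0.3)
           for z in (-0.4, 0.0, 0.6)
           for s in (1, -1)]

r = math.sqrt(2.0) / 2.0
maximizers = [(0.0, r, r, 0.0), (0.0, -r, -r, 0.0)]   # the two points of (7.4)

print(all(on_sphere(X) for X in samples + maximizers))
print([P2(*X) for X in maximizers])
```

The sampled points of (7.2) lie on $S^3$ with $P_2 = 0$, while the two points of (7.4) give the value $2$ up to rounding, in agreement with the interval $[0, 2]$.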

Exercise 7.4.1.1. Prove that the functions of the general form

\[ f(w,x,y,z) = x^2 + 2xy + y^2 + z^{2p+1} + o(2p+1), \]

$p \in \mathbb{N}$, $p \geq 1$, have a saddle point at the origin of coordinates $(0,0,0,0)$. Theorem 6.5, case c)42.

Exercise 7.4.1.2. Prove that the functions of the general form

\[ f(w,x,y,z) = x^2 + 2xy + y^2 + z^{2p+2} + o(2p+2), \]

with $p \in \mathbb{N}$, $p \geq 1$, do not necessarily have a point of strict minimum at the origin of coordinates $(0,0,0,0)$. Hint: look carefully at the situation described in Theorem 6.5, case c)43.

Exercise 7.4.1.3. Prove that the functions of the general form

\[ f(w,x,y,z) = x^2 + 2xy + y^2 + g(w,x,y,z), \]

where $g(w,x,y,z)$ is a sum of homogeneous polynomials of degrees greater than two, every one of them containing the monomial $(x + y)$ as a multiplicative factor, fulfill the situation described in Theorem 6.5, case c)41, so that the origin is a point of non-strict minimum for those functions and, in particular, $f(w,x,y,z) = 0$ for every $(w,x,y,z)$ such that $x + y = 0$. Observation: the sum indicated for $g(w,x,y,z)$ could contain only a finite number of terms, i.e., be a polynomial function, or else consist in part of a convergent power series, for example $g(w,x,y,z) = wz \sum_{n=1}^{\infty} \frac{1}{n!} (x + y)^n$.

Exercise 7.4.1.4. Prove that the function $f(w,x,y,z) = x^2 + 2xy + y^2 + xz^3 + z^4 + o(4)$ exhibits a saddle point at the origin. Hint: Theorem 6.5, case c)44.

Example 7.4.2. Let us assume next that $f(w,x,y,z) = x^2 + 2xy + y^2 + z^4(x^2 + y^2) + o(6)$.

Here the homogeneous polynomial to analyze is $P_6(w,x,y,z) = z^4(x^2 + y^2)$. In this case, in view of the result in Example 7.4, mainly the observation at the end, we have to consider the following subsidiary problem: to determine the critical points of $P_6|_{M^2} : M^2 \to \mathbb{R}$. It is easy to see, by equations (7.1), (7.2) and (7.6) above, that $M^2$ is given by $M^2 = \{(w,x,y,z) : w^2 + x^2 + y^2 + z^2 - 1 = 0,\ x + y = 0\}$, so that we consider the corresponding problem of determining the critical points of the homogeneous polynomial $P_6$ when restricted by the equations $G_1(w,x,y,z) = w^2 + x^2 + y^2 + z^2 - 1 = 0$ and $G_2(w,x,y,z) = x + y = 0$.

We leave the calculation of the above subsidiary problem, by using Theorem 3.1, as an exercise for the reader, and observe that there exist points on $M^2$ where $P_6$ also vanishes, so that in order to follow up with the analysis of the case one would need to consider even higher order homogeneous polynomials, provided these are known.

Exercise 7.4.3. Let $f(w,x,y,z) = x^2 + 2xy + y^2 + z^4(x^2+y^2) + z^4w^4 + w^8 + o(8)$. Follow up the analysis of the case, by considering now the homogeneous polynomial $P_8(w,x,y,z) = z^4w^4 + w^8$, restricted to the set $P_6^{-1}(0) \cap M^2$, by means of the corresponding subsidiary problems as needed.

Exercise 7.5. Consider the two similar problems $f(w,x,y,z) = x^4 + y^4 + z^6 + x^2z^4 + w^8 + o(8)$ and $f(w,x,y,z) = x^4 + y^4 + z^6 + x^2z^4 + w^9 + w^3x^3z^3 + o(9)$. Determine, in each case, whether the origin is a maximum, minimum or saddle point.

Hint: in the second step of both cases you have to determine the points where the polynomial functions $G_1(w,x,y,z) = w^2+x^2+y^2+z^2-1$ and $G_2(w,x,y,z) = x^4+y^4$ vanish simultaneously. Let $G = (G_1, G_2)$, and observe that there exist no points $X = (w,x,y,z)$ that fulfill both conditions $G(X) = 0$ and $\operatorname{rank}(G'(X)) = 2$. In fact observe too that, when restricted to the real field, $x^4+y^4 = 0$ if, and only if, $x = 0$ and $y = 0$, so that the solution set of $G_1 = 0$, $G_2 = 0$ is the same as that of $G_1(w,x,y,z) = w^2+x^2+y^2+z^2-1 = 0$, $x = 0$, $y = 0$. Verify that the rank of the Jacobian associated to the latter system of equations is maximal, equal to 3, so that it defines a one-dimensional compact, algebraic submanifold of the unit sphere. Therefore, in the following step you have to study the behavior of the next nonvanishing homogeneous polynomial restricted to that submanifold.
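The two rank statements in the hint can be spot-checked at a few points of the solution set. The sketch below is only illustrative: the helper `rank`, the function names and the chosen sample points are assumptions made here, and checking finitely many points is not a proof. It uses exact rational arithmetic and writes out by hand the Jacobians of $G = (G_1, G_2)$ and of the system $G_1 = 0$, $x = 0$, $y = 0$.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a small matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def jac_g(w, x, y, z):
    """Jacobian of G = (w^2 + x^2 + y^2 + z^2 - 1, x^4 + y^4)."""
    return [[2*w, 2*x, 2*y, 2*z],
            [0, 4*x**3, 4*y**3, 0]]

def jac_reduced(w, x, y, z):
    """Jacobian of the equivalent system (G_1, x, y)."""
    return [[2*w, 2*x, 2*y, 2*z],
            [0, 1, 0, 0],
            [0, 0, 1, 0]]

# Sample points with x = y = 0 and w^2 + z^2 = 1 (chosen for illustration).
points = [(1, 0, 0, 0), (0, 0, 0, 1), (Fraction(3, 5), 0, 0, Fraction(4, 5))]

for p in points:
    print(p, "rank G' =", rank(jac_g(*p)), "| rank reduced =", rank(jac_reduced(*p)))
```

At every solution point the second row of $G'$ is zero, because $x = y = 0$ there, so the rank of $G'$ drops to 1, while the reduced system keeps rank 3.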

Exercise 7.6. Consider the function $F(x_1,x_2) = x_2^4 - x_1^4 - x_2^8 + x_1^{10}$ exhibited as Example 1 in the article by E. Constantin [4]. By observing that, in our terminology, $P_4(x_1,x_2) = x_2^4 - x_1^4$, prove that the origin $(0,0)$ is a saddle point, since $P_4(1,0) = -1$ and $P_4(0,1) = 1$ (case c)3 in Theorem 6.5), regardless of what the terms of degree higher than 4 may be (cf. [4], pp. 45-46).
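The sign change in Exercise 7.6 can be illustrated numerically. This is a minimal sketch, not a proof: it assumes that the function reads $F(x_1,x_2) = x_2^4 - x_1^4 - x_2^8 + x_1^{10}$, with lowest order part $P_4(x_1,x_2) = x_2^4 - x_1^4$, and evaluates $F$ along the two coordinate axes for a few small values of a parameter `t` chosen here for illustration.

```python
def p4(x1, x2):
    """Lowest order homogeneous part: x2^4 - x1^4."""
    return x2**4 - x1**4

def big_f(x1, x2):
    """F as assumed in this sketch: x2^4 - x1^4 - x2^8 + x1^10."""
    return x2**4 - x1**4 - x2**8 + x1**10

# P_4 takes both signs on the unit circle.
print("P_4(1, 0) =", p4(1, 0), " P_4(0, 1) =", p4(0, 1))

# Close to the origin F inherits the sign of P_4 along each axis.
for t in (0.5, 0.1, 0.01):
    print(f"t = {t}: F(t, 0) = {big_f(t, 0):.3e}, F(0, t) = {big_f(0, t):.3e}")
```

Since $F$ is negative along the $x_1$ axis and positive along the $x_2$ axis arbitrarily close to the origin, where $F(0,0) = 0$, the origin can be neither a local maximum nor a local minimum.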

Exercise 7.7. Consider the following related problems

f ()tuvwxyztvtwtxtytzuvuwuxuy,,, ,,, =+22 22 ++++ 22 22 22 22 + 22 + 22 + 22 + 7.7.1. +++++++++uz224 v 2 vw 222222224 vx vy vz w wx 2222 wy + ++++++−+wz22 7 x 6 4 y 6 2 tu 24 xy 6 8 xz 6 6 z 7 o() 7

f ()tuvwxyztvtwtxtytzuvuwuxuy,,, ,,, =+22 22 ++++ 22 22 22 22 + 22 + 22 + 22 + 7.7.2. +++++++++uz224 v 2 vw 222222224 vx vy vz w wx 2222 wy + ++++++++wz22 7 x 6 4 y 6 2 tu 24 xy 6 8 xz 6 6 z 10 o() 10

f ()tuvwxyztvtwtxtytzuvuwuxuy,,, ,,, =+22 22 ++++ 22 22 22 22 + 22 + 22 + 22 + 7.7.3. +++++++++uz224 v 2 vw 222222224 vx vy vz w wx 2222 wy + +−wz22 2 tuv 222 ++ vz 6 3 wxy 33 +−+ xy 6 8 xz 6 5 u 10 + o()10

Determine, in each case, if the origin is a maximum, minimum or saddle point.

References

[1] T.M. Apostol, Calculus, Volume II, Blaisdell, 1967.
[2] J.V. Baxley and J.C. Moorhouse, Lagrange Multiplier Problems in Economics, American Mathematical Monthly, Vol. 91, No. 7, (1984), 404-412.
[3] C. Caratheodory, Calculus of Variations and Partial Differential Equations II, Holden-Day, 1965.
[4] E. Constantin, Higher Order Necessary Conditions in Smooth Constrained Optimization, Contemporary Mathematics, Vol. 479, (2009), 41-49.
[5] W.H. Fleming, Functions of Several Variables, Addison-Wesley, 1965.
[6] S. Gigena, M. Binia, D. Joaquín, D. Abud, E. Cabrera, Análisis Matemático II: Teoría Práctica y Aplicaciones, Galeón Editorial, 1999.
[7] S. Gigena, M. Binia, D. Abud, Extremos Condicionados: Una propuesta metodológica para su resolución, Revista de Educación Matemática, Vol. 16, Nº 3, (2001), 31-53.
[8] S. Gigena, M. Binia, D. Abud, Puntos Críticos Condicionados, Acta Latinoamericana de Matemática Educativa, Vol. 15, Tomo II, (2002), 998-1003.
[9] S. Gigena, Extremos Condicionados sin Multiplicadores de Lagrange, Acta Latinoamericana de Matemática Educativa, Vol. 19, (2006), 150-155.
[10] S. Gigena, Constrained Local Extrema without Lagrange Multipliers, workshop, International Congress on Mathematical Education, ICME-11, Monterrey, N.L., México, 06-13 July, 2008.
[11] J. Marsden and A.J. Tromba, Vector Calculus, 2nd edition, Freeman, 1976.
[12] Y. Murata, Mathematics for Stability and Optimization of Economic Systems, Academic Press, 1977.
[13] J. Nunemacher, Lagrange Multipliers can fail to determine Extrema, The College Mathematics Journal, Vol. 34, No. 1, (2003), 60-62.
[14] D. Spring, On the Second Derivative Test for Constrained Local Extrema, American Mathematical Monthly, (1985), 631-643.
[15] R.E. Williamson, R.H. Crowell and H.F. Trotter, Calculus of Vector Functions, 3rd edition, Prentice Hall, 1972.
[16] F. Zizza, Differential Forms for Constrained Max-Min Problems: Eliminating Lagrange Multipliers, The College Mathematics Journal, Vol. 29, No. 5, (1998), 387-396.

Departamento de Matemáticas, Facultad de Ciencias Exactas, Físicas y Naturales – Universidad Nacional de Córdoba, Ciudad Universitaria, Córdoba, ARGENTINA – e-mail: [email protected]

Departamento de Matemáticas, Facultad de Ciencias Exactas, Ingeniería y Agrimensura – Universidad Nacional de Rosario, Rosario, ARGENTINA – e-mail: [email protected]