ECON0702: Mathematical Methods in Economics

Yulei Luo

SEF of HKU

February 15, 2009

Constrained Static Optimization

So far we have focused on finding the maximum or minimum value of a function without restricting the choice variables. In many economic problems, however, the choice variables are constrained by economic considerations. For example, consumers maximize their utility functions subject to budget constraints, and a firm minimizes its cost of production subject to the constraint of its production technology. Such constraints may lower the maximum (or raise the minimum) of the objective function being maximized (or minimized): because we are not able to choose freely among all choice variables, the objective function may not be as large as it could otherwise be. A constraint is said to be nonbinding if we could obtain the same level of the objective function with or without imposing it.

Finding the Stationary Values

For illustration, consider a consumer's choice problem: maximize the utility function

u(x1, x2) = x1 x2 + 2 x1,    (1)

subject to the budget constraint

4 x1 + 2 x2 = 60.    (2)

For this simple constrained problem, we can solve by substituting the budget constraint into the objective utility function, without any new technique. The budget constraint can be rewritten as x2 = (60 − 4 x1)/2 = 30 − 2 x1, which combined with the utility function gives

u(x1) = x1 (30 − 2 x1) + 2 x1.    (3)

Setting du/dx1 = 32 − 4 x1 = 0, we get x1 = 8 and hence x2 = 14. Since d²u/dx1² = −4 < 0, the stationary value constitutes a constrained maximum.
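As a quick numerical sanity check of the substitution approach (a sketch, not part of the original slides), we can eliminate x2 via the budget constraint and maximize the resulting one-variable function by a coarse grid search:

```python
# Eliminate x2 using the budget constraint 4*x1 + 2*x2 = 60,
# then maximize the resulting one-variable utility u(x1).
def u_substituted(x1):
    x2 = 30 - 2 * x1          # x2 = (60 - 4*x1)/2
    return x1 * x2 + 2 * x1   # u(x1, x2) = x1*x2 + 2*x1

# Grid search over feasible x1 in [0, 15] (x2 >= 0 requires x1 <= 15).
best_x1 = max((i / 100 for i in range(1501)), key=u_substituted)
print(best_x1, 30 - 2 * best_x1)  # expect x1 = 8, x2 = 14
```

Since u(x1) is a concave quadratic, the grid point x1 = 8 attains the exact maximum found analytically above.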

The Lagrange Multiplier Method

However, when the constraint is itself a complicated function, or when it cannot be solved to express one variable as an explicit function of the others, the technique of substitution and elimination is not enough to solve the constrained optimization problem. We therefore introduce a new method, the Lagrange multiplier (LM) method. The essence of the LM method is to convert a constrained extremum problem into a form to which the FOC of the free extremum problem can still be applied. In general, given an objective function

z = f(x, y)    (4)

subject to the constraint

g(x, y) = c, where c is a constant,    (5)

we can write the Lagrange function as follows:

Z = L(x, y, λ) = f(x, y) + λ[c − g(x, y)].    (6)

(conti.) The symbol λ is called a Lagrange multiplier; its value will be determined and discussed later. If c − g(x, y) = 0 always holds, the last term of Z vanishes regardless of the value of λ. Hence, finding the constrained maximum value of z is equivalent to finding a critical value of Z. The question now is: how can we make the bracketed expression in (6) vanish? Let us proceed to do so, treating λ as an additional choice variable (in addition to x and y). From the Lagrange function (6), the FOCs are

∂L/∂x = fx − λ gx = 0,    (7)
∂L/∂y = fy − λ gy = 0,    (8)
∂L/∂λ = c − g(x, y) = 0.    (9)

The final equation automatically guarantees satisfaction of the constraint. Since λ[c − g(x, y)] = 0 at such a point, the stationary values of Z in (6) must be identical to those of (4) subject to (5).

Reconsider the consumer's choice problem above. First, we can write the Lagrange function as

Z = L(x1, x2, λ) = x1 x2 + 2 x1 + λ[60 − (4 x1 + 2 x2)].    (10)

The FOCs are:

∂L/∂x1 = x2 + 2 − 4λ = 0,    (11)
∂L/∂x2 = x1 − 2λ = 0,    (12)
∂L/∂λ = 60 − (4 x1 + 2 x2) = 0.    (13)

Solving the above equations for the critical values gives

x1 = 8, x2 = 14, and λ = 4. (14)
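The three FOCs are linear in (x1, x2, λ), so they can be solved by simple elimination. A minimal Python sketch (not from the slides) that mirrors this elimination:

```python
# Solve the linear FOCs of the Lagrangian by substitution:
#   x2 + 2 - 4λ = 0,  x1 - 2λ = 0,  4*x1 + 2*x2 = 60.
# From (12): x1 = 2λ; from (11): x2 = 4λ - 2.
# Budget (13): 4*(2λ) + 2*(4λ - 2) = 60  =>  16λ = 64.
lam = 64 / 16
x1, x2 = 2 * lam, 4 * lam - 2
print(x1, x2, lam)  # 8.0 14.0 4.0
```

This reproduces the critical values in (14), including λ = 4 as a by-product.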

Summary of the Procedure

Step 1: Form the Lagrange function

Z = L(x, y, λ) = f(x, y) + λ[c − g(x, y)].    (15)

Step 2: Find the critical points of the Lagrangian function L(x, y, λ) by computing ∂L/∂x, ∂L/∂y, and ∂L/∂λ and setting each equal to 0 to solve for the optimal (x, y, λ):

∂L/∂x = 0,  ∂L/∂y = 0,  and  ∂L/∂λ = 0.

Note that since λ just multiplies the constraint in the definition of L, the equation ∂L/∂λ = 0 is equivalent to the constraint c − g(x, y) = 0. Note also that by introducing the Lagrange multiplier λ into the constrained problem, we have transformed a two-variable constrained problem into the three-variable unconstrained problem of finding the critical points of a function L(x, y, λ).

Total Differential Approach

In the discussion of the free extremum of z = f(x, y), we learned that the necessary FOC can be stated in terms of the total differential dz:

dz = fx dx + fy dy = 0.    (16)

This statement remains valid after adding the constraint g(x, y) = c. However, with the constraint we can no longer take both dx and dy as arbitrary changes, because dx and dy are now dependent on each other:

g(x, y) = c  ⇒  dg = gx dx + gy dy = 0.    (17)

(conti.) The FOC in terms of the total differential becomes

dz = 0 subject to g(x, y) = c and gx dx + gy dy = 0. (18)

In order to satisfy this necessary FOC, we must have

fx / gx = fy / gy,    (19)

which together with the constraint g(x, y) = c provides two equations to solve for the critical values of x and y. Hence the total differential approach yields the same FOCs as the Lagrange multiplier method. Note that the LM method gives the value of λ as a direct by-product.

An Interpretation of the Lagrange Multiplier

The Lagrange multiplier λ measures the sensitivity of Z* (the value of Z at the optimum) to a change in the constraint. In other words, it gives us a measure of the value of the scarce resource: the effect of an increase in c indicates how the optimal solution is affected by a relaxation of the constraint. Suppose we can express the optimal values (x*, y*, λ*) as implicit functions of c:

x* = x*(c),  y* = y*(c),  λ* = λ*(c),

all of which have continuous derivatives. Further, we have the following identities:

fx(x*, y*) − λ* gx(x*, y*) = 0,    (20)
fy(x*, y*) − λ* gy(x*, y*) = 0,    (21)
c − g(x*, y*) = 0.    (22)

(conti.) The Lagrange function at the optimum can be written as

Z* = L(x*, y*, λ*) = f(x*, y*) + λ*[c − g(x*, y*)],    (23)

which means that

dZ*/dc = fx (dx*/dc) + fy (dy*/dc) + [c − g(x*, y*)] (dλ*/dc) + λ*[1 − gx (dx*/dc) − gy (dy*/dc)]
       = (fx − λ* gx)(dx*/dc) + (fy − λ* gy)(dy*/dc) + λ*
       = λ*,

where c − g(x*, y*) = 0 by (22), and fx − λ* gx = fy − λ* gy = 0 by the FOCs (20) and (21).
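The envelope result dZ*/dc = λ* can be checked numerically on the earlier consumer problem, max x1·x2 + 2·x1 subject to 4·x1 + 2·x2 = c. A small sketch (assumptions: the closed-form solution x1 = 2λ, x2 = 4λ − 2 with λ = (c + 4)/16 follows from FOCs (11)–(13) with generic c in place of 60):

```python
# Optimal value Z*(c) of max x1*x2 + 2*x1 s.t. 4*x1 + 2*x2 = c,
# using the closed-form solution x1 = 2λ, x2 = 4λ - 2, λ = (c + 4)/16.
def optimal_value(c):
    lam = (c + 4) / 16
    x1, x2 = 2 * lam, 4 * lam - 2
    return x1 * x2 + 2 * x1

# Central-difference slope of Z*(c) at c = 60 should equal λ = 4.
c, h = 60.0, 1e-6
numeric_slope = (optimal_value(c + h) - optimal_value(c - h)) / (2 * h)
print(numeric_slope)  # ≈ 4
```

The numerical derivative matches the multiplier λ = 4 found for c = 60, as the envelope theorem predicts.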

Generalization: n-Variable and Multi-constraint Case

The optimization problem can be formed as follows:

max (or min)  z = f(x1, …, xn),    (24)
subject to:  g(x1, …, xn) = c.    (25)

It follows that the Lagrange function is

Z = L(x1, …, xn, λ) = f(x1, …, xn) + λ[c − g(x1, …, xn)],    (26)

for which the FOCs are

fi(x1, …, xn) = λ gi(x1, …, xn),  i = 1, …, n,
g(x1, …, xn) = c.

(conti.) If the problem involves more than one constraint, say the two constraints

g(x1, …, xn) = c  and  h(x1, …, xn) = d,    (27)

the Lagrange function is

Z = f(x1, …, xn) + λ[c − g(x1, …, xn)] + µ[d − h(x1, …, xn)],    (28)

for which the FOCs are

fi(x1, …, xn) = λ gi(x1, …, xn) + µ hi(x1, …, xn),  i = 1, …, n,
g(x1, …, xn) = c,
h(x1, …, xn) = d.

Second-Order Conditions

Note that even though Z* is indeed a standard type of extremum with respect to the choice variables, it is not so with respect to the Lagrange multiplier. Equation (23) shows that, unlike x* and y*, if λ* is replaced by any other value of λ, no effect is produced on Z*, since c − g(x*, y*) = 0. Thus the role played by λ* in the optimal solution differs basically from that of x* and y*. While it is safe to treat λ as another choice variable in the discussion of FOCs, we should treat λ differently in the discussion of SOCs. The new SOCs can again be stated in terms of the second-order total differential d²z, but the presence of the constraint entails certain significant modifications.

Second-Order Total Differential

The constraint g(x, y) = c implies that dg = gx dx + gy dy = 0, so dx and dy are no longer both arbitrary: we may take dx as an arbitrary change, but dy is then dependent on dx, i.e., dy = −(gx/gy) dx. Note that since gx and gy depend on x and y, dy also depends on x and y. Thus

d²z = d(dz) = [∂(dz)/∂x] dx + [∂(dz)/∂y] dy
    = [∂(fx dx + fy dy)/∂x] dx + [∂(fx dx + fy dy)/∂y] dy
    = [fxx dx + fxy dy + fy (∂dy/∂x)] dx + [fyx dx + fyy dy + fy (∂dy/∂y)] dy
    = fxx dx² + 2 fxy dx dy + fyy dy² + fy [(∂dy/∂x) dx + (∂dy/∂y) dy]
    = fxx dx² + 2 fxy dx dy + fyy dy² + fy d²y.

(conti.) The last term disqualifies d²z as a quadratic form, but d²z can be transformed into a quadratic form by virtue of the constraint g(x, y) = c:

dg = 0  ⇒  d(dg) = gxx dx² + 2 gxy dx dy + gyy dy² + gy d²y = 0.    (29)

Solving the last equation for d²y and substituting the result into the expression for d²z gives

d²z = [fxx − (fy/gy) gxx] dx² + 2 [fxy − (fy/gy) gxy] dx dy + [fyy − (fy/gy) gyy] dy²    (30)
    = (fxx − λ gxx) dx² + 2 (fxy − λ gxy) dx dy + (fyy − λ gyy) dy²    (31)
    = Zxx dx² + 2 Zxy dx dy + Zyy dy²,    (32)

where λ = fy/gy by the FOCs, and Zxx = fxx − λ gxx, Zxy = fxy − λ gxy, and Zyy = fyy − λ gyy are obtained by partially differentiating the derivatives in (7) and (8).

Second-Order Conditions

For a constrained extremum problem, the second-order necessary and sufficient conditions are still determined by the second-order total differential d²z, for dx and dy satisfying dg = gx dx + gy dy = 0.

Theorem (SO sufficient conditions). For a maximum of z: d²z negative definite, subject to dg = 0. For a minimum of z: d²z positive definite, subject to dg = 0.

Theorem (SO necessary conditions). For a maximum of z: d²z negative semidefinite, subject to dg = 0. For a minimum of z: d²z positive semidefinite, subject to dg = 0.

The Bordered Hessian

As in the case of a free extremum, it is possible to express the second-order sufficient condition in determinantal form. In the constrained-extremum case we use what is known as a bordered Hessian. Let's first analyze the conditions for the sign definiteness of a two-variable quadratic form subject to a linear constraint:

q = a u² + 2 h u v + b v²  subject to  α u + β v = 0.    (33)

Since the constraint means that v = −(α/β) u,

q = (a β² − 2 h α β + b α²) u²/β²,    (34)

which means that q is positive (negative) definite iff a β² − 2 h α β + b α² > 0 (< 0).

(conti.) It so happens that the following symmetric determinant equals the negative of this expression:

| 0  α  β |
| α  a  h |  = 2 h α β − a β² − b α².    (35)
| β  h  b |

Consequently, we can state that q subject to α u + β v = 0 is

positive definite iff the determinant in (35) is < 0,
negative definite iff the determinant in (35) is > 0.    (36)

(conti.) Note that the determinant used in this criterion is nothing but the discriminant of the original quadratic form,

| a  h |
| h  b |,

with a border placed on top and a similar border on the left. The border is merely composed of the two coefficients α and β from the constraint, plus a zero in the principal diagonal. When applied to the quadratic form d²z, the (plain) discriminant is the Hessian

| Zxx  Zxy |
| Zxy  Zyy |.

Given the constraint gx dx + gy dy = 0, d²z subject to dg = 0 is

positive definite iff |H̄| < 0,
negative definite iff |H̄| > 0,

where the bordered Hessian is

        | 0   gx   gy  |
|H̄| =   | gx  Zxx  Zxy |.
        | gy  Zxy  Zyy |

Example: Consumer's Utility Maximization Problem

max u(x1, x2) = x1 x2  subject to  x1 + x2/(1 + r) = B.    (37)

Step 1: The Lagrange function is

Z = L(x1, x2, λ) = u(x1, x2) + λ[B − x1 − x2/(1 + r)].    (38)

Step 2: The FOCs are

∂Z/∂λ = B − x1 − x2/(1 + r) = 0,    (39)
∂Z/∂x1 = x2 − λ = 0,    (40)
∂Z/∂x2 = x1 − λ/(1 + r) = 0,    (41)

which can be used to solve for the optimal levels x1 = B/2 and x2 = B(1 + r)/2.

(conti.) Next, we check the SO sufficient condition for a maximum. The bordered Hessian for this problem is

        | 0         1   1/(1+r) |
|H̄| =   | 1         0     1     |  = 2/(1 + r) > 0.    (42)
        | 1/(1+r)   1     0     |

Thus the SO sufficient condition is satisfied for a maximum.
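The closed-form solution above follows directly from the FOCs: (40) gives x2 = λ, (41) gives x1 = λ/(1 + r), and the budget then pins down λ = B(1 + r)/2. A minimal sketch (the parameter values B = 100, r = 0.05 are illustrative, not from the slides):

```python
# Solve the intertemporal consumption problem max x1*x2
# s.t. x1 + x2/(1+r) = B via its closed-form FOC solution.
def solve(B, r):
    lam = B * (1 + r) / 2      # from the budget: 2*lam/(1+r) = B
    x1, x2 = lam / (1 + r), lam
    return x1, x2

x1_opt, x2_opt = solve(B=100.0, r=0.05)
print(x1_opt, x2_opt)  # x1 = B/2 = 50, x2 = B(1+r)/2 = 52.5
```

Note that x1 = B/2 is independent of the interest rate r, a standard property of Cobb-Douglas preferences.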

The n-Variable Case

When the optimization problem takes the form:

max (or min)  z = f(x1, …, xn)  subject to:  g(x1, …, xn) = c,    (43)

the Lagrange function is

Z = f(x1, …, xn) + λ[c − g(x1, …, xn)],    (44)

and dx1, …, dxn satisfy the relation

dg = g1 dx1 + ⋯ + gn dxn = 0,    (45)

which implies that the bordered Hessian is

        | 0   g1   ⋯  gn  |
|H̄| =   | g1  Z11  ⋯  Z1n |
        | ⋮    ⋮    ⋱   ⋮  |
        | gn  Zn1  ⋯  Znn |

(conti.) Its bordered leading principal minors are

         | 0   g1   g2  |           | 0   g1   g2   g3  |
|H̄2| =   | g1  Z11  Z12 | , |H̄3| =  | g1  Z11  Z12  Z13 | , …, |H̄n| = |H̄|.
         | g2  Z21  Z22 |           | g2  Z21  Z22  Z23 |
                                    | g3  Z31  Z32  Z33 |

Theorem (Conditions for Maximum)

(1) FO necessary condition: Zλ = Z1 = ⋯ = Zn = 0. (2) SO sufficient condition: |H̄2| > 0, |H̄3| < 0, …, (−1)ⁿ |H̄n| > 0.

Theorem (Conditions for Minimum)

(1) FO necessary condition: Zλ = Z1 = ⋯ = Zn = 0. (2) SO sufficient condition: |H̄2| < 0, |H̄3| < 0, …, |H̄n| < 0.

Quasiconcavity and Quasiconvexity

For an unconstrained optimization problem (a free extremum problem), the concavity (convexity) of the objective function guarantees the existence of an absolute maximum (absolute minimum). For a constrained optimization problem, we will show that quasiconcavity (quasiconvexity) of the objective function guarantees the existence of an absolute maximum (absolute minimum). Quasiconcavity (quasiconvexity), like concavity (convexity), can be either strict or nonstrict.

Definition. A function f is quasiconcave (quasiconvex) iff, for any pair of distinct points u and v in the convex domain of f and for 0 < θ < 1, f(v) ≥ f(u) implies

f(θu + (1 − θ)v) ≥ f(u)   (f(θu + (1 − θ)v) ≤ f(v)).    (46)

Further, if the weak inequality "≥" ("≤") is replaced by the strict inequality ">" ("<"), f is said to be strictly quasiconcave (strictly quasiconvex).

Fact. Quasiconcavity (quasiconvexity) is a weaker condition than concavity (convexity).

Theorem (Negative of a function). If f(x) is quasiconcave (strictly quasiconcave), then −f(x) is quasiconvex (strictly quasiconvex).

Proof. Use the fact that multiplying an inequality by −1 reverses the sense of the inequality.

Theorem (Concavity vs. quasiconcavity). Any (strictly) concave (convex) function is (strictly) quasiconcave (quasiconvex), but the converse is not true.

Proof. Use the definitions of concavity and quasiconcavity: for f concave and f(v) ≥ f(u),

f(θu + (1 − θ)v) ≥ θ f(u) + (1 − θ) f(v) ≥ f(u).

Theorem (Linear function). If f(x) is linear, then it is quasiconcave as well as quasiconvex.

Proof. Use the fact that if f(x) is linear, then it is concave as well as convex.

Fact Unlike concave (convex) functions, a sum of two quasiconcave (quasiconvex) functions is not necessarily quasiconcave (quasiconvex).

Sometimes it is easier to check quasiconcavity and quasiconvexity using the following alternative definitions. We first introduce the concept of convex sets.

Definition. If, for any two points in a set S, the line segment connecting these two points lies entirely in S, then S is said to be a convex set. A corresponding algebraic definition is: a set S is convex iff for any two points u ∈ S and v ∈ S, and for every scalar θ ∈ [0, 1], it is true that w = θu + (1 − θ)v ∈ S. Note that u and v can lie in a space of any dimension.

Fact. To qualify as a convex set, a set of points must contain no holes, and its boundary must not be indented anywhere. See Figure 11.8 in the book.

Fact. Note that convex sets and convex functions are distinct concepts. In describing a function, the word convex specifies how a curve or surface bends itself: it must form a valley. But in describing a set, the word specifies how the points in the set are packed together: they must not allow any holes, and the boundary must not be indented.

Definition. A function f(x), where x is a vector of variables, is quasiconcave (quasiconvex) iff, for any constant k, the set S≥ = {x | f(x) ≥ k} (S≤ = {x | f(x) ≤ k}) is convex.

See Figure 12.5 (a), (b), and (c). The three functions in Figure 12.5 contain concave as well as convex segments and hence are neither concave nor convex. But the function in Figure 12.5(a) is quasiconcave because for any value of k the set S≥ is convex, and the function in Figure 12.5(b) is quasiconvex because for any value of k the set S≤ is convex. The function in Figure 12.5(c) is quasiconcave as well as quasiconvex because both S≥ and S≤ are convex. Hence, given that S≥ is convex, we can only conclude that the function f is quasiconcave, but not necessarily concave. Examples: (1) Z = x² (x ≥ 0) is quasiconvex as well as quasiconcave since both S≥ and S≤ are convex. (2) Z = (x − a)² + (y − b)² is quasiconvex since S≤ is convex.

If the function z = f(x1, …, xn) is twice continuously differentiable, quasiconcavity and quasiconvexity can be checked by the first and second partial derivatives of the function. Define a bordered determinant as follows:

        | 0   f1   ⋯  fn  |
|B| =   | f1  f11  ⋯  f1n |    (47)
        | ⋮    ⋮    ⋱   ⋮  |
        | fn  fn1  ⋯  fnn |

Note that the determinant |B| is different from the bordered Hessian |H̄|: unlike |H̄|, the border in |B| is composed of the first derivatives of the function f rather than those of the constraint function g. Hence, if |B| satisfies the SO sufficient condition for strict quasiconcavity (specified below), |H̄| must also satisfy the SO sufficient condition for the constrained maximization problem.

(conti.) We can define the successive principal minors of |B| as follows:

         | 0   f1  |           | 0   f1   f2  |
|B1| =   | f1  f11 | , |B2| =  | f1  f11  f12 | , …, |Bn| = |B|.    (48)
                               | f2  f21  f22 |

A sufficient condition for f to be quasiconcave on the nonnegative domain is that

|B1| < 0, |B2| > 0, …, (−1)ⁿ |Bn| > 0.    (49)

For quasiconvexity, the corresponding condition is that

|B1| < 0, |B2| < 0, …, |Bn| < 0.    (50)

A necessary condition for f to be quasiconcave on the nonnegative domain is that

|B1| ≤ 0, |B2| ≥ 0, …, (−1)ⁿ |Bn| ≥ 0.    (51)

For quasiconvexity, the corresponding condition is that

|B1| ≤ 0, |B2| ≤ 0, …, |Bn| ≤ 0.    (52)

Example: Consider z = f (x1, x2) = x1x2, since

f1 = x2, f2 = x1, f11 = f22 = 0, f12 = f21 = 1,

the relevant principal minors are

         | 0   x2 |                     | 0   x2  x1 |
|B1| =   | x2  0  | = −x2² ≤ 0,  |B2| = | x2  0   1  | = 2 x1 x2 ≥ 0.    (53)
                                        | x1  1   0  |

Thus, z = x1x2 is quasiconcave on the nonnegative domain.
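These two bordered determinants can be evaluated directly at any sample point in the nonnegative domain. A small Python sketch (the point (3, 5) is an arbitrary illustration, not from the slides):

```python
# Bordered-determinant check for z = x1*x2: |B1| = -x2**2 <= 0 and
# |B2| = 2*x1*x2 >= 0 on the nonnegative domain.
def det3(m):
    a, b, c = m[0]; d, e, f = m[1]; g, h, i = m[2]
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def bordered_minors(x1, x2):
    f1, f2 = x2, x1                 # first partials of x1*x2
    B1 = 0 * 0 - f1 * f1            # det [[0, f1], [f1, f11]] with f11 = 0
    B2 = det3([[0, f1, f2], [f1, 0, 1], [f2, 1, 0]])
    return B1, B2

B1, B2 = bordered_minors(3.0, 5.0)
print(B1, B2)  # -25.0 30.0: signs match (53)
```

At (3, 5) the minors are −25 ≤ 0 and 30 ≥ 0, matching the sign pattern required by the necessary condition (51).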

Example: Show that z = f(x, y) = x^a y^b (x > 0, y > 0; a, b ∈ (0, 1)) is quasiconcave. Since

fx = a x^(a−1) y^b,  fy = b x^a y^(b−1),
fxx = a(a − 1) x^(a−2) y^b,  fyy = b(b − 1) x^a y^(b−2),
fxy = fyx = a b x^(a−1) y^(b−1),

we have

         | 0   fx |
|B1| =   | fx  0  | = −a² x^(2a−2) y^(2b) < 0,    (54)

         | 0   fx   fy  |
|B2| =   | fx  fxx  fxy | = ab(a + b) x^(3a−2) y^(3b−2) > 0,    (55)
         | fy  fyx  fyy |

which means that the function is quasiconcave.

(conti.) In this case, the condition for concavity can be expressed as

fxx fyy − fxy² = [a(a − 1) x^(a−2) y^b][b(b − 1) x^a y^(b−2)] − [a b x^(a−1) y^(b−1)]²    (56)
             = ab[(a − 1)(b − 1) − ab] x^(2a−2) y^(2b−2)
             = ab[1 − a − b] x^(2a−2) y^(2b−2),    (57)

and this expression is positive (as required for concavity) for

1 − a − b > 0  ⇒  a + b < 1.

Luo, Y. (SEF of HKU) MME February15,2009 36/81 Fact When the function in the constraint is linear, that is, g(x1, , xn) = a1x1 + + anxn = c, the bordered determinant B and       the bordered Hessian H have the following relationship:

2 B = λ H . (58)

Hence, in the linear constraint case, the two bordered determinants always have the same sign at the stationary point.

Fact (Relative vs. absolute extrema). If a function is quasiconcave (quasiconvex), then by the same reasoning as for concave (convex) functions, a relative maximum (relative minimum) is an absolute maximum (absolute minimum).

Utility Maximization and Consumer Demand

Consider the following two-commodity consumer optimization problem:

max u(x, y)  (ux > 0, uy > 0)    (59)

subject to the budget constraint

Px x + Py y = B (60)

where Px , Py , and B are given exogenously. The Lagrange function is then

Z = u(x, y) + λ(B − Px x − Py y).    (61)

The FOCs are

Zλ = B − Px x − Py y = 0,    (62)
Zx = ux − λ Px = 0,    (63)
Zy = uy − λ Py = 0.    (64)

From the last two equations, we have

ux / uy = Px / Py,    (65)

where ux/uy = MRSxy is called the marginal rate of substitution (MRS) of x for y. Thus we have the well-known equality MRSxy = Px/Py, which is the necessary condition for an interior optimal solution. See the figure of the indifference curve.

If the bordered Hessian is positive, i.e.,

| 0   Px   Py  |
| Px  uxx  uxy |  = 2 Px Py uxy − Py² uxx − Px² uyy > 0,
| Py  uyx  uyy |

in which all elements are evaluated at the optimum, then the stationary value of u is a maximum.

Static Optimization with Inequality Constraints

So far we have considered optimization problems with equality constraints. Now we shall consider constraints that may be satisfied as inequalities at the solution. Consider the simple optimization problem with an inequality constraint:

max f(x, y)  subject to  g(x, y) ≤ c.    (66)

We seek the largest value attained by f(x, y) in the admissible or feasible set S of all pairs (x, y) satisfying g(x, y) ≤ c. Note that problems where one wants to minimize f(x, y) subject to (x, y) ∈ S can be handled by instead studying the problem of maximizing −f(x, y) subject to (x, y) ∈ S. This problem can be solved by an extended Lagrange multiplier method, which involves examining the stationary points of f in the interior of the feasible set S and the behavior of f on the boundary of S. This new method was originally proposed by two Princeton mathematicians: H. W. Kuhn and A. W.

Tucker.

Theorem (Recipe for solving the optimization problem with inequality constraints). A. Associate a Lagrange multiplier λ with the constraint g(x, y) ≤ c, and define the Lagrangian function as follows:

Z = L(x, y, λ) = f(x, y) + λ[c − g(x, y)].    (67)

B. Equate the partial derivatives of Z w.r.t. x and y to zero:

fx − λ gx = 0,  fy − λ gy = 0.    (68)

C. Introduce the complementary slackness condition

λ ≥ 0  and  λ[c − g(x, y)] = 0.    (69)

D. Require (x, y) to satisfy the constraint

g(x, y) ≤ c.    (70)

(conti.) Step C (the complementary slackness condition) is tricky. It requires that λ be nonnegative and, moreover, that λ = 0 if g(x, y) < c. Thus, if λ > 0, we must have g(x, y) = c. Note that the Lagrange multiplier λ can be interpreted as a price (the shadow price) associated with increasing the right-hand side c of the resource constraint g(x, y) ≤ c by 1 unit. With this interpretation, prices are nonnegative, and if the resource constraint is not binding because g(x, y) < c at the optimum, the price associated with increasing c by one unit is 0. It is possible to have both λ = 0 and g(x, y) = c. The two inequalities λ ≥ 0 and g(x, y) ≤ c are complementary in the sense that at most one can be "slack," that is, at most one can hold with strict inequality; equivalently, at least one must hold with equality.

(conti.) Conditions (68) and (69) are called the Kuhn-Tucker conditions. Note that they are necessary conditions for the above problem. Note also that with an inequality constraint, one will have ∂Z/∂λ = c − g(x, y) > 0 at an optimum if the constraint holds with strict inequality at that point; for this reason, we do not differentiate the Lagrangian w.r.t. λ. Example: Consider the following problem:

max f(x, y) = x² + y² + y − 1    (71)
subject to:  g(x, y) = x² + y² ≤ 1.    (72)

The Lagrange function is

Z = x² + y² + y − 1 + λ[1 − (x² + y²)].

The Trial-and-Error Approach to Searching for Optimal Solutions

(conti.) The FOCs are then

Zx = 2x − 2λx = 0,    (73)
Zy = 2y + 1 − 2λy = 0.    (74)

The complementary slackness condition is

λ ≥ 0  and  λ[1 − (x² + y²)] = 0.    (75)

We want to find all pairs (x, y) that satisfy these conditions for some suitable value of λ. Begin by looking at (73). This condition implies that

2x(1 − λ) = 0.

There are two possibilities: λ = 1 or x = 0. If λ = 1, then (74) implies that 1 = 0, a contradiction. Hence, x = 0.

(conti.) Since x = 0, suppose that x² + y² = 1; then y = ±1. We try y = 1 first: substituting y = 1 into (74) gives λ = 3/2, and (75) is satisfied. Hence (x, y) = (0, 1) with λ = 3/2 is a candidate for optimality. Similarly, if y = −1, then λ = 1/2 and (75) is also satisfied, so (x, y) = (0, −1) with λ = 1/2 is also a candidate. Finally, consider the case where x = 0 and x² + y² < 1. In this case, (75) implies λ = 0 and (74) implies y = −1/2. Hence (x, y) = (0, −1/2) with λ = 0 is also a candidate for optimality. We then conclude that there are three candidates:

f(0, 1) = 1,  f(0, −1) = −1,  f(0, −1/2) = −5/4.

Because we want to maximize a continuous function over a closed and bounded set, the extreme value theorem guarantees that the problem has a solution. Thus (x, y) = (0, 1) solves the maximization problem.
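The final comparison among the Kuhn-Tucker candidates can be sketched in a few lines of Python (a check of the worked example, not part of the slides):

```python
# Compare the three Kuhn-Tucker candidates for
# max x**2 + y**2 + y - 1  s.t.  x**2 + y**2 <= 1.
def f(x, y):
    return x**2 + y**2 + y - 1

# Candidates found by the trial-and-error search: λ = 3/2, 1/2, 0 respectively.
candidates = [(0.0, 1.0), (0.0, -1.0), (0.0, -0.5)]
best = max(candidates, key=lambda p: f(*p))
print(best, f(*best))  # (0.0, 1.0) with value 1.0
```

This confirms that the boundary point (0, 1) dominates both the other boundary candidate and the interior stationary point.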

Consider the n-variable problem

max f(x1, …, xn)  s.t.  g1(x1, …, xn) ≤ c1, …, gm(x1, …, xn) ≤ cm.    (76)

Theorem (Recipe for Solving the General Problem with n Variables). A) Write down the Lagrange function

L = f(x1, …, xn) + Σ_{j=1}^{m} λj [cj − gj(x1, …, xn)],    (77)

where λj is the Lagrange multiplier associated with the j-th constraint.

Theorem (conti.) B) Equate all the FOCs to 0, for each i = 1, …, n:

∂f/∂xi − Σ_{j=1}^{m} λj ∂gj/∂xi = 0.    (78)

C) Impose the complementary slackness conditions:

λj ≥ 0  (= 0 if cj − gj(x1, …, xn) > 0),  j = 1, …, m.    (79)

D) Require (x1, …, xn) to satisfy the constraints

gj(x1, …, xn) ≤ cj,  j = 1, …, m.    (80)

Theorem (Kuhn-Tucker Sufficient Conditions)

Consider (66) and suppose that (x*, y*) satisfies conditions (68), (69), and (70). If the Lagrange function is concave, then (x*, y*) solves the problem.

Proof.

If (x*, y*) satisfies the conditions in (68), then (x*, y*) is a stationary point of the Lagrangian. Because a stationary point of the concave Lagrangian maximizes it, we have

L(x*, y*) = f(x*, y*) + λ[c − g(x*, y*)] ≥ f(x, y) + λ[c − g(x, y)].

Rearranging the terms gives

f(x*, y*) − f(x, y) ≥ λ[g(x*, y*) − g(x, y)].    (81)

Proof.

(conti.) If c − g(x*, y*) > 0, then by (69) we have λ = 0, so (81) implies that f(x*, y*) ≥ f(x, y). On the other hand, if c − g(x*, y*) = 0, then

λ[g(x*, y*) − g(x, y)] = λ[c − g(x, y)] ≥ 0,

since λ ≥ 0 and c − g(x, y) ≥ 0 for all (x, y) satisfying the constraint. Hence, (x*, y*) solves problem (66).

A Special Case: Nonnegativity Conditions on the Variables

Many economic variables must be nonnegative by their very nature. It is not difficult to incorporate such constraints into the above formulation. For example, x ≥ 0 can be expressed as h(x, y) = −x ≤ 0, and we introduce an additional Lagrange multiplier to go with it. Consider the problem

max f(x, y)  subject to  g(x, y) ≤ c,  x ≥ 0,  y ≥ 0.    (82)

Note that it can be rewritten as

max f(x, y)  subject to  g(x, y) ≤ c,  −x ≤ 0,  −y ≤ 0.    (83)

(conti.) The Lagrange function is then

Z = L(x, y, λ) = f(x, y) + λ[c − g(x, y)] + µ1 x + µ2 y.    (84)

The FOCs are

fx − λ gx + µ1 = 0,    (85)
fy − λ gy + µ2 = 0,    (86)
λ ≥ 0,  λ[c − g(x, y)] = 0,    (87)
µ1 ≥ 0,  µ1 x = 0,    (88)
µ2 ≥ 0,  µ2 y = 0,    (89)

which is equivalent to

fx − λ gx ≤ 0  (= 0 if x > 0),
fy − λ gy ≤ 0  (= 0 if y > 0),
λ ≥ 0  (= 0 if c − g(x, y) > 0).

The same idea can obviously be extended to the n-variable problem

max f(x1, …, xn)  s.t.  g1(x1, …, xn) ≤ c1, …, gm(x1, …, xn) ≤ cm,  x1 ≥ 0, …, xn ≥ 0.    (90)

The necessary FOCs for the solution of (90) are that, for each i = 1, …, n:

∂f/∂xi − Σ_{j=1}^{m} λj ∂gj/∂xi ≤ 0  (= 0 if xi > 0),    (91)

λj ≥ 0  (= 0 if cj − gj(x1, …, xn) > 0),  j = 1, …, m.    (92)

Example: The consumer's maximization problem is

max u = u (x, y)

subject to

Px x + Py y ≤ B,    (93)
cx x + cy y ≤ C,    (94)
x ≥ 0,    (95)
y ≥ 0.    (96)

The Lagrange function is

Z = u(x, y) + λ1[B − (Px x + Py y)] + λ2[C − (cx x + cy y)] + µ1 x + µ2 y.

Suppose that u(x, y) = x y², B = 100, Px = Py = 1, C = 120, cx = 2, and cy = 1.

(conti.) The Kuhn-Tucker conditions are

Zx = y² − λ1 − 2λ2 ≤ 0,  x ≥ 0,  x Zx = 0  (equivalently, y² − λ1 − 2λ2 ≤ 0, = 0 if x > 0),
Zy = 2xy − λ1 − λ2 ≤ 0,  y ≥ 0,  y Zy = 0,
λ1 ≥ 0,  λ1[100 − (x + y)] = 0,  100 − (x + y) ≥ 0,
λ2 ≥ 0,  λ2[120 − (2x + y)] = 0,  120 − (2x + y) ≥ 0.

Again, the solution procedure involves a certain amount of trial and error. We first choose one of the constraints to be nonbinding and solve for x and y. Once found, we use these values to test whether the constraint chosen to be nonbinding is violated. If it is, we redo the procedure choosing another constraint to be nonbinding. If a violation of the nonbinding constraint occurs again, we can assume both constraints bind, and the solution is determined entirely by the constraints.

(conti.) Step 1: Assume that the second constraint is nonbinding at the solution (that is, 120 > 2x + y), so that λ2 = 0 by complementary slackness, but let x, y, and λ1 be positive, so that:

Zx = y² − λ1 = 0,
Zy = 2xy − λ1 = 0,
100 = x + y.

Solving for x and y yields a trial solution

x = 33 1/3,  y = 66 2/3.

But substituting this into the second constraint gives

2(33 1/3) + 66 2/3 = 133 1/3 > 120.

In other words, this solution violates the second constraint and must be rejected.

(conti.) Step 2: Reverse the assumption, so that λ1 = 0, and let x, y, and λ2 be positive, so that:

Zx = y² − 2λ2 = 0,
Zy = 2xy − λ2 = 0,
120 = 2x + y.

Solving for x and y yields a trial solution

x = 20, y = 80

which implies that λ2 = 3200. This solution, together with λ1 = 0, satisfies all the Kuhn-Tucker conditions, so we accept it as the final solution.
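The accepted trial solution can be verified against every Kuhn-Tucker condition directly. A minimal Python check of the numbers above (not part of the slides):

```python
# Verify x = 20, y = 80, λ1 = 0, λ2 = 3200 for max x*y**2
# s.t. x + y <= 100, 2x + y <= 120, x >= 0, y >= 0.
x, y, lam1, lam2 = 20.0, 80.0, 0.0, 3200.0
Zx = y**2 - lam1 - 2 * lam2      # stationarity in x (must be 0 since x > 0)
Zy = 2 * x * y - lam1 - lam2     # stationarity in y (must be 0 since y > 0)
checks = [
    Zx == 0 and Zy == 0,                             # FOCs hold with equality
    100 - (x + y) >= 0 and 120 - (2 * x + y) >= 0,   # feasibility
    lam1 * (100 - (x + y)) == 0,                     # complementary slackness
    lam2 * (120 - (2 * x + y)) == 0,
]
print(all(checks))  # True
```

Note that at this solution both constraints happen to hold with equality, which is consistent with λ1 = 0: a multiplier of zero on a binding constraint is allowed by complementary slackness.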

Example: Quasi-linear Utility

The consumer's problem is to choose two commodities to maximize the utility function

u(x, y) = y + a ln(x)    (97)

subject to

p x + q y ≤ I,    (98)
x ≥ 0,    (99)
y ≥ 0,    (100)

where a is a given positive constant, and p and q are both positive prices. First, we construct the Lagrange function as follows:

L = y + a ln(x) + λ[I − px − qy] + µ1 x + µ2 y.    (101)

(conti.) The Kuhn-Tucker conditions can be written as

a/x − λp ≤ 0,  x ≥ 0,  (a/x − λp) x = 0,    (102)
1 − λq ≤ 0,  y ≥ 0,  (1 − λq) y = 0,    (103)
I − px − qy ≥ 0,  λ ≥ 0,  (I − px − qy) λ = 0,    (104)

in which we have 2³ possibilities of equalities or inequalities, because there are two nonnegative variables and one inequality constraint. First, note that the budget constraint must be binding (that is, px + qy = I: all available income must be used up) because marginal utility is positive, ux(x, y) > 0 and uy(x, y) > 0; that is, consuming more results in higher levels of utility. Formally, if the BC were not binding, then λ = 0, which would mean a/x ≤ 0 and 1 ≤ 0, both contradictions. Now we can reduce the number of possibilities to four: x > 0 or x = 0, combined with y > 0 or y = 0. We can also rule out the possibility that x = 0 and y = 0, because that pair does not satisfy the binding BC: px + qy = I > 0.

(conti.) If x = 0 and y = I/q > 0, the second line in the Kuhn-Tucker conditions means that λ = 1/q, and the first line in the KT conditions then requires a/x ≤ λp = p/q; but a/x → ∞ as x → 0, which is a contradiction. Intuitively, the initial one-unit increase in x results in an unboundedly large gain in utility, so zero consumption of x cannot be optimal.

(conti.) If y = 0 and x = I/p > 0, the first line in the KT conditions implies that

a/x - λp = 0 ⟹ λ = a/I. (105)

Substituting it into the second line of the KT conditions gives

1 - λq ≤ 0 ⟹ 1 - aq/I ≤ 0 ⟹ I ≤ aq.

If the parameters (I, a, q) satisfy this condition, then x = I/p and y = 0 is a candidate optimal solution. Finally, if both x and y are positive, the first two lines in the KT conditions give

a/x - λp = 0 and 1 - λq = 0 ⟹ x = aq/p and y = I/q - a. (106)

Hence, if I > aq (so that y > 0), this is also a candidate optimal solution.

(conti.) In sum, the optimal solution depends on the values of the parameters (I, a, q):

x = I/p and y = 0 if I ≤ aq;
x = aq/p and y = I/q - a if I > aq.
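A minimal numerical sketch of this case analysis; the parameter values (a, p, q, and the two income levels) are illustrative assumptions, not from the text:

```python
import math

# Candidate demands for max y + a*ln(x) s.t. p*x + q*y = I.
# All numeric values below are illustrative assumptions.
def demand(a, p, q, I):
    if I <= a * q:
        return I / p, 0.0            # corner: y = 0, all income spent on x
    return a * q / p, I / q - a      # interior solution

def u(x, y, a):
    return y + a * math.log(x)

a, p, q = 2.0, 1.0, 4.0
for I in (5.0, 20.0):                # I <= aq = 8 and I > aq
    x, y = demand(a, p, q, I)
    assert abs(p * x + q * y - I) < 1e-9          # budget exhausted
    # The chosen bundle weakly beats a grid of feasible alternatives.
    grid = [i * (I / q) / 200 for i in range(200)]
    best = max(u((I - q * yy) / p, yy, a) for yy in grid)
    assert u(x, y, a) >= best - 1e-9
```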

The Envelope Theorem for Unconstrained Optimization

A maximum-value function is an objective function in which the choice variables have been assigned their optimal values. This function thus depends on the parameters only indirectly, through the parameters' effects on the optimal values of the choice variables, and is also referred to as the indirect objective function. The indirect objective function traces out all the maximum values of the objective function as these parameters vary. Hence, the IOF is an "envelope" of the set of optimized objective functions generated by varying the parameters. Consider max u = f (x, y, φ), where x and y are choice variables and φ is a parameter.

The first-order necessary conditions are

fx (x, y, φ) = fy (x, y, φ) = 0. (107)

If the SO conditions are met, these two equations implicitly define the solutions x* = x*(φ) and y* = y*(φ). Substituting these solutions back into f gives the IOF (or the maximum-value function)

V (φ) = f (x*(φ), y*(φ), φ). (108)

Differentiating V (φ) w.r.t. φ gives

dV/dφ = fx (∂x*/∂φ) + fy (∂y*/∂φ) + fφ = fφ.

This result means that at the optimum (fx = fy = 0), as φ varies, with x* and y* allowed to adjust, dV/dφ gives the same result as if x* and y* were treated as constants (only the direct effect needs to be considered). This is the essence of the Envelope theorem.
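The envelope result can be illustrated numerically. The objective below, f(x, y, φ) = -x^2 + 2φx - y^2 + φ^2, is an assumed example (not from the text); its FOCs give x* = φ, y* = 0, so V(φ) = 2φ^2, and dV/dφ = 4φ equals the direct partial fφ = 2x + 2φ evaluated at the optimum:

```python
# Envelope theorem check for an assumed objective
# f(x, y, phi) = -x**2 + 2*phi*x - y**2 + phi**2.
# FOCs: x*(phi) = phi, y*(phi) = 0, so V(phi) = 2*phi**2.
def f(x, y, phi):
    return -x**2 + 2 * phi * x - y**2 + phi**2

def V(phi):
    return f(phi, 0.0, phi)          # substitute x* and y* back into f

phi, h = 1.5, 1e-6
dV_dphi = (V(phi + h) - V(phi - h)) / (2 * h)   # total derivative of V
f_phi = 2 * phi + 2 * phi                       # direct partial at (x*, y*)
assert abs(dV_dphi - f_phi) < 1e-4
```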

The Envelope Theorem for Constrained Optimization

The problem becomes

max u = f (x, y, φ) s.t. g (x, y, φ) = 0.

The Lagrangian is then

Z = f (x, y, φ) - λ g (x, y, φ).

The FOCs are

Zx = fx - λgx = 0, Zy = fy - λgy = 0, Zλ = -g (x, y, φ) = 0.

The optimal solution is then

x* = x*(φ), y* = y*(φ), λ* = λ*(φ).

Substituting these solutions back into f gives the IOF (or the maximum-value function)

V (φ) = f (x*(φ), y*(φ), φ). (109)

Differentiating V (φ) w.r.t. φ gives

dV/dφ = fx (∂x*/∂φ) + fy (∂y*/∂φ) + fφ. (110)

Further, note that since g (x*(φ), y*(φ), φ) = 0, we have

gx (∂x*/∂φ) + gy (∂y*/∂φ) + gφ = 0. (111)

Multiplying both sides by λ and then combining with the expression for dV/dφ gives the Envelope theorem for constrained optimization:

dV/dφ = (fx - λgx)(∂x*/∂φ) + (fy - λgy)(∂y*/∂φ) + fφ - λgφ = Zφ. (112)

Homogeneous Functions

Definition A function is said to be homogeneous of degree r if multiplication of each of its independent variables by a constant j alters the value of the function by the proportion j^r, that is,

f (j x1, ..., j xn) = j^r f (x1, ..., xn). (113)

In economic applications, j is usually taken to be positive.

Example Given the function f (x, y, w) = x/y + 2w/(3x), if we multiply each variable by j, we get

f (jx, jy, jw) = jx/(jy) + 2jw/(3jx) = x/y + 2w/(3x) = j^0 f (x, y, w), (114)

which means that this function is homogeneous of degree 0 (j^0 = 1).

Definition Production functions are usually homogeneous functions of degree 1. They are often referred to as linearly homogeneous functions.
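Both degrees of homogeneity can be checked numerically. The sketch below tests the degree-0 example above together with an assumed linearly homogeneous Cobb-Douglas function (the exponents 0.3 and 0.7 are illustrative):

```python
# Numerical check of homogeneity degrees: f(x, y, w) = x/y + 2w/(3x)
# is homogeneous of degree 0, while the assumed Cobb-Douglas function
# Q = K**0.3 * L**0.7 is homogeneous of degree 1.
def f(x, y, w):
    return x / y + 2 * w / (3 * x)

def Q(K, L):
    return K**0.3 * L**0.7

x, y, w, j = 2.0, 3.0, 5.0, 7.0
assert abs(f(j * x, j * y, j * w) - j**0 * f(x, y, w)) < 1e-12
assert abs(Q(j * x, j * y) - j**1 * Q(x, y)) < 1e-9
```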

Assume that the production function has the following form:

Q = f (K, L) (115)

The mathematical assumption of linear homogeneity would amount to the economic assumption of constant returns to scale (CRTS), because linear homogeneity means that raising all inputs j-fold will always raise the output (value of the function) exactly j-fold also.

Fact Given the LH production function Q, the average product of labor (APL) and of capital (APK) can be expressed as functions of the capital-labor ratio, k = K/L, alone.

Fact Multiplying each independent variable by the factor j = 1/L and using the property of linear homogeneity, we have

APL = Q/L = f (K/L, 1) = f (k, 1) ≡ φ(k). (116)
APK = Q/K = (Q/L)(L/K) = φ(k)/k. (117)

Therefore, while the production function is homogeneous of degree one, both APL and APK are homogeneous of degree zero in K and L.

Fact Given the LH production function Q, the marginal product of labor (MPL) and of capital (MPK) can be expressed as functions of the capital-labor ratio, k = K/L, alone.

Fact To find the marginal products, we rewrite the total product as Q = L φ(k). Differentiating it w.r.t. K and L gives

MPK = ∂Q/∂K = L ∂φ(k)/∂K = L φ'(k)(∂k/∂K) = φ'(k),
MPL = ∂Q/∂L = φ(k) + L φ'(k)(∂k/∂L) = φ(k) + L φ'(k)(-K/L^2) = φ(k) - φ'(k) k.

They are also homogeneous of degree zero in K and L (they remain the same as long as k is held constant).

Theorem (Euler's Theorem) If Q = f (K, L) is linearly homogeneous, then

K (∂Q/∂K) + L (∂Q/∂L) = Q.

Proof.

K (∂Q/∂K) + L (∂Q/∂L) = K φ'(k) + L [φ(k) - φ'(k) k] = L φ(k) = Q.

This theorem says that the value of a LH function can always be expressed as a sum of terms, each of which is one of the independent variables times the first-order partial derivative w.r.t. that variable. Hence, under conditions of CRTS, if each input is paid the amount of its marginal product, the total product will be exactly exhausted by the distributive shares for all the input factors; that is, pure economic profit will be zero.

Example (Cobb-Douglas Production Function) One specific production function is the Cobb-Douglas function:

Q = A K^α L^β (118)

where A is a positive constant and α and β are positive fractions. Major features of this production function are: (1) It is homogeneous of degree (α + β). (2) In the special case of α + β = 1, it is linearly homogeneous. (3) Its isoquants are negatively sloped throughout and strictly convex for positive values of K and L. (4) It is strictly quasiconcave for positive K and L.

Example (Cobb-Douglas Production Function) (conti.) For the special case α + β = 1,

Q = A K^α L^(1-α) = A L k^α. (119)
APL = Q/L = A k^α,  APK = Q/K = A k^(α-1). (120)
MPK = ∂Q/∂K = A α k^(α-1),  MPL = ∂Q/∂L = A (1 - α) k^α. (121)

The Euler theorem can be verified as follows:

K (∂Q/∂K) + L (∂Q/∂L) = K A α k^(α-1) + L A (1 - α) k^α = k^α [A α L + A (1 - α) L] = A L k^α = Q.
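A quick numerical check of eqs. (119)-(121) and of Euler's theorem; the values of A, α, K, and L below are illustrative assumptions:

```python
# Numerical check of eqs. (119)-(121) and Euler's theorem for the
# linearly homogeneous Cobb-Douglas case, with illustrative values.
A, alpha, K, L = 1.0, 0.3, 4.0, 9.0
k = K / L
Q = A * K**alpha * L**(1 - alpha)

assert abs(Q - A * L * k**alpha) < 1e-9                    # eq. (119)
assert abs(Q / L - A * k**alpha) < 1e-9                    # APL, eq. (120)
assert abs(Q / K - A * k**(alpha - 1)) < 1e-9              # APK, eq. (120)

MPK = A * alpha * k**(alpha - 1)                           # eq. (121)
MPL = A * (1 - alpha) * k**alpha
assert abs(K * MPK + L * MPL - Q) < 1e-9                   # Euler's theorem
```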

Example (CES Production Function) A constant elasticity of substitution (CES) production function is

Q = A [δ K^(-ρ) + (1 - δ) L^(-ρ)]^(-1/ρ)  (A > 0; 0 < δ < 1; -1 < ρ ≠ 0) (122)

where δ is the distributive parameter, like α in the CD function, and ρ is the substitution parameter, which has no counterpart in the CD function. First, the CES production function is homogeneous of degree one:

jQ = A [δ (jK)^(-ρ) + (1 - δ)(jL)^(-ρ)]^(-1/ρ). (123)

Second, the CD function is a special case of the CES function: as ρ → 0, the CES function approaches the CD function.
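Linear homogeneity of the CES function can be confirmed numerically; the values of A, δ, ρ, K, L, and j below are illustrative assumptions:

```python
# Numerical check that the CES function is homogeneous of degree one:
# scaling both inputs by j scales output by exactly j.
A, delta, rho = 2.0, 0.4, 0.5

def Q(K, L):
    return A * (delta * K**(-rho) + (1 - delta) * L**(-rho)) ** (-1 / rho)

K, L, j = 3.0, 5.0, 4.0
assert abs(Q(j * K, j * L) - j * Q(K, L)) < 1e-9
```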

Example (CES Production Function) (conti.)

Proof. Taking logs on both sides of the CES production function and then taking the limit gives

ln(Q/A) = lim_{ρ→0} -ln[δ K^(-ρ) + (1 - δ) L^(-ρ)]/ρ
= lim_{ρ→0} [δ K^(-ρ) ln K + (1 - δ) L^(-ρ) ln L]/[δ K^(-ρ) + (1 - δ) L^(-ρ)]
= ln K^δ L^(1-δ)
⟹ Q = A K^δ L^(1-δ),

in which we use L'Hopital's rule.
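The limit can also be confirmed numerically by evaluating the CES function at a very small ρ and comparing it with the Cobb-Douglas value; A, δ, K, and L below are illustrative assumptions:

```python
# Numerical check that the CES function approaches the CD function
# Q = A * K**delta * L**(1-delta) as rho -> 0.
A, delta, K, L = 2.0, 0.4, 3.0, 5.0

def ces(rho):
    return A * (delta * K**(-rho) + (1 - delta) * L**(-rho)) ** (-1 / rho)

cd = A * K**delta * L**(1 - delta)   # the Cobb-Douglas limit
assert abs(ces(1e-6) - cd) < 1e-4
```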

Least-Cost Combination of Inputs (Application of the Homogeneous PF)

The cost-minimization problem is:

min_{a,b} C = a Pa + b Pb (124)

subject to the output constraint

Q (a, b) = Q0 (125)

where Q0, Pa, and Pb are given exogenously. The marginal products are positive: Qa > 0, Qb > 0. The Lagrangian can be written as

Z = a Pa + b Pb + µ [Q0 - Q (a, b)]. (126)

(conti.) The FOCs are

Zµ = Q0 - Q (a, b) = 0, (127)
Za = Pa - µ Qa = 0, (128)
Zb = Pb - µ Qb = 0. (129)

The last two equations imply that

Pa/Qa = Pb/Qb = µ, (130)

which means that at the optimum, the input price-marginal product ratio must be the same for each input. This equality can be rewritten as

Pa/Pb = Qa/Qb ≡ MRTab, (131)

where MRTab is the marginal rate of technical substitution of a for b. See Figure 12.8 in CW.
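A small numerical sketch of the tangency condition (130), assuming a Cobb-Douglas technology Q = a^0.5 b^0.5 with illustrative prices and output level (none of these values are from the text):

```python
# Least-cost input mix for an assumed technology Q(a, b) = a**0.5 * b**0.5
# with Q0 = 10, Pa = 4, Pb = 1. The tangency condition Pa/Pb = Qa/Qb = b/a
# gives b = 4a; the output constraint then yields a = 5, b = 20.
Pa, Pb, Q0 = 4.0, 1.0, 10.0
a, b = 5.0, 20.0

assert abs(a**0.5 * b**0.5 - Q0) < 1e-9            # on the isoquant
Qa, Qb = 0.5 * (b / a)**0.5, 0.5 * (a / b)**0.5    # marginal products
assert abs(Pa / Qa - Pb / Qb) < 1e-9               # eq. (130)

# Any other point on the same isoquant costs at least as much.
cost = Pa * a + Pb * b
for aa in (2.0, 4.0, 6.0, 10.0):
    bb = Q0**2 / aa                                # keep a*b = Q0**2
    assert Pa * aa + Pb * bb >= cost
```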

(conti.) The sufficient SOC can be used to ensure a minimum cost after the FOCs are met. That is, the problem requires a negative bordered Hessian:

|H̄| =
| 0    Qa      Qb     |
| Qa  -µQaa  -µQab | = µ (Qaa Qb^2 - 2 Qab Qa Qb + Qbb Qa^2) < 0. (132)
| Qb  -µQba  -µQbb |

Note that the curvature of an isoquant is represented by the second derivative:

d^2 b/da^2 = d/da (db/da) = -d/da (Qa/Qb)
= -[(Qaa + Qab (db/da)) Qb - (Qba + Qbb (db/da)) Qa]/Qb^2
= -[(Qaa + Qab (-Qa/Qb)) Qb - (Qba + Qbb (-Qa/Qb)) Qa]/Qb^2
= -(1/Qb^3)(Qaa Qb^2 - 2 Qab Qa Qb + Qbb Qa^2).

Hence, the satisfaction of the sufficient SOC implies that d^2 b/da^2 > 0, because µ > 0 and Qb > 0. That is, the isoquant is strictly convex.

Now assume that Q = A a^α b^β. The optimum implies that

Pa/Pb = Qa/Qb = αb/(βa) ⟹ b/a = (β/α)(Pa/Pb) = a constant.

See Figure 12.9. The expansion path serves to describe the least-cost combinations required to produce varying levels of Q0 and is the locus of the points of tangency. In the above case, the EP is a straight line. Note that any homogeneous production function can give rise to a linear EP. A more general class of functions, known as homothetic functions, can produce linear EPs too.

Definition Homotheticity can arise from a composite function in the form

H = h (Q (a, b)),  h'(Q) ≠ 0, (133)

where Q (a, b) is homogeneous of degree r.

(conti.) Note that H (a, b) is in general not homogeneous in a and b. Nonetheless, its EPs are linear:

Slope of H isoquant = -Ha/Hb = -h'(Q) Qa/[h'(Q) Qb] = -Qa/Qb = Slope of Q isoquant = constant for any given b/a,

given the linearity of the EPs of Q.

Check two examples: H = Q^2 (H is also a homogeneous function) and H = exp(Q) (H is not a homogeneous function).

What is the effect of a change in the exogenous input-price ratio Pa/Pb on the optimal input ratio b/a? We introduce the concept of the elasticity of substitution:

σ = (relative change in b/a)/(relative change in Pa/Pb) = [d(b/a)/(b/a)] / [d(Pa/Pb)/(Pa/Pb)] = [d(b/a)/d(Pa/Pb)] / [(b/a)/(Pa/Pb)]; (134)

the larger is σ, the greater the substitutability between the two inputs. If b/a is considered as a function of Pa/Pb, then σ is the ratio of the marginal function to the average function. For the CD function, b/a = (β/α)(Pa/Pb), which means that

σ = 1. (135)

For the CES function introduced above,

QL/QK = PL/PK ⟹ K/L = [δ/(1 - δ)]^(1/(1+ρ)) (PL/PK)^(1/(1+ρ)) ⟹ σ = 1/(1 + ρ).
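The result σ = 1/(1 + ρ) can be verified numerically from the tangency condition; the values of δ and ρ below are illustrative (here σ should equal 2/3):

```python
import math

# Numerical check that sigma = 1/(1 + rho) for the CES function.
delta, rho = 0.4, 0.5

def k_ratio(pl_over_pk):
    # Optimal K/L from the tangency condition QL/QK = PL/PK
    return (delta / (1 - delta) * pl_over_pk) ** (1 / (1 + rho))

p1, p2 = 2.0, 2.0002                 # small change in the price ratio
sigma = (math.log(k_ratio(p2)) - math.log(k_ratio(p1))) / (math.log(p2) - math.log(p1))
assert abs(sigma - 1 / (1 + rho)) < 1e-9
```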
