JOURNAL OF INDUSTRIAL AND MANAGEMENT OPTIMIZATION
Volume 17, Number 5, September 2021, pp. 2307–2329
doi:10.3934/jimo.2020070

COMPUTING SHADOW PRICES WITH MULTIPLE LAGRANGE MULTIPLIERS

Tao Jie and Gao Yan∗
University of Shanghai for Science and Technology
516 Jungong Road, Shanghai 200093, China

(Communicated by Liqun Qi)

Abstract. There is a wide consensus that the shadow prices of certain resources in an economic system are equal to Lagrange multipliers. However, this is misleading with respect to multiple Lagrange multipliers. In this paper, we propose a new type of Lagrange multiplier, the weighted minimum norm Lagrange multiplier, which is a type of shadow price. An attractive aspect of this type of Lagrange multiplier is that it conveys the sensitivity information when resources are required to be proportionally input. To compute the weighted minimum norm Lagrange multiplier, we propose two algorithms. One is the penalty function method with numeric stability, and the other is the accelerated gradient method with fewer arithmetic operations and a convergence rate of $O(1/k^2)$. Furthermore, we propose a two-phase procedure to compute a particular subset of shadow prices that belongs to the set of bounded Lagrange multipliers. This subset is particularly attractive since all its elements are computable shadow prices. We report numerical results for randomly generated problems.

1. Introduction. In recent years, given that the importance of environmental problems and resource allocation issues appears to be growing by the day, shadow prices (hereafter referred to as SPs) have received significant attention from researchers, particularly in the areas of the energy economy [17, 19, 15], resource pricing [9] and decision-making analysis [14, 7, 8]. A shadow price, as introduced independently by Tinbergen and Kantorovich in the 1930s, is defined as the infinitesimal change in the optimal value of the objective function arising from an infinitesimal change in the resource constraint (see [21]). In general, a Lagrange multiplier vector (hereafter referred to as an LM) expresses the sensitivity information as the SP only when the LM is unique. In multiple LM cases, researchers have long recognized that multiple LMs may lead to incorrect computations of SPs. This observation raises the natural question of what kind of sensitivity information Lagrange multipliers convey when they are not unique [5]. As to linear programming models, Jansen et al. [13] pointed out that managers who build their own microcomputer linear programming models are apt to misuse the resulting SPs and shadow costs, and that fallacious interpretations of these values can lead to expensive mistakes, especially unwarranted capital investments.

2020 Mathematics Subject Classification. Primary: 90C25, 90C30; Secondary: 90C31.
Key words and phrases. Shadow price, multiple Lagrange multipliers, weighted norm Lagrange multiplier, accelerated gradient method, convex programming.
Tao Jie is supported by National Natural Science Foundation of China grant No. 71601117 and Soft Science Foundation of Shanghai grant No. 19692104600.
∗ Corresponding author: Gao Yan.

Aucamp and Steinberg [2] also pointed out that the inequivalence of LMs and SPs occurs when the optimal solution is degenerate, which is also shown in [1, 12, 23]. In [2], Aucamp and Steinberg also proposed the notions of selling shadow prices and buying shadow prices, which correspond to two particular types of LMs. Akgul [1] also proposed a tableau-based method to compute these two types of SPs in linear programming models. Gauvin [11] extended the notion of selling/buying shadow prices to nonconvex programming models by resorting to parametric programming methods. However, a major drawback of these studies is that they fail to offer a path to compute the SPs in nonlinear programming models. To address this problem, Bertsekas et al. [6] proposed that the minimum norm LM can be used as a kind of SP in general programming models. The proposition of the minimum norm LM is of great significance since it provides a way to compute a certain SP. However, the lack of managerial significance of the minimum norm LM limits its application value. Furthermore, selling/buying shadow prices may also be impractical in real-world applications. To see this, note that in the production process resources are usually proportionally input, and thus it is unrealistic to consider the increment of each resource separately. In this research, we first extend the work of [6] and introduce the notion of the weighted minimum norm Lagrange multiplier, which is a new type of shadow price. In practice, managers have the flexibility to adjust the weights of the resources being input. More importantly, the weighted minimum norm Lagrange multiplier is computable in the sense that two algorithms are proposed to compute this type of LM, and the convergence results of these two algorithms are also proved. Moreover, we propose a two-phase procedure to compute the set of shadow prices that belongs to a bounded subset of Lagrange multipliers. This set is of particular interest since its elements are computable shadow prices. We also note that, to the best of our knowledge, the two-phase procedure is the first algorithm to identify whether a certain LM is an SP.

1.1. Structure of the paper. In Section 2, we provide several relevant definitions, such as Lagrange multipliers and shadow prices. In Section 3, we introduce the notion of the weighted minimum norm Lagrange multiplier and discuss the managerial significance of this type of Lagrange multiplier. In Section 4, we propose two algorithms to compute the weighted minimum norm Lagrange multiplier: one is the penalty function method with numeric stability, and the other is the accelerated gradient method with fewer arithmetic operations and a convergence rate of $O(1/k^2)$. In Section 5, enlightened by Wei and Yan's vertex enumeration method [22], we propose a two-phase procedure to compute the set of shadow prices that belongs to a bounded subset of Lagrange multipliers. This subset is particularly attractive since all its elements are computable shadow prices. In Section 6, numerical examples are given to demonstrate the computational efficiency of the proposed algorithms. Finally, some concluding remarks are given in Section 7.

2. Preliminaries.

2.1. Notations. Unless otherwise specified, the norm used in this paper is the Euclidean norm; $\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space; $S^n$, $S^n_+$ and $S^n_{++}$ denote the spaces of symmetric, positive semidefinite and positive definite matrices, respectively; $\cdot \to \cdot$ denotes the limit of a sequence; $g^+ = \max\{0, g\}$; $e$ denotes the vector whose elements are all equal to 1; $e_j$ denotes the vector whose elements are all zero except the $j$th element, which is one; $\nabla$ denotes the gradient operator; for $f : \mathbb{R}^n \to \mathbb{R}$, $\nabla_i f$ denotes the $i$th element of $\nabla f$; for a convex function $f$ and a point $x$, $\partial f(x)$ denotes the subdifferential of $f$ at $x$; for sets $A$ and $B$, $A \setminus B$ represents $A \cap B^c$; for a matrix $D$, $D(m, n)$ represents the element in the $m$th row and $n$th column of $D$, $D(m, :)$ represents the $m$th row of $D$, and $D(1 : m, :)$ represents the matrix constituted by the first $m$ rows of $D$.

2.2. Basic definitions. Consider the following economic system:
$$\max\ f(x) \quad \text{s.t.}\ g_i(x) \le 0, \quad i = 1, \dots, m, \tag{1}$$
where $f, g_i,\ i = 1, \dots, m$, are continuously differentiable functions defined on $\mathbb{R}^n$, and $x \in \mathbb{R}^n$ is the decision variable. The objective function $f$ represents the benefit function, and $g_i,\ i = 1, \dots, m$, represent the constraints on the supply of resources. Suppose $x^*$ is a locally optimal solution of Problem (1); we say that an inequality $g_i(x) \le 0$ is active at $x^*$ if $g_i(x^*) = 0$. We use $I(x^*)$ to denote the set of indices of active inequalities at the optimal solution $x^*$, i.e., $I(x^*) = \{i \mid g_i(x^*) = 0\}$. Correspondingly, $I(x^*)^c$ is the complement of $I(x^*)$, i.e., $I(x^*)^c = \{i \mid g_i(x^*) < 0\}$. The economic significance of the model is to make the best decision in order to maximize the value function under the constraint of limited resources.
Let $X = \{x \mid g_i(x) \le 0,\ i = 1, \dots, m\}$ be the feasible set of Problem (1). A vector $y \in \mathbb{R}^n$ is said to be a tangent direction of $X$ at $x^* \in X$ if either $y = 0$ or there exists a sequence $\{x_k\} \subseteq X$ such that $x_k \ne x^*$ for all $k$ and
$$x_k \to x^*, \qquad \frac{x_k - x^*}{\|x_k - x^*\|} \to \frac{y}{\|y\|}.$$
The set of all tangent directions of Problem (1) at $x^*$ is called the tangent cone of Problem (1) at $x^*$, denoted by $T_X(x^*)$. A cone $V(x^*)$ is called the cone of first order feasible variations if $V(x^*) = \{y \mid \nabla g_i(x^*)^T y \le 0,\ i \in I(x^*)\}$. The Lagrangian of Problem (1) is defined as $L(x; \mu) = f(x) - \sum_{j=1}^m \mu_j g_j(x)$.
An LM of the nonlinear optimization problem is defined as follows [5]:
Definition 1. Suppose $x^*$ is a locally optimal solution of Problem (1). If there exists $\mu^* \in \mathbb{R}^m$ satisfying
$$\nabla f(x^*) - \sum_{j=1}^m \mu_j^* \nabla g_j(x^*) = 0, \qquad \mu^* \ge 0, \qquad \mu_i^* = 0, \ i \in I(x^*)^c,$$
then $\mu^*$ is called a Lagrange multiplier of Problem (1) at $x^*$.
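To make Definition 1 concrete, the following minimal Python sketch (with hypothetical toy data; not part of the original paper) checks the three conditions numerically for a candidate multiplier:

```python
import numpy as np

# Hypothetical toy data at a known optimum x*: grad_f = grad f(x*),
# rows of grad_g are grad g_i(x*), and `active` lists the indices in I(x*).
grad_f = np.array([2.0, 0.0])
grad_g = np.array([[1.0,  1.0],
                   [1.0, -1.0],
                   [1.0, -2.0]])
active = {0, 1, 2}

def is_lagrange_multiplier(mu, grad_f, grad_g, active, tol=1e-8):
    """Check the three conditions of Definition 1 for mu."""
    stationary = np.linalg.norm(grad_f - grad_g.T @ mu) <= tol
    nonnegative = np.all(mu >= -tol)
    inactive_zero = all(abs(mu[i]) <= tol
                        for i in range(len(mu)) if i not in active)
    return stationary and nonnegative and inactive_zero

print(is_lagrange_multiplier(np.array([1.0, 1.0, 0.0]), grad_f, grad_g, active))  # True
print(is_lagrange_multiplier(np.array([2.0, 1.0, 0.0]), grad_f, grad_g, active))  # False
```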

Definition 2. Let $L > 0$. A function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be $L$-smooth over a set $D \subseteq \mathbb{R}^n$ if it is differentiable over $D$ and satisfies $\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|$ for all $x, y \in D$. The constant $L$ is called the smoothness parameter.

2.3. Constraint qualifications, Lagrange multipliers and shadow prices. The typical method to guarantee the existence of LMs is to impose constraint qualifications, for example the linear independence constraint qualification, which ascertain the equivalence of $T_X(x^*)$ and $V(x^*)$, that is, of the tangent cone of Problem (1) and the cone of first order feasible variations. This line of development of LM theory is insightful and intuitive, but does not offer satisfactory answers to the question of what kind of sensitivity information LMs convey when they are not unique.¹ Indeed, there exist several types of constraint qualifications that induce the uniqueness of the LM, and hence of the SP [18, 16]. These qualifications, however, are too strong and may not always be satisfied [1]. Naturally, the need arises to develop a modern view of LMs. A significant step forward in studying the sensitivity information of LMs was taken by [6], which proposed the notion of the informative LM; this notion has a strong connection with the notion of SP. Inspired by [6], we characterize the shadow price using the first-order derivative information of Problem (1) as follows.
Definition 3. Suppose $x^*$ is a locally optimal solution of Problem (1) with the active set $I(x^*)$, and $\mu^*$ is some corresponding Lagrange multiplier. Let $I_1 = \{i \mid i \in I(x^*),\ \mu_i^* > 0\}$ and $I_2 = I(x^*) \setminus I_1$. Then $\mu^*$ is a shadow price of Problem (1) if there exists $d \ne 0$ such that $\nabla g_i(x^*)^T d > 0$ for all $i \in I_1$, $\nabla f(x^*)^T d > 0$, and $\nabla g_i(x^*)^T d \le 0$ for all $i \in I_2$.
We explain Definition 3 as follows. With respect to the $i$th (scarce) resource in Problem (1), increasing the supply of the $i$th resource is equivalent to changing the right-hand side of the $i$th constraint to $\delta_i$, i.e., $g_i(x) \le \delta_i$ with $\delta_i > 0$. As $g_i$ is differentiable, there exists a direction $d$ such that $\nabla g_i(x)^T d > 0$. In view of the fact that the $i$th resource is scarce, the increment of this resource will lead to an increment of the value function, i.e., $\nabla f(x)^T d > 0$. Based on this observation, if a certain LM $\mu$ is an SP with $I_1 = \{i \mid \mu_i > 0\}$ and $I_2 = \{i \mid \mu_i = 0\}$, then there exists a direction $d$ such that $\nabla g_i(x)^T d > 0$ for all $i \in I_1$, $\nabla g_i(x)^T d \le 0$ for all $i \in I_2$, and $\nabla f(x)^T d > 0$. Therefore, from the geometrical perspective, there exists a hyperplane with normal direction $d$ that contains $\nabla g_i(x),\ i \in I_1$, and $\nabla f(x)$ in one of its open half-spaces, and $\nabla g_i(x),\ i \in I_2$, in the complementary closed half-space. From the economic perspective, the $i$th resource with $i \in I_1$ is a scarce resource, since a small increment in the supply of these resources would increase the profit of the economic system. Conversely, the $j$th resource with $j \in I_2$ is irrelevant to the increment of the profit of the economic system.
It is noted that this definition of the shadow price coincides with the notion of the informative Lagrange multiplier proposed in [5]. According to [5], the informative Lagrange multiplier exists as long as the locally optimal solution $x^*$ exists; hence the existence of a shadow price is guaranteed. We also note that shadow prices are not unique; hence in Section 5 we design algorithms to compute the set of shadow prices.

3. Weighted minimum norm Lagrange multiplier and shadow price. In this section, we review the theory of the minimum norm Lagrange multiplier (MinLM) from [5, 6] and discuss the merits and limitations of the MinLM.

¹In the Appendix we show that in nonlinear programming models the case of multiple LMs really does occur, and that not all LMs are SPs.

We then extend the MinLM to the weighted minimum norm Lagrange multiplier (WMinLM) to obtain a larger class of SPs.

3.1. Minimum norm Lagrange multiplier. Consider the following problem (the primal problem):

$$\min_{d \in \mathbb{R}^n}\ -\nabla f(x^*)^T d + \frac{1}{2} \sum_{i \in I(x^*)} \left( (\nabla g_i(x^*)^T d)^+ \right)^2. \tag{2}$$
The significance of the above model is the following: when loosening the constraints on some resources along a direction $d$ while keeping the constraints on the other resources unchanged, how much will the cost function decrease along this direction $d$? That is the exact mathematical formulation of the shadow price. According to [5], the dual of Problem (2) tells us that the MinLM is an SP. To see this, we reformulate Problem (2) as
$$\min_{d_1, d_2}\ -\nabla f(x^*)^T d_1 + \frac{1}{2} \sum_{i \in I(x^*)} \left( (\nabla g_i(x^*)^T d_2)^+ \right)^2, \quad \text{s.t.}\ d_1 = d_2,$$
whose Lagrangian is

$$L(d_1, d_2; \lambda) = -\nabla f(x^*)^T d_1 + \frac{1}{2} \sum_{i \in I(x^*)} \left( (\nabla g_i(x^*)^T d_2)^+ \right)^2 + \lambda^T (d_1 - d_2),$$
where $\lambda$ is the Lagrange multiplier corresponding to the constraint $d_1 = d_2$. Then the dual function $q(\lambda)$ is obtained as:

$$q(\lambda) = \min_{d_1, d_2 \in \mathbb{R}^n} L(d_1, d_2; \lambda) = \min_{d_1 \in \mathbb{R}^n} \left( -\nabla f(x^*)^T d_1 + \lambda^T d_1 \right) + \min_{d_2 \in \mathbb{R}^n} \left\{ \frac{1}{2} \sum_{i \in I(x^*)} \left( (\nabla g_i(x^*)^T d_2)^+ \right)^2 - \lambda^T d_2 \right\}.$$
The minimization over $d_1 \in \mathbb{R}^n$ of
$$-\nabla f(x^*)^T d_1 + \lambda^T d_1 \tag{3}$$
yields a value of $0$ if $\nabla f(x^*) - \lambda = 0$ and a value of $-\infty$ otherwise. For the minimization over $d_2$, the minimum is attained at
$$\lambda = \sum_{i \in I(x^*)} (\nabla g_i(x^*)^T d_2)^+ \nabla g_i(x^*). \tag{4}$$
Therefore,
$$q(\lambda) = \begin{cases} -\dfrac{1}{2} \displaystyle\sum_{i \in I(x^*)} \left( (\nabla g_i(x^*)^T d)^+ \right)^2, & \text{if } \nabla f(x^*) - \displaystyle\sum_{i \in I(x^*)} (\nabla g_i(x^*)^T d)^+ \nabla g_i(x^*) = 0, \\ -\infty, & \text{otherwise.} \end{cases}$$

Setting $\mu_i = (\nabla g_i(x^*)^T d)^+$ for $i \in I(x^*)$ and $\mu_i = 0$ for $i \notin I(x^*)$, we have the dual problem
$$\min_{\mu}\ \frac{1}{2} \|\mu\|_2^2 \quad \text{s.t.}\ \nabla f(x^*) - \sum_{i \in I(x^*)} \mu_i \nabla g_i(x^*) = 0, \quad \mu \ge 0.$$
The exact meaning of this dual problem is the minimum Euclidean norm Lagrange multiplier. Moreover, for each primal optimal solution $d^*$, we have
$$\mu_i^* = (\nabla g_i(x^*)^T d^*)^+, \quad i \in I(x^*). \tag{5}$$

Moreover, from Eqs. (3), (4) and (5), we have
$$\nabla f(x^*)^T d^* = -\|\mu^*\|^2. \tag{6}$$
3.2. Extension - weighted norm Lagrange multiplier. Although the MinLM offers a possibility to compute an SP, its practical value is limited in the sense that the economic significance of the MinLM is obscure. Intuitively speaking, in the production process the input of resources is usually in proportion; e.g., for a pharmacist, each ingredient should be input accurately in proportion so as to produce a standard medicine. To address this issue, we extend the work of [6] and propose a new characterization of the shadow price, namely the weighted norm Lagrange multiplier. Suppose the input of the $m$ resources is controlled in proportion to positive weights $\omega_1, \dots, \omega_m$. Consider the following program:

$$\max_{d \in \mathbb{R}^n}\ F(d) = \nabla f(x^*)^T d - \frac{1}{2} \sum_{i \in I(x^*)} \left( \omega_i (\nabla g_i(x^*)^T d)^+ \right)^2. \tag{7}$$
By a nearly verbatim repetition of the deduction above, we obtain the Lagrange dual problem of Problem (7); the stationarity condition now reads
$$\nabla f(x^*) - \sum_{i \in I(x^*)} \omega_i^2 (\nabla g_i(x^*)^T d)^+ \nabla g_i(x^*) = 0.$$
Setting $\mu_i = \omega_i^2 (\nabla g_i(x^*)^T d)^+$ for $i \in I(x^*)$ and $\mu_i = 0$ for $i \notin I(x^*)$, we have the dual problem
$$\min_{\mu}\ \frac{1}{2} \left\| \frac{\mu}{\omega} \right\|_2^2 \quad \text{s.t.}\ \nabla f(x^*) - \sum_{i \in I(x^*)} \mu_i \nabla g_i(x^*) = 0, \quad \mu \ge 0, \tag{8}$$
where $\left( \frac{\mu}{\omega} \right)_i = \mu_i / \omega_i$ for $i \in I(x^*)$ and $0$ for $i \notin I(x^*)$. Consequently, by the dual relationship of Problems (7) and (8), we know that the WMinLM is also a shadow price. Similarly, we can also obtain the following relationships:
$$\mu_i^* = \omega_i^2 (\nabla g_i(x^*)^T d^*)^+, \quad i \in I(x^*), \qquad \nabla f(x^*)^T d^* = -\left\| \frac{\mu^*}{\omega} \right\|^2. \tag{9}$$
We note that the WMinLM reduces to the MinLM if the inputs of the $m$ resources are equally weighted. We now discuss the economic significance of the WMinLM. Let $d^*$ be the optimal solution of Problem (7); then $d^*$ is also the optimal solution of the following problem:
$$\max\ \nabla f(x^*)^T d \quad \text{s.t.}\ \sum_{i \in I(x^*)} \left( \omega_i (\nabla g_i(x^*)^T d)^+ \right)^2 = \bar{\beta}, \tag{10}$$
where $\bar{\beta}$ is a constant whose value equals $\sum_{i \in I(x^*)} \left( \omega_i (\nabla g_i(x^*)^T d^*)^+ \right)^2$. We explain this in detail. Suppose, on the contrary, that the optimal solution of Problem (10) is $\tilde{d}^*$; it follows that
$$\nabla f(x^*)^T \tilde{d}^* > \nabla f(x^*)^T d^*, \qquad \sum_{i \in I(x^*)} \left( \omega_i (\nabla g_i(x^*)^T \tilde{d}^*)^+ \right)^2 = \sum_{i \in I(x^*)} \left( \omega_i (\nabla g_i(x^*)^T d^*)^+ \right)^2.$$

Therefore, substituting $\tilde{d}^*$ into Problem (7), the value $F(\tilde{d}^*)$ would be larger than $F(d^*)$, contradicting the fact that $d^*$ is optimal for Problem (7). In this sense, for any violation of the constraints $\beta > 0$, $\theta d^*$ with the positive scalar $\theta = \sqrt{\beta / \bar{\beta}}$ (note that the constraint is quadratic in $d$) solves the following problem:
$$\max\ \nabla f(x^*)^T d \quad \text{s.t.}\ \sum_{i \in I(x^*)} \left( \omega_i (\nabla g_i(x^*)^T d)^+ \right)^2 = \beta. \tag{11}$$
In Problem (11), given a direction $d \in \mathbb{R}^n$, the directional derivative $(\nabla g_i(x^*)^T d)^+$ expresses the increment of resource $i$, calculated up to first order of the resource supply function $g_i$. Similarly, $\nabla f(x^*)^T d$ expresses the improvement of the benefit function value at the current optimal state $x^*$ in the direction $d$. Therefore, $\beta$ in the constraint of Problem (11) sums up the increments of all resources with weights $\omega_i,\ i = 1, \dots, m$. The economic meaning of Problem (11) is to select an optimal direction $d^*$ that maximizes the benefit function improvement, calculated up to first order, for a given value of the constraint increment, calculated up to first order. Once the optimal direction $d^*$ is obtained, the shadow price is computed as
$$\mu_i^* = \omega_i^2 (\nabla g_i(x^*)^T d^*)^+, \quad i \in I(x^*), \qquad \mu_i^* = 0, \quad i \notin I(x^*),$$
which, by the above analysis, is exactly the weighted minimum norm Lagrange multiplier.
Let $I = \{i \mid \mu_i^* > 0\} \ne \emptyset$, and let $\{x_k\}$ be a sequence such that $x_k \to x^*$ with
$$\frac{x_k - x^*}{\|x_k - x^*\|} \to d^*.$$

Denote
$$\varepsilon_k = \frac{x_k - x^*}{\|x_k - x^*\|} - d^* \to 0 \quad (\text{as } k \to \infty).$$
By Eq. (9), we have
$$f(x_k) = f(x^*) + \|x_k - x^*\| \left( \nabla f(x^*)^T d^* + \nabla f(x^*)^T \varepsilon_k \right) + o(\|x_k - x^*\|) = f(x^*) - \left\| \frac{\mu^*}{\omega} \right\|^2 \|x_k - x^*\| + o(\|x_k - x^*\|).$$
Similarly, for $i \in I(x^*)$, we have
$$g_i(x_k) = g_i(x^*) + \|x_k - x^*\| \, \omega_i \left( \nabla g_i(x^*)^T d^* + \nabla g_i(x^*)^T \varepsilon_k \right) + o(\|x_k - x^*\|) = \|x_k - x^*\| \frac{\mu_i^*}{\omega_i} + o(\|x_k - x^*\|).$$
Noticing that
$$\frac{\mu_i^*}{\omega_i} g_i(x_k) = \|x_k - x^*\| \left( \frac{\mu_i^*}{\omega_i} \right)^2 + o(\|x_k - x^*\|),$$
we have
$$f(x_k) = f(x^*) - \sum_{i=1}^m \frac{\mu_i^*}{\omega_i} g_i(x_k) + o(\|x_k - x^*\|). \tag{12}$$
This means that the first order cost improvement is equal to
$$\sum_{i=1}^m \frac{\mu_i^*}{\omega_i} g_i(x_k).$$
Therefore, $\frac{\mu_i^*}{\omega_i}$ expresses the rate of improvement per unit increment of the $i$th resource along the steepest improvement direction $d^*$.

In summary, $d^*$ represents the direction of maximum improvement of the objective function, and $\mu_i^*$ expresses the rate of improvement per unit weighted increment of the $i$th resource along the steepest improvement direction $d^*$. Therefore, in Problem (7) the input proportion of each resource is controlled.

4. Computation of the weighted minimum norm Lagrange multiplier. In this section, we propose two algorithms to compute the WMinLM. The first algorithm, which solves Problem (8), is based on the penalty function method. The other algorithm, which addresses Problem (7), is based on the accelerated gradient method. The penalty function algorithm is numerically stable, and the accelerated gradient algorithm is computationally cheap.
4.1. Penalty function method. Suppose $x^*$ is a locally optimal solution of Problem (1). The WMinLM is directly computed by solving Problem (8), whose constraints are formed by the set of LMs. Since the LM set is a closed polyhedron, and the weighted Euclidean norm is coercive and strictly convex, there exists a unique optimal solution of Problem (8). Herein, we propose a penalty function algorithm (hereafter referred to as PFA) to compute the weighted minimum norm LM.
Let $I(x^*) = \{i_1, \dots, i_N\}$ with $\mathrm{card}(I(x^*)) = N$, and let $\rho \in \mathbb{R}$ be a penalty parameter. Furthermore, denote
$$G = \left( \nabla g_{i_1}(x^*), \cdots, \nabla g_{i_N}(x^*) \right) \in \mathbb{R}^{n \times N}, \qquad b = \nabla f(x^*) \in \mathbb{R}^n, \qquad \tilde{\mu} = \left( \mu_{i_1}, \cdots, \mu_{i_N} \right) \in \mathbb{R}^N.$$
Consider the following problem:
$$\min\ \frac{1}{2} \left\| \frac{\tilde{\mu}}{\omega} \right\|^2 + \frac{\rho}{2} \|b - G\tilde{\mu}\|^2 \quad \text{s.t.}\ G\tilde{\mu} = b, \quad \tilde{\mu} \ge 0. \tag{13}$$
Suppose $\tilde{\mu}$ is an optimal solution of Problem (13). Then, by the complementary slackness condition,
$$\mu_i^* = \begin{cases} \tilde{\mu}_i, & i \in I(x^*), \\ 0, & i \notin I(x^*), \end{cases}$$
is the optimal solution of Problem (8). Let $\lambda \in \mathbb{R}^n$ be the Lagrange multiplier vector of Problem (13); the Lagrangian is written as
$$L(\tilde{\mu}, \lambda) = \frac{1}{2} \left\| \frac{\tilde{\mu}}{\omega} \right\|^2 + \frac{\rho}{2} \|G\tilde{\mu} - b\|^2 + \lambda^T (G\tilde{\mu} - b).$$
Define $z^T = (\tilde{\mu}^T, \lambda^T)$ and the KKT mapping $T$ as
$$T(z) = \begin{pmatrix} \dfrac{\tilde{\mu}}{\omega^2} + \rho \, G^T (G\tilde{\mu} - b) + G^T \lambda \\ b - G\tilde{\mu} \end{pmatrix},$$

which is monotone [4]; here $\left( \frac{\tilde{\mu}}{\omega^2} \right)_i = \frac{\tilde{\mu}_i}{\omega_i^2}$. We also have the following optimality condition for Problem (13):
$$T(z^*) = 0, \qquad \tilde{\mu}^* \ge 0,$$
where $z^* = (\tilde{\mu}^*, \lambda^*)$, and the saddle point inequality is also satisfied:
$$L(\tilde{\mu}^*, \lambda) \le L(\tilde{\mu}^*, \lambda^*) \le L(\tilde{\mu}, \lambda^*), \quad \forall \tilde{\mu} \ge 0, \ \lambda \in \mathbb{R}^n.$$

Therefore, the PFA algorithm has the form:

$$z_{k+1} = P_{\mathbb{R}^N_+ \times \mathbb{R}^n} \left( z_k - \alpha_k T(z_k) \right),$$
where $\alpha_k$ is the step size. The iteration step can also be written as:

$$\tilde{\mu}_{k+1} = P_{\mathbb{R}^N_+} \left( \tilde{\mu}_k - \alpha_k \left( \frac{\tilde{\mu}_k}{\omega^2} + \rho \, G^T (G\tilde{\mu}_k - b) + G^T \lambda_k \right) \right), \qquad \lambda_{k+1} = \lambda_k - \alpha_k (b - G\tilde{\mu}_k).$$

The step size $\alpha_k$ can be selected with a non-summable but square-summable step length, i.e.,
$$\alpha_k = \frac{r_k}{\|T(z_k)\|}, \qquad \lim_{k \to \infty} r_k = 0, \qquad \sum_{k=1}^\infty r_k = \infty, \qquad \sum_{k=1}^\infty r_k^2 = R < \infty.$$
We can show that the sequence $\{z_k\}_{k=1}^\infty$ converges to the optimal solution $z^*$ of Problem (13).
Theorem 4. Suppose there exists $M > 0$ such that $\|z_1\| \le M$ and $\|z^*\| \le M$. Then $\tilde{\mu}_k \to \tilde{\mu}^*$ and $G\tilde{\mu}_k \to b$ as $k \to \infty$.
Proof. By direct calculation, we have
$$\begin{aligned} \|z_{k+1} - z^*\|^2 &= \left\| P_{\mathbb{R}^N_+ \times \mathbb{R}^n}(z_k - \alpha_k T(z_k)) - z^* \right\|^2 \le \|z_k - \alpha_k T(z_k) - z^*\|^2 \\ &\le \|z_k - z^*\|^2 - 2 \alpha_k T(z_k)^T (z_k - z^*) + \alpha_k^2 \|T(z_k)\|^2 \\ &\le \|z_k - z^*\|^2 - 2 r_k \frac{T(z_k)^T (z_k - z^*)}{\|T(z_k)\|} + r_k^2. \end{aligned}$$
Summing up the above inequalities from $k = 1$ to $k = N$,
$$\|z_{N+1} - z^*\|^2 + 2 \sum_{k=1}^N \frac{r_k}{\|T(z_k)\|} T(z_k)^T (z_k - z^*) \le \|z_1 - z^*\|^2 + R \le 4M^2 + R.$$
Since both terms on the left-hand side of the inequality above are nonnegative, for all $k$ we have
$$\|z_k - z^*\|^2 \le 4M^2 + R, \tag{14}$$

$$\sum_{k=1}^N \frac{r_k}{\|T(z_k)\|} T(z_k)^T (z_k - z^*) \le 2M^2 + \frac{R}{2}, \quad \forall N. \tag{15}$$
Inequality (14) means that $\{z_k\}$ is bounded, and thus $\{T(z_k)\}$ is also bounded. Based on our choice of $r_k$ and Inequality (15), we have
$$T(z_k)^T (z_k - z^*) \to 0, \quad \text{as } k \to \infty.$$
Moreover, it is obtained that
$$\begin{aligned} T(z_k)^T (z_k - z^*) &= \left( \frac{\tilde{\mu}_k}{\omega^2} + \rho \, G^T (G\tilde{\mu}_k - b) + G^T \lambda_k \right)^T (\tilde{\mu}_k - \tilde{\mu}^*) + (b - G\tilde{\mu}_k)^T (\lambda_k - \lambda^*) \\ &\ge \frac{1}{2} \left\| \frac{\tilde{\mu}_k}{\omega} \right\|^2 - \frac{1}{2} \left\| \frac{\tilde{\mu}^*}{\omega} \right\|^2 + \rho \|G\tilde{\mu}_k - b\|^2 + \lambda^{*T} (G\tilde{\mu}_k - b) \\ &= L(\tilde{\mu}_k, \lambda^*) - L(\tilde{\mu}^*, \lambda^*) + \frac{\rho}{2} \|G\tilde{\mu}_k - b\|^2. \end{aligned}$$
Since the left-hand side of the inequality above approaches zero, and the right-hand side is nonnegative, we have
$$L(\tilde{\mu}_k, \lambda^*) \to L(\tilde{\mu}^*, \lambda^*), \qquad G\tilde{\mu}_k \to b, \quad \text{as } k \to \infty.$$
Therefore, $\tilde{\mu}_k \to \tilde{\mu}^*$ and $G\tilde{\mu}_k \to b$ as $k \to \infty$.

We now present the PFA algorithm for computing the WMinLM.

Algorithm 1 Algorithm PFA to compute the weighted minimum norm Lagrange multiplier
1: Compute the optimal solution $x^*$.
2: Set $\mu_i^* = 0$, $i \notin I(x^*)$.
3: Initialize $z_1 = (\tilde{\mu}_1, \lambda_1)$.
4: while not convergent do
5:   compute $T(z_k)$;
6:   compute the step size $\alpha_k = r_k / \|T(z_k)\|$;
7:   $z_{k+1} = P_{\mathbb{R}^N_+ \times \mathbb{R}^n}(z_k - \alpha_k T(z_k))$.
8: end while
9: $\mu_i^* = \tilde{\mu}_i$, $i \in I(x^*)$.
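As an illustration, here is a minimal numpy sketch of Algorithm PFA; the step sequence $r_k = 1/k$ (non-summable but square-summable) and the iteration budget are our own hypothetical choices, not prescribed by the paper:

```python
import numpy as np

def pfa_wminlm(G, b, omega, rho=1.0, iters=5000):
    """Penalty function method (Algorithm PFA) for Problem (13).
    G: n-by-N matrix whose columns are the active-constraint gradients,
    b = grad f(x*), omega: positive weights of the N active constraints."""
    n, N = G.shape
    mu = np.zeros(N)                 # primal block of z, kept in R^N_+
    lam = np.zeros(n)                # multiplier block of z
    for k in range(1, iters + 1):
        # KKT mapping T(z) from Section 4.1
        T_mu = mu / omega**2 + rho * (G.T @ (G @ mu - b)) + G.T @ lam
        T_lam = b - G @ mu
        T_norm = np.sqrt(T_mu @ T_mu + T_lam @ T_lam)
        if T_norm < 1e-12:
            break
        alpha = (1.0 / k) / T_norm   # alpha_k = r_k / ||T(z_k)|| with r_k = 1/k
        mu = np.maximum(mu - alpha * T_mu, 0.0)   # projection onto R^N_+
        lam = lam - alpha * T_lam
    return mu                        # weighted minimum norm LM on I(x*)
```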

4.2. Accelerated gradient method. Since Algorithm PFA solves a constrained quadratic programming problem, it is not particularly attractive for large-scale problems. Alternatively, we can resort to Problem (7), an unconstrained optimization problem, to compute the WMinLM. First, in order to prove that Problem (7) has an optimal solution, we need the following two propositions.
Proposition 1. The objective function $F(d)$ in Problem (7) is unbounded if and only if there exist $r \in \mathbb{R}$ and $i \in \{1, \dots, n\}$ such that
$$\nabla_i f(x^*) r > 0, \qquad \nabla_i g_j(x^*) r \le 0, \quad j = 1, \dots, m.$$
Proof. Sufficiency. Let $\alpha > 0$ and let $d$ be the vector such that

$$d_i = \alpha r, \qquad d_j = 0, \quad j \ne i, \ j = 1, \dots, n.$$
Substituting $d$ into the objective function of Problem (7), we obtain
$$\alpha \left( \nabla_i f(x^*) r \right) \to \infty \quad (\alpha \to \infty).$$
Necessity. Suppose on the contrary that for all $i \in \{1, \dots, n\}$ the inequality system
$$\nabla_i f(x^*) r > 0, \qquad \nabla_i g_j(x^*) r \le 0, \quad j = 1, \dots, m, \tag{16}$$
is infeasible. Then for any $d \in \mathbb{R}^n$,
$$\nabla f(x^*)^T d - \sum_{i \in I(x^*)} \left( \left( \omega_i \nabla g_i(x^*)^T d \right)^+ \right)^2 = \sum_{j=1}^n \nabla_j f(x^*) d_j - \sum_{i \in I(x^*)} \left( \left( \omega_i \sum_{j=1}^n \nabla_j g_i(x^*) d_j \right)^+ \right)^2.$$

In view of the infeasibility of inequalities (16), for any $d_j \in \mathbb{R}$ with $\nabla_j f(x^*) d_j > 0$ we have $\left( (\nabla_j g_i(x^*) d_j)^+ \right)^2 > 0$ and
$$\lim_{|d_j| \to \infty} \frac{|\nabla_j f(x^*) d_j|}{\left( (\nabla_j g_i(x^*) d_j)^+ \right)^2} = 0.$$
This contradicts the unboundedness assumption on Problem (7).
By Proposition 1, Problem (7) has an optimal solution. To see this, since the objective function in Problem (7) is continuous, we only need to show that Problem (7) is bounded above. Suppose that Problem (7) is unbounded; then we can construct a feasible ascent direction $d$ as in the proof of Proposition 1. This contradicts the optimality of $x^*$.
Problem (7) is an unconstrained optimization model, and the gradient of the objective function $F(d)$ is
$$\nabla F(d) = \nabla f(x^*) - \sum_{i \in I(x^*)} \omega_i^2 (\nabla g_i(x^*)^T d)^+ \nabla g_i(x^*), \tag{17}$$
so the gradient ascent method can be applied to solve Problem (7). As will become clear below, $F(d)$ is also smooth, and we propose a more efficient first order algorithm to solve Problem (7), namely the accelerated gradient ascent method. In order to formulate the algorithm, we need the following proposition.
Proposition 2. The function $F(d) = \nabla f(x^*)^T d - \frac{1}{2} \sum_{i \in I(x^*)} \left( \omega_i (\nabla g_i(x^*)^T d)^+ \right)^2$ is $L$-smooth with smoothness parameter $L = \sum_{i \in I(x^*)} \omega_i^2 \|\nabla g_i(x^*)\|^2$.
Proof. Recalling the gradient computed in (17), $F(d)$ is differentiable over $\mathbb{R}^n$. For any $d_1, d_2 \in \mathbb{R}^n$,
$$\begin{aligned} \|\nabla F(d_1) - \nabla F(d_2)\| &= \left\| \sum_{i \in I(x^*)} \omega_i^2 \left( (\nabla g_i(x^*)^T d_1)^+ - (\nabla g_i(x^*)^T d_2)^+ \right) \nabla g_i(x^*) \right\| \\ &\le \sum_{i \in I(x^*)} \omega_i^2 \left| (\nabla g_i(x^*)^T d_1)^+ - (\nabla g_i(x^*)^T d_2)^+ \right| \|\nabla g_i(x^*)\| \\ &\le \sum_{i \in I(x^*)} \omega_i^2 \left| \nabla g_i(x^*)^T (d_1 - d_2) \right| \|\nabla g_i(x^*)\| \\ &\le \sum_{i \in I(x^*)} \omega_i^2 \|\nabla g_i(x^*)\|^2 \|d_1 - d_2\|, \end{aligned}$$
establishing the desired result.
Let $\theta = \frac{1}{\sum_{i \in I(x^*)} \omega_i^2 \|\nabla g_i(x^*)\|^2}$. The description of the accelerated gradient method (hereafter referred to as AGM) now follows.

Algorithm 2 Algorithm AGM to compute the weighted minimum norm LM

1: Initialize $d_0$, and set $y_0 = d_0$, $t_0 = 1$.
2: while not convergent, for $k = 0, 1, 2, \dots$, execute the following steps: do
3:   pick $\theta$ as the step size;
4:   set $d_{k+1} = y_k + \theta \nabla F(y_k)$;
5:   set $t_{k+1} = \dfrac{1 + \sqrt{1 + 4 t_k^2}}{2}$;
6:   compute $y_{k+1} = d_{k+1} + \left( \dfrac{t_k - 1}{t_{k+1}} \right) (d_{k+1} - d_k)$.
7: end while

Let $\bar{d}$ be the output of the AGM algorithm; then the WMinLM $\mu^*$ is computed as
$$\mu_i^* = \omega_i^2 (\nabla g_i(x^*)^T \bar{d})^+, \quad i \in I(x^*), \qquad \mu_i^* = 0, \quad i \in I(x^*)^c.$$
We can also establish the rate of convergence of the AGM algorithm.
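A minimal numpy sketch of Algorithm AGM follows; the fixed iteration budget is a hypothetical choice, while the step size $\theta = 1/L$ comes from Proposition 2:

```python
import numpy as np

def agm_wminlm(G, b, omega, iters=2000):
    """Accelerated gradient method (Algorithm AGM) for Problem (7).
    G: n-by-N matrix whose columns are the active-constraint gradients,
    b = grad f(x*), omega: positive weights. Returns (d, mu)."""
    n, N = G.shape
    L = np.sum(omega**2 * np.sum(G**2, axis=0))  # smoothness parameter (Prop. 2)
    theta = 1.0 / L                               # step size
    d = np.zeros(n); y = d.copy(); t = 1.0
    for _ in range(iters):
        grad_F = b - G @ (omega**2 * np.maximum(G.T @ y, 0.0))  # Eq. (17)
        d_next = y + theta * grad_F                             # ascent step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = d_next + ((t - 1.0) / t_next) * (d_next - d)        # momentum
        d, t = d_next, t_next
    mu = omega**2 * np.maximum(G.T @ d, 0.0)      # WMinLM on I(x*), Eq. (9)
    return d, mu
```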

Proposition 3. Let $\{d_k\}$ be a sequence generated by the AGM algorithm. Denote by $d^*$ a maximizer of Problem (7) and let $\mathrm{dist}(d)$ denote the distance from $d$ to the set of maximizers. Then $\lim_{k \to \infty} \mathrm{dist}(d_k) = 0$ and

$$F^* - F(d_k) \le \frac{2}{\theta (k+1)^2} \left( \mathrm{dist}(d_0) \right)^2.$$

For the proof of Proposition 3, see [4]. We note that the convergence rate of the AGM algorithm is $O(1/k^2)$, much faster than that of the gradient ascent method, which is $O(1/k)$.
It is noted that the AGM algorithm can also be used as an alternative way to compute the Lagrange multiplier when the LM is unique. We use a numerical example from [20] (Problem No. 329) to illustrate this:
$$\begin{aligned} \max\ & -(x_1 - 10)^3 - (x_2 - 20)^3 \\ \text{s.t.}\ & -(x_1 - 5)^2 - (x_2 - 5)^2 + 100 \le 0, \\ & -(x_1 - 6)^2 - (x_2 - 5)^2 \le 0, \\ & -82.81 + (x_1 - 6)^2 + (x_2 - 5)^2 \le 0, \\ & 13 \le x_1 \le 16, \quad 0 \le x_2 \le 15. \end{aligned} \tag{18}$$
The optimal solution of Problem (18) is $x^* = (14.09, 0.8430)^T$. Information on Problem (18) is plotted in Figure 1. The contours in the figure represent the level curves of the objective function $f(x) = -(x_1 - 10)^3 - (x_2 - 20)^3$. The red dotted line represents the first constraint function $g_1(x) = -(x_1 - 5)^2 - (x_2 - 5)^2 + 100 = 0$, and the black solid line represents the third constraint function $g_3(x) = -82.81 + (x_1 - 6)^2 + (x_2 - 5)^2 = 0$. Since the second constraint is redundant, it is not plotted. The optimal solution $x^* = (14.09, 0.8430)^T$ is denoted by the black point. Here, $g_4$ through $g_7$ represent the lower and upper bound constraints on the variables, and $\nabla f(x^*)$, $\nabla g_1(x^*)$ and $\nabla g_3(x^*)$ represent the gradients of $f$, $g_1$ and $g_3$ at $x^*$. A simple computation reveals that only $g_1(x)$ and $g_3(x)$ are active at $x^*$, and thus the LMs corresponding to the remaining constraints are zero.

[Figure 1 near here: contour plot of the objective of Problem (18) with the constraint curves $g_1(x) = 0$ and $g_3(x) = 0$, the optimal solution $x^*$, and the gradients $\nabla f(x^*)$, $\nabla g_1(x^*)$, $\nabla g_3(x^*)$.]

Figure 1. Numerical example in Schittkowski (1987)

Since $\nabla g_1(x^*)$ and $\nabla g_3(x^*)$ are linearly independent, there exists a unique LM corresponding to $x^*$ by the regularity constraint qualification [3]. Applying the AGM algorithm, we compute $\lambda = (1097.12, 0, 1229.54, 0, 0, 0, 0)$, which is also an SP. It can be confirmed in Figure 1 that there exists a hyperplane with normal direction $d$ that contains $\nabla f(x^*)$, $\nabla g_1(x^*)$ and $\nabla g_3(x^*)$ in one of its closed half-spaces.

5. Computation of the set of shadow prices. Besides the weighted minimum norm Lagrange multiplier, we can characterize and compute a larger class of shadow prices. In particular, we are interested in characterizing the subset that is the intersection of the LM set and the SP set. This subset is attractive since all its elements are computable shadow prices. The following proposition, illustrated in Figure 2, clarifies the relationships between LMs and SPs, assuming there exists at least one LM.

[Figure 2 near here: diagram of the Lagrange multiplier set $S_{LM}$, its bounded subset $S_{BLM}$, the shadow price set $S_{SP}$, and their intersection $\bar{S}$.]

Figure 2. Relationship of Lagrange multipliers and shadow prices

Proposition 4. Assume there exists at least one Lagrange multiplier. Then:
• The set of SPs (i.e., $S_{SP}$) is bounded; that is, $S_{SP} \cap (S_{LM} \setminus S_{BLM}) = \emptyset$, where $S_{LM}$ and $S_{BLM}$ denote the set of Lagrange multipliers and the set of bounded Lagrange multipliers, respectively.
• $S_{SP} \not\subseteq S_{BLM}$, and $S_{BLM} \not\subseteq S_{SP}$.
Proof. The second claim of Proposition 4 follows directly from [5] and [11], and thus we only need to prove the first claim. Assume, to arrive at a contradiction, that a sequence of LMs $\{\mu_k\} \subseteq \mathbb{R}^m$ with $\|\mu_k\| \to \infty$ $(k \to \infty)$ consists of SPs. Since $\{\mu_k\}$ is a sequence of LMs corresponding to $x^*$, there exists some $i$ such that $\mu_k^{(i)} \to \infty$ as $k \to \infty$, and
$$\nabla f(x^*) - \sum_{i=1}^m \mu_k^{(i)} \nabla g_i(x^*) = 0, \qquad \mu_k^{(i)} g_i(x^*) = 0, \quad i = 1, \dots, m, \qquad \mu_k \ge 0. \tag{19}$$

Denote $I = \{i \in \{1, \dots, m\} \mid \mu_k^{(i)} \to \infty \text{ as } k \to \infty\}$ and $I^c = \{1, \dots, m\} \setminus I$. Since the sequence $\{\mu_k\}$ is unbounded, the index set $I$ is nonempty. Since for all $k$, $\mu_k$ is a shadow price, according to Definition 3 there exists a $d \in \mathbb{R}^n$ such that $\nabla g_i(x^*)^T d > 0$ for all $i \in I_1$ and $\nabla f(x^*)^T d > 0$, where $I_1 = \{i \mid i \in I(x^*),\ \mu_k^{(i)} > 0\}$.

Noticing that for $k$ sufficiently large $\mu_k^{(i)} > 0$ for $i \in I$, the index set $I$ belongs to $I_1$. It then follows that $\nabla g_i(x^*)^T d > 0$ for all $i \in I$.² Therefore, by Eq. (19),
$$\nabla f(x^*)^T d - \sum_{i \in I^c} \mu_k^{(i)} \nabla g_i(x^*)^T d = \sum_{i \in I} \mu_k^{(i)} \nabla g_i(x^*)^T d,$$
which is a contradiction because the right-hand side is unbounded, whereas the left-hand side is not.
Remark 1. The set of LMs at $x^*$,
$$S_{LM} = \left\{ \mu \ \middle| \ \nabla f(x^*) - \sum_{i=1}^m \mu_i \nabla g_i(x^*) = 0, \quad \mu_i g_i(x^*) = 0, \ i = 1, \dots, m, \quad \mu \ge 0 \right\},$$
is an unbounded polyhedron constructed by the intersection of hyperplanes and half-spaces. Proposition 4, however, states that we only need to focus on its bounded subset in order to compute SPs.³

Based on Proposition 4, when analyzing SPs we can focus on the following bounded subset $S_{BLM}$ of the LM set:
$$S_{BLM} = \left\{ \mu \ \middle| \ \nabla f(x^*) - \sum_{i=1}^m \mu_i \nabla g_i(x^*) = 0, \quad \mu_i g_i(x^*) = 0, \ i = 1, \dots, m, \quad \mu \ge 0, \quad e^T \mu \le M \right\},$$
where $M$ represents a sufficiently large constant.
In practice, we are particularly interested in the intersection of $S_{BLM}$ and $S_{SP}$, that is, $\bar{S} = S_{BLM} \cap S_{SP}$. The reason is that $\bar{S}$ consists of shadow prices that can be computed as Lagrange multipliers, while the other SPs cannot. In this section, we present a method of computing $\bar{S}$ based on a two-phase procedure. The first phase identifies the vertices of $\bar{S}$; elements of $\bar{S}$ are then expressed as convex combinations of these vertices. In the second phase, we propose an algorithm to identify whether a given element of $\bar{S}$ is an SP.
5.1. Phase 1 - Vertex computation. First, we consider the expression of all LMs in $\bar{S}$. Since $\bar{S}$ is a bounded polyhedron, it can be expressed as the convex combination of its vertices. Using vertex enumeration methods [10], the vertices of $\bar{S}$ can be computed. Here we apply the vertex enumeration method proposed by [22]. The following two lemmas are extracted from [22], and their proofs are omitted.
Lemma 5. Suppose that $P = \{x \in \mathbb{R}^n \mid c^T x \ge \alpha,\ x \ge 0,\ e^T x = 1\}$.
• If $(c_k - \alpha)(c_l - \alpha) < 0$, $c_k \ne c_l$, $1 \le k < l \le n$, then

$$d = \left( 0, \dots, 0, \frac{c_l - \alpha}{c_l - c_k}, 0, \dots, 0, -\frac{c_k - \alpha}{c_l - c_k}, 0, \dots, 0 \right)^T \in \mathbb{R}^n$$
is a vertex of $P$, where $\frac{c_l - \alpha}{c_l - c_k}$ and $-\frac{c_k - \alpha}{c_l - c_k}$ are respectively the $k$th and $l$th elements of $d$;

²We do not care about the sign of $\nabla g_i(x^*)^T d$ for $i \in I^c$, since $\{\mu_k^{(i)}\},\ i \in I^c$, is bounded.
³We note that the LM set $S_{LM}$ is bounded if and only if the Mangasarian-Fromovitz constraint qualification holds; that is, there exists a vector $d$ such that $\nabla g_i(x^*)^T d < 0,\ i \in I(x^*)$.

• If $c_k - \alpha \ge 0$ for $1 \le k \le n$, then $d = e_k$ is a vertex of $P$.
Lemma 6. Suppose that $P = \{\mu \in \mathbb{R}^n \mid c^T \mu = \alpha,\ \mu \ge 0,\ e^T \mu = 1\}$.

If $(c_k - \alpha)(c_l - \alpha) \le 0$, $c_k \ne c_l$, $1 \le k < l \le n$, then

$$d = \left( 0, \dots, 0, \frac{c_l - \alpha}{c_l - c_k}, 0, \dots, 0, -\frac{c_k - \alpha}{c_l - c_k}, 0, \dots, 0 \right)^T \in \mathbb{R}^n$$
is a vertex of $P$, where $\frac{c_l - \alpha}{c_l - c_k}$ and $-\frac{c_k - \alpha}{c_l - c_k}$ are respectively the $k$th and $l$th elements of $d$.
Based on Lemmas 5 and 6, the polyhedron
$$P^1 = \{x \in \mathbb{R}^n \mid a_1^T x \ge b_1,\ x \ge 0,\ e^T x = 1\}$$
can be represented by
$$P^1 = \{D^1 \lambda \mid e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_1}\},$$
where
$$D^1 = (d^{11}, d^{12}, \dots, d^{1 k_1})_{n \times k_1},$$

and $d^{11}, \dots, d^{1 k_1}$ are the vertices of $P^1$. Denote $k_0 = n$, $\Lambda^1 = P^1$. As to
$$P^2 = \{x \in \mathbb{R}^n \mid a_1^T x \ge b_1,\ a_2^T x \ge b_2,\ e^T x = 1,\ x \ge 0\},$$
we have
$$P^2 = \{x \in \mathbb{R}^n \mid a_2^T x \ge b_2,\ x \in P^1\} = \{D^1 \lambda \mid a_2^T D^1 \lambda \ge b_2,\ e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_1}\},$$

and we set $\Lambda^2 = \{\lambda \in \mathbb{R}^{k_1} \mid a_2^T D^1 \lambda \ge b_2,\ e^T \lambda = 1,\ \lambda \ge 0\}$. From Lemma 5,
$$\Lambda^2 = \{D^2 \lambda \mid e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_2}\},$$

where $D^2 = (d^{21}, \dots, d^{2 k_2})_{k_1 \times k_2}$, and $d^{21}, \dots, d^{2 k_2}$ are all the vertices of $\Lambda^2$. Therefore

$$P^2 = \{D^1 \lambda \mid (a_2^T D^1) \lambda \ge b_2,\ e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_1}\} = \{D^1 \lambda \mid \lambda \in \Lambda^2\} = \{D^1 D^2 \lambda \mid e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_2}\}.$$
In general, suppose that
$$P^l = \{D^1 D^2 \cdots D^l \lambda \mid e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_l}\},$$
where, for $s = 1, 2, \dots, l$,
$$D^s = (d^{s1}, \dots, d^{s k_s})_{k_{s-1} \times k_s},$$

and $d^{s1}, \dots, d^{s k_s}$ are all the vertices of $\Lambda^s$, where $\Lambda^s = \{\lambda \mid (a_s^T D^1 D^2 \cdots D^{s-1}) \lambda \ge b_s,\ e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_{s-1}}\}$. Then, for $P^{l+1}$ we have
$$P^{l+1} = \{x \in \mathbb{R}^n \mid a_{l+1}^T x \ge b_{l+1},\ x \in P^l\} = \{D^1 \cdots D^{l+1} \lambda \mid e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_{l+1}}\}.$$
Based on the analysis above, we obtain the following lemma.

Lemma 7. Suppose $P = \{x \in \mathbb{R}^n \mid Ax \ge b,\ x \ge 0,\ e^T x = 1\}$ and $D^1 D^2 \cdots D^m = (d^1, \dots, d^{k_m})_{n \times k_m}$, where $A \in \mathbb{R}^{m \times n}$. Then:
• $P = \{D^1 \cdots D^m \lambda \mid e^T \lambda = 1,\ \lambda \ge 0,\ \lambda \in \mathbb{R}^{k_m}\}$;
• the vertices of $P$ belong to $\{d^1, \dots, d^{k_m}\}$.

Since each element $\lambda$ of $\bar{S}$ satisfies
$$\nabla f(x^*) - \sum_{j=1}^m \lambda_j \nabla g_j(x^*) = 0, \qquad \lambda_i g_i(x^*) = 0, \quad i = 1, \dots, m, \qquad \lambda \ge 0, \qquad e^T \lambda \le M,$$

$\bar{S}$ is described by a system of equalities, one inequality, and $m$ lower bound constraints. We make the following transformation of $\bar{S}$:
• set $\mu_i = \lambda_i / M$, $i = 1, \dots, m$, and $\nabla \tilde{f}(x^*) = \nabla f(x^*) / M$;
• add a slack variable $\mu_{m+1}$ so as to transform the last inequality in $\bar{S}$ into an equality.
This yields an equivalent set $\hat{S}$ whose elements $\mu$ satisfy

$$\nabla \tilde{f}(x^*) - \sum_{j=1}^m \mu_j \nabla g_j(x^*) = 0, \qquad \mu_i g_i(x^*) = 0, \quad i = 1, \dots, m, \qquad \mu_i \ge 0, \quad i = 1, \dots, m+1, \qquad e^T \mu = 1.$$

It is obvious that if $\mu$ is a vertex of $\hat{S}$, then $\lambda = M\mu$ is a vertex of $\bar{S}$, which is also a corresponding LM. The vertices of $\hat{S}$ can be computed directly by the following vertex enumeration method (hereafter referred to as VEM) of Wei and Yan [22]; thus all bounded LMs can be represented as convex combinations of the vertices of $\bar{S}$ computed by Algorithm VEM.

Algorithm 3 Algorithm VEM to compute the vertices of the bounded LM set
1: Initialize $D^0 = D$ as an identity matrix.
2: for $i = 1 : n + m$ do
3:   compute $a_i^T D$ and $D^i$;
4:   update $D = D \times D^i$;
5:   delete the columns of $D$ at which fewer than $m + 1$ equalities hold.
6: end for
7: $V = D(1 : m, :)$, and each column of $V$ represents a vertex of $\hat{S}$.
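Algorithm VEM follows the Wei-Yan recursion above. As an independent cross-check for small instances, the sketch below enumerates the vertices of $\hat{S} = \{\mu \ge 0 \mid A\mu = b\}$ as basic feasible solutions; this is a standard brute-force substitute, not the authors' method, and the helper name is ours:

```python
import numpy as np
from itertools import combinations

def vertices_bruteforce(A, b, tol=1e-9):
    """Enumerate vertices of {mu >= 0 : A mu = b} as basic feasible
    solutions. Exponential in the number of variables; suitable for
    small instances only."""
    p, q = A.shape                       # p equalities, q variables
    verts = []
    for basis in combinations(range(q), p):
        B = A[:, basis]
        if np.linalg.matrix_rank(B) < p:
            continue                     # singular basis, skip
        mu_B = np.linalg.solve(B, b)
        if np.all(mu_B >= -tol):
            v = np.zeros(q)
            v[list(basis)] = mu_B
            if not any(np.allclose(v, w, atol=1e-7) for w in verts):
                verts.append(v)          # keep distinct vertices only
    return verts
```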

5.2. Phase 2 - Shadow price identification. In the second phase, we propose an algorithm to identify whether these LMs are SPs. According to Definition 3, identifying whether a given LM belongs to the set of SPs (denoted $SP_{x^*}$) is equivalent to checking the feasibility of a set of linear inequalities. To see this, supposing that $v$ is a given LM, it suffices to check the feasibility of the following system of inequalities:

$$\nabla f(x^*)^T d > 0, \qquad \nabla g_i(x^*)^T d > 0, \quad \forall i \in \{i \mid v_i > 0\}, \qquad \nabla g_i(x^*)^T d \le 0, \quad \forall i \in \{i \mid v_i = 0\}. \tag{20}$$

The region described by inequalities (20) is not closed. We can alternatively choose a tolerance $\epsilon > 0$ sufficiently small and consider the following closed form of the inequality system:
$$-\nabla f(x^*)^T d \le -\epsilon, \qquad -\nabla g_i(x^*)^T d \le -\epsilon, \quad \forall i \in \{i \mid v_i > 0\}, \qquad \nabla g_i(x^*)^T d \le \epsilon, \quad \forall i \in \{i \mid v_i = 0\}. \tag{21}$$

It is obvious that $v \in SP_{x^*}$ if and only if (21) is $\epsilon$-feasible. The feasibility of (21) can be checked via the following min-max problem, which can be easily solved by subgradient methods:
$$\min_d F(d), \tag{22}$$
where
$$F(d) = \max \left\{ -\nabla f(x^*)^T d; \ -\nabla g_i(x^*)^T d, \ \forall i \in \{i \mid v_i > 0\}; \ \nabla g_i(x^*)^T d, \ \forall i \in \{i \mid v_i = 0\}; \ -\epsilon \right\}.$$
It is obvious that if the optimal value of Problem (22) is $-\epsilon$, then the inequality system (21) is feasible. Assuming there exists a point $d$ such that the optimal value $F^*$ of Problem (22) is $-\epsilon$, we can use Polyak's step size (see Polyak, 1987):

$$\alpha_k = \frac{F(d_k) + \epsilon}{\|g_k\|_2^2},$$
where $g_k$ is a subgradient of $F$ at $d_k$. Moreover, we can use $F(d_k) \le -\epsilon$ as the termination condition because $F^* = -\epsilon$. Therefore, the iterations of the subgradient method for solving Problem (22) are:

$$d_{k+1} = d_k - \frac{F(d_k) + \epsilon}{\|g_k\|_2^2} g_k, \qquad g_k = \begin{cases} -\nabla f(x^*), & \text{if } F(d_k) = -\nabla f(x^*)^T d_k, \\ -\nabla g_i(x^*), & \text{if } F(d_k) = -\nabla g_i(x^*)^T d_k, \ i \in \{i \mid v_i > 0\}, \\ \nabla g_i(x^*), & \text{if } F(d_k) = \nabla g_i(x^*)^T d_k, \ i \in \{i \mid v_i = 0\}. \end{cases}$$
The algorithm to identify whether a given LM is an SP is as follows (hereafter referred to as ISP, which stands for Identification of Shadow Prices):

Algorithm 4 Algorithm ISP to identify whether an LM is an SP

1: Initialize $d_0$, $F(d_0)$, $k = 1$, maxiter.
2: while $F(d_k) > -\epsilon$ and $k <$ maxiter do
3:   compute $g_k$;
4:   $d_{k+1} = d_k - \frac{F(d_k) + \epsilon}{\|g_k\|^2} g_k$;
5:   $k = k + 1$;
6:   update $F(d_{k+1})$.
7: end while
8: if $F(d_k) \le -\epsilon$ then
9:   $v$ is an SP;
10: else
11:   $v$ is not an SP.
12: end if
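A compact numpy sketch of Algorithm ISP is given below, under our own assumptions: grad_g holds only the active-constraint gradients, v holds the corresponding multiplier components, and the tolerance and iteration cap are hypothetical:

```python
import numpy as np

def isp(grad_f, grad_g, v, eps=1e-3, maxiter=10000):
    """Algorithm ISP: decide whether the LM v is a shadow price by
    minimizing F(d) of Problem (22) with Polyak's step size (F* = -eps).
    grad_g: rows are the gradients of the active constraints at x*."""
    pos = v > 0
    # linear pieces of F(d): -grad f, -grad g_i (v_i > 0), +grad g_i (v_i = 0)
    rows = np.vstack([-grad_f[None, :], -grad_g[pos], grad_g[~pos]])
    d = np.zeros_like(grad_f, dtype=float)
    for _ in range(maxiter):
        vals = rows @ d
        if vals.max() <= -eps:          # F(d) = -eps: system (21) holds
            return True                 # v is a shadow price
        g = rows[np.argmax(vals)]       # subgradient of the active piece
        d = d - (vals.max() + eps) / (g @ g) * g   # Polyak step
    return False                        # no certificate found within budget
```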

Theorem 8 establishes the sublinear convergence rate of Algorithm ISP.

Theorem 8. Assume $f, g_i,\ i = 1, \dots, m$, are Lipschitz continuous functions, $\{d_k\}_{k=1}^\infty$ is generated by Algorithm ISP, and $d^*$, $F^* = -\epsilon$ are respectively the optimal solution and the optimal value of Problem (22). Moreover, assume $\|d_1 - d^*\| \le R$, where $R \in \mathbb{R}$. Then $F(d_k) \to F^*$.

Proof. Since $f, g_i,\ i = 1, \dots, m$, are Lipschitz continuous, there exists a positive $G$ such that for all $x, y \in \mathbb{R}^n$:
$$|f(x) - f(y)| \le G \|x - y\|, \qquad |g_i(x) - g_i(y)| \le G \|x - y\|, \quad i = 1, \dots, m,$$
which means that the subgradients satisfy $\|g_k\| \le G$. Since $d_{k+1} = d_k - \alpha_k g_k$ with $\alpha_k = \frac{F(d_k) + \epsilon}{\|g_k\|^2}$, we have
$$\begin{aligned} \|d_{k+1} - d^*\|^2 &\le \|d_k - d^*\|^2 - 2 \alpha_k g_k^T (d_k - d^*) + \alpha_k^2 \|g_k\|^2 \\ &\le \|d_k - d^*\|^2 - 2 \frac{F(d_k) + \epsilon}{\|g_k\|^2} (F(d_k) + \epsilon) + \frac{(F(d_k) + \epsilon)^2}{\|g_k\|^2} = \|d_k - d^*\|^2 - \frac{(F(d_k) + \epsilon)^2}{\|g_k\|^2}. \end{aligned}$$
The second inequality is due to the convexity of $F(d)$, which gives $g_k^T (d_k - d^*) \ge F(d_k) - F^* = F(d_k) + \epsilon$. Therefore,

$$\sum_{i=1}^k \frac{(F(d_i) + \epsilon)^2}{\|g_i\|^2} \le \|d_1 - d^*\|^2,$$
and by using $\|d_1 - d^*\| \le R$ and $\|g_i\| \le G$, we have
$$\sum_{i=1}^k (F(d_i) + \epsilon)^2 \le (RG)^2,$$
which means $F(d_k) - F^* \to 0$ as $k \to \infty$. Moreover, the number of steps needed before we can guarantee sub-optimality $\delta$ is $(RG/\delta)^2$.
It seems impossible to determine whether all elements of $\bar{S}$ are SPs by checking only the vertices of $\bar{S}$: there exist numerical examples in which every convex combination of the vertices of $\bar{S}$ is a shadow price although none of the vertices themselves is. The following numerical example illustrates this phenomenon:
$$\begin{aligned} \max\ & 0.1 x_1^2 + 0.5 x_2^2 \\ \text{s.t.}\ & g_1(x) = -(x_1 - 1) + (x_2 - 1) \le 0, \\ & g_2(x) = -0.5 (x_1 - 1) + (x_2 - 1) \le 0, \\ & g_3(x) = 0.5 (x_1 - 1) + (x_2 - 1) \le 0, \\ & g_4(x) = (x_1 - 1) + (x_2 - 1) \le 0, \end{aligned} \tag{23}$$
with the optimal solution $x^* = (1, 1)$. Information on the numerical example (23) is plotted in Figure 3. In the figure, the ellipsoidal curves represent the contour lines of the objective function, the four dotted lines represent the four constraint functions, $\nabla f(x^*)$ represents the gradient of the objective function at the optimal solution $x^*$, and $\nabla g_i,\ i = 1, \dots, 4$, represents the gradient of the $i$th constraint function at $x^*$. Applying Algorithm VEM, we find that there are four vertices of the bounded Lagrange multiplier set:
$$v_1 = \left( \tfrac{1}{3}, 0, \tfrac{2}{3}, 0 \right)^T, \quad v_2 = \left( \tfrac{1}{2}, 0, 0, \tfrac{1}{2} \right)^T, \quad v_3 = \left( 0, \tfrac{1}{2}, \tfrac{1}{2}, 0 \right)^T, \quad v_4 = \left( 0, \tfrac{2}{3}, 0, \tfrac{1}{3} \right)^T.$$

It can be seen that none of the four vertices is an SP. Taking $v_1$ as an example, it is impossible to find a direction $d$ that is the normal of a hyperplane containing the vectors $\nabla f(x^*)$, $\nabla g_1$ and $\nabla g_3$ in one of its closed half-spaces and the vectors $\nabla g_2$ and $\nabla g_4$ in the complementary half-space. However, any convex combination of these four vertices is an SP, because it is possible to find a direction $d$ such that $\nabla f(x^*)^T d > 0$ and moving from $x^*$ in the direction of $d$ violates only those constraints that have positive multipliers (three or four constraints).

[Figure 3 near here: contour plot of the objective of Problem (23) with the four constraint lines and the gradients $\nabla f(x^*)$, $\nabla g_1, \dots, \nabla g_4$ at $x^* = (1, 1)$.]

Figure 3. Numerical Example

6. Numerical tests. In this section, we use numerical examples to verify the validity and efficiency of the PFA, AGM, VEM and ISP algorithms. All numerical tests are conducted on a computer server with an Intel(R) Xeon(R) CPU @2.40 GHz and 192 GB of RAM, and the programming language is MATLAB 2017b.
First, we examine the convergence property of the PFA algorithm. We choose $n = 20$ and $N = 100$, and randomly generate vectors $\nabla f(x^*), \nabla g_1(x^*), \dots, \nabla g_{100}(x^*)$. The penalty parameter is set as $\rho = 1$, $\rho = 10^2$ and $\rho = 10^4$, respectively. The results are plotted in Figure 4. They clearly show that the PFA algorithm converges rapidly when the parameter $\rho$ is set to 1. The reason is that as the weight of the penalty term in the objective function increases, each iteration contributes less to the decrement of the original objective function.
We then apply the AGM, VEM and ISP algorithms to randomly generated quadratic optimization models. A quadratic optimization model was chosen because, in economic applications, the minimization of costs is often treated as the objective function of an economic system, and cost functions are often represented in quadratic form. In this sense, we formulate our test model as follows:
$$\max_x\ -\frac{1}{2} x^T H x - c^T x \quad \text{s.t.}\ Ax \le b, \tag{24}$$
where $H$ is a positive definite matrix, $c \in \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$, and $b \in \mathbb{R}^m$.
With respect to the AGM algorithm, we aim to examine its convergence properties using randomly generated data sets with various dimensions $n$ and numbers of constraints $m$.
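For reproducibility, a random instance of model (24) can be generated along the following lines (a hypothetical numpy sketch; the paper does not specify its MATLAB data generation at this level of detail):

```python
import numpy as np

def random_qp_instance(m, n, seed=0):
    """Generate a random instance of test model (24):
    max -0.5 x'Hx - c'x  s.t.  Ax <= b, with H positive definite."""
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((n, n))
    H = Q @ Q.T + n * np.eye(n)       # positive definite by construction
    c = rng.standard_normal(n)
    A = rng.standard_normal((m, n))
    x0 = rng.standard_normal(n)
    b = A @ x0 + rng.uniform(0.1, 1.0, size=m)  # keep x0 strictly feasible
    return H, c, A, b
```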

[Figure 4 near here: objective error versus iterations of the PFA algorithm for penalty parameters rho = 1e0, 1e+2 and 1e+4.]

Figure 4. Convergence of the PFA algorithm with different penalty parameters

The convergence results of the AGM algorithm on these data sets are illustrated in Table 1. The first and second columns of Table 1 respectively give the number of constraints $m$ and the dimension $n$ of the data set, and the last column gives the computing time of the AGM algorithm in seconds.

  m      n       Computational time (s)
  500     5000    10.7504
  500    10000    11.4222
  500    20000    12.5574
  500    50000    15.5700
  1000    5000    21.4259
  1000   10000    21.8054
  1000   20000    22.2733
  1000   50000    25.2937
  5000    5000   101.1437
  5000   10000   102.1762
  5000   20000   106.8786
  5000   50000   107.3515

Table 1. Computational times of the AGM algorithm on large-scale data sets.

It can be seen that the AGM algorithm performs well on the large-scale data sets. Moreover, it can also be inferred from Table 1 that the impact of $m$ on the computational time is much more significant than the impact of the dimension $n$.
Finally, we confirm the validity of the VEM and ISP algorithms. To construct numerical examples with multiple Lagrange multipliers, we generate a $9 \times 5$ matrix in which the first five rows are randomly generated and the last four rows are linear combinations of the first five rows.

The reason for this construction is to avoid the regularity constraint qualification, which would guarantee the uniqueness of the Lagrange multiplier. The results of the VEM and ISP algorithms are illustrated in Table 2.

  Vertices   Elements
  v1    0.0000   1.6118  0.0000  0.0000  0.0000  0.0000  0.4939  0.0000  0.0000
  v2    0.0000   2.1057  0.4939  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
  v3    0.0000   3.0834  0.0000  0.0000  0.1352  0.0000  0.0000  0.6957  0.0000
  v4    0.0000   3.4207  0.0000  0.0000  0.0026  0.0000  0.0064  0.7603  0.1801
  v5    0.0000   5.7145  0.0000  0.0000  3.1147  2.5431  3.6277  0.0000  0.0000
  v6    0.0000   7.1430  0.0000  4.9601  0.9637  0.0000  1.9331  0.0000  0.0000
  v7    0.0000   7.2983  0.0000  0.0000  0.0000  1.7188  3.3700  0.0000  2.6128
  v8    0.0000   7.5898  0.0000  4.3192  0.0000  0.0000  2.0493  0.0000  1.0416
  v9    0.0000   7.7043  2.9184  0.0000  2.4098  1.9675  0.0000  0.0000  0.0000
  v10   0.0000   8.1361  1.7390  4.2912  0.8338  0.0000  0.0000  0.0000  0.0000
  v11   0.0000   8.5707  1.8287  3.7067  0.0000  0.0000  0.0000  0.0000  0.8939
  v12   0.0000   8.8386  2.7554  0.0000  0.0000  1.3515  0.0000  0.0000  2.0545
  v13   0.0000   9.0761  0.0000  3.0271  0.9637  0.0000  0.0000  1.9331  0.0000
  v14   0.0000   9.1971  0.0000  0.0000  1.9422  1.1568  0.0000  2.7039  0.0000
  v15   0.0000   9.6392  0.0000  2.2699  0.0000  0.0000  0.0000  2.0493  1.0416
  v16   0.0000  10.0534  0.0000  0.0000  0.0000  0.6918  0.0000  2.5809  1.6740
  v17   0.0000   3.4315  0.0050  0.0000  0.0027  0.0000  0.0000  0.7627  0.1807
  v18   0.6494  10.2980  0.0000  0.0000  0.0000  0.0000  0.0000  2.4700  1.5827
  v19   1.0475   9.6669  0.0000  0.0000  1.7714  0.0000  0.0000  2.5142  0.0000
  v20   1.2187   9.3956  2.5332  0.0000  0.0000  0.0000  0.0000  0.0000  1.8526
  v21   1.5166   8.1461  0.0000  0.0000  0.0000  0.0000  3.0317  0.0000  2.3055
  v22   1.6981   8.6358  2.5864  0.0000  2.0798  0.0000  0.0000  0.0000  0.0000
  v23   2.1242   7.1628  0.0000  0.0000  2.6016  0.0000  3.1114  0.0000  0.0000

Table 2. Result of the two-phase procedure with 23 vertices of the bounded Lagrange multiplier set.

As the table indicates, there are 23 vertices, $v_1$ to $v_{23}$, of the LM set, and any Lagrange multiplier in the set can be represented by these 23 vertices. Moreover, among these 23 vertices, only four (i.e., $v_1$, $v_5$, $v_8$ and $v_{15}$) are shadow prices; the other 19 are not.

7. Conclusions. The major insights from our analysis are as follows: (1) The weighted minimum norm Lagrange multiplier is the shadow price, and it is attractive when resources are proportionally input. (2) The two proposed algorithms for computing the weighted norm Lagrange multiplier converge and are computationally cheap. (3) A class of shadow prices is computed with a two-phase procedure. That is, we design an algorithm to compute a particular shadow price set that belongs to the bounded Lagrange multiplier set. This class of shadow prices is of particular interest since all its elements are computable shadow prices. (4) To the best of our knowledge, the two-phase procedure is the first method to identify whether a given Lagrange multiplier is a shadow price.

Appendix. The case of multiple Lagrange multipliers. We use a simple example to show that the case of multiple LMs really does occur and that, moreover, in the following example not all LMs are SPs.

$$\begin{aligned} \max\ & x_1^2 + x_2^2 \\ \text{s.t.}\ & x_1 + x_2 \le 1, \\ & x_1 - x_2 \le 1, \\ & x_1 - 2 x_2 \le 1. \end{aligned} \tag{25}$$
Obviously, there are two optimal solutions, $(0, 1)$ and $(1, 0)$; herein we focus on the LMs corresponding to $x^* = (1, 0)$ (see Figure 5). The contours in Figure 5 represent

the level curves of $f(x) = x_1^2 + x_2^2$. The black dot represents the optimal solution $x^* = (1, 0)$, and $\nabla f(x^*)$, $\nabla g_1(x^*)$, $\nabla g_2(x^*)$ and $\nabla g_3(x^*)$ are the normal directions of the hyperplanes $f(x) - f(x^*) = 0$, $x_1 + x_2 - 1 = 0$, $x_1 - x_2 - 1 = 0$ and $x_1 - 2 x_2 - 1 = 0$ at $x^*$. According to Definition 1, $(\lambda_1^*, \lambda_2^*, \lambda_3^*)$ is an LM of Problem (25) corresponding to $x^*$ if
$$\lambda_1^* + \lambda_2^* + \lambda_3^* = 2, \qquad \lambda_1^* - \lambda_2^* - 2 \lambda_3^* = 0.$$
Obviously, any element of $M = \left\{ \left( \tfrac{k}{2} + 1,\ -\tfrac{3}{2} k + 1,\ k \right) \,\middle|\, 0 \le k \le \tfrac{2}{3} \right\}$ is an LM of Problem (25). Among them, we consider two particular LMs. First, we set $k = \tfrac{2}{3}$, so that the current LM is $\hat{\lambda} = \left( \tfrac{4}{3}, 0, \tfrac{2}{3} \right)$. According to Definition 3, $\hat{\lambda}$ is an SP if there exists a hyperplane with normal direction $d$ that contains $\nabla f(x^*)$, $\nabla g_1(x^*)$ and $\nabla g_3(x^*)$ in one of its closed half-spaces and $\nabla g_2(x^*)$ in the complementary half-space. This is clearly impossible, as shown in Figure 5, and thus $\hat{\lambda}$ is not an SP. Second, we set $k = 0$, so that the current LM is $\tilde{\lambda} = (1, 1, 0)$. In Figure 5, a hyperplane exists with normal $d$ that separates $\nabla f(x^*)$, $\nabla g_1(x^*)$, $\nabla g_2(x^*)$ and $\nabla g_3(x^*)$ as required; thus $\tilde{\lambda}$ is an SP.
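The claim that every element of $M$ is an LM can be verified in a few lines (an illustrative numpy check, not from the original paper):

```python
import numpy as np

# Gradients at x* = (1, 0) for Problem (25): f(x) = x1^2 + x2^2.
grad_f = np.array([2.0, 0.0])
G = np.array([[1.0,  1.0],    # grad g1
              [1.0, -1.0],    # grad g2
              [1.0, -2.0]])   # grad g3

# Every lambda in M = {(k/2 + 1, -3k/2 + 1, k) : 0 <= k <= 2/3} is an LM:
for k in np.linspace(0.0, 2.0 / 3.0, 5):
    lam = np.array([k / 2 + 1, -1.5 * k + 1, k])
    assert np.allclose(G.T @ lam, grad_f) and np.all(lam >= 0)
print("all sampled elements of M satisfy Definition 1")
```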

[Figure 5 near here: contour plot of $f(x) = x_1^2 + x_2^2$ with the three constraint lines and the gradients $\nabla f(x^*)$, $\nabla g_1(x^*)$, $\nabla g_2(x^*)$, $\nabla g_3(x^*)$ at $x^* = (1, 0)$.]

Figure 5. Example of multiple Lagrange multipliers

The example above exhibits the case of multiple LMs in which not all LMs are SPs. Moreover, the LM $(1, 1, 0)$ has the minimum norm, which is consistent with the conclusion drawn in [6]. In fact, according to [6], the minimum norm multipliers $\lambda_i^*,\ i = 1, \dots, m$, express the rate of improvement per unit constraint violation along the maximum improvement (or steepest descent) direction $d^*$.

Acknowledgments. We would like to thank the editors and two anonymous re- viewers whose feedback substantially improved this paper. Tao Jie would like to dedicate this paper to Wen Zongying and Wen Zongxia.

REFERENCES

[1] M. Akgul, A note on shadow prices in linear programming, Journal of the Operational Research Society, 35 (1984), 425–431.

[2] D. C. Aucamp and D. I. Steinberg, The computation of shadow prices in linear programming, Journal of the Operational Research Society, 33 (1982), 557–565.
[3] M. S. Bazaraa, H. D. Sherali and C. M. Shetty, Nonlinear Programming: Theory and Algorithms, John Wiley & Sons Inc., Hoboken, NJ, 2006.
[4] D. P. Bertsekas, Convex Optimization Algorithms, Athena Scientific, Belmont, 2015.
[5] D. P. Bertsekas, A. Nedić and A. E. Ozdaglar, Convex Analysis and Optimization, Athena Scientific, Belmont, 2003.
[6] D. P. Bertsekas and A. E. Ozdaglar, Pseudonormality and a Lagrange multiplier theory for constrained optimization, Journal of Optimization Theory and Applications, 114 (2002), 287–343.
[7] J. P. Caulkins, D. Grass, G. Feichtinger and G. Tragler, Optimizing counter-terror operations: Should one fight fire with "fire" or "water"?, Computers & Operations Research, 35 (2008), 1874–1885.
[8] T.-L. Chen, J. T. Lin and S.-C. Fang, A shadow-price based heuristic for capacity planning of TFT-LCD manufacturing, Journal of Industrial & Management Optimization, 6 (2010), 209–239.
[9] B. Col, A. Durnev and A. Molchanov, Foreign risk, domestic problem: Capital allocation and firm performance under political instability, Management Science, 64 (2018), 1975–2471.
[10] M. E. Dyer, The complexity of vertex enumeration methods, Mathematics of Operations Research, 8 (1983), 381–402.
[11] J. Gauvin, Shadow prices in nonconvex mathematical programming, Mathematical Programming, 19 (1980), 300–312.
[12] M. Hessel and M. Zeleny, Optimal system design: Towards new interpretation of shadow prices in linear programming, Computers & Operations Research, 14 (1987), 265–271.
[13] B. Jansen, J. De Jong, C. Roos and T. Terlaky, Sensitivity analysis in linear programming: Just be careful!, European Journal of Operational Research, 101 (1997), 15–28.
[14] T. T. Ke, Z.-J. M. Shen and J. M. Villas-Boas, Search for information on multiple products, Management Science, 62 (2016), 3576–3603.
[15] R. Kutsuzawa, A. Yamashita, N. Takemura, J. Matsumoto, M. Tanaka and N. Yamanaka, Demand response minimizing the impact on the consumers' utility towards renewable energy, in Smart Grid Communications (SmartGridComm), 2016 IEEE International Conference on, IEEE, 2016, 68–73.
[16] J. Kyparisis, On uniqueness of Kuhn-Tucker multipliers in nonlinear programming, Mathematical Programming, 32 (1985), 242–246.
[17] C.-Y. Lee and P. Zhou, Directional shadow price estimation of CO2, SO2 and NOx in the United States coal power industry 1990–2010, Energy Economics, 51 (2015), 493–502.
[18] O. L. Mangasarian, Uniqueness of solution in linear programming, Linear Algebra and Its Applications, 25 (1979), 151–162.
[19] W. Meng and X. Wang, Distributed energy management in smart grid with wind power and temporally coupled constraints, IEEE Transactions on Industrial Electronics, 64 (2017), 6052–6062.
[20] K. Schittkowski, More Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems, 282, Springer-Verlag, Berlin, 1987.
[21] W. Nicholson, Microeconomic Theory: Basic Principles and Extensions, Nelson Education, Canada, 2005.
[22] Q. Wei and H. Yan, A method of transferring polyhedron between the intersection-form and the sum-form, Computers & Mathematics with Applications, 41 (2001), 1327–1342.
[23] L. Zhang, D. Feng, J. Lei, C. Xu, Z. Yan, S. Xu, N. Li and L. Jing, Congestion surplus minimization pricing solutions when Lagrange multipliers are not unique, IEEE Transactions on Power Systems, 29 (2014), 2023–2032.
Received August 2019; revised November 2019.
E-mail address: [email protected]
E-mail address: [email protected]