
Constrained Optimization using Lagrange Multipliers

CEE 201L. Uncertainty, Design, and Optimization
Department of Civil and Environmental Engineering, Duke University
Henri P. Gavin and Jeffrey T. Scruggs
Spring 2020

In optimal design problems, values for a set of n design variables, (x1, x2, ··· , xn), are to be found that minimize a scalar-valued objective function of the design variables, such that a set of m inequality constraints is satisfied. Constrained optimization problems are generally expressed as

min_{x1,x2,··· ,xn} J = f(x1, x2, ··· , xn) such that g1(x1, x2, ··· , xn) ≤ 0, g2(x1, x2, ··· , xn) ≤ 0, ··· , gm(x1, x2, ··· , xn) ≤ 0 (1)

If the objective function is quadratic in the design variables and the constraint equations are linearly independent, the optimization problem has a unique solution. Consider the simplest constrained minimization problem:

min_x ½kx² where k > 0, such that x ≥ b (2)

This problem has a single design variable, the objective function is quadratic (J = ½kx²), there is a single constraint inequality, and it is linear in x (g(x) = b − x). If g > 0 at the unconstrained minimum (x = 0, where g = b), the constraint constrains the optimum and the optimal solution, x∗, is given by x∗ = b. If g ≤ 0 there, the constraint equation does not constrain the optimum and the optimal solution is given by x∗ = 0. Not all optimization problems are so easy; most optimization problems require more advanced methods. The method of Lagrange multipliers is one such method, and it will be applied to this simple problem.
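As a quick numerical check of these two cases, here is a minimal sketch assuming Python with SciPy is available (scipy.optimize.minimize treats 'ineq' constraints as fun(x) ≥ 0, so x ≥ b is expressed as x − b ≥ 0; the function name constrained_spring_min is illustrative, not from the handout):

    from scipy.optimize import minimize

    def constrained_spring_min(k, b):
        """Minimize 0.5*k*x^2 subject to x >= b, as in eq. (2)."""
        res = minimize(lambda x: 0.5 * k * x[0]**2, x0=[0.0],
                       constraints=[{'type': 'ineq', 'fun': lambda x: x[0] - b}])
        return res.x[0]

    print(constrained_spring_min(k=2.0, b=1.0))    # ~ 1.0: constraint binds, x* = b
    print(constrained_spring_min(k=2.0, b=-1.0))   # ~ 0.0: constraint inactive, x* = 0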

Lagrange multiplier methods involve the modification of the objective function through the addition of terms that describe the constraints. The objective function J = f(x) is augmented by the constraint equations through a set of non-negative multiplicative Lagrange multipliers, λj ≥ 0. The augmented objective function, JA(x), is a function of the n design variables and m Lagrange multipliers,

JA(x1, x2, ··· , xn, λ1, λ2, ··· , λm) = f(x1, x2, ··· , xn) + Σ_{j=1}^m λj gj(x1, x2, ··· , xn) (3)

For the problem of equation (2), n = 1 and m = 1, so

JA(x, λ) = ½kx² + λ(b − x) = ½kx² − λx + λb (4)

The Lagrange multiplier, λ, serves the purpose of modifying (augmenting) the objective function from one quadratic (½kx²) to another quadratic (½kx² − λx + λb) so that the minimum of the modified quadratic satisfies the constraint (x ≥ b).
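Before examining the two cases below, the saddle structure of equation (4) can be explored numerically. A minimal sketch, assuming Python with NumPy (grid ranges chosen to match the plots that follow):

    import numpy as np

    k, b = 2.0, 1.0                       # spring constant and constraint bound (Case 1)
    xs = np.linspace(-2.0, 2.0, 401)      # design variable grid
    lams = np.linspace(0.0, 4.0, 401)     # non-negative Lagrange multiplier grid

    # JA[i, j] = 0.5*k*xs[j]**2 - lams[i]*xs[j] + lams[i]*b, eq. (4)
    JA = 0.5 * k * xs[None, :]**2 - lams[:, None] * xs[None, :] + lams[:, None] * b

    inner_min = JA.min(axis=1)            # for each lambda, minimize over x
    i_star = inner_min.argmax()           # then maximize over lambda
    j_star = JA[i_star, :].argmin()
    print(xs[j_star], lams[i_star])       # ~ (1.0, 2.0): the saddle point (x*, lambda*)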

Case 1: b = 1

If b = 1 then the minimum of ½kx² is constrained by the inequality x ≥ b, and the optimal value of λ should minimize JA(x, λ) at x = b. Figure 1(a) plots JA(x, λ) for a few non-negative values of λ and Figure 1(b) plots contours of JA(x, λ).

Figure 1. JA(x, λ) = ½kx² − λx + λb for b = 1 and k = 2; (x∗, λ∗) = (1, 2). (a) JA plotted against the design variable x for λ = 0 to 4; (b) contours of JA over the design variable x and the Lagrange multiplier λ.


Figure 1 shows that:

• JA(x, λ) is independent of λ at x = b,

• JA(x, λ) is minimized at x∗ = b for λ∗ = 2,

• the surface JA(x, λ) is a saddle shape,

• the point (x∗, λ∗) = (1, 2) is a saddle point,

• JA(x∗, λ) ≤ JA(x∗, λ∗) ≤ JA(x, λ∗),

• min_x JA(x, λ∗) = max_λ JA(x∗, λ) = JA(x∗, λ∗).

Saddle points have no slope; the partial derivatives of JA with respect to both x and λ vanish at (x∗, λ∗):

∂JA(x, λ)/∂x = 0 at (x∗, λ∗) (5a)

∂JA(x, λ)/∂λ = 0 at (x∗, λ∗) (5b)

For this problem,

∂JA(x, λ)/∂x = 0 at (x∗, λ∗) ⇒ kx∗ − λ∗ = 0 ⇒ λ∗ = kx∗ (6a)

∂JA(x, λ)/∂λ = 0 at (x∗, λ∗) ⇒ −x∗ + b = 0 ⇒ x∗ = b (6b)
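These stationarity conditions can also be checked symbolically. A minimal sketch, assuming Python with SymPy installed (symbol names are illustrative):

    import sympy as sp

    x, lam, k, b = sp.symbols('x lambda k b', positive=True)
    JA = sp.Rational(1, 2) * k * x**2 + lam * (b - x)   # augmented objective, eq. (4)

    # Stationarity with respect to x and lambda, eqs. (5a)-(5b)
    sol = sp.solve([sp.diff(JA, x), sp.diff(JA, lam)], [x, lam], dict=True)
    print(sol)   # [{x: b, lambda: b*k}] -> x* = b and lambda* = k x*, eqs. (6a)-(6b)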

This example has a physical interpretation. The objective function J = ½kx² represents the potential energy in a spring. The minimum potential energy in a spring corresponds to a stretch of zero (x∗ = 0). The constrained problem:

min_x ½kx² such that x ≥ 1

means “minimize the potential energy in the spring such that the stretch in the spring is greater than or equal to 1.” The solution to this problem is to set the stretch in the spring equal to the smallest allowable value (x∗ = 1). The force applied to the spring in order to achieve this objective is f = kx∗. This force is the Lagrange multiplier for this problem, (λ∗ = kx∗).

The Lagrange multiplier is the force required to enforce the constraint.


Case 2: b = −1

If b = −1 then the minimum of ½kx² is not constrained by the inequality x ≥ b. The derivation above would give x∗ = −1, with λ∗ = −k. The negative value of λ∗ indicates that the constraint does not affect the optimal solution, and λ∗ should therefore be set to zero. Setting λ∗ = 0, JA(x, λ∗) is minimized at x∗ = 0. Figure 2(a) plots JA(x, λ) for a few negative values of λ and Figure 2(b) plots contours of JA(x, λ).

Figure 2. JA(x, λ) = ½kx² − λx + λb for b = −1 and k = 2; (x∗, λ∗) = (0, 0). (a) JA plotted against the design variable x for λ = 0 to −4; (b) contours of JA over the design variable x and the Lagrange multiplier λ.


Figure 2 shows that:

• JA(x, λ) is independent of λ at x = b,

• the saddle point of JA(x, λ) occurs at a negative value of λ, so ∂JA/∂λ ≠ 0 for any λ ≥ 0.

• The constraint x ≥ −1 does not affect the solution, and is called a non-binding or an inactive constraint.

• The Lagrange multipliers associated with non-binding inequality constraints are negative.

• If a Lagrange multiplier corresponding to an inequality constraint has a negative value at the saddle point, it is set to zero, thereby removing the inactive constraint from the calculation of the augmented objective function.

Summary

In summary, if the inequality g(x) ≤ 0 constrains the minimum of f(x), then at the optimum point the augmented objective JA(x, λ) = f(x) + λg(x) is minimized with respect to x and maximized with respect to λ. The optimization problem of equation (1) may be written in terms of the augmented objective function (equation (3)),

max_{λ1,λ2,··· ,λm} min_{x1,x2,··· ,xn} JA(x1, x2, ··· , xn, λ1, λ2, ··· , λm) such that λj ≥ 0 ∀ j (7)

The necessary conditions for optimality are:

∂JA/∂xk = 0 at (xi = xi∗, λj = λj∗) (8a)

∂JA/∂λk = 0 at (xi = xi∗, λj = λj∗) (8b)

λj ≥ 0 (8c)

These equations define the relations used to derive expressions for, or compute values of, the optimal design variables xi∗. Lagrange multipliers for inequality constraints, gj(x1, ··· , xn) ≤ 0, are non-negative. If an inequality gj(x1, ··· , xn) ≤ 0 constrains the optimum point, the corresponding Lagrange multiplier, λj, is positive. If an inequality gj(x1, ··· , xn) ≤ 0 does not constrain the optimum point, the corresponding Lagrange multiplier, λj, is set to zero.


Sensitivity to Changes in the Constraints and Redundant Constraints

Once a constrained optimization problem has been solved, it is sometimes useful to consider how changes in each constraint would affect the optimized cost. If the minimum of f(x) (where x = (x1, ··· , xn)) is constrained by the inequality gj(x) ≤ 0, then at the optimum point x∗, gj(x∗) = 0 and λj∗ > 0. Likewise, if another constraint gk(x) does not constrain the minimum, then gk(x∗) < 0 and λk∗ = 0. A small change in the j-th constraint, from gj to gj + δgj, changes the constrained objective function from f(x∗) to f(x∗) + λj∗ δgj. However, since the k-th constraint gk(x∗) does not affect the solution, neither will a small change to gk, because λk∗ = 0. In other words, a small change in the j-th constraint, δgj, corresponds to a proportional change in the optimized cost, δf(x∗), and the constant of proportionality is the Lagrange multiplier of the optimized solution, λj∗.

δf(x∗) = Σ_{j=1}^m λj∗ δgj (9)

The value of the Lagrange multiplier is the sensitivity of the constrained objective to (small) changes in the constraint, δg. If λj > 0, then the inequality gj(x) ≤ 0 constrains the optimum point and a small increase of the constraint gj(x∗) increases the cost. If λj < 0, then gj(x∗) < 0 and does not constrain the solution; a small change of this constraint should not affect the cost. So if a Lagrange multiplier associated with an inequality constraint gj(x) ≤ 0 is computed as a negative value, it is subsequently set to zero. Such constraints are called non-binding or inactive constraints.
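The proportionality δf = λ∗ δg of equation (9) can be checked numerically on the spring example. A minimal sketch in plain Python (the perturbation size dg is arbitrary):

    k, b = 2.0, 1.0
    x_star = b                  # constrained optimum, eq. (6b)
    lam_star = k * x_star       # Lagrange multiplier, eq. (6a)

    f = lambda x: 0.5 * k * x**2
    dg = 1e-4                   # small increase of the constraint g = b - x
    df = f(b + dg) - f(b)       # actual change in the optimized cost

    print(df, lam_star * dg)    # both ~ 2.0e-4: df = lambda* dg, eq. (9)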

Figure 3. Left: The optimum point is constrained by the condition g(x∗) ≤ 0; a small increase of the constraint from g to g + δg increases the cost from f(x∗) to f(x∗) + λδg (λ > 0). Right: The constraint is negative at the minimum of f(x), so the optimum point is not constrained by the condition g(x∗) ≤ 0; a small increase of the constraint function from g to g + δg does not affect the optimum solution (λ = 0). If this constraint were an equality constraint, g(x) = 0, then the Lagrange multiplier would be negative and an increase of the constraint would decrease the cost from f to f + λδg (λ < 0).

The previous examples consider the minimization of a simple quadratic with a single inequality constraint. By expanding this example to two inequality constraints we can see again how Lagrange multipliers indicate whether or not the associated constraint bounds the optimal solution. Consider the minimization problem

min_x ½kx² such that x ≥ b and x ≥ c, where k > 0 and 0 < b < c (10)

Figure 4. Minimization of f(x) = ½kx² such that x ≥ b and x ≥ c with 0 < b < c. Feasible solutions for x are greater than or equal to both b and c; the constrained optimum is x∗ = c.

Proceeding with the method of Lagrange multipliers,

JA(x, λ1, λ2) = ½kx² + λ1(b − x) + λ2(c − x) (11)

where the constraints (b − x) and (c − x) are satisfied if they are negative-valued. The partial derivatives of the augmented objective function are

∂JA(x, λ1, λ2)/∂x = 0 at (x∗, λ1∗, λ2∗) ⇒ kx∗ − λ1∗ − λ2∗ = 0 (12a)

∂JA(x, λ1, λ2)/∂λ1 = 0 at (x∗, λ1∗, λ2∗) ⇒ −x∗ + b = 0 (12b)

∂JA(x, λ1, λ2)/∂λ2 = 0 at (x∗, λ1∗, λ2∗) ⇒ −x∗ + c = 0 (12c)

These three equations are linear in x∗, λ1∗, and λ2∗:

[  k  −1  −1 ] [ x∗  ]   [  0 ]
[ −1   0   0 ] [ λ1∗ ] = [ −b ]      (13)
[ −1   0   0 ] [ λ2∗ ]   [ −c ]

Clearly, if b ≠ c then x cannot equal both b and c, and equations (12b) and (12c) cannot both be satisfied. In such cases, we proceed by selecting a subset of the constraints and evaluating the resulting solutions. So doing, the solution x∗ = b minimizes ½kx² such that x ≥ b, with a Lagrange multiplier λ1∗ = kb; but x∗ = b does not satisfy the inequality x ≥ c. The solution x∗ = c minimizes ½kx² such that both inequalities hold, with a Lagrange multiplier λ2∗ = kc. Note that for this quadratic objective, the constraint with the larger Lagrange multiplier is the active constraint, and the other constraint is non-binding.
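The subset-selection reasoning above can be mirrored in a few lines of code. A minimal sketch in plain Python, assuming illustrative values k = 2, b = 1, c = 1.5 (not from the handout):

    k, b, c = 2.0, 1.0, 1.5     # illustrative values with 0 < b < c

    # Candidate 1: only x >= b active.  Eqs. (12a)-(12b): x = b, lambda1 = k b.
    # Candidate 2: only x >= c active.  Eqs. (12a), (12c): x = c, lambda2 = k c.
    candidates = [("x >= b active", b, k * b), ("x >= c active", c, k * c)]

    for label, x, lam in candidates:
        feasible = (x >= b) and (x >= c)
        print(label, "x* =", x, "lambda* =", lam, "feasible:", feasible)
    # Only x* = c is feasible; its multiplier k c is the larger of the two.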


Multivariable

Quadratic optimization problems with a single design variable and a single linear inequality constraint are easy enough. In problems with many design variables (n ≫ 1) and many inequality constraints (m ≫ 1), determining which inequality constraints are enforced at the optimum point can be difficult. Numerical methods used in solving such problems involve iterative trial-and-error approaches to find the set of “active” inequality constraints. Consider a third example with n = 2 and m = 2:

min_{x1,x2} x1² + 0.5x1 + 3x1x2 + 5x2² such that g1 = 3x1 + 2x2 + 2 ≤ 0 and g2 = 15x1 − 3x2 − 1 ≤ 0 (14)

This example also has a quadratic objective function and inequality constraints that are linear in the design variables. Contours of the objective function and the two inequality constraints are plotted in Figure 5. The feasible side of each of these two inequality constraints is to the left of its line in the figure, labeled “g1 ok” and “g2 ok”. This figure shows that the inequality g1(x1, x2) ≤ 0 constrains the solution and the inequality g2(x1, x2) ≤ 0 does not. This is visible in Figure 5 with n = 2, but for more complicated problems it may not be immediately clear which inequality constraints are “active.” Using the method of Lagrange multipliers, the augmented objective function is

JA(x1, x2, λ1, λ2) = x1² + 0.5x1 + 3x1x2 + 5x2² + λ1(3x1 + 2x2 + 2) + λ2(15x1 − 3x2 − 1) (15)

Unlike the first examples with n = 1 and m = 1, we cannot plot contours of JA(x1, x2, λ1, λ2) since this would be a plot in four-dimensional space. Nonetheless, the same optimality conditions hold.

min over x1: ∂JA/∂x1 = 0 at (x1∗, x2∗, λ1∗, λ2∗) ⇒ 2x1∗ + 0.5 + 3x2∗ + 3λ1∗ + 15λ2∗ = 0 (16a)

min over x2: ∂JA/∂x2 = 0 at (x1∗, x2∗, λ1∗, λ2∗) ⇒ 3x1∗ + 10x2∗ + 2λ1∗ − 3λ2∗ = 0 (16b)

max over λ1: ∂JA/∂λ1 = 0 at (x1∗, x2∗, λ1∗, λ2∗) ⇒ 3x1∗ + 2x2∗ + 2 = 0 (16c)

max over λ2: ∂JA/∂λ2 = 0 at (x1∗, x2∗, λ1∗, λ2∗) ⇒ 15x1∗ − 3x2∗ − 1 = 0 (16d)

If the objective function is quadratic in the design variables and the constraints are linear in the design variables, the optimality conditions are simply a set of linear equations in the design variables and the Lagrange multipliers. In this example the optimality conditions are expressed as four linear equations with four unknowns. In general we may not know which inequality constraints are active. If there are only a few constraint equations, it's not too hard to try all combinations of any number of constraints, fixing the Lagrange multipliers for the other inequality constraints equal to zero, as automated in the code sketch after the following walk-through.


Let’s try this now!

• First, let's find the unconstrained minimum by assuming neither constraint g1(x1, x2) nor g2(x1, x2) is active: λ1∗ = 0, λ2∗ = 0, and

[ 2  3 ] [ x1∗ ]   [ −0.5 ]      [ x1∗ ]   [ −0.45 ]
[ 3 10 ] [ x2∗ ] = [  0   ]  ⇒  [ x2∗ ] = [  0.14 ]      (17)

which is the unconstrained minimum shown in Figure 5 as a “∗”. Plugging this solution into the constraint equations gives g1(x1∗, x2∗) = 0.93 and g2(x1∗, x2∗) = −8.17, so the unconstrained minimum is not feasible with respect to constraint g1, since g1(−0.45, 0.14) > 0.

• Next, assuming both constraints g1(x1, x2) and g2(x1, x2) are active, optimal values for x1∗, x2∗, λ1∗, and λ2∗ are sought, and all four equations must be solved together:

[  2   3   3  15 ] [ x1∗ ]   [ −0.5 ]      [ x1∗ ]   [ −0.10 ]
[  3  10   2  −3 ] [ x2∗ ]   [   0  ]      [ x2∗ ]   [ −0.85 ]
[  3   2   0   0 ] [ λ1∗ ] = [  −2  ]  ⇒  [ λ1∗ ] = [  3.55 ]      (18)
[ 15  −3   0   0 ] [ λ2∗ ]   [   1  ]      [ λ2∗ ]   [ −0.56 ]

which is the constrained minimum shown in Figure 5 as a “o” at the intersection of the g1 line with the g2 line. Note that λ2∗ < 0, indicating that constraint g2 is not active; g2(−0.10, −0.85) = 0 (ok). Enforcing the constraint g2 needlessly compromises the optimality of this solution. So, while this solution is feasible (both g1 and g2 evaluate to zero), the solution could be improved by letting go of the g2 constraint and moving along the g1 line.

• So, assuming only constraint g1 is active and g2 is not active, λ2∗ = 0, and

[ 2   3  3 ] [ x1∗ ]   [ −0.5 ]      [ x1∗ ]   [ −0.81 ]
[ 3  10  2 ] [ x2∗ ] = [   0  ]  ⇒  [ x2∗ ] = [  0.21 ]      (19)
[ 3   2  0 ] [ λ1∗ ]   [  −2  ]      [ λ1∗ ]   [  0.16 ]

which is the constrained minimum shown as a “o” on the g1 line in Figure 5. Note that λ1∗ > 0, which indicates that this constraint is active. Plugging x1∗ and x2∗ into g2(x1, x2) gives a value of −13.78, so this constrained minimum is feasible with respect to both constraints. This is the solution we're looking for.

• As a final check, assuming only constraint g2 is active and g1 is not active, λ1∗ = 0, and

[  2   3  15 ] [ x1∗ ]   [ −0.5 ]      [ x1∗ ]   [  0.06 ]
[  3  10  −3 ] [ x2∗ ] = [   0  ]  ⇒  [ x2∗ ] = [ −0.03 ]      (20)
[ 15  −3   0 ] [ λ2∗ ]   [   1  ]      [ λ2∗ ]   [ −0.04 ]

which is the constrained minimum shown as a “o” on the g2 line in Figure 5. Note that λ2∗ < 0, which indicates that this constraint is not active, contradicting our assumption. Further, plugging x1∗ and x2∗ into g1(x1, x2) gives a value of +2.24, so this constrained minimum is not feasible with respect to g1, since g1(0.06, −0.03) > 0.
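The four-step walk-through above can be automated by treating every subset of the constraints as a trial active set. A minimal sketch, assuming Python with NumPy (the helper name solve_active_subset is hypothetical):

    import itertools
    import numpy as np

    H = np.array([[2.0, 3.0], [3.0, 10.0]])     # Hessian of the objective, eq. (14)
    c = np.array([0.5, 0.0])
    A = np.array([[3.0, 2.0], [15.0, -3.0]])    # constraints g(x) = A x - b <= 0
    b = np.array([-2.0, 1.0])

    def solve_active_subset(active):
        """Solve the KKT system with only the 'active' constraints enforced."""
        Aa, ba = A[active, :], b[active]
        n, ma = H.shape[0], Aa.shape[0]
        K = np.block([[H, Aa.T], [Aa, np.zeros((ma, ma))]])
        sol = np.linalg.solve(K, np.concatenate([-c, ba]))
        return sol[:n], sol[n:]                 # x and the active multipliers

    subsets = itertools.chain.from_iterable(
        itertools.combinations(range(2), r) for r in range(3))
    for subset in subsets:
        x, lam = solve_active_subset(list(subset))
        ok = np.all(A @ x - b <= 1e-9) and np.all(lam >= 0)
        print(list(subset), x.round(2), lam.round(2), "<- optimal" if ok else "")
    # Only the set [0] (g1 alone) is feasible with non-negative multipliers,
    # reproducing eq. (19): x ~ (-0.81, 0.21), lambda1 ~ 0.16.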


Figure 5. Contours of the objective function and the constraint equations for the example of equation (14), plotted over −1 ≤ x1 ≤ 1 and −1 ≤ x2 ≤ 1 with the feasible sides labeled “g1 ok” and “g2 ok”. (a): J = f(x1, x2); (b): JA = f(x1, x2) + λ1∗ g1(x1, x2). Note that the contours of JA are shifted so that the minimum of JA is at the optimal point along the g1 line.


Matrix Relations for the Minimum of a Quadratic with Equality Constraints

The previous example of the minimization of a quadratic in x1 and x2, subject to two linear inequality constraints, equations (14), may be written generally as

min_{x1,x2} ½H11x1² + H12x1x2 + ½H22x2² + c1x1 + c2x2 + d such that A11x1 + A12x2 − b1 ≤ 0 and A21x1 + A22x2 − b2 ≤ 0 (21)

or, even more generally, for any number (n) of design variables, as

min_{x1,x2,··· ,xn} [ ½ Σ_{i=1}^n Σ_{j=1}^n xi Hij xj + Σ_{i=1}^n ci xi + d ] such that Σ_{j=1}^n Akj xj − bk ≤ 0, k = 1, 2, ··· , m (22)

where Hij = Hji and Σi Σj xi Hij xj > 0 for any set of design variables (H is symmetric and positive definite). In matrix-vector notation, equation (22) is written

min_x ½xᵀHx + cᵀx + d such that Ax − b ≤ 0 (23)

The augmented optimization problem is

max_{λ1,··· ,λm} min_{x1,··· ,xn} [ ½ Σ_{i=1}^n Σ_{j=1}^n xi Hij xj + Σ_{i=1}^n ci xi + d + Σ_{k=1}^m λk ( Σ_{j=1}^n Akj xj − bk ) ] (24)

such that λk ≥ 0. Or, in matrix-vector notation,

max_λ min_x [ ½xᵀHx + cᵀx + d + λᵀ(Ax − b) ] such that λ ≥ 0 (25)

Assuming all constraints are “active,” the necessary conditions for optimality are

∂JA(x, λ)/∂x = 0ᵀ and ∂JA(x, λ)/∂λ = 0ᵀ (26)

which, for this very general system, result in

Hx + c + Aᵀλ = 0 and Ax − b = 0 (27)

These equations can be written compactly as a single matrix equation,

[ H  Aᵀ ] [ x ]   [ −c ]
[ A  0  ] [ λ ] = [  b ]      (28)

Equation (28) is called the Karush-Kuhn-Tucker (KKT) equation. Comparing equation (14) to equation (22), and comparing equations (18), (19), and (20) to equation (28), you may begin to convince yourself that the KKT system gives a general solution for the minimum of a quadratic with linear equality constraints. It is very useful.
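Equation (28) translates directly into code. A minimal sketch, assuming Python with NumPy (the function name solve_kkt is hypothetical); applied to the active constraint of the earlier example, it reproduces equation (19):

    import numpy as np

    def solve_kkt(H, c, A, b):
        """Solve eq. (28): minimize 0.5 x'Hx + c'x + d subject to Ax = b."""
        n, m = H.shape[0], A.shape[0]
        K = np.block([[H, A.T], [A, np.zeros((m, m))]])
        sol = np.linalg.solve(K, np.concatenate([-c, b]))
        return sol[:n], sol[n:]       # design variables x, Lagrange multipliers

    H = np.array([[2.0, 3.0], [3.0, 10.0]])
    c = np.array([0.5, 0.0])
    A = np.array([[3.0, 2.0]])        # g1 = 3 x1 + 2 x2 + 2 <= 0, enforced as equality
    b = np.array([-2.0])
    x, lam = solve_kkt(H, c, A, b)
    print(x.round(2), lam.round(2))   # ~ [-0.81  0.21] [0.16]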

CC BY-NC-ND HPG, JTS July 10, 2020 12 CEE 201L. Uncertainty, Design, and Optimization – Duke – Spring 2020 – Gavin and Scruggs

Primal and Dual

If it is possible to solve for the design variables x in terms of the Lagrange multipliers λ, then the design variables can be eliminated from the problem and the optimization is simply a maximization over the set of Lagrange multipliers. For this kind of problem, the condition

∂JA(x, λ)/∂x = 0ᵀ (29)

results in Hx + c + Aᵀλ = 0, from which the design variables x may be solved in terms of λ,

x = −H⁻¹(Aᵀλ + c) (30)

Substituting this solution for x in terms of λ into (25) eliminates x from equation (25) and results in a really long expression for the augmented objective in terms of λ only,

max_λ [ ½(−H⁻¹(Aᵀλ + c))ᵀ H (−H⁻¹(Aᵀλ + c)) + cᵀ(−H⁻¹(Aᵀλ + c)) + d + λᵀ( A(−H⁻¹(Aᵀλ + c)) − b ) ]

... and after a little algebra involving some convenient cancellations and combinations ... we obtain the dual problem,

max_λ [ −½λᵀAH⁻¹Aᵀλ − (cᵀH⁻¹Aᵀ + bᵀ)λ − ½cᵀH⁻¹c + d ] such that λ ≥ 0 (31)

Equation (25) is called the primal quadratic programming problem and equation (31) is called the dual quadratic programming problem. The primal problem has n + m unknown variables (x and λ) whereas the dual problem has only m unknown variables (λ) and is therefore easier to solve. Note that if the quadratic term in the primal form (quadratic in the primal variable x) is positive, then the quadratic term in the dual form (quadratic in the dual variable λ) is negative. So, again, it makes sense to minimize over the primal variable x and to maximize over the dual variable λ.
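The dual form (31) can be maximized directly over λ ≥ 0 with simple bound constraints. A minimal sketch, assuming Python with NumPy and SciPy (L-BFGS-B handles the λ ≥ 0 bounds; variable names are illustrative), applied to the running example of equation (14):

    import numpy as np
    from scipy.optimize import minimize

    H = np.array([[2.0, 3.0], [3.0, 10.0]])
    c = np.array([0.5, 0.0])
    A = np.array([[3.0, 2.0], [15.0, -3.0]])
    b = np.array([-2.0, 1.0])

    Hinv = np.linalg.inv(H)
    D = A @ Hinv @ A.T                  # dual quadratic term, eq. (31)
    f = c @ Hinv @ A.T + b              # dual linear term, eq. (31)

    # Maximizing the dual is minimizing its negative, subject to lambda >= 0.
    res = minimize(lambda lam: 0.5 * lam @ D @ lam + f @ lam,
                   x0=np.zeros(2), bounds=[(0.0, None)] * 2)
    lam = res.x
    x = -Hinv @ (A.T @ lam + c)         # recover the design variables, eq. (30)
    print(lam.round(2), x.round(2))     # ~ [0.16 0.  ] [-0.81  0.21]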


An Active Set Algorithm for Quadratic Programming

As was shown in the numerical example, in problems with multiple inequality constraints it is often hard to know which of the inequality constraints are active. A trial-and-error approach may be attempted to find the set of active inequality constraints, for which all the Lagrange multipliers are positive-valued. Anticipating nonlinear problems, we will assume we know that x is a good but potentially infeasible solution to the quadratic program. We wish to update x to a better solution by adding the update vector h, so that x + h solves the quadratic program. To find the “optimal” update vector, we can express the quadratic in terms of the known current parameter values x and the unknown update vector h,

max_λ min_h [ ½(x + h)ᵀH(x + h) + cᵀ(x + h) + d + λᵀ( A(x + h) − b ) ] s.t. λ ≥ 0 (32)

Expanding equation (32) and deriving the necessary conditions for optimality of the step h,

∂JA(h, λ)/∂h = 0ᵀ and ∂JA(h, λ)/∂λ = 0ᵀ (33)

leads to

H(x + h) + c + Aᵀλ = 0 and A(x + h) − b = 0 (34)

which may be written in a KKT form

[ H  Aᵀ ] [ h ]   [ −Hx − c ]
[ A  0  ] [ λ ] = [ −Ax + b ]      (35)

But the question remains, “Which constraint equations are active?” In the following, the constraint equations in the active set are denoted

Aa x − ba = 0 (36)

with Lagrange multipliers λa associated with the constraint equations in the active set. In this active set algorithm, we may start by assuming all constraints are active, so that Aa^(0) = A and ba^(0) = b.

1. In iteration (k), solve

[ H        Aa^(k)ᵀ ] [ h       ]   [ −Hx − c            ]
[ Aa^(k)    0      ] [ λa^(k)  ] = [ −Aa^(k) x + ba^(k) ]

for h and λa^(k).

2. Evaluate the full set of constraint equations, g = A(x + h) − b.

3. In Aa^(k+1) and ba^(k+1), include the rows of A and b corresponding to the rows of g that are positive.

4. In Aa^(k+1) and ba^(k+1), exclude the rows of A and b corresponding to the rows of λa^(k) that are negative.

5. Assign the elements of λa^(k) to the active indices of λ.

6. Set k = k + 1 and return to step 1.
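Steps 1-6 fit in a short function. A minimal sketch, assuming Python with NumPy (function and variable names are illustrative, not from the handout); starting from x = 0 on the example of equation (14), it converges to the same solution found by enumeration:

    import numpy as np

    def active_set_qp(H, c, A, b, x, max_iter=20, tol=1e-9):
        """Minimize 0.5 x'Hx + c'x subject to Ax - b <= 0, starting from x."""
        m = A.shape[0]
        active = np.ones(m, dtype=bool)         # start with all constraints active
        lam = np.zeros(m)
        for _ in range(max_iter):
            Aa, ba = A[active], b[active]
            ma = Aa.shape[0]
            K = np.block([[H, Aa.T], [Aa, np.zeros((ma, ma))]])
            rhs = np.concatenate([-H @ x - c, -Aa @ x + ba])
            sol = np.linalg.solve(K, rhs)       # step 1, the KKT system above
            h, lam_a = sol[:len(x)], sol[len(x):]
            g = A @ (x + h) - b                 # step 2
            lam[:] = 0.0
            lam[active] = lam_a                 # step 5
            new_active = active.copy()
            new_active[g > tol] = True          # step 3: include violated rows
            drop = np.where(active)[0][lam_a < 0]
            new_active[drop] = False            # step 4: exclude negative multipliers
            if np.array_equal(new_active, active) and np.all(lam_a >= 0):
                return x + h, lam               # steps 3-4 changed nothing: converged
            active = new_active                 # step 6: next iteration
        return x + h, lam

    H = np.array([[2.0, 3.0], [3.0, 10.0]])
    c = np.array([0.5, 0.0])
    A = np.array([[3.0, 2.0], [15.0, -3.0]])
    b = np.array([-2.0, 1.0])
    x, lam = active_set_qp(H, c, A, b, x=np.zeros(2))
    print(x.round(2), lam.round(2))             # ~ [-0.81  0.21] [0.16 0.  ]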

CC BY-NC-ND HPG, JTS July 10, 2020