
Chapter 14

Linear Programming: Interior-Point Methods

In the 1980s it was discovered that many large linear programs could be solved efficiently by formulating them as nonlinear problems and solving them with variants of nonlinear algorithms such as Newton's method. One characteristic of these methods was that they required all iterates to satisfy the inequality constraints in the problem strictly, so they became known as interior-point methods. By the early 1990s, one class, the primal-dual methods, had distinguished itself as the most efficient practical approach, and proved to be a strong competitor to the simplex method on large problems. These methods are the focus of this chapter.

Interior-point methods arose from the search for algorithms with better theoretical properties than the simplex method. As we mentioned in Chapter ??, the simplex method can be inefficient on certain pathological problems. Roughly speaking, the time required to solve a linear program may be exponential in the size of the problem, as measured by the number of unknowns and the amount of storage needed for the problem data. For almost all practical problems, the simplex method is much more efficient than this bound would suggest, but its poor worst-case complexity motivated the development of new algorithms with better guaranteed performance. The first such method was the ellipsoid method, proposed by Khachiyan [11], which finds a solution in time that is at worst polynomial in the problem size. Unfortunately, this method approaches its worst-case bound on all problems and is not competitive with the simplex method in practice.

Karmarkar's projective algorithm [10], announced in 1984, also has the polynomial complexity property, but it came with the added inducement of good practical behavior. The initial claims of excellent performance on large linear programs were never fully borne out, but the announcement prompted a great deal of research activity and a wide array of methods described by such labels as "affine-scaling," "logarithmic barrier," "potential-reduction," "path-following," "primal-dual," and "infeasible interior-point." All are related to Karmarkar's original algorithm, and to the log-barrier approach described in Chapter ??, but many of the approaches can be motivated and analyzed independently of the earlier methods.

Interior-point methods share common features that distinguish them from the simplex method. Each interior-point iteration is expensive to compute and can make significant progress towards the solution, while the simplex method usually requires a larger number of inexpensive iterations. Geometrically speaking, the simplex method works its way around the boundary of the feasible polytope, testing a sequence of vertices in turn until it finds the optimal one. Interior-point methods approach the boundary of the feasible set only in the limit. They may approach the solution either from the interior or the exterior of the feasible region, but they never actually lie on the boundary of this region.

In this chapter, we outline some of the basic ideas behind primal-dual interior-point methods, including the relationship to Newton's method and homotopy methods and the concept of the central path. We sketch the important methods in this class, and give a comprehensive convergence analysis of a long-step path-following method. We describe in some detail a practical predictor-corrector algorithm proposed by Mehrotra, which is the basis of much of the current generation of software.

14.1 Primal-Dual Methods

Outline

We consider the linear programming problem in standard form; that is,

\[
\min\; c^T x, \quad \text{subject to } Ax = b,\; x \ge 0, \qquad (14.1)
\]
where c and x are vectors in ℝ^n, b is a vector in ℝ^m, and A is an m × n matrix. The dual problem for (14.1) is

\[
\max\; b^T \lambda, \quad \text{subject to } A^T \lambda + s = c,\; s \ge 0, \qquad (14.2)
\]
where λ is a vector in ℝ^m and s is a vector in ℝ^n. As shown in Chapter ??, solutions of (14.1), (14.2) are characterized by the Karush-Kuhn-Tucker conditions (??), which we restate here as follows:

\[
\begin{aligned}
A^T \lambda + s &= c, && (14.3a)\\
Ax &= b, && (14.3b)\\
x_i s_i &= 0, \quad i = 1, 2, \dots, n, && (14.3c)\\
(x, s) &\ge 0. && (14.3d)
\end{aligned}
\]

Primal-dual methods find solutions (x^*, λ^*, s^*) of this system by applying variants of Newton's method to the three equalities in (14.3) and modifying the search directions and step lengths so that the inequalities (x, s) ≥ 0 are satisfied strictly at every iteration. The equations (14.3a), (14.3b), (14.3c) are only mildly nonlinear and so are not difficult to solve by themselves. However, the problem becomes much more difficult when we add the nonnegativity requirement (14.3d). This nonnegativity condition is the source of all the complications in the design and analysis of interior-point methods.

To derive primal-dual interior-point methods we restate the optimality conditions (14.3) in a slightly different form by means of a mapping F from ℝ^{2n+m} to ℝ^{2n+m}:
\[
F(x, \lambda, s) =
\begin{bmatrix}
A^T \lambda + s - c\\
Ax - b\\
XSe
\end{bmatrix}
= 0, \qquad (14.4a)
\]
\[
(x, s) \ge 0, \qquad (14.4b)
\]
where
\[
X = \mathrm{diag}(x_1, x_2, \dots, x_n), \qquad S = \mathrm{diag}(s_1, s_2, \dots, s_n), \qquad (14.5)
\]
and e = (1, 1, ..., 1)^T. Primal-dual methods generate iterates (x^k, λ^k, s^k) that satisfy the bounds (14.4b) strictly, that is, x^k > 0 and s^k > 0. This property is the origin of the term interior-point. By respecting these bounds, the methods avoid spurious solutions, that is, points that satisfy F(x, λ, s) = 0 but not (x, s) ≥ 0. Spurious solutions abound, and do not provide useful information about solutions of (14.1) or (14.2), so it makes sense to exclude them altogether from the region of search.

Many interior-point methods actually require the iterates to be strictly feasible; that is, each (x^k, λ^k, s^k) must satisfy the linear equality constraints for the primal and dual problems. If we define the primal-dual feasible set F and strictly feasible set F^o by

\[
\begin{aligned}
\mathcal{F} &= \{(x, \lambda, s) \mid Ax = b,\ A^T \lambda + s = c,\ (x, s) \ge 0\}, && (14.6a)\\
\mathcal{F}^o &= \{(x, \lambda, s) \mid Ax = b,\ A^T \lambda + s = c,\ (x, s) > 0\}, && (14.6b)
\end{aligned}
\]
the strict feasibility condition can be written concisely as

\[
(x^k, \lambda^k, s^k) \in \mathcal{F}^o.
\]

Like most iterative algorithms in optimization, primal-dual interior-point methods have two basic ingredients: a procedure for determining the step and a measure of the desirability of each point in the search space. The procedure for determining the search direction has its origins in Newton's method for the nonlinear equations (14.4a). Newton's method forms a linear model for F around the current point and obtains the search direction (∆x, ∆λ, ∆s) by solving the following system of linear equations:
\[
J(x, \lambda, s)
\begin{bmatrix}
\Delta x\\ \Delta\lambda\\ \Delta s
\end{bmatrix}
= -F(x, \lambda, s),
\]
where J is the Jacobian of F. (See Chapter ?? for a detailed discussion of Newton's method for nonlinear systems.) If the current point is strictly feasible (that is, (x, λ, s) ∈ F^o), the Newton step equations become
\[
\begin{bmatrix}
0 & A^T & I\\
A & 0 & 0\\
S & 0 & X
\end{bmatrix}
\begin{bmatrix}
\Delta x\\ \Delta\lambda\\ \Delta s
\end{bmatrix}
=
\begin{bmatrix}
0\\ 0\\ -XSe
\end{bmatrix}. \qquad (14.7)
\]

Usually, a full step along this direction would violate the bound (x, s) ≥ 0. To avoid this difficulty, we perform a line search along the Newton direction so that the new iterate is
\[
(x, \lambda, s) + \alpha(\Delta x, \Delta\lambda, \Delta s),
\]
for some line search parameter α ∈ (0, 1]. Unfortunately, we often can take only a small step along this direction (α ≪ 1) before violating the condition (x, s) > 0. Hence, the pure Newton direction (14.7), which is known as the affine scaling direction, often does not allow us to make much progress toward a solution.

Most primal-dual methods modify the basic Newton procedure in two important ways:

1. They bias the search direction toward the interior of the nonnegative orthant (x, s) ≥ 0, so that we can move further along the direction before one of the components of (x, s) becomes negative.

2. They prevent the components of (x, s) from moving "too close" to the boundary of the nonnegative orthant.

To describe these modifications, we need to introduce the concept of the central path, and of neighborhoods of this path.

The Central Path

The central path C is an arc of strictly feasible points that plays a vital role in primal-dual algorithms. It is parametrized by a scalar τ > 0, and each point (x_τ, λ_τ, s_τ) ∈ C solves the following system:

\[
\begin{aligned}
A^T \lambda + s &= c, && (14.8a)\\
Ax &= b, && (14.8b)\\
x_i s_i &= \tau, \quad i = 1, 2, \dots, n, && (14.8c)\\
(x, s) &> 0. && (14.8d)
\end{aligned}
\]

These conditions differ from the KKT conditions only in the term τ on the right-hand side of (14.8c). Instead of the complementarity condition (14.3c), we require that the pairwise products x_i s_i have the same (positive) value for all indices i. From (14.8), we can define the central path as

\[
\mathcal{C} = \{(x_\tau, \lambda_\tau, s_\tau) \mid \tau > 0\}.
\]

It can be shown that (x_τ, λ_τ, s_τ) is defined uniquely for each τ > 0 if and only if F^o is nonempty. Another way of defining C is to use the mapping F defined in (14.4) and write
\[
F(x_\tau, \lambda_\tau, s_\tau) =
\begin{bmatrix}
0\\ 0\\ \tau e
\end{bmatrix},
\qquad (x_\tau, s_\tau) > 0. \qquad (14.9)
\]
The equations (14.8) approximate (14.3) more and more closely as τ goes to zero. If C converges to anything as τ ↓ 0, it must converge to a primal-dual solution of the linear program. The central path thus guides us to a solution along a route that maintains positivity of the x and s components and decreases the pairwise products x_i s_i, i = 1, 2, ..., n, to zero at the same rate.

Primal-dual algorithms take Newton steps toward points on C for which τ > 0, rather than pure Newton steps for F. Since these steps are biased toward the interior of the nonnegative orthant defined by (x, s) ≥ 0, it usually is possible to take longer steps along them than along the pure Newton (affine scaling) steps, before violating the positivity condition. To describe the biased search direction, we introduce a centering parameter σ ∈ [0, 1] and a duality measure µ defined by

\[
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i s_i = \frac{x^T s}{n}, \qquad (14.10)
\]
which measures the average value of the pairwise products x_i s_i. By fixing τ = σµ and applying Newton's method to the system (14.9), we obtain
\[
\begin{bmatrix}
0 & A^T & I\\
A & 0 & 0\\
S & 0 & X
\end{bmatrix}
\begin{bmatrix}
\Delta x\\ \Delta\lambda\\ \Delta s
\end{bmatrix}
=
\begin{bmatrix}
0\\ 0\\ -XSe + \sigma\mu e
\end{bmatrix}. \qquad (14.11)
\]

The step (∆x, ∆λ, ∆s) is a Newton step toward the point (x_{σµ}, λ_{σµ}, s_{σµ}) ∈ C, at which the pairwise products x_i s_i are all equal to σµ. In contrast, the step (14.7) aims directly for the point at which the KKT conditions (14.3) are satisfied.

If σ = 1, the equations (14.11) define a centering direction, a Newton step toward the point (x_µ, λ_µ, s_µ) ∈ C, at which all the pairwise products x_i s_i are identical to the current average value µ. Centering directions are usually biased strongly toward the interior of the nonnegative orthant and make little, if any, progress in reducing the duality measure µ. However, by moving closer to C, they set the scene for a substantial reduction in µ on the next iteration. At the other extreme, the value σ = 0 gives the standard Newton (affine scaling) step (14.7). Many algorithms use intermediate values of σ from the open interval (0, 1) to trade off between the twin goals of reducing µ and improving centrality.
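As a small illustration (not part of the text), the step equations (14.11) can be assembled and solved with dense linear algebra for a toy problem. The function name is an assumption, and the dense solve is for exposition only; practical codes exploit sparsity and the block structure, as discussed in Section 14.3.

```python
import numpy as np

def centered_newton_step(A, x, lam, s, sigma):
    """Sketch: solve the step equations (14.11) for a strictly feasible
    (x, lam, s).  Dense solve for illustration; sigma = 0 gives the
    affine-scaling step (14.7)."""
    m, n = A.shape
    mu = x.dot(s) / n                      # duality measure (14.10)
    # Assemble the (2n+m) x (2n+m) coefficient matrix of (14.11).
    K = np.zeros((2 * n + m, 2 * n + m))
    K[:n, n:n + m] = A.T                   # first block row:  A^T dlam + ds
    K[:n, n + m:] = np.eye(n)
    K[n:n + m, :n] = A                     # second block row: A dx
    K[n + m:, :n] = np.diag(s)             # third block row:  S dx + X ds
    K[n + m:, n + m:] = np.diag(x)
    rhs = np.concatenate([np.zeros(n), np.zeros(m), -x * s + sigma * mu])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:n + m], sol[n + m:]   # (dx, dlam, ds)
```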

A Primal-Dual Framework

With these basic concepts in hand, we can define a general framework for primal-dual algorithms.

Framework 14.1 (Primal-Dual)
    Given (x^0, λ^0, s^0) ∈ F^o;
    for k = 0, 1, 2, ...
        Solve
\[
\begin{bmatrix}
0 & A^T & I\\
A & 0 & 0\\
S^k & 0 & X^k
\end{bmatrix}
\begin{bmatrix}
\Delta x^k\\ \Delta\lambda^k\\ \Delta s^k
\end{bmatrix}
=
\begin{bmatrix}
0\\ 0\\ -X^k S^k e + \sigma_k \mu_k e
\end{bmatrix}, \qquad (14.12)
\]

        where σ_k ∈ [0, 1] and µ_k = (x^k)^T s^k / n;
        Set

\[
(x^{k+1}, \lambda^{k+1}, s^{k+1}) = (x^k, \lambda^k, s^k) + \alpha_k(\Delta x^k, \Delta\lambda^k, \Delta s^k), \qquad (14.13)
\]
        choosing α_k so that (x^{k+1}, s^{k+1}) > 0;
    end (for).

The choices of centering parameter σ_k and step length α_k are crucial to the performance of the method. Techniques for controlling these parameters, directly and indirectly, give rise to a wide variety of methods with different theoretical properties.

So far, we have assumed that the starting point (x^0, λ^0, s^0) is strictly feasible and, in particular, that it satisfies the linear equations Ax^0 = b, A^T λ^0 + s^0 = c. All subsequent iterates also respect these constraints, because of the zero right-hand-side terms in (14.12). For most problems, however, a strictly feasible starting point is difficult to find. Infeasible-interior-point methods require only that the components of x^0 and s^0 be strictly positive. The search direction needs to be modified so that it improves feasibility as well as centrality at each iteration, but this requirement entails only a slight change to the step equation (14.11). If we define the residuals for the two linear equations as

\[
r_b = Ax - b, \qquad r_c = A^T \lambda + s - c, \qquad (14.14)
\]
the modified step equation is
\[
\begin{bmatrix}
0 & A^T & I\\
A & 0 & 0\\
S & 0 & X
\end{bmatrix}
\begin{bmatrix}
\Delta x\\ \Delta\lambda\\ \Delta s
\end{bmatrix}
=
\begin{bmatrix}
-r_c\\ -r_b\\ -XSe + \sigma\mu e
\end{bmatrix}. \qquad (14.15)
\]

The search direction is still a Newton step toward the point (x_{σµ}, λ_{σµ}, s_{σµ}) ∈ C. It tries to correct all the infeasibility in the equality constraints in a single step. If a full step is taken at any iteration (that is, α_k = 1 for some k), the residuals r_b and r_c become zero, and all subsequent iterates remain strictly feasible. We discuss infeasible-interior-point methods further in Section 14.3.
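To illustrate how small the change is (an illustrative sketch, not code from the text; the function name is an assumption), only the right-hand side of the linear system differs from the strictly feasible case:

```python
import numpy as np

def infeasible_step_rhs(A, b, c, x, lam, s, sigma):
    """Sketch: right-hand side of (14.15).  The residuals r_b, r_c of (14.14)
    replace the zero blocks used in the strictly feasible step (14.11)."""
    n = x.size
    mu = x.dot(s) / n
    r_b = A.dot(x) - b                 # primal residual (14.14)
    r_c = A.T.dot(lam) + s - c         # dual residual (14.14)
    return np.concatenate([-r_c, -r_b, -x * s + sigma * mu])
```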

Central Path Neighborhoods and Path-Following Methods

Path-following algorithms explicitly restrict the iterates to a neighborhood of the central path C and follow C to a solution of the linear program. By preventing the iterates from coming too close to the boundary of the nonnegative orthant, they ensure that search directions calculated from each iterate make at least a minimal amount of progress toward the solution.

[Figure 14.1: Central path C, projected into the space of primal variables x, showing a typical neighborhood N.]

A key ingredient of most optimization algorithms is a measure of the desirability of each point in the search space. In path-following algorithms, the duality measure µ defined by (14.10) fills this role. By forcing the duality measure µ_k to zero as k → ∞, we ensure that the iterates (x^k, λ^k, s^k) come closer and closer to satisfying the KKT conditions (14.3).

The two most interesting neighborhoods of C are the so-called 2-norm neighborhood N_2(θ) defined by
\[
\mathcal{N}_2(\theta) = \{(x, \lambda, s) \in \mathcal{F}^o \mid \|XSe - \mu e\|_2 \le \theta\mu\}, \qquad (14.16)
\]
for some θ ∈ [0, 1), and the one-sided ∞-norm neighborhood N_{-∞}(γ) defined by
\[
\mathcal{N}_{-\infty}(\gamma) = \{(x, \lambda, s) \in \mathcal{F}^o \mid x_i s_i \ge \gamma\mu \ \text{ for all } i = 1, 2, \dots, n\}, \qquad (14.17)
\]
for some γ ∈ (0, 1]. (Typical values of the parameters are θ = 0.5 and γ = 10^{-3}.) If a point lies in N_{-∞}(γ), each pairwise product x_i s_i must be at least some small multiple γ of their average value µ. This requirement is actually quite modest, and we can make N_{-∞}(γ) encompass most of the feasible region F by choosing γ close to zero. The N_2(θ) neighborhood is more restrictive, since certain points in F^o do not belong to N_2(θ) no matter how close θ is chosen to its upper bound of 1.

By keeping all iterates inside one or other of these neighborhoods, path-following methods reduce all the pairwise products x_i s_i to zero at more or less the same rate. Figure 14.1 shows the projection of the central path C onto the primal variables for a typical problem, along with a typical neighborhood N.

Path-following methods are akin to homotopy methods for general nonlinear equations, which also define a path to be followed to the solution. Traditional homotopy methods stay in a tight tubular neighborhood of their path, making incremental changes to the parameter and chasing the homotopy path all the way to a solution. For primal-dual methods, this neighborhood is conical rather than tubular, and it tends to be broad and loose for larger values of the duality measure µ. It narrows as µ → 0, however, because of the positivity requirement (x, s) > 0.

The algorithm we specify below, a special case of Framework 14.1, is known as a long-step path-following algorithm. This algorithm can make rapid progress because of its use of the wide neighborhood N_{-∞}(γ), for γ close to zero. It depends on two parameters σ_min and σ_max, which are lower and upper bounds on the centering parameter σ_k. The search direction is, as usual, obtained by solving (14.12), and we choose the step length α_k to be as large as possible, subject to the requirement that we stay inside N_{-∞}(γ). Here and in later analysis, we use the notation

\[
\begin{aligned}
(x^k(\alpha), \lambda^k(\alpha), s^k(\alpha)) &\;\overset{\text{def}}{=}\; (x^k, \lambda^k, s^k) + \alpha(\Delta x^k, \Delta\lambda^k, \Delta s^k), && (14.18a)\\
\mu_k(\alpha) &\;\overset{\text{def}}{=}\; x^k(\alpha)^T s^k(\alpha)/n. && (14.18b)
\end{aligned}
\]

Algorithm 14.2 (Long-Step Path-Following)
    Given γ, σ_min, σ_max with γ ∈ (0, 1), 0 < σ_min < σ_max < 1, and (x^0, λ^0, s^0) ∈ N_{-∞}(γ);
    for k = 0, 1, 2, ...
        Choose σ_k ∈ [σ_min, σ_max];
        Solve (14.12) to obtain (∆x^k, ∆λ^k, ∆s^k);
        Choose α_k as the largest value of α in [0, 1] such that

\[
(x^k(\alpha), \lambda^k(\alpha), s^k(\alpha)) \in \mathcal{N}_{-\infty}(\gamma); \qquad (14.19)
\]
        Set (x^{k+1}, λ^{k+1}, s^{k+1}) = (x^k(α_k), λ^k(α_k), s^k(α_k));
    end (for).

Typical behavior of the algorithm is illustrated in Figure 14.2 for the case of n = 2. The horizontal and vertical axes in this figure represent the pairwise products x_1 s_1 and x_2 s_2, so the central path C is the line emanating from the origin at an angle of 45°. (A point at the origin of this illustration is a primal-dual solution if it also satisfies the feasibility conditions (14.3a), (14.3b), and (14.3d).) In the unusual geometry of Figure 14.2, the search directions (∆x^k, ∆λ^k, ∆s^k) transform to curves rather than straight lines.

As Figure 14.2 shows (and the analysis confirms), the lower bound σ_min on the centering parameter ensures that each search direction starts out by moving away from the boundary of N_{-∞}(γ) and into the relative interior of this neighborhood. That is, small steps along the search direction improve the centrality. Larger values of α take us outside the neighborhood again, since the error in approximating the nonlinear system (14.9) by the linear step equations (14.11) becomes more pronounced as α increases. Still, we are guaranteed that a certain minimum step can be taken before we reach the boundary of N_{-∞}(γ), as we show in the analysis below.

[Figure 14.2: Iterates of Algorithm 14.2, plotted in (x_1 s_1, x_2 s_2) space, showing iterates 0, 1, 2, 3, the central path C, and the boundary of the neighborhood N.]
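The following sketch (an illustrative addition, not from the text; names are assumptions) shows one way to implement the neighborhood test (14.17) and the step-length rule (14.19). It approximates the "largest α" by backtracking from α = 1; an exact value could instead be obtained from the quadratic inequalities x_i(α)s_i(α) ≥ γµ(α).

```python
import numpy as np

def in_neighborhood_minus_inf(x, s, gamma):
    """Test the pairwise-product condition of N_{-inf}(gamma), eq. (14.17).
    (Feasibility of the equality constraints is assumed, not rechecked.)"""
    mu = x.dot(s) / x.size
    return np.all(x > 0) and np.all(s > 0) and np.all(x * s >= gamma * mu)

def longstep_alpha(x, s, dx, ds, gamma, shrink=0.95):
    """Approximate the step-length rule (14.19) by backtracking from 1."""
    alpha = 1.0
    while alpha > 1e-12:
        if in_neighborhood_minus_inf(x + alpha * dx, s + alpha * ds, gamma):
            return alpha
        alpha *= shrink
    return 0.0
```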

We present a complete analysis of Algorithm 14.2, which makes use of surprisingly simple mathematical foundations, in Section 14.2 below. With judicious choices of σ_k, this algorithm is fairly efficient in practice. With a few more changes it becomes the basis of a truly competitive method, as we discuss in Section 14.3.

An infeasible-interior-point variant of Algorithm 14.2 can be constructed by generalizing the definition of N_{-∞}(γ) to allow violation of the feasibility conditions. In this extended neighborhood, the residual norms ‖r_b‖ and ‖r_c‖ are bounded by a constant multiple of the duality measure µ. By squeezing µ to zero, we also force r_b and r_c to zero, so that the iterates approach complementarity and feasibility at the same time.

14.2 Analysis of Algorithm 14.2

We now present a comprehensive analysis of Algorithm 14.2. Our aim is to show that, given some small tolerance ε > 0 and some mild assumptions about the starting point (x^0, λ^0, s^0), the algorithm requires O(n |log ε|) iterations to identify a point (x^k, λ^k, s^k) for which µ_k ≤ ε, where µ_k = (x^k)^T s^k / n. For small ε, the point (x^k, λ^k, s^k) satisfies the primal-dual optimality conditions except for perturbations of about ε in the right-hand side of (14.3c), so it is usually very close to a primal-dual solution of the original linear program. The O(n |log ε|) estimate is a worst-case bound on the number of iterations required; on practical problems, the number of iterations required appears to increase only slightly as n increases. The simplex method may require 2^n iterations to solve a problem with n variables, though in practice it usually requires a modest multiple of max(m, n) iterations, where m is the row dimension of the constraint matrix A in (14.1).

As is typical for interior-point methods, the analysis builds from a purely technical lemma to a powerful theorem in just a few pages. We start with the technical lemma, Lemma 14.1, and use it to prove Lemma 14.2, a bound on the vector of pairwise products ∆x_i ∆s_i, i = 1, 2, ..., n. Theorem 14.3 finds a lower bound on the step length α_k and a corresponding estimate of the reduction in µ on iteration k. Finally, Theorem 14.4 proves that O(n |log ε|) iterations are required to identify a point for which µ_k ≤ ε, for a given tolerance ε ∈ (0, 1).

Lemma 14.1 Let u and v be any two vectors in ℝ^n with u^T v ≥ 0. Then
\[
\|UVe\| \le 2^{-3/2}\|u + v\|^2,
\]
where
\[
U = \mathrm{diag}(u_1, u_2, \dots, u_n), \qquad V = \mathrm{diag}(v_1, v_2, \dots, v_n).
\]

Proof. First, note that for any two scalars α and β with αβ ≥ 0, we have from the algebraic-geometric mean inequality that
\[
|\alpha\beta|^{1/2} \le \tfrac{1}{2}|\alpha + \beta|. \qquad (14.20)
\]
Since u^T v ≥ 0, we have
\[
0 \le u^T v = \sum_{u_i v_i \ge 0} u_i v_i + \sum_{u_i v_i < 0} u_i v_i
= \sum_{i \in P} |u_i v_i| - \sum_{i \in M} |u_i v_i|, \qquad (14.21)
\]
where we partitioned the index set {1, 2, ..., n} as
\[
P = \{i \mid u_i v_i \ge 0\}, \qquad M = \{i \mid u_i v_i < 0\}.
\]
Now,
\[
\begin{aligned}
\|UVe\| &= \left( \big\| [u_i v_i]_{i \in P} \big\|^2 + \big\| [u_i v_i]_{i \in M} \big\|^2 \right)^{1/2} \\
&\le \left( \big\| [u_i v_i]_{i \in P} \big\|_1^2 + \big\| [u_i v_i]_{i \in M} \big\|_1^2 \right)^{1/2} && \text{since } \|\cdot\|_2 \le \|\cdot\|_1 \\
&\le \left( 2 \big\| [u_i v_i]_{i \in P} \big\|_1^2 \right)^{1/2} && \text{from (14.21)} \\
&\le \sqrt{2}\, \sum_{i \in P} \tfrac{1}{4}(u_i + v_i)^2 && \text{from (14.20)} \\
&= 2^{-3/2} \sum_{i \in P} (u_i + v_i)^2 \\
&\le 2^{-3/2} \sum_{i=1}^{n} (u_i + v_i)^2 \\
&\le 2^{-3/2} \|u + v\|^2,
\end{aligned}
\]
completing the proof.
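As a quick sanity check (an illustrative addition, not part of the text), the inequality of Lemma 14.1 can be verified numerically on random vectors satisfying the hypothesis u^T v ≥ 0:

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.standard_normal(10)
    v = rng.standard_normal(10)
    if u.dot(v) < 0:                 # enforce the hypothesis u^T v >= 0
        v = -v
    lhs = np.linalg.norm(u * v)                       # ||U V e||
    rhs = 2.0 ** (-1.5) * np.linalg.norm(u + v) ** 2  # 2^{-3/2} ||u + v||^2
    assert lhs <= rhs + 1e-12
```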

Lemma 14.2 If (x, λ, s) ∈ N_{-∞}(γ), then

\[
\|\Delta X \Delta S e\| \le 2^{-3/2}(1 + 1/\gamma)\, n\mu.
\]

Proof. It is easy to show using (14.12) that

\[
\Delta x^T \Delta s = 0. \qquad (14.22)
\]

By multiplying the last block row in (14.12) by (XS)^{-1/2} and using the definition D = X^{1/2} S^{-1/2}, we obtain

\[
D^{-1}\Delta x + D\Delta s = (XS)^{-1/2}(-XSe + \sigma\mu e). \qquad (14.23)
\]

Because (D^{-1}∆x)^T (D∆s) = ∆x^T ∆s = 0, we can apply Lemma 14.1 with u = D^{-1}∆x and v = D∆s to obtain

\[
\begin{aligned}
\|\Delta X \Delta S e\| &= \|(D^{-1}\Delta X)(D\Delta S)e\| \\
&\le 2^{-3/2}\|D^{-1}\Delta x + D\Delta s\|^2 && \text{from Lemma 14.1}\\
&= 2^{-3/2}\|(XS)^{-1/2}(-XSe + \sigma\mu e)\|^2 && \text{from (14.23)}.
\end{aligned}
\]

Expanding the squared Euclidean norm and using such relationships as x^T s = nµ and e^T e = n, we obtain
\[
\begin{aligned}
\|\Delta X \Delta S e\| &\le 2^{-3/2}\left[ x^T s - 2\sigma\mu\, e^T e + \sigma^2\mu^2 \sum_{i=1}^{n} \frac{1}{x_i s_i} \right] \\
&\le 2^{-3/2}\left[ x^T s - 2\sigma\mu\, e^T e + \frac{\sigma^2\mu^2 n}{\gamma\mu} \right] && \text{since } x_i s_i \ge \gamma\mu \\
&\le 2^{-3/2}\left( 1 - 2\sigma + \frac{\sigma^2}{\gamma} \right) n\mu \\
&\le 2^{-3/2}(1 + 1/\gamma)\, n\mu,
\end{aligned}
\]
as claimed.

Theorem 14.3 Given the parameters γ, σ_min, and σ_max in Algorithm 14.2, there is a constant δ independent of n such that
\[
\mu_{k+1} \le \left(1 - \frac{\delta}{n}\right)\mu_k, \qquad (14.24)
\]
for all k ≥ 0.

Proof. We start by proving that
\[
\left(x^k(\alpha), \lambda^k(\alpha), s^k(\alpha)\right) \in \mathcal{N}_{-\infty}(\gamma)
\quad \text{for all } \alpha \in \left[0,\; 2^{3/2}\gamma\,\frac{1-\gamma}{1+\gamma}\,\frac{\sigma_k}{n}\right], \qquad (14.25)
\]

3/2 σk 1 − γ αk ≥ 2 γ . (14.26) n 1+γ For any i =1, 2,...,n, we have from Lemma 14.2 that k k k k −3/2 |∆xi ∆si |≤∆X ∆S e2 ≤ 2 (1 + 1/γ)nµk. (14.27) k k Using (14.12), we have from xi si ≥ γµk and (14.27) that xk(α)sk(α)= xk + α∆xk sk + α∆sk i i i i i i k k k k k k 2 k k = xi si + α xi ∆si + si ∆xi + α ∆xi ∆si k k 2 k k ≥ xi si (1 − α)+ασkµk − α |∆xi ∆si | 2 −3/2 ≥ γ(1 − α)µk + ασkµk − α 2 (1 + 1/γ)nµk. By summing the n components of the equation Sk∆xk + Xk∆sk = −XkSke + σkµke (the third block row from (14.12)), and using (14.22) and the definition of µk and µk(α) (see (14.18)), we obtain

µk(α)=(1− α(1 − σk))µk. From these last two formulas, we can see that the proximity condition k k xi (α)si (α) ≥ γµk(α) is satisfied, provided that 2 −3/2 γ(1 − α)µk + ασkµk − α 2 (1 + 1/γ)nµk ≥ γ(1 − α + ασk)µk. Rearranging this expression, we obtain 2 −3/2 ασkµk(1 − γ) ≥ α 2 nµk(1 + 1/γ), which is true if 3/2 − γ α ≤ 2 σ γ 1 . n k γ 1+ We have proved that xk(α),λk(α),sk(α) satisfies the proximity condition for N−∞(γ)whenα lies in the range stated in (14.25). It is not difficult to show that xk(α),λk(α),sk(α) ∈Fo for all α in the given range (see the exercises). Hence, we have proved (14.25) and therefore (14.26). We complete the proof of the theorem by estimating the reduction in µ on the kth step. Because of (14.22), (14.26), and the last block row of (14.11), we have µ xk α T sk α /n k+1 = ( k) ( k) k T k k T k k T k 2 k T k = (x ) s + αk (x ) ∆s +(s ) ∆x + α (∆x ) ∆s (14.28)/n k k T k = µk + αk −(x ) s /n + σkµk (14.29) =(1− αk(1 − σk))µk (14.30) 23/2 1 − γ ≤ 1 − γ σk(1 − σk) µk. (14.31) n 1+γ 14.2. ANALYSIS OF ALGORITHM 14.2 13

Now, the function σ(1 − σ) is a concave quadratic function of σ, so on any given interval it attains its minimum value at one of the endpoints. Hence, we have

\[
\sigma_k(1 - \sigma_k) \ge \min\{\sigma_{\min}(1 - \sigma_{\min}),\; \sigma_{\max}(1 - \sigma_{\max})\}, \quad \text{for all } \sigma_k \in [\sigma_{\min}, \sigma_{\max}].
\]

The proof is completed by substituting this estimate into (14.31) and setting

\[
\delta = 2^{3/2}\,\gamma\,\frac{1-\gamma}{1+\gamma}\,\min\{\sigma_{\min}(1 - \sigma_{\min}),\; \sigma_{\max}(1 - \sigma_{\max})\}.
\]

We conclude with the complexity result.

Theorem 14.4 Given ε > 0 and γ ∈ (0, 1), suppose the starting point (x^0, λ^0, s^0) ∈ N_{-∞}(γ) in Algorithm 14.2 has

\[
\mu_0 \le 1/\varepsilon^{\kappa} \qquad (14.32)
\]
for some positive constant κ. Then there is an index K with K = O(n log 1/ε) such that
\[
\mu_k \le \varepsilon, \quad \text{for all } k \ge K.
\]

Proof. By taking logarithms of both sides in (14.24), we obtain
\[
\log \mu_{k+1} \le \log\left(1 - \frac{\delta}{n}\right) + \log \mu_k.
\]

By repeatedly applying this formula and using (14.32), we have
\[
\log \mu_k \le k \log\left(1 - \frac{\delta}{n}\right) + \log \mu_0
\le k \log\left(1 - \frac{\delta}{n}\right) + \kappa \log\frac{1}{\varepsilon}.
\]

The following well-known estimate for the log function,

\[
\log(1 + \beta) \le \beta, \quad \text{for all } \beta > -1,
\]
implies that
\[
\log \mu_k \le k\left(-\frac{\delta}{n}\right) + \kappa \log\frac{1}{\varepsilon}.
\]

Therefore, the convergence criterion µ_k ≤ ε is satisfied if we have
\[
k\left(-\frac{\delta}{n}\right) + \kappa \log\frac{1}{\varepsilon} \le \log \varepsilon.
\]

This inequality holds for all k that satisfy
\[
k \ge K = (1 + \kappa)\,\frac{n}{\delta}\,\log\frac{1}{\varepsilon},
\]
so the proof is complete.

14.3 Practical Primal-Dual Algorithms

Practical implementations of interior-point algorithms follow the spirit of the theory in the previous section, in that strict positivity of x^k and s^k is maintained throughout and each step is a Newton-like step involving a centering component. However, several aspects of "theoretical" algorithms are typically ignored, while several enhancements are added that have a significant effect on practical performance. We discuss in this section the algorithmic enhancements that are present in most interior-point software.

Practical implementations typically do not enforce membership of the central path neighborhoods N_2 and N_{-∞} defined in the previous section. Rather, they calculate the maximum steplengths that can be taken in the x and s variables (separately) without violating nonnegativity, then take a steplength of slightly less than this maximum. Given an iterate (x^k, λ^k, s^k) with (x^k, s^k) > 0, and a step (∆x^k, ∆λ^k, ∆s^k), it is easy to show that the quantities α_{k,max}^pri and α_{k,max}^dual defined as follows:

\[
\begin{aligned}
\alpha_{k,\max}^{\mathrm{pri}} &\;\overset{\text{def}}{=}\; \min_{i:\,\Delta x_i^k < 0}\; -\frac{x_i^k}{\Delta x_i^k}, && (14.33a)\\
\alpha_{k,\max}^{\mathrm{dual}} &\;\overset{\text{def}}{=}\; \min_{i:\,\Delta s_i^k < 0}\; -\frac{s_i^k}{\Delta s_i^k}, && (14.33b)
\end{aligned}
\]
are the largest values of α for which x^k + α∆x^k ≥ 0 and s^k + α∆s^k ≥ 0, respectively. Practical algorithms then choose the steplengths to lie in the open intervals defined by these maxima, that is,

\[
\alpha_k^{\mathrm{pri}} \in (0, \alpha_{k,\max}^{\mathrm{pri}}), \qquad \alpha_k^{\mathrm{dual}} \in (0, \alpha_{k,\max}^{\mathrm{dual}}),
\]
and then obtain a new iterate by setting

\[
x^{k+1} = x^k + \alpha_k^{\mathrm{pri}}\Delta x^k, \qquad
(\lambda^{k+1}, s^{k+1}) = (\lambda^k, s^k) + \alpha_k^{\mathrm{dual}}(\Delta\lambda^k, \Delta s^k).
\]
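A short sketch (illustrative only; function names and the damping factor are assumptions) of the steplength computation (14.33) and the update above:

```python
import numpy as np

def max_alpha(z, dz):
    """Largest alpha >= 0 with z + alpha*dz >= 0, as in (14.33);
    returns inf when no component of dz is negative."""
    neg = dz < 0
    if not np.any(neg):
        return np.inf
    return np.min(-z[neg] / dz[neg])

def take_step(x, lam, s, dx, dlam, ds, eta=0.99):
    """Damp the maximum steplengths by a factor eta < 1 so that the new
    iterate keeps (x, s) strictly positive."""
    a_pri = min(1.0, eta * max_alpha(x, dx))
    a_dual = min(1.0, eta * max_alpha(s, ds))
    return x + a_pri * dx, lam + a_dual * dlam, s + a_dual * ds
```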

If the step (∆x^k, ∆λ^k, ∆s^k) rectifies the infeasibility in the KKT conditions (14.3a) and (14.3b), that is,

\[
A\Delta x^k = -r_b^k = -(Ax^k - b), \qquad A^T\Delta\lambda^k + \Delta s^k = -r_c^k = -(A^T\lambda^k + s^k - c),
\]
it is easy to show that the infeasibilities at the new iterate satisfy
\[
r_b^{k+1} = \left(1 - \alpha_k^{\mathrm{pri}}\right) r_b^k, \qquad
r_c^{k+1} = \left(1 - \alpha_k^{\mathrm{dual}}\right) r_c^k, \qquad (14.34)
\]

where r_b^{k+1} and r_c^{k+1} are defined in the obvious way.

A second feature of practical algorithms is their use of corrector steps, a development made practical by Mehrotra [14]. These steps compensate for the linearization error made by the Newton (affine-scaling) step in modeling the equation x_i s_i = 0, i = 1, 2, ..., n (see (14.3c)). Consider the affine-scaling direction (∆x^aff, ∆λ^aff, ∆s^aff) defined by
\[
\begin{bmatrix}
0 & A^T & I\\
A & 0 & 0\\
S & 0 & X
\end{bmatrix}
\begin{bmatrix}
\Delta x^{\mathrm{aff}}\\ \Delta\lambda^{\mathrm{aff}}\\ \Delta s^{\mathrm{aff}}
\end{bmatrix}
=
\begin{bmatrix}
-r_c\\ -r_b\\ -XSe
\end{bmatrix} \qquad (14.35)
\]

(where r_b and r_c are defined in (14.14)). If we take a full step in this direction, we obtain

\[
(x_i + \Delta x_i^{\mathrm{aff}})(s_i + \Delta s_i^{\mathrm{aff}})
= x_i s_i + x_i\Delta s_i^{\mathrm{aff}} + s_i\Delta x_i^{\mathrm{aff}} + \Delta x_i^{\mathrm{aff}}\Delta s_i^{\mathrm{aff}}
= \Delta x_i^{\mathrm{aff}}\Delta s_i^{\mathrm{aff}}.
\]

That is, the updated value of x_i s_i is ∆x_i^aff ∆s_i^aff rather than the ideal value 0. We can solve the following system to obtain a step (∆x^cor, ∆λ^cor, ∆s^cor) that corrects for this deviation from the ideal:
\[
\begin{bmatrix}
0 & A^T & I\\
A & 0 & 0\\
S & 0 & X
\end{bmatrix}
\begin{bmatrix}
\Delta x^{\mathrm{cor}}\\ \Delta\lambda^{\mathrm{cor}}\\ \Delta s^{\mathrm{cor}}
\end{bmatrix}
=
\begin{bmatrix}
0\\ 0\\ -\Delta X^{\mathrm{aff}}\Delta S^{\mathrm{aff}} e
\end{bmatrix}. \qquad (14.36)
\]

In many cases, the combined step (∆x^aff, ∆λ^aff, ∆s^aff) + (∆x^cor, ∆λ^cor, ∆s^cor) does a better job of reducing the duality measure than does the affine-scaling step alone.

Like theoretical algorithms such as the one analysed in Section ??, practical algorithms make use of centering steps, with an adaptive choice of the centering parameter σ_k. Mehrotra [14] introduced a highly successful scheme for choosing σ_k based on the effectiveness of the affine-scaling step at iteration k. Roughly, if the affine-scaling step (multiplied by a steplength to maintain nonnegativity of x and s) reduces the duality measure significantly, there is not much need for centering, so a smaller value of σ_k is appropriate. Conversely, if not much progress can be made along this direction before reaching the boundary of the nonnegative orthant, a larger value of σ_k will ensure that the next iterate is more centered, so a longer step will be possible from this next point. Specifically, Mehrotra's scheme calculates the maximum allowable steplengths along the affine-scaling direction (14.35) as follows:
\[
\begin{aligned}
\alpha_{\mathrm{aff}}^{\mathrm{pri}} &\;\overset{\text{def}}{=}\; \min\left(1,\; \min_{i:\,\Delta x_i^{\mathrm{aff}} < 0} -\frac{x_i}{\Delta x_i^{\mathrm{aff}}}\right), && (14.37a)\\
\alpha_{\mathrm{aff}}^{\mathrm{dual}} &\;\overset{\text{def}}{=}\; \min\left(1,\; \min_{i:\,\Delta s_i^{\mathrm{aff}} < 0} -\frac{s_i}{\Delta s_i^{\mathrm{aff}}}\right), && (14.37b)
\end{aligned}
\]
and then defines µ_aff to be the value of µ that would be obtained by using these steplengths, that is,

\[
\mu_{\mathrm{aff}} = \left(x + \alpha_{\mathrm{aff}}^{\mathrm{pri}}\Delta x^{\mathrm{aff}}\right)^T\left(s + \alpha_{\mathrm{aff}}^{\mathrm{dual}}\Delta s^{\mathrm{aff}}\right)/n. \qquad (14.38)
\]

The centering parameter σ is then chosen as follows:
\[
\sigma = \left(\frac{\mu_{\mathrm{aff}}}{\mu}\right)^3. \qquad (14.39)
\]
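A compact sketch of the heuristic (14.37)-(14.39) (illustrative; the function name is an assumption, and only the choice of σ is computed here):

```python
import numpy as np

def mehrotra_sigma(x, s, dx_aff, ds_aff):
    """Centering parameter from the affine-scaling trial step: the maximum
    steplengths (14.37), the predicted duality measure mu_aff (14.38), and
    sigma = (mu_aff / mu)^3 as in (14.39)."""
    n = x.size
    mu = x.dot(s) / n

    def step_to_boundary(z, dz):
        neg = dz < 0
        return 1.0 if not np.any(neg) else min(1.0, np.min(-z[neg] / dz[neg]))

    a_pri = step_to_boundary(x, dx_aff)
    a_dual = step_to_boundary(s, ds_aff)
    mu_aff = (x + a_pri * dx_aff).dot(s + a_dual * ds_aff) / n
    return (mu_aff / mu) ** 3, mu, mu_aff
```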

To summarize, computation of the search direction in Mehrotra's method requires the solution of two linear systems. First, the system (14.35) is solved to obtain the affine-scaling direction, also known as the predictor step. This step is used to define the right-hand side for the corrector step (see (14.36)) and to calculate the centering parameter from (14.38), (14.39). Second, the search direction is calculated by solving
\[
\begin{bmatrix}
0 & A^T & I\\
A & 0 & 0\\
S & 0 & X
\end{bmatrix}
\begin{bmatrix}
\Delta x\\ \Delta\lambda\\ \Delta s
\end{bmatrix}
=
\begin{bmatrix}
-r_c\\ -r_b\\ -XSe - \Delta X^{\mathrm{aff}}\Delta S^{\mathrm{aff}} e + \sigma\mu e
\end{bmatrix}. \qquad (14.40)
\]

Note that the predictor, corrector, and centering contributions have been aggregated on the right-hand side of this system. The coefficient matrix in both linear systems (14.35) and (14.40) is the same. Thus, the factorization of the matrix needs to be computed only once, and the marginal cost of solving the second system is relatively small. Once the search direction has been calculated, the maximum steplengths along this direction are computed from (14.33), and the steplengths actually used are chosen to be

\[
\alpha_k^{\mathrm{pri}} = \min\left(1,\; \eta_k\,\alpha_{\max}^{\mathrm{pri}}\right), \qquad
\alpha_k^{\mathrm{dual}} = \min\left(1,\; \eta_k\,\alpha_{\max}^{\mathrm{dual}}\right), \qquad (14.41)
\]
where η_k ∈ [0.9, 1.0) is chosen so that η_k → 1 as the iterates approach the primal-dual solution, to accelerate the asymptotic convergence. (For details on how η is chosen and on other elements of the algorithm such as the choice of starting point (x^0, λ^0, s^0), see Mehrotra [14].)

We now specify the algorithm in the usual format.

Algorithm 14.3 (Predictor-Corrector Algorithm (Mehrotra [14]))
    Given (x^0, λ^0, s^0) with (x^0, s^0) > 0;
    for k = 0, 1, 2, ...
        Set (x, λ, s) = (x^k, λ^k, s^k) and solve (14.35) for (∆x^aff, ∆λ^aff, ∆s^aff);
        Calculate α_aff^pri, α_aff^dual, and µ_aff as in (14.37) and (14.38);
        Set centering parameter to σ = (µ_aff/µ)^3;
        Solve (14.40) for (∆x, ∆λ, ∆s);
        Calculate α_k^pri and α_k^dual from (14.41);
        Set

\[
\begin{aligned}
x^{k+1} &= x^k + \alpha_k^{\mathrm{pri}}\Delta x,\\
(\lambda^{k+1}, s^{k+1}) &= (\lambda^k, s^k) + \alpha_k^{\mathrm{dual}}(\Delta\lambda, \Delta s);
\end{aligned}
\]

    end (for).

No convergence theory is available for Mehrotra's algorithm, at least in the form in which it is described above. In fact, there are examples for which the algorithm diverges. Simple safeguards could be incorporated into the method to force it into the convergence framework of existing methods. However, most programs do not implement these safeguards, because the good practical performance of Mehrotra's algorithm makes them unnecessary.

When presented with a linear program that is infeasible or unbounded, the algorithm above typically diverges, with the infeasibilities r_b^k and r_c^k and/or the duality measure µ_k going to ∞. Since the symptoms of infeasibility and unboundedness are fairly easy to recognize, interior-point codes contain heuristics to detect and report these conditions. More rigorous approaches for detecting infeasibility and unboundedness make use of the homogeneous self-dual formulation; see Wright [24, Chapter 9] and the references therein for a discussion. A more recent approach that applies directly to infeasible-interior-point methods is described by Todd [20].

Solving the Linear Systems

Most of the computational effort in primal-dual methods is taken up in solving linear systems such as (14.15), (14.35), and (14.40). The coefficient matrix in these systems is usually large and sparse, since the constraint matrix A is itself large and sparse in most applications. The special structure in the step equations allows us to reformulate them as systems with more compact symmetric coefficient matrices, which are easier and cheaper to factor than the original sparse form. We apply the reformulation procedures to the following general form of the linear system:
\[
\begin{bmatrix}
0 & A^T & I\\
A & 0 & 0\\
S & 0 & X
\end{bmatrix}
\begin{bmatrix}
\Delta x\\ \Delta\lambda\\ \Delta s
\end{bmatrix}
=
\begin{bmatrix}
-r_c\\ -r_b\\ -r_{xs}
\end{bmatrix}. \qquad (14.42)
\]

Since x and s are strictly positive, the diagonal matrices X and S are nonsingular. Hence, by eliminating ∆s from (14.42), we obtain the following equivalent system:
\[
\begin{bmatrix}
0 & A\\
A^T & -D^{-2}
\end{bmatrix}
\begin{bmatrix}
\Delta\lambda\\ \Delta x
\end{bmatrix}
=
\begin{bmatrix}
-r_b\\ -r_c + X^{-1} r_{xs}
\end{bmatrix}, \qquad (14.43a)
\]
\[
\Delta s = -X^{-1} r_{xs} - X^{-1} S\,\Delta x, \qquad (14.43b)
\]
where we have introduced the notation

\[
D = S^{-1/2} X^{1/2}. \qquad (14.44)
\]

This form of the step equations usually is known as the augmented system. Since the matrix X^{-1}S is also diagonal and nonsingular, we can go a step further, eliminating ∆x from (14.43a) to obtain another equivalent form:
\[
\begin{aligned}
A D^2 A^T\,\Delta\lambda &= -r_b - A X S^{-1} r_c + A S^{-1} r_{xs}, && (14.45a)\\
\Delta s &= -r_c - A^T\Delta\lambda, && (14.45b)\\
\Delta x &= -S^{-1} r_{xs} - X S^{-1}\Delta s. && (14.45c)
\end{aligned}
\]
This form often is called the normal-equations form because the system (14.45a) can be viewed as the normal equations (??) for a linear least-squares problem with coefficient matrix DA^T.

Most implementations of primal-dual methods are based on formulations like (14.45). They use direct sparse Cholesky algorithms to factor the matrix AD^2A^T, and then perform triangular solves with the resulting sparse factors to obtain the step ∆λ from (14.45a). The steps ∆s and ∆x are recovered from (14.45b) and (14.45c). General-purpose sparse Cholesky software can be applied to AD^2A^T, but a few modifications are needed because AD^2A^T may be ill-conditioned or singular. (Ill conditioning of this system is often observed during the final stages of a primal-dual algorithm, when the elements of the diagonal weighting matrix D^2 take on both huge and tiny values.)

A disadvantage of this formulation is that if A contains any dense columns, the matrix AD^2A^T is completely dense. Hence, practical software identifies dense and nearly-dense columns, excludes them from the matrix product AD^2A^T, and performs the Cholesky factorization of the resulting sparse matrix. Then, a device such as a Sherman-Morrison-Woodbury update is applied to account for the excluded columns. We refer the reader to Wright [24, Chapter 11] for further details.

The formulation (14.43) has received less attention than (14.45), mainly because algorithms and software for factoring sparse symmetric indefinite matrices are more complicated, slower, and less prevalent than sparse Cholesky algorithms. Nevertheless, the formulation (14.43) is cleaner and more flexible than (14.45) in a number of respects. We avoid the fill-in caused by dense columns in A in the matrix product AD^2A^T, and free variables (that is, components of x with no explicit lower or upper bounds) can be handled without resorting to the various artificial devices needed in the normal-equations form.
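The following sketch (illustrative only; the function name is an assumption, and a dense Cholesky factorization stands in for the sparse factorizations used in practice) carries out the normal-equations solve (14.45) for the general system (14.42):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def normal_equations_step(A, x, s, r_b, r_c, r_xs):
    """Sketch of (14.45): form A D^2 A^T with D^2 = X S^{-1}, factor it by
    Cholesky, solve for dlam, then recover ds and dx."""
    d2 = x / s                                   # diagonal of D^2
    M = (A * d2).dot(A.T)                        # A D^2 A^T
    rhs = -r_b - A.dot(x * r_c / s) + A.dot(r_xs / s)
    dlam = cho_solve(cho_factor(M), rhs)         # (14.45a)
    ds = -r_c - A.T.dot(dlam)                    # (14.45b)
    dx = -r_xs / s - x * ds / s                  # (14.45c)
    return dx, dlam, ds
```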

14.4 Other Primal-Dual Algorithms and Extensions

Other Path-Following Methods

Framework 14.1 is the basis of a number of other algorithms of the path-following variety. They are less important from a practical viewpoint, but we mention them here because of their elegance and their strong theoretical properties.

Some path-following methods choose conservative values for the centering parameter σ (that is, σ only slightly less than 1) so that unit steps (that is, a step length of α = 1) can be taken along the resulting direction from (14.11) without leaving the chosen neighborhood. These methods, which are known as short-step path-following methods, make only slow progress toward the solution because they require the iterates to stay inside a restrictive N_2 neighborhood (14.16).

Better results are obtained with the predictor-corrector method, due to Mizuno, Todd, and Ye [15], which uses two N_2 neighborhoods, nested one inside the other. (Despite the similar terminology, this algorithm is quite distinct from Algorithm 14.3 of Section 14.3.) Every second step of this method is a predictor step, which starts in the inner neighborhood and moves along the affine-scaling direction (computed by setting σ = 0 in (14.11)) to the boundary of the outer neighborhood. The gap between neighborhood boundaries is wide enough to allow this step to make significant progress in reducing µ. Alternating with the predictor steps are corrector steps (computed with σ = 1 and α = 1), which take the next iterate back inside the inner neighborhood in preparation for the next predictor step. The predictor-corrector algorithm produces a sequence of duality measures µ_k that converge superlinearly to zero, in contrast to the linear convergence that characterizes most methods.

Potential-Reduction Methods

Potential-reduction methods take steps of the same form as path-following methods, but they do not explicitly follow the central path C and can be motivated independently of it. They use a logarithmic potential function to measure the worth of each point in F^o and aim to achieve a certain fixed reduction in this function at each iteration. The primal-dual potential function, which we denote generically by Φ, usually has two important properties:

\[
\begin{aligned}
&\Phi \to \infty \ \text{ if } x_i s_i \to 0 \text{ for some } i, \text{ while } \mu = x^T s/n \not\to 0, && (14.46a)\\
&\Phi \to -\infty \ \text{ if and only if } (x, \lambda, s) \to \Omega. && (14.46b)
\end{aligned}
\]

The first property (14.46a) prevents any one of the pairwise products x_i s_i from approaching zero independently of the others, and therefore keeps the iterates away from the boundary of the nonnegative orthant. The second property (14.46b) relates Φ to the solution set Ω. If our algorithm forces Φ to -∞, then (14.46b) ensures that the sequence approaches the solution set.

An interesting primal-dual potential function is the Tanabe-Todd-Ye function Φ_ρ defined by
\[
\Phi_\rho(x, s) = \rho \log x^T s - \sum_{i=1}^{n} \log x_i s_i, \qquad (14.47)
\]
for some parameter ρ > n (see [17], [21]). Like all algorithms based on Framework 14.1, potential-reduction algorithms obtain their search directions by solving (14.12), for some σ_k ∈ (0, 1), and they take steps of length α_k along these directions. For instance, the step length α_k may be chosen to approximately minimize Φ_ρ along the computed direction. By fixing σ_k = n/(n + √n) for all k, one can guarantee constant reduction in Φ_ρ at every iteration. Hence, Φ_ρ will approach -∞, forcing convergence. Adaptive and heuristic choices of σ_k and α_k are also covered by the theory, provided that they at least match the reduction in Φ_ρ obtained from the conservative theoretical values of these parameters.
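For concreteness, a direct transcription of (14.47) (an illustrative sketch, not from the text; it assumes (x, s) > 0 and ρ > n):

```python
import numpy as np

def tanabe_todd_ye_potential(x, s, rho):
    """Primal-dual potential function (14.47), valid for (x, s) > 0."""
    return rho * np.log(x.dot(s)) - np.sum(np.log(x * s))
```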

Extensions

Primal-dual methods for linear programming can be extended to wider classes of problems. There are simple extensions of the algorithm to the monotone linear complementarity problem (LCP) and convex quadratic programming problems for which the convergence and polynomial complexity properties of the linear programming algorithms are retained. The LCP is the problem of finding vectors x and s in ℝ^n that satisfy the following conditions:

\[
s = Mx + q, \qquad (x, s) \ge 0, \qquad x^T s = 0, \qquad (14.48)
\]
where M is a positive semidefinite n × n matrix and q ∈ ℝ^n. The similarity between (14.48) and the KKT conditions (14.3) is obvious: The last two conditions in (14.48) correspond to (14.3d) and (14.3c), respectively, while the condition s = Mx + q is similar to the equations (14.3a) and (14.3b). For practical instances of the problem (14.48), see Cottle, Pang, and Stone [3].

In convex quadratic programming, we minimize a convex quadratic objective subject to linear constraints. A convex quadratic generalization of the standard form linear program (14.1) is

\[
\min\; c^T x + \tfrac{1}{2} x^T G x, \quad \text{subject to } Ax = b,\; x \ge 0, \qquad (14.49)
\]
where G is a symmetric n × n positive semidefinite matrix. The KKT conditions for this problem are similar to those for linear programming (14.3) and also to the linear complementarity problem (14.48). See Section ?? for further discussion of interior-point methods for (14.49).

Primal-dual methods for nonlinear programming problems can be devised by writing down the KKT conditions in a form similar to (14.3), adding slack variables where necessary to convert all the inequality conditions to simple bounds of the type (14.3d). As in Framework 14.1, the basic primal-dual step is found by applying a modified Newton's method to some formulation of these KKT conditions, curtailing the step length to ensure that the bounds are satisfied strictly by all iterates. When the nonlinear programming problem is convex (that is, its objective and constraint functions are all convex functions), global convergence of primal-dual methods can be proved. Extensions to general nonlinear programming problems are not so straightforward; this topic is the subject of Chapter ??.

Interior-point methods are highly effective in solving semidefinite programming problems, a class of problems involving symmetric matrix variables that are constrained to be positive semidefinite. Semidefinite programming, which has been the topic of concentrated research since the early 1990s, has applications in many areas, including control theory and combinatorial optimization.

Further information on this increasingly important topic can be found in the survey papers of Todd [19] and Vandenberghe and Boyd [22] and the books of Nesterov and Nemirovskii [16], Boyd et al. [1], and Boyd and Vandenberghe [2].

14.5 Perspectives and Software

The appearance of interior-point methods in the 1980s presented the first serious challenge to the dominance of the simplex method as a practical means of solving linear programming problems. By about 1990, interior-point codes had emerged that incorporated the techniques described in Section 14.3 and that were superior on many large problems to the simplex codes available at that time. The years that followed saw a quantum improvement in simplex software, evidenced by the appearance of packages such as CPLEX and XPRESS-MP. These improvements were due to algorithmic advances such as steepest-edge pivoting (see Goldfarb and Forrest [8]) and improved pricing heuristics, and also to close attention to the nuts and bolts of efficient implementation. The efficiency of interior-point codes also continued to improve, through improvements in the linear algebra for solving the step equations and through the use of higher-order correctors in the step calculation (see Gondzio [9]). During this period, a number of good interior-point codes became freely available, at least for research use, and found their way into many applications.

In general, simplex codes are faster on problems of small to medium dimensions, while interior-point codes are competitive and often faster on large problems. However, this rule is certainly not hard-and-fast; it depends strongly on the structure of the particular application. Interior-point methods are generally not able to take advantage of prior knowledge about the solution, such as an estimate of the solution itself or an estimate of the optimal basis. Hence, these methods are less useful than simplex approaches in situations in which "warm-start" information is readily available. The most widespread situation of this type involves branch-and-bound algorithms for solving integer programs, where each node in the branch-and-bound tree requires the solution of a linear program that differs only slightly from one already solved in the parent node.

Interior-point software has the advantage that it is easy to program, relative to the simplex method. The most complex operation is the solution of the large linear systems at each iteration to compute the step; software to perform this linear algebra operation is readily available. The interior-point code LIPSOL [25] is written entirely in the Matlab language, apart from a small amount of FORTRAN code that interfaces to the linear algebra software. The code PCx [4] is written in C, but also is easy for the interested user to comprehend and modify. It is even possible for a non-expert in optimization to write an efficient interior-point implementation from scratch that is customized to their particular application.

Notes and References

For more details on the material of this chapter, see the book by Wright [24].

As noted in the text, Karmarkar's method arose from a search for linear programming algorithms with better worst-case behavior than the simplex method. The first algorithm with polynomial complexity, Khachiyan's ellipsoid algorithm [11], was a computational disappointment. In contrast, the execution times required by Karmarkar's method were not too much greater than those of simplex codes at the time of its introduction, particularly for large linear programs. Karmarkar's is a primal algorithm; that is, it is described, motivated, and implemented purely in terms of the primal problem (14.1) without reference to the dual. At each iteration, Karmarkar's algorithm performs a projective transformation on the primal feasible set that maps the current iterate x^k to the center of the set and takes a step in the feasible steepest descent direction for the transformed space. Progress toward optimality is measured by a logarithmic potential function. Nice descriptions of the algorithm can be found in Karmarkar's original paper [10] and in Fletcher [7]. Karmarkar's method falls outside the scope of this chapter, and in any case, its practical performance does not appear to match the most efficient primal-dual methods. The algorithms we discussed in this chapter have polynomial complexity, like Karmarkar's method.

Many of the algorithmic ideas that have been examined since 1984 actually had their genesis in three works that preceded Karmarkar's paper. The first of these is the book of Fiacco and McCormick [6] on logarithmic barrier functions, which proves existence of the central path, among many other results. Further analysis of the central path was carried out by McLinden [12], in the context of nonlinear complementarity problems. Finally, there is Dikin's paper [5], in which an interior-point method known as primal affine-scaling was originally proposed. The outburst of research on primal-dual methods, which culminated in the efficient software packages available today, dates to the seminal paper of Megiddo [13].

Todd gives an excellent survey of potential reduction methods in [18]. He relates the primal-dual potential reduction method mentioned above to pure primal potential reduction methods, including Karmarkar's original algorithm, and discusses extensions to special classes of nonlinear problems.

For an introduction to complexity theory and its relationship to optimization, see the book by Vavasis [23].

Exercises

1. This exercise illustrates the fact that the bounds (x, s) ≥ 0 are essential in relating solutions of the system (14.4a) to solutions of the linear program (14.1) and its dual. Consider the following linear program in ℝ^2:

\[
\min\; x_1, \quad \text{subject to } x_1 + x_2 = 1,\; (x_1, x_2) \ge 0.
\]
Show that the primal-dual solution is
\[
x^* = \begin{bmatrix} 0\\ 1 \end{bmatrix}, \qquad \lambda^* = 0, \qquad s^* = \begin{bmatrix} 1\\ 0 \end{bmatrix}.
\]

Also verify that the system F(x, λ, s) = 0 has the spurious solution
\[
x = \begin{bmatrix} 1\\ 0 \end{bmatrix}, \qquad \lambda = 1, \qquad s = \begin{bmatrix} 0\\ -1 \end{bmatrix},
\]

which has no relation to the solution of the linear program.

2. (i) Show that N_2(θ_1) ⊂ N_2(θ_2) when 0 ≤ θ_1 < θ_2 < 1, and that N_{-∞}(γ_1) ⊂ N_{-∞}(γ_2) for 0 < γ_2 ≤ γ_1 ≤ 1.

(ii) Show that N_2(θ) ⊂ N_{-∞}(γ) if γ ≤ 1 - θ.

3. Given an arbitrary point (x, λ, s) ∈ F^o, find the range of γ values for which (x, λ, s) ∈ N_{-∞}(γ). (The range depends on x and s.)

4. For n = 2, find a point (x, s) > 0 for which the condition

\[
\|XSe - \mu e\|_2 \le \theta\mu
\]

is not satisfied for any θ ∈ [0, 1).

5. Prove that the neighborhoods N−∞(1) (see (14.17)) and N2(0) (see (14.16)) coincide with the central path C.

6. Show that Φρ defined by (14.47) has the property (14.46a).

7. Prove that the coefficient matrix in (14.11) is nonsingular if and only if A has full row rank.

8. Given (∆x, ∆λ, ∆s) satisfying (14.12), prove (14.22).

9. (NEW) Given an iterate (x^k, λ^k, s^k) with (x^k, s^k) > 0, show that the quantities α_max^pri and α_max^dual defined by (14.33) are the largest values of α such that x^k + α∆x^k ≥ 0 and s^k + α∆s^k ≥ 0, respectively.

10. (NEW) Verify (14.34).

11. Given that X and S are diagonal with positive diagonal elements, show that the coefficient matrix in (14.45a) is symmetric and positive definite if and only if A has full row rank. Does this result continue to hold if we replace D by a diagonal matrix in which exactly m of the diagonal elements are positive and the remainder are zero? (Here m is the number of rows of A.)

12. Given a point (x, λ, s) with (x, s) > 0, consider the trajectory H defined by
\[
F\big(\hat{x}(\tau), \hat{\lambda}(\tau), \hat{s}(\tau)\big) =
\begin{bmatrix}
(1 - \tau)(A^T\lambda + s - c)\\
(1 - \tau)(Ax - b)\\
(1 - \tau)XSe
\end{bmatrix},
\qquad (\hat{x}(\tau), \hat{s}(\tau)) \ge 0,
\]
for τ ∈ [0, 1], and note that (x̂(0), λ̂(0), ŝ(0)) = (x, λ, s), while the limit of (x̂(τ), λ̂(τ), ŝ(τ)) as τ ↑ 1 will lie in the primal-dual solution set of the linear program. Find equations for the first, second, and third derivatives of H with respect to τ at τ = 0. Hence, write down a Taylor series approximation to H near the point (x, λ, s).

13. Consider the following linear program, which contains "free variables" denoted by y:

\[
\min\; c^T x + d^T y, \quad \text{subject to } A_1 x + A_2 y = b,\; x \ge 0.
\]

By introducing Lagrange multipliers λ for the equality constraints and s for the bounds x ≥ 0, write down optimality conditions for this problem in an analogous fashion to (14.3). Following (14.4) and (14.11), use these conditions to derive the general step equations for a primal-dual interior-point method. Express these equations in augmented system form analogously to (14.43) and explain why it is not possible to reduce further to a formulation like (14.45) in which the coefficient matrix is symmetric positive definite.

14. Program Algorithm 14.3 in Matlab. Choose η = 0.99 uniformly in (14.41). Test your code on a linear programming problem (14.1) generated by choosing A randomly, and then setting x, s, b, and c as follows:
\[
x_i = \begin{cases} \text{random positive number}, & i = 1, 2, \dots, m,\\ 0, & i = m+1, m+2, \dots, n, \end{cases}
\]
\[
s_i = \begin{cases} 0, & i = 1, 2, \dots, m,\\ \text{random positive number}, & i = m+1, m+2, \dots, n, \end{cases}
\]
\[
\lambda = \text{random vector}, \qquad c = A^T\lambda + s, \qquad b = Ax.
\]

Choose the starting point (x^0, λ^0, s^0) with the components of x^0 and s^0 set to large positive values.

Bibliography

[1] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan. Linear Matrix Inequalities in Systems and Control Theory. SIAM Publications, Philadelphia, 1994.

[2] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2003.

[3] R. W. Cottle, J.-S. Pang, and R. E. Stone. The Linear Complementarity Problem. Academic Press, San Diego, 1992.

[4] J. Czyzyk, S. Mehrotra, M. Wagner, and S. J. Wright. PCx: An interior- point code for linear programming. Optimization Methods and Software, 11/12:397–430, 1999.

[5] I. I. Dikin. Iterative solution of problems of linear and quadratic programming. Soviet Mathematics Doklady, 8:674–675, 1967.

[6] A. V. Fiacco and G. P. McCormick. Nonlinear Programming: Sequential Unconstrained Minimization Techniques. John Wiley & Sons, New York, NY, 1968. Reprinted by SIAM Publications, 1990.

[7] R. Fletcher. Practical Methods of Optimization. John Wiley & Sons, New York, second edition, 1987.

[8] D. Goldfarb and J. Forrest. Steepest edge simplex algorithms for linear programming. Mathematical Programming, 57:341–374, 1992.

[9] J. Gondzio. Multiple centrality corrections in a primal-dual method for linear programming. Computational Optimization and Applications, 6:137– 156, 1996.

[10] N. Karmarkar. A new polynomial-time algorithm for linear programming. Combinatorica, 4:373–395, 1984.

[11] L. G. Khachiyan. A polynomial algorithm in linear programming. Soviet Mathematics Doklady, 20:191–194, 1979.


[12] L. McLinden. An analogue of Moreau's proximation theorem, with applications to the nonlinear complementarity problem. Pacific Journal of Mathematics, 88:101–161, 1980.

[13] N. Megiddo. Pathways to the optimal set in linear programming. In N. Megiddo, editor, Progress in Mathematical Programming: Interior-Point and Related Methods, chapter 8, pages 131–158. Springer-Verlag, New York, N.Y., 1989.

[14] S. Mehrotra. On the implementation of a primal-dual interior point method. SIAM Journal on Optimization, 2:575–601, 1992.

[15] S. Mizuno, M. Todd, and Y. Ye. On adaptive step primal-dual interior-point algorithms for linear programming. Mathematics of Operations Research, 18:964–981, 1993.

[16] Yu. E. Nesterov and A. S. Nemirovskii. Interior Point Polynomial Methods in Convex Programming. SIAM Publications, Philadelphia, 1994.

[17] K. Tanabe. Centered Newton method for mathematical programming. In System Modeling and Optimization: Proceedings of the 13th IFIP Conference, volume 113 of Lecture Notes in Control and Information Systems, pages 197–206, Berlin, 1988. Springer-Verlag.

[18] M. J. Todd. Potential reduction methods in mathematical programming. Mathematical Programming, Series B, 76:3–45, 1997.

[19] M. J. Todd. Semidefinite optimization. Acta Numerica, 10:515–560, 2001.

[20] M. J. Todd. Detecting infeasibility in infeasible-interior-point methods for optimization. In F. Cucker, R. DeVore, P. Olver, and E. Suli, editors, Foundations of Computational Mathematics, Minneapolis, 2002, pages 157–192. Cambridge University Press, 2004.

[21] M. J. Todd and Y. Ye. A centered projective algorithm for linear programming. Mathematics of Operations Research, 15:508–529, 1990.

[22] L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Review, 38:49–95, 1996.

[23] S. A. Vavasis. Nonlinear Optimization. Oxford University Press, New York and Oxford, 1991.

[24] S. J. Wright. Primal-Dual Interior-Point Methods. SIAM Publications, Philadelphia, PA, 1997.

[25] Y. Zhang. Solving large-scale linear programs with interior-point methods under the Matlab environment. Optimization Methods and Software, 10:1–31, 1998.