Lectures on Integer Programming

L.E. Trotter, Jr.
School of OR&IE, Cornell University, Ithaca, New York 14853
June 16, 2004

Abstract

Lecture notes for the "Doctoral School in Discrete Systems Optimization," E.P.F.-Lausanne, June 2004. This material stresses fundamental aspects of geometry and duality for Integer Programming.

1 Linear Spaces

We generally deal with IR^n = { x = (x_1,...,x_n) : x_j ∈ IR ∀ j }, the n-dimensional euclidean space over the reals IR. Sometimes attention is restricted to Q^n, the rational n-vectors. Familiarity with manipulation of vectors and matrices is assumed, but we review the formal properties of vector spaces, stressing the manner in which the theory of linear equality systems extends naturally to linear inequality systems.

Subspaces

As a vector space, IR^n must obey certain axioms. First, IR^n is an abelian group with respect to (vector) addition: there is a zero element 0 = (0,...,0) ∈ IR^n, each element has an inverse, and vector addition is associative and commutative.

0 + a = a, ∀ a ∈ IR^n
a + (−a) = 0, ∀ a ∈ IR^n
(a + b) + c = a + (b + c), ∀ a, b, c ∈ IR^n
a + b = b + a, ∀ a, b ∈ IR^n

Next, IR^n satisfies IR-module properties governing the action of IR on IR^n via (scalar) multiplication: there is a unit scalar 1 ∈ IR, scalar multiplication is associative, and scalar multiplication distributes over (both vector and scalar) addition.

1a = a, ∀ a ∈ IR^n
λ(μa) = (λμ)a, ∀ a ∈ IR^n; ∀ λ, μ ∈ IR
λ(a + b) = λa + λb, ∀ a, b ∈ IR^n; ∀ λ ∈ IR
(λ + μ)a = λa + μa, ∀ a ∈ IR^n; ∀ λ, μ ∈ IR

Finally, IR^n is closed with respect to vector addition and scalar multiplication.

a + λb ∈ IR^n, ∀ a, b ∈ IR^n; ∀ λ ∈ IR

Exercise 1.1 Show that the closure condition is equivalent to the requirement:

λ_1 a_1 + ··· + λ_m a_m ∈ IR^n, ∀ m ≥ 1; ∀ λ_1,...,λ_m ∈ IR; ∀ a_1,...,a_m ∈ IR^n. □

The expression λ_1 a_1 + ··· + λ_m a_m is a linear combination of the a_i. Any subset of IR^n obeying the above stipulations, i.e., containing 0 and closed under linear combinations, is a subspace. As with a_1,...,a_m in the exercise, we often index the vectors in a set, denoting (a_i)_j, the jth component of a_i, simply by a_ij. Moreover, when the set is finite, say A = {a_1,...,a_m}, A may be identified with the m×n matrix A ∈ IR^{m×n} whose ith row is (a_i1,...,a_in); thus we write a_i ∈ A to mean that a_i is the ith row of matrix A. Whether we choose to view A as a set or as a matrix (an ordered set of vectors) will always be clear from context. When matrix-vector products are involved, it will also be clear from context whether row or column vectors are intended, enabling us (usually) to avoid the use of transpose notation.

When all λ_i = 0, the linear combination λ_1 a_1 + ··· + λ_m a_m is trivial; otherwise, it is nontrivial. Of course, the result of the trivial linear combination is just the zero vector. And when 0 results from a nontrivial linear combination, the a_1,...,a_m are linearly dependent. This terminology extends to sets: S ⊆ IR^n is linearly dependent when some finite subset, consisting of distinct elements of S, satisfies a linear dependence relation λ_1 a_1 + ··· + λ_m a_m = 0. Similarly, a ∉ S is linearly dependent on S provided a and otherwise only (distinct) members of S satisfy a linear dependence relation in which the coefficient of a is nonzero; of course, we may scale so that the coefficient of a is −1 and then, provided S ≠ ∅, rewrite the dependence relation as an equation expressing a as a linear combination of members of S. The linear span of S ⊆ IR^n is L(S) = S ∪ { a ∈ IR^n : a is linearly dependent on S }; when T ⊆ S ⊆ L(T), we say that T spans S. In particular, observe that L(∅) = {0}, as 0 is linearly dependent on any set, even ∅; thus ∅ is not a subspace.

Exercise 1.2 Show that:
(i) S ⊆ T ⇒ L(S) ⊆ L(T);
(ii) S ⊆ L(S);
(iii) L(L(S)) = L(S);
(iv) S = L(S) ⇔ S is a subspace. □

For any S ⊆ IR^n, it follows from the exercise that L(S) is a subspace containing S; we say that L(S) is the subspace generated by S. On the other hand, the intersection of all subspaces containing S, or equivalently, the unique minimal subspace containing S, is called the linear hull of S. Now when T is a subspace and T ⊇ S, (i) and (iv) imply L(S) ⊆ L(T) = T, and consequently L(S) also gives the linear hull of S. Thus we have two ways to think about any subspace: an interior (generator) description provided by the linear span and an exterior (constraint) description in terms of the linear hull. Below we turn these dual geometric descriptions into finite and equivalent representations for any subspace.

Finite Generator Representations

For matrix A, L(A) is finitely generated by the rows of A; in this case L(A) is called the row space of A.

Exercise 1.3 Consider the following elementary row operations on rows a_i, a_k of matrix A:
(i) interchange a_i ↔ a_k;
(ii) replace a_k ← a_k + a_i;
(iii) replace a_i ← λa_i, 0 ≠ λ ∈ IR.
Show that these operations leave the row space of A unchanged. □
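The three operations are easy to realize computationally. The following minimal Python sketch (function names are our own, with exact rational arithmetic via Fraction) implements them on a matrix stored as a list of rows; note that each operation is invertible, which is the heart of the exercise.

    from fractions import Fraction

    def swap_rows(A, i, k):
        # operation (i): interchange rows a_i and a_k
        A[i], A[k] = A[k], A[i]

    def add_row(A, k, i):
        # operation (ii): replace a_k by a_k + a_i
        A[k] = [x + y for x, y in zip(A[k], A[i])]

    def scale_row(A, i, lam):
        # operation (iii): replace a_i by lam * a_i, lam != 0
        assert lam != 0
        A[i] = [lam * x for x in A[i]]

    A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
    scale_row(A, 0, Fraction(1, 2))
    add_row(A, 1, 0)
    print(A)  # rows are now (1, 1/2) and (1, 7/2); the row space is unchanged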

We now show that every subspace has a finite set of generators. Any set of vectors which is not linearly dependent is linearly independent; in particular, the empty set ∅ is linearly independent.

Proposition 1.4 Let a_0 = Σ_{i=1}^m λ_i a_i, with a_1,...,a_m linearly independent. Then:
(i) the λ_i are unique;
(ii) { a_i : 0 ≤ i ≠ k } is linearly independent ⇔ λ_k ≠ 0;
(iii) L({a_1,...,a_m}) = L({ a_i : 0 ≤ i ≠ k }) ⇔ λ_k ≠ 0.

Proof: (i) If a_0 = Σ_{i=1}^m λ_i a_i = Σ_{i=1}^m μ_i a_i, then 0 = a_0 − a_0 = Σ_{i=1}^m (λ_i − μ_i) a_i. Thus linear independence of a_1,...,a_m implies λ_i − μ_i = 0 ∀ i; hence λ_i = μ_i ∀ i.
(ii) λ_k = 0 ⇒ 0 = −a_0 + Σ_{i≠k} λ_i a_i ⇒ { a_i : 0 ≤ i ≠ k } linearly dependent. For the converse, { a_i : 0 ≤ i ≠ k } linearly dependent ⇒ Σ_{0≤i≠k} μ_i a_i = 0, and some μ_i ≠ 0. Moreover, μ_0 ≠ 0, since a_1,...,a_m are linearly independent. Thus a_0 = Σ_{1≤i≠k} (−μ_i/μ_0) a_i = Σ_{i=1}^m λ_i a_i, so part (i) implies λ_k = 0.
(iii) First note that a_0 ∈ L({a_1,...,a_m}), hence L({a_0,...,a_m}) = L({a_1,...,a_m}). Now λ_k ≠ 0 ⇒ a_k = (1/λ_k)a_0 − Σ_{1≤i≠k} (λ_i/λ_k)a_i ⇒ a_k ∈ L({ a_i : 0 ≤ i ≠ k }) = L({a_0,...,a_m}). If λ_k = 0, then a_0 ∈ L({ a_i : 1 ≤ i ≠ k }) = L({ a_i : 0 ≤ i ≠ k }). But linear independence of a_1,...,a_m ⇒ a_k ∉ L({ a_i : 1 ≤ i ≠ k }) ≠ L({a_1,...,a_m}). □

Proposition 1.5 For subspace S, with {a_1,...,a_m} = A ⊆ S and {b_1,...,b_n} = B ⊆ S, if A is linearly independent and B spans S, then m ≤ n.

Proof: Considering the a_i sequentially for 1 ≤ i ≤ m, if a_i ∈ B, we do nothing. If a_i ∉ B, then a_i ∈ L(B), so a_i = Σ_{j=1}^n λ_j b_j with { b_j : λ_j ≠ 0 } linearly independent. For some λ_j ≠ 0, we must have b_j ∉ {a_1,...,a_m}, and we replace b_j ← a_i in B. Proposition 1.4 implies that this replacement does not change L(B) = S. After m steps, {a_1,...,a_m} ⊆ {b_1,...,b_n} and it follows that m ≤ n. □

Thus the maximum number of linearly independent elements of a subspace is no larger than the minimum number of its elements needed to span. Since the unit vectors e_1,...,e_n (i.e., e_ij = 1 for i = j and e_ij = 0 for i ≠ j) span IR^n, any linearly independent set of vectors in IR^n, say {a_1,...,a_m}, must satisfy m ≤ n.

Exercise 1.6 For A ∈ IR^{m×n} with m < n, show that {Ax = 0} has a solution x ≠ 0. □

A basis for a subspace S is a maximal linearly independent subset of S, i.e., a linearly independent set that is properly contained in no linearly independent subset of S. If B is a basis for S, then L(B) ⊆ L(S) = S by (1.2), and maximality of B with respect to linear independence implies that every other element of S is linearly dependent on B, hence S ⊆ L(B). Thus L(B) = S; i.e., B spans S. Since a basis is both independent and spanning, we are able to strengthen the max-min relation observed following Proposition 1.5 above: in any subspace, the maximum size of an independent set equals the minimum size of a spanning set. Moreover, if B_1 and B_2 are bases, then |B_1| ≤ |B_2|, since B_1 is independent and B_2 is spanning; symmetrically, |B_2| ≤ |B_1|. It therefore follows that all bases of a subspace are the same size, i.e., are equicardinal. We summarize these observations in the following fundamental theorem of linear algebra, the Finite Basis Theorem, stating that any subspace can be represented as the set of all linear combinations of the finite set of generators given by any of its bases, i.e., is finitely generated by the elements of any basis.

Theorem 1.7 (Finite Basis Theorem) Every subspace S ⊆ IR^n has a finite basis. Furthermore, all bases for S have the same cardinality, say m, and m ≤ n. □

The dimension of subspace S ⊆ IR^n is defined by dim(S) = |B|, where B is a basis for S; the theorem shows that dimension is well-defined, since it makes no difference which basis of S is used to determine its dimension. For S = {0}, B = ∅ is a basis and dim(S) = 0.

Exercise 1.8 For any two subspaces S, T ⊆ IR^n show that:
(i) dim(·) is subclusive: S ⊆ T ⇒ dim(S) ≤ dim(T), with equality if and only if S = T;
(ii) dim(·) is modular: dim(L(S ∪ T)) + dim(S ∩ T) = dim(S) + dim(T). □

Finite Constraint Representations

Thus far our focus has been on generator representation of subspaces; i.e., given S ⊆ IR^n, we have considered the subspace L(S) generated by S. Now we consider a different subspace arising from S. We use the notation Sx = 0 to mean that x is orthogonal to every element of S; i.e., for every s ∈ S the inner product s·x = s_1 x_1 + ··· + s_n x_n is 0. Then the subspace S^o = { x : Sx = 0 } is the dual of S, and when S itself is a subspace, S^o is called the orthogonal complement of S. Since S^o is defined by linear constraints, s_1 x_1 + ··· + s_n x_n = 0 ∀ s ∈ S, it will be called a constrained subspace. A subspace is finitely constrained provided it can be expressed in the form { x : Ax = 0 }, for some matrix A.

Exercise 1.9 For S ⊆ IR^n show:
(i) S^o is a subspace;
(ii) S ∩ S^o ⊆ {0};
(iii) if {a_1,...,a_m} ⊆ S and {a_{m+1},...,a_p} ⊆ S^o are linearly independent sets, then {a_1,...,a_p} is also a linearly independent set; hence p ≤ n;
(iv) S^o = (L(S))^o. □

Note that for any S ⊆ IR^n, the Finite Basis Theorem assures L(S) = L(A), where A is a subset of S of finite cardinality, i.e., where A is a matrix. It then follows from (iv) that S^o = (L(S))^o = (L(A))^o = A^o = { x : Ax = 0 }, the null space of matrix A. Thus, regardless of |S|, only finitely many of the constraints Sx = 0 are needed in order to define S^o, implying that the subspace S^o is finitely constrained. The following result summarizes further properties of the duality relation.

Proposition 1.10 Let S, T ⊆ IR^n. Then:
(i) S ⊆ T ⇒ S^o ⊇ T^o;
(ii) S ⊆ S^oo;
(iii) S^o = S^ooo;
(iv) S = S^oo ⇔ S is a constrained subspace;
(v) A ∈ IR^{m×n}, S = { yA : y ∈ IR^m } ⇒ S^o = { x ∈ IR^n : Ax = 0 }.

Proof: (i) x ∈ T^o ⇒ Tx = 0 ⇒ Sx = 0 ⇒ x ∈ S^o.
(ii) x ∈ S ⇒ xy = 0, ∀ y ∈ S^o ⇒ x ∈ S^oo.
(iii) Applying (i) to (ii) gives S^o ⊇ S^ooo; applying (ii) directly to S^o gives S^o ⊆ S^ooo.
(iv) (⇒) This is clear, since S = S^oo = (S^o)^o, which is constrained by definition. (⇐) S constrained ⇒ S = T^o for some T, so by (iii), S = T^o = T^ooo = S^oo.
(v) S = L(A) ⇒ S^o = (L(A))^o = A^o, using part (iv) of (1.9). □

When S = S^oo as in part (iv) of the theorem, S is o-closed; thus the o-closed sets are precisely the constrained subspaces. It is natural to ask which subspaces (i.e., among all subspaces) are o-closed, or, in view of part (iv) of (1.9), which subspaces can be represented as the solution set for a finite, homogeneous system of linear equalities. We answer this question using the following form of Gaussian elimination. Note that the elimination procedure is described for inhomogeneous linear systems, i.e., systems of the form {Ax = b}.

Proposition 1.11 (Gaussian Elimination) Suppose A ∈ IR^{m×n}, b ∈ IR^m and denote the linear equality system {Ax = b} as (I). "Eliminate" x_n to obtain system (II) as follows:
(i) a_in = 0 ⇒ a_i1 x_1 + ··· + a_{i,n−1} x_{n−1} = b_i is in (II);
(ii) if a_in = 0 ∀ i, skip this step; otherwise, select a_kn ≠ 0 and for i ≠ k, a_in ≠ 0 do:
(a_i1/a_in − a_k1/a_kn)x_1 + ··· + (a_{i,n−1}/a_in − a_{k,n−1}/a_kn)x_{n−1} = b_i/a_in − b_k/a_kn is in (II).
Then (I) is consistent if and only if (II) is consistent.

Proof: (⇒) Clearly, x_1,...,x_n satisfy (I) ⇒ x_1,...,x_{n−1} satisfy (II).
(⇐) Given x_1,...,x_{n−1} which satisfy (II), when a_in = 0 ∀ i, we fix x_n arbitrarily; otherwise, for index k selected in step (ii), define x_n = b_k/a_kn − (a_k1/a_kn)x_1 − ··· − (a_{k,n−1}/a_kn)x_{n−1}. Then one easily checks that x_1,...,x_{n−1}, x_n satisfy (I). □
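A minimal sketch of one elimination step in Python (names are our own; exact rationals), following steps (i) and (ii) of Proposition 1.11 literally; repeated application eliminates one variable per call.

    from fractions import Fraction

    def eliminate_last_variable(A, b):
        """One step of Proposition 1.11: eliminate x_n from {Ax = b}.

        A is a list of rows of Fractions, b a list of Fractions.
        Returns (A2, b2) describing system (II) in x_1, ..., x_{n-1}.
        """
        n = len(A[0])
        A2, b2 = [], []
        rows_with_xn = [i for i in range(len(A)) if A[i][n - 1] != 0]
        # step (i): rows with a_in = 0 carry over with x_n dropped
        for i in range(len(A)):
            if A[i][n - 1] == 0:
                A2.append(A[i][:n - 1])
                b2.append(b[i])
        if rows_with_xn:
            k = rows_with_xn[0]  # step (ii): fix one row with a_kn != 0
            for i in rows_with_xn:
                if i == k:
                    continue
                A2.append([A[i][j] / A[i][n - 1] - A[k][j] / A[k][n - 1]
                           for j in range(n - 1)])
                b2.append(b[i] / A[i][n - 1] - b[k] / A[k][n - 1])
        return A2, b2

    A = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(-1)]]
    b = [Fraction(3), Fraction(1)]
    print(eliminate_last_variable(A, b))
    # system (II) reads -2*x1 = -4, so x1 = 2 and x2 = 1 by back-substitution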

Theorem 1.12 Every subspace is finitely constrained.

Proof: Let S be a subspace of IR^n with a basis given by the rows of A ∈ IR^{m×n}. Then:

S = { yA : y ∈ IR^m }
  = { x : {x − yA = 0} is consistent }
  = { x : Bx = 0 }, for some B ∈ IR^{p×n}.

The final step eliminates y_1,...,y_m from the system {x − yA = 0}, thereby producing the linear system {Bx = 0}. □

Note, in particular, that the proof remains valid for S = {0}; then we have m = 0, so A is vacuous and B is simply the n×n identity matrix. What about when S = IR^n? There are several important consequences of Theorem 1.12. First, from part (iv) of (1.10) and (1.12) we obtain a characterization of o-closed subsets of IR^n. Furthermore, using (1.12) we obtain a dual statement for part (v) of (1.10), i.e., that for a given matrix A, the two subspaces obtained by interpreting the rows of A as generators and constraints, respectively, are dual. (Generators correspond to constraints, and conversely, under duality.)

Corollary 1.13 For S ⊆ IR^n, S = S^oo if and only if S is a subspace. □

Corollary 1.14 Let S = { yA : y ∈ IR^m } and T = { x ∈ IR^n : Ax = 0 }, where A ∈ IR^{m×n}. Then S^o = T and T^o = S.

Proof: S^o = T is (v) of (1.10). Thus S^oo = T^o and by (1.13), S^oo = S; hence S = T^o. □

Finally, we use Theorem 1.12 to derive a theorem of the alternative characterizing the existence of a solution for a linear equality system.

Theorem 1.15 (Fredholm 1903) For A ∈ IR^{m×n} and c ∈ IR^n, exactly one holds:
(i) ∃ y ∈ IR^m such that yA = c;
(ii) ∃ x ∈ IR^n such that Ax = 0, cx ≠ 0.

Proof: Let S = L(A). Then ¬(i) ⇔ c ∉ S ⇔ c ∉ S^oo ⇔ cx ≠ 0, ∃ x ∈ S^o ⇔ (ii). □

Thus system {yA = c} has a solution if and only if Ax = 0 ⇒ cx = 0, i.e., precisely when the components of c obey all linear dependence relations satisfied by the columns of A. We now have two equivalent (dual) ways to think about subspaces: as sets determined from linear combinations of finitely many generators (Theorem 1.7) or as solution sets for finite homogeneous linear equality systems (Theorem 1.12). And for a given subspace, we pass from a generator representation to a constraint representation, and conversely, using Gaussian elimination as in (1.12). Generators provide a simple existence criterion for membership, c ∈ S ⇔ ∃ y ∈ IR^m with yA = c, and constraints, for non-membership, c ∉ S ⇔ ∃ b ∈ B with cb ≠ 0. Recall b ∈ B means b is a row of B; here matrix B from the proof of (1.12) is being used to provide the x in part (ii) of Theorem 1.15. Since each row of A is in S, we have Ab = 0 for each row b ∈ B. Note that the constraint description S = { x : Bx = 0 } actually gives us more information than part (ii) of (1.15) when it comes to establishing non-membership c ∉ { yA : y ∈ IR^m } – namely, the vector x in part (ii) of (1.15) can be selected from a finite list (the rows of matrix B), and this list does not depend on c. In this sense, (1.12) is a sharper result than (1.15).

The present development emphasizes that for any subspace S ⊆ IR^n, there exist matrices A ∈ IR^{m×n}, B ∈ IR^{p×n} so that S = { yA : y ∈ IR^m } = { x : Bx = 0 }. Note that Ax = 0 and Bx = 0 implies x = 0, since x ∈ S (as Bx = 0) and x ∈ S^o (as Ax = 0), and by Exercise 1.9(ii), x ∈ S ∩ S^o implies x = 0. Thus ∀ c ∈ IR^n, Ax = 0 and Bx = 0 implies cx = 0, and it therefore follows from Theorem 1.15 that for any c ∈ IR^n, there must exist y ∈ IR^m, z ∈ IR^p so that c = yA + zB; i.e., the generators for S and S^o span IR^n. The rank of A is defined by rank(A) = dim(L(A)) and the nullity of A by nullity(A) = dim(A^o). Thus we have the following well-known result from matrix algebra.

Corollary 1.16 For A ∈ IR^{m×n}, rank(A) + nullity(A) = n. □
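The dichotomy of Theorem 1.15 can also be made algorithmic. The following Python sketch (our own construction, not taken from the notes) eliminates on the columns of A while tracking the row operations performed; when {yA = c} is inconsistent, the tracked combination yields a certificate x for alternative (ii).

    from fractions import Fraction

    def fredholm_certificate(A, c):
        """If {yA = c} is inconsistent, return x with Ax = 0, cx != 0; else None.

        Tableau row j encodes the equation sum_i y_i A[i][j] = c[j]; T tracks
        the row operations, so each tableau row is the combination (given by
        the corresponding row of T) of the original equations.
        """
        m, n = len(A), len(A[0])
        rows = [[Fraction(A[i][j]) for i in range(m)] + [Fraction(c[j])]
                for j in range(n)]
        T = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
        r = 0  # next pivot row
        for col in range(m):
            piv = next((i for i in range(r, n) if rows[i][col] != 0), None)
            if piv is None:
                continue
            rows[r], rows[piv] = rows[piv], rows[r]
            T[r], T[piv] = T[piv], T[r]
            for i in range(n):
                if i != r and rows[i][col] != 0:
                    f = rows[i][col] / rows[r][col]
                    rows[i] = [u - f * v for u, v in zip(rows[i], rows[r])]
                    T[i] = [u - f * v for u, v in zip(T[i], T[r])]
            r += 1
        for i in range(n):
            if all(v == 0 for v in rows[i][:m]) and rows[i][m] != 0:
                return T[i]  # x = T[i] satisfies Ax = 0 and cx != 0
        return None

    print(fredholm_certificate([[1, 1]], [1, 2]))  # x = (-1, 1): Ax = 0, cx = 1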

Orthogonal Projection

Continuing with the above development, let {a_1,...,a_m} be a basis for S and {a_{m+1},...,a_n} be a basis for S^o. Then for any x ∈ IR^n we may write

x = λ_1 a_1 + ··· + λ_m a_m + λ_{m+1} a_{m+1} + ··· + λ_n a_n,

and (1.4) implies the scalars λ_i are unique. The vector x′ = (λ_1 a_1 + ··· + λ_m a_m) is called the (orthogonal) projection of x into the subspace S; x″ = (λ_{m+1} a_{m+1} + ··· + λ_n a_n) is the orthogonal projection of x into S^o. The terminology orthogonal here is due to the fact that x′ − x and x′ are orthogonal, as are x″ − x and x″. Note that x′ and x″ are unique, for if x = x′ + x″ = x̄ + x̃ with x′, x̄ ∈ S and x″, x̃ ∈ S^o, then S ∋ x′ − x̄ = x̃ − x″ ∈ S^o, so we must have x′ = x̄ and x̃ = x″.

Theorem 1.17 Let S be a subspace of IR^n. Then each point in IR^n is the unique sum of its orthogonal projections into S and S^o. □

Projection has already played a central role in our development of subspace duality. Namely, Gaussian elimination is actually projection of the solutions for the system {Ax = b} from IR^n into IR^{n−1} by elimination of the nth coordinate – more precisely, projection from IR^n into the subspace IR^{n−1} × {0}, after which the nth coordinate is simply dropped.

Exercise 1.18 Consider the subspace S = { x : x_n = 0 } ⊂ IR^n.
(i) What is the projection of a ∈ IR^n into S? Justify your answer using (1.17).
(ii) What is the projection of an arbitrary set T ⊆ IR^n into S?
(iii) For T = { x : Ax = b }, with A ∈ IR^{m×n}, b ∈ IR^m, show that Gaussian elimination of x_n defines the projection of T into S. □

In order to adhere to standard notation, our focus for the remainder of this section shifts to subspaces generated by the columns of A. A^t denotes the transpose of matrix A.

Exercise 1.19 Suppose A ∈ IR^{m×n} has linearly independent columns a_1,...,a_n.
(i) Show that A^t A is invertible and M = (I_m − A(A^t A)^{−1} A^t) is symmetric.
(ii) For x ∈ IR^m, show x = x′ + x″, where x′ ∈ S = L({a_1,...,a_n}) and x″ = Mx ∈ S^o; i.e., A(A^t A)^{−1} A^t projects x into S and M projects x into S^o. □

Projection is a fundamental tool in optimization, indeed, throughout all of applied mathematics. We summarize here further important properties and applications of projection. Recall that the length of x ∈ IR^n (i.e., the Euclidean norm or 2-norm of x) is defined as ||x|| = (x·x)^{1/2}; this definition derives, in fact, from inductive application of part (i) of Exercise 1.18 along with the Pythagorean Theorem of elementary geometry. Similarly, the distance between two points, say x, z ∈ IR^n, is given by ||z − x||. For S a subspace of IR^n and x′ the (unique) projection of x ∈ IR^n into S, consider the distance ||x′ − x||. Even the one-dimensional case is important, viz., the case of a line S = { λa : λ ∈ IR }, for 0 ≠ a ∈ IR^n. From Exercise 1.19 we have x′ = (a·x/a·a)a; consequently, x′ − x must satisfy 0 ≤ ||x − (a·x/a·a)a||² = x·x − 2(a·x/a·a)(a·x) + (a·x/a·a)²(a·a) = ||x||² − (a·x)²/||a||². This implies the fundamental

Cauchy-Schwarz inequality: |a·x| ≤ ||a|| ||x||, ∀ a, x ∈ IR^n.

Thus ||a + x||² = a·a + 2a·x + x·x ≤ ||a||² + 2||a|| ||x|| + ||x||² = (||a|| + ||x||)², yielding the

triangle inequality: ||a + x|| ≤ ||a|| + ||x||, ∀ a, x ∈ IR^n.

Moreover, the points 0, x, x′ define a right (by orthogonality) triangle which determines the angle θ between a, x ∈ IR^n: cos θ = (a·x)/(||a|| ||x||). When x′ ≠ 0 lies on the ray { λa : λ ≥ 0 }, then a·x > 0 and angle θ is acute, while 0 ≠ x′ ∈ { λa : λ ≤ 0 } implies a·x < 0 and θ is obtuse; a·x = 0 precisely for x ∈ S^o, and then cos θ = 0, reflecting the orthogonality of a and x.

Returning now to the general case, i.e., for S an arbitrary subspace, note that projection of x to x′ ∈ S achieves the minimum distance from x to S. This is straightforward, for if z ∈ S, then ||z − x||² = ||z − x′ + x′ − x||² = ||z − x′||² + 2(z − x′)·(x′ − x) + ||x′ − x||², so that x′, z ∈ S and x′ − x ∈ S^o implies ||z − x||² = ||z − x′||² + ||x′ − x||² ≥ ||x′ − x||². For an important application, consider A ∈ IR^{m×n} and S = { Ax : x ∈ IR^n }. When b ∉ S the system {Ax = b} has no solution. The previous discussion shows, however, that in this case the projection b′ satisfies ||b′ − b||² ≤ ||Ax − b||² ∀ x ∈ IR^n, i.e., ∀ Ax ∈ S. Thus b′ ∈ S provides a least squares approximation for the inconsistent linear system {Ax = b}. Moreover, Exercise 1.19 shows how to determine the linear transformations which project IR^m into either S or S^o.

The matrix A^t A of Exercise 1.19 has many applications. It determines, for example, the volume of certain geometric figures. Specifically, for A ∈ IR^{m×n} with linearly independent columns a_1,...,a_n, when n = 1, the single column a_1 defines [0, a_1] = { a_1 x_1 : 0 ≤ x_1 ≤ 1 }, a line segment of length ||a_1|| = √(det(A^t A)). When n = 2, a_1 and a_2 define a parallelogram [0, a_1] + [0, a_2] = { a_1 x_1 + a_2 x_2 : 0 ≤ x_j ≤ 1, j = 1, 2 }, whose area is √(det(A^t A)). And in general, the columns of A generate a parallelotope P = { Σ_{j=1}^n a_j x_j : 0 ≤ x_j ≤ 1 ∀ j } of volume vol(P) = √(det(A^t A)). To establish this, we require certain intuitive properties of volume, namely, that volume is not changed by rigid displacement (any transformation, such as rotation, reflection, or translation, which preserves lengths of and angles between the generators), that volume is additive over disjoint sets, and that the unit hypercube (a_j = e_j ∀ j) has unit volume. (With these properties, volume is identical to Lebesgue measure.)

First, consider the effect on vol(P) resulting from application of the elementary operations of Exercise 1.3 to the a_j. Any interchange a_j ↔ a_k simply re-orders the generators, hence does not change vol(P). The replacement a_k ← −a_k also has no effect on vol(P), as it just translates any point Σ_j a_j x_j of P by −a_k to the point Σ_j a_j x_j − a_k = Σ_{j≠k} a_j x_j + (−a_k)(1 − x_k) in the parallelotope generated by a_j ∀ j ≠ k and −a_k. For a_k ← a_k + a_l, there is again no change in vol(P), as the effect here is to translate by a_l those points of P for which x_l ≤ x_k (by (1.4), the x_j are uniquely determined):

Σ_j a_j x_j ↔ Σ_j a_j x_j = Σ_{j≠k,l} a_j x_j + a_l(x_l − x_k) + (a_k + a_l)x_k, for x_l ∈ [x_k, 1];
Σ_j a_j x_j ↔ Σ_j a_j x_j + a_l = Σ_{j≠k,l} a_j x_j + a_l(1 − x_k + x_l) + (a_k + a_l)x_k, for x_l ∈ [0, x_k].

Note that some points which are not translated, say Σ_j a_j x_j, are duplicated by points under translation, Σ_j a_j x̄_j + a_l, but for such points, multiplier uniqueness stipulates x_l = x̄_l + 1, so that x_l = 1. Thus any duplication occurs on the boundary of P, and this has no effect on the volume (details?). The remaining operation is scaling, a_k ← αa_k, and since the case α = −1 has been treated above, we now restrict attention to α > 0. Note that when α is integral, the effect of the scaling operation is that vol(P) is also multiplied by α.
This follows from the fact that Σ_{j≠k} a_j x_j + (αa_k)x_k consists of α translates of Σ_j a_j x_j given by Σ_j a_j x_j + pa_k, for p = 0, 1,...,α − 1. (Again, duplication at the boundaries of these regions does not affect the volume.) It follows easily that for α rational, a_k ← αa_k scales vol(P) by α; the general result for α ≥ 0 then follows by rational approximation of α.

Mutually orthogonal generators, i.e., a_j·a_k = 0 ∀ j ≠ k, provide an orthogonal basis for L({a_1,...,a_n}); in this case A^t A is a diagonal matrix with (A^t A)_jj = ||a_j||². Thus P is rectangular, hence it is clear, at least intuitively, that vol(P) = ||a_1|| ··· ||a_n|| = √(det(A^t A)). Mutual orthogonality also simplifies the projection operation, as shown in the following exercise. The classical Gram-Schmidt orthogonalization establishes that, in fact, any subspace has an orthogonal basis.

Exercise 1.20 Suppose a_1,...,a_n ∈ IR^m are mutually orthogonal and let x ∈ IR^m. Show that the orthogonal projection of x into L({a_1,...,a_n}) is given by Σ_{j=1}^n (a_j·x/a_j·a_j) a_j. □

Theorem 1.21 (Gram-Schmidt Orthogonalization 1907) For A ∈ IR^{m×n} with linearly independent columns a_1,...,a_n, define
b_1 = a_1;
project a_2 into {b_1}^o to obtain b_2 = a_2 − (b_1·a_2/b_1·b_1)b_1;
project a_3 into {b_1, b_2}^o to obtain b_3 = a_3 − (b_1·a_3/b_1·b_1)b_1 − (b_2·a_3/b_2·b_2)b_2;
and so on, yielding matrix B with columns b_1,...,b_n after n steps.
Then the columns of B provide an orthogonal basis for L({a_1,...,a_n}). □

Exercise 1.22 Prove the preceding theorem. □
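A minimal Python rendering of the procedure of Theorem 1.21 (names are our own; exact rational arithmetic keeps the orthogonality exact):

    from fractions import Fraction

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def gram_schmidt(a):
        """Orthogonalize linearly independent vectors a[0], ..., a[n-1].

        Each b[i] is a[i] minus its projection into L({b[0],...,b[i-1]});
        the b[i] are mutually orthogonal and span the same subspace.
        """
        b = []
        for ai in a:
            bi = list(ai)
            for bj in b:
                mu = dot(bj, ai) / dot(bj, bj)  # projection coefficient
                bi = [x - mu * y for x, y in zip(bi, bj)]
            b.append(bi)
        return b

    a = [[Fraction(1), Fraction(1), Fraction(0)],
         [Fraction(1), Fraction(0), Fraction(1)]]
    b = gram_schmidt(a)
    print(b)                # (1, 1, 0) and (1/2, -1/2, 1)
    print(dot(b[0], b[1]))  # 0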

In matrix terms, the orthogonalization process can be summarized as a sequence of column operations on A effected by matrices of unit determinant. These operations leave det(A^t A) unaltered; the above discussion shows that they leave vol(P) unaltered, as well. We thus lose no generality in assuming that the columns of A are mutually orthogonal. To confirm that vol(P) = ||a_1|| ··· ||a_n|| = √(det(A^t A)), we normalize, i.e., divide each a_j by ||a_j||; this divides both √(det(A^t A)) and vol(P) by ||a_1|| ··· ||a_n||. The columns of A, now mutually orthogonal and each of unit length, constitute an orthonormal basis for L({a_1,...,a_n}). To complete the validation, it suffices to show that the parallelotope at hand has unit volume, the same as the unit hypercube. For this, let B be the (orthogonal) m×m matrix whose first n rows are the a_j and whose remaining rows are an orthonormal basis for L({a_1,...,a_n})^o. As the columns of BA are e_j ∈ IR^m, 1 ≤ j ≤ n, this transformation takes P to the unit hypercube. Moreover, orthogonality of B implies B^t B = I_m, so that x^t y = x^t B^t By = (Bx)^t(By). I.e., inner products, hence lengths and angles, are preserved and vol(P) remains unchanged.

Theorem 1.23 For A ∈ IR^{m×n} with linearly independent columns a_1,...,a_n, the volume of the parallelotope { Σ_{j=1}^n a_j x_j : 0 ≤ x_j ≤ 1 ∀ j } is √(det(A^t A)). □

Due to orthogonality of the projections in the Gram-Schmidt process, the lengths of the basis vectors do not increase (e.g., ||a_2||² = ||b_2 + (b_1·a_2/b_1·b_1)b_1||² ≥ ||b_2||², hence ||a_2|| ≥ ||b_2||). Thus, for any matrix A ∈ IR^{m×n} with linearly independent columns a_1,...,a_n, we have the

Hadamard inequality: √(det(A^t A)) ≤ ||a_1|| ··· ||a_n||.

Note that when A is square and invertible, the volume of the parallelotope generated by the rows of A is given by √(det(A^t A)) = |det(A)| ≤ ||a_1|| ··· ||a_n||; equality holds here if and only if A has mutually orthogonal columns. The following additional property of A^t A is evident from the present discussion.

Corollary 1.24 For A ∈ IR^{m×n} of rank n, A^t A is symmetric and det(A^t A) > 0. □
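Theorem 1.23 is straightforward to evaluate: form the Gram matrix A^t A from the columns and take the square root of its determinant. A Python sketch (names are our own), exact up to the final square root:

    from fractions import Fraction
    from math import sqrt

    def determinant(M):
        # Gaussian elimination over the rationals; det = signed pivot product
        M = [row[:] for row in M]
        n = len(M)
        d = Fraction(1)
        for c in range(n):
            p = next((r for r in range(c, n) if M[r][c] != 0), None)
            if p is None:
                return Fraction(0)
            if p != c:
                M[c], M[p] = M[p], M[c]
                d = -d
            d *= M[c][c]
            for r in range(c + 1, n):
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
        return d

    def parallelotope_volume(cols):
        # vol(P) = sqrt(det(A^t A)) per Theorem 1.23; cols are the a_j
        gram = [[sum(x * y for x, y in zip(u, v)) for v in cols] for u in cols]
        return sqrt(determinant(gram))

    # the unit square embedded in IR^3 has (two-dimensional) volume 1
    a1 = [Fraction(1), Fraction(0), Fraction(0)]
    a2 = [Fraction(0), Fraction(1), Fraction(0)]
    print(parallelotope_volume([a1, a2]))  # 1.0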

Affine Spaces

An affine combination is a linear combination λ_1 a_1 + ··· + λ_m a_m for which Σ_{i=1}^m λ_i = 1. Removing the special role played by 0, we define an affine space to be any subset of IR^n which is closed under affine combinations. Note, in contrast to subspaces, that ∅ is an affine space. Any set S is affinely dependent if one of its elements can be expressed as an affine combination of other elements of S, i.e., when for some a_0 ∈ S we can write a_0 = λ_1 a_1 + ··· + λ_m a_m, where a_0 ≠ a_i ∈ S ∀ i and Σ_{i=1}^m λ_i = 1. Similarly, if a_0 ∉ S satisfies such an affine dependence relation, we say that a_0 is affinely dependent on S. Since a vacuous set of coefficients cannot sum to unity, we now have that 0 is not affinely dependent on ∅. The affine span of S is defined as A(S) = S ∪ { x ∈ IR^n : x is affinely dependent on S }. It is not difficult to verify that A(·) also has the properties itemized in Exercise 1.2 for linear span L(·). Thus affine spaces are just those subsets of IR^n which satisfy S = A(S). Now, S ⊆ A(S) ⊆ L(S), so S = L(S) ⇒ S = A(S). Thus any subspace is an affine space and it is easily verified that an affine space is a subspace if and only if it contains 0. The following shows that any nonempty affine space is simply a translation of a subspace.

Exercise 1.25 Suppose S ⊆ IR^n is an affine space and a ∈ S ≠ ∅. Show that:
(i) S = T + {a}, i.e., S = { x + a : x ∈ T }, for some subspace T;
(ii) moreover, T is uniquely determined by S, i.e., independent of the choice of a ∈ S. □

On the other hand, for any subspace T ⊆ IR^n and any vector a ∈ IR^n, it is evident that translation of T by a, i.e., T + {a}, is an affine space. We thus have the following result.

Theorem 1.26 For S ⊆ IR^n we have: S is a nonempty affine space if and only if S is a translation of a (unique) subspace. □

The solution set for any finite linear equality system is an affine space, e.g., S = { x : Ax = b }, where A ∈ IR^{m×n}, b ∈ IR^m. Using (1.26) and (1.12), we now show that every affine space can be represented in this way.

Theorem 1.27 Suppose S ⊆ IR^n. Then S is an affine space ⇔ S = { x : Ax = b }, for some A ∈ IR^{m×n}, b ∈ IR^m.

Proof: The sufficiency is clear. For the necessity, when S = ∅, we take S = { x : 0x = 1 }. When S ≠ ∅, (1.26) implies S = T + {a}, where T is a subspace and a ∈ S. Thus by (1.12), S = { z + a : Az = 0 } = { x : A(x − a) = 0 }, for some A ∈ IR^{m×n}. Defining b = Aa yields the desired result. □

Exercise 1.28 Let A ∈ IR^{m×m} and b ∈ IR^m. Show that {Ax = b} has a unique solution ⇔ {Ax = 0} has no nonzero solution. □

Note that we can interpret results (1.26) and (1.27) as affine analogues for earlier results on subspaces. Theorem 1.26 states that any nonempty affine space is of the form T + {a} for some subspace T and a ∈ IR^n. Thus if {a_1,...,a_k} is a basis for T, then T + {a} is (finitely) generated by taking affine combinations of a_1 + a, a_2 + a, ..., a_k + a, a. Similarly, Theorem 1.27 states that any affine space is (finitely) constrained, cf. Theorem 1.12. The dimension of an affine space S ≠ ∅ is defined with reference to Theorem 1.26 by setting dim(S) = dim(T), where S is a translation of the subspace T.

Exercise 1.29 Let S = { x : Ax = b } ≠ ∅, for A ∈ IR^{m×n} and b ∈ IR^m. Show that the dimension of the affine space S is n − rank(A). □

A hyperplane in IR^n is an affine space of dimension n − 1. Thus by (1.27) and (1.29), H ⊆ IR^n is a hyperplane if and only if H = { x : ax = β }, for some a ∈ IR^n \ {0}, β ∈ IR. By (1.27), every affine space except IR^n itself can be expressed as the intersection of finitely many hyperplanes; this includes the affine space ∅. (Why?)

A set of vectors which are not affinely dependent are affinely independent. The following exercise relates linear and affine independence.

Exercise 1.30 For a_1,...,a_m ∈ IR^n, show that the following are equivalent:
(i) a_1,...,a_m are affinely independent;
(ii) Σ_{i=1}^m λ_i a_i = 0 and Σ_{i=1}^m λ_i = 0 implies λ_i = 0 ∀ i;
(iii) for each k, 1 ≤ k ≤ m, the set { a_i − a_k : i ≠ k } is linearly independent. □

By stipulating dim(S) = dim(A(S)), the notion of dimension is extended to any nonempty set S ⊆ IR^n. The affine rank of S ⊆ IR^n is defined as the size of a largest affinely independent subset of S. Thus the affine rank of ∅ is 0 and the affine rank of any nonempty set is related to its dimension by the following result.

Theorem 1.31 For ∅ ≠ S ⊆ IR^n, the affine rank of S exceeds dim(S) by one.

Proof: Suppose the affine rank of S is k + 1 and a_0, a_1,...,a_k ∈ S are affinely independent. For A(S) = T + {a_0}, where T is a subspace, we have dim(S) = dim(A(S)) = dim(T). By Exercise 1.30, { a_1 − a_0, ..., a_k − a_0 } are k linearly independent elements of T. Furthermore, a ∈ T ⇒ a + a_0 ∈ A(S) ⇒ ∃ λ_i so that Σ_{i=0}^k λ_i = 1 and a + a_0 = Σ_{i=0}^k λ_i a_i. Thus a = Σ_{i=0}^k λ_i(a_i − a_0) = Σ_{i=1}^k λ_i(a_i − a_0); hence { a_i − a_0 : 1 ≤ i ≤ k } is a basis for T. Therefore dim(T) = dim(S) = k. □
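Condition (iii) of Exercise 1.30 yields a practical affine-independence test: check linear independence of the differences a_i − a_1. A Python sketch under that reduction (names are our own):

    from fractions import Fraction

    def rank(rows):
        # rank via Gaussian elimination over the rationals
        rows = [row[:] for row in rows]
        r = 0
        for c in range(len(rows[0]) if rows else 0):
            p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
            if p is None:
                continue
            rows[r], rows[p] = rows[p], rows[r]
            for i in range(len(rows)):
                if i != r and rows[i][c] != 0:
                    f = rows[i][c] / rows[r][c]
                    rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
            r += 1
        return r

    def affinely_independent(points):
        # Exercise 1.30(iii) with k = 1: test the differences a_i - a_1
        if len(points) <= 1:
            return True
        a1 = points[0]
        diffs = [[x - y for x, y in zip(p, a1)] for p in points[1:]]
        return rank(diffs) == len(diffs)

    # three non-collinear points in the plane are affinely independent
    pts = [[Fraction(0), Fraction(0)], [Fraction(1), Fraction(0)],
           [Fraction(0), Fraction(1)]]
    print(affinely_independent(pts))  # True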

Of course, the concept of volume also depends on dimension. In IR³, for example, a line has positive (one-dimensional) length and a parallelogram, positive (two-dimensional) area, yet neither has positive (three-dimensional) volume. We use simply the term volume for all k ≥ 3, dimensionality being clear from context. Roughly speaking, vol(S) > 0 for k-dimensional volume only when dim(S) = k, for S ⊆ IR^n, n ≥ k. Recalling, in particular, our earlier discussion, duplication of boundary points by translations associated with the operations a_k ← a_k + a_l and a_k ← αa_k on generators of the parallelotope P is negligible in vol(P) precisely because it occurs within a region of dimension less than dim(P). Using (1.23), it is straightforward to compute vol(P) after a bijective affine transformation.

Exercise 1.32 How does the transformation x ↦ Bx + b, for B ∈ IR^{m×m} invertible, b ∈ IR^m, affect the volume of a parallelotope in IR^m? □

Algorithmic Considerations

For a_1,...,a_m ∈ IR^n, consider the subspace S = L({a_1,...,a_m}). One can determine dim(S) by finding a basis for S among the a_i, as prescribed in the following exercise.

Exercise 1.33 (Greedy Algorithm) Suppose a_1,...,a_m ∈ IR^n and (initially) B = ∅. Considering the a_i sequentially: if a_i ∉ L(B), replace B ← B ∪ {a_i} and continue. Show that upon termination B is a basis for L({a_1,...,a_m}). □

For computational considerations we restrict now to rational data. Observe that in order to carry out the conditional test a_i ∉ L(B) of the Greedy Algorithm, we must determine, given A ∈ Q^{m×n} and c ∈ Q^n, whether the linear system {yA = c} is consistent. I.e., we must resolve the decision problem ∃? y ∈ Q^m : yA = c. Of course, the Gaussian elimination procedure of Proposition 1.11 can be used for this purpose. Iterated application of this procedure, eliminating first y_m, then y_{m−1}, etc., can terminate in only two ways: either with an equivalent system which can be back-solved (as in the proof of Proposition 1.11) to determine a solution for the original system, or with a plainly inconsistent relation 0y_1 = β with β ≠ 0. Note also that simply solving {yB = e_i}, for i = 1,...,m, will determine B^{−1} when B ∈ Q^{m×m} is invertible. Thus Gaussian elimination can be used, not only as in the Greedy Algorithm to determine a basis for S = L({a_1,...,a_m}), but also as follows to determine a basis for S^o. We assume, without loss of generality, that A is nonvacuous, i.e., that m > 0, and that the rows of A are linearly independent. Thus there are m independent columns of A and for some permutation matrix P, we can partition AP = [B N], with B an invertible m×m matrix; hence B^{−1}AP = [I_m B^{−1}N]. Matrix B is a (column) basis for A.

Note that parts (i) and (ii) of the exercise remain valid even when L(A)= 0 ; in this case −1 { } B N is vacuous and In−m = In. 2 1 1 Exercise 1.35 Is (1, 1, 0) in the row space of − ? Is ( 1, 5, 3) . . .? 2 3 3 1 − − " − − # 2 3 1 1 b Exercise 1.36 Determine all solutions x IR4 to − x = 1 . 2 ∈ 1 5 20 b2 " − # " # How efficient is Gaussian elimination? This algorithm performs at most m2 elementary row operations, each requiring at most n arithmetic operations, so the overall number of simple operations (add, subtract, multiply, divide, compare, store) executed is bounded by a polynomial function of m, n. The measurement of computational efficiency must also consider the simple arithmetic operations. Now, comparison and storage operations do not alter the problem data, but the arithmetic involved in the elementary operations can affect the data—sometimes dramatically, as we see below. The following variant of Gaussian elimination, stated for ease of presentation only in terms of the operations on matrix A, uses simple cross multiplication without division (cf. the algorithm of (1.11), where row p is divided by apq, and row i by aiq, before the cross multiplication). Algorithm 1.37 Input: A Qm×n. Initialize ∈k =0, I = J = . (i) (select pivot) find p∅ / I,q / J with a =0 – if none exists, STOP; ∈ ∈ pq 6 set k k +1, I I p , J J q , (p , q ) (p, q); ← ← ∪{ } ← ∪{ } k k ← (ii) (pivot) i / I with aiq =0, replace aij apqaij aiqapj j / J; go to (i). ∀ ∈2 6 ← − ∀ ∈

Exercise 1.38 Suppose matrix A is m×m, with a_ij = 2 for i = j and a_ij = 1, otherwise. Apply the previous algorithm using (p_k, q_k) = (k, k), 1 ≤ k ≤ m − 1, denoting by α_k the magnitude of the largest entry in A when step (i) is encountered, 0 ≤ k ≤ m − 1 (α_0 = 2). Show that α_k = ((k+2)k/(k+1)²) α_{k−1}², k ≥ 1; i.e., data magnitude essentially squares each iteration. □

This exercise is due to F. Voelkle (private communication, E.P.F.-Lausanne, 1985). For the size of an integer α, we take 1 + ⌈log₂|α|⌉, i.e., the number of bits required to express α (including the sign). The exercise shows that for ℓ ≥ 1, α_{k+ℓ} ∼ α_k^{2^ℓ}, and hence log₂ α_{k+ℓ} ∼ 2^ℓ log₂ α_k; i.e., data size grows exponentially as the algorithm progresses. In J. Res. of Nat. Bur. Stds. (B) 71 (1967) 241–245, Edmonds presents a variant of Gaussian elimination which guarantees no such exponential increase in the size of the data.

Algorithm 1.39 This procedure is the same as Algorithm 1.37, except (with a_{p_0 q_0} = 1):
(ii) (pivot) ∀ i ∉ I, replace a_ij ← (a_pq a_ij − a_iq a_pj)/a_{p_{k−1} q_{k−1}} ∀ j ∉ J; ... . □

Note that step (ii) of the revised procedure stipulates replacement even when a_iq = 0. The normalization of dividing by the previous pivot element mitigates the increase in the size of the data encountered over the course of the computation. Indeed, now all intermediate matrix entries are (in magnitude) just determinants of submatrices of the original matrix.
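The contrast between the two pivot rules is easy to observe experimentally. The following Python sketch (names are our own) runs both on the integer matrix of Exercise 1.38 with pivots (p_k, q_k) = (k, k); by Theorem 1.40 below, the division in the Edmonds variant is exact over the integers for this input.

    def pivot_steps(A, divide_by_previous):
        """Triangularize A with pivots on the diagonal (p_k = q_k = k).

        divide_by_previous=False gives Algorithm 1.37 (cross multiplication);
        divide_by_previous=True gives Algorithm 1.39, dividing each new entry
        by the previous pivot (exact integer division for integral input).
        Returns the largest entry magnitude seen at the start of each step.
        """
        A = [row[:] for row in A]
        m = len(A)
        growth = []
        for k in range(m - 1):
            growth.append(max(abs(x) for row in A for x in row))
            prev = A[k - 1][k - 1] if (divide_by_previous and k > 0) else 1
            for i in range(k + 1, m):
                if not divide_by_previous and A[i][k] == 0:
                    continue  # Algorithm 1.37 skips rows with a_iq = 0
                A[i] = [(A[k][k] * A[i][j] - A[i][k] * A[k][j]) // prev
                        for j in range(m)]
        growth.append(max(abs(x) for row in A for x in row))
        return growth

    m = 8
    A = [[2 if i == j else 1 for j in range(m)] for i in range(m)]
    print(pivot_steps(A, False))  # entries roughly square each step (Ex. 1.38)
    print(pivot_steps(A, True))   # entries stay at subdeterminant size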

Theorem 1.40 (Edmonds 1967) Any nonzero matrix entry produced in Algorithm 1.39 is equal in magnitude to the determinant of a submatrix of the original matrix A.

Proof: Denote by A^k the data as iteration k ≥ 0 begins; the assertion is clear for A^0 = A. Inductively assume the theorem valid for A^0, A^1,...,A^{k−1}, and for A^k consider a^k_{pq} for p ∉ {p_1,...,p_k}, q ∉ {q_1,...,q_k} (the other entries being unchanged from A^{k−1}). We claim a^k_{pq} is equal in magnitude to the determinant of the submatrix of A^0 given by rows {p_1,...,p_k, p} and columns {q_1,...,q_k, q}. Where δ_r denotes this determinant value in A^r, consider the relation between δ_{k−1} and δ_k. In computing A^k, the only operation affecting a^k_{pq} is (pre-)multiplication of A^{k−1} by an elementary matrix with the following entries in rows p_k, p and columns p_k, p:

[ 1                                              0                                            ]
[ −a^{k−1}_{p q_k} / a^{k−1}_{p_{k−1} q_{k−1}}   a^{k−1}_{p_k q_k} / a^{k−1}_{p_{k−1} q_{k−1}} ] .

Thus δ_k/δ_{k−1} = a^k_{p_k q_k} / a^{k−1}_{p_{k−1} q_{k−1}}. Similarly, in going from A^{k−2} to A^{k−1}, only rows p_{k−1} and p of the submatrix change. It follows that δ_{k−1}/δ_{k−2} = (a^{k−1}_{p_{k−1} q_{k−1}} / a^{k−2}_{p_{k−2} q_{k−2}})², and iteration of this argument yields

δ_k/δ_0 = (δ_k/δ_{k−1})(δ_{k−1}/δ_{k−2}) ··· (δ_2/δ_1)(δ_1/δ_0)
        = (a^k_{p_k q_k}/a^{k−1}_{p_{k−1} q_{k−1}}) (a^{k−1}_{p_{k−1} q_{k−1}}/a^{k−2}_{p_{k−2} q_{k−2}})² ··· (a^2_{p_2 q_2}/a^1_{p_1 q_1})^{k−1} (a^1_{p_1 q_1}/a^0_{p_0 q_0})^k.

Hence δ_k = a^1_{p_1 q_1} a^2_{p_2 q_2} ··· a^k_{p_k q_k} δ_0. Also, δ_k and a^1_{p_1 q_1} a^2_{p_2 q_2} ··· a^k_{p_k q_k} a^k_{pq} are of the same magnitude, by (permuted) triangularity of the submatrix. Therefore δ_0 and a^k_{pq} must be of equal magnitude, as required. □

An immediate consequence of the theorem is that when the original matrix A has only integral entries, the division by a_{p_{k−1} q_{k−1}} in successive iterations of the algorithm produces integer-valued results; i.e., only integral arithmetic is required. It follows that in actual implementation of the procedure for arbitrary (rational) data, one can always work with integral data, thus avoiding questions regarding different representations of the same rational number.

Exercise 1.41 Show that any n×n submatrix in Exercise 1.38 has {0, 1, n+1}-valued determinant magnitude. Repeat the computation of the exercise using Algorithm 1.39. □

Finally, recall that the determinant of any m×m matrix is the sum of m! (signed) products of m matrix entries, one for each permutation of the indices (1, 2,...,m). Thus the determinant of any submatrix is of magnitude at most m!α^m ≤ (mα)^m, where α denotes the magnitude of the largest entry in the original matrix; hence the size of the determinant is at most m(log₂ m + log₂ α). Thus, since the overall number of elementary computational operations and the size of the data produced over the course of the computation are each bounded by a polynomial function of the size of the original problem instance, i.e., a polynomial function of m, n, and log₂ α, Algorithm 1.39 is a polynomial-time algorithm.

Exercise 1.42 Give a polynomial-time algorithm (in the size of A ∈ Q^{m×n}, c ∈ Q^n) which terminates with either y satisfying alternative (i) of (1.15) or x satisfying alternative (ii). □

Exercise 1.43 For A ∈ IR^{m×n} of rank n and b ∈ IR^m, suppose x̄ solves {(A^t A)x = A^t b}. Show that b′ = Ax̄ is the least squares approximation for {Ax = b}. □

Exercise 1.44 For the Gram-Schmidt orthogonalization procedure of Theorem 1.21,
(i) show L({a_1,...,a_i}) = L({b_1,...,b_i}), for i = 1,...,n;
(ii) hence b_{i+1} = a_{i+1} − A_i(A_i^t A_i)^{−1} A_i^t a_{i+1}, 1 ≤ i ≤ n − 1, where A_i = [a_1 ... a_i];
(iii) conclude that the size of each b_i is a polynomial function of the size of a_1,...,a_i. □

2 Lattice Points in Linear Spaces

We now restrict the scalar multiplication in the vector space axioms of the previous section to the integers ZZ. The resulting sets, containing 0 and closed under integral linear combinations, are called ZZ-modules. That is, instead of subspaces, we direct attention to subsets of vectors which form (abelian) groups under the operation of vector addition. As with subspaces and affine spaces, we define in the obvious way integral dependence, integral dependence relations, and integral span, denoted Z(·). The properties of Exercise 1.2 hold for integral span, so we may think of ZZ-modules as those sets S which satisfy S = Z(S).

Finite Generation

The ZZ-module Z(A) = { yA : y ∈ ZZ^m } is finitely generated by the rows of A ∈ IR^{m×n}; a lattice is a ZZ-module which has linearly independent generators. Thus ZZ^n is the integer lattice and its elements are loosely referred to as lattice points in IR^n. Not all finitely generated ZZ-modules are lattices; e.g., { 1y_1 + √2 y_2 : (y_1, y_2) ∈ ZZ² } ⊂ IR^1. However, our focus will be on rational ZZ-modules, essentially limiting finite generation (after integral scaling) to submodules of the integer lattice. We establish this in the following theorem, borrowing a simple technique used to prove the Hilbert Basis Theorem (Theorem 4.2 later) in Giles and Pulleyblank, Lin. Alg. and Its Appl. 25 (1979) 191–196.

Theorem 2.1 For M a rational ZZ-module, M is finitely generated if and only if kM is integer-valued for some integer k ≠ 0.

Proof: Suppose M = { yA : y ∈ ZZ^m } with A ∈ Q^{m×n}. Then clearly kA ∈ ZZ^{m×n}, and hence kM ⊆ ZZ^n, for some nonzero k ∈ ZZ. On the other hand, suppose N = kM ⊆ ZZ^n, with {a_1,...,a_r} ⊆ N a basis for L(N). For any x ∈ N, we have x = Σ_i α_i a_i for (unique) rational coefficients α_i. Define G = {a_1,...,a_r} ∪ { Σ_i α_i a_i ∈ N : 0 ≤ α_i < 1 }; G is evidently a finite subset of N. Moreover, x ∈ N ⇒ x = Σ_i α_i a_i = Σ_i ⌊α_i⌋ a_i + Σ_i (α_i − ⌊α_i⌋) a_i ⇒ x ∈ Z(G). It follows that G generates N; hence M is finitely generated by { (1/k)g : g ∈ G }. □

If N is a submodule (containing 0, closed under integral combinations) of a finitely generated rational ZZ-module M, then the theorem implies kN ⊆ kM ⊆ ZZ^n for some 0 ≠ k ∈ ZZ. Thus N is also finitely generated, a well-known and fundamental result in algebra.

Corollary 2.2 If a rational ZZ-module is finitely generated, so is each of its submodules. □

Still, not all rational ZZ-modules are lattices; consider, e.g., Q^n or D = { m/2^n : m, n ∈ ZZ }, the dyadic rationals. It is evident from Theorem 2.1 that such ZZ-modules are not finitely generated. On the other hand, it follows from Theorem 2.3 below on Hermite normal form that all finitely generated rational ZZ-modules are lattices. An integer-valued matrix with unit magnitude determinant is unimodular.

Cramer's rule implies that U ∈ ZZ^{m×m} is unimodular if and only if U^{−1} ∈ ZZ^{m×m} is also unimodular. Thus any unimodular matrix U preserves integrality in the sense that z ∈ ZZ^m ⇔ zU ∈ ZZ^m. Furthermore, it is not difficult to verify conversely that only unimodular matrices have this property, namely, that a matrix U for which zU ∈ ZZ^m ⇔ z ∈ ZZ^m must be unimodular. It is easy to see that unimodular matrices effect the following elementary row operations: (i) interchange two rows; (ii) add one row to another; (iii) multiply a row by −1. It follows that any sequence of these unimodular elementary operations applied to the rows of matrix A leaves Z(A) unaltered (cf. Exercise 1.3). If a column of an integral (or rational) matrix contains two nonzero entries, repeated subtraction between the two rows (the second and third operations) eventually drives one entry to zero. Applied iteratively, this procedure constructs the Hermite normal form, matrix H stipulated in the following classical theorem.

Theorem 2.3 (Hermite 1851) For any rational matrix A of full column rank, there exists a unimodular matrix U so that UA = H, with: h_ij = 0, i > j; 0 ≤ h_ij < h_jj, i < j. □

Exercise 2.4 (i) Prove the preceding theorem using unimodular elementary operations.
(ii) Show that the Hermite normal form matrix H is unique. □

The theorem guarantees that any finitely generated (by the rows of A) rational ZZ-module has a linearly independent set of generators (the rows of H) and is therefore a lattice. Any such linearly independent generating set is a (lattice) basis. By Exercise 2.4(ii), all bases of a lattice are equicardinal, a result similar to the Finite Basis Theorem for subspaces. Note, however, that in the lattice setting an arbitrary generating set need not contain (explicitly) a basis; e.g., for ZZ = { 2y_1 + 3y_2 : (y_1, y_2) ∈ ZZ² } ⊂ IR^1, the only bases are +1, −1.

Exercise 2.5 For A ∈ Q^{m×n} and x ∈ Z(A), it is clear that x must be a linear combination of dim(Z(A)) or fewer rows of A. Is there an analogous result for integral combinations? □

The algorithmic proof of Theorem 2.3 suggested above actually consists of a series of applications of Euclid's greatest common divisor algorithm. Indeed, consider the following composite operations on the rows of A. Let g = gcd(a_ij, a_kj) = p a_ij + q a_kj. We then premultiply A by the m×m unimodular matrix U which is the identity matrix, except for entries: u_ii = p, u_ik = q, u_ki = −a_kj/g, u_kk = a_ij/g. This results in the following changes in the jth column: a_ij ← p a_ij + q a_kj = g and a_kj ← −a_ij(a_kj/g) + a_kj(a_ij/g) = 0.

Exercise 2.6 Let a ∈ ZZ^n and suppose γ = gcd(a_1,...,a_n).
(i) For a as the first column of A in Theorem 2.3, show h_11 = γ.
(ii) Show { Σ_j a_j x_j = α } has an integral solution ⇔ γ divides α (∈ Q), i.e., α/γ ∈ ZZ.
(iii) Show a is a column of a unimodular n×n matrix ⇔ γ = 1. □

An analogue of Theorem 1.15 remains valid in the lattice setting. We give a direct proof now based on unimodular transformations, indicating later how the result can be derived in a manner entirely similar to the duality development for (1.15).

Theorem 2.7 (Kronecker 1884) For A ∈ Q^{m×n} and c ∈ Q^n, exactly one holds:
(i) ∃ y ∈ ZZ^m such that yA = c;
(ii) ∃ x ∈ Q^n such that Ax ∈ ZZ^m, cx ∉ ZZ.

Proof: Clearly ¬[(i) and (ii)], else ZZ ∋ y(Ax) = (yA)x = cx ∉ ZZ, a contradiction. If {yA = c} has no rational solution, then (1.15) provides x_0 ∈ Q^n such that Ax_0 = 0, cx_0 ≠ 0; for δ ∈ ZZ sufficiently large, (ii) holds with x = (1/δ)x_0. If {yA = c} has a rational solution, (1.15) shows we may assume A has full column rank. Scale the data so that A, c are integer-valued; this doesn't change (i) or (ii). Thus, by Theorem 2.3,

UA = [ H ]
     [ 0 ]

for some unimodular U, where H is invertible. The system {zUA = c} has a solution, namely z = (z_1,...,z_n, z_{n+1},...,z_m) = (cH^{−1}, 0). Thus y = zU solves {yA = c}. If cH^{−1} ∈ ZZ^n, then z ∈ ZZ^m ⇒ zU = y ∈ ZZ^m, so (i) holds. If (cH^{−1})_j ∉ ZZ, take x as the jth column of H^{−1}. Then cx ∉ ZZ, while Ax = U^{−1}UAx = U^{−1}[H; 0]x = U^{−1}[e_j; 0] ∈ ZZ^m, so (ii) holds. □
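The Hermite normal form used above can be computed exactly as in the discussion preceding Theorem 2.3: repeated row subtractions play the role of Euclid's algorithm. A Python sketch for integral input of full column rank (names are our own):

    def hermite_normal_form(A):
        """Row-style Hermite normal form of an integer matrix.

        Assumes full column rank. Uses only unimodular row operations
        (swap, subtract, negate), so the ZZ-module generated by the rows
        is unchanged. Returns H with h_ij = 0 for i > j and
        0 <= h_ij < h_jj for i < j, as in Theorem 2.3.
        """
        H = [row[:] for row in A]
        m, n = len(H), len(H[0])
        for j in range(n):
            # Euclidean phase: zero out the entries below position (j, j)
            while True:
                nz = [i for i in range(j, m) if H[i][j] != 0]
                piv = min(nz, key=lambda i: abs(H[i][j]))
                H[j], H[piv] = H[piv], H[j]
                done = True
                for i in range(j + 1, m):
                    q = H[i][j] // H[j][j]
                    if q:
                        H[i] = [x - q * y for x, y in zip(H[i], H[j])]
                    if H[i][j] != 0:
                        done = False
                if done:
                    break
            if H[j][j] < 0:
                H[j] = [-x for x in H[j]]
            # reduction phase: 0 <= h_ij < h_jj for rows above the diagonal
            for i in range(j):
                q = H[i][j] // H[j][j]
                H[i] = [x - q * y for x, y in zip(H[i], H[j])]
        return H

    print(hermite_normal_form([[2, 0], [3, 1]]))  # [[1, 1], [0, 2]]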

Exercise 2.8 Show that Q^n can be strengthened to Q^n_+ in alternative (ii) of (2.7). □

Exercise 2.9 Show (2.7) remains valid for c ∈ IR^n, but may fail when A ∈ IR^{m×n}. □

Hermite normal form uniqueness stipulates that any two bases of a full-dimensional lattice are related by a unimodular transformation; determinant magnitude is, therefore, not affected by basis changes. This invariant has geometric significance. E.g., recall from Theorem 1.23 that for A ∈ ZZ^{n×n} with linearly independent rows a_i, |det(A)| = vol({ Σ_i y_i a_i : 0 ≤ y_i ≤ 1 ∀ i }). Moreover, |det(A)| is also the number of integral elements in the (partially open) parallelotope P = { Σ_i y_i a_i : 0 ≤ y_i < 1 ∀ i }. This is clearly the case for n = 1. Proceeding by induction on n, for U unimodular, yA is integer-valued if and only if yAU is integer-valued. Thus we may assume A has been (lower-)triangularized via unimodular elementary (column) operations. For k = 0, 1,...,a_nn − 1, denote S_k = { Σ_i y_i a_i ∈ P ∩ ZZ^n : y_n = k/a_nn }. The sets S_k are disjoint, their union is P ∩ ZZ^n, and they are equicardinal, as the following shows:

S_k = { z ∈ ZZ^n : z = Σ_{i<n} y_i a_i + (k/a_nn)a_n, 0 ≤ y_1,...,y_{n−1} < 1 };

since a_n = (0,...,0,a_nn), translation by (k/a_nn)a_n = (0,...,0,k) ∈ ZZ^n gives a bijection between S_0 and S_k. By induction, |S_0| = |det(A)|/a_nn, and hence |P ∩ ZZ^n| = |det(A)|.

Every lattice contains a smallest (shortest) nonzero element. This is an immediate consequence of the following topological characterization of finite generation, motivated, for example, by consideration of the dyadic rationals.

Theorem 2.11 For M a rational ZZ-module, M is finitely generated if and only if 0 is not a limit point of M.

Proof: We are given that M is rational; thus if M is finitely generated, we have δ, the least common multiple of denominators in components of the generators for M. The granularity of M determined by δ shows 0 cannot be a limit point. For the reverse implication, suppose {a_1,...,a_r} ⊆ M is a basis for L(M). Define G = {a_1,...,a_r} ∪ { Σ_i α_i a_i ∈ M : 0 ≤ α_i < 1 }. Since 0 is not a limit point of M, we can choose ε > 0 so that ||x|| > ε ∀ x ∈ M \ {0}. It follows that ||x − y|| > ε ∀ x, y ∈ M with x ≠ y. Thus the intersection of M with any bounded set is finite and, therefore, G is a finite set. The argument used in proving Theorem 2.1 now shows that G generates M. □

Clearly, 0 is a limit point of M if and only if M contains no smallest nonzero element. Thus:

Corollary 2.12 For a rational ZZ-module M, the following are equivalent:
(i) M is finitely generated;
(ii) kM is integral, for some 0 ≠ k ∈ ZZ;
(iii) M is a lattice;
(iv) 0 is not a limit point of M;
(v) M contains a shortest (i.e., minimum norm) nonzero element. □

Exercise 2.13 Show that (ii)–(v) may fail for M finitely generated but not rational. □

We now follow L. Lovász, An Algorithmic Theory of Numbers, Graphs and Convexity (SIAM, 1986) and R. Kannan, Ann. Rev. Comp. Sci. 2 (1987) 231–267 in deriving bounds on the length of a shortest nonzero vector in the lattice M = { yA : y ∈ ZZ^n }, where A is a rational, invertible n×n matrix. We denote δ(M) = |det(A)| and for x ∈ IR^n, ||x||_∞ = max_j |x_j|. First, a classical argument from the geometry of numbers is used to determine an upper bound based on δ(M). We point out that the proof relies on the fact that it is impossible to cover the unit hypercube [0, 1]^n with a finite family of proper subsets which are pairwise disjoint and closed.

Exercise 2.14 For S_1,...,S_k disjoint, closed, proper subsets of [0, 1]^n, n ≥ 2, show that ∪_i S_i is properly contained in [0, 1]^n. □

Theorem 2.15 ||z|| ≤ √n δ(M)^{1/n}, for some nonzero element z of lattice M.

Proof: Denote M = { x = yA : y ∈ ZZ^n }, where we may assume A ∈ ZZ^{n×n} and invertible. To each point ℓ ∈ M, associate the cube ℓ + C, where C = { x : 0 ≤ x_j ≤ δ(M)^{1/n} }. Transform x ↦ xA^{−1}, so that M ↦ ZZ^n and C ↦ C′ = { yB : 0 ≤ y_i ≤ 1 }, B = δ(M)^{1/n} A^{−1}. Now IR^n = ∪_{z∈ZZ^n} (z + C′), as x = yB = Σ_i ⌊y_i⌋ b_i + Σ_i (y_i − ⌊y_i⌋) b_i = z + y′, z ∈ ZZ^n, y′ ∈ C′. Thus [0, 1]^n is covered by the sets [0, 1]^n ∩ (z + C′) ≠ ∅, determined by finitely many z ∈ ZZ^n. By the preceding exercise, (s + C′) ∩ (t + C′) ≠ ∅, for some distinct pair s, t ∈ ZZ^n; hence (p + C) ∩ (q + C) ≠ ∅ for distinct p, q ∈ M. Thus, for z = p − q ∈ M, ||z|| ≤ √n ||z||_∞ = √n ||p − q||_∞ ≤ √n δ(M)^{1/n}, as required. □

The orthogonalization (Theorem 1.21) of any basis for lattice M provides a lower bound on λ(M), the length of a shortest nonzero vector in M. To see this, let b_1,...,b_n ∈ Q^n be the orthogonalization of basis a_1,...,a_n ∈ Q^n and 0 ≠ z ∈ M. Then z = Σ_i y_i a_i = Σ_i y′_i b_i, where y_i ∈ ZZ, y′_i ∈ Q ∀ i. Let k be the largest index for which y_k ≠ 0; i.e., y_i = 0, i > k and y_k ≠ 0. By construction, L({a_1,...,a_k}) = L({b_1,...,b_k}) and b_k = a_k − Σ_{i<k} (b_i·a_k/b_i·b_i)b_i, hence y′_k = y_k. Thus ||z||² = Σ_i (y′_i)² ||b_i||² ≥ y_k² ||b_k||² ≥ ||b_k||², so that λ(M) ≥ min_i ||b_i||.

Since the orthogonal vectors b_1,...,b_n provide a lower bound on the length of a shortest vector in M, one might expect to find reasonably short elements of M near the b_i. In fact, it is not difficult to construct an entire basis, say ā_1,...,ā_n, which also has orthogonalization b_1,...,b_n and whose members are near the b_i, in the sense that each ā_i is in the rectangle centered at b_i and generated by b_1,...,b_{i−1}; i.e., |μ̄_ij| ≤ 1/2 ∀ i, j in the (unique) representation ā_i = Σ_{j<i} μ̄_ij b_j + b_i, 1 ≤ i ≤ n. Such a basis is said to be weakly reduced. The new representation is obtained by the replacements a_i ← a_i − ⌈μ_ij⌋ a_j for j = i−1,...,1, where ⌈·⌋ denotes a nearest integer; each replacement leaves the orthogonalization b_1,...,b_n unchanged. A weakly reduced basis is reduced provided ||μ_{k+1,k} b_k + b_{k+1}||² ≥ (3/4)||b_k||², for 1 ≤ k < n.

Algorithm 2.17 (Basis Reduction) Input: lattice basis a_1,...,a_n ∈ Q^n.
(i) (orthogonalization) determine mutually orthogonal b_1,...,b_n and the representation a_i = Σ_{j<i} μ_ij b_j + b_i, 1 ≤ i ≤ n;
(ii) (weak reduction) replace a_i ← a_i − ⌈μ_ij⌋ a_j, for i = 2,...,n and j = i−1,...,1, so that the resulting basis is weakly reduced;
(iii) (exchange) if ||μ_{k+1,k} b_k + b_{k+1}||² < (3/4)||b_k||² for some k, interchange a_k ↔ a_{k+1}, re-orthogonalize, and go to (ii); otherwise, STOP. □
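A compact Python sketch of Algorithm 2.17 (names are our own; exact rationals). It re-orthogonalizes from scratch after each exchange, which is wasteful but keeps the correspondence with steps (i)–(iii) plain; the 3/4 threshold is the reduction condition above.

    from fractions import Fraction

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def orthogonalize(a):
        # step (i): Gram-Schmidt vectors b[i] and coefficients mu[i][j], j < i
        n, b, mu = len(a), [], [[Fraction(0)] * len(a) for _ in a]
        for i in range(n):
            bi = list(a[i])
            for j in range(i):
                mu[i][j] = dot(b[j], a[i]) / dot(b[j], b[j])
                bi = [x - mu[i][j] * y for x, y in zip(bi, b[j])]
            b.append(bi)
        return b, mu

    def reduce_basis(a):
        a = [list(v) for v in a]
        while True:
            b, mu = orthogonalize(a)
            for i in range(1, len(a)):          # step (ii): weak reduction;
                for j in range(i - 1, -1, -1):  # |mu[i][j]| <= 1/2 afterwards
                    q = round(mu[i][j])
                    if q:
                        a[i] = [x - q * y for x, y in zip(a[i], a[j])]
                        mu[i][j] -= q
                        for jj in range(j):
                            mu[i][jj] -= q * mu[j][jj]
            for k in range(len(a) - 1):         # step (iii): exchange
                v = [mu[k + 1][k] * x + y for x, y in zip(b[k], b[k + 1])]
                if dot(v, v) < Fraction(3, 4) * dot(b[k], b[k]):
                    a[k], a[k + 1] = a[k + 1], a[k]
                    break
            else:
                return a                        # reduced: no exchange needed

    basis = [[Fraction(1), Fraction(1), Fraction(1)],
             [Fraction(-1), Fraction(0), Fraction(2)],
             [Fraction(3), Fraction(5), Fraction(6)]]
    print(reduce_basis(basis))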

Theorem 2.18 (Lovász 1982) Algorithm 2.17 terminates with a reduced basis.

Proof: We may assume a_i ∈ ZZ^n, for i = 1,...,n; let matrix A_i have columns a_1, ···, a_i. Similarly, denote B_i = [b_1 ··· b_i] for the related orthogonalization b_1,...,b_n. Recall from the proof of Theorem 1.23 that det(A_i^t A_i) = det(B_i^t B_i), for i = 1,...,n. To show termination, we track ∆ = det(A_1^t A_1) ··· det(A_{n−1}^t A_{n−1}) during execution. Step (ii) of the algorithm does not change the orthogonalization, hence does not change ∆. At step (iii), A_i is unchanged for i < k and changes only by column permutation for i > k, so det(A_i^t A_i) changes only for i = k, due to the new column k (viz., a_{k+1}) of A_k. Denoting the data after exchange and re-orthogonalization by ∆̄, etc.,

μ_{k+1,1} b_1 + ··· + μ_{k+1,k} b_k + b_{k+1} = a_{k+1} = ā_k = μ̄_{k1} b̄_1 + ··· + μ̄_{k,k−1} b̄_{k−1} + b̄_k.

As μ_{k+1,k} b_k + b_{k+1} is orthogonal to b_1 = b̄_1, ..., b_{k−1} = b̄_{k−1}, it follows that b̄_k = μ_{k+1,k} b_k + b_{k+1}. Thus ∆̄/∆ = ||b̄_k||²/||b_k||² = ||μ_{k+1,k} b_k + b_{k+1}||²/||b_k||² < 3/4, hence ∆̄ < (3/4)∆. Also, ā_i ∈ ZZ^n implies det(Ā_i^t Ā_i) ≥ 1 ∀ i, so ∆ ≥ 1 holds at each iteration. Thus after ℓ iterations, 1 ≤ ∆ < (3/4)^ℓ ∆_0 ≤ (3/4)^ℓ ||a_1||^{2(n−1)} ··· ||a_{n−1}||² ≤ (3/4)^ℓ (max_i ||a_i||)^{n(n−1)}. As 3/4 < 1 and (max_i ||a_i||) is fixed, the iteration count ℓ must be bounded above. □

Recall that for lattice M we denote λ(M) = min{ ||z|| : 0 ≠ z ∈ M } and δ(M) = |det(A)|, where the rows of A are a basis for M. The following theorem from A. Lenstra, H. Lenstra, and L. Lovász, Math. Ann. 261 (1982) 515–534 shows that the lengths of the elements of a reduced basis must obey certain bounds related to λ(M) and δ(M).

Theorem 2.19 Let a_1,...,a_n ∈ Q^n be a reduced basis for lattice M. Then:
(i) ||a_1|| ≤ 2^{(n−1)/2} λ(M); (ii) ||a_1|| ≤ 2^{(n−1)/4} δ(M)^{1/n}; (iii) ||a_1|| ··· ||a_n|| ≤ 2^{n(n−1)/4} δ(M).

Proof: Let b_1,...,b_n ∈ Q^n be the Gram-Schmidt orthogonalization of a_1,...,a_n, with weakly reduced representation a_i = Σ_{j<i} μ_ij b_j + b_i.

Part (iii) of the theorem shows that any rational lattice M has a basis a_1,...,a_n which satisfies ||a_1|| ··· ||a_n|| ≤ γ_n δ(M), where γ_n = 2^{n(n−1)/4} is a function of dimension only. The first result of this type was established by Hermite (1850) with γ_n = (4/3)^{n(n−1)/4}. Minkowski (1896) improved this to γ_n = 2^n/V_n, where V_n is the volume of the unit n-sphere. Note that for an orthogonal basis we would have ||a_1|| ··· ||a_n|| = δ(M); for this reason, ||a_1|| ··· ||a_n||/δ(M) has been termed the orthogonality defect of the basis. Although its resulting orthogonality defect is not as sharp as the Minkowski result, the basis reduction procedure of Lovász has an attractive feature absent from its classical predecessors – it is a polynomial-time algorithm; we discuss the algorithm's computational complexity later. The orthogonality defect of the present procedure can be made arbitrarily close to (4/3)^{n(n−1)/4} as follows.

Exercise 2.20 Suppose we replace the constant 4/3 by α ∈ (1, 4/3] in defining a reduced basis. Show that the base 2 in Theorem 2.19(i)–(iii) then becomes β = 4α/(4 − α) ∈ (4/3, 2]. □

The classical orthogonality defects also imply counterparts to Theorem 2.19(ii). I.e., for some basis a_1,...,a_n: [Hermite] ||a_1|| ≤ (4/3)^{(n−1)/4} δ(M)^{1/n}; [Minkowski] ||a_1|| ≤ 2(δ(M)/V_n)^{1/n}. Compare these bounds with the bound ||z|| ≤ √n δ(M)^{1/n} on the length of a shortest nonzero element, given in Theorem 2.15. Observe, however, that there is no guarantee that the short vector z is in a basis for M. Minkowski studied the successive minima of M, denoted λ_i(M), 1 ≤ i ≤ n, where λ_i(M) is the smallest real number such that M contains i linearly independent elements of length at most λ_i(M); thus λ_1(M) = λ(M), the length of a shortest nonzero element of M. Note here again that there is no stipulation that these linearly independent elements of M be contained in a basis for M. Minkowski's orthogonality defect cited above is a specialization of his so-called second theorem from the geometry of numbers, which bounds the product of the successive minima for certain sets. Note that λ_1(M) ··· λ_n(M) ≤ ||a_1|| ··· ||a_n|| for any basis a_1,...,a_n of M, though this inequality generally will hold strictly.

Constraint Representations and Duality

The dual of S Qn is S# = x Qn : Sx ZZ ; by Sx ZZ we intend sx ZZ s S. # ⊆ { ∈ ∈ } ∈ ∈ ∀ ∈ Note that S is a rational ZZ-module. Due to the integrality restrictions, any ZZ-module of the form M = S# will be called constrained. When S is finite, M is finitely constrained, # n | | m×n i.e., when M = A = x Q : Ax ZZm , with A Q . Recall that the dyadic rationals D constitute a ZZ{-module∈ which∈ is not} finitely generated.∈ This ZZ-module is also not # m ## constrained, since D = x Q : n x ZZ m, n ZZ = 0 , which implies D = Q = D { ∈ 2 ∈ ∀ ∈ } { } 6 (see Proposition 2.21(iv) below). Nor are all constrained ZZ-modules finitely constrained, for let M = x Qn : Bx =0 be a subspace of Qn of dimension less than n, with B Qm×n. { ∈ } ∈

22 Then M is constrained, since M = yB : y Qm # . But M is not finitely constrained, because the restrictions Bx = 0 cannot{ be replaced∈ by} finitely many constraints of the form ax ZZ. (Why not?) Our development now follows Carvalho and Trotter, Math. of O.R. ∈ 14(1989)639–663. The following properties are elementary – cf. Proposition 1.10.

Proposition 2.21 Let S, T Qn. Then: # # ⊆ (i) S T S T ; (ii) S⊆ S##⇒; ⊇ (iii) S⊆# = S### ; (iv) S = S## S is a constrained ZZ-module; (v) A Qm×n⇔, S = yA : y ZZm = Z(A) S# = x Qn : Ax ZZm ; #∈ # { ∈ } ⇒ { ∈ ∈ } (vi) S = [Z(S)] ; (vii) (S T )# = S# T # ; (viii) if M∪ and N are∩ ZZ-modules, then (M + N)# = M # N # . 2 ∩ Exercise 2.22 For A Qn×n invertible, show that yA : y ZZn # = A−1z : z ZZn . 2 ∈ { ∈ } { ∈ } The lineality (subspace) of a ZZ-module M is the largest (rational) subspace contained in M, i.e., the set of all rational vectors x for which λx M λ Q. Representing the lineality explicitly leads us to ZZ-modules of extended finite generator∈ ∀ form∈ yA+zB : y Qm, z ZZp and of extended finite constraint form x Qn : Cx =0, Dx ZZ{ q . ∈ ∈ } { ∈ ∈ } Proposition 2.23 Let M = yA + zB : y Qm, z ZZp , where A Qm×n and B Qp×n. Then M # = x Qn : Ax{=0, Bx ZZ∈p . ∈ } ∈ ∈ { ∈ ∈ } Proof: We have M = yA : y Qm + zB : z ZZp . Now, yA : y Qm # {= x ∈Qn :(}yA){x = y(Ax∈ ) }ZZ y Qm = x Qn : Ax =0 . { ∈ } { ∈ # n∈ ∀ ∈ } { ∈ } By Proposition 2.21(v), zB : z ZZp = x Q : Bx ZZp . Thus Proposition 2.21(viii{ ) M∈# =} x {Qn∈: Ax =0,∈ Bx } ZZp . 2 ⇒ { ∈ ∈ } The next result shows that ZZ-modules in extended finite generator form also have an extended finite constraint representation, a ZZ-module analogue for Theorem 1.12.

Theorem 2.24 Let M = yA + zB : y Qm, z ZZp , where A Qm×n and B Qp×n. Then M = x Qn : Cx{ =0, Dx ∈ZZs , for some∈ }C Qr×n and∈ D Qs×n. ∈ { ∈ ∈ } ∈ ∈ A Proof: Assume rank(A)= m and that columns 1,...,m + q are a column basis for . " B # Via row operations we obtain RA = [Im A2]. Partition the columns of B = [B1 B2] similarly and denote Bˆ = B2 B1A2. By (2.3), there is a unimodular matrix U which brings Bˆ into Hermite− normal form, say

Bˆ1 Bˆ2 q×q UBˆ = , where Bˆ1 Q is upper triangular with positive diagonal. " 0 0 # ∈

23 Denoting A0 = [I A ] , B0 = UBˆ, we drop the final p q rows (all zeroes) of UBˆ to obtain m 2 − 0 Im 0 Im 0 R 0 A A Im+q 0 = 0 . " 0 U # " B1 Ip # " 0 Ip # " B # " B # h i − 0 A Im 0 0 n×n Finally, column operations yield 0 Q = , via an invertible Q Q . " B # " 0 Iq 0 # ∈ Now x = yA + zB for some y Qm, z ZZp x = y0A0 + z0B0 for some∈y0 Qm∈, z0 ZZq ⇔ ∈ ∈ xQ =(y0, z0, 0), with y0 rational and z0 integral. Thus⇔ we have x M if and only if Cx = 0 and Dx ZZs, where columns of Q∈corresponding to y0 do not restrict∈x, columns of Q corresponding to z0 give integrality restrictions (s rows of D) on x, and columns of Q corresponding to 0 give orthogonality restrictions (r rows of C) on x. 2

We observe several consequences of Theorem 2.24. First, note that Proposition 2.23 implies that the set M in the theorem is constrained, since M = uC + vD : u Qr, v ZZq # . { ∈ ∈ } It then follows from part (iv) of (2.21) that sets of the form M, the (extended) finitely ## generated sets, are #-closed, i.e., M = M . Since M is #-closed, the proof of Theorem 1.15 adapts directly to the present setting to give a generalization of Theorem 2.7. The original theorem is the special case of the following result in which matrix A is vacuous; note that unimodular row operations, the main tool used in our initial proof of (2.7), have played an equally important role here in the more general development (in proving Theorem 2.24). Corollary 2.25 For A Qm×n, B Qp×n and c Qn, exactly one holds: m ∈ ∈ ∈ (i) y Q , z ZZp such that yA + zB = c; (ii)∃ x∈ Qn such∈ that Ax =0, Bx ZZp, and cx ZZ. 2 ∃ ∈ ∈ 6∈ The proof of Corollary 1.14 also applies here, showing that (extended) generators and con- straints play entirely symmetric roles under #-duality and providing a duality converse for Proposition 2.23. Corollary 2.26 Let M = yA + zB : y Qm, z ZZp , N = x Qn : Ax =0, Bx ZZp , m×n { p×n ∈ # ∈ } # { ∈ ∈ } where A Q and B Q . Then M = N and N = M. 2 ∈ ∈ We also obtain a converse of Theorem 2.24, i.e., that solution sets for finitely many orthog- onality and integrality restrictions can be represented via finite generating sets for rational and integral combinations. Corollary 2.27 Let M = x Qn : Ax =0, Bx ZZp , where A Qm×n and B Qp×n. Then M = yC + zD : y{ ∈Qr, z ZZq , for some∈ C} Qr×n and∈ D Qq×n. ∈ { ∈ ∈ } ∈ ∈ Proof: We are given that M is finitely constrained by matrices A, B. By (2.26), M # is finitely generated by matrices A, B. By (2.24), M # is finitely constrained by certain matrices C,D. By (2.26), M ## is finitely generated by matrices C,D – and by (2.23), M = M ## . 2

24 Exercise 2.28 Show that the integral elements of a (rational) subspace are a lattice. 2

Exercise 2.29 Suppose M Qn is a lattice; i.e., M = yA : y ZZm for some A Qm×n. ⊆ { ∈ m } ∈ Show that: (i) the (rational) span of M is L(M)= yA : y Q . (ii) M # is a lattice if and only if M is full-dimensional.{ ∈ } 2

By Proposition 2.21(iv) and Corollary 2.26, ZZ-modules of extended finite generator (or con- straint) form are #-closed. We now establish that this property actually characterizes the #-closed sets. We first show that any ZZ-module decomposes into a sum of its lineality and a submodule which has trivial lineality, i.e., a submodule of lineality 0 . { } Proposition 2.30 Let M be a ZZ-module with lineality L. Then M = L +(M Lo ) and M Lo has trivial lineality. ∩ ∩ o Proof: We need only show M L +(M L ), the reverse inclusion being obvious. By Theorem 1.17, we have x ⊆M x =∩y + z, with y L, z Lo . ∈ ⇒ ∈ o ∈ Now y L y L M; hence x +( y)= z M L . Moreover,∈ z,⇒ −z ∈M ⊆Lo z L Lo−, and then∈ Exercise∩ 1.9(ii) implies z = 0. 2 − ∈ ∩ ⇒ ∈ ∩ ## Theorem 2.31 The set M Qn is #-closed, i.e., M = M , if and only if M = yA + zB : y Qm,⊆ z ZZp , for some A Qm×n and B Qp×n. { ∈ ∈ } ∈ ∈ Proof: The sufficiency was discussed following (2.24). For the necessity, suppose M = S# and let the rows of A Qm×n be a basis for So . o ∈ m It is easy to see that S is the lineality of M, hence by (2.30) M = yA : y Q +M L(S). Choose columns of R from S which form a basis of L(S). { ∈ } ∩ Then M L(S)= xRt : xRts ZZ s S =(RtS)# Rt, where RtS = Rts : s S . Since RtR∩ RtS, we{ have (RtS∈)# ∀ (R∈tR)}# . { ∈ } By Exercise⊆ 1.19, (RtR)−1 exists, and⊆ by Exercise 2.22, its rows generate (RtR)# . Thus by Corollary 2.2, (RtS)# is finitely generated, say by the rows of F Qp×n. We therefore obtain M = yA + zB : y Qm, z ZZp , where B = F Rt. ∈2 { ∈ ∈ } We now derive a second characterization of #-closure for rational ZZ-modules – (rational) topological closure; similarly, we shall see later that (real) topological closure characterizes duality closure for convex cones. Here we follow P. Carvalho (Ph.D. Thesis, Cornell, 1984).

Lemma 2.32 Suppose M is a topologically closed rational ZZ-module with lineality L. Then there exists > 0 so that M x : x  L. ∩{ k k ≤ } ⊆ Proof: For each > 0 define L = L( x M : x  ) and L = L .  { ∈ k k ≤ } 0 ∩>0  We claim L0 = L; clearly L0 is a subspace and L L0, so we must show L0 M. Let x L ; we construct a sequence in M converging⊆ to x. ⊆ ∈ 0 Suppose q = dim(L0); for each k =1, 2,..., we define xk = i αi ai, where a M and a 1/k i, L = L(a ,...,a ), and x = bα ac. i ∈ k ik ≤ ∀ 0 1 q Pi i i P 25 Then x M and x x = (α α )a (α α )a a q/k. k ∈ k − kk k i i − b ic ik ≤ i k i − b ic ik ≤ i k ik ≤ Thus xk x, and since M is topologicallyP closed, weP have x M; this establishesP L0 = L. Since →L is a decreasing intersection of subspaces, only finitely∈ many L are distinct. ∩>0   It follows that L = L = L, for some particular  > 0. >0  0 0 For  we have∩ that x M and x  implies x L, as required. 2 0 ∈ k k ≤ 0 ∈ Combining the above results, we obtain the desired characterization of #-closure.

Theorem 2.33 M Qn is #-closed if and only if M is a topologically closed ZZ-module. ⊆ Proof: If M Qn is #-closed, then M is a constrained ZZ-module, by Proposition 2.21(iv). #⊆ n ZZ ∞ Let M = N = x Q : xy y N and xi i=1 M with xi x as i . By linearity, x y{ ∈xy y N∈ and∀ since∈ x} y ZZ{ }x , y⊂ N, we must→ have xy→∞ZZ. i → ∀ ∈ i ∈ ∀ i ∀ ∈ ∈ Therefore x M, establishing that M is topologically closed. For the converse,∈ Proposition 2.30 implies M = L +(M Lo ), where L is the lineality of M. o ∩ By Corollary 2.26, it now suffices to show that M L is a finitely generated ZZ-module. o It is clear that M L is a topologically closed ZZ-module∩ with lineality 0 . Thus Lemma 2.32∩ implies M Lo x : x  = 0 , for some > 0.{ } ∩ ∩{o k k ≤ }o { } Hence 0 is not a limit point of M L , so M L is finitely generated, by (2.11). 2 ∩ ∩ Recall that nonempty affine spaces are translations of subspaces (Theorem 1.26) and that the integral elements of a (rational) subspace are a lattice (Exercise 2.28). It then follows that the integral elements of a nonempty (rational) affine space are an integral translation of a lattice. In terms of linear systems, let A Qm×n and b Qm; then M = x ZZn : Ax =0 is a lattice and x ZZn : Ax = b = M +∈ z , for any z∈ ZZn which satisfies{ ∈Az = b. } We conclude this{ ∈ section by indicating} an{ } affine generalizati∈ on of the equivalence given in Theorem 2.24 and Corollary 2.27, i.e., the equivalence between sets finitely generated via integral and rational combinations and sets finitely constrained by orthogonality and integrality restrictions. Theorem 2.34 Let S = yR + zT : y Qr, z ZZt + a , where a Qn, R Qr×n, T Qt×n. Then S ={ x Qn : Ax∈ = b, (Cx∈ }d) { ZZ} p , for b ∈Qm,d ∈Qp, A Q∈m×n,C Qp×n. { ∈ − ∈ } ∈ ∈ ∈ ∈ Proof: If S = , take A = [0,..., 0], b =1. If S = , we define S0 Qn+1 as: ∅ 6 ∅ ⊆ R 0r 0 r ZZt ZZ S = (y,z,zt+1)  T 0t  : y Q , z , zt+1 . { a 1 ∈ ∈ ∈ }     Applying Theorem 2.24 to S0 we get A Qm×n, b Qm, C Qp×n, d Qp, for which ∈ ∈ ∈ ∈

0 x x p S = (x, xn+1) : [A b] =0, [C d] ZZ . { − " xn+1 # − " xn+1 # ∈ } Now x S (x, 1) S0 Ax = b, (Cx d) ZZp, completing the proof. 2 ∈ ⇔ ∈ ⇔ − ∈ 26 Theorem 2.35 Let S = x Qn : Ax = b, (Cx d) ZZp , where b Qm,d Qp, A Qm×n,C Qp×n. Then S ={ ∈yR + zT : y Qr, z− ZZt∈+ }a , for a ∈Qn, R ∈Qr×n, T∈ Qt×n. ∈ { ∈ ∈ } { } ∈ ∈ ∈ Proof: If S = , take R and T vacuous, i.e., r = t =0. If S= , define S0 Qn+1 as: ∅ 6 ∅ ⊆ S0 = (x, x ) Qn+1 : Ax bx =0, (Cx dx ) ZZp, x ZZ . { n+1 ∈ − n+1 − n+1 ∈ n+1 ∈ } By Corollary 2.27, there exist U Qu×n,V Qv×n s.t. S0 = yU + zV : y Qu, z ZZv . ∈ 0∈ { ∈ ∈ } Note that we have xn+1 ZZ (x, xn+1) S . This is possible only if the∈ final∀ column∈ of U is 0; to see this, take z = 0 and y as arbitrary (rational) multiples of the unit vectors. Since S0 is unaffected by unimodular row operations on V , we may further assume that the final column of V has a single nonzero entry, say vi,n+1 = 0. 0 6 Moreover, x S (x, 1) S ; hence S= vi,n+1 = 1. Now, in order∈ to⇔ obtain the∈ desired form6 ∅⇒ for S, we take R as the submatrix of U given by the first n columns, a as the first n components of the ith row of V , and S as the submatrix of V obtained by deleting the ith row and final column. 2

By Theorem 2.24 and Corollary 2.27, we know that any set which can be expressed in the form x Qn : Ax =0, Cx ZZp can also be expresed as the sum of a (rational) subspace { ∈ ∈ } and a (rational) lattice, and conversely. The generalization of this provided by the previous two theorems is that any set of the form x Qn : Ax = b, (Cx d) ZZp is the sum of a { ∈ r− ∈ } (rational) affine space and a (rational) lattice ( yR + a : y Q + zT : z ZZt in the notation of Theorems 2.34 and 2.35), and conversely.{ ∈ } { ∈ }

Algorithmic Considerations

Here we consider the Hermite normal form of Theorem 2.3 in column format; i.e., given an m n rational matrix A of rank m, there is a unimodular matrix U for which AU = H × satisfies: hij = 0, i j. The proof of Theorem 2.7 shows how this form may be used to address≤ the decision problem ? x ZZn : Ax = b . First, using {∃ ∈ } Algorithm 1.39, we determine in polynomial time whether Ax = b has a rational solution. If not, no integral solution exists and we are done. If so,{ (2.7) shows} we may assume A integral and of full row rank; Algorithm 1.39 is used to determine and delete dependent equations. Recall the earlier discussion concerning the use of composite operations (now on columns of A) to determine the Hermite normal form. Where row i [ a a ], ↔ ··· ij ··· ik ··· let g = gcd(aij, aik)= paij + qaik. Then post-multiplication of A by the unimodular matrix −aik aij U which is the identity matrix, except for entries ujj = p, ujk = g , ukj = q, ukk = g , replaces a g and a 0. ij ← ik ← Consider the obvious algorithm for computing H suggested by these composite operations: (i) gcd computation can be done in polynomial time; (ii) at most n2 composite operations are

27 required, as some entry is reduced to zero by each operation; (iii) any composite operation requires at most 3n arithmetic operations. Does this algorithm compute H in polynomial time? Maybe ... but inordinate growth of matrix entries can be exhibited over the course of the computation, similar to that of Exercise 1.38. Matrix entries could square with each row iteration, though no explicit example of this is known. E.g., for 8 8 matrices with × random integral entries smaller than 216, entries grew to size 2432 using this algorithm. However, as shown in P. Domich, R. Kannan, and L. Trotter, Math. of O.R. 12(1987)50–59, all computation can be carried out modulo the determinant magnitude for a column basis of A, which will insure that the size of intermediate numbers encountered remains bounded by a polynomial in the size of the original data.

Lemma 2.36 If H is the Hermite normal form of A and the columns of C are in the lattice H 0 A C of the columns of A, then H0 = is the Hermite normal form of A0 = . " 0 I # " 0 I #

Proof: Suppose AU = H and AV = C, where U is unimodular and V is integral. U V Then U 0 = − is unimodular and A0U 0 = H0. 2 " 0 I #

Lemma 2.37 Suppose A = [B N], with det(B) = δ > 0. | | Then the vectors δe , 1 i m, are in the lattice generated by the columns of A. i ≤ ≤ δB−1 Proof: By Cramer’s rule, δB−1 is integer-valued; moreover, A = δI. 2 " 0 # Lemma 2.37 shows we may take C = δI in (2.36). Applying unimodular column operations (row-by-row) to [A δI] produces [H 0], where H is the Hermite normal form of A. At any point in the computation, should the magnitude of any entry exceed δ, we simply use further unimodular column operations involving that entry and the appropriate column of δI to reduce the entry modulo δ. Since A is m n of rank m, δ is bounded above by m!αm (mα)m, for α the largest magnitude of an× element of A. Thus the size of δ is at ≤ most m(log2 m + log2 α), and so this is a polynomial-time algorithm for computing H.

Exercise 2.38 Specify in detail a polynomial-time algorithm to compute the Hermite normal form for A Qm×n of full row rank. 2 ∈ Exercise 2.39 Give an algorithm which terminates with either y satisfying (i) or x satisfy- ing (ii) of Theorem 2.7 in time polynomial in the size of A Qm×n and c Qn. 2 ∈ ∈ The basis reduction procedure of Algorithm 2.17 is also a polynomial-time algorithm. To see this, recall that the validation in Theorem 2.18 shows that the number of iterations ` satisfies 3 ` n(n−1) 3 1 < ( 4 ) (maxi ai ) . Thus 0 <`(log2 4 )+ n(n 1) log2 nα, where α = maxi,j aij ; since 3 k k − | | log2 4 .415, we obtain ` < 2.5n(n 1) log2 nα. Moreover, it is clear that the number of arithmetic≈ − operations performed at− each iteration is bounded by a polynomial function

28 of n and α, so it follows that the overall number of arithmetic operations executed by the algorithm is polynomially bounded. It remains to show that the size of intermediate data encountered over the course of the execu- tion is polynomially bounded. First, as in the proof of Theorem 2.18, we note that throughout the execution we have det(A¯tA¯ ) a 2 a 2 (nα)2n, for 1 i n . (Recall that the i i ≤k 1k ···k ik ≤ ≤ ≤ a ZZn are the columns of the original matrix A and A¯ denotes the matrix with columns i ∈ i a¯1,..., a¯i given by members of an updated basis encountered during execution.) Also, from ¯ ¯ ¯t ¯ −1 ¯t ¯ ¯t ¯ ZZn Exercise 1.44(ii), we know bi+1 =¯ai+1 Ai(AiAi) Aia¯i+1, which implies bi+1det(AiAi) . − ¯ 2n ∈ Thus during execution the denominator of any bi is at most (nα) , so its size is at most ¯ −2n ¯ 2 ¯ 2 ¯ 2 ¯t ¯ ¯t ¯ 2n log2 nα and bi (nα) . Moreover, b1 bi−1 bi = det(Bi Bi) = det(AiAi), k k ≥ k k ···k k k k 2 and hence ¯b 2 = det(A¯tA¯ ) ¯b −2 ¯b −2 (nα)2n(nα)2n(n−1) = (nα)2n , so that the k ik i i k 1k ···k i−1k ≤ numerator of any ¯bi is also polynomially bounded. Thus the size of any ¯bi during execution is bounded by a polynomial function of the original data parameters n and α. For thea ¯i, we have the relationa ¯i = ¯bi +µ ¯i1¯b1 + +µ ¯i,i−1¯bi−1, which, as a consequence ¯ ¯ 2 ¯ ···2 2 ¯ 2 2 ¯ 2 of the orthogonality of b1,..., bi, yields a¯i = bi +¯µi1 b1 + +¯µi,i−1 bi−1 . Weak k k k k k 2 k ··· k k reduction and the earlier analysis thus give a¯ 2 n (nα)2n , which shows that the size of k ik ≤ 4 anya ¯i is also bounded by a function of n and α. Finally, observe that during weak reduction ¯ the bi do not change, and anya ¯i encountered during this phase of an iteration remains of polynomially bounded size, due to the order in which the weak reduction process is carried out.

Theorem 2.40 (Lov´asz 1982) Algorithm 2.17 executes in polynomial time. 2

29 3 Convex Cones

n For λi IR and ai IR i, the expression λ1a1 + + λmam is a conical combination ∈ ∈ ∀ m ··· provided λi 0 i; if, in addition, i=1 λi =1, it is a . When we restrict ≥ ∀ n the scalar multiplication axioms forP subspaces of IR to nonnegative multipliers, the resulting sets, containing 0 and closed under conical combinations, are (convex) cones. A set which is closed under convex combinations is convex (whether or not it contains 0). As with linear spaces, conical dependence of a / S on S IRn, requires a linear combination in which a has coefficient -1 and the remaining∈ terms are⊆ distinct elements of S with positive coefficients; convex dependency is defined similarly. The conical span (cone generated by S) and convex span ( generated by S) are given, respectively, by: K(S)= S a IRn : a is conically dependent on S , n ∪{ ∈ } C(S)= S a IR : a and a ,...,a S satisfy a convex dependency relation . ∪{ ∈ 1 m ∈ } The properties of Exercise 1.2 remain valid for conical span; thus cones are precisely those S IRn for which S = K(S). Similarly, convex sets satisfy S = C(S). Note the analogy to⊆ the situation with linear subspaces and affine spaces. Once again, 0 K(S) for any S IRn, so that is not a cone; on the other hand, = C( ), so that is convex.∈ Moreover, ⊆ ∅ ∅ ∅ ∅ S C(S) K(S), so cones are convex. Similarly, affine spaces and subspaces are convex and,⊆ moreover,⊆ subspaces are cones. The class of cones is closed under finite summation and arbitrary intersection; that is, if n K1,K2 IR are cones, then so is their sum, K1 + K2 = u + v : u K1, v K2 , and for cones⊆ K IRn i I, the set K is also a cone.{ Note that∈K +( ∈K), i.e.,} all i ⊆ ∀ ∈ i∈I i − vectors of the form x y for x, y TK, is the subspace L(K) generated by K, hence the smallest subspace containing− K; thus∈ K +( K) and K are of the same dimension. In particular, when a cone is finitely generated, i.e.,− for a cone of the form K = yA : y 0 m×n m { ≥ } with A IR , it is clear that L(K) = yA : y IR . In general, cones may contain subspaces;∈ e.g., (x , x ) IR2 : x 0 {contains∈ the subspace} (0, x ) : x IR . The { 1 2 ∈ 1 ≥ } { 2 2 ∈ } lineality (subspace) of cone K is K ( K) = x : x K, x K , the largest subspace contained in K. The cone 0 is trivial∩ −; any other{ cone∈ is −nontrivial∈ }. When K ( K) is trivial, K is pointed. In analogy{ } with Proposition 2.30 for (rational) ZZ-modules, any∩ − cone is the sum of its lineality and a pointed cone; the earlier proof remains valid here.

Proposition 3.1 Suppose cone K has lineality subspace S. Then K = S +(K So ) and K So is pointed. 2 ∩ ∩ Any vector 0 = a IRn defines a homogeneous halfspace x : ax 0 . Note that a halfspace is a cone. Any6 intersection∈ of homogeneous halfspaces{ is a constrained≥ } cone; i.e., where S IRn, S+ = x : Sx 0 is a constrained cone. By Sx 0 we mean sx 0 s S. +⊆ { ≥ } ≥ ≥ ∀ ∈ S is the conical dual of S, geometrically, those points which make an acute angle with each element of S. If K is a cone, then K + is the dual (polar) cone of K. Often the polar cone is defined using inequalities of the form sx 0; we use stipulations sx 0 in order to emphasize closure under nonnegative linear combinations≤ for cones. ≥

30 Thus the rows of matrix A IRm×n also define a finitely constrained (polyhedral) cone K = x : Ax 0 = A+ . For∈ such cones the following exercise shows how to determine the components{ of≥ the} previous decomposition, viz. the lineality S of K and the pointed cone o K S . ∩ Exercise 3.2 Let S be the lineality subspace of K = x : Ax 0 , where A IRm×n. (i) Express S in terms of matrix A. { ≥ } ∈ (ii) How can one determine a basis for S? (iii) If the rows of B span S, show K So = x : Bx 0, Bx 0, Ax 0 . B ∩ { ≥ − ≥ ≥ } 2 What is the rank of  B ? −A     Exercise 3.3 Let S IRn. o ⊆ + (i) Show that S is the lineality of S – cf. proof of Theorem 2.31. (ii) What condition on S ensures that S+ is pointed? 2

The elementary duality properties of subspaces (1.10) and ZZ-modules (2.21) remain valid for cone duality. When S = S++ , i.e., when equality holds in Proposition 3.4(ii), we say that the set S is +-closed.

Proposition 3.4 For S, T IRn we have: + + ⊆ (i) S T S T ; (ii) S⊆ S++⇒; ⊇ (iii) S⊆+ = S+++ ; (iv) S = S++ S is a constrained cone; (v) A IRm×n⇔, K = yA : y 0 = K(A) K + = x : Ax 0 . +∈ + { ≥ } ⇒ { ≥ } (vi) S = [K(S)] . (vii) (S T )+ = S+ T + . 2 ∪ ∩ The following exercises summarize further fundamental properties of cone duality. Note, in particular, that part (vii) of Exercise 3.5 is a dual result for Proposition 3.1 and that Exercise 3.6 shows that full-dimensionality and pointedness are dual properties.

Exercise 3.5 Establish the following relations: + + + n (i) (K1 + K2) = K1 K2 , for cones K1,K2 IR ; + + ∩ + ⊆ n (ii) K1 + K2 (K1 K2) , for cones K1,K2 IR ; o + ⊆ ∩ n ⊆ (iii) S = S , for subspace S IR ; oo (iv) S++ S = L(S), for S ⊆ IRn; (v) L(S++⊆)= L(S), for S IR⊆n; o + ⊆ n (vi) S L(K ), for cone K IR with lineality S; o o (vii) K ⊇= L(K) (K + [L(K)]⊆) and K + [L(K)] is full-dimensional, for cone K IRn; (viii) prove or find∩ a counterexample: K(S)= S++ . 2 ⊆

31 Exercise 3.6 Let K IRn be a cone. (i) Give examples⊆ of cones in IR2 which are both pointed and full-dimensional, pointed but not full-dimensional, full-dimensional but not pointed, and neither full-dimensional nor pointed. (ii) Show that K full-dimensional implies K + pointed. (iii) Show that K pointed implies K + full-dimensional, provided K = K ++ . (iv) Find a pointed cone K IR2 whose dual K + is not full-dimensional. 2 ⊆

Finite Cones

Although many of the concepts and results of our development thus far for cones are anal- ogous to earlier material on subspaces, we cannot proceed further in strict analogy, because for cones there is no finite basis result (Theorem 1.7). The ice-cream cone of the follow- ing exercise demonstrates this and shows, moreover, that Corollary 2.2 fails for cones; i.e., subcones of finitely generated cones (IR3 is a finitely generated cone.) need not be finitely generated.

3 2 2 2 Exercise 3.7 Let K = (x1, x2, x3) IR : x1 + x2 x3, x3 0 . Show that K is a cone,{ yet K is∈ not finitely generated.≤ 2≥ }

Even though the global finiteness result of Theorem 1.7 fails in the cone setting, there is a local analogue. Whereas in (1.7) a single subset of S containing dim(L(S)) elements suffices to generate the entire subspace L(S), the following classical theorem shows that each element of K(S) can again be generated using at most dim(K(S)) elements from S, but now different choices of x may require different generating subsets of S.

Theorem 3.8 (Carath´eodory 1911) Let 0 = x K(S) with dim(K(S)) = m. Then x is a conical combination of linearly independent6 ∈ (thus at most m) elements of S.

Proof: Since x K(S), we have x = λ1s1 + + λpsp, where p 1 and λi 0,si S i. If the s are linearly∈ dependent, then 0 = µ ···s + + µ s with≥ not all µ =≥ 0. ∈ ∀ i 1 1 ··· p p i Multiplying this dependency relation by 1 if necessary, we may assume µi > 0 for some i. 0 − Defining δ = min λi/µi : µi > 0 and λi = λi δµi 0 i, { p } −p 0 ≥ ∀ 0 we have x = x 0= i=1(λi δµi)si = i=1 λisi and i : λi > 0 > i : λi > 0 . − − 2 |{ }| |{ }| Iteration of this process establishesP the theorem.P

We will defer consideration of general convex cones to a later chapter, restricting attention for now to cones which are finitely generated. This will enable further development of the theory of convex cones in analogy with that for subspaces. Specifically, we now introduce an elimination scheme for linear inequalities which will play the same role here that Gaussian elimination (1.11) did for equality systems. This procedure is used to establish that finitely generated and finitely constrained cones are the same objects – called simply finite cones. As in (1.11), the procedure is stated for general (inhomogeneous) linear inequality systems.

32 Proposition 3.9 (Fourier-Motzkin Elimination 1827) Suppose A IRm×n, b IRm and denote the linear inequality system Ax b as (I). ∈ ∈ { ≥ } “Eliminate” xn to obtain system (II) as follows: (i) a =0 a x + + a x b is in (II); in ⇒ i1 1 ··· i,n−1 n−1 ≥ i ain > 0 ai1 ak1 ai,n−1 ak,n−1 bi bk (ii) ( a a )x1 + +( a a )xn−1 ( a a ) is in (II). " akn < 0 # ⇒ in − kn ··· in − kn ≥ in − kn Then (I) is consistent if and only if (II) is consistent.

Proof: ( ) This is clear, since x1,...,xn−1, xn satisfy (I) x1,...,xn−1 satisfy (II). ( ) Given⇒ x ,...,x that satisfy (II), we use the relations⇒ of system (II) to define x . ⇐ 1 n−1 n From (ii) we have, ain > 0, akn < 0 1 (b a x∀ a x ) 1 (b a x a x ). akn k k1 1 k,n−1 n−1 ain i i1 1 i,n−1 n−1 − −···−1 ≥ − −···− Define µ = min(k:a <0) (bk ak1x1 ak,n−1xn−1) , with µ =+ if akn < 0; kn { akn − −···− } ∞ 6 ∃ λ = max 1 (b a x a x ) , with λ = if a > 0. (i:ain>0) ain i i1 1 i,n−1 n−1 in Now x ,...,x , x satisfy{ (I)− provided−···−x is chosen so that} λ x −∞µ. 2 6 ∃ 1 n−1 n n ≤ n ≤ Geometrically, Fourier-Motzkin elimination is the same as Gaussian elimination. The elim- ination of xn from system (I) produces system (II) with variables x1,...,xn−1 for which (x1,...,xn−1) satisfies (II) if and only if (x1,...,xn−1, xn) satisfies (I) for some xn; i.e., the elimination process projects the set of solutions for (I) onto the first n 1 coordinates to − produce system (II) – recall Exercise 1.18. With this in mind, the assertion that (I) is con- sistent if and only if (II) is consistent is evident. Algorithmically, Fourier-Motzkin elimination can be used to determine consistency of a lin- ear equality system in nonnegative variables (i.e., to determine whether a given point is in a finite cone whose generators are known) or (equivalently) to determine consistency of a linear inequality system. These are central issues of linear programming, the topic of focus later in Chapter 6. Exercise 3.10 Adapt (3.9) to test consistency of Ax b , where A IRm×n, b IRm; i.e., find and validate an algorithm which determines{ a≥ solut} ion or proves∈ none exists.∈ 2 Although the consistency test for linear inequality systems based on Fourier-Motzkin elimi- nation is finite, it is important to recognize that the primary value of this tool is theoretical, rather than practical. Indeed, the computational effort required by this procedure may grow exponentially in the size of the initial linear system – in contrast to the polynomial-time pro- cedures of Exercises 1.42 and 2.39. This behavior is demonstrated in the following example from Schrijver, Theory of Linear and Integer Programming (Wiley, 1986). Exercise 3.11 For positive integers p and n =2p + p +2, consider the system Ax b , where b IRm and A IRm×n has rows e e e , 1 i

33 The proof of the following theorem demonstrates the power of Fourier-Motzkin elimination. Theorem 3.12 (Weyl 1935) Every finitely generated cone is polyhedral. Proof: Let K = yA : y 0 , where A IRm×n. Then we have: { ≥ } ∈ K = x : x = yA, y 0 is consistent = {x : {x yA 0≥, } x + yA 0, y} 0 is consistent = {x : {Bx− 0 ,≥ by Fourier-Motzkin− ≥ elimination≥ } of y. 2 } { ≥ } As with Theorem 1.12 for linear subspaces, there are several important consequences of Weyl’s Theorem. First, by combining Theorem 3.12 and Exercise 3.4(iv), we obtain a char- acterization of +-closed, finitely generated sets. Next, the two cones obtained by interpreting the rows of a matrix as generators and constraints, respectively, are dual to one another. We also obtain an alternative theorem – compare Theorems 1.15 and 2.7 – the celebrated Farkas Theorem for nonnegative solutions to linear equality systems. Finally, the converse of Theorem 3.12 also holds, a result which was obvious in the setting of linear spaces, since every subspace is finitely generated. Corollary 3.13 For K IRn with finite (conical) basis, K = K ++ K is a cone. 2 ⊆ ⇔ m m×n Corollary 3.14 Let K = yA : y IR+ and J = x : Ax 0 , where A IR . + + { ∈ } { ≥ } ∈ Then K = J and J = K. 2 Theorem 3.15 (Farkas 1896) For A IRm×n and c IRn, exactly one holds: (i) y 0 such that yA = c; (ii)∈ x IRn such∈ that Ax 0, cx< 0. ∃ ≥ ∃ ∈ ≥ Proof: Let K = yA : y 0 ; applying (3.12) to K yields K = u : Bu 0 . Clearly [(i) and{ (ii)], else≥ } 0 > cx = yAx 0, a contradiction.{ ≥ } Now (i)¬ c K, so (i) c K cx <≥0 for some x B. ⇔ ∈ ¬ ⇒ 6∈ ⇒ ∈ And since each row of A is in K (using y = unit vectors), we also have Ax 0. 2 ≥ Theorem 3.16 (Minkowski 1896) Every polyhedral cone is finitely generated. Proof: Let J = x : Ax 0 , K = yA : y 0 . + { ≥ } { ≥ } By (3.14), J = K, a finitely generated cone, which is polyhedral by (3.12). Applying (3.14) again, the polyhedral cone K has a finitely generated dual K + . Since J = K + , the proof is complete. 2

The Weyl-Minkowski duality of Theorems 3.12 and 3.16 asserts that finitely generated and polyhedral cones are, in fact, the same, justifying the terminology finite cones. Corollary 3.14 shows that duality provides a complete correspondence between generators and constraints – but there are fundamental differences in the tools used to establish this equivalence. The direction generators constraints is elementary, as in Exercise 3.4(v), while the compan- ion statement constraints→ generators expressed by J + = K in Corollary 3.14 is deeper, → ++ requiring the consequence of (3.12) that K = K for finitely generated cones.

34 It is instructive to interpret the Farkas Theorem using duality. For K = yA : y 0 , alternative (i) of (3.15) says c K, while (ii) asserts that cx < 0 for some{ x K≥+ , or} equivalently, c / K ++ . Thus the∈ statement (i) (ii) of the Farkas Theorem∈ may be ∈ ++ ++⇔ ¬ rephrased as c K [c / K ]; i.e., K = K . By Exercise 3.4(iv), we conclude that any finitely generated∈ ⇔ cone ¬ can∈ be expressed as the set of solutions for a linear homogeneous inequality system. Weyl’s Theorem gives a sharper result: any finitely generated cone must have a finite inequality representation. The geometric content of the Farkas Theorem is that either the vector c is in the cone generated by the rows of A, or it is not, in which case there is a hyperplane (containing the origin) specified by the vector x in alternative (ii), so that the cone lies on one side of the hyperplane (Ax 0 zx 0 z K = yA : y 0 ) and c lies strictly on the other ≥ ⇒ ≥ ∀ ∈ { ≥ } side (cx < 0). That is, z : zx =0 is a separating hyperplane for c and K. Note again the sense in which Weyl’s Theorem{ is sharper:} since K is finitely generated, it is polyhedral, so a separating hyperplane can be chosen from a finite list of candidates independent of c; any polyhedral representation for K provides such a list. Thus the Weyl Theorem converts the existence assertion of the Farkas Theorem into a finite combinatorial statement. The following exercises develop several consequences of Weyl-Minkowski finite cone duality. Exercise 3.17 For finite cones K,K ,K IRn, show: 1 2 ⊆ (i) K1 + K2 and K1 K2 are finite cones; (ii) IRn = K K + , a∩ conical analogue for Theorem 1.17; − (iii) equality holds in parts (ii) and (vi) of Exercise 3.5; (iv) both results in (iii) may fail for cones which are not finite. 2 2 1 Exercise 3.18 Determine constraints for K( ) via Fourier-Motzkin elimination. 2 " 1 2 # Exercise 3.19 For K = yA+zB : y 0, z unrestricted and J = x : Ax 0, Bx =0 , show that K + = J and{J + = K. 2≥ } { ≥ } Exercise 3.20 Show that x : Ax 0 = 0 , for any A IRn×n. 2 { ≥ } 6 { } ∈ Exercise 3.21 For A having no row consisting entirely of zeroes, show: yA =0, y 0, y =0 has no solution { yA : y 0≥is pointed6 } ⇔{ ≥ } x : Ax 0 is full-dimensional ⇔{Ax > 0≥ has} a solution. 2 ⇔{ } From the previous exercise, a finite cone is pointed if and only if its dual is full-dimensional. Thus, pointedness of x : Bx 0 is equivalent to full-dimensionality of zB : z 0 . Since L( zB : z 0{) = L(B≥), this} is equivalent – recall Exercise 3.2(iii{) – simply≥ to} { ≥ } the condition that rank(B) = n. Finally, note that this does not hold for all cones; e.g., (x , x ) : x > 0 (0, 0) is pointed, yet its dual (x , 0) : x 0 is not full-dimensional { 1 2 2 }∪{ } { 1 1 ≥ } – recall Exercise 3.6(iv).

35 The Farkas Theorem is often useful for characterizing existence of a solution to a linear system. Consider, e.g., the system Ax b , for A IRm×n, b IRm. This linear system is equivalent to Ax I z = b, z { 0 .≥Now,} any vector∈ x IR∈n can be expressed as the { − m ≥ } ∈ difference of two nonnegative vectors. Hence the original system is equivalent to the system Au Av I z = b, u 0, v 0, z 0 , which is of the form required in alternative (i) { − − m ≥ ≥ ≥ } of Theorem 3.15. We conclude that the system is consistent if and only if there is no vector y IRm for which yA 0, yA 0, yI 0, and yb < 0, which yields the following. ∈ ≥ − ≥ − m ≥ Corollary 3.22 Suppose A IRm×n and b IRm. ∈ ∈ Then Ax b is consistent yA =0, y 0, yb> 0 is inconsistent. 2 { ≥ } ⇔ { ≥ } Exercise 3.23 Prove the following version of the Farkas Theorem. yA c, y 0 has a solution Ax 0, x 0, cx< 0 has no solution. 2 { ≥ ≥ } ⇔ { ≥ ≤ } Exercise 3.24 Matrix A IRm×m is positive semi-definite provided yAy 0 y IRm. For A positive semi-definite,∈ show yA 0, y 0, y =0 always has≥ a solution.∀ ∈ 2 { ≥ ≥ 6 } Exercise 3.25 For cone K finite, show x y > 0 for some x K, x 0, y K + , y 0. − ∈ ≥ ∈ ≤ Show by example that the result may fail when K is not finite. 2

Exercise 3.26 For A IRm×n, show that exactly one of the following alternatives holds: (i) Ax =0, x> 0∈has a solution; (ii) yA 0, yA =0 has a solution. 2 { } { ≥ 6 } Observe the crucial role played by Fourier-Motzkin elimination in the above development. All of the classical results follow easily, once validity of the elimination scheme has been established. The Weyl Theorem (proof) shows how Fourier-Motzkin elimination can be used to go from a generator description of a finite cone to an inequality description and Corollary 3.14 guarantees that we can go from constraints to generators similarly. As with Gaussian elimination (Exercise 1.18), Fourier-Motzkin elimination for matrices has a natural dual counterpart consisting of simple truncation. In the following schematic representation, we make the latter statement precise, interpreting the elimination scheme of Proposition 3.9 as being applied to matrix A of the linear system Ax 0 to produce matrix A0 which in turn defines the linear system A0x0 0 . { ≥ } { ≥ } Diagram 3.27 As dual matrix operations, elimination and truncation obey the following:

B generates K + (i) K = x IRn : Ax 0 K = x IRn : Bx 0 (ii) { ∈ ≥ } −→ { ∈ ≥ }

FM 0 T RN 0  A A  B B    →  →   0  0 n−1 0 0 0 + 0  n−1 0 0 (iii) K = x IR : A x 0 (K ) = x  IR : B x 0 (iv) {y ∈ ≥ } + { y∈ + ≥ } K0 0 = K proj. to x : x = 0 (K0) 0 = K x : x = 0 × { } { n } × { } ∩ { n }

36 Discussion : The assertions in the above schematic are justified as follows. (i) Matrix A IRm×n defines the polyhedral cone K. ∈ (ii) The rows of matrix B generate K and hence give constraints defining K + , by 3.4(v). (iii) Applying Fourier-Motzkin elimination (3.9) to matrix A removes the last coordinate and yields matrix A0. Then A0 is used to define the polyhedral cone K0 IRn−1. It follows from the discussion following Proposition 3.9 (see also Exercise 1.19)⊆ that K0 = x0 IRn−1 :(x0, x ) K, x IR ; i.e., K0 is the projection of K into IRn−1. { ∈ n ∈ ∃ n ∈ } (iv) Matrix B0 results when the last column of B is simply truncated (deleted). Then to any 0 0 0 0 0 row b B there corresponds a row b =(b , bn) B K. Hence B K . Moreover, ∈ 0 0 ∈ ⊂ 0 ⊂ for any x K , there is some xn IR such that x =(x , xn) K. Since B generates 0 ∈ ∈ 0 0 ∈ 0 0 K, (x , xn) = yB for some y 0, and so x = yB . Thus the rows of B generate K , and it follows that (K0)+ = ≥x0 IRn−1 : B0x0 0 , as asserted in (iv). Furthermore, if x0 IRn−1 and x =(x0, 0) { K∈+ , then Bx ≥0 implies} B0x0 0, so x0 (K0)+ . And ∈ + ∈ ≥ ≥ ∈ + if x0 (K0) , clearly B0x0 0 implies Bx 0 for x = (x0, 0), so (x0, 0) K . Thus (K0)+∈= x0 IRn−1 :(x0, 0)≥ K + , the intersection≥ of K + with IRn−1 ∈0 . 2 { ∈ ∈ } ×{ } Thus when matrices A and B define a dual pair of polyhedral cones in IRn, the matrix operations of Fourier-Motzkin elimination applied to A and (dually) truncation applied to B produce matrices A0 and B0 which define the dual pair of polyhedral cones K0, (K0)+ in IRn−1. Note also that K0 IR and (K0)+ 0 are a dual pair of cones in IRn. In geometric terms, × ×{ } n projection of a polyhedral cone onto the hyperplane H = x IR : xn = 0 corresponds under duality to intersection of the dual (also polyhedral) cone{ ∈ with H. } Exercise 3.28 Show, more generally, that for a polyhedral cone K and its dual K + : (i) projection of K onto the hyperplane H = x IRn : ax =0 , where 0 = a IRn, { ∈ + } 6 ∈ corresponds under duality (in H) to intersection of K with H. (ii) projection of K onto the subspace S = x IRn : Ax =0 , where A IRm×n, { ∈ + } ∈ corresponds under duality (in S) to intersection of K with S. 2

Minimal Representations As a further application of the Farkas Theorem, consider the question of how tight a specific generator or inequality representation may be for a finite cone. Specifically, when can a particular generator or inequality be removed from a representation without changing the cone? Observe that (ii) in Theorem 3.15 is equivalent to the assertion Ax 0 cx 0. ¬ ≥ ⇒ ≥ When this is the case, we say that the (homogeneous) inequality cx 0 is implied by the system Ax 0 , is valid for the cone x : Ax 0 , and is inessential≥ (redundant) in the representation{ ≥ Ax} 0, cx 0 . If we{ restate the≥ Farkas} Theorem as [ y 0 : yA = c] { ≥ ≥ } ∃ ≥ ⇔ [Ax 0 cx 0], then it is evident that an inequality is inessential if and only if it is a conical≥ combination⇒ ≥ of the remaining inequalities. Corollary 3.29 K + determines (precisely) the valid inequalities for the finite cone K. 2

37 Dually, a generator is inessential if and only if it lies in the cone spanned by the other generators; by Carath´eodory’s Theorem, no more than dim(K) of the remaining generators are required to demonstrate this. A similar statement holds for an inessential inequality. Clearly, given a specific inequality (generator) description of a finite cone, recursive deletion of an inessential inequality (generator) ultimately produces a minimal representation for the cone. For example, the above discussion shows row a A to be inessential in the description i ∈ K = yA : y 0 if and only if the system yA = ai, y 0, yi = 0 is consistent. Note that (Exercise{ ≥ 3.10)} Fourier-Motzkin elimination{ can be us≥ed to settle} this question.

Exercise 3.30 For A = ( 1, 2), (4, 1), (3, 3) , show that (3, 3) is inessential for K(A); ...that (4, 1) is essential{ − for K(A). 2 }

When is a minimal description unique? Of course, multiplication of any inequality or gen- erator by a positive scalar does not change the cone, so uniqueness here can mean only up to positive scaling. We characterize uniqueness first for generators and then use duality to derive the corresponding result for constraints.

Theorem 3.31 A finite cone has a unique (up to positive scaling) minimal set of generators if and only if it is a line or it is pointed.

Proof: Suppose K IRn is a finite cone. ⊆ ( ) For K trivial, a line, or a ray, the conclusion is obvious. Suppose⇐ A = a ,...,a , B = b ,...,b are minimal generating sets, m,p > 1. { 1 m} { 1 p} We show that each ak A is a positive multiple of some bi B. Select a A; then a ∈= p β b , with all β 0. ∈ k ∈ k i=1 i i i ≥ If ak = βibi for some i, weP are done, so suppose this is not the case. Without loss of generality, ak = b + c, for 0 = b = β1b1 K, 0 = c = i>1 βibi K. Expressing b and c using A, we get b = m 6 λ a , c = ∈m µ a6 , with all λ ,µ ∈ 0. i=1 i i i=1 i i P i i ≥ Thus (1 λk µk)ak = b + c (λk + µkP)ak = i6=k(λiP+ µi)ai; we consider 3 cases. − − − λi+µi (1 λk µk) > 0 ak = i6=k aPi ak inessential, contradicting A minimal. − − ⇒ 1−λk−µk ⇒  λi+µi (1 λk µk) < 0 ak =P i6=k ai ak, ak K, contradicting K pointed. − − ⇒ − λk+µk−1 ⇒ − ∈ (1 λ µ )=0 If j = k andλ + µ > 0, then (λ + µ )a = (λ + µ )a , − k − k ⇒ 6 P j j − j j j i=6 k,j i i i so aj =0 or aj K, either of which is impossible; hence λj = µPj =0 j = k, ± ∈ 1 β1 ∀ 6 implying that b = λkak; i.e., ak = b = b1. λk λk ( ) We show the contrapositive. o o By⇒ (3.1), K = S +(K S ), with S = K ( K) and K S pointed (and finite). Let G = g ,...,g be∩ a minimal generating∩ − set for K S∩o (for K So = 0 , take G = ) { 1 p} ∩ ∩ { } ∅ and let b1,...,b` be a basis for S. Now K not{pointed implies} ` 1. If ` = 1, we must also have p ≥ 1, since K is not a line; ≥ b1,g1,...,gp , b1,g1 + b1,g2,...,gp are distinct minimal generating sets for K. If `>{±1, we use }b ,...,{± b G and b ,...,b} , ` b G. 2 {± 1 ± `} ∪ { 1 ` − i=1 i} ∪ P 38 For 0 = a IRn, the subspace generated by a is a line, λa : λ IR and the cone generated by a is6 a ray∈ , λa : λ 0 . Note that λa : λ IR +{ = x :∈ax }= 0 , a hyperplane, and that λa : λ {0 + = ≥x : ax} 0 , a halfspace.{ Thus∈ } (homogeneous){ lines} and hyperplanes, { ≥ } { ≥ } rays and halfspaces, are dual objects under cone duality, yielding the following dual result for uniqueness of inequality descriptions.

Corollary 3.32 A finite cone has a unique (up to positive scaling) minimal inequality rep- resentation if and only if it is a hyperplane or it is full-dimensional.

We will refer to the minimal generating sets of Theorem 3.31 as(conical) bases; the theorem provides a basis equicardinality result for lines and pointed finite cones. By decomposing a finite cone into the sum of its lineality and a pointed cone (Proposition 3.1), this equicardi- nality property extends to all finite cones; the dual counterpart follows from Exercise 3.5(vii).

Corollary 3.33 A finite cone K may be expressed K = uB + vC : v 0 , where: (i) the rows of B constitute a basis for K ( K); { ≥ } (ii) K [K ( K)]o = vC : v 0 is pointed∩ − and the representation minimal. 2 ∩ ∩ − { ≥ } Corollary 3.34 A finite cone K may be expressed K = x : Bx =0, Cx 0 , where: o (i) the rows of B constitute a basis for [L(K)] ; { ≥ } (ii) K +[K +( K)]o = x : Cx 0 is full-dimensional and the representation minimal. 2 − { ≥ } Basis equicardinality is a common feature of all three algebraic duality models thus far inves- tigated: subspaces (the Finite Basis Theorem – Theorem 1.7), rational lattices (the Hermite Normal Form – Theorem 2.3 and Exercise 2.4), and pointed finite cones (Theorem 3.31). The structural commonalities of bases in these three models actually persist to a deeper level. In generating subspaces, arbitrary linear combinations are allowed, and it follows from Exercise 1.3 that any two bases are related by an invertible transformation and, similarly, applying an invertible transformation comprised of unrestricted elementary operations from (1.3) to any basis yields another. For rational lattices, generation is restricted to integral combinations and Hermite normal form uniqueness implies that any two lattice bases are related by a unimodular transformation, while application of a unimodular transformation to any basis produces another basis. Here the admissible (i.e., unimodular) transformations preserve the integrality inherent in the lattice definition. For pointed finite cones, genera- tion is via nonnegative combinations, and Theorem 3.31 shows that in this setting we pass among conical bases using transformations comprised of permutations and positive scalings, the elementary operations which preserve nonnegativity. In each case, the transformations relating the bases are those corresponding to the types of linear combinations allowed in generation. Further similarities shared by these (and other) duality models are examined in Carvalho and Trotter, Math. of O.R. 14(1989)639–663.

39 Facial Structure of Finite Cones

Suppose now that K is given in polyhedral form, say K = x : Ax 0 , where A IRm×n. How can we then determine matrices B and C of (3.33)?{ Now, matrix≥ } B has rows∈ which constitute a basis for the lineality subspace of K, x : Ax =0 , so Exercise 1.34 shows how to determine matrix B. Finding matrix C of (3.33){ reduces} (see Exercise 3.2(iii)) to the question of recognition of members of the (unique) minimal set of generators for a polyhedral cone which is nontrivial and pointed, i.e., specified by a constraint matrix of full column rank. This situation is addressed by the following exercise.

Exercise 3.35 Let K = x : Ax 0 IRn be pointed, with minimal generating set G = . (i) Show λa G λ> 0 { 0 = a≥ K} ⊆and Ea =0 for some rank n 1 submatrix E of6 A∅. (ii) Determine∈ a∃ corresponding⇔ 6 dual∈ result. 2 −

In the exercise, K = x : Ax 0 = 0 is pointed, and thus has a unique minimal generating set (Theorem{ 3.31); the≥ subset} 6 of{ K} corresponding to any one of these generators, obtained by forcing a rank n 1 family of the relations Ax 0 to hold at equality, e.g., − { ≥ } x K : Ex = 0 = λa : λ 0 , is called an extreme ray of K. Of course, the members {of the∈ unique minimal} { generating≥ } set for K (the vectors a in the exercise) define a matrix A¯ for which x : Ax¯ 0 is the unique minimal inequality representation for K + (which, in the present{ setting,≥ is} a full-dimensional proper subset of IRn). For each a A¯, the set x K + : ax = 0 is a facet of K + . Thus the extreme rays of K and the facets∈ of K + are { ∈ } in one-to-one correspondence. This correspondence extends to provide a complete “facial” pairing between K and K + , which we now develop. Recall from Corollary 3.29 that to each c K + there corresponds a valid inequality cx 0 ∈ ≥ for K. Any such valid inequality determines a face of K given by x K : cx = 0 . Note that extreme rays and facets are obtained by forcing certain inequalities{ ∈ in a polyhedral} representation to hold at equality; this property, in fact, characterizes faces.

Theorem 3.36 Suppose F K = x : Ax 0 . Then F is a face of K F =⊆z K{: A z =0≥ }, for some A A. ⇔ { ∈ F } F ⊆ Proof: ( ) Suppose valid inequality cx 0 determines F ; i.e., F = z K : cz =0 . Corollary⇒ 3.29 implies c K + ; hence y∗A≥ = c for some y∗ 0. { ∈ } ∈ ∗ ≥ For AF we take the rows ai A for which yi > 0. ∗∈ ∗ ∗ Then z F 0= cz =(y A)z = y (Az)= yF (AF z); ∈ ∗ ⇒ since yF > 0 and AF z 0, this implies AF z = 0. Moreover, x K F 0 <≥ cx = y∗ (A x) a x> 0 for some a A . ∈ \ ⇒ F F ⇒ i i ∈ F ( ) Define c as the sum of the rows of AF ; then cx 0 determines F , ⇐as K x : cx 0 and F = z K : A z =0 ≥= z K : cz =0 . 2 ⊆{ ≥ } { ∈ F } { ∈ }

The equality subset for a face F is the unique maximal subsystem AF x 0 for which F = x : A x = 0 , i.e., the entire set of relations in the inequality{ system≥ } Ax 0 { F } { ≥ } 40 which hold at equality for all points of F . Since the equality set A x 0 is maximal, { F ≥ } for each row ai A AF we know aizi > 0 for some zi F . Thus, forz ˆ = i zi F we have a zˆ > 0 a ∈ A\ . The pointz ˆ is an interior point of∈ F ; clearly, every face contains∈ an i ∀ i 6∈ F P interior point. Interior points provide a simple means for establishing facial dimension.

Theorem 3.37 For face F x : Ax 0 , we have dim(F )= n rank(A ). ⊆{ ≥ } − F

Proof: This follows immediately from the fact that L(F )= x : AF x =0 . To see the latter, note that x F A x = 0, hence L(F ) { x : A x =0} . ∈ ⇒ F ⊆{ F } For the reverse inclusion, suppose AF x¯ = 0 and choosex ˆ interior to F . Then (1 + )ˆx x¯ =x ˜ F for sufficiently small > 0; thus (1+) xˆ 1 x˜ =x ¯ L(F ). 2 − ∈  −  ∈ We now show that the duality operation induces a natural pairing between all faces of a finite cone with those of its dual. Consider once again Exercise 3.35. Any member of the minimal generating set for K, say a K, defines a face F = λa : λ 0 of K = x : Ax 0 , and ∈ { ≥ } { ≥ } the equality set for F is a rank n 1 submatrix AF of A for which F = x K : AF x =0 . o − o + + { ∈ }+ Thus F = L(AF ) = x : ax = 0 and F K = x K : ax = 0 is the face of K which is defined by the{ essential inequality} for∩ K + , ax{ ∈0. } More generally, suppose F = x K : A x = 0 is≥ an arbitrary face of K, and that { ∈ F } dim(F )= n rank(AF )= p. Let b1,...,bp F be a basis for L(F ); then bix 0, 1 i p, are valid inequalities− for K + . By (3.36),∈ any intersection of faces is again≥ a face,≤ hence≤ + o + + K F = K x : b1x = 0 x : bpx = 0 is a face of K . Thus we consider o o F ∩F K + , associating∩{ face }∩···∩{F K to face F K}+ K + . Applying7→ ∩ this same correspondence⊆ in K + , we associate∩ ⊆ face F o K + of K + to the face o + o ++ ++ ++ o∩ + o ++ (F K ) K of K . Now, K = K and we claim that (F K ) K = F . To ∩ ∩ o o ∩o ∩ see the latter assertion, note that F = [L(F )] = [ x : AF x = 0 ] = L(AF ). Moreover, + o + {o o }+ o o each row of AF is in K , so it follows that (F K ) = [L(F K )] = [L(AF )] = L(F ). o + o ++ ∩ ∩ o + Hence (F K ) K = L(F ) K = F , so the correspondence returns F K to F ; in other words,∩ the correspondence∩ is∩involutory. ∩ o Thus the mapping F F K + can be used to associate the faces of any finite cone K to those of its dual K + .7→ The mapping∩ depends only on the geometric structure of K and K + , not on particular representations. Consistency of the mapping therefore requires that it be o o a bijection, i.e., one-to-one and onto. Note also that F G implies G K + F K + , so that the correspondence between faces of K and K + is⊆inclusion-reversing∩ . ⊆ ∩

o Theorem 3.38 (Vorono¨ı1908) The pairing F F K + between faces of finite cone K and those of its dual K + is an involutory, inclusion-reversing7→ ∩ bijection. 2

o Exercise 3.39 For face F of the finite cone K and F K + the associated face of K + , o show that dim(F )+ dim(F K + )= n. 2 ∩ ∩ We continue to examine the cone K = x : Ax 0 . The above development exploits the fundamental connection between the faces{ of K≥and} the equality sets defined by certain

41 subsystems of the linear inequality representation Ax 0 . Each face is, of course, a finite cone, and we now show that equality sets can{ also≥ be} used to determine a minimal set of generators for any face. Note that all faces of K have the same lineality space, viz. L = x : Ax = 0 . L also is a face of K; thus it is the unique minimal face of K. The following{ theorem} shows that in order to determine a set of generators for K, it suffices to consider generators for L along with a single representative from each face which is minimal with respect to proper containment of L, i.e., which properly contains L but no other face. Theorem 3.40 Let K be a finite cone with lineality space L = K( y ,...,y ) and suppose { 1 s} xi Fi L i, where F1,...,Ft are the minimal faces of K which properly contain L. ∈ \ ∀ Then K = K( y ,...,y , x ,...,x ). { 1 s 1 t} Proof: Let x K. If x L, then x is generated by y ,...,y . ∈ ∈ 1 s If x K L, let F be the smallest face of K containing x; F contains some F . ∈ \ i Now x, xi K L and the equality set for Fi contains that for F . Thus for >∈ 0\ appropriately chosen, x x K has a (strictly) larger equality set than x. − i ∈ Now repeat the argument, replacing x x xi. Since the equality set for x increases at← each− step, the process ends with a point in L. The construction shows (the original) x is generated by y ,...,y , x ,...,x . 2 { 1 s 1 t} The generators in the theorem are defined by determinants of submatrices of A (subdetermi- nants of A). To see this, suppose rank(A)= k and consider xi Fi L. Then the rank of A ∈ \ Fi is k 1; hence there is a row basis of A, say A¯, whose first k 1 rows are from A and whose − − Fi last row is from A A . Since A¯ is a row basis of A, these two matrices have identical column \ Fi dependencies. If the columns of A¯ corresponding to nonzero components of xi are linearly dependent, some d L specifies a dependency relation among them. Then x + d F and, ∈ i ∈ i for appropriate choice of , has fewer nonzero components than x . Replacing x x + d i i ← i and repeating the argument eventually determines xi Fi L whose nonzero components correspond to independent columns of A¯. Thus (after∈ permutation,\ if necessary) we may write A¯ = [B, N], where B is a k k, invertible submatrix, among whose columns are those × corresponding to nonzero components of xi. It follows that for xi we may take the (unique) solution to the system Bx + Nx = e resulting when x = 0. By Cramer’s rule, scaling { B N k} N by det(B) gives a solution whose components are subdeterminants of A. Any linear system Bx + Nx =0 corresponding to a row basis of A provides a basis for { B N } L consisting of the n k solutions obtained by taking xN , successively, as the unit vectors. As above, we obtain solutions− whose components are subdeterminants of A. These vectors and their negatives constitute a set of (conical) generators for L. Finally, the above development applies to any face F of K, since F is again a finite cone with an inequality representation consisting of rows of A. In fact, it is immediate from Theorem 3.36 that the minimal faces of F properly containing L are among the Fi. Thus we obtain the following bound on the component magnitude for generators for the faces of K. Theorem 3.41 For A IRm×n with largest subdeterminant magnitude ∆, each face of K = x∈: Ax 0 has a finite generating set g with g ∆ i. 
2 { ≥ } { i} k ik∞ ≤ ∀ 42 4 Lattice Points in Convex Cones A monoid (semigroup) contains 0 and is closed under addition, i.e., under nonnegative in- tegral linear combinations. Monoids apparently capture certain aspects of lattices, closed under integral combinations, and cones, closed for nonnegative combinations. This suggests integral monoids may have implications for nonnegative integral solutions for linear systems similar to those derived for lattices (integral solutions) and cones (nonnegative solutions), so it is natural to attempt to mimic the earlier development for lattices and cones in the monoid setting. Our interest is primarily in integral monoids, those with integer-valued elements, and we will restrict attention to integral data generally throughout this section.

Finite Generator Representations

Any set S ZZn generates an integral monoid M(S) via nonnegative integral linear combi- nations of its⊆ elements. The monoid M(A)= yA : y ZZm is finitely generated by the rows { ∈ + } of matrix A ZZm×n. ∈ Exercise 4.1 Show that every integral monoid of dimension one is finitely generated. 2

As monoids are discrete analogues of cones, it is natural to consider finite cones as a source for finitely generated integral monoids. The fundamental result in this area is the Hilbert Basis Theorem – the integral elements of a finite rational cone have a finite basis as a monoid.

Theorem 4.2 (Gordan 1873, Hilbert 1890) Suppose K IRn is a rational, finite cone. ⊆ p Then there exists a matrix H ZZp×n such that K ZZn = zH : z ZZ = M(H). ∈ ∩ { ∈ +} Proof: K is a rational, finite cone, hence we may write K = yA : y 0 , with A ZZm×n. ZZn m { ≥ } ∈ Now, any x K may be expressed x = i=1 λiai, with λi 0 i, for rows ai A. Thus we have∈ x =∩ λ a + + λ a + m (λ λ )a . ≥ ∀ ∈ b 1c 1 ··· b mc m Pi=1 i − b ic i I.e., x is a nonnegative integral combinationP of the ai plus an integral remainder from m m ZZn R = i=1(λi λi )ai : i=1 λiai K and 0 λi i . { − b c ∈ ∩ ≤ m∀ } The rowsP of A define the (partiallyP open) zonotope Z = i=1 µiai :0 µi < 1 i . ZZn ZZn { ≤ ∀ } Now R Z K and, since Z is bounded, R isP a finite set. Taking⊆ the rows∩ of⊆A and∩ the members of R as the rows of H, the theorem follows. 2

The proof of Theorem 4.2 given above is from Giles and Pulleyblank, Lin. Alg. and Its Appl. 25(1979)191–196. The rows of matrix H in the theorem constitute a Hilbert basis for the monoid K ∩ ZZ^n. Note that the theorem may fail when K is not rational or when K is not finite. When K is given in polyhedral form, Theorem 3.41 bounds ‖a_i‖∞; moreover, by Carathéodory's Theorem, each element of R in the proof can be expressed using at most n of the a_i. Thus we obtain the following bound on component magnitude for Hilbert basis elements.

Corollary 4.3 For B ∈ ZZ^{m×n} with largest subdeterminant magnitude ∆, {x ∈ ZZ^n : Bx ≥ 0} has an integral Hilbert basis H satisfying ‖h_i‖∞ ≤ n∆ ∀h_i ∈ H. □

Arguing as in the proof of Theorem 3.31 we obtain a uniqueness result for Hilbert bases. The earlier uniqueness up to positive scaling now becomes uniqueness outright – in terms of the discussion following (3.34), monoids are closed under nonnegative integral linear combinations and the only elementary operation which preserves both nonnegativity and integrality is permutation.

Theorem 4.4 If K is a rational, finite, pointed cone, then K ∩ ZZ^n has a unique minimal Hilbert basis. □

Exercise 4.5 Show that the unique minimal Hilbert basis of the preceding theorem consists of the elements of K ∩ ZZ^n which are not sums of other elements of K ∩ ZZ^n. □

Theorem 4.2 says that for a given rational finite cone, we can obtain a finite set of generators for the integral monoid comprised of its integral elements. On the other hand, when given a finite generating set for an integral monoid, the following result shows that these generators naturally determine the finite cone spanned by the monoid.

Proposition 4.6 For M = M(S) with S ⊆ ZZ^n, we have K(M) = C(M) = K(S).

Proof: Since S ⊆ M(S) = M, we have K(S) ⊆ K(M). Conversely, x ∈ K(M) ⇒ x = Σ_{i=1}^p λ_i x_i with λ_i ≥ 0, x_i ∈ M = M(S) ∀i ⇒ x ∈ K(S). Clearly C(M) ⊆ K(M). For the reverse inclusion, note 0 ∈ M ⇒ 0 ∈ C(M); then 0 ≠ x ∈ K(M) ⇒ x = Σ_{i=1}^p z_i x_i, z_i ≥ 0, x_i ∈ M ⇒ x = Σ_{i=1}^p (z_i/α)(αx_i) + (1 − Σ_{i=1}^p z_i/α)0, for α = ⌈Σ_{i=1}^p z_i⌉ > 0 and αx_i ∈ M ⇒ x ∈ C(M). □

Thus for monoids we have that M finitely generated over ZZ_+ implies that K(M) is finitely generated over IR_+. This property actually characterizes finitely generated integral monoids. In order to establish the converse, we require the following lemma on integral point sets.

Lemma 4.7 Any set S ⊆ ZZ^n_+ contains a finite subset T with the following property: to each s ∈ S there corresponds some t ∈ T for which t ≤ s.

Proof: Induction on n. The result is clear for n = 1; assume it true for 1, 2, ..., n − 1. Define S′ = {(x_1,...,x_{n−1}) : (x_1,...,x_{n−1},x_n) ∈ S, for some x_n}; by induction we have T′ ⊆ S′, |T′| < +∞, for which s′ ∈ S′ ⇒ t′ ≤ s′ for some t′ ∈ T′. Lift the members of T′ back to S to get T_{−1} = {x ∈ S : (x_1,...,x_{n−1}) ∈ T′ and x_n ≤ x̃_n ∀(x_1,...,x_{n−1},x̃_n) ∈ S}. Let k = max{x_n : (x_1,...,x_n) ∈ T_{−1}}. Then x ∈ S and x_n ≥ k ⇒ t′ ≤ (x_1,...,x_{n−1}) for some t′ ∈ T′ ⇒ t ≤ x for some t ∈ T_{−1}. To handle the case x_n < k, apply the inductive hypothesis to each of the finitely many slices S_j = {(x_1,...,x_{n−1}) : (x_1,...,x_{n−1},j) ∈ S}, 0 ≤ j < k, obtaining finite T′_j ⊆ S_j; appending j to the members of T′_j lifts them back into S. Taking T to be T_{−1} together with all of these lifted sets, T is finite and every x ∈ S with x_n = j < k satisfies (t′, j) ≤ x for some t′ ∈ T′_j. □
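For finite S the subset T of Lemma 4.7 is just the set of ≤-minimal elements, and a direct sketch suffices (the sample S is an illustrative choice):

    def minimal_subset(S):
        """Lemma 4.7 for a finite S in Z^n_+: return the <=-minimal elements;
        every s in S then dominates (componentwise) some returned t."""
        T = []
        for s in sorted(S, key=sum):  # small coordinate sums first
            if not any(all(ti <= si for ti, si in zip(t, s)) for t in T):
                T.append(s)
        return T

    S = [(3, 1), (1, 2), (2, 2), (1, 5), (4, 0), (5, 3)]
    print(minimal_subset(S))  # [(1, 2), (3, 1), (4, 0)]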

The following theorem appears in R. Jeroslow, Math. of O.R. 3(1978)145–154. The earlier Hilbert Basis Theorem is a special case of the (⇐) direction here, i.e., for M = K(M) ∩ ZZ^n.

Theorem 4.8 (Jeroslow 1978) An integral monoid M is finitely generated over ZZ_+ ⇔ its conical span K(M) is finitely generated over IR_+ ⇔ K(M) is a finite cone.

Proof: The second equivalence is just Weyl–Minkowski duality for K(M). In the first, Proposition 4.6 shows (⇒); for (⇐), let K(M) = {yA : y ∈ IR^m_+}, A ∈ IR^{m×n}. Each row of A is in K(M), and is thus a finite conical combination of elements of M. We may therefore assume that the rows of A, say a_1,...,a_m, are in M. Now z ∈ M ⇒ z = yA with y ≥ 0, and by Carathéodory's Theorem (3.8), we may assume {a_i : y_i > 0} are linearly independent. Thus Cramer's rule ⇒ z = (n(z)/D)A, for n(z) ∈ ZZ^m_+ and D depending only on A (not on z ∈ M); e.g., D = magnitude of the lcm of all subdeterminants of A. For d ∈ {0, 1, ..., D − 1}^m, define M(d) = {z ∈ M : n(z) ≡ d (mod D)}. Applying Lemma 4.7 to {n(z) : z ∈ M(d)}, we obtain T(d) ⊆ M(d) for which |T(d)| < +∞ and z ∈ M(d) ⇒ n(z′) ≤ n(z) for some z′ ∈ T(d). For T = {a_1,...,a_m} ∪ (∪{T(d) : d ∈ {0, 1, ..., D − 1}^m}), clearly |T| < +∞ and T ⊆ M. We claim T generates M, for if z ∈ M, we have z = (n(z)/D)A with n(z) ∈ ZZ^m_+. Thus z ∈ M(d), where n(z) ≡ d (mod D) and n(z′) ≤ n(z) for some z′ = (n(z′)/D)A ∈ T(d). So z = (n(z′)/D)A + ((n(z) − n(z′))/D)A = 1z′ + yA with y ∈ ZZ^m_+, since n(z′) ≤ n(z) ≡ n(z′) (mod D). □

Not all integral monoids are finitely generated, for example: M = {(0, 0)} ∪ {(x_1, x_2) ∈ ZZ^2 : (x_1, x_2) > 0} or M = {(x_1, x_2) ∈ ZZ^2 : x_1 ≤ √2 x_2}. Moreover, any integral monoid satisfies M ⊆ K(M) ∩ ZZ^n; the theorem assures that even when this containment is strict, M will still be finitely generated provided K(M) is finite. Thus, in a certain sense, the simplest integral monoids are those treated by the Hilbert Basis Theorem; i.e., M = K(M) ∩ ZZ^n, where K(M) is a finite cone. It is also of interest to observe that all finitely generated integral monoids are related to those of Theorem 4.2 by projection.

Exercise 4.9 (i) Determine a finitely generated integral monoid M ≠ K(M) ∩ ZZ^n. (ii) Show that every finitely generated integral monoid is the projection of a monoid of the form M = K(M) ∩ ZZ^n. □

A further characterization of finite generation for integral monoids given by Jeroslow is the following.

Exercise 4.10 Let M be an integral monoid, with K̄ the closure of K(M). Show that M is finitely generated if and only if: (i) K̄ is rational and polyhedral; (ii) ∀x ∈ K̄ ∩ Q^n, ∃ an integer p > 0 such that px ∈ M. □

The characterization of Theorem 4.8 suggests the following generalization of Theorem 4.4.

Theorem 4.11 Suppose M is an integral monoid and K(M) is pointed and finite. Then M has a unique minimal set of generators.

Proof: By (3.21), ax > 0 ∀0 ≠ x ∈ K(M) for some a ∈ IR^n; we may take a ∈ ZZ^n. (Why?) Since ax ∈ ZZ_+ ∀x ∈ M, a induces the (ordered) partition M \ {0} = M_1 ∪ M_2 ∪ ···, where ax = ay ∀x, y ∈ M_i for each i, and ax > ay ∀x ∈ M_i, y ∈ M_j for each i > j. Any generating set for M must contain M_1, since for any x ∈ M_1, x = y + z with 0 ≠ y, z ∈ M ⇒ ax = ay + az, with ay, az ≥ ax > 0, which is impossible. Define B_1 = M_1 and inductively assume we have B_i ⊆ M_i, 1 ≤ i ≤ p, for which every generating set for M contains B_1 ∪ ··· ∪ B_p and (M_1 ∪ ··· ∪ M_p) ⊆ M(B_1 ∪ ··· ∪ B_p). Now set B_{p+1} = M_{p+1} \ M(B_1 ∪ ··· ∪ B_p). It is clear that B_{p+1} ⊆ M_{p+1} and (M_1 ∪ ··· ∪ M_{p+1}) ⊆ M(B_1 ∪ ··· ∪ B_{p+1}). Moreover, B_{p+1} is contained in every generating set for M, since for any x ∈ B_{p+1}, x = y + z with 0 ≠ y, z ∈ M ⇒ 0 < ay, az < ax ⇒ y ∈ M_i, z ∈ M_j with i, j ≤ p ⇒ x ∈ M(B_1 ∪ ··· ∪ B_p), contradicting the choice of x; thus x cannot be generated from other elements and must itself appear as a generator. Consequently B = B_1 ∪ B_2 ∪ ··· is contained in every generating set for M and, by construction, M(B) = M; hence B is the unique minimal set of generators. □

Efficient Generator Representations

For convex cones, Carathéodory's Theorem (3.8) provides a tight upper limit on the number of generators required to represent any point of the cone: for each element of a cone in IR^n, there must be a representation using no more than n generators. Is there an analogous result for integral monoids? We consider this question first in the context of Theorem 4.2, i.e., for monoids of the form K(A) ∩ ZZ^n, with K(A) a rational, pointed, finite cone. In this setting we seek a bound on the number of elements of the unique minimal Hilbert basis needed to represent an arbitrary integral element, say x ∈ K(A) ∩ ZZ^n. Of course x may have many different representations as a conical combination of the rows of A ∈ Q^{m×n}. In order to bound the number of Hilbert basis elements required to represent x, we define the height of x as h(x) = max{Σ_{i=1}^m λ_i : Σ_{i=1}^m λ_i a_i = x, λ_i ≥ 0 ∀i}. Since K(A) is pointed, it follows from Carathéodory's Theorem that h(x) is always achieved by a set of linearly independent generators; i.e., there exist coefficients {λ_i > 0 : i ∈ I ⊆ {1,...,m}} for which Σ_{i∈I} λ_i a_i = x and {a_i : i ∈ I} is independent.

Exercise 4.12 Show that h(x) is attained by an independent set of generators; i.e., h(x) = max{Σ_{i∈I} λ_i : x = Σ_{i∈I} λ_i a_i, λ_i ≥ 0 ∀i ∈ I, {a_i : i ∈ I} independent}. □

When A has linearly independent rows, K(A) is simplicial. Note that simplicial cones are pointed. Our development makes use of the following refinement lemma for simplicial cones.

46 Lemma 4.13 Let x = n λ a , y = n τ a =0, for λ , τ 0 i and K(A) simplicial. i=1 i i i=1 i i 6 i i ≥ ∀ Then, for each τj > 0, x Kj = K( y ai : i = j ) λj/τj = min λi/τi. P ∈ P{ }∪{ 6 } ⇔ τi>0 Moreover, K(A)= Ki and dim(Ki Kj) < n τi > 0, τj > 0, i = j. τi>0 ∩ ∀ 6 S 1 τi n λj τi Proof: If τj > 0, then aj = y i6=j ai; thus x = λiai = y + i=6 j(λi λj )ai. τj − τj i=1 τj − τj By Proposition 1.4, τ > 0 y a : i = j is linearly independent; j ⇒{ }∪{P i 6 } P P moreover, this representation for x as a linear combination of y ai : i = j is unique. λj τi { }∪{ 6 } Thus x Kj 0 and λi λj 0 for i = j λj/τj = min λi/τi. ∈ ⇔ τj ≥ − τj ≥ 6 ⇔ τi>0 Since x is an arbitrary element of K(A), we have that K(A)= K . τi>0 i Now consider i = j for which τ > 0, τ > 0. 6 i j S We will show Ki Kj = Kij = K( y al : l = i, j ); hence dim(Ki Kj)= n 1. Clearly K K ∩ K ; to see the reverse{ }∪{ inclusion,6 let} a K K . ∩ − ij ⊆ i ∩ j ∈ i ∩ j Then a = βy + b = γy + c, where β,γ 0 and b K( a : l = i ), c K( a : l = j ). ≥ ∈ { l 6 } ∈ { l 6 } If β = γ, then clearly b = c K( al : l = i ) K( al : l = j ); since the a are linearly independent,∈ { 6 it} follows∩ { that a6 }K . l ∈ ij On the other hand, when β = γ, we may assume β<γ; thus (γ β)y = b c. 6 − − n Now ai L( al : l = i ), so by (1.15), tai = 0 and tal =0 l = i, for some t IR ; without6∈ loss{ of6 generality,} ta > 0, so that6 tz 0 z K∀ (6A). ∈ i ≥ ∀ ∈ Then (γ β)ty = tb tc, with γ β > 0,ty > 0, tb = 0, and tc 0, which is impossible. 2 − − − ≥ We now give a bound on h(x) due to Ewald and Wessels, Results in Math.19(1991)275-278; the proof given here is from Liu, Trotter, and Ziegler, Results in Math.23(1993)374-376.

Theorem 4.14 Let A Qm×p, with K(A) pointed and dim(K(A)) = n 3. ∈ ≥ Then each x in the minimal Hilbert basis for K(A) ZZp must satisfy h(x) < n 1. ∩ − n Proof: Suppose x = i=1 λiai, where λi 0 i and, by (4.12), the ai are linearly independent. If there exists y ZZp, with x = y = n≥τ a∀, 0 < n τ 1, τ 0 i, we apply (4.13), ∈ P 6 i=1 i i i=1 i ≤ i ≥ ∀ replacing a y for some τ > 0 and yielding a smaller simplicial cone containing x. j ← j P P Continued refinement of the cone containing x ultimately determines x K( b1,...,bn ), n ZZp n ∈ { } where j=1 αjbj :0 < j=1 αj 1, αj 0 j = b1,...,bn . { ∈ n ≤ ZZp ≥ ∀ }n { } Note each bPj is of the form bj = Pi=1 τjiai , with i=1 τji 1, τji 0 j, i. ∈ ZZp ≤ ≥ ∀ ZZp Now x is in the minimal HilbertP basis for K(A) ,P hence also for K( b1,...,bn ) ; ∩ n { } ∩ thus (see Theorem 4.2), either x = bj for some j or x = j=1 αjbj, 0 αj < 1 j. n 0 n n ≤ ∀ If j=1 αj > n 1, then from x =( j=1 bj) x = j=1(1 Pαj)bj, we obtain 0 = x0 K(−b ,...,b ) ZZp with n (1− α ) < 1, contradicting− our construction. P 6 ∈ { 1 n} ∩ P j=1 − j P If n α = n 1, then n (1 α )=1, so x0 coincides with some b ; j=1 j − j=1 − j P j Pbut this implies x = Pk6=j bk, which again is a contradiction, as n 3. n ≥ Hence j=1 αj < n 1,P and we therefore have n − n n n n n n x =P j=1 αjbj = j=1 αj( i=1 τjiai)= i=1( j=1 αjτji)ai, with i=1 j=1 αjτji < n 1. Independence of the a implies that λ = n α τ , so that n λ < n 1. − P Pi P i Pj=1 jPji i=1 iP P− Thus h(x) < n 1. 2 − P P

47 The bound on h(x) given by the theorem is asymptotically sharp. To see this, consider the simplicial cone K IRn, 0 = p ZZ generated by (1,..., 1,p) and the n 1 unit vectors p ⊂ 6 ∈ + − e1,...,en−1. The minimal Hilbert basis for Kp is given by its n generators along with the p 1 vectors (1,..., 1, q) for q = 1,...,p 1. These additional vectors are of the form − − q (1,..., 1,p)+ n−1 p−q e , and so their respective heights are q + (n−1)(p−q) =(n 1) (n−2)q . p i=1 p i p p − − p It is clear thatP the maximum height is achieved for q = 1, i.e., for x = (1,..., 1) with h(x)=(n 1) n−2 ; thus h(x) is arbitrarily close to n 1 for sufficiently large p. − − p − If the cone K(A) is not pointed, the minimal Hilbert basis need no longer be unique. In this setting, there is generally no hope for obtaining a bound as in the theorem. For example, 2 Hk = (1, 0), (0, 1), ( k, k) is a Hilbert basis for the integral monoid ZZ , for each positive integer{ k, yet for A =− H−we} have ZZ2 = K(A) ZZ2 and h(x)= k for x =( k, k), so that 1 ∩ − − we can select a Hilbert basis with an element of arbitrarily large height. Still in the context of Theorem 4.14, we consider h( ) now with respect to H as a generating n · ZZp set for K(H) = K(A). Let x = i=1 λibi K(A) , where λi 0 i, the bi H are independent, and h(x) = n λ ; that is, ∈ n λ ∩ τ , whenever≥ x∀ = τ∈h with i=1 iP i=1 i ≥ i∈I i i∈I i i τi 0, hi H i I. WeP may assume KP( b1,...,bPn ) has been refined asP in the proof ≥ ∈ ∀ ∈ n { ZZp } n of Theorem 4.14 (details?), i.e., that i=1 τibi : 0 < i=1 τi 1, τi 0 i = { ∈ n n≤ ≥ ∀ } b1,...,bn . We decompose x as in TheoremP 4.2, x = i=1 λiPbi + i=1(λi λi )bi, and { } n n b c − b c denote the remainder by z = i=1(λi λi )bi = iP=1 αibi, 0 Pαi < 1 i; note that ZZp − b cn ≤ ∀ n z K( b1,...,bn ) . SinceP h(x) = i=1 λi, itP is immediate that h(z) = i=1 αi. If ∈h(z) >{ n 1,} we∩ obtain a contradiction as in the proof of (4.14). If h(z) = n 1, − P P − then z = i6=j bi, for some j, contradicting 0 αi < 1 i in the (unique) representation n ≤ ∀ ZZp z = i=1 αPibi. We therefore have h(z) < n 1. Now, z K(A) implies z = i∈I µihi, for h H and positive integers µ , and since− h(z) < n ∈1, it follows∩ that I < n 1. Thus Pi ∈ i − | | −P we can always obtain a representation of x using at most 2n 2 elements of H, namely, x = n λ b + µ h . − i=1b ic i i∈I i i ThusP the bound onP h( ) established in Theorem 4.14 leads to the result that each integral element of a rational, finite,· pointed cone of dimension n 3 can be expressed as a positive integral combination of at most 2n 2 of the cone’s Hilbert≥ basis elements. It is evident − that the argument given here remains valid for n = 2. Thus we have the following result of A. Seb¨o, IPCO Conf. Proc.(1990)431-455, which improves the initial bound of 2n 1 due to Cook et al., J. Comb. Th. (B)40(1986)63-70. − Theorem 4.15 Each integral element of a rational, finite, pointed cone of dimension n 2 is a nonnegative integral combination of 2n 2 minimal Hilbert basis elements. 2≥ − For the case n = 3, Seb¨oestablished that only three Hilbert basis elements are needed in such representations (see also J. Liu, Ph.D. Thesis, Cornell, 1991). A monoid basis from which any monoid element can be generated using no more than n basis elements will be termed a Carath´eodory basis. Thus Seb¨o’s result is that in IR3 any minimal Hilbert basis for the integral elements of a polyhedral cone is also a Carath´eodory basis. 
This does not hold in general, as demonstrated by the following example due to Bruns et al., Preprint 32, Universit¨at Magdeburg(1998).

Example 4.16 Consider the 6-dimensional cone with Hilbert basis:
(−1,1,1,0,0,1), (−3,1,2,0,0,1), (−1,1,1,1,0,1), (0,1,1,1,0,1), (0,−1,−1,2,1,1), (−1,−2,0,2,1,1), (−1,1,−2,3,1,1), (−1,0,−1,3,1,1), (−2,−1,3,0,0,1), (1,0,1,1,0,1).
Here, no fewer than 7 Hilbert basis elements are needed to generate (−25, 5, 13, 41, 12, 30). □

Our focus thus far has been on efficient generation of a cone's integral elements using a minimal Hilbert basis; we have seen that for cones of dimension n, this generating set may not constitute a Carathéodory basis, though each integral element can be generated using at most 2n − 2 Hilbert basis elements (Theorem 4.15). Recall (proof of Theorem 4.2), however, that for A ∈ ZZ^{m×n}, any element of K(A) ∩ ZZ^n is a positive integral combination of at most n + 1 elements from {a_1,...,a_m} ∪ {Σ_{i=1}^m μ_i a_i : 0 ≤ μ_i < 1 ∀i}. We thus consider expanding the set of generators in order to enable more compact representations. We show below that an enlarged set of generators admits a discrete analogue for the following sharpening of Carathéodory's Theorem, and hence a Carathéodory basis. We relax here the terminology partition (into simplicial subcones), allowing members of a partition to have nonempty intersection, provided it is of lower dimension – recall Lemma 4.13.

Theorem 4.17 Suppose K(A)= K( a1,...,am ) and dim(K(A)) = n m. Then K(A) partitions into simplicial{ subcones,} each with generators ≤from a ,...,a . { 1 m} Proof: This is clear for n =1 or n = m; let 1 n, we may assume a1 linearly dependent on a2,...,am. Denote K = K( a ,...,a ) and apply Theorem 3.12 to obtain K = x : Cx 0 ; 0 { 2 m} 0 { ≥ } we may assume this inequality description for K0 is minimal. Since a / K , we must have c a < 0 for certain c C, say for 1 i p. 1 ∈ 0 i 1 i ∈ ≤ ≤ We denote Ji = x : cix =0 K0 and Ki = K(Ji a1 ), 1 i p. Any x K(A) K{ , is of the} form ∩ x = m λ a , with∪{λ }> 0 and≤ ≤λ 0 i> 1. ∈ \ 0 ı=1 i i 1 i ≥ ∀ Thus for x K(A) K , we have x λ a = m λ a K , and consequently ∈ \ 0 − 1P1 ı=2 i i ∈ 0 there is a largest λ [0, 1] for which λx +P (1 λ)(x λ1a1)= y K0. Therefore y J for some∈ i and x = y + (1 λ)λ−a K−; it follows∈ that K(A)= p K . ∈ i − 1 1 ∈ i ∪i=0 i For j > 0, K K = J and c x =0 x J yet c a < 0; thus dim(K K ) < dim(K(A)). 0 ∩ j j j ∀ ∈ j j 1 0 ∩ j And for distinct j,k > 0, Kj Kk = K((Jj Jk) a1 ), as in the proof of Lemma 4.13. Note c x > 0 for some x K∩ , for if c x =0∩ x ∪{K }, then c a = 0, a contradiction; j j j ∈ 0 j ∀ ∈ 0 j 1 similarly, ckxk > 0 for some xk K0. Thus for y = 1 x + 1 x , we obtain∈c y > 0, c y > 0, and c y 0 i = j, k. 2 j 2 k j k i ≥ ∀ 6 Also, minimality of Cx 0 implies c z < 0, c z 0, and c z 0 i = j, k for some z. { ≥ } j k ≥ i ≥ ∀ 6 Thus x = (1 λ)z + λy, with λ (0, 1], so that cjx = 0, ckx> 0, and cix 0 i = j, k. That is,∃ x J−J = and consequently∈ dim(J J ) < dim(J ), so that ≥ ∀ 6 ∈ j\ k 6 ∅ j ∩ k j dim(K K )= dim((J J ) a ) < dim(J a ) dim(K(A)). j ∩ k j ∩ k ∪{ 1} j ∪{ 1} ≤ Hence the Ki partition K(A) in the appropriate sense. By induction on n, we partition Ji = Jij, for simplicial Jij with generators in a1,...,am . ∪j∈Ii { } We denote K = K(J a ), and by induction on m, partition K = K similarly. ij ij 1 0 j∈I0 0j It is now straightforward∪{ to verify} that K is the required partition of∪ K(A). 2 ∪i,j ij

49 Note that Carath´eodory’s Theorem (3.8) is an evident consequence of this theorem. Also, for K(A) full-dimensional, any simplicial subcone in the partition given by the theorem is generated by an n n invertible submatrix of A. When this is an integral, unimodular × submatrix, we will call the simplicial subcone it generates unimodular; of course, each integral element of such a unimodular subcone is a nonnegative integral combination of its generators. We now develop a discrete analogue for the previous theorem; for A ZZm×n we denote m ∈ d = max det(ai ,...,ai ) : ai A, j and αZ = µiai :0 µi < α i , for α> 0. {| 1 n | j ∈ ∀ } { i=1 ≤ ∀ } We first consider simplicial, full-dimensional cones. P Proposition 4.18 Suppose A ZZn×n and K(A) is simplicial and full-dimensional. ∈ Then K(A) partitions into unimodular subcones, each with generators from 2dZ ZZn. ∩ Proof: The result is trivial for d = det(A) = 1; for d> 1, we proceed by induction. | | By Theorem 2.10, there is a nonzero y Z ZZn, thus 0 = y = n τ a , with 0 τ < 1 i. ∈ ∩ 6 i=1 i i ≤ i ∀ Lemma 4.13 provides the partition K(A)= Ki, where Ki = K( y aj : j = i ). ∪τi>0 P { }∪{ 6 } Let Aj denote matrix A with row aj replaced by y. Now, τ > 0 det(A ) = τ det(A) d 1, so we may decompose K inductively. j ⇒| j | j| | ≤ − j I.e., we partition Kj into unimodular subcones whose generators are of the form d−1 i6=j λiai + λjy = i6=j(λi + λjτi)ai + λjτjaj, with 0 λk < 2 k. d−1 d ≤ 2 ∀ SinceP λi + λjτi < (1 +Pτi)2 < 2 , the desired result follows.

The restriction to simplicial cones in (4.18) can be removed: Theorem 4.17 partitions K(A) into simplicial subcones generated by the rows of A, and to each of these we apply Proposition 4.18, thus partitioning K(A) into unimodular cones (J. Liu, Ph.D. Thesis, Cornell, 1991). The full-dimensionality restriction can also be removed. If dim(K(A)) < n, we may assume, as above, that K(A) is simplicial; i.e., A ∈ ZZ^{m×n} with dim(K(A)) = m = rank(A). Here we redefine d = |Z ∩ ZZ^n| and say that K(A) is unimodular when d = 1, i.e., when Z ∩ ZZ^n = {0}. We note that this definition subsumes the former one, and that when K(A) is unimodular, each of its integral elements is a nonnegative integral combination of the rows of A.

Exercise 4.19 (i) Show that K(A) unimodular M(A)= K(A) ZZn; n ZZn ⇒ ∩ (ii) where 0 = y = i=1 τiai Z , with 0 τi < 1 i, as in the proof of (4.18), and 0 6 ∈ ∩ ≤ ∀ Z = i6=j λiai P+ λjy : 0 λk < 1 k , show that the usual round-off operation { ≤ ∀ } 0 0 ZZn ZZn gives aP 1:1 correspondence between points z Z and z Z , defined by: 0 ∈ ∩ ∈ ∩ z = i6=j λiai + λjy = i6=j(λi + λjτi)ai + λjτjaj = i6=j λi + λjτi ai + z; i.e., z0 = a + z, where I = i = j : λ + λ τ 1 . 2 b c P i∈I i P { 6 i j i ≥ } P P Part (i) of the exercise treats the d = 1 case in the proof of (4.18). Part (ii) then handles the inductive step, since by part (ii) we have Z0 ZZn Z ZZn = d and, moreover, | ∩ |≤| ∩ | y (Z Z0) ZZn, so Z0 ZZn d 1. Thus the adaptation of the proof of (4.18) is straightforward,∈ \ ∩ yielding| the∩ result| ≤ that,− at the expense of a (greatly) enlarged generating set, we can partition any rational, polyhedral cone into unimodular subcones. For n = 2 it is not difficult to construct such a partition using only minimal Hilbert basis elements as

generators. This remains true for n = 3 (A. Sebő, IPCO Conf. Proc.(1990)431-455), but Example 4.16 shows that generally the minimal Hilbert basis must be expanded in order to obtain a unimodular decomposition.

Theorem 4.20 For A ∈ ZZ^{m×n}, denote I = {I : {a_i : i ∈ I} is a row basis of A}, Z_I = {Σ_{i∈I} μ_i a_i : 0 ≤ μ_i < 1 ∀i ∈ I}, and d = max_{I∈I} |Z_I ∩ ZZ^n|. Then K(A) partitions into unimodular subcones, each with generators from 2^d Z_I ∩ ZZ^n. □

Corollary 4.21 For A ∈ ZZ^{m×n}, K(A) ∩ ZZ^n has a Carathéodory basis. □

Linear Constraint Representations

We now follow the development of earlier instances of generator/constraint duality, defining the dual of S ⊆ ZZ^n as the integral monoid S* = {x ∈ ZZ^n : sx ∈ ZZ_+ ∀s ∈ S}, and calling any set of the form S* constrained; {x ∈ ZZ^n : Ax ∈ ZZ^m_+} is finitely constrained by the rows of A ∈ ZZ^{m×n}. As with earlier dualities, we again have the following elementary properties.

Proposition 4.22 For S, T ⊆ ZZ^n we have (cf. Propositions 1.10, 2.21, 3.4):
(i) S ⊆ T ⇒ S* ⊇ T*;
(ii) S ⊆ S**;
(iii) S* = S***;
(iv) S = S** ⇔ S is a constrained monoid;
(v) A ∈ ZZ^{m×n}, M = {yA : y ∈ ZZ^m_+} = M(A) ⇒ M* = {x ∈ ZZ^n : Ax ∈ ZZ^m_+};
(vi) S* = [M(S)]*;
(vii) (S ∪ T)* = S* ∩ T*. □

A finitely constrained integral monoid is simply the set of integral points in a rational polyhedral cone; i.e., {x ∈ ZZ^n : Ax ≥ 0}. Thus Theorem 4.2 shows that the integral monoid analogue of Minkowski's Theorem (3.16) is valid. But the converse Weyl result (3.12) fails for monoids; e.g., M = {0, 2, 3, 4, 5, ...} is finitely generated over ZZ_+ by {2, 3}, though not finitely constrained by restrictions of the form a_i x ∈ ZZ_+. It is easy to see that the Farkas Theorem (3.15) fails too. This prompts the question: When does the Weyl condition hold? In fact, this condition holds only for the integral elements of rational polyhedral cones.

Proposition 4.23 Let M = {yA : y ∈ ZZ^m_+}, where A ∈ ZZ^{m×n}. Then M is finitely constrained ⇔ M = ZZ^n ∩ {yA : y ∈ IR^m_+}, i.e., M = ZZ^n ∩ K(M).

Proof: Each row of A is in M; hence yA ∈ K(M) for y ≥ 0; i.e., {yA : y ≥ 0} ⊆ K(M). Conversely, x ∈ K(M) ⇒ x = Σ_{i=1}^p λ_i x_i with λ_i ≥ 0, x_i ∈ M ∀i ⇒ x = Σ_{i=1}^p λ_i (z_i A) with λ_i ≥ 0, z_i ∈ ZZ^m_+ ∀i ⇒ x = yA for y = Σ_{i=1}^p λ_i z_i ≥ 0. This proves the second equivalence. It is easy to validate that M* = ZZ^n ∩ {x : Ax ≥ 0} and M** = ZZ^n ∩ {yA : y ∈ IR^m_+}. Thus M = M** precisely when M = ZZ^n ∩ {yA : y ∈ IR^m_+}. □

Summarizing, for a finitely generated integral monoid M = {yA : y ∈ ZZ^m_+}, we have:

M is finitely constrained, i.e., linear Weyl–Minkowski duality holds in ZZ_+
⇔ M = {yA : y ∈ IR^m_+} ∩ ZZ^n = K(M) ∩ ZZ^n
⇔ ∀c ∈ ZZ^n: {yA = c, y ∈ IR^m_+} is consistent if and only if {yA = c, y ∈ ZZ^m_+} is consistent
⇔ the rows of A are a Hilbert basis for M.

Considering all possibilities, an integral monoid may be:
(i) neither constrained nor finitely generated, e.g., M = {(0, 0)} ∪ {(x_1, x_2) ∈ ZZ^2 : (x_1, x_2) > 0} or M = {(x_1, x_2) ∈ ZZ^2 : x_1 ≤ √2 x_2};
(ii) constrained but not finitely generated, e.g., M = {(x_1, x_2) ∈ ZZ^2_+ : √3 x_2 ≥ x_1 ≥ √2 x_2};
(iii) finitely generated but not constrained, e.g., M = {0, 2, 3, 4, 5, ...};
(iv) both constrained and finitely generated, hence also finitely constrained (4.23), e.g., M = ZZ^n ∩ {x : Ax ≥ 0}, for A ∈ Q^{m×n}.

Exercise 4.24 Given A ∈ ZZ^{m×n} and c ∈ ZZ^n, exactly one holds: ....
(i) Complete this assertion to an analogue for (1.15, 2.7, 3.15) in the ZZ_+ setting.
(ii) Show that the assertion is false for A = (2, 3)^T, c = 1.
(iii) State conditions on A which characterize validity of the assertion. □
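Reading the matrix in part (ii) as A = (2, 3)^T (a reconstruction consistent with the running example M = M({2, 3})), the failure is quick to verify by machine; the check below is a sketch under that assumption:

    from itertools import product

    # For A = (2, 3)^T and c = 1: y = (1/2, 0) >= 0 solves yA = c over the
    # reals, but 2*y1 + 3*y2 = 1 has no solution in nonnegative integers.
    assert 2 * 0.5 + 3 * 0 == 1
    assert not any(2 * y1 + 3 * y2 == 1 for y1, y2 in product(range(2), repeat=2))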

Superadditive Constraint Representations

We have seen above that the finitely generated integral monoid M = {0, 2, 3, 4, ...} cannot be represented using linear constraints of the form a_i x ∈ ZZ_+. This monoid can, however, be finitely represented if we broaden the class of constraint functions to include ⌊·⌋ operations; e.g., M = {x ∈ ZZ : ⌊(1/2)x⌋ − (1/3)x ≥ 0}. Such functions capture the integrality properties inherent in monoids; they were introduced in Gomory's initial work on integer programming (see Schrijver, Theory of Linear and Integer Programming (Wiley, 1986), for example) and they remain of fundamental importance to the subject. The present development follows C. Blair and R. Jeroslow, Math. Prog. 23(1982)237-273 and J. Ryan, Ph.D. Thesis, Cornell, 1986.

Define C^n as the smallest family of functions f : IR^n → IR so that:
(i) f linear ⇒ f ∈ C^n;
(ii) f = αg + βh with α, β ∈ IR_+ and g, h ∈ C^n ⇒ f ∈ C^n;
(iii) f = ⌊g⌋ with g ∈ C^n ⇒ f ∈ C^n.
C = ∪_{n≥1} C^n is the class of Chvátal functions, equivalently, linear functions with ⌊·⌋ operations inserted with positive multipliers; e.g., 2x_1 − (3/4)x_2 + ⌊(1/2)x_1 − x_2⌋ + ⌊(2/3)x_1⌋ + (4/5)⌊−(1/5)x_2 + (1/2)⌊(1/2)x_2⌋⌋. The equivalence follows from the fact that each member of the latter class is a Chvátal function and the entire class is closed under operations (i)-(iii).
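The single ⌊·⌋ constraint above really does carve out M = {0, 2, 3, 4, ...}; the following sketch checks this on a window of integers using exact rational arithmetic (the window size is an arbitrary choice):

    from fractions import Fraction
    from math import floor

    def f(x):
        # Chvatal function f(x) = floor(x/2) - x/3
        return floor(Fraction(x, 2)) - Fraction(x, 3)

    M = {x for x in range(-20, 60) if f(x) >= 0}
    assert M == {0} | set(range(2, 60))  # M = {0, 2, 3, 4, ...} in this window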

Note that there may be different representations for the same Chvátal function, e.g.,

⌊⌊(1/6)x⌋ − (1/6)x⌋ = ⌊(1/2)(⌊(1/2)x⌋ + ⌊(1/3)x⌋ − (5/6)x)⌋ = 0 if x ≡ 0 (mod 6), and −1 otherwise.

The rank of f ∈ C is the minimum depth of operations (ii), (iii) needed to express f, formally, the least integer r ≥ 0 satisfying:
(i) f linear ⇒ f is of rank r = 0;
(ii) f = αg + βh with g, h of rank < r ⇒ f is of rank ≤ r;
(iii) f = ⌊g⌋ with g of rank < r ⇒ f is of rank ≤ r.
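Returning to the mod-6 example above, a quick computation confirms that the two representations define the same function (exact rationals again; the test window is arbitrary):

    from fractions import Fraction as F
    from math import floor

    g1 = lambda x: floor(F(floor(F(x, 6))) - F(x, 6))
    g2 = lambda x: floor(F(1, 2) * (floor(F(x, 2)) + floor(F(x, 3)) - F(5, 6) * x))

    assert all(g1(x) == g2(x) == (0 if x % 6 == 0 else -1) for x in range(-60, 61))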

In fact, each Chv´atal function has a unique carrier. As we now show, this is a consequence of the bounds given in parts (iii) and (iv) of the previous result. (Note that in the earlier example the two different representations give the same carrier.)

Corollary 4.26 |C(f)| = 1 ∀f ∈ C; i.e., each Chvátal function has a unique carrier.

Proof: If f′, f″ ∈ C(f), then using k′, k″ as in Proposition 4.25(iv), we get: k′ + k″ ≥ (f′(x) − f(x)) + (f″(x) − f(x)) ≥ (f′(x) − f(x)) + (f(x) − f″(x)) = f′(x) − f″(x). As the difference of two distinct linear functions is unbounded, it follows that f′ = f″. □

It is evident (inductively) that the carrier of a Chvátal function f is its linear relaxation obtained by removing the ⌊·⌋ operations from any representation for f; we will denote the carrier of f by f′. Henceforth, we restrict attention to rational Chvátal functions, defined using rational linear functions and rational multipliers for the ⌊·⌋ operations. Now, any rational f ∈ C^n can be expressed in the form f(x) = Σ_{i=1}^r α_i g_i(x) + ℓ(x), where α_i ∈ Q_+, g_i(x) = ⌊h_i(x)⌋, h_i ∈ C^n ∀i, and ℓ(x) is a rational linear function. Assume (recursively) each h_i is also of this form, giving the representation R of f. The integrality properties of f are determined by its sub-carriers (nested linear pieces), defined recursively as C(R, f) = {g′_1,...,g′_r} ∪ (∪_{i=1}^r C(R, h_i)). For example, when f(x) = ⌊(1/2)⌊(1/3)x⌋ + (1/5)⌊(1/2)x⌋⌋, then we have C(R, f) = {(1/6)x + (1/10)x, (1/3)x, (1/2)x}.
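For this example one can check by machine that f agrees with its carrier at an integer x exactly when every sub-carrier is integral there, which is the content of Proposition 4.27 below (exact rationals; the test window is arbitrary):

    from fractions import Fraction as F
    from math import floor

    # f(x) = floor((1/2)*floor(x/3) + (1/5)*floor(x/2)), carrier x/6 + x/10
    f = lambda x: floor(F(1, 2) * floor(F(x, 3)) + F(1, 5) * floor(F(x, 2)))
    carrier = lambda x: F(x, 6) + F(x, 10)
    subcarriers = [lambda x: F(x, 6) + F(x, 10),
                   lambda x: F(x, 3),
                   lambda x: F(x, 2)]

    for x in range(-90, 91):
        integral = all(s(x).denominator == 1 for s in subcarriers)
        assert (f(x) == carrier(x)) == integral  # Proposition 4.27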

53 Proposition 4.27 For x ZZn with (R, f) as above, f 0(x)= f(x) s(x) ZZ s C(R, f). ∈ ⇔ ∈ ∀ ∈ 0 r 0 r 0 Proof: f (x)= f(x)= i=1 αigi(x)+ `(x) 0= f (x) f(x)= i=1 αi(gi(x) gi(x)). 0 ⇒ − 0 − Since αi 0 and gi(x) P gi(x) for each i, we must have gi(x)= gPi(x) i. 0 ≥ ≥ ZZ ∀ Thus gi(x)= gi(x)= hi(x) i. b 0 c ∈0 ∀ 0 Also, hi(x) gi(x)= gi(x)= hi(x) hi(x), and so hi(x)= hi(x) i. By induction≥ we may assume s(x) ≥ZZ s C(R, h ) i, hence s(x∀) ZZ s C(R, f). ∈ ∀ ∈ i ∀ ∈ ∀ ∈ Conversely, s(x) ZZ s C(R, f) f(x)= f 0(x), as there is no rounding in f(x). 2 ∈ ∀ ∈ ⇒ Corollary 4.28 Suppose f n is rational with carrier f 0. ∈ C Then there exists 0=k ZZ , such that f(k x)= f 0(k x) x ZZn. 2 6 f ∈ + f f ∀ ∈ ZZm ZZm×n When an integral monoid is finitely generated, say M = yA : y + with A , { m ∈ } ∈ it follows from Proposition 4.6 that K(M) = yA : y IR+ ; a similar derivation shows m { ∈ } L(M) = yA : y IR and Z(M) = yA : y ZZm . Thus generators for K(M), L(M), { ∈ } { ∈ } n and Z(M) are immediate from those for M. When M = x ZZ : fi(x) 0, 1 i m , n { ∈ ≥ ≤ ≤ } for rational f1,...,fm , superadditivity of the fi implies M is an integral monoid, and ∈ C 0 we will say M is finitely -constrained. Where fi (x)=0 x M 1 i k( m), the following three results showC that K(M), L(M), and Z(M)∀ are∈ again⇔ easily≤ determined,≤ ≤ this time from the Chv´atal restrictions defining M: 0 K(M)= x : fi (x) 0, 1 i m , L(M)= {x : f 0(x)=0≥ , 1 ≤i ≤k ,} { i ≤ ≤ } Z(M)= x : f 0(x)=0, 1 i k, x ZZn, and h(x) ZZ h k C(R , f ) . { i ≤ ≤ ∈ ∈ ∀ ∈ ∪i=1 i i } n n Theorem 4.29 Let M = x ZZ : fi(x) 0, 1 i m , for rational f1,...,fm . Then K(M)= x : f 0(x{) ∈0, 1 i ≥m . ≤ ≤ } ∈ C { i ≥ ≤ ≤ } t Proof: x K(M) x = j=1 λjzj, where λj 0, zj M j ∈ 0 ⇒ t 0 t ≥ ∈ ∀ fi (x)= j=1Pλjfi (zj) j=1 λjfi(zj) 0 i. ⇒ n≥ ≥ ∀ For the reverse inclusion,P let x Q beP a generator of the finite, rational cone 0 ∈ ZZ ZZn x : fi (x) 0, 1 i m and choose 0=k0 + so that k0x . By{ Corollary≥ 4.28 there≤ exists≤ } 0=k ZZ so6 that∈f (k z)= f 0(k z)∈ z ZZn. 6 i ∈ + i i i i ∀ ∈ Thus for k = k k k we have f (kx)= f 0(kx) 0 i kx M x K(M). 2 0 1 ··· m i i ≥ ∀ ⇒ ∈ ⇒ ∈ n n Theorem 4.30 Let M = x ZZ : fi(x) 0, 1 i m , for rational fi and f 0(x)=0 x M for (precisely){ ∈ 1 i≥ k. Then≤ ≤L(M})= x : f 0(x)=0∈, C1 i k . i ∀ ∈ ≤ ≤ { i ≤ ≤ } 0 Proof: Clearly, L(M)= L(K(M)); thus we consider K(M)= x : fi (x) 0, 1 i m . 0 { ≥ ≤ ≤ } The relations fi (x) 0, 1 i k constitute the equality set for the entire cone K(M). The desired result{ now≥ follows≤ as≤ in} the proof of Theorem 3.37. 2

Exercise 4.31 Show in (4.30) and (4.32) that f_i(x) = 0 ∀x ∈ M ⇔ f′_i(x) = 0 ∀x ∈ M. Thus it is immaterial whether the equality set condition is stated in terms of f_i or f′_i. □

54 n Theorem 4.32 Let M = x ZZ : fi(x) 0, 1 i m , with Ri a representation for n { ∈ 0 ≥ ≤ ≤ } rational fi , 1 i m, and fi (x)=0 x M for (precisely) 1 i k. Then Z(M)=∈ Cx ZZ≤n : h≤(x) ZZ h k C(∀R ∈, f ) L(M) ≤ ≤ { ∈ ∈ ∀ ∈ ∪i=1 i i } ∩ = x ZZn : h(x) ZZ h k C(R , f ); f 0(x)=0, 1 i k . { ∈ ∈ ∀ ∈ ∪i=1 i i i ≤ ≤ } Proof: ( ) For x Z(M), evidently x ZZn L(M) and x = u v, where u, v M. ⊆ ∈k ∈ ∩ ZZ − ∈ Now for any h i=1C(Ri, fi), we must have h(u), h(v) , because (e.g., for u) ∈ ∪ 0 0 ∈ ZZ u M 0 fi(u) fi (u)=0 fi(u)= fi (u)=0, 1 i k h(u) , by (4.27). ∈ ⇒k ≤ ≤ ⇒ ≤ ≤ZZ ⇒ ∈ Since h i=1C(Ri, fi) is linear, we have h(x)= h(u) h(v) . ∈ ∪ ZZn ZZ k − ∈ ( ) Suppose x L(M) and h(x) h i=1C(Ri, fi). ⊇ ∈ ∩0 ∈ ∀ ∈ ∪ Then x L(M) fi (x)=0, 1 i k. ∈ ZZ ⇒ k ≤ ≤ 0 Also, h(x) h i=1C(Ri, fi) fi(x)= fi (x)=0, 1 i k, by (4.27). ∈ ∀ ∈ ∪ ⇒0 0 ≤ ≤ 0 For k +1 i m, zi M s.t. fi (zi) = 0; thus fi (zi) fi(zi) 0 fi (zi) > 0. With z = ≤ m ≤ z ,∃ select∈ p ZZ so that6 f (pz)= f 0(pz≥) > f ≥(x),⇒ k +1 i m. i=k+1 i ∈ + i i − i ≤ ≤ P =0+0=0, 1 i k Then superadditivity implies fi(pz + x) fi(pz)+ fi(x) ≤ ≤ ≥ > 0, k +1 i m. ( ≤ ≤ Now f (pz + x) 0, 1 i m pz + x M. i ≥ ≤ ≤ ⇒ ∈ Since pz M, we have x =(pz + x) pz u v : u, v M = Z(M). 2 ∈ − ∈{ − ∈ }

Let C^0 ⊆ C denote those functions which are either linear or of the form ⌊f⌋ − f, with f linear. Thus when g = ⌊f⌋ − f, we have g′ = 0; moreover, g provides a means for expressing an integrality restriction on f as a linear equality or inequality constraint, since f(x) ∈ ZZ ⇔ g(x) = 0 ⇔ g(x) ≥ 0. C^0 functions characterize (recall Proposition 4.23) when an integral monoid is precisely the intersection of the ZZ-module and cone it generates.

Theorem 4.33 Let M ZZn be a finitely generated integral monoid. Then: (i) M is finitely constrained⊆ by linear functions M = K(M) ZZn; (ii) M is finitely constrained by functions ⇔M = K(M) Z∩(M). C0 ⇔ ∩ Proof:(i) This is Proposition 4.23. n (ii) ( ) Suppose M = x ZZ : fi(x) 0, 1 i m , where ⇒ { ∈ ≥ ≤ ≤0 } fi(x)= gi(x) gi(x), 1 i k and fi(x)= fi (x), k +1 i m (linear). b c − 0 ≤ ≤ ≤ ≤ Then K(M)= x : fi (x) 0, k +1 i m L(M) { n ≥ ≤ ≤ } ⊆ and Z(M)= x ZZ : gi(x) ZZ, 1 i k L(M). { ∈ ZZn∈ ≤ ZZ≤ } ∩ 0 Hence K(M) Z(M)= x : gi(x) , 1 i k; fi (x) 0, k +1 i m ∩n { ∈ ∈ ≤ ≤ ≥ ≤ ≤ } = x ZZ : gi(x) gi(x) 0, 1 i k; fi(x) 0, k +1 i m = M. { ∈ b c − ZZm≥ ≤ ≤ZZm×n ≥ ≤ ≤ } ( ) Suppose M = yA : y + , with A . ⇐ { ∈m } ∈ ZZm Then K(M)= yA : y IR+ and Z(M)= yA : y . By Theorem 3.12{ (for rational∈ } data), K(M)={ x ∈IRn : Bx} 0 , for some B Qp×n. { ∈ ≥ } r×n ∈ q×n By Theorem 2.24, Z(M)= x ZZn : Cx =0, Dx ZZq , for some C Q ,D Q . Thus M = K(M) Z(M)={ x∈ ZZn : Bx 0, Cx∈=0}, Dx ZZq . ∈ ∈ ∩ { ∈ ≥ ∈ } Now, for each di D, we may express dix ZZ by dix dix 0. Hence M is finitely∈ -constrained. 2 ∈ b c − ≥ C0 55 The following schematic summarizes our development for finitely generated integral monoids and Chv´atal functions.

Diagram 4.34 For finitely generated integral monoid M = {yA : y ∈ ZZ^m_+}, A ∈ ZZ^{m×n}:

{yA : y ∈ ZZ^m_+}  ⊆  {yA : y ∈ IR^m_+} ∩ {yA : y ∈ ZZ^m}  ⊆  {yA : y ∈ IR^m_+} ∩ ZZ^n
        ‖          (1)                 ‖                  (2)             ‖
        M          ⊆             K(M) ∩ Z(M)              ⊆          K(M) ∩ ZZ^n

Discussion: By Theorem 4.33(i), equality holds in both (1) and (2) ⇔ M is finitely linearly constrained ⇔ ∀c ∈ ZZ^n: {yA = c, y ∈ ZZ^m_+} is consistent iff {yA = c, y ∈ IR^m_+} is consistent. By Theorem 4.33(ii), equality holds in (1) ⇔ M is finitely C^0-constrained ⇔ ∀c ∈ ZZ^n: {yA = c, y ∈ ZZ^m_+} is consistent iff both {yA = c, y ∈ ZZ^m} and {yA = c, y ∈ IR^m_+} are consistent. □

As indicated above, M = K(M) ∩ Z(M) holds precisely for those monoids M which are finitely C^0-constrained. In fact, this is valid in an asymptotic sense for all finitely generated integral monoids: sufficiently deep within K(M), i.e., sufficiently far from the hyperplanes constraining K(M), the monoid M is identical to its lattice Z(M).

m×n ∗ Proposition 4.35 For A ZZ , there exists k ZZ+ (dependent only on A) so that whenever x yA : ∈y k∗, 1 i m we∈ have x M(A) x Z(A). ∈{ i ≥ ≤ ≤ } ∈ ⇔ ∈ ∗ ZZp×m Proof: Let k = p(maxi,j bij ), with the rows of bij = B a basis for y : yA =0 . Clearly, x M(A) x | Z(|A). { } ∈ { } Suppose x∈ Z(A) and⇒ choose∈ y ZZm, y0 IRm for which ∈ 0 0 ∈ ∗ ∈ yA = x and y A = x, y k 1m, where 1m = (1,..., 1). 0 0 ≥ p 0 ∗ Now, (y y)A =0 y y = λB, λ IR y + λB = y k 1m. We also have− y + λ⇒B −ZZm and y + ∈λ B ⇒0, since (λ ≥λ )B k∗1 . b c ∈ b c ≥ − b c ≤ m Thus (y + λ B)A = yA = x M(A). 2 b c ∈ Consider once again Theorem 4.29, which stipulates that any finitely -constrained integral monoid generates a polyhedral cone. Combining this with Jeroslow’sC characterization in Theorem 4.8 yields the following Minkowski-type relation (cf. Corollary 2.27, Theorem 3.16).

Theorem 4.36 Every finitely C-constrained integral monoid is finitely generated. □

The converse Weyl-type relation (cf. Theorems 1.12, 2.24, 3.12) was established by C. Blair and R. Jeroslow in Math. Prog. 23(1982)237-273. For this, they introduce the class of Gomory functions G = ∪_{n≥1} G^n, where G^n is the smallest family of functions f : IR^n → IR obeying properties (i)-(iii) stated earlier for Chvátal functions, and the additional property:
(iv) f = min{g, h} with g, h ∈ G^n ⇒ f ∈ G^n.
As in the development of Chvátal functions, we define the rank of f ∈ G^n as the minimum number of operations of the form (ii), (iii), or (iv) needed in the construction of f, starting with linear functions. Induction on rank can be used to show:

Exercise 4.37 Each Gomory function is: (i) superadditive; (ii) a finite minimum of Chv´atal functions. 2

Note that by 4.37(ii), a constraint f(x) 0 determined by f n is equivalent to a finite ≥ ∈ G system of Chv´atal functions, say g (x) 0, 1 i m, with each g n. That is, for i ≥ ≤ ≤ i ∈ C f = min g1,...,gm we have f(x) 0 gi(x) 0 i. { } ≥ ⇔ ≥ ∀ ZZm In the following we consider the integral monoid M = yA : y + , finitely generated by m×n { ∈ } the rows ai of A ZZ . The proof that M has a finite representation in terms of Chv´atal restrictions proceeds∈ by induction on the number of generators. We require the following two lemmas, which construct Chv´atal restrictions for a monoid with m generators from valid Chv´atal restrictions for any (sub-)monoid generated by m 1 of the generators. − n n Lemma 4.38 Suppose g satisfies g(a1)= 1 and g(ai) 0, 2 i m, and h . Then there exists f ∈n Gsuch that: − ≥ ≤ ≤ ∈ G ∈ G (i) f(a1)= h(a1) and f(ai) h(ai), 2 i m; (ii) g(x) ZZ f(x)+ g(x)≥f(a )= h(x≤+ g≤(x)a ). ∈ ⇒ 1 1 Proof: The proof is by induction on the construction of h; for h linear, take f = h. For h = h , with h n, let (by induction) f , h satisfy (i), (ii). b 1c 1 ∈ G 1 1 With f(x)= f (x)+(f (a ) f (a ) )g(x) , stipulations (i), (ii) hold for f, h, since: b 1 1 1 − b 1 1 c c f(a1)= f1(a1) = h1(a1) = h(a1); f(ai) f1(ai) h1(ai) = h(ai), 2 i m; g(x) ZZb f(xc)=b f (x)+c f (a )g(x) ≥f b(a ) g(cx ≥) b c ≤ ≤ ∈ ⇒ b 1 1 1 c − b 1 1 c = h1(x + g(x)a1) h(a1)g(x)= h(x + g(x)a1) g(x)f(a1). For h = αh with α 0, setb f = αf , wherec −f , h satisfy (i), (ii), − 1 ≥ 1 1 1 and for h = h1 + h2, set f = f1 + f2, where f1, h1 and f2, h2 satisfy (i), (ii); in either case, one easily verifies (i), (ii) for f, h. For h = min h , h , where h (a ) h (a ) and both f , h and f , h satisfy (i), (ii), with { 1 2} 1 1 ≤ 2 1 1 1 2 2 f(x)= min f (x), f (x)+(h (a ) h (a ))g(x) we again satisfy (i), (ii), since: { 1 2 2 1 − 1 1 } f(a1)= min f1(a1), f2(a1) (f2(a1) f1(a1)) = f1(a1)= h1(a1)= h(a1); f(a ) min{f (a ), f (a ) − min h−(a ), h (a}) = h(a ), 2 i m; i ≥ { 1 i 2 i } ≥ { 1 i 2 i } i ≤ ≤ g(x) ZZ f(x)= min h1(x + g(x)a1) g(x)f1(a1), ∈ ⇒ { h (x + g(x)a−) g(x)f (a )+(h (a ) h (a ))g(x) 2 1 − 2 1 2 1 − 1 1 } = g(x)f1(a1)+ min h1(x + g(x)a1), h2(x + g(x)a1) = −g(x)f(a )+ h(x +{ g(x)a ). 2 } − 1 1 57 Suppose now that function h n accurately represents the sub-monoid of M generated by only m 1 of the generators,∈ say G a ,...,a ; i.e., h(x) 0 x M( a ,...,a ). We − 2 m ≥ ⇔ ∈ { 2 m} now exploit the previous lemma to show how h can be used to construct, for each k ZZ+, a n ∈ function g with the following properties. (i) g(a1) 1 and g(a2) 0,...,g(am) 0 – and it therefore∈ G follows from superadditivity that the≥ −constraint g(x≥) 0 is valid≥for ZZ ≥ M( a2,...,am ), i.e., g(x) 0 for x = i≥2 yiai with yi + i, and nearly valid for the { } ≥ ZZm ∈ ∀ n entire monoid M = yA : y + , inP the sense that g(x) yi for x = i=1 yiai with y ZZ i; hence the{ condition∈ g(}a ) 1 serves to moderate≥ − the extent to which g(x) i ∈ + ∀ 1 ≥ − P can be negative when x M. (ii) g(x) k 1 whenever x M, so that g indicates that x M with a negative∈ value which is ≤arbitrarily − − large for large6∈ k. Thus, as k = 0, 1,... increases,6∈ the functions g n constructed in the following lemma uniformly retain the near ∈ G validity for M expressed in (i), while becoming stronger and stronger indicators for x M. 6∈ n Lemma 4.39 Suppose h(x) 0 x M( a2,...,am ), h . ≥ ⇔ ∈ n{ } ∈ G Then for each k ZZ+ there exists g for which: (i) g(a ) ∈1 and g(a ) 0, 2 ∈i G m; 1 ≥ − i ≥ ≤ ≤ (ii) x / M( a ,...,a ) g(x) k 1. 
∈ { 1 m} ⇒ ≤ − − Proof: If h(a ) 0, then (i), (ii) hold for g =(k + 1) h . 1 ≥ b c If h(a1) < 0, we scale so that h(a1)= 1. Proceeding by induction on k, for k =− 0 we take g = h . n b c To complete the proof, we assume that g satisfies (i), (ii) for k ZZ+ and determine an appropriate function in ∈n Gwhich satisfies (i), (ii) for∈ k + 1. 1 G 1 If g(a1) > 1, then (i), (ii) hold for α g , where α = min g(a1), 2 . − b− c n { − } If g(a1)= 1, we take min g, f , with f provided by Lemma 4.38. Then f(a )=− h(a )= b1 and{ f(}ca ) h(a )∈ G0, 2 i m, which implies (i) holds. 1 1 − i ≥ i ≥ ≤ ≤ In order to validate (ii), suppose x / M( a1,...,am ). Then: if g(x) < k 1, clearly min g∈(x), f{(x) g}(x) k 2; if g(x)= −k − 1, then 4.38(b ii){ implies }c≤b c ≤ − − − − f(x)= g(x)f(a1)+ h(x + g(x)a1)=( k 1) + h(x (k + 1)a1) < k 1; hence f(x)− k 2, which implies that− (ii)− also holds− for min g, f −. −2 b c ≤ − − b { }c We now establish the Weyl-type converse of Theorem 4.36 relating finitely generated integral monoids and finite families of Chv´atal restrictions.

Theorem 4.40 Every finitely generated integral monoid is finitely -constrained. C ZZm ZZm×n Proof: Let M = yA : y + , where A . The proof is by induction{ ∈ on m} , the number∈ of generators. n For m = 1, select b1,...,bn−1 so that L( a1 )= x IR : bjx =0, 1 j n 1 . 2 { } { ∈ ≤ ≤ − } For w = a1/ a1 , we have x = y1a1 wx = y1, for any y1 IR. || || n ⇒ ∈ Hence M = x IR : bjx 0, bjx 0, 1 j n 1; wx 0; wx wx 0 . Thus M is finitely{ ∈ -constrained≥ in≤ the 1-dimensional≤ ≤ − case, and≥ web considerc − m>≥ }1. C 58 n By Theorem 4.33(ii) and Exercise 4.37(ii), f1(x) 0 x Z(M) K(M), with f1 . ≥ ⇔ ∈ ∩ ∈ G n By induction, for 1 i m we have hi(x) 0 x Mi = M( aj : j = i ), with hi . ∗ ≤ ≤ ≥ ⇔ ∈ { n 6 } ∈ G For k = k 1 as in Proposition 4.35, Lemma 4.39 provides gi , 1 i m, so that − ∗ ∈ G ≤ ≤ gi(ai) 1; gi(aj) 0, j = i; gi(x) k , for x / M Mi. Suppose b≥,...,b − Qn≥generate6 the cone≤ −z : Az 0∈ and⊇ define f n by 1 p ∈ { ≥ } 2 ∈ G f (x)= min b x + m (a b )g (x) . 2 1≤j≤p { j i=1 i j i } We claim that for f = min f1,P f2 , we obtain x M f(x) 0. When x / M, either x /{ Z(A)} or x / yA : y∈ k∗⇔1 , by≥ Proposition 4.35. ∈ ∈ ∈{ ≥ m} If x / Z(A), then (4.33) implies f1(x) < 0, hence f(x) < 0. ∈ ∗ ∗ If x / yA : y k 1m , (3.22) implies (x k 1mA)s< 0, for some s z : Az 0 . ∈{ ∗ ≥ } ∗ m − ∈{ ≥ } Hence (x k 1mA)bj = bjx k i=1 aibj < 0, for some j. − m − ∗ m Thus f2(x) bjx + i=1(aibj)gi(Px) bjx k i=1 aibj < 0, and hence f(x) < 0. Next consider≤ x M; i.e., x = m ≤y a ,− with y ZZ . ∈ P k=1 k k Pk ∈ + Since a K(M) Z(M) k, we have f (a ) 0 k. k ∈ ∩ ∀ P 1 k ≥ ∀ And gk(ak) 1, gi(ak) 0 k, i = k and aibj 0 i, j implies f (a )=≥min − b a≥+(∀a b∀)g6 (a )+ (≥a b )∀g (a ) 0 k. 2 k 1≤j≤p { j k k j k k i6=k i j i k } ≥ ∀ Hence f(ak)= min f1(ak), f2(ak) 0 k. P Since x = m y a{, with y ZZ} ≥k, f∀(x) 0 follows from superadditivity. k=1 k k k ∈ + ∀ ≥ Finally, since Pf is a finite minimum of Chv´atal functions (by Exercise 4.37(ii)), M is finitely -constrained. 2 C Theorems 4.36 and 4.40 provide a duality for finitely generated integral monoids and finitely -constrained Chv´atal functions analogous to the linear Weyl-Minkowski duality of finite cones.C As an immediate corollary of Theorem 4.40, we obtain the following variant of the Farkas condition (compare Theorems 1.15, 2.7, 3.15), now in the setting of integral monoids.

Corollary 4.41 For A ∈ ZZ^{m×n} and c ∈ ZZ^n, exactly one holds: (i) ∃y ∈ ZZ^m_+ such that yA = c; (ii) ∃f ∈ C^n such that f(a_i) ≥ 0 ∀a_i ∈ A, f(c) < 0. □

5 Polyhedra

Linear systems have played a central role in our development thus far. The simplest among these are finite systems of homogeneous equalities, i.e., systems of the form {Ax = 0} for some matrix A. We saw in Corollary 1.13 that subspaces characterize solution sets for such systems. Moreover, Theorem 1.7 shows that this characterization remains valid even for arbitrary, not necessarily finite, sets of homogeneous equalities. For inhomogeneous systems {Ax = b}, Theorem 1.27 characterizes the solution sets as affine spaces and once again the Finite Basis Theorem shows that this still holds for infinitely many inhomogeneous equalities. When we pass to inequality systems, we have seen that the distinction between finite and infinite systems can no longer be ignored. Restricting to finite homogeneous systems, i.e., of the form {Ax ≥ 0} for some matrix A, Corollary 3.13 shows that now finite cones characterize the solution sets. We shall return later to remove the restriction to finiteness, characterizing solution sets for arbitrary homogeneous inequality systems as topologically closed cones and solution sets for general inhomogeneous inequality systems as closed convex sets. We have yet to study the closed convex sets arising from finite inhomogeneous inequality systems, and this is the topic we now consider.

Classical Results

We focus attention throughout this section on convex sets analogous to finite cones. A polytope is a convex set generated by a finite collection of points and a polyhedron is a convex set defined by finitely many linear inequalities, i.e., of the form {x : Ax ≥ b}, where A ∈ IR^{m×n}, b ∈ IR^m. Just as affine spaces inherit many properties of their homogeneous counterparts, linear spaces, many of the results derived for finite cones remain valid for polytopes and polyhedra. For clarity, sometimes in the following development we will use 0_n to denote the zero-vector of IR^n and 1_n for the n-vector of all ones.

Theorem 5.1 (Weyl – inhomogeneous) Let P = {yB + zC : y ≥ 0, z ≥ 0, Σ_{i=1}^q z_i = 1}, where B ∈ IR^{p×n}, C ∈ IR^{q×n}. Then P is a polyhedron; i.e., P = {x : Ax ≥ b}, for some A ∈ IR^{m×n}, b ∈ IR^m.

Proof: If P = ∅, we take A = [0_n] and b = 1 (scalar); thus we assume P ≠ ∅ (p, q ≥ 1). For P′ ⊆ IR^{n+1} as follows, Weyl's Theorem (3.12) yields A ∈ IR^{m×n}, b ∈ IR^m so that:

P′ = {(y, z) [B 0_p; C 1_q] : y ≥ 0, z ≥ 0} = {(x, x_{n+1}) : [A −b] (x, x_{n+1}) ≥ 0}.

Now, clearly, x ∈ P ⇔ (x, 1) ∈ P′ ⇔ Ax ≥ b. □

Exercise 5.2 Suppose x ≠ 0 and a ∉ L = {λx ∈ IR^n : λ ∈ IR}. Theorem 5.1 shows L + {a} is a polyhedron. Is C(L ∪ {a}) a polyhedron? □

In the proof of (5.1), we apply Weyl's Theorem to a finite cone in IR^{n+1}, hence a homogeneous linear inequality system, derived from the original description for P ⊆ IR^n. This technique is called homogenization; it has already been used in proving (2.34) and (2.35) and we use it here in various forms to derive inhomogeneous analogues for classical results of cone duality.

Theorem 5.3 (Minkowski – inhomogeneous) Let P = {x : Ax ≥ b}, for A ∈ IR^{m×n}, b ∈ IR^m. Then P is the sum of a finite cone with a polytope; i.e., P = {yB + zC : y ≥ 0, z ≥ 0, Σ_{i=1}^q z_i = 1}, for some B ∈ IR^{p×n}, C ∈ IR^{q×n}.

Proof: If P = ∅, take B and C vacuous (p = q = 0); thus we assume P ≠ ∅. For P′ ⊆ IR^{n+1} as follows, Minkowski's Theorem (3.16) yields D ∈ IR^{r×(n+1)} so that:

P′ = {(x, x_{n+1}) ∈ IR^{n+1} : [A −b; 0_n 1] (x, x_{n+1}) ≥ 0} = {uD : u ≥ 0}.

Scale the rows of D so that the final column is (0, 1)-valued. Re-arranging the rows, for B ∈ IR^{p×n}, C ∈ IR^{q×n}, p + q = r we may write D = [B 0_p; C 1_q]. Since (x, 1) ∈ P′ ∀x ∈ P ≠ ∅, C is nonvacuous (q ≥ 1). If B is vacuous, take B = [0_n] (p = 1). Thus x ∈ P ⇔ (x, 1) ∈ P′ ⇔ x = yB + zC, with y ≥ 0, z ≥ 0, Σ_{i=1}^q z_i = 1. □

Taken together, (5.1) and (5.3) are sometimes called the Double Description Theorem. That is, finite descriptions using conical and convex generators produce the same geometric objects as those using linear inequalities. Thus the sum of a finite cone with a polytope yields a polyhedron, and conversely. Note that any polytope is simply the sum of itself and the trivial cone {0}; it follows that any polytope P is a bounded polyhedron; i.e., for some δ ∈ IR we have ‖x‖ ≤ δ ∀x ∈ P. Theorem 5.3 shows that the converse holds as well, since P ≠ ∅ and bounded implies B has only zero entries in (5.3).

Corollary 5.4 Polytopes are (precisely) bounded polyhedra. □

Thus for an unbounded polyhedron P, the cone {yB : y ≥ 0} in (5.1) and (5.3) must be nontrivial. This cone, consisting of the directions in which P recedes to infinity, is given by {x : Ax ≥ 0}. It is known as the recession cone of polyhedron P ≠ ∅, formally defined as rec(P) = {x : y + x ∈ P ∀y ∈ P}. The following exercise summarizes its basic properties. Part (iii) of the exercise shows that when P ≠ ∅ in Theorems 5.1 and 5.3, the rows of matrix B generate rec(P). Furthermore, parts (i) and (iii) emphasize that different representations for P, whether by inequalities or by generators, yield the same recession cone – the recession cone is, by definition, a geometric object determined by P itself.

Exercise 5.5 Let P ⊆ IR^n be a nonempty polyhedron with rec(P) = K. Then:
(i) P = {x : Ax ≥ b} ⇒ K = {x : Ax ≥ 0}; hence K is a finite cone.
(ii) P is a polytope ⇔ K = {0}.
(iii) P = K′ + Q, for K′ a finite cone and Q a polytope ⇒ K′ = K. □

We point out explicitly that for matrix A ∈ IR^{m×n} fixed, all nonempty polyhedra in the family P(b) = {x : Ax ≥ b} ∀b ∈ IR^m have the same recession cone independent of b; i.e., P(b) ≠ ∅ ⇒ rec(P(b)) = {x : Ax ≥ 0}. In this sense, the recession cone is A-generic. The same homogenization as that used in proving Theorem 5.3 can also be used to establish an inhomogeneous counterpart for the Farkas Theorem (3.15).

Theorem 5.6 (Farkas – inhomogeneous) For A ∈ IR^{m×n}, b ∈ IR^m, c ∈ IR^n, δ ∈ IR, exactly one holds:
(i) ∃y ≥ 0 such that yA = c, yb ≥ δ;
(ii) ∃x ∈ IR^n such that Ax ≥ b, cx < δ or Ax ≥ 0, cx < 0.

Proof: Apply (3.15) to A′ = [A −b; 0_n 1] ∈ IR^{(m+1)×(n+1)} and c′ = (c, −δ) ∈ IR^{n+1}. Either ∃(y, y_{m+1}) ≥ 0 for which yA = c, −yb + y_{m+1} = −δ, which is equivalent to (i), or ∃(x, x_{n+1}) for which Ax − bx_{n+1} ≥ 0, x_{n+1} ≥ 0, cx − δx_{n+1} < 0, but not both. The two cases in (ii) arise from the two possibilities x_{n+1} > 0, x_{n+1} = 0. □
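Theorem 5.6 is directly checkable by linear programming. The sketch below is a numerical illustration using scipy's linprog (the sample data are arbitrary): it tests alternative (i) as an LP feasibility problem and, failing that, searches for a witness to (ii).

    import numpy as np
    from scipy.optimize import linprog

    def farkas_inhomogeneous(A, b, c, delta):
        """Decide which alternative of Theorem 5.6 holds for given data."""
        m, n = A.shape
        # (i): feasibility of {y >= 0 : yA = c, yb >= delta}
        res = linprog(np.zeros(m), A_ub=-b.reshape(1, m), b_ub=[-delta],
                      A_eq=A.T, b_eq=c, bounds=[(0, None)] * m, method="highs")
        if res.status == 0:
            return "(i)", res.x
        # (ii), first case: minimize cx over Ax >= b
        res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(None, None)] * n,
                      method="highs")
        if res.status == 0 and res.fun < delta:
            return "(ii): Ax >= b, cx < delta", res.x
        # (ii), second case: cx < 0 on Ax >= 0; a box keeps this LP bounded
        # (an unbounded LP above also yields such a recession direction)
        res = linprog(c, A_ub=-A, b_ub=np.zeros(m),
                      bounds=[(-1, 1)] * n, method="highs")
        return "(ii): Ax >= 0, cx < 0", res.x

    A, b = np.array([[1.0, 0.0], [0.0, 1.0]]), np.zeros(2)
    print(farkas_inhomogeneous(A, b, np.array([1.0, 1.0]), 0.0))  # alternative (i)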

Exercise 5.7 A key component in proving Theorems 5.3 and 5.6 is the homogenization

P = {x : Ax ≥ b} → P′ = {(x, x_{n+1}) ∈ IR^{n+1} : [A −b; 0_n 1] (x, x_{n+1}) ≥ 0}.

(i) Show that P′ = {(y, 0) : y ∈ rec(P)} + K({(x, 1) : x ∈ P}), and hence that P′ is a geometric object, not dependent on the representation for P.
(ii) Establish a similar result for the homogenization of Theorem 5.1.
(iii) What is the relation between the two cones P′ from (i) and (ii)? □

We will see later that Theorem 5.6 leads to a simple proof of the fundamental Strong Duality Theorem of linear programming. For now, we concentrate on several applications of (5.6) analogous to those studied earlier for its homogeneous counterpart, Theorem 3.15. As with polyhedral cones, when Ax ≥ b ⇒ cx ≥ δ, we say that cx ≥ δ is implied by the system {Ax ≥ b}, is valid for P = {x : Ax ≥ b} and is inessential (or redundant) in the representation {x : Ax ≥ b, cx ≥ δ}. Theorem 5.6 can be used to characterize valid inequalities when P ≠ ∅. If (i) holds in (5.6), then Ax ≥ b ⇒ cx = yAx ≥ yb ≥ δ. If (i) fails, then (ii) must hold, so either cx < δ for some x ∈ P or (as P ≠ ∅) cx < 0 for some x ∈ rec(P); in either case we have Ax ≥ b ⇏ cx ≥ δ.

Corollary 5.8 Suppose polyhedron P = {x : Ax ≥ b} ≠ ∅. Then: cx ≥ δ is valid for P ⇔ yA = c, yb ≥ δ for some y ≥ 0 ⇔ (c, −δ) ∈ K([A −b; 0_n 1]). □

Repeated deletion of an inessential inequality from the representation for P will lead to an irredundant or minimal description. Applying Carathéodory's Theorem (3.8) to the second statement in (5.8) shows that any inequality implied by a consistent system in IR^n is actually a consequence of at most n + 1 of the system's relations. Similarly, for a polyhedron represented by conical and convex combinations of generators, as in Theorems 5.1 and 5.3, Carathéodory's Theorem immediately implies that at most n + 1 generators are needed to represent any specific point of the polyhedron. More generally, we have the following.

Exercise 5.9 (Carathéodory – inhomogeneous) For D = K(S) + C(T) and S, T ⊆ IR^n show that: x ∈ D ⇔ x = Σ_{i=1}^p y_i s_i + Σ_{j=1}^q z_j t_j, for y_i ≥ 0, s_i ∈ S ∀i; z_j ≥ 0, t_j ∈ T ∀j; Σ_{j=1}^q z_j = 1; and the vectors (s_1, 0), ..., (s_p, 0), (t_1, 1), ..., (t_q, 1) are linearly independent. □

Finally, we consider whether P = {yB + zC : y ≥ 0, z ≥ 0, Σ_{i=1}^q z_i = 1}, with B ∈ IR^{p×n}, C ∈ IR^{q×n}, is a minimal generator representation. If P = ∅, the only minimal representation is the trivial one with B and C vacuous (p = q = 0). Assume P ≠ ∅. Then {yB : y ≥ 0} = rec(P) and a row of B can be omitted from this representation if and only if rec(P) is not altered, by part (iii) of (5.5). Thus a row of B is inessential if and only if it is a conical combination of the other rows in B. Similarly, it is straightforward to check that a row of C can be dropped precisely when it can be expressed using the remaining generators. We thus have the following characterization of minimal systems of generators.

Theorem 5.10 For P = {yB + zC : y, z ≥ 0, Σ_{i=1}^q z_i = 1} ≠ ∅, with B ∈ IR^{p×n}, C ∈ IR^{q×n}:
(i) b_j ∈ B is inessential ⇔ b_j = yB, for some y ≥ 0 with y_j = 0;
(ii) c_j ∈ C is inessential ⇔ c_j = yB + zC, for some y, z ≥ 0 with z_j = 0, Σ_{i=1}^q z_i = 1. □

Thus Theorem 5.6 can be used to characterize inessential constraints and generators for representations of nonempty polyhedra. A simplified variant of (5.6) is useful for applications in which the linear system {Ax ≥ b} is inconsistent, i.e., when P = {x : Ax ≥ b} = ∅. This result is an immediate consequence of Corollary 3.22; it can also be deduced from (5.6) using c = 0_n, δ = 1 or from (3.15) using [A −b] and (0_n, −1).

Corollary 5.11 For A ∈ IR^{m×n} and b ∈ IR^m, exactly one holds:
(i) ∃y ≥ 0 such that yA = 0, yb = 1; (ii) ∃x ∈ IR^n such that Ax ≥ b. □

Using (5.11) we extend earlier results on separating hyperplanes (see (3.16) ff.).

Theorem 5.12 Suppose P = {x : Ax ≥ b} ≠ ∅, Q = {x : Cx ≥ d} ≠ ∅, and P ∩ Q = ∅. Then some hyperplane separates P and Q.

Proof: Applying (5.11) to P ∩ Q = {x : Ax ≥ b, Cx ≥ d} = ∅, we get y, z ≥ 0 for which yA + zC = 0, yb + zd = 1; i.e., yA = −zC, yb > −zd. Moreover, yA ≠ 0 (hence zC ≠ 0), since for x̄ ∈ P, x̂ ∈ Q we have yAx̄ ≥ yb, zCx̂ ≥ zd; but yA = −zC = 0 ⇒ 0 = yAx̄ + zCx̂ ≥ yb + zd = 1, a contradiction. Now take f = yA = −zC ≠ 0 and γ = yb. Then x ∈ P ⇒ fx = yAx ≥ yb = γ and x ∈ Q ⇒ fx = −zCx ≤ −zd < yb = γ. □

Recall the geometric content of the Farkas Theorem (3.15), namely, that c is not in the finite cone K = {yA : y ≥ 0} if and only if c can be separated from K by a hyperplane. This is true a fortiori when K is replaced by a polyhedron, for if c ∉ P = {x : Ax ≥ b}, then Ac ≱ b and some row of {Ax ≥ b} determines a hyperplane separating c from P. Theorem 5.12 is more general, allowing c to be a second polyhedron. Observe that taking γ = (yb − zd)/2 in the proof gives a strict separation; i.e., fx > γ ∀x ∈ P and fx < γ ∀x ∈ Q. Similarly, strict separation holds for a point not in a finite cone, a particular instance of (5.12).

When P = {x : Ax ≥ b} = ∅, any constraint is, by definition, valid for P, or implied by the system {Ax ≥ b}. It is still reasonable to ask, though, whether a given inequality of the system {Ax ≥ b} is unnecessary in this representation for P, i.e., whether the remaining inequalities of the system cannot be simultaneously satisfied. This situation is characterized in Corollary 5.11 – a linear inequality system {Ax ≥ b} is inconsistent if and only if the contradictory constraint 0x ≥ 1 is implied by its relations. Recursive deletion of unnecessary inequalities ultimately leaves a minimal inconsistent system. Applying Carathéodory's Theorem in this context yields the classical result that any minimal inconsistent system defined on n variables has at most n + 1 relations.

Corollary 5.13 If {x : Ax ≥ b} = ∅, for A ∈ IR^{m×n}, b ∈ IR^m, then {x : A_R x ≥ b_R} = ∅, for linearly independent (hence ≤ n + 1) rows [A_R b_R] ⊆ [A b].

Proof: Since {x : Ax ≥ b} = ∅, (5.11) implies yA = 0, yb = 1, for some y ≥ 0. By Carathéodory's Theorem (3.8), we may assume y has at most n + 1 positive components. Rows i of {Ax ≥ b} for which y_i > 0 then define the subsystem {A_R x ≥ b_R}. □

Thus {x : Ax ≥ b} = ∅ can be proved by demonstrating alternative (i) of (5.11) on a small (≤ n + 1 rows) subsystem of {Ax ≥ b}. The bound n + 1 in Corollary 5.13 is sharp, i.e., cannot be reduced. Consider, e.g., {x_1 ≤ 0, ..., x_n ≤ 0, x_1 + ··· + x_n ≥ 1}. Also note the contrapositive statement of (5.13), that the linear system {Ax ≥ b} is consistent if and only if each subsystem of n + 1 relations is consistent. Corollary 5.13 may be restated for families of polyhedra in the following form.

Corollary 5.14 Suppose ∩_{i=1}^m P_i = ∅, where P_i ⊆ IR^n are polyhedra, 1 ≤ i ≤ m. Then ∩_{i∈I} P_i = ∅ for some I ⊆ {1,...,m} with |I| ≤ n + 1. □

The assertion of (5.14) remains valid if we replace the polyhedra $P_i$ by arbitrary convex sets $C_i$, $1 \le i \le m$. To see this, let $x_S \in \cap_{i\in S}C_i \ne \emptyset\ \forall S \subseteq \{1,\ldots,m\}$, $|S| = n+1$, and define $P_i = C(\{x_S : S \ni i\}) \subseteq C_i$, for $1 \le i \le m$. Now $x_S \in \cap_{i\in S}P_i \ne \emptyset\ \forall S \subseteq \{1,\ldots,m\}$, $|S| = n+1$, so (5.14) implies $\cap_{i=1}^m P_i \ne \emptyset$. But $P_i \subseteq C_i\ \forall i$, hence $\cap_{i=1}^m C_i \ne \emptyset$. This result is Helly's Theorem. Below we indicate a second proof for it, a proof which implicitly contains the combinatorial result known as Radon's Theorem.

Theorem 5.15 (Helly 1923) Let $\cap_{i=1}^m C_i = \emptyset$, for convex sets $C_i \subseteq \mathbb{R}^n$, $1 \le i \le m$. Then $\cap_{i\in I}C_i = \emptyset$ for some $I \subseteq \{1,\ldots,m\}$ with $|I| \le n+1$.

Proof [Radon 1921]: We prove the contrapositive. The result is clear for $m = n+1$.
Suppose $m > n+1$ and assume $\cap_{i\in I}C_i \ne \emptyset\ \forall I \subseteq \{1,\ldots,m\}$ with $|I| = n+1$.
By induction we have $\cap_{i\ne j}C_i \ne \emptyset$, so let $x_j \in \cap_{i\ne j}C_i$, $1 \le j \le m$.
As $m > n+1$, the points $(x_1, 1), \ldots, (x_m, 1)$ are linearly dependent; thus $\sum_{j=1}^m \lambda_j(x_j, 1) = 0$ and we define $P = \{j : \lambda_j \ge 0\} \ne \emptyset$, $N = \{j : \lambda_j < 0\} \ne \emptyset$.
Now $y = \sum_{j\in P}\lambda_jx_j / \sum_{j\in P}\lambda_j$ is a convex combination of points $x_j \in \cap_{i\in N}C_i$, $j \in P$, and $z = \sum_{j\in N}(-\lambda_j)x_j / \sum_{j\in N}(-\lambda_j)$ is a convex combination of points $x_j \in \cap_{i\in P}C_i$, $j \in N$.
Furthermore, $y = z$, as $\sum_{j\in P}\lambda_jx_j = \sum_{j\in N}(-\lambda_j)x_j$ and $\sum_{j\in P}\lambda_j = \sum_{j\in N}(-\lambda_j)$.
Thus $y \in \cap_{i\in N}C_i$ and $z \in \cap_{i\in P}C_i$; since $y = z$, it follows that $\cap_{i\in N\cup P}C_i \ne \emptyset$. □

Theorem 5.16 (Radon 1921) Any family of affinely dependent points (copies are allowed), in particular any collection of points in $\mathbb{R}^n$ with more than $n+1$ members, can be partitioned into two subfamilies whose convex spans have nonempty intersection. □

Consideration of any $n+1$ affinely independent points in $\mathbb{R}^n$ shows that the value $n+1$ here cannot be reduced. The convex span of $n+1$ affinely independent points in $\mathbb{R}^n$ is a simplex.

Exercise 5.17 Consider the simplex $C(\{a_1, a_2, \ldots, a_{n+1}\})$ defined by the points $a_i \in \mathbb{R}^n$. Show that its volume is $|\det(A)|/n!$, where matrix $A$ has rows $a_i - a_{n+1}$, $i = 1, \ldots, n$. □

Simplices are evidently fundamental for minimal inconsistent systems of linear inequalities. Consider again the minimal inconsistent system $\{x_1 \le 0, \ldots, x_n \le 0,\ x_1 + \cdots + x_n \ge 1\}$ discussed earlier. For $A$ with rows $a_i = -e_i$, $1 \le i \le n$, $a_{n+1} = (1,\ldots,1)$ and $b = (0_n, 1)$, we denote this system as $\{Ax \ge b\}$. Now, for each $i$, the $n \times n$ equality system resulting after removal of the $i$th relation, i.e., $\{a_jx = b_j,\ j \ne i\}$, has independent rows, and hence a unique solution $\bar{x}_i$; specifically, $\bar{x}_i = e_i$, $1 \le i \le n$, and $\bar{x}_{n+1} = 0_n$. These points generate a simplex $C(\{\bar{x}_1, \ldots, \bar{x}_{n+1}\})$, for which an inequality representation results when the inequalities of $\{Ax \ge b\}$ are reversed, i.e., $\{Ax \le b\}$. We show later in Theorem 5.45 that a similar situation holds for all minimal inconsistent systems. Here we give related algebraic characterizations. (The "$\Leftarrow$" assertion in the corollary can be proved using Proposition 1.4.)

Theorem 5.18 For $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$, the system $\{Ax \ge b\}$ is minimally inconsistent $\Leftrightarrow$ $\mathrm{rank}(A \setminus \{a_i\}) = m - 1\ \forall i$ and $yA = 0$, $yb = 1$ for some $y > 0$.

Proof: ($\Rightarrow$) By (5.11) and the proof of (5.13), we have $yA = 0$, $yb = 1$ for some $y > 0$.
If $\mathrm{rank}(A \setminus \{a_k\}) < m - 1$, then $\exists z \in \mathbb{R}^m$ for which $z \ne 0$, $zA = 0$, $z_k = 0$, and $zb \le 0$.
Using $z$ we construct $y' \ge 0$ for which $y'A = 0$, $y'b > 0$ and $y'_i = 0$ for some $i$:
if some $z_i > 0$, define $y' = y - \epsilon z$, for $\epsilon = \min_{z_i>0}\{y_i/z_i\}$;
if $z \le 0$ and $zb < 0$, define $y' = -z$;
if $z \le 0$ and $zb = 0$, define $y' = y + \epsilon z$, for $\epsilon = \min_{z_i<0}\{y_i/(-z_i)\}$.
Via (5.11), $y'$ leads to a contradiction of minimality.
($\Leftarrow$) Since $yA = 0$, $yb = 1$ with $y > 0$, (5.11) implies $\{Ax \ge b\}$ is inconsistent.
Also, any proper row subset is independent, so its related equality system is consistent. □

Corollary 5.19 (Motzkin 1933, Fan 1956) $\{Ax \ge b\}$ is a minimal inconsistent system $\Leftrightarrow$ $\mathrm{rank}(A) = m - 1$ and $yA = 0$, $yb > 0$ for some $y > 0$.
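The volume formula of Exercise 5.17 is easy to check numerically. A minimal sketch (NumPy assumed; the helper name simplex_volume is ours), applied to the simplex $C(\{\bar{x}_1, \ldots, \bar{x}_{n+1}\})$ above for $n = 3$:

```python
import numpy as np
from math import factorial

def simplex_volume(points):
    """Volume of the simplex spanned by n+1 points of R^n (Exercise 5.17):
    |det(A)| / n!, where A has rows a_i - a_{n+1}, i = 1,...,n."""
    pts = np.asarray(points, dtype=float)
    n = pts.shape[1]
    return abs(np.linalg.det(pts[:-1] - pts[-1])) / factorial(n)

# Simplex C({e_1, e_2, e_3, 0}) from the minimal inconsistent system above:
print(simplex_volume([(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)]))  # 1/6
```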

Facial Structure of Polyhedra

Recall that for $c \in \mathbb{R}^n$, $\delta \in \mathbb{R}$, and polyhedron $P \subseteq \{x : cx \ge \delta\}$, the inequality $cx \ge \delta$ is valid for $P$. Any valid inequality for $P$ determines a face of $P$ given by $\{z \in P : cz = \delta\}$. Clearly, when $P = \emptyset$, every inequality is valid and determines the face $\emptyset$. We exclude this trivial case and assume throughout this section that $P \ne \emptyset$. It is an easy consequence of the characterization of valid inequalities given in Corollary 5.8 that for a given vector $c$ there exists a value $\delta$ so that $cx \ge \delta$ is valid for $P = \{x : Ax \ge b\} \ne \emptyset$ if and only if $c \in K(A)$. (To prove the "if" assertion, one simply takes any $\delta$ satisfying $\delta \le yb$, where $y \ge 0$, $yA = c$.) Thus it is precisely those $c \in K(A) = \{x : Ax \ge 0\}^+ = \mathrm{rec}(P)^+$ (Exercise 5.5) which give rise to valid inequalities.

If $cx \ge \delta$ is valid with $c = 0$, then we must have $\delta \le 0$, which allows only two possibilities: $\delta = 0$, which determines the face $P = \{z \in P : 0z = 0\}$, or $\delta < 0$, yielding the face $\emptyset$. When $cx \ge \delta$ is valid with $c \ne 0$ and $\{z \in P : cz = \delta\} \ne \emptyset$, then $\{x : cx = \delta\}$ is a supporting hyperplane for $P$. (Note that the hyperplane may, in fact, contain all of $P$.) Thus all nonempty, proper faces of $P$, i.e., excluding only $\emptyset$ and $P$ itself, are determined by supporting hyperplanes. Of course, each such supporting hyperplane is, in turn, determined by a nonzero vector $c \in K(A)$. We now show that the converse of this statement also holds.

Theorem 5.20 Let $P = \{x : Ax \ge b\} \ne \emptyset$, where $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$, and let $c \in \mathbb{R}^n$. Then $\{x : cx = \delta\}$ supports $P$ for some $\delta \in \mathbb{R}$ $\Leftrightarrow$ $0 \ne c \in K(A)$.

Proof: ($\Rightarrow$) This is immediate from the discussion preceding the theorem.
($\Leftarrow$) By Theorem 5.3, there exist finitely many generators, say $u_i, v_j \in \mathbb{R}^n$, so that $P = \{\sum_i y_iu_i + \sum_j z_jv_j : y_i \ge 0\ \forall i,\ z_j \ge 0\ \forall j,\ \sum_j z_j = 1\}$.
Now, $\{\sum_i y_iu_i : y_i \ge 0\ \forall i\} = \mathrm{rec}(P)$ by (5.5); hence $c \in K(A) = \mathrm{rec}(P)^+ \Rightarrow cu_i \ge 0\ \forall i$.
Defining $\delta = cv_k = \min_j cv_j$ yields a valid inequality $cx \ge \delta$, since $\forall x \in P$ we may write $cx = \sum_i y_i(cu_i) + \sum_j z_j(cv_j) \ge \sum_j z_j(cv_k) = cv_k = \delta$, with $y_i \ge 0\ \forall i$, $z_j \ge 0\ \forall j$, $\sum_j z_j = 1$.
Moreover, $v_k \in \{x : cx = \delta\} \cap P \ne \emptyset$, so for $c \ne 0$ the hyperplane $\{x : cx = \delta\}$ supports $P$. □

The faces of $P$ are geometric objects, and, as was the case with finite cones, they are easily determined from a particular algebraic representation. The following exercise shows that the development in Chapter 3 for faces of finite cones essentially carries over directly to the polyhedral setting.

Exercise 5.21 For $P = \{x : Ax \ge b\} \ne \emptyset$ and $P' = \{(x, x_{n+1}) : \begin{bmatrix} A & -b \\ 0_n & 1 \end{bmatrix}\begin{bmatrix} x \\ x_{n+1} \end{bmatrix} \ge 0\}$, where $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$, show that:

nonempty faces of $P$ $\stackrel{1:1}{\longleftrightarrow}$ faces of $P'$ whose equality sets do not contain $x_{n+1} \ge 0$. □

In view of the facial correspondence between $P$ and $P'$ given in the previous exercise and the correspondence of valid inequalities established in Corollaries 3.29 and 5.8, viz., $cx \ge \delta$ is valid for $P$ if and only if $cx - \delta x_{n+1} \ge 0$ is valid for $P'$, the following polyhedral restatement of Theorem 3.36 is immediate.

Theorem 5.22 For $\emptyset \ne F \subseteq P = \{x : Ax \ge b\}$, the following are equivalent:
(i) $F$ is a face of $P$;
(ii) $F = \{z \in P : cx \ge cz\ \forall x \in P\}$ for some $c \in K(A)$;
(iii) $F = \{z \in P : A_Fz = b_F\}$ for some subsystem $\{A_Fx \ge b_F\}$ of $\{Ax \ge b\}$. □

Corollary 5.23 The following hold for polyhedron $P$:
(i) $P$ has finitely many faces;
(ii) every face of $P$ is a polyhedron;
(iii) $F' \subseteq F$ and $F$ a face of $P$ $\Rightarrow$ [$F'$ is a face of $F$ $\Leftrightarrow$ $F'$ is a face of $P$];
(iv) the intersection of two faces of $P$ is a face of $P$. □

As with cones, the equality set for face $F$ is the (unique) maximal subsystem $\{A_Fx \ge b_F\}$ for which $F = \{z \in P : A_Fz = b_F\}$. The equality set for $P$ consists of the implicit equalities of the representation, i.e., those inequalities which hold at equality for all $x \in P$. Maximality of the equality set $\{A_Fx \ge b_F\}$ implies that for each row $a_i$ of $A$ not in $A_F$, some $z_i \in F$ satisfies $a_iz_i > b_i$. Thus if we define $\hat{z} = \sum_i z_i / |A \setminus A_F|$, we obtain an interior point $\hat{z} \in F$ which satisfies $a_i\hat{z} > b_i$ for all $a_i$ not in $A_F$. Not excluded is the case $F = \{\hat{z}\}$, where $\hat{z}$ itself is an interior point of $F$, even though possibly $A_F = A$. Thus each nonempty face contains an interior point. On the other hand, the following shows that each point of $P$ is interior for some face of $P$.

Exercise 5.24 Let $x \in P$ with $F$ a face of $P$. Show that $x$ is an interior point for $F$ if and only if $F$ is the (unique) minimal face of $P$ containing $x$. □

Let $C$ be a convex set, with $S$ a convex subset of $C$. Then $S$ is extremal (an extreme subset) provided either (i) $S = \emptyset$ or (ii) whenever $x, y \in C$, $0 < \lambda < 1$, and $\lambda x + (1-\lambda)y \in S$, then $x, y \in S$. Condition (ii) stipulates that when the interior of any line segment in $C$ intersects $S$, the entire line segment lies in $S$. Let $S \ne \emptyset$ be extremal in $P = \{x : Ax \ge b\}$ with $F$ a smallest (minimal) face of $P$ containing $S$. Minimality of $F$ implies $S$ contains an interior point, say $\hat{x}$, of $F$. Let $\bar{x}$ be any other point of $F$. Now, since $A_F\hat{x} = A_F\bar{x} = b_F$ and $a_i\hat{x} > b_i$, $a_i\bar{x} \ge b_i$ for all other rows $a_i$ of $A$, for $\epsilon > 0$ sufficiently small we have $(1+\epsilon)\hat{x} - \epsilon\bar{x} = \tilde{x} \in F$ with $\hat{x} = \frac{1}{1+\epsilon}\tilde{x} + \frac{\epsilon}{1+\epsilon}\bar{x}$. Extremality of $S$ implies that $\bar{x} \in S$; i.e., $F = S$. So extreme subsets are faces. Conversely, for each nonempty face $F \subseteq P$ some valid inequality for $P$ holds at equality within $P$ for precisely those points on $F$; it follows easily that $F$ is extremal.

Theorem 5.25 $F$ is extremal in polyhedron $P$ if and only if $F$ is a face of $P$. □

One argues similarly that a face’s equality set determines its affine span, hence its dimension.

Theorem 5.26 $\mathcal{A}(F) = \{x : A_Fx = b_F\}$ for each nonempty face $F$.

Proof: Recall Theorem 3.37: $\mathcal{A}(F) \subseteq \{x : A_Fx = b_F\}$; for $A_F\bar{x} = b_F$ and $\hat{x}$ interior to $F$, $(1+\epsilon)\hat{x} - \epsilon\bar{x} = \tilde{x} \in F$ for $\epsilon > 0$ sufficiently small, hence $\frac{1+\epsilon}{\epsilon}\hat{x} - \frac{1}{\epsilon}\tilde{x} = \bar{x} \in \mathcal{A}(F)$. □

Corollary 5.27 Each nonempty face $F$ satisfies $\dim(F) = n - \mathrm{rank}(A_F)$.

Proof: Since $\mathcal{A}(F) = \{x : A_Fx = b_F\}$, a translation of subspace $\{x : A_Fx = 0\}$, we have:
$\dim(F) = \dim(\mathcal{A}(F)) = \dim(\{x : A_Fx = b_F\}) = \dim(\{x : A_Fx = 0\}) = n - \mathrm{rank}(A_F)$. □

For the face $F \ne \emptyset$, let us denote $C_F = \{c : cx \ge cz\ \forall x \in P,\ \forall z \in F\}$; i.e., $c \in C_F$ precisely when there is a valid inequality $cx \ge \delta$ for $P$ so that $F \subseteq \{x \in P : cx = \delta\}$. Note that $C_F$ is a cone. The fundamental relation between the geometric and algebraic characterizations of faces given in parts (ii) and (iii) of Theorem 5.22 implies the following.

Theorem 5.28 $C_F = K(A_F)$ for each nonempty face $F$.

Proof: For row $a_i \in A_F$, $a_iz = b_i \le a_ix\ \forall z \in F,\ \forall x \in P \Rightarrow a_i \in C_F$; hence $K(A_F) \subseteq C_F$.
To see $C_F \subseteq K(A_F)$, let $c \in C_F$, with $y^* \ge 0$, $y^*A = c$, and $y^*b \ge cz\ \forall z \in F$ (Corollary 5.8).
Clearly, $Az \ge b \Rightarrow cz = y^*Az \ge y^*b$, hence $y^*b = cz\ \forall z \in F$.
Thus $y^*(Az - b) = 0$, which shows $a_iz = b_i\ \forall y^*_i > 0,\ \forall z \in F$.
Now $y^*A = c$ expresses $c$ as a positive combination of rows in $A_F$; hence $K(A_F) = C_F$. □

Combining the two previous results, we have $\dim(C_F) = \dim(\{yA_F : y \ge 0\}) = \mathrm{rank}(A_F)$, hence $\dim(F) + \dim(C_F) = n$, for every face $F$. When $P$ has faces of dimension 0, i.e., vertices (extreme points), $P$ is pointed. From the dimension condition, we see that $P$ is pointed if and only if $P$ has a face $F$ for which $\dim(C_F) = n$. Furthermore, we know from Exercise 3.21 that pointedness of $C_F$ is equivalent to full-dimensionality of $C_F^+ = \{x : A_Fx \ge 0\}$, i.e., that $\{x : A_Fx > 0\} \ne \emptyset$. But the latter relation is easily seen to be equivalent to $\{x : Ax > b\} \ne \emptyset$, i.e., that $P$ is full-dimensional. (If $A\bar{x} > b$ and $\hat{x}$ is an interior point for $F$, then $A_F(\bar{x} - \hat{x}) > 0$; conversely, if $A_F\tilde{x} > 0$ and $\hat{x}$ is an interior point for $F$, then $A(\hat{x} + \epsilon\tilde{x}) > b$ for sufficiently small $\epsilon > 0$.) Thus we have the following result.

Corollary 5.29 Each nonempty face $F$ satisfies $\dim(F) + \dim(C_F) = n$. Furthermore,
$P$ is pointed $\Leftrightarrow$ $C_F$ is full-dimensional for some nonempty face $F$;
$P$ is full-dimensional $\Leftrightarrow$ $C_F$ is pointed for every nonempty face $F$. □

Facets. For a polyhedron $P = \{x : Ax \ge b\} \subseteq \mathbb{R}^n$, Corollary 5.27 says that its dimension is $n$ minus the rank of its implicit equalities. The direct linkage between dimension and equality set rank for faces has important consequences for minimal inequality representations of $P$. A maximal proper face of $P$, distinct from $P$ yet properly contained in no face other than $P$, is a facet of $P$. Of course, $P = \emptyset$ has no facets. And it follows from Theorem 5.22 that $\emptyset$ is the unique facet of $P$ if and only if $P$ is an affine space. Thus a polyhedron has nonempty facets precisely when it is not an affine space. The following theorem characterizes the nonempty facets of $P$ as faces of dimension one less than that of $P$ itself; thus facets arise from supporting hyperplanes defined by the rows of $A$ not among the implicit inequalities. The characterization requires that the representation for $P$ be minimal (irredundant), i.e., that it contain no inessential constraints; thus for each row $a_j \in A$, $P$ is properly contained in $\{x : a_ix \ge b_i\ \forall i \ne j\}$.

Theorem 5.30 Suppose $\{Ax \ge b\}$ is a minimal representation for polyhedron $P$. For face $F \ne \emptyset$ of $P$, the following are equivalent:
(i) $F$ is a facet of $P$;

(ii) $F = \{z \in P : a_iz = b_i\}$, for some row $a_i \in (A \setminus A_P)$;
(iii) $A_F = A_P \cup \{a_i\}$, for some row $a_i \in (A \setminus A_P)$;
(iv) $\dim(F) = \dim(P) - 1$.

Proof: (i) $\Rightarrow$ (ii): For $a_i \in A_F \setminus A_P$, note that $F \subseteq F' = \{z \in P : a_iz = b_i\}$.
Now $F'$ is a proper face of $P$ and $F$ is a maximal proper face, hence $F = F'$.
(ii) $\Rightarrow$ (iii): By minimality of $\{Ax \ge b\}$, there exists $\bar{x}$ for which $a_i\bar{x} < b_i$ but $a_j\bar{x} \ge b_j$ for all other rows.
For an interior point $\hat{x}$ of $P$, $A_P\hat{x} = b_P$ and $a_j\hat{x} > b_j$ for all other rows (including row $i$).
With $0 < \lambda = (a_i\hat{x} - b_i)/(a_i\hat{x} - a_i\bar{x}) < 1$, define $\tilde{x} = \lambda\bar{x} + (1-\lambda)\hat{x}$.
Then $A_P\tilde{x} \ge b_P$, $a_i\tilde{x} = b_i$, and $a_j\tilde{x} > b_j$ for all other rows.
Moreover, $\tilde{x} \in P \Rightarrow A_P\tilde{x} = b_P$, showing that $A_F = A_P \cup \{a_i\}$.
(iii) $\Rightarrow$ (iv): Note $a_i \notin L(A_P)$, else we would have the contradiction $a_i \in A_P$, since $yA_P = a_i$ and $\bar{x} \in F \Rightarrow b_i = a_i\bar{x} = yA_P\bar{x} = yb_P$, so $A_Px = b_P \Rightarrow yA_Px = yb_P \Rightarrow a_ix = b_i$.
Thus $\dim(F) = n - \mathrm{rank}(A_F) = n - \mathrm{rank}(A_P \cup \{a_i\}) = n - (\mathrm{rank}(A_P) + 1) = \dim(P) - 1$.
$\neg$(i) $\Rightarrow$ $\neg$(iv): If $F = P$, clearly (iv) fails.
Suppose $F$ is properly contained in some facet $F'$ of $P$. From Corollary 5.23(iii), we see that $F$ is also a face of $F'$. Two applications of (i) $\Rightarrow$ (iv) now yield $\dim(F) \le \dim(F') - 1 = \dim(P) - 2$. □

Note that condition (iv) is clearly geometric, i.e., not dependent on the representation of $P$. On the other hand, for conditions (ii) and (iii) we have the following.

Exercise 5.31 Show by example (a sketch in two dimensions will suffice) that conditions (ii) and (iii) may fail to characterize facets in the presence of inessential inequalities. □
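In practice, inessential inequalities can be detected with one LP per row: row $i$ is implied by the others exactly when $\min\{a_ix : a_jx \ge b_j,\ j \ne i\}$ is finite and at least $b_i$. A small sketch under that characterization (SciPy assumed; the instance and the helper name essential_rows are ours):

```python
import numpy as np
from scipy.optimize import linprog

def essential_rows(A, b):
    """Indices i whose inequality a_i x >= b_i is NOT implied by the rest:
    row i is inessential iff min{a_i x : a_j x >= b_j, j != i} >= b_i."""
    keep = []
    for i in range(len(b)):
        rest = np.arange(len(b)) != i
        res = linprog(A[i], A_ub=-A[rest], b_ub=-b[rest],
                      bounds=[(None, None)] * A.shape[1], method="highs")
        unbounded = res.status == 3
        if unbounded or res.fun < b[i] - 1e-9:
            keep.append(i)          # removing row i changes the polyhedron
    return keep

# Unit square {x1 >= 0, x2 >= 0, -x1 >= -1, -x2 >= -1} plus the
# inessential inequality x1 + x2 >= -1:
A = np.array([[1.0, 0], [0, 1], [-1, 0], [0, -1], [1, 1]])
b = np.array([0.0, 0, -1, -1, -1])
print(essential_rows(A, b))         # [0, 1, 2, 3]: row 4 is inessential
```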

In Theorem 5.30, relation (i) $\Rightarrow$ (iii) shows that to any facet there corresponds a unique row of $A \setminus A_P$. Moreover, the argument used in proving (ii) $\Rightarrow$ (iii) shows, conversely, that each row of $A \setminus A_P$ determines a unique facet of $P$. Thus the facets of $P$ are in one-to-one correspondence with the rows of $A \setminus A_P$ in a minimal representation. (Note, in particular, that for $\mathbb{R}^n$, there are no nonempty facets and there are no rows in the minimal inequality representation.) For full-dimensional polyhedral cones, this correspondence was the essence of Corollary 3.32. The polyhedral analogue is as follows.

Exercise 5.32 Prove that a polyhedron has a unique (up to positive scaling) minimal inequality representation if and only if it is a hyperplane or it is full-dimensional. □

Combining this exercise with the previous discussion yields an important classical result.

Corollary 5.33 In a full-dimensional polyhedron, nonempty facets are in 1:1 correspondence with the rows of the unique (up to positive scaling) minimal inequality representation. □

Exercise 5.34 Suppose $P = \{x : Ax \ge b\}$ is full-dimensional. Show that $\{Ax \ge b\}$ is a minimal representation for $P$ if and only if for each pair of inequalities $a_ix \ge b_i$, $a_jx \ge b_j$ given by distinct rows of this system, some $\bar{x} \in P$ satisfies $a_i\bar{x} = b_i$, $a_j\bar{x} > b_j$. □

Exercise 5.35 Suppose $S \subseteq \mathbb{R}^n$ with $|S| < +\infty$ and polytope $P = C(S)$ is full-dimensional. Show $cx \ge \delta$ determines a facet of $P$ if and only if: (i) $cx \ge \delta$ is valid for $P$ and (ii) for each $a \notin \{\lambda c : \lambda \ge 0\}$ there exist points $\bar{x}, \tilde{x} \in S$ for which $a\bar{x} > a\tilde{x}$ and $c\bar{x} = c\tilde{x} = \delta$. □

Linealities. A minimal nonempty face of $P$ properly contains no nonempty face of $P$. If $P$ has faces of dimension 0, i.e., when $P$ is pointed, its vertices (extreme points) are its minimal nonempty faces. A face of dimension 1 is called an edge, (extreme) ray, or (extreme) line of $P$, depending on whether it contains, respectively, 2, 1, or 0 extreme points of $P$. Among these, lines are minimal nonempty faces, but edges and rays are not, as they properly contain extreme points. Recall that for a polyhedral cone, say $\{x : Ax \ge 0\}$, the unique minimal nonempty face is its lineality space $\{x : Ax = 0\}$. We adopt the same name, linealities, for the minimal nonempty faces of any polyhedron; this terminology is justified by part (iii) of the following theorem. Note that the linealities of $P$ must be mutually disjoint, since a common element of two distinct linealities would define a face with larger equality set. Moreover, all linealities must have the same dimension, $\dim(\{x : Ax = 0\}) = n - \mathrm{rank}(A)$, which is independent of $b$.

Theorem 5.36 For polyhedron $P = \{x : Ax \ge b\}$ and face $F \ne \emptyset$ of $P$, the following are equivalent:
(i) $F$ is a lineality of $P$;
(ii) $F = \{x : A_Fx = b_F\}$, hence $F$ is an affine space;
(iii) $F$ is a translation of $\{z : Az = 0\}$.

Proof: (i) $\Rightarrow$ (ii): Clearly $F \subseteq \{x : A_Fx = b_F\}$; suppose $A_F\bar{x} = b_F$, yet $\bar{x} \notin F$.
$F = \{x \in P : A_Fx = b_F\}$, so $\bar{x} \notin P$ and $a_i\bar{x} < b_i$ for some $a_i \in A \setminus A_F$.
By minimality of $F$, any $\hat{x} \in F \ne \emptyset$ satisfies $a_i\hat{x} > b_i$.
Define $\lambda_i = (a_i\hat{x} - b_i)/(a_i\hat{x} - a_i\bar{x})$; then $0 < \lambda_i < 1$ and $a_i(\lambda_i\bar{x} + (1-\lambda_i)\hat{x}) = b_i$.
For $\lambda^* = \min\{\lambda_i : a_i\bar{x} < b_i\} > 0$ and $x^* = \lambda^*\bar{x} + (1-\lambda^*)\hat{x}$, we get $x^* \in P$ and $A_Fx^* = b_F$.
Hence $A_F$ is properly contained in $\{a_i : a_ix^* = b_i\}$, contradicting minimality of $F$.
(ii) $\Rightarrow$ (iii): Fix $\hat{x} \in F \ne \emptyset$. Then $A_F\hat{x} = b_F$ and for $Az = 0$ we have $A_Fz = 0$.
Thus $A_F(\hat{x} + z) = b_F$, $\forall z$ such that $Az = 0$; hence $\{\hat{x} + z : Az = 0\} \subseteq F$.
On the other hand, $\bar{x} \in F \Rightarrow \{\hat{x} + \lambda(\bar{x} - \hat{x}) : \lambda \in \mathbb{R}\} \subseteq F \subseteq P$, since $F$ is an affine space.
Hence $A\hat{x} + \lambda A(\bar{x} - \hat{x}) \ge b\ \forall\lambda \in \mathbb{R}$, implying $A(\bar{x} - \hat{x}) = 0$; i.e., $\bar{x} - \hat{x} \in \{z : Az = 0\}$.
Thus $\bar{x} = \hat{x} + (\bar{x} - \hat{x}) \in \{\hat{x} + z : Az = 0\}$, as required.
(iii) $\Rightarrow$ (i): Suppose $F'$ is a face of $P$, with $\emptyset \ne F' \subseteq F = \{\hat{x} + z : Az = 0\}$.
Then for $x' \in F'$, we have $x' = \hat{x} + \hat{z}$, where $A\hat{z} = 0$, and thus
$F = \{x' + (z - \hat{z}) : Az = 0\} = \{x' + z' : A(z' + \hat{z}) = 0\} = \{x' + z' : Az' = 0\} \subseteq F'$.
As $F = F'$, it follows that $F$ is a lineality. □

Just as maximal proper faces of $P$ are fundamental for uniqueness results on minimal inequality representations of $P$ (as in Exercise 5.32 above), the minimal nonempty faces lead to uniqueness results for generator representations. Consideration of linealities enables the following refinement of the Inhomogeneous Minkowski Theorem (5.3).

Theorem 5.37 Suppose $P \ne \emptyset$ is a polyhedron, $K$ a finite cone, and $Q$ a polytope.
Then $P = K + Q$ if and only if:
(i) $K$ is the recession cone of $P$;
(ii) $Q \subseteq P$ and $Q \cap F \ne \emptyset$, for each lineality $F$ of $P$.

Proof: ($\Leftarrow$) For $K$ and $Q$ satisfying (i) and (ii), $K + Q$ is a polyhedron (5.1) and $K + Q \subseteq P$.
For $\bar{x} \in P \setminus (K + Q)$, Theorem 5.12 yields a valid inequality $cx \ge \delta$ for $K + Q$ with $c\bar{x} < \delta$.
We are given that $K = \mathrm{rec}(P)$ and from Exercise 5.5 we also have $K = \mathrm{rec}(K + Q)$.
Thus by Theorem 5.20, $cx \ge \delta'$ is valid for $P$ and $\{z \in P : cz = \delta'\} \ne \emptyset$ for some $\delta' < \delta$.
Now, the face $\{z \in P : cz = \delta'\}$ contains a lineality $F$ of $P$.
Thus $cz = \delta'\ \forall z \in F$, and in particular, $c\tilde{z} = \delta'$, where $\tilde{z} \in Q \cap F \ne \emptyset$, by (ii).
However, $\tilde{z} \in K + Q$ implies $c\tilde{z} \ge \delta > \delta'$.
This contradiction establishes $P \setminus (K + Q) = \emptyset$; i.e., $P = K + Q$.
($\Rightarrow$) If $P = \{x : Ax \ge b\} = K + Q$, (i) follows from 5.5(iii) and, clearly, $Q \subseteq P$.
Suppose $F$ is a lineality and $y + z \in F$, where $y \in K$ and $z \in Q$.
Now, $Q \subseteq P \Rightarrow A_Fz \ge b_F$, and $A_Fy \ge 0 \Rightarrow A_Fz = b_F - A_Fy \le b_F$; thus $A_Fz = b_F$.
Therefore $z \in F \cap Q$, establishing (ii). □

Thus minimal generator representations for $P$ are obtained by taking a minimal generator representation for its recession cone and a single "representative" from each lineality.

Corollary 5.38 Let $P = \{yB + zC : y \ge 0,\ z \ge 0,\ \sum_{i=1}^q z_i = 1\}$, for $B \in \mathbb{R}^{p\times n}$, $C \in \mathbb{R}^{q\times n}$, with $p, q \ge 1$. This is a minimal generator representation for $P$ if and only if:
(i) the rows of $B$ constitute a minimal generating set for the recession cone of $P$;
(ii) the rows $c_i \in C$, $1 \le i \le q$, satisfy $c_i \in F_i$, where the $F_i$ are the linealities of $P$. □

When is there a unique minimal generator representation $K + Q$ for $P \ne \emptyset$? When $K$ is pointed, it has a unique minimal generating set (Theorem 3.31). Moreover, in this case, it follows from Theorem 5.36(iii) that the linealities of $P$ are simply points, giving a unique minimal generating set for $Q$. On the other hand, if $K$ is not pointed, the same reasoning shows that there can be no unique minimal generating set for $Q$. Thus $P$ is pointed precisely when $K$ is pointed; the extreme rays of $K$ are also referred to as extreme rays of $P$. We have the following characterization of uniqueness for generator representations.

Corollary 5.39 A nonempty polyhedron has a unique (up to positive scaling of recession cone generators) minimal generator representation if and only if it is pointed. □

Consider the polyhedron $P = \{x : Ax = b,\ x \ge 0\} \ne \emptyset$, for $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$. Such polyhedra arise naturally in standard form linear programming models. Now, $P \subseteq \mathbb{R}^n_+$ and $\mathbb{R}^n_+$ is pointed, so $P$ is also pointed. Thus, by the previous corollary, the generators of $P$, i.e., vertices and extreme rays, are uniquely determined. It is not difficult to give an algebraic characterization for these generators. Note that since $P \ne \emptyset$, we may assume that $\mathrm{rank}(A) = m$ (Theorem 1.15). Thus, after possible column permutation, $A = [B\ N]$, where $B$ is a column basis for $A$; partitioning $x$ in the same manner yields $x = (x_B, x_N)$. Variables $x_B$ are basic, while the remaining variables (with corresponding columns in $N$) are nonbasic. Expressing $\{Ax = b\}$ as $\{Bx_B + Nx_N = b\}$, it is evident that a solution is given by $(x_B, x_N) = (B^{-1}b, 0)$; this is the basic solution corresponding to basis $B$. When $B^{-1}b \ge 0$, this solution is termed basic feasible. We have already used basic solutions in earlier settings – recall the discussion preceding Exercise 1.34, as well as that leading up to Theorem 3.41.

Exercise 5.40 For $P = \{x : Ax = b,\ x \ge 0\}$, with $A \in \mathbb{R}^{m\times n}$ of rank $m$, show that:
(i) $x$ is a vertex of $P$ $\Leftrightarrow$ $x = (B^{-1}b, 0_{n-m}) \ge 0$, for $B$ a column basis ($x$ is basic feasible);
(ii) $\{\alpha x : \alpha \ge 0\}$ is an extreme ray of $P$ $\Leftrightarrow$ $x = \lambda(-B^{-1}a_j, e_j) \ge 0$, for $B$ a column basis, $a_j$ a nonbasic column (indexed by the $(n-m)$-dimensional unit vector $e_j$), and $\lambda > 0$.
(iii) Illustrate (i) and (ii) for $P$ with the linear system $\{Ax = b\}$ given by:

$-x_1 + x_2 + x_3 = 2$
$-x_1 + 2x_2 + x_4 = 6$. □

Note that by Corollary 5.11, the condition $\{y : yA = 0,\ yb = 1,\ y \ge 0\} \ne \emptyset$ characterizes inconsistency of $\{Ax \ge b\}$; moreover, the polyhedron here is described via linear equalities and nonnegativity restrictions on $y$. One can exploit this observation to obtain the following geometric characterization of minimal inconsistent subsystems, due to Gleeson and Ryan (ORSA J. Computing (1990) 61–63).

Exercise 5.41 For $A \in \mathbb{R}^{m\times n}$ and $b \in \mathbb{R}^m$, show that
minimal inconsistent subsystems of $\{Ax \ge b\}$ $\stackrel{1:1}{\longleftrightarrow}$ vertices of $\{y : yA = 0,\ yb = 1,\ y \ge 0\}$.
In particular, nonzero components of vertices index minimal inconsistent subsystems. □
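Exercise 5.40(i) suggests a brute-force vertex computation: test every column basis for basic feasibility. The sketch below (NumPy assumed; tolerances are ours) does this for the system of part (iii); the same routine applied to $\{y : yA = 0,\ yb = 1,\ y \ge 0\}$, which is already of this standard form, would enumerate the minimal inconsistent subsystems of Exercise 5.41.

```python
import numpy as np
from itertools import combinations

# Vertex enumeration for P = {x : Ax = b, x >= 0} via Exercise 5.40(i):
# every vertex is a basic feasible solution (B^{-1}b, 0).
A = np.array([[-1.0, 1.0, 1.0, 0.0],     # system of Exercise 5.40(iii)
              [-1.0, 2.0, 0.0, 1.0]])
b = np.array([2.0, 6.0])
m, n = A.shape

vertices = set()
for cols in combinations(range(n), m):
    B = A[:, cols]
    if abs(np.linalg.det(B)) < 1e-9:     # columns do not form a basis
        continue
    xB = np.linalg.solve(B, b)
    if np.all(xB >= -1e-9):              # basic feasible solution
        x = np.zeros(n)
        x[list(cols)] = xB
        vertices.add(tuple(np.round(x, 9)))

print(sorted(vertices))   # (0,0,2,6), (0,2,0,2), (2,4,0,0)
```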

Resolution. As a final refinement of Theorem 5.3, the following resolution theorem combines the present development with the cone decomposition of Proposition 3.1.

Theorem 5.42 (Goldman 1956) Let $P = \{x : Ax \ge b\} \ne \emptyset$. Then $P = S + K + Q$ uniquely, where:
(i) $S = \{x : Ax = 0\}$, the lineality subspace of $P$;
(ii) $S + K = \{x : Ax \ge 0\}$, the recession cone of $P$;
(iii) $K$ is a pointed cone;
(iv) $K + Q$ is the pointed polyhedron $P \cap S^o$;
(v) $Q$ is the polytope $C(\{$vertices of $K + Q\})$.

Proof: Denote $S = \{x : Ax = 0\}$, the lineality subspace of $P$.
We decompose $P = S + (P \cap S^o)$, exactly as in the proof of Proposition 3.1.
Since $P \cap S^o$ is a polyhedron, (5.3) $\Rightarrow$ $P = S + K + Q$, for $K$ a finite cone and $Q$ a polytope.
Then condition (i) holds by construction, and Exercise 5.5(iii) establishes (ii).
If $d, -d \in K$, then $d, -d \in S + K$, and (ii) implies $d \in S$;
on the other hand, $K + Q \subseteq S^o \Rightarrow Q \subseteq S^o \Rightarrow K \subseteq S^o - Q = S^o \Rightarrow d \in S^o$.
Thus $d, -d \in K \Rightarrow d \in S \cap S^o \Rightarrow d = 0$; hence $K$ is pointed and (iii) and (iv) hold.
Since $P \cap S^o$ is pointed, it has, by Corollary 5.39, a unique minimal generator representation; thus (v) follows from Corollary 5.38.
Uniqueness of $S$ and $P \cap S^o$ is evident.
$K$ is the (unique) recession cone of $P \cap S^o$, and uniqueness of $Q$ follows from (v). □

As stipulated in the theorem, the geometric objects $S, K, Q$ in the polyhedral decomposition $P = S + K + Q$ are unique. The key to uniqueness of $K$ and $Q$ lies in the orthogonal decomposition $P = S + (P \cap S^o)$ used in the proof. E.g., in the following diagram, $P$ is simply the intersection of two halfspaces $H_1, H_2$ and two decompositions of $P$ are indicated, namely, $P = S + K_1 + \{q_1\} = S + K_2 + \{q_2\}$, though $S^o$ contains neither $K_1 + \{q_1\}$ nor $K_2 + \{q_2\}$.

[Figure: halfspaces $H_1$ and $H_2$ with $P = H_1 \cap H_2$, cones $K_1$, $K_2$, points $q_1$, $q_2$, and the lineality subspace $S$ through 0.]

Furthermore, the minimal generator representations for $K$ (via its extreme rays) and $Q$ (via its extreme points) in the theorem are also unique, since $K$ is pointed. For inequality representations, a result in the same spirit can be obtained by applying Theorem 5.30. Note, in particular, that the result of Exercise 5.32 is also a consequence of the following result.

Exercise 5.43 With $P = \{x : Ax \ge b\} \ne \emptyset$, show $P = H \cap R$ uniquely, for:
(i) $H = \{x : A_Px = b_P\}$, the affine span of $P$;
(ii) $R = \{x : Cx \ge d\}$, where the rows of $C$ are in one-to-one correspondence with the facets of $P$ and lie in the subspace $\{x : A_Px = 0\}$. □

Exercise 5.44 Suppose $P = \{x : Ax \ge b\}$, where
$$A = \begin{bmatrix} 2 & 1 & 3 \\ 1 & -1 & 0 \\ 1 & 2 & 3 \\ -1 & 2 & 1 \end{bmatrix}, \qquad b = \begin{bmatrix} 8 \\ -3 \\ 8 \\ 12 \end{bmatrix}.$$
(i) For $P = S + K + Q$ as in Theorem 5.42, show that $S = \{(-\lambda, -\lambda, \lambda) : \lambda \in \mathbb{R}\}$.
(ii) Determine $\bar{A}, \bar{b}$ so that $K + Q = \{x : \bar{A}x \ge \bar{b}\}$.
(iii) Show that $(1, 4, 5)$ is an extreme point of $K + Q$ and that $(1, 0, 1)$ determines an extreme ray of $K + Q$. □
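A quick numerical check of parts (i) and (iii), under the data as reconstructed above (NumPy/SciPy assumed):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[ 2.0,  1.0, 3.0],     # Exercise 5.44 data
              [ 1.0, -1.0, 0.0],
              [ 1.0,  2.0, 3.0],
              [-1.0,  2.0, 1.0]])
b = np.array([8.0, -3.0, 8.0, 12.0])

# (i) S = {x : Ax = 0} is spanned by (-1,-1,1), up to sign and scaling:
print(null_space(A).ravel())

# (iii) v should be tight on two rows, r should lie in rec(P); both in S^o:
v, r = np.array([1.0, 4.0, 5.0]), np.array([1.0, 0.0, 1.0])
s = np.array([-1.0, -1.0, 1.0])
print(A @ v - b)          # (13, 0, 16, 0): rows 2 and 4 are tight
print(A @ r)              # (5, 1, 4, 0) >= 0, row 4 tight
print(v @ s, r @ s)       # both 0, so v and r lie in S^o
```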

Theorem 5.42 can also be used to specify a geometric characterization of minimal inconsistent inequality systems (Amaldi, Pfetsch, and Trotter, Math. Prog. (2003) 533–554).

Theorem 5.45 For $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$, the system $\{Ax \ge b\}$ is minimally inconsistent $\Leftrightarrow$ $\{Ax = b\}$ is inconsistent and $\{x : Ax \le b\} = S + Q$, where:
(i) $S = \{x : Ax = 0\}$, the lineality subspace of $\{x : Ax \le b\}$;
(ii) $Q = C(\{\bar{x}_1, \ldots, \bar{x}_m\})$ is a simplex, whose vertices $\bar{x}_i$ satisfy $a_j\bar{x}_i = b_j\ \forall j \ne i$.

Proof: ($\Rightarrow$) By (5.18), there exist $\bar{x}_i$, $1 \le i \le m$, satisfying $a_i\bar{x}_i < b_i$ and $a_j\bar{x}_i = b_j\ \forall j \ne i$.
Thus $P = \{x : Ax \le b\} \ne \emptyset$, and by Theorem 5.37, $P = K + Q$, where $K = \{x : Ax \le 0\}$ and $Q \subseteq P$ is a polytope generated by representatives from each lineality of $P$.
Now, $A\hat{x} \le 0$ with $a_i\hat{x} < 0 \Rightarrow A(\bar{x}_i - \epsilon\hat{x}) \ge b$ for large $\epsilon > 0 \Rightarrow \{Ax \ge b\}$ consistent;
hence we must have $Ax = 0\ \forall x \in K$, which establishes (i).
For (ii), note that the linealities $F_i$ of $P$ are determined by $A_{F_i} = \{a_j : j \ne i\}$, for $1 \le i \le m$.
Thus $Q = C(\{\bar{x}_1, \ldots, \bar{x}_m\})$, and $Q$ is a simplex, provided the $\bar{x}_i$ are affinely independent.
But this must be the case, for if $\bar{x}_i = \sum_{j\ne i}\lambda_j\bar{x}_j$ with $\sum_{j\ne i}\lambda_j = 1$, then
$a_i\bar{x}_i = a_i(\sum_{j\ne i}\lambda_j\bar{x}_j) = \sum_{j\ne i}\lambda_j(a_i\bar{x}_j) = \sum_{j\ne i}\lambda_jb_i = b_i$, contradicting $a_i\bar{x}_i < b_i$.
($\Leftarrow$) By condition (ii), every proper subsystem of $\{Ax \ge b\}$ has an equality solution; thus minimality is obvious, once we have shown $\{Ax \ge b\}$ inconsistent.
The proof is by contradiction: let $\hat{x}$ be on a lineality of $\{x : Ax \ge b\} \ne \emptyset$.
Since $\{x : Ax = b\} = \emptyset$, we have $a_i\hat{x} > b_i$ for some $i$.
Similarly, $\bar{x}_i \in Q \subseteq \{x : Ax \le b\}$ and $\{x : Ax = b\} = \emptyset$ imply that $a_i\bar{x}_i < b_i$.
Define $\tilde{x} = \lambda\hat{x} + (1-\lambda)\bar{x}_i$, where $0 < \lambda = (b_i - a_i\bar{x}_i)/(a_i\hat{x} - a_i\bar{x}_i) < 1$.
Then $\tilde{x} \in \{x : Ax \ge b\}$ and $a_i\tilde{x} = b_i$, so the equality set of $\tilde{x}$ properly contains that of $\hat{x}$.
This contradicts the choice of $\hat{x}$, as the equality set of any lineality is maximal. □

6 Linear Programming

A linear programming problem (LP) is to optimize (min or max) a linear function over a polyhedron, e.g., $\min\{cx : Ax \ge b\}$. The linear function $cx$ is termed the objective function and the polyhedron $\{x : Ax \ge b\}$ is the set of feasible solutions for this LP, i.e., its feasible region. While we have not yet dealt directly with such optimization models, they have been implicit in much of our development on polyhedral theory. For example, when a valid inequality $cx \ge \delta$ determines a nonempty face $F$ of polyhedron $P = \{x : Ax \ge b\}$, we have $cx \ge cz\ \forall x \in P,\ \forall z \in F$, so that any $z \in F$ solves the LP $\min\{cx : Ax \ge b\}$. The same is true for any objective function $cx$ for which $c \in C_F$. Thus the nonempty faces of $P$ are precisely the solution sets for LPs over $P$, and it is therefore reasonable to expect that polyhedral theory will provide important insight into linear programming models.

Historical Comments

The following appears in the paper by Liebling, Prodon, and Trotter in the UNESCO volume Encyclopedia of Life Support Systems (2002) 249–320. See also the following references: Gale, The Theory of Linear Economic Models (McGraw-Hill, 1960); Dantzig, Linear Programming and Extensions (Princeton University Press, 1963); Schrijver, Theory of Linear and Integer Programming (Wiley, 1986).

Intensive study of linear programming per se began only at the midpoint of the past century, even though the theoretical foundations for linear systems and polyhedra were laid over a century ago. Indeed, certain optimization models related to linear spaces have been well understood for two centuries. For example, recall that least squares approximation seeks an element of a subspace which lies at minimum distance from a given point; i.e., for $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$ and subspace $S = \{Ax : x \in \mathbb{R}^n\}$, we consider the minimization problem $\min_x \|Ax - b\|$. The solution (Exercise 1.43) dates to Legendre (1805) and Gauss (1809). An alternative solution method, iterative in nature, was proposed in 1823 by Gauss, who evidently found the procedure very elementary – so mindless, in fact, that he could do the computation while half-asleep or thinking about other things (". . . lässt sich halb im Schlafe ausführen, oder man kann wärend desselben an andere Dinge denken.").

Later, Fourier (1826) considered the same problem, but with a different norm, $\min_x \|Ax - b\|_\infty$; i.e., find $x$ which minimizes the largest component magnitude of $Ax - b$. His formulation is apparently the first linear programming model: $\min\{\lambda : -\lambda \le \sum_j a_{ij}x_j - b_i \le \lambda\ \forall i\}$. It is significant that the method of solution he suggested is essentially the same as that most commonly used today. He proposed vertex-to-vertex descent along edges of the polyhedron of feasible solutions (". . . on continue de descendre suivant une seconde arête jusqu'à un nouveau sommet . . . ") until attaining the minimum (". . . au point le plus bas du polyèdre."). Thus Fourier provided a geometric description of the simplex method (discussed later) and, though his application was only three-dimensional, he noted that the method extends to more dimensions (". . . le même procédé convient à un nombre quelconque d'inconnues . . . "). The same problem was considered by de la Vallée Poussin (1910), who also presented a linear programming formulation and gave the complete algebraic details of a solution procedure,

once again the simplex method.

The development of game theory in the 1920's, particularly two-person zero-sum games, is intimately linked to linear programming. In such games there are two players, called R (row) and C (column), and a payoff matrix $A \in \mathbb{R}^{m\times n}$ which specifies a payment $a_{ij}$ from C to R when R selects strategy (row) $i$ and C selects strategy (column) $j$. Players are allowed, moreover, to use randomized strategies: for R, a vector from $\mathcal{Y} = \{y \in \mathbb{R}^m_+ : \sum_i y_i = 1\}$, giving probabilities with which R will select the $m$ rows; similarly for C, a probability vector from $\mathcal{X} = \{x \in \mathbb{R}^n_+ : \sum_j x_j = 1\}$. C's goal is to select a randomized strategy which will minimize loss, taking all possible strategies of R into account, i.e., to solve $\min_{x\in\mathcal{X}}\max_{y\in\mathcal{Y}} yAx$; similarly, R maximizes gain by solving $\max_{y\in\mathcal{Y}}\min_{x\in\mathcal{X}} yAx$. It is elementary that $\min_{x\in\mathcal{X}}\max_{y\in\mathcal{Y}} yAx \ge \max_{y\in\mathcal{Y}}\min_{x\in\mathcal{X}} yAx$, and Borel (1924) conjectured equality for certain matrices. In 1928 von Neumann proved the celebrated Minimax Theorem, establishing equality for any payoff matrix, i.e., $\min_{x\in\mathcal{X}}\max_{y\in\mathcal{Y}} yAx = \max_{y\in\mathcal{Y}}\min_{x\in\mathcal{X}} yAx$. This result is an instance of linear programming duality (discussed below). To see the relation to linear programming, note that if $x$ is fixed, say $x = \bar{x}$, then $\max_{y\in\mathcal{Y}} yA\bar{x}$ is just the largest component of $A\bar{x}$. Thus $\min_{x\in\mathcal{X}}\max_{y\in\mathcal{Y}} yAx$ can be solved by choosing $x$ so that the largest component of $Ax$ is minimized, i.e., by choosing $\lambda, x$ to solve the linear programming problem $\min\{\lambda : \sum_j a_{ij}x_j \le \lambda\ \forall i;\ \sum_j x_j = 1;\ x_j \ge 0\ \forall j\}$. The best randomized strategy for player R is determined by solving $\max\{\mu : \sum_i y_ia_{ij} \ge \mu\ \forall j;\ \sum_i y_i = 1;\ y_i \ge 0\ \forall i\}$, the linear programming dual of C's model. Although these LP models are remarkably similar to the earlier model of Fourier and de la Vallée Poussin, it is significant that the earlier applications arose in the context of analyzing physical data, while the present application is one of economic decision making. Of course, the presence of linear programming models at the heart of von Neumann's Minimax Theorem is revealed by an ex post facto analysis. Nevertheless, Ville (1938) did later prove this theorem using results from the theory of linear inequalities.

The work of Kantorovich (1938) marks the first explicit use of LP models as effective tools for strategic planning. Kantorovich (1960) later indicated that his applications were related ". . . to the organization and planning of production . . . connected specifically with the [former] Soviet [communist] system of economy . . . ". In one model which he studied extensively, execution of jobs $i = 1, \ldots, m$ produces component parts $j = 1, \ldots, n$ for assembly into a final product. Machines $k = 1, \ldots, p$ can be configured to work on the $m$ jobs; assigning machine $k$ to job $i$ for one time unit produces $a_{ijk}$ units of the $j$th component. Thus, if $y_{ik}$ denotes the fraction of its time machine $k$ spends working on job $i$, the quantity $\sum_i\sum_k y_{ik}a_{ijk}$ is the total production of the $j$th part. Assuming that each final product contains exactly one copy of each component, that over-production of components is not allowed, and that each machine must be fully utilized, the following LP maximizes the production output per unit time: $\max\{\mu : \sum_i\sum_k y_{ik}a_{ijk} = \mu\ \forall j;\ \sum_i y_{ik} = 1\ \forall k;\ y_{ik} \ge 0\ \forall i, k\}$. It is significant that Koopmans (1959–60) later showed that any LP problem can be put into this form. Note also that if the model is modified slightly to permit component over-production ($\sum_i\sum_k y_{ik}a_{ijk} \ge \mu\ \forall j$), then the case of a single machine, i.e., $p = 1$, is just the problem of player R in the two-person zero-sum game setting.

Another important model considered by Kantorovich, the transportation problem, was also studied by Hitchcock around this same time. In this model, a commodity produced in quantities $a_i$ at plants $i = 1, \ldots, m$ is demanded in amounts $b_j$ by customers at locations indexed $j = 1, \ldots, n$, and total supply equals total demand, i.e., $\sum_i a_i = \sum_j b_j$; the unit shipping cost from plant $i$ to customer $j$ is $c_{ij}$. The goal is to determine a minimum cost shipping schedule which will meet customer demand while respecting product availability at the production sites, i.e., to solve $\min\{\sum_{i,j} c_{ij}x_{ij} : \sum_j x_{ij} = a_i\ \forall i;\ \sum_i x_{ij} = b_j\ \forall j;\ x_{ij} \ge 0\ \forall i, j\}$. The history of the transportation problem dates to Monge (1784), but the first thorough investigations were those of Kantorovich (1939) and Hitchcock (1941); both gave algorithms for this model which are variants of the simplex algorithm. The special case in which $m = n$ and all $a_i = b_j = 1$ is known as the assignment problem; viz., $n$ people are to be assigned to $n$ jobs at minimum cost. This problem had been studied much earlier by König (1916, 1931) and Egerváry (1931), who gave an efficient combinatorial procedure, today known as the Hungarian algorithm, for its solution.

A further early application of linear programming motivated by economic considerations, whose initial formulation dates at least to J. Cornfield in 1940, was the diet problem studied by Stigler (1945). This model considers nutrients $i$ obtained from a diet made up of food types $j$; $b_i$ is the amount of the $i$th nutrient required over the planning horizon, $a_{ij}$ the amount of the $i$th nutrient supplied in one unit of the $j$th food type, and $c_j$ the per unit cost of the $j$th food type. The quantities $x_j$ of the various food types to be purchased in order to meet nutritional requirements at minimum cost are thus determined by the linear programming problem $\min\{cx : Ax \ge b;\ x \ge 0\}$. It was, in fact, a $9 \times 27$ instance of this model which provided the initial "large-scale" test by Laderman in 1947 of the computational efficiency of Dantzig's simplex method.

By 1949, when the Cowles Commission conference on linear programming was organized by Koopmans, it was apparent that linear programming models were broadly applicable as strategic economic planning tools. The simple LP problem $\max\{cx : Ax \le b;\ x \ge 0\}$ has an obvious interpretation as a production model: values $x_j$ are sought for various production activities in order to maximize total profit, with activity $j$ returning profit at rate $c_j$. Resource availability limits production; only $b_i$ units of resource $i$ are on hand over the planning horizon, and resource $i$ is consumed by production activity $j$ at rate $a_{ij}$. Kantorovich's comments notwithstanding, LP models are pervasive, whether maximizing production efficiency or profit, whether minimizing operation time or service costs.

The simplicity of this linear programming model suggests that in mathematical economics it should have appeared very early as a rudimentary model for quantitative analysis, but, as Dantzig (1963) indicates, this was not the case: "The current introduction of linear programming in economics appears to be an anachronism; it would seem logical that it should have begun around 1758 when economists first began to describe economic systems in mathematical terms. Indeed, a crude example of a linear programming model can be found in the Tableau économique of Quesnay . . . ." Much later Walras (1874) did introduce a model in mathematical economics in which linear equalities described consumption of raw materials by production of various goods under an assumption of fixed technology coefficients (matrix $A$ in the linear programming model). This model, as simplified by Cassel (1918), was of the form: $Ax = b$; $yA = z$; $x = f(z)$. The interpretation here for $Ax = b$ is similar to that given for the production model described above, i.e., that resources consumed by production $Ax$ should equal availability $b$; $y$ is a vector of unit prices for the raw materials, so that $yA = z$ computes $z$ as unit costs of production for the goods being considered, while $x = f(z)$ stipulates that production levels are a function of these costs. The model was later modified by Zeuthen (1933) and Schlesinger (1935) to allow supply of raw materials to exceed production demand: $Ax \le b$; $yA = z$; $x, y, z \ge 0$; $x = f(y, z)$; $y(b - Ax) = 0$. The condition $y(b - Ax) = 0$ requires that when there is excess supply of the $i$th raw material, i.e., when $b_i - \sum_j a_{ij}x_j > 0$, then the market price $y_i$ for that material should be 0. The model now displays strong similarity to the optimality conditions for a dual pair of linear programming problems (see below). These similarities are again present in the mathematical model of economic growth studied by von Neumann (1937).

Concurrently with these developments in mathematical economics, empirical economic models of Leontief were being increasingly applied. These models bear certain similarities to the Walras model and its variants, but it is important to draw the distinction that, whereas the theoretical models were used primarily for qualitative insight, their empirical counterparts were used quantitatively. The input-output matrix for the Leontief-type model has rows corresponding to goods (resources) used in production (input) and columns corresponding to goods produced (output) in the economy; matrix coefficients in a particular column give resource amounts needed to produce one unit of the corresponding good. Moreover, it is assumed that resources and products correspond one-to-one; i.e., the matrix is square. Despite the limitations imposed by the square input-output matrix and the static and non-robust nature of solutions, the application of Leontief models by Cornfield, Evans, and Hoffenberg (1947), beginning as early as 1936, had a profound effect on the development of linear programming. Of particular significance is the model's production function, which, in contrast to that of the theoretical models, is linear; i.e., change in the output of a certain good necessitates a proportional change in supply of resources needed for its production.

For the linear programming model, the crucial ingredient missing in its various economic predecessors is the explicit optimization of a linear function. This, of course, made its first appearance around 1940 in models for the diet and transportation problems. The broad potential for linear programming models was heavily stressed in the economics community by Koopmans (1947), who had worked with transportation models during World War II. Thus, with the broad understanding of linear systems provided by classical mathematics, the growing acceptance of the use of empirical economic models for strategic planning, and the abundance of important new applications motivated by economic considerations, the stage was well-prepared for the arrival of linear programming in 1947. A first-hand account is reported by Dantzig (1963): "Intensive work began in June 1947 . . . to generalize the inter-industry [Leontief] approach. The result was the development of the linear programming model by July 1947. . . . During the summer of 1947, Leonid Hurwicz . . . worked with the author on techniques for solving linear programming problems. This effort and some suggestions of T.C. Koopmans resulted in the Simplex Method."

Linear Programming Duality

Dantzig leaves little doubt about the fundamental value of von Neumann's contributions to linear programming: "Credit for laying the mathematical foundations of this field goes to John von Neumann more than to any other man . . . at the first meeting with the author in October 1947, [von Neumann] was able immediately to translate basic theorems in game theory into their equivalent statements for systems of linear inequalities . . . He introduced and stressed the fundamental importance of duality and conjectured the equivalence of games and linear programming problems." In November of 1947, shortly after his October meeting with Dantzig, von Neumann wrote a manuscript presenting the essentials of duality theory for linear programming. He showed that the optimization problem $\min\{cx : Ax \ge b;\ x \ge 0\}$ can be solved simply by solving the linear system $\{Ax \ge b;\ x \ge 0;\ yA \le c;\ y \ge 0;\ cx \le yb\}$. Crucial here is the sense of the inequality $cx \le yb$, for the other direction (termed weak duality) is an elementary consequence of feasibility for the two systems $\{Ax \ge b;\ x \ge 0\}$ and $\{yA \le c;\ y \ge 0\}$, viz., $cx \ge (yA)x = y(Ax) \ge yb$. The essence of this development was later stated by Gale, Kuhn, and Tucker (1951) as the linear programming (Strong) Duality Theorem. It is significant that their proof, as well as that given earlier by von Neumann, was based on the Farkas Theorem. The two alternatives of Theorem 5.6 suggest consideration of the following LPs:

$$\text{(P)}\quad \min\ cx \quad \text{s.t.}\ Ax \ge b \qquad\qquad \text{(D)}\quad \max\ yb \quad \text{s.t.}\ yA = c,\ y \ge 0$$

We view problem (P) as the initial, or primal, LP and (D) as the dual, obtained from the data for (P). By writing any LP in the form of (P), we extend the notion of duality to arbitrary LPs. The following exercise shows that the dual of (D) is (P), i.e., that LP duality is involutory, so that we are justified in referring to (P) and (D) simply as a dual pair of LPs. This is the so-called standard form LP duality; part (ii) of the following exercise indicates LP duality in symmetric form.

Exercise 6.1 Show the following:
(i) The dual of (D) is (P).
(ii) $\min\{cx : Ax \ge b,\ x \ge 0\}$, $\max\{yb : yA \le c,\ y \ge 0\}$ are a dual pair of LPs. □

When the polyhedron of feasible solutions for (P) is empty, i.e., $\{x : Ax \ge b\} = \emptyset$, problem (P) is said to be infeasible; similar terminology is used for (D) with respect to the polyhedron $\{y : yA = c,\ y \ge 0\}$. Recall that the recession cone of $\{x : Ax \ge b\}$ is $K = \{x : Ax \ge 0\}$ and that $K^+ = \{yA : y \ge 0\}$. Thus (D) is feasible if and only if $c \in K^+$.
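For a concrete feel for the pair (P), (D), the following sketch (SciPy assumed; the three-constraint instance is ours) solves both problems and exhibits equal optimal values, anticipating the Strong Duality Theorem below.

```python
import numpy as np
from scipy.optimize import linprog

# (P): min cx s.t. Ax >= b, encoded for linprog as -Ax <= -b.
A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 0.0, 0.0])
c = np.array([1.0, 2.0])
P = linprog(c, A_ub=-A, b_ub=-b, bounds=[(None, None)] * 2, method="highs")

# (D): max yb s.t. yA = c, y >= 0, encoded as min -by.
D = linprog(-b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * 3, method="highs")

print(P.fun, -D.fun)   # 2.0 2.0: equal optima
print(P.x, D.x)        # x* = (2, 0), y* = (1, 0, 1)
```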

By Exercise 5.5(ii), we know that when (P) is feasible, i.e., $\{x : Ax \ge b\} \ne \emptyset$, then $\{x : Ax \ge b\}$ is bounded if and only if $Ax \ge 0 \Rightarrow x = 0$; similarly for (D), when (D) is feasible, $\{y : yA = c,\ y \ge 0\}$ is bounded if and only if $yA = 0,\ y \ge 0 \Rightarrow y = 0$. The following exercise further relates the feasible regions for (P) and (D).

Exercise 6.3 Show that: (i) if the feasible region of (P) is nonempty and bounded, then the feasible region of (D) is nonempty and unbounded; (ii) if the feasible region of (D) is nonempty and bounded, then the feasible region of (P) is nonempty and unbounded. □

When (P) and (D) are both feasible, i.e., when $c \in K^+$ and $-b \in J^+$, the two objective functions must satisfy $cx = (yA)x = y(Ax) \ge yb$, proving the following proposition.

Proposition 6.4 (Weak LP Duality) For $x$ (P)-feasible and $y$ (D)-feasible, $cx \ge yb$.

When $x^*$ is feasible for (P) and $cx^* \le cx$ for any $x$ feasible for (P), then $x^*$ is an optimal solution for (P); optimality for (D) is defined similarly. Proposition 6.4 provides a simple sufficient condition for optimality when both (P) and (D) are feasible.

Corollary 6.5 If $x^*$ and $y^*$ are feasible for (P) and (D), respectively, and $cx^* = y^*b$, then $x^*$ and $y^*$ are optimal solutions for (P) and (D). □

Of course, (P) and (D) need not have optimal solutions; e.g., either linear system defining feasibility could be inconsistent. Moreover, even when (P) is feasible, there may be no optimal solution; consider $\min\{-x : x \ge 0\} \to -\infty$. (P) is unbounded when to any $\alpha \in \mathbb{R}$ there corresponds a feasible $x$ for (P) so that $cx \le \alpha$; unboundedness for (D) is defined similarly. An immediate consequence of (6.4) is that when either LP is unbounded, the other one must be infeasible.

Corollary 6.6 (P) unbounded $\Rightarrow$ (D) infeasible; (D) unbounded $\Rightarrow$ (P) infeasible. □

We saw in Exercise 6.2 that feasibility of (P) is completely determined by the recession cone $J$ for the polyhedron in (D); i.e., (P) is feasible $\Leftrightarrow$ $-b \in J^+$. The following exercise shows, moreover, that when (P) is feasible, unboundedness of (P) is completely determined by the recession cone $K$ for the polyhedron in (P).

Exercise 6.7 For (P) feasible, show (P) is unbounded $\Leftrightarrow$ $c \notin K^+$;
conclude from (6.1) that for (D) feasible, (D) is unbounded $\Leftrightarrow$ $-b \notin J^+$. □

It follows from Theorem 5.20 that any LP which is feasible but not unbounded must have an optimal solution. Combining this with the previous result yields the following.

Corollary 6.8 For (P) feasible, (P) has an optimal solution $\Leftrightarrow$ $c \in K^+$;
for (D) feasible, (D) has an optimal solution $\Leftrightarrow$ $-b \in J^+$. □

The following fundamental result for linear programming summarizes the above development.

Theorem 6.9 (Strong LP Duality) For the dual pair of LPs (P) and (D), there are four mutually exclusive and exhaustive cases: (i) (P) and (D) are both infeasible; (ii) (P) is unbounded and (D) is infeasible; (iii) (D) is unbounded and (P) is infeasible; (iv) (P) and (D) are both feasible and max = min.

Proof: The cases arise from the four possibilities determined by the recession cones $K, J$.
When $-b \notin J^+$, $c \notin K^+$, we have case (i); see Exercise 6.2.
When $-b \in J^+$, $c \notin K^+$, we have case (ii), and $-b \notin J^+$, $c \in K^+$ is case (iii); see (6.7).
The only remaining case is $-b \in J^+$, $c \in K^+$, in which both (P) and (D) are feasible.
In this case (6.8) provides optimal solutions $x^*$ and $y^*$ and (6.4) stipulates $y^*b \le cx^*$.
If $y^*b < cx^*$, then since $x^*$ is optimal for (P), $cx \ge cx^*$ is valid for $P$; by Corollary 5.8, $yA = c$, $yb \ge cx^*$ for some $y \ge 0$. But then $y$ is (D)-feasible with $yb \ge cx^* > y^*b$, contradicting the optimality of $y^*$. Thus $y^*b = cx^*$, i.e., max = min. □

We remark that the theorem is A-generic in the sense described earlier for the recession cone. Once A is specified, the cones J, K which dictate the four cases of the theorem are completely determined, independent of b,c.

Exercise 6.10 Construct examples which demonstrate each possibility in Theorem 6.9. □

Exercise 6.11 Show that the two-person zero-sum game LPs are dual to one another:

(i) player C: $\min\{\lambda : \sum_j a_{ij}x_j \le \lambda\ \forall i;\ \sum_j x_j = 1;\ x_j \ge 0\ \forall j\}$;
(ii) player R: $\max\{\mu : \sum_i y_ia_{ij} \ge \mu\ \forall j;\ \sum_i y_i = 1;\ y_i \ge 0\ \forall i\}$. □

Theorem 6.12 (von Neumann 1928 – Minimax Theorem)
For $A \in \mathbb{R}^{m\times n}$ and $\mathcal{X} = \{x \in \mathbb{R}^n_+ : \sum_j x_j = 1\}$, $\mathcal{Y} = \{y \in \mathbb{R}^m_+ : \sum_i y_i = 1\}$,
$\min_{x\in\mathcal{X}}\max_{y\in\mathcal{Y}} yAx = \max_{y\in\mathcal{Y}}\min_{x\in\mathcal{X}} yAx$. □

Exercise 6.13 For the dual pair of LPs (P) and (D), show the following.
(i) Let $\{x : Ax \ge b\} = S + K + Q \ne \emptyset$, as in (5.42), and assume (P) is not unbounded. Show $\{\hat{x} + S\}$ is a set of optimal solutions for (P), for some vertex $\hat{x}$ of $K + Q$. Explain the relation here to Theorem 5.36(iii).
(ii) Show that when (D) is feasible, the polyhedron $\{y : yA = c,\ y \ge 0\}$ is pointed. Thus, for (D) feasible and not unbounded, some vertex of this polyhedron solves (D). □
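Player C's LP from Exercise 6.11(i) is directly solvable with an off-the-shelf LP code. A minimal sketch (SciPy assumed; the payoff matrix is ours): for the matching-pennies matrix the game value is 0 with $x^* = (1/2, 1/2)$, and player R's optimal strategy solves the dual LP of Exercise 6.11(ii).

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies (payoff, ours)
m, n = A.shape

# Player C (Exercise 6.11(i)): variables (x_1,...,x_n, lambda).
c = np.r_[np.zeros(n), 1.0]                # minimize lambda
A_ub = np.c_[A, -np.ones(m)]               # sum_j a_ij x_j - lambda <= 0
A_eq = np.r_[np.ones(n), 0.0].reshape(1, -1)   # sum_j x_j = 1
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m), A_eq=A_eq, b_eq=[1.0],
              bounds=[(0, None)] * n + [(None, None)], method="highs")
print(res.x[:n], res.fun)   # x* = (0.5, 0.5), game value 0
```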

Exercise 6.14 For LPs (P) and (D), show that when (D) is feasible, (P) has an interior solution, say $\bar{x}$ such that $A\bar{x} > b$, if and only if (D) has a bounded, nonempty set of optimal solutions. □

Since $cx^* = y^*Ax^*$, we have $y^*b = cx^* \Leftrightarrow y^*(b - Ax^*) = 0$ at optimality, so we also have the following.

Corollary 6.15 (Complementary Slackness) For $x^*$ (P)-feasible and $y^*$ (D)-feasible,
$x^*$ and $y^*$ are optimal $\Leftrightarrow$ $y^*_i(a_ix^* - b_i) = 0$, for each $a_i \in A$.

The terminology complementary slackness derives from the complementarity stipulation $y_i(a_ix - b_i) = 0$, requiring that either $y_i = 0$ or $a_ix - b_i = 0$, and from the fact that $a_ix - b_i$ measures the deviation from equality, i.e., the slack, in the $i$th constraint $a_ix \ge b_i$. The fact that complementary slackness conditions are a by-product of optimality for a dual pair of LPs was certainly broadly known among early researchers in linear programming; Gale (1960) credits Goldman and Tucker (1956) with first stating this explicitly.
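Continuing the SciPy example after Exercise 6.1 (reusing the objects A, b, P, D computed there), Corollary 6.15 can be checked componentwise on the computed pair $x^*, y^*$:

```python
# Complementary slackness: y_i*(a_i x* - b_i) = 0 for every row of A.
slack = A @ P.x - b
print(slack * D.x)        # (0., 0., 0.): each product vanishes
```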

Exercise 6.16 Give complementary slackness conditions for the LPs in Exercise 6.1(ii). □

Exercise 6.17 Suppose $x^*$ is optimal for $\max\{cx : Ax \le b,\ x \ge 0\}$ and $Ax^* < b$.
Determine all optimal solutions for the dual. □

Recall the production model $\max\{cx : Ax \le b;\ x \ge 0\}$. The interpretation discussed earlier was that of profit maximization subject to constraints limiting resource consumption: $c_j$ is the unit profit for the $j$th production activity, $b_i$ the current availability of the $i$th raw material used in production, $a_{ij}$ the rate at which the $i$th raw material is consumed by the $j$th production activity, and $x_j$ the unknown production level (quantity) for the $j$th activity. The components of the dual model $\min\{yb : yA \ge c;\ y \ge 0\}$ have a natural interpretation in terms of market pricing, similar to that discussed earlier for the Walras model and its extensions. Suppose, as an alternative to using the raw materials in production, the producer can sell them on the open market, the unit price of the $i$th resource being $y_i$. What prices must be offered to induce the entrepreneur to sell these raw materials? First note that the prices should be nonnegative, $y_i \ge 0$, as there is no penalty to the entrepreneur for unused raw materials. Furthermore, the market value of the array of resources needed to operate the $j$th production activity at unit level, when compared with the profit to be gained by producing one copy of the $j$th item, should be sufficiently attractive that the sale of these resources will incur no loss, i.e., $\sum_i y_ia_{ij} \ge c_j$. Thus the market problem of setting prices which enable minimum cost purchase of all resources is precisely the linear programming dual of the production model.

The complementary slackness conditions also have natural economic interpretations. First, if there is surplus of the $i$th resource after production, then the market price for this raw material should be zero, i.e., $\sum_j a_{ij}x_j < b_i \Rightarrow y_i = 0$; on the other hand, if the market value of the resources used in production of the $j$th good exceeds its profit from production, then there is no incentive to produce that good, i.e., $\sum_i y_ia_{ij} > c_j \Rightarrow x_j = 0$.

7 Lattice Points in Polyhedra

In this section we will generally restrict matrix $A$ to be rational; when $b$ is also rational, $P = \{x : Ax \ge b\} \subseteq \mathbb{R}^n$ is said to be a rational polyhedron; i.e., the entire polyhedron is generated via convex combinations of its rational elements, viz. $P = C(\mathbb{Q}^n \cap P)$. We have already studied lattice points in certain rational polyhedra: for homogeneous equality systems (subspaces), $\{x \in \mathbb{Z}^n : Ax = 0\}$ is a lattice; for inhomogeneous equality systems (affine spaces), $\{x \in \mathbb{Z}^n : Ax = b\}$ is an integral translation of a lattice; for homogeneous inequality systems (finite cones), $\{x \in \mathbb{Z}^n : Ax \ge 0\}$ is a finitely generated integral monoid. Now we focus attention on the general case of integral solutions for inhomogeneous inequality systems (polyhedra), $\{x \in \mathbb{Z}^n : Ax \ge b\}$.

We begin by again exploiting the homogenization of Theorem 5.3, this time in the discrete setting, in order to derive a polyhedral analogue (Theorem 7.1 below) for the Hilbert Basis Theorem. Let $K = \{(x, x_{n+1}) : Ax - bx_{n+1} \ge 0,\ x_{n+1} \ge 0\}$, for $A \in \mathbb{Q}^{m\times n}$, $b \in \mathbb{Q}^m$. Then $x \in P \Leftrightarrow (x, 1) \in K$; hence $x \in P \cap \mathbb{Z}^n \Leftrightarrow (x, 1) \in K \cap \mathbb{Z}^{n+1}$. Since $K$ is a rational polyhedral cone, $K \cap \mathbb{Z}^{n+1}$ has a Hilbert basis $\{z_1, \cdots, z_p\} \subset \mathbb{Z}^{n+1}$. Denoting the $z_k$ with 0, 1 last component as $(u_i, 0)$ and $(v_j, 1)$, for $1 \le i \le s$, $1 \le j \le t$, yields $(x, 1) \in K \cap \mathbb{Z}^{n+1} \Leftrightarrow (x, 1) = \sum_{i=1}^s (u_i, 0)\lambda_i + \sum_{j=1}^t (v_j, 1)\mu_j$, $\exists\lambda_i, \mu_j \in \mathbb{Z}_+$. Thus any rational polyhedron is the sum of a finitely generated integral monoid and a finite set of integer-valued vectors.

Theorem 7.1 Let $P = \{x : Ax \ge b\}$, where $A \in \mathbb{Q}^{m\times n}$, $b \in \mathbb{Q}^m$. Then $P \cap \mathbb{Z}^n = \{\sum_{i=1}^s \lambda_iu_i + \sum_{j=1}^t \mu_jv_j : \lambda_i, \mu_j \in \mathbb{Z}_+,\ u_i, v_j \in \mathbb{Z}^n\ \forall i, j$ and $\sum_{j=1}^t \mu_j = 1\}$. □

The linear (conical, convex) hull of $S \subseteq \mathbb{R}^n$ is the intersection of all subspaces (convex cones, convex sets) containing $S$; hulls provide exterior descriptions for $L(S)$, $K(S)$, and $C(S)$, complementing the interior descriptions given by linear, conical, and convex combinations.

Exercise 7.2 Show that $L(S)$, $K(S)$, and $C(S)$ are the respective linear, conical, and convex hulls of $S \subseteq \mathbb{R}^n$. □

Proposition 4.6 shows that generators for an integral monoid also determine its conical (or convex) hull via nonnegative (convex) combinations. The inhomogeneous result is:

Exercise 7.3 For $S = \{yB + zC : y \ge 0$, integral; $z$ a unit vector$\}$ with $B, C$ integral,
show $C(S) = \{yB + zC : y \ge 0;\ z \ge 0,\ \sum z_i = 1\}$. □

Observe, moreover, by Theorem 5.1, $C(S)$ in the previous exercise is a rational polyhedron. Applying Theorem 7.1 and Exercise 7.2, we obtain the result of Meyer (Math. Prog. 7 (1974) 223–235) that the integral elements of a rational polyhedron have a polyhedral convex hull; this integer hull of polyhedron $P$ is denoted $P_I = C(\mathbb{Z}^n \cap P)$. Optimization of a linear function over $P_I$ determines a face of optimal solutions; since this face is extreme in $C(\mathbb{Z}^n \cap P)$, it must contain an element of $\mathbb{Z}^n \cap P$. Thus integer programming over $\mathbb{Z}^n \cap P$, i.e., optimization of a linear function over the integral elements of $P$, may be viewed simply

as linear programming over the polyhedron $P_I$.
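For tiny bounded instances one can optimize over $\mathbb{Z}^n \cap P$ by direct enumeration, which makes the identity "IP over $P$ equals LP over $P_I$" concrete. A sketch (NumPy assumed; the two-variable instance is ours):

```python
import itertools
import numpy as np

# Optimize cx over Z^2 ∩ P for P = {x : x >= 0, 2x1 + 3x2 <= 12}
# by direct enumeration (viable only for tiny bounded cases).
A = np.array([[1.0, 0.0], [0.0, 1.0], [-2.0, -3.0]])
b = np.array([0.0, 0.0, -12.0])
c = np.array([-3.0, -4.0])

lattice = [np.array(p) for p in itertools.product(range(13), repeat=2)
           if np.all(A @ np.array(p) >= b)]
best = min(lattice, key=lambda p: c @ p)
print(best, c @ best)     # (6, 0) with value -18: a vertex of P_I
```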

Corollary 7.4 (Meyer 1974) If $P$ is a rational polyhedron, then so is $P_I$. □

Exercise 7.5 For $b \in \mathbb{R}^m$, is $\{x : Ax \ge b\}_I$ polyhedral if $A \in \mathbb{R}^{m\times n}$? . . . if $A \in \mathbb{Q}^{m\times n}$? □

Exercise 7.6 For $P = \{x : Ax \ge b\} \ne \emptyset$ with $A \in \mathbb{Q}^{m\times n}$, show:
(i) $P_I \ne \emptyset \Rightarrow \mathrm{rec}(P_I) = \mathrm{rec}(P)$;
(ii) $P_I = \emptyset \Rightarrow \dim(\mathrm{rec}(P)) < n$;
(iii) $P_I = \emptyset$ and $\mathcal{A}(P) \cap \mathbb{Z}^n \ne \emptyset \Rightarrow \dim(\mathrm{rec}(P)) < \dim(P)$.
Do (i), (ii), and (iii) remain valid for general (real-valued) $A$? □

Since part (i) of the exercise establishes that $\mathrm{rec}(P_I) = \mathrm{rec}(P)$ (provided $P_I \ne \emptyset$), we know, moreover, that the same objective functions will have finite optimum over $P$ and $P_I$, namely, $(\mathrm{rec}(P))^+ = K(A)$. Cook et al. (Math. Prog. 34 (1986) 251–264) have related linear

programming over P and PI through the following result. Theorem 7.7 (Cook, Gerards, Schrijver, Tardos 1986) For A ZZm×n, b IRm, and c K(A), consider the optimization problems ∈ ∈ ∈ ZZn (i) minx∈P cx and (ii) minx∈ZZn∩P cx, where P = x : Ax b with P = . Then, for ∆ the maximum subdeterminant{ magnitude≥ } of A: ∩ 6 ∅ to each solution x¯ for (i), there corresponds a solution x0 for (ii) with x¯ x0 n∆; k − k∞ ≤ to each solution xˆ for (ii), there corresponds a solution x00 for (i) with xˆ x00 n∆. k − k∞ ≤ Proof: Supposex ¯ solves (i) andx ˆ solves (ii). A1 b1 Write Ax b as x , where A1x¯ > A1xˆ and A2x¯ A2xˆ. ≥ " A2 # ≥ " b2 # ≤ Consider the cone K = u : A1u 0, A2u 0 and notex ¯ xˆ K. n { ≥ ≤ } − ∈ Let u1,...,up ZZ be generators for K; by (3.41), we may assume ui ∞ ∆ i. ∈ nk k ≤ ∀ By Carath´eodory’s Theorem (3.8), we may assume thatx ¯ xˆ = i=1 λiui, λi 0 i. 0 − ≥ ∀ Define x =x ˆ + λ1 u1 + + λn un =x ¯ (λ1 λ1 )u1 P(λn λn )un, x00 =x ¯ bλ cu ··· b λ c u =x ˆ−+(λ − b λ c )u −···−+ +(λ − bλ c)u . − b 1c 1 −···−b nc n 1 − b 1c 1 ··· n − b nc n One easily verifies that x00 is feasible for (i) and x0 is feasible for (ii). Moreover, sincex ¯ solves (i) and A1x¯ > A1xˆ b1, complementary slackness implies vA = c with v 0; thus u K A u ≥ 0 cu = vA u 0 i. 2 ≥ i ∈ ⇒ 2 i ≤ ⇒ i 2 i ≤ ∀ Sincex ˆ is optimal for (ii) andx ˆ + λi ui is feasible for (ii) for each i, cxˆ cxˆ + λi cui. It follows that λ = 0 whenever cub
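The rounding step in the proof is constructive once a decomposition x̄ − x̂ = Σ λ_i u_i is in hand. A minimal numerical sketch (numpy assumed; the data below are hypothetical, chosen only so that the decomposition holds):

```python
import numpy as np

# Hypothetical instance of the proof's data: x_bar solves the LP,
# x_hat solves the IP, and x_bar - x_hat = lam @ U with lam >= 0,
# where the rows of U are integral generators u_i of the cone K.
x_bar = np.array([2.6, 3.4])
x_hat = np.array([2.0, 2.0])
U = np.array([[1.0, 0.0], [0.0, 1.0]])
lam = np.array([0.6, 1.4])
assert np.allclose(x_bar - x_hat, lam @ U)

floor_lam = np.floor(lam)
x_prime = x_hat + floor_lam @ U       # integral, feasible for the IP
x_dblprime = x_bar - floor_lam @ U    # feasible for the LP

# Both pairs differ only by the fractional parts (lam - floor(lam)) @ U,
# so the sup-norm distance is at most n * Delta.
print(x_prime, x_dblprime, np.max(np.abs(x_bar - x_prime)))
```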

Theorem 7.8 (Graver 1975) Given A ∈ Q^{m×n}, there exists a set S ⊆ ZZ^n, |S| < +∞, such that ∀b ∈ IR^m, ∀c ∈ IR^n, ∀x̂ ∈ ZZ^n ∩ P, for P = {x : Ax ≥ b},
either (i) x̂ solves min_{x∈ZZ^n∩P} cx, or (ii) ∃s ∈ S with x̂ + s ∈ ZZ^n ∩ P and c(x̂ + s) < cx̂.

Proof: For each partition of the rows of A into A_1, A_2, include in S a Hilbert basis for ZZ^n ∩ {u : A_1u ≥ 0, A_2u ≤ 0}; S is the finite union of these finitely many Hilbert bases.
Suppose x̂ ∈ ZZ^n ∩ P does not solve min_{x∈ZZ^n∩P} cx, say cx̄ < cx̂ with x̄ ∈ ZZ^n ∩ P.
Partition the rows of A so that A_1x̄ > A_1x̂ and A_2x̄ ≤ A_2x̂.
Note that x̄ − x̂ ∈ K = {u : A_1u ≥ 0, A_2u ≤ 0}.
Let u_1, ..., u_p ∈ ZZ^n be the Hilbert basis for ZZ^n ∩ K; hence x̄ − x̂ = Σ_{i=1}^p λ_i u_i, λ_i ∈ ZZ_+ ∀i.
Now c(x̄ − x̂) < 0 ⇒ λ_j ≥ 1, cu_j < 0 for some j.
Furthermore, x̂ + u_j = x̄ − Σ_{i≠j} λ_i u_i − (λ_j − 1)u_j ⇒ A(x̂ + u_j) ≥ b,
since A_1(x̂ + u_j) ≥ A_1x̂ ≥ b_1 and A_2(x̄ − Σ_{i≠j} λ_i u_i − (λ_j − 1)u_j) ≥ A_2x̄ ≥ b_2.
Thus x̂ + u_j ∈ ZZ^n ∩ P and c(x̂ + u_j) = cx̂ + cu_j < cx̂, so s = u_j validates (ii). □

The test set S provides a set of feasible directions for incremental steps from x̂ to other points of ZZ^n ∩ P, namely, D = {d ∈ S : x̂ + d ∈ P}. These directions enable determination of the family of objective functions optimized over P_I at x̂.
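Theorem 7.8 suggests a simple augmentation scheme: from any feasible integral point, repeatedly step along an improving feasible test-set direction; when none exists, the current point is optimal. A minimal sketch, assuming a test set for A is already given (computing one is the hard part):

```python
import numpy as np

def augment(A, b, c, x_hat, test_set):
    """Improve integral x_hat over {x : Ax >= b} via test-set directions.

    test_set is assumed to be a test set for A in the sense of
    Theorem 7.8; each pass scans for a feasible improving direction.
    """
    improved = True
    while improved:
        improved = False
        for s in test_set:
            x_new = x_hat + s
            if np.all(A @ x_new >= b) and c @ x_new < c @ x_hat:
                x_hat, improved = x_new, True
                break
    return x_hat  # no improving feasible direction remains: optimal by (7.8)
```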

Exercise 7.9 Show that cx̂ = min_{x∈ZZ^n∩P} cx ⇔ c ∈ {yA : y ≥ 0} ∩ {x : dx ≥ 0 ∀d ∈ D}. □

This observation leads to a striking result of Wolsey (Disc. Appl. Math. 3(1981)193–201) – a finite subset of ZZ^n ∩ K(A) specifies coefficients for all defining inequalities for P_I for all b.

Theorem 7.10 (Wolsey 1981) Given A ∈ Q^{m×n}, there is an integral matrix G for which: ∀b ∈ IR^m and associated polyhedron P = {x : Ax ≥ b}, ∃h such that P_I = {x : Gx ≥ h}.

Proof: For D ⊆ S, the test set for A, define K_D = {yA : y ≥ 0} ∩ {x : dx ≥ 0 ∀d ∈ D}.
Let G_D be a set of integer-valued generators for the finite, rational cone K_D.
Define the rows of G by ∪_{D⊆S} G_D; if dim(rec(P)) < n, we append two additional rows a, −a:
when rec(P) is not full-dimensional, {x : Ax ≥ 0} ⊆ {x : ax = 0} for some a ≠ 0; we may assume a ∈ ZZ^n and by Theorem 3.15, a, −a ∈ K(A), as {Ax ≥ 0 ⇒ ax = 0}.
Now, for b ∈ IR^m denote, as usual, P = {Ax ≥ b} and P_I = C(ZZ^n ∩ P).
For P_I ≠ ∅, define h_i = min_{x∈ZZ^n∩P} g_i x, for each row g_i ∈ G.
Clearly, P_I ⊆ {x : Gx ≥ h}, as z ∈ P_I ⇒ g_i z ≥ min_{x∈ZZ^n∩P} g_i x = h_i ∀g_i ∈ G ⇒ Gz ≥ h.
Conversely, suppose gx ≥ γ is valid for P_I; i.e., γ ≤ min_{x∈ZZ^n∩P} gx = gx̂, with x̂ ∈ P_I.
By (7.9), g ∈ K_D, with D ⊆ S determined by x̂; thus G_D x̂ = h_D and yG_D = g, y ≥ 0.
Hence Gx ≥ h ⇒ G_D x ≥ h_D ⇒ gx = yG_D x ≥ yh_D = yG_D x̂ = gx̂ ≥ γ.
I.e., if Gx ≥ h, then x satisfies every valid inequality for P_I; hence {x : Gx ≥ h} ⊆ P_I.
For P_I = ∅, if P = ∅, we note that the rows of A are in G, since K_∅ = {yA : y ≥ 0}.
Thus, defining h_i = ⌈b_k⌉ for each row g_i = a_k ∈ A, we obtain P_I = {x : Gx ≥ h} = ∅.
If P ≠ ∅, 7.6(ii) shows dim(rec(P)) < n, hence {Ax ≥ 0 ⇒ ax = 0} for a, −a ∈ G;
defining h_i = h_j = 1 for a = g_i ∈ G, −a = g_j ∈ G yields P_I = {x : Gx ≥ h} = ∅. □

Integral Polyhedra

P is an integral polyhedron when P = P_I, i.e., when P = C(ZZ^n ∩ P), the convex hull of its integral elements. The viewpoints of faces of P afforded by optimization, as solution sets for linear programming problems over P, and geometry, as intersections of P with its supporting hyperplanes, lead to the following characterizations of integral polyhedra; see, e.g., Chapters 16 and 22 of Schrijver, Theory of Linear and Integer Programming (Wiley, 1986).

Theorem 7.11 For P = {x : Ax ≥ b} with A ∈ Q^{m×n}, the following are equivalent:
(i) P = P_I;
(ii) each nonempty face of P contains an integral element;
(iii) each lineality of P contains an integral element;
(iv) min_{x∈P} cx is attained by an integral vector, ∀c ∈ {yA : y ≥ 0};
(v) min_{x∈P} cx is an integer, ∀c ∈ ZZ^n ∩ {yA : y ≥ 0};
(vi) each rational supporting hyperplane for P contains an integral element.

Proof: For P = ∅ the equivalence is trivial; suppose P ≠ ∅.
(i) ⇒ (ii): This follows because (5.25) faces are the extreme subsets of P = P_I = C(ZZ^n ∩ P).
(ii) ⇒ (iii): This is obvious.
(iii) ⇒ (iv): The face {z ∈ P : cz = min_{x∈P} cx} contains a lineality, hence an integral point.
(iv) ⇒ (v): This is obvious.
(v) ⇒ (vi): Let {x : cx = δ} support P, with c ∈ Q^n and min_{x∈P} cx = δ.
We may scale to obtain c ∈ ZZ^n and gcd(c_j : 1 ≤ j ≤ n) = 1; thus by (v), δ ∈ ZZ also.
It then follows from Exercise 2.6(ii) that cx = δ must have an integral solution.
¬(iii) ⇒ ¬(vi): Since A is rational, we may scale so that A ∈ ZZ^{m×n}.
Let F ∩ ZZ^n = ∅ for lineality F = {x ∈ IR^n : A_F x = b_F} – recall Theorem 5.36(ii).
By Theorem 2.7, yA ∈ ZZ^n, yb ∉ ZZ, for some y ∈ Q^m with y_i = 0 for rows a_i ∉ A_F.
Since (y + εe_i)A ∈ ZZ^n, (y + εe_i)b ∉ ZZ for proper choice of ε ∈ ZZ_+, we may assume 0 ≤ y ≠ 0.
Defining c = yA and δ = yb, hyperplane H = {x : cx = δ} supports P, because
x ∈ P ⇒ Ax ≥ b ⇒ cx = yAx ≥ yb = δ and x ∈ F ⇒ A_F x = b_F ⇒ cx = yAx = yb = δ.
But c ∈ ZZ^n and δ ∉ ZZ implies that H ∩ ZZ^n = ∅, and so ¬(vi) holds.
(iii) ⇒ (i): Let x_i ∈ F_i ∩ ZZ^n, where F_i, 1 ≤ i ≤ p, are the linealities of P.
By (5.37) P = K + Q, where K = rec(P), Q = C({x_1, ..., x_p}); note K = K_I, Q = Q_I.
It follows now that P = K + Q = K_I + Q_I ⊆ (K + Q)_I = P_I ⊆ P, so that P = P_I, since
Σ_i λ_i k_i + Σ_j μ_j q_j = Σ_{i,j} λ_i μ_j (k_i + q_j), with Σ_i λ_i = Σ_j μ_j = 1 and λ_i, μ_j ≥ 0 ∀i, j. □

Geometric insight for ¬(iii) ⇒ ¬(v) was first provided by A. Hoffman for pointed polyhedra (Math. Prog. 6(1974)352–359). If extreme point x* ∉ ZZ^n, suppose its jth component is nonintegral, say x*_j = ⌊x*_j⌋ + f, 0 < f < 1. Choose integral c for which x* is the unique minimizer of cx over P; then for q ∈ ZZ_+ sufficiently large, x* also uniquely minimizes (qc + e_j)x over P. Were both optimal values integral, then x*_j = (qc + e_j)x* − q(cx*) ∈ ZZ would follow, contradicting 0 < f < 1; so one of these integral objectives has nonintegral optimal value, denying (v).

Exercise 7.12 Show that in Theorem 7.11, conditions (ii), (v), and (vi) may be replaced, respectively, by:
(ii)′ A(F) ∩ ZZ^n ≠ ∅, for each nonempty face F of P;
(v)′ min_{x∈P} cx ∈ ZZ, for all c in some set S ⊂ ZZ^n, |S| < ∞ (hint: see 7.29 below);
(vi)′ {x : a_i x = b_i} ∩ ZZ^n ≠ ∅, for each row a_i of A. □

We now examine various conditions related to unimodularity which characterize certain well-studied classes of integral polyhedra.

Total unimodularity. Recall that a square matrix with integer-valued entries and unit magnitude determinant is unimodular. Extending this stipulation to submatrices, we say that an m × n matrix A is totally unimodular provided each of its nonsingular submatrices is unimodular, i.e., when every "subdeterminant" of A is 0, +1, or −1. Note that when A is totally unimodular, we must have a_ij ∈ {0, +1, −1} ∀i, j. Recall also that for A of rank r, a row basis is any set of r linearly independent rows of A. It is elementary that A is totally unimodular if and only if each row basis of the stacked matrix [A; I] is unimodular.
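For small matrices, total unimodularity can be tested directly from the definition by enumerating all square submatrices; a brute-force sketch (exponential time, for illustration only; numpy assumed):

```python
import numpy as np
from itertools import combinations

def is_totally_unimodular(A):
    """Check that every square subdeterminant of A lies in {-1, 0, 1}.

    Brute force straight from the definition; practical only for small A.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                d = round(np.linalg.det(A[np.ix_(rows, cols)]))
                if d not in (-1, 0, 1):
                    return False
    return True

# An interval matrix (consecutive ones in each row) is totally unimodular.
print(is_totally_unimodular([[1, 1, 0, 0],
                             [0, 1, 1, 1],
                             [0, 0, 1, 1]]))  # True
```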

Total unimodularity can be used to characterize certain integral polyhedra as follows.

Theorem 7.13 (Hoffman-Kruskal 1956) For A ∈ ZZ^{m×n}, the following are equivalent:
(i) A is totally unimodular;
(ii) ∀b ∈ ZZ^m, the polyhedron {x : Ax ≥ b, x ≥ 0} is integral;
(iii) ∀c ∈ ZZ^n, the polyhedron {y : yA ≤ c, y ≥ 0} is integral.

Proof: Since total unimodularity of A and A^t are equivalent, it suffices to prove (i) ⇔ (ii).
(i) ⇒ (ii): Denote P = {x : Ax ≥ b, x ≥ 0}; for P = ∅, there is nothing to prove.
For P ≠ ∅, Theorem 5.36 implies the linealities of P are extreme points, as P ⊆ IR^n_+.
Any extreme point is the unique solution to the linear system determined by its equality set, and this equality set contains a row basis of [A; I].
Since this row basis is unimodular, the extreme point is integral (Cramer's rule).
Integrality of P now follows from Theorem 7.11(iii).
(ii) ⇒ (i): Permuting columns if necessary, let B = [A_1 A_2; 0 I_{n−r}] be a row basis of [A; I], where [A_1 A_2] is comprised of r rows of A.
To establish unimodularity of B, it suffices (Why?) to show B^{−1}z ∈ ZZ^n ∀z ∈ ZZ^n.
Let z ∈ ZZ^n and select y ∈ ZZ^n for which:
y_j + (B^{−1}z)_j ≥ 0, when the jth row of B comes from A;
y_j = −(B^{−1}z)_j = −z_j, when the jth row of B comes from I (also identity rows in B^{−1}).
Now x = y + B^{−1}z is the unique solution to Bx = By + z.
Define b_i = (By + z)_k, when the ith row of A is row k in B; otherwise, b_i = ⌊a_i x⌋.
Therefore b ∈ ZZ^m and x = y + B^{−1}z satisfies Ax ≥ b, x ≥ 0,
with all relations corresponding to rows of B holding at equality.
Thus x is an extreme point of {x : Ax ≥ b, x ≥ 0}, and must be integral.
Since x, y ∈ ZZ^n, we also have x − y = B^{−1}z ∈ ZZ^n, which completes the proof. □

Linear programming duality yields the following immediate consequence of the theorem for a dual pair of linear programming problems in symmetric form.

Corollary 7.14 Matrix A ∈ ZZ^{m×n} is totally unimodular if and only if
∀b ∈ ZZ^m, ∀c ∈ ZZ^n such that max{yb : yA ≤ c, y ≥ 0} = min{cx : Ax ≥ b, x ≥ 0},
both max and min have integer-valued optimal solution vectors. □

A further consequence of the theorem is the following characterization of total unimodularity in terms of integral decomposition (S. Baum and L. Trotter, Opt. and O.R.(1977)15–23).

Corollary 7.15 Matrix A ∈ ZZ^{m×n} is totally unimodular if and only if
∀b ∈ ZZ^m, ∀k = 1, 2, ..., and ∀x̄ ∈ {x ∈ ZZ^n : Ax ≥ kb, x ≥ 0}, we have
x̄ = x_1 + ··· + x_k, with x_i ∈ {x ∈ ZZ^n : Ax ≥ b, x ≥ 0} ∀i.

Proof: (⇐) Let b ∈ ZZ^m and let x_0 be an extreme point of P = {x : Ax ≥ b, x ≥ 0}.
Select k ∈ ZZ_+ so that x̄ = kx_0 ∈ ZZ^n – this is possible, since clearly x_0 ∈ Q^n.
Then Ax̄ ≥ kb, x̄ ≥ 0; hence x̄ = x_1 + ··· + x_k, with all x_i ∈ ZZ^n ∩ P.
But then x_0 = Σ_{i=1}^k (1/k)x_i, a convex combination of the x_i ∈ P; hence x_0 = x_i ∈ ZZ^n ∀i.
Thus P must be integral, and total unimodularity of A follows from the theorem.
(⇒) Let x̄ ∈ ZZ^n satisfy Ax̄ ≥ kb, x̄ ≥ 0, where b ∈ ZZ^m and k ≥ 1.
We proceed by induction on k; decomposition of x̄ is trivial for k = 1.
Assume k ≥ 2 and denote P_1 = {z : 0 ≤ z ≤ x̄, Ax̄ − (k−1)b ≥ Az ≥ b}.
Then (1/k)x̄ ∈ P_1 ≠ ∅ and total unimodularity of A implies P_1 has an extreme point x_k ∈ ZZ^n.
Define z̄ = x̄ − x_k ∈ ZZ^n; then z̄ ≥ 0, Az̄ ≥ (k−1)b.
By induction, z̄ = x_1 + ··· + x_{k−1}, with x_i ∈ ZZ^n_+, Ax_i ≥ b ∀i.
Therefore x̄ = z̄ + x_k = x_1 + ··· + x_{k−1} + x_k is the required decomposition of x̄. □

Of course, the same proof applies to polyhedra of the form {y : yA ≤ c, y ≥ 0}.
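A quick numerical illustration of Theorem 7.13, assuming scipy is available: minimizing an integral objective over {x : Ax ≥ b, x ≥ 0} with a totally unimodular A and integral b returns an integral optimal vertex.

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]])   # totally unimodular (interval matrix)
b = np.array([1, 2, 1])        # integral right-hand side
c = np.array([1, 2, 1, 1])

# min cx s.t. Ax >= b, x >= 0, written as -Ax <= -b for linprog.
res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 4)
print(res.x)  # an integral optimizer (the polyhedron is integral by 7.13)
```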
Unimodularity. The concept of unimodularity can be extended to non-square matrices as follows: Matrix A ∈ ZZ^{m×n} of rank r is unimodular provided for each row basis B of A, the gcd of the determinants of r × r nonsingular submatrices of B is unity. Note that the extended definition reduces to the original definition in the case m = n = r, i.e., for square, nonsingular matrices of integers.

Exercise 7.16 Show that for A ∈ ZZ^{m×n} with rank(A) = m, the following are equivalent:
(i) A is unimodular;
(ii) unimodular column operations on A yield the Hermite normal form [I_m 0];
(iii) the submatrix defined by any row subset from A is unimodular. □

We have seen that total unimodularity characterizes certain families of integral polyhedra which arise from dual pairs of linear programming models with constraints in symmetric form: max{yb : yA ≤ c, y ≥ 0} = min{cx : Ax ≥ b, x ≥ 0}. We now show that unimodularity is characterizing for families of integral polyhedra arising from linear programming problems in standard form: max{yb : yA = c, y ≥ 0} = min{cx : Ax ≥ b}.

Theorem 7.17 (Hoffman-Kruskal 1956) For A ∈ ZZ^{m×n}, the following are equivalent:
(i) A is unimodular;
(ii) ∀b ∈ ZZ^m, the polyhedron {x : Ax ≥ b} is integral;
(iii) ∀c ∈ ZZ^n, the polyhedron {y : yA = c, y ≥ 0} is integral.

Proof: (i) ⇒ (iii): Let c ∈ ZZ^n so that {y : yA = c, y ≥ 0} ≠ ∅.
As this polyhedron lies in IR^m_+, its linealities are extreme points (Theorem 5.36).
The equality set for any extreme point ȳ consists of m independent relations (Corollary 5.27), r = rank(A) from yA = c and m − r from y ≥ 0.
Let N denote the rows of A corresponding to the m − r relations for which ȳ_i = 0.
Denoting the remaining rows by B, write yA = c as y_B B + y_N N = c.

Note that for y_N = ȳ_N = 0, the resulting system y_B B = c has the unique solution y_B = ȳ_B.
Hence rank(B) = r, and so B is a row basis.
Let [H 0] be the Hermite normal form of B; i.e., BU = [H 0], where U ∈ ZZ^{n×n} is unimodular.
Since A is unimodular with row basis B, we must have H = I_r.
Thus cU = ȳAU = (ȳ_B, 0)AU = ȳ_B BU = ȳ_B [H 0] = ȳ_B [I_r 0]; since cU ∈ ZZ^n, this shows ȳ ∈ ZZ^m.
Integrality of {y : yA = c, y ≥ 0} now follows from Theorem 7.11(iii).
(iii) ⇒ (ii): Integrality of {x : Ax ≥ b} follows from Theorem 7.11(v), since for any b ∈ ZZ^m, max{yb : yA = c, y ≥ 0} is an integer, ∀c ∈ ZZ^n such that the max is finite.
¬(i) ⇒ ¬(ii): Let row basis B have determinant gcd for full rank submatrices exceeding 1.
Then h_ii > 1 for some i, for BU = [H 0] (Hermite normal form of B) with U unimodular.
It follows that Bx = e_i has no integral solution,
since Bx = e_i ⇒ BUz = [H 0]z = e_i ⇒ z_i = 1/h_ii ∉ ZZ ⇒ x = Uz ∉ ZZ^n.
Suppose Bx̄ = e_i and define P = {x : Ax ≥ ⌊Ax̄⌋} ≠ ∅.
Then x̄ ∈ F = {x ∈ P : Bx = e_i} ≠ ∅, a face of P.
Since F ∩ ZZ^n = ∅, Theorem 7.11(ii) implies P is not integral. □

Restating the theorem in terms of standard form linear programming duality yields:

Corollary 7.18 Matrix A ∈ ZZ^{m×n} is unimodular if and only if
∀b ∈ ZZ^m, ∀c ∈ ZZ^n such that max{yb : yA = c, y ≥ 0} = min{cx : Ax ≥ b},
both max and min have integer-valued optimal solution vectors. □
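Both directions of the proof of Theorem 7.17 lean on the Hermite normal form BU = [H 0]. A minimal column-operation sketch for computing it (exact arithmetic via Python integers; no attention paid to coefficient growth, so illustrative only):

```python
import numpy as np

def hermite_normal_form(B):
    """Return (H, U) with B @ U = H, U unimodular, H lower triangular.

    Assumes B is integral with full row rank; gcd-style column reduction.
    """
    B = np.array(B, dtype=object)
    m, n = B.shape
    U = np.array([[1 if i == j else 0 for j in range(n)]
                  for i in range(n)], dtype=object)
    for i in range(m):                       # place the pivot for row i
        while True:
            nz = [j for j in range(i + 1, n) if B[i, j] != 0]
            if not nz:
                break
            cand = (nz + [i]) if B[i, i] != 0 else nz
            jmin = min(cand, key=lambda j: abs(B[i, j]))
            if jmin != i:                    # smallest nonzero entry to pivot
                B[:, [i, jmin]] = B[:, [jmin, i]]
                U[:, [i, jmin]] = U[:, [jmin, i]]
            for j in range(i + 1, n):        # reduce the rest mod the pivot
                q = B[i, j] // B[i, i]
                B[:, j] -= q * B[:, i]
                U[:, j] -= q * U[:, i]
        if B[i, i] < 0:                      # normalize the pivot sign
            B[:, i], U[:, i] = -B[:, i], -U[:, i]
    return B, U

H, U = hermite_normal_form([[2, 6], [1, 4]])
print(H, U, sep="\n")   # H = [[2, 0], [1, 1]], det(U) = +/-1
```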

Exercise 7.19 What can be said about the integral decomposition property in Corollary 7.15 for (general) unimodular matrices? □

Total dual integrality. Recall the proof of (iii) ⇒ (ii) in Theorem 7.17: when polyhedron {y : yA = c, y ≥ 0} is integral for each integral choice of c, the linear programming problem max{yb : yA = c, y ≥ 0} will have an integer-valued optimal solution vector for any choice of b. When b is integral, the optimal value is also an integer, so Theorem 7.11(v) implies that {x : Ax ≥ b} is an integral polyhedron. The key fact, that when b is integral, integrality of {x : Ax ≥ b} follows from existence of dual optimal integral solutions for all integral c, was observed for pointed polyhedra by D.R. Fulkerson (Math. Prog. 1(1971)161–194) and A. Hoffman (Math. Prog. 6(1974)352–359); the general case was introduced and studied extensively by Edmonds and Giles in Ann. Disc. Math. 1(1977)185–204:

Linear system {Ax ≥ b} with A ∈ Q^{m×n} is totally dual integral (TDI) when
∀c ∈ ZZ^n such that max{yb : yA = c, y ≥ 0} = min{cx : Ax ≥ b},
the max has an integer-valued optimal solution vector.

Note that when the primal (min) LP is infeasible, {Ax ≥ b} is (vacuously) TDI, and when it is feasible, the restriction to c ∈ ZZ^n for which max = min simply stipulates that the dual have an integral optimal solution for all c ∈ ZZ^n ∩ K(A). From the above discussion, we immediately have the following consequences of the definition.

Corollary 7.20 If P = {x : Ax ≥ b}, with {Ax ≥ b} TDI and b integral, then P = P_I. □

Corollary 7.21 If {Ax ≥ b} is TDI, with b ∈ ZZ^m and c ∈ {yA : y ≥ 0}, then
min{cx : Ax ≥ b} has an integral optimal solution vector. □

Corollary 7.22 A ∈ ZZ^{m×n} is unimodular ⇔ {Ax ≥ b} is TDI ∀b ∈ ZZ^m. □

Exercise 7.23 Show that TDI-ness is not affected by inessential inequalities; i.e.,
{x : Ax ≥ b} ⊆ {x : cx ≥ δ} and {Ax ≥ b} TDI, with A and c rational
⇒ {Ax ≥ b, cx ≥ δ} TDI. □

TDI-ness is an algebraic property of the linear system {Ax ≥ b}. Consider, e.g., the cone K = {(x_1, x_2) : x_1 + x_2 ≥ 0, x_1 ≥ 0, x_1 − x_2 ≥ 0} = {(x_1, x_2) : x_1 + x_2 ≥ 0, x_1 − x_2 ≥ 0}. The first representation of K is TDI, but the second is not. This follows (see below) from the fact that {(1, 1), (1, 0), (1, −1)} is a Hilbert basis, whereas {(1, 1), (1, −1)} is not. A further TDI representation is given by {(x_1, x_2) : (1/2)x_1 + (1/2)x_2 ≥ 0, (1/2)x_1 − (1/2)x_2 ≥ 0}, as every integral vector in K({(1/2, 1/2), (1/2, −1/2)}) is a ZZ_+-combination of {(1/2, 1/2), (1/2, −1/2)}. (Recall these two vectors also comprise a Hilbert basis, while {(1, 1), (1, 0), (1, −1)} is an integral Hilbert basis.) In fact, a TDI representation for any rational polyhedron can be obtained similarly by scaling.

Exercise 7.24 For P = {Ax ≥ b} with A ∈ Q^{m×n}, find τ > 0 so that {τAx ≥ τb} is TDI. □

We have already considered TDI-ness for homogeneous linear systems in the context of linear duality for finitely generated integral monoids. It is immediate from the discussion following Proposition 4.23 that max{y0 : yA = c, y ≥ 0} has an integral optimal solution y for all integral c such that the max is finite, i.e., {Ax ≥ 0} is TDI, if and only if the rows of A are a Hilbert basis for the integral elements of K(A). Thus for a cone {x : Ax ≥ 0}, TDI-ness of the representation rests solely on whether the equality set for its (unique) lineality {x : Ax = 0} defines a Hilbert basis. The same is true for a general polyhedron P = {x : Ax ≥ b}, provided the Hilbert basis property holds for the equality set of every lineality. Recall from Theorem 5.28 that C_F = {c : cx ≥ cz ∀x ∈ P, ∀z ∈ F} = K(A_F) for any face F ≠ ∅; i.e., the cone of objective functions optimized over P on F is generated by the rows of the equality set A_F.

Theorem 7.25 For P = {x : Ax ≥ b}, with A ∈ Q^{m×n},
{Ax ≥ b} is TDI ⇔ A_F is a Hilbert basis for ZZ^n ∩ K(A_F), for every lineality F of P.

Proof: If {Ax ≥ b} is TDI, then by definition,
for each face F and c ∈ C_F ∩ ZZ^n: ∃y* ∈ ZZ^m_+ s.t. y*A = c and cz = y*b ∀z ∈ F. (i)
By complementary slackness, y*_i = 0 for each a_i ∉ A_F, hence
for each face F and c ∈ ZZ^n ∩ K(A_F): c is a ZZ_+-combination of rows of A_F. (ii)
Equivalently,
for each face F: A_F is a Hilbert basis for ZZ^n ∩ K(A_F). (iii)
Hence, for each lineality F: A_F is a Hilbert basis for ZZ^n ∩ K(A_F). (iv)
We claim that from (iv) it follows that {Ax ≥ b} is TDI.
To see this, suppose c ∈ ZZ^n has finite optimum over P.
Then c ∈ C_F ∩ ZZ^n = K(A_F) ∩ ZZ^n for some lineality F.
In view of condition (iv), c = y*A with y* ∈ ZZ^m_+ and y*_i = 0 for a_i ∉ A_F.
By complementary slackness, y* is optimal for max{yb : yA = c, y ≥ 0}.
Thus {Ax ≥ b} TDI ⇒ (i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ {Ax ≥ b} TDI. □

Corollary 7.26 {Ax ≥ b} is TDI ⇔ A_F is a Hilbert basis for every face F of P. □

Exercise 7.27 Let A be a matrix with integer-valued entries.
(i) Show that A is unimodular ⇔ each of its row subsets defines a Hilbert basis.
(ii) Derive a similar characterization for total unimodularity of A. □

The link between TDI-ness and Hilbert bases also enables us to determine TDI representations for the nonempty faces of P directly from any TDI representation for P.

Exercise 7.28 For {Ax ≥ b, cx ≥ δ} TDI, with A and c rational, show that
{Ax ≥ b, cx = δ} is TDI; i.e., {Ax ≥ b, cx ≥ δ, −cx ≥ −δ} is TDI. □

Exercise 7.24 shows that every rational polyhedron has a TDI representation (obtained by scaling). Giles and Pulleyblank (Lin. Alg. and Its Appl. 25(1979)191–196) have shown, moreover, that there must be a TDI representation with integral constraint coefficients.

Theorem 7.29 Given A ∈ Q^{m×n}, there is an integral matrix G so that:
∀b ∈ IR^m and P(b) = {x : Ax ≥ b}, ∃h such that P(b) = {x : Gx ≥ h} with {Gx ≥ h} TDI.

Proof: For each a_i ∈ A, choose ε_i ∈ ZZ_+ so that ε_i a_i ∈ ZZ^n and gcd(ε_i a_ij : 1 ≤ j ≤ n) = 1.
As the rows of G, take {g_i = ε_i a_i : 1 ≤ i ≤ m} ∪ {Σ_{i=1}^m μ_i g_i ∈ ZZ^n : 0 ≤ μ_i < 1 ∀i}.
Let b ∈ IR^m and consider P(b).
If P(b) = ∅, then P(b) = {x : Gx ≥ h} = ∅ for h_i = ⌈ε_i b_i⌉, 1 ≤ i ≤ m, and h_i = 0, otherwise.
When P(b) ≠ ∅, define h_i = min_{x∈P(b)} g_i x ∀g_i ∈ G, the min being finite because g_i ∈ K(A).
Now {x : Gx ≥ h} ⊆ P(b), in view of rows g_1, ..., g_m;
for the reverse containment, x̂ ∈ P(b) ⇒ g_i x̂ ≥ min_{x∈P(b)} g_i x = h_i ∀g_i ∈ G.
TDI-ness of {Gx ≥ h} follows from Corollary 7.26, since for any face F of P(b),
G_F contains the Hilbert basis {g_i : a_i ∈ A_F} ∪ {Σ_{a_i∈A_F} μ_i g_i ∈ ZZ^n : 0 ≤ μ_i < 1 ∀i}. □

We indicate two important extensions of this theorem. First, h can be taken integral if and only if P(b) is an integral polyhedron. When P(b) ≠ ∅, this follows from Theorem 7.11(v) and Corollary 7.20; for P(b) = ∅, the construction in the proof gives an integral h. Second, when P(b) is full-dimensional, there is a unique minimal TDI representation (i.e., no TDI subsystem) with integral constraint matrix. In this case, (5.29) and (4.4) provide a unique minimal integral Hilbert basis for C_F ∩ ZZ^n, for each lineality F of P(b); these are used to construct G and h as in the proof. If P(b) = {x : G′x ≥ h′}, with G′ integral and {G′x ≥ h′} TDI, then by (7.25), G′_F is a Hilbert basis. But then Hilbert basis minimality implies G_F ⊆ G′_F, so {G′x ≥ h′} contains {Gx ≥ h}, which is therefore a minimal representation.

Part (ii) of the following corollary (Schrijver, Lin. Alg. and Its Appl. 38(1981)27–32) is a discrete analogue of Corollary 5.33.

Corollary 7.30 For P = {x : Ax ≥ b} with A ∈ Q^{m×n},
(i) P = P_I ⇔ P has an integral TDI representation;
(ii) if P is full-dimensional, P has a unique minimal TDI representation with integral constraint matrix. □

Determining Integral Polyhedra

We describe a procedure due to Chvátal and Schrijver for constructing the integer hull; see Schrijver, Theory of Linear and Integer Programming (Wiley, 1986). By Theorem 7.11(vi), we know that when polyhedron P is specified by a rational coefficient matrix, P = P_I provided each supporting hyperplane of P with rational coefficients contains an integral element. So we simply push each such hyperplane inward until it intersects the integer lattice. Specifically, let hyperplane {x : cx = δ}, c ∈ Q^n, support P ⊆ {x : cx ≥ δ}. Since c is rational, we may assume c ∈ ZZ^n with gcd(c_j : 1 ≤ j ≤ n) = 1, so that P_I ⊆ {x : cx ≥ ⌈δ⌉} and {x : cx = ⌈δ⌉} ∩ ZZ^n ≠ ∅. We thus define P¹ by intersecting P with all halfspaces of the form {x : cx ≥ ⌈δ⌉}, where {x : cx = δ} supports P, with c ∈ ZZ^n and gcd(c_j : 1 ≤ j ≤ n) = 1. We will show that P¹ is a polyhedron – so that the procedure may be iterated. Thus, for P⁰ = P and P^{k+1} = (P^k)¹, k = 0, 1, 2, ..., we have P = P⁰ ⊇ P¹ ⊇ P² ⊇ ··· ⊇ P_I. We will also show that for some positive integer k, P^k = P_I; i.e., the procedure stops. Recall from (7.29) that P has a TDI representation, say {Ax ≥ b}, with A integral.

Theorem 7.31 Suppose P = {x : Ax ≥ b}, where {Ax ≥ b} is TDI and A is integral.
Then P¹ = {x : Ax ≥ ⌈b⌉}; in particular, P¹ is a polyhedron.

Proof: If P = ∅, the result is trivial, so assume P ≠ ∅.
Clearly, P¹ ⊆ {x : Ax ≥ ⌈b⌉}, considering {x : a_i x = b_i}, for rows a_i of A.
On the other hand, let {x : cx = δ} support P ⊆ {x : cx ≥ δ}, with c integral.
Then we have δ = min{cx : Ax ≥ b} = max{yb : yA = c, y ≥ 0}.
Now {Ax ≥ b} TDI ⇒ max has an integral optimal solution, say y*.
Therefore Ax ≥ ⌈b⌉ ⇒ cx = y*Ax ≥ y*⌈b⌉ = ⌈y*⌈b⌉⌉ ≥ ⌈y*b⌉ = ⌈δ⌉.
Thus {x : Ax ≥ ⌈b⌉} ⊆ {x : cx ≥ ⌈δ⌉}, and it follows that {x : Ax ≥ ⌈b⌉} ⊆ P¹. □

Corollary 7.32 Suppose F is a face of P. Then F^k is a face of P^k; namely, F^k = P^k ∩ F, for k = 0, 1, 2, ....

Proof: Suppose F = {x ∈ P : cx = δ}, with c integral and P ⊆ {x : cx ≥ δ}.
The assertion is clearly true for k = 0; consider the case k = 1.
Note that if F = ∅ or δ ∉ ZZ, then F¹ = P¹ ∩ F = ∅, so we assume F ≠ ∅ and δ ∈ ZZ.
Now {Ax ≥ b, cx ≥ δ} is also TDI, by Exercise 7.23.
Thus Exercise 7.28 implies {Ax ≥ b, cx = δ} is a TDI system.
Hence P¹ ∩ F = P¹ ∩ P ∩ {x : cx = δ} = {x : Ax ≥ ⌈b⌉; cx = δ} =
{x : Ax ≥ ⌈b⌉; cx ≥ ⌈δ⌉, −cx ≥ ⌈−δ⌉} = F¹.
Moreover, F¹ = P¹ ∩ P ∩ {x : cx = δ} = P¹ ∩ {x : cx = δ} shows that F¹ is a face of P¹.
Thus the assertion is valid for k = 1.
Applying this to the face F¹ of P¹ yields F² = P² ∩ F¹.
Thus F² = P² ∩ F¹ = P² ∩ P¹ ∩ F = P² ∩ F,
and the desired result is established by iteration of the process. □
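One round of the procedure is easy to carry out for an explicit objective: given integral c with gcd of entries 1 and δ = min{cx : x ∈ P}, the cut is cx ≥ ⌈δ⌉. A small sketch, assuming scipy is available, that generates one Chvátal-Gomory cut:

```python
import math
import numpy as np
from scipy.optimize import linprog

def cg_cut(A, b, c):
    """Return the Chvatal-Gomory cut (c, ceil(delta)) for P = {x : Ax >= b}.

    c is assumed integral with gcd of entries 1, so that the hyperplane
    {x : cx = ceil(delta)} meets the integer lattice.
    """
    res = linprog(np.asarray(c), A_ub=-np.asarray(A), b_ub=-np.asarray(b),
                  bounds=[(None, None)] * len(c))
    assert res.status == 0, "cx should attain a finite minimum over P"
    return c, math.ceil(res.fun - 1e-9)   # tolerance guards float round-off

# Example: P = {(x1,x2) : 2*x1 + x2 >= 1, -2*x1 + x2 >= 0}; min x2 = 1/2,
# so the first round adds the cut x2 >= 1.
print(cg_cut([[2, 1], [-2, 1]], [1, 0], [0, 1]))   # ([0, 1], 1)
```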

Theorem 7.33 (Chvátal) Suppose P = {x : Ax ≥ b} with A ∈ Q^{m×n}. Then P^k = P_I for some k ∈ ZZ_+.

Proof: P = ∅ ⇒ P⁰ = P = P_I; for P ≠ ∅, use induction on dim(P) = dimension of P.
For dim(P) = 0, let P = {x̂}; then x̂ ∈ ZZ^n ⇒ P⁰ = P = P_I and x̂ ∉ ZZ^n ⇒ P¹ = P_I = ∅.
Thus consider dim(P) > 0 and assume the theorem valid for polyhedra of lower dimension.
For P_I ≠ ∅, let P_I = {x ∈ IR^n : Cx ≥ d}, for C, d integral and {cx ≥ δ̂} a row of {Cx ≥ d}.
Then −∞ < min_{x∈P} cx = δ ≤ δ̂ = min_{x∈P_I} cx, as P, P_I have the same recession cone.
Now P ⊆ {x : cx ≥ δ}; we show that for some s ∈ ZZ_+, P^s ⊆ {x : cx ≥ δ̂}.
If δ̂ = δ, take s = 0; if not, δ̂ > δ and we have P¹ ⊆ {x : cx ≥ ⌈δ⌉}.
If δ̂ = ⌈δ⌉, take s = 1; if not, δ̂ > ⌈δ⌉ and we consider F = P¹ ∩ {x : cx = ⌈δ⌉}.
Note F ∩ ZZ^n = ∅, as P_I ⊆ {x : cx ≥ δ̂} and F ⊆ {x : cx = ⌈δ⌉}, but δ̂ > ⌈δ⌉;
moreover, P ⊄ {x : cx = ⌈δ⌉} ⊇ F ⇒ dim(F) < dim(P).
By induction, F^r = F_I = ∅ for some r ∈ ZZ_+ (r = 0 for F = ∅).
Applying Corollary 7.32, ∅ = F^r = P^{r+1} ∩ {x : cx = ⌈δ⌉}, hence P^{r+1} ⊆ {x : cx > ⌈δ⌉}.
We obtain P^{r+2} ⊆ {x : cx ≥ ⌈δ⌉ + 1}, and so on, until P^s ⊆ {x : cx ≥ δ̂}.
In finitely many iterations, the procedure "validates" the finitely many rows of {Cx ≥ d}.
For P_I = ∅, if A(P) ∩ ZZ^n = ∅, then yA_P ∈ ZZ^n, yb_P ∉ ZZ, for some rational vector y.
Then {x : (yA_P)x = yb_P} supports (contains) P and P¹ ⊆ P ∩ {x : (yA_P)x ≥ ⌈yb_P⌉} = ∅.
When A(P) ∩ ZZ^n ≠ ∅, Exercise 7.6(iii) shows dim(K) < dim(P) for K = {x : Ax ≥ 0}.
Moreover, A_P ⊆ A_K (Why?), so dim(K) < dim(P) implies rank(A_P) < rank(A_K).
Thus some row a_i of A_K is independent of the rows in A_P.
That is, a_i, −a_i ∈ ZZ^n ∩ K^+ and P ⊄ {x : a_i x = b_i} (w.l.o.g. matrix A is integral).
Hence −∞ < min_{x∈P} a_i x = b_i = δ < δ̄ = max_{x∈P} a_i x = −min_{x∈P} (−a_i x) < +∞.
Now we proceed as above (the P_I ≠ ∅ case) to establish P^s = ∅ = P_I in finitely many steps. □

In order to evaluate how quickly the Chvátal procedure proves P_I = ∅, we use the following result from, e.g., Kannan and Lovász, Annals of Math. 128(1988)577–602. This result provides, in Theorem 7.35 below, a uniform bound, depending only on dimension, for the number of iterations of the Chvátal procedure required to establish P_I = ∅.

Theorem 7.34 Let P = {x ∈ IR^n : Ax ≥ b}, with A rational and P ∩ ZZ^n = ∅. Then:
max_{x∈P} cx − min_{x∈P} cx ≤ γ_n, for some c ∈ ZZ^n, gcd(c_j : 1 ≤ j ≤ n) = 1,
and where γ_n is a constant depending only on dimension. □

Theorem 7.35 To each n ∈ ZZ_+ there corresponds k_n ∈ ZZ_+ such that if P = {x : Ax ≥ b}, with A rational, P of dimension n, and P_I = ∅, then P^{k_n} = ∅.

Proof: The proof is by induction on n = dim(P).
As in the proof of Theorem 7.33, the case n = 0 is clear (take k_0 = 1).
For the inductive step, let A(P) = {x : A_P x = b_P}, where P ⊆ IR^m, m ≥ n = dim(P).
If {A_P x = b_P} has no integral solution, then P¹ = ∅, as demonstrated in the proof of (7.33).
Alternatively, translation by x̂ ∈ {x ∈ ZZ^m : A_P x = b_P} doesn't affect the Chvátal procedure.
Transforming x → y = x − x̂ ⇒ A(P) = {y + x̂ : A_P y = 0}, so we simply assume
A(P) = {y ∈ IR^m : Cy = 0}, with C integral and of full row rank m − n.
The Chvátal procedure is also unaffected by unimodular transformations. (proof?)
From (2.3) we obtain CU = [H 0], with U unimodular and H nonsingular.
Thus we assume, after applying the unimodular transformation y → z = U^{−1}y (i.e., y = Uz),
A(P) = {z ∈ IR^m : [H 0]z = 0} = {0}^{m−n} × IR^n.
Note that these transformations reduce attention to the case in which P is essentially in IR^n.
Moreover, to each hyperplane H = {x : c_1x_1 + ··· + c_mx_m = δ} in IR^m we can associate
H′ = {x : 0x_1 + ··· + 0x_{m−n} + Σ_{j=m−n+1}^m c_jx_j = δ}, of the form IR^{m−n} × (hyperplane in IR^n).
In applying the Chvátal procedure to P, we restrict to hyperplanes like H′ (i.e., in IR^n), since pushing H′ until it contains an integral element will imply H does, as well.
We thus assume for the remainder of the proof that m − n = 0, i.e., that P is full-dimensional.
Now Theorem 7.34 provides c ∈ ZZ^n, gcd(c_j) = 1, for which max_{x∈P} cx − min_{x∈P} cx ≤ γ_n.
We claim P^{k+1+k·k_{n−1}} ⊆ {x : cx ≥ ⌈δ⌉ + k}, for k = 0, 1, ..., γ_n + 1, where δ = min_{x∈P} cx.
When k = 0, clearly P¹ ⊆ {x : cx ≥ ⌈δ⌉}. Inductively assume the claim true for 0, 1, ..., k.
Now dim(F) < n for F = P^{k+1+k·k_{n−1}} ∩ {x : cx = ⌈δ⌉ + k}, so F ∩ ZZ^n = ∅ ⇒ F^{k_{n−1}} = ∅.
(Note we may take k_n ≥ k_{n−1} ≥ ··· ≥ k_1 ≥ k_0 = 1.)
Thus we have (P^{k+1+k·k_{n−1}})^{k_{n−1}} ∩ {x : cx = ⌈δ⌉ + k} = F^{k_{n−1}} = ∅.
I.e., P^{k+1+(k+1)k_{n−1}} ∩ {x : cx = ⌈δ⌉ + k} = ∅, hence P^{k+1+(k+1)k_{n−1}} ⊆ {x : cx > ⌈δ⌉ + k}.
Now one more application of the procedure shows P^{(k+1)+1+(k+1)k_{n−1}} ⊆ {x : cx ≥ ⌈δ⌉ + k + 1}.
Applying the claim for k = γ_n + 1 yields P^{γ_n+2+(γ_n+1)k_{n−1}} ⊆ {x : cx ≥ ⌈δ⌉ + γ_n + 1}.
Now with k_n = γ_n + 2 + (γ_n + 1)k_{n−1} we obtain P^{k_n} ⊆ {x : cx ≥ ⌈δ⌉ + γ_n + 1}.
And since max_{x∈P} cx ≤ γ_n + δ ⇒ P^{k_n} ⊆ P ⊆ {x : cx ≤ δ + γ_n}, we must have P^{k_n} = ∅. □

Theorem 7.36 For each matrix A ∈ Q^{m×n} there exists k* ∈ ZZ_+ such that
{x : Ax ≥ b}^{k*} = {x : Ax ≥ b}_I, ∀b ∈ IR^m.

Proof: Scale so that A ∈ ZZ^{m×n}; let ∆ be the maximum subdeterminant magnitude for A.
With k_n as in (7.35), we validate the theorem for k* = max{k_n, n^{2n+2}∆^{n+1} + 1 + n^{2n+2}∆^{n+1}k_{n−1}}.
Let b ∈ IR^m and take P = {x : Ax ≥ b}.
If P_I = ∅, then Theorem 7.35 implies P^{k*} ⊆ P^{k_n} = ∅.
For P_I ≠ ∅, (7.10) and (3.41) imply P_I = {x : Gx ≥ h}, for G integral with ‖G‖∞ ≤ n^{2n}∆^n.
Suppose gx ≥ δ = min{gx : x ∈ P_I} is an inequality from {Gx ≥ h}, and let δ′ = ⌈min{gx : x ∈ P}⌉.
By Theorem 7.7, we have δ − δ′ ≤ n∆ Σ_{j=1}^n |g_j| ≤ n∆ Σ_{j=1}^n (n^{2n}∆^n) = n^{2n+2}∆^{n+1}.
The Chvátal procedure establishes P_I ⊆ P¹ ⊆ {x : gx ≥ δ′} in (at most) one step.
By Theorem 7.35, it requires at most k_{n−1} more steps to show P^{1+k_{n−1}} ⊆ {x : gx > δ′}.
Thus within k* ≤ n^{2n+2}∆^{n+1} + 1 + (n^{2n+2}∆^{n+1})k_{n−1} steps we have P^{k*} ⊆ {x : gx ≥ δ}.
Since each inequality gx ≥ δ from {Gx ≥ h} gives values δ, δ′ with δ − δ′ ≤ n^{2n+2}∆^{n+1},
we obtain P^{k*} ⊆ {x : Gx ≥ h} = P_I. □

8 Convex Sets

The o-closed sets in IR^n are the subspaces of IR^n (Corollary 1.13) and the #-closed sets in Q^n are sums of rational subspaces and lattices (Theorem 2.31), or equivalently, the topologically closed ZZ-modules (Theorem 2.33). For cone duality, of course any +-closed set is a cone (Proposition 3.4(iv)) and finite cones are +-closed (Corollary 3.13), but not all cones are +-closed. We will see below that topological closure is the additional condition needed to characterize +-closed cones. Once again projection plays a key role in the development.

Recall (Theorem 1.17) that for a subspace S ⊆ IR^n, any point x ∈ IR^n can be expressed as the sum of its unique, orthogonal projections x′ ∈ S, x″ ∈ S^o; moreover, x′ and x″ are the points in S and S^o at minimum distance from x.

Extending to arbitrary sets, when x ∉ C ⊆ IR^n, we call x′ ∈ C the projection of x into C provided ‖x′ − x‖ ≤ ‖y − x‖, ∀y ∈ C. If x′ is the unique point of C at minimum distance from x, the projection is termed unique. The hyperplane orthogonal to (x′ − x) containing x′ is H = {y ∈ IR^n : y(x′ − x) = x′(x′ − x)}. Now, x(x′ − x) < x′(x′ − x), and when y(x′ − x) ≥ x′(x′ − x), ∀y ∈ C, then H separates x and C and the projection is said to be orthogonal, since H has normal x′ − x and is tangent to C at x′. We now show that existence of unique, orthogonal projections holds precisely for nonempty, closed, convex sets.

Theorem 8.1 For ∅ ≠ C ⊆ IR^n, the following are equivalent:
(i) C is closed and convex.
(ii) Each point not in C has a unique, orthogonal projection into C.
(iii) C is an intersection of halfspaces.

Proof: (i) ⇒ (ii): Let x ∉ C, δ = inf{‖y − x‖ : y ∈ C}, and ‖y_i − x‖ → δ, with y_i ∈ C ∀i.
Choose ε > 0; then ‖y_i − x‖² ≤ δ² + ε and ‖y_j − x‖² ≤ δ² + ε, for sufficiently large i, j.
Now, convexity implies (y_i + y_j)/2 ∈ C, so ‖(y_i + y_j)/2 − x‖ ≥ δ, and hence
(1/2)‖y_i − y_j‖² = ‖y_i − x‖² + ‖y_j − x‖² − 2‖(y_i + y_j)/2 − x‖² ≤ δ² + ε + δ² + ε − 2δ² = 2ε.
Thus ‖y_i − y_j‖² ≤ 4ε and the sequence {y_i}_{i=1}^∞ is Cauchy.
As C is closed, y_i → x′ ∈ C, and hence ‖y_i − x‖ → ‖x′ − x‖ = δ, as i → ∞.
For uniqueness, suppose that for some y ∈ C we have ‖y − x‖ ≤ ‖z − x‖ ∀z ∈ C.
Then ‖x′ − x‖ = ‖y − x‖ and x* = (x′ + y)/2 ∈ C, so if y ≠ x′, we get
‖x* − x‖² = (1/2)‖x′ − x‖² + (1/2)‖y − x‖² − ‖(x′ − y)/2‖² < ‖x′ − x‖².
This contradicts the fact that among points of C, x′ is at minimum distance from x.
For orthogonality, consider the hyperplane {y : y(x′ − x) = x′(x′ − x)}.
For y = x we have x(x′ − x) < x′(x′ − x), since (x′ − x)(x′ − x) > 0.
And by convexity, for x′ ≠ y ∈ C,
(x′ − x)(x′ − x) ≤ (λy + (1 − λ)x′ − x)(λy + (1 − λ)x′ − x) ∀λ ∈ [0, 1].
Thus we obtain λ²(y − x′)(y − x′) ≥ −2λ(y − x′)(x′ − x) ∀λ ∈ [0, 1].
This implies (y − x′)(x′ − x) ≥ 0; hence y(x′ − x) ≥ x′(x′ − x) ∀y ∈ C.
(ii) ⇒ (iii): Denote H_x = {y : y(x′ − x) ≥ x′(x′ − x)}, where x ∉ C projects to x′ ∈ C.
Then clearly C = ∩{H_x : x ∉ C}.
(iii) ⇒ (i): Since each halfspace is closed and convex, so is C. □

We point out that the assumption C ≠ ∅ in (8.1) is made only to accommodate the existence of projections in part (ii) of the equivalence. Of course, the fact that C is closed and convex if and only if C is an intersection of halfspaces remains valid for C = ∅. The following separation result for convex sets is an immediate consequence of this theorem.

Corollary 8.2 Let C ⊆ IR^n be closed and convex, with x′ ∈ C the projection of x ∉ C.
Then the hyperplane {y : y(x′ − x) = x′(x′ − x)} separates x and C. □
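For the simplest closed convex sets the projection in Theorem 8.1(ii) is given by an explicit formula; e.g., for a halfspace C = {y : ay ≥ β}, one moves along the normal a. A minimal sketch, which also exhibits the separating hyperplane of Corollary 8.2:

```python
import numpy as np

def project_to_halfspace(x, a, beta):
    """Unique orthogonal projection of x into C = {y : a.y >= beta}."""
    x, a = np.asarray(x, float), np.asarray(a, float)
    shortfall = beta - a @ x
    if shortfall <= 0:          # x already lies in C
        return x
    return x + (shortfall / (a @ a)) * a   # move along the normal

x = np.array([0.0, 0.0])
a, beta = np.array([1.0, 1.0]), 2.0
xp = project_to_halfspace(x, a, beta)
print(xp)                       # [1. 1.], on the boundary a.y = beta
# Separation: x lies strictly on the other side of {y : y.(xp-x) = xp.(xp-x)}.
print((xp - x) @ x < (xp - x) @ xp)   # True
```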

Closed Cones

There are several important consequences of Theorem 8.1, particularly for C = K, a topologically closed (convex) cone. In this case, (8.1) shows that each point of IR^n has a unique, orthogonal projection into K. Furthermore, orthogonality of the projection from x ∉ K to x′ ∈ K implies that x′(x′ − x) = 0, i.e., the vectors x′ and x′ − x are orthogonal. To see this, note that we have y(x′ − x) ≥ x′(x′ − x) ∀y ∈ K. In particular, αx′ ∈ K ∀α ≥ 0, and so αx′(x′ − x) ≥ x′(x′ − x) ∀α ≥ 0, or equivalently, (α − 1)x′(x′ − x) ≥ 0 ∀α ≥ 0. The latter can hold only if x′(x′ − x) = 0. Thus it follows as in the proof of (ii) ⇒ (iii) that K is the intersection of homogeneous halfspaces; i.e., applying Proposition 3.4(iv), K = K^{++}. Thus Theorem 8.1 implies the following characterization of +-closed sets – compare Theorem 2.33 for ZZ-modules.

Corollary 8.3 K is +-closed if and only if K is a topologically closed cone. □

This corollary shows that for cones, +-closure and topological closure are the same property; we will refer to such cones simply as closed cones.

Exercise 8.4 Show that K^{++} is the topological closure of the cone K. □

Continuing the above discussion, the fact that x′(x′ − x) = 0 and, consequently, that K lies in the halfspace {y : y(x′ − x) ≥ 0} implies that x′ − x ∈ K^+. Thus x − x′ ∈ −K^+ and, in fact, x − x′ is the projection of x into −K^+. To see this, note that for any y ∈ −K^+,
‖y − x‖² = ‖y − (x − x′) − x′‖² = ‖y − (x − x′)‖² + ‖x′‖² − 2x′(y − x + x′).
Now, −2x′(y − x + x′) = −2x′y ≥ 0, since x′ ∈ K and y ∈ −K^+. Also, ‖y − (x − x′)‖² ≥ 0, so it follows that ‖y − x‖² ≥ ‖x′‖² = ‖(x − x′) − x‖². Thus x − x′ is the projection of x into −K^+. Writing x = x′ + (x − x′), we see that any x ∈ IR^n can be expressed as the sum of its (unique) projections into K and −K^+. Note that for x ∈ K, we write x = x + 0, and for x ∈ −K^+, x = 0 + x is the required decomposition; when x ∈ K ∩ (−K^+), we must have xx ≤ 0, so x = 0 = 0 + 0. Thus we have a direct generalization of Theorem 1.17 and part (ii) of Exercise 3.17. In particular, for K = K^+ = IR^n_+, we see that any n-vector is the difference of two nonnegative vectors, an observation used in deriving Corollary 3.22.
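For K = IR^n_+, which is self-dual (K^+ = K), the decomposition above is simply the splitting of a vector into its nonnegative and nonpositive parts; a two-line check:

```python
import numpy as np

x = np.array([3.0, -1.5, 0.0, -2.0])
x_K = np.maximum(x, 0.0)    # projection into K = IR^n_+
x_Kp = np.minimum(x, 0.0)   # projection into -K^+ (nonpositive orthant)

# x is the sum of its projections, and the two parts are orthogonal.
print(np.allclose(x, x_K + x_Kp), x_K @ x_Kp == 0.0)  # True True
```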

Corollary 8.5 Let K be a closed cone in IR^n. Then each point in IR^n is the unique sum of its orthogonal projections into K and −K^+. □

The application of Corollary 8.2 to closed cones yields an extension of the Farkas Theorem. Recall that (3.15) says that either the point c is in the cone K finitely generated by the rows of A ∈ IR^{m×n} or it is not, in which case there exists a vector x ∈ K^+ so that cx < 0. Now let S be any (not necessarily finite) subset of IR^n for which K(S) is closed and suppose c′ ∈ K(S) is the orthogonal projection of c ∉ K(S). Then it follows from the discussion above that K(S) ⊆ {y : y(c′ − c) ≥ 0} and c′(c′ − c) = 0, which implies c(c′ − c) < 0. Thus c′ − c defines a hyperplane separating c and K(S); taking x = c′ − c establishes the following generalization of the Farkas Theorem to closed cones.

Corollary 8.6 Let c ∈ IR^n, S ⊆ IR^n and suppose K(S) is closed. Then exactly one holds:
(i) c ∈ K(S); (ii) ∃x ∈ IR^n such that Sx ≥ 0, cx < 0. □

Exercise 3.21 on duality of pointedness and full-dimensionality extends to closed cones. Also, part (vi) of Exercise 3.5 holds at equality for closed cones – compare Exercise 3.17(iii).

Exercise 8.7 Suppose that K(S) is closed and 0 ∉ S ⊆ IR^n. Show that:
0 cannot be expressed as a nontrivial conical combination of elements of S
⇔ K(S) is pointed
⇔ {x : Sx ≥ 0} is full-dimensional
⇔ {x : Sx > 0} ≠ ∅. □

Corollary 8.8 If K is a closed, pointed cone, then yx > 0 ∀y ∈ K \ {0}, for some x. □

Exercise 8.9 Show that part (vi) of Exercise 3.5 holds at equality for closed cones. What about part (ii) of Exercise 3.5? □

Exercise 8.10 If S ⊆ IR^n is (topologically) closed, must K(S) be closed? □

Consider once again our development for lattices. Suppose M is a #-closed rational ZZ-module with trivial lineality. By Theorem 2.31, M can be expressed in the form {zB : z ∈ ZZ^p} for some matrix B ∈ Q^{p×n}; hence M is a lattice and it follows that some positive integral scaling of M is integer-valued (see Corollary 2.12). The following exercise indicates a conical analogue for this result.

Exercise 8.11 Let K be a closed cone with trivial lineality; i.e., K is closed and pointed. Show that there is a unimodular transformation under which K becomes nonnegative. □

Finite cones are +-closed and we have seen above that much of our earlier development for finite cones remains valid for the more general class of +-closed cones. We continue this theme now by considering the manner in which Fourier-Motzkin elimination can be extended from matrices to more general sets. Recall from (3.27) that when two n-column matrices define a dual pair of polyhedral cones, then Fourier-Motzkin elimination applied to one and (dually) truncation applied to the other yields two (n−1)-column matrices which again define a dual pair of polyhedral cones; i.e., projection of a polyhedral cone into the hyperplane H = {x ∈ IR^n : x_n = 0} corresponds under duality to intersection of the dual (also polyhedral) cone with H. We should expect that neither the use of duality nor the corresponding geometry depends crucially on finiteness. This is indeed the case when S ⊆ IR^n generates a closed cone, for in this setting the content of (3.27) remains valid.

Exercise 8.12 Suppose K = {x ∈ IR^n : Sx ≥ 0}, where S ⊆ IR^n with K(S) closed, and let K′ = {x′ ∈ IR^{n−1} : (x′, x_n) ∈ K, ∃x_n ∈ IR}; i.e., K′ is the projection of K into IR^{n−1}. Show K′ = {x′ ∈ IR^{n−1} : S′x′ ≥ 0}, where S′ ⊆ IR^{n−1} is determined as in (3.9):
(i) s ∈ S, s_n = 0 ⇒ (s_1, ..., s_{n−1}) ∈ S′;
(ii) s, t ∈ S, s_n > 0, t_n < 0 ⇒ (s_1/s_n − t_1/t_n, ..., s_{n−1}/s_n − t_{n−1}/t_n) ∈ S′. □

Exercise 8.13 Define K = {x ∈ IR^n : Sx ≥ 0}, where S ⊆ IR^n with K(S) closed, and suppose T ⊆ IR^n generates K. Validate relations (ii) – (iv) of Diagram 3.27 for this setting. □

Note, in particular, that when we take T = K, a closed cone, and S = K^+, then the geometric interpretation of Exercise 8.13 is again that projection of K into the hyperplane H = {x ∈ IR^n : x_n = 0} corresponds dually to intersection of K^+ with H.

Considering the example S = {(−1, 0, 0), (0, −1, 0), (1, 1, 1), (2, 2, 1), (4, 4, 1), ...}, we have K(S) = {(x_1, x_2, x_3) : x_3 > 0} ∪ {(x_1, x_2, 0) : x_1 ≤ 0, x_2 ≤ 0}, so K(S) is not closed. Applying (8.12) to S yields S′ = {(−1, 0), (0, −1)}; hence (−1, −1) satisfies S′(−1, −1) ≥ 0. On the other hand, for K = {x ∈ IR³ : Sx ≥ 0}, we have K = {(0, 0, x_3) : x_3 ≥ 0} and projecting to eliminate x_3 yields K′ = {(0, 0)}. Thus K′ ≠ {(x_1, x_2) : S′(x_1, x_2) ≥ 0}. If we close K(S) by adding the point (1, 1, 0) to S, then K(S) = {(x_1, x_2, x_3) : x_3 ≥ 0} and we obtain S′ = {(−1, 0), (0, −1), (1, 1)}, so that now K′ = {(x_1, x_2) : S′(x_1, x_2) ≥ 0} = {(0, 0)}. This example, due to S. Magnusson (private communication, Cornell, 1992), shows that the closure stipulation on K(S) cannot be removed in (8.12) and, therefore, that (8.13) may fail for general S ⊆ IR^n. The difficulty stems from the inability of Fourier-Motzkin elimination on S to produce a constraint representation for K′. The geometric content of Diagram 3.27 remains valid, as is demonstrated in the following exercise.

Exercise 8.14 Let K = {x ∈ IR^n : Sx ≥ 0}, K′ = {x′ ∈ IR^{n−1} : (x′, x_n) ∈ K, ∃x_n ∈ IR}, where S ⊆ IR^n, and suppose T ⊆ IR^n generates K.
(i) Validate items (ii) and (iv) in Diagram 3.27 for the present setting.
(ii) Show that K′ is a closed cone, i.e., that projecting a closed cone from IR^n into IR^{n−1} yields a closed cone. □

Finally, we note that much of our development on convex cones throughout this section holds for rational, as well as real, descriptions and that the elimination process (3.9) preserves rationality – i.e., a rational generator description leads via elimination to a rational constraint representation, and conversely. When a cone is specified by rational data (generators or constraints), we will say it is rationally spanned, or simply rational, for short. For example, for A ∈ Q^{m×n}, we have the finite rational cones {yA : y ∈ IR^m_+} and {x ∈ IR^n : Ax ≥ 0}.

Exercise 8.15 State the Farkas Theorem (3.15) for rational data; show that Q^n can be strengthened to ZZ^n in alternative (ii). □
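Returning to Exercise 8.12: the elimination rules are mechanical and easy to script. A minimal sketch; applied to a finite truncation of the Magnusson example above, it reproduces S′ = {(−1, 0), (0, −1)}:

```python
def fm_eliminate(S):
    """One Fourier-Motzkin step: eliminate the last coordinate, following
    rules (i) and (ii) of Exercise 8.12 on a finite list of generators."""
    S_new = []
    for s in S:
        if s[-1] == 0:                        # rule (i)
            S_new.append(s[:-1])
    for s in S:
        for t in S:
            if s[-1] > 0 and t[-1] < 0:       # rule (ii)
                S_new.append(tuple(si / s[-1] - ti / t[-1]
                                   for si, ti in zip(s[:-1], t[:-1])))
    return S_new

S = [(-1, 0, 0), (0, -1, 0), (1, 1, 1), (2, 2, 1), (4, 4, 1)]
print(fm_eliminate(S))   # [(-1, 0), (0, -1)]: no (s_n > 0, t_n < 0) pairs
```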

We caution, however, that when we restrict entirely to the rationals, the situation becomes more complicated. For the remainder of the discussion we consider cone duality in the rationals; i.e., the conical dual of the set S ⊆ Q^n is defined as {x ∈ Q^n : Sx ≥ 0}. In this setting the basic properties of Weyl-Minkowski duality for finite cones remain valid. But the proof of Theorem 8.1 used completeness of IR^n, i.e., the Cauchy criterion for convergence. Thus it should not be surprising that implications of this theorem can fail in incomplete spaces such as Q^n. Indeed, the cone K = {(x_1, x_2) ∈ Q² : x_2 − √2·x_1 ≥ 0} is topologically closed in Q², but its dual is Q² ∩ {x : Kx ≥ 0} = {0}, while Q² ≠ K is the dual of {0}; so Corollary 8.3 is no longer valid in Q^n. In Hartmann and Trotter, Math. Prog. (A) 49(1991)281–283, it is shown that S ⊆ Q^n is closed under (rational) cone duality if and only if S = Q^n ∩ K for some topologically closed real cone K whose lineality is rationally spanned. Recall that for S ⊆ T ⊆ IR^n, the set S is dense in T provided every neighborhood of any point of T contains elements of S. Of course, Q^n is dense in IR^n, a result generalized in the following exercise; this result is used in the proof of the theorem which follows.

Exercise 8.16 Show that Q^n ∩ S is dense in any full-dimensional convex set S ⊆ IR^n. □

Theorem 8.17 A set J ⊆ Q^n is closed under rational cone duality if and only if J = Q^n ∩ K, for some closed real cone K with rationally spanned lineality.

Proof: If J is closed under cone duality in Q^n, then J = Q^n ∩ {x : Sx ≥ 0}, for some S ⊆ Q^n.
Defining the (real) cone K = {x : Sx ≥ 0}, clearly K is closed and J = Q^n ∩ K.
Moreover, the lineality of K is {x : Sx = 0}, which is rationally spanned because S ⊆ Q^n.
Conversely, let J = Q^n ∩ K, for K a closed (real) cone with rationally spanned lineality L.
Since K is closed, Exercise 8.9 implies L = L(K^+)^o.
Since L has a rational basis, L^o = L(K^+) also is rationally spanned.
Thus we may assume L(K^+) = {yB : y ∈ IR^m}, for some B ∈ Q^{m×n} of rank m.
It is easily seen that the set S = {y ∈ IR^m : yB ∈ K^+} is convex.
And z ∈ IR^m ⇒ zB ∈ L(K^+) ⇒ zB = Σ_i λ_i x_i, for x_i ∈ K^+ ⇒ zB = Σ_i λ_i (y_i B), for y_i ∈ S;
since B is of full row rank, we must have z = Σ_i λ_i y_i.
Thus L(S) = IR^m and it follows that S is full-dimensional.
Hence Exercise 8.16 implies Q^m ∩ S is a dense subset of S.
Since yB ∈ Q^n whenever y ∈ Q^m, K^+ is the closure of T = Q^n ∩ K^+; i.e., K^+ = cl(T).
Clearly, (cl(T))^+ = T^+, so we must have
J = Q^n ∩ K = Q^n ∩ K^{++} = Q^n ∩ (cl(T))^+ = Q^n ∩ T^+ = Q^n ∩ {x : Tx ≥ 0}.
As J is the (rational) conical dual of T ⊆ Q^n, J is closed under cone duality in Q^n. □

Exercise 8.18 The closure condition in (8.17) stipulates rationally spanned lineality for K. Does it then follow that K is rationally spanned? Prove or demonstrate false by example. □
