A. Appendix: Some Mathematical Concepts
Overview

This appendix on mathematical concepts is simply intended to aid readers who would benefit from a refresher on material they learned in the past or possibly never learned at all. Included here are basic definitions and examples pertaining to aspects of set theory, numerical linear algebra, metric properties of sets in real n-space, and analytical properties of functions. Among the latter are such matters as continuity, differentiability, and convexity.

A.1 Basic terminology and notation from set theory

This section recalls some general definitions and notation from set theory. In Section A.4 we move on to consider particular kinds of sets; in the process, we necessarily draw certain functions into the discussion.

A set is a collection of objects belonging to some universe of discourse. The objects belonging to the collection (i.e., set) are said to be its members or elements. The symbol ∈ is used to express the membership of an object in a set. If x is an object and A is a set, the statement that x is an element of A (or x belongs to A) is written x ∈ A. To negate the membership of x in the set A we write x ∉ A.

The set of all objects, in the universe of discourse, not belonging to A is called the complement of A. There are many different notations used for the complement of a set A. One of the most popular is A^c.

The cardinality of a set is the number of its elements. A set A can have cardinality zero, in which case it is said to be empty. The empty set is sometimes called the null set; in either case it is denoted by the symbol ∅. The cardinality of a set can be finite or infinite. When the cardinality of a set is one, the set is called a singleton. If x is the one element of a singleton A, we write A = {x}.

If A and B are two sets and every element of A is an element of B, we call A a subset of B and write A ⊆ B.
This notation allows for the possibility that A and B are the same set, but when A and B are not equal, we write A ⊂ B. There is an analogous notion of containment as well. In particular, when A is a subset of B, we say that B contains A and write B ⊇ A, or B ⊃ A if the sets are not equal. The binary relations ⊂ and ⊃ are called proper inclusion and proper containment, respectively. By convention, the empty set is a subset of every set.

A set is customarily described by some criterion that its members must satisfy. The criterion is a statement involving a variable which is made true by each of the members of the set. This gives rise to expressions such as

A = {x : statement about x is true}.

This is called set-builder notation.¹ For example, when we take R, the real number field, to be the universe of discourse, the set of nonnegative real numbers (often associated with the nonnegative part of the real number line) is

$R_+ = \{x : x \in R \text{ and } x \ge 0\}.$

As a set, R^n can be described as follows:

$R^n = \{x : x = (x_1, \dots, x_n), \text{ where } x_i \in R,\ i = 1, \dots, n\}.$

When we take R^n to be the universe of discourse, the corresponding nonnegative orthant can be expressed as

$R^n_+ = \{x : x \in R^n \text{ and } x_i \ge 0,\ i = 1, \dots, n\}.$

This can also be put in a slightly different way, namely as

$R^n_+ = \{x \in R^n : x_i \ge 0,\ i = 1, \dots, n\}.$

The set-builder notation is useful in defining the various operations on sets itemized below.

• The union of two sets, say A and B, is the set

A ∪ B = {x : x ∈ A or x ∈ B}.

If A_ω is a set for each index ω belonging to a set Ω, their union is

$\bigcup_{\omega \in \Omega} A_\omega = \{x : x \in A_\omega \text{ for some } \omega \in \Omega\}.$

• The intersection of two sets, say A and B, is the set

A ∩ B = {x : x ∈ A and x ∈ B}.

If A_ω is a set for each index ω belonging to a set Ω, the intersection of all these sets is

$\bigcap_{\omega \in \Omega} A_\omega = \{x : x \in A_\omega \text{ for all } \omega \in \Omega\}.$

• As mentioned above, the complement of a set A is the set of objects in the universe of discourse that do not belong to A, that is,

A^c = {x : x ∉ A}.

• Given two sets, say A and B with B regarded as the universe of discourse, the set-theoretic difference of B and A is the complement of A relative to B. In set-builder notation, we have

B \ A = {x : x ∈ B and x ∉ A} = B ∩ A^c.

• Another construct is the symmetric difference of two sets, say A and B. This is defined by

A Δ B = (A \ B) ∪ (B \ A).

• The Cartesian product of two sets, A and B, is another important concept. This set consists of all ordered pairs of elements from A and B. Accordingly, we write

A × B = {(x, y) : x ∈ A and y ∈ B}.

(Note that when A and B are distinct nonempty sets, i.e., A ≠ B, the sets A × B and B × A are likewise distinct.) The Cartesian product of sets can be iterated, as in the definition of R^n with n ≥ 2:

$R^n = R^{n-1} \times R.$

¹ Some authors use a vertical bar instead of a colon, and whatever symbol is used, one sometimes sees the universe of discourse mentioned before that symbol. Thus, if U is the universe of discourse, a set A might be declared as {x ∈ U : statement about x is true}. But more often than not, the universe of discourse is clear from the context and is not even stated.

© Springer Science+Business Media LLC 2017. R.W. Cottle and M.N. Thapa, Linear and Nonlinear Optimization, International Series in Operations Research & Management Science 253, DOI 10.1007/978-1-4939-7055-1

A.2 Norms and metrics

The importance of distance

The concept of the distance between objects arises frequently, both in practice and in theory (see [48]). We are all familiar with the measurement of length and of the ordinary distance between physical objects. Sometimes the “distance” between locations is reckoned in terms of travel time. One can also use the notion of “distance” to measure the “similarity” of objects.
Optical character recognition is one of many technologies that use distance measurements to identify “objects” automatically. In this case, the objects would normally be printed or hand-written characters belonging to an alphabet. Such objects can be described by vectors of numbers representing chosen features of the character set. A similar approach can be applied to recognizing faces or fingerprints by various biometric measurements.

In optimization, the measurement of distance from one vector to another vector is often needed. The distance between successive iterates in an algorithm and the distance of a point from the origin are matters that come up repeatedly. A common kind of problem is that of minimizing the distance from a point to a set. For example, the question of whether the system Ax = b, x ≥ 0 has a solution can be settled by determining the distance from the point (vector) b to the finitely generated cone

pos (A) = {y : y = Ax, x ≥ 0}.

If the distance is positive, then b ∉ pos (A) and hence Ax = b, x ≥ 0 has no solution. The converse is also true: if the system has no solution, then the distance is positive.

As we saw briefly in Chapter 8, distance can be calculated in various ways. Metrics are the tools that enable us to measure distance. However, because norms² are a fundamental concept used in defining metrics, we begin with an introduction to vector norms.

² There are vector norms and matrix norms.

Norms on R^n

In general, vector norms measure the size of vectors in some linear space. For our purposes, the linear space of interest is R^n. A norm on R^n is a real-valued function ϕ with the following properties:

1. ϕ(x) ≥ 0 for all x ∈ R^n, with ϕ(x) = 0 only if x = 0.
2. ϕ(αx) = |α| ϕ(x) for all α ∈ R, x ∈ R^n, where |α| is the absolute value of α, i.e., |α| = max{α, −α}.
3. ϕ(x + y) ≤ ϕ(x) + ϕ(y) for all x, y ∈ R^n.

If ϕ is a norm on R^n, then for any x ∈ R^n, ϕ(x) is called the norm of x. Note that property 2 implies ϕ(0) = 0.
It also implies that for every x ∈ R^n, the vectors x and −x “have the same norm,” that is, ϕ(x) = ϕ(−x). Property 3 is called the triangle inequality.

The most commonly encountered vector norms are:

1. The 1-norm: $\|x\|_1 = \sum_{j=1}^{n} |x_j|$, also known as the Manhattan norm.
2. The 2-norm: $\|x\|_2 = \sqrt{\sum_{j=1}^{n} x_j^2}$, also known as the Euclidean norm.
3. The ∞-norm: $\|x\|_\infty = \max_{1 \le j \le n} |x_j|$.

Norms can also be defined on R^{m×n}, that is, on real m × n matrices. In a formal sense, a matrix norm must possess the same three properties that a vector norm possesses. It is customary to use the same notation for both vector norms and matrix norms. Matrix norms are continuous functions ||·|| that possess the following properties:

1. ||A|| ≥ 0, with equality if and only if A = 0.
2. ||αA|| = |α| ||A|| for any scalar α.
3. ||A + B|| ≤ ||A|| + ||B||.
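As an illustration of the three vector norms listed above, the following Python sketch computes them for small vectors and spot-checks the norm properties numerically. The function names (norm_1, norm_2, norm_inf) and the sample vectors are hypothetical choices made for this example, not notation from the text, and a check on a few vectors illustrates the properties without proving them.

```python
import math

# Hypothetical helper names; vectors are plain Python lists of floats.

def norm_1(x):
    """1-norm (Manhattan norm): sum of the absolute values of the entries."""
    return sum(abs(xj) for xj in x)

def norm_2(x):
    """2-norm (Euclidean norm): square root of the sum of squared entries."""
    return math.sqrt(sum(xj * xj for xj in x))

def norm_inf(x):
    """Infinity-norm: largest absolute value among the entries (assumes n >= 1)."""
    return max(abs(xj) for xj in x)

# Sample data chosen only for illustration.
x = [3.0, -4.0, 0.0]
y = [1.0, 2.0, -2.0]
alpha = -2.5

for name, phi in [("1", norm_1), ("2", norm_2), ("inf", norm_inf)]:
    x_plus_y = [a + b for a, b in zip(x, y)]
    scaled = [alpha * a for a in x]
    nonnegative = phi(x) >= 0.0                                   # property 1
    homogeneous = math.isclose(phi(scaled), abs(alpha) * phi(x))  # property 2
    triangle = phi(x_plus_y) <= phi(x) + phi(y) + 1e-12           # property 3
    print(name, phi(x), nonnegative, homogeneous, triangle)
```

For x = (3, −4, 0) the three norms are 7, 5 and 4, respectively, which also illustrates the general ordering ||x||_∞ ≤ ||x||_2 ≤ ||x||_1.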