
Chapter 2

Linear Spaces

Contents

Field of Scalars
Vector Spaces
Subspaces
Sum of subsets
Linear combinations
Linear independence
Basis and Dimension
Convexity
Normed linear spaces
The $\ell_p$ and $L_p$ spaces
Topological concepts
Open sets
Closed sets
Bounded sets
Convergence of sequences
......
Cauchy sequences
Banach spaces
Complete subsets
Transformations
Linear transformations
Continuity
Compactness
Upper semicontinuous functions
Quotient Spaces
Denseness
Separability
Schauder basis
Summary

2.1 2.2 c J. Fessler, October 4, 2004, 12:44 (student version)

In systems analysis, linear spaces are ubiquitous. Why? Linear systems/models are easier to analyze; many systems, particularly in , are deliberately designed to be linear; linear models are a useful starting point (approximation) for more complicated nonlinear cases.

Formal definitions of a vector space use the concept of a field of scalars, so we first review that.

Field of Scalars (from Applied Linear Algebra, Noble and Daniel, 2nd ed.)
A field of scalars $\mathcal{F}$ is a collection of elements $\alpha, \beta, \gamma, \ldots$ along with an “addition” and a “multiplication” operator.

To every pair of scalars $\alpha, \beta$ in $\mathcal{F}$, there must correspond a scalar $\alpha + \beta$ in $\mathcal{F}$, called the sum of $\alpha$ and $\beta$, such that
• Addition is commutative: $\alpha + \beta = \beta + \alpha$
• Addition is associative: $\alpha + (\beta + \gamma) = (\alpha + \beta) + \gamma$
• There exists a unique element $0 \in \mathcal{F}$, called zero, for which $\alpha + 0 = \alpha$, $\forall \alpha \in \mathcal{F}$
• For every $\alpha \in \mathcal{F}$, there corresponds a unique scalar $(-\alpha) \in \mathcal{F}$ for which $\alpha + (-\alpha) = 0$.

To every pair of scalars $\alpha, \beta$ in $\mathcal{F}$, there must correspond a scalar $\alpha\beta$ in $\mathcal{F}$, called the product of $\alpha$ and $\beta$, such that
• Multiplication is commutative: $\alpha\beta = \beta\alpha$
• Multiplication is associative: $\alpha(\beta\gamma) = (\alpha\beta)\gamma$
• Multiplication distributes over addition: $\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$
• There exists a unique element $1 \in \mathcal{F}$, called one, or unity, or the identity element, for which $1\alpha = \alpha$, $\forall \alpha \in \mathcal{F}$
• For every nonzero $\alpha \in \mathcal{F}$, there corresponds a unique scalar $\alpha^{-1} \in \mathcal{F}$, called the inverse of $\alpha$, for which $\alpha\alpha^{-1} = 1$.

Simple facts for fields:
• $0 + 0 = 0$ (use $\alpha = 0$ in the definition of 0)
• $-0 = 0$
Proof. For any $\alpha$, by the associative property $(\alpha + 0) + (-0) = \alpha + (0 + (-0))$, hence $\alpha + (-0) = \alpha$. Hence, since the zero element is unique, $-0 = 0$.

Example. The set of rational numbers $\mathbb{Q}$ (with the usual definition of addition and multiplication) is a field.
The only fields that we will need are the field of real numbers $\mathbb{R}$ and the field of complex numbers $\mathbb{C}$. Therefore, hereafter we will use $\mathcal{F}$ when describing results that hold for either $\mathbb{R}$ or $\mathbb{C}$.
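The field axioms listed above can be spot-checked on the rational numbers. The following is a minimal sketch, assuming Python's standard `fractions.Fraction` as a model of $\mathbb{Q}$; the sample values and the helper name `check_field_axioms` are illustrative choices, and a finite check is of course not a proof.

```python
from fractions import Fraction
from itertools import product

# A small, arbitrary sample of rationals (the sample itself is an assumption).
SAMPLE = [Fraction(n, d) for n in (-3, -1, 0, 2, 5) for d in (1, 2, 7)]


def check_field_axioms(sample):
    """Return True if the listed field axioms hold for every triple in sample."""
    zero, one = Fraction(0), Fraction(1)
    for a, b, c in product(sample, repeat=3):
        # Commutativity of addition and multiplication.
        if a + b != b + a or a * b != b * a:
            return False
        # Associativity of addition and multiplication.
        if a + (b + c) != (a + b) + c or a * (b * c) != (a * b) * c:
            return False
        # Multiplication distributes over addition.
        if a * (b + c) != a * b + a * c:
            return False
    for a in sample:
        # Identities and additive inverse.
        if a + zero != a or one * a != a or a + (-a) != zero:
            return False
        # Multiplicative inverse for nonzero elements.
        if a != zero and a * (1 / a) != one:
            return False
    return True
```

Exact rational arithmetic is used here so that the equalities are tested without floating-point rounding.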

Vector Spaces
In simple words, a vector space is a space that is closed under vector addition and under scalar multiplication.
Definition. A vector space or linear space consists of the following four entities.

1. A field $\mathcal{F}$ of scalars.
2. A set $\mathcal{X}$ of elements called vectors.
3. An operation called vector addition that associates a sum $x + y \in \mathcal{X}$ with each pair of vectors $x, y \in \mathcal{X}$ such that
• Addition is commutative: $x + y = y + x$
• Addition is associative: $x + (y + z) = (x + y) + z$
• There exists an element $\mathbf{0} \in \mathcal{X}$, called the zero vector, for which $x + \mathbf{0} = x$, $\forall x \in \mathcal{X}$
• For every $x \in \mathcal{X}$, there corresponds a unique vector $(-x) \in \mathcal{X}$ for which $x + (-x) = \mathbf{0}$.
4. An operation called multiplication by a scalar that associates with each scalar $\alpha \in \mathcal{F}$ and vector $x \in \mathcal{X}$ a vector $\alpha x \in \mathcal{X}$, called the product of $\alpha$ and $x$, such that:
• Associative: $\alpha(\beta x) = (\alpha\beta)x$
• Distributive: $\alpha(x + y) = \alpha x + \alpha y$
• Distributive: $(\alpha + \beta)x = \alpha x + \beta x$
• If 1 is the identity element of $\mathcal{F}$, then $1x = x$, $\forall x \in \mathcal{X}$.
• $0x = \mathbf{0}$ for any $x \in \mathcal{X}$.

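The requirement that $x + y \in \mathcal{X}$ and $\alpha x \in \mathcal{X}$ is sometimes called the closure property.

Simple facts for vector spaces:
• $\mathbf{0}$ is unique.
• $(-1)x = -x$ for $x \in \mathcal{X}$.
Proof. $x + (-1)x = 1x + (-1)x = (1 + (-1))x = 0x = \mathbf{0}$.
• $\alpha\mathbf{0} = \mathbf{0}$ for $\alpha \in \mathcal{F}$.
Proof. $\alpha\mathbf{0} = \alpha\mathbf{0} + \mathbf{0} = \alpha\mathbf{0} + [\alpha\mathbf{0} + (-\alpha\mathbf{0})] = [\alpha\mathbf{0} + \alpha\mathbf{0}] + (-\alpha\mathbf{0}) = \alpha(\mathbf{0} + \mathbf{0}) + (-\alpha\mathbf{0}) = \alpha\mathbf{0} + (-\alpha\mathbf{0}) = \mathbf{0}$.
• $x + y = x + z$ implies $y = z$ (cancellation law)
• $\alpha x = \alpha y$ and $\alpha \neq 0$ implies $x = y$ (cancellation law)
• $\alpha x = \beta x$ and $x \neq \mathbf{0}$ implies $\alpha = \beta$ (cancellation law)
• $\alpha(x - y) = \alpha x - \alpha y$ (distributive law)
• $(\alpha - \beta)x = \alpha x - \beta x$ (distributive law)
• $-\alpha x = \alpha(-x) = -(\alpha x)$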

What are some examples? (Linear algebra classes focus on finite-dimensional examples.)

Important Vector Spaces
• n-tuple space: $\mathcal{X} = \mathbb{R}^n$. If $x \in \mathcal{X}$, then $x = (a_1, a_2, \ldots, a_n)$ where $a_i \in \mathbb{R}$ and we use ordinary addition and multiplication: $x + y = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$ and $\alpha x = (\alpha a_1, \ldots, \alpha a_n)$. (As a special case, the set of real numbers $\mathbb{R}$ (with ordinary addition and multiplication) is a trivial vector space.)
• $\mathcal{X} = L_1(\mathbb{R}^2)$. The set of functions $f : \mathbb{R}^2 \to \mathbb{R}$ that are absolutely (Lebesgue) integrable: $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |f(x,y)| \, dx \, dy < \infty$, with the usual pointwise definition of addition and scalar multiplication. To show that $f, g \in L_1(\mathbb{R}^2)$ implies $f + g \in L_1(\mathbb{R}^2)$, one can apply the triangle inequality: $|f + g| \leq |f| + |g|$.
• The set of functions on the plane $\mathbb{R}^2$ that are zero outside of the unit square.
• The set of solutions to a homogeneous linear system of equations $Ax = 0$.
• $C[a, b]$: the space of real-valued, continuous functions defined on the interval $[a, b]$.
• The space of band-limited signals.
• Many more in Luenberger...

Example. For $1 \leq p < \infty$, define the following infinite-dimensional¹ space:
$\mathcal{X} = R_p[a, b] = \left\{ f : [a, b] \to \mathbb{R} : f \text{ is Riemann integrable and } \int_a^b |f(t)|^p \, dt < \infty \right\}$
(with the usual pointwise definitions of addition of functions and multiplication of functions by a scalar).
To show that this space is a vector space, the only nontrivial work is verifying closure.
Clearly if $f \in \mathcal{X}$ then $\alpha f \in \mathcal{X}$ since $\int_a^b |\alpha f(t)|^p \, dt = |\alpha|^p \int_a^b |f(t)|^p \, dt < \infty$, so $\mathcal{X}$ is closed under scalar multiplication.
To show that $f + g \in \mathcal{X}$ if $f, g \in \mathcal{X}$, i.e., closure under addition, requires a bit more work. Note that since $|a + b| \leq 2 \max\{|a|, |b|\}$, it follows for $p \geq 1$ that
$|a + b|^p \leq 2^p \max\{|a|^p, |b|^p\} \leq 2^p [\, |a|^p + |b|^p \,]$.
Hence if $f, g \in \mathcal{X}$:
$\int_a^b |f(t) + g(t)|^p \, dt \leq \int_a^b [\, 2^p |f(t)|^p + 2^p |g(t)|^p \,] \, dt = 2^p \int_a^b |f(t)|^p \, dt + 2^p \int_a^b |g(t)|^p \, dt < \infty$,
showing closure under addition.
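The scalar inequality $|a + b|^p \leq 2^p [\, |a|^p + |b|^p \,]$ used in the closure argument can be spot-checked numerically. This is a minimal sketch; the helper name `bound_holds` and the sample values are assumptions made for illustration.

```python
import itertools


def bound_holds(a, b, p):
    """Check |a + b|^p <= 2^p * (|a|^p + |b|^p) for real a, b and p >= 1."""
    lhs = abs(a + b) ** p
    rhs = 2 ** p * (abs(a) ** p + abs(b) ** p)
    return lhs <= rhs


# Arbitrary sample values and exponents (assumptions for illustration).
VALUES = [-3.0, -0.5, 0.0, 0.25, 1.0, 4.0]
EXPONENTS = [1, 1.5, 2, 3]

ALL_OK = all(
    bound_holds(a, b, p)
    for a, b in itertools.product(VALUES, repeat=2)
    for p in EXPONENTS
)
```

The bound is loose (it loses a factor of up to $2^p$), which is fine here because only finiteness of the integral matters.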

Example. Can $\mathcal{X} = (0, \infty)$ with $\mathcal{F} = \mathbb{R}$ be a vector space? ??

Cartesian product
We can make a “larger” vector space from two vector spaces $\mathcal{X}$ and $\mathcal{Y}$ (having a common field $\mathcal{F}$) by forming the Cartesian product of $\mathcal{X}$ and $\mathcal{Y}$, denoted $\mathcal{X} \times \mathcal{Y}$, which is the collection of ordered pairs $(x, y)$ where $x \in \mathcal{X}$ and $y \in \mathcal{Y}$:
$\mathcal{X} \times \mathcal{Y} = \{ (x, y) : x \in \mathcal{X} \text{ and } y \in \mathcal{Y} \}$.
To be a vector space we must define vector addition and scalar multiplication operations, which we define component-wise:
• $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$
• $\alpha(x, y) = (\alpha x, \alpha y)$, $\forall \alpha \in \mathcal{F}$.
Fact. With these definitions the Cartesian product of two vector spaces is indeed a vector space.
The above definition generalizes easily to higher-order combinations.
Example. $\mathbb{R}^3 = \mathbb{R} \times \mathbb{R} \times \mathbb{R}$.

¹ A definition of dimension is forthcoming...

2.3 Subspaces
A (nonempty) subset $S$ of a vector space $\mathcal{X}$ is called a subspace of $\mathcal{X}$ if $S$, when endowed with the addition and scalar multiplication operations defined for $\mathcal{X}$, is a vector space, i.e., $\alpha x + \beta y \in S$ whenever $x, y \in S$ and $\alpha, \beta \in \mathcal{F}$.

Example. The subset of $\mathcal{X} = R_2[-1, 1]$ consisting of symmetric functions ($f(-t) = f(t)$) is a subspace of $\mathcal{X}$. It is clearly closed under addition and scalar multiplication.

What are the four types of subspaces of $\mathbb{R}^3$? ??
Intuition: think of a subspace like a line or plane (or hyperplane) through the origin.

Properties of subspaces
• $\mathbf{0} \in S$
• $\{\mathbf{0}\}$ is a subspace of $\mathcal{X}$
• $\mathcal{X}$ is a subspace of $\mathcal{X}$
• A subspace not equal to the entire space $\mathcal{X}$ is called a proper subspace
• If $M$ and $N$ are subspaces of a vector space $\mathcal{X}$, then the intersection $M \cap N$ is also a subspace of $\mathcal{X}$. Proof. see text
Think: intersection of planes (through the origin) in 3d.
• Typically the union of two subspaces is not a subspace.
Think: union of planes (through the origin) in 3d.
Although unions usually fail, we can combine two subspaces by an appropriate sum, defined next.

Sum of subsets
Definition. If $S$ and $T$ are two subsets of a vector space, then the sum of those subsets, denoted $S + T$, is defined by

$S + T = \{ x = s + t : s \in S, \; t \in T \}$.

Example. What is the sum of a plane and a line (both through origin) in $\mathbb{R}^3$? ??
Example. Consider $\mathcal{X} = \mathbb{R}^2$, with $S = \{ (x, 0) : x \in [0, 1] \}$ and $T = \{ (0, y) : y \in [0, 1] \}$. Then $S + T$ is the unit square.
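The unit-square example can be imitated on a finite grid. The sketch below samples each segment at spacing 1/4 (the grid spacing and the helper name `subset_sum` are assumptions for illustration) and forms every pairwise sum.

```python
from fractions import Fraction


def subset_sum(S, T):
    """Return S + T = {s + t : s in S, t in T} for finite sets of 2-D points."""
    return {(s[0] + t[0], s[1] + t[1]) for s in S for t in T}


# Sample the two unit segments on a grid with spacing 1/4 (an assumption).
GRID = [Fraction(k, 4) for k in range(5)]
S = {(x, Fraction(0)) for x in GRID}  # points (x, 0) with x in [0, 1]
T = {(Fraction(0), y) for y in GRID}  # points (0, y) with y in [0, 1]

# Every grid point of the unit square appears as some s + t.
SQUARE = subset_sum(S, T)
```

Neither sampled segment is a subspace, so this sum is only an instance of the definition, not of the proposition about sums of subspaces.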

Proposition. If $M$ and $N$ are subspaces of a vector space $\mathcal{X}$, then the sum $M + N$ is a subspace of $\mathcal{X}$.

Proof. see text
Does the previous example illustrate this proposition? ??
Example. Let $\mathcal{X} = \{ f : f(t) = a \sin(t + \phi) \text{ for } a, \phi \in \mathbb{R} \}$ (with the usual definitions of addition and scalar multiplication²). Then $M = \{ f : f(t) = a \sin(t) \text{ for } a \in \mathbb{R} \}$ and $N = \{ f : f(t) = a \cos(t) \text{ for } a \in \mathbb{R} \}$ are both (proper) subspaces of $\mathcal{X}$.
What is $M + N$? ??

² It is time to stop saying this. From now on we leave it implicit whenever this is clear, which it usually is.

Linear combinations
Definition. A finite sum $\sum_{i=1}^{n} \alpha_i x_i$ for $x_i \in \mathcal{X}$ and $\alpha_i \in \mathcal{F}$ is called a linear combination.
(The associative property of vector addition allows us to write such a sum without parentheses.)

Depending on where the $x_i$'s originated we get various properties of linear combinations.
• $\mathcal{X}$: If $x_i \in \mathcal{X}$, $i = 1, \ldots, n$, then $\sum_{i=1}^{n} \alpha_i x_i \in \mathcal{X}$. This is shown easily by induction from the definition of a vector space.
• $M$: If $x_i \in M$, $i = 1, \ldots, n$, where $M$ is a subspace, then $\sum_{i=1}^{n} \alpha_i x_i \in M$ by induction from the definition of a subspace. Any linear combination of vectors from a subspace is also in the subspace.
• $S$: What if we take linear combinations from a subset rather than a subspace?

Definition. If $S$ is a subset of a vector space $\mathcal{X}$, then the subspace generated by $S$ is the subspace of linear combinations drawn from $S$, defined by
$[S] = \left\{ x \in \mathcal{X} : x = \sum_{i=1}^{n} \alpha_i x_i, \text{ for } x_i \in S, \; \alpha_i \in \mathcal{F}, \text{ and } n \in \mathbb{N} \right\}$.
• $[S]$ is also called the span or the linear hull of $S$.
• $[S]$ is indeed a subspace of $\mathcal{X}$ since a linear combination of linear combinations is itself a linear combination.
• $[S]$ is the smallest subspace of $\mathcal{X}$ containing $S$, i.e., if $M$ is a subspace of $\mathcal{X}$ that contains $S$, then $[S] \subseteq M$.
• If $M$ is a subspace of $\mathcal{X}$, then $[M] = $ ??
• Clearly $S \subseteq [S]$
• Note that only finite sums are involved, as in all linear combinations.

Example. For $\mathcal{X} = \mathbb{R}^3$, what is $[S]$ when $S$ consists of a line through the origin plus any point not on that line? ??
Intuition: a subspace of a general vector space generalizes the notion of a line or plane through the origin of Euclidean 3D space.
What about lines or planes that are not through the origin?

Linear varieties
skip for now. Not needed in Ch. 2 problems. Wait until 3.10.

Definition. A subset $V$ of a vector space $\mathcal{X}$ is called a linear variety iff $V = x_0 + M$ for some $x_0 \in \mathcal{X}$ and some subspace $M$ of $\mathcal{X}$. Another term used is affine subspace.
Linear varieties arise in certain minimum problems.

2.5 Linear independence
Often we need to quantify how “big” a subspace is.
Definition. A vector $x$ is called linearly dependent on a set $S$ of vectors iff $x \in [S]$, i.e., $x$ is in the span of $S$. Otherwise, if $x \notin [S]$, then $x$ is called linearly independent of $S$.
Definition. A set $S$ of vectors is called a linearly independent set if each vector in the set is linearly independent of the remaining vectors in the set, i.e., $\forall x \in S$, $x \notin [S - \{x\}]$.
Remark. $S$ may be uncountable, but testing whether $x \in [S - \{x\}]$ requires consideration only of finite sums, by the definition of linear combinations.
Example. This illustrates that $S$ can be uncountable!
Let $\mathcal{X} = \{ f : [0, 1] \to \mathbb{R} \}$ and define $g_s(t) = \begin{cases} 1, & t = s \\ 0, & \text{otherwise.} \end{cases}$
Then $S = \{ g_s : s \in [0, 1] \}$ is a linearly independent subset of $\mathcal{X}$.

Theorem. A finite set of vectors $\{x_1, \ldots, x_n\}$ is linearly independent iff $\sum_{i=1}^{n} \alpha_i x_i = \mathbf{0}$ implies that $\alpha_i = 0$, $\forall i$.
(We are skipping proofs that are found in basic linear algebra books.)
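For vectors in $\mathbb{R}^n$ with rational entries, the criterion in the theorem can be tested mechanically: the only solution of $\sum_i \alpha_i x_i = \mathbf{0}$ is $\alpha = 0$ exactly when the rank of the list of vectors equals the number of vectors. The sketch below uses Gaussian elimination over exact fractions; this is a standard technique swapped in for illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination over Q."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0  # number of pivot rows found so far
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from all rows below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """True iff the only vanishing linear combination is the trivial one."""
    return rank(vectors) == len(vectors)
```

For example, `is_linearly_independent([[1, 2, 3], [2, 4, 6]])` is false because the second vector is twice the first.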

Corollary. If a finite set $S = \{x_1, \ldots, x_n\}$ is linearly independent and $y \in [S]$, then $y$ has a unique expansion $y = \sum_{i=1}^{n} \alpha_i x_i$ for some $\alpha_1, \ldots, \alpha_n \in \mathcal{F}$.

Basis and Dimension
Definition. A set $S$ is called a basis or Hamel basis [3, p 183] for $\mathcal{X}$ iff $S$ is linearly independent and $[S] = \mathcal{X}$.
Luenberger [4, p. 20] says “finite set” but Naylor [3, p. 183] and Maddox [2, p. 74] do not. Let us agree to use the above definition, rather than Luenberger’s.
Note that $\mathcal{X}$ can (and usually will) have more than one basis!
Definition. If $\mathcal{X}$ has a basis $S$ that is a finite set, then we call $\mathcal{X}$ finite dimensional. Otherwise, if no such finite $S$ exists, we call $\mathcal{X}$ infinite dimensional.
Definition. A space with a basis consisting of $n$ elements is called an $n$-dimensional space.
This terminology is acceptable thanks to the following result.

Theorem. Any two bases for a finite-dimensional vector space contain the same number of elements.

Most, but not all, properties of (more easily understood) finite-dimensional spaces generalize to infinite-dimensional spaces.
Example. $P_n = \{ \text{polynomials of degree} \leq n \}$.
A basis is $\{1, t, t^2, \ldots, t^n\}$, which has $n + 1$ elements, so $P_n$ has dimension $n + 1$. Why linearly independent? ??
Another basis is the Legendre polynomials (up to scaling): $\frac{d^k}{dt^k} (t^2 - 1)^k$, $k = 0, \ldots, n$.
Exercise. $C[0, 1]$ is infinite dimensional. Hint: prove by counter example considering $P_{n+1}$.
A basis is a generalization of the usual concept of a coordinate system.
Fact. [3, p 184] If $S$ is a linearly independent set in a vector space $\mathcal{X}$, then there exists a basis $B$ for $\mathcal{X}$ such that $S \subseteq B$.
Thus, every vector space has a Hamel basis (since the empty set is linearly independent).
However, “Hamel basis is not the only concept of basis that arises in analysis. There are concepts of basis that involve topological as well as linear structure. ... In applications involving infinite-dimensional spaces, a useful basis, if one even exists, is usually something other than a Hamel basis. For example, a complete orthonormal set is far more useful in an infinite-dimensional [...] than a Hamel basis” [3, p. 183]. (More on this later!)

We often need sets with less rigid structure than subspaces but that nevertheless still have some structure, so we digress a bit here.

2.4 Convexity
Luenberger describes convexity as “the fundamental algebraic concept of vector space.” (p. 25)
According to Oxford English dictionary, an algebra is “a calculus of symbols combining according to certain defined laws.”
Definition. A set $K$ in a vector space is called convex iff for any $x, y \in K$, $\alpha x + (1 - \alpha) y \in K$ for all $\alpha \in [0, 1]$.
Geometrically: for any two points in a convex set, the “line segment” between them is also in the set.

Properties of convex sets
• Subspaces and linear varieties are convex.
• $\{\mathbf{0}\}$ is “vacuously” convex.
• For $\alpha \in \mathcal{F}$, $\alpha K = \{ x = \alpha k : k \in K \}$ is convex if $K$ is convex (magnification or minification of a set)
• $K_1 + K_2$ is convex if $K_1$ and $K_2$ are convex sets in a common vector space.
• If $\mathcal{C}$ is a collection of convex sets (in a common vector space), then $\bigcap_{K \in \mathcal{C}} K$ is a convex set. (Important for POCS methods.)

Definition. The convex cover or convex hull of a set $S$ in a vector space is the smallest convex set containing $S$, denoted $\mathrm{co}(S)$. Equivalently, the convex hull of $S$ is the intersection of all convex sets containing $S$:

$\mathrm{co}(S) = \bigcap_{\{K \text{ convex} \,:\, S \subseteq K\}} K$. (Picture in 2D of blob and its convex hull)

Problem 2.4 gives a more constructive form for $\mathrm{co}(S)$.

Cone
Definition. A set $C$ in a vector space is called a cone with vertex at the origin if $x \in C$ implies that $\alpha x \in C$ for all $\alpha \in [0, \infty)$.
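For a finite set of points in the plane, the convex hull is a polygon whose vertices can be computed directly. The sketch below uses Andrew's monotone chain algorithm, a standard computational-geometry method that is not part of these notes; it is restricted to finite $S \subset \mathbb{R}^2$, and the function names are illustrative.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of co(S) for a finite planar set S, in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a right turn or lie on a straight edge.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Corners of the unit square plus one interior and one edge point.
POINTS = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5), (0.5, 0)]
HULL = convex_hull(POINTS)
```

Only the four corners survive: the interior point and the point on an edge are convex combinations of the corners.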

(Pictures: a cone and a convex cone)

Example. The space of nonnegative continuous (real) functions is a convex cone in the vector space of continuous functions.

Relationships between cones, convex sets, and subspaces
(Figure: Venn diagram of the subsets of $\mathcal{X}$, with regions labeled Cones, Convex Sets, Convex Cones, and Subspaces.)

Exercise. An arbitrary union of subspaces (in a common VS) is a cone.

2.6 Normed linear spaces
We’ve gone about as far as we can with just the basic axioms of a vector space. Fortunately, most of the vector spaces of interest have additional structure: a norm.
Definition. A norm on a vector space $\mathcal{X}$ is a function $\|\cdot\| : \mathcal{X} \to \mathbb{R}$ that satisfies the following for every $x \in \mathcal{X}$.
• $\|x\| \geq 0$ (nonnegativity)
• $\|x\| = 0$ iff $x = \mathbf{0}$ (positive definiteness)
• $\|\alpha x\| = |\alpha| \, \|x\|$, $\forall \alpha \in \mathcal{F}$ (scaling property) (homogeneity)
• $\|x + y\| \leq \|x\| + \|y\|$, $\forall y \in \mathcal{X}$ (triangle inequality)
Notice that the function $|\alpha|$ appears above. It is here where our “restriction” to the fields $\mathbb{R}$ and $\mathbb{C}$ enters³.
Definition. $(\mathcal{X}, \|\cdot\|)$ is called a normed linear space or normed linear vector space or just normed space⁴.
Clearly a norm generalizes the usual notion of length.
The following lemma arises remarkably frequently in proofs.
Lemma. In a normed space, $\|x\| - \|y\| \leq \|x - y\|$ for all $x, y \in \mathcal{X}$, i.e., $\big| \|x\| - \|y\| \big| \leq \|x - y\|$.
Proof. $\|x\| - \|y\| = \|x - y + y\| - \|y\| \leq \|x - y\| + \|y\| - \|y\| = \|x - y\|$.

Example. Euclidean n-space: $E^n \triangleq \left( \mathbb{R}^n, \; \|x\| = \sqrt{\sum_{i=1}^{n} x_i^2} \right)$
Example. Euclidean n-space with a weighted norm: $\left( \mathbb{R}^n, \; \|x\| = \sqrt{\sum_{i=1}^{n} w_i x_i^2} \right)$ for $w_i > 0$.
Example. A different norm for $\mathbb{R}^n$: $\left( \mathbb{R}^n, \; \|x\| = \max_k |x_k| \right)$
Proof. Clearly this norm is nonnegative, and is zero only if $x_k = 0$ $\forall k$, i.e., if $x = \mathbf{0}$. And $\max_k |\alpha x_k| = |\alpha| \max_k |x_k|$. For the triangle inequality, we note that $\max_k |x_k + y_k| \leq \max_k [\, |x_k| + |y_k| \,] \leq \max_k |x_k| + \max_k |y_k|$.
Those are all finite-dimensional examples.

Example. The vector space of continuous functions on the interval $[a, b]$ with the norm $\|f\|_2 = \sqrt{\int_a^b |f(t)|^2 \, dt}$.
What if we replaced “continuous” with “square integrable?”
(see footnote p. 32 regarding equivalence classes of functions that are equal a.e.)

Example. $C[a, b]$. The space of continuous functions on the interval $[a, b]$ with the norm $\|f\|_\infty = \max_{a \leq t \leq b} |f(t)|$.
Later we will see a reason why this “max” norm can be preferable.
Example. Space of real $m \times n$ matrices, with $\|A\| = \sqrt{\mathrm{tr}(A A^T)}$.
Problems involving spaces of matrices arise fairly frequently in systems analysis.

$\mathbb{R}^2$ and $\mathbb{R}$
For $\mathbb{R}^2$ there are three particularly important norms. For $x = (a, b)$ consider:
• $\sqrt{a^2 + b^2}$: Euclidean or $E^2$
• $|a| + |b|$: $\ell_1$ or city block
• $\max\{|a|, |b|\}$: $\ell_\infty$
Are there others? Yes, e.g., weighted versions: $\sqrt{w_1 a^2 + w_2 b^2}$, $w_k > 0$.

What about $\mathbb{R}$? One norm is $\|x\| = |x|$. Are there others?
If $g(x)$ is a norm on $\mathbb{R}$, then one condition is that $g(\alpha x) = |\alpha| \, g(x)$, $\forall \alpha \in \mathbb{R}$.
Thus, choosing $x = 1$ we must have $g(\alpha) = |\alpha| \, g(1)$, where $g(1) > 0$ since $1 \neq 0$.
Thus all norms on $\mathbb{R}$ have the form $\|x\| = w |x|$ for some $w > 0$.
All such norms are equivalent, in a sense to be defined in HW, so there is no point in considering anything but the case $w = 1$. So we usually just speak of $\mathbb{R}$, rather than $E^1 = (\mathbb{R}, \|\cdot\| = |\cdot|)$.
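The norms in these examples are easy to evaluate for concrete vectors and matrices. The following is a minimal sketch using only the standard library; the function names (`norm_1`, `norm_2`, `norm_inf`, `frobenius`) are illustrative, and `frobenius` relies on the identity that $\mathrm{tr}(A A^T)$ equals the sum of the squared entries of $A$.

```python
import math


def norm_1(x):
    """City-block norm: sum of absolute values."""
    return sum(abs(v) for v in x)


def norm_2(x):
    """Euclidean norm: square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in x))


def norm_inf(x):
    """Max norm: largest absolute value."""
    return max(abs(v) for v in x)


def frobenius(A):
    """sqrt(tr(A A^T)) for a matrix given as a list of rows."""
    return math.sqrt(sum(v * v for row in A for v in row))


X = (3, -4)
NORMS = (norm_1(X), norm_2(X), norm_inf(X))
```

For $x = (3, -4)$ the three values are ordered $\|x\|_\infty \leq \|x\|_2 \leq \|x\|_1$, an ordering that holds for every vector in $\mathbb{R}^n$.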

³ Probably it is possible to define normed spaces over other fields for which one can define a suitable $|\cdot|$ function that satisfies: (i) $|\alpha| = 0 \iff \alpha = 0$, (ii) $|\alpha\beta| = |\alpha| \, |\beta|$, (iii) $|\alpha + \beta| \leq |\alpha| + |\beta|$. But would such spaces be useful?
⁴ The term normed space should suffice because the presence of a vector space is implied in the axioms of a norm, since vector addition, scalar multiplication, and the zero vector are all part of those axioms.

2.10 The $\ell_p$ and $L_p$ spaces
These normed spaces are ubiquitous in the engineering literature, and represent perhaps the most important examples of (infinite dimensional) normed vector spaces.
Definition. Let $p \in [1, \infty)$. The space $\ell_p$ consists of all sequences of scalars $a_1, a_2, \ldots$ for which $\sum_{i=1}^{\infty} |a_i|^p < \infty$. The norm of a vector $x \in \ell_p$ is defined by
$\|x\|_p = \left( \sum_{i=1}^{\infty} |a_i|^p \right)^{1/p} < \infty$.

Definition. The space $\ell_\infty$ consists of bounded sequences. The norm of a vector $x = \{a_i\}$ in $\ell_\infty$ is defined by

$\|x\|_\infty = \sup_i |a_i|$.

Before referring to these spaces as normed spaces, we must confirm that each of the functionals defined above is indeed a norm.

• $\|\alpha x\|_p = |\alpha| \, \|x\|_p$ is trivial to verify
• $\|x\|_p > 0$ unless $x = \mathbf{0}$.
• What about the triangle inequality? The triangle inequality holds due to the Minkowski inequality, which in turn follows from the Hölder inequality.

Theorem. (The Hölder inequality)
If $p \in [1, \infty)$ and $q \in [1, \infty)$ satisfy $1/p + 1/q = 1$, and if $x = (a_1, a_2, \ldots) \in \ell_p$ and $y = (b_1, b_2, \ldots) \in \ell_q$, then

$\sum_{i=1}^{\infty} |a_i b_i| \leq \|x\|_p \, \|y\|_q$.

Moreover, equality holds iff either of the following two conditions hold:
• either $x$ or $y$ equal $\mathbf{0}$, or
• both $x$ and $y$ are nonzero and $\left( \frac{|a_i|}{\|x\|_p} \right)^{1/q} = \left( \frac{|b_i|}{\|y\|_q} \right)^{1/p}$, $\forall i$.
See errata: Luenberger’s equality condition omits the cases where $x$ or $y$ equal $\mathbf{0}$.

Proof. See text. The special case p = q = 2 is particularly important, and is known as the Cauchy-Schwarz inequality:

$\sum_i |a_i b_i| \leq \sqrt{\sum_i |a_i|^2} \, \sqrt{\sum_i |b_i|^2}$ and hence $\left| \sum_i a_i b_i \right| \leq \sqrt{\sum_i |a_i|^2} \, \sqrt{\sum_i |b_i|^2}$.
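The Hölder inequality, and its Cauchy-Schwarz special case, can be checked numerically for finite sequences (which are in every $\ell_p$ once padded with zeros). This is a minimal sketch; `p_norm` and `holder_gap` are hypothetical helper names, and the code assumes $p > 1$ so that the conjugate exponent $q = p/(p-1)$ is finite.

```python
def p_norm(x, p):
    """The l_p norm of a finite sequence x, for 1 <= p < infinity."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def holder_gap(x, y, p):
    """Return ||x||_p * ||y||_q - sum_i |a_i b_i|, which should be >= 0."""
    q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1 (needs p > 1)
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    return p_norm(x, p) * p_norm(y, q) - lhs


# Arbitrary example sequences (assumptions for illustration).
X = [1, -2, 3]
Y = [4, 0, -1]
GAP_P2 = holder_gap(X, Y, 2)  # Cauchy-Schwarz case
GAP_P3 = holder_gap(X, Y, 3)  # p = 3, q = 3/2
# Equality case for p = q = 2: y is a scalar multiple of x.
GAP_EQUAL = holder_gap(X, [2 * v for v in X], 2)
```

The gap is strictly positive for the first two cases and vanishes (up to rounding) when $y = 2x$, consistent with the equality condition for $p = q = 2$.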

Theorem. (The Minkowski inequality)

If $x, y \in \ell_p$, $p \in [1, \infty]$, then so is $x + y$, and $\|x + y\|_p \leq \|x\|_p + \|y\|_p$.
For $p \in [1, \infty)$, equality holds iff $x$ and $y$ are linearly dependent.
See errata: Luenberger’s condition for equality is incomplete since it omits the cases where $x$ or $y$ equal $\mathbf{0}$.

Proof. See text. (It uses the Hölder inequality.)

At one point in the proof, we have the inequality $\left( \sum_{i=1}^{n} |a_i + b_i|^p \right)^{1/p} \leq \left( \sum_{i=1}^{n} |a_i|^p \right)^{1/p} + \left( \sum_{i=1}^{n} |b_i|^p \right)^{1/p}$.
Taking the limit as $n \to \infty$ on the RHS (which increases monotonically with $n$) yields $\left( \sum_{i=1}^{n} |a_i + b_i|^p \right)^{1/p} \leq \|x\|_p + \|y\|_p$.
Then taking the limit as $n \to \infty$ on the LHS yields $\|x + y\|_p \leq \|x\|_p + \|y\|_p$.
This technique of first working with one side and then the other arises frequently.

The trouble with Riemann integration
The $L_p$ space, defined below, involves Lebesgue integration. Why? Because the set of Riemann integrable functions is not as general as we would like. For example, consider the function $f(t) = \begin{cases} 1, & t \in \mathbb{Q} \\ 0, & \text{otherwise.} \end{cases}$ This function is not Riemann integrable, but it is Lebesgue integrable. (Its Lebesgue integral is zero since the function is nonzero only on a set of measure zero.)
When a function is Riemann integrable, its Lebesgue integral will equal its Riemann integral.
Considering Lebesgue integrable functions will be more than general enough for any engineering problems.

The space $L_p$
Definition. Let $p \in [1, \infty)$. The space $L_p[a, b]$ consists of all real-valued measurable functions $x$ on the interval $[a, b]$ for which $|x(t)|^p$ is Lebesgue integrable. The norm on this space is defined as
$\|x\|_p = \left( \int_a^b |x(t)|^p \, dt \right)^{1/p}$.

However, there is a subtle caveat with this space. There are functions that are nonzero on a set of measure zero for which $\|x\|_p = 0$. So to consider this space to be a normed space, we must treat functions that are equal almost everywhere (a.e.) as being equivalent. In other words, a vector in $L_p$ is really an equivalence class of measurable functions that are all equal almost everywhere.
This makes the definition of $L_\infty[a, b]$ a bit more subtle than $\ell_\infty$. In particular, we cannot define $\|x\|_\infty$ to be simply the obvious choice “$\sup_{a \leq t \leq b} |x(t)|$,” because that value will be different for different functions in the equivalence class. Instead, we define
$\|x\|_\infty = \text{essential supremum of } |x(t)| = \inf_{y(t) = x(t) \text{ a.e.}} \sup |y(t)| = \operatorname{ess\,sup} |x(t)|$.

See [4, p. 33] for an example.
If $x \in L_p[a, b]$ and $y \in L_q[a, b]$ with $p, q > 1$ and $1/p + 1/q = 1$, then the Hölder inequality is $\int_a^b |x(t) y(t)| \, dt \leq \|x\|_p \, \|y\|_q$.
Similarly, $\|x + y\|_p \leq \|x\|_p + \|y\|_p$ is the Minkowski inequality for $L_p$. The proofs are very similar.

Topological concepts
According to the Oxford English dictionary, topology is the “branch of mathematics concerned with those properties of figures and surfaces which are independent of size and shape and are unchanged by any deformation that is continuous, neither creating new points nor fusing existing ones; hence, with those of abstract spaces that are invariant under homeomorphic transformations.”
Anyway, we have been systematically generalizing notions from geometry to more general settings, but one important concept we have yet to generalize is that of distance.
The concept of distance is central to any discussion of optimization, since many optimization problems involve finding the element (within a set) that is the closest to some given point outside that set.
Definition. If $x$ and $y$ are two points in a normed space $(\mathcal{X}, \|\cdot\|)$, then the distance⁵ between $x$ and $y$ is defined by
$d(x, y) = \|x - y\|$.

More generally, if $S$ is a nonempty subset of $\mathcal{X}$, then the distance between a point $x \in \mathcal{X}$ and the set $S$ is defined by
$d(x, S) = \inf_{y \in S} \|x - y\|$.

Example. In $\mathbb{R}$, $d(1, (3, 4]) = 2$, for the usual norm $\|x\| = |x|$. (Picture)
Note that there is no “closest point,” a complication that will require careful attention later.
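The missing “closest point” in this example can be seen numerically. The sketch below samples the interval $(3, 4]$ at the points $3 + 1/j$ (a sampling choice made here for illustration; the helper names are hypothetical) and shows that the distance from 1 to the samples approaches 2 without ever reaching it.

```python
def distance_to_samples(x, samples):
    """Distance from the real number x to a finite set of real samples."""
    return min(abs(x - y) for y in samples)


def samples_of_half_open_interval(k):
    """Points 3 + 1/j, j = 1..k, all of which lie in the interval (3, 4]."""
    return [3 + 1 / j for j in range(1, k + 1)]


# The distance over the first k samples is 2 + 1/k: above 2, decreasing toward 2.
DISTANCES = [
    distance_to_samples(1, samples_of_half_open_interval(k))
    for k in (1, 10, 1000)
]
```

The infimum 2 would be attained only at the point 3, which is not in the set.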

Example. In $\mathbb{R}^2$, what is the distance between $(0, 0)$ and $\{ (a, b) : 2 \leq a \leq 3, \; 1 \leq b \leq 2 \}$? (Trick question) ??

How would you define the distance between two sets $S$ and $T$? ??

Properties of distance $d$. (We will use these later.)
Lemma 2.1 $|d(x, S) - d(y, S)| \leq \|x - y\|$. (Picture)
Proof. $d(x, S) = \inf_{z \in S} \|x - z\| = \inf_{z \in S} \|(x - y) - (z - y)\| \geq \inf_{z \in S} \|z - y\| - \|x - y\| = d(y, S) - \|x - y\|$. Now rearrange. □
Lemma 2.2 For any two subsets $U$ and $V$ of a normed space $\mathcal{X}$, $d(U, V) \leq d(x, U) + d(x, V)$, $\forall x \in \mathcal{X}$. (Picture)
Proof. $d(U, V) = \inf_{u \in U} d(u, V) \leq \inf_{u \in U} [\, d(x, V) + \|x - u\| \,] = d(x, V) + d(x, U)$. □
Lemma. Let $(\mathcal{X}, \|\cdot\|)$ be a normed space, $M \subseteq \mathcal{X}$ a subspace, and $x \in \mathcal{X}$ a point. Then $\forall \alpha \in \mathcal{F}$: $d(\alpha x, M) = |\alpha| \, d(x, M)$.
Proof. Trivial for $\alpha = 0$.
For $\alpha \neq 0$: $d(\alpha x, M) = \inf_{y \in M} \|\alpha x - y\| = \inf_{z \in M} \|\alpha x - \alpha z\| = |\alpha| \inf_{z \in M} \|x - z\| = |\alpha| \, d(x, M)$. (Picture) □

Preview of optimization
Many problems in optimization can be expressed as follows.
Given $x$ in a normed space $(\mathcal{X}, \|\cdot\|)$, and a subset $S$ in $\mathcal{X}$, find “the” vector $s \in S$ that minimizes $\|x - s\|$.
What questions should we ask about such problems?
• Is there any best $s$? I.e., does there exist $s^\star \in S$ s.t. $\|x - s^\star\| = d(x, S)$?
• If so, is $s^\star$ unique?
• How is $s^\star$ characterized? (Better yet would be an explicit formula for $s^\star$.)
One purpose of some of the material that follows is to answer these questions. (But be prepared for a challenge since some aspects of these questions remain open problems!) We will return to these questions after introducing Hilbert spaces in Ch. 3.
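Lemma 2.1 can be spot-checked for a finite set, where the infimum in $d(x, S)$ is a minimum. This is a minimal sketch, assuming $\mathbb{R}^2$ with the Euclidean norm; the set `S`, the test points, and the function names are illustrative assumptions.

```python
import math


def dist(x, y):
    """Euclidean distance between two points of R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


def dist_to_set(x, S):
    """d(x, S) for a finite nonempty set S (the infimum is attained)."""
    return min(dist(x, s) for s in S)


def lemma_2_1_holds(x, y, S, tol=1e-12):
    """Check |d(x, S) - d(y, S)| <= ||x - y|| up to rounding tolerance."""
    return abs(dist_to_set(x, S) - dist_to_set(y, S)) <= dist(x, y) + tol


# An arbitrary finite set and a few test points (assumptions).
S = [(0.0, 0.0), (2.0, 1.0), (-1.0, 3.0)]
TEST_POINTS = [(1.0, 1.0), (-2.0, 0.5), (3.0, 3.0), (0.0, 0.0)]
ALL_HOLD = all(lemma_2_1_holds(x, y, S) for x in TEST_POINTS for y in TEST_POINTS)
```

The lemma says that $d(\cdot, S)$ changes by no more than the distance moved, i.e., it is a Lipschitz function with constant 1.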

⁵ This type of distance is a special case of the more general concept of a distance function that is called a metric if it satisfies:
• $d(x, x) = 0$
• $d(x, y) > 0$ if $x \neq y$
• $d(x, y) = d(y, x)$
• $d(x, y) \leq d(x, z) + d(z, y)$ (triangle inequality)
Exercise. Verify that a metric is a more general concept than a norm, i.e., find a metric $d(x, y)$ that is not of the form $d(x, y) = \|x - y\|$ for any norm $\|\cdot\|$. Hint: a metric need not satisfy the scaling property. ??

2.7 Open sets
Looking ahead: in optimization we often use iterative algorithms that generate sequences, and we need to know when a sequence converges to a limit that belongs to the same set as the sequence itself. Such questions are related to whether a set is closed or not, so we need concepts of open and closed sets.

Definition. The open sphere centered at $x$ of radius $\varepsilon$ is defined by $S(x, \varepsilon) \triangleq \{ y \in \mathcal{X} : \|x - y\| < \varepsilon \}$.
Also called the open ball, which is perhaps more descriptive since a sphere is often considered to be a surface, whereas a ball might be more often considered to be a solid.
Example. Consider $(\mathcal{X} = C[0, 1], \|\cdot\|_\infty)$ and $x = 0$.
Then $S(x, 2) = \{ \text{continuous functions } f : [0, 1] \to \mathbb{R} : \max_{t \in [0,1]} |f(t)| < 2 \}$. (Picture of continuous functions between $\pm 2$.)
Clearly this definition of a “sphere” generalizes the 3-space notion of a sphere!
Now we need a definition for “inside,” another generalization.
Definition. Let $P$ be a subset of a normed space $(\mathcal{X}, \|\cdot\|)$. A point $p \in P$ is called an interior point of $P$ iff $\exists \varepsilon > 0$ s.t. $S(p, \varepsilon) \subseteq P$. (Picture in 2D of a generic set.)
Example. Let $\mathcal{X} = \{ (a, b) \in \mathbb{R}^2 : a = b \}$ (the line at 45° in the plane) with $\|x\| = |a|$ if $x = (a, a)$.
Is $x = (1, 1)$ an interior point of $\mathcal{X}$, i.e., does there exist an open ball that is a subset of $\mathcal{X}$? ??
Suppose we change to $\mathcal{X} = \mathbb{R}^2$ with the usual Euclidean norm and $S = \{ (a, b) \in \mathbb{R}^2 : a = b \}$.
Is $x = (1, 1)$ an interior point of $S$? ??
Which points within an open sphere are interior points? ??
Definition. The interior of a set $P$, denoted $\mathrm{Int}(P)$, is the collection of all interior points of $P$.

$\mathrm{Int}(P) \triangleq \{ p \in P : p \text{ is an interior point of } P \} = \{ p \in P : \exists \varepsilon > 0 \text{ s.t. } S(p, \varepsilon) \subseteq P \}$.

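Definition. A set $P$ is called open iff $P = \mathrm{Int}(P)$.
Remark. $\mathrm{Int}(P) \subseteq P$, so to show a set $P$ is open we must show $P \subseteq \mathrm{Int}(P)$, i.e., show that $x \in P \implies \exists \varepsilon > 0$ s.t. $S(x, \varepsilon) \subset P$.
Examples of open sets.
• $\emptyset$ and $\mathcal{X}$ are open sets
• open spheres are open (exercise)
• $\mathrm{Int}(\mathrm{Int}(P)) = \mathrm{Int}(P)$
• $\mathrm{Int}(P)$ is open
• $P \times Q$ is open in $\mathcal{X} \times \mathcal{Y}$ if $P$ is open in $\mathcal{X}$ and $Q$ is open in $\mathcal{Y}$ (problem 2.16 p. 44)
• $(1, 2) \times (0, 1) \subset E^2$ is open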

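Proposition. $x \in \mathrm{Int}(P) \iff d(x, \tilde{P}) > 0$, where $\tilde{P}$ denotes the complement of $P$. (Picture)

Proof. $x \in \mathrm{Int}(P) \implies \exists \varepsilon > 0$ s.t. $S(x, \varepsilon) \subseteq P$, so $d(x, \tilde{P}) = \inf_{y \in \tilde{P}} \|x - y\| = \inf_{y \notin P} \|x - y\| \geq \varepsilon$, since $y \notin P \implies y \notin S(x, \varepsilon) \implies \|y - x\| \geq \varepsilon$.
We can prove the converse by proving its contrapositive⁶.
Suppose $x \notin \mathrm{Int}(P)$. Then $\forall \varepsilon > 0$, $S(x, \varepsilon) \cap \tilde{P} \neq \emptyset$, so $\exists y \in \tilde{P}$ s.t. $\|x - y\| < \varepsilon$. Thus $d(x, \tilde{P}) = \inf_{y \in \tilde{P}} \|x - y\| = 0$.
Alternative direct proof. $\delta = d(x, \tilde{P}) > 0 \implies \|z - x\| \geq \delta > 0$, $\forall z \in \tilde{P}$. But $y \in S(x, \delta) \implies \|y - x\| < \delta \implies y \notin \tilde{P} \implies y \in P$. So $S(x, \delta) \subseteq P$ and hence $x \in \mathrm{Int}(P)$. □
Fact. Similarly: $x \in \mathrm{Int}(\tilde{P}) \iff d(x, P) > 0$, since the complement of $\tilde{P}$ is $P$.

⁶ We have shown that $A \implies B$ and we want to show the converse $B \implies A$, the contrapositive of which is: not $A \implies$ not $B$.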

Closed sets
Definition. A point $x \in \mathcal{X}$ is called a closure point (or a cluster point or an adherent point) of a set $P$ iff $\forall \varepsilon > 0$, $\exists p \in P$ s.t. $\|x - p\| < \varepsilon$.

Fact. If $P$ is nonempty, then $x$ is a closure point of $P$ iff $d(x, P) = 0$.
Definition. The collection of all closure points of a set $P$ is called the closure of $P$ and is denoted $\overline{P}$.
Example. $P = \{1, 1/2, 1/3, \ldots, 1/10\} \implies \overline{P} = P$.
Example. $P = [0, 1) \implies \overline{P} = [0, 1]$.
Example. $P = \{1/n : n \in \mathbb{N}\} \implies \overline{P} = P \cup \{0\}$.
Example. $\overline{\mathbb{Q}} = \mathbb{R}$.
Properties.
• $\overline{P} = \{ x \in \mathcal{X} : d(x, P) = 0 \}$ if $P$ is nonempty. (If $P$ is empty, then so is $\overline{P}$.)
• $P \subseteq \overline{P}$
• $\overline{\overline{P}} = \overline{P}$
Definition. A set $P$ is called closed iff $P = \overline{P}$.
Remark. Since $P \subseteq \overline{P}$, to show a set $P$ is closed we must show $\overline{P} \subseteq P$, i.e., if $p$ is any closure point of $P$, then $p \in P$.
Examples of closed sets.
• $\emptyset$ is closed (!)
• $\mathcal{X}$ is closed (!)
• $\{x\}$ is closed
• $\{ y : d(x, y) \leq 1 \}$ is closed

Proposition.
• $P$ is open $\implies \tilde{P}$ is closed
• $P$ is closed $\implies \tilde{P}$ is open
Proof. Suppose $P$ is open. $x \in P \implies \exists \varepsilon > 0$ s.t. $S(x, \varepsilon) \subseteq P \implies d(x, \tilde{P}) \geq \varepsilon > 0 \implies x \notin \overline{\tilde{P}}$.
Thus the contrapositive is $x \in \overline{\tilde{P}} \implies x \notin P$, i.e., $\overline{\tilde{P}} \subseteq \tilde{P}$. Hence $\tilde{P}$ is closed.
Suppose $P$ is closed. $x \in \tilde{P} \implies x \notin P = \overline{P} \implies \varepsilon \triangleq d(x, P) > 0$. Thus $S(x, \varepsilon) \subset \tilde{P}$ so $x \in \mathrm{Int}(\tilde{P})$.
Thus $\tilde{P} \subseteq \mathrm{Int}(\tilde{P})$ and we conclude $\tilde{P}$ is open. (See Luenberger errata for p. 25.) □
Remark. We could also state the proposition simply as “$P$ is open $\iff \tilde{P}$ is closed.”

Proposition.
• The intersection of a finite number of open sets is open.
• The union of a finite number of closed sets is closed.
• The intersection of an arbitrary number of closed sets is closed.
• The union of an arbitrary number of open sets is open.
Proof. Exercise.
Example. Consider the open intervals $S_n = (-1/n, 1/n) \subset \mathbb{R}$, $n = 1, 2, \ldots$.
Then $\bigcup_{n=1}^{\infty} S_n = (-1, 1)$, which is open. And $\bigcap_{n=1}^{N} S_n = S_N$, which is open. What is $\bigcap_{n=1}^{\infty} S_n$? ??

Now we consider openness and closedness in the context of the special types of sets we considered previously: convex sets and subspaces.

Proposition. If $C$ is a convex set in a normed space, then $\bar{C}$ and $\mathrm{Int}(C)$ are convex.

Proof. See text.

Open and closed subspaces

Fact. If $M$ is a subspace of $\mathcal{X}$ and $M$ is open, then $M = \mathcal{X}$ [3, p. 229].
Fact. If $M$ is a subspace of $\mathcal{X}$, then $\bar{M}$ is a subspace of $\mathcal{X}$ [3, p. 229].

In general, subspaces are not necessarily closed in infinite dimensional normed spaces.

Example. The subspace $M$ of continuous functions in $\mathcal{L}_2$ is not closed, because the rect function is in $\mathcal{L}_2$ and is a closure point of $M$ but is not in $M$. (Picture)

Example. The subspace $M$ of finite-length sequences in $\ell_2$ is not closed, because the infinite geometric sequence $(1, 1/2, 1/4, 1/8, \ldots)$ is in $\ell_2$, and is a closure point of $M$, but is not in $M$.

Caution! We have now seen one of our first examples of a situation where our intuition from $E^3$ does not generalize to general normed spaces! We have been thinking of subspaces as being like planes or hyperplanes. Yet even though planes are closed subsets of $E^3$, subspaces are not necessarily closed in general.

Remark. We had to use infinite-dimensional examples above because (as we will show soon) finite-dimensional subspaces are closed.

Closed sets and distance

Lemma. In a normed space, let $U$ and $V$ be disjoint subsets. If $V$ is closed, then $d(u,V) > 0$, $\forall u \in U$.
Proof. Pick any $u \in U$ and suppose $d(u,V) = 0$. Then $u \in \bar{V} = V$ since $V$ is closed.
But $u \in V$ contradicts the assumption that $U$ and $V$ are disjoint. □

Lemma 2.3 In a normed space, $d(y,S) > 0 \implies d(y, \bar{S}) > 0$.
??

Bounded sets

Definition. A set $S$ in a normed space $(\mathcal{X}, \|\cdot\|)$ is called bounded iff $\exists M < \infty$ such that $\|x\| \le M$, $\forall x \in S$.

Are closed sets bounded? ??

Bounded sets can be open, or closed, or neither.

Sequences

Definition. A sequence is a set of vectors indexed by the natural numbers $\mathbb{N}$, e.g., $\{x_n\} = \{x_n : n \in \mathbb{N}\}$. Formally, a sequence is a mapping from $\mathbb{N}$ to some vector space $\mathcal{X}$.

Definition. If $\{x_n\}$ is a sequence, and $n_1 < n_2 < \cdots$, then $\{x_{n_i}\}$ is called a subsequence of $\{x_n\}$.

Notation. $\{x_n\} \in \mathcal{X}$ iff $x_n \in \mathcal{X}$, $\forall n$.

2.8 Convergence of sequences

Definition. In a normed space, we say a sequence of vectors $\{x_n\}$ converges to a vector $x$ iff the sequence of real numbers $\|x_n - x\|$ converges to zero, in which case we write $x_n \to x$ or $\lim_{n \to \infty} x_n = x$.
In other words, considering the definition of convergence of real numbers:

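\[
\begin{aligned}
x_n \to x &\iff \|x_n - x\| \to 0 \\
&\iff \forall \varepsilon > 0, \ \exists N_\varepsilon < \infty \text{ s.t. } n \ge N_\varepsilon \implies \|x_n - x\| < \varepsilon \\
&\iff \forall \varepsilon > 0, \ \exists N_\varepsilon < \infty \text{ s.t. } n \ge N_\varepsilon \implies x_n \in S(x, \varepsilon).
\end{aligned}
\]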

Example. (Infinite dimensional, of course.) Consider $\mathcal{L}_2[0, 1]$ with $x(t) = 1$ and $x_n(t) = 1 + t/n$.
Then $x_n \to x$ because
\[
\|x - x_n\|_2 = \sqrt{\int_0^1 |x(t) - x_n(t)|^2 \, dt} = \sqrt{\int_0^1 |t/n|^2 \, dt} = \frac{1}{\sqrt{3}\, n} \to 0 \text{ as } n \to \infty.
\]

Example. (Infinite dimensional, of course.) Consider $\ell_p$ with $x = (1, 1/2, 1/3, \ldots)$ and $x_n = (1, 1/2, \ldots, 1/n, 0, 0, \ldots)$.
Then $\|x - x_n\|_p = \left( \sum_{k=n+1}^{\infty} 1/k^p \right)^{1/p}$. That $p$-series is convergent (and hence its tail goes to zero as $n \to \infty$) for $p > 1$, but diverges for $p = 1$. So we can say "$x_n \to x$" in $\ell_2$, for example, but we cannot say that in $\ell_1$!
So not all norms are equivalent in general, unlike in finite-dimensional vector spaces.
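The two examples above can be checked numerically. The following is only a rough sketch, not part of the original notes: the helper names `l2_distance_example` and `lp_tail_norm` and the truncation lengths are illustrative assumptions. The $\mathcal{L}_2$ distance is approximated by a midpoint Riemann sum, and the $\ell_p$ tail by a finite partial sum, so the divergence for $p = 1$ can only show up as a truncated tail that keeps growing as more terms are included.

```python
import math

def l2_distance_example(n, num_points=100_000):
    """Midpoint-rule approximation of ||x - x_n||_2 on [0, 1]
    for x(t) = 1 and x_n(t) = 1 + t/n (so x_n(t) - x(t) = t/n)."""
    h = 1.0 / num_points
    total = 0.0
    for i in range(num_points):
        t = (i + 0.5) * h
        total += (t / n) ** 2 * h
    return math.sqrt(total)

def lp_tail_norm(n, p, num_terms):
    """Truncated tail (sum_{k=n+1}^{n+num_terms} 1/k^p)^(1/p)."""
    s = sum(1.0 / k ** p for k in range(n + 1, n + 1 + num_terms))
    return s ** (1.0 / p)

# L2 example: numerical value next to the closed form 1 / (sqrt(3) n).
for n in (1, 10, 100):
    print(n, l2_distance_example(n), 1.0 / (math.sqrt(3.0) * n))

# l_p example: for p = 2 the truncated tail barely changes with more terms;
# for p = 1 it keeps growing (roughly like a logarithm).
for p in (2, 1):
    print(p, lp_tail_norm(100, p, 10_000), lp_tail_norm(100, p, 100_000))
```

The truncation is the main design compromise here: a finite sum can suggest, but never prove, divergence.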

Proposition. The limit of a convergent sequence is unique.

Proof. Suppose $x_n \to x$ and $x_n \to y$. Then $\forall n$:
\[
\|x - y\| = \|x - x_n + x_n - y\| \le \|x - x_n\| + \|x_n - y\|,
\]
which $\to 0$ as $n \to \infty$. Thus $\|x - y\| = 0$ so $x - y = 0$ so $x = y$. □

Proposition. $x_n \to x \implies \|x_n\| \to \|x\|$.
Proof. As shown earlier, $|\, \|x\| - \|y\| \,| \le \|x - y\|$. Thus $|\, \|x_n\| - \|x\| \,| \le \|x_n - x\| \to 0$ if $x_n \to x$, so $\|x_n\| \to \|x\|$. □

Proposition. If $x_n \to x$, then $\sup_n \|x_n\| < \infty$, i.e., convergent sequences are bounded.

Proof. $\exists N_1$ s.t. $\|x_n - x\| < 1$, $\forall n > N_1$. Thus $\|x_n\| = \|x_n - x + x\| \le \|x_n - x\| + \|x\| < 1 + \|x\|$ for such $n$, so $\sup_n \|x_n\| \le \max\{1 + \|x\|, \|x_1\|, \ldots, \|x_{N_1}\|\} < \infty$. (Picture) □

The concept of a closed set is closely connected with limits of sequences.

Definition. If $S$ is a subset of $\mathcal{X}$, we call $x \in \mathcal{X}$ a limit point of $S$ iff there is a sequence of elements of $S$ that converges to $x$.

Proposition. $x$ is a limit point of $S$ iff $x \in \bar{S}$. Thus limit points and cluster points are equivalent!
Proof. If $x$ is a limit point, then $\exists \{x_n\} \in S$ s.t. $x_n \to x$. Thus $d(x, \{x_n\}_{n=1}^{\infty}) = 0$, so $d(x,S) = 0$ (see footnote 7). Hence $x \in \bar{S}$.
Suppose $x \in \bar{S}$. Then $\forall \varepsilon > 0$, $\exists y \in S$ s.t. $\|x - y\| < \varepsilon$. Choose $\varepsilon = 1/n$ and identify $x_n$ with the corresponding $y$. Since $\|x_n - x\| < 1/n$, we see $x_n \to x$, so $x$ is a limit point of $S$. □

Corollary. A set is closed iff it contains its limit points.

Summary
• distance
• open sphere
• interior point & interior
• open set = its interior
• closure point: $d(x,P) = 0$, & closure
• closed set = its closure
• convergence of sequences
• limit point of a set (exists convergent sequence to it) = cluster point
• closed sets contain their limit points

Series (A special kind of sequence.)

Definition. An infinite series of the form $\sum_{i=1}^{\infty} x_i$ is said to converge to $x$ in a normed space iff the sequence of partial sums $s_n = \sum_{i=1}^{n} x_i$ converges to $x$, in which case we write $x = \sum_{i=1}^{\infty} x_i$ as shorthand for $x = \lim_{n \to \infty} \sum_{i=1}^{n} x_i$.

Caution. Sometimes as a lazy shorthand one might write $\sum_{i=1}^{\infty} 1/i = \infty$. Saying that $\sum_{i=1}^{\infty} \alpha_i = \infty$ is shorthand for saying
\[
\forall M < \infty, \ \exists N \in \mathbb{N} \text{ s.t. } n > N \implies \sum_{i=1}^{n} \alpha_i > M.
\]
Since $\infty \notin \mathbb{R}$, it is truly an abuse of notation to write $\sum_{n=1}^{\infty} 1/n = \infty$, because in the above definition of convergence of an infinite series, it is implied that the limit $x$ is an element of $\mathcal{X}$.

Example. In $\ell_2$, consider $x = (1, 1/2, 1/3, \ldots)$ and $x_n = (0, \ldots, 0, 1/n, 0, \ldots)$ where $1/n$ is in the $n$th element.

Then $\sum_{i=1}^{\infty} x_i$ converges to $x$. (But it does not converge in $\ell_1$.)

Footnote 7. $A \subseteq B \implies d(x,B) \le d(x,A)$. (Picture)

2.11 Cauchy sequences

Often in convergence analysis it is easier to examine $\|x_n - x_m\|$ than $\|x_n - x_\star\|$, especially if we have not yet shown that a limit $x_\star$ even exists.

Definition. A sequence $\{x_n\}$ in a normed space is called a Cauchy sequence iff
\[
\|x_n - x_m\| \to 0 \text{ as } n, m \to \infty.
\]
In other words, $\forall \varepsilon > 0$, $\exists N > 0$ s.t. $n, m > N \implies \|x_n - x_m\| < \varepsilon$.

Fact. In a normed space, every convergent sequence is a Cauchy sequence, since if $x_n \to x$, then
\[
\|x_n - x_m\| \le \|x_n - x\| + \|x_m - x\| \to 0.
\]

Note the repeated use of the triangle inequality in proofs. In a normed space, it is about all we have to work with!

The converse of the above fact is not true in general: Cauchy sequences need not converge in general normed spaces. This is a bit unfortunate since the converse is what we would usually like to use!

Example. Consider the sequence of functions $f_n(t) = 1 - e^{-nt}$ in the normed space
\[
\left( \mathcal{X} = \{f : [0, 1] \to \mathbb{R} : f \text{ continuous}\}, \ \|f\| = \|f\|_1 = \int_0^1 |f(t)| \, dt \right). \quad \text{(Picture)}
\]
$\|f_n - f_m\| = \int_0^1 |e^{-nt} - e^{-mt}| \, dt = |(1 - e^{-n})/n - (1 - e^{-m})/m| \to 0$ as $n, m \to \infty$. So $\{f_n\}$ is Cauchy.
But the pointwise "apparent limit" of $f_n$ is a step function (0 at $t = 0$ and 1 for $t > 0$), which is not continuous, and hence not an element of $\mathcal{X}$. (Caution: that step differs from the constant function 1 only at $t = 0$, and $\|f_n - 1\|_1 = (1 - e^{-n})/n \to 0$, so in the $\|\cdot\|_1$ norm this particular sequence does converge to an element of $\mathcal{X}$. Continuous ramps that approach a step at an interior point of $[0,1]$ give a Cauchy sequence with no limit in $\mathcal{X}$.)

How can we "fix" this problem?
• Broaden the vector space $\mathcal{X}$ to $\mathcal{L}_1[0, 1]$ (i.e., drop the continuity requirement).
• Replace the norm $\|f\|_1$ with $\|f\|_\infty$.
One can show for this example that $\|f_n - f_m\|_\infty$ does not approach zero as "$n, m \to \infty$."
(For any fixed $m$, $\|f_n - f_m\|_\infty \to 1$ as $n \to \infty$.)

This "incompleteness" can also arise even in subspaces of $\ell_2$.

Example. $\mathcal{X}$ = set of infinite sequences of reals with only finitely many nonzero terms:
\[
\mathcal{X} = \{(a_1, \ldots, a_k, 0, 0, \ldots) : k \ge 1, \ a_i \in \mathbb{R}\}
\]
with the $\ell_p$ norm $\|x\|_p$. Now consider the sequence with elements $x_n = (1, 1/2, 1/4, \ldots, 1/2^n, 0, 0, \ldots)$. Since
\[
\|x_n - x_m\|_p = \left( \sum_{k=\min\{n+1,\, m+1\}}^{\max\{n,\, m\}} \left( \frac{1}{2^k} \right)^p \right)^{1/p} \le \left( \frac{1}{2^{p \min\{m,n\}}} \, \frac{1}{1 - 1/2^p} \right)^{1/p} \to 0 \text{ as } n, m \to \infty,
\]
$\{x_n\}$ is Cauchy. But there is no $x \in \mathcal{X}$ to which $\{x_n\}$ converges.

In some sense, the problem is that $\mathcal{X}$ has "holes" in it. Broadening the space is the natural solution.
Is there an alternate norm that would make this vector space $\mathcal{X}$ complete? (I doubt it, can you show otherwise?)
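For the first example above, $f_n(t) = 1 - e^{-nt}$, the contrast between the two norms can be seen numerically. This is a minimal sketch under stated assumptions, not part of the original notes: the function names `l1_distance` and `sup_distance` are hypothetical, the 1-norm uses the closed form quoted in the example, and the sup norm is only approximated on a uniform grid.

```python
import math

def l1_distance(n, m):
    """Closed form of ||f_n - f_m||_1 on [0, 1] for f_n(t) = 1 - exp(-n t)."""
    return abs((1 - math.exp(-n)) / n - (1 - math.exp(-m)) / m)

def sup_distance(n, m, num_points=100_001):
    """Grid approximation of max over t in [0, 1] of |f_n(t) - f_m(t)|."""
    best = 0.0
    for i in range(num_points):
        t = i / (num_points - 1)
        best = max(best, abs(math.exp(-n * t) - math.exp(-m * t)))
    return best

# The 1-norm distance shrinks as n and m grow, but the sup-norm distance
# between f_n and f_{2n} stays near 1/4 (set u = exp(-n t): max of u - u^2).
for n in (10, 100, 1000):
    print(n, 2 * n, l1_distance(n, 2 * n), sup_distance(n, 2 * n))

# For fixed m = 1 and very large n, the sup-norm distance approaches 1.
print(sup_distance(10**6, 1))
```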

2.11 Banach spaces

Often we prefer to use normed spaces that are free of the pathologies described in the preceding examples, so we name them.

Definition. A normed space $(\mathcal{X}, \|\cdot\|)$ is called complete iff every Cauchy sequence in $\mathcal{X}$ has a limit in $\mathcal{X}$ (and hence converges).

Definition. A complete normed space is called a Banach space.

Examples of Banach spaces
• $(\mathbb{R}^n, \|\cdot\|_p)$, for $p \in [1, \infty]$
• $C[a, b]$ with $\|f\|_\infty = \sup_{t \in [a,b]} |f(t)|$
• $\ell_p$ for $p \in [1, \infty]$ with usual $\|x\|_p$
• $\mathcal{L}_p[a, b]$ for $p \in [1, \infty]$ with usual $\|f\|_p$

Is finding a suitable Banach space usually difficult? Fortunately not, since every normed space has a completion.

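Theorem. If $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$ is a normed space, then $\exists (\mathcal{Y}, \|\cdot\|_{\mathcal{Y}})$ that is a Banach space (called a completion of $\mathcal{X}$) with
• $\mathcal{X}$ is a subspace of $\mathcal{Y}$
• $\bar{\mathcal{X}} = \mathcal{Y}$
• $x \in \mathcal{X} \implies \|x\|_{\mathcal{X}} = \|x\|_{\mathcal{Y}}$.
Moreover, $\mathcal{Y}$ is essentially unique (i.e., all the completions of $\mathcal{X}$ are isometric with one another [3, p. 121]).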

Example. $\mathcal{L}_1[0, 1]$ is the completion of $\left( \mathcal{X} = \{f : [0, 1] \to \mathbb{R} : f \text{ continuous}\}, \ \|f\| = \|f\|_1 = \int_0^1 |f(t)| \, dt \right)$.

Example. $\mathcal{L}_p[a, b]$ is the completion of $R_p[a, b]$ for $p \in [1, \infty]$.

Showing such results for $\mathcal{L}_p$ requires measure theory.
Showing completeness of $C[a, b]$ uses the following fact that is a key result from Math 451.

Theorem. $(\mathbb{R}, |\cdot|)$ is complete, i.e., if $\{\alpha_n\} \subset \mathbb{R}$ is Cauchy, then $\exists \alpha \in \mathbb{R}$ s.t. $\alpha_n \to \alpha$.

There exist bounded functions that are Lebesgue integrable but not Riemann integrable, e.g., the indicator function of the rationals [3, p. 561]. And in fact $R_1[a, b]$ is not complete [3, p. 564].

2.12 Complete subsets

Definition. A subset $S$ of a normed space is complete iff every Cauchy sequence from the subset converges to a limit within $S$.

Example. Any finite set is complete. (The next theorems give more interesting cases.)

Theorem. In a normed space, any complete subset is closed.

Proof. If $P \subseteq \mathcal{X}$ is complete, then every Cauchy sequence in $P$ has a limit in $P$. Thus all convergent sequences in $P$ (which are of course Cauchy) have limits in $P$, i.e., $P$ contains its limit points. Thus $P$ is closed. □

Theorem. In a Banach space, a subset is complete if and only if it is closed.

Proof. The "only if" direction follows from the preceding theorem.
Suppose $P$ is closed and $\{x_n\}$ is a Cauchy sequence in $P$ (and hence in $\mathcal{X}$). Since $\mathcal{X}$ is Banach, $\exists x \in \mathcal{X}$ such that $x_n \to x$. Hence $x$ is a limit point of $P$. Since $P$ is closed, $x \in P$. Thus $P$ is complete. □

Exercise. In a normed space, intersections of arbitrarily many complete subsets are complete.
What about unions of complete subsets? ?? ??

There is an asymmetry between the two preceding theorems. We assumed a normed space to show complete set $\implies$ closed set, but we assumed a Banach space to show closed set $\implies$ complete set.

Is the stronger assumption (of a Banach space) truly necessary to show closed set $\implies$ complete set? ??

Ok, but in some (incomplete) $\mathcal{X}$ is there a closed proper subset that is incomplete? ??

In the preceding theorem we assumed we are already working in a Banach space. What if we only have an ordinary normed space? Are there complete subsets of it? The next theorem shows that the answer can be yes, at least for finite-dimensional subspaces.

Recall that if $M$ is a subspace of a normed space $\mathcal{X}$, then $d(\alpha x, M) = |\alpha| \, d(x, M)$ for all $x \in \mathcal{X}$ and $\alpha \in \mathcal{F}$.

Theorem. Any finite-dimensional subspace of a normed space is complete (and hence closed).

Proof. By induction on the dimension of the subspace.

For a 1D subspace: $M = \{\alpha e : \alpha \in \mathcal{F}\}$, where $e \in \mathcal{X}$ is a fixed basis vector.
Any Cauchy sequence $\{x_n\}$ in this subspace has elements of the form $x_n = \alpha_n e$. Since $\|x_n - x_m\| = |\alpha_n - \alpha_m| \, \|e\|$, the sequence of reals $\{\alpha_n\}$ is also Cauchy and hence convergent (to some limit $\alpha$) since $E^1 = (\mathbb{R}, |\cdot|)$ is complete. Thus $\|x_n - \alpha e\| = |\alpha_n - \alpha| \, \|e\| \to 0$, so $x_n \to x = \alpha e \in M$. Hence any 1D subspace of a normed space is complete.

Now assume the theorem is true for subspaces of dimension $N - 1$.
Suppose $M$ is an $N$-dimensional subspace of a normed space $\mathcal{X}$. We must show that $M$ is complete.
Let $\{e_1, \ldots, e_N\}$ denote a basis for $M$. For $k = 1, \ldots, N$, define $M_k = [e_1, \ldots, e_{k-1}, e_{k+1}, \ldots, e_N]$ and $\delta_k = d(e_k, M_k)$.

Claim: $\delta_k > 0$. Suppose $\delta_k = 0$. Then since $M_k$ is an $(N-1)$-dimensional space it is complete by assumption and hence closed, so $d(e_k, M_k) = 0$ would imply $e_k \in M_k$, contradicting the linear independence of $\{e_k\}$.

Suppose $\{x_n\} \in M$ is Cauchy. Each $x_n$ has a (unique) representation $x_n = \sum_{j=1}^{N} \lambda_j^n e_j$, so for each $k \in \{1, \ldots, N\}$:
\[
\begin{aligned}
\|x_n - x_m\| &= \Big\| \sum_{j=1}^{N} (\lambda_j^n - \lambda_j^m) e_j \Big\|
= \Big\| (\lambda_k^n - \lambda_k^m) e_k - \underbrace{\sum_{j \ne k} (\lambda_j^m - \lambda_j^n) e_j}_{\in M_k} \Big\|
\ge d\big( (\lambda_k^n - \lambda_k^m) e_k, M_k \big) \\
&= |\lambda_k^n - \lambda_k^m| \, d(e_k, M_k) = |\lambda_k^n - \lambda_k^m| \, \delta_k .
\end{aligned}
\]
Since $\|x_n - x_m\| \to 0$ for a Cauchy sequence, we see that $|\lambda_k^n - \lambda_k^m| \to 0$ since $\delta_k > 0$. Thus $\{\lambda_k^n\}$ is Cauchy and hence (by the completeness of $\mathbb{R}$) converges to some limit $\lambda_k$. Defining $x = \sum_{k=1}^{N} \lambda_k e_k \in M$ we find that $x_n \to x$ since
\[
\|x_n - x\| = \Big\| \sum_{k=1}^{N} (\lambda_k^n - \lambda_k) e_k \Big\| \le \sum_{k=1}^{N} |\lambda_k^n - \lambda_k| \, \|e_k\| \to 0.
\]
Thus any such Cauchy sequence in $M$ converges to a limit in $M$, so $M$ is complete. □

Note this proof's immediate use of the basis to represent any $x \in M$. Proofs about finite-dimensional spaces often start this way.

Corollary.
• Any finite-dimensional subspace of a normed space is closed.
• Any finite-dimensional normed space is complete (and closed).

Exercise. [3, p. 218]. A (Hamel) basis for a Banach space is either finite or uncountably infinite.

2.9 Transformations

In systems theory, we analyze many systems and mathematical operations that transform one signal into another.

Definition. Let $\mathcal{X}$ and $\mathcal{Y}$ be two vector spaces (over a common field $\mathcal{F}$), and let $D$ be a subset of $\mathcal{X}$. A rule "$T$" that assigns a single element $y \in \mathcal{Y}$ to each $x \in D$ is called a transformation from $\mathcal{X}$ to $\mathcal{Y}$ with domain $D$. We write $y = T(x)$ and $T : \mathcal{X} \to \mathcal{Y}$ or $T : D \to \mathcal{Y}$.

See p. 27 or c1 review notes for one-to-one and onto.

Definition. A transformation from a vector space $\mathcal{X}$ (over a scalar field $\mathcal{F}$) into $\mathcal{F}$ is called a functional on $\mathcal{X}$.

Example. A norm is a functional.

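Linear transformations

Definition. A transformation $T : \mathcal{X} \to \mathcal{Y}$ (where $\mathcal{X}$ and $\mathcal{Y}$ are vector spaces over a common field) is called linear iff
\[
T(\alpha x_1 + \beta x_2) = \alpha T(x_1) + \beta T(x_2), \quad \forall x_1, x_2 \in \mathcal{X}, \ \forall \alpha, \beta \in \mathcal{F}.
\]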

Definition. If $T$ is a linear transformation from $\mathcal{X}$ into itself, then we say $T$ is a linear operator.
However, the terminology distinguishing linear transformations from linear operators is not universal, and the two terms are often used interchangeably.

Example. Let $\mathcal{F} = \mathbb{R}$ and let $\mathcal{X} = \mathcal{Y}$ be the space of continuous functions on $[0, 1]$.
Define the linear transformation $T$ by: $F = T(f)$ iff $F(t) = \int_0^t f(\tau) \, d\tau$. Integration (with suitable limits) is a linear transformation.

Simple fact for linear transformations:
• $T(0) = 0$. Proof. $T(0) = T(0 \cdot 0) = 0 \, T(0) = 0$. This is called the "zero in, zero out" property in linear systems theory.

Caution! (From Linear Systems by T. Kailath.) By induction it follows that $T\left( \sum_{i=1}^{n} \alpha_i x_i \right) = \sum_{i=1}^{n} \alpha_i T(x_i)$ for any finite $n$, but the above does not imply in general that linearity holds for infinite summations or integrals. Further assumptions about "smoothness" or "regularity" or "continuity" of $T$ are needed for that. This point is always glossed over in introductory signals and systems courses, where infinite sums (and integrals) are routinely "passed through" linear systems according to the superposition property, with no attempt to verify the validity of such exchanges.

Continuity

To define such continuity, we restrict attention to normed spaces. Note that a transformation can be continuous for one pair of normed spaces but discontinuous for another pair, e.g., [3, p. 63, 65].

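Definition. A transformation $T$ from a normed space $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$ into a normed space $(\mathcal{Y}, \|\cdot\|_{\mathcal{Y}})$ is called continuous at $x_0 \in \mathcal{X}$ iff
\[
\forall \varepsilon > 0, \ \exists \delta = \delta(x_0, \varepsilon) > 0 \text{ s.t. } \|x - x_0\|_{\mathcal{X}} < \delta \implies \|T(x) - T(x_0)\|_{\mathcal{Y}} < \varepsilon.
\]

Definition. If $T$ is continuous at all $x_0 \in \mathcal{X}$, then we simply call $T$ continuous.

Example. $f : \mathcal{X} \to \mathbb{R}$ defined by $f(x) = d(x,S)$ is continuous for any set $S$ in a normed space $\mathcal{X}$, since $|d(x,S) - d(y,S)| \le \|x - y\|$.

Definition. A transformation $T$ from a normed space $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$ into a normed space $(\mathcal{Y}, \|\cdot\|_{\mathcal{Y}})$ is called uniformly continuous iff
\[
\forall \varepsilon > 0, \ \exists \delta = \delta(\varepsilon) > 0 \text{ s.t. } \|x - z\|_{\mathcal{X}} < \delta \implies \|T(x) - T(z)\|_{\mathcal{Y}} < \varepsilon.
\]
Here $\delta$ depends only on $\varepsilon$. Sometimes $\delta(\varepsilon)$ is called the modulus of continuity of $T$.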
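The inequality $|d(x,S) - d(y,S)| \le \|x - y\|$ behind the distance-function example above can be spot-checked numerically. The sketch below is illustrative only and makes several assumptions that are not in the notes: the space is $\mathbb{R}^2$ with the Euclidean norm, $S$ is a random finite set, and the helper name `dist_to_set` is hypothetical. A random test cannot prove the inequality; it only fails to find a counterexample.

```python
import math
import random

def dist_to_set(x, S):
    """d(x, S) = min over s in S of the Euclidean distance ||x - s||,
    for a finite, nonempty set S of points in the plane."""
    return min(math.dist(x, s) for s in S)

random.seed(1)
S = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]
pairs = [((random.uniform(-3, 3), random.uniform(-3, 3)),
          (random.uniform(-3, 3), random.uniform(-3, 3)))
         for _ in range(1000)]

# Largest observed value of |d(x,S) - d(y,S)| - ||x - y||.
# If the inequality holds, this is never positive (up to rounding).
worst = max(abs(dist_to_set(x, S) - dist_to_set(y, S)) - math.dist(x, y)
            for x, y in pairs)
print(worst)
```

Because the same bound holds with $\delta(\varepsilon) = \varepsilon$ regardless of the point, this example is in fact uniformly continuous.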

Example. Is $\|\cdot\|$ uniformly continuous? ??

We often want to exchange transformations and limits; the following proposition shows that continuity is the key condition.

Proposition. A transformation $T$ from a normed space $(\mathcal{X}, \|\cdot\|_{\mathcal{X}})$ into a normed space $(\mathcal{Y}, \|\cdot\|_{\mathcal{Y}})$ is continuous at $x \in \mathcal{X}$ iff
\[
x_n \to x \implies T(x_n) \to T(x) \ \text{(for any such sequence } \{x_n\}\text{)}, \quad \text{i.e.,} \quad \lim_{n \to \infty} T(x_n) = T\left( \lim_{n \to \infty} x_n \right).
\]
Proof. See text.

Exercise. A linear transformation from a finite-dimensional normed space to any normed space is continuous.
Thus any "counter-example" showing a linear transformation that is not continuous will be infinite dimensional, e.g., the integrator system example [3, p. 63].

Example. Consider $\mathcal{X} = \mathcal{Y} = \mathcal{L}_2[\mathbb{R}]$ and the linear operator $T$ corresponding to the integral transformation: $y(t) = \int_{-\infty}^{t} x(\tau) \, d\tau$.
We show that $T$ is not continuous at $x_0 = 0$. (And in fact is discontinuous everywhere!)
Consider the specific signals:
(Figure: plots of $x(t)$, with amplitude $b$, and $y(t)$, with peak value $ab$, each supported on $[-a, a]$, where $b = (\delta/2)^{3/2}$ and $a = 2/\delta$.)

Then $\|x - 0\| = \sqrt{\int |x(t)|^2 \, dt} = \sqrt{2 a b^2} = \delta/\sqrt{2} < \delta$ but $\|T(x) - T(0)\| = \|y - 0\| = \sqrt{\tfrac{2}{3} b^2 a^3} = \sqrt{2/3}$.

An example of a continuous (linear) operator in an infinite dimensional vector space is the discrete-time convolution operator.

Example. Consider $\mathcal{X} = \mathcal{Y} = \ell_\infty$ and the transformation $y = T(x) \iff y_n = \sum_{k=-\infty}^{\infty} h_{n-k} x_k$, where $h \in \ell_1$.
Note that $h \in \ell_1$ corresponds to BIBO stability!

Exercise. Show that $\|T(x)\|_\infty \le \|h\|_1 \|x\|_\infty$. (This also ensures that $T(x)$ is well defined.)

Thus, since $T$ is linear, if $x, z \in \ell_\infty$ then $\|T(x) - T(z)\|_\infty = \|T(x - z)\|_\infty \le \|h\|_1 \|x - z\|_\infty$, so $T$ is (uniformly) continuous, with $\delta(\varepsilon) = \varepsilon / \|h\|_1$. So BIBO LTI systems are uniformly continuous, and thus for such systems we can freely exchange limits and sums. (And in fact we would need to make such exchanges to derive convolution properties rigorously.)
We will talk much more about such bounded linear operators in Ch. 6.
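The bound $\|T(x)\|_\infty \le \|h\|_1 \|x\|_\infty$ from the exercise above can be illustrated with finite-length sequences, viewed as elements of $\ell_\infty$ and $\ell_1$ that are zero outside their support. This is a minimal sketch, not a proof; the helper names (`convolve`, `norm_1`, `norm_inf`), the geometric impulse response, and the random inputs are all assumptions made for illustration.

```python
import random

def convolve(h, x):
    """Full linear convolution of two finite sequences
    (both treated as zero outside their listed samples)."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            y[i + j] += hi * xj
    return y

def norm_1(v):
    return sum(abs(a) for a in v)

def norm_inf(v):
    return max(abs(a) for a in v)

random.seed(0)
h = [0.5 ** k for k in range(20)]            # absolutely summable impulse response
x = [random.uniform(-1, 1) for _ in range(200)]
z = [random.uniform(-1, 1) for _ in range(200)]

y = convolve(h, x)
bound = norm_1(h) * norm_inf(x)
print(norm_inf(y), bound)                    # output peak vs. the bound

# Linearity turns the same bound into a Lipschitz estimate for T(x) - T(z).
diff = [a - b for a, b in zip(x, z)]
lipschitz_lhs = norm_inf(convolve(h, diff))
lipschitz_rhs = norm_1(h) * norm_inf(diff)
print(lipschitz_lhs, lipschitz_rhs)
```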

2.13 Compactness

Optimization is about maximizing a functional over some set, or more precisely, usually about finding the maximizer (within some set) of a functional. When does a functional achieve its maximum? We wish that the answer were "if the set is closed and bounded," but unfortunately that is incorrect in general. The concept of a compact set helps answer this question in general.

Definition. A subset $K$ of a normed space $(\mathcal{X}, \|\cdot\|)$ is called compact or sequentially compact iff every sequence $\{x_n\}$ in $K$ has some subsequence $\{x_{n_i}\}$ that converges to a limit $x \in K$.
(This is not quite the definition of compactness typically used first in a real analysis course, but it is more convenient for our purposes. One can first define a compact set in terms of set coverings, and then prove that a set is compact if and only if it is sequentially compact, e.g., [2, p. 63].)

As usual, having defined a new concept, we now attempt to relate it to previously defined concepts.

Can a cone (other than $\{0\}$) be compact? ??

Exercise. In a normed space, arbitrary intersections of compact subsets are compact.

Exercise. What about unions? ?? ??

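Lemma. If $x_n \to x$ and $\{x_{n_i}\}$ is a subsequence of $\{x_n\}$, then $\lim_{i \to \infty} x_{n_i} = x$.
Proof. Pick any $\varepsilon > 0$. $x_n \to x \implies \exists N$ s.t. $n > N \implies \|x_n - x\| < \varepsilon$.
Choose $I$ such that $i > I \implies n_i > N$. Then $i > I \implies \|x_{n_i} - x\| < \varepsilon$. □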

Proposition. A compact subset $K$ of a normed space $(\mathcal{X}, \|\cdot\|)$ is complete, closed, (and bounded).
Proof.
• Claim: $K$ is complete, because Cauchy sequences will have limits in a compact set $K$.
Let $\{x_n\}$ be any Cauchy sequence in $K$; then $\forall \varepsilon > 0$, $\exists M$ s.t. $n, m > M \implies \|x_n - x_m\| < \varepsilon/2$.
Since $K$ is compact, $\exists \{n_i\}$ s.t. $x_{n_i} \to x \in K$, i.e., $\forall \varepsilon > 0$, $\exists I$ s.t. $i > I \implies \|x_{n_i} - x\| < \varepsilon/2$.
For any $\varepsilon > 0$, let $N = \max\{M, n_I\}$.
For $n > N$ and $i > I$ we have $\|x_n - x\| \le \|x_n - x_{n_i}\| + \|x_{n_i} - x\| < \varepsilon/2 + \varepsilon/2 = \varepsilon$. Thus $x_n \to x$.
• To show $K$ is closed, we show that $x_n \to x \implies x \in K$ for $\{x_n\} \in K$.
$\{x_n\} \in K \implies \exists x_{n_i} \to y \in K$. But $x_n \to x \implies x_{n_i} \to x$, so $x = y \in K$. Thus $K$ is closed.
• Before we show boundedness, we need to prove the Weierstrass theorem! □

So is the reverse true? Are closed and bounded sets compact?
The following theorem shows that the answer is "yes" in finite-dimensional normed spaces (but not in general).

Theorem. In a normed space $(\mathcal{X}, \|\cdot\|)$, the following are equivalent.
(a) $\mathcal{X}$ is finite dimensional.
(b) Every closed and bounded subset is compact.
(c) The closed unit ball $B_1(0) = \{x \in \mathcal{X} : \|x\| \le 1\}$ is compact [3, p. 269].
Proof.
(a) $\implies$ (b) is the Heine-Borel theorem (see Math 451...)
(b) $\implies$ (c) is obvious
(c) $\implies$ (a) will be a homework problem

This theorem illustrates why infinite-dimensional spaces are more challenging than finite-dimensional spaces. The fact that every closed and bounded subset is compact can be very useful in finite dimensional problems, and we do not have this tool in general infinite-dimensional cases.

By having a notion of compactness, we can make statements (such as the Weierstrass theorem below) that apply to both infinite and finite-dimensional spaces, whereas those statements would not hold if we merely assumed that the sets were closed and bounded.

Analogy: we write sup when we think max. Here, we write compact when we think "closed and bounded."

Summary

In any normed space:
• compact $\implies$ complete $\implies$ closed
• compact $\implies$ closed and bounded
In a Banach space: closed $\implies$ complete
In a finite-dimensional normed space: closed and bounded $\implies$ compact

Examples

As has been / will be shown, any compact set is closed, bounded, and complete, and the reverse is true in finite-dimensional spaces. But what about infinite-dimensional spaces?

Example. Consider any of the Banach spaces $\mathcal{X} = \ell_p$, for $p \in [1, \infty]$, and the subset $B = \{x \in \mathcal{X} : \|x\| \le 1\}$. This is a closed set, and hence it is complete (since $\mathcal{X}$ is complete). Furthermore, $B$ is clearly bounded.
We now proceed to exhibit a sequence $\{x_n\}$ in $B$ that has no convergent subsequences.
Let $x_n$ denote the sequence in $\ell_p$ whose $n$th element is unity and all other elements are zero. Clearly $\{x_n\} \in B$.
Yet $\|x_n - x_m\|_p = 2^{1/p} \ge 1$ for $n \ne m$ (with the value 1 when $p = \infty$), so it is impossible for this $\{x_n\}$ to have any convergent subsequences, since any such convergent subsequence would need to be Cauchy, which cannot occur when $\|x_n - x_m\|_p \ge 1$, $\forall n \ne m$.
So the set $B$ in $\ell_p$ is closed, bounded, and complete, but not compact.
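The pairwise distances in the unit-ball example can be computed directly on truncated versions of the unit sequences. The sketch below is illustrative only: the names `unit_vector` and `lp_distance` and the truncation length are assumptions, and truncation is harmless here because each pair of sequences differs in only two coordinates. The distance between distinct unit sequences is $2^{1/p}$ for finite $p$ and 1 for $p = \infty$, so it never drops below 1 and no subsequence can be Cauchy.

```python
import math

def unit_vector(n, length):
    """Sequence with a 1 in position n (1-based) and zeros elsewhere,
    truncated to `length` terms."""
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]

def lp_distance(u, v, p):
    """||u - v||_p for equal-length finite sequences; p may be math.inf."""
    diffs = [abs(a - b) for a, b in zip(u, v)]
    if p == math.inf:
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

length = 8
vectors = [unit_vector(n, length) for n in range(1, length + 1)]

for p in (1, 2, 4, math.inf):
    # Collect the distinct pairwise distances; there is only one value per p.
    distances = {lp_distance(vectors[i], vectors[j], p)
                 for i in range(length) for j in range(i + 1, length)}
    print(p, sorted(distances))
```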

One might begin to wonder then: are there any compact sets in infinite-dimensional spaces?

Example. Any finite set in an infinite-dimensional space is compact.
But that is not very interesting since the set $\{x_1, \ldots, x_n\}$ is just a closed and bounded subset of the finite-dimensional subspace $[x_1, \ldots, x_n]$, so of course it is compact.

Example. In any normed space $\mathcal{X}$, suppose $\{x_n\}$ converges to $x \in \mathcal{X}$. Then the set $\{x\} \cup \bigcup_n \{x_n\}$ is compact.

This still seems like a rather contrived and limited compact set.
Are there any interesting (e.g., nonempty) convex compact sets in infinite-dimensional spaces?
Conjecture: the following (convex) set is (?) compact:

\[
\left\{ x \in \ell_1 : \sum_{i=1}^{\infty} 2^i |x_i| \le 1 \right\}. \tag{2-1}
\]
Note: the following set is not compact
\[
\left\{ x \in \ell_1 : \|x\|_1 = \sum_{i=1}^{\infty} |x_i| \le 1 \right\},
\]
as shown above, so the question is whether adding the "$2^i$" part is enough of a change to make a compact set.
Extra credit (=30 homework points) to anyone who can show that (2-1) is or is not compact, or who can give a different example of a nontrivial convex, compact set in an infinite dimensional normed space.

Compactness, closedness, and distances

Lemma 2.4 In a normed space, closed subsets of compact subsets are themselves compact.
??

Lemma 2.5 In a normed space, any compact set contains the closures of all of its subsets.
??

Lemma 2.6 If $U$ and $V$ are compact, disjoint subsets of a normed space, then $d(U, V) > 0$. (Picture)

Proof. Suppose $d(U, V) = 0$. Then there exist $\{x_n\} \in U$ and $\{y_n\} \in V$ such that $\|x_n - y_n\| \to 0$.
Since $U$ is compact, there is a subsequence $\{x_{n_i}\}$ that converges to some $x \in U$.
Now $\|y_{n_i} - x\| \le \|y_{n_i} - x_{n_i}\| + \|x_{n_i} - x\| \to 0$ as $i \to \infty$, so $y_{n_i} \to x \in U$.
But since $V$ is compact, it is also closed, so the limit of $\{y_{n_i}\}$ must lie in $V$, contradicting the disjointedness of $U$ and $V$. □

Would it suffice for $U$ and $V$ to be closed? No.
Consider the sets $U = \{(x,y) \in \mathbb{R}^2 : y > 0, \ x \ge 1/y\}$ and $V = \{(x,y) \in \mathbb{R}^2 : y < 0, \ x \ge -1/y\}$. (Picture)
These sets are closed and disjoint, yet $d(U, V) = 0$: the points $(t, 1/t) \in U$ and $(t, -1/t) \in V$ are a distance $2/t$ apart, which vanishes as $t \to \infty$.

Can you find a 1D example? ??

Would it suffice for $U$ and $V$ to be closed and bounded?
(We need not look in finite-dimensional normed spaces for counter-examples since there all closed and bounded sets are compact.)

For a counter-example, we turn to the (incomplete) normed space $(\mathcal{X} = \{\text{finite-length real sequences}\}, \|\cdot\|_p)$, where $\|\cdot\|_p$ is the usual $\ell_p$ norm. Define $U = \cup_{n=1}^{\infty} \{x_n\}$ and $V = \cup_{n=1}^{\infty} \{y_n\}$, where $x_n = (1, 1/2, \ldots, 1/2^n, 0, 0, \ldots)$ and $y_n = x_n + (1/n, 0, 0, \ldots)$.
Then $U$ and $V$ are disjoint, closed (in this incomplete $\mathcal{X}$), and bounded (for any $p \in [1, \infty]$), yet $d(U, V) = 0$.
So it seems that we do need a stronger condition than "closed and bounded" in Lemma 2.6.

Would "complete and bounded" suffice? Or do we need compactness?
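The counter-example above is easy to tabulate. The following is a small sketch under stated assumptions (the function names `x_seq`, `y_seq`, and `lp_distance` are hypothetical, and only finitely many $n$ are examined): since $x_n$ and $y_n$ differ only in their first entry, $\|x_n - y_n\|_p = 1/n$ for every $p$, which forces $d(U, V) = 0$, while the first entry $1 + 1/n \ne 1$ keeps every $y_n$ out of $U$.

```python
import math
from itertools import zip_longest

def x_seq(n):
    """x_n = (1, 1/2, ..., 1/2^n); trailing zeros are left implicit."""
    return [0.5 ** k for k in range(n + 1)]

def y_seq(n):
    """y_n = x_n + (1/n, 0, 0, ...)."""
    y = x_seq(n)
    y[0] += 1.0 / n
    return y

def lp_distance(u, v, p):
    """||u - v||_p for finite-length sequences, padding the shorter with zeros."""
    diffs = [abs(a - b) for a, b in zip_longest(u, v, fillvalue=0.0)]
    if p == math.inf:
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

for n in (1, 10, 100, 1000):
    print(n, lp_distance(x_seq(n), y_seq(n), 2), 1.0 / n)

# No y_n can equal any x_m, because the first entry of every x_m is exactly 1.
print(all(y_seq(n)[0] != 1.0 for n in range(1, 1001)))
```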

Upper semicontinuous functions

Definition. A (real) functional $f$ defined on a normed space $(\mathcal{X}, \|\cdot\|)$ is called upper semicontinuous at $x \in \mathcal{X}$ iff
\[
\forall \varepsilon > 0, \ \exists \delta > 0 \text{ s.t. } \forall y \in \mathcal{X}, \ \|y - x\| < \delta \implies f(y) < f(x) + \varepsilon.
\]
We call $f$ lower semicontinuous at $x$ iff $-f$ is upper semicontinuous at $x$.

Fact. $f$ is continuous iff $f$ is both upper and lower semicontinuous.

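Equivalently, one can show that $f$ is u.s.c. at $x$ iff $f(x) \ge \limsup_{y \to x} f(y)$.
Here $S = \limsup_{y \to x} f(y)$ means that
$\forall \varepsilon > 0$, $\exists \delta > 0$ s.t. $\|y - x\| < \delta \implies f(y) < S + \varepsilon$, and
$\forall \varepsilon > 0$, $\forall \delta > 0$, $\exists y$ s.t. $\|y - x\| < \delta$ and $f(y) > S - \varepsilon$.
Loosely speaking, it is like a least upper bound "in the limit."

Example. Consider the following "almost piecewise continuous" real-valued function.
(Figure: plot of $f(x)$ versus $x$, with vertical axis ticks at 10 and 20 and horizontal axis ticks at 2, 4, and 6.)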

What is $\limsup_{y \to 2} f(y)$? We get the biggest limit approaching from the left, so $\limsup_{y \to 2} f(y) = 20$. So $f$ is u.s.c. at $x = 2$.
Similarly, $\limsup_{y \to 4} f(y) = 20$, so $f$ is not u.s.c. at $x = 4$. But $\limsup_{y \to 6} f(y) = 20$, so $f$ is u.s.c. at $x = 6$.
This function is u.s.c. everywhere except ?? and lower semicontinuous everywhere except ??

Does this function achieve a maximum on [3,5]? ?? What about on [1,3]? ?? How about on [0,1)? ??

Theorem. (Weierstrass) An upper semicontinuous (real) functional $f$ on a compact subset $K$ of a normed space $(\mathcal{X}, \|\cdot\|)$
(i) is bounded above on $K$, and (ii) achieves a maximum on $K$.
Proof. Let
\[
M = \sup_{x \in K} f(x) =
\begin{cases}
\inf\{g \in \mathbb{R} : g \ge f(x), \ \forall x \in K\}, & \text{if } f \text{ is bounded above on } K \\
\infty, & \text{otherwise.}
\end{cases}
\]
(At this point we do not know if $M$ is finite or not.)
By definition of supremum, $\exists$ a sequence $\{x_n\} \in K$ such that $f(x_n) \to M$. (This is true even if $M = \infty$.)
However, the definition of supremum alone does not ensure that $\{x_n\}$ converges.
Since $K$ is compact, $\exists$ a convergent subsequence $x_{n_i} \to x_\star \in K$, for some $x_\star$.
Since subsequences of convergent sequences have the same limit, $\lim_{i \to \infty} f(x_{n_i}) = M$, considering $\{f(x_n)\}$ as a sequence in $\mathbb{R}$.

Since $f$ is upper semicontinuous, $f(x_\star) \ge \limsup_{i \to \infty} f(x_{n_i}) = M$.
Since $f(x_\star)$ is real and hence finite, $f(x_\star) \ge M$ implies $M$ must be finite. $\therefore$ $f$ is bounded above on $K$.
On the other hand, by definition of supremum, since $x_\star \in K$, $M \ge f(x_\star)$.
So we conclude $f(x_\star) = M$, meaning that $x_\star$ achieves the maximum of $f$ on $K$. □

Corollary. A real-valued, continuous functional $f$ on a compact subset $K$ of a normed space $(\mathcal{X}, \|\cdot\|)$ achieves its maximum and minimum on $K$.

Example. On a normed space $(\mathcal{X}, \|\cdot\|)$, the function $f : \mathcal{X} \to \mathbb{R}$ defined by $f(x) = \|x\|$ is continuous.
Proof. Recall $|\, \|x\| - \|y\| \,| \le \|x - y\|$, so $x \to y \implies \|x - y\| \to 0 \implies |f(x) - f(y)| \to 0$.

An inf. dim. example like splines would be nice here...

Corollary (to Weierstrass theorem). A compact subset $K$ of a normed space is (complete, closed, and) bounded.
Proof. Since $\|\cdot\|$ is continuous, by the Weierstrass theorem $\exists y \in K$ s.t. $M \triangleq \|y\| = \sup_{x \in K} \|x\|$.
Hence $\|x\| \le M < \infty$, $\forall x \in K$, i.e., $K$ is bounded. □

2.14 Quotient Spaces skip

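2.15 Denseness

One last topological concept.

Definition. A subset $D$ of a normed space $(\mathcal{X}, \|\cdot\|)$ is called dense iff any of the following equivalent conditions hold.
• $\forall x \in \mathcal{X}$ and $\forall \varepsilon > 0$, $\exists y \in D$ s.t. $\|x - y\| < \varepsilon$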

Separability

Definition. A normed space $\mathcal{X}$ is called separable iff it contains a countable dense set, i.e., $D = \bigcup_{n=1}^{\infty} \{y_n\}$.

Example. Euclidean space $E^n$ is separable. The collection of vectors $x = (a_1, \ldots, a_n)$ with rational components is countable and dense in $E^n$.

Example. $\ell_p$ is separable for $p \in [1, \infty)$.
If $D_n = \{(r_1, \ldots, r_n, 0, 0, \ldots) : r_k \in \mathbb{Q}\}$, then $D = \cup_{n=1}^{\infty} D_n$ is dense in $\ell_p$.
See text for proof.

Example. $\mathcal{L}_p$ is separable for $p \in [1, \infty)$.
If $D_n = \left\{ \sum_{k=1}^{n} r_k 1_{I_k}(t) : r_k \in \mathbb{Q}, \ I_k = (a_k, b_k), \ a_k, b_k \in \mathbb{Q} \right\}$ for $n \in \mathbb{N}$, then $D = \cup_{n=1}^{\infty} D_n$ is dense in $\mathcal{L}_p$.
Alternatively, if $D_n = \{f : [a, b] \to \mathbb{R} : f \text{ is piecewise linear with } n \in \mathbb{N} \text{ breakpoints at } r_k \in \mathbb{Q} \text{ and } f(r_k) \in \mathbb{Q}\}$, then $D = \cup_{n=1}^{\infty} D_n$ is dense in $\mathcal{L}_p$.

Example. $C[a, b]$ is separable. The set of all polynomials with rational coefficients is countable and dense in $C[a, b]$.
If $D_n = \{r_0 + r_1 t + \cdots + r_n t^n : r_k \in \mathbb{Q}\}$, then $D = \cup_{n=1}^{\infty} D_n$ is dense in $C[a, b]$.

Example. $\ell_\infty$ and $\mathcal{L}_\infty$ are not separable (Problem 2.21).

Schauder basis

Definition. In a normed space, $\{x_n\} \in \mathcal{X}$ is a Schauder basis for $\mathcal{X}$ iff for each $x \in \mathcal{X}$, there exists a unique sequence $\{\lambda_n\}$ such that $x = \sum_{n=1}^{\infty} \lambda_n x_n$ [5] [2, p. 98].

Theorem. If a normed space has a Schauder basis, then it is separable [2, p. 100].

What about the converse? The famous Banach conjecture that every separable Banach space has a Schauder basis was shown to be incorrect by Enflo in 1973 [2, p. 100]. This could be considered surprising since separable, complete, normed spaces should be about as "nice as they come."

So is the concept of separability of limited use to us? No, thanks to the following key result.

Fact. A Hilbert space has a countable orthonormal basis iff it is separable [3, p. 314].
(Any countable orthonormal basis is a Schauder basis.)

Summary

Geometrical concepts and their generalizations.
point: $x \in \mathcal{X}$
line: $\{\alpha x : \alpha \in \mathbb{R}\}$
plane: $\{\sum_i \alpha_i x_i : \alpha_i \in \mathbb{R}\}$
cone: $x \in C \implies \alpha x \in C$ for $\alpha \ge 0$
length: $\|x\|$
sphere: $\{x \in \mathcal{X} : \|x\| < \varepsilon\}$
distance: $\|x - y\|$
coordinate system: basis

Hierarchy of spaces: vector space $\supset$ normed space $\supset$ Banach space

Some principal results
• A subset of a Banach space is complete iff it is closed.
• Any finite-dimensional subspace of a normed space is complete (and hence closed).
• A real-valued, continuous functional on a compact subset of a normed space achieves its maximum and minimum on that subset.
• Any closed and bounded subset of a finite dimensional normed space is compact.

Bibliography

[1] P. Enflo. A counterexample to the approximation problem in Banach spaces. Acta Math, 130:309–17, 1973.

[2] I. J. Maddox. Elements of functional analysis. Cambridge, 2 edition, 1988.

[3] A. W. Naylor and G. R. Sell. Linear operator theory in engineering and science. Springer-Verlag, New York, 2 edition, 1982.

[4] D. G. Luenberger. Optimization by vector space methods. Wiley, New York, 1969.

[5] J. Schauder. Zur Theorie stetiger Abbildungen in Funktionalräumen. Math. Zeitsch., 26:47–65, 1927.

[6] P. P. Vaidyanathan. Generalizations of the sampling theorem: Seven decades after Nyquist. IEEE Tr. Circ. Sys. I, Fundamental theory and applications, 48(9):1094–109, September 2001.

[7] F. Deutsch. The convexity of Chebyshev sets in Hilbert space. In A. Yanushauskas Th. M. Rassias, H. M. Srivastava, editor, Topics in polynomials of one and several variables and their applications, pages 143–50. World Sci. Publishing, River Edge, NJ, 1993.

[8] M. Jiang. On Johnson’s example of a nonconvex Chebyshev set. J. Approx. Theory, 74(2):152–8, August 1993.

[9] P. L. Combettes and H. J. Trussell. Method of successive projections for finding a common point of sets in metric spaces. J. Optim. Theory Appl., 67(3):487–507, December 1990.