<<

522

Bounded Linear Operators and the Definition of

Definition. Let V , W be normed vector spaces (both over R or over C). A linear transformation or linear T : V → W is bounded if there is a constant C such that

(1) kT xkW ≤ CkxkV for all x ∈ V . Remark: We use the linearity of T and the homogeneity of the in W to see that x T (x) kT (x)k T  = = W kxkV W kxkV W kxkV we see that T is bounded, satisfying (1), if and only if

sup kT (x)kW ≤ C. kxkV =1

Theorem. Let V , W be normed vector spaces and let T : V → W be a linear transformation. The following statements are equivalent. (i) T is a bounded linear transformation. (ii) T is continuous everwhere in V . (iii) T is continuous at 0 in V .

Proof. (i) =⇒ (ii). Let C as in the definition of bounded linear transfor- mation. By linearity of T we have

kT (v) − T (˜v)kW = kT (v − v˜)kW ≤ Ckv − v˜kV which implies (ii). (ii) =⇒ (iii) is trivial. (iii) =⇒ (i): If T is continuous at 0 there exists δ > 0 such that for all v ∈ V with kvk < δ we have kT vk < 1. Now let x ∈ V and x 6= 0. Then x x δ = δ/2 and thus T (δ ) < 1. 2kxkV V kxkV W But by the linearity of T and the homogeneity of the norm we get x T (x) δ 1 ≥ T (δ ) = δ = kT xkW kxkV W 2kxkV W 2kxkV and therefore kT xkW ≤ CkxkV with C = 2/δ. 

Notation: If T : V → W is linear one often writes T x for T (x).

Definition. We denote by L(V,W ) the set of all bounded linear trans- formations T : V → W . 1 2

L(V,W ) form a vector . S + T is the transformation with (S + T )(x) = S(x) + T (x) and cT is the operator x 7→ cT (x). On L(V,W ) we define the (depending on the norms on V and W ) by

kT vkW kT kL(V,W ) ≡ kT kop = sup . v6=0 kvkV

We can think of kT kL(V,W ) as the best constant for which (1) holds. Note that kT xkW ≤ kT kL(V,W )kxkV . Using the homogeneity of the W -norm we also can write

kT kL(V,W ) = sup kT xkW . kxkV =1

We use the k · kop notation if the choice of V , W and the norms are clear n m from the context. In the textbook, Rudin considers V = R , W = R with the standard Euclidean norms and simply writes kT k for the operator norm.

Lemma. Let V and W be normed spaces. If V is finite dimensional then all linear transformations from V to W are bounded. Pn Proof. Let v1, . . . , vn be a of V . Then for v = j=1 αjvj we have n n n X X X kT vkW = αjT vj ≤ |αj|kT vjkW ≤ kT vjkW max |αk|. W k=1,...,n j=1 j=1 j=1

The expression maxk=1,...,n |αk| defines a norm on V . Since all norms on V are equivalent, there is a constant C1 such that n X max |αj| ≤ C αjvj j=1,...,n V j=1 for all choices of α1, . . . , αn. Thus we get kT vkW ≤ CkvkV for all v ∈ V , Pn where the constant C is given by C = C1 j=1 kT vjk. 

n m Pn 2 1/2 Lemma. On R , R use the Euclidean norms kxk2 = ( j=1 |xj| ) , Pm 2 1/2 kyk2 = ( i=1 |yi| ) . Let A be an m × n and consider the linear n m operator T : R → R defined by T (x) = Ax. Let m n X X 2 1/2 kAkHS := ( |aij| ) . i=1 j=1 Then kT kop ≤ kAkHS.

2 Pm Pn 2 Proof. kAxk2 = i=1 | j=1 aijxj| . By the Cauchy-Schwarz inequality Pn 2 Pn 2 2 | j=1 aijxj| ≤ j=1 |aij| kxk2 and hence kAxk2 ≤ kAkHSkxk2 for all n x ∈ R . Thus kT kop ≤ kAkHS.  3

Remark. We often write A for both the matrix and the linear operator x → Ax. The expression kAkHS is a norm on the space of m × n matrices called the Hilbert-Schmidt norm of A. In many cases it is substantially larger then the operator norm (and so the estimate in the Lemma is rather inefficient). To illustrate this let D an n × n diagonal matrix (with entries λ1,...,λn on the diagonal and zeroes elsewhere. Then it is easy to verify the inequality

kDxk2 ≤ max |λj|kxk2 j=1,...,n and this is seen to be sharp by testing on the unit vectors x = ej. Thus kDk = max |λ | while kDk = (Pn |λ |2)1/2. In particular for op j=1,...,n j HS j=1 j √ the identity operator I we have kIkop = 1 but kIkHS = n.

Normed vector spaces and compleness. Let V be a normed space. Then V is a space with respect to the metric d(x, y) = kx−yk. Recall ∞ that a {xn}n=1 is a Cauchy-sequence in V if for every ε > 0 there is K such that kxn − xmkV < ε for m, n ≥ K.

Definition. A is a normed space which is complete (i.e. every converges).

Lemma: A finite dimensional normed space over R or C is complete. Proof. Let v1, . . . , vN be a basis of V and for x ∈ V let β1(x), . . . , βN (x) be the coordinates of f with respect to the basis v1, . . . , vN . We need to show that {xn} has a limit in V . For every n we can write

N X xn = βi(xn)vi. i=1 The expression

kxk∞ := max |βi(x)| i=1,...,N defines a norm on V which is equivalent to the given norm. In particular we have

|βi(xn) − βi(xm)| ≤ Ckxn − xmkV , i = 1,...,N.

This shows that for each i = 1,...,N the numbers βi(xn) form a Cauchy- PN sequence of scalars and thus converge to a βi. Define x = i=1 βivi (so that βi = βi(x)). Then

N n X  X kxn − xkV = k βi(xn) − βi vikV ≤ |βi(xn) − βi|kvikV i=1 i=1 which converges to 0 as n → ∞. Hence xn → x in V .  4

Theorem: Let V and W be normed spaces and assume that W is com- plete. Then the space L(V,W ) of bounded linear operators from V to W is a Banach space. Proof. We need to show that L(V,W ) is complete. Let Tn be a Cauchy sequence in L(V,W ) (with respect to the norm kT k = sup kT xk ). op kxkV =1 W Then for each x ∈ V , Tnx is Cauchy in W and by assumption Tnx converge to some vector T x. Check, using limiting arguments, that T : V → W is linear. Since Tn is Cauchy the sequence TN is bounded in L(V,W ) and thus there exists M with kTnkop ≤ M. We need to show that T is a . We have for all n and all x ∈ V

kT xkW ≤ kT x − TnxkW + kTnxkW ≤ kT x − TnxkW + kTnkopkxkW .

Letting n → ∞ we see that kT xkW ≤ M|xkV . Hence T ∈ L(V,W ) with kT kop ≤ M. Finally we need to show that Tn → T in L(V,W ). Given ε > 0 we choose N such that kTn − Tmkop < ε/2 for m, n ≥ N.

We prove the following Claim:

For each x ∈ V , x 6= 0, and n ≥ N we have kTnx − T xkW < εkxkV .

Let n ≥ N. Since Tmx → T x in W we can choose m(x) ≥ N such that kTm(x)x − T xkW ≤ ε/4kxkV . We estimate we estimate (with m ≥ N)

kT x − TnxkW ≤ kT x − Tm(x)x + Tm(x)x − Tn(x)kW

≤ kT x − Tm(x)xkW + kTm(x)x − TnxkW ε ≤ kxk + kT − T k kxk 4 V m(x) n op V ε ε ≤ kxk + kxk < εkxk . 4 V 2 V V

Hence the claim is established and thus kTn − T kop < ε for n ≥ N. This shows Tn → T with respect to the operator norm. 

Definition. (i) Let V be a normed over R. Bounded linear transformations from V to R are called bounded linear functionals. The ∗ space L(V, R) with the operator norm is called the to V , or V . (ii) Similarly if V be a over C we call the bounded linear transformations from V to R bounded linear functionals and refer to ∗ the space L(V, C) with the operator norm as the dual space V .

Since R and C are complete the above theorem about completeness of L(V,W ) immediately yields the

Corollary. Dual spaces of normed vector spaces are Banach spaces. 5

n n Exercise: The space L(R , R) of linear functionals ` : R → R can be n identified with R, since for any such ` there is a unique vector y ∈ R such Pn n n that `(x) = j=1 xjyj for all x ∈ R . We consider the p-norm on R , Pn p 1/p kxkp = ( j=1 |xj| ) , 1 ≤ p < ∞. Also set kxk∞ = maxj=1,...,n |xj|. The norm on R is just the | · |. (i) Prove

Pn j=1 xjyj sup = kyk∞ kxk16=0 kxk1 and

Pn j=1 xjyj sup = kyk1. kxk∞6=0 kxk∞ (ii) Prove that if 1 < p < ∞ then Pn xjyj j=1 0 p k`kop ≡ sup = kykp0 where p = . kxkp6=0 kxkp p − 1 Hint: H¨older’sinequality should be used here.

Finally we consider compositions of linear transformations. The operator norms are submultiplicative in the sense of the following lemma. Lemma.Let V1, V2, V3 be norned vector spaces and let T ∈ L(V1,V2) and S ∈ L(V2,V3). Define the composition ST : V1 → V3 by ST (x) = S(T (x)). Then ST ∈ L(V1,V3) and we have

kST kL(V1,V3) ≤ kSkL(V2,V3)kT kL(V1,V2) .

Proof: For x ∈ V1,

kST (x)kV3 ≤ kSkL(V2,V3)kT (x)kV2 ≤ kSkL(V2,V3)kT kL(V1,V2)kxkV1 .

This implies ST ∈ L(V1,V3) and the asserted submultiplicativity property. 

Differentiation

Definition. Let V and W be normed vector spaces. Let U ⊂ V be an open subset of V and let F : U → W be a . Let a ∈ U. We say that F is differentiable at a if there exists a bounded linear transformation 0 T : V → W (usually denoted by Dfa or by f (a)) such that kf(a + h) − f(a) − T hk lim W = 0 h→0 khkV

Uniqueness: For the of f at a to be well defined we must prove uniqueness of T . Suppose there are two linear bounded transformations T kf(a+h)−f(a)−T hkW kf(a+h)−f(a)−T˜ hkW and T˜ with → 0 and → 0 as khkV → khkV khkV 6

kT h−T˜ hkW 0. The implies that limh→0 = 0 . I.e. for every khkV ε > 0 there is a δ > 0 such that kT h − T˜ hkW ≤ εkhkV for all khk ≤ δ. However since T, T˜ are linear this inequality also holds with h replaced by ch for all scalars c, and thus we have

kT h − T˜ hkW ≤ εkhkV for all h ∈ V.

Hence kT − T˜kop ≤ ε for all ε > 0 which implies T = T˜. Thus if the derivative of f at a exists it is uniquely defined and we shall henceforth 0 denote it by Dfa or by f (a). Remark: Other terns for this derivative Dfa are “total derivative of f at a” or “Fr´echet derivative of f at a”.

Exercise: Prove that if f is differentiable at a then f is continuous at a.

Example: Let M(n, n) be the space of n × n matrices, with any norm. 2 Define F (A) = A . Then F is differentiable at any A and its derivative DFA is given by DFA(H) = AH + HA. For the proof observe that F (A + H) − F (A) = (A + H)2 − A2 = AH + 2 2 HA + H and check that limH→0 kH k/kHk = 0. This is accomplished by 2 showing that kH k ≤ CkHk2 and if one uses a submultiplicative norm on M(n, n) one gets this even with the constant 1.

2 2 Example: Let F : R → R be given by F (x1, x2) = x1 +sin(πx2). You are being asked to check from the definition whether F differentiable at (2, 1) 2 and to determine the derivative DF(2,1), as a linear transformation R → R. Observe that 2 2 F (2 + h1, 1 + h2) − F (2, 1) = (2 + h1) − 2 + (sin(π(1 + h2)) − sin π 2 2 = 4h1 + π cos(π)h2 + O(h1) + O(h2). 2 Hence DF(2,1) : R → R is the linear transformation given by

DF(2,1)(h) = 4h1 − πh2 .

Example: Let C([0, 1] be the space of continuous functions with the usual max-norm kfk∞ = maxx∈[0,1] |f(x)|. We define a function Γ: C([0, 1]) → C([0, 1]) by saying that for f ∈ C([0, 1]), Γ[f] is the function whose values at x ∈ [0, 1] are given by Z x Γ[f](x) = f(t)2 cos t dt . 0 Show that Γ is differentiable at every g ∈ C([0, 1]) and determine DΓg for all g ∈ C([0, 1]). 7

For h ∈ C([0, 1]) we compute Z x Z x Γ[g + h](x) − Γ[g](x) = (g(t) + h(t))2 cos t dt − (g(t))2 cos t dt 0 0 Z x Z x = h(t) 2g(t) cos t dt + (h(t))2 cos t dt 0 0 We claim that DΓg : C([0, 1]) → C([0, 1]) is the linear transformation which to each h ∈ C([0, 1]) assigns the function T [h] whose values at x are given by Z x T [h](x) = h(t) 2g(t) cos t dt . 0 We must show that T is a bounded linear transformation and that it is really the derivative of Γ at g. Clearly T is linear and maps C([0, 1]) to itself. It is bounded since Z x

max h(t) 2g(t) cos t dt ≤ 2 max |g(x)| max |h(x)|; 0≤x≤1 0 [0,1] x∈[0,1] this inequality shows that the operator norm of T is at most 2kgk∞. To verify that T is really the derivative of Γ at g we must analyze the error term and show that

max R x h(t)2 cos t dt x∈[0,1] 0 → 0 as khk∞ → 0. khk∞ 2 2 The numerator is bounded by maxt∈[0,1] |h(t) | = khk∞ and hence the dis- played expression tends to 0 as khk∞ → 0. Thus we have shown that Γ is differentiable at g and DΓg = T .

Example: Let C([0, 1] be the space of continuous functions with the usual max-norm kfk∞ = maxx∈[0,1] |f(x)|. Define Λ : C([0, 1]) → R by Z 1 Λ[f] = f(t)2 cos t dt . 0 Show that Λ is differentiable at every g ∈ C([0, 1]) and that

DΛg : C([0, 1]) → R is the bounded linear functional defined by Z 1 DΛg[h] = h(t) 2g(t) cos t dt . 0