
1.4 Cauchy sequences in R

Definition. (1.4.1)

A sequence x_n ∈ R is said to converge to x if
• ∀ε > 0, ∃N s.t. n > N ⇒ |x_n − x| < ε.
A sequence x_n ∈ R is called a Cauchy sequence if
• ∀ε > 0, ∃N s.t. n > N & m > N ⇒ |x_n − x_m| < ε.

Proposition. (1.4.2) Every convergent sequence is a Cauchy sequence.

Proof. Assume x_n → x. Let ε > 0 be given.
• ∃N s.t. n > N ⇒ |x_n − x| < ε/2.
• n, m > N ⇒ |x_n − x_m| ≤ |x − x_n| + |x − x_m| < ε/2 + ε/2 = ε.

Theorem. (1.4.3; Bolzano-Weierstrass Property) Every bounded sequence in R has a subsequence that converges to some point in R.

Proof. Suppose x_n is a bounded sequence in R. ∃M such that

−M ≤ x_n ≤ M, n = 1, 2, ··· .

• Set I_0 := [−M, M] and select n_0 = 1, so that x_{n_0} = x_1 ∈ I_0.
• Bisect I_0 into [−M, 0] and [0, M].
• At least one of these (either [−M, 0] or [0, M]) must contain x_n for infinitely many indices n.
• Call it I_1 and select n_1 > n_0 with x_{n_1} ∈ I_1.
• Continue in this way to get intervals I_k = [a_k, b_k] and indices n_k such that
  • I_0 ⊃ I_1 ⊃ I_2 ⊃ I_3 ⊃ ···
  • |I_k| = b_k − a_k = 2^{1−k} M (each bisection halves the length, and |I_0| = 2M)
  • n_0 < n_1 < n_2 < ··· with x_{n_k} ∈ I_k.
• Since a_k ≤ a_{k+1} ≤ M (monotone and bounded), ∃x s.t. a_k → x.
• Since x_{n_k} ∈ I_k and |I_k| = 2^{1−k} M, we have
  |x_{n_k} − x| ≤ |x_{n_k} − a_k| + |a_k − x| ≤ 2^{1−k} M + |a_k − x| → 0 as k → ∞.

Corollary. (1.4.5; Compactness) Every sequence in the closed interval [a, b] has a subsequence that converges to some point in [a, b].

Proof. Assume a ≤ x_n ≤ b for n = 1, 2, ··· . By Theorem 1.4.3, ∃ a subsequence x_{n_k} and x ∈ R such that x_{n_k} → x. Since a ≤ x_{n_k} ≤ b for all k, the limit satisfies a ≤ x ≤ b.

Lemma. (1.4.6; Boundedness of Cauchy sequence)

If x_n is a Cauchy sequence, x_n is bounded.

Proof. Taking ε = 1 in the definition, ∃N s.t. n ≥ N ⇒ |x_n − x_N| < 1. Then sup_n |x_n| ≤ 1 + max{|x_1|, ··· , |x_N|}. (Why?)

Theorem. (1.4.3; Completeness) Every Cauchy sequence in R converges to an element of R.

Proof. Cauchy seq. ⇒ bounded seq. ⇒ convergent subseq. ⇒ convergent seq. (a Cauchy sequence with a convergent subsequence converges to the same limit).

1.5. Cluster Points of the sequence x_n

Definition. (1.5.1; cluster points)

A point x is called a cluster point of the sequence x_n if
• ∀ε > 0, ∃ infinitely many values of n with |x_n − x| < ε.
In other words, a point x is a cluster point of the sequence x_n iff ∀ε > 0 & ∀N, ∃n > N s.t. |x_n − x| < ε.

Example
• Both 1 and −1 are cluster points of the sequence 1, −1, 1, −1, ··· .
• The sequence x_n = 1/n has only one cluster point, 0.
• The sequence x_n = n does not have any cluster point.

Proposition.

1. x is a cluster point of the sequence x_n iff ∃ a subsequence x_{n_k} s.t. x_{n_k} → x.
2. x_n → x iff every subsequence of x_n converges to x.
3. x_n → x iff the sequence {x_n} is bounded and x is its only cluster point.

Proof.
1. (⇒) Assume x is a cluster point. Then, we can choose n_1 < n_2 < n_3 < ··· s.t. |x_{n_k} − x| < 1/k. (Why?) This gives a subsequence x_{n_k} → x.
2. Trivial.
3. (⇐) If not, ∃ε > 0 and ∃ a subseq. x_{n_k} so that |x_{n_k} − x| ≥ ε for all k. Since x_{n_k} is bounded, ∃ a convergent subseq. The limit of that subseq. would be a cluster point of the seq. x_n different from x, but there is no such point. Contradiction.

Definition. (1.5.3; limit superior & limit inferior of seq. x_n)

Define the limit superior lim sup_{n→∞} x_n in the following way:
• If x_n is bounded above, then lim sup_{n→∞} x_n = the largest cluster point of x_n, and lim sup_{n→∞} x_n = −∞ if the set of cluster points is empty.
• If x_n is NOT bounded above, then lim sup_{n→∞} x_n = ∞.

Similarly, we can define the limit inferior lim inf_{n→∞} x_n.

Examples
• For the seq. 1, 0, −1, 1, 0, −1, ··· , lim sup x_n = 1 and lim inf x_n = −1.
• If x_n = n, then lim sup x_n = ∞ = lim inf x_n.
• Let x_n = (−1)^n (1 + n)/n. Then lim sup x_n = 1 and lim inf x_n = −1.
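The last example can be probed numerically. The following is a minimal sketch, not part of the original notes: it looks at the sequence x_n = (−1)^n (1 + n)/n and reports the largest and smallest terms over a finite window of the tail n ≥ start. The function names and the window length are illustrative assumptions; a finite window only approximates the true tail supremum and infimum, although for this particular sequence the extreme terms sit at the start of the tail.

```python
def x(n: int) -> float:
    """n-th term of the example sequence x_n = (-1)^n (1 + n)/n, for n >= 1."""
    return (-1) ** n * (1 + n) / n


def tail_sup_inf(start: int, length: int = 10_000):
    """Return (max, min) of x_n over the finite window start <= n < start + length.

    Assumption: the window is long enough to stand in for the whole tail.
    """
    window = [x(n) for n in range(start, start + length)]
    return max(window), min(window)


# As the tail starts later, the two values approach lim sup = 1 and lim inf = -1.
for start in (1, 10, 100, 1000):
    tail_max, tail_min = tail_sup_inf(start)
    print(start, tail_max, tail_min)
```

The tail maximum decreases toward 1 and the tail minimum increases toward −1, which matches the description of lim sup and lim inf as the largest and smallest cluster points.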

Definition. (1.6.2; Vector space) A real vector space V is a set of elements called vectors, with given operations of vector addition + : V × V → V and scalar multiplication · : R × V → V such that the following hold for all v, u, w ∈ V and all λ, µ ∈ R:
1. v + w = w + v, (v + u) + w = v + (u + w), λ(v + w) = λv + λw, λ(µv) = (λµ)v, (λ + µ)v = λv + µv, 1v = v.
2. ∃0 ∈ V s.t. v + 0 = v for all v ∈ V, and for each v, ∃ −v ∈ V s.t. v + (−v) = 0.

• A subset of V is called a subspace if it is itself a vector space with the same operations.
• A nonempty subset W is a vector subspace of V iff λv + µu ∈ W whenever u, v ∈ W and λ, µ ∈ R.
• The straight line W = {(x_1, x_2) : x_1 = 2x_2} is a subspace of R^2.

Euclidean space R^n: Definitions & Properties

The Euclidean n-space R^n with the operations

(x_1, ··· , x_n) + (y_1, ··· , y_n) = (x_1 + y_1, ··· , x_n + y_n) & λ(x_1, ··· , x_n) = (λx_1, ··· , λx_n)

is a vector space of dimension n.
• The standard basis of R^n: e_1 = (1, 0, ··· , 0), ··· , e_n = (0, ··· , 0, 1).
• Unique representation: x = (x_1, ··· , x_n) ∈ R^n can be expressed uniquely as x = x_1 e_1 + ··· + x_n e_n.
• Inner product of x and y: ⟨x, y⟩ = Σ_{i=1}^n x_i y_i.
• Norm of x: ‖x‖ = √⟨x, x⟩.
• Distance between x and y: dist(x, y) = ‖x − y‖.
• Triangle inequality: ‖x + y‖ ≤ ‖x‖ + ‖y‖.
• Cauchy-Schwarz inequality: |⟨x, y⟩| ≤ ‖x‖ ‖y‖.
• Pythagorean theorem: If ⟨x, y⟩ = 0, then ‖x + y‖^2 = ‖x‖^2 + ‖y‖^2.

Definition. (1.7.1; Metric space (M, d) equipped with d = distance) A metric space (M, d) is a set M and a function d : M × M → R such that
1. d(x, y) ≥ 0 for all x, y ∈ M.
2. d(x, y) = 0 iff x = y.
3. d(x, y) = d(y, x) for all x, y ∈ M.
4. d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ M.
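The four rules can be tested mechanically on a finite sample of points. The sketch below is only an illustration: the helper `is_metric_on`, the tolerance, and the sample points are assumptions introduced here, and passing the check on a finite sample does not prove that a function is a metric on the whole space. It does show how a candidate distance can fail, since the squared Euclidean distance violates rule 4.

```python
import itertools
import math


def is_metric_on(points, d, tol=1e-12):
    """Check the four metric rules for d on every pair/triple from a finite sample."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < -tol:                      # rule 1: non-negativity
            return False
        if (d(x, y) <= tol) != (x == y):        # rule 2: d(x, y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # rule 3: symmetry
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:   # rule 4: triangle inequality
            return False
    return True


def euclidean(x, y):
    """Standard Euclidean distance."""
    return math.dist(x, y)


def squared_euclidean(x, y):
    """Squared distance: symmetric and non-negative, but not a metric."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
print(is_metric_on(sample, euclidean))          # all four rules hold on the sample
print(is_metric_on(sample, squared_euclidean))  # rule 4 fails: 4 > 1 + 1
```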

Example [Fingerprint Recognition] Let M be a data set of fingerprints in the Seoul city police department.
• Motivation: Design an efficient access system to find a target.
• We need to define a dissimilarity function stating the distance between the data. The distance d(x, y) between two data x and y must satisfy the above four rules.
• Similarity queries. For a given target x∗ ∈ M and ε > 0, arrest everyone whose fingerprint y ∈ M satisfies d(y, x∗) < ε.

Definition. (1.7.3; Normed Space (V, ‖·‖)) A normed space (V, ‖·‖) is a vector space V and a function ‖·‖ : V → R called a norm such that
1. ‖v‖ ≥ 0, ∀v ∈ V.
2. ‖v‖ = 0 iff v = 0.
3. ‖λv‖ = |λ|‖v‖, ∀v ∈ V and every scalar λ.
4. ‖v + w‖ ≤ ‖v‖ + ‖w‖, ∀v, w ∈ V.

Examples
• V = R and ‖x‖ = |x| for all x ∈ R.
• V = R^2 and ‖v‖ = √(v_1^2 + v_2^2) for all v = (v_1, v_2) ∈ R^2.
• Let V = C([0, 1]) = all continuous functions on the interval [0, 1]. Define ‖f‖ = sup{|f(x)| : x ∈ [0, 1]} (called the supremum norm).

Proposition. If (V, ‖·‖) is a normed space and

d(v, w) = ‖v − w‖,

then d is a metric on V.

Proof. Easy: each metric rule follows from the corresponding norm rule; for instance d(v, w) = ‖(v − u) + (u − w)‖ ≤ d(v, u) + d(u, w).

Examples
• For V = C([0, 1]), the metric is

d(f, g) = ‖f − g‖ = sup{|f(x) − g(x)| : x ∈ [0, 1]}.

The sup distance between functions is the largest vertical distance between their graphs.
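The sup distance can be approximated by sampling. The sketch below is illustrative only: `sup_distance` and the grid size are assumptions made here, and a grid maximum is in general just a lower bound for the true supremum. For f(x) = x and g(x) = x^2 on [0, 1] the largest vertical gap is 1/4, attained at x = 1/2, which happens to be a grid point.

```python
def sup_distance(f, g, samples=10_001):
    """Approximate sup{|f(x) - g(x)| : x in [0, 1]} on an evenly spaced grid.

    Assumption: f and g are continuous, so a fine grid gives a close lower bound.
    """
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in grid)


# Largest vertical distance between the graphs of x and x^2 on [0, 1].
print(sup_distance(lambda x: x, lambda x: x * x))
```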

Definition. A vector space V with a function ⟨·, ·⟩ : V × V → R is called an inner product space if
1. ⟨v, v⟩ ≥ 0 for all v ∈ V.
2. ⟨v, v⟩ = 0 iff v = 0.
3. ⟨λv, w⟩ = λ⟨v, w⟩, ∀v, w ∈ V and every scalar λ.
4. ⟨v + w, h⟩ = ⟨v, h⟩ + ⟨w, h⟩ for all v, w, h ∈ V.
5. ⟨v, w⟩ = ⟨w, v⟩ for all v, w ∈ V.

Examples
1. V = R^2 and ⟨v, w⟩ = v_1 w_1 + v_2 w_2. Two vectors v and w are orthogonal if ⟨v, w⟩ = 0.
2. V = C[0, 1] and ⟨f, g⟩ = ∫_0^1 f(x) g(x) dx.
3. ‖v‖ = √⟨v, v⟩ is a norm on V.

Theorem. (Cauchy-Schwarz inequality) If ⟨·, ·⟩ is an inner product on a real vector space V, then |⟨f, g⟩| ≤ ‖f‖‖g‖.

Proof:
• If g = 0, both sides are 0. Suppose g ≠ 0. Let h = g/‖g‖, so that ‖h‖ = 1. It suffices to prove that |⟨f, h⟩| ≤ ‖f‖. (Why? |⟨f, g⟩| ≤ ‖f‖‖g‖ iff |⟨f, h⟩| ≤ ‖f‖.)
• Denote α = ⟨f, h⟩. Then

0 ≤ ‖f − αh‖^2 = ⟨f − αh, f − αh⟩ = ‖f‖^2 − α⟨h, f⟩ − α⟨f, h⟩ + |α|^2 ‖h‖^2 = ‖f‖^2 − |α|^2.

Hence, |α| = |⟨f, h⟩| ≤ ‖f‖. This completes the proof.
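The key identity in the proof, ‖f − αh‖^2 = ‖f‖^2 − α^2 with h = g/‖g‖ and α = ⟨f, h⟩, can be checked on concrete vectors. This is a small sketch for the dot product on R^3; the helper names and the two sample vectors are assumptions chosen for illustration, and a numerical check on one pair of vectors is of course not a proof.

```python
import math


def inner(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Norm induced by the inner product."""
    return math.sqrt(inner(u, u))


def projection_identity(f, g):
    """Return (||f - alpha*h||^2, ||f||^2 - alpha^2) for h = g/||g||, alpha = <f, h>.

    Assumption: g is not the zero vector.
    """
    length = norm(g)
    h = [b / length for b in g]
    alpha = inner(f, h)
    residual = [a - alpha * b for a, b in zip(f, h)]
    return inner(residual, residual), inner(f, f) - alpha ** 2


f = [1.0, 2.0, 3.0]
g = [4.0, -5.0, 6.0]
left, right = projection_identity(f, g)
print(left, right)                       # the two sides of the identity agree
print(abs(inner(f, g)), norm(f) * norm(g))  # |<f, g>| does not exceed ||f|| ||g||
```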

Chapter 2: Topology of M = R^n

Throughout this chapter, assume M = R^n (the Euclidean space) with the metric d(x, y) = √(Σ_{i=1}^n |x_i − y_i|^2) = ‖x − y‖.

Definition. (D(x, ε), open, neighborhood)

• D(x, ε) := {y ∈ M : d(y, x) < ε} is called the ε-ball (or ε-disk) about x.
• A ⊂ M is open if ∀x ∈ A, ∃ε > 0 s.t. D(x, ε) ⊂ A.
• A neighborhood of x is an open set A containing x.

• Open sets: (a, b), D(x, ε), {(x, y) ∈ R^2 : 0 < x < 1}.
• The union of an arbitrary collection of open subsets of M is open. (Why?)
• The intersection of a finite number of open subsets of M is open. (Finiteness matters: ∩_{n=1}^∞ (−1/n, 1/n) = {0} is not open.)

2.2 Interior of a set A: int(A)

Definition. (2.2.1; Interior point & interior of A) Let (M, d) be a metric space and A ⊂ M. x is called an interior point of A if ∃ε > 0 s.t. D(x, ε) ⊂ A. Denote

int(A) := the collection of all interior points of A.

Examples. Proofs are very easy.
• If A = [0, 1], then int(A) = (0, 1).
• int{(x, y) ∈ R^2 : 0 < x ≤ 1} = {(x, y) ∈ R^2 : 0 < x < 1}.
• If A is open, then int(A) = A.
• Let M = R^n and x_0 ∈ M. Then int{y ∈ M : d(y, x_0) ≤ 1} = {y ∈ M : d(y, x_0) < 1}. (In a general metric space only ⊃ is guaranteed.)

Definition. (2.3-4: Closed sets & Accumulation Points)
• A set B in a metric space M is said to be closed if M \ B is open.
• x ∈ M is an accumulation point (or cluster point) of a set A ⊂ M if ∀ε > 0, D(x, ε) contains some y ∈ A with y ≠ x.

Prove the following:
• Closed sets: [a, b], {y ∈ R^2 : d(y, x_0) ≤ 1}.
• The union of a finite number of closed subsets of M is closed. (Finiteness matters: ∪_{n=1}^∞ [1/n, 2 − 1/n] = (0, 2) is not closed.)
• The intersection of an arbitrary family of closed subsets of M is closed. Why?
• Every finite set in R^n is closed.
• A set A ⊂ M is closed iff all accumulation points of A belong to A.
• A = {1, 1/2, 1/3, 1/4, ···} ∪ {0} is closed.

Definition. (Closure of A & Boundary of A) Let (M, d) be a metric space and A ⊂ M.
• cl(A) := the intersection of all closed sets containing A.
• ∂A = bd(A) = cl(A) ∩ cl(M \ A) is called the boundary of A.

Examples
• Closure: cl((0, 1)) = [0, 1], cl{(x, y) ∈ R^2 : x > y} = {(x, y) ∈ R^2 : x ≥ y}.
• Boundary: bd((0, 1)) = {0, 1}, bd{(x, y) ∈ R^2 : x > y} = {(x, y) ∈ R^2 : x = y}.

Let (M, d) be a metric space and A ⊂ M. Prove that
• cl(A) = A ∪ {accumulation points of A}.
• x ∈ cl(A) iff inf{d(x, y) : y ∈ A} = 0.
• x ∈ bd(A) iff ∀ε > 0, D(x, ε) ∩ A ≠ ∅ & D(x, ε) ∩ (M \ A) ≠ ∅.

Definition. (Sequences & Completeness)

Let (M, d) be a metric space and x_k a sequence of points in M.

• lim_{k→∞} x_k = x iff ∀ε > 0, ∃N s.t. k ≥ N ⇒ d(x, x_k) < ε.
• x_k is a Cauchy seq. iff ∀ε > 0, ∃N s.t. k, l ≥ N ⇒ d(x_k, x_l) < ε.
• x_k is bounded iff ∃B > 0 & x_0 ∈ M s.t. d(x_k, x_0) < B for all k.
• x is a cluster point of the seq. x_k iff ∀ε > 0, ∃ infinitely many k with d(x_k, x) < ε.

The space M is called complete if every Cauchy seq. in M converges to a point in M.

In a metric space, it is easy to prove the following:
• Every convergent seq. is a Cauchy seq.
• A Cauchy seq. is bounded.
• If a subseq. of a Cauchy seq. converges to x, then the sequence itself converges to x.

Chapter 3: Compact & Connected sets

Throughout this chapter, we assume that (M, d) is a metric space.

Definition. (3.1.1: Sequentially compact & Compact) Let A ⊂ M.
• A is called sequentially compact if EVERY sequence in A has a subsequence that converges to a point in A.
• A is compact if EVERY open cover of A has a FINITE subcover.
  • An open cover of A is a collection {U_i} of open sets such that A ⊂ ∪_i U_i.
  • An open cover {U_i} of A is said to have a finite subcover if a finite subcollection of {U_i} covers A.

• In chapter 1, we proved that every sequence x_n in the closed interval [a, b] has a subsequence that converges to a point in [a, b]. Hence, [a, b] is sequentially compact.

Examples of compact sets
1. Prove that the entire line R is NOT compact.
   Proof. Clearly, {D(n, 1) : n = 0, ±1, ±2, ···} is an open cover of R but does not have a finite subcover (why?).
2. Prove that A = (0, 1] is not compact.
   Proof. Clearly, (0, 1] ⊂ ∪_{n=1}^∞ (1/n, 2). Hence, {(1/n, 2) : n = 1, 2, ···} is an open cover of (0, 1] but does not have a finite subcover.
3. Heine-Borel thm. Let A ⊂ M = R^n. A is compact iff A is closed and bounded. Proof. Later.
4. Give an example of a bounded and closed set that is not compact.
   Sol'n. Let M = {e_n : n = 1, 2, ···} where e_1 = (1, 0, 0, ···), e_2 = (0, 1, 0, ···), ··· . Let d(e_i, e_j) = √2 if i ≠ j, and d(e_i, e_i) = 0. Then (M, d) is a metric space.
   • The entire metric space M is closed and bounded (why?).
   • {D(e_n, 1) : n = 1, 2, ···} is an open cover of M but does not have a finite subcover (why?). Hence, M is not compact.

Theorem. (3.1.3; Bolzano-Weierstrass theorem) A ⊂ M is compact iff A is sequentially compact.

• Lemma 1: Let A ⊂ M. If A is compact, then A is closed.
  Proof. We will show M \ A is open. Let x ∈ M \ A.
  1. A ⊂ ∪_{n=1}^∞ U_n where U_n = {y ∈ M : d(y, x) > 1/n} is an open set. (Every y ≠ x lies in some U_n, and x ∉ A.)
  2. Since A is compact and {U_n} covers A, ∃ a finite subcover, that is, ∃N s.t. A ⊂ ∪_{n=1}^N U_n = U_N.
  3. Hence, D(x, 1/N) ⊂ U_N^c ⊂ A^c = M \ A and therefore M \ A is open.
• Lemma 2: Let A ⊂ B ⊂ M. If B is compact and A is closed, then A is compact.
  Proof. Let {U_i} be an open cover of A.
  1. Set V = M \ A. Note that V is open.
  2. Thus {U_i} ∪ {V} is an open cover of B.
  3. Since B is compact, B has a finite subcover, say, {U_1, ··· , U_N, V}. Since V ∩ A = ∅, A ⊂ U_1 ∪ ··· ∪ U_N.

• Lemma 4: If A is sequentially compact, then A is totally bounded.
  1. Definition of totally bounded: A ⊂ M is totally bounded if ∀ε > 0, ∃ a finite set {x_1, ··· , x_N} ⊂ M s.t. A ⊂ ∪_{i=1}^N D(x_i, ε).
  2. Proof. If not, then for some ε > 0 we cannot cover A with finitely many disks of radius ε.
     (i) Choose x_1 ∈ A and x_2 ∈ A \ D(x_1, ε).
     (ii) By assumption, we can repeat; choose x_n ∈ A \ ∪_{i=1}^{n−1} D(x_i, ε) for n = 2, 3, ··· .
     (iii) This seq. {x_n} satisfies d(x_n, x_m) ≥ ε for all n ≠ m.
     (iv) Hence, x_n has no convergent subseq., a contradiction.

• Summary. Let A ⊂ M.
  • A is compact ⇒ A is closed.
  • A is a closed subset of a compact set ⇒ A is compact.
  • A is sequentially compact ⇒ A is totally bounded.
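The selection step in the proof of Lemma 4 (keep choosing a point of A outside all disks chosen so far) can be sketched for a finite sample of points, where it must stop after finitely many steps. The function name `greedy_eps_net` and the sample grid are assumptions made for this illustration; the code is a sketch of the idea, not the proof itself, which concerns arbitrary sequentially compact sets.

```python
import math


def greedy_eps_net(points, eps):
    """Pick centers one at a time, each at distance >= eps from all earlier centers.

    On a finite sample the loop stops, and the disks D(c, eps) around the chosen
    centers then cover every sample point.
    """
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)  # p lies outside every disk chosen so far
    return centers


# Illustrative sample: an 11 x 11 grid in the unit square.
sample_points = [(i / 10, j / 10) for i in range(11) for j in range(11)]
net = greedy_eps_net(sample_points, 0.35)
print(len(net), "centers cover", len(sample_points), "points")
```

The chosen centers are pairwise at distance at least ε, exactly as in step (iii); in the proof, an infinite set of such points is what rules out a convergent subsequence.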

Proof of B-W thm (⇒): If A is compact, then A is sequentially compact.

Let A be compact. Let {x_n} be a seq. in A.

1. To derive a contradiction, assume that {x_n} has no convergent subseq.
2. Then, {x_n} has infinitely many distinct points {y_k}, and the set {y_k} has no accumulation points. (Why? If not, ∃ a convergent subseq.)
3. Hence, each y_k has some neighborhood U_k containing no other y_i.
4. {y_k} is closed because it has no accumulation points. Hence, {y_k} is compact by Lemma 2 (any closed subset of the compact set A is compact).
5. But {U_k} is an open cover of {y_k} that has no finite subcover, a contradiction.
6. Hence, x_n has a convergent subsequence. The limit lies in A, since A is closed by Lemma 1.

Hence, x_n has a subsequence that converges to a point in A.

Proof of B-W thm (⇐): If A is sequentially compact, then A is compact.

Suppose {U_i} is an open cover of A. We need to prove that {U_i} has a finite subcover.

• ∃r > 0 s.t. ∀y ∈ A, D(y, r) ⊂ U_i for some U_i. Why?
  1. If not, for each n ∃y_n ∈ A s.t. D(y_n, 1/n) is not contained in any U_i.
  2. By assumption, {y_n} has a subseq., say, y_{n_k} → z ∈ A. Since z ∈ A ⊂ ∪_i U_i, z ∈ U_{i_0} for some U_{i_0}.
  3. Since U_{i_0} is open, ∃ε > 0 s.t. D(z, ε) ⊂ U_{i_0}.
  4. Since y_{n_k} → z, ∃N = n_{k_0} ≥ 2/ε s.t. y_N ∈ D(z, ε/2).
  5. But D(y_N, 1/N) ⊂ D(z, ε) ⊂ U_{i_0} (why?), a contradiction.
• Since A is totally bounded (see Lemma 4), we can write A ⊂ D(y_1, r) ∪ ··· ∪ D(y_n, r) for finitely many y_i (the centers may be taken in A, as in the proof of Lemma 4).
• Since D(y_k, r) ⊂ U_{i_k} for some U_{i_k}, A ⊂ U_{i_1} ∪ ··· ∪ U_{i_n}, a finite subcover. Hence, A is compact.

Theorem. (3.1.5; Compact ⇔ Complete and Totally Bounded) Let A ⊂ M. A is compact iff A is complete and totally bounded.

(Proof of ⇒) Assume A is compact.
1. A is compact ⇒ A is sequentially compact ⇒ A is totally bounded (Lemma 4).
2. A is sequentially compact ⇒ A is complete. (A Cauchy seq. in A has a subseq. converging to a point of A, so the whole seq. converges to that point.)

(Proof of ⇐) Assume A is complete and totally bounded. It suffices to prove that A is sequentially compact. Assume that {y_n} is a sequence in A.

1. We may assume that the y_k are all distinct. (Why? If not, ...)
2. Since A is totally bounded, for each k = 1, 2, ···

∃x_{k1}, ··· , x_{kL_k} ∈ M s.t. A ⊂ D(x_{k1}, 1/k) ∪ ··· ∪ D(x_{kL_k}, 1/k)

3. For k = 1, infinitely many y_n lie in one of the disks D(x_{1j}, 1). Hence, we can select a subseq. {y_{11}, y_{12}, ···} lying entirely in one of these disks.
4. Repeat the previous step for k = 2 and obtain a subseq. {y_{21}, y_{22}, ···} of {y_{11}, y_{12}, ···} lying entirely in one of the disks D(x_{2j}, 1/2). Continue in this way for k = 3, 4, ··· .
5. Now choose the diagonal subsequence y_{11}, y_{22}, y_{33}, ··· . This sequence is a Cauchy seq. because y_{ii} and y_{jj} lie in a common disk of radius max{1/i, 1/j}, so d(y_{ii}, y_{jj}) ≤ 2 max{1/i, 1/j}.
6. Since A is complete, y_{ii} converges to a point in A.

Theorem. (3.2.1, Heine-Borel thm.) Let A ⊂ M = R^n. A is compact iff A is closed and bounded.

Proof.
• Recall Thm 3.1.5: A is compact iff A is complete and totally bounded.
• Since M = R^n is complete, A ⊂ R^n is complete ⇔ A is closed.
• Since M = R^n is Euclidean space,

A is bounded ⇔ A is totally bounded

Caution: If M is not Euclidean space, the above statement is not true. See Example 3.1.8, where A is bounded but not totally bounded.

Theorem. (3.3.1: Nested Set Property)

Let F_k be a sequence of compact non-empty sets in a metric space M such that F_1 ⊇ F_2 ⊇ F_3 ⊇ ··· . Then,

∩_{k=1}^∞ F_k ≠ ∅.

Proof.
1. For each n, choose x_n ∈ F_n.
2. Since {x_n} ⊂ F_1 and F_1 is compact, ∃ a subseq. {x_{n_k}} that converges to some point z in F_1, that is,

x_{n_k} → z ∈ F_1

3. Relabeling the subsequence, we may assume that x_n → z. (Why? Since n_k ≥ k, we still have x_{n_k} ∈ F_{n_k} ⊂ F_k.)
4. n > N ⇒ x_n ∈ F_n ⊂ F_N ⇒ x_n ∈ F_N.
5. Since lim_{j→∞} x_{N+j} = z & x_{N+j} ∈ F_N & F_N is closed (being compact), it must be z ∈ F_N, N = 1, 2, 3, ···
This completes the proof.

Definition. (Path-Connected Sets)

• φ : [a, b] → M is said to be continuous if

t_k ∈ [a, b], t_k → t ⇒ φ(t_k) → φ(t)

• A continuous path joining x, y ∈ M is a continuous mapping φ : [a, b] → M such that φ(a) = x, φ(b) = y.
• A ⊂ M is said to be path-connected if for any x, y ∈ A, there exists a continuous path φ : [a, b] → M joining x and y such that φ([a, b]) ⊂ A.

Definition. (3.5.1: Separate, Connected Sets) Let A be a subset of a metric space M.
• Two open sets U, V are said to separate A if
  1. U ∩ V ∩ A = ∅
  2. U ∩ A ≠ ∅ & V ∩ A ≠ ∅
  3. A ⊂ U ∪ V.
• A is disconnected if such sets U, V exist.
• A is connected if such sets U, V do not exist.

Theorem. (3.3.1) Path-connected sets are connected.

Proof.
1. Clearly, [a, b] is connected.
2. To derive a contradiction, suppose A is path-connected but not connected. Then ∃ open sets U, V such that
   (i) U ∩ V ∩ A = ∅ & A ⊂ U ∪ V
   (ii) ∃x ∈ U ∩ A & ∃y ∈ V ∩ A
3. Since A is path-connected, ∃ a continuous path φ : [a, b] → M s.t. φ(a) = x, φ(b) = y, φ([a, b]) ⊂ A.
4. From Theorem 4.2.1, which we will learn soon, φ([a, b]) is connected. This is a contradiction since U, V separate φ([a, b]).

Example 3.1
• Show that A := {x ∈ R^n : ‖x‖ ≤ 1} is compact and connected.
Proof.
1. Since A is closed and bounded, A is compact by the Heine-Borel thm.
2. To prove connectedness, let x, y ∈ A.
3. Define φ : [0, 1] → R^n by φ(t) = (1 − t)x + ty. Clearly, φ is a continuous path joining φ(0) = x and φ(1) = y.
4. ‖φ(t)‖ ≤ (1 − t)‖x‖ + t‖y‖ ≤ (1 − t) + t = 1 for t ∈ [0, 1]. Hence, φ([0, 1]) ⊂ A.
5. Hence, A is path-connected, and therefore connected.
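The straight-line path used in this example can be sampled to see that it stays inside the closed unit ball. The sketch below parametrizes the segment as (1 − t)x + ty so that t = 0 gives x and t = 1 gives y; the helper names and the two endpoints are illustrative assumptions, and sampling finitely many t only illustrates the inequality that the proof establishes for all t.

```python
import math


def segment(x, y, t):
    """Point (1 - t) x + t y on the straight line from x (t = 0) to y (t = 1)."""
    return tuple((1 - t) * a + t * b for a, b in zip(x, y))


def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(a * a for a in v))


# Two illustrative points of the closed unit ball in R^3.
x = (1.0, 0.0, 0.0)
y = (0.0, -0.6, 0.8)
norms = [norm(segment(x, y, k / 100)) for k in range(101)]
print(max(norms))  # stays at or below 1, up to rounding
```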

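Example 3.2
• Let A ⊂ R^n, x ∈ A and y ∈ R^n \ A. Let φ : [0, 1] → R^n be a continuous path joining x and y. Show that ∃t_0 s.t. φ(t_0) ∈ bd(A).
1. Let t_0 = sup{t : φ([0, t]) ⊂ A}. This is well-defined because φ(0) = x ∈ A. Moreover φ(t_0) ∈ cl(A): either t_0 = 0, or φ(t) ∈ A for all t < t_0 and φ(t) → φ(t_0) as t ↑ t_0.
2. If t_0 = 1, then y = φ(t_0) ∈ cl(A) and y ∈ A^c, so φ(t_0) ∈ bd(A).
3. Assume 0 ≤ t_0 < 1. From the definition of t_0, for n = 1, 2, ··· , ∃t_n s.t. t_0 ≤ t_n ≤ t_0 + 1/n & φ(t_n) ∈ A^c.
4. Since φ(t_n) ∈ A^c and φ(t_n) → φ(t_0), φ(t_0) ∈ cl(A^c). Together with φ(t_0) ∈ cl(A), this gives φ(t_0) ∈ bd(A).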

Chapter 4. Continuous Mappings

Throughout this chapter, we assume that M = R^n and N = R^m are Euclidean spaces with the standard metrics

d(x, y) = ‖x − y‖ = √(Σ_{j=1}^n (x_j − y_j)^2), x, y ∈ M

ρ(v, w) = ‖v − w‖ = √(Σ_{j=1}^m (v_j − w_j)^2), v, w ∈ N

Please note that the same symbol ‖·‖ may denote different norms depending on its context. Throughout this chapter, we assume that A ⊂ M = R^n and

f : A → N = R^m is a mapping.

Definition. (4.1.1: Continuity of f : A → N)

• Suppose that x_0 is an accumulation point of A. We write

lim_{x→x_0} f(x) = b if ∀ε > 0, ∃δ > 0 s.t.

0 < ‖x − x_0‖ < δ & x ∈ A ⇒ ‖f(x) − b‖ < ε

• Let x_0 ∈ A. We say that f is continuous at x_0 if either x_0 is not an accumulation point of A or lim_{x→x_0} f(x) = f(x_0).
• Let B ⊂ A. f is called continuous on B if f is continuous at each point of B. If A = B, we just say that f is continuous.

Theorem. (4.1.4: Continuity of f : A → N) The following assertions are equivalent.
1. f is continuous on A.
2. For every convergent seq. x_k → x_0 in A, we have f(x_k) → f(x_0).
3. For each open set U in N, f^{−1}(U) is open relative to A; that is, f^{−1}(U) = A ∩ V for some open V.
4. For each closed set F in N, f^{−1}(F) is closed relative to A; that is, f^{−1}(F) = A ∩ G for some closed G.

Proof. 1 ⇒ 2 (easy), 2 ⇒ 4, 4 ⇒ 3 (easy), 3 ⇒ 1.

Proof of (2 ⇒ 4) Let F ⊂ N be closed. We want to prove that f^{−1}(F) is closed relative to A. We begin by reviewing the definition of closed.
1. B is closed iff B = B ∪ {accumulation points of B}.
2. B is closed iff for every sequence {x_k} ⊂ B with x_k → x_0, we necessarily have x_0 ∈ B.
3. B ⊂ A is closed relative to A iff B = (B ∪ {accumulation points of B}) ∩ A.
4. B ⊂ A is closed relative to A iff for every sequence {x_k} ⊂ B with x_k → x_0 ∈ A, we necessarily have x_0 ∈ B.
5. Proof of (2 ⇒ 4). Let x_k ∈ f^{−1}(F) and let x_k → x_0 ∈ A. By 2, f(x_k) → f(x_0). Since F is closed and f(x_k) ∈ F, f(x_0) ∈ F. ∴ x_0 ∈ f^{−1}(F). ∴ f^{−1}(F) is closed relative to A.

Proof of (3 ⇒ 1)

For given x_0 ∈ A and ε > 0, we must find δ > 0 such that

‖x − x_0‖ < δ & x ∈ A ⇒ ‖f(x) − f(x_0)‖ < ε,

that is, x ∈ D(x_0, δ) ∩ A ⇒ f(x) ∈ D(f(x_0), ε).

1. Since D(f(x_0), ε) is open, by 3

f^{−1}(D(f(x_0), ε)) is open relative to A.

∴ f^{−1}(D(f(x_0), ε)) = A ∩ V for some open set V.

2. Since x_0 ∈ V and V is open,

∃δ > 0 s.t. D(x_0, δ) ⊂ V.

3. Hence, D(x_0, δ) ∩ A ⊂ V ∩ A = f^{−1}(D(f(x_0), ε)) and this completes the proof.

Theorem. (4.2.1: f(connected) is connected if f ∈ C(M)) Suppose that f : M → N is continuous and let K ⊂ M.
(i) If K is connected, so is f(K).
(ii) If K is path-connected, so is f(K).

Proof of (i). Suppose f(K) is not connected.
1. From the definition of disconnectedness, ∃ open U, V s.t.

f(K) ⊂ U ∪ V, U ∩ V ∩ f(K) = ∅, U ∩ f(K) ≠ ∅, V ∩ f(K) ≠ ∅

2. Since f is continuous, f^{−1}(U) and f^{−1}(V) are open. Moreover, K ⊂ f^{−1}(U) ∪ f^{−1}(V), f^{−1}(U) ∩ f^{−1}(V) ∩ K = ∅, f^{−1}(U) ∩ K ≠ ∅, f^{−1}(V) ∩ K ≠ ∅.
3. Hence, K is disconnected, a contradiction.

Proof of (ii). If K is path-connected, so is f(K).
1. Let v, w ∈ f(K) and let x, y ∈ K s.t. f(x) = v, f(y) = w.
2. Since K is path-connected, ∃ a continuous curve c : [0, 1] → M s.t.

c(t) ∈ K (0 ≤ t ≤ 1), c(0) = x, c(1) = y

3. Since f is continuous, it is easy to show that c̃(t) = f(c(t)) ∈ f(K) for 0 ≤ t ≤ 1 and c̃ : [0, 1] → N is a continuous path joining v and w.
4. Hence, f(K) is path-connected.

Theorem. (4.2.2: f(compact) is compact if f ∈ C(M)) Suppose that f : M → N is continuous and K ⊂ M is compact. Then f(K) is compact.

Proof. It suffices to prove that f(K) is sequentially compact.

1. Let v_n ∈ f(K). Let x_n ∈ K s.t. f(x_n) = v_n.
2. Since K is compact, ∃ a convergent subsequence, say,

x_{n_k} → x_0 ∈ K.

3. Since f is continuous, v_{n_k} = f(x_{n_k}) → f(x_0) ∈ f(K). This proves that f(K) is sequentially compact.

Examples
Let f : R^2 → R be a continuous map. Denote x = (x_1, x_2).
• Let f(x) = x_1 for x ∈ R^2. If K ⊂ R^2 is compact, so is f(K) = {x_1 : x = (x_1, x_2) ∈ K}. (Why? Since f is continuous and K is compact, f(K) is compact.)
• Let f(x) = 7 for x ∈ R^2. The set {7} is compact, while R^2 = f^{−1}({7}) is not compact.
• The set A = {f(x) : ‖x‖ = 1} is a closed interval. (Why? K = {x ∈ R^2 : ‖x‖ = 1} is compact and connected. Hence, A = f(K) is compact and connected.)

Theorem.

(1) Let f : A ⊂ M → N and g : A ⊂ M → N be continuous at x_0. Then
• f ± αg is continuous at x_0 for any α ∈ R.
• fg is continuous at x_0 (for real-valued f, g, i.e. N = R).
• f/g is continuous at x_0 if g(x_0) ≠ 0 (for real-valued f, g).
(2) Suppose f : A ⊂ M → N and h : B ⊂ N → R^p are continuous and f(A) ⊂ B. Then h ◦ f : A ⊂ M → R^p is also continuous.

Proof. EASY

Theorem. (4.4.1: Maximum-Minimum Principle) Let f : A ⊂ M → R be continuous and let K be a non-empty compact subset of A. Then,
• f(K) is bounded.
• ∃x_0, y_0 ∈ K such that

f(x_0) = inf f(K) = inf_{x∈K} f(x) & f(y_0) = sup f(K) = sup_{x∈K} f(x).

Proof. Since K is compact and f is continuous on K ⊂ A, f(K) is compact. Hence, f(K) is closed and bounded in R by the Heine-Borel thm. Since f(K) is bounded, inf f(K) and sup f(K) are finite; since f(K) is closed, both belong to f(K). This completes the proof.

Theorem. (4.5.1: Intermediate Value Theorem) Let f : A ⊂ M → R be continuous. Assume K is a connected subset of A and x, y ∈ K and f(x) < f(y). Then,
• For every number c ∈ R such that f(x) < c < f(y),

∃z ∈ K s.t. f(z) = c

Proof. Since K is connected and f is continuous on K ⊂ A, f(K) is connected. A connected subset of R is an interval, hence [f(x), f(y)] ⊂ f(K). ∴ ∃z ∈ K s.t. f(z) = c. This completes the proof.
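In the special case K = [a, b] ⊂ R, the point z promised by the theorem can be located by repeated bisection. The sketch below is an illustration under stated assumptions, not the proof above (which uses connectedness): `bisect_level`, the tolerance, and the sample function are hypothetical names and choices introduced here, and f is assumed continuous with f(a) < c < f(b).

```python
def bisect_level(f, a, b, c, tol=1e-10):
    """Approximate some z in [a, b] with f(z) = c by bisection.

    Assumptions: f is continuous on [a, b] and f(a) < c < f(b).
    The loop keeps the invariant f(lo) < c <= f(hi).
    """
    if not f(a) < c < f(b):
        raise ValueError("need f(a) < c < f(b)")
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < c:
            lo = mid   # the level c is reached to the right of mid
        else:
            hi = mid   # the level c is reached at or to the left of mid
    return (lo + hi) / 2


# Illustrative use: solve x^3 = 5 on [0, 2].
z = bisect_level(lambda x: x ** 3, 0.0, 2.0, 5.0)
print(z, z ** 3)
```

Bisection returns only one such z; the theorem does not claim uniqueness, and for a non-monotone f there may be several.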

4.6 Uniform continuity

Throughout this section, we assume that f : A ⊂ R^n → R^m is continuous.
• Definition. Let B ⊂ A. f is uniformly continuous on B if for every ε > 0, there is δ > 0 s.t.

‖x − y‖ < δ & x, y ∈ B ⇒ ‖f(x) − f(y)‖ < ε.

• Example. Consider f : R → R, f(x) = x^2. Then f is continuous on R, but it is not uniformly continuous. Why? Let x_n = n + 1/n and y_n = n. Then |x_n − y_n| = 1/n → 0, while |f(x_n) − f(y_n)| = 2 + 1/n^2 ≥ 1.
• Example. Consider f : (0, 1) → R, f(x) = 1/x. Then f is continuous on (0, 1), but it is not uniformly continuous. Why? Let x_n = 1/n for n ≥ 2. Then |x_{n+1} − x_n| < 1/n → 0, while |f(x_{n+1}) − f(x_n)| = 1.
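The first counterexample is easy to tabulate. The sketch below evaluates the pair x_n = n + 1/n, y_n = n for f(x) = x^2 and shows the input gap shrinking while the output gap stays above 2; the function name `gaps` is an assumption made for this illustration, and the printed values are floating-point approximations.

```python
def f(x):
    """The example function f(x) = x^2."""
    return x * x


def gaps(n):
    """Return (|x_n - y_n|, |f(x_n) - f(y_n)|) for x_n = n + 1/n and y_n = n."""
    x_n, y_n = n + 1 / n, n
    return abs(x_n - y_n), abs(f(x_n) - f(y_n))


# No single delta works for epsilon = 1: the inputs get closer, the outputs do not.
for n in (1, 10, 100, 1000):
    input_gap, output_gap = gaps(n)
    print(n, input_gap, output_gap)
```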

Theorem. (Uniform Continuity Theorem) Let f : A ⊂ R^n → R^m be continuous and let K ⊂ A be compact. Then f is uniformly continuous on K.

Proof.
1. Let ε > 0 be given. Since f is continuous on K, for each x ∈ K, ∃δ_x > 0 s.t. f(D(x, δ_x) ∩ K) ⊂ D(f(x), ε/2).
2. Since K ⊂ ∪_{x∈K} D(x, δ_x/2) and K is compact, ∃{x_1, ··· , x_N} ⊂ K s.t. K ⊂ ∪_{j=1}^N D(x_j, δ_{x_j}/2). Set δ = (1/2) min{δ_{x_1}, ··· , δ_{x_N}}.
3. If ‖x − y‖ < δ, x, y ∈ K, then ∃x_j s.t. ‖x − x_j‖ < δ_{x_j}/2. Since

‖y − x_j‖ ≤ ‖y − x‖ + ‖x − x_j‖ < δ + δ_{x_j}/2 ≤ δ_{x_j},

‖f(x) − f(y)‖ ≤ ‖f(x) − f(x_j)‖ + ‖f(x_j) − f(y)‖ < ε/2 + ε/2 = ε.

Chapter 5. Uniform convergence

This chapter deals with very important results in physical science:
• a basic iteration technique called the contraction mapping principle (5.7.1)
• some applications to differential and integral equations and some problems in control theory. (5.7.2, 5.7.3, 5.7.10)
To study such results, we need
• compactness in a function space (5.5.3)
• uniform convergence, equi-continuity (5.6.2)

Definition. (Pointwise convergence & Uniform Convergence) Let N be a metric space with the metric ρ, A a set, and f_k : A → N, k = 1, 2, ···

• f_k → f pointwise if for each x ∈ A, lim_{k→∞} f_k(x) = f(x), i.e.

∀x ∈ A, lim_{k→∞} ρ(f_k(x), f(x)) = 0

• f_k → f uniformly if lim_{k→∞} sup_{x∈A} ρ(f_k(x), f(x)) = 0, i.e.

∀ε > 0, ∃N s.t. k > N ⇒ sup_{x∈A} ρ(f_k(x), f(x)) < ε

Examples:
• f_k(x) = x^k → 0 pointwise in (0, 1). (Why?)
• f_k(x) = x^k does NOT converge to 0 uniformly in (0, 1).
• Show that f_n(x) = x^n/(1 + x^n) converges pointwise on [0, 2] but that the convergence is not uniform.

Definition. (5.1.3: Does Σ_k g_k make sense?) Denote f_n(x) = Σ_{k=1}^n g_k(x).
• Σ_k g_k = f (pointwise) if f_n → f pointwise.
• Σ_k g_k = f uniformly if f_n → f uniformly.

Examples.
• Σ_{k=0}^∞ (−1)^k x^{2k+1}/(2k+1)! = sin x uniformly in the interval [−100, 100].
• Σ_k x^k = 1/(1 − x) converges uniformly in [−0.9, 0.9].
• Σ_k x^k = 1/(1 − x) converges pointwise (NOT uniformly) in (−1, 1).
• Σ_k x^k does not converge in R \ (−1, 1).
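The contrast between the second and third series examples can be seen by measuring the sup error of the partial sums of the geometric series on [−r, r]. The sketch below is illustrative: `sup_error`, the grid size, and the chosen values of r and n are assumptions made here, and a grid maximum only approximates the true supremum. Since the error at x equals |x|^{n+1}/(1 − x), it is largest at x = r.

```python
def partial_sum(x, n):
    """Partial sum 1 + x + ... + x^n of the geometric series."""
    return sum(x ** k for k in range(n + 1))


def sup_error(n, r, samples=2001):
    """Grid approximation of sup{|1/(1 - x) - partial_sum(x, n)| : |x| <= r}.

    Assumption: 0 < r < 1, so that 1/(1 - x) is defined on the whole grid.
    """
    grid = [-r + 2 * r * i / (samples - 1) for i in range(samples)]
    return max(abs(1 / (1 - x) - partial_sum(x, n)) for x in grid)


# On [-0.9, 0.9] the sup error tends to 0; close to the endpoint 1 it stays large.
for n in (50, 200):
    print(n, sup_error(n, 0.9), sup_error(n, 0.999))
```

For fixed n the sup error over [−r, r] grows without bound as r → 1, which is why the convergence on (−1, 1) is only pointwise.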

The Weierstrass M-test

Theorem. (5.2.1: Cauchy Criterion) Let V be a complete normed vector space with norm ‖·‖, and let A be a set. Let f_k : A → V be a sequence of functions. Then f_k converges uniformly on A iff

∀ε > 0, ∃N s.t. l, k > N ⇒ sup_{x∈A} ‖f_k(x) − f_l(x)‖ < ε

Proof of ⇒.

1. Assume f_k → f uniformly. Let ε > 0 be given.
2. Then ∃N s.t. k > N ⇒ ‖f_k − f‖ = sup_{x∈A} ‖f_k(x) − f(x)‖ < ε/2.
3. Hence, l, k > N ⇒ ‖f_k − f_l‖ ≤ ‖f_k − f‖ + ‖f_l − f‖ < ε/2 + ε/2 = ε.

Theorem. (5.2.1: Continue...)

Proof of ⇐.

1. From the assumption, f_k(x) is a Cauchy sequence in V for every x ∈ A.
2. Hence, since V is complete, for all x ∈ A, lim_k f_k(x) exists and we can define f(x) = lim_k f_k(x).
3. Let ε > 0 be given. From the assumption, ∃N s.t. l, k > N ⇒ sup_{x∈A} ‖f_k(x) − f_l(x)‖ < ε/2.
4. From 2, ∀x ∈ A, ∃N_x s.t. l > N_x ⇒ ‖f(x) − f_l(x)‖ < ε/2.
5. From 3 and 4, if k > N and x ∈ A, then ‖f_k(x) − f(x)‖ ≤ ‖f_k(x) − f_l(x)‖ + ‖f_l(x) − f(x)‖ < ε/2 + ε/2 = ε, where l is any index with l > max{N, N_x}.
6. From 5, k > N ⇒ sup_{x∈A} ‖f_k(x) − f(x)‖ ≤ ε.

Theorem. (5.2.2: Weierstrass M test) Let V be a complete normed vector space with norm ‖·‖, and let A be a set. Suppose that g_k : A → V are functions such that sup_{x∈A} ‖g_k(x)‖ < M_k and Σ_{k=1}^∞ M_k < ∞. Then Σ_{k=1}^∞ g_k converges uniformly.

Proof.
1. Denote f_n(x) = Σ_{k=1}^n g_k(x).
2. Then ‖f_n(x) − f_{n+ℓ}(x)‖ = ‖Σ_{k=n+1}^{n+ℓ} g_k(x)‖ ≤ Σ_{k=n+1}^{n+ℓ} M_k.
3. Since lim_{n→∞} Σ_{k=n+1}^∞ M_k = 0, it follows from 2 and Theorem 5.2.1 that f_n converges uniformly.
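Step 2 of the proof can be checked on a concrete series. The sketch below uses g_k(x) = sin(kx)/k^2 on [−π, π] with the bound M_k = 1/k^2; this series, the grid, and all function names are assumptions chosen for illustration and do not come from the notes. It compares the sampled sup of |f_{n+ℓ}(x) − f_n(x)| with the tail sum of the M_k.

```python
import math


def g(k, x):
    """Illustrative term g_k(x) = sin(kx)/k^2, bounded by M_k = 1/k^2."""
    return math.sin(k * x) / k ** 2


def partial_sum(n, x):
    """f_n(x) = g_1(x) + ... + g_n(x)."""
    return sum(g(k, x) for k in range(1, n + 1))


def tail_bound(n, ell):
    """M_{n+1} + ... + M_{n+ell}, the bound from step 2 of the proof."""
    return sum(1 / k ** 2 for k in range(n + 1, n + ell + 1))


def sup_gap(n, ell, samples=401):
    """Grid approximation of sup |f_{n+ell}(x) - f_n(x)| over [-pi, pi]."""
    grid = [-math.pi + 2 * math.pi * i / (samples - 1) for i in range(samples)]
    return max(abs(partial_sum(n + ell, x) - partial_sum(n, x)) for x in grid)


for n, ell in ((5, 20), (50, 50)):
    print(n, ell, sup_gap(n, ell), tail_bound(n, ell))
```

The bound on the right does not depend on x, which is exactly what turns pointwise control of each term into uniform control of the partial sums.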

5.5 The space of continuous functions

Throughout this section, we assume M = R^m, A ⊂ M, and N = R^n. (N, M: complete normed spaces)
• Denote C(A, N) = {f : f : A → N is continuous}. Then C(A, N) is a vector space.
• For f ∈ C(A, N), f is said to be bounded if there is a constant C such that ‖f(x)‖ < C for all x ∈ A.
• Denote C_b(A, N) = {f ∈ C(A, N) : f is bounded}.
• Define ‖f‖ = sup_{x∈A} ‖f(x)‖.
• ‖f‖ is a measure of the size of f and is called the norm of f.

Theorem. (5.5.1-3: C_b(A, N) is a complete normed space) Let A ⊂ M = R^m, N = R^n. The set C_b(A, N) is a complete normed space equipped with the norm ‖f‖ = sup_{x∈A} ‖f(x)‖; that is,
1. C_b(A, N) is a normed space.
   • ‖f‖ ≥ 0 and ‖f‖ = 0 iff f = 0.
   • ‖αf‖ = |α|‖f‖ for α ∈ R, f ∈ C_b.
   • ‖f + g‖ ≤ ‖f‖ + ‖g‖.
2. Completeness: Every Cauchy sequence {f_k} in C_b(A, N) converges to a function f ∈ C_b(A, N), that is,

lim_{k→∞} ‖f_k − f‖ = lim_{k→∞} sup_{x∈A} ‖f_k(x) − f(x)‖ = 0.

Proof.
• Clearly, C_b(A, N) is a normed space. (EASY!)
• From the definition, f_k → f uniformly iff f_k → f in C_b.
• From the Cauchy criterion (Theorem 5.2.1), a Cauchy sequence in C_b(A, N) converges uniformly to some f; a uniform limit of bounded continuous functions is bounded and continuous, so C_b(A, N) is complete.

Examples
• Let B = {f ∈ C([0, 1], R) : f(x) > 0 for all x ∈ [0, 1]}. Show that B is open in C([0, 1], R).
Proof.
1. In order to prove that B is open, we must show that ∀f ∈ B, ∃ε > 0 s.t. D(f, ε) ⊂ B.
2. Let f ∈ B. Since [0, 1] is compact, f has a minimum value, say m > 0, at some point in [0, 1]. Hence, inf_{x∈[0,1]} f(x) = m.
3. Let ε = m/2. We will show D(f, ε) ⊂ B. If g ∈ D(f, ε), then ‖g − f‖ < ε, and

∴ g(x) ≥ f(x) − |g(x) − f(x)| ≥ f(x) − ‖f − g‖ > m − ε = m/2 for all x ∈ [0, 1].

Hence, g ∈ B. ∴ D(f, ε) ⊂ B.

• Prove that cl(B) is D = {f ∈ C_b : inf_{x∈[0,1]} f(x) ≥ 0}.
Proof.
1. cl(D) = D because if f_n ∈ D and f_n → f uniformly, then f_n(x) → f(x) pointwise and ∴ inf_{x∈[0,1]} f(x) ≥ 0.
2. If f ∈ D, then f_n(x) := f(x) + 1/n ∈ B and ‖f_n − f‖ = 1/n → 0. ∴ B ⊂ D ⊂ cl(B), and since D is closed, cl(B) = D.

Examples
• Consider a sequence f_n ∈ C_b such that ‖f_{n+1} − f_n‖ ≤ r_n, where Σ_n r_n is convergent. Prove that f_n converges.
Proof.
1. Let ε > 0 be given.
2. Since Σ_n r_n is convergent,

∃N s.t. n > N ⇒ Σ_{k=n}^∞ r_k < ε

3. Hence, if n > N, then for every k ≥ 1

‖f_{n+k} − f_n‖ = ‖Σ_{j=n}^{n+k−1} (f_{j+1} − f_j)‖ ≤ Σ_{j=n}^{n+k−1} ‖f_{j+1} − f_j‖ ≤ Σ_{j=n}^∞ r_j < ε

4. From 3, f_n is a Cauchy sequence in the complete space C_b, so it converges.

Arzela-Ascoli Theorem

Throughout this section, we assume that M = R^m, A ⊂ M, N = R^n (N, M: complete normed spaces).

Definition. (5.6.1: Equi-continuous) Assume B ⊂ C(A, N).
• We say that B is equi-continuous if

∀ε > 0, ∃δ > 0 s.t. ‖x − y‖ < δ & x, y ∈ A ⇒ sup_{f∈B} ‖f(x) − f(y)‖ < ε

• We say B is pointwise compact iff B_x = {f(x) : f ∈ B} is compact in N for each x ∈ A.

Example 5.6.4 (Compact sequence)

Let f_n ∈ C_b([0, 1], R) be such that f_n′ exists on (0, 1) and

sup_n ‖f_n‖ ≤ C & sup_n sup_{x∈(0,1)} |f_n′(x)| ≤ C

for a positive constant C. Prove that B := {f_n} is equi-continuous.

Proof.
• By the mean value theorem,

|f_n(x) − f_n(y)| ≤ sup_{z∈(0,1)} |f_n′(z)| |x − y| ≤ C|x − y|, for all n

• Hence, for given ε > 0, we can choose δ = ε/C and

|x − y| < δ & x, y ∈ [0, 1] ⇒ sup_n |f_n(x) − f_n(y)| ≤ C|x − y| < ε.

Hence, B := {f_n} is equi-continuous. (So, {f_n} has a uniformly convergent subsequence. Why? See the Arzela-Ascoli theorem.)

Theorem. (5.6.2: Arzela-Ascoli theorem) Let A be compact and B ⊂ C(A, N). If B is closed, equi-continuous, and pointwise compact, then B is compact, that is, any sequence f_n in B has a uniformly convergent subsequence.

The proof strategy is based on the Bolzano-Weierstrass property.

Theorem. (Special case of Arzela-Ascoli theorem) Let B ⊂ C([0, 1], R). If B is closed, equi-continuous, and bounded, then B is compact.

Proof.

1. Assume f_n is a sequence in B.
2. Denote C_{1/n} = {1/n, 2/n, ··· , (n−1)/n, 1}. Let C = ∪_n C_{1/n}.
3. Since C is countable, we can write C = {x_1, x_2, ···}.

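4. Since $B_{x_1} = \{f(x_1) : f \in B\}$ is bounded in $\mathbb{R}$, the Bolzano-Weierstrass property gives a convergent subsequence of $f_n(x_1)$. Let us denote this subsequence by

$$f_{11}(x_1),\ f_{12}(x_1),\ \cdots,\ f_{1k}(x_1),\ \cdots$$

5. Similarly, the sequence $f_{1k}(x_2)$ has a subsequence

$$f_{21}(x_2),\ f_{22}(x_2),\ \cdots,\ f_{2k}(x_2),\ \cdots$$

which is convergent.

6. We proceed in this way and then set $g_n = f_{nn}$.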

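Proof of Arzela-Ascoli theorem

7. $g_n = f_{nn}$ is obtained by picking out the diagonal of the array

$$\begin{array}{ccccccl}
f_{11} & f_{12} & f_{13} & \cdots & f_{1n} & \cdots & \text{(1st subseq.)} \\
f_{21} & f_{22} & f_{23} & \cdots & f_{2n} & \cdots & \text{(2nd subseq.)} \\
\vdots & & & & & & \\
f_{n1} & f_{n2} & f_{n3} & \cdots & f_{nn} & \cdots & \text{(n-th subseq.)}
\end{array}$$

8. By the construction of the diagonal process, $(g_n)_{n \ge i}$ is a subsequence of the $i$-th subsequence, so

$$\lim_{n \to \infty} g_n(x_i) \ \text{exists for all } x_i \in C.$$

9. Now, we are ready to prove

$$\|g_n - g_m\| = \sup_{x \in [0,1]} |g_n(x) - g_m(x)| \to 0 \quad \text{as } m, n \to \infty.$$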

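Continue...

9. Proof of $\lim_{n,m \to \infty} \sup_{x \in A} |g_n(x) - g_m(x)| = 0$.

a. Let $\varepsilon > 0$ be given.

b. From equi-continuity of $\{g_n\} \subset B$, we can choose $\delta$ s.t.

$$|x - y| < \delta \ \&\ x, y \in A = [0, 1] \ \Rightarrow\ \sup_n |g_n(x) - g_n(y)| < \varepsilon/3.$$

c. Choose an integer $L > \frac{1}{\delta}$. Since $C_{1/L}$ is finite, from 8,

$$\exists N \ \text{s.t.}\ n, m > N \ \Rightarrow\ \max_{x_i \in C_{1/L}} |g_n(x_i) - g_m(x_i)| < \frac{\varepsilon}{3}.$$

d. For each $x \in A$, there exists $y_j \in C_{1/L}$ s.t. $|x - y_j| < \delta$. Therefore, if $n, m > N$, then

$$|g_n(x) - g_m(x)| \le |g_n(x) - g_n(y_j)| + |g_n(y_j) - g_m(y_j)| + |g_m(x) - g_m(y_j)| \le \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3}.$$

This proves $\lim_{n,m \to \infty} \sup_{x \in A} |g_n(x) - g_m(x)| = 0$.

Continue...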

10. From 9, $g_n$ is a Cauchy sequence in $C([0, 1], N)$.

11. Since $C([0, 1], N)$ is a complete normed space, $g_n$ converges to some $g \in C([0, 1], N)$.

12. Since $B$ is closed, $g \in B$.

13. From 1, 11, and 12, $B$ is sequentially compact, so it is compact.

♣ ♣ ♣ ♣ ♣ ♣ ♣

The proof of the Arzela-Ascoli theorem is exactly the same as the special case discussed above, except for step 2. To replace step 2, we use the fact that the compact set $A$ is totally bounded: for each $\delta > 0$, there exists a finite set $C_\delta = \{y_1, \cdots, y_k\}$ such that $A \subset \cup_{j=1}^{k} D(y_j, \delta)$.

5.7 The contraction mapping principle

Theorem. (5.7.1: Contraction mapping principle) Let $M$ be a complete normed space and $\Phi : M \to M$ a given mapping. Assume

$$\exists k \in [0, 1) \ \text{s.t.}\ \|\Phi(f) - \Phi(g)\| \le k \|f - g\| \quad \text{for all } f, g \in M.$$

Then there exists a unique fixed point $f_* \in M$ s.t. $\Phi(f_*) = f_*$. In fact, if $f_0 \in M$ and $f_{n+1} = \Phi(f_n)$, $n = 0, 1, 2, \cdots$, then

$$\lim_{n \to \infty} \|f_n - f_*\| = 0.$$

Key idea: $\Phi$ is shrinking distances:

$$\|f_{n+1} - f_n\| = \|\Phi(f_n) - \Phi(f_{n-1})\| \le k \|f_n - f_{n-1}\| \le \cdots \le k^n \|f_1 - f_0\|.$$

The proof of contraction mapping principle: $\exists f_* \in M$ s.t. $\Phi(f_*) = f_*$

1. Let $f_0 \in M$ and $f_{n+1} = \Phi(f_n)$, $n = 0, 1, 2, \cdots$.

2. $\|f_2 - f_1\| = \|\Phi(f_1) - \Phi(f_0)\| \le k \|f_1 - f_0\|$.

3. $\|f_3 - f_2\| = \|\Phi(f_2) - \Phi(f_1)\| \le k \|f_2 - f_1\| \le k^2 \|f_1 - f_0\|$.

4. Inductively, $\|f_{n+1} - f_n\| \le k^n \|f_1 - f_0\|$.

5. Hence, $\sum_{n=0}^{\infty} \|f_{n+1} - f_n\| \le \|f_1 - f_0\| \sum_{n=0}^{\infty} k^n = \|f_1 - f_0\| \frac{1}{1-k} < \infty$.

6. From 5 and the argument in Example 5.5.6, $f_n$ is a Cauchy sequence.

7. Since $M$ is complete, $\lim_{n \to \infty} f_n = f_*$ for some $f_* \in M$.

8. $\Phi$ is uniformly continuous because $\|\Phi(f) - \Phi(g)\| \le k \|f - g\|$.

9. From 8, $\lim_{n \to \infty} \Phi(f_n) = \Phi(f_*)$.

10. Hence, $f_* = \lim_{n \to \infty} f_{n+1} = \lim_{n \to \infty} \Phi(f_n) = \Phi(f_*)$.

The proof of contraction mapping principle: Uniqueness of the fixed point $f_*$

11. To prove the uniqueness, assume $g_*$ is another fixed point, i.e., $\Phi(g_*) = g_*$.

12. Then $f_* - g_* = \Phi(f_*) - \Phi(g_*)$ and

$$\|f_* - g_*\| = \|\Phi(f_*) - \Phi(g_*)\| \le k \|f_* - g_*\|.$$

Hence, $(1 - k)\|f_* - g_*\| \le 0$.

13. Since $0 \le k < 1$, it must be

$$\|f_* - g_*\| = 0.$$

Hence, $f_* = g_*$.
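The iteration $f_{n+1} = \Phi(f_n)$ from the proof can be tried out numerically. The following is a minimal sketch, not part of the original notes: it assumes the simplest setting $M = [0, 1] \subset \mathbb{R}$ with $\Phi = \cos$ (an illustrative choice; $\cos$ maps $[0,1]$ into itself and $|\cos'(x)| \le \sin 1 < 1$ there), and the helper name `fixed_point` is hypothetical.

```python
import math

def fixed_point(phi, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = phi(x_n) until two successive iterates differ by less than tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Illustrative contraction on [0, 1]: |cos'(x)| = |sin x| <= sin(1) < 1.
x_star, steps = fixed_point(math.cos, 0.5)
print(x_star, steps)
```

The stopping rule relies on the estimate from the proof: successive differences shrink geometrically, so a small difference between iterates signals that the limit is close.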

Theorem. (5.7.2: Existence of solutions of differential equations)
Let $A \subset \mathbb{R}^2$ be an open neighborhood of $(t_0, x_0)$. Assume $f : A \to \mathbb{R}$ is continuous and satisfies the following Lipschitz condition:

$$|f(t, x_1) - f(t, x_2)| \le K|x_1 - x_2| \quad \text{for all } (t, x_1), (t, x_2) \in A.$$

Then, there is a $\delta > 0$ s.t. the equation

$$\frac{dx(t)}{dt} = f(t, x), \quad x(t_0) = x_0$$

has a unique $C^1$-solution $x = \varphi(t)$ with $\varphi(t_0) = x_0$, for $t \in (t_0 - \delta, t_0 + \delta)$, i.e.,

$$\varphi'(t) = f(t, \varphi(t)) \ \text{for all } t \in (t_0 - \delta, t_0 + \delta) \quad \& \quad \varphi(t_0) = x_0.$$

($C^1$-solution = continuously differentiable solution.)

Get insight: Proof of Theorem 5.7.2

Before the proof, let us get some insight. Imagine that $\varphi$ is the solution of $\frac{dx(t)}{dt} = f(t, x)$, $x(t_0) = x_0$. Since $\varphi'(t) = f(t, \varphi(t))$ with $\varphi(t_0) = x_0$,

$$\varphi(t) = \varphi(t_0) + \int_{t_0}^{t} \varphi'(s)\,ds = x_0 + \int_{t_0}^{t} f(s, \varphi(s))\,ds.$$

Hence, $\varphi$ is a fixed point for the map $\Phi : M \to M$ defined by

$$\Phi(\varphi)(t) = x_0 + \int_{t_0}^{t} f(s, \varphi(s))\,ds.$$

In order to apply the contraction mapping principle, we need to choose a suitable space $M$. In practice, the solution $\varphi$ can be obtained from the following iterative method:

$$\varphi_{n+1}(t) = \Phi(\varphi_n)(t) = x_0 + \int_{t_0}^{t} f(s, \varphi_n(s))\,ds \quad \& \quad \varphi_0 = x_0.$$

Proof of Theorem 5.7.2

1. Let $L = \sup_{(t,x) \in \tilde{A}} |f(t, x)|$, where $\tilde{A}$ is a closed and bounded subset of $A$ containing a neighborhood of $(t_0, x_0)$. Since $f$ is continuous in $A$, $L < \infty$.

2. Choose $\delta$ such that $K\delta < 1$ and

$$\{(t, x) : |t - t_0| \le \delta,\ |x - x_0| \le L\delta\} \subset \tilde{A}.$$

3. Denote $C = C([t_0 - \delta, t_0 + \delta], \mathbb{R})$. From Theorem 5.5.3, $C$ is a complete normed space (or Banach space) with norm $\|\varphi\| = \sup_{t \in [t_0 - \delta, t_0 + \delta]} |\varphi(t)|$.

4. Let

$$M = \{\varphi \in C : \varphi(t_0) = x_0 \ \&\ |\varphi(t) - x_0| \le L\delta\}.$$

5. Then, $M$ is also complete. (Why? $M$ is a closed subset of $C$ w.r.t. the norm $\|\cdot\|$, and the proof of Theorem 5.7.1 uses only completeness.)

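Proof of Theorem 5.7.2

5. Define $\Phi : M \to C$ by (please find its motivation from the previous slide)

$$\Phi(\varphi)(t) = x_0 + \int_{t_0}^{t} f(s, \varphi(s))\,ds.$$

6. Claim: $\varphi \in M \Rightarrow \Phi(\varphi) \in M$.
Proof. Let $\varphi \in M$ and $\psi = \Phi(\varphi)$.

• $\psi(t_0) = x_0$ and $\psi \in C$ because

$$\lim_{h \to 0} |\psi(t + h) - \psi(t)| = \lim_{h \to 0} \left| \int_{t}^{t+h} f(s, \varphi(s))\,ds \right| \le \lim_{h \to 0} L|h| = 0.$$

• From 1,

$$|t - t_0| \le \delta \ \Rightarrow\ |\psi(t) - x_0| = \left| \int_{t_0}^{t} f(s, \varphi(s))\,ds \right| \le L|t - t_0| \le L\delta.$$

• Hence, $\psi \in M$.

7. From 6, $\Phi$ maps $M$ to $M$. See the condition of Theorem 5.7.1.

Proof of Theorem 5.7.2

8. Using the Lipschitz condition,

$$\|\Phi(\varphi_1) - \Phi(\varphi_2)\| = \sup_{t \in [t_0 - \delta, t_0 + \delta]} \left| \int_{t_0}^{t} \big( f(s, \varphi_1(s)) - f(s, \varphi_2(s)) \big)\,ds \right| \le \sup_{t \in [t_0 - \delta, t_0 + \delta]} \left| \int_{t_0}^{t} K|\varphi_1(s) - \varphi_2(s)|\,ds \right| \le \delta K \|\varphi_1 - \varphi_2\|.$$

9. Since $\delta K < 1$,

$$\|\Phi(\varphi_1) - \Phi(\varphi_2)\| \le k \|\varphi_1 - \varphi_2\|, \quad k = \delta K \in [0, 1).$$

10. From 5.7.1, $\exists$ unique $\varphi_* \in M$ s.t. $\Phi(\varphi_*) = \varphi_*$.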

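Theorem. (5.7.3: Fredholm equation) Assume that $K(x, y)$ is continuous on $[a, b] \times [a, b]$ and

$$M = \sup_{x, y \in [a,b]} |K(x, y)|.$$

If $|\lambda| M |b - a| < 1$, then the following Fredholm equation has a unique solution in $C([a, b], \mathbb{R})$:

$$f(x) = \lambda \int_a^b K(x, y) f(y)\,dy + \varphi(x), \quad x \in [a, b],$$

where $\lambda \in \mathbb{R}$, $\varphi \in C([a, b], \mathbb{R})$.

Proof. For $f \in C([a, b], \mathbb{R})$, we define

$$(\Phi(f))(x) = \lambda \int_a^b K(x, y) f(y)\,dy + \varphi(x).$$

Proof of 5.7.3

1. Claim: $\Phi$ maps from $C([a, b], \mathbb{R})$ to $C([a, b], \mathbb{R})$.
Proof. Let $f \in C([a, b], \mathbb{R})$. We need to show that $\Phi(f)$ is continuous. Since $\varphi$ is continuous, it suffices to show that $x \mapsto \lambda \int_a^b K(x, y) f(y)\,dy$ is continuous. Let $\varepsilon > 0$ be given.
• Since $[a, b] \times [a, b]$ is compact, $K(x, y)$ is uniformly continuous.
• Hence, $\exists \delta$ s.t. $|x_1 - x_2| < \delta$ and $(x_1, y), (x_2, y) \in [a, b] \times [a, b]$ imply $|K(x_1, y) - K(x_2, y)| < \frac{\varepsilon}{|\lambda| \|f\| |b - a| + 1}$.
• If $|x_1 - x_2| < \delta$ and $x_1, x_2 \in [a, b]$, then

$$\left| \lambda \int_a^b \big( K(x_1, y) - K(x_2, y) \big) f(y)\,dy \right| \le |\lambda| \int_a^b |K(x_1, y) - K(x_2, y)|\,|f(y)|\,dy \le \frac{|\lambda| \|f\| |b - a|\, \varepsilon}{|\lambda| \|f\| |b - a| + 1} < \varepsilon.$$

2. Set $k = |\lambda| M |b - a|$. Then $k < 1$ and

$$\|\Phi(f) - \Phi(g)\| = \sup_{x \in [a,b]} \left| \lambda \int_a^b K(x, y)(f(y) - g(y))\,dy \right| \le k \|f - g\|.$$

3. From 5.7.1, $\exists$ unique $f_* \in C([a, b], \mathbb{R})$ s.t. $\Phi(f_*) = f_*$.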
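The contraction argument also suggests a way to approximate the solution: iterate $\Phi$ on a grid. The sketch below is only an illustration under stated assumptions, not the notes' own method: the integral is replaced by a midpoint rule with `n` nodes, the function name `solve_fredholm` is hypothetical, and the sample data $K(x,y) = xy$, $\varphi(x) = x$, $\lambda = 1/2$ on $[0,1]$ are chosen because the exact solution $f(x) = 1.2\,x$ can be found by hand.

```python
def solve_fredholm(kernel, phi, lam, a, b, n=200, tol=1e-12, max_iter=500):
    """Iterate f <- lam * integral(K(x, y) f(y) dy) + phi(x) on midpoint nodes."""
    h = (b - a) / n
    ys = [a + (j + 0.5) * h for j in range(n)]   # midpoint quadrature nodes
    f = [phi(x) for x in ys]                     # starting guess f_0 = phi
    for _ in range(max_iter):
        f_new = [
            lam * h * sum(kernel(x, y) * fy for y, fy in zip(ys, f)) + phi(x)
            for x in ys
        ]
        diff = max(abs(u - v) for u, v in zip(f_new, f))
        f = f_new
        if diff < tol:
            return ys, f
    raise RuntimeError("no convergence within max_iter iterations")

# Sample data (assumed for illustration): |lam| * M * |b - a| = 0.5 < 1.
nodes, values = solve_fredholm(lambda x, y: x * y, lambda x: x, 0.5, 0.0, 1.0)
print(max(abs(v - 1.2 * x) for x, v in zip(nodes, values)))
```

The printed value is the largest deviation from $1.2\,x$ at the nodes; it reflects the quadrature error, not a failure of the iteration.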

Theorem. (5.7.4: Volterra integral equation) Assuming $K(x, y)$ is continuous on $[a, b] \times [a, b]$, the Volterra integral equation $f(x) = \lambda \int_a^x K(x, y) f(y)\,dy + \varphi(x)$ has a unique solution $f(x)$ for any $\lambda$.

Proof. For $f \in C([a, b], \mathbb{R})$, we define $(\Phi(f))(x) = \lambda \int_a^x K(x, y) f(y)\,dy + \varphi(x)$.

1. As in 5.7.3, $\Phi$ maps from $C([a, b], \mathbb{R})$ to $C([a, b], \mathbb{R})$.

2. Let $M = \sup_{x, y \in [a,b]} |K(x, y)|$. Then,

$$|\Phi(f)(x) - \Phi(g)(x)| = |\lambda| \left| \int_a^x K(x, y)(f(y) - g(y))\,dy \right| \le |\lambda|\,|x - a|\,M \|f - g\|.$$

Proof of 5.7.4

3. From 2,

$$|\Phi^2(f)(x) - \Phi^2(g)(x)| = |\lambda| \left| \int_a^x K(x, y)\big(\Phi(f)(y) - \Phi(g)(y)\big)\,dy \right| \le |\lambda| \int_a^x M\,|y - a|\,|\lambda| M \|f - g\|\,dy \le |\lambda|^2 M^2 \frac{|b - a|^2}{2!} \|f - g\|.$$

4. Inductively, we have

$$\|\Phi^n(f) - \Phi^n(g)\| \le \frac{|\lambda|^n M^n |b - a|^n}{n!} \|f - g\|.$$

5. By the ratio test, $\sum_n \frac{|\lambda|^n M^n |b - a|^n}{n!}$ converges, so its terms tend to 0.

6. Hence, we can choose $N$ so that $\frac{|\lambda|^N M^N |b - a|^N}{N!} < 1$. Therefore $\Phi^N$ is a contraction!

Proof of 5.7.4

7. From 6, $\exists$ unique $f_* \in C([a, b], \mathbb{R})$ s.t. $\Phi^N(f_*) = f_*$.

8. From 7, $\Phi^{N+1}(f_*) = \Phi(f_*)$, i.e., $\Phi^N(\Phi(f_*)) = \Phi(f_*)$.

9. From 8, $\Phi(f_*)$ is a fixed point of $\Phi^N$.

10. From 7, 9, and the uniqueness of the fixed point, it must be $f_* = \Phi(f_*)$. Moreover, every fixed point of $\Phi$ is a fixed point of $\Phi^N$, so $f_*$ is the only one.

What a CUTE IDEA this is!
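To see how large the power $N$ in step 6 has to be, one can simply tabulate the bound $\frac{(|\lambda| M |b-a|)^N}{N!}$. The helper below is a small sketch; the name `contraction_power` and the sample value $|\lambda| M |b - a| = 5$ are illustrative assumptions.

```python
import math

def contraction_power(lam, M, length):
    """Smallest N >= 1 with (|lam| * M * length)**N / N! < 1, together with that bound."""
    c = abs(lam) * M * length
    N, bound = 1, c
    while bound >= 1:
        N += 1
        bound *= c / N   # update c**N / N! from c**(N-1) / (N-1)!
    return N, bound

N, bound = contraction_power(5.0, 1.0, 1.0)
print(N, bound)
```

For $|\lambda| M |b-a| < 1$ the function returns $N = 1$, which is the Fredholm situation of Theorem 5.7.3; larger products need a higher power before the factorial wins.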

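Examples

• Example 5.7.5. Let $\Phi : \mathbb{R} \to \mathbb{R}$ be defined by $\Phi(x) = x + 1$. Then $|\Phi(x) - \Phi(y)| = |x - y|$, so $|\Phi(x) - \Phi(y)| \le k|x - y|$ fails for every $k \in [0, 1)$ when $x \ne y$, and $\Phi$ has no fixed point.

• Example 5.7.6. Solve $x'(t) = x(t)$, $x(0) = 1$.
Solution. Let $\Phi(\varphi)(t) = 1 + \int_0^t \varphi(s)\,ds$. Let $\varphi_0 = 1$ and $\varphi_{n+1} = \Phi(\varphi_n)$, $n = 0, 1, \cdots$. Then $\varphi_n(t) = \sum_{k=0}^{n} \frac{1}{k!} t^k$. Hence, $\varphi_n(t) \to e^t$.

• Example 5.7.7. Solve $x'(t) = t\,x(t)$ for $t$ near 0 and $x(0) = 3$.
Solution. Let $\Phi(\varphi)(t) = 3 + \int_0^t s\,\varphi(s)\,ds$. Let $\varphi_0 = 3$ and $\varphi_{n+1} = \Phi(\varphi_n)$, $n = 0, 1, \cdots$. Then $\varphi_n(t) = 3 \sum_{k=0}^{n} \frac{1}{k!} \left( \frac{t^2}{2} \right)^k$. Hence, $\varphi_n(t) \to 3e^{t^2/2}$.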
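The Picard iterates in Examples 5.7.6 and 5.7.7 are polynomials, so they can be computed exactly. The following sketch does this with rational coefficients; the representation (a list whose `k`-th entry is the coefficient of $t^k$) and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def integrate(poly):
    """Antiderivative vanishing at 0; poly[k] is the coefficient of t**k."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def times_t(poly):
    """Multiply a polynomial by t."""
    return [Fraction(0)] + list(poly)

def picard(x0, multiply_by_t, n):
    """n Picard steps for x' = x (multiply_by_t=False) or x' = t*x (True), with x(0) = x0."""
    phi = [Fraction(x0)]                       # phi_0 is the constant x0
    for _ in range(n):
        integrand = times_t(phi) if multiply_by_t else phi
        phi = integrate(integrand)
        phi[0] += x0                           # add the initial value
    return phi

print(picard(1, False, 3))   # coefficients of 1 + t + t^2/2 + t^3/6
print(picard(3, True, 2))    # coefficients of 3 + (3/2) t^2 + (3/8) t^4
```

The coefficients are the partial sums of the series for $e^t$ and $3e^{t^2/2}$, as claimed in the examples.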

Examples

• Example 5.7.5. Consider the integral equation

$$f(x) = a + \int_0^x x e^{-xy} f(y)\,dy.$$

Check directly on which intervals $[0, r]$ we get a contraction.

Solution. Let $K(x, y) = x e^{-xy}$ and let $\Phi(f)(x) = a + \int_0^x x e^{-xy} f(y)\,dy$. Then

$$\|\Phi(f) - \Phi(g)\| = \sup_{x \in [0,r]} \left| \int_0^x K(x, y)(f(y) - g(y))\,dy \right| \le \sup_{x \in [0,r]} \left| \int_0^x K(x, y)\,dy \right| \|f - g\| = \sup_{x \in [0,r]} \left| 1 - e^{-x^2} \right| \|f - g\|.$$

Since $0 < 1 - e^{-r^2} < 1$ for any $r > 0$, $\Phi$ is a contraction on $[0, r]$ for any $r > 0$.

5.8 The Stone-Weierstrass Theorem

The aim of the Weierstrass theorem is to show that any continuous function can be uniformly approximated by a function that has more easily managed properties, such as a polynomial.

Theorem. (5.8.1: Weierstrass-Bernstein)

Let $f \in C([0, 1], \mathbb{R})$. There exists a sequence of polynomials $p_n$ such that $\lim_{n \to \infty} \|p_n - f\| = 0$. In fact,

$$p_n(x) = \sum_{k=0}^{n} \frac{n!}{k!(n - k)!} x^k (1 - x)^{n-k} f(k/n) \to f \quad \text{uniformly.}$$

• Meaning of $r_k(x) := \frac{n!}{k!(n-k)!} x^k (1 - x)^{n-k}$: Imagine a coin with probability $x$ of getting heads and, consequently, with probability $1 - x$ of getting tails. In $n$ tosses, the probability of getting exactly $k$ heads is $r_k(x)$.

Rough proof: Weierstrass-Bernstein

• $\sum_{k=0}^{n} r_k(x) = 1$ and $\sum_{k=0}^{n} (k/n - x)^2 r_k(x) = \frac{x(1 - x)}{n}$. Consequently,

$$\lim_{n \to \infty} \sum_{|\frac{k}{n} - x| > \delta} r_k(x) = 0, \quad \text{for any } \delta > 0$$

and

$$\lim_{n \to \infty} \sum_{|\frac{k}{n} - x| < \delta} r_k(x) = 1, \quad \text{for any } \delta > 0.$$

• Suppose that in a gambling game called n-tosses, $f(k/n)$ dollars is paid out when exactly $k$ heads turn up when $n$ tosses are made. The average amount (after a long evening of playing n-tosses) paid out when $n$ tosses are made is

$$p_n(x) = \sum_{k=0}^{n} r_k(x) f(k/n) \approx f(x).$$
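The Bernstein polynomial $p_n$ is easy to evaluate directly from its definition. The sketch below is illustrative only: the function names are hypothetical, and the test function $|x - \tfrac12|$ and the 101-point grid used to estimate the sup-norm error are assumptions chosen to show slow but visible uniform convergence.

```python
from math import comb

def bernstein(f, n, x):
    """Evaluate p_n(x) = sum_k C(n, k) x^k (1 - x)^(n - k) f(k / n)."""
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n) for k in range(n + 1))

def sup_error(f, n, grid_size=100):
    """Largest |p_n(x) - f(x)| over a uniform grid on [0, 1]."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)

kink = lambda t: abs(t - 0.5)      # continuous but not differentiable at 1/2
for n in (25, 100, 200):
    print(n, sup_error(kink, n))
```

For $f(x) = x^2$ the identity $\sum_k (k/n - x)^2 r_k(x) = x(1-x)/n$ gives $p_n(x) = x^2 + x(1-x)/n$ exactly, which is a convenient sanity check for the implementation.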

The Weierstrass-Bernstein theorem can be applied to $C([a, b], \mathbb{R})$ because

$$g \in C([a, b], \mathbb{R}) \ \Rightarrow\ f(x) = g(x(b - a) + a) \in C([0, 1], \mathbb{R}).$$

Theorem. (5.8.2: Stone-Weierstrass) Let $M$ be a metric space, $A \subset M$ a compact set, and suppose $B \subset C(A, \mathbb{R})$ satisfies the following:
1. $B$ is an algebra: $f, g \in B$ & $\alpha \in \mathbb{R}$ $\Rightarrow$ $f + g,\ fg,\ \alpha g \in B$.
2. $1 \in B$.
3. $\forall x, y \in A$, $x \ne y$, $\exists f \in B$ s.t. $f(x) \ne f(y)$.
Then $B$ is dense in $C(A, \mathbb{R})$, that is, $\bar{B} = C(A, \mathbb{R})$.

The proof is easy (just technical). I just provide a rough insight.

1. Since $B$ is an algebra, $f \in B \Rightarrow p_n(f) \in B$ for every polynomial $p_n$.
2. Assume that $A$ is a finite set. Then the proof is trivial.
3. Use the concept of a finite $\delta$-net for the compact set $A$.

Differentiable Mappings

Definition: Let $A$ be an open set in $\mathbb{R}^n$. A mapping $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is said to be differentiable at $x_0 \in A$ if there exists a linear function ($m \times n$ matrix) $Df(x_0) : \mathbb{R}^n \to \mathbb{R}^m$ such that

$$\lim_{x \to x_0} \frac{\|f(x) - f(x_0) - Df(x_0)(x - x_0)\|}{\|x - x_0\|} = 0.$$

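• Theorem 6.2.2. If $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is differentiable, then $\frac{\partial f_j}{\partial x_i}$ exist, and

$$Df(x) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \cdots & \cdots & \frac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \cdots & \frac{\partial f_m}{\partial x_n} \end{pmatrix} \quad \text{(called the Jacobian matrix).}$$

• 1-Dimension. If $f : (a, b) \to \mathbb{R}$ is differentiable at $x_0$, then there exists a number $m = f'(x_0)$ such that

$$\lim_{x \to x_0} \frac{|f(x) - f(x_0) - m(x - x_0)|}{|x - x_0|} = 0 \quad \text{or} \quad \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = m.$$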

Thm 6.1.2. If $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at $a$, then $f$ is continuous at $a$ and $Df(a)$ is uniquely determined.

Proof of uniqueness. Let $L_1$ and $L_2$ be two $m \times n$ matrices (or linear mappings) satisfying

$$\lim_{x \to a} \frac{\|f(x) - f(a) - L_1(x - a)\|}{\|x - a\|} = 0 = \lim_{x \to a} \frac{\|f(x) - f(a) - L_2(x - a)\|}{\|x - a\|}.$$

It suffices to prove that $\|L_1 e_j - L_2 e_j\| = 0$ for $j = 1, \cdots, n$. For $h \ne 0$,

$$\|L_1 e_j - L_2 e_j\| = \frac{1}{|h|} \|L_1(h e_j) - L_2(h e_j)\| = \frac{\|L_1(h e_j) - L_2(h e_j)\|}{\|h e_j\|}$$

$$= \frac{\|f(a + h e_j) - f(a) - L_1(h e_j) - [f(a + h e_j) - f(a) - L_2(h e_j)]\|}{\|h e_j\|}$$

$$\le \frac{\|f(a + h e_j) - f(a) - L_1(h e_j)\|}{\|h e_j\|} + \frac{\|f(a + h e_j) - f(a) - L_2(h e_j)\|}{\|h e_j\|}$$

$$\to 0 \quad \text{as } h \to 0.$$

• Proof of continuity: Since $\lim_{y \to a} \|f(y) - f(a) - Df(a)(y - a)\| = 0$ and $Df(a)(y - a) \to 0$, we get $\lim_{y \to a} \|f(y) - f(a)\| = 0$.

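Thm 6.2.2. Assume $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at $x$ and $Df(x) = [a_{ij}]$. Then, $\frac{\partial f_j}{\partial x_i}$ exist and $a_{ij} = \frac{\partial f_j}{\partial x_i}$.

Proof. Denote $e_1 = (1, 0, \cdots, 0)$, $e_2 = (0, 1, 0, \cdots, 0)$, ..., $e_n = (0, \cdots, 0, 1)$. We have

$$\lim_{y \to x} \frac{\|f(y) - f(x) - Df(x)(y - x)\|}{\|y - x\|} = 0$$

$$\Rightarrow\ \lim_{h \to 0} \frac{\|f(x + h e_i) - f(x) - Df(x)(h e_i)\|}{|h|} = 0, \quad i = 1, 2, \cdots, n$$

$$\Rightarrow\ \lim_{h \to 0} \frac{\sqrt{\sum_{j=1}^{m} |f_j(x + h e_i) - f_j(x) - a_{ij} h|^2}}{|h|} = 0, \quad i = 1, 2, \cdots, n$$

$$\Rightarrow\ \frac{\partial f_j}{\partial x_i} \ \text{exists and}\ a_{ij} = \frac{\partial f_j}{\partial x_i}.$$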
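Since the entries of $Df(x)$ are the partial derivatives, the Jacobian matrix can be approximated column by column with difference quotients in the directions $e_i$. The sketch below uses central differences; the function name `jacobian`, the step size `h`, and the polar-coordinate map used as sample input are assumptions made for illustration.

```python
import math

def jacobian(f, x, h=1e-6):
    """Central-difference approximation of the m-by-n Jacobian of f at the point x."""
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h                     # x + h e_i
        xm[i] -= h                     # x - h e_i
        fp, fm = f(xp), f(xm)
        for j in range(m):
            J[j][i] = (fp[j] - fm[j]) / (2 * h)   # approximates d f_j / d x_i
    return J

def polar(v):
    """Sample map (r, theta) -> (r cos theta, r sin theta)."""
    r, theta = v
    return [r * math.cos(theta), r * math.sin(theta)]

print(jacobian(polar, [2.0, 0.5]))
```

For the polar map the exact Jacobian has rows $(\cos\theta, -r\sin\theta)$ and $(\sin\theta, r\cos\theta)$, so the output can be compared against it.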

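Thm 6.4.1. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$. If each $\frac{\partial f_j}{\partial x_i}$ exists and is continuous on $A$, then $f$ is differentiable on $A$.

[Proof for the case $n = 2$, $m = 1$.] Let $Df(x) = \left[ \frac{\partial f}{\partial x_1}(x),\ \frac{\partial f}{\partial x_2}(x) \right]$, $x \in A$. From the mean value theorem,

$$f(y) - f(x) = f(y_1, y_2) - f(x_1, y_2) + f(x_1, y_2) - f(x_1, x_2) = \frac{\partial f}{\partial x_1}(u_1, y_2)(y_1 - x_1) + \frac{\partial f}{\partial x_2}(x_1, u_2)(y_2 - x_2)$$

for some $u_i$ between $x_i$ and $y_i$. Hence,

$$f(y) - f(x) - Df(x)(y - x) = \alpha (y_1 - x_1) + \beta (y_2 - x_2)$$

where $\alpha := \left[ \frac{\partial f}{\partial x_1}(u_1, y_2) - \frac{\partial f}{\partial x_1}(x) \right]$ and $\beta := \left[ \frac{\partial f}{\partial x_2}(x_1, u_2) - \frac{\partial f}{\partial x_2}(x) \right]$.

Due to continuity of $\frac{\partial f}{\partial x_1}$ and $\frac{\partial f}{\partial x_2}$, $\alpha \to 0$ & $\beta \to 0$ as $y \to x$ and

$$\frac{|f(y) - f(x) - Df(x)(y - x)|}{\|y - x\|} = \frac{|\alpha (y_1 - x_1) + \beta (y_2 - x_2)|}{\sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2}} \le \sqrt{\alpha^2 + \beta^2} \to 0$$

as $y \to x$. This proves that $\lim_{y \to x} \frac{\|f(y) - f(x) - Df(x)(y - x)\|}{\|y - x\|} = 0$.

Remark. About a differentiable map $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$.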

• The proof of Thm 6.4.1 for the general case $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is almost the same as the special case $f : A \subset \mathbb{R}^2 \to \mathbb{R}$.

• Intuitively, $x \mapsto f(x_0) + Df(x_0)(x - x_0)$ is supposed to be the best affine approximation to $f$ near $x_0$.

• It should be noticed that the existence of $\frac{\partial f_j}{\partial x_i}$ does not imply that the derivative $Df$ exists.

Directional Derivatives. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ be a real-valued function.

• Let $e \in \mathbb{R}^n$ be a unit vector. $\frac{d}{dt} f(x + te)\big|_{t=0} = \lim_{t \to 0} \frac{f(x + te) - f(x)}{t}$ is called the directional derivative of $f$ at $x$ in the direction $e$.

• If $f$ is differentiable, then $\lim_{t \to 0} \frac{f(x + te) - f(x)}{t} = Df(x) \cdot e$.

• Note that the existence of all directional derivatives at a point need not imply differentiability. Example. Let $f(x, y) = \frac{xy}{x^2 + y}$ for $x^2 \ne -y$ and $f(x, y) = 0$ if $x^2 = -y$. Note that $f$ is not continuous at $(0, 0)$, since $\lim_{t \to 0} f(t, t^3 - t^2) = \lim_{t \to 0} \frac{t(t^3 - t^2)}{t^2 + t^3 - t^2} = -1 \ne 0 = f(0, 0)$. But all directional derivatives of $f$ at $(0, 0)$ exist:

$$\lim_{t \to 0} \frac{f(ta, tb)}{t} = \lim_{t \to 0} \frac{1}{t} \cdot \frac{t^2 ab}{t^2 a^2 + tb} = a$$

for any unit vector $e = (a, b)$ with $b \ne 0$ (for $b = 0$ the quotient is identically 0, so the limit is 0).
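The two limits in this example can be checked numerically. The sketch below is only an illustration; the helper names `along_curve` and `directional_quotient` and the sample values of $t$, $a$, $b$ are assumptions.

```python
def f(x, y):
    """The example function: x*y / (x^2 + y), set to 0 where x^2 = -y."""
    denom = x * x + y
    return 0.0 if denom == 0 else x * y / denom

def along_curve(t):
    """Value of f along the curve (t, t^3 - t^2), which tends to (0, 0) as t -> 0."""
    return f(t, t**3 - t**2)

def directional_quotient(a, b, t):
    """Difference quotient f(t*a, t*b) / t at the origin in direction (a, b)."""
    return f(t * a, t * b) / t

print(along_curve(1e-3))                      # close to -1, not to f(0, 0) = 0
print(directional_quotient(0.6, 0.8, 1e-6))   # close to a = 0.6
```

Along the curve the exact value is $t - 1$, which is why the first number stays near $-1$ however small $t$ is.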

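Chain Rule 6.5.1: Let $A \subset \mathbb{R}^n$ be open and let $f : A \to \mathbb{R}^m$ be differentiable. Let $B \subset \mathbb{R}^m$ be open, $f(A) \subset B$, and $g : B \to \mathbb{R}^p$ be differentiable. Then $h = g \circ f$ is differentiable on $A$ and $Dh(x) = Dg(f(x)) Df(x)$:

$$D(g \circ f)(x) = \begin{pmatrix} \frac{\partial g_1}{\partial y_1} & \cdots & \frac{\partial g_1}{\partial y_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial g_p}{\partial y_1} & \cdots & \frac{\partial g_p}{\partial y_m} \end{pmatrix} \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \end{pmatrix}$$

Proof. Write

$$\spadesuit := (g \circ f)(x) - (g \circ f)(x_0) - Dg(f(x_0))(f(x) - f(x_0)), \qquad \clubsuit := f(x) - f(x_0) - Df(x_0)(x - x_0).$$

From the assumption, it is easy to see that

$$\lim_{x \to x_0} \frac{\|\spadesuit\|}{\|x - x_0\|} = 0 \quad \text{and} \quad \lim_{x \to x_0} \frac{\|\clubsuit\|}{\|x - x_0\|} = 0.$$

Since $(g \circ f)(x) - (g \circ f)(x_0) - Dg(f(x_0)) Df(x_0)(x - x_0) = \spadesuit + Dg(f(x_0)) \clubsuit$, it follows from the above identities that

$$\lim_{x \to x_0} \frac{\|(g \circ f)(x) - (g \circ f)(x_0) - Dg(f(x_0)) Df(x_0)(x - x_0)\|}{\|x - x_0\|} = 0.$$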

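Directional derivatives and examples

1. If $h(r, \theta) = f(r \cos\theta, r \sin\theta)$, then

$$\begin{pmatrix} \frac{\partial h}{\partial r} & \frac{\partial h}{\partial \theta} \end{pmatrix} = \begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{pmatrix} \begin{pmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{pmatrix}$$

2. Consider a surface $S$ defined by $f(x) = $ constant. Then $\nabla f(x)$ is orthogonal to this surface.
Proof. Let $c : [0, 1] \to \mathbb{R}^n$ be a curve lying on $S$ with $c(0) = x_0$. Then

$$0 = \frac{d}{dt} f(c(t)) = \nabla f(c(t)) \cdot c'(t).$$

This means that $\nabla f(c(t))$ is orthogonal to the tangent vector $c'(t)$. Since this is true for an arbitrary curve on $S$ passing through $x_0$, $\nabla f(x_0)$ is orthogonal to $S$ at $x_0$.

3. The direction of greatest rate of increase of $f$ at $x$ is $\nabla f(x)$.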

6.7.1. Mean Value Theorem. Suppose $f : A \subset \mathbb{R}^n \to \mathbb{R}$ is differentiable on an open set $A$. For any $x, y \in A$ such that the line segment joining $x$ and $y$ lies in $A$, there exists $c$ on that segment such that

$$f(y) - f(x) = Df(c) \cdot (y - x).$$

Proof. Define $h(t) = f((1 - t)x + ty)$. Then $\exists t_0 \in (0, 1)$ such that $h(1) - h(0) = h'(t_0)$ and therefore

$$f(y) - f(x) = h(1) - h(0) = h'(t_0) = Df(c) \cdot (y - x), \quad c := (1 - t_0)x + t_0 y.$$

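• Definition. A bilinear map $B : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ is given by an $n \times m$ matrix $[a_{ij}]$ such that

$$B(x, y) = \sum_{i,j} a_{ij} x_i y_j = (x_1, \cdots, x_n) \begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nm} \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}$$

• Definition 6.8.4. For a positive integer $r$, $f$ is said to be of class $C^r$ if all partial derivatives up to order $r$ exist and are continuous.

• Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ be of class $C^2$. Then

$$D^2 f(x) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_n} \end{pmatrix}$$

• If $D^2 f$ is continuous, $D^2 f$ is symmetric.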

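Taylor's Theorem 6.8.5. [Case: $f \in C^3$]. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ be of class $C^3$. Suppose $x \in A$ and $x + th \in A$ for $0 \le t \le 1$. Then $\exists c = x + t_0 h$, $0 < t_0 < 1$, such that

$$f(x + h) - f(x) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x) h_i + \frac{1}{2!} \sum_{i,j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(x) h_i h_j + \frac{1}{3!} \sum_{i,j,k=1}^{n} \frac{\partial^3 f}{\partial x_i \partial x_j \partial x_k}(x + t_0 h) h_i h_j h_k.$$

Proof.

$$\begin{aligned}
f(x + h) - f(x) &= \int_0^1 \frac{d}{dt} f(x + th)\,dt = \int_0^1 \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x + th) h_i\,dt \\
&= \sum_{i=1}^{n} \int_0^1 \frac{\partial f}{\partial x_i}(x + th) h_i \frac{d(t - 1)}{dt}\,dt \qquad \Big(\text{Why? } \frac{d(t-1)}{dt} = 1\Big) \\
&= \sum_{i=1}^{n} \left[ \frac{\partial f}{\partial x_i}(x) h_i - \int_0^1 \frac{d}{dt}\Big( \frac{\partial f}{\partial x_i}(x + th) h_i \Big)(t - 1)\,dt \right] \\
&= \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x) h_i + R_1(h, x)
\end{aligned}$$

where

$$R_1(h, x) = \sum_{i,j=1}^{n} \int_0^1 (1 - t) \frac{\partial^2 f}{\partial x_i \partial x_j}(x + th) h_i h_j\,dt.$$

Using $\frac{d}{dt}\Big( -\frac{(t-1)^2}{2!} \Big) = (1 - t)$ and integration by parts,

$$R_1(h, x) = -\sum_{i,j=1}^{n} \int_0^1 \frac{d}{dt}\Big( \frac{(t-1)^2}{2!} \Big) \frac{\partial^2 f}{\partial x_i \partial x_j}(x + th) h_i h_j\,dt = \frac{1}{2!} \sum_{i,j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(x) h_i h_j + R_2(h, x)$$

where

$$R_2(h, x) := \sum_{i,j,k=1}^{n} \int_0^1 \frac{(t - 1)^2}{2!} \frac{\partial^3 f}{\partial x_i \partial x_j \partial x_k}(x + th) h_i h_j h_k\,dt.$$

Recall the second mean value theorem for integrals (for $f \ge 0$ integrable and $g$ continuous):

$$\int_0^1 f(t) g(t)\,dt = g(t_0) \int_0^1 f(t)\,dt \quad \text{for some } 0 < t_0 < 1.$$

Hence, $\exists t_0$, $0 < t_0 < 1$, such that

$$R_2(h, x) = \sum_{i,j,k=1}^{n} \frac{\partial^3 f}{\partial x_i \partial x_j \partial x_k}(x + t_0 h) h_i h_j h_k \underbrace{\int_0^1 \frac{(t - 1)^2}{2!}\,dt}_{= \frac{1}{3!}}.$$

One can proceed by induction using the same method to get the general Taylor's theorem.

6.8.5. Taylor's Theorem [General Case: $f \in C^r$]. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ be of class $C^r$. Suppose $x \in A$ and $x + th \in A$ for $0 \le t \le 1$. Then

$$f(x + h) = f(x) + Df(x) \cdot h + \cdots + \frac{1}{(r-1)!} D^{r-1} f(x) \cdot (h, \cdots, h) + R_{r-1}(x, h)$$

where $R_{r-1}(x, h)$ is the remainder. Furthermore,

$$\frac{R_{r-1}(x, h)}{\|h\|^{r-1}} \to 0 \quad \text{as } h \to 0.$$

Another proof of Taylor's formula. Let $g(t) = f(x + th)$ for $t \in [0, 1]$. Applying the one-dimensional Taylor's formula, there exists $\tilde{t} \in (0, 1)$ such that

$$g(1) = g(0) + \sum_{k=1}^{r-1} \frac{1}{k!} g^{(k)}(0) + \frac{1}{r!} g^{(r)}(\tilde{t}).$$

Note that $R_{r-1}(x, h) = \frac{1}{r!} g^{(r)}(\tilde{t})$, $g(1) = f(x + h)$, $g(0) = f(x)$, and

$$g'(0) = Df(x) \cdot h = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x) h_i$$

$$g''(0) = D^2 f(x) \cdot (h, h) = \sum_{i,j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(x) h_i h_j$$

$$g'''(0) = D^3 f(x)(h, h, h) = \sum_{i,j,k=1}^{n} \frac{\partial^3 f}{\partial x_i \partial x_j \partial x_k}(x) h_i h_j h_k$$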

Theorem 6.9.2. If $f : A \subset \mathbb{R}^n \to \mathbb{R}$ is differentiable and $x_0 \in A$ is an extreme point for $f$, then $Df(x_0) = 0$.

Proof. Assume $Df(x_0) \ne 0$. We prove that $f(x_0)$ is not a local extreme value.

• Let $h = \frac{Df(x_0)}{\|Df(x_0)\|}$. Since $f$ is differentiable at $x_0$,

$$\lim_{\lambda \to 0} \frac{1}{|\lambda|} |f(x_0 + \lambda h) - f(x_0) - Df(x_0) \cdot (\lambda h)| = 0.$$

• Hence, (for given $\varepsilon = \frac{\|Df(x_0)\|}{2}$) there exists $\delta > 0$ such that

$$0 < |\lambda| < \delta \ \Rightarrow\ |f(x_0 + \lambda h) - f(x_0) - Df(x_0) \cdot (\lambda h)| < \frac{\|Df(x_0)\|}{2} |\lambda|.$$

Since $Df(x_0) \cdot h = \|Df(x_0)\|$, we have

$$-\frac{\|Df(x_0)\|}{2} |\lambda| < f(x_0 + \lambda h) - f(x_0) - \|Df(x_0)\| \lambda < \frac{\|Df(x_0)\|}{2} |\lambda|.$$

This leads to the following:

– for $0 < \lambda < \delta$, $\frac{\|Df(x_0)\|}{2} \lambda < f(x_0 + \lambda h) - f(x_0)$. Hence, $f(x_0)$ is not a local maximum.

– for $-\delta < \lambda < 0$, $f(x_0 + \lambda h) - f(x_0) < \frac{\|Df(x_0)\|}{2} \lambda < 0$. Hence, $f(x_0)$ is not a local minimum.

Theorem 6.9.4. Suppose $f : A \subset \mathbb{R}^n \to \mathbb{R}$ is a $C^3$-function and $x_0$ is a critical point.

• If $f$ has a local maximum at $x_0$, then $H_{x_0}(f)$ is negative semi-definite.

• If $H_{x_0}(f)$ is negative (positive) definite, then $f$ has a local maximum (minimum) at $x_0$.

Indeed, this theorem holds true for $f \in C^2$.

Proof. Since $Df(x_0) = 0$, Taylor's theorem gives

$$f(x_0 + h) - f(x_0) = \frac{1}{2} D^2 f(x_0)(h, h) + R_2(x_0, h)$$

where $\lim_{h \to 0} \frac{R_2(x_0, h)}{\|h\|^2} = 0$. If $D^2 f(x_0)$ is negative definite, then

$$\frac{1}{2} D^2 f(x_0)(h, h) + R_2(x_0, h) < 0 \quad \text{for sufficiently small } h \ne 0$$

and therefore $f(x_0 + h) - f(x_0) < 0$ for sufficiently small $h \ne 0$. Hence, $f$ has a local maximum at $x_0$.

• Example 6.9.5. The matrix $A = \begin{pmatrix} a & b \\ b & d \end{pmatrix}$ is positive definite if

$$(x, y) \begin{pmatrix} a & b \\ b & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} > 0 \quad \text{if } (x, y) \ne (0, 0).$$

Hence, $A$ is positive definite iff $ax^2 + 2bxy + dy^2 > 0$ for all $(x, y) \ne (0, 0)$. Therefore, $A$ is positive definite iff $a > 0$ and $ad - b^2 > 0$.

• Example 6.9.6. Let $f(x, y) = x^2 - xy + y^2$. Then $Df(0, 0) = (0, 0)$ and $D^2 f(0, 0) = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$. Hence, the Hessian is positive definite. Thus $f$ has a local minimum at $(0, 0)$.
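The criterion of Example 6.9.5 can be compared with a direct evaluation of the quadratic form on the unit circle. This is a rough sketch for illustration only; the function names and the number of sample angles are assumptions, and sampling is of course not a proof.

```python
import math

def is_positive_definite_2x2(a, b, d):
    """Criterion from Example 6.9.5 for the symmetric matrix [[a, b], [b, d]]."""
    return a > 0 and a * d - b * b > 0

def min_quadratic_form_on_circle(a, b, d, samples=3600):
    """Smallest sampled value of a x^2 + 2 b x y + d y^2 over unit vectors (x, y)."""
    best = float("inf")
    for i in range(samples):
        theta = 2 * math.pi * i / samples
        x, y = math.cos(theta), math.sin(theta)
        best = min(best, a * x * x + 2 * b * x * y + d * y * y)
    return best

# Hessian of f(x, y) = x^2 - x y + y^2 at the origin (Example 6.9.6).
print(is_positive_definite_2x2(2, -1, 2), min_quadratic_form_on_circle(2, -1, 2))
```

For the Hessian of Example 6.9.6 the sampled minimum is close to 1, the smaller eigenvalue, which agrees with the conclusion that $(0,0)$ is a local minimum.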

Chapter 8. Integration

Definition. Let $A \subset \mathbb{R}^2$ be a bounded set and let $f : A \to \mathbb{R}$ be a bounded function.

∙ We enclose $A$ in some rectangle $B = [a_1, b_1] \times [a_2, b_2]$ and extend $f$ to the whole rectangle by defining it to be zero outside of $A$.

∙ Let $P$ be a partition of $B$ obtained by dividing $a_1 = x_0 < x_1 < \cdots < x_n = b_1$ and $a_2 = y_0 < y_1 < \cdots < y_m = b_2$:

$$P = \{[x_i, x_{i+1}] \times [y_j, y_{j+1}] : i = 0, 1, \cdots, n - 1,\ j = 0, 1, \cdots, m - 1\},$$

where each $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$ is a subrectangle $R$.

∙ Define the upper sum of $f$:

$$U(f, P) := \sum_{R \in P} \sup\{f(x, y) \mid (x, y) \in R\} \times (\text{volume of } R)$$

∙ Define the lower sum of $f$:

$$L(f, P) := \sum_{R \in P} \inf\{f(x, y) \mid (x, y) \in R\} \times (\text{volume of } R)$$

∙ Define the upper integral of $f$ on $A$ by

$$\overline{\int_A} f = \inf\{U(f, P) : P \text{ is a partition of } B\}$$

and the lower integral of $f$ on $A$ by

$$\underline{\int_A} f = \sup\{L(f, P) : P \text{ is a partition of } B\}.$$

∙ We say that $f$ is Riemann integrable or integrable if $\overline{\int_A} f = \underline{\int_A} f$.

∙ If $f$ is integrable on $A$, we denote $\int_A f = \overline{\int_A} f = \underline{\int_A} f$.

Volume and sets of measure zero.

Definition. Let $A$ be a bounded set of $\mathbb{R}^n$.

∙ The characteristic function $1_A$ of $A$ is the map defined by $1_A(x) = 1$ if $x \in A$ and $1_A(x) = 0$ if $x \notin A$.

∙ We say that $A$ has volume if $1_A$ is Riemann integrable, and the volume is the number

$$\text{vol}(A) = \int_A 1_A(x)\,dx.$$

∙ The set $A$ is said to have measure zero if for every $\varepsilon > 0$ there is a countable number of rectangles $R_1, R_2, \cdots$ such that

$$A \subset \cup_{n=1}^{\infty} R_n \quad \& \quad \sum_{n=1}^{\infty} \text{vol}(R_n) < \varepsilon.$$

∙ Examples: The set of has measured zero in ℝ. As a 2 subset of ℝ , the has measure zero.

∙ Lebesgue's monotone convergence theorem. Let $g_n : [0, 1] \to \mathbb{R}$ be integrable functions with $\int_0^1 g_n(x)\,dx < \infty$. Suppose that $0 \le g_{n+1} \le g_n$ and $g_n(x) \to 0$ for all $x \in [0, 1]$. Then

$$\lim_{n \to \infty} \int_0^1 g_n(x)\,dx = 0.$$

∙ Example: $\lim_{n \to \infty} \int_0^1 e^{-nx^2} x^p\,dx = 0$ if $p > -1$.

∙ Fubini's Theorem. Let $A = [a, b] \times [c, d] \subset \mathbb{R}^2$, and let $f : A \to \mathbb{R}$ be continuous. Then

$$\int_A f = \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx = \int_c^d \left( \int_a^b f(x, y)\,dx \right) dy.$$

Chapter 10. Fourier Analysis

Fourier analysis arose historically in connection with problems in mechanics such as heat conduction and wave motion.

∙ Vibrating string. Consider a string of length $l$ with clamped ends that is free to vibrate when plucked. Let $y(x, t)$ be the displacement of the string at time $t$ and position $x \in [0, l]$.
– $y$ obeys the wave equation $\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}$ (Force = mass × acceleration = tension).
– That the string has clamped ends entails that $y(0, t) = y(l, t) = 0$.

∙ It is both important and remarkable that any solution $y(x, t)$ can be decomposed into harmonics:

$$y(x, t) = \sum_{n=1}^{\infty} c_n y_n(x, t) = \sum_{n=1}^{\infty} c_n \sin\Big( \frac{n\pi}{l} x \Big) \cos(\omega_n t), \quad \omega_n = \frac{n\pi c}{l},$$

where each $y_n$ is a standing wave and $\omega_n$ is its frequency.

∙ Physically, a standing wave is a synchronous up-and-down motion that repeats its shape periodically after time $\frac{2\pi}{\omega}$, such as occurs when a string produces a pure note.

∙ Specific standing waves called fundamental solutions (a kind of basis) are given by

$$y_n(x, t) = \sin\Big( \frac{n\pi}{l} x \Big) \cos(\omega_n t), \quad n = 1, 2, \cdots$$

∙ Thus a complicated-looking vibration is in reality an infinite linear combination of harmonics.

∙ The purpose of Fourier analysis is to carry out this procedure of decomposition using a general method.

Exercise: Using separation of variables, prove that any solution $y(x, t)$ can be decomposed into harmonics

$$y(x, t) = \sum_{n=1}^{\infty} c_n y_n(x, t) = \sum_{n=1}^{\infty} c_n \sin\Big( \frac{n\pi}{l} x \Big) \cos(\omega_n t), \quad \omega_n = \frac{n\pi c}{l}.$$

10.1 Review: Inner Product in $\mathbb{R}^n$.

∙ For $x, y \in \mathbb{R}^n$, define inner product and norm:

$$\langle x, y \rangle = \sum_{j=1}^{n} x(j) y(j), \quad \|x\| = \sqrt{\langle x, x \rangle}.$$

∙ The distance (or metric) between $x$ and $y$ is defined by $\|x - y\|$, and hence $\|x - y\| = 0$ implies $x = y$.

∙ If $\langle x, y \rangle = 0$, $x$ and $y$ are said to be orthogonal.

∙ $\{e_1, e_2, \cdots, e_n\}$ is said to be an orthonormal basis of $\mathbb{R}^n$ if

1. $\mathbb{R}^n = \text{span}\{e_1, e_2, \cdots, e_n\}$

2. $\|e_j\| = 1$, $j = 1, \cdots, n$

3. $\langle e_j, e_i \rangle = 0$ if $i \ne j$.

∙ For example, $e_1 = (1, 0, \cdots, 0)$, $e_2 = (0, 1, 0, \cdots, 0)$, ....

∙ If $\{e_1, e_2, \cdots, e_n\}$ is an orthonormal basis, then every $x \in \mathbb{R}^n$ can be represented uniquely by

$$x = \sum_{j=1}^{n} \langle x, e_j \rangle e_j.$$

∙ If $V_m = \text{span}\{e_1, \cdots, e_m\}$, the element in $V_m$ closest to $x$ is

$$x_m = \sum_{j=1}^{m} \langle x, e_j \rangle e_j$$

with the distance $\|x - x_m\| = \sqrt{\sum_{j=m+1}^{n} \langle x, e_j \rangle^2}$.

These useful properties of Euclidean space can be generalized to infinite-dimensional spaces by introducing an inner product.

10.1 Inner Product space $C[0, 2\pi]$

∙ Let $A$ be the interval $(0, 2\pi)$.

∙ Let $V$ be the space of all continuous functions $f : [0, 2\pi] \to \mathbb{C}$.

∙ For $f, g \in V$, we define the inner product

$$\langle f, g \rangle = \int_0^{2\pi} f(x) \overline{g(x)}\,dx$$

where $\overline{g(x)}$ denotes the complex conjugate of $g(x)$. The above inner product can be approximated by

$$\langle f, g \rangle \approx \sum_{j=1}^{n} f(x_j) \overline{g(x_j)}\,\Delta x,$$

where we divide the interval $[0, 2\pi]$ into $n$ subintervals with endpoints $x_0 = 0 < x_1 < \cdots < x_n = 2\pi$ and equal width $\Delta x = \frac{2\pi}{n}$.

∙ Two functions $f$ and $g$ are said to be orthogonal if

$$\langle f, g \rangle = \int_0^{2\pi} f(x) \overline{g(x)}\,dx = 0.$$

∙ The norm of $f$ is defined as

$$\|f\| = \sqrt{\langle f, f \rangle} = \sqrt{\int_0^{2\pi} |f(x)|^2\,dx}.$$

∙ The distance between $f$ and $g$ is defined by $d(f, g) = \|f - g\|$.

∙ If $\{\varphi_n\}$ is an orthogonal set of functions on the interval $A$ with the property that $\|\varphi_n\| = 1$, then we call $\{\varphi_n\}$ an orthonormal set.

∙ Example.

$$\Big\{ \frac{1}{\sqrt{2\pi}},\ \frac{1}{\sqrt{\pi}} \cos x,\ \frac{1}{\sqrt{\pi}} \sin x,\ \frac{1}{\sqrt{\pi}} \cos 2x,\ \frac{1}{\sqrt{\pi}} \sin 2x,\ \cdots \Big\}$$

is an orthonormal set in $V$.

10.1 Inner Product space

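Definition. Let $V$ be a complex vector space. An inner product on $V$ is a mapping $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$ with the following properties:

1. $\langle \alpha f + \beta g, h \rangle = \alpha \langle f, h \rangle + \beta \langle g, h \rangle$ for all $f, g, h \in V$ and $\alpha, \beta \in \mathbb{C}$.

2. $\langle f, g \rangle = \overline{\langle g, f \rangle}$

3. $\langle f, f \rangle \ge 0$, and $\langle f, f \rangle = 0 \Rightarrow f = 0$

Theorem 10.1.2. The space $V$ of the continuous functions $f : [a, b] \to \mathbb{C}$ forms an inner product space if we define

$$\langle f, g \rangle = \int_a^b f(x) \overline{g(x)}\,dx.$$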

10.1 Inner Product space $V = C[a, b]$

Consider the space $V$ of the continuous functions $f : [a, b] \to \mathbb{C}$ with the inner product $\langle f, g \rangle = \int_a^b f(x) \overline{g(x)}\,dx$.

∙ Define the norm of $f$ by $\|f\| = \sqrt{\langle f, f \rangle}$.

∙ Define the distance between $f$ and $g$ by $d(f, g) = \|f - g\|$.

For $f, g, h \in V$, we have

∙ Cauchy-Schwarz inequality. $|\langle f, g \rangle| \le \|f\| \|g\|$

∙ Minkowski inequality. $\|f + g\| \le \|f\| + \|g\|$

∙ Parallelogram law. $\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2$

∙ Pythagorean Theorem. If $\langle f, g \rangle = 0$, then $\|f + g\|^2 = \|f\|^2 + \|g\|^2$

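Cauchy-Schwarz inequality. $|\langle f, g \rangle| \le \|f\| \|g\|$

Proof:

∙ Suppose $g \ne 0$. Let $h = \frac{g}{\|g\|}$. Then

$$|\langle f, g \rangle| \le \|f\| \|g\| \ \Leftrightarrow\ |\langle f, h \rangle| \le \|f\|.$$

∙ Denote $\alpha = \langle f, h \rangle$. Then

$$0 \le \|f - \alpha h\|^2 = \langle f - \alpha h, f - \alpha h \rangle = \|f\|^2 - \alpha \langle h, f \rangle - \bar{\alpha} \langle f, h \rangle + |\alpha|^2 = \|f\|^2 - |\alpha|^2.$$

Hence, $|\alpha| = |\langle f, h \rangle| \le \|f\|$. This completes the proof.

Minkowski inequality. $\|f + g\| \le \|f\| + \|g\|$

Proof:

$$\|f + g\|^2 = \langle f + g, f + g \rangle = \|f\|^2 + \langle f, g \rangle + \langle g, f \rangle + \|g\|^2 \le \|f\|^2 + 2\|f\| \|g\| + \|g\|^2 = (\|f\| + \|g\|)^2$$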

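Definition of convergence in an inner product space $V$. Let $V$ be an inner product space and let $f_n$ be a sequence in $V$. We say that $f_n$ converges to $f$ (in mean) and write $f_n \to f$ if $\|f_n - f\| \to 0$, that is,

$$\forall \varepsilon > 0,\ \exists N \ \text{s.t.}\ n \ge N \Rightarrow \|f_n - f\| < \varepsilon.$$

Similarly, a series $\sum_{k=1}^{\infty} g_k$ converges to $f$ if

$$\lim_{n \to \infty} \Big\| \sum_{k=1}^{n} g_k - f \Big\| = 0.$$

Examples: Let $V = C([0, 1])$, the space of continuous functions $f : [0, 1] \to \mathbb{C}$.

∙ Let $f_n = nx\,\chi_{[0, \frac{1}{n}]} + (2 - nx)\,\chi_{(\frac{1}{n}, \frac{2}{n}]}$. Then $f_n \to 0$ in mean, that is, $\int_0^1 |f_n(x) - 0|^2\,dx \to 0$.

∙ Let $f_n = n^2 x\,\chi_{[0, \frac{1}{n}]} + (2n - n^2 x)\,\chi_{(\frac{1}{n}, \frac{2}{n}]}$. Then

$$\lim_{n \to \infty} f_n(x) = 0 \ (\forall x \in \mathbb{R}) \quad \& \quad \lim_{n \to \infty} \int_0^1 |f_n(x) - 0|^2\,dx = \infty.$$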

Definition of Cauchy sequence. A sequence $f_n$ in an inner product space is said to be a Cauchy sequence when

$$\forall \varepsilon > 0,\ \exists N \ \text{s.t.}\ n, m \ge N \Rightarrow \|f_n - f_m\| < \varepsilon.$$

An inner product space $V$ is called complete if every Cauchy sequence in $V$ converges. A complete inner product space is called a Hilbert space.

Remark: The inner product space $V = C([0, 2])$ is not complete.

∙ Let $f_n(x) = x^n$ for $0 \le x \le 1$ and $f_n(x) = 1$ for $1 \le x \le 2$.

∙ Then $f_n$ is a Cauchy sequence since $\|f_n - f_m\|^2 = \int_0^1 |x^n - x^m|^2\,dx \to 0$ as $n, m \to \infty$.

∙ However, $f_n \to f$ where $f(x) = 0$ for $0 \le x < 1$ and $f(x) = 1$ for $1 \le x \le 2$, and $f \notin V$.

A complete inner product space. To make the inner product space $V = C([a, b])$ complete, we need the following theorem and measure theory:

Theorem 8.3.4 If $g(x)$ is integrable, $g \ge 0$, and $\int_a^b g(x)\,dx = 0$, then the set $\{x \in [a, b] : g(x) \ne 0\}$ has measure zero. Proof. TA

♣ For any integrable function $f$, Theorem 8.3.4 leads to

$$\int_a^b |f(x)|^2\,dx = 0 \ \Rightarrow\ f = 0 \ \text{except for those } x \text{ in a set of measure zero.}$$

Regarding such an $f$ as equivalent to zero, we have the following theorem:

Theorem 10.1.6 Let $V = L^2([a, b])$ be the space of functions $f : [a, b] \to \mathbb{C}$ such that $|f|^2$ is integrable. Then $V$ is an inner product space with inner product $\langle f, g \rangle = \int_a^b f(x) \overline{g(x)}\,dx$ and norm $\|f\| = \sqrt{\langle f, f \rangle}$.

Proof of Theorem 8.3.4: If $g(x)$ is integrable, $g \ge 0$, and $\int_a^b g(x)\,dx = 0$, then the set $\{x \in [a, b] : g(x) \ne 0\}$ has measure zero.

∙ We first show that the set $A_m = \{x \in [a, b] : g(x) > 1/m\}$ has measure zero.

∙ Recall $\int_a^b g(x)\,dx = \inf\{U(g, P) : P \text{ is any partition}\}$.

∙ Let $\varepsilon > 0$ be given. There exists a partition $P$ such that $U(g, P) < \frac{\varepsilon}{m}$.

∙ Let $I_1, \cdots, I_k$ be the subintervals of the partition $P$ such that $I_i \cap A_m \ne \emptyset$. Then

$$\sum_{i=1}^{k} |I_i| \le m \sum_{i=1}^{k} \sup_{I_i} g(x)\,|I_i| \le m\,U(g, P) < \varepsilon,$$

where $|I_i|$ is the length of the interval $I_i$.

∙ Since $A_m \subset \cup_{i=1}^{k} I_i$ and $\sum_{i=1}^{k} |I_i| < \varepsilon$, $A_m$ has measure zero.

∙ Since $\{x \in [a, b] : g(x) \ne 0\} \subset \cup_{m=1}^{\infty} A_m$ is contained in a countable union of sets of measure zero, the set has measure zero.

Proof of Theorem 10.1.6: Prove that $V = L^2([a, b])$ is an inner product space.

∙ If $\|f\| = 0$, then $\int_a^b |f(x)|^2\,dx = 0$. From Theorem 8.3.4, $f = 0$ since we are identifying functions that agree except on a set of measure zero.

∙ It is easy to see that $\langle f, g \rangle$ satisfies all the other rules of an inner product space. We only need to prove that $|\langle f, g \rangle| < \infty$ for all $f, g \in V$.

∙ If we split $f$ and $g$ into real and imaginary parts, and into positive and negative parts, we are reduced to the case in which $f$ and $g$ are real and positive.

∙ From the Lebesgue monotone convergence theorem (page 467), it suffices to show that

$$\lim_{M \to \infty} \int_a^b (fg)_M < \infty \quad \text{(see page 462)}.$$

∙ Note that $0 \le (fg)_M \le f_{\sqrt{M}}\, g_{\sqrt{M}} + f_{\sqrt{M}}^2 + g_{\sqrt{M}}^2$.

∙ $\int_a^b (fg)_M \le \|f_{\sqrt{M}}\| \|g_{\sqrt{M}}\| + \|f_{\sqrt{M}}\|^2 + \|g_{\sqrt{M}}\|^2$.

∙ Hence, $\int_a^b (fg)_M \le \|f\| \|g\| + \|f\|^2 + \|g\|^2 < \infty$.

Example 10.1.8. If $f_1, \cdots, f_n$ are orthonormal in an inner product space $V$, prove that $f_1, \cdots, f_n$ are linearly independent.

∙ Definition. $f_1, \cdots, f_n$ are said to be linearly independent if

$$\sum_{i=1}^{n} c_i f_i = 0 \ \Rightarrow\ c_1 = \cdots = c_n = 0.$$

∙ Assume that $\sum_{i=1}^{n} c_i f_i = 0$. We want to prove $c_1 = \cdots = c_n = 0$.

∙ Due to orthogonality, we have

$$c_k = c_k \|f_k\|^2 = \Big\langle \sum_{i=1}^{n} c_i f_i,\ f_k \Big\rangle = \langle 0, f_k \rangle = 0.$$

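Example 10.1.8. Let $V$ be an inner product space. Define the projection of $f$ on $g$ to be the vector

$$h = \frac{\langle f, g \rangle}{\|g\|^2}\, g.$$

Show that $h$ and $f - h$ are orthogonal, and interpret this result geometrically.

Proof: First, let us prove it when $\|g\| = 1$:

$$\langle h, f - h \rangle = \langle h, f \rangle - \|h\|^2 = \langle \langle f, g \rangle g, f \rangle - |\langle f, g \rangle|^2 = 0.$$

For the general case, repeat the above procedure with $g$ replaced by $g / \|g\|$, which leaves $h$ unchanged.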

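10.2 Orthogonal family of functions

∙ Throughout this section, we assume that $V$ is an inner product space with an inner product $\langle \cdot, \cdot \rangle$.

∙ A vector $\varphi \in V$ is called normalized if $\|\varphi\| = \sqrt{\langle \varphi, \varphi \rangle} = 1$.

∙ $f$ and $g$ are called orthogonal if $\langle f, g \rangle = 0$.

∙ Definition. An orthonormal family $\varphi_0, \varphi_1, \cdots$ in $V$ is called complete if every $f \in V$ can be written

$$f = \sum_{k=0}^{\infty} c_k \varphi_k \quad (c_k = \langle f, \varphi_k \rangle).$$

We call $f = \sum_{k=0}^{\infty} c_k \varphi_k$ the Fourier series of $f$ with respect to $\varphi_0, \varphi_1, \cdots$ and $c_k = \langle f, \varphi_k \rangle$ the Fourier coefficients.

∙ An orthonormal family $\{\varphi_k\}$ in $V$ is complete iff for every $f \in V$,

$$\lim_{n \to \infty} \Big\| f - \sum_{k=0}^{n} \langle f, \varphi_k \rangle \varphi_k \Big\| = 0.$$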

18 P∞ Theorem 10.2.1: Suppose f = k=0 ckk for an orthonor- mal family 0, 1, ⋅ ⋅ ⋅ in V (convergence in mean). Then ck = ⟨f, k⟩ = ⟨f, k⟩. Proof.

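∙ Set $s_n = \sum_{k=0}^{n} c_k \varphi_k$, so that $\|s_n - f\| \to 0$.

∙ Hence, $|\langle f - s_n, \varphi_i\rangle| \le \|f - s_n\| \to 0$ as $n \to \infty$.

∙ If $n \ge i$, then $\langle s_n, \varphi_i\rangle = \sum_{k=0}^{n} \langle c_k \varphi_k, \varphi_i\rangle = c_i$.

∙ If $n \ge i$, $|\langle f - s_n, \varphi_i\rangle| = |\langle f, \varphi_i\rangle - c_i| \le \|f - s_n\| \to 0$ as $n \to \infty$.

∙ Hence, $\langle f, \varphi_i\rangle = c_i$.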

Examples of complete orthonormal families:

∙ Let $V = L^2([0, 2\pi])$ be the inner product space in Theorem 10.1.6.

∙ The exponential system $\{\varphi_n(x) = \frac{e^{inx}}{\sqrt{2\pi}} : n = 0, \pm 1, \pm 2, \dots\}$ is a complete orthonormal system in the space $V$; that is, the Fourier series of $f \in V$ for this family is given by
$$f = \sum_{k=-\infty}^{\infty} \frac{c_k e^{ikx}}{\sqrt{2\pi}}, \qquad c_k = \langle f, \varphi_k\rangle = \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} f(x) e^{-ikx}\, dx.$$

∙ The trigonometric system $\frac{1}{\sqrt{2\pi}},\ \frac{\cos mx}{\sqrt{\pi}},\ \frac{\sin nx}{\sqrt{\pi}}$, $m, n = 1, 2, \dots$, is a complete orthonormal system in $V$.

Proof. See Mean completeness theorem 10.3.1. (optional)
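Orthonormality of the exponential system (though not its completeness) is easy to check numerically. The sketch below assumes the inner product $\langle f, g\rangle = \int_0^{2\pi} f\,\bar g\,dx$ and approximates it by a Riemann sum; the sample count `N` and the helper names are illustrative choices, not part of the notes.

```python
import cmath
import math

N = 256                    # sample points for the Riemann sum (assumed)
DX = 2 * math.pi / N

def phi(n, x):
    """Exponential basis function phi_n(x) = e^{inx} / sqrt(2 pi)."""
    return cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

def inner_phi(m, n):
    """Riemann-sum approximation of <phi_m, phi_n> on [0, 2 pi]."""
    total = sum(phi(m, j * DX) * phi(n, j * DX).conjugate() for j in range(N))
    return total * DX

# Print |<phi_m, phi_n>| for a few index pairs; the values should be
# close to 1 when m == n and close to 0 otherwise.
for m in (-2, 0, 3):
    for n in (-2, 0, 3):
        print(m, n, round(abs(inner_phi(m, n)), 6))
```

For periodic integrands such as these, the equally spaced Riemann sum is very accurate, so a modest `N` suffices.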

Gram-Schmidt process:

∙ Let $g_0, g_1, g_2, \dots$ be linearly independent functions in an inner product space $V$.

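∙ We can form a corresponding orthonormal system $\varphi_0, \varphi_1, \dots$ as follows:

$$\varphi_0 = \frac{g_0}{\|g_0\|},$$
$$\varphi_1 = \frac{\tilde\varphi_1}{\|\tilde\varphi_1\|}, \qquad \tilde\varphi_1 = g_1 - \langle g_1, \varphi_0\rangle \varphi_0,$$
$$\varphi_{k+1} = \frac{\tilde\varphi_{k+1}}{\|\tilde\varphi_{k+1}\|}, \qquad \tilde\varphi_{k+1} = g_{k+1} - \sum_{i=0}^{k} \langle g_{k+1}, \varphi_i\rangle \varphi_i.$$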
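The process can be sketched in code: from each new vector, subtract its components along the orthonormal vectors built so far, then normalize. This is a minimal illustration, assuming $V = \mathbb{R}^3$ with the dot product in place of a function space; `inner`, `gram_schmidt`, and the input vectors are hypothetical names and data.

```python
import math

def inner(u, v):
    """Dot product on R^n (stand-in for the inner product on V)."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(gs):
    """Orthonormalize a list of linearly independent vectors."""
    phis = []
    for g in gs:
        tilde = list(g)
        # Remove the components of g along the vectors already built.
        for phi in phis:
            c = inner(g, phi)
            tilde = [t - c * p for t, p in zip(tilde, phi)]
        norm = math.sqrt(inner(tilde, tilde))  # nonzero by independence
        phis.append([t / norm for t in tilde])
    return phis

gs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
phis = gram_schmidt(gs)

# Gram matrix of the result; it should be close to the identity.
for p in phis:
    print([round(inner(p, q), 6) for q in phis])
```

The coefficients are taken against the original vector `g`, which mirrors the classical formula; in floating-point work a modified variant that uses the running residual is often preferred for stability.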

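Theorem: Bessel inequality: Let $\varphi_0, \varphi_1, \dots$ be an orthonormal system in an inner product space $V$. For each $f \in V$, the real series $\sum_{i=0}^{\infty} |\langle f, \varphi_i\rangle|^2$ converges and
$$\sum_{i=0}^{\infty} |\langle f, \varphi_i\rangle|^2 \le \|f\|^2.$$

Proof.

∙ Set $s_n = \sum_{k=0}^{n} c_k \varphi_k$ where $c_k = \langle f, \varphi_k\rangle$.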

∙ Key idea 1: f − sn and sn are orthogonal.

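∙ Key idea 2: Apply Pythagoras' theorem: $\|f\|^2 = \|f - s_n\|^2 + \|s_n\|^2$.

∙ Hence, $\|s_n\|^2 \le \|f\|^2$.

∙ Since the $\varphi_i$ are orthonormal, $\|s_n\|^2 = \sum_{i=0}^{n} |\langle f, \varphi_i\rangle|^2$. The partial sums are therefore nondecreasing and bounded by $\|f\|^2$, so the series converges and its sum is at most $\|f\|^2$.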
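Bessel's inequality can be observed numerically. The sketch below assumes $V = L^2([0, 2\pi])$ with the exponential system $\varphi_k(x) = e^{ikx}/\sqrt{2\pi}$, takes $f(x) = x$ as an arbitrary sample function, and replaces integrals by Riemann sums; `N`, the index range, and the helper names are illustrative assumptions.

```python
import cmath
import math

N = 2048                   # sample points for the Riemann sum (assumed)
DX = 2 * math.pi / N

def inner(f, g):
    """Riemann-sum approximation of <f, g> = int_0^{2 pi} f conj(g) dx."""
    total = sum(f(j * DX) * complex(g(j * DX)).conjugate() for j in range(N))
    return total * DX

def phi(k):
    """Exponential basis function phi_k(x) = e^{ikx} / sqrt(2 pi)."""
    return lambda x: cmath.exp(1j * k * x) / math.sqrt(2 * math.pi)

f = lambda x: x            # sample function (assumed for illustration)

norm_sq = inner(f, f).real
# Sum of |<f, phi_k>|^2 over the finite family k = -10, ..., 10.
bessel_sum = sum(abs(inner(f, phi(k))) ** 2 for k in range(-10, 11))

print(bessel_sum, norm_sq)
print(bessel_sum <= norm_sq)
```

Enlarging the index range pushes `bessel_sum` closer to `norm_sq` without exceeding it, which is the behavior the theorem describes.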

Parseval's Theorem: Let $\varphi_0, \varphi_1, \dots$ be an orthonormal system in an inner product space $V$. Then $\varphi_0, \varphi_1, \dots$ is complete iff for every $f \in V$, we have
$$\sum_{i=0}^{\infty} |\langle f, \varphi_i\rangle|^2 = \|f\|^2.$$

Proof.

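∙ Set $s_n = \sum_{k=0}^{n} c_k \varphi_k$ where $c_k = \langle f, \varphi_k\rangle$.

∙ Then $\|f\|^2 = \|f - s_n\|^2 + \|s_n\|^2$.

∙ If $\varphi_0, \varphi_1, \dots$ is complete, $\|f - s_n\|^2 \to 0$. Therefore, letting $n \to \infty$,
$$\|f\|^2 = \lim_{n\to\infty} \left( \|f - s_n\|^2 + \|s_n\|^2 \right) = 0 + \sum_{i=0}^{\infty} |\langle f, \varphi_i\rangle|^2.$$

∙ Conversely, if $\sum_{i=0}^{\infty} |\langle f, \varphi_i\rangle|^2 = \|f\|^2$, then $\|f\|^2 - \|s_n\|^2 \to 0$, and so $\|f - s_n\|^2 \to 0$.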
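The converse step can be written out in one line. The following worked derivation uses only the Pythagoras identity and the orthonormality of the system, with $s_n$ and $\varphi_i$ as in the proof:

```latex
\begin{align*}
\|f - s_n\|^2
  &= \|f\|^2 - \|s_n\|^2
     && \text{(Pythagoras, since } f - s_n \perp s_n\text{)} \\
  &= \|f\|^2 - \sum_{i=0}^{n} |\langle f, \varphi_i\rangle|^2
     && \text{(orthonormality)} \\
  &\longrightarrow \|f\|^2 - \sum_{i=0}^{\infty} |\langle f, \varphi_i\rangle|^2 = 0
     && (n \to \infty).
\end{align*}
```

Thus the partial sums $s_n$ converge to $f$ in norm, which is the definition of completeness.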
