
Xiaolong Han

Department of , California State University, Northridge, CA 91330, USA
E-mail address: [email protected]

Remark. You are entitled to a reward of 1 point toward a homework assignment if you are the first person to report a bona fide mathematical mistake (i.e. not including language typos and grammatical errors).

Contents

The language: Sets and mappings
Chapter 1. Metric spaces and continuous functions
  1.1. Metric spaces
  1.2. Convergent and divergent of sequences
  1.3. Continuous functions
Chapter 2. Sequences and series of functions
  2.1. Sequences and series of numbers
  2.2. Pointwise convergence of sequences and series of functions
  2.3. of sequences and series of functions
  2.4. Uniform convergence and continuity
  2.5. Uniform convergence and differentiation
  2.6. Uniform convergence and integration
  2.7.
  2.8. Exponential function and logarithmic function
Chapter 3. Functions of several variables
  3.1. Linear transformations and matrices
  3.2. Differentiation
    3.2.1. Partial derivatives
  3.3. Inverse function theorem
    3.3.1. The contraction principle
  3.4. Implicit function theorem
Bibliography

The language: Sets and mappings

We speak the language of sets and mappings in mathematics.

Definition (Sets).
• Given a set A, x ∈ A denotes that x is an element (or member, point) of A, and x ∉ A denotes that x is not an element of A.
• We say that two sets A and B are equal, denoted by A = B, if they have the same elements.
• Given two sets A and B, we say that A is a subset of B, denoted by A ⊂ B (or A ⊆ B), if each element of A is an element of B; we say that A is a proper subset of B, denoted by A ⊊ B, if A ⊆ B and A ≠ B.

Remark.
• B ⊃ A means A ⊂ B.
• Given two sets A and B, A = B if and only if (i.e. iff) A ⊂ B and B ⊂ A.

Example.
(1). N = {1, 2, 3, ...} denotes the set of natural numbers. (Notice that in some other books, N refers to the set {0, 1, 2, 3, ...}.)
(2). Z = {..., −3, −2, −1, 0, 1, 2, 3, ...} denotes the set of integers.
(3). Q = {p/q : p, q ∈ Z, q ≠ 0} denotes the set of rational numbers.
(4). R denotes the set of real numbers, which is a complete ordered field.
(5). R \ Q is called the set of irrational numbers.
(6). C = {x + iy : x, y ∈ R} denotes the set of complex numbers.

Definition (The empty set). The set that has no elements is called the empty set and is denoted by ∅. A set that is not equal to the empty set is called nonempty.

Definition (Union, intersection, and complement). Let A and B be two sets.
• The union of A and B is A ∪ B = {x : x ∈ A or x ∈ B}.
• The intersection of A and B is A ∩ B = {x : x ∈ A and x ∈ B}. We say that A and B are disjoint if A ∩ B = ∅.
• The complement of A in B is B \ A = {x : x ∈ B and x ∉ A}. If all the set operations are within a universal set X, then for a set A ⊂ X, we simply call X \ A the complement of A, which is also denoted by A^c.

Remark.
• x ∈ A iff x ∉ A^c, and x ∈ A^c iff x ∉ A.
• E ⊂ A iff E ∩ A^c = ∅, and E ⊂ A^c iff E ∩ A = ∅.

Remark. Given a family of sets F, we define

⋃_{E∈F} E = {x : x ∈ E for some E ∈ F}

and

⋂_{E∈F} E = {x : x ∈ E for all E ∈ F}.

In particular, if F = {E_λ}_{λ∈Λ}, where λ is called the index and Λ is called the index set, then we define

⋃_{λ∈Λ} E_λ = {x : x ∈ E_λ for some λ ∈ Λ}

and

⋂_{λ∈Λ} E_λ = {x : x ∈ E_λ for all λ ∈ Λ}.

For example,

⋃_{i=1}^{n} E_i = {x : x ∈ E_i for some i = 1, ..., n}

and

⋂_{i=1}^{n} E_i = {x : x ∈ E_i for all i = 1, ..., n}.

Theorem (De Morgan’s Law). Let F be a family of sets. Then

(⋃_{E∈F} E)^c = ⋂_{E∈F} E^c and (⋂_{E∈F} E)^c = ⋃_{E∈F} E^c.

Definition (Mappings). Let A and B be two sets. A mapping (or function) f from A to B, denoted by f : A → B, is a correspondence that assigns to each element of A an element of B; for x ∈ A, we denote by f(x) the assigned element in B. In the case when B = R or C, we call the mapping f a real-valued or complex-valued function.

Definition (Domain and range). Let f be a mapping from A to B. We call A the domain of f. Given a subset E ⊂ A, we call

f(E) = {y ∈ B : y = f(x) for some x ∈ E}

the image of E. We call f(A) the range of f.

Definition (Inverse image). Let f be a mapping from A to B. Given a subset E ⊂ B, we call

f^{-1}(E) = {x ∈ A : f(x) ∈ E}

the inverse image of E.

Remark. Notice that in defining the inverse images, we do not assume that f^{-1} is a mapping. In fact, to make f^{-1} a mapping, we need the following concepts.

Definition (Onto and one-to-one mappings). Let f be a mapping from A to B.
• We say that f is one-to-one (or injective) if x, y ∈ A and x ≠ y imply f(x) ≠ f(y).
• We say that f is onto (or surjective) if f(A) = B.

Definition (Invertible mappings). Let f be a mapping from A to B. We say that f is invertible if f is both one-to-one and onto, i.e. f establishes a one-to-one correspondence (or bijection) between the sets A and B.

Let f be an invertible mapping from A to B. Given each element y ∈ B, there is exactly one element x ∈ A such that f(x) = y, and we write f^{-1}(y) = x. This defines a mapping f^{-1} : B → A, and we call f^{-1} the inverse of f.

CHAPTER 1

Metric spaces and continuous functions

1.1. Metric spaces

Definition (Metric spaces). A metric space is a set X equipped with a function d : X × X → [0, ∞) that satisfies
(i). d(p, q) = 0 iff p = q;
(ii). d(p, q) = d(q, p) for all p, q ∈ X;
(iii). (Triangle inequality) d(p, q) ≤ d(p, r) + d(r, q) for all p, q, r ∈ X.
We call d a metric and d(p, q) the distance between p and q.
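The three axioms can be spot-checked numerically on a finite sample of points. The following Python sketch is only an illustration, not part of the notes: the helper name check_metric_axioms, the sample points, and the tolerance are assumptions made here. Passing such a check on finitely many points proves nothing, but a failure does show that a candidate d is not a metric.

```python
from itertools import product

def check_metric_axioms(d, points, tol=1e-12):
    """Return True if d satisfies axioms (i)-(iii) on all pairs/triples of `points`.

    Hypothetical helper for illustration: a finite check is a sanity test,
    not a proof that d is a metric.
    """
    for p, q in product(points, repeat=2):
        if d(p, q) < 0:                      # d must take values in [0, infinity)
            return False
        if (d(p, q) <= tol) != (p == q):     # (i)  d(p, q) = 0 iff p = q
            return False
        if abs(d(p, q) - d(q, p)) > tol:     # (ii) symmetry
            return False
    for p, q, r in product(points, repeat=3):
        if d(p, q) > d(p, r) + d(r, q) + tol:  # (iii) triangle inequality
            return False
    return True

sample = [-2.0, -0.5, 0.0, 1.0, 3.0]
abs_ok = check_metric_axioms(lambda x, y: abs(x - y), sample)
sq_ok = check_metric_axioms(lambda x, y: (x - y) ** 2, sample)
print(abs_ok, sq_ok)
```

Here d(x, y) = |x − y| passes on the sample, while the squared difference (x − y)^2 fails the triangle inequality (for instance with p = −2, r = 0, q = 3).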

Example. In R^k, denote a point x = (x_1, ..., x_k) ∈ R^k. Then

d(x, y) = |x − y| := (|x_1 − y_1|^2 + ··· + |x_k − y_k|^2)^{1/2}

defines a metric on R^k, which is called the Euclidean metric. In this case, R^k is called the Euclidean space. In fact, for any 1 ≤ p < ∞,

d_p(x, y) = (|x_1 − y_1|^p + ··· + |x_k − y_k|^p)^{1/p}

defines a metric on R^k. The case p = ∞ also gives a metric,

ρ(x, y) := d_∞(x, y) = sup_{i=1,...,k} |x_i − y_i|,

which is called the square metric.

Definition. Let X be a metric space. All points and sets mentioned below are understood to be elements and subsets of X.

• A neighborhood of p is a set N_r(p) consisting of all q such that d(p, q) < r, for some r > 0. The number r is called the radius of N_r(p).
• A point p is an interior point of E if there is a neighborhood N of p such that N ⊂ E. E is open if every point of E is an interior point of E.
• A point p is a limit point of E if every neighborhood of p contains a point q ≠ p such that q ∈ E. E is closed if every limit point of E is a point of E.
• E is bounded if there is a real number M and a point q ∈ X such that d(p, q) < M for all p ∈ E.

Definition (Intervals). Let a, b ∈ R. We define the following intervals as subsets of R.
• (a, b) = {x ∈ R : a < x < b},
• (a, b] = {x ∈ R : a < x ≤ b},
• [a, b) = {x ∈ R : a ≤ x < b},
• [a, b] = {x ∈ R : a ≤ x ≤ b},
• (a, ∞) = {x ∈ R : x > a},
• [a, ∞) = {x ∈ R : x ≥ a},
• (−∞, b) = {x ∈ R : x < b},
• (−∞, b] = {x ∈ R : x ≤ b}.

Remark. In Rudin, only the sets [a, b] are referred to as “intervals”, while the sets (a, b) are called “segments”.

Definition (Cells). Let a_i, b_i ∈ R and a_i < b_i for i = 1, ..., k. We define a k-cell

{x = (x_1, ..., x_k) : a_i ≤ x_i ≤ b_i for all i = 1, ..., k} = [a_1, b_1] × ··· × [a_k, b_k]

as a subset of R^k. In particular, a 1-cell is an interval and a 2-cell is a rectangle.

Remark.
• If 0 < r < s, then N_r(p) ⊂ N_s(p).
• The neighborhood N_r(p) of p in the Euclidean space R^k is also called the ball with center p and radius r. In the following proposition, we verify that every neighborhood is open. So we call N_r(p) ⊂ R^k an open ball. One can also verify that {q ∈ R^k : |q − p| ≤ r} is closed, which we call a closed ball.
• Let R^k be equipped with the square metric. Then a neighborhood is in fact a geometric square.
• Throughout the note, we will assume that R^k is equipped with the Euclidean metric if not otherwise stated.
• (a, b), (a, ∞), and (−∞, a) are open in R; [a, b], [a, ∞), and (−∞, a] are closed in R.
• The k-cells are closed in R^k.
• If E ⊂ R^k is bounded, then E ⊂ N_M(q) for some q ∈ R^k and M > 0. It then follows that E ⊂ N_{M+|q|}(0). Therefore, given that E is bounded, we can assume that E ⊂ N_r(0) for some r > 0. Similarly, given that E is bounded, we can assume that E ⊂ I for some k-cell I ⊂ R^k. (For example, one can take I = [−M−|q|, M+|q|] × ··· × [−M−|q|, M+|q|].)

Proposition 1.1. Every neighborhood is an open set.

Proof. Consider a neighborhood N_r(p), and let q ∈ N_r(p). Then d(p, q) < r. Let h = r − d(p, q). Then for all points s ∈ N_h(q), d(s, q) < h. Therefore, by the triangle inequality,

d(s, p) ≤ d(s, q) + d(q, p) < h + d(q, p) = r.

Hence, s ∈ N_r(p). So N_h(q) ⊂ N_r(p) and N_r(p) is open. □

Proposition 1.2. If p is a limit point of a set E, then every neighborhood of p contains infinitely many points of E.

Proof. Suppose there is a neighborhood N of p which contains only a finite number of points of E. Let q_1, ..., q_n be those points of N ∩ E which are distinct from p, and put

r = min{d(p, q_1), ..., d(p, q_n)}.

The minimum of a finite set of positive numbers is positive. It follows that d(p, q_m) ≥ r, so q_m ∉ N_r(p) for all 1 ≤ m ≤ n. The neighborhood N_r(p) contains no point q of E such that q ≠ p. Therefore, p is not a limit point of E. This contradiction establishes the proposition. □

Theorem 1.3. A set E is open iff its complement is closed.

Proof. First, suppose E^c is closed. Choose x ∈ E. Then x ∉ E^c, and x is not a limit point of E^c since E^c is closed. Hence there exists a neighborhood N of x such that E^c ∩ N is empty, that is, N ⊂ E. Thus E is open.

Next, suppose E is open. Let x be a limit point of E^c. Then every neighborhood of x contains a point of E^c, so that x is not an interior point of E. Since E is open, this means that x ∉ E, so x ∈ E^c. It follows that E^c is closed. □

Proposition 1.4.
(i). For any collection {G_α} of open sets, ⋃_α G_α is open.
(ii). For any collection {F_α} of closed sets, ⋂_α F_α is closed.
(iii). For any finite collection G_1, ..., G_n of open sets, ⋂_{i=1}^{n} G_i is open.
(iv). For any finite collection F_1, ..., F_n of closed sets, ⋃_{i=1}^{n} F_i is closed.

Proof.
(i). Put G = ⋃_α G_α. If x ∈ G, then x ∈ G_α for some α. Since G_α is open, there is a neighborhood N of x such that N ⊂ G_α. Then N ⊂ G, hence G is open.
(ii). By De Morgan’s Law,

(⋂_α F_α)^c = ⋃_α F_α^c,

in which F_α^c is open since F_α is closed. By (i), ⋃_α F_α^c is open. Hence, (⋂_α F_α)^c is open, so that ⋂_α F_α is closed.
(iii). Put H = ⋂_{i=1}^{n} G_i. For any x ∈ H, there exist neighborhoods N_{r_i}(x) of x such that N_{r_i}(x) ⊂ G_i for i = 1, ..., n. Put

r = min{r_1, ..., r_n}.

Then N_r(x) ⊂ N_{r_i}(x) ⊂ G_i for all i = 1, ..., n, so that N_r(x) ⊂ H, and H is open.
(iv). This follows from (iii) in the same way that (ii) follows from (i). □

Definition (Closure). Let X be a metric space and E ⊂ X. Denote by E' the set of limit points of E in X. Then the closure of E is the set Ē = E ∪ E'.

Proposition 1.5. Let X be a metric space and E ⊂ X. Then
(i). Ē is closed,
(ii). E = Ē iff E is closed,
(iii). Ē ⊂ F for every closed subset F of X that contains E.
By (i) and (iii), Ē is the smallest closed subset of X that contains E, i.e.

Ē = ⋂_{F ⊃ E, F closed} F.

Proof.
(i). It suffices to show that Ē^c is open. Let p ∈ Ē^c. Then p ∉ E and p ∉ E'. Since p ∉ E', there exists a neighborhood N = N_r(p) of p such that (N \ {p}) ∩ E = ∅. Together with the fact that p ∉ E, we have that N ∩ E = ∅. Now we show that N ∩ E' = ∅. Suppose there is q ∈ N_r(p) ∩ E'. Set s = r − d(p, q). Then N_s(q) ⊂ N_r(p). But q ∈ E' implies that there is x ∈ (N_s(q) \ {q}) ∩ E. So x ∈ N_r(p) ∩ E, contradicting the fact that N_r(p) ∩ E = ∅. Hence, N ∩ (E ∪ E') = ∅, so N ⊂ Ē^c. Hence, Ē^c is open.

(ii). If E = Ē, then E is closed by (i). If E is closed, then E' ⊂ E by the definition of closed sets. Hence, Ē = E ∪ E' ⊂ E. Since E ⊂ Ē always holds, E = Ē.
(iii). If F is closed and F ⊃ E, then F ⊃ F' (again, by the definition of closed sets). Hence, F ⊃ F' ⊃ E'. Then F ⊃ E ∪ E' = Ē. □

Definition (Compact sets). Let X be a metric space and E ⊂ X. By an open cover of a set E we mean a collection {G_α} of open subsets of X such that E ⊂ ⋃_α G_α. E is said to be compact if every open cover of E contains a finite subcover.

Theorem 1.6.
(i). Compact subsets of metric spaces are closed.
(ii). Closed subsets of compact sets are compact.

Proof.
(i). Let K be a compact subset of a metric space X. We shall prove that K^c is open in X. Suppose p ∈ K^c. Then p ∉ K. Choose any q ∈ K. Then d(p, q) > 0. Set r_q = d(p, q)/2. Then the neighborhoods N_{r_q}(p) and N_{r_q}(q) are disjoint.

Since K ⊂ ⋃_{q∈K} N_{r_q}(q) and K is compact, there are q_1, ..., q_n such that

K ⊂ W := ⋃_{i=1}^{n} N_{r_{q_i}}(q_i).

Now set

V = ⋂_{i=1}^{n} N_{r_{q_i}}(p).

Notice that

x ∈ V ∩ W = (⋂_{i=1}^{n} N_{r_{q_i}}(p)) ∩ (⋃_{i=1}^{n} N_{r_{q_i}}(q_i))

would imply that x ∈ N_{r_{q_i}}(p) for all i = 1, ..., n and x ∈ N_{r_{q_i}}(q_i) for some i = 1, ..., n. But this is impossible since N_{r_{q_i}}(p) ∩ N_{r_{q_i}}(q_i) = ∅ for all i = 1, ..., n. That is, V ∩ W = ∅. It then follows that V ∩ K = ∅, i.e. V ⊂ K^c. Since V is a neighborhood of p (with radius min{r_{q_1}, ..., r_{q_n}}), p is an interior point of K^c. So K^c is open, i.e. K is closed.
(ii). Let K be a compact subset of a metric space X and F ⊂ K be a closed set. Suppose that {V_α} is an open cover of F. Notice that {V_α} ∪ {F^c} is an open cover of X, since (⋃_α V_α) ∪ F^c ⊃ F ∪ F^c = X, and therefore is an open cover of K. Because K is compact, there is a finite subcollection of {V_α} ∪ {F^c}, denoted as {V_{α_1}, ..., V_{α_n}, F^c}, that covers K. Since F ⊂ K and F ∩ F^c = ∅, {V_{α_1}, ..., V_{α_n}} forms a finite open cover of F. □

We next show that the k-cells are compact in R^k. We need some preparation.

Theorem 1.7. Let {I_n}_{n=1}^∞ be a sequence of k-cells such that I_n ⊃ I_{n+1} for all n ∈ N. Then ⋂_{n=1}^∞ I_n ≠ ∅.

Proof. We first prove the theorem in R^1. That is, there is a sequence of intervals, denoted as I_n = [a_n, b_n], such that I_n ⊃ I_{n+1} for all n ∈ N. Then

a_1 ≤ a_2 ≤ ··· ≤ a_n ≤ ··· ≤ b_n ≤ ··· ≤ b_2 ≤ b_1.

Let E = {a_n}. Then E is bounded above by b_1 and therefore has a supremum, denoted as x. Hence, x ≥ a_n for all n ∈ N.

Notice that also for each n ∈ N, E is bounded above by b_n. (This is because a_m ≤ a_{n+m} < b_{n+m} ≤ b_n for all n, m ∈ N.) Therefore, x = sup E ≤ b_n for all n ∈ N. This implies that x ∈ [a_n, b_n] for all n ∈ N. So x ∈ ⋂_{n=1}^∞ I_n.

We now prove the theorem in R^k for arbitrary k ∈ N. That is, there is a sequence of k-cells, denoted as

I_n = [a_{n,1}, b_{n,1}] × ··· × [a_{n,k}, b_{n,k}],

such that I_n ⊃ I_{n+1} for all n ∈ N. Then for any j = 1, ..., k,

[a_{n,j}, b_{n,j}] ⊃ [a_{n+1,j}, b_{n+1,j}]

for all n ∈ N. By the previous result in R^1, there is x_j such that x_j ∈ [a_{n,j}, b_{n,j}] for all n ∈ N. Set x = (x_1, ..., x_k). We have that x ∈ I_n for all n ∈ N, i.e. x ∈ ⋂_{n=1}^∞ I_n. □

Remark. In fact, if I_n = [a_n, b_n] and I_n ⊃ I_{n+1} for all n ∈ N, then ⋂_{n=1}^∞ I_n = [a, b], in which a = sup{a_n} and b = inf{b_n}.

Theorem 1.8. Every k-cell is compact.

Proof. Let

I = [a_1, b_1] × ··· × [a_k, b_k]

be a k-cell in R^k. Denote

δ = (∑_{i=1}^{k} |b_i − a_i|^2)^{1/2}.

Then for all x, y ∈ I, we have that |x − y| ≤ δ.

Suppose, to get a contradiction, that there exists an open cover {G_α} of I which contains no finite subcover of I. Put c_i = (a_i + b_i)/2. The intervals [a_i, c_i] and [c_i, b_i] then determine 2^k k-cells Q_j whose union is I. At least one of these sets Q_j, called I_1, cannot be covered by any finite subcollection of {G_α} (otherwise I could be so covered). We next subdivide I_1 and continue the process. We obtain a sequence {I_n} with the following properties:

(a). I ⊃ I_1 ⊃ I_2 ⊃ ··· ⊃ I_n ⊃ ···;
(b). for each n ∈ N, I_n is not covered by any finite subcollection of {G_α};
(c). if x, y ∈ I_n, then |x − y| ≤ 2^{-n} δ.

By (a) and the above theorem, there is a point z ∈ ⋂_{n=1}^∞ I_n. Since z ∈ I ⊂ ⋃_α G_α, we know that z ∈ G_α for some α. Since G_α is open, there exists r > 0 such that N_r(z) ⊂ G_α. That is, if |y − z| < r, then y ∈ G_α. If n is so large that 2^{-n} δ < r, then (c) implies that I_n ⊂ G_α, i.e. the single open set G_α forms a finite open cover of I_n, which contradicts (b). □

Theorem 1.9. Let E ⊂ R^k. Then the following statements are equivalent.
(i). E is bounded and closed.
(ii). E is compact.
(iii). Every infinite subset of E has a limit point in E.

Proof. (i)⇒(ii). Suppose that E is bounded and closed. Since E is bounded, E ⊂ I for some k-cell I ⊂ R^k. We know that I is compact, so E is compact since a closed subset of a compact set is also compact.

(ii)⇒(iii). Suppose that E is compact. Assume that K ⊂ E is an infinite subset of E with no limit point in E. Then for each point p ∈ E, p is not a limit point of K. So there is a neighborhood U_p of p such that (U_p \ {p}) ∩ K = ∅, i.e. U_p ∩ K ⊂ {p}. That is, U_p contains at most one point from K (namely, the point p). Now {U_p : p ∈ E} forms an open cover of E, which is infinite (since E ⊃ K). There is no finite subcollection of {U_p : p ∈ E} that covers E, contradicting the fact that E is compact. (In fact, there is no finite subcollection of {U_p : p ∈ E} that covers K. If there were, since each U_p contains at most one point from K, K would be finite, which is not possible.)

(iii)⇒(i). Suppose that every infinite subset of E has a limit point in E. Assume that E is not bounded. Then for each n ∈ N, E is not contained in N_n(0). That is, there is a point of E, called x_n, such that |x_n| ≥ n. The set {x_n} is infinite and has no limit point. (Otherwise, for a limit point p, N_1(p) would contain infinitely many x_n’s. However, there are only finitely many x_n’s with |x_n| < |p| + 1, so only finitely many x_n’s can be in N_1(p).) This contradicts (iii). Therefore, E is bounded.

Assume that E is not closed. Then there exists a limit point p of E such that p ∉ E. For each n ∈ N, in the neighborhood N_{1/n}(p) there is a point y_n ∈ E, i.e. |y_n − p| < 1/n. The set {y_n} is infinite. (Otherwise, there is m ∈ N such that y_n = y_m, and hence |y_m − p| < 1/n, for infinitely many n ∈ N. It then would imply that |y_m − p| = 0, so p = y_m ∈ E, which is not possible.)

By (iii), {y_n} has a limit point q ∈ E. Notice that for any j ∈ N, there is y_{n_j} ∈ N_{1/j}(q) such that n_j ≥ j (since N_{1/j}(q) contains infinitely many y_n’s). So

|q − p| ≤ |q − y_{n_j}| + |y_{n_j} − p| < 1/j + 1/n_j ≤ 2/j

for all j ∈ N, which implies that |q − p| = 0 and q = p. Now p = q ∈ E, contradicting the fact that p ∉ E. Therefore, E is closed. □

Remark (Heine-Borel theorem). The equivalence of (i) and (ii) above is called the Heine-Borel theorem.

Theorem 1.10 (Weierstrass). Every bounded infinite subset E of R^k has a limit point in R^k.

Proof. Since E is bounded, E ⊂ I for some k-cell I in R^k. Since I is compact, E has a limit point in I ⊂ R^k by the above theorem. □

Homework Assignment.

1-1. Construct a bounded set in R with exactly three limit points.

1-2. Let X be a metric space and E ⊂ X.
(a). Prove that E', the set of the limit points of E, is closed.
(b). Prove that (Ē)' = E', i.e. the closure of E has the same limit points as E.

1-3. Let A_1, A_2, ... be subsets of a metric space. Below, a bar denotes closure.
(a). If B_n = ⋃_{i=1}^{n} A_i, prove that B̄_n = ⋃_{i=1}^{n} Ā_i.
(b). If B = ⋃_{i=1}^∞ A_i, prove that B̄ ⊃ ⋃_{i=1}^∞ Ā_i.
(c). If B = ⋃_{i=1}^∞ A_i, is B̄ = ⋃_{i=1}^∞ Ā_i? Prove your assertion.

1-4.
(a). Let {G_α} be a collection of open sets. Is ⋂_α G_α open? Prove your assertion.
(b). Let {F_α} be a collection of closed sets. Is ⋃_α F_α closed? Prove your assertion.

1-5. Let I_n = (a_n, b_n). Suppose that I_n ⊃ I_{n+1} for all n ∈ N. Is ⋂_{n=1}^∞ I_n ≠ ∅? Prove your assertion.

1-6.
(a). Suppose that E ⊂ R^1 is bounded. Is E compact? Prove your assertion.
(b). Suppose that E ⊂ R^1 is closed. Is E compact? Prove your assertion.
(c). Suppose that E ⊂ R^1 is bounded. Does every infinite subset of E have a limit point in E? Prove your assertion.
(d). Suppose that E ⊂ R^1 is closed. Does every infinite subset of E have a limit point in E? Prove your assertion.

1.2. Convergent and divergent of sequences

Throughout this section, we assume that X is a metric space equipped with a metric d.

Definition (Convergent and divergent sequences). A sequence {p_n}_{n=1}^∞ in a metric space X is said to converge if there is a point p ∈ X with the following property: For every ε > 0 there is N ∈ N such that d(p_n, p) ≤ ε for all n ≥ N. We then say that {p_n} converges to p and that p is the limit of {p_n}, denoted as p_n → p, or

lim_{n→∞} p_n = p.

If {p_n} does not converge, then it is said to diverge.
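As an illustration of the definition, consider the sequence p_n = 1/n in R with d(x, y) = |x − y| and limit p = 0. The Python sketch below is an assumption-laden example rather than part of the notes: the function name threshold_index is hypothetical, and exact rational arithmetic is used only to avoid rounding issues. For a given ε it computes the smallest N that works and checks a finite stretch of the tail (which illustrates, but of course does not prove, the statement for all n ≥ N).

```python
from fractions import Fraction
import math

def threshold_index(eps):
    """Smallest N in N with |1/n - 0| <= eps for all n >= N (hypothetical helper)."""
    # 1/n <= eps  iff  n >= 1/eps, so N = ceil(1/eps).
    return math.ceil(1 / eps)

eps = Fraction(1, 100)
N = threshold_index(eps)

# Check d(p_n, 0) <= eps on a finite part of the tail n >= N.
tail_ok = all(abs(Fraction(1, n) - 0) <= eps for n in range(N, N + 1000))

# N is optimal: the term just before index N is still farther than eps from 0.
before_fails = N == 1 or abs(Fraction(1, N - 1)) > eps

print(N, tail_ok, before_fails)
```

The point of the sketch is that N depends on ε: a smaller ε forces a larger N, exactly as the quantifier order in the definition allows.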

Theorem 1.11. Let {p_n} be a sequence in X.

(i). p_n → p for some p ∈ X iff every neighborhood of p contains p_n for all but finitely many n.
(ii). The limit of a convergent sequence is unique. That is, if p_n → p and p_n → p' for p, p' ∈ X, then p = p'.
(iii). Convergent sequences are bounded. Here, we say a sequence is bounded if the set of the points in the sequence is bounded.
(iv). If p is a limit point of E ⊂ X, then there is a sequence {p_n} in E such that p_n → p.

Definition (Subsequences). Let {p_n}_{n=1}^∞ be a sequence and {n_k} be a sequence of natural numbers such that n_1 < n_2 < n_3 < ···. Then {p_{n_k}}_{k=1}^∞ is called a subsequence of {p_n}_{n=1}^∞.

Remark. A sequence converges iff every subsequence converges (to the same limit).
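The remark gives a convenient way to detect divergence: it is enough to exhibit two subsequences with different limits. A minimal Python sketch, using the sequence p_n = (−1)^n in R as an assumed example (the names a, even_sub, and odd_sub are chosen here for illustration):

```python
def a(n):
    """The sequence p_n = (-1)^n."""
    return (-1) ** n

# Subsequence with indices n_k = 2k (constant 1) and with n_k = 2k - 1 (constant -1).
even_sub = [a(2 * k) for k in range(1, 11)]
odd_sub = [a(2 * k - 1) for k in range(1, 11)]

print(even_sub[:3], odd_sub[:3])
```

The even-indexed subsequence is constantly 1 and the odd-indexed one is constantly −1, so the two subsequential limits differ and the full sequence cannot converge.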

Definition (Cauchy sequences). A sequence {p_n}_{n=1}^∞ in a metric space X is said to be a Cauchy sequence if for every ε > 0 there is N ∈ N such that d(p_n, p_m) ≤ ε for all n, m ≥ N.

Theorem 1.12. Let {p_n} be a sequence in X.

(i). If {p_n} converges, then it is Cauchy.
(ii). In R^k, if {p_n} is Cauchy, then it is convergent.

Remark. In R^k, a sequence is convergent iff it is Cauchy.

Proof.

(i). For every ε > 0, since {p_n} converges, there is p ∈ X and N ∈ N such that d(p_n, p) ≤ ε/2 for all n ≥ N. Now if n, m ≥ N, then

d(p_n, p_m) ≤ d(p_n, p) + d(p_m, p) ≤ ε/2 + ε/2 = ε.

That is, {p_n} is Cauchy.

(ii). Let E be the set of points in the sequence {p_n}_{n=1}^∞.

We first show that E is bounded. If not, then there are indices n_1 < n_2 < ··· such that |p_{n_{k+1}}| ≥ |p_{n_k}| + 1 for all k ∈ N. Then by the triangle inequality,

|p_{n_{k+1}} − p_{n_k}| ≥ |p_{n_{k+1}}| − |p_{n_k}| ≥ 1.

This contradicts the fact that {p_n} is Cauchy.

Notice that if E is finite, then {p_n} is convergent. Indeed, let E = {q_1, ..., q_l}. Then we claim that there is N ∈ N and j ∈ {1, ..., l} such that p_n = q_j for all n ≥ N. If not, there are arbitrarily large n, m such that p_n = q_i and p_m = q_j for some i, j = 1, ..., l with i ≠ j. So d(p_n, p_m) = d(q_i, q_j) ≥ min{d(q_s, q_t) : s ≠ t} > 0, contradicting the fact that {p_n} is Cauchy.

We now can assume that E is infinite. Then by the Weierstrass theorem, E has a limit point p in R^k. We next show that p_n → p. For any ε > 0, since {p_n} is Cauchy, there is M ∈ N such that d(p_n, p_m) ≤ ε/2 for all n, m ≥ M. Since p is a limit point of E, the neighborhood N_{ε/2}(p) contains infinitely many elements of E. Therefore, there is m ≥ M such that p_m ∈ N_{ε/2}(p), i.e. d(p_m, p) ≤ ε/2. Hence, by the triangle inequality, for all n ≥ M,

d(p_n, p) ≤ d(p_n, p_m) + d(p_m, p) ≤ ε,

and p_n → p. □

Definition (Complete metric spaces). Let X be a metric space. If every Cauchy sequence in X converges, then we say X is complete.

Remark.
• Let X be a complete metric space. Then every Cauchy sequence in X has a unique limit in X.
• Let Q be equipped with the metric d(x, y) = |x − y| for x, y ∈ Q. Let

p_n = (1 + 1/n)^n.

Then {p_n} is a Cauchy sequence but is not convergent in Q. In fact, in R, p_n → e ∉ Q. Therefore, Q is not complete.

Homework Assignment.

1-7. Suppose that {p_n} is a Cauchy sequence in a metric space X, and some subsequence {p_{n_i}} converges to a point p ∈ X. Prove that the full sequence {p_n} converges to p.

1.3. Continuous functions

Throughout this section, we assume that X, Y, Z are metric spaces equipped with metrics d_X, d_Y, d_Z, respectively.

Definition (Limits of functions). Let E ⊂ X and f : E → Y be a function. Let p be a limit point of E. We write f(x) → q as x → p, or

lim_{x→p} f(x) = q,

if there is a point q ∈ Y with the following property: For every ε > 0, there exists a δ > 0 such that

d_Y(f(x), q) < ε

for all x ∈ E for which 0 < d_X(x, p) < δ.

Theorem 1.13. Let E ⊂ X and f : E → Y be a function. Let p be a limit point of E and q ∈ Y. Then the following statements are equivalent.
(i). f(x) → q as x → p.
(ii). For every sequence {x_n} in E such that x_n ≠ p and lim_{n→∞} x_n = p, we have that lim_{n→∞} f(x_n) = q.

Definition (Continuous functions). Let E ⊂ X and f : E → Y be a function. Then f is said to be continuous at p ∈ E if for every ε > 0, there exists a δ > 0 such that

d_Y(f(x), f(p)) < ε

for all x ∈ E for which d_X(x, p) < δ. If f is continuous at every point of E, then f is said to be continuous on E.

Remark. Notice that one does not need p ∈ E for a function f : E → Y to have a limit at p, whereas one needs p ∈ E for f to be continuous at p, i.e. f has to be defined at p.

Theorem 1.14. Let E ⊂ X and f : E → Y be a function. Let p ∈ E ∩ E'. Then f is continuous at p iff f(x) → f(p) as x → p.

Theorem 1.15.
(i). Let E ⊂ X and f, g : E → R be continuous functions. Let c ∈ R. Then cf, f + g, and fg are continuous. f/g is also continuous at every x ∈ E such that g(x) ≠ 0.
(ii). Let f : E → Y and g : f(E) → Z be continuous. Then g ◦ f : E → Z, defined by (g ◦ f)(x) := g(f(x)), is continuous.

Theorem 1.16. Let f : X → Y be a function. Then f is continuous iff f^{-1}(U) is open in X for every open set U in Y.

Proof. Suppose that f^{-1}(U) is open in X for every open set U in Y. Let p ∈ X and ε > 0. Then N_ε(f(p)) is open in Y. Hence, f^{-1}(N_ε(f(p))) is open in X and contains p. Therefore, there is δ > 0 such that N_δ(p) ⊂ f^{-1}(N_ε(f(p))). That is, if q ∈ N_δ(p), i.e. d_X(q, p) < δ, then q ∈ f^{-1}(N_ε(f(p))), i.e. f(q) ∈ N_ε(f(p)) and d_Y(f(q), f(p)) < ε. So f is continuous at every point p ∈ X and is therefore continuous on X.

Suppose that f is continuous. Let U ⊂ Y be open. If f^{-1}(U) = ∅, then f^{-1}(U) is open. If f^{-1}(U) ≠ ∅, then let p ∈ f^{-1}(U). Hence, f(p) ∈ U. Since U is open, there is a neighborhood N_ε(f(p)) ⊂ U. Notice that y ∈ N_ε(f(p)) iff d_Y(f(p), y) < ε. Since f is continuous, there is δ > 0 such that d_Y(f(q), f(p)) < ε if d_X(q, p) < δ. Therefore, f(q) ∈ N_ε(f(p)) if d_X(q, p) < δ, that is, N_δ(p) ⊂ f^{-1}(N_ε(f(p))) ⊂ f^{-1}(U). So f^{-1}(U) is open. □

Corollary 1.17. Let f : X → Y be a function. Then f is continuous iff f^{-1}(V) is closed in X for every closed set V in Y.

Theorem 1.18. Let X be a compact metric space and f : X → R be continuous. Then there are a, b ∈ X such that f(a) ≤ f(x) ≤ f(b) for all x ∈ X.

Proof. Let A = f(X) ⊂ R. It suffices to show that A has a largest element and a smallest element.

We first note that A is compact. Indeed, if {U_α} is an open cover of A, then by Theorem 1.16 {f^{-1}(U_α)} is an open cover of X. Since X is compact, finitely many sets f^{-1}(U_{α_1}), ..., f^{-1}(U_{α_n}) cover X, and then U_{α_1}, ..., U_{α_n} cover A.

Suppose that there is no largest element in A. Then {(−∞, a) : a ∈ A} is an open cover of A. (This is because if there is M ∈ A that is not contained in (−∞, a) for each a ∈ A, then M ≥ a for each a ∈ A, implying that M is the maximum in A.) Since A is compact, there is a finite subcover of A, denoted as

{(−∞, a_1), ..., (−∞, a_n)}.

Let a = max{a_1, ..., a_n}. Then a ∉ (−∞, a_i) for all i = 1, ..., n. But a ∈ A, contradicting the fact that {(−∞, a_1), ..., (−∞, a_n)} covers A. Similarly, there is a smallest element of A. □

Definition (Uniform continuity). Let E ⊂ X and f : E → Y be a function. We say that f is uniformly continuous on E if for every ε > 0, there exists δ > 0 such that

d_Y(f(p), f(q)) < ε

for all p, q ∈ E for which d_X(p, q) < δ.

Theorem 1.19. Let X be a compact metric space and Y be a metric space. Suppose that f : X → Y is continuous. Then f is uniformly continuous.

Proof. Let d_X and d_Y be the metrics on X and on Y, respectively. Let ε > 0. Since f is continuous, for each z ∈ X there is δ_z > 0 such that d_Y(f(z), f(y)) < ε/2 if y ∈ X and d_X(z, y) < δ_z. The collection {N_{δ_z/2}(z) : z ∈ X} is an open cover of X. Since X is compact, it has a finite subcover, denoted as

{N_{δ_{z_1}/2}(z_1), ..., N_{δ_{z_n}/2}(z_n)}.

Let δ = min{δ_{z_1}/2, ..., δ_{z_n}/2}. Let x, y ∈ X such that d_X(x, y) < δ. There is z = z_i for some i = 1, ..., n such that x ∈ N_{δ_{z_i}/2}(z_i). It then follows that d_Y(f(x), f(z)) < ε/2. By the triangle inequality,

d_X(z, y) ≤ d_X(z, x) + d_X(x, y) < δ_{z_i}/2 + δ ≤ δ_{z_i}.

Hence, d_Y(f(z), f(y)) < ε/2. By the triangle inequality again,

d_Y(f(x), f(y)) ≤ d_Y(f(x), f(z)) + d_Y(f(z), f(y)) < ε. □

Homework Assignment.

1-8. Let f : X → Y be continuous. Below, a bar denotes closure.
(a). Prove that f(Ē) is contained in the closure of f(E) for every E ⊂ X.
(b). Is f(Ē) equal to the closure of f(E)? Prove your assertion.

1-9.
(a). Let E ⊂ X be open and f : E → Y be a continuous function. Is f uniformly continuous? Prove your assertion.
(b). Let E ⊂ X be closed and f : E → Y be a continuous function. Is f uniformly continuous? Prove your assertion.

1-10. Let f : X → R be continuous. Prove that the zero set of f,

Z(f) := {p ∈ X : f(p) = 0},

is a closed subset of X.

1-11. Let E ⊂ R^1 and f : E → R^1 be uniformly continuous.
(a). Suppose that E is bounded. Prove that f is bounded on E, i.e. f(E) is a bounded subset of R^1.
(b). Is f bounded on E (without the condition that E is bounded)?

1-12. Let K ⊂ R^k be compact and F ⊂ R^k be closed such that K ∩ F = ∅. Prove that there exists δ > 0 such that d(p, q) > δ for all p ∈ K and q ∈ F.

1-13. Let K, F ⊂ X be closed such that K ∩ F = ∅. Is the conclusion in Problem 1-12 correct? Prove your assertion.

CHAPTER 2

Sequences and series of functions

2.1. Sequences and series of numbers

Throughout this section, we assume that {a_n}, {b_n}, and {c_n} are sequences of (real) numbers.

Definition (Convergent and divergent sequences of numbers). A sequence {a_n}_{n=1}^∞ of numbers is said to converge if there is a ∈ R with the following property: For every ε > 0 there is N ∈ N such that |a_n − a| ≤ ε for all n ≥ N. We then say that {a_n} converges to a and that a is the limit of {a_n}, denoted as a_n → a, or

lim_{n→∞} a_n = a.

If {a_n} does not converge, then it is said to diverge.

Example.
(1). 1/n^p → 0 for any p > 0.
(2). p^{1/n} → 1 for any p > 0.
(3). (log n)^α / n^p → 0 for any α ∈ R and p > 0.
(4). n^α / p^n → 0 for any α ∈ R and p > 1.
(5). n^{1/n} → 1.
(6). x^n → 0 if |x| < 1.
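The six limits in the example can be observed numerically. The Python sketch below evaluates each sequence at a few indices; the parameter values (p = 0.5, p = 5, α = 2, and so on) and the function name terms are arbitrary choices made here for illustration. Items (4) and (6) are computed through exp and log, an implementation choice that avoids floating-point overflow for large n. A numerical table suggests the limits but does not prove them.

```python
import math

def terms(n):
    """Values at index n of the six example sequences, for sample parameters."""
    return {
        "(1) 1/n^p, p=0.5": 1 / n ** 0.5,
        "(2) p^(1/n), p=5": 5 ** (1 / n),
        "(3) (log n)^a/n^p, a=2, p=1": math.log(n) ** 2 / n,
        # n^a / p^n = exp(a*log(n) - n*log(p)), written this way to avoid overflow.
        "(4) n^a/p^n, a=3, p=1.5": math.exp(3 * math.log(n) - n * math.log(1.5)),
        "(5) n^(1/n)": n ** (1 / n),
        # x^n = exp(n*log(x)) for 0 < x < 1.
        "(6) x^n, x=0.9": math.exp(n * math.log(0.9)),
    }

for n in (10, 10**3, 10**6):
    print(n, terms(n))

big = terms(10**6)
```

Note how slowly some of these converge: for n = 10 the value in (4) is still far from 0, since the polynomial factor dominates before the exponential takes over.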

Definition (lim sup and lim inf). We say a is a subsequential limit of {a_n} if there is a subsequence of {a_n} that converges to a. Let E be the set of all subsequential limits of {a_n}. Then define

lim sup_{n→∞} a_n := sup E and lim inf_{n→∞} a_n := inf E.

If there is a subsequence that diverges to ∞, then we also include ∞ in E; similarly, if there is a subsequence that diverges to −∞, then we also include −∞ in E.

Theorem 2.1.

(i). lim inf a_n and lim sup a_n are both subsequential limits of {a_n}.
(ii). A sequence {a_n} is convergent iff −∞ < lim inf a_n = lim sup a_n < ∞.
(iii). A sequence {a_n} is bounded iff −∞ < lim inf a_n ≤ lim sup a_n < ∞.

Proof.

(i). Let E be the set of subsequential limits of {a_n}. Suppose that lim sup a_n = ∞. Then for any k ∈ N, there is a subsequential limit m_k ≥ k + 1. If m_k = ∞ for some k, then ∞ ∈ E, i.e. ∞ is itself a subsequential limit, and we are done. If m_k < ∞, since m_k is a subsequential limit, there is a subsequence that converges to m_k. Hence, there exists n_k ∈ N, which we may choose with n_k > n_{k−1}, such that |a_{n_k} − m_k| < 1. Hence,

a_{n_k} = m_k − (m_k − a_{n_k}) ≥ m_k − |m_k − a_{n_k}| ≥ k.

That is, a_{n_k} → ∞ as k → ∞.

Suppose that lim sup a_n = −∞. Then E = {−∞} and a_n → −∞.

Now suppose that −∞ < lim sup a_n = m < ∞. Then m = sup E, where E is the set of the subsequential limits of {a_n}. Then there exists m_1 ∈ E such that m − 1/2 ≤ m_1 ≤ m. Since m_1 is a subsequential limit, there is a subsequence that converges to m_1. Hence, there exists n_1 ∈ N such that |a_{n_1} − m_1| ≤ 1/2. By the triangle inequality,

|a_{n_1} − m| ≤ |a_{n_1} − m_1| + |m_1 − m| ≤ 1.

Similarly, we can choose n_1 < n_2 < ··· such that |a_{n_k} − m| ≤ 1/k for all k ∈ N. Now a_{n_k} → m as k → ∞ and therefore m ∈ E, i.e. m is a subsequential limit.

Similarly, one can show that lim inf a_n is a subsequential limit of {a_n}. □

If a sequence is convergent, then it is bounded. However, bounded sequences are not necessarily convergent, e.g. {(−1)^n} is divergent. The following property of monotonicity guarantees bounded sequences to be convergent.

Remark (Monotonic sequences). A sequence {a_n}_{n=1}^∞ of real numbers is said to be
• (monotonically) increasing if a_n ≤ a_{n+1} for all n ∈ N;
• (monotonically) decreasing if a_n ≥ a_{n+1} for all n ∈ N.
A sequence is said to be monotonic if it is increasing or decreasing. A monotonic sequence converges iff it is bounded. In fact, let E be the set of numbers in {a_n}.

• If {a_n} is increasing and bounded, then a_n → a, where a = sup E.
• If {a_n} is increasing, then it is unbounded iff a_n → ∞.
• If {a_n} is decreasing and bounded, then a_n → a, where a = inf E.
• If {a_n} is decreasing, then it is unbounded iff a_n → −∞.
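A standard instance of an increasing bounded sequence is a_n = (1 + 1/n)^n, which is increasing, bounded above by 3, and converges to e. The Python sketch below checks these properties for the first 1000 terms in floating-point arithmetic; the cutoff 1000 and the variable names are arbitrary choices for illustration, and a finite numerical check is not a proof.

```python
import math

# First 1000 terms of a_n = (1 + 1/n)^n.
a = [(1 + 1 / n) ** n for n in range(1, 1001)]

is_increasing = all(x <= y for x, y in zip(a, a[1:]))  # a_n <= a_{n+1}
is_bounded = all(x < 3 for x in a)                     # bounded above by 3
gap_to_e = math.e - a[-1]                              # distance from the limit e = sup E

print(is_increasing, is_bounded, gap_to_e)
```

The remaining gap to e after 1000 terms is still of order 10^{-3}, which shows that convergence guaranteed by monotonicity can be slow.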

Theorem 2.2. Suppose that $a_n \to a$ and $b_n \to b$. Let $\alpha, \beta \in \mathbb{R}$.

(i). $\alpha a_n + \beta b_n \to \alpha a + \beta b$.
(ii). $a_n b_n \to ab$.
(iii). If $b_n \ne 0$ and $b \ne 0$, then $a_n/b_n \to a/b$.

Theorem 2.3.

(i). Suppose that $a_n \le b_n \le c_n$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer. If $a_n \to a$ and $c_n \to a$, then $b_n \to a$.
(ii). Suppose that $0 \le a_n \le b_n$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer. If $a_n \to \infty$, then $b_n \to \infty$.
(iii). Suppose that $a_n \le b_n \le 0$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer. If $b_n \to -\infty$, then $a_n \to -\infty$.

Proof.

(i). Let ε > 0. Since an → a, there is N1 ∈ N such that |an − a| ≤ ε if n ≥ N1. Since cn → a, there is N2 ∈ N such that |cn − a| ≤ ε if n ≥ N2. Set N = max{N0,N1,N2}. If n ≥ N, then an ≤ bn ≤ cn. Hence,

bn − a ≤ cn − a ≤ |cn − a| ≤ ε and a − bn ≤ a − an ≤ |a − an| ≤ ε.

So |bn − a| ≤ ε if n ≥ N and bn → a.

Definition (Convergent and divergent series of numbers). Let $p, q \in \mathbb{N}$ such that $p \le q$. We write

\[ \sum_{n=p}^{q} a_n = a_p + \cdots + a_q. \]

With the sequence $\{a_n\}$ we associate a sequence of partial sums $\{s_n\}_{n=1}^{\infty}$, where

\[ s_n = \sum_{k=1}^{n} a_k. \]

If $\{s_n\}$ converges to $s$, we say that the infinite series $\sum_{n=1}^{\infty} a_n$ converges and write

\[ \sum_{n=1}^{\infty} a_n = s. \]

If $\{s_n\}$ diverges, we say that the infinite series $\sum_{n=1}^{\infty} a_n$ diverges. One can similarly define $\sum_{n=0}^{\infty} a_n$.

Remark (Cauchy criterion). The series $\sum a_n$ converges iff for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that

\[ \left| \sum_{k=n}^{m} a_k \right| \le \varepsilon \quad \text{if } m \ge n \ge N. \]

Example.
(1). \[ \sum_{n=0}^{\infty} x^n \begin{cases} = \dfrac{1}{1-x} & \text{if } |x| < 1, \\ \text{diverges} & \text{if } |x| \ge 1. \end{cases} \]
(2). \[ \sum_{n=0}^{\infty} \frac{1}{n!} = e. \]
(3). \[ \sum_{n=1}^{\infty} \frac{1}{n^p} \begin{cases} \text{converges} & \text{if } p > 1, \\ \text{diverges} & \text{if } 0 \le p \le 1. \end{cases} \]

Theorem 2.4.
(i). If $\sum a_n$ converges, then $a_n \to 0$ as $n \to \infty$. That is, if $a_n \not\to 0$, then $\sum a_n$ diverges.
(ii). Suppose that $a_n \ge 0$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer. Then $\sum a_n$ converges iff its partial sums form a bounded sequence.
(iii). Suppose that $|a_n| \le b_n$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer. If $\sum b_n$ converges, then $\sum a_n$ converges.
(iv). Suppose that $0 \le a_n \le b_n$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer. If $\sum a_n = \infty$, then $\sum b_n = \infty$.
(v). Suppose that $a_n \le b_n \le 0$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer. If $\sum b_n = -\infty$, then $\sum a_n = -\infty$.

Proof.
(iii). Let $\varepsilon > 0$. Since $\sum b_n$ converges, by the Cauchy criterion, there is $N \in \mathbb{N}$ with $N \ge N_0$ such that

\[ \left| \sum_{k=n}^{m} b_k \right| = \sum_{k=n}^{m} b_k \le \varepsilon \quad \text{if } m \ge n \ge N. \]

Here, we use the fact that $b_n \ge |a_n| \ge 0$ if $n \ge N \ge N_0$. By the triangle inequality,

\[ \left| \sum_{k=n}^{m} a_k \right| \le \sum_{k=n}^{m} |a_k| \le \sum_{k=n}^{m} b_k \le \varepsilon \quad \text{if } m \ge n \ge N. \]

Therefore, $\sum a_n$ converges by the Cauchy criterion. □

Theorem 2.5 (Root and ratio tests).
(i). Root test: Let $\alpha = \limsup_{n \to \infty} \sqrt[n]{|a_n|}$. Then
(a). if $\alpha < 1$, then $\sum a_n$ converges.
(b). if $\alpha > 1$, then $\sum a_n$ diverges.
(c). if $\alpha = 1$, then the root test is inconclusive.
(ii). Ratio test:
(a). If $\limsup_{n \to \infty} |a_{n+1}/a_n| < 1$, then $\sum a_n$ converges.
(b). If $|a_{n+1}/a_n| \ge 1$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer, then $\sum a_n$ diverges.

Definition. We say a series $\sum a_n$ converges absolutely if $\sum |a_n|$ converges.

Theorem 2.6. If a series converges absolutely, then it converges.

Remark. A convergent series is not necessarily absolutely convergent, e.g. $\sum (-1)^n/n$ converges but $\sum 1/n$ diverges.

Homework Assignment.

2-1. Let $E$ be the set of the numbers in an increasing sequence $\{a_n\}$. Prove that

(a). If $E$ is bounded, then $a_n \to a$, where $a = \sup E$.
(b). If $E$ is unbounded, then $a_n \to \infty$.

2-2. Suppose that $a_n \ge 0$ for $n \ge N_0$, where $N_0 \in \mathbb{N}$ is some fixed integer. Prove that $\sum a_n$ converges iff its partial sums form a bounded sequence.

2-3. Let $\alpha = \limsup_{n \to \infty} \sqrt[n]{|a_n|}$.
(a). Prove that if $\alpha < 1$, then $\sum a_n$ converges.
(b). Prove that if $\alpha > 1$, then $\sum a_n$ diverges.
(c). If $\alpha = 1$, then is the series convergent? Prove your assertion.

(Hint: If $\alpha < 1$, then there is $\beta$ such that $\alpha < \beta < 1$. If $\limsup a_n < \beta$, then there is $N \in \mathbb{N}$ such that $a_n < \beta$ for all $n \ge N$.)

2.2. Pointwise convergence of sequences and series of functions

Throughout this section, we assume that $X$ is a metric space equipped with a metric $d$. Let $E \subset X$ and $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued functions on $E$, i.e. $f_n : E \to \mathbb{R}$ for all $n \in \mathbb{N}$.

Definition (Pointwise convergence of sequences of functions). Suppose that $\{f_n(x)\}_{n=1}^{\infty}$ converges for every $x \in E$. We can then define a function $f$ by

\[ f(x) := \lim_{n \to \infty} f_n(x). \]

We say that $\{f_n\}$ converges on $E$ and that $f$ is the limit, or the limit function. Sometimes we use a more descriptive terminology and shall say that "$\{f_n\}$ converges to $f$ pointwise on $E$."

Example (The limit function of a sequence of continuous functions is not necessarily continuous). Let $f_n : [0, 1] \to \mathbb{R}$ be defined by $f_n(x) = x^n$. Then

\[ \lim_{n \to \infty} f_n(x) = \begin{cases} 0 & \text{if } 0 \le x < 1, \\ 1 & \text{if } x = 1. \end{cases} \]
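The example $f_n(x) = x^n$ can be tabulated numerically. The following is only a sketch; the sample points and the values of $n$ are arbitrary choices made here. It shows the values collapsing to 0 at every sampled $x < 1$ while staying equal to 1 at $x = 1$:

```python
def f(n: int, x: float) -> float:
    """f_n(x) = x**n on [0, 1]."""
    return x ** n

points = [0.0, 0.5, 0.9, 0.99, 1.0]

# Each row lists f_n at the sample points for one value of n.
for n in (10, 100, 1000):
    print(n, [f(n, x) for x in points])

values_at_large_n = [f(1000, x) for x in points]
```

Note that the closer $x$ is to 1, the larger $n$ must be before $x^n$ becomes small, which is the behaviour that prevents the limit function from being continuous at 1.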

Example ($(\lim f_n)' \ne \lim f_n'$). Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined as

\[ f_n(x) = \frac{\sin(nx)}{\sqrt{n}}. \]

Then $f(x) = \lim_{n \to \infty} f_n(x) = 0$. Hence, $f'(x) = 0$ for all $x \in \mathbb{R}$. But notice that $f_n'(x) = \sqrt{n} \cos(nx)$ and

\[ f_n'(0) = \sqrt{n} \to \infty \quad \text{as } n \to \infty. \]

Example ($\int_a^b \lim f_n \ne \lim \int_a^b f_n$). Let $f_n : [0, 1] \to \mathbb{R}$ be defined as

\[ f_n(x) = n x (1 - x^2)^n. \]

Then $f(x) = \lim_{n \to \infty} f_n(x) = 0$. Hence, $\int_0^1 f = 0$. But

\[ \int_0^1 f_n(x) \, dx = n \int_0^1 x (1 - x^2)^n \, dx = \frac{n}{2(n+1)} \to \frac{1}{2} \quad \text{as } n \to \infty. \]

Definition (Pointwise convergence of series of functions). Suppose that $\sum_{n=1}^{\infty} f_n(x)$ converges for every $x \in E$. We can then define a function $f$ by

\[ f(x) := \sum_{n=1}^{\infty} f_n(x). \]

We say that the series $\sum f_n$ converges to $f$ (pointwise) on $E$ or $f$ is the sum of the series $\sum f_n$.

Homework Assignment.

2-4. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by

\[ f_n(x) = \frac{x}{1 + n x^2}. \]

(a). Find $f = \lim f_n$.
(b). Find $\lim f_n'$.
(c). Is $f' = \lim f_n'$?

2-5. Let

\[ f_n(x) = \frac{1}{1 + n^2 x} \quad \text{and} \quad f(x) = \sum_{n=1}^{\infty} f_n(x). \]

(a). For what values of $x \in [0, \infty)$ does the series converge?
(b). Is $f$ continuous wherever the series converges?

2.3. Uniform convergence of sequences and series of functions

Throughout this section, we assume that $X$ is a metric space equipped with a metric $d$. Let $E \subset X$ and $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued functions on $E$, i.e. $f_n : E \to \mathbb{R}$ for all $n \in \mathbb{N}$.

Definition (Uniform convergence). We say that $\{f_n\}_{n=1}^{\infty}$ converges uniformly on $E$ to a function $f$ if for every $\varepsilon > 0$, there is $N \in \mathbb{N}$ such that

\[ |f_n(x) - f(x)| \le \varepsilon \quad \text{if } n \ge N, \text{ for all } x \in E. \]

We say that the series $\sum f_n(x)$ converges uniformly on $E$ if the sequence $\{s_n(x)\}$ of partial sums defined by

\[ s_n(x) = \sum_{i=1}^{n} f_i(x) \]

converges uniformly on $E$.

Remark. $\{f_n\}$ does not converge uniformly to $f$ on $E$ iff for some $\varepsilon > 0$ and every $N \in \mathbb{N}$, there exist $x \in E$ and $n \ge N$ such that $|f_n(x) - f(x)| \ge \varepsilon$.

Example. Determine whether the following convergence is uniform.
(1). $f_n(x) = x + 1/n$ and $f(x) = x$ on $\mathbb{R}$.
(2). $f_n(x) = \sin(nx)/\sqrt{n}$ and $f(x) = 0$ on $\mathbb{R}$.
(3). $f_n(x) = x^n$ and $f(x) = 0$ on $[0, 1]$.
(4). $f_n(x) = x^n$ and $f(x) = 0$ on $[0, 1)$.
(5). $f_n(x) = x^n$ and $f(x) = 0$ on $[0, 1/2]$.
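One way to build intuition for examples (4) and (5) is to estimate $\sup_{x \in E} |f_n(x) - f(x)|$ on a finite grid. The sketch below does this for $f_n(x) = x^n$ and $f = 0$. The helper `sup_on_grid` and the grid size are assumptions made for illustration, and a grid maximum only bounds the true supremum from below, so this is evidence rather than proof.

```python
def sup_on_grid(g, a, b, include_right=True, samples=10_001):
    """Approximate sup |g| over an evenly spaced grid on [a, b].

    With include_right=False the right endpoint b is skipped, which
    mimics the half-open interval [a, b).
    """
    step = (b - a) / (samples - 1)
    count = samples if include_right else samples - 1
    return max(abs(g(a + i * step)) for i in range(count))

estimates = {}
for n in (5, 50, 500):
    power = lambda x, n=n: x ** n            # f_n(x) - f(x) with f = 0
    on_half_open = sup_on_grid(power, 0.0, 1.0, include_right=False)
    on_small = sup_on_grid(power, 0.0, 0.5)
    estimates[n] = (on_half_open, on_small)
    print(n, on_half_open, on_small)
```

On $[0, 1/2]$ the estimates shrink like $(1/2)^n$, while on $[0, 1)$ they stay close to 1 for every $n$ tried, which is consistent with uniform convergence in (5) and its failure in (4).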

Theorem 2.7. Suppose that $\lim_{n \to \infty} f_n(x) = f(x)$ on $E$. Put

\[ M_n = \sup_{x \in E} |f_n(x) - f(x)|. \]

Then $f_n \to f$ uniformly on $E$ iff $M_n \to 0$ as $n \to \infty$.

Theorem 2.8. $\{f_n\}_{n=1}^{\infty}$ converges uniformly on $E$ iff for every $\varepsilon > 0$, there is $N \in \mathbb{N}$ such that $n, m \ge N$ implies that

\[ |f_n(x) - f_m(x)| \le \varepsilon \quad \text{for all } x \in E. \]

Proof. Necessity. Suppose that $\{f_n\}_{n=1}^{\infty}$ converges uniformly on $E$ to a function $f$. Then for every $\varepsilon > 0$, there is $N \in \mathbb{N}$ such that $n \ge N$ implies that

\[ |f_n(x) - f(x)| < \frac{\varepsilon}{2} \quad \text{for all } x \in E. \]

Hence, if $m, n \ge N$, then

\[ |f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| \le \varepsilon \quad \text{for all } x \in E. \]

Sufficiency. Suppose that the Cauchy criterion holds. Then for each $x \in E$, $\{f_n(x)\}$ is a Cauchy sequence and thus converges to a limit, which we may call $f(x)$. Hence, $\{f_n\}$ converges to $f$ pointwise on $E$. We have to prove that the convergence is uniform. Let $\varepsilon > 0$. There is $N \in \mathbb{N}$ such that $n, m \ge N$ implies that

\[ |f_n(x) - f_m(x)| \le \varepsilon \quad \text{for all } x \in E. \]

Now fix $n \ge N$ above and let $m \to \infty$. Since $f_m(x) \to f(x)$, we have that

\[ |f_n(x) - f(x)| \le \varepsilon \quad \text{for all } x \in E. \]
□

Theorem 2.9. Suppose that $|f_n(x)| \le M_n$ for $x \in E$ and $n \in \mathbb{N}$. If $\sum M_n$ converges, then $\sum f_n$ converges uniformly on $E$.

Proof. If $\sum M_n$ converges, then for arbitrary $\varepsilon > 0$,

\[ \left| \sum_{i=n}^{m} f_i(x) \right| \le \sum_{i=n}^{m} |f_i(x)| \le \sum_{i=n}^{m} M_i \le \varepsilon \]

for all $x \in E$, provided that $m \ge n$ are large enough. Uniform convergence then follows by the Cauchy criterion (Theorem 2.8) applied to the partial sums. □

Homework Assignment.

2-6. Let $f_n(x) = \sqrt[n]{x}$.

(a). Does {fn} converge uniformly on [1, ∞)? Prove your assertion. (b). Does {fn} converge uniformly on (0, 1]? Prove your assertion. (c). Does {fn} converge uniformly on [1, 2]? Prove your assertion. 2-7. Suppose that fn is bounded on a set E for all n ∈ N. (a). If fn → f uniformly on E, then is f bounded on E? Prove your assertion. (b). If fn → f pointwise on E, then is f bounded on E? Prove your assertion. 2-8. Suppose that fn is bounded for all n ∈ N and fn → f uniformly on E. Prove that {fn} is uniformly bounded, i.e. there exists M ∈ R such that |fn(x)| ≤ M for all n ∈ N and x ∈ E. 2-9. Suppose that fn → f uniformly on E and gn → g uniformly on E. Prove that fn + gn → f + g uniformly on E. 2-10. Suppose that fn → f uniformly on E and gn → g uniformly on E. Prove that if fn and gn are bounded, then fngn → fg uniformly. 2-11. Suppose that fn → f uniformly on E and gn → g uniformly on E. Is fngn → fg uniformly on E? Prove your assertion.

2.4. Uniform convergence and continuity

Throughout this section, we assume that $X$ is a metric space equipped with a metric $d$. Let $E \subset X$ and $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued functions on $E$, i.e. $f_n : E \to \mathbb{R}$ for all $n \in \mathbb{N}$.

Theorem 2.10. Suppose that $f_n \to f$ uniformly on $E$. Let $x$ be a limit point of $E$ and suppose that

\[ \lim_{t \to x} f_n(t) = A_n \quad \text{for } n \in \mathbb{N}. \]

Then $\{A_n\}$ converges and

\[ \lim_{t \to x} f(t) = \lim_{n \to \infty} A_n. \]

That is,

\[ \lim_{t \to x} \lim_{n \to \infty} f_n(t) = \lim_{n \to \infty} \lim_{t \to x} f_n(t). \]

Proof. Let ε > 0. Since fn → f uniformly on E, there is N ∈ N such that

|fn(t) − fm(t)| ≤ ε for all t ∈ E if n, m ≥ N. Letting t → x, we obtain that

|An − Am| ≤ ε if n, m ≥ N.

Therefore, $\{A_n\}$ is Cauchy and converges to $A$ for some $A \in \mathbb{R}$. We next need to show that $\lim_{t \to x} f(t) = A$.

Let $\varepsilon > 0$. Since $f_n \to f$ uniformly on $E$, there is $N_1 \in \mathbb{N}$ such that $|f_n(t) - f(t)| \le \varepsilon/3$ for all $t \in E$ and $n \ge N_1$. Since $A_n \to A$, there is $N_2 \in \mathbb{N}$ such that $|A_n - A| \le \varepsilon/3$ if $n \ge N_2$. Set $N = \max\{N_1, N_2\}$. Since $\lim_{t \to x} f_N(t) = A_N$, there exists $\delta > 0$ such that $|f_N(t) - A_N| < \varepsilon/3$ if $0 < d(t, x) < \delta$ and $t \in E$. Now if $0 < d(t, x) < \delta$ and $t \in E$, then by the triangle inequality,

\[ |f(t) - A| \le |f(t) - f_N(t)| + |f_N(t) - A_N| + |A_N - A| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \]

Hence, $\lim_{t \to x} f(t) = A$. □

An immediate consequence is the following.

Theorem 2.11. If fn is continuous on E for all n ∈ N and fn → f uniformly on E, then f is continuous on E.

Proof. Let $x \in E$. Since $f_n$ is continuous, $\lim_{t \to x} f_n(t) = f_n(x)$. Since $f_n \to f$ uniformly on $E$, by Theorem 2.10,

\[ \lim_{t \to x} f(t) = \lim_{t \to x} \lim_{n \to \infty} f_n(t) = \lim_{n \to \infty} \lim_{t \to x} f_n(t) = \lim_{n \to \infty} f_n(x) = f(x). \]

That is, $f$ is continuous at $x$. □

Definition. We denote by $C(X)$ the space of real-valued, continuous, and bounded functions on $X$.

Remark. Notice that if $X$ is compact, then continuous functions are bounded, so the boundedness condition in the above definition is redundant.

Theorem 2.12. For $f \in C(X)$, define its supremum norm

\[ \|f\| = \sup_{x \in X} |f(x)|. \]

Then $\|f - g\|$ defines a metric on $C(X)$. Moreover, $C(X)$ is complete.

Proof. Let $f, g, h \in C(X)$.

(i). First, $\|f - f\| = \sup_{x \in X} |f(x) - f(x)| = 0$. Second, if $\|f - g\| = \sup_{x \in X} |f(x) - g(x)| = 0$, then $|f(x) - g(x)| = 0$ for all $x \in X$, so $f(x) = g(x)$ and $f = g$.

(ii). $\|f - g\| = \sup_{x \in X} |f(x) - g(x)| = \sup_{x \in X} |g(x) - f(x)| = \|g - f\|$.

(iii). For any $x \in X$, $|f(x) - g(x)| \le |f(x) - h(x)| + |h(x) - g(x)|$. Then

\[ \sup_{x \in X} |f(x) - g(x)| \le \sup_{x \in X} \big( |f(x) - h(x)| + |h(x) - g(x)| \big) \le \sup_{x \in X} |f(x) - h(x)| + \sup_{x \in X} |h(x) - g(x)|. \]

That is, $\|f - g\| \le \|f - h\| + \|h - g\|$.

To prove that $C(X)$ equipped with the metric $\|\cdot\|$ is complete, let $\{f_n\}$ be a Cauchy sequence in $C(X)$. Then for any $\varepsilon > 0$, there is $N \in \mathbb{N}$ such that

\[ \|f_n - f_m\| = \sup_{x \in X} |f_n(x) - f_m(x)| \le \varepsilon \quad \text{if } n, m \ge N. \]

Therefore, for any $x \in X$, $|f_n(x) - f_m(x)| \le \varepsilon$ if $n, m \ge N$. That is, $\{f_n(x)\}$ is a Cauchy sequence in $\mathbb{R}$ (since $f_n$ is real-valued). Since $\mathbb{R}$ is complete, there is a limit, denoted as $f(x) \in \mathbb{R}$, of the sequence $\{f_n(x)\}$. Hence, $f_n \to f$ pointwise on $X$. We next need to show that $f \in C(X)$.

First, $f$ is real-valued since $f(x) \in \mathbb{R}$ for all $x \in X$.

Second, by Theorem 2.8, $f_n \to f$ uniformly on $X$. Since each $f_n$ is continuous, $f$ is also continuous by Theorem 2.11.

Third, since $f_n \to f$ uniformly, there exists $N \in \mathbb{N}$ such that $|f_n(x) - f(x)| \le 1$ for all $x \in X$ and $n \ge N$. Since $f_N$ is bounded, there is $M \in \mathbb{R}$ such that $|f_N(x)| \le M$ for all $x \in X$. Then by the triangle inequality,

\[ |f(x)| \le |f_N(x)| + |f_N(x) - f(x)| \le M + 1 \quad \text{for all } x \in X, \]

so $f$ is bounded. Therefore, $f \in C(X)$. Since $f_n \to f$ uniformly means exactly that $\|f_n - f\| \to 0$, $C(X)$ is complete. □

Homework Assignment.

2-12. Let $f_n$ be continuous for all $n \in \mathbb{N}$ and $f_n \to f$ pointwise on $E$ for some continuous function $f$ on $E$. Does $f_n \to f$ uniformly? Prove your assertion.

2-13. Let $f_n$ be continuous for all $n \in \mathbb{N}$ and $f_n \to f$ uniformly on $E$. Prove that $f_n(x_n) \to f(x)$ for every sequence of points $x_n \in E$ such that $x_n \to x$ and $x \in E$.

2.5. Uniform convergence and differentiation

Theorem 2.13. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of differentiable real-valued functions on $[a, b]$ such that $f_n \to f$ pointwise on $[a, b]$. If $\{f_n'\}_{n=1}^{\infty}$ converges uniformly on $[a, b]$, then $f$ is differentiable and

\[ \lim_{n \to \infty} f_n'(x) = f'(x) \quad \text{for all } x \in [a, b]. \]

Proof. Fix $x \in [a, b]$ and write

\[ g_n(t) = \frac{f_n(t) - f_n(x)}{t - x} \quad \text{and} \quad g(t) = \frac{f(t) - f(x)}{t - x} \]

for $t \in [a, b]$ and $t \ne x$. We next need to show that $\lim_{t \to x} g(t)$ exists (so that $f$ is differentiable at $x$).

Let $\varepsilon > 0$. Since $\{f_n'\}$ converges uniformly, by Theorem 2.8 there is $N \in \mathbb{N}$ such that

\[ |f_n'(t) - f_m'(t)| < \varepsilon \quad \text{for all } t \in [a, b] \text{ if } n, m \ge N. \]

Applying the mean value theorem to the function $f_n - f_m$, we have that

\[ |(f_n(t) - f_m(t)) - (f_n(x) - f_m(x))| \le |f_n'(s) - f_m'(s)| \, |t - x| < \varepsilon |t - x|, \]

in which $s$ is between $x$ and $t$. It then follows that for all $t \in [a, b]$ and $t \ne x$,

\[ |g_n(t) - g_m(t)| = \left| \frac{f_n(t) - f_n(x)}{t - x} - \frac{f_m(t) - f_m(x)}{t - x} \right| = \frac{|(f_n(t) - f_m(t)) - (f_n(x) - f_m(x))|}{|t - x|} < \varepsilon \]

if $n, m \ge N$. By Theorem 2.8, $\{g_n\}$ converges uniformly on $[a, b] \setminus \{x\}$; since $g_n \to g$ pointwise there, the uniform limit is $g$. Notice that $\lim_{t \to x} g_n(t) = f_n'(x)$. By Theorem 2.10, we then have that $\{f_n'(x)\}$ converges as $n \to \infty$ and

\[ \lim_{t \to x} g(t) = \lim_{t \to x} \lim_{n \to \infty} g_n(t) = \lim_{n \to \infty} \lim_{t \to x} g_n(t) = \lim_{n \to \infty} f_n'(x). \]

That is, $f'(x)$ exists and equals $\lim_{n \to \infty} f_n'(x)$. □

One can also conclude that $f_n \to f$ uniformly from the conditions in the above theorem. In fact,

Theorem 2.14. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of differentiable real-valued functions on $[a, b]$ such that $\{f_n(x_0)\}$ converges for some point $x_0 \in [a, b]$. If $\{f_n'\}$ converges uniformly on $[a, b]$, then $\{f_n\}$ converges uniformly on $[a, b]$. Moreover, if $f = \lim_{n \to \infty} f_n$, then $f'(x) = \lim_{n \to \infty} f_n'(x)$ for all $x \in [a, b]$.

Proof. See Problem 2-14. □

Homework Assignment.

2-14. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of differentiable real-valued functions on $[a, b]$ such that $\{f_n(x_0)\}$ converges for some point $x_0 \in [a, b]$. Prove that if $\{f_n'\}$ converges uniformly on $[a, b]$, then $\{f_n\}$ converges uniformly on $[a, b]$.

2.6. Uniform convergence and integration

Definition (Partitions). A partition P = {x0, x1, ..., xn} of the interval [a, b] is a finite set of numbers x0, x1, ..., xn such that

a = x0 < x1 < ··· < xn = b.

• The norm (or mesh, gap) of a partition $P = \{x_0, x_1, ..., x_n\}$, denoted by $\|P\|$, is $\max\{\Delta x_i : i = 1, ..., n\}$, in which $\Delta x_i = x_i - x_{i-1}$.
• Let $P$ and $Q$ be partitions of $[a, b]$. We say that $Q$ is a refinement of $P$ if $P \subset Q$.

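Let $f : [a, b] \to \mathbb{R}$ be bounded and $P = \{x_0, x_1, ..., x_n\}$ be a partition of $[a, b]$. Define the lower Riemann sum with respect to the partition $P$ as

\[ \underline{S}(f, P) = \sum_{i=1}^{n} m_i \Delta x_i, \quad \text{where } m_i = \inf_{x \in [x_{i-1}, x_i]} f(x), \]

and the upper Riemann sum with respect to the partition $P$ as

\[ \overline{S}(f, P) = \sum_{i=1}^{n} M_i \Delta x_i, \quad \text{where } M_i = \sup_{x \in [x_{i-1}, x_i]} f(x). \]

Example. Let $f(x) = x$ on $[0, 1]$. Let $x_i = i/n$ for $i = 0, ..., n$ and $P_n = \{x_0, ..., x_n\}$. Then

\[ \underline{S}(f, P_n) = \sum_{i=1}^{n} m_i \Delta x_i = \frac{1}{n} \sum_{i=1}^{n} \frac{i - 1}{n} = \frac{n^2 - n}{2n^2}, \]

and

\[ \overline{S}(f, P_n) = \sum_{i=1}^{n} M_i \Delta x_i = \frac{1}{n} \sum_{i=1}^{n} \frac{i}{n} = \frac{n^2 + n}{2n^2}. \]

Remark. Let $f : [a, b] \to \mathbb{R}$ be bounded and $P$ and $Q$ be two partitions of $[a, b]$. Then
• $m(b - a) \le \underline{S}(f, P) \le \overline{S}(f, P) \le M(b - a)$, where $m = \inf_{x \in [a, b]} f(x)$ and $M = \sup_{x \in [a, b]} f(x)$.
• If $Q$ is a refinement of $P$, then $\underline{S}(f, P) \le \underline{S}(f, Q)$ and $\overline{S}(f, P) \ge \overline{S}(f, Q)$.
• $\underline{S}(f, P) \le \underline{S}(f, P \cup Q) \le \overline{S}(f, P \cup Q) \le \overline{S}(f, Q)$.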
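The closed forms in the example for $f(x) = x$ can be checked with exact arithmetic. The sketch below assumes $f$ is nondecreasing, so that the infimum and supremum on each subinterval are attained at the left and right endpoints; the helper name `lower_upper_sums` is introduced here for illustration only.

```python
from fractions import Fraction

def lower_upper_sums(f, partition):
    """Lower and upper Riemann sums of a nondecreasing f.

    For nondecreasing f, inf f on [left, right] is f(left) and
    sup f on [left, right] is f(right).
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in pairs)
    upper = sum(f(right) * (right - left) for left, right in pairs)
    return lower, upper

n = 10
P_n = [Fraction(i, n) for i in range(n + 1)]   # uniform partition of [0, 1]
lower, upper = lower_upper_sums(lambda x: x, P_n)
print(lower, upper)
```

For $n = 10$ this gives $9/20$ and $11/20$, matching $(n^2 - n)/(2n^2)$ and $(n^2 + n)/(2n^2)$.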

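Definition (Lower and upper Riemann sums). Let $f : [a, b] \to \mathbb{R}$ be bounded. Write

\[ \underline{S}(f) = \sup\{\underline{S}(f, P) : P \text{ is a partition of } [a, b]\}, \]

and

\[ \overline{S}(f) = \inf\{\overline{S}(f, P) : P \text{ is a partition of } [a, b]\}. \]

Since $\underline{S}(f, P) \le \overline{S}(f, Q)$ for any two partitions $P$ and $Q$ of $[a, b]$, $\underline{S}(f) \le \overline{S}(f)$.

Definition (Riemann integrable functions). Let $f : [a, b] \to \mathbb{R}$ be bounded. We say that $f$ is Riemann integrable on $[a, b]$ if $\underline{S}(f) = \overline{S}(f)$; in this case, we say that the Riemann integral of $f$ on $[a, b]$ is $I := \underline{S}(f) = \overline{S}(f)$, denoted as

\[ \int_a^b f \, dx = I. \]

Example. Let the Dirichlet function $f : [0, 1] \to \mathbb{R}$ be defined by

\[ f(x) = \begin{cases} 0 & \text{if } x \in \mathbb{Q} \cap [0, 1]; \\ 1 & \text{if } x \in [0, 1] \setminus \mathbb{Q}. \end{cases} \]

Then $\underline{S}(f, P) = 0$ and $\overline{S}(f, P) = 1$ for any partition $P$ of $[0, 1]$, so $\underline{S}(f) = 0 \ne 1 = \overline{S}(f)$.

Proposition 2.15. Let $f : [a, b] \to \mathbb{R}$ be bounded. Then $f$ is Riemann integrable on $[a, b]$ iff for every $\varepsilon > 0$, there exists a partition $P = P(\varepsilon)$ of $[a, b]$ such that

\[ \overline{S}(f, P) - \underline{S}(f, P) \le \varepsilon. \]

Proof. Sufficiency: For every $\varepsilon > 0$, there exists a partition $P = P(\varepsilon)$ of $[a, b]$ such that $\overline{S}(f, P) - \underline{S}(f, P) \le \varepsilon$. Then

\[ 0 \le \overline{S}(f) - \underline{S}(f) \le \overline{S}(f, P) - \underline{S}(f, P) \le \varepsilon. \]

So $\underline{S}(f) = \overline{S}(f)$ and $f$ is Riemann integrable.

Necessity: Let $f$ be Riemann integrable. Then $\underline{S}(f) = \overline{S}(f) = I$. For any $\varepsilon > 0$, since $\underline{S}(f) = \sup\{\underline{S}(f, P) : P \text{ is a partition of } [a, b]\}$, there is a partition $P_1$ of $[a, b]$ such that

\[ I - \frac{\varepsilon}{2} \le \underline{S}(f, P_1) \le \underline{S}(f) = I; \]

since $\overline{S}(f) = \inf\{\overline{S}(f, P) : P \text{ is a partition of } [a, b]\}$, there is a partition $P_2$ of $[a, b]$ such that

\[ I = \overline{S}(f) \le \overline{S}(f, P_2) \le I + \frac{\varepsilon}{2}. \]

That is,

\[ I - \frac{\varepsilon}{2} \le \underline{S}(f, P_1) \le I \le \overline{S}(f, P_2) \le I + \frac{\varepsilon}{2}. \]

Let $P = P_1 \cup P_2$. Then $P$ is a refinement of $P_1$ and of $P_2$. So

\[ \underline{S}(f, P_1) \le \underline{S}(f, P) \le I \quad \text{and} \quad I \le \overline{S}(f, P) \le \overline{S}(f, P_2). \]

Therefore,

\[ I - \frac{\varepsilon}{2} \le \underline{S}(f, P) \le I \le \overline{S}(f, P) \le I + \frac{\varepsilon}{2}, \]

and $\overline{S}(f, P) - \underline{S}(f, P) \le \varepsilon$. □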

Example. Let $f(x) = x$ on $[0, 1]$. Let $x_i = i/n$ for $i = 0, ..., n$ and $P_n = \{x_0, ..., x_n\}$. Then

\[ \underline{S}(f, P_n) = \frac{n^2 - n}{2n^2} \to \frac{1}{2} \quad \text{and} \quad \overline{S}(f, P_n) = \frac{n^2 + n}{2n^2} \to \frac{1}{2} \]

as $n \to \infty$. Therefore, $f$ is Riemann integrable and $\int_0^1 f = 1/2$, since $\underline{S}(f, P_n) \le \underline{S}(f) \le \overline{S}(f) \le \overline{S}(f, P_n)$ for all $n \in \mathbb{N}$.

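Remark (Riemann-Lebesgue Theorem). Let $f : [a, b] \to \mathbb{R}$ be bounded. Then $f$ is Riemann integrable on $[a, b]$ iff $f$ is continuous a.e. on $[a, b]$, i.e. the set of points at which $f$ is discontinuous is of measure zero.

Theorem 2.16. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of Riemann integrable functions on $[a, b]$ such that $f_n \to f$ uniformly on $[a, b]$. Then $f$ is Riemann integrable and

\[ \int_a^b f(x) \, dx = \lim_{n \to \infty} \int_a^b f_n(x) \, dx. \]

(The existence of the limit on the right-hand side is part of the conclusion.)

Proof. Put

\[ \varepsilon_n = \sup_{x \in [a, b]} |f_n(x) - f(x)|. \]

Then $f_n - \varepsilon_n \le f \le f_n + \varepsilon_n$ on $[a, b]$. Notice that $f_n + \varepsilon_n$ and $f_n - \varepsilon_n$ are both Riemann integrable. Then

\[ \underline{S}(f_n - \varepsilon_n) = \int_a^b (f_n - \varepsilon_n) \, dx \quad \text{and} \quad \overline{S}(f_n + \varepsilon_n) = \int_a^b (f_n + \varepsilon_n) \, dx. \]

Therefore,

\[ \int_a^b (f_n - \varepsilon_n) \, dx = \underline{S}(f_n - \varepsilon_n) \le \underline{S}(f) \le \overline{S}(f) \le \overline{S}(f_n + \varepsilon_n) = \int_a^b (f_n + \varepsilon_n) \, dx. \]

Hence,

\[ 0 \le \overline{S}(f) - \underline{S}(f) \le \int_a^b (f_n + \varepsilon_n) \, dx - \int_a^b (f_n - \varepsilon_n) \, dx = 2 \varepsilon_n (b - a). \]

We know that $\varepsilon_n \to 0$ since $f_n \to f$ uniformly on $[a, b]$. Therefore, $\overline{S}(f) - \underline{S}(f) = 0$ and $f$ is Riemann integrable. Moreover,

\[ \left| \int_a^b f \, dx - \int_a^b f_n \, dx \right| \le \int_a^b |f - f_n| \, dx \le \varepsilon_n (b - a), \]

which implies that

\[ \int_a^b f(x) \, dx = \lim_{n \to \infty} \int_a^b f_n(x) \, dx. \]
□

Remark. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of Riemann integrable functions on $[a, b]$ such that $f_n \to f$ pointwise on $[a, b]$. Then $f$ is not necessarily Riemann integrable. For example, enumerate the rational numbers in $[0, 1]$ as $\{r_n\}_{n=1}^{\infty}$ and define

\[ f_n(x) = \begin{cases} 0 & \text{if } x \in \{r_1, ..., r_n\}; \\ 1 & \text{if } x \in [0, 1] \setminus \{r_1, ..., r_n\}. \end{cases} \]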

Then for all $n \in \mathbb{N}$, we have that $f_n$ is Riemann integrable since there are only $n$ points $\{r_1, ..., r_n\}$ where $f_n$ is not continuous. Now $f_n \to f$, in which $f$ is the Dirichlet function and is not Riemann integrable.

Corollary 2.17. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of Riemann integrable functions on $[a, b]$ such that $\sum f_n$ converges uniformly to $f$ on $[a, b]$. Then $f$ is Riemann integrable and

\[ \int_a^b f(x) \, dx = \sum_{n=1}^{\infty} \int_a^b f_n(x) \, dx. \]

Proof. See Problem 2-16. □

Homework Assignment.

2-15. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of Riemann integrable functions on $[a, b]$ such that $\{f_n\}$ converges pointwise to a Riemann integrable function $f$ on $[a, b]$. Is

\[ \int_a^b f(x) \, dx = \lim_{n \to \infty} \int_a^b f_n(x) \, dx? \]

Prove your assertion.

2-16. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of Riemann integrable functions on $[a, b]$ such that $\sum f_n$ converges uniformly to $f$ on $[a, b]$. Prove that $f$ is Riemann integrable and

\[ \int_a^b f(x) \, dx = \sum_{n=1}^{\infty} \int_a^b f_n(x) \, dx. \]

2.7. Power series

In this section, we study the functions which are represented by power series, i.e.

\[ \sum_{n=0}^{\infty} c_n x^n \quad \text{and} \quad \sum_{n=0}^{\infty} c_n (x - a)^n. \]

For example, the Taylor series of an analytic function $f$ at $a$ is

\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n. \]

Theorem 2.18. Set

\[ \alpha = \limsup_{n \to \infty} \sqrt[n]{|c_n|} \quad \text{and} \quad R = \frac{1}{\alpha}. \]

(If $\alpha = 0$, then set $R = \infty$; if $\alpha = \infty$, then set $R = 0$.) Then $\sum c_n x^n$ converges if $|x| < R$ and diverges if $|x| > R$. We call $R$ the radius of convergence for the power series $\sum c_n x^n$.

Proof. Since $\sqrt[n]{|c_n x^n|} = |x| \sqrt[n]{|c_n|}$,

\[ \limsup_{n \to \infty} \sqrt[n]{|c_n x^n|} = |x| \limsup_{n \to \infty} \sqrt[n]{|c_n|} = \frac{|x|}{R}. \]

By the root test, if $|x| < R$, then $|x|/R < 1$ and $\sum |c_n x^n|$ converges; if $|x| > R$, then $|x|/R > 1$ and $\sum c_n x^n$ diverges. □

Remark. Just as in the root test, when $|x| = R$, the above theorem is inconclusive. For example, take

\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n} x^n. \]

Then

\[ \alpha = \limsup_{n \to \infty} \sqrt[n]{\frac{1}{n}} = 1 \quad \text{and} \quad R = \frac{1}{\alpha} = 1. \]

Now at $x = 1$, the series $\sum (-1)^n x^n / n$ converges; at $x = -1$, $\sum (-1)^n x^n / n$ diverges.
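The behaviour at the two endpoints in the remark can be observed numerically. The following sketch (the truncation levels are arbitrary choices made here) computes partial sums of $\sum_{n \ge 1} (-1)^n x^n / n$ at $x = 1$, where they settle near $-\log 2$, and at $x = -1$, where the series is the harmonic series and the partial sums keep growing:

```python
import math

def partial_sum(x: float, terms: int) -> float:
    """Partial sum of sum_{n>=1} (-1)^n x^n / n up to n = terms."""
    return sum((-1) ** n * x ** n / n for n in range(1, terms + 1))

for N in (10, 1_000, 100_000):
    print(N, partial_sum(1.0, N), partial_sum(-1.0, N))

at_plus_one = partial_sum(1.0, 100_000)
at_minus_one = partial_sum(-1.0, 100_000)
```

Finitely many partial sums cannot prove divergence, of course; the output only illustrates the slow, logarithmic growth at $x = -1$.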

Theorem 2.19. Suppose that the series

\[ f(x) = \sum_{n=0}^{\infty} c_n x^n \]

has radius of convergence $R > 0$ (and $R$ may be $\infty$). Then
(i). for any $\varepsilon > 0$, $\sum c_n x^n$ converges uniformly on $[-R + \varepsilon, R - \varepsilon]$;
(ii). $f$ is continuous in $(-R, R)$;
(iii). $f$ is differentiable in $(-R, R)$, and

\[ f'(x) = \sum_{n=1}^{\infty} n c_n x^{n-1}; \]

(iv). $f$ defined above has derivatives of all orders in $(-R, R)$, which are given by

\[ f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1) c_n x^{n-k}. \]

In particular, for $k = 0, 1, ...$,

\[ f^{(k)}(0) = k! \, c_k. \]

Proof.
(i). Given $\varepsilon > 0$, let $x = R - \varepsilon \in (-R, R)$. Then we have that

\[ \sum_{n=0}^{\infty} c_n (R - \varepsilon)^n \]

converges absolutely, i.e. $\sum |c_n (R - \varepsilon)^n|$ converges. (By the previous theorem, the power series converges absolutely in the interior of its interval of convergence.) Since $|c_n x^n| \le |c_n (R - \varepsilon)^n|$ for all $x \in [-R + \varepsilon, R - \varepsilon]$, $\sum c_n x^n$ converges uniformly on $[-R + \varepsilon, R - \varepsilon]$ by Theorem 2.9.

(ii). For any $x \in (-R, R)$, let $\varepsilon = \min\{|R - x|, |R + x|\}/2$. Then $f(x) = \sum c_n x^n$ converges uniformly on $[-R + \varepsilon, R - \varepsilon] \ni x$. Hence, by Theorem 2.11, $f$ is continuous at $x$ and therefore is continuous in $(-R, R)$.

(iii). Since $\sqrt[n]{n} \to 1$ as $n \to \infty$,

\[ \limsup_{n \to \infty} \sqrt[n]{|n c_n|} = \limsup_{n \to \infty} \sqrt[n]{|c_n|}. \]

That is, $\sum n c_n x^{n-1}$ has the same radius $R$ of convergence as $\sum c_n x^n$.

For any $x \in (-R, R)$, let $\varepsilon = \min\{|R - x|, |R + x|\}/2$. Then $\sum n c_n x^{n-1}$ converges uniformly on $[-R + \varepsilon, R - \varepsilon] \ni x$. Hence, by Theorem 2.13 applied to the partial sums, $f$ is differentiable at $x$ and

\[ f'(x) = \sum_{n=1}^{\infty} n c_n x^{n-1}. \]

Therefore, $f$ is differentiable in $(-R, R)$.

(iv). It follows by successively applying (iii). □
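Part (iii) of Theorem 2.19 can be sanity-checked on the geometric series, where $c_n = 1$, $R = 1$, and $f(x) = 1/(1 - x)$, so the termwise derivative $\sum_{n \ge 1} n x^{n-1}$ should equal $1/(1 - x)^2$. The sketch below compares a truncated sum with that closed form; the evaluation point $x = 0.5$ and the number of terms are arbitrary choices made here.

```python
def derivative_series(x: float, terms: int) -> float:
    """Partial sum of sum_{n>=1} n * x**(n-1), the termwise derivative of sum x**n."""
    return sum(n * x ** (n - 1) for n in range(1, terms + 1))

x = 0.5
approx = derivative_series(x, 200)
exact = 1.0 / (1.0 - x) ** 2      # derivative of 1 / (1 - x)
print(approx, exact)
```

Inside the interval of convergence the truncated sum agrees with the closed form up to rounding error.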

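2.8. Exponential function and logarithmic function

Define

\[ E(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \]

in which $n! = 1 \cdot 2 \cdots n$. Then for $x \ne 0$,

\[ \limsup_{n \to \infty} \left| \frac{x^{n+1}/(n+1)!}{x^n/n!} \right| = \limsup_{n \to \infty} \frac{|x|}{n+1} = 0. \]

By the ratio test, $E(x)$ converges for all $x \in \mathbb{R}$. By Theorem 2.19 (with $R = \infty$), we know that
(i). the series $E(x)$ converges uniformly on every bounded interval $[-M, M]$;
(ii). $E(x)$ is continuous on $\mathbb{R}$;
(iii). $E(x)$ is differentiable in $\mathbb{R}$, and

\[ E'(x) = \sum_{n=1}^{\infty} \frac{n}{n!} x^{n-1} = \sum_{n=1}^{\infty} \frac{x^{n-1}}{(n-1)!} = \sum_{n=0}^{\infty} \frac{x^n}{n!} = E(x); \]

(iv). $E$ defined above has derivatives of all orders on $\mathbb{R}$, which are given by

\[ E^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1) \frac{x^{n-k}}{n!} = \sum_{n=k}^{\infty} \frac{x^{n-k}}{(n-k)!} = \sum_{n=0}^{\infty} \frac{x^n}{n!} = E(x). \]

Notice that for $x, y \in \mathbb{R}$, $E(x)$ and $E(y)$ both converge absolutely. Then

\begin{align*}
E(x) E(y) &= \left( \sum_{i=0}^{\infty} \frac{x^i}{i!} \right) \left( \sum_{j=0}^{\infty} \frac{y^j}{j!} \right) \\
&= \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{x^i y^j}{i! \, j!} \\
&= \sum_{k=0}^{\infty} \sum_{i+j=k} \frac{x^i y^j}{i! \, j!} \\
&= \sum_{k=0}^{\infty} \sum_{i=0}^{k} \frac{x^i y^{k-i}}{i! \, (k-i)!} \\
&= \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{i=0}^{k} \frac{k!}{i! \, (k-i)!} x^i y^{k-i} \\
&= \sum_{k=0}^{\infty} \frac{(x + y)^k}{k!} \\
&= E(x + y).
\end{align*}

We then derive that
(1). $1 = E(0) = E(x + (-x)) = E(x) E(-x)$, so $E(-x) = 1/E(x)$.
(2). Since $E(x) > 0$ for $x \ge 0$, $E(x) > 0$ for all $x \in \mathbb{R}$.
(3). Since $E(x) > x/2$ for $x > 0$, $E(x) \to \infty$ as $x \to \infty$.
(4). Since $E(-x) = 1/E(x)$, $E(x) \to 0$ as $x \to -\infty$.
(5). Since $E(x) > x^{n+1}/(n+1)!$ for $x > 0$, $x^n E(-x) = x^n/E(x) < (n+1)!/x \to 0$ as $x \to \infty$.
(6). Set $e = E(1)$. Then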

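• for any $n \in \mathbb{N}$, $E(n) = E(1) \cdots E(1) = e^n$.
• for any $n \in \mathbb{N}$, $e = E(1) = E(1/n) \cdots E(1/n)$, so $E(1/n) = e^{1/n}$.
• for any $p, q \in \mathbb{N}$, $E(p/q) = (E(1/q))^p = e^{p/q}$.
• for any $r \in \mathbb{Q}$, $E(r) = e^r$.
• At last, for any $x \in \mathbb{R} \setminus \mathbb{Q}$, define

\[ e^x = \sup_{r \in \mathbb{Q}, \, r < x} E(r). \]

Since $E$ is continuous and strictly increasing (because $E' = E > 0$), $E(x) = e^x$ for all $x \in \mathbb{R}$. Moreover, $E$ maps $\mathbb{R}$ onto $(0, \infty)$, so it has an inverse function $L : (0, \infty) \to \mathbb{R}$, that is, $E(L(y)) = y$ for all $y > 0$, and $L(E(x)) = x$ for all $x \in \mathbb{R}$. Then all the properties of $L(x)$ can be derived from the ones of $E(x) = e^x$. We write customarily that $L(x) = \log x$. For example,
(1). differentiating the equation $L(E(x)) = x$, we have that $L'(E(x)) E'(x) = 1$, which is, by writing $y = E(x)$ and using $E' = E$, that

\[ L'(y) = \frac{1}{y}. \]

(2). $L(x) \to \infty$ as $x \to \infty$ and $L(x) \to -\infty$ as $x \to 0^+$.
(3). $L(e) = 1$ since $L(E(1)) = 1$ and $E(1) = e$.
(4). $L(1) = 0$ since $L(E(0)) = 0$ and $E(0) = 1$.
(5). Writing $y_1 = E(x_1)$ and $y_2 = E(x_2)$,

\[ L(y_1 y_2) = L(E(x_1) E(x_2)) = L(E(x_1 + x_2)) = x_1 + x_2 = L(y_1) + L(y_2). \]

(6). For any $\alpha > 0$, choose $0 < \varepsilon < \alpha$. Then for $x > 1$,

\[ x^{-\alpha} \log x = x^{-\alpha} \int_1^x \frac{1}{t} \, dt \le x^{-\alpha} \int_1^x t^{\varepsilon - 1} \, dt = \frac{x^{-\alpha} (x^{\varepsilon} - 1)}{\varepsilon} \to 0 \]

as $x \to \infty$.

Homework Assignment.

2-17. Prove that
(a). \[ \lim_{x \to 0} (1 + x)^{1/x} = e, \]
(b). \[ \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n = e^x. \]

2-18. Find
(a). \[ \lim_{x \to 0} \frac{e - (1 + x)^{1/x}}{x}, \]
(b). \[ \lim_{n \to \infty} \frac{n (n^{1/n} - 1)}{\log n}. \]

CHAPTER 3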

Functions of several variables

3.1. Linear transformations and matrices

In Chapter 1, we used $\mathbb{R}^n$ as an example of a metric space and studied limits and continuous functions on $\mathbb{R}^n$; for example, we can equip $\mathbb{R}^n$ with the Euclidean metric. To investigate the differentiation of functions on $\mathbb{R}^n$, we need the structure of a vector space on $\mathbb{R}^n$, so that we can add or subtract elements in $\mathbb{R}^n$.

Definition (Vector spaces). A nonempty set $X$ is a vector space (over $\mathbb{R}$) if $x + y \in X$ and $cx \in X$ for all $x, y \in X$ (which we will call "vectors") and all $c \in \mathbb{R}$ (which we will call "scalars").

(i). Let $x_1, ..., x_k \in X$. They are said to be independent if for all $c_1, ..., c_k \in \mathbb{R}$,

\[ c_1 x_1 + \cdots + c_k x_k = 0 \]

implies that $c_1 = \cdots = c_k = 0$. Otherwise, they are said to be dependent. Observe that no independent set of vectors contains the zero vector.

(ii). The dimension of $X$ is the maximal number of independent vectors, i.e. if $X$ contains an independent set of $r$ vectors but contains no independent set of $r + 1$ vectors, then $X$ has dimension $r$ and we write $\dim X = r$. Observe that $\dim \mathbb{R}^n = n$.

(iii). Let $x_1, ..., x_k \in X$ and $c_1, ..., c_k \in \mathbb{R}$. The vector

\[ c_1 x_1 + \cdots + c_k x_k \]

is called a linear combination of $x_1, ..., x_k$. Let $S \subset X$. If $E$ is the set of all linear combinations of elements of $S$, then we say $S$ spans $E$, or that $E$ is the span of $S$. Observe that every span is a vector space.

(iv). An independent set of $X$ which spans $X$ is called a basis of $X$, i.e. if $\dim X = r$ and $x_1, ..., x_r$ form a basis of $X$, then every $x \in X$ can be written as $x = c_1 x_1 + \cdots + c_r x_r$ for some $c_1, ..., c_r \in \mathbb{R}$.

In $\mathbb{R}^n$, let $e_j$ be the vector whose $j$-th coordinate is 1 and whose other coordinates are all 0. Then $\{e_1, ..., e_n\}$ is a basis of $\mathbb{R}^n$, which we call the standard basis of $\mathbb{R}^n$.

Remark. Let $X$ be a vector space such that $\dim X = n$. Then
• A set $E$ of $n$ vectors spans $X$ iff $E$ is independent.
• $X$ has a basis, and every basis consists of $n$ vectors.
• Every independent set of vectors $\{x_1, ..., x_r\}$ for $1 \le r \le n$ can be completed to a basis, i.e. there is a basis of $X$ that contains $\{x_1, ..., x_r\}$.

Definition (Linear transformations). Let $X$ and $Y$ be vector spaces. A mapping $A : X \to Y$ is said to be a linear transformation if

\[ A(cx) = cA(x) \quad \text{and} \quad A(x_1 + x_2) = A(x_1) + A(x_2) \]

for all $x, x_1, x_2 \in X$ and $c \in \mathbb{R}$. We often write $Ax$ instead of $A(x)$ if $A$ is linear. The linear transformations from $X$ to $X$ are also called linear operators.

We use L(X,Y ) to denote the set of all linear transformations from X to Y . If X = Y , we simply write L(X) = L(X,X). Notice that L(X,Y ) is also a vector space. Remark. Let A ∈ L(X,Y ). • A0 = 0. • A is completely determined by its action on any basis. Indeed, let {x1, ..., xn} be a basis of X. Then every x ∈ X can be written as

x = c1x1 + ··· + cnxn. By the linearity of A, we have that

\[ Ax = c_1 A x_1 + \cdots + c_n A x_n. \]

• If $\dim X = n$ and $\dim Y = m$, then $A$ can be described by an $m$ by $n$ matrix. Indeed, let $\{x_1, ..., x_n\}$ be a basis of $X$ and $\{y_1, ..., y_m\}$ be a basis of $Y$. Since $A x_j \in Y$,

\[ A x_j = \sum_{i=1}^{m} a_{ij} y_i \quad \text{for some } a_{1j}, ..., a_{mj} \in \mathbb{R}. \]

Therefore, for every $x = c_1 x_1 + \cdots + c_n x_n \in X$,

\[ Ax = \sum_{j=1}^{n} c_j A x_j = \sum_{j=1}^{n} c_j \sum_{i=1}^{m} a_{ij} y_i = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} c_j \right) y_i. \]

It is convenient to visualize these numbers by matrix operation. That is, each $x = c_1 x_1 + \cdots + c_n x_n \in X$ is assigned to a column vector

\[ \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}. \]

Similarly, each $y = d_1 y_1 + \cdots + d_m y_m \in Y$ is assigned to a column vector

\[ \begin{pmatrix} d_1 \\ \vdots \\ d_m \end{pmatrix}. \]

Then $A$ corresponds to an $m$ by $n$ matrix

\[ [A] := \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \]

such that $Ax$ is given by this matrix acting on $(c_1, ..., c_n)^t$. In fact, given any $m$ by $n$ matrix, one can define a linear transformation from $X$ to $Y$ similarly as above. However, this correspondence between $L(X, Y)$ and $m$ by $n$ matrices depends on the chosen bases, since a change of basis will in general result in a different matrix.

We move on to $\mathbb{R}^n$, which is now a vector space equipped with the Euclidean distance $|x - y|$ for $x, y \in \mathbb{R}^n$.

Definition (Norms of linear transformations). Let $A \in L(\mathbb{R}^n, \mathbb{R}^m)$. Define the norm of $A$ as

\[ \|A\| := \sup_{x \ne 0} \frac{|Ax|}{|x|}. \]

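Remark.
• Notice that $|Ax| \le \|A\| |x|$ for all $x \in \mathbb{R}^n$.
• Notice that for each $c \in \mathbb{R}$ and $c \ne 0$,

\[ \frac{|A(cx)|}{|cx|} = \frac{|Ax|}{|x|}. \]

Taking $c = 1/|x|$, we have that

\[ \frac{|Ax|}{|x|} = \frac{|A(x/|x|)|}{|x/|x||} = \frac{|A x_1|}{|x_1|}, \]

in which $x_1 = x/|x|$ and $|x_1| = 1$. So without loss of generality, we can define

\[ \|A\| := \sup_{|x| = 1} \frac{|Ax|}{|x|} = \sup_{|x| = 1} |Ax|. \]

Theorem 3.1.
(i). Let $A \in L(\mathbb{R}^n, \mathbb{R}^m)$. Then $\|A\| < \infty$ and $A$ is uniformly continuous.
(ii). Equipped with the distance $\|A - B\|$ for $A, B \in L(\mathbb{R}^n, \mathbb{R}^m)$, $L(\mathbb{R}^n, \mathbb{R}^m)$ is a metric space.
(iii). Let $A \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $B \in L(\mathbb{R}^m, \mathbb{R}^k)$. Then $BA \in L(\mathbb{R}^n, \mathbb{R}^k)$ and $\|BA\| \le \|B\| \|A\|$.
(iv). Let $A \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $[A] = [a_{ij}]$. Then

\[ \|A\| \le \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{1/2}. \]

Proof.
(i). Let $\{e_1, ..., e_n\}$ be the standard basis in $\mathbb{R}^n$. For each $x = c_1 e_1 + \cdots + c_n e_n \in \mathbb{R}^n$ such that $|x| = 1$, we have that

\[ |c_j| = |c_j e_j| \le |x| = 1 \quad \text{for all } j = 1, ..., n. \]

Hence,

\[ |Ax| = \left| \sum_{j=1}^{n} c_j A e_j \right| \le \sum_{j=1}^{n} |c_j| |A e_j| \le \sum_{j=1}^{n} |A e_j| < \infty. \]

To prove uniform continuity, we may assume that $\|A\| > 0$ (otherwise $A = 0$ and there is nothing to prove). For each $\varepsilon > 0$, set $\delta = \varepsilon/\|A\|$. We then have that

\[ |Ax - Ay| = |A(x - y)| \le \|A\| |x - y| < \varepsilon \quad \text{if } |x - y| < \delta. \]

(ii). See Problem 3-2.
(iii). It follows immediately from the fact that

\[ |(BA)x| = |B(A(x))| \le \|B\| |Ax| \le \|B\| \|A\| |x| \quad \text{for all } x \in \mathbb{R}^n. \]

(iv). Let $x = c_1 e_1 + \cdots + c_n e_n \in \mathbb{R}^n$ such that $|x| = 1$. Then

\[ Ax = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} c_j \right) u_i. \]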

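Here, $\{u_1, ..., u_m\}$ is the standard basis of $\mathbb{R}^m$. Notice that $|x|^2 = |c_1|^2 + \cdots + |c_n|^2 = 1$. By the Cauchy-Schwarz inequality,

\[ |Ax|^2 = \sum_{i=1}^{m} \left| \sum_{j=1}^{n} a_{ij} c_j \right|^2 \le \sum_{i=1}^{m} \left( \sum_{j=1}^{n} |a_{ij}|^2 \right) \left( \sum_{j=1}^{n} |c_j|^2 \right) = \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2. \]

Thus,

\[ \|A\| = \sup_{|x| = 1} |Ax| \le \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{1/2}. \]
□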
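The bound in Theorem 3.1 (iv) can be illustrated numerically for a small matrix. The sketch below uses a hypothetical $2 \times 2$ matrix chosen here, estimates $\|A\| = \sup_{|x| = 1} |Ax|$ by sampling unit vectors $(\cos t, \sin t)$, and compares the estimate with $\big( \sum_{i,j} |a_{ij}|^2 \big)^{1/2}$. Sampling can only underestimate the supremum, so this is a check rather than a proof.

```python
import math

A = [[1.0, 2.0],
     [3.0, 4.0]]          # hypothetical example matrix

def apply(matrix, vector):
    """Matrix-vector product, with the matrix stored as a list of rows."""
    return [sum(a * c for a, c in zip(row, vector)) for row in matrix]

def euclid(vector):
    """Euclidean length |v|."""
    return math.sqrt(sum(c * c for c in vector))

samples = 100_000
norm_estimate = max(
    euclid(apply(A, [math.cos(t), math.sin(t)]))
    for t in (2 * math.pi * k / samples for k in range(samples))
)
bound = math.sqrt(sum(a * a for row in A for a in row))
print(norm_estimate, bound)
```

For this matrix the estimate is roughly 5.465 while the bound is $\sqrt{30} \approx 5.477$, so the inequality holds but is not an equality in general.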

We now know that $L(\mathbb{R}^n)$ is a vector space and a metric space equipped with the metric $\|\cdot\|$. We next investigate the invertible linear operators on $\mathbb{R}^n$, which will later be used in the discussion of the inverse function theorem. Recall that a mapping $A : X \to Y$ is

• one-to-one (or injective) if $x_1, x_2 \in X$ and $x_1 \ne x_2$ imply $A x_1 \ne A x_2$.
• onto (or surjective) if the range $A(X)$ equals $Y$.

In fact, a linear $A$ is one-to-one iff $Ax = 0$ implies that $x = 0$. (See Problem 3-1.) An important fact about $A \in L(X)$ when $\dim X < \infty$ is that $A$ is one-to-one iff it is onto.

Theorem 3.2. Let $X$ be a vector space such that $\dim X < \infty$. Then $A \in L(X)$ is one-to-one iff it is onto.

Proof. Let {x1, ..., xn} be a basis of X. Notice that since A is linear, the range A(X) is spanned by {Ax1, ..., Axn}. Suppose that A is one-to-one. Then {Ax1, ..., Axn} is independent. Indeed, let

c1Ax1 + ··· + cnAxn = A (c1x1 + ··· + cnxn) = 0.

Then $c_1x_1 + \cdots + c_nx_n = 0$ since $A$ is one-to-one. Hence, $c_1 = \cdots = c_n = 0$ since $\{x_1, ..., x_n\}$ is a basis. Therefore, $\{Ax_1, ..., Ax_n\}$ is independent and so spans $X$ since $\dim X = n$. It then follows that $A(X) = X$.
Suppose that $A$ is onto, i.e. $A(X) = X$. Since $A(X) = X$ is spanned by the $n$ vectors $Ax_1, ..., Ax_n$ and $\dim X = n$, these vectors are independent. Now let $x = c_1x_1 + \cdots + c_nx_n$ such that $Ax = 0$. Then

A (c1x1 + ··· + cnxn) = c1Ax1 + ··· + cnAxn = 0.

Hence, c1 = ··· = cn = 0 so x = 0 since Ax1, ..., Axn are independent. 

Therefore, $A \in L(\mathbb{R}^n)$ is invertible iff $Ax = 0$ implies that $x = 0$. In this case, $A^{-1} \in L(\mathbb{R}^n)$.

Theorem 3.3. Let $\Omega \subset L(\mathbb{R}^n)$ be the set of all invertible linear operators on $\mathbb{R}^n$.
(i). Let $A \in \Omega$ and $B \in L(\mathbb{R}^n)$. Suppose that
\[ \|B - A\| \cdot \|A^{-1}\| < 1. \]
Then $B \in \Omega$. This, in particular, implies that $\Omega$ is an open subset of $L(\mathbb{R}^n)$.
(ii). The mapping $A \mapsto A^{-1}$ is continuous on $\Omega$.

Proof.

(i). Set $\alpha = 1/\|A^{-1}\|$ and $\beta = \|B - A\|$. Then we have that $\alpha > \beta$. To prove that $B \in \Omega$, it suffices to show that $Bx = 0$ implies that $x = 0$, or equivalently $Bx \neq 0$ if $x \neq 0$. For each $x \in \mathbb{R}^n$,
\[ \alpha|x| = \alpha|A^{-1}Ax| \le \alpha\|A^{-1}\||Ax| = |Ax| \le |(A - B)x| + |Bx| \le \|A - B\||x| + |Bx| = \beta|x| + |Bx|. \]
Hence, $|Bx| \ge (\alpha - \beta)|x| > 0$ if $x \neq 0$. It then follows that $B$ is also invertible. To show $\Omega$ is open, see Problem 3-3.
(ii). Recall that $\alpha = 1/\|A^{-1}\|$ and $\beta = \|B - A\|$. To show that $A \mapsto A^{-1}$ is continuous, it suffices to prove that $B^{-1} \to A^{-1}$ (i.e. $\|B^{-1} - A^{-1}\| \to 0$) if $B \to A$ (i.e. $\beta = \|B - A\| \to 0$) for $A, B \in \Omega$. Since $\beta \to 0$, we may assume that $\beta < \alpha$. Letting $x = B^{-1}y$ in the inequality $|Bx| \ge (\alpha - \beta)|x|$ from (i) (notice that $B$ is invertible), we have that
\[ |y| \ge (\alpha - \beta)|B^{-1}y|. \]
That is,
\[ \|B^{-1}\| \le \frac{1}{\alpha - \beta}. \]
Notice that $B^{-1} - A^{-1} = B^{-1}(A - B)A^{-1}$. Then
\[ \|B^{-1} - A^{-1}\| \le \|B^{-1}\|\|A - B\|\|A^{-1}\| \le \frac{\beta}{\alpha(\alpha - \beta)} \to 0 \]
as $\beta \to 0$. □

Homework Assignment.
3-1. Let $X$ be a vector space and $A \in L(X)$. Prove that $A$ is one-to-one iff $Ax = 0$ implies that $x = 0$.
3-2. Prove that $L(\mathbb{R}^n, \mathbb{R}^m)$ is a metric space equipped with the distance $\|A - B\|$ for $A, B \in L(\mathbb{R}^n, \mathbb{R}^m)$.
3-3. Let $\Omega \subset L(\mathbb{R}^n)$ be the set of all invertible linear operators on $\mathbb{R}^n$. Prove that $\Omega$ is open.

3.2. Differentiation

Throughout this section, we assume that $E$ is an open set in $\mathbb{R}^n$ and $f : E \to \mathbb{R}^m$. We also set $\{e_1, ..., e_n\}$ and $\{u_1, ..., u_m\}$ as the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively.

Definition. Let $x \in E$. If there exists a linear transformation $A \in L(\mathbb{R}^n, \mathbb{R}^m)$ such that
\[ \lim_{h \to 0} \frac{|f(x + h) - f(x) - Ah|}{|h|} = 0, \]
then we say that $f$ is differentiable at $x$ and write $f'(x) = A$.

We call $f'(x)$ the differential of $f$ at $x$. Here, $h \to 0$ is understood as $h \in \mathbb{R}^n$ and $|h| \to 0$. If $f$ is differentiable at every $x \in E$, then we say that $f$ is differentiable in $E$.

Remark.
• It is understood that $h \in \mathbb{R}^n$ above. If $|h|$ is small enough, then $x + h \in E$ since $E$ is open. Thus $f(x + h)$ is defined and is in $\mathbb{R}^m$. Since $A \in L(\mathbb{R}^n, \mathbb{R}^m)$, $Ah \in \mathbb{R}^m$ so $f(x + h) - f(x) - Ah \in \mathbb{R}^m$. Notice that in the limit above, the norm in the numerator is in $\mathbb{R}^m$ while the norm in the denominator is in $\mathbb{R}^n$.
• If $f$ is differentiable at $x \in E$, then
\[ f(x + h) = f(x) + f'(x)h + r(h), \quad \text{in which} \quad \lim_{h \to 0} \frac{|r(h)|}{|h|} = 0. \]
Replacing $h$ by $y - x$ with $y \in E$ and $|y - x|$ small enough, we have that
\[ f(y) = f(x) + f'(x)(y - x) + r(y - x), \]
that is, $f(y) - f(x)$ can be approximated near $x$ by the linear transformation $f'(x)$ applied to $y - x$.
• If $f$ is differentiable at $x \in E$, then $f'(x)$ is a linear transformation. If $f$ is differentiable on $E$, then $f'$ is a function that maps $E$ to $L(\mathbb{R}^n, \mathbb{R}^m)$. For example, if $f : \mathbb{R} \to \mathbb{R}$ is differentiable, then for any $x \in \mathbb{R}$, $f'(x)$ is a number and is understood as a linear transformation on $\mathbb{R}$ by $h \mapsto f'(x)h$; in the meantime, $f' : \mathbb{R} \to \mathbb{R}$ is a function on $\mathbb{R}$.
• If $f = A \in L(\mathbb{R}^n, \mathbb{R}^m)$, then $f$ is differentiable and $f'(x) = A$ for every $x$. For example, if $f = A \in L(\mathbb{R}, \mathbb{R})$, then $f(x) = ax$ for some $a \in \mathbb{R}$ and $f'(x) = a$.
• If $f$ is differentiable at $x \in E$, then $f$ is continuous at $x$.

The following theorem shows that if $f$ is differentiable at $x$, then its differential $f'(x)$ is unique.

Theorem 3.4. Let $E$ be an open set in $\mathbb{R}^n$ and $f : E \to \mathbb{R}^m$. Let $x \in E$. Suppose that
\[ \lim_{h \to 0} \frac{|f(x + h) - f(x) - A_1h|}{|h|} = 0 \quad \text{and} \quad \lim_{h \to 0} \frac{|f(x + h) - f(x) - A_2h|}{|h|} = 0, \]
for some $A_1, A_2 \in L(\mathbb{R}^n, \mathbb{R}^m)$. Then $A_1 = A_2$, i.e. the differential, if it exists, is unique.

Proof. Let $B = A_1 - A_2 \in L(\mathbb{R}^n, \mathbb{R}^m)$. By the triangle inequality,
\[ |Bh| = |A_1h - A_2h| = |(f(x + h) - f(x) - A_2h) - (f(x + h) - f(x) - A_1h)| \le |f(x + h) - f(x) - A_2h| + |f(x + h) - f(x) - A_1h|. \]
Hence,
\[ \lim_{h \to 0} \frac{|Bh|}{|h|} = 0. \]
For each $k \in \mathbb{R}^n$ with $k \neq 0$, let $h = tk$ for $t \in \mathbb{R}$, $t \neq 0$. Then $h \to 0$ in $\mathbb{R}^n$ as $t \to 0$ in $\mathbb{R}$. Therefore,
\[ \lim_{t \to 0} \frac{|B(tk)|}{|tk|} = \frac{|Bk|}{|k|} = 0. \]
That is, $Bk = 0$ for all $k \in \mathbb{R}^n$ so $B = 0$ and $A_1 = A_2$. □

The following theorem gives the chain rule in the multi-variable setting, which is particularly convenient to state with the help of linear transformations.

Theorem 3.5. Let $E$ be an open set in $\mathbb{R}^n$ and $f : E \to \mathbb{R}^m$. Let $g : F \to \mathbb{R}^k$ such that $F \subset \mathbb{R}^m$ is open and $F \supset f(E)$. If $f$ is differentiable at $x_0 \in E$ and $g$ is differentiable at $f(x_0)$, then $g \circ f$ is differentiable at $x_0$ and
\[ (g \circ f)'(x_0) = g'(f(x_0))f'(x_0), \]
in which $f'(x_0) \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $g'(f(x_0)) \in L(\mathbb{R}^m, \mathbb{R}^k)$ so $g'(f(x_0))f'(x_0) \in L(\mathbb{R}^n, \mathbb{R}^k)$.

Proof. Put $F = g \circ f$ (from now on $F$ denotes this composition rather than the domain of $g$), $y_0 = f(x_0)$, $A = f'(x_0)$, and $B = g'(y_0)$. Then we need to show that
\[ \frac{|F(x_0 + h) - F(x_0) - BAh|}{|h|} \to 0 \quad \text{as } h \to 0 \text{ in } \mathbb{R}^n. \]
Set
\[ u(h) := f(x_0 + h) - f(x_0) - Ah. \]
Since $f'(x_0) = A$, $|u(h)| = \varepsilon(h)|h|$ for some $\varepsilon(h) \to 0$ in $\mathbb{R}$ as $h \to 0$ in $\mathbb{R}^n$. Set
\[ v(k) := g(y_0 + k) - g(y_0) - Bk. \]
Since $g'(y_0) = B$, $|v(k)| = \eta(k)|k|$ for some $\eta(k) \to 0$ in $\mathbb{R}$ as $k \to 0$ in $\mathbb{R}^m$.
Now given $h \in \mathbb{R}^n$, let $k = f(x_0 + h) - f(x_0)$, i.e. $f(x_0 + h) = y_0 + k$. Then $k = Ah + u(h)$. Moreover,
\[ |k| = |Ah + u(h)| \le (\|A\| + \varepsilon(h))|h|, \]
and

F (x0 + h) − F (x0) − BAh = g(f(x0 + h)) − g(f(x0)) − BAh

= g(y0 + k) − g(y0) − BAh = Bk + v(k) − BAh = B(k − Ah) + v(k) = B(u(h)) + v(k). Hence,

\[ |F(x_0 + h) - F(x_0) - BAh| \le |B(u(h))| + |v(k)| \le \|B\||u(h)| + \eta(k)|k| = \varepsilon(h)\|B\||h| + \eta(k)|Ah + u(h)| \le \varepsilon(h)\|B\||h| + \eta(k)(\|A\| + \varepsilon(h))|h|. \]
It then follows that
\[ \frac{|F(x_0 + h) - F(x_0) - BAh|}{|h|} \le \varepsilon(h)\|B\| + \eta(k)(\|A\| + \varepsilon(h)) \to 0 \]
as $h \to 0$ in $\mathbb{R}^n$, since $|k| \le (\|A\| + \varepsilon(h))|h| \to 0$ and hence $\eta(k) \to 0$. So $g \circ f$ is differentiable at $x_0$ and $(g \circ f)'(x_0) = BA$. □

Remark. Let $\gamma : (a, b) \to E \subset \mathbb{R}^n$ and $f : E \to \mathbb{R}$ be differentiable. Then $f \circ \gamma : (a, b) \to \mathbb{R}$ is differentiable and
\[ (f \circ \gamma)'(t) = f'(\gamma(t))\gamma'(t) \quad \text{for } t \in (a, b). \]

The following theorem is a generalization of the mean value theorem to the multi-variable setting.

Theorem 3.6. Suppose that $E \subset \mathbb{R}^n$ is convex, i.e. for $a, b \in E$, $\lambda_1a + \lambda_2b \in E$ for all $0 \le \lambda_1, \lambda_2 \le 1$ with $\lambda_1 + \lambda_2 = 1$. Let $f : E \to \mathbb{R}$ be differentiable and $\|f'(x)\| \le M$ for all $x \in E$. Then
\[ |f(b) - f(a)| \le M|b - a| \quad \text{for all } a, b \in E. \]

Proof. Fix $a, b \in E$. Define $\gamma : \mathbb{R} \to \mathbb{R}^n$ by $\gamma(t) = (1 - t)a + tb$. Since $E$ is convex, $\gamma(t) \in E$ for all $t \in [0, 1]$. Put $g(t) = f(\gamma(t))$. Notice that $\gamma'(t) = b - a$. Then
\[ g'(t) = f'(\gamma(t))\gamma'(t) = f'(\gamma(t))(b - a). \]
By the mean value theorem for $g$, there exists $t \in (0, 1)$ such that
\[ |g(1) - g(0)| = |g'(t)||1 - 0| \le \|f'(\gamma(t))\||b - a| \le M|b - a|. \]
Thus the proof is complete by noticing that $g(1) = f(b)$ and $g(0) = f(a)$. □

Corollary 3.7. If $f : E \to \mathbb{R}$ is differentiable and $f' = 0$ on a convex set $E \subset \mathbb{R}^n$, then $f$ is constant on $E$.

3.2.1. Partial derivatives.

Definition (Partial derivatives). Let $E$ be an open set in $\mathbb{R}^n$ and $f : E \to \mathbb{R}^m$. Then
\[ f(x) = \sum_{i=1}^m f_i(x)u_i, \]
in which $f_1, ..., f_m : E \to \mathbb{R}$ are the components of $f$. For $x \in E$, $i = 1, ..., m$, and $j = 1, ..., n$, we define the partial derivative of $f_i$ with respect to $x_j$ at $x$,

\[ (D_jf_i)(x) = \lim_{t \to 0} \frac{f_i(x + te_j) - f_i(x)}{t}, \]
provided that the limit exists. That is, $D_jf_i$ is the derivative of $f_i$ with respect to $x_j$, keeping the other variables fixed.

Theorem 3.8. Suppose that $f$ is differentiable at $x \in E$. Then the partial derivatives $(D_jf_i)(x)$ exist and
\[ f'(x)e_j = \sum_{i=1}^m (D_jf_i)(x)u_i \]
for $j = 1, ..., n$.

Proof. Fix $j \in \{1, ..., n\}$. Since $f$ is differentiable at $x$,
\[ f(x + h) - f(x) = f'(x)h + r(h), \]
in which $h \in \mathbb{R}^n$ and $|r(h)|/|h| \to 0$ as $h \to 0$. Letting $h = te_j$ for $t \in \mathbb{R}$, then
\[ f(x + te_j) - f(x) = f'(x)(te_j) + r(te_j) = tf'(x)e_j + r(te_j), \]
in which $|r(te_j)|/|t| \to 0$ as $t \to 0$. Hence,
\[ \lim_{t \to 0} \frac{f(x + te_j) - f(x)}{t} = f'(x)e_j. \]
Note that $f = \sum_{i=1}^m f_iu_i$. It then follows that each component of the difference quotient has a limit as $t \to 0$. Therefore, each $(D_jf_i)(x)$ exists and the equation in the theorem follows from the above equation. □

Remark. Let $y_j = (a_j, b_j) \in \mathbb{R}^2$ such that $y_j \to x$ for some $x = (a, b) \in \mathbb{R}^2$ as $j \to \infty$. Then $a_j \to a$ and $b_j \to b$ as $j \to \infty$. Indeed, for each $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that
\[ |y_j - x| = \sqrt{|a_j - a|^2 + |b_j - b|^2} < \varepsilon \quad \text{if } j \ge N. \]
Therefore, $|a_j - a| < \varepsilon$ and $|b_j - b| < \varepsilon$ if $j \ge N$. So $a_j \to a$ and $b_j \to b$ as $j \to \infty$.
In fact, the converse is also true. Indeed, for each $\varepsilon > 0$, there exist $N_1, N_2 \in \mathbb{N}$ such that $|a_j - a| < \varepsilon/\sqrt{2}$ if $j \ge N_1$ and $|b_j - b| < \varepsilon/\sqrt{2}$ if $j \ge N_2$. Hence,
\[ |y_j - x| = \sqrt{|a_j - a|^2 + |b_j - b|^2} < \varepsilon \quad \text{if } j \ge \max\{N_1, N_2\}. \]
Therefore, $y_j \to x$ as $j \to \infty$.
Similarly, let $y_j = (a_{j1}, ..., a_{jm})$ and $x = (x_1, ..., x_m)$ in $\mathbb{R}^m$. Then $y_j \to x$ as $j \to \infty$ iff $a_{ji} \to x_i$ as $j \to \infty$ for all $i = 1, ..., m$. See Problem 3-6.

Corollary 3.9. Suppose that $f$ is differentiable at $x \in E$. Then the partial derivatives $(D_jf_i)(x)$ exist and
\[ [f'(x)] = [(D_jf_i)(x)]. \]
Here, $[f'(x)]$ is the matrix representation of $f'(x) \in L(\mathbb{R}^n, \mathbb{R}^m)$ with respect to the standard bases in $\mathbb{R}^n$ and in $\mathbb{R}^m$. That is, for $h = \sum_{j=1}^n h_je_j \in \mathbb{R}^n$,
\[ f'(x)h = \sum_{i=1}^m \Big( \sum_{j=1}^n (D_jf_i)(x)h_j \Big) u_i. \]

Remark. From above, we know that if $f$ is differentiable at $x$, then the partial derivatives exist. However, the converse is not true. In fact, the existence of the partial derivatives does not even imply that the function is continuous at $x$. For example, let
\[ f(x, y) = \begin{cases} 0 & \text{if } (x, y) = (0, 0), \\ \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0). \end{cases} \]

Then (D1f)(x, y) and (D2f)(x, y) exist at every point, although f is not continuous at (0, 0). See Problem 3-4.
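The behaviour of this example at the origin can be observed numerically. The sketch below is an illustration only and not a proof; the helper name `quotient_at_origin` and the chosen step sizes are assumptions introduced here. It evaluates the difference quotients defining $(D_1f)(0, 0)$ and $(D_2f)(0, 0)$, and the values of $f$ along the diagonal $y = x$.

```python
def f(x, y):
    """The example function: 0 at the origin, xy / (x^2 + y^2) elsewhere."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def quotient_at_origin(j, t):
    """Difference quotient (f(t e_j) - f(0)) / t for j = 1 or j = 2."""
    if j == 1:
        return (f(t, 0.0) - f(0.0, 0.0)) / t
    return (f(0.0, t) - f(0.0, 0.0)) / t

steps = (1e-1, 1e-4, 1e-8)
quotients = [quotient_at_origin(j, t) for j in (1, 2) for t in steps]
diagonal = [f(t, t) for t in steps]

print(quotients)  # difference quotients along the coordinate axes
print(diagonal)   # values of f at points (t, t) approaching the origin
```

The quotients along the axes vanish, while the values along the diagonal stay away from $f(0, 0) = 0$, which is consistent with the remark above.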

Definition. Let $f : E \to \mathbb{R}$. Assume that $f$ is differentiable at $x \in E$. Then $f'(x) \in L(\mathbb{R}^n, \mathbb{R})$. Given $v = \sum_{j=1}^n v_je_j \in \mathbb{R}^n$ such that $|v| = 1$, i.e. $v$ is a direction in $\mathbb{R}^n$,
\[ \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t} = f'(x)v = \sum_{j=1}^n (D_jf)(x)v_j = (\nabla f)(x) \cdot v, \]
which is called the directional derivative of $f$ at $x$ in the direction of $v$. Here, $(\nabla f)(x) = \sum_{j=1}^n (D_jf)(x)e_j$ is called the gradient of $f$ at $x$ and the dot product $u \cdot v$ for $u = \sum_{j=1}^n u_je_j$ and $v = \sum_{j=1}^n v_je_j$ in $\mathbb{R}^n$ is defined as
\[ u \cdot v = \sum_{j=1}^n u_jv_j. \]

Definition. Let $E \subset \mathbb{R}^n$ be open. We say that $f : E \to \mathbb{R}^m$ is a continuously differentiable mapping if $f$ is differentiable and $f'$ is continuous on $E$ (i.e. $f' : E \to L(\mathbb{R}^n, \mathbb{R}^m)$ is continuous). The set of all continuously differentiable mappings is denoted by $C^1(E)$.

Remark. $f \in C^1(E)$ iff the partial derivatives $D_jf_i$ exist and are continuous on $E$. See [Rudin, Theorem 9.21].
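The directional derivative formula above can be compared with a difference quotient in a simple case. This is a minimal sketch under stated assumptions: the function $f(x_1, x_2) = x_1^2 + 3x_1x_2$, the point, the direction, and the step size are hypothetical choices made here for illustration, and the gradient is computed by hand.

```python
def f(x1, x2):
    # Hypothetical example function, not taken from the notes.
    return x1 * x1 + 3.0 * x1 * x2

def grad_f(x1, x2):
    # Partial derivatives computed by hand: D_1 f = 2 x1 + 3 x2, D_2 f = 3 x1.
    return (2.0 * x1 + 3.0 * x2, 3.0 * x1)

def directional_quotient(x, v, t):
    """Difference quotient (f(x + t v) - f(x)) / t."""
    return (f(x[0] + t * v[0], x[1] + t * v[1]) - f(x[0], x[1])) / t

x = (1.0, 2.0)
v = (0.6, 0.8)                       # a direction: |v| = 1
g = grad_f(*x)
exact = g[0] * v[0] + g[1] * v[1]    # (grad f)(x) . v
approx = directional_quotient(x, v, 1e-6)
print(exact, approx)
```

The two printed numbers should agree up to an error of the order of the step size $t$.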

Homework Assignment.
3-4. Let
\[ f(x, y) = \begin{cases} 0 & \text{if } (x, y) = (0, 0), \\ \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0). \end{cases} \]
(a). Prove that $(D_1f)(x, y)$ and $(D_2f)(x, y)$ exist for all $(x, y) \in \mathbb{R}^2$.
(b). Prove that $f$ is not continuous at $(0, 0)$.
3-5. Let $f = (f_1, f_2) : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
\[ f_1(x_1, x_2) = e^{x_1}\cos x_2 \quad \text{and} \quad f_2(x_1, x_2) = e^{x_1}\sin x_2. \]
(a). Find the partial derivatives $(D_jf_i)(x)$ for $i = 1, 2$ and $j = 1, 2$.
(b). Find the differential $f'(x)$.
(c). Evaluate $f'(x)$ at $x = (0, 0)$, $(0, \pi/3)$, and $(\pi/2, 0)$.
3-6. Let $y_j = (a_{j1}, ..., a_{jm})$ and $x = (x_1, ..., x_m)$ in $\mathbb{R}^m$. Prove that $y_j \to x$ as $j \to \infty$ iff $a_{ji} \to x_i$ as $j \to \infty$ for all $i = 1, ..., m$.
3-7. Let $f : E \to \mathbb{R}$ be differentiable. Suppose that $f$ has a local maximum at $x \in E$. Prove that $f'(x) = 0$. (Hint: Use the maximal value principle for single-variable functions.)
3-8. Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be differentiable. Prove that $\nabla(fg) = f\nabla g + g\nabla f$.
3-9. Let $f : \mathbb{R}^n \to \mathbb{R}$ be differentiable. Prove that $\nabla(1/f) = -\nabla f/f^2$ when $f \neq 0$.

3.3. Inverse function theorem

3.3.1. The contraction principle.

Definition (Contractions). Let X be a metric space equipped with metric d. A mapping ϕ : X → X is said to be a contraction if there exists c < 1 such that

d(ϕ(x), ϕ(y)) ≤ cd(x, y) for all x, y ∈ X.

Remark. Every contraction is uniformly continuous.

Theorem 3.10 (Contraction principle/Banach fixed point theorem). Let X be a complete metric space. Then for any contraction ϕ on X, there exists a unique point x ∈ X such that ϕ(x) = x.
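As a numerical illustration of Theorem 3.10 (a sketch only, not part of the proof), consider the hypothetical example $X = [0, 1]$ with the usual distance and $\varphi = \cos$. This choice is an assumption made here: $\cos$ maps $[0, 1]$ into itself and, by the mean value theorem, it is a contraction there with constant $c = \sin 1 < 1$. The code iterates $x_{n+1} = \varphi(x_n)$ and compares the error with the bound $\frac{c^n}{1 - c}d(x_1, x_0)$; the helper name `iterate` is introduced for illustration.

```python
import math

def iterate(phi, x0, n):
    """Return [x_0, x_1, ..., x_n] with x_{k+1} = phi(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    return xs

phi = math.cos            # hypothetical contraction on X = [0, 1]
c = math.sin(1.0)         # contraction constant: |cos'(x)| <= sin(1) on [0, 1]

xs = iterate(phi, 0.5, 200)
fixed = xs[-1]            # numerical approximation of the fixed point
d10 = abs(xs[1] - xs[0])  # d(x_1, x_0)

# Check the estimate d(x_n, x) <= c^n / (1 - c) * d(x_1, x_0) for small n,
# with a tiny tolerance for floating-point rounding.
bounds_ok = all(
    abs(xs[n] - fixed) <= c ** n / (1.0 - c) * d10 + 1e-15
    for n in range(50)
)
print(fixed, abs(phi(fixed) - fixed), bounds_ok)
```

The starting point $0.5$ is arbitrary; by uniqueness, any other starting point in $[0, 1]$ should lead to the same limit.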

Proof. The uniqueness of the fixed point (if it exists) follows from the fact that if $\varphi(x) = x$ and $\varphi(y) = y$ for some $x, y \in X$, then $d(x, y) = d(\varphi(x), \varphi(y)) \le c\,d(x, y)$, so $d(x, y) = 0$ (because $c < 1$) and $x = y$. Notice that the completeness of the metric space is not used here. That is, in a metric space, any contraction has at most one fixed point.
To prove existence, pick $x_0 \in X$ arbitrarily. Define $x_n$ recursively by $x_{n+1} = \varphi(x_n)$ for $n = 0, 1, 2, ...$ Since $\varphi$ is a contraction, there exists $c < 1$ such that

\[ d(x_{n+1}, x_n) = d(\varphi(x_n), \varphi(x_{n-1})) \le c\,d(x_n, x_{n-1}). \]
Hence, induction gives
\[ d(x_{n+1}, x_n) \le c^n d(x_1, x_0). \]
If $n < m$, by the triangle inequality,
\[ d(x_n, x_m) \le \sum_{i=n+1}^m d(x_i, x_{i-1}) \le d(x_1, x_0) \sum_{i=n}^{m-1} c^i \le d(x_1, x_0) \sum_{i=n}^{\infty} c^i = \frac{c^n}{1 - c} d(x_1, x_0) \to 0 \]
as $n \to \infty$. Therefore, $\{x_n\}$ is a Cauchy sequence and thus has a limit $x$ in $X$ since $X$ is complete. Since $\varphi$ is continuous,

\[ \varphi(x) = \lim_{n \to \infty} \varphi(x_n) = \lim_{n \to \infty} x_{n+1} = x. \] □

Theorem 3.11 (Inverse function theorem). Let $E \subset \mathbb{R}^n$ be open and $f : E \to \mathbb{R}^n$. Suppose that $f \in C^1(E)$ and $f'(a)$ is invertible for some $a \in E$. Then
(i). there exist open sets $U, V \subset \mathbb{R}^n$ such that $a \in U$, $b = f(a) \in V$, $f$ is one-to-one on $U$, and $f(U) = V$.
(ii). if $g : V \to U$ is the inverse of $f$ (restricted to $U$), then for any $y \in V$,
\[ g'(y) = (f'(x))^{-1}, \quad \text{with } x = g(y). \]
(iii). $g \in C^1(V)$, i.e. $g$ is differentiable and $g'$ is continuous (from $V$ to $L(\mathbb{R}^n, \mathbb{R}^n)$).

Remark. The inverse function theorem can be regarded as a (vast) generalization of the inverse of invertible linear transformations: Let $f = A \in L(\mathbb{R}^n, \mathbb{R}^n)$. Suppose that $A : \mathbb{R}^n \to \mathbb{R}^n$ is invertible, equivalently, the matrix $[A]$ is an invertible matrix (i.e. $\det A \neq 0$). Then the inverse $g = A^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ is also invertible. Moreover, $g' = A^{-1} = (f')^{-1}$.

Proof.
(i). Put $f'(a) = A$ and choose $\lambda$ so that $2\lambda\|A^{-1}\| = 1$. Since $f \in C^1(E)$, $f'$ is continuous at $a$ so there is an open ball $U \subset E$ with center $a$ such that
\[ \|f'(x) - A\| < \lambda \quad \text{for all } x \in U. \]
We show that $f$ is one-to-one on $U$. To this end, fix $y \in \mathbb{R}^n$. Define a function $\varphi$ by
\[ \varphi(x) = x + A^{-1}(y - f(x)) \quad \text{for } x \in E. \]
Note that $f(x) = y$ iff $\varphi(x) = x$, i.e. $x$ is a fixed point of $\varphi$. Now
\[ \varphi'(x) = I - A^{-1}f'(x) = A^{-1}(A - f'(x)), \]
and
\[ \|\varphi'(x)\| \le \|A^{-1}\|\|A - f'(x)\| < \frac{1}{2\lambda} \cdot \lambda = \frac{1}{2} \quad \text{for } x \in U. \]

By the mean value inequality in Theorem 3.6 since $U$ is convex,
\[ |\varphi(x_1) - \varphi(x_2)| \le \frac{1}{2}|x_1 - x_2| \quad \text{for } x_1, x_2 \in U. \]
Therefore, $\varphi$ has at most one fixed point in $U$ (the uniqueness part of the proof of Theorem 3.10 does not use completeness), that is, $f(x) = y$ for at most one $x \in U$.
Take $V = f(U)$. Then $f : U \to V$ is of course onto (and one-to-one as proved). We

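show that $V$ is open. To this end, pick $y_0 = f(x_0) \in V$ for some $x_0 \in U$. Let $B = \overline{B}_{x_0}(r)$ be the closed ball with center $x_0$ and radius $r > 0$ such that $B \subset U$ (this can always be done by, say, choosing $r$ so that $B_{x_0}(2r) \subset U$).
We next prove that $B_{y_0}(\lambda r) \subset V$, which will imply that $V$ is open. Suppose that $|y - y_0| < \lambda r$. Then
\[ |\varphi(x_0) - x_0| = |A^{-1}(y - y_0)| \le \|A^{-1}\||y - y_0| < \|A^{-1}\|\lambda r = \frac{r}{2}. \]
If $x \in B$, then
\[ |\varphi(x) - x_0| \le |\varphi(x) - \varphi(x_0)| + |\varphi(x_0) - x_0| \le \frac{1}{2}|x - x_0| + \frac{r}{2} \le r. \]
That is, $\varphi(x) \in B$. Thus, $\varphi$ is a contraction of $B$ into $B$. Since $B$ is a closed subset of $\mathbb{R}^n$, it is a complete metric space. By the contraction principle in Theorem 3.10 again, there is a unique fixed point $x \in B$. For this $x$, $f(x) = y$ so $y \in f(B) \subset f(U) = V$. Therefore, $V$ is open.
(ii). Recall that
\[ A = f'(a), \quad \lambda = \frac{1}{2\|A^{-1}\|}, \quad \text{and} \quad \varphi(x) = x + A^{-1}(y - f(x)). \]
Pick $y \in V$ and $k \neq 0$ small enough that $y + k \in V$, which is possible since $V$ is open by (i). Then there exist $x \in U$ and $x + h \in U$ such that $y = f(x)$ and $y + k = f(x + h)$ since $f : U \to V$ is one-to-one and onto by (i). It then follows that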

k = f(x + h) − f(x).

Moreover, notice that

\[ \varphi(x + h) - \varphi(x) = x + h + A^{-1}(y - f(x + h)) - (x + A^{-1}(y - f(x))) = h - A^{-1}(f(x + h) - f(x)) = h - A^{-1}k. \]

Therefore, owing to the fact that $|\varphi(x_1) - \varphi(x_2)| \le \frac{1}{2}|x_1 - x_2|$ from (i), we then have that
\[ |h - A^{-1}k| \le \frac{1}{2}|h|. \]
Since $|h| \le |h - A^{-1}k| + |A^{-1}k|$, this implies that
\[ |h| \le 2|A^{-1}k| \le 2\|A^{-1}\||k| = \frac{1}{\lambda}|k|. \]
That is, $|k| \ge \lambda|h|$. Now since $x \in U$, $\|f'(x) - A\| < \lambda = 1/(2\|A^{-1}\|)$, i.e. $\|f'(x) - A\| \cdot \|A^{-1}\| < 1/2$. By (i) in Theorem 3.3, $f'(x)$ is also invertible (since $A = f'(a)$ is invertible). Denote

$T = (f'(x))^{-1}$. We next show that $g'(y) = T$. Notice that $g = f^{-1}$ and
\[ g(y + k) - g(y) - Tk = x + h - x - Tk = h - Tk = h - T(f(x + h) - f(x)) = -T(f(x + h) - f(x) - f'(x)h). \]
Hence, using the fact that $|k| \ge \lambda|h|$,
\[ \frac{|g(y + k) - g(y) - Tk|}{|k|} = \frac{|T(f(x + h) - f(x) - f'(x)h)|}{|k|} \le \frac{\|T\||f(x + h) - f(x) - f'(x)h|}{\lambda|h|} = \frac{\|T\|}{\lambda} \cdot \frac{|f(x + h) - f(x) - f'(x)h|}{|h|}. \]
As $k \to 0$ on the left-hand side, $h \to 0$ on the right-hand side since $|h| \le |k|/\lambda$. Then the right-hand side tends to zero since $f$ is differentiable at $x$. We therefore have that $g'(y) = T$.
(iii). Recall that $\Omega \subset L(\mathbb{R}^n, \mathbb{R}^n)$ is the set of invertible linear operators. Since $g$ is differentiable on $V$, $g$ is continuous on $V$. Notice that by (ii),
\[ g'(y) = (f'(g(y)))^{-1} = ((f' \circ g)(y))^{-1}. \]
Since $f \in C^1(U)$ and $f'(x)$ is invertible for each $x \in U$ (as shown in (ii)), $f' : U \to \Omega$ is continuous. Therefore, $f' \circ g : V \to \Omega$ is continuous since compositions of continuous functions are continuous. By (ii) in Theorem 3.3, the mapping $j : A \mapsto A^{-1}$ is continuous on $\Omega$. We finally have that $g' = j \circ (f' \circ g) : V \to \Omega$ is continuous. (We remark that $((f' \circ g)(y))^{-1}$ denotes the inverse of the linear transformation $(f' \circ g)(y)$.) □

Remark (Jacobian). Let $E \subset \mathbb{R}^n$ be open and $f : E \to \mathbb{R}^n$. If $f$ is differentiable at $x \in E$, then we call the determinant of the linear operator $f'(x)$ the Jacobian of $f$ at $x$, i.e.
\[ J_f(x) = \det f'(x). \]
One should note that $J_f(x)$ is independent of the matrix form of $f'(x)$ with respect to different bases so one can use any matrix form of $f'(x)$ to compute the Jacobian.
The inverse function theorem asserts that if $f \in C^1$ and $J_f(x) \neq 0$, then $f$ is locally invertible (i.e. invertible in a neighborhood of $x$), the inverse $g$ is also $C^1$, and $g'(y) = (f'(x))^{-1}$ with $y = f(x)$. Notice, however, that the inverse function theorem is a local theorem. That is, even if $J_f(x) \neq 0$ everywhere, $f$ is not necessarily invertible on its whole domain. See Problem 3-10.
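The map $\varphi(x) = x + A^{-1}(y - f(x))$ used in the proof of Theorem 3.11 also suggests a way to compute the local inverse numerically. The following is a sketch under stated assumptions rather than a general-purpose solver: the function $f(x_1, x_2) = (x_1 + 0.1\sin x_2,\ x_2 + 0.1\cos x_1)$, the point $a = (0, 0)$, the target $y$, and the number of steps are hypothetical choices made here, and $A^{-1}$ is computed by hand from $A = f'(a)$.

```python
import math

def f(x):
    # Hypothetical C^1 example map from R^2 to R^2, not taken from the notes.
    x1, x2 = x
    return (x1 + 0.1 * math.sin(x2), x2 + 0.1 * math.cos(x1))

# At a = (0, 0): A = f'(a) = [[1, 0.1], [0, 1]], hence A^{-1} = [[1, -0.1], [0, 1]].
A_INV = ((1.0, -0.1), (0.0, 1.0))

def phi(x, y):
    """One step of the map from the proof: phi(x) = x + A^{-1}(y - f(x))."""
    fx = f(x)
    r = (y[0] - fx[0], y[1] - fx[1])
    return (x[0] + A_INV[0][0] * r[0] + A_INV[0][1] * r[1],
            x[1] + A_INV[1][0] * r[0] + A_INV[1][1] * r[1])

def local_inverse(y, start=(0.0, 0.0), steps=50):
    """Approximate g(y) by iterating phi, starting from a = (0, 0)."""
    x = start
    for _ in range(steps):
        x = phi(x, y)
    return x

y = (0.05, 0.15)   # a point near b = f(a) = (0, 0.1)
x = local_inverse(y)
residual = math.hypot(f(x)[0] - y[0], f(x)[1] - y[1])
print(x, residual)
```

The iteration is only expected to converge for $y$ close to $b = f(a)$, in line with the local nature of the theorem.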

Homework Assignment.
3-10. Let $f = (f_1, f_2) : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
\[ f_1(x_1, x_2) = e^{x_1}\cos x_2 \quad \text{and} \quad f_2(x_1, x_2) = e^{x_1}\sin x_2. \]
(a). Prove that $J_f(x) \neq 0$ for all $x = (x_1, x_2) \in \mathbb{R}^2$.
(b). Put $a = (0, \pi/3)$ and $b = f(a)$. Let $g$ be the inverse of $f$ in a neighborhood of $a$. Find an explicit formula of $g$. Compute $f'(a)$ and $g'(b)$ and verify that $g'(b) = (f'(a))^{-1}$.
(c). Prove that $f$ is not invertible on $\mathbb{R}^2$. (Hint: It suffices to prove that $f$ is not one-to-one.)
3-11. Let $f = (f_1, f_2) : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
\[ f_1(x_1, x_2) = x_1^2 - x_2^2 \quad \text{and} \quad f_2(x_1, x_2) = 2x_1x_2. \]
Put $a = (1, 1)$ and $b = f(a)$. Let $g$ be the inverse of $f$ in a neighborhood of $a$. Use the inverse function theorem to find the linear approximation of $g$ at $b$, i.e.
\[ g(y) \approx g(b) + g'(b)(y - b). \]

3.4. Implicit function theorem

Let $x = (x_1, ..., x_n) \in \mathbb{R}^n$ and $y = (y_1, ..., y_m) \in \mathbb{R}^m$. Write $(x, y) = (x_1, ..., x_n, y_1, ..., y_m) \in \mathbb{R}^{n+m}$. Then every $A \in L(\mathbb{R}^{n+m}, \mathbb{R}^n)$ splits into two linear transformations $A_x$ and $A_y$ that are defined by
\[ A_x(h) = A(h, 0) \quad \text{and} \quad A_y(k) = A(0, k) \]
for $h \in \mathbb{R}^n$ and $k \in \mathbb{R}^m$, that is,

\[ A(h, k) = A_x(h) + A_y(k). \]

Theorem 3.12 (Implicit function theorem for linear transformations). Let $A \in L(\mathbb{R}^{n+m}, \mathbb{R}^n)$. Suppose that $A_x$ is invertible. Then for each $k \in \mathbb{R}^m$ there is a unique $h \in \mathbb{R}^n$ such that $A(h, k) = 0$. This correspondence is given by
\[ h = -A_x^{-1}A_y(k). \]

Proof. Notice that $A(h, k) = 0$ iff
\[ A_x(h) + A_y(k) = 0. \]
Since $A_x$ is invertible, this holds iff
\[ h = -A_x^{-1}A_y(k). \] □

Theorem 3.13 (Implicit function theorem). Let $E \subset \mathbb{R}^{n+m}$ be open and $f : E \to \mathbb{R}^n$. Suppose that $f \in C^1(E)$ and $f(a, b) = 0$ for some $(a, b) \in E$. Put $A = f'(a, b)$ and assume that $A_x$ is invertible. Then there exist open sets $U \subset \mathbb{R}^{n+m}$ and $W \subset \mathbb{R}^m$ with $(a, b) \in U$ and $b \in W$ such that
(i). for each $y \in W$ there is a unique $x$ such that $(x, y) \in U$ and $f(x, y) = 0$. Denote such correspondence by $x = g(y)$, i.e. $f(g(y), y) = 0$;
(ii). $g \in C^1(W)$ and $g'(b) = -A_x^{-1}A_y$.
See [Rudin, Theorem 9.28] for the proof.

Homework Assignment.
3-12. Let $f : \mathbb{R}^5 \to \mathbb{R}^2$ be defined by
\[ f_1(x_1, x_2, y_1, y_2, y_3) = 2e^{x_1} + x_2y_1 - 4y_2 + 3 \]
and
\[ f_2(x_1, x_2, y_1, y_2, y_3) = x_2\cos x_1 - 6x_1 + 2y_1 - y_3. \]
Put $a = (0, 1)$ and $b = (3, 2, 7)$.
(a). Verify that $f(a, b) = 0$.
(b). Put $A = f'(a, b)$. Find $A$, $A_x$ and $A_y$ (in the matrix forms with respect to the standard bases).
(c). Prove that there is a function $g$ in a neighborhood $W$ of $b$ such that $f(g(y), y) = 0$ for $y \in W$. Compute $g'(b)$.

Bibliography

[Rudin] W. Rudin, Principles of mathematical analysis. Third edition. McGraw-Hill Book Co., New York-Auckland-Düsseldorf, 1976. x+342 pp.
