MATH 3210 Metric spaces
University of Leeds, School of Mathematics November 29, 2017
Syllabus:
1. Definition and fundamental properties of a metric space. Open sets, closed sets, closure and interior. Convergence of sequences. Continuity of mappings. (6)
2. Real inner-product spaces, orthonormal sequences, perpendicular distance to a subspace, applications in approximation theory. (7)
3. Cauchy sequences, completeness of R with the standard metric; uniform convergence and completeness of C[a, b] with the uniform metric. (3)
4. The contraction mapping theorem, with applications in the solution of equations and differential equations. (5)
5. Connectedness and path-connectedness. Introduction to compactness and sequential compactness, including subsets of Rⁿ. (6)
LECTURE 1
Books: Victor Bryant, Metric Spaces: Iteration and Application, Cambridge, 1985. M. Ó Searcóid, Metric Spaces, Springer Undergraduate Mathematics Series, 2006. D. Kreider, An Introduction to Linear Analysis, Addison-Wesley, 1966.
1 Metrics, open and closed sets
We want to generalise the idea of distance between two points in the real line, given by d(x, y) = |x − y|, and the distance between two points in the plane, given by
d(x, y) = d((x1, x2), (y1, y2)) = √((x1 − y1)² + (x2 − y2)²),
to other settings.
[DIAGRAM]
This will include the ideas of distances between functions, for example.
1.1 Definition
Let X be a non-empty set. A metric on X, or distance function, associates to each pair of elements x, y ∈ X a real number d(x, y) such that
(i) d(x, y) ≥ 0, and d(x, y) = 0 ⇐⇒ x = y (positive definite);
(ii) d(x, y) = d(y, x) (symmetric);
(iii) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
Examples: (i) X = R. The standard metric is given by d(x, y) = |x − y|. There are many other metrics on R, for example
d(x, y) = |eˣ − eʸ|;
d(x, y) = |x − y| if |x − y| ≤ 1, and d(x, y) = 1 if |x − y| ≥ 1.
Let X be any set whatsoever; then we can define
d(x, y) = 1 if x ≠ y, and d(x, y) = 0 if x = y (the discrete metric).
(ii) X = R². The standard metric is the Euclidean metric: if x = (x1, x2) and y = (y1, y2), then
d2(x, y) = √((x1 − y1)² + (x2 − y2)²).
This is linked to the inner-product (scalar product), x·y = x1y1 + x2y2, since it is just √((x − y)·(x − y)). We will study inner products more carefully later, so for the moment we won’t prove the (well-known) fact that it is indeed a metric. Other possible metrics include
d∞(x, y) = max{|x1 − y1|, |x2 − y2|}.
Let’s check the axioms. In fact (i) and (ii) are easy (i.e., the distance is positive definite and symmetric); for (iii) let’s write |x1 − y1| = p, |x2 − y2| = q, |y1 − z1| = r and |y2 − z2| = s. Then |x1 − z1| ≤ p + r and |x2 − z2| ≤ q + s; so
d∞(x, z) = max{|x1 − z1|, |x2 − z2|} ≤ max{p + r, q + s}
≤ max{p, q} + max{r, s} = d∞(x, y) + d∞(y, z),
where the middle inequality holds by inspection.
Another metric on R² comes from d1(x, y) = |x1 − y1| + |x2 − y2|. These metrics are all translation-invariant (i.e., d(x + z, y + z) = d(x, y)) and homogeneous (i.e., d(kx, ky) = |k| d(x, y)).
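For readers who like to check things by computer, here is a minimal Python sketch (our own illustration, not part of the notes) of the three metrics on R²; the names d1, d2 and dinf are just labels chosen here.

    import math

    def d2(x, y):
        # Euclidean metric on R^2
        return math.sqrt((x[0] - y[0])**2 + (x[1] - y[1])**2)

    def d1(x, y):
        # "taxicab" metric: sum of the coordinate distances
        return abs(x[0] - y[0]) + abs(x[1] - y[1])

    def dinf(x, y):
        # maximum metric: the larger coordinate distance
        return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

    x, y = (1.0, 2.0), (4.0, 6.0)
    print(d2(x, y), d1(x, y), dinf(x, y))   # 5.0 7.0 4.0

On any pair of points d∞ ≤ d2 ≤ d1, as the sample output illustrates.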
(iii) Take X = C[a, b]. Here are three metrics:
d2(f, g) = √(∫ₐᵇ (f(x) − g(x))² dx).
Again, this is linked to the idea of an inner product, so we will delay proving that it is a metric.
d1(f, g) = ∫ₐᵇ |f(x) − g(x)| dx,
the area between two curves [DIAGRAM].
d∞(f, g) = max{|f(x) − g(x)| : a ≤ x ≤ b}, the maximum separation between two curves. [DIAGRAM].
Example: on C[0, 1] take f(x) = x and g(x) = x² and calculate
d2(f, g) = √(∫₀¹ (x − x²)² dx) = √(1/30),
d1(f, g) = ∫₀¹ |x − x²| dx = 1/6, and
d∞(f, g) = max{|x − x²| : x ∈ [0, 1]} = 1/4.
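These values are easy to confirm numerically. The sketch below (our own, not from the notes) samples f and g on a fine grid; since the interval [0, 1] has length 1, averaging the samples approximates the integrals.

    import numpy as np

    xs = np.linspace(0.0, 1.0, 100001)
    diff = np.abs(xs - xs**2)            # |f - g| sampled on a grid

    d1 = diff.mean()                     # ~ 1/6 = 0.1667
    d2 = np.sqrt((diff**2).mean())       # ~ 1/sqrt(30) = 0.1826
    dinf = diff.max()                    # ~ 1/4 = 0.25
    print(d1, d2, dinf)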
1.2 Definition A set X together with a metric d is called a metric space, sometimes written (X, d). If A ⊆ X then we can use d to measure distances between points of A, and (A, d) is also a metric space, called a subspace of (X, d).
LECTURE 2
Examples: 1. The interval [a, b] with d(x, y) = |x − y| is a subspace of R. 2. The unit circle {(x1, x2) ∈ R² : x1² + x2² = 1} with d(x, y) = √((x1 − y1)² + (x2 − y2)²) is a subspace of R². 3. The space of polynomials P is a metric space with any of the metrics inherited from C[a, b] above.
1.3 Definition
Let (X, d) be a metric space, let x ∈ X and let r > 0. The open ball centred at x, with radius r, is the set B(x, r) = {y ∈ X : d(x, y) < r}, and the closed ball is the set
B[x, r] = {y ∈ X : d(x, y) ≤ r}.
Note that in R with the usual metric the open ball is B(x, r) = (x − r, x + r), an open interval, and the closed ball is B[x, r] = [x − r, x + r], a closed interval.
For the d2 metric on R², the unit ball B(0, 1) is the disc centred at the origin, excluding the boundary. You may like to think about what you get for other metrics on R².
1.4 Definition
A subset U of a metric space (X, d) is said to be open if for each point x ∈ U there is an r > 0 such that the open ball B(x, r) is contained in U (“room to swing a cat”).
Clearly X itself is an open set, and by convention the empty set ∅ is also considered to be open.
1.5 Proposition Every “open ball” B(x, r) is an open set.
Proof: If y ∈ B(x, r), choose δ = r − d(x, y) > 0. We claim that B(y, δ) ⊂ B(x, r). If z ∈ B(y, δ), i.e., d(z, y) < δ, then by the triangle inequality
d(z, x) ≤ d(z, y) + d(y, x) < δ + d(x, y) = r.
So z ∈ B(x, r).
1.6 Definition A subset F of (X, d) is said to be closed, if its complement X \ F is open.
Note that closed does not mean “not open”. In a metric space the sets ∅ and X are both open and closed. In R we have: (a, b) is open. [a, b] is closed, since its complement (−∞, a) ∪ (b, ∞) is open. [a, b) is not open, since there is no open ball B(a, r) contained in the set. Nor is it closed, since its complement (−∞, a) ∪ [b, ∞) isn’t open (no ball centred at b can be contained in the set).
1.7 Example
If we take the discrete metric,
d(x, y) = 1 if x ≠ y, and d(x, y) = 0 if x = y,
then each singleton {x} = B(x, 1/2), so each one-point set is open. Hence every set U is open, since for x ∈ U we have B(x, 1/2) ⊆ U.
Hence, by taking complements, every set is also closed.
1.8 Proposition
In a metric space, every one-point set {x0} is closed.
Proof: We need to show that the set U = {x ∈ X : x 6= x0} is open, so take a point x ∈ U. Now d(x, x0) > 0, and the ball B(x, r) is contained in U for every 0 < r < d(x, x0). [DIAGRAM]
1.9 Theorem
Let (Uα)α∈A be any collection of open subsets of a metric space (X, d) (not necessarily finite!). Then ∪_{α∈A} Uα is open. Let U and V be open subsets of a metric space (X, d). Then U ∩ V is open. Hence (by induction) any finite intersection of open subsets is open.
Proof: If x ∈ ∪_{α∈A} Uα then there is an α with x ∈ Uα. Now Uα is open, so B(x, r) ⊂ Uα for some r > 0. Then B(x, r) ⊂ ∪_{α∈A} Uα, so the union is open.
If now U and V are open and x ∈ U ∩ V , then ∃r > 0 and s > 0 such that B(x, r) ⊂ U and B(x, s) ⊂ V , since U and V are open. Then B(x, t) ⊂ U ∩ V if t ≤ min(r, s). [DIAGRAM.]
So the collection of open sets is preserved by arbitrary unions and finite intersections.
However, an arbitrary intersection of open sets is not always open; for example (−1/n, 1/n) is open for each n = 1, 2, 3, ..., but ∩_{n=1}^∞ (−1/n, 1/n) = {0}, which is not an open set.
LECTURE 3
For closed sets we swap union and intersection.
1.10 Theorem
Let (Fα)α∈A be any collection of closed subsets of a metric space (X, d) (not necessarily finite!). Then ∩_{α∈A} Fα is closed. Let F and G be closed subsets of a metric space (X, d). Then F ∪ G is closed. Hence (by induction) any finite union of closed subsets is closed.
To prove this we recall de Morgan’s laws. We use the notation Sᶜ for the complement X \ S of a set S ⊂ X.
x ∉ ∪_α Aα ⇐⇒ x ∉ Aα for all α, so (∪_α Aα)ᶜ = ∩_α Aαᶜ;
x ∉ ∩_α Aα ⇐⇒ x ∉ Aα for some α, so (∩_α Aα)ᶜ = ∪_α Aαᶜ.
Proof: Write Uα = Fαᶜ = X \ Fα, which is open. So ∪_{α∈A} Uα is open by Theorem 1.9. Now, by de Morgan’s laws, (∩_{α∈A} Fα)ᶜ = ∪_{α∈A} Fαᶜ, which is just ∪_{α∈A} Uα. Since its complement is open, ∩_{α∈A} Fα is closed. Similarly, the complement of F ∪ G is Fᶜ ∩ Gᶜ, which is the intersection of two open sets and hence open by Theorem 1.9. Hence F ∪ G is closed.
Infinite unions of closed sets need not be closed. An example is ∪_{n=1}^∞ [1/n, ∞) = (0, ∞), which is open but not closed.
1.11 Definition
The closure of S, written S̄ (or cl(S)), is the smallest closed set containing S: it is closed, and it is contained in all other closed sets containing S. Also, S is dense if S̄ = X. A smallest closed set containing S does exist, because we can define
S̄ = ∩{F : F ⊃ S, F closed},
the intersection of all closed sets containing S. There is at least one such F, namely X itself.
1.12 Example in R
The closure of S = [0, 1) is [0, 1]. This is closed, and there is nothing smaller that is closed and contains S.
1.13 Theorem
The set Q of rationals is dense in R, with the usual metric.
Proof: Suppose that F is a closed subset of R which contains Q; we claim that F = R. For U = R \ F is open and contains no points of Q. But a non-empty open set U must contain an interval B(x, r) for some x ∈ U, and hence a rational number. Our only conclusion is that U = ∅ and F = R, so that Q̄ = R.
1.14 Proposition
Let S ⊂ X. Then:
(i) S ⊂ S̄.
(ii) S̄ = S ⇐⇒ S is closed (so cl(S̄) = S̄).
(iii) S ⊂ T ⇒ S̄ ⊂ T̄.
(iv) cl(∅) = ∅ and cl(X) = X.
(v) cl(S ∪ T) = S̄ ∪ T̄.
(vi) cl(S ∩ T) ⊂ S̄ ∩ T̄.
Proof: All these are quite easy except (v) and (vi) (CHECK).
For (v), note that S ⊂ S̄ and T ⊂ T̄, so S ∪ T ⊂ S̄ ∪ T̄, which is closed; hence cl(S ∪ T) ⊂ S̄ ∪ T̄. Also S ⊂ S ∪ T and T ⊂ S ∪ T, so S̄ ⊂ cl(S ∪ T) and T̄ ⊂ cl(S ∪ T), giving S̄ ∪ T̄ ⊂ cl(S ∪ T). So the two sets are equal.
For (vi), we have S ∩ T ⊂ S and S ∩ T ⊂ T, so cl(S ∩ T) ⊂ S̄ and cl(S ∩ T) ⊂ T̄; hence cl(S ∩ T) ⊂ S̄ ∩ T̄.
But we don’t need to have equality; for example take X = R, S = (0, 1), T = (1, 2). Then cl(S ∩ T) = cl(∅) = ∅, whereas S̄ ∩ T̄ = [0, 1] ∩ [1, 2] = {1}.
1.15 Definition
We say that V is a neighbourhood (nhd) of x if there is an open set U such that x ∈ U ⊆ V; this means that ∃δ > 0 s.t. B(x, δ) ⊆ V. Thus a set is open precisely when it is a neighbourhood of each of its points.
1.16 Example
The half-open interval [0, 1) is a neighbourhood of every point in it except for 0.
1.17 Theorem
For a subset S of a metric space X, we have x ∈ S̄ iff V ∩ S ≠ ∅ for all nhds V of x (i.e., all neighbourhoods of x meet S).
Proof: If there is a neighbourhood of x that doesn’t meet S, then there is an open subset U with x ∈ U and U ∩ S = ∅. [DIAGRAM?] But then X \ U is a closed set containing S, and so S̄ ⊂ X \ U; then x ∉ S̄ because x ∈ U. Conversely, if every neighbourhood of x does meet S, then x ∈ S̄, as otherwise X \ S̄ is an open neighbourhood of x that doesn’t meet S.
LECTURE 4
1.18 Definition
The interior of S, int S, is the largest open set contained in S, and can be written as
int S = ∪{U : U ⊂ S, U open},
the union of all open sets contained in S. There is at least one such U, namely ∅.
We see that S is open exactly when S = int S, otherwise int S is smaller.
1.19 Examples in R
int [0, 1) = (0, 1); clearly this is open and there is no larger open set contained in [0, 1).
int Q = ∅, for any non-empty open set must contain an interval B(x, r), and then it contains an irrational number, so it isn’t contained in Q.
1.20 Proposition
int S = X \ cl(X \ S).
Proof: By de Morgan’s laws,
int S = ∪{U : U ⊂ S, U open}
= X \ ∩{Uᶜ : U ⊂ S, U open}
= X \ ∩{F : F ⊃ X \ S, F closed}
= X \ cl(X \ S).
This is because U ⊂ S if and only if Uᶜ = X \ U ⊃ X \ S. Also F = Uᶜ is closed precisely when U is open. That is, there is a correspondence between open sets contained in S and closed sets containing its complement.
1.21 Corollary
(i) int S ⊂ S. (ii) int S = S ⇐⇒ S is open. (iii) S ⊂ T ⇒ int S ⊂ int T. (iv) int(int S) = int S. (v) int(S ∪ T) ⊃ int S ∪ int T. (vi) int(S ∩ T) = int S ∩ int T.
Proof: Easy, or take complements and use Propositions 1.14 and 1.20.
1.22 Definition
The boundary or frontier of S is ∂S = S̄ \ int S = S̄ ∩ cl(X \ S). This writes ∂S as the intersection of two closed sets, so it is also closed.
1.23 Examples in R
For S = [0, 1) we have int S = (0, 1) and S̄ = [0, 1], so ∂S = {0, 1}.
For S = Q we have int S = ∅ and S̄ = R, so ∂S = R.
1.24 Examples in R²
For S = {(x, y) : x² + y² < 1}, we have int S = S and S̄ = {(x, y) : x² + y² ≤ 1}, so ∂S is the circle {(x, y) : x² + y² = 1}.
For S = [0, 1) regarded as the subset {(x, y) : 0 ≤ x < 1, y = 0} of R², we have S̄ = {(x, y) : 0 ≤ x ≤ 1, y = 0} and int S = ∅, so ∂S = S̄.
2 Convergence and continuity
Let (xn) be a sequence in a metric space (X, d), i.e., x1, x2, .... (Sometimes we may start counting at x0.)
2.1 Definition
We say xn → x (i.e., xn tends to x or converges to x) if d(xn, x) → 0 as n → ∞. That is, for all ε > 0 there is an N such that d(xn, x) < ε for n ≥ N (“for n sufficiently large”).
This is the usual notion of convergence if we think of points in Rᵐ with the Euclidean metric.
2.2 Theorem
(i) The sequence (xn) tends to x if and only if for every open U with x ∈ U, ∃n0 s.t. xn ∈ U for all n ≥ n0.
(ii) Let S be a subset of the metric space X. Then x ∈ S̄ if and only if there is a sequence (xn) of points of S with xn → x.
Proof: (i) If xn → x and x ∈ U, then there is a ball B(x, ε) ⊂ U, since U is open. But xn → x so d(xn, x) < ε for n sufficiently large, i.e., xn ∈ U for n sufficiently large.
Conversely, if the “open set” condition works, and ε > 0, choose U = B(x, ε). Then xn ∈ U for n sufficiently large, and so d(xn, x) < ε for n large.
(ii) If x ∈ S̄, then for each n we have B(x, 1/n) ∩ S ≠ ∅ by Theorem 1.17. So choose xn ∈ B(x, 1/n) ∩ S. Clearly d(xn, x) → 0, i.e., xn → x.
Conversely, if x ∉ S̄, then there is a neighbourhood U of x with U ∩ S = ∅. Now no sequence in S can get into U, so it cannot converge to x.
2.3 Examples
1. Take (R², d1), where d1(x, y) = |x1 − y1| + |x2 − y2| with x = (x1, x2) and y = (y1, y2), and consider the sequence (1/n, (2n + 1)/(n + 1)). We guess its limit is (0, 2). To see if this is right, look at
d1((1/n, (2n + 1)/(n + 1)), (0, 2)) = 1/n + |(2n + 1)/(n + 1) − 2| = 1/n + 1/(n + 1) → 0
as n → ∞. So the limit is (0, 2).
LECTURE 5
2. In C[0, 1] let fn(t) = tⁿ and f(t) = 0 for 0 ≤ t ≤ 1. Does fn → f, (a) in d1, and (b) in d∞?
(a) d1(fn, f) = ∫₀¹ tⁿ dt = 1/(n + 1) → 0 as n → ∞. So fn → f in d1.
(b) d∞(fn, f) = max{tⁿ : 0 ≤ t ≤ 1} = 1 ↛ 0 as n → ∞. So fn ↛ f in d∞.
Note: say gn → g pointwise on [a, b] as n → ∞ if gn(x) → g(x) for all x ∈ [a, b]. If we define g(x) = 0 for 0 ≤ x < 1 and g(1) = 1, then fn → g pointwise on [0, 1]. But g ∉ C[0, 1], as it is not continuous at 1.
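A quick numerical experiment (ours, not part of the notes) makes the contrast vivid: the d1-distance from fn to 0 shrinks like 1/(n + 1), while the d∞-distance stays equal to 1.

    import numpy as np

    ts = np.linspace(0.0, 1.0, 100001)
    for n in (1, 5, 25, 125):
        fn = ts**n
        # mean of samples ~ integral over [0, 1]; the max is attained at t = 1
        print(n, fn.mean(), fn.max())    # d1 ~ 1/(n+1) -> 0, d_inf = 1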
3. Take the discrete metric d0(x, y) = 1 if x ≠ y, and d0(x, y) = 0 if x = y.
Then xn → x ⇐⇒ d0(xn, x) → 0. But since d0(xn, x) = 0 or 1, this happens if and only if d0(xn, x) = 0 for n sufficiently large. That is, there is an n0 such that xn = x for all n ≥ n0. All convergent sequences in this metric are eventually constant. So, for example, d0(1/n, 0) ↛ 0.
A result on convergence in R².
2.4 Proposition
Take R² with any of the metrics d1, d2 and d∞. Then a sequence xn = (an, bn) converges to x = (a, b) if and only if an → a and bn → b.
Proof: We have d1(xn, x) = |an − a| + |bn − b|. This tends to zero as n → ∞ if and only if each of the terms |an − a| and |bn − b| does. And that’s the same as saying that an → a and bn → b.
Also d2(xn, x) = √(|an − a|² + |bn − b|²), which tends to zero if and only if |an − a|² + |bn − b|² does; this happens if and only if |an − a|² and |bn − b|² tend to zero, which is the same as an → a and bn → b.
Finally, d∞(xn, x) = max{|an − a|, |bn − b|}. If this tends to zero then so do |an − a| and |bn − b|, as they are no larger and non-negative; and if they both tend to zero then so does their maximum, which is at most their sum. Again this is the same as saying an → a and bn → b.
A similar result holds for Rᵏ in general.
Now let’s look at continuous functions again.
2.5 Theorem
If fn → f in (C[a, b], d∞), then fn → f in (C[a, b], d1).
(d∞ convergence is stronger than d1 convergence.)
Proof: d∞(fn, f) = max{|fn(x) − f(x)| : a ≤ x ≤ b} → 0 as n → ∞, so, given ε > 0 there is an N so that d∞(fn, f) < ε for n ≥ N. It follows that if n ≥ N then
d1(fn, f) = ∫ₐᵇ |fn(x) − f(x)| dx ≤ ∫ₐᵇ ε dx = ε(b − a),
so d1(fn, f) → 0 as n → ∞.
Note: It is also true that if d∞(fn, f) → 0 then fn → f pointwise on [a, b]. The converse is FALSE.
Now we look at continuous functions between general metric spaces.
2.6 Definition
Let f :(X, dX ) → (Y, dY ) be a map between metric spaces. We say that f is continuous at x0 ∈ X if for each ε > 0 there is a δ > 0 such that dY (f(x), f(x0)) < ε whenever dX (x, x0) < δ.
So f is continuous if it is continuous at all points of X.
2.7 Proposition
For f as above, f is continuous at x0 if and only if, whenever a sequence xn → x0, then f(xn) → f(x0) (“sequential continuity”).
Proof: Same proof as in real analysis, more or less. Suppose f is continuous at x0 and xn → x0. Then for each ε > 0 we have a δ > 0 such that dY(f(x), f(x0)) < ε whenever dX(x, x0) < δ. Then there’s an n0 with d(xn, x0) < δ for all n ≥ n0, and so d(f(xn), f(x0)) < ε for all n ≥ n0. Thus f(xn) → f(x0).
Conversely, if f is not continuous at x0, then there is an ε for which no δ will do, so we can find xn with d(xn, x0) < 1/n but d(f(xn), f(x0)) ≥ ε. Then xn → x0 but f(xn) ↛ f(x0).
But there is a nicer way to define continuity. For a mapping f : X → Y and a set U ⊂ Y, let f⁻¹(U) be the set
f⁻¹(U) = {x ∈ X : f(x) ∈ U}.
This makes sense even if f⁻¹ is not defined as a function.
2.8 Theorem
A function f : X → Y is continuous if and only if f⁻¹(U) is open in X for every open subset U ⊂ Y.
(“The inverse image of an open set is open.” Note that for f continuous we do not expect f(U) to be open for all open subsets of X, for example f : R → R, f ≡ 0, then f(R) = {0}, not open.)
LECTURE 6
Proof: Suppose that f is continuous, that U is open, and that x0 ∈ f⁻¹(U), so f(x0) ∈ U. Now there is a ball B(f(x0), ε) ⊂ U, since U is open, and then by continuity there is a δ > 0 such that dY(f(x), f(x0)) < ε whenever dX(x, x0) < δ. This means that for d(x, x0) < δ, f(x) ∈ U and so x ∈ f⁻¹(U). That is, f⁻¹(U) is open. [DIAGRAM]
Conversely, if the inverse image of an open set is open, and x0 ∈ X, let ε > 0 be given. We know that B(f(x0), ε) is open, so f⁻¹(B(f(x0), ε)) is open, and contains x0. So it contains some B(x0, δ) with δ > 0.
But now if d(x, x0) < δ, we have x ∈ B(x0, δ) ⊂ f⁻¹(B(f(x0), ε)), so f(x) ∈ B(f(x0), ε) and we have d(f(x), f(x0)) < ε.
2.9 Example
Let X = R with the discrete metric, and Y any metric space. Then all functions f : X → Y are continuous!
(i) Because the inverse image of an open set is an open set, since all sets are open. (ii) Because whenever xn → x0 we have xn = x0 for n large, so obviously f(xn) → f(x0).
2.10 Proposition
(i) A function f : X → Y is continuous if and only if f⁻¹(F) is closed whenever F is a closed subset of Y.
(ii) If f : X → Y and g : Y → Z are continuous, then so is the composition g ◦ f : X → Z defined by (g ◦ f)(x) = g(f(x)).
[DIAGRAM]
Proof: (i) We can do this by complements: if F is closed, then U = Fᶜ is open, and f⁻¹(F) = f⁻¹(U)ᶜ (a point is mapped into F if and only if it isn’t mapped into U). Then f⁻¹(F) is always closed when F is closed ⇐⇒ f⁻¹(U) is always open when U is open.
(ii) Take U ⊂ Z open; then (g ◦ f)⁻¹(U) = f⁻¹(g⁻¹(U)), for these are the points which map under f into g⁻¹(U), so that they map under g ◦ f into U. Now g⁻¹(U) is open in Y, as g is continuous, and then f⁻¹(g⁻¹(U)) is open in X since f is continuous.
2.11 Definition
A function f : X → Y is a homeomorphism between metric spaces if it is a bijection such that f and f⁻¹ are continuous. Then we say X and Y are homeomorphic, or X ∼ Y.
2.12 Example
The real line R is homeomorphic to the open interval (0, 1). For if we take y = tan⁻¹ x, this maps R homeomorphically onto (−π/2, π/2), and this can be mapped homeomorphically onto (0, 1), e.g. by z = (y + π/2)/π.
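As an illustration (our own sketch, with function names chosen here), the composed map and its inverse can be written out explicitly:

    import math

    def f(x):
        # R -> (0, 1): arctan, then rescale (-pi/2, pi/2) onto (0, 1)
        return (math.atan(x) + math.pi / 2) / math.pi

    def f_inv(z):
        # (0, 1) -> R: undo the rescaling, then apply tan
        return math.tan(math.pi * z - math.pi / 2)

    for x in (-100.0, -1.0, 0.0, 1.0, 100.0):
        print(x, f(x), f_inv(f(x)))      # f_inv(f(x)) recovers x

Both f and f_inv are continuous, which is exactly what a homeomorphism requires.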
3 Real inner-product spaces
Notation: vectors written u, v, w, etc. (Sometimes just u, v, w). Scalars written a, b, c, etc. Functions written f, g, h. Coordinates of a vector u normally written u1, u2, u3, etc.
3.1 Inner product in Rⁿ
For vectors u = (u1, u2) and v = (v1, v2) in R² we write ⟨u, v⟩ for the standard inner product ⟨u, v⟩ = u1v1 + u2v2; sometimes written u·v or (u, v). We can do similarly for vectors in Rⁿ where n = 1, 2, 3, ... (i.e., n components), so if u = (u1, ..., un) and v = (v1, ..., vn) we have
⟨u, v⟩ = u1v1 + u2v2 + ... + unvn.
For example,
⟨(1, 2, 3, 4), (0, −1, 5, 2)⟩ = 1·0 − 2·1 + 3·5 + 4·2 = 21.
3.2 Standard properties of the scalar product
I. LINEARITY. ⟨au + bv, w⟩ = a⟨u, w⟩ + b⟨v, w⟩, for a, b real and u, v, w vectors.
II. SYMMETRY.
⟨u, v⟩ = ⟨v, u⟩.
III. POSITIVE DEFINITENESS. ⟨u, u⟩ ≥ 0 for all u, and we have ⟨u, u⟩ = 0 if and only if u = 0.
The first two are easy to check. For III note that ⟨u, u⟩ = u1² + ... + un² ≥ 0, and it will be zero if and only if u1 = ... = un = 0.
3.3 Definition of a general (real) inner product
Let V be a real vector space and suppose that we have for each pair of vectors u, v in V a real number written ⟨u, v⟩, such that properties I, II and III of (3.2) hold. Then we call V a real inner product space, and ⟨u, v⟩ the inner product of u and v.
N.B. In quantum mechanics and elsewhere people use complex inner products. Not in this course.
LECTURE 7
3.4 Examples
1. The usual inner product on Rⁿ. 2. We can define a new inner product on R² by
⟨u, v⟩ = 2u1v1 + 3u2v2. This is easily checked to be linear (do it!) and symmetric. For positive definiteness, note that ⟨u, u⟩ = 2u1² + 3u2² ≥ 0
and is > 0 unless u1 = u2 = 0. The following alternative is not an inner product: define
⟨u, v⟩ = 2u1v1 − 3u2v2, so ⟨u, u⟩ = 2u1² − 3u2², which would be negative if u = (0, 1), say.
3. For a < b define C[a, b] to be the vector space of all continuous real functions on [a, b]. For f, g ∈ C[a, b] define
⟨f, g⟩ = ∫ₐᵇ f(x)g(x) dx.
Example: in C[0, 1], let f(x) = x + 1 and g(x) = 2x. Then
⟨f, g⟩ = ∫₀¹ (x + 1)(2x) dx = ∫₀¹ (2x² + 2x) dx = [2x³/3 + x²]₀¹ = 5/3.
3.5 Other properties of inner products
(a) ⟨u, av + bw⟩ = ⟨av + bw, u⟩ (rule II) = a⟨v, u⟩ + b⟨w, u⟩ (rule I) = a⟨u, v⟩ + b⟨u, w⟩ (rule II again). So the inner product is linear in the second argument as well as the first.
(b) ⟨0, u⟩ = ⟨0u + 0u, u⟩ = 0⟨u, u⟩ + 0⟨u, u⟩ = 0 for all u, using rule I. Also ⟨u, 0⟩ = ⟨0, u⟩ = 0, using rule II. This is for any u ∈ V.
(c) More generally we can check that
⟨a1u1 + a2u2 + ... + aNuN, b1v1 + b2v2 + ... + bMvM⟩ behaves like multiplication, and we get
Σ_{i=1}^N Σ_{j=1}^M ai bj ⟨ui, vj⟩.
4 Lengths, angles, orthogonality
4.1 Definition
In an inner product space we define the length of a vector v (sometimes called its size or norm) by ‖v‖ = √⟨v, v⟩. Note that ⟨v, v⟩ is always ≥ 0; also, by property III, ‖v‖ = 0 if and only if v = 0.
This agrees with what we usually do in Rⁿ: e.g. for v = (3, 4, −12) we have ‖v‖² = 3² + 4² + (−12)² = 9 + 16 + 144 = 169, so ‖v‖ = √169 = 13.
Example: in C[−1, 1] let f(x) = x. Then
‖f‖² = ∫₋₁¹ x² dx = [x³/3]₋₁¹ = 2/3,
so ‖f‖ = √(2/3). Note that if v ∈ V and a ∈ R, then ‖av‖² = ⟨av, av⟩ = a²⟨v, v⟩ = a²‖v‖², so ‖av‖ = √(a²) ‖v‖ = |a| ‖v‖, taking the positive square root. For example, (−2)v is twice as big as v, but with direction reversed.
4.2 Definition
The angle between two non-zero vectors u and v is the unique solution θ to
⟨u, v⟩ = ‖u‖ ‖v‖ cos θ
in the range 0 ≤ θ ≤ π (radians!). It is easy to check that the angle between u and u is 0, and the angle between u and −u is π. We say u and v are orthogonal if ⟨u, v⟩ = 0; this is because the angle between them then satisfies cos θ = 0, so θ = π/2. This is sometimes written u ⊥ v. To make sense of our definition we will need to know that
cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖)
lies between −1 and 1; see later.
Example: in C[0, 1] find the number a such that the functions f(t) = t and g(t) = 3t + a are orthogonal.
Solution:
⟨f, g⟩ = ∫₀¹ t(3t + a) dt = [t³ + at²/2]₀¹ = 1 + a/2,
so ⟨f, g⟩ = 0 ⇐⇒ 1 + a/2 = 0, i.e., a = −2.
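Both computations are easy to verify numerically. The sketch below (ours, using scipy's quad routine) implements the C[a, b] inner product of Example 3 in (3.4) and checks ⟨x + 1, 2x⟩ = 5/3 and ⟨t, 3t − 2⟩ = 0 on [0, 1].

    from scipy.integrate import quad

    def inner(f, g, a=0.0, b=1.0):
        # <f, g> = integral from a to b of f(x) g(x) dx
        val, _ = quad(lambda x: f(x) * g(x), a, b)
        return val

    print(inner(lambda x: x + 1, lambda x: 2 * x))   # ~ 5/3
    print(inner(lambda t: t, lambda t: 3 * t - 2))   # ~ 0: t and 3t - 2 are orthogonal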
More generally, a set of vectors {u1, ..., uN} is an orthogonal set if ⟨ui, uj⟩ = 0 whenever i ≠ j.
4.3 Pythagoras’s theorem
If ⟨u, v⟩ = 0 then ‖u + v‖² = ‖u‖² + ‖v‖².
[DIAGRAM – square on the hypotenuse etc.]
Proof:
‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨v, u⟩ + ⟨u, v⟩ + ⟨v, v⟩ = ‖u‖² + 0 + 0 + ‖v‖²,
using orthogonality.
4.4 Parallelogram identity
‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
[DIAGRAM – draw a parallelogram.] The sum of the squares of the two diagonals equals the sum of the squares of the four sides. Proof: expand the inner products; see the example sheets.
5 Cauchy–Schwarz and its consequences
In order to make sense of (4.2) we need the following.
5.1 Cauchy–Schwarz inequality For u and v in an inner-product space,
⟨u, v⟩² ≤ ⟨u, u⟩ ⟨v, v⟩,
i.e., |⟨u, v⟩| ≤ ‖u‖ ‖v‖.
Example: if u1, ..., un and v1, ..., vn are real numbers, then
|Σ_{i=1}^n ui vi| ≤ √(Σ_{i=1}^n ui²) √(Σ_{i=1}^n vi²).
Note that the LHS is |⟨u, v⟩| and the RHS is ‖u‖ ‖v‖, where u = (u1, ..., un), v = (v1, ..., vn) and we use the standard inner product in Rⁿ.
LECTURE 8
We give two proofs, and in each we assume that u 6= 0 and v 6= 0 (otherwise the inequality is obvious).
Proof 1:
Take ‖au − bv‖² = a²‖u‖² − 2ab⟨u, v⟩ + b²‖v‖² ≥ 0, with a = ⟨u, v⟩ and b = ‖u‖². We get
‖u‖² (⟨u, v⟩² − 2⟨u, v⟩² + ‖u‖²‖v‖²) ≥ 0,
which gives the result on dividing by ‖u‖² > 0.
Proof 2:
For real t we have ⟨tu + v, tu + v⟩ ≥ 0, i.e., t²⟨u, u⟩ + 2t⟨u, v⟩ + ⟨v, v⟩ ≥ 0. We’ll minimize this over t; by differentiation the minimum is where
2t⟨u, u⟩ + 2⟨u, v⟩ = 0.
So we put t = −⟨u, v⟩/⟨u, u⟩, and we get
⟨u, v⟩²/⟨u, u⟩ − 2⟨u, v⟩²/⟨u, u⟩ + ⟨v, v⟩ ≥ 0.
This simplifies to
−⟨u, v⟩²/⟨u, u⟩ + ⟨v, v⟩ ≥ 0,
i.e.,
⟨u, v⟩²/⟨u, u⟩ ≤ ⟨v, v⟩,
which is what is required.
NOW we know that ⟨u, v⟩/(‖u‖ ‖v‖) lies between −1 and 1, and so the definition of angle makes sense.
5.2 Triangle inequality
In an inner product space we have
‖u + v‖ ≤ ‖u‖ + ‖v‖.
For example, in Rⁿ this gives
√(Σ_{i=1}^n (ui + vi)²) ≤ √(Σ_{i=1}^n ui²) + √(Σ_{i=1}^n vi²).
[DIAGRAM – triangle of vectors]
Proof:
‖u + v‖² = ⟨u + v, u + v⟩ = ‖u‖² + 2⟨u, v⟩ + ‖v‖² ≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)².
5.3 Theorem
In an inner-product space the norm (length) of a vector satisfies
(i) ‖u‖ ≥ 0, and ‖u‖ = 0 if and only if u = 0;
(ii) ‖au‖ = |a| ‖u‖;
(iii) ‖u + v‖ ≤ ‖u‖ + ‖v‖.
5.4 Corollary
Let V be an inner-product space, and define d(x, y) = ‖x − y‖. Then d is a metric.
Proof: From Theorem 5.3, we see easily that d(x, y) ≥ 0 and d(x, y) = 0 if and only if x − y = 0, i.e., x = y. Also d(x, y) = ‖x − y‖ = ‖y − x‖ = d(y, x). Finally
d(x, z) = ‖x − z‖ = ‖(x − y) + (y − z)‖ ≤ ‖x − y‖ + ‖y − z‖ = d(x, y) + d(y, z).
So every inner-product space is a metric space.
5.5 The space ℓ²
The elements of the space ℓ² (also written ℓ₂) are real sequences (uk)_{k=1}^∞ such that Σ_{k=1}^∞ uk² < ∞.
So, for example, (1/2, 1/4, 1/8, ...) ∈ ℓ², since Σ_{k=1}^∞ (1/2ᵏ)² < ∞ (geometric series); but (1, 2, 3, 4, ...) ∉ ℓ², since Σ_{k=1}^∞ k² = ∞.
We shall get a vector space by adding sequences term-wise: if u = (uk) and v = (vk), then u + v = (uk + vk) and au = (auk), just like vectors with an infinite sequence of components. How do we know that (uk + vk) is still in ℓ²?
Proof: for each N,
√(Σ_{k=1}^N (uk + vk)²) ≤ √(Σ_{k=1}^N uk²) + √(Σ_{k=1}^N vk²) ≤ √(Σ_{k=1}^∞ uk²) + √(Σ_{k=1}^∞ vk²) = A,
say, where we used first the triangle inequality in Rᴺ. Since this holds for every N, we let N → ∞ to see that Σ_{k=1}^∞ (uk + vk)² converges, and its limit is at most A².
In fact ℓ² is an inner-product space; define
⟨u, v⟩ = Σ_{k=1}^∞ uk vk.
To see that this sum converges, use Cauchy–Schwarz in Rᴺ:
Σ_{k=1}^N |uk vk| ≤ √(Σ_{k=1}^N uk²) √(Σ_{k=1}^N vk²) ≤ √(Σ_{k=1}^∞ uk²) √(Σ_{k=1}^∞ vk²) = B,
say. Hence Σ_{k=1}^∞ |uk vk| converges to a limit which is at most B. So Σ_{k=1}^∞ uk vk is absolutely convergent.
It is easy now to check that this defines an inner product. Also ‖u‖² = ⟨u, u⟩ = Σ_{k=1}^∞ uk², so it is like Rⁿ with n = ∞. It is an infinite-dimensional vector space, but a very useful one.
LECTURE 9
6 Orthonormal sets
6.1 Definition
A set of vectors {e1, ..., en} in an inner product space is orthonormal if it is orthogonal and each vector has norm 1. So ⟨ei, ej⟩ = 0 if i ≠ j, and ⟨ei, ej⟩ = 1 if i = j.
If it’s also a basis for the inner product space, then we call it an orthonormal basis.
Examples: (i) (1, 0, 0), (0, 1, 0), (0, 0, 1) is an orthonormal basis of R³ (the standard basis); (ii) an unusual orthonormal basis of R² is e1 = (3/5, 4/5) and e2 = (−4/5, 3/5). [DIAGRAM – draw the vectors]
6.2 Proposition
If {e1,..., en} is orthonormal, then
‖Σ_{i=1}^n ai ei‖ = √(Σ_{i=1}^n ai²),
for any scalars a1, ..., an, and so the vectors {e1, ..., en} are linearly independent.
Proof:
⟨Σ_{i=1}^n ai ei, Σ_{j=1}^n aj ej⟩ = Σ_{i=1}^n Σ_{j=1}^n ai aj ⟨ei, ej⟩,
by (3.5). All terms except for those with i = j are zero, and we get Σ_{i=1}^n ai², as required.
Also, if Σ_{i=1}^n ai ei = 0, then Σ_{i=1}^n ai² = 0, and so a1 = ... = an = 0; i.e., the vectors are independent.
6.3 The Gram–Schmidt process
We start with a sequence v1, ..., vn of independent vectors and end up with a sequence e1, ..., en of orthonormal vectors such that for each 1 ≤ k ≤ n the set {e1, ..., ek} spans the same subspace as {v1, ..., vk}.
Define w1 = v1 and e1 = w1/‖w1‖.
Let w2 = v2 − ⟨v2, e1⟩e1, and e2 = w2/‖w2‖.
Then w3 = v3 − ⟨v3, e1⟩e1 − ⟨v3, e2⟩e2, and e3 = w3/‖w3‖.
In general
w_{k+1} = v_{k+1} − Σ_{i=1}^k ⟨v_{k+1}, ei⟩ei, and e_{k+1} = w_{k+1}/‖w_{k+1}‖.
Then {e1, ..., en} are orthonormal and for each k the vectors e1, ..., ek span the same space as v1, ..., vk.
Proof: Basically, the orthonormality property is shown by induction.
Suppose that we know that e1, ..., ek are orthonormal (k = 1 is already done). Then we work out ⟨w_{k+1}, ej⟩ for j ≤ k. So
⟨w_{k+1}, ej⟩ = ⟨v_{k+1}, ej⟩ − Σ_{i=1}^k ⟨v_{k+1}, ei⟩⟨ei, ej⟩ = ⟨v_{k+1}, ej⟩ − ⟨v_{k+1}, ej⟩ = 0.
So each new vector w_{k+1}, and hence also e_{k+1}, is orthogonal to the earlier ej. It isn’t zero, since v_{k+1} is independent of v1, ..., vk.
Also e_{k+1} = w_{k+1}/‖w_{k+1}‖ implies that ‖e_{k+1}‖ = ‖w_{k+1}‖/‖w_{k+1}‖ = 1.
The span of e1,..., ek is k-dimensional and contained in span{v1,..., vk}, so must equal it.
Example: Take v1 = (1, 0, 0, 1), v2 = (2, 3, 2, 0) and v3 = (0, 7, −2, 2) in R⁴. Set w1 = v1 and
e1 = w1/‖w1‖ = (1/√2)(1, 0, 0, 1).
Then
w2 = v2 − ⟨v2, e1⟩e1 = (2, 3, 2, 0) − (2/√2)(1/√2)(1, 0, 0, 1) = (1, 3, 2, −1).
Note that w2 ⊥ e1. Then
e2 = w2/‖w2‖ = (1/√15)(1, 3, 2, −1).
Next
w3 = v3 − ⟨v3, e1⟩e1 − ⟨v3, e2⟩e2
= (0, 7, −2, 2) − (2/√2)(1/√2)(1, 0, 0, 1) − (15/√15)(1/√15)(1, 3, 2, −1)
= (0, 7, −2, 2) − (1, 0, 0, 1) − (1, 3, 2, −1) = (−2, 4, −4, 2).
Finally,
e3 = w3/‖w3‖ = (−2, 4, −4, 2)/√40 = (1/√10)(−1, 2, −2, 1).
Having done this, CHECK that
⟨ei, ej⟩ = 1 if i = j, and ⟨ei, ej⟩ = 0 if i ≠ j.
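The CHECK can be automated. Below is a minimal numpy sketch of the Gram–Schmidt process of (6.3) (our own implementation, assuming the input vectors are independent), applied to this example; the matrix of inner products ⟨ei, ej⟩ should come out as the identity.

    import numpy as np

    def gram_schmidt(vectors):
        # classical Gram-Schmidt: subtract projections onto earlier e's, then normalise
        es = []
        for v in vectors:
            w = v - sum(np.dot(v, e) * e for e in es)
            es.append(w / np.linalg.norm(w))
        return es

    vs = [np.array([1.0, 0.0, 0.0, 1.0]),
          np.array([2.0, 3.0, 2.0, 0.0]),
          np.array([0.0, 7.0, -2.0, 2.0])]
    es = gram_schmidt(vs)
    print(np.round([[np.dot(a, b) for b in es] for a in es], 10))  # identity matrix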
Example (Legendre polynomials): Take the functions 1, t, t², t³, ... in C[−1, 1] with inner product
⟨f, g⟩ = ∫₋₁¹ f(t)g(t) dt.
Now ‖1‖² = ∫₋₁¹ 1 dt = 2, so e1(t) = 1/√2.
Next take
w2(t) = t − ⟨t, e1⟩e1(t) = t − (∫₋₁¹ t/√2 dt)(1/√2) = t − 0 = t.
Also
‖w2‖² = ∫₋₁¹ t² dt = 2/3, so e2(t) = w2(t)/‖w2‖ = √(3/2) t.
Then
w3(t) = t² − ⟨t², e1⟩e1(t) − ⟨t², e2⟩e2(t)
= t² − (1/2)∫₋₁¹ t² dt − (3/2)t ∫₋₁¹ t³ dt
= t² − 1/3 − 0 = t² − 1/3.
But
‖w3‖² = ∫₋₁¹ (t² − 1/3)² dt = ∫₋₁¹ (t⁴ − (2/3)t² + 1/9) dt = 2/5 − 4/9 + 2/9 = 8/45,
so
e3(t) = √(45/8) (t² − 1/3) = √(5/8) (3t² − 1).
In general, en(t) has degree n − 1. Lots of useful systems of polynomials are obtained by orthonormalizing 1, t, t², t³, ... with respect to different inner products (e.g. Chebyshev, Hermite, Laguerre, ...).
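The same computation can be carried out symbolically. The following sympy sketch (ours, not part of the notes) orthonormalizes 1, t, t² with respect to the C[−1, 1] inner product and reproduces e1, e2, e3 above, up to how sympy chooses to write the constants.

    import sympy as sp

    t = sp.symbols('t')

    def inner(f, g):
        # <f, g> = integral from -1 to 1 of f(t) g(t) dt
        return sp.integrate(f * g, (t, -1, 1))

    def gram_schmidt(fs):
        es = []
        for f in fs:
            w = f - sum(inner(f, e) * e for e in es)
            es.append(sp.simplify(w / sp.sqrt(inner(w, w))))
        return es

    print(gram_schmidt([sp.Integer(1), t, t**2]))
    # equivalent to [1/sqrt(2), sqrt(3/2)*t, sqrt(45/8)*(t**2 - 1/3)]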
LECTURE 10
7 Orthogonal projections and best approximation
Many approximation problems consist of taking a vector v and a subspace W of an inner-product space, and then finding the closest element w in W to v, i.e., minimizing the size of the error v − w.
Examples: 1. Take R3 with the usual inner product and W a plane through the origin. The closest point of W is obtained by “dropping a perpendicular onto W ”.
[DIAGRAM]
2. Find the best approximation to the function f(t) = |t| on [−1, 1] by a quadratic g(t) = a + bt + ct², in the sense of minimizing
‖f − g‖² = ∫₋₁¹ (f(t) − g(t))² dt.
7.1 Theorem
Let W be a (finite-dimensional) subspace of an inner-product space V, let v ∈ V, and let w ∈ W satisfy ⟨v − w, z⟩ = 0 for all z ∈ W. Then ‖v − y‖ ≥ ‖v − w‖ for all y ∈ W. That is, w is the closest point in W to v, and it is unique.
[DIAGRAM: plot v, w, y.]
Proof: for y ∈ W write v − y = (v − w) + (w − y) and note that v − w is orthogonal to w − y, since w − y is in W . By Pythagoras’s theorem (4.3),
‖v − y‖² = ‖v − w‖² + ‖w − y‖² ≥ ‖v − w‖²,
as required. Note that if y ≠ w, then ‖v − y‖ > ‖v − w‖, so the closest point is unique.
7.2 Definition
If W is a subspace of an inner product space V, then its orthogonal complement, W⊥, is the set of all vectors u that are orthogonal to every vector of W.
Clearly 0 ∈ W⊥, and indeed W⊥ is a subspace, since if u1 and u2 are orthogonal to everything in W, then ⟨a1u1 + a2u2, w⟩ = a1⟨u1, w⟩ + a2⟨u2, w⟩ = 0 for all w ∈ W.
Example: if W is the 1-dimensional subspace of R³ spanned by the vector w = (3, 5, 7), then x = (x1, x2, x3) is in W⊥ if and only if ⟨x, w⟩ = 0, i.e., 3x1 + 5x2 + 7x3 = 0. This is the plane perpendicular to W.
It can be checked that (W⊥)⊥ is W again.
Now in (7.1) we have that if v − w lies in W⊥, then w is the best approximation to v by vectors in W.
7.3 The normal equations
Suppose that w1, ..., wn is a basis for W. Then the best approximant w to v is found by solving ⟨v − w, wi⟩ = 0 for each i, because this makes v − w orthogonal to all linear combinations of the wi. Hence we have ⟨w, wi⟩ = ⟨v, wi⟩ for each i. Suppose now that w = Σ_{k=1}^n ck wk is the best approximant. Then we have
Σ_{k=1}^n ck ⟨wk, wi⟩ = ⟨v, wi⟩
for each i = 1, ..., n.
Example. In C[−1, 1] we take f(t) = |t|; to approximate it by a quadratic take w1(t) = 1, w2(t) = t and w3(t) = t². The best approximant c0 + c1t + c2t² to |t| satisfies:
c0⟨1, 1⟩ + c1⟨t, 1⟩ + c2⟨t², 1⟩ = ⟨f, 1⟩,
c0⟨1, t⟩ + c1⟨t, t⟩ + c2⟨t², t⟩ = ⟨f, t⟩,
c0⟨1, t²⟩ + c1⟨t, t²⟩ + c2⟨t², t²⟩ = ⟨f, t²⟩.
Now we can easily check that
∫₋₁¹ tᵏ dt = 0 if k is odd, and 2/(k + 1) if k is even,
so we can soon calculate inner products and get
2c0 + 0 + (2/3)c2 = ∫₋₁¹ |t| dt = 1,
0 + (2/3)c1 + 0 = ∫₋₁¹ |t| t dt = 0,
(2/3)c0 + 0 + (2/5)c2 = ∫₋₁¹ t² |t| dt = 1/2.
Note that ∫₋₁¹ |t| dt = ∫₋₁⁰ (−t) dt + ∫₀¹ t dt, etc.
The solution to these equations is c0 = 3/16, c1 = 0 and c2 = 15/16, giving the approximation
|t| ≈ 3/16 + (15/16) t².
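The 3×3 system can also be solved mechanically. In the sketch below (ours), the matrix G holds the inner products ⟨wk, wi⟩ computed above and b holds the right-hand sides ⟨f, wi⟩.

    import numpy as np

    G = np.array([[2.0, 0.0, 2/3],
                  [0.0, 2/3, 0.0],
                  [2/3, 0.0, 2/5]])   # Gram matrix of 1, t, t^2 on [-1, 1]
    b = np.array([1.0, 0.0, 0.5])     # <|t|, 1>, <|t|, t>, <|t|, t^2>
    print(np.linalg.solve(G, b))      # [0.1875, 0, 0.9375] = [3/16, 0, 15/16]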
7.4 Corollary
Suppose that e1, ..., en is an orthonormal basis for W. Then the best approximant of v ∈ V by an element of W is
w = Σ_{k=1}^n ⟨v, ek⟩ek.
Proof: Let w = Σ_{k=1}^n ck ek. Then the normal equations become
Σ_{k=1}^n ck ⟨ek, ei⟩ = ⟨v, ei⟩,
which reduces to ci = ⟨v, ei⟩ using orthonormality.
Thus we could have solved the example of approximating f(t) = |t| by using an or- thonormal basis for the quadratic polynomials, e.g. the Legendre functions.
7.5 Definition
The orthogonal projection of v onto W, written PW v, is the closest vector w ∈ W to v. In particular,
PW v = Σ_{k=1}^n ⟨v, ek⟩ek,
if {e1, ..., en} is an orthonormal basis of W. Note that PW : V → W is a linear mapping.
LECTURE 11
Example: the plane W = {(x1, x2, x3) ∈ R³ : x1 + x2 + x3 = 0} is a 2-dimensional subspace with orthonormal basis e1 = (1/√2)(1, −1, 0) and e2 = (1/√6)(1, 1, −2). CHECK that these are orthonormal and lie in W (so, since dim W = 2, they are also a basis for it).
Calculate PW (1, 0, 0). It is
PW(1, 0, 0) = ⟨(1, 0, 0), e1⟩e1 + ⟨(1, 0, 0), e2⟩e2 = (1/2)(1, −1, 0) + (1/6)(1, 1, −2) = (2/3, −1/3, −1/3).
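A short numerical check of this projection (our own sketch):

    import numpy as np

    e1 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
    e2 = np.array([1.0, 1.0, -2.0]) / np.sqrt(6)
    v = np.array([1.0, 0.0, 0.0])
    print(np.dot(v, e1) * e1 + np.dot(v, e2) * e2)   # [ 2/3, -1/3, -1/3 ]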
Now for some more serious applications of the theory.
7.6 Least squares approximation
Problem: find the line through (0, 0) (to be varied later) which “best approximates” the data (x1, y1), ..., (xn, yn). We would like yi = cxi for each i, but we don’t know c, and the points won’t always lie exactly on a line.
[DIAGRAM]
We decide to minimize Σ_{i=1}^n (yi − cxi)², least squares approximation, useful in statistical applications.
This is the same as taking x = (x1, ..., xn) and y = (y1, ..., yn) in Rⁿ and minimizing ‖y − cx‖. Take V to be Rⁿ with the usual inner product, and W to be the one-dimensional subspace {ax : a ∈ R}. This is the same as finding the closest point to y in W.
Solution: take
c = ⟨y, x⟩/⟨x, x⟩,
since this is the orthogonal projection onto W. In detail, w = cx, and the normal equation is ⟨w, x⟩ = ⟨y, x⟩, or c⟨x, x⟩ = ⟨y, x⟩. So
c = (x1y1 + ... + xnyn)/(x1² + ... + xn²).
Example: find the best fit to the data
x  y
2  3
1  2
3  3
4  5
Solution:
c = (2·3 + 1·2 + 3·3 + 4·5)/(2² + 1² + 3² + 4²) = 37/30.
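In code the solution is a one-liner; the sketch below (ours) computes c = ⟨y, x⟩/⟨x, x⟩ for the data above.

    import numpy as np

    x = np.array([2.0, 1.0, 3.0, 4.0])
    y = np.array([3.0, 2.0, 3.0, 5.0])
    print(np.dot(y, x) / np.dot(x, x))   # 37/30 ~ 1.2333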
7.7 Generalization
Suppose that y is known/guessed to be a linear combination of m variables x1, ..., xm, say y = c1x1 + ... + cmxm, so we have experimental data
x1    x2    ...   xm    y
x11   x21   ...   xm1   y1
...   ...   ...   ...   ...
x1n   x2n   ...   xmn   yn
Set up the problem in Rⁿ and choose c1, ..., cm to minimize ‖y − (c1x1 + ... + cmxm)‖. If W = span{x1, ..., xm}, then we want the closest point in W to y. We know from (7.3) that the constants c1, ..., cm are determined by the normal equations:
⟨Σ_{k=1}^m ck xk, xi⟩ = ⟨y, xi⟩ for each i,
i.e.,
c1⟨x1, x1⟩ + ... + cm⟨xm, x1⟩ = ⟨y, x1⟩,
...
c1⟨x1, xm⟩ + ... + cm⟨xm, xm⟩ = ⟨y, xm⟩.
To get a unique solution we need the vectors x1, ..., xm to be independent, which requires n ≥ m.
Example: Use the method of least squares approximation to find the best relation of the form y = c1x1 + c2x2 fitting the following experimental data:
      x1   x2   y
i)     1    0   2
ii)    0    1   3
iii)   1    1   2
iv)    1   −1   0
Solution: We work in R⁴ and take x1 = (1, 0, 1, 1), x2 = (0, 1, 1, −1) and y = (2, 3, 2, 0). The normal equations are
3c1 + 0c2 = 4,
0c1 + 3c2 = 5,
so c1 = 4/3 and c2 = 5/3. So the best relation is y = (4/3)x1 + (5/3)x2, giving
x1   x2   y (experimental)   y (theoretical)
1     0   2                  4/3
0     1   3                  5/3
1     1   2                  3
1    −1   0                 −1/3
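The same answer drops out of a general-purpose least squares solver. In the sketch below (our own illustration), the columns of the matrix A are the vectors x1 and x2, and numpy minimizes ‖y − Ac‖ exactly as in (7.7).

    import numpy as np

    A = np.array([[1.0,  0.0],
                  [0.0,  1.0],
                  [1.0,  1.0],
                  [1.0, -1.0]])        # columns are x1 and x2
    y = np.array([2.0, 3.0, 2.0, 0.0])
    c, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(c)                           # [4/3, 5/3]
    print(A @ c)                       # theoretical y: [4/3, 5/3, 3, -1/3]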
7.8 Curve fitting
Given (x1, y1), ..., (xn, yn), find a (polynomial) curve which fits these points well in the sense of least squares approximation.
Example: Find the parabola y = c0 + c1x + c2x² which best fits the points (0, 0), (1, 4), (−1, 1), (−2, 5).
Solution: Apply the method of least squares approximation to y = c0x0 + c1x1 + c2x2 with x0 = 1, x1 = x and x2 = x². Put x0 = (1, 1, 1, 1), x1 = (0, 1, −1, −2), x2 = (0, 1, 1, 4), y = (0, 4, 1, 5). Note that x0 is the vector with all components 1, x1 the vector of x values, and x2 the vector of x² values. The normal equations are
4c0 − 2c1 + 6c2 = 10,
−2c0 + 6c1 − 8c2 = −7,
6c0 − 8c1 + 18c2 = 25,
from which c0 = 3/10, c1 = 8/5 and c2 = 2.
Example: Find the line y = c0 + c1x which best fits the points (2, 3), (1, 2), (3, 3) and (4, 5). (Data used earlier to get y = cx only.)
Solution: let x0 = (1, 1, 1, 1), x1 = (2, 1, 3, 4) and y = (3, 2, 3, 5). So we want y ≈ c0x0 + c1x1. The normal equations are
4c0 + 10c1 = 13,
10c0 + 30c1 = 37,
giving c0 = 1 and c1 = 9/10, or y = 1 + (9/10)x.
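Both fits are least squares polynomial fits, so (as an illustration of ours) numpy's polyfit reproduces them directly; note that it returns coefficients with the highest power first.

    import numpy as np

    # the parabola best fitting (0,0), (1,4), (-1,1), (-2,5)
    print(np.polyfit([0.0, 1.0, -1.0, -2.0], [0.0, 4.0, 1.0, 5.0], 2))
    # [2.0, 1.6, 0.3], i.e. y = 3/10 + (8/5)x + 2x^2

    # the line best fitting (2,3), (1,2), (3,3), (4,5)
    print(np.polyfit([2.0, 1.0, 3.0, 4.0], [3.0, 2.0, 3.0, 5.0], 1))
    # [0.9, 1.0], i.e. y = 1 + (9/10)x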
LECTURE 12
8 Cauchy sequences and completeness
Recall that if (X, d) is a metric space, then a sequence (xn) of elements of X converges to x ∈ X if d(xn, x) → 0, i.e., if given ε > 0 there exists N such that d(xn, x) < ε whenever n ≥ N.
Often we think of convergent sequences as ones where xn and xm are close together when n and m are large. This is almost, but not quite, the same thing.
8.1 Definition
A sequence (xn) in a metric space (X, d) is a Cauchy sequence if for any ε > 0 there is an N such that d(xn, xm) < ε for all n, m ≥ N.
Example: take xn = 1/n in R with the usual metric. Now d(xn, xm) = |1/n − 1/m|. Suppose that n and m are both at least as big as N; then d(xn, xm) ≤ 1/N.
[DIAGRAM, showing the points]
Hence if ε > 0 and we take N > 1/ε, we have d(xn, xm) ≤ 1/N < ε whenever n and m are both ≥ N.
In fact all convergent sequences are Cauchy sequences, by the following result.
8.2 Theorem
Suppose that (xn) is a convergent sequence in a metric space (X, d), i.e., there is a limit point x such that d(xn, x) → 0. Then (xn) is a Cauchy sequence.
Proof: take ε > 0. Then there is an N such that d(xn, x) < ε/2 whenever n ≥ N. Now suppose both n ≥ N and m ≥ N. Then
d(xn, xm) ≤ d(xn, x) + d(x, xm) = d(xn, x) + d(xm, x) < ε/2 + ε/2 = ε,
and we are done.
8.3 Proposition
Every subsequence of a Cauchy sequence is a Cauchy sequence.
Proof: if (xn) is Cauchy and (x_{n_k}) is a subsequence, then given ε > 0 there is an N such that d(xn, xm) < ε whenever n, m ≥ N. Now there is a K such that n_k ≥ N whenever k ≥ K. So d(x_{n_k}, x_{n_l}) < ε whenever k, l ≥ K.
Does every Cauchy sequence converge?
Examples: 1. (X, d) = Q, as a subspace of R with the usual metric. Take x0 = 2 and define
x_{n+1} = x_n/2 + 1/x_n.
The sequence continues 3/2, 17/12, 577/408, ..., and indeed xn → x where x = x/2 + 1/x, i.e., x² = 2. But this limit isn’t in Q.
Thus (xn) is Cauchy in R, since it converges to √2 when we think of it as a sequence in R. So it is Cauchy in Q, but doesn’t converge to a point of Q.
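One can watch this happen in exact rational arithmetic; the sketch below (ours, using Python's fractions module) keeps every iterate in Q while the values close in on √2. Every printed iterate is a fraction, yet the limit is irrational, so the sequence is Cauchy in Q without converging in Q.

    from fractions import Fraction

    x = Fraction(2)
    for _ in range(5):
        x = x / 2 + 1 / x        # x_{n+1} = x_n/2 + 1/x_n, still rational
        print(x, float(x))       # 3/2, 17/12, 577/408, ... -> 1.41421356...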