
HS I © Gabriel Nagy

Hilbert Spaces I: Basic Properties

Notes from the Analysis Course (Fall 07 - Spring 08)

In this section we introduce an important class of Banach spaces, which carry some additional geometric structure that enables us to use our two- or three-dimensional intuition.

Convention. Throughout this note all vector spaces are over C.

A. Algebraic Preliminaries: Sesquilinear Forms

Definition. Given two vector spaces X and Y, a map φ : X × Y → C is said to be a sesquilinear form on X × Y, if:

• for every x ∈ X, the map Y ∋ y ↦ φ(x, y) ∈ C is linear;

• for every y ∈ Y, the map X ∋ x ↦ φ(x, y) ∈ C is conjugate linear, meaning that the conjugate map X ∋ x ↦ \overline{φ(x, y)} ∈ C is linear.

Exercise 1♥. Given a sesquilinear form φ on X × Y, prove that the map

φ* : Y × X ∋ (y, x) ↦ \overline{φ(x, y)} ∈ C

is a sesquilinear form on Y × X. This map is referred to as the adjoint of φ.

For the remainder of this sub-section we are going to restrict our attention to the case when X = Y. In this case Lemmas 1-3 contain three easy but fundamental results.

Lemma 1 (The Parallelogram Law). If X is a vector space and φ : X × X → C is a sesquilinear form, then

φ(x + y, x + y) + φ(x − y, x − y) = 2φ(x, x) + 2φ(y, y), ∀ x, y ∈ X. (1)

Proof. This follows directly from the properties of sesquilinear forms, which yield

φ(x + y, x + y) = φ(x, x) + φ(x, y) + φ(y, x) + φ(y, y),
φ(x − y, x − y) = φ(x, x) − φ(x, y) − φ(y, x) + φ(y, y),

for all x, y ∈ X.

Lemma 2 (The Polarization Identity). If X is a vector space and φ : X × X → C is a sesquilinear form, then

φ(x, y) = (1/4) Σ_{k=0}^{3} i^{−k} φ(x + i^k y, x + i^k y), ∀ x, y ∈ X. (2)

Proof. Denote by T4 the set of roots of unity of order 4, that is, T4 = {±1, ±i}, so that the sum in the right-hand side of (2) simply reads Σ_{ζ∈T4} ζ̄ φ(x + ζy, x + ζy). By the properties of sesquilinear forms, we have

φ(x + ζy, x + ζy) = φ(x, x) + φ(ζy, x) + φ(x, ζy) + φ(ζy, ζy) =
= φ(x, x) + ζ̄ φ(y, x) + ζ φ(x, y) + ζζ̄ φ(y, y), ∀ x, y ∈ X, ζ ∈ C.

If we specialize to the case when ζ ∈ T, i.e. |ζ| = 1 (= ζζ̄), then the above calculation yields

ζ̄ φ(x + ζy, x + ζy) = ζ̄ φ(x, x) + ζ̄² φ(y, x) + φ(x, y) + ζ̄ φ(y, y),

and then the sum in the right-hand side of (2) is:

Σ_{k=0}^{3} i^{−k} φ(x + i^k y, x + i^k y) = (Σ_{ζ∈T4} ζ̄) · [φ(x, x) + φ(y, y)] + (Σ_{ζ∈T4} ζ̄²) · φ(y, x) + 4φ(x, y),

so by the obvious identities Σ_{ζ∈T4} ζ̄ = Σ_{ζ∈T4} ζ̄² = 0, the desired identity (2) follows.

Definitions. Let X be a vector space, and let φ : X × X → C be a sesquilinear form.

A. We say that φ is hermitian, if φ* = φ, which means:

φ(x, y) = \overline{φ(y, x)}, ∀ x, y ∈ X. (3)

B. We say that φ is positive definite, if:

φ(x, x) ≥ 0, ∀ x ∈ X . (4)

C. We say that φ is strictly positive definite, if:

(i) φ is positive definite, and (ii) φ(x, x) = 0 ⇒ x = 0.
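Although not part of the notes, the algebra behind Lemma 2 is easy to test numerically. The sketch below (pure Python; the helper names `phi` and `polarization` are ours) checks the Polarization Identity (2) for the standard sesquilinear form on C², which is conjugate-linear in the first variable, matching the convention used here.

```python
# Illustrative sketch (not part of the notes): check the Polarization
# Identity (2) for the standard sesquilinear form on C^2, which is
# conjugate-linear in the FIRST variable, as in these notes.

def phi(x, y):
    """phi(x, y) = sum_k conj(x_k) * y_k."""
    return sum(xk.conjugate() * yk for xk, yk in zip(x, y))

def polarization(x, y):
    """RHS of (2): (1/4) sum_{k=0}^{3} i^{-k} phi(x + i^k y, x + i^k y)."""
    total = 0
    for k in range(4):
        z = [xj + (1j ** k) * yj for xj, yj in zip(x, y)]
        total += (1j ** (-k)) * phi(z, z)
    return total / 4

x = [1 + 2j, -3j]
y = [2 - 1j, 4 + 1j]
assert abs(phi(x, y) - polarization(x, y)) < 1e-12
```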

Remarks 1-3. Let φ : X × X → C be a sesquilinear form.

1. The condition that φ is hermitian is equivalent to the condition:

φ(x, x) ∈ R, ∀ x ∈ X. (5)

The implication “hermitian” ⇒ (5) is trivial. Conversely, if φ satisfies (5), then (using the notations from the proof of Lemma 2), by the Polarization Identity we have

φ(y, x) = (1/4) Σ_{ζ∈T4} ζ̄ φ(y + ζx, y + ζx),

and since by (5) each term φ(y + ζx, y + ζx) is real, taking complex conjugates gives

\overline{φ(y, x)} = (1/4) Σ_{ζ∈T4} ζ φ(y + ζx, y + ζx),

so using the “change of variable” γ = ζ̄, we also get

\overline{φ(y, x)} = (1/4) Σ_{γ∈T4} γ̄ φ(y + γ̄x, y + γ̄x), (6)

and now the desired identity (3) follows from the properties of sesquilinear forms, which imply

φ(y + γ̄x, y + γ̄x) = φ(γ̄(x + γy), γ̄(x + γy)) =

= γγ̄ φ(x + γy, x + γy) = φ(x + γy, x + γy), ∀ γ ∈ T4,

so when we go back to (6), again by the Polarization Identity we get

\overline{φ(y, x)} = (1/4) Σ_{γ∈T4} γ̄ φ(x + γy, x + γy) = φ(x, y).

2. If φ is hermitian, then it satisfies

φ(x + y, x + y) = φ(x, x) + φ(y, y) + 2Re φ(x, y), (7)
φ(x − y, x − y) = φ(x, x) + φ(y, y) − 2Re φ(x, y), (8)

for all x, y ∈ X . These two identities are referred to as the Law of Cosine.

3. If φ is positive definite, then φ is hermitian. This follows immediately from Remark 1.

Lemma 3 (Cauchy-Buniakowski-Schwartz Inequality). If φ : X × X → C is a positive definite sesquilinear form, then

|φ(x, y)|² ≤ φ(x, x) · φ(y, y), ∀ x, y ∈ X. (9)

Furthermore, if φ(x, x) ≠ 0, then one has equality in (9) if and only if there exists ζ ∈ C, such that

φ(ζx + y, ζx + y) = 0. (10)

Proof. Fix x and y, and note that, by the Law of Cosine, one has

0 ≤ φ(αx + βy, αx + βy) = |α|² φ(x, x) + 2Re[ᾱβ φ(x, y)] + |β|² φ(y, y), ∀ α, β ∈ C. (11)

Fix now β ∈ C, with |β| = 1, such that βφ(x, y) = |φ(x, y)|, and define the function g : R ∋ t ↦ φ(tx + βy, tx + βy) ∈ R, which by (11) can equivalently be written as

g(t) = t² φ(x, x) + 2t|φ(x, y)| + φ(y, y), t ∈ R. (12)

By positive definiteness, we also know that

g(t) ≥ 0, ∀ t ∈ R. (13)

We now have two cases to consider: (i) φ(x, x) = 0; (ii) φ(x, x) > 0. In case (i) – when g is a polynomial of degree at most 1 – condition (13) forces g to be constant, so φ(x, y) = 0, and then (9) is trivial. In case (ii) it follows that g is a quadratic polynomial (with positive leading coefficient), so condition (13) forces the discriminant ∆ to be non-positive, i.e. 4|φ(x, y)|² − 4φ(x, x) · φ(y, y) ≤ 0, from which (9) follows. Furthermore, if equality holds in (9), this means that

∆ = 0, so the equation g(t) = 0 has a unique real solution t0, which means precisely that φ(t0x + βy, t0x + βy) = 0, so we also get

φ(t0β̄x + y, t0β̄x + y) = φ(β̄(t0x + βy), β̄(t0x + βy)) = ββ̄ φ(t0x + βy, t0x + βy) = 0,

so (10) holds with ζ = t0β̄. Conversely, if (10) holds for some ζ, then by (9) it follows that

0 ≤ |φ(x, ζx + y)|² ≤ φ(x, x) · φ(ζx + y, ζx + y) = 0,
0 ≤ |φ(y, ζx + y)|² ≤ φ(y, y) · φ(ζx + y, ζx + y) = 0,

which give φ(x, ζx + y) = φ(y, ζx + y) = 0, so we get φ(x, y) = −ζφ(x, x) and φ(y, y) = −ζφ(y, x) = −ζ\overline{φ(x, y)} = ζζ̄ φ(x, x), thus clearly forcing

|φ(x, y)|² = |ζ|² φ(x, x)² = φ(x, x) · φ(y, y).
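As a quick numeric sanity check (ours, not the author's), the inequality (9) and its equality case can be illustrated for the standard positive definite form on C²; `phi` below is our helper name.

```python
# Illustrative sketch: the Cauchy-Buniakowski-Schwartz Inequality (9)
# for the standard positive definite form on C^2.

def phi(x, y):
    return sum(xk.conjugate() * yk for xk, yk in zip(x, y))

x = [1 + 1j, 2 - 3j]
y = [0.5j, -1 + 1j]
assert abs(phi(x, y)) ** 2 <= (phi(x, x) * phi(y, y)).real + 1e-12

# Equality holds when y is proportional to x (then (10) holds, with
# zeta chosen so that zeta*x + y = 0):
z = [(2 - 1j) * xk for xk in x]
assert abs(abs(phi(x, z)) ** 2 - (phi(x, x) * phi(z, z)).real) < 1e-9
```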

The final result in this sub-section establishes the characterization of positive definite sesquilinear forms in terms of seminorms satisfying the Parallelogram Law.

Theorem 1. Let X be a C-vector space. For a map q : X → [0, ∞), the following are equivalent.

(i) There exists a positive definite sesquilinear form φ : X × X → C, such that

q(x) = √φ(x, x), ∀ x ∈ X. (14)

(ii) q is a seminorm on X, which satisfies the Parallelogram Law:

q(x + y)² + q(x − y)² = 2q(x)² + 2q(y)², ∀ x, y ∈ X. (15)

Moreover, φ is unique and satisfies the Polarization Identity:

φ(x, y) = (1/4) Σ_{k=0}^{3} i^{−k} q(x + i^k y)², ∀ x, y ∈ X. (16)

Proof. (i) ⇒ (ii). Assume q is defined by (14), and let us prove first that q is a seminorm. Since φ(ζx, ζx) = ζ̄ζ φ(x, x) = |ζ|² φ(x, x), by taking square roots it follows that

q(ζx) = |ζ| · q(x), ∀ ζ ∈ C, x ∈ X.

To prove the triangle inequality, we use the Law of Cosine and the Cauchy-Buniakowski-Schwartz Inequality, which give:

φ(x + y, x + y) = |φ(x + y, x + y)| = |φ(x, x) + 2Re φ(x, y) + φ(y, y)| ≤
≤ φ(x, x) + 2|φ(x, y)| + φ(y, y) ≤
≤ φ(x, x) + 2√(φ(x, x) · φ(y, y)) + φ(y, y) = (√φ(x, x) + √φ(y, y))²,

so by taking square roots we get q(x + y) ≤ q(x) + q(y), ∀ x, y ∈ X .

The fact that q satisfies the Parallelogram Law (15), as well as the Polarization Identity (16), is immediate from Lemmas 1 and 2.

(ii) ⇒ (i). Assume q is a seminorm that satisfies the Parallelogram Law (15), define φ : X × X → C by (16), and let us prove that φ is positive definite and also satisfies (14).

Claim 1: φ(y, x) = \overline{φ(x, y)}, ∀ x, y ∈ X.

Fix x, y ∈ X, so by definition, we have

φ(y, x) = (1/4) Σ_{k=0}^{3} i^{−k} q(y + i^k x)². (17)

Since q is a seminorm, we have q(y + i^k x) = q(i^k (x + i^{−k} y)) = |i^k| · q(x + i^{−k} y) = q(x + i^{−k} y), ∀ k = 0, ..., 3, so if we go back to (17), and use the notations from Lemma 2 (with ζ = i^{−k}), we now have

φ(y, x) = (1/4) Σ_{ζ∈T4} ζ q(x + ζy)² = \overline{(1/4) Σ_{ζ∈T4} ζ̄ q(x + ζy)²} = \overline{φ(x, y)}.

(The second equality uses the fact that each q(x + ζy)² is real.)

Having proved Claim 1, we now see that, in order to prove that φ is sesquilinear, it suffices to prove that, for every x ∈ X, the map X ∋ y ↦ φ(x, y) ∈ C is C-linear.

Claim 2: φ(x, 2y) = 2φ(x, y), ∀ x, y ∈ X.

Using the Parallelogram Law (with x + i^k y in place of x and i^k y in place of y), we have the equality q(x + 2i^k y)² + q(x)² = 2q(x + i^k y)² + 2q(i^k y)², which combined with the equality q(i^k y) = q(y) gives q(x + 2i^k y)² = 2q(x + i^k y)² + 2q(y)² − q(x)², so by the definition of φ we get (any sum against the constant 2q(y)² − q(x)² vanishes, since Σ_{k=0}^{3} i^{−k} = 0):

φ(x, 2y) = (1/4) Σ_{k=0}^{3} i^{−k} q(x + 2i^k y)² = (1/4) Σ_{k=0}^{3} 2i^{−k} q(x + i^k y)² + (1/4) Σ_{k=0}^{3} i^{−k} (2q(y)² − q(x)²) = 2φ(x, y).

Claim 3: φ(x, y1 + y2) = φ(x, y1) + φ(x, y2), ∀ x, y1, y2 ∈ X.

If we apply the Parallelogram Law (with ½x + i^k y1 in place of x and ½x + i^k y2 in place of y), we get

q(x + i^k y1 + i^k y2)² = 2q(½x + i^k y1)² + 2q(½x + i^k y2)² − q(i^k y1 − i^k y2)²,

so if we multiply by i^{−k}, sum up, and divide by 4, then using the identity q(i^k y1 − i^k y2) = q(y1 − y2), we get (the third sum is 0):

φ(x, y1 + y2) = (1/4) Σ_{k=0}^{3} 2i^{−k} q(½x + i^k y1)² + (1/4) Σ_{k=0}^{3} 2i^{−k} q(½x + i^k y2)² − (1/4) Σ_{k=0}^{3} i^{−k} q(y1 − y2)² = 2φ(½x, y1) + 2φ(½x, y2). (18)

By Claims 1 and 2, however, we have

2φ(½x, yj) = 2\overline{φ(yj, ½x)} = \overline{φ(yj, x)} = φ(x, yj), j = 1, 2,

so going back to (18) yields: φ(x, y1 + y2) = φ(x, y1) + φ(x, y2).

Claim 4: φ(x, ζy) = ζφ(x, y), ∀ x, y ∈ X, ζ ∈ C.

First of all, the equality is obvious if ζ = 0, since φ(x, 0) = (1/4)(Σ_{k=0}^{3} i^{−k}) · q(x)² = 0. Second, since the map C ∋ α ↦ q(x + αy) ∈ [0, ∞) satisfies the inequality |q(x + αy) − q(x + βy)| ≤ q(αy − βy) = |α − β| q(y), ∀ α, β ∈ C, it is continuous, so (for fixed x, y ∈ X) the map ζ ↦ φ(x, ζy) is continuous in ζ. It therefore suffices to prove the Claim for ζ in a dense subset of C, and we choose such a dense subset to be

S = {(m + ni)2^{−k} : m, n ∈ Z, k ∈ N}.

Since by Claim 2 we clearly have φ(x, 2^{−k} y) = 2^{−k} φ(x, y), it suffices to consider ζ in the subset S0 = {m + ni : m, n ∈ Z}. Moreover, since by Claim 3 (and the case ζ = 0) we know that φ(x, (m + ni)y) = φ(x, my) + φ(x, niy) = mφ(x, y) + nφ(x, iy), it suffices to prove the Claim only for ζ = i. But this is pretty clear, since by definition (using again the notations from Lemma 2, and the “change of variable” γ = iζ) we have

φ(x, iy) = (1/4) Σ_{ζ∈T4} ζ̄ q(x + iζy)² = (1/4) Σ_{γ∈T4} i γ̄ q(x + γy)² = iφ(x, y).

Having proved that φ is a (hermitian) sesquilinear form, all that remains to be proved is the fact that it satisfies the identity

φ(x, x) = q(x)², ∀ x ∈ X. (19)

At this point, all we have to do is to notice that

q(x + i^k x) = q(x + x) = 2q(x), if k = 0;
q(x + i^k x) = q(x − x) = 0, if k = 2;
q(x + i^k x) = q(x ± ix) = |1 ± i| · q(x) = √2 · q(x), if k = 1, 3,

so by the definition of φ we have

3 1 X 1 φ(x, x) = i−kq(x + ikx)2 = 4q(x)2 − 2iq(x)2 + 2iq(x)2 = q(x)2. 4 4 k=0

B. (Pre-)Hilbert Spaces

The spaces we introduce in this sub-section are those whose geometry is “governed” by the Parallelogram Law.

Definitions. Let H be a normed vector space.

A. We say that H is a pre-Hilbert space, if its norm satisfies the Parallelogram Law:

‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖², ∀ x, y ∈ H. (20)

B. If H is a pre-Hilbert space, then according to Theorem 1, there exists a unique strictly positive definite sesquilinear form, denoted from now on by

( · | · ) : H × H ∋ (x, y) ↦ (x|y) ∈ C,

which is referred to as the inner product, satisfying

(x|x) = ‖x‖², ∀ x ∈ H.

Moreover, the inner product can be reconstructed out of the norm, by the Polarization Identity:

(x|y) = (1/4) Σ_{k=0}^{3} i^{−k} ‖x + i^k y‖², ∀ x, y ∈ H. (21)

C. We say that H is a Hilbert space, if H is simultaneously a pre-Hilbert space and a Banach space.

Remarks 4-5. Assume H is a pre-Hilbert space. For future reference, we record here two of the results from the previous sub-section, which in the current setting have the following specific forms.

4. By Remark 2, the Law of Cosine reads:

‖x + y‖² = ‖x‖² + ‖y‖² + 2Re(x|y), (22)
‖x − y‖² = ‖x‖² + ‖y‖² − 2Re(x|y), (23)

for all x, y ∈ H.

5. By Lemma 3, the Cauchy-Buniakowski-Schwartz Inequality reads:

|(x|y)| ≤ ‖x‖ · ‖y‖, ∀ x, y ∈ H. (24)

Moreover, if x ≠ 0, then one has equality in (24) if and only if there exists ζ ∈ C, such that y = ζx.

Remark 6. If q is a seminorm on a C-vector space X that satisfies the Parallelogram Law (15), then by Theorem 1 there exists a (unique) positive definite sesquilinear form φ : X × X → C, such that φ(x, x) = q(x)², ∀ x ∈ X. Consider the linear subspace N = {x ∈ X : q(x) = 0}. If we equip the quotient space X/N with the induced norm, defined (correctly) by ‖x̂‖ = q(x), where X ∋ x ↦ x̂ ∈ X/N denotes the quotient map, then it is obvious that X/N becomes a pre-Hilbert space. The inner product on the quotient space is defined (correctly) by (x̂|ŷ) = φ(x, y).

Remark 7. The completion of a pre-Hilbert space is a Hilbert space, since the norm on the completion still satisfies the Parallelogram Law.

Remark 8. If H is a pre-Hilbert space, and we equip the product space H × H with the product topology, then the inner product map H × H ∋ (x, y) ↦ (x|y) ∈ C is continuous. Indeed, given two norm-convergent sequences xn → x and yn → y in H, by the Cauchy-Buniakowski-Schwartz Inequality we have

|(xn|yn) − (x|y)| = |(xn − x|yn) + (x|yn − y)| ≤ |(xn − x|yn)| + |(x|yn − y)| ≤

≤ ‖xn − x‖ · ‖yn‖ + ‖x‖ · ‖yn − y‖,

so using the fact that ‖yn‖ → ‖y‖ it follows immediately that |(xn|yn) − (x|y)| → 0.

Example 1. Given a positive integer n, we equip C^n with the norm

‖x‖₂ = √(|x1|² + ··· + |xn|²), x = (x1, . . . , xn) ∈ C^n.

It is pretty clear that (C^n, ‖·‖₂) is a Hilbert space, hereafter referred to as the standard n-dimensional Hilbert space. The standard inner product is defined by

(x|y) = x̄1y1 + ··· + x̄nyn, x = (x1, . . . , xn), y = (y1, . . . , yn) ∈ C^n.

Example 2. Given a measure space (X, A, µ), the Banach space L²(X, A, µ) is a Hilbert space. The corresponding inner product is

(f|g) = ∫_X \overline{f} g dµ, f, g ∈ L²(X, A, µ).

Example 3. As a special case of Example 2, if we start with an arbitrary set S, and we consider the measure space (S, P(S), ν), where ν is the counting measure, we obtain the Hilbert space ℓ²(S), whose inner product is

(x|y) = Σ_{s∈S} x̄s ys, x = (xs)_{s∈S}, y = (ys)_{s∈S} ∈ ℓ²(S).

The Hilbert space ℓ²(S) is the completion of the pre-Hilbert space (⊕_{s∈S} C, ‖·‖₂). In the case when S = N, the Hilbert space ℓ²(N) is simply denoted by ℓ².

Example 4. If Ω is a locally compact Hausdorff space, and µ is an “honest” Radon measure on Ω, such that

(∗) µ(D) > 0, for all non-empty open sets D ⊂ Ω,

then the space Cc(Ω) of C-valued continuous functions on Ω with compact support is injectively embedded in L²(Ω, Bor(Ω), µ) = L²(Ω, M(µ), µ), so (Cc(Ω), ‖·‖₂) is a pre-Hilbert space.

In the absence of condition (∗), the natural map Cc(Ω) → L²(Ω, M(µ), µ) is not injective, so we only obtain a seminorm

q(f) = (∫_Ω |f|² dµ)^{1/2}, f ∈ Cc(Ω),

which still satisfies the Parallelogram Law (15). Passing to quotients and taking the completion (as indicated in Remarks 6 and 7), we obtain again the Hilbert space L²(Ω, Bor(Ω), µ) = L²(Ω, M(µ), µ).
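For concreteness, the conventions of Example 1 (conjugate-linearity in the first slot, linearity in the second) can be illustrated in a few lines of plain Python; this sketch and the helper name `inner` are ours, not part of the notes.

```python
# Illustrative sketch of the standard inner product of Example 1 on C^2:
# (x|y) = conj(x_1) y_1 + conj(x_2) y_2.

def inner(x, y):
    return sum(xk.conjugate() * yk for xk, yk in zip(x, y))

x, y = [1 + 1j, 2j], [3 + 0j, 1 - 1j]
zeta = 2 + 5j
# conjugate-linear in the first variable, linear in the second:
assert abs(inner([zeta * c for c in x], y) - zeta.conjugate() * inner(x, y)) < 1e-12
assert abs(inner(x, [zeta * c for c in y]) - zeta * inner(x, y)) < 1e-12
# (x|x) = ||x||_2^2 is real and non-negative:
assert abs(inner(x, x).imag) < 1e-12 and inner(x, x).real >= 0
```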

In what follows we are going to derive an important result concerning the geometry of Hilbert spaces, based in part on the Parallelogram Law.

Theorem 2. Assume H is a Hilbert space, and C ⊂ H is a non-empty, norm-closed, convex subset. For every x ∈ H, there exists a unique x0 ∈ C, such that

‖x − x0‖ ≤ ‖x − y‖, ∀ y ∈ C, (25)

so in effect ‖x − x0‖ = dist(x, C). Moreover, the vector x0 ∈ C is the unique one satisfying the inequality

Re(x − x0|y − x0) ≤ 0, ∀ y ∈ C. (26)

Proof. Consider the quantity δ = dist(x, C) = inf_{y∈C} ‖x − y‖. Since C is norm-closed, one has the equivalence δ = 0 ⇔ x ∈ C, in which case the Theorem is trivially satisfied with x0 = x. (The fact that x0 = x is the only vector in C that satisfies (26) is quite obvious, because in (26) we can let y = x.) Therefore, for the rest of the proof we may assume δ > 0, thus x ∉ C.

By the definition of δ, there exists a sequence (yn)_{n=1}^{∞} ⊂ C, such that lim_{n→∞} ‖x − yn‖ = δ. Let us apply now the Parallelogram Law (20) – with x − ym in place of x and x − yn in place of y – to obtain the equality

‖ym − yn‖² = 2‖x − ym‖² + 2‖x − yn‖² − ‖2x − ym − yn‖², ∀ m, n ∈ N. (27)

Since C is convex, the vector ½(ym + yn) belongs to C, so (before taking squares) the last term in the right-hand side of (27) satisfies the inequality

‖2x − ym − yn‖ = 2‖x − ½(ym + yn)‖ ≥ 2δ,

so if we go back to (27) we obtain

‖ym − yn‖² ≤ 2‖x − ym‖² + 2‖x − yn‖² − 4δ², ∀ m, n ∈ N. (28)

Since lim_{n→∞} ‖x − yn‖ = δ, by (28) it follows immediately that (yn)_{n=1}^{∞} is a Cauchy sequence, thus convergent to some point x0 (which necessarily belongs to C, since C is closed), and this point will satisfy ‖x − x0‖ = lim_{n→∞} ‖x − yn‖ = δ, thus having the desired property (25).

To prove the uniqueness of x0 among all vectors in C that satisfy (25), we first prove the following

Claim 1: If x0 ∈ C is any vector with property (25), then x0 also satisfies (26).

To prove the Claim, we argue by contradiction, assuming there exist x0, y ∈ C, such that ‖x − x0‖ = δ and Re(x − x0|y − x0) > 0. Define, for every t ∈ R, the vector

yt = x0 + t(y − x0) = (1 − t)x0 + ty, (29)

and consider the function f : R → R, given by f(t) = ‖x − yt‖² = ‖(x − x0) − t(y − x0)‖². Using the Law of Cosine (23) – with x − x0 in place of x and t(y − x0) in place of y – we can write

f(t) = ‖x − x0‖² + t²‖y − x0‖² − 2tRe(x − x0|y − x0),

so f is a quadratic function, with f(0) = ‖x − x0‖² = δ², and f′(0) = −2Re(x − x0|y − x0). Since by assumption we have f′(0) < 0, it follows that f is decreasing on a small open

interval around 0, so in particular there exists s ∈ (0, 1), such that f(s) < f(0) = δ². But now we have reached a contradiction, because for such an s, by convexity and by (29) it follows that ys belongs to C, so by the definition of δ we also have f(s) = ‖x − ys‖² ≥ δ².

Having proved Claim 1, we now see that in order to finish the proof of the Theorem, all we need to do is to prove the second uniqueness statement, that is, the following

Claim 2: Whenever x0, x1 ∈ C are such that

Re(x − xk|y − xk) ≤ 0, ∀ y ∈ C, k = 0, 1, (30)

it follows that x0 = x1.

Using (30) it follows immediately that

Re(x − x0|x1 − x0) ≤ 0;

Re(x1 − x|x1 − x0) = Re(x − x1|x0 − x1) ≤ 0,

so if we add these two inequalities we get Re(x1 − x0|x1 − x0) ≤ 0, which means that ‖x1 − x0‖² = 0, i.e. x1 = x0.
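The closest-point property is easy to visualize numerically. The sketch below (ours; real scalars for simplicity, i.e. the real Hilbert space R²) takes C to be the closed unit ball and checks (25) and (26) at x0 = x/‖x‖ against random points of C.

```python
import math
import random

# Illustrative sketch of Theorem 2 for the closed unit ball C of the real
# Hilbert space R^2: for x outside C the closest point is x0 = x/||x||,
# and the variational inequality (26) holds at x0.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

x = (3.0, 4.0)                        # ||x|| = 5, so x lies outside C
x0 = tuple(c / norm(x) for c in x)    # closest point of C to x

random.seed(0)
for _ in range(500):
    y = (random.uniform(-1, 1), random.uniform(-1, 1))
    if norm(y) > 1:                   # keep only points of C
        continue
    d0 = norm((x[0] - x0[0], x[1] - x0[1]))
    dy = norm((x[0] - y[0], x[1] - y[1]))
    assert d0 <= dy + 1e-12                                   # (25)
    assert dot((x[0] - x0[0], x[1] - x0[1]),
               (y[0] - x0[0], y[1] - x0[1])) <= 1e-12          # (26)
```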

C. Orthogonal Projections and Riesz’ Theorem

As an important application of Theorem 2 from the preceding sub-section, we have the following.

Proposition-Definition 1. Let H be a Hilbert space, and let Y ⊂ H be a closed linear subspace. Denote, for every x ∈ H, by PY x the unique vector x0 ∈ Y, such that

‖x − x0‖ ≤ ‖x − y‖, ∀ y ∈ Y. (31)

(See Theorem 2 for existence, uniqueness, and additional properties of x0.) The map x 7−→ PY x has the following properties.

(i) For every vector x ∈ H, the vector PY x is the unique vector x0 ∈ Y, such that

(x − x0|y) = 0, ∀ y ∈ Y. (32)

(ii) PY : H → H is linear, and satisfies the identity

‖PY x‖² + ‖x − PY x‖² = ‖x‖², ∀ x ∈ H, (33)

so in particular PY is also continuous, with kPY k ≤ 1.

(iii) PY² = PY.

(iv) Range PY = Y and Ker PY = Range (IH − PY ), where IH : H → H denotes the identity operator.

The linear continuous operator PY : H → H is called the orthogonal projection onto Y.

Proof. (i). Fix x ∈ H, and let x0 ∈ Y be the unique vector satisfying (31). By Theorem 2 we also know that x0 is the unique vector in Y satisfying

Re(x − x0|y − x0) ≤ 0, ∀ y ∈ Y. (34)

Since x0 ∈ Y, it is obvious that {y − x0 : y ∈ Y} = Y, so condition (34) is equivalent to:

Re(x − x0|y) ≤ 0, ∀ y ∈ Y. (35)

It is obvious that (32) ⇒ (35), so in order to finish the proof of (i), we only need to prove the implication (35) ⇒ (32). But this is clear, because if we start with some y ∈ Y, we can choose some ζ ∈ T, such that |(x − x0|y)| = ζ(x − x0|y) = (x − x0|ζy), and ζy also belongs to Y.

(ii). By the uniqueness condition (i), in order to prove the linearity of PY it suffices to prove the identities

((x1 + x2) − (PY x1 + PY x2) | y) = 0, ∀ x1, x2 ∈ H, y ∈ Y; (36)
(ζx − ζPY x | y) = 0, ∀ x ∈ H, ζ ∈ C, y ∈ Y. (37)

(The equality (36) yields PY(x1 + x2) = PY x1 + PY x2, while the equality (37) yields PY(ζx) = ζPY x.) Both identities (36) and (37) are trivial, since by the sesquilinearity of the inner product we have:

((x1 + x2) − (PY x1 + PY x2) | y) = (x1 − PY x1 | y) + (x2 − PY x2 | y);
(ζx − ζPY x | y) = ζ̄ (x − PY x | y).

To prove the equality (33) we use the Law of Cosine, which yields

‖x‖² = ‖x − PY x‖² + ‖PY x‖² + 2Re(x − PY x|PY x),

and we use property (i) with y = PY x, which yields (x − PY x|PY x) = 0.

(iii)-(iv). First of all, it is trivial that Range PY ⊂ Y, by construction. Secondly, right from the definition it follows that

PY x = x, ∀ x ∈ Y, (38)

so we also have the inclusion Range PY ⊃ Y, so in fact we have the equality Range PY = Y. Thirdly, if we start with an arbitrary x ∈ H, then PY x ∈ Y, and then using (38) we get PY(PY x) = PY x, which means that we have the identity PY² = PY. Finally, in order to prove the equality

Ker PY = Range(IH − PY ), (39)

we first notice that, by the obvious identity PY(IH − PY) = PY − PY² = 0, we clearly have the inclusion Range(IH − PY) ⊂ Ker PY. Conversely, if x ∈ Ker PY, then x = x − PY x = (IH − PY)x, so obviously x belongs to Range(IH − PY).¹

1 There is nothing special about the inclusion Ker P ⊂ Range(IH − P ). It holds for arbitrary linear maps P : H → H, on arbitrary vector spaces H.
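For a one-dimensional subspace Y = Ce (e a unit vector), the projection is explicitly PY x = (e|x)e, and properties (32), (33) and PY² = PY can be checked numerically; this sketch and its helper names are ours, not from the notes.

```python
import math

# Illustrative sketch: orthogonal projection onto the line Y = C*e in C^2,
# where e is a unit vector; P_Y x = (e|x) e.  Checks (32), (33), P^2 = P.

def inner(x, y):
    return sum(xk.conjugate() * yk for xk, yk in zip(x, y))

e_raw = [1 + 1j, 2 - 1j]
n = math.sqrt(inner(e_raw, e_raw).real)
e = [c / n for c in e_raw]                      # unit vector spanning Y

def P(x):
    c = inner(e, x)
    return [c * ek for ek in e]

x = [4 - 2j, 1 + 3j]
px = P(x)
r = [a - b for a, b in zip(x, px)]              # the residual x - P x
assert abs(inner(e, r)) < 1e-12                 # (32): x - P x orthogonal to Y
assert all(abs(a - b) < 1e-12 for a, b in zip(P(px), px))   # P^2 = P
assert abs(inner(px, px).real + inner(r, r).real - inner(x, x).real) < 1e-10  # (33)
```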

Theorem 3 (Riesz). Let H be a Hilbert space. For a linear functional φ : H → C, the following are equivalent:

(i) φ is norm-continuous;

(ii) there exists a vector x ∈ H, such that

φ(y) = (x|y), ∀ y ∈ H. (40)

Moreover, in this case the vector x is unique and has ‖x‖ = ‖φ‖.

Proof. (i) ⇒ (ii). Assume φ is norm-continuous, and let us prove the existence and uniqueness of x satisfying (40). The case when φ is identically 0 is trivial, by taking x = 0. (The uniqueness is obvious, because in the case φ = 0, any vector x satisfying (40) has ‖x‖² = (x|x) = φ(x) = 0.) For the remainder of the proof we assume that φ is not identically 0, which implies the existence of some x0 with φ(x0) ≠ 0. Consider then the (proper) closed linear subspace Y = Ker φ, and the vector x1 = x0 − PY x0. By linearity, combined with the fact that PY x0 ∈ Y = Ker φ, we have φ(x1) = φ(x0) − φ(PY x0) = φ(x0) ≠ 0. Consider then the linear functional φ1 : H ∋ y ↦ (x1|y) ∈ C, and let us remark that

Ker φ = Ker φ1. (41)

Indeed, if we start with some y ∈ Y, then by Proposition-Definition 1 (i), we have

φ1(y) = (x1|y) = (x0 − PY x0|y) = 0,

thus proving the inclusion Ker φ ⊂ Ker φ1. Since φ1 is not identically 0, and H/Ker φ is one-dimensional, this inclusion forces the equality (41). Furthermore, using the same one-dimensional argument, it follows that there exists some ζ ∈ C ∖ {0}, such that φ = ζφ1, which means that

φ(y) = ζ(x1|y) = (ζ̄x1|y), ∀ y ∈ H,

thus proving (40) with x = ζ̄x1. For the uniqueness, assume x′ ∈ H is another vector satisfying (40), and let us prove that x′ = x. This is, however, trivial, since the equalities (x|y) = φ(y) = (x′|y) force

(x′ − x|y) = 0, ∀ y ∈ H,

and this clearly implies ‖x′ − x‖² = (x′ − x|x′ − x) = 0.

(ii) ⇒ (i). Assume φ is represented by (40), and let us prove that φ is continuous, and has ‖φ‖ = ‖x‖. The continuity is obvious by the Cauchy-Buniakowski-Schwartz Inequality, which yields:

|φ(y)| = |(x|y)| ≤ ‖x‖ · ‖y‖, ∀ y ∈ H,

thus also proving the inequality ‖φ‖ ≤ ‖x‖. By the norm inequality, we also know that

|φ(y)| ≤ ‖φ‖ · ‖y‖, ∀ y ∈ H,

so if we plug in y = x we get

‖x‖² = (x|x) = |φ(x)| ≤ ‖φ‖ · ‖x‖,

which forces (at least in the case x ≠ 0) the inequality ‖x‖ ≤ ‖φ‖.

Comment. Riesz’ Theorem has many important applications, which will be discussed throughout the entire chapter. Besides Exercise 2 below, a whole group of applications will appear in sub-section D.

Exercise 2. Prove that all Hilbert spaces are reflexive. This means that, if one considers the dual Banach space H∗, and if one defines, for every x ∈ H, the linear continuous functional x# : H∗ ∋ φ ↦ φ(x) ∈ C, then the map

# : H ∋ x ↦ x# ∈ (H∗)∗

is an isometric linear isomorphism.
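In C^n the content of Riesz' Theorem can be seen concretely: the functional φ(y) = (x|y) has operator norm ‖x‖, attained at y = x/‖x‖. A small numeric sketch (ours, not part of the notes):

```python
import math
import random

# Illustrative sketch of Riesz' Theorem in C^2: phi(y) = (x|y) satisfies
# |phi(y)| <= ||x|| ||y||, with equality at y = x/||x||, so ||phi|| = ||x||.

def inner(x, y):
    return sum(xk.conjugate() * yk for xk, yk in zip(x, y))

x = [2 - 1j, 1 + 3j]
norm_x = math.sqrt(inner(x, x).real)

random.seed(1)
for _ in range(500):
    y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
    ny = math.sqrt(inner(y, y).real)
    assert abs(inner(x, y)) <= norm_x * ny + 1e-9    # CBS bound

# the bound is attained at y = x/||x||, so ||phi|| = ||x||:
assert abs(abs(inner(x, [c / norm_x for c in x])) - norm_x) < 1e-12
```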

D. Orthogonality

Riesz’ Theorem shows that the inner products on Hilbert spaces enjoy properties similar to dual pairings (see DT I). To be a bit more precise, we can introduce the following “sesquilinear (hilbertian) analogues” of the polars and absolute polars, as indicated in the following Exercises.

Exercises 3-4. Let H be a Hilbert space.

3. Given a non-empty set A ⊂ H, define its hilbertian polar to be the set

A°,h = {x ∈ H : Re(x|a) ≤ 1, ∀ a ∈ A}.

Prove the hilbertian Bi-polar Theorem:

(A°,h)°,h = \overline{conv}(A ∪ {0}).

4. Given a non-empty set A ⊂ H, define its hilbertian absolute polar to be the set

A^{□,h} = {x ∈ H : |(x|a)| ≤ 1, ∀ a ∈ A}.

Prove the hilbertian Absolute Bi-polar Theorem:

(A^{□,h})^{□,h} = \overline{conv}(bal A).

Another construction from DT I was the annihilator. The hilbertian annihilator has a privileged role in the theory of Hilbert spaces, and therefore the above terminology and notation are slightly modified as follows. Definition. Given a Hilbert space H, and some non-empty subset A ⊂ H, we define the orthogonal set to A to be:

A⊥ = {x ∈ H :(x|a) = 0, ∀ a ∈ A}.

As a special case (when A is a singleton), given two vectors x, y ∈ H, we shall say that x is orthogonal to y, if (x|y) = 0, in which case we write x ⊥ y. Note that the relation ⊥ is symmetric. The main properties of orthogonality are summarized below.

Proposition 2. Suppose H is a Hilbert space.

(i) For any non-empty subset A ⊂ H, the set A⊥ is a closed linear subspace.

(ii) If A ⊂ B ⊂ H are non-empty sets, then A⊥ ⊃ B⊥.

(iii) For any non-empty subset A, one has the equality:

\overline{span} A = (A⊥)⊥. (42)

(iv) For any closed linear subspace X, the linear subspace X⊥ satisfies the equalities:

• X ∩ X⊥ = {0};
• X + X⊥ = H.

Proof. (i). Since by definition we clearly have

A⊥ = ∩_{x∈A} {x}⊥, (43)

it suffices to show that {x}⊥ is a closed linear subspace, which is trivial by Riesz’ Theorem, since {x}⊥ = Ker φx, where φx : H ∋ y ↦ (x|y) ∈ C is known to be linear and continuous.

(ii). This follows immediately from (43).

(iii). From DT I we know that

\overline{span} A = (A^ann)^ann, (44)

where the annihilator spaces are defined using the dual pairing between H and its topological dual H∗ as:

A^ann = {φ ∈ H∗ : φ(a) = 0, ∀ a ∈ A},
(A^ann)^ann = {y ∈ H : φ(y) = 0, ∀ φ ∈ A^ann}.

Using Riesz’ Theorem, we know that the correspondence Φ : H ∋ x ↦ φx ∈ H∗ is a bijection. Using this bijection, it follows immediately that A^ann = {φx : x ∈ A⊥} = Φ(A⊥), which in turn means that

(A^ann)^ann = {y ∈ H : φx(y) = 0, ∀ x ∈ A⊥} = {y ∈ H : (x|y) = 0, ∀ x ∈ A⊥} = (A⊥)⊥,

and the desired equality (42) follows immediately from (44).

(iv). The equality X ∩ X⊥ = {0} is trivial, since any vector x ∈ X ∩ X⊥ is orthogonal to itself, i.e. ‖x‖² = (x|x) = 0. To prove the second equality, we start with an arbitrary vector y ∈ H, and we consider the vectors y1 = PX y and y2 = y − y1. On the one hand, y1 clearly belongs to X. On the other hand, by the properties of the orthogonal projection PX, we also know that (y2|x) = 0, ∀ x ∈ X, which means that y2 ∈ X⊥, and we are done, since by construction y = y1 + y2.

Corollary 1. If H is a Hilbert space, and A is a non-empty subset, then:

A⊥ = (\overline{span} A)⊥.

Proof. Since the left-hand side is closed, we have: A⊥ = ((A⊥)⊥)⊥ = (\overline{span} A)⊥.

Corollary 2 (of the proof). Given a Hilbert space H and a closed linear subspace X ⊂ H, the orthogonal projections PX and PX⊥ satisfy the identity PX + PX⊥ = IH. In particular, Ker PX = X⊥.

Proof. Start with an arbitrary y ∈ H, and consider the vectors y1 = PX y ∈ X and y2 = y − y1 ∈ X⊥, constructed in the proof of (iv). Since (X⊥)⊥ = X, by the same argument (with X⊥ in place of X), we can consider the vector z1 = PX⊥ y, so that the vector z2 = y − z1 belongs to X. Now we have y1 + y2 = y = z2 + z1, with y1, z2 ∈ X and y2, z1 ∈ X⊥, so by the equality X ∩ X⊥ = {0} we must have y1 = z2 and y2 = z1. In particular, we have (IH − PX)y = y2 = z1 = PX⊥ y, thus proving the desired identity IH − PX = PX⊥.

The equality Ker PX = X⊥ follows now immediately from Proposition-Definition 1.
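Corollary 2 can also be checked by hand in C², where for X = Ce the complement X⊥ is spanned by an explicit unit vector; the sketch and helper names below are ours.

```python
import math

# Illustrative sketch: in C^2, for X = C*e the subspace X-perp is spanned
# by f = (conj(e_2), -conj(e_1)), and P_X + P_Xperp = I (Corollary 2).

def inner(x, y):
    return sum(xk.conjugate() * yk for xk, yk in zip(x, y))

e = [1 / math.sqrt(2) + 0j, 1j / math.sqrt(2)]
f = [e[1].conjugate(), -e[0].conjugate()]

assert abs(inner(e, f)) < 1e-12                 # f is orthogonal to e

def proj(u, x):                                 # projection onto C*u, u unit
    c = inner(u, x)
    return [c * uk for uk in u]

x = [3 + 2j, -1 + 4j]
px, qx = proj(e, x), proj(f, x)
assert all(abs(a + b - c) < 1e-12 for a, b, c in zip(px, qx, x))  # sum = I
```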

E. Geometry in Coordinates.

In this section we introduce the suitable objects that allow one to use coordinates for geometric calculations in a Hilbert space.

Definition. Assume H is a pre-Hilbert space. A non-empty subset S ⊂ H is said to be an orthogonal subset, if x ≠ 0, ∀ x ∈ S, and furthermore,

(o) whenever x, y ∈ S, x 6= y, it follows that x ⊥ y.

Remark 9. If S is orthogonal, then it is linearly independent. Indeed, if x1, . . . , xn ∈ S are n distinct vectors, and α1, . . . , αn ∈ C are such that Σ_{k=1}^{n} αk xk = 0, then

0 = (xj | Σ_{k=1}^{n} αk xk) = Σ_{k=1}^{n} (xj|αk xk) = Σ_{k=1}^{n} αk (xj|xk) = αj ‖xj‖²,

which (by the condition xj ≠ 0) forces αj = 0, ∀ j = 1, . . . , n.

Proposition-Definition 3. If H is a Hilbert space and S ⊂ H is an orthogonal subset, then the following are equivalent:

(i) S is maximally orthogonal, in the sense that there is no orthogonal subset S′ ⊋ S;

(ii) \overline{span} S = H.

Such a set S is called an orthogonal basis for H.

Proof. (i) ⇒ (ii). Assume S is maximally orthogonal, and let us prove the equality (ii). Argue by contradiction, assuming the closed linear subspace X = \overline{span} S is proper, i.e. X ⊊ H. By Proposition 2, it follows that X⊥ ≠ {0}, so if we take some vector x ∈ X⊥ ∖ {0},

then x ⊥ y, ∀ y ∈ S, so the set S′ = S ∪ {x} ⊋ S is still orthogonal, thus contradicting the maximality of S.

(ii) ⇒ (i). Assume condition (ii). In order to prove the maximality of S, it suffices to prove the equality

S⊥ = {0}. (45)

(Indeed, if (45) holds, then no other set S′ ⊋ S can be orthogonal, since we would have the inclusion S′ ∖ S ⊂ S⊥, as well as the fact that S′ ∖ S consists of non-zero vectors.) But the equality (45) is trivial, since by Corollary 1 we have S⊥ = (\overline{span} S)⊥ = H⊥ = {0}.

Definitions. Given a pre-Hilbert space H, a non-empty subset S ⊂ H is said to be orthonormal, if

•S is orthogonal;

• ‖x‖ = 1, ∀ x ∈ S.

If H is a Hilbert space, we say that S is an orthonormal basis for H, if S is both an orthonormal subset and an orthogonal basis for H. Proposition 4. Assume H is a Hilbert space, which is not zero-dimensional.

(i) For every orthonormal set S ⊂ H, there exists at least one orthonormal basis B for H, which contains S.

(ii) Any two orthonormal bases for H have the same cardinality.

(iii) If H is finite dimensional, then all orthonormal bases B have card B = dim_C H.

Proof. (i). Fix S. Consider the collection B of all orthonormal subsets B ⊃ S. Remark now that, if B0 ⊂ B is a sub-collection which is totally ordered by inclusion, then the set B = ∪_{B′∈B0} B′ clearly belongs to B, and is an upper bound for B0. Therefore, by Zorn’s Lemma, the collection B possesses a maximal element B. By construction, B is orthonormal, and contains S. By Proposition-Definition 3, in order to prove that B is an orthonormal basis, it suffices to show that B is maximally orthogonal. But this is pretty obvious, for if there exists x ≠ 0, such that x ⊥ y, ∀ y ∈ B, then the vector x0 = ‖x‖^{−1} x will have the same property, so the orthonormal set B ∪ {x0} ⊋ B would contradict the maximality of B.

(iii). This statement is clear, because in the finite dimensional case all linear subspaces are automatically norm-closed, so any orthonormal basis B for H will satisfy span B = H.

(ii). Fix two orthonormal bases A and B, and let us show that card A ≤ card B. (By symmetry, the other inequality will also hold, so we will have in fact equality.) The case when B is finite is trivial by (iii). Assuming B is infinite, we consider the “rational” complex

sub-field Q + iQ ⊂ C, and the linear span X = span_{Q+iQ} B, consisting of linear combinations of vectors in B, with coefficients in Q + iQ. On the one hand, since B is infinite and Q + iQ is countable, it follows immediately that card X = card B. On the other hand, since Q + iQ is dense in C, and \overline{span_C B} = H, it follows that X is dense in H, so in particular, for every a ∈ A, the set Xa = {x ∈ X : ‖x − a‖ < 1/2} is non-empty. Let us remark now that, if a1, a2 ∈ A are distinct, then Xa1 ∩ Xa2 = ∅. Indeed, if there existed a vector x ∈ Xa1 ∩ Xa2,

then the inequalities ‖x − aj‖ < 1/2, j = 1, 2, would force ‖a1 − a2‖ < 1, which is impossible, since by the Law of Cosine we have

    ‖a₁ − a₂‖² = ‖a₁‖² + ‖a₂‖² − 2 Re(a₁|a₂) = 2.

Since all the sets X_a, a ∈ A, are non-empty and pairwise disjoint, it follows that card A ≤ card(⋃_{a∈A} X_a) ≤ card X = card B, and we are done.

Definition. Given a Hilbert space H, the cardinality of one (thus of any) orthonormal basis of H is called the Hilbertian dimension of H, and is denoted by h-dim H.

Theorem 4. Let H be a Hilbert space, and let B ⊂ H be an orthonormal subset, listed² as B = {e_i}_{i∈I}. Let X = span B (the norm closure of the linear span of B).

(i) For an I-tuple α = (α_i)_{i∈I} ∈ ∏_{i∈I} C, the following are equivalent:

• the I-tuple of vectors (α_i e_i)_{i∈I} is summable in H;

• α belongs to ℓ²(I), i.e. Σ_{i∈I} |α_i|² < ∞.

Moreover, in this case the vector x = Σ_{i∈I} α_i e_i belongs to X, with its norm given by the equality

    ‖x‖² = Σ_{i∈I} |α_i|²,  (46)

and the coefficients α_i are given by

    α_i = (e_i|x), ∀ i ∈ I.  (47)

(ii) For any vector v ∈ H, the I-tuple ((e_i|v))_{i∈I} belongs to ℓ²(I), and furthermore, the vector Σ_{i∈I} (e_i|v) e_i ∈ X coincides with the orthogonal projection of v onto X, that is, one has the equality

    P_X v = Σ_{i∈I} (e_i|v) e_i.  (48)

Proof. Before we proceed with the proof of (i), let us introduce a few notations. Define, for every finite set of indices F ∈ P_fin(I), the finite sum x_F = Σ_{i∈F} α_i e_i, and let us first remark that, since i ≠ j ⇒ (e_i|e_j) = 0, we have

    ‖x_F‖² = (x_F|x_F) = Σ_{(i,j)∈F×F} (α_i e_i|α_j e_j) = Σ_{(i,j)∈F×F} ᾱ_i α_j (e_i|e_j) = Σ_{i∈F} |α_i|², ∀ F ∈ P_fin(I).  (49)

With this notation, the condition that the I-tuple (α_i e_i)_{i∈I} is summable means that the net

(x_F)_{F∈P_fin(I)} is norm-convergent in H.

(i). Assume first that (α_i e_i)_{i∈I} is summable, and let us prove that Σ_{i∈I} |α_i|² < ∞. This is, however, pretty obvious, since the condition x_F → x forces ‖x_F‖ → ‖x‖, which by (49) gives

    ‖x‖² = lim_{F∈P_fin(I)} Σ_{i∈F} |α_i|²,

² This means that i ≠ j ⇒ e_i ≠ e_j.

thus proving that α = (α_i)_{i∈I} indeed belongs to ℓ²(I), as well as the equality (46).

Conversely, let us assume now that (α_i)_{i∈I} belongs to ℓ²(I), and let us show that the net (x_F)_{F∈P_fin(I)} is convergent. Let s = Σ_{i∈I} |α_i|², which is equal to the limit of the net (s_F)_{F∈P_fin(I)} of partial sums, defined as s_F = Σ_{i∈F} |α_i|². Since we work in a Banach space,

it suffices to show that the net (x_F)_{F∈P_fin(I)} is Cauchy. Since the net (s_F)_{F∈P_fin(I)} ⊂ [0, ∞) is convergent, hence Cauchy, it follows that, for every ε > 0, there exists G_ε ∈ P_fin(I) such that |s_{F₁} − s_{F₂}| < ε²/4, for all F₁, F₂ ∈ P_fin(I) with F₁, F₂ ⊃ G_ε. Now, if we fix two such sets F₁ and F₂, using (49) it follows that

    ‖x_{F_j} − x_{G_ε}‖² = ‖x_{F_j∖G_ε}‖² = s_{F_j∖G_ε} = s_{F_j} − s_{G_ε} < ε²/4,  j = 1, 2,

which immediately yields

    ‖x_{F₁} − x_{F₂}‖ ≤ ‖x_{F₁} − x_{G_ε}‖ + ‖x_{F₂} − x_{G_ε}‖ < ε/2 + ε/2 = ε.

The fact that the vector Σ_{i∈I} α_i e_i = lim_F x_F belongs to X is trivial, since x_F ∈ span{e_i}_{i∈F} ⊂ X, ∀ F ∈ P_fin(I). To prove (47), we use Remark 8, by which it follows that:

    (e_i|x) = lim_{F∈P_fin(I)} (e_i|x_F).  (50)

Since, for fixed i ∈ I and F ∈ P_fin(I), we have

    (e_i|x_F) = Σ_{j∈F} (e_i|α_j e_j) = Σ_{j∈F} α_j (e_i|e_j) = { 0, if i ∉ F;  α_i, if i ∈ F },

the desired equality (47) now follows immediately from (50).

(ii). Fix v ∈ H, and let us prove first that the I-tuple ((e_i|v))_{i∈I} belongs to ℓ²(I). Denote the coefficients (e_i|v) simply by α_i. Fix for the moment some finite set of indices F ∈ P_fin(I), and let us consider the vectors w_F = Σ_{i∈F} α_i e_i and y_F = v − w_F. Notice now that, by construction, for every i ∈ F, we have

    (e_i|w_F) = Σ_{j∈F} (e_i|α_j e_j) = Σ_{j∈F} α_j (e_i|e_j) = α_i,

which means that (e_i|v − w_F) = 0, ∀ i ∈ F, so in particular we also get

    (w_F|y_F) = (w_F|v − w_F) = Σ_{i∈F} (α_i e_i|v − w_F) = Σ_{i∈F} ᾱ_i (e_i|v − w_F) = 0,

so by the Law of Cosine we get

    ‖v‖² = ‖w_F + y_F‖² = ‖w_F‖² + ‖y_F‖².

Since by (i) we already know that ‖w_F‖² = Σ_{i∈F} |α_i|², the above calculation yields ‖v‖² = Σ_{i∈F} |α_i|² + ‖y_F‖², so in particular it follows that

    Σ_{i∈F} |α_i|² ≤ ‖v‖², ∀ F ∈ P_fin(I),

so the I-tuple (α_i)_{i∈I} indeed belongs to ℓ²(I).

Consider now, using (i), the vector w = Σ_{i∈I} α_i e_i ∈ X. By (47), we know that (e_i|w) = α_i = (e_i|v), which means that

    (e_i|v − w) = 0, ∀ i ∈ I.

This means, of course, that v − w belongs to B⊥ = X⊥. Now we have the decomposition v = w + (v − w) with w ∈ X and v − w ∈ X⊥, which implies the equality w = P_X v.
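As a sanity check, the identities (46), (47) and (48) can be verified numerically in a finite-dimensional Hilbert space. The sketch below is not part of the notes and all names are ad hoc; it works in C⁴ with NumPy, where the inner product (u|v), conjugate-linear in the first variable, is computed by `np.vdot(u, v)`.

```python
import numpy as np

rng = np.random.default_rng(0)

# An orthonormal set {e_0, e_1} in C^4: columns of Q from a complex QR factorization.
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
Q, _ = np.linalg.qr(A)
e = [Q[:, 0], Q[:, 1]]

# (46)/(47): for x = sum_i alpha_i e_i we get ||x||^2 = sum_i |alpha_i|^2
# and alpha_i = (e_i|x).
alpha = np.array([1 + 2j, -0.5j])
x = sum(a * ei for a, ei in zip(alpha, e))
assert np.isclose(np.linalg.norm(x) ** 2, np.sum(np.abs(alpha) ** 2))   # (46)
assert all(np.isclose(np.vdot(ei, x), a) for ei, a in zip(e, alpha))    # (47)

# (48): P_X v = sum_i (e_i|v) e_i, and the residual v - P_X v is orthogonal to X.
v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
Pv = sum(np.vdot(ei, v) * ei for ei in e)
assert all(np.isclose(np.vdot(ei, v - Pv), 0) for ei in e)
assert np.allclose(Pv, Q @ (Q.conj().T @ v))   # matrix form Q Q* of the projection
```

The last assertion compares (48) against the familiar matrix expression QQ* for the orthogonal projection onto the column space of a matrix Q with orthonormal columns.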

Corollary 3. Using the same notations as in Theorem 4, assume B = {e_i}_{i∈I} is an orthonormal basis for the Hilbert space H. The correspondence

    U : H ∋ x ↦ ((e_i|x))_{i∈I} ∈ ℓ²(I)

establishes an isometric linear isomorphism.

Proof. Using the notations from Theorem 4, in this case the space X = span B coincides with the entire space H, so the equality (48) simply reads:

    x = Σ_{i∈I} (e_i|x) e_i, ∀ x ∈ H.  (51)

By Theorem 4 we also know that the map T : ℓ²(I) ∋ (α_i)_{i∈I} ↦ Σ_{i∈I} α_i e_i ∈ H is isometric (and trivially linear), and then everything follows from (47) and (51), which imply that T and U are inverses of each other.

Exercise 5♥. Prove that, if H and H′ are pre-Hilbert spaces and U : H → H′ is a linear map, then U is isometric if and only if U preserves the inner products, i.e. (Ux|Uy) = (x|y), ∀ x, y ∈ H.

Exercise 6. Use the notations and assumptions from Theorem 4. Prove that, if α = (α_i)_{i∈I} and β = (β_i)_{i∈I} both belong to ℓ²(I), then the vectors x = Σ_{i∈I} α_i e_i and y = Σ_{i∈I} β_i e_i satisfy the equality

    (x|y) = Σ_{i∈I} ᾱ_i β_i.  (52)
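The identity (52) of Exercise 6 is easy to check numerically in a finite-dimensional model (an ad hoc sketch, not part of the notes; the orthonormal set is generated randomly):

```python
import numpy as np

rng = np.random.default_rng(1)

# An orthonormal set {e_0, e_1, e_2} in C^3: columns of a random unitary from QR.
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(A)
e = [Q[:, i] for i in range(3)]

alpha = np.array([1.0, 2j, -1 + 1j])
beta = np.array([0.5, -1j, 2.0])
x = sum(a * ei for a, ei in zip(alpha, e))
y = sum(b * ei for b, ei in zip(beta, e))

# (52): (x|y) = sum_i conj(alpha_i) * beta_i, with (x|y) = np.vdot(x, y),
# conjugate-linear in the first variable.
assert np.isclose(np.vdot(x, y), np.sum(np.conj(alpha) * beta))
```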

Conclusion. Using the notations as above, and assuming that B = {e_i}_{i∈I} is an orthonormal basis for the Hilbert space H, the equality (52) now reads:

    (x|y) = Σ_{i∈I} \overline{(e_i|x)} · (e_i|y) = Σ_{i∈I} (x|e_i) · (e_i|y), ∀ x, y ∈ H.  (53)

As a special case, one obtains:

    ‖x‖² = Σ_{i∈I} |(e_i|x)|², ∀ x ∈ H.

The formulas (53) and (51) are known as the Parseval Identities.

Exercise 7♥. Let H be a Hilbert space, let B = {e_i}_{i∈I} be an orthonormal subset, and let X = span B (the norm closure of the linear span of B). Prove the generalized Parseval Identity:

    (P_X x|y) = (x|P_X y) = (P_X x|P_X y) = Σ_{i∈I} \overline{(e_i|x)} · (e_i|y) = Σ_{i∈I} (x|e_i) · (e_i|y), ∀ x, y ∈ H.
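The generalized Parseval Identity of Exercise 7 can likewise be tested numerically for a proper orthonormal subset (so X ≠ H). The sketch below is an ad hoc illustration, not part of the notes, using the matrix form P = QQ* of the orthogonal projection onto X:

```python
import numpy as np

rng = np.random.default_rng(2)

# Orthonormal subset B = {e_0, e_1} of C^5; X = span B, P = Q Q* projects onto X.
A = rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2))
Q, _ = np.linalg.qr(A)
e = [Q[:, 0], Q[:, 1]]
P = Q @ Q.conj().T

x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

# Generalized Parseval: all three projected inner products agree with
# sum_i conj((e_i|x)) * (e_i|y).
s = sum(np.conj(np.vdot(ei, x)) * np.vdot(ei, y) for ei in e)
assert np.isclose(np.vdot(P @ x, y), s)        # (P_X x | y)
assert np.isclose(np.vdot(x, P @ y), s)        # (x | P_X y)
assert np.isclose(np.vdot(P @ x, P @ y), s)    # (P_X x | P_X y)
```

When B is an orthonormal basis, P is the identity and the three assertions collapse to the Parseval Identity (53).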
