Hilbert Spaces

1 Introduction

Hilbert spaces are the mathematical structures that underpin modern quantum mechanics. Here we go through the essential concepts needed before learning how to apply them to the study of nature. Familiarity with mathematical symbols, calculus and linear algebra is assumed. Some of the notation we will use is:

⇒ leads to
∀ for all
∈ belongs to
−→ maps
⊆ subset of
⇔ equivalent
→ goes to
| with the condition

2 Linear spaces

We want to consider the topic of linear spaces. In linear algebra, we have the following composition rules for vectors. Note that the Latin letters u, v, w, ... denote vectors while the Greek letters λ, µ, ... denote scalars.

(L1) u + v = v + u (commutative law)
     u + (v + w) = (u + v) + w (associative law)
     u + 0 = u (0 is the null element)
     u + (−1)u = 0 (−u is the opposite element)

(L2) λ(µu) = (λµ)u
     1 · u = u
     0 · u = 0
     λ · 0 = 0 (0 is the null element)

(L3) (λ + µ)u = λu + µu (distributive law)
     λ(u + v) = λu + λv (distributive law)

Remark: The same rules apply to other entities, such as n-tuples in R^n and m × n matrices. For n-tuples, addition is defined as

(x1, x2, ..., xn) + (y1, y2, ..., yn) = (x1 + y1, x2 + y2, ..., xn + yn)   (1)

and multiplication with a scalar is defined as

λ(x1, x2, ..., xn) = (λx1, λx2, ..., λxn)   (2)

The rules (L1)-(L3) also apply to functions, given appropriate definitions of the operations 'addition' and 'multiplication with scalar'. Such spaces are called function spaces.

We now define the term linear space, a space in which (L1)-(L3) are axioms and further properties are derived from these axioms.

Definition: A linear space over R is a set H on which the following operations are defined:

addition: u ∈ H, v ∈ H ⇒ u + v ∈ H,
multiplication with scalar: λ ∈ R, u ∈ H ⇒ λu ∈ H,

such that (L1)-(L3) apply. The space can also be over C instead of R (we then talk about a linear space over C). The elements of H are called vectors. Instead of linear space, the term vector space may also be used.

Example: For the complex n-tuples

C^n = {(z1, z2, ..., zn) | zk ∈ C, k = 1, ..., n},

addition and multiplication with a scalar are defined as in (1) and (2). It can be shown that (L1)-(L3) apply to this set, so C^n is a linear space over C.

Example: A subset Ω of R can be a bounded or unbounded interval of R. Given end points a and b, some possible subsets Ω are:

(a, b) = {x|a < x < b} , (a, b] = {x|a < x ≤ b} , [a, ∞) = {x|a ≤ x}

Let Φ = Φ(Ω) = {all functions f : Ω −→ R}¹ be the set of all real functions on Ω. For two functions f, g ∈ Φ(Ω) and λ ∈ R, the functions f + g and λf are defined as

f + g : x ↦ f(x) + g(x), x ∈ Ω,

λf : x ↦ λf(x), x ∈ Ω.

Using these operations, it can be shown that (L1)-(L3) apply to Φ, and it is therefore a linear space over R. (Here, 0 represents the null function.) This is an example of a function space. We simply regard the functions as vectors in the linear space Φ(Ω). If one replaces R with C, it can be shown that the set of all complex functions on Ω is a linear space over C.

Definition: U ⊆ H is a linear subspace of H if

u ∈ U, v ∈ U ⇒ u + v ∈ U, and
λ ∈ R (or C), u ∈ U ⇒ λu ∈ U.

Because U is a subspace of H, (L1)-(L3) still apply.

Example: If H is the set containing all geometrical vectors in 3-dimensional space and U is the set containing all vectors parallel to a given plane π, then U is a subspace of H. It can be shown that if u and v are two vectors in the plane π, both the vector u + v and all vectors λu, with λ ∈ R, are also in π.

¹ The notation should be read as: the set of all functions f that map points from the interval Ω to R.

Example: The set

U = {(z1, ..., zn) ∈ C^n | ∑_{k=1}^n zk = 0}

is a subspace of C^n, because

z′ ∈ U, z″ ∈ U ⇒ ∑_{k=1}^n (z′k + z″k) = ∑_{k=1}^n z′k + ∑_{k=1}^n z″k = 0,

λ ∈ C, z ∈ U ⇒ ∑_{k=1}^n λzk = λ ∑_{k=1}^n zk = 0.

However, the set

U = {(z1, ..., zn) ∈ C^n | ∑_{k=1}^n zk = 1}

is not a subspace of C^n. This is because the zero vector 0 = (0, ..., 0) ∉ U. All subspaces must contain the zero vector, because 0 · z = 0 for all z.

Example: The set

U = {(x1, x2) ∈ R^2 | x1 x2 = 0}

is not a subspace of R^2. We can easily prove this with an example:

x′ = (0, 1) ∈ U, x″ = (1, 0) ∈ U ⇒ x′ + x″ = (1, 1) ∉ U.

Example: The set Πn(R) containing all possible polynomials

∑_{k=0}^n ak x^k

with degree ≤ n and real coefficients ak can be shown to be a subspace of Φ(R). If p and q are polynomials and λ ∈ R, both p + q and λp are polynomials as well. This also holds for polynomials with complex coefficients, only in this case Πn(C) is a subspace of Φ(C).

Definition: Consider a linear space H. The vectors u1, u2, ..., un ∈ H are said to be linearly independent if

∑_{j=1}^n λj uj = 0 ⇒ λj = 0, ∀j

If any vector v ∈ H can be expressed as a linear combination of the vectors uj, the set u1, ..., un is said to be a basis in H.

Remark: This is certainly true in a finite-dimensional space. To show this for an infinite-dimensional space, more care is required.

All bases in a linear space H have the same number of elements. The number of elements in a basis of H is the dimension of H. It is not possible to have a finite basis if the space is infinite-dimensional.

Example: The dimension of n-dimensional real space is

dim R^n = n.

The dimension of n-dimensional complex space is

dim C^n = n.

The dimension of the set Πn of polynomials of degree ≤ n is

dim Πn = n + 1,

because the basis of Πn is [1, x, x^2, ..., x^n]. The set Πn is a subspace of the set of all polynomials (of any degree), which is denoted Π.

We will now define the important function spaces C(Ω) and C^k(Ω). Ω is a connected domain in R^n that may be open (the boundary points are not part of the domain) or closed (all boundary points are part of the domain).

• C(Ω): functions that are continuous in Ω. This is a linear space and a subspace of Φ(Ω).
• C^k(Ω): functions whose derivatives of order ≤ k are continuous in Ω. This is a linear space and a subspace of Φ(Ω).

3 Scalar product and norm

Given two elements u, v in a linear space, we denote their scalar product by (u|v). In the ordinary R^n case, the scalar product can be written

(u|v) = ∑_{k=1}^n uk vk.   (3)

The length of a vector can be expressed with the help of the scalar product:

‖u‖ = (u|u)^{1/2} = (∑_{k=1}^n uk^2)^{1/2}   (4)

The length of a vector is usually called the norm. The norm satisfies ‖u‖ ≥ 0 for all u ∈ H. In the complex case, we must modify the definition of the scalar product. The norm should still be a real, positive number and be defined by the scalar product. Equation (4) will usually not give a real number if the numbers uk are complex. If the scalar product is defined as

(u|v) = ∑_{k=1}^n uk^* vk,   (5)

then the norm can be written

‖u‖ = (u|u)^{1/2} = (∑_{k=1}^n |uk|^2)^{1/2}.

This is a real number ≥ 0. The following rules hold for the scalar product (u|v), as defined by (3) and (5):

(S1) (u|λ1v1 + λ2v2) = λ1(u|v1) + λ2(u|v2)

(S2) (u|v) = (v|u)∗

(S3) (u|u) ≥ 0 (equality for u = 0)

(S4) (λ1u1 + λ2u2|v) = λ1^*(u1|v) + λ2^*(u2|v)

Note that (S4) follows from (S1) and (S2).

Definition: A scalar product on a linear space H is a rule which associates two elements u, v ∈ H with a scalar, (u|v), so that the rules (S1)-(S4) apply. A linear space with a scalar product is called a pre-Hilbert space.

Example: In the continuous case C(Ω), the scalar product is defined by

(u|v) = ∫_Ω u^*(x) v(x) dx.

It is easily verified that rules (S1)-(S4) apply.

Example: Now for the general case. Given a positive function w > 0, w ∈ C(Ω),

(u|v)_w = ∫_Ω u^*(x) v(x) w(x) dx   (6)

is a scalar product on C(Ω). When Ω is a finite domain in R, this also holds for piecewise continuous functions with finite-value discontinuities. The function w is called a weight function. Different weight functions define different pre-Hilbert spaces.

Definition: For a pre-Hilbert space H,

i) u, v are orthogonal if (u|v) = 0. This is written u ⊥ v.

ii) the norm of u is defined as ‖u‖ = (u|u)^{1/2}.

For a scalar product with a weight function according to (6), orthogonality is given by

u ⊥ v ⇒ ∫_Ω u^*(x) v(x) w(x) dx = 0

and the norm of a vector u is given by

‖u‖ = (∫_Ω |u(x)|^2 w(x) dx)^{1/2}.

We will now look at a few important properties of norms. Pythagoras' theorem,

u ⊥ v ⇒ ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2,

is valid in all pre-Hilbert spaces. In linear algebra we have the expression

(u|v) = ‖u‖ · ‖v‖ cos θ ⇒ |(u|v)| ≤ ‖u‖ · ‖v‖,

where equality means u and v are parallel. This expression is known as Cauchy's inequality or the Cauchy-Schwarz inequality. For the special case of the scalar product defined by (5), Cauchy's inequality reads

|∑_{k=1}^n uk^* vk| ≤ (∑_{k=1}^n |uk|^2)^{1/2} (∑_{k=1}^n |vk|^2)^{1/2}.

For the special case of the scalar product defined by (6), Cauchy's inequality reads

|∫_Ω u^*(x) v(x) w(x) dx| ≤ (∫_Ω |u(x)|^2 w(x) dx)^{1/2} (∫_Ω |v(x)|^2 w(x) dx)^{1/2}.
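As a quick numerical illustration (our own sketch, not part of the notes; it assumes numpy is available), one can check Cauchy's inequality for a few random complex vectors under the scalar product (5):

    import numpy as np

    rng = np.random.default_rng(1)

    def inner(u, v):
        # Scalar product (5): sum over conj(u_k) * v_k
        return np.sum(np.conj(u) * v)

    for _ in range(3):
        u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
        v = rng.standard_normal(5) + 1j * rng.standard_normal(5)
        lhs = abs(inner(u, v))
        rhs = np.sqrt(inner(u, u).real) * np.sqrt(inner(v, v).real)
        print(lhs, "<=", rhs)   # the inequality holds for every pair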

For ‖u‖ = (u|u)^{1/2} the following rules hold:

(i) ‖u‖ ≥ 0 (‖u‖ = 0 only if u = 0)

(ii) ‖λu‖ = |λ| · ‖u‖

(iii) ‖u + v‖ ≤ ‖u‖ + ‖v‖ (triangle inequality)

The space of continuous functions C(Ω) with scalar product

(u|v)_w = ∫_Ω u^*(x) v(x) w(x) dx   (7)

and norm

‖u‖ = (∫_Ω |u(x)|^2 w(x) dx)^{1/2}   (8)

is a pre-Hilbert space. In general, for a function space to be a pre-Hilbert space the functions need not be continuous, but the integrals in (7) and (8) must exist and be finite. We will now introduce the function space L2. Let w(x) > 0 for x ∈ Ω. With L2 = L2(w, Ω) we refer to the set of functions on Ω such that

‖u‖_{L2(w)} = (∫_Ω |u(x)|^2 w(x) dx)^{1/2}

exists and is finite. The special case w = 1 gives L2 ≡ L2(Ω). The existence and finiteness of the integral is a rather subtle question. This is left unexplored by us and we will assume, unless explicitly stated, that this is always the case for the functions we consider.

It is not immediately obvious that L2(w, Ω) is a pre-Hilbert space. One can show that L2(w, Ω) has all the properties required to be a linear space (see Sparr, page 257). However, complications arise when trying to show that it is also a pre-Hilbert space. Consider the generalized scalar product in (6). There are some functions such that the scalar product

(u|u)_w = ∫_Ω |u|^2 w(x) dx = 0,

even though u is not the zero function. This does not comply with rule (S3). One such function is

u(x) = { 1 if x = x0,
       { 0 if x ≠ x0.

The work-around for this problem is to identify these functions with the zero function. With this extension of the definition of the zero function, L2(w, Ω) becomes a pre-Hilbert space.

Example:

1 ∈ L2([0, 1])          1 ∉ L2(R)
x^{−1/3} ∈ L2([0, 1])   x^{−1/3} ∉ L2([1, ∞))
e^{−x} ∈ L2([0, ∞))     e^{−x} ∉ L2(R)

Check that these are true by considering ordinary integrals and seeing whether they diverge. Notice that here L2(Ω) means L2(1, Ω).
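The divergence check can also be carried out numerically. The following sketch (ours, assuming scipy is available) integrates |u(x)|^2 over each interval; a sequence of growing finite cutoffs R stands in for an infinite endpoint, and quad may warn about the integrable endpoint singularity:

    import numpy as np
    from scipy.integrate import quad

    def norm_sq(u, a, b):
        # Squared L2 norm: integral of |u(x)|^2 over [a, b]
        val, _ = quad(lambda x: abs(u(x)) ** 2, a, b)
        return val

    print(norm_sq(lambda x: 1.0, 0.0, 1.0))              # = 1: 1 is in L2([0, 1])
    print(norm_sq(lambda x: x ** (-1.0 / 3), 0.0, 1.0))  # = 3: the singularity is integrable
    for R in (1e2, 1e3, 1e4):
        # grows like 3 R^(1/3): x^(-1/3) is not in L2([1, inf))
        print(R, norm_sq(lambda x: x ** (-1.0 / 3), 1.0, R))
    print(norm_sq(lambda x: np.exp(-x), 0.0, np.inf))    # = 1/2: e^(-x) is in L2([0, inf))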

4 Projections

We define the projection of u on v as

P_[v] u = ((v|u)/‖v‖^2) v.

In the case of u and v being geometrical vectors, the projection of u on v is the component of u along the direction of v. The concept of projection is now exemplified in the context of Fourier series, together with the notions of orthogonality and norm.

Example: Consider L2([−π, π]).

(e^{ikx}|e^{inx}) = ∫_{−π}^{π} e^{−ikx} e^{inx} dx = ∫_{−π}^{π} e^{i(n−k)x} dx = [e^{i(n−k)x}/(i(n−k))]_{−π}^{π} = 0 if n ≠ k

⇒ (e^{ikx}|e^{inx}) = { 2π if n = k
                      { 0  if n ≠ k

In other words, e^{ikx} and e^{inx} are orthogonal functions if n ≠ k. Now, let u be a given function in L2([−π, π]). We find that

(e^{ikx}|u) = ∫_{−π}^{π} e^{−ikx} u(x) dx = 2π ck(u),

where ck(u) is a Fourier coefficient of u. The projection of u on the subspace spanned by the function e^{ikx} is

P_[e^{ikx}] u = ((e^{ikx}|u)/‖e^{ikx}‖^2) e^{ikx} = ck(u) e^{ikx}.

This is a term in the Fourier series of u. The complete Fourier series is written

u ∼ ∑_{k=−∞}^{∞} ck(u) e^{ikx} = ∑_{k=−∞}^{∞} ((e^{ikx}|u)/‖e^{ikx}‖^2) e^{ikx}.

(Here the symbol ∼ is used instead of the equal sign = since we have not yet discussed the criteria for convergence, an issue that we do not consider here. Essentially, the equal sign can be used if we limit ourselves to continuous functions with continuous derivatives, see Sparr.) Evidently, the terms of the Fourier series of u can be interpreted as the projections of u on the orthogonal functions e^{ikx}. We have found a geometrical interpretation of Fourier series.
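The projection picture can be made concrete numerically. In this sketch (ours; the sample function u(x) = x is an arbitrary choice), the coefficients ck(u) are computed by quadrature and the projections are summed into a partial Fourier series:

    import numpy as np
    from scipy.integrate import quad

    def fourier_coeff(u, k):
        # c_k(u) = (e^{ikx} | u) / ||e^{ikx}||^2 on [-pi, pi];
        # conj(e^{ikx}) = cos(kx) - i sin(kx)
        re, _ = quad(lambda x: np.cos(k * x) * u(x), -np.pi, np.pi)
        im, _ = quad(lambda x: -np.sin(k * x) * u(x), -np.pi, np.pi)
        return (re + 1j * im) / (2 * np.pi)

    u = lambda x: x                        # a sample function in L2([-pi, pi])
    x = np.linspace(-np.pi / 2, np.pi / 2, 5)
    partial = sum(fourier_coeff(u, k) * np.exp(1j * k * x) for k in range(-30, 31))
    print(np.real(partial))                # close to u(x) = x away from the endpoints
    print(u(x))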

Example: The previous example can be used to show that the functions

{cos(kx)}_{k=0}^{∞} ∪ {sin(kx)}_{k=1}^{∞}

are pairwise orthogonal (verify this by expanding cos(kx) and sin(kx) in the exponential functions e^{ikx} and e^{−ikx}). The L2-norms of these functions are

‖1(x)‖ = (∫_{−π}^{π} |1|^2 dx)^{1/2} = √(2π),

‖cos(kx)‖ = (∫_{−π}^{π} cos^2(kx) dx)^{1/2} = √π, k = 1, 2, ...,

‖sin(kx)‖ = (∫_{−π}^{π} sin^2(kx) dx)^{1/2} = √π, k = 1, 2, ...

The norm of 1(x) differs from the other norms. For an arbitrary function u, it holds that

(cos(kx)|u) = ∫_{−π}^{π} cos(kx) u(x) dx = π ak(u), k = 0, 1, ...,

(sin(kx)|u) = ∫_{−π}^{π} sin(kx) u(x) dx = π bk(u), k = 1, 2, ...,

where ak(u) and bk(u) are the trigonometric Fourier coefficients of u. The projections of u on the subspaces [cos(kx)] and [sin(kx)] are

P_[1(x)] u = P_[cos(0x)] u = ((1(x)|u)/‖1(x)‖^2) 1(x) = a0(u)/2,

P_[cos(kx)] u = ((cos(kx)|u)/‖cos(kx)‖^2) cos(kx) = ak(u) cos(kx), k = 1, 2, ...,

P_[sin(kx)] u = ((sin(kx)|u)/‖sin(kx)‖^2) sin(kx) = bk(u) sin(kx), k = 1, 2, ...

We recognize these projections as the terms in the trigonometric Fourier series. We can now arrange things according to Fourier theory:

u ∼ a0(u)/2 + ∑_{k=1}^{∞} (ak(u) cos(kx) + bk(u) sin(kx)) = ∑_{k=0}^{∞} P_[cos(kx)] u + ∑_{k=1}^{∞} P_[sin(kx)] u,

and we note that, from linear algebra, if ϕ1, ..., ϕn is an orthogonal basis in R^n, every u ∈ R^n can be written

u = ∑_{k=1}^{n} ((ϕk|u)/‖ϕk‖^2) ϕk.

5 Gram-Schmidt's orthogonalization method

It is advantageous to use orthogonal bases in pre-Hilbert spaces. Gram-Schmidt's orthogonalization method constructs an orthogonal basis from a non-orthogonal basis.

Theorem: From a set of linearly independent vectors u1, ..., un, one can always construct a set of orthogonal vectors ϕ1, ..., ϕn that are linear combinations of

8 u1, ..., un. These vectors are unambiguously defined except for a proportionality factor (i.e. an arbitrary multiplicative scalar).

The general proof of the theorem is based on the induction method, and one finds that

ϕn(x) = un(x) − ∑_{m=1}^{n−1} ((ϕm|un)/ρm) ϕm(x),   ρm = ‖ϕm‖^2.

Here, we just consider a concrete example to show how to proceed in practice. Consider L2([−1, 1]) and the polynomials uk(x) = x^k, k = 0, 1, ... (these are called monomials). The orthogonal functions ϕk are calculated in the following manner:

ϕ0 = u0 = 1,

ρ0 = (ϕ0|ϕ0) = ∫_{−1}^{1} dx = 2,

ϕ1 = u1 − P_[ϕ0] u1 = u1 − (1/ρ0)(ϕ0|u1) ϕ0 = x − (1/2)(∫_{−1}^{1} x dx) · 1 = x,

ρ1 = (ϕ1|ϕ1) = ∫_{−1}^{1} x^2 dx = 2/3,

ϕ2 = u2 − P_[ϕ0,ϕ1] u2 = u2 − (1/ρ0)(ϕ0|u2) ϕ0 − (1/ρ1)(ϕ1|u2) ϕ1
   = x^2 − (1/2)(∫_{−1}^{1} x^2 dx) · 1 − (3/2)(∫_{−1}^{1} x^3 dx) · x = x^2 − 1/3,

ρ2 = (ϕ2|ϕ2) = ∫_{−1}^{1} (x^2 − 1/3)^2 dx = 8/45,

etc.
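The same computation can be carried out numerically. In this sketch (ours), the scalar products are evaluated by quadrature and the monomials 1, x, x^2, x^3 are orthogonalized; up to rounding, the output reproduces ϕ0 = 1, ϕ1 = x, ϕ2 = x^2 − 1/3 and ϕ3 = x^3 − (3/5)x:

    import numpy as np
    from numpy.polynomial import Polynomial
    from scipy.integrate import quad

    def inner(p, q, a=-1.0, b=1.0):
        # (p|q) = integral of p(x) q(x) over [a, b] (real coefficients, w = 1)
        val, _ = quad(lambda x: p(x) * q(x), a, b)
        return val

    monomials = [Polynomial([0] * k + [1]) for k in range(4)]   # 1, x, x^2, x^3
    phi = []
    for u in monomials:
        v = u
        for f in phi:
            v = v - (inner(f, u) / inner(f, f)) * f   # subtract the projection on f
        phi.append(v)

    for f in phi:
        print(f.coef)   # approximately [1], [0, 1], [-1/3, 0, 1], [0, -3/5, 0, 1]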

In the previous example, we orthogonalized monomials according to the scalar product

(u|v) = ∫_Ω u^*(x) v(x) dx.

It is possible to obtain different orthogonal polynomials by using Gram-Schmidt's method to orthogonalize the monomials in different pre-Hilbert spaces L2(w, I). There is a group of very important orthogonal polynomials which recur naturally in different applications. They are available as standard functions in programs such as Fortran, Mathematica, Matlab and Maple. We will only mention a few of these especially important orthogonal polynomials here.

Legendre polynomials: Let I = [−1, 1] and w(x) = 1. These are the polynomials that appeared in the example above. They are usually "normalized" by the condition Pn(1) = 1. The first five are:

P0(x) = 1

P1(x) = x

P2(x) = (1/2)(3x^2 − 1)

P3(x) = (1/2)(5x^3 − 3x)

P4(x) = (35/8)x^4 − (15/4)x^2 + 3/8

Chebyshev polynomials: Let I = [−1, 1] and w(x) = 1/√(1 − x^2). The obtained polynomials can be described by the formula

Tn(x) = cos(n · arccos(x)), which can be shown to be a polynomial using trigonometric formulas. The first five Chebyshev polynomials are

T0(x) = 1

T1(x) = x

T2(x) = 2x^2 − 1

T3(x) = 4x^3 − 3x

T4(x) = 8x^4 − 8x^2 + 1

Chebyshev polynomials can be used to describe the quantum time evolution of a wavefunction according to the time-dependent Schrödinger equation.
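The closed form Tn(x) = cos(n · arccos(x)) can be checked numerically against the standard Chebyshev three-term recurrence Tn+1 = 2x Tn − Tn−1 (a well-known fact, not stated in the notes). A small sketch of ours:

    import numpy as np

    def chebyshev(n, x):
        # T_n via the recurrence T_{n+1} = 2x T_n - T_{n-1}
        t_prev, t = np.ones_like(x), x
        if n == 0:
            return t_prev
        for _ in range(n - 1):
            t_prev, t = t, 2 * x * t - t_prev
        return t

    x = np.linspace(-1, 1, 7)
    for n in range(5):
        # agrees with the closed form cos(n * arccos(x)) on [-1, 1]
        assert np.allclose(chebyshev(n, x), np.cos(n * np.arccos(x)))
    print("T_n(x) = cos(n arccos x) verified for n = 0..4")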

Laguerre polynomials: Let I = [0, ∞) and w(x) = e^{−x}. The obtained polynomials can be described by the formula

Ln(x) = (e^x/n!) (d^n/dx^n)(e^{−x} x^n).

The first four are

L0(x) = 1

L1(x) = −x + 1

L2(x) = (1/2)(x^2 − 4x + 2)

L3(x) = (1/6)(−x^3 + 9x^2 − 18x + 6)

The Laguerre polynomials, which satisfy the differential equation

x y″ + (α + 1 − x) y′ + n y = 0

with α = 0, appear in the radial part of the solution to the Schrödinger equation for a 1-electron atom. The case α ≠ 0 corresponds to the so-called generalized Laguerre polynomials.

Hermite polynomials: Let I = R and w(x) = e^{−x^2}. The obtained polynomials can be described by the formula

Hn(x) = (−1)^n e^{x^2} (d^n/dx^n)(e^{−x^2}).

The first four Hermite polynomials are

H0(x) = 1

H1(x) = 2x

H2(x) = 4x^2 − 2

H3(x) = 8x^3 − 12x

The series of Hermite polynomials can also be calculated by recursion:

Hn+1(x) = 2x Hn(x) − 2n Hn−1(x).

Hermite polynomials have the symmetry condition:

Hn(−x) = (−1)^n Hn(x)

The normalization is

∫_{−∞}^{∞} Hn(x) Hm(x) e^{−x^2} dx = δn,m 2^n n! √π.

One can show that the modified function

Ψn(x) = (2^n n! √π)^{−1/2} e^{−x^2/2} Hn(x)

satisfies

∫_{−∞}^{∞} Ψn(x) Ψm(x) dx = (1/(2^n n! √π)) ∫_{−∞}^{∞} Hn(x) Hm(x) e^{−x^2} dx = δm,n.

These functions are important because they provide the solution to

Ψn″(x) + (2n + 1 − x^2) Ψn(x) = 0.

This is the Schrödinger equation for the quantum harmonic oscillator.
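The normalization integral above can be verified numerically. This sketch (ours) uses Gauss-Hermite quadrature, which approximates integrals of the form ∫ f(x) e^{−x^2} dx and is exact when f is a polynomial of modest degree:

    import numpy as np
    from numpy.polynomial.hermite import hermgauss, hermval
    from math import factorial, sqrt, pi

    # Gauss-Hermite rule: sum(w_i f(x_i)) ~ integral of f(x) exp(-x^2) dx
    x, w = hermgauss(50)

    def H(n, x):
        # physicists' Hermite polynomial H_n, evaluated via numpy
        c = np.zeros(n + 1); c[n] = 1.0
        return hermval(x, c)

    for n in range(4):
        for m in range(4):
            integral = np.sum(w * H(n, x) * H(m, x))
            expected = (2 ** n) * factorial(n) * sqrt(pi) if n == m else 0.0
            assert abs(integral - expected) < 1e-8
    print("normalization integral verified for n, m = 0..3")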

6 Convergence in norm

A norm in a linear space permits a natural convergence concept:

un → u when n → ∞ ⇔ ‖un − u‖ → 0 when n → ∞   (9)

Let us study convergence in linear spaces. Consider the set {ϕk}_{k=1}^{∞} of pairwise orthogonal vectors in H. We wish to determine if we can write, for a generic u ∈ H,

u = ∑_{k=1}^{∞} ck ϕk,

and if yes, how to choose the coefficients ck. The convergence criterion in (9) states

‖u − ∑_{k=1}^{N} ck ϕk‖ → 0 when N → ∞.

It can be shown that the optimal choice of coefficients is

ck = (1/ρk)(ϕk|u),   ρk = (ϕk|ϕk).

Definition: Suppose {ϕk}_{k=1}^{∞} is a sequence of pairwise orthogonal vectors in H. The quantities

ck(u) = (1/ρk)(ϕk|u),   ρk = (ϕk|ϕk)

are called the Fourier coefficients with respect to {ϕk}_{k=1}^{∞}. The series

u ∼ ∑_{k=1}^{∞} ck(u) ϕk   (10)

is called the Fourier series of u, or the orthogonal expansion with respect to {ϕk}_{k=1}^{∞}.

Orthogonality does not guarantee that the series (10) converges for all u. It is also required that {ϕk}_{k=1}^{∞} contains "enough" functions to span all of H, as presented in the next definition:

Definition: {ϕk}_{k=1}^{∞} is an orthogonal basis in H if the vectors are pairwise orthogonal and if any u ∈ H can be developed in a Fourier series:

u = ∑_{k=1}^{∞} (1/ρk)(ϕk|u) ϕk,   ρk = (ϕk|ϕk)

One also says that {ϕk}_{k=1}^{∞} is a complete orthogonal system. The basis is orthonormal when ρk = 1, ∀k.

For our purposes, the spaces L2(w, I) have bases. The systems of polynomials and trigonometric functions previously mentioned make up bases in L2.
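As a concrete instance of such an orthogonal expansion (our sketch, with u(x) = e^x as an arbitrary sample), the Fourier coefficients with respect to the Legendre polynomials are ck = (Pk|u)/ρk, where ρk = 2/(2k + 1) on [−1, 1]:

    import numpy as np
    from scipy.integrate import quad
    from scipy.special import eval_legendre

    def legendre_coeff(u, k):
        # c_k = (P_k | u) / rho_k, with rho_k = 2 / (2k + 1) on [-1, 1]
        num, _ = quad(lambda x: eval_legendre(k, x) * u(x), -1, 1)
        return num * (2 * k + 1) / 2

    u = np.exp
    c = [legendre_coeff(u, k) for k in range(6)]
    x = np.linspace(-1, 1, 5)
    approx = sum(ck * eval_legendre(k, x) for k, ck in enumerate(c))
    print(np.max(np.abs(approx - u(x))))   # small truncation error with 6 terms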

A Hilbert space is a pre-Hilbert space which is also complete. The space L2 is a pre-Hilbert space that can also be shown to be complete (the Riesz-Fischer theorem). Therefore, L2(w, I) is a Hilbert space.

Completeness is not a trivial property; consider for example the space of continuous functions C0 and a sequence within it defined as

fk(x) = { 1            if 1/k < x ≤ 1,
        { (kx + 1)/2   if −1/k ≤ x ≤ 1/k,
        { 0            if −1 ≤ x < −1/k,

with k = 2, 3, ... If, for simplicity, we consider the weight function w(x) = 1, we see that two functions fk(x), fl(x) are such that

lim_{k,l→∞} ∫_{−1}^{1} |fk(x) − fl(x)|^2 dx = 0.

This means that the distance between fk and fl tends to 0 as k, l → ∞. Thus the sequence has a limit, i.e. fk tends to some function f as k → ∞. However, this limit does not belong to C0, since

f(x) = { 0 if −1 ≤ x < 0,
       { 1 if 0 < x ≤ 1

is not a continuous function. Thus C0 is not "big" enough to be complete (this issue does not exist for finite-dimensional spaces; once given a scalar product, they are Hilbert spaces). To have completeness, one must enlarge the space of admissible functions, e.g. to allow discontinuities. To handle this "enlarged" space, one needs to use a more general concept of integration (the so-called Lebesgue integration), which is not further discussed here.
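The shrinking distance between fk and fl can be observed numerically (our sketch; note that fk can be written compactly by clamping (kx + 1)/2 to [0, 1]):

    import numpy as np
    from scipy.integrate import quad

    def f(k):
        # the continuous ramp f_k from the text
        return lambda x: np.clip((k * x + 1) / 2, 0.0, 1.0)

    for k, l in [(2, 4), (10, 20), (50, 100)]:
        dist_sq, _ = quad(lambda x: (f(k)(x) - f(l)(x)) ** 2, -1, 1)
        print(k, l, dist_sq)   # tends to 0 as k, l grow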

Examples of bases in L2(w):

• The sequence of vectors 1, x, x^2, ... is a basis of L2([a, b]) (this is supported by the Weierstrass theorem, see Sparr, page 280).

• On the interval [−1, 1], some orthogonal bases are the Legendre polynomials, {sin(kπx)}_{k=1}^{∞} ∪ {cos(kπx)}_{k=0}^{∞} and {e^{ikπx}}_{k=−∞}^{∞}.

• More generally, the Chebyshev, Laguerre and Hermite polynomials can be used as orthogonal bases in order to expand functions.

7 Operators in Hilbert space

An operator maps elements of one vector space to another vector space:

A : H1 → H2.

From linearity it follows that

A(λu + µv) = λAu + µAv.

An important case is when H1 = H2 = H, i.e. we have an operator on H.

Example: The n × n matrix A ∈ R^{n×n} defines an operator C^n → C^n so that

y = Ax.

Example: (Integral operators.) Consider a compact interval I in R and a kernel function K(x, y) that is continuous on I × I. The operator

v(x) = ∫_I K(x, y) u(y) dy

is reminiscent of the matrix product vi = ∑_j Aij uj, only with an integral instead of a sum. The function K(x, y) is called the kernel of the operator.
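The matrix analogy becomes literal once the integral is discretized. In this sketch (ours, with an arbitrarily chosen kernel), the trapezoidal rule turns K into an ordinary matrix acting on samples of u:

    import numpy as np

    n = 200
    x = np.linspace(0.0, 1.0, n)                     # grid on I = [0, 1]
    hw = np.full(n, x[1] - x[0]); hw[[0, -1]] /= 2   # trapezoidal weights

    K = np.exp(-np.abs(x[:, None] - x[None, :]))     # sample kernel K(x, y)
    A = K * hw[None, :]                              # matrix approximation of the operator

    u = np.sin(np.pi * x)
    v = A @ u                                        # v(x) ~ integral of K(x, y) u(y) dy
    print(v[:3])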

Let's consider a pre-Hilbert space H. An operator A is said to be bounded if

‖Au‖ ≤ c‖u‖, ∀u ∈ H.

That is, for all u, the norm of Au is bounded by the same number c ∈ R times the norm of u. Bounded operators are continuous in the sense that

xn → x ⇒ Axn → Ax,

which follows from:

‖Axn − Ax‖ = ‖A(xn − x)‖ ≤ c‖xn − x‖.

What about differential operators? Even in simple cases, we encounter problems with unbounded operators.

Example: Consider the differential operator D : u ↦ u′, an operator of the type

D : C1 → C0.

Consider u(x) = e^{ikx}. We now have Du(x) = ik e^{ikx} = ik u(x). Independent of which norm is used,

‖Du‖ = |k| ‖u‖.

Since k can be any number, D is not bounded.
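Numerically, the unboundedness shows up as a ratio ‖Du‖/‖u‖ that grows without bound. A small sketch of ours, using a crude discrete approximation of the L2 norm on [0, 2π]:

    import numpy as np

    x = np.linspace(0.0, 2.0 * np.pi, 4001)
    dx = x[1] - x[0]

    def l2_norm(f):
        # discrete approximation of the L2 norm on [0, 2*pi]
        return np.sqrt(np.sum(np.abs(f) ** 2) * dx)

    for k in (1, 10, 100):
        u = np.exp(1j * k * x)
        du = 1j * k * u                      # exact derivative of e^{ikx}
        print(k, l2_norm(du) / l2_norm(u))   # ratio grows like k: no single c works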

8 Symmetric operators

Consider a square matrix A. If

Au = λu, u ≠ 0, then λ is an eigenvalue and u an eigenvector of A. If A has a basis of eigenvectors e1, ..., en, then

u = ∑_{k=1}^n uk ek ⇒ v = Au = ∑_{k=1}^n uk A ek = ∑_{k=1}^n λk uk ek

explains the action on a generic u. In matrix form, vk = λk uk, i.e. v = diag(λ1, ..., λn) u. This shows that in a basis of eigenvectors, A is represented by a diagonal matrix. If A is real and symmetric,

A^T = A,   A ∈ R^{n×n},

then according to the spectral theorem all eigenvalues λk are real and A has an orthonormal basis of eigenvectors. In the orthonormal basis {ek}_{k=1}^n of A,

uk = (ek|u),   uk ek = ((ek|u)/‖ek‖^2) ek = P_[ek] u,

14 n X =⇒ Au = λkP[ek]u, k=1 where the operator P[ek] is the orthogonal projection on the eigenvector ek and λk ∈ R is the corresponding eigenvalue, k = 1, 2, .... Since this is valid for any u, another way to express the spectral theorem is

A ≡ ∑_{k=1}^n λk P_[ek].

A real, symmetric matrix A is positive definite or positive semi-definite if its quadratic form satisfies

u^T A u > 0 or u^T A u ≥ 0, ∀u ≠ 0.

This is equivalent to

λk > 0 or λk ≥ 0, ∀k.

This can be generalized to complex matrices with a few changes. We define the operation Ã as simultaneous conjugation and transposition:

Ã = (A^*)^T.

We want to show that Ã = A when

(Au|v) = (u|Av), ∀u, v ∈ R^n or C^n.   (11)

We show this by analysing the two sides of (11) separately.

Left side = (Au|v) = ∑_{k=1}^n (Au)k^* vk = ∑_{k=1}^n (∑_{j=1}^n Akj uj)^* vk = ∑_{j,k=1}^n uj^* Akj^* vk

Right side = (u|Av) = ∑_{j=1}^n uj^* (Av)j = ∑_{j,k=1}^n uj^* Ajk vk

Now since both sides should be equal for any choice of u's and v's (consider e.g. vk = δk,4 and uj = δj,3),

⇒ Akj^* = Ajk ⇒ A = (A^*)^T.

This means that A = Ã. A matrix with this property is called self-adjoint.
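For a concrete matrix, the content of the spectral theorem can be checked directly (our sketch; numpy's eigh routine is designed for symmetric/Hermitian matrices):

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4))
    A = (B + B.T) / 2                        # a real symmetric matrix

    lam, E = np.linalg.eigh(A)               # eigenvalues, orthonormal eigenvectors
    assert np.allclose(E.T @ E, np.eye(4))   # the basis is orthonormal

    # A = sum_k lam_k P_[e_k], with P_[e_k] = e_k e_k^T the projection on e_k
    A_rebuilt = sum(lam[k] * np.outer(E[:, k], E[:, k]) for k in range(4))
    assert np.allclose(A, A_rebuilt)
    print(lam)                               # all eigenvalues are real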

We now expand the discussion from matrices to Hilbert spaces. In particular, we are interested in studying symmetric and self-adjoint operators in function spaces. Given a Hilbert space H, an operator A on H is symmetric if it can be moved from one side of the scalar product to the other without changing it:

(u|Av) = (Au|v)

The more common name for a symmetric operator is Hermitian operator. For a Hermitian operator A and u = v,

(u|Au) = (Au|u) = (u|Au)^*

Thus, (u|Au) is a real number. If (u|Au) ≥ 0, ∀u, then A is positive semi-definite.

Propositions: Suppose A is Hermitian in a Hilbert space H. Then all of the following apply:

1. All eigenvalues of A are real.

2. Given two eigenvalues λ1 ≠ λ2, the corresponding eigenvectors u1, u2 are orthogonal.

3. If A is positive (semi-)definite, all eigenvalues are > 0 (or ≥ 0).

Proof: 1. Assume Au = λu, u ≠ 0. Then

(u|Au) = (Au|u) ⇒ (u|λu) = (λu|u) ⇒ λ(u|u) = λ^*(u|u)

Since (u|u) > 0, it must be that λ = λ^*, and therefore λ must be real.

2. Consider

Au1 = λ1u1,   Au2 = λ2u2

with u1, u2 ≠ 0, λ1 ≠ λ2 and λ1, λ2 ∈ R.

(Au1|u2) = (u1|Au2) ⇒ λ1(u1|u2) = λ2(u1|u2)

Since λ1 ≠ λ2, it follows that (u1|u2) = 0.

3. If A is positive definite,

0 < (u|Au) = λ(u|u)

Because (u|u) > 0 for u ≠ 0, it must be that λ > 0 (or λ ≥ 0 if A is positive semi-definite).

Example: Consider the integral operator K : L2(I) −→ L2(I),

Ku(x) = ∫_I K(x, y) u(y) dy.

For a Hermitian operator we get the condition

(u|Kv) = (Ku|v)

∫_I u^*(x) (Kv)(x) dx = ∫_I ((Ku)(y))^* v(y) dy

⇔ ∫_I u^*(x) dx ∫_I K(x, y) v(y) dy = ∫_I (∫_I K(y, x) u(x) dx)^* v(y) dy

⇔ ∫_I ∫_I K(x, y) u^*(x) v(y) dx dy = ∫_I ∫_I K^*(y, x) u^*(x) v(y) dx dy

We choose the following u(x) and v(y):

u(x) = { 1/h if |x − x0| < h
       { 0   otherwise

v(y) = { 1/h if |y − y0| < h
       { 0   otherwise

By letting h → 0, we find that

K(y, x) = K^*(x, y).

That an integral operator is symmetric thus provides a condition on its kernel that is analogous to a symmetric (self-adjoint) matrix having the elements Aij = Aji^*.

Example: Consider the expression

(1/i) D = (1/i) d/dx, in [0, 2π].

If u, v ∈ C1(I), then

(u | (1/i)Dv) = ∫_0^{2π} u^*(x) (v′(x)/i) dx = (1/i)[u^*(x)v(x)]_0^{2π} + ∫_0^{2π} (u′(x)/i)^* v(x) dx = (1/i)[u^*(x)v(x)]_0^{2π} + ((1/i)Du | v).

For (u|(1/i)Dv) = ((1/i)Du|v) to be true, the expression [u^*(x)v(x)]_0^{2π} must be zero. This is the case if u and v are 2π-periodic. The differential operator

A = (1/i) d/dx

is thus a Hermitian operator on C1([0, 2π]) for 2π-periodic functions. The eigenvalues and eigenvectors of A are found by solving the differential equation

Au = λu ⇔ u′ = iλu

with the boundary condition u(0) = u(2π). The general solution to this equation is:

u(x) = C e^{iλx}.

1 = e^{iλ2π} ⇔ λ is an integer.

Thus, the operator A has the eigenfunctions e^{ikx} with eigenvalues k = 0, ±1, ±2, ... We have previously determined this to be an orthogonal basis.

Example: By the same token, it can be shown that the differential operator

−D^2 = −d^2/dx^2, 0 ≤ x ≤ π

is Hermitian for functions satisfying u(0) = v(0) = 0, u(π) = v(π) = 0. We define the operator A so that

A = −D^2,   D_A = {u ∈ C^2([0, π]) | u(0) = u(π) = 0}.

Furthermore,

(u|Au) = (u′|u′) = ∫_0^π |u′(x)|^2 dx ≥ 0,

so the operator A is positive semi-definite. In order to find the eigenvalues and eigenfunctions of A, we must solve the equation

Au = λu ⇔ −u″ = λu,

with boundary conditions u(0) = u(π) = 0. This differential equation has the solutions

λk = k^2,   ϕk(x) = sin(kx),   k = 1, 2, ...

The operator A has an orthogonal basis consisting of the eigenfunctions sin(kx).
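A numerical cross-check (ours): discretizing −d^2/dx^2 on [0, π] with central differences and Dirichlet boundary conditions gives a symmetric matrix whose lowest eigenvalues approach k^2 = 1, 4, 9, 16, ...:

    import numpy as np

    n = 400
    h = np.pi / (n + 1)                    # interior grid for [0, pi], u(0) = u(pi) = 0
    main = np.full(n, 2.0 / h**2)
    off = np.full(n - 1, -1.0 / h**2)
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)   # -d^2/dx^2, central differences

    lam = np.linalg.eigvalsh(A)            # symmetric matrix: real eigenvalues
    print(lam[:4])                         # approximately 1, 4, 9, 16 = k^2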

9 Sturm-Liouville operators

An important class of self-adjoint operators are the Sturm-Liouville ordinary differential operators. Sturm-Liouville operators arise in many areas of physics and applied mathematics. Here, we will consider the case of ordinary differential equations (ODEs). Sturm-Liouville operators also occur in partial differential equations (PDEs), but we will not consider these. ODEs involve derivatives up to some order, i.e. finding a solution requires one or more integrations. To uniquely specify the solution we need as many boundary conditions as derivatives. ODEs can be linear, for example:

d^2φ/dx^2 + m^2 φ = 0.

ODEs can also be non-linear, for example:

d^2φ/dx^2 + m^2 sin(φ) = 0.

We will only focus on linear ODEs. A linear ODE can be written in the form

∑_{n=1}^{N} an(x) (d^nφ/dx^n)(x) = a0(x).

Here,

• all an(x) are supposed to be known.

• the largest order derivative decides the order of the equation.

• the equation is homogeneous if a0(x) = 0.

• non-uniqueness is related to lack of specification of suitable boundary conditions. (For an N-th order linear ODE, we usually require N boundary conditions.)

A Sturm-Liouville operator is a differential operator that can be written in the form

a2(x) d^2φ/dx^2 + a1(x) dφ/dx + a0(x) φ = λ u(x) φ.   (12)

We assume that

• a0(x), a1(x), a2(x) and u(x) are real and non-zero for x ∈ [a, b].

• the function is "well-behaved", i.e. the derivatives exist up to the necessary order.

• the solutions are expected to satisfy boundary conditions.

• the equation is an eigenvalue equation, i.e. λ is not a parameter. Rather, it is an eigenvalue, and finding λ is part of finding the solution to the Sturm-Liouville problem.

An equivalent way to write (12) is

−d/dx(p(x) dφ/dx) + q(x) φ = λ w(x) φ.   (13)

The two different ways of expressing the Sturm-Liouville operator can be related to each other as follows. First, divide (12) by a2:

d^2φ/dx^2 + (a1/a2) dφ/dx + (a0/a2) φ = λ (u/a2) φ.   (14)

Then, divide (13) by −p(x):

d^2φ/dx^2 + (1/p)(dp/dx)(dφ/dx) − (q/p) φ = −λ (w/p) φ.   (15)

By comparing (14) and (15) we find the equations

(1/p(x)) dp/dx = a1(x)/a2(x)

−q(x)/p(x) = a0(x)/a2(x)

−w(x)/p(x) = u(x)/a2(x)

These equations allow us to move from one expression to the other (see the example below).
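The conversion can be automated symbolically. This sketch (ours, anticipating the Hermite example below) solves (1/p) dp/dx = a1/a2 by an integrating factor and then reads off q and w:

    import sympy as sp

    x = sp.symbols('x')
    # Hermite's equation (see the example below): a2 = 1, a1 = -2x, a0 = 0, u = -1
    a2, a1, a0, u = sp.Integer(1), -2 * x, sp.Integer(0), sp.Integer(-1)

    p = sp.exp(sp.integrate(a1 / a2, x))   # from (1/p) dp/dx = a1/a2
    q = -p * a0 / a2                       # from -q/p = a0/a2
    w = -p * u / a2                        # from -w/p = u/a2
    print(p, q, w)                         # exp(-x**2), 0, exp(-x**2)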

The Schrödinger equation is a special case of the Sturm-Liouville problem.

Orthogonal polynomials come from solutions to Sturm-Liouville problems with special boundary conditions. Furthermore, the weight function w(x) in (13) determines the weight function in the scalar product of the solutions.

Example: (Hermite’s differential operator.) Consider

y″ − 2x y′ + λy = 0, x ∈ (−∞, ∞).

Rewrite this as

y″ − 2x y′ = −λy.

After multiplying with e^{−x^2}, this can be written as

−(e^{−x^2} y′)′ = λ e^{−x^2} y.

It is the same as (13), with

p(x) = e^{−x^2}, q(x) = 0, w(x) = e^{−x^2}.

This is also the same as equation (12), with

a2 = 1, a1 = −2x, a0 = 0, u(x) = −1.

Thus the relationships

w(x)/p(x) = −u(x)/a2(x),   (1/p(x)) dp/dx = a1(x)/a2(x)

are fulfilled. Consider λ = 2n, n = 0, 1, ... One can show that when λ = 2n, Hn(x) are solutions, and that these make up a complete, orthogonal system in L2(−∞, ∞) with the weight function w(x) = e^{−x^2} (verify this!). In particular, the function z(x) = e^{−x^2/2} Hn(x) satisfies

z″ + (2n + 1 − x^2) z = 0,

i.e. the Schrödinger equation for the harmonic oscillator.
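The last claim is easy to verify symbolically (our sketch, using sympy's built-in Hermite polynomials):

    import sympy as sp

    x = sp.symbols('x')
    for n in range(5):
        z = sp.exp(-x**2 / 2) * sp.hermite(n, x)   # z = e^{-x^2/2} H_n(x)
        residual = sp.diff(z, x, 2) + (2 * n + 1 - x**2) * z
        assert sp.simplify(residual) == 0
    print("z'' + (2n + 1 - x^2) z = 0 holds for n = 0..4")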
