
Jim Lambers
MAT 415/515, Fall Semester 2013-14
Lectures 1 and 2 Notes

These notes correspond to Section 5.1 in the text.

Introduction

This course is about solutions of ordinary differential equations (ODEs). Unlike the ODEs covered in MAT 285, whose solutions can be expressed as linear combinations of n functions, where n is the order of the ODE, the ODEs discussed in this course have solutions that are expressed as infinite series of functions, similar to the power series seen in MAT 169. Representing solutions using infinite series provides the following benefits:

• It allows solutions to be represented using simpler functions, particularly polynomials, than the exponential and trigonometric functions normally used to represent solutions in “closed form”. This leads to more efficient evaluation of solutions on a computer or calculator.

• It enables the solution of ODEs with variable coefficients, as opposed to the ODEs seen in MAT 285, which either had constant coefficients or were of a special form.

• It facilitates approximation of solutions by polynomials, by truncating the infinite series after a certain number of terms, which in turn aids in understanding the qualitative behavior of solutions.

The solution of ODEs via infinite series will lead to the definition of several families of special functions, such as Bessel functions, and various kinds of orthogonal polynomials, such as Legendre polynomials. Each family of special functions that we will see in this course satisfies an orthogonality relation, which simplifies the computation of coefficients in infinite series involving such functions. As such, orthogonality is going to be an essential concept in this course. Vectors in $\mathbb{R}^n$ are known to be orthogonal if and only if they are perpendicular, or, equivalently, if their dot product is equal to zero. We will begin this course by applying familiar concepts from vector spaces such as $\mathbb{R}^n$ to sets of functions, thus leading to the concept of a function space.

Vectors in Function Spaces

We begin with some necessary terminology. A vector space V, also known as a linear vector space, is a set of objects, called vectors, together with two operations:

• Addition of two vectors in V, which must be commutative, associative, and have an identity element, which is the zero vector 0. Each vector v must have an additive inverse −v which, when added to v, yields the zero vector.

• Multiplication of a vector in V by a scalar, which is typically a real or complex number. The term “scalar” is used in this context, rather than “number”, because the multiplication process “scales” a given vector by a factor indicated by a given number. Scalar multiplication must satisfy distributive laws, and have an identity element, 1, such that 1v = v for any vector v ∈ V.

Both operations must be closed, which means that the result of either operation must be a vector in V. That is, if u and v are two vectors in V, then u + v must also be in V, and αv must be in V for any scalar α.

Example The set of all points in n-dimensional space, $\mathbb{R}^n$, is a vector space. Addition is defined as follows:

$$\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix}.$$

Scalar multiplication is defined by

$$\alpha\mathbf{v} = \begin{bmatrix} \alpha v_1 \\ \alpha v_2 \\ \vdots \\ \alpha v_n \end{bmatrix}.$$

Similarly, the set of all n-dimensional points whose coordinates are complex numbers, denoted by $\mathbb{C}^n$, is also a vector space. □

In these next few examples, we introduce some vector spaces whose vectors are functions, which are also known as function spaces.
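Before moving on to function spaces, here is a quick illustration (my own sketch, not part of the original notes) of the componentwise operations above: closure of addition and scalar multiplication in $\mathbb{R}^n$ and $\mathbb{C}^n$, using NumPy arrays.

```python
import numpy as np

# Two vectors in R^3 and a scalar
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
alpha = -2.5

print(u + v)        # componentwise addition: [5. 7. 9.]
print(alpha * v)    # scalar multiplication:  [-10.  -12.5 -15. ]

# The same operations work in C^n with complex entries
w = np.array([1 + 2j, 3 - 1j])
z = np.array([0 + 1j, 2 + 2j])
print(w + z, (1 - 1j) * w)
```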

Example The set of all polynomials of degree at most n, denoted by Pn, is a vector space, in which addition and scalar multiplication are defined as follows. Given f(x), g(x) ∈ Pn,

(f + g)(x) = f(x) + g(x), (αf)(x) = αf(x).

These operations are closed, because adding two polynomials of degree at most n cannot yield a sum whose degree is greater than n, and multiplying a polynomial by a nonzero scalar does not change its degree. □

Example The set of all functions with power series of the form

$$f(x) = \sum_{n=0}^{\infty} a_n x^n,$$

that are convergent on the interval (−1, 1) is a vector space, in which addition and scalar multiplication are defined as in the previous example. These operations are closed because the sum of two convergent series is also convergent, as is a scalar multiple of a convergent series. □

Example The set of all continuous functions on the interval [a, b], denoted by C[a, b], is a vector space in which addition and scalar multiplication are defined as in the previous two examples. These operations are closed because the sum of two continuous functions, and a scalar multiple of a continuous function, is also continuous. □
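Returning to the polynomial space $P_n$, the following small sketch (my own illustration, not from the notes) represents elements of $P_3$ by their coefficient vectors, so that addition and scalar multiplication of polynomials reduce to the componentwise array operations of the previous example.

```python
import numpy as np

# Represent p(x) = p[0] + p[1]*x + ... + p[n]*x^n by its coefficient vector.
f = np.array([1.0, 0.0, 2.0, -1.0])   # 1 + 2x^2 - x^3  (element of P_3)
g = np.array([0.0, 3.0, -2.0, 0.0])   # 3x - 2x^2       (element of P_3)

h = f + g          # (f + g)(x): coefficients add componentwise
s = 2.5 * f        # (alpha f)(x): coefficients scale componentwise

# Evaluate at a point to confirm agreement with the pointwise definitions
x = 0.7
poly = lambda c, t: sum(ck * t**k for k, ck in enumerate(c))
assert np.isclose(poly(h, x), poly(f, x) + poly(g, x))
assert np.isclose(poly(s, x), 2.5 * poly(f, x))
print(h, s)
```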

A vector space V is most effectively described in terms of a set of specific vectors {v1, v2,...} that, in conjunction with the operations of addition and scalar multiplication, can be used to obtain every vector in the space. That is, for every vector v ∈ V , there must exist scalars c1, c2,..., such that v = c1v1 + c2v2 + ··· .

We say that v is a linear combination of v1, v2, . . ., and the scalars c1, c2, . . . are the coefficients of the linear combination. Ideally, it should be possible to express any vector v ∈ V as a unique linear combination of the vectors v1, v2, . . . that are used to describe all vectors in V. With this criterion in mind, we introduce the following two essential concepts from linear algebra:

• A set of vectors {v1, v2, . . . , vn} is linearly independent if the equation

c1v1 + c2v2 + ··· + cnvn = 0

is satisfied if and only if c1 = c2 = ··· = cn = 0. In other words, this set of vectors is linearly independent if it is not possible to express any vector in the set as a linear combination of the other vectors in the set. This definition can be generalized in a natural way to an infinite set of vectors. If a set of vectors is not linearly independent, then we say that it is linearly dependent.

• A set of vectors {v1, v2, . . . , vn} spans a vector space V if, for any vector v ∈ V, there exist scalars c1, c2, . . . , cn such that

v = c1v1 + c2v2 + ··· + cnvn.

That is, any vector in V can be expressed as a linear combination of vectors in the set. We define span{v1, v2, . . . , vn} to be the set of all linear combinations of v1, v2, . . . , vn. As with linear independence, the notion of span generalizes naturally to an infinite set of vectors.

We then say that a set of vectors {v1, v2, . . .} (which may be finite or infinite) is a basis for a vector space V if it is linearly independent, and if it spans V. This definition ensures that any vector in V is a unique linear combination of the vectors in the basis. If a basis for V is finite, then we say that V is finite-dimensional and define the dimension of V to be the number of elements in a basis; all bases of a finite-dimensional vector space must have the same number of elements. If V does not have a finite basis, then we say that V is infinite-dimensional.

Example The vector space P3, consisting of polynomials of degree at most 3, has a basis $\{1, x, x^2, x^3\}$. It is clear that any polynomial in P3 can be expressed as a linear combination of these basis functions, as the coefficients of any such polynomial are also the coefficients in the linear combination of these basis functions. To confirm linear independence, suppose that there exist constants c0, c1, c2, and c3 such that

$$c_0(1) + c_1 x + c_2 x^2 + c_3 x^3 = 0 \quad \text{for all } x \in \mathbb{R}.$$

Then certainly this must be the case at x = 0, which requires that c0 = 0. Substituting 3 other values of x into the above equation yields a system of 3 linear equations in the remaining 3 unknowns c1, c2 and c3. It can be shown that the only solution of such a system of equations is the trivial solution c1 = c2 = c3 = 0. Therefore the set $\{1, x, x^2, x^3\}$ is linearly independent. An alternative basis consists of the first 4 Chebyshev polynomials $\{1, x, 2x^2 - 1, 4x^3 - 3x\}$. It can be confirmed using a similar approach that these polynomials are linearly independent. □

Example The function space consisting of all power series that are convergent on the interval (−1, 1) has as a basis the infinite set $\{1, x, x^2, x^3, \ldots\}$. Using an inductive argument, it can be shown that this set is linearly independent. □
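As a numerical sanity check (my own addition, not part of the notes), linear independence of the monomial basis and of the first four Chebyshev polynomials can be verified by sampling each candidate basis at 4 distinct points, chosen arbitrarily below, and confirming that the resulting 4 × 4 matrix is nonsingular; a nonzero determinant means the only solution of the homogeneous system is the trivial one.

```python
import numpy as np

# Sample points (any 4 distinct values of x will do)
x = np.array([-1.0, -0.3, 0.4, 1.0])

# Columns are the candidate basis functions evaluated at the sample points
monomials = np.column_stack([x**0, x, x**2, x**3])
chebyshev = np.column_stack([x**0, x, 2*x**2 - 1, 4*x**3 - 3*x])

# A nonzero determinant means c0*1 + c1*x + c2*x^2 + c3*x^3 = 0 at all four
# points forces c0 = c1 = c2 = c3 = 0, i.e. the functions are linearly independent.
print(np.linalg.det(monomials))   # Vandermonde determinant, nonzero
print(np.linalg.det(chebyshev))   # also nonzero
```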

Scalar Product

Recall that the dot product of two vectors u and v in $\mathbb{R}^n$ is

$$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \|\mathbf{u}\|\,\|\mathbf{v}\| \cos\theta,$$

where

$$\|\mathbf{u}\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$$

is the magnitude or length of u, and θ is the angle between u and v, with 0 ≤ θ ≤ π radians. The dot product has the following properties:

1. $\mathbf{u} \cdot \mathbf{u} = \|\mathbf{u}\|^2$

2. $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$

3. $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$

4. $\mathbf{u} \cdot (c\mathbf{v}) = c(\mathbf{u} \cdot \mathbf{v})$

When u and v are perpendicular, then cos θ = 0. It follows that u · v = 0, and we say that u and v are orthogonal. We would like to generalize the concept of a dot product to vectors in function spaces, and we also need to ensure that complex numbers are properly taken into account. To that end, we define the scalar product of two functions f(x) and g(x) to be

$$\langle f|g\rangle = \int_a^b f^*(x)\,g(x)\,w(x)\,dx,$$

where w(x) is a weight function and, for any complex number z = x + iy, z* = x − iy is the complex conjugate of z (also denoted by $\bar z$). The interval of integration [a, b] depends on the function space under consideration. Using this definition, it can be verified that the scalar product has the following properties:

1. $\langle f|g + h\rangle = \langle f|g\rangle + \langle f|h\rangle$

2. $\langle f|g\rangle = \langle g|f\rangle^*$

3. $\langle f|cg\rangle = c\langle f|g\rangle$ for any complex number c

Note that the second property is slightly different from the corresponding property for vectors in $\mathbb{R}^n$, as it requires the complex conjugate. Combining the second and third properties yields the result $\langle cf|g\rangle = c^*\langle f|g\rangle$.
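To see these properties in action, here is a short numerical sketch (my own illustration, not from the notes) that approximates the scalar product on an assumed interval [0, 1] with weight w(x) = 1 by a trapezoidal sum, and checks conjugate symmetry and linearity for two assumed complex-valued test functions.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)          # grid on [a, b] = [0, 1]
w = np.ones_like(x)                      # weight function w(x) = 1

def scalar_product(f, g):
    """Approximate <f|g> = integral of conj(f) * g * w by the trapezoid rule."""
    y = np.conj(f(x)) * g(x) * w
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

f = lambda t: np.exp(1j * t)             # assumed test functions
g = lambda t: t + 2j * t**2
c = 1.5 - 0.5j

print(np.isclose(scalar_product(f, g), np.conj(scalar_product(g, f))))              # <f|g> = <g|f>*
print(np.isclose(scalar_product(f, lambda t: c * g(t)), c * scalar_product(f, g)))  # <f|cg> = c <f|g>
print(np.isclose(scalar_product(lambda t: c * f(t), g),
                 np.conj(c) * scalar_product(f, g)))                                # <cf|g> = c* <f|g>
```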

Hilbert Spaces

Just as we use $\|v\|$ to denote the magnitude of a vector $v \in \mathbb{R}^n$, we need a notion of magnitude for a function f(x) in a function space. To that end, we say that a function $\|\cdot\| : V \to \mathbb{R}$ is a norm on a vector space V if it satisfies the following conditions:

1. $\|v\| \ge 0$ for any vector v ∈ V, and $\|v\| = 0$ if and only if v = 0.

2. $\|\alpha v\| = |\alpha|\,\|v\|$ for any complex scalar α.

3. $\|u + v\| \le \|u\| + \|v\|$ for any two vectors u, v ∈ V. This is known as the Triangle Inequality.

Any function that satisfies these conditions is useful for measuring the magnitude of a vector. Given a scalar product $\langle\cdot|\cdot\rangle$, we define $\|v\| = \langle v|v\rangle^{1/2}$ to be the norm induced by this scalar product. Using the properties of the scalar product, it can be shown that this norm satisfies the above conditions, except that for a function space, $\|f\|$ may equal zero even if f is nonzero at isolated points, or more generally, if f is nonzero on a set of measure zero. We can now define a specific type of function space that will be of use to us. A Hilbert space H is a function space, together with a scalar product and induced norm, that is also complete. This means that there exists a basis $\varphi_1, \varphi_2, \ldots$ such that for any f ∈ H, there exist scalars $a_1, a_2, \ldots$ such that

$$f = \sum_{n=1}^{\infty} a_n \varphi_n.$$

Example As before, let P3 be the space of polynomials of degree at most 3. If we use the scalar product

$$\langle f|g\rangle = \int_{-1}^{1} f^*(s)\,g(s)\,ds,$$

with weight function w(x) = 1, then P3 is a Hilbert space. Let L0(x) = 1 and L1(x) = x. Then

$$\langle L_0|L_0\rangle = \int_{-1}^{1} 1\,ds = 2\int_0^1 1\,ds = 2,$$
$$\langle L_1|L_1\rangle = \int_{-1}^{1} s^2\,ds = 2\int_0^1 s^2\,ds = \frac{2}{3},$$
$$\langle L_0|L_1\rangle = \int_{-1}^{1} 1 \cdot s\,ds = 0.$$

Here, we have used the fact that if f(x) is an even function, meaning that f(−x) = f(x), then

$$\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx,$$

whereas if f(x) is an odd function, meaning that f(−x) = −f(x), then

$$\int_{-a}^{a} f(x)\,dx = 0.$$

□
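The three scalar products above are easy to confirm symbolically. The following sketch (my own check, not part of the notes) uses SymPy to evaluate them exactly.

```python
import sympy as sp

s = sp.symbols('s', real=True)
L0, L1 = sp.Integer(1), s          # L0(s) = 1, L1(s) = s

def scalar_product(f, g):
    # <f|g> = integral of conj(f)*g on [-1, 1] with weight w = 1
    return sp.integrate(sp.conjugate(f) * g, (s, -1, 1))

print(scalar_product(L0, L0))   # 2
print(scalar_product(L1, L1))   # 2/3
print(scalar_product(L0, L1))   # 0
```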

Cauchy-Schwarz Inequality

The Cauchy-Schwarz inequality, also known as the Schwarz inequality, states that if a norm on a vector space V is defined by $\|f\| = \langle f|f\rangle^{1/2}$ for any f ∈ V, where $\langle f|g\rangle$ is an inner product as defined previously, then

$$|\langle f|g\rangle| \le \|f\|\,\|g\|.$$

We will prove this inequality in the case where V is a vector space defined over the real numbers; the proof can be generalized to a complex vector space. For f, g ∈ V and c ∈ $\mathbb{R}$, with g ≠ 0, we have $\langle f - cg|f - cg\rangle \ge 0$. It follows from the properties of the inner product that

$$0 \le \langle f - cg|f - cg\rangle = \langle f|f\rangle - \langle f|cg\rangle - \langle cg|f\rangle + \langle cg|cg\rangle = \|f\|^2 - 2c\langle f|g\rangle + c^2\|g\|^2.$$

We now try to find the value of c that minimizes this expression. Differentiating with respect to c and equating to zero yields the equation

$$-2\langle f|g\rangle + 2c\|g\|^2 = 0,$$

and therefore the minimum occurs when $c = \langle f|g\rangle/\|g\|^2$. It follows that

$$\begin{aligned}
0 &\le \|f\|^2 - 2c\langle f|g\rangle + c^2\|g\|^2 \\
  &= \|f\|^2 - 2\frac{\langle f|g\rangle}{\|g\|^2}\langle f|g\rangle + \frac{\langle f|g\rangle^2}{\|g\|^4}\|g\|^2 \\
  &= \|f\|^2 - 2\frac{\langle f|g\rangle^2}{\|g\|^2} + \frac{\langle f|g\rangle^2}{\|g\|^2} \\
  &= \|f\|^2 - \frac{\langle f|g\rangle^2}{\|g\|^2}.
\end{aligned}$$

It follows that $\langle f|g\rangle^2 \le \|f\|^2\|g\|^2$. Taking the square root of both sides yields the Cauchy-Schwarz inequality.
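As a quick numerical illustration (my own, not from the notes), the following sketch checks the Cauchy-Schwarz inequality for two assumed functions on [0, 1], using a trapezoidal approximation of the scalar product as in the earlier sketch.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)

def scalar_product(fvals, gvals):
    """Trapezoidal approximation of <f|g> on [0, 1] with weight w(x) = 1."""
    y = np.conj(fvals) * gvals
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

f = np.sin(3 * x)                  # assumed test functions
g = np.exp(-x)

lhs = abs(scalar_product(f, g))
rhs = np.sqrt(scalar_product(f, f).real * scalar_product(g, g).real)
print(lhs, "<=", rhs, lhs <= rhs)  # Cauchy-Schwarz: |<f|g>| <= ||f|| ||g||
```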

Orthogonal Expansions

Just as two vectors $\mathbf u, \mathbf v \in \mathbb{R}^n$ are orthogonal if $\mathbf u \cdot \mathbf v = 0$, we say that two functions f and g in a Hilbert space are orthogonal if $\langle f|g\rangle = 0$. An orthogonal basis of a Hilbert space is particularly useful because the coefficients of a function with respect to such a basis are easily computed. Specifically, suppose a Hilbert space H has a basis $\{\varphi_1, \varphi_2, \ldots\}$ such that $\langle\varphi_i|\varphi_j\rangle = 0$ for i ≠ j. Then, let f ∈ H have the expansion

$$f = \sum_{n=1}^{\infty} a_n \varphi_n.$$

If we take the scalar product of both sides with $\varphi_k$ for some positive integer k, we obtain

$$\langle\varphi_k|f\rangle = \left\langle \varphi_k \,\Big|\, \sum_{n=1}^{\infty} a_n \varphi_n \right\rangle = \sum_{n=1}^{\infty} a_n \langle\varphi_k|\varphi_n\rangle = a_k\langle\varphi_k|\varphi_k\rangle,$$

which yields the coefficients

$$a_n = \frac{\langle\varphi_n|f\rangle}{\langle\varphi_n|\varphi_n\rangle}, \quad n = 1, 2, \ldots.$$

This is significant because it shows that the coefficients can be computed independently of one another. Recall that a vector $\mathbf u \in \mathbb{R}^n$ is a unit vector if $\|\mathbf u\| = 1$, and that a unit vector can be obtained from a nonzero vector $\mathbf v$ by normalizing $\mathbf v$, which means dividing by its magnitude: $\mathbf u = \mathbf v/\|\mathbf v\|$ is a unit vector. Similarly, given a function f in a Hilbert space H, f is said to be normalized if $\|f\| = 1$. We then say that a set of functions $\{\varphi_1, \varphi_2, \ldots\}$ is orthonormal if

$$\langle\varphi_i|\varphi_j\rangle = \delta_{ij}, \quad \text{where} \quad \delta_{ij} = \begin{cases} 0 & i \ne j \\ 1 & i = j \end{cases}$$

is called the Kronecker delta.
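As an illustration of the coefficient formula $a_n = \langle\varphi_n|f\rangle/\langle\varphi_n|\varphi_n\rangle$ (a sketch of my own, not from the notes), the following code projects an assumed function f(x) = e^x onto the orthogonal, but not orthonormal, polynomials 1, x, (3x² − 1)/2 on [−1, 1], and compares the resulting quadratic approximation with f.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 4001)

def scalar_product(fvals, gvals):
    """Trapezoidal approximation of <f|g> on [-1, 1] with weight w(x) = 1."""
    y = np.conj(fvals) * gvals
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

# Orthogonal (not orthonormal) basis functions: the first three Legendre polynomials
phis = [np.ones_like(x), x, (3 * x**2 - 1) / 2]

f = np.exp(x)                                  # assumed function to expand

# a_n = <phi_n | f> / <phi_n | phi_n>, computed independently of one another
coeffs = [scalar_product(p, f) / scalar_product(p, p) for p in phis]
approx = sum(a * p for a, p in zip(coeffs, phis))

print(coeffs)                                  # roughly [1.1752, 1.1036, 0.3578]
print(np.max(np.abs(f - approx)))              # small, but nonzero truncation error
```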

Example Consider the infinite set of functions $\{\sin nx\}_{n=1}^{\infty}$, with the scalar product

$$\langle f|g\rangle = \int_0^{\pi} f^*(s)\,g(s)\,ds.$$

Then we have, for positive integers m and n, with m ≠ n,

$$\begin{aligned}
\langle\sin mx|\sin nx\rangle &= \int_0^{\pi} \sin mx \sin nx\,dx \\
&= \frac{1}{2}\int_0^{\pi} \cos[(m-n)x] - \cos[(m+n)x]\,dx \\
&= \frac{1}{2}\left[\frac{1}{m-n}\sin[(m-n)x] - \frac{1}{m+n}\sin[(m+n)x]\right]_0^{\pi} \\
&= 0,
\end{aligned}$$

$$\begin{aligned}
\|\sin nx\|^2 &= \langle\sin nx|\sin nx\rangle \\
&= \int_0^{\pi} \sin^2 nx\,dx \\
&= \int_0^{\pi} \frac{1 - \cos 2nx}{2}\,dx \\
&= \left[\frac{x - \frac{1}{2n}\sin 2nx}{2}\right]_0^{\pi} \\
&= \frac{\pi}{2}.
\end{aligned}$$

Therefore, the functions in this set are orthogonal, but not orthonormal. Since the norm of each is $\sqrt{\pi/2}$, it follows that the set

$$\left\{\sqrt{\frac{2}{\pi}}\sin nx\right\}_{n=1}^{\infty}$$

is an orthonormal set. Then, to compute the coefficients $\{a_n\}_{n=1}^{\infty}$ in the expansion

$$f(x) = \sqrt{\frac{2}{\pi}}\sum_{n=1}^{\infty} a_n \sin nx,$$

we need only compute the scalar products

$$a_n = \left\langle \sqrt{\frac{2}{\pi}}\sin nx \,\Big|\, f\right\rangle = \sqrt{\frac{2}{\pi}}\int_0^{\pi} f(x)\sin nx\,dx, \quad n = 1, 2, \ldots.$$

This representation of f(x) is called a Fourier sine series. □
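For a concrete check (my own sketch, not part of the notes), the following code computes the first few Fourier sine coefficients of an assumed function f(x) = x(π − x) on (0, π) by numerical quadrature and compares the truncated series with f.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)
f = x * (np.pi - x)                         # assumed function on (0, pi)

def scalar_product(fvals, gvals):
    """Trapezoidal approximation of <f|g> on [0, pi]."""
    y = np.conj(fvals) * gvals
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

N = 7
# a_n = <sqrt(2/pi) sin nx | f>; the orthonormal basis functions are sqrt(2/pi) sin nx
coeffs = [scalar_product(np.sqrt(2 / np.pi) * np.sin(n * x), f) for n in range(1, N + 1)]

partial_sum = sum(a * np.sqrt(2 / np.pi) * np.sin(n * x)
                  for n, a in zip(range(1, N + 1), coeffs))

print(np.max(np.abs(f - partial_sum)))      # truncation error is already small
```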

Expansions and Scalar Products

Suppose that two functions f and g are expanded in the same orthonormal basis:

$$f = \sum_{n=1}^{\infty} a_n \varphi_n, \qquad g = \sum_{m=1}^{\infty} b_m \varphi_m.$$

Then, using the properties of the scalar product, we have

$$\begin{aligned}
\langle f|g\rangle &= \left\langle \sum_{n=1}^{\infty} a_n \varphi_n \,\Big|\, \sum_{m=1}^{\infty} b_m \varphi_m \right\rangle \\
&= \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_n^* b_m \langle\varphi_n|\varphi_m\rangle \\
&= \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_n^* b_m \delta_{nm} \\
&= \sum_{n=1}^{\infty} a_n^* b_n.
\end{aligned}$$

In other words, the scalar product reduces to a dot product:

$$\langle f|g\rangle = \mathbf{a}^*\mathbf{b},$$

where $\mathbf a$ and $\mathbf b$ are “column vectors” (which may be infinitely long) consisting of the coefficients $\{a_n\}$ and $\{b_m\}$, respectively. For any vector $\mathbf v$, we denote by $\mathbf v^*$ the Hermitian transpose of $\mathbf v$, which is the transpose and complex conjugate of $\mathbf v$. In some texts, $\mathbf v^*$ is written as $\mathbf v^H$ or $\mathbf v^\dagger$. In the case of f = g, we obtain

$$\|f\|^2 = \langle f|f\rangle = \sum_{n=1}^{\infty} |a_n|^2.$$

This relationship is known as Parseval's identity.

Example The functions

$$\varphi_0(x) = \sqrt{\frac{1}{\pi}}, \qquad \varphi_n(x) = \sqrt{\frac{2}{\pi}}\cos nx, \quad n = 1, 2, \ldots$$

form an orthonormal set with respect to the scalar product

$$\langle f|g\rangle = \int_0^{\pi} f^*(s)\,g(s)\,ds.$$

Now, consider the functions

$$\psi_1(x) = \cos^3 x + \sin^2 x + \cos x + 1, \qquad \psi_2(x) = \cos^2 x - \cos x.$$

It can be verified directly by computing $\langle\varphi_j|\psi_k\rangle$ for j = 0, 1, 2, 3 and k = 1, 2, or by rewriting $\psi_1$ and $\psi_2$ using trigonometric identities, that these functions have the following expansions in the basis $\{\varphi_n\}_{n=0}^{\infty}$:

$$\psi_1(x) = \frac{3\sqrt{\pi}}{2}\varphi_0(x) + \frac{7\sqrt{\pi}}{4\sqrt{2}}\varphi_1(x) - \frac{\sqrt{\pi}}{2\sqrt{2}}\varphi_2(x) + \frac{\sqrt{\pi}}{4\sqrt{2}}\varphi_3(x),$$
$$\psi_2(x) = \frac{\sqrt{\pi}}{2}\varphi_0(x) - \sqrt{\frac{\pi}{2}}\varphi_1(x) + \frac{\sqrt{\pi}}{2\sqrt{2}}\varphi_2(x).$$

Because the basis is orthonormal, we can compute $\langle\psi_i|\psi_j\rangle$, for i, j = 1, 2, by computing dot products of the appropriate sets of coefficients:

$$\begin{aligned}
\langle\psi_1|\psi_1\rangle &= \left(\frac{3\sqrt{\pi}}{2}\right)^2 + \left(\frac{7\sqrt{\pi}}{4\sqrt{2}}\right)^2 + \left(-\frac{\sqrt{\pi}}{2\sqrt{2}}\right)^2 + \left(\frac{\sqrt{\pi}}{4\sqrt{2}}\right)^2 = \frac{63\pi}{16}, \\
\langle\psi_1|\psi_2\rangle &= \frac{3\sqrt{\pi}}{2}\cdot\frac{\sqrt{\pi}}{2} - \frac{7\sqrt{\pi}}{4\sqrt{2}}\sqrt{\frac{\pi}{2}} - \frac{\sqrt{\pi}}{2\sqrt{2}}\cdot\frac{\sqrt{\pi}}{2\sqrt{2}} + \frac{\sqrt{\pi}}{4\sqrt{2}}(0) = -\frac{\pi}{4}, \\
\langle\psi_2|\psi_2\rangle &= \left(\frac{\sqrt{\pi}}{2}\right)^2 + \left(\sqrt{\frac{\pi}{2}}\right)^2 + \left(\frac{\sqrt{\pi}}{2\sqrt{2}}\right)^2 = \frac{7\pi}{8}.
\end{aligned}$$

□
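These values can be double-checked numerically (a sketch of my own, not from the notes) by computing $\langle\psi_i|\psi_j\rangle$ directly by quadrature and comparing with 63π/16, −π/4, and 7π/8.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)

def scalar_product(fvals, gvals):
    """Trapezoidal approximation of <f|g> on [0, pi]."""
    y = np.conj(fvals) * gvals
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

psi1 = np.cos(x)**3 + np.sin(x)**2 + np.cos(x) + 1
psi2 = np.cos(x)**2 - np.cos(x)

print(scalar_product(psi1, psi1), 63 * np.pi / 16)   # both approximately 12.3700
print(scalar_product(psi1, psi2), -np.pi / 4)        # both approximately -0.7854
print(scalar_product(psi2, psi2), 7 * np.pi / 8)     # both approximately  2.7489
```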

Bessel's Inequality

Suppose that a function f has an expansion in an orthonormal basis for a Hilbert space H, which is $\sum_{n=1}^{\infty} a_n \varphi_n$. If f is not in H, then it is not guaranteed that

$$f(x) = \sum_{n=1}^{\infty} a_n \varphi_n(x)$$

at any x in the domain prescribed in the underlying scalar product, as f(x) may lie outside of the span of the basis functions. That is, the basis functions are not complete with respect to any function space containing f. However, regardless of completeness, we do have Bessel's inequality, which states that

$$\|f\|^2 \ge \sum_{n=1}^{\infty} |a_n|^2,$$

with equality occurring if the expansion of f is complete. Note that $a_n$ may be complex, which is why the absolute value is required even though it is squared.

This inequality can be proved by applying the properties of scalar products to the inequality

$$\left\langle f - \sum_{n=1}^{\infty} a_n \varphi_n \,\Big|\, f - \sum_{n=1}^{\infty} a_n \varphi_n \right\rangle \ge 0.$$

If equality occurs in Bessel's inequality, then we say that the expansion converges in the mean to f. However, this does not necessarily mean that the expansion agrees with f at every single point in the domain on which the scalar product is defined. This is because, as mentioned earlier, $\langle f|f\rangle = 0$ is possible even if f is nonzero at certain isolated points in the interval of integration. This will be illustrated in the following example.

Example Consider the function f(x) = x on (−π, π). We wish to obtain an expansion of this function in terms of the functions cos nx, n = 0, 1, 2, . . . and sin nx, n = 1, 2, . . ., which are known to be orthogonal with respect to the standard scalar product on (−π, π),

$$\langle f|g\rangle = \int_{-\pi}^{\pi} f^*(s)\,g(s)\,ds.$$

Therefore, our expansion will have the form

$$f(x) = a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).$$

Since f(x) is an odd function, $a_n = 0$ for n = 0, 1, 2, . . ., so we need only compute the coefficients $\{b_n\}_{n=1}^{\infty}$, which are given by

$$\begin{aligned}
b_n &= \frac{\langle\sin nx|f\rangle}{\langle\sin nx|\sin nx\rangle} \\
&= \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt \\
&= \frac{1}{\pi}\int_{-\pi}^{\pi} t\sin nt\,dt \\
&= \frac{1}{\pi}\left[\left.-\frac{1}{n}t\cos nt\right|_{-\pi}^{\pi} + \frac{1}{n}\int_{-\pi}^{\pi}\cos nt\,dt\right] \\
&= \frac{1}{\pi}\left[-\frac{1}{n}t\cos nt + \frac{1}{n^2}\sin nt\right]_{-\pi}^{\pi} \\
&= -\frac{2}{n}\cos n\pi \\
&= \frac{2(-1)^{n+1}}{n}.
\end{aligned}$$

Therefore, the Fourier sine series for f(x) = x on (−π, π) is

$$x = 2\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\sin nx.$$

Figure 1 shows how this series converges to f(x). Note that because the 2π-periodic extension of f(x) has a discontinuity at x = kπ, where k is any odd integer, at these points the series converges to the average of the values of f(x) at these jumps, which is 0. The resulting oscillations near the discontinuity are an example of Gibbs' phenomenon. □

Figure 1: Fourier sine series expansion of f(x) = x (red), compared to f(x) itself (blue curve)
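The convergence behavior illustrated in Figure 1 can be reproduced with a few lines of code (a sketch of mine, not from the notes): the partial sums approach f(x) = x in the mean, while the maximum error near the endpoint discontinuities does not shrink, reflecting Gibbs' phenomenon.

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 4001)
f = x

def partial_sum(N):
    """N-term partial sum of the Fourier sine series of f(x) = x on (-pi, pi)."""
    return 2 * sum((-1)**(n + 1) / n * np.sin(n * x) for n in range(1, N + 1))

for N in (5, 20, 80):
    err = f - partial_sum(N)
    mean_sq = np.sum(0.5 * (err[1:]**2 + err[:-1]**2) * np.diff(x))  # ||f - s_N||^2
    print(N, mean_sq, np.max(np.abs(err)))  # mean-square error shrinks; max error near x = +/-pi stays O(1)
```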

Dirac Notation

In quantum physics, Dirac notation is typically used to represent vectors in function spaces. A function f is written as $|f\rangle$, which is called a ket, and its complex conjugate, $f^*$, is written as $\langle f|$, which is called a bra. Then, by putting a bra and a ket together, and removing one vertical bar, we obtain the scalar product: $\langle f|$ and $|g\rangle$ are combined to obtain $\langle f|g\rangle$. Now, suppose that a function $|f\rangle$ in a Hilbert space H is expanded in an orthonormal basis $\{|\varphi_1\rangle, |\varphi_2\rangle, \ldots\}$. Then we have

$$|f\rangle = \sum_{j=1}^{\infty} a_j|\varphi_j\rangle = \sum_{j=1}^{\infty} |\varphi_j\rangle\langle\varphi_j|f\rangle = \left(\sum_{j=1}^{\infty} |\varphi_j\rangle\langle\varphi_j|\right)|f\rangle.$$

Since this applies to any $|f\rangle$ in a Hilbert space, it follows that

$$\sum_{j=1}^{\infty} |\varphi_j\rangle\langle\varphi_j| = I,$$

where I is the identity operator on H, defined by $I|f\rangle = |f\rangle$. The above summation is called a resolution of the identity. This is analogous to the resolution of the n × n identity matrix I in terms of orthonormal vectors in $\mathbb{R}^n$. If $\{\mathbf v_1, \mathbf v_2, \ldots, \mathbf v_n\}$ is an orthonormal basis for $\mathbb{R}^n$, then

$$\sum_{j=1}^{n} \mathbf v_j \mathbf v_j^* = I,$$

where, as before, $\mathbf v_j^*$ is the Hermitian transpose of $\mathbf v_j$, which is the transpose and complex conjugate of $\mathbf v_j$. Each term in the above summation is called an outer product, whereas the expression $\mathbf u^*\mathbf v$ is an inner product of $\mathbf u$ and $\mathbf v$, which is actually the dot product.
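A small numerical check (my own sketch, not from the notes) of the finite-dimensional resolution of the identity: build an orthonormal basis of $\mathbb{R}^3$ from the columns of a QR factorization and confirm that the sum of outer products is the identity matrix.

```python
import numpy as np

# Orthonormal basis of R^3: the columns of Q from a QR factorization of a random matrix
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

# Sum of outer products v_j v_j^* over the basis vectors
resolution = sum(np.outer(Q[:, j], Q[:, j].conj()) for j in range(3))
print(np.allclose(resolution, np.eye(3)))   # True: the sum of outer products is the identity
```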
