INNER PRODUCT SPACES AND THE GRAM-SCHMIDT PROCESS
A. HAVENS
1. The Dot Product and Orthogonality

1.1. Review of the Dot Product. We first recall the notion of the dot product, which gives us a familiar example of an inner product structure on the real vector space R^n. This product is connected to the Euclidean geometry of R^n, via lengths and angles measured in R^n. Later, we will introduce inner product spaces in general, and use their structure to define general notions of length and angle on other vector spaces.
Definition 1.1. The dot product of real n-vectors in the Euclidean vector space R^n is the scalar product · : R^n × R^n → R given by the rule
\[
(u, v) = \left( \sum_{i=1}^n u_i e_i,\ \sum_{i=1}^n v_i e_i \right) \mapsto \sum_{i=1}^n u_i v_i \,.
\]
Here B_S := (e_1, ..., e_n) is the standard basis of R^n. With respect to our conventions on basis and matrix multiplication, we may also express the dot product as the matrix-vector product
\[
u^t v = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix} \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}.
\]
It is a good exercise to verify the following proposition.
Proposition 1.1. Let u, v, w ∈ R^n be any real n-vectors, and s, t ∈ R be any scalars. The Euclidean dot product (u, v) ↦ u · v satisfies the following properties.
(i.) The dot product is symmetric: u · v = v · u.
(ii.) The dot product is bilinear:
• (su) · v = s(u · v) = u · (sv),
• (u + v) · w = u · w + v · w.
Thus in particular, for fixed w, the maps x ↦ w · x and x ↦ x · w are linear maps valued in R.
(iii.) The dot product is positive definite: u · u ≥ 0 with equality if and only if u = 0.
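These properties can be spot-checked numerically. The following plain-Python sketch (the vectors and the helper `dot` are illustrative choices of mine, not notation from the text) tests symmetry, bilinearity in each slot, and positive definiteness on sample vectors:

```python
import math

def dot(u, v):
    """Euclidean dot product of two real n-vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

u, v, w = [1.0, 2.0, 3.0], [4.0, -1.0, 0.5], [2.0, 2.0, -2.0]
s = 3.0

symmetric = (dot(u, v) == dot(v, u))                                   # (i.)
homogeneous = math.isclose(dot([s * x for x in u], v), s * dot(u, v))  # (ii.)
additive = math.isclose(dot([a + b for a, b in zip(u, v)], w),
                        dot(u, w) + dot(v, w))                         # (ii.)
positive = dot(u, u) > 0                                               # (iii.)
```

Of course a finite check is not a proof; the exercises below ask for the general verification from the coordinate formula.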
In physics and engineering contexts, where vectors are often defined as being mathematical objects embodying a direction together with a notion of magnitude or length, the dot product reveals its power by allowing the computation of lengths and angles. For example, in three dimensions we see that the dot product of a vector with itself gives the sum of the squares of the components: u · u = u_1^2 + u_2^2 + u_3^2. Since the component vectors relative to the standard basis are mutually perpendicular, by the Pythagorean theorem we deduce that u · u is the square of the Euclidean length of u. This motivates the following definition:
Definition 1.2. The magnitude, or length, of a vector v ∈ R^n is the quantity
\[
\|v\| := \sqrt{v \cdot v} = \left( \sum_{i=1}^n v_i^2 \right)^{1/2}.
\]
By positive definiteness of the dot product, ‖v‖ is a well defined real number associated to the n-vector v. In particular, ‖·‖ : R^n → R defines a norm on R^n, as it satisfies the following properties for all u, v ∈ R^n and s ∈ R:
(i.) non-degeneracy: ‖u‖ ≥ 0 with equality if and only if u = 0,
(ii.) absolute homogeneity: ‖su‖ = |s| ‖u‖,
(iii.) sub-additivity: ‖u + v‖ ≤ ‖u‖ + ‖v‖.
The inequality in (iii.) is called the triangle inequality.
Speaking of triangles, there is a further connection of the norm to triangles, which leads to the fact that we can extract angles using dot products. To any pair of nonparallel vectors u and v we obtain a triangle
\[
\triangle(u, v) := \{ t_1 u + t_2 v \mid t_1, t_2 \in [0, 1],\ t_1 + t_2 \le 1 \}
\]
with sides u, v and third side the line segment (1 − t)u + tv = u + t(v − u), t ∈ [0, 1]. By the law of cosines,
\[
\|v - u\|^2 = \|u\|^2 + \|v\|^2 - 2 \|u\| \|v\| \cos\theta
\]
where θ is the interior angle of the triangle △(u, v) between its edges u and v. Using that
\[
\|v - u\|^2 = (v - u) \cdot (v - u) = u \cdot u - 2\, u \cdot v + v \cdot v = \|u\|^2 + \|v\|^2 - 2\, u \cdot v \,,
\]
we see that u · v = ‖u‖‖v‖ cos θ. On the other hand, if u and v are collinear, the dot product is easily seen to be ±‖u‖‖v‖, with positive sign if and only if u = sv for a positive scalar s. (Recall, two vectors are parallel if and only if one is a scalar multiple of the other.) Thus we have the following proposition giving a geometric, “coordinate-free” interpretation of the dot product:
Proposition 1.2. Let u, v ∈ R^n, and let θ ∈ [0, π] be the measure of the angle between the vectors u and v, as measured in a plane containing both u and v. Then the dot product is the scalar
\[
u \cdot v = \|u\| \|v\| \cos\theta \,.
\]
In particular, u and v are collinear if and only if |u · v| = ‖u‖‖v‖, and otherwise θ ∈ (0, π), and there is a uniquely determined plane containing u and v equal to span{u, v}. More generally, the Cauchy-Schwarz inequality holds:
\[
|u \cdot v| \le \|u\| \|v\| \,.
\]
The Cauchy-Schwarz inequality in this case follows readily from the fact that |cos θ| ≤ 1. In the extreme case, the left hand side of the Cauchy-Schwarz inequality may be 0. Geometrically, this requires either that one of the vectors be the zero vector, or that cos θ = 0, and thus θ = π/2. Thus, in particular, two nonzero vectors in R^n are perpendicular if and only if their dot product is 0.
Definition 1.3. Two vectors u and v are said to be orthogonal if and only if u · v = 0.
Note that 0 is orthogonal to all vectors, and also parallel to all vectors. Observe that the dot product allows us to calculate angles directly from the components of vectors. In particular, the angle θ between nonzero vectors u and v is given by
\[
\theta = \arccos \left( \frac{u \cdot v}{\|u\| \|v\|} \right).
\]
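As a quick numerical illustration of the arccosine formula (the vectors below are arbitrary choices of mine), familiar angles are recovered; a sketch in plain Python:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle theta in [0, pi] between nonzero vectors u and v."""
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

right_angle = angle([1.0, 0.0], [0.0, 2.0])   # orthogonal pair: pi/2
theta45 = angle([1.0, 0.0], [1.0, 1.0])       # pi/4
```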
Definition 1.4. A vector u ∈ R^n is called a unit vector if and only if ‖u‖ = 1. A nonzero vector u may be normalized to a unit vector û by scaling:
\[
\hat{u} = \frac{u}{\|u\|} \,.
\]
The set of all unit vectors in R^n forms a set called the (n − 1)-dimensional unit sphere:
\[
S^{n-1} := \{ x \in \mathbb{R}^n : \|x\| = 1 \} = \{ x \in \mathbb{R}^n : x \cdot x = 1 \} \,.
\]
The reason it is called (n − 1)-dimensional, rather than n-dimensional, is akin to the reason a plane is considered 2-dimensional. A point of a plane is determined by two scalars (the weights needed to locate a position via a linear combination of vectors in a basis spanning the plane), while a point x ∈ S^{n−1} ↪ R^n is determined by n − 1 scalars, since any n − 1 components of x determine the last component via the condition x · x = 1.
3 Spring 2018 M235.4 -Linear Algebra: Inner Product Spaces Havens
Exercises

(1) Compute all possible dot products between the following vectors:
\[
a = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad
b = \begin{bmatrix} -2 \\ 3 \\ -3 \end{bmatrix}, \quad
c = \begin{bmatrix} -1 \\ 4 \\ 7 \end{bmatrix}, \quad
u = \begin{bmatrix} 1/2 \\ -1/3 \\ 1/6 \end{bmatrix}, \quad
v = \begin{bmatrix} 0 \\ 9 \\ 9 \end{bmatrix}, \quad
w = \begin{bmatrix} -7 \\ 6 \\ 0 \end{bmatrix}.
\]
(2) Verify proposition 1.1 above directly using the coordinate formula given for the dot product.
(3) Give a concrete example showing that the dot product does not have the cancellation property, i.e. show that u · v = u · w does not imply that v = w. For a fixed pair u, v ∈ R3, describe geometrically the set of all vectors w ∈ R3 such that u·w = u·v.
(4) For real numbers, it is well known that multiplication satisfies an associative property: (ab)c = abc = a(bc) for any a, b, c ∈ R. Why is there no associative property of the dot product? What’s wrong with writing a · (b · c) = a · b · c = (a · b) · c?
(5) A regular tetrahedron is a solid in R3 with four faces, each of which is an equilateral triangle. Find the angles between the faces of a tetrahedron, which are dihedral angles (a dihedral angle is an angle between the faces of a polyhedron).
(6) By a diagonal of a cube, we mean the line segment from one vertex of a cube to the farthest vertex across the cube. By a diagonal of a cube’s face, we mean the diagonal of the square face from one vertex to the opposite vertex of that face. (a) Find the lengths of the diagonals of a cube and diagonals of faces in terms of the side length of a cube. (b) Find the angles between a diagonal of a cube and an adjacent edge of the cube. (c) Each diagonal of the cube is adjacent to how many face diagonals? Find the angle between a diagonal of a cube and an adjacent face diagonal.
(7) Prove that, for any u, v ∈ R^n,
\[
2 \|u\|^2 + 2 \|v\|^2 = \|u + v\|^2 + \|u - v\|^2 \,,
\]
and
\[
u \cdot v = \frac{1}{4} \left( \|u + v\|^2 - \|u - v\|^2 \right).
\]
(8) Consider linearly independent vectors u, v ∈ R^2, and let P be the parallelogram whose sides they span. Under what conditions are the diagonals of P orthogonal?
(9) Demonstrate via vector algebra that the diagonals of a parallelogram always bisect each other.
1.2. Orthogonal Sets and Orthonormal Bases.
Definition 1.5. A finite collection {u_1, ..., u_p} ⊂ R^n of vectors is called an orthogonal set if u_i · u_j = 0 whenever i ≠ j.
Example 1.1. The standard basis of R^n, B_S := (e_1, ..., e_n), is an (ordered) orthogonal set.

Proposition 1.3. Any orthogonal set S of nonzero n-vectors is linearly independent and thus a basis for the subspace span S. Moreover, if S = {u_1, ..., u_p} and y ∈ span S, then
\[
y = \sum_{i=1}^p \left( \frac{u_i \cdot y}{u_i \cdot u_i} \right) u_i \,.
\]
Proof. For any S = {u_1, ..., u_p} ⊂ R^n − {0}, if 0 = \sum_{i=1}^p c_i u_i, then for any j = 1, ..., p,
\[
0 = u_j \cdot 0 = u_j \cdot \left( \sum_{i=1}^p c_i u_i \right) = \sum_{i=1}^p c_i (u_i \cdot u_j) = c_j (u_j \cdot u_j) \,.
\]
By assumption u_j ≠ 0, and thus by positive definiteness u_j · u_j ≠ 0, whence c_j must be zero. Thus for each j = 1, ..., p, c_j = 0, and thus the only way to build 0 as a linear combination of the vectors of S is as the trivial combination. This establishes that S is a linearly independent set, whence it is a basis of the subspace span S.
If y ∈ span S, then there is a linear combination of the u_i's expressing y: y = \sum_{i=1}^p a_i u_i. Then, in analogy to the above computation, we find that y · u_j = a_j (u_j · u_j) for any j = 1, ..., p, whence the coefficients expressing y in the basis S are a_i = (u_i · y)/(u_i · u_i). We thus call orthogonal sets orthogonal bases of the subspaces they span.
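The coefficient formula of Proposition 1.3 is easy to watch in action. In the sketch below (the basis and vector are illustrative choices of mine), the coefficients c_i = (u_i · y)/(u_i · u_i) rebuild y from an orthogonal but non-normalized basis of R^2:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# An orthogonal (but not orthonormal) basis of R^2.
u1, u2 = [3.0, 4.0], [-4.0, 3.0]
y = [7.0, 1.0]

# Coefficients from Proposition 1.3: c_i = (u_i . y) / (u_i . u_i).
c1 = dot(u1, y) / dot(u1, u1)
c2 = dot(u2, y) / dot(u2, u2)

# Rebuild y as c1*u1 + c2*u2.
reconstructed = [c1 * a + c2 * b for a, b in zip(u1, u2)]
```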
Definition 1.6. An orthogonal set S is called an orthonormal basis of a subspace W ⊂ R^n if S ⊂ S^{n−1} and W = span S. That is, S is an orthonormal basis of W if it is an orthogonal basis of W, all of whose elements are unit vectors.
The standard basis again furnishes the standard example. We will see soon that we can produce orthonormal bases in a systematic way from a generic basis. Moreover, we’ll uncover a connection between orthonormal bases and geometric transformations such as rotations and reflections.
1.3. Orthogonal Complements and Projections.
Definition 1.7. The orthogonal complement of a subspace W ⊂ R^n is the set W^⊥ of vectors which are orthogonal to all vectors in W:
\[
W^\perp = \{ x \in \mathbb{R}^n : x \cdot w = 0 \text{ for all } w \in W \} \,.
\]
Proposition 1.4. Let W ⊂ R^n be any subspace. Then W^⊥ is a subspace of R^n. Moreover, the following statements are equivalent:
(1) y ∈ W^⊥,
(2) y · v = 0 for every v ∈ S, where W = span S,
(3) y ∈ Nul B^t for any matrix B whose columns are an orthogonal basis of W,
(4) ‖y − w‖ = ‖y + w‖ for every w ∈ W,
(5) ‖y − w‖, as w ranges over W, is minimized if and only if w = 0.
The proof is left as an exercise, as is the proof of the next theorem:
Theorem 1.1. Let A ∈ R^{m×n}. Then the orthogonal complement (Row A)^⊥ of the row space Row A is equal to the null space Nul A, and the left nullspace Nul A^t is equal to the orthogonal complement (Col A)^⊥ of the column space Col A.
Definition 1.8. For a nonzero u ∈ R^n, the orthogonal projection of a vector v ∈ R^n onto the line ℓ = span{u} is the vector
\[
\mathrm{proj}_\ell(v) = \mathrm{proj}_u(v) := \left( \frac{u \cdot v}{u \cdot u} \right) u \,.
\]
Observation 1.1. The orthogonal projection onto ℓ = span{u} is the normalized outer product of u with itself, acting on v; if û = u/‖u‖, then
\[
(\hat{u} \otimes \hat{u})(v) = (\hat{u} \cdot v)\, \hat{u} = \mathrm{proj}_{\hat{u}}(v) = \mathrm{proj}_u(v) \,.
\]
More generally, the following proposition tells us that we can project orthogonally onto a subspace.
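Definition 1.8 is a one-liner in code. A short plain-Python sketch (the helper name and the sample vectors are mine): project v onto span{u} and check that the residual is orthogonal to u.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj_onto_line(u, v):
    """Orthogonal projection of v onto the line spanned by nonzero u."""
    c = dot(u, v) / dot(u, u)
    return [c * x for x in u]

u = [1.0, 0.0, 0.0]
v = [3.0, 4.0, 5.0]
p = proj_onto_line(u, v)                   # component of v along u
residual = [a - b for a, b in zip(v, p)]   # v - proj_u(v), orthogonal to u
```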
Proposition 1.5. Given a subspace W ⊆ Rn and any vector y ∈ Rn, there is a unique decomposition
\[
y = \mathrm{proj}_W(y) + \mathrm{proj}_{W^\perp}(y) \,,
\]
where proj_W(y) ∈ W and proj_{W^⊥}(y) ∈ W^⊥.
This decomposition realizes Rn as an internal direct sum of vector spaces W ⊕ W ⊥, which is to say that every vector in Rn can be uniquely described by summing a vector w ∈ W and a vector w0 ∈ W ⊥.
Proof. Take any orthogonal basis S = {u_1, ..., u_p} of W, and construct the vectors proj_W(y), proj_{W^⊥}(y) by the formulae
\[
\mathrm{proj}_W(y) := \sum_{i=1}^p \mathrm{proj}_{u_i}(y) \,, \qquad \mathrm{proj}_{W^\perp}(y) := y - \mathrm{proj}_W(y) \,.
\]
It is evident that proj_W(y) ∈ W, and to check that proj_{W^⊥}(y) ∈ W^⊥, we just take the dot product with any u_j ∈ S:
\[
u_j \cdot \mathrm{proj}_{W^\perp}(y) = u_j \cdot \left( y - \sum_{i=1}^p \mathrm{proj}_{u_i}(y) \right) = u_j \cdot y - \sum_{i=1}^p \frac{u_i \cdot y}{u_i \cdot u_i}\, (u_i \cdot u_j) = u_j \cdot y - u_j \cdot y = 0 \,.
\]
For uniqueness, let w = proj_W(y), w′ = y − w, and assume that y = w̃ + w̃′ for w̃ ∈ W and w̃′ ∈ W^⊥. Then w̃ + w̃′ = w + w′, and thus
\[
w - \tilde{w} = \tilde{w}' - w' \in W \cap W^\perp = \{0\} \,.
\]
Thus w = w̃ and w′ = w̃′. ∎
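The construction in the proof can be sketched computationally: sum the line projections onto an orthogonal basis of W, and check that the leftover piece is orthogonal to W. (The subspace and vector below are my own illustrative choices.)

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj_subspace(y, ortho_basis):
    """Project y onto span(ortho_basis); the basis must be an orthogonal set."""
    p = [0.0] * len(y)
    for u in ortho_basis:
        c = dot(u, y) / dot(u, u)
        p = [pi + c * ui for pi, ui in zip(p, u)]
    return p

W_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # the xy-plane in R^3
y = [2.0, -3.0, 5.0]
w = proj_subspace(y, W_basis)                   # component of y in W
w_perp = [a - b for a, b in zip(y, w)]          # component of y in W-perp
```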
Proposition 1.6. The orthogonal projection w = proj_W(y) of y onto a subspace W yields the distance-minimizing vector in W from y:
\[
\|y - w\| < \|y - v\| \quad \text{for any } v \in W - \{w\} \,.
\]
Proof. Assume v ∈ W − {w}, where w = proj_W(y). Then y − v = (y − w) + (w − v), and y − w ∈ W^⊥ while w − v ∈ W − {0}, whence by the Pythagorean theorem
\[
\|y - v\|^2 = \|y - w\|^2 + \|w - v\|^2 > \|y - w\|^2 \,. \qquad \blacksquare
\]
This may be interpreted as follows: for a given subspace W ⊆ R^n and any vector y ∈ R^n, the vector w = proj_W(y) is the “best approximation” of y in the subspace W. The complementary orthogonal component y − w is the shortest possible remainder: among all decompositions of y as a sum of a vector in W and a remainder, the remainder y − w has minimum length.
Exercises (1) Prove proposition 1.4.
(2) Prove theorem 1.1.
(3) Give a formula for a reflection through a subspace W ⊆ Rn using projection.
(4) Let W ⊂ R^n be a proper nontrivial subspace. Consider the sequence of maps
\[
0 \to W \to \mathbb{R}^n \to W^\perp \to 0 \,,
\]
where the first two arrows are given by inclusion, the map to W^⊥ is given by projection, and the final map is the trivial map.
(a) Show that the image of each map is the kernel of the next map. Such a chain of maps, where the kernel of each map is the image of the previous map, is called an exact sequence, and in this case, where there are four maps and the first and last spaces of the sequence are both trivial, the sequence is called a short exact sequence.
(b) For a general short exact sequence of linear maps of vector spaces
\[
0 \to V \to W \to U \to 0 \,,
\]
argue that the first two arrows are injective maps, and the last two arrows are surjective maps. Show that W ≅ V ⊕ U, i.e., that every w ∈ W can be written uniquely as a sum of elements such that one is in the image of the map coming from V and the other is in the complement of the kernel of the map from W.
1.4. The Orthogonal Groups. We will now study matrices that encode an orthonormal basis (initially, of a subspace, but the truly interesting cases are square matrices).
Proposition 1.7. Let {u_1, ..., u_p} be an orthonormal basis of a subspace W ⊆ R^n and let U = [u_1 ... u_p]. Then for any y ∈ R^n,
\[
\mathrm{proj}_W(y) = U U^t y = (U \otimes U) y \,.
\]
The proof is an exercise in unpacking the notations and definitions of the previous subsections.
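Proposition 1.7 says that, once the basis is orthonormal, projection is a single matrix product. A minimal sketch, using a hypothetical plane in R^3 whose orthonormal basis is e_1, e_2 (my choice of example):

```python
# U has the orthonormal basis vectors of W as its columns (here a 3x2 matrix,
# stored as a list of rows).
U = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]
y = [2.0, -3.0, 5.0]

# Compute U^t y (the coordinates of the projection in the basis)...
Ut_y = [sum(U[i][j] * y[i] for i in range(3)) for j in range(2)]
# ...then U (U^t y) = U U^t y, the projection of y onto W.
proj_y = [sum(U[i][j] * Ut_y[j] for j in range(2)) for i in range(3)]
```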
Definition 1.9. A matrix U ∈ R^{n×n} is said to be an orthogonal matrix if U^t U = I_n.
The motive for this definition is as follows: one may ask “which invertible matrices U ∈ R^{n×n} preserve the dot product, in the sense that (Ux) · (Uy) = x · y for all vectors x, y ∈ R^n?” It is then easy to see that one requires U^t U = I_n in order for the transformation x ↦ Ux to preserve the dot product. Moreover, this condition requires the columns to form an orthonormal basis: since
\[
U^t U = \begin{bmatrix}
u_1 \cdot u_1 & \cdots & u_1 \cdot u_j & \cdots & u_1 \cdot u_n \\
\vdots & \ddots & \vdots & & \vdots \\
u_i \cdot u_1 & \cdots & u_i \cdot u_j & \cdots & u_i \cdot u_n \\
\vdots & & \vdots & \ddots & \vdots \\
u_n \cdot u_1 & \cdots & u_n \cdot u_j & \cdots & u_n \cdot u_n
\end{bmatrix} = I_n \,,
\]
one has that ‖u_i‖^2 = u_i · u_i = 1 and u_i · u_j = 0 if i ≠ j.
The set of real orthogonal n × n matrices is denoted O(n), and is an example of a classical matrix group (and also a Lie group). By a group, we mean a mathematical set endowed with an associative binary product for which there is a unique identity element and an inverse for each element. It is easy to check that O(n) becomes a group with the binary operation being matrix multiplication. Observe that any orthogonal matrix has inverse equal to its transpose, and consequently has determinant ±1. Thus, the transformations x ↦ Ux for U ∈ O(n) are volume preserving. If the determinant is positive, they are also orientation preserving. We can define a subgroup SO(n) ⊂ O(n) of matrices which are orthogonal and have determinant 1, called special orthogonal matrices.
Consider the case of SO(3). Any matrix U ∈ SO(3) is one whose columns are a right-handed orthonormal basis of R^3. Thus, we can understand U as the matrix of a spatial rotation, carrying the standard basis onto a new orthonormal right-handed frame specified by its columns. But how can we produce such matrices? One strategy, which is unique to R^3, is to choose any pair of orthogonal vectors, say u_1 and u_2, normalize them to û_1 = u_1/‖u_1‖ and û_2 = u_2/‖u_2‖, and to define û_3 = û_1 × û_2. Another way to get 3 × 3 special orthogonal matrices is described in the exercises. Finally, one can start with any positively oriented basis, and modify it to create an orthonormal one. It is this last method which is amenable to generalization to dimensions greater than 3.
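Numerically, dot-product preservation and the determinant condition are easy to check on a sample rotation; the sketch below uses a 2×2 rotation matrix (my own choice of example):

```python
import math

theta = 0.3
# Columns of U are an orthonormal basis of R^2; det U = +1, so U lies in SO(2).
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = [1.0, 2.0], [-3.0, 0.5]
# (Ux) . (Uy) should equal x . y, up to floating-point roundoff.
preserves_dot = math.isclose(dot(matvec(U, x), matvec(U, y)), dot(x, y))
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]
```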
Exercises (1) Prove proposition 1.7.
(2) Prove that O(n) is a group. In particular, you must show that a product of orthogonal matrices is again orthogonal. Show moreover that SO(n) is a subgroup of O(n) by showing that a product of special orthogonal matrices is a special orthogonal matrix, and that all necessary inverses and the identity are present.
(3) In analogy to the subspace test for vector spaces and using the previous problem as your starting model, devise and prove a subgroup test to determine whether a subset H of a group G is itself a group with respect to the operation it inherits from G.
(4) In this exercise you will derive a general matrix formula for spatial rotations, called the Rodrigues formula, and then explore the structure of SO(3). Fix a unit axial vector û ∈ R^3 and an angle ϕ ∈ [0, 2π).
(a) Show that the rotation of a vector x about the axis span{û} by an angle of ϕ counterclockwise (relative to the view from the tip of û towards 0) is given by
\[
R^{\hat{u}}_\varphi(x) = \left( 1 - \cos(\varphi) \right) \mathrm{proj}_{\hat{u}}(x) + \cos(\varphi)\, x + \sin(\varphi)\, \hat{u} \times x \,.
\]
This is the Rodrigues formula for spatial rotation.
(b) Use the preceding part to write out a matrix U such that R^û_ϕ(x) = Ux, in terms of the components of û and the angle ϕ.
(c) What are the eigenvalues and complex eigenvectors of the matrix U associated to R^û_ϕ? (Hint: do not try to compute the characteristic polynomial directly from the matrix.)
(5) Show that the elements of O(3) that are not in SO(3) decompose as products of a reflection matrix for reflection through a plane and a matrix in SO(3). Show that any matrix in O(3) decomposes as a product of just reflection matrices for reflections through a system of planes. What’s the maximum number of reflections needed?
1.5. Gram-Schmidt, Take 1. The main result of this section is that there is a process by which one can construct an orthonormal basis from a given basis. This process is known as the Gram-Schmidt process. Given a basis (v_1, ..., v_p) of W ⊆ R^n, we construct an orthonormal basis (û_1, ..., û_p) as follows:
(1) Set û_1 := v_1/‖v_1‖.
(2) Let u_2 := v_2 − proj_{û_1}(v_2). Set û_2 := u_2/‖u_2‖.
(3) For each k from 3 to p in succession, set u_k := v_k − \sum_{i=1}^{k-1} proj_{û_i}(v_k) and û_k := u_k/‖u_k‖.
(4) The set (û_1, ..., û_p) is the output, and is an orthonormal basis.
Observe that we could describe the algorithm in the following way:
(1) Set û_1 := v_1/‖v_1‖.
(2) Set U_1 = [û_1].
(3) For k ∈ {2, ..., p}, set u_k = (I − U_{k−1} U_{k−1}^t) v_k and û_k = u_k/‖u_k‖, then form U_k = [U_{k−1} û_k].
(4) The columns of U_p are the desired orthonormal basis.
Thus, in a given step, the algorithm works by computing and subsequently normalizing the component of v_k in the orthogonal complement of the subspace spanned by the previously built orthonormal vectors. This is accomplished by projecting v_k onto the subspace spanned by previously built orthonormal vectors, then subtracting that piece from v_k, normalizing, and appending the new unit vector û_k to the list.
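The steps above translate directly into code. This is a sketch of classical Gram-Schmidt in plain Python (helper names are mine); running it on the basis of Example 1.2 below reproduces the orthonormal basis computed there:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Turn a list of linearly independent vectors into an orthonormal basis."""
    ortho = []
    for v in vectors:
        u = list(v)
        for q in ortho:
            c = dot(q, v)                             # q is unit, so proj_q(v) = (q . v) q
            u = [ui - c * qi for ui, qi in zip(u, q)]
        n = math.sqrt(dot(u, u))                      # nonzero if input is independent
        ortho.append([ui / n for ui in u])
    return ortho

# The basis of Example 1.2.
Q = gram_schmidt([[3.0, -4.0, 0.0], [4.0, 3.0, 0.0], [0.0, 2.0, 1.0]])
```

In floating-point arithmetic one usually prefers the "modified" variant (projecting the running remainder u rather than the original v) for numerical stability; the version above mirrors the text's formula.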
Example 1.2. Construct an orthonormal basis via the Gram-Schmidt process from the basis
\[
B = \left( \begin{bmatrix} 3 \\ -4 \\ 0 \end{bmatrix},\ \begin{bmatrix} 4 \\ 3 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} \right)
\]
of R^3. Set
\[
\hat{u}_1 = \frac{1}{\sqrt{3^2 + 4^2}} \begin{bmatrix} 3 \\ -4 \\ 0 \end{bmatrix} = \begin{bmatrix} 3/5 \\ -4/5 \\ 0 \end{bmatrix}.
\]
Then
\[
u_2 = \begin{bmatrix} 4 \\ 3 \\ 0 \end{bmatrix} - \left( \begin{bmatrix} 4 \\ 3 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} 3/5 \\ -4/5 \\ 0 \end{bmatrix} \right) \begin{bmatrix} 3/5 \\ -4/5 \\ 0 \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \\ 0 \end{bmatrix} - \mathbf{0} = \begin{bmatrix} 4 \\ 3 \\ 0 \end{bmatrix}.
\]
The first two vectors were already orthogonal! We then set û_2 = (4/5, 3/5, 0)^t, and
\[
u_3 := \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} - \left( \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 3/5 \\ -4/5 \\ 0 \end{bmatrix} \right) \begin{bmatrix} 3/5 \\ -4/5 \\ 0 \end{bmatrix} - \left( \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 4/5 \\ 3/5 \\ 0 \end{bmatrix} \right) \begin{bmatrix} 4/5 \\ 3/5 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} + \frac{8}{5} \begin{bmatrix} 3/5 \\ -4/5 \\ 0 \end{bmatrix} - \frac{6}{5} \begin{bmatrix} 4/5 \\ 3/5 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 24/25 \\ -32/25 \\ 0 \end{bmatrix} - \begin{bmatrix} 24/25 \\ 18/25 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.
\]
Then since ‖u_3‖ = 1, we can set û_3 := u_3, and our set
\[
\left( \begin{bmatrix} 3/5 \\ -4/5 \\ 0 \end{bmatrix},\ \begin{bmatrix} 4/5 \\ 3/5 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right)
\]
is an orthonormal basis of R^3. Observe that û_3 = û_1 × û_2.
Example 1.3. Orthonormalize the set
\[
\left\{ \begin{bmatrix} 1 \\ -1 \\ 1 \\ -1 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix} \right\}
\]
to obtain an orthonormal basis of the subspace of R^4 spanned by these vectors.
Begin by setting
\[
\hat{u}_1 = \frac{1}{2} \begin{bmatrix} 1 \\ -1 \\ 1 \\ -1 \end{bmatrix},
\]
and let U_1 = [û_1]. Then
\[
U_1 U_1^t = \frac{1}{4} \begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix}.
\]
Set
\[
u_2 = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix} - U_1 U_1^t \begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix} - \begin{bmatrix} -3/4 \\ 3/4 \\ -3/4 \\ 3/4 \end{bmatrix} = \begin{bmatrix} 3/4 \\ 1/4 \\ -1/4 \\ 1/4 \end{bmatrix}.
\]
Thus ‖u_2‖ = √3/2, so set
\[
\hat{u}_2 = \frac{2}{\sqrt{3}}\, u_2 = \begin{bmatrix} \sqrt{3}/2 \\ \sqrt{3}/6 \\ -\sqrt{3}/6 \\ \sqrt{3}/6 \end{bmatrix}
\]
and let U_2 = [û_1 û_2]. Then
\[
U_2 U_2^t = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1/3 & -1/3 & 1/3 \\ 0 & -1/3 & 1/3 & -1/3 \\ 0 & 1/3 & -1/3 & 1/3 \end{bmatrix},
\]
so we set
\[
u_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix} - U_2 U_2^t \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix} - \begin{bmatrix} 0 \\ -2/3 \\ 2/3 \\ -2/3 \end{bmatrix} = \begin{bmatrix} 0 \\ 2/3 \\ 1/3 \\ -1/3 \end{bmatrix},
\]
with ‖u_3‖ = √6/3, and
\[
\hat{u}_3 = \frac{3}{\sqrt{6}}\, u_3 = \frac{\sqrt{6}}{6} \begin{bmatrix} 0 \\ 2 \\ 1 \\ -1 \end{bmatrix}.
\]
The set
\[
\left\{ \frac{1}{2} \begin{bmatrix} 1 \\ -1 \\ 1 \\ -1 \end{bmatrix},\ \frac{\sqrt{3}}{6} \begin{bmatrix} 3 \\ 1 \\ -1 \\ 1 \end{bmatrix},\ \frac{\sqrt{6}}{6} \begin{bmatrix} 0 \\ 2 \\ 1 \\ -1 \end{bmatrix} \right\}
\]
is the desired orthonormal basis of the subspace spanned by the original vectors.
2. Inner Product Spaces

2.1. Symmetric Bilinear Positive Definite Products. We will now generalize the notion of a dot product to that of an inner product, which will allow us to define orthogonality, projections, and other geometric concepts in the general setting of a real vector space V.
Definition. Let V be a real vector space. The structure of an inner product on V is a scalar-valued bilinear, symmetric, positive definite pairing
\[
\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R} \,.
\]
That is,
(i.) for any fixed v ∈ V, the maps ⟨v, ·⟩ : V → R and ⟨·, v⟩ : V → R are linear,
(ii.) ⟨u, v⟩ = ⟨v, u⟩ for every u, v ∈ V,
(iii.) ⟨v, v⟩ ≥ 0 with equality if and only if v = 0.
The vector space V becomes an inner product space (V, ⟨·, ·⟩) when endowed with such a scalar product. As with the Euclidean dot product, one obtains a norm ‖v‖ := √⟨v, v⟩, called the 2-norm.
(Note that ⟨v, v⟩^{1/p} does not define a norm for p ≠ 2, since it fails absolute homogeneity; among the p-norms on R^n, only the 2-norm arises from an inner product in this way.) Orthogonality and the theory of orthogonal projections can be carried out just as with the dot product within the context of an inner product space. One again has a Cauchy-Schwarz inequality:
Proposition 2.1. Let (V, ⟨·, ·⟩) be an inner product space. Then
\[
|\langle u, v \rangle| \le \|u\| \|v\| \,,
\]
for any u, v ∈ V, where the norm ‖·‖ is defined by ‖v‖ := √⟨v, v⟩ for any v ∈ V.
Here’s a “cute proof” that uses only elementary algebra and the defining properties of inner products.
Proof. Let p(t) = ⟨u − tv, u − tv⟩ for arbitrary vectors u, v ∈ V, with v ≠ 0 (if v = 0 the inequality is immediate). Then expanding the inner product by appealing to bilinearity and symmetry, we have that
\[
p(t) = \langle u, u \rangle - 2 \langle u, v \rangle t + \langle v, v \rangle t^2 = \|v\|^2 t^2 - 2 \langle u, v \rangle t + \|u\|^2 \,.
\]
By positive definiteness, p(t) ≥ 0 for all t. The minimum value of p is attained at the vertex, where t = ⟨u, v⟩/‖v‖^2. Thus
\[
p\!\left( \frac{\langle u, v \rangle}{\|v\|^2} \right) = \|v\|^2 \left( \frac{\langle u, v \rangle}{\|v\|^2} \right)^2 - 2 \langle u, v \rangle \left( \frac{\langle u, v \rangle}{\|v\|^2} \right) + \|u\|^2 = -\frac{\langle u, v \rangle^2}{\|v\|^2} + \|u\|^2 \ge 0
\implies \langle u, v \rangle^2 \le \|u\|^2 \|v\|^2 \,. \qquad \blacksquare
\]
Exercises (1) For each definition and theorem from section 1, write out the analogous definitions and theorems for inner product spaces.
(2) Let (V, ⟨·, ·⟩) be an inner product space. Let ‖·‖ be the 2-norm induced by the inner product ⟨·, ·⟩. Use the Cauchy-Schwarz inequality to prove the triangle inequality
\[
\|u + v\| \le \|u\| + \|v\| \,.
\]
(3) Give another argument for the Cauchy-Schwarz inequality, using the idea of best approximation in a subspace.
2.2. Examples.
Example 2.1. A matrix A ∈ R^{n×n} is said to be a positive definite square matrix if and only if for every vector x ∈ R^n, x^t A x ≥ 0, with equality if and only if x = 0. A positive definite matrix may have negative entries. For example,
\[
A = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}
\]
can be shown to be positive definite. Given a symmetric positive definite matrix A ∈ R^{n×n}, one can define an inner product ⟨x, y⟩_A = x^t A y for any x, y ∈ R^n. The standard inner product structure on R^n, the dot product, is given when A is chosen to be the identity matrix.
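A small sketch of the pairing ⟨x, y⟩_A = x^t A y using the tridiagonal matrix of Example 2.1, checking symmetry and positivity on sample vectors of my choosing:

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# The symmetric positive definite matrix from Example 2.1.
A = [[ 2.0, -1.0,  0.0],
     [-1.0,  2.0, -1.0],
     [ 0.0, -1.0,  2.0]]

def inner_A(x, y):
    """<x, y>_A = x^t A y."""
    return dot(x, matvec(A, y))

x, y = [1.0, 2.0, -1.0], [0.0, 1.0, 1.0]
quad = inner_A(x, x)                       # positive for x != 0
symmetric = (inner_A(x, y) == inner_A(y, x))
```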
Example 2.2. Consider the space P_n of real polynomials in a single variable, of degree ≤ n. Fix a positive number a ∈ R^+. An inner product structure, dependent on a, is given by
\[
\langle p(t), q(t) \rangle_a := \int_{-a}^{a} p(t)\, q(t) \, dt \,.
\]
Two polynomials can then be said to be orthogonal over the interval [−a, a] if ⟨p(t), q(t)⟩_a = 0. Orthogonal polynomials are important to constructing the solutions to many differential equations. More generally, after refining what it means to be integrable, one can use the integral to define an inner product and a norm on the space of square integrable functions over a domain of R. This idea is of crucial importance in the study of functional analysis and has applications to the solutions of many differential equations that occur in mathematical physics.
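For polynomials this integral can be evaluated exactly in terms of coefficients, since ∫_{-a}^{a} t^k dt vanishes for odd k and equals 2a^{k+1}/(k+1) for even k. A sketch (the coefficient-list representation is my own convention, not the text's):

```python
def poly_inner(p, q, a):
    """<p, q>_a = integral over [-a, a] of p(t) q(t) dt, computed exactly.
    Polynomials are coefficient lists [c0, c1, ...] for c0 + c1 t + ..."""
    # Multiply the two polynomials.
    prod = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            prod[i + j] += pi * qj
    # Integrate term by term; odd powers vanish over the symmetric interval.
    return sum(2.0 * c * a ** (k + 1) / (k + 1)
               for k, c in enumerate(prod) if k % 2 == 0)

# t and t^2 are orthogonal over [-1, 1]; 1 and t^2 are not.
ip_odd = poly_inner([0.0, 1.0], [0.0, 0.0, 1.0], 1.0)   # <t, t^2>_1
ip_even = poly_inner([1.0], [0.0, 0.0, 1.0], 1.0)       # <1, t^2>_1 = 2/3
```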
Example 2.3. The space of real trigonometric polynomials T_n of degree ≤ n has an inner product structure given by
\[
\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\, g(t) \, dt \,.
\]
With this inner product and the 2-norm ‖f‖ = √⟨f, f⟩, the set
\[
\left( \frac{1}{\sqrt{2}},\ \sin t,\ \cos t,\ \sin 2t,\ \cos 2t,\ \ldots,\ \sin nt,\ \cos nt \right)
\]
gives an orthonormal basis of T_n, which is a subspace of the space of square-integrable functions over [−π, π].
If f(t) is a square integrable function on the interval [−π, π], then we can project onto Tn to obtain the nth Fourier approximation of f:
\[
f_n(t) = \mathrm{proj}_{T_n} f(t) = \frac{a_0}{\sqrt{2}} + b_1 \sin(t) + c_1 \cos(t) + \cdots + b_n \sin(nt) + c_n \cos(nt) \,,
\]
where
\[
a_0 = \left\langle f(t), \frac{1}{\sqrt{2}} \right\rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{f(t)}{\sqrt{2}} \, dt \,, \qquad
b_k = \langle f(t), \sin(kt) \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(kt) \, dt \,, \qquad
c_k = \langle f(t), \cos(kt) \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(kt) \, dt \,.
\]
Observe that the constant term
\[
\frac{a_0}{\sqrt{2}} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \, dt
\]
is the average value of f(t) on the interval [−π, π]. Fourier series are a crucial tool in modern analysis, particularly of periodic functions. They were initially conceived by Joseph Fourier when he was trying to analytically express the general solution to the heat equation (a certain partial differential equation describing the dissipation of heat; more generally, “heat equations” occur throughout mathematics when modeling dissipative systems).
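The coefficient integrals can be approximated with a simple uniform Riemann sum, which is extremely accurate for smooth periodic integrands. A sketch (the test function below is my own choice); applied to a trigonometric polynomial, the sum recovers its coefficients:

```python
import math

def fourier_coeff(f, kind, k, n_samples=256):
    """Approximate (1/pi) * integral over [-pi, pi] of f(t) * sin(kt) or cos(kt) dt
    using a midpoint Riemann sum."""
    h = 2.0 * math.pi / n_samples
    total = 0.0
    for i in range(n_samples):
        t = -math.pi + (i + 0.5) * h
        basis = math.sin(k * t) if kind == "sin" else math.cos(k * t)
        total += f(t) * basis
    return total * h / math.pi

f = lambda t: math.sin(2.0 * t) + 3.0 * math.cos(t)   # already a trig polynomial
b2 = fourier_coeff(f, "sin", 2)                        # should recover 1
c1 = fourier_coeff(f, "cos", 1)                        # should recover 3
```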