
Chapter 3

Inner Products and Norms

The geometry of Euclidean space relies on the familiar properties of length and angle. The abstract concept of a norm on a vector space formalizes the geometrical notion of the length of a vector. In Euclidean geometry, the angle between two vectors is governed by their dot product, which is itself formalized by the abstract concept of an inner product. Inner products and norms lie at the heart of analysis, both linear and nonlinear, in both finite-dimensional vector spaces and infinite-dimensional function spaces. It is impossible to overemphasize their importance for theoretical developments, for practical applications, and for the design of numerical solution algorithms. We begin this chapter with a discussion of the basic properties of inner products, illustrated by some of the most important examples.

Analysis is founded on inequalities. The most basic is the Cauchy–Schwarz inequality, which is valid in any inner product space. The more familiar triangle inequality for the associated norm is then derived as a simple consequence. Not every norm arises from an inner product, and, for more general norms, the triangle inequality becomes part of the definition. Both inequalities retain their validity in both finite-dimensional and infinite-dimensional vector spaces. Indeed, their abstract formulation helps us focus on the key ideas in the proof, avoiding distracting complications resulting from the explicit formulas.

In $\mathbb{R}^n$, the characterization of general inner products will lead us to an extremely important class of matrices. Positive definite matrices play a key role in a variety of applications, including minimization problems, least squares, mechanical systems, electrical circuits, and the differential equations describing dynamical processes. Later, we will generalize the notion of positive definiteness to more general linear operators, governing the ordinary and partial differential equations arising in mechanics and dynamics. Positive definite matrices most commonly appear in so-called Gram matrix form, consisting of the inner products between selected elements of an inner product space. The test for positive definiteness is based on Gaussian elimination. Indeed, the associated matrix factorization can be reinterpreted as the process of completing the square for the associated quadratic form.

So far, we have confined our attention to real vector spaces. Complex numbers, vectors, and functions also play an important role in applications, and so, in the final section, we formally introduce complex vector spaces. Most of the formulation proceeds in direct analogy with the real version, but the notions of inner product and norm on complex vector spaces require some thought. Applications of complex vector spaces and their inner products are of particular importance in Fourier analysis and signal processing, and absolutely essential in modern quantum mechanics.


Figure 3.1. The Euclidean Norm in R2 and R3.

3.1. Inner Products.

The most basic example of an inner product is the familiar dot product
\[ \langle v ; w \rangle = v \cdot w = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n = \sum_{i=1}^n v_i w_i, \tag{3.1} \]
between (column) vectors $v = (v_1, v_2, \dots, v_n)^T$, $w = (w_1, w_2, \dots, w_n)^T$ lying in the Euclidean space $\mathbb{R}^n$. An important observation is that the dot product (3.1) can be identified with the matrix product
\[ v \cdot w = v^T w = (v_1\ \ v_2\ \ \dots\ \ v_n) \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} \tag{3.2} \]
between a row vector $v^T$ and a column vector $w$.

The dot product is the cornerstone of Euclidean geometry. The key fact is that the dot product of a vector with itself,
\[ v \cdot v = v_1^2 + v_2^2 + \cdots + v_n^2, \]
is the sum of the squares of its entries, and hence, as a consequence of the classical Pythagorean Theorem, equal to the square of its length; see Figure 3.1. Consequently, the Euclidean norm or length of a vector is found by taking the square root:

\[ \|v\| = \sqrt{v \cdot v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}. \tag{3.3} \]
Note that every nonzero vector $v \ne 0$ has positive length, $\|v\| > 0$, while only the zero vector has length $\|0\| = 0$. The dot product and Euclidean norm satisfy certain evident properties, and these serve to inspire the abstract definition of more general inner products.

Definition 3.1. An inner product on the real vector space $V$ is a pairing that takes two vectors $v, w \in V$ and produces a real number $\langle v ; w \rangle \in \mathbb{R}$. The inner product is required to satisfy the following three axioms for all $u, v, w \in V$, and $c, d \in \mathbb{R}$.

(i) Bilinearity:
\[ \langle c\,u + d\,v ; w \rangle = c\,\langle u ; w \rangle + d\,\langle v ; w \rangle, \qquad \langle u ; c\,v + d\,w \rangle = c\,\langle u ; v \rangle + d\,\langle u ; w \rangle. \tag{3.4} \]
(ii) Symmetry:
\[ \langle v ; w \rangle = \langle w ; v \rangle. \tag{3.5} \]
(iii) Positivity:
\[ \langle v ; v \rangle > 0 \quad \text{whenever} \quad v \ne 0, \qquad \text{while} \quad \langle 0 ; 0 \rangle = 0. \tag{3.6} \]

A vector space equipped with an inner product is called an inner product space. As we shall see, a given vector space can admit many different inner products. Verification of the inner product axioms for the Euclidean dot product is straightforward, and left to the reader.

Given an inner product, the associated norm of a vector $v \in V$ is defined as the positive square root of the inner product of the vector with itself:

\[ \|v\| = \sqrt{\langle v ; v \rangle}. \tag{3.7} \]
The positivity axiom implies that $\|v\| \ge 0$ is real and non-negative, and equals 0 if and only if $v = 0$ is the zero vector.
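For readers who like to experiment numerically, here is a minimal sketch (Python with NumPy, not part of the original text) that checks the identification (3.2) of the dot product with a matrix product and computes the Euclidean norm (3.3).

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, -1.0, 2.0])

# Dot product (3.1) and its matrix-product form (3.2): v^T w.
dot_direct = np.sum(v * w)          # v1 w1 + v2 w2 + ... + vn wn
dot_matrix = v @ w                  # row vector times column vector

# Euclidean norm (3.3): square root of the dot product of v with itself.
norm_v = np.sqrt(v @ v)

print(dot_direct, dot_matrix)       # both equal 8.0
print(norm_v, np.linalg.norm(v))    # both equal sqrt(14)
```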

Example 3.2. While certainly the most basic inner product on $\mathbb{R}^2$, the dot product $v \cdot w = v_1 w_1 + v_2 w_2$ is by no means the only possibility. A simple example is provided by the weighted inner product

\[ \langle v ; w \rangle = 2\,v_1 w_1 + 5\,v_2 w_2, \qquad v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}, \quad w = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}. \tag{3.8} \]
Let us verify that this formula does indeed define an inner product. The symmetry axiom (3.5) is immediate. Moreover,

\[ \begin{aligned} \langle c\,u + d\,v ; w \rangle &= 2\,(c\,u_1 + d\,v_1)\,w_1 + 5\,(c\,u_2 + d\,v_2)\,w_2 \\ &= c\,(2\,u_1 w_1 + 5\,u_2 w_2) + d\,(2\,v_1 w_1 + 5\,v_2 w_2) = c\,\langle u ; w \rangle + d\,\langle v ; w \rangle, \end{aligned} \]
which verifies the first bilinearity condition; the second follows by a very similar computation. (Or, one can use the symmetry axiom to deduce the second bilinearity identity from the first; see Exercise .) Moreover, $\langle 0 ; 0 \rangle = 0$, while
\[ \langle v ; v \rangle = 2\,v_1^2 + 5\,v_2^2 > 0 \qquad \text{whenever} \quad v \ne 0, \]
since at least one of the summands is strictly positive, verifying the positivity requirement (3.6). This serves to establish (3.8) as a legitimate inner product on $\mathbb{R}^2$. The associated weighted norm $\|v\| = \sqrt{2\,v_1^2 + 5\,v_2^2}$ defines an alternative, "non-Pythagorean" notion of length of vectors and distance between points in the plane.

A less evident example of an inner product on $\mathbb{R}^2$ is provided by the expression
\[ \langle v ; w \rangle = v_1 w_1 - v_1 w_2 - v_2 w_1 + 4\,v_2 w_2. \tag{3.9} \]

Bilinearity is verified in the same manner as before, and symmetry is obvious. Positivity is ensured by noticing that
\[ \langle v ; v \rangle = v_1^2 - 2\,v_1 v_2 + 4\,v_2^2 = (v_1 - v_2)^2 + 3\,v_2^2 \ge 0, \]
and is strictly positive for all $v \ne 0$. Therefore, (3.9) defines another inner product on $\mathbb{R}^2$, with associated norm $\|v\| = \sqrt{v_1^2 - 2\,v_1 v_2 + 4\,v_2^2}$.

Example 3.3. Let $c_1, \dots, c_n$ be a set of positive numbers. The corresponding weighted inner product and weighted norm on $\mathbb{R}^n$ are defined by

\[ \langle v ; w \rangle = \sum_{i=1}^n c_i\,v_i\,w_i, \qquad \|v\| = \sqrt{\langle v ; v \rangle} = \sqrt{\sum_{i=1}^n c_i\,v_i^2}. \tag{3.10} \]
The numbers $c_i > 0$ are the weights. The larger the weight $c_i$, the more the $i$th coordinate of $v$ contributes to the norm. Weighted norms are particularly important in statistics and data fitting, where one wants to emphasize certain quantities and de-emphasize others; this is done by assigning suitable weights to the different components of the data vector $v$. Section 4.3 on least squares approximation methods will contain further details.
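As a concrete illustration (a Python/NumPy sketch, not part of the original text), the weighted inner products of Examples 3.2 and 3.3 can be evaluated directly; the weights below are the ones from (3.8).

```python
import numpy as np

def weighted_inner(v, w, c):
    """Weighted inner product (3.10): sum_i c_i v_i w_i, with all weights c_i > 0."""
    return np.sum(c * v * w)

def weighted_norm(v, c):
    """Associated weighted norm: square root of the weighted inner product of v with itself."""
    return np.sqrt(weighted_inner(v, v, c))

c = np.array([2.0, 5.0])            # weights from (3.8)
v = np.array([1.0, -2.0])
w = np.array([3.0, 1.0])

print(weighted_inner(v, w, c))       # 2*1*3 + 5*(-2)*1 = -4
print(weighted_norm(v, c))           # sqrt(2*1 + 5*4) = sqrt(22)
```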

Inner Products on Function Spaces

Inner products and norms on function spaces play an absolutely essential role in modern analysis and its applications, particularly Fourier analysis, boundary value problems, ordinary and partial differential equations, and numerical analysis. Let us introduce the most important examples.

Example 3.4. Let $[a,b] \subset \mathbb{R}$ be a bounded closed interval. Consider the vector space $C^0[a,b]$ consisting of all continuous functions $f(x)$ defined for $a \le x \le b$. The integral of the product of two continuous functions

\[ \langle f ; g \rangle = \int_a^b f(x)\,g(x)\,dx \tag{3.11} \]
defines an inner product on the vector space $C^0[a,b]$, as we shall prove below. The associated norm is, according to the basic definition (3.7),

\[ \|f\| = \sqrt{\int_a^b f(x)^2\,dx}, \tag{3.12} \]
and is known as the $L^2$ norm of the function $f$ over the interval $[a,b]$. The $L^2$ inner product and norm of functions can be viewed as the infinite-dimensional function space versions of the dot product and Euclidean norm of vectors in $\mathbb{R}^n$.

For example, if we take $[a,b] = [0, \tfrac{1}{2}\pi]$, then the $L^2$ inner product between $f(x) = \sin x$ and $g(x) = \cos x$ is equal to

\[ \langle \sin x ; \cos x \rangle = \int_0^{\pi/2} \sin x\,\cos x\,dx = \tfrac{1}{2}\,\sin^2 x\,\Big|_{x=0}^{\pi/2} = \tfrac{1}{2}. \]
Similarly, the norm of the function $\sin x$ is

\[ \|\sin x\| = \sqrt{\int_0^{\pi/2} (\sin x)^2\,dx} = \sqrt{\frac{\pi}{4}}. \]
One must always be careful when evaluating function norms. For example, the constant function $c(x) \equiv 1$ has norm
\[ \|1\| = \sqrt{\int_0^{\pi/2} 1^2\,dx} = \sqrt{\frac{\pi}{2}}, \]
not 1 as you might have expected. We also note that the value of the norm depends upon which interval the integral is taken over. For instance, on the longer interval $[0,\pi]$,

\[ \|1\| = \sqrt{\int_0^{\pi} 1^2\,dx} = \sqrt{\pi}. \]
Thus, when dealing with the $L^2$ inner product or norm, one must always be careful to specify the function space, or, equivalently, the interval on which it is being evaluated.

Let us prove that formula (3.11) does, indeed, define an inner product. First, we need to check that $\langle f ; g \rangle$ is well-defined. This follows because the product $f(x)\,g(x)$ of two continuous functions is also continuous, and hence its integral over a bounded interval is defined and finite. The symmetry requirement is immediate:

\[ \langle f ; g \rangle = \int_a^b f(x)\,g(x)\,dx = \langle g ; f \rangle, \]
because multiplication of functions is commutative. The first bilinearity axiom,
\[ \langle c\,f + d\,g ; h \rangle = c\,\langle f ; h \rangle + d\,\langle g ; h \rangle, \]
amounts to the following elementary integral identity

\[ \int_a^b \bigl[\,c\,f(x) + d\,g(x)\,\bigr]\,h(x)\,dx = c \int_a^b f(x)\,h(x)\,dx + d \int_a^b g(x)\,h(x)\,dx, \]
valid for arbitrary continuous functions $f, g, h$ and scalars (constants) $c, d$. The second bilinearity axiom is proved similarly; alternatively, one can use symmetry to deduce it from the first as in Exercise . Finally, positivity requires that

\[ \|f\|^2 = \langle f ; f \rangle = \int_a^b f(x)^2\,dx \ge 0. \]
This is clear because $f(x)^2 \ge 0$, and the integral of a nonnegative function is nonnegative. Moreover, since the function $f(x)^2$ is continuous and nonnegative, its integral will vanish, $\int_a^b f(x)^2\,dx = 0$, if and only if $f(x) \equiv 0$ is the zero function, cf. Exercise . This completes the demonstration that (3.11) defines a bona fide inner product on the function space $C^0[a,b]$.
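The $L^2$ computations of Example 3.4 are easy to reproduce by numerical quadrature; the following sketch (Python with SciPy assumed available, not part of the original text) checks $\langle \sin x ; \cos x \rangle = \tfrac12$ and $\|\sin x\| = \sqrt{\pi/4}$ on $[0, \tfrac12\pi]$, as well as the norm of the constant function on $[0,\pi]$.

```python
import numpy as np
from scipy.integrate import quad

def l2_inner(f, g, a, b):
    """L^2 inner product (3.11): integral of f(x) g(x) over [a, b]."""
    value, _ = quad(lambda x: f(x) * g(x), a, b)
    return value

def l2_norm(f, a, b):
    """L^2 norm (3.12): square root of <f ; f>."""
    return np.sqrt(l2_inner(f, f, a, b))

a, b = 0.0, np.pi / 2
print(l2_inner(np.sin, np.cos, a, b))    # 0.5
print(l2_norm(np.sin, a, b))             # sqrt(pi/4) ~ 0.8862
print(l2_norm(lambda x: 1.0, 0, np.pi))  # sqrt(pi) ~ 1.7725, as in the text
```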


Figure 3.2. Angle Between Two Vectors.

Remark: The $L^2$ inner product formula can also be applied to more general functions, but we have restricted our attention to continuous functions in order to avoid certain technical complications. The most general function space admitting this important inner product is known as Hilbert space, which forms the foundation for modern analysis, [126], including Fourier analysis, [51], and also lies at the heart of modern quantum mechanics, [100, 104, 122]. One does need to be extremely careful when trying to extend the inner product to more general functions. Indeed, there are nonzero, discontinuous functions with zero "$L^2$ norm". An example is
\[ f(x) = \begin{cases} 1, & x = 0, \\ 0, & \text{otherwise}, \end{cases} \qquad \text{which satisfies} \qquad \|f\|^2 = \int_{-1}^{1} f(x)^2\,dx = 0, \tag{3.13} \]
because any function which is zero except at finitely many (or even countably many) points has zero integral. We will discuss some of the details of the Hilbert space construction in Chapters 12 and 13.

The $L^2$ inner product is but one of a vast number of important inner products on function space. For example, one can also define weighted inner products on the function space $C^0[a,b]$. The weights along the interval are specified by a (continuous) positive scalar function $w(x) > 0$. The corresponding weighted inner product and norm are

\[ \langle f ; g \rangle = \int_a^b f(x)\,g(x)\,w(x)\,dx, \qquad \|f\| = \sqrt{\int_a^b f(x)^2\,w(x)\,dx}. \tag{3.14} \]
The verification of the inner product axioms in this case is left as an exercise for the reader. As in the finite-dimensional versions, weighted inner products play a key role in statistics and data analysis.

3.2. Inequalities.

There are two absolutely fundamental inequalities that are valid for any inner product on any vector space. The first is inspired by the geometric interpretation of the dot product

on Euclidean space in terms of the angle between vectors. It is named† after two of the founders of modern analysis, Augustin Cauchy and Hermann Schwarz, who established it in the case of the $L^2$ inner product on function space. The more familiar triangle inequality, that the length of any side of a triangle is bounded by the sum of the lengths of the other two sides, is, in fact, an immediate consequence of the Cauchy–Schwarz inequality, and hence also valid for any norm based on an inner product. We will present these two inequalities in their most general, abstract form, since this brings their essence into the spotlight. Specializing to different inner products and norms on both finite-dimensional and infinite-dimensional vector spaces leads to a wide variety of striking and useful particular cases.

The Cauchy–Schwarz Inequality

In two and three-dimensional Euclidean geometry, the dot product between two vectors can be geometrically characterized by the equation

\[ v \cdot w = \|v\|\,\|w\|\,\cos\theta, \tag{3.15} \]
where $\theta$ measures the angle between the vectors $v$ and $w$, as drawn in Figure 3.2. Since

\[ |\cos\theta| \le 1, \]
the absolute value of the dot product is bounded by the product of the lengths of the vectors:
\[ |\,v \cdot w\,| \le \|v\|\,\|w\|. \]
This is the simplest form of the general Cauchy–Schwarz inequality. We present a simple, algebraic proof that does not rely on the geometrical notions of length and angle, and thus demonstrates its universal validity for any inner product.

Theorem 3.5. Every inner product satisfies the Cauchy–Schwarz inequality

\[ |\,\langle v ; w \rangle\,| \le \|v\|\,\|w\|, \qquad v, w \in V. \tag{3.16} \]
Here, $\|v\|$ is the associated norm, while $|\cdot|$ denotes the absolute value of real numbers. Equality holds if and only if $v$ and $w$ are parallel† vectors.

Proof: The case when $w = 0$ is trivial, since both sides of (3.16) are equal to 0. Thus, we may suppose $w \ne 0$. Let $t \in \mathbb{R}$ be an arbitrary scalar. Using the three basic inner product axioms, we have

\[ 0 \le \|v + t\,w\|^2 = \langle v + t\,w ; v + t\,w \rangle = \|v\|^2 + 2\,t\,\langle v ; w \rangle + t^2\,\|w\|^2, \tag{3.17} \]

† Russians also give credit for its discovery to their compatriot Viktor Bunyakovskii, and, indeed, many authors append his name to the inequality.

† Recall that two vectors are parallel if and only if one is a scalar multiple of the other. The zero vector is parallel to every other vector, by convention.

with equality holding if and only if $v = -\,t\,w$ — which requires $v$ and $w$ to be parallel vectors. We fix $v$ and $w$, and consider the right hand side of (3.17) as a quadratic function,
\[ 0 \le p(t) = a\,t^2 + 2\,b\,t + c, \qquad \text{where} \quad a = \|w\|^2, \quad b = \langle v ; w \rangle, \quad c = \|v\|^2, \]
of the scalar variable $t$. To get the maximum mileage out of the fact that $p(t) \ge 0$, let us look at where it assumes its minimum. This occurs when its derivative vanishes:
\[ p'(t) = 2\,a\,t + 2\,b = 0, \qquad \text{and thus at} \qquad t = -\,\frac{b}{a} = -\,\frac{\langle v ; w \rangle}{\|w\|^2}. \]
Substituting this particular minimizing value into (3.17), we find
\[ 0 \le \|v\|^2 - 2\,\frac{\langle v ; w \rangle^2}{\|w\|^2} + \frac{\langle v ; w \rangle^2}{\|w\|^2} = \|v\|^2 - \frac{\langle v ; w \rangle^2}{\|w\|^2}. \]
Rearranging this last inequality, we conclude that
\[ \frac{\langle v ; w \rangle^2}{\|w\|^2} \le \|v\|^2, \qquad \text{or} \qquad \langle v ; w \rangle^2 \le \|v\|^2\,\|w\|^2. \]
Taking the (positive) square root of both sides of the final inequality completes the proof of the Cauchy–Schwarz inequality (3.16). Q.E.D.

Given any inner product on a vector space, we can use the quotient
\[ \cos\theta = \frac{\langle v ; w \rangle}{\|v\|\,\|w\|} \tag{3.18} \]
to define the "angle" between the elements $v, w \in V$. The Cauchy–Schwarz inequality tells us that the ratio lies between $-1$ and $+1$, and hence the angle $\theta$ is well-defined, and, in fact, unique if we restrict it to lie in the range $0 \le \theta \le \pi$.

For example, using the standard dot product on $\mathbb{R}^3$, the angle between the vectors $v = (1, 0, 1)^T$ and $w = (0, 1, 1)^T$ is given by

\[ \cos\theta = \frac{1}{\sqrt{2}\,\sqrt{2}} = \frac{1}{2}, \qquad \text{and so} \qquad \theta = \tfrac{1}{3}\pi = 1.0472\ldots, \quad \text{i.e., } 60^\circ. \]
On the other hand, if we use the weighted inner product $\langle v ; w \rangle = v_1 w_1 + 2\,v_2 w_2 + 3\,v_3 w_3$, then
\[ \cos\theta = \frac{3}{2\sqrt{5}} = .67082\ldots, \qquad \text{whereby} \qquad \theta = .835482\ldots\,. \]
Thus, the measurement of angle (and length) is dependent upon the choice of an underlying inner product. Similarly, under the $L^2$ inner product on the interval $[0,1]$, the "angle" $\theta$ between the polynomials $p(x) = x$ and $q(x) = x^2$ is given by

\[ \cos\theta = \frac{\langle x ; x^2 \rangle}{\|x\|\,\|x^2\|} = \frac{\displaystyle\int_0^1 x^3\,dx}{\sqrt{\displaystyle\int_0^1 x^2\,dx}\;\sqrt{\displaystyle\int_0^1 x^4\,dx}} = \frac{\frac{1}{4}}{\sqrt{\frac{1}{3}}\,\sqrt{\frac{1}{5}}} = \sqrt{\frac{15}{16}}, \]

so that $\theta = 0.25268$ radians. Warning: One should not try to give this notion of angle between functions more significance than the formal definition warrants — it does not correspond to any "angular" properties of their graphs. Also, the value depends on the choice of inner product and the interval upon which it is being computed. For example, if we change to the $L^2$ inner product on the interval $[-1,1]$, then $\langle x ; x^2 \rangle = \int_{-1}^1 x^3\,dx = 0$, and hence (3.18) becomes $\cos\theta = 0$, so the "angle" between $x$ and $x^2$ is now $\theta = \tfrac{1}{2}\pi$.

Orthogonal Vectors

In Euclidean geometry, a particularly noteworthy configuration occurs when two vectors are perpendicular, which means that they meet at a right angle: $\theta = \tfrac{1}{2}\pi$ or $\tfrac{3}{2}\pi$, and so $\cos\theta = 0$. The angle formula (3.15) implies that the vectors $v, w$ are perpendicular if and only if their dot product vanishes: $v \cdot w = 0$. Perpendicularity also plays a key role in general inner product spaces, but, for historical reasons, has been given a more suggestive name.

Definition 3.6. Two elements $v, w \in V$ of an inner product space $V$ are called orthogonal if their inner product vanishes: $\langle v ; w \rangle = 0$.

Orthogonality is a remarkably powerful tool in all applications of linear algebra, and often serves to dramatically simplify many computations. We will devote all of Chapter 5 to a detailed exploration of its implications.

Example 3.7. The vectors $v = (1, 2)^T$ and $w = (6, -3)^T$ are orthogonal with respect to the Euclidean dot product in $\mathbb{R}^2$, since $v \cdot w = 1 \cdot 6 + 2 \cdot (-3) = 0$. We deduce that they meet at a $90^\circ$ angle. However, these vectors are not orthogonal with respect to the weighted inner product (3.8):
\[ \langle v ; w \rangle = \Bigl\langle \begin{pmatrix} 1 \\ 2 \end{pmatrix} ; \begin{pmatrix} 6 \\ -3 \end{pmatrix} \Bigr\rangle = 2 \cdot 1 \cdot 6 + 5 \cdot 2 \cdot (-3) = -18 \ne 0. \]
Thus, the property of orthogonality, like angles in general, depends upon which inner product is being used.

Example 3.8. The polynomials $p(x) = x$ and $q(x) = x^2 - \tfrac{1}{2}$ are orthogonal with respect to the inner product $\langle p ; q \rangle = \int_0^1 p(x)\,q(x)\,dx$ on the interval $[0,1]$, since
\[ \bigl\langle x ; x^2 - \tfrac{1}{2} \bigr\rangle = \int_0^1 x\,\bigl(x^2 - \tfrac{1}{2}\bigr)\,dx = \int_0^1 \bigl(x^3 - \tfrac{1}{2}\,x\bigr)\,dx = 0. \]
They fail to be orthogonal on most other intervals. For example, on the interval $[0,2]$,

\[ \bigl\langle x ; x^2 - \tfrac{1}{2} \bigr\rangle = \int_0^2 x\,\bigl(x^2 - \tfrac{1}{2}\bigr)\,dx = \int_0^2 \bigl(x^3 - \tfrac{1}{2}\,x\bigr)\,dx = 3. \]

The Triangle Inequality

The familiar triangle inequality states that the length of one side of a triangle is at most equal to the sum of the lengths of the other two sides. Referring to Figure 3.3, if the


Figure 3.3. Triangle Inequality.

first two sides are represented by vectors $v$ and $w$, then the third corresponds to their sum $v + w$, and so $\|v + w\| \le \|v\| + \|w\|$. The triangle inequality is a direct consequence of the Cauchy–Schwarz inequality, and hence holds for any inner product space.

Theorem 3.9. The norm associated with an inner product satisfies the triangle inequality
\[ \|v + w\| \le \|v\| + \|w\| \tag{3.19} \]
for every $v, w \in V$. Equality holds if and only if $v$ and $w$ are parallel vectors.

Proof: We compute
\[ \|v + w\|^2 = \langle v + w ; v + w \rangle = \|v\|^2 + 2\,\langle v ; w \rangle + \|w\|^2 \le \|v\|^2 + 2\,\|v\|\,\|w\| + \|w\|^2 = \bigl(\|v\| + \|w\|\bigr)^2, \]
where the inequality follows from Cauchy–Schwarz. Taking square roots of both sides and using positivity completes the proof. Q.E.D.

Example 3.10. The vectors $v = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}$ and $w = \begin{pmatrix} 2 \\ 0 \\ 3 \end{pmatrix}$ sum to $v + w = \begin{pmatrix} 3 \\ 2 \\ 2 \end{pmatrix}$. Their Euclidean norms are $\|v\| = \sqrt{6}$ and $\|w\| = \sqrt{13}$, while $\|v + w\| = \sqrt{17}$. The triangle inequality (3.19) in this case says $\sqrt{17} \le \sqrt{6} + \sqrt{13}$, which is valid.

Example 3.11. Consider the functions $f(x) = x - 1$ and $g(x) = x^2 + 1$. Using the $L^2$ norm on the interval $[0,1]$, we find
\[ \|f\| = \sqrt{\int_0^1 (x-1)^2\,dx} = \sqrt{\tfrac{1}{3}}, \qquad \|g\| = \sqrt{\int_0^1 (x^2+1)^2\,dx} = \sqrt{\tfrac{28}{15}}, \qquad \|f + g\| = \sqrt{\int_0^1 (x^2+x)^2\,dx} = \sqrt{\tfrac{31}{30}}. \]
The triangle inequality requires $\sqrt{\tfrac{31}{30}} \le \sqrt{\tfrac{1}{3}} + \sqrt{\tfrac{28}{15}}$, which is true.

The Cauchy–Schwarz and triangle inequalities look much more impressive when written out in full detail. For the Euclidean inner product (3.1), they are

\[ \Bigl|\,\sum_{i=1}^n v_i\,w_i\,\Bigr| \le \sqrt{\sum_{i=1}^n v_i^2}\;\sqrt{\sum_{i=1}^n w_i^2}, \qquad \sqrt{\sum_{i=1}^n (v_i + w_i)^2} \le \sqrt{\sum_{i=1}^n v_i^2} + \sqrt{\sum_{i=1}^n w_i^2}. \tag{3.20} \]
Theorems 3.5 and 3.9 imply that these inequalities are valid for arbitrary real numbers $v_1, \dots, v_n, w_1, \dots, w_n$. For the $L^2$ inner product (3.12) on function space, they produce the following splendid integral inequalities:

\[ \Bigl|\,\int_a^b f(x)\,g(x)\,dx\,\Bigr| \le \sqrt{\int_a^b f(x)^2\,dx}\;\sqrt{\int_a^b g(x)^2\,dx}, \qquad \sqrt{\int_a^b \bigl[\,f(x) + g(x)\,\bigr]^2\,dx} \le \sqrt{\int_a^b f(x)^2\,dx} + \sqrt{\int_a^b g(x)^2\,dx}, \tag{3.21} \]
which hold for arbitrary continuous (and, in fact, rather general) functions. The first of these is the original Cauchy–Schwarz inequality, whose proof appeared to be quite deep when it first appeared. Only after the abstract notion of an inner product space was properly formalized did its innate simplicity and generality become evident.
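A quick numerical sanity check of (3.20) and of the angle formula (3.18) — a Python/NumPy sketch, not part of the original text — might look as follows.

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal(5)
w = rng.standard_normal(5)

# Cauchy-Schwarz inequality (3.20): |<v;w>| <= ||v|| ||w||.
assert abs(v @ w) <= np.linalg.norm(v) * np.linalg.norm(w)

# Triangle inequality (3.20): ||v + w|| <= ||v|| + ||w||.
assert np.linalg.norm(v + w) <= np.linalg.norm(v) + np.linalg.norm(w)

# Angle (3.18) between v = (1,0,1) and w = (0,1,1) under the dot product: 60 degrees.
v3 = np.array([1.0, 0.0, 1.0])
w3 = np.array([0.0, 1.0, 1.0])
cos_theta = (v3 @ w3) / (np.linalg.norm(v3) * np.linalg.norm(w3))
print(np.degrees(np.arccos(cos_theta)))   # 60.0
```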

3.3. Norms.

Every inner product gives rise to a norm that can be used to measure the magnitude or length of the elements of the underlying vector space. However, not every norm that is used in analysis and applications arises from an inner product. To define a general norm on a vector space, we will extract those properties that do not directly rely on the inner product structure.

Definition 3.12. A norm on the vector space $V$ assigns a real number $\|v\|$ to each vector $v \in V$, subject to the following axioms for all $v, w \in V$, and $c \in \mathbb{R}$:
(i) Positivity: $\|v\| \ge 0$, with $\|v\| = 0$ if and only if $v = 0$.
(ii) Homogeneity: $\|c\,v\| = |c|\,\|v\|$.
(iii) Triangle inequality: $\|v + w\| \le \|v\| + \|w\|$.

As we now know, every inner product gives rise to a norm. Indeed, positivity of the norm is one of the inner product axioms. The homogeneity property follows since

\[ \|c\,v\| = \sqrt{\langle c\,v ; c\,v \rangle} = \sqrt{c^2\,\langle v ; v \rangle} = |c|\,\sqrt{\langle v ; v \rangle} = |c|\,\|v\|. \]
Finally, the triangle inequality for an inner product norm was established in Theorem 3.9. Here are some important examples of norms that do not come from inner products.

Example 3.13. Let $V = \mathbb{R}^n$. The 1–norm of a vector $v = (v_1\ v_2\ \dots\ v_n)^T$ is defined as the sum of the absolute values of its entries:

\[ \|v\|_1 = |v_1| + |v_2| + \cdots + |v_n|. \tag{3.22} \]
The max or $\infty$–norm is equal to the maximal entry (in absolute value):

\[ \|v\|_\infty = \max\,\bigl\{\,|v_1|, |v_2|, \dots, |v_n|\,\bigr\}. \tag{3.23} \]
Verification of the positivity and homogeneity properties for these two norms is straightforward; the triangle inequality is a direct consequence of the elementary inequality

\[ |a + b| \le |a| + |b| \]
for absolute values. The Euclidean norm, 1–norm, and $\infty$–norm on $\mathbb{R}^n$ are just three representatives of the general $p$–norm
\[ \|v\|_p = \Bigl(\,\sum_{i=1}^n |v_i|^p\,\Bigr)^{1/p}. \tag{3.24} \]
This quantity defines a norm for any $1 \le p < \infty$. The $\infty$–norm is a limiting case of (3.24) as $p \to \infty$. Note that the Euclidean norm (3.3) is the 2–norm, and is often designated as such; it is the only $p$–norm which comes from an inner product. The positivity and homogeneity properties of the $p$–norm are straightforward. The triangle inequality, however, is not trivial; in detail, it reads

\[ \Bigl(\,\sum_{i=1}^n |v_i + w_i|^p\,\Bigr)^{1/p} \le \Bigl(\,\sum_{i=1}^n |v_i|^p\,\Bigr)^{1/p} + \Bigl(\,\sum_{i=1}^n |w_i|^p\,\Bigr)^{1/p}, \tag{3.25} \]
and is known as Minkowski's inequality. A proof can be found in [97].

Example 3.14. There are analogous norms on the space $C^0[a,b]$ of continuous functions on an interval $[a,b]$. Basically, one replaces the previous sums by integrals. Thus, the $L^p$–norm is defined as

\[ \|f\|_p = \Bigl(\,\int_a^b |f(x)|^p\,dx\,\Bigr)^{1/p}. \tag{3.26} \]
In particular, the $L^1$ norm is given by integrating the absolute value of the function:

\[ \|f\|_1 = \int_a^b |f(x)|\,dx. \tag{3.27} \]
The $L^2$ norm (3.12) appears as a special case, $p = 2$, and, again, is the only one arising from an inner product. The proof of the general triangle or Minkowski inequality for $p \ne 1, 2$ is again not trivial, [97]. The limiting $L^\infty$ norm is defined by the maximum
\[ \|f\|_\infty = \max\,\bigl\{\,|f(x)| \,:\, a \le x \le b\,\bigr\}. \tag{3.28} \]
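NumPy's norm routine covers the vector $p$–norms (3.24) directly, and the $L^p$ norms (3.26) can be approximated by quadrature; here is a small sketch (Python with NumPy/SciPy, not part of the original text) comparing the 1–, 2–, and $\infty$–norms of one vector and of a sample function.

```python
import numpy as np
from scipy.integrate import quad

v = np.array([1.0, -2.0, 3.0])
for p in (1, 2, np.inf):
    print(p, np.linalg.norm(v, ord=p))   # 6.0, sqrt(14), 3.0

def lp_norm(f, a, b, p):
    """L^p norm (3.26) of f on [a, b], computed by numerical quadrature."""
    value, _ = quad(lambda x: abs(f(x)) ** p, a, b)
    return value ** (1.0 / p)

# L^1 and L^2 norms of f(x) = x - 1 on [0, 1]: 1/2 and sqrt(1/3).
f = lambda x: x - 1
print(lp_norm(f, 0.0, 1.0, 1), lp_norm(f, 0.0, 1.0, 2))
```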

Example 3.15. Consider the polynomial $p(x) = 3x^2 - 2$ on the interval $-1 \le x \le 1$. Its $L^2$ norm is

\[ \|p\|_2 = \sqrt{\int_{-1}^1 (3x^2 - 2)^2\,dx} = \sqrt{\frac{18}{5}} = 1.8974\ldots\,. \]
Its $L^\infty$ norm is
\[ \|p\|_\infty = \max\,\bigl\{\,|3x^2 - 2| \,:\, -1 \le x \le 1\,\bigr\} = 2, \]
with the maximum occurring at $x = 0$. Finally, its $L^1$ norm is
\[ \begin{aligned} \|p\|_1 &= \int_{-1}^1 |3x^2 - 2|\,dx \\ &= \int_{-1}^{-\sqrt{2/3}} (3x^2 - 2)\,dx + \int_{-\sqrt{2/3}}^{\sqrt{2/3}} (2 - 3x^2)\,dx + \int_{\sqrt{2/3}}^1 (3x^2 - 2)\,dx \\ &= \Bigl(\tfrac{4}{3}\sqrt{\tfrac{2}{3}} - 1\Bigr) + \tfrac{8}{3}\sqrt{\tfrac{2}{3}} + \Bigl(\tfrac{4}{3}\sqrt{\tfrac{2}{3}} - 1\Bigr) = \tfrac{16}{3}\sqrt{\tfrac{2}{3}} - 2 = 2.3546\ldots\,. \end{aligned} \]

Every norm defines a distance between vector space elements, namely
\[ d(v, w) = \|v - w\|. \tag{3.29} \]
For the standard dot product norm, we recover the usual notion of distance between points in Euclidean space. Other types of norms produce alternative (and sometimes quite useful) notions of distance that, nevertheless, satisfy all the familiar properties:
(a) Symmetry: $d(v, w) = d(w, v)$;
(b) $d(v, w) = 0$ if and only if $v = w$;
(c) The triangle inequality: $d(v, w) \le d(v, z) + d(z, w)$.

Unit Vectors

Let $V$ be a fixed normed vector space. The elements $u \in V$ with unit norm, $\|u\| = 1$, play a special role, and are known as unit vectors (or functions). The following easy lemma shows how to construct a unit vector pointing in the same direction as any given nonzero vector.

Lemma 3.16. If $v \ne 0$ is any nonzero vector, then the vector $u = v / \|v\|$ obtained by dividing $v$ by its norm is a unit vector parallel to $v$.

Proof: We compute, making use of the homogeneity property of the norm:
\[ \|u\| = \Bigl\|\,\frac{v}{\|v\|}\,\Bigr\| = \frac{\|v\|}{\|v\|} = 1. \qquad \text{Q.E.D.} \]

Example 3.17. The vector $v = (1, -2)^T$ has length $\|v\|_2 = \sqrt{5}$ with respect to the standard Euclidean norm. Therefore, the unit vector pointing in the same direction is

\[ u = \frac{v}{\|v\|_2} = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{5}} \\ -\frac{2}{\sqrt{5}} \end{pmatrix}. \]

On the other hand, for the 1 norm, $\|v\|_1 = 3$, and so
\[ \widetilde{u} = \frac{v}{\|v\|_1} = \frac{1}{3}\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} \frac{1}{3} \\ -\frac{2}{3} \end{pmatrix} \]
is the unit vector parallel to $v$ in the 1 norm. Finally, $\|v\|_\infty = 2$, and hence the corresponding unit vector for the $\infty$ norm is
\[ \widehat{u} = \frac{v}{\|v\|_\infty} = \frac{1}{2}\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} \\ -1 \end{pmatrix}. \]
Thus, the notion of unit vector will depend upon which norm is being used.

Example 3.18. Similarly, on the interval $[0,1]$, the quadratic polynomial $p(x) = x^2 - \tfrac{1}{2}$ has $L^2$ norm

\[ \|p\|_2 = \sqrt{\int_0^1 \bigl(x^2 - \tfrac{1}{2}\bigr)^2\,dx} = \sqrt{\int_0^1 \bigl(x^4 - x^2 + \tfrac{1}{4}\bigr)\,dx} = \sqrt{\frac{7}{60}}. \]
Therefore, $u(x) = \dfrac{p(x)}{\|p\|_2} = \sqrt{\dfrac{60}{7}}\,x^2 - \sqrt{\dfrac{15}{7}}$ is a "unit polynomial", $\|u\|_2 = 1$, which is "parallel" to (or, more correctly, a scalar multiple of) the polynomial $p$. On the other hand, for the $L^\infty$ norm,

\[ \|p\|_\infty = \max\,\bigl\{\,\bigl|x^2 - \tfrac{1}{2}\bigr| \,:\, 0 \le x \le 1\,\bigr\} = \tfrac{1}{2}, \]
and hence, in this case, $\widehat{u}(x) = 2\,p(x) = 2x^2 - 1$ is the corresponding unit function.

The unit sphere for the given norm is defined as the set of all unit vectors
\[ S_1 = \bigl\{\,\|u\| = 1\,\bigr\} \subset V. \tag{3.30} \]
Thus, the unit sphere for the Euclidean norm on $\mathbb{R}^n$ is the usual round sphere
\[ S_1 = \bigl\{\,\|x\|^2 = x_1^2 + x_2^2 + \cdots + x_n^2 = 1\,\bigr\}. \]
For the $\infty$ norm, it is the unit cube
\[ S_1 = \bigl\{\,x \in \mathbb{R}^n \,\bigm|\, x_1 = \pm 1 \ \text{or} \ x_2 = \pm 1 \ \text{or} \ \dots \ \text{or} \ x_n = \pm 1\,\bigr\}. \]
For the 1 norm, it is the unit diamond or "octahedron"

\[ S_1 = \bigl\{\,x \in \mathbb{R}^n \,\bigm|\, |x_1| + |x_2| + \cdots + |x_n| = 1\,\bigr\}. \]
See Figure 3.4 for the two-dimensional pictures.
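The different unit vectors of Example 3.17 can be produced by one normalization routine parametrized by the norm; a Python/NumPy sketch (not part of the original text):

```python
import numpy as np

def unit_vector(v, p):
    """Normalize v as in Lemma 3.16, using the p-norm (p = 1, 2, or np.inf)."""
    return v / np.linalg.norm(v, ord=p)

v = np.array([1.0, -2.0])
print(unit_vector(v, 2))        # [ 1/sqrt(5), -2/sqrt(5) ]
print(unit_vector(v, 1))        # [ 1/3, -2/3 ]
print(unit_vector(v, np.inf))   # [ 0.5, -1.0 ]
```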

In all cases, the closed unit ball $B_1 = \bigl\{\,\|u\| \le 1\,\bigr\}$ consists of all vectors of norm less than or equal to 1, and has the unit sphere as its boundary. If $V$ is a finite-dimensional normed vector space, then the unit ball $B_1$ forms a compact set, meaning that it is closed and bounded. This basic topological fact, which is not true in infinite-dimensional


Figure 3.4. Unit Balls and Spheres for the 1, 2 and ∞ Norms in R^2.

spaces, underscores the fundamental distinction between finite-dimensional vector spaces and the vastly more complicated infinite-dimensional realm.

Equivalence of Norms

While there are many different types of norms, in a finite-dimensional vector space they are all more or less equivalent. Equivalence does not mean that they assume the same value, but rather that they are, in a certain sense, always close to one another, and so for most analytical purposes can be used interchangeably. As a consequence, we may be able to simplify the analysis of a problem by choosing a suitably adapted norm.

Theorem 3.19. Let $\|\cdot\|_1$ and $\|\cdot\|_2$ be any two norms on $\mathbb{R}^n$. Then there exist positive constants $c^\star, C^\star > 0$ such that

\[ c^\star\,\|v\|_1 \le \|v\|_2 \le C^\star\,\|v\|_1 \qquad \text{for every} \quad v \in \mathbb{R}^n. \tag{3.31} \]
Proof: We just sketch the basic idea, leaving the details to a more rigorous real analysis course, cf. [125, 126]. We begin by noting that a norm defines a continuous function $f(v) = \|v\|$ on $\mathbb{R}^n$. (Continuity is, in fact, a consequence of the triangle inequality.) Let $S_1 = \bigl\{\,\|u\|_1 = 1\,\bigr\}$ denote the unit sphere of the first norm. Any continuous function defined on a compact set achieves both a maximum and a minimum value. Thus, restricting the second norm function to the unit sphere $S_1$ of the first norm, we can set
\[ c^\star = \|u^\star\|_2 = \min\,\bigl\{\,\|u\|_2 \,\bigm|\, u \in S_1\,\bigr\}, \qquad C^\star = \|U^\star\|_2 = \max\,\bigl\{\,\|u\|_2 \,\bigm|\, u \in S_1\,\bigr\}, \tag{3.32} \]
for certain vectors $u^\star, U^\star \in S_1$. Note that $0 < c^\star \le C^\star < \infty$, with equality holding if and only if the norms are the same. The minimum and maximum (3.32) will serve as the constants in the desired inequalities (3.31). Indeed, by definition,

\[ c^\star \le \|u\|_2 \le C^\star \qquad \text{when} \qquad \|u\|_1 = 1, \tag{3.33} \]
and so (3.31) is valid for all $u \in S_1$. To prove the inequalities in general, assume $v \ne 0$. (The case $v = 0$ is trivial.) Lemma 3.16 says that $u = v / \|v\|_1 \in S_1$ is a unit vector in the first norm: $\|u\|_1 = 1$. Moreover, by the homogeneity property of the norm, $\|u\|_2 = \|v\|_2 / \|v\|_1$. Substituting into (3.33) and clearing denominators completes the proof of (3.31). Q.E.D.

Figure 3.5. Equivalence of Norms (∞ norm and 2 norm; 1 norm and 2 norm).

Example 3.20. For example, consider the Euclidean norm $\|\cdot\|_2$ and the max norm $\|\cdot\|_\infty$ on $\mathbb{R}^n$. According to (3.32), the bounding constants are found by minimizing and maximizing $\|u\|_\infty = \max\,\{\,|u_1|, \dots, |u_n|\,\}$ over all unit vectors $\|u\|_2 = 1$ on the (round) unit sphere. Its maximal value is obtained at the poles, when $U^\star = \pm\,e_k$, with $\|e_k\|_\infty = 1$. Thus, $C^\star = 1$. The minimal value is obtained when $u^\star = \Bigl(\dfrac{1}{\sqrt{n}}, \dots, \dfrac{1}{\sqrt{n}}\Bigr)$ has all equal components, whereby $c^\star = \|u^\star\|_\infty = 1/\sqrt{n}$. Therefore,
\[ \frac{1}{\sqrt{n}}\,\|v\|_2 \le \|v\|_\infty \le \|v\|_2. \tag{3.34} \]
One can interpret these inequalities as follows. Suppose $v$ is a vector lying on the unit sphere in the Euclidean norm, so $\|v\|_2 = 1$. Then (3.34) tells us that its $\infty$ norm is bounded from above and below: $1/\sqrt{n} \le \|v\|_\infty \le 1$. Therefore, the unit Euclidean sphere sits inside the unit sphere in the $\infty$ norm, and outside the sphere of radius $1/\sqrt{n}$. Figure 3.5 illustrates the two-dimensional situation.

One significant consequence of the equivalence of norms is that, in $\mathbb{R}^n$, convergence is independent of the norm. The following are all equivalent to the standard $\varepsilon$–$\delta$ convergence of a sequence $u^{(1)}, u^{(2)}, u^{(3)}, \dots$ of vectors in $\mathbb{R}^n$:
(a) the vectors converge: $u^{(k)} \longrightarrow u^\star$;
(b) the individual components all converge: $u_i^{(k)} \longrightarrow u_i^\star$ for $i = 1, \dots, n$;
(c) the difference in norms goes to zero: $\|u^{(k)} - u^\star\| \longrightarrow 0$.
The last case, called convergence in norm, does not depend on which norm is chosen. Indeed, the basic inequality (3.31) implies that if one norm goes to zero, so does any other norm. An important consequence is that all norms on $\mathbb{R}^n$ induce the same topology — convergence of sequences, notions of open and closed sets, and so on. None of this is true in infinite-dimensional function space! A rigorous development of the underlying topological and analytical properties of compactness, continuity, and convergence is beyond the scope of this course. The motivated student is encouraged to consult a text in real analysis, e.g., [125, 126], to find the relevant definitions, theorems and proofs.

Example 3.21. Consider the infinite-dimensional vector space $C^0[0,1]$ consisting of all continuous functions on the interval $[0,1]$. The functions

\[ f_n(x) = \begin{cases} 1 - n\,x, & 0 \le x \le \tfrac{1}{n}, \\[2pt] 0, & \tfrac{1}{n} \le x \le 1, \end{cases} \]
have identical $L^\infty$ norms

\[ \|f_n\|_\infty = \sup\,\bigl\{\,|f_n(x)| \,\bigm|\, 0 \le x \le 1\,\bigr\} = 1. \]
On the other hand, their $L^2$ norm

\[ \|f_n\|_2 = \sqrt{\int_0^1 f_n(x)^2\,dx} = \sqrt{\int_0^{1/n} (1 - n\,x)^2\,dx} = \frac{1}{\sqrt{3n}} \]
goes to zero as $n \to \infty$. This example shows that there is no constant $C^\star$ such that
\[ \|f\|_\infty \le C^\star\,\|f\|_2 \]
for all $f \in C^0[0,1]$. Thus, the $L^\infty$ and $L^2$ norms on $C^0[0,1]$ are not equivalent — there exist functions which have unit $L^\infty$ norm but arbitrarily small $L^2$ norm. Similar comparative results can be established for the other function space norms. As a result, analysis and topology on function space is intimately related to the underlying choice of norm.
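A sketch (Python/NumPy, not part of the original text) that tabulates $\|f_n\|_\infty$ and $\|f_n\|_2$ for increasing $n$ makes the non-equivalence visible: the ratio $\|f_n\|_\infty / \|f_n\|_2 = \sqrt{3n}$ blows up.

```python
import numpy as np

def f(n, x):
    """The functions of Example 3.21: 1 - n x on [0, 1/n], zero beyond."""
    return np.where(x <= 1.0 / n, 1.0 - n * x, 0.0)

x = np.linspace(0.0, 1.0, 200001)       # fine grid on [0, 1]
for n in (1, 10, 100, 1000):
    vals = f(n, x)
    sup_norm = np.max(np.abs(vals))                    # L-infinity norm, always 1
    l2_norm = np.sqrt(np.trapz(vals**2, x))            # approximates 1/sqrt(3n)
    print(n, sup_norm, l2_norm, sup_norm / l2_norm)    # ratio ~ sqrt(3n)
```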

3.4. Positive Definite Matrices.

Let us now return to the study of inner products, and fix our attention on the finite-dimensional situation. Our immediate goal is to determine the most general inner product which can be placed on the finite-dimensional vector space $\mathbb{R}^n$. The resulting analysis will lead us to the extremely important class of positive definite matrices. Such matrices play a fundamental role in a wide variety of applications, including minimization problems, mechanics, electrical circuits, and differential equations. Moreover, their infinite-dimensional generalization to positive definite linear operators underlies all of the most important examples of boundary value problems for ordinary and partial differential equations.

Let $\langle x ; y \rangle$ denote an inner product between vectors $x = (x_1\ x_2\ \dots\ x_n)^T$ and $y = (y_1\ y_2\ \dots\ y_n)^T$ in $\mathbb{R}^n$. We begin by writing the vectors in terms of the standard basis vectors:
\[ x = x_1 e_1 + \cdots + x_n e_n = \sum_{i=1}^n x_i\,e_i, \qquad y = y_1 e_1 + \cdots + y_n e_n = \sum_{j=1}^n y_j\,e_j. \tag{3.35} \]
To evaluate their inner product, we will appeal to the three basic axioms. We first employ the bilinearity of the inner product to expand

\[ \langle x ; y \rangle = \Bigl\langle\, \sum_{i=1}^n x_i\,e_i \,;\, \sum_{j=1}^n y_j\,e_j \,\Bigr\rangle = \sum_{i,j=1}^n x_i\,y_j\,\langle e_i ; e_j \rangle. \]

Therefore, we can write
\[ \langle x ; y \rangle = \sum_{i,j=1}^n k_{ij}\,x_i\,y_j = x^T K\,y, \tag{3.36} \]
where $K$ denotes the $n \times n$ matrix of inner products of the basis vectors, with entries
\[ k_{ij} = \langle e_i ; e_j \rangle, \qquad i, j = 1, \dots, n. \tag{3.37} \]
We conclude that any inner product must be expressed in the general bilinear form (3.36). The two remaining inner product axioms will impose certain conditions on the inner product matrix $K$.

Symmetry implies that

\[ k_{ij} = \langle e_i ; e_j \rangle = \langle e_j ; e_i \rangle = k_{ji}, \qquad i, j = 1, \dots, n. \]
Consequently, the inner product matrix $K$ is symmetric:

\[ K = K^T. \]
Conversely, symmetry of $K$ ensures symmetry of the bilinear form:

\[ \langle x ; y \rangle = x^T K\,y = (x^T K\,y)^T = y^T K^T x = y^T K\,x = \langle y ; x \rangle, \]
where the second equality follows from the fact that the quantity is a scalar, and hence equals its own transpose. The final condition for an inner product is positivity. This requires that

\[ \|x\|^2 = \langle x ; x \rangle = x^T K\,x = \sum_{i,j=1}^n k_{ij}\,x_i\,x_j \ge 0 \qquad \text{for all} \quad x \in \mathbb{R}^n, \tag{3.38} \]
with equality if and only if $x = 0$. The precise meaning of this positivity condition on the matrix $K$ is not as immediately evident, and so will be encapsulated in the following very important definition.

Definition 3.22. An $n \times n$ matrix $K$ is called positive definite if it is symmetric, $K^T = K$, and satisfies the positivity condition

\[ x^T K\,x > 0 \qquad \text{for all} \quad 0 \ne x \in \mathbb{R}^n. \tag{3.39} \]
We will sometimes write $K > 0$ to mean that $K$ is a symmetric, positive definite matrix.

Warning: The condition $K > 0$ does not mean that all the entries of $K$ are positive. There are many positive definite matrices which have some negative entries — see Example 3.24 below. Conversely, many symmetric matrices with all positive entries are not positive definite!

Remark: Although some authors allow non-symmetric matrices to be designated as positive definite, we will only say that a matrix is positive definite when it is symmetric. But, to underscore our convention and remind the casual reader, we will often include the superfluous adjective "symmetric" when speaking of positive definite matrices.

Our preliminary analysis has resulted in the following characterization of inner products on a finite-dimensional vector space.

Theorem 3.23. Every inner product on $\mathbb{R}^n$ is given by

\[ \langle x ; y \rangle = x^T K\,y \qquad \text{for} \quad x, y \in \mathbb{R}^n, \tag{3.40} \]
where $K$ is a symmetric, positive definite matrix.

Given any symmetric† matrix K, the homogeneous quadratic polynomial

\[ q(x) = x^T K\,x = \sum_{i,j=1}^n k_{ij}\,x_i\,x_j, \tag{3.41} \]
is known as a quadratic form on $\mathbb{R}^n$. The quadratic form is called positive definite if

\[ q(x) > 0 \qquad \text{for all} \quad 0 \ne x \in \mathbb{R}^n. \tag{3.42} \]
Thus, a quadratic form is positive definite if and only if its coefficient matrix is.

Example 3.24. Even though the matrix $K = \begin{pmatrix} 4 & -2 \\ -2 & 3 \end{pmatrix}$ has two negative entries, it is, nevertheless, a positive definite matrix. Indeed, the corresponding quadratic form

\[ q(x) = x^T K\,x = 4x_1^2 - 4x_1x_2 + 3x_2^2 = (2x_1 - x_2)^2 + 2x_2^2 \ge 0 \]
is a sum of two non-negative quantities. Moreover, $q(x) = 0$ if and only if both $2x_1 - x_2 = 0$ and $x_2 = 0$, which implies $x_1 = 0$ also. This proves positivity for all nonzero $x$, and hence $K > 0$ is indeed a positive definite matrix. The corresponding inner product on $\mathbb{R}^2$ is
\[ \langle x ; y \rangle = (x_1\ \ x_2)\begin{pmatrix} 4 & -2 \\ -2 & 3 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = 4x_1y_1 - 2x_1y_2 - 2x_2y_1 + 3x_2y_2. \]
On the other hand, despite the fact that the matrix $K = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$ has all positive entries, it is not a positive definite matrix. Indeed, writing out

\[ q(x) = x^T K\,x = x_1^2 + 4x_1x_2 + x_2^2, \]
we find, for instance, that $q(1, -1) = -2 < 0$, violating positivity. These two simple examples should be enough to convince the reader that the problem of determining whether a given symmetric matrix is or is not positive definite is not completely elementary. With a little practice, it is not difficult to read off the coefficient matrix $K$ from the explicit formula for the quadratic form (3.41).
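A small Python/NumPy sketch (not part of the original text) that evaluates the quadratic form $q(x) = x^T K x$ of (3.41) for the two matrices of Example 3.24; sampling a few vectors already exposes the indefiniteness of the second matrix.

```python
import numpy as np

def quadratic_form(K, x):
    """Evaluate q(x) = x^T K x as in (3.41)."""
    return x @ K @ x

K1 = np.array([[4.0, -2.0], [-2.0, 3.0]])   # positive definite despite negative entries
K2 = np.array([[1.0, 2.0], [2.0, 1.0]])     # all positive entries, yet indefinite

for x in (np.array([1.0, 0.0]), np.array([1.0, 1.0]), np.array([1.0, -1.0])):
    print(x, quadratic_form(K1, x), quadratic_form(K2, x))
# K1 gives 4, 3, 11 (all positive); K2 gives 1, 6, -2 (a sign change => not definite).
```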

† Exercise shows that the coefficient matrix K in any quadratic form can be taken to be symmetric without any loss of generality.

Example 3.25. Consider the quadratic form

\[ q(x, y, z) = x^2 + 4xy + 6y^2 - 2xz + 9z^2 \]
depending upon three variables. The corresponding coefficient matrix is

\[ K = \begin{pmatrix} 1 & 2 & -1 \\ 2 & 6 & 0 \\ -1 & 0 & 9 \end{pmatrix}, \qquad \text{whereby} \qquad q(x, y, z) = (x\ \ y\ \ z)\begin{pmatrix} 1 & 2 & -1 \\ 2 & 6 & 0 \\ -1 & 0 & 9 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}. \]
Note that the squared terms in $q$ contribute directly to the diagonal entries of $K$, while the mixed terms are split in half to give the symmetric off-diagonal entries. The reader might wish to try proving that this particular matrix is positive definite by establishing positivity of the quadratic form: $q(x, y, z) > 0$ for all nonzero $(x, y, z)^T \in \mathbb{R}^3$. Later, we will devise a simple, systematic test for positive definiteness.

Slightly more generally, a quadratic form and its associated symmetric coefficient matrix are called positive semi-definite if

\[ q(x) = x^T K\,x \ge 0 \qquad \text{for all} \quad x \in \mathbb{R}^n. \tag{3.43} \]
A positive semi-definite matrix may have null directions, meaning non-zero vectors $z$ such that $q(z) = z^T K\,z = 0$. Clearly, any nonzero vector $z \in \ker K$ lying in the matrix's kernel defines a null direction, but there may be others. A positive definite matrix is not allowed to have null directions, and so $\ker K = \{0\}$. As a consequence of Proposition 2.39, we deduce that all positive definite matrices are invertible. (The converse, however, is not valid.)

Theorem 3.26. If $K$ is positive definite, then $K$ is nonsingular.

Example 3.27. The matrix $K = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$ is positive semi-definite, but not positive definite. Indeed, the associated quadratic form

\[ q(x) = x^T K\,x = x_1^2 - 2x_1x_2 + x_2^2 = (x_1 - x_2)^2 \ge 0 \]
is a perfect square, and so clearly non-negative. However, the elements of $\ker K$, namely the scalar multiples of the vector $(1, 1)^T$, define null directions, since $q(c, c) = 0$.

Example 3.28. By definition, a general symmetric $2 \times 2$ matrix $K = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ is positive definite if and only if the associated quadratic form satisfies

\[ q(x) = a\,x_1^2 + 2\,b\,x_1x_2 + c\,x_2^2 > 0 \tag{3.44} \]
for all $x \ne 0$. This is the case if and only if
\[ a > 0, \qquad a\,c - b^2 > 0, \tag{3.45} \]
i.e., the quadratic form has positive leading coefficient and positive determinant (or negative discriminant). A direct proof of this elementary fact will appear shortly.

Furthermore, a quadratic form $q(x) = x^T K\,x$ and its associated symmetric matrix $K$ are called negative semi-definite if $q(x) \le 0$ for all $x$, and negative definite if $q(x) < 0$ for all $x \ne 0$. A quadratic form is called indefinite if it is neither positive nor negative semi-definite; equivalently, there exist one or more points $x_+$ where $q(x_+) > 0$ and one or more points $x_-$ where $q(x_-) < 0$. Details can be found in the exercises.

Gram Matrices

Symmetric matrices whose entries are given by inner products of elements of an inner product space play an important role. They are named after the nineteenth century Danish mathematician Jorgen Gram — not the metric mass unit!

Definition 3.29. Let $V$ be an inner product space, and let $v_1, \dots, v_n \in V$. The associated Gram matrix
\[ K = \begin{pmatrix} \langle v_1 ; v_1 \rangle & \langle v_1 ; v_2 \rangle & \dots & \langle v_1 ; v_n \rangle \\ \langle v_2 ; v_1 \rangle & \langle v_2 ; v_2 \rangle & \dots & \langle v_2 ; v_n \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle v_n ; v_1 \rangle & \langle v_n ; v_2 \rangle & \dots & \langle v_n ; v_n \rangle \end{pmatrix} \tag{3.46} \]
is the $n \times n$ matrix whose entries are the inner products between the chosen vector space elements. Symmetry of the inner product implies symmetry of the Gram matrix:

\[ k_{ij} = \langle v_i ; v_j \rangle = \langle v_j ; v_i \rangle = k_{ji}, \qquad \text{and hence} \qquad K^T = K. \tag{3.47} \]
In fact, the most direct method for producing positive definite and semi-definite matrices is through the Gram matrix construction.

Theorem 3.30. All Gram matrices are positive semi-definite. The Gram matrix

(3.46) is positive definite if and only if $v_1, \dots, v_n$ are linearly independent.

Proof: To prove positive (semi-)definiteness of $K$, we need to examine the associated quadratic form
\[ q(x) = x^T K\,x = \sum_{i,j=1}^n k_{ij}\,x_i\,x_j. \]
Substituting the values (3.47) for the matrix entries, we find

\[ q(x) = \sum_{i,j=1}^n \langle v_i ; v_j \rangle\,x_i\,x_j. \]
Bilinearity of the inner product on $V$ implies that we can assemble this into a single inner product

\[ q(x) = \Bigl\langle\, \sum_{i=1}^n x_i\,v_i \,;\, \sum_{j=1}^n x_j\,v_j \,\Bigr\rangle = \langle v ; v \rangle = \|v\|^2 \ge 0, \]

where $v = x_1 v_1 + \cdots + x_n v_n$ lies in the subspace of $V$ spanned by the given vectors. This immediately proves that $K$ is positive semi-definite.

Moreover, $q(x) = \|v\|^2 > 0$ as long as $v \ne 0$. If $v_1, \dots, v_n$ are linearly independent, then $v = 0$ if and only if $x_1 = \cdots = x_n = 0$, and hence $q(x) = 0$ if and only if $x = 0$. Thus, in this case, $q(x)$ and $K$ are positive definite. Q.E.D.

Example 3.31. Consider the vectors $v_1 = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}$, $v_2 = \begin{pmatrix} 3 \\ 0 \\ 6 \end{pmatrix}$. For the standard Euclidean dot product on $\mathbb{R}^3$, the Gram matrix is
\[ K = \begin{pmatrix} v_1 \cdot v_1 & v_1 \cdot v_2 \\ v_2 \cdot v_1 & v_2 \cdot v_2 \end{pmatrix} = \begin{pmatrix} 6 & -3 \\ -3 & 45 \end{pmatrix}. \]
Since $v_1, v_2$ are linearly independent, $K > 0$. Positive definiteness implies that the associated quadratic form
\[ q(x_1, x_2) = 6x_1^2 - 6x_1x_2 + 45x_2^2 \]
is strictly positive for all $(x_1, x_2) \ne 0$. Indeed, this can be checked directly using the criteria in (3.45).

On the other hand, for the weighted inner product
\[ \langle x ; y \rangle = 3x_1y_1 + 2x_2y_2 + 5x_3y_3, \tag{3.48} \]
the corresponding Gram matrix is
\[ \widetilde{K} = \begin{pmatrix} \langle v_1 ; v_1 \rangle & \langle v_1 ; v_2 \rangle \\ \langle v_2 ; v_1 \rangle & \langle v_2 ; v_2 \rangle \end{pmatrix} = \begin{pmatrix} 16 & -21 \\ -21 & 207 \end{pmatrix}. \tag{3.49} \]
Since $v_1, v_2$ are still linearly independent (which, of course, does not depend upon which inner product is used), the matrix $\widetilde{K}$ is also positive definite.

In the case of the Euclidean dot product, the construction of the Gram matrix $K$ can be directly implemented as follows. Given column vectors $v_1, \dots, v_n \in \mathbb{R}^m$, let us form the $m \times n$ matrix $A = (v_1\ v_2\ \dots\ v_n)$. In view of the identification (3.2) between the dot product and multiplication of a row vector and a column vector, the $(i,j)$ entry of $K$ is given as the product
\[ k_{ij} = v_i \cdot v_j = v_i^T v_j \]
of the $i$th row of the transpose $A^T$ with the $j$th column of $A$. In other words, the Gram matrix can be evaluated as a matrix product:
\[ K = A^T A. \tag{3.50} \]
For the preceding Example 3.31,
\[ A = \begin{pmatrix} 1 & 3 \\ 2 & 0 \\ -1 & 6 \end{pmatrix}, \qquad \text{and so} \qquad K = A^T A = \begin{pmatrix} 1 & 2 & -1 \\ 3 & 0 & 6 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 2 & 0 \\ -1 & 6 \end{pmatrix} = \begin{pmatrix} 6 & -3 \\ -3 & 45 \end{pmatrix}. \]
Theorem 3.30 implies that the Gram matrix (3.50) is positive definite if and only if the columns of $A$ are linearly independent vectors. This implies the following result.

Proposition 3.32. Given an $m \times n$ matrix $A$, the following are equivalent:
(a) The $n \times n$ Gram matrix $K = A^T A$ is positive definite.
(b) $A$ has linearly independent columns.
(c) $\operatorname{rank} A = n \le m$.
(d) $\ker A = \{0\}$.

Changing the underlying inner product will, of course, change the Gram matrix. As noted in Theorem 3.23, every inner product on $\mathbb{R}^m$ has the form

\[ \langle v ; w \rangle = v^T C\,w \qquad \text{for} \quad v, w \in \mathbb{R}^m, \tag{3.51} \]
where $C > 0$ is a symmetric, positive definite $m \times m$ matrix. Therefore, given $n$ vectors $v_1, \dots, v_n \in \mathbb{R}^m$, the entries of the Gram matrix with respect to this inner product are
\[ k_{ij} = \langle v_i ; v_j \rangle = v_i^T C\,v_j. \]

If, as above, we assemble the column vectors into an $m \times n$ matrix $A = (v_1\ v_2\ \dots\ v_n)$, then the Gram matrix entry $k_{ij}$ is obtained by multiplying the $i$th row of $A^T$ by the $j$th column of the product matrix $C\,A$. Therefore, the Gram matrix based on the alternative inner product (3.51) is given by
\[ K = A^T C\,A. \tag{3.52} \]

Theorem 3.30 immediately implies that K is positive definite — provided A has rank n.

Theorem 3.33. Suppose $A$ is an $m \times n$ matrix with linearly independent columns. Suppose $C > 0$ is any positive definite $m \times m$ matrix. Then the matrix $K = A^T C\,A$ is a positive definite $n \times n$ matrix.

The Gram matrices constructed in (3.52) arise in a wide variety of applications, including least squares approximation theory, cf. Chapter 4, and mechanical and electrical systems, cf. Chapters 6 and 9. In the majority of applications, $C = \operatorname{diag}(c_1, \dots, c_m)$ is a diagonal positive definite matrix, which requires it to have strictly positive diagonal entries $c_i > 0$. This choice corresponds to a weighted inner product (3.10) on $\mathbb{R}^m$.

Example 3.34. Returning to the situation of Example 3.31, the weighted inner product (3.48) corresponds to the diagonal positive definite matrix $C = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 5 \end{pmatrix}$. Therefore, the weighted Gram matrix (3.52) based on the vectors $\begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}$, $\begin{pmatrix} 3 \\ 0 \\ 6 \end{pmatrix}$ is
\[ \widetilde{K} = A^T C\,A = \begin{pmatrix} 1 & 2 & -1 \\ 3 & 0 & 6 \end{pmatrix}\begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 5 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 2 & 0 \\ -1 & 6 \end{pmatrix} = \begin{pmatrix} 16 & -21 \\ -21 & 207 \end{pmatrix}, \]
reproducing (3.49).
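The Gram matrix computations of Examples 3.31 and 3.34 reduce to the matrix products (3.50) and (3.52); a Python/NumPy sketch (not part of the original text):

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [2.0, 0.0],
              [-1.0, 6.0]])          # columns are v1, v2 from Example 3.31
C = np.diag([3.0, 2.0, 5.0])         # weights of the inner product (3.48)

K = A.T @ A                          # Euclidean Gram matrix (3.50)
K_weighted = A.T @ C @ A             # weighted Gram matrix (3.52)

print(K)            # [[ 6 -3], [-3 45]]
print(K_weighted)   # [[ 16 -21], [-21 207]]
```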

The Gram matrix construction is not restricted to finite-dimensional vector spaces, but also applies to inner products on function space. Here is a particularly important example.

Example 3.35. Consider the vector space $C^0[0,1]$ consisting of continuous functions on the interval $0 \le x \le 1$, equipped with the $L^2$ inner product $\langle f ; g \rangle = \int_0^1 f(x)\,g(x)\,dx$. Let us construct the Gram matrix corresponding to the simple monomial functions $1, x, x^2$. We compute the required inner products

\[ \begin{aligned} \langle 1 ; 1 \rangle &= \|1\|^2 = \int_0^1 dx = 1, & \langle 1 ; x \rangle &= \int_0^1 x\,dx = \tfrac{1}{2}, \\ \langle x ; x \rangle &= \|x\|^2 = \int_0^1 x^2\,dx = \tfrac{1}{3}, & \langle 1 ; x^2 \rangle &= \int_0^1 x^2\,dx = \tfrac{1}{3}, \\ \langle x^2 ; x^2 \rangle &= \|x^2\|^2 = \int_0^1 x^4\,dx = \tfrac{1}{5}, & \langle x ; x^2 \rangle &= \int_0^1 x^3\,dx = \tfrac{1}{4}. \end{aligned} \]
Therefore, the Gram matrix is

\[ K = \begin{pmatrix} \langle 1 ; 1 \rangle & \langle 1 ; x \rangle & \langle 1 ; x^2 \rangle \\ \langle x ; 1 \rangle & \langle x ; x \rangle & \langle x ; x^2 \rangle \\ \langle x^2 ; 1 \rangle & \langle x^2 ; x \rangle & \langle x^2 ; x^2 \rangle \end{pmatrix} = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \end{pmatrix}. \]
As we know, the monomial functions $1, x, x^2$ are linearly independent, and so Theorem 3.30 implies that this particular matrix is positive definite. The alert reader may recognize this particular Gram matrix as the $3 \times 3$ Hilbert matrix that we encountered in (1.67). More generally, the Gram matrix corresponding to the monomials $1, x, x^2, \dots, x^n$ has entries

\[ k_{ij} = \langle x^{i-1} ; x^{j-1} \rangle = \int_0^1 x^{i+j-2}\,dx = \frac{1}{i+j-1}, \qquad i, j = 1, \dots, n+1. \]
Therefore, the monomial Gram matrix is the $(n+1) \times (n+1)$ Hilbert matrix (1.67): $K = H_{n+1}$. As a consequence of Theorems 3.26 and 3.33, we have proved the following non-trivial result.

Proposition 3.36. The $n \times n$ Hilbert matrix $H_n$ is positive definite. In particular, $H_n$ is a nonsingular matrix.
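Proposition 3.36 is easy to probe numerically: build $H_n$ and attempt a Cholesky factorization, which exists precisely for positive definite matrices. Here is a Python/NumPy sketch (not part of the original text); beware that for larger $n$ the Hilbert matrix is notoriously ill-conditioned, so floating-point checks eventually become unreliable.

```python
import numpy as np

def hilbert(n):
    """The n x n Hilbert matrix with entries 1 / (i + j - 1), as in Example 3.35."""
    return np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

for n in (3, 5, 8):
    H = hilbert(n)
    np.linalg.cholesky(H)          # raises LinAlgError if H were not positive definite
    print(n, "positive definite; smallest eigenvalue =", np.linalg.eigvalsh(H).min())
```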

Example 3.37. Let us construct the Gram matrix corresponding to the functions $1, \cos x, \sin x$ with respect to the inner product $\langle f ; g \rangle = \int_{-\pi}^{\pi} f(x)\,g(x)\,dx$ on the interval

$[-\pi, \pi]$. We compute the inner products
\[ \begin{aligned} \langle 1 ; 1 \rangle &= \|1\|^2 = \int_{-\pi}^{\pi} dx = 2\pi, & \langle 1 ; \cos x \rangle &= \int_{-\pi}^{\pi} \cos x\,dx = 0, \\ \langle \cos x ; \cos x \rangle &= \|\cos x\|^2 = \int_{-\pi}^{\pi} \cos^2 x\,dx = \pi, & \langle 1 ; \sin x \rangle &= \int_{-\pi}^{\pi} \sin x\,dx = 0, \\ \langle \sin x ; \sin x \rangle &= \|\sin x\|^2 = \int_{-\pi}^{\pi} \sin^2 x\,dx = \pi, & \langle \cos x ; \sin x \rangle &= \int_{-\pi}^{\pi} \cos x\,\sin x\,dx = 0. \end{aligned} \]
Therefore, the Gram matrix is a simple diagonal matrix: $K = \begin{pmatrix} 2\pi & 0 & 0 \\ 0 & \pi & 0 \\ 0 & 0 & \pi \end{pmatrix}$. Positive definiteness of $K$ is immediately evident.

3.5. Completing the Square.

Gram matrices furnish us with an almost inexhaustible supply of positive definite matrices. However, we still do not know how to test whether a given symmetric matrix is positive definite. As we shall soon see, the secret already appears in the particular computations in Examples 3.2 and 3.24. You may recall the importance of the method known as “completing the square”, first in the derivation of the quadratic formula for the solution to

\[ q(x) = a\,x^2 + 2\,b\,x + c = 0, \tag{3.53} \]

and, later, in facilitating the integration of various types of rational and algebraic functions. The idea is to combine the first two terms in (3.53) as a perfect square, and so rewrite the quadratic function in the form

\[ q(x) = a\Bigl(x + \frac{b}{a}\Bigr)^2 + \frac{a\,c - b^2}{a} = 0. \tag{3.54} \]
As a consequence,
\[ \Bigl(x + \frac{b}{a}\Bigr)^2 = \frac{b^2 - a\,c}{a^2}, \]
and the well-known quadratic formula

\[ x = \frac{-\,b \pm \sqrt{b^2 - a\,c}}{a} \]
follows by taking the square root of both sides and then solving for $x$. The intermediate step (3.54), where we eliminate the linear term, is known as completing the square. We can perform the same kind of manipulation on a homogeneous quadratic form

\[ q(x_1, x_2) = a\,x_1^2 + 2\,b\,x_1 x_2 + c\,x_2^2. \tag{3.55} \]

In this case, provided $a \ne 0$, completing the square amounts to writing
\[ q(x_1, x_2) = a\,x_1^2 + 2\,b\,x_1 x_2 + c\,x_2^2 = a\Bigl(x_1 + \frac{b}{a}\,x_2\Bigr)^2 + \frac{a\,c - b^2}{a}\,x_2^2 = a\,y_1^2 + \frac{a\,c - b^2}{a}\,y_2^2. \tag{3.56} \]
The net result is to re-express $q(x_1, x_2)$ as a simpler sum of squares of the new variables
\[ y_1 = x_1 + \frac{b}{a}\,x_2, \qquad y_2 = x_2. \tag{3.57} \]
It is not hard to see that the final expression in (3.56) is positive definite, as a function of $y_1, y_2$, if and only if both coefficients are positive:

\[ a > 0, \qquad \frac{a\,c - b^2}{a} > 0. \]
Therefore, $q(x_1, x_2) \ge 0$, with equality if and only if $y_1 = y_2 = 0$, or, equivalently, $x_1 = x_2 = 0$. This conclusively proves that conditions (3.45) are necessary and sufficient for the quadratic form (3.55) to be positive definite.

Our goal is to adapt this simple idea to analyze the positivity of quadratic forms depending on more than two variables. To this end, let us rewrite the quadratic form identity (3.56) in matrix form. The original quadratic form (3.55) is

\[ q(x) = x^T K\,x, \qquad \text{where} \quad K = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \tag{3.58} \]
Similarly, the right hand side of (3.56) can be written as

\[ \widehat{q}(y) = y^T D\,y, \qquad \text{where} \quad D = \begin{pmatrix} a & 0 \\ 0 & \frac{a\,c - b^2}{a} \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}. \tag{3.59} \]
Anticipating the final result, the equations (3.57) connecting $x$ and $y$ can themselves be written in matrix form as
\[ y = L^T x, \qquad \text{or} \qquad \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 + \frac{b}{a}\,x_2 \\ x_2 \end{pmatrix}, \qquad \text{where} \quad L = \begin{pmatrix} 1 & 0 \\ \frac{b}{a} & 1 \end{pmatrix}. \]
Substituting into (3.59), we find

\[ y^T D\,y = (L^T x)^T D\,(L^T x) = x^T L\,D\,L^T x = x^T K\,x, \qquad \text{where} \quad K = L\,D\,L^T \tag{3.60} \]
is the same factorization (1.56) of the coefficient matrix, obtained earlier via Gaussian elimination. We are thus led to the realization that completing the square is the same as the $L\,D\,L^T$ factorization of a symmetric matrix! Recall the definition of a regular matrix as one that can be reduced to upper triangular form without any row interchanges. Theorem 1.32 says that the regular symmetric matrices are precisely those that admit an $L\,D\,L^T$ factorization. The identity (3.60) is therefore valid

for all regular $n \times n$ symmetric matrices, and shows how to write the associated quadratic form as a sum of squares:

\[ q(x) = x^T K\,x = y^T D\,y = d_1\,y_1^2 + \cdots + d_n\,y_n^2, \qquad \text{where} \quad y = L^T x. \tag{3.61} \]

The coefficients $d_i$ are the diagonal entries of $D$, which are the pivots of $K$. Furthermore, the diagonal quadratic form is positive definite, $y^T D\,y > 0$ for all $y \ne 0$, if and only if all the pivots are positive, $d_i > 0$. Invertibility of $L^T$ tells us that $y = 0$ if and only if $x = 0$, and hence positivity of the pivots is equivalent to positive definiteness of the original quadratic form: $q(x) > 0$ for all $x \ne 0$. We have thus almost proved the main result that completely characterizes positive definite matrices.

Theorem 3.38. A symmetric matrix $K$ is positive definite if and only if it is regular and has all positive pivots.

As a result, a symmetric matrix $K$ is positive definite if and only if it can be factored $K = L\,D\,L^T$, where $L$ is special lower triangular, and $D$ is diagonal with all positive diagonal entries.

Example 3.39. Consider the symmetric matrix $K = \begin{pmatrix} 1 & 2 & -1 \\ 2 & 6 & 0 \\ -1 & 0 & 9 \end{pmatrix}$. Gaussian elimination produces the factors
\[ L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 1 & 1 \end{pmatrix}, \qquad D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{pmatrix}, \qquad L^T = \begin{pmatrix} 1 & 2 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \]
in its factorization $K = L\,D\,L^T$. Since the pivots — the diagonal entries $1, 2, 6$ of $D$ — are all positive, Theorem 3.38 implies that $K$ is positive definite, which means that the associated quadratic form satisfies

\[ q(x) = x_1^2 + 4x_1x_2 - 2x_1x_3 + 6x_2^2 + 9x_3^2 > 0, \qquad \text{for all} \quad x = (x_1, x_2, x_3)^T \ne 0. \]
Indeed, the $L\,D\,L^T$ factorization implies that $q(x)$ can be explicitly written as a sum of squares:
\[ q(x) = x_1^2 + 4x_1x_2 - 2x_1x_3 + 6x_2^2 + 9x_3^2 = y_1^2 + 2y_2^2 + 6y_3^2, \tag{3.62} \]
where $y_1 = x_1 + 2x_2 - x_3$, $y_2 = x_2 + x_3$, $y_3 = x_3$ are the entries of $y = L^T x$. Positivity of the coefficients of the $y_i^2$ (which are the pivots) implies that $q(x)$ is positive definite.

Example 3.40. Let's test whether the matrix $K = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 8 \end{pmatrix}$ is positive definite. When we perform Gaussian elimination, the second pivot turns out to be $-1$, which immediately implies that $K$ is not positive definite — even though all its entries are positive. (The third pivot is $3$, but this doesn't help; all it takes is one non-positive pivot to disqualify a matrix from being positive definite.) This means that the associated quadratic form $q(x) = x_1^2 + 4x_1x_2 + 6x_1x_3 + 3x_2^2 + 8x_2x_3 + 8x_3^2$ assumes negative values at some points; for instance, $q(-2, 1, 0) = -1$.
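The pivot test of Theorem 3.38 is easy to automate: perform symmetric Gaussian elimination (no row interchanges) and watch the signs of the pivots. Here is a sketch in Python/NumPy (not part of the original text; the function name is ours), applied to the matrices of Examples 3.39 and 3.40.

```python
import numpy as np

def ldlt_pivots(K):
    """Return the pivots of the L D L^T factorization of a symmetric matrix K,
    computed by Gaussian elimination without row interchanges."""
    A = np.array(K, dtype=float)
    n = A.shape[0]
    pivots = []
    for i in range(n):
        pivot = A[i, i]
        pivots.append(pivot)
        if pivot == 0.0:               # matrix is not regular; factorization fails
            break
        for j in range(i + 1, n):
            A[j, i:] -= (A[j, i] / pivot) * A[i, i:]
    return pivots

K1 = [[1, 2, -1], [2, 6, 0], [-1, 0, 9]]   # Example 3.39: pivots 1, 2, 6 -> positive definite
K2 = [[1, 2, 3], [2, 3, 4], [3, 4, 8]]     # Example 3.40: pivots 1, -1, 3 -> not definite

for K in (K1, K2):
    p = ldlt_pivots(K)
    print(p, "positive definite" if all(d > 0 for d in p) else "not positive definite")
```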

A direct method for completing the square in a quadratic form goes as follows. The first step is to put all the terms involving $x_1$ in a suitable square, at the expense of introducing extra terms involving only the other variables. For instance, in the case of the quadratic form in (3.62), the terms involving $x_1$ are
\[ x_1^2 + 4x_1x_2 - 2x_1x_3, \]
which we write as
\[ (x_1 + 2x_2 - x_3)^2 - 4x_2^2 + 4x_2x_3 - x_3^2. \]
Therefore,

\[ q(x) = (x_1 + 2x_2 - x_3)^2 + 2x_2^2 + 4x_2x_3 + 8x_3^2 = (x_1 + 2x_2 - x_3)^2 + \widetilde{q}(x_2, x_3), \]
where
\[ \widetilde{q}(x_2, x_3) = 2x_2^2 + 4x_2x_3 + 8x_3^2 \]
is a quadratic form that only involves $x_2, x_3$. We then repeat the process, combining all the terms involving $x_2$ in the remaining quadratic form into a square, writing

\[ \widetilde{q}(x_2, x_3) = 2\,(x_2 + x_3)^2 + 6x_3^2. \]
This gives the final form
\[ q(x) = (x_1 + 2x_2 - x_3)^2 + 2\,(x_2 + x_3)^2 + 6x_3^2, \]
which reproduces (3.62). In general, as long as $k_{11} \ne 0$, we can write
\[ \begin{aligned} q(x) &= k_{11}\,x_1^2 + 2k_{12}\,x_1x_2 + \cdots + 2k_{1n}\,x_1x_n + k_{22}\,x_2^2 + \cdots + k_{nn}\,x_n^2 \\ &= k_{11}\Bigl(x_1 + \frac{k_{12}}{k_{11}}\,x_2 + \cdots + \frac{k_{1n}}{k_{11}}\,x_n\Bigr)^2 + \widetilde{q}(x_2, \dots, x_n) \\ &= k_{11}\,(x_1 + l_{21}\,x_2 + \cdots + l_{n1}\,x_n)^2 + \widetilde{q}(x_2, \dots, x_n), \end{aligned} \tag{3.63} \]
where
\[ l_{21} = \frac{k_{21}}{k_{11}} = \frac{k_{12}}{k_{11}}, \qquad \dots, \qquad l_{n1} = \frac{k_{n1}}{k_{11}} = \frac{k_{1n}}{k_{11}}, \]
are precisely the multiples appearing in the matrix $L$ obtained from Gaussian Elimination applied to $K$, while

\[ \widetilde{q}(x_2, \dots, x_n) = \sum_{i,j=2}^n \widetilde{k}_{ij}\,x_i\,x_j \]
is a quadratic form involving one less variable. The entries of its symmetric coefficient matrix $\widetilde{K}$ are
\[ \widetilde{k}_{ij} = \widetilde{k}_{ji} = k_{ij} - l_{j1}\,k_{1i}, \qquad \text{for} \quad i \ge j. \]
Thus, the entries of $\widetilde{K}$ that lie on or below the diagonal are exactly the same as the entries appearing on or below the diagonal of $K$ after the first phase of the elimination process. In particular, the second pivot of $K$ is the diagonal entry $\widetilde{k}_{22}$. Continuing in this fashion, the steps involved in completing the square essentially reproduce the steps of Gaussian elimination, with the pivots appearing in the appropriate diagonal positions.

With this in hand, we can now complete the proof of Theorem 3.38. First, if the upper left entry $k_{11}$, namely the first pivot, is not strictly positive, then $K$ cannot be positive definite because $q(e_1) = e_1^T K\,e_1 = k_{11} \le 0$. Otherwise, suppose $k_{11} > 0$, so we can write $q(x)$ in the form (3.63). We claim that $q(x)$ is positive definite if and only if the reduced quadratic form $\widetilde{q}(x_2, \dots, x_n)$ is positive definite. Indeed, if $\widetilde{q}$ is positive definite and $k_{11} > 0$, then $q(x)$ is the sum of two positive quantities, which simultaneously vanish if and only if $x_1 = x_2 = \cdots = x_n = 0$. On the other hand, suppose $\widetilde{q}(x_2^\star, \dots, x_n^\star) \le 0$ for some $x_2^\star, \dots, x_n^\star$, not all zero. Setting $x_1^\star = -\,l_{21}\,x_2^\star - \cdots - l_{n1}\,x_n^\star$ makes the initial square term in (3.63) equal to 0, so
\[ q(x_1^\star, x_2^\star, \dots, x_n^\star) = \widetilde{q}(x_2^\star, \dots, x_n^\star) \le 0, \]
proving the claim. In particular, positive definiteness of $q$ requires that the second pivot $\widetilde{k}_{22} > 0$. We then continue the reduction procedure outlined in the preceding paragraph; if a non-positive pivot appears at any stage, the original quadratic form and matrix cannot be positive definite, while having all positive pivots will ensure positive definiteness, thereby proving Theorem 3.38.

The Cholesky Factorization

The Cholesky Factorization

The identity (3.60) shows us how to write any regular quadratic form q(x) as a linear combination of squares. One can push this result slightly further in the positive definite case. Since each pivot d_i > 0, we can write the diagonal quadratic form (3.61) as a sum of pure squares:

d_1 y_1^2 + ⋯ + d_n y_n^2 = ( √(d_1) y_1 )^2 + ⋯ + ( √(d_n) y_n )^2 = z_1^2 + ⋯ + z_n^2,

where z_i = √(d_i) y_i. In matrix form, we are writing

q̂(y) = y^T D y = z^T z = ‖ z ‖^2,   where   z = S y,   with   S = diag( √(d_1), …, √(d_n) ).

Since D = S^2, the matrix S can be thought of as a "square root" of the diagonal matrix D. Substituting back into (1.52), we deduce the Cholesky factorization

K = L D L^T = L S S^T L^T = M M^T,   where   M = L S,        (3.64)

of a positive definite matrix. Note that M is a lower triangular matrix with all positive entries on its diagonal, namely the square roots of the pivots, m_{ii} = √(d_i). Applying the Cholesky factorization to the corresponding quadratic form produces

q(x) = x^T K x = x^T M M^T x = z^T z = ‖ z ‖^2,   where   z = M^T x.        (3.65)

One can interpret (3.65) as a change of variables from x to z that converts an arbitrary inner product norm, as defined by the square root of the positive definite quadratic form q(x), into the standard Euclidean norm ‖ z ‖.
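As a quick numerical illustration (ours, not the text's), NumPy's routine numpy.linalg.cholesky returns the lower triangular factor M of (3.64), and the change of variables (3.65) can be verified directly; the matrix below is the positive definite matrix of Example 3.39.

    import numpy as np

    K = np.array([[ 1.0, 2.0, -1.0],
                  [ 2.0, 6.0,  0.0],
                  [-1.0, 0.0,  9.0]])      # positive definite (Example 3.39)

    M = np.linalg.cholesky(K)              # lower triangular, K = M M^T as in (3.64)
    print(M)
    print(np.allclose(M @ M.T, K))         # True

    # Change of variables (3.65): q(x) = x^T K x = ||z||^2 with z = M^T x
    x = np.array([0.3, -1.2, 2.0])
    z = M.T @ x
    print(np.isclose(x @ K @ x, z @ z))    # True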

Example 3.41. For the matrix

K = [  1   2  −1 ]
    [  2   6   0 ]
    [ −1   0   9 ]

considered in Example 3.39, the Cholesky formula (3.64) gives K = M M^T, where

M = L S = [  1   0   0 ] [ 1   0    0 ]   [  1    0    0 ]
          [  2   1   0 ] [ 0  √2    0 ] = [  2   √2    0 ].
          [ −1   1   1 ] [ 0   0   √6 ]   [ −1   √2   √6 ]

The associated quadratic function can then be written as a sum of pure squares:

q(x) = x_1^2 + 4 x_1 x_2 − 2 x_1 x_3 + 6 x_2^2 + 9 x_3^2 = z_1^2 + z_2^2 + z_3^2,

where z = M^T x, or, explicitly, z_1 = x_1 + 2 x_2 − x_3, z_2 = √2 x_2 + √2 x_3, z_3 = √6 x_3.

3.6. Complex Vector Spaces.

Although physical applications ultimately require real answers, complex numbers and complex vector spaces assume an extremely useful, if not essential, role in the intervening analysis. Particularly in the description of periodic phenomena, complex numbers and complex exponentials help to simplify complicated trigonometric formulae. Complex variable methods are ubiquitous in electrical engineering, Fourier analysis, potential theory, fluid mechanics, and so on. In quantum mechanics, the basic physical quantities are complex-valued wave functions. Moreover, the Schrödinger equation, which governs quantum dynamics, is an inherently complex partial differential equation.

In this section, we survey the basic facts about complex numbers and complex vector spaces. Most of the constructions are entirely analogous to their real counterparts, and so will not be dwelled on at length. The one exception is the complex version of an inner product, which does introduce some novelties not found in its simpler real counterpart. Complex analysis (integration and differentiation of complex functions) and its applications to fluid flows, potential theory, waves, and other areas of mathematics and engineering will be the subject of Chapter 16.

Complex Numbers

Recall that a complex number is an expression of the form z = x + i y, where x, y ∈ R are real and† i = √−1. The set of all complex numbers (scalars) is denoted by C. We call x = Re z the real part of z and y = Im z the imaginary part of z = x + i y. (Note: The imaginary part is the real number y, not i y.) A real number x is merely a complex number with zero imaginary part, Im z = 0, and so we may regard R ⊂ C. Complex addition and multiplication are based on simple adaptations of the rules of real arithmetic to include the identity i^2 = −1, and so

(x + i y) + (u + i v) = (x + u) + i (y + v),
(x + i y) · (u + i v) = (x u − y v) + i (x v + y u).        (3.66)
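For readers who wish to experiment, Python has complex numbers built in (the imaginary unit is written 1j). The snippet below is our own illustration and simply checks the arithmetic rules (3.66).

    z = 1 + 2j            # x = 1, y = 2
    w = 3 - 1j            # u = 3, v = -1

    print(z + w)          # (4+1j)  = (x + u) + i (y + v)
    print(z * w)          # (5+5j)  = (x u - y v) + i (x v + y u)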

† Electrical engineers prefer to use j to indicate the imaginary unit.

Complex numbers enjoy all the usual laws of real addition and multiplication, including commutativity: z w = w z. We can identify a complex number x + i y with a vector ( x, y )^T ∈ R^2 in the real plane. For this reason, C is sometimes referred to as the complex plane. Complex addition (3.66) corresponds to vector addition, but complex multiplication does not have a readily identifiable vector counterpart.

Another important operation on complex numbers is that of complex conjugation.

Definition 3.42. The complex conjugate of z = x + i y is z̄ = x − i y, whereby Re z̄ = Re z, while Im z̄ = − Im z.

Geometrically, the operation of complex conjugation coincides with reflection of the corresponding vector through the real axis, as illustrated in Figure 3.6. In particular, z̄ = z if and only if z is real. Note that

Re z = ( z + z̄ ) / 2,        Im z = ( z − z̄ ) / (2 i).        (3.67)

Complex conjugation is compatible with complex arithmetic:

\overline{z + w} = z̄ + w̄,        \overline{z w} = z̄ w̄.

In particular, the product of a complex number and its conjugate

z z̄ = (x + i y)(x − i y) = x^2 + y^2        (3.68)

is real and non-negative. Its square root is known as the modulus of the complex number z = x + i y, and written

| z | = √( x^2 + y^2 ).        (3.69)

Note that | z | ≥ 0, with | z | = 0 if and only if z = 0. The modulus | z | generalizes the absolute value of a real number, and coincides with the standard Euclidean norm in the xy–plane, which implies the validity of the triangle inequality

| z + w | ≤ | z | + | w |.        (3.70)

Equation (3.68) can be rewritten in terms of the modulus as

z z̄ = | z |^2.        (3.71)

Rearranging the factors, we deduce the formula for the reciprocal of a nonzero complex number:

1/z = z̄ / | z |^2,   z ≠ 0,   or, equivalently,   1/(x + i y) = (x − i y) / (x^2 + y^2).        (3.72)

The general formula for complex division,

w/z = w z̄ / | z |^2,   or   (u + i v)/(x + i y) = [ (x u + y v) + i (x v − y u) ] / (x^2 + y^2),        (3.73)

is an immediate consequence.
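A brief illustration (ours, not part of the text) of conjugation, modulus, reciprocal, and division using Python's complex type; the built-in operations can be checked against (3.68)–(3.73).

    z = 2 + 3j
    w = 1 - 1j

    print(z.conjugate())                   # (2-3j)
    print(abs(z))                          # 3.6055... = sqrt(13), the modulus (3.69)
    print(z * z.conjugate())               # (13+0j), confirming (3.71)
    print(1 / z)                           # built-in reciprocal ...
    print(z.conjugate() / abs(z)**2)       # ... agrees with (3.72)
    print(w / z)                           # built-in division ...
    print(w * z.conjugate() / abs(z)**2)   # ... agrees with (3.73)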

Figure 3.6. Complex Numbers.

The modulus of a complex number,

r = | z | = √( x^2 + y^2 ),

is one component of its polar coordinate representation

x = r cos θ,   y = r sin θ,   or   z = r ( cos θ + i sin θ ).        (3.74)

The polar angle, which measures the angle that the line connecting z to the origin makes with the horizontal axis, is known as the phase, and written

θ = ph z.        (3.75)

As such, the phase is only defined up to an integer multiple of 2π. The more common term for the angle is the argument, written arg z = ph z. However, we prefer to use “phase” throughout this text, in part to avoid confusion with the argument of a function. We note that the modulus and phase of a product of complex numbers can be readily computed:

| z w | = | z | | w |,        ph (z w) = ph z + ph w.        (3.76)

Complex conjugation preserves the modulus, but negates the phase:

| z̄ | = | z |,        ph z̄ = − ph z.        (3.77)

One of the most important equations in all of mathematics is Euler’s formula

e^{i θ} = cos θ + i sin θ,        (3.78)

relating the complex exponential with the real sine and cosine functions. This fundamental identity has a variety of mathematical justifications; see Exercise for one that is based on comparing series. Euler’s formula (3.78) can be used to compactly rewrite the polar form (3.74) of a complex number as

z = r e^{i θ},   where   r = | z |,   θ = ph z.        (3.79)
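The standard library module cmath handles the polar representation and the complex exponential; the following sketch (our illustration) checks (3.74)–(3.79) numerically.

    import cmath

    z = 1 + 1j
    r, theta = cmath.polar(z)          # modulus and phase, as in (3.74)-(3.75)
    print(r, theta)                    # 1.4142...  0.7853... (= pi/4)
    print(cmath.rect(r, theta))        # back to (1+1j), up to rounding

    # Euler's formula (3.78) and the polar form (3.79): z = r e^{i theta}
    print(r * cmath.exp(1j * theta))   # approximately (1+1j)
    print(cmath.exp(1j * cmath.pi))    # approximately -1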

Figure 3.7. Real and Imaginary Parts of e^z.

The complex conjugate identity

e^{− i θ} = cos(− θ) + i sin(− θ) = cos θ − i sin θ = \overline{e^{i θ}}

permits us to express the basic trigonometric functions in terms of complex exponentials:

cos θ = ( e^{i θ} + e^{− i θ} ) / 2,        sin θ = ( e^{i θ} − e^{− i θ} ) / (2 i).        (3.80)

These formulae are very useful when working with trigonometric identities and integrals. The exponential of a general complex number is easily derived from the basic Euler formula and the standard properties of the exponential function — which carry over unaltered to the complex domain; thus,

e^z = e^{x + i y} = e^x e^{i y} = e^x cos y + i e^x sin y.        (3.81)

Graphs of the real and imaginary parts of the complex exponential function appear in Figure 3.7. Note that e^{2 π i} = 1, and hence the exponential function is periodic:

e^{z + 2 π i} = e^z,        (3.82)

with imaginary period 2 π i — indicative of the periodicity of the trigonometric functions in Euler’s formula.

Complex Vector Spaces and Inner Products

A complex vector space is defined in exactly the same manner as its real cousin, cf. Definition 2.1, the only difference being that we replace real scalars by complex scalars. The most basic example is the n-dimensional complex vector space C^n consisting of all column vectors z = ( z_1, z_2, …, z_n )^T that have n complex entries: z_1, …, z_n ∈ C. Verification of each of the vector space axioms is immediate.

We can write any complex vector z = x + i y ∈ C^n as a linear combination of two real vectors x = Re z, y = Im z ∈ R^n. Its complex conjugate z̄ = x − i y is obtained by

taking the complex conjugates of its individual entries. Thus, for example, if

z = ( 1 + 2 i, −3, 5 i )^T = ( 1, −3, 0 )^T + i ( 2, 0, 5 )^T,   then   z̄ = ( 1 − 2 i, −3, −5 i )^T = ( 1, −3, 0 )^T − i ( 2, 0, 5 )^T.

In particular, z ∈ R^n ⊂ C^n is a real vector if and only if z = z̄.

Most of the vector space concepts we developed in the real domain, including span, linear independence, basis, and dimension, can be straightforwardly extended to the complex regime. The one exception is the concept of an inner product, which requires a little thought. In analysis, the most important applications of inner products and norms are based on the associated inequalities: Cauchy–Schwarz and triangle. But there is no natural ordering of the complex numbers, and so one cannot make any sense of a complex inequality like z < w. Inequalities only make sense in the real domain, and so the norm of a complex vector should still be a positive real number. With this in mind, the naïve idea of simply summing the squares of the entries of a complex vector will not define a norm on C^n, since the result will typically be complex. Moreover, this would give some nonzero complex vectors, e.g., ( 1, i )^T, a zero “norm”, violating positivity.

The correct definition is modeled on the formula

| z | = √( z z̄ )

that defines the modulus of a complex scalar z ∈ C. If, in analogy with the real definition (3.7), the quantity inside the square root is to represent the inner product of z with itself, then we should define the “dot product” between two complex numbers to be

z · w = z w̄,   so that   z · z = z z̄ = | z |^2.

Writing out the formula when z = x + i y and w = u + i v, we find

z · w = z w̄ = (x + i y)(u − i v) = (x u + y v) + i (y u − x v).        (3.83)

Thus, the dot product of two complex numbers is, in general, complex. The real part of z · w is, in fact, the Euclidean dot product between the corresponding vectors in R^2, while its imaginary part is, interestingly, their scalar cross-product, cf. (cross2).

The vector version of this construction is named after the nineteenth century French mathematician Charles Hermite, and called the Hermitian dot product on C^n. It has the explicit formula

z · w = z^T w̄ = z_1 w̄_1 + z_2 w̄_2 + ⋯ + z_n w̄_n,   for   z = ( z_1, z_2, …, z_n )^T,   w = ( w_1, w_2, …, w_n )^T.        (3.84)

Pay attention to the fact that we must apply complex conjugation to all the entries of the second vector. For example, if

z = ( 1 + i, 3 + 2 i )^T,   w = ( 1 + 2 i, i )^T,   then   z · w = (1 + i)(1 − 2 i) + (3 + 2 i)(− i) = 5 − 4 i.

On the other hand,

w · z = (1 + 2 i)(1 − i) + i (3 − 2 i) = 5 + 4 i.

Therefore, the Hermitian dot product is not symmetric. Reversing the order of the vectors results in complex conjugation of the dot product:

w · z = \overline{z · w}.

This is an unforeseen complication, but it does have the desired effect that the induced norm, namely

0 ≤ ‖ z ‖ = √( z · z ) = √( z^T z̄ ) = √( | z_1 |^2 + ⋯ + | z_n |^2 ),        (3.85)

is strictly positive for all 0 ≠ z ∈ C^n. For example, if

z = ( 1 + 3 i, −2 i, −5 )^T,   then   ‖ z ‖ = √( | 1 + 3 i |^2 + | −2 i |^2 + | −5 |^2 ) = √39 .

The Hermitian dot product is well behaved under complex vector addition:

( z + z̃ ) · w = z · w + z̃ · w,        z · ( w + w̃ ) = z · w + z · w̃.

However, while complex scalar multiples can be extracted from the first vector without alteration, when they multiply the second vector, they emerge as complex conjugates:

( c z ) · w = c ( z · w ),        z · ( c w ) = c̄ ( z · w ),        c ∈ C.

Thus, the Hermitian dot product is not bilinear in the strict sense, but satisfies something that, for lack of a better name, is known as sesqui-linearity. The general definition of an inner product on a complex vector space is modeled on the preceding properties of the Hermitian dot product.
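These rules are easy to verify numerically. The sketch below is our own illustration (the helper name hdot is ours) and implements the Hermitian dot product (3.84) with NumPy; note that NumPy's np.vdot conjugates its first argument rather than its second, so we spell the formula out explicitly.

    import numpy as np

    def hdot(z, w):
        # Hermitian dot product (3.84): conjugate the entries of the *second* vector.
        return np.sum(z * np.conj(w))

    z = np.array([1 + 1j, 3 + 2j])
    w = np.array([1 + 2j, 1j])

    print(hdot(z, w))                     # (5-4j), as computed above
    print(hdot(w, z))                     # (5+4j), the complex conjugate

    # The induced norm (3.85) is real and positive:
    v = np.array([1 + 3j, -2j, -5])
    print(np.sqrt(hdot(v, v).real))       # 6.2449... = sqrt(39)

    # Sesqui-linearity: a scalar multiplying the second vector emerges conjugated.
    c = 2 - 1j
    print(np.isclose(hdot(z, c * w), np.conj(c) * hdot(z, w)))   # True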

Definition 3.43. An inner product on the complex vector space V is a pairing that takes two vectors v, w ∈ V and produces a complex number ⟨ v ; w ⟩ ∈ C, subject to the following requirements, for u, v, w ∈ V, and c, d ∈ C:

(i) Sesqui-linearity:

⟨ c u + d v ; w ⟩ = c ⟨ u ; w ⟩ + d ⟨ v ; w ⟩,
⟨ u ; c v + d w ⟩ = c̄ ⟨ u ; v ⟩ + d̄ ⟨ u ; w ⟩.        (3.86)

(ii) Conjugate Symmetry:

⟨ v ; w ⟩ = \overline{⟨ w ; v ⟩}.        (3.87)

(iii) Positivity:

‖ v ‖^2 = ⟨ v ; v ⟩ ≥ 0,   and   ⟨ v ; v ⟩ = 0 if and only if v = 0.        (3.88)

Thus, when dealing with a complex inner product space, one must pay careful attention to the complex conjugate that appears when the second argument in the inner product is multiplied by a complex scalar, as well as the complex conjugate that appears when reversing the order of the two arguments. But, once this initial complication has been properly dealt with, the further properties of the inner product carry over directly from the real domain.

Theorem 3.44. The Cauchy–Schwarz inequality,

| ⟨ v ; w ⟩ | ≤ ‖ v ‖ ‖ w ‖,

with | · | now denoting the complex modulus, and the triangle inequality

‖ v + w ‖ ≤ ‖ v ‖ + ‖ w ‖

are both valid on any complex inner product space.

The proof of this result is practically the same as in the real case, and the details are left to the reader.

Example 3.45. The vectors v = ( 1 + i, 2 i, −3 )^T, w = ( 2 − i, 1, 2 + 2 i )^T satisfy

‖ v ‖ = √(2 + 4 + 9) = √15,        ‖ w ‖ = √(5 + 1 + 8) = √14,
v · w = (1 + i)(2 + i) + 2 i + (−3)(2 − 2 i) = −5 + 11 i.

Thus, the Cauchy–Schwarz inequality reads

| ⟨ v ; w ⟩ | = | −5 + 11 i | = √146 ≤ √210 = √15 √14 = ‖ v ‖ ‖ w ‖.

Similarly, the triangle inequality tells us that

‖ v + w ‖ = ‖ ( 3, 1 + 2 i, −1 + 2 i )^T ‖ = √(9 + 5 + 5) = √19 ≤ √15 + √14 = ‖ v ‖ + ‖ w ‖.

Example 3.46. Let C^0[ −π, π ] denote the complex vector space consisting of all complex-valued continuous functions f(x) = u(x) + i v(x) depending upon the real variable −π ≤ x ≤ π. The Hermitian L^2 inner product on C^0[ −π, π ] is defined as

⟨ f ; g ⟩ = ∫_{−π}^{π} f(x) \overline{g(x)} dx,        (3.89)

with corresponding norm

‖ f ‖ = √( ∫_{−π}^{π} | f(x) |^2 dx ) = √( ∫_{−π}^{π} [ u(x)^2 + v(x)^2 ] dx ).        (3.90)

The reader should verify that (3.89) satisfies the basic Hermitian inner product axioms. In particular, if k, l are integers, then the inner product of the complex exponential functions e^{i k x} and e^{i l x} is

⟨ e^{i k x} ; e^{i l x} ⟩ = ∫_{−π}^{π} e^{i k x} e^{− i l x} dx = ∫_{−π}^{π} e^{i (k − l) x} dx
    = 2π,   if k = l,
    = [ e^{i (k − l) x} / ( i (k − l) ) ]_{x = −π}^{π} = 0,   if k ≠ l.

We conclude that when k ≠ l, the complex exponentials e^{i k x} and e^{i l x} are orthogonal, since their inner product is zero. This example will be of fundamental significance in the complex formulation of Fourier analysis.
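A rough numerical check of (3.89) and of the orthogonality relation above can be made with a simple Riemann sum. The sketch below is our own illustration (the function name l2_inner and the grid size are arbitrary choices), not part of the text.

    import numpy as np

    def l2_inner(f, g, n=4096):
        # Approximate the Hermitian L^2 inner product (3.89) on [-pi, pi]
        # by a left-endpoint Riemann sum over a uniform grid.
        x = np.linspace(-np.pi, np.pi, n, endpoint=False)
        dx = 2 * np.pi / n
        return np.sum(f(x) * np.conj(g(x))) * dx

    def cexp(k):
        # the complex exponential e^{i k x}
        return lambda x: np.exp(1j * k * x)

    print(l2_inner(cexp(3), cexp(3)))   # approximately 2*pi  (k = l)
    print(l2_inner(cexp(3), cexp(5)))   # approximately 0     (k != l: orthogonal)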
