LECTURE V: BILINEAR FORMS AND ORTHOGONALITY
MAT 204 - FALL 2006 - PRINCETON UNIVERSITY

ALFONSO SORRENTINO

1. Bilinear forms

Definition. Let V be a real vector space. A bilinear form on V is an application

    b : V × V −→ R

that satisfies the two following conditions:

i) b(c1u1 + c2u2, v) = c1b(u1, v) + c2b(u2, v), ∀ c1, c2 ∈ R, ∀ u1, u2, v ∈ V ;
ii) b(u, d1v1 + d2v2) = d1b(u, v1) + d2b(u, v2), ∀ d1, d2 ∈ R, ∀ v1, v2, u ∈ V.

A bilinear form is said to be symmetric [resp. skew-symmetric] if

    b(u, v) = b(v, u)   [resp. b(u, v) = −b(v, u)]   ∀ u, v ∈ V.

We would like, as already done for linear transformations, to associate to b a matrix that encodes all the information we need to know about b; it will obviously depend on our choice of a basis of V.

Definition. Let V be a vector space of dimension n and let E = (e1, . . . , en) be a basis of V. Given a bilinear form b : V × V −→ R, we will call matrix of b w.r.t. E the matrix

    A = [ b(e1, e1)  . . .  b(e1, en) ]
        [ b(e2, e1)  . . .  b(e2, en) ]
        [     .      . . .      .     ]
        [ b(en, e1)  . . .  b(en, en) ]  ∈ Mn(R).

Let us see how we can use this matrix to recover the behavior of b on V × V. Consider two generic vectors u = Ex and v = Ey in V. We want to determine an expression for b(u, v) in terms of x and y (i.e., with respect to E). Set x = (x1, . . . , xn)^T and y = (y1, . . . , yn)^T; we have:

    b(u, v) = b( Σ_{i=1}^n xi ei, v ) = Σ_{i=1}^n xi b(ei, v) = (x1, . . . , xn) (b(e1, v), . . . , b(en, v))^T = x^T (b(e1, v), . . . , b(en, v))^T .

For each i = 1, . . . , n, we can compute b(ei, v):

    b(ei, v) = b( ei, Σ_{j=1}^n yj ej ) = Σ_{j=1}^n b(ei, ej) yj = (b(ei, e1), . . . , b(ei, en)) y = A_(i) y ,

where A_(i) denotes the i-th row of A. Therefore,

    b(Ex, Ey) = x^T (A_(1) y, . . . , A_(n) y)^T = x^T A y .

The expression

    b(Ex, Ey) = x^T A y

is called expression of the bilinear form b w.r.t. E.

Remark. From the expression above, it follows that, once we fix a basis E of V, b is uniquely determined by the matrix A. Hence, there exists a bijection (w.r.t. a fixed basis E) between the set of bilinear forms on V and Mn(R).

Let us see what other information about b can be recovered from its matrix.

Proposition 1. Let V be an n-dimensional vector space and b : V × V −→ R a bilinear form on V. Fix a basis E of V and let b(Ex, Ey) = x^T A y. We have that:

    b is symmetric ⇐⇒ A is a symmetric matrix [i.e., A^T = A].

Proof. (=⇒) Since b(ei, ej) = b(ej, ei), then aij = aji and consequently A^T = A.
(⇐=) Observe that, since x^T A y is a real number, it coincides with its transpose; i.e.,

    x^T A y = (x^T A y)^T = y^T A^T x = y^T A x ,

where we used, to get the last equality, that A is symmetric. Therefore, for any u, v ∈ V:

    b(u, v) = b(Ex, Ey) = x^T A y = y^T A x = b(Ey, Ex) = b(v, u) .  □

One can verify, in a completely analogous way, that:

Proposition 2. Let V be an n-dimensional vector space and b : V × V −→ R a bilinear form on V. Fix a basis E of V and let b(Ex, Ey) = x^T A y. We have that:

    b is skew-symmetric ⇐⇒ A is a skew-symmetric matrix [i.e., A^T = −A].

One natural question that one could ask is: what happens when we change the basis of V?

Let E and E′ be two bases for V and assume that E′ = EC (with C ∈ GLn(R)) is the formula for the change of basis. Consider two generic vectors u = Ex = E′x′ and v = Ey = E′y′. We have:

    x^T A y = b(Ex, Ey) = b(u, v) = b(E′x′, E′y′) = (x′)^T A′ y′ .

Since E′ = EC, then x = Cx′ and y = Cy′; therefore:

    x^T A y = (Cx′)^T A (Cy′) = ((x′)^T C^T) A (Cy′) = (x′)^T (C^T A C) y′ = (x′)^T A′ y′ .

It follows that

    A′ = C^T A C .
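[Numerical aside. A minimal sketch, in Python with NumPy, of the two facts above: b(u, v) is computed as x^T A y, and the matrix of b transforms by congruence under a change of basis. The matrices A and C below are made-up illustrative data, not from the notes.]

import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 0.0]])          # matrix of b w.r.t. the basis E (example data)
C = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # invertible change-of-basis matrix, E' = EC

x = np.array([1.0, 2.0])            # coordinates of u w.r.t. E
y = np.array([3.0, -1.0])           # coordinates of v w.r.t. E

b_uv = x @ A @ y                    # b(u, v) = x^T A y

# Coordinates w.r.t. E' satisfy x = C x', i.e. x' = C^{-1} x.
x_prime = np.linalg.solve(C, x)
y_prime = np.linalg.solve(C, y)
A_prime = C.T @ A @ C               # matrix of b w.r.t. E'

# Same scalar, computed in either basis.
assert np.isclose(b_uv, x_prime @ A_prime @ y_prime)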

Remark. Let A, A′ ∈ Mn(R). They are called congruent if there exists C ∈ GLn(R) such that A′ = C^T A C. One can easily verify that congruence is an equivalence relation on Mn(R).

In particular (recall what we have proved for similar matrices in Lecture IV):

Proposition 3. Two matrices A, A′ ∈ Mn(R) are congruent ⇐⇒ they represent the same bilinear form b : V × V −→ R, with respect to different bases.

Definition. Let V be an n-dimensional vector space and b : V × V −→ R a bilinear form on V. We call rank of b the rank of its matrix A with respect to any basis E. Observe that this definition makes sense (namely, it does not depend on the choice of the basis E), since two congruent matrices have the same rank: rank(A′) = rank(C^T A C) = rank(A) (we have already noticed that multiplication by invertible matrices does not change the rank). Hence, rank(b) is an invariant of b.
A bilinear form is said to be degenerate if rank(b) < n; otherwise it is said to be non-degenerate.
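[Numerical aside. A quick sketch (with an assumed example matrix) of the two observations above: congruence preserves the rank, and rank(A) < n detects degeneracy.]

import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # rank 1 < 2: a degenerate form (example data)
C = np.array([[2.0, 1.0],
              [1.0, 1.0]])          # any invertible C

# Congruent matrices have the same rank, so rank(b) is well defined.
assert np.linalg.matrix_rank(C.T @ A @ C) == np.linalg.matrix_rank(A)

degenerate = np.linalg.matrix_rank(A) < A.shape[0]
print(degenerate)   # True: e.g. u0 with coordinates (2, -1) satisfies u0^T A = 0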

Proposition 4. Let V be an n-dimensional vector space and b : V × V −→ R a bilinear form on V. The following conditions are equivalent:
i) b is degenerate;
ii) there exists a non-zero vector u0 ∈ V , such that b(u0, v) = 0 for any v ∈ V ;
iii) there exists a non-zero vector v0 ∈ V , such that b(u, v0) = 0 for any u ∈ V .

Proof. We show only the equivalence between i) and ii) [the proof of the equivalence between i) and iii) is pretty similar]. Let b(Ex, Ey) = x^T A y. We have:

    b is degenerate ⇐⇒ rank A < n ⇐⇒
    ⇐⇒ the rows A_(1), . . . , A_(n) are linearly dependent ⇐⇒
    ⇐⇒ there exist c1, . . . , cn ∈ R (not all zero), such that Σ_{i=1}^n ci A_(i) = 0 ⇐⇒
    ⇐⇒ there exists c ∈ Mn,1(R), c ≠ 0, such that c^T A = 0 .

Let us verify that:

    c^T A = 0 ⇐⇒ c^T A y = 0, ∀ y ∈ Mn,1(R).

The implication (=⇒) is evident. Let us verify (⇐=). In fact:

 1   0  T  .  T 0 = c A  .  = c A(1)  .  0 . .  0   0  T  .  T 0 = c A  .  = c A(n) .  .  1 Consequently,

    c^T A = c^T ( A^(1) . . . A^(n) ) = ( c^T A^(1), . . . , c^T A^(n) ) = (0, . . . , 0) = 0 .

This allows us to conclude the proof. In fact, using this fact:

    b is degenerate ⇐⇒ rank A < n ⇐⇒
    ⇐⇒ there exists c ∈ Mn,1(R), c ≠ 0, such that c^T A = 0 ⇐⇒
    ⇐⇒ there exists c ∈ Mn,1(R), c ≠ 0, such that c^T A y = 0, ∀ y ∈ Mn,1(R) ⇐⇒
    ⇐⇒ there exists u0 = Ec ∈ V, u0 ≠ 0, such that b(u0, v) = 0, ∀ v = Ey ∈ V.  □

2. Orthogonality

Note: From here on, we will just consider symmetric bilinear forms.

In the physical space, the concept of orthogonal vectors is well known; namely, two vectors are orthogonal if, considered at a same starting point, they form a right angle. We would like to generalize this concept to a general vector space, but it is not evident at all what a natural definition should be. Think for instance of the space of polynomials with real coefficients: what might be the meaning of orthogonal polynomials? The main goal of this section is to introduce the concept of orthogonality in a more general framework.

Let us first rephrase the definition of orthogonal physical vectors in R2 in a different way. Define the application

    b : R2 × R2 −→ R
    (v1, v2) 7−→ ‖v1‖ ‖v2‖ cos θ ,

where ‖v1‖ and ‖v2‖ are the euclidean lengths of the vectors and θ ∈ [0, π] is the smallest angle between these vectors (when we apply them at the same point). One can verify that this is a symmetric bilinear form and that, if v1, v2 ≠ 0, then

    b(v1, v2) = 0 ⇐⇒ θ = π/2 .

Therefore,

    v1 and v2 are orthogonal ⇐⇒ b(v1, v2) = 0 .

This observation justifies the following definition.

Definition. Let b : V × V −→ R be a symmetric bilinear form and u, v ∈ V. If b(u, v) = 0, then the two vectors are called b-orthogonal (or simply orthogonal) and this will be denoted by u ⊥ v.
If S is a non-empty subset of V (not necessarily a subspace!), the set

    S⊥ := {u ∈ V : b(u, v) = 0, for all v ∈ S}

(i.e., the set of vectors that are b-orthogonal to all vectors of S) is called b-orthogonal subspace of S. It is easy to verify that S⊥ is a vector subspace of V [Exercise].
If S = {w}, we will write w⊥ instead of {w}⊥. Moreover, if W = ⟨w1, . . . , ws⟩, one can easily check that

    W⊥ = w1⊥ ∩ . . . ∩ ws⊥ .

Finally, a vector v ∈ V is called b-isotropic (or simply isotropic) if b(v, v) = 0. The set of all isotropic vectors of V is called b-isotropic cone of V and is denoted by Ib(V).

Remark. i) In general, the b-isotropic cone Ib(V) is not a vector subspace of V. For example, suppose that v1, v2 ∈ Ib(V) and that b(v1, v2) ≠ 0; one has that v1 + v2 ∉ Ib(V). In fact:

    b(v1 + v2, v1 + v2) = b(v1, v1) + b(v2, v2) + 2b(v1, v2) = 2b(v1, v2) ≠ 0 .

Moreover, observe that if Ib(V) ≠ {0}, then it is the union (in the set-theoretical sense) of 1-dimensional vector subspaces of V. In fact, if v ∈ Ib(V) and c ∈ R, it is easy to verify that cv ∈ Ib(V).
ii) If Ib(V) = {0}, then b is non-degenerate. In fact, using Proposition 4, if b(u0, v) = 0 for all v ∈ V, then in particular b(u0, u0) = 0 and this implies that u0 ∈ Ib(V) = {0}.
The converse is false! Namely, there exist non-degenerate symmetric bilinear forms whose isotropic cone is not trivial. For example, consider the matrix:

    A = [ 0  1 ]
        [ 1 −1 ] .

It defines (w.r.t. the canonical basis of R2) a non-degenerate symmetric bilinear form on R2 (in fact rank(A) = 2), but Ib(R2) ≠ {0}; in fact, e1 ∈ Ib(R2).

Proposition 5. Let b : V × V −→ R be a symmetric bilinear form and u ∈ V a non-isotropic vector. Then,

    V = ⟨u⟩ ⊕ u⊥ ,

i.e., each vector v ∈ V can be decomposed uniquely into the sum of a vector v1 parallel (proportional) to u and a vector v2 b-orthogonal to u. These vectors are given by:

    v1 = (b(u, v)/b(u, u)) u   and   v2 = v − (b(u, v)/b(u, u)) u .

Proof. By hypothesis, we know that b(u, u) ≠ 0. We need to verify that

    ⟨u⟩ ∩ u⊥ = ⟨0⟩   and   ⟨u⟩ + u⊥ = V.

Let us start by proving the first equality. If w ∈ ⟨u⟩ ∩ u⊥, then w = cu (for some c ∈ R) and b(w, u) = 0. Therefore 0 = b(cu, u) = c b(u, u) and this implies - since b(u, u) ≠ 0 - that c = 0. Hence, w = 0.
Let us verify now that ⟨u⟩ + u⊥ = V. For any v ∈ V, if we define v1 and v2 as above, we have: v1 + v2 = v, v1 ∈ ⟨u⟩ and v2 ∈ u⊥. In fact:

    b(u, v2) = b( u, v − (b(u, v)/b(u, u)) u ) = b(u, v) − (b(u, v)/b(u, u)) b(u, u) = 0 .  □

Definition. Let b : V × V −→ R be a symmetric bilinear form and u ∈ V a non-isotropic vector. For any v ∈ V, the number b(u, v)/b(u, u) is called Fourier coefficient of v with respect to u and will be denoted by cu(v). The vector v1 = cu(v) u is called b-orthogonal projection of v in the u direction. Observe that the application

    cu : V −→ R
    v 7−→ cu(v)

is a linear application [Exercise].
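[Numerical aside. A sketch of Proposition 5's decomposition, with assumed example data: the Fourier coefficient c_u(v) = b(u, v)/b(u, u), the projection v1 = c_u(v) u, and the remainder v2 = v − v1, which is b-orthogonal to u. Here b is taken to be the standard dot product on R^3.]

import numpy as np

A = np.eye(3)                        # matrix of b (canonical inner product)

def b(x, y, A=A):
    return x @ A @ y

u = np.array([1.0, 0.0, 1.0])        # non-isotropic: b(u, u) = 2 != 0
v = np.array([2.0, 1.0, 0.0])

c = b(u, v) / b(u, u)                # Fourier coefficient c_u(v)
v1 = c * u                           # b-orthogonal projection of v along u
v2 = v - v1

assert np.isclose(b(u, v2), 0.0)     # v2 is b-orthogonal to u
assert np.allclose(v1 + v2, v)       # v = v1 + v2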

Proposition 6. Let V be an n-dimensional vector space with basis E. Let b be a symmetric bilinear form on V , such that

    b(Ex, Ey) = x^T A y .

Let W be an s-dimensional vector subspace of V, with basis {z1, . . . , zs}, and let (z1 . . . zs) = ED (with D ∈ Mn,s(R) and rank(D) = s). Then the cartesian equations of W⊥ (i.e., its b-orthogonal subspace) with respect to E are given by the HLS(s, n, R):

    D^T A X = 0 .

Moreover, if b is non-degenerate, then dim W⊥ = n − dim W.

Proof. Let v = Ex ∈ V. We have:

    v = Ex ∈ W⊥ ⇐⇒ b(z1, v) = . . . = b(zs, v) = 0 ⇐⇒
    ⇐⇒ (D^(1))^T A x = . . . = (D^(s))^T A x = 0 ⇐⇒
    ⇐⇒ D^T A x = 0 ,

where D^(1), . . . , D^(s) denote the columns of D. Hence, W⊥ has cartesian equations given by the HLS(s, n, R)

    D^T A X = 0 .

If b is non-degenerate, then rank A = n and therefore A ∈ GLn(R). In this case, rank(D^T A) = rank(D^T) = rank(D) = s. Therefore,

    dim W⊥ = n − rank(D^T A) = n − s = n − dim W.  □

Example. Let b be a symmetric bilinear form on R4 defined, with respect to the canonical basis E = (e1 e2 e3 e4) of R4, by the matrix

    A = [ 1  0  0  0 ]
        [ 0  1  2 −1 ]
        [ 0  2  0  0 ]
        [ 0 −1  0 −1 ] .

i) Verify that b is non-degenerate.
ii) Set W = ⟨e3, e1 + e2⟩. Determine the cartesian equations of W⊥.
iii) Verify that W ∩ W⊥ = ⟨0⟩.
iv) Determine the b-isotropic vectors in W⊥.

Solution: i) b is non-degenerate, since det A ≠ 0.
ii) We have:

    (e3 e1 + e2) = ED,  with  D = [ 0  1 ]
                                  [ 0  1 ]
                                  [ 1  0 ]
                                  [ 0  0 ] .

W⊥ has cartesian equations D^T A X = 0, where

    D^T A = [ 0  2  0  0 ]
            [ 1  1  2 −1 ] ,

i.e.,

    { 2x2 = 0                     { x2 = 0
    { x1 + x2 + 2x3 − x4 = 0  ⇐⇒  { x1 + 2x3 − x4 = 0 .

Therefore, dim W⊥ = 4 − 2 = 2 and W⊥ = ⟨(−2, 0, 1, 0), (1, 0, 0, 1)⟩.
iii) The cartesian equations of W are obtained directly imposing the condition rank(D | x) = 2:

    { x1 − x2 = 0
    { x4 = 0 .

Hence, W ∩ W⊥ has cartesian equations given by:

    { x1 − x2 = 0             { x1 = 0
    { x4 = 0             ⇐⇒   { x2 = 0
    { x2 = 0                  { x3 = 0
    { x1 + 2x3 − x4 = 0       { x4 = 0 ;

this means W ∩ W⊥ = ⟨0⟩.
iv) We have:

    Ex ∈ Ib(R4) ⇐⇒ x^T A x = 0 ⇐⇒ x1² + x2² + 4x2x3 − 2x2x4 − x4² = 0 .

Therefore, Ex ∈ Ib(R4) ∩ W⊥ if and only if

    { x2 = 0                     { x2 = 0                   { x2 = 0
    { x1 + 2x3 − x4 = 0   ⇐⇒    { x1 + 2x3 − x4 = 0   or   { x1 + 2x3 − x4 = 0
    { x1² − x4² = 0              { x1 − x4 = 0              { x1 + x4 = 0 .

In conclusion:

    Ib(R4) ∩ W⊥ = {(t, 0, 0, t) ∈ R4, for all t ∈ R} ∪ {(−t, 0, t, t) ∈ R4, for all t ∈ R}

or equivalently

    Ib(R4) ∩ W⊥ = ⟨(1, 0, 0, 1)⟩ ∪ ⟨(−1, 0, 1, 1)⟩ .
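[Numerical aside. A sketch verifying the example above numerically; the data is exactly the A and D of the example.]

import numpy as np

A = np.array([[1.0,  0.0, 0.0,  0.0],
              [0.0,  1.0, 2.0, -1.0],
              [0.0,  2.0, 0.0,  0.0],
              [0.0, -1.0, 0.0, -1.0]])
D = np.array([[0.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [0.0, 0.0]])           # columns: coordinates of e3 and e1 + e2

assert not np.isclose(np.linalg.det(A), 0.0)   # i) b is non-degenerate

# ii) the basis of W^perp found above satisfies D^T A x = 0:
for w in [np.array([-2.0, 0.0, 1.0, 0.0]), np.array([1.0, 0.0, 0.0, 1.0])]:
    assert np.allclose(D.T @ A @ w, 0.0)

# iv) the two isotropic lines found above:
for x in [np.array([1.0, 0.0, 0.0, 1.0]), np.array([-1.0, 0.0, 1.0, 1.0])]:
    assert np.isclose(x @ A @ x, 0.0)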

3. Quadratic forms

Definition. Let b : V × V −→ R be a symmetric bilinear form. The application

    Q : V −→ R
    v 7−→ Q(v) := b(v, v)

is called quadratic form associated to b.

Remark. For any v1, v2 ∈ V , we have:

    Q(v1 + v2) = b(v1 + v2, v1 + v2) =
               = b(v1, v1) + b(v2, v2) + 2b(v1, v2) =
               = Q(v1) + Q(v2) + 2b(v1, v2) .

Therefore, we can reconstruct b from its quadratic form:

    b(v1, v2) = (1/2) [Q(v1 + v2) − Q(v1) − Q(v2)]

(b is also called polar form of Q). In particular, it follows that there exists a 1 − 1 correspondence between the set of symmetric bilinear forms on V and the set of quadratic forms on V.
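[Numerical aside. A sketch of the polarization identity above, with an assumed example matrix: b is recovered from Q alone.]

import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, -1.0]])          # a symmetric matrix defining b (example data)

def Q(x):
    return x @ A @ x                 # quadratic form Q(v) = b(v, v)

v1 = np.array([1.0, 3.0])
v2 = np.array([-2.0, 1.0])

# Polar form: b(v1, v2) = (Q(v1 + v2) - Q(v1) - Q(v2)) / 2.
b_polar = (Q(v1 + v2) - Q(v1) - Q(v2)) / 2.0
assert np.isclose(b_polar, v1 @ A @ v2)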

Let V be an n-dimensional vector space and E a basis for V. Let b be a symmetric bilinear form on V, with b(Ex, Ey) = x^T A y. For any vector v = Ex ∈ V, we have:

    Q(v) = Q(Ex) = b(Ex, Ex) = x^T A x = Σ_{i=1}^n Σ_{j=1}^n aij xi xj ;

this is called expression of the quadratic form Q with respect to E.

Remark. The expression

    Σ_{i=1}^n Σ_{j=1}^n aij xi xj

is a homogeneous polynomial of degree 2 in x1, . . . , xn. Conversely, every homogeneous polynomial of degree 2 in x1, . . . , xn determines a quadratic form (with respect to a fixed basis E). In fact, if

    P(x1, . . . , xn) = Σ_{1≤i≤j≤n} bij xi xj

is an arbitrary homogeneous polynomial of degree 2 in R[x1, . . . , xn], it suffices to define a symmetric matrix A = (aij) ∈ Mn(R), where

    aii = bii   for i = 1, . . . , n ;
    aij = aji = bij/2   for 1 ≤ i < j ≤ n .

Consider now the quadratic form Q : V −→ R, such that Q(Ex) = x^T A x. It is easy to verify that Q(Ex) = P(x1, . . . , xn).
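[Numerical aside. A sketch of the recipe just described: build the symmetric matrix A from the coefficients b_ij of a homogeneous degree-2 polynomial. The coefficient dictionary (0-based indices, our own encoding) describes the polynomial of the example below, P = x2² − 2x1x3 + x2x4 + 2x4².]

import numpy as np

coeffs = {(1, 1): 1.0, (0, 2): -2.0, (1, 3): 1.0, (3, 3): 2.0}  # (i, j) with i <= j

n = 4
A = np.zeros((n, n))
for (i, j), bij in coeffs.items():
    if i == j:
        A[i, i] = bij                # a_ii = b_ii
    else:
        A[i, j] = A[j, i] = bij / 2.0   # a_ij = a_ji = b_ij / 2

# Check Q(Ex) = P(x1, ..., xn) on a sample point.
x = np.array([1.0, 2.0, 3.0, 4.0])
P = x[1]**2 - 2*x[0]*x[2] + x[1]*x[3] + 2*x[3]**2
assert np.isclose(x @ A @ x, P)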

Hence, there is also a 1 − 1 correspondence between the set of quadratic forms on V and the set of homogeneous polynomials of degree 2 in x1, . . . , xn (and also with the set of symmetric matrices of order n).

Example. Consider the homogeneous polynomial of degree 2

    P(x1, x2, x3, x4) = x2² − 2x1x3 + x2x4 + 2x4² ∈ R[x1, x2, x3, x4] .

It defines the symmetric matrix:

    A = [  0    0   −1    0  ]
        [  0    1    0   1/2 ]
        [ −1    0    0    0  ]
        [  0   1/2   0    2  ]  ∈ M4(R).

The symmetric bilinear form b : R4 × R4 −→ R defined by A (with respect to the canonical basis) is:

    b(x, y) = x2y2 − x1y3 − x3y1 + (1/2) x2y4 + (1/2) x4y2 + 2x4y4

and the associated quadratic form:

    Q(x) = x2² − 2x1x3 + x2x4 + 2x4² .

4. Inner products

Definition. Let V be a real vector space and b : V × V −→ R a bilinear form. b is said to be positive definite if it satisfies the following two conditions:

i) b(v, v) ≥ 0 , for all v ∈ V ;
ii) b(v, v) = 0 ⇐⇒ v = 0 .

If b satisfies only i), it is called positive semidefinite. Analogously, b is said to be negative definite if it satisfies the following two conditions:

i′) b(v, v) ≤ 0 , for all v ∈ V ;
ii′) b(v, v) = 0 ⇐⇒ v = 0 .

If b satisfies only i′), it is called negative semidefinite.
A symmetric positive definite bilinear form is called inner product.

Definition. Let b be an inner product on V. For each v ∈ V, we call norm (or length) of v (with respect to b) the non-negative real number given by:

    ‖v‖ := √(b(v, v)) .

Observe that ‖v‖ is well defined, since b(v, v) ≥ 0 (b is positive definite). Moreover, the following properties are obvious:

i) ‖v‖ = 0 ⇐⇒ v = 0 ;
ii) ‖cv‖ = |c| ‖v‖ , for any c ∈ R.

We will say that a vector v is a versor (w.r.t. b) if ‖v‖ = 1; in particular, for any non-zero vector v ∈ V, the vector (1/‖v‖) v is a versor.

Moreover one can show that:

Proposition 7. Let V be a real vector space and b an inner product on V. Then, for any u, v ∈ V:

i) |b(u, v)| ≤ ‖u‖ ‖v‖ (Cauchy-Schwarz inequality);
ii) |‖u‖ − ‖v‖| ≤ ‖u + v‖ ≤ ‖u‖ + ‖v‖ (Triangle inequality).

Proof. i) For each t ∈ R we have:

    b(u + tv, u + tv) = ‖u‖² + 2b(u, v)t + ‖v‖²t² ≥ 0 .

From this inequality, it follows that the polynomial Q(t) = ‖v‖²t² + 2b(u, v)t + ‖u‖² has at most one real root and therefore its (reduced) discriminant Δ ≤ 0; hence:

    Δ = b(u, v)² − ‖u‖²‖v‖² ≤ 0 .

We can then conclude that |b(u, v)| = √(b(u, v)²) ≤ ‖u‖ ‖v‖.
ii) Let us show the second inequality. Using i), we have:

    ‖u + v‖² = b(u + v, u + v) = ‖u‖² + ‖v‖² + 2b(u, v) ≤
             ≤ ‖u‖² + ‖v‖² + 2‖u‖ ‖v‖ = (‖u‖ + ‖v‖)² .

Taking the square root on both sides, we obtain the desired inequality.
Let us show the first inequality. We have:

    ‖u‖ = ‖u + v − v‖ ≤ ‖u + v‖ + ‖−v‖ = ‖u + v‖ + ‖v‖ ⇐⇒ ‖u‖ − ‖v‖ ≤ ‖u + v‖ ;
    ‖v‖ = ‖v + u − u‖ ≤ ‖u + v‖ + ‖−u‖ = ‖u + v‖ + ‖u‖ ⇐⇒ ‖v‖ − ‖u‖ ≤ ‖u + v‖ ;

and this implies |‖u‖ − ‖v‖| ≤ ‖u + v‖.  □

Definition. A couple (V, b), where V is a real vector space and b an inner product on V, is called inner product space.
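[Numerical aside. A quick spot-check (our own test harness, with random data) of the two inequalities of Proposition 7 for the canonical inner product on R^4.]

import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    u, v = rng.normal(size=4), rng.normal(size=4)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    assert abs(u @ v) <= nu * nv + 1e-12                               # Cauchy-Schwarz
    assert abs(nu - nv) - 1e-12 <= np.linalg.norm(u + v) <= nu + nv + 1e-12   # triangle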

Example. [Canonical inner product on Rn] Define

    · : Rn × Rn −→ R
    (x, y) 7−→ x · y := x1y1 + . . . + xnyn ,

where x = (x1, . . . , xn) and y = (y1, . . . , yn). This is a bilinear form [Exercise] and its matrix (with respect to the canonical basis of Rn) is In (the identity):

    x · y = x^T In y .

Moreover,

i) x · x = x1² + . . . + xn² ≥ 0 ;
ii) x · x = x1² + . . . + xn² = 0 ⇐⇒ x = 0 .

Therefore, · is an inner product and is called canonical (or standard) inner product.

Remark. From ii) in the definition of positive definiteness, it follows that if b is an inner product, then Ib(V) = {0} (i.e., no nonzero vector is b-isotropic). The converse is false; for instance, consider the bilinear form on R2 given (with respect to the canonical basis of R2) by:

    A = [ −2  1 ]
        [  1 −2 ] .

Obviously, b is not an inner product: b(e1, e1) = −2 < 0; but one can easily show that Ib(R2) = {0}. In fact, Ib(R2) has equation −2x² + 2xy − 2y² = 0 and this equation has only one solution in R2, given by (0, 0) [to see this, one can set t = x/y (for y ≠ 0) and see that the equation −2t² + 2t − 2 = 0 has no solutions in R].

It is also evident that if b is an inner product on V, with dim V = n, then it has rank n, i.e., it is non-degenerate. In particular, it is always possible to find a new basis of V such that the matrix representing b w.r.t. this basis is In. Such a basis is called b-orthonormal.

Definition. Let V be an n-dimensional vector space and E = (e1 . . . en) a basis. Let b be an inner product on V, with matrix A (w.r.t. E), i.e., b(Ex, Ey) = x^T A y. If A = In, E is called b-orthonormal basis and one has:

    b(Ex, Ey) = x^T In y = Σ_{i=1}^n xi yi   ∀ Ex, Ey ∈ V.

Hence, with respect to this basis, b acts on the coordinates of the vectors of V as the canonical inner product on Rn.

Remark. We shall see in the next section how we can actually compute such a basis. In the meanwhile, let us just verify the following properties.
i) Let b be an inner product on V. Let us verify that t non-zero vectors u1, . . . , ut ∈ V, pairwise b-orthogonal, are linearly independent. Suppose that Σ_{i=1}^t ci ui = 0; for j = 1, . . . , t:

    0 = b(uj, 0) = b( uj, Σ_{i=1}^t ci ui ) = Σ_{i=1}^t ci b(uj, ui) = cj b(uj, uj)

and since b(uj, uj) > 0, it follows that cj = 0 (for all j = 1, . . . , t).
ii) If E is a b-orthonormal basis of V, then:

    b(v, ei) = xi   ∀ v = E(x1, . . . , xn)^T ∈ V

[in fact, b(v, ei) = b( Σ_{j=1}^n xj ej, ei ) = Σ_{j=1}^n xj b(ej, ei) = xi]. It follows that for any vector v ∈ V, the following representation holds:

    v = Σ_{i=1}^n b(v, ei) ei .

Proposition 8. Let b be an inner product on V, an n-dimensional vector space. Let E and F be two bases with F = EC, where C ∈ GLn(R). One has:
i) If E and F are both b-orthonormal, then C ∈ On(R) [i.e., it is orthogonal: C^T C = In].
ii) If E is b-orthonormal and C ∈ On(R), then F is also b-orthonormal.
Therefore, there is a 1 − 1 correspondence between b-orthonormal bases and On(R).

Proof. Let A be the matrix that represents b with respect to E and B the matrix of b with respect to F. We have already observed that B = C^T A C.
i) If E and F are both b-orthonormal, then B = A = In and consequently C^T In C = In. Therefore C ∈ On(R).
ii) If E is b-orthonormal and C ∈ On(R), then A = In and consequently the matrix of b w.r.t. F is B = C^T In C = C^T C = In. Hence, F is also b-orthonormal.  □
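[Numerical aside. A sketch of Proposition 8 ii): starting from a b-orthonormal basis (A = In) and a random orthogonal C (obtained here, by our own choice, from a QR factorization), the new matrix C^T In C is again the identity.]

import numpy as np

n = 3
M = np.random.default_rng(1).normal(size=(n, n))
C, _ = np.linalg.qr(M)               # Q-factor of a random matrix: orthogonal

B = C.T @ np.eye(n) @ C              # matrix of b w.r.t. F = EC
assert np.allclose(B, np.eye(n))     # F is b-orthonormal too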

5. Gram-Schmidt process

Let V be a real vector space and fix an inner product b. What we are going to show in this section is that, given any (finite or countable) ordered set of vectors of V, it is always possible to b-orthogonalize them, i.e., substitute them with a new set of vectors, pairwise b-orthogonal, that is equivalent (in a certain sense to be specified) to the original one. The proof of this result will provide a simple algorithm, known as the Gram-Schmidt process. Let us start with a definition.

Definition. Let V be a real vector space with an inner product b and two sets {u1, . . . , ut, . . .}, {w1, . . . , wt, . . .} of vectors of V, that are in 1 − 1 correspondence. We shall say that {w1, . . . , wt, . . .} is a b-orthogonalization of {u1, . . . , ut, . . .}, if for any k ≥ 1:

(ak) ⟨u1, . . . , uk⟩ = ⟨w1, . . . , wk⟩ ;
(bk) the vectors w1, . . . , wk are pairwise b-orthogonal.

Theorem 1 (Gram-Schmidt). Let V be a real vector space and b an inner product on V. Given any set of vectors {u1, . . . , ut, . . .} of V, there exists a b-orthogonalization {w1, . . . , wt, . . .}. This is essentially unique, in the sense that if {w′1, . . . , w′t, . . .} is another b-orthogonalization of {u1, . . . , ut, . . .}, then ⟨wk⟩ = ⟨w′k⟩ for any k ≥ 1 (i.e., the two vectors are proportional).

Proof. Let us proceed by induction on t. If t = 1, then it will be sufficient to set w1 = u1 and it is trivial to verify that the properties (a1) and (b1) of the above definition hold.

Let t ≥ 2 and suppose that there exist t vectors w1, . . . , wt that verify the properties (ak) and (bk) for 1 ≤ k ≤ t; we want to construct a vector wt+1, such that the vectors w1, . . . , wt+1 verify (at+1) and (bt+1). Let us set:

(1)    wt+1 = ut+1 − Σ_{i=1}^t ci wi ,

where

    ci = 0 if wi = 0 ,   ci = c_{wi}(ut+1) if wi ≠ 0

[c_{wi}(ut+1) is the Fourier coefficient of ut+1 with respect to wi]. Obviously:

    wt+1 ∈ ⟨w1, . . . , wt, ut+1⟩   and   ut+1 ∈ ⟨w1, . . . , wt, wt+1⟩ .

From (at) it follows that

    ⟨u1, . . . , ut+1⟩ = ⟨w1, . . . , wt, wt+1⟩ ;

therefore (at+1) holds. Let us verify (bt+1). It will be enough to verify that wt+1 ⊥ wj, for any j = 1, . . . , t. If wj = 0, the claim is trivial. So let us assume that wj ≠ 0. We have:

    b(wt+1, wj) = b(ut+1, wj) − Σ_{i=1}^t ci b(wi, wj) = b(ut+1, wj) − cj b(wj, wj) = 0 .

We now want to discuss the uniqueness. If t = 1 the claim is obvious. Assume that t ≥ 2 and suppose we have verified that ⟨wk⟩ = ⟨w′k⟩ for any k ≤ t. We need to show that ⟨wt+1⟩ = ⟨w′t+1⟩. Since the two b-orthogonalizations satisfy (at+1), we have:

    ⟨w1, . . . , wt+1⟩ = ⟨u1, . . . , ut+1⟩ = ⟨w′1, . . . , w′t+1⟩ .

Therefore,

    w′t+1 = r wt+1 + w ,   with r ∈ R and w ∈ ⟨w1, . . . , wt⟩ .

From (bt+1), it follows:

    w′t+1 ∈ ⟨w′1, . . . , w′t⟩⊥ = ⟨w1, . . . , wt⟩⊥, and therefore b(w, w′t+1) = 0 ;
    wt+1 ∈ ⟨w1, . . . , wt⟩⊥, and therefore b(w, wt+1) = 0 .

Hence:

    b(w, w) = b(w′t+1 − r wt+1, w) = b(w′t+1, w) − r b(wt+1, w) = 0 − r·0 = 0 .

Consequently, since b is an inner product, w = 0 and it follows that w′t+1 = r wt+1, which implies:

    ⟨w′t+1⟩ ⊆ ⟨wt+1⟩ .

The other inclusion can be shown similarly.  □

The proof of the above theorem suggests an algorithm to find the b-orthogonalization of an ordered set of vectors of V. As already done before, we will set c0(v) = 0 for any v ∈ V. It follows from (1) that the Gram-Schmidt b-orthogonalization of {u1, . . . , ut, . . .} is the following:

    w1 = u1 ;
    w2 = u2 − c_{w1}(u2) w1 ;
    w3 = u3 − c_{w1}(u3) w1 − c_{w2}(u3) w2 ;
    . . .
    wt = ut − Σ_{i=1}^{t−1} c_{wi}(ut) wi ;
    . . .
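[Numerical aside. A sketch of this algorithm in Python with NumPy; the function name gram_schmidt and the calling convention are our own choices. Zero vectors w_i are kept in place and their Fourier coefficients are taken to be 0, exactly as in formula (1).]

import numpy as np

def gram_schmidt(vectors, A):
    """b-orthogonalize an ordered list of vectors, where b(x, y) = x^T A y."""
    def b(x, y):
        return x @ A @ y

    ws = []
    for u in vectors:
        u = np.asarray(u, dtype=float)
        w = u.copy()
        for wi in ws:
            bii = b(wi, wi)
            if not np.isclose(bii, 0.0):        # skip w_i = 0 (coefficient 0)
                w = w - (b(wi, u) / bii) * wi   # subtract c_{w_i}(u) w_i
        ws.append(w)
    return ws

# Usage on the five vectors of the R^3 example below (canonical inner product):
ws = gram_schmidt([(1, 0, 1), (-1, 1, 0), (0, 1, 1), (0, 1, 0), (1, 2, 0)],
                  np.eye(3))
print(ws[2], ws[4])   # both are the zero vector, as computed in that example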

Remark. Let E = (e1, . . . , en) be a basis of V. A b-orthogonalization E′ of E is still a basis of V; in fact, from the above description of the algorithm, it follows easily that E′ = EC, where C is an upper triangular matrix such that c11 = . . . = cnn = 1; therefore C ∈ GLn(R). In particular E′ is b-orthogonal. If we divide each vector by its norm, we get an orthonormal basis (i.e., every vector has norm 1 and they are pairwise orthogonal).

[See also The Factorization A = QR, in § 3.4 in the book]

Let us prove this important corollary of Gram-Schmidt's theorem.

Corollary 1. Let V be a real vector space and b an inner product on it. If W is a finite dimensional vector subspace, then

    V = W ⊕ W⊥ .

Proof. Let dim W = t. Using the Gram-Schmidt process, one can determine a b-orthonormal basis {w1, . . . , wt} of W. Since b is an inner product, then W ∩ W⊥ = {0}. So it is sufficient to verify that W + W⊥ = V.
For any v ∈ V, let us consider the vector v′ = Σ_{i=1}^t b(v, wi) wi. Obviously, v′ ∈ W. Since v = v′ + (v − v′), we only need to verify that v − v′ ∈ W⊥. In fact, for any j = 1, . . . , t:

    b(v − v′, wj) = b(v, wj) − b( Σ_{i=1}^t b(v, wi) wi, wj ) = b(v, wj) − Σ_{i=1}^t b(v, wi) b(wi, wj) =

    = b(v, wj) − b(v, wj) = 0 .  □
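[Numerical aside. A sketch of this proof (with assumed example data): project v onto a subspace W with b-orthonormal basis {w1, w2}, and check that the remainder lies in W⊥. Here b is the canonical inner product on R^3.]

import numpy as np

w1 = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)   # an orthonormal basis of W
w2 = np.array([0.0, 1.0, 0.0])
v = np.array([3.0, -1.0, 2.0])

v_proj = (v @ w1) * w1 + (v @ w2) * w2        # v' = sum_i b(v, w_i) w_i, in W
remainder = v - v_proj                        # v - v'

for w in (w1, w2):
    assert np.isclose(remainder @ w, 0.0)     # v - v' is in W^perp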

Example. In R3 with the canonical inner product, consider the following five vectors:

    u1 = (1, 0, 1), u2 = (−1, 1, 0), u3 = (0, 1, 1), u4 = (0, 1, 0), u5 = (1, 2, 0) .

Let us find a b-orthogonalization of these vectors, using the Gram-Schmidt process. We have:

    w1 = u1 ;
    w2 = u2 − ((u2 · w1)/(w1 · w1)) w1 = u2 − (−1/2) u1 = (−1/2, 1, 1/2) ;
    w3 = u3 − ((u3 · w1)/(w1 · w1)) w1 − ((u3 · w2)/(w2 · w2)) w2 = u3 − (1/2) u1 − ((3/2)/(3/2)) w2 = 0 ;
    w4 = u4 − ((u4 · w1)/(w1 · w1)) w1 − ((u4 · w2)/(w2 · w2)) w2 − 0·w3 = u4 − 0·w1 − (1/(3/2)) w2 = (1/3, 1/3, −1/3) .

Observe now that necessarily w5 = 0 (otherwise w1, w2, w4, w5 would be four linearly independent vectors in R3). Indeed,

    w5 = u5 − ((u5 · w1)/(w1 · w1)) w1 − ((u5 · w2)/(w2 · w2)) w2 − 0·w3 − ((u5 · w4)/(w4 · w4)) w4 =
       = u5 − (1/2) w1 − w2 − 3 w4 = 0 .

Example. In R3 consider the inner product b, defined - with respect to a basis E = (e1, e2, e3) - by the matrix

    A = [  2  1 −1 ]
        [  1  2  0 ]
        [ −1  0  1 ] .

Using the Gram-Schmidt process, find a b-orthonormal basis.

[Before solving this exercise, we should ask ourselves a natural question: how can we check that b is an inner product? This question raises the following problem: is it possible to characterize the inner products (i.e., positive definiteness) in terms of their matrices? We will discuss this problem immediately after this exercise.]

Using Gram-Schmidt, we can b-orthogonalize E:

    f1 = e1 ;
    f2 = e2 − (b(f1, e2)/b(f1, f1)) f1 = e2 − (1/2) e1 ;
    f3 = e3 − (b(f1, e3)/b(f1, f1)) f1 − (b(f2, e3)/b(f2, f2)) f2 ;

observing that

    b(f2, f2) = (−1/2, 1, 0) A (−1/2, 1, 0)^T = 3/2   and   b(f2, e3) = (−1/2, 1, 0) A (0, 0, 1)^T = 1/2 ,

we obtain:

    f3 = e3 + (1/2) e1 − ((1/2)/(3/2)) (e2 − (1/2) e1) = e3 + (1/2) e1 − (1/3) e2 + (1/6) e1 = (2/3) e1 − (1/3) e2 + e3 .

Hence:

    F = (f1 f2 f3) = EC   with   C = [ 1  −1/2   2/3 ]
                                     [ 0    1   −1/3 ]
                                     [ 0    0     1  ] .

Since

    ‖f1‖ = √2 ,   ‖f2‖ = √(3/2)   and   ‖f3‖ = √(1/3) ,

a b-orthonormal basis is given by:

    F′ = ( f1/‖f1‖  f2/‖f2‖  f3/‖f3‖ ) = EC′   with   C′ = [ √2/2  −√6/6   2√3/3 ]
                                                           [   0    √6/3   −√3/3 ]
                                                           [   0      0      √3  ] .

Let us consider the problem of characterizing the matrices corresponding to inner products on V (where V is, as usual, an n-dimensional vector space). Let us introduce the following notation: given A ∈ Mn(R), define (for i = 1, . . . , n) αi as the determinant of the submatrix A(1, . . . , i | 1, . . . , i). These n scalars α1, . . . , αn (with αn = det A) are called upper-left minors of A (or minors of the upper-left submatrices, see § 6.2 in the book). We state - without proving it - the following result:

Theorem 2 (Jacobi-Sylvester). Let b be a symmetric bilinear form on an n-dimensional vector space V. Let E be a basis for V and A the matrix that defines b with respect to E. Then, b is positive definite (i.e., it is an inner product) if and only if all upper-left minors of A are positive: α1, . . . , αn > 0.
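[Numerical aside. A sketch of the Jacobi-Sylvester test; the helper function is our own. It is applied to the matrix of the example above.]

import numpy as np

def is_positive_definite(A):
    """Check that all upper-left (leading principal) minors of A are positive."""
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, A.shape[0] + 1))

A = np.array([[ 2.0, 1.0, -1.0],
              [ 1.0, 2.0,  0.0],
              [-1.0, 0.0,  1.0]])    # the matrix of the example above
print(is_positive_definite(A))       # True: the minors are 2, 3, 1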

Remark. Observe that if some of the αi's are equal to zero, then the matrix might not even be positive semidefinite. For instance, consider:

    A = [ 0  0 ]          B = [ 0  0 ]
        [ 0  1 ]   and        [ 0 −1 ] ;

in both cases α1 = α2 = 0, but while A is positive semidefinite, B is negative semidefinite. In order to find a necessary and sufficient condition for positive semidefiniteness, one cannot consider only the upper-left minors, but needs to consider ALL principal minors, i.e., all determinants of submatrices of A obtained by considering the same rows and columns: A(i1, . . . , ik | i1, . . . , ik) for 1 ≤ i1 < . . . < ik ≤ n and 1 ≤ k ≤ n. The criterion in this case becomes (see also § 6.2 in the book):

Theorem 3. Let b be a symmetric bilinear form on an n-dimensional vector space V. Let E be a basis for V and A the matrix that defines b with respect to E. Then, b is positive semidefinite if and only if all principal minors of A are non-negative (i.e., ≥ 0).

Example. Verify that the symmetric bilinear form b, given in the previous example, is indeed an inner product:

    α1 = 2 > 0 ,   α2 = det [[2, 1], [1, 2]] = 3 > 0   and   α3 = det A = 1 > 0 .

Remark. [Exercise] Before closing this section, it is interesting to point out that Jacobi-Sylvester's theorem implies that, performing the Gram-Schmidt process, the b-orthogonal basis F one gets is such that:

    b(f1, f1) = α1 ,   b(fi, fi) = αi/αi−1   for i = 2, . . . , n ,

where the αi's, for i = 1, . . . , n, are the upper-left minors of the matrix we called A in the theorem.
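[Numerical aside. A sketch checking this [Exercise] remark on the example above: the Gram-Schmidt vectors f1, f2, f3 computed there satisfy b(fi, fi) = αi/αi−1.]

import numpy as np

A = np.array([[ 2.0, 1.0, -1.0],
              [ 1.0, 2.0,  0.0],
              [-1.0, 0.0,  1.0]])
F = np.array([[1.0, -0.5,  2/3],
              [0.0,  1.0, -1/3],
              [0.0,  0.0,  1.0]])    # columns: coordinates of f1, f2, f3

alphas = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]          # 2, 3, 1
ratios = [alphas[0]] + [alphas[i] / alphas[i - 1] for i in (1, 2)]

for i in range(3):
    fi = F[:, i]
    assert np.isclose(fi @ A @ fi, ratios[i])   # b(f_i, f_i) = 2, 3/2, 1/3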

[See also QR-Factorization, Least squares method and Fourier Analysis, in the book]

Department of Mathematics, Princeton University E-mail address: [email protected]