
6.5 Unitary and Orthogonal Operators and their Matrices

In this section we focus on length-preserving transformations of an inner product space. Throughout, as usual, we assume V is an inner product space.

Definition 6.36. A linear isometry of an inner product space V over F is a linear map T satisfying

$\forall x \in V,\quad \|T(x)\| = \|x\|$

It should be clear that every eigenvalue of an isometry must have modulus 1: if $T(w) = \lambda w$, then

$\|w\|^2 = \|T(w)\|^2 = \|\lambda w\|^2 = |\lambda|^2\|w\|^2$

Example 6.37. Let $T = L_A \in \mathcal{L}(\mathbb{R}^2)$, where $A = \frac{1}{5}\begin{pmatrix} 4 & -3 \\ 3 & 4 \end{pmatrix}$. Then

$\left\| T\begin{pmatrix} x \\ y \end{pmatrix} \right\|^2 = \left\| \frac{1}{5}\begin{pmatrix} 4x - 3y \\ 3x + 4y \end{pmatrix} \right\|^2 = \frac{1}{25}\bigl((4x-3y)^2 + (3x+4y)^2\bigr) = x^2 + y^2 = \left\| \begin{pmatrix} x \\ y \end{pmatrix} \right\|^2$

The matrix in this example is very special in that its inverse is its transpose:

$A^{-1} = \frac{25}{16+9}\cdot\frac{1}{5}\begin{pmatrix} 4 & 3 \\ -3 & 4 \end{pmatrix} = \frac{1}{5}\begin{pmatrix} 4 & 3 \\ -3 & 4 \end{pmatrix} = A^T$

We call such matrices orthogonal.

Definition 6.38. A unitary operator T on an inner product space V is an invertible linear map satisfying $T^*T = I = TT^*$. A unitary matrix is a matrix satisfying $A^*A = I$.

• If V is real, we usually call these orthogonal operators/matrices: this isn't strictly necessary, since unitary encompasses both real and complex spaces. Note that an orthogonal matrix satisfies $A^TA = I$.

• If β is an orthonormal basis of a finite-dimensional V, then $T \in \mathcal{L}(V)$ is unitary if and only if the matrix $[T]_\beta$ is unitary.

• We need only assume $T^*T = I$ (or $TT^* = I$) if V is finite-dimensional: if β is an orthonormal basis, then

$T^*T = I \iff [T^*]_\beta[T]_\beta = I \iff [T]_\beta[T^*]_\beta = I \iff TT^* = I$

If V is infinite-dimensional, we need $T^*$ to be both the left- and right-inverse of T. This isn't an empty requirement: see Exercise 6.5.12.

Example 6.39. The matrix $A = \frac{1}{3}\begin{pmatrix} i & 2+2i \\ 2-2i & i \end{pmatrix}$ is easily seen to be unitary:

$A^*A = \frac{1}{9}\begin{pmatrix} -i & 2+2i \\ 2-2i & -i \end{pmatrix}\begin{pmatrix} i & 2+2i \\ 2-2i & i \end{pmatrix} = \frac{1}{9}\begin{pmatrix} 9 & 0 \\ 0 & 9 \end{pmatrix} = I$

Theorem 6.40. Let T be a linear operator on V.

1. If T is a unitary/orthogonal operator, then it is a linear isometry.

2. If T is a linear isometry and V is finite-dimensional, then T is unitary/orthogonal.

Proof. 1. If T is unitary, then

$\forall x, y \in V,\quad \langle x, y\rangle = \langle T^*T(x), y\rangle = \langle T(x), T(y)\rangle \qquad (\dagger)$

In particular, taking x = y shows that T is an isometry.

2. $(I - T^*T)^* = I^* - (T^*T)^* = I - T^*T$ is self-adjoint. By the spectral theorem, there exists an orthonormal basis of V of eigenvectors of $I - T^*T$. For any such x with eigenvalue λ,

$0 = \|x\|^2 - \|T(x)\|^2 = \langle x, x\rangle - \langle T(x), T(x)\rangle = \langle x, (I - T^*T)x\rangle = \lambda\|x\|^2 \implies \lambda = 0$

Since $I - T^*T = 0$ on a basis, $T^*T = I$. Since V is finite-dimensional, we also have $TT^* = I$, whence T is unitary.

The finite-dimensional restriction is important in part 2: we use the existence of adjoints, the spectral theorem, and that a left-inverse is also a right-inverse. Again, see Exercise 6.5.12 for an example of a non-unitary isometry in infinite dimensions. The proof shows a little more:

Corollary 6.41. On a finite-dimensional space, being unitary is equivalent to each of the following:

(a) Preservation of the inner product^a (†). In particular, in a real inner product space isometries also preserve the angle θ between vectors since $\cos\theta = \frac{\langle x, y\rangle}{\|x\|\,\|y\|}$.

(b) The existence of an orthonormal basis $\beta = \{w_1, \ldots, w_n\}$ such that $T(\beta) = \{T(w_1), \ldots, T(w_n)\}$ is also orthonormal.

(c) That every orthonormal basis β of V is mapped to an orthonormal basis T(β).

a(†) is in fact equivalent to being an isometry in infinite dimensions: recall the identity. . .

While (a) is simply (†), claims (b) and (c) are also worth proving explicitly: see Exercise 6.5.8. If β is the standard orthonormal basis of $F^n$ and $T = L_A$, then the columns of A form the orthonormal set T(β). This makes identifying unitary/orthogonal matrices easy:

Corollary 6.42. A matrix $A \in M_n(\mathbb{R})$ is orthogonal if and only if its columns form an orthonormal basis of $\mathbb{R}^n$ with respect to the standard (dot) inner product.
A matrix $A \in M_n(\mathbb{C})$ is unitary if and only if its columns form an orthonormal basis of $\mathbb{C}^n$ with respect to the standard (hermitian) inner product.
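For readers who like to experiment, here is a minimal numerical sketch (assuming Python with NumPy) of Corollary 6.42: a matrix is orthogonal/unitary exactly when its columns are orthonormal, equivalently $A^*A = I$. The helper name `is_unitary` is our own, not a library function.

```python
import numpy as np

def is_unitary(A, tol=1e-12):
    """Check A*A = I, i.e. the columns of A are orthonormal
    with respect to the standard (hermitian) inner product."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[1]
    return np.allclose(A.conj().T @ A, np.eye(n), atol=tol)

# The orthogonal matrix of Example 6.37 and the unitary matrix of Example 6.39
A = np.array([[4, -3], [3, 4]]) / 5
B = np.array([[1j, 2 + 2j], [2 - 2j, 1j]]) / 3

print(is_unitary(A))                            # True
print(is_unitary(B))                            # True
print(is_unitary(np.array([[1, 1], [0, 1]])))   # False: columns not orthonormal
```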

Examples 6.43. 1. The matrix $A_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \in M_2(\mathbb{R})$ is orthogonal for any θ. Example 6.37 is this with $\theta = \tan^{-1}\frac{3}{4}$. More generally (Exercise 6.5.6.), it can be seen that every real orthogonal 2 × 2 matrix has the form $A_\theta$ or

$B_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$

for some angle θ. The effect of the linear map $L_{A_\theta}$ is to rotate counter-clockwise by θ, while that of $L_{B_\theta}$ is to reflect across the line making angle $\frac{1}{2}\theta$ with the positive x-axis.

2. $A = \frac{1}{\sqrt{6}}\begin{pmatrix} \sqrt{2} & \sqrt{3} & 1 \\ \sqrt{2} & 0 & -2 \\ \sqrt{2} & -\sqrt{3} & 1 \end{pmatrix} \in M_3(\mathbb{R})$ is orthogonal: check the columns!

3. The matrix $A = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$ is unitary: indeed it maps the standard basis to the orthonormal basis

$T(\beta) = \left\{ \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}, \frac{1}{\sqrt{2}}\begin{pmatrix} i \\ 1 \end{pmatrix} \right\}$

It is also easy to check that the characteristic polynomial is

$p(t) = \det\begin{pmatrix} \frac{1}{\sqrt{2}} - t & \frac{i}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} & \frac{1}{\sqrt{2}} - t \end{pmatrix} = \left(t - \frac{1}{\sqrt{2}}\right)^2 + \frac{1}{2} \implies t = \frac{1}{\sqrt{2}}(1 \pm i) = e^{\pm i\pi/4}$

whence the eigenvalues of T both have modulus 1.

4. Here is an example of an infinite-dimensional unitary operator. On the space $C[-\pi, \pi]$, the function $T(f(x)) = e^{ix}f(x)$ is linear. Moreover

$\langle e^{ix}f(x), g(x)\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{ix}f(x)\overline{g(x)}\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\overline{e^{-ix}g(x)}\,dx = \langle f(x), e^{-ix}g(x)\rangle$

whence $T^*(f(x)) = e^{-ix}f(x)$. Indeed $T^* = T^{-1}$ and so T is a unitary operator. Since $C[-\pi, \pi]$ is infinite-dimensional, we don't expect all parts of the Corollary to hold:

• Being unitary, T preserves the inner product. However, in contrast to a unitary operator on a finite-dimensional complex space, T has no eigenvalues/eigenvectors, since

$T(f) = \lambda f \iff \forall x,\ e^{ix}f(x) = \lambda f(x) \iff f(x) \equiv 0$

• T certainly maps any orthonormal set to an orthonormal set; however, it can be seen that $C[-\pi, \pi]$ has no orthonormal basis!^a

^a An orthonormal set $\beta = \{f_k : k \in \mathbb{Z}\}$ can be found so that every function f equals an infinite series in the sense that $\|f - \sum a_kf_k\| = 0$. However, these are not finite sums and so β is not a basis. Moreover, given that the norm is defined by an integral, this also isn't quite the same as saying that $f = \sum a_kf_k$ as functions. Indeed there is no guarantee that such an infinite series is itself continuous! For these reasons, when working with Fourier series, one tends to consider a broader class than the continuous functions.

Unitary and Orthogonal Equivalence

Suppose $A \in M_n(\mathbb{R})$ is symmetric (self-adjoint): $A^T = A$. By the spectral theorem, A has an orthonormal eigenbasis $\beta = \{w_1, \ldots, w_n\}$: $Aw_j = \lambda_jw_j$. If we write $U = (w_1\ \cdots\ w_n)$, then the columns of U are orthonormal and thus U is an orthogonal matrix. We can therefore write

$A = UDU^{-1} = U\begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix}U^T$

The same approach works if $A \in M_n(\mathbb{C})$ is normal: we now have $A = UDU^*$ where U is unitary.

Example 6.44. The matrix $A = \begin{pmatrix} 1+i & 1+i \\ -1-i & 1+i \end{pmatrix}$ is normal, as can easily be checked. Its characteristic polynomial is

$p(t) = t^2 - 2(1+i)t + 4i = (t - 2i)(t - 2)$

with corresponding orthonormal eigenvectors

$w_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix}, \qquad w_{2i} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}$

We conclude that

$A = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 2i \end{pmatrix}\left[\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}\right]^{-1} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 2i \end{pmatrix}\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix}$

This is an example of unitary equivalence:

Definition 6.45. Square matrices A, B are unitarily equivalent if there exists a unitary matrix U such that $B = U^*AU$. Orthogonal equivalence is similar: $B = U^TAU$ with U orthogonal.

The above discussion proves half the following:

Theorem 6.46. $A \in M_n(\mathbb{C})$ is normal if and only if it is unitarily equivalent to a diagonal matrix (the matrix of its eigenvalues). Similarly, $A \in M_n(\mathbb{R})$ is symmetric if and only if it is orthogonally equivalent to a diagonal matrix.

Proof. We have already observed the (⇒) direction.
For the converse, let D be diagonal, U unitary, and $A = U^*DU$. Then

$A^*A = (U^*DU)^*U^*DU = U^*D^*UU^*DU = U^*D^*DU = \cdots = U^*DU(U^*DU)^* = AA^*$

since $U^* = U^{-1}$ and because diagonal matrices commute: $D^*D = DD^*$. In the special case where A is real and U is orthogonal, then A is symmetric:

$A^T = (U^TDU)^T = U^TD^TU = U^TDU = A$
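Here is a quick numerical check of Example 6.44 and Theorem 6.46 (a sketch assuming Python with NumPy; for a normal matrix with distinct eigenvalues the unit eigenvectors returned by `numpy.linalg.eig` are orthogonal, so the eigenvector matrix is unitary up to rounding):

```python
import numpy as np

# The normal matrix of Example 6.44
A = np.array([[1 + 1j, 1 + 1j],
              [-1 - 1j, 1 + 1j]])

# Normality: A*A = AA*
print(np.allclose(A.conj().T @ A, A @ A.conj().T))      # True

evals, U = np.linalg.eig(A)
print(np.round(evals, 10))                               # 2 and 2i (in some order)
print(np.allclose(U.conj().T @ U, np.eye(2)))            # True: U is (numerically) unitary
print(np.allclose(U @ np.diag(evals) @ U.conj().T, A))   # True: A = U D U*
```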

4 Exercises. 6.5.1. For each matrix A find an orthogonal or unitary U and a diagonal D = U∗ AU. 2 1 1 1 2 0 1 2 3 3i (a) (b) − (c) − (d) 1 2 1 2 1 1 0 3 + 3i 5         1 1 2 6.5.2. Which of the following pairs are unitarily/orthogonally equivalent? Explain your answer. 0 1 0 2 0 0 0 1 0 2 − (a) A = and B = (b) A = 1 0 0 and B = 0 1 0 1 0 2 0    −      0 0 1 0 0 0 0 1 0 1 0 0     − (c) A = 1 0 0 and B = 0 i 0     0 0 1 0 0 i −   2 2  a eiθ b 6.5.3. Let a, b C be such that a + b = 1. Prove that every 2 2 matrix of the form − is ∈ | | | | × b eiθ a unitary. Are these all the unitary 2 2 matrices? Prove or disprove. . .   × 1 6.5.4. If A, B are orthogonal/unitary, prove that AB and A− are also orthogonal/unitary. (This proves that orthogonal/unitary matrices are groups under ) 6.5.5. Check that A = 1 5 4i M (C) satisfies AT A = I and is therefore a complex orthogonal 3 4i −5 ∈ 2 matrix.  (These don’t have the same nice relationship with inner products, and are thus less useful to us) 6.5.6. Supply the details of Exercise 6.43.1. (Hints: β = i, j is orthonormal, whence Ai, Aj must be orthonormal. Now draw pictures to { } { } compute the result of rotating and reflecting the vectors i and j.) 6.5.7. Prove that A M (C) has an orthonormal basis of eigenvectors whose eigenvalues have ∈ n modulus 1, if and only if A is unitary. 6.5.8. Prove parts (b) and (c) of Corollary 6.41 for a finite-dimensional inner product space: (a) If β is an orthonormal basis such that T(β) is orthonormal, then T is unitary. (b) If T is unitary, and η is an orthonormal basis, then T(η) is an orthonormal basis. 6.5.9. Let T be a linear operator on a finite-dimensional inner product space V. If T(x) = x for || || || || all x in some orthonormal basis of V, must T be unitary? Prove or disprove. 6.5.10. Let T be a unitary operator on an inner product space V and let W be a finite-dimensional T-invariant subspace of V. Prove:

(a)T (W) = W (Hint: show that TW is injective. . . ); (b) W⊥ is T-invariant. 6.5.11. Let W a subspace of an inner product space V such that V = W W . Define T (V) by ⊕ ⊥ ∈ L T(u + w) = u w where u W and w W . Prove that T is unitary and self-adjoint. − ∈ ∈ ⊥ 6.5.12. In the inner product space `2 of square-summable sequences (see section 6.1), consider the linear operator T(x1, x2,...) = (0, x1, x2,...). Prove that T is an isometry and compute its adjoint. Check that T is non-invertible and non-unitary. 6.5.13. Prove Schur’s Lemma for matrices. Every A M (R) is orthogonally equivalent and every ∈ n A M (C) is unitarily equivalent to an upper . ∈ n

6.6 Orthogonal Projections

Recall the discussion surrounding the Gram–Schmidt process, where we saw that any finite-dimensional subspace W of an inner product space V has an orthonormal basis $\beta_W = \{w_1, \ldots, w_n\}$. We could then define the orthogonal projection onto W as the map

$\pi_W: V \to V : x \mapsto \sum_{j=1}^{n}\langle x, w_j\rangle\,w_j$

In this section we develop the projections more rigorously, though we start from a slightly different place. First recall the notion of a direct sum within a vector space V:

$V = X \oplus Y \iff \forall v \in V,\ \exists!\ x \in X,\ y \in Y \text{ such that } v = x + y$

Definition 6.47. A linear map $T \in \mathcal{L}(V)$ is a projection if:

$V = \mathcal{R}(T) \oplus \mathcal{N}(T) \quad\text{and}\quad T\big|_{\mathcal{R}(T)} = I_{\mathcal{R}(T)}$

Otherwise said, T(r + n) = r whenever $r \in \mathcal{R}(T)$ and $n \in \mathcal{N}(T)$.
Alternatively, given $V = X \oplus Y$, the projection along Y onto X is the map $v = x + y \mapsto x$.
We call a matrix $A \in M_n(F)$ a projection matrix if $L_A \in \mathcal{L}(F^n)$ is a projection.

Example 6.48. $A = \frac{1}{5}\begin{pmatrix} -1 & 2 \\ -3 & 6 \end{pmatrix}$ is a projection matrix with $\mathcal{R}(A) = \operatorname{Span}\left\{\begin{pmatrix} 1 \\ 3 \end{pmatrix}\right\}$ and $\mathcal{N}(A) = \operatorname{Span}\left\{\begin{pmatrix} 2 \\ 1 \end{pmatrix}\right\}$. Identifying projections is very easy: check this yourself for the above matrix!

Lemma 6.49. $T \in \mathcal{L}(V)$ is a projection if and only if $T^2 = T$.

Proof. Throughout, assume $r \in \mathcal{R}(T)$ and $n \in \mathcal{N}(T)$.
(⇒) Since every vector in V has a unique representation v = r + n, we simply compute

$T^2(v) = T\bigl(T(r+n)\bigr) = T(r) = r = T(v)$

(⇐) Suppose $T^2 = T$. Note first that if $r \in \mathcal{R}(T)$, then r = T(v) for some $v \in V$, whence

$T(r) = T^2(v) = T(v) = r \qquad (\dagger)$

Thus T is the identity on $\mathcal{R}(T)$. Moreover, if $x \in \mathcal{R}(T) \cap \mathcal{N}(T)$, (†) says that^a x = T(x) = 0. Now observe that for any $v \in V$,

$T\bigl(v - T(v)\bigr) = T(v) - T^2(v) = 0 \implies v - T(v) \in \mathcal{N}(T)$

so that $v = T(v) + \bigl(v - T(v)\bigr)$ is a decomposition into $\mathcal{R}(T)$- and $\mathcal{N}(T)$-parts. We conclude that $V = \mathcal{R}(T) \oplus \mathcal{N}(T)$ and that T is a projection.

^a By the Rank–Nullity Theorem, this is enough to establish $V = \mathcal{R}(T) \oplus \mathcal{N}(T)$ when V is finite-dimensional.

We can generalize the above to describe all projection matrices in $M_2(\mathbb{R})$. There are three cases:

1. A = I is the identity: $\mathcal{R}(A) = \mathbb{R}^2$ and $\mathcal{N}(A) = \{0\}$;

2. A = 0 is the zero matrix: $\mathcal{R}(A) = \{0\}$ and $\mathcal{N}(A) = \mathbb{R}^2$;

3. Choose distinct 1-dimensional subspaces $\mathcal{R}(A) = \operatorname{Span}\begin{pmatrix} a \\ b \end{pmatrix}$ and $\mathcal{N}(A) = \operatorname{Span}\begin{pmatrix} c \\ d \end{pmatrix}$, then

$A = \frac{1}{ad - bc}\begin{pmatrix} a \\ b \end{pmatrix}(d\ \ {-c}) = \frac{1}{ad - bc}\begin{pmatrix} ad & -ac \\ bd & -bc \end{pmatrix}$

Think about why this last formula does what we claim.
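As a small sketch of case 3 (assuming Python with NumPy; the helper `projection_2d` is ours, not a library routine), we can build the projection onto $\operatorname{Span}\{(a,b)\}$ along $\operatorname{Span}\{(c,d)\}$ and confirm it is idempotent; with (a, b) = (1, 3) and (c, d) = (2, 1) it reproduces the matrix of Example 6.48.

```python
import numpy as np

def projection_2d(a, b, c, d):
    """Projection matrix with range Span{(a,b)} and null space Span{(c,d)}."""
    if a * d - b * c == 0:
        raise ValueError("the two spans must be distinct lines")
    return np.array([[a * d, -a * c],
                     [b * d, -b * c]]) / (a * d - b * c)

P = projection_2d(1, 3, 2, 1)
print(P)                                           # [[-0.2  0.4] [-0.6  1.2]]  = Example 6.48
print(np.allclose(P @ P, P))                       # True: idempotent
print(P @ np.array([1, 3]), P @ np.array([2, 1]))  # (1,3) is fixed, (2,1) is killed
```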

Thus far, the discussion hasn't had anything to do with inner products. Now we specialize:

Definition 6.50. If V is an inner product space, then $T \in \mathcal{L}(V)$ is an orthogonal projection if it is a projection for which

$\mathcal{N}(T) = \mathcal{R}(T)^\perp \quad\text{and}\quad \mathcal{R}(T) = \mathcal{N}(T)^\perp$

Given a subspace $W \le V$ for which $W \oplus W^\perp = V$, the projection corresponding to $\mathcal{R}(T) = W$ and $\mathcal{N}(T) = W^\perp$ is the orthogonal projection onto W: namely $\pi_W$. The orthogonal projection onto $W^\perp$ is then $\pi_{W^\perp} = I - \pi_W$.
In the language above, the identity and zero matrices are both 2 × 2 real orthogonal projection matrices, while those of type 3 are orthogonal projections if $\begin{pmatrix} c \\ d \end{pmatrix} \parallel \begin{pmatrix} -b \\ a \end{pmatrix}$:

$A = \frac{1}{a^2 + b^2}\begin{pmatrix} a \\ b \end{pmatrix}(a\ \ b) = \frac{1}{a^2 + b^2}\begin{pmatrix} a^2 & ab \\ ab & b^2 \end{pmatrix}$

More generally, if $\beta = \{w_1, \ldots, w_k\} \le F^n$ is orthonormal, then $\pi_{\operatorname{Span}\beta}$ has matrix $\sum_{j=1}^{k} w_jw_j^*$ (see the numerical sketch following the proof below).

Theorem 6.51. A projection $T \in \mathcal{L}(V)$ is orthogonal if and only if it is self-adjoint: $T = T^*$.

Proof. (⇒) By assumption, $\mathcal{R}(T)$ and $\mathcal{N}(T)$ are orthogonal subspaces. Letting $x, y \in V$ and using subscripts to denote $\mathcal{R}(T)$- and $\mathcal{N}(T)$-parts, we see that

$\langle x, T(y)\rangle = \langle x_r + x_n, y_r\rangle = \langle x_r, y_r + y_n\rangle = \langle T(x), y\rangle \implies T^* = T$

(⇐) Suppose T is a self-adjoint projection. By the fundamental subspaces theorem,

$\mathcal{N}(T) = \mathcal{N}(T^*) = \mathcal{R}(T)^\perp$

Moreover, since T is a projection already, we have that $V = \mathcal{R}(T) \oplus \mathcal{N}(T) = \mathcal{R}(T) \oplus \mathcal{R}(T)^\perp$, from which^a $\mathcal{R}(T) = (\mathcal{R}(T)^\perp)^\perp = \mathcal{N}(T)^\perp$.

^a Recall that if $V = U \oplus U^\perp$, then $(U^\perp)^\perp = U$. Alternatively, one can check explicitly that $\|x - T(x)\| = 0$ for any $x \in \mathcal{N}(T)^\perp$ to see that $\mathcal{N}(T)^\perp \le \mathcal{R}(T) \le (\mathcal{R}(T)^\perp)^\perp = \mathcal{N}(T)^\perp$; though the calculation is really just a combination of part of Lemma 6.49 and the proof that $V = U \oplus U^\perp \implies (U^\perp)^\perp = U$.
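The following is a minimal NumPy sketch (our own helper, not a library routine) of the formula $\pi_{\operatorname{Span}\beta} = \sum_j w_jw_j^*$: orthonormalize a spanning set (here via QR), assemble the projection matrix, and observe that it is symmetric and idempotent, as Theorem 6.51 predicts. The example vectors are arbitrary.

```python
import numpy as np

def orthogonal_projection(vectors):
    """Matrix of the orthogonal projection onto the span of the given vectors,
    built as the sum of w_j w_j^T over an orthonormal basis (obtained via QR)."""
    V = np.column_stack(vectors).astype(float)
    Q, _ = np.linalg.qr(V)          # columns of Q: orthonormal basis of the span
    return Q @ Q.T                  # = sum of the outer products w_j w_j^T

P = orthogonal_projection([[1, 1, 0], [1, 2, 1]])
print(np.allclose(P, P.T))          # True: self-adjoint
print(np.allclose(P @ P, P))        # True: idempotent
print(P @ np.array([1, 1, 0]))      # vectors in the span are fixed
```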

Orthogonal Projections and the Spectral Theorem

It should be clear that every projection T has (at most) two eigenspaces:

• $\mathcal{R}(T)$ is an eigenspace with eigenvalue 1;
• $\mathcal{N}(T)$ is an eigenspace with eigenvalue 0.

If V is finite-dimensional and ρ, η are bases of $\mathcal{R}(T)$, $\mathcal{N}(T)$ respectively, then the matrix of T with respect to $\rho \cup \eta$ has block form

$[T]_{\rho\cup\eta} = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}$

where rank I = rank T. In particular, every such projection is diagonalizable. The language of projections allows us to rephrase the Spectral Theorem.

Theorem 6.52 (Spectral Theorem, mk. II). Let V be finite-dimensional and $T \in \mathcal{L}(V)$ be a normal/self-adjoint^a operator with distinct eigenvalues $\lambda_1, \ldots, \lambda_k$ and corresponding eigenspaces $E_1, \ldots, E_k$. Let $\pi_j \in \mathcal{L}(V)$ be the orthogonal projection onto $E_j$. Then:

1. $V = E_1 \oplus \cdots \oplus E_k$ is a direct sum of orthogonal subspaces; in particular, $E_j^\perp$ is the direct sum of the remaining eigenspaces;

2. $\pi_i\pi_j = 0$ if $i \ne j$;

3. $I_V = \pi_1 + \cdots + \pi_k$;

4. $T = \lambda_1\pi_1 + \cdots + \lambda_k\pi_k$.

^a Normal if V is complex, self-adjoint if V is real.

Proof. 1. T is diagonalizable and so V is the direct sum of the eigenspaces of T. Since T is normal, the eigenvectors corresponding to distinct eigenvalues are orthogonal, whence the eigenspaces are mutually orthogonal. In particular, this says that

$\hat{E}_j := \bigoplus_{i \ne j} E_i \le E_j^\perp$

Since V is finite-dimensional, we have $V = E_j \oplus E_j^\perp$, whence

$\dim\hat{E}_j = \sum_{i\ne j}\dim E_i = \dim V - \dim E_j = \dim E_j^\perp \implies \hat{E}_j = E_j^\perp$

2. This is clear by part 1, since $\mathcal{N}(\pi_j) = E_j^\perp = \hat{E}_j$.

3. Write $x = \sum_{j=1}^{k} x_j$ where each $x_j \in E_j$. Then $\pi_j(x) = x_j$: now add...

4. $T(x) = \sum_{j=1}^{k} T(x_j) = \sum_{j=1}^{k} \lambda_jx_j = \sum_{j=1}^{k} \lambda_j\pi_j(x)$.

Definition 6.53. The spectrum of a normal/self-adjoint operator T on a finite-dimensional inner product space is its set of eigenvalues. The expressions in parts 3 and 4 of the theorem are called, respectively, the resolution of the identity and the spectral decomposition of T.

Examples 6.54. 1. Recall Example 6.44 where we had a normal matrix $A = \begin{pmatrix} 1+i & 1+i \\ -1-i & 1+i \end{pmatrix}$ with orthonormal eigenvectors

$w_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix}, \qquad w_{2i} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}$

Writing $\pi_2, \pi_{2i}$ for the orthogonal projection matrices onto these eigenspaces, it is easy to see that

$\pi_2 = w_2w_2^* = \frac{1}{2}\begin{pmatrix} 1 \\ -i \end{pmatrix}(1\ \ i) = \frac{1}{2}\begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}, \qquad \pi_{2i} = w_{2i}w_{2i}^* = \frac{1}{2}\begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix}$

Now observe that

$\pi_2 + \pi_{2i} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad\text{and}\qquad 2\pi_2 + 2i\pi_{2i} = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix} + \begin{pmatrix} i & 1 \\ -1 & i \end{pmatrix} = A$

2. The matrix $A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$ has eigenvalues 2, −1, −1 and an orthonormal eigenbasis

$\{w_1, w_2, w_3\} = \left\{ \frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}, \frac{1}{\sqrt{6}}\begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix} \right\}$

As eigenspaces, we have $E_2 = \operatorname{Span}\{w_1\}$ and $E_{-1} = \operatorname{Span}\{w_2, w_3\}$. The orthogonal projections have matrices

$\pi_2 = w_1w_1^T = \frac{1}{3}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$

$\pi_{-1} = w_2w_2^T + w_3w_3^T = \frac{1}{2}\begin{pmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \frac{1}{6}\begin{pmatrix} 1 & 1 & -2 \\ 1 & 1 & -2 \\ -2 & -2 & 4 \end{pmatrix} = \frac{1}{3}\begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix}$

It is now easy to check the resolution of the identity and the spectral decomposition:

$\pi_2 + \pi_{-1} = I \qquad\text{and}\qquad 2\pi_2 - \pi_{-1} = A$
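A quick numerical verification (assuming Python with NumPy) of the resolution of the identity and the spectral decomposition for the 3 × 3 matrix of Examples 6.54.2:

```python
import numpy as np

A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])

# Orthonormal eigenbasis from Examples 6.54.2
w1 = np.array([1, 1, 1]) / np.sqrt(3)      # eigenvalue 2
w2 = np.array([1, -1, 0]) / np.sqrt(2)     # eigenvalue -1
w3 = np.array([1, 1, -2]) / np.sqrt(6)     # eigenvalue -1

pi_2 = np.outer(w1, w1)
pi_m1 = np.outer(w2, w2) + np.outer(w3, w3)

print(np.allclose(pi_2 + pi_m1, np.eye(3)))   # resolution of the identity
print(np.allclose(2 * pi_2 - pi_m1, A))       # spectral decomposition
print(np.allclose(pi_2 @ pi_m1, 0))           # distinct projections annihilate each other
```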

9 Exercises. 6.6.1. Compute the matrices of the orthogonal projections onto the following subspaces: in all cases we use the standard inner product. 4 2 (a) Span 1 in R − 1 1 (b) Span 2 , 0 in R3 1 1 − n i   1 o 3 (c) Span 1 , i in C 0 1 n 1   1 o 3 (d) Span 1 , 2 in R (watch out, these vectors aren’t orthogonal!) 0 1 6.6.2. For each ofn the matrices o in Exercise 6.5.1., compute the projections onto each eigenspace, verify the resolution of the identity and the spectral decomposition.

6.6.3. If W be a finite-dimensional subspace of an inner product space V. If T = πW is the orthogonal projection onto W, prove that I T is the orthogonal projection onto W . − ⊥ 6.6.4. Let T (V) where V is finite-dimensional. ∈ L (a) If T is an orthogonal projection, prove that T(x) x for all x V. || || ≤ || || ∈ (b) Give an example of a projection for which the inequality in (a) is false. (c) If T is a projection for which T(x) = x for all x V, what is T? || || || || ∈ (d) If T is a projection for which T(x) x for all x V, prove that T is an orthogonal || || ≤ || || ∈ projection. 6.6.5. Let T be a on a finite-dimensional inner product space. If T is a projection, prove that it must be an orthogonal projection. 6.6.6. Let T be a normal operator on a finite-dimensional complex inner product space V. Use the spectral decomposition T = λ π + + λ π to prove: 1 1 ··· k k (a) If Tn is the zero map for some n N, then T is the zero map. ∈ (b)U (V) commutes with T if and only if U commutes with each π . ∈ L j (c) There exists a normal U (V) such that U2 = T. ∈ L (d) T is invertible if and only if λ = 0 for all j. j 6 (e) T is a projection if and only if every λj = 0 or 1. (f)T = T if and only if every λ is imaginary. − ∗ j

6.7 The Singular Value Decomposition and the Pseudoinverse

Given $T \in \mathcal{L}(V, W)$ between finite-dimensional inner product spaces, the overarching concern of this chapter is the existence and computation of bases β, γ of V, W with two properties:

• That β, γ be orthonormal, thus facilitating easy calculation within V, W;
• That the matrix $[T]_\beta^\gamma$ be as simple as possible.

We have already answered two versions of this question:

Spectral Theorem: When V = W and T is normal/self-adjoint, ∃β = γ such that $[T]_\beta$ is diagonal.
Schur's Lemma: When V = W and T is any linear map, ∃β = γ such that $[T]_\beta$ is upper triangular.

In this section we allow $V \ne W$ and $\beta \ne \gamma$, and obtain a result that applies to any linear map between finite-dimensional inner product spaces. We start with an example.

Example 6.55. Let $T = L_A \in \mathcal{L}(\mathbb{R}^2, \mathbb{R}^3)$ where $A = \begin{pmatrix} 3 & 1 \\ -2 & 2 \\ 1 & 3 \end{pmatrix}$ and consider orthonormal bases $\beta = \{v_1, v_2\}$ of $\mathbb{R}^2$ and $\gamma = \{w_1, w_2, w_3\}$ of $\mathbb{R}^3$ respectively:

$\beta = \left\{ \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}, \qquad \gamma = \left\{ \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \frac{1}{\sqrt{6}}\begin{pmatrix} 1 \\ -2 \\ -1 \end{pmatrix}, \frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} \right\}$

Observe that $T(v_1) = 4w_1$ and $T(v_2) = 2\sqrt{3}\,w_2$, whence

$[T]_\beta^\gamma = \begin{pmatrix} 4 & 0 \\ 0 & 2\sqrt{3} \\ 0 & 0 \end{pmatrix}$

The bases β, γ come close to 'diagonalizing' the operator. The resulting scalars on the main diagonal (4, 2√3) behave very like eigenvalues. Our main result says that such bases always exist.

Theorem 6.56 (Singular Value Decomposition). Suppose V, W are finite-dimensional inner product spaces and that $T \in \mathcal{L}(V, W)$ has rank r.

1. There exist orthonormal bases $\beta = \{v_1, \ldots, v_n\}$ of V and $\gamma = \{w_1, \ldots, w_m\}$ of W, and positive scalars $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$ such that

$T(v_j) = \begin{cases} \sigma_jw_j & \text{if } j \le r \\ 0 & \text{otherwise} \end{cases} \qquad\text{equivalently}\qquad [T]_\beta^\gamma = \begin{pmatrix} \operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O \\ O & O \end{pmatrix}$

2. Each $v_j$ is an eigenvector of $T^*T$: indeed

$T^*T(v_j) = \begin{cases} \sigma_j^2v_j & \text{if } j \le r \\ 0 & \text{otherwise} \end{cases}$

We conclude that the scalars σj are uniquely determined by T.

Definition 6.57. The numbers $\sigma_1, \ldots, \sigma_r$ are the singular values of T. If T does not have maximum rank, we have additional zero singular values $\sigma_{r+1} = \cdots = \sigma_{\min(m,n)} = 0$. If A is a matrix, its singular values are those of the linear map $L_A$.

• While the singular values are determined by T, there is often significant freedom in the choice of the bases β, γ, particularly if any of the eigenspaces of $T^*T$ have dimension ≥ 2.

• If V = W and T is normal/self-adjoint, then we may choose β to be an eigenbasis of T. In this case $\sigma_j$ is the modulus of the corresponding eigenvalue (see Exercise 6.7.6.).

• Singular Value Decomposition for Matrices: If rank A = r, then A = UΣV∗ where

$\Sigma = \begin{pmatrix} \operatorname{diag}(\sigma_1, \ldots, \sigma_r) & O \\ O & O \end{pmatrix}, \qquad U = (w_1\ \cdots\ w_m), \qquad V = (v_1\ \cdots\ v_n)$

Since the columns of U, V are the orthonormal elements of γ, β respectively, these matrices are unitary.

Examples 6.58. 1. First recall Example 6.55. We have $A^*A = A^TA = \begin{pmatrix} 14 & 2 \\ 2 & 14 \end{pmatrix}$ with eigenvalues $\sigma_1^2 = 16$ and $\sigma_2^2 = 12$ and orthonormal eigenvectors $v_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $v_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$. The singular values are therefore $\sigma_1 = 4$, $\sigma_2 = 2\sqrt{3}$. Now compute

$w_1 = \frac{1}{\sigma_1}Av_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \qquad w_2 = \frac{1}{\sigma_2}Av_2 = \frac{1}{\sqrt{6}}\begin{pmatrix} 1 \\ -2 \\ -1 \end{pmatrix}$

and observe that these are orthonormal. Finally choose $w_3 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}$ to complete the orthonormal basis γ of $\mathbb{R}^3$. We therefore have the singular value decomposition

$A = U\Sigma V^* = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & \frac{-2}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{6}} & \frac{-1}{\sqrt{3}} \end{pmatrix}\begin{pmatrix} 4 & 0 \\ 0 & 2\sqrt{3} \\ 0 & 0 \end{pmatrix}\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{pmatrix}$

2. The matrix $A = \begin{pmatrix} 2 & 3 \\ 0 & 2 \end{pmatrix}$ has $A^TA = \begin{pmatrix} 4 & 6 \\ 6 & 13 \end{pmatrix}$ with eigenvalues $\sigma_1^2 = 16$ and $\sigma_2^2 = 1$ and orthonormal eigenbasis

$\beta = \left\{ \frac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ 2 \end{pmatrix}, \frac{1}{\sqrt{5}}\begin{pmatrix} -2 \\ 1 \end{pmatrix} \right\}$

The singular values are therefore σ1 = 4 and σ2 = 1, from which we obtain

$\gamma = \left\{ \frac{1}{\sigma_1}Av_1, \frac{1}{\sigma_2}Av_2 \right\} = \left\{ \frac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \frac{1}{\sqrt{5}}\begin{pmatrix} -1 \\ 2 \end{pmatrix} \right\}$

and the singular value decomposition

$A = U\Sigma V^* = \begin{pmatrix} \frac{2}{\sqrt{5}} & \frac{-1}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix}\begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \\ \frac{-2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \end{pmatrix}$
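Both factorizations in Examples 6.58 can be checked numerically (a sketch assuming Python with NumPy; `numpy.linalg.svd` may return different, but equally valid, sign choices for the singular vectors):

```python
import numpy as np

A = np.array([[3., 1.],
              [-2., 2.],
              [1., 3.]])

U, s, Vh = np.linalg.svd(A, full_matrices=True)
print(s)                                    # [4.  3.4641...]  i.e. 4 and 2*sqrt(3)

# Rebuild A = U Sigma V*
Sigma = np.zeros(A.shape)
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vh, A))       # True

# The squared singular values are the eigenvalues of A^T A
print(np.round(np.linalg.eigvalsh(A.T @ A), 10))   # [12. 16.]
```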

Proof of the Singular Value Decomposition. 1. $T^*T$ is self-adjoint: by the spectral theorem it has an orthonormal basis of eigenvectors $\beta = \{v_1, \ldots, v_n\}$. Suppose $T^*T(v_j) = \lambda_jv_j$, then

$\langle T(v_j), T(v_k)\rangle = \langle T^*T(v_j), v_k\rangle = \lambda_j\delta_{jk}$

Certainly every eigenvalue is a non-negative real number: $\lambda_j = \|T(v_j)\|^2 \ge 0$. Since rank T = r, exactly r of the $\lambda_j$'s are non-zero: by reordering the basis vectors if necessary, we may assume that

$\lambda_1 \ge \cdots \ge \lambda_r > 0$

If $j \le r$, define $\sigma_j := \sqrt{\lambda_j} > 0$ and $w_j := \frac{1}{\sigma_j}T(v_j)$; then the set $\{w_1, \ldots, w_r\}$ is orthonormal. If necessary, extend this to an orthonormal basis γ.

2. This is very easy given part 1:

$\langle T^*(w_j), v_k\rangle = \langle w_j, T(v_k)\rangle = \langle w_j, \sigma_kw_k\rangle = \sigma_k\delta_{jk} = \langle \sigma_jv_j, v_k\rangle \implies T^*(w_j) = \sigma_jv_j$

since β is a basis. We conclude that $T^*T(v_j) = \sigma_j^2v_j$.

It is typically much harder to find singular values in non-standard inner product spaces, since computation of the adjoint is often so difficult. Here is a classic example.

Example 6.59. Consider the inner product $\langle f, g\rangle = \int_0^1 f(x)g(x)\,dx$ on the polynomial spaces $P_2(\mathbb{R})$ and $P_1(\mathbb{R})$. As previously seen, we have orthonormal bases

$\beta = \{\sqrt{5}(6x^2 - 6x + 1),\ \sqrt{3}(2x - 1),\ 1\}, \qquad \gamma = \{\sqrt{3}(2x - 1),\ 1\}$

Let $T = \frac{d}{dx}$ be the derivative operator. It is easy to find the matrix of T:

$[T]_\beta^\gamma = \begin{pmatrix} 2\sqrt{15} & 0 & 0 \\ 0 & 2\sqrt{3} & 0 \end{pmatrix}$

This matrix is already in the required form, whence β, γ are suitable bases, and the singular values of T are $\sigma_1 = 2\sqrt{15}$ and $\sigma_2 = 2\sqrt{3}$.
In case you are unconvinced and want to use the method to evaluate this directly, use the orthonormality of β, γ to compute

$[T^*T]_\beta = ([T]_\beta^\gamma)^*[T]_\beta^\gamma = \begin{pmatrix} 60 & 0 & 0 \\ 0 & 12 & 0 \\ 0 & 0 & 0 \end{pmatrix} \implies \sigma_1^2 = 60,\ \sigma_2^2 = 12$

The standard basis of $\mathbb{R}^3$ is clearly an orthonormal basis of eigenvectors for this matrix: up to sign, $\{[v_1]_\beta, [v_2]_\beta, [v_3]_\beta\}$ is therefore forced to be the standard ordered basis of $\mathbb{R}^3$. This says that β was the correct basis of $P_2(\mathbb{R})$ all along.
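If you want to reproduce Example 6.59 numerically, one option (a sketch assuming Python with NumPy) is to build $[T]_\beta^\gamma$ entry-by-entry from the inner product $\langle f, g\rangle = \int_0^1 fg\,dx$, using the orthonormal bases quoted above, and then take its singular values:

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def ip(f, g):
    """Inner product <f, g> = integral over [0, 1] of f(x) g(x)."""
    F = (f * g).integ()
    return F(1) - F(0)

# Orthonormal bases of P_2 and P_1 for this inner product (Example 6.59)
beta = [P([1, -6, 6]) * 5**0.5, P([-1, 2]) * 3**0.5, P([1.0])]
gamma = [P([-1, 2]) * 3**0.5, P([1.0])]

# Matrix of T = d/dx: entry (i, j) is <T(beta_j), gamma_i>
M = np.array([[ip(b.deriv(), g) for b in beta] for g in gamma])
print(np.round(M, 6))                        # [[7.745967 0 0], [0 3.464102 0]]
print(np.linalg.svd(M, compute_uv=False))    # 2*sqrt(15) and 2*sqrt(3)
```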

The Pseudoinverse

Given the singular values of an operator, it is straightforward to define something that looks like an inverse map, but which makes sense even when the operator is not invertible!

Definition 6.60. Suppose we have the singular value decomposition of $T \in \mathcal{L}(V, W)$. The pseudoinverse of T is the linear map $T^\dagger \in \mathcal{L}(W, V)$ defined by

$T^\dagger(w_j) = \begin{cases} \frac{1}{\sigma_j}v_j & \text{if } j \le r \\ 0 & \text{otherwise} \end{cases}$

• If $A = U\Sigma V^*$ is the singular value decomposition of a matrix, then its pseudoinverse is the matrix of $L_A^\dagger$, namely¹ $A^\dagger = V\Sigma^\dagger U^*$, where $\Sigma^\dagger = \begin{pmatrix} \operatorname{diag}(\sigma_1^{-1}, \ldots, \sigma_r^{-1}) & O \\ O & O \end{pmatrix}$.

• Restricted to $\mathcal{N}(T)^\perp = \operatorname{Span}\{v_1, \ldots, v_r\}$ we obtain an isomorphism

$T\big|_{\mathcal{N}(T)^\perp} : \mathcal{N}(T)^\perp \to \mathcal{R}(T) = \operatorname{Span}\{w_1, \ldots, w_r\}$

with inverse $T^\dagger\big|_{\mathcal{R}(T)} : \mathcal{R}(T) \to \mathcal{N}(T)^\perp$. Moreover, in terms of orthogonal projections,

$TT^\dagger = \pi_{\mathcal{R}(T)} \qquad\text{and}\qquad T^\dagger T = \pi_{\mathcal{N}(T)^\perp}$

[Diagram: V and W, with T carrying $\mathcal{N}(T)^\perp$ isomorphically onto $\mathcal{R}(T)$ and $\mathcal{N}(T)$ to $\{0_W\}$; $T^\dagger$ reverses the isomorphism and sends $\mathcal{R}(T)^\perp$ to $\{0_V\}$.]

Examples 6.61. 1. Again continuing Example 6.55, $A = \begin{pmatrix} 3 & 1 \\ -2 & 2 \\ 1 & 3 \end{pmatrix}$ has pseudoinverse

$A^\dagger = \frac{1}{\sigma_1}v_1w_1^* + \frac{1}{\sigma_2}v_2w_2^* = \frac{1}{4}\cdot\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}\cdot\frac{1}{\sqrt{2}}(1\ \ 0\ \ 1) + \frac{1}{2\sqrt{3}}\cdot\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}\cdot\frac{1}{\sqrt{6}}(1\ \ {-2}\ \ {-1})$

$\quad = \frac{1}{8}\begin{pmatrix} 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix} + \frac{1}{12}\begin{pmatrix} 1 & -2 & -1 \\ -1 & 2 & 1 \end{pmatrix} = \frac{1}{24}\begin{pmatrix} 5 & -4 & 1 \\ 1 & 4 & 5 \end{pmatrix}$

which is exactly what we would have found by computing $A^\dagger = V\Sigma^\dagger U^*$. Observe that

$A^\dagger A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad\text{and}\qquad AA^\dagger = \frac{1}{3}\begin{pmatrix} 2 & -1 & 1 \\ -1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$

are the orthogonal projection matrices onto $\operatorname{Span}\{v_1, v_2\} = \mathbb{R}^2$ and $\operatorname{Span}\{w_1, w_2\} \le \mathbb{R}^3$ respectively. These are projections onto two-dimensional subspaces since rank A = 2. It is also easy to check that

$A(A^TA)^{-1}A^T = AA^\dagger$

in accordance with our discussion of least-squares problems.
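The same numbers drop out of NumPy (a sketch; `numpy.linalg.pinv` computes $A^\dagger$ via the SVD exactly as in Definition 6.60, and the least-squares connection can be checked with `numpy.linalg.lstsq`):

```python
import numpy as np

A = np.array([[3., 1.],
              [-2., 2.],
              [1., 3.]])

A_dag = np.linalg.pinv(A)
print(np.round(24 * A_dag).astype(int))      # [[ 5 -4  1] [ 1  4  5]]  i.e. (1/24)(5 -4 1; 1 4 5)

print(np.allclose(A_dag @ A, np.eye(2)))     # projection onto N(A)^perp = R^2
print(np.round(A @ A_dag, 4))                # (1/3)(2 -1 1; -1 2 1; 1 1 2): projection onto R(A)

# Least squares: x = A^dagger b minimizes ||Ax - b||
b = np.array([1., 0., 0.])
print(np.allclose(A_dag @ b, np.linalg.lstsq(A, b, rcond=None)[0]))   # True
```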

¹ This isn't the singular value decomposition of $A^\dagger$ since $\sigma_1^{-1} \le \sigma_2^{-1}$, etc.!

2. Finally we check that the pseudoinverse of the derivative operator behaves roughly as expected. The pseudoinverse of $T = \frac{d}{dx}: P_2(\mathbb{R}) \to P_1(\mathbb{R})$, as seen in Example 6.59, maps

$T^\dagger(\sqrt{3}(2x-1)) = \frac{1}{2\sqrt{15}}\sqrt{5}(6x^2 - 6x + 1) = \frac{1}{2\sqrt{3}}(6x^2 - 6x + 1), \qquad T^\dagger(1) = \frac{1}{2\sqrt{3}}\sqrt{3}(2x - 1) = x - \frac{1}{2}$

$\implies T^\dagger(a + bx) = T^\dagger\left(\left(a + \frac{b}{2}\right) + \frac{b}{2\sqrt{3}}\sqrt{3}(2x-1)\right) = \left(a + \frac{b}{2}\right)\left(x - \frac{1}{2}\right) + \frac{b}{12}(6x^2 - 6x + 1) = \frac{b}{2}x^2 + ax - \frac{a}{2} - \frac{b}{6}$

The pseudoinverse of 'differentiation' therefore returns a particular choice of anti-derivative.

Exercises. 6.7.1. Find the ingredients β, γ and the singular values for each of the following: x 2 3 x x+y (a)T (R , R ) where T ( y ) = x y ∈ L − R R   1 (b)T: P2( ) P1( ) and T( f ) = f 00 where f , g := 0 f (x)g(x) dx → h 2iπ (c) V = W = Span 1, sin x, cos x and f , g = f (x)g(x) dx, with T( f ) = f + 2 f { } h i 0 R 0 6.7.2. Find a singular value decomposition of each of theR matrices: 1 1 1 1 1 (a) 1 1 (b) 1 0 1 (c) 1 1 0 1 1 1 0 1 1− 0 1 − − − − 6.7.3. Find an explicit formula for T † in each of the examples in Exercise 6.7.1.. 6.7.4. Find the pseudoinverse of each of the matrices in Exercise 6.7.2.. 6.7.5. Suppose T : V W is written according to the singular value decompostion. Compute T → ∗ in terms of β, γ and prove that γ is a basis of eigenvectors of TT∗ with the same non-zero eigenvalues as T∗T, including repetitions. 6.7.6. Suppose T = (V) is a normal operator. Prove that each v in the singular value theorem may L j be chosen to be an eigenvector of T and that σj is the modulus of the corresponding eigenvalue. 6.7.7. Let V, W be finite-dimensional inner product spaces and T (V, W). Prove: ∈ L † 1 (a) If T is injective, then T∗T is invertible and T = (T∗T)− T∗. † 1 (b) If T is surjective, then TT∗ is invertible and T = T∗(TT∗)− . m † 6.7.8. Suppose A Mm n(R) and b R and define z = A b. ∈ × ∈ (a) If Ax = b is consistent (has a solution), prove that z is the solution with minimal norm. (b) If Ax = b is inconsistent, prove that z is the unique minimizer of Az b with minimal || − || norm. 6.7.9. Find the minimal norm solution to the first system, and the vector which comes closest to solving the second:

3x + y = 1 3x + 2y + z = 9 2x 2y = 0 (x 2y + 3z = 3 − − x + 3y = 0

6.8 Bilinear and Quadratic Forms

In this section we slightly generalize the idea of an inner product. Throughout, V is a vector space over a field F: it need not be an inner product space and F can be any field (not just $\mathbb{R}$ or $\mathbb{C}$).

Definition 6.62. A bilinear form $B: V \times V \to F$ is a function which is linear in each entry when the other is held fixed. That is: $\forall v, x, y \in V$, $\lambda \in F$,

$\langle \lambda x + y, v\rangle = \lambda\langle x, v\rangle + \langle y, v\rangle, \qquad \langle v, \lambda x + y\rangle = \lambda\langle v, x\rangle + \langle v, y\rangle$

Additionally, B is symmetric if $\forall x, y \in V$, $\langle x, y\rangle = \langle y, x\rangle$.

Examples 6.63. 1. If V is a real inner product space, then the inner product $\langle\,,\,\rangle$ is a symmetric bilinear form. Note that a complex inner product is not bilinear!

2. If $A \in M_n(F)$, then $B(x, y) := x^TAy$ is a bilinear form on $F^n$. For instance, on $\mathbb{R}^2$,

$B(x, y) = x^T\begin{pmatrix} 1 & 2 \\ 2 & 0 \end{pmatrix}y = x_1y_1 + 2x_1y_2 + 2x_2y_1$

defines a symmetric bilinear form, though not an inner product since it isn't positive definite; for example B(j, j) = 0.

As seen above, we often make use of a matrix.

Definition 6.64. Let B be a bilinear form on a finite-dimensional V with basis $\beta = \{v_1, \ldots, v_n\}$. The matrix of B with respect to β is the matrix $[B]_\beta = A \in M_n(F)$ with ijth entry

$A_{ij} = B(v_i, v_j)$

Given $x, y \in V$, compute their co-ordinate vectors $[x]_\beta, [y]_\beta$ with respect to β, then

$B(x, y) = [x]_\beta^T A [y]_\beta$

The set of bilinear forms on V is therefore in bijective correspondence with the set Mn(F). Moreover,

$B(y, x) = [y]_\beta^T A [x]_\beta = \bigl([y]_\beta^T A [x]_\beta\bigr)^T = [x]_\beta^T A^T [y]_\beta$

Finally, if γ is another basis of V, then an appeal to the change of co-ordinate matrix $Q_\gamma^\beta$ yields

$B(x, y) = [x]_\beta^T A [y]_\beta = (Q_\gamma^\beta[x]_\gamma)^T A (Q_\gamma^\beta[y]_\gamma) = [x]_\gamma^T (Q_\gamma^\beta)^T A Q_\gamma^\beta [y]_\gamma \implies [B]_\gamma = (Q_\gamma^\beta)^T[B]_\beta Q_\gamma^\beta$

To summarize, we've proved the following:

Lemma 6.65. Let B be a bilinear form on a finite-dimensional vector space.

1. If A is the matrix of B with respect to some basis, then every other matrix of B has the form QT AQ for some invertible Q.

2. B is symmetric if and only if its matrix with respect to any (and all) bases is symmetric.

Diagonalization of symmetric bilinear forms

As with everything else in this chapter, the ideal situation is when a basis can be found which diagonalizes a matrix. Bilinear forms are no different. For instance, Example 6.63.2 can be written

$B(x, y) = x^T\begin{pmatrix} 1 & 2 \\ 2 & 0 \end{pmatrix}y = x_1y_1 + 2x_1y_2 + 2x_2y_1 = (x_1 + 2x_2)(y_1 + 2y_2) - 4x_2y_2$

$\qquad = \begin{pmatrix} x_1 + 2x_2 \\ x_2 \end{pmatrix}^T\begin{pmatrix} 1 & 0 \\ 0 & -4 \end{pmatrix}\begin{pmatrix} y_1 + 2y_2 \\ y_2 \end{pmatrix} \qquad (*)$

$\implies [B]_\gamma = \begin{pmatrix} 1 & 0 \\ 0 & -4 \end{pmatrix} \quad\text{where}\quad \gamma = \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -2 \\ 1 \end{pmatrix} \right\}$

If β is the standard basis, then the change of co-ordinate matrix is $Q_\beta^\gamma = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, hence the signs in expression (∗).
Note how the change of co-ordinate matrix is elementary: this gives a clue of how to diagonalize any symmetric bilinear form...

Example 6.66. To diagonalize $B(x, y) = x^TAy = x^T\begin{pmatrix} 1 & -2 & 3 \\ -2 & 0 & 4 \\ 3 & 4 & -1 \end{pmatrix}y$, we perform simultaneous row and column operations. We only need elementary matrices $E_{ij}^{(\lambda)}$ of type III.^a

$E_{21}^{(2)}AE_{12}^{(2)} = \begin{pmatrix} 1 & 0 & 3 \\ 0 & -4 & 10 \\ 3 & 10 & -1 \end{pmatrix}$ (add twice the first row to the second, columns similarly)

$E_{31}^{(-3)}E_{21}^{(2)}AE_{12}^{(2)}E_{13}^{(-3)} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -4 & 10 \\ 0 & 10 & -10 \end{pmatrix}$ (subtract 3 times the first row from the third, etc.)

$E_{23}^{(1)}E_{31}^{(-3)}E_{21}^{(2)}AE_{12}^{(2)}E_{13}^{(-3)}E_{32}^{(1)} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & -10 \end{pmatrix}$ (add the third row to the second, etc.)

If $\beta = \{i, j, k\}$ is the standard basis, we therefore have change of co-ordinate matrix

$Q = Q_\gamma^\beta = E_{12}^{(2)}E_{13}^{(-3)}E_{32}^{(1)} = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -1 & -3 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$

With respect to the basis $\gamma = \left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} -3 \\ 0 \\ 1 \end{pmatrix} \right\}$ the symmetric bilinear form B is diagonal. If you're having trouble believing this, invert the change of co-ordinate matrix and check that

$B(x, y) = (x_1 - 2x_2 + 3x_3)(y_1 - 2y_2 + 3y_3) + 6x_2y_2 - 10(-x_2 + x_3)(-y_2 + y_3)$

^a Recall that $E_{ij}^{(\lambda)}$ is the identity matrix with an additional λ in the ijth entry.
• As a column operation (right-multiplication), $E_{ij}^{(\lambda)}$ adds λ times the ith column to the jth.
• As a row operation (left-multiplication), $E_{ji}^{(\lambda)} = (E_{ij}^{(\lambda)})^T$ adds λ times the ith row to the jth.
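The row-and-column reduction of Example 6.66 is easy to automate. Below is a minimal sketch (assuming Python with NumPy, real entries, and that the simple pivot-fixing step suffices, which is true generically; a fully robust routine needs an extra case). It returns a diagonal $D = Q^TAQ$ together with Q — note the diagonal it finds differs from the one in the text, but by Sylvester's Law it has the same sign pattern.

```python
import numpy as np

def diagonalize_symmetric_form(A):
    """Congruence-diagonalize a real symmetric matrix by simultaneous
    row and column operations: returns (D, Q) with D = Q^T A Q diagonal."""
    A = np.array(A, dtype=float)      # work on a copy
    n = A.shape[0]
    Q = np.eye(n)
    for k in range(n):
        if abs(A[k, k]) < 1e-12:      # try to create a non-zero pivot (generic case)
            for j in range(k + 1, n):
                if abs(A[k, j]) > 1e-12:
                    A[:, k] += A[:, j]
                    A[k, :] += A[j, :]
                    Q[:, k] += Q[:, j]
                    break
        if abs(A[k, k]) < 1e-12:
            continue                  # whole row/column already zero
        for j in range(k + 1, n):
            c = -A[k, j] / A[k, k]
            A[:, j] += c * A[:, k]    # column operation ...
            A[j, :] += c * A[k, :]    # ... and the matching row operation
            Q[:, j] += c * Q[:, k]
    return A, Q

A = np.array([[1, -2, 3],
              [-2, 0, 4],
              [3, 4, -1]])
D, Q = diagonalize_symmetric_form(A)
print(np.round(D, 10))                       # diag(1, -4, 15): two positive, one negative
print(np.allclose(Q.T @ A @ Q, D))           # True
```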

17 Warning! If F = R then every symmetric B is diagonalizable with respect to an orthonormal basis of eigenvectors (for the usual inner product on Rn). It is very unlikely that the above algorithm will produce such a basis! Note how γ in the previous example is not orthogonal in R3. The algorithm has several advantages over the spectral theorem: it is typically faster than computing eigenvectors and it applies for vector spaces over any field. A disadvantage of the algorithm is that there are many many choices: here is an example;

Example 6.67. The bilinear form $B(x, y) = x^T\begin{pmatrix} 1 & 6 \\ 6 & 3 \end{pmatrix}y = x_1y_1 + 6x_1y_2 + 6x_2y_1 + 3x_2y_2$ can be diagonalized as follows:

• $\begin{pmatrix} 1 & 0 \\ -6 & 1 \end{pmatrix}\begin{pmatrix} 1 & 6 \\ 6 & 3 \end{pmatrix}\begin{pmatrix} 1 & -6 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -33 \end{pmatrix} = [B]_\gamma$ where $\gamma = \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -6 \\ 1 \end{pmatrix} \right\}$. This corresponds to $B(x, y) = (x_1 + 6x_2)(y_1 + 6y_2) - 33x_2y_2$.

• $\begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 6 \\ 6 & 3 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix} = \begin{pmatrix} -11 & 0 \\ 0 & 3 \end{pmatrix} = [B]_\eta$ where $\eta = \left\{ \begin{pmatrix} 1 \\ -2 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\}$. This corresponds to $B(x, y) = -11x_1y_1 + 3(2x_1 + x_2)(2y_1 + y_2)$.

• If $F = \mathbb{R}$, we may apply the spectral theorem: $\zeta = \left\{ \begin{pmatrix} 6 \\ 1+\sqrt{37} \end{pmatrix}, \begin{pmatrix} 1+\sqrt{37} \\ -6 \end{pmatrix} \right\}$ is an orthogonal eigenbasis, with respect to which B is diagonal but very unpleasant!

$[B]_\zeta = \begin{pmatrix} 6 & 1+\sqrt{37} \\ 1+\sqrt{37} & -6 \end{pmatrix}\begin{pmatrix} 1 & 6 \\ 6 & 3 \end{pmatrix}\begin{pmatrix} 6 & 1+\sqrt{37} \\ 1+\sqrt{37} & -6 \end{pmatrix} = (74 + 2\sqrt{37})\begin{pmatrix} 2+\sqrt{37} & 0 \\ 0 & 2-\sqrt{37} \end{pmatrix}$

$B(x, y) = (2 + \sqrt{37})\frac{(6x_1+(1+\sqrt{37})x_2)(6y_1+(1+\sqrt{37})y_2)}{74+2\sqrt{37}} + (2 - \sqrt{37})\frac{(6x_2-(1+\sqrt{37})x_1)(6y_2-(1+\sqrt{37})y_1)}{74+2\sqrt{37}}$

Theorem 6.68. Suppose B is a bilinear form on a finite-dimensional space V over a field F.

1. If B is diagonalizable, then it is symmetric.

2. If B is symmetric and F does not have characteristic two (see below), then B is diagonalizable.

Proof. 1. If B is diagonalizable, ∃β such that $[B]_\beta$ is diagonal and thus symmetric.

2. The converse is more difficult. First suppose B is non-zero (otherwise the result is trivial). We prove by induction on n = dim V. If n = 1 the result is trivial: B(x, y) = axy for some $a \in F$, whose 1 × 1 matrix is already diagonal.
Fix $n \in \mathbb{N}$ and assume that every non-zero symmetric bilinear form on a dimension-n vector space over a field with char F ≠ 2 is diagonalizable. Let dim V = n + 1. By the discussion below, $\exists x \in V$ such that $B(x, x) \ne 0$. Consider the linear map

$T: V \to F : v \mapsto B(x, v)$

Clearly rank T = 1 ⟹ dim $\mathcal{N}(T)$ = n. Moreover, B is symmetric when restricted to $\mathcal{N}(T)$; by the induction hypothesis there exists a basis β of $\mathcal{N}(T)$ such that $[B|_{\mathcal{N}(T)}]_\beta$ is diagonal. But then B is diagonal with respect to the basis $\beta \cup \{x\}$.

Aside: Characteristic two fields

This means 1 + 1 = 0 in F, which holds, for instance, in the field $\mathbb{Z}_2 = \{0, 1\}$ of remainders modulo 2. We now see the importance char F ≠ 2 has to the above result.

• The proof requires the existence of $x \in V$ such that $B(x, x) \ne 0$. If B is non-zero, there exist u, v such that $B(u, v) \ne 0$. If both B(u, u) = 0 = B(v, v), then x = u + v does the job:

$B(x, x) = B(u, v) + B(v, u) = 2B(u, v) \ne 0$

whenever char F ≠ 2.

• Consider $B(x, y) = x^T\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}y$ on the 2D vector space $\mathbb{Z}_2^2 = \left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\}$ over $\mathbb{Z}_2$. Every element of this space satisfies B(x, x) = 0! Perhaps surprisingly, the matrix of B is identical with respect to any basis of $\mathbb{Z}_2^2$, and so B is symmetric but non-diagonalizable.

In Example 6.67, notice how the three diagonal matrix representations have something in common: each has one positive and one negative diagonal entry. This is a general phenomenon:

Theorem 6.69 (Sylvester's Law of Inertia). Suppose B is a symmetric bilinear form on a real vector space V with diagonal matrix representation $\operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Then the number of entries $\lambda_j$ which are positive/negative/zero is independent of the diagonal representation.

Sketch Proof. For simplicity, let $V = \mathbb{R}^n$ and write $B(x, y) = x^TAy$ where A is symmetric.

1. First define rank B := rank A and observe that the rank of any matrix of B is independent of basis (exercises).

2. Let β, γ be diagonalizing bases, ordered according to whether B is positive, negative or zero.

$\beta = \{v_1, \ldots, v_p,\ v_{p+1}, \ldots, v_r,\ v_{r+1}, \ldots, v_n\}$
$\gamma = \{w_1, \ldots, w_q,\ w_{q+1}, \ldots, w_r,\ w_{r+1}, \ldots, w_n\}$

Here r = rank B in accordance with part 1: our goal is to prove that p = q.

3. Assume p < q, define the matrix C and check what follows:

$C = \begin{pmatrix} v_1^TA \\ \vdots \\ v_p^TA \\ w_{q+1}^TA \\ \vdots \\ w_r^TA \end{pmatrix} \in M_{(r-q+p)\times n}(\mathbb{R})$

(a) null C ≥ n − (r − q + p) > n − r, so $\exists x \in \mathbb{R}^n$ such that Cx = 0 and $x \notin \operatorname{Span}\{v_{r+1}, \ldots, v_n\}$.

(b) The first p entries of Cx = 0 mean that $x \in \operatorname{Span}\{v_{p+1}, \ldots, v_r, v_{r+1}, \ldots, v_n\}$ and so B(x, x) < 0. Note how we use part (a) to get a strict inequality here.

(c) Now write x with respect to γ: this time we see that $x \in \operatorname{Span}\{w_1, \ldots, w_q, w_{r+1}, \ldots, w_n\}$, whence B(x, x) ≥ 0: a contradiction.

19 Quadratic Forms

Definition 6.70. To every symmetric bilinear form $B: V \times V \to F$ is associated a quadratic form

$K: V \to F : x \mapsto B(x, x)$

A function $K: V \to F$ is termed a quadratic form when such a symmetric bilinear form exists.

Examples 6.71. 1. If B is a real inner product, then $K(v) = \langle v, v\rangle = \|v\|^2$ is the square of the norm.

2. Let dim V = n and A be the matrix of B with respect to a basis β. By the symmetry of A,

$K(x) = x^TAx = \sum_{i,j=1}^{n} x_iA_{ij}x_j = \sum_{1\le i\le j\le n} \tilde{a}_{ij}x_ix_j \quad\text{where}\quad \tilde{a}_{ij} = \begin{cases} A_{ij} & \text{if } i = j \\ 2A_{ij} & \text{if } i \ne j \end{cases}$

E.g., $K(x) = 3x_1^2 + 4x_2^2 - 2x_1x_2$ corresponds to the bilinear form $B(x, y) = x^T\begin{pmatrix} 3 & -1 \\ -1 & 4 \end{pmatrix}y$.

Diagonalizing Conics

A fun application of quadratic forms and their relationship to symmetric (diagonalizable) bilinear forms is to the study of quadratic manifolds: in $\mathbb{R}^2$ these are conics, and in $\mathbb{R}^3$ quadratic surfaces such as ellipsoids. For example, the general non-zero quadratic form on $\mathbb{R}^2$ is

$K(x) = ax^2 + 2bxy + cy^2, \quad\text{corresponding to}\quad B(v, w) = v^T\begin{pmatrix} a & b \\ b & c \end{pmatrix}w$

Diagonalizing with respect to an orthonormal basis $\{v_1, v_2\}$, there exist scalars $\lambda_1, \lambda_2$ with

$K(t_1v_1 + t_2v_2) = \lambda_1t_1^2 + \lambda_2t_2^2$

With respect to this basis the general conic has the form

$\lambda_1t_1^2 + \lambda_2t_2^2 + \mu_1t_1 + \mu_2t_2 = \eta, \qquad \lambda_1, \lambda_2, \mu_1, \mu_2, \eta \in \mathbb{R}$

If the $\lambda_i$ are non-zero, we may complete the squares via the linear transformations $s_j = t_j + \frac{\mu_j}{2\lambda_j}$. The canonical forms are then recovered:

Parabola: $\lambda_1$ or $\lambda_2 = 0$.

Ellipse: $\lambda_1s_1^2 + \lambda_2s_2^2 = k$ where $\lambda_1, \lambda_2$ have the same sign.

Hyperbola: $\lambda_1s_1^2 + \lambda_2s_2^2 = k$ where $\lambda_1, \lambda_2$ have opposite signs.

Since we only applied a rotation/reflection (change to the orthonormal basis $\{v_1, v_2\}$) and a translation (completing the square), it is clear that every conic may be recovered thus from the canonical forms. One could instead diagonalize K using our earlier algorithm, though this probably won't produce an orthonormal basis and so we couldn't interpret the transformation as a rotation/reflection. By Sylvester's Law, however, the diagonal entries will have the same number of positive/negative/zero terms, so the canonical forms will still look the same.

Examples 6.72. 1. We describe and plot the conic with equation $7x^2 + 24xy = 144$. The matrix of the associated bilinear form is $\begin{pmatrix} 7 & 12 \\ 12 & 0 \end{pmatrix}$, which has orthonormal eigenpairs

$(\lambda_1, v_1) = \left(16, \frac{1}{5}\begin{pmatrix} 4 \\ 3 \end{pmatrix}\right), \qquad (\lambda_2, v_2) = \left(-9, \frac{1}{5}\begin{pmatrix} -3 \\ 4 \end{pmatrix}\right)$

In the rotated basis, we have the canonical hyperbola

$16t_1^2 - 9t_2^2 = 144 \iff \frac{t_1^2}{3^2} - \frac{t_2^2}{4^2} = 1$

which is easily plotted. Note here that

$t_1 = \frac{1}{5}(4x + 3y), \qquad t_2 = \frac{1}{5}(-3x + 4y)$

which quickly recovers the original equation. [Figure: the hyperbola in the (x, y)-plane with the rotated $t_1, t_2$ axes along $v_1, v_2$.]

2. The conic defined by $K(x) = x^2 + 12xy + 3y^2 = 33$ defines a hyperbola in accordance with Example 6.67. With respect to the basis $\eta = \left\{ \begin{pmatrix} 1 \\ -2 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\}$, we see that

$K(x) = -11x^2 + 3(2x + y)^2 = -11t_1^2 + 3t_2^2 = 33 \iff \frac{t_2^2}{11} - \frac{t_1^2}{3} = 1$

While η is not an orthogonal basis, one can still plot the conic although it looks a little strange! Of course, it is easier to plot if one first obtains the orthonormal basis ζ,

$K(x) = (2 + \sqrt{37})s_1^2 + (2 - \sqrt{37})s_2^2 = 33$

however the calculation to find ζ is time-consuming and the expressions for $s_1, s_2$ are extremely ugly. [Figure: the hyperbola K(x) = 33 with the skew $t_1, t_2$ axes along η.] A similar approach can be taken to higher degree quadratic equations.
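The eigen-computation in Examples 6.72.1 is easily mechanised (a sketch assuming Python with NumPy; the helper `classify_conic` is ours): diagonalize the symmetric matrix of the quadratic part and read off the conic type from the signs of the eigenvalues, in the spirit of Exercise 6.8.11.

```python
import numpy as np

def classify_conic(a, b, c):
    """Classify a*x^2 + 2*b*x*y + c*y^2 = k (k != 0) via the eigenvalues
    of the symmetric matrix (a b; b c)."""
    M = np.array([[a, b], [b, c]], dtype=float)
    evals, evecs = np.linalg.eigh(M)        # orthonormal eigenvectors as columns
    if np.isclose(evals[0] * evals[1], 0):
        kind = "parabola (degenerate quadratic part)"
    elif evals[0] * evals[1] > 0:
        kind = "ellipse"
    else:
        kind = "hyperbola"
    return evals, evecs, kind

# 7x^2 + 24xy = 144  (Example 6.72.1: a = 7, 2b = 24, c = 0)
evals, evecs, kind = classify_conic(7, 12, 0)
print(evals)                # [-9. 16.]
print(kind)                 # hyperbola
print(np.round(evecs, 2))   # columns: the rotated axes, +/-(3,-4)/5 and +/-(4,3)/5
```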

Exercises. 6.8.1. Prove that the sum of any two bilinear forms is bilinear, and that any scalar multi- ple of a bilinear form is bilinear: thus the set of bilinear forms on V is a vector space. (You can’t use matrices here, since V could be infinite-dimensional!) 6.8.2. Compute the matrix of the bilinear form

B(x, y) = x y 2x y + x y x y 1 1 − 1 2 2 1 − 3 3 3 1 1 0 on R with respect to the basis β = 0 , 0 , 1 . 1 1 0 n   −   o 6.8.3. Check that the function B( f , g) = f 0(0)g00(0) is a bilinear form on the vector space of twice- differentiable functions. Find the matrix of B with respect to β = cos t, sin t, cos 2t, sin 2t { } when restricted to the subspace Span β.

21 6.8.4. For each matrix A with real valued entries, find a diagonal matrix D and an Q such that QT AQ = D. 3 1 2 1 3 (a) (b) 1 4 0 3 2     2 0 1 − 6.8.5. If K is a quadratic form and K(x) = 2, what is the value of K(3x)? 6.8.6. If F does not have characteristic 2, and K(x) = B(x, x) is a quadratic form, prove that we can recover the bilinear form B via 1 B(x, y) = (K(x + y) K(x) K(y)) 2 − −

T 0 1 2 6.8.7. If B(x, y) = x 1 0 y is a bilinear form on F , compute the quadratic form K(x). 6.8.8. With reference to the proof of Sylvester’s Law, explain why B(x, x) = 0 Ax = 0. Also  ⇐⇒ explain why rank B is independent of the choice of diagonalizing basis. 6.8.9. If char F = 2, apply the diagonalizing algorithm to the symmetric bilinear form B = xT 0 1 y 6 1 0 on F2. What goes wrong if char F = 2?  6.8.10. Describe and plot the following conics: (a) x2 + y2 + xy = 6 (b) 35x2 + 120xy = 4x + 3y 6.8.11. Suppose that a non-empty, non-degeneratea conic C in R2 has the form ax2 + 2bxy + cy2 + dx + ey + f = 0, where at least one of a, b, c = 0, and define ∆ = b2 ac. Prove that: 6 − • C is a parabola if and only if ∆ = 0; • C is an ellipse if and only if ∆ < 0; • C is a hyperbola if and only if ∆ > 0.

(Hint: λ1, λ2 are the eigenvalues of a , so. . . )

aThe conic contains at least two points and cannot be factorized as a product of two straight lines: for example, the following are disallowed; • x2 + y2 + 1 = 0 is empty (unless one allows conics over C ...) • x2 + y2 = 0 contains only one point; • x2 xy x + y = (x 1)(x y) = 0 is the product of two lines. − − − −
