
Math 4242 Spring 2020 Selected Homework Solutions

1.2.12(a): Show that if $D = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$ is a $2 \times 2$ diagonal matrix with $a \neq b$, then the only matrices that commute (under matrix multiplication) with $D$ are other $2 \times 2$ diagonal matrices. 1.2.12(b): What if $a = b$?

Consider a generic $2 \times 2$ matrix $A = \begin{pmatrix} w & x \\ y & z \end{pmatrix}$. Then $DA = \begin{pmatrix} aw & ax \\ by & bz \end{pmatrix}$, whereas $AD = \begin{pmatrix} aw & bx \\ ay & bz \end{pmatrix}$. In particular, $DA = AD$ precisely when $ax = bx$ and $ay = by$. If $a \neq b$, then this can only happen when $x = y = 0$. This handles part (a) of the question. On the other hand, if $a = b$, then one always has $ax = bx$ and $ay = by$. Hence, $DA = AD$ for every $A$.
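As a quick sanity check (not part of the assigned solution), the computation above can be verified symbolically; the sketch below assumes SymPy is available.

```python
# Symbolic check: D*A - A*D vanishes exactly when (a - b)x = (a - b)y = 0.
import sympy as sp

a, b, w, x, y, z = sp.symbols('a b w x y z')
D = sp.Matrix([[a, 0], [0, b]])
A = sp.Matrix([[w, x], [y, z]])

# The diagonal entries of D*A - A*D are zero; the off-diagonal entries are
# (a - b)*x and (b - a)*y, so commuting forces x = y = 0 whenever a != b.
print(sp.simplify(D * A - A * D))
```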

1.3.33(b): Find the $A = LU$ factorization of the coefficient matrix $A = \begin{pmatrix} -1 & 1 & -1 \\ 1 & 1 & 1 \\ -1 & 1 & 2 \end{pmatrix}$, and then use Forward and Back Substitution to solve the corresponding linear systems $A\mathbf{x} = \mathbf{b}_j$ for each of the right-hand sides $\mathbf{b}_1 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$ and $\mathbf{b}_2 = \begin{pmatrix} -3 \\ 0 \\ 2 \end{pmatrix}$.

We start by declaring the $-1$ in the top left corner of $A$ to be our first pivot. Then we can add the first row to the second and $-1$ times the first row to the third. This will have the effect of zeroing out the entries below the first pivot. The resulting matrix is $U = \begin{pmatrix} -1 & 1 & -1 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$. Notice that $U$ is already upper-triangular, so there is no need to do any work with the second and third pivots. Thus, we've found our factorization: $A = LU$, where $L = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}$. Recall that $L$ is the product of the inverses of the type-1 elementary matrices we used to reduce $A$.

Now, we'll perform forward substitution on both $\mathbf{b}_1$ and $\mathbf{b}_2$ to solve $L\mathbf{c}_j = \mathbf{b}_j$. Let's name the variables $\mathbf{c}_j = (\alpha_j, \beta_j, \gamma_j)^T$ for convenience. Then the first "row" in $L\mathbf{c}_j = \mathbf{b}_j$ implies that $\alpha_j$ is equal to the first entry of $\mathbf{b}_j$. That is, $\alpha_1 = 1$ and $\alpha_2 = -3$. The second row implies that $-\alpha_j + \beta_j$ is equal to the second entry of $\mathbf{b}_j$. Since we've already solved for $\alpha_j$, we can use this to solve for $\beta_j$: $\beta_1 = 0$ and $\beta_2 = -3$. Finally, the third row of $L\mathbf{c}_j = \mathbf{b}_j$ implies that $\alpha_j + \gamma_j$ is equal to the third entry of $\mathbf{b}_j$. We've solved for $\alpha_j$, so we can substitute to determine $\gamma_j$: $\gamma_1 = 0$ and $\gamma_2 = 5$.

Next, we perform back substitution to solve $U\mathbf{x} = \mathbf{c}_j$ for each $j$. We have $\mathbf{c}_1 = (1, 0, 0)^T$. We thus find that $y = z = 0$ in this case (from the last two equations), which leaves us with $x = -1$. Similarly, we have $\mathbf{c}_2 = (-3, -3, 5)^T$. The last equation then implies that $z = 5/3$, while the second equation implies $y = -3/2$. Substituting these into the first equation, we find that $x = 3 - 3/2 - 5/3 = -1/6$.

Summarizing, $A(-1, 0, 0)^T = \mathbf{b}_1$ and $A(-1/6, -3/2, 5/3)^T = \mathbf{b}_2$.
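If you want to double-check the factorization and both solves numerically, here is a short NumPy sketch (a verification aid, not part of the required hand computation).

```python
import numpy as np

A = np.array([[-1., 1., -1.], [1., 1., 1.], [-1., 1., 2.]])
L = np.array([[1., 0., 0.], [-1., 1., 0.], [1., 0., 1.]])
U = np.array([[-1., 1., -1.], [0., 2., 0.], [0., 0., 3.]])
assert np.allclose(L @ U, A)          # the factorization A = LU

for b, expected in [(np.array([1., -1., 1.]), np.array([-1., 0., 0.])),
                    (np.array([-3., 0., 2.]), np.array([-1/6, -3/2, 5/3]))]:
    c = np.linalg.solve(L, b)         # forward substitution stage
    x = np.linalg.solve(U, c)         # back substitution stage
    assert np.allclose(x, expected) and np.allclose(A @ x, b)
```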

1.4.10: Write down the elementary $4 \times 4$ permutation matrices (a) $P_1$ that permutes the second and fourth rows, and (b) $P_2$ that permutes the first and fourth rows. (c) Do $P_1$ and $P_2$ commute? (d) Explain what the matrix products $P_1 P_2$ and $P_2 P_1$ do to a $4 \times 4$ matrix.

1 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 We have P1 =   and P2 =  . 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0

The effect of $P_1 P_2$ on a $4 \times 4$ matrix (when we multiply by $P_1 P_2$ on the left) is to first perform the permutation corresponding to $P_2$ (since it is closer to the matrix on which we're operating), then perform the permutation corresponding to $P_1$. That is, $P_1 P_2$ sends the first row temporarily to the fourth row and then to the second row, sends the second row to the fourth row, and sends the fourth row to the first row. On the other hand, $P_2 P_1$ corresponds to performing the operations in the opposite order. That is, $P_2 P_1$ sends the first row to the fourth row, the second row to the first row, and the fourth row to the second row.

More briefly, $P_1 P_2$ and $P_2 P_1$ both cycle the first, second, and fourth rows, but in opposite directions ($P_1 P_2$ cycles them in "ascending" order, and $P_2 P_1$ in "descending" order). Since these are distinct permutations of the rows, the matrices $P_1 P_2$ and $P_2 P_1$ cannot be equal. In other words, $P_1$ and $P_2$ do not commute.
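One can also confirm numerically that the two products differ; a minimal NumPy sketch:

```python
import numpy as np

I = np.eye(4)
P1 = I[[0, 3, 2, 1]]   # swaps rows 2 and 4
P2 = I[[3, 1, 2, 0]]   # swaps rows 1 and 4
print(np.allclose(P1 @ P2, P2 @ P1))   # False: P1 and P2 do not commute
```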

1.5.4: Show that the inverse of $L = \begin{pmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & 0 & 1 \end{pmatrix}$ is $L^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ -a & 1 & 0 \\ -b & 0 & 1 \end{pmatrix}$. However, the inverse of $M = \begin{pmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & c & 1 \end{pmatrix}$ is not $\begin{pmatrix} 1 & 0 & 0 \\ -a & 1 & 0 \\ -b & -c & 1 \end{pmatrix}$. What is $M^{-1}$?

If we multiply $L \begin{pmatrix} 1 & 0 & 0 \\ -a & 1 & 0 \\ -b & 0 & 1 \end{pmatrix}$, we find that it equals $\begin{pmatrix} 1 & 0 & 0 \\ a - a & 1 & 0 \\ b - b & 0 & 1 \end{pmatrix} = I$. Similarly, $\begin{pmatrix} 1 & 0 & 0 \\ -a & 1 & 0 \\ -b & 0 & 1 \end{pmatrix} L$ is equal to the $3 \times 3$ identity. Thus, the inverse of $L$ is indeed $\begin{pmatrix} 1 & 0 & 0 \\ -a & 1 & 0 \\ -b & 0 & 1 \end{pmatrix}$.

We claim that $M^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ -a & 1 & 0 \\ ac - b & -c & 1 \end{pmatrix}$. Indeed, one can check (by multiplying) that this is a two-sided inverse of $M$.

1.5.25: Find the inverse of each of the following matrices, if possible, by applying the Gauss–Jordan Method: (a) $\begin{pmatrix} 1 & -2 \\ 3 & -3 \end{pmatrix}$, (e) $\begin{pmatrix} 1 & 0 & -2 \\ 3 & -1 & 0 \\ -2 & 1 & -3 \end{pmatrix}$.

For (a), we set up the augmented matrix $\left(\begin{array}{cc|cc} 1 & -2 & 1 & 0 \\ 3 & -3 & 0 & 1 \end{array}\right)$. Using the top-left $1$ as our first pivot, we reduce to $\left(\begin{array}{cc|cc} 1 & -2 & 1 & 0 \\ 0 & 3 & -3 & 1 \end{array}\right)$. Next we multiply the second row by $1/3$, obtaining $\left(\begin{array}{cc|cc} 1 & -2 & 1 & 0 \\ 0 & 1 & -1 & 1/3 \end{array}\right)$. Finally, we add twice the last row to the first row, giving us $\left(\begin{array}{cc|cc} 1 & 0 & -1 & 2/3 \\ 0 & 1 & -1 & 1/3 \end{array}\right)$. Therefore, the inverse of $\begin{pmatrix} 1 & -2 \\ 3 & -3 \end{pmatrix}$ is $\begin{pmatrix} -1 & 2/3 \\ -1 & 1/3 \end{pmatrix}$.

For (e), we again set up the augmented matrix $\left(\begin{array}{ccc|ccc} 1 & 0 & -2 & 1 & 0 & 0 \\ 3 & -1 & 0 & 0 & 1 & 0 \\ -2 & 1 & -3 & 0 & 0 & 1 \end{array}\right)$. Working with the first pivot, we can zero out the entries below it in the first column: $\left(\begin{array}{ccc|ccc} 1 & 0 & -2 & 1 & 0 & 0 \\ 0 & -1 & 6 & -3 & 1 & 0 \\ 0 & 1 & -7 & 2 & 0 & 1 \end{array}\right)$. Next, the $-1$ in position $(2, 2)$ is our second pivot. We can use it to make the entry in position $(3, 2)$ a zero: $\left(\begin{array}{ccc|ccc} 1 & 0 & -2 & 1 & 0 & 0 \\ 0 & -1 & 6 & -3 & 1 & 0 \\ 0 & 0 & -1 & -1 & 1 & 1 \end{array}\right)$. Next, we multiply the last two rows by $-1$: $\left(\begin{array}{ccc|ccc} 1 & 0 & -2 & 1 & 0 & 0 \\ 0 & 1 & -6 & 3 & -1 & 0 \\ 0 & 0 & 1 & 1 & -1 & -1 \end{array}\right)$. Finally, we use the third pivot to zero out the entries above it in its column: $\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 3 & -2 & -2 \\ 0 & 1 & 0 & 9 & -7 & -6 \\ 0 & 0 & 1 & 1 & -1 & -1 \end{array}\right)$. Thus, our inverse is $\begin{pmatrix} 3 & -2 & -2 \\ 9 & -7 & -6 \\ 1 & -1 & -1 \end{pmatrix}$.
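As a sanity check, the two inverses can be compared against numpy.linalg.inv (a verification sketch, not part of the Gauss–Jordan computation itself):

```python
import numpy as np

A1 = np.array([[1., -2.], [3., -3.]])
A2 = np.array([[1., 0., -2.], [3., -1., 0.], [-2., 1., -3.]])
print(np.linalg.inv(A1))   # [[-1, 2/3], [-1, 1/3]]
print(np.linalg.inv(A2))   # [[3, -2, -2], [9, -7, -6], [1, -1, -1]]
```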

1.6.14: (a) Explain why the inverse of a permutation matrix equals its transpose: $P^{-1} = P^T$. (b) If $A^{-1} = A^T$, is $A$ necessarily a permutation matrix?

We can characterize the condition $A^{-1} = A^T$ for a square matrix $A$ (with real entries) as follows: consider the columns of $A$ as vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$ in $\mathbb{R}^n$. Then $A^T A = I_n$ when $\mathbf{a}_i \cdot \mathbf{a}_j$ equals zero for $i \neq j$ and one for $i = j$ (in this case, it will also be true that $AA^T = I_n$). That is, $A^{-1} = A^T$ when the different columns of $A$ are perpendicular ("orthogonal") to each other and each column has length 1. (Such matrices are called orthogonal matrices.)

In any case, since a permutation matrix has a single 1 in each row and column, with the rest of the entries being zero, it satisfies this criterion. After all, different columns are required to have their 1's in different spots, whereas the dot product of each column with itself is exactly 1. On the other hand, there are plenty of orthogonal matrices that are not permutation matrices. For example, the $2 \times 2$ rotation matrices $\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$, where $\theta \in \mathbb{R}$, are orthogonal (as one can check by hand).

1.8.1: Which of the following systems has (i) a unique solution? (ii) infinitely many solutions? (iii) no solution? In each case, find all solutions: (a) $\begin{cases} x - 2y = 1 \\ 3x + 2y = -3 \end{cases}$, (b) $\begin{cases} 2x + y + 3z = 1 \\ x + 4y - 2z = -3 \end{cases}$, (c) $\begin{cases} x + y - 2z = -3 \\ 2x - y + 3z = 7 \\ x - 2y + 5z = 1 \end{cases}$.

For (a), we can subtract three times the first equation from the second to obtain $\begin{cases} x - 2y = 1 \\ 8y = -6 \end{cases}$. We are now in row-echelon form. Since there are two pivots and zero free variables, there is a unique solution. The usual back-substitution method gives us this solution: $y = -3/4$ and $x = 1 - 3/2 = -1/2$.

For (b), we similarly reduce: subtract $1/2$ times the first equation from the second. We thus obtain $\begin{cases} 2x + y + 3z = 1 \\ \frac{7}{2}y - \frac{7}{2}z = -\frac{7}{2} \end{cases}$. The corresponding matrix is now in row-echelon form. There is a pivot in each row (hence, the system is consistent), but no pivot in the $z$ column. Thus, there are infinitely many solutions. Back-substitution gives a description of these: $y = z - 1$, and $x = 1 - 2z$. That is, the general solution is $(1 - 2z,\ -1 + z,\ z)^T$.

Finally, for (c), we subtract twice the first row from the second and subtract the first row from the third: $\begin{cases} x + y - 2z = -3 \\ -3y + 7z = 13 \\ -3y + 7z = 4 \end{cases}$. We then subtract the second row from the third: $\begin{cases} x + y - 2z = -3 \\ -3y + 7z = 13 \\ 0 = -9 \end{cases}$. The last row is a contradiction; hence, the system was inconsistent and had no solutions to begin with.

1.8.7: Determine the rank of the following matrices: (a) $\begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix}$, (b) $\begin{pmatrix} 2 & 1 & 3 \\ -2 & -1 & -3 \end{pmatrix}$, (c) $\begin{pmatrix} 1 & -1 & 1 \\ 1 & -1 & 2 \\ -1 & 1 & 0 \end{pmatrix}$, (d) $\begin{pmatrix} 2 & -1 & 0 \\ 2 & -1 & 1 \\ 1 & 1 & -1 \end{pmatrix}$, (e) $\begin{pmatrix} 3 \\ 0 \\ -2 \end{pmatrix}$, (f) $\begin{pmatrix} 0 & -1 & 2 & 5 \end{pmatrix}$.

For (a), it has row-echelon form $\begin{pmatrix} 1 & 1 \\ 0 & -3 \end{pmatrix}$. Thus, the matrix must be nonsingular and have rank 2.

A row-echelon form of the matrix in (b) is $\begin{pmatrix} 2 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix}$. Hence, it has rank 1.

In (c), we can subtract the first row from the second and add the first row to the third, giving us $\begin{pmatrix} 1 & -1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}$. Then we fail to find a pivot in the second column. Using the 1 in position $(2, 3)$ as our next pivot, we reach the row-echelon form $\begin{pmatrix} 1 & -1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$. Since there are two pivots, the matrix has rank 2.

In (d), we again reduce. Using the top-left 2 as our first pivot, we can zero out the rest of the first column: $\begin{pmatrix} 2 & -1 & 0 \\ 0 & 0 & 1 \\ 0 & 3/2 & -1 \end{pmatrix}$. We can now swap the second and third rows, obtaining $\begin{pmatrix} 2 & -1 & 0 \\ 0 & 3/2 & -1 \\ 0 & 0 & 1 \end{pmatrix}$. This matrix is in row-echelon form with three pivots, so the original matrix must have rank 3.

In (e), the row-echelon form is simply $(3, 0, 0)^T$, so we have rank 1.

The matrix in (f) is already in row-echelon form, so it has rank 1.
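A quick numerical confirmation of these ranks, using the matrices as printed above (this NumPy sketch is only a check, not a substitute for the row reductions):

```python
import numpy as np

mats = {
    "(a)": np.array([[1., 1.], [1., -2.]]),
    "(b)": np.array([[2., 1., 3.], [-2., -1., -3.]]),
    "(c)": np.array([[1., -1., 1.], [1., -1., 2.], [-1., 1., 0.]]),
    "(d)": np.array([[2., -1., 0.], [2., -1., 1.], [1., 1., -1.]]),
    "(e)": np.array([[3.], [0.], [-2.]]),
    "(f)": np.array([[0., -1., 2., 5.]]),
}
for label, M in mats.items():
    print(label, np.linalg.matrix_rank(M))   # expect 2, 1, 2, 3, 1, 1
```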

1.9.1: Use Gaussian Elimination to find the determinant of the following matrices: (a) $\begin{pmatrix} 2 & -1 \\ -4 & 3 \end{pmatrix}$, (b) $\begin{pmatrix} 0 & 1 & -2 \\ -1 & 0 & 3 \\ 2 & -3 & 0 \end{pmatrix}$.

For (a), we add twice the first row to the second, giving $\begin{pmatrix} 2 & -1 \\ 0 & 1 \end{pmatrix}$. This matrix is upper-triangular, so its determinant is the product of its diagonal entries. That is, $\det \begin{pmatrix} 2 & -1 \\ -4 & 3 \end{pmatrix} = 2 \cdot 1 = 2$.

For (b), we swap the first two rows, obtaining $\begin{pmatrix} -1 & 0 & 3 \\ 0 & 1 & -2 \\ 2 & -3 & 0 \end{pmatrix}$ (a row swap only changes the sign of the determinant). We then add twice the first row to the third row, giving us $\begin{pmatrix} -1 & 0 & 3 \\ 0 & 1 & -2 \\ 0 & -3 & 6 \end{pmatrix}$. Finally, we add three times the second row to the third row, yielding the upper-triangular matrix $\begin{pmatrix} -1 & 0 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix}$. This has determinant zero, so the original matrix also has determinant zero.

2.1.7: Let $\mathcal{F}(\mathbb{R}^2, \mathbb{R}^2)$ denote the vector space consisting of all functions $f: \mathbb{R}^2 \to \mathbb{R}^2$. (a) Which of the following functions $f(x, y)$ are elements? (i) $x^2 + y^2$, (ii) $\begin{pmatrix} x - y \\ xy \end{pmatrix}$, (iii) $\begin{pmatrix} e^x \\ \cos y \end{pmatrix}$, (iv) $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$, (v) $\begin{pmatrix} x & y \\ -y & x \end{pmatrix}$, (vi) $\begin{pmatrix} x \\ y \\ x + y \end{pmatrix}$. (b) Sum all of the elements of $\mathcal{F}(\mathbb{R}^2, \mathbb{R}^2)$ you identified in part (a). Then multiply your sum by the scalar $-5$. (c) Carefully describe the zero element of the vector space $\mathcal{F}(\mathbb{R}^2, \mathbb{R}^2)$.

For part (a), a function $f: \mathbb{R}^2 \to \mathbb{R}^2$ is the same as a pair of functions $\mathbb{R}^2 \to \mathbb{R}$ (one is the $x$-coordinate of the output, one is the $y$-coordinate of the output). Thus, (ii), (iii), and (iv) are all elements of $\mathcal{F}(\mathbb{R}^2, \mathbb{R}^2)$. For example, (iv) is the constant function that takes the value $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$ on the entire domain $\mathbb{R}^2$. Just to illustrate why the other functions are not elements of this vector space, consider (v). It represents a function from $\mathbb{R}^2$ (because the different coordinates of the output depend on the two input variables $x, y$) to the space of $2 \times 2$ real matrices (i.e. not the correct range, $\mathbb{R}^2$).

For part (b), recall that when a vector space is constructed as the space of functions from a set $S$ into a vector space $V$, the addition and scalar multiplication are defined "pointwise." For example, if $f, g: S \to V$, then $f + g$ is also a function $S \to V$ whose value at each $s \in S$ is equal to $f(s) + g(s)$ (where the addition is the vector addition in $V$). Scalar multiplication works similarly. Note that any extra structure on $S$ (e.g. if $S$ is a vector space in its own right) is not relevant to this construction. In particular, this means that $-5$ times the sum of (ii), (iii), and (iv) is the function
\[ f(x, y) = \begin{pmatrix} -5x + 5y - 5e^x - 5 \\ -5xy - 5\cos y - 15 \end{pmatrix}. \]

Based on the definition of vector addition (described above) in our space, the zero element (i.e. the additive identity) is the function $f(x, y) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$. In other words, it is the constant function whose value everywhere in the domain is $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$, i.e. the zero vector in the range, $\mathbb{R}^2$.

2.2.6: (a) Can you construct an example of a subset $S \subset \mathbb{R}^2$ with the property that $c\mathbf{v} \in S$ for all $c \in \mathbb{R}$, $\mathbf{v} \in S$, and yet $S$ is not a subspace? (b) What about an example in which $\mathbf{v} + \mathbf{w} \in S$ for every $\mathbf{v}, \mathbf{w} \in S$, and yet $S$ is not a subspace?

Recall that a subset $S \subset \mathbb{R}^2$ is a subspace when it is closed under scalar multiplication and addition. The properties described in (a) and (b) are exactly these, respectively. Thus, (a) is asking for a subset closed under scalar multiplication but not closed under addition, and vice versa for (b).

For (a), consider the union of the two coordinate axes: $S = \{(x, y) \mid xy = 0\}$. Then for any point of $S$, either $x = 0$ or $y = 0$. If we consider $c(x, y) = (cx, cy)$ for such a point (and any $c \in \mathbb{R}$), we find that $cx = 0$ or $cy = 0$ (respectively). Hence, $S$ is closed under scalar multiplication. On the other hand, $(1, 0), (0, 1) \in S$, but their sum $(1, 1)$ is not in $S$. So $S$ is not closed under vector addition, hence is not a subspace.

For (b), consider the right half-plane $S_0 = \{(x, y) \mid x \geq 0\}$. If we add any two vectors in $S_0$, we again get a vector in $S_0$. This is because the sum of their $x$ coordinates, both being non-negative, will again be non-negative (the requirement for belonging to $S_0$). On the other hand, $-1 \cdot (x, y) \notin S_0$ for a typical element of $S_0$. Indeed, if $(x, y)$ has positive $x$ coordinate (the vector $(1, 0) \in S_0$ is an example), then $-1 \cdot (x, y) = (-x, -y)$ has negative $x$ coordinate. In that case $(-x, -y) \notin S_0$. Therefore, $S_0$ is not closed under scalar multiplication, and thus fails to be a subspace.

2.3.8(a): Determine whether the polynomials $x^2 + 1$, $x^2 - 1$, $x^2 + x + 1$ span $\mathcal{P}^{(2)}$.

Consider an arbitrary element $b_0 + b_1 x + b_2 x^2$ in $\mathcal{P}^{(2)}$. This lies in the span of the three given polynomials if there exist scalars $c_1, c_2, c_3 \in \mathbb{R}$ such that $c_1(x^2 + 1) + c_2(x^2 - 1) + c_3(x^2 + x + 1) = b_0 + b_1 x + b_2 x^2$. Since two polynomials are equal precisely when their coefficients with respect to each term are equal, the above happens when
\[ \begin{cases} c_1 - c_2 + c_3 = b_0 \\ c_3 = b_1 \\ c_1 + c_2 + c_3 = b_2 \end{cases}. \]
We can determine whether such a system is consistent by passing to the augmented matrix and doing the usual Gaussian elimination algorithm. The corresponding augmented matrix is $\left(\begin{array}{ccc|c} 1 & -1 & 1 & b_0 \\ 0 & 0 & 1 & b_1 \\ 1 & 1 & 1 & b_2 \end{array}\right)$. However, since we're interested only in whether or not the system is always consistent (rather than also wanting to determine the subspace of $\mathbf{b}$'s on which it is consistent), we can ignore the augmented column. In particular, the given polynomials span all of $\mathcal{P}^{(2)}$ precisely when the matrix $\begin{pmatrix} 1 & -1 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}$ has rank 3.

Let's reduce our matrix! We can subtract the first row from the third to obtain $\begin{pmatrix} 1 & -1 & 1 \\ 0 & 0 & 1 \\ 0 & 2 & 0 \end{pmatrix}$. Then we can swap the second and third rows to obtain $\begin{pmatrix} 1 & -1 & 1 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$. At this point, we've reached row-echelon form and can spot three pivots. Thus, the original matrix has rank 3, implying that $x^2 + 1$, $x^2 - 1$, $x^2 + x + 1$ span all of $\mathcal{P}^{(2)}$.

2.3.22: (a) Show that the vectors $\begin{pmatrix} 1 \\ 0 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} -2 \\ 3 \\ -1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ -2 \\ 1 \\ -1 \end{pmatrix}$ are linearly independent. (b) Which of the following vectors are in their span? (i) $\begin{pmatrix} 1 \\ 1 \\ 2 \\ 1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$, (iii) $\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$, (iv) $\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$. (c) Suppose $\mathbf{b} = (a, b, c, d)^T$ lies in their span. What conditions must $a, b, c, d$ satisfy?

We can answer all of these questions by reducing the augmented matrix whose columns are the given vectors (and whose right-hand side is $\mathbf{b}$) to row-echelon form:
\[
\left(\begin{array}{ccc|c} 1 & -2 & 2 & a \\ 0 & 3 & -2 & b \\ 2 & -1 & 1 & c \\ 1 & 1 & -1 & d \end{array}\right)
\longrightarrow
\left(\begin{array}{ccc|c} 1 & -2 & 2 & a \\ 0 & 3 & -2 & b \\ 0 & 3 & -3 & c - 2a \\ 0 & 3 & -3 & d - a \end{array}\right)
\longrightarrow
\left(\begin{array}{ccc|c} 1 & -2 & 2 & a \\ 0 & 3 & -2 & b \\ 0 & 0 & -1 & c - 2a - b \\ 0 & 0 & -1 & d - a - b \end{array}\right)
\longrightarrow
\left(\begin{array}{ccc|c} 1 & -2 & 2 & a \\ 0 & 3 & -2 & b \\ 0 & 0 & -1 & c - 2a - b \\ 0 & 0 & 0 & d - a - b - (c - 2a - b) \end{array}\right).
\]
Since each column contains a pivot, there are no free variables. In particular, the set of solutions to the homogeneous equation $A\mathbf{x} = \mathbf{0}$ (where the columns of $A$ are the given vectors) is $\{\mathbf{0}\}$. That is, the given vectors are linearly independent.

Inspecting the lone zero row in the row-echelon form, we see that the system $A\mathbf{x} = \mathbf{b}$ is consistent when $d - a - b - (c - 2a - b) = 0$. Simplifying, this is when $a - c + d = 0$. But then this is the exact condition on $a, b, c, d$ for $\mathbf{b} = (a, b, c, d)^T$ to belong to the span of the given vectors. For example, (i), (iii), and (iv) satisfy this condition, so they are in the span, but (ii) is not.
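Here is a brief NumPy check of both the independence claim and the span condition $a - c + d = 0$ (a verification sketch only):

```python
import numpy as np

A = np.array([[1., -2., 2.],
              [0., 3., -2.],
              [2., -1., 1.],
              [1., 1., -1.]])
print(np.linalg.matrix_rank(A))   # 3: the three columns are linearly independent

tests = {"(i)": [1., 1., 2., 1.], "(ii)": [1., 0., 0., 0.],
         "(iii)": [0., 1., 0., 0.], "(iv)": [0., 0., 0., 0.]}
for label, b in tests.items():
    b = np.array(b)
    in_span = np.linalg.matrix_rank(np.column_stack([A, b])) == 3
    print(label, in_span, np.isclose(b[0] - b[2] + b[3], 0.0))   # the two answers agree
```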

2.4.11: (a) Show that $1,\ 1 - t,\ (1 - t)^2,\ (1 - t)^3$ is a basis for $\mathcal{P}^{(3)}$. (b) Write $p(t) = 1 + t^3$ in terms of the basis elements.

We'll give a solution for (a) based on a dirty trick (although in the course of solving (b) we'll prove (a) in a more mechanical way). Define the new variable $x = 1 - t$. Then observe that the space of polynomials in $t$ of degree at most 3 is the same as the space of polynomials in $x$ of degree at most 3. More pedantically, this change of variables induces an "isomorphism" (i.e. an invertible linear map) of vector spaces $\mathcal{P}^{(3)}_t \to \mathcal{P}^{(3)}_x$ (since $t = 1 - x$, the change really is invertible). But an invertible linear map takes any basis of the domain to a basis of the codomain (convince yourself this is true!). In particular, the inverse of our map takes the monomial basis $1, x, x^2, x^3$ of $\mathcal{P}^{(3)}_x$ to $1,\ 1 - t,\ (1 - t)^2,\ (1 - t)^3$.

For (b), our basis vectors are $1$, $1 - t$, $1 - 2t + t^2$, and $1 - 3t + 3t^2 - t^3$. Then we have to solve the system
\[ \left(\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 1 \\ 0 & -1 & -2 & -3 & 0 \\ 0 & 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & -1 & 1 \end{array}\right). \]
Luckily, this system is already in row-echelon form. Moreover, we can see that there are four pivots, so the coefficient matrix is invertible (i.e. the given vectors indeed form a basis). We can thus solve using back-substitution: $c_4 = -1$, $c_3 = 3$, $c_2 = -3$, and $c_1 = 2$. That is, $1 + t^3 = 2 \cdot 1 + (-3) \cdot (1 - t) + 3 \cdot (1 - t)^2 + (-1) \cdot (1 - t)^3$.
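A short numerical check of the coordinates found above (a sketch assuming NumPy):

```python
import numpy as np

# Solve the triangular system for the coordinates c1, ..., c4.
M = np.array([[1., 1., 1., 1.],
              [0., -1., -2., -3.],
              [0., 0., 1., 3.],
              [0., 0., 0., -1.]])
c = np.linalg.solve(M, np.array([1., 0., 0., 1.]))
print(c)                                  # [ 2. -3.  3. -1.]

# Confirm that 1 + t^3 equals the claimed combination of 1, (1-t), (1-t)^2, (1-t)^3.
t = np.linspace(-2., 2., 9)
rhs = c[0] + c[1]*(1 - t) + c[2]*(1 - t)**2 + c[3]*(1 - t)**3
print(np.allclose(1 + t**3, rhs))         # True
```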

2.5.4: Suppose $\mathbf{x}^* = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ is a particular solution to the equation $\begin{pmatrix} 1 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & -1 \end{pmatrix} \mathbf{x} = \mathbf{b}$. (a) What is $\mathbf{b}$? (b) Find the general solution.

For (a), we just need to multiply out the matrix-vector product $\begin{pmatrix} 1 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & -1 \end{pmatrix} \mathbf{x}^* = \begin{pmatrix} -1 \\ 2 \\ -1 \end{pmatrix}$.

For (b), we can perform our usual augmented-matrix method with the right-hand column starting as $\begin{pmatrix} -1 \\ 2 \\ -1 \end{pmatrix} = \mathbf{b}$. Instead, we'll recall our theorem that says a general solution looks like $\mathbf{x}^* + \mathbf{z}$, where $\mathbf{z}$ is an element of the null space of our matrix. So we just compute the null space (again by reducing the matrix):

 1 −1 0  1 −1 0  1 −1 0 −1 0 1  0 −1 1  0 −1 1 . 0 1 −1 0 1 −1 0 0 0

Using back-substitution, we see that the null space consists of vectors $\mathbf{z} = \begin{pmatrix} z \\ z \\ z \end{pmatrix}$. Therefore, the general solution has the form
\[ \begin{pmatrix} 1 + z \\ 2 + z \\ 3 + z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + z \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \]
where $z \in \mathbb{R}$.

2.5.21(a,c): For each of the following matrices find bases for the (i) image, (ii) coimage, (iii) kernel, and (iv) cokernel. (a) $\begin{pmatrix} 1 & -3 \\ 2 & -6 \end{pmatrix}$, (c) $\begin{pmatrix} 1 & 1 & 2 & 1 \\ 1 & 0 & -1 & 3 \\ 2 & 3 & 7 & 0 \end{pmatrix}$.

(a) We compute the row-echelon form of $\begin{pmatrix} 1 & -3 \\ 2 & -6 \end{pmatrix}$, which is $\begin{pmatrix} 1 & -3 \\ 0 & 0 \end{pmatrix}$. Since we have one pivot, (i) the image of our matrix is the span of its first column, i.e. $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$, (ii) the coimage is the span of its first row, i.e. $\begin{pmatrix} 1 \\ -3 \end{pmatrix}$, (iii) the kernel is the span of $\begin{pmatrix} 3 \\ 1 \end{pmatrix}$, and (iv) the cokernel is the span of $\begin{pmatrix} 2 \\ -1 \end{pmatrix}$.

(c) Let's compute the row-echelon form: subtract the first row from the second to obtain $\begin{pmatrix} 1 & 1 & 2 & 1 \\ 0 & -1 & -3 & 2 \\ 2 & 3 & 7 & 0 \end{pmatrix}$. Next, subtract twice the first row from the third to obtain $\begin{pmatrix} 1 & 1 & 2 & 1 \\ 0 & -1 & -3 & 2 \\ 0 & 1 & 3 & -2 \end{pmatrix}$. Finally, we add the second row to the third, giving the row-echelon form $\begin{pmatrix} 1 & 1 & 2 & 1 \\ 0 & -1 & -3 & 2 \\ 0 & 0 & 0 & 0 \end{pmatrix}$. The first two columns have pivots, which implies that the first two columns of the original matrix form a basis of the image. That is, (i) a basis for the image is given by $\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix}$.

Since row operations do not change the row space, the two non-zero rows of the row-echelon form give (ii) a basis for the coimage: $\begin{pmatrix} 1 \\ 1 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \\ -3 \\ 2 \end{pmatrix}$.

Using back-substitution with the row-echelon form, we can describe the kernel as the set of vectors $\begin{pmatrix} x_3 - 3x_4 \\ -3x_3 + 2x_4 \\ x_3 \\ x_4 \end{pmatrix}$. Thus, (iii) a basis for this subspace is $\begin{pmatrix} 1 \\ -3 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -3 \\ 2 \\ 0 \\ 1 \end{pmatrix}$.

Finally, since the rank is two, the dimension of the cokernel is one (rank–nullity, applied to $A^T$, implies that the rank plus the dimension of the cokernel equals the number of rows). In the course of finding the row-echelon form, we found that the third row plus the second row minus three times the first row is equal to the zero vector. That is, (iv) $\begin{pmatrix} -3 \\ 1 \\ 1 \end{pmatrix}$ gives a basis for the cokernel.

2.5.36: Prove or give a counterexample: If U is the row echelon form of A, then Im(U) = Im(A).

Since row operations in general do not preserve the span of the columns (i.e. the image of the matrix), we should not expect this to be true. Indeed, suppose $A = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Then the row echelon form of $A$ is $U = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. It's clear that $\operatorname{Im}(U) \neq \operatorname{Im}(A)$.

2.5.38: Prove that $\ker A \subseteq \ker A^2$. More generally, prove $\ker A \subseteq \ker BA$ for every compatible matrix $B$.

We only need to prove the second statement, since the first is just the special case $B = A$. Suppose $\mathbf{v}$ is a vector belonging to $\ker A$. We aim to show that $\mathbf{v} \in \ker BA$ (then every element of $\ker A$ is an element of $\ker BA$, i.e. $\ker A$ is a subset of $\ker BA$). But associativity of matrix multiplication implies $BA\mathbf{v} = B(A\mathbf{v}) = B\mathbf{0} = \mathbf{0}$, as desired.

3.1.21(a,b,c): For each of the given pairs of functions in $C^0[0, 1]$, find their $L^2$ inner product $\langle f, g \rangle$ and their $L^2$ norms $\|f\|$, $\|g\|$: (a) $f(x) = 1$, $g(x) = x$; (b) $f(x) = \cos 2\pi x$, $g(x) = \sin 2\pi x$; (c) $f(x) = x$, $g(x) = e^x$.

For (a), we have $\langle f, g \rangle = \int_0^1 x \, dx = 1/2$, $\|f\| = \sqrt{\int_0^1 1 \, dx} = 1$, and $\|g\| = \sqrt{\int_0^1 x^2 \, dx} = 1/\sqrt{3}$.

For (b), we have $\langle f, g \rangle = \int_0^1 \sin 2\pi x \cos 2\pi x \, dx = \int_0^1 \tfrac{1}{2} \sin 4\pi x \, dx = 0$. Also, $\|f\| = \|g\| = \sqrt{\int_0^1 \sin^2 2\pi x \, dx} = 1/\sqrt{2}$.

For (c), we use integration by parts to obtain $\langle f, g \rangle = \int_0^1 x e^x \, dx = e - \int_0^1 e^x \, dx = 1$. We already computed $\|f\| = 1/\sqrt{3}$ in part (a). Finally, $\|g\| = \sqrt{\int_0^1 e^{2x} \, dx} = \sqrt{e^2/2 - 1/2}$.
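These integrals are easy to confirm numerically; the following sketch assumes SciPy is available.

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    # L^2 inner product on [0, 1]
    return quad(lambda x: f(x) * g(x), 0.0, 1.0)[0]

f, g = (lambda x: 1.0), (lambda x: x)                                # part (a)
print(inner(f, g), np.sqrt(inner(f, f)), np.sqrt(inner(g, g)))       # 0.5, 1.0, 1/sqrt(3)

f, g = (lambda x: np.cos(2*np.pi*x)), (lambda x: np.sin(2*np.pi*x))  # part (b)
print(inner(f, g), np.sqrt(inner(f, f)))                             # ~0, 1/sqrt(2)

f, g = (lambda x: x), (lambda x: np.exp(x))                          # part (c)
print(inner(f, g), np.sqrt(inner(g, g)))                             # 1.0, sqrt(e^2/2 - 1/2)
```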

3.2.18: Find all vectors in $\mathbb{R}^4$ that are orthogonal to both $(1, 2, 3, 4)^T$ and $(5, 6, 7, 8)^T$.

Notice that the set of vectors orthogonal to both of these is exactly the kernel of the matrix $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \end{pmatrix}$. We can therefore answer this problem by computing the row-echelon form (and using it to solve the corresponding homogeneous system): $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & -4 & -8 & -12 \end{pmatrix}$. The general element of the kernel must thus look like $\begin{pmatrix} x_3 + 2x_4 \\ -2x_3 - 3x_4 \\ x_3 \\ x_4 \end{pmatrix}$. As $x_3, x_4$ range over all real numbers, this parametrizes the set of vectors orthogonal to both $(1, 2, 3, 4)^T$ and $(5, 6, 7, 8)^T$.

3.3.3: Which two of the vectors $\mathbf{u} = (-2, 2, 1)^T$, $\mathbf{v} = (1, 4, 1)^T$, $\mathbf{w} = (0, 0, -1)^T$ are closest to each other in distance for (a) the Euclidean norm, (b) the $\infty$ norm, (c) the 1 norm?

For (a), the distance between $\mathbf{u}$ and $\mathbf{v}$ is $\sqrt{9 + 4} = \sqrt{13}$, between $\mathbf{u}$ and $\mathbf{w}$ is $\sqrt{4 + 4 + 4} = \sqrt{12}$, and between $\mathbf{v}$ and $\mathbf{w}$ is $\sqrt{1 + 16 + 4} = \sqrt{21}$. So $\mathbf{u}$ and $\mathbf{w}$ are closest.

For (b), the distance between $\mathbf{u}$ and $\mathbf{v}$ is $\max(3, 2, 0) = 3$. The distance between $\mathbf{u}$ and $\mathbf{w}$ is $\max(2, 2, 2) = 2$. And the distance between $\mathbf{v}$ and $\mathbf{w}$ is $\max(1, 4, 2) = 4$. So $\mathbf{u}$ and $\mathbf{w}$ are again the closest.

For (c), the distance between $\mathbf{u}$ and $\mathbf{v}$ is $3 + 2 + 0 = 5$, between $\mathbf{u}$ and $\mathbf{w}$ is $2 + 2 + 2 = 6$, and between $\mathbf{v}$ and $\mathbf{w}$ is $1 + 4 + 2 = 7$. Therefore, $\mathbf{u}$ and $\mathbf{v}$ are actually closest with respect to the 1 norm.
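The pairwise distances can be recomputed in one loop with numpy.linalg.norm (a verification sketch):

```python
import numpy as np

u, v, w = np.array([-2., 2., 1.]), np.array([1., 4., 1.]), np.array([0., 0., -1.])
for p in (2, np.inf, 1):
    print(p, np.linalg.norm(u - v, p), np.linalg.norm(u - w, p), np.linalg.norm(v - w, p))
# Euclidean: sqrt(13), sqrt(12), sqrt(21); infinity norm: 3, 2, 4; 1-norm: 5, 6, 7
```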

3.4.22(a,b): Find the Gram matrix corresponding to each of the following sets of vectors using the Euclidean dot product on $\mathbb{R}^n$. Which are positive definite? (ii) $\begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} -2 \\ 3 \end{pmatrix}, \begin{pmatrix} -1 \\ -1 \end{pmatrix}$; (iv) $\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}$; (vi) $\begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \\ 0 \\ 1 \end{pmatrix}$.

For (ii), we shouldn't expect these vectors to be linearly independent (since 3 vectors in a 2-dimensional space must always be dependent). Indeed, the $3 \times 3$ Gram matrix is $\begin{pmatrix} 5 & 4 & -3 \\ 4 & 13 & -1 \\ -3 & -1 & 2 \end{pmatrix}$. If we reduce this matrix (via attempting the LU factorization algorithm), we first obtain $\begin{pmatrix} 5 & 4 & -3 \\ 0 & 49/5 & 7/5 \\ 0 & 7/5 & 1/5 \end{pmatrix}$. Working with the second pivot, we obtain $\begin{pmatrix} 5 & 4 & -3 \\ 0 & 49/5 & 7/5 \\ 0 & 0 & 0 \end{pmatrix}$. Hence, this matrix isn't positive definite (positive definite matrices are invertible), implying that the original three vectors are linearly dependent (which we knew for abstract reasons).

For (iv), the Gram matrix is $\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$. If we apply the LU factorization algorithm, working with the first pivot gives us $\begin{pmatrix} 2 & 1 & 1 \\ 0 & 3/2 & 1/2 \\ 0 & 1/2 & 3/2 \end{pmatrix}$. Working with the second pivot gives us $\begin{pmatrix} 2 & 1 & 1 \\ 0 & 3/2 & 1/2 \\ 0 & 0 & 4/3 \end{pmatrix}$. We've reached an upper triangular matrix with all positive entries on the diagonal. In particular, our (symmetric) Gram matrix must therefore admit an $LDL^T$ factorization with diagonal $D$ having these positive entries exactly. Therefore, the Gram matrix is positive definite (which implies the original three vectors are linearly independent). We could have reached the same conclusion alternatively by remembering that Gram matrices are always positive semi-definite, so invertibility (which follows from the diagonal entries of $U$ being non-zero in this case) implies positive definiteness.

For (vi), the $2 \times 2$ Gram matrix is $\begin{pmatrix} 2 & -1 \\ -1 & 3 \end{pmatrix}$. This is positive definite: its associated quadratic form is $2x^2 - 2xy + 3y^2 = 2(x - y/2)^2 + (5/2)y^2$, which is always non-negative (in light of the positive coefficients 2 and 5/2) and equal to zero precisely when $x - y/2 = y = 0$ (i.e. when $x = y = 0$). We could have anticipated this via the fact that the original two vectors are visibly linearly independent.
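Here is a NumPy sketch that rebuilds the three Gram matrices from the vectors (as reconstructed in the problem statement above) and tests positive definiteness via eigenvalues; it is only a numerical cross-check.

```python
import numpy as np

sets = {
    "(ii)": np.array([[1., -2., -1.], [2., 3., -1.]]),               # columns are the vectors
    "(iv)": np.array([[1., 1., 0.], [1., 0., 1.], [0., 1., 1.]]),
    "(vi)": np.array([[1., -1.], [0., 1.], [-1., 0.], [0., 1.]]),
}
for label, A in sets.items():
    K = A.T @ A                                   # the Gram matrix
    pos_def = np.linalg.eigvalsh(K).min() > 1e-12
    print(label, K.tolist(), pos_def)
# (ii) is singular (not positive definite); (iv) and (vi) are positive definite.
```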

3.5.7: Write the following quadratic forms in matrix notation and determine if they are positive definite: (a) $x^2 + 4xz + 2y^2 + 8yz + 12z^2$, (b) $3x^2 - 2y^2 - 8xy + xz + z^2$.

For (a), the associated matrix is $\begin{pmatrix} 1 & 0 & 2 \\ 0 & 2 & 4 \\ 2 & 4 & 12 \end{pmatrix}$. This is positive semi-definite, but not positive definite, since we can rewrite the quadratic form as $(x + 2z)^2 + 2(y + 2z)^2$. This is non-negative for every $(x, y, z)^T$ and zero exactly when $x + 2z = y + 2z = 0$, i.e. when $x = y = -2z$.

For (b), the quadratic form is given by $(x, y, z) \begin{pmatrix} 3 & -4 & 1/2 \\ -4 & -2 & 0 \\ 1/2 & 0 & 1 \end{pmatrix} (x, y, z)^T$. This is visibly not positive definite, as it has a negative entry on its diagonal. In particular, if we plug in $(x, y, z) = (0, 1, 0)$, then the quadratic form evaluates to $-2 < 0$.

3.6.28: (a) Determine whether the vectors $\mathbf{v}_1 = \begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix}$, $\mathbf{v}_2 = \begin{pmatrix} 0 \\ 1 + i \\ 2 \end{pmatrix}$, $\mathbf{v}_3 = \begin{pmatrix} -1 + i \\ 1 + i \\ -1 \end{pmatrix}$ are linearly independent or linearly dependent. (b) Do they form a basis of $\mathbb{C}^3$? (c) Compute the Hermitian norm of each vector. (d) Compute the Hermitian dot products between all different pairs. Which vectors are orthogonal?

For (a,b), we can perform Gaussian elimination on the matrix whose columns are the $\mathbf{v}$'s: $\begin{pmatrix} 1 & 0 & -1 + i \\ i & 1 + i & 1 + i \\ 0 & 2 & -1 \end{pmatrix}$ reduces to $\begin{pmatrix} 1 & 0 & -1 + i \\ 0 & 1 + i & 1 + i - i(-1 + i) \\ 0 & 2 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -1 + i \\ 0 & 1 + i & 2 + 2i \\ 0 & 2 & -1 \end{pmatrix}$. The latter reduces to $\begin{pmatrix} 1 & 0 & -1 + i \\ 0 & 1 + i & 2 + 2i \\ 0 & 0 & -5 \end{pmatrix}$. We've reached row-echelon form, and we can see that there are three pivots; hence these three vectors are linearly independent and form a basis of $\mathbb{C}^3$ (any three linearly independent vectors must form a basis of this 3-dimensional space).

For (c), we have $\|\mathbf{v}_1\| = \sqrt{1 + i(-i)} = \sqrt{2}$, $\|\mathbf{v}_2\| = \sqrt{0 + 2 + 4} = \sqrt{6}$, and $\|\mathbf{v}_3\| = \sqrt{2 + 2 + 1} = \sqrt{5}$.

For (d), we have $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0 + i(1 - i) + 0 = 1 + i$, $\mathbf{v}_1 \cdot \mathbf{v}_3 = 1(-1 - i) + i(1 - i) + 0 = 0$, and $\mathbf{v}_2 \cdot \mathbf{v}_3 = 0 + (1 + i)(1 - i) + 2(-1) = 2 - 2 = 0$. Hence, both of $\mathbf{v}_1, \mathbf{v}_2$ are orthogonal to $\mathbf{v}_3$, although not to each other. (The fact that $\mathbf{v}_1, \mathbf{v}_2$ are orthogonal to $\mathbf{v}_3$ and are visibly not parallel to each other can lead to an alternative proof that these three vectors are linearly independent, hence form a basis for $\mathbb{C}^3$.)

4.1.22: (a) Prove that $\mathbf{v}_1 = (\tfrac{3}{5}, 0, \tfrac{4}{5})^T$, $\mathbf{v}_2 = (-\tfrac{4}{13}, \tfrac{12}{13}, \tfrac{3}{13})^T$, $\mathbf{v}_3 = (-\tfrac{48}{65}, -\tfrac{5}{13}, \tfrac{36}{65})^T$ form an orthonormal basis for $\mathbb{R}^3$ for the usual dot product. (b) Find the coordinates of $\mathbf{v} = (1, 1, 1)^T$ relative to this basis. (c) Verify the formula $\|\mathbf{v}\| = \sqrt{c_1^2 + \cdots + c_n^2}$ (where $c_i = \langle \mathbf{v}, \mathbf{u}_i \rangle$ are the coordinates of $\mathbf{v}$ with respect to the orthonormal basis $\mathbf{u}_1, \ldots, \mathbf{u}_n$).

For (a), we just need to take the pairwise dot products for each of the $\mathbf{v}_i$'s. For orthogonality: $\mathbf{v}_1 \cdot \mathbf{v}_2 = -12/65 + 0 + 12/65 = 0$, $\mathbf{v}_1 \cdot \mathbf{v}_3 = -144/325 + 0 + 144/325 = 0$, and $\mathbf{v}_2 \cdot \mathbf{v}_3 = 192/845 - 300/845 + 108/845 = 0$. For the fact that each vector is normalized to unit length: $\mathbf{v}_1 \cdot \mathbf{v}_1 = 9/25 + 0 + 16/25 = 1$, $\mathbf{v}_2 \cdot \mathbf{v}_2 = 16/169 + 144/169 + 9/169 = 1$, and $\mathbf{v}_3 \cdot \mathbf{v}_3 = \frac{48^2 + 25^2 + 36^2}{65^2} = \frac{4225}{4225} = 1$.

For (b), we must compute the dot products of $\mathbf{v}$ with each of the elements of the orthonormal basis: $\mathbf{v} \cdot \mathbf{v}_1 = 7/5$, $\mathbf{v} \cdot \mathbf{v}_2 = 11/13$, and $\mathbf{v} \cdot \mathbf{v}_3 = -37/65$. Therefore,
\[ \mathbf{v} = \frac{7}{5}\mathbf{v}_1 + \frac{11}{13}\mathbf{v}_2 - \frac{37}{65}\mathbf{v}_3. \]
For (c), we know that $\|\mathbf{v}\| = \sqrt{3}$. On the other hand,

\[ \sqrt{(7/5)^2 + (11/13)^2 + (-37/65)^2} = \sqrt{49/25 + 121/169 + 1369/4225} = \sqrt{\frac{8281 + 3025 + 1369}{4225}} = \sqrt{\frac{12675}{4225}} = \sqrt{3}, \]
which confirms the formula in this case.
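A compact NumPy check of parts (a)–(c) (verification only):

```python
import numpy as np

v1 = np.array([3/5, 0., 4/5])
v2 = np.array([-4/13, 12/13, 3/13])
v3 = np.array([-48/65, -5/13, 36/65])
Q = np.column_stack([v1, v2, v3])
print(np.allclose(Q.T @ Q, np.eye(3)))                     # (a): orthonormal basis

v = np.array([1., 1., 1.])
c = Q.T @ v
print(c)                                                   # (b): [7/5, 11/13, -37/65]
print(np.isclose(np.linalg.norm(c), np.linalg.norm(v)))    # (c): both equal sqrt(3)
```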

4.2.9(a): Construct an orthonormal basis for $\mathbb{R}^3$ with respect to the inner product defined by the positive definite matrix $\begin{pmatrix} 4 & -2 & 0 \\ -2 & 3 & -1 \\ 0 & -1 & 2 \end{pmatrix}$.

Let's start with the standard basis and apply Gram–Schmidt. First, $\mathbf{u}_1 = \mathbf{e}_1 / \|\mathbf{e}_1\| = \mathbf{e}_1 / 2 = (1/2, 0, 0)^T$, since $\|\mathbf{e}_1\|^2 = 4$, as we can see from the top-left entry of the matrix.

Second, $\mathbf{v}_2 = \mathbf{e}_2 - \langle \mathbf{e}_2, \mathbf{u}_1 \rangle \mathbf{u}_1 = \mathbf{e}_2 - (-1)\mathbf{e}_1/2 = (1/2, 1, 0)^T$. Normalizing, $\mathbf{u}_2 = \frac{1}{\sqrt{1 - 2 + 3}}(\mathbf{e}_1/2 + \mathbf{e}_2) = \begin{pmatrix} \frac{1}{2\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{pmatrix}$.

Third, $\mathbf{v}_3 = \mathbf{e}_3 - \langle \mathbf{e}_3, \mathbf{u}_1 \rangle \mathbf{u}_1 - \langle \mathbf{e}_3, \mathbf{u}_2 \rangle \mathbf{u}_2$, which is equal to $\mathbf{e}_3 - \langle \mathbf{e}_3, \mathbf{e}_2/\sqrt{2} \rangle \mathbf{u}_2$ (since $\mathbf{e}_1, \mathbf{e}_3$ are orthogonal). This simplifies to
\[ \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} + \frac{1}{\sqrt{2}} \begin{pmatrix} \frac{1}{2\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{pmatrix} = \begin{pmatrix} 1/4 \\ 1/2 \\ 1 \end{pmatrix}. \]

Finally, $\mathbf{u}_3 = \mathbf{v}_3 / \|\mathbf{v}_3\| = \frac{2}{\sqrt{6}} \begin{pmatrix} 1/4 \\ 1/2 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{2\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ \frac{2}{\sqrt{6}} \end{pmatrix}$.

4.3.27(c): Find the QR factorization of the matrix $\begin{pmatrix} 2 & 1 & -1 \\ 0 & 1 & 3 \\ -1 & -1 & 1 \end{pmatrix}$.

We want to perform Gram–Schmidt on the columns of this matrix, using the dot product as the inner product. First, $\|(2, 0, -1)^T\| = \sqrt{5}$, which implies that $r_{11} = \sqrt{5}$ and $\mathbf{u}_1$ (the first column of $Q$) is equal to $(2/\sqrt{5}, 0, -1/\sqrt{5})^T$.

Second, $\begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} \cdot \mathbf{u}_1 = 3/\sqrt{5}$, so $r_{12} = 3/\sqrt{5}$. Moreover, this implies that
\[ \mathbf{v}_2 = \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} - \begin{pmatrix} 6/5 \\ 0 \\ -3/5 \end{pmatrix} = \begin{pmatrix} -1/5 \\ 1 \\ -2/5 \end{pmatrix}. \]
Therefore, $\mathbf{u}_2 = \sqrt{5/6} \begin{pmatrix} -1/5 \\ 1 \\ -2/5 \end{pmatrix}$ and $r_{22} = \sqrt{6/5}$.

Third,

\[ \mathbf{v}_3 = \begin{pmatrix} -1 \\ 3 \\ 1 \end{pmatrix} - \left(-\frac{3}{\sqrt{5}}\right)\mathbf{u}_1 - \sqrt{5/6}\left(\frac{1}{5} + 3 - \frac{2}{5}\right)\mathbf{u}_2 = \begin{pmatrix} -1 \\ 3 \\ 1 \end{pmatrix} + \begin{pmatrix} 6/5 \\ 0 \\ -3/5 \end{pmatrix} + \begin{pmatrix} 7/15 \\ -7/3 \\ 14/15 \end{pmatrix}. \]

This implies that $r_{13} = -3/\sqrt{5}$ and $r_{23} = 7\sqrt{2}/\sqrt{15}$. Simplifying, $\mathbf{v}_3 = \begin{pmatrix} 2/3 \\ 2/3 \\ 4/3 \end{pmatrix}$. Thus, $\mathbf{u}_3 = \sqrt{3/8} \begin{pmatrix} 2/3 \\ 2/3 \\ 4/3 \end{pmatrix}$ and $r_{33} = \sqrt{8/3}$. Summarizing, we have
\[ \begin{pmatrix} 2 & 1 & -1 \\ 0 & 1 & 3 \\ -1 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 2/\sqrt{5} & -1/\sqrt{30} & 1/\sqrt{6} \\ 0 & \sqrt{5}/\sqrt{6} & 1/\sqrt{6} \\ -1/\sqrt{5} & -\sqrt{2}/\sqrt{15} & 2/\sqrt{6} \end{pmatrix} \begin{pmatrix} \sqrt{5} & 3/\sqrt{5} & -3/\sqrt{5} \\ 0 & \sqrt{6}/\sqrt{5} & 7\sqrt{2}/\sqrt{15} \\ 0 & 0 & \sqrt{8}/\sqrt{3} \end{pmatrix}, \]

which is the QR factorization of our matrix.
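To double-check the factorization numerically, one can form $Q$ and $R$ explicitly and confirm that $Q$ has orthonormal columns and that $QR$ reproduces the original matrix (a NumPy sketch; note that numpy.linalg.qr may return an equivalent factorization with some signs flipped, so we compare products rather than the factors themselves).

```python
import numpy as np

A = np.array([[2., 1., -1.], [0., 1., 3.], [-1., -1., 1.]])
Q = np.array([[ 2/np.sqrt(5), -1/np.sqrt(30),  1/np.sqrt(6)],
              [ 0.,            np.sqrt(5/6),   1/np.sqrt(6)],
              [-1/np.sqrt(5), -np.sqrt(2/15),  2/np.sqrt(6)]])
R = np.array([[np.sqrt(5), 3/np.sqrt(5), -3/np.sqrt(5)],
              [0.,         np.sqrt(6/5),  7*np.sqrt(2)/np.sqrt(15)],
              [0.,         0.,            np.sqrt(8/3)]])
print(np.allclose(Q.T @ Q, np.eye(3)))   # Q has orthonormal columns
print(np.allclose(Q @ R, A))             # and Q R recovers A
```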

T 4.4.2: Find the orthogonal projection of the vector v = (1, 1, 1) onto the following√ √ subspaces,√ using the indicated orthonormal/orthogonal bases (a) the line in the direction (−1/ 3, 1/ 3, 1/ 3)T ; (b) the line spanned by (2, −1, 3)T ; (c) the plane spanned by (1, 1, 0)T , (−2, 2, 1)T ; (d) (−3/5, 4/5, 0)T , (4/13, 3/13, −12/13)T .

For part (a), we must compute $\mathbf{v} \cdot \frac{1}{\sqrt{3}}(-1, 1, 1)^T = \frac{1}{\sqrt{3}}$. Therefore, the orthogonal projection is $\frac{1}{\sqrt{3}} \cdot \frac{1}{\sqrt{3}}(-1, 1, 1)^T = (-1/3, 1/3, 1/3)^T$.

For part (b), we have $\mathbf{v} \cdot (2, -1, 3)^T = 4$. Since $(2, -1, 3)^T$ isn't normalized, the orthogonal projection is $\frac{\mathbf{v} \cdot (2, -1, 3)^T}{(2, -1, 3)^T \cdot (2, -1, 3)^T}(2, -1, 3)^T = \frac{2}{7}(2, -1, 3)^T = (4/7, -2/7, 6/7)^T$.

For part (c), the projection has a similar formula: $\frac{\mathbf{v} \cdot (1, 1, 0)^T}{(1, 1, 0)^T \cdot (1, 1, 0)^T}(1, 1, 0)^T + \frac{\mathbf{v} \cdot (-2, 2, 1)^T}{(-2, 2, 1)^T \cdot (-2, 2, 1)^T}(-2, 2, 1)^T$. We can evaluate each of the relevant dot products to find that the orthogonal projection is $\frac{2}{2}(1, 1, 0)^T + \frac{1}{9}(-2, 2, 1)^T = (7/9, 11/9, 1/9)^T$.

For part (d), the basis vectors are normalized already. Therefore, the orthogonal projection is $(\mathbf{v} \cdot (-3/5, 4/5, 0)^T)(-3/5, 4/5, 0)^T + (\mathbf{v} \cdot (4/13, 3/13, -12/13)^T)(4/13, 3/13, -12/13)^T$, which is equal to $(-3/25, 4/25, 0)^T + (-20/169, -15/169, 60/169)^T = \left(-\frac{1007}{4225}, \frac{301}{4225}, \frac{60}{169}\right)^T$.
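All four projections follow the same orthogonal-basis formula, so they are easy to check in a few lines (NumPy sketch; the helper proj below is a hypothetical convenience function, not textbook notation).

```python
import numpy as np

v = np.array([1., 1., 1.])

def proj(v, basis):
    # Orthogonal projection onto the span of a list of mutually orthogonal vectors.
    return sum((v @ u) / (u @ u) * u for u in basis)

print(proj(v, [np.array([-1., 1., 1.]) / np.sqrt(3)]))              # (a) (-1/3, 1/3, 1/3)
print(proj(v, [np.array([2., -1., 3.])]))                           # (b) (4/7, -2/7, 6/7)
print(proj(v, [np.array([1., 1., 0.]), np.array([-2., 2., 1.])]))   # (c) (7/9, 11/9, 1/9)
print(proj(v, [np.array([-3/5, 4/5, 0.]),
               np.array([4/13, 3/13, -12/13])]))                    # (d) (-1007/4225, 301/4225, 60/169)
```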

4.4.19: Let $V = \mathcal{P}^{(4)}$ denote the space of quartic polynomials, with the $L^2$ inner product $\langle p, q \rangle = \int_{-1}^{1} p(x)q(x)\,dx$. Let $W = \mathcal{P}^{(2)}$ be the subspace of quadratic polynomials. (a) Write down the conditions that a polynomial $p \in \mathcal{P}^{(4)}$ must satisfy in order to belong to the orthogonal complement $W^\perp$. (b) Find a basis for and the dimension of $W^\perp$. (c) Find an orthogonal basis for $W^\perp$.

For part (a), consider $p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4$. The condition that $p$ belong to $W^\perp$ is equivalent to the condition that it be orthogonal to each vector in a spanning set (e.g. a basis) for $W$. So we have the conditions $\langle p, 1 \rangle = \langle p, x \rangle = \langle p, x^2 \rangle = 0$. By computing these integrals, we get a system of 3 homogeneous equations in the 5 coefficients $a_0, \ldots, a_4$ of $p$:
\[ \begin{cases} 2a_0 + \tfrac{2}{3}a_2 + \tfrac{2}{5}a_4 = 0 \\ \tfrac{2}{3}a_1 + \tfrac{2}{5}a_3 = 0 \\ \tfrac{2}{3}a_0 + \tfrac{2}{5}a_2 + \tfrac{2}{7}a_4 = 0 \end{cases} \]
For (b), we solve the equations from part (a), finding that the row-echelon form of the associated matrix is $\begin{pmatrix} 2 & 0 & 2/3 & 0 & 2/5 \\ 0 & 2/3 & 0 & 2/5 & 0 \\ 0 & 0 & 8/45 & 0 & 16/105 \end{pmatrix}$. Therefore, $\dim W^\perp = 2$, with the general element looking like $p(x) = \frac{3}{35}a_4 - \frac{3}{5}a_3 x - \frac{6}{7}a_4 x^2 + a_3 x^3 + a_4 x^4$.

Thus, a basis for $W^\perp$ is given by $3 - 30x^2 + 35x^4$, $-3x + 5x^3$. We claim that this is an orthogonal basis already. The inner product of our two basis elements is $\int_{-1}^{1} (3 - 30x^2 + 35x^4)(-3x + 5x^3)\,dx$. Notice that the integrand is an odd function (one of our basis elements is even and the other is odd; the product of an even function and an odd function is odd). But $\int_{-1}^{1} f(x)\,dx = 0$ whenever $f$ is odd (assuming the integral is defined).

4.4.20: Let $W \subseteq V$. Prove that (a) $W \cap W^\perp = \{\mathbf{0}\}$, (b) $W \subseteq (W^\perp)^\perp$.

Note that we do not need to assume that $W$ is finite-dimensional (or even closed); if it were, our "unique decomposition" theorem would in fact give the stronger statement $W = (W^\perp)^\perp$.

For part (a), suppose that $\mathbf{v} \in W \cap W^\perp$. Then $\langle \mathbf{v}, \mathbf{v} \rangle = 0$, since $\mathbf{v}$ is orthogonal to each vector in $W$ (but is itself a vector in $W$). By positive definiteness of the inner product, we must therefore have $\mathbf{v} = \mathbf{0}$.

For part (b), suppose that $\mathbf{w} \in W$ and consider an arbitrary $\mathbf{z} \in W^\perp$. If we can show that $\langle \mathbf{w}, \mathbf{z} \rangle = 0$, then we've proven that $\mathbf{w}$ is orthogonal to each element of $W^\perp$. That is to say, $\mathbf{w} \in (W^\perp)^\perp$. But $\mathbf{w}$ was itself arbitrary, so this would prove that $W \subseteq (W^\perp)^\perp$, as we wanted. Returning to $\langle \mathbf{w}, \mathbf{z} \rangle$, it is indeed equal to zero, since $\mathbf{z}$ is by definition orthogonal to each vector in $W$.

5.2.3: For each of the following quadratic functions, determine whether there is a minimum. If so, find the minimizer and the minimum value for the function. (b) $3x^2 + 3xy + 3y^2 - 2x - 2y + 4$ and (d) $x^2 + y^2 + yz + z^2 + x + y - z$.

For (b), the quadratic form part is given by the positive definite matrix $K = \begin{pmatrix} 3 & 3/2 \\ 3/2 & 3 \end{pmatrix}$ and the linear part is given by $\mathbf{f} = (1, 1)^T$ (using the notation from the textbook). The unique solution $\mathbf{x}^* = \begin{pmatrix} 2/9 \\ 2/9 \end{pmatrix}$ to $K\mathbf{x} = \mathbf{f}$ is our minimizer, and it gives the minimum value $32/9$.

For (d), the quadratic form part is again positive definite. We can solve the corresponding $K\mathbf{x}^* = \mathbf{f}$ equation to rewrite the function as $(x + 1/2)^2 + (y + 1)^2 + (y + 1)(z - 1) + (z - 1)^2 - 5/4$. In particular, the minimizer is $(-1/2, -1, 1)^T$ and the minimum value is $-5/4$.

5.4.4: Find the least squares solution to the linear system $A\mathbf{x} = \mathbf{b}$ when (a) $A = \begin{pmatrix} 2 & 3 \\ 4 & -2 \\ 1 & 5 \\ 2 & 0 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} 2 \\ -1 \\ 1 \\ 3 \end{pmatrix}$, (b) $A = \begin{pmatrix} 2 & 1 & 4 \\ 1 & -2 & 1 \\ 1 & 0 & -3 \\ 5 & 2 & -2 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}$.

For (a), we need to solve the system $A^T A \mathbf{x} = A^T \mathbf{b}$, which is $\begin{pmatrix} 25 & 3 \\ 3 & 38 \end{pmatrix} \mathbf{x} = \begin{pmatrix} 7 \\ 13 \end{pmatrix}$. This has the unique solution $\mathbf{x} = \begin{pmatrix} 227/941 \\ 304/941 \end{pmatrix}$.

For (b), we similarly need to solve the normal equations. In this case, these are $\begin{pmatrix} 31 & 10 & -4 \\ 10 & 9 & -2 \\ -4 & -2 & 30 \end{pmatrix} \mathbf{x} = \begin{pmatrix} 1 \\ 0 \\ -3 \end{pmatrix}$. Via row-reduction we can find the solution $\mathbf{x} = \begin{pmatrix} 218/5262 \\ -358/5262 \\ -521/5262 \end{pmatrix}$.
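Both answers can be cross-checked with numpy.linalg.lstsq, using the data as printed in the problem statement above (a verification sketch):

```python
import numpy as np

A1 = np.array([[2., 3.], [4., -2.], [1., 5.], [2., 0.]])
b1 = np.array([2., -1., 1., 3.])
print(np.linalg.lstsq(A1, b1, rcond=None)[0])   # [227/941, 304/941]

A2 = np.array([[2., 1., 4.], [1., -2., 1.], [1., 0., -3.], [5., 2., -2.]])
b2 = np.array([0., 0., 1., 0.])
print(np.linalg.lstsq(A2, b2, rcond=None)[0])   # [218/5262, -358/5262, -521/5262]
```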

5.5.1: Find the straight line $y = \alpha + \beta t$ that best fits the following data in the least squares sense: (a) $t = (-2, 0, 1, 3)^T$, $y = (0, 1, 2, 5)^T$; (b) $t = (1, 2, 3, 4, 5)^T$, $y = (1, 0, -2, -3, -3)^T$.

For (a), we hope to find the least squares solution to $\begin{pmatrix} 1 & -2 \\ 1 & 0 \\ 1 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \mathbf{y}$. The corresponding normal equations are $\begin{pmatrix} 4 & 2 \\ 2 & 14 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} 8 \\ 17 \end{pmatrix}$. This has the unique solution $\alpha = 3/2$ and $\beta = 1$, i.e. the line of best fit is $y = 3/2 + t$.

For (b), we similarly construct the normal equations $\begin{pmatrix} 5 & 15 \\ 15 & 55 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} -7 \\ -32 \end{pmatrix}$. These have the unique solution $\alpha = 19/10$ and $\beta = -11/10$. So the line of best fit is $y = 19/10 - (11/10)t$.
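The two best-fit lines can be reproduced with numpy.polyfit (degree 1), which solves the same least squares problem (a sketch):

```python
import numpy as np

t1, y1 = np.array([-2., 0., 1., 3.]), np.array([0., 1., 2., 5.])
print(np.polyfit(t1, y1, 1))   # [1.0, 1.5]   i.e. y = 3/2 + t

t2, y2 = np.array([1., 2., 3., 4., 5.]), np.array([1., 0., -2., -3., -3.])
print(np.polyfit(t2, y2, 1))   # [-1.1, 1.9]  i.e. y = 19/10 - (11/10) t
```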

7.1.23: (a) Show that the partial derivatives $\partial_x[f] = \frac{\partial f}{\partial x}$ and $\partial_y[f] = \frac{\partial f}{\partial y}$ both define linear operators on the space of continuously differentiable functions $f(x, y)$. (b) For which values of $a, b, c, d$ is the map $L[f] = a\frac{\partial f}{\partial x} + b\frac{\partial f}{\partial y} + cf + d$ linear?

We won't give a pedantic proof (using the definition of the derivative as a limit of difference quotients) of part (a). Instead, we just note that $\partial_x[af] = a\,\partial_x[f]$ for a constant $a$, and that $\partial_x[f + g] = \partial_x[f] + \partial_x[g]$. The same holds for $\partial_y$, hence these are linear operators.

For (b): since a linear combination of linear operators is also linear, we just need to check whether $f \mapsto f$ and $f \mapsto d$ are linear (because of part (a)). The first of these is the identity map, which is certainly linear. On the other hand, if $L[f] = d$ with $d \neq 0$, then $L[f + g] = d \neq L[f] + L[g] = 2d$. Therefore, the map as described is linear precisely when $d = 0$.

8.2.31: An elementary reflection matrix has the form $Q = I - 2\mathbf{u}\mathbf{u}^T$, where $\mathbf{u} \in \mathbb{R}^n$ is a unit vector. (a) Find the eigenvalues and eigenvectors of the elementary reflection matrices for the unit vectors (i) $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, (ii) $\begin{pmatrix} 3/5 \\ 4/5 \end{pmatrix}$, (iii) $\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, (iv) $\begin{pmatrix} 1/\sqrt{2} \\ 0 \\ -1/\sqrt{2} \end{pmatrix}$. (b) What are the eigenvalues and eigenvectors of a general elementary reflection matrix?

We'll just answer part (b), since the unit vectors in part (a) are special cases. We claim that the eigenvectors of $Q = I - 2\mathbf{u}\mathbf{u}^T$ are as follows: each nonzero $\mathbf{v} \in \mathbf{u}^\perp$ is an eigenvector with eigenvalue 1, and each nonzero vector in the span of $\mathbf{u}$ is an eigenvector with eigenvalue $-1$. If we can show this, then $Q$ is diagonalizable with eigenbasis consisting of $\mathbf{u}$ together with any basis for $\mathbf{u}^\perp$. In that case, the eigenvectors and eigenvalues we've described must be all of the eigenvectors and eigenvalues of $Q$.

So let's check that $Q\mathbf{v} = \mathbf{v}$ for $\mathbf{v} \in \mathbf{u}^\perp$. Indeed, $Q\mathbf{v} = \mathbf{v} - 2\mathbf{u}(\mathbf{u} \cdot \mathbf{v}) = \mathbf{v}$. To finish, it just remains to see that $Q\mathbf{u} = -\mathbf{u}$. But $Q\mathbf{u} = \mathbf{u} - 2\mathbf{u}(\mathbf{u} \cdot \mathbf{u}) = \mathbf{u} - 2\mathbf{u} = -\mathbf{u}$, as we wanted.

8.3.14: Diagonalize the Fibonacci matrix $F = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$.

The characteristic polynomial of $F$ is $\lambda^2 - \lambda - 1$, and so the eigenvalues of $F$ are $\frac{1 \pm \sqrt{5}}{2}$. Recall that the positive eigenvalue $\varphi = \frac{1 + \sqrt{5}}{2}$ is also called the "Golden Ratio." We'll also give a name to the negative eigenvalue: $\tilde{\varphi} = \frac{1 - \sqrt{5}}{2}$. By considering the matrices $F - \varphi I$ and $F - \tilde{\varphi} I$ (and determining their null spaces), we find that the $\varphi$-eigenspace of $F$ is spanned by $\begin{pmatrix} \varphi \\ 1 \end{pmatrix}$ and the $\tilde{\varphi}$-eigenspace is spanned by $\begin{pmatrix} \tilde{\varphi} \\ 1 \end{pmatrix}$. The two of these together form an eigenbasis for $F$. Thus, the matrix whose columns are these eigenvectors is the change-of-basis matrix that puts $F$ into diagonal form. More specifically,

\[ F = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} \varphi & \tilde{\varphi} \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \varphi & 0 \\ 0 & \tilde{\varphi} \end{pmatrix} \begin{pmatrix} \varphi & \tilde{\varphi} \\ 1 & 1 \end{pmatrix}^{-1}. \]
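A numerical confirmation of the diagonalization (NumPy sketch):

```python
import numpy as np

F = np.array([[1., 1.], [1., 0.]])
phi = (1 + np.sqrt(5)) / 2
phit = (1 - np.sqrt(5)) / 2
S = np.array([[phi, phit], [1., 1.]])             # columns are the eigenvectors
D = np.diag([phi, phit])
print(np.allclose(F, S @ D @ np.linalg.inv(S)))   # True
print(np.linalg.eigvals(F))                       # [phi, phit]
```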