Math 480 Notes on Orthogonality

The word orthogonal is a synonym for perpendicular.

Question 1: When are two vectors $\vec{v}_1$ and $\vec{v}_2$ in $\mathbb{R}^n$ orthogonal to one another?

The most basic answer is "if the angle between them is $90^\circ$," but this is not very practical. How could you tell whether the vectors
$$\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix}$$
are at $90^\circ$ from one another?

One way to think about this is as follows: $\vec{v}_1$ and $\vec{v}_2$ are orthogonal if and only if the triangle formed by $\vec{v}_1$, $\vec{v}_2$, and $\vec{v}_1 - \vec{v}_2$ (drawn with its tail at $\vec{v}_2$ and its head at $\vec{v}_1$) is a right triangle. The Pythagorean Theorem then tells us that this triangle is a right triangle if and only if
$$\|\vec{v}_1\|^2 + \|\vec{v}_2\|^2 = \|\vec{v}_1 - \vec{v}_2\|^2, \tag{1}$$
where $\|\cdot\|$ denotes the length of a vector.

The length of a vector $\vec{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$ is easy to measure: the Pythagorean Theorem (once again) tells us that
$$\|\vec{x}\| = \sqrt{x_1^2 + \cdots + x_n^2}.$$
The expression under the square root is simply the matrix product
$$\vec{x}^T \vec{x} = (x_1 \,\cdots\, x_n) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$

Definition. The inner product (also called the dot product) of two vectors $\vec{x}, \vec{y} \in \mathbb{R}^n$, written $\langle \vec{x}, \vec{y} \rangle$ or $\vec{x} \cdot \vec{y}$, is defined by
$$\langle \vec{x}, \vec{y} \rangle = \vec{x}^T \vec{y} = \sum_{i=1}^n x_i y_i.$$

Since matrix multiplication is linear, inner products satisfy
$$\langle \vec{x}, \vec{y}_1 + \vec{y}_2 \rangle = \langle \vec{x}, \vec{y}_1 \rangle + \langle \vec{x}, \vec{y}_2 \rangle, \qquad \langle \vec{x}, a\vec{y} \rangle = a\langle \vec{x}, \vec{y} \rangle.$$
(Similar formulas hold in the first coordinate, since $\langle \vec{x}, \vec{y} \rangle = \langle \vec{y}, \vec{x} \rangle$.)

Now we can write
$$\|\vec{v}_1 - \vec{v}_2\|^2 = \langle \vec{v}_1 - \vec{v}_2, \vec{v}_1 - \vec{v}_2 \rangle = \langle \vec{v}_1, \vec{v}_1 \rangle - 2\langle \vec{v}_1, \vec{v}_2 \rangle + \langle \vec{v}_2, \vec{v}_2 \rangle = \|\vec{v}_1\|^2 - 2\langle \vec{v}_1, \vec{v}_2 \rangle + \|\vec{v}_2\|^2,$$
so Equation (1) holds if and only if $\langle \vec{v}_1, \vec{v}_2 \rangle = 0$.

Answer to Question 1: Vectors $\vec{v}_1$ and $\vec{v}_2$ in $\mathbb{R}^n$ are orthogonal if and only if $\langle \vec{v}_1, \vec{v}_2 \rangle = 0$.

Exercise 1: Which of the following pairs of vectors are orthogonal to one another? Draw pictures to check your answers.

i) $\begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} -2 \\ 1 \end{pmatrix}$

ii) $\begin{pmatrix} 1 \\ 1 \\ 3 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}$

iii) $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$

Exercise 2: Find two orthogonal vectors in $\mathbb{R}^6$ all of whose coordinates are non-zero.

Definition. Given a subspace $S \subset \mathbb{R}^n$, the orthogonal complement of $S$, written $S^\perp$, is the subspace consisting of all vectors $\vec{v} \in \mathbb{R}^n$ that are orthogonal to every $\vec{s} \in S$.

Theorem 1. If $S$ is a subspace of $\mathbb{R}^n$ and $\dim(S) = k$, then $\dim(S^\perp) = n - k$.

The basic idea here is that every vector in $\mathbb{R}^n$ can be built up from vectors in $S$ and vectors in $S^\perp$, and these subspaces do not overlap. Think about the case of $\mathbb{R}^3$: the orthogonal complement of a line (a 1-dimensional subspace) is a plane (a 2-dimensional subspace) and vice versa.

Key Example: Given an $m \times n$ matrix $A \in \mathbb{R}^{m \times n}$, the orthogonal complement of the row space $\mathrm{Row}(A)$ is precisely $N(A)$. Why is this? The definition of matrix multiplication shows that being in the nullspace of $A$ is exactly the same as being orthogonal to every row of $A$: recall that if
$$A = \begin{bmatrix} \vec{a}_1 \\ \vec{a}_2 \\ \vdots \\ \vec{a}_m \end{bmatrix}$$
and $\vec{x} \in \mathbb{R}^n$, then the product $A\vec{x}$ is given by
$$A\vec{x} = \begin{bmatrix} \vec{a}_1 \cdot \vec{x} \\ \vec{a}_2 \cdot \vec{x} \\ \vdots \\ \vec{a}_m \cdot \vec{x} \end{bmatrix}.$$

Now, notice how the theorem fits with what we know about the fundamental subspaces: if $\dim N(A) = k$, then there are $k$ free variables and $n - k$ pivot variables in the system $A\vec{x} = \vec{0}$. Hence $\dim \mathrm{Row}(A) = n - k$. So the dimensions of $N(A)$ and its orthogonal complement $\mathrm{Row}(A)$ add to $n$, as claimed by the Theorem.

This argument actually proves the Theorem in general: every subspace $S$ in $\mathbb{R}^n$ has a basis $\vec{s}_1, \vec{s}_2, \ldots, \vec{s}_m$ (for some $m \geq 1$), and $S$ is then equal to the row space of the matrix
$$A = \begin{bmatrix} \vec{s}_1^{\,T} \\ \vec{s}_2^{\,T} \\ \vdots \\ \vec{s}_m^{\,T} \end{bmatrix}.$$
The statement that $S$ contains a finite basis deserves some explanation, and will be considered in detail below.
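Before moving on, here is a quick computational check of the orthogonality criterion from Question 1. This is a minimal sketch in Python using NumPy (the helper name are_orthogonal is just for illustration); it tests the pair of vectors from the opening question and shows that the Pythagorean condition of Equation (1) agrees with the inner-product test.

    import numpy as np

    def are_orthogonal(x, y):
        """Return True if <x, y> = 0 (up to floating-point tolerance)."""
        return np.isclose(np.dot(x, y), 0.0)

    # The pair of vectors from the opening question.
    v1 = np.array([1.0, 1.0, 1.0])
    v2 = np.array([1.0, 3.0, 1.0])

    print(np.dot(v1, v2))          # 5.0, so these vectors are not orthogonal
    print(are_orthogonal(v1, v2))  # False

    # The Pythagorean test of Equation (1) gives the same verdict:
    lhs = np.linalg.norm(v1)**2 + np.linalg.norm(v2)**2   # 3 + 11 = 14
    rhs = np.linalg.norm(v1 - v2)**2                      # 4
    print(np.isclose(lhs, rhs))    # False, consistent with <v1, v2> != 0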
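The Key Example ($\mathrm{Row}(A)^\perp = N(A)$) can also be checked numerically. The sketch below uses SymPy with a small matrix chosen here purely as an example; it verifies that each nullspace basis vector is orthogonal to every row of $A$, and that the dimensions add to $n$ as Theorem 1 predicts.

    from sympy import Matrix

    # A small example matrix (its third row is the sum of the first two).
    A = Matrix([[1, 2, 0, 1],
                [0, 0, 1, 3],
                [1, 2, 1, 4]])

    n = A.cols
    null_basis = A.nullspace()   # basis of N(A)
    row_dim = A.rank()           # dim Row(A)

    # Each nullspace basis vector is orthogonal to every row of A.
    for v in null_basis:
        for i in range(A.rows):
            assert A.row(i).dot(v) == 0

    # dim Row(A) + dim N(A) = n.
    assert row_dim + len(null_basis) == n
    print(row_dim, len(null_basis), n)   # prints: 2 2 4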
Another Key Example: Given an $m \times n$ matrix $A \in \mathbb{R}^{m \times n}$, the orthogonal complement of the column space $\mathrm{Col}(A)$ is precisely $N(A^T)$. This follows by the same sort of argument as for the first Key Example.

The following theorem should seem geometrically obvious, but it is annoyingly difficult to prove directly.

Theorem 2. If $V$ and $W$ are subspaces of $\mathbb{R}^n$ and $V = W^\perp$, then $W = V^\perp$ as well.

Proof. One half of this statement really is easy: if $V$ is the orthogonal complement of $W$, this means $V$ consists of all vectors $\vec{v} \in \mathbb{R}^n$ such that $\vec{v} \cdot \vec{w} = 0$ for all $\vec{w} \in W$. Now if $\vec{w} \in W$, then $\vec{w}$ is definitely perpendicular to every $\vec{v} \in V$ (i.e. $\vec{v} \cdot \vec{w} = 0$), and hence $W \subset V^\perp$.

But why must every vector that is orthogonal to all of $V$ actually lie in $W$? We can prove this using what we know about dimensions. Say $V$ is $k$-dimensional. Then the dimension of $V^\perp$ is $n - k$ by Theorem 1. But Theorem 1 also tells us that the dimension of $W$ is $n - k$ (because $V = W^\perp$). So $W$ is an $(n-k)$-dimensional subspace of the $(n-k)$-dimensional space $V^\perp$, and from Section 5 of the Notes on Linear Independence, Bases, and Dimension, we know that $W$ must in fact be all of $V^\perp$.

Corollary. For any matrix $A$, $\mathrm{Col}(A) = N(A^T)^\perp$.

Note that this statement has a nice implication for linear systems: the column space $\mathrm{Col}(A)$ consists of all vectors $\vec{b}$ such that $A\vec{x} = \vec{b}$ has a solution. If you want to check whether $A\vec{x} = \vec{b}$ has a solution, you can now just check whether or not $\vec{b}$ is perpendicular to all vectors in $N(A^T)$. Sometimes this is easy to check, for instance if you have a basis for $N(A^T)$. (Note that if a vector $\vec{w}$ is perpendicular to each vector in a basis for some subspace $V$, then $\vec{w}$ is in fact perpendicular to all linear combinations of these basis vectors, so $\vec{w} \in V^\perp$.)

This raises a question: how can we find a basis for $N(A^T)$? Row reduction gives rise to an equation
$$EA = R,$$
where $R$ is the reduced echelon form of $A$ and $E$ is a product of elementary matrices and permutation matrices (corresponding to the row operations performed on $A$). Say $R$ has $k$ rows of zeros. Note that the dimension of $N(A^T)$ is precisely the number of rows of zeros in $R$ (why?), so we are looking for an independent set of vectors in $N(A^T)$ of size $k$. If you look at the last $k$ rows in the matrix equation $EA = R$, you'll see that this equation says that the last $k$ rows of $E$ lie in the left nullspace $N(A^T)$. Moreover, these vectors are independent, because $E$ is a product of invertible matrices, hence invertible (so its rows are independent). A computational sketch of this procedure is given at the end of these notes.

We now consider in detail the question of why every subspace of $\mathbb{R}^n$ has a basis.

Theorem 3. If $S$ is a subspace of $\mathbb{R}^n$, then $S$ has a basis containing at most $n$ elements. Equivalently, $\dim(S) \leq n$.

Proof. First, recall that every set of $n + 1$ (or more) vectors in $\mathbb{R}^n$ is linearly dependent, since they form the columns of a matrix with more columns than rows. So every sufficiently large set of vectors in $S$ is dependent. Let $k$ be the smallest number between $1$ and $n + 1$ such that every set of $k$ vectors in $S$ is dependent.

If $k = 1$, then every vector $\vec{s} \in S$ forms a dependent set (all by itself), so $S$ must contain only the zero vector. In this case, the zero vector forms a spanning set for $S$. So we'll assume $k > 1$. Then there is a set $\vec{s}_1, \ldots, \vec{s}_{k-1}$ of vectors in $S$ which is linearly independent, and every larger set in $S$ is dependent. We claim that this set actually spans $S$ (and hence is a basis for $S$).
The proof will be by contradiction, meaning that we'll consider what would happen if this set did not span $S$, and we'll see that this would lead to a contradiction. If this set did not span $S$, then there would be a vector $\vec{s} \in S$ that is not a linear combination of the vectors $\vec{s}_1, \ldots, \vec{s}_{k-1}$. We claim that this makes the set $\vec{s}_1, \ldots, \vec{s}_{k-1}, \vec{s}$ linearly independent. Say
$$c_1 \vec{s}_1 + \cdots + c_{k-1} \vec{s}_{k-1} + c_k \vec{s} = \vec{0}. \tag{2}$$
We will prove that all the scalars $c_i$ must be zero. If $c_k$ were non-zero, then we could solve the above equation for $\vec{s}$, yielding
$$\vec{s} = -\frac{c_1}{c_k} \vec{s}_1 - \cdots - \frac{c_{k-1}}{c_k} \vec{s}_{k-1}.$$
But that's impossible, since $\vec{s}$ is not a linear combination of the vectors $\vec{s}_1, \ldots, \vec{s}_{k-1}$! So $c_k$ is zero, and Equation (2) becomes
$$c_1 \vec{s}_1 + \cdots + c_{k-1} \vec{s}_{k-1} = \vec{0}.$$
Since $\vec{s}_1, \ldots, \vec{s}_{k-1}$ is independent, the rest of the $c_i$ must be zero as well. We have now shown that $\vec{s}_1, \ldots, \vec{s}_{k-1}, \vec{s}$ is a linearly independent set of $k$ vectors in $S$, but this contradicts the assumption that all sets of size $k$ in $S$ are dependent.
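Finally, here is the computational sketch promised above for finding a basis of $N(A^T)$ from the equation $EA = R$, together with the Corollary's solvability test. It is only an illustration in Python using SymPy: the matrix $A$, the right-hand sides, and the helper name solvable are chosen for this example, and $E$ is recovered by row-reducing the augmented matrix $[A \mid I]$, one convenient way to record the row operations.

    from sympy import Matrix, eye, zeros

    # Example matrix (chosen only for illustration; its third row is row1 + row2).
    A = Matrix([[1, 2, 0, 1],
                [0, 0, 1, 3],
                [1, 2, 1, 4]])
    m, n = A.rows, A.cols

    # Row-reduce [A | I]; the result has the form [R | E] with E A = R and E invertible.
    aug, _ = A.row_join(eye(m)).rref()
    R, E = aug[:, :n], aug[:, n:]
    assert E * A == R

    # The last k rows of E (k = number of zero rows of R) give a basis for N(A^T).
    k = sum(1 for i in range(m) if R.row(i) == zeros(1, n))
    left_null_basis = [E.row(i) for i in range(m - k, m)]
    for y in left_null_basis:
        assert y * A == zeros(1, n)   # i.e. the transpose of y lies in N(A^T)

    # Corollary: A x = b has a solution exactly when b is orthogonal to N(A^T).
    def solvable(b):
        return all(y.dot(b) == 0 for y in left_null_basis)

    print(solvable(Matrix([1, 1, 2])))   # True:  (1, 1, 2) is column 1 plus column 3 of A
    print(solvable(Matrix([1, 0, 0])))   # False: not orthogonal to the basis vector (1, 1, -1)

For this particular $A$, the single zero row of $R$ corresponds to the last row of $E$, which comes out to $(1, 1, -1)$, matching the relation row 3 = row 1 + row 2.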