
Notes on Vectors and Norms

Robert A. van de Geijn, Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, [email protected]

September 15, 2014

1 Absolute Value

Recall that if α ∈ C, then |α| denotes its absolute value. In other words, if α = α_r + iα_c, then

|α| = √(α_r² + α_c²) = √(ᾱα).

The absolute value has the following properties:

• α ≠ 0 ⇒ |α| > 0 (| · | is positive definite),

• |αβ| = |α||β| (| · | is homogeneous), and

• |α + β| ≤ |α| + |β| (| · | obeys the triangle inequality).

2 Vector Norms

A (vector) norm extends the notion of an absolute value (length or size) to vectors:

Definition 1. Let ν : Cⁿ → R. Then ν is a (vector) norm if for all x, y ∈ Cⁿ and all α ∈ C

• x ≠ 0 ⇒ ν(x) > 0 (ν is positive definite),

• ν(αx) = |α|ν(x) (ν is homogeneous), and

• ν(x + y) ≤ ν(x) + ν(y) (ν obeys the triangle inequality).

Exercise 2. Prove that if ν : Cⁿ → R is a norm, then ν(0) = 0 (where the first 0 denotes the zero vector in Cⁿ).

Answer: Let x ∈ Cⁿ, let 0⃗ denote the zero vector of size n, and let 0 denote the scalar zero. Then

ν(0⃗) = ν(0 · x)    (0 · x = 0⃗)
     = |0|ν(x)     (ν(·) is homogeneous)
     = 0           (algebra).

End of Answer

Note: often we will use ‖ · ‖ to denote a vector norm.

2.1 Vector 2-norm (length)

Definition 3. The vector 2-norm ‖ · ‖₂ : Cⁿ → R is defined by

‖x‖₂ = √(xᴴx) = √(χ̄₀χ₀ + ··· + χ̄ₙ₋₁χₙ₋₁) = √(|χ₀|² + ··· + |χₙ₋₁|²).
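Definition 3 translates directly into code. Below is a minimal sketch in Python (the helper name `two_norm` is ours, not from the notes); `abs` returns the modulus of a complex number, so complex components are handled correctly:

```python
import math

def two_norm(x):
    # ||x||_2 = sqrt(|chi_0|^2 + ... + |chi_{n-1}|^2); abs() gives the
    # modulus of a complex number, matching the definition above.
    return math.sqrt(sum(abs(chi) ** 2 for chi in x))

x = [3 + 4j, 0, 12]
print(two_norm(x))  # sqrt(5^2 + 0^2 + 12^2) = 13.0
```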

To show that the vector 2-norm is a norm, we will need the following theorem:

Theorem 4 (Cauchy–Schwarz inequality). Let x, y ∈ Cⁿ. Then |xᴴy| ≤ ‖x‖₂‖y‖₂.

Proof: Assume that x ≠ 0 and y ≠ 0, since otherwise the inequality is trivially true. We can then choose x̂ = x/‖x‖₂ and ŷ = y/‖y‖₂. This leaves us to prove that |x̂ᴴŷ| ≤ 1, with ‖x̂‖₂ = ‖ŷ‖₂ = 1.

Pick α ∈ C with |α| = 1 so that αx̂ᴴŷ is real and nonnegative. Note that since it is real, αx̂ᴴŷ = ᾱŷᴴx̂ (a real number equals its own conjugate). Now,

0 ≤ ‖x̂ − αŷ‖₂²
  = (x̂ − αŷ)ᴴ(x̂ − αŷ)                  (‖z‖₂² = zᴴz)
  = x̂ᴴx̂ − ᾱŷᴴx̂ − αx̂ᴴŷ + ᾱαŷᴴŷ          (multiplying out)
  = 1 − 2αx̂ᴴŷ + |α|²                    (‖x̂‖₂ = ‖ŷ‖₂ = 1 and ᾱŷᴴx̂ = αx̂ᴴŷ)
  = 2 − 2αx̂ᴴŷ                           (|α| = 1).

Thus 1 ≥ αx̂ᴴŷ and, taking the absolute value of both sides,

1 ≥ |αx̂ᴴŷ| = |α||x̂ᴴŷ| = |x̂ᴴŷ|,

which is the desired result. QED

Theorem 5. The vector 2-norm is a norm.
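The Cauchy–Schwarz inequality is easy to spot-check numerically on random complex vectors. This is only a sanity check of the theorem, not a proof; the helper names are ours:

```python
import math
import random

def two_norm(x):
    return math.sqrt(sum(abs(chi) ** 2 for chi in x))

def inner(x, y):
    # x^H y = conj(chi_0)*psi_0 + ... + conj(chi_{n-1})*psi_{n-1}
    return sum(chi.conjugate() * psi for chi, psi in zip(x, y))

random.seed(0)
for _ in range(1000):
    x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5)]
    y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5)]
    # |x^H y| <= ||x||_2 ||y||_2 (small tolerance for rounding)
    assert abs(inner(x, y)) <= two_norm(x) * two_norm(y) + 1e-12
print("Cauchy-Schwarz held on all 1000 samples")
```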

Proof: To prove this, we merely check that the three conditions are met: Let x, y ∈ Cⁿ and α ∈ C be arbitrarily chosen. Then

• x ≠ 0 ⇒ ‖x‖₂ > 0 (‖ · ‖₂ is positive definite): Notice that x ≠ 0 means that at least one of its components is nonzero. Let's assume that χⱼ ≠ 0. Then

‖x‖₂ = √(|χ₀|² + ··· + |χₙ₋₁|²) ≥ √(|χⱼ|²) = |χⱼ| > 0.

• ‖αx‖₂ = |α|‖x‖₂ (‖ · ‖₂ is homogeneous):

‖αx‖₂ = √(|αχ₀|² + ··· + |αχₙ₋₁|²) = √(|α|²|χ₀|² + ··· + |α|²|χₙ₋₁|²) = √(|α|²(|χ₀|² + ··· + |χₙ₋₁|²)) = |α|√(|χ₀|² + ··· + |χₙ₋₁|²) = |α|‖x‖₂.

• ‖x + y‖₂ ≤ ‖x‖₂ + ‖y‖₂ (‖ · ‖₂ obeys the triangle inequality):

‖x + y‖₂² = (x + y)ᴴ(x + y)
         = xᴴx + yᴴx + xᴴy + yᴴy
         ≤ ‖x‖₂² + 2‖x‖₂‖y‖₂ + ‖y‖₂²    (yᴴx + xᴴy = 2Re(xᴴy) ≤ 2|xᴴy|, then Cauchy–Schwarz)
         = (‖x‖₂ + ‖y‖₂)².

Taking the square root of both sides yields the desired result.

QED

2.2 Vector 1-norm

Definition 6. The vector 1-norm ‖ · ‖₁ : Cⁿ → R is defined by

‖x‖₁ = |χ₀| + |χ₁| + ··· + |χₙ₋₁|.

Exercise 7. Show that the vector 1-norm is a norm.

Answer: We show that the three conditions are met: Let x, y ∈ Cⁿ and α ∈ C be arbitrarily chosen. Then

• x ≠ 0 ⇒ ‖x‖₁ > 0 (‖ · ‖₁ is positive definite): Notice that x ≠ 0 means that at least one of its components is nonzero. Let's assume that χⱼ ≠ 0. Then ‖x‖₁ = |χ₀| + ··· + |χₙ₋₁| ≥ |χⱼ| > 0.

• ‖αx‖₁ = |α|‖x‖₁ (‖ · ‖₁ is homogeneous):

‖αx‖₁ = |αχ₀| + ··· + |αχₙ₋₁| = |α||χ₀| + ··· + |α||χₙ₋₁| = |α|(|χ₀| + ··· + |χₙ₋₁|) = |α|‖x‖₁.

• ‖x + y‖₁ ≤ ‖x‖₁ + ‖y‖₁ (‖ · ‖₁ obeys the triangle inequality):

‖x + y‖₁ = |χ₀ + ψ₀| + |χ₁ + ψ₁| + ··· + |χₙ₋₁ + ψₙ₋₁|
        ≤ |χ₀| + |ψ₀| + |χ₁| + |ψ₁| + ··· + |χₙ₋₁| + |ψₙ₋₁|
        = |χ₀| + |χ₁| + ··· + |χₙ₋₁| + |ψ₀| + |ψ₁| + ··· + |ψₙ₋₁|
        = ‖x‖₁ + ‖y‖₁.

End of Answer

The vector 1-norm is sometimes referred to as the "taxi-cab norm": it is the distance a taxi travels along the streets of a city that has square blocks.

2.3 Vector ∞-norm (infinity norm)

Definition 8. The vector ∞-norm ‖ · ‖∞ : Cⁿ → R is defined by ‖x‖∞ = maxᵢ |χᵢ|.

Exercise 9. Show that the vector ∞-norm is a norm.

Answer: We show that the three conditions are met: Let x, y ∈ Cⁿ and α ∈ C be arbitrarily chosen. Then

• x ≠ 0 ⇒ ‖x‖∞ > 0 (‖ · ‖∞ is positive definite): Notice that x ≠ 0 means that at least one of its components is nonzero. Let's assume that χⱼ ≠ 0. Then ‖x‖∞ = maxᵢ |χᵢ| ≥ |χⱼ| > 0.

• ‖αx‖∞ = |α|‖x‖∞ (‖ · ‖∞ is homogeneous):

‖αx‖∞ = maxᵢ |αχᵢ| = maxᵢ |α||χᵢ| = |α| maxᵢ |χᵢ| = |α|‖x‖∞.

• ‖x + y‖∞ ≤ ‖x‖∞ + ‖y‖∞ (‖ · ‖∞ obeys the triangle inequality):

‖x + y‖∞ = maxᵢ |χᵢ + ψᵢ|
        ≤ maxᵢ (|χᵢ| + |ψᵢ|)
        ≤ maxᵢ (|χᵢ| + maxⱼ |ψⱼ|)
        = maxᵢ |χᵢ| + maxⱼ |ψⱼ| = ‖x‖∞ + ‖y‖∞.

End of Answer

2.4 Vector p-norm

Definition 10. The vector p-norm ‖ · ‖ₚ : Cⁿ → R is defined by

‖x‖ₚ = (|χ₀|ᵖ + |χ₁|ᵖ + ··· + |χₙ₋₁|ᵖ)^{1/p}.

Proving that the p-norm is a norm is a little tricky and not particularly relevant to this course. To prove the triangle inequality requires the following classical result:

Theorem 11 (Hölder inequality). Let x, y ∈ Cⁿ and let 1/p + 1/q = 1 with 1 ≤ p, q ≤ ∞. Then |xᴴy| ≤ ‖x‖ₚ‖y‖_q.

Clearly, the 1-norm and 2-norm are special cases of the p-norm. Also, ‖x‖∞ = lim_{p→∞} ‖x‖ₚ.
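The limit ‖x‖∞ = lim_{p→∞} ‖x‖ₚ can be observed numerically. A small sketch (the helper name is ours; note that computing |χᵢ|ᵖ directly overflows for large p, so robust implementations factor out ‖x‖∞ first):

```python
def p_norm(x, p):
    # ||x||_p = (|chi_0|^p + ... + |chi_{n-1}|^p)^(1/p)
    return sum(abs(chi) ** p for chi in x) ** (1.0 / p)

x = [1.0, -3.0, 2.0]
print(max(abs(chi) for chi in x))   # ||x||_inf = 3.0
for p in (1, 2, 10, 100):
    print(p, p_norm(x, p))          # decreases toward 3.0 as p grows
```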

3 Matrix Norms

It is not hard to see that vector norms are all measures of how "big" vectors are. Similarly, we want measures of how "big" matrices are. We will start with one that is somewhat artificial and then move on to the important class of induced matrix norms.

3.1 Frobenius norm

Definition 12. The Frobenius norm ‖ · ‖_F : Cᵐˣⁿ → R is defined by

‖A‖_F = √( Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} |αᵢ,ⱼ|² ).

Notice that one can think of the Frobenius norm as taking the columns of the matrix, stacking them on top of each other to create a vector of size m · n, and then taking the vector 2-norm of the result.

Exercise 13. Show that the Frobenius norm is a norm.

Answer: The key is to realize that if A = (a₀ a₁ ··· aₙ₋₁) then

‖A‖_F = √( Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} |αᵢ,ⱼ|² ) = √( Σ_{j=0}^{n−1} Σ_{i=0}^{m−1} |αᵢ,ⱼ|² ) = √( Σ_{j=0}^{n−1} ‖aⱼ‖₂² ).

In other words, it equals the vector 2-norm of the vector that is created by stacking the columns of A on top of each other. The fact that the Frobenius norm is a norm then comes from realizing this connection and exploiting it. Alternatively, just grind through the three conditions!

End of Answer

Similarly, other matrix norms can be created from vector norms by viewing the matrix as a vector. It turns out that, other than the Frobenius norm, these are not particularly interesting in practice.
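The column-stacking interpretation can be verified numerically; a small sketch (helper names are ours, matrices stored as lists of rows):

```python
import math

def frobenius(A):
    # sum of |alpha_ij|^2 over all entries, then square root
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def two_norm(x):
    return math.sqrt(sum(abs(chi) ** 2 for chi in x))

A = [[1, 2], [3, 4]]
# stack the columns of A on top of each other
stacked = [A[i][j] for j in range(2) for i in range(2)]  # [1, 3, 2, 4]
print(frobenius(A), two_norm(stacked))  # both equal sqrt(30)
```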

3.2 Induced matrix norms

Definition 14. Let ‖ · ‖_µ : Cᵐ → R and ‖ · ‖_ν : Cⁿ → R be vector norms. Define ‖ · ‖_{µ,ν} : Cᵐˣⁿ → R by

‖A‖_{µ,ν} = sup_{x ∈ Cⁿ, x ≠ 0} ‖Ax‖_µ / ‖x‖_ν.

Let us start by interpreting this. How "big" A is, as measured by ‖A‖_{µ,ν}, is defined as the most that A magnifies the length of nonzero vectors, where the length of the vector x is measured with norm ‖ · ‖_ν and the length of the transformed vector Ax is measured with norm ‖ · ‖_µ. Two comments are in order. First,

sup_{x ≠ 0} ‖Ax‖_µ / ‖x‖_ν = sup_{‖x‖_ν = 1} ‖Ax‖_µ.

This follows immediately from the following sequence of equivalences:

sup_{x ≠ 0} ‖Ax‖_µ / ‖x‖_ν = sup_{x ≠ 0} ‖ Ax/‖x‖_ν ‖_µ = sup_{x ≠ 0} ‖ A(x/‖x‖_ν) ‖_µ = sup_{y = x/‖x‖_ν, x ≠ 0} ‖Ay‖_µ = sup_{‖y‖_ν = 1} ‖Ay‖_µ = sup_{‖x‖_ν = 1} ‖Ax‖_µ.

Also, the "sup" (which stands for supremum) is used because we cannot yet claim that there is a vector x with ‖x‖_ν = 1 for which ‖A‖_{µ,ν} = ‖Ax‖_µ. The fact is that there is always such a vector x. The proof depends on a result from real analysis (sometimes called "advanced calculus") that states that sup_{x∈S} f(x) is attained for some vector x ∈ S as long as f is continuous and S is a compact set. Since real analysis is not a prerequisite for this course, the reader may have to take this on faith! We conclude that the following two definitions are equivalent to the one we already gave:

Definition 15. Let ‖ · ‖_µ : Cᵐ → R and ‖ · ‖_ν : Cⁿ → R be vector norms. Define ‖ · ‖_{µ,ν} : Cᵐˣⁿ → R by

‖A‖_{µ,ν} = max_{x ∈ Cⁿ, x ≠ 0} ‖Ax‖_µ / ‖x‖_ν

and

Definition 16. Let ‖ · ‖_µ : Cᵐ → R and ‖ · ‖_ν : Cⁿ → R be vector norms. Define ‖ · ‖_{µ,ν} : Cᵐˣⁿ → R by

‖A‖_{µ,ν} = max_{‖x‖_ν = 1} ‖Ax‖_µ.
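Definition 16 suggests a crude way to approximate an induced norm: evaluate ‖Ax‖_µ over many vectors with ‖x‖_ν = 1 and keep the largest value. Sampling only yields a lower bound on the true maximum. A sketch for real matrices with µ = ν = the 2-norm (all names are ours):

```python
import math
import random

def two_norm(x):
    # real vectors only, for simplicity
    return math.sqrt(sum(chi * chi for chi in x))

def matvec(A, x):
    return [sum(a * chi for a, chi in zip(row, x)) for row in A]

def estimate_induced_2norm(A, samples=20000):
    n = len(A[0])
    random.seed(1)
    best = 0.0
    for _ in range(samples):
        x = [random.gauss(0, 1) for _ in range(n)]
        nx = two_norm(x)
        x = [chi / nx for chi in x]      # normalize so ||x||_2 = 1
        best = max(best, two_norm(matvec(A, x)))
    return best

A = [[2.0, 0.0], [0.0, 1.0]]
print(estimate_induced_2norm(A))  # approaches the exact value 2.0
```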

Theorem 17. ‖ · ‖_{µ,ν} : Cᵐˣⁿ → R is a norm.

Proof: To prove this, we merely check that the three conditions are met: Let A, B ∈ Cᵐˣⁿ and α ∈ C be arbitrarily chosen. Then

• A ≠ 0 ⇒ ‖A‖_{µ,ν} > 0 (‖ · ‖_{µ,ν} is positive definite): Notice that A ≠ 0 means that at least one of its columns is not a zero vector (since at least one of its elements is nonzero). Let us assume it is the jth column, aⱼ, that is nonzero, and let eⱼ denote the jth standard basis vector. Then

‖A‖_{µ,ν} = max_{x ≠ 0} ‖Ax‖_µ / ‖x‖_ν ≥ ‖Aeⱼ‖_µ / ‖eⱼ‖_ν = ‖aⱼ‖_µ / ‖eⱼ‖_ν > 0.

• ‖αA‖_{µ,ν} = |α|‖A‖_{µ,ν} (‖ · ‖_{µ,ν} is homogeneous):

‖αA‖_{µ,ν} = max_{x ≠ 0} ‖αAx‖_µ / ‖x‖_ν = max_{x ≠ 0} |α| ‖Ax‖_µ / ‖x‖_ν = |α| max_{x ≠ 0} ‖Ax‖_µ / ‖x‖_ν = |α|‖A‖_{µ,ν}.

• ‖A + B‖_{µ,ν} ≤ ‖A‖_{µ,ν} + ‖B‖_{µ,ν} (‖ · ‖_{µ,ν} obeys the triangle inequality):

‖A + B‖_{µ,ν} = max_{x ≠ 0} ‖(A + B)x‖_µ / ‖x‖_ν = max_{x ≠ 0} ‖Ax + Bx‖_µ / ‖x‖_ν ≤ max_{x ≠ 0} (‖Ax‖_µ + ‖Bx‖_µ) / ‖x‖_ν
  ≤ max_{x ≠ 0} ( ‖Ax‖_µ/‖x‖_ν + ‖Bx‖_µ/‖x‖_ν ) ≤ max_{x ≠ 0} ‖Ax‖_µ/‖x‖_ν + max_{x ≠ 0} ‖Bx‖_µ/‖x‖_ν = ‖A‖_{µ,ν} + ‖B‖_{µ,ν}.

QED

3.3 Special cases used in practice

The most important case of ‖ · ‖_{µ,ν} : Cᵐˣⁿ → R uses the same norm for ‖ · ‖_µ and ‖ · ‖_ν (except that m may not equal n).

Definition 18. Define ‖ · ‖ₚ : Cᵐˣⁿ → R by

‖A‖ₚ = max_{x ≠ 0} ‖Ax‖ₚ / ‖x‖ₚ = max_{‖x‖ₚ = 1} ‖Ax‖ₚ.

Theorem 19. Let A ∈ Cᵐˣⁿ and partition A = (a₀ a₁ ··· aₙ₋₁). Then

‖A‖₁ = max_{0 ≤ j < n} ‖aⱼ‖₁.

Proof: Let ¯be chosen so that max0≤j

     χ1  max kAxk1 = max a0 a1 ··· an−1  .  kxk1=1 kxk1=1  .   .  χ n−1 1 = max kχ0a0 + χ1a1 + ··· + χn−1an−1k1 kxk1=1

≤ max (kχ0a0k1 + kχ1a1k1 + ··· + kχn−1an−1k1) kxk1=1

= max (|χ0|ka0k1 + |χ1|ka1k1 + ··· + |χn−1|kan−1k1) kxk1=1

≤ max (|χ0|ka¯k1 + |χ1|ka¯k1 + ··· + |χn−1|ka¯k1) kxk1=1

= max (|χ0| + |χ1| + ··· + |χn−1|) ka¯k1 kxk1=1

= ka¯k1.

Also,

ka¯k1 = kAe¯k1 ≤ max kAxk1. kxk1=1

Hence ka¯k1 ≤ max kAxk1 ≤ ka¯k1 kxk1=1 which implies that max kAxk1 = ka¯k1 = max kajk. kxk1=1 0≤j

 T  ba0  aT   b1  Exercise 20. Let A ∈ Cm×n and partition A =  . . Show that  .   .  T bam−1

kAk∞ = max kaik1 = max (|αi,0| + |αi,1| + ··· + |αi,n−1|) 0≤i

7  T  ba0  aT   b1  Answer: Partition A =  . . Then  .   .  T bam−1

 T   T  a a x b0 b0  aT   aT x   b1   b1  kAk∞ = max kAxk∞ = max  .  x = max  .  kxk∞=1 kxk∞=1  .  kxk∞=1  .   .   . 

aT aT x bm−1 ∞ bm−1 ∞ n−1 n−1  T  X X = max max |ai x| = max max | αi,pχp| ≤ max max |αi,pχp| kxk =1 i kxk =1 i kxk =1 i ∞ ∞ p=0 ∞ p=0 n−1 n−1 n−1 X X X = max max (|αi,p||χp|) ≤ max max (|αi,p|(max |χk|)) ≤ max max (|αi,p|kxk∞) kxk =1 i kxk =1 i k kxk =1 i ∞ p=0 ∞ p=0 ∞ p=0 n−1 X = max (|αi,p| = kaik1 i b p=0 so that kAk∞ ≤ maxi kbaik1. We also want to show that kAk∞ ≥ maxi kbaik1. Let k be such that maxi kbaik1 = kbakk1 and pick   ψ0

 ψ1    T y =  .  so that a y = |αk,0| + |αk,1| + ··· + |αk,n−1| = kakk1. (This is a matter of picking ψi so that  .  bk b  .  ψn−1 |ψi| = 1 and ψiαk,i = |αk,i|.) Then

 T   T  a a b0 b0  aT   aT   b1   b1  kAk∞ = max kAxk∞ = max  .  x ≥  .  y kxk1=1 kxk1=1  .   .   .   . 

aT aT bm−1 ∞ bm−1 ∞  T  a y b0  aT y   b1  T T =  .  ≥ |ak y| = ak y = kakk1. = max kaik1  .  b b b i b  . 

aT y bm−1 ∞ End of Answer T T T Notice that in the above exercise bai is really (bai ) since bai is the label for the ith row of matrix A.
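By the same token, Exercise 20 says that ‖A‖∞ is the largest absolute row sum; a sketch (names are ours):

```python
def inf_norm_matrix(A):
    # ||A||_inf = max over rows i of sum_j |alpha_ij|
    return max(sum(abs(a) for a in row) for row in A)

A = [[1, -7], [2, 3]]
print(inf_norm_matrix(A))  # row sums are 8 and 5, so ||A||_inf = 8
```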

3.4 Discussion

While ‖ · ‖₂ is a very important matrix norm, it is in practice difficult to compute. The matrix norms ‖ · ‖_F, ‖ · ‖₁, and ‖ · ‖∞ are more easily computed and hence more practical in many instances.

3.5 Submultiplicative norms

Definition 21. A matrix norm ‖ · ‖_ν : Cᵐˣⁿ → R is said to be submultiplicative (consistent) if it also satisfies

‖AB‖_ν ≤ ‖A‖_ν ‖B‖_ν.

Theorem 22. Let ‖ · ‖_ν : Cⁿ → R be a vector norm and, for any matrix C ∈ Cᵐˣⁿ, define the corresponding induced matrix norm by

‖C‖_ν = max_{x ≠ 0} ‖Cx‖_ν / ‖x‖_ν = max_{‖x‖_ν=1} ‖Cx‖_ν.

Then for any A ∈ Cᵐˣᵏ and B ∈ Cᵏˣⁿ the inequality ‖AB‖_ν ≤ ‖A‖_ν ‖B‖_ν holds. In other words, induced matrix norms are submultiplicative.

To prove the above, it helps to first prove a simpler result:

Lemma 23. Let ‖ · ‖_ν : Cⁿ → R be a vector norm and, for any matrix C ∈ Cᵐˣⁿ, define the induced matrix norm as above. Then for any A ∈ Cᵐˣⁿ and y ∈ Cⁿ the inequality ‖Ay‖_ν ≤ ‖A‖_ν ‖y‖_ν holds.

Proof: If y = 0, the result obviously holds since then ‖Ay‖_ν = 0 and ‖y‖_ν = 0. Let y ≠ 0. Then

‖A‖_ν = max_{x ≠ 0} ‖Ax‖_ν / ‖x‖_ν ≥ ‖Ay‖_ν / ‖y‖_ν.

Rearranging this yields ‖Ay‖_ν ≤ ‖A‖_ν ‖y‖_ν. QED

We can now prove the theorem:

Proof:

‖AB‖_ν = max_{‖x‖_ν=1} ‖ABx‖_ν = max_{‖x‖_ν=1} ‖A(Bx)‖_ν ≤ max_{‖x‖_ν=1} ‖A‖_ν ‖Bx‖_ν ≤ max_{‖x‖_ν=1} ‖A‖_ν ‖B‖_ν ‖x‖_ν = ‖A‖_ν ‖B‖_ν.

QED
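Submultiplicativity can be spot-checked on a small example; here the induced ∞-norm (maximum absolute row sum) is used, and the helper names are ours:

```python
def inf_norm(A):
    # induced infinity-norm: maximum absolute row sum
    return max(sum(abs(a) for a in row) for row in A)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, -1]]
B = [[0, 1], [4, 2]]
AB = matmul(A, B)  # [[8, 5], [-4, 1]]
print(inf_norm(AB), inf_norm(A) * inf_norm(B))  # 13 <= 24
```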

Exercise 24. Show that ‖Ax‖_µ ≤ ‖A‖_{µ,ν} ‖x‖_ν.

Answer: W.l.o.g. let x ≠ 0. Then

‖A‖_{µ,ν} = max_{y ≠ 0} ‖Ay‖_µ / ‖y‖_ν ≥ ‖Ax‖_µ / ‖x‖_ν.

Rearranging this establishes the result. End of Answer

Exercise 25. Show that ‖AB‖_{µ,ν} ≤ ‖A‖_{µ,ν} ‖B‖_ν.

Answer:

‖AB‖_{µ,ν} = max_{‖x‖_ν=1} ‖ABx‖_µ ≤ max_{‖x‖_ν=1} ‖A‖_{µ,ν} ‖Bx‖_ν = ‖A‖_{µ,ν} max_{‖x‖_ν=1} ‖Bx‖_ν = ‖A‖_{µ,ν} ‖B‖_ν.

End of Answer

Exercise 26. Show that the Frobenius norm, k · kF , is submultiplicative.

Answer: Let â₀ᴴ, â₁ᴴ, …, âₘ₋₁ᴴ denote the rows of A and b₀, b₁, …, bₙ₋₁ the columns of B, so that the (i, j) element of AB is âᵢᴴbⱼ. Then

‖AB‖_F² = Σᵢ Σⱼ |âᵢᴴbⱼ|²
  ≤ Σᵢ Σⱼ ‖âᵢ‖₂² ‖bⱼ‖₂²      (Cauchy–Schwarz)
  = ( Σᵢ ‖âᵢ‖₂² ) ( Σⱼ ‖bⱼ‖₂² )
  = ‖A‖_F² ‖B‖_F²,

so that ‖AB‖_F² ≤ ‖A‖_F² ‖B‖_F². Taking the square root of both sides establishes the desired result. End of Answer

4 An Application to Conditioning of Linear Systems

A question we will run into later in the course asks how accurate we can expect the solution of a linear system to be if the right-hand side of the system has error in it.

Formally, this can be stated as follows: We wish to solve Ax = b, where A ∈ Cᵐˣᵐ, but the right-hand side has been perturbed by a small vector so that it becomes b + δb. (Notice how the δ touches the b. This is meant to convey that δb is a symbol that represents a vector rather than the vector b multiplied by a scalar δ.) The question now is how a relative error in b propagates into a potential error in the solution x. This is summarized as follows:

Ax = b                 (exact equation)
A(x + δx) = b + δb     (perturbed equation)

We would like to determine a formula, κ(A, b, δb), that tells us how much a relative error in b is potentially amplified into a relative error in the solution x:

‖δx‖/‖x‖ ≤ κ(A, b, δb) ‖δb‖/‖b‖.

We will assume that A has an inverse. To find an expression for κ(A, b, δb), we notice that

Ax + Aδx = b + δb. Since Ax = b, it follows that Aδx = δb, and from this

δx = A⁻¹δb.

If we now use a vector norm ‖ · ‖ and its induced matrix norm ‖ · ‖, then

‖b‖ = ‖Ax‖ ≤ ‖A‖‖x‖   and   ‖δx‖ = ‖A⁻¹δb‖ ≤ ‖A⁻¹‖‖δb‖.

From this we conclude that

1/‖x‖ ≤ ‖A‖/‖b‖   and   ‖δx‖ ≤ ‖A⁻¹‖‖δb‖,

so that

‖δx‖/‖x‖ ≤ ‖A‖‖A⁻¹‖ ‖δb‖/‖b‖.

Thus, the desired expression κ(A, b, δb) doesn't depend on anything but the matrix A:

‖δx‖/‖x‖ ≤ ‖A‖‖A⁻¹‖ ‖δb‖/‖b‖,   where the factor ‖A‖‖A⁻¹‖ is denoted κ(A).

The quantity κ(A) = ‖A‖‖A⁻¹‖ is called the condition number of matrix A. A question becomes whether this is a pessimistic result or whether there are examples of b and δb for which the relative error in b is amplified by exactly κ(A). The answer is, unfortunately, "yes!", as we will show next. Notice that

• There is an x̂ for which

‖A‖ = max_{‖x‖=1} ‖Ax‖ = ‖Ax̂‖,

namely the x for which the maximum is attained. Pick b = Ax̂.

• There is a δb̂ for which

‖A⁻¹‖ = max_{x ≠ 0} ‖A⁻¹x‖/‖x‖ = ‖A⁻¹δb̂‖/‖δb̂‖,

again, the x for which the maximum is attained.

It is when solving the perturbed system A(x + δx) = b + δb̂ that the maximal magnification by κ(A) is attained.

Exercise 27. Let ‖ · ‖ be a matrix norm induced by a vector norm ‖ · ‖. Show that κ(A) = ‖A‖‖A⁻¹‖ ≥ 1.

Answer: ‖I‖ = ‖AA⁻¹‖ ≤ ‖A‖‖A⁻¹‖. But

‖I‖ = max_{‖x‖=1} ‖Ix‖ = max_{‖x‖=1} ‖x‖ = 1.

Hence 1 ≤ ‖A‖‖A⁻¹‖. End of Answer

This last exercise shows that a relative error in b is, at best (when κ(A) = 1), translated into an equal relative error in the solution.
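For a concrete feel for κ(A), consider a nearly singular 2×2 matrix; this sketch computes κ(A) in the ∞-norm (the matrix and the helper names are ours, chosen purely for illustration):

```python
def inf_norm(A):
    # induced infinity-norm: maximum absolute row sum
    return max(sum(abs(a) for a in row) for row in A)

def inv2(A):
    # explicit inverse of a 2x2 matrix
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1.0, 0.0], [0.0, 0.001]]   # nearly singular -> ill conditioned
kappa = inf_norm(A) * inf_norm(inv2(A))
print(kappa)  # 1.0 * 1000.0 = 1000.0: a relative error in b can be
              # amplified by a factor of up to 1000 in x
```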

5 Equivalence of Norms

Many results we encounter show that the norm of a particular vector or matrix is small. Obviously, it would be unfortunate if a vector or matrix is large in one norm and small in another norm. The following result shows that, modulo a constant, all norms are equivalent. Thus, if the vector is small in one norm, it is small in other norms as well.

Theorem 28. Let ‖ · ‖_µ : Cⁿ → R and ‖ · ‖_ν : Cⁿ → R be vector norms. Then there exist positive constants α_{µ,ν} and β_{µ,ν} such that for all x ∈ Cⁿ

α_{µ,ν} ‖x‖_µ ≤ ‖x‖_ν ≤ β_{µ,ν} ‖x‖_µ.

A similar result holds for matrix norms:

Theorem 29. Let ‖ · ‖_µ : Cᵐˣⁿ → R and ‖ · ‖_ν : Cᵐˣⁿ → R be matrix norms. Then there exist positive constants α_{µ,ν} and β_{µ,ν} such that for all A ∈ Cᵐˣⁿ

α_{µ,ν} ‖A‖_µ ≤ ‖A‖_ν ≤ β_{µ,ν} ‖A‖_µ.
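For the specific vector norms in these notes the equivalence constants are known explicitly; for example, ‖x‖∞ ≤ ‖x‖₂ ≤ √n ‖x‖∞ and ‖x‖₂ ≤ ‖x‖₁ ≤ √n ‖x‖₂. The sketch below spot-checks these standard bounds on random real vectors (a sanity check, not a proof; helper names are ours):

```python
import math
import random

def norms(x):
    n1 = sum(abs(chi) for chi in x)
    n2 = math.sqrt(sum(chi * chi for chi in x))
    ninf = max(abs(chi) for chi in x)
    return n1, n2, ninf

random.seed(2)
n = 6
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    n1, n2, ninf = norms(x)
    assert ninf <= n2 + 1e-12 and n2 <= math.sqrt(n) * ninf + 1e-12
    assert n2 <= n1 + 1e-12 and n1 <= math.sqrt(n) * n2 + 1e-12
print("equivalence bounds held on all samples")
```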
