<<

Appendix A Analysis

This appendix collects useful background from and , such as vector and matrix norms, decompositions, Gershgorin’s disk theorem, and matrix functions. Much more material can be found in various books on the subject including [47, 51, 231, 273, 280, 281, 475].

A.1 Vector and Matrix Norms

We work with real or complex vector spaces X, usually X = Rn or X = Cn.We n usually write the vectors in C in boldface, x, while their entries are denoted xj , j ∈ [n],where[n]:={1,...,n}. The canonical unit vectors in Rn are denoted by e1,...,en. They have entries 1 if j = , (e ) = δ = j ,j 0 otherwise.

Definition A.1. A nonnegative ·: X → [0, ∞) is called a if (a) x =0if and only if x = 0 (definiteness). (b) λx = |λ|x for all scalars λ and all vectors x ∈ X (homogeneity). (c) x + y≤x + y for all vectors x, y ∈ X (). If only (b) and (c) hold, so that x =0does not necessarily imply x = 0,then· is called a . If (a) and (b) hold, but (c) is replaced by the weaker quasitriangle inequality

x + y≤C(x + y) for some constant C ≥ 1,then·is called a . The smallest constant C is called its quasinorm constant.

S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, 515 Applied and Numerical Harmonic Analysis, DOI 10.1007/978-0-8176-4948-7, © Springer Science+Business Media New York 2013 516 A Matrix Analysis

A X endowed with a norm ·is called a normed space. Definition A.2. Let X be a set. A function d : X × X → [0, ∞) is called a if (a) d(x, y)=0if and only if x = y. (b) d(x, y)=d(y,x) for all x, y ∈ X. (c) d(x, z) ≤ d(x, y)+d(y,z) for all x, y, z ∈ X. If only (b) and (c) hold, then d is called a pseudometric. The set X endowed with a metric d is called a metric space. A norm ·on X induces a metric on X by

d(x, y)=x − y.

A seminorm induces a pseudometric in the same way. n n The p-norm (or simply p-norm) on R or C is defined for 1 ≤ p<∞ as n 1/p p xp := |xj | , (A.1) j=1 and for p = ∞ as

x∞ := max |xj |. j∈[n]

For 0

 p ≤ p  p x + y p x p + y p.

 − p Therefore, the p-quasinorm induces a metric via d(x, y)= x y p for 0

B(x,t)=Bd(x,t)={z ∈ X : d(x, z) ≤ t}.

If the metric is induced by a norm ·on a , then we also write

B · (x,t)={z ∈ X : x − z≤t}. (A.2)

If x = 0 and t =1,thenB = B · = B · (0, 1) is called unit . The canonical inner product on Cn is defined by

n n x, y = xj yj , x, y ∈ C . j=1 A.1 Vector and Matrix Norms 517

Rn   n ∈ Rn On ,itisgivenby x, y = j=1 xj yj, x, y .The2-norm is related to the canonical inner product by ' x2 = x, x.

The Cauchy–Schwarz inequality states that

n |x, y| ≤ x2y2 for all x, y ∈ C .

More generally, for p, q ∈ [1, ∞] such that 1/p +1/q =1(with the convention that 1/∞ =0and 1/0=∞), Holder’s¨ inequality states that

n |x, y| ≤ xpyq for all x, y ∈ C .

Given 0

1/p−1/q xp ≤ n xq. (A.3)

We point out the important special cases √ √ x1 ≤ nx2 and x2 ≤ nx∞.

If x has at most s nonzero entries, that is, x0 =card({ : x =0 }) ≤ s, then the 1/p−1/q above inequalities become xp ≤ s xq, and in particular, √ x1 ≤ sx2 ≤ sx∞.

For 0

xq ≤xp, (A.4) and in particular, x∞ ≤x2 ≤x1. Indeed, the bound x∞ ≤xp is obvious, and for p

n n n  q | |q | |q−p| |p ≤ q−p | |p ≤ q−p p  q x q = xj = xj xj x ∞ xj x p x p = x p. j=1 j=1 j=1

Both bounds (A.3)and(A.4) are sharp in general. Indeed, equality holds in (A.3) for vectors with constant absolute entries, while equality holds in (A.4) for scalar multiples of a canonical unit vector. n n Definition A.3. Let ·be a norm on R or C . Its ·∗ is defined by

x∗ := sup |y, x|. y ≤1 518 A Matrix Analysis

In the real case, the dual norm may equivalently be defined via

x∗ =supy, x, y∈Rn, y ≤1 while in the complex case

x∗ =supRe y, x. y∈Cn, y ≤1

The dual of the dual norm ·∗ is the norm ·itself. In particular, we have

x =sup|x, y| =supRe x, y. (A.5) y ∗≤1 y ∗≤1

The dual of ·p is ·q with 1/p +1/q =1. In particular, ·2 is self-dual, i.e.,

x2 =sup|y, x|, (A.6) y 2≤1 while ·∞ is the dual norm of ·1 and vice versa. Given a subspace W of a vector space X,thequotient space X/W consists of the residue classes

[x]:=x + W = {x + w, w ∈ W }, x ∈ X.

The quotient map is the surjective x → [x]=x + W ∈ X/W .The quotient norm on X/W is defined by

[x]X/W := inf{v, v ∈ [x]=x + W }, [x] ∈ X/W.

Let us now turn to norms of matrices (or more generally, linear mappings be- m×n tween normed spaces). The entries of A ∈ C are denoted Aj,k, j ∈ [m],k ∈ [n]. The columns of A are denoted ak,sothatA =[a1|···|an]. The transpose of m×n  n×m  A ∈ C is the matrix A ∈ C with entries (A )k,j = Aj,k. A matrix B ∈ Cn×n is called symmetric if B = B. The adjoint (or Hermitian transpose) m×n ∗ n×m ∗ n of A ∈ C is the matrix A ∈ C with entries (A )k,j = Aj,k.Forx ∈ C and y ∈ Cm,wehaveAx, y = x, A∗y. A matrix B ∈ Cn×n is called self- ∗ n adjoint (or Hermitian) if B = B. The identity matrix on C is denoted Id or Idn. AmatrixU ∈ Cn×n is called unitary if U∗U = UU∗ = Id. A self-adjoint matrix B ∈ Cn×n possesses an eigenvalue decomposition of the form B = U∗DU,where n×n U is a U ∈ C and D =diag[λ1,...,λn] is a diagonal matrix containing the real eigenvalues λ1,...,λn of B. Definition A.4. Let A : X → Y be a linear map between two normed spaces (X, ·) and (Y,||| · ||| ). The norm of A is defined as A.1 Vector and Matrix Norms 519

A := sup ||| Ax||| =sup||| Ax||| . (A.7) x ≤1 x =1

In particular, for a matrix A ∈ Cm×n and 1 ≤ p, q ≤∞, we define the (or ) between p and q as

Ap→q := sup Axq =supAxq. (A.8) x p≤1 x p=1

For A : X → Y and x ∈ X, the definition implies that ||| Ax||| ≤  Ax.Itis an easy consequence of the definition that the norm of the product of two matrices A ∈ Cm×n and B ∈ Cn×k satisfies

ABp→r ≤Aq→rBp→q, 1 ≤ p, q, r ≤∞. (A.9)

We summarize formulas for the matrix norms Ap→q for some special choices of p and q. The lemma below also refers to the singular values of a matrix, which will be covered in the next section. Lemma A.5. Let A ∈ Cm×n. (a) We have ' ∗ A2→2 = λmax(A A)=σmax(A),

∗ ∗ where λmax(A A) is the largest eigenvalue of A A and σmax(A) the largest singular value of A. n×n In particular,if B ∈ C is self-adjoint, then B2→2 =maxj∈[n] |λj (B)|, where the λj (B) denote the eigenvalues of B. (b) For 1 ≤ p ≤∞, we have A1→p =maxk∈[n] akp. In particular,

m A1→1 =max |Aj,k|, (A.10) k∈[n] j=1

A1→2 =maxak2. (A.11) k∈[n]

(c) We have n A∞→∞ =max |Aj,k|. j∈[m] k=1

Proof. (a) Since A∗A ∈ Cn×n is self-adjoint, it can be diagonalized as A∗A = U∗DU with a unitary U and a diagonal D containing the eigenvalues ∗ n λ of A A on the diagonal. For x ∈ C with x2 =1,  2    ∗   ∗    Ax 2 = Ax, Ax = A Ax, x = U DUx, x = DUx, Ux . 520 A Matrix Analysis

 2    ∗   2 Since U is unitary, we have Ux 2 = Ux, Ux = x, U Ux = x 2 =1. Moreover, for an arbitrary vector z ∈ Cn,

n n   | |2 ≤ | |2 ∗  2 Dz, z = λj zj max λj zj = λmax(A A) z 2. j∈[n] j=1 j=1 ' ∗ Combining these facts establishes the inequality A2→2 ≤ λmax(A A). Now choose x to be an eigenvector corresponding to the largest eigenvalue of ∗ ∗ ∗ A A,thatis,A Ax = λmax(A A)x.Then

 2    ∗  ∗   ∗  2 Ax 2= Ax, Ax = A Ax, x =λmax(A A) x, x = λmax(A A) x 2. '   ≥ ∗ This proves the reverse inequality' A 2→2 λmax(A A).Itisshownin ∗ Sect. A.2 that σmax(A)= λmax(A A),see(A.18). If B is self-adjoint, then its singular values satisfy {σj (B),j ∈ [n]} = {|λj (B)|,j ∈ [n]},sothat B2→2 =maxj∈[n] |λj (B)| as claimed. n (b) For x ∈ C with x1 =1, the triangle inequality gives

n n Axp =  xj akp ≤ |xk|akp ≤x1 max akp. (A.12) k∈[n] j=1 k=1

This shows that A1→p ≤ maxk∈[n] akp. For the reverse inequality, we

choose the canonical unit vector x = ek0 with k0 being the index that realizes       the previous maximum. Then Ax p = ak0 p =maxk∈[n] ak p.This establishes the statement (b). n (c) For x ∈ C with x∞ =1,wehave

n n n Ax∞ =max| Aj,kxk|≤max |Aj,k||xk|≤x∞ max |Aj,k|. j∈[m] j∈[m] j∈[m] k=1 k=1 k=1

To see that this inequality is sharp in general, we choose an index j ∈ [m] that realizes the maximum in the previous expression and set xk =sgn(Aj,k)=Aj,k /|Aj,k| if Aj,k =0 and xk =0if Aj,k =0.Thenx∞ =1(unless A = 0,in which case the statement is trivial), and

n n n (Ax)j = Aj,kxk = |Aj,k| =max |Aj,k|. j∈[m] k=1 k=1 k=1

Together with the inequality established above, this shows the claim.  A.1 Vector and Matrix Norms 521

Remark A.6. (a) The general identity

∗ Ap→q = A q→p

where 1/p +1/p =1=1/q +1/q shows that (c) (and more general statements) can be deduced from (b). (b) Computing the operator norms A∞→1, A2→1,andA∞→2 is an NP-hard problem; see [426]. (Although the case A∞→2 = A2→1 is not treated explicitly in [426], it follows from similar considerations as for A∞→1.) Lemma A.7. For A ∈ Cm×n,

A2→2 =sup sup |Ax, y| =sup sup Re Ax, y. (A.13) y 2≤1 x 2≤1 y 2≤1 x 2≤1

If B ∈ Cn×n is self-adjoint, then

B2→2 =sup|Bx, x|. x 2≤1

Proof. The first statement follows from (A.8) with p = q =2and (A.6). For the second statement, let B = U∗DU be the eigenvalue decomposition of B with a unitary matrix U and a diagonal matrix D with the eigenvalues λj of B on the diagonal. Then

sup |Bx, x| =sup|U∗DUx, x| =sup|DUx, Ux| x 2≤1 x 2≤1 x 2≤1 n 2 =sup|Dx, x| =sup| λj |xj | | =max|λj | ≤ ≤ j∈[n] x 2 1 x 2 1 j=1

= B2→2, where the last step used the second part of Lemma A.5(a). For the identity | n | |2| | | sup x 2≤1 j=1 λj xj =maxj∈[n] λj above, we observe on the one hand

n n 2 2 | λj |xj | |≤max |λj | |xj | ≤ max |λj |. j∈[n] j∈[n] j=1 j=1

One the other hand, if x = ej0 is the canonical unit vector corresponding to the | | | n | |2| | | | | index j0 where λj is maximal, we have j=1 λj xj = λj0 =maxj∈[n] λj . This point completes the proof.  Specializing the above identity to the -one matrix B = uu∗ yields, for any vector u ∈ Cn, 522 A Matrix Analysis

∗ ∗ ∗ ∗ uu 2→2 =sup|uu x, x| =sup|u x, u x| x 2≤1 x 2≤1 | |2  2 =sup x, u = u 2, (A.14) x 2≤1 where we also applied (A.6). The next statement is known as Schur test. Lemma A.8. For A ∈ Cm×n, ' A2→2 ≤ A1→1A∞→∞.

In particular, for a self-adjoint matrix B ∈ Cn×n,

B2→2 ≤B1→1.

Proof. The statement follows immediately from the Riesz–Thorin interpolation theorem. For readers not familiar with interpolation theory, we give a more elementary proof. By the Cauchy–Schwarz inequality, the jth entry of Ax satisfies

n  n   n  2 1/2 1/2 |(Ax)j |≤ |xk||Aj,k|≤ |xk| |Aj,k| |Aj,| . k=1 k=1 =1

Squaring and summing this inequality yields

m m  n  n   2 | |2 ≤ | |2| | | | Ax 2 = (Ax)j xk Aj,k Aj, j=1 j=1 k=1 =1 n m n 2 ≤ max |Aj,| |xk| |Aj,k| j∈[n] =1 j=1 k=1 n m n 2 ≤ max |Aj,| max |Aj,k| |xk| j∈[n] k∈[m] =1 j=1 k=1      2 = A ∞→∞ A 1→1 x 2.

This establishes the first claim. If B is self-adjoint, then B1→1 = B∞→∞ by Lemma A.5(b)–(c) or Remark A.6(a). This implies the second claim.  We note that the above inequality may be crude for some important matrices encountered in this book. For a general matrix, however, it cannot be improved.

Lemma A.9. The operator norm ·2→2 of a submatrix is bounded by one of the whole matrix. More precisely, if A ∈ Cm×n has the form A.1 Vector and Matrix Norms 523 # $ A(1) A(2) A = A(3) A(4)

() () for matrices A ,thenA 2→2 ≤A2→2 for =1, 2, 3, 4. In particular, any entry of A satisfies |Aj,k|≤A2→2. Proof. We give the proof for A(1). The other cases are analogous. Let A(1) be of (1) n1 size m1 × n1. Then for x ∈ C ,wehave & & & & & (1) &2 & (1) &2 (1) (1) 2 (1) (1) 2 (3) (1) 2 & A (1)& & x & A x  ≤A x  + A x  = & x & = &A & . 2 2 2 A(3) 0 2 2 x(1) The set T of vectors ∈ Cn with x(1) ≤ 1 is contained in the set 1 0 2 n (1) T := {x ∈ C , x2 ≤ 1}. Therefore, the supremum over x ∈ T1 above is  2  2  bounded by supx∈T Ax 2 = A 2→2. This concludes the proof.

Remark A.10. The same result and proof hold for the operator norms ·p→q with 1 ≤ p, q ≤∞. Gershgorin’s disk theorem stated next provides information about the locations of the eigenvalues of a . Theorem A.11. Let λ be an eigenvalue of a square matrix A ∈ Cn×n. There exists an index j ∈ [n] such that |λ − Aj,j |≤ |Aj,|. ∈[n]\{j}

n Proof. Let u ∈ C \{0} be an eigenvector associated with λ,andletj ∈ [n] | | | |   such that uj is maximal, i.e., uj = u ∞.Then ∈[n] Aj,u = λuj ,anda − rearrangement gives ∈[n]\{j} Aj,u = λuj Aj,j uj. The triangle inequality yields |λ − Aj,j ||uj |≤ |Aj,||u|≤u∞ |Aj,| = |uj| |Aj,|. ∈[n]\{j} ∈[n]\{j} ∈[n]\{j}

Dividing by |uj| > 0 yields the desired statement.  More information on Gershgorin’s theorem and its variations can be found, for instance, in the monograph [496]. The of a square matrix B ∈ Cn×n is the sum of its diagonal elements, i.e.,

n tr (B)= Bjj. j=1 524 A Matrix Analysis

The trace is cyclic, i.e., tr (AB)=tr(BA) for all A ∈ Cm×n and B ∈ Cn×m.It induces an inner product defined on Cm×n by

∗ A, BF := tr (AB ). (A.15)

The Frobenius norm of a matrix A ∈ Cm×n is then

⎛ ⎞1/2 ' ' ∗ ∗ ⎝ 2⎠ AF := tr (AA )= tr (A A)= |Aj,k| . (A.16) j∈[m],k∈[n]

After identifying matrices on Cm×n with vectors in Cmn, the Frobenius norm can mn be interpreted as an 2-norm on C . The operator norm on 2 is bounded by the Frobenius norm, i.e.,

A2→2 ≤AF . (A.17)

Indeed, for x ∈ Cn, the Cauchy–Schwarz inequality yields

m  n  m  n  n   2 2 ≤ | |2 | |2  2  2 Ax 2 = Aj,kxj xj Aj, = A F x 2. j=1 k=1 j=1 k=1 =1

Next we state a bound for the operator norm of the inverse of a square matrix. Lemma A.12. Suppose that B ∈ Cn×n satisfies

B − Id2→2 ≤ η

−1 −1 for some η ∈ [0, 1).ThenB is invertible and B 2→2 ≤ (1 − η) .

− ∞ k Proof. We first note that, with H = Id B, the Neumann k=0 H  k ≤ k converges. Indeed, by the triangle inequality and the fact that H 2→2 H 2→2 derived from (A.9), we have ∞ ∞ ∞ k k k 1  H  → ≤ H → ≤ η = . 2 2 2 2 1 − η k=0 k=0 k=0

Now we observe that ∞ ∞ ∞ (Id − H) Hk = Hk − Hk = Id k=0 k=0 k=1 A.2 The Singular Value Decomposition 525

∞ k − by convergence of the Neumann series, and similarly k=0 H (Id H)=Id. Therefore, the matrix Id − H = B is invertible and ∞ B−1 =(Id − H)−1 = Hk. k=0 This establishes the claim. 

A.2 The Singular Value Decomposition

While the concept of eigenvalues and eigenvectors applies only to square matrices, every (possibly rectangular) matrix possesses a singular value decomposition. Proposition A.13. For A ∈ Cm×n, there exist unitary matrices U ∈ Cm×m, n×n V ∈ C , and uniquely defined nonnegative numbers σ1 ≥ σ2 ≥ ··· ≥ σmin{m,n} ≥ 0, called singular values of A, such that

∗ m×n A = UΣV , Σ =diag[σ1,...,σmin{m,n}] ∈ R .

Remark A.14. Writing U =[u1|···|um] and V =[v1|···|vn], the vectors u are called left singular vectors, while the v are called right singular vectors. n Proof. Let v1 ∈ C beavectorwithv12 =1that realizes the maximum in the definition (A.8) of the operator norm A2→2,andset

σ1 = Av12 = Av12.

n−1 n By compactness of the sphere S = {x ∈ C , x2 =1}, such a vector v1 always exists. If σ1 =0,thenA = 0,andwesetσ =0for all ∈ [min{m, n}], and U, V are arbitrary unitary matrices. Therefore, we assume σ1 > 0 and set

−1 u1 = σ1 Av1.

We can extend u1, v1 to orthonormal bases in order to form unitary matrices | ˜ | ˜ ˜ ∗ ˜ ∗ U1 =[u1 U1], V1 =[v1 V1].SinceU1Av1 = σ1U1u1 = 0, the matrix ∗ A1 = U1AV 1 takes the form # $ σ b∗ A = 1 , 1 0B

∗ ∗ ˜ ˜ ∗ ˜ ∈ C(m−1)×(n−1) where b = u1AV1 and B = U1AV1 . It follows from % & & & & & & & 2  2 & 2 2 & σ1 & & σ1 + b 2 & 2 2 A12→2 σ + b ≥ &A1 & = & & ≥ σ + b 1 2 b Bb 1 2 2 2 526 A Matrix Analysis '   ≥ 2  2   that A1 2→2 σ1 + b 2. But since U1, V1 are unitary, we have A1 2→2 = A2→2 = σ1, and therefore, b = 0. In conclusion, ∗ σ 0 A = U AV = 1 . 1 1 1 0B

With the same arguments, we can further decompose B ∈ C(m−1)×(n−1), and by induction we arrive at the stated singular value decomposition.  The previous proposition reveals that the largest and smallest singular values satisfy

σmax(A)=σ1(A)=A2→2 =maxAx2 , x 2=1

σmin(A)=σmin{m,n}(A)= min Ax2. x 2=1

If A has rank r, then its r largest singular values σ1 ≥ ··· ≥ σr are positive, while σr+1 = σr+2 = ··· =0. Sometimes it is more convenient to work with the reduced singular value decomposition. For A of rank r with (full) singular ∗ r×r value decomposition A = UΣV , we consider Σ˜ =diag[σ1,...,σr] ∈ R m×r n×r and the submatrices U˜ =[u1|···|ur] ∈ C , V˜ =[v1|···|vr] ∈ C of U =[u1|···|um], V =[v1|···|vn].Wehave

r ˜ ˜ ˜ ∗ ∗ A = UΣV = σj uj vj . j=1

Given A ∈ Cm×n with reduced singular value decomposition A = U˜ Σ˜ V˜ ∗,we observe that

2 A∗A = V˜ Σ˜ U˜ ∗U˜ Σ˜ V˜ ∗ = V˜ Σ˜ V˜ ∗,

2 AA∗ = U˜ Σ˜ V˜ ∗V˜ Σ˜ U˜ ∗ = U˜ Σ˜ U˜ ∗.

Thus, we obtain the (reduced) eigenvalue decompositions of A∗A and AA∗.In particular, the singular values σj = σj (A) satisfy % % ∗ ∗ σj (A)= λj (A A)= λj (AA ),j∈ [min{m, n}], (A.18)

∗ ∗ ∗ where λ1(A A) ≥ λ2(A A) ≥ ··· are the eigenvalues of A A arranged in nonincreasing order. Moreover, the left and right singular vectors listed in U, V can be obtained from the eigenvalue decomposition of the positive semidefinite matrices A∗A and AA∗. (One can also prove the existence of the singular value decomposition via the eigenvalue decompositions of A∗A and AA∗.) For the purpose of this book, the following observation is very useful. A.2 The Singular Value Decomposition 527

Proposition A.15. Let A ∈ Cm×n, m ≥ n.If

∗ A A − Id2→2 ≤ δ (A.19) for some δ ∈ [0, 1], then the largest and smallest singular values of A satisfy √ √ σmax(A) ≤ 1+δ, σmin(A) ≥ 1 − δ. (A.20)

Conversely, if both inequalities in (A.20) hold, then (A.19) follows. ∗ ∗ Proof. By (A.18) the eigenvalues λj (A A) of A A are the squared singular values 2 ∈ ∗ − 2 − σj (A) of A, j [n]. Thus, the eigenvalues of A A Id are given by σj (A) 1, j ∈ [n], and Lemma A.5(a) yields

{ 2 − − 2 }  ∗ −  max σmax(A) 1, 1 σmin(A) = A A Id 2→2.

This establishes the claim.  The largest and smallest singular values are 1-Lipschitz functions with respect to the operator norm and the Frobenius norm.

Proposition A.16. The smallest and largest singular values σmin and σmax satisfy, for all matrices A, B of equal ,

|σmax(A) − σmax(B)|≤A − B2→2 ≤A − BF , (A.21)

|σmin(A) − σmin(B)|≤A − B2→2 ≤A − BF . (A.22)

Proof. By the identification of the largest singular value with the operator norm, we have

|σmax(A) − σmax(B)| = |A2→2 −B2→2|≤A − B2→2.

The inequality for the smallest singular is deduced as follows:

σmin(A) = inf Ax2 ≤ inf (Bx2 + (A − B)x2) x 2=1 x 2=1

≤ inf (Bx2 + A − B2→2)=σmin(B)+A − B2→2. x 2=1

Therefore, we have σmin(A) − σmin(B) ≤A − B2→2,and(A.22) follows by symmetry. The estimates by the Frobenius norm in (A.21)and(A.22) follow from the domination (A.17) of the operator norm by the Frobenius norm.  m×n The singular values σ1(A) ≥··· ≥σmin{m,n}(A) ≥ 0 of a matrix A ∈ C obey the useful variational characterization 528 A Matrix Analysis

σk(A)= max min Ax2. M⊂Cn x∈M dim M=k x 2=1

This follows from a characterization of the eigenvalues λ1(A) ≥ ··· ≥ λn(A) of a self-adjoint matrix A ∈ Cn×n oftenreferredtoasCourant–Fischer minimax theorem or simply minimax principle, namely

λk(A)= max min Ax, x. (A.23) M⊂Cn x∈M dim M=k x 2=1

With (u1,...,un) denoting an orthonormal of eigenvectors for the eigen- values λ1(A) ≥ ··· ≥ λn(A), the fact that λk(A) is not larger than the M right-hand side of (A.23) follows by taking = span(u1,...,uk), so that, if k ∈M x = j=1 cj uj has unit norm, then

k k   2 ≥ 2  2 Ax, x = λj (A)cj λk(A) cj = λk(A) x 2 = λk(A). j=1 j=1

For the fact that λk(A) is not smaller than the right-hand side of (A.23), given a M Cn k-dimensional subspace of , we choose a unit-normed vector ∈M∩ k x span(uk,...,un), so that, with x = j=1 cj uj ,

n n   2 ≤ 2  2 Ax, x = λj (A)cj λk(A) cj = λk(A) x 2 = λk(A). j=k j=k

The characterization (A.23) generalizes to Wielandt’s minimax principle for sums of eigenvalues. The latter states that, for any 1 ≤ i1 < ···

k k   λij (A)= max n min Axj , xj . M1⊂···⊂Mk ⊂C (x1,...,xk) orthonormal j=1 dim Mj =ij xj ∈Mj j=1

We refer to [47] for a proof. Next we state Lidskii’s inequality (which reduces to the so-called Weyl’s inequality in the case k =1). It can be proved using Wielandt’s minimax principle, but we prefer to give a direct proof below. Proposition A.17. Let A, B ∈ Cn×n be two self-adjoint matrices, and let (λj (A))j∈[n], (λj (B))j∈[n], (λj (A + B))j∈[n] denote the eigenvalues of A, B, and A + B arranged in nonincreasing order. For any 1 ≤ i1 < ···

k k k ≤ λij (A + B) λij (A)+ λi(B). (A.24) j=1 j=1 i=1 A.2 The Singular Value Decomposition 529

Proof. Since the prospective inequality (A.24) is invariant under the change B ↔ B − c Id, we may and do assume that λk+1(B)=0. Then, from the unitarily diagonalized form of B, namely, ! " ∗ B = U diag λ1(B),...,λk(B),λk+1(B),...,λn(B) U , we define the positive semidefinite matrix B+ ∈ Cn×n as ! " + ∗ B := U diag λ1(B),...,λk(B), 0,...,0 U .

We notice that A + B  A + B+ and A  A + B+. Hence, the minimax characterization (A.23) implies that, for all i ∈ [n],

+ + λi(A + B) ≤ λi(A + B ) and λi(A) ≤ λi(A + B ).

It follows that

k   k   − ≤ + − λij (A + B) λij (A) λij (A + B ) λij (A) j=1 j=1 n  k + + + ≤ λi(A + B ) − λi(A)) = tr (A + B ) − tr (A)=tr(B )= λi(B). i=1 i=1

This is the desired inequality.  As a consequence of Proposition A.17, we establish the following lemma. Lemma A.18. If the matrices X ∈ Cm×n and Y ∈ Cm×n have the singular values σ1(X) ≥ ··· ≥ σ(X) ≥ 0 and σ1(Y) ≥ ··· ≥ σ(Y) ≥ 0,where := min{m, n}, then, for any k ∈ [],

k k |σj (X) − σj (Y)|≤ σj (X − Y). j=1 j=1

Proof. The self-adjoint dilations S(X),S(Y) ∈ C(m+n)×(m+n) defined by # $ # $ 0X 0Y S(X)= and S(Y)= X∗ 0 Y∗ 0 have eigenvalues

σ1(X) ≥···≥σ(X) ≥ 0=···=0≥−σ(X) ≥···≥−σ1(X),

σ1(Y) ≥···≥σ(Y) ≥ 0=···=0≥−σ(Y) ≥···≥−σ1(Y). 530 A Matrix Analysis

Therefore, given k ∈ [], there exists a Ik of [m + n] with size k such that

k   |σj (X) − σj (Y)| = λj (S(X)) − λj (S(Y)) . j=1 j∈Ik

Using (A.24) with A = S(Y), B = S(X − Y),andA + B = S(X) yields

k k k |σj (X) − σj (Y)|≤ λj (S(X − Y)) = σj (X − Y). j=1 j=1 j=1

The proof is complete.  Lemma A.18 implies in particular the triangle inequality

m×n σj (A + B) ≤ σj (A)+ σj (B), A, B ∈ C , j=1 j=1 j=1 where =min{m, n}. Moreover, it is easy to verify that σ (A)=0if and j=1 j | | only if A = 0 and that j=1 σj (λA)= λ j=1 σj (A). These three properties show that the expression

min{m,n} m×n A∗ := σj (A), A ∈ C , (A.25) j=1 defines a norm on Cm×n, called the nuclear norm.Itisalsoreferredtoasthe Schatten 1-norm, in view of the fact that, for all 1 ≤ p ≤∞, the expression # $ min{m,n} 1/p   p ∈ Cm×n A Sp := σj (A) , A , j=1 defines a norm on Cm×n, called the Schatten p-norm. We note that it reduces to the Frobenius norm for p =2and to the operator norm for p = ∞. Next we introduce the Moore–Penrose pseudo-inverse, which generalizes the usual inverse of a square matrix, but exists for any (possibly rectangular) matrix. Definition A.19. Let A ∈ Cm×n of rank r with reduced singular value decompo- sition

r ˜ ˜ ˜ ∗ ∗ A = UΣV = σj (A)uj vj . j=1 A.3 Least Squares Problems 531

Then its Moore–Penrose pseudo-inverse A† ∈ Cn×m is defined as

r − † ˜ ˜ 1 ˜ ∗ −1 ∗ A = VΣ U = σj (A)vj uj . j=1

Note that the singular values satisfy σj (A) > 0 for all j ∈ [r], r =rank(A),so that A† is well defined. If A is an invertible square matrix, then one easily verifies that A† = A−1. It follows immediately from the definition that A† has the same rank r as A and that

†  † −1 σmax(A )= A 2→2 = σr (A).

In particular, if A has full rank, then

 † −1 A 2→2 = σmin(A). (A.26)

If A∗A ∈ Cn×n is invertible (implying m ≥ n), then

A† =(A∗A)−1A∗. (A.27)

Indeed,

2 −2 −1 (A∗A)−1A∗ =(V˜ Σ˜ V˜ ∗)−1V˜ Σ˜ U˜ ∗ = V˜ Σ˜ V˜ ∗V˜ Σ˜ U˜ ∗ = V˜ Σ˜ U˜ ∗ = A†.

Similarly, if AA∗ ∈ Cm×m is invertible (implying n ≥ m), then

A† = A∗(AA∗)−1. (A.28)

The Moore–Penrose pseudo-inverse is closely connected to least squares problems considered next.

A.3 Least Squares Problems

Let us first connect least squares problems with the Moore–Penrose pseudo-inverse introduced above. Proposition A.20. Let A ∈ Cm×n and y ∈ Cm. Define M⊂Cn to be the set of minimizers of x →Ax − y2. The optimization problem

minimize x2 (A.29) x∈M has the unique solution x = A†y. 532 A Matrix Analysis

Proof. The (full) singular value decomposition of A can be written A = UΣV∗ with # $ Σ0˜ Σ = ∈ Rm×n, 00 where Σ˜ ∈ Rr×r, r =rank(A), is the diagonal matrix containing the nonzero singular values σ1(A),...,σr(A) on its diagonal. We introduce the vectors z1 ∗ r z = = V x, z1 ∈ C , z2 b1 ∗ r b = = U y, b1 ∈ C . b2

Since the 2-norm is invariant under orthogonal transformations, we have & & & ˜ & ∗ ∗ & Σz1 − b1 & Ax − y2 = U (Ax − y)2 = ΣV x − b2 = & & . −b 2 2

−1 This 2-norm is minimized for z1 = Σ˜ b1 and arbitrary z2. Fixing z1, we notice  2  ∗ 2  2  2  2 that x 2 = V x 2 = z 2 = z1 2 + z2 2 is minimized for z2 = 0. Altogether, the minimizer x of (A.29)isgivenby 3 4 −1 z1 Σ˜ 0 ∗ † x = V = V U y = A y 0 00 by definition of the Moore–Penrose pseudo-inverse.  Let us highlight two special cases. Corollary A.21. Let A ∈ Cm×n, m ≥ n, be of full rank n, and let y ∈ Cm.Then the least squares problem

minimize Ax − y2 (A.30) x∈Cn has the unique solution x = A†y.  This follows from Proposition A.20 because a minimizer x of Ax − y2 is unique: Indeed, Ax is the orthogonal projection of y onto the range of A,and this completely determines x because A has full rank. Notice then that AA† is the orthogonal projection onto the range of A.SinceA is assumed to have full rank and m ≥ n, the matrix A∗A is invertible, so that (A.27) yields A† =(A∗A)−1A∗. Therefore, x = A†y is equivalent to the equation

A∗Ax = A∗y. (A.31) A.3 Least Squares Problems 533

Corollary A.22. Let A ∈ Cm×n, n ≥ m, be of full rank m, and let y ∈ Cm.Then the least squares problem

minimize x2 subject to Ax = y (A.32) x∈Cn has the unique solution x = A†y. Since A is of full rank m ≤ n,(A.28) yields A† = A∗(AA∗)−1. Therefore, the minimizer x of (A.32) satisfies the normal equation of the second kind

x = A∗b, where AA∗b = y. (A.33)

We can also treat the weighted 2-minimization problem n 1/2 2 minimize z2,w = |zj | wj subject to Az = y, (A.34) z∈Cn j=1 where w is a sequence of weights wj > 0. Introducing the diagonal matrix n×n 1/2 Dw =diag[w1,...,wn] ∈ R and making the substitution x = Dw z,the minimizer z of (A.34) is deduced from the minimizer x of

  −1/2 minimize x 2 subject to ADw x = y x∈Cn via

 −1/2  −1/2 −1/2 † z = Dw x = Dw (ADw ) y. (A.35)

In particular, if n ≥ m and A has full rank, then

 −1 ∗ −1 ∗ −1 z = Dw A (ADw A ) y. (A.36)

Instead of (A.35), the following characterization in terms of the inner product   n x, x w = j=1 xj x j wj can be useful. Proposition A.23. A vector z ∈ Cn is a minimizer of (A.34) if and only if Az = y and

 Re z , vw =0 for all v ∈ ker A. (A.37)

Proof. Given z with Az = y, a vector z ∈ Cn is feasible for (A.34) if and only if it can be written as z = x + tv with t ∈ R and v ∈ ker A. Observe that

 2  2 2 2   x + tv 2,w = x 2,w + t v 2,w +2t Re x, v w. (A.38) 534 A Matrix Analysis

Therefore, if Re x, vw =0,thent =0is the minimizer of t →x + tv2,w, and in turn z is a minimizer of (A.34). Conversely, if z is a minimizer of (A.34), then t =0is a minimizer of t →x + tv2,w for all v ∈ ker A.However,if Re x, vw =0 ,then(A.38) with a nonzero t sufficiently small and of opposite to Re x, vw reads x + tv2,w < x2, which is a contradiction.  Although (A.31)and(A.33) suggest to solve the normal equations via a method such as Gauss elimination in order to obtain the solution of least squares problems, the use of specialized methods presents some numerical advantages. An overview of various approaches can be found in [51]. We briefly mention the prominent method of solving least squares problems via the QR decomposition. For any matrix A ∈ Cm×n with m ≥ n, there exists a unitary matrix Q ∈ Cm×m and an upper R ∈ Cn×n with nonnegative diagonal entries such that R A = Q . 0

We refer to [51] for the existence (see [51, Theorem 1.3.1]) and for methods to compute this QR decomposition. Now consider the least squares problem (A.30). Since Q is unitary, we have & & & & ∗ ∗ & R ∗ & Ax − y2 = Q Ax − Q y2 = & x − Q y& . 0 2 b1 ∗ n Partitioning b = = Q y with b1 ∈ C , this is minimized by solving b2 the triangular system Rx = b1 with a simple backward elimination. (If R has some zeros on the diagonal, a slight variation of this procedure can be applied via partitioning of R.) The orthogonal matching pursuit algorithm involves successive optimization problems of the type (A.30) where a new column is added to A at each step. In such a situation, it is beneficial to work with the QR decomposition of A,asitis numerically cheap to update the QR decomposition when a new column is added; see [51, Sect. 3.2.4] for details. The least squares problem (A.32) may also be solved using the QR decomposi- tion of A∗;see[51, Theorem 1.3.3] for details. If fast matrix–vector multiplication algorithms are available for A and A∗ (for instance, via the fast or if A is sparse), then iterative algorithms for least squares problems are fast alternatives to QR decompositions. Conjugate gradients [51, 231] and especially the variant in [303] belong to this class of algorithms. A.4 Vandermonde Matrices 535

A.4 Vandermonde Matrices

The Vandermonde matrix associated with x0,x1,...,xn ∈ C is defined as ⎡ ⎤ 2 ··· n 1 x0 x0 x0 ⎢ 2 n⎥ ⎢1 x1 x ··· x ⎥ ⎢ 1 1 ⎥ V := V(x0,x1,...,xn):=⎣. . . . ⎦ . (A.39) . . . ··· . 2 ··· n 1 xn xn xn

Theorem A.24. The determinant of the Vandermonde matrix (A.39) equals det V = (x − xk). 0≤k<≤n

Proof. The proof can be done by induction on n ≥ 1.Forn =1, the result is clear. For n ≥ 2, we remark that det V(x0,x1,...,xn) is a polynomial in xn, has degree at most n, and vanishes at x0,...,xn−1. Therefore, det V(x0,x1,...,xn)=c (xn − xk) (A.40) 0≤k

Theorem A.25. If xn > ···x1 >x0 > 0, then the Vandermonde matrix (A.39) is totally positive, i.e., for any sets I,J ⊂ [n +1]of equal size,

det VI,J > 0, where VI,J is the submatrix of V with rows and columns indexed by I and J. We start with the following lemma, known as Descartes’ rule of signs. n Lemma A.26. For a polynomial p(x)=anx + ···+ a1x + a0 =0 , the number Z(p) of positive zeros of p and the number S(a):=card({i ∈ [n]:ai−1ai < 0}) of sign changes of a =(a0,a1,...,an) satisfy

Z(p) ≤ S(a). 536 A Matrix Analysis

Proof. We proceed by induction on n ≥ 1.Forn =1, the desired result is clear. Let us now assume that the result holds up to an n − 1, n ≥ 2.Wewantto n establish that, given p(x)=anx + ···+ a1x + a0 =0 ,wehaveZ(p) ≤ S(a).We suppose that a0 =0 ; otherwise, the result is clear from the induction hypothesis. Changing p in −p if necessary, we may assume a0 > 0, and we consider the smallest positive integer k such that ak =0 —the result is clear of no such k exists. We separate two cases.

1. a0 > 0 and ak < 0. The result follows from Rolle’s theorem and the induction hypothesis via

Z(p) ≤ Z(p )+1≤ S(a1,...,an)+1=S(a0,a1,...,an).

2. a0 > 0 and ak > 0. Let t denote the smallest positive zero of p—again the result is clear if no such t exists. Let us assume that p does not vanish on (0,t).Sincep (0) = kak > 0, we derive that p (x) > 0 for all x ∈ (0,t). If follows that a0 = p(0)

Z(p) ≤ Z(p ) ≤ S(a1,...,an)=S(a0,a1,...,an).

This concludes the inductive proof.  Proof (of Theorem A.25). We will prove by induction on k ∈ [n] that ⎡ ⎤ j1 j2 ··· jk xi1 xi1 xi1 ⎢ j1 j2 jk ⎥ ⎢x x ··· x ⎥ ⎢ i2 i2 i2 ⎥ det ⎣ . . . . ⎦ > 0 . . .. . j1 j2 ··· jk xik xik xik for all 0

⎡ ⎤ j1 j2 ··· jk xi1 xi1 xi1 ⎢ j1 j2 jk ⎥ ⎢x x ··· x ⎥ ⎢ i2 i2 i2 ⎥ p(x):=det⎣ . . . . ⎦ ...... xj1 xj2 ··· xjk

Expanding with respect to the last row and invoking Descartes’ rule of signs, we observe that Z(p) ≤ k − 1. Since the polynomial p vanishes at the positive points xi1 ,...,xik−1 , it cannot vanish elsewhere. The assumption (A.41) then implies that p(x) < 0 for all x>xik−1 . But this contradicts the induction hypothesis, because ⎡ ⎤ xj1 xj2 ···xjk−1 ⎢ i1 i1 i1 ⎥ ⎢ xj1 xj2 ···xjk−1 ⎥ p(x) ⎢ i2 i2 i2 ⎥ lim =det⎢ . . . . ⎥ > 0. x→+∞ xjk ⎣ . . .. . ⎦ j1 j2 ··· jk−1 xik−1 xik−1 xik−1

Thus, we have shown that the desired result holds for the integer k, and this concludes the inductive proof. 

A.5 Matrix Functions

In this section, we consider functions of self-adjoint matrices and some of their basic properties. We recall that a matrix A ∈ Cn×n is called self-adjoint if A = A∗.Itis called positive semidefinite if additionally Ax, x≥0 for all x ∈ Cn and positive definite if Ax, x > 0 for all x = 0. For two self-adjoint matrices A, B ∈ Cn×n, we write A  B or B  A if B−A is positive semidefinite and A ≺ B or B ) A if B − A is positive definite. A self-adjoint matrix A possesses an eigenvalue decomposition of the form

A = UDU∗,

n×n where U ∈ C is a unitary matrix and D = diag[λ1,...,λn] is a diagonal matrix with the eigenvalues of A (repeated according to their multiplicities) on the diagonal. For a function f : I → R with I ⊂ R containing the eigenvalues of A, we define the self-adjoint matrix f(A) ∈ Cn×n via the spectral mapping

∗ f(A)=Uf(D)U ,f(D)=diag[f(λ1),...,f(λn)] . (A.42)

This definition does not depend on the particular eigenvalue decomposition. It is easy to verify that for polynomials f, the definition coincides with the natural one. For instance, if f(t)=t2, then, because U is unitary, 538 A Matrix Analysis

f(A)=UD2U∗ = UDU∗UDU∗ = A2.

Note that, if f(x) ≤ g(x), respectively f(x)

f(A)  g(A), respectively f(A) ≺ g(A), (A.43) for all A with eigenvalues contained in [a, b]. It is a simple consequence of the definition that for a block-diagonal matrix with self-adjoint blocks A1,...,AL on the diagonal, ⎛⎡ ⎤⎞ ⎡ ⎤ A1 0 ··· 0 f(A1) 0 ··· 0 ⎜⎢ . . ⎥⎟ ⎢ . . ⎥ ⎜⎢ 0A .. . ⎥⎟ ⎢ 0 f(A ) .. . ⎥ f ⎜⎢ 2 ⎥⎟ = ⎢ 2 ⎥ . (A.44) ⎜⎢ . . . ⎥⎟ ⎢ . . . ⎥ ⎝⎣ . .. .. 0 ⎦⎠ ⎣ . .. .. 0 ⎦ 0 ··· 0AL 0 ··· 0 f(AL)

Moreover, if A commutes with B, i.e., AB = BA,thenalsof(A) commutes with B, i.e., f(A)B = Bf(A). The function of a self-adjoint matrix A maybedefinedby applying (A.42) with the function f(x)=ex, or equivalently via the power series

∞ 1 eA := exp(A):=Id + Ak. (A.45) k! k=1

(The power series definition actually applies to any square, not necessarily self- adjoint, matrix.) The matrix exponential of a self-adjoint matrix is always positive definite by (A.43). Moreover, it follows from 1+x ≤ ex for all x ∈ R and from (A.43) again that, for any self-adjoint matrix A,

Id + A  exp(A). (A.46)

Lemma A.27. If A and B commute, i.e., AB = BA,then

exp(A + B)=exp(A)exp(B).

Proof. If A and B commute, then 1 1 k k k Aj Bk−j (A + B)k = Aj Bk−j = . k! k! j j! (k − j)! j=0 j=0

Therefore, A.5 Matrix Functions 539

∞ k Aj Bk−j ∞ ∞ Aj Bk−j ∞ 1 ∞ 1 exp(A + B)= = = Aj B j! (k − j)! j! (k − j)! j! ! k=0 j=0 j=0 k=j j=0 =0 =exp(A)exp(B).

This yields the claim.  This lemma fails in the general case where A and B do not commute. Corollary A.28. For any square matrix A, the matrix exponential exp(A) is invertible and

exp(A)−1 =exp(−A).

Proof. Since A and −A commute, the previous lemma yields

exp(A)exp(−A)=exp(A − A)=exp(0)=Id, and similarly exp(−A)exp(A)=Id.  Of special interest is the trace exponential

tr exp : A → tr exp(A). (A.47)

The trace exponential is monotone with respect to the semidefinite order. Indeed, for self-adjoint matrices A, B,wehave

tr exp A ≤ tr exp B whenever A  B. (A.48)

This fact follows from the more general statement below. Proposition A.29. Let f : R → R be a nondecreasing function, and let A, B be self-adjoint matrices. Then A  B implies

tr f(A) ≤ tr f(B).

Proof. The minimax principle in (A.23) and the assumption A  B imply that the eigenvalues λ1(A) ≥ λ2(A) ≥··· and λ1(B) ≥ λ2(B) ≥··· of A and B satisfy

λk(A)= max min Ax, x≤ max min Bx, x = λk(B). M⊂Cn x∈M M⊂Cn x∈M dim M=k x 2=1 dim M=k x 2=1

Since f is nondecreasing, it follows that

n n tr f(A)= f(λk(A)) ≤ f(λk(B)) = tr f(B). k=1 k=1 This completes the proof.  540 A Matrix Analysis

Next we show that certain inequalities for scalar functions extend to traces of matrix-valued functions; see [382] for more details.

Theorem A.30. Given f,g :[a, b] → R and c ∈ R for ∈ [M],if

M cf(x)g(y) ≥ 0 for all x, y ∈ [a, b], =1 then, for all self-adjoint matrices A, B with eigenvalues in [a, b], * + M tr cf(A)g(B) ≥ 0. =1

n ∗ n ∗ Proof. Let A = j=1 λj uj uj and B = k=1 ηkvkvk be the eigenvalue decompositions of A and B.Then * + ⎛ ⎞ M M n ⎝ ∗ ∗⎠ tr cf(A)g(B) =tr c f(λj )g(ηk)uj uj vkvk =1 =1 j,k=1

n M n M ∗ ∗ | |2 ≥ = cf(λj )g(ηk)tr (uj uj vkvk)= cf(λj )g(ηk) uj , vk 0. j,k=1=1 j,k=1=1

Hereby, we have used the cyclicity of the trace in the second-to-last step.  A function f is called matrix monotone (or operator monotone) if A  B implies

f(A)  f(B). (A.49)

Somewhat surprisingly, not every nondecreasing function f : R → R extends to a matrix monotone function via (A.42). A simple example is provided by f(t)=t2. In order to study matrix monotonicity for some specific functions below, we first make an easy observation. Lemma A.31. If A  B,thenY∗AY  Y∗BY for all matrices Y with appropri- ate dimensions. In , if Y is invertible and A ≺ B,thenY∗AY ≺ Y∗BY. Proof. For every vector x,wehave

Y∗AYx, x = AYx, Yx≤BYx, Yx = Y∗BYx, x, which shows the first part. The second part follows with minor changes.  Next, we establish the matrix monotonicity of the negative inverse map. A.5 Matrix Functions 541

Proposition A.32. The matrix function f(A)=−A−1 is matrix monotone on the set of positive definite matrices. Proof. Given 0 ≺ A  B, the matrix B−1/2 exists (and may be defined via (A.42)). According to Lemma A.31,

B−1/2AB−1/2  B−1/2BB−1/2 = Id.

The matrix C = B−1/2AB−1/2 has an eigenvalue decomposition C = UDU∗ with a unitary matrix U and a diagonal matrix D. By Lemma A.31,theabove relation implies 0 ≺ D  Id. It follows that Id  D−1. Then Lemma A.31 yields

Id  UD−1U∗ = C−1 =(B−1/2AB−1/2)−1 = B1/2A−1B1/2.

Applying Lemma A.31 again shows that B−1 = B−1/2IdB−1/2  A−1.  The matrix logarithm can be defined for positive definite matrices via the spectral mapping formula (A.42) with f(x) = ln(x). It is the inverse of the matrix exponential, i.e.,

exp(ln(A)) = A. (A.50)

Remark A.33. The definition of the matrix logarithm can be extended to invertible, not necessarily self-adjoint, matrices, just like the matrix exponential extends to all square matrices via the power series expansion (A.45). Similarly to the extension of the logarithm to the complex numbers, one encounters the nonuniqueness of the logarithm defined via (A.50). One usually chooses the principal branch, thus restricting the domain to matrices with eigenvalues outside of the negative real . Unlike the matrix exponential, the matrix logarithm is matrix monotone. Proposition A.34. Let A, B be positive definite matrices. Then

ln(A)  ln(B) whenever A  B.

Proof. We first observe that the (scalar) logarithm satisfies ∞ 1 1 ln(x)= − dt, x > 0. (A.51) 0 t +1 t + x

Indeed, a simple transformation shows that, for R>0,  R 1 1 t +1 R R +1 1 − dt =ln  =ln −ln −→ ln(x).  →∞ 0 t +1 t + x t + x 0 R + x x R 542 A Matrix Analysis

It now follows from Proposition A.32 that, for t ≥ 0, the matrix function

1 g (A):= Id − (tId + A)−1 t t +1 is matrix monotone on the set of positive definite matrices. By (A.51) and by the definition of the matrix logarithm via (A.42), we derive that, for a positive definite matrix A,

∞ ln(A)= gt(A)dt. 0

Therefore, the matrix logarithm is matrix monotone, since preserve the semidefinite ordering.  The square-root function A → A1/2 and all the power functions A → Ap with 0

This appendix provides an overview of convex analysis and . Much more information can be found in various books on the subject such as [70, 178, 275, 293, 424, 425]. For the purpose of this exposition on convexity, we work with real vector spaces RN and treat sets in and functions on CN by identifying CN with R2N .Inorderto reverse this identification in some of the statements and definitions below, one needs to replace the inner product x, z by Rex, z for x, z ∈ CN .

B.1 Convex Sets

Let us start with some basic definitions. Definition B.1. A subset K ⊂ RN is called convex, if for all x, z ∈ K, the connecting x and z is entirely contained in K,thatis,

tx +(1− t)z ∈ K for all t ∈ [0, 1].

N It is straightforward to verify that a set K ∈ R is convex if and only if, for all ∈ ≥ n x1,...,xn K and t1,...,tn 0 such that j=1 tj =1, the n j=1 tj xj is also contained in K. Definition B.2. The conv(T ) of a set T ⊂ RN is the smallest containing T . It is well known [424, Theorem 2.3] that the convex hull of T consists of the finite convex combinations of T , i.e., n n conv(T )= tj xj : n ≥ 1,t1,...,tn ≥ 0, tj =1, x1,...,xn ∈ T . j=1 j=1

S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing, 543 Applied and Numerical Harmonic Analysis, DOI 10.1007/978-0-8176-4948-7, © Springer Science+Business Media New York 2013 544 B Convex Analysis

Simple examples of convex sets include subspaces, affine spaces, half spaces, , or norm balls B(x,t);see(A.2). The intersection of convex sets is again aconvexset. Definition B.3. AsetK ⊂ RN is called a if, for all x ∈ K and all t ≥ 0, tx also belongs to K. In addition, if K is convex, then K is called a . Obviously, the zero vector is contained in every cone. A set K is a convex cone if, for all x, z ∈ K and t, s ≥ 0, sx + tz also belongs to K. Simple examples of convex include subspaces, half spaces, the positive RN { ∈ RN ≥ ∈ } orthant + = x : xi 0 for all i [N] , or the set of positive semidefinite matrices in RN×N . Another important example of a convex cone is the second- order cone ⎧ ⎫ ⎨ N ⎬ ∈ RN+1 2 ≤ ⎩x : xj xN+1⎭ . (B.1) j=1

For a cone K ⊂ RN , its dual cone K∗ is defined via  K∗ := z ∈ RN : x, z≥0 for all x ∈ K . (B.2)

As the intersection of half spaces, K∗ is closed and convex, and it is straightforward to verify that K∗ is again a cone. If K is a closed nonempty cone, then K∗∗ = K. Moreover, if H, K ⊂ RN are cones such that H ⊂ K,thenK∗ ⊂ H∗.Asan RN RN RN example, the dual cone of the positive orthant + is + itself—in other words, + is self-dual. Note that the dual cone is closely related to the polar cone, which is defined by  K◦ := z ∈ RN : x, z≤0 for all x ∈ K = −K∗. (B.3)

The conic hull cone(T ) of a set T ⊂ RN is the smallest convex cone containing T . It can be described as n cone(T )= tj xj : n ≥ 1,t1,...,tn ≥ 0, x1,...,xn ∈ T . (B.4) j=1

Convex sets can be separated by as stated next. N Theorem B.4. Let K1,K2 ⊂ R be convex sets whose interiors have empty intersection. Then there exist a vector w ∈ RN and a scalar λ such that

N K1 ⊂{x ∈ R : x, w≤λ}, N K2 ⊂{x ∈ R : x, w≥λ}. B.2 Convex Functions 545

Remark B.5. The theorem applies in particular when K1 ∩K2 = ∅ or when K1,K2 intersect in only one point, i.e., K1 ∩K2 = {x0}. In the latter case, one chooses λ = x0, w.IfK2 is a subset of a , then one can choose w and λ such that N K2 ⊂{x ∈ R : x, w = λ}. Next we consider the notion of extreme points. Definition B.6. Let K ⊂ RN be a convex set. A point x ∈ K is called an of K if x = ty +(1− t)z for y, z ∈ K and t ∈ (0, 1) implies x = y = z. Compact convex sets can be described via their extreme points as stated next (see, for instance, [424, Corollary 18.5.1] or [275, Theorem 2.3.4]). Theorem B.7. A compact convex set is the convex hull of its extreme points. If K is a , then its extreme points are the zero-dimensional faces of K, and the above statement is rather intuitive.

B.2 Convex Functions

We work with extended-valued functions F : RN → (−∞, ∞]=R ∪{∞}. Sometimes we also consider an additional extension of the values to −∞. Addition, multiplication, and inequalities in (−∞, ∞] are understood in the “natural” sense— for instance, x + ∞ = ∞ for all x ∈ R, λ ·∞ = ∞ for λ>0, x<∞ for all x ∈ R. The domain of an extended-valued function F is defined as

dom(F )={x ∈ RN ,F(x) = ∞}.

A function with dom(F ) = ∅ is called proper. A function F : K → R on a subset K ⊂ RN can be converted to an extended-valued function by setting F (x)=∞ for x ∈/ K.Thendom(F )=K, and we call this extension the canonical one. Definition B.8. An extended-valued function F : RN → (−∞, ∞] is called convex if, for all x, z ∈ RN and t ∈ [0, 1],

F (tx +(1− t)z) ≤ tF (x)+(1− t)F (z). (B.5)

F is called strictly convex if, for all x = z and all t ∈ (0, 1),

F (tx +(1− t)z)

F is called strongly convex with parameter γ>0 if, for all x, z ∈ RN and t ∈ [0, 1], γ F (tx +(1− t)z) ≤ tF (x)+(1− t)F (z) − t(1 − t)x − z2. (B.6) 2 2 546 B Convex Analysis

A function F : RN → [−∞, ∞) is called (strictly, strongly) concave if −F is (strictly, strongly) convex. Obviously, a strongly is strictly convex. The domain of a convex function is convex, and a function F : K → RN on a convex subset K ⊂ RN is called convex if its canonical extension to RN is convex (or alternatively if x, z in the definition (B.5) are assumed to be in K). A function F is convex if and only if its

epi(F )={(x,r):x ∈ RN ,r ≥ F (x)}⊂RN × R is a convex set. Convexity can also be characterized using general convex combinations: A RN → −∞ ∞ ∈ RN function F : ( , ] is convex if and only if for all x1,...,xn ≥ n and t1,...,tn 0 such that j=1 tj =1,

 n  n F tj xj ≤ tj F (xj ). j=1 j=1

Let us summarize some results on the composition of convex functions. Proposition B.9. (a) Let F, G be convex functions on RN .Forα, β ≥ 0,the function αF + βG is convex. (b) Let F : R → R be convex and nondecreasing and G : RN → R be convex. Then the function H(x)=F (G(x)) is convex. Proof. Verifying (a) is straightforward. For (b), given x, y ∈ RN and t ∈ [0, 1],we have

H(tx +(1− t)y)=F (G(tx +(1− t)y)) ≤ F (tG(x)+(1− t)G(y)) ≤ tF (G(x)) + (1 − t)F (G(y)) = tH(x)+(1− t)H(y), wherewehaveusedconvexityofG and monotonicity of F in the first inequality and convexity of F in the second inequality.  Let us give some classical examples of convex functions. Example B.10. (a) For p ≥ 1, the function F (x)=|x|p, x ∈ R, is convex. (b) Every norm ·on RN is a convex function. This follows from the triangle inequality and homogeneity. (c) The p-norms ·p are strictly convex for 1

(e) For a positive semidefinite matrix A ∈ RN×N , the function F (x)=x∗Ax is convex. If A is positive definite, then F is strongly convex. (f) For a convex set K, the characteristic function 0 if x ∈ K, χ (x)= (B.7) K ∞ if x ∈/ K

is convex. For differentiable functions, the following characterizations of convexity holds. Proposition B.11. Let F : RN → R be a differentiable function. (a) F is convex if and only if, for all x, y ∈ RN ,

F (x) ≥ F (y)+∇F (y), x − y,

! " where ∇F (y):= ∂F (y),..., ∂F (y) is the gradient of F at y. ∂y1 ∂yN (b) F is strongly convex with parameter γ>0 if and only if, for all x, y ∈ RN , γ F (x) ≥ F (y)+∇F (y), x − y + x − y2. 2 2

(c) If F is twice differentiable, then it is convex if and only if, for all x ∈ RN ,

∇2F (x)  0,

! 2 "N where ∇2F (x):= ∂ F (x) is the Hessian of F at x. ∂xi∂xj i,j=1 We continue with a discussion about continuity properties. Proposition B.12. A convex function F : RN → R is continuous on RN . The treatment of extended-valued functions requires the notion of lower semi- continuity (which is a particularly useful notion in infinite-dimensional Hilbert spaces, where, for instance, the norm · is not continuous but lower semicontinuous with respect to the ; see, for instance, [178]). Definition B.13. A function F : RN → (−∞, ∞] is called lower semicontinuous N if, for every x ∈ R and every sequence (xj )j≥1 ⊂ R converging to x,

lim inf F (xj ) ≥ F (x). j→∞

A continuous function F : RN → R is lower semicontinuous. A nontrivial example of lower semicontinuity is provided by the characteristic function χK of a proper N subset K ⊂ R defined in (B.7). Indeed, χK is not continuous, but it is lower semicontinuous if and only if K is closed. 548 B Convex Analysis

A function is lower semicontinuous if and only if its epigraph is closed. Convex functions have nice properties related to minimization. A (global) minimizer of a function F : RN → (−∞, ∞] is a point x ∈ RN satisfying F (x) ≤ F (y) for all y ∈ RN . A local minimizer of F is a point x ∈ RN such N that there exists ε>0 with F (x) ≤ F (y) for all y ∈ R satisfying x − y2 ≤ ε. (The Euclidean norm ·2 can be replaced by any norm ·in this definition.) Proposition B.14. Let F : RN → (−∞, ∞] be a convex function. (a) A local minimizer of F is a global minimizer. (b) The set of minimizers of F is convex. (c) If F is strictly convex, then the minimizer of F is unique. Proof. (a) Let x be a local minimizer and z ∈ RN be arbitrary. Let ε>0 be the pa- rameter appearing in the definition of a local minimizer. There exists t ∈ (0, 1) such that y := tx+(1−t)z satisfies x−y2 ≤ ε.ThenF (x) ≤ F (y) and by convexity F (y)=F (tx +(1− t)z) ≤ tF (x)+(1− t)F (z). It follows that (1 − t)F (x) ≤ (1 − t)F (z); hence, F (x) ≤ F (z) because 1 − t>0.Sincez was arbitrary, this shows that x is a global minimizer. N (b) Let x, y ∈ R be two minimizers, i.e., F (x)=F (y) = infz F (z). Then, for t ∈ [0, 1],

F (tx +(1− t)y) ≤ tF (x)+(1− t)F (y) = inf F (z), z

so that tx +(1− t)y is a minimizer as well. (c) Suppose that x = y are both minimizers of F . Then, for t ∈ (0, 1),

F (tx +(1− t)y)

g(x1)=f(x1, y1)= min f(x1, y),g(x2)=f(x2, y2)= min f(x2, y). y∈Rm y∈Rm

For t ∈ [0, 1], the joint convexity implies that

g(tx1 +(1− t)x2) ≤ f(tx1 +(1− t)x2,ty1 +(1− t)y2)

≤ tf(x1, y1)+(1− t)f(x2, y2)=tg(x1)+(1− t)g(x2).

This point finishes the argument.  The previous theorem shows as well that partial maximization of a jointly gives rise to a concave function. Next we consider the maximum of a convex function over a convex set. Theorem B.16. Let K ⊂ RN be a compact convex set and let F : K → R be a convex function. Then F attains its maximum at an extreme point of K.

Proof. Let x ∈ K such that F (x) ≥ F (z) for all z ∈ K. By Theorem B.7,theset m K is the convex hull of its extreme points, so we can write x = j=1 tj xj for some ≥ m integer m 1, t1,...,tm > 0 with j=1 tj =1, and extreme points x1,...,xm of K. By convexity,

 m  m m F (x)=F tj xj ≤ tj F (xj ) ≤ tj F (x)=F (x) j=1 j=1 j=1 because F (xj ) ≤ F (x) by definition of x. Thus, all inequalities hold with equality, which is only possible if F (xj )=F (x) for all j ∈ [m]. Therefore, the maximum of F is attained at an extreme point of K. 

B.3 The Convex Conjugate

The convex conjugate is a very useful concept in convex analysis and optimization. Definition B.17. Given a function F : RN → (−∞, ∞], the convex conjugate (or Fenchel dual) function of F is the function F ∗ : RN → (−∞, ∞] defined by

F ∗(y):=sup{x, y−F (x)}. x∈RN

The convex conjugate F ∗ is always a convex function, no matter whether the function F is convex or not. The definition of F ∗ immediately gives the Fenchel (or Young, or Fenchel–Young) inequality

x, y≤F (x)+F ∗(y) for all x, y ∈ RN . (B.8) 550 B Convex Analysis

Let us summarize some properties of convex conjugate functions. Proposition B.18. Let F : RN → (−∞, ∞]. (a) The convex conjugate F ∗ is lower semicontinuous. (b) The biconjugate F ∗∗ is the largest lower semicontinuous convex function satisfying F ∗∗(x) ≤ F (x) for all x ∈ RN . In particular, if F is convex and lower semicontinuous, then F ∗∗ = F . ∗ ∗ (c) For τ =0 ,ifFτ (x):=F (τx),then(Fτ ) (y)=F (y/τ). (d) For τ>0, (τF)∗(y)=τF∗(y/τ). (e) For z ∈ RN ,ifF (z) := F (x − z),then(F (z))∗(y)=z, y + F ∗(y). Proof. For (a) and (b), we refer to [424, Corollary 12.1.1 and Theorem 12.2]. For (d), a substitution gives

(τF)∗(y)= sup{x, y−τF(x)} = τ sup {x, y/τ−F (x)} = τF∗(y/τ). x∈RN x∈RN

The statements (c) and (e) are obtained from simple calculations.  The biconjugate F ∗∗ is sometimes called the convex relaxation of F because of (b). Let us compute the convex conjugate for some examples. 1  2 ∈ RN ∗ 1  2 Example B.19. (a) Let F (x)= 2 x 2, x .ThenF (y)= 2 y 2 = F (y), y ∈ RN . Indeed, since

1 1 x, y≤ x2 + y2, (B.9) 2 2 2 2 we have

∗ { − }≤1 2 F (y)= sup x, y F (x) y 2. x∈RN 2

For the converse inequality, we just set x = y in the definition of the convex conjugate to obtain

1 1 F ∗(y) ≥y2 − y2 = y2. 2 2 2 2 2

This is the only example of a function F on RN satisfying F ∗ = F . (b) Let F (x)=exp(x), x ∈ R. The function x → xy − exp(x) takes its maximum at x =lny if y>0,sothat ⎧ ⎨ y ln y − y if y>0, ∗ F (y)=sup{xy − ex} = 0 if y =0, ∈R ⎩ x ∞ if y<0. B.4 The Subdifferential 551

The Fenchel inequality for this particular pair reads

xy ≤ ex + y ln(y) − y for all x ∈ R,y >0. (B.10)

N (c) Let F (x)=x for some norm on R and let ·∗ be its dual norm; see Definition A.3. Then the convex conjugate of F is the characteristic function of the dual norm ball, that is, ∗ 0 if y∗ ≤ 1, F (y)=χ (y)= B · ∗ ∞ otherwise.

Indeed, by the definition of the dual norm, we have x, y≤xy∗.It follows that

∗ F (y)= sup{x, y−x} ≤ sup {(y∗ − 1)x}, x∈RN x∈RN

∗ ∗ so that F (y) ≤ 0 if y∗ ≤ 1. The choice x = 0 shows that F (y)=0in this case. If y∗ > 1, then there exists x such that x, y > x. Replacing x by λx and letting λ →∞show that F ∗(y)=∞ in this case. (d) Let F = χK be the characteristic function of a convex set K;see(B.7). Its convex conjugate is given by

F ∗(y)=supx, y. x∈K

B.4 The Subdifferential

The subdifferential generalizes the gradient to nondifferentiable functions. Definition B.20. The subdifferential of a convex function F : RN → (−∞, ∞] at a point x ∈ RN is defined by

∂F(x)={v ∈ RN : F (z) ≥ F (x)+v, z − x for all z ∈ RN }. (B.11)

The elements of ∂F(x) are called subgradients of F at x. The subdifferential ∂F(x) of a convex function F : RN → R is always nonempty. If F is differentiable at x,then∂F(x) contains only the gradient, i.e.,

∂F(x)={∇F (x)}, see Proposition B.11(a) for one direction. A simple example of a function with a nontrivial subdifferential is the function F (x)=|x|,forwhich 552 B Convex Analysis {sgn(x)} if x =0 , ∂F(x)= [−1, 1] if x =0, where sgn(x)=+1for x>0 and sgn(x)=−1 for x<0 as usual. The subdifferential allows a simple characterization of minimizers of convex functions. Theorem B.21. A vector x is a minimum of a convex function F if and only if 0 ∈ ∂F(x). Proof. This is obvious from the definition of the subdifferential.  Convex conjugates and subdifferentials are related in the following way. Theorem B.22. Let F : RN → (−∞, ∞] be a convex function and let x, y ∈ RN . The following conditions are equivalent: (a) y ∈ ∂F(x). (b) F (x)+F ∗(y)=x, y. Additionally, if F is lower semicontinuous, then (a) and (b) are equivalent to (c) x ∈ ∂F∗(y). Proof. By definition of the subgradient, condition (a) reads

x, y−F (x) ≥z, y−F (z) for all z ∈ RN . (B.12)

Therefore, the function z →z, y−F (z) attains its maximum at x.By definition of the convex conjugate, this implies that F ∗(y)=x, y−F (x),i.e., F ∗(y)+F (x)=x, y. This shows that (a) implies (b). Conversely, condition (b) implies by the Fenchel inequality and the definition of the convex conjugate that the function z →z, y−F (z) attains its maximum in x, which is nothing else than (B.12). It follows from the definition of the subdifferential that y ∈ ∂F(x). Now if F is lower semicontinuous, then F ∗∗ = F by Proposition B.18(b) so that (b) is equivalent to F ∗∗(x)+F ∗(y)=x, y. Using the equivalence of (a) and (b) with F replaced by F ∗ concludes the proof.  As a consequence, if F is a convex lower semicontinuous function, then ∂F is the inverse of ∂F∗ in the sense that x ∈ ∂F∗(y) if and only if y ∈ ∂F(x). Next we consider the so-called proximal mapping (also known as proximation, resolvent operator, or proximity operator). Let F : RN → (−∞, ∞] beaconvex function. Then, for z ∈ RN , the function

1 x → F (x)+ x − z2 2 2 B.4 The Subdifferential 553

→ 2 is strictly convex due to the strict convexity of x x 2. By Proposition B.14(c), its minimizer is unique. The mapping

1 − 2 PF (z):=argminF (x)+ x z 2 (B.13) x∈RN 2 is called the proximal mapping associated with F . In the special case where F = χK is the characteristic function of a convex set K defined in (B.7), then PK := PχK is the orthogonal projection onto K,thatis,

PK (z) = argmin x − z2. (B.14) x∈K

If K is a subspace of RN , then this is the usual orthogonal projection onto K,which in particular is a linear map. The proximal mapping can be expressed via subdifferentials as shown in the next statement. Proposition B.23. Let F : RN → (−∞, ∞] be a convex function. Then x = PF (z) if and only if z ∈ x + ∂F(x).

Proof. By Theorem B.21,wehavex = PF (z) if and only if 1  0 ∈ ∂ ·−z2 + F (x). 2 2   → 1  − 2 ∇ 1 ·− 2 The function x 2 x z 2 is differentiable with gradient 2 z 2 (x)= x − z, so that the above condition reads 0 ∈ x − z + ∂F(x), which is equivalent to z ∈ x + ∂F(x).  The previous proposition justifies the writing

−1 PF =(Id + ∂F) .

Moreau’s identity stated next relates the proximal mappings of F and F ∗. Theorem B.24. Let F : RN → (−∞, ∞] be a lower semicontinuous convex function. Then, for all z ∈ RN ,

PF (z)+PF ∗ (z)=z.

Proof. Let x := PF (z),andsety := z − x. By Proposition B.23,wehave z ∈ x + ∂F(x), i.e., y = z − x ∈ ∂F(x).SinceF is lower semicontinuous, it follows from Theorem B.22 that x ∈ ∂F∗(y), i.e., z ∈ y + ∂F∗(y).By Proposition B.23 again, we have y = PF ∗ (z). In particular, we have shown that PF (z)+PF ∗ (z)=x + y = z by definition of y.  554 B Convex Analysis

If PF is easy to compute, then the previous result shows that PF ∗ (z)=z − PF (z) is also easy to compute. It is useful to note that applying Moreau’s identity to the function τF for some τ>0 shows that

P_{τF}(z) + τ P_{τ^{−1}F*}(z/τ) = z.  (B.15)

Indeed, we have P_{τF}(z) + P_{(τF)*}(z) = z, so it remains to show that P_{(τF)*}(z) = τ P_{τ^{−1}F*}(z/τ). This follows from Proposition B.18(d), since
P_{(τF)*}(z) = argmin_{x∈RN} (1/2)‖x − z‖₂² + (τF)*(x)
 = argmin_{x∈RN} (1/2)‖x − z‖₂² + τ F*(x/τ)
 = argmin_{x∈RN} τ² [ (1/2)‖x/τ − z/τ‖₂² + τ^{−1} F*(x/τ) ] = τ P_{τ^{−1}F*}(z/τ).

Since F is a lower semicontinuous convex function, we have F ∗∗ = F , so applying (B.15)toF ∗ in place of F gives

P_{τF*}(z) + τ P_{τ^{−1}F}(z/τ) = z.  (B.16)

Theorem B.25. For a convex function F : RN → (−∞, ∞], the proximal mapping PF is nonexpansive, i.e.,

‖P_F(z) − P_F(z′)‖₂ ≤ ‖z − z′‖₂ for all z, z′ ∈ RN.

Proof. Set x = P_F(z) and x′ = P_F(z′). By Proposition B.23, we have z ∈ x + ∂F(x), so that y := z − x ∈ ∂F(x). Theorem B.22 shows that F(x) + F*(y) = ⟨x, y⟩. Similarly, we can find y′ such that z′ = x′ + y′ and F(x′) + F*(y′) = ⟨x′, y′⟩. It follows that

‖z − z′‖₂² = ‖x − x′‖₂² + ‖y − y′‖₂² + 2⟨x − x′, y − y′⟩.  (B.17)

Note that, by the Fenchel inequality (B.8), we have

⟨x, y′⟩ ≤ F(x) + F*(y′) and ⟨x′, y⟩ ≤ F(x′) + F*(y).

Therefore,

⟨x − x′, y − y′⟩ = ⟨x, y⟩ + ⟨x′, y′⟩ − ⟨x, y′⟩ − ⟨x′, y⟩
 = F(x) + F*(y) + F(x′) + F*(y′) − ⟨x, y′⟩ − ⟨x′, y⟩ ≥ 0.

Together with (B.17), this shows that ‖x − x′‖₂² ≤ ‖z − z′‖₂².

Let us conclude with an important example of a proximal mapping. Let F(x) = |x|, x ∈ R, be the absolute value function. A straightforward computation shows that, for τ > 0,
P_{τF}(y) = argmin_{x∈R} (1/2)(x − y)² + τ|x| = y − τ if y ≥ τ, 0 if |y| ≤ τ, y + τ if y ≤ −τ,

=: Sτ (y). (B.18)

The function S_τ is called soft thresholding or shrinkage operator. More generally, if F(x) = ‖x‖₁ is the ℓ₁-norm on RN, then the minimization problem defining the proximal operator decouples and P_{τF}(y) is given entrywise for y ∈ RN by

(P_{τF}(y))_ℓ = S_τ(y_ℓ), ℓ ∈ [N].  (B.19)
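As a quick illustration of (B.18) and (B.19), and of Moreau's identity (B.15) for F = |·|, the following Python sketch (our own addition; the helper name soft_threshold is not from the text) compares S_τ with a brute-force minimization of the defining functional and checks (B.15), using that F* is the characteristic function of [−1, 1], whose proximal mapping is the projection onto [−1, 1]:

import numpy as np

def soft_threshold(y, tau):
    # Soft thresholding / shrinkage operator S_tau, applied entrywise; cf. (B.18), (B.19)
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)

rng = np.random.default_rng(0)
tau = 0.7

# S_tau(y) agrees with the minimizer of x -> 0.5*(x - y)^2 + tau*|x| (brute force on a grid)
for y in rng.standard_normal(5):
    grid = np.linspace(-5.0, 5.0, 200001)
    brute = grid[np.argmin(0.5 * (grid - y) ** 2 + tau * np.abs(grid))]
    assert abs(soft_threshold(y, tau) - brute) < 1e-4

# Moreau's identity (B.15) for F = |.|: F* is the characteristic function of [-1, 1],
# so P_{tau^{-1} F*} is the projection onto [-1, 1]
z = rng.standard_normal(8)
assert np.allclose(soft_threshold(z, tau) + tau * np.clip(z / tau, -1.0, 1.0), z)
print("soft thresholding and Moreau's identity verified")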

B.5 Convex Optimization Problems

An optimization problem takes the form

min F0(x) subject to Ax = y, (B.20) x∈RN

and Fj (x) ≤ bj ,j∈ [M], (B.21)

where the function F0 : RN → (−∞, ∞] is called objective function, the functions F1, ..., FM : RN → (−∞, ∞] are called constraint functions, and A ∈ R^{m×N}, y ∈ R^m provide the equality constraint. A point x ∈ RN satisfying the constraints is called feasible and (B.20) is called feasible if there exists a feasible point. A feasible point x♯ for which the minimum is attained, that is, F0(x♯) ≤ F0(x) for all feasible x, is called a minimizer or optimal point, and F0(x♯) is the optimal value. We note that the equality constraint may be removed and represented by inequality constraints of the form Fj(x) ≤ yj and −Fj(x) ≤ −yj with Fj(x) := ⟨Aj, x⟩, where Aj ∈ RN is the jth row of A. The set of feasible points described by the constraints is given by

K = {x ∈ RN : Ax = y, Fj(x) ≤ bj, j ∈ [M]}.  (B.22)

Two optimization problems are said to be equivalent if, given the solution of one problem, the solution of the other problem can be “easily” computed. For the purpose of this exposition, we settle for this vague definition of equivalence which will hopefully be clear in concrete situations.

The optimization problem (B.20) is equivalent to the problem of minimizing F0 over K, i.e.,

min F0(x). (B.23) x∈K

Recalling that the characteristic function of K is
χ_K(x) = 0 if x ∈ K, and χ_K(x) = ∞ if x ∉ K,
the optimization problem is also equivalent to the unconstrained optimization problem

min F0(x)+χK (x). x∈RN

A convex optimization problem (or convex program) is a problem of the form (B.20), in which the objective function F0 and the constraint functions Fj are convex. In this case, the set of feasible points K defined in (B.22) is convex. The convex optimization problem is equivalent to the unconstrained optimization problem of minimizing the convex function

F (x)=F0(x)+χK(x).

Due to this equivalence, we may freely switch between constrained and un- constrained optimization problems. Clearly, the statements of Proposition B.14 carry over to constrained optimization problems. We only note that, for con- strained optimization problems, the function F0 is usually taken to be finite, i.e., N dom(F0)=R . In a linear optimization problem (or linear program), the objective function F0 and all the constraint functions F1,...,FM are linear. This is a special case of a convex optimization problem. The Lagrange function of an optimization problem of the form (B.20)isdefined N m M for x ∈ R , ξ ∈ R , ν ∈ R with ν ≥ 0 for all ∈ [M],by

L(x, ξ, ν) := F0(x) + ⟨ξ, Ax − y⟩ + Σ_{ℓ=1}^{M} ν_ℓ (F_ℓ(x) − b_ℓ).  (B.24)

For an optimization problem without inequality constraints, we just set

L(x, ξ) := F0(x) + ⟨ξ, Ax − y⟩.  (B.25)

The variables ξ and ν are called Lagrange multipliers. For ease of notation, we write ν ⪰ 0 if ν_ℓ ≥ 0 for all ℓ ∈ [M]. The Lagrange dual function is defined by

H(ξ, ν) := inf_{x∈RN} L(x, ξ, ν), ξ ∈ R^m, ν ∈ R^M, ν ⪰ 0.

If x ↦ L(x, ξ, ν) is unbounded from below, then we set H(ξ, ν) = −∞. Again, if there are no inequality constraints, then
H(ξ) := inf_{x∈RN} L(x, ξ) = inf_{x∈RN} { F0(x) + ⟨ξ, Ax − y⟩ }, ξ ∈ R^m.

The dual function is always concave (even if the original problem (B.20) is not convex) because it is the pointwise infimum of a family of affine functions. The  dual function provides a bound on the optimal value of F0(x ) for the minimization problem (B.20), namely,

H(ξ, ν) ≤ F0(x♯) for all ξ ∈ R^m, ν ∈ R^M, ν ⪰ 0.  (B.26)

Indeed, if x is a feasible point for (B.20), then Ax − y = 0 and F_ℓ(x) − b_ℓ ≤ 0 for all ℓ ∈ [M], so that, for all ξ ∈ R^m and ν ⪰ 0,

⟨ξ, Ax − y⟩ + Σ_{ℓ=1}^{M} ν_ℓ (F_ℓ(x) − b_ℓ) ≤ 0.

Therefore,
L(x, ξ, ν) = F0(x) + ⟨ξ, Ax − y⟩ + Σ_{ℓ=1}^{M} ν_ℓ (F_ℓ(x) − b_ℓ) ≤ F0(x).

Using H(ξ, ν) ≤ L(x, ξ, ν) and taking the infimum over all feasible x ∈ RN in the right-hand side yields (B.26). We would like this lower bound to be as tight as possible. This motivates us to consider the optimization problem

max_{ξ∈R^m, ν∈R^M} H(ξ, ν) subject to ν ⪰ 0.  (B.27)

This optimization problem is called the dual problem to (B.20), which in this context is sometimes called the primal problem. Since H is concave, this problem is equivalent to the convex optimization problem of minimizing the convex function −H subject to the positivity constraint ν  0.Apair(ξ, ν) with ξ ∈ Rm and ν ∈ RM such that ν  0 is called dual feasible. A (feasible) maximizer (ξ, ν) of (B.27) is referred to as dual optimal or optimal Lagrange multipliers. If x is optimal for the primal problem (B.20), then the triple (x, ξ, ν) is called primal– dual optimal. Inequality (B.26) shows that we always have

H(ξ♯, ν♯) ≤ F0(x♯).  (B.28)

This inequality is called weak duality. For most (but not all) convex optimization problems, strong duality holds, meaning that

H(ξ♯, ν♯) = F0(x♯).  (B.29)

Slater’s constraint qualification, stated in a simplified form below, provides a condition ensuring strong duality.

Theorem B.26. Assume that F0, F1, ..., FM are convex functions with dom(F0) = RN. If there exists x ∈ RN such that Ax = y and F_ℓ(x) < b_ℓ for all ℓ ∈ [M], then strong duality holds for (B.20).
For a feasible point x of (B.20) and a dual feasible pair (ξ, ν), the primal–dual gap
E(x, ξ, ν) = F0(x) − H(ξ, ν)  (B.30)
can be used to quantify how close x is to the minimizer x♯ of the primal problem (B.20) and how close (ξ, ν) is to the maximizer of the dual problem (B.27). If (x, ξ, ν) is primal–dual optimal and strong duality holds, then E(x, ξ, ν) = 0. The primal–dual gap is often taken as a stopping criterion in iterative optimization methods. For illustration, let us compute the dual problem of the ℓ₁-minimization problem

min_{x∈RN} ‖x‖₁ subject to Ax = y.  (B.31)

The Lagrange function for this problem takes the form

L(x, ξ) = ‖x‖₁ + ⟨ξ, Ax − y⟩.

The Lagrange dual function is

H(ξ) = inf_{x∈RN} { ‖x‖₁ + ⟨A*ξ, x⟩ − ⟨ξ, y⟩ }.

If ‖A*ξ‖∞ > 1, then there exists x ∈ RN such that ⟨A*ξ, x⟩ < −‖x‖₁. Replacing x by λx and letting λ → ∞ show that H(ξ) = −∞ in this case. If ‖A*ξ‖∞ ≤ 1, then ‖x‖₁ + ⟨A*ξ, x⟩ ≥ 0. The choice x = 0 therefore yields the infimum, and H(ξ) = −⟨ξ, y⟩. In conclusion,
H(ξ) = −⟨ξ, y⟩ if ‖A*ξ‖∞ ≤ 1, and H(ξ) = −∞ otherwise.

Clearly, it is enough to maximize over the points ξ for which H(ξ) > −∞. Making this constraint explicit, the dual program to (B.31) is given by

max_{ξ∈R^m} −⟨ξ, y⟩ subject to ‖A*ξ‖∞ ≤ 1.  (B.32)

By Theorem B.26, strong duality holds for this pair of primal and dual optimization problems provided the primal problem (B.31) is feasible.
Remark B.27. In the complex case where A ∈ C^{m×N} and y ∈ C^m, the inner product has to be replaced by the real inner product Re⟨x, y⟩ as noted in the beginning of this chapter. Following the derivation above, we see that the dual program of (B.31), where the minimum now ranges over x ∈ CN, is given by

max_{ξ∈C^m} −Re⟨ξ, y⟩ subject to ‖A*ξ‖∞ ≤ 1.  (B.33)
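Both (B.31) and its dual (B.32) are linear programs, so strong duality can be observed numerically. The following Python sketch (our own illustration; the instance sizes and the use of scipy.optimize.linprog are assumptions, not part of the text) solves both problems for a small random instance and compares the optimal values:

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
m, N, s = 10, 30, 3
A = rng.standard_normal((m, N))
x0 = np.zeros(N)
x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
y = A @ x0                                   # (B.31) is feasible by construction

# Primal (B.31) as a linear program: x = u - v with u, v >= 0, minimize sum(u) + sum(v)
res_primal = linprog(c=np.ones(2 * N), A_eq=np.hstack([A, -A]), b_eq=y,
                     bounds=(0, None), method="highs")

# Dual (B.32): max -<xi, y> subject to ||A^T xi||_inf <= 1, i.e. minimize <y, xi>
res_dual = linprog(c=y, A_ub=np.vstack([A.T, -A.T]), b_ub=np.ones(2 * N),
                   bounds=(None, None), method="highs")

print(res_primal.fun, -res_dual.fun)         # equal up to solver tolerance (strong duality)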

A conic optimization problem (or conic program) is of the form

min_{x∈RN} F0(x) subject to x ∈ K,  (B.34)

and F_ℓ(x) ≤ b_ℓ, ℓ ∈ [M],
where K is a convex cone and the F_ℓ are convex functions. If K is a second-order cone, see (B.1) (possibly in a subset of variables, or the intersection of second-order cones in different variables), then the above problem is called a second-order cone problem. If K is the cone of positive semidefinite matrices, then the above optimization problem is called a semidefinite program. Conic programs have their duality theory, too. The Lagrange function of a conic program of the form (B.34) is defined, for x ∈ RN, ξ ∈ K*, and ν ∈ R^M with ν_ℓ ≥ 0 for all ℓ ∈ [M], by

L(x, ξ, ν) := F0(x) − ⟨x, ξ⟩ + Σ_{ℓ=1}^{M} ν_ℓ (F_ℓ(x) − b_ℓ),
where K* is the dual cone of K defined in (B.2). (If there are no inequality constraints, then the last term above is omitted, of course.) The Lagrange dual function is then defined as

H(ξ, ν) := min_{x∈RN} L(x, ξ, ν), ξ ∈ K*, ν ∈ R^M, ν ⪰ 0.

Similarly to (B.26), the minimizer x♯ of (B.34) satisfies the lower bound

H(ξ, ν) ≤ F0(x♯), for all ξ ∈ K*, ν ∈ R^M, ν ⪰ 0.  (B.35)

Indeed, if x ∈ K and F_ℓ(x) ≤ b_ℓ for all ℓ ∈ [M], then ⟨x, ξ⟩ ≥ 0 for all ξ ∈ K* by definition (B.2) of the dual cone, and with ν ⪰ 0,

−⟨x, ξ⟩ + Σ_{ℓ=1}^{M} ν_ℓ (F_ℓ(x) − b_ℓ) ≤ 0.

Therefore,
L(x, ξ, ν) = F0(x) − ⟨x, ξ⟩ + Σ_{ℓ=1}^{M} ν_ℓ (F_ℓ(x) − b_ℓ) ≤ F0(x).
This establishes (B.35). The dual program of (B.34) is then defined as

max_{ξ, ν} H(ξ, ν) subject to ξ ∈ K* and ν ⪰ 0.  (B.36)

Denoting by (ξ♯, ν♯) a dual optimum, that is, a maximizer of the program (B.36), and by x♯ a minimizer of the primal program (B.34), the triple (x♯, ξ♯, ν♯) is called a primal–dual optimum. The above arguments establish the weak duality inequality

H(ξ♯, ν♯) ≤ F0(x♯).  (B.37)

If this is an equality, then we say that strong duality holds. Similar conditions to Slater's constraint qualification (Theorem B.26) ensure strong duality for conic programs—for instance, if there exists a point in the interior of K such that all inequality constraints hold strictly; see, e.g., [70, Sect. 5.9]. Let us illustrate duality for conic programs with an example relevant to Sect. 9.2. For a convex cone K and a vector g ∈ RN, we consider the optimization problem
min_{x∈RN} ⟨x, g⟩ subject to x ∈ K and ‖x‖₂² ≤ 1.

Its Lagrange function is given by

L(x, ξ, ν) = ⟨x, g⟩ − ⟨ξ, x⟩ + ν(‖x‖₂² − 1), ξ ∈ K*, ν ≥ 0.

The minimum with respect to x of the Lagrange function is attained at x = (2ν)^{−1}(ξ − g). By substituting this value into L, the Lagrange dual function turns out to be
H(ξ, ν) = −(1/(4ν))‖g − ξ‖₂² − ν.
This leads to the dual program
max_{ξ, ν} −ν − (1/(4ν))‖g − ξ‖₂² subject to ξ ∈ K* and ν ≥ 0.

Solving this optimization program with respect to ν gives ν = (1/2)‖g − ξ‖₂, so we arrive at the dual program

max_{ξ} −‖g − ξ‖₂ subject to ξ ∈ K*.  (B.38)

Note that the maximizer is the orthogonal projection of g onto the dual cone K*, which always exists since K* is convex and closed. Weak duality for this case reads
max_{ξ∈K*} −‖g − ξ‖₂ ≤ min_{x∈K, ‖x‖₂≤1} ⟨g, x⟩, or

max_{x∈K, ‖x‖₂≤1} −⟨g, x⟩ ≤ min_{ξ∈K*} ‖g − ξ‖₂.  (B.39)

In fact, strong duality—that is, equality above—often holds, for instance, if K has nonempty interior. Note that the inequality (B.39)for−g instead of g can be rewritten in terms of the polar cone K◦ = −K∗ introduced in (B.3)as

max_{x∈K, ‖x‖₂≤1} ⟨g, x⟩ ≤ min_{ξ∈K◦} ‖g − ξ‖₂.  (B.40)
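For a concrete cone, both sides of (B.39) and (B.40) can be computed in closed form. The Python sketch below (our own example; the choice K = R^N_+ is not from the text) uses the nonnegative orthant, which is self-dual, so K* = R^N_+ and K◦ = −R^N_+, and both sides of (B.40) reduce to the Euclidean norm of the positive part of g:

import numpy as np

rng = np.random.default_rng(2)
g = rng.standard_normal(12)

# Left-hand side of (B.40): max <g, x> over x in K = R^N_+ with ||x||_2 <= 1,
# attained at x = g_plus / ||g_plus||_2, where g_plus is the positive part of g
g_plus = np.maximum(g, 0.0)
lhs = np.linalg.norm(g_plus)

# Right-hand side of (B.40): min ||g - xi||_2 over xi in the polar cone K° = -R^N_+,
# attained at the entrywise projection xi = min(g, 0)
rhs = np.linalg.norm(g - np.minimum(g, 0.0))

print(lhs, rhs)   # the two values coincide: strong duality holds for this cone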

Lagrange duality has a saddle-point interpretation. For ease of exposition, we consider (B.20) without inequality constraints, but extensions that include inequality constraints or conic programs are derived in the same way. Let (x, ξ) be a primal–dual optimal point. Recalling the definition of the Lagrange function L,wehave

sup_{ξ∈R^m} L(x, ξ) = sup_{ξ∈R^m} { F0(x) + ⟨ξ, Ax − y⟩ } = F0(x) if Ax = y, and ∞ otherwise.  (B.41)

In other words, the above supremum is infinite if x is not feasible. A (feasible) minimizer x of the primal problem (B.20) therefore satisfies

F0(x♯) = inf_{x∈RN} sup_{ξ∈R^m} L(x, ξ).

On the other hand, a dual optimal vector ξ♯ satisfies

H(ξ♯) = sup_{ξ∈R^m} inf_{x∈RN} L(x, ξ)
by definition of the Lagrange dual function. Thus, weak duality implies

sup_{ξ∈R^m} inf_{x∈RN} L(x, ξ) ≤ inf_{x∈RN} sup_{ξ∈R^m} L(x, ξ)

(which can be derived directly), while strong duality reads

sup_{ξ∈R^m} inf_{x∈RN} L(x, ξ) = inf_{x∈RN} sup_{ξ∈R^m} L(x, ξ).

In other words, the order of minimization and maximization can be interchanged in the case of strong duality. This property is called the strong max–min property or saddle-point property. Indeed, in this case, a primal–dual optimal (x, ξ) is a saddle point of the Lagrange function, that is,

L(x♯, ξ) ≤ L(x♯, ξ♯) ≤ L(x, ξ♯) for all x ∈ RN, ξ ∈ R^m.  (B.42)

Jointly optimizing the primal and dual problem is therefore equivalent to finding a saddle point of the Lagrange function provided that strong duality holds. Based on these findings, we establish below a theorem relating some convex optimization problems relevant to this book. The theorem also holds in the complex setting by interpreting CN as R2N . Theorem B.28. Let ·be a norm on Rm and ||| · ||| be a norm on RN .For A ∈ Rm×N , y ∈ Rm, and τ>0, consider the optimization problem

min_{x∈RN} ‖Ax − y‖ subject to |||x||| ≤ τ.  (B.43)

If x♯ is a minimizer of (B.43), then there exists a parameter λ ≥ 0 such that x♯ is also a minimizer of the optimization problem

min_{x∈RN} λ|||x||| + ‖Ax − y‖².  (B.44)

Conversely, for λ ≥ 0, if x♯ is a minimizer of (B.44), then there exists τ ≥ 0 such that x♯ is a minimizer of (B.43).
Proof. The optimization problem (B.43) is equivalent to

min_{x∈RN} ‖Ax − y‖² subject to |||x||| ≤ τ.

The Lagrange function of this minimization problem is given by

L(x, ξ) = ‖Ax − y‖² + ξ(|||x||| − τ).

Since τ > 0, there exist vectors x with |||x||| < τ, so that Theorem B.26 implies strong duality for (B.43). Therefore, there exists a dual optimal ξ♯ ≥ 0. The saddle-point property (B.42) implies that L(x♯, ξ♯) ≤ L(x, ξ♯) for all x ∈ RN. Therefore, x♯ is also a minimizer of x ↦ L(x, ξ♯). Since the constant term −ξ♯τ does not affect the minimizer, the conclusion follows with λ = ξ♯.

For the converse statement, one can proceed directly as in the proof of Proposition 3.2(a).
Remark B.29. (a) The same type of statement and proof is valid for the pair of optimization problems (B.44) and

min_{x∈RN} |||x||| subject to ‖Ax − y‖ ≤ η  (B.45)

under a strict feasibility assumption for (B.45). (b) The parameter transformations between τ, λ, η depend on the minimizers and can only be performed after solving the optimization problems. Thus, the theoretical equivalence of the problems (B.43)–(B.45) is too implicit for practical purposes. For the remainder of this section, we consider a convex optimization problem of the form

min_{x∈RN} F(Ax) + G(x),  (B.46)
where A ∈ R^{m×N} and where F : R^m → (−∞, ∞], G : RN → (−∞, ∞] are convex functions. All the relevant optimization problems appearing in this book belong to this class; see Sect. 15.2. For instance, the choice G(x) = ‖x‖₁ and F = χ_{{y}}, the characteristic function (B.7) of the singleton {y}, leads to the ℓ₁-minimization problem (B.31). The substitution z = Ax yields the equivalent problem

min_{x∈RN, z∈R^m} F(z) + G(x) subject to Ax − z = 0.  (B.47)

The Lagrange dual function to this problem is given by

H(ξ) = inf_{x,z} { F(z) + G(x) + ⟨A*ξ, x⟩ − ⟨ξ, z⟩ }
 = −sup_{z∈R^m} { ⟨ξ, z⟩ − F(z) } − sup_{x∈RN} { ⟨x, −A*ξ⟩ − G(x) }
 = −F*(ξ) − G*(−A*ξ),  (B.48)
where F* and G* are the convex conjugate functions of F and G, respectively. Therefore, the dual problem of (B.46) is
max_{ξ∈R^m} { −F*(ξ) − G*(−A*ξ) }.  (B.49)

Since the minimal values of (B.46) and (B.47) coincide, we refer to (B.49) also as the dual problem of (B.46)—although strictly speaking (B.49) is not the dual to (B.46) in the sense described above. (Indeed, an unconstrained optimization problem does not introduce dual variables in the Lagrange function. In general, equivalent problems may have nonequivalent duals.) The following theorem states strong duality of the problems (B.46) and (B.49).
Theorem B.30. Let A ∈ R^{m×N} and F : R^m → (−∞, ∞], G : RN → (−∞, ∞] be proper convex functions with either dom(F) = R^m or dom(G) = RN and such that there exists x ∈ RN such that Ax ∈ dom(F). Assume that the optima in (B.46) and (B.49) are attained. Then strong duality holds in the form
min_{x∈RN} { F(Ax) + G(x) } = max_{ξ∈R^m} { −F*(ξ) − G*(−A*ξ) }.

Furthermore, a primal–dual optimum (x, ξ) is a solution to the saddle-point problem

min_{x∈RN} max_{ξ∈R^m} ⟨Ax, ξ⟩ + G(x) − F*(ξ),  (B.50)
where F* is the convex conjugate of F.
Proof. The first statement follows from the Fenchel duality theorem; see, e.g., [424, Theorem 31.1]. Strong duality implies the saddle-point property (B.42) of the Lagrange function. By (B.48), the value of the Lagrange function at the primal–dual optimal point is the optimal value of the min–max problem, i.e.,

min_{x∈RN, z∈R^m} max_{ξ∈R^m} F(z) + G(x) + ⟨A*ξ, x⟩ − ⟨ξ, z⟩
 = min_{x∈RN} max_{ξ∈R^m} { −max_{z∈R^m} ( ⟨ξ, z⟩ − F(z) ) + ⟨A*ξ, x⟩ + G(x) }
 = min_{x∈RN} max_{ξ∈R^m} ⟨Ax, ξ⟩ + G(x) − F*(ξ)
by definition of the convex conjugate function. The interchange of the minimum and maximum above is justified by the fact that if ((x♯, z♯), ξ♯) is a saddle point of L((x, z), ξ), then (x♯, ξ♯) is a saddle point of H(x, ξ) = min_z L((x, z), ξ).
The condition dom(F) = R^m or dom(G) = RN above may be relaxed; see, e.g., [424, Theorem 31.1].

B.6 Matrix Convexity

This section uses the notion of matrix functions introduced in Sect. A.5, in particular the matrix exponential and the matrix logarithm. The main goal is to show the following concavity theorem due to Lieb [325], which is a key ingredient in the proof of the noncommutative Bernstein inequality in Sect. 8.5. B.6 Matrix Convexity 565

Theorem B.31. For a self-adjoint matrix H, the function

X ↦ tr exp(H + ln(X))
is concave on the set of positive definite matrices.
While the original proof [325] and its variants [187, 439, 440] rely on complex analysis, we proceed as in [485]; see also [50]. This requires us to introduce some background from matrix convexity and some concepts from quantum information theory. Given a function f : I → R defined on an interval I ⊂ R, we recall that f is extended to self-adjoint matrices A with eigenvalues contained in I by (A.42). Similarly to the definition (A.49) of operator monotonicity, we say that f is matrix convex (or operator convex) if, for all n ≥ 1, all self-adjoint matrices A, B ∈ C^{n×n} with eigenvalues in I, and all t ∈ [0, 1],

f(tA + (1 − t)B) ⪯ t f(A) + (1 − t) f(B).  (B.51)

Equivalently, f is matrix convex if, for all n ≥ 1 and all x ∈ C^n, the scalar-valued function A ↦ ⟨f(A)x, x⟩ is convex on the set of self-adjoint matrices in C^{n×n} with eigenvalues in I. Like matrix monotonicity, matrix convexity is a much stronger property than the usual scalar convexity. We start with a simple characterization in terms of orthogonal projections, i.e., of self-adjoint matrices P that satisfy P² = P. Here and in the following, when matrix dimensions are not specified, they are arbitrary, but the matrices are assumed to have matching dimensions so that matrix multiplication is well defined.
Theorem B.32. Let I ⊂ R be an interval containing 0 and let f : I → R. Then f is matrix convex and f(0) ≤ 0 if and only if f(PAP) ⪯ P f(A) P for all orthogonal projections P and all self-adjoint matrices A with eigenvalues in I.
Proof. We prove matrix convexity based on the given condition, since only this direction is needed later. A proof of the converse direction and of further equivalences can be found in [47, Theorem V.2.3] or [258, Theorem 2.1]. Let A, B be self-adjoint matrices with eigenvalues in I and let t ∈ [0, 1]. Define T, P, V_t to be the block matrices
T = [ A  0 ; 0  B ],  P = [ Id  0 ; 0  0 ],  V_t = [ √t Id  −√(1−t) Id ; √(1−t) Id  √t Id ].

The matrix V_t is unitary and the matrix P is an orthogonal projection. We observe that
P V_t* T V_t P = [ tA + (1−t)B  0 ; 0  0 ].

Therefore, by (A.44) and by the hypothesis, we have
[ f(tA + (1−t)B)  0 ; 0  f(0) ] = f(P V_t* T V_t P) ⪯ P f(V_t* T V_t) P
 = P V_t* f(T) V_t P = [ t f(A) + (1−t) f(B)  0 ; 0  0 ].

The first equality in the second line is valid since Vt is unitary. This shows that f is matrix convex and that f(0) ≤ 0.  Theorem B.33. Let f be a continuous function on [0, ∞).Thenf is matrix convex and f(0) ≤ 0 if and only if the function g(t)=f(t)/t is matrix monotone on (0, ∞). Proof. We prove that f is matrix convex with f(0) ≤ 0 if g is matrix monotone, since only this direction is needed later. For the converse direction, we refer to [47, Theorem V.2.9] or [258, Theorem 2.4]. First, we note that the (scalar) monotonicity of f(t)/t on (0, ∞) implies that f(t)/t ≤ f(1)/1, i.e., f(t) ≤ tf(1),forall0

Multiplying on the left by (P + εId)A1/2 and on the right by A1/2(P + εId) and using Lemma A.31, we arrive at

(P + εId)A1/2f(A1/2(P + εId)A1/2)A−1/2  (P + εId)A1/2f((1 + ε)A)(1 + ε)−1A−1/2(P + εId).

By continuity of f, letting ε → 0 gives

P A^{1/2} f(A^{1/2} P A^{1/2}) A^{−1/2} ⪯ P A^{1/2} f(A) A^{−1/2} P.  (B.52)

The right-hand side of (B.52) equals Pf(A)P because A1/2 and f(A) commute. Thus, it remains to show that the left-hand side of (B.52) equals f(PAP).Suchan identity holds for monomials h(t)=tn,since B.6 Matrix Convexity 567

P A^{1/2} h(A^{1/2} P A^{1/2}) A^{−1/2} = P A^{1/2} (A^{1/2} P A^{1/2})(A^{1/2} P A^{1/2}) ··· (A^{1/2} P A^{1/2}) A^{−1/2}
 = P A P A ··· P A P = (PAP)(PAP) ··· (PAP) = h(PAP).

By linearity, the identity PA 1/2h(A1/2PA 1/2)A−1/2 = h(PAP) holds for all polynomials h. We now consider an interpolating polynomial h satisfying h(λ)=f(λ) for all λ in the set of eigenvalues of A1/2PA 1/2—which equals the set of eigenvalues of PAP since A1/2PA 1/2 =(A1/2P)(PA 1/2) and PAP =(PA 1/2)(A1/2P). This yields h(A1/2PA 1/2)=f(A1/2PA 1/2) and h(PAP)=f(PAP) by the definition (A.42) of matrix functions. It follows that PA 1/2f(A1/2PA 1/2)A−1/2 = f(PAP), as desired. By continuity of f, the relation f(PAP)  Pf(A)P extends to all self-adjoint matrices A with eigenvalues in [0, ∞). The claim follows therefore from Theorem B.32.  We are particularly interested in the following special case. Corollary B.34. The continuous function defined by φ(x)=x ln(x) for x>0 and by φ(0) = 0 is matrix convex on [0, ∞). Proof. Combine Proposition A.34 with Theorem B.33. Next we state the affine version of the Hansen–Pedersen–Jensen inequality.

Theorem B.35. Let f be a matrix convex function on some interval I ⊂ R and let X_1, ..., X_n ∈ C^{d×d} be square matrices such that Σ_{j=1}^{n} X_j* X_j = Id. Then, for all self-adjoint matrices A_1, ..., A_n ∈ C^{d×d} with eigenvalues in I,
f( Σ_{j=1}^{n} X_j* A_j X_j ) ⪯ Σ_{j=1}^{n} X_j* f(A_j) X_j.  (B.53)

There is a converse to this theorem, in the sense that if (B.53) holds for arbitrary choices of A_j and X_j, then f is matrix convex; see [259, Theorem 2.1]. The proof requires the following auxiliary lemma.
Lemma B.36. Let B_{j,ℓ} ∈ C^{m×m}, j, ℓ ∈ [n], be a double sequence of square matrices, and let B be the block matrix B = (B_{j,ℓ}) ∈ C^{mn×mn}. With ω = e^{2πi/n}, let E ∈ C^{mn×mn} be the unitary block-diagonal matrix E = diag[ωId, ..., ω^n Id]. Then

(1/n) Σ_{k=1}^{n} E^{−k} B E^k = diag[B_{1,1}, B_{2,2}, ..., B_{n,n}].
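Before the proof, the averaging identity of Lemma B.36 is easy to confirm numerically; the Python sketch below (our own check, with the block size m and block count n chosen arbitrarily) compares the left-hand side with the block-diagonal part of B:

import numpy as np

m, n = 2, 5
rng = np.random.default_rng(3)
B = rng.standard_normal((m * n, m * n)) + 1j * rng.standard_normal((m * n, m * n))
omega = np.exp(2j * np.pi / n)

avg = np.zeros_like(B)
for k in range(1, n + 1):
    # E^k = diag[omega^k Id, ..., omega^{nk} Id]; E is unitary and diagonal, so E^{-k} = conj(E^k)
    Ek = np.diag(np.repeat(omega ** (k * np.arange(1, n + 1)), m))
    avg += Ek.conj() @ B @ Ek
avg /= n

block_diag = np.zeros_like(B)
for j in range(n):
    block_diag[j*m:(j+1)*m, j*m:(j+1)*m] = B[j*m:(j+1)*m, j*m:(j+1)*m]

assert np.allclose(avg, block_diag)
print("Lemma B.36 verified numerically")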

Proof. A direct computation shows that

(E^{−k} B E^k)_{j,ℓ} = ω^{k(ℓ−j)} B_{j,ℓ}.

The formula for geometric sums implies, for j, ℓ ∈ [n],
Σ_{k=1}^{n} ω^{k(ℓ−j)} = Σ_{k=1}^{n} e^{2πi(ℓ−j)k/n} = n if ℓ = j, and 0 otherwise.

The claim follows directly.
Proof (of Theorem B.35). Define the block matrices
X = [ X_1 ; X_2 ; ... ; X_n ] ∈ C^{nd×d}  and  U = [ Id − XX*  X ; −X*  0 ] ∈ C^{(n+1)d×(n+1)d}.

We observe that X*X = Id and in turn that
U U* = [ (Id − XX*)² + XX*   −(Id − XX*)X ; −X*(Id − XX*)   X*X ] = Id.

Similarly, we compute U*U = Id. Therefore, U is a unitary matrix (called the unitary dilation of X). We partition U into d × d blocks (U_{j,ℓ})_{j,ℓ∈[n+1]}. Note that U_{k,n+1} = X_k for k ∈ [n] and that U_{n+1,n+1} = 0. Furthermore, let A be the block-diagonal matrix A = diag[A_1, A_2, ..., A_n, 0]. Using Lemma B.36 with n replaced by n + 1 together with the matrix convexity of f, we obtain
f( Σ_{j=1}^{n} X_j* A_j X_j ) = f( (U*AU)_{n+1,n+1} )
 = f( ( (1/(n+1)) Σ_{k=1}^{n+1} E^{−k} U*AU E^k )_{n+1,n+1} )
 = ( f( (1/(n+1)) Σ_{k=1}^{n+1} E^{−k} U*AU E^k ) )_{n+1,n+1}
 ⪯ ( (1/(n+1)) Σ_{k=1}^{n+1} f( E^{−k} U*AU E^k ) )_{n+1,n+1}.

The third equality is due to the fact that the matrix in the argument of f in the second line is block-diagonal by Lemma B.36. Since E and U are unitary, the definition of f on self-adjoint matrices implies that the last term equals
( (1/(n+1)) Σ_{k=1}^{n+1} E^{−k} U* f(A) U E^k )_{n+1,n+1} = (U* f(A) U)_{n+1,n+1} = Σ_{j=1}^{n} X_j* f(A_j) X_j.

This completes the proof.  Our next tool is the perspective. In the scalar case, for a convex function f defined on some convex set K ⊂ Rn,itisgivenbyg(x, t)=tf(x/t) whenever t>0 and x/t ∈ K. It is straightforward to verify that g is jointly convex in (x, t),i.e.,that g is a convex function in the variable y =(x, t). As an important example, the perspective of the convex function f(x)=x ln x, x ≥ 0, is the jointly convex function

g(x, t)=x ln x − x ln t, (B.54) which is known as relative entropy. Now for a matrix convex function f :(0, ∞) → R, we define its perspective on positive definite matrices A, B via

g(A, B) = B^{1/2} f(B^{−1/2} A B^{−1/2}) B^{1/2}.  (B.55)

By the next theorem of Effros [176], it is jointly matrix convex in (A, B). Theorem B.37. For a matrix convex function f :(0, ∞) → R, the perspective g defined by (B.55) is jointly matrix convex in the sense that, for all positive definite A1, A2, B1, B2 of matching dimension and all t ∈ [0, 1],

g(tA_1 + (1−t)A_2, tB_1 + (1−t)B_2) ⪯ t g(A_1, B_1) + (1−t) g(A_2, B_2).

Proof. Let A := tA_1 + (1−t)A_2 and B := tB_1 + (1−t)B_2. The matrices X_1 := (tB_1)^{1/2} B^{−1/2} and X_2 := ((1−t)B_2)^{1/2} B^{−1/2} satisfy

X_1* X_1 + X_2* X_2 = t B^{−1/2} B_1 B^{−1/2} + (1−t) B^{−1/2} B_2 B^{−1/2} = Id.

Theorem B.35 together with Lemma A.31 then implies that
g(A, B) = B^{1/2} f( B^{−1/2} A B^{−1/2} ) B^{1/2}
 = B^{1/2} f( X_1* B_1^{−1/2} A_1 B_1^{−1/2} X_1 + X_2* B_2^{−1/2} A_2 B_2^{−1/2} X_2 ) B^{1/2}
 ⪯ B^{1/2} ( X_1* f(B_1^{−1/2} A_1 B_1^{−1/2}) X_1 + X_2* f(B_2^{−1/2} A_2 B_2^{−1/2}) X_2 ) B^{1/2}
 = t B_1^{1/2} f(B_1^{−1/2} A_1 B_1^{−1/2}) B_1^{1/2} + (1−t) B_2^{1/2} f(B_2^{−1/2} A_2 B_2^{−1/2}) B_2^{1/2}

= tg(A1, B1)+(1− t)g(A2, B2).

This concludes the proof.  Next we introduce a concept from quantum information theory [372, 383]. Definition B.38. For two positive definite matrices A and B,thequantum relative entropy is defined as

D(A, B):=tr(A ln A − A ln B − (A − B)).

If A and B are scalars, then the above definition reduces to the scalar relative entropy (B.54) (up to the term A − B). The quantum relative entropy is nonnegative, a fact that is also known as Klein’s inequality. Theorem B.39. If A and B are positive definite matrices, then

D(A, B) ≥ 0.

Proof. The scalar function φ(x) = x ln x, x > 0, is convex (and even matrix convex by Corollary B.34). It follows from Proposition B.11 that
x ln x = φ(x) ≥ φ(y) + φ′(y)(x − y) = y ln y + (1 + ln y)(x − y) = x ln y + (x − y),
so that x ln x − x ln y − (x − y) ≥ 0. Theorem A.30 shows that D(A, B) ≥ 0.
As a consequence, we obtain a variational formula for the trace.
Corollary B.40. If B is a positive definite matrix, then

tr B = max_{A≻0} tr(A ln B − A ln A + A).

Proof. By definition of the quantum relative entropy and Theorem B.39,

tr B ≥ tr (A ln B − A ln A + A).

Choosing A = B yields equality above and establishes the claim.

Generalizing the matrix convexity of the standard relative entropy (B.54)(or Kullback–Leibler divergence), the quantum relative entropy is also jointly matrix convex. This fact goes back to Lindblad [327]; see also [403,492]. Our proof based on the perspective was proposed by Effros in [176]. Theorem B.41. The quantum relative entropy D is jointly convex on pairs of positive definite matrices. Proof. Let A, B ∈ Cn×n be positive definite matrices. To these matrices, we associate operators acting on the space Cn×n endowed with the Frobenius inner ∗ product A, BF =tr(AB );see(A.15). Precisely, we set

L_A X := AX and R_B X := XB, X ∈ C^{n×n}.

By associativity of matrix multiplication, the operators L_A and R_B commute, and by positivity of A and B, they are positive operators. Indeed, ⟨L_A(X), X⟩_F = tr(AXX*) = ‖A^{1/2}X‖_F² > 0 for all nonzero X ∈ C^{n×n}. The function φ(x) = x ln x, x > 0, is operator convex by Corollary B.34, and its perspective g is given by

g(L_A, R_B) = R_B φ(R_B^{−1} L_A) = R_B (R_B^{−1} L_A) ln(R_B^{−1} L_A)

= LA(ln LA − ln RB), where these steps hold because LA and RB commute. By joint matrix convexity of the perspective (Theorem B.37), the scalar-valued function

h(A, B) := ⟨g(L_A, R_B)(Id), Id⟩_F
is jointly convex in (A, B). Furthermore, f(L_A)(Id) = f(A) and f(R_B)(Id) = f(B) for any function f. Indeed, these relations are easily verified for monomials f(t) = t^n, then for polynomials, and finally for any f by interpolation. Therefore, the function h takes the form

h(A, B) = ⟨g(L_A, R_B)(Id), Id⟩_F = tr( g(L_A, R_B)(Id) )

=tr(LA(ln LA − ln RB)(Id)) = tr (A(ln A − ln B)).

We conclude that

D(A, B) = h(A, B) − tr(A − B)
is jointly convex in (A, B).
We are finally in the position to prove Lieb's concavity theorem.
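Before turning to the proof, the statement of Theorem B.31 can be sanity-checked numerically. The Python sketch below (our own addition, relying on scipy.linalg.expm and logm; the dimension, the number of trials, and the tolerance are arbitrary choices) tests the concavity inequality on random positive definite matrices:

import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(4)
d = 5
M = rng.standard_normal((d, d)); H = (M + M.T) / 2          # random self-adjoint H

def rand_pd():
    # random, well-conditioned positive definite matrix
    M = rng.standard_normal((d, d))
    return M @ M.T + d * np.eye(d)

def f(X):
    return np.trace(expm(H + logm(X))).real                  # X -> tr exp(H + ln X)

for _ in range(100):
    X1, X2, t = rand_pd(), rand_pd(), rng.uniform()
    assert f(t * X1 + (1 - t) * X2) >= t * f(X1) + (1 - t) * f(X2) - 1e-6
print("concavity of X -> tr exp(H + ln X) confirmed on random instances")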

Proof (of Theorem B.31). Setting B =exp(H +lnX) in Corollary B.40 yields

tr exp(H + ln X) = max_{A≻0} tr( A(H + ln X) − A ln A + A )
 = max_{A≻0} [ tr(AH) + tr X − D(A, X) ].

For each self-adjoint matrix H, Theorem B.41 guarantees that the term in square brackets is a jointly concave function of the self-adjoint matrices X and A. According to Theorem B.15, partial maximization of a jointly concave function gives rise to a concave function; hence, X ↦ tr exp(H + ln X) is concave on the set of positive definite matrices.
Appendix C Miscellanea

C.1 Fourier Analysis

This section recalls some simple facts from Fourier analysis. We cover the finite- dimensional analog of the Shannon sampling theorem mentioned in Sect. 1.2 as well as basic facts on the Fourier matrix and the fast Fourier transform (FFT). More background on Fourier and harmonic analysis can be found in various books on the subject including [39, 198, 236, 271, 272, 390, 400, 452, 505].

Finite-Dimensional Sampling Theorem

We consider trigonometric polynomials of degree at most M, that is, functions of the form

f(t) = Σ_{k=−M}^{M} c_k e^{2πikt}, t ∈ [0, 1].  (C.1)

The numbers c_k are called Fourier coefficients. They are given in terms of f by
c_k = ∫_0^1 f(t) e^{−2πikt} dt.

The Dirichlet kernel is defined as
D_M(t) := Σ_{k=−M}^{M} e^{2πikt} = sin(π(2M+1)t)/sin(πt) for t ≠ 0, and D_M(0) = 2M + 1.


The expression for t ≠ 0 is derived from the geometric sum identity followed by a simplification. The finite-dimensional version of the Shannon sampling theorem reads as follows.
Theorem C.1. Let f be a trigonometric polynomial of degree at most M. Then, for all t ∈ [0, 1],
f(t) = (1/(2M+1)) Σ_{j=0}^{2M} f( j/(2M+1) ) D_M( t − j/(2M+1) ).  (C.2)

Proof. Let f(t) = Σ_{k=−M}^{M} c_k e^{2πikt}. We evaluate the expression on the right-hand side of (C.2) multiplied by 2M + 1 as
Σ_{j=0}^{2M} f( j/(2M+1) ) D_M( t − j/(2M+1) )
 = Σ_{j=0}^{2M} Σ_{k=−M}^{M} Σ_{ℓ=−M}^{M} c_k e^{2πikj/(2M+1)} e^{2πiℓ(t − j/(2M+1))}
 = Σ_{k=−M}^{M} Σ_{ℓ=−M}^{M} c_k ( Σ_{j=0}^{2M} e^{2πi(k−ℓ)j/(2M+1)} ) e^{2πiℓt}.

The identity Σ_{j=0}^{2M} e^{2πi(k−ℓ)j/(2M+1)} = (2M+1) δ_{k,ℓ} completes the proof.
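The interpolation formula (C.2) is straightforward to verify numerically. The following Python sketch (our own check; the degree M and the random coefficients are arbitrary) evaluates both sides at a few random points:

import numpy as np

rng = np.random.default_rng(5)
M = 6
n = 2 * M + 1
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)     # coefficients c_{-M}, ..., c_M
k = np.arange(-M, M + 1)

def f(t):
    return np.sum(c * np.exp(2j * np.pi * k * t))             # trigonometric polynomial (C.1)

def dirichlet(t):
    return np.sum(np.exp(2j * np.pi * k * t))                 # Dirichlet kernel D_M

for t in rng.uniform(0.0, 1.0, 5):
    interp = sum(f(j / n) * dirichlet(t - j / n) for j in range(n)) / n
    assert abs(interp - f(t)) < 1e-9
print("sampling formula (C.2) verified")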

The Fast Fourier Transform

The Fourier matrix F ∈ CN×N has entries

F_{j,k} = (1/√N) e^{2πi(j−1)(k−1)/N}, j, k ∈ [N].  (C.3)

The application of F to a vector x ∈ CN is called the Fourier transform of x and denoted by

xˆ = Fx.

Intuitively, the coefficient x̂_j reflects the frequency content of x corresponding to the monomial k ↦ e^{2πi(j−1)(k−1)/N}. The Fourier transform arises, for instance, when evaluating a trigonometric polynomial of the form (C.1) at the points j/(2M+1), j = −M, ..., M.

The Fourier matrix is unitary, i.e., F*F = Id, so that F^{−1} = F*; see (12.1). This reflects the fact that its columns form an orthonormal basis of CN. A naive implementation of the Fourier transform requires O(N²) operations. The fast Fourier transform (FFT) is an algorithm that evaluates the Fourier transform much quicker, namely, in O(N ln N) operations. This is what makes the FFT one of the most widely used algorithms. Many devices of modern technology would not work without it. Let us give the main idea of the FFT algorithm. Assume that N is even. Then the Fourier transform of x ∈ CN has entries

x̂_j = (1/√N) Σ_{k=1}^{N} x_k e^{2πi(j−1)(k−1)/N}
 = (1/√N) ( Σ_{ℓ=1}^{N/2} x_{2ℓ} e^{2πi(j−1)(2ℓ−1)/N} + Σ_{ℓ=1}^{N/2} x_{2ℓ−1} e^{2πi(j−1)(2ℓ−2)/N} )
 = (1/√N) ( e^{2πi(j−1)/N} Σ_{ℓ=1}^{N/2} x_{2ℓ} e^{2πi(j−1)(ℓ−1)/(N/2)} + Σ_{ℓ=1}^{N/2} x_{2ℓ−1} e^{2πi(j−1)(ℓ−1)/(N/2)} ).

We have basically reduced the evaluation of x̂ ∈ CN to the evaluation of two Fourier transforms in dimension N/2, namely, to the ones of (x_{2ℓ})_{ℓ=1}^{N/2} and of (x_{2ℓ−1})_{ℓ=1}^{N/2}. If N = 2^n, then in this way we can recursively reduce the evaluation of the Fourier transform to the ones of half dimension until we reach the dimension 2. This requires n recursion steps and altogether O(n 2^n) = O(N log N) algebraic operations. The resulting algorithm, named after Cooley and Tukey [130], is often simply called the fast Fourier transform. For other composite numbers N = pq similar reduction steps can be made. We refer to [494, 506] for details.
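A minimal recursive radix-2 implementation of this idea looks as follows (a Python sketch of our own; note that it follows the sign and normalization convention of numpy.fft.fft, i.e., exponent −2πijk/N and no 1/√N factor, which differs from the Fourier matrix (C.3) only by a conjugation and a rescaling):

import numpy as np

def fft_recursive(x):
    # Radix-2 Cooley-Tukey FFT for len(x) a power of 2 (unnormalized, negative exponent)
    N = len(x)
    if N == 1:
        return x.astype(complex)
    even = fft_recursive(x[0::2])      # transform of (x_1, x_3, ...) in 1-based indexing
    odd = fft_recursive(x[1::2])       # transform of (x_2, x_4, ...) in 1-based indexing
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.default_rng(6).standard_normal(256)
assert np.allclose(fft_recursive(x), np.fft.fft(x))
print("recursive FFT matches numpy.fft.fft")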

C.2 Covering Numbers

Let T be a subset of a metric space (X, d). For t > 0, the covering number N(T, d, t) is defined as the smallest integer N such that T can be covered with balls B(x_ℓ, t) = {x ∈ X : d(x, x_ℓ) ≤ t}, x_ℓ ∈ T, ℓ ∈ [N], i.e.,

T ⊂ ∪_{ℓ=1}^{N} B(x_ℓ, t).

The set of points {x_1, ..., x_N} is called a t-covering. (Note that some authors only require x_ℓ ∈ X, so the points are not necessarily elements of T.) The packing number P(T, d, t) is defined, for t > 0, as the maximal integer P such that there are points x_ℓ ∈ T, ℓ ∈ [P], which are t-separated, i.e., d(x_ℓ, x_k) > t for all k, ℓ ∈ [P], k ≠ ℓ. If X = Rn is a normed space and the metric d is induced by the norm via d(u, v) = ‖u − v‖, we also write N(T, ‖·‖, t) and P(T, ‖·‖, t). Let us first state some elementary properties of the covering numbers. The packing numbers satisfy precisely the same properties. For arbitrary sets S, T ⊂ X,

N (S ∪ T,d,t) ≤N(S, d, t)+N (T,d,t). (C.4)

For any α>0,

N (T,αd,t)=N (T,d,t/α). (C.5)

If X = Rn and d is induced by a norm ·, then furthermore

N (αT,d,t)=N (T,d,t/α). (C.6)

Moreover, if d′ is another metric on X that satisfies d′(x, y) ≤ d(x, y) for all x, y ∈ T, then

N(T, d′, t) ≤ N(T, d, t).  (C.7)

The following simple relations between covering and packing numbers hold. Lemma C.2. Let T be a subset of a metric space (X, d) and let t>0.Then

P(T,d,2t) ≤N(T,d,t) ≤P(T,d,t).

Proof. Let {x_1, ..., x_P} be a 2t-separated set and {x′_1, ..., x′_N} be a t-covering. Then we can assign to each point x_j a point x′_ℓ with d(x′_ℓ, x_j) ≤ t. This assignment is unique since the points x_j are 2t-separated. Indeed, the assumption that two points x_j, x_k, j ≠ k, can be assigned the same point x′_ℓ would lead to a contradiction by the triangle inequality d(x_j, x_k) ≤ d(x_j, x′_ℓ) + d(x′_ℓ, x_k) ≤ 2t. It follows that P ≤ N.
Now let {x_1, ..., x_N} be a maximal t-packing. Then it is also a t-covering. Indeed, if there were a point x not covered by a ball B(x_ℓ, t), ℓ ∈ [N], then d(x, x_ℓ) > t for all ℓ ∈ [N]. This means that we could add x to the t-packing. But this would be a contradiction to the maximality.
The following proposition estimates in particular the packing number of a ball or a sphere in a finite-dimensional normed space.

Proposition C.3. Let ‖·‖ be some norm on Rn and let U be a subset of the unit ball B = {x ∈ Rn : ‖x‖ ≤ 1}. Then the packing and covering numbers satisfy, for t > 0,
N(U, ‖·‖, t) ≤ P(U, ‖·‖, t) ≤ (1 + 2/t)^n.  (C.8)

Proof. Lemma C.2 shows the first inequality. Let now {x_1, ..., x_P} ⊂ U be a maximal t-packing of U. Then the balls B(x_ℓ, t/2) do not intersect and they are contained in the ball (1 + t/2)B. By comparing volumes (that is, Lebesgue measures) of the involved balls, we obtain
vol( ∪_{ℓ=1}^{P} B(x_ℓ, t/2) ) = P vol( (t/2)B ) ≤ vol( (1 + t/2)B ).

On Rn the volume satisfies the homogeneity relation vol(tB) = t^n vol(B); hence, P (t/2)^n vol(B) ≤ (1 + t/2)^n vol(B), i.e., P ≤ (1 + 2/t)^n.

C.3 The Gamma Function and Stirling’s Formula

The Gamma function is defined for x>0 via

Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt.  (C.9)

It interpolates the factorial function in the sense that, for positive integers n,

Γ (n)=(n − 1)! . (C.10)

In fact, it follows from integration by parts that the Gamma function satisfies the equation

Γ(x + 1) = x Γ(x), x > 0.  (C.11)
Its value at the point 1/2 is given by Γ(1/2) = √π. Stirling's formula states that, for x > 0,
Γ(x) = √(2π) x^{x−1/2} e^{−x} exp( θ(x)/(12x) )  (C.12)
with 0 ≤ θ(x) ≤ 1. Using (C.10) and applying the formula (C.12) show that the factorial n! = nΓ(n) satisfies
n! = √(2πn) n^n e^{−n} e^{R_n},  (C.13)
with 0 ≤ R_n ≤ 1/(12n).
We also need a technical result about the quantity √2 Γ((m+1)/2)/Γ(m/2), which asymptotically behaves like √m as m → ∞.
Lemma C.4. For integers m ≥ s ≥ 1,

√2 Γ((m+1)/2)/Γ(m/2) − √2 Γ((s+1)/2)/Γ(s/2) ≥ √m − √s.

Proof. It is sufficient to show that, for any m ≥ 1,

α_{m+1} − α_m ≥ √(m+1) − √m, where α_m := √2 Γ((m+1)/2)/Γ(m/2).

It follows from Γ((m+2)/2) = (m/2) Γ(m/2) that α_{m+1} α_m = m. Thus, multiplying the prospective inequality α_{m+1} − α_m ≥ √(m+1) − √m by α_m and rearranging the terms, we need to prove that
α_m² + ( √(m+1) − √m ) α_m − m ≤ 0.

In other words, we need to prove that α_m does not exceed the positive root of the quadratic polynomial z² + (√(m+1) − √m) z − m, i.e., that
α_m ≤ β_m := [ −( √(m+1) − √m ) + √( (√(m+1) − √m)² + 4m ) ] / 2.

The first step consists in bounding α_m from above. To this end, we use Gauss' hypergeometric theorem; see, e.g., [17]. It states that if Re(c − a − b) > 0, then

Γ(c) Γ(c − a − b) / ( Γ(c − a) Γ(c − b) ) = ₂F₁(a, b; c; 1).

Here, 2F1 denotes Gauss’ hypergeometric function

₂F₁(a, b; c; z) := Σ_{n=0}^{∞} ( (a)_n (b)_n / (c)_n ) z^n / n!,

where (d)_n is the Pochhammer symbol defined by (d)_0 = 1 and (d)_n := d(d+1)···(d+n−1) for n ≥ 1. Therefore, choosing a = −1/2, b = −1/2, and c = (m−1)/2, we derive

α_m² = 2 Γ((m+1)/2)² / Γ(m/2)² = (m−1) Γ((m+1)/2) Γ((m−1)/2) / Γ(m/2)²
 = (m−1) Σ_{n=0}^{∞} [ (−1/2)(1/2)···(−1/2+n−1) ]² / [ ((m−1)/2)((m+1)/2)···((m+2n−3)/2) ] · (1/n!)
 = m − 1/2 + (1/8)·1/(m+1) + Σ_{n=3}^{∞} 2^{n−2} [ (1/2)···(n−3/2) ]² / [ n! (m+1)(m+3)···(m+2n−3) ].

We observe that the quantity γ_m := (m+1)( α_m² − m + 1/2 − 1/(8(m+1)) ) decreases with m. Thus, for an integer m_0 to be chosen later,

α_m² ≤ m − 1/2 + (1/8)·1/(m+1) + γ_{m_0}/(m+1), m ≥ m_0.  (C.14)

The next step consists in bounding β_m from below. We start by writing
β_m ≥ [ −( √(m+1) − √m ) + √(4m) ] / 2 = ( 3√m − √(m+1) ) / 2 =: δ_m.
Then we look for an expansion of δ_m² = ( 10m + 1 − 6√(m(m+1)) ) / 4. We have
√(m(m+1)) = (m+1) √( 1 − 1/(m+1) )
 = (m+1) ( 1 + Σ_{n=1}^{∞} ( (1/2)(−1/2)···(1/2−n+1) / n! ) ( −1/(m+1) )^n )
 = m + 1/2 − (1/8)·1/(m+1) − (1/2) Σ_{n=3}^{∞} ( (1/2)···(n−3/2) / n! ) ( 1/(m+1) )^{n−1}.

It now follows that

β_m² ≥ δ_m² = m − 1/2 + (3/16)·1/(m+1) + (3/4) Σ_{n=3}^{∞} ( (1/2)···(n−3/2) / n! ) ( 1/(m+1) )^{n−1}
 ≥ m − 1/2 + (3/16)·1/(m+1).  (C.15)

Subtracting (C.14) from (C.15), we obtain

β_m² − α_m² ≥ (1/16)·1/(m+1) − γ_{m_0}/(m+1), m ≥ m_0.
We choose m_0 to be the smallest integer such that γ_{m_0} ≤ 1/16, i.e., m_0 = 3, so that β_m ≥ α_m for all m ≥ 3. The numerical verification that β_m ≥ α_m for m = 1 and m = 2 is straightforward. This concludes the proof.
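Both Stirling's bounds (C.13) and the conclusion of Lemma C.4 are easy to check numerically for small parameters; here is a Python sketch (our own, using math.gamma; the tested ranges are arbitrary):

import math

# Stirling's formula (C.13): n! = sqrt(2*pi*n) n^n e^{-n} e^{R_n} with 0 <= R_n <= 1/(12n)
for n in (1, 2, 5, 10, 20):
    stirling = math.sqrt(2 * math.pi * n) * n ** n * math.exp(-n)
    R_n = math.log(math.factorial(n) / stirling)
    assert 0.0 <= R_n <= 1.0 / (12 * n)

# Lemma C.4: alpha_m - alpha_s >= sqrt(m) - sqrt(s), alpha_m = sqrt(2) Gamma((m+1)/2) / Gamma(m/2)
alpha = lambda m: math.sqrt(2) * math.gamma((m + 1) / 2) / math.gamma(m / 2)
for m in range(1, 50):
    for s in range(1, m + 1):
        assert alpha(m) - alpha(s) >= math.sqrt(m) - math.sqrt(s) - 1e-12
print("Stirling bounds and Lemma C.4 verified for small parameters")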

C.4 The Multinomial Theorem

The multinomial theorem is concerned with the expansion of a power of a sum. It states that, for an integer n ≥ 1,
( Σ_{ℓ=1}^{m} x_ℓ )^n = Σ_{k_1+k_2+...+k_m=n} ( n! / (k_1! k_2! ··· k_m!) ) Π_{j=1}^{m} x_j^{k_j}.

The sum is taken over all possible m-tuples of nonnegative integers k1,...,km that sum up to n. This formula can be proved with the binomial theorem and induction on n.
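A small Python check of the multinomial theorem (our own illustration; the values of m, n, and x are arbitrary) sums the right-hand side over all admissible exponent tuples:

from itertools import product
from math import factorial

x = (1.5, -2.0, 0.25)
m, n = len(x), 4

lhs = sum(x) ** n
rhs = 0.0
for k in product(range(n + 1), repeat=m):          # all tuples (k_1, ..., k_m) with entries in 0..n
    if sum(k) != n:
        continue
    coeff = factorial(n)
    for kj in k:
        coeff //= factorial(kj)                    # multinomial coefficient n! / (k_1! ... k_m!)
    term = coeff
    for xj, kj in zip(x, k):
        term *= xj ** kj
    rhs += term

assert abs(lhs - rhs) < 1e-9
print("multinomial theorem verified for m = 3, n = 4")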

C.5 Some Elementary Estimates

Lemma C.5. For integers n ≥ k > 0,
(n/k)^k ≤ (n choose k) ≤ (en/k)^k.

Proof. For the upper bound, we use

e^k = Σ_{ℓ=0}^{∞} k^ℓ/ℓ! ≥ k^k/k!
to derive the inequality
(n choose k) = n(n−1)···(n−k+1)/k! ≤ n^k/k! = (n^k/k^k)(k^k/k!) ≤ e^k n^k/k^k.

As for the lower bound, we write

(n choose k) = n(n−1)···(n−k+1) / (k(k−1)···1) = Π_{ℓ=1}^{k} (n−k+ℓ)/ℓ ≥ (n/k)^k,
having used that (n−k+ℓ)/ℓ = (n−k)/ℓ + 1 decreases with ℓ ≥ 1.
Lemma C.6. Let N, m, s ≥ 1 with N ≥ s be given integers and let c, d > 0 be prescribed constants.
(a) If m ≥ cs ln(dN/s) and m ≥ s, then m ≥ cs ln(dN/m).
(b) If m ≥ c′s ln(c′N/m) with c′ = c(1 + d/e), then m ≥ cs ln(dN/s).
(c) If m ≥ cs ln(dN/m), then m ≥ c′s ln(dN/s) with c′ = ec/(e + c) or better c′ = c/ln(c) provided c, d ≥ e.
Proof. The statement (a) simply follows from m ≥ s. For (b) and (c), let us assume that m ≥ γs ln(δN/m) for some γ, δ > 0. For any δ′ > 0, we then have
m ≥ γs ln(δ′N/s) + γs ln(δs/(δ′m)) = γs ln(δ′N/s) + (γδ′/δ) m (δs/(δ′m)) ln(δs/(δ′m)).

We notice that the function f(x) := x ln(x) is decreasing on (0, 1/e) and increasing on (1/e, +∞), with a minimum value of −1/e, to derive
m ≥ γs ln(δ′N/s) − (γδ′/(eδ)) m, i.e., (1 + γδ′/(eδ)) m ≥ γs ln(δ′N/s).

The statement (b) is obtained by taking (among other possible choices) γ = δ = c(1 + d/e) and δ′ = d, while the first part of (c) is obtained by taking γ = c and δ = δ′ = d. The second part of (c), where c, d ≥ e, follows from s/m ≤ 1/(c ln(dN/s)) ≤ 1/c; hence, f(s/m) ≥ f(1/c) = −ln(c)/c. The same choice of γ, δ, δ′ yields
m ≥ cs ln(dN/s) − ln(c) m, i.e., (1 + ln(c)) m ≥ cs ln(dN/s),
which is a rewriting of the desired conclusion.
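The bounds of Lemma C.5 can be confirmed exhaustively for small n; a short Python check (our own addition, using math.comb) is:

from math import comb, e

# Lemma C.5: (n/k)^k <= binom(n, k) <= (e n/k)^k
for n in range(1, 40):
    for k in range(1, n + 1):
        assert (n / k) ** k <= comb(n, k) <= (e * n / k) ** k
print("Lemma C.5 verified for n < 40")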

C.6 Estimates of Some Integrals

Next we provide some useful estimates of certain integrals. The first two lemmas are related to the estimation of the tail of a Gaussian from above and below. 582 C Miscellanea

Lemma C.7. For u>0,

∫_u^∞ e^{−t²/2} dt ≤ min( √(π/2), 1/u ) exp(−u²/2).

Proof. A change of variables yields

∫_u^∞ e^{−t²/2} dt = ∫_0^∞ e^{−(t+u)²/2} dt = e^{−u²/2} ∫_0^∞ e^{−tu} e^{−t²/2} dt.  (C.16)

On the one hand, using the fact that e−tu ≤ 1 for t, u ≥ 0,wehave

∫_u^∞ e^{−t²/2} dt ≤ e^{−u²/2} ∫_0^∞ e^{−t²/2} dt = √(π/2) e^{−u²/2}.

2 On the other hand, the fact that e−t /2 ≤ 1 yields

∫_u^∞ e^{−t²/2} dt ≤ e^{−u²/2} ∫_0^∞ e^{−tu} dt = (1/u) e^{−u²/2}.  (C.17)

This shows the desired estimate.  Lemma C.8. For u>0,

∫_u^∞ e^{−t²/2} dt ≥ max( 1/u − 1/u³, √(π/2) − u ) exp(−u²/2).

Proof. We use (C.16) together with e^{−t²/2} ≥ 1 − t²/2 to obtain
∫_u^∞ e^{−t²/2} dt ≥ e^{−u²/2} ∫_0^∞ (1 − t²/2) e^{−tu} dt = e^{−u²/2} ( 1/u − 1/u³ ).

Using instead e^{−tu} ≥ 1 − tu in (C.16) yields
∫_u^∞ e^{−t²/2} dt ≥ e^{−u²/2} ∫_0^∞ e^{−t²/2} (1 − ut) dt = e^{−u²/2} ( √(π/2) − u ).

This completes the proof.
Lemma C.9. For α > 0,
∫_0^α √( ln(1 + t^{−1}) ) dt ≤ α √( ln( e(1 + α^{−1}) ) ).  (C.18)

Proof. First apply the Cauchy–Schwarz inequality to obtain
∫_0^α √( ln(1 + t^{−1}) ) dt ≤ √( ∫_0^α 1 dt ) √( ∫_0^α ln(1 + t^{−1}) dt ).

A change of variables and integration by parts yield
∫_0^α ln(1 + t^{−1}) dt = ∫_{α^{−1}}^∞ u^{−2} ln(1 + u) du
 = [ −u^{−1} ln(1 + u) ]_{α^{−1}}^{∞} + ∫_{α^{−1}}^∞ u^{−1} · 1/(1+u) du ≤ α ln(1 + α^{−1}) + ∫_{α^{−1}}^∞ u^{−2} du
 = α ln(1 + α^{−1}) + α.

Combining the above estimates concludes the proof. 
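Lemmas C.7 and C.8 can be checked against the exact Gaussian tail, which equals √(π/2) erfc(u/√2); a short Python sketch (our own, with arbitrary test points) is:

import math

for u in (0.1, 0.5, 1.0, 2.0, 5.0):
    tail = math.sqrt(math.pi / 2) * math.erfc(u / math.sqrt(2))    # integral_u^infty e^{-t^2/2} dt
    upper = min(math.sqrt(math.pi / 2), 1.0 / u) * math.exp(-u ** 2 / 2)                      # Lemma C.7
    lower = max(1.0 / u - 1.0 / u ** 3, math.sqrt(math.pi / 2) - u) * math.exp(-u ** 2 / 2)   # Lemma C.8
    assert lower <= tail <= upper
print("Gaussian tail bounds of Lemmas C.7 and C.8 verified")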

C.7 Hahn–Banach Theorems

The Hahn–Banach theorems are fundamental results about vector spaces. Although they are only used in a finite-dimensional setting in this book, they remain true in infinite dimensions. Their proof can be found in most textbooks on functional analysis, e.g., [72, 419, 438, 514]. The Hahn–Banach extension theorem, stated below, is often used when ‖·‖ is a norm.
Theorem C.10. Let X be a vector space equipped with a seminorm ‖·‖ and let Y be a subspace of X. If λ is a linear functional defined on Y such that |λ(y)| ≤ ‖y‖ for all y ∈ Y, then there exists a linear functional λ̃ defined on X such that λ̃(y) = λ(y) for all y ∈ Y and |λ̃(x)| ≤ ‖x‖ for all x ∈ X.
As for the Hahn–Banach separation theorem, it reads as follows.
Theorem C.11. If C and D are two disjoint nonempty convex subsets of a normed space X and if C is open, then there exist a continuous linear functional λ defined on X and a real number t such that

Re(λ(x)) < t ≤ Re(λ(y)) for all x ∈ C and all y ∈ D.

C.8 Smoothing Lipschitz Functions

The proof of the concentration of measure results (Theorems 8.34 and 8.40) requires to approximate a Lipschitz function by a smooth Lipschitz function. The following result establishes this rigorously. 584 C Miscellanea

Theorem C.12. Let f : Rn → R be a Lipschitz function with constant L = 1 in (8.69). For ε > 0 and x ∈ Rn, denote by B_ε(x) = {y ∈ Rn : ‖y − x‖₂ ≤ ε} the ball of radius ε around x and by |B_ε(x)| its volume. Define g : Rn → R by
g(x) = (1/|B_ε(x)|) ∫_{B_ε(x)} f(y) dy.

Then the function g is differentiable and ‖∇g(x)‖₂ ≤ 1 for all x ∈ Rn (so that g is Lipschitz with constant L = 1). Furthermore,
|f(x) − g(x)| ≤ εn/(n+1) ≤ ε for all x ∈ Rn.

Proof. We start with the case n = 1. Then g is defined via
g(x) = (1/(2ε)) ∫_{x−ε}^{x+ε} f(y) dy.

Therefore,

g′(x) = ( f(x+ε) − f(x−ε) ) / (2ε),
and since f is 1-Lipschitz, it follows that |g′(x)| ≤ 1. Moreover,
|f(x) − g(x)| = | (1/(2ε)) ∫_{x−ε}^{x+ε} ( f(x) − f(y) ) dy | ≤ (1/(2ε)) ∫_{x−ε}^{x+ε} |f(x) − f(y)| dy
 ≤ (1/(2ε)) ∫_{x−ε}^{x+ε} |x − y| dy = (1/ε) ∫_0^ε t dt = ε²/(2ε) = ε/2.

Assume now n > 1. We choose a unit vector u ∈ Rn and some x ∈ Rn and show that the function ψ(t) = g(x + tu) is differentiable with |ψ′(t)| ≤ 1, which is equivalent to |⟨∇g(x), u⟩| ≤ 1. As u and x are arbitrary, this will establish that g is differentiable with ‖∇g(x)‖₂ ≤ 1 for all x ∈ Rn. Without loss of generality, we assume that x = 0 and that u = (0, ..., 0, 1). Then the orthogonal complement u⊥ can be identified with R^{n−1}. Let D_ε = {(z, 0) : z ∈ R^{n−1}, ‖z‖₂ ≤ ε}. For any (z, 0) ∈ D_ε, the intersection of the line through (z, 0) in the direction of u with B_ε(0) is an interval with endpoints of the form (z, −a(z)) and (z, a(z)) with a(z) > 0. It follows that

|B_ε(0)| = 2 ∫_{D_ε} a(z) dz.  (C.19)

Now we estimate the derivative of ψ. For τ ∈ R, we have
ψ(τ) = g(τu) = (1/|B_ε(τu)|) ∫_{B_ε(τu)} f(y) dy = (1/|B_ε(0)|) ∫_{D_ε} ∫_{−a(z)+τ}^{a(z)+τ} f(z, t) dt dz,
hence,
ψ′(0) = (1/|B_ε(0)|) ∫_{D_ε} ( f(z, a(z)) − f(z, −a(z)) ) dz.

Since f is 1-Lipschitz, we have |f(z,a(z)) − f(z, −a(z))|≤2a(z),soby(C.19)

|ψ′(0)| = |⟨∇g(x), u⟩| ≤ 1.

The estimate of |f(x) − g(x)| follows similarly to the case n = 1, namely,
|f(x) − g(x)| ≤ (1/|B_ε(x)|) ∫_{B_ε(x)} |f(x) − f(z)| dz ≤ (1/|B_ε(x)|) ∫_{B_ε(x)} ‖x − z‖₂ dz
 = (1/|B_ε(0)|) ∫_{B_ε(0)} ‖z‖₂ dz = (|S^{n−1}|/|B_ε(0)|) ∫_0^ε r^n dr = ε|S^{n−1}| / ( (n+1)|B_1(0)| ),  (C.20)

where |S^{n−1}| is the surface area of the sphere S^{n−1} = {x ∈ Rn : ‖x‖₂ = 1}. Hereby, we used the fact that the volume in Rn satisfies |B_ε(0)| = ε^n |B_1(0)|. Denoting by s_n(r) the surface area of the sphere of radius r in Rn and by v_n(r) the volume of the corresponding ball, we have the relation
v_n(r) = ∫_0^r s_n(ρ) dρ.  (C.21)

Since v_n(r) = |B_1(0)| r^n, differentiating (C.21) shows that s_n(r) = n |B_1(0)| r^{n−1}, so that |S^{n−1}| = s_n(1) = n |B_1(0)|. Substituting this into (C.20) completes the proof.

C.9 Weak and Distributional Derivatives

The concept of weak derivative generalizes the classical derivative. Given a measurable function f : R → R, we say that v : R → R is a weak derivative of f if, for all infinitely differentiable functions φ : R → R with compact support,

∫_{−∞}^{∞} f(x) φ′(x) dx = −∫_{−∞}^{∞} v(x) φ(x) dx.  (C.22)

In this case, we write v = f′ = (d/dx) f. If f is continuously differentiable according to the classical definition, then it follows from integration by parts that the classical derivative f′ is a weak derivative. If v and w are weak derivatives of f, then they are equal almost everywhere, and in this sense the weak derivative is unique. If a weak derivative exists, then we say that f is weakly differentiable. If the function f is defined only on a compact subinterval [a, b] ⊂ R, then the integrals in (C.22) are only defined on [a, b] and the functions φ are assumed to vanish on the boundary, i.e., φ(a) = φ(b) = 0. This concept extends to the multivariate case and to higher derivatives in an obvious way. For a function f : Rn → R and a multi-index α = (α_1, ..., α_n), α_j ∈ N_0, we set |α| = Σ_j α_j and D^α f = ∂^{α_1}/∂x_1^{α_1} ··· ∂^{α_n}/∂x_n^{α_n} f. Then v : Rn → R is a weak derivative of order α if
∫_{Rn} f(x) D^α φ(x) dx = (−1)^{|α|} ∫_{Rn} v(x) φ(x) dx
for all infinitely differentiable functions φ : Rn → R with compact support. We write v = D^α f in this case. The concept of weak derivative can be further generalized to distributional derivatives. We denote by D the space of all infinitely differentiable functions with compact support. A distribution is a functional on D, i.e., a linear mapping from D into the scalars. A function f on Rn which is bounded on every compact subset of Rn (or at least locally integrable) induces a distribution via f(φ) = ∫_{Rn} f(x) φ(x) dx for φ ∈ D. The distributional derivative of a distribution f is defined via
(∂/∂x_j) f(φ) = −f( ∂φ/∂x_j ), φ ∈ D.

The distributional derivative always exists. If f can be identified with a function, then it is the functional
(∂/∂x_j) f(φ) = −∫_{Rn} f(x) (∂φ/∂x_j)(x) dx.

If f possesses a weak derivative, then the distributional derivative can be identified with it by (C.22). If f is even differentiable, then both distributional and weak derivative can be identified with the classical derivative. We say that a distribution f is nonnegative if f(φ) ≥ 0 for all nonnegative functions φ ∈ D. In this sense, also nonnegativity of distributional derivatives is understood. For instance, we write ∂f/∂x_j ≥ 0 for a function f if, for all nonnegative φ ∈ D,
−∫_{Rn} f(x) (∂φ/∂x_j)(x) dx ≥ 0.

C.10 Differential Inequalities

The following lemma bounds the solution of a differential inequality by the solution of a corresponding differential equation. Lemma C.13. Let f,g,h :[0, ∞) → R be continuous functions with g(x) ≥ 0 and f(x) > 0 for all x ∈ [0, ∞). Assume that L0 :[0, ∞) → R is such that

f(x) L₀′(x) − g(x) L₀(x) = h(x), x ∈ [0, ∞),  (C.23)
while L satisfies the differential inequality

f(x) L′(x) − g(x) L(x) ≤ h(x), x ∈ [0, ∞).  (C.24)

If L(0) ≤ L0(0),thenL(x) ≤ L0(x) for all x ∈ [0, ∞).

Proof. We need to prove that ℓ(x) := L(x) − L₀(x) ≤ 0 for all x ∈ [0, ∞), knowing that the function ℓ satisfies ℓ(0) ≤ 0 and

f(x) ℓ′(x) − g(x) ℓ(x) ≤ 0,  (C.25)
which follows by subtracting (C.23) from (C.24). Let us introduce
k(x) := ℓ(x) exp( −∫_0^x g(t)/f(t) dt ).

We calculate, for x ∈ [0, ∞),
k′(x) = ( ℓ′(x) − (g(x)/f(x)) ℓ(x) ) exp( −∫_0^x g(t)/f(t) dt ) ≤ 0.

This shows that k is nonincreasing on [0, ∞). As a result, for all x ∈ [0, ∞), we have k(x) ≤ k(0) ≤ 0; hence, ℓ(x) ≤ 0.

List of Symbols

s Usually the number of nonzero entries of a vector to be recovered m Usually the number of linear measurements N Usually the number of entries of a vector to be recovered [N] The set of the natural numbers not exceeding N,i.e.,{1, 2,...,N} S Usually a subset of [N] card(S) The cardinality of a set S S The complement of a set S, usually S =[N] \ S N The set of natural numbers, i.e., {1, 2,...} N0 The set of natural numbers including 0, i.e., {0, 1, 2,...} Z The set of integers, i.e., {...,−2, −1, 0, 1, 2,...} Q The set of rational numbers R The set of real numbers R+ The subset of R consisting of the nonnegative real numbers C The set of complex numbers K The field R or C RN The N-dimensional real vector space CN The N-dimensional complex vector space CS The space of vectors indexed by the set S, isomorphic to Ccard(S) N CN p The space equipped with the p-(quasi)norm N N Bp The unit ball of the (quasi)normed space p N (e1,...,en) The canonical basis of K x Usually the vector in CN to be reconstructed x Usually the vector outputted by a reconstruction algorithm x∗ The nonincreasing rearrangement of a vector x supp(x) The support of a vector x N xS Either the vector in C equal to x on S and to zero on S or the vector in CS which is the restriction of x to the entries indexed by S Ls(x) An index set of s largest absolute entries of a vector x


Hs(x) The hard thresholding operator applied to a vector x, i.e.,

Hs(x)=xLs(x) σs(x)p The error of best s-term approximation to a vector x Re(z), Im(z) The real and imaginary parts of a z Re(x), Im(x) The real and imaginary part of a vector x defined componentwise sgn(z) The sign of a complex number z sgn(x) The sign of a vector x defined componentwise e Usually the vector of measurement errors y Usually the measurement vector, i.e., y = Ax + e Id The identity matrix F The Fourier matrix H The Hadamard matrix J The matrix with all entries equal to one A A matrix, usually the measurement matrix   A The transpose of a matrix A, i.e., (A )jk = Ajk ∗ ∗ A The adjoint of a matrix A, i.e., (A )jk = Ajk A† The Moore–Penrose pseudo-inverse of the matrix A AI,J The submatrix of a matrix A with rows indexed by I and columns indexed by J AS The submatrix of a matrix A with columns indexed by S aj The jth column of a matrix A ker A The null space of the matrix A,i.e.,{x : Ax = 0} Δ Usually a reconstruction map from Cm to CN Δ1 The reconstruction map associated with equality-constrained 1-minimization Δ1,η The reconstruction map associated with quadratically constrained 1-minimization xp The p-(quasi)norm of a vector x xp,∞ The weak p-quasinorm of a vector x x0 The number of nonzero entries of a vector x u, v The inner product between two vectors u and v Ap→q The operator norm of a matrix A from p to q A2→2 The operator norm (largest singular value) of a matrix A on 2 AF The Frobenius norm of a matrix A μ The coherence of a matrix μ1 The 1-coherence function of a matrix δs The sth restricted isometry constant of a matrix θs The sth restricted expansion constant of a left regular bipartite graph η Usually an upper bound on the measurement error, i.e., e2 ≤ η ε Usually a small probability E(X) The expectation of a random variable X P(B) The probability of an event B g A standard Gaussian random variable g A standard Gaussian random vector List of Symbols 591

(T ) The Gaussian width of a subset T of RN Em(K, X) The compressive m-width of a subset K of a normed space X dm(K, X) The Gelfand m-width of a subset K of a normed space X dm(K, X) The Kolmogorov m-width of a subset K of a normed space X N (T,d,t) The covering number of a set T by balls of radius t relative to the metric d conv(T ) The convex hull of a set T cone(T ) The conic hull of a set T K∗ The dual cone of a cone K ⊂ RN K◦ The polar cone of a cone K ⊂ RN F ∗ The convex conjugate of a function F ∂F(x) The subdifferential of a function F at a point x PF (τ; x) The proximal mapping associated to a convex function F

References

1. P. Abrial, Y. Moudden, J.-L. Starck, J. Fadili, J. Delabrouille, M. Nguyen, CMB data analysis and sparsity. Stat. Meth. 5, 289–298 (2008) (Cited on p. 428.) 2. R. Adamczak, A tail inequality for suprema of unbounded empirical processes with applica- tions to Markov chains. Electron. J. Probab. 13(34), 1000–1034 (2008) (Cited on p. 265.) 3. F. Affentranger, R. Schneider, Random projections of regular simplices. Discrete Comput. Geom. 7(3), 219–226 (1992) (Cited on p. 305.) 4. M. Afonso, J. Bioucas Dias, M. Figueiredo, Fast image recovery using variable splitting and constrained optimization. IEEE Trans. Image Process. 19(9), 2345–2356 (2010) (Cited on p. 509.) 5. M. Aharon, M. Elad, A. Bruckstein, The K-SVD: An algorithm for designing of overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 54(11), 4311–4322 (2006) (Citedonp.38.) 6. R. Ahlswede, A. Winter, Strong converse for identification via quantum channels. IEEE Trans. Inform. Theor. 48(3), 569–579 (2002) (Cited on p. 261.) 7. N. Ailon, B. Chazelle, The fast Johnson-Lindenstrauss transform and approximate nearest neighbors. SIAM J. Comput. 39(1), 302–322 (2009) (Cited on p. 423.) 8. N. Ailon, E. Liberty, Fast dimension reduction using Rademacher series on dual BCH codes. Discrete Comput. Geom. 42(4), 615–630 (2009) (Cited on p. 431.) 9. N. Ailon, E. Liberty, Almost optimal unrestricted fast Johnson-Lindenstrauss transform. In Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), San Francisco, USA, 22–25 January 2011 (Cited on p. 423.) 10. N. Ailon, H. Rauhut, Fast and RIP-optimal transforms. Preprint (2013) (Cited on p. 431.) 11. K. Alexander, Probability inequalities for empirical processes and a law of the iterated logarithm. Ann. Probab. 12(4), 1041–1067 (1984) (Cited on p. 265.) 12. W.O. Alltop, Complex sequences with low periodic correlations. IEEE Trans. Inform. Theor. 26(3), 350–354 (1980) (Cited on p. 129.) 13. G. Anderson, A. Guionnet, O. Zeitouni, in An Introduction to Random Matrices. Cambridge Studies in Advanced Mathematics, vol. 118 (Cambridge University Press, Cambridge, 2010) (Cited on p. 303.) 14. F. Andersson, M. Carlsson, M.V. de Hoop, Nonlinear approximation of functions in two dimensions by sums of exponentials. Appl. Comput. Harmon. Anal. 29(2), 198–213 (2010) (Citedonp.57.) 15. F. Andersson, M. Carlsson, M.V. de Hoop, Sparse approximation of functions using sums of exponentials and AAK theory. J. Approx. Theor. 163(2), 213–248 (2011) (Cited on p. 57.) 16. J. Andersson, J.-O. Stromberg,¨ On the theorem of uniform recovery of structured random matrices. Preprint (2012) (Cited on p. 169.)


17. G. Andrews, R. Askey, R. Roy, in Special Functions. Encyclopedia of Mathematics and its Applications, vol. 71 (Cambridge University Press, Cambridge, 1999) (Cited on pp. 426, 428, 578.) 18. M. Anthony, P. Bartlett, Neural Network Learning: Theoretical Foundations (Cambridge University Press, Cambridge, 1999) (Cited on p. 36.) 19. S. Arora, B. Barak, Computational Complexity: A Modern Approach (Cambridge University Press, Cambridge, 2009) (Cited on pp. 57, 455.) 20. K. Arrow, L. Hurwicz, H. Uzawa, Studies in Linear and Non- (Stanford University Press, Stanford, California, 1958) (Cited on p. 503.) 21. U. Ayaz, H. Rauhut, Nonuniform sparse recovery with subgaussian matrices. ETNA, to appear (Cited on p. 303.) 22. J.-M. Azais, M. Wschebor, Level Sets and Extrema of Random Processes and Fields (Wiley, Hoboken, NJ, 2009) (Cited on p. 262.) 23. F. Bach, R. Jenatton, J. Mairal, G. Obozinski, Optimization with sparsity-inducing penalties. Found. Trends Mach. Learn. 4(1), 1–106 (2012) (Cited on p. 36.) 24. B. Bah, J. Tanner, Improved bounds on restricted isometry constants for Gaussian matrices. SIAM J. Matrix Anal. Appl. 31(5), 2882–2898 (2010) (Cited on p. 306.) 25. Z. Bai, J. Silverstein, Spectral Analysis of Large Dimensional Random Matrices. Springer Series in Statistics, 2nd edn. (Springer, New York, 2010) (Cited on p. 303.) 26. W. Bajwa, J. Haupt, G. Raz, S. Wright, R. Nowak, Toeplitz-structured compressed sensing matrices. In Proc. IEEE Stat. Sig. Proc. Workshop, pp. 294–298, 2007 (Cited on p. 429.) 27. W. Bajwa, J. Haupt, A.M. Sayeed, R. Nowak, Compressed channel sensing: a new approach to estimating sparse multipath channels. Proc. IEEE 98(6), 1058–1076 (June 2010) (Cited on p. 37.) 28. A. Bandeira, E. Dobriban, D. Mixon, W. Sawin, Certifying the restricted isometry property is hard. Preprint (2012) (Cited on p. 170.) 29. R.G. Baraniuk, Compressive sensing. IEEE Signal Process. Mag. 24(4), 118–121 (2007) (Citedonp.34.) 30. R.G. Baraniuk, V. Cevher, M. Duarte, C. Hedge, Model-based compressive sensing. IEEE Trans. Inform. Theor. 56, 1982–2001 (April 2010) (Cited on p. 38.) 31. R.G. Baraniuk, M. Davenport, R.A. DeVore, M. Wakin, A simple proof of the restricted isometry property for random matrices. Constr. Approx. 28(3), 253–263 (2008) (Cited on p. 302.) 32. S. Bartels, Total variation minimization with finite elements: convergence and iterative solution. SIAM J. Numer. Anal. 50(3), 1162–1180 (2012) (Cited on p. 503.) 33. A. Barvinok, Measure concentration. Lecture notes, University of Michigan, Michigan, USA, 2005 (Cited on p. 264.) 34. H. Bauschke, P. Combettes, Convex Analysis and Monotone in Hilbert Spaces. CMS Books in Mathematics/Ouvrages de Mathematiques´ de la SMC (Springer, New York, 2011) (Cited on p. 506.) 35. I. Bechar, A Bernstein-type inequality for stochastic processes of quadratic forms of Gaussian variables. Preprint (2009) (Cited on p. 262.) 36. A. Beck, M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2(1), 183–202 (2009) (Cited on pp. 506, 507.) 37. A. Beck, M. Teboulle, Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process. 18(11), 2419–2434 (2009) (Cited on p. 506.) 38. S. Becker, J. Bobin, E.J. Candes,` NESTA: A fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci. 4(1), 1–39 (2011) (Cited on p. 510.) 39. J.J. Benedetto, P.J.S.G. Ferreira (eds.), Modern Sampling Theory: Mathematics and Applica- tions. 
Applied and Numerical Harmonic Analysis (Birkhauser,¨ Boston, MA, 2001) (Cited on pp. 35, 573.) 40. G. Bennett, Probability inequalities for the sum of independent random variables. J. Amer. Statist. Assoc. 57, 33–45 (1962) (Cited on p. 199.) References 595

41. R. Berinde, A. Gilbert, P. Indyk, H. Karloff, M. Strauss, Combining geometry and combi- natorics: A unified approach to sparse signal recovery. In Proc. of 46th Annual Allerton Conference on Communication, Control, and Computing, pp. 798–805, 2008 (Cited on pp. 35, 38, 455.) 42. R. Berinde, P. Indyk, M. R˘zic,´ Practical near-optimal sparse recovery in the L1 norm. In Proc. Allerton, 2008 (Cited on p. 455.) 43. S. Bernstein, Sur une modification de l’inegalit´ edeTchebichef.In´ Annals Science Insitute Sav. Ukraine, Sect. Math. I, 1924 (Cited on p. 199.) 44. D. Bertsekas, J. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods (Athena Scientific, Cambridge, MA, 1997) (Cited on p. 506.) 45. G. Beylkin, L. Monzon,´ On approximation of functions by exponential sums. Appl. Comput. Harmon. Anal. 19(1), 17–48 (2005) (Cited on p. 57.) 46. G. Beylkin, L. Monzon,´ Approximation by exponential sums revisited. Appl. Comput. Harmon. Anal. 28(2), 131–149 (2010) (Cited on p. 57.) 47. R. Bhatia, Matrix Analysis. Graduate Texts in Mathematics, vol. 169 (Springer, New York, 1997) (Cited on pp. 515, 528, 542, 565, 566.) 48. P. Bickel, Y. Ritov, A. Tsybakov, Simultaneous analysis of lasso and Dantzig selector. Ann. Stat. 37(4), 1705–1732 (2009) (Cited on p. 36.) 49.E.Birgin,J.Mart´ınez, M. Raydan, Inexact spectral projected gradient methods on convex sets. IMA J. Numer. Anal. 23(4), 539–559 (2003) (Cited on p. 505.) 50. I. Bjelakovic, R. Siegmund-Schultze, Quantum Stein’s lemma revisited, inequalities for quantum entropies, and a concavity theorem of Lieb. Preprint (2012) (Cited on p. 565.) 51. A. Bjorck,¨ Numerical Methods for Least Squares Problems (SIAM, Philadelphia, 1996) (Cited on pp. 515, 534.) 52. R. Blahut, Algebraic Codes for Data Transmission (Cambridge University Press, Cambridge, 2003) (Cited on p. 57.) 53. J. Blanchard, C. Cartis, J. Tanner, Compressed sensing: how sharp is the restricted isometry property? SIAM Rev. 53(1), 105–125 (2011) (Cited on p. 306.) 54. J. Blanchard, A. Thompson, On support sizes of restricted isometry constants. Appl. Comput. Harmon. Anal. 29(3), 382–390 (2010) (Cited on p. 169.) 55. T. Blu, P. Marziliano, M. Vetterli, Sampling signals with finite rate of innovation. IEEE Trans. Signal Process. 50(6), 1417–1428 (2002) (Cited on p. 57.) 56. T. Blumensath, M. Davies, Iterative thresholding for sparse approximations. J. Fourier Anal. Appl. 14, 629–654 (2008) (Cited on p. 169.) 57. T. Blumensath, M. Davies, Iterative hard thresholding for compressed sensing. Appl. Comput. Harmon. Anal. 27(3), 265–274 (2009) (Cited on p. 169.) 58. T. Blumensath, M. Davies, Normalized iterative hard thresholding: guaranteed stability and performance. IEEE J. Sel. Top. Signal Process. 4(2), 298–309 (2010) (Cited on p. 169.) 59. S. Boucheron, O. Bousquet, G. Lugosi, P. Massart, Moment inequalities for functions of independent random variables. Ann. Probab. 33(2), 514–560 (2005) (Cited on p. 265.) 60. S. Boucheron, G. Lugosi, P. Massart, Concentration inequalities using the entropy method. Ann. Probab. 31(3), 1583–1614 (2003) (Cited on p. 265.) 61. S. Boucheron, G. Lugosi, P. Massart, Concentration inequalities. A nonasymptotic theory of independence. Oxford University Press (2013) (Cited on p. 264.) 62. J. Bourgain, Bounded orthogonal systems and the Λ(p)-set problem. Acta Math. 162(3–4), 227–245 (1989) (Cited on p. 423.) 63. J. Bourgain, Λp-sets in analysis: results, problems and related aspects. In Handbook of the Geometry of Banach Spaces, vol. 1, ed. by W. B. Johnson, J. 
Lindenstrauss (North-Holland, Amsterdam, 2001), pp. 195–232 (Cited on p. 423.) 64. J. Bourgain, S. Dilworth, K. Ford, S. Konyagin, D. Kutzarova, Breaking the k2-barrier for explicit RIP matrices. In STOC’11 Proceedings of the 43rd Annual ACM Symposium on Theory of Computing, ACM New York, USA, pp. 637–644, 2011 (Cited on p. 170.) 65. J. Bourgain, S. Dilworth, K. Ford, S. Konyagin, D. Kutzarova, Explicit constructions of RIP matrices and related problems. Duke Math. J. 159(1), 145–185 (2011) (Cited on p. 170.) 596 References

66. J. Bourgain, L. Tzafriri, Invertibility of ‘large’ submatrices with applications to the geometry of Banach spaces and harmonic analysis. Isr. J. Math. 57(2), 137–224 (1987) (Cited on pp. 262, 472.) 67. J. Bourgain, L. Tzafriri, On a problem of Kadison and Singer. J. Reine Angew. Math. 420, 1–43 (1991) (Cited on pp. 472, 473.) 68. O. Bousquet, A Bennett concentration inequality and its application to suprema of empirical processes. C. R. Math. Acad. Sci. Paris. 334(6), 495–500 (2002) (Cited on p. 265.) 69. O. Bousquet, Concentration inequalities for sub-additive functions using the entropy method. In Stochastic Inequalities and Applications, ed. by E. Gin, C. Houdr , D. Nualart. Progress in Probability, vol. 56 (Birkhauser,¨ Basel, 2003), pp. 213–247 (Cited on p. 265.) 70. S. Boyd, L. Vandenberghe, Convex Optimization (Cambridge University Press, Cambridge, 2004) (Cited on pp. 71, 503, 504, 510, 543, 558, 560.) 71. K. Bredies, D. Lorenz, Linear convergence of iterative soft-thresholding. J. Fourier Anal. Appl. 14(5–6), 813–837 (2008) (Cited on p. 506.) 72. H. Brezis, Functional Analysis, Sobolev Apaces and Partial Differential Equations.Universi- text (Springer, New York, 2011) (Cited on p. 583.) 73. A. Bruckstein, D.L. Donoho, M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Rev. 51(1), 34–81 (2009) (Cited on p. 36.) 74. A. Buchholz, Operator Khintchine inequality in non-commutative probability. Math. Ann. 319, 1–16 (2001) (Cited on p. 261.) 75. A. Buchholz, Optimal constants in Khintchine type inequalities for fermions, Rademachers and q-Gaussian operators. Bull. Pol. Acad. Sci. Math. 53(3), 315–321 (2005) (Cited on p. 261.) 76. P. B¨uhlmann, S. van de Geer, Statistics for High-dimensional Data. Springer Series in Statistics (Springer, Berlin, 2011) (Cited on p. 36.) 77. H. Buhrman, P. Miltersen, J. Radhakrishnan, S. Venkatesh, Are bitvectors optimal? In Proceedings of the Thirty-second Annual ACM Symposium on Theory of Computing STOC ’00: , pp. 449–458, ACM, New York, NY, USA, 2000 (Cited on p. 327.) 78. V. Buldygin, Y. Kozachenko, in Metric Characterization of Random Variables and Random Processes, Translations of Mathematical Monographs, vol. 188 (American Mathematical Society, Providence, RI, 2000) (Cited on p. 199.) 79. M. Burger, M. Moeller, M. Benning, S. Osher, An adaptive inverse scale space method for compressed sensing. Math. Comp. 82(281), 269–299 (2013) (Cited on p. 503.) 80. N. Burq, S. Dyatlov, R. Ward, M. Zworski, Weighted eigenfunction estimates with appli- cations to compressed sensing. SIAM J. Math. Anal. 44(5), 3481–3501 (2012) (Cited on p. 428.) 81. T. Cai, L. Wang, G. Xu, New bounds for restricted isometry constants. IEEE Trans. Inform. Theor. 56(9), 4388–4394 (2010) (Cited on p. 169.) 82. T. Cai, L. Wang, G. Xu, Shifting inequality and recovery of sparse vectors. IEEE Trans. Signal Process. 58(3), 1300–1308 (2010) (Cited on p. 169.) 83. T. Cai, A. Zhang, Sharp RIP bound for sparse signal and low-rank matrix recovery, Appl. Comput. Harmon. Anal. 35(1), 74–93, (2013) (Cited on p. 169.) 84. E.J. Candes,` Compressive sampling. In Proceedings of the International Congress of Mathe- maticians, Madrid, Spain, 2006 (Cited on p. 34.) 85. E.J. Candes,` The restricted isometry property and its implications for compressed sensing. C. R. Math. Acad. Sci. Paris 346, 589–592 (2008) (Cited on p. 169.) 86. E.J. Candes,` Y.C. Eldar, D. Needell, P. Randall, Compressed sensing with coherent and redundant dictionaries. Appl. 
Comput. Harmon. Anal. 31(1), 59–73 (2011) (Cited on p. 302.) 87. E.J. Candes,` Y. Plan, Near-ideal model selection by 1 minimization. Ann. Stat. 37(5A), 2145–2177 (2009) (Cited on p. 471.) 88. E.J. Candes,` Y. Plan, A probabilistic and RIPless theory of compressed sensing. IEEE Trans. Inform. Theor. 57(11), 7235–7254 (2011) (Cited on p. 421.) 89. E.J. Candes,` Y. Plan, Tight oracle bounds for low-rank matrix recovery from a minimal number of random measurements. IEEE Trans. Inform. Theor. 57(4), 2342–2359 (2011) (Cited on p. 302.) References 597

90. E.J. Candes,` B. Recht, Exact via convex optimization. Found. Comput. Math. 9, 717–772 (2009) (Cited on pp. 36, 37.) 91. E. J. Candes,` B. Recht Simple bounds for recovering low-complexity models. Math. Program. Springer-Verlag, 1–13 (2012) (Cited on p. 303.) 92. E.J. Candes,` J. Romberg, Quantitative robust uncertainty principles and optimally sparse decompositions. Found. Comput. Math. 6(2), 227–254 (2006) (Cited on pp. 36, 423, 472.) 93. E.J. Candes,` J. Romberg, Sparsity and incoherence in compressive sampling. Inverse Probl. 23(3), 969–985 (2007) (Cited on p. 421.) 94. E.J. Candes,` J. Romberg, T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theor. 52(2), 489–509 (2006) (Cited on pp. 33, 35, 37, 104, 421, 422.) 95. E.J. Candes,` J. Romberg, T. Tao, Stable signal recovery from incomplete and inaccurate measurements. Comm. Pure Appl. Math. 59(8), 1207–1223 (2006) (Cited on pp. 104, 168.) 96. E.J. Candes,` T. Tao, Decoding by linear programming. IEEE Trans. Inform. Theor. 51(12), 4203–4215 (2005) (Cited on pp. 36, 168.) 97. E.J. Candes,` T. Tao, Near optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inform. Theor. 52(12), 5406–5425 (2006) (Cited on pp. 35, 168, 302, 422.) 98. E.J. Candes,` T. Tao, The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35(6), 2313–2351, (2007) (Cited on pp. 36, 71.) 99. E.J. Candes,` T. Tao, The power of convex relaxation: near-optimal matrix completion. IEEE Trans. Inform. Theor. 56(5), 2053–2080 (2010) (Cited on pp. 36, 37.) 100. E.J. Candes,` M. Wakin, An introduction to compressive sampling. IEEE Signal Process. Mag. 25(2), 21–30 (2008) (Cited on p. 34.) 101. M. Capalbo, O. Reingold, S. Vadhan, A. Wigderson, Randomness conductors and constant- degree lossless expanders. In Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing (electronic) (ACM, New York, 2002), pp. 659–668 (Cited on p. 455.) 102. P. Casazza, J. Tremain, Revisiting the Bourgain-Tzafriri restricted invertibility theorem. Oper. Matrices 3(1), 97–110 (2009) (Cited on p. 473.) 103. D. Chafa¨ı, O. Guedon,´ G. Lecue,´ A. Pajor, Interactions Between Compressed Sensing, Random Matrices and High-dimensional Geometry, Panoramas et Syntheses,` vol. 37 (Societ´ e´ Mathematique´ de France, to appear) (Cited on p. 422.) 104. A. Chambolle, V. Caselles, D. Cremers, M. Novaga, T. Pock, An introduction to total variation for image analysis. In Theoretical Foundations and Numerical Methods for Sparse Recovery, ed. by M. Fornasier. Radon Series on Computational and Applied Mathematics, vol. 9 (de Gruyter, Berlin, 2010), pp. 263–340 (Cited on p. 37.) 105. A. Chambolle, R.A. DeVore, N.-Y. Lee, B.J. Lucier, Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage. IEEE Trans. Image Process. 7(3), 319–335 (1998) (Cited on p. 36.) 106. A. Chambolle, P.-L. Lions, Image recovery via total variation minimization and related problems. Numer. Math. 76(2), 167–188 (1997) (Cited on p. 37.) 107. A. Chambolle, T. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imag. Vis. 40, 120–145 (2011) (Cited on pp. 37, 503.) 108. V. Chandrasekaran, B. Recht, P. Parrilo, A. Willsky, The of linear inverse problems. Found. Comput. Math. 12(6), 805–849 (2012) (Cited on pp. 104, 303.) 109. R. 
Chartrand, Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Process. Lett. 14(10), 707–710 (2007) (Cited on p. 504.) 110. R. Chartrand, V. Staneva, Restricted isometry properties and nonconvex compressive sensing. Inverse Probl. 24(3), 1–14 (2008) (Not cited.) 111. R. Chartrand, W. Yin, Iteratively reweighted algorithms for compressive sensing. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, Las Vegas, Nevada, USA, pp. 3869–3872, 2008 (Cited on p. 504.) 598 References

112. G. Chen, R. Rockafellar, Convergence rates in forward-backward splitting. SIAM J. Optim. 7(2), 421–444 (1997) (Cited on p. 506.) 113. S. Chen, S. Billings, W. Luo, Orthogonal least squares methods and their application to nonlinear system identification. Intl. J. Contr. 50(5), 1873–1896 (1989) (Cited on p. 71.) 114. S.S. Chen, D.L. Donoho, M.A. Saunders, Atomic decomposition by Basis Pursuit. SIAM J. Sci. Comput. 20(1), 33–61 (1999) (Cited on pp. 34, 36, 71.) 115. M. Cheraghchi, V. Guruswami, A. Velingker, Restricted isometry of Fourier matrices and list decodability of random linear codes. Preprint (2012) (Cited on p. 422.) 116. H. Chernoff, A measure of asymptotic efficiency of tests of a hypothesis based on the sum of observations. Ann. Math. Statist. 23, 493–507 (1952) (Cited on p. 198.) 117. T. Chihara, An Introduction to Orthogonal Polynomials (Gordon and Breach Science Publish- ers, New York 1978) (Cited on p. 426.) 118. S. Chretien,´ S. Darses, Invertibility of random submatrices via tail decoupling and a matrix Chernoff inequality. Stat. Probab. Lett. 82(7), 1479–1487 (2012) (Cited on p. 471.) 119. O. Christensen, An Introduction to Frames and Riesz Bases. Applied and Numerical Harmonic Analysis (Birkhauser,¨ Boston, 2003) (Cited on pp. 129, 429.) 120. O. Christensen, Frames and Bases: An Introductory Course. Applied and Numerical Har- monic Analysis (Birkhauser,¨ Basel, 2008) (Cited on p. 129.) 121. A. Cline, Rate of convergence of Lawson’s algorithm. Math. Commun. 26, 167–176 (1972) (Cited on p. 504.) 122. A. Cohen, Numerical Analysis of Wavelet Methods (North-Holland, Amsterdam, 2003) (Cited on pp. 36, 425.) 123. A. Cohen, W. Dahmen, R.A. DeVore, Compressed sensing and best k-term approximation. J. Amer. Math. Soc. 22(1), 211–231 (2009) (Cited on pp. 56, 104, 168, 327, 362.) 124. A. Cohen, I. Daubechies, R. DeVore, G. Kerkyacharian, D. Picard, Capturing Ridge Functions in High Dimensions from Point Queries. Constr. Approx. 35, 225–243 (2012) (Cited on p. 39.) 125. A. Cohen, R. DeVore, S. Foucart, H. Rauhut, Recovery of functions of many variables via compressive sensing. In Proc. SampTA 2011, Singapore, 2011 (Cited on p. 39.) 126. R. Coifman, F. Geshwind, Y. Meyer, Noiselets. Appl. Comput. Harmon. Anal. 10(1), 27–44 (2001) (Cited on pp. 424, 425.) 127. P. Combettes, J.-C. Pesquet, A Douglas-Rachford Splitting Approach to Nonsmooth Convex Variational Signal Recovery. IEEE J. Sel. Top. Signal Process. 1(4), 564–574 (2007) (Cited on p. 508.) 128. P. Combettes, J.-C. Pesquet, Proximal splitting methods in signal processing. In Fixed- Point Algorithms for Inverse Problems in Science and Engineering, ed. by H. Bauschke, R. Burachik, P. Combettes, V. Elser, D. Luke, H. Wolkowicz (Springer, New York, 2011), pp. 185–212 (Cited on pp. 504, 506, 508, 509.) 129. P. Combettes, V. Wajs, Signal recovery by proximal forward-backward splitting. Multiscale Model. Sim. 4(4), 1168–1200 (electronic) (2005) (Cited on pp. 506, 507.) 130. J. Cooley, J. Tukey, An algorithm for the machine calculation of complex Fourier series. Math. Comp. 19, 297–301 (1965) (Cited on p. 575.) 131. G. Cormode, S. Muthukrishnan, Combinatorial algorithms for compressed sensing. In CISS, Princeton, 2006 (Cited on pp. 35, 38.) 132. H. Cramer,´ Sur un nouveau theor´ eme-limite` de la theorie´ des probabilites.´ Actual. Sci. Industr. 736, 5–23 (1938) (Cited on p. 198.) 133. F. Cucker, S. Smale, On the mathematical foundations of learning. Bull. Am. Math. Soc., New Ser. 39(1), 1–49 (2002) (Cited on p. 36.) 134. F. 
Cucker, D.-X. Zhou, Learning Theory: An Approximation Theory Viewpoint. Cambridge Monographs on Applied and Computational Mathematics (Cambridge University Press, Cambridge, 2007) (Cited on p. 36.) 135. W. Dai, O. Milenkovic, Subspace Pursuit for Compressive Sensing Signal Reconstruction. IEEE Trans. Inform. Theor. 55(5), 2230–2249 (2009) (Cited on pp. 72, 170.) References 599

136. S. Dasgupta, A. Gupta, An elementary proof of a theorem of Johnson and Lindenstrauss. Random Struct. Algorithm. 22(1), 60–65, 2003 (Cited on p. 303.) 137. I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 61 (SIAM, Philadelphia, 1992) (Cited on pp. 36, 425.) 138. I. Daubechies, M. Defrise, C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math. 57(11), 1413–1457 (2004) (Cited on pp. 37, 507.) 139. I. Daubechies, R. DeVore, M. Fornasier, C. G¨unt¨urk, Iteratively re-weighted least squares minimization for sparse recovery. Comm. Pure Appl. Math. 63(1), 1–38 (2010) (Cited on p. 504.) 140. I. Daubechies, M. Fornasier, I. Loris, Accelerated projected gradient methods for linear inverse problems with sparsity constraints. J. Fourier Anal. Appl. 14(5–6), 764–792 (2008) (Cited on p. 505.) 141. K. Davidson, S. Szarek, in Local Operator Theory, Random Matrices and Banach Spaces, ed. by W.B. Johnson, J. Lindenstrauss. Handbook of the Geometry of Banach Spaces, vol. 1 (North-Holland, Amsterdam, 2001), pp. 317–366 (Cited on p. 303.) 142. E. Davies, B. Simon, Ultracontractivity and the for Schrodinger¨ operators and Dirichlet Laplacians. J. Funct. Anal. 59, 335–395 (1984) (Cited on p. 264.) 143. M. Davies, Y. Eldar, Rank awareness in joint sparse recovery. IEEE Trans. Inform. Theor. 58(2), 1135–1146 (2012) (Cited on p. 38.) 144. M. Davies, R. Gribonval, Restricted isometry constants where p sparse recovery can fail for 0

159. D.L. Donoho, I.M. Johnstone, Minimax estimation via wavelet shrinkage. Ann. Stat. 26(3), 879–921 (1998) (Cited on p. 36.) 160. D.L. Donoho, G. Kutyniok, Microlocal analysis of the geometric separation problem. Comm. Pure Appl. Math. 66(1), 1–47 (2013) (Cited on p. 36.) 161. D.L. Donoho, B. Logan, Signal recovery and the large sieve. SIAM J. Appl. Math. 52(2), 577–591 (1992) (Cited on p. 34.) 162. D.L. Donoho, A. Maleki, A. Montanari, Message-passing algorithms for compressed sensing. Proc. Natl. Acad. Sci. USA 106(45), 18914–18919 (2009) (Cited on pp. 72, 306.) 163. D.L. Donoho, P. Stark, Recovery of a sparse signal when the low frequency information is missing. Technical report, Department of Statistics, University of California, Berkeley, June 1989 (Cited on p. 36.) 164. D.L. Donoho, P. Stark, Uncertainty principles and signal recovery. SIAM J. Appl. Math. 48(3), 906–931 (1989) (Cited on pp. 36, 421, 423.) 165. D.L. Donoho, J. Tanner, Neighborliness of randomly projected simplices in high dimensions. Proc. Natl. Acad. Sci. USA 102(27), 9452–9457 (2005) (Cited on pp. 38, 303, 304, 305.) 166. D.L. Donoho, J. Tanner, Sparse nonnegative solutions of underdetermined linear equations by linear programming. Proc. Natl. Acad. Sci. 102(27), 9446–9451 (2005) (Not cited.) 167. D.L. Donoho, J. Tanner, Counting faces of randomly-projected polytopes when the projection radically lowers dimension. J. Am. Math. Soc. 22(1), 1–53 (2009) (Cited on pp. 36, 38, 303, 304, 305.) 168. D.L. Donoho, J. Tanner, Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing. Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 367(1906), 4273–4293 (2009) (Cited on p. 305.) 169. D.L. Donoho, Y. Tsaig, Fast solution of l1-norm minimization problems when the solution may be sparse. IEEE Trans. Inform. Theor. 54(11), 4789–4812 (2008) (Cited on p. 503.) 170. D.L. Donoho, M. Vetterli, R.A. DeVore, I. Daubechies, Data compression and harmonic analysis. IEEE Trans. Inform. Theor. 44(6), 2435–2476 (1998) (Cited on p. 36.) 171. R. Dorfman, The detection of defective members of large populations. Ann. Stat. 14, 436–440 (1943) (Cited on p. 35.) 172. J. Douglas, H. Rachford, On the numerical solution of heat conduction problems in two or three space variables. Trans. Am. Math. Soc. 82, 421–439 (1956) (Cited on p. 508.) 173. D.-Z. Du, F. Hwang, Combinatorial Group Testing and Its Applications (World Scientific, Singapore, 1993) (Cited on p. 35.) 174. M. Duarte, M. Davenport, D. Takhar, J. Laska, S. Ting, K. Kelly, R.G. Baraniuk, Single-Pixel Imaging via Compressive Sampling. IEEE Signal Process. Mag. 25(2), 83–91 (2008) (Cited on p. 35.) 175. R.M. Dudley, The sizes of compact subsets of and continuity of Gaussian processes. J. Funct. Anal. 1, 290–330 (1967) (Cited on p. 262.) 176. E. Effros, A matrix convexity approach to some celebrated quantum inequalities. Proc. Natl. Acad. Sci. USA 106(4), 1006–1008 (2009) (Cited on pp. 569, 571.) 177. B. Efron, T. Hastie, I. Johnstone, R. Tibshirani, Least angle regression. Ann. Stat. 32(2), 407–499 (2004) (Cited on p. 503.) 178. I. Ekeland, R. Temam,´ Convex Analysis and Variational Problems (SIAM, Philadelphia, 1999) (Cited on pp. 543, 547.) 179. M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing (Springer, New York, 2010) (Cited on p. 36.) 180. M. Elad, M. Aharon, Image denoising via sparse and redundant representations over learned dictionaries. 
IEEE Trans. Image Process. 15(12), 3736 –3745 (2006) (Cited on p. 36.) 181. M. Elad, A.M. Bruckstein, A generalized and sparse representation in pairs of bases. IEEE Trans. Inform. Theor. 48(9), 2558–2567 (2002) (Cited on pp. 34, 36, 104, 421.) 182. Y. Eldar, G. Kutyniok (eds.), Compressed Sensing: Theory and Applications (Cambridge University Press, New York, 2012) (Cited on p. 34.) References 601

183. Y. Eldar, M. Mishali, Robust recovery of signals from a structured of subspaces. IEEE Trans. Inform. Theor. 55(11), 5302–5316 (2009) (Cited on p. 38.) 184. Y. Eldar, H. Rauhut, Average case analysis of multichannel sparse recovery using convex relaxation. IEEE Trans. Inform. Theor. 56(1), 505–519 (2010) (Cited on pp. 38, 261, 472.) 185. J. Ender, On compressive sensing applied to radar. Signal Process. 90(5), 1402–1414 (2010) (Citedonp.35.) 186. H.W. Engl, M. Hanke, A. Neubauer, Regularization of Inverse Problems (Springer, New York, 1996) (Cited on pp. 37, 507.) 187. H. Epstein, Remarks on two theorems of E. Lieb. Comm. Math. Phys. 31, 317–325 (1973) (Cited on p. 565.) 188. E. Esser, Applications of Lagrangian-based alternating direction methods and connections to split Bregman. Preprint (2009) (Cited on p. 509.) 189. A. Fannjiang, P. Yan, T. Strohmer, Compressed remote sensing of sparse objects. SIAM J. Imag. Sci. 3(3), 596–618 (2010) (Cited on p. 35.) 190. M. Fazel, Matrix Rank Minimization with Applications. PhD thesis, 2002 (Cited on p. 37.) 191. H.G. Feichtinger, F. Luef, T. Werther, A guided tour from linear algebra to the foundations of Gabor analysis. In Gabor and Wavelet Frames. Lecture Notes Series, Institute for Math- ematical Sciences National University of Singapore, vol. 10 (World Sci. Publ., Hackensack, 2007), pp. 1–49 (Cited on p. 429.) 192. H.G. Feichtinger, T. Strohmer, Gabor Analysis and Algorithms: Theory and Applications (Birkhauser,¨ Boston, 1998) (Cited on p. 429.) 193. X. Fernique, Regularite´ des trajectoires des fonctions aleatoires´ gaussiennes. In Ecole´ d’Et´´ e de Probabilit´es de Saint-Flour, IV-1974. Lecture Notes in Mathematics, vol. 480 (Springer, Berlin, 1975), pp. 1–96 (Cited on pp. 262, 264.) 194. X. Fernique, Fonctions Al´eatoires Gaussiennes, Vecteurs Al´eatoires Gaussiens.Universitede´ Montreal,´ Centre de Recherches Mathematiques,´ 1997 (Cited on p. 262.) 195. P.J.S.G. Ferreira, J.R. Higgins, The establishment of sampling as a scientific principle—a striking case of multiple discovery. Not. AMS 58(10), 1446–1450 (2011) (Cited on p. 35.) 196. M.A. Figueiredo, R.D. Nowak, An EM algorithm for wavelet-based image restoration. IEEE Trans. Image Process. 12(8), 906–916 (2003) (Cited on p. 507.) 197. M.A. Figueiredo, R.D. Nowak, S. Wright, Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. IEEE J. Sel. Top. Signal Proces. 1(4), 586–598 (2007) (Cited on p. 510.) 198. G.B. Folland, Fourier Analysis and Its Applications (Wadsworth and Brooks, Pacific Grove, 1992) (Cited on pp. 420, 573.) 199. G.B. Folland, A Course in Abstract Harmonic Analysis (CRC Press, Boca Raton, 1995) (Cited on p. 420.) 200. G.B. Folland, A. Sitaram, The uncertainty principle: A mathematical survey. J. Fourier Anal. Appl. 3(3), 207–238 (1997) (Cited on p. 421.) 201. M. Fornasier, Numerical methods for sparse recovery. In Theoretical Foundations and Numerical Methods for Sparse Recovery, ed. by M. Fornasier. Radon Series on Computational and Applied Mathematics, vol. 9 (de Gruyter, Berlin, 2010), pp. 93–200 (Cited on p. 510.) 202. M. Fornasier, H. Rauhut, Iterative thresholding algorithms. Appl. Comput. Harmon. Anal. 25(2), 187–208 (2008) (Cited on p. 507.) 203. M. Fornasier, H. Rauhut, Recovery algorithms for vector valued data with joint sparsity constraints. SIAM J. Numer. Anal. 46(2), 577–613 (2008) (Cited on pp. 38, 507.) 204. M. Fornasier, H. Rauhut, Compressive sensing. 
In Handbook of Mathematical Methods in Imaging, ed. by O. Scherzer (Springer, New York, 2011), pp. 187–228 (Cited on p. 34.) 205. M. Fornasier, H. Rauhut, R. Ward, Low-rank matrix recovery via iteratively reweighted least squares minimization. SIAM J. Optim. 21(4), 1614–1640 (2011) (Cited on p. 504.) 206. M. Fornasier, K. Schnass, J. Vybiral, Learning Functions of Few Arbitrary Linear Parameters in High Dimensions. Found. Comput. Math. 12, 229–262 (2012) (Cited on p. 39.) 207. S. Foucart, A note on guaranteed sparse recovery via 1-minimization. Appl. Comput. Harmon. Anal. 29(1), 97–103 (2010) (Cited on p. 169.) 602 References

208. S. Foucart, Hard thresholding pursuit: an algorithm for compressive sensing. SIAM J. Numer. Anal. 49(6), 2543–2563 (2011) (Cited on p. 169.) 209. S. Foucart, Sparse recovery algorithms: sufficient conditions in terms of restricted isometry constants. In Approximation Theory XIII: San Antonio 2010, ed. by M. Neamtu, L. Schu- maker. Springer Proceedings in Mathematics, vol. 13 (Springer, New York, 2012), pp. 65–77 (Cited on pp. 56, 169.) 210. S. Foucart, Stability and robustness of 1-minimizations with Weibull matrices and redundant dictionaries. Lin. Algebra Appl. to appear (Cited on p. 363.) 211. S. Foucart, R. Gribonval, Real vs. complex null space properties for sparse vector recovery. Compt. Rendus Acad. Sci. Math. 348(15–16), 863–865 (2010) (Cited on p. 104.) 212. S. Foucart, M. Lai, Sparsest solutions of underdetermined linear systems via q -minimization for 0

229. D. Goldfarb, S. Ma, Convergence of fixed point continuation algorithms for matrix rank minimization. Found. Comput. Math. 11(2), 183–210 (2011) (Cited on p. 507.) 230. T. Goldstein, S. Osher, The split Bregman method for L1-regularized problems. SIAM J. Imag. Sci. 2(2), 323–343 (2009) (Cited on p. 508.) 231. G. Golub, C.F. van Loan, Matrix Computations, 3rd edn. (The Johns Hopkins University Press, Baltimore, MD, 1996) (Cited on pp. 428, 515, 534.) 232. R.A. Gopinath, Nonlinear recovery of sparse signals from narrowband data. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. ICASSP ’95, vol. 2, pp. 1237–1239, IEEE Computer Society, 1995 (Cited on p. 57.) 233. Y. Gordon, Some inequalities for Gaussian processes and applications. Israel J. Math. 50(4), 265–289 (1985) (Cited on p. 264.) 234. Y. Gordon, Elliptically contoured distributions. Probab. Theor. Relat. . 76(4), 429–438 (1987) (Cited on p. 264.) 235. Y. Gordon, On Milman’s inequality and random subspaces which escape through a mesh in Rn.InGeometric Aspects of Functional Analysis (1986/87), Lecture Notes in Mathematics, vol. 1317 (Springer, Berlin, 1988), pp. 84–106 (Cited on p. 303.) 236. L. Grafakos, Modern Fourier Analysis, 2nd edn. Graduate Texts in Mathematics, vol. 250 (Springer, New York, 2009) (Cited on pp. 420, 573.) 237. R. Graham, N. Sloane, Lower bounds for constant weight codes. IEEE Trans. Inform. Theor. 26(1), 37–43 (1980) (Cited on p. 327.) 238. R. Gribonval, Sparse decomposition of stereo signals with matching pursuit and application to blind separation of more than two sources from a stereo mixture. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP’02),vol.3, pp. 3057–3060, 2002 (Cited on p. 36.) 239. R. Gribonval, M. Nielsen, Sparse representations in unions of bases. IEEE Trans. Inform. Theor. 49(12), 3320–3325 (2003) (Cited on pp. 34, 104, 128.) 240. R. Gribonval, M. Nielsen, Highly sparse representations from dictionaries are unique and independent of the sparseness measure. Appl. Comput. Harmon. Anal. 22(3), 335–355 (2007) (Cited on p. 104.) 241. R. Gribonval, H. Rauhut, K. Schnass, P. Vandergheynst, Atoms of all channels, unite! Average case analysis of multi-channel sparse recovery using greedy algorithms. J. Fourier Anal. Appl. 14(5), 655–687 (2008) (Cited on pp. 38, 472.) 242. R. Gribonval, K. Schnass, Dictionary identification—sparse matrix-factorisation via l1-minimisation. IEEE Trans. Inform. Theor. 56(7), 3523–3539 (2010) (Cited on pp. 38, 39.) 243. G. Grimmett, D. Stirzaker, Probability and Random Processes, 3rd edn. (Oxford University Press, New York, 2001) (Cited on p. 198.) 244. K. Grochenig,¨ Foundations of Time-Frequency Analysis. Applied and Numerical Harmonic Analysis (Birkhauser,¨ Boston, MA, 2001) (Cited on pp. 36, 429, 430.) 245. D. Gross, Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inform. Theor. 57(3), 1548–1566 (2011) (Cited on pp. 37, 421.) 246. D. Gross, Y.-K. Liu, S.T. Flammia, S. Becker, J. Eisert, Quantum state tomography via compressed sensing. Phys. Rev. Lett. 105, 150401 (2010) (Cited on p. 37.) 247. L. Gross, Logarithmic Sobolev inequalities. Am. J. Math. 97(4), 1061–1083 (1975) (Cited on p. 264.) 248. O. Guedon,´ S. Mendelson, A. Pajor, N. Tomczak-Jaegermann, Majorizing measures and proportional subsets of bounded orthonormal systems. Rev. Mat. Iberoam. 24(3), 1075–1095 (2008) (Cited on pp. 421, 423.) 249. C. G¨unt¨urk, M. Lammers, A. Powell, R. 
Saab, O.¨ Yilmaz, Sobolev duals for random frames and ΣΔ quantization of compressed sensing measurements. Found. Comput. Math. 13(1), 1–36, Springer-Verlag (2013) (Cited on p. 38.) 250. S. Gurevich, R. Hadani, N. Sochen, On some deterministic dictionaries supporting sparsity. J. Fourier Anal. Appl. 14, 859–876 (2008) (Cited on p. 129.) 604 References

251. V. Guruswani, C. Umans, S. Vadhan, Unbalanced expanders and randomness extractors from Parvaresh-Vardy codes. In IEEE Conference on Computational Complexity, pp. 237–246, 2007 (Cited on p. 455.) 252. M. Haacke, R. Brown, M. Thompson, R. Venkatesan, Magnetic Resonance Imaging: Physical Principles and Sequence Design (Wiley-Liss, New York, 1999) (Cited on p. 35.) 253. U. Haagerup, The best constants in the Khintchine inequality. Studia Math. 70(3), 231–283 (1982), 1981 (Cited on p. 260.) 254. T. Hagerup, C. R¨ub, A guided tour of Chernoff bounds. Inform. Process. Lett. 33(6), 305–308 (1990) (Cited on p. 198.) 255. J. Haldar, D. Hernando, Z. Liang, Compressed-sensing MRI with random encoding. IEEE Trans. Med. Imag. 30(4), 893–903 (2011) (Cited on p. 35.) 256. E. Hale, W. Yin, Y. Zhang, Fixed-point continuation for 1-minimization: methodology and convergence. SIAM J. Optim. 19(3), 1107–1130 (2008) (Cited on p. 510.) 257. E. Hale, W. Yin, Y. Zhang, Fixed-point continuation applied to compressed sensing: implementation and numerical experiments. J. Comput. Math. 28(2), 170–194 (2010) (Cited on p. 507.) 258. F. Hansen, G. Pedersen, Jensen’s inequality for operators and Lowner’s¨ theorem. Math. Ann. 258(3), 229–241 (1982) (Cited on pp. 565, 566.) 259. F. Hansen, G. Pedersen, Jensen’s operator inequality. Bull. London Math. Soc. 35(4), 553–564 (2003) (Cited on p. 567.) 260. D. Hanson, F. Wright, A bound on tail probabilities for quadratic forms in independent random variables. Ann. Math. Stat. 42, 1079–1083 (1971) (Cited on p. 262.) 261. H. Hassanieh, P. Indyk, D. Katabi, E. Price, Nearly optimal sparse Fourier transform. In Proceedings of the 44th Symposium on Theory of Computing, STOC ’12, pp. 563–578, ACM, New York, NY, USA, 2012 (Cited on pp. 35, 38, 424, 455.) 262. H. Hassanieh, P. Indyk, D. Katabi, E. Price, Simple and practical algorithm for sparse Fourier transform. In Proceedings of the Twenty-third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’12, pp. 1183–1194. SIAM, 2012 (Cited on pp. 35, 455.) 263. J. Haupt, W. Bajwa, G. Raz, R. Nowak, Toeplitz compressed sensing matrices with appli- cations to sparse channel estimation. IEEE Trans. Inform. Theor. 56(11), 5862–5875 (2010) (Cited on p. 429.) 264. B. He, X. Yuan, Convergence analysis of primal-dual algorithms for a saddle-point problem: from contraction perspective. SIAM J. Imag. Sci. 5(1), 119–149 (2012) (Cited on p. 503.) 265. D. Healy Jr., D. Rockmore, P. Kostelec, S. Moore, FFTs for the 2-Sphere—Improvements and Variations. J. Fourier Anal. Appl. 9(4), 341–385 (2003) (Cited on pp. 427, 428.) 266. T. Hemant, V. Cevher, Learning non-parametric basis independent models from point queries via low-rank methods. Preprint (2012) (Cited on p. 39.) 267. W. Hendee, C. Morgan, Magnetic resonance imaging Part I—Physical principles. West J. Med. 141(4), 491–500 (1984) (Cited on p. 35.) 268. M. Herman, T. Strohmer, High-resolution radar via compressed sensing. IEEE Trans. Signal Process. 57(6), 2275–2284 (2009) (Cited on pp. 35, 430.) 269. F. Herrmann, M. Friedlander, O. Yilmaz, Fighting the curse of dimensionality: compressive sensing in exploration seismology. Signal Process. Mag. IEEE 29(3), 88–100 (2012) (Cited on p. 37.) 270. F. Herrmann, H. Wason, T. Lin, Compressive sensing in seismic exploration: an outlook on a new paradigm. CSEG Recorder 36(4), 19–33 (2011) (Cited on p. 37.) 271. J.R. Higgins, Sampling Theory in Fourier and Signal Analysis: Foundations, vol. 1 (Claren- don Press, Oxford, 1996) (Cited on pp. 35, 573.) 272. 
J.R. Higgins, R.L. Stens, Sampling Theory in Fourier and Signal Analysis: Advanced Topics, vol. 2 (Oxford University Press, Oxford, 1999) (Cited on pp. 35, 573.) 273. N.J. Higham, Functions of Matrices. Theory and Computation (Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2008) (Cited on p. 515.) 274. A. Hinrichs, J. Vybiral, Johnson-Lindenstrauss lemma for circulant matrices. Random Struct. Algorithm. 39(3), 391–398 (2011) (Cited on p. 423.) References 605

275. J.-B. Hiriart-Urruty, C. Lemarechal,´ Fundamentals of Convex Analysis. Grundlehren Text Editions (Springer, Berlin, 2001) (Cited on pp. 543, 545.) 276. W. Hoeffding, Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58, 13–30 (1963) (Cited on p. 198.) 277. J. Hogborn,¨ Aperture synthesis with a non-regular distribution of interferometer baselines. Astronom. and Astrophys. 15, 417 (1974) (Cited on p. 71.) 278. D. Holland, M. Bostock, L. Gladden, D. Nietlispach, Fast multidimensional NMR spec- troscopy using compressed sensing. Angew. Chem. Int. Ed. 50(29), 6548–6551 (2011) (Cited on p. 35.) 279. S. Hoory, N. Linial, A. Wigderson, Expander graphs and their applications. Bull. Am. Math. Soc. (N.S.) 43(4), 439–561 (electronic) (2006) (Cited on p. 455.) 280. R. Horn, C. Johnson, Matrix Analysis (Cambridge University Press, Cambridge, 1990) (Cited on p. 515.) 281. R. Horn, C. Johnson, Topics in Matrix Analysis (Cambridge University Press, Cambridge, 1994) (Cited on p. 515.) 282. W. Huffman, V. Pless, Fundamentals of Error-correcting Codes (Cambridge University Press, Cambridge, 2003) (Cited on p. 36.) 283. M. H¨ugel, H. Rauhut, T. Strohmer, Remote sensing via 1-minimization. Found. Comput. Math., to appear. (2012) (Cited on pp. 35, 422.) 284. R. Hunt, On L(p, q) spaces. Enseignement Math. (2) 12, 249–276 (1966) (Cited on p. 56.) 285. P. Indyk, A. Gilbert, Sparse recovery using sparse matrices. Proc. IEEE 98(6), 937–947 (2010) (Cited on pp. 38, 455.) 286. P. Indyk, M. Ru˘zic,´ Near-optimal sparse recovery in the L1 norm. In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’08, pp. 199–207 (2008) (Cited on p. 455.) 287. M. Iwen, Combinatorial sublinear-time Fourier algorithms. Found. Comput. Math. 10(3), 303–338 (2010) (Cited on pp. 35, 423, 455.) 288. M. Iwen, Improved approximation guarantees for sublinear-time Fourier algorithms. Appl. Comput. Harmon. Anal. 34(1), 57–82 (2013) (Cited on pp. 35, 423.) 289. M. Iwen, A. Gilbert, M. Strauss, Empirical evaluation of a sub-linear time sparse DFT algorithm. Commun. Math. Sci. 5(4), 981–998 (2007) (Cited on pp. 38, 424.) 290. L. Jacques, J. Laska, P. Boufounos, R. Baraniuk, Robust 1-bit compressive sensing via binary stable embeddings of sparse vectors. IEEE Trans. Inform. Theor. 59(4), 2082–2102 (2013) (Citedonp.38.) 291. S. Jafarpour, W. Xu, B. Hassibi, R. Calderbank, Efficient and robust compressed sensing using optimized expander graphs. IEEE Trans. Inform. Theor. 55(9), 4299–4308 (2009) (Cited on p. 455.) 292. R. James, M. Dennis, N. Daniel, Fast discrete polynomial transforms with applications to data analysis for distance transitive graphs. SIAM J. Comput. 26(4), 1066–1099 (1997) (Cited on p. 427.) 293. F. Jarre, J. Stoer, Optimierung (Springer, Berlin, 2004) (Cited on pp. 543, 558.) 294. A.J. Jerri, The Shannon sampling theorem—its various extensions and applications: A tutorial review. Proc. IEEE. 65(11), 1565–1596 (1977) (Cited on p. 35.) 295. W.B. Johnson, J. Lindenstrauss, Extensions of Lipschitz mappings into a Hilbert space. In Conference in Modern Analysis and Probability (New Haven, Conn., 1982). Contemporary Mathematics, vol. 26 (American Mathematical Society, Providence, RI, 1984), pp. 189–206 (Cited on p. 303.) 296. W.B. Johnson, J. Lindenstrauss (eds.), Handbook of the Geometry of Banach Spaces Vol I (North-Holland Publishing Co., Amsterdam, 2001) (Cited on p. 260.) 297. J.-P. 
Kahane, Propriet´ es´ locales des fonctions as` eries´ de Fourier aleatoires.´ Studia Math. 19, 1–25 (1960) (Cited on p. 199.) 298. S. Karlin, Total Positivity Vol. I (Stanford University Press, Stanford, 1968) (Cited on p. 57.) 299. B. Kashin, of some finite-dimensional sets and classes of smooth functions. Math. USSR, Izv. 11, 317–333 (1977) (Cited on pp. 34, 302, 328.) 606 References

300. B.S. Kashin, V.N. Temlyakov, A remark on compressed sensing. Math. Notes 82(5), 748–755 (2007) (Cited on p. 327.) 301. A. Khintchine, Uber¨ dyadische Br¨uche. Math. Z. 18(1), 109–116 (1923) (Cited on p. 260.) 302. S. Kim, K. Koh, M. Lustig, S. Boyd, D. Gorinevsky, A method for large-scale l1-regularized least squares problems with applications in signal processing and statistics. IEEE J. Sel. Top. Signal Proces. 4(1), 606–617 (2007) (Cited on p. 510.) 303. J. King, A minimal error conjugate gradient method for ill-posed problems. J. Optim. Theor. Appl. 60(2), 297–304 (1989) (Cited on pp. 504, 534.) 304. T. Klein, E. Rio, Concentration around the mean for maxima of empirical processes. Ann. Probab. 33(3), 1060–1077 (2005) (Cited on p. 265.) 305. H. Konig,¨ S. Kwapien,´ Best Khintchine type inequalities for sums of independent, rotationally invariant random vectors. Positivity. 5(2), 115–152 (2001) (Cited on p. 261.) 306. N. Kono,ˆ Sample path properties of stochastic processes. J. Math. Kyoto Univ. 20(2), 295–313 (1980) (Cited on p. 263.) 307. F. Krahmer, S. Mendelson, H. Rauhut, Suprema of chaos processes and the restricted isometry property. Comm. Pure Appl. Math. (to appear) (Cited on pp. 263, 264, 429, 430.) 308. F. Krahmer, G.E. Pfander, P. Rashkov, Uncertainty principles for time–frequency representa- tions on finite abelian groups. Appl. Comput. Harmon. Anal. 25(2), 209–225 (2008) (Cited on p. 429.) 309. F. Krahmer, R. Ward, New and improved Johnson-Lindenstrauss embeddings via the Restricted Isometry Property. SIAM J. Math. Anal. 43(3), 1269–1281 (2011) (Cited on pp. 303, 423.) 310. F. Krahmer, R. Ward, Beyond incoherence: stable and robust sampling strategies for compressive imaging. Preprint (2012) (Cited on p. 427.) 311. I. Krasikov, On the Erdelyi-Magnus-Nevai conjecture for Jacobi polynomials. Constr. Approx. 28(2), 113–125 (2008) (Cited on p. 428.) 312. M.A. Krasnosel’skij, Y.B. Rutitskij, Convex Functions and Orlicz Spaces. (P. Noordhoff Ltd., Groningen, The Netherlands, 1961), p. 249 (Cited on p. 263.) 313. J. Kruskal, Three-way arrays: rank and uniqueness of trilinear decompositions, with applica- tion to arithmetic complexity and statistics. Lin. Algebra Appl. 18(2), 95–138 (1977) (Cited on p. 56.) 314. P. Kuppinger, G. Durisi, H. Bolcskei,¨ Uncertainty relations and sparse signal recovery for pairs of general signal sets. IEEE Trans. Inform. Theor. 58(1), 263–277 (2012) (Cited on p. 421.) 315. M.-J. Lai, L.Y. Liu, The null space property for sparse recovery from multiple measurement vectors. Appl. Comput. Harmon. Anal. 30, 402–406 (2011) (Cited on p. 104.) 316. J. Laska, P. Boufounos, M. Davenport, R. Baraniuk, Democracy in action: quantization, saturation, and compressive sensing. Appl. Comput. Harmon. Anal. 31(3), 429–443 (2011) (Citedonp.38.) 317. J. Lawrence, G.E. Pfander, D. Walnut, Linear independence of Gabor systems in finite dimensional vector spaces. J. Fourier Anal. Appl. 11(6), 715–726 (2005) (Cited on p. 429.) 318. C. Lawson, Contributions to the Theory of Linear Least Maximum Approximation. PhD thesis, University of California, Los Angeles, 1961 (Cited on p. 504.) 319. J. Lederer, S. van de Geer, The Bernstein-Orlicz norm and deviation inequalities. Preprint (2011) (Cited on p. 265.) 320. M. Ledoux, On Talagrand’s deviation inequalities for product measures. ESAIM Probab. Stat. 1, 63–87 (1996) (Cited on p. 265.) 321. M. Ledoux, The Concentration of Measure Phenomenon. AMS, 2001. (Cited on pp. 264, 265.) 322. M. Ledoux, M. 
Talagrand, Probability in Banach Spaces (Springer, Berlin, Heidelberg, NewYork, 1991) (Cited on pp. 198, 260, 263, 264, 265.) 323. J. Lee, Y. Sun, M. Saunders, Proximal Newton-type methods for convex optimization. Preprint (2012) (Cited on p. 510.) References 607

324. B. Lemaire, The proximal algorithm. In New Methods in Optimization and Their Industrial Uses (Pau/Paris, 1987), Internationale Schriftenreihe Numerischen Mathematik, vol. 87 (Birkhauser,¨ Basel, 1989), pp. 73–87 (Cited on p. 506.) 325. E. Lieb, Convex trace functions and the Wigner-Yanase-Dyson conjecture. Adv. Math. 11, 267–288 (1973) (Cited on pp. 564, 565.) 326. M. Lifshits, Lectures on Gaussian Processes. Springer Briefs in Mathematics (Springer, New York, 2012) (Cited on p. 262.) 327. G. Lindblad, Expectations and entropy inequalities for finite quantum systems. Comm. Math. Phys. 39, 111–119 (1974) (Cited on p. 571.) 328. P.-L. Lions, B. Mercier, Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979) (Cited on pp. 503, 508.) 329. A. Litvak, A. Pajor, M. Rudelson, N. Tomczak-Jaegermann, Smallest singular value of random matrices and geometry of random polytopes. Adv. Math. 195(2), 491–523 (2005) (Cited on p. 363.) 330. Y. Liu, Universal low-rank matrix recovery from Pauli measurements. In NIPS, pp. 1638– 1646, 2011 (Cited on p. 37.) 331. A. Llagostera Casanovas, G. Monaci, P. Vandergheynst, R. Gribonval, Blind audiovisual source separation based on sparse redundant representations. IEEE Trans. Multimed. 12(5), 358–371 (August 2010) (Cited on p. 36.) 332. B. Logan, Properties of High-Pass Signals. PhD thesis, Columbia University, New York, 1965 (Citedonp.34.) 333. G. Lorentz, M. von Golitschek, Y. Makovoz, Constructive Approximation: Advanced Prob- lems (Springer, New York, 1996) (Cited on p. 327.) 334. D. Lorenz, M. Pfetsch, A. Tillmann, Solving Basis Pursuit: Heuristic optimality check and solver comparison. Preprint (2011) (Cited on p. 510.) 335. I. Loris, L1Packv2: a Mathematica package in minimizing an 1-penalized functional. Comput. Phys. Comm. 179(12), 895–902 (2008) (Cited on p. 503.) 336. F. Lust-Piquard, Inegalit´ es´ de Khintchine dans Cp (1

347. P. Massart, About the constants in Talagrand’s concentration inequalities for empirical processes. Ann. Probab. 28(2), 863–884 (2000) (Cited on p. 265.) 348. P. Massart, Concentration Inequalities and Model Selection. Lecture Notes in Mathematics, vol. 1896 (Springer, Berlin, 2007) (Cited on pp. 263, 264.) 349. C. McDiarmid, Concentration. In Probabilistic Methods for Algorithmic Discrete Mathemat- ics. Algorithms and Combinations, vol. 16 (Springer, Berlin, 1998), pp. 195–248 (Cited on p. 199.) 350. S. Mendelson, A. Pajor, M. Rudelson, The geometry of random {−1, 1}-polytopes. Discr. Comput. Geom. 34(3), 365–379 (2005) (Cited on p. 327.) 351. S. Mendelson, A. Pajor, N. Tomczak-Jaegermann, Uniform uncertainty principle for Bernoulli and subgaussian ensembles. Constr. Approx. 28(3), 277–289 (2009) (Cited on p. 302.) 352. D. Middleton, Channel Modeling and Threshold Signal Processing in Underwater Acoustics: An Analytical Overview. IEEE J. Oceanic Eng. 12(1), 4–28 (1987) (Cited on p. 430.) 353. M. Mishali, Y.C. Eldar, From theory to practice: Sub-nyquist sampling of sparse wideband analog signals. IEEE J. Sel. Top. Signal Process. 4(2), 375–391 (April 2010) (Cited on p. 37.) 354. Q. Mo, S. Li, New bounds on the restricted isometry constant δ2k. Appl. Comput. Harmon. Anal. 31(3), 460–468 (2011) (Cited on p. 169.) 355. Q. Mo, Y. Shen, Remarks on the restricted isometry property in orthogonal matching pursuit algorithm. IEEE Trans. Inform. Theor. 58(6), 3654–3656 (2012) (Cited on p. 169.) 356. S. Montgomery-Smith, The distribution of Rademacher sums. Proc. Am. Math. Soc. 109(2), 517–522 (1990) (Cited on p. 363.) 357. T.K. Moon, W.C. Stirling, Mathematical Methods and Algorithms for Signal Processing. (Prentice-Hall, Upper Saddle River, NJ, 2000) (Cited on p. 57.) 358. M. Murphy, M. Alley, J. Demmel, K. Keutzer, S. Vasanawala, M. Lustig, Fast 1-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime. IEEE Trans. Med. Imag. 31(6), 1250–1262 (2012) (Cited on p. 35.) 359. B.K. Natarajan, Sparse approximate solutions to linear systems. SIAM J. Comput. 24, 227–234 (1995) (Cited on pp. 34, 36, 57.) 360. F. Nazarov, A. Podkorytov, Ball, Haagerup, and distribution functions. In Complex Analysis, Operators, and Related Topics. Operator Theory: Advances and Applications, vol. 113 (Birkhauser,¨ Basel, 2000), pp. 247–267 (Cited on p. 260.) 361. D. Needell, J. Tropp, CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Appl. Comput. Harmon. Anal. 26(3), 301–321 (2008) (Cited on pp. 72, 169.) 362. D. Needell, R. Vershynin, Uniform uncertainty principle and signal recovery via regularized orthogonal matching pursuit. Found. Comput. Math. 9(3), 317–334 (2009) (Cited on pp. 72, 170.) 363. D. Needell, R. Vershynin, Signal recovery from incomplete and inaccurate measurements via regularized orthogonal matching pursuit. IEEE J. Sel. Top. Signal Process. 4(2), 310–316 (April 2010) (Cited on pp. 72, 170.) 364. D. Needell, R. Ward, Stable image reconstruction using total variation minimization. Preprint (2012) (Cited on p. 37.) 365. A. Nemirovsky, D. Yudin, Problem Complexity and Method Efficiency in Optimization. A Wiley-Interscience Publication. (Wiley, New York, 1983) (Cited on p. 507.) 366. Y. Nesterov, A method for solving the convex programming problem with convergence rate O(1/k2). Dokl. Akad. Nauk SSSR 269(3), 543–547 (1983) (Cited on p. 506.) 367. Y. Nesterov, Smooth minimization of non-smooth functions. Math. Program. 
103(1, Ser. A), 127–152 (2005) (Cited on p. 503.) 368. N. Noam, W. Avi, Hardness vs randomness. J. Comput. Syst. Sci. 49(2), 149–167 (1994) (Cited on p. 327.) 369. J. Nocedal, S. Wright, Numerical Optimization, 2nd edn. Springer Series in Operations Research and Financial Engineering (Springer, New York, 2006) (Cited on pp. 71, 503, 504, 510.) References 609

370. E. Novak, Optimal recovery and n-widths for convex classes of functions. J. Approx. Theor. 80(3), 390–408 (1995) (Cited on pp. 34, 327.) 371. E. Novak, H. Wo´zniakowski, Tractability of Multivariate Problems. Vol. 1: Linear Informa- tion. EMS Tracts in Mathematics, vol. 6 (European Mathematical Society (EMS), Z¨urich, 2008) (Cited on pp. 39, 327.) 372. M. Ohya, D. Petz, Quantum Entropy and Its Use. Texts and Monographs in Physics (Springer, New York, 2004) (Cited on p. 570.) 373. R. Oliveira, Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. Preprint (2009) (Cited on p. 261.) 374. R. Oliveira, Sums of random Hermitian matrices and an inequality by Rudelson. Electron. Commun. Probab. 15, 203–212 (2010) (Cited on p. 261.) 375. M. Osborne, Finite Algorithms in Optimization and Data Analysis (Wiley, New York, 1985) (Cited on p. 504.) 376. M. Osborne, B. Presnell, B. Turlach, A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20(3), 389–403 (2000) (Cited on p. 503.) 377. M. Osborne, B. Presnell, B. Turlach, On the LASSO and its dual. J. Comput. Graph. Stat. 9(2), 319–337 (2000) (Cited on p. 503.) 378. Y.C. Pati, R. Rezaiifar, P.S. Krishnaprasad, Orthogonal Matching Pursuit: Recursive Function Approximation with Applications to Wavelet Decomposition. In 1993 Conference Record of The Twenty-Seventh Asilomar Conference on Signals, Systems and Computers, Nov. 1–3, 1993., pp. 40–44, 1993 (Cited on p. 71.) 379. G.K. Pedersen, Some operator monotone functions. Proc. Am. Math. Soc. 36(1), 309–310 (1972) (Cited on p. 542.) 380. G. Peskir,ˇ Best constants in Kahane-Khintchine inequalities for complex Steinhaus functions. Proc. Am. Math. Soc. 123(10), 3101–3111 (1995) (Cited on p. 261.) 381. G. Peskir,ˇ A.N. Shiryaev, The Khintchine inequalities and martingale expanding sphere of their action. Russ. Math. Surv. 50(5), 849–904 (1995) (Cited on p. 261.) 382. D. Petz, A survey of certain trace inequalities. In Functional Analysis and Operator Theory (Warsaw, 1992). Banach Center Publications, vol. 30 (Polish Academy of Sciences, Warsaw, 1994), pp. 287–298 (Cited on p. 540.) 383. D. Petz, Quantum Information Theory and Quantum Statistics. Theoretical and Mathematical Physics (Springer, Berlin, 2008) (Cited on p. 570.) 384. G. Pfander, H. Rauhut, J. Tanner, Identification of matrices having a sparse representation. IEEE Trans. Signal Process. 56(11), 5376–5388 (2008) (Cited on pp. 35, 430.) 385. G. Pfander, H. Rauhut, J. Tropp, The restricted isometry property for time-frequency structured random matrices. Prob. Theor. Relat. Field. to appear (Cited on pp. 35, 430.) 386. M. Pfetsch, A. Tillmann, The computational complexity of the restricted isometry property, the nullspace property, and related concepts in Compressed Sensing. Preprint (2012) (Cited on p. 170.) 387. A. Pinkus, n-Widths in Approximation Theory (Springer, Berlin, 1985) (Cited on p. 327.) 388. A. Pinkus, On L1-Approximation. Cambridge Tracts in Mathematics, vol. 93 (Cambridge University Press, Cambridge, 1989) (Cited on p. 104.) 389. A. Pinkus, Totally Positive Matrices. Cambridge Tracts in Mathematics, vol. 181 (Cambridge University Press, Cambridge, 2010) (Cited on p. 57.) 390. M. Pinsky, Introduction to Fourier Analysis and Wavelets. Graduate Studies in Mathematics, vol. 102 (American Mathematical Society, Providence, RI, 2009) (Cited on pp. 420, 573.) 391. G. 
Pisier, Conditions d’entropie assurant la continuite´ de certains processus et applications a` l’analyse harmonique. In Seminar on Functional Analysis, 1979–1980 (French), pp. 13–14, 43, Ecole´ Polytech., 1980 (Cited on p. 263.) 392. G. Pisier, The Volume of Convex Bodies and Geometry. Cambridge Tracts in Mathematics (Cambridge University Press, Cambridge, 1999) (Cited on p. 262.) 393. Y. Plan, R. Vershynin, One-bit compressed sensing by linear programming. Comm. Pure Appl. Math. 66(8), 1275–1297 (2013) (Cited on p. 38.) 610 References

394. Y. Plan, R. Vershynin, Robust 1-bit compressed sensing and sparse logistic regression: a convex programming approach. IEEE Trans. Inform. Theor. 59(1), 482–494 (2013) (Cited on p. 38.) 395. T. Pock, A. Chambolle, Diagonal preconditioning for first order primal-dual algorithms in convex optimization. In IEEE International Conference Computer Vision (ICCV) 2011, pp. 1762 –1769, November 2011 (Cited on p. 503.) 396. T. Pock, D. Cremers, H. Bischof, A. Chambolle, An algorithm for minimizing the Mumford- Shah functional. In ICCV Proceedings (Springer, Berlin, 2009) (Cited on p. 503.) 397. L. Potter, E. Ertin, J. Parker, M. Cetin, Sparsity and compressed sensing in radar imaging. Proc. IEEE 98(6), 1006–1020 (2010) (Cited on p. 35.) 398. D. Potts, Fast algorithms for discrete polynomial transforms on arbitrary grids. Lin. Algebra Appl. 366, 353–370 (2003) (Cited on p. 427.) 399. D. Potts, G. Steidl, M. Tasche, Fast algorithms for discrete polynomial transforms. Math. Comp. 67, 1577–1590 (1998) (Cited on p. 427.) 400. D. Potts, G. Steidl, M. Tasche, Fast fourier transforms for nonequispaced data: a tutorial. In Modern Sampling Theory: Mathematics and Applications ed. by J. Benedetto, P. Ferreira, Chap. 12 (Birkhauser,¨ Boston, 2001), pp. 247–270 (Cited on pp. 421, 573.) 401. D. Potts, M. Tasche, Parameter estimation for exponential sums by approximate Prony method. Signal Process. 90(5), 1631–1642 (2010) (Cited on pp. 34, 57.) 402. R. Prony, Essai experimental´ et analytique sur les lois de la Dilatabilite´ des fluides elastiques´ et sur celles de la Force expansive de la vapeur de l’eau et de la vapeur de l’alkool, adiff` erentes´ temperatures.´ J. Ecole´ Polytechnique 1, 24–76 (1795) (Cited on pp. 34, 57.) 403. W. Pusz, S. Woronowicz, Form convex functions and the WYDL and other inequalities. Lett. Math. Phys. 2(6), 505–512 (1977/78) (Cited on p. 571.) 404. S. Qian, D. Chen, Signal representation using adaptive normalized Gaussian functions. Signal Process. 36(1), 1–11 (1994) (Cited on p. 71.) 405. H. Raguet, J. Fadili, G. Peyre,´ A generalized forward-backward splitting. Preprint (2011) (Cited on pp. 507, 510.) 406. R. Ramlau, G. Teschke, Sparse recovery in inverse problems. In Theoretical Foundations and Numerical Methods for Sparse Recovery, ed. by M. Fornasier. Radon Series on Computational and Applied Mathematics, vol. 9 (de Gruyter, Berlin, 2010), pp. 201–262 (Cited on p. 37.) 407. M. Raphan, E. Simoncelli, Optimal denoising in redundant representation. IEEE Trans. Image Process. 17(8), 1342–1352 (2008) (Cited on p. 36.) 408. H. Rauhut, Random sampling of sparse trigonometric polynomials. Appl. Comput. Harmon. Anal. 22(1), 16–42 (2007) (Cited on pp. 35, 422.) 409. H. Rauhut, On the impossibility of uniform sparse reconstruction using greedy methods. Sampl. Theor. Signal Image Process. 7(2), 197–215 (2008) (Cited on pp. 35, 169, 422.) 410. H. Rauhut, Circulant and Toeplitz matrices in compressed sensing. In Proc. SPARS’09 (Saint- Malo, France, 2009) (Cited on p. 429.) 411. H. Rauhut, Compressive sensing and structured random matrices. In Theoretical Foundations and Numerical Methods for Sparse Recovery, ed. by M. Fornasier. Radon Series on Computational and Applied Mathematics, vol. 9 (de Gruyter, Berlin, 2010), pp. 1–92 (Cited on pp. 34, 35, 198, 261, 262, 263, 421, 422, 429.) 412. H. Rauhut, G.E. Pfander, Sparsity in time-frequency representations. J. Fourier Anal. Appl. 16(2), 233–260 (2010) (Cited on pp. 35, 422, 429, 430.) 413. H. Rauhut, J.K. Romberg, J.A. 
Tropp, Restricted isometries for partial random circulant matrices. Appl. Comput. Harmon. Anal. 32(2), 242–254 (2012) (Cited on pp. 263, 423, 429.) 414. H. Rauhut, K. Schnass, P. Vandergheynst, Compressed sensing and redundant dictionaries. IEEE Trans. Inform. Theor. 54(5), 2210–2219 (2008) (Cited on p. 302.) 415. H. Rauhut, R. Ward, Sparse recovery for spherical harmonic expansions. In Proc. SampTA 2011, Singapore, 2011 (Cited on p. 428.) 416. H. Rauhut, R. Ward, Sparse Legendre expansions via ℓ1-minimization. J. Approx. Theor. 164(5), 517–533 (2012) (Cited on pp. 35, 426, 427.)

417. B. Recht, A simpler approach to matrix completion. J. Mach. Learn. Res. 12, 3413–3430 (2011) (Cited on pp. 37, 421.) 418. B. Recht, M. Fazel, P. Parrilo, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010) (Cited on pp. 36, 302.) 419. M. Reed, B. Simon, Methods of Modern Mathematical Physics. I. Functional Analysis (Academic Press, New York, 1972) (Cited on p. 583.) 420. J. Renes, R. Blume-Kohout, A. Scott, C. Caves, Symmetric informationally complete quantum measurements. J. Math. Phys. 45(6), 2171–2180 (2004) (Cited on p. 129.) 421. E. Rio, Inégalités de concentration pour les processus empiriques de classes de parties. Probab. Theor. Relat. Field. 119(2), 163–175 (2001) (Cited on p. 265.) 422. E. Rio, Une inégalité de Bennett pour les maxima de processus empiriques. Ann. Inst. H. Poincaré Probab. Stat. 38(6), 1053–1057 (2002) (Cited on p. 265.) 423. R.T. Rockafellar, Monotone operators and the proximal point algorithm. SIAM J. Contr. Optim. 14(5), 877–898 (1976) (Cited on p. 506.) 424. R.T. Rockafellar, Convex Analysis (Princeton University Press, reprint edition, 1997) (Cited on pp. 543, 545, 550, 564.) 425. R.T. Rockafellar, R.J.B. Wets, Variational Analysis. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 317 (Springer, Berlin, 1998) (Cited on p. 543.) 426. J. Rohn, Computing the norm ‖A‖∞,1 is NP-hard. Linear Multilinear Algebra 47(3), 195–204 (2000) (Cited on p. 521.) 427. J.K. Romberg, Imaging via compressive sampling. IEEE Signal Process. Mag. 25(2), 14–20 (March, 2008) (Cited on p. 34.) 428. S. Ross, Introduction to Probability Models, 9th edn. (Academic Press, San Diego, 2006) (Cited on p. 198.) 429. R. Rubinstein, M. Zibulevsky, M. Elad, Double sparsity: learning sparse dictionaries for sparse signal approximation. IEEE Trans. Signal Process. 58(3, part 2), 1553–1564 (2010) (Cited on p. 38.) 430. M. Rudelson, Random vectors in the isotropic position. J. Funct. Anal. 164(1), 60–72 (1999) (Cited on p. 261.) 431. M. Rudelson, R. Vershynin, Geometric approach to error-correcting codes and reconstruction of signals. Int. Math. Res. Not. 64, 4019–4041 (2005) (Cited on p. 36.) 432. M. Rudelson, R. Vershynin, Sampling from large matrices: an approach through geometric functional analysis. J. ACM 54(4), Art. 21, 19 pp. (electronic), 2007 (Cited on p. 472.) 433. M. Rudelson, R. Vershynin, On sparse reconstruction from Fourier and Gaussian measurements. Comm. Pure Appl. Math. 61, 1025–1045 (2008) (Cited on pp. 303, 422.) 434. M. Rudelson, R. Vershynin, The Littlewood-Offord problem and invertibility of random matrices. Adv. Math. 218(2), 600–633 (2008) (Cited on p. 303.) 435. M. Rudelson, R. Vershynin, Non-asymptotic theory of random matrices: extreme singular values. In Proceedings of the International Congress of Mathematicians, vol. 3 (Hindustan Book Agency, New Delhi, 2010), pp. 1576–1602 (Cited on p. 303.) 436. L. Rudin, S. Osher, E. Fatemi, Nonlinear total variation based noise removal algorithms. Physica D 60(1–4), 259–268 (1992) (Cited on pp. 34, 37.) 437. W. Rudin, Fourier Analysis on Groups (Interscience Publishers, New York, 1962) (Cited on p. 420.) 438. W. Rudin, Functional Analysis, 2nd edn. International Series in Pure and Applied Mathematics (McGraw-Hill Inc., New York, 1991) (Cited on p. 583.) 439. M. Ruskai, Inequalities for quantum entropy: a review with conditions for equality. J. Math. Phys. 43(9), 4358–4375 (2002) (Cited on p. 565.) 440. M.
Ruskai, Erratum: "Inequalities for quantum entropy: a review with conditions for equality". J. Math. Phys. 46(1), 019901 (2005) (Cited on p. 565.) 441. F. Santosa, W. Symes, Linear inversion of band-limited reflection seismograms. SIAM J. Sci. Stat. Comput. 7(4), 1307–1330 (1986) (Cited on pp. 34, 37.)

442. G. Schechtman, Special orthogonal splittings of $L_1^{2k}$. Israel J. Math. 139, 337–347 (2004) (Cited on p. 328.) 443. K. Schnass, P. Vandergheynst, Dictionary preconditioning for greedy algorithms. IEEE Trans. Signal Process. 56(5), 1994–2002 (2008) (Cited on p. 129.) 444. B. Schölkopf, A. Smola, Learning with Kernels (MIT Press, Cambridge, 2002) (Cited on p. 36.) 445. I. Segal, M. Iwen, Improved sparse Fourier approximation results: Faster implementations and stronger guarantees. Numer. Algorithm. 63(2), 239–263 (2013) (Cited on p. 424.) 446. S. Setzer, Operator splittings, Bregman methods and frame shrinkage in image processing. Int. J. Comput. Vis. 92(3), 265–280 (2011) (Cited on p. 508.) 447. Y. Shrot, L. Frydman, Compressed sensing and the reconstruction of ultrafast 2D NMR data: Principles and biomolecular applications. J. Magn. Reson. 209(2), 352–358 (2011) (Cited on p. 35.) 448. D. Slepian, The one-sided barrier problem for Gaussian noise. Bell System Tech. J. 41, 463–501 (1962) (Cited on p. 264.) 449. D. Spielman, N. Srivastava, An elementary proof of the restricted invertibility theorem. Israel J. Math. 190(1), 83–91 (2012) (Cited on p. 472.) 450. J.-L. Starck, E.J. Candès, D.L. Donoho, The curvelet transform for image denoising. IEEE Trans. Image Process. 11(6), 670–684 (2002) (Cited on p. 36.) 451. J.-L. Starck, F. Murtagh, J. Fadili, Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity (Cambridge University Press, Cambridge, 2010) (Cited on pp. 36, 504, 508.) 452. E.M. Stein, R. Shakarchi, Functional Analysis: Introduction to Further Topics in Analysis. Princeton Lectures in Analysis, vol. 4 (Princeton University Press, Princeton, NJ, 2011) (Cited on pp. 420, 573.) 453. M. Stojanovic, Underwater Acoustic Communications. In Encyclopedia of Electrical and Electronics Engineering, vol. 22, ed. by M. Stojanovic, J. G. Webster (Wiley, New York, 1999), pp. 688–698 (Cited on p. 430.) 454. M. Stojnic, ℓ1-optimization and its various thresholds in compressed sensing. In 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP, pp. 3910–3913, 2010 (Cited on p. 303.) 455. T. Strohmer, B. Friedlander, Analysis of sparse MIMO radar. Preprint (2012) (Cited on p. 35.) 456. T. Strohmer, R.W. Heath, Jr, Grassmannian frames with applications to coding and communication. Appl. Comput. Harmon. Anal. 14(3), 257–275 (2003) (Cited on p. 129.) 457. S. Szarek, On Kashin's almost Euclidean orthogonal decomposition of $l_n^1$. Bull. Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 26(8), 691–694 (1978) (Cited on p. 328.) 458. G. Szegő, Orthogonal Polynomials, 4th edn. (American Mathematical Society, Providence, 1975) (Cited on pp. 426, 427.) 459. M. Talagrand, Regularity of Gaussian processes. Acta Math. 159(1–2), 99–149 (1987) (Cited on p. 262.) 460. M. Talagrand, Isoperimetry and integrability of the sum of independent Banach-space valued random variables. Ann. Probab. 17(4), 1546–1570 (1989) (Cited on p. 265.) 461. M. Talagrand, A new look at independence. Ann. Probab. 24(1), 1–34 (1996) (Cited on p. 264.) 462. M. Talagrand, Majorizing measures: the generic chaining. Ann. Probab. 24(3), 1049–1103 (1996) (Cited on p. 262.) 463. M. Talagrand, New concentration inequalities in product spaces. Invent. Math. 126(3), 505–563 (1996) (Cited on p. 265.) 464. M. Talagrand, Selecting a proportion of characters. Israel J. Math. 108, 173–191 (1998) (Cited on pp. 421, 422.) 465. M. Talagrand, Majorizing measures without measures. Ann. Probab.
29(1), 411–417 (2001) (Cited on p. 262.)

466. M. Talagrand, The Generic Chaining. Springer Monographs in Mathematics (Springer, Berlin, 2005) (Cited on pp. 262, 263, 264.) 467. M. Talagrand, Mean Field Models for Spin Glasses. Volume I: Basic Examples (Springer, Berlin, 2010) (Cited on p. 264.) 468. G. Tauböck, F. Hlawatsch, D. Eiwen, H. Rauhut, Compressive estimation of doubly selective channels in multicarrier systems: leakage effects and sparsity-enhancing processing. IEEE J. Sel. Top. Sig. Process. 4(2), 255–271 (2010) (Cited on p. 37.) 469. H. Taylor, S. Banks, J. McCoy, Deconvolution with the ℓ1-norm. Geophysics 44(1), 39–52 (1979) (Cited on pp. 34, 37.) 470. V. Temlyakov, Nonlinear methods of approximation. Found. Comput. Math. 3(1), 33–107 (2003) (Cited on p. 71.) 471. V. Temlyakov, Greedy approximation. Act. Num. 17, 235–409 (2008) (Cited on p. 71.) 472. V. Temlyakov, Greedy Approximation. Cambridge Monographs on Applied and Computational Mathematics, vol. 20 (Cambridge University Press, Cambridge, 2011) (Cited on pp. 36, 71, 72, 129.) 473. R. Tibshirani, Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. B 58(1), 267–288 (1996) (Cited on pp. 34, 36, 71.) 474. J. Traub, G. Wasilkowski, H. Woźniakowski, Information-based Complexity. Computer Science and Scientific Computing (Academic Press Inc., Boston, MA, 1988) With contributions by A.G. Werschulz, T. Boult. (Cited on pp. 34, 39.) 475. L. Trefethen, D. Bau, Numerical Linear Algebra (SIAM, Philadelphia, 2000) (Cited on p. 515.) 476. J.A. Tropp, Greed is good: Algorithmic results for sparse approximation. IEEE Trans. Inform. Theor. 50(10), 2231–2242 (2004) (Cited on pp. 34, 36, 71, 128.) 477. J.A. Tropp, Recovery of short, complex linear combinations via ℓ1 minimization. IEEE Trans. Inform. Theor. 51(4), 1568–1570 (2005) (Cited on p. 104.) 478. J.A. Tropp, Algorithms for simultaneous sparse approximation. Part II: Convex relaxation. Signal Process. 86(3), 589–602 (2006) (Cited on p. 38.) 479. J.A. Tropp, Just relax: Convex programming methods for identifying sparse signals in noise. IEEE Trans. Inform. Theor. 51(3), 1030–1051 (2006) (Cited on pp. 34, 36.) 480. J.A. Tropp, Norms of random submatrices and sparse approximation. C. R. Math. Acad. Sci. Paris 346(23–24), 1271–1274 (2008) (Cited on pp. 262, 471, 472.) 481. J.A. Tropp, On the conditioning of random subdictionaries. Appl. Comput. Harmon. Anal. 25, 1–24 (2008) (Cited on pp. 261, 262, 421, 423, 430, 471, 472.) 482. J.A. Tropp, On the linear independence of spikes and sines. J. Fourier Anal. Appl. 14(5–6), 838–858 (2008) (Cited on pp. 36, 472.) 483. J.A. Tropp, The random paving property for uniformly bounded matrices. Stud. Math. 185(1), 67–82 (2008) (Cited on p. 472.) 484. J.A. Tropp, Column subset selection, matrix factorization, and eigenvalue optimization. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '09, Society for Industrial and Applied Mathematics, pp. 978–986, Philadelphia, PA, USA, 2009. (Cited on p. 473.) 485. J.A. Tropp, From the joint convexity of quantum relative entropy to a concavity theorem of Lieb. Proc. Am. Math. Soc. 140(5), 1757–1760 (2012) (Cited on p. 565.) 486. J.A. Tropp, User-friendly tail bounds for sums of random matrices. Found. Comput. Math. 12(4), 389–434 (2012) (Cited on pp. 261, 471.) 487. J.A. Tropp, A.C. Gilbert, M.J. Strauss, Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal Process. 86(3), 572–588 (2006) (Cited on p. 38.) 488. J.A. Tropp, J.N. Laska, M.F. Duarte, J.K. Romberg, R.G.
Baraniuk, Beyond Nyquist: Efficient sampling of sparse bandlimited signals. IEEE Trans. Inform. Theor. 56(1), 520–544 (2010) (Cited on pp. 37, 430, 431.) 489. J.A. Tropp, M. Wakin, M. Duarte, D. Baron, R. Baraniuk, Random filters for compressive sampling and reconstruction. Proc. 2006 IEEE International Conference Acoustics, Speech, and Signal Processing, vol. 3, pp. 872–875, 2006 (Cited on p. 429.)

490. A. Tsybakov, Introduction to Nonparametric Estimation. Springer Series in Statistics (Springer, New York, 2009) (Cited on p. 264.) 491. M. Tygert, Fast algorithms for spherical harmonic expansions, II. J. Comput. Phys. 227(8), 4260–4279 (2008) (Cited on p. 427.) 492. A. Uhlmann, Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory. Comm. Math. Phys. 54(1), 21–32 (1977) (Cited on p. 571.) 493. E. van den Berg, M. Friedlander, Probing the Pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput. 31(2), 890–912 (2008) (Cited on pp. 504, 505.) 494. C.F. Van Loan, Computational Frameworks for the Fast Fourier Transform (SIAM, Philadelphia, 1992) (Cited on p. 575.) 495. S. Varadhan, Large deviations and applications. In École d'Été de Probabilités de Saint-Flour XV–XVII, 1985–87. Lecture Notes in Mathematics, vol. 1362 (Springer, Berlin, 1988), pp. 1–49 (Cited on pp. 198, 199.) 496. R. Varga, Gershgorin and His Circles. Springer Series in Computational Mathematics (Springer, Berlin, 2004) (Cited on p. 523.) 497. S. Vasanawala, M. Alley, B. Hargreaves, R. Barth, J. Pauly, M. Lustig, Improved pediatric MR imaging with compressed sensing. Radiology 256(2), 607–616 (2010) (Cited on p. 35.) 498. A. Vershik, P. Sporyshev, Asymptotic behavior of the number of faces of random polyhedra and the neighborliness problem. Sel. Math. Sov. 11(2), 181–201 (1992) (Cited on p. 305.) 499. R. Vershynin, John's decompositions: selecting a large part. Israel J. Math. 122, 253–277 (2001) (Cited on p. 473.) 500. R. Vershynin, Frame expansions with erasures: an approach through the non-commutative operator theory. Appl. Comput. Harmon. Anal. 18(2), 167–176 (2005) (Cited on p. 262.) 501. R. Vershynin, Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing: Theory and Applications, ed. by Y. Eldar, G. Kutyniok (Cambridge University Press, Cambridge, 2012), pp. xii+544 (Cited on pp. 199, 303.) 502. J. Vybiral, A variant of the Johnson-Lindenstrauss lemma for circulant matrices. J. Funct. Anal. 260(4), 1096–1105 (2011) (Cited on p. 423.) 503. M. Wakin, The Geometry of Low-dimensional Signal Models, PhD thesis, Rice University, 2006 (Cited on p. 56.) 504. S. Waldron, An Introduction to Finite Tight Frames. Applied and Numerical Harmonic Analysis (Birkhäuser, Boston, To appear) (Cited on p. 129.) 505. J. Walker, Fourier analysis and wavelet analysis. Notices Am. Math. Soc. 44(6), 658–670 (1997) (Cited on pp. 420, 573.) 506. J.S. Walker, Fast Fourier Transforms (CRC Press, Littleton, 1991) (Cited on p. 575.) 507. Y. Wiaux, L. Jacques, G. Puy, A. Scaife, P. Vandergheynst, Compressed sensing imaging techniques for radio interferometry. Mon. Not. Roy. Astron. Soc. 395(3), 1733–1742 (2009) (Cited on p. 37.) 508. P. Wojtaszczyk, A Mathematical Introduction to Wavelets (Cambridge University Press, Cambridge, 1997) (Cited on pp. 36, 424, 425.) 509. P. Wojtaszczyk, Stability and instance optimality for Gaussian measurements in compressed sensing. Found. Comput. Math. 10, 1–13 (2010) (Cited on pp. 362, 363.) 510. P. Wojtaszczyk, ℓ1 minimisation with noisy data. SIAM J. Numer. Anal. 50(2), 458–467 (2012) (Cited on p. 363.) 511. S. Worm, Iteratively re-weighted least squares for compressed sensing. Diploma thesis, University of Bonn, 2011 (Cited on p. 504.) 512. G. Wright, Magnetic resonance imaging. IEEE Signal Process. Mag. 14(1), 56–66 (1997) (Cited on p. 35.) 513. J. Wright, A. Yang, A. Ganesh, S. Sastry, Y.
Ma, Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009) (Cited on p. 36.) 514. K. Yosida, Functional Analysis, 6th edn. Classics in Mathematics (Springer, Berlin, 1995) (Cited on p. 583.)

515. T. Zhang, Sparse recovery with orthogonal matching pursuit under RIP. IEEE Trans. Inform. Theor. 57(9), 6215–6221 (2011) (Cited on p. 169.) 516. X. Zhang, M. Burger, X. Bresson, S. Osher, Bregmanized nonlocal regularization for deconvolution and sparse reconstruction. SIAM J. Imag. Sci. 3(3), 253–276 (2010) (Cited on p. 503.) 517. X. Zhang, M. Burger, S. Osher, A unified primal-dual algorithm framework based on Bregman iteration. J. Sci. Comput. 46, 20–46 (2011) (Cited on p. 503.) 518. M. Zhu, T. Chan, An efficient primal-dual hybrid gradient algorithm for total variation image restoration. Technical report, CAM Report 08–34, UCLA, Los Angeles, CA, 2008 (Cited on p. 503.) 519. J. Zou, A.C. Gilbert, M. Strauss, I. Daubechies, Theoretical and experimental analysis of a randomized algorithm for sparse Fourier transform analysis. J. Comput. Phys. 211, 572–595 (2005) (Cited on pp. 35, 423.) 520. A. Zymnis, S. Boyd, E.J. Candès, Compressed sensing with quantized measurements. IEEE Signal Process. Lett. 17(2), 149–152 (2010) (Cited on p. 38.)

Index

Symbols binomial distribution, 179 Λ1-problem, 418 bipartite graph, 435 ℓ0-minimization, 4, 53 bit-tester, 452 ℓ1-coherence function, 111 block-sparsity, 37 ℓ1-minimization, 5 Borsuk–Ulam theorem, 58 ℓq-robust null space property, 87 BOS, 368 k-face, 98 bounded orthonormal system, 368 p-triangle inequality, 516 Bregman iterations, 503

A C absolute moment, 176 Cauchy–Schwarz inequality, 177, 517 adaptive, 312 central limit theorem, 183 adjacency matrix, 443 chaining, 227 ADMM, 508 chaos process, 263 Alltop vector, 14, 121 Chebyshev inequality, 178 alternating direction method of multipliers, 508 chi square distribution, 183 alternating split Bregman algorithm, 508 coherence, 111 antipodal, 58 mutual, 129 augmented Lagrangian, 509 complex random vector, 181 compressible, 42, 44 compressive sampling matching pursuit, 69, B 164 Babel function, 128 concave function, 546 basic thresholding, 69 concentration of measure, 237 basis pursuit, 5, 62 conditional expectation, 182 basis pursuit denoising, 18, 64 cone, 544 Beck–Teboulle proximal gradient algorithm, conic hull, 544 506 conic optimization problem, 559 Bernoulli matrix, 272 conic program, 559 Bernoulli model, 460 constraint function, 61, 555 Bernstein inequality, 196 contraction principle, 205 Bernstein inequality for suprema of empirical convex combination, 543 processes, 247 convex cone, 544 Bernstein polynomial, 58 convex conjugate function, 549 best s-term approximation, 42, 78 convex function, 545 Beta function, 184 convex hull, 543

convex optimization problem, 61, 556 expectation, 176 convex polytope, 98 extreme point, 545 convex program, 556 convex relaxation, 550 convex set, 543 F Courant–Fischer minimax theorem, 528 face, 98 covering number, 226, 575 Fast Fourier Transform, 574 Cramér's theorem, 189 feasible point, 555 cross-polytope, 98 Fenchel dual, 549 cumulant generating function, 189 Fenchel inequality, 549, 551 cumulative coherence function, 128 Fenchel–Young inequality, 549 FFT, 574 forward-backward splitting method, 505 D Fourier matrix, 372, 574 Dantzig selector, 20, 65, 106, 172, 309, 512 Fourier transform, 372, 574 data transmission, 19 frame, 118 decoupled chaos, 213 Frobenius norm, 114, 524 decoupling, 211 Frobenius robust rank null space property, 109 degree of a left vertex, 435 Fubini's theorem, 177, 182 delta train, 376 functions of many variables, 39 Descartes' rule of signs, 535 deviation inequality, 247 dictionary, 17 G dictionary learning, 38 Gabor synthesis matrix, 429 Dirichlet kernel, 573 Gabor system, 429 discrete bounded orthonormal systems, Gauss' hypergeometric function, 578 416 Gauss' hypergeometric theorem, 578 discrete Fourier transform, 372 Gaussian integration by parts, 229 distribution, 586 Gaussian logarithmic Sobolev inequality, 243 distribution function, 176 Gaussian process, 225, 228 Douglas–Rachford splitting method, 507 Gaussian random matrix, 272 dual cone, 544 Gaussian random variable, 179 dual norm, 517 Gaussian width, 285 dual problem, 557 Gelfand width, 311 Dudley's inequality, 226 global minimizer, 548 golfing scheme, 397, 401 Gordon lemma, 227 E gradient method, 506 edge expander, 455 Gram matrix, 114, 119, 120 eigenvalue decomposition, 518 group-sparsity, 38 empirical process, 247 entropy, 239 quantum relative, 570 H relative, 571 Haar wavelets, 424 epigraph, 546 Hadamard matrix, 372 equality constraint, 555 Hadamard transform, 372 equiangular system, 113 Hahn–Banach theorem error correction, 19 extension, 352, 583 escape through the mesh, 285 separation, 583 exact cover by 3-sets, 54 exact recovery condition, 68 Hansen–Pedersen–Jensen inequality, 567 expander, 455 hard thresholding operator, 69 edge, 455 hard thresholding pursuit, 71, 126, 150 lossless, 436 Herbst argument, 240, 247 high-dimensional function recovery, 39 Lieb's concavity theorem, 564 Hoeffding's inequality, 189, 209, 211 linear optimization problem, 556 homogeneous restricted isometry property, 307 linear program, 61, 556 homotopy method, 475, 476 Lipschitz function, 237, 583 Hölder's inequality, 177, 517 local minimizer, 548 log-barrier method, 510 logarithmic Sobolev inequality, 242, 243 I lossless expander, 436 incoherent bases, 373 lower semicontinuous, 547 independence, 181 independent copy, 182 independent identically distributed, 182 M independent random vectors, 182 magnetic resonance imaging, 10 instance optimality, 104, 332, 362 Markov's inequality, 178, 185 interior point method, 510 matching pursuit, 71 inverse scale space method, 503 matrix completion, 22 isotropic random vector, 273 matrix convex, 565 iterative hard thresholding, 70, 149, 150 matrix exponential, 538 matrix logarithm, 541 matrix monotone, 540 J matrix norm, 519 Jensen's inequality, 184, 206 Maurey's method, 411 Johnson–Lindenstrauss lemma, 298 measurable function, 176 joint distribution function, 181 median, 178 joint probability
density, 181 median operator, 448 metric, 516 minimax principle, 528 K model selection, 20 Kashin splitting, 322 moderate growth, 229 Khintchine inequality, 206 modified LARS, 480 for Steinhaus sequences, 210 modulation operator, 121 scalar, 208 moment, 176, 185 Klein's inequality, 570 moment generating function, 178, 189 Kolmogorov width, 323 Moore–Penrose pseudo-inverse, 92, 531 Kruskal rank, 56 Moreau's identity, 553 Kullback–Leibler divergence, 571 MRI, 10 mutual coherence, 129

L N Lagrange dual, 556 neighborly, 104 Lagrange function, 556, 559 centrally, 107 Lagrange multipliers, 556 Neumann series, 524 Landweber iteration, 507 noiselets, 424 Laplace transform, 178 nonadaptive, 312 LARS, 480 noncommutative Bernstein inequality, 217 LASSO, 20, 64, 71 nonequispaced Fourier matrix, 370 least angle regression, 480 nonincreasing rearrangement, 42 Lebesgue’s dominated convergence theorem, nonuniform instance optimality, 358 177 nonuniform recovery, 281 left regular bipartite graph, 435 norm, 515 Legendre polynomials, 425 normal distribution, 179, 182 Lidskii’s inequality, 528 normed space, 516 624 Index nuclear norm, 22, 102, 530 quasinorm, 515 nuclear norm minimization, 107, 174, 511 quasinorm constant, 515 null space property, 78 quasitriangle inequality, 515 q -robust, 87 quotient map, 518 robust, 86 quotient norm, 340, 518 stable, 83 quotient property, 339, 362 simultaneous, 341 quotient space, 518 O objective function, 61, 555 operator convex, 565 R operator monotone, 540 radar, 13 operator norm, 518, 519 Rademacher chaos, 211, 214 optimization problem, 61, 555 Rademacher process, 225 , 263 Rademacher random variable, 205 orthogonal matching pursuit, 65, 123, 158 Rademacher sequence, 191, 205 orthogonal polynomials, 425 Rademacher sum, 205 Rademacher variable, 191 Rademacher vector, 191 P random Gabor systems, 429 packing number, 576 random matrix, 272 Paley–Zygmund inequality, 187 random partial Fourier matrix, 372 Pareto curve, 504 random partial unitary matrix, 417 partial Fourier matrix, 50, 372 random polytope, 38 partial primal-dual gap, 512 random sampling, 368 partial random circulant matrices, 428 random signals, 459 partial random Toeplitz matrices, 428 random submatrix, 460, 461 partition problem, 59 random support set, 459 perspective, 569 random variable, 176 phase transition, 304 random vector, 181 Pochhammer symbol, 579 complex, 181 polar cone, 544 rank restricted isometry constant, 174 polarization formula, 171 rank restricted isometry property, 174, 309 positive definite, 537 reduced singular value decomposition, 526 positive semidefinite, 537 regularized orthogonal matching pursuit, 72 primal dual gap, 558 resolvent operator, 552, 553 primal problem, 557 restricted expansion constant, 436 probability density function, 176 restricted isometry constant, 133, 168 Prony method, 51 restricted isometry property, 133, 135 proper function, 545 restricted isometry ratio, 147 proximal gradient method, 506 restricted orthogonality constant, 135, 168 proximal mapping, 552, 553 robust null space property, 86 proximal point algorithm, 506 robust rank null space property, 108 proximation, 552 robustness, 85 proximity operator, 552, 553 pseudometric, 516 S saddle point, 562 Q saddle-point property, 562 quadratically constrained basis pursuit, 64 sampling, 15 quadratically constrained nuclear norm sampling matrix, 369 minimization, 109 , 530 quantile operator, 448 second-order cone problem, 64, 559 quantization, 38 self-adjoint dilation, 267, 529 Index 625 self-adjoint matrix, 518, 537 sublinear-time algorithm, 452 semidefinite program, 108, 559 subspace pursuit, 72 seminorm, 515 summation by parts, 445 Shannon sampling theorem, 1, 15 support, 41 shifting inequality, 56, 169, 172 , 518 shrinkage operator, 555 symmetric random variable, 205 sign, 90 symmetric random vector, 205 simultaneous quotient property, 341 symmetrization, 205 single-pixel camera, 8 singular value, 525 T singular vector, 525 tail, 178, 185 Slater’s constraint qualification, 558 tensorization 
inequality, 241 Slepian lemma, 227 theorem of deviation of subspaces, 329 soft thresholding, 555 thresholded Landweber iteration, 507 soft thresholding operator, 72 tight frame, 114 spark, 56 time-frequency structured random matrices, sparse, 41 429 sparse approximation, 17 totally positive, 50, 535 sparse function, 369 trace, 523 sparse matching pursuit, 457 trace exponential, 539 sparse trigonometric polynomials, 370 translation operator, 121 sparsity, 2, 41 trigonometric polynomials, 369 spectral gap, 455 SPGL1 method, 504 spherical harmonics, 428 U lifting, 144 uniform model, 460 stability, 82 uniform recovery, 281 stable null space property, 83 uniform uncertainty principle, 133, 135, 168 stable rank null space property, 108 union bound, 176 standard Gaussian, 179 unit ball, 516 standard Gaussian vector, 182 unitary dilation, 568 standard normal, 179 unitary matrix, 371, 518 Stein’s lemma, 229 Uzawa algorithm, 503 Steinhaus random variable, 209 V sequence, 209 Vandermonde matrix, 50, 535 Stirling’s formula, 577 variance, 176 stochastic independence, 181 vector-valued Bernstein inequality, 248, 249 stochastic matrix, 455 stochastic process, 224 strictly concave, 546 W strictly convex, 545 weak derivative, 269, 585 strictly subgaussian random variable, 199 weak duality, 558 strong duality, 558 weak variance, 249 strongly convex, 545 Weibull random matrices, 363 structured sparsity model, 37 Welch bound, 114 subdifferential, 551 Weyl’s inequality, 528 subexponential random variable, 191 width subgaussian parameter, 193 Gelfand, 311 subgaussian random matrix, 272 Kolmogorov, 323 subgaussian random variable, 191 Wielandt’s minimax principle, 528 subgaussian random vector, 273 subgradient, 551 Y sublinear algorithm, 38, 423 Young inequality, 549